Update README.md
README.md
CHANGED
@@ -22,7 +22,7 @@ widget:
 - example_title: Librispeech sample 2
   src: https://cdn-media.huggingface.co/speech_samples/sample2.flac
 model-index:
-- name:
+- name: Quantum_STT_V2.0
   results:
   - task:
       name: Automatic Speech Recognition
@@ -155,16 +155,15 @@ img {
 
 ## <span style="color:#466f00;">Description:</span>
 
-`
+`Quantum_STT_V2.0` is a 600-million-parameter automatic speech recognition (ASR) model designed for high-quality English transcription, featuring support for punctuation, capitalization, and accurate timestamp prediction. Try Demo here: https://huggingface.co/spaces/Quantamhash/Quantum_STT_V2.0
 
-This XL variant of the FastConformer [1] architecture integrates the TDT [2] decoder and is trained with full attention, enabling efficient transcription of audio segments up to 24 minutes in a single pass.
+This XL variant of the FastConformer [1] architecture integrates the TDT [2] decoder and is trained with full attention, enabling efficient transcription of audio segments up to 24 minutes in a single pass.
 
 **Key Features**
 - Accurate word-level timestamp predictions
 - Automatic punctuation and capitalization
 - Robust performance on spoken numbers, and song lyrics transcription
 
-For more information, refer to the [Model Architecture](#model-architecture) section and the [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer).
 
 This model is ready for commercial/non-commercial use.
 
@@ -185,7 +184,7 @@ This model serves developers, researchers, academics, and industries building ap
 
 ### <span style="color:#466f00;">Release Date:</span>
 
-05/
+14/05/2025
 
 ### <span style="color:#466f00;">Model Architecture:</span>
 