# <img src="https://raw.githubusercontent.com/coqui-ai/TTS/main/images/coqui-log-green-TTS.png" height="56"/>

🐸TTS is a library for advanced Text-to-Speech generation. It's built on the latest research and was designed to achieve the best trade-off among ease of training, speed, and quality.
🐸TTS comes with pretrained models and tools for measuring dataset quality, and is already used in **20+ languages** for products and research projects.

[](https://github.com/coqui-ai/TTS/actions)
[](https://badge.fury.io/py/TTS)
If you are on Windows, 👑@GuyPaddock wrote installation instructions [here](https://stackoverflow.com/questions/66726331/how-can-i-run-mozilla-tts-coqui-tts-training-with-cuda-on-a-windows-system).

## Use TTS

### Single Speaker Models

- List provided models:

  ```
  $ tts --list_models
  ```

- Run TTS with the default models:

  ```
  $ tts --text "Text for TTS"
  ```

- Run a TTS model with its default vocoder model:

  ```
  $ tts --text "Text for TTS" --model_name "<language>/<dataset>/<model_name>"
  ```

- Run with specific TTS and vocoder models from the list:

  ```
  $ tts --text "Text for TTS" --model_name "<language>/<dataset>/<model_name>" --vocoder_name "<language>/<dataset>/<model_name>" --out_path output/path/speech.wav
  ```

- Run your own TTS model (using the Griffin-Lim vocoder):

  ```
  $ tts --text "Text for TTS" --model_path path/to/model.pth.tar --config_path path/to/config.json --out_path output/path/speech.wav
  ```

- Run your own TTS and vocoder models:

  ```
  $ tts --text "Text for TTS" --model_path path/to/model.pth.tar --config_path path/to/config.json --out_path output/path/speech.wav \
      --vocoder_path path/to/vocoder.pth.tar --vocoder_config_path path/to/vocoder_config.json
  ```

### Multi-speaker Models

- List the available speakers and choose a `<speaker_id>` from among them:

  ```
  $ tts --model_name "<language>/<dataset>/<model_name>" --list_speaker_idxs
  ```

- Run the multi-speaker TTS model with the target speaker ID:

  ```
  $ tts --text "Text for TTS." --out_path output/path/speech.wav --model_name "<language>/<dataset>/<model_name>" --speaker_idx <speaker_id>
  ```

- Run your own multi-speaker TTS model:

  ```
  $ tts --text "Text for TTS" --out_path output/path/speech.wav --model_path path/to/model.pth.tar --config_path path/to/config.json --speakers_file_path path/to/speaker.json --speaker_idx <speaker_id>
  ```
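
The multi-speaker flow above has two steps: list the speaker IDs, then synthesize with one of them. A small sketch of the second step that guards against a mistyped ID before calling the CLI (the helper and the example IDs are hypothetical; real IDs come from `--list_speaker_idxs`):

```python
def multispeaker_command(text, out_path, model_name, speaker_idx, available_idxs):
    """Build the multi-speaker `tts` call shown above, rejecting unknown IDs."""
    if speaker_idx not in available_idxs:
        raise ValueError(
            f"Unknown speaker {speaker_idx!r}; run tts --list_speaker_idxs first"
        )
    return [
        "tts",
        "--text", text,
        "--out_path", out_path,
        "--model_name", model_name,
        "--speaker_idx", speaker_idx,
    ]


# `available` would be parsed from the --list_speaker_idxs output of your model;
# these IDs are placeholders for illustration.
available = {"speaker_0", "speaker_1"}
cmd = multispeaker_command(
    "Text for TTS.",
    "output/path/speech.wav",
    "<language>/<dataset>/<model_name>",
    "speaker_0",
    available,
)
```

Failing fast on an unknown ID is cheaper than waiting for the model to load and error out mid-synthesis.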
## Directory Structure
```
|- notebooks/ (Jupyter Notebooks for model evaluation, parameter selection and data analysis.)