---
language:
- en
library: xvasynth
tags:
- audio
- text-to-speech
- speech-to-speech
- voice conversion
- tts
pipeline_tag: text-to-speech
---
GitHub project, inference Windows/Electron app: https://github.com/DanRuta/xVA-Synth

Fine-tuning app: https://github.com/DanRuta/xva-trainer

This is the base model for training other [🤗 xVASynth](https://huggingface.co/spaces/Pendrokar/xVASynth-TTS) FastPitch 1.1-type models (v2). It is meant for fine-tuning models with the xVATrainer TTS model training app, not for inference. All were created by Dan ["@dr00392"](https://huggingface.co/dr00392) Ruta.

v3 models are called [xVAPitch](https://huggingface.co/Pendrokar/xvapitch) and are not based on FastPitch.

There are hundreds of fine-tuned models on the web, but most of them use non-permissive datasets.
## xVASynth Editor v2 walkthrough video ▶:

[Watch on YouTube](https://www.youtube.com/watch?v=W-9SFoNuTtM)

## xVATrainer v1 walkthrough video ▶:

[Watch on YouTube](https://www.youtube.com/watch?v=PXv_SeTWk2M)
## References

- [1] [FastPitch: Parallel Text-to-speech with Pitch Prediction](https://arxiv.org/abs/2006.06873)
- [2] [One TTS Alignment To Rule Them All](https://arxiv.org/abs/2108.10447)

Used datasets: Unknown/non-permissive data