---
language:
- en
library: xvasynth
tags:
- audio
- text-to-speech
- speech-to-speech
- voice conversion
- tts
pipeline_tag: text-to-speech
---
- GitHub project, inference Windows/Electron app: https://github.com/DanRuta/xVA-Synth
- Fine-tuning app: https://github.com/DanRuta/xva-trainer

This is the base model for training other [🤗 xVASynth](https://huggingface.co/spaces/Pendrokar/xVASynth-TTS) FastPitch 1.1 type models (v2). It is meant for fine-tuning models with the xVATrainer TTS model training app, not for inference. All created by Dan ["@dr00392"](https://huggingface.co/dr00392) Ruta.
v3 models are called [xVAPitch](https://huggingface.co/Pendrokar/xvapitch) and are not based on FastPitch.
There are hundreds of fine-tuned models on the web, but most of them were trained on non-permissive datasets.
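
Since this repository only holds the base checkpoint for fine-tuning, a first step is usually to get the files onto local disk. The following is a minimal sketch of one way to do that with the `huggingface_hub` Python package. The repository id `Pendrokar/xva_fastpitch1_1`, the target folder name, and the helper names `fetch_base_model` and `list_files` are assumptions made for illustration; none of this is part of xVATrainer, and where xVATrainer expects the files is not covered here.

```python
from pathlib import Path

# Assumption: repository id inferred from this model card's location on the Hub.
REPO_ID = "Pendrokar/xva_fastpitch1_1"


def fetch_base_model(target_dir: str = "xva_fastpitch1_1", repo_id: str = REPO_ID) -> Path:
    """Download every file of the repository into target_dir and return that path."""
    # Imported lazily so list_files() below works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return Path(snapshot_download(repo_id=repo_id, local_dir=target_dir))


def list_files(folder) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for each file under folder, sorted by path."""
    folder = Path(folder)
    return sorted(
        (p.relative_to(folder).as_posix(), p.stat().st_size)
        for p in folder.rglob("*")
        if p.is_file()
    )


# Hypothetical usage (needs network access):
#   model_dir = fetch_base_model()
#   for name, size in list_files(model_dir):
#       print(f"{size:>12d}  {name}")
```

Listing the downloaded files first is a cheap way to see what the repository actually contains before pointing any training tool at it.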
## xVASynth Editor v2 walkthrough video ▶:
[![Video](https://img.youtube.com/vi/W-9SFoNuTtM/hqdefault.jpg)](https://www.youtube.com/watch?v=W-9SFoNuTtM)
## xVATrainer v1 walkthrough video ▶:
[![Video](https://img.youtube.com/vi/PXv_SeTWk2M/hqdefault.jpg)](https://www.youtube.com/watch?v=PXv_SeTWk2M)
## References
- [1] [FastPitch: Parallel Text-to-speech with Pitch Prediction](https://arxiv.org/abs/2006.06873)
- [2] [One TTS Alignment To Rule Them All](https://arxiv.org/abs/2108.10447)

Datasets used: unknown/non-permissive data