Pendrokar/xva_fastpitch1_1

	---
	language:
	- en
	library_name: xvasynth
	tags:
	- audio
	- text-to-speech
	- speech-to-speech
	- voice conversion
	- tts
	pipeline_tag: text-to-speech
	---

	GitHub project (Windows/Electron inference app): https://github.com/DanRuta/xVA-Synth

	Fine-tuning app: https://github.com/DanRuta/xva-trainer

	The base model for training other [🤗 xVASynth](https://huggingface.co/spaces/Pendrokar/xVASynth-TTS) FastPitch 1.1 type (v2) models. It is intended for fine-tuning with the xVATrainer TTS model training app, not for inference. All were created by Dan ["@dr00392"](https://huggingface.co/dr00392) Ruta.

	v3 models are called [xVAPitch](https://huggingface.co/Pendrokar/xvapitch) and are not based on FastPitch.

	There are hundreds of fine-tuned models on the web, but most of them use non-permissive datasets.
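
Since this repository holds a base checkpoint for fine-tuning rather than a model for inference, a first step is usually to download its files. The following is a minimal sketch, assuming the `huggingface_hub` Python package is installed. The helper names `fetch_base_model` and `list_files` and the default target folder are hypothetical, and where xVATrainer expects the files to be placed is not covered here.

```python
from pathlib import Path

# Repository id as shown on this model card.
REPO_ID = "Pendrokar/xva_fastpitch1_1"


def fetch_base_model(target_dir: str = "xva_fastpitch1_1") -> Path:
    """Download all files of the base model repository into target_dir.

    The target_dir default is an arbitrary choice for this sketch.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=REPO_ID, local_dir=target_dir)
    return Path(local_path)


def list_files(model_dir: Path) -> list[str]:
    """Return sorted relative paths of files under model_dir.

    Hidden files and folders (names starting with a dot) are skipped,
    which leaves out download cache metadata.
    """
    found = []
    for path in model_dir.rglob("*"):
        relative = path.relative_to(model_dir)
        hidden = any(part.startswith(".") for part in relative.parts)
        if path.is_file() and not hidden:
            found.append(relative.as_posix())
    return sorted(found)
```

Calling `list_files(fetch_base_model())` would download the repository and return the names of the fetched files, which can then be copied to wherever the training app expects its base checkpoint.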

	## xVASynth Editor v2 walkthrough video ▶:
	[![Video](https://img.youtube.com/vi/W-9SFoNuTtM/hqdefault.jpg)](https://www.youtube.com/watch?v=W-9SFoNuTtM)

	## xVATrainer v1 walkthrough video ▶:
	[![Video](https://img.youtube.com/vi/PXv_SeTWk2M/hqdefault.jpg)](https://www.youtube.com/watch?v=PXv_SeTWk2M)

	## References
	- [1] [FastPitch: Parallel Text-to-speech with Pitch Prediction](https://arxiv.org/abs/2006.06873)
	- [2] [One TTS Alignment To Rule Them All](https://arxiv.org/abs/2108.10447)

	Datasets used: unknown/non-permissive data