EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion Paper • 2412.20359 • Published Dec 29, 2024 • 8 • 1
VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech Paper • 2406.08076 • Published Jun 12, 2024 • 6 • 1
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing Paper • 2406.08802 • Published Jun 13, 2024 • 8 • 1
REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion Paper • 2505.20756 • Published May 27, 2025 • 8 • 1