Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs Paper • 2407.08995 • Published Jul 12, 2024
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition Paper • 2502.18913 • Published Feb 26, 2025
FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching Paper • 2502.11128 • Published Feb 16, 2025
ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5 Paper • 2409.18584 • Published Sep 27, 2024
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors Paper • 2503.16578 • Published Mar 20, 2025
Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides Paper • 2504.15066 • Published Apr 21, 2025
StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling Paper • 2506.12570 • Published Jun 14, 2025
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection Paper • 2406.07256 • Published Jun 11, 2024
EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations Paper • 2505.23018 • Published May 29, 2025
MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation Paper • 2501.10811 • Published Jan 18, 2025
Omni-Thinker: Scaling Cross-Domain Generalization in LLMs via Multi-Task RL with Hybrid Rewards Paper • 2507.14783 • Published Jul 20, 2025
DIFFA: Large Language Diffusion Models Can Listen and Understand Paper • 2507.18452 • Published Jul 24, 2025
RealTalk-CN: A Realistic Chinese Speech-Text Dialogue Benchmark With Cross-Modal Interaction Analysis Paper • 2508.10015 • Published Aug 6, 2025