Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model Paper • 2506.13642 • Published 10 days ago • 26
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space Paper • 2505.13181 • Published May 19 • 9