Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 10 days ago • 103
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 5 days ago • 97
Toto 2.0: Time Series Forecasting Enters the Scaling Era Paper • 2605.20119 • Published 11 days ago • 38
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 9 days ago • 169
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 16 days ago • 145
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 27 days ago • 165
Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration Paper • 2604.18131 • Published Apr 20 • 10
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents Paper • 2604.04247 • Published Apr 5 • 31
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published Apr 9 • 115
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 189
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 246
Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time Paper • 2604.00917 • Published Apr 1 • 18