EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents Paper • 2605.13841 • Published 13 days ago • 64
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics Paper • 2605.12178 • Published 14 days ago • 60
Vision Language Models (Better, faster, stronger) Article • merve, sergiopaniego, ariG23498, pcuenq, andito +3 • May 12, 2025 • 613
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 326
Apriel-Reasoner: RL Post-Training for General-Purpose and Efficient Reasoning Paper • 2604.02007 • Published Apr 2 • 13
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published Mar 25 • 98
Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck Paper • 2603.08462 • Published Mar 9 • 22
In-Context Reinforcement Learning for Tool Use in Large Language Models Paper • 2603.08068 • Published Mar 9 • 43
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published Mar 12 • 65