Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence Paper • 2510.20470 • Published 28 days ago • 11
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published 28 days ago • 55
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization Paper • 2510.13554 • Published Oct 15 • 56
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection Paper • 2505.20289 • Published May 26 • 10
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs Paper • 2505.19075 • Published May 25 • 21
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published May 28 • 46
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28 • 50
ATLAS: Learning to Optimally Memorize the Context at Test Time Paper • 2505.23735 • Published May 29 • 22
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning Paper • 2505.23380 • Published May 29 • 22
ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents Paper • 2505.23923 • Published May 29 • 8