ACC: Compiling Agent Trajectories for Long-Context Training Paper • 2605.21850 • Published 3 days ago • 56
The Unlearnability Phenomenon in RLVR for Language Models Paper • 2605.16787 • Published 8 days ago • 5
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists Paper • 2605.20668 • Published 4 days ago • 11
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories Paper • 2605.21468 • Published 4 days ago • 44
EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents Paper • 2605.13941 • Published 11 days ago • 24
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? Paper • 2605.06527 • Published 17 days ago • 44
Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems Paper • 2605.14892 • Published 10 days ago • 47
Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning Paper • 2605.14386 • Published 10 days ago • 59
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published 11 days ago • 155
Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design Paper • 2605.15871 • Published 9 days ago • 16
Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution Paper • 2605.15301 • Published 10 days ago • 22
Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding Paper • 2605.02290 • Published 20 days ago • 39
Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR Paper • 2605.15726 • Published 9 days ago • 32
Do not copy and paste! Rewriting strategies for code retrieval Paper • 2605.08299 • Published 16 days ago • 10
Continual Harness: Online Adaptation for Self-Improving Foundation Agents Paper • 2605.09998 • Published 13 days ago • 17
Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training Paper • 2605.12483 • Published 12 days ago • 10