Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains Paper • 2507.17746 • Published Jul 23 • 1
Which Heads Matter for Reasoning? RL-Guided KV Cache Compression Paper • 2510.08525 • Published Oct 9 • 22
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? Paper • 2510.02209 • Published Oct 2 • 52
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines Paper • 2509.21320 • Published Sep 25 • 99