TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments Paper • 2510.01179 • Published Oct 1, 2025 • 25
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL Paper • 2505.23977 • Published May 29, 2025 • 10
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Paper • 2505.14625 • Published May 20, 2025 • 13
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities Paper • 2502.12025 • Published Feb 17, 2025 • 3
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding Paper • 2503.02951 • Published Mar 4, 2025 • 33
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published Feb 17, 2025 • 39
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models Paper • 2406.12257 • Published Jun 18, 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 38
API Pack: A Massive Multilingual Dataset for API Call Generation Paper • 2402.09615 • Published Feb 14, 2024