Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published 5 days ago • 64
AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents Paper • 2506.14205 • Published Jun 17 • 7 • 3
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research Paper • 2505.19955 • Published May 26 • 12
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30 • 44
RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination Paper • 2505.21925 • Published May 28 • 36
Which checkpoint did you use in the end to evaluate on IOI? Was it the one after 10 epochs or was it a checkpoint in between?
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents Paper • 2408.07060 • Published Aug 13, 2024 • 43
ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval Paper • 2302.02285 • Published Feb 5, 2023
Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding Paper • 2310.07075 • Published Oct 10, 2023 • 1