M Saad Salman

MSS444

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 1 day ago

ACC: Compiling Agent Trajectories for Long-Context Training

Paper • 2605.21850 • Published 3 days ago • 56

upvoted 4 papers 3 days ago

The Unlearnability Phenomenon in RLVR for Language Models

Paper • 2605.16787 • Published 8 days ago • 5

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

Paper • 2605.20668 • Published 4 days ago • 11

Generative Recursive Reasoning

Paper • 2605.19376 • Published 4 days ago • 25

You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories

Paper • 2605.21468 • Published 4 days ago • 44

upvoted 11 papers 5 days ago

Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

Paper • 2605.14892 • Published 10 days ago • 47

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Paper • 2605.14386 • Published 10 days ago • 59

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 10 days ago • 108

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published 11 days ago • 155

Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design

Paper • 2605.15871 • Published 9 days ago • 16

Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

Paper • 2605.15301 • Published 10 days ago • 22

Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

Paper • 2605.02290 • Published 20 days ago • 39

Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR

Paper • 2605.15726 • Published 9 days ago • 32

upvoted 4 papers 10 days ago

Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2605.12474 • Published 12 days ago • 5

Do not copy and paste! Rewriting strategies for code retrieval

Paper • 2605.08299 • Published 16 days ago • 10

Continual Harness: Online Adaptation for Self-Improving Foundation Agents

Paper • 2605.09998 • Published 13 days ago • 17

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published 12 days ago • 10
