MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning Paper • 2505.24846 • Published 27 days ago • 15
SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents Paper • 2505.23559 • Published 28 days ago • 12
ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind Paper • 2505.22961 • Published 28 days ago • 8
Time-R1 Collection • Time-R1: Framework and resources for endowing LLMs with comprehensive temporal reasoning (understanding, prediction, creative generation). • 7 items • Updated 24 days ago • 1
Maintaining Adversarial Robustness in Continuous Learning Paper • 2402.11196 • Published Feb 17, 2024 • 1
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL Paper • 2505.02391 • Published May 5 • 24