SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement Paper • 2603.06333 • Published Mar 6 • 1 • 3
Small Vision-Language Models are Smart Compressors for Long Video Understanding Paper • 2604.08120 • Published Apr 9 • 20 • 3
Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding Paper • 2604.08537 • Published Apr 9 • 9 • 3
Time is Not a Label: Continuous Phase Rotation for Temporal Knowledge Graphs and Agentic Memory Paper • 2604.11544 • Published Apr 13 • 4 • 3
Models That Know How Evaluations Are Designed Score Safer Paper • 2605.28591 • Published 3 days ago • 4 • 5
SWE-AGILE: A Software Agent Framework for Efficiently Managing Dynamic Reasoning Context Paper • 2604.11716 • Published Apr 13 • 5 • 3
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation Paper • 2605.28293 • Published 3 days ago • 78 • 3
Self-Improving Language Models with Bidirectional Evolutionary Search Paper • 2605.28814 • Published 3 days ago • 50 • 3
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation Paper • 2605.25378 • Published 5 days ago • 48 • 2
Representation Fréchet Loss for Visual Generation Paper • 2604.28190 • Published 30 days ago • 32 • 2
Let ViT Speak: Generative Language-Image Pre-training Paper • 2605.00809 • Published 29 days ago • 33 • 3
DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes Paper • 2605.28421 • Published 3 days ago • 43 • 4
Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders Paper • 2605.27354 • Published 4 days ago • 12 • 3
Long Live The Balance: Information Bottleneck Driven Tree-based Policy Optimization Paper • 2605.28109 • Published 3 days ago • 17 • 3
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare Paper • 2602.06717 • Published Feb 6 • 75 • 3