π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published 12 days ago • 102
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 20 days ago • 46
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 20 days ago • 46
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents Paper • 2605.12481 • Published 19 days ago • 27
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents Paper • 2605.12481 • Published 19 days ago • 27
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published Apr 5 • 21
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published Apr 5 • 21
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published Apr 5 • 21
$ρ$-$\texttt{EOS}$: Training-free Bidirectional Variable-Length Control for Masked Diffusion LLMs Paper • 2601.22527 • Published Jan 30 • 1
ρ-EOS: Training-free Bidirectional Variable-Length Control for Masked Diffusion LLMs Paper • 2601.22527 • Published Jan 30 • 1
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published Nov 27, 2025 • 41
Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step Paper • 2509.23924 • Published Sep 28, 2025 • 9
Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models Paper • 2509.23962 • Published Sep 28, 2025 • 5
Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models Paper • 2509.23962 • Published Sep 28, 2025 • 5 • 2
LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions Paper • 2510.08211 • Published Oct 9, 2025 • 23
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents Paper • 2509.26354 • Published Sep 30, 2025 • 18
Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models Paper • 2509.23962 • Published Sep 28, 2025 • 5