arxiv:2605.14678
Haoran Zhang
zzzhr97
AI & ML interests
Large Language Models, Large Reasoning Models
Recent Activity
submitted a paper 1 day ago
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows
authored a paper 2 days ago
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows