---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# 🧠 A²FM: Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning

**A²FM (Adaptive Agent Foundation Model)** unifies reasoning-centric and agentic paradigms into a single framework that adaptively selects among **three execution modes** — *instant*, *reasoning*, and *agentic*. It follows a **route-then-align** training principle and introduces **Adaptive Policy Optimization (APO)** to jointly optimize accuracy and efficiency.

A²FM achieves **state-of-the-art performance** on major reasoning and agentic benchmarks:

- **13.4%** on *BrowseComp* (agentic)
- **70.4%** on *AIME25* (reasoning)
- **16.7%** on *HLE* (general)

Notably, its adaptive execution achieves a **cost-of-pass of only \$0.00487 per correct answer**, cutting cost by **45.2% vs. reasoning** and **33.5% vs. agentic**, delivering substantially higher cost efficiency while maintaining comparable accuracy.

📄 [**Paper**](https://arxiv.org/abs/2510.12838)
💻 [**GitHub**](https://github.com/OPPO-PersonalAI/Adaptive_Agent_Foundation_Models)

---

## 🔑 Key Highlights

- ⚙️ **Unified reasoning & agentic modeling**
  Integrates direct reasoning, chain-of-thought, and tool-augmented actions within a single backbone.
- 🔄 **Route-then-Align supervised fine-tuning**
  Trains task-aware routing followed by mode-aligned trajectory learning.
- 🧩 **Adaptive Policy Optimization (APO)**
  Reinforcement learning with adaptive sampling and a cost-regularized reward for an efficiency–accuracy balance.
- 💡 **Substantially lower inference cost**
  Adaptive routing cuts redundant reasoning and tool use while preserving correctness.

---

## 📘 Citation

```bibtex
@article{chen2025a2fm,
  title={A\textsuperscript{2}FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning},
  author={Chen, Qianben and Cao, Jingyi and Zhang, Jiayu and Qin, Tianrui and Li, Xiaowan and Zhu, King and Shi, Dingfeng and Zhu, He and Liu, Minghao and Liang, Xiaobo and others},
  journal={arXiv preprint arXiv:2510.12838},
  year={2025}
}
```
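
---

## 🚀 Quick Start (illustrative)

Since the card declares `library_name: transformers` and `pipeline_tag: text-generation`, the checkpoint should load through the standard causal-LM API. The sketch below is a minimal example under that assumption; the repository ID is a placeholder (the card does not state the exact Hub path), and fully agentic trajectories additionally require the tool environment from the GitHub repo above.

```python
# Minimal inference sketch for A²FM via the standard transformers text-generation API.
# Assumptions: the model ships a chat template and loads as a causal LM;
# the repo id below is a PLACEHOLDER — replace it with the actual Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OPPO-PersonalAI/A2FM"  # placeholder repository id (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# Standard chat-template path; the model's own routing decides whether the
# query is handled in instant, reasoning, or agentic mode.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that plain `generate` only exercises the instant and reasoning modes end to end; agentic mode emits tool calls that need to be executed by an external scaffold such as the one provided in the linked repository.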