---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# 🧠 A²FM: Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning

**A²FM (Adaptive Agent Foundation Model)** unifies reasoning-centric and agentic paradigms into a single framework that adaptively selects among **three execution modes**: *instant*, *reasoning*, and *agentic*.
It follows a **route-then-align** training principle and introduces **Adaptive Policy Optimization (APO)** to jointly optimize accuracy and efficiency.

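
To make the three execution modes concrete, here is a minimal sketch of how a route-then-execute loop could be organized. Everything in it is hypothetical: the `Mode` enum, the keyword-based `route` heuristic, and the placeholder handlers are illustrative stand-ins. In A²FM the routing decision is learned by the model itself, not written as rules.

```python
from enum import Enum
from typing import Callable, Dict


class Mode(Enum):
    INSTANT = "instant"      # answer directly, no explicit reasoning trace
    REASONING = "reasoning"  # reason step by step before answering
    AGENTIC = "agentic"      # plan and call external tools


def route(query: str) -> Mode:
    """Toy keyword router. Assumption: stands in for the learned router."""
    text = query.lower()
    if any(keyword in text for keyword in ("search", "browse", "latest")):
        return Mode.AGENTIC
    if any(keyword in text for keyword in ("prove", "solve", "compute")):
        return Mode.REASONING
    return Mode.INSTANT


# Placeholder handlers; a real system would invoke the model per mode.
HANDLERS: Dict[Mode, Callable[[str], str]] = {
    Mode.INSTANT: lambda q: f"[instant] direct answer to: {q}",
    Mode.REASONING: lambda q: f"[reasoning] step-by-step answer to: {q}",
    Mode.AGENTIC: lambda q: f"[agentic] tool-assisted answer to: {q}",
}


def answer(query: str) -> str:
    """Pick a mode first, then run only that mode's handler."""
    return HANDLERS[route(query)](query)
```

The point of the structure is that only one mode runs per query, so easy queries never pay for a reasoning trace or tool calls.
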
A²FM achieves **state-of-the-art performance** on major agentic, reasoning, and general benchmarks:
- **13.4%** on *BrowseComp* (agentic)
- **70.4%** on *AIME25* (reasoning)
- **16.7%** on *HLE* (general)

Notably, its adaptive execution achieves a **cost of pass of only \$0.00487 per correct answer**, cutting cost by **45.2% relative to the reasoning mode** and **33.5% relative to the agentic mode**. This gives substantially higher cost efficiency while maintaining comparable accuracy.

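
The cost figures above can be unpacked with a little arithmetic. The sketch below assumes that cost of pass means expected cost per attempt divided by accuracy (a common definition, not one stated in this card) and that both percentage reductions are measured on that same metric. The baseline values it derives are implied by the reported numbers, not reported themselves.

```python
def cost_of_pass(cost_per_attempt: float, accuracy: float) -> float:
    """Expected spend per correct answer (assumed definition)."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    return cost_per_attempt / accuracy


def implied_baseline(adaptive_cost: float, reduction: float) -> float:
    """Baseline cost that a given fractional reduction would start from."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return adaptive_cost / (1.0 - reduction)


ADAPTIVE_COST = 0.00487  # USD per correct answer, as reported above

# Implied cost of pass for the single-mode baselines (derived, not reported).
reasoning_baseline = implied_baseline(ADAPTIVE_COST, 0.452)  # about 0.0089 USD
agentic_baseline = implied_baseline(ADAPTIVE_COST, 0.335)    # about 0.0073 USD

# Made-up illustration of the definition: $0.002 per attempt at 50% accuracy.
example = cost_of_pass(0.002, 0.5)  # 0.004 USD per correct answer
```
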
📄 [**Paper**](https://arxiv.org/abs/2510.12838)
💻 [**GitHub**](https://github.com/OPPO-PersonalAI/Adaptive_Agent_Foundation_Models)

---

## 🔑 Key Highlights

- ⚙️ **Unified reasoning & agentic modeling**
  Integrates direct reasoning, chain-of-thought, and tool-augmented actions within a single backbone.

- 🔄 **Route-then-Align supervised fine-tuning**
  Trains task-aware routing followed by mode-aligned trajectory learning.

- 🧩 **Adaptive Policy Optimization (APO)**
  Reinforcement learning with adaptive sampling and cost-regularized reward for efficiency–accuracy balance.

- 💡 **Substantially lower inference cost**
  Adaptive routing cuts redundant reasoning/tool use while preserving correctness.

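
As a rough illustration of what a cost-regularized reward can look like, the sketch below scores a rollout as correctness minus a weighted, capped cost term. This is an assumed form for illustration only: the actual APO reward is defined in the paper, and the `Rollout` type, `cost_weight`, and `cost_budget` are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    mode: str        # "instant", "reasoning", or "agentic"
    correct: bool    # whether the final answer was judged correct
    cost_usd: float  # total inference plus tool cost for this rollout


def cost_regularized_reward(
    rollout: Rollout,
    cost_weight: float = 0.5,   # hypothetical trade-off coefficient
    cost_budget: float = 0.01,  # hypothetical per-query budget in USD
) -> float:
    """Correctness reward minus a penalty that grows with cost."""
    accuracy_term = 1.0 if rollout.correct else 0.0
    # Normalize by the budget and cap at 1 so the penalty stays bounded.
    cost_term = min(rollout.cost_usd / cost_budget, 1.0)
    return accuracy_term - cost_weight * cost_term


cheap_correct = cost_regularized_reward(Rollout("instant", True, 0.001))
costly_correct = cost_regularized_reward(Rollout("agentic", True, 0.008))
wrong = cost_regularized_reward(Rollout("reasoning", False, 0.004))
```

Under this form, two correct rollouts are ranked by cost, so a policy trained on it is nudged toward the cheapest mode that still gets the answer right.
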
---

## 📘 Citation

```bibtex
@article{chen2025textsuperscript,
  title={A\textsuperscript{2}FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning},
  author={Chen, Qianben and Cao, Jingyi and Zhang, Jiayu and Qin, Tianrui and Li, Xiaowan and Zhu, King and Shi, Dingfeng and Zhu, He and Liu, Minghao and Liang, Xiaobo and others},
  journal={arXiv preprint arXiv:2510.12838},
  year={2025}
}
```