GTAlign

community

AI & ML interests

None defined yet.

Recent Activity

GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare

ArXiv Hugging Face

GTAlign applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.

Models

We have released five model checkpoints, and we are preparing more thoroughly trained models.

Model Name Size Dataset Hugging Face Link
GTAlign/Qwen2.5-3B-Math-140step 3B Math Model
GTAlign/Qwen2.5-3B-Medium-110step 3B Medium Model
GTAlign/Qwen2.5-3B-AbgQA-140step 3B Ambig-QA Model
GTAlign/Qwen2.5-3B-WildGuard-140step 3B WildGuard Model
GTAlign/Qwen2.5-3B-Full-160step 3B Full Model

datasets 0

None public yet