Improve model card for HAPO: Add paper/code links, detailed description, and documentation from GitHub

#1
by nielsr HF Staff - opened

This PR significantly enhances the model card for the HAPO-based Qwen2.5-Math-1.5B model by:

  • Updating the title to include the paper name for better discoverability.
  • Adding a direct link to the paper: From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature
  • Including a link to the GitHub repository: https://github.com/starriver030515/HAPO
  • Providing a detailed "About HAPO" section that incorporates the paper's abstract and a framework image to describe the model's background and methodology.
  • Adding documentation from the GitHub README: sections for installation, usage (training and evaluation scripts), experimental results, and training dynamics, so the model card serves as a central resource. All image links have been updated to raw GitHub URLs so they render properly.
  • Populating the BibTeX citation with the correct entry, using eprint={2509.16591} from the paper's canonical link.
  • Retaining existing metadata: library_name: transformers, license: mit, and pipeline_tag: text-generation are kept based on evidence and majority consensus.

This update aims to make the model card much more informative and user-friendly for the Hugging Face community.
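
For reviewers checking the image links, the raw-URL rewrite mentioned above can be sketched roughly as follows. This is a minimal illustration, not the script used for this PR: the helper name `to_raw_github_url`, the `main` branch, and the `assets/framework.png` path are all hypothetical, and the sketch assumes the branch or tag name contains no slashes.

```python
from urllib.parse import urlparse


def to_raw_github_url(url: str) -> str:
    """Rewrite a github.com 'blob' URL to its raw.githubusercontent.com form.

    Hypothetical helper for illustration; URLs that do not match the
    owner/repo/blob/ref/path layout are returned unchanged.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expected layout: owner / repo / "blob" / ref / path...
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        return url
    owner, repo, _, ref, *path = parts
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{'/'.join(path)}"


# Hypothetical image path; the real README paths may differ.
print(to_raw_github_url(
    "https://github.com/starriver030515/HAPO/blob/main/assets/framework.png"
))
```

Links that already point at raw content, or at any other host, pass through untouched, so the helper is safe to run over every image link in a README.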

Cannot merge
This branch has merge conflicts in the following files:
  • README.md
