The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization Paper • 2403.17031 • Published Mar 24, 2024 • 8
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 135
Laguna XS.2 Collection Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 26 days ago • 24
view article Article Harness, Scaffold, and the AI Agent Terms Worth Getting Right sergiopaniego, ariG23498 • 9 days ago • 91
view article Article EMO: Pretraining mixture of experts for emergent modularity allenai • 25 days ago • 38
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 68
Tiny Models for CI Collection A collection of tiny models of common model architectures. Useful for e2e smoke tests across real pretrained models to validate loss behavior. • 10 items • Updated Apr 22 • 1
view article Article TRL v1.0: Post-Training Library Built to Move with the Field +2 qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 53
view article Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 159
view article Article Introducing Storage Buckets on the Hugging Face Hub +10 Wauplin, coyotte508, XciD, victor, julien-c, lhoestq, pierric, Sylvestre, hlarcher, rajatarya, seanses, assafvayner • Mar 10 • 195
view article Article Did GPT 5.2 make a breakthrough discovery in theoretical physics? dlouapre • Feb 19 • 62
view article Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 ggerganov, ngxson, allozaur, lysandre, victor, julien-c • Feb 20 • 507