It comes packed with updates: > Agent training with tools in GRPO > New CISPO & SAPO losses + reasoning rewards > vLLM quantization in colocate mode > Dataset shuffling in SFT > Lots of NEW examples > Tons of fixes and documentation improvements
The LLM by @karpathy is officially in the library, and we wrote a blog covering: how did we port the model, differences from the original, and how to run or train it.