Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • 8 days ago • 138
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 29 days ago • 611
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles Paper • 2505.19914 • Published May 26 • 44
Sanskrit Collection collection of all Sanskrit text, currently at 115K samples • 8 items • Updated May 24 • 11
Article Finally, a Replacement for BERT: Introducing ModernBERT By bclavie and 14 others • Dec 19, 2024 • 673
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 57
Article Gotchas in Tokenizer Behavior Every Developer Should Know By qgallouedec • Apr 18 • 40
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 114
Tessa-T1 REACT REASONING MODEL Collection Tessa-T1 is a model that generates stateful React with Tailwind styling. It has features of other libraries as well. It is based on Qwen2.5-Coder. • 5 items • Updated Mar 24 • 8
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 284
SLM Judge Models Collection Base model(s) merged with the specific evaluation task adapter. Each model performs excellently for its purpose and remains useful for general tasks. • 6 items • Updated Feb 18 • 1
Article Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies By prithivMLmods • Feb 17 • 23
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM By ariG23498 and 3 others • Mar 12 • 448
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation Paper • 2503.04872 • Published Mar 6 • 15