FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages By davanstrien and 5 others • 29 days ago • 29
SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 29 days ago • 611
FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 21