Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published 7 days ago • 91
Running on CPU Upgrade 2.19k The Smol Training Playbook 📚 The secrets to building world-class LLMs
AWorld: Orchestrating the Training Recipe for Agentic AI Paper • 2508.20404 • Published Aug 28 • 38
DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis Paper • 2508.20033 • Published Aug 27 • 10
mlabonne/gemma-3-27b-it-abliterated Image-Text-to-Text • 27B • Updated Mar 21 • 5.28k • 234
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 142
facebook/dinov3-vit7b16-pretrain-lvd1689m Image Feature Extraction • 7B • Updated Aug 19 • 18.4k • 185
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 378
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 379
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 Text Generation • 50B • Updated Oct 15 • 62.3k • 209