Article: The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix • by codelion • 6 days ago • 30 upvotes
Paper: Kimi Linear: An Expressive, Efficient Attention Architecture • arXiv:2510.26692 • Published 9 days ago • 99 upvotes
Paper: Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs • arXiv:2510.18245 • Published 19 days ago • 6 upvotes
Paper: D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI • arXiv:2510.05684 • Published Oct 7 • 136 upvotes
Paper: GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection • arXiv:2403.03507 • Published Mar 6, 2024 • 189 upvotes
Collection: 💧 LFM2 • LFM2 is a new generation of hybrid models designed for on-device deployment. • 22 items • Updated 3 days ago • 117 upvotes