Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 6 days ago • 60
SmolDocling datasets Collection Datasets used to train SmolDocling • 6 items • Updated 6 days ago • 26
Article TimeScope: How Long Can Your Video Large Multimodal Model Go? By orrzohar and 3 others • 14 days ago • 31
Article Fast LoRA inference for Flux with Diffusers and PEFT By sayakpaul and 1 other • 14 days ago • 40
Article Arc Virtual Cell Challenge: A Primer By FL33TW00D-HF and 1 other • 19 days ago • 47
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 29 days ago • 611
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens Paper • 2506.17218 • Published Jun 20 • 27
SmolVLA Collection Small, efficient and light-weight VLAs pretrained on community datasets • 1 item • Updated Jun 1 • 27
Article Weekly Robotics June #1 - SmolVLA discovery and thoughts By Beegbrain • Jun 3 • 9
Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • Jun 3 • 70
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 216
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 122
Article CodeAgents + Structure: A Better Way to Execute Actions By akseljoonas and 1 other • May 28 • 71
Article Exploring Quantization Backends in Diffusers By derekl35 and 2 others • May 21 • 39
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • May 21 • 199