Article Building the Hugging Face MCP Server By evalstate and 3 others • 27 days ago • 57
Article Tiny Agents in Python: a MCP-powered agent in ~70 lines of code By celinah and 3 others • May 23 • 153
Article Transformers backend integration in SGLang By marcsun13 and 4 others • Jun 23 • 50
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 29 days ago • 611
Article StarCoder: A State-of-the-Art LLM for Code By lvwerra and 1 other • May 4, 2023 • 62
Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 331
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1 • 37
Article Welcome Llama 4 Maverick & Scout on Hugging Face! By burtenshaw and 6 others • Apr 5 • 146
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10 • 154
Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.28k
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 876
Article Assisted Generation: a new direction toward low-latency text generation By joaogante • May 11, 2023 • 69
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference By mfuntowicz and 1 other • Jan 16 • 75
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 63
Article Use Models from the Hugging Face Hub in LM Studio By yagilb • Nov 28, 2024 • 140
Article Unlocking Longer Generation with Key-Value Cache Quantization By RaushanTurganbay • May 16, 2024 • 49
Article 🪆 Introduction to Matryoshka Embedding Models By tomaarsen and 2 others • Feb 23, 2024 • 153