Article: Transformers backend integration in SGLang • By marcsun13 and 4 others • 3 days ago
Article: (LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware • By derekl35 and 4 others • 7 days ago
Article: Fine-tuning Llama 2 70B using PyTorch FSDP • By smangrul and 3 others • Sep 13, 2023
Collection: Flux quantized checkpoints • The quantized Flux checkpoints used in the blog post https://huggingface.co/blog/diffusers-quantization • 5 items • Updated May 21
Article: The Transformers Library: standardizing model definitions • By lysandre and 3 others • May 15
Article: Exploring Quantization Backends in Diffusers • By derekl35 and 2 others • May 21
Article: Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs • By wenhuach and 8 others • Apr 29
Article: 🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6 times faster)! • By PrunaAI and 3 others • Apr 23
Collection: Gemma 3 QAT • Quantization-Aware Trained (QAT) Gemma 3 checkpoints. These models preserve quality comparable to half precision while using 3x less memory • 15 items • Updated 27 days ago
Article: Memory-efficient Diffusion Transformers with Quanto and Diffusers • By sayakpaul and 1 other • Jul 30, 2024
Article: Welcome Llama 4 Maverick & Scout on Hugging Face! • By burtenshaw and 6 others • Apr 5
Article: NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets • By mingyuliutw and 4 others • Mar 18
Article: Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • By ariG23498 and 3 others • Mar 12
Article: LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! • By medmekk and 1 other • Mar 7
Paper: LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models • arXiv:2310.08659 • Published Oct 12, 2023
Article: Fine-tuning LLMs to 1.58bit: extreme quantization made easy • By medmekk and 5 others • Sep 18, 2024