Hugging Quants

AI & ML interests

Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗

Organization Card


Welcome to the home of exciting quantized models! We'd love to see wider adoption of powerful, state-of-the-art open models, and quantization is key to making them run on more types of hardware.
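To make the hardware point concrete, here is a rough back-of-the-envelope sketch of how bit width changes the memory needed just to hold a model's weights. The helper name `weight_memory_gib` is hypothetical, and the 9e9 parameter count is only a nominal figure for a 9B model. The estimate ignores quantization metadata (scales, zero points), activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width and no
    extra quantization metadata is counted.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    nominal_params = 9e9  # assumed nominal size of a 9B model
    for bits in (16, 8, 4):
        size = weight_memory_gib(nominal_params, bits)
        print(f"{bits}-bit weights: about {size:.1f} GiB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is often the difference between needing several accelerators and fitting on one.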

Resources:

Llama 3.1 Quantized Models: Optimised quants of Llama 3.1 for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
Hugging Face Llama Recipes: A set of minimal recipes to get started with Llama 3.1.

Collections 3


Models 21

hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm

Image-Text-to-Text • 109B • Updated Apr 9, 2025 • 5 • 2

hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm-unfused

Image-Text-to-Text • 109B • Updated Apr 9, 2025 • 5 • 2

hugging-quants/gemma-2-9b-it-AWQ-INT4

Text Generation • 9B • Updated Oct 17, 2024 • 2.6k • 9

hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4

Text Generation • 47B • Updated Oct 7, 2024 • 13.9k

hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF

Text Generation • 1B • Updated Sep 25, 2024 • 39.6k • 21

hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF

Text Generation • 1B • Updated Sep 25, 2024 • 788k • 46

hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF

Text Generation • 3B • Updated Sep 25, 2024 • 25.1k • 27

hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF

Text Generation • 3B • Updated Sep 25, 2024 • 3.77k • 52

hugging-quants/Meta-Llama-3.1-405B-BNB-NF4

Text Generation • 418B • Updated Sep 16, 2024 • 58 • 2

hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4

Text Generation • 423B • Updated Sep 16, 2024 • 74 • 5

Datasets 0

None public yet