Optimised AWQ quants for high-throughput deployments of Gemma 2! Compatible with Transformers, TGI & vLLM 🤗
AI & ML interests
Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
Organization Card
Welcome to the home of exciting quantized models! We'd love to see increased adoption of powerful state-of-the-art open models, and quantization is key to making them run on more types of hardware.
Resources:
- Llama 3.1 Quantized Models: Optimised quants of Llama 3.1 for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗.
- Hugging Face Llama Recipes: A set of minimal recipes to get started with Llama 3.1.
Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models:
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF: Text Generation • 3B • 1k downloads • 52 likes
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF: Text Generation • 3B • 20.4k downloads • 25 likes
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF: Text Generation • 1B • 346k downloads • 35 likes
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF: Text Generation • 1B • 34.3k downloads • 18 likes
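To give a rough sense of why these quants help on smaller hardware, here is a minimal Python sketch that estimates the size of the weights alone. It assumes round parameter counts of 1B and 3B (the real models are slightly larger), 16 bits per weight for an unquantized BF16 checkpoint, and the llama.cpp Q8_0 layout of 32 8-bit weights plus one 16-bit scale per block, which works out to 8.5 bits per weight. The function and constant names are illustrative and not part of any library. Q4_K_M mixes several block types, so its effective bits per weight is not hard-coded here; pass your own figure if you have one.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Ignores the KV cache, activations, and file metadata.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: unquantized reference checkpoint stored in BF16.
BF16_BITS = 16.0
# Q8_0 block: 32 weights x 8 bits + one 16-bit scale, spread over 32 weights.
Q8_0_BITS = (32 * 8 + 16) / 32

# Assumption: nominal parameter counts taken from the "1B" / "3B" labels.
for label, n_params in [("1B", 1e9), ("3B", 3e9)]:
    bf16 = weight_footprint_gb(n_params, BF16_BITS)
    q8 = weight_footprint_gb(n_params, Q8_0_BITS)
    print(f"Llama 3.2 {label}: BF16 ~{bf16:.2f} GB, Q8_0 ~{q8:.2f} GB")
```

These figures are lower bounds on memory use: a running model also needs room for the KV cache and activations.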
Models: 21
- hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm: Any-to-Any • 109B • 7 downloads • 2 likes
- hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm-unfused: Any-to-Any • 109B • 4 downloads • 2 likes
- hugging-quants/gemma-2-9b-it-AWQ-INT4: Text Generation • 2B • 2.34k downloads • 6 likes
- hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4: Text Generation • 6B • 10.1k downloads
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF: Text Generation • 1B • 34.3k downloads • 18 likes
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF: Text Generation • 1B • 346k downloads • 35 likes
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF: Text Generation • 3B • 20.4k downloads • 25 likes
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF: Text Generation • 3B • 1k downloads • 52 likes
- hugging-quants/Meta-Llama-3.1-405B-BNB-NF4: Text Generation • 211B • 28 downloads • 2 likes
- hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4: Text Generation • 214B • 16 downloads • 5 likes
Datasets: 0 (none public yet)
