---
license: apache-2.0
base_model: Qwen3-Coder-32B-Instruct
tags:
- transformers
- zen
- text-generation
- thinking-mode
- zoo-gym
- hanzo-ai
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: Zen-Coder
  results:
  - task:
      type: text-generation
    dataset:
      name: MMLU
      type: MMLU
    metrics:
    - type: accuracy
      value: 0.789
      name: MMLU
widget:
- text: "User: What is the capital of France?\n\nAssistant:"
---

# Zen-Coder (480B, 30B active)

Part of the [Zen AI Model Family](https://huggingface.co/zenlm)

## Model Description

- **Parameters**: 480B total (30B active)
- **Base Model**: Qwen3-Coder-32B
- **Specialization**: Advanced code generation and debugging
- **Training**: Code-specific training on 100+ programming languages
- **Context**: 32K-128K tokens
- **Thinking**: Up to 512,000 tokens

## Files in This Repository

This repository contains all formats and quantizations:

### SafeTensors (Original)
- `model.safetensors` - Full-precision weights
- `config.json` - Model configuration
- `tokenizer.json` - Fast tokenizer

### GGUF Quantized
- `zen-coder-480b-instruct-Q4_K_M.gguf` - 4-bit (recommended)
- `zen-coder-480b-instruct-Q5_K_M.gguf` - 5-bit (balanced)
- `zen-coder-480b-instruct-Q8_0.gguf` - 8-bit (high quality)
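
These size-vs-quality trade-offs can be sanity-checked with a back-of-the-envelope estimate: a GGUF file is roughly parameter count times bits per weight. The bits-per-weight figures below are assumed averages for each quant type (around 4.5 for Q4_K_M, 5.5 for Q5_K_M, 8.5 for Q8_0), not measured values, so treat the output as a ballpark only:

```python
def approx_size_gb(params_billion: float, bits_per_weight: float) -> float:
    # size in GB ≈ (params * 1e9 * bits / 8 bits-per-byte) / 1e9 bytes-per-GB
    # = params_billion * bits_per_weight / 8 (ignores GGUF metadata overhead)
    return params_billion * bits_per_weight / 8

# Assumed average bits-per-weight for each quantization in this repo
for quant, bpw in [("Q4_K_M", 4.5), ("Q5_K_M", 5.5), ("Q8_0", 8.5)]:
    print(f"{quant}: ~{approx_size_gb(480, bpw):.0f} GB")
```

Note that with only ~30B parameters active per token, compute per token is far lower than these storage numbers suggest, but the full weights still need to be stored and served by the runtime.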

### MLX (Apple Silicon)
- `mlx-4bit/` - 4-bit quantized for M-series
- `mlx-8bit/` - 8-bit quantized for M-series

## Performance

| Benchmark | Score | Rank |
|-----------|-------|------|
| MMLU | 78.9% | Top 10% |
| GSM8K | 89.3% | Top 15% |
| HumanEval | 72.8% | Top 20% |

## Quick Start

### Transformers
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "zenlm/zen-coder-480b-instruct", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-coder-480b-instruct")

# With thinking mode
messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
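
When thinking mode is enabled, the raw completion typically contains the model's reasoning before the final answer. Assuming the reasoning is delimited by `<think>...</think>` tags (the convention used by Qwen3-style chat templates; verify against this model's actual template), a small helper can separate the two:

```python
def split_thinking(completion: str, open_tag: str = "<think>", close_tag: str = "</think>"):
    # Returns (reasoning, answer). The tag names are an assumption borrowed
    # from Qwen3-style templates; check this model's chat template before
    # relying on them.
    if open_tag in completion and close_tag in completion:
        start = completion.index(open_tag) + len(open_tag)
        end = completion.index(close_tag)
        return completion[start:end].strip(), completion[end + len(close_tag):].strip()
    return "", completion.strip()
```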

### GGUF with llama.cpp

```bash
./llama-cli -m zen-coder-480b-instruct-Q4_K_M.gguf -p "Your prompt" -n 512
```

### MLX for Apple Silicon

```python
from mlx_lm import load, generate

model, tokenizer = load("zenlm/zen-coder-480b-instruct")
response = generate(model, tokenizer, prompt="Your prompt", max_tokens=200)
```

## Unique Training Background

Code-specific training on 100+ programming languages.

This model was specifically optimized for advanced code generation and debugging, with careful attention to:

- Inference efficiency
- Memory footprint
- Quality preservation
- Thinking capabilities

---

Part of the Zen Family • [Collection](https://huggingface.co/collections/zenlm/zen) • [GitHub](https://github.com/zenlm/zen)
|