---
license: apache-2.0
tags:
- qlora
- verilog
- code-generation
- circuit-synthesis
- electronic-design-automation
language: en
datasets:
- bnadimi/PyraNet-Verilog
model-index:
- name: veriforge-deepseek-coder-1.3b-instruct
results: []
---
# veriforge-deepseek-coder-1.3b-instruct
This model is a QLoRA fine-tuned version of [`deepseek-ai/deepseek-coder-1.3b-instruct`](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct), designed for the domain of **Verilog RTL synthesis**. It accepts natural-language descriptions of digital circuits and generates Verilog code modules.
## ✨ Model Details
- **Base Model**: DeepSeek-Coder-1.3B-Instruct (4-bit quantized)
- **Fine-Tuning**: QLoRA on Hugging Face `Trainer` API
- **Domain**: Hardware Description Language (HDL), Electronic Design Automation (EDA)
- **Tokenizer**: `AutoTokenizer` with `trust_remote_code=True`
## 📚 Dataset
- **Source**: [PyraNet-Verilog](https://huggingface.co/datasets/bnadimi/PyraNet-Verilog)
- **Content**: Natural-language descriptions paired with their corresponding Verilog implementations
- **Preprocessing**: Reformatted into instruction-style prompts with markdown headers (illustrated in the sketch below)
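
The exact preprocessing script is not included in this card; the sketch below is an illustrative assumption that mirrors the prompt shape used in the Usage section, with `description` and `code` as hypothetical field names:

```python
# Hypothetical reformatting step; the field names `description` and `code`
# are assumptions, and the header layout mirrors the Usage prompt below.
def to_instruction_prompt(example):
    return {
        "text": (
            "### Task: Synthesize Verilog\n"
            f"{example['description']}\n"
            "### Verilog Code:\n"
            f"{example['code']}"
        )
    }
```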
## 🧪 Training Configuration
- Framework: `transformers`, `peft`, `accelerate`, `bitsandbytes`
- Epochs: 10
- Batch Size: 4 (with gradient accumulation of 4)
- Optimizer: AdamW
- Learning Rate: 2e-4
- Device: Google Colab GPU (A100 or T4)
- Precision: 4-bit (QLoRA) + FP16 mixed-precision
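
For reference, a minimal QLoRA setup consistent with the configuration above might look like the sketch below. The LoRA rank, alpha, dropout, and target modules are assumptions, since this card does not list them; AdamW is the `Trainer` default optimizer.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with FP16 compute, as in the QLoRA recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-1.3b-instruct",
    quantization_config=bnb_config,
    trust_remote_code=True,
)
model = prepare_model_for_kbit_training(model)

# LoRA hyperparameters here are illustrative assumptions, not the card's values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Matches the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="veriforge-qlora",
    num_train_epochs=10,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
)
```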
## 🚀 Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "louijiec/veriforge-deepseek-coder-1.3b-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = "### Task: Synthesize Verilog\nDesign a 2-to-1 multiplexer using behavioral modeling.\n### Verilog Code:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
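
On a memory-constrained GPU (e.g., a free-tier Colab T4), the model can also be loaded in 4-bit, matching how it was trained. This variant is optional and requires `bitsandbytes`:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Optional: 4-bit loading to reduce GPU memory use (requires bitsandbytes).
model = AutoModelForCausalLM.from_pretrained(
    "louijiec/veriforge-deepseek-coder-1.3b-instruct",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
    trust_remote_code=True,
)
```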
## ✅ Evaluation
The model has been sanity-checked with prompt-based generation. Outputs are expected to include:
- Valid Verilog keywords (`module`, `input`, `output`, `assign`, `endmodule`)
- Structured code starting with `module`
- Coherent outputs for standard digital design prompts (e.g., multiplexers, adders, encoders)
For functional verification, use [Icarus Verilog](http://iverilog.icarus.com/) or [Verilator](https://www.veripool.org/verilator/) to compile and simulate the generated modules.
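
As a minimal sketch of that workflow (assuming Icarus Verilog is installed and on your `PATH`), a generated module can be elaboration-checked without writing a testbench:

```python
import pathlib
import subprocess
import tempfile

# Assumes `verilog_code` holds the text generated by the model above.
verilog_code = "module mux2(input a, b, sel, output y); assign y = sel ? b : a; endmodule"

src = pathlib.Path(tempfile.mkdtemp()) / "design.v"
src.write_text(verilog_code)

# `iverilog -t null` elaborates the design without emitting a simulation
# binary, which is enough to catch syntax and elaboration errors.
result = subprocess.run(["iverilog", "-t", "null", str(src)], capture_output=True, text=True)
print("OK" if result.returncode == 0 else result.stderr)
```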
## 📎 Citations
- Dettmers et al. (2023). [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
- DeepSeek. [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct)
- Bnadimi. [PyraNet-Verilog dataset](https://huggingface.co/datasets/bnadimi/PyraNet-Verilog)
- Hugging Face. [🤗 Transformers](https://github.com/huggingface/transformers)