Qwen3-14B Unsloth Fine-tuned Model
Model Description
This is a fine-tuned version of the Qwen3-14B model, optimized with Unsloth and 4-bit quantization for efficient training and inference. It was trained on Turkish mathematical reasoning datasets to strengthen its problem-solving capabilities in Turkish.
Key Features
- Up to 2x faster training using Unsloth optimizations
- 4-bit quantization for a reduced memory footprint
- Enhanced Turkish mathematical reasoning capabilities
- Compatible with Hugging Face's TRL library
Model Details
- Base Model: unsloth/qwen3-14b-unsloth-bnb-4bit
- License: Apache 2.0
- Fine-tuned by: momererkoc
- Language: Primarily Turkish (with English capabilities)
- Quantization: 4-bit via BitsAndBytes (see the configuration sketch below)
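For reference, the sketch below shows what an explicit 4-bit BitsAndBytes configuration looks like when loading the base checkpoint yourself. The pre-quantized unsloth/qwen3-14b-unsloth-bnb-4bit repository already ships its own quantization config, so the NF4 and compute-dtype settings here are illustrative assumptions rather than the exact values used for this model.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed 4-bit settings; treat these as a sketch, not the model's exact config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NormalFloat4 weight quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 for speed/stability
    bnb_4bit_use_double_quant=True,          # nested quantization saves extra memory
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen3-14b-unsloth-bnb-4bit",
    quantization_config=bnb_config,
    device_map={"": 0},
)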
Training Data
The model was fine-tuned on specialized Turkish datasets:
- Turkish Math 186k
- OpenMathReasoning-mini
Usage
from huggingface_hub import login
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Authenticate with the Hugging Face Hub (use your own access token).
login(token="hf_...")

# Load the tokenizer and the 4-bit quantized base model.
tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen3-14b-unsloth-bnb-4bit")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen3-14b-unsloth-bnb-4bit",
    device_map={"": 0},
)

# Attach the fine-tuned LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, "momererkoc/qwen3-14b-reasoning-turkish-math186k")

# Example question (Turkish): "A farm has 12 cows and 8 chickens. If 5 more cows
# are added, how many animals are there in total?"
question = "Bir çiftlikte 12 inek ve 8 tavuk vardır. Çiftliğe 5 inek daha eklenirse toplam hayvan sayısı kaç olur?"
messages = [
    {"role": "user", "content": question}
]

# Build the chat prompt; enable_thinking=True activates Qwen3's reasoning mode.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

# Generate and stream the answer to stdout as it is produced.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=3000,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
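With enable_thinking = True, the Qwen3 chat template lets the model write its reasoning inside <think>...</think> tags before the final answer (setting enable_thinking = False disables this). If you want only the final answer rather than a live stream, a minimal post-processing sketch, assuming the standard Qwen3 template markers, looks like this:

# Generate without a streamer so the full output can be post-processed.
inputs = tokenizer(text, return_tensors="pt").to("cuda")
output_ids = model.generate(
    **inputs,
    max_new_tokens=3000,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
)

# Decode only the newly generated tokens (skip the prompt).
generated = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)

# Assumption: the reasoning block is closed by a literal "</think>" that survives
# decoding; if your tokenizer strips it, locate the marker by its token id instead.
answer = generated.split("</think>")[-1].strip()
print(answer)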
Performance
- Up to 2x faster training compared to standard implementations
- Reduced GPU memory requirements thanks to 4-bit quantization
- Maintains strong Turkish language understanding
- Enhanced mathematical reasoning capabilities
Optimization Details
The fine-tuning setup combines (a reproduction sketch follows the list):
- Unsloth for accelerated training
- 4-bit quantization via BitsAndBytes
- LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
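For anyone wanting to reproduce a similar fine-tune, the sketch below shows how Unsloth, 4-bit loading, LoRA adapters, and TRL's SFTTrainer typically fit together. This is not the author's original training script: the LoRA rank, target modules, sequence length, hyperparameters, and the turkish_math.jsonl dataset path are illustrative assumptions, and the exact SFTTrainer arguments vary somewhat between TRL versions.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base model through Unsloth (sequence length is an assumption).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-14b-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning (illustrative values).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical dataset file with a pre-formatted "text" column of chat-templated examples.
dataset = load_dataset("json", data_files="turkish_math.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,          # newer TRL versions call this processing_class
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()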
Acknowledgments
Special thanks to:
- The Unsloth team for their optimization tools
- Hugging Face for the Transformers ecosystem
- The creators of the Turkish math datasets
