Qwen3-0.6B-Coding-Finetuned-v1

This model is a fine-tuned version of Qwen/Qwen3-0.6B specialized for Python code generation tasks. It is trained to follow programming-related instructions and return Python code solutions.

πŸ’» Model Description

  • Base Model: Qwen/Qwen3-0.6B
  • Fine-tuning Method: QLoRA (Quantized Low-Rank Adaptation)
  • Dataset: TokenBender/code_instructions_122k_alpaca_style - roughly 122,000 Alpaca-style coding instructions paired with their solutions.
  • Training: Optimized for instruction-based code generation, using 4-bit quantization for memory efficiency (see the loading sketch below).
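
The training script itself is not part of this repository, but the 4-bit QLoRA loading step typically looks like the sketch below. The BitsAndBytesConfig values are assumptions based on common QLoRA defaults, not confirmed training settings.

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Illustrative 4-bit NF4 quantization config commonly used for QLoRA (assumed values)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Load the base model in 4-bit before attaching LoRA adapters
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B", trust_remote_code=True)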

⚠️ Important Considerations

  • Verify All Code: Generated code may contain errors or be suboptimal. Always test and review the code thoroughly before using it in production environments.
  • Security: The generated code has not been vetted for security vulnerabilities. Be cautious when using it in security-sensitive applications.
  • Not a Replacement for Developers: This model is a tool to assist developers, not replace them. Human oversight and expertise are crucial.

πŸš€ Usage

With transformers

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

model_id = "rohitnagareddy/Qwen3-0.6B-Coding-Finetuned-v1"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Create conversation for a Python code-generation task
messages = [
    {"role": "system", "content": "You are an expert coding assistant."},
    {"role": "user", "content": "Write a Python function that takes a list of integers and returns the sum of all even numbers in the list."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Generate response
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
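By default the text-generation pipeline returns the prompt together with the completion. If you only want the newly generated code, you can pass the standard return_full_text=False option:

# Same call, but keep only the model's completion
outputs = pipe(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    return_full_text=False,  # drop the prompt from the returned text
)
print(outputs[0]["generated_text"])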

πŸ”§ GGUF Versions

This repository includes quantized GGUF versions for use with llama.cpp and compatible tools:

  • Qwen3-0.6B-Coding-Finetuned-v1.fp16.gguf - 16-bit float (largest, best quality)
  • Qwen3-0.6B-Coding-Finetuned-v1.Q8_0.gguf - 8-bit quantization (good balance)
  • Qwen3-0.6B-Coding-Finetuned-v1.Q5_K_M.gguf - 5-bit quantization (smaller, fast)
  • Qwen3-0.6B-Coding-Finetuned-v1.Q4_K_M.gguf - 4-bit quantization (smallest, fastest)
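
Each GGUF file can be fetched individually with huggingface_hub instead of cloning the whole repository. A minimal sketch; the filename must match one of the files listed above:

from huggingface_hub import hf_hub_download

# Download a single quantized GGUF file from the model repository
gguf_path = hf_hub_download(
    repo_id="rohitnagareddy/Qwen3-0.6B-Coding-Finetuned-v1",
    filename="Qwen3-0.6B-Coding-Finetuned-v1.Q4_K_M.gguf",
)
print(gguf_path)  # local path to the downloaded file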

Example with llama.cpp

# Note: newer llama.cpp builds name the binary llama-cli instead of main
./main -m ./Qwen3-0.6B-Coding-Finetuned-v1.Q4_K_M.gguf -n 256 -e -p "<|im_start|>system\nYou are an expert coding assistant.<|im_end|>\n<|im_start|>user\nCreate a Python function to find the factorial of a number.<|im_end|>\n<|im_start|>assistant\n"
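
The same GGUF files should also work with llama-cpp-python. A sketch under the assumption that the chat template stored in the GGUF metadata is picked up automatically; adjust the model path and n_ctx as needed:

from llama_cpp import Llama

# Load a quantized GGUF file (path assumed to point at a downloaded file)
llm = Llama(
    model_path="./Qwen3-0.6B-Coding-Finetuned-v1.Q4_K_M.gguf",
    n_ctx=4096,  # context window; adjust to your hardware
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an expert coding assistant."},
        {"role": "user", "content": "Create a Python function to find the factorial of a number."},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])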

πŸ“Š Training Details

  • Training Epochs: 1
  • QLoRA Rank (r): 16
  • QLoRA Alpha: 32
  • Learning Rate: 2e-4
  • Optimizer: Paged AdamW 32-bit
  • Target Modules: Auto-detected linear layers (see the illustrative configuration below)
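
These hyperparameters map onto a PEFT-style configuration roughly like the following. This is an illustrative reconstruction, not the actual training script; in particular, the target_modules list and the dropout, precision, and output path are assumptions standing in for values not listed above.

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,               # QLoRA rank, as listed above
    lora_alpha=32,      # QLoRA alpha, as listed above
    lora_dropout=0.05,  # assumed; not listed above
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # stand-in for the auto-detected linear layers
)

training_args = TrainingArguments(
    output_dir="qwen3-0.6b-coding-finetuned",  # hypothetical output path
    num_train_epochs=1,
    learning_rate=2e-4,
    optim="paged_adamw_32bit",
    fp16=True,  # assumed mixed-precision setting
)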

Model created by rohitnagareddy using an automated Colab script.
