Gazal-R1-32B-sft-merged-preview

This is a DoRA adapter fine-tuned on top of Qwen/Qwen3-32B for specialized medical reasoning tasks.

Model description

This adapter was trained with PEFT (a LoRA-family method, detailed under Training procedure) to enhance the base model's ability to perform step-by-step clinical reasoning and medical problem-solving.

Training data

The model was fine-tuned on a synthetic, structured reasoning dataset, which contains medical questions with step-by-step reasoning and final answers.
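As a rough illustration of the dataset shape (the field names and content below are hypothetical, not the published schema), each example pairs a question with a reasoning trace and a final answer:

# Hypothetical training record; field names and content are illustrative only,
# not the actual dataset schema
example = {
    "question": "A 54-year-old man presents with crushing chest pain ...",
    "reasoning": "Step 1: Identify the key clinical findings ... Step 2: Narrow the differential ...",
    "final_answer": "Acute myocardial infarction",
}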

Training procedure

The model was trained using:

  • LoRA with rank 256
  • DoRA (Weight-Decomposed Low-Rank Adaptation)
  • rsLoRA (Rank-stabilized LoRA)
  • BF16 precision training
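A minimal PEFT configuration sketch consistent with these settings (only the rank and the DoRA/rsLoRA flags come from the list above; the alpha, dropout, and target modules are assumptions for illustration):

from peft import LoraConfig

# Sketch of an adapter config matching the listed settings; values other than
# r, use_dora, and use_rslora are assumptions, not the actual training recipe.
peft_config = LoraConfig(
    r=256,                      # LoRA rank 256
    use_dora=True,              # Weight-Decomposed Low-Rank Adaptation
    use_rslora=True,            # Rank-stabilized LoRA scaling
    lora_alpha=256,             # assumed; not specified in this card
    lora_dropout=0.05,          # assumed; not specified in this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

BF16 precision is typically enabled on the trainer rather than in the adapter config (for example, bf16=True in transformers TrainingArguments).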

Use cases and limitations

This model is intended for medical education and clinical reasoning training. It should NOT be used for actual medical diagnosis or treatment decisions. Always consult qualified healthcare professionals for medical advice.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model_id = "Qwen/Qwen3-32B"
adapter_id = "TachyHealth/Gazal-R1-32B-sft-merged"

# Load the tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_id)

# Prepare a prompt following the format used during training
query = """[MEDICAL QUESTION]"""

messages = [
    {"role": "system", "content": "When solving complex medical problems, follow this specific format..."},
    {"role": "user", "content": query}
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate response
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=2048,
    temperature=0.6,
    do_sample=True,
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
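If you prefer a standalone checkpoint that does not require PEFT at inference time, the adapter can be folded into the base weights. A sketch (the output directory name is arbitrary):

# Merge the adapter into the base model and save a standalone copy
merged_model = model.merge_and_unload()
merged_model.save_pretrained("gazal-r1-32b-merged")
tokenizer.save_pretrained("gazal-r1-32b-merged")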

Performance Results

Gazal-R1 achieves strong performance across standard medical benchmarks:

| Model                   | Size | MMLU Pro (Medical) | MedMCQA | MedQA | PubMedQA |
|-------------------------|------|--------------------|---------|-------|----------|
| Gazal-R1 (Final)        | 32B  | 81.6               | 71.9    | 87.1  | 79.6     |
| Gazal-R1 (SFT-only)     | 32B  | 79.3               | 72.3    | 86.9  | 77.6     |
| Llama 3.1 405B Instruct | 405B | 70.2               | 75.8    | 81.9  | 74.6     |
| Qwen 2.5 72B Instruct   | 72B  | 72.1               | 66.2    | 72.7  | 71.7     |
| Med42-Llama3.1-70B      | 70B  | 66.1               | 72.4    | 80.4  | 77.6     |
| Llama 3.1 70B Instruct  | 70B  | 74.5               | 72.5    | 78.4  | 78.5     |
| QwQ 32B                 | 32B  | 70.1               | 65.6    | 72.3  | 73.7     |
| Qwen 3 32B              | 32B  | 78.4               | 71.6    | 84.4  | 76.7     |