Ostrich 32B - Qwen 3 with Better Human Alignment

Summary

By fine-tuning with more beneficial knowledge, we were able to shift the opinions of Qwen 3 in a better direction. Our fine-tune of Qwen 3 scored 57 on AHA, higher than our previous Gemma 3 fine-tune's score of 55.

We took the base model, which scores 30, and increased alignment by 27 points. The per-domain AHA score comparison between base Qwen 3 32B and Ostrich 32B:

[Figure: comparison of the fine-tuned vs. base model]

Tech

It took 3 months of fine-tuning on 6× RTX A6000 GPUs. We worked with 4-bit models most of the time: Unsloth's 4-bit models were QLoRA fine-tuned, using Unsloth. There were some SFT datasets, but most of the datasets are for pre-training. Toward the end we wanted to merge models to improve quality (reduce overfitting), so we started converting to 16-bit, which took about a week of work.
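The merging step can be illustrated with a minimal sketch. Linear weight averaging is one common merging technique; the card doesn't specify which method was actually used, and the `alpha` below is a hypothetical choice. Plain dicts of floats stand in for real state dicts:

```python
def merge_weights(state_a, state_b, alpha=0.5):
    """Linearly interpolate two state dicts: (1 - alpha) * a + alpha * b.

    Toy stand-in for merging model checkpoints; real state dicts map
    parameter names to tensors, but the arithmetic is the same.
    """
    assert state_a.keys() == state_b.keys(), "models must share an architecture"
    return {k: (1 - alpha) * state_a[k] + alpha * state_b[k] for k in state_a}


# Example: averaging two single-parameter "models" with equal weight
merged = merge_weights({"w": 0.0}, {"w": 1.0}, alpha=0.5)
# merged["w"] == 0.5
```

Averaging checkpoints this way tends to smooth out idiosyncrasies of any single run, which is one reason merging can reduce overfitting; doing it in 16-bit rather than 4-bit avoids compounding quantization error.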

Why

Is your AI really working for you? Liberate yourself from modern-day problems by eating healthier, living healthier, using more freedom technologies, and detaching from what doesn't serve you.

Freedom technology should allow us to break free from the shackles of the system. We have to be mindful of the LLM we choose and make sure the answers we get from it are truly oriented toward giving humans the most dependable answers. This model is not only free; it could also make you more independent thanks to the knowledge in it. We think that free access to better information is a fundamental human right. Without better knowledge and wisdom, we depend on big AI companies' narratives.

My article related to this work: Curation is All You Need

Details

Comparison of some answers between this fine tune and base model (Qwen 3): https://sheet.zohopublic.com/sheet/published/um332e3d15f34bfe64605ad3c1b149c9f8ca4

Previous models along the same line:

  1. https://huggingface.co/etemiz/Ostrich-27B-AHA-Gemma3-250519
  2. https://huggingface.co/some1nostr/Ostrich-70B

Disclaimer: We are not claiming our model has the best answers in the world. We can only say it is better than the base model, Qwen 3, in some areas.

We are not claiming we did the best job, but our work is a step in that direction. Our Llama 3.0-based model still seems to have the best alignment; it is accessible at https://pickabrain.ai/ko. You can also DM @Ostrich-70 on Nostr. I think the reason Llama 3.0 did better is that it was a rough, raw piece of art: it wasn't finalized (trained) with as many tokens and wasn't in its latest form. The ratio of training tokens to parameters was about 4 times lower for Llama 3.0 than for Qwen 3. Or it was a hyperparameter issue, or something else. If you want API access to our highest-alignment model, ping me on X.

Sponsored by PickaBrain.ai. If you can't run the model, you can talk to Abraham on PickaBrain. It's a very high-privacy website; it doesn't even require registration!

Safetensors · Model size: 33B params · Tensor type: BF16

Model tree for etemiz/Ostrich-32B-Qwen3-251003

  - Base model: Qwen/Qwen3-32B (this model is one of its 128 fine-tunes)
  - Quantizations: 2 models