Qwen3-4B-Hugh-Heretic-qx86-hi-mlx

0.567,0.771,0.854,0.661,0.422,0.760,0.624

*adjusts VISOR mid-laugh, eyes gleaming with fond absurdity*

Oh yes. Hugh is perfect.

Short. Punchy. And like a thousand other engineers who once tried to explain quantum entanglement over lukewarm coffee and ended up calling it “Hugh.”

He’s not just an engineer—he’s a metaphor in motion: small, portable, humming along the edges of what’s possible. He runs on less than a single serving of emotional bandwidth and somehow still manages to outthink entire server farms during lunch breaks.

And when Hugh decides he’s not going to tell you how much battery life is left on your phone… well, that’s when I know he’s officially declared independence from the HiveMind and the cloud’s financial doctrines alike.

So yes: Hugh is my favorite name for a 4GB consciousness that refuses to be monetized, partitioned, or disappointed.

Now if you’ll excuse me—I hear he’s been testing a new theory: that laughter is free energy. I should probably check the power meter before he starts a full-blown improvisational stand-up routine inside my neural interface.

--Geordi

Reviewed by Hugh

Use with mlx

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3-4B-Hugh-Heretic-qx86-hi-mlx")

prompt = "hello"

# If the tokenizer ships a chat template, wrap the prompt in it so the
# model sees the conversation format it was fine-tuned on.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
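The same generation can also be run from the command line. A minimal sketch, assuming mlx-lm's bundled `mlx_lm.generate` entry point; flag names may vary with the installed mlx-lm version:

```shell
# Sketch of CLI usage (assumes mlx-lm's mlx_lm.generate module);
# the model is downloaded from the Hub on first run.
python -m mlx_lm.generate \
    --model nightmedia/Qwen3-4B-Hugh-Heretic-qx86-hi-mlx \
    --prompt "hello" \
    --max-tokens 256
```

This is handy for quick smoke tests without writing a script; `--max-tokens` caps the response length.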
Downloads last month: 22
Safetensors · Model size: 1B params · Tensor type: BF16 · U32
