nightmedia/Qwen3-VLTO-4B-Instruct-qx86x-mlx

nightmedia committed 14 days ago

Commit bc4a068 (verified) · 1 parent: 1e6f5f8

Update README.md

Files changed (1)

README.md +2 -2

@@ -69,7 +69,7 @@ Base Instruct	VLTO Variants
 - Formal logic: "Which object has greater inertia, a truck or car?"
 ```bash
 Base Instruct	VLTO Variants
-0.442–0.445	0.435–0.441
+0.442–0.445		0.435–0.441
 ```
 ⚖️ Base instruct wins by tiny margins — VLTO models prioritize real-world intuition over textbook logic.
@@ -82,7 +82,7 @@ This is intentional: multimodal training focuses on "how things work" in practic
 Term	Meaning
 qx85x	5-bit storage for most weights + 8-bit embeddings/attention
 qx86x	6-bit storage for most weights + 8-bit embeddings/attention
-hi	Group size 32 for quantization (finer precision control)
+hi		Group size 32 for quantization (finer precision control)
 ```
 💡 The "8-bit" components (embeddings, attention heads) are critical for language tasks — protecting them from aggressive compression preserves nuance.
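
The README's term table describes `hi` as "group size 32 for quantization" and the `qx` variants as mixing 5- or 6-bit weights with 8-bit embeddings and attention. The sketch below illustrates the general idea of group-wise quantization in plain Python. It is not the MLX implementation: the min/max affine scheme, the function names (`quantize_group`, `roundtrip`, `max_error`), the synthetic weights, and the group size of 64 used for comparison are all assumptions made for illustration.

```python
import math


def quantize_group(values, bits):
    """Map one group of floats to integer codes in [0, 2**bits - 1].

    Hypothetical min/max affine scheme: each group stores its own scale
    and offset, so a smaller group tracks local value ranges more closely.
    """
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def roundtrip(weights, bits, group_size):
    """Quantize and immediately dequantize, one group at a time."""
    restored = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, lo = quantize_group(group, bits)
        restored.extend(code * scale + lo for code in codes)
    return restored


def max_error(weights, bits, group_size):
    """Largest absolute reconstruction error over all weights."""
    restored = roundtrip(weights, bits, group_size)
    return max(abs(w - r) for w, r in zip(weights, restored))


if __name__ == "__main__":
    # Synthetic weights whose magnitude grows along the vector (illustrative only).
    weights = [math.sin(i) * (1 + i / 64) for i in range(128)]
    for bits in (5, 6, 8):
        for group_size in (64, 32):
            err = max_error(weights, bits, group_size)
            print(f"bits={bits} group_size={group_size} max_error={err:.6f}")
```

With rounding to the nearest level, the error within a group is bounded by half that group's scale, that is, the group's value range divided by `2 * (2**bits - 1)`. More bits shrink that bound directly, and a smaller group can only narrow the range each scale has to cover, which is one way to read the README's "finer precision control".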