Update README.md
README.md
CHANGED
|
@@ -69,7 +69,7 @@ Base Instruct VLTO Variants
|
|
| 69 |
- Formal logic: "Which object has greater inertia, a truck or car?"
|
| 70 |
```bash
|
| 71 |
Base Instruct VLTO Variants
|
| 72 |
-
0.442–0.445
|
| 73 |
```
|
| 74 |
⚖️ Base instruct wins by tiny margins; VLTO models prioritize real-world intuition over textbook logic.
|
| 75 |
|
|
@@ -82,7 +82,7 @@ This is intentional: multimodal training focuses on "how things work" in practic
|
|
| 82 |
Term Meaning
|
| 83 |
qx85x 5-bit storage for most weights + 8-bit embeddings/attention
|
| 84 |
qx86x 6-bit storage for most weights + 8-bit embeddings/attention
|
| 85 |
-
hi
|
| 86 |
```
|
| 87 |
💡 The "8-bit" components (embeddings, attention heads) are critical for language tasks; protecting them from aggressive compression preserves nuance.
|
| 88 |
|
|
|
|
| 69 |
- Formal logic: "Which object has greater inertia, a truck or car?"
|
| 70 |
```bash
|
| 71 |
Base Instruct VLTO Variants
|
| 72 |
+
0.442–0.445 0.435–0.441
|
| 73 |
```
|
| 74 |
⚖️ Base instruct wins by tiny margins; VLTO models prioritize real-world intuition over textbook logic.
|
| 75 |
|
|
|
|
| 82 |
Term Meaning
|
| 83 |
qx85x 5-bit storage for most weights + 8-bit embeddings/attention
|
| 84 |
qx86x 6-bit storage for most weights + 8-bit embeddings/attention
|
| 85 |
+
hi Group size 32 for quantization (finer precision control)
|
| 86 |
```
|
| 87 |
💡 The "8-bit" components (embeddings, attention heads) are critical for language tasks; protecting them from aggressive compression preserves nuance.
|
| 88 |
|
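
The `hi` row in the term table mentions "Group size 32 for quantization (finer precision control)". As a rough illustration of why a smaller group can help, here is a minimal pure-Python sketch of group-wise affine quantization. It is not the actual qx85x/qx86x storage format, which the README does not specify; the function names (`quantize_groups`, `dequantize_groups`, `mean_abs_error`) and the min/max scaling scheme are assumptions made for this example only.

```python
import math


def quantize_groups(weights, bits=5, group_size=32):
    """Quantize a flat list of floats, one (scale, offset) pair per group.

    Hypothetical helper: each group is mapped onto integer codes in
    [0, 2**bits - 1] using that group's own min and max.
    """
    levels = (1 << bits) - 1  # e.g. 31 for 5-bit codes
    groups = []
    for start in range(0, len(weights), group_size):
        chunk = weights[start:start + group_size]
        lo, hi = min(chunk), max(chunk)
        # Fall back to 1.0 so a constant group does not divide by zero.
        scale = (hi - lo) / levels or 1.0
        codes = [round((w - lo) / scale) for w in chunk]
        groups.append((scale, lo, codes))
    return groups


def dequantize_groups(groups):
    """Rebuild approximate floats from (scale, offset, codes) groups."""
    return [lo + scale * c for scale, lo, codes in groups for c in codes]


def mean_abs_error(weights, bits, group_size):
    """Average reconstruction error after a quantize/dequantize round trip."""
    restored = dequantize_groups(quantize_groups(weights, bits, group_size))
    return sum(abs(a - b) for a, b in zip(weights, restored)) / len(weights)


# Synthetic weights in [-1, 1] with a single large outlier (made-up data).
weights = [math.sin(i) for i in range(128)]
weights[100] = 50.0

coarse = mean_abs_error(weights, bits=5, group_size=128)
fine = mean_abs_error(weights, bits=5, group_size=32)
print(f"group_size=128: {coarse:.4f}  group_size=32: {fine:.4f}")
```

In this sketch the outlier stretches the scale of whichever group contains it. With one group of 128 every weight shares that stretched scale, while with groups of 32 only the outlier's own group is affected, so the average error drops. The trade-off is that each group stores its own scale and offset, so smaller groups add metadata overhead.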