Add IQ1_S_R4 smol boi
README.md
CHANGED
@@ -40,6 +40,10 @@ Great for big 384+ GB RAM rig with 24GB+ GPU
 Special mix `IQ3_K_R4`/`IQ2_K_R4` routed experts with all other layers full `q8_0` for CPU+GPU offload or `--run-time-repack` for max speed CPU *only* rigs.
 Great for CPU+GPU "troll rig" high end gamer systems e.g. 9950X 96 GB RAM + 3090TI 24 GB VRAM + Gen 5 NVMe SSD.
 
+#### `IQ1_S_R4` 130.203 GiB 1.664 BPW
+Special mix `IQ1_M_R4`/`IQ1_S_R4` routed experts with all other layers `iq4_ks` for CPU+GPU offload or `--run-time-repack` for max speed CPU *only* rigs.
+Great for CPU+GPU "troll rig" high end gamer systems e.g. 2x 64GiB DDR5 plus 24GB VRAM.
+
 #### Custom Mixes
 If you have more than 48GB VRAM across multiple GPUs, consider rolling your own custom quants to optimize size and performance with whatever hardware you have using custom `-ot` expression. If you have less VRAM, you could make a custom quant leaner in the non routed expert layers or get 64k+ context in 24GB VRAM. Also you can use the offline repack tool if you want to do CPU only with `mmap()` still enabled.
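
Reviewer note: the size and BPW figures in the new heading can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It assumes BPW means average bits per weight over the whole file, that "24GB VRAM" is decimal gigabytes, and that KV cache, compute buffers, and OS memory are ignored. The function names are hypothetical.

```python
GIB = 2**30  # bytes per GiB


def implied_params(size_gib: float, bpw: float) -> float:
    """Weight count implied by a file size and an average bits-per-weight figure."""
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


def headroom_gib(size_gib: float, ram_gib: float, vram_gb: float) -> float:
    """Memory left over (in GiB) after holding the weights in RAM plus VRAM.

    Assumes vram_gb is decimal GB; ignores KV cache, compute buffers, and the OS.
    """
    vram_gib = vram_gb * 1e9 / GIB
    return ram_gib + vram_gib - size_gib


# Figures taken from the added README lines.
params = implied_params(size_gib=130.203, bpw=1.664)
spare = headroom_gib(size_gib=130.203, ram_gib=2 * 64, vram_gb=24)

print(f"implied weights: {params / 1e9:.1f} billion")
print(f"headroom on 2x 64GiB + 24GB: {spare:.1f} GiB")
```

Under those assumptions the figures imply a model of roughly 670 billion weights. They also show that the file does not fit in 128 GiB of RAM alone, which fits the CPU+GPU offload recommendation for that rig. The combined headroom must still cover context and runtime buffers.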