ubergarm/DeepSeek-V3-0324-GGUF

ubergarm committed on Jun 8

Commit 50eb2a9 · 1 Parent(s): 32d5daf

Add IQ1_S_R4 smol boi

Files changed (1)

README.md +4 -0

README.md CHANGED

@@ -40,6 +40,10 @@ Great for big 384+ GB RAM rig with 24GB+ GPU
 Special mix `IQ3_K_R4`/`IQ2_K_R4` routed experts with all other layers full `q8_0` for CPU+GPU offload or `--run-time-repack` for max speed CPU *only* rigs.
 Great for CPU+GPU "troll rig" high end gamer systems e.g. 9950X 96 GB RAM + 3090TI 24 GB VRAM + Gen 5 NVMe SSD.
+#### `IQ1_S_R4` 130.203 GiB 1.664 BPW
+Special mix `IQ1_M_R4`/`IQ1_S_R4` routed experts with all other layers `iq4_ks` for CPU+GPU offload or `--run-time-repack` for max speed CPU *only* rigs.
+Great for CPU+GPU "troll rig" high end gamer systems e.g. 2x 64GiB DDR5 plus 24GB VRAM.
 #### Custom Mixes
 If you have more than 48GB VRAM across multiple GPUs, consider rolling your own custom quants to optimize size and performance with whatever hardware you have using custom `-ot` expression. If you have less VRAM, you could make a custom quant leaner in the non routed expert layers or get 64k+ context in 24GB VRAM. Also you can use the offline repack tool if you want to do CPU only with `mmap()` still enabled.
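
As a rough sanity check on the `IQ1_S_R4` figures above, the file size and average bits per weight (BPW) can be turned into an implied weight count, and the file size can be compared against the combined RAM and VRAM of the example rig. This is only a back-of-the-envelope sketch: the helper names are hypothetical, the VRAM figure is assumed to be decimal GB, and the headroom number ignores KV cache, compute buffers, and OS overhead.

```python
GIB = 1024 ** 3  # bytes per GiB
GB = 1000 ** 3   # bytes per decimal GB


def implied_weights(size_gib: float, bpw: float) -> float:
    """Weight count implied by a file size (GiB) and average bits per weight."""
    return size_gib * GIB * 8 / bpw


def headroom_gib(size_gib: float, ram_gib: float, vram_gb: float) -> float:
    """Memory left over (GiB) if the whole file is resident in RAM + VRAM.

    Assumption: VRAM is quoted in decimal GB, RAM in GiB. The result ignores
    KV cache, compute buffers, and anything else the runtime allocates.
    """
    vram_gib = vram_gb * GB / GIB
    return ram_gib + vram_gib - size_gib


if __name__ == "__main__":
    # Figures taken from the README heading: 130.203 GiB at 1.664 BPW.
    n = implied_weights(130.203, 1.664)
    print(f"implied weights: {n / 1e9:.1f} billion")

    # Example rig from the README: 2x 64 GiB DDR5 plus 24 GB VRAM.
    spare = headroom_gib(130.203, ram_gib=2 * 64, vram_gb=24)
    print(f"headroom before runtime buffers: {spare:.1f} GiB")
```

The file alone is slightly larger than 128 GiB of system RAM, so on such a rig part of the model would have to live in VRAM (or be paged from disk via `mmap()`); the sketch only shows that the combined total exceeds the file size.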