Thireus committed
Commit a1be202 · 1 Parent(s): 6709d7f
Update README.md

README.md CHANGED

@@ -38,10 +38,10 @@ cd GGUF-Tool-Suite
 rm -f download.conf # Make sure to copy the relevant download.conf for the model before running quant_assign.py
 cp -f models/DeepSeek-TNG-R1T2-Chimera/download.conf . # Use the download.conf of the chosen model
 mkdir -p kitchen && cd kitchen
-../quant_downloader.sh ../recipe_examples/DeepSeek-TNG-R1T2-Chimera.ROOT-3.0624bpw-3.3657ppl.238GB-GGUF_11GB-GPU_227GB-CPU.13549e6_1ac857a.recipe
+../quant_downloader.sh ../recipe_examples/ik_llama.cpp_recipes/DeepSeek-TNG-R1T2-Chimera.ROOT-3.0624bpw-3.3657ppl.238GB-GGUF_11GB-GPU_227GB-CPU.13549e6_1ac857a.recipe
 
 # Launch ik_llama's llama-cli:
-ulimit -n
+ulimit -n 9999 # Lifts "too many open files" limitation on Linux
 ~/ik_llama.cpp/build/bin/llama-cli \
 -m DeepSeek-TNG-R1T2-Chimera-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01148.gguf \
 -mla 3 -fa -amb 512 -fmoe -ctk f16 -c 4096 -ngl 99 \

@@ -74,6 +74,8 @@ Here’s how DeepSeek-R1-0528 quantized with **Thireus’ GGUF Tool Suite** stac
 
 More perplexity/bpw graphs for other supported models: https://github.com/Thireus/GGUF-Tool-Suite/tree/main/ppl_graphs
 
+*All PPL values are computed with the parameters `-ctk f16 -c 512 -b 4096 -ub 4096`. Changing any of these parameters will alter the PPL. In particular, reducing `-b 4096 -ub 4096` increases the PPL, while increasing them decreases the PPL.*
+
 ---
 
 ## 🚀 How do I get started?
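
The note added in this hunk pins down how the published PPL numbers are measured (`-ctk f16 -c 512 -b 4096 -ub 4096`). For reference, a hedged sketch of reproducing such a measurement is below; it assumes ik_llama.cpp builds a `llama-perplexity` binary next to `llama-cli` and that `wiki.test.raw` is the evaluation text, neither of which is stated in this commit:

```bash
# Hedged sketch (assumed binary name and test file, not taken from the README):
# run a perplexity measurement with the exact parameters quoted above.
~/ik_llama.cpp/build/bin/llama-perplexity \
  -m DeepSeek-TNG-R1T2-Chimera-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01148.gguf \
  -f wiki.test.raw \
  -ctk f16 -c 512 -b 4096 -ub 4096
```

The `-m` argument points at the first shard, exactly as in the `llama-cli` invocation above.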