inferencerlabs committed
Commit 5bc369f · verified · Parent(s): adc61d2

Upload complete model

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
  ---
  *UPLOADING*
  
- **See MiniMax-M2 6.5bit MLX in action - [demonstration video coming soon](https://youtube.com/xcreate)**
+ **See MiniMax-M2 6.5bit MLX in action - [demonstration video](https://youtu.be/DCVKP_o2HU0)**
  
  *q6.5bit quant typically achieves 1.128 perplexity in our testing, which is equivalent to q8.*
  | Quantization | Perplexity |
@@ -22,11 +22,11 @@ tags:
  
  ## Usage Notes
  
- * Tested on a MacBook Pro connecting to an M3 Ultra with 512 GB RAM over the internet using the [Inferencer app](https://inferencer.com)
+ * Tested on a MacBook Pro connecting to an M3 Ultra with 512 GB RAM over the internet using [Inferencer app v1.5.4](https://inferencer.com)
  * Memory usage: ~175 GB
  * Expect 36 tokens/s for small contexts (200 tokens) down to 11 tokens/s for large contexts (6800 tokens)
  * Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.28
- * For more details see the [demonstration video coming soon](https://youtube.com/xcreate) or visit [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2).
+ * For more details see the [demonstration video](https://youtu.be/DCVKP_o2HU0) or visit [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2).
  
  ## Disclaimer
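
For anyone wanting to exercise the uploaded weights outside the Inferencer app, here is a minimal sketch using [mlx-lm](https://github.com/ml-explore/mlx-lm). The repo id is an assumption based on this page, and since the usage notes mention a modified MLX 0.28 build, stock mlx-lm support for the MiniMax-M2 architecture is not guaranteed; note also the ~175 GB memory figure above.

```python
# Minimal sketch: load the quantized checkpoint and generate with mlx-lm.
# Assumptions: mlx-lm is installed (`pip install mlx-lm`), your build supports
# the MiniMax-M2 architecture, and the repo id below matches this upload.
from mlx_lm import load, generate

# Assumed repo id -- substitute the actual Hub path of this model.
model, tokenizer = load("inferencerlabs/MiniMax-M2-6.5bit-MLX")

messages = [{"role": "user", "content": "Summarize what 6.5-bit quantization trades off."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# verbose=True prints generation stats including tokens/s, so the 36 tok/s
# small-context figure from the usage notes can be checked on your hardware.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```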