• imx_plus quants are experimental: they keep full bf16 precision on the entirety of layer 0 as well as on the output.
  • https://arxiv.org/html/2408.15301v1
  • I made a mistake in my 'plus' quantization and overweighted every tenth layer instead of just layer 0. I will update this note once I have fixed the issue.
  • imx quants are normally built with a model-specific custom imatrix dataset.
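
As a rough illustration of the intended 'plus' selection (layer 0 and the output only), here is a minimal sketch. It assumes GGUF/llama.cpp-style tensor names such as `blk.0.attn_q.weight` and `output.weight`, and it assumes "the output" means only the `output.*` tensor. The helper names `keep_bf16` and `select_bf16` are hypothetical; this is not the script used to build these quants, and it is not a claim about how the every-tenth-layer mistake arose.

```python
import re

# Assumption: tensor names follow the GGUF/llama.cpp convention,
# e.g. "blk.<N>.attn_q.weight" for block N and "output.weight" for the output.
# Both patterns are anchored and end in a literal dot, so "blk.0." cannot
# match "blk.10." and "output." cannot match "output_norm.".
LAYER0 = re.compile(r"^blk\.0\.")
OUTPUT = re.compile(r"^output\.")


def keep_bf16(name: str) -> bool:
    """Return True if this tensor should stay in bf16 under the 'plus' scheme."""
    return bool(LAYER0.match(name) or OUTPUT.match(name))


def select_bf16(names):
    """Filter a list of tensor names down to those kept in bf16."""
    return [n for n in names if keep_bf16(n)]


if __name__ == "__main__":
    example = [
        "token_embd.weight",
        "blk.0.attn_q.weight",
        "blk.10.attn_q.weight",
        "blk.20.ffn_down.weight",
        "output_norm.weight",
        "output.weight",
    ]
    for name in select_bf16(example):
        print(name)
```

Listing the selected names before quantizing is a cheap way to check that only layer 0 and the output are affected.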
GGUF
Model size: 71B params
Architecture: llama
Available quantizations: 3-bit, 4-bit, 5-bit, 6-bit

Model tree for schonsense/70B_llama312_rp_GGUF: quantized (this model)