schonsense/70B_llama312_rp_GGUF

imx_plus quants are experimental: they keep full bf16 precision on the entirety of layer 0 as well as on the output.
https://arxiv.org/html/2408.15301v1
Note: I made a mistake in my 'plus' quanting and overweighted every tenth layer instead of just layer 0. I will change this note once I have rectified the issue.
imx quants are normally formulated with a model-specific custom imatrix dataset.
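
To illustrate the difference between the intended selection (layer 0 plus the output) and the overweighted one (every tenth layer), here is a minimal sketch. It assumes llama.cpp-style GGUF tensor names such as `blk.0.attn_q.weight` and `output.weight`, and 80 transformer blocks for Llama 3.1 70B. The two patterns and the helper `layers_matched` are hypothetical, written only for illustration; they are not the actual quantization script.

```python
import re

# Assumption: tensor names follow the llama.cpp GGUF convention,
# e.g. "blk.0.attn_q.weight" for layer 0 and "output.weight" for the output.
N_LAYERS = 80  # Llama 3.1 70B has 80 transformer blocks

# Hypothetical loose pattern: matches any layer index that ends in 0.
LOOSE = re.compile(r"^blk\.\d*0\.")
# Hypothetical anchored pattern: matches layer 0 only, plus the output tensor.
STRICT = re.compile(r"^(blk\.0\.|output\.weight$)")


def layers_matched(pattern, n_layers=N_LAYERS):
    """Return the layer indices whose tensor names match the pattern."""
    return [
        i for i in range(n_layers)
        if pattern.search(f"blk.{i}.attn_q.weight")
    ]


print(layers_matched(LOOSE))   # layers 0, 10, 20, ... (every tenth layer)
print(layers_matched(STRICT))  # layer 0 only
```

Checking the matched layer list like this before quantizing is a cheap way to confirm that only the intended tensors are kept at higher precision.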

Downloads last month: 21

Format: GGUF
Model size: 71B params
Architecture: llama

Quantizations: 3-bit, 4-bit, 5-bit, 6-bit (plus 1 more variant)

Model tree for schonsense/70B_llama312_rp_GGUF

Base model: meta-llama/Llama-3.1-70B
Finetuned: schonsense/70B_llama312_RP_ft
Quantized: this model