EXL3 quant with the MLP projection layers at 3bpw and all other layers at 4bpw, sized to fit 24GB cards with 16K context. Original description:

Merged jukofyork/command-r-35b-writer-v3-multiplicative-lora into CohereLabs/c4ai-command-r-v01 using jukofyork/merge-lora.

Untested... But appears to have worked:

```
✓ Successfully merged and uploaded model!
Model URL: https://huggingface.co/jukofyork/command-r-35b-writer-v3
Merge mode: Multiplicative
Scale factor: 1
Processed 15 shards
Merged 72 layers with LoRA weights
```
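The log's "Multiplicative" mode and "Scale factor: 1" suggest the LoRA delta is folded in multiplicatively rather than simply added. A minimal NumPy sketch of the difference between the two merge forms, assuming the multiplicative merge takes the shape W' = W(I + s·BA) (the exact convention used by jukofyork/merge-lora may differ):

```python
import numpy as np

d, r = 8, 2  # toy hidden size and LoRA rank
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d)).astype(np.float32)  # base weight
A = rng.standard_normal((r, d)).astype(np.float32)  # lora_A (down-projection)
B = rng.standard_normal((d, r)).astype(np.float32)  # lora_B (up-projection)
scale = 1.0  # "Scale factor: 1" from the merge log

# Standard additive LoRA merge: W' = W + s * (B @ A)
W_add = W + scale * (B @ A)

# Multiplicative merge (assumed form): W' = W @ (I + s * (B @ A))
W_mul = W @ (np.eye(d, dtype=np.float32) + scale * (B @ A))
```

With scale 1 the multiplicative form reduces to W + W(BA), i.e. the low-rank update is modulated by the base weight instead of being weight-independent.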
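As a sanity check on the 24GB/16K-context claim, here is a back-of-the-envelope weight-memory estimate. The numbers are assumptions, not from this repo: ~35B parameters for the c4ai-command-r-v01 base, and ~3.75bpw as a rough average of the 3bpw MLP / 4bpw everything-else mix.

```python
params = 35e9    # assumed parameter count of the base model
avg_bpw = 3.75   # assumed average of 3bpw MLP / 4bpw other layers
weight_gib = params * avg_bpw / 8 / 2**30
print(f"~{weight_gib:.1f} GiB of quantized weights")  # → ~15.3 GiB
```

That leaves several GiB of headroom on a 24 GiB card for the 16K-context KV cache and activations.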
Model tree for Downtown-Case/jukofyork_command-r-35b-writer-v3-exl3-3.75bpw-hb6