Finetune and Quantize with AWQ


Hi, I completed a full finetune and got an additional model.safetensors file. Afterwards, I merged the resulting model.safetensors file into the base model and quantized the result with AWQ. Now, when I run a simple test using vLLM, the model can't be used for inference. Have you experienced this error before?
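
For reference, this is roughly how I ran the quantization step. It's a minimal sketch using AutoAWQ (I'm assuming that's the relevant tooling here); the merged-model and output paths are placeholders for my actual directories:

```python
# Minimal sketch of the AWQ quantization step (AutoAWQ).
# Paths are placeholders for my actual merged-model directories.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

merged_path = "path/to/merged-model"     # base model with finetuned weights merged in
quant_path = "path/to/merged-model-awq"  # where the quantized model gets written

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(merged_path)
tokenizer = AutoTokenizer.from_pretrained(merged_path)

# "pileval" is AutoAWQ's default calibration set, backed by mit-han-lab/pile-val-backup
model.quantize(tokenizer, quant_config=quant_config, calib_data="pileval")

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```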

The model was finetuned with DeepSpeed, and I used the mit-han-lab/pile-val-backup dataset.
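
And this is roughly the smoke test that fails (again a minimal sketch; the quantized-model path is a placeholder):

```python
# Minimal sketch of the vLLM test that triggers the error below.
# quant_path is a placeholder for the AWQ-quantized model directory.
from vllm import LLM, SamplingParams

quant_path = "path/to/merged-model-awq"

llm = LLM(model=quant_path, quantization="awq")
sampling_params = SamplingParams(temperature=0.8, max_tokens=32)

outputs = llm.generate(["Hello, my name is"], sampling_params)
print(outputs[0].outputs[0].text)
```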

```
IndexError: start out of range (expected to be in range of [-4096, 4096], but got 7168)
```
