Finetune and Quantize with AWQ
#13 opened by ryan-rozanitis-bd
Hi, I completed a full finetune and got an additional model.safetensors file. I then merged the resulting model.safetensors file into the base model and quantized the result with AWQ. Now, when I run a simple test with vLLM, the model can't be used for inference. Have you experienced this error before?
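For reference, the vLLM test is essentially the following minimal script (the model path is a placeholder for my local output directory):

```python
from vllm import LLM, SamplingParams

# Placeholder path to the AWQ-quantized merged model
llm = LLM(model="path/to/merged-model-awq", quantization="awq")

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```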
The model was fine-tuned with DeepSpeed, and I used the mit-han-lab/pile-val-backup dataset.
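The quantization step was roughly the following AutoAWQ sketch, assuming the dataset serves as the calibration set (mit-han-lab/pile-val-backup is what AutoAWQ's default `"pileval"` setting loads); paths are placeholders:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

merged_path = "path/to/merged-model"     # placeholder: fine-tuned weights merged into the base
quant_path = "path/to/merged-model-awq"  # placeholder: output directory

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(merged_path)
tokenizer = AutoTokenizer.from_pretrained(merged_path, trust_remote_code=True)

# "pileval" is AutoAWQ's default calibration set and resolves to
# mit-han-lab/pile-val-backup
model.quantize(tokenizer, quant_config=quant_config, calib_data="pileval")

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```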
```
IndexError: start out of range (expected to be in range of [-4096, 4096], but got 7168)
```
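My current guess, which I haven't confirmed: the 4096 in the traceback looks like the base model's max_position_embeddings (the rotary-embedding range), while vLLM is scheduling positions up to 7168, so the merge or quantization step may have changed a context-length field in config.json. Would capping vLLM's context length be a valid workaround? For example:

```python
import json
from vllm import LLM

# Placeholder path; check what context length the quantized checkpoint advertises
with open("path/to/merged-model-awq/config.json") as f:
    cfg = json.load(f)
print(cfg.get("max_position_embeddings"))

# Cap the scheduler's context length to the rotary range from the traceback
llm = LLM(model="path/to/merged-model-awq", quantization="awq", max_model_len=4096)
```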