Add/update the quantized ONNX model files and README.md for Transformers.js v3

#7 opened by whitphx

Applied Quantizations

βœ… Based on decoder_with_past_model.onnx with slimming

↳ βœ… q4f16: decoder_with_past_model_q4f16.onnx (added)

βœ… Based on decoder_model.onnx with slimming

↳ βœ… q4f16: decoder_model_q4f16.onnx (added)

βœ… Based on encoder_model.onnx with slimming

↳ βœ… q4f16: encoder_model_q4f16.onnx (added)

βœ… Based on decoder_model_merged.onnx without slimming

↳ βœ… fp16: decoder_model_merged_fp16.onnx (replaced because it was invalid)
↳ βœ… q4f16: decoder_model_merged_q4f16.onnx (added)
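The file names above follow the Transformers.js convention of appending the quantization dtype as a suffix to the base ONNX file name (e.g. `decoder_model_merged` + `q4f16` becomes `decoder_model_merged_q4f16.onnx`). A minimal sketch of that mapping; `quantizedFileName` is a hypothetical helper, not part of the library:

```javascript
// Build the ONNX file name for a requested dtype, following the
// suffix convention of the files added in this PR.
// `quantizedFileName` is a hypothetical illustrative helper.
function quantizedFileName(baseName, dtype) {
  // The full-precision fp32 model is the unsuffixed base file.
  if (dtype === "fp32") return `${baseName}.onnx`;
  return `${baseName}_${dtype}.onnx`;
}

console.log(quantizedFileName("decoder_model_merged", "q4f16"));
// decoder_model_merged_q4f16.onnx
```

At load time, Transformers.js v3 callers select one of these variants with the `dtype` option, e.g. `pipeline(task, model, { dtype: "q4f16" })`.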

Ready to merge
This branch is ready to be merged automatically.
