Add/update the quantized ONNX model files and README.md for Transformers.js v3
#5, opened by whitphx (HF Staff)
Applied Quantizations
Based on decoder_with_past_model.onnx, with slimming:
  q4f16: decoder_with_past_model_q4f16.onnx (added)

Based on decoder_model.onnx, with slimming:
  q4f16: decoder_model_q4f16.onnx (added)

Based on encoder_model.onnx, with slimming:
  q4f16: encoder_model_q4f16.onnx (added)

Based on decoder_model_merged.onnx, without slimming:
  fp16: decoder_model_merged_fp16.onnx (replaced because it was invalid)
  q4f16: decoder_model_merged_q4f16.onnx (added)