whitphx (HF Staff) committed 2 days ago

Commit e5ff7c0 (verified) · 1 parent: c4b04f7

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `encoder_model.onnx` *with* slimming

↳ ❌ `int8`: `encoder_model_int8.onnx` (added but JS-based E2E test failed)
```
dtype not specified for "decoder_model_merged". Using the default dtype (fp32) for this device (cpu).
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/conv1/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)
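
Because the `int8` encoder failed to load with the `ConvInteger` error while the `uint8` encoder passed, one possible workaround is to choose the dtype per module. The sketch below assumes the per-module `dtype` object form of Transformers.js v3 and that its keys match the ONNX base names seen in this commit (`encoder_model`, `decoder_model_merged`). The helper names `buildDtypeOptions` and `loadPipeline` are hypothetical.

```js
// Hypothetical helper: choose a dtype per ONNX module, swapping the
// int8 encoder (which failed the E2E test) for the uint8 one.
function buildDtypeOptions(preferred) {
  const encoderDtype = preferred === 'int8' ? 'uint8' : preferred;
  return {
    dtype: {
      encoder_model: encoderDtype,
      decoder_model_merged: preferred,
    },
  };
}

// Not invoked here: calling it downloads the model files.
async function loadPipeline(preferred) {
  const { pipeline } = await import('@huggingface/transformers');
  return pipeline(
    'automatic-speech-recognition',
    'Xenova/whisper-tiny',
    buildDtypeOptions(preferred),
  );
}

console.log(buildDtypeOptions('int8'));
```

Any other dtype is passed through unchanged for both modules, so only the failing combination is rerouted.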

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *without* slimming

↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
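
The file names in the lists above follow a `<base>_<dtype>.onnx` pattern. As a small illustration, the sketch below rebuilds a path from a base model name and a quantization name. The helper `quantizedFileName` is hypothetical, and the `onnx/` prefix is taken from the changed-file paths in this commit.

```js
// Quantization names that appear in this commit.
const DTYPES = ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'];

// Hypothetical helper: derive the quantized file path from the base model name.
function quantizedFileName(baseModel, dtype) {
  if (!DTYPES.includes(dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  const stem = baseModel.replace(/\.onnx$/, ''); // drop the extension if present
  return `onnx/${stem}_${dtype}.onnx`;
}

console.log(quantizedFileName('decoder_model_merged.onnx', 'q4f16'));
```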

Files changed (23)

README.md +18 -0
onnx/decoder_model_bnb4.onnx +3 -0
onnx/decoder_model_fp16.onnx +3 -0
onnx/decoder_model_int8.onnx +3 -0
onnx/decoder_model_merged_bnb4.onnx +3 -0
onnx/decoder_model_merged_fp16.onnx +2 -2
onnx/decoder_model_merged_int8.onnx +3 -0
onnx/decoder_model_merged_q4.onnx +3 -0
onnx/decoder_model_merged_q4f16.onnx +3 -0
onnx/decoder_model_merged_uint8.onnx +3 -0
onnx/decoder_model_q4.onnx +3 -0
onnx/decoder_model_q4f16.onnx +3 -0
onnx/decoder_model_uint8.onnx +3 -0
onnx/decoder_with_past_model_bnb4.onnx +3 -0
onnx/decoder_with_past_model_fp16.onnx +3 -0
onnx/decoder_with_past_model_int8.onnx +3 -0
onnx/decoder_with_past_model_q4.onnx +3 -0
onnx/decoder_with_past_model_q4f16.onnx +3 -0
onnx/decoder_with_past_model_uint8.onnx +3 -0
onnx/encoder_model_bnb4.onnx +3 -0
onnx/encoder_model_q4.onnx +3 -0
onnx/encoder_model_q4f16.onnx +3 -0
onnx/encoder_model_uint8.onnx +3 -0

README.md CHANGED

@@ -5,4 +5,22 @@ library_name: transformers.js
 https://huggingface.co/openai/whisper-tiny with ONNX weights to be compatible with Transformers.js.
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+```js
+import { pipeline } from '@huggingface/transformers';
+// Create the pipeline
+const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny', {
+    dtype: 'fp32',  // Options: "fp32", "fp16", "q8", "q4"
+});
+// Use the model
+const result = await pipe('input text or data');
+console.log(result);
+```
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
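
The README snippet added above passes a placeholder string to the pipeline. An `automatic-speech-recognition` pipeline takes audio input rather than text, so a realistic call would look more like the sketch below. It assumes the pipeline accepts either an audio URL or a `Float32Array` of 16 kHz mono samples. The helpers `makeTone` and `transcribe` are hypothetical, and `dtype: 'q4'` is just one of the options listed in the README.

```js
const SAMPLE_RATE = 16000; // assumption: Whisper models expect 16 kHz mono audio

// Hypothetical helper: synthesize a quiet sine tone as raw samples.
function makeTone(seconds, frequencyHz = 440) {
  const samples = new Float32Array(Math.round(seconds * SAMPLE_RATE));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = 0.1 * Math.sin((2 * Math.PI * frequencyHz * i) / SAMPLE_RATE);
  }
  return samples;
}

// Not invoked here: calling it downloads the model weights.
async function transcribe(audio) {
  const { pipeline } = await import('@huggingface/transformers');
  const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny', {
    dtype: 'q4',
  });
  return pipe(audio); // audio: URL string or Float32Array of samples
}

const tone = makeTone(1);
console.log(tone.length);
```

A synthetic tone only exercises the plumbing. For a meaningful transcript, pass the URL of a real speech recording to `transcribe`.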

onnx/decoder_model_bnb4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1971aa8a89d28bffe596e3b085fca06e5b5b5f82d9ae1b0ecc498b324034bb51
+size 85954063
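
Each `.onnx` entry in this diff is a Git LFS pointer file, not the model weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch for reading such a pointer follows. The helper `parseLfsPointer` is hypothetical and expects the pointer text without the leading `+` diff markers.

```js
// Hypothetical helper: parse the "key value" lines of a Git LFS pointer file.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.trim().split('\n')) {
    const idx = line.indexOf(' ');
    fields[line.slice(0, idx)] = line.slice(idx + 1);
  }
  const [algo, digest] = fields.oid.split(':');
  return { version: fields.version, algo, digest, size: Number(fields.size) };
}

// Pointer contents copied from onnx/decoder_model_bnb4.onnx in this commit.
const pointer = [
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:1971aa8a89d28bffe596e3b085fca06e5b5b5f82d9ae1b0ecc498b324034bb51',
  'size 85954063',
].join('\n');

const parsed = parseLfsPointer(pointer);
console.log(parsed.algo, parsed.size);
```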

onnx/decoder_model_fp16.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:43dbcbd6a54f7bd4e1a89ae31eedcc906fa42d0623600f1eb11759503e740b4c
+size 59284531

onnx/decoder_model_int8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1995c47b14f6eb4fa9f090ad43fdf8ee239b70a8ac6202913601b71611a858c5
+size 110041956

onnx/decoder_model_merged_bnb4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ea54ad2b302939f5832d60a70fdecfeacdf547aa09d514ebbe8e2809210eaaba
+size 86150186

onnx/decoder_model_merged_fp16.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:52137cf6784f86d12f807c462e0fad154fc4418554c9c2d6bdb918eecac432d8
-size 59600258
+oid sha256:b5b6e3f37071723df3f47cf1b448a9672780b015846886336a5e712f02813541
+size 59603028

onnx/decoder_model_merged_int8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:212f4f045ba70f7418e3a7727590c43fd0e1f3eceebd555a828f67fdd6945239
+size 30727931

onnx/decoder_model_merged_q4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:462a65ea8459402cded5e6f22a378ac410ec7e0aad9367ebb08431906c237660
+size 86739474

onnx/decoder_model_merged_q4f16.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dc59a0cad1aa37442390f2b3cf9696868e38a570a23901c4efacde71b6690e98
+size 46041144

onnx/decoder_model_merged_uint8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58152fd4aafc0c27489d8c732bfcb9f9a06c762d0005e52b8dcc000f934dc834
+size 30727961

onnx/decoder_model_q4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ee35f54e2805bfd745069b8e59dc0031618e149e07e9d538e418c933b63ef5ed
+size 86543639

onnx/decoder_model_q4f16.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:22dd644b9fdc233a8c549333918b4f48cc18697eda55e313bebee0338f5e148d
+size 45724502

onnx/decoder_model_uint8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ba85d58c9fb02dc2dcfcb15607723b3272e154398511eccefe9da52f3682ffc6
+size 110041986

onnx/decoder_with_past_model_bnb4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2cbe80031272b92954938e53ba9d91a8a41f00511b471e5364224fb054a45c69
+size 85287005

onnx/decoder_with_past_model_fp16.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a3229f06de0d7e36a0c11fd65421720187d392ca8881ba20f31751ed37753bc0
+size 56928509

onnx/decoder_with_past_model_int8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:22c912aa9a27d39443a76d90b40946970d60a10b99a96807666c6cfa276ab2ac
+size 108852031

onnx/decoder_with_past_model_q4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:29a1f511a97e8d3b83711b24470e9a0f81326bf75e5d5444fc8cccdc02b0ae8b
+size 85802901

onnx/decoder_with_past_model_q4f16.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:39299269a0abf99032a2729c76428507a998c76ab7fdace644ff9ef224659d1c
+size 45063048

onnx/decoder_with_past_model_uint8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3f838b01d502ad0a07650f9ec55820f303d133d5fb9c0f66734237e6e8677238
+size 108852054

onnx/encoder_model_bnb4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3d9b46e4fe69d81316a0d79bba07123feb3b2e9a26b43632673f52561813f17b
+size 8563828

onnx/encoder_model_q4.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f895af36f57fec9cbeac8d29a982ae47b2e81e461d98320fbd30c47d01a6a13f
+size 9006044

onnx/encoder_model_q4f16.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fd605566b1dc81d05e378df6e782b3525a66b6efd52916c618ff301573577949
+size 6303086

onnx/encoder_model_uint8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:40df112bf7fa198a05f802c67a1b98e19b9287cf008de4496df51338e3f2cb47
+size 10079602