whitphx (HF Staff) committed
Commit bc360d7 · verified · parent: 7725fa9

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `encoder_model.onnx` *with* slimming

↳ ❌ `int8`: `encoder_model_int8.onnx` (added, but the JS-based E2E test failed with the error below; see the dtype-selection sketch after this list for a workaround)
```
dtype not specified for "decoder_model_merged". Using the default dtype (fp32) for this device (cpu).
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/conv1/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *without* slimming

↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
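
For anyone consuming these files from Transformers.js v3, here is a minimal sketch of how one of the quantizations listed above could be selected per module. The per-module `dtype` map and the particular `uint8`/`q4f16` choices are illustrative assumptions on my part, not something this commit configures.

```js
import { pipeline } from '@huggingface/transformers';

// Keys follow the ONNX file basenames listed above (e.g. encoder_model_uint8.onnx).
// The encoder int8 variant is avoided here because its JS-based E2E test failed
// (ConvInteger has no CPU implementation in onnxruntime-node).
const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium', {
  dtype: {
    encoder_model: 'uint8',        // -> onnx/encoder_model_uint8.onnx
    decoder_model_merged: 'q4f16', // -> onnx/decoder_model_merged_q4f16.onnx
  },
});
```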

README.md CHANGED
@@ -5,4 +5,22 @@ library_name: transformers.js

https://huggingface.co/openai/whisper-medium with ONNX weights to be compatible with Transformers.js.

+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ // Create the pipeline
+ const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium', {
+   dtype: 'fp32', // Options: "fp32", "fp16", "q8", "q4"
+ });
+
+ // Use the model
+ const result = await pipe('input text or data');
+ console.log(result);
+ ```
+
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
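
The `pipe('input text or data')` line in the README snippet above is a generic placeholder; an automatic-speech-recognition pipeline takes audio input. A minimal usage sketch follows, assuming the publicly hosted `jfk.wav` sample from the `Xenova/transformers.js-docs` dataset (that sample URL is an assumption and not part of this commit):

```js
import { pipeline } from '@huggingface/transformers';

// Same pipeline as in the README snippet above.
const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium', {
  dtype: 'fp32', // Options: "fp32", "fp16", "q8", "q4"
});

// Pass a URL to an audio file (or a Float32Array of 16 kHz mono PCM samples).
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
const result = await transcriber(url);
console.log(result); // e.g. { text: "..." }
```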
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3cfcea7268a27708dd65b2ef2fdd84fce87bfc38d93cbd8dadacc939612cda78
+ size 443519184
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6b048fab0c9e4b36ee08d24e5e866839328ec3b6ef5ab85ad705a4252fd3fdc1
+ size 914323444
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:de6c273d3b6709cff6ccb58027d4a25e38b1f9fe166d923987b23d1004a4c4d4
+ size 673045389
onnx/decoder_model_merged_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:378fb1a11615ad08ea8c415d2265d9bb11c819dbc07a77ff5dfe685d6e5754d0
+ size 444673676
onnx/decoder_model_merged_fp16.onnx CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
- oid sha256:c55f59293dc00faa4c139c1be03a2e699dfd93f81d2351eda5b59ac449568098
- size 916172654
+ oid sha256:fe6170106d419388f5287e2d465fe0e594e84f5835118173dae213fd05ff47ba
+ size 916186656
onnx/decoder_model_merged_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:65927eecc665dc398d2053222ee178fa3a02af269f0825867b5e16fc4b8a7f19
+ size 462662335
onnx/decoder_model_merged_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:72cff87d24864768d33a4f0447417a989cb2574c4142bbab2c600a63fb6581c5
+ size 469835828
onnx/decoder_model_merged_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1af57265f6224ead6557221192948ccb5c605d0ae96355701ec07b76f92e5989
+ size 337396943
onnx/decoder_model_merged_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c45e39b6494ad1cc2e3a1bd82dab61673faa3d445f11cb59f9911b0bff926fb9
+ size 462662444
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4366f97d9e79301d2d8039e3cc536c0aa2a2219f27e610594639726dcb8e2340
+ size 468683064
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:37025322c5aa638ba6f9b37403407ec0bd212b2403319941879142e2614db1ef
+ size 335545151
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9a0b5e6caeb5789201c1078c8b5f262d260eba20e2914660b0ea0b46c9303e5d
+ size 673045498
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:77d4c296f9d1358c5c9eb46e460c13bd9a3b8961f50eccbe136d76d8a8e17825
+ size 415139034
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8974a72c5bfcc098d950839860e715ddf6ca6955597cc12727f61fcdb9a88ca4
+ size 813664334
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ad768737b624290e0140dfbab56558c79be542866a54d4e98439d165bd7a8b75
+ size 622605324
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f26d656c37024c82f571d80228ddcd0bc1c8d650024bc7a79374d4248f0325be
+ size 437157570
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:66124078951e7e95c2eae739c417aa2af0ff5a150ffedd19899d8c9e3fae8fc4
+ size 307230681
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:134a091f3cce0ff1ac9130e2a8691a2a4fb0078d2604ee513bc7d3313e4eaa8b
+ size 622605413
onnx/encoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c186feea76b2af78b976272d1a3e3f0ba5a4c576235fcb7dbfca6b318eff0937
+ size 191120171
onnx/encoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e22a27383d33e8eb9a07a3cd434f5ef47c43a19de5db07335fca9fa497718743
+ size 209993363
onnx/encoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:758d4b51c58518fc78267b1fddf66852077e96cd0c58f6904ca284b55fe12d0b
+ size 180667559
onnx/encoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76eec06b6b09c5a7a0859c86e8d616dd4d99047488055db1be441622d4dc9a85
+ size 313193350