whitphx HF Staff commited on
Commit
49c7977
Β·
verified Β·
1 Parent(s): 9226b95

Add/update the quantized ONNX model files and README.md for Transformers.js v3

Browse files

## Applied Quantizations

### βœ… Based on `decoder_model.onnx` *with* slimming

↳ βœ… `fp16`: `decoder_model_fp16.onnx` (added)
↳ βœ… `int8`: `decoder_model_int8.onnx` (added)
↳ βœ… `uint8`: `decoder_model_uint8.onnx` (added)
↳ βœ… `q4`: `decoder_model_q4.onnx` (added)
↳ βœ… `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ βœ… `bnb4`: `decoder_model_bnb4.onnx` (added)

### βœ… Based on `encoder_model.onnx` *with* slimming

↳ ❌ `int8`: `encoder_model_int8.onnx` (added but JS-based E2E test failed)
```
dtype not specified for "decoder_model_merged". Using the default dtype (fp32) for this device (cpu).
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/conv1/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ βœ… `uint8`: `encoder_model_uint8.onnx` (added)
↳ βœ… `q4`: `encoder_model_q4.onnx` (added)
↳ βœ… `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ βœ… `bnb4`: `encoder_model_bnb4.onnx` (added)

### βœ… Based on `decoder_with_past_model.onnx` *with* slimming

↳ βœ… `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ βœ… `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ βœ… `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ βœ… `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ βœ… `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ βœ… `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### βœ… Based on `decoder_model_merged.onnx` *without* slimming

↳ βœ… `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ βœ… `int8`: `decoder_model_merged_int8.onnx` (added)
↳ βœ… `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ βœ… `q4`: `decoder_model_merged_q4.onnx` (added)
↳ βœ… `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ βœ… `bnb4`: `decoder_model_merged_bnb4.onnx` (added)

README.md CHANGED
@@ -5,4 +5,24 @@ library_name: transformers.js
5
 
6
  https://huggingface.co/openai/whisper-base with ONNX weights to be compatible with Transformers.js.
7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
8
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [πŸ€— Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
 
5
 
6
  https://huggingface.co/openai/whisper-base with ONNX weights to be compatible with Transformers.js.
7
 
8
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
9
+ ```bash
10
+ npm i @huggingface/transformers
11
+ ```
12
+
13
+ ## Basic Usage
14
+
15
+ ```js
16
+ import { pipeline } from '@huggingface/transformers';
17
+
18
+ // Create the pipeline
19
+ const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-base', {
20
+ dtype: 'fp32', // Options: "fp32", "fp16", "q8", "q4"
21
+ });
22
+
23
+ // Use the model
24
+ const result = await pipe('input text or data');
25
+ console.log(result);
26
+ ```
27
+
28
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [πŸ€— Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1ce67d337008dc166238285fef67019d5814015e55a2d48ad762ba6f22c42f4f
3
+ size 121779106
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc3bfe671c3366fd49b0826ca90b835fd1c1d21c05b4a018230069847a20cf63
3
+ size 104271798
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3ee68cb8c91496ef0a753e42ccdd42e8d390b67ada2a8b57dc74df675d2ce707
3
+ size 159408039
onnx/decoder_model_merged_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:22cd95cd2fbbcc5d65fde9b6755170b24f9042453f9df133fc850af553749a4a
3
+ size 122069922
onnx/decoder_model_merged_fp16.onnx CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f55117d3409469546c58e5ef231d46013f59407acf91c18bfce611121b261598
3
- size 104739140
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d875a521a214f62258f922933e8afe96794c03d571b4039fe08bc9000fb49ddc
3
+ size 104742968
onnx/decoder_model_merged_int8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f936da4b81f487f8a742554aecc4b8e12ad6121f76d044027f72380139ae869c
3
+ size 53707761
onnx/decoder_model_merged_q4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a68dcdbb6551967030dcbdcef400d6e62b1624234c2e606f75d1d2878aef5def
3
+ size 123641874
onnx/decoder_model_merged_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aadeadbecc79bee3287cb34f0b37c768a0ddfd243c3367f209e446d9e3ae4c4a
3
+ size 68573265
onnx/decoder_model_merged_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bf7c1df5c518e084e80c3262c98e6b653c77c15d0ed59b1c4369824974cae065
3
+ size 53707789
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8595a4b52a8996ea8126169c13cde87e0f51d1156884bb467ec56ce26f3c1c55
3
+ size 123351490
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ea9d3c693418f92962ec0c2151b3d86f85dc67684a75c6a642882f95fd206d61
3
+ size 68104873
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:029592f362a11665c4dd9ed5055d2e7258d1969d183a5590ae26944ac65c9943
3
+ size 159408067
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5add1d1705db854bcee188312319bbf6fb34fcaf2b3223cf9f3fc1224b0fedaa
3
+ size 120002734
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f1a3faee7a81fb21bce0c28888e5fab661a1b9951ac4643e4bdeefc9b0597d5d
3
+ size 97985286
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2eb3a1e69cb5cbe09ed60fb2136d9f372cb04f174bf25957ad9f81474940bce5
3
+ size 156245370
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c51fa5423a9f0867ecd114f5f6e1878484f7496ad98b98557e4087c485c8b490
3
+ size 121378606
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:183dc8292650913537a3512dca3b0cca0cf37ab400dcc2eb583e29249dac96d6
3
+ size 66338557
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c4ae517fa06bb9cfea59efecdf0a5a385c270c018ede6c82117d5397cb0654ac
3
+ size 156245393
onnx/encoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6bda97a7bbf47850bbd627961daaf6310c61e214f2cb294c5bb901c76ba12b6e
3
+ size 17570314
onnx/encoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc0cb2885dc2bc570dc8de8d00987cd20314e5773f3baf968b3219810142d886
3
+ size 18749674
onnx/encoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9205a07a3040b786fafc6d8b9afe4339f6acaefb561b51f660ec5327ef4147b8
3
+ size 14138124
onnx/encoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f7c8ff66fb3b9aa687a1afe951917959c586ea9e460e15a2753f31f740e982b4
3
+ size 23132968