whitphx (HF Staff) committed
Commit 0b7684a · verified · 1 Parent(s): 8569aae

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `encoder_model.onnx` *with* slimming

↳ ❌ `int8`: `encoder_model_int8.onnx` (added, but the JS-based E2E test failed; see the dtype-selection sketch after these lists)
```
dtype not specified for "decoder_model_merged". Using the default dtype (fp32) for this device (cpu).
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/conv1/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *without* slimming

↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
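
The lists above only describe which quantized weight files now exist under `onnx/`; which variant is actually downloaded is chosen at load time via the `dtype` option in Transformers.js v3. Below is a minimal, non-authoritative sketch (assuming this repo is consumed as `Xenova/whisper-medium.en` and that the per-module `dtype` object syntax of `@huggingface/transformers` v3 is available) of keeping the encoder on a quantization that passed the E2E test while loading a 4-bit decoder:

```js
import { pipeline } from '@huggingface/transformers';

// Per-module dtype selection: keep the encoder on a quantization that passed
// the E2E test (the int8 encoder above failed on ConvInteger) and use a
// 4-bit decoder to cut the download size.
const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium.en', {
  dtype: {
    encoder_model: 'fp32',       // or 'uint8' / 'q4' / 'q4f16' / 'bnb4'
    decoder_model_merged: 'q4',  // resolves to onnx/decoder_model_merged_q4.onnx
  },
});
```

The keys mirror the file-name prefixes above and the values map to the suffixes (for example `decoder_model_merged` + `q4` resolves to `onnx/decoder_model_merged_q4.onnx`), so only the selected files are fetched.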

README.md CHANGED
@@ -5,4 +5,24 @@ library_name: transformers.js
 
 https://huggingface.co/openai/whisper-medium.en with ONNX weights to be compatible with Transformers.js.
 
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ ## Basic Usage
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ // Create the pipeline
+ const pipe = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium.en', {
+   dtype: 'fp32', // Options: "fp32", "fp16", "int8", "uint8", "q4", "q4f16", "bnb4"
+ });
+
+ // Transcribe an audio file (path or URL)
+ const result = await pipe('audio.wav');
+ console.log(result);
+ ```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
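
Complementing the Basic Usage snippet added to the README above, here is a hedged sketch of transcribing longer audio with one of the quantizations added in this commit; the `chunk_length_s`, `stride_length_s`, and `return_timestamps` options and the sample `jfk.wav` URL are taken from the Transformers.js documentation, not from this commit:

```js
import { pipeline } from '@huggingface/transformers';

// Load the pipeline with one of the quantizations added in this commit.
const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium.en', {
  dtype: 'q4f16',
});

// Whisper works on 30-second windows; chunking with an overlapping stride
// lets the pipeline handle longer recordings and emit timestamps.
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
const output = await transcriber(url, {
  chunk_length_s: 30,
  stride_length_s: 5,
  return_timestamps: true,
});

console.log(output.text);   // full transcript
console.log(output.chunks); // [{ timestamp: [start, end], text: '...' }, ...]
```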
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5293b41eb6e438aa8d64f9644729dd7b5a1ba417ebf43d1d7b1f77dbdadf6d13
+ size 443515088
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:568f48782fd2a4b3e5e61caad5891fcedc8c3405bf2b37cef2fa104f075cd55e
+ size 914321396
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:62bb284f0b138a8b68177a2a9819530b2482069d23332234ba0dda091db11040
+ size 673040270
onnx/decoder_model_merged_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eb5f15ce17e0fdb0d259b06c050ab049fb3aecef9240edbb35a9ef141e86997b
+ size 444669580
onnx/decoder_model_merged_fp16.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:143a257d1de5b72ae5415660d4bc808be8077be06539a41717b0c4b4379ea107
- size 916170606
+ oid sha256:6185a1994544fc69cbc14ecdf8fb249bdcc3191c7af51cc96bdbbcf6c0d984cc
+ size 916184608
onnx/decoder_model_merged_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a180861fbdb51a4f32dee00f9ea5a4cda73169ec8c2b3cf4ef937182c83082b7
+ size 462661312
onnx/decoder_model_merged_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a936b3f7d96295a2235dea39a5a2f6e2e8255295e1c47d95f8092fb900dccb69
+ size 469831732
onnx/decoder_model_merged_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:480388fc84c8bec87259fbc6693a0cf41239e2d21a8527a5d482b537cb447af5
+ size 337394895
onnx/decoder_model_merged_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9ad45d24ac6872708ff2911b88b172e472f39049a9413085d89f887e336d72dc
+ size 462661417
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab22d2c6c95d0c6656b63cda2145e63b8e05b93fc90b5acbd358412fed926874
+ size 468678968
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b1d26bcb57b94ce002d41d6100dc3fa94b7ad637567e348c80683db191638c2
+ size 335543103
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ae641f4d8300d25b4e05d5046dd446cd117d0d034e7220a39bfd26bf732e7816
+ size 673040375
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:732624a610977cfe123fec4d37bde8d1fde1aae1d7a1a571aa59f5723cce346c
+ size 415134938
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d6f3f0a8b243524207cff1cce9e66b93f7ead50d5eaf970e45b175833ab3b829
+ size 813662286
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06cd6247d31afc56c333e8dae58c1632610f9502454a7e40684407adf1149234
+ size 622600205
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c2b605277ff6c7892e104643c893ec4ddd661ba3c64b308ba94d4efa2f427f6a
+ size 437153474
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31386672a90ff3c93b3f07501231918d2e6867f662f28601feb76111c98ecd2b
+ size 307228633
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dbd02857ab462aa2f060fb3f935a18dade7aa5f89473a1b2751cca0e64c0a14b
+ size 622600289
onnx/encoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:09c2754c77eddcf8eb1ca43acd3ea165d42ca18972164e4eed66d7177c54829d
+ size 191120171
onnx/encoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d41e95fa59b67b9f504eee0dac102aba160a1093cfbede3cc2f5c020d5ae9df7
+ size 209993363
onnx/encoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29fb4341e7861c988c0b871d95b0a3557247fb1ee3296cd06cac11c55226683a
+ size 180667559
onnx/encoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:169ad1105672ee5e438711576a7a96e9467a7af0b1a64f01a796b36c42574965
+ size 313193350