whitphx (HF Staff) committed
Commit 2882d62 · verified · 1 Parent(s): f4ef619

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ✅ Based on `decoder_model_merged.onnx` *with* slimming

**The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.**

↳ ✅ `fp16`: `model_fp16.onnx` (added)
↳ ✅ `int8`: `model_int8.onnx` (added)
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
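
These suffixes map onto the `dtype` option in Transformers.js v3, so a particular quantized variant can be requested at load time. A minimal sketch, assuming the `Xenova/tiny-random-Starcoder2ForCausalLM` model id used in the README below; any suffix from the list above should be a valid `dtype` value:

```js
import { pipeline } from '@huggingface/transformers';

// Request the q4f16 variant (model_q4f16.onnx) instead of the default weights.
const generator = await pipeline(
  'text-generation',
  'Xenova/tiny-random-Starcoder2ForCausalLM',
  { dtype: 'q4f16' },
);
```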

README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
 https://huggingface.co/hf-internal-testing/tiny-random-Starcoder2ForCausalLM with ONNX weights to be compatible with Transformers.js.
 
+## Usage (Transformers.js)
+
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+**Example:** Text generation.
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+const generator = await pipeline('text-generation', 'Xenova/tiny-random-Starcoder2ForCausalLM');
+const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
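
For reference, the `text-generation` pipeline in the added example resolves to an array of objects with a `generated_text` field, so the result can be inspected directly (illustrative output only; this tiny random test model produces arbitrary tokens):

```js
console.log(output);
// e.g. [{ generated_text: 'Once upon a time, there was ...' }]
```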
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75d5ed8d790b92205311b6015faf8d3ebc929f27e4be25234e1d8491ddab1c4f
+size 308904
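
Each `ADDED` entry here is a Git LFS pointer recording the blob's SHA-256 and byte size rather than the weights themselves. A minimal Node.js sketch, using only the standard `node:crypto` and `node:fs` APIs, to check a locally fetched file (e.g. after `git lfs pull`) against its pointer; the same check applies to each of the quantized files below:

```js
import { createHash } from 'node:crypto';
import { readFileSync, statSync } from 'node:fs';

// Expected values copied from the LFS pointer above.
const path = 'onnx/model.onnx';
const expectedOid = '75d5ed8d790b92205311b6015faf8d3ebc929f27e4be25234e1d8491ddab1c4f';
const expectedSize = 308904;

// Hash the local file and compare against the pointer's oid and size.
const oid = createHash('sha256').update(readFileSync(path)).digest('hex');
console.log(oid === expectedOid && statSync(path).size === expectedSize
  ? 'OK: file matches the LFS pointer'
  : 'Mismatch: file is still a pointer stub or is corrupted');
```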
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ff9a1d33653afce52b35bb3636fe224b2ba304467d04c642a9088824cafafba
+size 239050
onnx/model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f0cccfcf2f53a96d2e3df397042d65a36f6d8cc1e8d1a08ce488d364ee8f4370
+size 172448
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a8a63697b60513d3facf96e732f88931036991e98205ef263335ca39526f8e3
+size 157407
onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:22d063b5d397de0790d10010cea770d7304024ee83a2f4ae5f6a34c64096b656
+size 240722
onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7afab5f261a3a1ec0ab425e997b384138eee95b28181408ec0d1fc110d5ea1d5
+size 159495
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cee4e4af160699ee4fe07eee8b4c743bf32bfd3833a960f6d3b2fc609dd0966f
+size 157413