add acc data
README.md
CHANGED
@@ -224,6 +224,32 @@ for output in outputs:
 ~~~
 
 
+### Evaluate the model
+
+~~~bash
+auto-round --eval --model "Intel/DeepSeek-R1-0528-Qwen3-8B-int4-AutoRound-inc" --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
+~~~
+
+
+
+| Metric               | BF16   | INT4 (auto-round) | INT4 (auto-round-best) |
+| -------------------- | ------ | ----------------- | ---------------------- |
+| Avg                  | 0.5958 | 0.5913            | 0.5926                 |
+| arc_challenge        | 0.5137 | 0.5102            | 0.5043                 |
+| arc_easy             | 0.7908 | 0.7862            | 0.7921                 |
+| boolq                | 0.8498 | 0.8526            | 0.8443                 |
+| ceval-valid          | 0.7296 | 0.7177            | 0.7140                 |
+| cmmlu                | 0.7159 | 0.7029            | 0.7027                 |
+| gsm8k                | 0.8211 | 0.8029            | 0.8234                 |
+| hellaswag            | 0.5781 | 0.5703            | 0.5670                 |
+| lambada_openai       | 0.5544 | 0.5490            | 0.5626                 |
+| leaderboard_ifeval   | 0.2731 | 0.2729            | 0.2542                 |
+| leaderboard_mmlu_pro | 0.4115 | 0.4105            | 0.4117                 |
+| openbookqa           | 0.3020 | 0.3060            | 0.3100                 |
+| piqa                 | 0.7617 | 0.7617            | 0.7612                 |
+| truthfulqa_mc1       | 0.3562 | 0.3611            | 0.3696                 |
+| winogrande           | 0.6835 | 0.6740            | 0.6788                 |
+
 
 
 ### Reproduce the model
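
As a sanity check on the accuracy table added in this commit, the sketch below recomputes the `Avg` row. It assumes `Avg` is the unweighted mean of the 14 per-task scores, which the commit does not state explicitly. The names `scores`, `columns`, and `column_averages` are illustrative and not part of auto-round.

```python
# Per-task scores copied from the table in this commit (the Avg row is excluded).
# Tuple order: BF16, INT4 (auto-round), INT4 (auto-round-best).
scores = {
    "arc_challenge":        (0.5137, 0.5102, 0.5043),
    "arc_easy":             (0.7908, 0.7862, 0.7921),
    "boolq":                (0.8498, 0.8526, 0.8443),
    "ceval-valid":          (0.7296, 0.7177, 0.7140),
    "cmmlu":                (0.7159, 0.7029, 0.7027),
    "gsm8k":                (0.8211, 0.8029, 0.8234),
    "hellaswag":            (0.5781, 0.5703, 0.5670),
    "lambada_openai":       (0.5544, 0.5490, 0.5626),
    "leaderboard_ifeval":   (0.2731, 0.2729, 0.2542),
    "leaderboard_mmlu_pro": (0.4115, 0.4105, 0.4117),
    "openbookqa":           (0.3020, 0.3060, 0.3100),
    "piqa":                 (0.7617, 0.7617, 0.7612),
    "truthfulqa_mc1":       (0.3562, 0.3611, 0.3696),
    "winogrande":           (0.6835, 0.6740, 0.6788),
}
columns = ("BF16", "INT4 (auto-round)", "INT4 (auto-round-best)")


def column_averages(table):
    """Return the unweighted mean of each column, rounded to 4 decimals."""
    n_tasks = len(table)
    sums = [0.0] * len(columns)
    for row in table.values():
        for i, value in enumerate(row):
            sums[i] += value
    return [round(total / n_tasks, 4) for total in sums]


averages = dict(zip(columns, column_averages(scores)))
for name, avg in averages.items():
    print(f"{name}: {avg:.4f}")
```

Under that assumption the three means come out to 0.5958, 0.5913, and 0.5926, matching the `Avg` row in the table.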