wenhuach committed · Commit 3817939 · verified · 1 Parent(s): a816085

add acc data

Files changed (1)
  1. README.md +26 -0
README.md CHANGED
@@ -224,6 +224,32 @@ for output in outputs:
  ~~~


+ ### Evaluate the model
+
+ ~~~bash
+ auto-round --eval --model "Intel/DeepSeek-R1-0528-Qwen3-8B-int4-AutoRound-inc" --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
+ ~~~
+
+
+
+ | Metric               | BF16   | INT4 (auto-round) | INT4 (auto-round-best) |
+ | -------------------- | ------ | ----------------- | ---------------------- |
+ | Avg                  | 0.5958 | 0.5913            | 0.5926                 |
+ | arc_challenge        | 0.5137 | 0.5102            | 0.5043                 |
+ | arc_easy             | 0.7908 | 0.7862            | 0.7921                 |
+ | boolq                | 0.8498 | 0.8526            | 0.8443                 |
+ | ceval-valid          | 0.7296 | 0.7177            | 0.7140                 |
+ | cmmlu                | 0.7159 | 0.7029            | 0.7027                 |
+ | gsm8k                | 0.8211 | 0.8029            | 0.8234                 |
+ | hellaswag            | 0.5781 | 0.5703            | 0.5670                 |
+ | lambada_openai       | 0.5544 | 0.5490            | 0.5626                 |
+ | leaderboard_ifeval   | 0.2731 | 0.2729            | 0.2542                 |
+ | leaderboard_mmlu_pro | 0.4115 | 0.4105            | 0.4117                 |
+ | openbookqa           | 0.3020 | 0.3060            | 0.3100                 |
+ | piqa                 | 0.7617 | 0.7617            | 0.7612                 |
+ | truthfulqa_mc1       | 0.3562 | 0.3611            | 0.3696                 |
+ | winogrande           | 0.6835 | 0.6740            | 0.6788                 |
+
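
The Avg row of the added table can be cross-checked against the per-task rows. The following is a minimal sketch, assuming Avg is the unweighted mean of the 14 task scores (the commit does not say how it was computed). The scores are copied from the table; the names `SCORES`, `REPORTED_AVG`, and `column_means` are illustrative and not part of auto-round.

```python
# Scores copied from the table in this commit:
# (BF16, INT4 auto-round, INT4 auto-round-best).
SCORES = {
    "arc_challenge": (0.5137, 0.5102, 0.5043),
    "arc_easy": (0.7908, 0.7862, 0.7921),
    "boolq": (0.8498, 0.8526, 0.8443),
    "ceval-valid": (0.7296, 0.7177, 0.7140),
    "cmmlu": (0.7159, 0.7029, 0.7027),
    "gsm8k": (0.8211, 0.8029, 0.8234),
    "hellaswag": (0.5781, 0.5703, 0.5670),
    "lambada_openai": (0.5544, 0.5490, 0.5626),
    "leaderboard_ifeval": (0.2731, 0.2729, 0.2542),
    "leaderboard_mmlu_pro": (0.4115, 0.4105, 0.4117),
    "openbookqa": (0.3020, 0.3060, 0.3100),
    "piqa": (0.7617, 0.7617, 0.7612),
    "truthfulqa_mc1": (0.3562, 0.3611, 0.3696),
    "winogrande": (0.6835, 0.6740, 0.6788),
}

# Values of the "Avg" row as reported in the table.
REPORTED_AVG = (0.5958, 0.5913, 0.5926)
COLUMNS = ("BF16", "INT4 (auto-round)", "INT4 (auto-round-best)")


def column_means(scores):
    """Unweighted mean of each column (the assumed definition of Avg)."""
    n = len(scores)
    return tuple(sum(row[i] for row in scores.values()) / n for i in range(3))


means = column_means(SCORES)
for name, mean, reported in zip(COLUMNS, means, REPORTED_AVG):
    # Share of the BF16 mean that this column retains.
    retained = mean / means[0]
    print(f"{name}: mean={mean:.4f} reported={reported:.4f} retained={retained:.2%}")
```

Under that assumption, each recomputed mean should agree with the reported Avg to four decimal places; a mismatch would suggest a transcription error or a different averaging rule.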


  ### Reproduce the model