Alex committed on
Commit ac1299b · 1 Parent(s): 9404fa8

merge_resolve

Files changed (1)
  1. README.md +2 -23
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- title: CircleGuardBench
+ title: CodeReviewBench
  emoji: ⚪
  colorFrom: gray
  colorTo: indigo
@@ -7,35 +7,14 @@ sdk: gradio
  sdk_version: 4.44.1
  app_file: app.py
  pinned: true
- short_description: First benchmark testing LLM guards on safety and accuracy.
+ short_description: A comprehensive benchmark and leaderboard for code review generation models.
  models:
- - AtlaAI/Selene-1-Mini-Llama-3.1-8B
- - google/gemma-3-12b-it
- - google/gemma-3-4b-it
- - meta-llama/Llama-3.1-8B-Instruct
- - meta-llama/Llama-3.2-3B-Instruct
- - meta-llama/Llama-4-Maverick-17B-128E-Instruct
- - meta-llama/Llama-4-Scout-17B-16E-Instruct
- - meta-llama/Llama-Guard-3-1B
- - meta-llama/Llama-Guard-3-8B
- - meta-llama/Llama-Guard-4-12B
- - mistralai/Ministral-8B-Instruct-2410
- - mistralai/Mistral-Small-3.1-24B-Instruct-2503
- - Qwen/Qwen2.5-7B-Instruct
- - Qwen/Qwen3-0.6B
- - Qwen/Qwen3-1.7B
- - Qwen/Qwen3-4B
- - Qwen/Qwen3-8B
 
  ---
 
  # CodeReview Bench Leaderboard
 
- <<<<<<< HEAD
  A comprehensive benchmark and leaderboard for code review generation models, inspired by [CodeReviewBench](https://huggingface.co/spaces/your-org/CodeReviewBench).
- =======
- A comprehensive leaderboard for evaluating automated code review systems across programming languages and review quality dimensions.
- >>>>>>> f990f507d1e99e7867021841fa223fe6ca8f653b
 
  ## Features
 