Alex committed
Commit b31be61 · 1 Parent(s): 982b341

space updated
SUBMISSION_EXAMPLE.md ADDED
@@ -0,0 +1,266 @@
# 📝 Model Submission Example

This guide shows you exactly how to submit your code review model to the leaderboard.

## 🚀 Step-by-Step Submission Process

### 1. **Access the Submission Form**

- Open the CodeReview Leaderboard in your browser
- Navigate to the **📝 Submit Model** tab
- Click on the "📝 Submit New Model Results" accordion to expand the form

### 2. **Fill in Basic Information**

#### **Model Name** ✨

```
Example: microsoft/CodeT5-base
Format: organization/model-name
```

#### **Programming Language** 🔍

```
Select: Python
(or Java, JavaScript, C++, Go, Rust, etc.)
```

#### **Comment Language** 🌍

```
Select: English
(or Chinese, Spanish, French, German, etc.)
```

#### **Taxonomy Category** 🏷️

```
Select: Bug Detection
(or Security, Performance, Code Style, etc.)
```

### 3. **Performance Scores** (0.0 - 1.0)

#### **BLEU Score**

```
Example: 0.742
Range: 0.0 to 1.0
Description: Measures similarity between generated and reference reviews
```

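How you compute BLEU is up to you; as a hedged illustration, the sketch below uses `sacrebleu` (an assumption — the leaderboard does not mandate a particular scorer) and divides its 0–100 output by 100 to match the form's 0.0–1.0 range. The strings are made-up sample data.

```python
# Sketch: corpus-level BLEU with sacrebleu (assumed scorer, not mandated).
import sacrebleu

# Hypothetical sample data: model outputs and one reference review each.
generated = ["Consider checking for None before dereferencing `user`."]
references = [["Add a None check before accessing `user` attributes."]]

# sacrebleu reports BLEU on a 0-100 scale; divide by 100 for the form.
bleu = sacrebleu.corpus_bleu(generated, references)
print(round(bleu.score / 100, 3))
```
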
#### **Pass@1**

```
Example: 0.685
Range: 0.0 to 1.0
Description: Success rate when the model gets 1 attempt
```

#### **Pass@5**

```
Example: 0.834
Range: 0.0 to 1.0
Description: Success rate when the model gets 5 attempts
```

#### **Pass@10**

```
Example: 0.901
Range: 0.0 to 1.0
Description: Success rate when the model gets 10 attempts
```

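If you estimate these scores by sampling `n` candidate reviews per problem and counting the `c` that pass your acceptance check, the standard unbiased pass@k estimator (Chen et al., 2021) applies. A minimal sketch with illustrative counts:

```python
# Unbiased pass@k estimator: probability that at least one of k draws
# (without replacement) from n samples, c of them correct, is correct.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:  # every size-k draw must contain a correct sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-problem (n, c) counts; the mean is the reported score.
results = [(10, 7), (10, 9), (10, 5)]
for k in (1, 5, 10):
    score = sum(pass_at_k(n, c, k) for n, c in results) / len(results)
    print(f"Pass@{k}: {score:.3f}")
```

Because the estimate never decreases as k grows, scores produced this way automatically satisfy the Pass@1 ≤ Pass@5 ≤ Pass@10 rule the form enforces.
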
### 4. **Quality Metrics** (0 - 10)

Rate your model across these 10 dimensions:

#### **Readability: 8**

```
How clear and readable are the generated code reviews?
Scale: 0 (unreadable) to 10 (very clear)
```

#### **Relevance: 7**

```
How relevant are the reviews to the actual code changes?
Scale: 0 (irrelevant) to 10 (highly relevant)
```

#### **Explanation Clarity: 8**

```
How well does the model explain identified issues?
Scale: 0 (unclear) to 10 (very clear explanations)
```

#### **Problem Identification: 7**

```
How effectively does it identify real code problems?
Scale: 0 (misses issues) to 10 (finds all problems)
```

#### **Actionability: 6**

```
How actionable and useful are the suggestions?
Scale: 0 (not actionable) to 10 (very actionable)
```

#### **Completeness: 7**

```
How thorough and complete are the reviews?
Scale: 0 (incomplete) to 10 (comprehensive)
```

#### **Specificity: 6**

```
How specific are the comments and suggestions?
Scale: 0 (too generic) to 10 (very specific)
```

#### **Contextual Adequacy: 7**

```
How well does it understand the code context?
Scale: 0 (ignores context) to 10 (perfect context understanding)
```

#### **Consistency: 6**

```
How consistent is the model across different code reviews?
Scale: 0 (inconsistent) to 10 (very consistent)
```

#### **Brevity: 5**

```
How concise are the reviews without losing important information?
Scale: 0 (too verbose/too brief) to 10 (perfect length)
```

### 5. **Submit Your Model**

- Click the **🚀 Submit Model** button
- Wait for validation and processing
- Check for the success/error message

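The form is the supported submission path. If the Space also exposes its submit handler through the Gradio API, a scripted submission with `gradio_client` might look like the sketch below; the Space id, `api_name`, and argument order are all assumptions, so check the Space's "Use via API" page for the real signature.

```python
# Hypothetical sketch: scripting a submission with gradio_client.
# Space id, api_name, and argument order are assumptions, not documented.
from gradio_client import Client

client = Client("your-org/codereview-leaderboard")  # hypothetical Space id
result = client.predict(
    "microsoft/CodeT5-base",       # model name
    "Python",                      # programming language
    "English",                     # comment language
    "Bug Detection",               # taxonomy category
    0.742, 0.685, 0.834, 0.901,    # BLEU, Pass@1, Pass@5, Pass@10
    8, 7, 8, 7, 6, 7, 6, 7, 6, 5,  # the ten quality metrics, in form order
    api_name="/submit_model",      # hypothetical endpoint name
)
print(result)
```
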
## 📋 Complete Example Submission

Here's a complete example of submitting the CodeT5-base model:

```yaml
Model Information:
  Model Name: "microsoft/CodeT5-base"
  Programming Language: "Python"
  Comment Language: "English"
  Taxonomy Category: "Bug Detection"

Performance Scores:
  BLEU Score: 0.742
  Pass@1: 0.685
  Pass@5: 0.834
  Pass@10: 0.901

Quality Metrics:
  Readability: 8
  Relevance: 7
  Explanation Clarity: 8
  Problem Identification: 7
  Actionability: 6
  Completeness: 7
  Specificity: 6
  Contextual Adequacy: 7
  Consistency: 6
  Brevity: 5
```

## 🔒 Security & Rate Limiting

### **IP-based Rate Limiting**

- **5 submissions per IP address per 24 hours**
- Submissions are tracked by your IP address
- The rate limit resets every 24 hours

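As an illustration only (a guess at one reasonable design, not the leaderboard's actual code), such a limit could be a sliding 24-hour window keyed by IP address:

```python
# Illustrative sliding-window rate limiter keyed by IP address.
import time
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60
MAX_SUBMISSIONS = 5

_submissions: dict[str, list[float]] = defaultdict(list)

def check_rate_limit(ip: str) -> bool:
    """Return True and record the attempt if this IP may still submit."""
    now = time.time()
    # Keep only timestamps inside the 24-hour window.
    recent = [t for t in _submissions[ip] if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_SUBMISSIONS:
        _submissions[ip] = recent
        return False
    recent.append(now)
    _submissions[ip] = recent
    return True
```
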
### **Validation Rules**

- Model name must follow the `organization/model` format
- All performance scores must be between 0.0 and 1.0
- All quality metrics must be between 0 and 10
- Pass@1 ≤ Pass@5 ≤ Pass@10 (logical consistency)

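A minimal sketch of these checks (function and field names are assumed, not the leaderboard's actual code):

```python
# Sketch of the validation rules above; names are illustrative.
import re

MODEL_NAME_RE = re.compile(r"^[\w.-]+/[\w.-]+$")  # organization/model

def validate_submission(name: str, scores: dict[str, float],
                        metrics: dict[str, int]) -> list[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    if not MODEL_NAME_RE.match(name):
        errors.append("Model name must follow organization/model format")
    for key, value in scores.items():
        if not 0.0 <= value <= 1.0:
            errors.append(f"Score {key} out of range: {value} (must be between 0 and 1)")
    for key, value in metrics.items():
        if not 0 <= value <= 10:
            errors.append(f"Metric {key} out of range: {value} (must be between 0 and 10)")
    if not (scores["Pass@1"] <= scores["Pass@5"] <= scores["Pass@10"]):
        errors.append("Pass@1 score cannot be higher than Pass@5 (and so on for Pass@10)")
    return errors
```
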
## ✅ After Submission

### **Immediate Feedback**

You'll see one of these messages:

#### **Success ✅**

```
✅ Submission recorded successfully!
```

#### **Error Examples ❌**

```
❌ Rate limit exceeded: 5/5 submissions in 24 hours
❌ Model name contains invalid characters
❌ Pass@1 score cannot be higher than Pass@5
❌ Score BLEU out of range: 1.2 (must be between 0 and 1)
```

### **View Your Results**

- Your model will appear in the **🏆 Leaderboard** tab
- Use the filters to find your specific submission
- Check the **📈 Analytics** tab for your submission history

## 🎯 Tips for Better Submissions

### **Model Naming**

```
✅ Good: "microsoft/CodeT5-base"
✅ Good: "facebook/bart-large"
✅ Good: "my-org/custom-model-v2"
❌ Bad: "my model"
❌ Bad: "model@v1.0"
```

### **Performance Scores**

- Be honest and accurate with your evaluations
- Use a proper evaluation methodology
- Ensure Pass@k scores are logically consistent
- Document your evaluation process

### **Quality Metrics**

- Rate based on actual model performance
- Consider multiple test cases
- Be objective in your assessment
- Document your rating criteria

## 🤝 Need Help?

If you encounter issues:

1. Check the error message for specific guidance
2. Verify all fields are filled correctly
3. Ensure you haven't exceeded rate limits
4. Contact the maintainers if problems persist

---

**Ready to submit your model? Head to the 📝 Submit Model tab and follow this guide!** 🚀
src/display/css_html_js.py CHANGED
```diff
@@ -12,8 +12,8 @@ DARK_THEME_CSS = """
     --text-primary: #e6edf3;
     --text-secondary: #7d8590;
     --border-color: #30363d;
-    --accent-color: #238636;
-    --accent-hover: #2ea043;
+    --accent-color: #ffffff;
+    --accent-hover: #f0f0f0;
     --danger-color: #da3633;
     --warning-color: #d29922;
     --info-color: #1f6feb;
@@ -101,14 +101,14 @@ DARK_THEME_CSS = """
 
 .gradio-container input:focus, .gradio-container select:focus, .gradio-container textarea:focus {
     border-color: var(--accent-color) !important;
-    box-shadow: 0 0 0 2px rgba(35, 134, 54, 0.2) !important;
+    box-shadow: 0 0 0 2px rgba(255, 255, 255, 0.2) !important;
 }
 
 /* Buttons */
 .gradio-container button {
     background: var(--accent-color) !important;
-    color: white !important;
-    border: none !important;
+    color: var(--bg-primary) !important;
+    border: 1px solid var(--border-color) !important;
     border-radius: 6px !important;
     padding: 8px 16px !important;
     font-weight: 500 !important;
@@ -118,6 +118,7 @@ DARK_THEME_CSS = """
 .gradio-container button:hover {
     background: var(--accent-hover) !important;
     transform: translateY(-1px) !important;
+    color: var(--bg-primary) !important;
 }
 
 .gradio-container button:active {
@@ -158,7 +159,7 @@ DARK_THEME_CSS = """
 
 .gradio-container .slider input[type="range"]::-webkit-slider-thumb {
     background: var(--accent-color) !important;
-    border: 2px solid var(--bg-secondary) !important;
+    border: 2px solid var(--bg-primary) !important;
     border-radius: 50% !important;
     width: 18px !important;
     height: 18px !important;
@@ -193,8 +194,8 @@ DARK_THEME_CSS = """
 
 /* Status messages */
 .gradio-container .success {
-    background: rgba(35, 134, 54, 0.1) !important;
-    color: var(--accent-color) !important;
+    background: rgba(255, 255, 255, 0.1) !important;
+    color: var(--text-primary) !important;
     border: 1px solid var(--accent-color) !important;
     border-radius: 6px !important;
     padding: 12px 16px !important;
```
src/display/formatting.py CHANGED
```diff
@@ -53,13 +53,13 @@ def format_metric_score(score: int, metric_name: str) -> str:
 
     # Color coding based on score
     if score >= 8:
-        color = "#28a745"  # Green
+        color = "#ffffff"  # White
     elif score >= 6:
-        color = "#ffc107"  # Yellow
+        color = "#d0d0d0"  # Light gray
     elif score >= 4:
-        color = "#fd7e14"  # Orange
+        color = "#a0a0a0"  # Gray
     else:
-        color = "#dc3545"  # Red
+        color = "#707070"  # Dark gray
 
     return f"<span style='color: {color}; font-weight: 600;'>{score}</span>"
 
@@ -101,9 +101,9 @@ def format_taxonomy_badge(category: str) -> str:
         "Code Style": "#6f42c1",
         "Performance": "#fd7e14",
         "Security": "#e83e8c",
-        "Maintainability": "#20c997",
+        "Maintainability": "#ffffff",
         "Documentation": "#17a2b8",
-        "Testing": "#28a745",
+        "Testing": "#ffffff",
         "Architecture": "#6c757d",
         "Best Practices": "#007bff",
         "Refactoring": "#ffc107"
```