# πŸ“ Model Submission Example
This guide shows you exactly how to submit your code review model to the leaderboard.
## πŸš€ Step-by-Step Submission Process
### 1. **Access the Submission Form**
- Open the CodeReview Leaderboard in your browser
- Navigate to the **πŸ“ Submit Model** tab
- Click on the "πŸ“ Submit New Model Results" accordion to expand the form
### 2. **Fill in Basic Information**
#### **Model Name** ✨
```
Example: microsoft/CodeT5-base
Format: organization/model-name
```
#### **Programming Language** πŸ”
```
Select: Python
(or Java, JavaScript, C++, Go, Rust, etc.)
```
#### **Comment Language** 🌍
```
Select: English
(or Chinese, Spanish, French, German, etc.)
```
#### **Taxonomy Category** 🏷️
```
Select: Bug Detection
(or Security, Performance, Code Style, etc.)
```
### 3. **Performance Scores** (0.0 - 1.0)
#### **BLEU Score**
```
Example: 0.742
Range: 0.0 to 1.0
Description: Measures similarity between generated and reference reviews
```
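BLEU is the standard n-gram overlap metric. As a rough illustration of how such a score is computed, here is a minimal from-scratch sketch (sentence-level, uniform weights up to 4-grams, no smoothing); actual evaluations should use an established implementation such as sacrebleu:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing."""
    ref, hyp = reference.split(), hypothesis.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        if not hyp_counts:
            return 0.0
        # Clipped n-gram matches: a hypothesis n-gram counts at most
        # as many times as it appears in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if matches == 0:
            return 0.0  # any zero precision zeroes out unsmoothed BLEU
        log_precisions.append(math.log(matches / sum(hyp_counts.values())))
    # Brevity penalty for hypotheses shorter than the reference
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)

print(round(bleu("fix the null check here", "fix the null check here"), 3))  # 1.0
```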
#### **Pass@1**
```
Example: 0.685
Range: 0.0 to 1.0
Description: Success rate when model gets 1 attempt
```
#### **Pass@5**
```
Example: 0.834
Range: 0.0 to 1.0
Description: Success rate when model gets 5 attempts
```
#### **Pass@10**
```
Example: 0.901
Range: 0.0 to 1.0
Description: Success rate when model gets 10 attempts
```
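The form only asks for the final numbers, but if you compute Pass@k yourself, the standard unbiased estimator (popularized by the HumanEval benchmark) over `n` generated samples with `c` correct is `pass@k = 1 - C(n-c, k) / C(n, k)`:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples drawn
    without replacement from n generations (c of them correct) is correct."""
    if n - c < k:  # fewer incorrect samples than k draws: success guaranteed
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 200 generations with 137 correct:
print(round(pass_at_k(200, 137, 1), 3))  # 0.685
```

Note that the estimator is monotonically non-decreasing in `k`, which is why the form enforces Pass@1 ≀ Pass@5 ≀ Pass@10.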
### 4. **Quality Metrics** (0 - 10)
Rate your model across these 10 dimensions:
#### **Readability: 8**
```
How clear and readable are the generated code reviews?
Scale: 0 (unreadable) to 10 (very clear)
```
#### **Relevance: 7**
```
How relevant are the reviews to the actual code changes?
Scale: 0 (irrelevant) to 10 (highly relevant)
```
#### **Explanation Clarity: 8**
```
How well does the model explain identified issues?
Scale: 0 (unclear) to 10 (very clear explanations)
```
#### **Problem Identification: 7**
```
How effectively does it identify real code problems?
Scale: 0 (misses issues) to 10 (finds all problems)
```
#### **Actionability: 6**
```
How actionable and useful are the suggestions?
Scale: 0 (not actionable) to 10 (very actionable)
```
#### **Completeness: 7**
```
How thorough and complete are the reviews?
Scale: 0 (incomplete) to 10 (comprehensive)
```
#### **Specificity: 6**
```
How specific are the comments and suggestions?
Scale: 0 (too generic) to 10 (very specific)
```
#### **Contextual Adequacy: 7**
```
How well does it understand the code context?
Scale: 0 (ignores context) to 10 (perfect context understanding)
```
#### **Consistency: 6**
```
How consistent is the model across different code reviews?
Scale: 0 (inconsistent) to 10 (very consistent)
```
#### **Brevity: 5**
```
How concise are the reviews without losing important information?
Scale: 0 (too verbose/too brief) to 10 (perfect length)
```
### 5. **Submit Your Model**
- Click the **πŸš€ Submit Model** button
- Wait for validation and processing
- Check for success/error message
## πŸ“‹ Complete Example Submission
Here's a complete example submission for the microsoft/CodeT5-base model:
```yaml
Model Information:
  Model Name: "microsoft/CodeT5-base"
  Programming Language: "Python"
  Comment Language: "English"
  Taxonomy Category: "Bug Detection"

Performance Scores:
  BLEU Score: 0.742
  Pass@1: 0.685
  Pass@5: 0.834
  Pass@10: 0.901

Quality Metrics:
  Readability: 8
  Relevance: 7
  Explanation Clarity: 8
  Problem Identification: 7
  Actionability: 6
  Completeness: 7
  Specificity: 6
  Contextual Adequacy: 7
  Consistency: 6
  Brevity: 5
```
## πŸ”’ Security & Rate Limiting
### **IP-based Rate Limiting**
- **5 submissions per IP address per 24 hours**
- Submissions are tracked by your IP address
- Rate limit resets every 24 hours
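The server's actual implementation isn't shown here, but a per-IP sliding-window limit like the one described above can be sketched as:

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limiter: at most `limit` events per `window` seconds per key.
    Illustrative sketch only -- the leaderboard's real limiter may differ."""
    def __init__(self, limit=5, window=24 * 3600):
        self.limit = limit
        self.window = window
        self.events = defaultdict(deque)  # ip -> timestamps of recent submissions

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        q = self.events[ip]
        while q and now - q[0] >= self.window:  # drop entries older than the window
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

limiter = RateLimiter()
results = [limiter.allow("203.0.113.7", now=float(t)) for t in range(6)]
print(results)  # [True, True, True, True, True, False]
```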
### **Validation Rules**
- Model name must follow `organization/model` format
- All performance scores must be between 0.0 and 1.0
- All quality metrics must be between 0 and 10
- Pass@1 ≀ Pass@5 ≀ Pass@10 (logical consistency)
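You can mirror the score-related rules locally before submitting. A sketch (illustrative only; the server's real checks may differ):

```python
def validate_submission(sub):
    """Return a list of error strings; an empty list means the scores look valid.
    Mirrors the documented rules -- not the server's actual code."""
    errors = []
    for name in ("BLEU Score", "Pass@1", "Pass@5", "Pass@10"):
        score = sub.get(name)
        if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
            errors.append(f"{name} out of range: {score} (must be between 0 and 1)")
    for name, value in sub.get("Quality Metrics", {}).items():
        if not 0 <= value <= 10:
            errors.append(f"{name} out of range: {value} (must be between 0 and 10)")
    p1, p5, p10 = sub.get("Pass@1", 0), sub.get("Pass@5", 0), sub.get("Pass@10", 0)
    if not p1 <= p5 <= p10:
        errors.append("Pass@1 <= Pass@5 <= Pass@10 must hold")
    return errors

example = {"BLEU Score": 0.742, "Pass@1": 0.685, "Pass@5": 0.834,
           "Pass@10": 0.901, "Quality Metrics": {"Readability": 8, "Brevity": 5}}
print(validate_submission(example))  # []
```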
## βœ… After Submission
### **Immediate Feedback**
You'll see one of these messages:
#### **Success βœ…**
```
βœ… Submission recorded successfully!
```
#### **Error Examples ❌**
```
❌ Rate limit exceeded: 5/5 submissions in 24 hours
❌ Model name contains invalid characters
❌ Pass@1 score cannot be higher than Pass@5
❌ Score BLEU out of range: 1.2 (must be between 0 and 1)
```
### **View Your Results**
- Your model will appear in the **πŸ† Leaderboard** tab
- Use filters to find your specific submission
- Check the **πŸ“ˆ Analytics** tab for submission history
## 🎯 Tips for Better Submissions
### **Model Naming**
```
βœ… Good: "microsoft/CodeT5-base"
βœ… Good: "facebook/bart-large"
βœ… Good: "my-org/custom-model-v2"
❌ Bad: "my model"
❌ Bad: "model@v1.0"
```
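One way to pre-check the `organization/model` naming rule locally (a hypothetical pattern; the server's actual validation may be stricter):

```python
import re

# Hypothetical pattern: one "org/name" pair made of letters, digits,
# dots, underscores, and hyphens -- not the server's real regex.
MODEL_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9._-]+")

def is_valid_model_name(name):
    return MODEL_NAME_RE.fullmatch(name) is not None

for name in ("microsoft/CodeT5-base", "my-org/custom-model-v2",
             "my model", "model@v1.0"):
    print(name, "->", is_valid_model_name(name))
# microsoft/CodeT5-base -> True
# my-org/custom-model-v2 -> True
# my model -> False
# model@v1.0 -> False
```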
### **Performance Scores**
- Be honest and accurate with your evaluations
- Use proper evaluation methodology
- Ensure Pass@k scores are logically consistent
- Document your evaluation process
### **Quality Metrics**
- Rate based on actual model performance
- Consider multiple test cases
- Be objective in your assessment
- Document your rating criteria
## 🀝 Need Help?
If you encounter issues:
1. Check the error message for specific guidance
2. Verify all fields are filled correctly
3. Ensure you haven't exceeded rate limits
4. Contact maintainers if problems persist
---
**Ready to submit your model? Head to the πŸ“ Submit Model tab and follow this guide!** πŸš€