# Model Submission Example

This guide shows you exactly how to submit your code review model to the leaderboard.

## Step-by-Step Submission Process

### 1. **Access the Submission Form**

- Open the CodeReview Leaderboard in your browser
- Navigate to the **Submit Model** tab
- Click on the "Submit New Model Results" accordion to expand the form
### 2. **Fill in Basic Information**

#### **Model Name**

```
Example: microsoft/CodeT5-base
Format: organization/model-name
```

#### **Programming Language**

```
Select: Python
(or Java, JavaScript, C++, Go, Rust, etc.)
```

#### **Comment Language**

```
Select: English
(or Chinese, Spanish, French, German, etc.)
```

#### **Taxonomy Category**

```
Select: Bug Detection
(or Security, Performance, Code Style, etc.)
```
### 3. **Performance Scores** (0.0 - 1.0)

#### **BLEU Score**

```
Example: 0.742
Range: 0.0 to 1.0
Description: Measures n-gram similarity between generated and reference reviews
```
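
If you still need to produce a BLEU number for your outputs, any standard implementation works. Below is a minimal sketch assuming the `sacrebleu` package (an assumption, not a leaderboard requirement); note that sacrebleu reports scores on a 0-100 scale, so divide by 100 to match the form's 0.0-1.0 range.

```python
# Minimal sketch, assuming sacrebleu (pip install sacrebleu).
# Any standard BLEU implementation works; report on a 0.0-1.0 scale.
import sacrebleu

generated = ["Consider checking for None before dereferencing `user`."]
references = [["Add a None check before accessing attributes of `user`."]]

bleu = sacrebleu.corpus_bleu(generated, references)
print(round(bleu.score / 100, 3))  # sacrebleu scores are 0-100
```
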
#### **Pass@1**

```
Example: 0.685
Range: 0.0 to 1.0
Description: Success rate when the model gets 1 attempt
```

#### **Pass@5**

```
Example: 0.834
Range: 0.0 to 1.0
Description: Success rate when the model gets 5 attempts
```

#### **Pass@10**

```
Example: 0.901
Range: 0.0 to 1.0
Description: Success rate when the model gets 10 attempts
```
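
If your evaluation follows the common HumanEval-style protocol (sample n completions per task and count the c that pass), you can compute all three numbers with the standard unbiased pass@k estimator; the sketch below assumes that protocol.

```python
# Unbiased pass@k estimator (Chen et al., 2021), assuming n samples
# per task of which c were judged correct. Average the result over tasks.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k sampled attempts succeeds."""
    if n - c < k:
        return 1.0  # every size-k draw contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 20 samples per task, 10 of them correct
print(round(pass_at_k(20, 10, 1), 3))   # 0.5
print(round(pass_at_k(20, 10, 5), 3))   # 0.984
print(round(pass_at_k(20, 10, 10), 3))  # 1.0
```

This estimator is monotone in k, which is why the form can require Pass@1 ≤ Pass@5 ≤ Pass@10.
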
### 4. **Quality Metrics** (0 - 10)

Rate your model across these 10 dimensions:

#### **Readability: 8**

```
How clear and readable are the generated code reviews?
Scale: 0 (unreadable) to 10 (very clear)
```

#### **Relevance: 7**

```
How relevant are the reviews to the actual code changes?
Scale: 0 (irrelevant) to 10 (highly relevant)
```

#### **Explanation Clarity: 8**

```
How well does the model explain identified issues?
Scale: 0 (unclear) to 10 (very clear explanations)
```

#### **Problem Identification: 7**

```
How effectively does it identify real code problems?
Scale: 0 (misses issues) to 10 (finds all problems)
```

#### **Actionability: 6**

```
How actionable and useful are the suggestions?
Scale: 0 (not actionable) to 10 (very actionable)
```

#### **Completeness: 7**

```
How thorough and complete are the reviews?
Scale: 0 (incomplete) to 10 (comprehensive)
```

#### **Specificity: 6**

```
How specific are the comments and suggestions?
Scale: 0 (too generic) to 10 (very specific)
```

#### **Contextual Adequacy: 7**

```
How well does it understand the code context?
Scale: 0 (ignores context) to 10 (perfect context understanding)
```

#### **Consistency: 6**

```
How consistent is the model across different code reviews?
Scale: 0 (inconsistent) to 10 (very consistent)
```
#### **Brevity: 5**

```
How concise are the reviews without losing important information?
Scale: 0 (far too verbose or too terse) to 10 (ideal length)
```
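
These ratings usually come from scoring several test cases and averaging (see the tips below). A plain mean is one reasonable aggregation; the helper below is a hypothetical sketch, not part of the leaderboard.

```python
# Hypothetical helper: average per-sample ratings into one 0-10
# integer per dimension. The aggregation choice (mean) is an assumption.
from statistics import mean

DIMENSIONS = [
    "readability", "relevance", "explanation_clarity",
    "problem_identification", "actionability", "completeness",
    "specificity", "contextual_adequacy", "consistency", "brevity",
]

def aggregate(ratings: list[dict[str, int]]) -> dict[str, int]:
    """Average each dimension over all rated test cases."""
    return {d: round(mean(r[d] for r in ratings)) for d in DIMENSIONS}
```
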
### 5. **Submit Your Model**

- Click the **Submit Model** button
- Wait for validation and processing
- Check for the success or error message
## Complete Example Submission

Here's a complete example of submitting the CodeT5-base model:

```yaml
Model Information:
  Model Name: "microsoft/CodeT5-base"
  Programming Language: "Python"
  Comment Language: "English"
  Taxonomy Category: "Bug Detection"

Performance Scores:
  BLEU Score: 0.742
  Pass@1: 0.685
  Pass@5: 0.834
  Pass@10: 0.901

Quality Metrics:
  Readability: 8
  Relevance: 7
  Explanation Clarity: 8
  Problem Identification: 7
  Actionability: 6
  Completeness: 7
  Specificity: 6
  Contextual Adequacy: 7
  Consistency: 6
  Brevity: 5
```
## Security & Rate Limiting

### **IP-based Rate Limiting**

- **5 submissions per IP address per 24 hours**
- Submissions are tracked by your IP address
- Rate limit resets every 24 hours
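
For intuition, a limit like this is often enforced with a per-IP sliding-window counter. The sketch below is illustrative only; the names are hypothetical, and the leaderboard's actual mechanism (e.g. a fixed daily window) may differ.

```python
# Illustrative sliding-window rate limiter; hypothetical, not the
# leaderboard's actual implementation.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60
MAX_SUBMISSIONS = 5
_history: dict[str, deque] = defaultdict(deque)

def allow_submission(ip: str) -> bool:
    """Return True if this IP is still under its 24-hour quota."""
    now = time.time()
    q = _history[ip]
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()  # drop timestamps older than the window
    if len(q) >= MAX_SUBMISSIONS:
        return False
    q.append(now)
    return True
```
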
### **Validation Rules**

- Model name must follow the `organization/model` format
- All performance scores must be between 0.0 and 1.0
- All quality metrics must be between 0 and 10
- Pass@1 ≤ Pass@5 ≤ Pass@10 (logical consistency)
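
For reference, these rules roughly correspond to the following sketch (field names are hypothetical; the leaderboard's server-side validation may differ in detail):

```python
# Rough sketch of the validation rules above; names are hypothetical.
import re

MODEL_NAME_RE = re.compile(r"^[\w.-]+/[\w.-]+$")  # organization/model

def validate(sub: dict) -> list[str]:
    """Return a list of human-readable validation errors (empty if OK)."""
    errors = []
    if not MODEL_NAME_RE.match(sub["model_name"]):
        errors.append("Model name must follow organization/model format")
    for key in ("bleu", "pass@1", "pass@5", "pass@10"):
        if not 0.0 <= sub[key] <= 1.0:
            errors.append(f"Score {key} out of range: {sub[key]}")
    if not (sub["pass@1"] <= sub["pass@5"] <= sub["pass@10"]):
        errors.append("Pass@k scores must be non-decreasing in k")
    for name, value in sub["quality"].items():
        if not 0 <= value <= 10:
            errors.append(f"Quality metric {name} out of range: {value}")
    return errors
```
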
## After Submission

### **Immediate Feedback**

You'll see one of these messages:

#### **Success ✅**

```
✅ Submission recorded successfully!
```

#### **Error Examples ❌**

```
❌ Rate limit exceeded: 5/5 submissions in 24 hours
❌ Model name contains invalid characters
❌ Pass@1 score cannot be higher than Pass@5
❌ Score BLEU out of range: 1.2 (must be between 0 and 1)
```
### **View Your Results**

- Your model will appear in the **Leaderboard** tab
- Use filters to find your specific submission
- Check the **Analytics** tab for submission history
## Tips for Better Submissions

### **Model Naming**

```
✅ Good: "microsoft/CodeT5-base"
✅ Good: "facebook/bart-large"
✅ Good: "my-org/custom-model-v2"
❌ Bad: "my model"
❌ Bad: "model@v1.0"
```
### **Performance Scores**

- Be honest and accurate with your evaluations
- Use proper evaluation methodology
- Ensure Pass@k scores are logically consistent
- Document your evaluation process

### **Quality Metrics**

- Rate based on actual model performance
- Consider multiple test cases
- Be objective in your assessment
- Document your rating criteria
## Need Help?

If you encounter issues:

1. Check the error message for specific guidance
2. Verify all fields are filled correctly
3. Ensure you haven't exceeded rate limits
4. Contact the maintainers if problems persist

---

**Ready to submit your model? Head to the Submit Model tab and follow this guide!**