CodeReviewBench / SUBMISSION_EXAMPLE.md
πŸ“ Model Submission Example

This guide shows you exactly how to submit your code review model to the leaderboard.

πŸš€ Step-by-Step Submission Process

1. Access the Submission Form

  • Open the CodeReview Leaderboard in your browser
  • Navigate to the πŸ“ Submit Model tab
  • Click on the "πŸ“ Submit New Model Results" accordion to expand the form

2. Fill in Basic Information

Model Name ✨

Example: microsoft/CodeT5-base
Format: organization/model-name

Programming Language πŸ”

Select: Python
(or Java, JavaScript, C++, Go, Rust, etc.)

Comment Language 🌍

Select: English  
(or Chinese, Spanish, French, German, etc.)

Taxonomy Category 🏷️

Select: Bug Detection
(or Security, Performance, Code Style, etc.)

3. Performance Scores (0.0 - 1.0)

BLEU Score

Example: 0.742
Range: 0.0 to 1.0
Description: Measures similarity between generated and reference reviews
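BLEU works by counting n-gram overlap between the generated review and the reference. Real evaluations typically use a library such as sacrebleu; the minimal, unsmoothed sentence-level sketch below is only meant to make the 0.0–1.0 range concrete:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Toy sentence-level BLEU in [0, 1]: geometric mean of 1..max_n-gram
    precisions times a brevity penalty. No smoothing, single reference."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_grams, ref_grams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_grams & ref_grams).values())   # clipped matches
        total = max(sum(hyp_grams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any empty precision zeroes the geometric mean
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # brevity penalty punishes hypotheses shorter than the reference
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_avg)
```

An identical hypothesis and reference score 1.0; a hypothesis with no word overlap scores 0.0.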

Pass@1

Example: 0.685
Range: 0.0 to 1.0  
Description: Success rate when model gets 1 attempt

Pass@5

Example: 0.834
Range: 0.0 to 1.0
Description: Success rate when model gets 5 attempts  

Pass@10

Example: 0.901
Range: 0.0 to 1.0
Description: Success rate when model gets 10 attempts
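Pass@k is commonly computed with the unbiased estimator from the code-generation literature: sample n review attempts per task, count the c correct ones, and estimate the chance that at least one of k random samples is correct. Whether the leaderboard expects this exact estimator is an assumption; a sketch:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n-c, k) / C(n, k),
    computed as a numerically stable running product."""
    if n - c < k:
        # every size-k draw must contain at least one correct sample
        return 1.0
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))
```

With n = 20 samples and c = 13 correct, pass@1 reduces to c/n = 0.65, and the estimates grow monotonically with k, which is exactly the Pass@1 ≤ Pass@5 ≤ Pass@10 consistency rule the form enforces.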

4. Quality Metrics (0 - 10)

Rate your model across these 10 dimensions:

Readability: 8

How clear and readable are the generated code reviews?
Scale: 0 (unreadable) to 10 (very clear)

Relevance: 7

How relevant are the reviews to the actual code changes?
Scale: 0 (irrelevant) to 10 (highly relevant)

Explanation Clarity: 8

How well does the model explain identified issues?
Scale: 0 (unclear) to 10 (very clear explanations)

Problem Identification: 7

How effectively does it identify real code problems?
Scale: 0 (misses issues) to 10 (finds all problems)

Actionability: 6

How actionable and useful are the suggestions?
Scale: 0 (not actionable) to 10 (very actionable)

Completeness: 7

How thorough and complete are the reviews?
Scale: 0 (incomplete) to 10 (comprehensive)

Specificity: 6

How specific are the comments and suggestions?
Scale: 0 (too generic) to 10 (very specific)

Contextual Adequacy: 7

How well does it understand the code context?
Scale: 0 (ignores context) to 10 (perfect context understanding)

Consistency: 6

How consistent is the model across different code reviews?
Scale: 0 (inconsistent) to 10 (very consistent)

Brevity: 5

How concise are the reviews without losing important information?
Scale: 0 (too verbose/too brief) to 10 (perfect length)
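The ten example ratings above can be collected and sanity-checked in a few lines. The simple mean below is only an illustration; how (or whether) the leaderboard aggregates the ten dimensions is not specified in this guide:

```python
# the ten example ratings from this guide, each on the 0-10 scale
quality = {
    "readability": 8, "relevance": 7, "explanation_clarity": 8,
    "problem_identification": 7, "actionability": 6, "completeness": 7,
    "specificity": 6, "contextual_adequacy": 7, "consistency": 6,
    "brevity": 5,
}

# every dimension must stay within the form's 0-10 range
assert all(0 <= v <= 10 for v in quality.values())

# hypothetical aggregate: plain mean of the ten ratings
overall = sum(quality.values()) / len(quality)
```

For the example values this mean comes out to 6.7.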

5. Submit Your Model

  • Click the πŸš€ Submit Model button
  • Wait for validation and processing
  • Check for success/error message

πŸ“‹ Complete Example Submission

Here's a complete example submission for the microsoft/CodeT5-base model:

Model Information:
  Model Name: "microsoft/CodeT5-base"
  Programming Language: "Python"
  Comment Language: "English"
  Taxonomy Category: "Bug Detection"

Performance Scores:
  BLEU Score: 0.742
  Pass@1: 0.685
  Pass@5: 0.834
  Pass@10: 0.901

Quality Metrics:
  Readability: 8
  Relevance: 7  
  Explanation Clarity: 8
  Problem Identification: 7
  Actionability: 6
  Completeness: 7
  Specificity: 6
  Contextual Adequacy: 7
  Consistency: 6
  Brevity: 5
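The form fields above map naturally onto a single submission record. Here is a sketch of that record as JSON; the field names are hypothetical, and the actual app may store submissions differently:

```python
import json

# hypothetical payload mirroring the form fields in this guide
submission = {
    "model_name": "microsoft/CodeT5-base",
    "programming_language": "Python",
    "comment_language": "English",
    "taxonomy_category": "Bug Detection",
    "bleu": 0.742,
    "pass_at_1": 0.685,
    "pass_at_5": 0.834,
    "pass_at_10": 0.901,
    "quality": {
        "readability": 8, "relevance": 7, "explanation_clarity": 8,
        "problem_identification": 7, "actionability": 6, "completeness": 7,
        "specificity": 6, "contextual_adequacy": 7, "consistency": 6,
        "brevity": 5,
    },
}

payload = json.dumps(submission, indent=2)  # serialized form of the record
```

Note that the example already satisfies the consistency rule: 0.685 ≤ 0.834 ≤ 0.901.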

πŸ”’ Security & Rate Limiting

IP-based Rate Limiting

  • 5 submissions per IP address per 24 hours
  • Submissions are tracked by your IP address
  • Rate limit resets every 24 hours

Validation Rules

  • Model name must follow the organization/model-name format
  • All performance scores must be between 0.0 and 1.0
  • All quality metrics must be between 0 and 10
  • Pass@1 ≀ Pass@5 ≀ Pass@10 (logical consistency)
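Together, those rules amount to a single check over the submission. A sketch of such a validator (the regex and error strings are assumptions based on the rules and error messages in this guide, not the app's exact code):

```python
import re

# assumed pattern for organization/model-name identifiers
MODEL_NAME_RE = re.compile(r"^[A-Za-z0-9][\w.-]*/[\w.-]+$")

def validate(name, scores, quality, pass_at):
    """Return a list of validation errors; an empty list means valid.

    scores:  dict of 0.0-1.0 performance scores (BLEU plus pass@k)
    quality: dict of 0-10 quality ratings
    pass_at: (pass@1, pass@5, pass@10), checked for monotonicity
    """
    errors = []
    if not MODEL_NAME_RE.match(name):
        errors.append("Model name must follow organization/model-name format")
    for label, s in scores.items():
        if not 0.0 <= s <= 1.0:
            errors.append(f"Score {label} out of range: {s} (must be between 0 and 1)")
    for label, q in quality.items():
        if not 0 <= q <= 10:
            errors.append(f"Quality metric {label} out of range: {q}")
    p1, p5, p10 = pass_at
    if not p1 <= p5 <= p10:
        errors.append("Pass@1 <= Pass@5 <= Pass@10 must hold")
    return errors
```

Under this check, "my model" fails on the space and "model@v1.0" fails on both the missing slash and the `@`, matching the naming tips later in this guide.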

βœ… After Submission

Immediate Feedback

You'll see one of these messages:

Success βœ…

βœ… Submission recorded successfully!

Error Examples ❌

❌ Rate limit exceeded: 5/5 submissions in 24 hours
❌ Model name contains invalid characters
❌ Pass@1 score cannot be higher than Pass@5
❌ Score BLEU out of range: 1.2 (must be between 0 and 1)

View Your Results

  • Your model will appear in the πŸ† Leaderboard tab
  • Use filters to find your specific submission
  • Check the πŸ“ˆ Analytics tab for submission history

🎯 Tips for Better Submissions

Model Naming

βœ… Good: "microsoft/CodeT5-base"
βœ… Good: "facebook/bart-large"  
βœ… Good: "my-org/custom-model-v2"
❌ Bad: "my model"
❌ Bad: "model@v1.0"

Performance Scores

  • Be honest and accurate with your evaluations
  • Use proper evaluation methodology
  • Ensure Pass@k scores are logically consistent
  • Document your evaluation process

Quality Metrics

  • Rate based on actual model performance
  • Consider multiple test cases
  • Be objective in your assessment
  • Document your rating criteria

🀝 Need Help?

If you encounter issues:

  1. Check the error message for specific guidance
  2. Verify all fields are filled correctly
  3. Ensure you haven't exceeded rate limits
  4. Contact maintainers if problems persist

Ready to submit your model? Head to the πŸ“ Submit Model tab and follow this guide! πŸš€