"""
Text content for the CodeReview Bench Leaderboard.
"""
TITLE = """
<div style="text-align: center; margin-bottom: 1rem">
<h1>CodeReview Bench Leaderboard</h1>
</div>
"""
INTRODUCTION_TEXT = """
## Introduction
CodeReview Bench is a comprehensive benchmark for evaluating the quality and effectiveness of automated code review systems.
This leaderboard tracks model performance across programming languages and review criteria,
including readability, relevance, explanation clarity, and actionability.
Models are scored on their ability to produce code reviews that are helpful, accurate,
and actionable.
"""
LLM_BENCHMARKS_TEXT = """
CodeReview Bench is a comprehensive benchmark for evaluating automated code review systems across programming languages and review quality dimensions.
It evaluates models on their ability to provide high-quality code reviews, using both LLM-based multimetric evaluation (readability, relevance, explanation clarity, problem identification, actionability, completeness, specificity, contextual adequacy, consistency, brevity) and the exact-match metrics (pass@1, pass@5, pass@10) presented in our paper.
The benchmark supports review comments in both Russian and English across four programming languages: Python, Java, Go, and Scala.
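As a reference point, the exact-match scores can be reproduced with the standard unbiased pass@k estimator. The sketch below assumes the common definition of pass@k; the helper name is illustrative, not part of the framework's API:
```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # Probability that at least one of k samples drawn from n
    # generated reviews (c of them exact matches) is correct.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

pass_at_k(n=10, c=3, k=5)  # ~0.917
```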
"""
EVALUATION_QUEUE_TEXT = """
## Submit Your Model
To add your model to the CodeReview Bench leaderboard:
1. Run your evaluation using the CodeReview Bench framework.
2. Upload your results in `.jsonl` format using this form.
3. Once validated, your model will appear on the leaderboard.
### Requirements:
- Results must include all required metrics: LLM-based multimetric scores and exact-match (pass@k) metrics (an example record is shown below)
- Submissions should cover multiple programming languages where applicable
- Review comments may be in either Russian or English
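A minimal sketch of writing one result record, assuming illustrative field names (the exact schema is defined by the CodeReview Bench framework, not by this example):
```python
import json

# Hypothetical record; the field names below are illustrative assumptions.
record = {
    "model": "my-model",
    "language": "python",
    "comment_language": "en",
    "readability": 4.2,
    "relevance": 4.5,
    "actionability": 3.9,
    "pass@1": 0.31,
    "pass@5": 0.52,
    "pass@10": 0.61,
}
with open("results.jsonl", "a") as f:
    print(json.dumps(record), file=f)  # one JSON object per line
```
Each line of the uploaded file should be one self-contained JSON object like the record above.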
### ✉️✨ Ready? Upload your results below!
"""
CITATION_BUTTON_LABEL = "Cite CodeReview Bench"
CITATION_BUTTON_TEXT = """
@misc{codereviewbench2025,
  author       = {CodeReview Bench Team},
  title        = {CodeReview Bench: Comprehensive Benchmark for Automated Code Review Systems},
  year         = {2025},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\\url{https://github.com/your-org/codereview-bench}}
}
"""