A collection of evaluation benchmarks for the Italian language.
Simone Conia
s-conia
AI & ML interests
Natural Language Processing, Multilinguality, Knowledge Graphs, Semantics
Recent Activity
authored a paper about 1 month ago
Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering
liked a dataset 6 months ago
sapienzanlp/ea-mt-benchmark