model,elo,wins,losses,ties,games_played,confidence_interval
Llama-3.2-1b-Instruct,1516.0,1,0,0,1,784.0
Qwen2.5-1.5b-Instruct,1500.0,0,0,0,0,inf
Qwen2.5-3b-Instruct,1500.0,0,0,0,1,784.0
Llama-3.2-3b-Instruct,1500.0,0,0,0,1,784.0
Gemma-3-1b-it,1500.0,0,0,0,0,inf
Gemma-2-2b-it,1500.0,0,0,0,1,784.0
IBM Granite-3.3-2b-instruct,1500.0,0,0,0,0,inf
Phi-4-mini-instruct,1484.0,0,1,0,2,554.4