Gemini 2.5 Pro

Google

Rank #2 of 8 models

86.8% (+1.9 vs avg)

Coverage: 78.0% (-0.9 vs avg)
Validity: 95.7% (+4.7 vs avg)
Local Score: 86.5% (+2.0 vs avg)
Cross-File: 87.5% (+1.8 vs avg)

Score Distribution

Performance by Language

Category Comparison

Local Logic: 86.5%
Cross-File: 87.5%

Judge Analysis (Sonnet vs GPT)

Latency (p50 / p90 / p99)

p50: 8ms
p90: 57.7s
p99: 73.3s

GLM-5: 6ms
Gemini 2.5 Pro: 8ms
Kimi K2.5: 8ms
Claude Haiku 4.5: 8ms
Gemini 3 Flash: 8ms
Claude Sonnet 4.5: 10ms
Gemini 3.1 Pro: 21ms
GPT-5.2: 19.3s

Pass Rate: 33.3%
Parse Rate: 33.3%
Tests: 75
Errors: 50

Sample Traces (10 of 25)