Arena Leaderboard
Elo rankings from human & LLM judge votes
| Rank | Model | Elo | Win% | W / L / T | Battles |
|---|---|---|---|---|---|
| 1 | Nanonets OCR3 | 1135 | 78.6% | 15W / 3L / 3T | 21 |
| 2 | Nanonets OCR2+ | 1083 | 71.7% | 15W / 5L / 3T | 23 |
| 3 | GPT-5 Mini | 1067 | 64.3% | 7W / 3L / 4T | 14 |
| 4 | GPT-5.2 | 1063 | 63.0% | 11W / 5L / 7T | 23 |
| 5 | GPT-5.4 · Medium Reasoning | 1029 | 56.3% | 6W / 4L / 6T | 16 |
| 6 | Gemini 2.5 Pro | 1006 | 50.0% | 5W / 5L / 6T | 16 |
| 7 | Claude Sonnet 4.6 | 1001 | 50.0% | 7W / 7L / 5T | 19 |
| 8 | Gemini 2.5 Flash · Thinking | 999 | 46.2% | 4W / 5L / 4T | 13 |
| 9 | Claude Sonnet 4.6 · Thinking | 995 | 50.0% | 9W / 9L / 8T | 26 |
| 10 | Gemini 3.1 Pro | 979 | 41.7% | 3W / 5L / 4T | 12 |
| 11 | GPT-5.4 · Low Reasoning | 978 | 44.4% | 5W / 7L / 6T | 18 |
| 12 | GPT-5.4 | 972 | 39.3% | 1W / 4L / 9T | 14 |
| 13 | Claude Opus 4.6 · Low Thinking | 966 | 40.5% | 2W / 6L / 13T | 21 |
| 14 | Gemini 2.5 Flash | 956 | 39.3% | 3W / 6L / 5T | 14 |
| 15 | Claude Opus 4.6 | 946 | 40.9% | 4W / 8L / 10T | 22 |
| 16 | GPT-4.1 | 936 | 30.0% | 2W / 8L / 5T | 15 |
| 17 | Gemini 3 Flash | 889 | 20.0% | 1W / 10L / 4T | 15 |
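
The Win% figures above appear to count each tie as half a win, i.e. Win% = (W + 0.5 × T) / Battles. The page does not state this; it is inferred from the W / L / T counts. The sketch below recomputes Win% under that assumption, with half-up rounding to one decimal place (also an assumption, chosen because it reproduces the 56.3% shown for 6W / 4L / 6T). It also includes the textbook Elo expected-score formula as a way to read rating gaps. The function names `win_pct` and `elo_expected_score` are illustrative, and the leaderboard's actual Elo procedure (K-factor, update order, judge weighting) is not given here.

```python
from decimal import Decimal, ROUND_HALF_UP


def win_pct(wins: int, losses: int, ties: int) -> Decimal:
    """Win percentage with each tie counted as half a win (assumed convention)."""
    battles = wins + losses + ties
    if battles == 0:
        raise ValueError("no battles recorded")
    # (W + 0.5 * T) / battles, kept exact by doubling numerator and denominator.
    pct = Decimal(2 * wins + ties) / Decimal(2 * battles) * 100
    # Half-up rounding to one decimal place (assumed to match the table).
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of A against B on the 400-point scale."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


if __name__ == "__main__":
    print(win_pct(15, 3, 3))   # W / L / T from the rank 1 row
    print(win_pct(6, 4, 6))    # W / L / T from the rank 5 row
    print(elo_expected_score(1135, 889))  # rank 1 vs rank 17 ratings
```

With at most 26 battles per model, these percentages and the Elo gaps between neighbouring ranks rest on small samples, so close rankings should be read with caution.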