Arena Leaderboard

Elo rankings from human & LLM judge votes
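
For readers unfamiliar with Elo, here is a minimal sketch of how one pairwise vote could move two ratings. The page does not state its exact formula, so the logistic curve with a 400-point scale, the K-factor of 32, and the function names `expected_score` and `update_elo` are all assumptions chosen for illustration, not the leaderboard's actual implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the conventional Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return updated (rating_a, rating_b) after one battle.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    k = 32 is an assumed K-factor; the leaderboard's value is not stated.
    """
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    # Whatever A gains, B loses, so the sum of ratings is conserved.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated models: the winner gains k/2 points.
    print(update_elo(1000.0, 1000.0, 1.0))
    # A tie between equally rated models changes nothing.
    print(update_elo(1000.0, 1000.0, 0.5))
```

Under these assumptions an upset (a lower-rated model beating a higher-rated one) moves both ratings more than an expected result does, which is why rankings based on a dozen or two battles can still shift noticeably.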

| Rank | Model | Company | Elo | Win % | W / L / T | Battles |
|---|---|---|---|---|---|---|
| 1 | Nanonets OCR3 | nanonets | 1135 | 78.6% | 15W / 3L / 3T | 21 |
| 2 | Nanonets OCR2+ | nanonets | 1083 | 71.7% | 15W / 5L / 3T | 23 |
| 3 | GPT-5 Mini | openai | 1067 | 64.3% | 7W / 3L / 4T | 14 |
| 4 | GPT-5.2 | openai | 1063 | 63.0% | 11W / 5L / 7T | 23 |
| 5 | GPT-5.4 · Medium Reasoning | openai | 1029 | 56.3% | 6W / 4L / 6T | 16 |
| 6 | Gemini 2.5 Pro | google | 1006 | 50.0% | 5W / 5L / 6T | 16 |
| 7 | Claude Sonnet 4.6 | anthropic | 1001 | 50.0% | 7W / 7L / 5T | 19 |
| 8 | Gemini 2.5 Flash · Thinking | google | 999 | 46.2% | 4W / 5L / 4T | 13 |
| 9 | Claude Sonnet 4.6 · Thinking | anthropic | 995 | 50.0% | 9W / 9L / 8T | 26 |
| 10 | Gemini 3.1 Pro | google | 979 | 41.7% | 3W / 5L / 4T | 12 |
| 11 | GPT-5.4 · Low Reasoning | openai | 978 | 44.4% | 5W / 7L / 6T | 18 |
| 12 | GPT-5.4 | openai | 972 | 39.3% | 1W / 4L / 9T | 14 |
| 13 | Claude Opus 4.6 · Low Thinking | anthropic | 966 | 40.5% | 2W / 6L / 13T | 21 |
| 14 | Gemini 2.5 Flash | google | 956 | 39.3% | 3W / 6L / 5T | 14 |
| 15 | Claude Opus 4.6 | anthropic | 946 | 40.9% | 4W / 8L / 10T | 22 |
| 16 | GPT-4.1 | openai | 936 | 30.0% | 2W / 8L / 5T | 15 |
| 17 | Gemini 3 Flash | google | 889 | 20.0% | 1W / 10L / 4T | 15 |
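
The Win % figures above appear consistent with counting each tie as half a win, i.e. (W + 0.5 × T) / Battles. The page does not state this rule, so treat it as an inference from the numbers; the helper name `win_rate` is hypothetical. A short sketch that reproduces three of the rows:

```python
def win_rate(wins: int, losses: int, ties: int) -> float:
    """Win percentage with each tie counted as half a win (inferred rule)."""
    battles = wins + losses + ties
    if battles == 0:
        raise ValueError("no battles recorded")
    return 100.0 * (wins + 0.5 * ties) / battles


if __name__ == "__main__":
    # (W, L, T) records copied from the leaderboard rows.
    records = {
        "Nanonets OCR3": (15, 3, 3),
        "GPT-5.4": (1, 4, 9),
        "Gemini 3 Flash": (1, 10, 4),
    }
    for model, (w, l, t) in records.items():
        print(f"{model}: {win_rate(w, l, t):.1f}%")
    # prints:
    # Nanonets OCR3: 78.6%
    # GPT-5.4: 39.3%
    # Gemini 3 Flash: 20.0%
```

This also explains why a model such as GPT-5.4, with only one outright win in 14 battles, still shows 39.3%: nine of its battles ended in ties.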