
Models

24 models evaluated across 3 document AI benchmarks.

| Rank | Model                | Organization | Overall |
|------|----------------------|--------------|---------|
| 1    | Nanonets OCR-3       | Nanonets     | 84.4    |
| 2    | Gemini 3.1 Pro       | Google       | 83.2    |
| 3    | Nanonets OCR2+       | Nanonets     | 81.8    |
| 4    | Gemini-3-Pro         | Google       | 81.4    |
| 5    | GPT-5.4              | OpenAI       | 81.0    |
| 6    | Claude Sonnet 4.6    | Anthropic    | 80.8    |
| 7    | Claude Opus 4.6      | Anthropic    | 80.3    |
| 8    | Gemini-3-Flash       | Google       | 79.9    |
| 9    | GPT-5.2              | OpenAI       | 79.2    |
| 10   | Qwen3.5-9B           | Alibaba      | 77.0    |
| 11   | Qwen3.5-4B           | Alibaba      | 73.1    |
| 12   | Mistral Small 4      | Mistral AI   | 71.5    |
| 13   | GPT-5-Mini           | OpenAI       | 70.8    |
| 14   | GPT-4.1              | OpenAI       | 70.0    |
| 15   | Claude Haiku 4.5     | Anthropic    | 69.6    |
| 16   | Ministral-8B         | Mistral AI   | 69.3    |
| 17   | GLM-OCR              | Zhipu AI     | 63.6    |
| 18   | Qwen3.5-2B           | Alibaba      | 63.2    |
| 19   | Qwen3.5-0.8B         | Alibaba      | 58.0    |
| 20   | GPT-5-Nano           | OpenAI       | 50.7    |
| 21   | Llama-3.2-Vision-11B | Meta         | 50.1    |
| 22   | Pixtral-12B          | Mistral AI   | 46.0    |
| –    | Gemma-3-12B-IT       | Google       | 0.0     |
| –    | Datalab Marker       | Datalab      | 0.0     |

Open benchmark for document AI models.

v1.5