AI Model Compare Engine
Benchmark and compare leading AI models across reasoning, coding, vision, speed, and cost.
Benchmark Comparison
| Model | Provider | Overall | Reasoning | Coding | Vision | Math | Speed | Efficiency | Context | Input Cost | Output Cost | Latency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-4o | OpenAI | 88 | 94 (best) | 93 | 96 (best) | 91 | 78 | 72 | 128K | $2.50/1M | $10.00/1M | 142ms |
| Claude 3.5 Sonnet | Anthropic | 87 | 93 | 95 (best) | 91 | 90 | 82 | 78 | 200K | $3.00/1M | $15.00/1M | 187ms |
| DeepSeek V3 | DeepSeek | 92 (best) | 89 | 88 | 78 | 92 | 85 | 88 | 128K | $0.27/1M | $1.10/1M | 156ms |
| Gemini 2.0 Flash | Google | 88 | 90 | 92 | 94 | 94 (best) | 95 (best) | 92 | 1M | $0.35/1M | $0.70/1M | 89ms |
| Llama 3.1 70B | Meta | 86 | 86 | 84 | 75 | 85 | 90 | 95 (best) | 128K | Free | Free | 112ms |
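The input and output prices in the table are quoted per 1M tokens, so the cost of a given workload follows from a simple proportion. The sketch below illustrates that arithmetic using the listed prices. The names `PRICING`, `estimate_cost`, and `cheapest_first` are hypothetical helpers introduced here, not part of the compare engine, and the "Free" entry for Llama 3.1 70B is modeled as zero (an assumption that ignores any hosting cost).

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# Assumption: "Free" is treated as 0.0; self-hosting costs are not modeled.
PRICING = {
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "DeepSeek V3": (0.27, 1.10),
    "Gemini 2.0 Flash": (0.35, 0.70),
    "Llama 3.1 70B": (0.0, 0.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-1M-token prices."""
    input_price, output_price = PRICING[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_first(input_tokens: int, output_tokens: int) -> list[str]:
    """Return model names ordered from cheapest to most expensive."""
    return sorted(
        PRICING,
        key=lambda model: estimate_cost(model, input_tokens, output_tokens),
    )


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    for name in cheapest_first(2_000_000, 500_000):
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Because output tokens are priced higher than input tokens for every paid model in the table, the ranking can shift with the input/output mix, so it may be worth running the estimate with a token split close to the real workload.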
GPT-4o
OpenAI · ~1.8T · May 2024
Tags: Multimodal, Enterprise, Vision
Overall 88 · Reasoning 94 · Coding 93 · Vision 96 · Math 91 · Speed 78 · Efficiency 72
Claude 3.5 Sonnet
Anthropic · ~175B · Jun 2024
Tags: Coding, Analysis, Long-context
Overall 87 · Reasoning 93 · Coding 95 · Vision 91 · Math 90 · Speed 82 · Efficiency 78
DeepSeek V3
DeepSeek · ~671B · Dec 2024
Tags: Math, Cost-efficient, Coding
Overall 92 · Reasoning 89 · Coding 88 · Vision 78 · Math 92 · Speed 85 · Efficiency 88
Gemini 2.0 Flash
Google · ~100B · Dec 2024
Tags: Speed, Long-context, Cost-efficient
Overall 88 · Reasoning 90 · Coding 92 · Vision 94 · Math 94 · Speed 95 · Efficiency 92
Llama 3.1 70B
Meta · 70B · Jul 2024
Tags: Open-source, Local, Fine-tuning
Overall 86 · Reasoning 86 · Coding 84 · Vision 75 · Math 85 · Speed 90 · Efficiency 95
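
The "best" flag in each benchmark column appears to mark the highest score among the compared models. A minimal sketch of that selection, assuming the flag is a plain per-column maximum, is shown below. `SCORES` and `best_per_metric` are hypothetical names introduced for illustration, and the numbers are copied from the model listings above.

```python
# Scores copied from the model listings above (higher is better).
SCORES = {
    "GPT-4o": {"Overall": 88, "Reasoning": 94, "Coding": 93, "Vision": 96,
               "Math": 91, "Speed": 78, "Efficiency": 72},
    "Claude 3.5 Sonnet": {"Overall": 87, "Reasoning": 93, "Coding": 95, "Vision": 91,
                          "Math": 90, "Speed": 82, "Efficiency": 78},
    "DeepSeek V3": {"Overall": 92, "Reasoning": 89, "Coding": 88, "Vision": 78,
                    "Math": 92, "Speed": 85, "Efficiency": 88},
    "Gemini 2.0 Flash": {"Overall": 88, "Reasoning": 90, "Coding": 92, "Vision": 94,
                         "Math": 94, "Speed": 95, "Efficiency": 92},
    "Llama 3.1 70B": {"Overall": 86, "Reasoning": 86, "Coding": 84, "Vision": 75,
                      "Math": 85, "Speed": 90, "Efficiency": 95},
}


def best_per_metric(scores: dict[str, dict[str, int]]) -> dict[str, str]:
    """Map each metric to the model with the highest score for it.

    Assumption: ties go to the model listed first, since max() keeps
    the first maximal item it encounters.
    """
    metrics = next(iter(scores.values())).keys()
    return {
        metric: max(scores, key=lambda model: scores[model][metric])
        for metric in metrics
    }


if __name__ == "__main__":
    for metric, model in best_per_metric(SCORES).items():
        print(f"{metric}: {model} ({SCORES[model][metric]})")
```

With the listed scores, no single model leads every column, which is why the comparison is more useful read per metric than by the Overall number alone.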