Multi-dimensional rankings based on model speed tests, provider health checks, and standard model benchmarks. Compare providers, endpoints, models, and reliability at a glance.
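This table compares AI models across Artificial Analysis aggregate, reasoning, coding, and math benchmarks. Rows run from the lowest to the highest Intelligence Index, each score is followed by the model's rank on that metric (`#1` is best, and tied scores share a rank), and `-` marks a score that is not listed.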
Artificial Analysis standard benchmarks
| Rank | Model | Intelligence Index | Coding Index | Math Index | Reasoning | GPQA | HLE | Coding | SciCode | AIME | MATH-500 | Updated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | DeepSeek V2.5 (`deepseek-v2-5`) | 12.5 (#100) | - | - | - | - | - | - | - | - | 76.3% (#34) | May 25, 2026, 04:00 PM |
| 2 | GPT-4o Mini (`gpt-4o-mini`) | 12.6 (#99) | - | 14.7 (#54) | 64.8% (#63) | 42.6% (#94) | 4.0% (#84) | 23.4% (#63) | 22.9% (#92) | 11.7% (#35) | 78.9% (#32) | May 25, 2026, 04:00 PM |
| 3 | Gemini 2.5 Flash Lite (`gemini-2-5-flash-lite`) | 12.7 (#97) | 7.4 (#87) | 35.3 (#44) | 72.4% (#54) | 47.4% (#92) | 3.7% (#88) | 40.0% (#48) | 17.7% (#94) | 50.0% (#16) | 92.6% (#19) | May 25, 2026, 04:00 PM |
| 4 | GLM-4.5V (`glm-4-5v`) | 12.7 (#97) | 10.8 (#85) | 15.3 (#53) | 75.1% (#48) | 57.3% (#84) | 3.6% (#91) | 35.2% (#50) | 18.8% (#93) | - | - | May 25, 2026, 04:00 PM |
| 5 | GPT-4 (`gpt-4`) | 12.8 (#96) | 13.1 (#80) | - | - | - | - | - | - | - | - | May 25, 2026, 04:00 PM |
| 6 | GPT-4.1 Nano (`gpt-4-1-nano`) | 13.0 (#95) | 11.2 (#82) | 24.0 (#49) | 65.7% (#62) | 51.2% (#90) | 3.9% (#86) | 32.6% (#53) | 25.9% (#85) | 23.7% (#28) | 84.8% (#28) | May 25, 2026, 04:00 PM |
| 7 | Llama 4 Scout (`llama-4-scout`) | 13.5 (#94) | 6.7 (#88) | 14.0 (#55) | 75.2% (#45) | 58.7% (#81) | 4.3% (#79) | 29.9% (#56) | 17.0% (#95) | 28.3% (#25) | 84.4% (#29) | May 25, 2026, 04:00 PM |
| 8 | GPT-4 Turbo (`gpt-4-turbo`) | 13.7 (#93) | 21.5 (#65) | - | 69.4% (#59) | - | 3.3% (#96) | 29.1% (#57) | 31.9% (#75) | 15.0% (#31) | 73.7% (#36) | May 25, 2026, 04:00 PM |
| 9 | Gemini 1.5 Flash (`gemini-1-5-flash`) | 13.8 (#92) | - | - | 68.0% (#61) | 46.3% (#93) | 3.5% (#95) | 27.3% (#58) | 26.7% (#83) | 18.0% (#30) | 82.7% (#30) | May 25, 2026, 04:00 PM |
| 10 | Grok-2 (`grok-2`) | 13.9 (#91) | - | - | 70.9% (#58) | 51.0% (#91) | 3.8% (#87) | 26.7% (#60) | 28.5% (#81) | 13.3% (#34) | 77.8% (#33) | May 25, 2026, 04:00 PM |
| 11 | Ring Flash 2.0 (`ring-flash-2-0`) | 14.0 (#90) | 10.6 (#86) | 83.7 (#14) | 79.3% (#37) | 72.5% (#59) | 8.9% (#50) | 62.8% (#27) | 16.8% (#96) | - | - | May 25, 2026, 04:00 PM |
| 12 | Gemini 2.0 Flash Lite (`gemini-2-0-flash-lite`) | 14.5 (#89) | - | - | - | 54.2% (#88) | 4.4% (#77) | 17.9% (#67) | 24.7% (#89) | 30.3% (#24) | 87.3% (#25) | May 25, 2026, 04:00 PM |
| 13 | Gemini 2.0 Flash Lite 001 (`gemini-2-0-flash-lite-001`) | 14.7 (#88) | - | - | 72.4% (#54) | 53.5% (#89) | 3.6% (#91) | 18.5% (#66) | 25.0% (#88) | 27.7% (#26) | 87.3% (#26) | May 25, 2026, 04:00 PM |
| 14 | Qwen3 (`qwen3`) | 15.0 (#87) | 14.2 (#77) | 66.3 (#23) | 77.7% (#43) | 65.9% (#69) | 6.8% (#61) | 51.5% (#36) | 30.4% (#78) | 72.7% (#9) | 97.5% (#7) | May 25, 2026, 04:00 PM |
| 15 | Devstral Small (`devstral-small`) | 15.2 (#86) | 12.1 (#81) | 29.3 (#46) | 62.2% (#64) | 41.4% (#95) | 3.7% (#88) | 25.4% (#61) | 24.3% (#90) | 0.3% (#37) | 63.5% (#37) | May 25, 2026, 04:00 PM |
| 16 | Gemini 1.5 Pro (`gemini-1-5-pro`) | 16.0 (#85) | 23.6 (#61) | - | 75.0% (#49) | 58.9% (#79) | 4.9% (#73) | 31.6% (#54) | 29.5% (#79) | 23.0% (#29) | 87.6% (#24) | May 25, 2026, 04:00 PM |
| 17 | DeepSeek V3 (`deepseek-v3`) | 16.5 (#84) | 16.4 (#74) | 26.0 (#48) | 75.2% (#45) | 55.7% (#86) | 3.6% (#91) | 35.9% (#49) | 35.4% (#62) | 25.3% (#27) | 88.7% (#23) | May 25, 2026, 04:00 PM |
| 18 | GLM-4.6V (`glm-4-6v`) | 17.1 (#83) | 11.1 (#84) | 26.3 (#47) | 75.2% (#45) | 56.6% (#85) | 3.7% (#88) | 41.1% (#45) | 27.2% (#82) | - | - | May 25, 2026, 04:00 PM |
| 19 | DeepSeek R1 Distill Qwen (`deepseek-r1-distill-qwen`) | 17.2 (#82) | - | 63.0 (#24) | 73.9% (#53) | 61.5% (#75) | 5.5% (#66) | 27.0% (#59) | 37.6% (#52) | 68.7% (#12) | 94.1% (#14) | May 25, 2026, 04:00 PM |
| 20 | GPT-4o (`gpt-4o`) | 17.3 (#81) | 16.7 (#73) | 6.0 (#56) | 74.8% (#50) | 54.3% (#87) | 3.3% (#96) | 30.9% (#55) | 33.3% (#69) | 15.0% (#31) | 75.9% (#35) | May 25, 2026, 04:00 PM |
| 21 | Gemini 2.0 Pro (`gemini-2-0-pro`) | 18.1 (#80) | 25.5 (#57) | - | 80.5% (#34) | 62.2% (#74) | 6.8% (#61) | 34.7% (#51) | 31.3% (#76) | 36.0% (#21) | 92.3% (#21) | May 25, 2026, 04:00 PM |
| 22 | Gemini 2.0 Flash (`gemini-2-0-flash`) | 18.5 (#79) | 13.6 (#79) | 21.7 (#51) | 77.9% (#42) | 62.3% (#73) | 5.3% (#67) | 33.4% (#52) | 33.3% (#69) | 33.0% (#22) | 93.0% (#18) | May 25, 2026, 04:00 PM |
| 23 | GPT-4.5 (`gpt-4-5`) | 20.0 (#78) | - | - | - | - | - | - | - | - | - | May 25, 2026, 04:00 PM |
| 24 | o1 Mini (`o1-mini`) | 20.4 (#77) | - | - | 74.2% (#52) | 60.3% (#77) | 4.9% (#73) | 57.6% (#31) | 32.3% (#73) | 60.3% (#14) | 94.4% (#13) | May 25, 2026, 04:00 PM |
| 25 | Gemini 2.5 Flash (`gemini-2-5-flash`) | 20.6 (#76) | 17.8 (#72) | 60.3 (#28) | 80.9% (#31) | 68.3% (#63) | 5.1% (#70) | 49.5% (#38) | 29.1% (#80) | 50.0% (#16) | 93.2% (#17) | May 25, 2026, 04:00 PM |
| 26 | GPT-OSS (`gpt-oss`) | 20.8 (#75) | 14.4 (#76) | 62.3 (#26) | 71.8% (#56) | 61.1% (#76) | 5.1% (#70) | 65.2% (#26) | 34.0% (#66) | - | - | May 25, 2026, 04:00 PM |
| 27 | Mistral Medium 3.1 (`mistral-medium-3-1`) | 21.3 (#74) | 18.3 (#70) | 38.3 (#38) | 68.3% (#60) | 58.8% (#80) | 4.4% (#77) | 40.6% (#46) | 33.8% (#67) | - | - | May 25, 2026, 04:00 PM |
| 28 | Devstral 2 (`devstral-2`) | 22.0 (#73) | 23.7 (#60) | 36.7 (#42) | 76.2% (#44) | 59.4% (#78) | 3.6% (#91) | 44.8% (#43) | 33.1% (#71) | - | - | May 25, 2026, 04:00 PM |
| 29 | Mistral Large 3 (`mistral-large-3`) | 22.8 (#72) | 22.7 (#62) | 38.0 (#39) | 80.7% (#32) | 68.0% (#65) | 4.1% (#81) | 46.5% (#40) | 36.2% (#59) | - | - | May 25, 2026, 04:00 PM |
| 30 | GPT-4.1 Mini (`gpt-4-1-mini`) | 22.9 (#71) | 18.5 (#69) | 46.3 (#35) | 78.1% (#40) | 66.4% (#68) | 4.6% (#75) | 48.3% (#39) | 40.4% (#43) | 43.0% (#19) | 92.5% (#20) | May 25, 2026, 04:00 PM |
| 31 | GLM-4.5 Air (`glm-4-5-air`) | 23.2 (#70) | 23.8 (#59) | 80.7 (#17) | 81.5% (#30) | 73.3% (#57) | 6.8% (#61) | 68.4% (#24) | 30.6% (#77) | 67.3% (#13) | 96.5% (#12) | May 25, 2026, 04:00 PM |
| 32 | MiniMax M1 (`minimax-m1`) | 24.4 (#69) | 14.5 (#75) | 61.0 (#27) | 81.6% (#29) | 69.7% (#61) | 8.2% (#53) | 71.1% (#23) | 37.4% (#53) | 84.7% (#7) | 98.0% (#5) | May 25, 2026, 04:00 PM |
| 33 | Grok-3 (`grok-3`) | 25.2 (#68) | 19.8 (#68) | 58.0 (#30) | 79.9% (#36) | 69.3% (#62) | 5.1% (#70) | 42.5% (#44) | 36.8% (#55) | 33.0% (#22) | 87.0% (#27) | May 25, 2026, 04:00 PM |
| 34 | O1 Pro (`o1-pro`) | 25.8 (#67) | - | - | - | - | - | - | - | - | - | May 25, 2026, 04:00 PM |
| 35 | O3 Mini (`o3-mini`) | 25.9 (#65) | 17.9 (#71) | - | 79.1% (#38) | 74.8% (#53) | 8.7% (#51) | 71.7% (#22) | 39.9% (#46) | 77.0% (#8) | 97.3% (#8) | May 25, 2026, 04:00 PM |
| 36 | Qwen3.5 Omni Flash (`qwen3-5-omni-flash`) | 25.9 (#65) | 14.0 (#78) | - | - | 74.2% (#55) | 7.1% (#57) | - | 25.5% (#87) | - | - | May 25, 2026, 04:00 PM |
| 37 | GPT-4.1 (`gpt-4-1`) | 26.3 (#63) | 21.8 (#64) | 34.7 (#45) | 80.6% (#33) | 66.6% (#67) | 4.6% (#75) | 45.7% (#41) | 38.1% (#51) | 43.7% (#18) | 91.3% (#22) | May 25, 2026, 04:00 PM |
| 38 | Kimi K2 (`kimi-k2`) | 26.3 (#63) | 22.1 (#63) | 57.0 (#31) | 82.4% (#26) | 76.6% (#49) | 7.0% (#60) | 55.6% (#33) | 34.5% (#64) | 69.3% (#11) | 97.1% (#9) | May 25, 2026, 04:00 PM |
| 39 | GLM-4.5 (`glm-4-5`) | 26.4 (#62) | 26.3 (#53) | 73.7 (#21) | 83.5% (#23) | 78.2% (#47) | 12.2% (#47) | 73.8% (#20) | 34.8% (#63) | 87.3% (#6) | 97.9% (#6) | May 25, 2026, 04:00 PM |
| 40 | GPT-5 Nano (`gpt-5-nano`) | 26.8 (#61) | 20.3 (#67) | 83.7 (#14) | 78.0% (#41) | 67.6% (#66) | 8.2% (#53) | 78.9% (#17) | 36.6% (#58) | - | - | May 25, 2026, 04:00 PM |
| 41 | DeepSeek R1 (`deepseek-r1`) | 27.1 (#60) | 24.0 (#58) | 76.0 (#20) | 84.9% (#15) | 81.3% (#40) | 14.9% (#41) | 77.0% (#18) | 40.3% (#44) | 89.3% (#4) | 98.3% (#4) | May 25, 2026, 04:00 PM |
| 42 | DeepSeek V3.1 (`deepseek-v3-1`) | 28.1 (#59) | 28.4 (#50) | 49.7 (#34) | 83.3% (#24) | 73.5% (#56) | 6.3% (#64) | 57.7% (#30) | 36.7% (#56) | - | - | May 25, 2026, 04:00 PM |
| 43 | DeepSeek V3.1 Terminus (`deepseek-v3-1-terminus`) | 28.5 (#58) | 31.9 (#44) | 53.7 (#33) | 83.6% (#22) | 75.1% (#51) | 8.4% (#52) | 52.9% (#35) | 32.1% (#74) | - | - | May 25, 2026, 04:00 PM |
| 44 | GLM-4.7 Flash (`glm-4-7-flash`) | 30.1 (#57) | 25.9 (#54) | - | - | 58.1% (#82) | 7.1% (#57) | - | 33.7% (#68) | - | - | May 25, 2026, 04:00 PM |
| 45 | GLM-4.6 (`glm-4-6`) | 30.2 (#56) | 30.2 (#46) | 44.3 (#36) | 78.4% (#39) | 63.2% (#72) | 5.2% (#69) | 56.1% (#32) | 33.1% (#71) | - | - | May 25, 2026, 04:00 PM |
| 46 | MiMo-V2-Flash (`mimo-v2-flash`) | 30.3 (#55) | 25.8 (#55) | 67.7 (#22) | 74.4% (#51) | 65.6% (#70) | 8.0% (#55) | 40.2% (#47) | 25.9% (#85) | - | - | May 25, 2026, 04:00 PM |
| 47 | O1 (`o1`) | 30.7 (#54) | 20.5 (#66) | - | 84.1% (#17) | 74.7% (#54) | 7.7% (#56) | 67.9% (#25) | 35.8% (#61) | 72.3% (#10) | 97.0% (#10) | May 25, 2026, 04:00 PM |
| 48 | Claude Haiku 4.5 (`claude-haiku-4-5`) | 31.0 (#53) | 29.6 (#48) | 39.0 (#37) | 80.0% (#35) | 64.6% (#71) | 4.3% (#79) | 51.1% (#37) | 34.4% (#65) | - | - | May 25, 2026, 04:00 PM |
| 49 | Qwen3 Max (`qwen3-max`) | 31.4 (#52) | 26.4 (#52) | 80.7 (#17) | 84.1% (#17) | 76.4% (#50) | 11.1% (#48) | 76.7% (#19) | 38.3% (#50) | - | - | May 25, 2026, 04:00 PM |
| 50 | DeepSeek V3.2 (`deepseek-v3-2`) | 32.1 (#51) | 34.6 (#39) | 59.0 (#29) | 83.7% (#19) | 75.1% (#51) | 10.5% (#49) | 59.3% (#28) | 38.7% (#48) | - | - | May 25, 2026, 04:00 PM |
| 51 | Claude Opus 4 (`claude-opus-4`) | 33.0 (#49) | - | 36.3 (#43) | 86.0% (#10) | 70.1% (#60) | 5.9% (#65) | 54.2% (#34) | 40.9% (#38) | 56.3% (#15) | 94.1% (#14) | May 25, 2026, 04:00 PM |
| 52 | Claude Sonnet 4 (`claude-sonnet-4`) | 33.0 (#49) | 30.6 (#45) | 38.0 (#39) | 83.7% (#19) | 68.3% (#63) | 4.0% (#84) | 44.9% (#42) | 37.3% (#54) | 40.7% (#20) | 93.4% (#16) | May 25, 2026, 04:00 PM |
| 53 | o4 Mini (`o4-mini`) | 33.1 (#48) | 25.6 (#56) | 90.7 (#10) | 83.2% (#25) | 78.4% (#46) | 17.5% (#37) | 85.9% (#5) | 46.5% (#17) | 94.0% (#2) | 98.9% (#3) | May 25, 2026, 04:00 PM |
| 54 | Gemini 3.1 Flash Lite (`gemini-3-1-flash-lite`) | 33.5 (#47) | 30.1 (#47) | - | - | 82.2% (#37) | 16.2% (#39) | - | 41.9% (#36) | - | - | May 25, 2026, 04:00 PM |
| 55 | Gemini 2.5 Pro (`gemini-2-5-pro`) | 34.6 (#46) | 32.0 (#43) | 87.7 (#13) | 86.2% (#9) | 84.4% (#27) | 21.1% (#31) | 80.1% (#15) | 42.8% (#29) | 88.7% (#5) | 96.7% (#11) | May 25, 2026, 04:00 PM |
| 56 | Gemini 3 Flash (`gemini-3-flash`) | 35.0 (#45) | 37.8 (#28) | 55.7 (#32) | 88.2% (#3) | 81.2% (#42) | 14.1% (#42) | 79.7% (#16) | 49.9% (#10) | - | - | May 25, 2026, 04:00 PM |
| 57 | Claude Opus 4.1 (`claude-opus-4-1`) | 36.0 (#44) | - | - | - | - | - | - | - | - | - | May 25, 2026, 04:00 PM |
| 58 | MiniMax M2 (`minimax-m2`) | 36.1 (#43) | 29.2 (#49) | 78.3 (#19) | 82.0% (#27) | 77.7% (#48) | 12.5% (#46) | 82.6% (#12) | 36.1% (#60) | - | - | May 25, 2026, 04:00 PM |
| 59 | Claude Sonnet 4.5 (`claude-sonnet-4-5`) | 37.1 (#42) | 33.5 (#41) | 37.0 (#41) | 86.0% (#10) | 72.7% (#58) | 7.1% (#57) | 59.0% (#29) | 42.8% (#29) | - | - | May 25, 2026, 04:00 PM |
| 60 | O3 (`o3`) | 38.4 (#41) | 38.4 (#27) | 88.3 (#12) | 85.3% (#14) | 82.7% (#34) | 20.0% (#32) | 80.8% (#14) | 41.0% (#37) | 90.3% (#3) | 99.2% (#2) | May 25, 2026, 04:00 PM |
| 61 | Step 3.5 Flash (`step-3-5-flash`) | 38.5 (#40) | 34.6 (#39) | - | - | 82.6% (#35) | 22.6% (#28) | - | 38.5% (#49) | - | - | May 25, 2026, 04:00 PM |
| 62 | GPT-5.1 Codex Mini (`gpt-5-1-codex-mini`) | 38.6 (#38) | 36.4 (#32) | 91.7 (#9) | 82.0% (#27) | 81.3% (#40) | 16.9% (#38) | 83.6% (#11) | 42.6% (#31) | - | - | May 25, 2026, 04:00 PM |
| 63 | Qwen3.5 Omni Plus (`qwen3-5-omni-plus`) | 38.6 (#38) | 27.6 (#51) | - | - | 82.6% (#35) | 13.9% (#43) | - | 40.5% (#42) | - | - | May 25, 2026, 04:00 PM |
| 64 | MiniMax M2.1 (`minimax-m2-1`) | 39.4 (#37) | 32.8 (#42) | 82.7 (#16) | 87.5% (#4) | 83.0% (#31) | 22.2% (#30) | 81.0% (#13) | 40.7% (#40) | - | - | May 25, 2026, 04:00 PM |
| 65 | o3 Pro (`o3-pro`) | 40.7 (#36) | - | - | - | 84.5% (#26) | - | - | - | - | - | May 25, 2026, 04:00 PM |
| 66 | Kimi K2 Thinking (`kimi-k2-thinking`) | 40.9 (#35) | 34.8 (#38) | 94.7 (#6) | 84.8% (#16) | 83.8% (#29) | 22.3% (#29) | 85.3% (#6) | 42.4% (#34) | - | - | May 25, 2026, 04:00 PM |
| 67 | GPT-5 Mini (`gpt-5-mini`) | 41.2 (#34) | 35.3 (#37) | 90.7 (#10) | 83.7% (#19) | 82.8% (#32) | 19.7% (#34) | 83.8% (#10) | 39.2% (#47) | - | - | May 25, 2026, 04:00 PM |
| 68 | MiniMax M2.5 (`minimax-m2-5`) | 41.9 (#33) | 37.4 (#29) | - | - | 84.8% (#24) | 19.1% (#35) | - | 42.6% (#31) | - | - | May 25, 2026, 04:00 PM |
| 69 | GLM-4.7 (`glm-4-7`) | 42.1 (#32) | 36.3 (#33) | 95.0 (#5) | 85.6% (#13) | 85.9% (#21) | 25.1% (#26) | 89.4% (#2) | 45.1% (#21) | - | - | May 25, 2026, 04:00 PM |
| 70 | GLM-5V Turbo (`glm-5v-turbo`) | 42.9 (#31) | 36.2 (#34) | - | - | 80.9% (#44) | 15.8% (#40) | - | 43.5% (#25) | - | - | May 25, 2026, 04:00 PM |
| 71 | Claude Opus 4.5 (`claude-opus-4-5`) | 43.1 (#29) | 42.9 (#17) | 62.7 (#25) | 88.9% (#2) | 81.0% (#43) | 12.9% (#45) | 73.8% (#20) | 47.0% (#13) | - | - | May 25, 2026, 04:00 PM |
| 72 | GPT-5.1 Codex (`gpt-5-1-codex`) | 43.1 (#29) | 36.6 (#31) | 95.7 (#3) | 86.0% (#10) | 86.0% (#20) | 23.4% (#27) | 84.9% (#7) | 40.2% (#45) | - | - | May 25, 2026, 04:00 PM |
| 73 | MiMo-V2-Omni (`mimo-v2-omni`) | 43.4 (#28) | 35.5 (#36) | - | - | 82.8% (#32) | 19.9% (#33) | - | 36.7% (#56) | - | - | May 25, 2026, 04:00 PM |
| 74 | GPT-5.4 Nano (`gpt-5-4-nano`) | 44.0 (#27) | 43.9 (#14) | - | - | 81.7% (#39) | 26.5% (#19) | - | 46.9% (#15) | - | - | May 25, 2026, 04:00 PM |
| 75 | Claude Sonnet 4.6 (`claude-sonnet-4-6`) | 44.4 (#26) | 46.4 (#10) | - | - | 79.9% (#45) | 13.2% (#44) | - | 46.9% (#15) | - | - | May 25, 2026, 04:00 PM |
| 76 | GPT-5 (`gpt-5`) | 44.6 (#24) | 36.0 (#35) | 94.3 (#7) | 87.1% (#6) | 85.4% (#22) | 26.5% (#19) | 84.6% (#8) | 42.9% (#28) | 95.7% (#1) | 99.4% (#1) | May 25, 2026, 04:00 PM |
| 77 | GPT-5 Codex (`gpt-5-codex`) | 44.6 (#24) | 38.9 (#25) | 98.7 (#2) | 86.5% (#8) | 83.7% (#30) | 25.6% (#23) | 84.0% (#9) | 40.9% (#38) | - | - | May 25, 2026, 04:00 PM |
| 78 | Qwen3.5 (`qwen3-5`) | 45.0 (#23) | 41.3 (#22) | - | - | 89.3% (#10) | 27.3% (#16) | - | 42.0% (#35) | - | - | May 25, 2026, 04:00 PM |
| 79 | Claude Opus 4.6 (`claude-opus-4-6`) | 46.5 (#21) | 47.6 (#7) | - | - | 84.0% (#28) | 18.6% (#36) | - | 45.7% (#19) | - | - | May 25, 2026, 04:00 PM |
| 80 | DeepSeek V4 Flash (`deepseek-v4-flash`) | 46.5 (#21) | 38.7 (#26) | - | - | 89.4% (#9) | 32.1% (#11) | - | 44.9% (#22) | - | - | May 25, 2026, 04:00 PM |
| 81 | GLM-5 Turbo (`glm-5-turbo`) | 46.8 (#19) | 36.8 (#30) | - | - | 84.7% (#25) | 25.4% (#24) | - | 43.6% (#24) | - | - | May 25, 2026, 04:00 PM |
| 82 | Kimi K2.5 (`kimi-k2-5`) | 46.8 (#19) | 39.6 (#24) | - | - | 87.9% (#13) | 29.4% (#12) | - | 49.0% (#12) | - | - | May 25, 2026, 04:00 PM |
| 83 | GPT-5.1 (`gpt-5-1`) | 47.7 (#18) | 44.7 (#12) | 94.0 (#8) | 87.0% (#7) | 87.3% (#16) | 26.5% (#19) | 86.8% (#4) | 43.3% (#26) | - | - | May 25, 2026, 04:00 PM |
| 84 | Gemini 3 Pro (`gemini-3-pro`) | 48.4 (#17) | 46.5 (#9) | 95.7 (#3) | 89.8% (#1) | 90.8% (#6) | 37.2% (#5) | 91.7% (#1) | 56.1% (#3) | - | - | May 25, 2026, 04:00 PM |
| 85 | GPT-5.4 Mini (`gpt-5-4-mini`) | 48.9 (#16) | 51.5 (#5) | - | - | 87.5% (#14) | 26.6% (#18) | - | 49.9% (#10) | - | - | May 25, 2026, 04:00 PM |
| 86 | GPT-5.2 Codex (`gpt-5-2-codex`) | 49.0 (#14) | 43.0 (#16) | - | - | 89.9% (#8) | 33.5% (#9) | - | 54.6% (#4) | - | - | May 25, 2026, 04:00 PM |
| 87 | MiMo-V2.5 (`mimo-v2-5`) | 49.0 (#14) | 42.1 (#19) | - | - | 84.9% (#23) | 25.2% (#25) | - | 43.1% (#27) | - | - | May 25, 2026, 04:00 PM |
| 88 | MiMo-V2-Pro (`mimo-v2-pro`) | 49.2 (#13) | 41.4 (#21) | - | - | 87.0% (#17) | 28.3% (#13) | - | 42.5% (#33) | - | - | May 25, 2026, 04:00 PM |
| 89 | Grok 4.20 (`grok-4-20`) | 49.3 (#12) | 40.5 (#23) | - | - | 91.1% (#5) | 32.2% (#10) | - | 45.6% (#20) | - | - | May 25, 2026, 04:00 PM |
| 90 | MiniMax M2.7 (`minimax-m2-7`) | 49.6 (#11) | 41.9 (#20) | - | - | 87.4% (#15) | 28.1% (#14) | - | 47.0% (#13) | - | - | May 25, 2026, 04:00 PM |
| 91 | GLM-5 (`glm-5`) | 49.8 (#10) | 44.2 (#13) | - | - | 82.0% (#38) | 27.2% (#17) | - | 46.2% (#18) | - | - | May 25, 2026, 04:00 PM |
| 92 | Qwen3.6 Plus (`qwen3-6-plus`) | 50.0 (#9) | 42.9 (#17) | - | - | 88.2% (#12) | 25.7% (#22) | - | 40.7% (#40) | - | - | May 25, 2026, 04:00 PM |
| 93 | GPT-5.2 (`gpt-5-2`) | 51.3 (#8) | 48.7 (#6) | 99.0 (#1) | 87.4% (#5) | 90.3% (#7) | 35.4% (#7) | 88.9% (#3) | 52.1% (#7) | - | - | May 25, 2026, 04:00 PM |
| 94 | GLM-5.1 (`glm-5-1`) | 51.4 (#7) | 43.4 (#15) | - | - | 86.8% (#18) | 28.0% (#15) | - | 43.8% (#23) | - | - | May 25, 2026, 04:00 PM |
| 95 | DeepSeek V4 Pro (`deepseek-v4-pro`) | 51.5 (#6) | 47.5 (#8) | - | - | 88.8% (#11) | 35.9% (#6) | - | 50.0% (#9) | - | - | May 25, 2026, 04:00 PM |
| 96 | GPT-5.3 Codex (`gpt-5-3-codex`) | 53.6 (#5) | 53.1 (#3) | - | - | 91.5% (#3) | 39.9% (#3) | - | 53.2% (#6) | - | - | May 25, 2026, 04:00 PM |
| 97 | MiMo-V2.5-Pro (`mimo-v2-5-pro`) | 53.8 (#4) | 45.5 (#11) | - | - | 86.6% (#19) | 33.8% (#8) | - | 50.2% (#8) | - | - | May 25, 2026, 04:00 PM |
| 98 | GPT-5.4 (`gpt-5-4`) | 56.8 (#3) | 57.2 (#1) | - | - | 92.0% (#2) | 41.6% (#2) | - | 56.6% (#2) | - | - | May 25, 2026, 04:00 PM |
| 99 | Gemini 3.1 Pro (`gemini-3-1-pro`) | 57.2 (#2) | 55.5 (#2) | - | - | 94.1% (#1) | 44.7% (#1) | - | 58.9% (#1) | - | - | May 25, 2026, 04:00 PM |
| 100 | Claude Opus 4.7 (`claude-opus-4-7`) | 57.3 (#1) | 52.5 (#4) | - | - | 91.4% (#4) | 39.6% (#4) | - | 54.5% (#5) | - | - | May 25, 2026, 04:00 PM |
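
Each score cell in the table pairs a value with a rank marker such as `#14`, and tied scores share a rank while the following rank is skipped. The Python sketch below shows one way to read such a cell and to reproduce that tie handling. The names `Cell`, `parse_cell`, and `competition_ranks` are hypothetical helpers written for this illustration, and the tie rule is inferred from the table values rather than from any published Artificial Analysis method.

```python
import re
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: float      # numeric score, e.g. 49.0
    is_percent: bool  # True when the score carried a "%" sign
    rank: int         # rank marker after "#"


# Assumption: a cell looks like "49.0#14" or "49.0 (#14)", optionally with "%".
_CELL = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(%?)\s*\(?#(\d+)\)?\s*$")


def parse_cell(text: str) -> Optional[Cell]:
    """Parse one score cell; return None when the cell is a lone dash."""
    if text.strip() == "-":
        return None
    match = _CELL.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell: {text!r}")
    return Cell(float(match.group(1)), match.group(2) == "%", int(match.group(3)))


def competition_ranks(scores: dict[str, float]) -> dict[str, int]:
    """Rank highest score first; ties share a rank and the next rank is skipped."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks: dict[str, int] = {}
    prev_score = None
    prev_rank = 0
    for position, (name, score) in enumerate(ordered, start=1):
        if score != prev_score:
            # A new score value takes its 1-based position as its rank.
            prev_rank = position
            prev_score = score
        ranks[name] = prev_rank
    return ranks


# Four Intelligence Index values copied from the table above.
sample = {
    "mimo-v2-pro": 49.2,
    "gpt-5-2-codex": 49.0,
    "mimo-v2-5": 49.0,
    "gpt-5-4-mini": 48.9,
}
print(competition_ranks(sample))
```

Within this four-model sample the ranks come out as 1, 2, 2, 4. In the full table the same models hold #13, #14, #14, and #16, which is the same pattern shifted by twelve places.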