qwen3-vl-thinking
Developer: Alibaba
Qwen3 VL Thinking by Alibaba is available through 77 API providers on LMSpeed. Compare API pricing from $0.010 to $75.00 per million input tokens across providers. Free API access is offered by 1 provider. In speed benchmarks, the fastest provider reaches 52 tok/s.
Compare Qwen3 VL Thinking API pricing across 76 providers. Input prices range from $0.010 to $75.00 per million input tokens. 素墨API offers the lowest rate at $0.010/M.
| Provider | Model Variant | Input ($/M) | Output ($/M) | Speed (t/s) |
|---|---|---|---|---|
| — | Qwen/Qwen3-VL-32B-Thinking | Free | Free | 38.8 t/s |
| 素墨API | qwen3-vl-235b-a22b-thinking | $0.010 | $0.010 | — |
| 素墨API | qwen3-vl-8b-thinking | $0.010 | $0.010 | — |
| — | Qwen/Qwen3-VL-30B-A3B-Thinking | $0.070 | $0.280 | — |
| — | qwen/qwen3-vl-8b-thinking | $0.117 | $1.36 | — |
| — | qwen/qwen3-vl-30b-a3b-thinking | $0.130 | $1.56 | — |
| — | qwen3-vl-8b-thinking | $0.140 | $1.40 | — |
| Seamee API | Qwen/Qwen3-VL-8B-Thinking | $0.180 | $2.00 | — |
| — | Qwen/Qwen3-VL-8B-Thinking | $0.180 | $2.00 | — |
| — | Qwen/Qwen3-VL-235B-A22B-Thinking | $0.200 | $0.800 | — |
| — | Qwen/Qwen3-VL-32B-Thinking | $0.200 | $1.50 | — |
| — | qwen3-vl-30b-a3b-thinking | $0.210 | $2.10 | — |
| — | Qwen/Qwen3-VL-235B-A22B-Thinking | $0.250 | $1.00 | — |
| — | qwen/qwen3-vl-235b-a22b-thinking | $0.260 | $2.60 | — |
| Seamee API | Qwen/Qwen3-VL-30B-A3B-Thinking | $0.290 | $1.00 | — |
| — | Qwen/Qwen3-VL-30B-A3B-Thinking | $0.290 | $1.00 | — |
| 人人 API | qwen3-vl-235b-a22b-thinking | $0.300 | $0.300 | — |
| — | qwen3-vl-30b-a3b-thinking | $0.375 | $3.75 | — |
| AAAI | qwen3-vl-8b-thinking | $0.500 | $5.00 | — |
| — | qwen3-vl-8b-thinking | $0.500 | $5.00 | — |
Pricing data from provider public APIs
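
Because prices are quoted per million tokens, the cost of a single request is a simple proportion. The following is a minimal sketch, assuming the listed input and output rates scale linearly with token count and that there are no minimum charges or tiers (the page does not say either way). The helper `request_cost` and the token counts are hypothetical; the rates are taken from the $0.200 / $0.800 row in the table above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Illustrative request: 10,000 input tokens and 2,000 output tokens
# at $0.200/M input and $0.800/M output (assumed linear pricing).
cost = request_cost(10_000, 2_000,
                    input_price_per_m=0.200, output_price_per_m=0.800)
print(f"${cost:.4f}")
```

Output tokens are priced higher than input tokens in most rows, so for long generations the output rate can dominate the total even when the input rate looks cheap.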
Qwen3 VL Thinking is free to use through 1 provider, with no per-token charges. This provider offers free API credits or a free tier:
| Provider | Speed (t/s) |
|---|---|
| — | 38.8 tok/s |
Compare speed and latency performance across all API providers.
| Provider | Speed | Latency | Tests |
|---|---|---|---|
| accounts/fireworks/models/qwen3-vl-235b-a22b-thinking | 51.54 tok/s | 1.25s | 5 |
| Qwen/Qwen3-VL-32B-Thinking | 38.81 tok/s | 19.77s | 69 |
Latest benchmark results measuring API response speed and first-token latency.
| Time | Model | Speed | Latency |
|---|---|---|---|
| 03/30/2026, 16:26 | Qwen/Qwen3-VL-32B-Thinking | 26.50 tok/s | 54.45s |
| 03/30/2026, 16:26 | Qwen/Qwen3-VL-32B-Thinking | 35.49 tok/s | 20.31s |
| 03/30/2026, 16:26 | Qwen/Qwen3-VL-32B-Thinking | 28.97 tok/s | 25.33s |
| 03/30/2026, 16:26 | Qwen/Qwen3-VL-32B-Thinking | 3.59 tok/s | 30.18s |
| 03/30/2026, 16:26 | Qwen/Qwen3-VL-32B-Thinking | 33.21 tok/s | 29.26s |
| 03/24/2026, 17:04 | Qwen/Qwen3-VL-32B-Thinking | 33.88 tok/s | 21.22s |
| 03/24/2026, 17:04 | Qwen/Qwen3-VL-32B-Thinking | 35.12 tok/s | 10.10s |
| 03/24/2026, 17:04 | Qwen/Qwen3-VL-32B-Thinking | 32.49 tok/s | 21.67s |
| 03/24/2026, 17:04 | Qwen/Qwen3-VL-32B-Thinking | 39.50 tok/s | 9.95s |
| 03/24/2026, 17:04 | Qwen/Qwen3-VL-32B-Thinking | 20.69 tok/s | 21.71s |
| 03/23/2026, 01:58 | Qwen/Qwen3-VL-32B-Thinking | 34.68 tok/s | 14.42s |
| 03/23/2026, 01:58 | Qwen/Qwen3-VL-32B-Thinking | 35.50 tok/s | 8.57s |
| 03/23/2026, 01:58 | Qwen/Qwen3-VL-32B-Thinking | 55.33 tok/s | 12.08s |
| 03/23/2026, 01:58 | Qwen/Qwen3-VL-32B-Thinking | 35.71 tok/s | 18.73s |
| 03/23/2026, 01:58 | Qwen/Qwen3-VL-32B-Thinking | 42.87 tok/s | 16.08s |
| 03/17/2026, 13:54 | Qwen/Qwen3-VL-32B-Thinking | 41.00 tok/s | 21.78s |
| 03/17/2026, 13:54 | Qwen/Qwen3-VL-32B-Thinking | 43.46 tok/s | 7.96s |
| 03/17/2026, 13:54 | Qwen/Qwen3-VL-32B-Thinking | 60.33 tok/s | 7.85s |
| 03/17/2026, 13:54 | Qwen/Qwen3-VL-32B-Thinking | 41.29 tok/s | 10.65s |
| 03/17/2026, 13:54 | Qwen/Qwen3-VL-32B-Thinking | 67.28 tok/s | 10.46s |
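
Individual runs vary widely, so a summary statistic is easier to compare than any single row. The sketch below copies the 20 speed values from the table above and computes their mean and median using Python's standard `statistics` module. The `summarize` helper is hypothetical and is not how LMSpeed computes its published averages, which cover a larger number of tests.

```python
from statistics import mean, median

# Speeds in tok/s, copied from the benchmark rows above in table order.
speeds = [
    26.50, 35.49, 28.97, 3.59, 33.21,   # 03/30/2026
    33.88, 35.12, 32.49, 39.50, 20.69,  # 03/24/2026
    34.68, 35.50, 55.33, 35.71, 42.87,  # 03/23/2026
    41.00, 43.46, 60.33, 41.29, 67.28,  # 03/17/2026
]


def summarize(samples: list[float]) -> dict[str, float]:
    """Return basic summary statistics for a list of speed samples."""
    return {
        "n": len(samples),
        "mean": mean(samples),
        "median": median(samples),
        "min": min(samples),
        "max": max(samples),
    }


summary = summarize(speeds)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The median is less sensitive than the mean to outliers such as the single 3.59 tok/s run, which makes it a reasonable choice when comparing providers on a small number of samples.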
