LLM API Pricing & Speed Test

Compare LLM API pricing and speed benchmarks across 100+ providers — OpenAI, DeepSeek, Claude, Gemini, and more. Find the cheapest and fastest API for your project.

Recent Test Results

Time	Provider	Model	Speed	Latency
Apr 17, 12:50 AM	platform.aitools.cfd	zhipu/glm-4-flash	34.65 t/s	1.15 s
Apr 16, 09:05 PM	platform.aitools.cfd	zhipu/glm-4-flash	33.28 t/s	0.84 s
Apr 16, 09:01 PM	platform.aitools.cfd	zhipu/glm-4-flash	39.25 t/s	0.69 s
Apr 16, 08:26 PM	platform.aitools.cfd	qwen/qwen3-8b	26.08 t/s	33.74 s
Apr 16, 08:18 PM	platform.aitools.cfd	zhipu/glm-4-flash	34.54 t/s	0.64 s

Professional API Speed Testing Tool

Compare API pricing per token, run speed benchmarks, and find the best provider for your AI application.

API Pricing Comparison: Compare per-token API pricing across 100+ providers side by side. Find the cheapest LLM API for any model, with free-tier and credits tracking.
Real-time Speed Benchmarks: Run standardized speed benchmarks that measure API throughput and latency with multi-prompt testing.
Custom Endpoint Benchmarks: Benchmark any API endpoint with a custom base URL and API key, including official providers, proxies, and self-hosted models.
Speed Benchmark Analytics: Review detailed metrics, including first-token latency, output throughput, and processing time, across providers.
Real-time Streaming Results: Monitor benchmark progress in real time, with results streamed for each prompt evaluation.

Frequently Asked Questions

Learn more about LMSpeed

How do I compare LLM API pricing across providers?

LMSpeed aggregates per-token pricing from 100+ API providers. Visit any model page to see a side-by-side pricing comparison table showing input and output rates per million tokens, so you can find the cheapest provider for each model.
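As a rough illustration of how per-million-token rates translate into a per-request cost, here is a minimal sketch. The `Pricing` class, the provider names, and every rate below are made up for the example and are not taken from LMSpeed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token rates in USD."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given rates."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


def cheapest(providers: dict[str, Pricing], input_tokens: int, output_tokens: int) -> str:
    """Name of the provider with the lowest cost for this token mix."""
    return min(
        providers,
        key=lambda name: request_cost(providers[name], input_tokens, output_tokens),
    )


# Made-up rates, for illustration only.
providers = {
    "provider-a": Pricing(input_per_million=0.50, output_per_million=1.50),
    "provider-b": Pricing(input_per_million=0.25, output_per_million=2.00),
}

# The cheapest provider depends on the input/output mix of the workload.
prompt_heavy = cheapest(providers, input_tokens=2000, output_tokens=500)
output_heavy = cheapest(providers, input_tokens=500, output_tokens=2000)
```

With these invented rates, a prompt-heavy request favors the provider with the cheaper input rate and an output-heavy request favors the other, which is why input and output prices are worth comparing separately.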

Which LLM APIs are free?

Many providers offer free API tiers or credits for popular models like DeepSeek, Gemini, and Llama. Check our Free LLM API directory for a complete list of models with free access, including speed benchmarks for each free provider.

How does LMSpeed conduct speed benchmark testing?

LMSpeed runs a five-round continuous stress test with standardized prompts. It counts tokens with tiktoken and measures output throughput (tokens per second) and first-token latency.
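The two metrics can be sketched from per-round timestamps as follows. This is an assumption-laden illustration, not LMSpeed's implementation: the `RoundTiming` structure is hypothetical, throughput is assumed to be measured over the window after the first token arrives, and rounds are assumed to be averaged. The `output_tokens` field stands in for a count that a tokenizer such as tiktoken would supply.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RoundTiming:
    """Hypothetical timestamps (in seconds) recorded for one benchmark round."""
    request_sent: float
    first_token: float
    last_token: float
    output_tokens: int  # assumed to come from a tokenizer such as tiktoken


def first_token_latency(r: RoundTiming) -> float:
    """Seconds between sending the request and receiving the first token."""
    return r.first_token - r.request_sent


def output_throughput(r: RoundTiming) -> float:
    """Tokens per second over the generation window (assumed definition)."""
    generation_time = r.last_token - r.first_token
    if generation_time <= 0:
        return 0.0  # avoid dividing by zero on degenerate rounds
    return r.output_tokens / generation_time


def summarize(rounds: list[RoundTiming]) -> dict[str, float]:
    """Average both metrics across rounds (averaging is an assumption)."""
    return {
        "throughput_tps": mean(output_throughput(r) for r in rounds),
        "first_token_latency_s": mean(first_token_latency(r) for r in rounds),
    }


# Invented timings for two rounds, for illustration only.
rounds = [
    RoundTiming(request_sent=0.0, first_token=0.5, last_token=2.5, output_tokens=80),
    RoundTiming(request_sent=0.0, first_token=1.0, last_token=3.0, output_tokens=60),
]
summary = summarize(rounds)
```

Dividing by total request time instead of the generation window would fold first-token latency into the throughput figure, so the two definitions can give noticeably different numbers for the same run.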

How do I compare speed between different API providers?

Use our performance leaderboards and model detail pages to visually compare API speed benchmarks across providers. The system ranks providers by throughput, latency, and health, helping you choose the fastest and most reliable API.
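One way such a ranking could work is sketched below. The ordering rule (healthy providers first, then higher throughput, then lower latency), the `ProviderStats` fields, and the use of a success rate as the health signal are all assumptions made for illustration. How LMSpeed actually weighs the three factors is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderStats:
    """Hypothetical aggregated benchmark results for one provider."""
    name: str
    throughput_tps: float
    latency_s: float
    success_rate: float  # assumed health signal, between 0.0 and 1.0


def rank(stats: list[ProviderStats], min_success_rate: float = 0.95) -> list[ProviderStats]:
    """Sort healthy providers first, then by throughput (desc), then latency (asc)."""
    def sort_key(s: ProviderStats) -> tuple[bool, float, float]:
        unhealthy = s.success_rate < min_success_rate
        return (unhealthy, -s.throughput_tps, s.latency_s)

    return sorted(stats, key=sort_key)


# Invented numbers, for illustration only.
stats = [
    ProviderStats("a", throughput_tps=40.0, latency_s=0.7, success_rate=0.99),
    ProviderStats("b", throughput_tps=55.0, latency_s=1.2, success_rate=0.80),
    ProviderStats("c", throughput_tps=40.0, latency_s=0.5, success_rate=0.98),
]
ranked_names = [s.name for s in rank(stats)]
```

In this sketch the fastest provider still ranks last because it falls below the health threshold, and the two healthy providers with equal throughput are separated by latency.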

Is long-term performance monitoring supported?

Not yet. Long-term performance monitoring is coming soon.
