Question 1

What counts as a free LLM API on LMSpeed?

Accepted Answer

A model–provider pair whose current input and output prices are both $0 per token. Some providers offer permanent free tiers; others only give one-time credits to new accounts. We mark an offering as free only while its public pricing is zero, and we re-check pricing pages regularly.
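
The rule above can be sketched as a small check. This is an illustration only, not LMSpeed's actual code: the `Offering` type and its field names are hypothetical, and the one point taken from the answer is that both the input and the output price must be zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    """A model-provider pair with its current public prices (hypothetical shape)."""
    model: str
    provider: str
    input_price_per_token: float   # USD per input token
    output_price_per_token: float  # USD per output token


def is_free(offering: Offering) -> bool:
    # Both prices must be zero; free input with paid output does not qualify.
    return (
        offering.input_price_per_token == 0
        and offering.output_price_per_token == 0
    )


fully_free = Offering("example-model", "example-provider", 0.0, 0.0)
free_input_only = Offering("example-model", "other-provider", 0.0, 0.000002)
```

Here `is_free(fully_free)` is true and `is_free(free_input_only)` is false. Because the flag depends on current pricing, the check would need to be re-run each time a pricing page is re-checked.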

Question 2

Are these free APIs really free? Any catch?

Accepted Answer

Most have usage limits (requests per minute, daily caps, or context-length limits), and many require account signup with a verified phone number or payment method. Some are time-limited promotions. Always read the provider's terms and quotas before depending on a free endpoint.

Question 3

Are community-run relays and non-profit aggregators included? Any extra caveats?

Accepted Answer

Some entries are community-run relays ("公益站", roughly "public-benefit sites") that pool paid upstream keys and redistribute access for free at the operator's expense. They often advertise larger quotas and a broader model list than official free tiers, but reliability is much lower: operators can shut down or disappear overnight, pricing and quotas can change without notice, and many sites are invite-only, requiring a GitHub invite, a forum referral, or membership in a closed community to register. Some keep signups disabled indefinitely. Treat them as best-effort backup channels; keep anything important on official paid endpoints.

Question 4

Which free LLM API is fastest?

Accepted Answer

Speed varies by model and provider. Sort the table by "Most tested" for the most reliable benchmarks, or pick a model family from the chips above to drill in. Each row shows median tokens per second and first-token latency from real API tests.

Question 5

How does LMSpeed measure speed and latency?

Accepted Answer

We send identical prompts to each provider through a five-round stress test, count output tokens with tiktoken, and measure both throughput (tokens per second) and time to first token. Results are aggregated as medians to resist outliers and are refreshed on a regular cadence.
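
As a rough sketch of the aggregation step described above, the snippet below takes five rounds of measurements and reports the median throughput and median time to first token. The `Round` type, the field names, and the sample numbers are illustrative assumptions. Token counts are taken as given (in practice they would come from tiktoken), and throughput is assumed here to be output tokens divided by total elapsed time, which may differ from the exact definition LMSpeed uses.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Round:
    """One benchmark round (hypothetical shape)."""
    output_tokens: int  # counted by a tokenizer such as tiktoken
    ttft_s: float       # seconds from request start to first token
    total_s: float      # seconds from request start to last token


def throughput(r: Round) -> float:
    # Assumption: tokens per second over the whole request duration.
    return r.output_tokens / r.total_s


def summarize(rounds: list[Round]) -> dict[str, float]:
    # Medians keep one slow round from skewing the reported figures.
    return {
        "median_tps": median(throughput(r) for r in rounds),
        "median_ttft_s": median(r.ttft_s for r in rounds),
    }


# Made-up sample data: four normal rounds and one slow outlier.
rounds = [
    Round(output_tokens=300, ttft_s=0.8, total_s=10.0),
    Round(output_tokens=310, ttft_s=0.9, total_s=10.0),
    Round(output_tokens=290, ttft_s=0.7, total_s=10.0),
    Round(output_tokens=320, ttft_s=1.0, total_s=10.0),
    Round(output_tokens=100, ttft_s=6.0, total_s=20.0),
]
summary = summarize(rounds)
```

With this sample data the medians are 30.0 tokens per second and 0.9 s to first token, whereas the mean throughput would be pulled down to 25.4 by the single slow round.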

Question 6

Can I use a free LLM API in production?

Accepted Answer

For prototypes, side projects, and low-traffic tools, yes. Production traffic will usually hit a rate limit quickly. Treat the free tier as an evaluation channel: validate the model and provider, then move to a paid endpoint with the same model when you scale.
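
One way to keep a prototype responsive while a free tier is still being evaluated is to fall back to a paid endpoint for the same model when the free one reports a rate limit. The sketch below is a minimal illustration under stated assumptions: `RateLimited`, `complete_with_fallback`, and the two stand-in callables are hypothetical names, and a real client would raise its own error type when a provider answers with HTTP 429.

```python
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error a client wrapper raises on an HTTP 429 response."""


def complete_with_fallback(
    prompt: str,
    free_call: Callable[[str], str],
    paid_call: Callable[[str], str],
) -> tuple[str, str]:
    """Try the free endpoint first; use the paid one only when rate limited.

    Returns (tier_used, completion_text).
    """
    try:
        return "free", free_call(prompt)
    except RateLimited:
        return "paid", paid_call(prompt)


# Stand-in callables so the sketch runs without network access.
def exhausted_free_endpoint(prompt: str) -> str:
    raise RateLimited("daily cap reached")


def paid_endpoint(prompt: str) -> str:
    return f"echo: {prompt}"


tier, text = complete_with_fallback("hello", exhausted_free_endpoint, paid_endpoint)
```

In this run `tier` is `"paid"` because the free stand-in always raises. Logging which tier served each request gives a simple signal for when traffic has outgrown the free quota.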

Question 7

Why don't I see a specific model in this list?

Accepted Answer

There are three possible reasons: no provider currently offers it for free, its free promotion has ended, or it has not been benchmarked yet. Open the model's main page to compare paid options, or let us know about a missing free provider via the feedback link in the footer.

Model	Providers	Speed	Latency	Tests
Anthropic Claude Sonnet 4.6 (Free; Reasoning, Tools, Files, Vision): extends the Sonnet line with improved tool use, coding reliability, and long-context performance for everyday production workloads.		30.62 t/s	4.12 s	10
Anthropic Claude Haiku 4.5 (Free; Tools, Multimodal, 200K, Tool calling): delivers fast, low-cost responses while retaining solid instruction following for chat, classification, and lightweight coding.

How to use

Filter by model family

Compare speed and latency

Open the provider details