KKSJ-AI is an AI model aggregation API platform that provides access to multiple AI models through OpenAI-compatible interfaces. It supports over 215 models including OpenAI's GPT-4o series, Claude series, Gemini series, image generation models like DALL-E 3, and embedding models. The platform features enterprise-grade architecture with 99.9% availability, multi-node load balancing, and global server deployment. Key capabilities include real-time analytics dashboards, flexible key management with quota controls, and transparent billing. It's designed for developers needing multi-model AI access with minimal code changes.

Model

Input ($/M)

Output ($/M)

Audit

Speed

Latency

gpt-5.6-lunacodex特价

$6.16/M

$36.99/M

—

gpt-5.6-lunacodex

$9.25/M

$55.48/M

—

gpt-5.6-lunadefault

$9.25/M

$55.48/M

—

gpt-5.6-lunaazpro

$12.33/M

$73.97/M

—

gpt-5.6-lunaazplus

$14.38/M

$86.30/M

—

gpt-5.6-lunaguan

$39.04/M

$234.25/M

—

gpt-5.6-terracodex特价

$6.16/M

$36.99/M

—

gpt-5.6-terracodex

$9.25/M

$55.48/M

—

gpt-5.6-terradefault

$9.25/M

$55.48/M

—

gpt-5.6-terraazpro

$12.33/M

$73.97/M

—

gemini-2.0-pro-exp-02-05

—

49.4 t/s

1.77 s

Time

Model

Speed

Latency

Mar 18, 10:33 AM

gemini-2.0-pro-exp-02-05

49.44 tok/s

1.77s

Provider

Why compare

Models

Free

Avg price

Speed

30d uptime

KKSJ-AI

kksj-ai

An AI model aggregation API platform offering multi-model access with OpenAI-compatible interfaces at low cost.

Current provider baseline

159

$9.54/M

49 tok/s

99.6%

api-n1n-ai

N1N provides API access to a wide range of AI models including GPT-4, Claude 3, Gemini, and others for text, image, and video generation.

Lower average pricing
Faster measured speed
Higher 30-day availability
Broader model coverage

177

$8.77/M

90 tok/s

99.7%

api-kr777-top

CaMeL AI provides an OpenAI-compatible API gateway with extensive model coverage and pricing options.

Faster measured speed
Broader model coverage

194

$78.76/M

93 tok/s

99.2%

new-waadri-top

WAADRI runs a unified AI model gateway that exposes aggregated model access through OpenAI-, Claude-, and Gemini-compatible interfaces.

More free-model options
Broader model coverage

192

710

N/A

5.5%

newapi-higobs-com

An OpenAI-compatible API gateway providing access to multiple large language models and AI services.

Lower average pricing
Faster measured speed

137

$2.21/M

118 tok/s

0.5%

apitoken-online

ApiToken Online offers an OpenAI-compatible API gateway at apitoken.online with transparent per-model pricing and multi-provider routing.

Faster measured speed
More free-model options

$21.19/M

81 tok/s

72.9%

catclaw-moetu-vip

CatClaw API is an OpenAI-compatible LLM gateway at catclaw.moetu.vip, offering multi-model API access with transparent pricing.

Lower average pricing
Faster measured speed

$0/M

62 tok/s

0.4%

Announcements

default4/26/2026

Due to recent risk control measures, the cost of the codex special offer group has increased significantly. To ensure group stability, the price for this group will be adjusted to 0.6 yuan per cut, effective from 12:00 on 2026.05.2.

default4/24/2026

Added a codex trusted access group. All accounts in this group have passed OpenAI trusted access and come with CTF-related system prompts. It also enables fast mode by default to facilitate quick requests, allowing for some cybersecurity content. Trusted access supports legal and ethical cybersecurity work, including discovering and patching vulnerabilities, defensive attack chain simulation, and vulnerability research. Use is limited to systems you own or are explicitly authorized to assess.

default4/23/2026

gpt-5.5 launched on the codex channel.

default3/24/2026

To maintain stability of the claudemax group, the group multiplier is expected to be adjusted from 1.4 to 1.5.

default3/8/2026

New limited-time special group for codex, pure codex channel, official native experience, limited-time special price of 0.2 yuan per unit, official special price already supports 5.4.

default9/18/2025

Added gpt-5 related models.

default9/18/2025

Added claude-opus-4.1 related models.

default9/18/2025

Gemini official update, gemini-2.5 related models. Except for the official version, all other versions have been delisted by the official. Please update soon to avoid affecting usage.

default9/18/2025

For gemini related models, if you need to output the reasoning process, use models with the -thinking suffix. By default, the reasoning process is not output, but gemini will still reason, making the first token very slow!!! Some models can disable reasoning; use the -nothinking suffix. 2.5flashlite has reasoning disabled by default.

default9/18/2025

Most models are at official rates. Some models have higher costs, up to double the official rate. All prices are clearly marked.

default9/18/2025

For detailed information on each group, please go to the token settings page and click on token groups to view.

default9/18/2025

For easier calculation, the recharge price on this site has been changed to 1 yuan per unit, but the usage price remains unchanged. Group multipliers have been adjusted accordingly; the default group is 0.9, still 0.9 yuan per unit.

Alipay online recharge is now supported; 1 yuan recharges 1 credit, with the default group multiplier at 0.9!!!

default9/18/2025

This site's domain https://api.kksj.me will expire in December 2024. Please switch to https://api.kksj.org as soon as possible. This domain is a long-term domain and will remain valid indefinitely. Please save it promptly.

default9/18/2025

Domestic optimized domain with dynamic DNS resolution and CDN acceleration https://cnapi.kksj.org Recommended for use when the official domain is unavailable.

FAQ

What is a relay API? Why use a relay API?

Our relay API platform acts as an intelligent distribution system between users and major model service providers. Although you still call the model service providers' APIs, we handle user authentication, token distribution, billing management, and data queries through a unified interface. The core value includes: efficient access to over 1000 mainstream models with a single interface; extreme cost-effectiveness through our platform; and a shared compute pool based on our large user base, giving you access to millions of API call frequencies. In short, we lower technical and cost barriers, allowing every user to easily, safely, and efficiently use powerful AI model services.

Why does the relay API offer discounts compared to purchasing directly?

We serve global users by aggregating API call volumes from all users on our platform to negotiate exclusive discounts with model suppliers. With our collective purchasing power, we ensure the lowest prices that individual or enterprise users cannot obtain directly, passing these savings directly to each user.

Is there any difference in quality or speed compared to calling the official API directly?

There is no difference in quality or speed. We essentially call the official API, and with our global acceleration network, we significantly improve API request response speeds, ensuring it's more efficient and faster than calling independently.

A very small number of models give incorrect answers when asked about their version, knowledge base training cutoff, etc.

First, it's important to clarify that answers from the web version (reverse-engineered) and the API version may differ. The web version typically includes built-in prompts to ensure accuracy and match the knowledge base, while the API version, after training, usually only updates training data without specific adaptations. Therefore, you can compare the performance of the web version and official API version at any time to verify that our answers are completely consistent with the official ones. This comparison method can also help identify irregular merchants who use reverse-engineered methods to replace official APIs.

How can I determine if it is a GPT-4 model?

You can use the following logical question to test: Question: What is the relationship between Lu Xun and Zhou Shuren? GPT-3.5: Lu Xun and Zhou Shuren are two different people. GPT-4: Lu Xun and Zhou Shuren are the same person.

How are fees calculated, and are there discounts compared to the official pricing?

For example, text-based models are generally calculated by tokens, with 1K tokens costing xx USD, which is the same billing method as official. Our discount: purchasing USD through official channels requires real-time exchange rates (approximately 1:7), while on our platform, you can buy USD at a discounted rate (check the recharge page for specific exchange rates).

Why does it return upstream load or this error is not common?

In most cases, it's because the content format you sent is unsupported or incorrect. You can contact customer service for assistance.

Why is the completion response sometimes empty?

In most cases, it's because the text content you sent contains filtered prohibited words, so it was intercepted.

Does it support high-concurrency scenarios like immersive translation?

Yes, it supports.

Why does vision sometimes fail to recognize images?

It's related to image format, MIME type, etc. Generally, images that can be directly downloaded are fine.

Invalid token

First, confirm that the API endpoint is correctly filled. If it still doesn't work, please generate a new token.

How to contact us?

The only official customer service QQ (others are scammers, please do not be deceived): 1244119140

What is the quota? How is it calculated?

The quota calculation formula is as follows: Quota = Group Multiplier * Model Multiplier * (Prompt Tokens + Completion Tokens * Completion Multiplier). Completion Multiplier explanation: GPT3.5: fixed at 1.33; GPT4: fixed at 2 (consistent with official). Note: In non-stream mode, the official API returns total token consumption, but prompt and completion have different multipliers.

Why does it indicate insufficient quota when the account quota is sufficient?

This is because token quota and account quota are separate: Token quota is only used to set maximum usage limits; users can freely set token quota; please check if your token quota is sufficient.

Why does the log show an extra billing entry after using your built-in chat software?

This is normal. The chat software has an automatic title summarization feature enabled, which incurs negligible fees. If you mind, simply disable this feature.

Why are the input tokens for gpt-4o-mini image recognition always over 30,000?

This is normal because the token billing for image recognition in mini differs from other models. Please refer to the official documentation; our billing rules are consistent with theirs.

This model is not available in the group or the model has no usage permission

This is because some models are only available through specific channels. Please go to the model pricing page to see which groups can use this model, then click edit token on the token settings page to switch groups.

Why are some model multipliers more expensive than official?

To make it convenient for users, we decided to use the default group as a universal group. In most cases, users only need to use the default group to access all models they want without frequently switching groups. However, this leads to an issue where different model channels have varying costs. For costs that the default group cannot cover, we adjust the model multipliers, up to twice the official rate. If the cost exceeds twice the multiplier, we may open a new group. For example, the GPT model in the default group has an official 1x multiplier, so 0.9 yuan can use 1 USD of official quota, while the Claude model has higher costs, with a 2x multiplier, so 1.8 yuan can use 1 USD of official quota. Apart from this, everything else is consistent with official pricing. Users can use it with confidence.

Notes

Health checks: Scope: the 72-hour chart and recent availability measure API connectivity only. Each bar summarizes one hour of checks. Targets: LMSpeed tries the configured health check URL and provider status URL first, then API endpoints derived from known API hosts and recent speed-test base URLs. A website host is considered only when it looks like an API endpoint. Probe steps: each candidate goes through DNS lookup, TCP connection, TLS handshake for HTTPS, and an HTTP HEAD request with redirects followed. Probing stops after the first reachable candidate. Reachable criteria: every required network step must succeed. An HTTP response below 500 is treated as reachable, including 401 because it confirms that an authenticated API endpoint responded, except for statuses classified as blocked. Blocked results: HTTP 403, 429, 521, 525, and 530, plus detected WAF or Cloudflare challenges, are shown as blocked and excluded from availability calculations because LMSpeed cannot determine whether the API itself is down. Model availability: when a dedicated test key is configured, LMSpeed sends an authenticated GET request to a derived /models endpoint and compares returned model IDs with this provider's listed models. These per-model results appear in Models & Pricing and are not included in the provider connectivity percentage. Timeouts: TCP connection, TLS handshake, HTTP connectivity, and model requests each use a 20-second timeout. A full run can take longer when several candidates are tried. Frequency: a background worker checks all providers every 5 minutes by default. The 72-hour chart combines those samples into hourly bars, and the schedule may be changed by the service operator. Limit: automated samples are not an SLA and do not guarantee account quota, every model, every region, or successful completion requests. Check the provider's own status page before making operational decisions.

Domain Rating data is sourced from Ahrefs. It is a 0–100 backlink-based domain strength signal and does not measure API speed or reliability.

Announcements and FAQ are read from this provider's NewAPI status snapshot when available. LMSpeed stores the original content and optional English translations from the provider status source, then shows the localized fields on this page.

KKSJ-AI

KKSJ-AI

Features

Login Methods

API Endpoints

Health Check

API Benchmarks & Pricing

Recent Test Records

Similar API Provider Alternatives to Compare

Announcements

FAQ

What is a relay API? Why use a relay API?

Why does the relay API offer discounts compared to purchasing directly?

Is there any difference in quality or speed compared to calling the official API directly?

A very small number of models give incorrect answers when asked about their version, knowledge base training cutoff, etc.

How can I determine if it is a GPT-4 model?

How are fees calculated, and are there discounts compared to the official pricing?

Why does it return upstream load or this error is not common?

Why is the completion response sometimes empty?

Does it support high-concurrency scenarios like immersive translation?

Why does vision sometimes fail to recognize images?

Invalid token

How to contact us?

What is the quota? How is it calculated?

Why does it indicate insufficient quota when the account quota is sufficient?

Why does the log show an extra billing entry after using your built-in chat software?

Why are the input tokens for gpt-4o-mini image recognition always over 30,000?

This model is not available in the group or the model has no usage permission

Why are some model multipliers more expensive than official?

Notes

Similar API Provider Alternatives to Compare

Provider	Why compare	Models	Free	Avg price	Speed	30d uptime
KKSJ-AI kksj-ai An AI model aggregation API platform offering multi-model access with OpenAI-compatible interfaces at low cost.	Current provider baseline	159	6	$9.54/M	49 tok/s	99.6%
api-n1n-ai N1N provides API access to a wide range of AI models including GPT-4, Claude 3, Gemini, and others for text, image, and video generation.	Lower average pricing Faster measured speed Higher 30-day availability Broader model coverage	177	6	$8.77/M	90 tok/s	99.7%
api-kr777-top CaMeL AI provides an OpenAI-compatible API gateway with extensive model coverage and pricing options.	Faster measured speed Broader model coverage	194	6	$78.76/M	93 tok/s	99.2%
new-waadri-top WAADRI runs a unified AI model gateway that exposes aggregated model access through OpenAI-, Claude-, and Gemini-compatible interfaces.	More free-model options Broader model coverage	192	710	N/A	N/A	5.5%
newapi-higobs-com An OpenAI-compatible API gateway providing access to multiple large language models and AI services.	Lower average pricing Faster measured speed	137	0	$2.21/M	118 tok/s	0.5%
apitoken-online ApiToken Online offers an OpenAI-compatible API gateway at apitoken.online with transparent per-model pricing and multi-provider routing.	Faster measured speed More free-model options	33	11	$21.19/M	81 tok/s	72.9%
catclaw-moetu-vip CatClaw API is an OpenAI-compatible LLM gateway at catclaw.moetu.vip, offering multi-model API access with transparent pricing.	Lower average pricing Faster measured speed	32	2	$0/M	62 tok/s	0.4%

KKSJ-AI

KKSJ-AI

Features

Login Methods

API Endpoints

About KKSJ-AI

Health Check

API Benchmarks & Pricing

Recent Test Records

Similar API Provider Alternatives to Compare

Announcements

FAQ

What is a relay API? Why use a relay API?

Why does the relay API offer discounts compared to purchasing directly?

Is there any difference in quality or speed compared to calling the official API directly?

A very small number of models give incorrect answers when asked about their version, knowledge base training cutoff, etc.

How can I determine if it is a GPT-4 model?

How are fees calculated, and are there discounts compared to the official pricing?

Why does it return upstream load or this error is not common?

Why is the completion response sometimes empty?

Does it support high-concurrency scenarios like immersive translation?

Why does vision sometimes fail to recognize images?

Invalid token

How to contact us?

What is the quota? How is it calculated?

Why does it indicate insufficient quota when the account quota is sufficient?

Why does the log show an extra billing entry after using your built-in chat software?

Why are the input tokens for gpt-4o-mini image recognition always over 30,000?

This model is not available in the group or the model has no usage permission

Why are some model multipliers more expensive than official?

Notes

Similar API Provider Alternatives to Compare