What is a relay API? Why use a relay API?
Our relay API platform acts as an intelligent distribution layer between users and the major model service providers. Your requests still reach the providers' APIs, but we handle user authentication, token distribution, billing management, and data queries through one unified interface. The core value is threefold: access to over 1000 mainstream models through a single interface; lower cost than purchasing directly; and a shared compute pool backed by our large user base, which gives you call capacity in the millions of requests. In short, we lower the technical and cost barriers so that every user can use powerful AI model services easily, safely, and efficiently.
Why does the relay API offer discounts compared to purchasing directly?
We serve users worldwide and aggregate the API call volume of everyone on our platform to negotiate exclusive discounts with model suppliers. This collective purchasing power secures prices that individual or enterprise users cannot obtain on their own, and we pass those savings directly to each user.
Is there any difference in quality or speed compared to calling the official API directly?
There is no difference in quality, and speed is no worse. We call the official API under the hood, and our global acceleration network is designed to improve request response times, so calls can be faster than calling the official API on your own.
Why do a very small number of models give incorrect answers when asked about their version, training cutoff, etc.?
First, note that answers from the web version (the one that reverse-engineered services rely on) and the API version may differ. The web version typically includes built-in prompts that state the model's version and knowledge cutoff accurately, while the API version usually only receives updated training data without that specific adaptation. You can therefore compare our answers with those of the official API at any time to verify that they are consistent. The same comparison also helps identify dishonest merchants who substitute a reverse-engineered web version for the official API.
How can I determine if it is a GPT-4 model?
You can test with the following logic question:
Question: What is the relationship between Lu Xun and Zhou Shuren?
GPT-3.5: Lu Xun and Zhou Shuren are two different people.
GPT-4: Lu Xun and Zhou Shuren are the same person.
How are fees calculated, and are there discounts compared to the official pricing?
Text-based models, for example, are generally billed by tokens, with 1K tokens costing xx USD, which is the same billing method the official API uses. Our discount comes from the exchange rate: buying USD through official channels means paying the real-time exchange rate (approximately 1:7), while on our platform you can buy USD quota at a discounted rate (check the recharge page for the specific rate).
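The per-token billing and the exchange-rate discount described above can be sketched roughly as follows. This is only an illustration: the function names are made up, and the price per 1K tokens and the platform rate used in the example are placeholders, not real prices. The only figure taken from the text is the approximate official rate of 7.

```python
def token_cost_usd(total_tokens: int, price_per_1k_usd: float) -> float:
    """Cost in USD for a request billed per 1K tokens."""
    return total_tokens / 1000 * price_per_1k_usd


def recharge_cost(amount_usd: float, rate_per_usd: float) -> float:
    """Local-currency cost of buying `amount_usd` of quota at a given rate."""
    return amount_usd * rate_per_usd


if __name__ == "__main__":
    # Placeholder price, NOT a real model price.
    example_price_per_1k = 0.002
    print(token_cost_usd(2500, example_price_per_1k))

    # Approximate official rate from the text (1:7) versus a placeholder
    # platform rate; the real rate is on the recharge page.
    official_rate = 7.0
    placeholder_platform_rate = 1.0
    print(recharge_cost(10, official_rate))
    print(recharge_cost(10, placeholder_platform_rate))
```

The two steps are kept separate because the token price follows official billing, while the discount applies only when you purchase quota.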
Why does it return an "upstream load" error or a message saying the error is not common?
In most cases, it's because the content format you sent is unsupported or incorrect. You can contact customer service for assistance.
Why is the completion response sometimes empty?
In most cases, it's because the text you sent contains prohibited words that are filtered, so the request was intercepted.
Does it support high-concurrency scenarios like immersive translation?
Yes, it does.
Why does vision sometimes fail to recognize images?
It is usually related to the image format, MIME type, and similar details. Generally, images that can be downloaded directly from their URL work fine.
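One way to rule out MIME type problems is to check the type locally and send the image inline as a base64 data URL, the form commonly used in OpenAI-style vision requests. The sketch below assumes this platform accepts such data URLs, which the text above does not confirm; the helper name and the list of accepted MIME types are also assumptions.

```python
import base64
import mimetypes

# Assumed set of image types; check the model's documentation for the real list.
ASSUMED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def image_to_data_url(path: str) -> str:
    """Return a data URL for a local image, or raise if the type looks wrong."""
    # Guess the MIME type from the file extension.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type not in ASSUMED_IMAGE_TYPES:
        raise ValueError(f"unsupported or unknown MIME type: {mime_type!r}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The resulting string would go wherever the request expects the image URL. Note that the type is guessed from the file extension, so a misnamed file still passes this check.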
Invalid token
First, confirm that the API endpoint is entered correctly. If it still doesn't work, please generate a new token.
How can I contact you?
The only official customer service QQ (others are scammers, please do not be deceived): 1244119140
What is the quota? How is it calculated?
The quota is calculated as follows:
Quota = Group Multiplier * Model Multiplier * (Prompt Tokens + Completion Tokens * Completion Multiplier)
Completion Multiplier: GPT-3.5: fixed at 1.33; GPT-4: fixed at 2 (consistent with official). Note: in non-stream mode the official API returns the total token consumption, but keep in mind that prompt tokens and completion tokens are billed at different multipliers.
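The quota formula above can be written as a small Python function. This is a minimal sketch: the function and parameter names are illustrative, and the group and model multipliers in the example are placeholder values. Only the completion multipliers (1.33 for GPT-3.5, 2 for GPT-4) come from the text.

```python
# Completion multipliers quoted in the text above.
COMPLETION_MULTIPLIER = {"gpt-3.5": 1.33, "gpt-4": 2.0}


def compute_quota(
    group_multiplier: float,
    model_multiplier: float,
    prompt_tokens: int,
    completion_tokens: int,
    completion_multiplier: float,
) -> float:
    """Quota = group * model * (prompt + completion * completion multiplier)."""
    weighted_tokens = prompt_tokens + completion_tokens * completion_multiplier
    return group_multiplier * model_multiplier * weighted_tokens


if __name__ == "__main__":
    # Placeholder multipliers of 1.0; real values depend on your group and model.
    quota = compute_quota(1.0, 1.0, 1000, 500, COMPLETION_MULTIPLIER["gpt-4"])
    print(quota)  # 1000 + 500 * 2 = 2000.0
```

Because completion tokens are weighted separately, the quota cannot be derived from the total token count alone; the prompt and completion counts are both needed.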
Why does it indicate insufficient quota when the account quota is sufficient?
This is because the token quota and the account quota are separate. The token quota only sets a maximum usage limit for that token, and you can set it freely. Please check whether your token quota is sufficient.
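The relationship between the two quotas can be pictured with a short sketch. It assumes that a request goes through only when both the token quota and the account quota cover its cost, which is an interpretation of the answer above rather than a description of the platform's actual implementation. The function name and the returned messages are hypothetical.

```python
def check_quota(account_quota: float, token_quota: float, cost: float) -> str:
    """Report which quota, if any, is too small for a request of `cost`."""
    # The token quota is a per-token usage cap, checked independently
    # of how much the account itself holds.
    if token_quota < cost:
        return "token quota insufficient"
    if account_quota < cost:
        return "account quota insufficient"
    return "ok"


if __name__ == "__main__":
    # A well-funded account can still be blocked by a low token cap.
    print(check_quota(account_quota=1_000_000, token_quota=100, cost=500))
```

In this model, raising the token quota in the token settings resolves the error without adding funds to the account.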
Why does the log show an extra billing entry after using your built-in chat software?
This is normal. The chat software has automatic title summarization enabled, which incurs a negligible fee. If you would rather avoid it, simply disable this feature.
Why are the input tokens for gpt-4o-mini image recognition always over 30,000?
This is normal, because gpt-4o-mini bills image-recognition tokens differently from other models. Please refer to the official documentation; our billing rules are consistent with it.
This model is not available in the group or the model has no usage permission
This is because some models are only available through specific channels. Go to the model pricing page to see which groups can use the model, then edit your token on the token settings page and switch it to one of those groups.
Why are some model multipliers more expensive than official?
For convenience, we use the default group as a universal group. In most cases, users can access every model they want through the default group without frequently switching groups. The drawback is that different model channels have different costs. Where the default group's pricing cannot cover a channel's cost, we raise the model multiplier, up to at most twice the official rate. If the cost exceeds twice the official rate, we may open a new group instead. For example, in the default group the GPT models have a 1x multiplier (the official rate), so 0.9 yuan buys 1 USD of official quota, while the Claude models cost more and have a 2x multiplier, so 1.8 yuan buys 1 USD of official quota. Everything else is consistent with official pricing, so you can use the service with confidence.
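The arithmetic in the example above can be reproduced with a few lines of Python. This is a sketch: the base rate of 0.9 yuan per USD is taken from that example and may not match the current rate on the recharge page, and the function name is illustrative.

```python
# Base rate implied by the example: 0.9 yuan per 1 USD of quota at a 1x multiplier.
BASE_YUAN_PER_USD = 0.9


def yuan_per_official_usd(model_multiplier: float,
                          base_rate: float = BASE_YUAN_PER_USD) -> float:
    """Yuan needed to consume 1 USD of official quota at a given multiplier."""
    return base_rate * model_multiplier


if __name__ == "__main__":
    # Multipliers from the example: GPT at 1x, Claude at 2x.
    for name, multiplier in [("GPT", 1.0), ("Claude", 2.0)]:
        print(name, yuan_per_official_usd(multiplier))
```

With these inputs, the 2x multiplier doubles the effective price from 0.9 to 1.8 yuan per USD of official quota.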