An AI model aggregation platform providing unified API access to multiple large language models with cost optimization features.
TokenPony (小马算力) is an AI model aggregation service that offers a unified API interface compatible with OpenAI and Claude specifications. The platform allows developers to access various large language models through a single endpoint while supporting load balancing and cost optimization.
Key models available include:
- Deepseek-v3-0324: Enhanced reasoning capabilities for mathematical and coding tasks
- qwen3-coder-480b: 450B parameter code generation model with multi-language support
- kimi-k2-instruct-0905: Trillion-parameter MoE model for code generation and creative writing
The service provides metrics including average TTFT <500ms and pricing starting under ¥7 per million tokens. The platform serves over 60,000 developers with monthly token consumption exceeding 90 billion.

