Updated 2/8/2026
Performance Stats
Avg Speed: 862.16 t/s
Latency: 7.02 s
Total Tests: 180
Models: 32

素墨API

About 素墨API

A non-profit AI infrastructure project offering free access to integrated large language models under a privacy-focused, no-logging policy.

OpenAI gpt-oss, ChatGLM GLM-4, Qwen Qwen3, Qwen Qwen3-Omni, MetaAI Llama 3.1, ChatGLM GLM-4.5, DeepSeek DeepSeek-V3, DeepSeek DeepSeek-V3.2

Health Check

Recent availability: 98%
History (72 pts, past to now)

Supported Models

| Model | Speed (t/s) | Latency (s) | Tests |
| --- | --- | --- | --- |
| echo | 6152.15 | 0.75 | 15 |
| llama3.1-8B | 1421.44 | 0.94 | 10 |
| 快速/llama3.1-8B | 1258.69 | 1.24 | 15 |
| gpt-oss-120b | 970.77 | 0.94 | 5 |
| llama3.1-8b | 731.95 | 1.17 | 15 |
| gpt-oss-20b:free | 339.45 | 2.51 | 5 |
| gemini-3.0-flash | 336.24 | 12.65 | 5 |
| 翻译/glm-4.7 | 308.58 | 3.72 | 5 |
| gpt-oss-120b:free | 257.64 | 2.50 | 5 |
| gpt-oss-120b-medium | 189.93 | 8.29 | 5 |
| google/gemma-3-1b-it | 176.46 | 0.87 | 5 |
| gemini-2.5-flash | 173.49 | 11.32 | 5 |
| openai/gpt-oss-20b | 167.79 | 2.37 | 5 |
| qwen3-1.7b:free | 133.77 | 5.08 | 5 |
| 测试/gemini-2.5-pro | 82.44 | 17.04 | 5 |
| mimo-v2-flash | 76.18 | 5.01 | 5 |
| gemini-3-flash-preview | 73.06 | 4.16 | 10 |
| 测试/gemini-3-flash-preview | 69.26 | 11.71 | 5 |
| qwen3-omni-flash-2025-12-01 | 65.20 | 5.13 | 5 |
| qwen3-omni-flash-2025-12-01 | 65.20 | 5.13 | 5 |
Showing 20 of 32 models.

Recent Test Records

| Time | Model | Speed (t/s) | Latency (s) |
| --- | --- | --- | --- |
| Feb 28, 04:37 AM | echo | 16467.28 | 0.81 |
| Feb 28, 04:36 AM | echo | 1008.34 | 0.67 |
| Feb 28, 04:35 AM | echo | 980.83 | 0.77 |
| Feb 28, 04:26 AM | llama3.1-8B | 1220.87 | 1.04 |
| Feb 28, 04:26 AM | llama3.1-8B | 1622.01 | 0.84 |
| Feb 28, 04:25 AM | 快速/llama3.1-8B | 1238.04 | 1.04 |
| Feb 28, 04:14 AM | 快速/llama3.1-8B | 1302.82 | 1.49 |
| Feb 28, 04:14 AM | 快速/llama3.1-8B | 1235.21 | 1.19 |
| Feb 28, 04:13 AM | llama3.1-8b | 743.43 | 1.58 |
| Feb 28, 04:12 AM | llama3.1-8b | 759.33 | 0.80 |
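
The records above appear to feed the per-model figures under Supported Models: for echo, llama3.1-8B, and 快速/llama3.1-8B, the plain mean of the listed speeds and latencies reproduces the values shown there. This is an observation from the numbers on this page, not documented behavior of the site. A minimal Python sketch of that check follows; the `records` list and the `per_model_mean` helper are illustrative names, not part of any 素墨API tooling.

```python
from collections import defaultdict

# (model, speed in t/s, latency in s), copied from the Recent Test Records above.
records = [
    ("echo", 16467.28, 0.81),
    ("echo", 1008.34, 0.67),
    ("echo", 980.83, 0.77),
    ("llama3.1-8B", 1220.87, 1.04),
    ("llama3.1-8B", 1622.01, 0.84),
    ("快速/llama3.1-8B", 1238.04, 1.04),
    ("快速/llama3.1-8B", 1302.82, 1.49),
    ("快速/llama3.1-8B", 1235.21, 1.19),
    ("llama3.1-8b", 743.43, 1.58),
    ("llama3.1-8b", 759.33, 0.80),
]


def per_model_mean(rows):
    """Return {model: (mean speed, mean latency)}, each rounded to 2 decimals.

    Assumption: an unweighted mean over records, which is only inferred
    from the figures on this page.
    """
    grouped = defaultdict(list)
    for model, speed, latency in rows:
        grouped[model].append((speed, latency))
    return {
        model: (
            round(sum(s for s, _ in vals) / len(vals), 2),
            round(sum(l for _, l in vals) / len(vals), 2),
        )
        for model, vals in grouped.items()
    }


means = per_model_mean(records)
for model, (speed, latency) in means.items():
    print(f"{model}: {speed:.2f} t/s, {latency:.2f} s")
```

For echo this gives 6152.15 t/s and 0.75 s, for llama3.1-8B 1421.44 t/s and 0.94 s, and for 快速/llama3.1-8B 1258.69 t/s and 1.24 s, all matching the Supported Models figures. Note that a single 16467.28 t/s record dominates the echo average. For llama3.1-8b the two listed records average 751.38 t/s and 1.19 s, against 731.95 t/s and 1.17 s in the table, presumably because not all of its records fall within the ten shown here.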