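NVIDIA provides AI and accelerated computing APIs for building, customizing, and deploying multimodal generative AI models.
NVIDIA offers AI and accelerated computing services through APIs, chiefly NeMo and related services, for building and deploying multimodal generative AI models. Core capabilities cover model development, customization, and deployment across domains. A notable advantage is integration with NVIDIA hardware platforms such as DGX and HGX, which supports high-performance computing. Typical use cases include AI research, life sciences (through BioNeMo), and 3D simulation workflows (through Omniverse Cloud).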
| … | … | 2.50s | 25 |
| qwen/qwq-32b | 31.54 t/s | 1.92s | 15 |
| moonshotai/kimi-k2-instruct | 54.35 t/s | 0.53s | 15 |
| deepseek-ai/deepseek-r1 | 87.63 t/s | 8.96s | 15 |
| mistralai/mistral-small-24b-instruct | 29.68 t/s | 0.49s | 10 |
| qwen/qwen3-next-80b-a3b-instruct | 118.96 t/s | 0.58s | 10 |
| google/gemma-2-27b-it | 43.69 t/s | 0.23s | 10 |
| ai21labs/jamba-1.5-large-instruct | 55.60 t/s | 0.29s | 10 |
| google/gemma-3-27b-it | 62.41 t/s | 0.20s | 5 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | 79.68 t/s | 0.28s | 5 |
| moonshotai/kimi-k2-instruct-0905 | 47.28 t/s | 0.72s | 5 |
| deepseek-ai/deepseek-r1-distill-qwen-14b | 40.96 t/s | 0.49s | 5 |
| microsoft/phi-3-medium-128k-instruct | 18.27 t/s | 0.53s | 5 |
| meta/llama-3.1-70b-instruct | 51.18 t/s | 0.23s | 5 |
| 01-ai/yi-large | 43.74 t/s | 0.22s | 5 |
| mistralai/mixtral-8x22b-instruct-v0.1 | 89.66 t/s | 0.22s | 5 |
| nvidia/llama-3.1-nemotron-70b-instruct | 52.19 t/s | 0.23s | 5 |
| microsoft/phi-4-mini-flash-reasoning | 74.19 t/s | 0.46s | 5 |
| qwen/qwen3-235b-a22b | 34.77 t/s | 25.15s | 5 |
| openai/gpt-oss-20b | 239.61 t/s | 10.88s | 5 |
| deepseek-ai/deepseek-r1-distill-qwen-32b | 33.97 t/s | 0.59s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Nov 8, 12:52 PM | openai/gpt-oss-120b | 149.96 t/s | 18.38s |
| Nov 8, 12:45 PM | nvidia/llama-3.3-nemotron-super-49b-v1.5 | 79.68 t/s | 0.28s |
| Nov 6, 01:20 PM | deepseek-ai/deepseek-v3.1 | 20.43 t/s | 5.37s |
| Oct 31, 05:26 AM | qwen/qwen3-next-80b-a3b-instruct | 158.38 t/s | 0.50s |
| Oct 5, 02:42 AM | deepseek-ai/deepseek-v3.1 | 28.14 t/s | 0.52s |
| Oct 5, 12:04 AM | deepseek-ai/deepseek-v3.1 | 29.94 t/s | 0.38s |
| Oct 5, 12:01 AM | deepseek-ai/deepseek-r1 | 117.81 t/s | 7.08s |
| Sep 28, 04:25 PM | qwen/qwen3-next-80b-a3b-instruct | 79.54 t/s | 0.67s |
| Sep 28, 10:27 AM | moonshotai/kimi-k2-instruct-0905 | 47.28 t/s | 0.72s |
| Sep 20, 04:57 AM | google/gemma-2-27b-it | 43.48 t/s | 0.22s |
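
The speed and latency columns can be combined into a rough end-to-end estimate for a response of a given length. The sketch below assumes that "Latency" is the delay before the first token and that "Speed" is the sustained generation rate afterwards; the page does not state either definition, so treat both as assumptions. The `Sample` class and the function names are hypothetical helpers, and the three rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    model: str
    speed_tps: float  # tokens per second, from the "Speed" column
    latency_s: float  # seconds, from the "Latency" column


def estimate_total_seconds(sample: Sample, n_tokens: int) -> float:
    """Estimate wall-clock time for n_tokens of output.

    Assumption: total = latency + n_tokens / speed, i.e. latency is
    time to first token and speed is constant after that.
    """
    if sample.speed_tps <= 0:
        raise ValueError("speed must be positive")
    return sample.latency_s + n_tokens / sample.speed_tps


def rank_by_total_time(samples, n_tokens):
    """Return samples sorted from fastest to slowest estimated total time."""
    return sorted(samples, key=lambda s: estimate_total_seconds(s, n_tokens))


# Rows taken from the table above (Nov 8 and Nov 6 entries).
SAMPLES = [
    Sample("openai/gpt-oss-120b", 149.96, 18.38),
    Sample("nvidia/llama-3.3-nemotron-super-49b-v1.5", 79.68, 0.28),
    Sample("deepseek-ai/deepseek-v3.1", 20.43, 5.37),
]

if __name__ == "__main__":
    for n_tokens in (500, 5000):
        print(f"Estimated time for {n_tokens} output tokens:")
        for s in rank_by_total_time(SAMPLES, n_tokens):
            print(f"  {s.model}: {estimate_total_seconds(s, n_tokens):.2f} s")
```

Under these assumptions, a model with high throughput but long latency only comes out ahead on long outputs, so the two columns are best read together rather than ranked separately.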