
NVIDIA H200 141GB

NVIDIA Data Center • 141GB HBM3e • 4.8 TB/s • 67 TFLOPS FP32
$35,000 - $50,000 • 3.77% • Pre-order
TDP: 700W
VRAM: 141GB HBM3e
Bandwidth: 4.8 TB/s
Compute: 67 TFLOPS FP32 / 3,958 TFLOPS FP8
Released: Q2 2024
Availability: Pre-order

Description

The H200 upgrades the H100 with 141GB of HBM3e memory and 4.8 TB/s of memory bandwidth, delivering up to 1.8x faster large language model inference than the H100.
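
To see why capacity and bandwidth matter for LLM inference, here is a rough back-of-the-envelope sketch in Python. It assumes token-by-token decoding is memory-bound, meaning every weight is read from VRAM once per generated token, and it ignores KV cache, activations, batching, and multi-GPU setups. The function names and the 70B-parameter model are hypothetical and chosen only for illustration; the 141GB and 4.8 TB/s defaults come from the specs listed here.

```python
def fits_in_memory(params_billions: float, bytes_per_param: float,
                   capacity_gb: float = 141.0) -> bool:
    """Check whether the model weights alone fit in VRAM (1 GB = 1e9 bytes).

    Assumption: only weights are counted; KV cache and activations are ignored.
    """
    weights_gb = params_billions * bytes_per_param
    return weights_gb <= capacity_gb


def min_seconds_per_token(params_billions: float, bytes_per_param: float,
                          bandwidth_tb_s: float = 4.8) -> float:
    """Lower bound on decode time per token for a memory-bound workload.

    Assumption: each generated token requires reading all weights once,
    so time >= model_bytes / memory_bandwidth.
    """
    model_bytes = params_billions * 1e9 * bytes_per_param
    return model_bytes / (bandwidth_tb_s * 1e12)


if __name__ == "__main__":
    # Hypothetical 70B-parameter model stored in FP16 (2 bytes per parameter).
    print(fits_in_memory(70, 2))
    print(f"{min_seconds_per_token(70, 2) * 1000:.1f} ms/token (lower bound)")
    # A 175B-parameter model at 1 byte per parameter exceeds a single card.
    print(fits_in_memory(175, 1))
```

Under these assumptions, the bound scales inversely with bandwidth, which is one reason a bandwidth increase tends to translate into faster inference. Real throughput depends on batch size, kernel efficiency, and interconnect, so treat the result as a floor rather than a prediction.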

Benchmarks

LLM Inference: 1.8x vs H100
HPC GEMS: 47.8 TFLOPS
GPT-3 175B: 0.9 sec/token
Memory Bandwidth: 4.8 TB/s