CoreWeave, the AI Hyperscaler™, has announced its latest MLPerf v5.0 results, which set a new industry benchmark for AI inference. Using NVIDIA GB200 Grace Blackwell Superchips, CoreWeave achieved 800 tokens per second (TPS) on the open-source Llama 3.1 405B model, setting a high bar for large-model inference performance in the cloud. This milestone reinforces CoreWeave’s role as a top-tier cloud infrastructure provider, purpose-built to meet the demands of cutting-edge AI workloads.