Oracle and AMD today announced a major expansion of their partnership: the new AMD Instinct™ MI355X GPUs will soon be available on Oracle Cloud Infrastructure (OCI). The rollout brings massive compute power—up to 131,072 MI355X GPUs—to customers building, training, and running inference on large-scale AI models.
With more than 2X better price-performance than the previous generation, Oracle and AMD aim to redefine how enterprises scale AI workloads in the cloud.
1. Powering the Next Wave of Agentic AI Applications
The collaboration will deliver zettascale AI Superclusters on OCI, optimized for:
- Training massive frontier models
- Fine-tuning and inference at scale
- Running agentic AI applications with fast time-to-first-token (TTFT) and high tokens-per-second throughput
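TTFT and tokens-per-second matter together because they govern end-to-end response time for streamed replies. As a rough illustration (a simplified model with hypothetical numbers; real serving adds batching and scheduling effects), total latency is roughly TTFT plus the remaining tokens divided by decode throughput:

```python
def response_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Approximate end-to-end latency for a streamed LLM reply.

    ttft_s: time to first token (prefill latency), in seconds
    tokens: total tokens generated
    tokens_per_s: sustained decode throughput
    """
    # First token arrives after TTFT; the rest stream at the decode rate.
    return ttft_s + max(tokens - 1, 0) / tokens_per_s

# Illustrative numbers: 0.3 s TTFT, a 500-token reply at 100 tokens/s
print(f"{response_time(0.3, 500, 100):.2f} s")  # 5.29 s
```

The example shows why agentic workloads, which chain many short model calls, are especially sensitive to TTFT: for brief replies the prefill term dominates the total.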
“We’re delivering the broadest AI infrastructure offerings to meet the growing needs of AI workloads,” said Mahesh Thiagarajan, EVP at Oracle Cloud Infrastructure.
2. AMD Instinct MI355X: Performance, Efficiency, Scale
The new MI355X GPUs deliver:
- 2.8X higher throughput than prior-generation GPUs
- 288 GB of HBM3 with up to 8 TB/s memory bandwidth
- FP4 support for ultra-efficient, high-speed inference
- Liquid cooling for 64 GPUs per rack at 125 kW density
- Built-in support for agentic and LLM workloads
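A back-of-envelope sketch shows what 288 GB of on-GPU memory means for low-precision inference (illustrative only: the parameter count is an assumption, and the estimate counts weights alone, ignoring KV cache and activation overhead):

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate GPU memory needed to hold model weights alone."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB, matching vendor spec units

# A hypothetical 405B-parameter model at FP4 vs. FP16:
print(weight_footprint_gb(405, 4))   # 202.5 GB -> fits within 288 GB
print(weight_footprint_gb(405, 16))  # 810.0 GB -> needs multiple GPUs
```

Under these assumptions, FP4 quantization is what lets a model of this size reside in a single GPU's memory rather than being sharded across several devices.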
“Our collaboration with Oracle helps customers scale AI cost-effectively with open solutions,” said Forrest Norrod, EVP and GM, Data Center Solutions Group, AMD.
3. Built for Production-Scale AI
The upcoming OCI shapes powered by AMD Instinct MI355X are designed for:
- Large-scale training of foundation and open-source models
- Enterprise-grade inference with ultra-low latency
- Flexible cloud deployment with no vendor lock-in, enabled by the open AMD ROCm software stack
Coupled with Oracle’s RDMA-based cluster network and high-performance storage, these solutions will make it easier to deploy, scale, and manage demanding AI workloads.
4. Powerful Compute + Smart Networking
Each Supercluster is backed by:
- A high-frequency AMD Turin CPU head node with up to 3 TB system memory
- AMD Pollara™ AI NICs, offering advanced RoCE, congestion control, and a UEC-compliant Ethernet fabric
Oracle is the first cloud provider to deploy AMD Pollara, unlocking ultra-low-latency backend networks for massive GPU clusters.
5. Unlocking Innovation at Every Layer
Key capabilities for OCI customers include:
- Faster LLM training with FP4 compute
- Large model inference entirely in memory
- Seamless orchestration for multi-GPU jobs
- Open-source compatibility with ROCm libraries, compilers, and runtimes
- Fully programmable networking with industry-standard open fabrics
6. Expanding AI Choice & Flexibility in the Cloud
Oracle’s latest move emphasizes its commitment to offering:
- Broad choice of AI hardware
- Open, developer-friendly environments
- High-efficiency, high-throughput compute for both training and inference
This announcement complements OCI’s broader AI strategy, which includes offerings with NVIDIA, Ampere, Intel, and now next-gen AMD Instinct MI355X GPUs.
A New Era of Scalable, Cost-Effective AI in the Cloud
With up to 131,072 MI355X GPUs powering OCI’s new zettascale AI infrastructure, Oracle and AMD are setting a new benchmark for performance, efficiency, and flexibility in the cloud.
Power Tomorrow’s Intelligence — Build It with TechEdgeAI.