Akamai, a leader in cybersecurity and cloud computing, has introduced Akamai Cloud Inference, a solution designed to accelerate AI inference by moving workloads closer to end users. Running on Akamai Cloud, which the company describes as the world's most distributed platform, the service aims to overcome the limitations of centralized cloud models, delivering up to 2.5x lower latency and 3x better throughput.
“While LLM training will continue in hyperscale data centers, real-time inference must happen at the edge,” said Adam Karon, COO at Akamai. “Our globally distributed platform uniquely positions us to lead this shift.”
Features of Akamai Cloud Inference
1. AI-Optimized Compute at Scale
Akamai Cloud provides a versatile compute stack to handle diverse AI inference challenges:
- CPUs, GPUs, and ASIC VPUs – Balancing performance and efficiency for various workloads.
- NVIDIA AI Enterprise Integration – Optimizing inference with Triton, TAO Toolkit, TensorRT, and NVFlare.
- Cost Savings – Businesses can save up to 86% on AI inference compared to hyperscaler infrastructure.
2. AI-Ready Data Management
Akamai’s data fabric ensures low-latency access to real-time data for inference:
- VAST Data partnership – Enables rapid data retrieval for faster AI decision-making.
- Highly scalable object storage – Stores large volumes of AI training artifacts and fine-tuned model data.
- Vector database integrations – Partners with Aiven and Milvus for Retrieval-Augmented Generation (RAG); a retrieval sketch follows this list.
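Akamai has not published code for these integrations, but the retrieval step of a RAG flow against a Milvus vector database typically looks like the minimal Python sketch below. The endpoint URI, collection name, 768-dimension vectors, and sample data are illustrative assumptions, not Akamai-provided values.

```python
# Minimal RAG retrieval sketch against a Milvus vector database (pymilvus).
# The URI, collection name, embedding dimension, and sample data are
# illustrative assumptions, not values from Akamai's announcement.
from pymilvus import MilvusClient

client = MilvusClient(uri="http://milvus.example.internal:19530")  # placeholder endpoint

if not client.has_collection("docs_rag"):
    client.create_collection(collection_name="docs_rag", dimension=768)

# Index document chunks alongside their pre-computed embeddings.
client.insert(
    collection_name="docs_rag",
    data=[{"id": 1, "vector": [0.1] * 768, "text": "Example document chunk."}],
)

# At inference time, fetch the nearest chunks for the query embedding and
# pass them to the model as grounding context.
query_embedding = [0.1] * 768  # in practice, produced by an embedding model
hits = client.search(
    collection_name="docs_rag",
    data=[query_embedding],
    limit=3,
    output_fields=["text"],
)
context = [hit["entity"]["text"] for hit in hits[0]]
```

Keeping this retrieval step close to where the model runs is what makes the low-latency data access described above matter for inference.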
3. Containerized AI Deployment
Akamai leverages Kubernetes orchestration for scalable and cost-effective AI deployment:
- LKE-Enterprise – A new enterprise-grade edition of the Linode Kubernetes Engine (LKE).
- Support for KServe, Kubeflow, and SpinKube – Streamlining AI model deployment and inference (a deployment sketch follows this list).
- Petabyte-scale performance – Optimized for large-scale AI workloads.
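The announcement does not include deployment examples, but the sketch below shows one common way a KServe InferenceService is registered on a Kubernetes cluster such as LKE, using the official kubernetes Python client. The model name, namespace, and storageUri are hypothetical placeholders, not Akamai-specific values.

```python
# Sketch: creating a KServe InferenceService on a Kubernetes cluster (e.g. LKE)
# via the official kubernetes Python client. Names, namespace, and storageUri
# are placeholders; real clusters also need credentials for the model store.
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig pointing at the target cluster

inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "sentiment-model", "namespace": "default"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "sklearn"},
                # Placeholder object-storage path to the trained model artifact.
                "storageUri": "s3://models-bucket/sentiment/v1",
            }
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.kserve.io",
    version="v1beta1",
    namespace="default",
    plural="inferenceservices",
    body=inference_service,
)
```

Once the resource is applied, KServe provisions the serving pods and exposes an HTTP endpoint for inference requests.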
4. Edge AI Compute for Real-Time Processing
Akamai brings AI inference closer to users through edge computing:
- WebAssembly (Wasm) integration – Enables serverless AI inference via lightweight execution at the edge.
- Fermyon partnership – Simplifies AI-powered application development.
- Global scale – Akamai’s 4,200+ PoPs across 130+ countries ensure AI inference can happen anywhere.
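From an application's point of view, edge inference is simply a low-latency HTTP call. The hostname and response schema below are hypothetical, but they illustrate the pattern: a single global hostname stands in for whatever DNS or anycast routing sends each request to the nearest point of presence.

```python
# Sketch: a client calling an edge-deployed inference endpoint.
# The hostname and JSON fields are hypothetical; the single global hostname
# represents the routing layer that picks the nearest PoP for each request.
import requests

ENDPOINT = "https://inference.example-edge.net/v1/classify"  # placeholder URL

response = requests.post(
    ENDPOINT,
    json={"text": "The delivery arrived a day late but support resolved it quickly."},
    timeout=2,  # a tight budget is realistic when the serving PoP is nearby
)
response.raise_for_status()
print(response.json())  # e.g. {"label": "neutral", "score": 0.74} (illustrative)
```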
The Shift from Training to Inference
While LLMs require massive compute resources for training, enterprises are now prioritizing real-time AI inference for practical business applications. Full-scale LLMs are costly to run and best suited to broad, general-purpose tasks, whereas lightweight AI models:
- Are industry-specific and optimized for real-world use cases.
- Can be deployed efficiently at the edge for faster decision-making.
- Offer a higher ROI by focusing on business-driven AI adoption.
“Training an LLM is like creating a map—it takes time and resources. Inference, however, is like using GPS, providing instant, real-time insights,” said Karon. “Inference is the next frontier for AI.”
Edge AI: Powering Real-World AI Applications
Akamai Cloud Inference is already transforming AI applications across industries:
- In-car voice assistants – Enabling real-time, AI-powered driver interactions.
- AI-powered crop management – Optimizing agriculture with edge AI insights.
- Retail image optimization – Enhancing product visualization and recommendation engines.
- Virtual garment try-ons – Providing AI-driven shopping experiences.
- Automated product descriptions – Generating e-commerce-ready content in seconds.
- Customer sentiment analysis – Delivering real-time AI-driven feedback insights.
Akamai Cloud Inference redefines how inference is delivered by moving workloads closer to users, reducing costs, improving speed, and enabling real-time AI applications. As AI adoption shifts from training to inference, Akamai's globally distributed cloud lets businesses deploy scalable, efficient, low-latency AI solutions at the edge.