GigaIO, a leader in scalable edge-to-core AI platforms, has announced the next phase of its strategic partnership with d-Matrix. The collaboration aims to deliver what the companies describe as the world's largest and most scalable AI inference solution, built for enterprises deploying AI at scale. By integrating d-Matrix's Corsair inference platform into GigaIO's SuperNODE architecture, the partners aim to remove the complexity and bottlenecks traditionally associated with large-scale AI inference deployments.
The Power of the GigaIO SuperNODE and d-Matrix Corsair Integration
1. Unmatched Scalability and Performance
At the heart of the joint solution is the GigaIO SuperNODE platform, which can host dozens of d-Matrix Corsair accelerators within a single node. Keeping all accelerators in one node lets enterprises run ultra-low-latency batched inference workloads without the complex multi-node configurations that have traditionally bottlenecked large-scale AI deployments.
2. Exceptional Inference Capabilities
The integrated platform offers industry-leading performance metrics, enabling enterprises to achieve remarkable results in AI inference, including:
- Throughput of 30,000 tokens per second at just 2 milliseconds per token on large models such as Llama 3 70B (a quick sanity check of these figures follows this list).
- Up to 10x faster interactive speeds compared to GPU-based solutions.
- 3x better performance at a similar total cost of ownership.
- 3x greater energy efficiency, contributing to more sustainable AI deployments.
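To put the two headline numbers in context, the sketch below shows how they relate: if each user stream receives a token every 2 milliseconds, a single stream produces 500 tokens per second, so an aggregate of 30,000 tokens per second implies on the order of 60 concurrent streams per node. The concurrency figure is our inference from the quoted numbers, not a published specification.

```python
# Back-of-envelope check of the quoted figures (our arithmetic, not vendor data).
per_token_latency_s = 0.002          # 2 ms per token (quoted per-user latency)
aggregate_tokens_per_s = 30_000      # quoted node-level throughput

tokens_per_s_per_stream = 1 / per_token_latency_s                  # 500 tokens/s
implied_concurrent_streams = aggregate_tokens_per_s / tokens_per_s_per_stream

print(f"Per-stream rate:     {tokens_per_s_per_stream:.0f} tokens/s")
print(f"Implied concurrency: {implied_concurrent_streams:.0f} streams")  # ~60
```

In other words, the 2 ms figure describes the latency each individual user sees, while the 30,000 tokens per second describes the node's total output across all concurrently batched requests.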
3. Simplified Deployment and Enhanced Efficiency
One standout feature of the collaboration is the simplified deployment process. By eliminating complex multi-node configurations, the GigaIO SuperNODE lets businesses scale their AI workloads quickly without compromising performance. The solution is designed to optimize both operational efficiency and total cost of ownership (TCO), making it easier for enterprises to deploy cutting-edge AI while achieving significant cost savings.
Technological Integration: Driving Innovation in AI
The partnership leverages GigaIO's PCIe Gen 5 AI fabric to provide near-zero-latency communication among the Corsair accelerators. This architecture makes full use of d-Matrix's Digital In-Memory Compute (DIMC) technology, which delivers 150 TB/s of memory bandwidth, removing the bottlenecks traditionally seen in distributed inference workloads and enabling faster, more efficient AI computation.
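As a rough illustration of why that bandwidth figure matters: during autoregressive decoding, each generated token requires reading roughly all of the model's weights from memory, so memory bandwidth sets a floor on per-token latency. The hedged sketch below estimates that floor for a 70B-parameter model; the one-byte-per-parameter precision and the assumption that weight traffic dominates are our illustrative choices, not d-Matrix specifications.

```python
# Hypothetical roofline estimate: per-token decode latency is bounded below by
# (bytes of weights read per token) / (memory bandwidth). The precision below
# is an assumption made for illustration only.
params = 70e9                   # Llama 3 70B parameter count
bytes_per_param = 1.0           # assume ~8-bit weights (illustrative)
bandwidth_bytes_per_s = 150e12  # 150 TB/s aggregate DIMC bandwidth (quoted)

weight_bytes_per_token = params * bytes_per_param            # ~70 GB per token
latency_floor_s = weight_bytes_per_token / bandwidth_bytes_per_s

print(f"Bandwidth-bound latency floor: {latency_floor_s * 1e3:.2f} ms/token")
# ~0.47 ms/token, comfortably under the quoted 2 ms/token once scheduling,
# KV-cache traffic, and inter-accelerator communication are added on top.
```

Note that batched decoding amortizes each weight read across all concurrent streams, which is how the node's aggregate throughput can far exceed what a single stream's bandwidth-bound rate would suggest.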
Industry Recognition and Performance Validation
The partnership between GigaIO and d-Matrix has already drawn significant industry attention: GigaIO recently posted the highest single-node tokens-per-second result in the MLPerf Inference: Datacenter benchmark database. The milestone further validates GigaIO's leadership in scale-up AI infrastructure and showcases the performance of the new integrated solution.
“The market has been demanding more efficient, scalable solutions for AI inference workloads that don’t compromise performance,” said Alan Benjamin, CEO of GigaIO. “Our partnership with d-Matrix brings together tremendous engineering innovation, resulting in a solution that redefines what’s possible for enterprise AI deployment.”
A New Era for Enterprise AI Inference
The strategic partnership between GigaIO and d-Matrix represents a significant leap forward for enterprise AI inference. By combining GigaIO's scalable AI infrastructure with d-Matrix's inference acceleration technology, the two companies have created a solution that delivers strong performance, energy efficiency, and cost-effectiveness. The collaboration promises to make enterprise-scale generative AI not only commercially viable but also accessible to organizations of all sizes.