Penguin Solutions, Inc. has announced a significant expansion of its ICE ClusterWare™ software platform (formerly Scyld ClusterWare®). The latest update introduces multi-tenancy support, enhanced orchestration controls, and streamlined workflows, helping enterprises build fully optimized AI ecosystems, which the company refers to as Intelligent Compute Environments.
Additionally, Penguin Solutions has launched the ICE ClusterWare AIM™ service, an optimization solution that uses predictive automation to maximize AI infrastructure performance. The service is designed to keep AI workloads running at peak efficiency while reducing operational complexity.
Enhancements in ICE ClusterWare Software
Penguin Solutions has upgraded ICE ClusterWare with new capabilities to facilitate scalable and efficient AI, HPC, and data infrastructure management. The latest updates include:
- Multi-Tenancy Foundational Support
  - Ensures secure and logical resource isolation.
  - Enables dynamic workload partitioning for multiple users and departments.
- Enhanced Orchestration Controls
  - Introduces a unified control plane to manage sub-clusters and associated infrastructure efficiently.
- Streamlined Workflows
  - Automates policy enforcement for AI infrastructure deployment.
  - Optimizes job scheduling for improved operational efficiency.
ICE ClusterWare AIM Service: Advanced AI Optimization
The ICE ClusterWare AIM™ service uses automation and operational intelligence to optimize large-scale AI and HPC clusters. Drawing on more than two billion hours of managed GPU runtime, Penguin Solutions introduces three core capabilities:
- Automated Remediation
  - Identifies and resolves system inefficiencies before they impact AI workloads.
- Prescriptive Maintenance
  - Uses patent-pending technology to detect and mitigate failures before they occur, minimizing downtime.
- Operational Efficiency at Scale
  - Optimizes workload distribution to maximize GPU utilization.
  - Works with common scheduling software to improve ROI on AI infrastructure investments.
Industry Impact and Expertise
Industry experts emphasize the importance of automation and optimization in AI infrastructure. According to Ashish Nadkarni, IDC Group Vice President, AI infrastructure challenges often leave GPU clusters underutilized, operating at just 50% of their potential value. ICE ClusterWare software and the ICE ClusterWare AIM service provide an integrated and automated approach to scaling AI environments efficiently.
Penguin Solutions’ ICE ClusterWare software and ICE ClusterWare AIM service are critical components of the OriginAI® Infrastructure Solution. With over 25 years of expertise and more than 85,000 GPUs deployed and managed, Penguin Solutions empowers enterprises to seamlessly scale AI infrastructure while ensuring cost-effectiveness, operational efficiency, and high performance.