AMD and Aligned Help USC ISI Build a CUDA-Free LLM Giant—Meet MEGALODON
In a bold move away from the NVIDIA-dominated AI training landscape, Aligned, AMD, and the University of Southern California’s Information Sciences Institute (USC ISI) have joined forces to train MEGALODON, a next-generation large language model built for scale—and built without CUDA.
At the heart of the collaboration is a shared mission: to prove that large-scale LLMs don’t have to run on NVIDIA GPUs. Instead, the trio is leveraging AMD’s Instinct™ MI300 GPU family and its open-source ROCm™ software stack, with Aligned providing critical AI ops support to make the training process both efficient and hardware-agnostic.
Why MEGALODON Matters: Architecture Meets Scale
The MEGALODON project isn’t just another LLM clone. It’s USC ISI’s flagship model featuring a Moving Average Equipped Gated Attention (MEGA) architecture—a novel design that promises to retain long-context information more effectively while reducing computational load.
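The moving-average component at the core of MEGA can be illustrated with a toy sketch. This is illustrative only (the function name and simplifications are ours): the actual architecture applies a multi-dimensional damped EMA to embeddings before a gated single-head attention step, but the basic idea of summarizing history with geometrically decaying weights looks like this:

```python
def ema(values, alpha=0.3):
    """Exponential moving average: h_t = alpha * x_t + (1 - alpha) * h_{t-1}.

    Older inputs decay geometrically, so each output summarizes the entire
    history with O(1) state per step, rather than attending over every
    previous position.
    """
    smoothed = []
    h = 0.0
    for x in values:
        h = alpha * x + (1.0 - alpha) * h
        smoothed.append(h)
    return smoothed

# Example: a spike at position 0 decays smoothly across later positions.
print(ema([1.0, 0.0, 0.0, 0.0]))  # [0.3, 0.21, 0.147, 0.1029]
```

That constant-size running state is what makes moving-average-style mixing attractive for long contexts: the cost per token stays flat as the sequence grows.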
As AI researchers and institutions scramble to train ever-larger models, long-context handling is emerging as a core challenge. MEGALODON’s approach could represent a major leap forward for tasks like long-form reasoning, document summarization, and multimodal memory retention.
“MEGALODON represents a leap forward in architecture and scalability,” said Jonathan May, research associate professor at USC’s Viterbi School of Engineering. “This collaboration brings the computational power and infrastructure expertise needed to realize its full potential.”
Breaking the CUDA Mold: AMD + Aligned in Action
Until now, most large LLMs—from GPT-style models to open-source variants—have been trained on NVIDIA GPUs using the CUDA software stack. MEGALODON aims to change that narrative.
With AMD Instinct MI325X GPUs and the ROCm™ platform, USC ISI researchers will tap into powerful hardware alternatives—without being locked into a single ecosystem. Aligned’s role is pivotal: providing system orchestration, performance tuning, and operational support to ensure smooth, large-scale training runs on AMD silicon.
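One reason this kind of portability is practical today: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface that NVIDIA builds use, so much training code needs no vendor-specific branching at all. A minimal sketch of device-agnostic selection (our own illustration, not the project's actual code) might look like:

```python
import torch

def pick_device() -> torch.device:
    # On ROCm builds of PyTorch, torch.cuda.is_available() reports AMD GPUs,
    # so the same "cuda" device string covers both vendors; otherwise we
    # fall back to CPU.
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(2, 8, device=device)  # toy tensor on whichever device was found
```

Code written this way runs unchanged on an MI325X cluster or an NVIDIA box, which is exactly the ecosystem-neutrality the collaboration is betting on.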
“Our mission is to enable powerful AI on any hardware,” said Chris Ensey, CEO of Aligned. “This partnership exemplifies that vision—scalable AI, beyond proprietary boundaries.”
High Stakes, High Performance
The project also benefits from USC ISI’s broader network of high-stakes partnerships with DARPA, NSF, Lockheed Martin, and Chevron. Backed by NSF grants and supported by USC’s supercomputing infrastructure, MEGALODON is currently being trained on advanced clusters with future expansion planned on MI355X GPUs.
Additional infrastructure includes:
- Supercomputing Clusters for parallel model training
- High-throughput Storage Systems for large data ingestion
- Custom Chip Prototyping Labs such as MOSIS for potential hardware-software co-design research
Key Collaboration Highlights:
- New LLM Architecture: MEGALODON uses MEGA attention for long-context efficiency
- Non-NVIDIA Training: One of the first major LLMs trained exclusively on AMD MI325X + ROCm
- Public + Private Backing: Supported by NSF, with ties to federal agencies and private enterprise
- Aligned’s AI Ops Expertise: Ensuring scalable training on AMD hardware without CUDA dependencies
Implications: Beyond NVIDIA, Toward AI Portability
As the AI community explores ways to reduce hardware lock-in, the MEGALODON project could prove pivotal. If successful, it would validate AMD GPUs at scale for training state-of-the-art LLMs—opening the door to broader innovation, price competition, and platform diversity in AI development.
In an industry obsessed with size and speed, MEGALODON may soon be known for something rarer: flexibility.
Power Tomorrow’s Intelligence — Build It with TechEdgeAI.