ST. LOUIS (SC25), November 17, 2025 — As enterprises accelerate adoption of large language models (LLMs), generative AI, and real-time inference applications, a new bottleneck has emerged: memory scale, bandwidth, and latency. XConn Technologies (XConn), a leader in next-generation interconnect solutions for high-performance computing and AI infrastructure, and MemVerge®, the leader in Big Memory software, today announced a joint demonstration of a Compute Express Link® (CXL®) memory pool designed to break through the AI memory wall. The live demo will take place at Supercomputing 2025 (SC25) in St. Louis, November 16–21, 2025, at booth #817, stations 2 and 8.
Academic and industry analysts agree that memory bandwidth growth has lagged far behind compute performance. While server FLOPS have surged, DRAM and interconnect bandwidth have scaled much more slowly, making memory the dominant bottleneck for many AI inference workloads. Experts warn that AI growth is already hitting a memory wall, forcing memory and interconnect architectures to evolve rapidly. The memory-intensive nature of retrieval-augmented generation, vector search, agentic AI, and large language model inference is pushing traditional DDR- and HBM-based server architectures to their limits, creating both performance and TCO challenges.
“As AI workloads and model sizes explode, the limiting factor is no longer just GPU count; it’s how much memory can be shared, how fast it can be accessed, and how cost-efficiently it can scale,” said Gerry Fan, CEO of XConn Technologies. “Our collaboration with MemVerge demonstrates that CXL memory pooling at 100 TiB and beyond is production-ready, not theoretical. This is the architecture that makes large-scale AI inference truly feasible.”
To address these challenges, XConn and MemVerge are demonstrating a rack-scale CXL memory pooling solution built around XConn’s Apollo hybrid CXL/PCIe switch and MemVerge’s GISMO technology, optimized for NVIDIA’s Dynamo architecture and NIXL software stack. The demo showcases how AI inference workloads can dynamically offload and share massive KV cache resources across GPUs and CPUs, achieving greater than 5× performance improvements over SSD-based caching or RDMA-based KV cache offloading, while reducing total cost of ownership. In particular, the demo shows a scalable memory architecture for AI inference workloads that disaggregate the prefill and decode stages, a pattern sketched below.
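To make the disaggregation pattern concrete, here is a minimal Python sketch of how a prefill worker might publish a KV cache into a shared memory pool and a decode worker might retrieve it. It is illustrative only: the CXLMemoryPool class and its put/get methods are hypothetical stand-ins for a pooled-memory API, not the actual GISMO or NIXL interfaces used in the demo.

```python
# Illustrative sketch only: CXLMemoryPool is a hypothetical stand-in for a
# shared-memory API, not the actual GISMO or NIXL interfaces shown at SC25.

from dataclasses import dataclass, field


@dataclass
class CXLMemoryPool:
    """Models a rack-scale memory pool shared by prefill and decode workers."""
    capacity_bytes: int
    _segments: dict = field(default_factory=dict)

    def put(self, key: str, payload: bytes) -> None:
        # In a real system this would place data into switch-attached CXL
        # memory visible to every host in the rack, ideally zero-copy.
        used = sum(len(v) for v in self._segments.values())
        if used + len(payload) > self.capacity_bytes:
            raise MemoryError("pool exhausted")
        self._segments[key] = payload

    def get(self, key: str) -> bytes:
        # Decode workers read the same pooled memory; no SSD round-trip or
        # RDMA transfer is needed to move the KV cache between stages.
        return self._segments[key]


def prefill(pool: CXLMemoryPool, request_id: str, prompt: str) -> None:
    kv_cache = prompt.encode()  # placeholder for real attention K/V tensors
    pool.put(f"kv/{request_id}", kv_cache)


def decode(pool: CXLMemoryPool, request_id: str) -> bytes:
    return pool.get(f"kv/{request_id}")  # resume generation from shared cache


pool = CXLMemoryPool(capacity_bytes=100 * 2**40)  # 100 TiB, as in the demo
prefill(pool, "req-1", "Explain the AI memory wall.")
kv = decode(pool, "req-1")
```

The essential point is that prefill and decode can run on different hosts yet exchange the cache by reference to the same pooled memory rather than by copy, which is where the claimed advantage over SSD- and RDMA-based offload comes from.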
“Memory has become the new frontier of AI infrastructure innovation,” said Charles Fan, CEO and co-founder of MemVerge. “By using MemVerge GISMO with XConn’s Apollo switch, we’re showcasing software-defined, elastic CXL memory that delivers the performance and flexibility needed to power the next wave of agentic AI and hyperscale inference. Together, we’re redefining how memory is provisioned and utilized in AI data centers.”
As AI becomes increasingly data-centric and memory-bound rather than compute-bound, traditional server architectures can no longer keep up. CXL memory pooling addresses these limitations by enabling dynamic, low-latency memory sharing across CPUs, GPUs, and accelerators. It scales to hundreds of terabytes of shared memory, reduces TCO through better utilization and less over-provisioning, and improves throughput for inference-first workloads, generative AI, real-time analytics, and in-memory databases.
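To see why pooling improves utilization, consider a back-of-the-envelope comparison; the server count, capacities, and headroom factor below are illustrative assumptions, not figures from the demo.

```python
# Hypothetical utilization comparison; all figures are illustrative
# assumptions, not benchmark results from the SC25 demo.

servers = 16
peak_demand_tib = 24   # worst-case memory any one server may need
avg_demand_tib = 6     # typical per-server demand

# Without pooling, every server must be provisioned for its own peak.
dedicated_total = servers * peak_demand_tib                  # 384 TiB
dedicated_util = servers * avg_demand_tib / dedicated_total  # 25%

# With pooling, capacity is sized for the aggregate plus headroom,
# since per-server peaks rarely coincide across the whole rack.
pooled_total = servers * avg_demand_tib * 1.5                # 144 TiB
pooled_util = servers * avg_demand_tib / pooled_total        # ~67%

print(f"dedicated: {dedicated_total} TiB at {dedicated_util:.0%} utilization")
print(f"pooled:    {pooled_total:.0f} TiB at {pooled_util:.0%} utilization")
```

Under these assumed numbers, the pool serves the same workloads with far less provisioned capacity at much higher utilization, which is the core of the TCO argument.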
SC25 attendees can experience the joint demo featuring a CXL memory pool dynamically shared across CPUs and GPUs, with inference benchmarks illustrating significant performance and efficiency gains for KV cache offload and AI model execution. For more details about SC25 and to register, visit https://sc25.supercomputing.org.
About XConn Technologies
XConn Technologies Holdings, Inc. (XConn) is the innovation leader in next-generation interconnect technology for high-performance computing and AI applications. The company is the industry’s first to deliver a hybrid switch supporting both CXL and PCIe on a single chip. Privately funded, XConn is setting the benchmark for data center interconnect with scalability, flexibility, and performance. For more information, visit xconn-tech.com.
About MemVerge
MemVerge is a leading provider of AI memory software. MemVerge solutions help enterprises stand up long-term memory for their agentic AI initiatives and help AI data centers improve performance and efficiency by expanding and sharing memory between GPUs. For more information about MemVerge software, please visit memverge.ai.
