AIC and ScaleFlux Deliver Optimized Hardware Platform for AI Context Memory Storage with BlueField-4

Joint solution addresses growing KV-cache and long-context inference workloads with a purpose-built CMX infrastructure tier.

MILPITAS, Calif., March 13, 2026 /PRNewswire/ -- As AI models evolve to support longer prompts, multi-turn conversations, and autonomous agents, the amount of memory required to store inference context—particularly the model's key-value (KV) cache—has expanded dramatically. In many deployments, the KV cache required for long-context workloads exceeds the available GPU and system memory, creating a new bottleneck for AI infrastructure operators.
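To give a sense of scale, the following is a rough back-of-the-envelope sketch of KV-cache size using the standard per-token estimate (one key and one value vector per layer and per KV head). The model shape used here (80 layers, 8 KV heads, head dimension 128, 2-byte values) and the session count are illustrative assumptions, not figures from AIC, ScaleFlux, or NVIDIA.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, bytes_per_value, num_tokens):
    """Approximate bytes needed to hold keys and values for num_tokens of context."""
    # Factor of 2: one key vector and one value vector per layer per KV head.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * num_tokens


GIB = 1024 ** 3

# Hypothetical model shape and context length, for illustration only.
session = kv_cache_bytes(
    num_layers=80,
    num_kv_heads=8,
    head_dim=128,
    bytes_per_value=2,       # 16-bit values
    num_tokens=128 * 1024,   # one 128K-token context
)

print(f"one 128K-token session: {session / GIB:.0f} GiB")       # 40 GiB
print(f"32 concurrent sessions: {32 * session / GIB:.0f} GiB")  # 1280 GiB
```

Under these assumptions, a single long-context session already consumes tens of gibibytes, and a few dozen concurrent sessions exceed a terabyte, which is the kind of footprint that can outgrow GPU and system memory.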

ScaleFlux and AIC are delivering a joint hardware platform designed to accelerate emerging Inference Context Memory Storage (introduced as ICMS and now referred to as CMX) deployments for large-scale AI inference infrastructure. CMX architectures address the challenge of AI agent workloads by introducing a high-performance storage layer that can hold and serve large context datasets outside GPU memory while maintaining the low latency required for inference operations.

By combining the AIC F2032-G6 JBOF Storage System with ScaleFlux NVMe SSDs and NVIDIA's latest data-center networking technologies—including the NVIDIA BlueField‑4 DPU and NVIDIA ConnectX‑9 SuperNIC—the companies are delivering a purpose-built hardware platform optimized for the rapidly growing context memory storage tier in modern AI clusters.

The AIC F2032-G6 JBOF platform provides an ideal foundation for this new infrastructure tier. Designed as a high-density NVMe storage system, the platform integrates BlueField-4 DPUs and/or ConnectX-9 SuperNICs to deliver high-throughput, low-latency connectivity between GPU servers and shared context memory storage.

When populated with ScaleFlux NVMe SSDs, the system delivers a powerful and efficient hardware configuration for CMX deployments. ScaleFlux SSD technology is designed to sustain the high-IOPS, low-latency data access patterns typical of KV-cache workloads while improving storage efficiency and overall system utilization. Together, these components are designed to reduce the crucial "time to first token" and the time GPUs spend idle waiting for data. Lower wait times translate to higher GPU utilization and greater ROI from those multi-million (or billion!) dollar investments in AI infrastructure.

"AI inference is rapidly shifting from stateless queries to persistent, long-context interactions," said Michael Liang, CEO at AIC. "Our new F2032-G6 platform, combined with BlueField-4 and ConnectX-9 networking, provides the high-performance storage architecture needed to support context memory storage at scale."

"Context memory is emerging as a new data tier in AI infrastructure," said Hao Zhong, CEO and Co-Founder at ScaleFlux. "By pairing ScaleFlux NVMe SSDs with AIC's high-density JBOF platform and NVIDIA's advanced data-center networking technologies, we are delivering a hardware solution optimized for the next generation of AI inference pipelines."

The joint platform helps AI infrastructure operators address several key challenges associated with long-context inference workloads, including:

Expanding KV-cache requirements driven by larger context windows and persistent AI sessions

Efficient offloading of context memory from GPU HBM and system DRAM

High-performance shared storage capable of serving context data to large GPU clusters

Scalable infrastructure architectures for agentic AI and multi-modal inference workloads

As organizations deploy increasingly sophisticated AI services, the need for scalable context memory infrastructure is expected to grow rapidly. Solutions such as the AIC F2032-G6 JBOF with ScaleFlux NVMe SSDs provide a flexible and efficient hardware platform to support this new layer in the AI data pipeline.

Together, AIC and ScaleFlux are enabling AI infrastructure builders to deploy high-performance context memory storage systems that help maximize GPU utilization while supporting the next generation of long-context AI applications.

About ScaleFlux

In an era where data reigns supreme, ScaleFlux emerges as the vanguard of enterprise storage and memory technology, poised to redefine the landscape of data infrastructure, from cloud to AI, enterprise, and edge computing. ScaleFlux is a fabless semiconductor company that has developed a holistic approach to storage and memory innovation, seamlessly combining hardware and software to unlock unprecedented performance, efficiency, security, and scalability for data-intensive applications. ScaleFlux's cutting-edge technology promises not just to manage the data deluge but to transform it into actionable insights and value, heralding a new dawn for businesses and data centers worldwide. For more details, visit scaleflux.com or contact us at [email protected].

About AIC

AIC Inc. is a global leader in server and storage solutions. With nearly 30 years of expertise in high-density storage servers, storage server barebones, and high-performance computers, AIC has expanded into AI storage and AI edge appliances, achieving significant market recognition for its branded products. The company's in-house design, manufacturing, and validation capabilities ensure products are highly flexible and configurable to meet diverse form factor requirements. Headquartered in Taiwan, AIC operates offices and facilities across the United States, Asia, and Europe. For more information, please visit www.aicipc.com or contact us at [email protected].

