Faster AI Inference Enabled by 1000x Higher Scale with Delos Data™ Nonstop AI™

May 27, 2026, 08:00 ET

Delos Nonstop AI™ Product Portfolio Solves AI Interconnect Bottlenecks and Increases GPU Utilization

Delos AI interconnect architecture enables faster and more efficient data movement between GPUs
Solves interconnect issues cluster-wide for large-scale inference workloads as a unified system
Delos is showing the cluster and software running AI workloads at Computex

PALO ALTO, Calif., May 27, 2026 /PRNewswire/ -- Delos Data today announced a new cluster architecture and server design that accelerates AI inference through cabled scale-up. Unlike existing solutions, it addresses inference performance at a unified system level by reducing GPU idle time caused by data center interconnect bottlenecks — and is designed to flexibly scale to thousands of GPUs, CPUs, and accelerators.

"The GPU is no longer the bottleneck — the network is. The next leap in AI performance will come not just from more compute but from how the compute is connected," said Ed Doe, Co-founder and CEO of Delos Data. "With the interconnect market set to hit $100 billion by 2030, the industry is waking up to this. What Delos Data has built is exactly what the moment demands — a unified system architecture that eliminates fragmentation between layers and lets AI inference run the way it was meant to: fast, at scale, and without compromise."

Highlights of the Delos Data Nonstop AI™ approach

Large ecosystem: Built within today's GPU and accelerator ecosystems
Higher scale: Delos Server brings scale-up IO to the faceplate, enabling larger scale-up domains
Better visibility: Delos Mosaic™ software allows for fine-grained visibility into the scale-up domain

"The industry has reached an inference inflection point where traditional architectures can no longer keep pace with the networking demands of massive GPU clusters," said Dan Daly, Co-founder and CTO of Delos Data. "We are introducing a fundamentally new interconnect architecture that brings scale-up connectivity directly to the network. Our architecture allows for flexible scaling of GPU clusters to ensure the number of GPUs in the cluster can be right-sized to the AI models and the inference performance requirements."

Benefits for hyperscalers, AI infrastructure builders, and enterprises scaling inference:

Reduced cost per token
Improved tokens per watt
Increased tokens per second, driving greater revenue per second

Experience Delos at Computex 2026

Delos Data will demonstrate Nonstop AI™ live at Computex 2026 in Booth J1328, showcasing real-time inference bottleneck detection using Delos Mosaic™ and AI workload tuning in a production-scale environment. Attendees will see firsthand how Mosaic delivers actionable cluster intelligence to accelerate and stabilize inference workloads.

Availability
Early-access software deployments are available now to select customers; broader availability is planned for Q4 2026.

About Delos Data
Delos Data is building AI interconnect for faster, stronger inference, starting from how compute is distributed and memory is interconnected. The team is a mixture of software, systems, and silicon experts focused on delivering the world's most capable and responsive intelligence at scale. To learn more, please visit www.delosdata.com.

Contact: Nicole Conley, Tanis Communications, [email protected]

SOURCE Delos Data