SAN FRANCISCO, March 5, 2026 /PRNewswire/ -- Together AI, the AI Native Cloud powering some of the world's fastest-growing AI companies, today launched AI Native Conf, its first-ever conference dedicated to builders creating the next generation of AI applications. The event comes amid rapid business momentum for Together AI, which now serves thousands of customers, supports over one million developers, and has achieved 10x year-over-year growth in annual contract revenue (ACR), including 27 customer deals exceeding $1 million and one exceeding $1 billion.

Together AI has emerged as a core infrastructure provider for leading AI-native companies including Cursor, Decagon, and Cartesia, delivering production-scale inference, pre-training and model shaping. With an industry-leading systems research lab led by the creators of FlashAttention and ThunderKittens, Together AI sits at the intersection of frontier research and real-world deployment.

"AI is moving faster than any technological shift we've seen before, and the companies being built today look fundamentally different," said Vipul Ved Prakash, co-founder and CEO of Together AI. "This event is about bringing together the AI-native builders at the frontlines and sharing what it actually takes to run AI in production at scale. Our advantage is simple: the same researchers who publish foundational work are the ones shipping it into production systems our customers rely on."

Announcing New Research Breakthroughs and Products From The AI-Native Cloud

At AI Native Conf, Together AI unveiled new research-to-production advancements across kernels, reinforcement learning, and inference optimization, underscoring the company's deep research bench and rapid development cadence. These advancements will help companies improve training and inference performance, enabling businesses of all sizes to capitalize on the benefits of generative and agentic AI.

Key announcements include:

FlashAttention 4, the latest evolution of the widely adopted kernel now powering most major language models in production. FlashAttention 4 delivers up to 4x performance improvements at long sequence lengths, narrowing the gap between theoretical and real-world performance for long-context workloads like coding agents and document reasoning.

A new Reinforcement Learning API that decouples inference and training, enabling globally distributed reinforcement learning pipelines that were previously only feasible for organizations with massive, co-located GPU clusters.

ThunderAgent, an open-source, program-aware system for serving and training agentic workloads, delivering up to 3.6x throughput improvements and significantly reduced memory overhead.

ATLAS-2, which uses real-time user data to adapt and optimize, immediately delivering 1.5X faster inference results.



A Gathering for the AI-Native Generation

AI Native Conf was created in response to the rapid emergence of AI-native startups. Generative AI is already being used by 70% of companies, according to McKinsey, making it the fastest-adopted major technology platform in history. As a result, a new class of AI-native companies is scaling at unprecedented speed, with many reaching $100 million in ARR faster than any previous generation of startups.

The conference features leaders building at the frontier of AI, including Grant Lee, co-founder and CEO of Gamma, and Arjun Desai, co-founder of Cartesia, alongside researchers and engineers deploying AI systems at massive scale.

About Together AI

Together AI is the AI Native Cloud, combining state-of-the-art open-source models, high-performance infrastructure, and frontier research in AI efficiency and scalability. Founded in 2022, Together AI powers over a million developers and some of the world's most demanding AI workloads, delivering production-scale inference, training, and reinforcement learning for the next generation of AI-native companies.

