Along with new agentic AI workflows, the new model is already powering faster localization for dozens of Deepdub GO enterprise customers

TEL AVIV, Israel, March 10, 2026 /PRNewswire/ -- Deepdub, a foundational voice AI company pioneering expressive voice technologies, announced today the launch of its latest AI speech model, Phantom X 3.2, designed to redefine the standards of dubbing and real-time voice agents. With enhanced voice quality, multilingual capabilities, and ultra-low latency, Phantom X 3.2 is built to meet the growing demands of global enterprises for scalable, high-quality AI voice and dubbing solutions. Additionally, Deepdub's new agentic AI workflows will be demoed at the upcoming NVIDIA GTC, showcasing the future of AI-powered localization.

Deepdub GO, the company's enterprise platform purpose-built for localization at scale, is now powered by Phantom X 3.2. GO continues to serve as the backbone of Deepdub's enterprise offering, enabling production teams to generate, review, and deploy AI dubbing across dozens of languages within high-volume localization pipelines. With GO, Deepdub's strategic partners have uninterrupted and complete access to the world's most advanced AI-powered localization platform, including Phantom X 3.2 and all new foundation models and agentic capabilities as they are introduced.

Phantom X 3.2 introduces a new generation of dubbing capabilities engineered for studio-grade quality at enterprise scale. The model produces professional-quality voice output with human-like delivery across extreme pitch, speed, and prosody ranges, and supports zero-shot voice cloning from as little as one second of reference audio, even from noisy or degraded source material. Expanded emotion styles, including Joy, Giggle, and Laughter, can be layered within a single line, and a new Key Names and Phrases (KNP) system ensures consistent pronunciation and translation of recurring character names and technical terms across full episodes and series.

The model's precision phonetics for stress-timed languages ensures perfect pronunciation in languages such as Russian and Hebrew, where misplaced stress changes the meaning of a word. For example, words like "zamok" (castle vs. lock) or "BI-ra" (beer vs. capital city) are pronounced with the intended stress. This makes the model an essential tool for global enterprises aiming to localize content accurately.

Phantom X 3.2 enables streaming platforms and studios to localize series into 10–20 languages simultaneously while maintaining consistent character voices, accurate pronunciation of names and terms, and natural performance across episodes. The model also supports animation and franchise localization, large catalogue dubbing of films and television libraries, fast-turnaround localization for trailers, promos, and global releases, and natural narration for documentaries and unscripted programming.

For real-time voice agents, Phantom X 3.2 delivers approximately 125ms end-to-end latency, making it suited for demanding voice agent use cases such as customer support, virtual assistants, and interactive AI pipelines. Speech generation begins as text arrives, processing the remainder of each sentence in parallel to enable natural, uninterrupted real-time conversations. The model also maintains consistent voice identity, emotional control, and audio quality across extended interactions, with automatic speaker gender detection that persists throughout a session.

"The demands on voice AI have never been more complex or more consequential," said Ofir Krakowski, CEO and co-founder of Deepdub. "Content owners and global enterprises need every language to feel native, and every conversation to feel human. But beyond quality, the economics of localization are being rewritten — streaming platforms can now make on-demand localization decisions as content breaks through in a new market, without pre-committing budgets to languages that may never be needed. With Phantom X 3.2, we've built a model that meets every bar simultaneously – Hollywood-grade expressiveness, real-time responsiveness, and the unit economics that make agile, language-by-language expansion a real business decision rather than a gamble. And this is just the beginning. We're continuing to push the boundaries of what's possible in dubbing and localization, with agentic AI workflows that will further automate and orchestrate pipelines end-to-end, making world-class localization faster, smarter, and more accessible than ever before."

About Deepdub

Deepdub is the foundational voice AI model company pioneering expressive voice technologies for global enterprises across TV, film, advertising, gaming, e-learning, and AI-agent applications. The company's international team of technology, dubbing, and linguistic experts delivers an end-to-end voice solution that preserves the emotional and cultural integrity of original content in more than 130 languages and dialects. With an advisory board that includes media leaders such as Kevin Reilly, former Chief Content Officer at HBO Max, and Emiliano Calemzuk, former President of Fox Television Studios, Deepdub is eliminating language barriers to enable the global diffusion of media on major streaming platforms like Netflix, Amazon Prime, and Hulu. Visit https://deepdub.ai or follow us on LinkedIn for more information.

