
Following a record-breaking sweep of global embedding leaderboards, Octen debuts its proprietary distributed search engine, achieving 60ms latency and 1M+ QPS to power search in the age of AI
SAN FRANCISCO, April 22, 2026 /PRNewswire/ -- Octen, the search infrastructure company for the generative AI era, today announced the launch of the world's fastest and most scalable web search API. Headquartered in San Francisco and Singapore, Octen is deploying a global data search layer built on high-concurrency infrastructure for the agentic internet, aiming to replace the current "human-centric" search stack with industry-leading response times and accuracy. To accelerate its deployment, Octen has completed a $10 million seed funding round led by Square Peg.
The technical requirements for AI search are fundamentally different from those of the human era. A single autonomous AI agent can execute hundreds or thousands of concurrent searches to complete one complex task, researching, verifying, and synthesizing information at a scale and speed that breaks traditional search infrastructure.
"For AI, concurrency and latency are the core foundational capabilities," said Kuan Zou, Founder and CEO of Octen. "AI interacts with information in a way that is light-years beyond human capability. An agent can process thousands of data points simultaneously, which places a massive technical burden on the underlying stack. By solving the challenges of sub-100ms latency and million-level QPS, Octen is enabling a new milestone in AI applications by allowing agents to reason over the live web with the same speed and fluidity as memory."
Zou brings a decade of experience building and scaling large-scale search and AI infrastructure. Prior to founding Octen, Zou was the product lead of Alibaba Cloud AI search, powering systems used by hundreds of millions of end-users worldwide, after building Baidu's enterprise search platform from the ground up. He has assembled a team of engineers and AI researchers from Meta, Google, TikTok, Alibaba, Baidu, DeepSeek, and Xiaohongshu to build what Octen intends to be the foundational retrieval layer for AI agents worldwide.
In the three months since the company was established, Octen's proprietary Octen-8B model has swept the RTEB (Retrieval Embedding Benchmark), outperforming established industry titans in both precision and long-context understanding and establishing Octen as the new technical gold standard in data retrieval. Octen's dual-hub engineering team in San Francisco and Singapore has deployed a proprietary, ultra-large-scale distributed search engine that resets the industry's performance ceiling:
- Industry Leading Speed: ~60ms average response time and minute-level data updates, enabling AI agents to reason and act on live information rather than stale snapshots.
- Unmatched Throughput: Designed for high-density workloads, the system is the first to support over 1 million queries per second (QPS) at production scale.
- Trillion-Scale Global Index: A massive, multi-language index covering the breadth of the web provides a structured data layer purpose-built for machine consumption.
Octen's $10 million seed funding round was led by Square Peg, with participation from Argor and a cohort of leading AI scientists. The capital has been used to scale Octen's globally distributed server architecture and to grow its engineering and developer relations teams across its two primary hubs.
Tushar Roy, Partner, Square Peg, said: "Search infrastructure was built for humans, ranked by ads and formatted for people to browse. That doesn't work for AI. Octen is building the search layer that is faster, cleaner, and purpose-built for how AI consumes information. What drew us in was the team. Zou has put together an impressive team with extensive experience building systems that powered search at significant scale, throughput and reliability. They are among the best in the world at what they do, and we're excited to be partnering with them as they build Octen."
Octen is currently in an invitation-only beta with design partners across the GenAI ecosystem, powering the next generation of autonomous researchers, enterprise assistants, and agentic workflows.
To explore the API or access the open-source models, visit octen.ai.
About Octen
Octen is the search infrastructure for the generative AI era, providing a real-time, LLM-native search API that powers AI agents, chatbots, research assistants, and autonomous applications. Unlike traditional search engines built for humans, Octen delivers low-latency, structured, and up-to-date results that flow directly into AI workflows, helping developers skip the complexity of building their own retrieval pipelines.
SOURCE Octen