The collaboration combines multimodal database technology, advanced AI agents, and large-scale datasets to accelerate drug discovery.

SAN FRANCISCO, Sept. 30, 2025 /PRNewswire/ -- TileDB, Inc., Kepler AI, and Tahoe Therapeutics today announced a groundbreaking partnership to create the first public-facing platform that enables researchers to run sophisticated AI agent-based queries on massive single-cell datasets at unprecedented scale. This collaboration represents a major advancement in computational biology, combining TileDB's multimodal database technology with Kepler AI's agent systems and Tahoe Therapeutics' comprehensive single-cell datasets.

Tahoe-100M Atlas Available in TileDB

Tahoe-100M, generated by the Tahoe Therapeutics team using the Mosaic technology and released as open source, is a giga-scale single-cell perturbation atlas consisting of over 100 million transcriptomic profiles from 50 cancer cell lines exposed to 1,200 small-molecule perturbations. Although the atlas is routinely used in research and development, its scale and complexity made it difficult to query in the standard h5ad format.

By leveraging TileDB's unique ability to store multidimensional data in a multimodal database architecture, the atlas is now available in an interoperable, highly scalable cloud-native platform. Pharma and biotech teams can use Tahoe-100M in conjunction with their own single-cell datasets to train foundation models or run real-time queries using agents.

"TileDB is the only database technology capable of handling the multidimensional nature of single-cell genomics data while maintaining the performance necessary for real-time AI agent queries," said Stavros Papadopoulos at TileDB. "This partnership represents exactly the kind of transformative application we envisioned when we built our multimodal database platform."

Unprecedented Scale and Capabilities

Previously, due to the sheer scale of Tahoe-100M, Kepler AI was constrained to building agents on pseudo-bulked single-cell data, which provided only a limited view of cellular complexity. The collaboration now enables AI agents to work directly with full-resolution single-cell data at massive scale.

The new platform provides researchers with access to 250 gigabytes of data (a reduction of roughly 16% from the original h5ad/parquet size), including 100 million cells from Tahoe Therapeutics, and enables Kepler agents to query full-resolution single-cell data. This represents one of the largest publicly accessible single-cell datasets optimized for AI-driven analysis.

Key capabilities of the platform include:

Real-time AI agent queries across multidimensional single-cell datasets

Scalable analysis from individual cells to population-level insights

Discovery of biomarkers and therapeutic pathways previously hidden in pseudo-bulk data

Industry Leaders Unite for Scientific Advancement

"This collaboration represents a paradigm shift in how we approach single-cell analysis," said Ashton Teng, CEO at Kepler AI. "The enablement of full single-cell data analysis unlocks many more applications that researchers have been asking for."

The partnership builds on each company's core strengths. TileDB's multimodal database technology provides the essential infrastructure for storing and querying complex biological data at scale. Kepler AI contributes advanced AI-native interfaces to allow AI agents to seamlessly interact with large-scale biological data. Tahoe Therapeutics brings comprehensive, high-quality single-cell datasets that span multiple therapeutic areas.

"We want more people using our data. Integrating with TileDB makes Tahoe-100M – the largest dataset of its kind – fast and reproducible to analyze at single-cell resolution, lowering the barrier for ML teams to run real experiments, train larger models, and push virtual-cell modeling forward," said Nima Alidoust, CEO at Tahoe Therapeutics.

The platform is expected to significantly accelerate research timelines by enabling researchers to:

Identify transcriptional correlates for drug susceptibility and resistance

Analyze cell cycle state dynamics in response to diverse drug classes

Characterize genotype-specific transcriptional responses to targeted therapies

Availability and Access

The public-facing platform is available today at https://tahoebio.ai/tahoedive . After registration, users can select the new single-cell Tahoe-100M dataset to analyze questions that require single-cell resolution. The existing pseudobulk dataset, which is ideal for differential expression analysis and pathway analysis, is still available. Note that the free version of the Kepler AI platform provides 32 GB of memory, so a representative subset of cells will be selected. Please contact [email protected] for custom compute resource needs.

To deploy the platform in your own environment and use agents on your governed data, contact [email protected] or visit https://www.tiledb.com/talk-to-us .

About TileDB

TileDB is an omnimodal intelligence platform designed by scientists to accelerate scientific discovery. TileDB allows organizations to govern and analyze all data types, including data that does not fit into relational databases designed for structured tabular data. Built on a powerful shape-shifting array database, TileDB handles the complexities of non-traditional "unstructured" multimodal data, such as genomic variants, bulk and single-cell transcriptomics, proteomics, and biomedical imaging, as well as the frontier data of the future. Used by science and data teams within the top 10 big pharma and biotech companies to power their multiomics FAIR data platforms, TileDB is the destination for scientific breakthroughs where frontier multimodal data is driving drug discovery.

About Kepler AI

Kepler AI is the leading provider of enterprise-grade AI agents and AI-native data interfaces for the life sciences. Kepler allows for secure automation of end-to-end bioinformatics and data-heavy research workflows, unlocking a step change in research efficiency and quality. Based in San Francisco, Kepler was founded by a team of scientists and engineers who built foundational scientific infrastructure at GRAIL, Databricks, Foresite Labs, and Xaira Therapeutics.

About Tahoe Therapeutics

Tahoe Therapeutics is building AI-powered models of the human cell to design better drugs for more patients. Its technology platform generates large-scale, perturbative single-cell datasets that enable a new generation of biological foundation models. Based in South San Francisco, Tahoe was founded by a team of scientists and technologists advancing the frontiers of drug discovery, genomics, and machine learning.

