Dataset grows to more than 312,000 whole genomes with longitudinal clinical data

GSK is one of the first to lead a further expansion of 50,000 whole genomes paired with proteomic data

SAN DIEGO, March 5, 2026 /PRNewswire/ -- Illumina, Inc. (NASDAQ: ILMN), and Nashville Biosciences, LLC (NashBio), today announced two advancements in scale and depth of the Alliance for Genomic Discovery (AGD or the Alliance). With the addition of Regeneron Genetics Center® (RGC®) as the tenth member, the Alliance can expand the core dataset to 312,000 whole genomes. The Alliance also announced a new initiative: a dataset of 50,000 additional whole genomes with paired proteomic data generated using Illumina® Protein Prep. GSK is among the first participants in this multiomic expansion.

Dataset grows to more than 312,000 whole genomes with longitudinal clinical data

The AGD dataset is among the largest collections of whole genome sequences available, and it is the world's largest pairing of whole genome sequences with the depth of clinical data that comes from a leading academic medical center. De-identified, deep phenotypic data from electronic health records (EHR) enable more precise definition of disease cohorts and are enriched for advanced disease.

"AGD has already enabled disease-impacting discoveries in autoimmune disease and obesity, with many more such studies underway, and continued expansion accelerates this progress," said Rami Mehio, senior vice president and general manager of BioInsight at Illumina. "Integrating high-quality clinical and genomic data with advanced AI will help pharma translate discoveries into meaningful advances for patients."

RGC, a wholly owned subsidiary of Regeneron, harnesses the vast potential of human genetics to discover important new medicines, validate existing research programs, and optimize clinical trials. With a database of nearly 3 million sequenced exomes and de-identified EHRs, RGC enables meaningful biological discoveries and guides Regeneron's broader drug discovery and development efforts.

"We are thrilled to join the Alliance alongside so many of our longtime partners, including Vanderbilt with their exceptional biobank and population-scale genetics program — one of the most impressive in the world," said Aris Baras, MD, senior vice president, head of RGC and co-head of Regeneron Genetic Medicines. "Together, this alliance brings together an extraordinarily large and rich dataset, and we cannot wait to see the discoveries that lie ahead. At Regeneron, we have always believed that human genetics is the most powerful compass we have for finding the right targets and delivering the best innovative medicines; initiatives like this are exactly how we continue to push the boundaries of what's possible for patients."

By joining the Alliance, RGC will add significant scale to both the Alliance's database and its own already expansive genomic database. One of RGC's main goals is to uncover large-effect protective genetic factors that can illuminate the next generation of high-confidence drug targets and ultimately deliver transformative new medicines. Regeneron integrates human genetics across its entire enterprise and intends to leverage this data at every stage of drug discovery and development, from target discovery to clinical trial design to patient and market access to emerging predictive health analytics. This collaboration reflects RGC's ongoing partnerships with Illumina and the broader biopharma community to build large-scale population genomics consortia, a commitment that dates back to landmark initiatives such as the UK Biobank.

Powerful proteomic data can accelerate drug discovery research for pharma

Multiomic data adds new dimensions to drug discovery research, uncovering deeper biological information than genomics alone. GSK is one of the first to participate in the next phase of AGD, which expands the dataset's molecular depth by adding proteomics to enable entirely new layers of biological insight. The dataset will consist of 50,000 paired whole-genome and proteomic samples, designed to facilitate faster, more efficient target discovery and therapy development.

Adding proteomics to the AGD dataset will aid in understanding molecular mechanisms of disease-associated genetic variation. The diverse genetic ancestry in AGD provides an opportunity to study population-specific genetic variants and their associated proteins. Illumina Protein Prep is already accelerating breakthroughs across cancer, cardiometabolic, and immunologic diseases. Illumina's recent acquisition of SomaLogic proteomics technologies expands Illumina's multiomics portfolio, empowering AGD to accelerate drug discovery and improve health care.

"We are thrilled to have GSK on board as we move to the next evolution of AGD," said Leeland Ekstrom, PhD, chief executive officer of NashBio. "There is proven value in the integration of proteomics and comprehensive datasets, evidenced by the boon of large-scale studies showing promise in pinpointing drug targets linked to human disease."

In conjunction with Illumina's recently announced Billion Cell Atlas, this expanded effort continues the momentum of pharma leveraging combinations of large-scale datasets to identify and understand genetic targets, leading to new insights into disease mechanisms.

About the Alliance for Genomic Discovery

Existing Alliance members include AbbVie, Alnylam, Amgen, AstraZeneca, Bayer, Bristol Myers Squibb, GSK, Merck, and Novo Nordisk. RGC's membership expands the real-world genomic database to a cohort of 312,000 whole genomes. Along with GSK, Amgen is participating in the proteomic expansion of AGD.

AGD builds on NashBio and Vanderbilt University Medical Center's decades-long investment in the BioVU biobank, early EHR adoption, and clinico-genomic research.

AGD began sequencing in January 2023, making it one of the fastest large-scale genomics projects to date. Large-scale aggregation with DRAGEN Iterative gVCF Genotyper further enhances variant calling accuracy and consistency across diverse populations, enabling deeper insights into rare and complex genetic traits. The speed of this effort reflects the operational capabilities and deep collaboration between the participating life sciences organizations.

Illumina and NashBio are actively expanding the AGD network to continue to build on current successes, accelerate therapeutic discovery, and set new standards for clinical R&D pace, cost efficiency, and efficacy.

Use of forward-looking statements

This release may contain forward-looking statements that involve risks and uncertainties. Among the important factors to which our business is subject that could cause actual results to differ materially from those in any forward-looking statements are: (i) challenges inherent in researching, developing and launching new technologies; (ii) our and our partners' ability to deploy new products, services, and applications, and to expand the markets for genomics-related products and services; and (iii) the challenges associated with multiparty collaborations, including our reliance on the performance of such partners, together with other factors detailed in our filings with the Securities and Exchange Commission, including our most recent filings on Forms 10-K and 10-Q, or in information disclosed in public conference calls, the date and time of which are released beforehand. We undertake no obligation, and do not intend, to update these forward-looking statements, to review or confirm analysts' expectations, or to provide interim reports or updates on the progress of the current quarter.

About Illumina

Illumina is improving human health by unlocking the power of the genome. Our focus on innovation has established us as a global leader in DNA sequencing and array-based technologies, serving customers in the research, clinical, and applied markets. Our products are used for applications in the life sciences, oncology, reproductive health, agriculture, and other emerging segments. To learn more, visit illumina.com and connect with us on X, Facebook, LinkedIn, Instagram, TikTok, and YouTube.

About NashBio

Nashville Biosciences LLC (NashBio), a wholly owned, for-profit subsidiary of Vanderbilt University Medical Center (VUMC), was created to make complex healthcare data easy to use for a wide range of life science research and development applications. Leveraging Vanderbilt University innovation, NashBio harnesses extensive real-world genomics and other longitudinal multi-modal datasets, along with powerful bioinformatics tools, to build and deliver a wide range of data products and services. NashBio works with clients in biotech, pharma, diagnostics, medical devices, and other life sciences domains to support their most critical use cases. We believe smarter data enables better outcomes for our clients and ultimately for patients. For more information, please visit NashBio.com, connect with us on LinkedIn, or follow us on X at @NashvilleBio.

Contacts

Investors:

Illumina Investor Relations

858-291-6421

[email protected]

Media:

Christine Douglass

[email protected]

SOURCE Illumina, Inc.