SAN JOSE, Calif., March 5, 2026 /PRNewswire/ -- iMerit, a leader in expert structured data for healthcare AI, in collaboration with Segmed and Advocate Health, today announced the release of the largest open-source, annotated breast tomosynthesis data set available to date, designed to accelerate the development of applications for breast cancer detection. This data set is biopsy confirmed, interpreted by U.S. Board Certified, MQSA-certified radiologists, and meticulously segmented by breast imaging specialists, ensuring the highest standards of accuracy and clinical relevance.

Dataset Characteristics

To provide researchers with high-quality, clinically relevant data, the collection features several key characteristics:

Dataset Volume: Imaging studies from 558 female patients, with additional longitudinal data available through Segmed.

Gold-Standard Modality: All images utilize Digital Breast Tomosynthesis (DBT), also known as 3D mammography.

Biopsy Confirmed Diagnosis: The data is highly curated, with a balanced distribution of ground-truth outcomes, including 271 malignant (48.5%) and 287 benign (51.5%) cases.

Focus on Early Detection: With an average tumor size of just 1.34 cm, the dataset is uniquely suited for training AI models on subtle, early-stage findings. Approximately 85% of lesions are smaller than 2 cm.

Demographic Representation: The patient population includes a broad range of ages (averaging 62 years) and a racial background composition of 96% White, 1% Black or African American, 1% Asian, and 1% Multi-racial patients.

Technical Specifications: Data is provided in DICOM format for the volumes and JSON for the annotations, which include classifications and coordinates of lesions. The data is fully de-identified in compliance with HIPAA and GDPR standards.
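For researchers planning to work with the annotation files, the following is a minimal Python sketch of how lesion classifications might be tallied from one JSON annotation. The release does not publish the annotation schema, so the field names (`study_id`, `lesions`, `classification`, `coordinates`) and the sample record are assumptions for illustration only, not the dataset's actual layout.

```python
import json
from collections import Counter

# Hypothetical annotation record. The real schema is not described in the
# release, so every field name here is an assumption.
SAMPLE_ANNOTATION = """
{
  "study_id": "example-001",
  "lesions": [
    {
      "classification": "malignant",
      "coordinates": {"x": 120, "y": 340, "slice": 27, "width": 48, "height": 52}
    },
    {
      "classification": "benign",
      "coordinates": {"x": 410, "y": 95, "slice": 31, "width": 30, "height": 28}
    }
  ]
}
"""


def count_classifications(annotation_text):
    """Return a Counter of lesion classifications found in one annotation."""
    record = json.loads(annotation_text)
    # Fall back to an empty list if a study has no annotated lesions.
    lesions = record.get("lesions", [])
    return Counter(lesion["classification"] for lesion in lesions)


counts = count_classifications(SAMPLE_ANNOTATION)
print(dict(counts))
```

Summing such per-study counts across all downloaded annotations would be one way to check the malignant and benign split against the figures reported above, once the field names are adjusted to the published schema.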

Accelerating Early Detection

Breast cancer remains a significant global health challenge, with 1 in 8 women diagnosed during their lifetime. In 2026 alone, an estimated 321,910 women in the United States will be diagnosed with invasive breast cancer. On average, a woman in the U.S. is diagnosed every two minutes. Early detection is critical, as the five-year survival rate can exceed 99% when breast cancer is caught at an early stage.

"At iMerit, we believe high-quality, responsibly annotated data is the foundation for meaningful advances in AI for healthcare," said Dr. Sina Bari, VP of Healthcare and Life Science AI at iMerit. "By releasing this dataset openly, we hope to empower researchers worldwide to develop tools that can support radiologists, improve outcomes, and ultimately save lives."

"At Segmed, our mission is to make medical data accessible in a secure and scalable way," said Dr. Martin Willemink, Co-Founder and Chief Scientific Officer at Segmed. "This collaboration exemplifies the power of partnerships in breaking down barriers and driving innovation in medical AI."



Collaboration for Women's Health

This initiative reflects a shared commitment by iMerit, Segmed, and Advocate Health to advance women's health by fostering open research within the medical community. By making the dataset open-source, the partners aim to lower barriers for academic researchers, startups, and established institutions alike.

The data set is available for free download to registered users at:

https://imerit.net/3d-mammogram-dataset/

About iMerit: iMerit is a leading AI data company that powers advanced machine learning and artificial intelligence models through its network of domain experts and its software, Ango Hub. Learn more at https://imerit.net.

About Segmed: Segmed, Inc. streamlines access to diverse, high-quality medical imaging studies for biopharmaceutical R&D and AI development. For more information, visit www.segmed.ai.

About Advocate Health: Advocate Health is one of the largest not-for-profit integrated health systems in the United States, dedicated to advancing clinical excellence and groundbreaking research.

