NCI U01 Research Initiative

Fusing cancer data to reveal what's hidden.

MEFINDER integrates radiology, digital pathology, clinical records, and social determinants of health to discover novel cancer phenotypes and deliver population-specific risk predictions — freely, openly, equitably.

260,815+ Breast Cancer Patients · EMBED v2
~1M Imaging Exams · Multi-modal
5 Institutions · US-wide
10 Open-Source Tools · Freely on GitHub

01 The Clinical Problem

The cost of one molecular assay — for every patient.

Molecular assays like Decipher are expensive, inconsistently covered by insurance, and still don't capture the full complexity of tumor biology.

Patients with identical diagnoses experience vastly different outcomes. Yet current tools treat them the same. MEFINDER develops computational alternatives that fuse information already collected during routine clinical care.

Current approach: $3,400 / patient
Molecular assay · Limited coverage

MEFINDER: Routine data
Computational · Open-source · Equitable

Same diagnosis.
Different outcomes.
One framework to explain why.

MEFINDER is an NCI U01-funded initiative led by Dr. Judy Gichoya at Emory University's HITI Lab. By integrating five data modalities — radiology, pathology, EHR, genomics, and social determinants of health — the framework discovers novel disease subtypes and predicts who needs more aggressive treatment.

Every tool is open-source, every method reproducible, every dataset documented — so the research community can build on this work rather than starting over.

02 The Framework

Five steps from raw data to clinical insight.


01 / Data Collection
Radiology · Pathology · EHR · SDOH

02 / Harmonization
DICOM preprocessing · Stain normalization · QC

03 / Feature Extraction
Radiomics · Pathomics · Deep embeddings · NLP

04 / Multimodal Fusion
Graph networks · Co-attention · Contrastive learning

05 / Phenotype Discovery
Novel subtypes · Risk stratification · Treatment guidance

03 Clinical Applications

Two cancers. One framework.

MEFINDER targets high-impact prognosis problems where multimodal AI can meaningfully reduce cost and improve equity of care.

01 / Breast Cancer

ER+ Recurrence Prediction

Integrating mammography, breast MRI, digital pathology, and clinical records from 260,815+ patients to predict late recurrence in ER-positive breast cancer without costly molecular assays.

260,815 patients · ~1M exams · EMBED v2
MamoCLIP · HistoQC · F-SYN

02 / Prostate Cancer

Biochemical Recurrence

Predicting PSA recurrence post-therapy using biparametric MRI and H&E pathology slides via APIC — delivering Decipher-comparable performance at a fraction of the $3,400 test cost.

~5,000 patients · 387 bpMRI · Multi-site
APIC · ProstateNet · MQUAL

04 Open-Source Ecosystem

10 tools.
All freely available.

From raw DICOM to multimodal embeddings — the full pipeline as open-source software.


HistoQC

stable
Digital Pathology

Open-source quality control tool for whole-slide pathology images, downloaded thousands of times.


F-SYN

stable
Digital Pathology

Fourier-based spatial image normalization for stain harmonization — avoids GAN artifacts.
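To give a feel for what Fourier-domain harmonization can look like, here is a minimal sketch that swaps the low-frequency amplitude spectrum of a source image for that of a reference image while keeping the source phase. It illustrates a generic technique and should not be read as F-SYN's actual algorithm; the function name `match_low_freq_amplitude` and the `beta` parameter are assumptions made for this example.

```python
import numpy as np

def match_low_freq_amplitude(source, reference, beta=0.1):
    """Replace the low-frequency amplitude of `source` with that of `reference`.

    Both inputs are 2-D float arrays of the same shape (a single channel).
    `beta` is a hypothetical parameter: the fraction of the spectrum, per axis,
    that is treated as "low frequency".
    """
    if source.shape != reference.shape:
        raise ValueError("source and reference must have the same shape")

    # Centered 2-D spectra, so the low frequencies sit in the middle.
    src_fft = np.fft.fftshift(np.fft.fft2(source))
    ref_fft = np.fft.fftshift(np.fft.fft2(reference))
    amplitude, phase = np.abs(src_fft), np.angle(src_fft)

    # Symmetric window around the zero-frequency (DC) component.
    h, w = source.shape
    bh, bw = int(h * beta / 2), int(w * beta / 2)
    ch, cw = h // 2, w // 2
    rows = slice(ch - bh, ch + bh + 1)
    cols = slice(cw - bw, cw + bw + 1)
    amplitude[rows, cols] = np.abs(ref_fft)[rows, cols]

    # Recombine the new amplitude with the original phase and invert.
    mixed = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(np.fft.ifftshift(mixed)))
```

Because only the amplitude in a small central window changes, global brightness and contrast move toward the reference while edges and tissue structure, which are carried mostly by phase, stay in place. Color slides would need this applied per channel.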


MQUAL

stable
MRI · Radiology

MRI quality assessment tool evaluating signal-to-noise, motion artifacts, and sequence completeness.
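As a rough illustration of the signal-to-noise part of such a check, the sketch below computes a common region-based estimate: mean intensity in a signal region divided by the standard deviation in a background region. This is a textbook-style approximation, not MQUAL's implementation; the function name `estimate_snr` and the mask arguments are hypothetical.

```python
import numpy as np

def estimate_snr(image, signal_mask, background_mask):
    """Return mean(signal region) / std(background region) for a 2-D image."""
    image = np.asarray(image, dtype=float)
    signal = image[np.asarray(signal_mask, dtype=bool)]
    background = image[np.asarray(background_mask, dtype=bool)]
    if signal.size == 0 or background.size < 2:
        raise ValueError(
            "masks must select signal pixels and at least two background pixels"
        )

    # Sample standard deviation of the background serves as the noise estimate.
    noise = background.std(ddof=1)
    if noise == 0:
        return float("inf")
    return float(signal.mean() / noise)
```

In practice the two masks would come from a tissue segmentation and an air region outside the body; how those regions are chosen strongly affects the number, so values are only comparable when the same convention is used across scans.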


Beaks

stable
Radiology · Digital Pathology

Cross-modality quality assessment framework for both radiology and pathology image sets.


APIC

stable
Digital Pathology

AI-based pathology image classifier for tumor-immune interaction and treatment benefit prediction in prostate cancer.


MamoCLIP

stable
Mammography · Radiology

Federated contrastive learning framework for large-scale mammography representation learning.
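The core idea of contrastive representation learning can be sketched in a few lines: embeddings of matching pairs are pulled together and non-matching pairs are pushed apart. The example below computes a symmetric InfoNCE loss (the CLIP-style objective) in NumPy. It is a generic illustration, not MamoCLIP's training code, and it leaves out the federated aspect entirely; the function name and the `temperature` default are assumptions.

```python
import numpy as np

def _log_softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def symmetric_info_nce(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE loss; row i of emb_a and row i of emb_b form a positive pair."""
    # L2-normalize so the dot product is cosine similarity.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature

    # Positive pairs lie on the diagonal of the similarity matrix.
    idx = np.arange(len(a))
    loss_ab = -_log_softmax(logits)[idx, idx].mean()
    loss_ba = -_log_softmax(logits.T)[idx, idx].mean()
    return float((loss_ab + loss_ba) / 2)
```

Well-aligned pairs drive the loss toward zero, while a batch of indistinguishable embeddings yields log(N) for batch size N, which makes the value easy to sanity-check during training.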

05 Consortium

Five institutions. One mission.

| Institution | Role | Key Contribution | Datasets |
|---|---|---|---|
| Emory University | Lead | Coordination · HITI Lab · NLP labeling | EMBED v2 · EPIP |
| Indiana University | Pathomics | APIC · Clinical trial validation | CHAARTED · STAMPEDE |
| Stanford | NLP & Breast | NLP toolkit · Data harmonization | Stanford cohort · CA Registry |
| Mayo Clinic | Follow-up | Long-term outcomes (10–15 yr) | Mayo Biobank 75k+ |
| Veterans Affairs | Prostate MRI | Slide digitization · Diverse population | VA bpMRI 387 (expanding) |

06 Selected Publications

Peer-reviewed research.


01

Breast Cancer · Conference · 2024

MamoCLIP: A Strong Contrastive Baseline for Full-Field Digital Mammography Analysis

Shrivastava A, Ghosh S, Gichoya JW, et al.

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)

02

Prostate Cancer · Journal · 2024

APIC: AI-Based Pathology Image Classifier for Treatment Benefit Prediction in Prostate Cancer

Bhatt D, Shrivastava A, Bhargava R, Gichoya JW, et al.

JCO Clinical Cancer Informatics

03

Framework · Journal · 2019

HistoQC: An Open-Source Quality Control Tool for Digital Pathology Slides

Janowczyk A, Zuo R, Gilmore H, Feldman M, Madabhushi A

JCO Clinical Cancer Informatics

DOI: 10.1200/CCI.18.00136

07 Get Involved

The framework is open.
So is the community.

Whether you're a researcher, clinician, patient advocate, or trainee — MEFINDER is built for collaboration.