Last updated:
ID: 1057239
Start date: 31 October 2025
Project status: Current
Principal investigator: Dr Luke Stetson
Lead institution: PreOncology, United States of America

Outline of Research

Cancer is the second leading cause of death worldwide. Unlike cardiovascular disease, where risk scores such as the Framingham and ASCVD equations have transformed prevention, no broadly adopted predictive framework exists for cancer. The UK Biobank (UKBB), with its depth of genetic, lifestyle, and clinical data, provides a unique opportunity to build multi-modal cancer risk models. Complementary external cohorts will support external validation and assessment of generalizability.

Research Questions:

Can models built in UKBB improve cancer risk prediction beyond conventional regression baselines?

Which demographic, lifestyle, genomic, and clinical features contribute most to risk?

Can models trained in UKBB generalize to independent cohorts?

Objectives:

Establish baseline Cox regression models.

Apply ML/AI approaches (XGBoost, Random Survival Forests, CNNs, transformers) to capture nonlinearities and interactions.

Develop meta-learners to optimize performance.

Validate externally (WHI, PLCO, NLST, EPIC).
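One way the baseline and ML models in the objectives above might be compared is with a discrimination metric for censored data, such as Harrell's concordance index. The following is a minimal sketch in plain Python, not the project's actual pipeline: the function name `concordance_index`, the toy follow-up times, event indicators, and both sets of risk scores are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the individual with
    the earlier observed event also has the higher predicted risk.
    Ties in predicted risk count as 0.5."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter time is an observed event (not a censored record).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, 1 = cancer diagnosis, 0 = censored.
times = [2.0, 4.0, 5.0, 7.0, 9.0]
events = [1, 1, 0, 1, 0]
baseline_risk = [0.9, 0.3, 0.5, 0.4, 0.1]  # e.g., scores from a Cox baseline
ml_risk = [0.9, 0.7, 0.5, 0.4, 0.1]        # e.g., scores from an ML model

c_baseline = concordance_index(times, events, baseline_risk)
c_ml = concordance_index(times, events, ml_risk)
print(f"baseline C = {c_baseline:.2f}, ML C = {c_ml:.2f}")
```

The same function could be applied unchanged to predictions on an external cohort, which would give a like-for-like comparison between internal and external discrimination. The pairwise loop is quadratic in the number of individuals, so a cohort-scale analysis would likely rely on an optimized library implementation instead.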

Scientific Rationale:
UKBB’s prospective, integrated design enables the next generation of cancer risk prediction tools. Combining exposures and biology allows earlier identification of high-risk individuals and more effective prevention strategies.

Responsible AI and Dissemination:
All models will be documented, version-controlled, and made interpretable (e.g., via SHAP values and feature importance). Performance will be evaluated across subgroups to address fairness. Outputs will be used for research only and not for individual prognostication. Findings will inform population-level risk understanding and hypothesis generation. Any translational use would require external validation and regulatory approval. Results will be published in peer-reviewed journals and shared at scientific meetings, and derived variables will be returned to UKBB per policy.
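The subgroup evaluation described above could be sketched as follows. This is an illustrative simplification, not the project's method: it treats the outcome as binary (case vs non-case at a fixed horizon) and reports AUC per subgroup, whereas a survival analysis would more likely use a censoring-aware metric. The function names `auc` and `auc_by_subgroup`, the group labels, and all data values are hypothetical.

```python
from collections import defaultdict


def auc(labels, scores):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen non-case (ties count 0.5). Returns None when a
    group lacks either cases or non-cases."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Split predictions by subgroup label and compute AUC within each."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data: 1 = case, 0 = non-case; two illustrative subgroups.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.8, 0.2, 0.6, 0.4, 0.5, 0.6, 0.7, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

subgroup_auc = auc_by_subgroup(labels, scores, groups)
for group, value in sorted(subgroup_auc.items()):
    print(group, value)
```

A gap in discrimination between subgroups, as in this toy example, would be a prompt for further investigation (for instance, of sample size or calibration within each group) rather than a conclusion in itself.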