Cancer risk prediction and model validation with 'omics data

Last updated: 16 March 2026

ID: 1089913
Start date: 5 March 2026
Project status: Current
Principal investigator: Dr James Geoffrey Dowty
Lead institution: University of Melbourne, Australia

We aim to use genomic, proteomic and other 'omics data, together with lifestyle and clinical risk factors, to estimate the cancer risks associated with these factors, predict cancer risk, validate cancer risk models, and identify causal effects on cancer risk.

Research questions and rationales:
1. Can we predict the risk of common cancers with a novel plasma proteomic risk score? Individual protein biomarkers have been shown to predict future cancer onset in UK Biobank (UKB) data (PMID 38750076), and small studies (PMID 39402035) have combined such biomarkers into proteomic risk scores, but this promising avenue of enquiry is still in its infancy.
2. Can these proteomic risk scores be integrated with polygenic risk scores and lifestyle/clinical risk factors into a more powerful risk prediction model? Small-scale studies (PMID 39402035) have shown that combining proteomic risk scores with polygenic risk scores gives better predictive performance than polygenic risk scores alone. We would like to test this in a large study (UKB) and to extend the combined model with lifestyle/clinical risk factors into a comprehensive risk prediction model.
3. How well do the resulting models predict cancer risk (with performance metrics estimated using cross-validation)? How well do existing cancer risk prediction models perform, especially in comparison with our new models? How well do other models in the broader area of cancer risk prediction perform, including models that predict cancer risk factors? Models cannot be used clinically until they have been validated.
4. Can we identify the causal effects of proteomic and other risk factors on cancer risk, using Mendelian randomisation and other methods of causal inference? We will unravel the causal relationships between cancer risk factors and cancer.
5. Can we predict cancer using other 'omics data and new analytical approaches?