We intend to test whether generative AI models trained on large-scale peripheral omics datasets can model systemic conditions well enough to support earlier disease detection. Our research focuses on three principal questions: (1) Can these models achieve sensitivity, specificity, and overall predictive power on par with clinically guided methods? (2) How does their reliability change with increasing dataset size and heterogeneity? (3) In which conditions is multimodal integration necessary to reach the performance benchmarks set by conventional diagnostics?
To address these questions, we will design and benchmark a panel of generative models that condition clinical measurements on participants’ peripheral omics data. Access to the UK Biobank’s extensive and demographically diverse cohort will be crucial to capturing the intrinsic variability that underlies disease phenotypes. We will evaluate the models by comparing their predictive accuracy and their robustness to noise and physiological variability against established diagnostic workflows that rely on targeted biomarkers and clinical tests. Dedicated validation will be performed for cardiovascular and neurodegenerative disorders, given their clinical importance, their complexity, and our previous work in related areas. To assess generalizability and fairness, sensitivity and specificity will be measured across disease stages and population subgroups, and on external datasets with comparable measurements.
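As a minimal sketch of the subgroup analysis described above, the following computes sensitivity and specificity separately for each population subgroup. It assumes binary disease labels (1 = disease present) and model predictions that have already been thresholded to 0 or 1. The function names, toy labels, and subgroup codes are illustrative assumptions; they are not UK Biobank fields and do not represent our final evaluation pipeline.

```python
from collections import defaultdict


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = disease."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Report NaN when a subgroup has no positives (or no negatives).
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def metrics_by_subgroup(y_true, y_pred, groups):
    """Group samples by subgroup code and compute both metrics per group."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: sensitivity_specificity(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy data (hypothetical): eight participants in two subgroups, "A" and "B".
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

results = metrics_by_subgroup(y_true, y_pred, groups)
for group, (sens, spec) in sorted(results.items()):
    print(f"subgroup {group}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A large gap between subgroups, as between "A" and "B" in this toy data, is the kind of signal the fairness assessment is meant to surface. The same function could be applied per disease stage by passing stage labels in place of subgroup codes.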
Our scientific rationale builds on the premise that blood-based omics, when analyzed with suitable nonlinear models, can capture complex systemic signals that lie beyond the reach of traditional biomarker-based and linear hypothesis-testing methods. We believe that demonstrating comparable diagnostic power from peripheral omics could motivate a shift toward more proactive, large-scale screening, reducing healthcare costs while improving patient outcomes and quality of care.