Rationale: Cardio-cerebrovascular diseases (CCVDs) arise from complex genetic, molecular, environmental, and organ-level interactions that conventional risk models fail to capture. The UK Biobank (UKB), integrating genomics, proteomics, metabolomics, multimodal imaging, and longitudinal health records, offers an unparalleled platform for systems-level discovery.
Key Questions: (1) Can integrated multi-omics improve prediction of CCVD onset and outcomes beyond established clinical tools? (2) Which biomarkers and molecular networks consistently drive disease development? (3) How do multi-omics signatures mediate the effects of genetic risk and modifiable exposures on CCVD?
Aims: (1) Develop and validate multi-omics risk models by combining polygenic scores with proteomic, metabolomic, and imaging-derived phenotypes; benchmark them against conventional tools using cross-validation, calibration, and decision-curve analysis. (2) Identify robust biomarkers and molecular networks through machine learning and systems biology, testing them across subcohorts and ancestries. (3) Link mechanisms to outcomes via causal inference (Mendelian randomization, mediation, colocalization) to map molecular pathways from genetic liability and exposures to CCVD endpoints.
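As an illustration of the decision-curve analysis named in Aim 1, the sketch below computes net benefit at a single risk threshold. It assumes a binary outcome observed at a fixed horizon with no censoring, which is a simplification of the time-to-event setting; the function names `net_benefit` and `net_benefit_treat_all` and the toy inputs are hypothetical and are not part of the proposal.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold.

    Assumes binary outcomes (1 = event, 0 = no event) without censoring.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged as high risk and experienced the event.
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    # False positives: flagged as high risk but remained event-free.
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every participant."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data for illustration only (not UKB data).
outcomes = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.7]
print(net_benefit(outcomes, predicted, 0.5))
print(net_benefit_treat_all(outcomes, 0.5))
```

Evaluating both functions over a grid of thresholds, once for a conventional score and once for a multi-omics score, would give the curves that a decision-curve comparison plots.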
Methods: Incident CCVD cohorts will be assembled. Feature engineering will harmonize PRS, pQTL/mQTL-informed protein and metabolite features, lifestyle, and exposome metrics. Predictive models will include regularized regression, gradient boosting, and deep survival models, with explainability methods (e.g., SHAP). Robustness will be tested by sensitivity analyses for missingness and batch effects, plus internal and external validation.
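One way the survival models above could be compared with a conventional score is by discrimination on right-censored follow-up. The proposal does not name a discrimination metric, so the choice of Harrell's concordance index here is an assumption, as are the function name `harrell_c` and the toy data. This is a minimal sketch that ignores tied event times.

```python
def harrell_c(time, event, score):
    """Harrell's concordance index for right-censored data.

    time  : follow-up time per participant
    event : 1 if the event was observed, 0 if censored
    score : predicted risk (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if score[i] > score[j]:
                    concordant += 1.0
                elif score[i] == score[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy follow-up data for illustration only (not UKB data).
follow_up = [2, 4, 6, 8]
observed = [1, 1, 0, 1]
score_a = [0.9, 0.7, 0.5, 0.1]  # hypothetical score that ranks events correctly
score_b = [0.1, 0.7, 0.5, 0.9]  # hypothetical score that mostly misranks them
print(harrell_c(follow_up, observed, score_a))
print(harrell_c(follow_up, observed, score_b))
```

In practice the index would be computed on held-out folds or an external cohort rather than on the training data, in line with the validation steps described above.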
Impact: This project aims to shift CCVD risk stratification from static risk-factor lists to mechanism-based, actionable models. Deliverables include validated prediction scores benchmarked for improved accuracy over conventional tools, prioritized biomarker panels and druggable pathways, and reproducible pipelines (with derived features shared where permitted).