Last updated:
ID:
1121952
Start date:
10 March 2026
Project status:
Current
Principal investigator:
Mr Kareem Elgohary
Lead institution:
National School of Mines Paris, France

Research outline:
This project aims to understand how multiple clinical, biological, and lifestyle factors interact to influence the long-term risk of developing complex diseases such as early-onset cancers. The central research question is whether detailed health information, such as medical history, laboratory results, and lifestyle factors, can be combined with biological (omics) data to create models that predict disease risk accurately and transparently.

Scientific rationale:
Electronic Health Records (EHRs) contain rich information on patient demographics, diagnoses, treatments, and outcomes, offering a valuable resource for predicting disease risk non-invasively. However, EHR data alone often fail to capture the underlying biological mechanisms driving disease progression. Recent advances in multimodal representation learning make it possible to integrate diverse data types and uncover hidden relationships between molecular and clinical features. The UK Biobank provides the large-scale, longitudinal, and well-characterized dataset required to explore these interactions comprehensively.

Objectives:
i) To design computational models that integrate longitudinal EHR data, biological (omics) data, and images into unified multimodal embeddings.
ii) To model patient health trajectories for accurate and biologically interpretable disease risk prediction.
iii) To investigate how early-life health conditions influence future disease development, focusing particularly on early-onset cancers.

Expected impact:
By combining temporal clinical data with molecular information, this research will enhance understanding of disease mechanisms, improve patient stratification, and support the early identification of high-risk individuals. The resulting explainable AI models will guide clinicians toward personalized prevention strategies and contribute to more transparent and equitable precision medicine.