Prediction of complex phenotypes and diseases from personal genomic data and health data using machine learning
Principal Investigator: Professor Chongle Pan
Approved Research ID: 52970
Approval date: November 12th, 2019
Many complex diseases have both genetic and environmental risk factors. Genetic variations in an individual's genome determine that individual's genetic risk for these diseases. Predictive genomics estimates an individual's genetic risk for a given disease from their personal genome. In this study, we will develop computational models for predictive genomics and benchmark their prediction accuracy using the UK Biobank data. We will test models built with existing statistical methods and with new machine learning methods. The study will focus on complex diseases including obesity, cancers, cardiovascular diseases, and diabetes. We anticipate that the project will last three years from the initial data download to the publication of results in peer-reviewed journals. This study could have a significant impact on public health by enabling precision preventive medicine: if individuals at high risk for a disease can be identified accurately from their personal genomes and health data, parents and physicians of these individuals will have opportunities to reduce environmental risk factors and encourage lifestyle changes.