Identification of Genomic Risk Factors and Prediction of Cardiovascular Disease Risk through Deep Learning
Lay summary
Risk factors for coronary artery disease (CAD), such as blood pressure, cholesterol levels, and cigarette smoking status, have been incorporated into risk prediction models to guide clinical management. However, a great deal of uncertainty remains when these risk scores are applied to patient care, especially for prevention of first adverse cardiovascular events, such as stroke or heart attack. One approach to reducing the uncertainty around CAD risk management is to use polygenic risk scores (PRSs), which aggregate the known genetic risk factors for CAD into a single cumulative score. CAD PRSs have been shown to help identify some high-risk individuals who receive greater benefit from initiation of statin therapy and from healthy lifestyle changes.
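To make the idea of a "single cumulative score" concrete, the following is a minimal sketch of the usual additive form of a PRS: each person's count of risk alleles at each variant is multiplied by a per-variant weight, and the products are summed. The function name, the genotype matrix, and the weights are hypothetical values chosen for illustration; they are not taken from this proposal or from any published CAD score.

```python
import numpy as np


def polygenic_risk_score(genotypes, effect_sizes):
    """Return one additive score per individual.

    genotypes: (n_individuals, n_variants) array of risk-allele counts (0, 1, or 2).
    effect_sizes: (n_variants,) array of per-variant weights.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if genotypes.ndim != 2 or genotypes.shape[1] != effect_sizes.shape[0]:
        raise ValueError("genotypes must have one column per effect size")
    # Weighted sum over variants: score_i = sum_j genotype_ij * weight_j
    return genotypes @ effect_sizes


# Hypothetical data: three individuals, three variants.
genotypes = np.array([
    [0, 1, 2],
    [2, 2, 1],
    [0, 0, 0],
])
effect_sizes = np.array([0.10, 0.05, 0.20])  # illustrative weights only

scores = polygenic_risk_score(genotypes, effect_sizes)
print(scores)
```

Because the score is a plain weighted sum, each variant contributes independently of every other variant and of any clinical risk factor.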
However, current PRSs are simple and limited in several ways. They include only a small number of genetic variants, and they do not capture many of the interactions that likely exist among genetic variants or between genetic variants and clinical risk factors. The goal of this proposal is to address these limitations through neural network-based (deep learning) prediction of CAD risk.
In Aim 1, we will use a type of neural network called an autoencoder to transform genetic data into a compressed representation, which reduces the complexity of the data and helps extract informative genetic features.
In Aim 2, we will lay the foundation for complex deep learning-based prediction of CAD risk by first generating simpler neural network-based CAD risk prediction models and comparing them to more standard approaches.
In Aim 3, we will generate more complex deep learning-based CAD risk prediction models and compare them to the simpler approaches developed in Aim 2.
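The compressed representation described in Aim 1 can be illustrated with a very small autoencoder. The sketch below is an assumption-laden toy, not the architecture proposed here: it uses a single tanh encoding layer and a linear decoder, trains with plain gradient descent, and runs on randomly simulated genotypes. The function name, layer sizes, learning rate, and data are all hypothetical.

```python
import numpy as np


def train_autoencoder(x, code_dim=2, lr=0.1, epochs=300, seed=0):
    """Train a one-hidden-layer autoencoder; return weights and loss history."""
    rng = np.random.default_rng(seed)
    n_features = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_features, code_dim))
    w_dec = rng.normal(scale=0.1, size=(code_dim, n_features))
    losses = []
    for _ in range(epochs):
        code = np.tanh(x @ w_enc)      # compressed representation
        recon = code @ w_dec           # reconstruction of the input
        err = recon - x
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        grad_recon = 2.0 * err / err.size
        grad_w_dec = code.T @ grad_recon
        grad_code = grad_recon @ w_dec.T
        grad_pre = grad_code * (1.0 - code ** 2)  # derivative of tanh
        grad_w_enc = x.T @ grad_pre
        w_enc -= lr * grad_w_enc
        w_dec -= lr * grad_w_dec
    return w_enc, w_dec, losses


# Simulated (hypothetical) genotypes: 200 individuals, 20 variants, values 0/1/2.
rng = np.random.default_rng(42)
genotypes = rng.integers(0, 3, size=(200, 20)).astype(float)
x = genotypes - genotypes.mean(axis=0)  # center each variant

w_enc, w_dec, losses = train_autoencoder(x)
codes = np.tanh(x @ w_enc)  # each individual summarized by 2 numbers instead of 20
print(codes.shape, losses[0], losses[-1])
```

In this toy setting, each individual's 20 genotype values are summarized by 2 numbers, and training lowers the error made when reconstructing the original values from that summary.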
We expect that investigating the genetic features our approach identifies as predictive of CAD risk will provide insight into the biological basis of CAD and yield improved approaches to risk stratification that can inform statin therapy initiation or be used to motivate lifestyle changes.