Genetics of complex diseases, comorbidities, and disease risk prediction
Complex diseases are caused by a combination of genetic, environmental, and lifestyle factors. Most diseases fall into this category, including several congenital defects and many adult-onset diseases; examples include coronary artery disease (CAD) and asthma. Many of these diseases occur at high frequency in the general population (e.g., the prevalence of CAD is ~6%) and are therefore called common diseases. As a result, common complex diseases account for the majority of morbidity and mortality. To identify candidate treatments and preventive measures that reduce the burden on patients' families and on healthcare costs, we plan to test DNA polymorphisms for association with disease phenotypes and thereby pinpoint the genomic regions associated with disease risk. Because association does not imply causation, we will apply statistical methods to infer causal relationships from the observed associations and so clarify the underlying genetic mechanisms. Gene-based rare variant association analysis is one approach to discovering causative genetic mechanisms, but it is generally underpowered because it requires much larger sample sizes. We therefore plan to develop a new method that integrates common and rare variants, combining variant annotations, phasing of rare variants, and prior information borrowed from related traits or risk factors to boost the power of the association test.
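As an illustration of what a gene-based rare variant test looks like in its simplest form, the sketch below collapses the rare variants in one gene into a carrier indicator and compares carrier frequency between cases and controls with a 1-degree-of-freedom chi-square test. It is a textbook-style burden test, not the integrative method proposed here; the function name `burden_test`, the MAF threshold, and the toy genotypes are all assumptions made for the example.

```python
import math
import numpy as np


def burden_test(genotypes, is_case, maf_threshold=0.01):
    """Carrier-collapsing burden test for one gene (illustrative only).

    genotypes: (n_individuals, n_variants) minor-allele counts (0, 1, or 2).
    is_case:   (n_individuals,) boolean case/control labels.
    Returns (chi-square statistic, p-value) for a 2x2 carrier-by-status table.
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(is_case, dtype=bool)

    # Keep only variants that are observed and rarer than the threshold.
    maf = g.mean(axis=0) / 2.0
    rare = (maf > 0) & (maf < maf_threshold)

    # An individual is a carrier if they hold any rare allele in the gene.
    carrier = g[:, rare].sum(axis=1) > 0

    a = float(np.sum(carrier & y))     # case carriers
    b = float(np.sum(~carrier & y))    # case non-carriers
    c = float(np.sum(carrier & ~y))    # control carriers
    d = float(np.sum(~carrier & ~y))   # control non-carriers
    n = a + b + c + d

    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence either way

    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square with 1 d.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data (hypothetical): 20 cases followed by 20 controls, 4 variants.
toy_genotypes = np.zeros((40, 4), dtype=int)
toy_genotypes[[0, 1, 20], 0] = 1    # rare variant, 2 cases + 1 control
toy_genotypes[[2, 3, 4], 1] = 1     # rare variant, 3 cases
toy_genotypes[[5, 6, 7], 2] = 1     # rare variant, 3 cases
toy_genotypes[::2, 3] = 1           # common variant, excluded by the filter
toy_is_case = np.arange(40) < 20

toy_chi2, toy_p = burden_test(toy_genotypes, toy_is_case, maf_threshold=0.05)
```

With so few carriers per gene, the 2x2 table is sparse even in this toy example, which is one way to see why such tests need large samples or additional information to gain power.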
Currently, the most powerful way to quantify genetic disease risk is to compute a weighted sum of all associated risk loci across the whole genome as a single continuous score, termed the polygenic risk score (PRS). Disease risk prediction by PRS was named one of the 10 Breakthrough Technologies of 2018 by MIT Technology Review. However, current PRS methods are underpowered and usually biased when applied to individuals of non-European ancestry. We therefore plan to develop new deep learning methods that integrate genetic and lifestyle risk factors to make prediction more powerful. Furthermore, by leveraging transfer learning techniques from artificial intelligence, we will improve the transferability of disease risk prediction across populations, making healthcare more equitable for people of different ancestries.
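The weighted sum behind a PRS can be written for individual i as PRS_i = sum over loci j of beta_j * G_ij, where G_ij is the risk-allele dosage (0 to 2) and beta_j is the per-allele effect size estimated in an association study. The sketch below is a minimal illustration of that formula only; the function name `polygenic_risk_score` and the toy dosages and weights are assumptions, and steps used in practice, such as variant selection and effect-size shrinkage, are left out.

```python
import numpy as np


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages, one score per individual.

    dosages:      (n_individuals, n_loci) risk-allele dosages in [0, 2].
    effect_sizes: (n_loci,) per-allele weights, e.g. log odds ratios.
    """
    g = np.asarray(dosages, dtype=float)
    beta = np.asarray(effect_sizes, dtype=float)
    if g.ndim != 2 or beta.ndim != 1 or g.shape[1] != beta.shape[0]:
        raise ValueError("dosages must be (n, m) and effect_sizes must be (m,)")
    return g @ beta


# Toy example (hypothetical values): 3 individuals, 3 loci.
toy_dosages = np.array([
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
])
toy_betas = np.array([0.10, -0.05, 0.20])
toy_scores = polygenic_risk_score(toy_dosages, toy_betas)
```

Because the weights are estimated in a particular study population, the same formula applied to another ancestry group can give miscalibrated scores, which is the transferability problem described above.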
This proposal aims to map novel genetic risk variants and genes to complex diseases, identify the genetic components shared between these diseases, and develop powerful models for disease risk prediction. The study will extend our understanding of genetic etiology and provide more powerful risk prediction models, which will benefit downstream risk stratification, intervention, and treatment.