Statistical methods to characterize gene-environment interplay underlying complex phenotypes
Approved Research ID: 77327
Approval date: August 10th 2022
Both genetic and environmental factors contribute to the risk of developing a disease (e.g., type 1 diabetes) and to the variation in a quantitative phenotype (e.g., cholesterol level). The genetic risk of a disease can be modified by environmental exposures; for example, smoking increases the genetic risk of lung cancer. It is therefore important to study the role of gene-environment (GxE) interactions underlying a disease or phenotype.
A common approach to identifying GxE signals, genome-wide testing of genetic variants, has seen limited success, mainly because of low statistical power. We will explore several avenues for improving power. For example, multiple environmental factors can interact with the same genetic locus, and small GxE effects can be shared across multiple exposures. A lipid phenotype (e.g., HDL) can be affected simultaneously by interactions between a genetic variant and physical activity, sleep duration, and alcohol consumption. We aim to improve the power to detect an overall GxE effect by integrating multiple environmental factors.
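One simple way to picture an "overall" GxE test across several exposures is a joint F-test of all interaction terms in a linear model. The sketch below is only an illustration under stated assumptions: a quantitative phenotype, a single variant coded as a dosage, K exposures, and ordinary least squares with homoscedastic errors. The function names and the simulated data are hypothetical and are not the methods proposed in this project.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def joint_gxe_f_stat(y, g, E):
    """F statistic for H0: all K interaction coefficients (g x E_k) are zero.

    y: (n,) quantitative phenotype; g: (n,) genotype dosage; E: (n, K) exposures.
    Returns (F, df_numerator, df_denominator).
    """
    n, K = E.shape
    ones = np.ones((n, 1))
    X_null = np.hstack([ones, g[:, None], E])       # intercept + main effects only
    X_full = np.hstack([X_null, g[:, None] * E])    # plus the K GxE product terms
    rss0, rss1 = rss(X_null, y), rss(X_full, y)
    df_den = n - X_full.shape[1]
    F = ((rss0 - rss1) / K) / (rss1 / df_den)
    return F, K, df_den

# Simulated toy data (assumption, not UK Biobank data):
# small GxE effects shared across three exposures.
rng = np.random.default_rng(0)
n, K = 5000, 3
g = rng.binomial(2, 0.3, size=n).astype(float)
E = rng.normal(size=(n, K))
y = (0.1 * g + E @ np.full(K, 0.1)
     + (g[:, None] * E) @ np.full(K, 0.05)
     + rng.normal(size=n))
F, df1, df2 = joint_gxe_f_stat(y, g, E)
print(F, df1, df2)
```

The statistic can be compared with an F distribution on (df1, df2) degrees of freedom to obtain a p-value. Pooling K small interaction effects into one test is what can give more power than K separate single-exposure tests, at the cost of not indicating which exposure drives the signal.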
The majority of GxE tests are implemented for individual single nucleotide polymorphisms (SNPs). However, a gene-level test can improve the power to detect a GxE effect and can provide a better biological interpretation of a GxE signal. To perform a gene-level GxE test, we will test for an interaction between the genetic component of gene expression and an environmental factor.
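As a rough illustration of the gene-level idea, the sketch below forms the genetic component of expression as a weighted sum of SNP dosages and then tests its interaction with a single exposure using a Wald statistic from a linear model. The weights `w` are assumed to be given (for instance, estimated in a separate expression reference panel); the function name, the weights, and the simulated data are all hypothetical, and this is not the project's actual procedure.

```python
import numpy as np

def gene_level_gxe_z(y, G, w, e):
    """Wald z statistic for the interaction between predicted expression and one exposure.

    y: (n,) phenotype; G: (n, m) SNP dosages in the gene region;
    w: (m,) assumed expression-prediction weights; e: (n,) exposure.
    """
    n = y.shape[0]
    expr = G @ w                                  # genetic component of expression
    expr = (expr - expr.mean()) / expr.std()      # standardize for interpretability
    X = np.column_stack([np.ones(n), expr, e, expr * e])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])     # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)         # OLS covariance of the coefficients
    return float(beta[3] / np.sqrt(cov[3, 3]))    # z for the expression-by-exposure term

# Simulated toy data (assumption): 10 SNPs, one exposure, a true interaction of 0.2.
rng = np.random.default_rng(1)
n, m = 4000, 10
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
w = rng.normal(size=m)
e = rng.normal(size=n)
expr_true = G @ w
expr_true = (expr_true - expr_true.mean()) / expr_true.std()
y = 0.2 * expr_true + 0.2 * e + 0.2 * expr_true * e + rng.normal(size=n)
z = gene_level_gxe_z(y, G, w, e)
print(z)
```

Collapsing many SNPs into one predicted-expression score replaces many per-SNP interaction tests with a single test per gene, which is the source of both the potential power gain and the more direct biological reading.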
A subtype of a complex disease can be defined by a unique combination of genetic factors and/or environmental exposures in a subgroup of patients; for example, a subgroup of autism patients can carry recurrent mutations in the same risk gene. We will develop statistical models to identify such previously unrecognized subtypes of a phenotype.
For a complex disease, risk prediction models based on individual-level genetic profiles mainly use the marginal effects of associated genetic variants. We aim to improve risk prediction models by jointly integrating the main effects of genetic and environmental factors and GxE interaction effects.
Thus, we aim to develop novel statistical methods and related software tools to efficiently analyze genetic and environmental data and to better understand the complex interplay between genetics and environment. We will apply the proposed approaches to UK Biobank data for extensive demonstration. Because both genetics and environment contribute to the risk of a disease, our methods will help to detect lifestyle and environmental risk factors that interact with genetic factors to increase or reduce disease risk, and to reveal the disease's overall architecture. We expect the proposed research project to take five years to complete.