Approved Research

Advanced Statistical and Machine Learning Approaches for Identifying Gene-by-Environment Interactions Across Ancestral Populations

Principal Investigator: Dr Curtis Boykin
Approved Research ID: 66192
Approval date: January 17th 2022

Lay summary


The sequencing of the human genome offered great promise for understanding the genetic basis of many traits that matter for health and wellness. By understanding the genome, we might be able to understand why certain individuals suffer from specific illnesses, which could offer insight into how to manage and treat those diseases. While decades of research have revealed a strong genetic basis for many traits, questions remain about how aspects of our environment or experience affect the way we study genomes. More advanced statistical techniques may allow us to better measure how the environment influences disease risk.


We aim to study how many under-appreciated features of the environment, including those associated with social stratification (e.g., socioeconomic status), can influence how we study and interpret the relationship between genetics and observable traits. We will develop and apply new statistical methods across a number of traits and environmental factors to learn how certain environmental factors may interact with and affect the genetic architecture of behavioral and psychological traits. The UK Biobank contains hundreds of thousands of individuals from diverse demographic backgrounds, which makes it an ideal dataset for examining these questions across many traits of interest and many environmental factors. Gene-by-environment interactions have been well studied in other complex traits (e.g., height). We plan to first evaluate the power and robustness of our methods on these well-studied phenotypes, and then focus on behavioral and psychological traits, which can be influenced by experiences and can contribute to mental health conditions.


Because the foundations of the statistical methods that we will apply in this work are already published, we anticipate a brief study period. We expect analyses to be complete within 12 months of the start of the study, with the team beginning to share results with the scientific community via publication shortly thereafter.

Public health impact:

While the science of genomics has already provided insight into how genes can influence disease risk, many other factors also influence that risk. For example, lived experience and exposure to certain environments can lead to certain mental health ailments. We hope to propose better methods for separating the effects of genes from those of the environment. Further, we hope to identify specific factors that might make some individuals more susceptible to mental illness.