Approved Research

Leveraging ancestry to jointly characterize genetic and environmental contributions to health disparities

Principal Investigator: Dr Leonardo Marino-Ramirez
Approved Research ID: 65206
Approval date: August 17th 2023

Lay summary

I propose to develop and apply a local ancestry mapping approach to jointly characterize the genetic and environmental contributions to health disparities in diverse cosmopolitan populations. The approach is powered by biobank-scale data sets, which combine genome-wide genetic information with rich environmental, lifestyle, and clinical data for many thousands of participants represented in electronic health records (EHR). It entails a combination of algorithm development, genome analytics, and electronic health record analysis, with an emphasis on developing and applying genetic ancestry inference algorithms to address specific questions about the relationship between genetic ancestry, environment, and health outcomes.

The specific aims for the initial phase of my research program are:

Aim 1. Genetic ancestry inference at scale. Develop algorithms for genetic ancestry inference at biobank scale, with an emphasis on local ancestry inference.

Aim 2. Local ancestry mapping. Use local ancestry mapping to jointly characterize the genetic and environmental contributions to health disparities in diverse cosmopolitan populations.

Aim 3. Landscape of health disparities. Characterize the landscape of health disparities in the UK Biobank with respect to disease prevalence differences between population groups defined by age, ethnicity, geography, socioeconomic status, and sex.
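
As a rough illustration of the kind of comparison Aim 3 describes, the sketch below computes disease prevalence per population group and the difference between two groups. It is a minimal example under stated assumptions, not the project's method: the function names, the group labels "A" and "B", and the records are all hypothetical, and the data are synthetic toy values rather than UK Biobank data.

```python
from collections import defaultdict


def prevalence_by_group(records):
    """Return {group: cases / total} from (group, has_condition) pairs."""
    cases = defaultdict(int)
    totals = defaultdict(int)
    for group, has_condition in records:
        totals[group] += 1
        if has_condition:
            cases[group] += 1
    return {group: cases[group] / totals[group] for group in totals}


def prevalence_difference(prevalence, group_a, group_b):
    """Absolute prevalence difference: group_a minus group_b."""
    return prevalence[group_a] - prevalence[group_b]


# Synthetic toy records (hypothetical groups, made-up outcomes).
toy_records = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", True), ("B", False), ("B", False),
]

toy_prevalence = prevalence_by_group(toy_records)
toy_difference = prevalence_difference(toy_prevalence, "B", "A")
```

A real analysis would also need to account for differences in age and sex structure between groups, for example through standardization, before interpreting such a difference as a disparity.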

Scope extension: Health disparities, which can be defined as avoidable differences in health outcomes between population groups, are both a threat to public health and a pressing scientific challenge. The relative importance of genetic versus environmental effects for health disparities, particularly for complex common diseases with multifactorial etiologies, has long been debated. In reality, health outcomes are influenced by a combination of genetic and environmental factors as well as myriad interactions among them. Indeed, gene-environment interactions have recently been emphasized as a promising area for genomics-enabled health disparities research. The overall goal of this work is to develop and apply novel bioinformatics approaches for the analysis of biobank-scale data sets to support the discovery of genetic and environmental contributions to health disparities. Discovery of gene-by-environment interactions will be prioritized. The UK Biobank provides an unprecedented opportunity to jointly analyze genetic and environmental contributions to health disparities at a high level of resolution, in support of health equity for currently underserved communities. Novel methods in bioinformatics and computational genomics are needed to exploit the wealth of data being generated as part of the UK Biobank. Methods for genetic ancestry inference are particularly relevant to health disparities, given the relationship between population structure and the distribution of health-related genetic variants, and these algorithms must be fast and efficient to accommodate the scale and complexity of biobank data sets. A focus on genetic ancestry can help disentangle genetic from environmental contributions to health disparities.
Our work will combine the development of novel algorithms for genetic ancestry inference at biobank scale with the repurposing of quantitative genetic statistical methods and machine learning techniques for ancestry-informed analyses of biobank data, towards the joint interrogation of genetic and environmental contributions to ethnic health disparities in complex common disease.