Approved Research

The impact of population structure on biomedical outcomes in the UK Biobank

Principal Investigator: Professor Eimear Kenny
Approved Research ID: 84541
Approval date: November 22nd 2022

Lay summary

The use of genomic information in healthcare is a rapidly growing area of research. A person's genetic risk for a complex trait (i.e. one whose aetiology comprises a heterogeneous mix of many genetic and environmental factors) is typically modelled using statistical information derived from large-scale studies of human genomes (genome-wide association studies). The design of these studies typically relies on assumptions that confounding caused by additional factors correlated with the disease is either absent or adequately accounted for in the statistical model, so that the studies produce accurate information on the relationship between individual genetic variants and their contribution to disease risk. There is emerging evidence, however, that these assumptions may not hold in many cases, and that subtle correlations in genetic and environmental exposures between individuals may impair our ability to model the true underlying genomic risk accurately and consistently across different groups of people when using currently available data and best practices. To gain a better understanding of this, we aim to measure levels of relatedness between individuals in the UK Biobank by identifying parts of their genome that they have inherited from shared ancestors in the recent past. We can use these shared genetic segments to track the relationships between individuals and their shared history, geography, genetics and environment, which will in turn allow us to explore shared patterns of genetic and environmental factors that may be biasing the way we currently estimate the genetic risk of disease. Characterizing these features of the population will inform our ability to build better prediction models in the future.
We anticipate that this study will take around three years to complete and will result in a better understanding of how to measure and interpret the genetic component of disease risk in real-world populations. This will allow us to produce more consistently accurate models for predicting risk for genetic disease, ultimately to be implemented in preventative clinical care.