Approved Research

Mixed model methods and toolsets for analysis of large-scale genetic, genomic and phenotype data.

University of Maryland Baltimore

Lay summary

The UK Biobank is one of the richest biomedical data sets in the world, covering over 500,000 subjects with thousands of detailed health and disease measures, called phenotypes, drawn from medical records and coupled with genetic data. Currently the genetic data comprises only a few million variants directly measured across individuals, but within a few years all 500,000 subjects will have whole-genome sequencing, which will increase that number to over a billion variants.

The goal of genetic research is to discover biological connections between genotypes and disease-related phenotypes, in the hope of accelerating the development of therapeutic targets and agents to treat disease and improve human health. Achieving these goals will require novel methodological, statistical and computational tools to make sense of the very large number of possible phenotype-genotype combinations in UK Biobank data. The goal of our research is to develop these computational tools and models to efficiently mine this big data landscape for biological discovery. Important components of the model will be incorporating measures over time (longitudinal) to understand the roles of genetics and environment in changes in disease susceptibility, and combining correlated phenotypes (multitrait), such as glucose and weight, into a single analysis to understand the relationship and overlap between different biological pathways that influence disease risk. The software tools we develop will be made available to the general research community.