Approved research

Multi-trait GWAS analyses in the UK Biobank

Principal Investigator: Professor Cathryn Lewis
Approved Research ID: 18177
Approval date: May 15th 2017

Lay summary

Genome-wide association study (GWAS) methods have been developed to analyse associations between SNPs and multiple phenotypes jointly. We have produced one such method (MultiPhen) and performed a simulation study, which found that multivariate analyses can double the discovery of trait-associated genetic variation compared with univariate analyses. Multi-trait analyses, such as polygenic risk scores, offer insights into shared and distinct aetiology among different phenotypes, such as ADHD, autism, schizophrenia, eating disorders and obesity. We will perform single and multi-trait analyses on the UK Biobank to boost discovery power of causal genetic variants, identify shared aetiology among phenotypes, and evaluate method performance on real data.

Identifying shared genetic risk between physical and psychiatric phenotypes, in particular, could shed light on the aetiology of psychiatric disorders, which are the focus of our group. Joint analysis of multiple traits may lead to substantially greater identification of novel genotype-phenotype associations and provide insights into the biological network underlying correlated phenotypes. Exposing the genetics responsible for comorbidity between phenotypes could also uncover possibilities for drug repositioning. By evaluating method performance, we may find whether power is optimised by performing such analyses on clinically related phenotypes, on those that are most highly correlated, or on sets of phenotypes with heterogeneous correlation structure.

GWAS analyses will be performed on subsets of the UK Biobank phenotypes, allowing some traits to be analysed jointly for the first time. We will use several statistical approaches to investigate patterns of shared and distinct genetic risk between and within psychiatric and physical traits. Using our local computing facilities, we will run software for 'polygenic risk scoring' and 'linkage disequilibrium score regression' analyses to infer genetic overlap and estimate genetic correlations between different phenotypes and outcomes.

We wish to include the full UK Biobank cohort in our analyses. In response to feedback from our approved preliminary application, we intend to use data from self-reported baseline information and health records to derive phenotypes. We acknowledge the complexity of looking across outcomes from different medical record sources and the difficulty of reliably ascertaining case status across multiple sources. We have undertaken preliminary analyses aimed at identifying case status in the UK Biobank using multivariate measures (self-report, hospital data, treatment status) as part of our collaboration on application 16577, and will extend this work on approval of this application.