Efficient Multi-trait Genome-wide Association Analysis Method
Genome-wide association studies (GWAS) are becoming valuable tools for investigating the genetic architecture of economically important quantitative traits. Common practice in GWAS is to test the association between a single locus (SNP) and a single trait. Such studies cannot take advantage of information from multiple correlated traits. Multi-trait models are expected to increase the accuracy of polygenic risk scores and the power of pleiotropic gene detection by making use of information from genetically correlated traits. However, current multi-trait models consume large amounts of memory. Applying these models to analyze ten traits from 500,000 individuals would require about 182 TB of memory, which imposes an enormous computational burden.
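The memory figure can be checked with a short back-of-the-envelope calculation. The sketch below assumes, as an interpretation not stated in the text, that the cost is dominated by one dense double-precision covariance matrix whose dimension is the number of individuals times the number of traits; the function name is hypothetical and used only for illustration.

```python
def dense_covariance_bytes(n_individuals: int, n_traits: int,
                           bytes_per_value: int = 8) -> int:
    """Bytes needed to store one dense square matrix of dimension
    n_individuals * n_traits (assumption: 8-byte floating-point values)."""
    dim = n_individuals * n_traits
    return dim * dim * bytes_per_value


# Ten traits measured on 500,000 individuals, as in the text.
n_bytes = dense_covariance_bytes(500_000, 10)

# Convert to binary terabytes (1 TiB = 2**40 bytes).
tib = n_bytes / 2**40
print(f"{n_bytes:.3e} bytes = {tib:.1f} TiB")
```

Under these assumptions the matrix has 5,000,000 rows and columns and occupies 2 x 10^14 bytes, or roughly 182 when expressed in binary terabytes, which is consistent with the figure quoted above.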
In this proposed 36-month project, we aim to develop an efficient multi-trait GWAS method that can handle more than ten traits from 500,000 individuals. The method will use sparse matrix techniques and is expected to consume only tens of GB of memory. We will apply our method to the UK Biobank data and accomplish three specific aims. First, we will estimate the genetic correlations between pairs of traits, which will help us understand the genetic structure of complex human traits. Second, we will group traits with strong genetic correlations and perform multi-trait GWAS to locate pleiotropic genes. Third, we will assess the prediction accuracy of polygenic risk scores obtained with our efficient multi-trait models. Together, this project will help identify genetic variants affecting multiple traits, with the potential to guide prevention and treatment, thereby bringing personalized medicine within people's reach.