Identical by descent and relatedness estimation using combined information from multiple samples
Principal Investigator:
Professor Amy Williams
Approved Research ID:
19947
Approval date:
May 2nd 2016
Lay summary
Identification of genetic variants associated with disease requires understanding the relatedness between all study samples. Traditional association testing necessitates the exclusion of related study samples, while mixed models enable their inclusion by specifying their relatedness. We propose new methods for inferring identical by descent (IBD) segments and the relationship status between every pair of individuals in a large cohort. These methods rely on combining information from sequenced panels and other samples to improve inference accuracy. We will evaluate and apply the methods to the UK Biobank genotype data and make the software we develop available for use by other researchers. We aim to develop methods that will enable IBD and relationship inference at greater accuracy than current methods provide. This will enable more effective genome-wide association studies through a better characterization of the samples' relatedness. We will use combined data from the UK Biobank, the 1000 Genomes project, and others to perform IBD detection at high accuracy by leveraging information from all available samples. We will then use these IBD segments for all pairs of individuals to infer the relatedness status of the Biobank samples. Relatedness inference will identify clusters of individuals that share IBD segments in order to infer the relationships of the members of each cluster to one another. Using multiple samples to jointly infer relatedness will enable increased accuracy compared to existing approaches that perform inference in a pairwise fashion. Full cohort