Analysis of variants with low minor allele frequency in the UK Biobank cohort
Principal Investigator:
Dr Jeanette Schmidt
Approved Research ID:
55681
Approval date:
February 3rd 2020
Lay summary
The proposed research aims to validate new genotype calling algorithms for rare variants in array-based genotyping data, in particular in the UK Biobank cohort. The genotype calls produced by the improved algorithms for these variants will be compared to variant information from the exon sequencing data available for the UK Biobank cohort. The results will be analyzed by computing both the sensitivity and the positive predictive value of the new algorithms. Rare variants have been associated with risk factors for many diseases and are therefore of great interest. Several studies have, for example, examined rare variants in UK Biobank to help explain a person's risk of developing COPD. We have characterized the behavior of new genotyping algorithms and have discovered several metrics that appear to significantly improve the genotype calls for such variants. To date we have applied these new algorithms and metrics only to relatively small datasets with known genotypes, such as several hundred HapMap samples. The UK Biobank dataset will allow the algorithms to be characterized and validated on a large and important dataset. The results of our research will provide insight into the accuracy of genotype calls made on an array platform. Arrays have been shown to provide highly accurate results for variants that are relatively common (defined as present in more than 1%, or perhaps 0.1%, of the population). The ability to assess the reliability of calls for rarer variants will allow researchers to use array platforms and the resulting data for additional applications, such as identifying risk factors for disease. We expect this project to take less than a year to complete.
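
As an illustration of the two evaluation measures named in the summary, the sketch below shows one way sensitivity and positive predictive value could be computed when array calls are compared against sequencing data. It is not the project's actual pipeline. It assumes, for illustration only, that each dataset has been reduced to a set of (sample, variant) pairs in which the minor allele was observed, and that the sequencing calls are treated as the reference. The function name, the identifiers and the toy data are all hypothetical.

```python
from typing import Optional, Set, Tuple

# A call is a (sample_id, variant_id) pair in which the minor allele was observed.
# Both identifiers are hypothetical placeholders, not real UK Biobank fields.
Call = Tuple[str, str]


def sensitivity_and_ppv(
    array_calls: Set[Call], sequencing_calls: Set[Call]
) -> Tuple[Optional[float], Optional[float]]:
    """Compare array calls against sequencing calls used as the reference.

    Returns (sensitivity, ppv). A value is None when its denominator is zero.
    """
    true_pos = len(array_calls & sequencing_calls)   # called by both
    false_pos = len(array_calls - sequencing_calls)  # called by array only
    false_neg = len(sequencing_calls - array_calls)  # missed by the array

    # Sensitivity: fraction of sequencing-confirmed carriers the array found.
    found = true_pos + false_neg
    sensitivity = true_pos / found if found else None

    # Positive predictive value: fraction of array calls that sequencing confirms.
    called = true_pos + false_pos
    ppv = true_pos / called if called else None

    return sensitivity, ppv


# Toy data, made up purely for illustration.
array_example = {("s1", "v1"), ("s2", "v1"), ("s3", "v2")}
sequencing_example = {("s1", "v1"), ("s2", "v1"), ("s4", "v2")}

sens, ppv = sensitivity_and_ppv(array_example, sequencing_example)
print(f"sensitivity={sens:.3f}, ppv={ppv:.3f}")
```

In this toy case two calls agree, one array call is unconfirmed, and one sequencing call is missed, so both measures equal 2/3. For rare variants the positive predictive value is often the more demanding measure, because even a small number of false array calls can outnumber the few true carriers.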