Characterizing the contribution of short tandem repeats to human phenotypes.
Approved Research ID: 46122
Approval date: April 10th 2019
Short Tandem Repeats (STRs) are a class of genetic variation comprising repeated short sequences of DNA in the genome. Several dozen STRs are known to contribute to human diseases, including Huntington's Disease and Fragile X Syndrome. However, there are more than 1 million STRs in the human genome, most of which remain uncharacterized. Studying the role of STRs has traditionally been difficult because they are complex to analyze and are not directly captured by most genetic studies. We recently developed a resource that allows STRs to be analyzed in large datasets where they were not directly genotyped, using a technique known as imputation. Here, we will leverage this resource to identify the contribution of STRs to a variety of traits in humans. We expect that imputing STRs in the UK Biobank data and performing association tests will take up to 1 year, with 1-2 years of follow-up work to evaluate and interpret our results. We expect our study will identify a novel class of genetic variation with widespread impact on a variety of human traits.
Scope extension: We will additionally analyze the contribution of other complex variants, including variable number tandem repeats (VNTRs) and HLA haplotypes, to complex traits. The same rationale that supports the inclusion of STRs alongside SNPs in genetic analyses encourages the inclusion of other complex variant types, and there is already evidence that they play a role (e.g. Mukamel et al. 2021). We will call these other variant types both via imputation and by calling them directly from the 450k whole exome sequences. Among other methods, we will continue to look at length-based associations for VNTRs; for HLA haplotypes we will look for haplotype associations.
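As an illustration of what a length-based association test can look like, the following minimal sketch regresses a quantitative trait on repeat length by ordinary least squares. It is not the project's pipeline: the function name `length_association`, the input layout (one summed repeat length and one trait value per individual), and the toy numbers are assumptions made for this example, and a real analysis would also adjust for covariates such as age, sex, and ancestry.

```python
from math import sqrt


def length_association(lengths, phenotypes):
    """Simple linear regression of a trait on repeat length.

    Hypothetical helper for illustration only.
    lengths: per-individual repeat length (e.g. summed over both alleles).
    phenotypes: per-individual quantitative trait values.
    Returns (slope, intercept, standard error of the slope).
    """
    n = len(lengths)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(lengths) / n
    mean_y = sum(phenotypes) / n

    # Sum of squares of length and its cross-product with the trait
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("repeat length does not vary in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, phenotypes))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom gives the slope's standard error
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(lengths, phenotypes))
    se = sqrt(rss / (n - 2) / sxx)
    return slope, intercept, se


if __name__ == "__main__":
    # Toy data, invented for this example
    toy_lengths = [20, 22, 24, 26, 28, 30]
    toy_trait = [1.0, 1.3, 1.2, 1.6, 1.7, 1.9]
    slope, intercept, se = length_association(toy_lengths, toy_trait)
    print(f"effect per repeat unit: {slope:.3f} (SE {se:.3f})")
```

The slope estimates the change in the trait per additional repeat unit. Dividing it by its standard error gives the usual t-statistic for testing whether length is associated with the trait.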