Assessing the history and health consequences of rare variants
Principal Investigator:
Professor Gilean McVean
Approved Research ID:
12788
Approval date:
April 1st 2016
Lay summary
Each individual carries many rare genetic variants (frequency less than 1 in 1000), several of which affect gene function and may affect disease risk. Moreover, because rare variants are typically recent in origin, they are often more geographically restricted than common variants, which can cause difficulties for genetic association studies. We will analyze the variants represented on the UK Biobank Axiom array to characterize the distribution of rare variants between individuals, to estimate the penetrance of variants understood to cause severe genetic disease, and to infer their evolutionary history (age, geographical origin, recurrence and evidence for purifying selection).
By assessing the penetrance of known "disease-causing" rare variants, the research will provide an unbiased assessment of the health consequences of particular types of genetic alteration in individuals not ascertained for a particular disease (or without a family history). Moreover, by assessing the correlation between genetic and geographic proximity (population stratification), we will gain more powerful and better controlled tests for estimating the disease risks associated with rare variants. Finally, by understanding the evolutionary history of variants, we will estimate the mutation and selection pressures associated with different types of genetic change.
Rare genetic variants (frequency of less than 1 in 1000) will be identified within each individual from the Axiom array data. The likely impact of such variants on gene function and disease will be inferred by comparison to external databases. The penetrance of mutations will be estimated by analyzing medical data available on cohort participants (e.g. cancer registries, hospital admissions, prescription records).
Relatedness structures, evolutionary history, and correlation with geography will be inferred by measuring the sharing of combinations of variants (haplotypes) around rare variants and linking this information to participants' geographic data.
Full cohort