
Approved Research

The biology, population genetics, functional consequences, and disease relevance of genomic copy number variation

Principal Investigator: Dr Michael Talkowski
Approved Research ID: 50765
Approval date: July 17th 2020

Lay summary

Occasionally, a section of DNA in a human genome is mistakenly deleted or duplicated, so that individuals carry different numbers of copies of certain DNA segments. This phenomenon is known as copy-number variation (CNV). Previous studies have shown that CNVs are an important factor in human evolution and adaptation. For instance, certain regions of the genome that distinguish humans from chimps and other primates have accumulated numerous extra duplicated copies. CNVs have also been confidently shown to influence numerous severe diseases, especially cancers and pediatric neurological conditions like autism and intellectual disability. However, previous studies have been too small to produce robust estimates of which genes and regions of the genome are tolerant of CNVs, and which will cause disease when deleted or duplicated. As a consequence, it is currently challenging to interpret CNVs found in an individual patient's genome during routine clinical testing. The UK Biobank is one of several large-scale initiatives whose genetic data provide a unique opportunity to develop catalogues of the CNV-tolerant regions of the genome; these catalogues will aid in the clinical diagnostic interpretation of CNVs, and will facilitate future studies of genome biology, resulting in an improved knowledge of which segments of DNA are particularly sensitive to being deleted or duplicated.

In this project, we will integrate genetic data from the UK Biobank with existing resources to develop a large CNV atlas across human populations. From this atlas, we will construct reference panels to guide scientists and clinicians when interpreting CNVs. We anticipate that these resources might be widely useful to the medical and research communities. We expect to complete this work within 12-24 months, and all results will be made openly available to the scientific community.