Methods to facilitate variant interpretation in genomic sequencing

Last updated:: 2 July 2025

ID:: 53953
Start date:: 3 February 2020
Project status:: Closed
Principal investigator:: Mr Bob Vaughan
Lead institution:: Congenica Ltd, Great Britain

Genetics plays a major role in the cause of, and predisposition to, disease. Approximately 3.5M people in the UK suffer from a rare disease at some point in their lives, and research shows that nearly 50% of rare disease sufferers wait more than 5 years for a diagnosis. Delays in diagnosis lead to inappropriate management and disease progression. 80% of rare diseases are influenced by a genetic change in the patient’s genome. There is also a growing understanding of the role of genetics in cancer, which is estimated to affect approximately 50% of people in the UK during their lifetime. Characterization of the genetic changes that have occurred in these diseases can lead to a diagnosis and determine which treatment is likely to work. Sequencing the genomes of these patients is therefore essential to ensure they receive the best possible care. NHS England recently launched the NHS Genomic Medicine Service to make this routine.
Yet while DNA sequencing is becoming widespread, interpreting the data remains a bottleneck. Many hundreds of genetic changes can be identified in a person’s genome, and many of them are unrelated to the disease. Determining which of these is having an effect is a highly manual and time-consuming process. In addition, a particular genetic change may not cause disease in all cases, and its effect may be influenced by other factors. Highly trained clinical scientists must therefore search through large numbers of genetic changes and assess a wide range of data sources, including the academic literature, to determine which changes are most likely to be disease-causing. We aim to develop an automated system that supports clinical interpretation. The more we know about which variants are present in different patients and in which diseases, the better our chance of predicting which variants will be disease-causing in future patients.
The availability of the high-quality UK Biobank sequencing dataset, along with the recent development of innovative statistical and machine-learning methods, means we can now seek to identify the factors that determine whether a genetic change is important in a given patient and use this information to develop tools that predict whether a variant is likely to be disease-causing. Automating this process will ease the pressure on clinical scientists, ultimately reducing waiting times for patients with genetic disease.