Principal Investigator: Dr Giordano Botta
Allelica S.r.l., Rome, Italy
Tags: 40692, GWAS, polygenic risk scores, prediction, translation
Since the human genome was fully sequenced almost 20 years ago, the scientific community has made great strides in understanding the genetic basis of many human traits and diseases. Much of this progress has come from the generation of large datasets, made possible by the falling cost of DNA sequencing, and from the development of sophisticated statistical machinery to discover patterns in these large, complex genetic datasets. A particularly fruitful avenue of enquiry has been genome-wide association studies (GWAS), which compare DNA between people with and without a specific disease, or between people with varying values of a particular continuous trait. These analyses have generated lists of genetic variants that are robustly associated with human traits and diseases and, crucially, offer an indication of the size and statistical strength of each association.
An individual's overall risk of getting a disease such as diabetes, or their value of a trait such as height, is a complex combination of many different environmental, lifestyle, and genetic factors. Genetics on its own will never be able to provide a complete assessment of an individual's risk. However, because many diseases and traits do have a genetic component, and the variants involved are being identified through GWAS, we are now in a position to use this information to better predict the genetic component of an individual's risk, so that interventions can be better targeted and lifestyles can potentially be modified.
Our research project aims to translate the results of GWAS into usable information about an individual's genetic liability for a trait or disease. To achieve this we have built a pipeline that turns an individual's genetic sequence information into a genetic risk score. We would like to use the UK Biobank dataset to test the utility and generalizability of our pipeline. Because this dataset has genetic information on half a million individuals together with matched measurements on a wide variety of traits and disease outcomes, it is a unique resource for assessing our pipeline. By using this resource, we will be able to iterate on and improve our algorithms to generate a product that can take anyone's genetic data and turn it into a genetic risk prediction. We anticipate that this product will be of use to genetic screening programs as well as to individual consumers hoping to learn about the effect of their genome on their bodies, ultimately leading to better disease risk prediction, prognosis, and stratification.
Project extension – January 2020
The aim of this project is to establish the general applicability of polygenic risk scores (PRS) as a method for identifying an individual's genomic liability for a range of traits. Many genome-wide association studies (GWAS) have now been performed on a diverse set of traits, and there is a growing consensus that the genetic component of many traits is controlled by variation at multiple positions in the genome, i.e. they are polygenic. These GWAS are identifying increasingly long lists of robustly associated loci, each with an estimated contribution to the phenotypic variation of a trait. In relation to health and nutrition, for example, we now have sets of loci, with estimates of the size and strength of their effects, for cardiovascular disease, type 2 diabetes, hypertension, osteoporosis, metabolic syndrome, irritable bowel syndrome, lipid levels (LDL, triglycerides) and insulin metabolism.
We have built a database of publicly available GWAS summary statistics and developed algorithms to generate PRS for a range of traits. We would now like to ask three questions. How well do our PRS algorithms predict risk? What effect does genotype imputation have on the accuracy of PRS? Can we combine PRS with other measurements to further refine our scores?
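For readers unfamiliar with PRS, the usual additive form (a sum of effect-allele dosages weighted by GWAS effect sizes) and one simple way to ask how well a score separates cases from controls can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline described in this project: the function names, variant identifiers, weights, genotypes and case labels are all hypothetical, variants missing from a genotype are simply skipped, and the area under the ROC curve (AUC) is only one of several possible accuracy measures.

```python
from typing import Dict, Sequence


def polygenic_risk_score(dosages: Dict[str, float],
                         weights: Dict[str, float]) -> float:
    """Additive PRS: sum of (GWAS effect size x effect-allele dosage).

    Dosages are 0, 1 or 2 copies of the effect allele (or a fractional
    value for imputed genotypes). Variants absent from `dosages` are
    skipped, which is an assumption made for this sketch.
    """
    return sum(beta * dosages[variant]
               for variant, beta in weights.items()
               if variant in dosages)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random control.

    Ties count as half a win. Labels are 1 for cases, 0 for controls.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical per-variant effect sizes (e.g. log odds ratios from a GWAS).
toy_weights = {"variant_a": 0.2, "variant_b": -0.1, "variant_c": 0.05}

# Hypothetical effect-allele dosages for four individuals.
toy_genotypes = [
    {"variant_a": 2, "variant_b": 0, "variant_c": 1},
    {"variant_a": 0, "variant_b": 2, "variant_c": 0},
    {"variant_a": 1, "variant_b": 1, "variant_c": 2},
    {"variant_a": 0, "variant_b": 0, "variant_c": 0},
]
toy_labels = [1, 0, 0, 1]  # hypothetical disease status

toy_scores = [polygenic_risk_score(g, toy_weights) for g in toy_genotypes]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to a score that carries no information about case status and 1.0 to perfect separation. In practice the comparison would be made on a large cohort, and typically against a baseline model containing covariates such as age and sex.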
Risk profile modeling based on blood lipid levels.
Last updated Jan 28, 2020