Approved research

Determination of correlations between polygenic risk scores of 40 traits and actual outcomes using 500,000 individuals

University Medical Center Groningen

Lay summary

Scientifically validated methods to calculate the health risks of an individual, such as polygenic risk scores, already exist, but the public is not benefiting from them because they are not applied beyond research. We will focus on making personalized genomics directly accessible to the public by delivering actionable results aimed at improving people's health and lifestyle, and ultimately their well-being. We will employ scientific methods to assess the health risks of individuals, based on genetic profiling, blood measurements, physical assessment, a questionnaire and activity monitoring using the Fitbit Inspire watch. Based on the individual's health risks, dietary and lifestyle recommendations will be made. We believe the use of large datasets is the best way to conduct proper research, which is why the UK Biobank data is uniquely suited for our project. This project specifically entails predicting the risk of 40 traits, such as high cholesterol levels or diabetes, for 500,000 individuals based on their genetic profiles, and then correlating these predictions with the actual measured values. The strength of that correlation will indicate how well we are able to predict actual outcomes based on genetic profiles. The outcome will allow us to determine which traits we can and should use when determining our health recommendations for an individual. Furthermore, we aim to identify lifestyle and dietary changes that could offset the increased disease risk in individuals prone to a disease. These can then be used to advise individuals prone to disease, allowing them to live healthier lives. For example, if it turns out that there is a strong correlation between individuals being diabetic and their genetic risk of becoming diabetic, we can use this to make health recommendations that reduce the chances of developing diabetes.
More specifically, if an individual is genetically prone to becoming diabetic and has a high-sugar diet, a low-sugar diet will be recommended, with the advice to monitor insulin levels every three months. The aim is to prevent disease rather than cure it, a purpose for which genetic data is uniquely suited.
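The core calculation described above, scoring each individual from their genetic profile and correlating that score with the measured value, can be illustrated with a minimal sketch. It is not the project's actual pipeline: the per-variant weights, the genotypes and the measured trait values below are all synthetic, and the function names `polygenic_score` and `pearson_r` are chosen here purely for illustration.

```python
import math
import random


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies) for one individual."""
    return sum(d * w for d, w in zip(dosages, weights))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-ins (assumptions, not UK Biobank data):
# 50 variants with made-up effect weights, 1000 simulated individuals.
rng = random.Random(0)
weights = [rng.gauss(0, 0.1) for _ in range(50)]
genotypes = [[rng.choice([0, 1, 2]) for _ in weights] for _ in range(1000)]

# Predicted genetic risk per individual.
scores = [polygenic_score(g, weights) for g in genotypes]

# Simulated "measured" trait: partly driven by the score, partly by noise.
measured = [5.0 + s + rng.gauss(0, 0.5) for s in scores]

r = pearson_r(scores, measured)
print(f"correlation r = {r:.2f}, variance explained r^2 = {r * r:.2f}")
```

A trait for which this correlation is high would be a candidate for use in recommendations; a trait for which it is close to zero would not.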

Scope extension:

To what extent do predicted polygenic risk scores, based on genetic profiles, correlate with actual outcomes for 40 measured traits?

Which lifestyle and dietary factors affect disease risk in individuals who are genetically susceptible to that disease?

Can we create and optimize a risk score model derived from non-genetic factors, including biomarkers, lifestyle and dietary factors, among others? How well would this model correlate with actual outcomes for 40 measured traits, and how does it compare to the standard currently used in doctors' practices?

How much more strongly do risk models that include genetic risk scores correlate with outcomes, compared to models that exclude them?
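One way to make the last question concrete is to compare the variance explained (R squared) by a model without the genetic risk score against a model that includes it. The sketch below is only an illustration under stated assumptions: it uses a single non-genetic predictor (BMI) plus one polygenic risk score, synthetic data with made-up effect sizes, and the standard closed-form expression for the squared multiple correlation with two predictors. The names `r_squared_two_predictors`, `r2_without` and `r2_with` are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_squared_two_predictors(y, x1, x2):
    """R^2 of a least-squares linear model y ~ x1 + x2, from pairwise correlations."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Synthetic cohort (assumed effect sizes, not project results).
rng = random.Random(1)
n = 2000
bmi = [rng.gauss(27, 4) for _ in range(n)]
prs = [rng.gauss(0, 1) for _ in range(n)]
cholesterol = [
    3.0 + 0.08 * b + 0.4 * p + rng.gauss(0, 0.8) for b, p in zip(bmi, prs)
]

# Model excluding genetics: outcome ~ BMI only.
r2_without = pearson_r(cholesterol, bmi) ** 2
# Model including genetics: outcome ~ BMI + polygenic risk score.
r2_with = r_squared_two_predictors(cholesterol, bmi, prs)

gain = r2_with - r2_without
print(f"R^2 without PRS: {r2_without:.3f}")
print(f"R^2 with PRS:    {r2_with:.3f}")
print(f"added variance explained: {gain:.3f}")
```

The same comparison extends to any additional data layer: fit the model with and without the layer and report the difference in variance explained, ideally on held-out individuals.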

Among other things, previous analyses (described above) have allowed us to model blood pressure and to estimate, with reasonably high accuracy, what the blood pressure of an individual not using medication should be based on a number of variables (e.g. BMI). For statin users, this also allows us to estimate what their blood pressure level would be if they were not using any statins, which enables us to test the effect of statins in users. We expect the delta between the observed and predicted blood pressure level to correlate with their genetic propensity to absorb and metabolize the drug at a different rate, which is the foundation of the pharmacogenetic passports now being widely developed. Unfortunately, evidence from large cohorts supporting this notion is absent, which is why we would like to extend our project to investigate this using our findings so far. A number of guidelines have currently been developed by the Clinical Pharmacogenetics Implementation Consortium (CPIC) and the Dutch Pharmacogenetics Working Group (DPWG). We would like to apply these guidelines to the 80,000 statin users present in the UKB (of whom approximately 8,000 have a repeat measurement) and test whether individuals identified as slow metabolizers or fast degraders of statins are indeed either using more of the drug or experiencing less of its effect (statins lower cholesterol). While doing this, we will account for lifestyle and for the genetic propensity to have higher cholesterol levels, both already calculated in the earlier parts of this project.
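The reasoning in this paragraph (fit a model on non-users, predict the untreated level for users, and relate the observed-minus-predicted delta to metabolizer status) can be sketched as follows. This is a simplified illustration, not the project's model: it uses one predictor (BMI), synthetic data, and two hypothetical phenotype labels, `normal` and `fast_degrader`, with assumed drug effects. Real phenotype assignment would follow the CPIC and DPWG guidelines, which are not reproduced here.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


rng = random.Random(2)

# Step 1: model the untreated level from non-users (synthetic data).
non_user_bmi = [rng.gauss(27, 4) for _ in range(2000)]
non_user_level = [3.0 + 0.1 * b + rng.gauss(0, 0.5) for b in non_user_bmi]
slope, intercept = fit_line(non_user_bmi, non_user_level)

# Step 2: simulate drug users. The effect sizes per phenotype are
# assumptions for illustration only.
assumed_effect = {"normal": -1.5, "fast_degrader": -0.8}
users = []
for _ in range(1000):
    bmi = rng.gauss(27, 4)
    phenotype = rng.choice(list(assumed_effect))
    observed = 3.0 + 0.1 * bmi + assumed_effect[phenotype] + rng.gauss(0, 0.5)
    users.append((bmi, phenotype, observed))

# Step 3: delta = observed level minus the level predicted without the drug.
deltas = {phenotype: [] for phenotype in assumed_effect}
for bmi, phenotype, observed in users:
    predicted_untreated = intercept + slope * bmi
    deltas[phenotype].append(observed - predicted_untreated)

# Step 4: compare the mean delta between phenotype groups.
mean_delta = {phenotype: mean(values) for phenotype, values in deltas.items()}
for phenotype, value in mean_delta.items():
    print(f"{phenotype}: mean delta = {value:.2f}")
```

If the pharmacogenetic guidelines hold, the group expected to clear the drug faster should show a delta closer to zero at the same dose, which is the pattern this simulation is constructed to produce.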

We have built models that can predict certain outcomes with reasonably high accuracy, and we would like to investigate whether we can further improve these predictions by adding more data layers. The UK Biobank also has retina scans available (https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=21017). It has previously been reported that these can be used to predict various outcomes such as coronary artery disease (e.g. Poplin et al., 2018). We would also like to use machine learning (ML) to predict a number of outcomes from these scans and add the results to the predictive models we have constructed so far (without the eye scans), to see what the added value of this data layer is in comparison to the other data layers (e.g. does a retina scan add more predictive power than a genotyping array?).

We wish to test what added predictive power we can draw from time series and imaging data for coronary artery disease (CAD), specifically the heart rate data and the heart MRI scans. Parameters derived from these data will be integrated with the existing models we constructed previously in this project.