Last updated:
ID: 63774
Start date: 13 May 2021
Project status: Closed
Principal investigator: Professor Su-In Lee
Lead institution: University of Washington, United States of America

Aims: The goal of our project is to help build a flexible tool for assessing people’s risk of various diseases. The tool will make predictions about an individual’s risk based on whatever data they provide, and will also suggest which missing information or tests would make their risk estimate more certain.

Scientific rationale: With large studies like the UK Biobank, researchers can build powerful algorithms that predict people’s disease risk from many variables. In the real world, however, doctors often don’t have every piece of information about a patient (e.g., all lab tests or personal information), which makes it difficult to use those algorithms effectively. One of our goals is therefore to adapt these algorithms to handle missing information for individuals in the general population by first estimating what their missing values could be. We plan to run our algorithms on each of these estimates to produce a “risk interval,” a range of risk scores the individual may have given the information we know about them.
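The idea of a “risk interval” can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the project’s actual model: the logistic risk function, its coefficients, the feature names, and the population distributions are all hypothetical. Missing values are drawn independently from assumed population distributions, which is a deliberate simplification; a realistic approach would estimate them conditional on the values that are known.

```python
import math
import random

# Hypothetical logistic risk model; coefficients are made up for illustration.
WEIGHTS = {"age": 0.04, "systolic_bp": 0.03, "cholesterol": 0.3}
INTERCEPT = -9.0

# Assumed population (mean, standard deviation) for each feature, used to
# draw plausible values when a feature is missing.
POPULATION = {
    "age": (55.0, 8.0),
    "systolic_bp": (135.0, 20.0),
    "cholesterol": (5.5, 1.0),
}


def predict_risk(features):
    """Return a risk score in (0, 1) from a complete feature dict."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def risk_interval(observed, n_draws=1000, lower=0.05, upper=0.95, seed=0):
    """Fill in missing features many times and return a (low, high) risk range."""
    rng = random.Random(seed)
    risks = []
    for _ in range(n_draws):
        completed = dict(observed)
        for name, (mean, sd) in POPULATION.items():
            if name not in completed:
                # One plausible estimate of the missing value.
                completed[name] = rng.gauss(mean, sd)
        risks.append(predict_risk(completed))
    risks.sort()
    # Empirical percentiles of the predicted risks form the interval.
    low = risks[int(lower * (n_draws - 1))]
    high = risks[int(upper * (n_draws - 1))]
    return low, high


if __name__ == "__main__":
    print(risk_interval({"age": 60.0}))
    print(risk_interval({"age": 60.0, "systolic_bp": 140.0}))
```

In this sketch, supplying more information tends to narrow the interval, and a fully observed individual gets an interval that collapses to a single risk score.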

Another goal of our method is to identify which pieces of information would be most useful for the algorithms to know (i.e., which variables would narrow the range of risk scores and decrease uncertainty). This could give doctors suggestions about which lab tests to order or what to ask the patient in order to better assess the patient’s risk. To generate these suggestions, we plan to apply an “explainable” method to our algorithm to assess how much each variable contributes to the algorithm’s output for a given individual’s data. If different estimates for a piece of information lead to large fluctuations in the algorithm’s prediction, then obtaining the true value of that variable should lead to a narrower risk range.
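The reasoning in the last sentence can be sketched as a simple one-at-a-time sensitivity check. This is a stand-in chosen for brevity, not the “explainable” method the project refers to, and the model, coefficients, feature names, and population distributions are hypothetical. Each missing variable is varied across plausible values while the other missing variables are held at an assumed population mean; the variable whose estimates move the prediction the most is suggested first.

```python
import math
import random
import statistics

# Hypothetical logistic risk model; coefficients are made up for illustration.
WEIGHTS = {"age": 0.04, "systolic_bp": 0.03, "cholesterol": 0.3}
INTERCEPT = -9.0

# Assumed population (mean, standard deviation) for each feature.
POPULATION = {
    "age": (55.0, 8.0),
    "systolic_bp": (135.0, 20.0),
    "cholesterol": (5.5, 1.0),
}


def predict_risk(features):
    """Return a risk score in (0, 1) from a complete feature dict."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_missing_features(observed, n_draws=500, seed=0):
    """Rank missing features by how much their estimates move the prediction."""
    rng = random.Random(seed)
    missing = [name for name in POPULATION if name not in observed]

    # Baseline: every missing feature set to its assumed population mean.
    baseline = dict(observed)
    for name in missing:
        baseline[name] = POPULATION[name][0]

    spreads = {}
    for name in missing:
        mean, sd = POPULATION[name]
        risks = []
        for _ in range(n_draws):
            candidate = dict(baseline)
            candidate[name] = rng.gauss(mean, sd)  # vary only this feature
            risks.append(predict_risk(candidate))
        # A large spread means knowing the true value would narrow the range.
        spreads[name] = statistics.pstdev(risks)

    return sorted(spreads.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, spread in rank_missing_features({"age": 60.0}):
        print(f"{name}: spread in predicted risk = {spread:.3f}")
```

With these made-up numbers, blood pressure ranks above cholesterol for a patient whose only known value is age, so it would be the first measurement suggested.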

Public Health Impact: Our project aims to provide a way to assess risk without full information, and may therefore be usable in the general population. The risk intervals described above would help doctors determine whether more information or tests are needed, and if so, our method’s recommendations would suggest useful next steps. Finally, by examining predictions and explanations across many individuals, we may gain insights into risk factors for these diseases.