Approved Research

Leveraging health records to detect and corroborate disease risk factors

Principal Investigator: Professor Adi Shraibman
Approved Research ID: 89496
Approval date: November 21st 2022

Lay summary

The aim of the project is to teach students how health data can be analyzed and leveraged to gain insight into the risk factors of a disease - whether as an instrument for the (external) validation of published results or as a tool for predicting which people are at risk for the onset of a disease. We hope that by doing so we will not only introduce students to this rapidly developing field and give them the fundamental tools for building an academic career in it, but also gain new insights into the risk factors of the diseases considered, as well as a way to integrate these risk factors into a risk score or a predictive model.

The scientific rationale is that the abundance of medical data allows for the exploration of a great number of hypotheses regarding causal relations between indications and health outcomes. Relations that are already described in the literature are less likely to be examined by researchers looking for novel discoveries, while at the same time they are particularly well suited to the setting of a students' workshop. Note that, though less interesting for academic purposes, validating and studying known relations is of great interest to the clinical physician.

The workshop itself will last one academic year, with the students doing most of the analysis and coding during the summer break following the school year in which they attend the workshop. However, we plan for the workshop to be part of the college's regular curriculum, and so would like access to the data for at least 3 years.

Direct public health impact would come from a successful student project - that is, one that constructs a high-quality predictive model (or risk score) based on readily available data. More importantly, the indirect impact would be in the training of some 20-30 students each year in this field, setting them on a path that will allow them to make their own impact in years to come.