Last updated:
ID:
917642
Start date:
21 July 2025
Project status:
Current
Principal investigator:
Dr YouHyun Park
Lead institution:
Yonsei University of Wonju Severance Christian Hospital, Korea (South)

Stroke remains a leading cause of mortality and disability worldwide. While numerous studies have examined stroke incidence, there is a need for improved prediction models specifically focused on stroke-related mortality. This project aims to develop and validate a machine learning-based model to predict the risk of stroke death using large-scale health and biomarker data from the UK Biobank.

We will use data on sociodemographic characteristics, health behaviors (e.g., smoking and alcohol consumption), clinical biomarkers, medication use, comorbidities, and hospital inpatient records (Hospital Episode Statistics, HES) to identify important predictors of stroke mortality. Our primary outcome will be stroke-specific death, identified through linkage to national death registries using ICD-10 codes. Candidate models will be trained using several algorithms, including XGBoost, random forests, and logistic regression, and their performance will be compared in terms of discrimination (AUC, sensitivity, specificity) and calibration.
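As an illustration of the evaluation metrics named above, the following is a minimal sketch in plain Python. It is not the project's pipeline. The function names, the 0.5 decision threshold, the use of the Brier score and equal-width bins as calibration summaries, and the toy labels and scores are all assumptions made for this example. In practice, a library such as scikit-learn would normally supply these metrics.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    Pairwise O(n^2) version, for illustration only."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive.
    The 0.5 default is an assumption, not a recommended clinical cut-off."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def brier_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


def calibration_bins(
    y_true: Sequence[int], y_score: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width risk bins and return, for each
    non-empty bin, (mean predicted risk, observed event rate, count)."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for y, s in zip(y_true, y_score):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        sums[idx][0] += s
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(p / c, o / c, c) for p, o, c in sums if c > 0]


if __name__ == "__main__":
    # Toy data: 1 = stroke death, 0 = no stroke death (hypothetical values).
    labels = [0, 0, 0, 0, 1, 0, 1, 1]
    risks = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
    print("AUC:", roc_auc(labels, risks))
    print("Sensitivity, specificity:", sensitivity_specificity(labels, risks))
    print("Brier score:", brier_score(labels, risks))
    print("Calibration bins:", calibration_bins(labels, risks, n_bins=2))
```

Because stroke death is a rare outcome in a population cohort, sensitivity and specificity depend strongly on the chosen threshold, which is one reason to report calibration alongside AUC rather than any single figure.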

The objectives of this research are:

To identify the most influential predictors of stroke mortality in a large prospective cohort.

To compare the predictive performance of multiple machine learning algorithms.

To develop a risk stratification model that could help in early identification of individuals at high risk of stroke death.

This study will contribute to improved understanding of stroke prognosis and may inform clinical decision-making and public health strategies aimed at reducing stroke-related mortality.