Question: How can we develop a multi-ethnic disease risk prediction model that generalizes across diverse populations?
Objective: We aim to develop and validate a multi-ethnic disease risk prediction model that assesses disease risk accurately across different ethnic groups, with better generalizability than ethnicity-specific models.
Scientific rationale: The development of generalizable disease risk prediction models has become increasingly important due to rising global migration, which has led to a decline in single-ethnicity populations and the emergence of genetically diverse communities. While existing ethnicity-specific models may perform well within their respective populations, their accuracy in admixed or multi-ethnic groups remains uncertain, limiting broader applicability.
To address this, we will develop a multi-ethnic disease risk prediction model by combining UK Biobank data with our existing machine learning models, which were trained on data from 80,000 Korean individuals and cover 39 diseases in males and 41 in females, with an average of 50 disease-specific markers per disease.
As part of risk model construction, we will use comparative genomics to characterize the genetic structure of subpopulations and assess their disease susceptibilities. Because most of our existing data originate from Jeju Island, we will characterize the genetic distinctiveness of the Jeju Island population relative to the mainland Korean population. From these population genetics analyses, we will extract genetic parameters to refine the model, so that population structure and genetic diversity are adequately represented in disease risk predictions.
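As an illustration of the kind of genetic parameter such an analysis could yield, the sketch below computes Hudson's F_ST estimator between two populations from per-marker allele frequencies, using the ratio-of-averages form. This is a minimal example under stated assumptions, not the project's actual pipeline: the function name `hudson_fst`, its arguments, and the toy frequencies are hypothetical, and the proposal does not specify which differentiation statistic will be used.

```python
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float], n1: int, n2: int) -> float:
    """Hudson's F_ST between two populations (ratio of averages over markers).

    p1, p2: allele frequencies of the same allele at each marker.
    n1, n2: number of sampled allele copies (2 x individuals for diploids).
    All names here are illustrative assumptions, not part of any library.
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must cover the same markers")
    if n1 < 2 or n2 < 2:
        raise ValueError("each population needs at least two sampled alleles")

    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)

    if denominator == 0:
        raise ValueError("no variation between or within populations")
    return numerator / denominator


if __name__ == "__main__":
    # Toy frequencies for three markers; these are made-up values,
    # not estimates from Jeju Island, mainland Korea, or UK Biobank.
    jeju = [0.10, 0.45, 0.80]
    mainland = [0.12, 0.40, 0.78]
    print(hudson_fst(jeju, mainland, n1=2000, n2=2000))
```

Values near zero indicate little differentiation between the two groups, while larger values indicate more distinct allele frequencies. The estimator can be slightly negative when the populations are effectively identical, because of the sample-size correction.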