Machine learning-based disease risk prediction in large-scale genomic cohorts from multiethnic populations
Approved Research ID: 76615
Approval date: October 5th 2021
Individual-level disease risk prediction is a major challenge in realising personalised healthcare and medicine. We developed a new machine learning-based method named STMGP (smooth-threshold multivariate genetic prediction), which has better predictive ability than conventional methods, and extended it so that it is also applicable to larger data sets. We will evaluate the performance of the method by applying it to a wide range of traits and diseases in the UK Biobank data set. We will also investigate in detail how performance is affected by the genetic architecture of each trait or disease and by the demographic characteristics and genetic background of the population. In addition, we will evaluate the performance of disease risk prediction that takes into account gene-environment (GxE) interactions, which are intractable for conventional methods. The findings of the study would accelerate the translation of accurate risk prediction systems into clinical practice, not only in a specific population but also in demographically and/or genetically diverse populations.