Using Machine Learning and Deep Learning for (Early) Detection of Dementia Disease

Last updated:: 17 July 2025

ID:: 412140
Start date:: 12 December 2024
Project status:: Current
Principal investigator:: Mr Abdullah Kavakli
Lead institution:: İstanbul Technical University, Turkiye

Dementia affects over 55 million people globally and causes irreversible cognitive decline for which there is no cure. It includes various diseases that impair memory, thinking, and decision-making. Despite extensive research using neuroimaging tools such as MRI, PET, and CT, as well as genetic analyses, these methods are primarily diagnostic and limited by small sample sizes. Early prediction of dementia is crucial but remains challenging with current tools.
To address this, we propose examining comprehensive health records, lifestyles, and behaviors to identify dementia-related patterns. Integrating machine learning and deep learning approaches can aid in data interpretation and classification, and reveal relevant patterns and novel associations.
Our goal is to understand the causes and associations of dementia by predicting it early using data from the UK Biobank. We aim to:
1. Identify the underlying causes and associations of dementia and Alzheimer’s disease.
2. Accurately predict, at an early stage, which individuals will develop dementia.
3. Predict the onset or progression of dementia based on these predictions.
By leveraging machine learning and decision explainability tools, we hope to advance early prediction and understanding of dementia.
The UK Biobank’s “big data” presents numerous challenges, especially with unstructured data requiring extensive processing. This dataset, comprising information from around half a million individuals, is pivotal for advancements in deep learning and machine learning. Recent years have seen significant growth in these fields, with techniques like attention models, transformer-based models, and algorithms such as LGBM, XGBoost, and CatBoost achieving remarkable performance. These powerful tools are crucial for predicting outcomes, particularly in early classification.
Successful prediction relies not only on learning models but also on essential pre-processing steps such as data augmentation, scaling, and visualization. Unbalanced data can be addressed through data augmentation or through methods such as adjusting algorithm parameters or the weights of minority classes. Notable methods for handling unbalanced data have been developed in the literature.
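As a rough illustration of two of the pre-processing steps mentioned above, the sketch below computes class weights for an unbalanced label set and standardizes a numeric feature. It is a minimal example under stated assumptions, not the project's pipeline: the function names, the toy labels, and the toy ages are all hypothetical, and the weighting rule used here (number of samples divided by the number of classes times the class count) is one common heuristic among several.

```python
from collections import Counter
from statistics import mean, pstdev


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more
    during training. (Hypothetical helper, for illustration only.)
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def standardize(values):
    """Scale a numeric feature to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy data (assumed): 1 = dementia case, 0 = control.
labels = [0] * 9 + [1]
weights = balanced_class_weights(labels)

# Toy ages (assumed) scaled to a comparable range.
scaled_ages = standardize([50, 60, 70])

print(weights)
print(scaled_ages)
```

Weights of this kind can typically be passed to a classifier's class-weight or sample-weight option, so the minority class is not ignored during training.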
Despite the high performance of deep neural network architectures, they are often seen as “black-box” models, posing a challenge in healthcare applications. To enhance model transparency and trust, SHAP (SHapley Additive exPlanations) values are used to explain model outputs.
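The idea behind Shapley-value explanations can be sketched on a very small model. The example below computes exact Shapley values by enumerating every subset of features, with features outside a subset set to a baseline value. It is an illustrative sketch only: it does not use the SHAP library, the risk function and its coefficients are invented for the example, and exact enumeration is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return model(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score with an age-by-smoking interaction."""
    age, smoker, activity = z
    return 0.02 * age + 0.5 * smoker - 0.3 * activity + 0.01 * age * smoker


x = [70, 1, 0]          # individual being explained (assumed values)
baseline = [60, 0, 1]   # reference individual (assumed values)
phi = shapley_values(toy_risk, x, baseline)

for name, contribution in zip(["age", "smoker", "activity"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the model output for the individual and for the baseline, which is the property that makes such attributions easy to read in a clinical setting.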
People with dementia typically have a life expectancy of 6 to 10 years after diagnosis. By 2050, preventive measures could benefit 150 million people. This research aims to develop an efficient, explainable model to classify dementia stages and identify related features through individual tests. These models will capture complex patterns and key biomarkers, contributing to improved longevity for the general population. The findings will be published in leading medical journals.
The project is estimated to take 36 months, extending from my master’s into my PhD research to allow for a longer scope.