Attention-Deficit/Hyperactivity Disorder (ADHD) exhibits substantial heterogeneity in etiology, neurobiology, and clinical outcomes. Symptom-based diagnostics lack mechanistic precision, leading to inconsistent treatment responses and poor prognostic accuracy. To address this gap, this project applies data-driven interpretable machine learning (IML) to identify predictive biomarkers and developmental subgroups of ADHD, supporting the advancement of precision psychiatry.
Large-scale multimodal data, including genetic (polygenic risk scores), neuroimaging (functional/structural connectivity), metabolomic, clinical, and environmental variables, will be integrated using ensemble learning and deep learning models to predict ADHD onset risk and symptom persistence. The SHAP (SHapley Additive exPlanations) framework will quantify individual biomarker contributions and elucidate interaction effects such as gene-environment and imaging-clinical relationships.
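As a minimal illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values by brute force for a toy three-feature risk score. Everything in it is hypothetical: the feature names (`prs`, `connectivity`, `adversity`), the weights, the `risk` function, and the single all-zero baseline are assumptions made for illustration, not the project's model or data. SHAP implementations typically approximate these values against a background dataset rather than enumerating subsets.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy features; names and weights are illustrative only.
FEATURES = ["prs", "connectivity", "adversity"]


def risk(x):
    """Toy risk score with a gene-environment interaction (prs * adversity)."""
    return 0.5 * x["prs"] + 0.3 * x["connectivity"] + 0.8 * x["prs"] * x["adversity"]


def value(subset, x, baseline):
    """Score when features in `subset` take the individual's values
    and all other features are held at the baseline."""
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return risk(mixed)


def shapley_values(x, baseline):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}, x, baseline) - value(s, x, baseline))
        phi[f] = total
    return phi


# Hypothetical individual and reference point (assumed, not real data).
x = {"prs": 1.0, "connectivity": 2.0, "adversity": 1.0}
baseline = {"prs": 0.0, "connectivity": 0.0, "adversity": 0.0}
phi = shapley_values(x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy case the contributions sum to the difference between the individual's score and the baseline score, and the interaction term is split evenly between `prs` and `adversity`, which is the behavior that makes pairwise interaction effects recoverable from Shapley-style attributions.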
Unsupervised clustering (e.g., Gaussian Mixture Models, Non-negative Matrix Factorization) will then be applied to SHAP-derived feature representations to identify ADHD subgroups with distinct biological and clinical characteristics. Validation across cohorts and in independent samples will assess model robustness and generalizability.
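The clustering step can be sketched as follows, under loud assumptions: the "SHAP vectors" are synthetic two-dimensional points drawn from two made-up groups, and the mixture model is a small hand-written diagonal-covariance EM routine standing in for a library Gaussian Mixture Model. The function names (`gaussian_pdf`, `fit_gmm`) and all parameters are hypothetical and chosen only to keep the example self-contained.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at point x."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2 * math.pi * vi)
    return p


def fit_gmm(data, init_means, n_iter=50, min_var=1e-4):
    """Fit a diagonal-covariance Gaussian mixture with plain EM."""
    k, d, n = len(init_means), len(data[0]), len(data)
    means = [list(m) for m in init_means]
    weights = [1.0 / k] * k
    variances = [[1.0] * d for _ in range(k)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([v / total for v in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[i] for r, x in zip(resp, data)) / nj for i in range(d)]
            variances[j] = [
                max(min_var, sum(r[j] * (x[i] - means[j][i]) ** 2 for r, x in zip(resp, data)) / nj)
                for i in range(d)
            ]
    return weights, means, variances, resp


# Synthetic stand-in for SHAP vectors: group A is driven by the first
# feature, group B by the second (illustrative values, not real data).
rng = random.Random(0)
group_a = [[rng.gauss(1.0, 0.1), rng.gauss(0.0, 0.1)] for _ in range(20)]
group_b = [[rng.gauss(0.0, 0.1), rng.gauss(1.0, 0.1)] for _ in range(20)]
data = group_a + group_b

# Deterministic initialization from two observed points (an assumption).
weights, means, variances, resp = fit_gmm(data, [data[0], data[-1]])
labels = [max(range(len(r)), key=lambda j: r[j]) for r in resp]
print(labels)
```

Clustering in attribution space groups individuals by which features drive their predicted risk rather than by raw feature values; in practice the number of components would need to be selected and validated rather than fixed at two as it is here.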
The study aims to:
(1) build and validate a high-performance multimodal ADHD risk prediction model;
(2) identify and rank key biomarkers and interaction pairs that drive individual-level risk; and
(3) delineate biologically and clinically distinct ADHD subgroups with differing treatment responses and prognostic trajectories.
By linking predictive accuracy with mechanistic interpretability, this research will establish a scalable framework for stratified intervention and precision medicine in ADHD.