Approved research

Interactive Ensemble clustering for mixed data with application to mood disorders

University of Oregon

Lay summary

The aim of our project is to develop an innovative approach that uses large population datasets to improve the classification and diagnosis of mood disorders. The project is motivated by the hypothesis that novel clustering methods will enable the identification of clinically significant structures within these large population datasets. Such an approach must overcome many methodological challenges introduced by the complexity of the problem and the nature of large-scale health data, including complex and unknown structure, high dimensionality, heterogeneity, and a complex mixture of variable types.

The psychiatric community has recognized the critical need for a more precise, evidence-based approach to the diagnosis and treatment of mental illnesses. Mood disorders are the leading cause of disability worldwide; about 1 in 5 British adults will experience depression at some point in their lives. The proposed program paves the way for this vision by developing new algorithms and visual tools for precision classification and diagnosis. The rigorous identification of subgroups of individuals within heterogeneous populations will facilitate accurate and targeted diagnosis, and provide an opportunity for personalized, evidence-based interventions.

We are developing a novel clustering methodology based on a consensus approach that accounts for uncertainty in both the population and the clustering method. The project will entail methodology development, data cleaning and processing, application of the methods to data, and interpretation of results. Different tasks require different expertise, all of which is present in the team, and each task is motivated by and connected to the data requested from the UK Biobank.

The research will be undertaken at our various institutions, and we meet bi-weekly to discuss and facilitate progress. We are requesting access to the full cohort data.