Approved Research
Linked phenotype-genotype characterisation study of MDD with anhedonia in UK Psychiatric Cohort
Lay summary
To understand and treat the underlying causes of mental illness or dementia, researchers need access to data that connects rich descriptions of disease symptoms to comprehensive genetic information. Unfortunately, mental health clinical data is some of the least structured of any medical specialism: 80-85% of the usable symptom data in UK mental health records is 'locked' away in unstructured free-text clinical notes. Akrivia Health has developed a technological solution to overcome this limitation, applying state-of-the-art natural language processing (NLP) methods based on machine learning to automatically extract mentions of clinically useful concepts (e.g., medications, diagnoses, symptoms) from patient notes.
The purpose of this study is to anonymously link Akrivia's enriched mental health data with the extensive genetic information recorded by UK Biobank (UKB). Patients in the two datasets will be matched in a way that ensures no identifiable information (e.g., NHS numbers) is exchanged between Akrivia and UKB.
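The summary does not say how the matching will be carried out. As an illustration only, one common way to link records without exchanging raw identifiers is for each side to replace the identifier with a keyed hash and compare the resulting tokens. The sketch below assumes this approach; the key, the identifiers and the function names (`pseudonymise`, `link_tokens`) are all hypothetical and do not describe the study's actual linkage procedure.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, shared_key: bytes) -> str:
    """Return a keyed SHA-256 digest of an identifier with spaces removed."""
    normalised = identifier.replace(" ", "").strip()
    return hmac.new(shared_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link_tokens(tokens_a, tokens_b):
    """Return the set of tokens that appear in both datasets."""
    return set(tokens_a) & set(tokens_b)


# Assumption: a secret key agreed in advance (illustrative value only).
key = b"example-key-held-by-a-trusted-party"

# Made-up identifiers for illustration; these are not real NHS numbers.
site_a_ids = ["000 000 0001", "000 000 0002", "000 000 0003"]
site_b_ids = ["0000000002", "0000000003", "0000000004"]

# Each side hashes its own identifiers locally; only tokens are compared.
tokens_a = {pseudonymise(i, key) for i in site_a_ids}
tokens_b = {pseudonymise(i, key) for i in site_b_ids}

matched = link_tokens(tokens_a, tokens_b)
print(len(matched))
```

In this sketch the matched set contains only opaque tokens, so neither side sees the other's raw identifiers. A real deployment would also need to consider key management and who is permitted to hold the key.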
Once the linked dataset has been created, Akrivia will work with a leading industry partner to use the new resource for a focused data science project. Specifically, the team will use the combined dataset to investigate disease profiles of patients with a diagnosis of Major Depressive Disorder (MDD) with anhedonia symptoms. The first stage of this analysis will be to explore the correspondence between Akrivia's NLP-derived symptom data and structured measures of anhedonia (derived from both Akrivia's EHR data and structured data in UKB, e.g., PHQ-9 subscales). The second stage will be to expand the symptom profile of our target cohort using the breadth of data available via Akrivia's EHR data and the UKB dataset. The third stage will be to use the historical data provided by the EHR to extend the cross-sectional symptom profiles into longitudinal patient histories. Finally, we will augment the longitudinal phenotype profiles with genotypic data available through UKB, for example by mapping the polygenic risk score for anhedonia already derived using the UKB genotype dataset to specific anhedonia-related symptom profiles or disease trajectories identified in the EHR cohort during stages 1-3.
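As a hedged illustration of the first stage, one simple way to quantify correspondence between an NLP-derived symptom flag and a structured measure is a chance-corrected agreement statistic such as Cohen's kappa. The sketch below is an assumption about how such a comparison could look, not the study's stated method: the data are invented, the cut-off of 2 on the PHQ-9 item about little interest or pleasure is an arbitrary illustrative threshold, and the name `cohens_kappa` is our own.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of patients on which the two sources agree.
    observed = sum(x == y for x, y in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each source's label frequencies.
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both sources use a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented data: 1 = anhedonia mentioned in notes (hypothetical NLP output).
nlp_flag = [1, 1, 0, 0, 1, 0, 1, 0]
# Invented PHQ-9 item scores (0-3) for the same eight patients.
phq9_interest_item = [3, 2, 0, 1, 2, 0, 1, 0]

# Assumption: treat a score of 2 or more as a structured anhedonia indicator.
THRESHOLD = 2
structured_flag = [1 if score >= THRESHOLD else 0 for score in phq9_interest_item]

kappa = cohens_kappa(nlp_flag, structured_flag)
print(round(kappa, 3))
```

Kappa is used here rather than raw percentage agreement because it discounts the agreement that two sources would reach by chance alone, which matters when a symptom is rare or very common in the cohort.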
Involving an industry partner means that the results of the project can be directly translated into new ideas for treatments, follow-on research and clinical trials. By linking data on patient symptoms, diagnoses and medication with genetic information, this project will lay the foundation for improving patient outcomes through treatments that target the underlying causes of mental illness rather than its surface symptoms.