Author(s):
Yuening Wang, Rodrigo Benavides, Luda Diatchenko, Audrey V. Grant, Yue Li
Publish date:
12 May 2022
Journal:
iScience
PubMed ID:
35637735

Abstract

Large biobank repositories of clinical conditions and medications data open opportunities to investigate the phenotypic disease network. We present a graph embedded topic model (GETM). We integrate existing biomedical knowledge graph information, in the form of pre-trained graph embeddings, into the embedded topic model. Via a variational autoencoder framework, we infer patient phenotypic mixtures by modeling multi-modal discrete patient medical records. We applied GETM to UK Biobank (UKB) self-reported clinical phenotype data, which contain 443 self-reported medical conditions and 802 medications for 457,461 individuals. Compared to existing methods, GETM demonstrates good imputation performance. With a more focused application on characterizing pain phenotypes, we observe that GETM-inferred phenotypes not only accurately predict the status of chronic musculoskeletal (CMK) pain but also reveal known pain-related topics. Intriguingly, medications and conditions in the cardiovascular category are enriched among the most predictive topics of chronic pain.

Related projects

Full cohort, with the possibility of adding data from the future web-based questionnaire on pain.

Institution:
McGill University, Canada
