Disease areas:
  • brain
Author(s):
Sanghyun Shon, Younhee Ko, Hojin Yoon, Kyeongmin Kwak, Hwamin Lee
Publish date:
7 January 2026
Journal:
Briefings in Bioinformatics
PubMed ID:
41632594

Abstract

Although neuromuscular junction disorders (NMDs) and inflammatory polyneuropathies (IPNs) are biologically distinct, direct genetic comparisons between them remain limited, suggesting that additional underlying biological differences may yet be uncovered. Few studies have explored whether differences in variant patterns within shared biological pathways can be leveraged to distinguish NMDs from IPNs using machine learning (ML). We propose an interpretable ML framework based on the Pathway-based Genetic Variant Dosage Average (PGVDA) to classify NMDs and IPNs and to identify the key genes and pathways that differentiate the two disease groups. Logistic regression on nonsynonymous variants from 667 UK Biobank participants identified disease-associated variants. Significant pathways were identified via pathway enrichment analysis at an adjusted P-value < 0.05. PGVDA was calculated by assigning the log odds ratio to each variant dosage and then computing a weighted average at the pathway level. Dimensionality was reduced by hierarchical clustering based on gene-set overlaps, after which PGVDAs with a variance inflation factor (VIF) > 10 were excluded. ML models were evaluated using leave-one-out cross-validation. With the best-performing model, SHAP-based interpretation was applied using two distinct input configurations. Pathway-level interpretation using PGVDA input included the stages of PGVDA scaling and ML-based classification, while variant-level interpretation using variant dosage input encompassed the stages from odds ratio-based weight assignment to ML-based classification. Logistic regression was the best-performing model; with it, five key differentiating PGVDAs and 10 genes within each pathway were identified, suggesting that pathway-level variant aggregation enables accurate and interpretable classification of these two neuromuscular diseases. External validation is needed to ensure generalizability across populations.
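
The PGVDA and VIF steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes that each pathway score is the plain mean of log-odds-ratio-weighted dosages over the variants mapped to that pathway, and that the VIF filter is applied once rather than iteratively. The function names `pgvda` and `vif`, the argument layout, and the toy values are all hypothetical.

```python
import numpy as np


def pgvda(dosage, log_or, pathways):
    """Pathway-level scores from variant dosages (assumed formula).

    dosage   : (n_samples, n_variants) array of variant dosages (0, 1, 2)
    log_or   : (n_variants,) array of per-variant log odds ratios
    pathways : dict mapping pathway name -> list of variant column indices
    Returns the pathway names and an (n_samples, n_pathways) score matrix.
    """
    dosage = np.asarray(dosage, dtype=float)
    log_or = np.asarray(log_or, dtype=float)
    weighted = dosage * log_or  # weight each dosage by its log odds ratio
    names = list(pathways)
    # Assumption: average the weighted dosages over the pathway's variants.
    scores = np.column_stack(
        [weighted[:, pathways[name]].mean(axis=1) for name in names]
    )
    return names, scores


def vif(features):
    """Variance inflation factor of each column: 1 / (1 - R^2),
    where R^2 comes from regressing that column on all the others."""
    features = np.asarray(features, dtype=float)
    n, k = features.shape
    out = np.empty(k)
    for j in range(k):
        y = features[:, j]
        others = np.delete(features, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add intercept
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 0.0 if ss_tot == 0 else 1.0 - (resid @ resid) / ss_tot
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


# Toy example: 2 samples, 3 variants, 2 hypothetical pathways.
names, scores = pgvda(
    dosage=[[0, 1, 2], [2, 0, 1]],
    log_or=[0.5, -1.0, 2.0],
    pathways={"pathway_A": [0, 1], "pathway_B": [2]},
)
```

On a real score matrix, columns could then be kept with `scores[:, vif(scores) <= 10]`, mirroring the VIF > 10 exclusion rule stated in the abstract.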

Institution:
Korea University College of Medicine, Korea (South)
