Author(s): Samson J. Mataraso, Camilo A. Espinosa, David Seong, S. Momsen Reincke, Eloise Berson, Jonathan D. Reiss, Yeasul Kim, Marc Ghanem, Chi-Hung Shu, Tomin James, Yuqi Tan, Sayane Shome, Ina A. Stelzer, Dorien Feyaerts, Ronald J. Wong, Gary M. Shaw, Martin S. Angst, Brice Gaudilliere, David K. Stevenson, Nima Aghaeepour
Publish date: 16 January 2025
Journal: Nature Machine Intelligence
PubMed ID: 40008295

Abstract

Omics studies produce a large number of measurements, enabling the development, validation and interpretation of systems-level biological models. Large cohorts are required to power these complex models; yet, the cohort size remains limited due to clinical and budgetary constraints. We introduce clinical and omics multimodal analysis enhanced with transfer learning (COMET), a machine learning framework that incorporates large, observational electronic health record databases and transfer learning to improve the analysis of small datasets from omics studies. By pretraining on electronic health record data and adaptively blending both early and late fusion strategies, COMET overcomes the limitations of existing multimodal machine learning methods. Using two independent datasets, we showed that COMET improved the predictive modelling performance and biological discovery compared with the analysis of omics data with traditional methods. By incorporating electronic health record data into omics analyses, COMET enables more precise patient classifications, beyond the simplistic binary reduction to cases and controls. This framework can be broadly applied to the analysis of multimodal omics studies and reveals more powerful biological insights from limited cohort sizes.
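
The idea of blending early and late fusion can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar `gate` parameter, the linear scoring heads and the random stand-in data are all assumptions made for illustration. `ehr_embedding` stands in for the output of an encoder pretrained on electronic health record data, which is where transfer learning would enter.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def early_fusion_logit(ehr_embedding, omics, w_early):
    # Early fusion: concatenate the modalities and score the joint vector.
    joint = np.concatenate([ehr_embedding, omics], axis=1)
    return joint @ w_early


def late_fusion_logit(ehr_embedding, omics, w_ehr, w_omics):
    # Late fusion: score each modality on its own, then average the logits.
    return 0.5 * (ehr_embedding @ w_ehr + omics @ w_omics)


def blended_probability(ehr_embedding, omics, w_early, w_ehr, w_omics, gate):
    # sigmoid(gate) is the weight on early fusion (a hypothetical blending rule).
    alpha = sigmoid(gate)
    logit = (alpha * early_fusion_logit(ehr_embedding, omics, w_early)
             + (1.0 - alpha) * late_fusion_logit(ehr_embedding, omics, w_ehr, w_omics))
    return sigmoid(logit)


# Random stand-in data; a real study would use encoder outputs and measured omics.
rng = np.random.default_rng(0)
n_patients, ehr_dim, omics_dim = 8, 4, 6
ehr_embedding = rng.normal(size=(n_patients, ehr_dim))
omics = rng.normal(size=(n_patients, omics_dim))
w_early = rng.normal(size=ehr_dim + omics_dim)
w_ehr = rng.normal(size=ehr_dim)
w_omics = rng.normal(size=omics_dim)

probs = blended_probability(ehr_embedding, omics, w_early, w_ehr, w_omics, gate=0.0)
print(probs.shape)
```

In this sketch a large positive `gate` recovers pure early fusion and a large negative `gate` recovers pure late fusion. In a trained model the gate and the weights would be fitted to data rather than drawn at random.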

Related projects

This research project is focused on using artificial intelligence and machine learning to better predict the risk of diseases like heart disease and diabetes. The…

Institution: Stanford University, United States of America
