Disease areas:
  • nutrition and metabolism
Author(s):
Guillermo Briseño Sanchez, Nadja Klein, Hannah Klinkhammer, Andreas Mayr
Publish date:
21 March 2025
Journal:
Statistical Methods in Medical Research
PubMed ID:
40116638

Abstract

Motivated by challenges in the analysis of biomedical data and observational studies, we develop statistical boosting for the general class of bivariate distributional copula regression with arbitrary marginal distributions, which is suited for binary, count, continuous or mixed outcomes. To arrive at a flexible model for the entire conditional distribution, not only the marginal distribution parameters but also the copula parameters are related to covariates through additive predictors. We suggest estimation by means of an adapted component-wise gradient boosting algorithm. A key benefit of boosting as opposed to classical likelihood or Bayesian estimation is the implicit data-driven variable selection mechanism as well as shrinkage. To the best of our knowledge, our implementation is the only one that combines a wide range of covariate effects, marginal distributions, copula functions, and implicit data-driven variable selection. We showcase the versatility of our approach in applications to data from genetic epidemiology, healthcare utilization and childhood undernutrition. Our developments are implemented in the R package gamboostLSS, fostering transparent and reproducible research.

Related projects

The goal of the proposed project is to develop and apply statistical methods to better predict the individual risk of patients for a specific disease…

Institution:
University Hospital Bonn, Germany
