Abstract
BACKGROUND AND AIMS: Emerging machine learning models show promise in addressing the unmet need for non-invasive screening of metabolic dysfunction-associated steatotic liver disease (MASLD) but lack extensive validation. We aimed to identify the most effective model for MASLD detection.
METHODS: This study included five cohorts: the epidemiological survey for MASLD in South China (January 2020 to March 2022), the UK Biobank database (April 2007 to December 2010), the NHANES III database (1988-1994), the NHANES 2017-2020 database and multi-centre databases with liver biopsy data from South China. Hepatic steatosis was diagnosed using vibration-controlled transient elastography, magnetic resonance imaging-based proton density fat fraction, ultrasonography or biopsy. A total of 34 methods were analysed, comprising 6 machine learning models and 28 traditional scores. Survival analysis was conducted to assess the predictive value of these indicators for MASLD prognosis.
RESULTS: The final analysis included a total of 24,861 subjects. The area under the receiver operating characteristic curve (AUROC) for the extreme gradient boosting (XGB) model in detecting MASLD exceeded 0.8 across all five databases. In the subgroup of lean individuals, the combination of the triglyceride-glucose index and waist circumference yielded an AUROC ranging from 0.60 to 0.88. In the NHANES databases, the overall survival rate for the MASLD group was significantly lower than that of the non-MASLD group (p < 0.001). Additionally, the logistic regression model demonstrated strong predictive ability for overall survival in MASLD subjects.
CONCLUSIONS: The XGB model outperformed traditional non-invasive methods in detecting MASLD.
TRIAL REGISTRATION: The study was registered in the Chinese Clinical Trial Registry (ChiCTR2000034197).
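
ILLUSTRATIVE SKETCH: The abstract reports AUROC values for a combination of the triglyceride-glucose (TyG) index and waist circumference. The sketch below shows one way such a score and its AUROC might be computed. It assumes the commonly used definition TyG = ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2) and assumes the combination is the product TyG × waist circumference (cm); the abstract does not state which form was used. The function names and the sample values are hypothetical and are not study data.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """TyG index: ln(fasting TG [mg/dL] * fasting glucose [mg/dL] / 2)."""
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


def tyg_wc(triglycerides_mg_dl, glucose_mg_dl, waist_cm):
    """Assumed combination: TyG index multiplied by waist circumference (cm)."""
    return tyg_index(triglycerides_mg_dl, glucose_mg_dl) * waist_cm


def auroc(scores, labels):
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical subjects: (triglycerides mg/dL, glucose mg/dL, waist cm, steatosis label)
subjects = [
    (90.0, 85.0, 72.0, 0),
    (110.0, 92.0, 80.0, 0),
    (160.0, 101.0, 88.0, 1),
    (210.0, 118.0, 95.0, 1),
]
scores = [tyg_wc(tg, glu, wc) for tg, glu, wc, _ in subjects]
labels = [label for _, _, _, label in subjects]
print(round(auroc(scores, labels), 3))
```

The pairwise AUROC is quadratic in sample size and is meant only to make the definition explicit; for cohorts of the size reported here, a rank-based implementation from a statistics library would be the practical choice.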