Abstract

Development and validation of an early diagnosis model for severe mycoplasma pneumonia in children based on interpretable machine learning.

Xie, Si (S); Wu, Mo (M); Shang, Yu (Y); Tuo, Wenbin (W); Wang, Jun (J); Cai, Qinzhen (Q); Yuan, Chunhui (C); Yao, Cong (C); Xiang, Yun (Y)

 
     


Respir Res. 2025 May 13;26(1):182. doi:10.1186/s12931-025-03262-1

Abstract

BACKGROUND: Pneumonia is a major threat to the health of children, especially those under five years of age. Mycoplasma pneumoniae infection is a major cause of pediatric pneumonia, and the incidence of severe Mycoplasma pneumoniae pneumonia (SMPP) has increased in recent years. An early warning model for SMPP is therefore urgently needed to improve the prognosis of pediatric pneumonia.

METHODS: The study comprised 597 SMPP patients aged between 1 month and 18 years. Clinical variables were selected by Lasso regression, and eight machine learning algorithms were then applied to develop an early warning model. Model accuracy was assessed in a validation cohort and a prospective cohort. To facilitate clinical assessment, the indicators were reduced and a visualized simplified model was constructed. The clinical applicability of the model was evaluated using decision curve analysis (DCA) and clinical impact curves (CIC).
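
For readers unfamiliar with DCA, the quantity it plots is the net benefit of acting on a model's predictions at a chosen risk threshold. The sketch below assumes the standard net-benefit definition (true positives minus false positives weighted by the threshold odds, per patient). The function names and the toy labels and probabilities are hypothetical illustrations, not the study's code or data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold.

    Standard decision-curve definition:
        NB = TP/n - FP/n * (threshold / (1 - threshold))
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight
    return tp / n - odds * fp / n


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = severe case, 0 = non-severe case.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A decision curve repeats this calculation over a range of thresholds. A model is considered clinically useful where its net benefit exceeds both the treat-all reference and the treat-none reference (net benefit 0).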

RESULTS: After variable selection, eight machine learning models were developed using age, sex and 21 serum indicators identified as predictive factors for SMPP. A Light Gradient Boosting Machine (LightGBM) model demonstrated strong performance, achieving an AUC of 0.92 in prospective validation. SHAP analysis was used to screen the dominant variables, namely serum S100A8/A9, tracheal computed tomography (CT), retinol-binding protein (RBP), platelet-large cell ratio (P-LCR) and CD4+CD25+ Treg cell counts, which were used to construct a simplified model (SCRPT) with improved clinical applicability. The SCRPT diagnostic model exhibited favorable diagnostic efficacy (AUC > 0.8). Additionally, the study found that S100A8/A9 outperformed conventional clinical inflammatory markers and could also differentiate the severity of MPP.
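
The AUC values reported here can be read as the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case. A minimal sketch of that pairwise definition follows. The function name and the toy scores are hypothetical and unrelated to the study's data, and a library routine would normally be used in practice.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = severe case, 0 = non-severe case.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

print(roc_auc(labels, scores))  # 15 of 16 pairs ranked correctly -> 0.9375
```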

CONCLUSIONS: The SCRPT model, consisting of five dominant variables (S100A8/A9, CT, RBP, P-LCR and Treg cells) screened using eight machine learning algorithms, is expected to serve as a tool for the early diagnosis of SMPP. S100A8/A9 can also be used as a biomarker for effectively differentiating SMPP when medical resources are limited.
