Prediction model for prolonged fever in patients with Mycoplasma pneumoniae pneumonia: a retrospective study of 716 pediatric patients

Objective To identify patients with Mycoplasma pneumoniae pneumonia (MPP) at risk of prolonged fever while on macrolides. Methods A retrospective study was performed with 716 children admitted for MPP. Refractory MPP was defined as fever persisting for > 72 h without improvement in clinical and radiologic findings after macrolide antibiotics (RMPP-3) or as fever persisting for > 120 h without improvement in clinical and radiologic findings (RMPP-5). Radiological data, laboratory data, and fever profiles were compared between the RMPP and non-RMPP groups. Fever profiles included the highest temperature, lowest temperature, and frequency of fever. Prediction models for RMPP were created using logistic regression and a deep neural network. Their predictive values were compared using receiver operating characteristic curves. Results Overall, 716 patients were randomly divided into training and test cohorts for both RMPP-3 and RMPP-5. For the prediction of RMPP-3, a conventional logistic model with radiologic grouping showed higher sensitivity (63.3%) than the model using laboratory values. Adding laboratory values to the prediction model using radiologic grouping did not meaningfully increase sensitivity (64.6%). For the prediction of RMPP-5, laboratory values or radiologic grouping showed lower sensitivities, ranging from 12.9% to 16.1%. However, prediction models using predefined fever profiles showed significantly increased sensitivity for predicting RMPP-5, and neural network models using 12 sequential fever data showed a greatly increased sensitivity (64.5%). Conclusion RMPP-5 could not be effectively predicted using initial laboratory and radiologic data, which were previously reported to be predictive. Further studies using advanced mathematical models, based on large, easily accessible clinical data sets, are anticipated for predicting RMPP.

response of the host are commonly suggested [2,3]. Macrolide antibiotics have generally been preferred as first-choice agents for MP infections because secondary antibiotics such as tetracyclines and fluoroquinolones are not recommended, especially in pediatric patients, owing to the risk of severe adverse events.
Macrolide resistance rates have risen throughout the world and vary across countries [4-7]. Although macrolides could be continued in cases of mild to moderate infections irrespective of their resistance, replacement by alternative antibiotics or the addition of corticosteroids has been shown to improve radiological abnormalities and clinical symptoms [8,9]. Additionally, the severity of the disease is partially related to the degree to which the host immune response reacts to infection. The concept of immune-mediated lung disease provides a basis for consideration of immunomodulatory therapy in addition to conventional antimicrobial therapies for the management of MP infections [10].
The appropriate time for alternative treatment has not been established and still depends on the physician's decision. Alternative treatments are delayed on some occasions owing to concerns regarding toxicities and adverse effects of secondary antibiotics or the possibility of blurred diagnosis caused by corticosteroids, leading to aggravation of the clinical course. Protracted courses of fever or worsening respiratory exertion despite treatment with macrolides are reported to be complicated by atelectasis, parapneumonic effusion, bronchiolitis obliterans, necrotizing pneumonitis, pulmonary abscess, and systemic inflammatory response syndrome [2,11-14]. At the initiation phase of macrolide therapy, physicians find it difficult to predict which patients will have a prolonged or severe clinical course. Previous studies have suggested individual cut-off values for inflammatory markers to differentiate between patients with and without clinical and radiological progression after macrolide therapy for 7 days or longer [8,15-20]. Identifying patients who are expected to undergo a prolonged or severe clinical course would help in providing them with timely secondary treatment and mitigating their clinical course [8,17,21,22].
This study aims to identify the predictive factors for prolonged fever in patients with MP pneumonia with readily accessible clinical, laboratory, and radiological data and to develop a predictive model for these patients in whom timely initiation of secondary treatment options should be considered.

Study design and ethical considerations
The medical records of previously healthy children admitted for MP infection at our institution between January 2015 and December 2019 were retrospectively reviewed.
The study was designed and conducted using the format recommended by the Strengthening the Reporting of Observational Studies in Epidemiology guidelines. The study protocol was approved by the Institutional Review Board of Kangdong Sacred Heart Hospital. The review board waived the requirement for informed consent for this study.

Study patients
All patients who had symptoms and signs indicative of pneumonia at admission, including fever (≥ 38 °C), cough, and abnormal lung auscultation, were included. Empiric antibiotics were initially prescribed for these patients (β-lactam agents and/or macrolides). Only patients initiated on a regimen with macrolides were included. When the patients were considered to have persistent fever with no improvement in their clinical status and radiologic findings after 72 h or longer of macrolide treatment, they were either continued on macrolides alone or started on additional intravenous methylprednisolone (1-2 mg/kg/d for 3-5 days), with or without the addition of secondary antibiotics (tetracyclines or fluoroquinolones), depending on their clinical, laboratory, and radiologic findings.
Diagnosis of M. pneumoniae pneumonia was confirmed by laboratory data and chest radiographs. A baseline blood sample and nasopharyngeal aspirate/swab (NPA) were collected for serological and microbiological testing. M. pneumoniae infection was confirmed using serologic testing and/or polymerase chain reaction (PCR) testing of the NPA. An enzyme immunoassay for IgM antibodies specific to M. pneumoniae (EIA, Bio-Rad Platelia™ M. pneumoniae IgM, California, USA) was performed with the initial blood samples according to the manufacturer's protocol. MP infection was confirmed when a positive IgM titer and/or a positive PCR result for M. pneumoniae was observed. When the initial IgM result was negative in a highly suspected patient without a positive PCR result, the test was repeated every 2-3 days until a positive conversion was confirmed, to avoid missing false-negative cases.
We excluded patients with underlying diseases, patients who were treated for confirmed or suspected MP infection within the prior four weeks, patients with either positive IgM or PCR for MP but whose symptoms and radiographic findings were incompatible with pneumonia, patients treated with antiviral agents for proven influenza virus with fever onset within 72 h, patients who received intravenous corticosteroids or were changed to alternative antimicrobials (tetracyclines or fluoroquinolones) within 72 h, and patients who were afebrile after admission. Although some of the patients had received additional treatment including intravenous methylprednisolone or secondary antibiotics after 72 h, only those patients whose fever and clinical symptoms persisted longer than 120 h after the additional treatment were included in the cohort.

Definitions of RMPP-3 and RMPP-5
A case with persistent fever for > 72 h without improvement in the clinical and radiological findings despite appropriate management with macrolides was defined as refractory M. pneumoniae pneumonia (RMPP-3). Patients with persistent fever for > 120 h without improvement in the clinical and radiological findings despite appropriate management were defined as RMPP-5. We hypothesized that the predictive variables for fever > 72 h and fever > 120 h would differ. We aimed to identify and compare those variables within the same cohort, which was divided into training and test cohorts separately for each of these two definitions.
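The two definitions can be read as a simple labeling rule. The sketch below is illustrative only: the function name and both arguments are hypothetical, and it assumes that the duration of persistent fever on macrolides (in hours) and a yes/no judgment of clinical and radiological improvement have already been abstracted from the record.

```python
def classify_rmpp(fever_hours_on_macrolide, improved):
    """Return (is_rmpp3, is_rmpp5) for one patient.

    fever_hours_on_macrolide: hours of persistent fever after macrolide start.
    improved: True if clinical and radiological findings improved.
    Both argument names are illustrative and not taken from the study.
    """
    no_improvement = not improved
    is_rmpp3 = fever_hours_on_macrolide > 72 and no_improvement
    is_rmpp5 = fever_hours_on_macrolide > 120 and no_improvement
    return is_rmpp3, is_rmpp5


print(classify_rmpp(96, improved=False))   # fever beyond 72 h but not beyond 120 h
print(classify_rmpp(130, improved=False))  # fever beyond 120 h
```

Under this reading, every patient who meets the RMPP-5 criterion also meets the RMPP-3 criterion, so the two outcomes are nested rather than mutually exclusive.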

Grouping: training and test cohorts for RMPP-3 and RMPP-5
Patients were randomly grouped into the training (n = 501) and test cohorts (n = 215) by 67:33 splitting using the Python Scikit-learn library (Fig. 1). Each cohort was then categorized into the RMPP-3 group and non-RMPP-3 group based on their duration to defervescence. Defervescence was defined as maintenance of body temperature below 38 °C for at least 24 h. For the prediction analysis of patients with fever for > 120 h, the group randomization process was implemented again on the same cohort, after which each cohort was categorized into the RMPP-5 and non-RMPP-5 groups.
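The random grouping can be sketched as follows. This is a standard-library stand-in for a shuffled hold-out split, not the scikit-learn call used in the study; the function name and the seeds are hypothetical. The reported cohort sizes (501 and 215 of 716) correspond to a test fraction of about 0.30, so the sketch uses 0.30 as an assumption chosen to reproduce those counts.

```python
import math
import random


def split_cohort(patient_ids, test_fraction, seed):
    """Shuffle patient IDs and split them into training and test lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)              # reproducible shuffle
    n_test = math.ceil(test_fraction * len(ids))  # round the test size up
    return ids[n_test:], ids[:n_test]


all_ids = range(716)
# Two independent randomizations of the same cohort, one per outcome definition.
train_3, test_3 = split_cohort(all_ids, test_fraction=0.30, seed=3)
train_5, test_5 = split_cohort(all_ids, test_fraction=0.30, seed=5)
print(len(train_3), len(test_3))
```

Because the randomization is repeated for the second outcome, a given patient may fall in the training cohort for one analysis and in the test cohort for the other.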

Predictors: fever profiles
The frequency of fever was defined as the number of peaks on the temperature curve. A peak was counted only when body temperature was ≥ 38.0 °C and had increased by ≥ 0.6 °C within 4 h. If body temperature changed by < 0.6 °C but remained ≥ 38.0 °C during the 4-h interval, the reading was also counted as valid (continuous fever pattern).

Predictors: clinical data
Demographic and clinical information was collected in a standardized form by reviewing the electronic medical records. The following information was gathered: duration of fever (before and after hospitalization), total hospital days, and fever profile (highest body temperature, lowest body temperature, frequency of peak fever over 39 °C, frequency of peak fever over 40 °C, and total frequency of peak fever) extracted from 12 sequential fever data within 48 h. These fever profiles were only included in the analysis for the prediction of prolonged fever over 120 h (RMPP-5).
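One possible reading of the fever-profile extraction is sketched below. It assumes that the 12 sequential readings are spaced 4 h apart, that the first reading cannot be a peak because it has no preceding value, and that a reading in a continuous fever pattern counts as one peak; these are interpretations of the definitions, not details confirmed by the text. All function and key names are hypothetical.

```python
FEVER_C = 38.0  # fever threshold from the text
RISE_C = 0.6    # minimum rise within 4 h from the text


def fever_peaks(temps):
    """Return the readings counted as fever peaks (readings assumed 4 h apart)."""
    peaks = []
    for prev, cur in zip(temps, temps[1:]):
        if cur < FEVER_C:
            continue
        change = round(cur - prev, 1)  # readings are in tenths of a degree
        rose = change >= RISE_C
        continuous = prev >= FEVER_C and abs(change) < RISE_C
        if rose or continuous:
            peaks.append(cur)
    return peaks


def fever_profile(temps):
    """Summarize sequential temperatures into the five profile variables."""
    peaks = fever_peaks(temps)
    return {
        "highest": max(temps),
        "lowest": min(temps),
        "peaks_over_39": sum(t > 39.0 for t in peaks),
        "peaks_over_40": sum(t > 40.0 for t in peaks),
        "total_peaks": len(peaks),
    }


# Made-up example series of 12 readings, for illustration only.
example = [37.2, 38.4, 39.3, 38.1, 37.5, 38.2, 38.5, 40.1, 39.0, 37.8, 37.4, 38.0]
profile = fever_profile(example)
print(profile)
```

In this made-up series, six readings qualify as peaks, two of them above 39 °C and one above 40 °C. A different reading of the continuous-fever rule would change the counts.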

Predictors: radiologic data
Chest radiographs were reviewed independently by two experienced radiologists. They were blinded to the clinical data and original radiographic interpretations. Radiological findings at admission were categorized into four groups: group 1, patients with parahilar peribronchial opacification or diffuse interstitial infiltration; group 2, patients with reticular, nodular, or reticulonodular densities; group 3, patients with segmental or lobar consolidation in a single lobe with or without pleural effusion of 1/4-1/2 in the decubitus position; and group 4, patients with lobar consolidation in 2 or more lobes and/or pleural effusion of more than 1/2 in the decubitus position. The images were interpreted and compared by two radiologists to reach a consensus.

Statistical analyses
Continuous variables were presented as mean ± standard deviation and were compared using an independent t-test. Categorical variables were presented as frequency (%) and were compared using the Pearson chi-squared test or Fisher's exact test. Based on the data from the training cohorts, univariate logistic regression analysis was performed to identify significant independent predictors for RMPP-3 or RMPP-5. With the significant predictors, stepwise multivariate logistic regression analysis was performed to create conventional prediction models. To reflect the 12 sequential fever data in the prediction models effectively, a deep neural network (DNN) model was additionally created. The DNN included two hidden layers. A dropout layer was used after the first hidden layer to prevent overfitting. For hyperparameter optimization, 20% of the training cohort patients were assigned to the validation cohort. Optimization was performed using the Adam method, and model loss was calculated using binary cross-entropy. The optimal number of layers and neurons was determined for all DNNs. For each combination of layers and hidden units, the hyperparameters giving the best performance were optimized.
The prediction power of the conventional logistic prediction model and the DNN model was evaluated in the test cohorts using receiver operating characteristic (ROC) curves. Logistic regression analysis was performed using SPSS 25 (IBM Corp., Armonk, NY, USA). DNN models were developed using Python 3.7 (open-source projects) with Anaconda 4.7.12 and TensorFlow 2.0.
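The evaluation metrics can be illustrated with a short, dependency-free sketch. It computes sensitivity and specificity at a probability cut-off and the area under the ROC curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The cut-off of 0.5, the function names, and the toy data are assumptions for illustration; the text does not state which cut-off was applied.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, y_score):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = RMPP) and predicted probabilities, not study data.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
sens, spec = sensitivity_specificity(labels, scores)
auc = roc_auc(labels, scores)
print(round(sens, 3), round(spec, 3), round(auc, 3))
```

Sensitivity depends on the chosen cut-off, whereas the AUC summarizes discrimination across all cut-offs, which is why the two can rank models differently.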

Baseline characteristics
Overall, 716 patients with M. pneumoniae were enrolled during the five-year study period after applying the exclusion criteria. The mean age of the entire cohort was 5.6 years (range, 1-16 years), and 350 patients (48.8%) were boys. No patients were transferred to the intensive care unit or received mechanical ventilation. Doxycycline and intravenous levofloxacin were finally prescribed in 36 patients (5.0%) and 10 patients (1.4%), respectively. One hundred sixty-three patients (32.5%) in the training cohort (n = 501) and 79 patients (36.7%) in the test cohort (n = 215) were classified as RMPP-3 (Fig. 1). Sixty-five patients (13.0%) in the training cohort and 31 patients (14.4%) in the test cohort were classified as RMPP-5.
In the training cohort for RMPP-3, duration of fever at admission was not significantly different between the RMPP group and the non-RMPP group (p = 0.057). Duration of fever after admission and the total duration of hospitalization were significantly longer in the RMPP-3 group (p < 0.001) than in the non-RMPP-3 group (Table 1). In the training cohort for RMPP-5, however, fever duration at admission was longer in the non-RMPP-5 group (p < 0.001) than in the RMPP-5 group (Table 2). No difference was observed in the rates of concurrent respiratory virus detection between the RMPP group and the non-RMPP group.

Model development for predicting RMPP-3 from the training cohort
Univariate logistic analysis identified that mean WBC count, percentage of neutrophils, absolute neutrophil count, percentage of lymphocytes, absolute lymphocyte count, platelets, CRP, LDH, radiologic grouping, and presence of pleural effusion were significantly associated with RMPP-3 grouping (p < 0.05). Using all significant variables from the univariate analysis, a conventional logistic model predicting RMPP-3 was created with a stepwise procedure, which selected only four variables as significant components of the prediction model: platelets (odds ratio (OR) 0.991, p < 0.001), CRP (OR 1.014, p < 0.001), LDH (OR 1.006, p < 0.001), and radiologic grouping (p < 0.001) (Table 3).
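Odds ratios close to 1 can be easier to interpret when scaled to a larger change in the predictor. The sketch below assumes that each reported OR refers to a one-unit increase in the units used in Table 3 (not stated in this section) and that effects are linear on the log-odds scale, as in a standard logistic model. The step sizes of 10 and 100 units and the function name are arbitrary illustrations.

```python
def odds_multiplier(or_per_unit, delta):
    """Odds multiplier for a change of `delta` units, given a per-unit odds ratio."""
    return or_per_unit ** delta


crp_10 = odds_multiplier(1.014, 10)    # CRP higher by 10 units
ldh_100 = odds_multiplier(1.006, 100)  # LDH higher by 100 units
plt_100 = odds_multiplier(0.991, 100)  # platelets higher by 100 units
print(round(crp_10, 2), round(ldh_100, 2), round(plt_100, 2))
```

Under these assumptions, a 10-unit higher CRP multiplies the odds of RMPP-3 by roughly 1.15, a 100-unit higher LDH by roughly 1.82, and a 100-unit higher platelet count by roughly 0.40.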

Model development for predicting RMPP-5 from the training cohort
Univariate logistic analysis identified sex, mean WBC count, percentage of neutrophils, percentage of lymphocytes, absolute lymphocyte count, platelets, CRP, LDH, radiologic grouping, pleural effusion, all fever profiles, and 12 sequential body temperatures as significantly associated with RMPP-5 grouping (p < 0.05). Using all significant variables from the univariate analysis, a conventional logistic model predicting RMPP-5 was created with a stepwise procedure, which selected only three variables as significant components of the prediction model: radiologic grouping (p < 0.001), the lowest temperature (OR 6.494, p < 0.001), and the frequency of peak fever within 48 h (OR 1.603, p < 0.001) (Table 4).
Including two hidden layers (128 neurons in the first layer and 64 neurons in the second layer), a DNN model was created using 12 sequential body temperatures. The validation loss and validation accuracy of the DNN model were 0.1807 and 0.9172, respectively (epoch = 15, Fig. 2).
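A network of this shape could be written in TensorFlow/Keras roughly as follows. The layer sizes (128 and 64), the dropout after the first hidden layer, the Adam optimizer, and the binary cross-entropy loss follow the description in the text. The ReLU activations, the sigmoid output unit, the dropout rate of 0.5, and the function name are assumptions, since the text does not report them.

```python
import tensorflow as tf


def build_fever_dnn(n_inputs=12, dropout_rate=0.5):
    """Two-hidden-layer classifier over 12 sequential body temperatures."""
    inputs = tf.keras.Input(shape=(n_inputs,))
    x = tf.keras.layers.Dense(128, activation="relu")(inputs)  # first hidden layer
    x = tf.keras.layers.Dropout(dropout_rate)(x)               # dropout after first layer
    x = tf.keras.layers.Dense(64, activation="relu")(x)        # second hidden layer
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # P(RMPP-5)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


model = build_fever_dnn()
# Training would hold out 20% of the training cohort for validation, e.g.:
# model.fit(X_train, y_train, validation_split=0.2, epochs=15)
# where X_train and y_train are placeholders for the temperature matrix and labels.
model.summary()
```

With 12 inputs, this architecture has 9,985 trainable parameters, which is small relative to many DNNs but still large relative to a training cohort of about 500 patients, so dropout and a held-out validation set are sensible safeguards.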

Prediction of RMPP-3 in the test cohort
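The performance of conventional logistic models predicting RMPP-3 is compared in Table 5. Among the prediction models using individual variables, the prediction model using radiologic grouping showed higher sensitivity (63.3%) than the models using laboratory values, and adding laboratory values to radiologic grouping did not meaningfully increase sensitivity (64.6%) (Fig. 3).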

Prediction of RMPP-5 in the test cohort
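The performance of conventional logistic models predicting RMPP-5 is compared in Table 6. While conventional logistic models using only radiological grouping did not show significant predictive power in the test cohort, prediction models using the fever profiles (lowest temperature and frequency of peak fever within 48 h) showed increased sensitivity (32.3%), and the DNN model using 12 sequential body temperatures showed a sensitivity of 64.5% (Fig. 4).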

Discussion
To prevent progression of MP pneumonia to a severe and prolonged clinical course, early recognition and timely treatment are important for patients who display clinical and radiological aggravation during macrolide therapy [8,16,23,24]. To our knowledge, this is the first study to demonstrate a prediction model for refractory MP pneumonia based on readily accessible sequential fever data in addition to clinical, laboratory, and radiologic variables at admission. For prediction of RMPP-3, a conventional logistic model using only radiologic grouping showed higher sensitivity (63.3%) than the model using laboratory values, including CRP and LDH. Adding laboratory values to the prediction model using radiologic grouping did not meaningfully increase sensitivity (64.6%). For the prediction of RMPP-5, laboratory values and radiologic grouping showed lower sensitivities, ranging from 12.9% to 16.1%. However, prediction models using the predefined fever profiles showed significantly increased sensitivity for predicting RMPP-5, and neural network models using 12 sequential fever data showed a greatly increased sensitivity of 64.5%. Predicting high-risk patients for refractory MP pneumonia would enable physicians to calibrate their expectations of progression in these patients and to provide earlier alternative treatment. Several studies have tried to identify predictors for refractory MP pneumonia and have suggested individual cut-off values of inflammatory markers, namely CRP, LDH, and ferritin, or cytokines, such as IL-6, IL-8, IL-10, IL-18, and interferon-gamma [15-17, 19, 23]. However, the application of these findings in clinical practice is limited by the low prediction power or poor accessibility of the tests. Although it is plausible that increased inflammatory cytokines are related to the severity of MP pneumonia, serum cytokine assays are mostly limited to research purposes and are not routinely measured.
Bronchoscopy and bronchoalveolar lavage studies are useful tools not only for identifying the causative organism but also for the removal of mucosal plugs in severe pneumonia, but are generally performed for a small proportion of MP pneumonia cases. The requirement of sedation, the necessity of special equipment, and the need for an experienced bronchoscopist limit their accessibility. A study using a CRP value of 16.5 mg/L as the cutoff showed a sensitivity of 74.7% and a specificity of 77.2% for predicting refractory MP pneumonia [18]. However, our prediction model created using the CRP level of the training cohort showed a sensitivity of 32.9% for the prediction of RMPP-3 in the test cohort even when it was combined with the LDH level (Table 5). For the prediction of RMPP-5, our prediction model using the CRP level showed a lower sensitivity of 12.9% even when used in combination with ALC and LDH levels (Table 6). Previous prediction models, created without validation, are inevitably vulnerable to model overfitting resulting from institutional selection bias, which limits their clinical use. Therefore, a reasonable prediction model should undergo internal validation by a separate test cohort or external validation using data from another institution. Thus, it is understandable that previously identified laboratory markers such as CRP and LDH showed lower sensitivities (below 30%) for predicting RMPP-3 and RMPP-5 in our cohorts (Tables 5, 6). Such low sensitivities limit their clinical application for the timely detection of refractory MP pneumonia. To overcome such bias, our 716 enrolled patients were divided into training and test datasets for internal validation, which prevented overfitting and created a reasonable prediction model. For prediction of RMPP-3, according to a previous study, initial radiologic grouping was the most prominent predictor [25].
While the underlying mechanisms are still unclear, the pattern of pulmonary lesions in MP infection is reported to be influenced by the characteristics of host cell-mediated immunity [26,27]. Thus, radiological evidence of lung involvement is consistent with the strong host immune response in RMPP.
Both initial laboratory values and radiologic grouping showed limited prediction power for the prediction of RMPP-5. However, we tried to predict RMPP using initially available data and focused on the fever data during the initial 48-h period. Inflammatory cytokines involved in the immunopathogenesis of MP infection are reported to be increased in RMPP [3,20,28]. Since these cytokines act as endogenous pyrogens that play a pivotal role in inducing fever response, their levels are associated with core body temperature [29]. Although initial single timepoint data were limited for predicting RMPP-5, the prediction model using predefined fever profiles showed a two-fold increase in sensitivity (16.1% to 32.3%), and the DNN model using all 12 sequential fever data within 48 h showed a four-fold increase in sensitivity (64.5%) for predicting RMPP-5. Theoretically, DNN is a black-box approach, and the causes of superior prediction power of the DNN model cannot be identified. However, the greatly increased sensitivity for predicting RMPP-5 with the DNN model using only the initial 48-h fever data is noteworthy.
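The fold changes quoted above can be checked against the size of the test cohort. With 31 RMPP-5 patients in the test cohort, the reported sensitivities are consistent with 4, 5, 10, and 20 correctly identified patients. These counts are back-calculated here as an assumption, not reported values, and the function name is hypothetical.

```python
def sensitivity(true_positives, n_positive):
    """Fraction of truly positive patients that the model flags."""
    return true_positives / n_positive


n_rmpp5_test = 31  # RMPP-5 patients in the test cohort, as reported
for tp in (4, 5, 10, 20):  # back-calculated counts, not reported values
    print(tp, round(100 * sensitivity(tp, n_rmpp5_test), 1))

fold_profile = sensitivity(10, n_rmpp5_test) / sensitivity(5, n_rmpp5_test)
fold_dnn = sensitivity(20, n_rmpp5_test) / sensitivity(5, n_rmpp5_test)
print(round(fold_profile, 1), round(fold_dnn, 1))
```

Because each additional patient detected shifts sensitivity by about 3.2 percentage points in a group of 31, the estimates carry wide uncertainty, and the two-fold and four-fold figures rest on differences of 5 and 15 patients.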
The major purpose of our grouping into RMPP-3 and RMPP-5 was to evaluate the prediction power of the statistical model at two separate time points, to compare their prediction power, and to infer the causes for the difference. The prediction power of our statistical models for the later event (RMPP-5) was considerably lower than that for the early event (RMPP-3). Evaluating the model prediction power at separate time points enabled us to trace the changing trends in the variables of the prediction models at different time points. We identified fever profiles and radiologic grouping as the most effective predictors, with superior prediction power for the 'later event' (RMPP-5).
The main limitation of our study is its retrospective design based on a limited number of inpatients from a single center, which might have introduced a selection bias. However, our prediction models underwent internal validation. Prediction models were created only from the data in the training cohort, and their prediction power was estimated in the test cohort, which was not used for model development. Nevertheless, external validation of our model in a prospective, large-scale cohort is needed to confirm our results. Second, a possibility of under-diagnosis and over-diagnosis of MPP exists because of false-negative IgM antibodies in the early stage or persistent IgM antibodies in convalescent patients with recent infection. We attempted to minimize these misdiagnoses through our strict exclusion criteria. Third, prediction models were not developed using tests such as cytokine assays or fiberoptic bronchoscopy (FOB), which were reported to be significant. We focused on the accessibility of the tests, and those tests were not considered useful in usual clinical practice. Lastly, data on macrolide resistance were not included. Although febrile days during macrolide administration were reported to be greater in macrolide-resistant patients (3.5-4.0 days vs. 1.0-1.5 days) [9,30], prolonged fever in RMPP patients may not imply macrolide resistance because fever might have resolved spontaneously in some macrolide-resistant patients. The clinical efficacy of macrolides for treating MP infection may reflect not only their direct antimicrobial activity but also their anti-inflammatory effects [31].
Development of tests based on data obtained from routine examination of vital signs and their integration into the clinical workflow can be more effective than utilizing new tests that are less verified and less accessible. Further studies utilizing such potential data are needed for improving the prediction power.

Conclusion
In summary, our study showed that for prediction of RMPP-3, a conventional logistic model using only radiologic grouping had more favorable predictive power than the model using initial laboratory values. In contrast, RMPP-5 could not be effectively predicted using the initial laboratory and radiologic data, which were previously reported to be significantly predictive. However, the prediction models using predefined fever profiles showed a two-fold increase in sensitivity (16.1% to 32.3%), and the DNN model using all 12 sequential fever data within 48 h showed a four-fold increase in sensitivity (64.5%). Further studies using more advanced mathematical models based on large, easily accessible clinical data sets are anticipated to be helpful for predicting RMPP.