Prediction of GOLD stage in patients hospitalized with COPD exacerbations using blood neutrophils and demographic parameters as risk factors

Background Patients hospitalized with chronic obstructive pulmonary disease (COPD) exacerbations are unable to complete the pulmonary function test reliably because of their poor health. An easy-to-use instrument for identifying the Global Initiative for Chronic Obstructive Lung Disease (GOLD) stage would give clinicians valuable information for choosing appropriate clinical care and reducing mortality in these patients. The objective of this study was to develop a prediction model to identify the GOLD stage in patients hospitalized with an exacerbation of chronic obstructive pulmonary disease (ECOPD). Methods This prospective study involved 155 patients hospitalized for ECOPD. All participants completed lung function tests, and blood neutrophil and demographic data were collected. A receiver operating characteristic (ROC) curve was plotted from the data of the 155 patients and used to analyze how well blood neutrophils and demographic parameters predict disease severity. A support vector regression (SVR) based GOLD stage prediction model was built on a training data set (75%), and its accuracy was verified on a testing data set (25%). Results The percentage of blood neutrophils (denoted as NEU%) combined with the demographic parameters was associated with a higher risk of a severe episode of ECOPD. The area under the ROC curve was 0.84. The SVR model predicted the GOLD stage with an accuracy of 90.24%. The root-mean-square error (RMSE) of the forced expiratory volume in one second as a percentage of the predicted value (denoted as FEV1%pred) was 8.84%. Conclusions The NEU% and demographic parameters are associated with the pulmonary function of hospitalized ECOPD patients. The established prediction model could assist clinicians in diagnosing the GOLD stage and planning appropriate clinical care.


Background
Chronic obstructive pulmonary disease (COPD) is a progressive lung condition and a leading cause of adult morbidity and mortality worldwide [1][2][3]. Exacerbation of chronic obstructive pulmonary disease (ECOPD) is an event characterized by a sustained worsening of a patient's respiratory symptoms (including cough, phlegm production, and dyspnea) beyond normal day-to-day variations, which often necessitates additional therapies [4,5]. Episodes requiring hospitalization are associated with increased morbidity and mortality and place an enormous burden on healthcare systems [6,7]. Inflammation is a key component in the pathogenesis of COPD [8]. It has previously been observed that COPD is associated not only with an abnormal inflammatory response of the lung, but also with systemic inflammation, including systemic oxidative stress, activation of circulating immune and inflammatory cells, and increased circulating levels of inflammatory cytokines [9]. It is generally considered that ECOPD reflects a flare-up of these underlying inflammatory processes [10] and is linked to a neutrophilic signature response [11].
A recent systematic literature review concluded that ECOPDs are extremely dangerous events. There is an urgent need to identify tolerable treatment guidelines and to manage acute exacerbations in hospitalized ECOPD patients [12]. The Global Initiative for Chronic Obstructive Lung Disease (GOLD) uses the ratio of forced expiratory volume in one second to forced vital capacity (denoted as FEV1/FVC) as the diagnostic criterion for airflow obstruction, with a value below 0.70 indicating obstruction, and classifies the severity of airflow obstruction by the forced expiratory volume in one second as a percentage of the predicted value (denoted as FEV1%pred), as shown in Table 1. GOLD stages 1-4 represent mild, moderate, severe and very severe obstruction, respectively. Inadequate diagnosis of COPD and the lack of spirometric assessment can lead to inadequate treatment strategies, with health costs and risks for patients, and to delays in diagnosing and treating the true cause of the symptoms [13][14][15]. Because of their poor health status, patients hospitalized with COPD exacerbations are unable to complete the pulmonary function test reliably. As one longitudinal study indicated, 50% of pulmonary function test results are unacceptable [16].
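The grading rule described above can be written as a short function. This is a minimal sketch: Table 1 is not reproduced in this text, so the cut-offs below (80%, 50% and 30% of predicted) are assumed from the published GOLD grading, and the function name `gold_stage` is hypothetical.

```python
from typing import Optional


def gold_stage(fev1_pct_pred: float, fev1_fvc_ratio: float) -> Optional[int]:
    """Return the GOLD stage (1-4), or None when FEV1/FVC is not below 0.70.

    Cut-offs are assumed from the published GOLD grading, not from Table 1.
    """
    if fev1_fvc_ratio >= 0.70:
        return None  # no airflow obstruction by the fixed-ratio criterion
    if fev1_pct_pred >= 80:
        return 1  # mild
    if fev1_pct_pred >= 50:
        return 2  # moderate
    if fev1_pct_pred >= 30:
        return 3  # severe
    return 4  # very severe
```

For example, a patient with FEV1/FVC of 0.55 and FEV1%pred of 42 would be assigned stage 3 under these assumed cut-offs.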
Many studies evaluating disease severity in COPD patients have focused on the stable stage rather than on exacerbations. Clear evidence of viral or bacterial infection is found in 78% of ECOPD patients [17].
The percentage of neutrophils (NEU%) is commonly used as a clinical indicator of bacterial infection. Given the need for unified assessment criteria that accurately reflect the pulmonary function of hospitalized ECOPD patients, we explored the capability of blood NEU% and demographic parameters to predict the GOLD stage, and created a prediction model based on support vector regression (SVR) for predicting the GOLD stage in hospitalized ECOPD patients [18].

Subjects selection
We conducted a prospective study to explore the capability of the NEU% and demographic parameters to predict the GOLD stage. A total of 155 subjects (135 males and 20 females) were included in the study, all of whom were from the Respiratory Department of the Affiliated Suzhou Science and Technology Town Hospital of Nanjing Medical University in Suzhou, China. All subjects were required to sign an informed consent form. Ethics approval for the data collection and the use of clinical data in the study was obtained from the Ethics Committee of the Affiliated Suzhou Science and Technology Town Hospital of Nanjing Medical University (IRB20180009). Initially, 269 candidates were identified among patients hospitalized for ECOPD. These subjects were over 40 years old, clinically diagnosed as COPD, either with aggravating symptoms or with no history of pulmonary dysfunction. A total of 114 patients were excluded for the following reasons: (1) 34 patients had noninfectious exacerbations, including those caused by pneumothorax or heart failure; (2) 30 patients withdrew consent; (3) 22 patients had a mechanical barrier or hearing disease; (4) 28 patients died or were referred to other hospitals. Ultimately, 155 patients were enrolled (Fig. 1).

Clinical data
Demographic information including sex, age, height and weight was recorded upon admission to the hospital. After an interview and after signing written consent, the 155 patients underwent pulmonary function tests using equipment manufactured by CareFusion, USA. The tests were supervised by the same professional doctor once the patients' health status allowed. Three valid pulmonary function tests were performed with the same, regularly calibrated spirometer to reduce measurement errors. The average FEV1%pred value of the three tests and the corresponding GOLD stage were recorded. FEV1 was reported in litres, and the Z-score was examined to characterize the cohort correctly. The FEV1 Z-score of each patient was derived from the Global Lung Initiative (GLI) norms using specially developed software [19]. Peripheral venous blood was drawn for a blood count in the same time period as the pulmonary function tests. An automatic blood analyzer (SYSMEX, Japan) was used to calculate the NEU%. The process from identification of potential patients to data collection is shown in Fig. 2.
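For orientation, the two derived spirometric quantities used here can be written out as follows. The software used for the Z-score is not described in this text, so this is only a sketch of how GLI-type reference equations are generally applied, using the LMS form in which $M$ is the predicted median, $S$ the coefficient of variation and $L$ the skewness parameter for a subject of given sex, age, height and ethnicity (these symbols are assumptions introduced for illustration).

```latex
\[
\mathrm{FEV_1\%pred} = 100 \times
  \frac{\mathrm{FEV}_{1,\text{measured}}}{\mathrm{FEV}_{1,\text{predicted}}},
\qquad
z = \frac{\left(\mathrm{FEV}_{1,\text{measured}} / M\right)^{L} - 1}{L \, S}
\]
```

A Z-score below about $-1.64$ corresponds to a value under the 5th percentile of the reference population.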

Risk stratification
Donner et al. [20] reported that once FEV1 falls below 1 L, the impact of ECOPD on the daily life and well-being of patients appears to increase rapidly. Manhire et al. [21] also noted that most practicing physicians and radiologists use 1 L as a cut-off for FEV1 to assess the severity of the disease. We therefore adopted FEV1 < 1 L as the threshold defining a severe episode of ECOPD. All enrolled patients were assessed for a severe episode of ECOPD using this threshold, and were also classified into four stages according to the GOLD guidelines.

Statistical analysis
The demographic information, blood NEU% and FEV1 of all participants were expressed as the mean (SD) for normally distributed data, the median (IQR) for non-normally distributed data, and percentages for categorical variables. Differences in continuous variables between two groups were assessed using Student's t-tests, or Mann-Whitney U tests if normality could not be assumed. Weight and BMI were normally distributed, and therefore Student's t-test was used to compare them between the two episode patterns. Mann-Whitney U tests were used for the other variables. Depending on the normality assessment of the variables, one-way ANOVA (if normally distributed) or Kruskal-Wallis tests were used to compare differences between more than two groups. Statistical significance was assumed when P<0.05.
Univariate logistic regression models were developed to assess the association of NEU% and demographic parameters with a severe episode of ECOPD. All variables associated with a severe episode of ECOPD were considered in a multivariable model. Receiver operating characteristic (ROC) curves were constructed to evaluate the discrimination of the models. An area under the ROC curve of 0.8 or greater is generally considered to indicate a good predictor [22]. All statistical analyses were performed using SPSS version 24.0.
The outcome variables were defined as the FEV1%pred value and the GOLD stage of patients hospitalized for ECOPD. The patients were randomly divided into a training set (75%) and a testing set (25%). The training set was used to develop the SVR based prediction model, whereas the testing set was used to validate the predictive performance for the FEV1%pred value and the GOLD stage. The Pearson correlation coefficient was adopted to evaluate the linear correlation between the predicted and the measured values of FEV1%pred. A P-value below 0.05 was considered significant.
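The area under the ROC curve used in the statistical analysis above to judge discrimination can be illustrated with a small, dependency-free sketch. The study itself used SPSS; the function below is an assumed stand-in that computes the same quantity through its rank interpretation (the probability that a randomly chosen severe case scores higher than a randomly chosen non-severe case). The NEU% values and labels are invented for illustration only.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    scores: risk scores, e.g. NEU% or a model's predicted probability
    labels: 1 for a severe episode, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical NEU% values and severe-episode labels (not study data).
neu_pct = [62.0, 70.5, 81.2, 66.3, 88.0, 75.4]
severe = [0, 1, 1, 0, 1, 0]
auc = roc_auc(neu_pct, severe)  # 8 of the 9 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for large data sets.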
The root-mean-square error (RMSE) and the correlation coefficient (r) were used to quantify the agreement between the predicted and the measured values of FEV1%pred. The GOLD stage was classified from the predicted values of FEV1%pred according to the GOLD guidelines, and compared with the GOLD stage classified from the measured values of FEV1%pred. The GOLD stage prediction accuracy was also calculated to assess the discrimination capability of the model.
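The two agreement measures named above can be computed as in the following minimal sketch. The function names are hypothetical and only the Python standard library is used; no study data are reproduced.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted FEV1%pred."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Because FEV1%pred is itself a percentage, the RMSE is expressed in percentage points of predicted FEV1.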

Demographic information, blood NEU% and FEV 1 in all the enrolled patients
Ultimately, 155 subjects were enrolled. A total of 93 subjects were defined as having the non-severe episode pattern, and 62 subjects as having the severe episode pattern. The height, weight, BMI, FEV1 and FEV1%pred in the non-severe episode group were significantly higher than those in the severe episode group (P < 0.05). Blood NEU% in the non-severe episode group was significantly lower than that in the severe episode group (P < 0.001). There was no significant difference in sex or age between the two groups (P = 0.587, P = 0.202) (Table 2).
All enrolled patients were classified into GOLD stages 1-4 based on the GOLD guidelines. The factors associated with GOLD stage are shown in Table 3. Univariate analysis demonstrated that sex, age, weight, BMI and NEU% were risk factors associated with GOLD stage (Table 3). On the basis of the univariate analysis, univariable and multivariable models were used to discriminate a severe episode of ECOPD.

Discrimination of a severe episode of ECOPD
ROC plots and the areas under the ROC curves of the various models to discriminate a severe episode of ECOPD are shown in Fig. 3 and Table 4. A model

GOLD stage prediction
The characteristics of the training set and the testing set are given in Table 5. There was no significant difference between the two sets in any of the factors involved, i.e., demographics, blood count, pulmonary function and COPD GOLD stage. Fig. 4 and Fig. 5 show the capability of the SVR based prediction model to predict the FEV1%pred value. The association between the predicted and the measured FEV1%pred value was strong, with r=0.92, and the difference between them was not significant (P>0.05). The total sample size of GOLD stage 1 was only 13, since most hospitalized ECOPD patients tended to have a higher GOLD stage. As the degree of airflow limitation in patients with GOLD stages 1 and 2 is mild to moderate, we combined GOLD stage 1 and GOLD stage 2 as the moderate group. The predictive performance for the FEV1%pred value and the GOLD stage is shown in Table 6. Fig. 5 indicates that when FEV1%pred exceeds 70%, the model tends to give pessimistic predictions, that is, it underestimates FEV1%pred. Analysis of the GOLD stage predictive performance showed that, in this range, the algorithm would overestimate the GOLD stage. More specifically, patients of GOLD stage 1 may be classified as GOLD stage 2. GOLD stages 1 and 2 represent mild and moderate airflow obstruction, so their treatment plans will not be confused with those for GOLD stages 3 and 4, which represent severe and very severe airflow obstruction. Fig. 6 shows the predictive performance for the different GOLD stages. The overall COPD GOLD stage prediction accuracy was 90.24%.
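The stage-level comparison described above, with GOLD stages 1 and 2 merged into one moderate group, can be sketched as follows. The 50% and 30% cut-offs are assumed from the published GOLD grading, all patients are assumed to already meet the FEV1/FVC < 0.70 criterion, and the function names and example values are hypothetical.

```python
def severity_group(fev1_pct_pred):
    """Map FEV1%pred to the three groups used when GOLD 1 and 2 are merged."""
    if fev1_pct_pred >= 50:
        return "moderate"  # GOLD stage 1 or 2
    if fev1_pct_pred >= 30:
        return "severe"  # GOLD stage 3
    return "very severe"  # GOLD stage 4


def group_accuracy(measured, predicted):
    """Fraction of patients whose predicted group matches the measured group."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        severity_group(m) == severity_group(p)
        for m, p in zip(measured, predicted)
    )
    return hits / len(measured)


# Hypothetical measured and predicted FEV1%pred values (not study data).
measured = [85.0, 62.0, 45.0, 28.0]
predicted = [74.0, 58.0, 52.0, 25.0]
accuracy = group_accuracy(measured, predicted)
```

In this invented example the first patient is predicted 11 points too low, yet still falls in the same merged group, whereas the third patient crosses the 50% cut-off and is misclassified. This shows why a pessimistic prediction above 70% does not affect the merged-group accuracy, while errors near a cut-off do.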

Discussion
ECOPD is an acute event in which the patient's respiratory symptoms worsen persistently beyond their usual daily state. Frequent episodes of ECOPD result in an accelerated decline in FEV1. Meanwhile, a rapid decline in FEV1 is an independent risk factor for ECOPD. This vicious circle between the decline in FEV1 and frequent episodes of ECOPD affects the prognosis and mortality of patients [23]. In this analysis, we focused on the discrimination value of blood NEU% as a biomarker for a severe episode of ECOPD, and on GOLD stage prediction in hospitalized ECOPD patients. We attempted to create an easy-to-use measure to estimate the value of FEV1%pred and to identify the GOLD stage, which could assist clinicians in choosing appropriate medical care to decrease future hospitalization rates and mortality in hospitalized ECOPD patients.
In line with previous studies, the outcome of the pulmonary function test depended on the cooperation of the ECOPD patients, most likely because of the limitation imposed by the force-velocity characteristics of the expiratory muscles [16,24]. Biomarkers are therefore required for effective risk stratification and individualized treatment decisions.
The pathophysiological mechanism of most cases of ECOPD is an acute burst of local or systemic inflammatory mediators following respiratory bacterial or viral infection. Usually, high levels of non-specific inflammatory biomarkers are expected [25]. Neutrophils are the most abundant inflammatory cells in blood and sputum. As neutrophil proteases can reproduce many of the characteristics of ECOPD, including emphysema and mucus hypersecretion [26], ECOPD is characterized as a neutrophilic inflammatory disorder in most cases. A study of peripheral blood neutrophils from ECOPD patients by Milara et al. showed that, compared with a healthy control group, the release of the neutrophil activation marker neutrophil elastase (NE) and of reactive oxygen species (ROS) was increased 2-fold and by 30%, respectively [27]. Jones et al. observed that bacteria-stimulated neutrophil degranulation was greater in the ECOPD group than in healthy controls [28]. Corhay et al. focused on exacerbation whatever its trigger, and found that neutrophil inflammatory markers declined after treatment [29]. Extending these findings, we identified a statistically significant difference in NEU% between ECOPD patients with different GOLD stages. ECOPD patients with higher blood NEU% were more likely to have a severe episode of ECOPD, and their GOLD stage risk stratification could thus be higher. The differences between ECOPD patients with different GOLD stages are consistent with the results of Perera et al., who found significant differences in systemic markers of inflammation between patients with GOLD stages 3 and 4 and controls without COPD, but no significant difference between GOLD 2 patients and controls [30].
We sought factors that would discriminate a severe episode of ECOPD in clinical cases. Although the multivariable demographic parameters or the NEU% values taken separately reflected the relative risk of a severe episode of ECOPD, their overall prediction performance is still quite limited, considering the moderate values of the areas under the ROC curves. No matter which cut-off level is chosen, the false positive rate is still very high, so the specificity at an acceptable sensitivity is low. The risk of a severe episode of ECOPD increased with increasing blood NEU%. The overall discrimination value of the multivariable factors combining demographic parameters and blood NEU% was encouraging, with an area under the ROC curve of 0.84. To further study the capability of blood NEU% and demographic parameters to predict FEV1%pred and categorize the GOLD stage, we randomly divided the data collected from the ECOPD patients into a training data set to develop a prediction model and a testing data set to validate the predictive performance. The selected demographic parameters included sex, age, weight and BMI, which had demonstrated their relevance to the target values. We used a supervised learning algorithm to evaluate the predictive capability of the risk factors, and classified the subjects into four GOLD stages. Searching for the right subjects was one of the major difficulties of our study.
Fig. 6 Comparison between the predicted and the measured GOLD stage. GOLD stage 1 and GOLD stage 2 were combined as the moderate group
Support vector machine (SVM) is a learning method based on the principle of structural risk minimization from statistical learning theory. It shows many unique advantages in solving small-sample and nonlinear problems [18]. SVR is the extension of SVM to regression problems, and it showed acceptable regression capacity in estimating the value of FEV1%pred and identifying the GOLD stage.
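For readers unfamiliar with SVR, the standard $\varepsilon$-insensitive formulation is given below as background. The kernel, the penalty $C$ and the tube width $\varepsilon$ used in this study are not stated in this text, so the symbols are generic: $x_i$ is the feature vector of patient $i$ (for example NEU% and the demographic parameters), $y_i$ the measured FEV1%pred, and $\varphi$ an assumed feature map.

```latex
\[
\min_{w,\,b,\,\xi,\,\xi^{*}} \;
  \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right)
\]
\[
\text{subject to} \quad
\begin{cases}
  y_i - \left( w^{\top} \varphi(x_i) + b \right) \le \varepsilon + \xi_i, \\
  \left( w^{\top} \varphi(x_i) + b \right) - y_i \le \varepsilon + \xi_i^{*}, \\
  \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \qquad i = 1, \dots, n .
\end{cases}
\]
```

The term $\tfrac{1}{2}\lVert w \rVert^{2}$ controls model complexity, which is the structural risk minimization idea mentioned above, while errors smaller than $\varepsilon$ incur no penalty.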
To our knowledge, this is the first study in ECOPD patients to predict the value of FEV1%pred and identify the GOLD stage based on demographic parameters and blood NEU%. In the absence of a clear biomarker to categorize the GOLD stage of ECOPD patients, our research offers clinicians auxiliary guidance for diagnosing the GOLD stage and establishing appropriate clinical care, since the demographic parameters and blood NEU% are easy to obtain.
Limitations of our current study should also be noted. First, the relatively small number of subjects enrolled in this study could limit the predictive performance of the model, especially compared with the previous work of Cristóbal et al. [31] and Godtfredsen and coworkers [32]. The predictive performance of the model was limited in ECOPD patients with mild airflow obstruction, which could also result from the weaker influence of inflammatory factors when symptoms are moderate. Finding suitable ECOPD patients and guiding them through the pulmonary function test turned out to be one of the biggest difficulties of our research. To overcome this limitation, we used SVM, a widely accepted learning method, to establish the prediction model. The grouping strategy of the training set and testing set helped to address the problem of covariates outnumbering samples (patients), the "p > n problem". Importantly, the overall ECOPD GOLD stage prediction accuracy of the established prediction model was 90.24%. In addition, Sørheim and coworkers showed that pulmonary function injury may differ between sexes. There was a sex imbalance in our study, as the ECOPD patients included were mostly male (135/155), so the model's predictive performance in female patients could be limited. Given the small study population, comorbidities and differences in treatment during hospitalization, which are not reported here, could have influenced the results of this work. Therefore, our future work will balance the sex composition and extend the observation time in a larger-scale study to verify our findings. As an additional limitation, the patient's general condition, comprehension and degree of cooperation could also influence the accuracy of pulmonary function test results.
Nevertheless, every enrolled patient was trained and guided by the same professional physician to minimize the impact of external factors on the measurement.

Conclusions
In summary, a prediction model based on demographic parameters and blood NEU% has been established to predict the value of FEV1%pred and identify the GOLD stage of patients hospitalized with ECOPD. This easy-to-use instrument can assist clinicians in diagnosing the GOLD stage, and offers valuable information for determining the appropriate clinical care for hospitalized ECOPD patients.