Factors associated with overdiagnosis of benign pulmonary nodules as malignancy: a retrospective cohort study

Objective To establish a preoperative model for the differential diagnosis of benign and malignant pulmonary nodules (PNs), and to evaluate the factors associated with overdiagnosis of benign PNs at the time of imaging assessment. Materials and methods This retrospective study included 357 patients (median age, 52 years; interquartile range, 46–59 years) with 407 PNs who underwent surgical histopathologic evaluation between January 2020 and December 2020. The nodules were divided into a training set (n = 285) and a validation set (n = 122) to develop a preoperative model for identifying benign PNs. CT scan features were reviewed by two chest radiologists, and imaging findings were categorized. The overdiagnosis rate of benign PNs was calculated, and bivariate and multivariable logistic regression analyses were used to evaluate factors associated with benign PNs that were overdiagnosed as malignant PNs. Results The preoperative model identified the absence of part-solid and non-solid nodules, absence of spiculation, absence of vascular convergence, larger lesion size, and CYFRA21-1 positivity as features for identifying benign PNs on imaging, with an area under the receiver operating characteristic curve of 0.88 in the validation set. The overdiagnosis rate of benign PNs was 50%. Independent risk factors for overdiagnosis included diagnosis as non-solid nodules, pleural retraction, vascular convergence, and larger lesion size at imaging. Conclusion We developed a preoperative model for distinguishing benign from malignant PNs and evaluated the factors that led to the overdiagnosis of benign PNs. This preoperative model and these results may help clinicians and imaging physicians reduce unnecessary surgery.


Introduction
Lung cancer remains one of the leading causes of cancer-related death worldwide, and its incidence and mortality have increased markedly in recent years. Between 2000 and 2016, the age-standardized incidence increased by 0.8% annually, while new cases and deaths surged by 162.6% and 123.6%, respectively [1]. Early-stage lung cancer has no obvious clinical symptoms but is detected by computed tomography (CT) as pulmonary nodules (PNs). Guidelines from the U.S. Preventive Services Task Force (USPSTF) recommend annual screening with low-dose CT (LDCT) for high-risk individuals, and this recommendation increases the demand for LDCT [2]. With the development of CT technology and the proliferation of lung cancer screening programs, the detection rate of PNs is increasing, which may cause public anxiety [3]. However, although numerous clinical-pathologic factors (e.g., age, sex, smoking history, family history of cancer, lesion type, lesion size, lesion location, biomarkers, and imaging features) have been associated with the nature of PNs, the definitive predictive factors that distinguish benign from malignant PNs remain elusive [4,5]. The relative importance and interrelationship of these factors are unclear, and with current knowledge it is difficult to predict accurately which PNs are at risk of overdiagnosis.
To discriminate between benign and malignant PNs, lung cancer screening programs have been implemented in many countries. However, lung cancer screening carries some controversies and risks. One of the risks is false-positive results. A meta-analysis revealed that screening leads to a higher long-term cumulative incidence of lung cancer (1.51; 95% CI: 1.06-2.14), with an estimated 49% of screen-detected cancers potentially being overdiagnosed [6]. A study of the National Lung Screening Trial reached a similar conclusion [7]. In a screening program, false-positive results are associated with increased healthcare costs, patient anxiety, and morbidity or mortality related to diagnosis and treatment [8]. In an analysis of over 9000 lung cancer screening examinations, the frequencies of malignancy in Lung-RADS 4A, 4B, and 4X nodules were 15.5%, 36.3%, and 76.8%, respectively. Therefore, the majority of suspicious nodules that underwent additional work-up and intervention were in fact benign [9]. Many patients underwent unnecessary surgery or biopsy because of false-positive results; the mainstream surgical choice for PNs is minimally invasive video-assisted thoracoscopic surgery (VATS) [10]. However, because small nodules are difficult to locate by touch or with the naked eye, wedge resection under VATS for ground-glass opacity nodules (GGN) is challenging. Complications are estimated to occur in 3-4% of treated patients; prolonged postoperative air leak is the most frequent, and other significant complications are bleeding, infection, postoperative pain, and recurrence at the port site [11]. These inevitably increase the risk and pain borne by patients. Moreover, a review found that at least 95% of PNs are benign, most commonly granulomas or intrapulmonary lymph nodes [12].
The objective of this study is to analyze the imaging features of nodules on preoperative chest CT scans of patients who have undergone surgery or biopsy, and to establish a preoperative prediction model for nodules that incorporates the patients' clinical and pathological features. This model aims to minimize the rate of unnecessary surgery for benign PNs. Moreover, the study seeks to pinpoint specific factors that contribute to the classification of benign PNs.

Patient selection
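This study was approved by the institutional review board, and the requirement for informed consent was waived because of its retrospective design. Patients who were diagnosed with PN(s) and underwent spiral CT scans at our institution between January 2020 and December 2020 were initially considered eligible for our study (Fig. 1). The inclusion criteria were as follows: (a) age ≥ 18 years; (b) nodule size of 5 mm to 15 mm; (c) diagnosis of PN(s) by postoperative pathologic examination; (d) clear pathologic results; and (e) no history of cancer. Patients were excluded for the following reasons: (a) poor quality of CT images; (b) no information about histopathology results from biopsy or surgical histopathologic evaluation; (c) PN(s) found intraoperatively with no corresponding imaging data available; (d) prior pulmonary surgery; and (e) pathologically confirmed metastasis. All data were extracted from the database of our hospital. The general perioperative information of patients was reviewed, including demographic information (age, sex, smoking history, family history of cancer, and biomarker positivity), pathological information, and surgical details (approach of operation, extent of resection, and resection location).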

Image acquisition
All patients included in the study underwent noncontrast CT (NCCT) fewer than 14 days before lung nodule resection or biopsy. Chest CT was performed with Siemens (SOMATOM Force), Canon (Aquilion PRIME), and Philips (Ingenuity CT) scanners using the same acquisition parameters: slice thickness, 1.0 mm; tube voltage, 120 kVp; tube current-exposure time product, 160 mAs. All images were displayed with a standard lung window (window width, 1600 HU; window level, -600 HU). Images were transferred to the Picture Archiving and Communication System (PACS).

Image analysis
Two chest radiologists (Bao Shasha, a third-year postgraduate student, and Deng Ailin, a third-year postgraduate student) independently reviewed each lesion and its surrounding structures in a blinded manner according to the American College of Radiology Lung-Reporting and Data System (Lung-RADS) and assessed whether each PN was likely to be benign or malignant. According to this system, Lung-RADS 1 or 2 lesions are generally considered benign because of their low risk of malignancy, while Lung-RADS 4B or 4X lesions are classified as malignant given their high malignancy risk. The malignancy status of Lung-RADS 3 or 4A lesions largely depends on the expertise of radiologists in differentiating benign from malignant PNs. Disagreements were resolved by group discussion between the two readers mentioned above and Xirui Duan (a first-year postgraduate student). In previous studies and models [13-17], spiculation, pleural retraction, vascular convergence, and the air bubble sign have been used as imaging risk factors for radiologists to distinguish between benign and malignant PNs. A recent review also supports these imaging features in evaluating PNs [12]. Therefore, the nodular lesions were retrospectively categorized according to imaging findings as follows: (a) spiculation: a radial and unbranched striated shadow extending from the boundary of the PN to the surrounding pulmonary parenchyma; (b) pleural retraction: a retraction of adjacent pleura toward the nodule; (c) vascular convergence: vessels are clustered internally or abnormally inclined toward the nodule compared with the normal pulmonary parenchyma; or (d) air bubble sign (vacuolar sign): a small air-containing space < 5 mm in length within the PN.
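The category-to-assessment rule described above can be written down as a small lookup. The following is only an illustrative sketch: the function name `imaging_assessment` and the `reader_call` argument are hypothetical and are not part of Lung-RADS or of the study's workflow. Lung-RADS 3 and 4A are left to the reader's judgment, as the text states.

```python
from typing import Optional

# Category groups as described in the text; the set names are illustrative.
LIKELY_BENIGN = {"1", "2"}
LIKELY_MALIGNANT = {"4B", "4X"}
READER_DEPENDENT = {"3", "4A"}


def imaging_assessment(category: str, reader_call: Optional[str] = None) -> str:
    """Return 'benign' or 'malignant' for a Lung-RADS category.

    For Lung-RADS 3 and 4A the result is whatever the radiologist decided,
    passed in as reader_call (hypothetical parameter).
    """
    cat = category.strip().upper()
    if cat in LIKELY_BENIGN:
        return "benign"
    if cat in LIKELY_MALIGNANT:
        return "malignant"
    if cat in READER_DEPENDENT:
        if reader_call not in ("benign", "malignant"):
            raise ValueError("Lung-RADS 3/4A requires a reader judgment")
        return reader_call
    raise ValueError(f"unknown Lung-RADS category: {category!r}")
```

Keeping the reader-dependent categories explicit makes it clear that the imaging label for Lung-RADS 3 and 4A nodules reflects radiologist judgment rather than a fixed rule.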

Dataset allocation
All pulmonary nodular lesions were classified into one of two groups based on the final histopathology results; the final histopathology result was defined as benign or malignant by surgical or biopsy histopathologic evaluation. Overdiagnosis was defined as a PN that was benign at biopsy or surgery but had been assessed as high risk of malignancy on CT scan. The included lesions were randomly assigned to the training and validation data sets at a ratio of 7:3.
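A 7:3 random allocation of lesions can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `split_lesions`, the seed, and the use of integer lesion identifiers are all assumptions.

```python
import random


def split_lesions(lesion_ids, train_fraction=0.7, seed=0):
    """Shuffle lesion identifiers and split them into training/validation."""
    ids = list(lesion_ids)
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# 407 lesions at a 7:3 ratio give 285 training and 122 validation lesions.
train, validation = split_lesions(range(407))
print(len(train), len(validation))  # 285 122
```

Because the split is made at the lesion level, two nodules from the same patient can fall into different sets; a patient-level split would avoid this but is not what the text describes.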

Logistic regression analysis
Clinical, histopathologic, and imaging features were also evaluated for all benign pulmonary nodular lesions at biopsy or surgery to investigate the factors associated with benign PNs overdiagnosed at CT scan by logistic regression analysis. Specifically, a bivariate logistic regression analysis was performed on the training set to identify factors associated with a benign PN. A multivariable logistic regression analysis was then conducted using variables selected according to their clinical meaning and statistical significance (P < 0.05). The same bivariate and multivariable logistic regression analyses were used to identify factors associated with the overdiagnosis of benign PNs. Multiple imputation with a fully conditional specification method was applied for missing values; pooled adjusted ORs with 95% CIs were provided after 5 imputations [18]. The predictive performance in the training set was calculated as the median value of the results from the 5 imputed data sets. However, the receiver operating characteristic curve was obtained through complete case analysis.
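How an odds ratio with its 95% CI follows from a logistic regression coefficient, and how estimates from several imputed data sets can be pooled, is sketched below. The text does not name the pooling rule; Rubin's rules are assumed here because they are the usual choice for multiple imputation. The function names are hypothetical, and a normal quantile is used for the interval instead of the t reference distribution, which is a simplification.

```python
import math
from statistics import mean, variance

Z_95 = 1.959963984540054  # two-sided 95% normal quantile


def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its standard error to OR and CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def pool_rubin(betas, ses):
    """Pool per-imputation coefficients with Rubin's rules (assumed here).

    Returns the pooled coefficient and its pooled standard error.
    """
    m = len(betas)
    pooled_beta = mean(betas)
    within = mean([se ** 2 for se in ses])          # average sampling variance
    between = variance(betas) if m > 1 else 0.0     # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_beta, math.sqrt(total)
```

A pooled OR is then obtained by passing the pooled coefficient and standard error to `odds_ratio_ci`. An OR below 1 for a feature, as reported for part-solid and non-solid nodules, means the feature lowers the odds that a nodule is benign.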

Model validation
The developed multivariable regression model was then validated with the validation set. The model's discrimination was evaluated using the area under the receiver operating characteristic curve (AUC), which is equivalent to Harrell's c-statistic for binary outcomes. Goodness of fit was assessed using the Hosmer-Lemeshow test. The association between the observed and predicted probabilities of a benign PN was displayed visually in a calibration plot. The receiver operating characteristic curve and calibration plot for the validation set were derived from a complete case analysis, without multiple imputation. In real-world clinical situations, where minimizing biopsies is crucial to reduce overdiagnosis, a predictive model developed without relying on biopsy outcomes or histopathological data may be more appropriate.
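The equivalence between the AUC and the c-statistic for a binary outcome can be made concrete: the AUC is the proportion of benign/malignant pairs in which the benign nodule receives the higher predicted probability of being benign, with ties counted as one half. The sketch below is illustrative only; the study's curves were produced with statistical software, and the function name `auc` and the input layout are assumptions.

```python
def auc(scores, labels):
    """Concordance (c-statistic) for a binary outcome.

    scores: predicted probabilities of the positive class (here, benign).
    labels: 1 for the positive class, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive case ranked higher
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))
```

The pairwise form runs in quadratic time, which is acceptable for a validation set of this size.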

Analysis of over-diagnosed benign PNs
Clinical, histopathologic, and imaging features were also evaluated for all benign pulmonary nodular lesions at biopsy or surgery to investigate the factors associated with benign PNs over-diagnosed at CT scan by multivariable logistic regression analysis.
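Under the definition of overdiagnosis given earlier, the overdiagnosis rate is the share of pathologically benign PNs that were assessed as malignant on CT. A minimal sketch follows; the record layout (`pathology`, `ct_assessment`) and the toy records are hypothetical and do not come from the study data.

```python
def overdiagnosis_rate(records):
    """Fraction of pathologically benign nodules called malignant on CT."""
    benign = [r for r in records if r["pathology"] == "benign"]
    if not benign:
        raise ValueError("no pathologically benign nodules")
    over = sum(1 for r in benign if r["ct_assessment"] == "malignant")
    return over / len(benign)


# Toy records for illustration only (not study data).
toy = [
    {"pathology": "benign", "ct_assessment": "malignant"},
    {"pathology": "benign", "ct_assessment": "benign"},
    {"pathology": "malignant", "ct_assessment": "malignant"},
]
print(overdiagnosis_rate(toy))  # 0.5
```

Malignant nodules are excluded from the denominator, so the rate describes only how often benign nodules were read as malignant.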

Statistical analysis
Statistical analysis was performed with SPSS version 27.0.1 and GraphPad Prism version 8.0.2; the ROC curves and calibration plots were generated with GraphPad Prism. Continuous variables are expressed as medians with interquartile ranges and were compared using the Mann-Whitney U test. Categorical variables are expressed as numbers with percentages and were compared using the χ² test or Fisher exact test. All tests were two-tailed, and P values less than 0.05 were considered statistically significant.
Of the 407 PNs from 357 patients, 285 were allocated to the training set and the remaining 122 to the validation set. The patient demographic characteristics, along with the baseline clinical, imaging, and pathologic characteristics of the training set (n = 285) and validation set (n = 122), are presented in Table 1.
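For a 2 × 2 table, the χ² comparison of a categorical variable between two groups can be sketched in a few lines. This is an illustration only: the study used SPSS, the function name `chi_square_2x2` is hypothetical, the counts in the usage example are made up, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns the statistic and the two-sided P value (1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up counts, e.g. a sign present/absent in two groups of 30 nodules.
stat, p = chi_square_2x2(10, 20, 20, 10)
print(round(stat, 2), p < 0.05)  # 6.67 True
```

When expected cell counts are small, the Fisher exact test mentioned above is the usual alternative.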

Multivariable logistic regression analysis and model validation for identifying benign PNs
Based on the results of bivariate logistic regression analysis, a final multivariable logistic regression model was developed to identify benign PNs using the following features: lesion size at imaging; lesion type of part-solid or non-solid nodule; imaging finding of spiculation, vascular convergence, or vacuolar sign; and the biomarker CYFRA21-1 (Table 4). In multivariable analysis, lesion types manifesting as part-solid nodules (OR, 0.14; 95% CI: 0.06, 0.37; P < 0.001) and non-solid nodules (OR, 0.05; 95% CI: 0.02, 0.14; P < 0.001) remained statistically significant independent factors and were inversely associated with benign PNs. The area under the curve (AUC) was 0.83 (range, 0.77-0.89) by complete case analysis in the training set. The validation of the predictive model  4 and 5).

Discussion
In this study, we identified specific preoperative features for evaluating benign pulmonary nodules (PNs) that were confirmed by surgery or biopsy. Lesions that were neither part-solid nor non-solid nodules, large lesion size, and the absence of spiculation and vascular convergence at CT scan were significantly associated with benign PNs. These features were validated as successful predictors of benign PNs in the validation set. The rate at which benign PNs were overdiagnosed as malignant at imaging assessment was 50.0%. Non-solid nodules, lesion size, spiculation, pleural retraction, and vascular convergence were positively associated with overdiagnosis as malignant of PNs that proved benign at surgery or biopsy.
Although the benefits of lung cancer screening and of diagnosing early pulmonary cancer have largely been demonstrated [19], we still consider the risk of overdiagnosis in view of the high rate of surgery for benign nodules. Some indolent tumors have no impact on patients' lives even if left untreated [20], and we focus on the anxiety and unnecessary invasive treatment brought to patients. Invasive treatment of benign PNs might not improve prognosis and can cause complications in patients undergoing biopsy or surgery. Transthoracic core needle aspiration biopsy and fine-needle aspiration (FNA) are performed under CT guidance to obtain tissue. Core biopsies are superior to FNA because of their higher yield and, more importantly, because they allow the assessment of tissue structure and provide sufficient material for immunohistochemical and genetic analysis. However, complications can occur in transthoracic core needle aspiration biopsy despite all precautions. They include pneumothorax, hemothorax, hemoptysis, infection, tumor spreading, and air embolism, with pneumothorax being the most common. In a population-level retrospective cohort analysis of 16,971 patients who underwent transthoracic core needle aspiration biopsy, 25.8% experienced a complication within 3 days of the procedure (pneumothorax 23.3%, hemorrhage 3.6%, and air embolism 0.02%) [21]. Several recent studies [22,23] have evaluated the complications and risk-benefit balance of treating benign PNs with VATS. It is essential to identify benign PNs preoperatively because the complications of VATS might cause irreversible injury in clinical practice.
In recent studies, VATS has been shown to cause complications in 33.9% of patients at 90 days postoperatively, a rate that does not differ significantly from that of thoracotomy [23]. In our study, the nodules were divided into a training set and a validation set, and the imaging features and biomarkers for differentiating benign and malignant PNs were obtained. Multivariate analysis showed that ground-glass or non-solid nodule type, no spiculation, no vacuolar sign, vascular convergence, large lesion size, and positive CYFRA21-1 remained independent factors for the differential diagnosis of benign nodules. Our model then identified the final type of PNs based on preoperative results with an area under the receiver operating characteristic curve of 0.88 in the validation set. Previous research has likewise shown that the type of PN and the imaging findings of spiculation, vascular convergence, and vacuolar sign are independent risk factors for pulmonary cancer [24-26]. Other studies have similarly shown the importance of the type of PN and the imaging findings of spiculation and vascular convergence in the diagnosis of benign and malignant PNs [27-29]. Several studies [30,31] have demonstrated the value of combining CEA, CYFRA21-1, and NSE, but no significant association was found in our study. In our study, 50% of benign PNs were overdiagnosed as malignant at the time of imaging assessment, a proportion that is not negligible; if a PN is found to be non-solid with spiculation, pleural retraction, vascular convergence, and larger lesion size at the time of CT scan, the possibility of overdiagnosis should be reconsidered. At the same time, care should be taken not to give excessive weight to PN size when judging whether a PN is benign or malignant. Although we anticipated that the patient's age and smoking history, the location of the PN, and biomarkers might influence whether a PN was overdiagnosed, this hypothesis was not supported after multivariate analysis. This might be because our study population was not large enough and because the biomarkers were not highly sensitive to malignant nodules overall but only to specific subtypes. Biomarkers, as a means of cancer screening, have the advantage of being minimally invasive. However, conventional tumor markers (CEA, NSE, CYFRA21-1, and SCC) appear to be sensitive only to certain types of tumors or require a sufficiently large tumor volume to be produced, so they are not sensitive in PNs ranging from 5 to 15 mm. Nevertheless, one study [32] found that the expression of specific biomarkers such as the plasma proteins LG3BP and C163A, combined with age, smoking status, nodule diameter, shape, and location, discriminates well between benign and malignant PNs. The wider use of such specific biomarkers may increase the accuracy of differentiating benign and malignant PNs, but it also places higher demands on the ability to discriminate. Overall, given that missing a malignant PN is unacceptable, clinicians and radiologists are still debating the appropriate treatment strategy for 5 mm to 15 mm PNs.
We believe our model can be used before surgery to help clinicians select, with greater confidence, lesions that are likely to be benign. In our study, we focused on NCCT examination features. PN type and size are important influences on the classification of PNs, and nodule type can be depicted by dual-energy CT [33]. A study found that patients with adenocarcinoma in situ (AIS) and minimally invasive adenocarcinoma (MIA) had a 100% 5-year recurrence-free rate after resection of PNs, suggesting that it is important to distinguish AIS and MIA from other malignant nodules [34]. We believe that increased use of dual-energy CT features has the potential to improve the prediction of PNs, to distinguish AIS and MIA, and to protect patients from non-essential invasive treatment in the future.
Our study had several limitations. First, it was conducted at a single institution, and we did not perform external validation at another institution. Because our institution is a specialized oncology hospital, the sample inevitably contained a higher proportion of oncology patients. Second, the rate of overdiagnosis in this study may be higher than that of other providers because of the high cost of missed diagnoses and physicians' hypersensitivity to high-risk lesions. Third, the different CT scanner models and scanning parameters used in this study may have led to a lack of standardization of image details. Last, because there were no unified standards for patient examination, data such as biomarkers were missing for some patients, which ultimately led to an unsatisfactory sample size.

Conclusion
In conclusion, a large number of patients with benign PNs in the clinic were overdiagnosed as having malignant nodules at imaging and underwent unnecessary surgery or core biopsy. This preoperative model and the factors that led to the overdiagnosis of benign PNs may help clinicians reduce unnecessary surgery and help imaging physicians make more accurate diagnoses.
Inclusion criteria: (a) age ≥ 18 years; (b) nodule size of 5 mm to 15 mm; (c) diagnosis of PN(s) by postoperative pathologic examination; (d) clear pathologic results; and (e) no history of cancer. Exclusion criteria: (a) poor quality of CT images; (b) no information about histopathology results from biopsy or surgical histopathologic evaluation; (c) PN(s) found intraoperatively with no corresponding imaging data available; (d) prior pulmonary surgery; and (e) pathologically confirmed metastasis.

Fig. 1 Patient Inclusion and Exclusion Criteria Flow Diagram

Fig. 2 Computed tomography images show vascular convergence (blue arrows), pleural retraction (green arrows), and spiculation (red arrows). a A solid nodule with an average diameter of 1.1 cm in the right lower lobe of a 54-year-old woman, pathologically confirmed to be a benign PN; b a part-solid nodule with an average diameter of 1.1 cm in the left lower lobe of a 58-year-old woman, pathologically confirmed to be a malignant PN

Fig. 3 Computed tomography image shows vascular convergence (blue arrows), pleural retraction (green arrows), and spiculation (red arrows). A solid nodule with an average diameter of 1.3 cm in the right upper lobe of a 59-year-old woman, pathologically confirmed to be a benign PN

Fig. 4 Fig. 5
Fig. 4 Receiver operating characteristic curves with calibration plots representing the discriminatory ability of the predictive model for benign PNs in (A) training (n = 285) and (B) validation sets (n = 122) by using complete case analysis. AUC = area under the receiver operating characteristic curve

Table 1 Clinical and pathologic characteristics of patients and lesions in the training and validation sets. Unless otherwise noted, variables are expressed as numbers of patients with percentages in parentheses, or as medians with interquartile ranges in parentheses. CEA carcinoembryonic antigen, NSE neuron-specific enolase, CYFRA21-1 cytokeratin 19 fragment, SCC squamous cell carcinoma antigen. a Data in parentheses are numerator/denominator; the number of patients for whom histopathological data were available for each molecular marker is given

Table 3, Fig. 2). Smoking history, family history of cancer, lesion size at imaging, imaging findings of pleural retraction and vacuolar sign, location, and the biomarkers CEA, NSE, and SCC were not associated with benign PNs.

Table 3 Bivariate logistic regression analysis to predict benign nodules in the training set. Data in parentheses are 95% CIs. CEA carcinoembryonic antigen, NSE neuron-specific enolase, CYFRA21-1 cytokeratin 19 fragment, SCC squamous cell carcinoma antigen. a The number of patients for whom histopathological data were available for each molecular marker is given

Table 4 Multivariable logistic regression analysis to predict benign nodules in the training set. Data in parentheses are 95% CIs. a The number of patients for whom histopathological data were available for each molecular marker is given

Table 5 Multivariable logistic regression analysis to identify factors associated with benign nodules being diagnosed as malignant nodules. Unless otherwise noted, variables are expressed as numbers of patients with percentages in parentheses, or as medians with interquartile ranges in parentheses. CEA carcinoembryonic antigen, NSE neuron-specific enolase, CYFRA21-1 cytokeratin 19 fragment, SCC squamous cell carcinoma antigen. a Data in parentheses are 95% CIs. b Data are median millimeters; data in parentheses are interquartile range c