- Research
- Open access
- Published:
Artificial intelligence-based model for predicting pulmonary arterial hypertension on chest x-ray images
BMC Pulmonary Medicine volume 24, Article number: 101 (2024)
Abstract
Background
Pulmonary arterial hypertension is a serious medical condition. However, the condition is often misdiagnosed or a rather long delay occurs from symptom onset to diagnosis, associated with decreased 5-year survival. In this study, we developed and tested a deep-learning algorithm to detect pulmonary arterial hypertension using chest X-ray (CXR) images.
Methods
From the image archive of Chiba University Hospital, 259 CXR images from 145 patients with pulmonary arterial hypertension and 260 CXR images from 260 control patients were identified; of which 418 were used for training and 101 were used for testing. Using the testing dataset for each image, the algorithm outputted a numerical value from 0 to 1 (the probability of the pulmonary arterial hypertension score). The training process employed a binary cross-entropy loss function with stochastic gradient descent optimization (learning rate parameter, α = 0.01). In addition, using the same testing dataset, the algorithm’s ability to identify pulmonary arterial hypertension was compared with that of experienced doctors.
Results
The area under the curve (AUC) of the receiver operating characteristic curve for the detection ability of the algorithm was 0.988. Using an AUC threshold of 0.69, the sensitivity and specificity of the algorithm were 0.933 and 0.982, respectively. The AUC of the algorithm’s detection ability was superior to that of the doctors.
Conclusion
The CXR image-derived deep-learning algorithm had superior pulmonary arterial hypertension detection capability compared with that of experienced doctors.
Background
Pulmonary arterial hypertension (PAH) is a subtype of pulmonary hypertension (PH) and is characterized by elevated pulmonary arterial pressure due to increased pulmonary vascular resistance [1]. Symptoms of PAH are non-specific but commonly include exertional dyspnea and fatigue [2]. The estimated prevalence of PAH is 48–55 per 1 million adults in economically developed countries, and the typical course of PAH without treatment is death from right-sided heart failure. Currently, selective pulmonary vasodilators that act via three different pathways are available for treating PAH, and clinicians recommend an initial combination therapy [3].
Despite the establishment of treatment algorithms and reduced mortality due to PAH, the number of patients in the intermediate- and high-risk groups remain high (risk stratification according to the European Society of Cardiology and European Respiratory Society guidelines) [4, 5].
A contributor to the number of patients at risk is diagnostic delay. Diagnostic delay often occurs due to overlooked characteristic imaging findings signifying the presence of PH and related clinical symptoms [6]. A 2013 study reported that the mean time from symptom onset to diagnosis was 47 ± 34 months [6]. Moreover, patients reported 5.3 ± 3.8 general practitioner visits before being seen at a PH center [6]. At symptom onset, 95% of patients were classified according to the World Health Organization Functional Class (WHO-FC) II; however, the distribution of WHO-FC II, III, and IV in the same patient group at diagnosis was 0%, 94%, and 6%, respectively, representing a significant underestimation of the condition.
A recent study by Khou et al. reported that the mean time from symptom onset to diagnosis was 2.5 ± 4.1 years, and a longer diagnostic interval was associated with decreased 5-year survival. Therefore, although the treatment options for PAH are advancing, delays in diagnosis have not improved [7]. General physicians need to suspect PAH on chest X-ray (CXR) images in the early stages of the disease and refer patients to a PH center as appropriate [8]. Computer-aided detection/diagnosis (CAD) supports the detection and diagnosis of abnormalities or diseases. It can help physicians detect suspicious lesions that are easily overlooked, leading to improved detection accuracy [9].
CAD algorithms on CXR to detect pneumonia [10], lung nodules [11], pulmonary tuberculosis [12], and interstitial lung diseases [13] have been developed using deep-learning methods. Kusunose et al. developed and tested a deep learning algorithm to predict elevated pulmonary arterial pressure in patients with suspected PH using CXR images. In their study, the area under the curve (AUC) of the artificial intelligence (AI) algorithm was 0.71, and the negative predictive value (NPV) of the AI algorithm for detecting PH was 95.0%, providing promising results [14]. However, a deep-learning algorithm for PAH has yet to be developed.
To address this need, we aimed to develop and test a deep learning algorithm using the CXR images of patients with PAH. We hypothesized that the algorithm could help physicians detect PAH and provide better detection capabilities compared with that of doctors’ opinion alone.
Methods
Study design
This retrospective study used data collected from patients at Chiba University Hospital to develop and test a deep-learning algorithm using CXR images. All study protocols were performed in accordance with the Declaration of Helsinki. The Research Ethics Committee of the Graduate School of Medicine, Chiba University approved this clinical study (approval number: 4203). All adult participants provided written informed consent to participate in this study. The study design followed the Checklist for Artificial Intelligence in Medical Imaging (CLAIM).
Datasets
The dataset included 145 patients with PAH and 260 control patients, almost all residing in Tokyo or Chiba Prefecture and visited Chiba University Hospital between January 1, 2003, and December 31, 2020. Using a Swan-Ganz catheter, PAH was diagnosed based on a right heart catheter (RHC). The diagnosis of PAH was made using hemodynamic measurements according to the most recent World Symposium standards: mean pulmonary arterial pressure > 20 mmHg, pulmonary arterial wedge pressure ≤ 15 mmHg, and pulmonary vascular resistance > 3 wood units [2, 15].
Due to the rarity of PAH, one to three CXR scans were used independently during the clinical course of each of the 145 patients with PAH (259 total images). The RHC-CXR data pair from patients with PAH, performed within 3-day intervals, constituted the analysis dataset. The control group consisted of 260 patients who visited the Ophthalmology Department of Chiba University Hospital from January 1, 2015 to December 31, 2020 and had CXR images without suspicion of PH (260 total images). For the control group, we used CXR images which were confirmed to be free of PAH, obvious cardiac enlargement, or interstitial pneumonia by two respirologists.
Development of a deep-learning method for PAH detection
The ResNet50 model pre-trained on ImageNet-1k was used for image classification. A fully connected layer was added to the final layer for binary classification [16]. A PyTorch framework was used for the ResNet50 model implementation [17]. The training was performed on a workstation using an NVIDIA Tesla T4 graphics processing unit (Santa Clara, CA, USA). All CXR datasets were split in a 4:1 ratio for development and testing. The developed CXR dataset was separated into training and validation subsets using the K-fold cross-validation method (K = 4). The patient identification number was used to separate the CXR images to ensure that multiple CXR images from the same patient were not distributed across the datasets. Before training, the image contrasts were normalized to an intensity range of 0–255 and resized to 320 pixels × 320 pixels. Following this, the neural network was trained using Stochastic Gradient Descent. The objective function was Binary Cross Entropy Loss and the learning rate equaled 0.001. Training was performed for 50 epochs. Data augmentation during training included left/right and up/down flipping, changes in saturation and hue, and cropping.
Testing the detecting capability of the algorithm
The algorithm outputted a numerical value from 0 to 1 using the testing dataset (the probability of the PAH score) for each image. A receiver operating characteristic (ROC) curve was plotted to represent the sensitivity and false positive rate (1 - specificity) for every score cut-off, and the AUC was calculated. The optimal AUC threshold was determined using the Youden index and sensitivity and specificity were calculated.
Comparing the performance of the algorithm versus the performance of the doctors
The detecting capability of the algorithm was compared with that of doctors (respirologists [> 8 years of experience], n = 7 and radiologists [> 14 years of experience], n = 2). The distance from the doctor’s eyes to the monitor was approximately 60 cm. The time allowance given to read each image was ≤ 5 s.
Statistical analysis
We compared the ROC curve of this algorithm with that of the doctor’s reading results. All statistical analyses were performed with EZR (Saitama Medical Center, Jichi Medical University, Saitama, Japan), which is a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria). Statistical significance was set at P < 0.05.
Results
Clinical characteristics
Participants were divided into two groups; those with PAH (PAH group: 145 patients; mean age, 51.9 ± 16.1 years; 34 males) and the control (control group: 260 patients; mean age, 62.5 ± 9.6 years; 131 males). The characteristics of the patients with PAH included in this study are shown in Table 1.
Detection capability of the algorithm
Figure 1 shows the distribution of CXR images included in the testing dataset according to the algorithm-assigned scores for those confirmed to be PAH-positive or PAH-negative.
We divided the data sets into training and test sets, and then performed a four-fold cross-validation on the training set. The AUC of each model test on the test set was 0.9801 (SD, 0.0054). The final performance of the ensemble of four models was 0.988 (95% confidence interval [CI], 0.972–1.000). Using an AUC threshold of 0.69, the sensitivity and specificity of the algorithm were 0.933 and 0.982, respectively. Figure 2 shows the ROC curve of the AI algorithm using images from the testing dataset. In addition, all the doctors’ results are plotted. All plotted dots were below the ROC curve.
Additionally, AUCs were calculated for both sexes. The test dataset included 31 patients with PAH (PAH group: 9 males; 45 total images; mean age, 54.8 ± 5.6 years) and 56 patients without suspicion of PAH (control group: 30 males; 56 total images; mean age, 57.5 ± 8.0 years). There were no significant differences in age (p > 0.05). In the male-only group (39 males; 48 total images), the AUC of the AI algorithm for the detection of PAH was 0.993 (95% confidence interval [CI], 0.978–1.000). Using an AUC threshold of 0.801, the sensitivity and specificity of the algorithm were 0.967 and 0.944, respectively. Meanwhile, in the female-only group (48 females; 53 total images), the AUC of the AI algorithm for the detection of PAH was 0.983 (95% CI, 0.951–1.000). Using an AUC threshold of 0.693, the sensitivity and specificity of the algorithm were 1.000 and 0.926, respectively. Furthermore, the AUCs of the AI algorithm for both sexes were high.
Performance of the algorithm versus the performance of the doctors
Among the nine doctors who interpreted the CXR images, the mean sensitivity and specificity were 0.829 and 0.906, respectively. The median (range) sensitivity and specificity were 0.822 (0.667–0.911) and 0.875 (0.821–0.982), respectively. For the detection of PAH, the AUC of the doctors was 0.945 (95% CI, 0.907–0.983), while that of the AI algorithm was 0.988 (95% CI, 0.972–1.000) and superior to that of the doctors (p = 0.0175). These findings demonstrated that the algorithm’s detection performance was superior to that of the doctors.
Discussion
In this retrospective study, we developed and tested the CAD algorithms for analyzing CXR images to diagnose PAH. The AUC of the algorithm was 0.988, which was superior to that of experienced doctors. However, further real-world clinical trials are needed to determine how this algorithm can contribute to PAH detection.
CXR scans can be used as a simple and economical data source that are available globally. Recently, deep-learning AI diagnostic systems have been developed for chest imaging. Early detection and diagnosis of PAH are important because a longer diagnostic interval is associated with decreased 5-year survival [7]. If CAD improves the detection rate of PAHs on CXR images and makes this information available to non-specialist physicians, the time required to diagnose PAH can be reduced, which could lead to better prognosis.
In this study, we developed an algorithm to extract suspected patients with PAH using CXR images and achieved an AUC of 0.988, albeit in a limited population. In a previous study, Kusunose et al. developed an algorithm to predict increased pulmonary arterial pressure using CXR images in a population with suspected PH, obtaining an AUC of 0.71 and a specificity of 0.95 [14]. However, the CAD task used was more difficult than that used in our study because the people in the control group (non-PH) had subjective symptoms and were suspected of having PH. To visualize the focus pixels of the model inference, we used Gradient-weighted Class Activation Mapping (Grad-CAM). This technique specifically allows for the visualization of the focus pixels per image. Figure 3 is a single example to demonstrate the function of the model. Our AI model focused on the pericardial areas of the CXR images (Fig. 3) and appeared to label for the presence or absence of PAH. Nine doctors diagnosed PAH on CXR images by mainly focusing on the enlargement of the pulmonary artery and the heart. The focused area of the AI model and the doctors suggests that the AI model has similar diagnostic criteria to those of the doctors. In cases of false-positive results in the control group, heart enlargement could be considered PAH by the AI algorithm (Supplementary Fig. 1). Whereas in the study by Kusunose et al., the AI program focused on the right upper lung area and the area around the heart because the right upper pulmonary field is generally a common site of focal congestion [14].
Future studies should include diseases with CXR images similar to PAH, such as cardiac hypertrophy, and collect data from diverse clinical settings. This study can serve as a precursor to allow for the development of algorithms for all PH groups. For example, validating the algorithm on various groups of patients in addition to those with PAH, such as those with chronic thromboembolic PH, would be beneficial. We believe this report will serve as a basis for future large-scale studies.
Our study had some limitations. First, the number of images used in the training dataset was relatively small because PAH is a rare disease. The small sample size could have impacted the accuracy of the algorithm as a larger training dataset would improve the algorithm’s performance. Therefore, further validation of the algorithm in a larger population in the future is required, including not only PAH but also PH patients with other diseases such as chronic thromboembolic PH. Second, CXR images were obtained from two forms of imaging equipment, including both imaging plates (IPs) and flat-panel detectors (FPDs). Computed radiography systems with FPDs generally have a higher image quality than those which use IPs. Therefore, differences in image quality and other factors may have affected the study results. In the future, consistency should be ensured regarding the type of imaging detector used. Third, due to ethical issues, RHC could not be performed in patients within the control group who presented with no symptoms of cardiovascular disease. Therefore, we could not completely discount PAH presence in the control group, indicating a potential validation bias in the participant selection strategy. However, as PAH is a very rare disease [3], we believe that it was not highly likely that patients in the control group had PAH. Fourth, the learning and testing protocols used for the algorithm were conducted at a single facility. Consequently, it is unclear whether the same performance can be achieved using CXR images under different conditions and with different equipment. Further clinical trials are required to validate this algorithm in a real-world setting.
In conclusion, we successfully developed a deep-learning algorithm which could detect PAH using CXR images. The detection capability of the algorithm was superior to that of experienced doctors. Analyzing CXR images with our algorithm could provide general physicians and non-specialists for PAH with an opportunity to suspect PH, which might lead to early diagnosis.
Data availability
The data supporting this study’s findings are available from the corresponding author, Shun Imai, upon reasonable request.
Abbreviations
- PAH:
-
pulmonary arterial hypertension
- PH:
-
pulmonary hypertension
- WHO-FC:
-
World Health Organization Functional Class
- CXRs:
-
chest X-rays
- CAD:
-
computer-aided detection/diagnosis
- AI:
-
artificial intelligence
- AUC:
-
area under the curve
- NPV:
-
negative predictive value
- RHC:
-
right heart catheter
- ROC:
-
receiver operating characteristics
References
Badesch DB, Raskob GE, Elliott CG, Krichman AM, Farber HW, Frost AE, et al. Pulmonary arterial hypertension: baseline characteristics from the REVEAL Registry. Chest. 2010;137:376–87.
Frost A, Badesch D, Gibbs JSR, Gopalan D, Khanna D, Manes A, et al. Diagnosis of pulmonary hypertension. Eur Respir J. 2019;53:1801904.
Humbert M, Kovacs G, Hoeper MM, Badagliacca R, Berger RMF, Brida M, et al. 2022 ESC/ERS guidelines for the diagnosis and treatment of pulmonary hypertension. Eur Heart J. 2022;43:3618–731.
Galiè N, Channick RN, Frantz RP, Grünig E, Jing ZC, Moiseeva O, et al. Risk stratification and medical therapy of pulmonary arterial hypertension. Eur Respir J. 2019;53:1801889.
Nagata J, Sekine A, Tanabe N, Taniguchi Y, Ishida K, Shiko Y, et al. Mixed venous oxygen tension is a crucial prognostic factor in pulmonary hypertension: a retrospective cohort study. BMC Pulm Med. 2022;22:282.
Strange G, Gabbay E, Kermeen F, Williams T, Carrington M, Stewart S, et al. Time from symptoms to definitive diagnosis of idiopathic pulmonary arterial hypertension: the delay study. Pulm Circ. 2013;3:89–94.
Khou V, Anderson JJ, Strange G, Corrigan C, Collins N, Celermajer DS, et al. Diagnostic delay in pulmonary arterial hypertension: insights from the Australian and New Zealand pulmonary hypertension registry. Respirology. 2020;25:863–71.
Weatherald J, Humbert M. The ‘great wait’ for diagnosis in pulmonary arterial hypertension. Respirology. 2020;25:790–2.
Qin C, Yao D, Shi Y, Song Z. Computer-aided detection in chest radiography based on artificial intelligence: a survey. Biomed Eng OnLine. 2018;17:113.
Parveen NR, Sathik MM. Detection of Pneumonia in chest X-ray images. J Xray Sci Technol. 2011;19:423–8.
Schilham AM, van Ginneken B, Loog M. A computer-aided diagnosis system for detection of lung nodules in chest radiographs with an evaluation on a public database. Med Image Anal. 2006;10:247–58.
Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284:574–82.
Nishikiori H, Kuronuma K, Hirota K, Yama N, Suzuki T, Onodera M, et al. Deep learning algorithm to detect fibrosing interstitial lung disease on chest radiographs. Eur Respir J. 2023;61:2102269.
Kusunose K, Hirata Y, Tsuji T, Kotoku J, Sata M. Deep learning to predict elevated pulmonary artery pressure in patients with suspected pulmonary hypertension using standard chest X ray. Sci Rep. 2020;10:19311.
Vachiéry JL, Tedford RJ, Rosenkranz S, Palazzini M, Lang I, Guazzi M, et al. Pulmonary hypertension due to left heart disease. Eur Respir J. 2019;53:1801897.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016. p. 770–8.
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, PyTorch: An imperative style, high-performance deep learning library. In: Adv Neural Inf Process Syst 32: Annual Conference on Neural Information Processing Systems, NeurIPS 2019. Vancouver, BC, Canada, 2019. Wallach HM, Larochelle H, Beygelzimer A et al. d’Alché-Buc F, Fox EA, Garnett R, editors; 2019.
Acknowledgements
We would like to thank Editage (www.editage.com) for English language editing.
Funding
This work was partly supported by a grant to The Intractable Respiratory Diseases and Pulmonary Hypertension Research Group, the Ministry of Health, Labor and Welfare, Japan. This study was supported by M3 Inc., Tokyo, Japan.
Author information
Authors and Affiliations
Contributions
SI, SS, JN, TS, TN, SH and KO were equally involved in this study conceptualization, study design, data analysis, and writing of the original draft. SI, SS, JN, AN, AS, TS, AS, AN, HY read chest radiographs as specialist doctors. NS, NT, TB, and TS critically revised the report and commented on drafts of this manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
This study was performed in accordance with the Declaration of Helsinki. This study was approved by Research Ethics committee of the Graduate School of Medicine, Chiba University (approval number: 4203). All adult participants provided written informed consent to participate.
Consent for publication
Not applicable.
Competing interests
TN, SH, and KO are employees of M3 Inc. All other authors do not have competing interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Imai, S., Sakao, S., Nagata, J. et al. Artificial intelligence-based model for predicting pulmonary arterial hypertension on chest x-ray images. BMC Pulm Med 24, 101 (2024). https://doi.org/10.1186/s12890-024-02891-4
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s12890-024-02891-4