Validation of overnight oximetry to diagnose patients with moderate to severe obstructive sleep apnea
BMC Pulmonary Medicine volume 15, Article number: 24 (2015)
Polysomnography (PSG) is treated as the gold standard for diagnosing obstructive sleep apnea (OSA). However, it is labor-intensive, time-consuming, and expensive. This study evaluates validity of overnight pulse oximetry as a diagnostic tool for moderate to severe OSA patients.
A total of 699 patients with possible OSA were recruited for overnight oximetry and PSG examination at the Sleep Center of a University Hospital from Jan. 2004 to Dec. 2005. By excluding 23 patients with poor oximetry recording, poor EEG signals, or respiratory artifacts resulting in a total recording time less than 3 hours; 12 patients with total sleeping time (TST) less than 1 hour, possibly because of insomnia; and 48 patients whose ages less than 20 or more than 85 years old, data of 616 patients were used for further study. By further considering 76 patients with TST < 4 h, a group of 540 patients with TST ≥ 4 h was used to study the effect of insufficient sleeping time. Alice 4 PSG recorder (Respironics Inc., USA) was used to monitor patients with suspected OSA and to record their PSG data. After statistical analysis and feature selection, models built based on support vector machine (SVM) were then used to diagnose moderate and moderate to severe OSA patients with a threshold of AHI = 30 and AHI = 15, respectively.
The SVM models designed based on the oxyhemoglobin desaturation index (ODI) derived from oximetry measurements provided an accuracy of 90.42-90.55%, a sensitivity of 89.36-89.87%, a specificity of 91.08-93.05%, and an area under ROC curve (AUC) of 0.953-0.957 for the diagnosis of severe OSA patients; as well as achieved an accuracy of 87.33-87.77%, a sensitivity of 87.71-88.53%, a specificity of 86.38-86.56%, and an AUC of 0.921-0.924 for the diagnosis of moderate to severe OSA patients. The predictive outcome of ODI to diagnose severe OSA patients is better than to diagnose moderate to severe OSA patients.
Overnight pulse oximetry provides satisfactory diagnostic performance in detecting severe OSA patients. Home-styled oximetry may be a tool for severe OSA diagnosis.
Breathing-related sleep disorder is a spectrum of diseases including snoring, upper airway resistance syndrome and sleep apnea . Several studies have explored that sleep apnea is associated with obesity, daytime fatigue, high blood pressure, and increased risk of cardiovascular morbidity [2,3]. Obstructive sleep apnea (OSA) is prevalent in 4% of men and 2% of women ; among them, up to 93% of women and 82% of men have not been diagnosed , resulting in an increased risk of 2–7 folds in causing motor vehicle crashes  and causing several chronic diseases, such as metabolic syndrome ; neurocognitive deficits , vigilance alteration and attentional decline ; and erectile dysfunction .
Although polysomnography (PSG) is regarded as the gold standard, it is time-consuming, labor-intensive, and expensive . According to a recent report, waiting time for accessing to diagnose and treat patients with suspected OSA is lengthy even in the developed counties around Europe, Australia, the United States, and Canada ; for example, it was estimated that the waiting time for non-urgent referrals for a sleep study ranges from 0 to 48 months in UK, 2 weeks to 2 months in Belgium, 4 to 68 weeks in Australia, a few weeks to more than a year in the US, and 8 to 30 months in Canada . The waiting time ranges from 3 to 7 days in our hospital. Hence, other devices which are cheap, safe, and accurate; readily and easily accessed; and have no risk or side effects to the patients are needed for decreasing waiting time and cost for OSA diagnosis . In addition to labor-intensive, time-consuming, and high examination cost, PSG also has other limitations, such as technical expertise is required and timely access is restricted . Thus, home pulse oximetry has been proposed as a valuable screening tool, although its effectiveness in screening patients with OSA has been debated for several years . A number of studies have assessed its usefulness, but sometimes with conflicting results [14-16]. For example, home overnight oximetry was found to be not very correlated with PSG for testing children  and to be considerable night-to-night variability for aged patients with chronic obstructive pulmonary disease (COPD) . However, Brouillette and colleagues reported that oximetry could be used to diagnose OSA for children with a positive predictive value of 97% .
It was reported that total sleeping time (TST) and sleep efficiency were significantly smaller for the first night in-home PSG compared to the 2nd and 3rd nights . Night-to-night variability of AHI greater than 5 among 21% of the study patients had also been observed . Furthermore, Newell et al  showed that TST, sleep efficiency, and AHI were significantly different between 2 consecutive nights of PSG recording for 4 individual groups of subjects, including sleep-related breathing disorders, insomnia, movement and behavioral disorders, and healthy control, indicating that AHI might be affected by the TST.
Patients who do not have enough sleep time during the examination might have great influence on the diagnosis of OSA severity. Although Lam and colleagues excluded subjects with TST less than 4 hours , the reason why these data were removed was not reported. In this study, we compared two groups of subjects, one with TST less than 4 hours and another with TST more than 4 hours.
The support vector machine applied to oximetry for the diagnosis of OSA has been little explored. Recently, SVM was used to design classifiers for the diagnosis of OSA patients based on the data acquired from only 149 and 100 recruited subjects, respectively, by Al-Angari et al.  and Marcos et al. . Marcos et al.  designed an SVM classifier using Gaussian kernel to classify spectral features of SaO2 signals. The subjects divided into 60 normal and 89 OSA patients without considering the severity were recruited in their study. Al-Angari et al.  adopted linear kernel and second-order polynomial kernel to design the SVM classifiers for discriminating 1-mimute epochs of OSA segments from the normal segments, and to discriminate patients from the normal subjects. The subjects were divided into 50 normal and 50 OSA patients. Only accuracy, sensitivity, and specificity were evaluated in both studies, while areas under ROC curve (AUC) were not calculated. In the present study, the PSG data of 616 patients with possible OSA were used to design SVM classifiers for the diagnoses of severe and moderate to severe OSA patients.
The objectives of this study are to (1) validate nocturnal oximetry for the screening of high-risk OSA patients using oxyhemoglobin desaturation index (ODI) to detect severe and moderate-to-severe patients; (2) consider the effect of patients with TST less than 4 hours compared with those having longer TST during PSG examination; and (3) determine the best kernel and features used for constructing the SVM classifiers to discriminate the OSA patients from the normal subjects.
A total of 699 patients with possible OSA were recruited and tested using PSG device for overnight recording at the Sleep Center of China Medical University Hospital after written informed consents from Jan. 2004 to Dec. 2005. By excluding 23 patients with total recording time less than 3 hours, 12 with poor oximetry recordings, poor EEG signals, or respiratory artifacts, and 48 subjects whose ages less than 20 or more than 85 years old (7), data of 616 patients were used for analysis. By further considering 76 patients with TST less than 4 hours , a group of 540 patients whose TST more than 4 hours was used to study the effect of insufficient sleeping time. The study has been approved by Institutional Review Board of China Medical University (Study No. DMR 96-IRB-17).
Alice 4 PSG recorder (Respironics Inc., USA) was used to monitor and record patients with suspected OSA, during which a number of physiologic variables were measured and recorded during sleep. Physiologic sensors were used to record (1) EEG for detecting brain electrical activity and sleep staging on the basis of 30-sec epochs, (2) EOG and submental EMG for detecting eye and jaw muscle movement, (3) tibia EMG for monitoring leg muscle movement, (4) airflow (by using nasal cannula to measure oronasal pressur) for detecting breath interruption (1287 nasal flow, PTAF 2; Pro-Tech; Pittsburgh, PA), (5) inductance plethysmorgraphy for estimating respiratory effort (chest and abdominal excursion), (6) ECG for measuring heart rate, and (7) arterial oxygen saturation by using a finger probe oximeter (935 Oximeter Sensor; Respironics; Murrysville, PA).
PSG recordings were performed overnight (from 9:00 PM to 06:00 AM in general) in the sleep center for a minimum of 3 hours of light-on to light-off recording time when patients received OSA diagnoses. During the examination, the attending technicians on duty closely inspected the acquired signals of individual channels displayed on the monitor to check their signal quality. If the signal quality was poor or contained obvious artifacts, the technicians would check the instrument setup, cable connection, and electrode-skin contact on-site to ensure proper recording. The quality of PSG raw data was verified by an experienced physician of sleep medicine and scored by a technician certificated by the Taiwan Society of Sleep Medicine. During off-line analysis, the periods with artifacts or poor signal quality not meeting the diagnostic requirements, such as poor signals or artifacts in acquired oximetry, EEG, or respiratory data that did not allow the reading of the SpO2, sleep stages, or the respiratory events, were deleted from the recorded signals. For patients whose net total recording time less than 3 hours or TST less than 1 hour were arranged to receive another free examinations. Sleeping efficiency was calculated as TST divided by total recording time.
The oximeter was used to measure the SpO2 with a sampling rate of 1 Hz. Only the SpO2 signal acquired from pulse oximeter was used for ODI calculation without the knowledge of other PSG information. For pulse rates less than 112 beat/min, the SpO2 calculation is based on a four-beat exponential averaging. The averaging time is doubled to 8 beats for pulse rate greater than 112 beat/min, and is redoubled again for pulse rate more than 225 beat/min. Artifacts with changes of oxygen saturation between consecutive sampling intervals greater than 4%/s, and any oxygen saturation less than 50% were detected and eliminated by an automated algorithm .
An apnea was defined as a minimum of 10 seconds of airflow cessation, and a hypopnea was determined by a 30% reduction in airflow preceding a period of normal breathing for a minimum of 10 seconds and oxyhemoglobin desaturation (decrease in SpO2 ≧ 4%) . Apnea-hypopnea index (AHI) was calculated as the total number of apneas (central, mixed, or obstructive) and hypopneas per hour of sleep. An arousal event was defined as an episode lasting 3 seconds or more with a return of α activity associated with a detectable increase in EMG activity . Arousal index (AI) was the total number of arousals per hour of sleep. OSA patients were diagnosed as mild, moderate, and severe with an AHI greater than 5/h, 15/h, and 30/h, respectively.
The questionnaires including demographic information (age, gender, education, profession type, as well as smoking and drinking status), Epworth scaling score (ESS), as well as symptoms (snoring, drooling and teeth-grinding during sleep; feeling headache, tired, and dry mouth after wake-up; daytime drowsiness, memory problem, and traffic accidence caused by drowsiness when driving) and comorbidities (hypertension, diabetes, high blood lipid, high uric acid, stroke, heart attack, angina pectoris, asthma, thyroid disorder, and allergic rhinitis) related to OSA diagnosis were filled by the patients before PSG recording. Anthropometric parameters (weight, height, as well as waist, neck and hip circumferences) were measured and BMI calculated by the technicians.
Oxyhemoglobin desaturation index derived from pulse oximetry measurement
Oxyhemoglobin desaturation index (ODI), represented as ODInb, includes three components, i.e., threshold (n), baseline (b), and lasting time (fixed to t = 3 sec in this study) parameters. For threshold n, the amount of oxyhemoglobin desaturation n% decrease from the baseline was calculated. In this study, two different baselines, including the mean of all-night oxygen (A)  and the mean of the top 20% of oxyhemoglobin saturation values over the 1 min preceding the scanned oxyhemoglobin value (T)  were adopted. The lasting time parameter t = 3 was defined as that an oxyhemoglobin desaturation event with a minimum of 3 sec were continued. Finally, the ODI was calculated by dividing the total oxyhemoglobin desaturation counts with the total recording time (hours). For example, ODI4A calculated the amount of oxyhemoglobin desaturation events with a 4% decrease from the mean of all-night oxygen lasting for more than 3 seconds per hour. Because the oximetry is one of the PSG channels, we aim to investigate the accuracy, sensitivity, specificity, and area under ROC curve (AUC) of overnight oximetry for the diagnosis of moderate and moderate to severe OSA patients by ODI parameters, including ODI2 (ODI2T), ODI3 (ODI3T), ODI4T, and ODI4A, derived from oximetry measurements.
Consideration of TST during PSG examination
Subjects who do not have enough sleep time during the examination might have great influence on the diagnosis of OSA severity. Although Lam and colleagues excluded subjects with TST less than 4 hours , the reason why these data were removed was not reported. In this study, we compared two groups of subjects, one with TST less than 4 hours and another one with TST more than 4 hours.
Demographic, anthropometric, questionnaire, and PSG data of patients were analyzed using descriptive statistical analysis for calculating means and standard deviations of individual variables. Inferential statistical analyses including Pearson chi-square test and analysis of variance (ANOVA) were applied to detect significant variables for discriminating among normal subjects and mild, moderate, and severe OSA patients. Moreover, linear regression analysis and Bland-Altman plot were performed to compare correlations and agreements between ODI variables (ODI2, ODI3, ODI4T, and ODI4A) and AHI. Analysis of ROC curves were performed to set the cutoff values of different ODI parameters that best discriminate moderate and moderate/severe OSA patients from normal and mild subjects. Furthermore, we tested and compared various combinations of neck circumference (NC), body mass index (BMI), ESS, and ODI variables in the diagnosis of severe and moderate to severe patients based on 2 thresholds, AHI = 30 and AHI = 15, respectively. Finally, support vector machine (SVM), an artificial technique, was used to design computer-assisted diagnostic systems based on parameters derived from the oximetry measurements only by excluding signals acquired from other PSG channels. Comparisons of the diagnostic performance of traditional oximetry parameters with those from the SVM models were also conducted.
Support vector machine
The SVM technique is a useful technique for data classification and regression, which has become an important tool for machine learning and data mining. In general, SVM has better performance when competed with existing methods, such as neural networks and decision trees [29-31]. Recently, application of SVM in medicine has grown rapidly. For examples, it has been applied in prediction of disulfide bonding patterns in proteins , discrimination of malignant and benign cervical lymph nodes , disease diagnosis using tongue images , diagnoses of cardiovascular disease  and breast cancer , and predictions of successful ventilation weaning . A special property of SVM is that it can simultaneously minimize the empirical classification error and maximize the geometric margin. Its goal is to separate multiple clusters with a set of unique hyperplanes having greatest margin to the edge of each cluster to reduce misclassification induced by noise, where each hyperplane separating two clusters is not unique for ordinary linear classifiers. Recently, SVM was also used to design classifiers for the diagnosis of OSA patients [23,24].
Demographic and clinical characteristics of participants
The tested subjects were divided into 4 groups based on the AHI for statistical analysis. Table 1 shows the demographic and clinical characteristics of participants and compares the statistic results among 4 patient groups. Regarding the influence of gender in OSA severity, the proportion of male patients increases significantly following the OSA severity (Chi-square test, p < 0.001). Demographic and anthropometric variables (age, BMI, and NC), ESS obtained from the questionnaire, and variables derived from oximetry (ODI and heart rate), reaching a level of significance (ANOVA, p < 0.01) among two or three stages, were selected as potential variables for designing the SVM classifiers to detect severe (30≦AHI) and moderate to severe (15≦AHI) patients. It was found that only ODI could be used to differentiate all 4 different groups. Variables, including TST, sleep latency, arousal count, and arousal index, derived from EEG and/or EMG, were also calculated and compared among 4 groups of participants. It can be observed that the TST is lower while the arousal count and arousal index are higher for more severe patients.
Discrimination of normal, mild, moderate, and severe participants
Table 2 shows the confusion matrices of multiclass classifiers in the discrimination of 4 groups of participants using ODI parameters derived from the oximetry measurements for two datasets including 616 (dataset 1) and 540 (dataset 2) subject data, respectively. ODI parameters can provide a total predictive accuracy of 71.27% and 73.52% for dataset 1 and dataset 2, respectively. The sensitivities of SVM models in diagnosing severe patients are 88.07% and 87.76% in dataset 1 and dataset 2, respectively, while the sensitivities are less than 75% for other 3 categories. The predictive accuracy is obtained by tenfold cross validation.
Diagnosis of severe and moderate/severe OSA patients
Table 3 displays various combinations of NC, BMI, ESS, and ODI for designing predictive models in the diagnosis of severe OSA patients, and the prediction rates and AUC achieve 89.81-90.58% and 0.946-0.958, respectively. Diagnostic models designed using ODI parameters derived from pulse oximetry alone achieved prediction performance similar to other multiclass predictors designed using more variables. Models designed with ODI parameters provided a sensitivity of 87.36-89.87%, a specificity of 91.08-93.05%, an accuracy of 90.42-90.55%, and an AUC of 0.953-0.957 for the diagnosis of severe OSA patients in both datasets.
Table 4 compares the diagnostic performance of various diagnostic models designed with different combination of variables for the diagnosis of moderate to severe patients. The sensitivity, specificity, accuracy, and AUC achieve 87.71-89.97%, 83.08-86.56%, 86.85-88.14%, and 0.921-0.941, respectively. Again, ODI alone provides a similar predictive accuracy of 87.33-87.77%, a sensitivity of 87.71-88.53%, a specificity of 86.38-86.56%, and an AUC of 0.921-0.924, respectively, for the diagnosis of moderate to severe OSA patients.
Figure 1 illustrates and compares the ROC curves of models designed using traditional ODI variables (ODI2, ODI3, ODI4T, and ODI4A) and their combinations (ODI2 and ODI4A). In addition to higher accuracy (Table 3), the AUCs of the SVM models designed using 2 ODI variables are greater than other single-variable models, as shown in Figure 1(a) and 1(b), indicating its capability in diagnosing severe OSA patients outperforms other traditional models. Regarding the diagnosis of moderate to severe patients, as shown in Table 4, the multiple-variable SVM models and traditional single-variable models exhibit similar diagnostic performance.
Cutoff values for diagnosis of OSA patients using ODI variables
The cutoff values of single-variable models for the diagnosis of severe and moderate to severe OSA patients are shown in Tables 3 and 4, respectively. The optimal cutoff values for severe patient diagnosis are 27.2, 18.4, 11.2, and 13.7, respectively, in dataset 1, as well as 27.2, 18.4, 10.4, and 13.7, respectively, in dataset 2 for ODI2, ODI3, ODI4T, and ODI4A. The cutoff values were decreased to 21.2, 9.5, 7.3, and 8.5 for dataset 1 as well as 21.2, 9.5, 7.3, and 7.3 for dataset 2 when making diagnosis of moderate to severe patients using ODI2, ODI3, ODI4T, and ODI4A, respectively.
Effect of less TST
Table 5 shows that the number and percentage of subjects with different OSA severity, exhibiting significant difference between patients with TST less than 4 hours and those with TST more than 4 hours (p < 0.01). As indicated in the table, the percentage of severe patients in the group of TST < 4 h group (63.2%) is higher than the TST ≥ 4 h group (43.9%), while it is opposite by considering the cumulated normal and mild subjects (13.2% vs 35.0%). On the other hand, the percentages of moderate patients are very close between two groups (23.9% vs 21.1%). No significant difference (p > 0.05) was observed between two groups regarding anthropometric variables (BMI and NC), questionnaire (ESS), and HR, while significant difference was found for age (p < 0.001), ODI4A (p < 0.05), and AHI (p < 0.01). Interestingly, except ODI4A, other ODI parameters, i.e., ODI2, ODI3, and ODI4T, exhibited no significant difference (p > 0.05) between two different TST groups. It mimics ODI4A is more sensitive than ODI2, ODI3, and ODI4T in discriminating patients with less TST from those with more TST. Furthermore, as shown in Table 5, patients with less TST demonstrate significantly greater latency (p < 0.001) and arousal index (p < 0.05); however, the arousal counts is significantly smaller (p < 0.001). We suspected that data recorded from subjects who do not have enough sleeping time might not be reliable for analysis and should be excluded. An additional PSG examination should be conducted to draw more solid diagnostic conclusion.
Discussion and conclusion
Prediction of OSA using questionnaires, demographics, clinical features, and physiological examination has been extensively studied in the last decade [11,38]. Goncalves and colleagues used Epworth sleeping scale (ESS), the sleeping disorders questionnaire, the Beck depression inventory, the medical outcome study 36-item short form health survey (SF-36), and a questionnaire on driving difficulties and accidents to evaluate subjects who were suspected to have breathing-related sleep disorders. Among them, ESS was found to be correlated to arousal index and apnea-hyponea index (AHI) , while contradicted by other investigators [11,39]. Khoo et al  used questionnaires, containing questions regarding snoring, choking, suffocating, and abrupt awaking during sleep, to study Asian populations, including Chinese, Malaysian, and Indian. It was found that the risk factors are similar to white populations in strong association of snoring and sleep apnea with male gender, older age, obesity, family history, and smoking . In addition, demographic, clinical, and biochemical factors including age, sex, observed sleep apnea, fasting insulin, glycosylated hemoglobin A1C, and central obesity and lager NC were found to significantly increase the risk of higher AHI for the severely obese patients . Strong correlation between patient self-perception and clinical examination, including Friedman tongue position grade and Friedman clinical staging, of OSA severity and AHI was also found . In this study, ESS questionnaire was considered to be included for designing SVM classifiers for OSA diagnosis. As indicated in Table 1, it is only effective in discriminating the normal from the severe OSA patients. In contrast, all ODI parameters are able to discriminate the normal and different stages of OSA patients. As presented in Tables 3 and 4, diagnostic performance is not improved when ODI parameters are combined with ESS, nor is combined with BMI and/or NC. ODI parameters provided by overnight oximetry measurements may become good predictors in the diagnosis of OSA, while demographic and questionnaire variables are not very helpful to elevate the prediction rate .
Selection of ODI parameters for designing SVM models
The selection of variables for designing predictive models for OSA diagnosis is based on ANOVA analysis, linear regression analysis, and Bland-Altman plot. Although all variables shown in Table 1 reached a level of significance in discriminating 4 groups of participants when tested with ANOVA, only NC, BMI, and ODI parameters were considered as potential predictors after post-hoc analysis in discriminating severe patients from the other groups. In contrast, all the ODI parameters are able to discriminate any 2 groups of participants. The variables derived from PSG channels other than oximetry, including TST, sleep latency, arousal count, and arousal index, were not included, because they cannot be derived from the home oximetry.
Several ODI parameters, including ODI2, ODI3, ODI4T, and ODI4A, which were highly correlated with AHI were considered to be included for designing the SVM models. In addition to correlation, their agreements with AHI were also evaluated by Bland-Altman plots. After linear regression analysis, the obtained linear predictor functions were used to perform Bland-Altman plots to check their agreements with the gold standard AHI. As illustrated in Figure 2, Bland-Altman plots were compared among single and combined ODI parameters after linear regressions. The correlation coefficients (R2) of AHI versus ODI2, ODI3, ODI4T, and ODI4A were 0.770, 0.835, 0.813, and 0.878, respectively. The agreements of AHI with individual ODI parameters were evaluated by observing their means and differences. As exhibited in Figure 2 (a)-(d), the 95% limits of agreements (±1.96 SD) are ±26.4, ±22.4, ±23.8, and ±19.2, respectively, for ODI2, ODI3, ODI4T, and ODI4A with a mean of 0. ODI4A presented best agreement with AHI and was defined as the potential variable for combining with other variables to achieve better diagnostic performance. Figure 2 (e) and (f) demonstrate the Bland-Altman plots for ODI4A combined with ODI2 and ODI3, respectively. It can be found that the former combination (R2) = 0.881, ±1.96 SD = ±19.0) exhibits slightly better correlation and agreement with AHI than the latter (R2 = 0.880, ±1.96 SD = ±19.1) and ODI4A alone (R2 = 0.878, ±1.96 SD = ±19.2). The correlations and agreements were not improved by including more ODI parameters; for example, as shown in Figure 3, the correlation and agreements were slightly degraded for the combined ODI4A, ODI2, and ODI3 (R2 = 0.881 and ±1.96 SD = ±20.0) as well as for the combined ODI4A, ODI4T, ODI2, and ODI3 (R2 = 0.880 and ±1.96 SD = ±38.9) with significantly deviated means of 3.5 and −15.4, respectively. The analysis of ROC curves further validated that the combination of ODI4A and ODI2 achieved better diagnostic performance, especially for the diagnosis of severe OSA patients, than the predictors based on single ODI parameters.
Comparison of multi-variable SVM classifiers and single-variable predictors
As shown in the confusion matrices (Table 2) resulted from multiclass SVM classifiers for the detection of normal, mild, moderate, and severe patients, the sensitivity of ODI variables (ODI 2 and ODI4A) achieves more than 87% in detecting severe OSA patients, while it is only 41-74% for the other three groups. The misclassification rates of severe and moderate patients classified as normal are 0.35% (1/285) and 2.27% (3/132), respectively, for dataset 1, as well as 0% and 2.63% (3/114), respectively, for dataset 2. The misclassified rates are only 1.38% and 0%, respectively, for normal subjects to be classified as moderate or severe when tested with dataset 1 and dataset 2. The results mimic that a system designed based on ODI parameters for diagnosing moderate or moderate/severe patients is expected to have great diagnostic performance.
The sensitivity in detecting severe patients is 87.36%, which indicates that 12.64% of the severe patients will be treated as normal, mild, or moderate (Table 3). Further PSG test will be expected to confirm the diagnoses of those mild and moderate patients. On the other hand, the percentage of normal, mild and moderate patients who were diagnosed as severe is 6.95% (1-specificity). According to the results of our dataset indicated in Table 2, none of the normal subject was diagnosed as severe, mimicking that the subjects being diagnosed as severe were mild or moderate OSA patients. It is acceptable for this misclassification (false-positive) since it was suggested that mild and moderate patients also need treatments using continuous positive airway pressure (CPAP) . The oximetry can be effective to be used for diagnosing severe OSA patients. The predictive model is suitable to predict severe patients using cheaper and convenient oximetry, while the non-severe patients, including normal, mild, and moderate patients, need to be confirmed by more expensive PSG examinations. Cost-effectiveness analysis needs to be conducted to evaluate the strategy of home oximetry test followed by PSG examination for the diagnosis of OSA patients to reduce healthcare cost and medical facility usage.
Marcos et al. designed an SVM classifier using Gaussian kernel to classify spectral features of SaO2 signals . The 149 subjects were divided into 60 normal and 89 OSA patients. The severity of OSA patients was not considered in their study. Since Gaussian kernel is deemed as an example of RBF kernel, the results were compared with our results obtained from RBF SVM classifier. The obtained accuracy, sensitivity, and specificity were 88.00%, 84.44%, and 93.33%, respectively, with an AUC of 0.921. In comparison, as presented in Tables 3, our results showed better performance with accuracy (≥90.42%), sensitivity (≥87.36%), specificity (≥91.08%), and AUC (≥0.953) in diagnosing severe OSA patients; and, as indicated in Table 4, similar performance with accuracy (≥87.33%), sensitivity (≥87.71%), specificity (≥86.38%), and AUC (≥0.921) in diagnosing moderate to severe patients.
Al-Angari et al. adopted linear kernel and second-order polynomial kernel to design the SVM classifiers for discriminating 1-mimute epochs of OSA segments from the normal segments, and to discriminate patients from the normal subjects . Only 100 subjects (50 normal subjects and 50 OSA patients) were recruited in this study, lacking the representativeness of the study samples. The best accuracies achieved for the SVM classifiers designed based on the oxygen saturation features are 80.10% (Sensitivity: 60.9%; Specificity: 94.1%) and 95% (Sensitivity: 100%; Specificity: 90.2%) for epoch classification and subject classification, respectively. No AUC value was provided for both classifications.
Schlotthauer et al.  adopted a novel method based on the empirical mode decomposition to detect oxyhemoglobin desaturation events from the recorded pulse oximetry signals. ODI was then calculated from the oxyhemoglobin desaturation events to detect moderate OSA syndrome with a cutoff ODI value of 18.521, achieving a sensitivity, specificity, and AUC of 0.851, 0.853, and 0.923, respectively. Its diagnostic performance is slightly worse than the predictor ODI3 and the SVM model designed using combined ODI parameters, i.e., ODI2 and ODI4A, but is similar to the single-variable predictors, ODI2, ODI4T, and ODI4A (Table 4).
As shown in Table 6, diagnostic performance of SVM models embedding different kernels used for the diagnosis of severe OSA patients is compared. SVM models embedding RBF kernels present slightly better diagnostic performance than models designed with linear or polynomial kernels.
Effect of less TST
By considering the TST, as depicted in Table 5, age, ODI, and AHI exhibit significant difference between patients with TST less than 4 hours and those with TST more than 4 hours. It indicates that older subjects and more severe OSA patients with higher ODI and AHI tend to have less sleeping time. There are two possible explanations of this finding: (1) some severe patients tend to sleep less because of frequent apnea or hypopnea occurrences and (2) data recorded from subjects who do not have enough sleeping time might not be reliable for further analysis. Regarding the first possibility, the frequent occurrences of apnea or hypopnea after having fallen asleep induces insomnia for severe patients. The environmental change might also be the reason for causing insomnia . By observing the latency time in Table 5, significant difference (p < 0.001) can be found between two groups. Subjects with less sleeping time demonstrate greater latency. Although arousal count for TST less than 4 hours group is significantly smaller than those with TST more than 4 hours group (p < 0.001); however, the arousal index is significantly greater (p < 0.01). We suggest that subjects who were diagnosed as normal but didn’t take enough sleeping time in the sleeping center might be caused by environmental change and should have another PSG examination to confirm OSA diagnosis. By using oximeter to test OSA at home may eliminate such variation.
With regard to the second possibility, the data collected from the 76 (12.34%) subjects who had TST less than 4 hours were removed, resulting in a total of 540 subjects. To compare the accuracy in the discrimination of 4 different groups of subjects (Table 2), a greater increase in detecting mild (62.99% vs 70.25%) patients can be observed, while it is only small difference in accuracy regarding detection of normal (73.61% vs 75.00%), moderate (41.67% vs 46.49%), and severe (88.07% vs 87.76%) groups. Moreover, as shown in Tables 3 and 4, there is no significant difference in diagnostic performance when diagnosing severe patients or moderate and severe patients. The effect of less sleeping time needs to be further investigated.
Unlike PSG and single-lead ECG, the limitation of oximetry measurement is that it is unable to score sleep quality . Standard PSG scores sleep quality based on EEG signal analysis by grading the sleep quality into 4 stages of continuum of depth during non-REM sleep. Thomas and colleagues reported that sleep could be identified as wake/REM, cyclic alternating pattern (CAP), and non-CAP based on the Fourier analysis of R-R interval series and its associated ECG-derived respiration (EDR) signal . The main advantage of using a combination of oximetry parameters as the predictor is its great sensitivity (>90%) in the diagnosis of severe OSA patients with cheaper price compared to PSG.
It was reported that study subjects selected based on whether they has been referred for the index test instead of clinical symptoms tended to have lower accuracy, whereas studies with nonconsecutive inclusion of subjects, retrospective selection of data, and discrimination between healthy subjects and severe patients had significantly higher accuracy . Although our data were collected consecutively from Jan. 2004 to Dec. 2015 and the moderate and severe OSA patients were discriminated with a clear threshold of AHI = 15 and AHI = 30, respectively, expecting not to overestimate the diagnostic accuracy, several limitations must be considered when interpreting the findings. First, this is a tertiary referral hospital and the subjects were recruited from the outpatients with suspicious OSA. Among the 616 subjects studied, only 72 were verified as normal accounting to only 11.69% of the total subjects recruited, which was much less than the moderate (21.43%) and severe (46.27%) patients. Moreover, the study subjects selected based on the referred PSG examinations instead of clinical symptoms tended to underestimate the diagnostic accuracy . Second, all events found in PSG were verified by the technicians working in the sleep center. The variation occurred among different technicians cannot be avoided and neglected since each one has his/her subjective judgment. Therefore, designing an objective computer-assisted system to eliminate the loads and subjective opinions of individual technicians is warranted. Third, the oximetry is part of the gold standard PSG examination, which reduces the quality of the study and can artificially increase the diagnostic accuracy using the parameters derived from the measurement of home-style oximetry for its liability to induce errors and to couple noise artifacts . Finally, due to the design of this study limiting the applicability of the results at home, valid conclusions cannot be drawn about the accuracy of the oximetry to detect or exclude OSA patients outside the sleep laboratory.
Overnight pulse oximetry provides satisfactory diagnostic performance in detecting severe OSA patients. Home-styled oximetry may provide a tool for the diagnosis of severe OSA patients. In addition to ODI parameters (ODI2 and ODI4A), other variables derived from the oximetry, such as heart rate variability (HRV) may also be considered to be used for designing computer-assisted diagnostic system to improve the diagnostic performance.
Pang KP, Terris DJ, Podolsky R. Severity of obstructive sleep apnea: correlation with clinical examination and patient perception. Otolaryngol Head Neck Surg. 2006;135:555–60.
Somers VK, White DP, Amin R, Abraham WT, Costa F, Culebras A, et al. Sleep apnea and cardiovascular disease: an american heart association/american college of cardiology foundation scientific statement from the american heart association council for high blood pressure research professional education committee, council on clinical cardiology, stroke council, and council on cardiovascular nursing. In collaboration with the national heart, lung, and blood institute national center on sleep disorders research (national institutes of health). Circulation. 2008;118:1080–111.
Kasai T, Floras JS, Bradley TD. Sleep apnea and cardiovascular disease: a bidirectional relationship. Circulation. 2012;126:1495–510.
Young T, Palta M, Dempsey J, Skatrud J, Weber S, Badr S. The occurrence of sleep-disordered breathing among middle-aged adults. N Engl J Med. 1993;328:1230–5.
Young T, Evans L, Finn L, Palta M. Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle-aged men and women. Sleep. 1997;20:705–6.
Hartenbaum N, Collop N, Rosen IM, Phillips B, George CF, Rowley JA, et al. Sleep apnea and commercial motor vehicle operators: statement from the joint task force of the american college of chest physicians, american college of occupational and environmental medicine, and the national sleep foundation. J Occup Environ Med. 2006;48:S4–37.
Lam JC, Lam B, Lam CL, Fong D, Wang JK, Tse HF, et al. Obstructive sleep apnea and the metabolic syndrome in community-based Chinese adults in Hong Kong. Respir Med. 2006;100:980–7.
Lal C, Strange C, Bachman D. Neurocognitive impairment in obstructive sleep apnea. Chest. 2012;141:1601–10.
Gosselin N, Mathieu A, Mazza S, Petit D, Malo J, Montplaisir J. Attentional deficits in patients with obstructive sleep apnea syndrome: an event-related potential study. Clin Neurophysiol. 2006;117:2228–35.
Teloken PE, Smith EB, Lodowsky C, Freedom T, Mulhall JP. Defining association between sleep apnea syndrome and erectile dysfunction. Urology. 2006;67:1033–7.
Pang KP, Terris DJ. Screening for obstructive sleep apnea: an evidence-based analysis. Am J Otolaryngol. 2006;27:112–8.
Flemons WW, Littner MR, Rowley JA, Gay P, Anderson WM, Hudgel DW, et al. Home diagnosis of sleep apnea: a systematic review of the literature. An evidence review cosponsored by the american academy of sleep medicine, the american college of chest physicians, and the american thoracic society. Chest. 2003;124:1543–79.
Netzer N, Eliasson AH, Netzer C, Kristo DA. Overnight pulse oximetry for sleep-disordered breathing in adults: a review. Chest. 2001;120:625–33.
Choi S, Bennett LS, Mullins R, Davies RJ, Stradling JR. Which derivative from overnight oximetry best predicts symptomatic response to nasal continuous positive airway pressure in patients with obstructive sleep apnoea? Respir Med. 2000;94:895–9.
Teramoto S, Matsuse T, Fukuchi Y. Clinical significance of nocturnal oximeter monitoring for detection of sleep apnea syndrome in the elderly. Sleep Med. 2002;3:67–71.
Vazquez J, Tsai W, Flemons W, Masuda A, Brant R, Hajduk E, et al. Automated analysis of digital oximetry in the diagnosis of obstructive sleep apnoea. Thorax. 2000;55:302–7.
Kirk VG, Bohn SG, Flemons WW, Remmers JE. Comparison of home oximetry monitoring with laboratory polysomnography in children. Chest. 2003;124:1702–8.
Lewis CA, Eaton TE, Fergusson W, Whyte KF, Garrett JE, Kolbe J. Home overnight pulse oximetry in patients with COPD: more than one recording may be needed. Chest. 2003;123:1127–33.
Brouillette RT, Morielli A, Leimanis A, Waters KA, Luciano R, Ducharme FM. Nocturnal pulse oximetry as an abbreviated testing modality for pediatric obstructive sleep apnea. Pediatrics. 2000;105:405–12.
Zheng H, Sowers M, Buysse DJ, Consens F, Kravitz HM, Matthews KA, et al. Sources of variability in epidemiological studies of sleep using repeated nights of in-home polysomnography: SWAN sleep study. J Clin Sleep Med JCSM Off Publ Am Academy Sleep Med. 2012;8:87–96.
Ahmadi N, Shapiro GK, Chung SA, Shapiro CM. Clinical diagnosis of sleep apnea based on single night of polysomnography vs. two nights of polysomnography. Sleep Breath. 2009;13:221–6.
Newell J, Mairesse O, Verbanck P, Neu D. Is a one-night stay in the lab really enough to conclude? First-night effect and night-to-night variability in polysomnographic recordings among different clinical population samples. Psychiatry Res. 2012;200:795–801.
Al-Angari HM, Sahakian AV. Automated recognition of obstructive sleep apnea syndrome using support vector machine classifier. IEEE Transactions Information Technol Biomed Publ IEEE Engineer Med Biol Soc. 2012;16:463–8.
Marcos JV, Hornero R, Alvarez D, del Campo F, Zamarron C. Assessment of four statistical pattern recognition techniques to assist in obstructive sleep apnoea diagnosis from nocturnal oximetry. Med Eng Phys. 2009;31:971–8.
Magalang UJ, Dmochowski J, Veeramachaneni S, Draw A, Mador MJ, El-Solh A, et al. Prediction of the apnea-hypopnea index from overnight pulse oximetry. Chest. 2003;124:1694–701.
Berry RB, Budhiraja R, Gottlieb DJ, Gozal D, Iber C, Kapur VK, et al. Rules for scoring respiratory events in sleep: update of the 2007 AASM manual for the scoring of sleep and associated events. Deliberations of the sleep apnea definitions task force of the american academy of sleep medicine. J Clin Sleep Med JCSM Off Publ Am Academy Sleep Med. 2012;8:597–619.
Iber C, Ancoli-Israel S, Chesson ALJ, Quan SF. The AASM manual for the scoring sleep and associated events: rules, terminology and technical specifications, 1st ed. Westchester, Illinois: American Academy of Sleep Medicine; 2007.
Gyulay S, Olson LG, Hensley MJ, King MT, Allen KM, Saunders NA. A comparison of clinical assessment and home oximetry in the diagnosis of obstructive sleep apnea. Am Rev Respir Dis. 1993;147:50–3.
Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, et al. Knowledge-based analysis of microarray gene expression data using support vector machines. PNAS. 2000;97:262–7.
DeCoste D, Schuolkopf B. Training invariant support vector machines. Mach Learn. 2002;46:161–90.
LeCun Y, Jackel L, Bottou L, Brunot A, Cortes C, Denker J, et al. Comparison of learning algorithms for handwritten digit recognition. In International Conference on Artificial Neural Networks. Fogelman F, Gallinari P, editors. 1995:53–60.
Lin HH, Hsu JC, Hsu YN, Pan RH, Chen YF, Tsang LY. Disulfide connectivity prediction based on structural information without a prior knowledge of the bonding state of cysteines. Computers in Biology and Medicine. 2013;43:1941–8.
Zhang J, Wang Y, Dong Y, Wang Y. Computer-aided diagnosis of cervical lymph nodes on ultrasonography. Comput Biol Med. 2008;38:234–43.
Zhi L, Zhang D, Yan J, Li Q, Tang Q. Classification of hyperspectral medical tongue images for tongue diagnosis. Comput Med Imaging Graph. 2007;31:672–8.
Eom JH, Kim SC, Zhang BT. AptaCDSS-E: A classifier ensemble-based clinical decision support system for cardiovascular disease level prediction. Expert Systems Appl. 2008;34:2465–79.
Polat K, Gunes S. Breast cancer diagnosis using least square support vector machine. Digital Signal Process. 2007;17:694–701.
Hsu JC, Chen YF, Chung WS, Tan TH, Chen T, Chiang JY. Clinical verification of a clinical decision support system for ventilator weaning. Biomed Eng Online. 2013;12 Suppl 1:S4.
Goncalves MA, Paiva T, Ramos E, Guilleminault C. Obstructive sleep apnea syndrome, sleepiness, and quality of life. Chest. 2004;125:2091–6.
Dixon JB, Schachter LM, O’Brien PE. Predicting sleep apnea and excessive day sleepiness in the severely obese: indicators for polysomnography. Chest. 2003;123:1134–41.
Khoo SM, Tan WC, Ng TP, Ho CH. Risk factors associated with habitual snoring and sleep-disordered breathing in a multi-ethnic Asian population: a population-based study. Respir Med. 2004;98:557–66.
Lin CL, Yeh C, Yen CW, Hsu WH, Hang LW. Comparison of the indices of oxyhemoglobin saturation by pulse oximetry in obstructive sleep apnea hypopnea syndrome. Chest. 2009;135:86–93.
Chen YF, Hang LW, Huang CS, Liang SJ, Chung WS. Polysomnographic predictors of persistent continuous positive airway pressure adherence in patients with moderate and severe obstructive sleep apnea. Kaohsiung J Med Sci. 2015;31:83–9.
Schlotthauer G, Di Persia LE, Larrateguy LD, Milone DH. Screening of obstructive sleep apnea with empirical mode decomposition of pulse oximetry. Med Eng Phys. 2014;36:1074–80.
Doi Y, Minowa M, Tango T. Impact and correlates of poor sleep quality in Japanese white-collar employees. Sleep. 2003;26:467–71.
Thomas RJ, Mietus JE, Peng CK, Goldberger AL. An electrocardiogram-based technique to assess cardiopulmonary coupling during sleep. Sleep. 2005;28:1151–61.
Rutjes AW, Reitsma JB, Di Nisio M, Smidt N, van Rijn JC, Bossuyt PM. Evidence of bias and variation in diagnostic accuracy studies. CMAJ Canadian Med Assoc J J de l’Assoc Medicale Canadienne. 2006;174:469–76.
Abd Sukor J, Mohktar MS, Redmond SJ, Lovell NH. Signal quality measures on pulse oximetry and blood pressure signals acquired from self-measurement in a home environment. IEEE J Biomed Health Inform. 2015;19:102–8.
This study was supported in part by Ministry of Science and Technology, Taiwan under grant no. NSC98-2410-H-039-003-MY2 and Taichung Hospital, Ministry of Health and Welfare under grant No. FL1030505007-2.
The authors declare that they do not have competing interests.
These authors’ individual contributions were as follows. Conception and design: LWH, JHC, WSC, and YFC. Administrative support: LWH, HLW, and HHL. Collection and assembly of data: All authors. Data analysis and interpretation: WSC and YFC. Manuscript writing: All authors. Final approval of manuscript: All authors.
About this article
Cite this article
Hang, LW., Wang, HL., Chen, JH. et al. Validation of overnight oximetry to diagnose patients with moderate to severe obstructive sleep apnea. BMC Pulm Med 15, 24 (2015). https://doi.org/10.1186/s12890-015-0017-z