Development and validation of a simple-to-use clinical nomogram for predicting obstructive sleep apnea

Background The high cost and low availability of polysomnography (PSG) limit the timely diagnosis of obstructive sleep apnea (OSA). Herein, we developed and validated a simple-to-use nomogram for predicting OSA. Methods We collected and analyzed cross-sectional data from 4162 participants with suspected OSA who were seen at our sleep center between 2007 and 2016. Demographic, biochemical, and anthropometric data, as well as sleep parameters, were obtained. A least absolute shrinkage and selection operator (LASSO) regression model was used to reduce data dimensionality, select factors, and construct the nomogram. The performance of the nomogram was assessed in terms of calibration and discrimination. Internal validation was also performed. Results The LASSO regression analysis identified age, sex, body mass index, neck circumference, waist circumference, glucose, insulin, and apolipoprotein B as significant predictive factors of OSA. Our nomogram showed good discrimination and calibration in predicting OSA, with a C-index of 0.839 according to the internal validation. Discrimination and calibration in the validation group were also good (C-index = 0.820). The nomogram identified individuals at risk for OSA with an area under the curve (AUC) of 0.84 [95% confidence interval (CI), 0.83–0.86]. Conclusions Our simple-to-use nomogram is not intended to replace standard PSG, but will help physicians make better decisions about arranging PSG for patients referred to a sleep center. Electronic supplementary material The online version of this article (10.1186/s12890-019-0782-1) contains supplementary material, which is available to authorized users.


Background
Obstructive sleep apnea (OSA) is one of the most common types of sleep-disordered breathing. One notable characteristic of OSA is recurrent episodes of partial or complete pharyngeal collapse during sleep, with subsequently reduced oronasal airflow or even cessation of breathing [1]. OSA is closely associated with impairments in daily activities and social functioning due to daytime sleepiness and fatigue, which are in turn caused by sleep fragmentation, and with clinical sequelae such as hypertension, diabetes, dyslipidemia, cognitive dysfunction, cardiovascular events and even all-cause mortality [2][3][4][5][6][7]. The prevalence of OSA is high and has increased with the obesity epidemic according to data gathered since the 1990s [8][9][10]; however, most cases remain undiagnosed and thus the disease is undertreated, resulting in a high social and economic burden.
Although awareness of the prevalence of OSA and its clinical sequelae has increased recently, data from western countries indicate that the wait time for polysomnography (PSG) ranges from 2 to 60 months [11]; thus, limited access to PSG, which also involves significant expense, remains a major issue. Chinese patients with suspected OSA also face delays before diagnosis and treatment because PSG laboratories are not readily available. A delayed diagnosis in turn results in delayed OSA treatment and, consequently, more comorbidities. Thus, a simple-to-use and reliable method to identify and triage patients at high risk for OSA is urgently needed.
Well-designed questionnaires [i.e., the STOP-Bang questionnaire (SBQ), STOP questionnaire (STOP), Epworth Sleepiness Scale (ESS) and the Berlin questionnaire (BQ)] have been developed as substitute methods for diagnosing OSA in the absence of standard PSG. However, the results have been unsatisfactory. In a recent meta-analysis, pooled specificity was low, ranging from 42 to 65% [12]. Furthermore, questionnaires are susceptible to bias, and symptoms such as snoring are not amenable to self-report because they occur during sleep. Uncertainty remains about the accuracy and clinical utility of these screening tools [13]. An optimized diagnostic tool that combines multiple objective clinical biomarkers to avoid bias has yet to be developed.
No study has determined whether using a combination of objective, clinical biomarkers confers a good diagnostic ability for OSA. Therefore, the aim of our study was to develop and validate a simple-to-use nomogram that incorporated objective demographic, biochemical, and anthropometric parameters to determine whether this nomogram minimized the number of missed OSA diagnoses.

Methods
Our study used a cross-sectional design, was performed in accordance with the Declaration of Helsinki and its amendments, and was approved by the Ethics Committee of Shanghai Jiaotong University Affiliated Sixth People's Hospital, Shanghai, China. Written informed consent was obtained from each subject before study enrollment.

Study population
Subjects who complained of snoring or other symptoms of OSA (such as daytime sleepiness) and who were referred to our sleep center between 2007 and 2016 were consecutively enrolled. All subjects completed questionnaires that captured their medical history and health status; the questionnaires were collected and checked by two independent investigators. The exclusion criteria were age less than 18 years; serious systemic disease (i.e., congestive heart failure, severe intrinsic pulmonary disease, chronic kidney disease or hepatic disease); pregnancy; use of anti-diabetic or lipid-lowering drugs; and previous OSA treatment. Subjects with missing clinical data were also excluded from the final analysis by agreement among all authors.

Data collection and analysis

Anthropometric measurements
Height and weight were measured with the subjects in light clothing and bare feet, as previously described [14]. Weight was measured using an electronic scale. Height was measured from the feet to the top of the head, as the maximum distance with the subject standing up straight. Body mass index (BMI) was defined as weight (kg) divided by height in meters squared (kg/m²). Neck circumference (NC) was measured at the level of the cricothyroid membrane; hip circumference (HC) was measured as the maximum girth at the greater trochanters; waist circumference (WC) was measured midway between the lower costal margin and the iliac crest. All of the anthropometric data mentioned above were recorded twice.

Biochemical measurements
A fasting venous blood sample was drawn from each subject at 7 AM after PSG monitoring. The serum lipid profile, including total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), apolipoprotein A-I (ApoA-I), apolipoprotein B (ApoB), apolipoprotein E, and lipoprotein (a) [Lp(a)], was measured using routine procedures at our hospital laboratory, as previously described [14]. Serum glucose levels were measured using an autoanalyzer (Hitachi, Tokyo, Japan), and serum insulin levels were measured using an immunoradiological method [15].

Sleep evaluation
Sleep was evaluated objectively by standard PSG (Alice 4 or 5; Respironics, Pittsburgh, PA, USA) according to the American Academy of Sleep Medicine (AASM) 2007 guidelines [16]. A posture and snoring sensor was fitted, and an electroencephalogram, bilateral electrooculogram, modified lead II electrocardiogram, bipolar chin electromyogram, oral airflow, nasal airflow (with nasal cannula), pulse oximetry, and thoracic and abdominal respiratory effort were recorded. The sleep recordings were staged automatically and then checked manually by a skilled technician.
Apnea was defined as a reduction in oronasal airflow of at least 90% relative to baseline lasting ≥10 s. Hypopnea was defined as an airflow reduction of at least 50% for at least 10 s, accompanied by either a decrease in oxyhemoglobin saturation of at least 3% or termination by an arousal. Arousal was defined as an abrupt shift in electroencephalographic (EEG) frequency lasting ≥3 s [16]. The apnea-hypopnea index (AHI) was defined as the number of apnea and hypopnea events per hour of sleep. The oxygen desaturation index was defined as the total number of episodes of oxyhemoglobin desaturation ≥3% per hour of sleep. The micro-arousal index was defined as the average number of arousals per hour of sleep. OSA was diagnosed when the AHI was ≥5 events per hour, and was classified as mild (5–15), moderate (15–30), or severe (≥30) [16].
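The AHI definition and severity bands above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the handling of values that fall exactly on a boundary (15 counted as moderate, 30 as severe) is an assumption based on common practice, since the ranges in the text share their endpoints.

```python
def apnea_hypopnea_index(n_apneas: int, n_hypopneas: int, sleep_hours: float) -> float:
    """Number of apnea plus hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (n_apneas + n_hypopneas) / sleep_hours


def classify_osa(ahi: float) -> str:
    """Map an AHI value to a severity label.

    Assumption: boundaries are half-open, i.e. 5 <= AHI < 15 is mild,
    15 <= AHI < 30 is moderate, and AHI >= 30 is severe.
    """
    if ahi < 5:
        return "non-OSA"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


# Illustrative event counts (not data from the study).
ahi = apnea_hypopnea_index(n_apneas=120, n_hypopneas=84, sleep_hours=6.0)
print(ahi, classify_osa(ahi))
```

With these illustrative counts, 204 events over 6 h of sleep give an AHI of 34 events per hour, which falls in the severe band.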

Statistical methodology
The statistical analysis was conducted using R (ver. 3.0.1, Vienna, Austria) and MedCalc software (ver. 12.7.3, Ostend, Belgium). The glmnet package in R was used for the least absolute shrinkage and selection operator (LASSO) logistic regression, and the significant factors therein were used to construct the nomogram [17]; for details, see Additional file 1. Validation was conducted using 1000 bootstrap resamples. Calibration diagrams were established as previously described [18]. To evaluate the predictive or discriminatory ability of the model, an index of probability of concordance (C-index) between predicted and actual outcomes was calculated [19].
The C-index ranges from 0.5 to 1.0, where 0.5 corresponds to random chance and 1.0 denotes perfect discrimination [19]. The LASSO feature regression model was used to distinguish OSA from non-OSA cases. The area under the curve (AUC) in receiver operating characteristic (ROC) analysis was used to evaluate predictive accuracy. Two-sided p-values < 0.05 were considered statistically significant.
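For a binary outcome such as OSA versus non-OSA, the C-index equals the AUC: it is the proportion of all (OSA, non-OSA) pairs in which the OSA subject receives the higher predicted score, with ties counted as one half. The following Python sketch shows this pairwise definition; the function name and the example scores are illustrative assumptions, not the authors' R code or data.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome (equivalent to the ROC AUC).

    scores: predicted risk scores, higher meaning more likely OSA.
    labels: 1 for OSA, 0 for non-OSA.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")

    concordant = 0.0
    for s_pos in positives:
        for s_neg in negatives:
            if s_pos > s_neg:
                concordant += 1.0   # correctly ordered pair
            elif s_pos == s_neg:
                concordant += 0.5   # tied pair counts as half
    return concordant / (len(positives) * len(negatives))


# Made-up scores: three of the four (OSA, non-OSA) pairs are correctly ordered.
print(c_index([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```

This brute-force version is quadratic in the sample size, which is adequate for illustration; rank-based formulas give the same value more efficiently.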

Demographic characteristics of the study subjects
A total of 4162 consecutive participants who underwent full-night standard PSG were included in our study. Using a random number table, we randomly assigned 2913 subjects (70%) to the training group and the remaining 1249 (30%) to the validation group. The demographic characteristics of the training and validation groups are summarized in Table 1. No obvious differences in demographic characteristics were found between the two OSA groups or between the two non-OSA groups. Compared with the non-OSA groups, patients in both OSA groups were older, more obese, and had poorer metabolic profiles and sleep variables, with the exception of ApoA-I and Lp(a) in the validation group (Table 1).
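A 70/30 random split of this kind can be sketched in a few lines of Python. This is only an illustration: the study used a random number table, whereas the sketch substitutes a seeded pseudo-random shuffle, and the function name, seed, and integer subject IDs are assumptions.

```python
import random


def split_train_validation(subject_ids, train_fraction=0.7, seed=2016):
    """Shuffle subject IDs and split them into training and validation lists."""
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed (arbitrary) so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs 0..4161 stand in for the 4162 participants.
train_ids, validation_ids = split_train_validation(range(4162))
print(len(train_ids), len(validation_ids))  # 2913 1249
```

Rounding 70% of 4162 reproduces the group sizes reported above (2913 and 1249).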

Factor selection for the predictive model
The LASSO method is suitable for regression of high-dimensional data, as it can be used to extract the most important predictive factors from a primary dataset. In this study, a risk score was calculated for each subject as a linear combination of the selected factors weighted by their coefficients. Eighteen variables were reduced to eight potential predictors using the LASSO regression model. A coefficient profile plot, showing the path of the coefficients across log-transformed lambda values, is shown in Fig. 1a, and a cross-validated error plot of the LASSO regression model is shown in Fig. 1b. The most regularized and parsimonious model with a cross-validated error within 1 standard error of the minimum included eight variables. The model incorporated eight independent predictors (age, sex, glucose, ApoB, insulin, BMI, NC, and WC) and was developed into a simple-to-use nomogram (Fig. 2).
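The two ideas in this paragraph, choosing the most regularized model whose cross-validated error lies within 1 standard error of the minimum and scoring a subject as a weighted sum of predictors, can be sketched in Python as follows. The function names are hypothetical, and the lambda grid, error values, and coefficients are made-up numbers for illustration only; they are not the values fitted in this study.

```python
def lambda_one_se(lambdas, cv_mean_error, cv_standard_error):
    """Largest lambda whose mean CV error is within 1 SE of the minimum error."""
    i_min = min(range(len(cv_mean_error)), key=lambda i: cv_mean_error[i])
    threshold = cv_mean_error[i_min] + cv_standard_error[i_min]
    eligible = [lam for lam, err in zip(lambdas, cv_mean_error) if err <= threshold]
    return max(eligible)  # larger lambda means stronger shrinkage, fewer predictors


def linear_score(coefficients, values, intercept=0.0):
    """Risk score as a linear combination of predictors weighted by coefficients."""
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)


# Illustrative cross-validation results (not from the study).
lambdas = [0.001, 0.01, 0.05, 0.1]
cv_mean = [0.81, 0.78, 0.79, 0.85]
cv_se = [0.02, 0.02, 0.02, 0.02]
print(lambda_one_se(lambdas, cv_mean, cv_se))  # 0.05

# Illustrative coefficients for two predictors only (not the fitted model).
print(linear_score({"age": 0.5, "bmi": 2.0}, {"age": 40, "bmi": 25}))  # 70.0
```

In the example, the minimum error (0.78) occurs at lambda = 0.01, but lambda = 0.05 is also within one standard error (0.79 ≤ 0.80), so the sparser model is preferred.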

Validation of the nomogram
Validation of the nomogram was performed with 1000 bootstrap resamples. In terms of the prediction of OSA, the C-index for the nomogram was 0.839 in the training group. In the validation group, the C-index was 0.820, which exceeded 0.7 and thus indicated that the model is suitable and sufficiently accurate for predicting OSA. The calibration plots demonstrated excellent agreement between observed and predicted OSA in both the training (Fig. 3a) and validation groups (Fig. 3b).
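As a rough illustration of bootstrap resampling applied to the C-index, the Python sketch below draws 1000 resamples with replacement and reports a percentile interval. This is an assumption about one simple way to use the bootstrap; the text does not specify the exact procedure (for example, whether an optimism correction was applied), and all names and data here are hypothetical.

```python
import random


def c_index(scores, labels):
    """Pairwise concordance for a binary outcome (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def bootstrap_c_index(scores, labels, n_resamples=1000, seed=0):
    """Return (point estimate, lower, upper) using a 95% percentile interval."""
    rng = random.Random(seed)
    n = len(scores)
    replicates = []
    while len(replicates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample subjects with replacement
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # redraw if a resample lacks one of the two classes
        sample_scores = [scores[i] for i in idx]
        replicates.append(c_index(sample_scores, sample_labels))
    replicates.sort()
    lower = replicates[(25 * n_resamples) // 1000]
    upper = replicates[max((975 * n_resamples) // 1000 - 1, 0)]
    return c_index(scores, labels), lower, upper


# Made-up scores and outcomes for 12 hypothetical subjects.
scores = [0.2, 0.3, 0.35, 0.5, 0.55, 0.6, 0.4, 0.65, 0.7, 0.8, 0.85, 0.9]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(bootstrap_c_index(scores, labels))
```

The interval width reflects sampling variability of the C-index; with a sample as large as the one in this study, the interval would be much narrower than in this toy example.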

LASSO feature regression model
We established a LASSO feature regression model to visualize the differences between OSA and non-OSA directly. The cutoff point to distinguish OSA from non-OSA was 137.2 (calculated from the nomogram). Each patient's nomogram score was standardized using the following formula: (nomogram score − 137.2)/standard deviation. The y-axis represents the standardized value; the x-axis represents each patient (green bars are non-OSA and red bars are OSA). Our study included more OSA participants than non-OSA participants, and plotting all of them caused the colored bars for subjects with OSA to accumulate. We therefore randomly selected 537 OSA patients and 537 non-OSA patients to establish the LASSO feature regression model (Fig. 4). We also used this nomogram to distinguish OSA from non-OSA, moderate-to-severe OSA from non-moderate-to-severe OSA, and severe OSA from non-severe OSA using AHI cutoffs of 5, 15 and 30 events/hour, respectively. The ROC curves showed that the optimum diagnostic cutoff  Table 2).
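The standardization formula given above, (nomogram score − 137.2)/standard deviation, can be written as a short Python sketch. Two points are assumptions rather than statements from the text: the sample standard deviation is used because the text does not say which form was applied, and scores above the cutoff are taken to indicate OSA. The example scores are made up.

```python
from statistics import stdev

CUTOFF = 137.2  # nomogram cutoff reported in the text


def standardize_scores(nomogram_scores, cutoff=CUTOFF):
    """Center scores on the cutoff and scale by their standard deviation."""
    sd = stdev(nomogram_scores)  # assumption: sample SD of the plotted subjects
    return [(score - cutoff) / sd for score in nomogram_scores]


def predicted_osa(standardized_score):
    """Assumption: a positive standardized score (above the cutoff) suggests OSA."""
    return standardized_score > 0


# Made-up nomogram scores for three hypothetical subjects.
z = standardize_scores([127.2, 137.2, 147.2])
print([round(value, 2) for value in z])
print([predicted_osa(value) for value in z])
```

Centering on the cutoff rather than on the mean places the decision boundary at zero, so bars above and below the x-axis correspond directly to predicted OSA and predicted non-OSA.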

An example for nomogram usage
To help new users understand how to use this nomogram, we take one patient with severe OSA (AHI = 43.5 events per hour) as an example (  to our sleep center, 96% risk to be OSA. Such a patient would be scheduled for PSG monitoring earlier.

Discussion
In this study, we developed and validated an easy-to-use nomogram as a new approach to diagnosing OSA in a large clinical sample. The nomogram incorporates eight items: age, sex, glucose, ApoB, insulin, BMI, NC, and WC. To our knowledge, this is the first study to establish an objective model, including common demographic, anthropometric, and biochemical variables, to distinguish OSA from non-OSA. The nomogram showed good accuracy and discrimination. In total, 18 candidate variables were considered for construction of the nomogram, and these were reduced to eight potential predictors using the LASSO regression method. LASSO is suitable for analyzing large sets of clinical factors and helps to avoid overfitting [20]. Our nomogram suggested that obesity and the presence of a glucose metabolic disorder may be good predictors of OSA. Furthermore, the nomogram may serve as a useful tool for identifying patients at high risk for OSA. Thus, therapeutic decisions would be better informed and the likelihood of early intervention for high-risk patients would be increased, particularly in clinics lacking standard PSG equipment.
Luo et al. also established a nomogram, which encompassed many subjective variables, using an ordinal logistic regression procedure [21]. They found that the discrimination accuracies of their nomogram for non-OSA, moderate-severe OSA, and severe OSA were 83.8, 79.9, and 80.5%, respectively. Our model had slightly lower sensitivities but a similar AUC (approximately 0.8). The biggest difference from that study is that ours used objective parameters, which avoids the bias introduced by questionnaires that rely on patient recall. In addition, our study had other advantages, such as the use of LASSO regression to analyze clinical factors, a sample roughly 10-fold larger than that of the previous study, and the inclusion of a validation group. Other studies have also used clinical nomograms to calculate the AHI and even to estimate median survival time or event-free survival [7,22]. We will also extend the use of nomograms in OSA in further prospective studies.

Fig. 4 The LASSO feature regression. Standardized total score for each participant in the training group. Green bars represent scores for subjects without OSA, and red bars represent scores for those with OSA

OSA remains underdiagnosed and undertreated in clinical settings despite the substantial burden that it confers. Thus, several questionnaire-based prediction tools for OSA have been developed and validated. For example, in a recent meta-analysis, the sensitivity (specificity) of the BQ, SBQ, STOP, and ESS for detecting mild OSA was 76% (59%), 88% (42%), 87% (42%), and 54% (65%), respectively; for moderate OSA, the corresponding values were 77% (44%), 90% (36%), 89% (32%), and 47% (62%); and for severe OSA they were 84% (38%), 93% (35%), 90% (28%), and 58% (60%) [12]. Although the SBQ seems to be more accurate than the other three questionnaires for screening OSA, its relatively low specificity limits its clinical utility [12].
One study assessed the ability of a simple, two-part questionnaire to predict OSA in a clinical setting; it showed high sensitivity (96.6%) but relatively low specificity (40.4%) [23]. However, the sample size (128 patients) in that study was relatively small. Shah et al. established a predictive model for sleep apnea that included age, BMI, snoring, and sex, and had a sensitivity of 0.77 and specificity of 0.75 [24]. However, a portable PSG rather than standard PSG was used in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) study, which may therefore have underestimated the severity of sleep apnea and also did not assess sleep stage, duration, or fragmentation (as noted by the authors) [24]. The NoSAS score, a new screening tool that encompasses NC, BMI, snoring, age, and sex, was developed and validated in the HypnoLaus and EPISONO sleep cohort studies [25]. The NoSAS score seems marginally superior to the BQ and ESS; however, its sensitivity and specificity are also low. Our objective nomogram seemed to perform better than traditional subjective questionnaires; however, this conclusion needs to be treated with caution because the nomogram has not been validated in other ethnic groups.
Our study was performed according to previously described screening principles (WHO, 1968). We employed a rigorous methodology and an appropriate cross-sectional study design. Although our study had strengths, such as a large sample size and the evaluation of all subjects by standard PSG, certain limitations should also be addressed. First, we only used demographic, anthropometric, and biochemical data to establish the nomogram; however, OSA is also affected by genetic factors, as evidenced by genome-wide association studies, and we did not consider genomic characteristics. Second, although our nomogram was developed in a large sample and was validated both internally and in a held-out validation group, it has not been validated in different ethnic groups and populations. Third, the validation group was derived from the same institution as the training group, which makes it difficult to generalize the results to other populations. Therefore, we plan to externally validate our predictive model at other institutions. Fourth, the prevalence of OSA in our sleep center is higher than that in the general population; this high prevalence may have affected our evaluation of the predictive parameters by inflating their positive predictive value. Fifth, the assessment of biochemical variables raises concerns in terms of money, time, and effort when compared with simple screening questionnaires. Lastly, the use of the AASM 2007 scoring rules was not justified and could also be a potential limitation of our study.

Conclusions
In conclusion, our simple-to-use nomogram, which incorporates demographic, anthropometric, and biochemical parameters, may be a convenient tool for identifying undiagnosed OSA. This intuitive risk assessment tool may be useful for identifying subjects at high risk for OSA in clinical settings.