A cross-sectional study of the identification of prevalent asthma and chronic obstructive pulmonary disease among initiators of long-acting β-agonists in health insurance claims data

Background
Claims data are potentially useful for identifying long-acting β-agonist (LABA) use by patients with asthma, a practice that is associated with increased mortality. We evaluated the accuracy of claims data for classifying prevalent asthma and chronic obstructive pulmonary disease (COPD) among initiators of LABAs.
Methods
This study included adult LABA initiators during 2005–2008 in a US commercial health plan. Diagnosis codes from the 6 months before LABA initiation identified potential asthma or COPD and a physician adjudicated case status using abstracted medical records. We estimated the positive predictive value (PPV) and 95% confidence intervals (CI) of covariate patterns for identifying asthma and COPD.
Results
We sought 520 medical records at random from 225,079 LABA initiators and received 370 (71%). The PPV for at least one asthma claim was 74% (CI 63–82), and decreased as age increased. Having at least one COPD claim resulted in a PPV of 82% (CI 72–89), and of over 90% among older patients, men, and recipients of inhaled anticholinergic drugs. Only 2% (CI 0.2–7.6) of patients with a claim for COPD alone were found to have both COPD and asthma, while 9% (CI 4–16) had asthma only. Twenty-one percent (CI 14–30) of patients with claims for both diagnoses had both conditions. Among patients with no asthma or COPD claims, 62% (CI 50–72) had no confirmed diagnosis and 29% (CI 19–39) had confirmed asthma.
Conclusions
Subsets of patients with asthma, COPD, and both conditions can be identified and differentiated using claims data, although categorization of the remaining patients is infeasible. Safety surveillance for off-label use of LABAs must account for this limitation.


Background
Although long-acting β-agonist (LABA) medications improve lung function in asthma and chronic obstructive pulmonary disease (COPD) by inducing relaxation of bronchial smooth muscle and inhibiting release of mediators of hypersensitivity [1], their use in asthma, at least in the absence of concomitant inhaled corticosteroids (ICS), has been associated with more asthma-related deaths [2]. These findings, though not uniform [3], have led to increased regulatory restrictions on the use of LABA, including the implementation of Risk Evaluation and Mitigation Strategies (REMS) that aim to inform prescribers and patients about these risks [4]. Indeed, some LABAs have only an indication for COPD [5].
Monitoring whether patients who have only asthma use LABAs has the potential to help ensure that prescribers use the products in accordance with known safety data. Physicians should generally avoid prescribing LABA as monotherapy for asthma (an "off-label" use), so that unnecessary harms in patients with asthma are avoided, while ensuring that the product reaches patients with COPD (an "on-label" use), in whom LABA risk-benefit profiles appear acceptable. Health insurance claims data provide an efficient core for this type of surveillance, but the data must accurately characterize LABA users' likely indication for use of the drugs (asthma or COPD).
COPD and asthma are distinct clinical entities, but they share symptoms and treatments. COPD is characterized by non-reversible airflow restriction or obstruction due to chronic bronchitis or emphysema [6]. Asthma involves recurrent airway obstruction, hyper-responsiveness, and inflammation [7]. First onset of asthma generally occurs at a young age, while first onset of COPD typically occurs among patients over 40 years of age.
The similarities in these conditions complicate differentiation of COPD and asthma in claims data. Therefore the objective of this study was to identify patterns of health insurance claims for asthma and COPD among users of LABA (with or without ICS use) that might improve the categorization of individual patients with asthma or COPD. We studied only new users of LABA to quantify the measurement characteristics of the data for identifying the indication for treatment so that these data would directly inform safety surveillance for LABA.

Data sources
The study population came from the Normative Health Information Database, a claims database of a large US commercial health plan (UnitedHealth Care). Diagnoses associated with the claims are recorded using the International Classification of Diseases, 9th revision (ICD-9) coding system. Procedures map to ICD-9, Current Procedural Terminology (CPT), and Healthcare Common Procedure Coding System (HCPCS) codes. National Drug Codes (NDCs) identify medications.

Cohort formation
Eligible patients were initiators or switchers of LABA or LABA and ICS combinations who were at least 20 years old between 01 January 2005 and 31 December 2008 (Figure 1). We included all LABA and ICS products on the U.S. market during the study period. The LABAs were: arformoterol, formoterol, and salmeterol. The ICS were: beclomethasone, budesonide, ciclesonide, flunisolide, fluticasone, mometasone, and triamcinolone.
We defined initiation as a dispensing of a LABA preceded by 6 months of continuous enrollment without a dispensing of the same LABA. Switchers were patients who initiated a new LABA, but used another LABA during the previous 6 months. The study began for each patient at their first eligible LABA dispensing (index date). The 6 months prior to the index date was the period from which we identified diagnosis, procedure, and medication codes that characterized the study population. Patients were considered concomitant users of LABA and ICS upon initiation of a combination LABA/ICS product or if they initiated a LABA and received a dispensing of an ICS on the same day.
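The initiation, switching, and concomitant-use rules above can be sketched in code. This is a minimal illustration rather than the study's actual programming: the `Dispensing` record, its field names, the function name, and the approximation of 6 months as 183 days are all assumptions made for the example, and enrollment is simplified to a single continuous span starting at `enrolled_since`.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

BASELINE_DAYS = 183  # assumption: 6 months approximated as 183 days


@dataclass
class Dispensing:
    """Hypothetical pharmacy record; field names are illustrative."""
    fill_date: date
    laba: Optional[str]  # LABA ingredient (e.g. "salmeterol"), or None
    has_ics: bool        # True for an ICS or a combination LABA/ICS product


def classify_index(index: Dispensing, history: List[Dispensing],
                   enrolled_since: date) -> Optional[dict]:
    """Classify a candidate index LABA dispensing, or return None if ineligible."""
    window_start = index.fill_date - timedelta(days=BASELINE_DAYS)
    # Require a LABA and a full baseline window of enrollment.
    if index.laba is None or enrolled_since > window_start:
        return None
    baseline = [d for d in history
                if window_start <= d.fill_date < index.fill_date]
    # A baseline dispensing of the same LABA means this is not an initiation.
    if any(d.laba == index.laba for d in baseline):
        return None
    # Switchers used a different LABA during the baseline period.
    switcher = any(d.laba is not None for d in baseline)
    # Concomitant use: combination product, or an ICS dispensed the same day.
    same_day_ics = any(d.has_ics and d.fill_date == index.fill_date
                       for d in history)
    return {"switcher": switcher,
            "concomitant_ics": index.has_ics or same_day_ics}
```

Under this sketch, a patient whose only ICS dispensing falls on an earlier day is placed in the monotherapy group, which mirrors the stringent same-day definition.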
Patients with a recent previous dispensing of an ICS were in the monotherapy group. While requiring that patients received the combination formulation of LABA/ICS or concomitant ICS on the same day as the initial LABA dispensing may result in misclassification of individuals who received ICS and LABA on different days, this more stringent definition increases the likelihood that we include only users of concomitant therapy in the combination LABA/ICS group. This preference is important because combination LABA/ICS for asthma appears safer than LABA monotherapy and should not be discouraged. (Empirically, this decision was unimportant. Ninety-six percent of patients initiated on a combination product. Of the 9,965 patients classified as LABA-only initiators, 1,931 [19%] used ICS in the baseline period and were potentially misclassified. This number represents only 0.9% of the full study population, and this misclassification is of little consequence.)

Identification of asthma and COPD
Potential cases of asthma or COPD had ICD-9 diagnosis codes for asthma or COPD in any diagnosis field on claims occurring in the 6 months prior to the index date on an inpatient or outpatient claim (a similar approach requiring 12 months of continuous enrollment yielded similar results). Patients with at least one code for asthma, COPD, both, or neither in the baseline period were eligible for subsequent medical record abstraction. We excluded from consideration certain diagnosis codes that might include COPD, but are mixed with other disorders, including ICD-9 490 (bronchitis, not specified as acute or chronic) [6]. Because hospitalization for asthma is rare relative to the prevalence of the disease, we did not distinguish between inpatient and outpatient claims. We chose to develop our own definition of asthma and COPD despite the existence of algorithms in the literature (e.g., Mapel et al. [8] and Dombkowski et al. [9]), because in our work we aimed to classify baseline diagnoses (indications), whereas most definitions in the literature apply specifically to outcomes or cohort definitions. Definitions of outcomes and covariates are more robust to insensitive measures of disease than case criteria for identification of off-label use.
To verify the diagnoses, we sought a random sample of 130 medical records (with the goal of obtaining 100) from each of the following 4 categories of eligible patients (520 patients in total):
(1) Patients with at least one claim for asthma: ICD-9 493.xx, asthma
(2) Patients with at least one claim for COPD: ICD-9 491.2x, obstructive chronic bronchitis; ICD-9 492.8, other emphysema; ICD-9 496, chronic airway obstruction, not elsewhere classified
(3) Patients with claims for both asthma and COPD
(4) Patients without a claim for either asthma or COPD
We sought a total of 640 records, including alternate records for 120 of the 520 sampled patients. We sought medical records from hospitals and physicians' offices using a 2-step process. First, we created a chronological listing of insurance claims for the sampled patients noting the date of service corresponding to the diagnosis code of interest (asthma or COPD). Patients with claims for both diagnoses had the one closest to the LABA initiation date noted, and those with neither had no claim noted.
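The four claims-based sampling categories can be expressed as a small classifier. The sketch below is illustrative only: the function name is hypothetical, and it assumes baseline ICD-9 codes arrive as dotted strings such as "493.90". Codes outside the listed sets, including the excluded ICD-9 490, fall through to "neither".

```python
def claims_category(icd9_codes):
    """Classify baseline ICD-9 codes as 'asthma', 'COPD', 'both', or 'neither'.

    Assumes dotted string codes (e.g. "493.90"); this format is an assumption.
    """
    def is_asthma(code):
        # ICD-9 493.xx, asthma
        return code.startswith("493")

    def is_copd(code):
        # ICD-9 491.2x, 492.8, and 496 only; 490 is deliberately not matched
        return code.startswith("491.2") or code in ("492.8", "496")

    codes = [c.strip() for c in icd9_codes]
    asthma = any(is_asthma(c) for c in codes)
    copd = any(is_copd(c) for c in codes)
    if asthma and copd:
        return "both"
    if asthma:
        return "asthma"
    if copd:
        return "COPD"
    return "neither"
```

For example, a patient with baseline codes "496" and "493.20" would fall in the "both" category, while one with only "490" would be sampled as "neither".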
These claims listings were reviewed to identify the provider or facility from which the medical record was likely to document the diagnosis. Trained abstractors contacted the provider or facility seeking the medical record. For records with clearly inadequate information (as assessed prior to the adjudication process), we re-contacted providers to request additional information. If additional information was not obtained for these cases, the charts were not adjudicated and did not enter analysis. Incomplete medical records were those without clinical information available (i.e., they contained administrative material only). Records with clinical information, even if incomplete, were forwarded to the adjudicators for review.
A physician external to the study team, blinded to study medications, adjudicated the presence of asthma and COPD using the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria for COPD [10] and the third Expert Panel Report of Guidelines on Asthma [11]. Confirmation required the affirmative listing of the case criteria in the medical records. Additional file 1 includes additional information on the case adjudication process.

Analysis
We initially tabulated characteristics of the study population from the 6 months before LABA initiation, including whether LABA initiators were apparent naïve users of all LABAs, the prevalence of comorbidities, and the intensity of healthcare utilization characteristics, such as outpatient visits, emergency department visits, hospitalizations, and home oxygen use. Next, we empirically identified covariates by listing the 100 most prevalent (top 100) drug classes, top 100 diagnoses (at the 3-digit ICD-9 level) and the top 100 procedures received by LABA users, stratifying by ICS.
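The empirical listing of the most prevalent codes can be sketched as follows. This is a hedged illustration, not the study's code: the function name and the input layout (a mapping from patient identifier to that patient's baseline codes) are assumptions. Each code is counted once per patient so that the result is a prevalence, and diagnoses can be truncated to the 3-digit ICD-9 level.

```python
from collections import Counter


def top_prevalent(patient_codes, n=100, digits=None):
    """Return the n most prevalent codes as (code, prevalence) pairs.

    patient_codes: dict mapping a patient id to an iterable of code strings.
    digits: if given, truncate each code to this many leading characters
            (e.g. 3 for the 3-digit ICD-9 level).
    """
    counts = Counter()
    for codes in patient_codes.values():
        # Count each (possibly truncated) code at most once per patient.
        distinct = {c[:digits] if digits else c for c in codes}
        counts.update(distinct)
    total = len(patient_codes)
    return [(code, k / total) for code, k in counts.most_common(n)]
```

To stratify by ICS use, one would call the function separately on the patients in each stratum.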
Using adjudication results as the gold standard, we then estimated the association between the baseline characteristics and the adjudicated diagnosis, with the aim to distinguish between patients with confirmed asthma and those without, and those with confirmed COPD and those without.
We estimated the positive predictive value (PPV) of claims identification of asthma or COPD stratified by the 3 covariates with the largest absolute difference in prevalence across patients with confirmed and unconfirmed asthma. We used this approach among all patients except for those with only a claim for COPD. Whereas for patients with claims for asthma or no claim for asthma or COPD, we were interested in the fraction with asthma (and therefore contraindications to LABA therapy), for patients with claims-based COPD only, we were interested in the fraction with true COPD (i.e., an appropriate indication for LABA therapy). For the latter set, we compared categories of confirmed COPD vs. unconfirmed COPD to identify the 3 most individually discriminating covariates. We estimated 95% confidence intervals (CI) using the exact binomial method.
We chose this categorization of asthma and COPD to be useful for monitoring off-label prescribing of LABAs, where it is of interest to know what fraction of patients with claims for COPD truly have COPD (on-label use; the latter set), and for patients with at least one claim for asthma or for patients with no claim for asthma or COPD, what fraction truly have asthma only (off-label use; the former set). These analyses were restricted to patients for whom we received a medical record. Additionally, we tabulated these PPVs within categories of a number of clinically derived variables that could plausibly be associated with a different PPV and within covariate patterns defined by the 3 most individually discriminating covariates.
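A PPV with an exact binomial interval can be computed as in the following sketch. It assumes the "exact binomial method" refers to the Clopper-Pearson interval, which is the usual meaning but is not stated explicitly here; the function names are hypothetical, and the interval bounds are found by bisection on the binomial distribution function using only the standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ppv_with_exact_ci(confirmed, flagged, alpha=0.05):
    """Return (PPV, lower, upper) with a Clopper-Pearson interval.

    confirmed: claims-identified patients confirmed on record review.
    flagged:   all claims-identified patients with an adjudicated record.
    """
    if flagged <= 0 or not 0 <= confirmed <= flagged:
        raise ValueError("need 0 <= confirmed <= flagged and flagged > 0")
    ppv = confirmed / flagged
    # Lower bound solves P(X >= confirmed | p) = alpha / 2.
    lower = 0.0 if confirmed == 0 else _solve_decreasing(
        lambda p: binom_cdf(confirmed - 1, flagged, p), 1 - alpha / 2)
    # Upper bound solves P(X <= confirmed | p) = alpha / 2.
    upper = 1.0 if confirmed == flagged else _solve_decreasing(
        lambda p: binom_cdf(confirmed, flagged, p), alpha / 2)
    return ppv, lower, upper
```

As a usage illustration with made-up counts (not taken from the study tables), `ppv_with_exact_ci(33, 35)` gives a PPV of about 94% with an interval of roughly 81% to 99%.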
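Selecting the most individually discriminating covariates, defined as those with the largest absolute difference in prevalence between confirmed and unconfirmed patients, can be sketched as below. The function name and the record layout (a confirmation flag paired with the set of covariates present at baseline) are assumptions made for illustration; ties are broken alphabetically, which is an arbitrary choice.

```python
def most_discriminating(records, k=3):
    """Return the k covariates with the largest absolute prevalence difference.

    records: list of (confirmed, covariates) pairs, where confirmed is a bool
             and covariates is the set of covariate names present at baseline.
    """
    confirmed = [cov for ok, cov in records if ok]
    unconfirmed = [cov for ok, cov in records if not ok]
    if not confirmed or not unconfirmed:
        raise ValueError("need both confirmed and unconfirmed patients")

    def prevalence(group, name):
        return sum(name in cov for cov in group) / len(group)

    names = set().union(*confirmed, *unconfirmed)
    diffs = {name: abs(prevalence(confirmed, name) - prevalence(unconfirmed, name))
             for name in names}
    # Largest difference first; alphabetical order breaks ties.
    return sorted(diffs, key=lambda name: (-diffs[name], name))[:k]
```

The PPV can then be recomputed within the strata defined by each selected covariate.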
We also conducted a post-hoc regression analysis as an additional means to identify covariates whose presence may modify the accuracy of the identification of asthma or COPD. Using multinomial logistic regression models, we regressed confirmed case status (asthma only, COPD only, both, or neither, with neither as the referent) on the claims definitions for these conditions, plus the covariates with product terms for each covariate and the claims definitions. We retained an empirical approach to covariate selection by using a stepwise algorithm, entering and retaining variables with p-values < 0.05.
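One way to write down a model of this form is shown below. The notation is illustrative and is an assumption rather than the specification actually fitted: $Y$ is confirmed case status, $C_j$ are indicators for the claims definitions, $X_m$ are the candidate covariates, and the $\delta$ coefficients are the product (interaction) terms between each covariate and each claims definition.

```latex
\log \frac{\Pr(Y = k \mid C, X)}{\Pr(Y = \text{neither} \mid C, X)}
  = \alpha_k
  + \sum_{j} \beta_{kj}\, C_j
  + \sum_{m} \gamma_{km}\, X_m
  + \sum_{j} \sum_{m} \delta_{kjm}\, C_j X_m ,
\qquad k \in \{\text{asthma only},\ \text{COPD only},\ \text{both}\}
```

In this formulation, a nonzero $\delta_{kjm}$ indicates that covariate $X_m$ modifies how well claims definition $C_j$ predicts confirmed status $k$ relative to neither condition.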

Human subjects considerations
Since this study used protected health information (PHI) to link insurance claims to patients' medical records, we operated with the oversight of the New England Institutional Review Board, which approved our protocol and privacy practices, and whose Privacy Board granted a waiver of authorization for the use of PHI without obtaining patient consent.

Results
Of the 225,079 eligible initiators of a LABA, 9,965 (4.4%) appeared to receive LABA monotherapy (Table 1). Users of a LABA without concomitant ICS were more likely to be ≥ 65 years of age, while there were no observed differences with respect to sex. Women accounted for nearly two-thirds of patients with baseline claims for asthma, while men accounted for over one-half of patients with COPD. A total of 107,839 of the 225,079 (47.9%) patients had no claim for asthma or COPD in the baseline period. The number of diagnoses of asthma or COPD per patient in the baseline period was low, with a mean of 0.6 in the 0-90 days before the index date and 0.18 in the 91-183 days before the index date.
Patients on LABAs without ICS had a higher observed prevalence of use of other pulmonary medications, including inhaled anticholinergics, leukotriene modifiers, and xanthine derivatives. Use of inhaled anticholinergics was more prevalent among patients with baseline claims for COPD, while use of leukotriene modifiers was more common among patients with claims for asthma.
We sought 640 medical records related to 520 claims consistent with asthma, COPD, both, or neither. One hundred twenty of these records were requested from alternate providers, when such an alternate was available and the initial provider was unable or unwilling to complete the request. We received medical records relating to 370 (71%) of potential cases. The Supplementary Material contains information on the reasons for nonprocurement of medical records, which were generally administrative in nature. Patients whose medical records we received were slightly older than the underlying LABA cohort, but had a similar sex distribution (Table 2).
Among patients with at least one claim for asthma (and no COPD claim) in the 6 months prior to the index date, the PPV for confirmed asthma was 73.6% (CI 63.3-82.3); among patients with at least one claim for COPD (and no asthma claim), the PPV for confirmed COPD was 81.5% (CI 72.1-88.9; Table 3). We observed that 21.2% (CI 13.8-30.3) of patients with claims for asthma and COPD were confirmed to have both conditions, while 28.9% (CI 19.5-39.9) of patients with no claims for asthma or COPD were confirmed to have asthma. Regarding the negative predictive value of claims for asthma and COPD, 61.5% (CI 50.1-71.9) of patients without these diagnoses had no evidence of the conditions in the medical record, while 28.9% (CI 19.5-39.9) had confirmed asthma and 9.6% (CI 4.1-18.1) had confirmed COPD.
The PPV of having at least one claim for asthma decreased with increasing age, but the PPV of having at least one claim for COPD increased with age (Table 4). Stratification on the 3 covariates with the largest individual discrimination between confirmed and unconfirmed case status resulted in higher PPVs. For instance, among patients with at least one claim for COPD and at least one dispensing of an inhaled anticholinergic medication, the PPV was 94.3% (CI 80.8-99.3). Similar findings were evident for the other empirically identified discriminators. Stratification on clinically defined covariates resulted in some categories where the PPV was higher than the average PPV for that diagnosis (Table 5). The PPV for at least one asthma claim was increased in the presence of previous asthma medication use and lower respiratory tract infections. The PPV for at least one COPD claim was increased when a pulmonologist was the prescriber of the COPD medications and when there was a diagnosis of bronchitis or bronchiolitis. The Supplementary Material contains estimated PPVs across strata defined by combinations of variables and for a longer baseline period of 12 months.
The multinomial logistic regression modeling identified several variables that improved prediction of true case status through interaction with the claims definitions. The variables "symptoms involving respiratory system and other chest symptoms" (ICD-9 786.xx), the number of baseline drug dispensings, and age > 40 years improved prediction of COPD only. The number of drug dispensings also improved prediction of asthma only and both asthma and COPD relative to neither. The presence of a baseline dispensing of a beta-adrenergic drug and the diagnosis "other forms of chronic ischemic heart disease" (ICD-9 414.xx) improved prediction of true asthma only. No other variables arose as significant predictors.

Discussion
Subsets of LABA initiators with asthma, COPD, and both conditions can be identified and differentiated using claims data, although categorization of the remaining patients is largely infeasible. Within strata of selected covariates, the claims showed better predictive ability to identify asthma only among patients with claims for asthma or claims for both asthma and COPD, and to identify COPD among patients with claims for COPD. Among patients without claims for asthma or COPD, there was no subset within which the PPV exceeded 50%, meaning the classification of patients who do not have a claim indicating asthma or COPD remains uncertain. Additionally, requiring the presence of claims for asthma or COPD resulted in a population in which nearly 25% of persons appear to not have the condition of interest.
That a substantial fraction of the LABA users did not have a claim associated with asthma or COPD is problematic in that it leaves a large fraction of patients unclassified in the data. Similarly, multiple diagnoses of asthma or COPD were rare, limiting our ability to refine the algorithms by requiring more than one diagnosis for the condition of interest. Both of these findings are consistent with previous experience with health plan data. Loughlin and colleagues found that for tegaserod, a drug with a clear indication (irritable bowel syndrome), only 32% of initiators had a claim for irritable bowel syndrome in the 6-month period preceding tegaserod initiation [12]. To address this expected feature of the data, we abstracted medical records on a subset of LABA initiators with no claims for asthma or COPD, and confirmed their actual case status. The resulting data provide an estimate of the distribution of actual asthma and COPD among those without a claims-diagnosis. Nevertheless, a large fraction of patients remain unclassified. In studies of off-label use, this limitation means that there will be overestimation of off-label use (under the assumption that some of the unclassified patients have the "on-label" indication). In studies where there is a desire to exclude patients with asthma or COPD, it may be infeasible to completely exclude these individuals. Our findings were qualitatively similar to those in a study of the predictive ability of the Lovelace Health Plan data, a staff- and network-model health maintenance organization, for identifying preclinical COPD [8].
In that study, a small number of claims-based variables predicted the presence of preclinical COPD with a PPV of 23% to 39% depending on the definition used. The higher PPVs observed in our study probably reflect our focus on diagnosed COPD, while the Lovelace study screened for preclinical disease.
We did not directly assess the sensitivity of the claims definition; however, the small number of LABA users observed to have baseline claims for asthma or COPD overall suggests that the sensitivity of the diagnosis codes on claims may be low, since nearly all LABA users would be expected to have one of these conditions. Other limitations of this study include the use of medical records as a gold standard measure of asthma and COPD. This approach requires the assumption that the information needed for asthma and COPD classification is present and not differentially present across the strata of the predictive variables. If the incorrect diagnosis is listed in the medical record or the medical record is incomplete, then the present estimates of the PPV could be biased. The bias would likely result in an underestimate of the PPV if the medical records are incomplete, since we required that cases have the diagnostic criteria affirmatively listed in their record. These data may not be generalizable to populations of patients who do not use LABAs, and caution is warranted for applications of these data to patients with different treatment patterns. Since our objective was to study LABA initiators, we did not collect data on non-users of LABA. Similarly, these data may not generalize to populations other than the commercially insured, but stratifying this population according to relevant characteristics and providing the PPV within strata improves the external application of these data for case identification [13].
Table 4 Positive predictive value of claims for asthma, COPD, both, or neither in the 6 months prior to initiation of a long-acting beta agonist, overall and stratified by empirically identified covariates
Although the characteristics used for stratification in this study were based on health insurance claims, and so are subject to misclassification on the clinical constructs of interest, this type of misclassification is frequently thought to be non-differential with respect to other claims-based variables, so that the most plausible direction of misclassification bias would be toward showing no difference in the PPV across these variables [14]. In some strata of predictive covariates, the data were sparse, which resulted in uncertainty around some PPV estimates. Additionally, we evaluated the performance of our own measure of asthma and COPD. Therefore, these data may provide limited information on the validity of other disease definitions.
A major strength of this study is the large source population and its reflection of routine clinical practice across the US. These data provide adequate statistical power across a number of stratification variables and improve the generalizability of the study.
Because of low sensitivity, the claims data do not seem sufficient to definitively identify asthma alone without sacrificing the PPV. However, this study provides estimates of the fraction of patients with confirmed asthma that one might expect from studies identified on the basis of claims alone. For subsets of patients with claims for asthma or COPD, the claims have fairly high predictive ability for identifying actual diagnoses. Therefore, a reasonable use of these data for LABA risk management is to estimate the fraction of patients with asthma through application of the observed distributions from this study, and to identify subsets of these patients based on the presence of at least one additional predictive covariate to achieve a higher degree of specificity. This use of the data would allow for periodic tracking of the fraction of potential off-label use among a subset of patients likely to have asthma only, while at the same time applying the observed (confirmed) distribution of asthma only from this study for broader context.
Table 5 Positive predictive value of claims for asthma, COPD, both, or neither in the 6 months prior to initiation of a long-acting beta agonist, overall and stratified by clinically defined covariates
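As an illustration of applying an observed distribution, the sketch below scales the confirmed-asthma fraction among sampled patients without claims (28.9%, CI 19.5-39.9) to the 107,839 LABA initiators with no baseline claim for asthma or COPD. The function name is hypothetical, and the calculation is a naive extrapolation that assumes the abstracted records are representative of that stratum; it is not an analysis reported in this study.

```python
def expected_confirmed(n_patients, fraction, ci):
    """Scale a confirmed fraction and its CI bounds to a stratum of n_patients.

    Returns (point_estimate, (lower, upper)) as expected patient counts.
    """
    lower, upper = ci
    return n_patients * fraction, (n_patients * lower, n_patients * upper)


# Inputs taken from the Results: patients with no asthma or COPD claim.
point, (low, high) = expected_confirmed(107_839, 0.289, (0.195, 0.399))
print(f"Expected confirmed asthma: {point:,.0f} ({low:,.0f} to {high:,.0f})")
```

Under these assumptions, roughly 31,000 of the patients without claims (about 21,000 to 43,000 using the CI bounds) would be expected to have confirmed asthma.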

Conclusions
In summary, the results from this study suggest that it is feasible to differentiate subsets of LABA initiators into categories that might represent on- vs. off-label use of LABAs. It is not possible to categorize all LABA initiators with respect to these diagnoses. Safety surveillance for off-label use of LABAs must account for this limitation. Nevertheless, these data provide confirmed distributions of actual diagnoses that can be extrapolated to future surveillance studies and facilitate improved identification of subsets of patients with asthma, COPD, both, or neither on the basis of health insurance claims data. Implementation of simple algorithms that combine the presence of a diagnosis code for asthma or COPD with the presence or absence of a number of single PPV-predictive covariates would allow for periodic tracking of off-label LABA prescribing among subsets of all LABA users. The specific algorithm chosen should be targeted to the application at hand.