Characterizing idiopathic pulmonary fibrosis patients using US Medicare-advantage health plan claims data

Background Idiopathic pulmonary fibrosis (IPF) is a rare life-threatening interstitial lung disease (ILD). This study characterizes demographics, health care utilization, and comorbidities among elderly IPF patients and estimates prevalence and incidence rates for selected outcomes. Methods Cohort study using a large US health insurance database (Optum’s Medicare Advantage plan). Inclusion criteria: ≥ 1 diagnosis code for IPF (2008 - 2014), age ≥65 years, no diagnosis of IPF or other ILD in prior 12 months. Demographics, health care utilization, comorbidities and incidence rates for various outcomes were estimated. Follow-up continued until the earliest of: health plan disenrollment, death, a claim for another known cause of ILD, or end of the study period. Results 4,716 patients were eligible; 53.4% had IPF diagnostic testing. Median age was 77.5 years, 50.3% were male, median follow-up time was 0.8 years. Incidence rates ranged from 1.0/1,000 person-years (lung transplantation) to 374.3/1,000 person-years (arterial hypertension). Baseline characteristics and incidence rates were similar for cohorts of patients with and without IPF diagnostic testing. Conclusions Elderly IPF patients experience a variety of comorbidities before and after IPF diagnosis. Therapies for IPF and for the associated comorbidities may reduce morbidity and associated health care utilization of these patients. Electronic supplementary material The online version of this article (10.1186/s12890-018-0759-5) contains supplementary material, which is available to authorized users.


Background
Idiopathic pulmonary fibrosis (IPF) is a rare, progressive, and life-threatening interstitial lung disease (ILD). It generally occurs in patients over the age of 50 years. The diagnosis is challenging because IPF shares symptoms with many other kinds of lung disease, but improvements in diagnosis have been made [1,2]. Diagnostic criteria include the presence of a specific radiologic pattern of usual interstitial pneumonia (UIP) on high-resolution computed tomography (HRCT), or specific combinations of radiologic and histopathologic patterns in patients who have undergone a surgical lung biopsy [1,3]. Nintedanib and pirfenidone were both approved for use in the United States (US) by the Food and Drug Administration (FDA) in October 2014. These were the first pharmacological treatments shown to be efficacious in patients with IPF, leading to slower disease progression [4,5]. The objective of this study was to characterize the IPF population aged 65 years and above prior to the marketing of pharmaceutical treatments for IPF, to gain a better understanding of the characteristics and disease outcomes.

Data Source
This non-interventional cohort study was based on a proprietary research database containing eligibility, pharmacy claims, and medical claims data from a Medicare Advantage and Part D plan (MAPD) managed by Optum. Medical and pharmacy claims data are available starting in 2006 for approximately 4.2 million members. Medicare Advantage plans are offered by private companies, such as the health insurer associated with Optum, and approved by Medicare.
Medical and pharmacy information, including inpatient and outpatient procedures and diagnoses, is available for Medicare enrollees with medical and pharmacy coverage. Pharmacy claims contain sufficient information to trace patients' pharmacy expenditures through the multiple phases of the Part D plans, although information on medications administered during inpatient stays is not captured. All data access conforms to applicable Health Insurance Portability and Accountability Act policies.
To identify possible out-of-hospital deaths, claims data were linked to the Social Security Administration (SSA) Death Master File (DMF), a compilation of mortality information derived from the US SSA payment records. The DMF currently contains over 94 million records. Information on cause of death is not included.

Cohort Identification
Patients with at least one medical claim with a diagnosis code of IPF and aged 65 years or older between 01 January 2008 and 30 September 2014 were identified. The ICD-9 code for IPF was 516.3 prior to October 2011, and changed to 516.31 in October 2011. Patients were required to have complete medical coverage and pharmacy benefits and at least 12 months of continuous health plan enrollment prior to the IPF claim date (baseline period).
The cohort entry date (index date) was set as the date of the first medical claim with a diagnosis of IPF within the study period. Only incident cases, defined as IPF patients without IPF claims in the baseline period, were eligible for this study. Patients with other known causes of ILD, such as connective tissue disease, hypersensitivity pneumonitis and others, during baseline were excluded (Additional file 1, [3]). Patients with claims for IPF diagnostic testing (high-resolution computed tomography (HRCT) of the thorax or surgical lung biopsy (SLB) during baseline) were identified for a subgroup analysis (referred to as the IPF diagnostic testing subgroup). This stricter case definition was used to identify a subgroup who would more definitively be considered to have IPF, and to assess how the case definition may influence the results. The number of patients with IPF diagnostic testing in the first 30 days after diagnosis was also evaluated.
Since IPF shares symptoms with many other lung diseases, it is possible for a diagnosis of IPF to be made, then later modified to an alternate ILD diagnosis. These patients are included in the primary analysis so as not to use future events to classify patients at baseline. For exploratory purposes, however, a sensitivity analysis excluding patients who had at least one claim during follow-up for one of the ILD conditions (listed in Additional file 1) was done to estimate the impact of these potentially misdiagnosed members of the cohort on the baseline characteristic distributions and outcomes.
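The selection rules above can be illustrated with a minimal Python sketch. It is not the study's implementation (the analyses were run in SAS); the class and function names, the 365-day approximation of the 12-month baseline, and the `other_ild_codes` argument (a stand-in for the code list in Additional file 1) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional, Set

STUDY_START = date(2008, 1, 1)
STUDY_END = date(2014, 9, 30)
CODE_CHANGE = date(2011, 10, 1)  # ICD-9 code for IPF changed from 516.3 to 516.31
BASELINE_DAYS = 365              # assumption: 12 months approximated as 365 days
MIN_AGE = 65


@dataclass(frozen=True)
class Claim:
    service_date: date
    dx_code: str


@dataclass(frozen=True)
class Member:
    birth_date: date
    enroll_start: date  # start of continuous medical and pharmacy coverage
    enroll_end: date


def is_ipf_claim(claim: Claim) -> bool:
    """Accept 516.3 before the code change and 516.31 from the change onward."""
    expected = "516.3" if claim.service_date < CODE_CHANGE else "516.31"
    return claim.dx_code == expected


def age_on(birth: date, on: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def index_date(member: Member, claims: Iterable[Claim],
               other_ild_codes: Set[str]) -> Optional[date]:
    """Return the index date for an incident case, or None if not eligible."""
    claims = list(claims)
    in_window = sorted(c.service_date for c in claims
                       if is_ipf_claim(c)
                       and STUDY_START <= c.service_date <= STUDY_END)
    if not in_window:
        return None
    index = in_window[0]  # first IPF claim within the study period
    baseline_start = index - timedelta(days=BASELINE_DAYS)
    if age_on(member.birth_date, index) < MIN_AGE:
        return None
    # Require continuous enrollment across the whole baseline period.
    if member.enroll_start > baseline_start or member.enroll_end < index:
        return None
    # Exclude prevalent IPF and other known causes of ILD seen during baseline.
    for c in claims:
        if baseline_start <= c.service_date < index and (
                is_ipf_claim(c) or c.dx_code in other_ild_codes):
            return None
    return index
```

In this sketch a claim coded 516.3 on or after October 2011 is not counted as an IPF claim, which mirrors the date-dependent code definition described above.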

Baseline characteristics
Baseline characteristics include age at cohort entry, sex, geographic area, race, length of health plan membership prior to cohort entry, cohort entry year, medication dispensings of: oral corticosteroids, N-acetyl cysteine (NAC), azathioprine and cyclophosphamide, open lung biopsies, oxygen therapy, medication for gastroesophageal reflux disease (GERD) therapy (e.g., H2 receptor blockers, proton pump inhibitors), anticoagulation (e.g., vitamin K antagonists, heparin, novel oral anticoagulants and antiplatelet therapy), dispensings of drugs that may induce pulmonary fibrosis (amiodarone, bleomycin, nitrofurantoin, methotrexate, gold salts), Epstein-Barr virus (EBV), hepatitis C, and broncho-alveolar lavage. Conditions defined in the list of primary and secondary outcomes below were also included in the list of baseline conditions, as were measures of health care utilization (HCRU) such as number of inpatient and outpatient visits and associated costs.
When available, algorithms that have been validated in administrative databases were used to define outcomes of interest. Otherwise, outcome definitions were determined with clinical input and searches of medical claims coding systems.
Follow-up time for each cohort member extended from the day after the index date until the earliest of disenrollment from the health plan, death, occurrence of baseline exclusion criterion during follow-up, or the end of the study period. For each outcome, patients were censored for any second occurrences of that outcome, but remained eligible for a different outcome.
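The censoring rule can be sketched as follows. This is an illustrative Python fragment, not the authors' code; the function names, the 365.25-day year, the tie-breaking order, and the study end date used in the usage note are assumptions.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_YEAR = 365.25  # assumption: convention used to convert days to years


def end_of_follow_up(index: date, study_end: date,
                     disenrolled: Optional[date] = None,
                     died: Optional[date] = None,
                     other_ild_claim: Optional[date] = None) -> Tuple[date, str]:
    """Return the earliest censoring date and its reason.

    Ties are resolved in the order listed below (an arbitrary choice).
    """
    if study_end < index:
        raise ValueError("study_end must not precede the index date")
    candidates = [
        ("end of study", study_end),
        ("disenrollment", disenrolled),
        ("death", died),
        ("other ILD claim", other_ild_claim),
    ]
    observed = [(d, reason) for reason, d in candidates
                if d is not None and d >= index]
    return min(observed, key=lambda pair: pair[0])


def outcome_time(index: date, censor_date: date,
                 first_outcome: Optional[date] = None) -> Tuple[float, bool]:
    """Person-years at risk for one outcome, and whether the outcome occurred.

    Follow-up starts the day after the index date, so the number of days
    at risk equals (end date - index date). Only the first occurrence of
    the outcome counts; later occurrences are ignored.
    """
    if first_outcome is not None and index < first_outcome <= censor_date:
        return (first_outcome - index).days / DAYS_PER_YEAR, True
    return (censor_date - index).days / DAYS_PER_YEAR, False
```

For example, with a hypothetical study end of 31 December 2014, a patient indexed on 15 June 2010 who disenrolls on 15 June 2012 is censored at disenrollment, and an outcome first recorded on 15 June 2011 contributes one event and roughly one person-year.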

Statistical Methods
Mean, standard deviation (SD), median and interquartile ranges (IQR) were calculated for continuous variables and absolute and relative frequencies were calculated for categorical variables.
For the baseline characteristics, relative frequencies (prevalences) were calculated by dividing the number of patients in the cohort with the condition during baseline by the total number of patients in the cohort. Incidence rates (IRs) were calculated by dividing the number of patients with the outcome by the sum of all observation time-to-event or censoring for all patients within each cohort. IRs are presented per 1,000 person-years (pys) with 95% confidence intervals (CIs). They are shown only for the patients who did not have evidence of the condition during baseline. Length of follow-up (mean, SD, median, interquartile range (IQR)) was summarized by reason for censoring.
All analyses were conducted using SAS 9.4.
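As an illustration of the two estimators described above, the following Python sketch computes a baseline prevalence and an incidence rate per 1,000 person-years. The study itself used SAS and does not state how the 95% CIs were derived; the log-scale Poisson approximation below is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import math
from typing import Tuple


def prevalence(n_with_condition: int, n_cohort: int) -> float:
    """Baseline prevalence as a percentage of the cohort."""
    if n_cohort <= 0:
        raise ValueError("n_cohort must be positive")
    return 100.0 * n_with_condition / n_cohort


def incidence_rate(events: int, person_years: float,
                   per: float = 1000.0, z: float = 1.96
                   ) -> Tuple[float, float, float]:
    """Incidence rate per `per` person-years with an approximate 95% CI.

    Assumption: events follow a Poisson distribution and the CI is built
    on the log scale, i.e. rate * exp(+/- z / sqrt(events)). An exact
    Poisson interval may be preferable when event counts are small.
    """
    if events < 1 or person_years <= 0:
        raise ValueError("need at least one event and positive person-time")
    rate = events / person_years * per
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor
```

Here `person_years` is the summed time to event or censoring among patients without the condition during baseline, matching the denominator described above.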

Results
A total of 4,716 incident cases of IPF were identified. The baseline characteristics of the IPF cohort and subgroup with IPF diagnostic testing during baseline were similar, overall and within stratum of cohort entry time based on changes in ICD-9 coding (i.e., cohort entry before October 2011 vs during or after October 2011).
Due to the similarity of characteristics, all results are reported without stratification by cohort entry time.
Baseline HCRU metrics are summarized in Table 2. Half of the IPF cohort (50.8%) was hospitalized and 82.9% had dispensings for at least 3 unique medications. The median number of physician visits was 12.0 (IQR: 8.0-19.0). The median total cost was $11,865 (IQR: 2,465-25,113). Nearly half of the costs were from facility charges, which is consistent with the observed proportion of patients with inpatient hospitalizations.
The mean length of time from cohort entry until censoring was 1.3 years (SD 1.4, median 0.8) (Table 3). The most common reason for censoring (other than the end of the study period) was the end of health plan enrollment (27.3% and 26.3% for the IPF cohort and the IPF diagnostic testing subgroup, respectively). The smallest proportion of patients in each cohort was censored because one of the baseline exclusion criteria (e.g., other known causes of ILD) was observed during follow-up (17.1% and 19.1%, respectively).
The baseline prevalences of the primary and secondary conditions are presented in Table 4. With the exception of lung cancer (16.1%), the prevalence was low for the primary conditions, ranging from 0.2% for PAH and lung transplantation to 4.6% for PH in the IPF cohort. Among the secondary conditions, the prevalence ranged from ≤1% for GI perforation, neutropenia, hemorrhoids with bleeding, intracranial hemorrhage, acute pancreatitis, and hepatic failure to the highest prevalence estimate of 76.3% for arterial hypertension. The results were similar for the IPF diagnostic testing subgroup.
The incidence rates of the outcomes for the IPF cohort and IPF diagnostic testing subgroup are summarized in Table 5. In the IPF cohort, the IRs of the primary outcomes during follow-up ranged from 1.0/1,000 pys for lung transplantation to 180.4/1,000 pys for all-cause mortality. IRs of the secondary outcomes ranged from 3.7/1,000 pys for hepatic failure to 374.3/1,000 pys for arterial hypertension. Overall, IRs of most of the primary and secondary outcomes were slightly lower in the IPF cohort relative to the IPF diagnostic testing subgroup.
The results were not substantially affected when patients who had claims during follow-up for other known causes of ILD (i.e., claims suggesting they were misdiagnosed as IPF at cohort entry) were excluded (n=808, 17%, data not shown). One notable exception was that this subset of patients had a higher all-cause mortality rate (201.7/1,000 pys).

Discussion
The results of this study suggest that IPF patients aged 65 years and above have high morbidity and mortality. The study included a broad range of comorbidities and outcomes, some of which have rarely or not yet been characterized in IPF populations. In particular, no published studies were found that included the prevalence or incidence of outcomes such as hepatic failure, acute pancreatitis or acute renal failure.
The patients included in this study are from a wide range of providers covered by Medicare, not restricted to major medical facilities, thus also reflecting diagnoses given by providers in general practice. Within the incident IPF cohort of 4,716 patients, over one half of the patients had a procedure for IPF diagnostic testing during baseline, the vast majority of which was HRCT rather than surgical biopsy. Surgical lung biopsy is an invasive diagnostic test; both HRCT and lung biopsy would tend to be performed only in larger medical settings. Although surgical lung biopsy was recommended by the 2011 international guidelines for the confirmation of the IPF diagnosis in patients who have a possible or probable UIP pattern on HRCT [3], not all patients are eligible or willing to undergo that procedure. As this study includes centers that are not necessarily ILD referral centers, these centers may be less experienced with, and less comfortable performing, this invasive procedure. The subgroup analysis only looks at the diagnostic procedures done at baseline (i.e., until the first medical claim of IPF). In the majority of cases the surgical lung biopsy is only done after the result of the HRCT (in cases of possible or probable UIP), which means the surgical lung biopsy would only be done after the baseline period and would therefore not be captured in the definition of the cohort/subgroup.
Unlike clinical trials [5] that include a higher proportion of males, the gender distribution in this study is consistent with other publications on IPF populations, where approximately 50% of the patients were female [26,27]. This may reflect women's greater likelihood of seeking health care services, which would make them more frequently observed in insurance claims databases [26].
Changes in the ICD-9 codes for IPF in October of 2011 did not influence the baseline characteristics of patients identified. However, if coding practice changes were implemented gradually, then some possible IPF patients with claims on or after October 2011 who were coded with the pre-October 2011 code (516.3) would have been excluded.
Comparisons to other publications should consider that study findings are influenced by the complexity of the IPF cohort definition, different distributions of age and sex, the coding used to define comorbidities, length of follow-up and the different underlying databases. Nevertheless, the reported prevalence and incidence rates observed in this study fell within the range of estimates reported in other publications. A systematic literature review [28] of the prevalence of pulmonary and extra pulmonary comorbidities among IPF patients included several of the outcomes included in this study and estimates ranged from 3-86% for PH, 3-48% for lung cancer, 6-91% for sleep apnea (various types), 6-67% for COPD, 6-68% for ischemic heart disease (IHD) and 0-94% for GERD. Our reported prevalence estimates fell within these ranges, with baseline prevalence of 4.6% for PH, 16.1% for lung cancer, 8.1% for obstructive sleep apnea, 51.5% for COPD, 40.4% for IHD and 28.0% for GERD.
Of particular interest for IPF populations is the occurrence of exacerbations of IPF. The percentage of patients in this cohort with ARWUC during follow-up was rather low at 2.4% (95% CI: 2.0-2.8, data not shown). In comparison, a retrospective study of data collected from 461 patients with diagnosed IPF reported an annual percentage of 14.2% for clinically defined acute respiratory worsening (ARW) [29]. Reports in clinical trials have tended to be lower than this, and Raghu et al reported that ARW are believed to occur in between 5 and 10% of IPF patients per year [29]. The incidence of ARWUC is difficult to establish in claims data due to variations in methodologies used in different studies [29] and the numerous exclusionary comorbidities included in the definition, such as left heart failure, pulmonary embolism and other identifiable causes of lung injury. In addition, dyspnea, an essential component of the clinical definition of ARW, may not be well captured in claims data, leading to underestimation of the condition. This ARWUC algorithm is a proxy for clinically defined ARW and further validation is desirable.
Relative to findings reported in a similar study based on Optum's commercially insured population [30], the IPF cohort in this study has substantially higher morbidity and mortality. This is likely because the population of this study was restricted to Medicare-eligible patients aged 65 years and above, while the previous study included IPF patients aged 40 years and older. Primary outcomes with notably higher IRs (per 1,000 pys) included AMI (34.4 vs. 13.8), pulmonary hypertension (46.0 vs. 22.5), lung cancer (26.0 vs. 17.6), and mortality (180.4 vs. 97.1). The IR for lung transplantation was lower (1.0 vs. 6.0), likely due to the practice in the US during this study period of restricting lung transplantation to patients under the age of 65 years [31].
IRs of most of the secondary outcomes were higher among the elderly population, in particular for chronic renal failure, congestive heart failure, pulmonary rehabilitation and intracranial hemorrhage, all of which were more than twice as high. In both studies, arterial hypertension was the secondary outcome with the highest IR and the most prevalent condition during baseline (76.3% in the elderly population (Table 5); 55.3% in the commercial population (data not shown)). Although arterial hypertension is known to be highly prevalent in the US population, the high IR may also be due to the broad range of codes used to define this outcome [16,17].
Collard et al [27] performed a similar analysis using US claims data from commercially insured and Medicare patients. The Collard et al [27] population was not restricted to the Medicare population (20.8% were 64 years or younger) and many of the outcome definitions were less inclusive; consequently, IRs (per 1,000 pys) were consistently lower than those reported in the Optum population: heart failure (67.
This study has some key limitations. The study was done using automated medical and prescription claims. While claims data are extremely valuable for the efficient examination of health care outcomes and utilization, all claims databases have certain inherent limitations. The presence of a diagnosis code on a medical claim does not necessarily indicate the presence of disease, as the code may be incorrect or may have been recorded as a rule-out criterion rather than for actual disease. Similarly, claims for diagnostic tests may be observed but results are not reported in claims data. We restricted the eligibility criteria to a one-year look-back for identification of incident cases of IPF, so patients who had an earlier diagnosis of IPF but no IPF claim within the one-year look-back could have been misclassified as incident. However, this is unlikely, as IPF is a chronic condition and it is highly likely that patients would be seeking health care during that one-year look-back period. These limitations may lead to potential misclassification and impact this study due to the difficulty in diagnosing IPF when patients first present to medical providers. Sensitivity and subgroup analyses were implemented to evaluate the impact that different selection criteria had on cohort characteristics, specifically creating a stricter cohort definition for the subgroup analysis which includes only those patients who had IPF diagnostic testing in line with the recommendations of the international diagnostic guidelines [3].
True incidence is difficult to identify, as it cannot be determined whether the code pertains to newly observed conditions or for care related to the conditions that were observed at an earlier (and possibly unobserved) date. Although this study focuses on the IR among patients without the condition observed in baseline, this may still result in the overestimation of rates of outcomes such as chronic and acute renal failure, which are captured by a broad range of non-specific codes, resulting in some of the highest incidence rates observed in this study. Some outcomes, such as epistaxis or other bleeding events may not require medical attention and therefore are not well captured in medical claims, leading to underestimation of those events. Medications that can be obtained without prescriptions (i.e., over-the-counter medications) are not observed in claims data. Medications that are given during an inpatient stay are also not captured, leading to underestimation of their use. In addition, given that there were no efficacious or approved treatments available during this study period other than lung transplantation, patients may have received additional (but unobserved in claims data) health care and medications through involvement in clinical trials.
Duration of follow-up can be limited in the Optum-MAPD database due to individuals changing into more traditional Medicare coverages or Medicare Advantage plans administered by other insurance companies. Thus, outcomes that occur after enrollment ends would not contribute to the prevalence or incidence rates. Although not systematically evaluated, there is a suggestion that as patients get sicker, they are more likely to disenroll from Medicare Advantage plans [33,34].
Historically, the SSA DMF provided about 90% coverage for patients older than 65 years. Starting on 01 November 2011, the SSA determined that protected state records could no longer be disclosed. Section 205(r) of the Social Security Act prohibited the SSA from disclosing state death records received through their contracts with these states, except in limited circumstances. This has resulted in the reduction of available records on the public DMF by 1 million annually (about 30% reduction) [35]. Some deaths may not be available in the DMF, and out-of-hospital deaths may not be identified, potentially leading to underestimation of the mortality incidence rate.
There are several advantages to conducting this study in the Optum-MAPD. Unlike site-based or registry-based studies that are typically limited in population sample size, the Optum-MAPD contains millions of lives, allowing for broader investigations of drug use patterns and disease outcomes. This is especially valuable for investigating rare outcomes in a population with a rare disease such as IPF. Underlying information is geographically diverse across the US. Relative to the overall US Medicare population, the Optum MAPD population has a similar distribution of gender and age, members from the Northeast and Midwest, and proportion of members who are African-American, Hispanic or Asian. It has a higher proportion of members from the South and fewer from the West and a higher proportion with race categorized as other/unknown. The average length of enrollment in the MAPD is almost 5 years for this age group.

Conclusions
Elderly IPF patients experience a variety of comorbidities, both before and after diagnosis. This study helps to gain a better understanding of the outcomes of this disease, which is important to optimize management of this patient population and thus improve disease outcomes. The cohort characteristics were not affected by modifications to the cohort definition, suggesting that the definition of IPF captures similar patients despite variations in the operational definition of IPF. Therapies for IPF and for the associated comorbidities may reduce morbidity and associated HCRU of these patients.

Availability of data and materials
The datasets generated and analyzed during the current study are not publicly available due to the restrictions of the data license, but may be available from Optum through a data license agreement by interested parties.

Authors' contributions
KM, CE, NH, LW contributed to the study concept and design. CC and HN acquired the data. KM drafted the manuscript. CE, NH, LW critically revised the manuscript for important intellectual content. All authors contributed to the analysis and interpretation of data. All authors read and approved the final manuscript.

Ethics approval and consent to participate
This study was approved by the Western Institutional Review Board (WIRB, Study No.: 1178316). This Board found that this research met the requirements for a waiver of consent under 45 CFR 46.116(d), thus no further permissions were required to utilize the datasets for this study.

Consent for publication
Not applicable.

Competing interests
NH and LW are employees of Boehringer Ingelheim. CE and HN are employees of Optum. KM and CC were employees of Optum at the time this study was conducted. The authors have no conflicts of interest to report.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Author details