Creating scenarios of the impact of COPD and their relationship to COPD Assessment Test (CAT™) scores
© Jones et al; licensee BioMed Central Ltd. 2011
Received: 18 February 2011
Accepted: 11 August 2011
Published: 11 August 2011
The COPD Assessment Test (CAT™) is a new short health status measure for routine use. New questionnaires require reference points so that users can understand the scores; descriptive scenarios are one way of doing this. A novel method of creating scenarios is described.
A Bland and Altman plot showed a consistent relationship between CAT scores and scores obtained with the St George's Respiratory Questionnaire for COPD (SGRQ-C) permitting a direct mapping process between CAT and SGRQ items. The severity associated with each CAT item was calculated using a probabilistic model and expressed in logits (log odds of a patient of given severity affirming that item 50% of the time). Severity estimates for SGRQ-C items in logits were also available, allowing direct comparisons with CAT items. CAT scores were categorised into Low, Medium, High and Very High Impact. SGRQ items of corresponding severity were used to create scenarios associated with each category.
Each CAT category was associated with a scenario comprising 12 to 16 SGRQ-C items. A severity 'ladder' associating CAT scores with exemplar health status effects was also created. Items associated with 'Low' and 'Medium' Impact appeared to be subjectively quite severe in terms of their effect on daily life.
These scenarios provide users of the CAT with a good sense of the health impact associated with different scores. More generally they provide a surprising insight into the severity of the effects of COPD, even in patients with apparently mild-moderate health status impact.
Understanding a chronic obstructive pulmonary disease (COPD) patient's health status is an integral part of overall patient management. International guidelines on the management of COPD recommend that both lung function and health status are monitored regularly to guide any changes in treatment, and both the European Respiratory Society and the American Thoracic Society recommend that health status should be assessed as an outcome in clinical trials of new and existing pharmacological therapies for treatment of COPD. A number of different questionnaires are available that assess health status in COPD; these include the Chronic Respiratory Questionnaire (CRQ), the Clinical COPD Questionnaire (CCQ), the St George's Respiratory Questionnaire (SGRQ) and a revised form of the SGRQ, the SGRQ-C, which retains the accuracy and responsiveness of the SGRQ but features fewer questions; scores obtained with the SGRQ and SGRQ-C are directly comparable.
All health status questionnaires require reference points so that physicians can attach meaning to their scores. One approach is to calculate a minimum clinically important difference (MCID). This allows users of the questionnaire to distinguish clinically relevant differences within patients, for example in an interventional trial, or in the same patient over time, for example before and after pulmonary rehabilitation. However, the MCID only provides an estimate of the minimum worthwhile difference and does not describe how the health status has changed. Another approach is to relate scores to clinical scenarios. This has been done to illustrate the MCID (4 units) for the SGRQ, where the scenarios are based on responses to individual questions. For example, a scenario describing a patient who no longer takes a long time to wash or dress, can now walk up stairs without stopping, and can go out for entertainment relates to a pattern of change in the patient's health status corresponding to a 4-unit improvement. Despite these useful descriptive characteristics, within the field of pulmonary medicine there has been no attempt to create scenarios that can provide clinicians with descriptions that cover the entire range of a health status scale.
We have recently described the development of a new simple health status questionnaire, the COPD Assessment Test (CAT™) [9, 10], which correlates very well with the SGRQ-C in stable COPD patients (r = 0.80) and in patients experiencing an exacerbation (r = 0.78). This paper describes the development of descriptive scenarios for the CAT based upon the content of the SGRQ-C.
'Mapping' the contents of the SGRQ-C to the CAT was possible because the CAT was developed using Rasch methodology, while development of the SGRQ-C involved retrospective Rasch analysis of the original SGRQ to identify items that could be removed. Consequently, it has been possible to convert both questionnaires' scores to a common unit of measurement that allows direct comparison between CAT scores and SGRQ item severity scores, and subsequent mapping of SGRQ-C scenarios to the full spectrum of CAT scores.
Comparison of CAT and SGRQ scores
The correlation between SGRQ and CAT scores in stable patients is good (r = 0.80); however, a better method of assessing the agreement of two instruments designed to measure the same thing is a technique known most commonly as the Bland and Altman plot. This tests whether the two instruments behave in the same way across their entire scaling range, by plotting the difference between measurements made by the two instruments in the same individual against the mean of the two measurements. The differences should be small across the scaling range and have no, or only a very small, correlation with the means. The CAT scale ranges from 0 to 40 while the SGRQ scale ranges from 0 to 100; therefore, in order to create a Bland and Altman plot, it was necessary to multiply the CAT score by 2.5 to make the scaling range directly comparable with that of the SGRQ. This rescaled CAT score was called the 'adjCAT'.
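The rescaling and the quantities that go into a Bland and Altman plot can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function name `bland_altman`, the example scores, and the conventional 1.96 × SD limits of agreement are assumptions introduced here.

```python
from statistics import mean, stdev

def bland_altman(cat_scores, sgrq_scores):
    """Return the per-patient quantities behind a Bland and Altman plot.

    cat_scores: CAT totals on the 0-40 scale.
    sgrq_scores: SGRQ totals on the 0-100 scale, same patients, same order.
    """
    # Rescale CAT to 0-100 so both instruments share a range ('adjCAT').
    adj_cat = [2.5 * c for c in cat_scores]
    # y-axis of the plot: difference between the two instruments.
    diffs = [a - s for a, s in zip(adj_cat, sgrq_scores)]
    # x-axis of the plot: mean of the two instruments.
    means = [(a + s) / 2 for a, s in zip(adj_cat, sgrq_scores)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # Conventional 95% limits of agreement (an assumption of this sketch).
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return {"adj_cat": adj_cat, "diffs": diffs, "means": means,
            "bias": bias, "limits": limits}

# Made-up scores for four hypothetical patients.
result = bland_altman([10, 20, 30, 16], [20, 55, 70, 45])
```

Plotting `result["diffs"]` against `result["means"]` gives the plot itself; good agreement shows as small differences with no trend across the means.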
Rasch methodology is based upon testing the performance of the Guttman scaling properties of a questionnaire's constituent items [12–14]. The key property of this type of scale is the assumption that, for an item of given severity, a patient will have a high probability of responding positively to items that indicate lesser severity than the item in question and a lower probability of responding positively to items that reflect greater severity, when a positive response denotes the presence rather than the absence of an impairment or disability. Rasch modelling was used in the development of the CAT, as described elsewhere. Using this approach, severity is calculated as the log odds (logit) of a patient affirming that item 50% of the time. The average severity of the items is conventionally fixed at zero logits; therefore a mild score has a negative logit and a severe score has a positive logit.
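The relationship between patient severity, item severity and the probability of affirming an item can be made concrete with the standard dichotomous Rasch model. The sketch below is illustrative only: the function name `p_affirm` and the patient severity of 0.6 logits are assumptions, while the two item logits are taken from values quoted for SGRQ-C items in this paper.

```python
import math

def p_affirm(patient_logit, item_logit):
    """Probability that a patient affirms a dichotomous item under the
    standard Rasch model: logistic function of (patient - item) severity."""
    return 1.0 / (1.0 + math.exp(-(patient_logit - item_logit)))

# Hypothetical patient severity (assumption for illustration).
patient = 0.6

# Item severities in logits as quoted for two SGRQ-C items.
mild_item = -3.16    # 'Difficult to carry heavy loads, etc'
severe_item = 3.40   # 'Cannot move far from bed or chair'

p_mild = p_affirm(patient, mild_item)      # well above 0.5
p_severe = p_affirm(patient, severe_item)  # well below 0.5
p_matched = p_affirm(patient, patient)     # exactly 0.5 when severities match
```

When patient and item severity coincide the probability is 0.5, which is the sense in which an item's logit marks the severity at which it is affirmed 50% of the time.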
Scoring the CAT
Abbreviated conversion table from CAT score to logits
Scoring the SGRQ
Scores for the SGRQ are calculated by applying empirically derived weights to the patients' responses to each item. This is an entirely different methodology from that used for scoring the CAT, which meant that a simple direct mapping exercise to relate CAT scores to SGRQ scores was not possible. However, a recent exercise to refine the SGRQ to produce the SGRQ-C used Rasch methodology. This process also provided estimates of the severity of each item calculated as logits, which made it possible to compare CAT scores and SGRQ items using the same metric. Most of the items in the SGRQ are dichotomous, so we used the logit for that item. About 15% of the items have multiple response categories, and in these cases we used the logit for each category of response.
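Once CAT categories and SGRQ-C items are expressed in logits, grouping items into categories reduces to comparing each item's logit with the category boundaries. The following sketch shows one way this could be done. All names, item labels, item logits and cut-points in it are hypothetical placeholders; the cut-points used in the study are not reproduced here.

```python
from bisect import bisect_right

CATEGORIES = ("Low", "Medium", "High", "Very High")

def impact_category(item_logit, cut_points):
    """Assign a logit to an impact category.

    cut_points: three ascending logit boundaries separating the four
    categories; a value exactly on a boundary goes to the higher category.
    """
    if len(cut_points) != 3 or list(cut_points) != sorted(cut_points):
        raise ValueError("cut_points must be three ascending values")
    return CATEGORIES[bisect_right(cut_points, item_logit)]

def group_items(item_logits, cut_points):
    """Group items by category, mildest first within each category."""
    groups = {name: [] for name in CATEGORIES}
    for item, logit in sorted(item_logits.items(), key=lambda kv: kv[1]):
        groups[impact_category(logit, cut_points)].append(item)
    return groups

# Hypothetical boundaries and item severities, for illustration only.
HYPOTHETICAL_CUTS = (-1.0, 0.5, 1.5)
HYPOTHETICAL_ITEMS = {"item A": -2.0, "item B": 0.1,
                      "item C": 1.0, "item D": 2.5}

scenarios = group_items(HYPOTHETICAL_ITEMS, HYPOTHETICAL_CUTS)
```

Items whose logits lie close to a boundary are the ones most sensitive to where the cut-points are placed.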
Mapping CAT scores to SGRQ items
Patients were recruited from sites in Belgium, France, Germany, The Netherlands, Spain, UK, and USA. Full details of patient recruitment and questionnaire administration are available elsewhere. The study was conducted in compliance with the Declaration of Helsinki with ethics approval provided by local ethics committees. All patients provided written informed consent prior to study procedures.
CAT categories within a COPD population
Full details of these patients have been published elsewhere; in brief, their mean age was 66 years, 32% were female and their mean FEV1 was 58% predicted. In Figure 1, the CAT severity categories are superimposed upon a cumulative frequency distribution of CAT scores in 1503 patients recruited from Belgium, France, Germany, The Netherlands, Spain, UK, and USA. The proportion of scores was 18% Low Impact, 43% Medium Impact, 28% High Impact, and 11% Very High Impact.
Correlation with SGRQ
The Bland and Altman plot also shows the limits of agreement between CAT and SGRQ: 31% of the score differences are less than 5 points (i.e. a difference of ≤5%), 60% are less than 10 points (difference of ≤10%), and 90% are less than 20 points (difference of ≤20%). These numbers show substantial agreement between the CAT and SGRQ.
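Proportions of this kind are the share of patients whose absolute score difference falls below a given threshold. A minimal sketch is shown below; the function name `share_within` and the example differences are assumptions made for illustration and are not study data.

```python
def share_within(diffs, limit):
    """Fraction of score differences whose absolute size is below `limit`.

    diffs: per-patient differences between the rescaled CAT score and the
    SGRQ score, both on a 0-100 scale.
    """
    if not diffs:
        raise ValueError("diffs must not be empty")
    return sum(abs(d) < limit for d in diffs) / len(diffs)

# Made-up differences for five hypothetical patients.
example_diffs = [-3, 4, 8, -12, 25]

within_5 = share_within(example_diffs, 5)
within_10 = share_within(example_diffs, 10)
within_20 = share_within(example_diffs, 20)
```

Because the thresholds are nested, the shares can only stay the same or increase as the limit is widened.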
Creation of CAT scenarios
SGRQ-C items grouped by corresponding CAT severity category
Very High Impact
Breathless several days a week
Housework takes long or stop for rests
Chest causes lot of problems or most important problem
Cough causes tiredness
Breathless walking up hills
Breathless most days a week
3 or more attacks of chest trouble in last year
Takes a long time to get washed or dressed
Difficult to carry heavy loads, etc (-3.16)
Bring up phlegm several days a week
Get afraid/panic when can't get breath
Breathless walking around home
Have to stop/slow down if hurry/walk fast
Wheezing attacks only with chest infections
Breathless walking on level ground outside the house
Chest trouble is a nuisance to family, friends (1.49)
Chest condition causes a few problems
Cough several days a week
Wheezing attacks several days a week
Cannot take bath/shower or takes long time (1.55)
Difficult to walk up hill, light gardening, etc
Wheezing attacks a few days a month
Cough and/or breathing embarrassing in public
Cannot go out for entertainment (1.87)
Most days are good in average week
1-2 attacks of chest trouble in last year
Cough and/or breathing disturbs sleep
Cough hurts (2.11)
Stops 1 or 2 things
Bring up phlegm most days a week
Feel not in control of chest problem (0.49)
Cannot do housework
Breathless walking up a flight of stairs
A few good days in an average week
Wheezing attacks most days a week
Have become frail or invalid because of chest (2.42)
Cough only with chest infections
Cough most days a week
Stops patient doing most things they want to do (0.60)
Cannot go out of house for shopping
Walk slower than others or stop for rests
Breathless when bending over
Breathless getting washed/dressed (0.62)
Stops patient doing everything they want to do (3.11)
Breathless only with chest infections
Wheeze worse in morning
No good days in average week (0.63)
Cannot move far from bed or chair (3.40)
Get exhausted easily
Breathless when talking
Walk slowly or stop walking one flight of stairs
Exercise felt not to be safe
Bring up phlegm only with chest infections
Everything seems too much of an effort
Usually cannot play sports or games
COPD ladder of severity
COPD ladder of poor health
Cannot move far from bed or chair
Have become frail or an invalid
Cannot do housework
Cannot take bath/shower or takes a long time
Breathless walking around the home
Chest trouble has become a nuisance to friends/relatives
Everything seems too much of an effort
No good days in the week
Stops patient doing most of what they want to do
Feel that not in control of chest problem
Cough/breathing disturbs sleep
Get afraid or panic when cannot get breath
Wheeze worse in the morning
Breathless on bending over
Wheezing attacks on most days
Cough several days a week
Breathlessness on most days
Housework takes a long time or have to take rests
Usually cannot play sports or games
Gets exhausted easily
Walk slower than other people or stop for rests
Breathlessness stops patient doing one or two things
Chest condition causes a few problems
Breathless walking up hills
This analysis has used an objective scientific method to create clinical scenarios that are associated with different scores obtained with a new measure of impaired health status for COPD. A number of factors made this possible: 1. Rasch-imputed mapping has been used successfully in other diseases to map measures between two instruments, and to develop scenarios corresponding to outcomes within an instrument; 2. CAT scores and SGRQ-C scores correlate well across the entire scaling range from very mild to very severe; 3. The CAT scores and SGRQ-C items could be expressed in the same units of measurement; 4. The SGRQ is made up of sufficient items (some of which have multiple response options, each with its own calculated logit value) to permit relatively rich descriptions, so each CAT category was associated with 12 or more SGRQ-C items; 5. Rasch models are thought to be sample independent, thereby permitting comparisons between different groups of patients.
This approach enabled us to provide scenarios that describe patients exhibiting CAT scores ranging from the very mild to the very severe. For example, patients who become breathless while walking up hills fall into the Low Impact CAT category, while those who become breathless while walking around the home fall into the Very High Impact category. These scenarios allow for a more rounded understanding of the effects of COPD associated with different CAT scores and for a more ready appreciation of what the scores mean for the patient in terms of the effect of COPD on their lives. The data used to map SGRQ-C items to CAT severities were derived from multiple countries and, during the CAT's development, items that performed differently in different countries were excluded. For these reasons, we believe that large regional variation in the scenarios is unlikely and that they are applicable wherever a valid translation of CAT is available (current list available at http://www.catestonline.org).
There are, however, some weaknesses with the approach used here. Ideally, the Rasch analysis would have been performed on the same patient population as that used for the CAT analysis, but this was not possible for resource reasons. However, we have shown previously that within a study population repeat estimates of item severity calculated using Rasch analyses were very stable over time. The items in the SGRQ-C do not provide a fully comprehensive description of every effect that COPD can have on a person, but they are common effects that should be experienced by most patients. Some of the items do not seem intuitively to be of the 'right' severity; for example, bringing up phlegm only with chest infections is associated with a similar degree of severity as having to stop when walking up stairs. However, these severity estimates were calculated using data from approximately 900 COPD patients, so they should be reliable. Finally, as the cut-points for the CAT severity categories were chosen ad hoc and on a purely descriptive basis rather than on an empirical clinical definition, there is the possibility that items mapped from the SGRQ-C that fell close to the border between two severity categories may have been mis-assigned. It is beyond the scope of this work to validate the CAT severity categories, and it is acknowledged that future work may be needed both to test prospectively the validity of the CAT severity categories (and the SGRQ-C mapping) in a cohort of patients in whom data are collected using both the SGRQ-C and the CAT, and to relate the CAT severity categories to needs of care.
An alternative approach to conveying the impact of COPD, as reflected in CAT scores, is to present a usable number of selected SGRQ items in an ascending hierarchy of severity or ladder. When using such a ladder it is important to remember that higher scores are likely to be associated with many of the milder items; a patient whose sleep is disturbed by cough or breathlessness is also likely to do housework slowly and be unable to do one or two things that they would like to do. By the same token, they are less likely to be breathless when walking around the home or have problems bathing. This COPD severity ladder is presented as an alternative approach to scenarios for providing clinicians with a picture of the life and health of a COPD patient with any given CAT score. It is important to note that it should not be used as a scale and CAT scores should not be attributed to the patient's response to selected items from this ladder - its purpose is purely illustrative.
One important contribution of this work is to focus attention on the true impact of COPD on a patient's life. In this respect, the very general adjectives used to describe the severity of the impact of the disease on the patient may be doing a disservice to the patient. A 'Medium Impact' CAT score looks anything but medium when described as a scenario; most healthy people are likely to judge that getting exhausted easily and needing to take a long time to do housework constitute quite a severe impact on health. If use of the CAT and these scenarios produces a re-evaluation of what constitutes 'mild or moderate COPD', then patients can only benefit.
In conclusion, this work has shown that it is possible to relate CAT scores to scenarios descriptive of impaired health status in COPD. The CAT is a concise instrument for use in everyday clinical practice; the scenarios described here allow for a more complete understanding of what its scores reflect in terms of the effect of the disease on the patient's health. It is our hope that a more complete understanding of a COPD patient's health status may help clinicians optimise their management.
Editorial support in the form of copyediting and styling the manuscript for submission was provided by Geoff Weller at Gardiner-Caldwell Communication and was funded by GlaxoSmithKline. Manuscript administration charges were paid by GlaxoSmithKline.
1. Global Initiative for Chronic Obstructive Lung Disease (GOLD) guideline: Global Strategy for the Diagnosis, Management and Prevention of Chronic Obstructive Pulmonary Disease (updated 2009). Last accessed 12 May 2011, [http://www.goldcopd.com]
2. Cazzola M, MacNee W, Martinez FJ, Rabe KF, Franciosi LG, Barnes PJ, Brusasco V, Burge PS, Calverley PM, Celli BR, Jones PW, Mahler DA, Make B, Miravitlles M, Page CP, Palange P, Parr D, Pistolesi M, Rennard SI, Rutten-van Mölken MP, Stockley R, Sullivan SD, Wedzicha JA, Wouters EF, American Thoracic Society; European Respiratory Society Task Force on outcomes of COPD: Outcomes for COPD pharmacological trials: from lung function to biomarkers. Eur Respir J. 2008, 31: 416-469. 10.1183/09031936.00099306.
3. Guyatt GH, Berman LB, Townsend M, Pugsley SO, Chambers LW: A measure of quality of life for clinical trials in chronic lung disease. Thorax. 1987, 42: 773-778. 10.1136/thx.42.10.773.
4. van der Molen T, Willemse BW, Schokker S, ten Hacken NH, Postma DS, Juniper EF: Development, validity and responsiveness of the Clinical COPD Questionnaire. Health Qual Life Outcomes. 2003, 1: 13. 10.1186/1477-7525-1-13.
5. Jones PW, Quirk FH, Baveystock CM: The St George's Respiratory Questionnaire. Respir Med. 1991, 85 (Suppl B): 25-31.
6. Meguro M, Barley EA, Spencer S, Jones PW: Development and validation of an improved COPD-specific version of the St George's Respiratory Questionnaire. Chest. 2006, 132: 456-463.
7. Jones PW: St. George's Respiratory Questionnaire: MCID. COPD. 2005, 2: 75-79. 10.1081/COPD-200050513.
8. Jones PW: Interpreting thresholds for a clinically significant changes in health status in asthma and COPD. Eur Respir J. 2002, 19: 398-404. 10.1183/09031936.02.00063702.
9. Jones PW, Harding G, Berry P, Wiklund I, Chen WH, Kline Leidy N: Development and first validation of the COPD Assessment Test. Eur Respir J. 2009, 34: 648-654. 10.1183/09031936.00102509.
10. Jones P, Harding G, Wiklund I, Berry P, Leidy N: Improving the process and outcome of care in COPD: development of a standardised assessment tool. Prim Care Respir J. 2009, 18: 208-215. 10.4104/pcrj.2009.00053.
11. Bland JM, Altman DG: Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986, 1: 307-310.
12. Rasch G: Probabilistic models for some intelligence and attainment tests. 1960, Chicago: University of Chicago Press
13. Barley EA, Jones PW: Repeatability of a Rasch model of the AQ20 over five assessments. Qual Life Res. 2006, 15: 801-809. 10.1007/s11136-005-5466-z.
14. Andrich D, Sheridon BE, Luo G: RUMM2020 (Rasch unidimensional measurement models). 2005, Perth: RUMM Laboratory
15. Hawthorne G, Densley K, Pallant JF, Mortimer D, Segal L: Deriving utility scores from the SF-36 health instrument using Rasch analysis. Qual Life Res. 2008, 17: 1183-1193. 10.1007/s11136-008-9395-5.
16. Young TA, Rowen D, Norquist J, Brazier JE: Developing preference-based health measures: using Rasch analysis to generate health state values. Qual Life Res. 2010, 19: 907-917. 10.1007/s11136-010-9646-0.
17. Wright BD, Linacre JM: Glossary of Rasch Measurement Terminology. Rasch Measurement Transactions. 2001, 15: 824-825. Last accessed 12 May 2011, [http://www.rasch.org/rmt/rmt152e.htm]
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2466/11/42/prepub