Validation of the global lung initiative 2012 multi-ethnic spirometric reference equations in healthy urban Zimbabwean 7–13 year-old school children: a cross-sectional observational study

Background The 2012 Global Lung Function Initiative (GLI2012) provides multi-ethnic spirometric reference equations (SRE) for the 3–95 year-old age range, but Sub-Saharan African populations are not represented. This study aimed to evaluate the fit of the African-American GLI2012 SRE to a population of healthy urban and peri-urban Zimbabwean school-going children (7–13 years).
Methods Spirometry and anthropometry were performed on black-Zimbabwean children recruited from three primary schools in urban and peri-urban Harare, with informed consent and assent. Individuals with a history or current symptoms of respiratory disease or with a body mass index (BMI) z-score < −2 were excluded. Spirometry z-scores were generated from the African-American GLI2012 SRE, which adjust for age, sex, ethnicity and height, after considering all GLI2012 modules. Anthropometry z-scores were generated using the British (1990) reference equations, which adjust for age and sex. The African-American GLI2012 z-score distributions for the four spirometry measurements (FVC, FEV1, FEV1/FVC and MMEF) were evaluated across age, height, BMI and school (as a proxy for socioeconomic status) to assess for bias. The African-American GLI2012 SRE and the Polgar equations (currently adopted in Zimbabwe) were also compared on their derived percent-predicted values.
Results The validation dataset contained acceptable spirometry data from 712 children (344 girls, mean age: 10.5 years (SD 1.81)). The spirometry z-scores were reasonably normally distributed, with all means lower than zero but within the range of ±0.5, indicating a good fit to the African-American GLI2012 SRE. The African-American GLI2012 SRE produced z-scores closest to a normal distribution. Z-scores of girls deviated more than those of boys. Weak correlations (Pearson's correlation coefficient < 0.2) were observed between spirometry and anthropometry z-scores, and scatterplots demonstrated no systematic bias associated with age, height, BMI or socioeconomic status. The African-American GLI2012 SRE provided a better fit for Zimbabwean paediatric spirometry data than the Polgar equations.
Conclusion The use of the African-American GLI2012 SRE in this population could help in the interpretation of pulmonary function tests.


Background
Spirometry is a clinical tool used to measure and monitor lung function. Well-defined spirometric variables describe patterns of lung function abnormality and aid in the diagnosis of lung diseases that manifest with obstructive or restrictive lung function patterns [1]. Lung function results obtained from a patient after a spirometry manoeuvre are compared to appropriate spirometric reference equations (SRE) derived from healthy individuals of the same ethnicity, height, age, and sex [2]. SRE have traditionally been generated using different methods and populations, resulting in significant variability, and have rarely included data from sub-Saharan Africa [3][4][5][6]. There is also increasing concern over the use of fixed percentage-predicted cut-offs to define abnormalities in clinical settings, as this can lead to incorrect interpretation of spirometry results [2,7].
To address this, the European Respiratory Society (ERS), through the Global Lung Function Initiative (GLI), developed global SRE for healthy individuals aged 3-95 years in 2012. The data used to generate the GLI 2012 SRE were collected from Europe, Australia, Latin America, East Asia, India, North America and North Africa [8]. The GLI 2012 provide age-, height-, sex- and ethnic-specific SRE, with separate equations for Caucasians, African-Americans, South East Asians and North East Asians [9]. These equations provide lower-limit-of-normal (LLN) values, which can be defined as the 5th percentile values (z-score < −1.64) of the healthy, non-smoking population [2]. The z-score reflects the number of standard deviations by which a measurement differs from its predicted/reference value, and is centred at zero [10]. It is based on a normally distributed reference population and is thought to be a more valid basis for defining the LLN than traditional fixed cut-offs (i.e., 0.8 for forced vital capacity [FVC] and forced expiratory volume in 1 s [FEV1], and 0.7 for the FEV1/FVC ratio) used to help define airflow limitation and obstruction [2,11,12]. Use of the GLI 2012 SRE is endorsed by the American Thoracic Society (ATS) and the ERS, and many manufacturers now install the module in their devices [8,13,14].
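As a minimal sketch of how a z-score and an LLN relate, the Python snippet below assumes the LMS form (skewness L, median M, coefficient of variation S) in which the GLI 2012 equations are usually expressed. The function names and the L, M and S values are hypothetical illustrations and are not GLI 2012 coefficients.

```python
import math

Z_LLN = -1.64  # 5th percentile cut-off quoted in the text


def lms_z_score(measured, L, M, S):
    """Z-score of a measurement under the LMS form (an assumption here)."""
    if L == 0:
        return math.log(measured / M) / S
    return ((measured / M) ** L - 1.0) / (L * S)


def lms_lln(L, M, S, z=Z_LLN):
    """Measurement value that corresponds to the given z-score (the LLN)."""
    if L == 0:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Hypothetical coefficients for one child (NOT real GLI 2012 values).
L, M, S = 1.0, 2.0, 0.1
fev1 = 1.8  # made-up measured FEV1 in litres

z = lms_z_score(fev1, L, M, S)
lln = lms_lln(L, M, S)
print(f"z = {z:.2f}, LLN = {lln:.3f} L, below LLN: {z < Z_LLN}")
```

The two functions are inverses of each other, so a measurement exactly at the LLN has a z-score of −1.64 by construction.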
Studies validating the GLI 2012 SRE have made varying conclusions, with some indicating a poor fit for local populations [10,15]. However, the FEV1/FVC ratio has consistently demonstrated a better fit across populations than other lung function measurements [10,15,16,17]. Potential reasons for poor fit of SRE include sampling which is unrepresentative of the population, potential mis-specification of the prediction equations, and environmental factors such as exposure to indoor and/or ambient air pollution, malnutrition, and low socioeconomic status (SES), which may result in lower lung volumes on a population level, leading to erroneous estimations [18][19][20][21][22][23]. Like many SRE, the GLI 2012 SRE lack contribution of lung function data from sub-Saharan African populations, and use of the African-American GLI 2012 SRE is generally recommended for African populations [8].
As such, an ERS task force recommended additional studies to validate the GLI 2012 SRE in non-Caucasian populations [8]. A cross-sectional observational study was performed to evaluate the performance of the GLI 2012 SRE among urban and peri-urban Zimbabwean children aged 7-13 years. The GLI 2012 SRE were also compared against the Polgar equations because they are currently used in clinical practice.

Study population
Between June and October 2018, black-Zimbabwean children aged 7 to 13 years were recruited from three schools in Harare, randomly selected from three economic zones classified as high-, medium- and low-income by the Ministry of Education. The schools were classified after taking into account the location and economic status of the school. Children were excluded from the validation dataset if they had a history of chronic respiratory disease or respiratory symptoms including cough with or without sputum, wheeze and shortness of breath in the past 3 months, or reported regular exposure to smoke in the past 6 months (living at least 3 days per week with people smoking cigarettes) [24,25]. Children with a body mass index (BMI) z-score < −2 were also excluded from the analysis dataset [8,26]. Eligible children were randomly selected in advance from each class level in a 1:1 sex ratio using class attendance registers supplied by the schools, and replacements for those absent were selected by convenience sampling from the same class. Based on GLI guidelines, a minimum sample size of 150 was required for each group (boys and girls) to evaluate the GLI 2012 SRE [27].

Data collection
A self-administered parental paper questionnaire was used to collect data on children's respiratory health, including asthma or other chronic respiratory diseases. An interviewer-administered paper questionnaire was used to record sociodemographic data and current respiratory symptoms from the children. Height (cm) and weight (kg) were measured barefoot in light clothing with 1.0 cm and 0.1 kg precision. A Seca mechanical medical weight scale and Seca 213 stadiometer (Seca Mechanical Floor Scales Class III, Seca Precision for health, Hamburg, Germany) were used to measure weight and height respectively. Spirometry was performed using Koko Sx software running on Windows 10, connected to a pneumotach (Koko Legend Sx, nSpire Health, Inc. Longmont, USA), according to ATS/ERS guidelines [28].
The instructor demonstrated an exemplary spirometry manoeuvre before the child attempted spirometry. The test was performed in phases: an initial deep breath, followed by a maximal exhalation and a final inhalation, as directed by the instructor. Tests were performed in a standing position, with each child taking on average 8-11 min to produce at least three volume-time curves. Children performed three to eight efforts and the best manoeuvre was used for analysis [28]. The best effort was defined as the one with the largest sum of FVC and FEV1. All volume-time curves were first checked by the diagnostic software, which assessed the duration of the exhalation phase (≥ 6 s in ≥10 year-olds and ≥ 3 s in < 10 year-olds) [30]. The operator further checked the degree of effort, as indicated by a sharp peak in the curve, and the absence of cough or glottic closure during exhalation. Only measurements from children performing at least three acceptable and repeatable efforts were included in the validation dataset [28]. The same device was used for all spirometry sessions; it was calibrated daily before use and after a change in ambient conditions (a two-unit change in temperature, measured in degrees Celsius, and in atmospheric pressure, measured in millimetres of mercury).
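The age-dependent exhalation-duration criterion and the three-effort minimum described above can be sketched in Python as follows. This is an illustrative assumption about how such a check might be coded, not the logic of the diagnostic software; the function names are hypothetical, and the operator's judgement of effort, cough and repeatability is not modelled.

```python
def min_exhalation_seconds(age_years):
    """Minimum exhalation time quoted in the text: 6 s from age 10, else 3 s."""
    return 6.0 if age_years >= 10 else 3.0


def session_is_usable(age_years, exhalation_times, min_efforts=3):
    """True if at least `min_efforts` efforts meet the duration criterion.

    Only exhalation duration is checked; effort quality and repeatability
    were assessed separately and are outside this sketch.
    """
    threshold = min_exhalation_seconds(age_years)
    long_enough = [t for t in exhalation_times if t >= threshold]
    return len(long_enough) >= min_efforts


# Made-up exhalation times (seconds) for two hypothetical children.
print(session_is_usable(8, [3.1, 3.5, 4.0]))        # 8-year-old, three efforts >= 3 s
print(session_is_usable(11, [3.1, 3.5, 4.0, 6.2]))  # 11-year-old, one effort >= 6 s
```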

Statistical analysis
Data were de-identified using unique identifier codes and entered into STATA for analysis (StataCorp. 2017. Stata Statistical Software: Release 15. College Station, TX: StataCorp LLC). Spirometry outcomes were FVC, FEV1, the FEV1/FVC ratio and MMEF (maximal mid-expiratory flow). GLI z-scores and LLN values for FVC, FEV1, FEV1/FVC and MMEF were computed from the GLI 2012 SRE using height, age, sex and ethnicity data [2,31]. The z-scores and LLN values were calculated using the available Microsoft Excel macro calculators, which provide age-, height-, sex- and ethnic-specific values [8]. The GLI 2012 z-score is an unbiased estimate of the position of an observed spirometry value within the distribution of the GLI 2012 SRE [32]. If the GLI 2012 SRE and the observed spirometry values are in perfect agreement, the mean z-score is zero with a standard deviation (SD) of one (a standard normal distribution). According to the consensus reached by the GLI team and other studies validating these SRE, a mean z-score outside the range of ±0.5 is considered clinically significant, corresponding to a difference of at least 5-6% in the specified lung function measurement [8,10,15,16,17]. The LLN was taken as the fifth percentile of the healthy population, calculated using the GLI 2012 SRE. We considered all GLI ethnic modules to determine whether the African-American module provided the most appropriate fit.
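The fit criterion described above (mean z-score within ±0.5, SD close to one) can be sketched in Python as follows. The analysis itself was run in STATA; this snippet is only an illustration, the function name is hypothetical and the z-scores are made up.

```python
import statistics


def summarise_fit(z_scores, mean_limit=0.5):
    """Mean and SD of a set of z-scores, plus the +/-0.5 offset check."""
    mean_z = statistics.mean(z_scores)
    sd_z = statistics.stdev(z_scores)  # sample SD; ideally close to 1
    return {
        "mean": mean_z,
        "sd": sd_z,
        "acceptable_offset": abs(mean_z) <= mean_limit,
    }


# Made-up FEV1 z-scores, for illustration only.
z_fev1 = [-1.2, -0.4, 0.3, -0.9, 0.5, -0.1, -1.6, 0.8]
summary = summarise_fit(z_fev1)
print(f"mean = {summary['mean']:.3f}, SD = {summary['sd']:.3f}, "
      f"within limit: {summary['acceptable_offset']}")
```

A negative mean inside the ±0.5 band, as in this toy example, would be read as a small offset that is not considered clinically significant.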
The Shapiro-Wilk test and visual plots (histograms and quantile-quantile (Q-Q) plots) were used to assess normality of variables. Outcomes were compared graphically against age, against height, weight and BMI z-scores (calculated using the 1990 British reference values), and against school (as a proxy for SES) to determine whether any bias was present [33]. A circular scatter around the origin would provide no evidence for bias with anthropometry z-scores, while no linear relationship should be present with age.
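To illustrate what a Q-Q plot compares, the Python sketch below pairs each sorted observation with the matching quantile of a standard normal distribution. The plotting position (i + 0.5)/n is one common convention and is an assumption here, as is the function name; the Shapiro-Wilk test itself is not reproduced because it is not part of the Python standard library.

```python
from statistics import NormalDist


def qq_points(sample):
    """Pairs of (theoretical normal quantile, observed value) for a Q-Q plot."""
    n = len(sample)
    std_normal = NormalDist()  # mean 0, SD 1
    ordered = sorted(sample)
    return [
        (std_normal.inv_cdf((i + 0.5) / n), value)
        for i, value in enumerate(ordered)
    ]


# Made-up z-scores, for illustration only.
points = qq_points([0.3, -1.2, 0.9, -0.1, 0.0])
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

Points that fall close to a straight line suggest approximate normality; a vertical shift of the line reflects a non-zero mean.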
We also evaluated the association between anthropometry and spirometry z-scores using Pearson's product-moment correlation and linear regression. A lack of correlation or association indicates a good fit of the GLI 2012 SRE on the population [16].
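A minimal Python sketch of the correlation check is given below. It computes Pearson's product-moment coefficient from its definition and applies the 0.2 threshold used in this study to label a correlation as weak; the function names and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def is_weak(r, limit=0.2):
    """True if the absolute correlation is below the chosen limit."""
    return abs(r) < limit


# Made-up height and FEV1 z-scores, for illustration only.
height_z = [-1.0, -0.5, 0.0, 0.5, 1.0]
fev1_z = [0.2, -0.3, 0.1, -0.3, 0.2]
r = pearson_r(height_z, fev1_z)
print(f"r = {r:.3f}, weak: {is_weak(r)}")
```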
Percent-predicted values derived from the GLI 2012 SRE were also compared statistically against those derived from the Polgar SRE for the observed measurements [34].
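The percent-predicted comparison rests on a simple ratio, sketched below in Python. The observed and predicted volumes are invented for illustration and do not come from either set of equations; the variable and function names are likewise hypothetical.

```python
def percent_predicted(observed, predicted):
    """Observed value expressed as a percentage of the predicted value."""
    return 100.0 * observed / predicted


# Made-up FVC (litres) and made-up predictions from two reference equations.
observed_fvc = 2.10
predicted_fvc = {"GLI 2012 African-American": 2.20, "Polgar": 2.40}

results = {
    name: percent_predicted(observed_fvc, value)
    for name, value in predicted_fvc.items()
}
for name, pct in results.items():
    print(f"{name}: {pct:.1f}% predicted")

# The equation whose percent predicted is closest to 100% fits best here.
best = min(results, key=lambda name: abs(100.0 - results[name]))
print("closest to 100%:", best)
```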
Normally distributed variables are presented as mean (SD), and Student's t-test was used to compare means of spirometry and anthropometry z-scores across demographic factors. All results are sex-specific to account for smaller lung volumes in girls compared to boys and for the high variation expected in this age group of 7-13-year-olds, because girls will be at a more advanced stage of puberty than boys [35].

Results
Of 978 children that were approached, 209 (21%) did not provide consent. After exclusion of 24 individuals who did not meet eligibility criteria and 33 children who failed to perform technically acceptable spirometry measurements, 712 were included in the analysis (Fig. 1).
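The participant flow can be checked with a few lines of arithmetic; the Python sketch below uses only the counts reported above, with hypothetical variable names.

```python
approached = 978
no_consent = 209
ineligible = 24
failed_spirometry = 33

# Children remaining after each reported exclusion step.
analysed = approached - no_consent - ineligible - failed_spirometry
refusal_pct = 100.0 * no_consent / approached

print(f"analysed: {analysed}")
print(f"refusal rate: {refusal_pct:.1f}%")
```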
Age ranged from 7 to 13 years for both girls and boys. However, boys had a higher mean age and higher mean BMI-for-age and MMEF z-scores, consistent with other studies [36][37][38] (Table 1).
On average, children who were excluded from the study were older (11.6 years, SD: 1.45) than those considered for analysis. The ratios of boys to girls in the included (1:1) and excluded (1:2) study groups were different, with 37 girls being excluded from the study. The mean (SD) BMI z-scores for excluded and included children were −0.28 (1.81) and 0.07 (0.9) respectively (Table 1S1, Supplementary file 1).

GLI 2012 z-scores
The Shapiro-Wilk test indicated that the FEV1/FVC (for both sexes) and MMEF (for boys) z-scores generated from our sample were not perfectly normally distributed (mean ≠ 0, SD ≠ 1; Table 2) [39]. Nonetheless, Q-Q plots of the z-scores generated from the GLI 2012 SRE for a given age, sex, height and ethnicity followed a straight line (Figure 1S2, Supplementary file 2), which indicated approximate normality, although mean z-scores were negative. Importantly, the distribution of spirometry z-scores showed that the African-American module of the GLI 2012 SRE is a good fit for urban and peri-urban Zimbabwean children. The African-American module gave the smallest absolute mean z-scores (closest to zero); those from the other GLI 2012 ethnic modules were generally outside the range of ±0.5.

Scatterplots and distribution of African-American GLI 2012 z-scores
Scatterplots for spirometry z-scores did not show any linear trend (Fig. 2). The spread of z-scores was less variable for the FEV1/FVC ratio compared to FVC and FEV1 z-scores across age. The scatterplots showed that z-scores below the lower threshold value of −1.64 (LLN) were not distributed in any particular pattern that might suggest an association of impaired lung function with age, height or BMI (Figs. 2 and 3). The distribution of spirometry z-scores in relation to the 5th percentile (LLN) identified that for FEV1, 8.7% (7.9% of boys, 9.6% of girls) and for FVC, 5.8% (4.1% of boys, 7.6% of girls) had values below the LLN. However, the FEV1/FVC z-scores showed a different pattern, with 18.4% (18.2% of boys, 18.6% of girls) of children having values below the LLN, indicating a deviation from the GLI 2012 distribution.
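The proportion of children below the LLN is a simple count of z-scores under −1.64, which would be about 5% in a population that matches the reference distribution. The Python sketch below shows the calculation on made-up z-scores; the function name is hypothetical.

```python
def percent_below_lln(z_scores, z_lln=-1.64):
    """Percentage of z-scores that fall strictly below the LLN cut-off."""
    below = sum(1 for z in z_scores if z < z_lln)
    return 100.0 * below / len(z_scores)


# Made-up FEV1/FVC z-scores, for illustration only.
ratio_z = [-2.0, -1.7, -1.0, 0.0, 0.5, 1.2, -0.3, 0.9, -1.64, 0.1]
pct = percent_below_lln(ratio_z)
print(f"{pct:.1f}% below LLN (about 5% expected under a perfect fit)")
```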

Anthropometric and demographic factors related to African-American GLI z-scores
The analysis of relationships between height, weight, BMI, age, and sex with spirometry z-scores demonstrated weak correlations, with Pearson's correlation coefficient values between ±0.2 (Table 3). The linear associations between spirometry variables, anthropometric indices and school income as indicated by β coefficients from linear regression were within ±0.5 (Table 1S3, Supplementary file 3).
Scatterplots for spirometry z-scores plotted against BMI z-scores showed a central cluster around the origin (Fig. 3b, d, f), providing no evidence for bias. However, all height scatterplots (Fig. 3a, c, e) were more dispersed across values of height z-score, suggesting greater variability compared to the BMI plots; this was most evident for FEV1 across height z-scores (Fig. 3a). Scatterplots stratified by school showed similar patterns to unstratified plots, showing no bias by SES (Figure 1S4-3S4, Supplementary file 4).

Comparison of the African-American GLI 2012 and the Polgar SRE
Comparisons between the mean percentage predicted for FVC, FEV1, FEV1/FVC and MMEF by sex, generated from the African-American GLI 2012 and the Polgar SRE were performed. All of the mean percent predicted values were lower than 100% (full prediction) regardless of SRE used. Percent predicted values were consistently closer to 100% when using the GLI 2012 as compared to the Polgar SRE, indicating a better fit for the African-American GLI 2012 SRE. The FVC measurements were the least underestimated by the Polgar SRE whilst MMEF had the highest differences (Fig. 4). The observed patterns were the same in girls and boys. A Bland-Altman plot for the spirometric variables showed mean differences between the GLI 2012 and Polgar SRE and evidence of proportional bias as the difference of GLI 2012

Discussion
This study is the first to evaluate the use of the African-American GLI 2012 SRE in Zimbabwean children aged 7-13 years attending primary school. Our findings demonstrate that lung function parameters for Zimbabwean children are comparable to those of African-American children, as indicated by the overall fit of the African-American GLI 2012 SRE. Thus, the African-American GLI 2012 SRE are applicable for use in Zimbabwean children.
These findings are consistent with other findings in children [15] and adults [40] from sub-Saharan Africa. The similarities in spirometric variables between Zimbabwean and African-American children highlight the influence of ethnic background on lung development in healthy individuals, regardless of healthcare access, exposure to air pollution and SES [15,41,42]. Indeed, we detected no difference in lung function patterns between schools belonging to areas characterised by a different SES in this study. We identified anthropometry differences in this population consistent with studies that have also highlighted sex-related differences in anthropometry and lung function indices in children of the same age [36,37].
Z-scores for spirometry variables are dimensionless values that show the number of SDs by which a measurement differs from the GLI 2012 SRE population values [2,15]. The GLI 2012 SRE predict standardised z-score values that are adjusted for ethnicity and anthropometric variables. Mean African-American GLI 2012 z-scores for all the spirometry variables were within 0.5 z-scores of zero, which is within the acceptable range around the GLI 2012 perfect-fit prediction [15,32]. However, the z-score SD for the FEV1/FVC ratio was ≥1, indicating more variability than the reference population, thus affecting the performance of the African-American GLI 2012 LLN in this population [15,43,44]. By definition, the LLN allows 5% of healthy people to be misclassified, and higher variability in FEV1/FVC may increase misclassification of airway obstruction [2,44]. Conversely, however, as the overall population is slightly shifted down away from the predicted mean, this may reflect an actual reduction of FEV1/FVC in our population. The FEV1/FVC is sensitive to early life exposures and may be an early indicator of decline in lung function later in life [45]. In this study, all the spirometry z-scores had a negative offset, indicating that the African-American GLI 2012 SRE generate values which are slightly above those of Zimbabwean children regardless of sex. Mean predicted values for all spirometry values were lower than 100% (perfect fit), and the observed differences were lower in girls than boys.
With a perfect fit, the z-scores developed from the GLI 2012 SRE should show a lack of association with ethnicity and anthropometric variables since they are independent variables for generating the LLN [8,16]. We identified weak correlations between anthropometric and spirometry z-scores with no consistent direction. Furthermore, the scatterplots for these associations showed no particular pattern, indicating a lack of any physiological correlations. Similar results indicating weak correlations were also reported in other studies from Tunisian, Swedish and Asian populations [10,15,16]. Analysis of the scatterplots and multivariable analysis stratified by school-income level showed inconsistent influence of SES in explaining the variability in lung function z-scores. However, the associations detected between FEV1/FVC and BMI z-scores may be contributing to the high variability in this measure, resulting in less goodness of fit by the African-American GLI 2012 SRE. Furthermore, this finding highlights the possibility of more variability in the body frames of Zimbabwean as compared to African-American children, and this may influence the association of anthropometric and spirometric measurements in our population.
Most physicians in Zimbabwe use the Polgar SRE for diagnosis of lung disease; these were developed from data from North America, Europe and Japan and compiled by Polgar & Promadhat (1971) for the 6-18-year age group [2,34]. In contrast, the GLI 2012 SRE were derived from 74,117 healthy individuals worldwide. Comparison of mean percent-predicted values derived from the GLI 2012 SRE against those derived from the Polgar SRE in this population showed substantially higher percent-predicted values with the African-American GLI 2012 SRE (by 5.6, 9.1 and 3.6% for FVC, FEV1 and FEV1/FVC, respectively) [8,46]. Lower percent-predicted values with the Polgar SRE than with the GLI 2012 SRE have also been identified in other populations [15,46].
Our results suggest that the use of the African-American GLI 2012 SRE in Zimbabwean children can improve identification of a tendency towards a restrictive or obstructive lung function pattern. Diagnosis of associated lung diseases can be enhanced by using the LLN to identify impaired lung function rather than fixed cut-offs, as this approach mitigates the anthropometric and ethnic group related biases that can result in misclassification of borderline lung function [8,47]. The LLN values were developed from a large sample using z-scores adjusted for ethnic group, height, age and sex. The LLN values can help define lung function abnormality: airflow [...] can alter the interpretation of spirometry results which will, in turn, affect the overall classification of patients as having a tendency towards an obstructive or restrictive lung pattern, thereby modifying the prevalence and subtypes of lung disorders [46,48]. The negative mean spirometry z-scores for all the variables imply that the LLN should be interpreted cautiously by practitioners, to avoid over-classifying children as having low lung function.
This study represents a response to the call of the ERS to validate the GLI 2012 SRE in ethnic groups that are not included in the sample used to derive these SRE [8]. Strengths of our study include a randomly selected sample and high-quality lung function variables collected in a standardised manner based on ATS/ERS guidelines. We used the same spirometer, which was regularly calibrated to minimise variability, and the failure rate for valid measurements was low.
We acknowledge several limitations. We had a 21% refusal rate, but the overall sample size was sufficient to validate the GLI 2012 SRE. The z-score calculations may have been subject to measurement error because they are adjusted for height, which was measured only to the nearest centimetre; for instance, a 1 cm difference in height for a 12-year-old male child can correspond to a difference of 0.08 and 0.1 in the predicted FEV1 and FVC z-scores, respectively. Our results may not be generalisable to other Zimbabwean settings where exposure to indoor and outdoor air pollution may differ from Harare; we did not measure air pollution and so were unable to assess its effects. The study did not capture birthweight and preterm status, which are associated with general lung development in children.

Conclusion
The African-American GLI 2012 SRE are appropriate for predicting lung function in Zimbabwean school-going urban and peri-urban children aged 7-13 years. In healthy Zimbabwean children, the African-American GLI 2012 SRE show better prediction than the Polgar SRE, supporting their use as the equations of choice for evaluating lung function in Zimbabwean urban and peri-urban school-age children.