DNA methylation and gene expression profiles to identify childhood atopic asthma associated genes

Background Asthma is a chronic inflammatory disorder of the airways involving many different factors. This study aimed to screen for the critical genes using DNA methylation/CpGs and miRNAs involved in childhood atopic asthma. Methods DNA methylation and gene expression data (Access Numbers GSE40732 and GSE40576) were downloaded from the Gene Expression Omnibus database. Each set contains 194 peripheral blood mononuclear cell (PBMC) samples of 97 children with atopic asthma and 97 control children. Differentially expressed genes (DEGs) with DNA methylation changes were identified. Pearson correlation analysis was used to select genes with an opposite direction of expression and differences in methylation levels, and then Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed. Protein–protein interaction network and miRNA–target gene regulatory networks were then constructed. Finally, important genes related to asthma were screened. Results A total of 130 critical DEGs with DNA methylation changes were screened from children with atopic asthma and compared with control samples from healthy children. GO and KEGG pathway enrichment analysis found that critical genes were primarily related to 24 GO terms and 10 KEGG pathways. In the miRNA–target gene regulatory networks, 9 KEGG pathways were identified. Analysis of the miRNA–target gene network noted an overlapping KEGG signaling pathway, hsa04060: cytokine-cytokine receptor interaction, in which the gene CCL2, directly related to asthma, was involved. This gene is targeted by eight asthma related miRNAs (hsa-miR-206, hsa-miR-19a, hsa-miR-9,hsa-miR-22, hsa-miR-33b, hsa-miR-122, hsa-miR-1, and hsa-miR-23b). The genes IL2RG and CCl4 were also involved in this pathway. Conclusions The present study provides a novel insight into the underlying molecular mechanism of childhood atopic asthma.


Background
Asthma is a respiratory disease caused by the interaction of genetic and environmental factors, known to be mediated by epigenetics [1]. Approximately 334 million people worldwide suffer from asthma. Childhood asthma mortality varies from 0.0 to 0.7 per 100,000 [2]. Candidate genes for asthma are wide-spread throughout the genome. There are multiple genes involved which may affect expression of the asthma phenotype. Different genes are related to childhood and adult asthma resulting in a different physiological foundation, and different methods of treatment for the two diseases [3,4].
Genetic variants drive the onset and development of asthma, including cytotoxic T-lymphocyte-associated protein 4 (CTLA4) and interleukin-10 (IL-10) which are involved in immune system regulation and inflammation [5,6]. DNA methylation, such as of the gene encoding the β-2 adrenergic receptor, is the most common epigenetic mechanism in the pathogenesis of asthma, and can change gene expression in asthmatic patients [7][8][9]. Acevedo et al. reported that regional DNA methylation and mRNA levels at the Gasdermin B/ORMDL sphingolipid biosynthesis regulator 3 locus were associated with the risk of childhood asthma [10]. Nicodemus-Johnson et al. reported that the type 2 cytokine IL-13 is a key mediator, and it is upregulated in asthma [11]. They found that a single exposure of IL-13 may induce DNA methylation changes in an asthmatic's airway cells and contribute to various asthma phenotypes. Brand et al. found that the epigenetic regulation of T cells can influence the sensitization and progress of experimental asthma [12]. Based on these findings, it is expected that biomarkers for the early diagnosis of asthma can be determined from the level of gene expression and methylation regulation.
In a prior study, Yang et al. demonstrated that DNA methylation at specific gene loci are associated with asthma based on GSE40736 data set, and suggested that epigenetic changes might play a role in establishing the immune phenotype associated with asthma. However, the results of that analysis were only at the DNA level [13]. In the current study, we aimed to identify critical genes and miRNAs in the progression of childhood atopic asthma. We downloaded DNA methylation and gene expression data from the GEO database, and screened critical differentially expressed genes (DEGs) with significant methylation changes from samples obtained from atopic asthmatic patients and compared them with samples from healthy controls. Then we identified the Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of these DEGs and constructed a gene co-expression and miRNA-target gene regulatory network. The aim of this study is to potentially provide novel diagnostic biomarkers in the nasal epithelia of children with atopic asthma.

DNA methylation and gene expression data resource
The DNA methylation dataset, superserie GSE40736, was downloaded from the National Center of Biotechnology Information Gene Expression Omnibus (GEO) database(https:// www. ncbi. nlm. nih. gov/) [14], which contained two subseries (GSE40732 and GSE40576) that are gene expression and methylation level detection spectra, respectively. Each set contains 194 peripheral blood mononuclear cell (PBMC) samples of 97 children with atopic asthma and 97 control children. The GSE40732 data set was tested on the platform of Nimble Gen Homo sapiens Expression Array. The GSE40576 data set was tested on the Illumina HumanMethylation450 BeadChip platform.

Data preprocessing and differentially expressed gene screening
After downloading the original microarray data, the limma package in R software (version3.1.3, https:// bioco nduct or. org/ packa ges/ relea se/ bioc/ html/ limma. html) [15] was used to normalize the DNA methylation and gene expression data. DNA hypomethylation and hypermethylation were common cancer hallmarks. False discovery rate (FDR) values and fold change values were calculated by using the limma package to evaluate the DEGs and differentially methylated genes (DMGs) between the disease and control groups. An FDR < 0.05 and |log2FC|> 0.5 were considered to be threshold values.
Moreover, the pheatmap package (Version 1.0.8, https:// cran.r-proje ct. org/ packa ge= pheat map) [16] in R software was used to perform the bidirectional hierarchical clustering analysis for the gene expression and methylation values based on Euclidean distance [17,18]. The pheatmap was constructed to visualize gene expression values.

Gene ontology function and KEGG pathway analysis for DEGs and DMGs
Initially, we compared the collection of DEGs and DMGs, kept the intersection of the two data sets, and analyzed the overall correlation between the degree of difference in methylation and expression levels. The cor.test function (https:// stat. ethz. ch/R-manual/ R-devel/ libra ry/ stats/ html/ cor. test. html) was used to calculate the Pearson correlation coefficient. The expression and methylation level differences in opposite directions were reserved for further analysis. Subsequently, the Database for Annotation, Visualization and Integrated Discovery tool, version 6.8 [19,20] (DAVID, https:// david. ncifc rf. gov/) was used to perform GO function and KEGG pathway enrichment analysis for the mRNAs with an opposite direction of difference in expression and methylation levels. The threshold value was considered to be P < 0.05.

Analysis of protein-protein interaction network
String, version 10.5 [21] (https:// string-db. org/), was used to search for the interaction between gene product proteins for genes with opposite expression and methylation levels, and an interactive network was built. The interactive network was visualized by Cytoscape version 3.7.2 software [22] (http:// www. cytos cape. org/). The GO biological process and KEGG signal pathway analysis based on DAVID were then performed on the gene nodes that constituted the interaction network. The threshold value was considered to be P < 0.05.  1 Volcano plot for the differentially expressed genes (a) and DNA methylation data (b). The horizontal dotted line represents an FDR = 0.05 threshold line;the red vertical dotted line represents the | logFC | > 0.5 threshold line; the red and blue dots represent significantly up-regulated and down-regulated DEGs and DMGs, respectively; and the black dots represent the non-differential expression genes. Bidirectional hierarchical clustering heatmaps for differentially expressed genes (c) and differentially expressed methylation regions (d). Black represents asthmatic samples and white represents healthy controls. FDR, false discovery rate; FC, fold change; DEGs, differentially expressed genes; DMGs, differentially methylated genes

MiRNA-target gene regulatory network construction
We used the Human MicroRNA Disease Database [23] (HMDD, http:// www. cuilab. cn/ hmdd) to search for miR-NAs directly associated with asthma. The target genes of asthma miRNAs were then screened using the star-Base version 2.0 database [24] (http:// starb ase. sysu. edu. cn/). The starBase database provides the comprehensive target gene prediction information from five databases: TargetScan, picTar, RNA22, PITA, and miRanda. We selected regulatory relationships included in at least one of the databases as miRNA-target gene relationship pairs to construct a miRNA -mRNA regulation relationship. Cytoscape 3.7.2 was used to display the networks. Finally, KEGG pathway analysis for target genes was performed using DAVID software.

Selection and mechanism analysis of candidate agents
In the Comparative Toxicogenomics Database, 2019 update [25] (http:// ctd. mdibl. org/), using "asthmatic" as a keyword, we searched for KEGG pathways and genes directly related to asthma, and compared them with pathways in which the genes in the constructed interaction network were significantly involved in the relevant pathways. We selected disease pathways with direct involvement of the asthma genes, constructed this part of the network separately, screened genes directly related to the disease, and conducted mechanism research through the important pathways of gene participation.

Differentially expressed genes and methylated sites screening
Expression and methylation level files were downloaded. A total of 933 (239 downregulated and 694 upregulated) DEGs and 751 (412 hypomethylated and 339 hypermethylated) DMGs were identified between the asthmatic and healthy control groups. Volcano plots for the DEGs and differentially methylated sites were shown in Fig. 1a and b. After screening DEGs and DMGs from the gene expression and methylation profiles, the corresponding gene expression and signal values were visualized in bidirectional hierarchical clustering heatmaps (Fig. 1c,  d). As can be seen in the figure, the difference between the selected DEGs and DMGs of the asthma and control groups is significant. The bidirectional hierarchical cluster heatmap revealed that the samples were clearly divided into two groups based on the screened DEGs and DMGs.

Gene ontology and KEGG pathway analysis
We screened a total of 284 intersection genes that were differentially expressed in the DNA methylation and gene expression data set, and analyzed the relationship between gene expression and DNA methylation changes by calculating the correlation coefficient (Fig. 2a). Critical gene expression levels and the DNA methylome are shown in Fig. 2b. We reserved 130 genes for further Comparison of DMGs and DEGs in a set of venn diagrams (a). Distribution map for differentially expressed genes with DNA methylation changes (b). The horizontal axis represents the degree of significant methylation difference, and the vertical axis represents the degree of gene expression difference. Red dots represent genes with opposite degrees of expression and methylation levels. DEGs, differentially expressed genes; DMGs, differentially methylated genes analysis whose expression and methylation levels differed in opposite levels. Among these, there were 35 genes with hypermethylation and decreased expression and 95 genes with hypomethylation and increased expression. GO and KEGG pathway enrichment analyses showed that the critical genes were primarily related to 24 GO terms and 10 pathways ( Table 1). The GO identified genes were involved in cellular functions including cellular defense response and oxidation reduction. The gene pathways were involved in multiple areas including natural killer cell mediated cytotoxicity, valine, leucine and isoleucine degradation, and steroid hormone biosynthesis.

Protein-protein interaction network analysis
In the protein-protein interaction network, a total of 119 nodes were identified. This included 33 hypermethylated, downregulated genes and 86 hypomethylated, upregulated genes with 426 pairs of co-expression interactions (Fig. 3). As shown in Table 2, these genes were significantly associated with 16 GO terms and 10 KEGG pathways. GO functional analysis found that these nodes were primarily involved in functions such as cellular defense response and oxidation reduction. KEGG pathways were primarily involved in natural killer cell mediated cytotoxicity, steroid hormone biosynthesis, and neuroactive ligand-receptor interaction.

Construction of a pathway network directly related to asthma
We screened 119 KEGG pathways and 116 genes that were directly associated with asthma by searching the CTD database. After comparison with the genes in the constructed regulatory network and the pathways in which genes participate significantly, an overlapping KEGG signaling pathway, hsa04060: cytokine-cytokine receptor interaction, was obtained in which the C−C motif chemokine ligand 2 (CCL2) gene directly related to asthma is involved. This gene is targeted by eight asthma related miRNAs (hsa-miR-206, hsa-miR-19a, hsa-miR-9, hsa-miR-22, hsa-miR-33b, hsa-miR-122, hsa-miR-1, and hsa-miR-23b). As noted in Fig. 5, two additional genes, IL2RG and CCl4, are involved in this pathway.

Discussion
Asthma is a complex multifactorial disease caused by the interaction of genetic and environmental factors. In our study, the hub genes were explored via an analysis of multiple data sets that included samples from asthmatics and healthy controls. A total of 130 critical DEGs that were differentially expressed in DNA methylome were detected. In the miRNA−target gene regulatory network directly related to asthma, an overlapping  Fig. 3 The protein-protein interaction network of genes with opposite directions of expression and methylation levels. The blue triangle represents hypermethylated downregulated genes, the red triangle represents hypomethylated upregulated genes KEGG pathway, hsa04060: cytokine−cytokine receptor interaction, was noted, in which the CCL2 gene directly related to asthma is involved, and the gene is targeted by 8 asthma related miRNAs. Two other genes, IL2RG and CCL4, are known to be involved in this pathway.
CCL2, also known as MCP-1, is one of several cytokine genes clustered on the q-arm of chromosome 17. Chemokines are a superfamily of secreted proteins involved in immunoregulatory and inflammatory processes. CCL2 is a member of the CC subfamily which is characterized by two adjacent cysteine residues. It binds to chemokine receptors CCR2 and CCR4. The results of our study also show that CCL2 and CCl4 are involved in the cyclokine receptor interaction signaling pathway. CCL2 is closely involved in the inflammatory response in children with asthma [26]. Multiple miRNAs have been shown to regulate the occurrence of inflammation in different diseases through CCL2. Roff et al. reported that microRNA-570-3p regulates HuR and cytokine (CCL2 and CCL4) expression in airway epithelial cells [27]. Downregulation of CCL2 induced by the upregulation of microRNA206 is associated with the severity of HEV71 encephalitis [28]. Chen et al. reported that miR-22 is downregulated in PBMCs from patients with coronary artery disease, and that miR-22 may participate in the inflammatory response by targeting MCP-1 [29]. These findings are consistent with the results of this study.
Interleukin 2 receptor subunit gamma, the protein encoded by IL2RG, is an important signaling component of many interleukin receptors, including IL-2, IL-4, IL-7, and IL-21, and is thus referred to as the common gamma chain [30]. Mutations in this gene cause X-linked severe combined immunodeficiency as well as X-linked combined immunodeficiency, a less severe immunodeficiency disorder [31]. The pathway analysis showed that IL2RG was involved in primary immunodeficiency and cytokine-cytokine receptor interaction. We speculate that IL2RG and CCL4 might be important genes related to childhood atopic asthma, and together with CCL2 participate in the cytokine receptor interaction signaling pathway that plays a role in childhood atopic asthma.  5 Genes directly related to asthma and KEGG pathways. The blue triangle represents hypermethylated downregulated genes, the red triangle represents hypomethylated up-regulated genes, the yellow circle represents miRNAs directly related to asthma, and the yellow square represents KEGG signal pathways directly related to asthma. KEGG, Kyoto Encyclopedia of Genes and Genomes There are some limitations in this study. The key genes obtained in this study have not been further verified. In addition, the samples in these two data sets are all PBMCs, which would be more convincing if they were airway epithelial cells. But our research provides new biological insights into the development of asthma.

Conclusions
In conclusion, our study identified a total of 130 DEGs with significant DNA methylation changes. In a regulatory network directly related to asthma, the KEGG signaling pathway hsa04060:cyclokine-cyclokine receptor interaction was found, in which the CCL2 gene, directly related to asthma, is involved, and this gene is targeted by eight asthma related miRNAs. We speculate that IL2RG and CCL4, which are also involved in this pathway, might be critical genes related to childhood specific asthma, and together with CCL2 play a role in this disease. The bioinformatics analysis in this study may provide a valuable and reliable basis for determining biomarkers for the development of childhood atopic asthma.