Med One. 2016 Oct 25;1:e160022. DOI:10.20900/mo.20160022.
1Department of Biomedical Engineering, Tianjin University, Tianjin, 300072, China;
2Department of Radiology and Imaging Sciences, NIH, Bethesda, 20852, USA;
3Department of Genomics Research, R&D Solutions, Elsevier Inc., Rockville, MD, 20852, USA;
4Unit on Statistical Genomics, NIMH/NIH, Bethesda, 20852, USA.
Correspondence: Dr. Hongbao Cao, Department of Genomics Research, R&D Solutions, Elsevier Inc., Rockville, MD, 20852, USA; Email: firstname.lastname@example.org; Tel: 240-461-9642
Background: Previous studies have shown that Helicobacter pylori infection (HPI) is related to a reduced risk of esophageal adenocarcinoma (EAC), but the biological explanation is unknown. Here, we hypothesized that EAC and HPI may present strong genetic associations.
Methods: To identify potential EAC risk genes from the HPI-gene group, we conducted an integrated analysis using large-scale ResNet relation data and gene expression data for HPI and EAC. Disease-gene relation data were acquired from the Pathway Studio ResNet Mammalian database. Gene expression data were acquired from samples of 92 subjects, including 64 EAC cases and 28 normal controls.
Results: Genes linked to HPI and EAC present significant overlap (79 genes, p-value = 2.5E-75) and play roles within multiple common genetic pathways (enrichment p-value ≤ 5.05E-17 for the top 10 pathways) that have been implicated in both diseases. Moreover, we identified a genetic network of 32 genes through which HPI may exert influence on EAC. Furthermore, 6 HPI genes presented a significant difference (p-value < 1e-10) between EAC cases and controls: MUC13, AQP3, TFF3, SFTPD, NOD2 and PIGR. Network analysis showed that these genes have a strong functional association with EAC and may serve as potential EAC risk genes.
Conclusion: Results from this study support the hypothesis that complex genetic associations exist between HPI and EAC and that HPI related genes may also play roles in the pathogenic development of EAC, which provides new insights for the identification of candidate EAC genes.
Esophageal adenocarcinoma (EAC) is one of the high-mortality cancers in developed countries, with rapidly increasing incidence. Studies have suggested that at least 95 % of EAC cases arise from the metaplastic condition known as Barrett's esophagus. Genetic studies using both genome-wide association study (GWAS) and gene expression data have been conducted to explore the genetic risks associated with EAC [3,4], and hundreds of genes linked to EAC have been reported. However, the basic carcinogenesis mechanisms underlying the clinical outcome of EAC remain unclear. Here, we studied the genetic association between Helicobacter pylori infection (HPI) and EAC, with the aim of gaining a better understanding of the genetic bases of EAC and identifying novel potential genes for the disease.
Helicobacter pylori is a gram-negative bacillus that is usually found in human gastric mucosal epithelium. Affecting over half of the world's population, HPI is a cause of Gastroesophageal Reflux Disease (GERD) and a risk factor for gastric cancer. However, HPI seems to be associated with a reduced risk of EAC: an over 40 % lower incidence of EAC has been observed in people with HPI than in uninfected people [6,7]. Although the biological explanation for a protective effect of HPI in the case of EAC remains unclear, it is believed that the reduced EAC risk may be linked to lower gastric acid levels in HPI patients [7,8].
In recent years, the Pathway Studio ResNet database has been widely used to study modeled relationships between proteins, genes, complexes, cells, tissues and diseases (http://pathwaystudio.gousinfo.com/Mendeley.html). In this study, we integrated large-scale ResNet relation data and gene expression data to test the hypothesis that there is a shared genetic base between HPI and EAC, and that HPI-related genes may also be associated with EAC. Our results supported the hypothesis and may have identified potential novel risk genes for EAC.
First, we studied the large-scale HPI-gene and EAC-gene ResNet relation data to identify shared genes and genetic pathways. Then, we integrated EAC expression data to discover novel genes from the HPI-gene group. Finally, we used functional network analysis to study the potential pathogenic significance of these candidate genes for EAC.
HPI-Gene and EAC-Gene data acquisition
Disease-gene relation data for both HPI and EAC were acquired from the Pathway Studio (PS) ResNet relation database, which has been widely used to study modeled relationships between proteins, genes, complexes, cells, tissues and diseases (http://pathwaystudio.gousinfo.com/Mendeley.html). Updated weekly, the PS ResNet database is the largest database among known competitors in the field. Besides the full lists of genes, we also present the supporting references for each disease-gene relation in Supplementary Table S1 and S2, including the titles of the references and the related sentences where these relations were identified. This information can be used to locate a detailed description of how a candidate gene is related to HPI and/or EAC.
Identification of risk genes
We used a gene expression data set (GSE13898) of 92 subjects to test the genes that are related to HPI but have not been reported to have an association with EAC, with the purpose of identifying potential EAC risk genes. The gene expression profiles were acquired from 64 primary esophageal adenocarcinoma tissues and 28 surrounding normal fresh frozen tissues. All the tissues were obtained after curative resection with pathologic confirmation at UT M.D. Anderson Cancer Center (MDACC). The microarray experiment and data analysis were done in the Department of Systems Biology at MDACC.
Network analysis of EAC risk genes
To validate the potential candidate EAC risk genes, we performed network analysis between a target gene and EAC to identify any entities that serve as a bridge connecting the gene and EAC. The target entities analyzed included proteins/genes, small molecules/drugs and functional classes. The relation data between these target entities and the target gene and EAC were acquired from the Pathway Studio ResNet database.
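The bridging idea can be illustrated with a minimal sketch. It is not the Pathway Studio implementation: the function name `bridge_entities` and the flat list of entity pairs are assumptions made for illustration, and the toy relations only mirror the two example paths discussed later in this paper (NOD2 → ROS → EAC and MUC13 → chemokine → EAC).

```python
from collections import defaultdict


def bridge_entities(relations, target_gene, disease):
    """Return the entities that are related to both target_gene and disease.

    `relations` is an iterable of (entity_a, entity_b) pairs. Each relation is
    treated as undirected here, which is a simplification for illustration.
    """
    neighbors = defaultdict(set)
    for a, b in relations:
        neighbors[a].add(b)
        neighbors[b].add(a)
    shared = neighbors[target_gene] & neighbors[disease]
    # Exclude the endpoints themselves in case they are directly related.
    return sorted(shared - {target_gene, disease})


# Toy relations (hypothetical input format); the actual analysis uses
# relations extracted from the ResNet database.
relations = [
    ("NOD2", "ROS"),
    ("ROS", "EAC"),
    ("MUC13", "chemokine"),
    ("chemokine", "EAC"),
]

nod2_bridges = bridge_entities(relations, "NOD2", "EAC")
muc13_bridges = bridge_entities(relations, "MUC13", "EAC")
```

With these toy relations, ROS is the only bridge between NOD2 and EAC, and chemokine is the only bridge between MUC13 and EAC.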
We conducted a systematic analysis on the HPI-Gene and EAC-Gene ResNet relation data to identify genes associated with HPI and EAC. Results showed that 276 genes were associated with HPI, supported by 720 scientific references from 1992 to June 2016 (Supplementary Table S1a and Table S1b). For EAC, we identified 293 genes supported by 700 references from 1993 to June 2016 (Supplementary Table S2a and Table S2b). A significant overlap of 79 genes occurs between the HPI-genes and EAC-genes (Right tail Fisher’s Exact test, p-value = 2.5E-75), as shown in Fig. 1 (see Supplementary Table S3a and Table S3b for the gene list and references).
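As a rough illustration of how such an overlap test works, the sketch below computes a right-tailed Fisher's exact (hypergeometric) p-value from the gene counts reported above. The size of the background gene universe is not stated in the text, so `ASSUMED_UNIVERSE` is a hypothetical value and the resulting p-value is not expected to reproduce the reported one exactly.

```python
from fractions import Fraction
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """Right-tailed hypergeometric p-value.

    Probability that two gene sets of sizes size_a and size_b, drawn from
    `universe` genes, share at least `overlap` genes by chance.
    """
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    # Exact rational arithmetic avoids floating-point underflow.
    return Fraction(tail, total)


ASSUMED_UNIVERSE = 20000  # hypothetical background size, not from the paper

# 276 HPI genes, 293 EAC genes, 79 shared genes (counts from the text).
p = overlap_p_value(ASSUMED_UNIVERSE, 276, 293, 79)
expected_overlap = 276 * 293 / ASSUMED_UNIVERSE  # overlap expected by chance

print(f"expected overlap by chance: {expected_overlap:.2f}")
print(f"right-tailed p-value: {float(p):.3e}")
```

Under this assumed universe, only about four shared genes would be expected by chance, which is why an observed overlap of 79 yields a very small p-value.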
To test the functional profile of the 79 genes associated with both HPI and EAC, we conducted a Pathway Enrichment Analysis (PEA) using Pathway Studio. The 10 most significantly enriched pathways (p-value ≤ 5.05E-17) are presented in Table 1. In total, 637 pathways/gene sets were enriched with p-value < 1e-3 including 77 out of these 79 genes (Supplementary Table S4).
Note: For each pathway/GO term, the p-value was calculated using Fisher's exact test against the hypothesis that a randomly selected gene group of the same size (79) can generate the same or a higher overlap with the corresponding pathway/GO term. All these pathways/GO terms passed the FDR correction (q = 0.001).
Through PEA we found 37 pathways/gene sets (57 unique genes) related to cell growth and proliferation, 34 (49 unique genes) to cell apoptosis, 10 (29 unique genes) to protein kinase, 9 (26 unique genes) to protein phosphorylation, 9 (38 unique genes) to transcription factors, 6 (30 unique genes) to immune system, and 2 (21 unique genes) to single-organism developmental process. Many of these pathways have been implicated in both HPI and EAC, such as the response to lipopolysaccharide (GO ID: 0032496) [11,12], ageing (GO ID: 0016280) [13,14], response to hypoxia (GO ID: 0001666) [15,16], and positive regulation of cell proliferation (GO ID: 0008284) [17,18]. For more detailed information on these significantly enriched pathways, please refer to Supplementary Table S4.
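The text does not name the FDR procedure that was used. As a sketch only, assuming the common Benjamini-Hochberg step-up procedure, the correction at a level q could look like the following; the function name and the example p-values are made up for illustration and are not data from this study.

```python
def benjamini_hochberg(p_values, q):
    """Return a list of booleans: True where a p-value passes BH at level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every p-value ranked at or below the cutoff is declared significant.
    passed = [False] * m
    for idx in order[:cutoff_rank]:
        passed[idx] = True
    return passed


example_p = [1e-12, 4e-4, 2e-4, 0.03, 0.5]  # made-up illustrative values
flags = benjamini_hochberg(example_p, q=0.001)
```

In this toy input the three smallest p-values pass at q = 0.001 and the two largest do not.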
Our results suggest that HPI and EAC share multiple genetic pathways, through which a large group of genes play roles affecting the pathogenic development of both diseases.
Possible co-regulations between HPI and EAC
Further functional network analysis using PS showed that 32 out of the 79 genes are downstream targets of HPI (influenced by HPI) and, at the same time, upstream regulators of EAC, as shown in Fig. 2. Therefore, HPI may influence the pathogenic development of EAC through the regulation of these 32 genes. Each relation (arrow) in Fig. 2 is supported by one or more references (see Supplementary Table S3b), which can be consulted for a detailed description of that relation.
Our results suggest that any genes linked to HPI may be worthy of study for their potential relation to EAC. These genes affect the pathogenic development of HPI, which in turn may influence the disease status of EAC.
The ResNet relation data analysis above showed that more HPI genes were not linked to EAC than were linked to it (197 vs. 79; see Fig. 1). To identify genes that are linked to HPI and are also potential risk genes for EAC, a gene expression analysis was conducted to study the expression difference between EAC cases and controls for these 197 genes (see Supplementary Table 5 for results). Fig. 3 shows the −log10-transformed p-values (q = 0.001 for FDR) of each gene.
For the gene expression analysis, 62 out of 197 HPI genes passed the FDR correction (q = 0.001). Moreover, 6 genes presented a significant difference (p-value < 1e-10) between EAC cases and controls: MUC13, AQP3, TFF3, SFTPD, NOD2 and PIGR. According to the PS ResNet database, these 6 genes present no direct relation with EAC (no reference reports an association between these genes and EAC). However, these genes demonstrate strong indirect linkage to EAC, bridged by 29 genes/proteins, 10 small molecules and 7 functional classes (see Fig. 4). The 46 entities and the 141 relations with 1,385 supporting references in Fig. 4 are presented in Supplementary Table S5a and S5b, respectively.
Previous studies showed that HPI is strongly linked to a reduced incidence of EAC through an unclear mechanism [6,7,19]. In this study, we used large-scale ResNet relation data and gene expression data to study the shared genes and genetic pathways between HPI and EAC, based on which we identified potential novel risk genes for EAC.
Our results showed that genes linked to HPI and EAC present significant overlap (79 genes, p-value = 2.5E-75). Furthermore, 77 out of these 79 genes were significantly enriched within 637 pathways (p-value < 1e-3, FDR corrected: q = 0.005), many of which have been implicated in both HPI and EAC, such as the response to lipopolysaccharide (GO ID: 0032496), ageing (GO ID: 0016280), response to hypoxia (GO ID: 0001666), and positive regulation of cell proliferation (GO ID: 0008284) [11-18]. These results suggest that HPI and EAC share multiple genetic pathways, through which a large group of genes regulate the pathogenic development of both diseases.
Moreover, we observed a 32-gene network, through which HPI could affect the disease status of EAC (Fig. 2). Our findings provide further support for the hypothesis that HPI genes may also regulate pathogenic development of EAC.
Closer study of the 197 HPI-only genes (Fig. 1 (a)) using EAC gene expression data showed that a large portion (62/197 = 31.47 %, q = 0.001 for FDR) of these HPI genes also demonstrated a difference between EAC cases and controls (FDR corrected p-value < 0.001), as shown in Fig. 3. Moreover, six genes were identified as potential EAC markers (FDR corrected p-value < 1e-10): MUC13, AQP3, TFF3, SFTPD, NOD2 and PIGR. Further validation using ResNet network analysis showed that these six genes presented strong indirect correlation with EAC, forming a functional genetic network backed by 1,385 supporting references (Fig. 4). Within the network, multiple pathways can be identified through which a gene may affect the disease status of EAC. For example, NOD2 has been reported to be involved in the production of microbicidal Reactive Oxygen Species (ROS), which play an important role in the development of EAC. This finding supports a NOD2 → ROS → EAC pathway. Another possible MUC13 → EAC pathway was identified as follows. MUC13 has been shown to regulate the secretion of chemokines. The chemokine receptors are Class A GPCRs coupled with Gαi heterotrimeric G protein, which play a pivotal role in tumorigenesis and metastasis of EAC. Therefore, by regulating the secretion of chemokines, MUC13 may regulate EAC pathogenesis through a chemokine pathway, building a MUC13 → chemokine pathway → EAC regulation mechanism.
To sum up, results from this study support the hypothesis that HPI and EAC present a significant genetic-level association, which may explain their clinical correlations. Moreover, novel potential EAC genes can be identified by integrating ResNet relation data and gene expression data. To our knowledge, this is the first study integrating large-scale ResNet relation data and gene expression data to study the molecular associations between HPI and EAC. Findings here may provide new insights into the current field of HPI-EAC correlation study, and warrant further studies using more data sets to identify novel potential risk genes for EAC.
We would like to thank Dr. Sana Khan for her suggestions and writing help in the development of this manuscript. Dr. Khan is with the Department of Genomics Research, R&D Solutions, Elsevier Inc.