Int. J. Computational Biology and Drug Design, Vol. 4, No. 2, 2011 111

Copyright © 2011 Inderscience Enterprises Ltd.

Gene regulation in glioblastoma: a combinatorial analysis of microRNAs and transcription factors

Xue Gong Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA E-mail: [email protected]

Jingchun Sun Departments of Biomedical Informatics and Psychiatry, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA E-mail: [email protected]

Zhongming Zhao* Departments of Biomedical Informatics, Psychiatry, and Cancer Biology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA E-mail: [email protected] *Corresponding author

Abstract: Glioblastoma is the most common and most lethal brain tumour in humans. Identifying the functions that are disturbed during carcinogenesis, and how they become deregulated, is important for understanding the underlying mechanism. Transcriptional aberrations, which might be caused by both genomic alterations and other regulatory molecules, may play a vital role in the etiology of glioblastoma. In this study, we investigated possible cooperative deregulation by microRNAs (miRNAs) and transcription factors (TFs) in glioblastoma, under the hypothesis that miRNAs and TFs might have a combinatorial regulatory effect on glioblastoma genes. We searched for glioblastoma-specific regulatory networks by integrating glioblastoma related miRNAs, TFs and genes, and identified 54 feed-forward loops (FFLs). Follow-up functional enrichment analysis of these FFLs uncovered some functions important to carcinogenesis as well as some functions specific to the FFLs we identified.

Keywords: systems biology; glioblastoma; cancer gene; microRNA; TF; transcription factor; FFL; feed-forward loop; gene regulation; functional analysis; integration analysis; pathway.

Reference to this paper should be made as follows: Gong, X., Sun, J. and Zhao, Z. (2011) ‘Gene regulation in glioblastoma: a combinatorial analysis of microRNAs and transcription factors’, Int. J. Computational Biology and Drug Design, Vol. 4, No. 2, pp.111–126.

Biographical notes: Xue Gong received her PhD in Biophysics from Harbin Medical University, College of Bioinformatics Science and Technology, Heilongjiang, China in 2010. Currently, she is a Postdoctoral Fellow in the Department of Biomedical Informatics at Vanderbilt University Medical Center. Her research interests include studying cancer mechanisms at the systems biology level using high-throughput biological data, including gene expression profiles, protein-protein interaction networks, cancer genome profiling and others, and target prediction of microRNA.

Jingchun Sun received her PhD in Bioinformatics from Shanghai Jiao Tong University, Department of Biochemistry and Molecular Biology, Shanghai, China in 2005. Currently, she is a Postdoctoral Fellow in the Departments of Biomedical Informatics and Psychiatry at Vanderbilt University Medical Center. Her research interests include studying complex diseases such as psychiatric disorders and cancers through bioinformatics and systems biology approaches.

Zhongming Zhao received his PhD in Human and Molecular Genetics from the University of Texas Health Science Center at Houston and MD Anderson Cancer Center, Houston, Texas in 2000. Currently, he is an Associate Professor in the Departments of Biomedical Informatics, Psychiatry, and Cancer Biology at Vanderbilt University Medical Center. His research interests include bioinformatics and systems biology approaches to studying complex diseases, genome-wide or large-scale analysis of genetic variation and methylation patterns, next-generation sequencing data analysis, comparative genomics and biomedical informatics.

1 Introduction

Glioblastoma multiforme (GBM) is the most common and most lethal brain tumour in humans, accounting for 52% of all parenchymal brain tumour cases and 20% of all intracranial tumours (Lipsitz et al., 2003). The molecular mechanisms, genetics and paths to treatment of glioblastoma have been studied extensively during the past two decades (Mischel et al., 2003). In addition to traditional low-throughput studies such as linkage analysis, The Cancer Genome Atlas (TCGA) aims to catalogue and discover major cancer-causing genome alterations in human tumours through integrated multi-dimensional analysis. Glioblastoma was the first specific cancer to undergo comprehensive genomic characterisation by the TCGA consortium. One comprehensive analysis identified 42 candidate genes from the 206 samples through mutation screening and copy number variation (CNV) studies (TCGA, 2008). Several core pathways with frequent aberrations among patients were characterised, involving RTK signalling and the P53 and Rb tumour suppressor pathways. As more genomic data become available, thanks to the rapid development and application of high-throughput biotechnologies in complex disease studies, investigators are likely to identify more glioblastoma genes in the near future.

Instead of the typical analysis of a single genomic dataset, investigators have now shifted to studying the mechanisms of carcinogenesis through integrative analysis of multiple genomic datasets, as datasets from different platforms such as gene expression, CNV, methylation, genome-wide association studies (GWAS), and microRNA (miRNA) expression have often become available for a specific disease or phenotype (Archer et al., 2010; Jia et al., 2011). Such integrative genomics approaches have quickly proved effective. For example, the TCGA consortium carried out a comprehensive analysis of CNV, gene expression, and DNA methylation data in glioblastoma patients, which allowed them to define key pathways in this specific cancer (TCGA, 2008). Chan et al. (2008) and Schuebel et al. (2007) attempted to discover cancer related genes through the integration of DNA mutation and methylation data. Recently, Gaire et al. (2010) demonstrated that deregulation of miRNA and CNV were mutually exclusive in glioblastoma patients, while Verhaak et al. (2010) applied an integrative genomic approach and identified clinically relevant subtypes of glioblastoma characterised by abnormalities in four genes (PDGFRA, IDH1, EGFR, and NF1). Kim et al. (2008) also identified a miRNA and cancer gene cluster through an integrative analysis. Therefore, extended investigation of the mechanism of carcinogenesis by integrating multiple layers of genomic data, especially gene expression data, could result in more insightful and promising discoveries.

The heterogeneous nature of cancer makes it a complicated disease at the level of individual molecules. However, only a limited number of functional categories are disturbed during carcinogenesis. These are defined as the hallmarks of cancer and include self-sufficiency in growth signals, insensitivity to antigrowth signals, evading apoptosis, limitless replicative potential, sustained angiogenesis, tissue invasion and metastasis (Hanahan and Weinberg, 2000) and avoidance of immunosurveillance (Zitvogel et al., 2006). Therefore, discovering the deregulated functions in glioblastoma and how these functions are regulated by different kinds of molecules is important not only for uncovering the underlying mechanism of carcinogenesis, but also for drug design and biomarker discovery. Indeed, we carried out a functional enrichment analysis for glioblastoma related genes and found they were enriched in some growth signal pathways (FAK signalling, PI3K/AKT signalling and SAPK/JNK signalling), apoptosis pathways (Myc mediated apoptosis signalling), metastasis pathways (glioma invasiveness signalling) and some immune related pathways (IL-12 signalling, IL-9 signalling and B cell receptor signalling).

Both transcription factors (TFs) and miRNAs are important regulators of gene expression. Transcription factors regulate gene expression at the transcriptional level. A high proportion of oncogenes and tumour suppressor genes encode TFs. Additionally, many cancer genes have been found to be regulated by TFs. miRNAs are a class of small (19–27 nt) endogenous RNAs that mainly regulate gene expression through post-transcriptional repression. Recent studies have implicated miRNAs in carcinogenesis. Several databases, such as miR2Disease (Jiang et al., 2009), HMDD (Lu et al., 2008) and PhenomiR (Ruepp et al., 2010), have been developed to compile disease related miRNAs. TFs and miRNAs can cooperatively regulate the same gene through feed-forward loops (FFLs). Examination of FFLs in a cellular system has been emerging as a powerful bioinformatics approach, and many important findings have been reported (Friard et al., 2010; Qiu et al., 2010; Re et al., 2009; Shalgi et al., 2007; Tsang et al., 2007). Nevertheless, there has been no such examination of the FFLs in a specific cancer. Herein, we hypothesised that miRNAs and TFs play combinatorial regulatory roles in glioblastoma related genes and, thus, we explored the miRNA-TF regulatory network in glioblastoma. Through our investigation, 54 FFLs were found, consisting of 34 glioblastoma related miRNAs, 27 TFs and 18 glioblastoma related genes. We performed a functional analysis of the FFLs and accordingly assigned the most reliable function to each loop. Besides the common functions shared with the functions of all glioblastoma related genes, there are several functions specific to the FFLs, including ATM Signalling, ERK5 Signalling, Role of PKR in Interferon Induction and Antiviral Response, Cardiac Hypertrophy Signalling, Relaxin Signalling, Glucocorticoid Receptor Signalling, and FXR/RXR Activation.

2 Materials and methods

2.1 Glioblastoma related miRNAs, genes and target prediction

There are several curated databases aimed at compiling disease related miRNAs with the evidence of differential expression of miRNAs in disease cohorts. In our study, the glioblastoma related miRNAs (hereafter denoted as ‘GB_miRNAs’) were obtained from three databases, namely miR2Disease (Jiang et al., 2009), HMDD (human miRNA-associated disease database) (Lu et al., 2008) and PhenomiR (Ruepp et al., 2010).

Glioblastoma related genes (hereafter denoted as ‘GB_genes’) were downloaded from the cancer gene database F-Census (Gong et al., 2010), an integrated cancer gene data source. There are two data sources of the GB_genes in this database: one part is from low throughput experiments such as linkage analysis and association studies, and the other is from high throughput genomic studies such as genome-wide mutation and CNV screening of glioblastoma patients.

We used TargetScan, one of the most reliable target prediction tools thus far, and retrieved all the predicted miRNA target interactions from the TargetScan server (Release 5.1, April 2009) (Grimson et al., 2007). Both the conserved and non-conserved targets of miRNAs were utilised to enlarge the coverage of the target prediction. The miRNA and target gene pairs between the GB_miRNAs and GB_genes were then extracted using custom Perl code.
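The extraction step described above amounts to a filter over the predicted pairs. The following is a minimal Python sketch of that idea, not the authors' Perl code; it assumes the predictions have already been parsed into (miRNA, gene) tuples, and the function name and the toy identifiers are hypothetical.

```python
def extract_gb_pairs(predictions, gb_mirnas, gb_genes):
    """Keep predicted (miRNA, gene) pairs where both ends are glioblastoma related."""
    gb_mirnas = set(gb_mirnas)
    gb_genes = set(gb_genes)
    # A set collapses duplicates, e.g. several predicted sites for the same pair.
    return {
        (mirna, gene)
        for mirna, gene in predictions
        if mirna in gb_mirnas and gene in gb_genes
    }


# Toy input: placeholder names, not real TargetScan output.
predictions = [
    ("miR-A", "GENE1"),
    ("miR-A", "GENE1"),  # second site for the same pair
    ("miR-A", "GENE2"),  # GENE2 is not in the gene list
    ("miR-B", "GENE3"),
    ("miR-C", "GENE1"),  # miR-C is not in the miRNA list
]
pairs = extract_gb_pairs(
    predictions,
    gb_mirnas={"miR-A", "miR-B"},
    gb_genes={"GENE1", "GENE3"},
)
print(sorted(pairs))
```

Storing the result as a set of pairs makes the later loop search a matter of simple membership tests.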

2.2 Transcription factor regulation data

A comprehensive dataset of TFs and gene interaction data was prepared by Tu et al. (2009). We acquired this dataset from the author of this work, Dr. Kang Tu. The data consisted of two sources, with one part parsed from the UCSC TFBS data (http://genome.ucsc.edu) and the other part downloaded from the TRED database (http://rulai.cshl.edu/TRED).

The TF and miRNA interactions were predicted based on the UCSC TFBS data. The transcription start site (TSS) of a miRNA was defined by the following criteria:

• if the miRNA is located in a transcript unit (<5 kb), the TSS of the first miRNA in the unit is used

• if the miRNA resides in a host gene, the TSS of the host gene is used

• otherwise, the position 5000 nt upstream of the mature miRNA is taken as the TSS.

The promoter region is defined as the interval from 900 nt upstream to 100 nt downstream of the TSS. If a TFBS lies in the promoter region, we predicted that the miRNA was regulated by the TF.
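The promoter rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the TSS is taken as already chosen by the three criteria, "lies in the promoter region" is interpreted as any overlap with the interval, and the function names and coordinates are hypothetical.

```python
def promoter_interval(tss, strand, upstream=900, downstream=100):
    """Return inclusive (start, end) genomic coordinates of the promoter around a TSS."""
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        # On the minus strand, upstream means larger genomic coordinates.
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def tfs_regulating(tss, strand, tfbs_sites):
    """Return TFs with at least one binding site overlapping the promoter.

    tfbs_sites: iterable of (tf_name, site_start, site_end) on the same chromosome.
    Overlap (rather than full containment) is an assumption of this sketch.
    """
    start, end = promoter_interval(tss, strand)
    return {tf for tf, s, e in tfbs_sites if s <= end and e >= start}


# Toy binding sites with made-up coordinates.
sites = [("TF1", 9500, 9510), ("TF2", 10150, 10160), ("TF3", 9090, 9105)]
print(promoter_interval(10000, "+"))       # (9100, 10100)
print(sorted(tfs_regulating(10000, "+", sites)))
print(sorted(tfs_regulating(10000, "-", sites)))
```

Handling the strand explicitly matters because the 900 nt upstream window flips to the other side of the TSS for miRNAs encoded on the minus strand.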

2.3 Feed-forward loops (FFLs) and statistics tests

We then searched the FFLs in which a GB_gene was regulated by a GB_miRNA and a TF, as shown in Figure 1.

Figure 1 An overview of miRNAs, TFs and their regulatory networks in glioblastoma. GB, glioblastoma; TF, transcription factor; miRNA, microRNA; FFL, feed-forward loop

To assess the significance of the identified FFLs, two randomisation experiments were performed. For the first experiment, we utilised random non-cancer related miRNAs, which were drawn from all human miRNAs after excluding the cancer related miRNAs collected in the three databases (Jiang et al., 2009; Lu et al., 2008; Ruepp et al., 2010). In each run, we drew the same number of random miRNAs as there were GB_miRNAs and counted the resulting FFLs. This process was repeated 1000 times. We defined the empirical P value as the proportion of random runs that yielded at least as many FFLs as were observed with the GB_miRNAs. The second randomisation experiment utilised random genes. We drew as many genes as there were GB_genes from all the available human genes after excluding the cancer genes compiled in F-Census (Gong et al., 2010), and repeated the FFL search 1000 times to estimate the empirical P value.
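The empirical P value described above can be sketched as follows. This is a minimal illustration, not the authors' code: `count_ffls` stands for whatever routine counts loops for a given random set, and the function name, the seed handling and the toy background are assumptions.

```python
import random


def empirical_p_value(observed, count_ffls, background, sample_size,
                      n_runs=1000, seed=0):
    """Proportion of random runs with at least `observed` FFLs.

    count_ffls: callable taking a list of sampled items and returning an FFL count.
    background: pool to sample from (known cancer related items already removed).
    """
    rng = random.Random(seed)
    pool = sorted(background)  # fixed order so results are reproducible
    hits = 0
    for _ in range(n_runs):
        sample = rng.sample(pool, sample_size)
        if count_ffls(sample) >= observed:
            hits += 1
    return hits / n_runs


# Toy setting: 200 placeholder miRNAs, 20 of which can form a loop.
background = [f"miR-{i}" for i in range(200)]
loop_forming = set(background[:20])


def toy_count(sample):
    return sum(1 for m in sample if m in loop_forming)


p = empirical_p_value(observed=8, count_ffls=toy_count,
                      background=background, sample_size=30)
print(p)
```

With 1000 runs the smallest non-zero value this estimator can return is 0.001, which is worth keeping in mind when reading the reported P values.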

2.4 Functional analysis

The functional analyses were performed using the Ingenuity Pathways Analysis (IPA) system (Ingenuity® Systems, Inc., http://www.ingenuity.com). We used Canonical Pathways Analysis to identify the pathways from the IPA Library of Canonical Pathways that were most significant to the data set. Fisher’s exact test was used to calculate a P value, giving the probability that the association between the genes in the dataset and the canonical pathway is explained by chance alone. The P value was then adjusted for multiple testing by the False Discovery Rate (FDR) using the Benjamini-Hochberg (BH) procedure (Benjamini and Hochberg, 1995). First, to obtain the functions of GB_genes, the functional analysis was carried out on the entire list of GB_genes; then, for the FFLs, the GB_miRNAs, TFs and GB_genes were all used as input of the functional analysis.
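The two statistical steps above can be illustrated with a short, dependency-free Python sketch. It is not IPA's implementation: it assumes a one-sided (over-representation) Fisher's exact test, computed as a hypergeometric tail, and the function names and toy counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_list, n_overlap):
    """One-sided Fisher's exact test: P(overlap >= n_overlap) under the null.

    n_universe: number of genes in the reference set
    n_pathway:  number of reference genes annotated to the pathway
    n_list:     number of genes in the input list
    n_overlap:  number of input genes annotated to the pathway
    """
    total = comb(n_universe, n_list)
    upper = min(n_pathway, n_list)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 3 of 3 input genes fall in a 3-gene pathway out of 10 genes.
print(enrichment_p_value(10, 3, 3, 3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A pathway is then called enriched when its adjusted value falls at or below the chosen FDR cutoff.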

3 Results and discussion

3.1 Functional analysis of glioblastoma related genes

A total of 43 GB_genes were retrieved from an integrated cancer gene database, F-Census (Gong et al., 2010). To obtain the functions of the GB_genes, enriched canonical pathways were identified through the IPA analysis. Using an FDR cutoff of 0.05, 123 pathways were found to be enriched with GB_genes, indicating the broad functions of the GB_genes. We next used a more stringent FDR cutoff of 0.01 for a more detailed examination. Sixty pathways satisfied this cutoff (Table 1). Among the most significant pathways, several cancer related pathways stood out, such as Glioblastoma Multiforme Signalling (FDR = 1.45E-10) and Glioma Signalling (FDR = 1.74E-10). The results not only revealed the function categories of the GB_genes but also, in turn, supported the reliability of these genes, as their functions are consistent with the pathology of glioblastoma. Several other cancer-related signalling pathways were also found. One possible reason is that these cancer-related pathways share a large proportion of genes.

It has been hypothesised that, although cancer is heterogeneous in terms of individual molecules, the disturbed function modules in cancer are largely limited to several ‘hallmarks of cancer’, including self-sufficiency in growth signals, insensitivity to antigrowth signals, evading apoptosis, limitless replicative potential, sustained angiogenesis, tissue invasion and metastasis (Hanahan and Weinberg, 2000) and avoidance of immunosurveillance (Zitvogel et al., 2006). The enriched functions of the GB_genes fall into these hallmarks, further supporting their association with carcinogenesis. For example, GB_genes were enriched in growth signal pathways such as FAK Signalling (FDR = 8.91E-05), PI3K/AKT Signalling (FDR = 2.95E-04), SAPK/JNK Signalling (FDR = 2.00E-03), and HGF Signalling (FDR = 2.45E-03). In addition, Myc Mediated Apoptosis Signalling (FDR = 2.14E-05) and Glioma Invasiveness Signalling (FDR = 1.00E-02) were related to apoptosis and metastasis, respectively. Moreover, some immune related pathways were enriched with GB_genes, including IL-12 Signalling and Production in Macrophages (FDR = 3.47E-03), IL-9 Signalling (FDR = 3.72E-03), B Cell Receptor Signalling (FDR = 6.76E-03) and IL-2 Signalling (FDR = 9.77E-03).

Table 1 The canonical pathways enriched with all the GB_genes with FDR cutoff 0.01

Pathway                                                      FDR
Non-small cell lung cancer signalling                        7.94E-12
Melanoma signalling                                          2.51E-11
p53 signalling                                               1.35E-10
Glioblastoma multiforme signalling                           1.45E-10
Glioma signalling                                            1.74E-10
Pancreatic adenocarcinoma signalling                         3.55E-10
Small cell lung cancer signalling                            8.71E-10
Ovarian cancer signalling                                    2.09E-09
Bladder cancer signalling                                    1.20E-07
Chronic myeloid leukaemia signalling                         2.00E-07
Endometrial cancer signalling                                3.31E-07
Cell cycle: G1/S checkpoint regulation                       3.55E-07
Hereditary breast cancer signalling                          6.76E-07
HER-2 signalling in breast cancer                            2.04E-06
Cyclins and cell cycle regulation                            2.04E-06
Prostate cancer signalling                                   2.75E-06
Role of tissue factor in cancer                              1.07E-05
Myc mediated apoptosis signalling                            2.14E-05
Molecular mechanisms of cancer                               2.63E-05
FAK signalling                                               8.91E-05
Neuregulin signalling                                        9.77E-05
PTEN signalling                                              1.82E-04
Cell cycle: G2/M DNA damage checkpoint regulation            2.14E-04
EGF signalling                                               2.45E-04
PI3K/AKT signalling                                          2.95E-04
Aryl hydrocarbon receptor signalling                         4.47E-04
Inositol phosphate metabolism                                4.79E-04
Estrogen-dependent breast cancer signalling                  6.17E-04
SAPK/JNK signalling                                          2.00E-03
Antiproliferative role of TOB in T cell signalling           2.14E-03
Amyotrophic lateral sclerosis signalling                     2.24E-03
Neuropathic pain signalling in dorsal horn neurons           2.34E-03
HGF signalling                                               2.45E-03
HIF1α signalling                                             2.88E-03
Huntington's disease signalling                              3.02E-03
iCOS-iCOSL signalling in T helper cells                      3.02E-03
Role of NANOG in mammalian embryonic stem cell pluripotency  3.24E-03
IL-12 signalling and production in macrophages               3.47E-03
IL-9 signalling                                              3.72E-03
Colorectal cancer metastasis signalling                      3.72E-03
Type II diabetes mellitus signalling                         3.80E-03
Cell cycle regulation by BTG family proteins                 3.89E-03
p70S6K signalling                                            4.27E-03
Insulin receptor signalling                                  5.13E-03
FcγRIIB signalling in B lymphocytes                          5.37E-03
Docosahexaenoic acid (DHA) signalling                        5.37E-03
PI3K signalling in B lymphocytes                             5.37E-03
Role of Oct4 in mammalian embryonic stem cell pluripotency   6.31E-03
Thyroid cancer signalling                                    6.31E-03
MSP-RON signalling pathway                                   6.61E-03
B cell receptor signalling                                   6.76E-03
CNTF signalling                                              8.51E-03
Dendritic cell maturation                                    9.33E-03
Thrombopoietin signalling                                    9.33E-03
Role of BRCA1 in DNA damage response                         9.77E-03
IL-2 signalling                                              9.77E-03
RAR activation                                               1.00E-02
NF-κB signalling                                             1.00E-02
Lymphotoxin β receptor signalling                            1.00E-02
Glioma invasiveness signalling                               1.00E-02

The biological function of a gene can be defined at several levels, ranging from the basic biological attributes of a protein product to the nature of physical and regulatory interactions, membership in a given biological pathway, and membership of a specific biological network (such as a protein-protein interaction (PPI) sub-network). The Ingenuity Pathways Analysis system can be considered one of the most comprehensive systems so far for capturing these functions at multiple levels. It not only integrates function information from public databases, such as Gene Ontology (GO) and the KEGG pathway database, but also expands and improves the annotations through manual curation, by PhD level scientists, of numerous peer-reviewed research papers, review papers and textbooks.

3.2 Glioblastoma related miRNAs and their interactions with GB_genes

To obtain glioblastoma related miRNAs, we extracted 27 mature miRNAs, 20 pre-miRNAs and 110 pre-miRNAs from the miR2Disease, HMDD, and PhenomiR databases, respectively. A total of 115 glioblastoma related mature miRNAs were obtained and utilised for our analysis after converting the pre-miRNAs to the corresponding mature forms according to miRBase (Release 16). Among the 115 GB_miRNAs, 99 belong to 79 miRNA families, which were defined as groups of mature miRNAs sharing the same seed across multiple species. Fifty-four of the 115 GB_miRNAs belong to miRNA clusters (<5 kb) and 44 miRNAs reside within host genes.

In this study, the GB_miRNAs were all derived from evidence of their deregulation in glioblastoma samples. It is worth noting that the deregulation of expression might not be the cause of the disease but rather the consequence of upstream causal disturbances. Therefore, other evidence supporting the causal contribution of miRNAs to carcinogenesis is warranted, which can be obtained through analysis of data such as those generated by next generation sequencing.

Based on the interactions of the miRNAs and target genes predicted by TargetScan 5.1 (Grimson et al., 2007), 757 pairs of interaction between the GB_miRNAs and the GB_genes were extracted. One hundred and one GB_miRNAs targeted at least one GB_gene. Forty-five GB_genes were regulated by at least one GB_miRNA. The average number of target GB_genes per GB_miRNA was 8, with the miRNA hsa-miR-129-5p targeting the largest number of GB_genes (17). The average number of regulating GB_miRNAs per GB_gene was 18, with the gene ARNT2 regulated by the largest number of GB_miRNAs (52).
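Summary figures of this kind (targets per miRNA, regulators per gene, and the most connected node on each side) follow directly from the list of interaction pairs. The sketch below shows one way to compute them in Python; the function name and the toy pairs are hypothetical and do not reproduce the numbers reported above.

```python
from collections import defaultdict


def degree_summary(pairs):
    """Summarise a bipartite miRNA-gene network given (miRNA, gene) pairs."""
    targets = defaultdict(set)     # miRNA -> genes it targets
    regulators = defaultdict(set)  # gene -> miRNAs regulating it
    for mirna, gene in pairs:
        targets[mirna].add(gene)
        regulators[gene].add(mirna)
    top_mirna = max(targets, key=lambda m: len(targets[m]))
    top_gene = max(regulators, key=lambda g: len(regulators[g]))
    return {
        "mean_targets_per_mirna": sum(map(len, targets.values())) / len(targets),
        "mean_regulators_per_gene": sum(map(len, regulators.values())) / len(regulators),
        "top_mirna": (top_mirna, len(targets[top_mirna])),
        "top_gene": (top_gene, len(regulators[top_gene])),
    }


# Toy pairs with placeholder names.
toy_pairs = [("miR-A", "G1"), ("miR-A", "G2"), ("miR-B", "G1"), ("miR-C", "G1")]
summary = degree_summary(toy_pairs)
print(summary)
```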

There has been remarkable progress in miRNA target prediction during the past few years, yet it remains an unsolved problem because of the potentially high false positive rate of the predictions. Although more than ten computational tools have been developed and tested extensively for miRNA target prediction, false positive predictions are still a major concern. In this project, we utilised the targets generated by TargetScan, which is regarded as one of the most reliable prediction tools. Currently, there are efforts to improve target prediction by combining sequence with other features such as expression and function. Further work is therefore warranted to develop novel algorithms that take advantage of this information to identify more reliable targets. Moreover, it was reported that not only the mature miRNA but also the primary miRNA can regulate target genes (Trujillo et al., 2010). This discovery could change the mechanistic basis of target prediction. For example, miRNAs in the same family might have different sets of targets, a possibility that no currently available target prediction tool takes into account. However, such work is outside the scope of this project.

3.3 Feed-forward loops in glioblastoma

Next, we explored the combinatorial regulation of GB_genes by the GB_miRNAs and TFs in terms of FFLs, following the procedure shown in Figure 1. An FFL is defined as a 3-node motif including a GB_gene, a GB_miRNA and a TF, in which the GB_gene is targeted by both the GB_miRNA and the TF, and the miRNA is regulated by the TF. By exhaustive searching, 54 FFLs were found among the GB_genes, GB_miRNAs and TFs (Table 2). These FFLs included 34 GB_miRNAs, 27 TFs and 18 GB_genes.
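The exhaustive search for this 3-node motif can be sketched as a join over the three edge sets. The Python below is a minimal illustration under the definition given above, not the authors' code; the function name and the toy edges are hypothetical.

```python
def find_ffls(mirna_gene, tf_gene, tf_mirna):
    """Enumerate (miRNA, TF, gene) triples that close a feed-forward loop.

    mirna_gene: iterable of (miRNA, gene) pairs, miRNA targets gene
    tf_gene:    iterable of (TF, gene) pairs, TF regulates gene
    tf_mirna:   iterable of (TF, miRNA) pairs, TF regulates miRNA
    """
    mirna_gene = set(mirna_gene)
    mirnas_by_tf = {}
    for tf, mirna in tf_mirna:
        mirnas_by_tf.setdefault(tf, set()).add(mirna)

    ffls = set()
    for tf, gene in set(tf_gene):
        # Only miRNAs regulated by this TF can close the loop.
        for mirna in mirnas_by_tf.get(tf, ()):
            if (mirna, gene) in mirna_gene:
                ffls.add((mirna, tf, gene))
    return ffls


# Toy network with placeholder names.
ffls = find_ffls(
    mirna_gene={("miR-A", "G1"), ("miR-B", "G2")},
    tf_gene={("TF1", "G1"), ("TF1", "G2"), ("TF2", "G2")},
    tf_mirna={("TF1", "miR-A"), ("TF2", "miR-A")},
)
print(sorted(ffls))
```

Only the triple in which all three edges are present is reported; a TF and a miRNA that share a target but have no TF-to-miRNA edge do not form a loop under this definition.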

To test whether we had obtained significantly more FFLs using GB_miRNAs and GB_genes than with random sets of miRNAs and genes, two randomisation experiments were carried out. In the first experiment, we drew random miRNAs from all miRNAs, excluding the known cancer related ones (see Materials and methods section), and observed a mean of 34 FFLs across the random sets, with a P value of 0.018. Although the P value was significant (P < 0.05), the magnitude of the significance was not strong. One reasonable explanation is that some cancer related miRNAs remain in the list from which we drew the random samples, since most miRNAs are thought to be associated with carcinogenesis. For the second experiment, the random gene lists were generated from all the human genes excluding the known cancer genes compiled in F-Census (see Materials and methods section). The mean number of FFLs was 24 and the P value was 7.0E-3.

We compared the FFLs in this study with those from a previous study of schizophrenia (Guo et al., 2010). Interestingly, no overlap was found. Our further examination of the datasets in these two studies revealed that only a small number of miRNAs (7) and genes (2) were shared between the two diseases, potentially accounting for the difference in the FFLs. This comparison suggests that the underlying regulatory mechanisms of these two diseases might be different, even though both are brain related diseases. However, caution should be used because the data utilised in neither disease study are complete.

The FFL is one of the models by which a TF and a miRNA regulate a target gene. However, there are additional models, such as a TF acting as a mediator in the miRNA regulation of a gene, as suggested in Tu et al. (2009). Recently, Su et al. (2010) searched for all the potential TF-miRNA-gene motifs by incorporating gene expression data from mouse brain development. Therefore, it would be interesting in future work to explore the active TF-miRNA-gene motifs for glioblastoma by taking advantage of miRNA and mRNA expression data. Furthermore, we can extend this analysis by considering 4-node motifs consisting of a pair of co-expressed genes that are co-regulated by a TF and a miRNA. Such analysis might be more informative in investigating the regulatory system in complex disease.

3.4 Functional analysis of feed-forward loops

To obtain the general functions of the FFLs, we used all the miRNAs, TFs and target genes as input to perform a pathway enrichment analysis through the IPA system (see Materials and methods section). There are 85 enriched functions for the FFLs at FDR 0.05. Comparing these functions with those obtained using all the GB_genes at the same FDR cutoff, there are 78 common functions between the two lists and 7 functions specific to the FFLs, including ATM Signalling (FDR = 4.79E-02), ERK5 Signalling (FDR = 1.23E-02), Role of PKR in Interferon Induction and Antiviral Response (FDR = 3.63E-02), Cardiac Hypertrophy Signalling (FDR = 3.98E-02), Relaxin Signalling (FDR = 4.79E-02), Glucocorticoid Receptor Signalling (FDR = 2.63E-02) and FXR/RXR Activation (FDR = 2.14E-02).
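The comparison of the two enrichment results reduces to set operations on the pathways passing the cutoff. A minimal Python sketch follows; the function name, the pathway labels and the FDR values are placeholders, and treating the cutoff as inclusive is an assumption.

```python
def compare_enrichment(ffl_fdr, gene_fdr, cutoff=0.05):
    """Split FFL-enriched pathways into shared and FFL-specific sets.

    ffl_fdr, gene_fdr: dicts mapping pathway name -> FDR for each analysis.
    """
    ffl_sig = {p for p, fdr in ffl_fdr.items() if fdr <= cutoff}
    gene_sig = {p for p, fdr in gene_fdr.items() if fdr <= cutoff}
    return ffl_sig & gene_sig, ffl_sig - gene_sig


# Placeholder values, not results from the analysis.
ffl_fdr = {"Pathway A": 0.001, "Pathway B": 0.03, "Pathway C": 0.2}
gene_fdr = {"Pathway A": 0.0005, "Pathway B": 0.3, "Pathway D": 0.01}
common, ffl_specific = compare_enrichment(ffl_fdr, gene_fdr)
print(sorted(common), sorted(ffl_specific))
```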

It is not surprising that the FFLs possess most functions that the GB_genes are expected to have (Griffiths-Jones et al., 2006), since the genes in the FFLs are a subset of the entire list of GB_genes. However, it is worth noting that some functions are unique to the FFLs because of the additional GB_miRNAs and TFs regulating the GB_genes. These unique functions may play important roles in the carcinogenesis of glioblastoma and can provide hints towards the underlying mechanisms; thus, they represent novel findings of this study. For example, the signalling cascade of ATM-Chk2, known to respond primarily to DNA double-strand breaks, was activated in human glioblastoma multiforme (GBM) cell lines as reported by Bartkova et al. (2010). Another example is the enriched function of FXR/RXR Activation in glioblastoma. Farnesoid X Receptor (FXR) is a ligand dependent transcription factor and plays a critical role in bile acid, cholesterol and carbohydrate metabolism. Several recent studies have shown that FXR may contribute to breast, colorectal and hepatocellular cancer (Catalano et al., 2010; Gadaleta et al., 2010; Modica et al., 2008). Our functional analysis of FFLs further suggested that FXR/RXR Activation might be associated with glioblastoma.

To assign a function to each FFL, we performed the following analysis. The starting point was the function of the target gene in each loop, since the target gene is the major functional player. For target genes annotated to the enriched functions obtained from the IPA analysis, we assigned the function with the lowest FDR to the loop. Here, we excluded disease-related functions to obtain a more refined function for the loop. Thirty-seven loops were assigned functions in this way. For the remaining loops, whose target genes had no annotation among the enriched pathways, we performed a GO annotation analysis: we extracted all the functions of both the target gene and the TF in the loop, manually determined the closest common function of the two, and assigned it to the loop. The FFLs are categorised by function in Table 2.
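The first, automated step of this assignment (lowest-FDR enriched function of the target gene, with disease-related functions excluded) can be sketched as follows. This is only an illustration under assumptions: the function name `assign_top_function`, the gene labels and the gene-to-function annotations are hypothetical, and the manual GO-based step is represented simply by returning `None`.

```python
def assign_top_function(target_gene, gene_to_functions, fdr_by_function,
                        disease_functions=frozenset()):
    """Return the enriched, non-disease function with the lowest FDR
    among those annotated to the target gene, or None if there is none."""
    candidates = [
        function
        for function in gene_to_functions.get(target_gene, ())
        if function in fdr_by_function and function not in disease_functions
    ]
    if not candidates:
        # No enriched pathway annotation: left to the manual GO-based step.
        return None
    return min(candidates, key=lambda function: fdr_by_function[function])


# FDR values: the first two are reported in the text; the third is illustrative.
fdr_by_function = {
    "ERK5 Signalling": 1.23e-02,
    "ATM Signalling": 4.79e-02,
    "Disease function (placeholder)": 1.0e-04,
}
# Hypothetical annotations for two hypothetical target genes.
gene_to_functions = {
    "GENE_A": ["ATM Signalling", "ERK5 Signalling", "Disease function (placeholder)"],
    "GENE_B": [],
}
disease_functions = {"Disease function (placeholder)"}

top_a = assign_top_function("GENE_A", gene_to_functions, fdr_by_function, disease_functions)
top_b = assign_top_function("GENE_B", gene_to_functions, fdr_by_function, disease_functions)
print(top_a, top_b)
```

Excluding the disease-related entry before taking the minimum matters here: without the exclusion, the placeholder disease function would win because it has the lowest FDR.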

Table 2 List of feed-forward loops consisting of glioblastoma-related miRNAs, transcription factors and glioblastoma-related genes, and their top functions. Column 1: glioblastoma-related miRNA name. Column 2: transcription factor. Column 3: glioblastoma-related gene name. Column 4: the assigned function of each loop. Loops in bold are those whose functions are specific to the feed-forward loops

| microRNA | TF | GB_gene | Top function |
|---|---|---|---|
| **hsa-let-7f** | **IRF1** | **LRP2** | **Role of PKR in interferon induction and antiviral response** |
| **hsa-miR-425** | **CREB1** | **C6ORF170** | **ATM signalling** |
| **hsa-miR-30b** | **MEF2A** | **EGFR** | **ERK5 signalling** |
| **hsa-miR-328** | **MEF2A** | **PIK3R1** | **ERK5 signalling** |
| **hsa-miR-140-3p** | **HNF4A** | **LMX1A** | **FXR/RXR activation** |
| **hsa-miR-27b** | **SREBF1** | **PIK3R1** | **FXR/RXR activation** |
| **hsa-miR-222** | **NR3C1** | **LMX1A** | **Glucocorticoid receptor signalling** |
| **hsa-miR-140-3p** | **GATA3** | **LRP2** | **Role of PKR in interferon induction and antiviral response** |
| **hsa-miR-197** | **GATA1** | **LRP2** | **Role of PKR in interferon induction and antiviral response** |
| hsa-miR-218 | MEF2A | SKP2 | Cell cycle: G1/S checkpoint regulation |
| hsa-miR-139-3p | CUTL1 | IDH1 | Citrate cycle |
| hsa-miR-223 | MEIS1 | PIK3R1 | FLT3 signalling in hematopoietic progenitor cells |
| hsa-miR-9 | MEIS1 | PIK3R1 | FLT3 signalling in hematopoietic progenitor cells |
| hsa-miR-128 | ARNT | ARNT2 | Hypoxia signalling in the cardiovascular system |
| hsa-miR-151-3p | PPARG | PIK3R1 | IL-12 signalling and production in macrophages |
| hsa-miR-150 | POU2F1 | PPARG | IL-12 signalling and production in macrophages |
| hsa-miR-23a | NFE2L1 | PIK3R1 | IL-9 signalling |
| hsa-miR-98 | NKX2 | PIK3R1 | Neurotrophin/TRK signalling |
| hsa-miR-342-3p | BACH1 | C21ORF29 | NRF2-mediated oxidative stress response |
| hsa-miR-29b | TP53 | PIK3R1 | p53 signalling |
| hsa-miR-29b | TP53 | PKHD1 | p53 signalling |
| hsa-miR-29b | TP53 | TP53 | p53 signalling |
| hsa-miR-195 | POU3F2 | PIK3R1 | PI3K/AKT signalling |
| hsa-miR-150 | POU2F1 | PTEN | PI3K/AKT signalling |
| hsa-miR-221 | POU2F1 | PTEN | PI3K/AKT signalling |
| hsa-let-7f | IRF1 | PIK3R1 | Prolactin signalling |
| hsa-miR-152 | CUTL1 | PTEN | PTEN signalling |
| hsa-miR-378 | MZF1 | PTEN | PTEN signalling |
| hsa-miR-483-3p | POU2F2 | PTEN | PTEN signalling |
| hsa-miR-150 | POU2F1 | LRRC7 | Role of BRCA1 in DNA damage response |
| hsa-miR-30d | ARNT | RB1 | Role of BRCA1 in DNA damage response |
| hsa-miR-9 | MEIS1 | RB1 | Role of BRCA1 in DNA damage response |
| hsa-miR-23b | FOXD3 | PIK3CA | Role of Oct4 in mammalian embryonic stem cell pluripotency |
| hsa-miR-98 | NR2F2 | TP53 | Role of Oct4 in mammalian embryonic stem cell pluripotency |
| hsa-miR-19a | STAT5A | PIK3R1 | Role of tissue factor in cancer |
| hsa-miR-130a | GATA1 | C21ORF29 | Thrombin signalling |
| hsa-miR-516a-3p | GATA1 | C21ORF29 | Thrombin signalling |
| hsa-miR-130a | GATA1 | LGI1 | Thrombin signalling |
| hsa-miR-24 | GATA1 | PIK3R1 | Thrombin signalling |
| hsa-miR-26b | GATA1 | PIK3R1 | Thrombin signalling |
| hsa-miR-342-3p | CEBPA | PIK3R1 | VDR/RXR activation |
| hsa-miR-376a | CUTL1 | LRP2 | Cell proliferation |
| hsa-miR-139-3p | CUTL1 | LMX1A | Differentiation of mesencephalic dopamine neurons |
| hsa-miR-376a | CUTL1 | LMX1A | Differentiation of mesencephalic dopamine neurons |
| hsa-miR-139-3p | CUTL1 | GOPC | Golgi to plasma membrane transport |
| hsa-miR-150 | PAX2 | GOPC | Golgi to plasma membrane transport |
| hsa-miR-30e | YY1 | GOPC | Golgi to plasma membrane transport |
| hsa-miR-27b | RFX1 | C6ORF170 | Immune response |
| hsa-miR-378 | NFE2L1 | PIK3R1 | Inflammatory response |
| hsa-miR-98 | NKX2 | C21ORF29 | Neuron migration |
| hsa-miR-140-3p | GATA3 | LGI1 | Positive regulation of synaptic transmission |
| hsa-miR-204 | POU3F2 | PIK3R1 | Regulation of neurogenesis |
| hsa-miR-483-3p | POU3F2 | PIK3R1 | Regulation of neurogenesis |
| hsa-miR-218 | MEIS1 | LRP2 | Regulation of vascular development |

3.5 Case study of FFLs

Some key pathways with frequent alterations in terms of CNV and DNA mutation were identified in the TCGA project, including the PI3K signalling pathway, RTK/RAS signalling pathway, P53 signalling pathway and RB signalling pathway. These pathways play important roles in cell migration, DNA repair, cell cycle progression and apoptosis. The core molecules in these pathways and the interplay among the pathways are shown in Figure 2, which also gives the mutation or copy number alteration frequency of most genes (TCGA, 2008). Some genes showed a high frequency of alteration among the 91 samples, such as PTEN (36%), TP53 (35%) and RB1 (11%). These genes were also target genes in the FFLs we identified. For example, PTEN participated in five FFLs: hsa-miR-150 – POU2F1 – PTEN, hsa-miR-221 – POU2F1 – PTEN, hsa-miR-152 – CUTL1 – PTEN, hsa-miR-378 – MZF1 – PTEN and hsa-miR-483-3p – POU2F2 – PTEN (see Figure 2 and Table 2). Similarly, RB1 could be regulated by the FFLs hsa-miR-30d – ARNT – RB1 and hsa-miR-9 – MEIS1 – RB1, and TP53 by the FFL hsa-miR-98 – NR2F2 – TP53. These results indicate that the genes in the key pathways are regulated in a complicated manner among patients. Future investigation of the function and regulation of these genes is thus warranted. If they are found to be functionally critical, they may be utilised as promising biomarkers or therapeutic targets for glioblastoma.
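The grouping of FFLs by core target gene described above can be reproduced with a short Python sketch. The triples are the eight loops named in this paragraph (a subset of Table 2); the helper name `loops_by_target` is an assumption made for illustration.

```python
from collections import defaultdict

# (miRNA, TF, target gene) triples named in the case study paragraph.
CASE_STUDY_FFLS = [
    ("hsa-miR-150", "POU2F1", "PTEN"),
    ("hsa-miR-221", "POU2F1", "PTEN"),
    ("hsa-miR-152", "CUTL1", "PTEN"),
    ("hsa-miR-378", "MZF1", "PTEN"),
    ("hsa-miR-483-3p", "POU2F2", "PTEN"),
    ("hsa-miR-30d", "ARNT", "RB1"),
    ("hsa-miR-9", "MEIS1", "RB1"),
    ("hsa-miR-98", "NR2F2", "TP53"),
]


def loops_by_target(ffls):
    """Group (miRNA, TF) regulator pairs by the target gene they share."""
    grouped = defaultdict(list)
    for mirna, tf, gene in ffls:
        grouped[gene].append((mirna, tf))
    return dict(grouped)


by_target = loops_by_target(CASE_STUDY_FFLS)
for gene, regulators in by_target.items():
    print(gene, len(regulators), regulators)
```

Applied to the full Table 2 instead of this subset, the same grouping would also pick up loops not discussed in the paragraph, such as hsa-miR-29b – TP53 – TP53.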

Figure 2 Feed-forward loops (left panel) that regulate several core genes (PTEN, TP53 and RB1) in glioblastoma (see online version for colours)

This figure was adapted from Supplementary Figure 8 published in Nature, 2008, 455:1061-1068 authored by The Cancer Genome Atlas Research Network. The adaptation was granted by the Nature Publishing Group.

4 Conclusion

Our analysis characterised, from the pathway perspective, the important regulatory functions disturbed by the glioblastoma-related genes. Additionally, we explored how these functions were deregulated using the feed-forward loop approach. We found that a large proportion of functions was shared between the glioblastoma-related genes and the FFLs but, more importantly, that some functions were specific to the FFLs. Our work provides data for future investigation of the mechanisms underlying glioblastoma, as well as potential regulatory subunits that might be useful for discovering biomarkers and therapeutic targets for glioblastoma.

Acknowledgements

We thank Drs. An-Yuan Guo and Peilin Jia for helpful discussion and Dr. Kang Tu for kindly sharing data with us. This work was partially supported by Vanderbilt’s Specialised Program of Research Excellence in GI Cancer grant (P50CA95103) and the VICC Cancer Center Core grant (P30CA68485).

References

Archer, K.J., Zhao, Z., Guennel, T., Maluf, D.G., Fisher, R.A. and Mas, V.R. (2010) ‘Identifying genes progressively silenced in preneoplastic and neoplastic liver tissues’, Int. J. Comput. Biol. Drug Des., Vol. 3, pp.52–67.

Bartkova, J., Hamerlik, P., Stockhausen, M.T., Ehrmann, J., Hlobilkova, A., Laursen, H., Kalita, O., Kolar, Z., Poulsen, H.S., Broholm, H., Lukas, J. and Bartek, J. (2010) ‘Replication stress and oxidative damage contribute to aberrant constitutive activation of DNA damage signalling in human gliomas’, Oncogene, Vol. 29, pp.5095–5102.

Benjamini, Y. and Hochberg, Y. (1995) ‘Controlling the false discovery rate: a practical and powerful approach to multiple testing’, J. R. Statist. Soc. B, Vol. 57, pp.289–300.

Catalano, S., Malivindi, R., Giordano, C., Gu, G., Panza, S., Bonofiglio, D., Lanzino, M., Sisci, D., Panno, M.L. and Ando, S. (2010) ‘Farnesoid X receptor, through the binding with steroidogenic factor 1-responsive element, inhibits aromatase expression in tumor Leydig cells’, J. Biol. Chem., Vol. 285, pp.5581–5593.

Chan, T.A., Glockner, S., Yi, J.M., Chen, W., Van Neste, L., Cope, L., Herman, J.G., Velculescu, V., Schuebel, K.E., Ahuja, N. and Baylin, S.B. (2008) ‘Convergence of mutation and epigenetic alterations identifies common genes in cancer that predict for poor prognosis’, PLoS Med., Vol. 5, p.e114.

Friard, O., Re, A., Taverna, D., De Bortoli, M. and Cora, D. (2010) ‘CircuitsDB: a database of mixed microRNA/transcription factor feed-forward regulatory circuits in human and mouse’, BMC Bioinformatics, Vol. 11, p.435.

Gadaleta, R.M., van Mil, S.W., Oldenburg, B., Siersema, P.D., Klomp, L.W. and van Erpecum, K.J. (2010) ‘Bile acids and their nuclear receptor FXR: relevance for hepatobiliary and gastrointestinal disease’, Biochim. Biophys. Acta, Vol. 1801, pp.683–692.

Gaire, R.K., Bailey, J., Bearfoot, J., Campbell, I.G., Stuckey, P.J. and Haviv, I. (2010) ‘MIRAGAA – a methodology for finding coordinated effects of microRNA expression changes and genome aberrations in cancer’, Bioinformatics, Vol. 26, pp.161–167.

Gong, X., Wu, R., Zhang, Y., Zhao, W., Cheng, L., Gu, Y., Zhang, L., Wang, J., Zhu, J. and Guo, Z. (2010) ‘Extracting consistent knowledge from highly inconsistent cancer gene data sources’, BMC Bioinformatics, Vol. 11, p.76.

Griffiths-Jones, S., Grocock, R.J., van Dongen, S., Bateman, A. and Enright, A.J. (2006) ‘miRBase: microRNA sequences, targets and gene nomenclature’, Nucleic Acids Res., Vol. 34, pp.D140–D144.

Grimson, A., Farh, K.K., Johnston, W.K., Garrett-Engele, P., Lim, L.P. and Bartel, D.P. (2007) ‘MicroRNA targeting specificity in mammals: determinants beyond seed pairing’, Mol. Cell, Vol. 27, pp.91–105.

Guo, A.Y., Sun, J., Jia, P. and Zhao, Z. (2010) ‘A novel microRNA and transcription factor mediated regulatory network in schizophrenia’, BMC Syst Biol., Vol. 4, p.10.

Hanahan, D. and Weinberg, R.A. (2000) ‘The hallmarks of cancer’, Cell, Vol. 100, pp.57–70.

Jia, P., Ewers, J.M. and Zhao, Z. (2011) ‘Prioritization of epilepsy associated candidate genes by convergent analysis’, PLoS ONE, Vol. 6, No. 2, p.e17162.

Jiang, Q., Wang, Y., Hao, Y., Juan, L., Teng, M., Zhang, X., Li, M., Wang, G. and Liu, Y. (2009) ‘miR2Disease: a manually curated database for microRNA deregulation in human disease’, Nucleic Acids Res., Vol. 37, pp.D98–D104.

Kim, H., Huang, W., Jiang, X., Pennicooke, B., Park, P.J. and Johnson, M.D. (2008) ‘Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship’, Proc. Natl. Acad. Sci., USA, Vol. 107, pp.2183–2188.

Lipsitz, D., Higgins, R.J., Kortz, G.D., Dickinson, P.J., Bollen, A.W., Naydan, D.K. and LeCouteur, R.A. (2003) ‘Glioblastoma multiforme: clinical findings, magnetic resonance imaging, and pathology in five dogs’, Vet Pathol., Vol. 40, pp.659–669.

Lu, M., Zhang, Q., Deng, M., Miao, J., Guo, Y., Gao, W. and Cui, Q. (2008) ‘An analysis of human microRNA and disease associations’, PLoS One, Vol. 3, p.e3420.

Mischel, P.S., Nelson, S.F. and Cloughesy, T.F. (2003) ‘Molecular analysis of glioblastoma: pathway profiling and its implications for patient therapy’, Cancer Biol. Ther., Vol. 2, pp.242–247.

Modica, S., Murzilli, S., Salvatore, L., Schmidt, D.R. and Moschetta, A. (2008) ‘Nuclear bile acid receptor FXR protects against intestinal tumorigenesis’, Cancer Res., Vol. 68, pp.9589–9594.

Qiu, C., Wang, J., Yao, P., Wang, E. and Cui, Q. (2010) ‘microRNA evolution in a human transcription factor and microRNA regulatory network’, BMC Syst. Biol., Vol. 4, p.90.

Re, A., Cora, D., Taverna, D. and Caselle, M. (2009) ‘Genome-wide survey of microRNA-transcription factor feed-forward regulatory circuits in human’, Mol. Biosyst., Vol. 5, pp.854–867.

Ruepp, A., Kowarsch, A., Schmidl, D., Buggenthin, F., Brauner, B., Dunger, I., Fobo, G., Frishman, G., Montrone, C. and Theis, F.J. (2010) ‘PhenomiR: a knowledgebase for microRNA expression in diseases and biological processes’, Genome Biol., Vol. 11, p.R6.

Schuebel, K.E., Chen, W., Cope, L., Glockner, S.C., Suzuki, H., Yi, J.M., Chan, T.A., Van Neste, L., Van Criekinge, W., van den Bosch, S., van Engeland, M., Ting, A.H., Jair, K., Yu, W., Toyota, M., Imai, K., Ahuja, N., Herman, J.G. and Baylin, S.B. (2007) ‘Comparing the DNA hypermethylome with gene mutations in human colorectal cancer’, PLoS Genet., Vol. 3, pp.1709–1723.

Shalgi, R., Lieber, D., Oren, M. and Pilpel, Y. (2007) ‘Global and local architecture of the mammalian microRNA-transcription factor regulatory network’, PLoS Comput. Biol., Vol. 3, p.e131.

Su, N., Wang, Y., Qian, M. and Deng, M. (2010) ‘Combinatorial regulation of transcription factors and microRNAs’, BMC Syst. Biol., Vol. 4, p.150.

TCGA (2008) ‘Comprehensive genomic characterization defines human glioblastoma genes and core pathways’, Nature, Vol. 455, pp.1061–1068.

Trujillo, R.D., Yue, S.B., Tang, Y., O’Gorman, W.E. and Chen, C.Z. (2010) ‘The potential functions of primary microRNAs in target recognition and repression’, Embo J., Vol. 29, pp.3272–3285.

Tsang, J., Zhu, J. and van Oudenaarden, A. (2007) ‘MicroRNA-mediated feedback and feedforward loops are recurrent network motifs in mammals’, Mol. Cell, Vol. 26, pp.753–767.

Tu, K., Yu, H., Hua, Y.J., Li, Y.Y., Liu, L., Xie, L. and Li, Y.X. (2009) ‘Combinatorial network of primary and secondary microRNA-driven regulatory mechanisms’, Nucleic Acids Res., Vol. 37, pp.5969–5980.

Verhaak, R.G., Hoadley, K.A., Purdom, E., Wang, V., Qi, Y., Wilkerson, M.D., Miller, C.R., Ding, L., Golub, T., Mesirov, J.P., Alexe, G., Lawrence, M., O’Kelly, M., Tamayo, P., Weir, B.A., Gabriel, S., Winckler, W., Gupta, S., Jakkula, L., Feiler, H.S., Hodgson, J.G., James, C.D., Sarkaria, J.N., Brennan, C., Kahn, A., Spellman, P.T., Wilson, R.K., Speed, T.P., Gray, J.W., Meyerson, M., Getz, G., Perou, C.M. and Hayes, D.N. (2010) ‘Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1’, Cancer Cell, Vol. 17, pp.98–110.

Zitvogel, L., Tesniere, A. and Kroemer, G. (2006) ‘Cancer despite immunosurveillance: immunoselection and immunosubversion’, Nat. Rev. Immunol., Vol. 6, pp.715–727.