Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Page 1: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Functional annotation and network reconstruction through cross-platform integration of microarray data

X. J. Zhou et al. 2005

Page 2: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Challenges in microarray data analysis

• Integration of multiple microarray data sets.
  – Different platforms, e.g. cDNA arrays, Affymetrix arrays
  – Alternative experimental parameters

• Identification of functionally related genes which do not have similar expression patterns.

• Reconstruction of transcriptional regulatory networks.
  – It is difficult to elucidate the cooperativity between TFs because the changes in their expression are often subtle and their activities are often controlled at levels other than expression.

Page 3: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Data pre-processing

• Classify the 618 expression profiles into 39 data sets. A data set contains a set of expression profiles measured under relevant conditions.
  – 19 cDNA data sets from SMD
  – 4 Affymetrix data sets from GEO
  – 16 data sets from Rosetta

Page 4: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

19 SMD data sets

• Alpha factor release
• cdc15 block release
• DTT Exposure
• Elutriation
• Forkhead regulation
• Gamma radiation
• Menadione exposure
• DNA damage (MMS) response
• Nitrogen depletion
• Nutrition limitation
• Osmotic shock
• SIR proteins (Chromatin Silencing)
• Sorbitol effects
• H2O2 response
• Heat shock
• Heat steady
• CellCycle Factor
• YPD Stationary phase
• Zinc homoeostasis

These correspond to 19 SMD subcategories.

Page 5: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

4 GEO data sets

• Aging
• Chitin synthesis
• Fermentation time course
• Ume6 regulon

Page 6: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

16 Rosetta data sets

• Cell cycle control
• Cell wall organization
• Chromatin assembly
• Ion homeostasis
• Nucleotide metabolism
• Organelle biogenesis
• Perception of external stimulus
• Protein biosynthesis
• Protein degradation
• Protein metabolism
• Protein phosphorylation
• Protein transport
• Pseudohyphal growth
• Steroid metabolism
• Amino Acid Starvation
• MAPK pathway

Classification is based on the Gene Ontology (GO) biological process categories of the deleted genes.

Page 7: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

The idea: 2nd-order expression correlation

• 1st-order expression correlation
  – Correlation of the expression patterns of two genes within one data set
  – For each pair of genes, a vector of length n is obtained, where n is the number of data sets.

• 2nd-order expression correlation
  – Correlation of the 1st-order expression correlation profiles of two gene pairs
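The two levels of correlation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes Pearson correlation as the similarity measure (the slides do not name one), and the `datasets` structure, gene names, and expression values are invented toy data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return cov / sqrt(vx * vy)

def first_order_profile(gene_a, gene_b, datasets):
    """One correlation per data set, giving a vector of length n."""
    return [pearson(ds[gene_a], ds[gene_b]) for ds in datasets]

def second_order_correlation(pair_1, pair_2, datasets):
    """Correlation between the 1st-order profiles of two gene pairs."""
    profile_1 = first_order_profile(pair_1[0], pair_1[1], datasets)
    profile_2 = first_order_profile(pair_2[0], pair_2[1], datasets)
    return pearson(profile_1, profile_2)

# Toy data (hypothetical): each data set maps a gene name to its expression values.
datasets = [
    {"g1": [1, 2, 3, 4], "g2": [1, 2, 3, 4], "g3": [2, 4, 6, 8], "g4": [1, 3, 5, 7]},
    {"g1": [1, 2, 3, 4], "g2": [4, 3, 2, 1], "g3": [1, 2, 3, 4], "g4": [8, 6, 4, 2]},
    {"g1": [1, 2, 3, 4], "g2": [1, 2, 3, 4], "g3": [2, 4, 6, 8], "g4": [1, 3, 5, 7]},
]

print(first_order_profile("g1", "g2", datasets))
print(second_order_correlation(("g1", "g2"), ("g3", "g4"), datasets))
```

In this toy case both gene pairs switch between co-expression and anti-expression in the same data sets, so their 1st-order profiles track each other even though g1 and g3 need not look alike.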

Page 8: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

An example

The overall expression similarity between the two gene pairs is not significantly high. However, their 1st-order expression correlation profiles exhibit high correlation, that is, the four genes have high 2nd-order expression correlation.

Page 9: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Clustering functionally related genes

• Procedure
  – Identification of doublets
    • A doublet is a pair of genes that is tightly co-expressed in multiple data sets.
  – Clustering of doublets based on their 1st-order expression correlation profiles

• Results
  – 72 of the top 100 tightest clusters are functionally homogeneous.
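The doublet definition above can be sketched as a simple filter over gene pairs. The slides give no numeric criteria, so the thresholds `min_corr` and `min_datasets`, the use of Pearson correlation, and the toy data below are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return cov / sqrt(vx * vy)

def find_doublets(datasets, min_corr=0.8, min_datasets=2):
    """Return gene pairs co-expressed (corr >= min_corr) in at least
    min_datasets data sets. Both thresholds are hypothetical."""
    genes = sorted(set.intersection(*(set(ds) for ds in datasets)))
    doublets = []
    for a, b in combinations(genes, 2):
        hits = sum(1 for ds in datasets if pearson(ds[a], ds[b]) >= min_corr)
        if hits >= min_datasets:
            doublets.append((a, b))
    return doublets

# Toy data (hypothetical): gene name -> expression values per data set.
datasets = [
    {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]},
    {"g1": [1, 3, 2, 5], "g2": [2, 6, 4, 10], "g3": [1, 3, 2, 5]},
    {"g1": [5, 1, 4, 2], "g2": [10, 2, 8, 4], "g3": [2, 4, 1, 5]},
]

print(find_doublets(datasets))
```

Each doublet found this way would then be represented by its 1st-order correlation profile, and those profiles would be the input to whatever clustering method is chosen.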

Page 10: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Gene function prediction

• A prediction of function is made for a doublet only if it is in a tight cluster that includes at least three doublets and in which all remaining doublets share the same function.

• 79 functions are assigned to 67 unknown genes. Some have been verified by experimental studies.
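The prediction rule can be read as a small decision procedure. The sketch below is one possible reading, under simplifying assumptions: each doublet carries a single function label (or `None` if unknown), the size threshold of three counts the unannotated doublet itself, and the gene names and labels are made up.

```python
def predict_functions(cluster, annotations, min_size=3):
    """Predict a function for each unannotated doublet in a cluster.

    cluster:     list of doublets (gene-name pairs)
    annotations: dict mapping a doublet to its function label, or None if unknown
    A prediction is made only if the cluster has at least min_size doublets
    and all remaining doublets share one known function.
    """
    if len(cluster) < min_size:
        return {}
    predictions = {}
    for doublet in cluster:
        if annotations.get(doublet) is not None:
            continue  # already annotated, nothing to predict
        others = [annotations.get(d) for d in cluster if d != doublet]
        if None not in others and len(set(others)) == 1:
            predictions[doublet] = others[0]
    return predictions

# Hypothetical cluster of three doublets, one of them unannotated.
cluster = [("geneA", "geneB"), ("geneC", "geneD"), ("geneE", "geneF")]
annotations = {
    ("geneA", "geneB"): "protein biosynthesis",
    ("geneC", "geneD"): "protein biosynthesis",
    ("geneE", "geneF"): None,
}

print(predict_functions(cluster, annotations))
```

Requiring unanimity among the remaining doublets is a conservative choice: a single conflicting annotation in the cluster blocks the prediction.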

Page 11: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Reconstruction of regulatory networks

• For each transcription module, a 1st-order average expression correlation profile (a vector with the same length as the number of data sets) is calculated. The profile of a module can be interpreted as the activity profile of the transcription factor(s) that regulate the module.
  – A transcription module (TM) is defined as a set of genes that are regulated by the same transcription factor(s), based on genome-wide location data, and are coexpressed in multiple data sets.
  – 60 TMs are identified.

• A 2nd-order expression correlation is calculated between the activity profiles of two transcription factors to measure the cooperativity between them.
  – 34 pairs show high 2nd-order correlation.
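A module's activity profile and the cooperativity score between two modules can be sketched as follows. This assumes that the "average expression correlation" in a data set is the mean pairwise Pearson correlation among the module's genes; the slides do not spell out the averaging, and the module memberships and expression values here are invented.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return cov / sqrt(vx * vy)

def module_profile(module_genes, datasets):
    """Mean pairwise correlation of the module's genes in each data set.
    Assumes the module contains at least two genes."""
    pairs = list(combinations(module_genes, 2))
    return [
        sum(pearson(ds[a], ds[b]) for a, b in pairs) / len(pairs)
        for ds in datasets
    ]

def cooperativity(module_1, module_2, datasets):
    """2nd-order correlation between two module activity profiles."""
    return pearson(module_profile(module_1, datasets),
                   module_profile(module_2, datasets))

# Toy data (hypothetical): both modules are coherent in data sets 1 and 3
# and lose coherence in data set 2.
up, up_shift, down = [1, 2, 3, 4], [2, 3, 4, 5], [4, 3, 2, 1]
datasets = [
    {"g1": up, "g2": up_shift, "g3": up, "g4": up, "g5": [3, 4, 5, 6]},
    {"g1": up, "g2": down, "g3": up_shift, "g4": up, "g5": down},
    {"g1": up, "g2": up_shift, "g3": up, "g4": up, "g5": [3, 4, 5, 6]},
]

print(module_profile(["g1", "g2", "g3"], datasets))
print(cooperativity(["g1", "g2", "g3"], ["g4", "g5"], datasets))
```

The score depends only on how coherently each module's targets behave across data sets, not on the expression level of the transcription factors themselves, which is what makes it usable when TF expression changes are subtle.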

Page 12: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005
Page 13: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Clustering of modules

Page 14: Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al. 2005

Annotation of TFs

• The function of a TF is predicted based on two lines of evidence:
  – The functions of known genes in its target module
  – The functions of known genes in other modules in the same module cluster

• TF GAT3 is predicted to play a role in mitotic and meiotic cell cycles.