Canadian Bioinformatics Workshops
www.bioinformatics.ca
Module 4: Analyzing gene list function and associations
Quaid Morris
Interpreting gene lists from -omics studies, July 15-16, 2010
http://morrislab.med.utoronto.ca
Module 4 bioinformatics.ca
Overview
• Extending gene lists using functional associations
• Sources of functional association
• GeneMANIA
Extending Gene Lists
• Given a gene list, find other similar genes
  – Gene list defines the query and the “function” of interest
• Query: complex or pathway components
  – Result: additional members
• Query: kinases
  – Result: other kinases and related genes
• Query: genes affected in RNAi screen
  – Result: other genes that may affect phenotype
Network-Based Gene Function Prediction
• Genes of similar sequence often have similar function
• Unknown gene similar to known gene likely to have similar function (annotation transfer)
• Guilt-by-association principle
• Many other similarity measures for genes (e.g. co-localization)
Fraser AG, Marcotte EM - A probabilistic view of gene function - Nat Genet. 2004 Jun;36(6):559-64
Functional association networks to predict gene function
[Figure: microarray expression data (Eisen et al, PNAS 1998) converted into a co-expression network. Nodes: CDC3, CDC16, CLB4, RPN3, RPT1, RPT6, UNK1, UNK2. Function labels: protein degradation, cell cycle.]
Fraser AG, Marcotte EM - A probabilistic view of gene function - Nat Genet. 2004 Jun;36(6):559-64
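The step from expression data to a co-expression network can be sketched as follows. This is only a minimal illustration, assuming Pearson correlation as the similarity measure and an arbitrary cutoff of 0.8. The expression values are made up, and `pearson` and `coexpression_network` are hypothetical helper names, not functions from any tool named in these slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)

def coexpression_network(expr, threshold=0.8):
    """Link every gene pair whose correlation is at least `threshold`.

    Returns a dict mapping (gene_a, gene_b), sorted alphabetically,
    to the correlation used as the edge weight.
    """
    genes = sorted(expr)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            if r >= threshold:
                edges[(g, h)] = r
    return edges

# Made-up expression profiles over four conditions (illustration only).
toy_expression = {
    "CDC3":  [1.0, 2.0, 3.0, 4.0],
    "CDC16": [1.1, 2.1, 2.9, 4.2],
    "RPT1":  [4.0, 3.0, 2.0, 1.0],
    "RPT6":  [4.2, 2.9, 2.1, 0.9],
}
edges = coexpression_network(toy_expression)
for pair, weight in edges.items():
    print(pair, round(weight, 3))
```

In this toy data only the two pairs with similar profiles are linked. Real pipelines typically normalize the data first and may keep only the strongest few links per gene.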
Predicting Gene Function Using a Network
Is gene X involved in cell cycle regulation?
[Figure: a functional association network (e.g. co-expression) over CDC3, CDC16, CLB4, RPN3, RPT1, RPT6, UNK1, UNK2 and UNK3. Labelled examples are marked + or -, and the unknown genes are marked ?. A classification algorithm (e.g. kNN, SVM, LabelProp) assigns each unknown gene a discriminant value.]
Discriminant value
UNK1 0.9
UNK2 0.1
UNK3 0.05
Discriminant value: a value you can a) use to rank the genes according to certainty and b) threshold to classify genes
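The two uses of a discriminant value, ranking and thresholding, can be illustrated with a very simple classifier. The sketch below assumes a weighted vote over directly connected labelled genes, which is a stand-in for the kNN, SVM or label propagation methods named on the slide. The edge weights, the choice of which genes are positive, and the 0.5 threshold are all made up, so the scores do not reproduce the values shown above.

```python
def neighbor_vote(network, labels, gene):
    """Discriminant value: weighted fraction of labelled neighbours that are positive."""
    positive = total = 0.0
    for neighbor, weight in network.get(gene, {}).items():
        if neighbor in labels:
            total += weight
            if labels[neighbor] == 1:
                positive += weight
    return positive / total if total else 0.0

# Assumed labels: 1 = involved in cell cycle regulation, 0 = not involved.
labels = {"CDC3": 1, "CDC16": 1, "CLB4": 1, "RPN3": 0, "RPT1": 0, "RPT6": 0}

# Hypothetical co-expression edges from each unknown gene to labelled genes.
network = {
    "UNK1": {"CDC3": 0.9, "CDC16": 0.9, "RPT1": 0.2},
    "UNK2": {"RPN3": 0.8, "CLB4": 0.2},
    "UNK3": {"RPT6": 0.9},
}

scores = {gene: neighbor_vote(network, labels, gene) for gene in network}

# a) rank the genes according to certainty
ranked = sorted(scores, key=scores.get, reverse=True)

# b) threshold to classify genes
threshold = 0.5
predicted = [gene for gene in ranked if scores[gene] >= threshold]

print(ranked)
print(predicted)
```

Lowering the threshold trades precision for recall. The ranking itself does not depend on where the threshold is set.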
Label propagation algorithm
[Figure: label propagation vs guilt-by-association on an example network of MCA1, CDC48, CPR3 and TDH2. Nodes are shaded by discriminant value on a scale from -1 to +1.]
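The difference between the two approaches can be sketched in a few lines. Guilt-by-association looks only at direct neighbours, whereas label propagation lets scores flow through indirect connections. The code below is a simplified variant in which labelled genes are clamped at +1 or -1 and every unlabelled gene repeatedly takes the weighted average of its neighbours. It is not presented as GeneMANIA's exact algorithm, and the chain topology, edge weights and labels are assumptions made for illustration.

```python
def propagate_labels(network, labels, iterations=200, tol=1e-9):
    """Iteratively average neighbour scores; labelled genes stay fixed.

    network: dict gene -> dict neighbour -> edge weight (symmetric)
    labels:  dict gene -> +1 or -1 for the labelled genes
    Returns a dict gene -> discriminant value in [-1, +1].
    """
    scores = {gene: float(labels.get(gene, 0.0)) for gene in network}
    for _ in range(iterations):
        biggest_change = 0.0
        for gene, neighbors in network.items():
            total = sum(neighbors.values())
            if gene in labels or total == 0:
                continue  # clamp labelled genes, skip isolated ones
            new_score = sum(w * scores[nb] for nb, w in neighbors.items()) / total
            biggest_change = max(biggest_change, abs(new_score - scores[gene]))
            scores[gene] = new_score
        if biggest_change < tol:
            break
    return scores

# Hypothetical chain: MCA1 - CDC48 - CPR3 - TDH2, all edges of weight 1.
network = {
    "MCA1":  {"CDC48": 1.0},
    "CDC48": {"MCA1": 1.0, "CPR3": 1.0},
    "CPR3":  {"CDC48": 1.0, "TDH2": 1.0},
    "TDH2":  {"CPR3": 1.0},
}
labels = {"MCA1": 1, "TDH2": -1}  # assumed positive and negative examples

scores = propagate_labels(network, labels)
for gene in network:
    print(gene, round(scores[gene], 3))
```

On this chain CDC48 ends near +1/3 and CPR3 near -1/3: each unlabelled gene is influenced by both labelled genes, including the one it is not directly connected to.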
Types of functional associations
• Molecular Interactions (i.e. physical interactions)
• Regulatory Interactions (e.g. ChIP-chip binding)
• Genetic Interactions (e.g. synthetic lethality)
• Similarity relationships
  – Co-expression
  – Protein sequence (e.g. BLAST -log(E-value))
  – Domain architecture
  – Phylogenetic profiles
  – Gene neighborhood**
  – Gene fusion**
  – …
** most useful for bacterial genes
Problem: genes are multi-function
• Gene function could be a/the:
  – Biological process,
  – Biochemical/molecular function,
  – Subcellular/Cellular localization,
  – Regulatory targets,
  – Temporal expression pattern,
  – Phenotypic effect of deletion.
Some networks may be better for some types of gene function than others
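Query-specific weights for multifaceted functional queries
[Figure: individual networks are multiplied by query-specific weights and summed into one composite network:
composite = w1 x Co-expression + w2 x Genetic (Tong et al. 2001) + w3 x Co-complexed (Jeong et al 2002)
Example genes: CDC27, APC11, CDC23, XRS2, RAD54, MRE11, UNK1, UNK2. Function labels: cell cycle, DNA repair.]
Pavlidis et al, 2002; Lanckriet et al, 2004; Mostafavi et al, 2008
(The GeneMANIA project, June 24, 2009)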
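The weighted-sum idea can be sketched as below. The weighting rule here is a crude heuristic chosen for illustration: each network is weighted by the fraction of its edge weight that links two query genes. It is a stand-in for, not a reproduction of, the weight-fitting methods in the papers cited on the slide. All edges and the function names `query_weights` and `combine` are hypothetical.

```python
def query_weights(networks, query):
    """Weight each network by how strongly it interconnects the query genes."""
    raw = {}
    for name, edges in networks.items():
        total = sum(edges.values())
        within = sum(w for (a, b), w in edges.items() if a in query and b in query)
        raw[name] = within / total if total else 0.0
    norm = sum(raw.values())
    if norm == 0:
        return {name: 1.0 / len(raw) for name in raw}  # fall back to equal weights
    return {name: value / norm for name, value in raw.items()}

def combine(networks, weights):
    """Composite network: weighted sum of edge weights across networks."""
    composite = {}
    for name, edges in networks.items():
        for pair, w in edges.items():
            composite[pair] = composite.get(pair, 0.0) + weights[name] * w
    return composite

# Made-up edges; gene pairs are stored as alphabetically sorted tuples.
networks = {
    "co_expression": {
        ("CDC23", "CDC27"): 1.0, ("APC11", "CDC27"): 1.0,
        ("MRE11", "RAD54"): 1.0, ("CDC27", "UNK1"): 1.0,
    },
    "genetic": {
        ("MRE11", "XRS2"): 1.0, ("RAD54", "XRS2"): 1.0,
        ("RAD54", "UNK2"): 1.0, ("CDC27", "MRE11"): 1.0,
    },
}

cell_cycle_query = {"CDC27", "APC11", "CDC23"}
dna_repair_query = {"XRS2", "RAD54", "MRE11"}

w_cycle = query_weights(networks, cell_cycle_query)
w_repair = query_weights(networks, dna_repair_query)
composite_repair = combine(networks, w_repair)

print(w_cycle)
print(w_repair)
```

In this toy data the cell cycle query puts all of its weight on the co-expression network, while the DNA repair query favours the genetic interaction network. This is the sense in which the weights are query-specific.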
GeneMANIA in the MouseFunc contest
Sara Mostafavi
“Test” benchmark: Predicting held-out genes
One of GeneMANIA’s two entries had the best area under the ROC curve in every category
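For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive gene receives a higher score than a randomly chosen negative gene. The sketch below computes it directly from that definition. The scores and labels are made up and have nothing to do with the contest results.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up discriminant values for four held-out genes and their true labels.
held_out_scores = [0.9, 0.8, 0.3, 0.1]
held_out_labels = [1, 0, 1, 0]

auc = roc_auc(held_out_scores, held_out_labels)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so the metric evaluates the ordering of genes without requiring a threshold.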
GeneMANIA performance on yeast
[Figure: prediction error (axis direction: more error) against running time (axis direction: slower) for five methods:
• GeneMANIA on 15 networks
• GeneMANIA label propagation on bioPIXIE*
• Probabilistic graph search* on bioPIXIE*
• GeneMANIA on 5 networks
• TSS** on 5 networks]
* Myers et al, 2005
** Tsuda et al, 2005
Mostafavi et al, 2008
GeneMANIA Prediction Server
http://www.genemania.org or http://qa.genemania.org
GeneMANIA network data sources
GeneMANIA Cytoscape Plugin
Other prediction servers
• STRING (http://string-db.org/)
• Funcoup (http://funcoup.sbc.su.se/)
• FunctionalNet (http://www.functionalnet.org)
• bioPIXIE (http://pixie.princeton.edu)
• MouseNet (http://mousenet.princeton.edu/)
Chemogenomics
• STITCH: Chemical-Protein Interactions
• http://stitch.embl.de/
What Have We Learned?
• Network-based gene function prediction
  – Guilt-by-association principle
    • Used to predict gene function using functional association networks
  – Many types of functional associations exist
    • Can be combined intelligently to optimize prediction accuracy
  – Convenient software available: GeneMANIA
  – Emerging area: chemical genomics gene function prediction
Please follow along with the lab displayed on the wiki