Canadian Bioinformatics Workshops www.bioinformatics.ca

Module 4: Analyzing gene list function and associations. Quaid Morris. Interpreting gene lists from -omics studies, July 15-16, 2010.

Page 1: Canadian Bioinformatics Workshops

Canadian Bioinformatics Workshops

www.bioinformatics.ca

Page 3: Canadian Bioinformatics Workshops

Module 4: Analyzing gene list function and associations

http://morrislab.med.utoronto.ca

Page 4: Canadian Bioinformatics Workshops

Module 4 bioinformatics.ca

Overview

• Extending gene lists using functional associations

• Sources of functional association

• GeneMANIA

Page 5: Canadian Bioinformatics Workshops

Extending Gene Lists

• Given a gene list, find other similar genes
– Gene list defines the query and the “function” of interest

• Query: complex or pathway components
– Result: additional members

• Query: kinases
– Result: other kinases and related genes

• Query: genes affected in RNAi screen
– Result: other genes that may affect phenotype

Page 6: Canadian Bioinformatics Workshops

Network-Based Gene Function Prediction

• Genes of similar sequence often have similar function
• Unknown gene similar to known gene likely to have similar function (annotation transfer)
• Guilt-by-association principle
• Many other similarity measures for genes (e.g. co-localization)

Fraser AG, Marcotte EM - A probabilistic view of gene function - Nat Genet. 2004 Jun;36(6):559-64
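
A minimal sketch of the guilt-by-association idea, assuming the network is given as a set of neighbours per gene: each unannotated gene is scored by the fraction of its neighbours that already carry the function of interest. The function name `guilt_by_association` and the toy edges are illustrative assumptions, not data from the workshop.

```python
from typing import Dict, Set


def guilt_by_association(
    neighbors: Dict[str, Set[str]],
    annotated: Set[str],
) -> Dict[str, float]:
    """Score each unannotated gene by the fraction of its direct
    neighbours that carry the annotation of interest."""
    scores: Dict[str, float] = {}
    for gene, nbrs in neighbors.items():
        if gene in annotated:
            continue  # only score genes whose function is unknown
        scores[gene] = len(nbrs & annotated) / len(nbrs) if nbrs else 0.0
    return scores


# Hypothetical toy network: the edges are made up for illustration.
network = {
    "CDC3": {"CDC16", "CLB4", "UNK1"},
    "CDC16": {"CDC3", "CLB4", "UNK1"},
    "CLB4": {"CDC3", "CDC16"},
    "RPN3": {"RPT1", "RPT6", "UNK2"},
    "RPT1": {"RPN3", "RPT6"},
    "RPT6": {"RPN3", "RPT1", "UNK2"},
    "UNK1": {"CDC3", "CDC16"},
    "UNK2": {"RPN3", "RPT6"},
}
cell_cycle = {"CDC3", "CDC16", "CLB4"}  # assumed annotation for the query
gba_scores = guilt_by_association(network, cell_cycle)
```

In this toy case UNK1 has only cell-cycle neighbours and scores highest, while UNK2 has none and scores zero.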

Page 7: Canadian Bioinformatics Workshops

Functional association networks to predict gene function

[Figure: microarray expression data (Eisen et al, PNAS 1998) converted into a co-expression network. Nodes: CDC3, CDC16, CLB4, RPN3, RPT1, RPT6, UNK1, UNK2. Cluster labels: Protein degradation, Cell cycle.]

Fraser AG, Marcotte EM - A probabilistic view of gene function - Nat Genet. 2004 Jun;36(6):559-64
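
One common way to turn expression profiles into a co-expression network is to link gene pairs whose profiles are strongly correlated. The sketch below assumes Pearson correlation and a cut-off of 0.8; both choices, the function names and the expression values are illustrative assumptions rather than the method behind the figure.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sx * sy)


def coexpression_edges(
    profiles: Dict[str, List[float]], min_r: float = 0.8
) -> Dict[Tuple[str, str], float]:
    """Keep one weighted edge per gene pair with correlation >= min_r."""
    edges: Dict[Tuple[str, str], float] = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= min_r:
            edges[(g1, g2)] = r
    return edges


# Made-up expression values across four conditions.
profiles = {
    "CDC3": [1.0, 2.0, 3.0, 4.0],
    "UNK1": [1.1, 2.1, 2.9, 4.2],
    "RPN3": [4.0, 3.0, 2.0, 1.0],
    "UNK2": [3.9, 3.2, 1.8, 1.1],
}
edges = coexpression_edges(profiles)
```

With these values only the CDC3/UNK1 and RPN3/UNK2 pairs pass the cut-off, so each unknown gene ends up linked to one known gene.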

Page 8: Canadian Bioinformatics Workshops

Predicting Gene Function Using a Network

Is gene X involved in cell cycle regulation?

[Figure: a network (e.g. co-expression) of CDC3, CDC16, CLB4, RPN3, RPT1, RPT6, UNK1, UNK2 and UNK3. Labelled examples are marked + or -, unknown genes are marked ?. The network and labels go into a classification algorithm, which outputs a discriminant value for each unknown gene.]

Discriminant value
UNK1 0.9
UNK2 0.1
UNK3 0.05

Discriminant value: a value you can use to rank the genes according to certainty, or threshold to classify genes
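
The two uses of a discriminant value, ranking and thresholding, can be sketched with the three values from this slide. The cut-off of 0.5 and the function names are assumptions chosen for illustration; the slide does not give a threshold.

```python
from typing import Dict, List

# Discriminant values as shown on the slide.
discriminant = {"UNK1": 0.9, "UNK2": 0.1, "UNK3": 0.05}


def rank_genes(scores: Dict[str, float]) -> List[str]:
    """Order genes from most to least confident prediction."""
    return sorted(scores, key=scores.get, reverse=True)


def classify(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, bool]:
    """Call a gene positive when its score reaches the threshold (assumed 0.5)."""
    return {gene: score >= threshold for gene, score in scores.items()}


ranking = rank_genes(discriminant)
calls = classify(discriminant, threshold=0.5)
```

Ranking needs no cut-off at all, which is why ranked lists are often easier to evaluate than hard calls.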

Page 9: Canadian Bioinformatics Workshops

Predicting Gene Function Using a Network

Is gene X involved in cell cycle regulation?

[Figure: same network (e.g. co-expression), labelled examples (+, -, ?) and discriminant values (UNK1 0.9, UNK2 0.1, UNK3 0.05) as on the previous slide, with example classification algorithms named: kNN, SVM, LabelProp.]

Discriminant value: a value you can a) use to rank the genes according to certainty and b) threshold to classify genes

Page 10: Canadian Bioinformatics Workshops

Label propagation algorithm

[Figure: the same four-gene network (MCA1, CDC48, CPR3, TDH2) drawn three times. Panel labels: Guilt-by-association; Label propagation vs guilt-by-association. Discriminant value scale: -1 to +1.]
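
A minimal sketch of one generic form of label propagation, assuming a symmetrically normalised network and the iterative update f = alpha * S * f + (1 - alpha) * y. This is a common textbook formulation and not necessarily the exact algorithm used by GeneMANIA; the chain of edges, the parameter values and the function name are illustrative assumptions.

```python
from math import sqrt
from typing import Dict, Tuple


def label_propagation(
    edges: Dict[Tuple[str, str], float],
    labels: Dict[str, float],
    alpha: float = 0.8,
    n_iter: int = 200,
) -> Dict[str, float]:
    """Spread +1/-1 labels over a weighted, undirected network."""
    genes = sorted({g for pair in edges for g in pair} | set(labels))
    degree = {g: 0.0 for g in genes}
    for (a, b), w in edges.items():
        degree[a] += w
        degree[b] += w
    # Symmetric normalisation: s_ab = w_ab / sqrt(d_a * d_b)
    norm: Dict[str, Dict[str, float]] = {}
    for (a, b), w in edges.items():
        s = w / sqrt(degree[a] * degree[b])
        norm.setdefault(a, {})[b] = s
        norm.setdefault(b, {})[a] = s
    y = {g: float(labels.get(g, 0.0)) for g in genes}  # unlabelled genes start at 0
    f = dict(y)
    for _ in range(n_iter):
        f = {
            g: alpha * sum(s * f[n] for n, s in norm.get(g, {}).items())
            + (1 - alpha) * y[g]
            for g in genes
        }
    return f


# Hypothetical chain MCA1 - CDC48 - CPR3 - TDH2 with one positive and one negative label.
toy_edges = {("MCA1", "CDC48"): 1.0, ("CDC48", "CPR3"): 1.0, ("CPR3", "TDH2"): 1.0}
toy_labels = {"MCA1": 1.0, "TDH2": -1.0}
propagated = label_propagation(toy_edges, toy_labels)
```

Unlike direct guilt-by-association, every gene here receives a signed score even when it is several steps away from a labelled gene, and the score shrinks with distance from the label.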

Page 11: Canadian Bioinformatics Workshops

Types of functional associations

• Molecular Interactions (i.e. physical interactions)
• Regulatory Interactions (e.g. ChIP-chip binding)
• Genetic Interactions (e.g. synthetic lethality)
• Similarity relationships
– Co-expression
– Protein sequence (e.g. BLAST –log(E-value))
– Domain architecture
– Phylogenetic profiles
– Gene neighborhood**
– Gene fusion**
– …

** most useful for bacterial genes
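
As an example of turning one of these similarity relationships into an edge weight, the sketch below converts a BLAST E-value into –log(E-value). The slide does not state the logarithm base; base 10, the cap for E-values reported as 0, and the clamping of weak hits to zero are all assumptions made for illustration.

```python
from math import log10


def evalue_to_weight(evalue: float, cap: float = 200.0) -> float:
    """Map a BLAST E-value to a non-negative similarity weight.

    Assumptions: base-10 logarithm, E-values of exactly 0 get the cap,
    and E-values above 1 (negative -log10) are clamped to 0.
    """
    if evalue < 0:
        raise ValueError("E-values cannot be negative")
    if evalue == 0.0:
        return cap
    return min(max(-log10(evalue), 0.0), cap)


strong_hit = evalue_to_weight(1e-50)  # very significant alignment
weak_hit = evalue_to_weight(10.0)     # not significant, clamped to 0
```

Smaller E-values mean more significant alignments, so the negative logarithm makes stronger sequence similarity correspond to a larger edge weight.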

Page 12: Canadian Bioinformatics Workshops

Problem: genes are multi-function

• Gene function could be a/the:
– Biological process,
– Biochemical/molecular function,
– Subcellular/Cellular localization,
– Regulatory targets,
– Temporal expression pattern,
– Phenotypic effect of deletion.

Some networks may be better for some types of gene function than others

Page 13: Canadian Bioinformatics Workshops

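Query-specific weights for multifaceted functional queries

[Figure: a composite network is built as a weighted sum of individual networks: w1 × network + w2 × network + w3 × network = composite. Networks shown: Co-expression, Genetic (Tong et al. 2001), Co-complexed (Jeong et al 2002). Nodes: CDC27, APC11, CDC23, XRS2, RAD54, MRE11, UNK1, UNK2. Function labels: Cell cycle, DNA repair.]

Pavlidis et al, 2002; Lanckriet et al, 2004; Mostafavi et al, 2008

(Slide footer: June 24, 2009, The GeneMANIA project)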
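
The weighted combination on this slide can be sketched as a per-edge weighted sum over networks. The weights below are fixed, made-up numbers; in a query-specific scheme they would be chosen for each query, which this sketch does not attempt. The edges and the function name `combine_networks` are also illustrative assumptions.

```python
from typing import Dict, Tuple

Edges = Dict[Tuple[str, str], float]


def combine_networks(networks: Dict[str, Edges], weights: Dict[str, float]) -> Edges:
    """Composite edge weight = sum over networks of (network weight * edge weight)."""
    combined: Edges = {}
    for name, edges in networks.items():
        w = weights[name]
        for pair, strength in edges.items():
            key = tuple(sorted(pair))  # treat edges as undirected
            combined[key] = combined.get(key, 0.0) + w * strength
    return combined


# Hypothetical edges in three networks.
networks = {
    "co-expression": {("CDC27", "CDC23"): 0.9, ("UNK1", "CDC27"): 0.6},
    "genetic": {("MRE11", "RAD54"): 1.0, ("UNK2", "MRE11"): 1.0},
    "co-complexed": {("APC11", "CDC27"): 1.0, ("CDC23", "CDC27"): 1.0},
}
# Assumed weights; a query about one function might favour different networks.
weights = {"co-expression": 0.5, "genetic": 0.1, "co-complexed": 0.4}
composite = combine_networks(networks, weights)
```

An edge supported by several networks, such as CDC23/CDC27 here, accumulates weight from each of them, while an edge seen only in a down-weighted network contributes little to the composite.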

Page 14: Canadian Bioinformatics Workshops

GeneMANIA in the MouseFunc contest

Sara Mostafavi

“Test” benchmark: Predicting held-out genes

One of GeneMANIA’s two entries had the best area under the ROC curve in every category
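
For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen true positive is scored above a randomly chosen negative. The sketch below computes it by direct pair counting on hypothetical held-out genes; the gene names and scores are invented and are not MouseFunc data.

```python
from typing import Dict, Set


def roc_auc(scores: Dict[str, float], positives: Set[str]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four held-out genes.
held_out_scores = {"geneA": 0.9, "geneB": 0.4, "geneC": 0.35, "geneD": 0.1}
true_positives = {"geneA", "geneC"}
auc = roc_auc(held_out_scores, true_positives)
```

An AUC of 1.0 means every true positive outranks every negative, and 0.5 is what random scores would give on average.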

Page 15: Canadian Bioinformatics Workshops

GeneMANIA performance on yeast

[Figure: methods compared by error and running time (axis directions: More error, Slower). Methods shown:
• GeneMANIA on 15 networks
• GeneMANIA label propagation on bioPIXIE*
• Probabilistic graph search* on bioPIXIE*
• GeneMANIA on 5 networks
• TSS** on 5 networks]

* Myers et al, 2005
** Tsuda et al, 2005

Mostafavi et al, 2008

Page 16: Canadian Bioinformatics Workshops

GeneMANIA Prediction Server

http://www.genemania.org or http://qa.genemania.org

Page 17: Canadian Bioinformatics Workshops

GeneMANIA network data sources

Page 18: Canadian Bioinformatics Workshops

GeneMANIA Cytoscape Plugin

Page 19: Canadian Bioinformatics Workshops

Other prediction servers

• STRING (http://string-db.org/)

• Funcoup (http://funcoup.sbc.su.se/)

• FunctionalNet (http://www.functionalnet.org)

• bioPIXIE (http://pixie.princeton.edu)

• MouseNet (http://mousenet.princeton.edu/)

Page 20: Canadian Bioinformatics Workshops

Chemogenomics

• STITCH: Chemical-Protein Interactions (http://stitch.embl.de/)

Page 21: Canadian Bioinformatics Workshops

What Have We Learned?

• Network-based gene function prediction
– Guilt-by-association principle
  • Used to predict gene function using functional association networks
– Many types of functional associations exist
  • Can be combined intelligently to optimize prediction accuracy
– Convenient software available: GeneMANIA
– Emerging area: chemical genomics gene function prediction

Page 22: Canadian Bioinformatics Workshops

Please follow along with the lab displayed on the wiki