Using networks to derive function


Systems Biology Workshop, Technical University of Denmark, Lyngby, Denmark, May 14-15, 2009


Using networks to derive function

Lars Juhl Jensen

STRING

Jensen, Kuhn et al., Nucleic Acids Research, 2009

functional associations

Frishman et al., Modern Genome Annotation, 2009

common basis

630 genomes

model organism databases

Ensembl

RefSeq

genomic context methods

gene fusion

Korbel et al., Nature Biotechnology, 2004

conserved neighborhood

operons

Korbel et al., Nature Biotechnology, 2004

bidirectional promoters

Korbel et al., Nature Biotechnology, 2004

phylogenetic profiles

Korbel et al., Nature Biotechnology, 2004
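A phylogenetic profile can be pictured as a presence/absence vector of a gene across genomes; genes with similar profiles are candidates for a functional association. The sketch below is only an illustration under assumptions: the Jaccard index is one common similarity measure, not necessarily the one used in STRING, and the gene names and profiles are made up.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles of equal length."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0

# Hypothetical toy profiles: 1 = gene present in that genome, 0 = absent.
profiles = {
    "geneA": [1, 1, 0, 1, 0, 1],
    "geneB": [1, 1, 0, 1, 0, 0],
    "geneC": [0, 0, 1, 0, 1, 0],
}

similar = jaccard(profiles["geneA"], profiles["geneB"])
dissimilar = jaccard(profiles["geneA"], profiles["geneC"])
print(similar, dissimilar)
```

geneA and geneB co-occur in most genomes and score high, while geneA and geneC never co-occur and score zero.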

primary experimental data

protein interactions

yeast two-hybrid

affinity purification

fragment complementation

Jensen & Bork, Science, 2008

genetic interactions

Beyer et al., Nature Reviews Genetics, 2007

BIND: Biomolecular Interaction Network Database

BioGRID: General Repository for Interaction Datasets

DIP: Database of Interacting Proteins

IntAct

MINT: Molecular Interactions Database

HPRD: Human Protein Reference Database

PDB: Protein Data Bank

inferred associations

gene coexpression

GEO: Gene Expression Omnibus

expression compendia

curated knowledge

complexes

MIPS: Munich Information Center for Protein Sequences

Gene Ontology

pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

KEGG: Kyoto Encyclopedia of Genes and Genomes

MetaCyc

Reactome

PID: NCI-Nature Pathway Interaction Database

literature mining

MEDLINE

SGD: Saccharomyces Genome Database

The Interactive Fly

OMIM: Online Mendelian Inheritance in Man

co-mentioning

statistical methods
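As a rough illustration of statistical co-mentioning, the sketch below counts how often two gene names appear in the same abstract and compares that with what would be expected if the names occurred independently. The observed/expected ratio is an assumed, simplified scoring scheme chosen for illustration, and the documents are invented; the slides do not specify the actual scoring used.

```python
from collections import Counter
from itertools import combinations

def comention_scores(documents):
    """Observed/expected co-occurrence ratio for each pair of names seen together."""
    n_docs = len(documents)
    single = Counter()  # number of documents mentioning each name
    pair = Counter()    # number of documents mentioning both names of a pair
    for names in documents:
        unique = sorted(set(names))
        single.update(unique)
        pair.update(combinations(unique, 2))
    scores = {}
    for (a, b), observed in pair.items():
        # Expected number of shared documents if a and b were independent.
        expected = single[a] * single[b] / n_docs
        scores[(a, b)] = observed / expected
    return scores

# Hypothetical abstracts, already reduced to the gene names they mention.
docs = [
    {"CYC1", "HAP1"},
    {"CYC1", "HAP1", "CYC7"},
    {"GAL4"},
    {"GAL4", "CYC7"},
]

scores = comention_scores(docs)
print(scores[("CYC1", "HAP1")])
```

Pairs are keyed in alphabetical order, and a ratio above 1 means the two names are mentioned together more often than chance would suggest.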

NLP: Natural Language Processing

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxgene The GAL4 gene]

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

easy in theory …

… but not in practice

many data types

not comparable

variable quality

many sources

different file formats

different gene identifiers

partially redundant

spread over 630 genomes

quality scores

reproducibility

von Mering et al., Nucleic Acids Research, 2005

benchmarking

von Mering et al., Nucleic Acids Research, 2005
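Benchmarking can be thought of as converting raw, method-specific scores into comparable quality scores by checking them against a set of trusted associations. The sketch below bins raw scores and reports the fraction of pairs in each bin that appear in a gold standard. The binning approach, the function name `calibrate`, and all data are assumptions made for illustration, not the procedure from the cited paper.

```python
def calibrate(scored_pairs, gold_standard, bin_edges):
    """Fraction of gold-standard pairs within each half-open raw-score bin."""
    n_bins = len(bin_edges) - 1
    totals = [0] * n_bins
    hits = [0] * n_bins
    for pair, raw in scored_pairs:
        for i in range(n_bins):
            # Bins are [low, high); a score equal to the last edge is ignored.
            if bin_edges[i] <= raw < bin_edges[i + 1]:
                totals[i] += 1
                hits[i] += frozenset(pair) in gold_standard
                break
    # None marks bins with no scored pairs.
    return [h / t if t else None for h, t in zip(hits, totals)]

# Hypothetical raw scores from one evidence type.
scored_pairs = [
    (("a", "b"), 0.9),
    (("a", "c"), 0.8),
    (("b", "c"), 0.2),
    (("c", "d"), 0.1),
]
# Hypothetical gold standard of trusted associations (unordered pairs).
gold_standard = {frozenset(("a", "b")), frozenset(("a", "c")), frozenset(("c", "d"))}

accuracy = calibrate(scored_pairs, gold_standard, [0.0, 0.5, 1.0])
print(accuracy)
```

The per-bin fractions can then serve as a lookup from raw score to an estimated probability that an association is real, which puts different evidence types on one scale.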

orthology

von Mering et al., Nucleic Acids Research, 2005

two modes

COG mode

von Mering et al., Nucleic Acids Research, 2005

protein mode

von Mering et al., Nucleic Acids Research, 2005

combine all evidence

Frishman et al., Modern Genome Annotation, 2009
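Once every evidence type carries a probabilistic quality score, the scores can be merged into one value per protein pair. The sketch below uses the simple rule 1 - prod(1 - s_i), which treats the evidence channels as independent; it omits any prior correction, and the function name `combine_scores` and the example values are illustrative assumptions.

```python
def combine_scores(scores):
    """Combine independent probabilistic scores as 1 - prod(1 - s_i)."""
    product = 1.0
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
        product *= 1.0 - s
    return 1.0 - product

# Hypothetical per-channel scores for one protein pair,
# e.g. neighborhood, coexpression, and text mining.
combined = combine_scores([0.5, 0.5, 0.2])
print(combined)
```

Several weak lines of evidence reinforce one another under this rule, and the combined score is never lower than the strongest individual score.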

Acknowledgments

Christian von Mering

Michael Kuhn

Manuel Stark

Samuel Chaffron

Philippe Julien

Tobias Doerks

Jan Korbel

Berend Snel

Martijn Huynen

Peer Bork
