A Common Language for Annotation of Genes from Yeast, Flies and Mice The Gene Ontologies …and Plants and Worms …and Humans …and anything else!

Gene Ontology Objectives

• GO represents concepts used to classify specific parts of our biological knowledge:
  – Biological Process
  – Molecular Function
  – Cellular Component

• GO develops a common language applicable to any organism

• GO terms can be used to annotate gene products from any species, allowing comparison of information across species
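
The cross-species comparison described in these objectives can be sketched in a few lines. This is a toy illustration only: the annotation tuples are placeholders rather than curated data, `geneA` and `geneB` are hypothetical gene names, and `group_by_term` and `shared_terms` are names invented for this sketch.

```python
from collections import defaultdict

# Toy annotations: (species, gene product, GO term name).
# Illustrative placeholders, not curated annotations.
ANNOTATIONS = [
    ("yeast", "MCM2", "DNA replication"),
    ("fly", "Mcm2", "DNA replication"),
    ("mouse", "Mcm2", "DNA replication"),
    ("yeast", "geneA", "mitosis"),            # hypothetical gene
    ("mouse", "geneB", "purine metabolism"),  # hypothetical gene
]


def group_by_term(annotations):
    """Map each GO term to the gene products annotated to it, per species."""
    grouped = defaultdict(dict)
    for species, gene, term in annotations:
        grouped[term].setdefault(species, []).append(gene)
    return grouped


def shared_terms(annotations, min_species=2):
    """Return the terms annotated in at least `min_species` species."""
    grouped = group_by_term(annotations)
    return {
        term: dict(by_species)
        for term, by_species in grouped.items()
        if len(by_species) >= min_species
    }


print(shared_terms(ANNOTATIONS))
```

Because every species is annotated with the same vocabulary, finding comparable gene products reduces to grouping by term.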

Expansion of Sequence Info

Entering the Genome Sequencing Era

Eukaryotic Genome Sequences

Organism                        Year   Genome Size (Mb)   # Genes
Yeast (S. cerevisiae)           1996   12                 6,000
Worm (C. elegans)               1998   97                 19,100
Fly (D. melanogaster)           2000   120                13,600
Plant (A. thaliana)             2001   125                25,500
Human (H. sapiens, 1st Draft)   2001   ~3000              ~35,000

Baldauf et al. (2000) Science 290:972

Comparison of sequences from 4 organisms

[Figure labels: MCM3, MCM2, CDC46/MCM5, CDC47/MCM7, CDC54/MCM4, MCM6]

These proteins form a hexamer in the species that have been examined

http://www.geneontology.org/

Outline of Topics

• Introduction to the Gene Ontologies (GO)

• Annotations to GO terms

• GO Tools

• Applications of GO

What is an Ontology? (from OED)

1721 BAILEY, Ontology, an Account of being in the Abstract.

1733 (title) A Brief Scheme of Ontology or the Science of Being in General.

a1832 BENTHAM Fragm. Ontol. Wks. 1843 VIII. 195 The field of ontology, or as it may otherwise be termed, the field of supremely abstract entities, is a yet untrodden labyrinth.

1884 BOSANQUET tr. Lotze's Metaph. 22 Ontology..as a doctrine of the being and relations of all reality, had precedence given to it over Cosmology and Psychology, the two branches of enquiry which follow the reality into its opposite distinctive forms.

Srinija Srinivasan, Chief Ontologist, Yahoo!

The ontology. Dividing human knowledge into a clean set of categories is a lot like trying to figure out where to find that suspenseful black comedy at your corner video store. Questions inevitably come up, like are Movies part of Art or Entertainment? (Yahoo! lists them under the latter.) -Wired Magazine, May 1996

The 3 Gene Ontologies

• Molecular Function = elemental activity/task
  – the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity

• Biological Process = biological goal or objective
  – broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions

• Cellular Component = location or complex
  – subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme

Example: Gene Product = hammer

Function (what)            Process (why)
Drive nail (into wood)     Carpentry
Drive stake (into soil)    Gardening
Smash roach                Pest Control
Clown’s juggling object    Entertainment

Biological Examples

[Figure panels: Molecular Function, Biological Process, Cellular Component]

Terms, Definitions, IDs

term: MAPKKK cascade (mating sensu Saccharomyces)

goid: GO:0007244

definition: OBSOLETE. MAPKKK cascade involved in transduction of mating pheromone signal, as described in Saccharomyces.

definition_reference: PMID:9561267

comment: This term was made obsolete because it is a gene product specific term. To update annotations, use the biological process term 'signal transduction during conjugation with cellular fusion ; GO:0000750'.
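
A term record like the one on this slide maps naturally onto a small data structure. The following is a minimal sketch, not the GO Consortium's own tooling: `GOTerm` and `parse_term` are hypothetical names, the parser assumes the simple `field: value` layout shown on the slide, and it assumes an obsolete term can be recognised by a definition that starts with "OBSOLETE.".

```python
from dataclasses import dataclass


@dataclass
class GOTerm:
    """One GO term record, using the field names shown on the slide."""
    goid: str
    term: str
    definition: str = ""
    definition_reference: str = ""
    comment: str = ""

    @property
    def is_obsolete(self) -> bool:
        # Assumption: obsolete terms carry an "OBSOLETE." prefix in the
        # definition, as in the example record.
        return self.definition.startswith("OBSOLETE.")


def parse_term(text: str) -> GOTerm:
    """Parse 'field: value' lines into a GOTerm (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so IDs such as GO:0007244 survive.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return GOTerm(**fields)


record = """
term: MAPKKK cascade (mating sensu Saccharomyces)
goid: GO:0007244
definition: OBSOLETE. MAPKKK cascade involved in transduction of mating pheromone signal, as described in Saccharomyces.
definition_reference: PMID:9561267
"""

mapkkk = parse_term(record)
print(mapkkk.goid, mapkkk.is_obsolete)
```

Keeping the stable ID separate from the term name is the point of the design: the name and definition can change, or the term can be retired, while existing annotations still resolve through the ID.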

Directed Cyclic Graph

Figure 4.1. Life cycles of heterothallic and homothallic strains of S. cerevisiae. Heterothallic strains can be stably maintained as diploids and haploids, whereas homothallic strains are stable only as diploids, because the transient haploid cells switch their mating type, and mate.

An Introduction to the Genetics and Molecular Biology of the Yeast Saccharomyces cerevisiae, Fred Sherman, 2000; Modified from: F. Sherman, Yeast genetics. The Encyclopedia of Molecular Biology and Molecular Medicine, pp. 302-325, Vol. 6. Edited by R. A. Meyers, VCH Pub., Weinheim, Germany, 1997.

Parent-Child Relationships

A child is a subset of a parent’s elements

The cell component term Nucleus has 5 children

Nucleus
  Nucleoplasm
  Nuclear envelope
  Chromosome
  Perinuclear space
  Nucleolus

“Tree” Relationships

Derivation of Romance languages from Latin. From R.A. Hall Jr., Introductory Linguistics; originally published by Chilton Books, now distributed by Rand McNally & Co.

Ontology Relationships: Directed Acyclic Graph

http://www.ebi.ac.uk/ego
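
What separates a directed acyclic graph from a tree is that a term may have more than one parent. The sketch below stores child-to-parent edges and walks upward to collect all ancestors. The five children of Nucleus come from the parent-child slide; the remaining edges (the second parent of "Nuclear envelope", and "Cell" as a root) are illustrative assumptions, and `ancestors` and `children` are names made up for this example.

```python
# Child -> parents map. The five children of Nucleus come from the
# parent-child slide; all other edges are illustrative assumptions.
PARENTS = {
    "Nucleoplasm": ["Nucleus"],
    "Nuclear envelope": ["Nucleus", "Endomembrane system"],  # two parents
    "Chromosome": ["Nucleus"],
    "Perinuclear space": ["Nucleus"],
    "Nucleolus": ["Nucleus"],
    "Nucleus": ["Cell"],
    "Endomembrane system": ["Cell"],
    "Cell": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent edges upward."""
    found = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found


def children(term, parents=PARENTS):
    """Return the direct children of a term, sorted by name."""
    return sorted(child for child, ps in parents.items() if term in ps)


print(sorted(ancestors("Nuclear envelope")))
print(children("Nucleus"))
```

Because "Nuclear envelope" has two parents here, a gene product annotated to it is implicitly associated with both branches above it, which a strict tree could not express.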

Evidence Codes for GO Annotations

http://www.geneontology.org/doc/GO.evidence.html

IEA Inferred from Electronic Annotation

ISS Inferred from Sequence or Structural Similarity

IEP Inferred from Expression Pattern

IMP Inferred from Mutant Phenotype

IGI Inferred from Genetic Interaction

IPI Inferred from Physical Interaction

IDA Inferred from Direct Assay

RCA Inferred from Reviewed Computational Analysis

TAS Traceable Author Statement

NAS Non-traceable Author Statement

IC Inferred by Curator

ND No biological Data available

IEA: Inferred from Electronic Annotation

• Sequence Similarity (BLAST)

• Automatic transfer from mappings (InterPro2GO, EC2GO etc.)

-> Not manually reviewed

ISS: Inferred from Sequence or Structural Similarity

• Sequence similarity

• Recognized domains

• Structural similarity

-> Use of ‘with’ column recommended

IEP: Inferred from Expression Pattern

• Transcript levels (Northerns, microarrays)

• Protein levels (Western blots)

-> Timing or localization of expression

-> Biological process annotations

IMP: Inferred from Mutant Phenotype

• Gene mutation/knockout

• Overexpression/ectopic expression

• Anti-sense experiments

• RNAi experiments

• Specific protein inhibitors

IGI: Inferred from Genetic Interaction

• Suppressors, synthetic lethals…

• Functional complementation

• Rescue experiments

-> Use of ‘with’ column recommended

IPI: Inferred from Physical Interaction

• 2-hybrid interactions

• Co-purification

• Co-immunoprecipitation

• Ion/complex/protein binding experiments

-> Use of ‘with’ column recommended

IDA: Inferred from Direct Assay

• Enzyme assays

• In vitro reconstitution (e.g. transcription)

• Immunofluorescence (for cell. comp.)

• Cell fractionation (for cell. comp.)

• Physical interaction/binding assay

RCA: Inferred from Reviewed Computational Analysis

• Non-sequence-based computational methods

• Genome-wide analyses (e.g. 2-hybrid)

• Combinations of large-scale experiments

TAS: Traceable Author Statement

• Support from review article

• Textbook ‘common knowledge’

-> Data that can be ‘traced’ back

NAS: Non-traceable Author Statement

• Database entries that don't cite a paper

-> Data that cannot be ‘traced’ back

IC: Inferred by Curator

• Not supported by any direct evidence

• Inferred from other GO annotations

-> GO term in ‘with/from’ column required
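
The ‘with’ column guidance on the ISS, IGI, IPI and IC slides can be expressed as a simple check. This is a sketch under stated assumptions rather than an official validator: `check_with_column` is a hypothetical helper, the three return labels are invented for this example, and a GO term is recognised only by its "GO:" prefix.

```python
# Evidence codes whose slides mention the 'with' column.
WITH_RECOMMENDED = {"ISS", "IGI", "IPI"}
WITH_REQUIRED = {"IC"}


def check_with_column(evidence_code, with_value):
    """Return 'ok', 'warning', or 'error' for an annotation's 'with' field."""
    has_with = bool(with_value and with_value.strip())
    if evidence_code in WITH_REQUIRED:
        # Assumption: a GO term is recognised by its "GO:" prefix.
        if has_with and with_value.strip().startswith("GO:"):
            return "ok"
        return "error"
    if evidence_code in WITH_RECOMMENDED and not has_with:
        return "warning"
    return "ok"


print(check_with_column("IC", "GO:0000750"))
print(check_with_column("IPI", None))
```

A curator-inferred annotation without a supporting GO term is treated as an error, whereas a missing value for ISS, IGI or IPI only triggers a warning, mirroring "required" versus "recommended" on the slides.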

ND: No biological Data available

• molecular function unknown GO:0005554

• biological process unknown GO:0000004

• cellular component unknown GO:0008372

Curator found no information supporting any annotation

Term Hierarchy

Evidence codes in loose order of reliability, most reliable first:

TAS/IDA

IMP/IGI/IPI

ISS/IEP

NAS

IEA
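
One possible use of this ordering is to pick the strongest evidence code when a gene product has several annotations to the same term. The sketch below assumes the tiers are listed from strongest to weakest; `EVIDENCE_RANK` and `best_evidence` are hypothetical names, and codes that do not appear on this slide (RCA, IC, ND) are left unranked.

```python
# Lower rank = higher tier on the slide. Codes absent from the slide
# (RCA, IC, ND) are deliberately left unranked.
EVIDENCE_RANK = {
    "TAS": 1, "IDA": 1,
    "IMP": 2, "IGI": 2, "IPI": 2,
    "ISS": 3, "IEP": 3,
    "NAS": 4,
    "IEA": 5,
}


def best_evidence(codes):
    """Return the highest-tier code among `codes`, or None if none are ranked."""
    ranked = [code for code in codes if code in EVIDENCE_RANK]
    if not ranked:
        return None
    # min() keeps the first code encountered when two share a tier.
    return min(ranked, key=EVIDENCE_RANK.__getitem__)


print(best_evidence(["IEA", "ISS", "IMP"]))  # IMP
```

Codes in the same tier are not ordered against each other, so ties are resolved by input order.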

Qualifiers

NOT: explicit note that a gene product is not associated with a GO term

colocalizes_with: only transient localization, or low resolution of an assay

contributes_to: gene product that is part of a complex can be annotated to the process/function of the complex

http://www.geneontology.org/GO.annotation.shtml#qual

The qualifier modifies the interpretation of a GO term
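
Because a qualifier changes what an annotation means, downstream code has to look at it before counting an annotation as positive evidence. A minimal sketch, assuming annotations are held as simple records: the `Annotation` class and `positive_annotations` function are hypothetical, the gene names are made up, and GO:0000750 is reused from the earlier term slide purely as a placeholder ID.

```python
from dataclasses import dataclass
from typing import Optional

# The three qualifiers listed on the slide.
QUALIFIERS = {"NOT", "colocalizes_with", "contributes_to"}


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    go_id: str
    qualifier: Optional[str] = None

    def __post_init__(self):
        # Reject anything other than the listed qualifiers (or none).
        if self.qualifier is not None and self.qualifier not in QUALIFIERS:
            raise ValueError(f"unknown qualifier: {self.qualifier!r}")


def positive_annotations(annotations):
    """Drop NOT annotations, which state that the term does not apply."""
    return [a for a in annotations if a.qualifier != "NOT"]


# Hypothetical gene products; the GO ID is only a placeholder.
annotations = [
    Annotation("geneA", "GO:0000750"),
    Annotation("geneB", "GO:0000750", qualifier="NOT"),
    Annotation("geneC", "GO:0000750", qualifier="contributes_to"),
]

print([a.gene_product for a in positive_annotations(annotations)])
```

Ignoring the qualifier would count `geneB` as associated with the term, which is the opposite of what a NOT annotation asserts.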

http://www.geneontology.org/doc/GO.evidence.html