GO : the Gene Ontology
“because you know sometimes words have two meanings”
Amelia Ireland
GO Curator
EBI, Cambridge, UK
What’s in a name?
• What is a cell?
Cell
[Cell images from http://microscopy.fsu.edu]
Cell
• A cell can be a part or a whole organism
Images from http://microscopy.fsu.edu
What’s in a name?
• Glucose synthesis
• Glucose biosynthesis
• Glucose formation
• Glucose anabolism
• Gluconeogenesis
• All refer to the process of making glucose from simpler components
What’s in a name?
The problem:
• Same name for different concepts
• Different names for the same concept
• Vast amounts of biological data from different sources
Cross-species or cross-database comparison is difficult
What is the Gene Ontology?
• A (part of the) solution: The Gene Ontology: “a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”
• A controlled vocabulary to describe gene products - proteins and RNA - in any organism.
What is GO?
• One of the Open Biological Ontologies
• Standard, species-neutral way of representing biology
• Three structured networks of defined terms to describe gene product attributes
• More like a phrase book than a biology text book
How does GO work?
What information might we want to capture about a gene product?
• What does the gene product do?
• Where and when does it act?
• Why does it perform these activities?
No GO Areas
• GO covers ‘normal’ functions and processes
  • No pathological processes
  • No experimental conditions
• NO evolutionary relationships
• NO gene products
• NOT a system of nomenclature
Cellular Component
• where a gene product acts
Cellular Component
• Enzyme complexes in the component ontology refer to places, not activities.
Molecular Function
• activities or “jobs” of a gene product
glucose-6-phosphate isomerase activity
Molecular Function
insulin binding
insulin receptor activity
Molecular Function
drug transporter activity
Molecular Function
• A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product.
• Sets of functions make up a biological process.
Biological Process
a commonly recognized series of events
cell division
Biological Process
transcription
Biological Process
regulation of gluconeogenesis
Biological Process
limb development
Biological Process
courtship behavior
Anatomy of a GO term
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
Labels: id = unique GO ID; name = term name; namespace = ontology; def = definition; exact_synonym = synonym; xref_analog = database ref; is_a = parentage
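The tag-value layout of the term above lends itself to simple parsing. The following is a minimal sketch, assuming one "tag: value" pair per line; `parse_stanza` is a hypothetical helper written for illustration, not part of any GO software.

```python
# Sketch only: parse_stanza is a hypothetical helper, not an official GO tool.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""


def parse_stanza(text):
    """Return a dict mapping each tag to the list of its values."""
    term = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ": " only, so values such as "GO:0006094" stay intact.
        tag, _, value = line.partition(": ")
        term.setdefault(tag, []).append(value)
    return term


term = parse_stanza(STANZA)
print(term["id"][0], term["name"][0])
print("parents:", term["is_a"])
```

Values are kept in lists because a tag such as is_a can occur more than once in the same term.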
Anatomy of a GO term
• Species-specific terms use the phrase “sensu xxx” - ‘in the sense of’
• stalk formation
  • sensu Plantae: slender or elongated structure that supports a plant, plant part or plant organ
  • sensu Dictyostelium: a tubular structure that consists of cellulose-covered cells stacked on top of each other and surrounded by an acellular stalk tube composed of cellulose and glycoprotein.
Anatomy of a GO term
• GO synonyms include alternative wordings, spellings, and related concepts
  • Broader, narrower, exact or related
  • Useful search aid
name: glucose transport
exact_synonym: gluco-hexose transport
narrow_synonym: glucose shuttling
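As a rough illustration of synonyms acting as a search aid, the sketch below looks up a query against a term's name and synonyms and reports which kind of match it found. The `match_term` helper is hypothetical, and the `broad_synonym` and `related_synonym` keys are assumed by analogy with the two synonym tags shown on the slide.

```python
# Hypothetical in-memory term; the field names follow the slide,
# the lookup helper is an assumption made for illustration.
TERM = {
    "name": "glucose transport",
    "exact_synonym": ["gluco-hexose transport"],
    "narrow_synonym": ["glucose shuttling"],
}

# Assumed scope names, modelled on "broader, narrower, exact or related".
SCOPES = ("exact_synonym", "narrow_synonym", "broad_synonym", "related_synonym")


def match_term(term, query):
    """Return 'name', the matching synonym scope, or None if nothing matches."""
    q = query.strip().lower()
    if q == term["name"].lower():
        return "name"
    for scope in SCOPES:
        if q in (s.lower() for s in term.get(scope, [])):
            return scope
    return None


print(match_term(TERM, "Glucose shuttling"))
print(match_term(TERM, "fructose transport"))
```

Returning the scope rather than a plain yes or no lets a search interface tell the user when a hit is only a narrower or related concept.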
Ontology Structure
• Ontologies are structured as a hierarchical directed acyclic graph
• Terms can have more than one parent and zero, one or more children
• Terms are linked by two relationships
  • is-a
  • part-of
Ontology Structure
[Diagram: cell at the top; membrane and chloroplast below it; mitochondrial membrane and chloroplast membrane at the bottom. Edge key: is-a, part-of]
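A directed acyclic graph of this kind can be sketched as a mapping from each term to its parents. The edges below are an assumed reading of the figure (for example, that chloroplast membrane is-a membrane and part-of chloroplast), not an export from GO, and the helper names are hypothetical.

```python
# Each term maps to a list of (relationship, parent) pairs.
# The edges are an assumed reading of the figure, not an official GO export.
PARENTS = {
    "cell": [],
    "membrane": [("part-of", "cell")],
    "chloroplast": [("part-of", "cell")],
    "mitochondrial membrane": [("is-a", "membrane")],
    "chloroplast membrane": [("is-a", "membrane"), ("part-of", "chloroplast")],
}


def children(term):
    """Return the terms that list `term` as a direct parent."""
    return sorted(
        child
        for child, links in PARENTS.items()
        if any(parent == term for _, parent in links)
    )


def is_acyclic(graph):
    """Depth-first check that following parent links never returns to a term."""
    state = {}  # term -> "visiting" or "done"

    def visit(term):
        if state.get(term) == "done":
            return True
        if state.get(term) == "visiting":
            return False  # reached a term that is still on the current path
        state[term] = "visiting"
        ok = all(visit(parent) for _, parent in graph[term])
        state[term] = "done"
        return ok

    return all(visit(term) for term in graph)


print(PARENTS["chloroplast membrane"])  # a term with two parents
print(children("membrane"))
print(is_acyclic(PARENTS))
```

Storing parents rather than children keeps the "more than one parent" case explicit, which is what separates this structure from a simple tree.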
True Path Rule
• The path from a child term all the way up to its top-level parent(s) must always be true
[Diagram: cell > nucleus > chromosome]
But what about bacteria?
True Path Rule
Resolved component ontology structure:
[Diagram: cell, with cytoplasm, chromosome and nucleus beneath it; nuclear chromosome appears beneath both chromosome and nucleus]
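One way to see why the resolved structure satisfies the true path rule is to list everything a term implies through its ancestors. This is a sketch under an assumed set of edges read from the diagram; `ancestors` is a hypothetical helper.

```python
# Assumed parent links for the resolved structure (relationship types omitted).
PARENTS = {
    "cell": [],
    "cytoplasm": ["cell"],
    "nucleus": ["cell"],
    "chromosome": ["cell"],
    "nuclear chromosome": ["chromosome", "nucleus"],
}


def ancestors(term):
    """Return every term reachable by following parent links upwards."""
    found = set()
    stack = list(PARENTS[term])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found


# A bacterial gene product annotated to "chromosome" no longer implies "nucleus".
print(sorted(ancestors("chromosome")))
# A eukaryotic one annotated to "nuclear chromosome" still implies both parents.
print(sorted(ancestors("nuclear chromosome")))
```

Under these assumed edges, every path from "chromosome" to the top stays true for bacteria, because "nucleus" is only reached through the more specific "nuclear chromosome" term.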
GO for it!
• GO to http://www.ebi.ac.uk/~aji/intro.html
GO Annotation
• Using GO terms to represent the activities and localizations of a gene product
• Annotations contributed by members of the GO Consortium
  • model organism databases
  • cross-species databases, e.g. UniProt
• Annotations freely available from GO website
GO Annotation
• Database object: gene or gene product
• GO term ID: e.g. GO:0003677
• Reference for annotation: e.g. PubMed paper, BLAST results
• Evidence code: from evidence code ontology
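The four parts of an annotation listed above can be sketched as a small record type. The class and field names are hypothetical, the object identifier and reference are placeholders rather than real database entries, and the check assumes GO IDs have the form "GO:" followed by seven digits.

```python
import re
from dataclasses import dataclass

# Assumed ID shape: "GO:" followed by seven digits, as in GO:0003677.
GO_ID = re.compile(r"GO:\d{7}")


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record holding the four parts of a GO annotation."""

    db_object: str  # gene or gene product
    go_id: str      # GO term ID
    reference: str  # e.g. a PubMed paper or BLAST results
    evidence: str   # evidence code

    def __post_init__(self):
        if not GO_ID.fullmatch(self.go_id):
            raise ValueError(f"not a GO ID: {self.go_id!r}")


example = Annotation(
    db_object="ExampleDB:gene1",  # placeholder, not a real database object
    go_id="GO:0003677",
    reference="PMID:0000000",     # placeholder reference
    evidence="IDA",
)
print(example)
```

Validating the ID at construction time catches truncated or mistyped identifiers before they reach an annotation file.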
GO Annotation
• Electronic annotation
  • from mappings files, e.g. UniProt keyword2go
  • High quantity but low quality
  • Annotations to low level terms
  • Not checked by curators
• Manual annotation
  • From literature curation
  • Time consuming but high quality
GO Annotation
ISS  Inferred from Sequence/Structural Similarity
IDA  Inferred from Direct Assay
IPI  Inferred from Physical Interaction
TAS  Traceable Author Statement
NAS  Non-traceable Author Statement
IMP  Inferred from Mutant Phenotype
IGI  Inferred from Genetic Interaction
IEP  Inferred from Expression Pattern
IC   Inferred by Curator
ND   No Data available
IEA  Inferred from Electronic Annotation
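Evidence codes make it easy to filter annotations by how they were made. The sketch below stores the codes from this slide in a lookup table and separates electronic from other annotations; the helper name is hypothetical, and the three sample pairs reuse the GO IDs and codes from the PERK1 example in this deck.

```python
# Evidence codes exactly as listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}


def is_electronic(code):
    """Return True for IEA; raise KeyError for a code not in the table."""
    if code not in EVIDENCE_CODES:
        raise KeyError(code)
    return code == "IEA"


# (GO ID, evidence code) pairs from the PERK1 example in this deck.
annotations = [
    ("GO:0004674", "IDA"),
    ("GO:0005887", "IDA"),
    ("GO:0009611", "NAS"),
]

direct_assay = [go_id for go_id, code in annotations if code == "IDA"]
print(direct_assay)
print([EVIDENCE_CODES[code] for _, code in annotations])
```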
GO Annotate
In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein … these kinases have been implicated in early stages of wound response …
Function: protein serine/threonine kinase activity ; GO:0004674 (IDA)
Component: integral to plasma membrane ; GO:0005887 (IDA)
Process: response to wounding ; GO:0009611 (NAS)
GO for it (again)!
• GO to http://www.ebi.ac.uk/~aji/annotI.html