GO and OBO: an introduction


Page 1: GO and OBO:

GO and OBO: an introduction

Page 2: GO and OBO:

Jane Lomax EMBL-EBI

• What is the Gene Ontology?
• What is OBO?
• OBO-Edit demo & practical

Page 3: GO and OBO:

Gene Ontology

• Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases”
• Applicable to all species

Page 4: GO and OBO:

Evolution of GO

• Original GO created in 2000
• Three databases involved:
  – FlyBase (Drosophila)
  – MGI (Mouse)
  – SGD (S. cerevisiae)
• Used immediately

Page 5: GO and OBO:

Evolution of GO

• Later databases:
  – TAIR (Arabidopsis)
  – TIGR (microbes including prokaryotes)
  – SWISS-PROT (several thousand species inc. human)
  – PSU (P. falciparum)
• Recent additions:
  – ZFIN (zebrafish)
  – PAMGO (plant pathogens)

Page 6: GO and OBO:

Evolution of GO

• GO development traditionally annotation-driven
  – development directed by use
• Terms added as new species annotated
• Terms added on an as-needed basis

Page 7: GO and OBO:

Evolution of GO

• Developed by an international consortium of biologists and computer scientists
  – members from individual databases
  – central office at EBI
• Development involves collaboration with domain experts from different biological fields
  – also formal ontologists

Page 8: GO and OBO:

Evolution of GO

• Resulted in ‘organic’ structure, little formality
• Ontological formality added subsequently
  – philosophical and logical

Page 9: GO and OBO:

Growth of GO

[Figure: GO term history 2001 - 2007. Number of terms (axis from 0 to 30000) plotted against date (Jan-01 to Jan-07), split into defined terms, undefined terms and obsolete terms]

Page 10: GO and OBO:

How does GO work?

What information might we want to capture about a gene product?

• What does the gene product do?
• Where and when does it act?
• Why does it perform these activities?

Page 11: GO and OBO:

GO structure

• GO terms divided into three parts:
  – cellular component
  – molecular function
  – biological process

Page 12: GO and OBO:

Cellular Component

• where a gene product acts

Page 13: GO and OBO:

Cellular Component

Page 14: GO and OBO:

Cellular Component

Page 15: GO and OBO:

Cellular Component

• Enzyme complexes in the component ontology refer to places, not activities.

Page 16: GO and OBO:

Molecular Function

• activities or “jobs” of a gene product

glucose-6-phosphate isomerase activity

Page 17: GO and OBO:

Molecular Function

insulin binding

insulin receptor activity

Page 18: GO and OBO:

Molecular Function

drug transporter activity

Page 19: GO and OBO:

Molecular Function

• A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product.
• Sets of functions make up a biological process.

Page 20: GO and OBO:

Biological Process

a commonly recognized series of events

cell division

Page 21: GO and OBO:

Biological Process

transcription

Page 22: GO and OBO:

Biological Process

regulation of gluconeogenesis

Page 23: GO and OBO:

Biological Process

limb development

Page 24: GO and OBO:

Biological Process

courtship behavior

Page 25: GO and OBO:

Ontology Structure

• Terms are linked by two relationships:
  – is-a
  – part-of

Page 26: GO and OBO:

Ontology Structure

[Diagram: example graph with the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane, linked by is-a and part-of relationships]

Page 27: GO and OBO:

Ontology Structure

• Ontologies are structured as a hierarchical directed acyclic graph (DAG)
• Terms can have more than one parent and zero, one or more children
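To make the DAG idea concrete, here is a minimal Python sketch. The term names come from the slides' example diagram, but the edges between them are an assumption made for illustration, and the function names `ancestors` and `is_acyclic` are hypothetical. It shows a term with two parents and a simple acyclicity check.

```python
from collections import deque

# Hypothetical child -> parents map. Only the term names come from the
# slides; the exact edges are assumed for illustration.
PARENTS = {
    "cell": [],
    "membrane": ["cell"],
    "chloroplast": ["cell"],
    "mitochondrial membrane": ["membrane"],
    "chloroplast membrane": ["membrane", "chloroplast"],  # two parents
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque(parents[term])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen

def is_acyclic(parents=PARENTS):
    """The graph is acyclic if no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)
```

In this sketch, `ancestors("chloroplast membrane")` collects both parents and their shared ancestor, which is what multiple parentage means in practice: anything said about a parent term also applies along every path up the graph.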

Page 28: GO and OBO:

Ontology Structure

[Diagram: the example terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane drawn as a directed acyclic graph (DAG); multiple parentage is allowed]

Page 29: GO and OBO:

Open Biomedical Ontologies (OBO)

• GO is a member of OBO
• An umbrella project for grouping different ontologies in the biological/medical field
  – a repository for ontologies with a defined set of standards
• Available from a single source: http://obo.sourceforge.net/

Page 30: GO and OBO:

Why do we need OBO?

• GO covers a small area of biology:
  – molecular function of a protein
  – biological function of a protein
  – cellular location of a protein

Page 31: GO and OBO:

Why do we need OBO?

• Lots of other aspects that also need to be captured, e.g.:
  – phenotype
  – anatomy
  – genomic
  – taxonomy

Page 32: GO and OBO:

Why do we need OBO?

• Many groups develop their own ontologies
  – e.g. plant ontology, anatomies for specific organisms
• No standardisation of ontologies with respect to:
  – format
  – scope
  – relationships
• No way of knowing whether such ontologies already exist
• No mechanism of distribution for other groups

Page 33: GO and OBO:

Why do we need OBO?

• Creating ontologies takes a lot of work
  – makes sense to reuse existing ontologies where possible
• Improves data integration when a small set of ontologies is used
• Allows ontologies to be made available from a single place

Page 34: GO and OBO:

Why do we need OBO?

• Ultimate aim: a complete set of integrated ontologies completely covering the biomedical domain

Page 35: GO and OBO:

OBO requirements

To be part of OBO, ontologies must:

• Be open: usable by all without any constraint

Page 36: GO and OBO:

OBO requirements: open

• Ontologies can be used by anyone without any constraints, except that:
  – the original authors must be acknowledged
  – an ontology cannot be edited and then released under the same name

Page 37: GO and OBO:

OBO requirements

To be part of OBO, ontologies must:

• Be open: usable by all without any constraint
• Be in a common shared syntax

Page 38: GO and OBO:

OBO requirements: syntax

• Usually the OBO format, same as the primary GO format
  – and adaptations of the OBO format
• OWL (Web Ontology Language) format is also accepted
• Allows the same tools to be applied, facilitating shared software implementations

Page 39: GO and OBO:

Anatomy of an OBO term

id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092

Labels on the slide:

• id: unique ID
• name: term name
• namespace: ontology
• def: definition
• exact_synonym: synonym
• xref_analog: database ref
• is_a: parentage
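As an illustration of how such a term might be read by software, the following Python sketch parses the tag-value lines shown on the slide. It follows the slide's simplified layout rather than the full OBO format specification, and the function name `parse_stanza` is hypothetical.

```python
from collections import defaultdict

# The term exactly as laid out on the slide (one tag-value pair per line).
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""

def parse_stanza(text):
    """Collect tag -> list of values; tags such as is_a may repeat."""
    term = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ": " only, because values contain colons too.
        tag, _, value = line.partition(": ")
        term[tag].append(value)
    return dict(term)

term = parse_stanza(STANZA)
print(term["id"][0])   # GO:0006094
print(term["is_a"])    # ['GO:0006006', 'GO:0006092']
```

Storing each tag as a list keeps both is_a parents, which matters because a term may have more than one parent.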

Page 40: GO and OBO:

OBO requirements

To be part of OBO, ontologies must:

• Be open: usable by all without any constraint
• Be in a common shared syntax
• Not overlap with other ontologies in OBO

Page 41: GO and OBO:

OBO requirements: overlapping

• Ontologies can (and should) overlap partially, but large overlap should be avoided
• Idea is that terms from different ontologies can be combined to form new terms
• Striving for accepted standards rather than competition

Page 42: GO and OBO:

OBO requirements

To be part of OBO, ontologies must:

• Be open: usable by all without any constraint
• Be in a common shared syntax
• Not overlap with other ontologies in OBO
• Have a unique identifier space

Page 43: GO and OBO:

OBO requirements: id space

• So, for example, the GO identifier space is “GO”:
  – no other OBO ontology could use this id space
• Prevents problems where multiple ontologies are used together
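A short Python sketch of why a unique id space helps when ontologies are loaded together: the prefix before the colon identifies the owning ontology, so a clash is easy to detect. The function names `id_space` and `find_clashes` and the ontology names in the usage note are hypothetical.

```python
def id_space(term_id):
    """Return the prefix before the colon, e.g. 'GO' for 'GO:0006094'."""
    prefix, sep, local = term_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return prefix

def find_clashes(ontologies):
    """Map each id space used by more than one ontology to those ontologies.

    `ontologies` maps an ontology name to an iterable of its term ids.
    """
    owners = {}
    for name, term_ids in ontologies.items():
        for term_id in term_ids:
            owners.setdefault(id_space(term_id), set()).add(name)
    return {space: names for space, names in owners.items() if len(names) > 1}
```

For example, calling `find_clashes` on a made-up collection in which both "gene_ontology" and a second ontology contain ids starting with "GO:" would report the "GO" space as claimed twice, which is exactly the situation the requirement rules out.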

Page 44: GO and OBO:

OBO requirements

To be part of OBO, ontologies must:

• Be open: usable by all without any constraint
• Be in a common shared syntax
• Not overlap with other ontologies in OBO
• Have a unique identifier space
• Include text definitions of their terms

Page 45: GO and OBO:

OBO requirements

• In addition, OBO includes an ontology of relationships
  – all ontologies should use these definitions of relationships
• For example:
  – part_of
  – develops_from
  – regulates
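One practical consequence of a shared set of relationships is that a tool can flag relationship names an ontology uses that fall outside it. The Python sketch below assumes a small allowed set made up of the names mentioned in these slides (the real shared set is larger); the function name `unknown_relations` and the sample edges are hypothetical.

```python
# Relationship names mentioned in the slides; the real shared set is larger.
SHARED_RELATIONS = {"is_a", "part_of", "develops_from", "regulates"}

def unknown_relations(edges, allowed=SHARED_RELATIONS):
    """Return the relation names in `edges` that are not in the shared set.

    `edges` is an iterable of (child, relation, parent) triples.
    """
    return sorted({relation for _, relation, _ in edges} - set(allowed))

# Made-up edges: the second one misspells part_of.
edges = [
    ("chloroplast membrane", "part_of", "chloroplast"),
    ("term A", "partof", "term B"),
]
print(unknown_relations(edges))  # ['partof']
```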

Page 46: GO and OBO:

What’s available

• demo: http://obo.sourceforge.net/

Page 47: GO and OBO:

Editing ontologies

• GO is edited using OBO-Edit
  – stand-alone Java application
  – available for all platforms
  – browse, create or edit any ontology in OBO format

Page 48: GO and OBO:

OBO-Edit demo

• Browsing ontologies
  – loading ontologies (including loading multiple ontologies)
  – graph viewer
  – reasoner/single relationship views
  – searching/filtering/rendering
  – help
• Creating/editing ontologies
  – creating a new ontology
  – adding terms
  – copying/moving/deleting terms
  – adding definitions, dbxrefs etc.
  – verification plugin
  – saving ontologies