Application of OBO Foundry Principles in GO Chris Mungall Lawrence Berkeley Labs NCBO GO Consortium


Application of OBO Foundry Principles in GO

Chris Mungall
Lawrence Berkeley Labs

NCBO
GO Consortium


The GO is 3 ontologies

• Molecular function (MF)
• Biological Process (BP)
• Cellular component (CC)


The GO is 3 orthogonal ontologies

• Molecular function
  – (a kind of dependent continuant)
• Biological Process
  – (a kind of occurrent)
• Cellular component
  – (a kind of independent continuant)


The GO is 3 orthogonal ontologies of canonical biology

• Molecular function
  – (a kind of dependent continuant)
• Biological Process
  – (a kind of occurrent)
• Cellular component
  – (a kind of independent continuant)

Examples marked on the slide:
• oncogenesis: X
• fin regeneration: yes
• acquisition of nutrients from host: yes


The GO is 3 orthogonal canonical species-neutral ontologies

• Molecular function
  – many core functions
• Biological Process
  – core shared processes (e.g. transcription)
  – processes specific to organism types (e.g. fin development, fly courtship behaviour)
• Cellular component
  – prokaryotes and eukaryotes


Figure: part of the part_of tree in GO


Granularity of the GO

                Function  Process  Continuant
Molecular       GO-MF     GO-BP    GO-CC
(Sub?)Cellular  -         GO-BP    GO-CC
Organismal      -         GO-BP    -


The GO is an ontology, with rich terminological features

• GO ‘terms’ are actually representations of types (aka kinds, universals, classes)
  – The actual terms (i.e. the phrases used by biologists) are attached to the representations of types as names and synonyms
    • synonyms have linguistic relations to the GO types
      – exact, broad, narrow, related
    • distinct from ontological relations between GO types
      – is_a, part_of
• GO is moving to genus-differentia style definitions
  – Many definitions are still dictionary-style, terminological
    • describe how the term is used


Genus differentia definitions

central nervous system morphogenesis
Genus: morphogenesis
Differentia: has_outcome central nervous system

The process by which the anatomical structure of the central nervous system is generated and organized
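
A definition of this shape can be held as structured fields rather than free text. The following is a minimal Python sketch of that idea; the class name `GenusDifferentia` and its `describe` method are illustrative assumptions, not part of GO or of any OBO tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenusDifferentia:
    """A term defined as: a <genus> that <relation> <filler>."""
    term: str
    genus: str
    relation: str
    filler: str

    def describe(self) -> str:
        # Render the structured definition as a one-line summary.
        return f"{self.term}: a {self.genus} that {self.relation} {self.filler}"


# The example from the slide, stored as data.
cns_morphogenesis = GenusDifferentia(
    term="central nervous system morphogenesis",
    genus="morphogenesis",
    relation="has_outcome",
    filler="central nervous system",
)
print(cns_morphogenesis.describe())
```

Keeping the genus and the differentia in separate fields is what lets software, and not only a human reader, compare two definitions.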


The GO is both reference and application ontology

• The same artefact (i.e. file) is used for both ontology editing and data annotation
• This has worked reasonably well until now
• We may encourage making a distinction
  – application views (aka GO slims)
    • currently only used to present a very small subset of GO
    • consider wider use for extracting most of ontology
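
An application view can be thought of as a chosen subset of terms onto which more specific terms are mapped. The sketch below shows one simple way to do that mapping in Python, assuming a toy parent table and a one-term slim; the function names `ancestors` and `map_to_slim` and the data are hypothetical and do not describe how GO slims are actually built.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` via parent links (excluding `term`)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def map_to_slim(term, parents, slim):
    """Return the slim terms that are `term` itself or one of its ancestors."""
    return ({term} | ancestors(term, parents)) & slim


# Toy data (assumed for illustration only).
PARENTS = {
    "cysteine biosynthesis": ["biosynthesis"],
    "biosynthesis": ["metabolism"],
}
SLIM = {"metabolism"}

print(map_to_slim("cysteine biosynthesis", PARENTS, SLIM))  # {'metabolism'}
```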


Relations in GO (current)

• part_of
  – conforms to RO
    • X part_of Y: all Xs are part_of some Y (for the entirety of the duration of the existence of the X)
      – e.g. nucleus part_of cell (all nuclei are always part_of a cell, not all cells have a nucleus as part)
    • for both continuants and processes
      – no ordering for processes
• is_a
  – sort-of conforms to RO
    • X is_a Y: all Xs are Ys (for the entirety of X's existence)
    • but there are issues with is_a in GO:
      – is_a incompleteness
      – is_a polyhierarchies
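
The "all Xs are part_of some Y" reading can be checked against instance-level data. The following Python sketch does so for the nucleus/cell example; it ignores the time dimension ("for the entirety of the existence of the X"), and the function name `holds_all_some` and the instance identifiers are assumptions made for illustration.

```python
def holds_all_some(instances_of_x, instances_of_y, pairs):
    """True if every X instance is related to at least one Y instance."""
    return all(
        any((x, y) in pairs for y in instances_of_y)
        for x in instances_of_x
    )


# Toy instance data (hypothetical identifiers).
nuclei = {"nucleus1", "nucleus2"}
cells = {"cell1", "cell2", "cell3"}  # cell3 has no nucleus
part_of = {("nucleus1", "cell1"), ("nucleus2", "cell2")}

# Type level: nucleus part_of cell holds, since every nucleus is in some cell.
print(holds_all_some(nuclei, cells, part_of))  # True

# The inverse type-level statement fails, since cell3 has no nucleus as part.
has_part = {(cell, nucleus) for (nucleus, cell) in part_of}
print(holds_all_some(cells, nuclei, has_part))  # False
```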


is_a has issues

• Not all GO types have is_a parents
  – not a problem in MF
  – fixed in CC (July 2006)
  – being fixed in BP (Sept 2006: right now, here in Seattle)
• Still a contentious issue?
  – is_a completion requires new high-level types in ontology
  – perceived as being too abstract by biologists
  – simple solution: application ontology
    • remove high level terms in annotation view
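
Checking for is_a incompleteness amounts to listing every non-root term that has no is_a parent. A minimal Python sketch follows, assuming the ontology is available as a plain dictionary of is_a links; the function name `missing_is_a` and the toy terms (including `term_x`) are hypothetical.

```python
def missing_is_a(terms, is_a, roots):
    """Return, sorted, the non-root terms that have no is_a parent."""
    return sorted(t for t in terms if t not in roots and not is_a.get(t))


# Toy data (assumed): term_x has no is_a parent recorded.
TERMS = {"biological_process", "metabolism", "biosynthesis", "term_x"}
IS_A = {
    "metabolism": ["biological_process"],
    "biosynthesis": ["metabolism"],
}
ROOTS = {"biological_process"}

print(missing_is_a(TERMS, IS_A, ROOTS))  # ['term_x']
```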


is_a polyhierarchies

• is_a diamonds cause problems
  – tangled DAGs, easy to make mistakes
• Source of problems typically due to multiple axes of classification
  – e.g. due to composite terms
• Solution:
  – Genus-differentia (Aristotelian) definitions
    • aka cross-products [Hill et al]
  – Always a single genus
    • choose consistent axis of classification
  – Allow classifier/reasoner to provide different views of ontology
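
An is_a diamond can be found mechanically: it is a term with two is_a parents that lead back to a common ancestor. The Python sketch below illustrates one way to detect this, under the assumption that is_a links are held in a dictionary; the function names are hypothetical, and the parent term "serine family amino acid metabolism" is added here only to close the toy diamond.

```python
def ancestors(term, is_a):
    """Return every term reachable from `term` via is_a links (excluding `term`)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def find_diamonds(is_a):
    """Map each term with two converging is_a parents to the shared ancestors."""
    diamonds = {}
    for term, parents in is_a.items():
        for i, first in enumerate(parents):
            for second in parents[i + 1:]:
                common = ({first} | ancestors(first, is_a)) & (
                    {second} | ancestors(second, is_a)
                )
                if common:
                    diamonds.setdefault(term, set()).update(common)
    return diamonds


# Toy hierarchy (assumed for illustration).
IS_A = {
    "cysteine biosynthesis": [
        "cysteine metabolism",
        "serine family amino acid biosynthesis",
    ],
    "cysteine metabolism": ["serine family amino acid metabolism"],
    "serine family amino acid biosynthesis": ["serine family amino acid metabolism"],
}

print(find_diamonds(IS_A))
```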


Figure: a tangled hierarchy in GO

problem: mixes (at least) two axes of classification


biosynthesis is_a metabolism


cysteine is_a serine family amino acid is_a amino acid is_a amine


Figure labels: cysteine, is_a, serine family amino acid, is_a, amino acid, is_a, serine


The solution: separate the axes

cysteine biosynthesis (GO)
Genus: biosynthesis
Differentia: has_outcome cysteine

biosynthesis (GO)

metabolism (GO)

cysteine (ChEBI)

serine family amino acid (ChEBI)

computable genus-differentia definition


Compute the subsumption DAG from the definition

cysteine biosynthesis (GO)
Genus: biosynthesis
Differentia: has_outcome cysteine

cysteine metabolism (GO)

serine family amino acid biosynthesis (GO)

the DAG is required for applications such as annotation search
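
The computation can be sketched in a few lines of Python: one defined term subsumes another when its genus subsumes the other's genus and its differentia filler subsumes the other's filler. The code below is a toy illustration of that rule and not a description of any actual reasoner. The definitions given for "cysteine metabolism" and "serine family amino acid biosynthesis" are assumptions added so that there is something to compare against, all definitions are assumed to use the same has_outcome relation, and every function name is hypothetical.

```python
# is_a links within each axis, taken from the slides.
PROCESS_IS_A = {"biosynthesis": {"metabolism"}}
CHEMICAL_IS_A = {"cysteine": {"serine family amino acid"}}

# term -> (genus, filler of the has_outcome differentia).
# Only the first entry is given on the slide; the other two are assumed.
DEFINITIONS = {
    "cysteine biosynthesis": ("biosynthesis", "cysteine"),
    "cysteine metabolism": ("metabolism", "cysteine"),
    "serine family amino acid biosynthesis": ("biosynthesis", "serine family amino acid"),
}


def is_subsumed(child, parent, is_a):
    """True if `child` equals `parent` or reaches it by following is_a links."""
    if child == parent:
        return True
    return any(is_subsumed(p, parent, is_a) for p in is_a.get(child, ()))


def inferred_subsumers(term):
    """Return, sorted, the other defined terms whose definition subsumes `term`'s."""
    genus, filler = DEFINITIONS[term]
    return sorted(
        other
        for other, (other_genus, other_filler) in DEFINITIONS.items()
        if other != term
        and is_subsumed(genus, other_genus, PROCESS_IS_A)
        and is_subsumed(filler, other_filler, CHEMICAL_IS_A)
    )


print(inferred_subsumers("cysteine biosynthesis"))
# ['cysteine metabolism', 'serine family amino acid biosynthesis']
```

Each axis (process, chemical) stays a simple hierarchy, and the two-parent structure of the composite term is derived rather than maintained by hand.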


Pre- and post- composition

• References to types can be pre-composed in ontology, prior to annotation
  – Ontology editor creates term, with ID
  – Use reasoners to classify the DAG automatically
• References to types can be post-composed (created on the fly) at annotation time
  – No term with ID is created
• Computationally, it makes no difference
  – provided we adhere to the genus-differentia formalism
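
The claim that the two styles are computationally equivalent can be illustrated by normalizing both kinds of reference to the same genus-differentia triple before comparing them. This is a minimal Python sketch under that assumption; the identifier `EX:0000001`, the lookup table and the function names are hypothetical placeholders, not real GO content.

```python
# Pre-composed terms: ID -> (genus, relation, filler).
# EX:0000001 is a made-up identifier used only for this example.
PRECOMPOSED = {
    "EX:0000001": ("biosynthesis", "has_outcome", "cysteine"),
}


def resolve(reference):
    """Normalize a term ID or a post-composed triple to a (genus, relation, filler) tuple."""
    if isinstance(reference, str):
        return PRECOMPOSED[reference]
    return tuple(reference)


def same_type(first, second):
    """True if both references denote the same genus-differentia expression."""
    return resolve(first) == resolve(second)


# An annotation made with a term ID and one composed on the fly refer to the same type.
post_composed = ("biosynthesis", "has_outcome", "cysteine")
print(same_type("EX:0000001", post_composed))  # True
```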


OBO Foundry practices and pre-composition

• Pre-composition of terms in the ontology is good as it creates a map of biological reality, linking foundry ontologies
  – within reason


Examples

• GO Biological Process x OBO Cell
  – neuron migration
  – cone cell fate specification
  – T cell homeostasis
  – erythrocyte degranulation

• OBO Cell is species-neutral


Current status

• The ability to effectively create computationally visible genus-differentia definitions is new to most OBO ontologies
• Soon to be created:
  – SO (many terms now done)
  – GO-BP definitions referencing OBO-Cell
  – OBO Disease definitions referencing FMA
  – And more…
• Difficult:
  – GO-BP and ChEBI (chemical entities)
  – GO-BP and anatomy (we need CARO!)


development in GO (current)

Figure (GO process terms linked by part_of; is_a not shown):
• neural tube formation
• neural tube development
• neural plate formation
• neural plate development
• neural plate morphogenesis


development in GO (future)

Figure (AO structures linked by transformation_of; GO processes linked by part_of; has_participant links GO processes to AO structures):

AO:
• presumptive spinal cord
• neural plate
• neural keel
• neural rod
• neural tube
• spinal cord

GO:
• neural tube formation
• neural tube development
• neural plate formation
• neural plate development
• neural plate morphogenesis

neural tube formation
Genus: tube formation
Differentia: has_outcome neural tube