Introduction to Ontologies for Environmental Biology

Oxford, August 2007

Barry Smith

http://ontology.buffalo.edu/smith

Finnegans Web: concept, type, class, instance, model, representation, data, process, property

Disciplines here involved

GIS

Ecology

Environmental biology

Various -omics disciplines

Bioinformatics

Medical Informatics

Database science

Semantic webists

...

Part 1: What is an Ontology?

what cellular component?

what molecular function?

what biological process?

natural language labels designed for use in annotations

to make the data cognitively accessible to human beings

and algorithmically tractable to computers

compare: legends for maps

common legends allow (cross-border) integration

ontologies are legends for data

compare: legends for diagrams

Ramirez et al., Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology, Syst. Biol. 56(2):283–294, 2007

computationally tractable legends

help integrate complex representations of reality

help human beings find things in complex representations of reality

help computers reason with complex representations of reality

ontologies are used to annotate data

but there are two kinds of annotations

names of types

names of instances

A basic distinction

type vs. instance

science text vs. diary

human being vs. Michael Ashburner

A 515287 DC3300 Dust Collector Fan

B 521683 Gilmer Belt

C 521682 Motor Drive Belt

Catalog vs. inventory

Ontology types Instances

An ontology is a collection of standardized names for types

We learn about types in reality from looking at the results of scientific experiments captured in the form of scientific theories

Ontologies provide the terminological scaffolding of scientific theories

experiments relate to what is particular

science describes what is general

[Figure: types (thing, organism, animal, mammal, cat, siamese, frog) and their instances]

types vs. their extensions

type

{a,b,c,...} class of instances = a collection of particulars

Extension =def

The extension of a type A is the class of instances of A

(the class of all entities to which the term ‘A’ applies)
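A minimal sketch of the definition above, assuming that instance-level facts are recorded as (instance, type) pairs. The list `INSTANCE_OF` and the function `extension` are hypothetical names introduced only for illustration; the example instances are taken from other slides in this talk.

```python
# Hypothetical instance-level assertions: (instance, type).
INSTANCE_OF = [
    ("Michael Ashburner", "human being"),
    ("Barry Smith", "human being"),
    ("Toronto", "city"),
]


def extension(type_name, assertions=INSTANCE_OF):
    """Return the class (a set) of all instances asserted for the type."""
    return {inst for inst, typ in assertions if typ == type_name}


# The type is one thing; its extension is a collection of particulars.
print(extension("human being"))
print(extension("city"))
```

The type itself is named by a string, while the extension is a plain set, which mirrors the slide's point that a type and the class of its instances are different things.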

types vs. classes

types

{c,d,e,...} classes

types vs. classes

types

extensions ~ defined classes

Defined class =def

member of Abba aged > 50 years

pizza with > 4 different toppings

red wine to serve with fish
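One way to read these examples is that a defined class is picked out by a condition on instances rather than corresponding to a type discovered by science. The sketch below illustrates this for "pizza with > 4 different toppings"; the `Pizza` record and the `MENU` data are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pizza:
    name: str
    toppings: frozenset


def many_topping_pizzas(pizzas):
    """Defined class: the pizzas with more than 4 different toppings."""
    return {p.name for p in pizzas if len(p.toppings) > 4}


# Hypothetical data.
MENU = [
    Pizza("margherita", frozenset({"tomato", "mozzarella", "basil"})),
    Pizza("everything", frozenset(
        {"tomato", "mozzarella", "ham", "olive", "mushroom"})),
]

print(many_topping_pizzas(MENU))
```

The membership test is an arbitrary predicate, so any change to the threshold yields a different defined class without any change in the underlying types.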

Part 2: The OBO Foundry

what cellular component?

what molecular function?

what biological process?

The Gene Ontology

Five bangs for your GO buck

1. based in biological science

2. cross-species data comparability (human, mouse, yeast, fly ...)

3. cross-granularity data integration (molecule, cell, organ, organism)

4. cumulation of scientific knowledge in algorithmically tractable form

5. links people to software

6. part of Open Biomedical Ontologies (OBO)

The Gene Ontology

Entry point for creation of web-accessible biomedical data

GO initially low-tech to encourage users

Simple (web-service-based) tools created to support the work of biologists in creating annotations (data entry)

OBO OWL DL converters now making OBO Foundry annotated data immediately accessible to Semantic Web data integration projects

The OBO Foundry

A suite of high quality interoperable reference ontologies to serve the annotation of biomedical data

providing guidelines for those who need to create new ontology resources

http://obofoundry.org

GRANULARITY by RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT)

ORGAN AND ORGANISM
  independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
  independent continuant: Cell (CL); Cellular Component (FMA, GO)
  dependent continuant: Cellular Function (GO)

MOLECULE
  independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  dependent continuant: Molecular Function (GO)
  occurrent: Molecular Process (GO)

The OBO Foundry building out from the original GO

Simple guidelines

• use singular nouns

• distinguish continuants from occurrents

• distinguish things from their qualities

• distinguish types from their instances

• do not use the weasel word ‘concept’

OPENNESS: The ontology is open and available to be used by all.

FORMAL LANGUAGE: The ontology is in, or can be instantiated in, a common formal language.

ORTHOGONALITY: The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.

CONVERGENCE: The developers agree to work towards a single ontology for each domain.

http://obofoundry.org/

CRITERIA

UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.

IDENTIFIERS: The ontology possesses a unique identifier space within OBO.

VERSIONING: The ontology provider has procedures for identifying distinct successive versions.

DEFINITIONS: The ontology includes textual definitions for all terms.

CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.

DOCUMENTATION: The ontology is well-documented.

USERS: The ontology has a plurality of independent users.

COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
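Some of these criteria lend themselves to mechanical checking. The sketch below covers only two of them (IDENTIFIERS and DEFINITIONS) and assumes that terms are held as simple dictionaries; the `EX` prefix, the sample terms and the function `check_terms` are hypothetical and do not correspond to any actual OBO Foundry tool.

```python
def check_terms(terms, prefix):
    """Return a list of human-readable problems found in a term list."""
    problems = []
    seen = set()
    for term in terms:
        tid = term.get("id", "")
        label = tid or "<no id>"
        # IDENTIFIERS: every term lives in the ontology's own ID space.
        if not tid.startswith(prefix + ":"):
            problems.append(f"{label}: identifier outside the {prefix} space")
        if tid in seen:
            problems.append(f"{label}: duplicate identifier")
        seen.add(tid)
        # DEFINITIONS: every term carries a textual definition.
        if not term.get("definition", "").strip():
            problems.append(f"{label}: missing textual definition")
    return problems


# Hypothetical terms; identifiers and definitions are illustrative only.
TERMS = [
    {"id": "EX:0000001", "name": "niche",
     "definition": "A place that an organism occupies."},
    {"id": "EX:0000002", "name": "habitat", "definition": ""},
    {"id": "OTHER:0000003", "name": "site", "definition": "A location."},
]

for problem in check_terms(TERMS, "EX"):
    print(problem)
```

Criteria such as OPENNESS, CONVERGENCE or USERS concern the community around an ontology and cannot be checked in this way.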

Foundry ontologies all work in the same way

all are built to represent the types existing in a pre-existing domain and the relations between these types in a way which can support reasoning

– we have data
– we need to make this data available for semantic search and algorithmic processing
– we create a consensus-based ontology for annotating the data
– and ensure that it can interoperate with Foundry ontologies for neighboring domains

Formal-Ontological Relations

is_a

part_of

located_at

depends_on

is_boundary_of

adjacent_to

To support integration of ontologies

relational expressions such as

is_a

part_of

...

should be used in the same way in all ontologies involved

to define these relations properly

we need to take account of both types and instances in reality

Kinds of relations

<instance, type>: Toronto instance_of city

<instance, instance>: Toronto part_of Ontario

<type, type>: waterfall part_of river

is_a

human is_a mammal

all instances of the type human are as a matter of necessity instances of the type mammal
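The instance-level reading of is_a can be sketched as follows, under the simplifying assumption that each type has at most one is_a parent. The dictionaries `IS_A` and `INSTANCE_OF` and both function names are hypothetical; the type names come from the slides.

```python
# Type-level is_a links (child type -> parent type).
IS_A = {
    "siamese": "cat",
    "cat": "mammal",
    "human": "mammal",
    "mammal": "animal",
    "animal": "organism",
}

# Instance-level assertions (instance -> most specific asserted type).
INSTANCE_OF = {"Michael Ashburner": "human"}


def ancestors(type_name):
    """All types reachable from type_name by following is_a upward."""
    found = []
    while type_name in IS_A:
        type_name = IS_A[type_name]
        found.append(type_name)
    return found


def is_instance_of(instance, type_name):
    """An instance of A is also an instance of every type that A is_a."""
    direct = INSTANCE_OF[instance]
    return type_name == direct or type_name in ancestors(direct)


print(ancestors("human"))                            # ['mammal', 'animal', 'organism']
print(is_instance_of("Michael Ashburner", "mammal"))  # True
print(is_instance_of("Michael Ashburner", "cat"))     # False
```

The code captures only the "all instances" part of the definition; the modal qualification "as a matter of necessity" is not something a lookup over asserted data can express.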

Cell Ontology (CL)
  Scope: cell types from prokaryotes to mammals
  URL: obo.sourceforge.net/cgi-bin/detail.cgi?cell
  Custodians: Jonathan Bard, Michael Ashburner, Oliver Hofman

Chemical Entities of Biological Interest (ChEBI)
  Scope: molecular entities
  URL: ebi.ac.uk/chebi
  Custodians: Paula Dematos, Rafael Alcantara

Common Anatomy Reference Ontology (CARO)
  Scope: anatomical structures in human and model organisms
  URL: (under development)
  Custodians: Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland

Foundational Model of Anatomy (FMA)
  Scope: structure of the human body
  URL: fma.biostr.washington.edu
  Custodians: JLV Mejino Jr., Cornelius Rosse

Functional Genomics Investigation Ontology (FuGO)
  Scope: design, protocol, data instrumentation, and analysis
  URL: fugo.sf.net
  Custodians: FuGO Working Group

Gene Ontology (GO)
  Scope: cellular components, molecular functions, biological processes
  URL: www.geneontology.org
  Custodians: Gene Ontology Consortium

Phenotypic Quality Ontology (PaTO)
  Scope: qualities of biomedical entities
  URL: obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value
  Custodians: Michael Ashburner, Suzanna Lewis, Georgios Gkoutos

Protein Ontology (PrO)
  Scope: protein types and modifications
  URL: (under development)
  Custodians: Protein Ontology Consortium

Relation Ontology (RO)
  Scope: relations
  URL: obo.sf.net/relationship
  Custodians: Barry Smith, Chris Mungall

RNA Ontology (RnaO)
  Scope: three-dimensional RNA structures
  URL: (under development)
  Custodians: RNA Ontology Consortium

Sequence Ontology (SO)
  Scope: properties and features of nucleic sequences
  URL: song.sf.net
  Custodians: Karen Eilbeck

[Figure: Foundational Model of Anatomy. A graph of is_a and part_of links among Anatomical Structure, Organ, Serous Sac, Pleural Sac, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Tissue, Organ Part, Organ Component, Organ Subdivision, Anatomical Space, Organ Cavity, Serous Sac Cavity, Pleural Cavity, Organ Cavity Subdivision, Serous Sac Cavity Subdivision and Interlobar recess]

Mature OBO Foundry ontologies now undergoing reform

Cell Ontology (CL)
Chemical Entities of Biological Interest (ChEBI)
Foundational Model of Anatomy (FMA)
Gene Ontology (GO)
Phenotypic Quality Ontology (PaTO)
Relation Ontology (RO)
Sequence Ontology (SO)

Ontologies being built to satisfy Foundry principles ab initio

Ontology for Clinical Investigations (OCI)
Common Anatomy Reference Ontology (CARO)
Ontology for Biomedical Investigations (OBI)
Protein Ontology (PRO)
RNA Ontology (RnaO)
Subcellular Anatomy Ontology (SAO)

Ontologies in planning phase

Biobank/Biorepository Ontology (BrO, part of OBI)
Environment Ontology (EnvO)
Immunology Ontology (ImmunO)
Infectious Disease Ontology (IDO)
Mouse Adult Neurogenesis Ontology (MANGO)

OBO Foundry Success Story

Model organism research seeks results valuable for the understanding of human disease.

This requires the ability to make reliable cross-species comparisons, and for this anatomy is crucial.

But different MOD communities have developed their anatomy ontologies in uncoordinated fashion.

Ontologies facilitate grouping of annotations

brain 20, hindbrain 15, rhombomere 10

Query brain without ontology: 20
Query brain with ontology: 45
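The arithmetic behind the two query results can be sketched as follows. The slide gives only the three counts; the part_of links used here (rhombomere part_of hindbrain, hindbrain part_of brain) are an assumption about the underlying anatomy ontology, and the names `PART_OF`, `COUNTS` and `query` are hypothetical.

```python
# Assumed part_of links (part -> immediate whole).
PART_OF = {"rhombomere": "hindbrain", "hindbrain": "brain"}

# Number of annotations made directly to each term (from the slide).
COUNTS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}


def is_part_of(term, whole):
    """True if `whole` is reachable from `term` via part_of links."""
    while term in PART_OF:
        term = PART_OF[term]
        if term == whole:
            return True
    return False


def query(term, use_ontology):
    """Count annotations to a term, optionally including its parts."""
    total = COUNTS.get(term, 0)
    if use_ontology:
        total += sum(n for t, n in COUNTS.items() if is_part_of(t, term))
    return total


print(query("brain", use_ontology=False))  # 20
print(query("brain", use_ontology=True))   # 45 = 20 + 15 + 10
```

Without the ontology a query sees only annotations made to the exact term; with it, annotations to parts are grouped under the whole.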

CARO – Common Anatomy Reference Ontology

for the first time provides guidelines for model organism researchers who wish to achieve comparability of annotations

for the first time provides guidelines for those new to ontology work

See Haendel et al., “CARO: The Common Anatomy Reference Ontology”, in: Burger (ed.), Anatomy Ontologies for Bioinformatics: Springer, in press.

CARO-conformant ontologies already in development:

Fish Multi-Species Anatomy Ontology (NSF funding received)
Ixodidae and Argasidae (Tick) Anatomy Ontology
Mosquito Anatomy Ontology (MAO)
Spider Anatomy Ontology
Xenopus Anatomy Ontology (XAO)

undergoing reform: Drosophila and Zebrafish Anatomy Ontologies

Part 3: The Hole Story

The Ontology of Environments

Initial hypothesis: Environments are holes

environment, place, site, niche, habitat, setting, hole, spatial region, interior, location

Places are holes

[OBO Foundry coverage table repeated: granularity (organ and organism; cell and cellular component; molecule) by relation to time (independent continuant, dependent continuant, occurrent)]

No place for environments

A Neglected Major Category in Ontologies thus far

Things (e.g. organisms)

Qualities / Features

Functions

Processes

Environments = that into which organisms (etc.) fit

[OBO Foundry coverage table repeated, with the added label "environments are here"]

Environments are holes in which organisms, cells, molecules ... can live

Environments are holes

Double Hole Structure of the Occupied Niche

Medium (filling the environing hole)

Tenant (occupying the central hole)

Retainer (a boundary of some surrounding structure)

Tenant, medium and retainer

the medium of the bear’s niche is a

circumscribed body of air

medium might be body of water, cytosol, nasal mucosa, epithelium, endocardium,

synovial tissue ...

The Empty Niche

Fiat boundary Physical boundary

Two Types of Boundary

Positive and negative parts

positive part (made of matter)

negative part or hole (not made of matter)

Four Basic Niche Types (Niche as generalized hole)

1: a womb; an egg; a house (better: the interior thereof)
2: a snail’s shell
3: the niche of a pasturing cow
4: the niche around a circling buzzard (fiat boundary)

Types of relations for EnvO

in

on (surface of)

surrounds

lives_in

attaches to

realizes

occupies (spatial region)

...

Lexical Semantics

the fruit is in the bowl
the bird is in the nest
the lion is in the cage
the pencil is in the cup
the fish is in the river
the river is in the valley
the water is in the lake
the car is in the garage
the fetus is in the cavity in the uterine lining
the colony of whooping cranes is in its breeding grounds

Double Hole Structure

Medium (filling the environing hole)

Tenant (occupying the central hole)

Retainer (a boundary of some surrounding structure)

when a tenant leaves its niche the gap left by the tenant is filled immediately by the surrounding medium

A hole in the ground

Solid physical boundaries at the floor and walls

but with a fiat lid:

hole

Part 4: Not every hole is an environment

An environment is a special kind of (generalized) hole

but what kind?

Elton – niche as role

the ‘niche’ of an animal means its place in the biotic environment, its relations to food and enemies. [...] When an ecologist says ‘there goes a badger’ he should include in his thoughts some definite idea of the animal’s place in the community to which it belongs, just as if he had said ‘there goes the vicar’ (Elton 1927, pp. 63f.)

G.E. Hutchinson: niche as volume in a functionally defined space

the niche = an n-dimensional hyper-volume whose dimensions correspond to resource gradients over which species are distributed

G.E. Hutchinson (1957, 1965)

Hypervolume niche = a location in an attribute space

defined by a specific constellation of environmental variables such as degree of slope, exposure to sunlight, soil fertility, foliage density, salinity...
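A minimal sketch of the hypervolume idea, assuming the simplest possible shape: an axis-aligned box with one tolerance range per environmental variable. Hutchinson's hypervolume need not be box-shaped, and the variable names and numeric ranges below are invented for illustration only.

```python
# Hypothetical tolerance ranges (low, high), one per niche dimension.
NICHE = {
    "slope_degrees": (0.0, 15.0),
    "salinity_ppt": (0.0, 5.0),
    "sunlight_hours": (4.0, 10.0),
}


def in_niche(site, niche=NICHE):
    """True if every measured variable of the site lies within its range."""
    return all(low <= site[var] <= high
               for var, (low, high) in niche.items())


# Two hypothetical sites described by the same variables.
meadow = {"slope_degrees": 5.0, "salinity_ppt": 1.0, "sunlight_hours": 6.0}
salt_flat = {"slope_degrees": 1.0, "salinity_ppt": 30.0, "sunlight_hours": 9.0}

print(in_niche(meadow))     # True
print(in_niche(salt_flat))  # False
```

Note that a site in this sketch is only a point in attribute space; nothing in it says where the site is located, which is the gap between the functional and the spatial approaches to the niche.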

Niche Construction

Lewontin: niches normally arise in symbiosis with the activities of organisms or groups of organisms (“ecosystem engineering”);

they are not already there, like vacant rooms in a gigantic evolutionary hotel, awaiting organisms who would evolve into them. (The Triple Helix: Gene, Organism, and Environment)

Part Last: Bringing Together the Spatial and Functional Approaches to Environment Ontology

The environment is not a location in an attribute space, but it must have features which have such a location

Every environment must have some spatial location

The functional niche presupposes the spatial-structural niche

Ontology of environment + ontology of associated environmental features

J. J. Gibson’s Ecological Psychology

The terrestrial environment is [best] described in terms of a medium, substances, and the surfaces that separate them. (Gibson 1979, p. 16)

Gibson’s theory of surface layout

‘a sort of applied geometry that is appropriate for the study of perception and behavior’ (1979, p. 33)

ground, open environment, enclosure, detached object, attached object, hollow object, place, sheet, fissure, stick, fiber, dihedral, etc.

Gibson’s theory of surface layout as an anatomy of environments

• systems of barriers, doors, pathways to which the behavior of organisms is specifically attuned,

• temperature gradients, patterns of movement of air or water molecules

• water holes, food sources (features)

• apertures (mouths, sphincters ...)

Two sets of issues

Environments, as spatial structures, and their parts

Environmental attributes (qualities, functions), determining multidimensional loci à la Hutchinson

Aim

To define structural properties such as: open, closed, connected, compact, spatial coincidence, integrity, aggregate, boundary

RCC (Region Connection Calculus) plus extensions
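To give a feel for the kind of relations RCC provides, the sketch below classifies the eight base relations of RCC8 for a deliberately simplified case: regions modelled as closed one-dimensional intervals with lo < hi. Real environments would need regions in two or three dimensions; the interval model and the function name `rcc8` are assumptions made only for this illustration.

```python
def rcc8(a, b):
    """Classify the RCC8 relation between closed intervals a and b.

    Each interval is a (lo, hi) pair with lo < hi.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_hi < b_lo or b_hi < a_lo:
        return "DC"      # disconnected
    if a_hi == b_lo or b_hi == a_lo:
        return "EC"      # externally connected: boundaries touch only
    if (a_lo, a_hi) == (b_lo, b_hi):
        return "EQ"      # equal
    if b_lo <= a_lo and a_hi <= b_hi:
        # a is a proper part of b; tangential if it shares a boundary.
        return "TPP" if a_lo == b_lo or a_hi == b_hi else "NTPP"
    if a_lo <= b_lo and b_hi <= a_hi:
        # b is a proper part of a (the inverse relations).
        return "TPPi" if a_lo == b_lo or a_hi == b_hi else "NTPPi"
    return "PO"          # partial overlap


print(rcc8((0, 1), (2, 3)))  # DC
print(rcc8((0, 1), (1, 2)))  # EC
print(rcc8((0, 2), (1, 3)))  # PO
print(rcc8((1, 2), (0, 3)))  # NTPP
```

Relations such as in, surrounds and occupies from the EnvO list could be approached along similar lines, although the sketch itself makes no claim about how EnvO defines them.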

Ecological Niche Concepts

niche as particular place or subdivision of an environment that an organism or population occupies

vs.

niche as function of an organism or population within an ecological community

Next steps

Our data needs are to link niche features with geo-locations

Scale: From geographic to microbiological

From locations of organisms/samples, sources of museum artifacts ...

to organism interactions, e.g. on bacterial infection – how the interior of one organism or organism part serves as environment for another organism

Hosts for bacterial infection

(interior of) lung
blood (bacteremia)
erythrocyte: plasmodium inhabits red blood cells
hepatocyte: plasmodium infects liver cells
macrophage
gut and oral mucosa, nasal mucosa, vaginal mucosa
kidney
bladder
portion of epithelial tissue

C: bacteria (arrows) adhering to and penetrating the epithelial cells (×3,000)

D: abscess (Ab) formation in subepithelial region with a colony of bacteria (arrows) and a red blood cell (RBC) in it (×2,000)

[OBO Foundry coverage table repeated: granularity (organ and organism; cell and cellular component; molecule) by relation to time (independent continuant, dependent continuant, occurrent)]

Environments, environment parts (features), environment qualities

Ontologies needed

Environment (Taxonomy): place, habitat, city, farm, building (interior), oral cavity, uterine cavity, gut ...

Environment part (Anatomy of environments: surface, conduit, entry ...): city wall, uterine wall, water source, ...

Environment function: protection, supply of food, ...

Environment quality (Phenotypes): ambient temperature, salinity, ...
