Intelligence Ontology: A Strategy for the Future
Barry Smith, University at Buffalo
http://ontology.buffalo.edu/smith
NCOR: National Center for Ontological Research
The goal of NCOR is to advance ontological research and applications by
promoting the creation and use of high quality ontologies. It thus supports
the development of metrics for ontology evaluation designed to bring about
an evolutionary improvement in ontology quality. NCOR serves as a
vehicle to coordinate, to enhance, to publicize, and to seek funding for
ontology-related activities in fields such as scientific research, intelligence
analysis and multisource information fusion, qualitative spatiotemporal
reasoning, and terminological systems. It organizes conferences and
research groups focusing on specific aspects of ontology development,
facilitates the exchange of research personnel for short- and long-term
visits, and participates in nationally and internationally funded research
networks.
http://ncor.us
OIC-2007 Proceedings: http://ceur-ws.org, Volume 299
OIC-2007: Ontology for the Intelligence Community
Towards effective exploitation and integration of intelligence resources
November 28-29, 2007 · Columbia, MD
ECOR – European Center for Ontological Research
JCOR – Japanese Center for Ontological Research (inaugural meeting in Tokyo, February 26-27, 2008)
NCOR
• Science-based ontology evaluation to create an evolutionary path towards ontology improvement
• Ontology interoperability
• Promotion of best practice
The problem we face in biology...
MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANV
How to do biology across the genome
• where in the body?
• where in the cell?
• what kind of disease process?
• how was the data collected?
ontologies = high quality controlled structured vocabularies for the annotation (description) of data
annotating images
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; shared annotation term: sphingolipid transporter activity]
The OBO Foundry Idea
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; shared annotation term: Holliday junction helicase complex]
The OBO Foundry Idea
Broad-coverage semantic annotation systems that enable intelligent
integration of gigantic bodies of heterogeneous data also need to be
created outside biology.
geospatial, transport, religion, weather, ethnicity, chemicals, politics, law
using common rules drawing on best practices for creating ontologies
... and for linking ontologies
In areas such as:
geospatial, transport, religion, weather, ethnicity, chemicals, politics, law
exploiting the division of labor
relying on champions in dispersed communities to invest in public-domain resources
We will also need:
ontology of documents
ontology of provenance
ontology of names
ontology of numbers (IDs)
ontology of signatures
ontology of identity
...
Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck
The OBO Foundry (http://obofoundry.org)
Columns: RELATION TO TIME. Rows: GRANULARITY.
GRANULARITY | CONTINUANT: INDEPENDENT | CONTINUANT: DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
Ontologies facilitate grouping of annotations
Annotations per term: brain 20, hindbrain 15, rhombomere 10
Query "brain" without ontology: 20
Query "brain" with ontology: 45
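The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration, not the method of any particular OBO tool. It assumes that the three terms are linked by part_of (rhombomere part_of hindbrain part_of brain), which the slide implies but does not state, and the names PART_OF, DIRECT and rolled_up_counts are hypothetical.

```python
from collections import defaultdict

# Assumed part_of links (child -> parent); the slide does not name the relation.
PART_OF = {"rhombomere": "hindbrain", "hindbrain": "brain"}

# Direct annotation counts taken from the slide.
DIRECT = {"brain": 20, "hindbrain": 15, "rhombomere": 10}


def ancestors_or_self(term, part_of):
    """Yield the term itself, then every term it is transitively part of."""
    seen = set()
    while term is not None and term not in seen:
        seen.add(term)
        yield term
        term = part_of.get(term)


def rolled_up_counts(direct, part_of):
    """Credit each term's annotations to the term and to all of its wholes."""
    totals = defaultdict(int)
    for term, count in direct.items():
        for target in ancestors_or_self(term, part_of):
            totals[target] += count
    return dict(totals)


totals = rolled_up_counts(DIRECT, PART_OF)
print(DIRECT["brain"])  # query without ontology: 20
print(totals["brain"])  # query with ontology: 45
```

Without the hierarchy, a query for "brain" matches only the 20 records annotated with that exact term. With it, the 15 hindbrain and 10 rhombomere records are also returned, giving 45.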
All OBO Foundry ontologies work in the same way
• we have data (biosample, haplotype, clinical data, survey data, ...)
• we need to make this data available for semantic search and algorithmic processing
• we create a consensus-based ontology for annotating the data
to enhance alignment of data about instances (communities, places, ...)
Community / Population Ontology
• family, clan
• ethnicity
• religion
• diet
• social networking
• education (literacy ...)
• healthcare (economics ...)
• ...
• household forms
• demography
• public health
Columns: RELATION TO TIME. Rows: GRANULARITY.
GRANULARITY | CONTINUANT: INDEPENDENT | CONTINUANT: DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Family, Community, Deme, Population; Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
http://obofoundry.org
Columns: RELATION TO TIME. Rows: GRANULARITY.
GRANULARITY | CONTINUANT: INDEPENDENT | CONTINUANT: DEPENDENT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
http://obofoundry.org
Columns: RELATION TO TIME. Rows: GRANULARITY.
GRANULARITY | CONTINUANT: INDEPENDENT | CONTINUANT: DEPENDENT | OCCURRENT
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cell Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
ENVIRONMENT
http://obofoundry.org
OBO Foundry
Genomic Standards Consortium
Natural Environment Research Council (UK)
Barcode of Life Project
Encyclopedia of Life Project
The Environment Ontology
Applications of EnvO in Biology
Support the annotation of metadata related to:
• Data about biological samples produced from various technologies
  • Metagenomics, Metabolomics, Proteomics, Transcriptomics, Genomics...
• Data produced from remote sensing equipment
• Images
  • Web 2.0, tagging
• Physical holdings
  • Museum artifacts, (preserved) biological samples / organisms
• ...anything that has an environment
How EnvO currently works for information retrieval
Retrieve all experiments on organisms obtained from:
• deep-sea thermal vents
• arctic ice cores
• rainforest canopy
• alpine melt zone
Retrieve all data on organisms sampled from:
• hot and dry environments
• cold and wet environments
• a height above 5,000 meters
Retrieve all the omic data from soil organisms subject to:
• moderate heavy metal contamination
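Queries of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how EnvO or any EnvO-based system is implemented. The is_a links, the term names, the sample records and the field altitude_m are all hypothetical and are not taken from the actual ontology.

```python
# Hypothetical is_a links (child -> parents); not actual EnvO content.
IS_A = {
    "deep-sea thermal vent": ["marine environment"],
    "sandy desert": ["hot and dry environment"],
    "rocky desert": ["hot and dry environment"],
    "rainforest canopy": ["forest environment"],
}

# Hypothetical sample records annotated with one environment term each.
SAMPLES = [
    {"id": "S1", "env": "sandy desert", "altitude_m": 300},
    {"id": "S2", "env": "rainforest canopy", "altitude_m": 120},
    {"id": "S3", "env": "rocky desert", "altitude_m": 5200},
    {"id": "S4", "env": "deep-sea thermal vent", "altitude_m": -2500},
]


def is_subsumed_by(term, query, is_a):
    """Return True if term equals query or reaches it through is_a links."""
    stack, seen = [term], set()
    while stack:
        current = stack.pop()
        if current == query:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return False


def samples_from(query, samples, is_a):
    """IDs of samples whose environment term falls under the query term."""
    return [s["id"] for s in samples if is_subsumed_by(s["env"], query, is_a)]


def samples_above(min_altitude_m, samples):
    """IDs of samples collected above the given height in meters."""
    return [s["id"] for s in samples if s["altitude_m"] > min_altitude_m]


hot_dry = samples_from("hot and dry environment", SAMPLES, IS_A)
high = samples_above(5000, SAMPLES)
print(hot_dry)  # ['S1', 'S3']
print(high)     # ['S3']
```

The query for "hot and dry environment" returns samples annotated with more specific terms even though no record carries the general term itself. That is the retrieval benefit the slide points to.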
Environment = totality of circumstances external to a living organism or group of organisms
• pH
• evapotranspiration
• turbidity
• available light
• predominant vegetation
• predatory pressure
• nutrient limitation …
extend EnvO to the clinical domain
neighborhood patterns
• built environment, living conditions
• climate
• social networking
• crime, transport
• education, religion, work
• health, hygiene
disease patterns
• bio-environment (bacteriological, ...)
• patterns of disease transmission (links to IDO)
works in tandem with
GAZ.obo: An Open Source Gazetteer Constructed on Ontological Principles
http://darwin.nerc-oxford.ac.uk/gc_wiki/index.php/EnvO_Project
The Environment Ontology