NIFSTD AND NEUROLEX: DEVELOPMENT OF A COMPREHENSIVE NEUROSCIENCE ONTOLOGY Fahim IMAM, Stephen LARSON, Giorgio ASCOLI, Gordon SHEPHERD, Anita BANDROWSKI, Jeffrey S. GRETHE, Amarnath GUPTA, Maryann E. MARTONE University of California, San Diego, CA George Mason University, Fairfax, VA Yale University, New Haven, CT ICBO Workshop 2011 July 26, 2011 Funded in part by the NIH Neuroscience Blueprint HHSN271200800035C via NIDA NEUROSCIENCE INFORMATION FRAMEWORK


NIFSTD AND NEUROLEX: DEVELOPMENT OF A COMPREHENSIVE NEUROSCIENCE ONTOLOGY

Fahim IMAM, Stephen LARSON, Giorgio ASCOLI, Gordon SHEPHERD, Anita BANDROWSKI, Jeffrey S. GRETHE, Amarnath GUPTA, Maryann E. MARTONE

University of California, San Diego, CA George Mason University, Fairfax, VA

Yale University, New Haven, CT

ICBO Workshop 2011

July 26, 2011

Funded in part by the NIH Neuroscience Blueprint HHSN271200800035C via NIDA

NEUROSCIENCE INFORMATION FRAMEWORK


NIF: DISCOVER AND UTILIZE WEB-BASED NEUROSCIENCE RESOURCES

A portal for finding and using neuroscience resources

A consistent framework for describing resources

Provides simultaneous search of multiple types of information, organized by category

NIFSTD Ontology, a critical component

Enables concept-based search

UCSD, Yale, Cal Tech, George Mason, Harvard MGH

Supported by NIH Blueprint


The Neuroscience Information Framework (NIF), http://neuinfo.org


NIF STANDARD ONTOLOGIES (NIFSTD)

• Set of modular ontologies
  – Covering neuroscience-relevant terminologies
  – Comprehensive: ~60,000 distinct concepts + synonyms
• Expressed in the OWL-DL language
  – Supported by common DL reasoners
• Closely follows OBO community best practices
• Avoids duplication of effort
  – Standardized to the same upper-level ontologies
    • e.g., Basic Formal Ontology (BFO), OBO Relations Ontology (OBO-RO), Phenotypic Quality Ontology (PATO)
  – Relies on existing community ontologies, e.g., ChEBI, GO, PRO, OBI, etc.
• Modules cover orthogonal domains, e.g., Brain Regions, Cells, Molecules, Subcellular Parts, Diseases, Nervous System Functions, etc.

Bill Bug et al.


NIFSTD EXTERNAL COMMUNITY SOURCES

Domain | External Source | Import/Adapt | Module
Organism taxonomy | NCBI Taxonomy, GBIF, ITIS, IMSR, Jackson Labs mouse catalog | Adapt | NIF-Organism
Molecules | IUPHAR ion channels and receptors, Sequence Ontology (SO), ChEBI, and Protein Ontology (PRO); pending: NCBI Entrez Protein, NCBI RefSeq, NCBI Homologene, NIDA drug lists | Adapt IUPHAR, ChEBI; Import PRO, SO | NIF-Molecule, NIF-Chemical
Sub-cellular | Sub-cellular Anatomy Ontology (SAO). Extracted cell parts and subcellular structures. Imported GO Cellular Component | Import | NIF-Subcellular
Cell | CCDB, NeuronDB, NeuroMorpho.org terminologies; pending: OBO Cell Ontology | Adapt | NIF-Cell
Gross Anatomy | NeuroNames extended by including terms from BIRN, SumsDB, BrainMap.org, etc.; multi-scale representation of nervous system macroscopic anatomy | Adapt | NIF-GrossAnatomy
Nervous system function | Sensory, Behavior, Cognition terms from NIF, BIRN, BrainMap.org, MeSH, and UMLS | Adapt | NIF-Function
Nervous system dysfunction | Nervous system disease from MeSH, NINDS terminology; Disease Ontology (DO) | Adapt/Import | NIF-Dysfunction
Phenotypic qualities | PATO is imported as part of the OBO Foundry core | Import | NIF-Quality
Investigation: reagents | Overlaps with molecules above, especially RefSeq for mRNA | Import | NIF-Investigation
Investigation: instruments, protocols | Based on the Ontology for Biomedical Investigations (OBI) to include entities for biomaterial transformations, assays, data transformations | Adapt | NIF-Investigation
Investigation: Resource | NIF, OBI, NITRC, Biomedical Resource Ontology (BRO) | Adapt | NIF-Resource
Biological Process | Gene Ontology’s (GO) biological process in whole | Import | NIF-BioProcess
Cognitive Paradigm | Cognitive Paradigm Ontology (CogPO) | Import | NIF-Investigation


IMPORTING OR ADAPTING A NEW ONTOLOGY OR VOCABULARY SOURCE

Source | Import/adapt
A source that is already in OWL, uses the OBO-RO and the BFO, and is orthogonal to existing modules | The import simply involves adding an owl:imports statement
An existing orthogonal ontology that is in OWL but does not use the same foundational ontologies as NIFSTD | An ontology-bridging module (explained later) is constructed, declaring the deep-level semantic equivalencies such as foundational objects and processes
An external source that meets either of the above two conditions but is observed to be too large for NIF’s scope of interest | A relevant subset is extracted; MIREOT principles have been adopted
An external source that has not been represented in OWL, or does not use the same foundation as NIFSTD | The terminology is adapted to OWL/RDF in the context of the NIFSTD foundational layer ontologies
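
The four rules above can be read as a small decision procedure. The following is a minimal sketch of one such reading, not NIF's actual tooling: the `Source` fields, the `Strategy` names, and the order in which the rules are checked are assumptions made for illustration, and the source is assumed to be orthogonal to the existing modules.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    IMPORT = "add an owl:imports statement"
    BRIDGE = "construct an ontology-bridging module"
    SUBSET = "extract a relevant subset (MIREOT)"
    ADAPT = "adapt the terminology to OWL/RDF"


@dataclass(frozen=True)
class Source:
    """Hypothetical summary of an external source; field names are illustrative."""
    in_owl: bool
    uses_bfo_and_ro: bool
    too_large: bool = False


def choose_strategy(src: Source) -> Strategy:
    """Pick an integration strategy for a source assumed orthogonal to existing modules."""
    if not src.in_owl:
        # Not yet in OWL: re-express it under the NIFSTD foundational layer.
        return Strategy.ADAPT
    if src.too_large:
        # Importable in principle, but larger than NIF needs.
        return Strategy.SUBSET
    if not src.uses_bfo_and_ro:
        # In OWL but built on a different foundation (assumed reading of rule 2).
        return Strategy.BRIDGE
    return Strategy.IMPORT
```

For example, `choose_strategy(Source(in_owl=True, uses_bfo_and_ro=False))` returns `Strategy.BRIDGE`. The slide lists "does not use the same foundation" under both the bridging rule and the adaptation rule; this sketch resolves that overlap by bridging only sources that are already in OWL.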


NIFSTD DESIGN PRINCIPLES

• Single Inheritance for Named Classes
  – Follows a simple inheritance principle for named classes
  – An asserted named class can have only one named class as its superclass
  – Promotes the named classes to be univocal and to avoid ambiguities
• Classes with multiple named superclasses
  – Can be inferred using automated reasoners
  – Saves a great deal of manual labor and minimizes human errors
• Alan Rector’s Normalization principles.
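
The single-inheritance rule is easy to check mechanically on an asserted hierarchy. The sketch below assumes the hierarchy has already been exported as a plain mapping from class name to its asserted named superclasses; the function name and the "Example class" entry are hypothetical.

```python
from typing import Dict, List


def find_multi_parent_classes(asserted: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return classes whose asserted hierarchy lists more than one named superclass."""
    return {cls: parents for cls, parents in asserted.items() if len(parents) > 1}


# Illustrative asserted hierarchy; "Example class" is a made-up violation.
asserted_hierarchy = {
    "Cerebellum Purkinje cell": ["Neuron"],
    "GABAergic neuron": ["Neuron"],
    "Example class": ["Neuron", "Cell part"],
}

violations = find_multi_parent_classes(asserted_hierarchy)
```

Here `violations` contains only "Example class". Under the principle above, any additional parents would be left for a reasoner to infer rather than asserted by hand.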


DESIGN PRINCIPLES

• Unique Identifiers and Annotation Properties
  – NIFSTD entities are identified by a unique identifier and accompanied by a variety of annotation properties
    • Derived from the Dublin Core Metadata (DC) and Simple Knowledge Organization System (SKOS) models
    • Synonyms, acronyms, definition, defining source, etc.
  – The same URI is reused for classes MIREOTed from an external source
    • Avoids extra mapping annotations; e.g., class identifiers remain unaltered


DESIGN PRINCIPLES

• Annotation properties associated with versioning at different levels of content
  – Creation and modification dates
  – File-level versioning for each of the modules
  – Annotations for retiring antiquated concept definitions
    • hasFormerParentClass, isReplacedByClass, etc.
    • Track the former ontology graph position and replacement concepts
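
A consumer of these retirement annotations typically needs to follow replacement links until it reaches a current concept. The sketch below assumes the replacement annotations have been extracted into a plain dictionary; the identifiers and the function name are hypothetical.

```python
from typing import Dict


def resolve_current(term_id: str, replaced_by: Dict[str, str]) -> str:
    """Follow replacement annotations from a retired concept to its current successor."""
    seen = set()
    while term_id in replaced_by:
        if term_id in seen:
            # Guard against accidental loops in the annotations.
            raise ValueError(f"replacement cycle detected at {term_id!r}")
        seen.add(term_id)
        term_id = replaced_by[term_id]
    return term_id


# Hypothetical identifiers: ex:A was retired in favor of ex:B, and later ex:B in favor of ex:C.
replacements = {"ex:A": "ex:B", "ex:B": "ex:C"}
current = resolve_current("ex:A", replacements)
```

A concept with no replacement annotation resolves to itself, so the same call works for current and retired identifiers alike.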


DESIGN PRINCIPLES

• Object Properties and Bridge Modules
  – Mostly drawn from the OBO Relations Ontology (OBO-RO)
  – Intra-module relations are kept within the same module
    • ONLY universal restrictions are considered
      – e.g., partonomy relations within different brain regions
  – The cross-module relations are specified in separate bridging modules
    • Modules that only contain logical restrictions on a set of classes assigned between multiple modules
    • Allows main domain modules (e.g., anatomy, cell type, etc.) to remain independent of one another


DESIGN PRINCIPLES

• Helps keep the modularity principles intact
• Facilitates extensions for broader communities without NIF-centric views
• These bridging modules can easily be excluded in order to focus on core modules

Two example bridging modules in NIFSTD
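
The effect of keeping cross-module restrictions in separate bridging modules can be sketched as follows. This is a toy model, not the NIFSTD file layout: modules are represented as sets of axiom strings, and the module names and axioms are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Module:
    name: str
    axioms: FrozenSet[str]
    is_bridge: bool = False  # True when the module only relates classes across modules


def assemble(modules: Iterable[Module], include_bridges: bool = True) -> Set[str]:
    """Union the axioms of the selected modules, optionally skipping bridging modules."""
    axioms: Set[str] = set()
    for module in modules:
        if module.is_bridge and not include_bridges:
            continue
        axioms |= module.axioms
    return axioms


# Illustrative modules; the axiom strings are informal placeholders.
MODULES = [
    Module("cell", frozenset({"'Cerebellum Purkinje cell' is_a 'Neuron'"})),
    Module("anatomy", frozenset({"'Cerebellar cortex' part_of 'Cerebellum'"})),
    Module(
        "cell-anatomy-bridge",
        frozenset({
            "'Cerebellum Purkinje cell' has_part some "
            "('Soma' and (part_of some 'Purkinje cell layer of cerebellar cortex'))"
        }),
        is_bridge=True,
    ),
]

core_only = assemble(MODULES, include_bridges=False)
full = assemble(MODULES)
```

Because the only axiom that mentions both a cell class and a brain region lives in the bridge, `core_only` leaves the cell and anatomy modules independent of one another.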


TYPICAL KNOWLEDGE MODEL

A typical knowledge model in NIFSTD. Both cross-modular and intra-modular classes are associated through object properties mostly drawn from the OBO Relations ontology (RO).


An Analogy

[Figure: two contrasted cases labeled "Easier" and "Difficult"]


TYPICAL USE OF ONTOLOGY IN NIF

• Basic features of an ontology
  – Organizing the concepts involved in a domain into a hierarchy, and
  – Precisely specifying how the classes are ‘related’ to each other (i.e., logical axioms)
• Explicit knowledge is asserted, but implicit logical consequences can be inferred
  – A powerful feature of an ontology


ONTOLOGY – ASSERTED HIERARCHY

Class name | Asserted necessary conditions
Cerebellum Purkinje cell | 1. Is a ‘Neuron’; 2. Its soma lies within ‘Purkinje cell layer of cerebellar cortex’; 3. It has ‘Projection neuron role’; 4. It uses ‘GABA’ as a neurotransmitter; 5. It has ‘Spiny dendrite quality’

Class name | Asserted defining (necessary & sufficient) expression
Cerebellum neuron | Is a ‘Neuron’ whose soma lies in any part of the ‘Cerebellum’ or ‘Cerebellar cortex’
Principal neuron | Is a ‘Neuron’ which has ‘Projection neuron role’, i.e., a neuron whose axon projects out of the brain region in which its soma lies
GABAergic neuron | Is a ‘Neuron’ that uses ‘GABA’ as a neurotransmitter
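
The step from these asserted conditions to inferred superclasses can be imitated with a toy classifier. This sketch is a stand-in for what a DL reasoner does over OWL axioms, not a reasoner itself: the property keys, the `PART_OF` chain, and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set

# Assumed partonomy for illustration: layer -> cortex -> cerebellum.
PART_OF = {
    "Purkinje cell layer of cerebellar cortex": "Cerebellar cortex",
    "Cerebellar cortex": "Cerebellum",
}

# Asserted necessary conditions of the class, keyed by hypothetical property names.
PURKINJE_CELL = {
    "is_a": "Neuron",
    "soma_located_in": "Purkinje cell layer of cerebellar cortex",
    "has_role": "Projection neuron role",
    "has_neurotransmitter": "GABA",
    "has_quality": "Spiny dendrite quality",
}


def enclosing_regions(region: Optional[str]) -> Set[Optional[str]]:
    """Return the region together with every region it is transitively part of."""
    chain = [region]
    while chain[-1] in PART_OF:
        chain.append(PART_OF[chain[-1]])
    return set(chain)


# Defining (necessary and sufficient) expressions, written as predicates.
DEFINED_CLASSES = {
    "Cerebellum neuron": lambda c: c.get("is_a") == "Neuron"
    and bool(enclosing_regions(c.get("soma_located_in")) & {"Cerebellum", "Cerebellar cortex"}),
    "Principal neuron": lambda c: c.get("is_a") == "Neuron"
    and c.get("has_role") == "Projection neuron role",
    "GABAergic neuron": lambda c: c.get("is_a") == "Neuron"
    and c.get("has_neurotransmitter") == "GABA",
}


def infer_superclasses(conditions: Dict[str, str]) -> List[str]:
    """List every defined class whose defining expression the conditions satisfy."""
    return sorted(name for name, test in DEFINED_CLASSES.items() if test(conditions))


inferred = infer_superclasses(PURKINJE_CELL)
```

With the data above, `inferred` lists all three defined classes, although the asserted hierarchy only says that a Cerebellum Purkinje cell is a ‘Neuron’.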


NIF CONCEPT-BASED SEARCH

• Search Google: GABAergic neuron
• Search NIF: GABAergic neuron
  – NIF automatically searches for types of GABAergic neurons

Types of GABAergic neurons
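
One way to picture concept-based search is as query expansion over the inferred hierarchy. The following is a minimal sketch of that idea under stated assumptions, not NIF's search implementation: the subclass table, the synonym "Purkinje neuron", and the function name are hypothetical.

```python
from collections import deque
from typing import Dict, List

# Hypothetical inferred hierarchy and synonym table, for illustration only.
INFERRED_SUBCLASSES: Dict[str, List[str]] = {
    "GABAergic neuron": ["Cerebellum Purkinje cell"],
}
SYNONYMS: Dict[str, List[str]] = {
    "Cerebellum Purkinje cell": ["Purkinje neuron"],
}


def expand_query(term: str) -> List[str]:
    """Expand a search term with its synonyms and all of its inferred subclasses."""
    expanded: List[str] = []
    queue = deque([term])
    seen = {term}
    while queue:
        current = queue.popleft()
        expanded.append(current)
        expanded.extend(SYNONYMS.get(current, []))
        for child in INFERRED_SUBCLASSES.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return expanded


terms = expand_query("GABAergic neuron")
```

A keyword-only search would match just the literal string, whereas the expanded list also reaches resources that mention a specific type of GABAergic neuron without using that phrase.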


NIFSTD CURRENT VERSION

• Key feature: Includes useful defined concepts to infer useful classification


NIFSTD AND NEUROLEX WIKI

• Semantic wiki platform
• Provides simple forms for structured knowledge
• Can add concepts, properties
• Generate hierarchies without having to learn complicated ontology tools
• Good teaching tool for principles behind ontologies
• Community can contribute

Stephen D. Larson et al.


NeuroLex vs. NIFSTD

NeuroLex | NIFSTD
A Semantic MediaWiki-based website containing the content of the NIFSTD plus additional community contributions | Collection of cohesive, unified modular ontologies deployed in OWL
Categories | Classes
Content is fluid and can be updated at any time | Structure is based on OBO Foundry principles
Defines relationships between categories as simple properties | Defines relationships between classes as OWL restrictions derived from RO

At-a-glance guide to the differences between NeuroLex and NIFSTD

Larson et al.


Top-Down vs. Bottom-Up

Top-down ontology construction (NIFSTD)
• A select few authors have write privileges
• Maximizes consistency of terms with each other
• Making changes requires approval and re-publishing
• Works best when the domain to be organized has: small corpus, formal categories, stable entities, restricted entities, clear edges
• Works best with participants who are: expert catalogers, coordinated users, expert users, people with an authoritative source of judgment

Bottom-up ontology construction (NeuroLex)
• Multiple participants can edit the ontology instantly
• Control of content is done after edits are made, based on the merit of the content
• Semantics are limited to what is convenient for the domain
• Not a replacement for top-down construction; sometimes necessary to increase flexibility
• Necessary when the domain has: large corpus, no formal categories, no clear edges
• Necessary when participants are: uncoordinated users, amateur users, naïve catalogers
• Neuroscience is a domain that is less formal, and neuroscientists are more uncoordinated

Larson et al.


http://neurolex.org/wiki/Special:ContributionScores

NEUROLEX WIKI CONTRIBUTIONS


NIFSTD/NEUROLEX CURATION WORKFLOW

‘has soma location’ in NeuroLex == ‘Neuron X’ has_part some (‘Soma’ and (part_of some ‘Brain region Y’)) in NIFSTD
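
The mapping above, from a simple NeuroLex property to a NIFSTD restriction, can be sketched as a template lookup. Only the ‘has soma location’ pattern comes from the slide; the table layout and the function name are assumptions, and the output is a Manchester-style string rather than serialized OWL.

```python
# Restriction patterns keyed by NeuroLex property name; only this one is from the slide.
RESTRICTION_TEMPLATES = {
    "has soma location": "'{subject}' has_part some ('Soma' and (part_of some '{value}'))",
}


def to_restriction(subject: str, prop: str, value: str) -> str:
    """Render the NIFSTD-style restriction for one NeuroLex property assertion."""
    try:
        template = RESTRICTION_TEMPLATES[prop]
    except KeyError:
        raise ValueError(f"no restriction pattern recorded for property {prop!r}") from None
    return template.format(subject=subject, value=value)


restriction = to_restriction(
    "Cerebellum Purkinje cell",
    "has soma location",
    "Purkinje cell layer of cerebellar cortex",
)
```

Keeping the patterns in one table means that a wiki contributor only fills in a simple property, while the more complex OWL expression is produced uniformly during curation.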


ACCESS TO NIFSTD CONTENTS

• NIFSTD is available as
  – OWL format: http://ontology.neuinfo.org
  – RDF and SPARQL endpoint: http://ontology.neuinfo.org/sparql-endpoint.html
• Specific contents through web services
  – http://ontology.neuinfo.org/ontoquest-service.html
• Available through NCBO BioPortal
  – Provides annotation and mapping services
  – http://bioportal.bioontology.org/
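
As an illustration of what could be sent to a SPARQL endpoint, the sketch below builds a standard SPARQL 1.1 query for a class and its subclasses. The function name and the example IRI are hypothetical, and nothing here is specific to the NIF endpoint, whose query URL and named graphs are not given on the slide.

```python
def subclass_query(class_iri: str) -> str:
    """Build a SPARQL 1.1 query listing a class and all of its subclasses with labels."""
    if any(ch in class_iri for ch in '<>" {}|\\^`'):
        # Reject characters that are not allowed inside a SPARQL IRI reference.
        raise ValueError("class_iri must be a plain IRI")
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?cls ?label WHERE {\n"
        f"  ?cls rdfs:subClassOf* <{class_iri}> .\n"
        "  OPTIONAL { ?cls rdfs:label ?label }\n"
        "}"
    )


# Hypothetical IRI, used only to show the shape of the generated query.
query = subclass_query("http://example.org/GABAergic_neuron")
```

The `rdfs:subClassOf*` property path follows asserted subclass triples only, so classes that a reasoner would infer appear in the results only if the store has materialized them.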


WORKING TO INCORPORATE COMMUNITY

• NeuroPsyGrid – http://www.neuropsygrid.org

• NDAR Autism Ontology – http://ndar.nih.gov

• Disease Phenotype Ontology– http://openccdb.org/wiki/index.php/Disease_Ontology

• Cognitive Paradigm Ontology (CogPO) – http://wiki.cogpo.org

• Neural ElectroMagnetic Ontologies (NEMO) – http://nemo.nic.uoregon.edu


SUMMARY AND CONCLUSIONS

• NIF with NIFSTD provides an example of how ontologies can be practically applied to enhance search and data integration across diverse resources

• We believe we have defined a process for attaching complex semantics to various neuroscience concepts through NIFSTD and the NeuroLex collaborative environment.

• NIF encourages the use of community ontologies

• Moving towards building a rich knowledge base for neuroscience that integrates with larger life science communities.


Point of Discussion

• Gaining OBO Foundry community consensus for a production system is difficult as we often need to move quickly along with the project

• We would rather favor a system in which we start with the minimal complexity required and add more as the ontologies evolve over time towards perfection

• What should be the most effective way to collaborate and gain community consensus?