Linking biodiversity data with the Biological Collections Ontology

Ramona Walls (iPlant Collaborative, University of Arizona)
John Deck (University of California at Berkeley)
Robert Guralnick (University of Colorado at Boulder)
John Wieczorek (University of California at Berkeley)



http://code.google.com/p/bco/

What it means to be an OBO Foundry Ontology

• Shared commitment to creating a suite of interoperable ontologies that span the biological and biomedical domains
  – non-redundancy
  – re-use of existing terms

• Adherence to OBO Foundry principles, including:
  – open access, willingness to collaborate
  – shared formats, relations, URIs, naming conventions
  – good documentation, single locus of authority

• Access to OBO Foundry community resources
  – tools
  – expertise

Scope of the BCO:

[Figure: three kinds of biodiversity data covered by the BCO, with the labels shown for each.]

• Environmental samples: transect, depth, sample collection point, water sample at depth X, aliquot, metagenome
• Collections of organisms and their parts (museum or voucher specimens)
• Surveys, ecological observations: plot, sub-plot, transect (within plot), individual (within plot), individual (within sub-plot)

Initial focus of BCO: tracking materials and data through sampling chains

[Figure: sampling chain starting from the Moorea Biocode bioinventory event. Node labels: Moorea Biocode bioinventory event; Insect specimen; Museum specimens; Tissue sample at Smithsonian Institution; Gut sample; Metagenomic sequences at CAMERA portal; Genbank sequence; Digital image stored on Morphbank; identification.]

[Figure: sampling chain annotated with ontology terms.]

KEY (edge types): subclass of; has specified output; has specified input; instance of; derives from

Ontology terms: BCO:material sampling process; BCO:identification process; BCO:material sample; BCO:taxonomic name; OBI:sequencing assay; OBI:sequence data; rdfs:Class

Example processes and individuals: Biocode Sampling; Tissue sampling; DNA extraction; Sequencing; Identification using key; Identification using BLAST; Tissue sample; DNA molecules; Genbank sequence B; TaxonID A; TaxonID B

Example data: processes

| Investigation | Study | process ID | process type | has input | has output | date |
|---|---|---|---|---|---|---|
| Moorea Biocode Project | | Moorea Biocode project | planned process | | | |
| Moorea Biocode Project | Moorea insect inventory | Moorea insect inventory | planned process | | | |
| Moorea Biocode Project | Moorea insect inventory | insect collection 01 | material sampling process | insect 01 | insect 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | tissue sampling 01 | material sampling process | insect 01 | tissue sample 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | insect gut sampling 04 | material sampling process | insect 01 | insect gut sample 04 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | insect gut sampling 05 | material sampling process | insect 02 | insect gut sample 05 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | dna isolation 01 | material sampling process | tissue sample 01 | DNA sample 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | dna isolation 04 | material sampling process | insect gut sample 04 | DNA sample 04 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | insect observation 06 | observing process | insect in situ 06 | image 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | identification 01.1 | tax. iden. by morph. key | insect 01 | insect taxon 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | identification 01.2 | tax. iden. using dna barcode | dna isolation 01 | insect taxon 01 | 2011 |
| Moorea Biocode Project | Moorea insect inventory | identification 04.1 | tax. iden. using BLAST | dna isolation 04 | microbial taxon 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | identification 06.1 | morph. tax. identification | image 01 | insect taxon 01 | 2010 |
| Moorea Biocode Project | Moorea insect inventory | identification 07.1 | morph. tax. identification | image 02 | insect taxon 02 | 2011 |

Example data: material entities and information artifacts

| Individual | Type | Inferred type |
|---|---|---|
| insect 01 | organism or virus or viroid | material sample |
| insect 02 | organism or virus or viroid | material sample |
| tissue sample 01 | organism part | material sample |
| tissue sample 02 | organism part | material sample |
| insect gut sample 04 | material entity | material sample |
| DNA sample 01 | DNA | material sample |
| DNA sample 02 | DNA | material sample |
| insect in situ 06 | organism or virus or viroid | material target of observation |
| image 01 | photographic image | information artifact |
| insect taxon 01 | taxonomic name | information artifact |
| insect taxon 02 | taxonomic name | information artifact |
| microbial taxon 01 | taxonomic name | information artifact |
| microbial taxon 02 | taxonomic name | information artifact |
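The difference between the asserted and the inferred types in the table above can be illustrated with a small sketch. The slides do not spell out the axioms behind the inference, so the two rules below are assumptions made for illustration: anything that is the specified output of a material sampling process is treated as a material sample, and anything that is the specified input of an observing process is treated as a material target of observation. An OWL reasoner would normally do this work; the plain-Python `Process` record and `inferred_types` function are hypothetical names, not part of the BCO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    process_id: str
    process_type: str
    has_input: str
    has_output: str


# A few rows copied from the example process data.
PROCESSES = [
    Process("insect collection 01", "material sampling process", "insect 01", "insect 01"),
    Process("tissue sampling 01", "material sampling process", "insect 01", "tissue sample 01"),
    Process("dna isolation 01", "material sampling process", "tissue sample 01", "DNA sample 01"),
    Process("insect observation 06", "observing process", "insect in situ 06", "image 01"),
]


def inferred_types(processes):
    """Map each individual to the types implied by its role in a process.

    Assumed rules (illustrative only, not taken from the ontology files):
    - output of a material sampling process -> material sample
    - input of an observing process -> material target of observation
    """
    inferred = {}
    for p in processes:
        if p.process_type == "material sampling process":
            inferred.setdefault(p.has_output, set()).add("material sample")
        elif p.process_type == "observing process":
            inferred.setdefault(p.has_input, set()).add("material target of observation")
    return inferred


if __name__ == "__main__":
    for individual, types in sorted(inferred_types(PROCESSES).items()):
        print(individual, "->", ", ".join(sorted(types)))
```

Under these assumed rules, "insect 01" becomes a material sample only because it is the output of "insect collection 01". No individual needs the type asserted by hand.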

List all processes that took place in 2010 as part of the Moorea insect inventory

BFO:process and BFO:part of occurrent BCO_example:Moorea insect inventory and date=2010

| Study | process ID | process type | date |
|---|---|---|---|
| Moorea insect inventory | insect collection 01 | material sampling process | 2010 |
| Moorea insect inventory | insect collection 02 | material sampling process | 2010 |
| Moorea insect inventory | tissue sampling 01 | material sampling process | 2010 |
| Moorea insect inventory | tissue sampling 02 | material sampling process | 2010 |
| Moorea insect inventory | insect gut sampling 04 | material sampling process | 2010 |
| Moorea insect inventory | insect gut sampling 05 | material sampling process | 2010 |
| Moorea insect inventory | dna isolation 01 | material sampling process | 2010 |
| Moorea insect inventory | dna isolation 02 | material sampling process | 2010 |
| Moorea insect inventory | dna isolation 04 | material sampling process | 2010 |
| Moorea insect inventory | dna isolation 05 | material sampling process | 2010 |
| Moorea insect inventory | insect observation 06 | observing process | 2010 |
| Moorea insect inventory | identification 01.1 | tax. iden. by morph. key | 2010 |
| Moorea insect inventory | identification 02.1 | tax. iden. by morph. key | 2010 |
| Moorea insect inventory | identification 04.1 | tax. iden. using BLAST | 2010 |
| Moorea insect inventory | identification 04.2 | tax. iden. using BLAST | 2010 |
| Moorea insect inventory | identification 04.3 | tax. iden. using BLAST | 2010 |
| Moorea insect inventory | identification 04.4 | tax. iden. using BLAST | 2010 |
| Moorea insect inventory | identification 04.5 | tax. iden. using BLAST | 2010 |
| Moorea insect inventory | identification 06.1 | morph. tax. identification | 2010 |
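The logic of this query can be sketched in plain Python over a handful of the example rows. This is only an illustration of the filter (part of the study, dated 2010), not the query mechanism used with the ontology; `ProcessRecord` and `processes_in` are hypothetical names introduced here.

```python
from typing import NamedTuple


class ProcessRecord(NamedTuple):
    study: str
    process_id: str
    process_type: str
    year: int


# A subset of the example process data, including two rows dated 2011.
RECORDS = [
    ProcessRecord("Moorea insect inventory", "insect collection 01", "material sampling process", 2010),
    ProcessRecord("Moorea insect inventory", "insect observation 06", "observing process", 2010),
    ProcessRecord("Moorea insect inventory", "identification 01.1", "tax. iden. by morph. key", 2010),
    ProcessRecord("Moorea insect inventory", "identification 01.2", "tax. iden. using dna barcode", 2011),
    ProcessRecord("Moorea insect inventory", "identification 07.1", "morph. tax. identification", 2011),
]


def processes_in(records, study, year):
    """Return the IDs of processes that are part of `study` and dated `year`."""
    return [r.process_id for r in records if r.study == study and r.year == year]


if __name__ == "__main__":
    for process_id in processes_in(RECORDS, "Moorea insect inventory", 2010):
        print(process_id)
```

The two 2011 identifications are excluded, which matches their absence from the result table.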

List the output (“has specified output”) of every “taxonomic identification process” that has “insect 03” as input (“has specified input”).

| Study | process ID | process type | has specified input | has specified output |
|---|---|---|---|---|
| Moorea insect inventory | identification 03.1 | tax. iden. by morph. key | insect 03 | insect taxon 01 |

| Study | process ID | process type | has input | has output |
|---|---|---|---|---|
| Moorea insect inventory | tissue sampling 03 | material sampling process | insect 03 | tissue sample 03 |
| Moorea insect inventory | dna isolation 03 | material sampling process | tissue sample 03 | DNA sample 03 |
| Moorea insect inventory | identification 03.2 | tax. iden. using dna barcode | DNA sample 03 | insect taxon 03 |
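The two result tables above suggest two ways to ask the question: match identifications whose input is "insect 03" directly, or follow the sampling chain from "insect 03" through its tissue and DNA samples. The sketch below shows both over the four example rows. It is an assumption-laden illustration: the set of process types counted as taxonomic identification is written out by hand rather than derived from the ontology's subclass hierarchy, and all function names are hypothetical.

```python
# Rows: (process ID, process type, input, output), copied from the tables above.
ROWS = [
    ("identification 03.1", "tax. iden. by morph. key", "insect 03", "insect taxon 01"),
    ("tissue sampling 03", "material sampling process", "insect 03", "tissue sample 03"),
    ("dna isolation 03", "material sampling process", "tissue sample 03", "DNA sample 03"),
    ("identification 03.2", "tax. iden. using dna barcode", "DNA sample 03", "insect taxon 03"),
]

# Assumption: these process types are the kinds of taxonomic identification process.
IDENTIFICATION_TYPES = {
    "tax. iden. by morph. key",
    "tax. iden. using dna barcode",
    "tax. iden. using BLAST",
    "morph. tax. identification",
}


def direct_identifications(rows, individual):
    """Outputs of identification processes whose input is exactly `individual`."""
    return [out for _pid, ptype, inp, out in rows
            if ptype in IDENTIFICATION_TYPES and inp == individual]


def derived_materials(rows, individual):
    """`individual` plus everything sampled from it, directly or indirectly."""
    seen = {individual}
    frontier = [individual]
    while frontier:
        current = frontier.pop()
        for _pid, ptype, inp, out in rows:
            if ptype == "material sampling process" and inp == current and out not in seen:
                seen.add(out)
                frontier.append(out)
    return seen


def all_identifications(rows, individual):
    """Identification outputs for `individual` or any material derived from it."""
    materials = derived_materials(rows, individual)
    return sorted({out for _pid, ptype, inp, out in rows
                   if ptype in IDENTIFICATION_TYPES and inp in materials})


if __name__ == "__main__":
    print(direct_identifications(ROWS, "insect 03"))
    print(all_identifications(ROWS, "insect 03"))
```

The direct match returns only the morphological identification. Following the chain also reaches the DNA barcode identification made on "DNA sample 03".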

Future directions - technical

• SPARQL endpoint with example queries
  – Check the BCO wiki (http://code.google.com/p/bco/)

• Implement community curation tools such as Quick Term Templates or BioPortal
  – Requests can go to the Issue tracker now: http://code.google.com/p/bco/issues/list

Future directions - ontological

• Better integration with OBI and other ontologies

• More sophisticated treatment of naming/taxonomy/identification

• Ontological modeling of surveys/inventories

• Mappings to DwC, MIxS, other vocabularies

• Testing with real data sets

Contributors: Steve Baskauf, Vijay Barve, Jim Beach, Reed Beaman, Matthiew Bietz, Stan Blum, Shawn Bowers, Pier Luigi Buttigieg, Neil Davies, Gabi Droege, Dag Endresen, Maria Alejandra Gandolfo, Robert Hanner, Alyssa Janning, Michelle Koo, Kris Krishtalka, John Kunze, Andréa Matsunaga, Peter Midford, Chuck Miller, Norman Morrison, Gil Nelson, OBI Developers, Éamonn O’Tuama, Cynthia Parr, Sujeevan Ratnasingham, Jai Rideout, Robert Robbins, Phillipe Rocca-Serra, Joel Sachs, Inigo San Gil, Herbert Schentz, Mark Schildhauer, Barry Smith, Peter Sterk, Steve Stones-Havas, Brian Stucky, Andrea Thomer, Mellisa Tulig, Dave Vieglais, Brian Wee, Trish Whetzel, Jamie Whitacre, Greg Whitbread, John Wooley

Funding

RCN4GSC: Research Coordination Network for Genomic Standards Consortium (DBI-0840989)

IB3 EAGER: An Interoperable Information Infrastructure for Biodiversity Research (IIS-1255035)