Why we need a semantic web framework for marine ecosystem indicators
Stace E. Beaulieu ([email protected]) and Andrew R. Maffei, Woods Hole Oceanographic Institution
Peter A. Fox, Massimo Di Stefano, Patrick West, and X. Marshall Ma, Tetherless World Constellation, Rensselaer Polytechnic Institute
Jonathan A. Hare, Michael Fogarty, Kim Hyde, and Sean Lucey, NOAA Northeast Fisheries Science Center
ECO-OP is supported by NSF Interop #0955649
Ocean Sciences Meeting 2014
What is an ecosystem indicator?
“based on verifiable data [and] conveys information about more than itself”
[Diagram boxes:]
‐ Identify & consult stakeholders/audience
‐ Identify management objectives & targets
‐ Determine key questions & indicator use
‐ Develop conceptual model
‐ Identify possible indicators
‐ Gather & review data
‐ Calculate indicators
‐ Communicate and interpret indicators
‐ Test & refine indicators with stakeholders
‐ Develop monitoring & reporting systems
Diagram and quote modified from: Biodiversity Indicators Partnership (2011) http://www.bipindicators.net
• Motivation • Use Case • Adopt ontologies • Evaluate prototype • Future 2
An ecosystem indicator is Big Data
‐ Data heterogeneity
‐ Data integration
‐ Provenance
‐ Indicator may not have same meaning as source data
[Diagram boxes, several annotated with “data”: Identify & consult stakeholders/audience; Identify management objectives & targets; Determine key questions & indicator use; Develop conceptual model; Identify possible indicators; Calculate indicators; Communicate and interpret indicators; Test & refine indicators with stakeholders; Develop monitoring & reporting systems]
Why we need a semantic web framework for marine ecosystem indicators
“Web of data”
‐ Uses Linked Data standards
‐ Knowledge base currently describes 3.22 million things in an ontology
Large Marine Ecosystems (LMEs) of the World
U.S. Northeast Shelf LME
http://www.lme.noaa.gov/LMEWeb/downloads/lme_biomass.pdf
Use Case: Ecosystem Status Report for the LME
Goal: “traceability, repeatability, explanation, verification, and validation” for ecosystem data and information products in the NEFSC Ecosystem Status Report
Example Indicator #1: Primary Production
Similar data product from another group that also provides indicators for the LME
Figure in Report
http://www.seaaroundus.org/lme/7.aspx
Why different? I manually determined the provenance …
Diagram of source datasets “funneling” into derived data product
‐ Source data from different providers
‐ Derived Dataset, used in Figure in Report
‐ Person who generated the indicator
‐ Process steps at each arrow
Provenance: the history, or lineage, to trace derived data products back to source data
[Diagram labels: Person who generated the indicator; Source data from different providers (? ? ?); Derived Dataset (?); used in Figure in Report; Chl‐a; Primary production; SST]
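Tracing a derived product back to its sources can be sketched as a walk over "derived from" links. The snippet below is only an illustration: the product names and the links between them are assumptions made for the example, not the actual lineage of the report figure.

```python
# Hypothetical lineage records: each product maps to the inputs it was derived from.
# The link structure is assumed for illustration only.
DERIVED_FROM = {
    "figure_in_report": ["primary_production"],
    "primary_production": ["chl_a", "sst"],
    "chl_a": [],
    "sst": [],
}


def trace_sources(product, derived_from):
    """Return every reachable product that has no recorded inputs (a source)."""
    sources, seen, stack = set(), set(), [product]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        inputs = derived_from.get(node, [])
        if inputs:
            stack.extend(inputs)  # keep walking back along the lineage
        else:
            sources.add(node)     # nothing upstream recorded: treat as source data
    return sources


print(sorted(trace_sources("figure_in_report", DERIVED_FROM)))
```

When the links are not recorded anywhere, as with the "? ? ?" boxes in the diagram, this walk has nothing to follow, which is the gap the manual provenance work had to fill.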
“Semantic provenance”: domain-specific ontologies considered
Linked Ocean Data
GCMD
https://github.com/adamml/LinkedOceanDataCloud/blob/master/linkedOceanDataCloud.jpg
Adopt PROV-O: Linked Data provenance ontology
‐ provides domain-independent classes and properties
‐ can be extended to create domain-specific classes and properties
[Diagram: Entity, Subclass: Dataset; Person who generated the indicator]
http://www.w3.org/TR/prov-o/ (W3C Recommendation, 30 April 2013)
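Extending PROV-O with a domain-specific class can be sketched with plain triples. In this minimal example the `prov:` terms (Entity, Person, wasAttributedTo) are real PROV-O terms, while the `http://example.org/eco#` namespace and every name under it are hypothetical stand-ins, not the project's ontology.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/eco#"  # hypothetical namespace for illustration

TRIPLES = {
    # Domain-specific class declared as a subclass of the generic PROV class.
    (EX + "Dataset", RDFS_SUBCLASS_OF, PROV + "Entity"),
    # A derived dataset and the person who generated it (both hypothetical).
    (EX + "primaryProduction", RDF_TYPE, EX + "Dataset"),
    (EX + "scientist", RDF_TYPE, PROV + "Person"),
    (EX + "primaryProduction", PROV + "wasAttributedTo", EX + "scientist"),
}


def is_instance_of(subject, cls, triples):
    """True if subject has type cls directly or through an rdfs:subClassOf chain."""
    todo = [o for s, p, o in triples if s == subject and p == RDF_TYPE]
    seen = set()
    while todo:
        current = todo.pop()
        if current == cls:
            return True
        if current in seen:
            continue
        seen.add(current)
        todo.extend(o for s, p, o in triples
                    if s == current and p == RDFS_SUBCLASS_OF)
    return False


print(is_instance_of(EX + "primaryProduction", PROV + "Entity", TRIPLES))
```

Because the dataset is typed with the subclass, any tool that only understands generic PROV can still treat it as a `prov:Entity`.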
PROV-O adopted for Global Change Information System (GCIS)
GCIS ontology
Key Message linked to Dataset
PROV ontology
Key Message linked to domain concept
http://cmapspublic3.ihmc.us/rid=1L2W4LDNM-D80T3Z-10N0/GCIS%20report-specific%20key%20messages.cmap
Extension of GCIS ontology for ECO-OP use case
But what was missing from GCIS ontology?
Dataset generation: “crux” of the ECO‐OP use case
[Diagram: Activity, Subclass: DatasetGeneration]
PROV‐ES: Earth Science Provenance Ontology working group is testing subclasses of this activity
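Why the generation activity matters can be shown with a small sketch: once an activity records what it used and what it generated, source and derived datasets become connected. The `prov:used` and `prov:wasGeneratedBy` properties are real PROV-O terms; all `ex:` names are hypothetical. Note that pairing inputs with outputs this way is an application-level assumption: PROV itself does not infer a derivation from usage plus generation.

```python
# Triples as (subject, predicate, object) using compact names.
# Everything under ex: is hypothetical and for illustration only.
TRIPLES = [
    ("ex:DatasetGeneration", "rdfs:subClassOf", "prov:Activity"),
    ("ex:run1", "rdf:type", "ex:DatasetGeneration"),
    ("ex:run1", "prov:used", "ex:sourceA"),
    ("ex:run1", "prov:used", "ex:sourceB"),
    ("ex:derivedDataset", "prov:wasGeneratedBy", "ex:run1"),
]


def candidate_derivations(triples):
    """Pair each generated dataset with every input used by its generating activity."""
    used_by_activity = {}
    for subject, predicate, obj in triples:
        if predicate == "prov:used":
            used_by_activity.setdefault(subject, []).append(obj)
    links = []
    for subject, predicate, obj in triples:
        if predicate == "prov:wasGeneratedBy":
            for source in used_by_activity.get(obj, []):
                links.append((subject, source))
    return sorted(links)


print(candidate_derivations(TRIPLES))
```

Without the activity node in the ontology there is no place to attach the inputs, so the derived dataset cannot be linked back to its sources.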
Prototype using IPython Notebook
Can we automate the capture of machine‐readable provenance metadata while the scientist is generating the indicator?
[Diagram: Source Dataset; Entity, Subclass: Dataset; Activity, Subclass: IPythonNotebookRun; Dataset generation]
Manual: Dataset matched (dc:subject) to a concept in Linked Ocean Data vocabulary (skos:Concept)
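One way to picture automated capture is a wrapper that records what a processing step used and generated while the scientist simply runs it. This is a plain-Python sketch, not the prototype's code: the decorator, the in-memory log, the file names, and the record layout are all hypothetical.

```python
import datetime
import functools

PROVENANCE_LOG = []  # hypothetical in-memory store for captured records


def record_run(inputs, output):
    """Decorator that logs a provenance record each time the wrapped step runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = datetime.datetime.now(datetime.timezone.utc)
            result = func(*args, **kwargs)
            ended = datetime.datetime.now(datetime.timezone.utc)
            PROVENANCE_LOG.append({
                "activity": func.__name__,
                "type": "IPythonNotebookRun",  # class name taken from the slide
                "used": list(inputs),
                "generated": output,
                "startedAtTime": started.isoformat(),
                "endedAtTime": ended.isoformat(),
            })
            return result
        return wrapper
    return decorator


@record_run(inputs=["source_dataset.csv"], output="indicator.csv")
def compute_indicator(values):
    """Stand-in for the real indicator calculation: a simple mean."""
    return sum(values) / len(values)


mean = compute_indicator([1.0, 2.0, 3.0])
print(mean, PROVENANCE_LOG[0]["used"])
```

The scientist's function is unchanged; the record is produced as a side effect of running it. The manual step on the slide (matching the dataset to a vocabulary concept) would still have to be added by hand.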
Example Indicator #2: North Atlantic Oscillation
We used IPython Notebook to reproduce the Chapter on Climate Forcing in the Ecosystem Status Report
http://tw.rpi.edu/web/doc/AGUFM/2013/ReproducibilityInOceanSciences
Bring provenance tracking to the scientist
Source Dataset
Data processing
Screenshot of IPython Notebook used to track provenance
http://ipython.org/
‐ Notebooks are formatted as JSON files
‐ We are currently coding the provenance triples to JSON‐LD (lightweight Linked Data format)
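Since notebooks are JSON files, provenance expressed as JSON‐LD can travel inside the notebook itself. The sketch below assumes a simplified stand-in for a notebook (only its top-level `metadata` object) and a hypothetical `provenance` metadata key; the `prov:` terms are real PROV-O terms, and the `ex:` identifiers are invented for illustration.

```python
import json

# Simplified stand-in for a notebook file; real notebooks carry more keys.
notebook = {"metadata": {"name": "indicator_notebook"}}

# Provenance as a JSON-LD node object using compact IRIs.
provenance = {
    "@context": {
        "prov": "http://www.w3.org/ns/prov#",
        "ex": "http://example.org/eco#",  # hypothetical namespace
    },
    "@id": "ex:indicatorDataset",
    "@type": "prov:Entity",
    "prov:wasGeneratedBy": {
        "@id": "ex:notebookRun1",
        "@type": "prov:Activity",
        "prov:used": {"@id": "ex:sourceDataset"},
    },
}


def embed_provenance(nb, jsonld, key="provenance"):
    """Return a copy of the notebook with the JSON-LD stored under metadata[key]."""
    updated = dict(nb)
    updated["metadata"] = dict(nb.get("metadata", {}))
    updated["metadata"][key] = jsonld
    return updated


serialized = json.dumps(embed_provenance(notebook, provenance), indent=2)
restored = json.loads(serialized)
print(restored["metadata"]["provenance"]["prov:wasGeneratedBy"]["@id"])
```

Keeping the provenance in the same file as the code means the record cannot drift away from the notebook that produced the dataset.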
Specific outcomes of ECO-OP use case
1) Implement PROV provenance capture for derived dataset in IPython Notebook
2) Extend GCIS ontology to include the activity of dataset generation, and provide testbed for PROV‐ES granularity of this activity
3) Establish best practices for semantic annotation of derived dataset in Integrated Ecosystem Assessment
Discovery, Understanding, Re‐usability, Data Integration
Why we need a semantic web framework for marine ecosystem indicators
Future work
ECO‐OP use case
http://linkeddatadeveloper.com/Projects/Linked-Data/media/fig11.2.png
Ecosystem‐based management involves the sharing of data and information products among diverse stakeholders
Can an intelligent agent find the overlap in datasets used in marine ecosystem assessments?
http://www.iucnredlistofecosystems.org/
http://www.oceanhealthindex.org/
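If each assessment published the source datasets behind its indicators as machine-readable provenance, finding the overlap would reduce to comparing sets. The sketch below assumes that such lists have already been extracted; the assessment and dataset names are hypothetical placeholders, not the contents of any real assessment.

```python
from itertools import combinations

# Hypothetical source-dataset lists, as they might be read from provenance records.
ASSESSMENT_SOURCES = {
    "assessment_a": {"sst_product", "chl_a_product", "trawl_survey"},
    "assessment_b": {"sst_product", "chl_a_product"},
    "assessment_c": {"sst_product", "habitat_map"},
}


def pairwise_overlap(sources):
    """Map each pair of assessments to the set of datasets they share."""
    return {
        (a, b): sources[a] & sources[b]
        for a, b in combinations(sorted(sources), 2)
    }


overlap = pairwise_overlap(ASSESSMENT_SOURCES)
shared_by_all = set.intersection(*ASSESSMENT_SOURCES.values())
print(sorted(overlap[("assessment_a", "assessment_b")]), sorted(shared_by_all))
```

The comparison only works if the same dataset carries the same identifier in every assessment, which is what shared Linked Data identifiers are meant to provide.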