Prov4J: A Semantic Web Framework for Generic Provenance Management. André Freitas, Arnaud Legendre, Sean O’Riain, Edward Curry. Paper: http://andrefreitas.org/papers/Prov4J%20A%20Semantic%20Web%20Framework%20for%20Generic%20Provenance%20Management.pdf
Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Prov4J: A Semantic Web Framework for Generic Provenance Management
André Freitas, Arnaud Legendre, Sean O’Riain, Edward Curry
Outline
Motivation.
Generic provenance management on the Web.
Prov4J:
  Capture
  Representation
  Consumption
  Deployment
Motivation: Data on the Web
Accelerated by the adoption and uptake of Linked Data.
Paradigm shift: change in the way information is consumed on the Web.
Main issue: quality assessment and trustworthiness.
Motivation
Provenance as a cornerstone element for quality assessment.
Expansion of the application of provenance into different domains and types of systems.
Generic applications generating or consuming data on the Web need to become provenance-aware.
Provenance-aware: Ability to capture, represent and consume provenance information associated with the data.
Generic Provenance Management
Provenance management for this larger audience.
Covers the set of the most frequent requirements for provenance capture and consumption on the Web.
Generic Provenance Management
Provenance for the Masses
Research Questions
Are Semantic Web standards and tools appropriate for capturing, representing and consuming provenance on the Web?
What are the key software engineering aspects which need to be employed to reduce the barriers for the construction of provenance-aware applications?
Research Goals
Answer these questions.
Provide a Generic Provenance Management Framework for the Web.
Make it available for experimentation by the community.
Main Components
Provenance Representation: W3P
Provenance Capture and Provenance Consumption: Prov4J
W3P
Lightweight provenance ontology for the Web.
Focused on provenance for data quality assessment.
Designed to be compatible with the Open Provenance Model.
Dimensions: Workflow, Publishing and Social Provenance.
Building W3P:
  Use cases
  Data quality dimensions
  Literature review
  Requirements
  Core provenance concepts
  Use and refinement
W3P: Classes & Properties (excerpt)
Core Workflow Model
Building Prov4J
Core requirements for a generic provenance management framework:
  Capture
  Consumption
Provenance architecture.
Core software engineering aspects for capturing provenance.
Deployment in a real world scenario.
Core requirements coverage analysis.
Core Requirements
Provenance capture:
  Minimum number of software adaptations
  Low impact on performance
  Expressive interface
  Scalability
  Structured provenance data
  Publication of provenance data
Provenance consumption:
  Query expressivity
  Query performance & scalability
  Provenance discovery
  Mapping from different provenance models
  Usability
Core Requirements (cont’d)
Common requirements:
  User data representation independency
  Separation of concerns
  Reliable provenance storage
  Basic system administration support
  Security
High-Level Architecture
Consumption: Components
Consumption: Query Types
Query Types:
  SPARQL based queries
  Queries supported by reasoning
  Path queries
  Navigational queries
  Similarity queries
Query Type Distribution (API):
  33% used transitivity
  9% used rules reasoning
  9% used path features
  20% used SPARQL extensions
  30% pure SPARQL
  4% similarity
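To make the transitivity and path query types concrete, here is a minimal Java sketch. It is not the Prov4J API: the class DerivationGraph, its methods and the artifact names are all hypothetical, and a plain in-memory graph stands in for the RDF store and SPARQL layer the framework builds on. The transitive query collects every ancestor of an artifact; the path query returns one shortest derivation chain between two artifacts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory provenance graph; not part of the Prov4J API.
class DerivationGraph {
    // Maps an artifact to the artifacts it was directly derived from.
    private final Map<String, List<String>> derivedFrom = new LinkedHashMap<>();

    void addDerivation(String artifact, String source) {
        derivedFrom.computeIfAbsent(artifact, k -> new ArrayList<>()).add(source);
    }

    // Transitive query: every artifact reachable over derivation edges.
    Set<String> allSources(String artifact) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(artifact);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String source : derivedFrom.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(source)) {
                    queue.add(source);
                }
            }
        }
        return seen;
    }

    // Path query: one shortest derivation chain from artifact back to ancestor,
    // or an empty list when no such chain exists (breadth-first search).
    List<String> pathTo(String artifact, String ancestor) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(artifact);
        parent.put(artifact, null);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(ancestor)) {
                List<String> path = new ArrayList<>();
                for (String node = current; node != null; node = parent.get(node)) {
                    path.add(node);
                }
                Collections.reverse(path);
                return path;
            }
            for (String source : derivedFrom.getOrDefault(current, Collections.emptyList())) {
                if (!parent.containsKey(source)) {
                    parent.put(source, current);
                    queue.add(source);
                }
            }
        }
        return Collections.emptyList();
    }
}

class DerivationGraphDemo {
    public static void main(String[] args) {
        DerivationGraph graph = new DerivationGraph();
        graph.addDerivation("report", "cleanedData");
        graph.addDerivation("cleanedData", "rawData");
        graph.addDerivation("report", "template");
        System.out.println(graph.allSources("report"));
        System.out.println(graph.pathTo("report", "rawData"));
    }
}
```

Neither query can be written as a fixed-length triple pattern, because the number of derivation hops is not known in advance. That is one plausible reading of why queries using transitivity and path features appear alongside pure SPARQL in the distribution above.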
Capture: Software Engineering Principles
Aspect Oriented Programming & Annotations.
Pushback capture.
Minimization of Adaptations.
Context-based provenance construction.
Provenance URIs.
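A rough Java sketch of annotation-driven capture, under stated assumptions, may help illustrate these principles. The @CaptureProvenance annotation, the ReportService interface, the log format and the http://example.org/provenance/ URI scheme are all hypothetical and do not come from Prov4J. A JDK dynamic proxy stands in for an aspect weaver such as AspectJ, so the example runs with the standard library alone.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical marker annotation; the name is an assumption, not Prov4J's.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface CaptureProvenance {
    String process();
}

// Example application interface; the annotation is the only adaptation.
interface ReportService {
    @CaptureProvenance(process = "generateReport")
    String generate(String datasetId);
}

class SimpleReportService implements ReportService {
    public String generate(String datasetId) {
        return "report-for-" + datasetId;
    }
}

// Stand-in for an aspect: intercepts annotated calls and logs one record each.
class ProvenanceInterceptor implements InvocationHandler {
    private final Object target;
    private final String baseUri; // assumed base for minting provenance URIs
    private final List<String> log = new ArrayList<>();
    private int counter = 0;

    ProvenanceInterceptor(Object target, String baseUri) {
        this.target = target;
        this.baseUri = baseUri;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Object result;
        try {
            result = method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the application's own exception
        }
        CaptureProvenance marker = method.getAnnotation(CaptureProvenance.class);
        if (marker != null) {
            counter++;
            // Each captured process execution gets its own URI.
            String uri = baseUri + marker.process() + "/" + counter;
            log.add(uri + " used=" + Arrays.toString(args) + " generated=" + result);
        }
        return result;
    }

    List<String> log() {
        return log;
    }

    // Wraps the interceptor's target in a proxy implementing the given interface.
    static <T> T proxyFor(Class<T> iface, ProvenanceInterceptor interceptor) {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, interceptor));
    }
}

class CaptureDemo {
    public static void main(String[] args) {
        ProvenanceInterceptor interceptor = new ProvenanceInterceptor(
                new SimpleReportService(), "http://example.org/provenance/");
        ReportService service = ProvenanceInterceptor.proxyFor(ReportService.class, interceptor);
        service.generate("sales-2009");
        interceptor.log().forEach(System.out::println);
    }
}
```

The business class contains no logging code, which is the sense in which this style keeps adaptations to a minimum: capture logic lives in one interceptor and is switched on per method by an annotation.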
Capture: Adaptations
Capture: Logging & Storage
Scenario
Core Requirements Coverage
Summary
Semantic Web standards and tools played a fundamental role in the construction of the framework.
Query expressivity was improved over original SPARQL.
Transitivity and path queries proved to be very important features.
Framework is usable in a realistic scenario.
High coverage of core requirements.
Available for download from early November 2010.
Future Work
Evaluation of query expressivity and performance.
W3C Prov-XG requirements coverage analysis.
Improvement of the coverage of the core requirements.
http://prov4j.org