A Data API with Security and Graph-level Access Control
Dr. Barry Norton,
Development Manager, ResearchSpace* The British Museum
* supported by the Andrew W. Mellon Foundation
ESWC, Crete
May 2014
Open Endpoint
• “The idea that an enterprise would allow a public SQL interface is laughable”
• True, but furthermore:
• The idea that an enterprise would allow unrestricted SQL (even querying, let alone update) is also laughable
Reality
• In reality the enterprise:
• runs an active directory service
• assigns permissions to databases using this
• assigns read/write permissions to tables using this
• allows third-party software providers access only to pre-defined queries and updates
Admission
• This talk is of low originality and negligible scientific value!
State of the Art
• Some triplestores already provide graph-level access control
• The Datalift project produced a query-rewriting system to provide access control over arbitrary triplestores (Costabello et al., ECAI2012)
• Knud and Leigh, formerly of Talis, already presented how Kasabi allowed pre-defined parameterised SPARQL queries (WWW2012)
• The BBC, and other enterprises that have followed their example, already use a similar approach in practice (see presentations of Jem Rayfield)
Proprietary
Enumerates graphs in query
Dead! (and proprietary user management)
Closed source
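To make the "enumerates graphs in query" style of rewriting concrete, here is a minimal Python sketch. It is not the Datalift implementation: the function name `restrict_to_graphs` and the example.org graph URIs are hypothetical, and the string-based insertion is a simplification (a real rewriter would operate on a parsed query). It assumes that the set of graphs a user may read is already known, and restricts the query's dataset to them by adding one FROM clause per graph.

```python
import re
from typing import Sequence


def restrict_to_graphs(query: str, allowed_graphs: Sequence[str]) -> str:
    """Return the query with one FROM clause per permitted graph.

    Naive, text-based sketch: the clauses are inserted just before the
    first WHERE keyword. Queries that already name a dataset are rejected
    rather than merged.
    """
    if not allowed_graphs:
        # With no FROM clause the endpoint's default dataset would be used,
        # so an empty permission list must be refused, not passed through.
        raise PermissionError("user may not read any graph")
    if re.search(r"\bFROM\b", query, flags=re.IGNORECASE):
        raise ValueError("query already names a dataset")
    where = re.search(r"\bWHERE\b", query, flags=re.IGNORECASE)
    if where is None:
        raise ValueError("no WHERE keyword found")
    from_clauses = "".join(f"FROM <{graph}>\n" for graph in allowed_graphs)
    return query[: where.start()] + from_clauses + query[where.start():]


# Hypothetical graph URIs, for illustration only.
rewritten = restrict_to_graphs(
    "SELECT ?s WHERE { ?s ?p ?o }",
    ["http://example.org/graph/public", "http://example.org/graph/curators"],
)
print(rewritten)
```

The query text grows with the number of permitted graphs, which is the cost of enumerating them in the query.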
Approach (1/4)
Approach (2/4)
Approach (3/4)
Approach (4/4)
• RESTful API for management of queries and updates (i.e. each one becomes a URI-identified resource)
• Each query/update
• Can be parameterised (by POSTing parameters with XSD datatypes that substitute for variables)
• Can be executed (subject to access and rewriting) by POSTing parameter values
• Can be scheduled for execution by POSTing a schedule
• Can be tested, on schedule, by POSTing an XPath or SPARQL ASK query
• Provides a GETtable resource per scheduled execution
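The parameterisation step can be illustrated with a minimal Python sketch. It is an assumption about how such substitution might work, not the API's actual code: `typed_literal`, `bind_parameters`, the parameter name `minHeight` and the example.org property are all hypothetical. Each declared parameter carries an XSD datatype, and every occurrence of the matching variable in the stored query is replaced by a typed literal built from the POSTed value.

```python
import re
from typing import Mapping

XSD = "http://www.w3.org/2001/XMLSchema#"


def typed_literal(value: str, xsd_type: str) -> str:
    """Render a value as an escaped SPARQL literal with an XSD datatype."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"^^<{XSD}{xsd_type}>'


def bind_parameters(
    template: str, declared: Mapping[str, str], values: Mapping[str, str]
) -> str:
    """Substitute typed literals for the declared variables of a stored query.

    declared maps variable names to XSD local names (e.g. "integer");
    values maps the same names to the supplied lexical values.
    """
    if set(values) != set(declared):
        raise ValueError("supplied values must match the declared parameters")
    if not declared:
        return template
    literals = {name: typed_literal(values[name], declared[name]) for name in declared}
    # One pass over the template, so text inside a substituted literal is
    # never re-scanned for further variables.
    names = sorted(declared, key=len, reverse=True)
    pattern = re.compile(r"[?$](" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda match: literals[match.group(1)], template)


# Hypothetical stored query with one declared parameter.
bound = bind_parameters(
    "SELECT ?object WHERE { ?object <http://example.org/height> ?h . FILTER(?h >= ?minHeight) }",
    {"minHeight": "integer"},
    {"minHeight": "150"},
)
print(bound)
```

Because values only ever enter the query as escaped literals, a caller can vary the parameters but cannot change the shape of the pre-defined query.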
ResearchSpace
• a reusable set of Linked Data-based components, making up
• a platform that allows researchers to make claims (additions and changes to GLAM data),
• that preserves and aggregates canonical data across Museums (LAMs),
• attributes claims,
• records arguments based on
• provenanced data annotation,
• image annotation
• forum-based discussion with explicit annotation
• will allow inference over claims
RS Search
Fundamental Relationships
RS (Conjunctive) Search
RS Data Annotation
Untrue claims?
• For years a (naive) objection to Linked Data has been:
• “If I publish my data and give my things identifiers (URIs), won’t people make untrue claims?”
<http://collection.britishmuseum.org/id/object/YCA62958> crm:P52_has_current_owner
<http://semanticweb.org/id/Barry_Norton>
RS Image Annotation
RS ‘Data Basket’
RS Dashboard
Future Work
• Query rewriting currently uses the Jena parser; we want to re-implement it using SPIN
• Rewrites become CONSTRUCT queries, rather than code
• Make a publicly-accessible instance of the API so people can publish parameterised queries and schedule tests on endpoints