A Data API with Security and Graph-Level Access Control

Dr. Barry Norton,

Development Manager, ResearchSpace*, The British Museum

* supported by the Andrew W. Mellon Foundation

ESWC, Crete

May 2014

Open Endpoint

• “The idea that an enterprise would allow a public SQL interface is laughable”

• True, but furthermore:

• The idea that an enterprise would allow unrestricted SQL (even querying, let alone update) is also laughable

Reality

• In reality the enterprise:

• runs an active directory service

• assigns permissions to databases using this

• assigns read/write permissions to tables using this

• allows third-party software providers access only to pre-defined queries and updates

Admission

• This talk is of low originality and negligible scientific value!

State of the Art

• Some triplestores already provide graph-level access control

• The Datalift project produced a query-rewriting system to provide access control over arbitrary triplestores (Costabello et al., ECAI2012)

• Knud and Leigh, formerly of Talis, already presented how Kasabi allowed pre-defined parameterised SPARQL queries (WWW2012)

• The BBC, and other enterprises that have followed their example, already use a similar approach in practice (see presentations of Jem Rayfield)

State of the Art

• Triplestore-native graph-level access control: proprietary

• Datalift query rewriting: enumerates graphs in the query

• Kasabi parameterised queries: dead! (and proprietary user management)

• BBC-style approach: closed source
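
To make "enumerates graphs in query" concrete, the following is a minimal sketch of that style of rewriting: the query is restricted to a user's permitted graphs by adding one FROM clause per graph. The function name `restrict_to_graphs` and the example.org graph URIs are assumptions for illustration; this is not the Datalift implementation, and a real rewriter would work on a parsed query (for example with a SPARQL parser such as Jena's) rather than on text.

```python
import re


def restrict_to_graphs(query: str, allowed_graphs: list[str]) -> str:
    """Naively restrict a SPARQL query to the given named graphs.

    Inserts one FROM clause per permitted graph before the first WHERE
    keyword. Text-based and deliberately conservative: it refuses any
    query that already contains the word FROM.
    """
    if not allowed_graphs:
        # Without any FROM clause the endpoint's default dataset would be
        # queried, which would defeat the access control.
        raise ValueError("user has no readable graphs")
    if re.search(r"\bFROM\b", query, flags=re.IGNORECASE):
        raise ValueError("query already names a dataset; refusing to rewrite")
    match = re.search(r"\bWHERE\b", query, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no WHERE keyword found")
    from_clauses = "".join(f"FROM <{graph}>\n" for graph in allowed_graphs)
    return query[:match.start()] + from_clauses + query[match.start():]


if __name__ == "__main__":
    print(restrict_to_graphs(
        "SELECT ?s WHERE { ?s ?p ?o }",
        ["http://example.org/g1", "http://example.org/g2"],  # hypothetical graphs
    ))
```

The rewritten query grows with the number of graphs a user may read, which is the limitation the slide points at.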

BM Dataset

http://collection.britishmuseum.org

Approach (1/4)

Approach (2/4)

Approach (3/4)

Approach (4/4)

• RESTful API for management of queries and updates (i.e. each query or update becomes a URI-identified resource)

• Each query/update

• Can be parameterised (by POSTing parameters with XSD datatypes that substitute for variables)

• Can be executed (subject to access and rewriting) by POSTing parameter values

• Can be scheduled for execution by POSTing a schedule

• Can be tested, on schedule, by POSTing an XPath or SPARQL ASK query

• Provides a GETtable resource per scheduled execution
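
As an illustration of parameterisation, the sketch below substitutes typed values for declared variables in a stored query. It is a sketch under stated assumptions, not the API's actual code: the function names, the `declared`/`values` dictionaries and the example.org predicate are all hypothetical, and only typed literals (not IRIs) are handled.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"


def to_literal(value: str, datatype: str) -> str:
    """Render a value as a SPARQL typed literal, e.g. "1850"^^<...#integer>."""
    if datatype == "integer":
        int(value)  # raises ValueError if the value is not an integer
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"^^<{XSD}{datatype}>'


def bind_parameters(template: str, declared: dict, values: dict) -> str:
    """Replace each declared variable (?name) with its typed literal.

    declared maps variable name -> XSD datatype local name (hypothetical
    shape); values maps variable name -> lexical value as a string.
    """
    missing = set(declared) - set(values)
    if missing:
        raise KeyError(f"missing parameter values: {sorted(missing)}")
    if not declared:
        return template
    literals = {name: to_literal(values[name], dt) for name, dt in declared.items()}
    # Substitute in a single pass so that a supplied value containing
    # "?other" is never itself rewritten.
    pattern = re.compile(r"\?(" + "|".join(re.escape(n) for n in declared) + r")\b")
    return pattern.sub(lambda m: literals[m.group(1)], template)


if __name__ == "__main__":
    stored = "SELECT ?obj WHERE { ?obj <http://example.org/year> ?year }"
    print(bind_parameters(stored, {"year": "integer"}, {"year": "1850"}))
```

Because callers supply only values, never query text, the set of queries that can reach the store stays fixed, much like the pre-defined queries of the enterprise SQL setting.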

ResearchSpace

• a reusable set of Linked Data-based components, making up

• a platform that allows researchers to make claims (additions and changes to GLAM data),

• that preserves and aggregates canonical data across Museums (LAMs),

• attributes claims,

• records arguments based on

• provenanced data annotation,

• image annotation

• forum-based discussion with explicit annotation

• will allow inference over claims

RS Search

Fundamental Relationships

RS (Conjunctive) Search

RS Data Annotation

Untrue claims?

• For years a (naive) objection to Linked Data has been:

• “If I publish my data and give my things identifiers (URIs), won’t people make untrue claims?”

<http://collection.britishmuseum.org/id/object/YCA62958> crm:P52_has_current_owner <http://semanticweb.org/id/Barry_Norton>
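
One way to see why graph-level access control makes such a claim harmless: if canonical data and researcher claims live in separate named graphs, a reader only sees the graphs they are permitted to read. The sketch below models this with plain tuples. The graph names, the placeholder owner URI and the idea of one claims graph per researcher are assumptions for illustration, not a description of the ResearchSpace implementation.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str


# Hypothetical graph names, for illustration only.
CANONICAL = "http://example.org/graph/canonical"
CLAIMS = "http://example.org/graph/claims/barry"

OBJECT = "http://collection.britishmuseum.org/id/object/YCA62958"
OWNER = "crm:P52_has_current_owner"

store = [
    # Placeholder canonical statement (the owner URI is made up).
    Quad(OBJECT, OWNER, "http://example.org/id/placeholder-owner", CANONICAL),
    # The untrue claim from the slide, kept in the claimant's own graph.
    Quad(OBJECT, OWNER, "http://semanticweb.org/id/Barry_Norton", CLAIMS),
]


def visible_quads(quads, readable_graphs):
    """Return only the statements held in graphs the reader may access."""
    return [q for q in quads if q.graph in readable_graphs]


if __name__ == "__main__":
    for quad in visible_quads(store, {CANONICAL}):
        print(quad.obj)
```

A reader granted only the canonical graph never sees the claim, while the claim itself stays attributed to the graph it was made in.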

RS Image Annotation

RS ‘Data Basket’

RS Dashboard

Future Work

• Query rewriting currently uses the Jena parser; we want to re-implement it using SPIN

• Rewrites become CONSTRUCT queries, rather than code

• Make a publicly-accessible instance of the API so people can publish parameterised queries and schedule tests on endpoints

Internet
