Chris Bizer, Richard Cyganiak: D2RQ – Lessons Learned (25.10.2007) W3C Workshop on RDF Access to Relational Databases 25-26 October, 2007 — Boston, MA, USA D2RQ Lessons Learned Christian Bizer Richard Cyganiak Freie Universität Berlin


W3C Workshop on RDF Access to Relational Databases

25-26 October, 2007 — Boston, MA, USA 

D2RQ

Lessons Learned

Christian Bizer, Richard Cyganiak

Freie Universität Berlin


The D2RQ Platform

2002: D2R MAP

- dumps relational databases as RDF
- based on an expressive declarative mapping language

2004: D2RQ

- RDQL/SPARQL to SQL query rewriting
- Jena and Sesame API

2006: D2R Server

- SPARQL and Linked Data access over the Web

Tested with Oracle, MySQL, and PostgreSQL

Should work with any SQL-92 compatible database

GNU GPL license, 4600 downloads (150 per month)


Outline

1. D2RQ Mapping Language

2. D2RQ Architecture and Interfaces

3. Areas for Future Community Work

   1. RDF Access to Relational Databases

   2. The Web Perspective


The D2RQ Mapping Language

Declarative language to express mappings between a given RDF schema and a given relational database schema.


Class Map

map:Author_ClassMap a d2rq:ClassMap;
    d2rq:class foaf:Person;
    d2rq:uriPattern "/people/@@Author.ID@@".

Table: Author

ID   first   last    email
12   Chris   Bizer   [email protected]

http://www4.wiwiss.fu-berlin/d2rServer/people/12 rdf:type foaf:Person .


Property Bridge

map:email_PropertyBridge a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Author_ClassMap;
    d2rq:property foaf:name;
    d2rq:pattern "@@Author.first@@ @@Author.last@@".

http://www4.wiwiss.fu-berlin/d2rServer/people/12 foaf:name "Chris Bizer" .

Table: Author

ID   first   last    email
12   Chris   Bizer   [email protected]
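The class map and property bridge above can be read as templates that are filled in once per row. The following is a minimal Python sketch of that idea, not D2RQ's actual implementation: the helper `expand_pattern`, the in-memory SQLite table, and the way the base URI is prepended are assumptions made for illustration. The email column is left out of the sketch.

```python
import re
import sqlite3

# Hypothetical helper (not part of D2RQ): replaces each @@Table.column@@
# placeholder with the value of that column in the given row.
# For brevity the table name is ignored and only the column name is looked up.
PLACEHOLDER = re.compile(r"@@(\w+)\.(\w+)@@")

def expand_pattern(pattern, row):
    return PLACEHOLDER.sub(lambda m: str(row[m.group(2)]), pattern)

# Base URI as it appears in the example triples on the slides.
BASE = "http://www4.wiwiss.fu-berlin/d2rServer"

# Stand-in for the relational database, holding the Author row from the slide.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["column"] access
conn.execute("CREATE TABLE Author (ID INTEGER PRIMARY KEY, first TEXT, last TEXT)")
conn.execute("INSERT INTO Author VALUES (12, 'Chris', 'Bizer')")

triples = []
for row in conn.execute("SELECT ID, first, last FROM Author"):
    # Class map: d2rq:uriPattern yields the subject URI, d2rq:class its type.
    subject = BASE + expand_pattern("/people/@@Author.ID@@", row)
    triples.append((subject, "rdf:type", "foaf:Person"))
    # Property bridge: d2rq:pattern yields a literal for foaf:name.
    name = expand_pattern("@@Author.first@@ @@Author.last@@", row)
    triples.append((subject, "foaf:name", name))

for triple in triples:
    print(triple)
```

Looping over all rows in this way corresponds roughly to the dump use case; query rewriting would instead translate a query into SQL and apply the patterns only to the matching rows.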


Joins

map:author_PropertyBridge a d2rq:PropertyBridge;
    d2rq:belongsToClassMap :PapersClassMap;
    d2rq:property dc:creator;
    d2rq:refersToClassMap :PeopleClassMap;
    d2rq:join "Author.ID=Rel_Authors_Papers.AuthorID";
    d2rq:join "Rel_Authors_Papers.PaperID=Papers.ID".

Table: Author

ID   name    email
12   Chris   [email protected]

Table: Papers

ID    title        confID
312   D2R Server   132

Table: Rel_Authors_Papers

AuthorID   PaperID
12         312

http://www4.wiwiss.fu-berlin/d2rServer/docs/312 dc:creator http://www4.wiwiss.fu-berlin/d2rServer/people/12 .
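One way to picture what the two d2rq:join conditions do is to use them directly as the WHERE clause of a SQL query over the three tables. The sketch below does this against an in-memory SQLite database. It is only an illustration under stated assumptions: the SQL shown is not necessarily what D2RQ generates, the column types are guessed, the email column is left out, and the URI patterns `/docs/…` and `/people/…` are inferred from the example triple rather than taken from a class map on the slides.

```python
import sqlite3

# Stand-in database with the three tables and rows shown on the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Author (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Papers (ID INTEGER PRIMARY KEY, title TEXT, confID INTEGER);
CREATE TABLE Rel_Authors_Papers (AuthorID INTEGER, PaperID INTEGER);
INSERT INTO Author VALUES (12, 'Chris');
INSERT INTO Papers VALUES (312, 'D2R Server', 132);
INSERT INTO Rel_Authors_Papers VALUES (12, 312);
""")

# The two d2rq:join conditions, copied verbatim from the property bridge.
joins = [
    "Author.ID=Rel_Authors_Papers.AuthorID",
    "Rel_Authors_Papers.PaperID=Papers.ID",
]

# Assumed translation: all tables named in the joins go into FROM,
# and the join conditions are combined with AND.
sql = (
    "SELECT Papers.ID, Author.ID "
    "FROM Author, Rel_Authors_Papers, Papers "
    "WHERE " + " AND ".join(joins)
)

BASE = "http://www4.wiwiss.fu-berlin/d2rServer"
triples = [
    (f"{BASE}/docs/{paper_id}", "dc:creator", f"{BASE}/people/{author_id}")
    for paper_id, author_id in conn.execute(sql)
]
print(triples)
```

Each result row pairs one paper with one of its authors, so an n:m relationship stored in a link table becomes a plain set of dc:creator triples.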


Other Features of the Mapping Language

Conditional mappings

Value translation tables

Extensible with arbitrary value translation functions

Performance hints
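Two of these features, conditional mappings and value translation tables, can be sketched in a few lines of Python. Everything specific in the sketch is hypothetical and chosen only for illustration: the `kind` and `publish` columns, the second paper row, the `ex:` class names, and the use of a dictionary as the translation table. It shows the idea, not D2RQ's mapping syntax or implementation.

```python
import sqlite3

# Stand-in database; the kind and publish columns are hypothetical additions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Papers (ID INTEGER PRIMARY KEY, title TEXT, kind TEXT, publish INTEGER);
INSERT INTO Papers VALUES (312, 'D2R Server', 'P', 1);
INSERT INTO Papers VALUES (313, 'Unpublished draft', 'J', 0);
""")

# Conditional mapping: only rows that satisfy the condition are mapped.
condition = "Papers.publish = 1"

# Value translation table: database value -> RDF value (hypothetical entries).
translation = {"P": "ex:ConferencePaper", "J": "ex:JournalArticle"}

rows = conn.execute(f"SELECT ID, kind FROM Papers WHERE {condition} ORDER BY ID")
triples = [
    (f"/docs/{paper_id}", "rdf:type", translation[code])
    for paper_id, code in rows
]
print(triples)
```

The row with `publish = 0` never reaches the RDF view, and the single-letter database code is replaced by a class name on the way out.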


D2RQ Architecture and Interfaces


Performance and Limitations

Performance is fine with databases containing a few million records. Dumps, Linked Data, and the HTML interface are usually no problem.

Simple SPARQL queries usually fine.

Complex SPARQL queries (OPTIONAL, FILTER, LIMIT) are sometimes slow.

This is due to limitations of the implementation and will improve with future releases.

Limitations

No support for Named Graphs

Read only. No support for CREATE/DELETE/UPDATE

No support for inference


Areas for Future Community Work


RDF Access to Relational Databases

With Virtuoso, DartGrid, SPASQL, SquirrelRDF, Relational.OWL, D2RQ, and … there are various suitable solutions around.

Compare the Expressivity of Mapping Languages

People need weird mappings and fixups for database design anti-patterns.

We need an accepted mapping benchmark which reflects this.

First approach: THALIA testbed.

Compare the Performance of the Different Implementations

We need an accepted performance benchmark.


Future Community Work seen from the Web Perspective

Mapping relational databases to RDF is a local problem and its technical realization matters little from the Web perspective.

What people really want are expressive and fast queries

over an integrated view

on an unbounded number of data sources (the Web)

expressed via simple user interfaces.

We should aim at providing answers to the well-known, but hard data integration questions arising from this scenario.


Testbed: The Linking Open Data Cloud


Federation versus Replication

1. Virtual Integration via SPARQL Query Federation

- DARQ (HU Berlin)
- Complicated and slow.

2. Materialized Integration via Crawling

- Zitgist (Zitgist), SWSE (DERI), Swoogle (UMBC), Watson (Open University)
- Fast, but requires huge RDF repositories.
- Worked for HTML, worked for RSS, so why not for RDF?

3. Materialization On-the-Fly

- Crawl only data that is needed while answering the query.
- Semantic Web Client Library (FU Berlin), SWIC (University of London)
- Works, but is really slow.

[Figure: DBpedia, geonames, FOAF, and SIOC data sources connected by RDF links]


Data Source Discovery and Description

1. Registry-based Discovery

Registries collect links or data source descriptions.

- Example: Ping the Semantic Web

Work on data source descriptions

- DARQ, SADDLE

2. Link-based Discovery

Discovering RDF data by following RDF links.

Worked fine on the classic HTML Web, so why not for the Semantic Web?


Schema Mapping

Still no clear answers to:

How to express mappings between different RDF vocabularies?

How to publish and search for such mappings on the Web?

RDF Schema and OWL are insufficient in practice to express mappings.

Maybe the upcoming Rule Interchange Format (RIF) could provide a solution?


Conclusion

We should look at which parts of the Semantic Web puzzle are still missing to make RDF-based data integration work on Web scale!

This talk is online at http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/Bizer-Cyganiak-D2RQ-slides.pdf