Linked Data tooling and XML WWW.FREME-PROJECT.EU 1

Co-funded by the Horizon 2020 Framework Programme of the European Union Grant Agreement Number 644771

30 JUNE 2015

Felix Sasaki

DFKI / W3C Fellow

LINKED DATA TOOLING AND XML

www.freme-project.eu

BACKGROUND: THE FREME PROJECT

• Two-year H2020 Innovation Action; started February 2015

• Industry partners leading four business cases around digital content and (linked) data

• Technology development bridging language and data

• Outreach and business modelling demonstrating monetization of the multilingual data value chain

WHAT IS LINKED DATA?

• A way to represent data on the Web

◦ Give each data item (a “resource”) a unique identifier: http://example.com/xml-ug-berlin

◦ Create links between data items; describe the type of links also via unique identifiers

- http://example.com/xml-ug-berlin

- http://schema.org/Place

- http://dbpedia.org/resource/Berlin

Means: “The XML User Group Berlin takes place in Berlin”

In linked data terminology: a triple, consisting of subject, predicate and object
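A minimal Python sketch of the triple idea, using only the three identifiers from the slide. The helper name `to_ntriples` is an illustrative assumption, and the function only handles triples whose three parts are all IRIs.

```python
# A triple is (subject, predicate, object); here all three are IRIs.
triple = (
    "http://example.com/xml-ug-berlin",    # subject
    "http://schema.org/Place",             # predicate (as used on the slide)
    "http://dbpedia.org/resource/Berlin",  # object
)

def to_ntriples(s, p, o):
    """Serialize one all-IRI triple as a single N-Triples line."""
    return f"<{s}> <{p}> <{o}> ."

line = to_ntriples(*triple)
print(line)
```

Each N-Triples line spells out the full identifiers, which is why the prefixed syntaxes are usually easier to read.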

• Linked data: applying Web principles to data

◦ Data item (a “resource”) = like a Web page with a web address

◦ Links between data items = links between pieces of web content

◦ Types of links are clear for the human reader

The XML User Group Berlin takes place in <a href="http://en.wikipedia.org/Berlin">Berlin</a>

◦ Linked data provides links in a machine readable way

WHAT DO YOU DO WITH LINKED DATA?

• Creation

◦ From scratch

◦ Based on existing structured data

◦ Based on existing unstructured data

• Storing

◦ RDF (“Resource Description Framework”)

• Modelling of vocabularies (not covered here)

◦ Creating schemas for linked data

◦ Using schema languages with various levels of expressivity: RDF Schema, SKOS, OWL

◦ If possible: avoid creating your own schema and reuse existing linked data vocabularies

• Consumption

◦ Query of linked data via SPARQL

• Further processing

LINKED DATA TOOLING AND XML – BACKGROUND

• If you embrace the linked data technology stack, use the Apache Jena Java library

◦ http://jena.apache.org/

◦ Provides all you need for linked data creation, modelling, storage and query

• This presentation assumes you want to integrate linked data processing with an XML technology stack – potential reasons

◦ You don’t want to, or cannot, replace existing XML tooling

◦ Your data is both XML and linked data – you need to interface between the two

◦ Your workflow assumes both XML and linked data – at least partial conversions are needed

• In general, don’t think about formal aspects of RDF – they are not needed for integration with XML processing

SYNTAXES

• Linked data can be written in various syntaxes

◦ RDF/XML: XML syntax

◦ Turtle

◦ N3

◦ JSON-LD

◦ RDFa

• Examples are on the following slides

TURTLE SYNTAX

@prefix db: <http://dbpedia.org/resource/> .
@prefix dbont: <http://dbpedia.org/ontology/> .
@prefix sdo: <http://schema.org/> .

<http://example.com/xml-ug-berlin> sdo:Place db:Berlin .
db:Berlin dbont:populationTotal 3415091 .

Structure:

• Declaration of prefixes

• Write each part of a triple (subject, predicate, object) explicitly

• Easy to read & easy to create with XML tooling

• Will be used here to create linked data

• Example: see linked-data-tooling-xml-examples/example1.ttl
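To illustrate what a prefix declaration does, here is a small Python sketch that expands prefixed names into full identifiers. It assumes `dbont` is bound to the namespace `http://dbpedia.org/ontology/` (not to a single property), and the function name `expand` is hypothetical. It is not a Turtle parser.

```python
# Prefix table mirroring the @prefix declarations.
# Assumption: each prefix maps to a namespace, so prefix + local name = full IRI.
PREFIXES = {
    "db": "http://dbpedia.org/resource/",
    "dbont": "http://dbpedia.org/ontology/",
    "sdo": "http://schema.org/",
}

def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'db:Berlin' into a full IRI."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local

print(expand("db:Berlin"))
print(expand("dbont:populationTotal"))
```

If a prefix were bound to the full property identifier instead of the namespace, `dbont:populationTotal` would expand to an identifier ending in `populationTotalpopulationTotal`.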

JSON-LD SYNTAX

{
  "@context": { "dbpedia": "http://dbpedia.org/resource/", … },
  "@graph": [
    {
      "@id": "http://example.com/xml-ug-berlin",
      "schema:Place": { "@id": "dbpedia:Berlin" }
    },
    {
      "@id": "dbpedia:Berlin",
      "dbont:populationTotal": 3415091
    }
  ]
}

• Full example: see linked-data-tooling-xml-examples/example1.json

RDF/XML SYNTAX

<rdf:RDF …>
  <rdf:Description rdf:about="http://example.com/xml-ug-berlin">
    <sdo:Place>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin">
        <dbont:populationTotal rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">3415091</dbont:populationTotal>
      </rdf:Description>
    </sdo:Place>
  </rdf:Description>
</rdf:RDF>

• Full example: see linked-data-tooling-xml-examples/example1.xml
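Because RDF/XML is XML, ordinary XML tooling can read it. The Python sketch below uses only the standard library. The namespace declarations on `rdf:RDF` are assumptions filled in to make the snippet parseable, since the slide elides them. Treating RDF/XML as plain XML only works when the layout is known in advance, because the same triples can be serialized in several different RDF/XML shapes.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DBONT = "http://dbpedia.org/ontology/"  # assumed namespace binding

rdf_xml = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sdo="http://schema.org/"
         xmlns:dbont="http://dbpedia.org/ontology/">
  <rdf:Description rdf:about="http://example.com/xml-ug-berlin">
    <sdo:Place>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin">
        <dbont:populationTotal rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">3415091</dbont:populationTotal>
      </rdf:Description>
    </sdo:Place>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(rdf_xml)

# Collect the rdf:about value of every rdf:Description, in document order.
subjects = [d.get(f"{{{RDF}}}about") for d in root.iter(f"{{{RDF}}}Description")]

# Read the population literal from the nested description.
population = int(root.find(f".//{{{DBONT}}}populationTotal").text)

print(subjects)
print(population)
```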

FURTHER SYNTAXES

• See linked-data-tooling-xml-examples folder

◦ Microdata: example1-microdata.html

◦ RDFa: example1-rdfa.html

◦ N-Triples: example1.nt

• Tooling for conversion and validation

◦ Conversion: http://rdf-translator.appspot.com/

◦ Conversion: http://www.easyrdf.org/converter

◦ JSON-LD checking: http://json-ld.org/playground/index.html

◦ RDF/XML validation: http://www.w3.org/RDF/Validator/

EXAMPLE: CONVERTING A TABLE TO LINKED DATA

Files:

• XML file with table: linked-data-tooling-xml-examples/table.xml

• XSLT stylesheet to create Turtle file: linked-data-tooling-xml-examples/table2ttl.xsl

• Output in TTL: linked-data-tooling-xml-examples/table.ttl

Lessons learned:

• Conversions are data and vocabulary specific

• Good practices

◦ Use existing linked data vocabularies

◦ Link to existing linked data sources
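The deck does the table conversion in XSLT (table2ttl.xsl, not reproduced here). The same idea can be sketched in Python under stated assumptions: the `<table>/<row>/<city>/<population>` layout below is hypothetical and not the structure of table.xml, and the function name `table_to_turtle` is made up for illustration. The sketch follows both good practices by reusing the DBpedia ontology property and linking to DBpedia resources.

```python
import xml.etree.ElementTree as ET

# Hypothetical input layout; the real table.xml may look different.
table_xml = """\
<table>
  <row><city>Berlin</city><population>3415091</population></row>
</table>
"""

def table_to_turtle(xml_text):
    """Turn each table row into one Turtle triple about a DBpedia resource."""
    lines = [
        "@prefix db: <http://dbpedia.org/resource/> .",
        "@prefix dbont: <http://dbpedia.org/ontology/> .",
        "",
    ]
    for row in ET.fromstring(xml_text).iter("row"):
        # Assumes the city name is usable as a Turtle local name (no spaces etc.).
        city = row.findtext("city").strip()
        population = int(row.findtext("population"))
        lines.append(f"db:{city} dbont:populationTotal {population} .")
    return "\n".join(lines)

turtle = table_to_turtle(table_xml)
print(turtle)
```

The mapping from column to property is hard-coded, which reflects the first lesson: such conversions are specific to the data and the vocabulary.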

EXAMPLE: XML FORMAT SPECIFIC CONVERSION

Format “XLIFF”

Files:

• XLIFF Input file: linked-data-tooling-xml-examples/example-xliff.xlf

• XSLT stylesheet to create Turtle file: linked-data-tooling-xml-examples/xliff-to-nif.xsl

• Output in TTL: linked-data-tooling-xml-examples/example-xliff.ttl

SERVE YOUR OWN LINKED DATA

• Widely used server: Apache Jena Fuseki

◦ http://jena.apache.org/documentation/serving_data/

• Usage

1. Convert your XML data to RDF (see previous slides)

2. Install Apache Jena Fuseki (with Java available: download > unzip > start server)

3. Add data via the Fuseki Web interface at http://localhost:3030

4. Run SPARQL queries
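Step 4 can also be done programmatically. The Python sketch below builds a SPARQL query request against a local Fuseki server using only the standard library. The dataset name `ds` and the `/ds/sparql` endpoint path are assumptions, so check the names configured in your own Fuseki installation. `run_query` needs a running server and is therefore only defined, not called.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

FUSEKI = "http://localhost:3030"  # address from the slide
DATASET = "ds"                    # hypothetical dataset name

# Generic query: list up to ten triples from the dataset.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""

def build_request(query, base=FUSEKI, dataset=DATASET):
    """Build a SPARQL-protocol GET request asking for XML results."""
    url = f"{base}/{dataset}/sparql?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+xml"})

def run_query(query):
    """Send the query and return the raw XML result document as text."""
    with urlopen(build_request(query)) as response:
        return response.read().decode("utf-8")

request = build_request(QUERY)
print(request.full_url)
```

Asking for `application/sparql-results+xml` keeps the response in a form that XML tooling can process directly.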

QUERYING LINKED DATA VIA XSLT

Files:

• XSLT stylesheet that processes linked data: linked-data-tooling-xml-examples/generate-markup-person.xsl

• ePub file that has the unstructured content

Workflow:

• Stylesheet uses text content as input to linked data query

• Query executed via SPARQL, using the DBpedia SPARQL endpoint

• Output is available in SPARQL query output format

◦ Also has an XML syntax

◦ See example at linked-data-tooling-xml-examples/sparql-output.xml

◦ Can then be processed in XSLT
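The deck processes the XML result format in XSLT. As a rough equivalent, this Python sketch reads a document in the SPARQL Query Results XML Format with the standard library. The sample result, including the variable names `person` and `name` and the resource `Example_Person`, is invented for illustration and is not the content of sparql-output.xml.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Invented sample result; real output depends on the query that was run.
sparql_xml = """\
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="person"/><variable name="name"/></head>
  <results>
    <result>
      <binding name="person"><uri>http://dbpedia.org/resource/Example_Person</uri></binding>
      <binding name="name"><literal xml:lang="en">Example Person</literal></binding>
    </result>
  </results>
</sparql>
"""

def rows(xml_text):
    """Yield one dict per result row, mapping variable name to its value."""
    root = ET.fromstring(xml_text)
    for result in root.findall("sr:results/sr:result", NS):
        # Each binding has exactly one child element (uri, literal or bnode).
        yield {b.get("name"): b[0].text for b in result.findall("sr:binding", NS)}

parsed = list(rows(sparql_xml))
print(parsed)
```

In XSLT the same selection corresponds to matching `result` and `binding` elements in the SPARQL results namespace.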

SUMMARY – LINKED DATA AND XML TOOLING

• Easy to produce with existing XML content

◦ Use approaches for structured and unstructured content

◦ Store linked data in Turtle syntax

◦ Provide it via a triple store

• Easy to consume in XML tool chains

◦ Output of SPARQL queries: use XML based SPARQL result format

◦ Queries of public and your own data sets

• Main missing piece: knowledge transfer - e.g. about

◦ Adequate syntaxes in your (XML) workflow

◦ Existing data sets: sustainability, quality, licenses, …

◦ How to interrelate linked data queries with XSLT (= via XML result format)

FURTHER TOPICS

• Linked data support in XML database solutions

◦ Example: MarkLogic

• Combined data and language processing

◦ See FREME project
