Linked Data tooling and XML WWW.FREME-PROJECT.EU 1
Co-funded by the Horizon 2020 Framework Programme of the European Union Grant Agreement Number 644771
30 JUNE 2015
Felix Sasaki
DFKI / W3C Fellow
LINKED DATA TOOLING AND XML
www.freme-project.eu
BACKGROUND: THE FREME PROJECT
• Two-year H2020 Innovation Action; started February 2015
• Industry partners leading four business cases around digital content and (linked) data
• Technology development bridging language and data
• Outreach and business modelling demonstrating monetization of the multilingual data value chain
EXAMPLES:
See http://www.w3.org/People/fsasaki/linked-data-tooling-xml-examples.zip
WHAT IS LINKED DATA?
• A way to represent data on the Web
◦ Give each data item (a “resource”) a unique identifier: http://example.com/xml-ug-berlin
◦ Create links between data items; describe the type of links also via unique identifiers
- http://example.com/xml-ug-berlin
- http://schema.org/Place
- http://dbpedia.org/resource/Berlin
Means: “The XML User Group Berlin takes place in Berlin”
In linked data terminology: a triple, consisting of subject, predicate and object
• Linked data: applying Web principles to data
◦ Data item (a “resource”) = like a Web page with a web address
◦ Links between data items = links between pieces of web content
◦ Types of links are clear for the human reader
The XML User Group Berlin takes place in <a href="http://en.wikipedia.org/Berlin">Berlin</a>
◦ Linked data provides links in a machine readable way
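As an illustration of the triple structure described above, here is a minimal Python sketch. The Triple class and the to_ntriples helper are hypothetical names introduced only for this example; the three IRIs are the ones from the slide. It prints the triple as one line of N-Triples, the simplest of the linked data syntaxes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate and object, each given as an IRI."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialize a triple whose three parts are all IRIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


# "The XML User Group Berlin takes place in Berlin", using the slide's IRIs.
triple = Triple(
    "http://example.com/xml-ug-berlin",
    "http://schema.org/Place",
    "http://dbpedia.org/resource/Berlin",
)
print(to_ntriples(triple))
```

The sketch only handles objects that are IRIs; literal values such as numbers or strings need a different serialization.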
WHAT DO YOU DO WITH LINKED DATA?
• Creation
◦ From scratch
◦ Based on existing structured data
◦ Based on existing unstructured data
• Storing
◦ RDF (“Resource Description Framework”)
• Modelling of vocabularies (not covered here)
◦ Creating schemas for linked data
◦ Using schema languages with various levels of expressivity: RDF Schema, SKOS, OWL
◦ If possible: avoid creating your own schema and reuse existing linked data vocabularies
• Consumption
◦ Query of linked data via SPARQL
• Further processing
LINKED DATA TOOLING AND XML – BACKGROUND
• If you embrace the linked data technology stack, use the Apache Jena Java library
◦ http://jena.apache.org/
◦ Provides all you need for linked data creation, modelling, storage and query
• This presentation assumes you want to integrate linked data processing with an XML technology stack – potential reasons
◦ You don’t want or cannot replace existing XML tooling
◦ Your data is both XML and linked data – you need to interface between the two
◦ Your workflow assumes both XML and linked data – at least partial conversions are needed
• In general, don’t think about formal aspects of RDF – they are not needed for integration with XML processing
SYNTAXES
• Linked data can be written in various syntaxes
◦ RDF/XML: XML syntax
◦ Turtle
◦ N3
◦ JSON-LD
◦ RDFa
• Examples are on the following slides
TURTLE SYNTAX
@prefix db: <http://dbpedia.org/resource/>.
@prefix dbont: <http://dbpedia.org/ontology/>.
@prefix sdo: <http://schema.org/>.
<http://example.com/xml-ug-berlin> sdo:Place db:Berlin.
db:Berlin dbont:populationTotal 3415091 .
Structure:
• Declaration of prefixes
• Write each part of a triple (subject, predicate, object) explicitly
• Easy to read & easy to create with XML tooling
• Will be used here to create linked data
• Example: see linked-data-tooling-xml-examples/example1.ttl
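To make the role of the prefix declarations concrete, the following Python sketch expands a prefixed name such as db:Berlin into its full identifier. The expand function and the PREFIXES table are hypothetical helpers for this illustration, and the sketch assumes that dbont is bound to the namespace http://dbpedia.org/ontology/.

```python
# Prefix table mirroring the @prefix declarations of the Turtle example.
# Assumption: "dbont" is bound to the DBpedia ontology namespace.
PREFIXES = {
    "db": "http://dbpedia.org/resource/",
    "dbont": "http://dbpedia.org/ontology/",
    "sdo": "http://schema.org/",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name like 'db:Berlin' to a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {name!r}")
    # A prefixed name is just the namespace IRI followed by the local part.
    return prefixes[prefix] + local


print(expand("db:Berlin"))
print(expand("dbont:populationTotal"))
```

This is plain string concatenation, which is why Turtle is easy to generate with text-oriented tooling such as XSLT.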
JSON-LD SYNTAX
{
"@context": { "db": "http://dbpedia.org/resource/", … },
"@graph": [ {
"@id": "http://example.com/xml-ug-berlin",
"schema:Place": {
"@id": "dbpedia:Berlin"
} },
{
"@id": "dbpedia:Berlin",
"dbont:populationTotal": 3415091
} ] }
• Full example: see linked-data-tooling-xml-examples/example1.json
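Because JSON-LD is ordinary JSON, it can be read with any JSON parser. The Python sketch below walks an "@graph" array and lists the triples it contains. It is a naive illustration and not a conformant JSON-LD processor: the triples and expand functions are hypothetical, and the "@context" bindings are assumed for this example, since the slide elides them.

```python
import json

# Assumption: the prefix bindings below are illustrative; the slide's
# "@context" is abbreviated and may differ from the example file.
DOC = json.loads("""
{
  "@context": {
    "dbpedia": "http://dbpedia.org/resource/",
    "dbont": "http://dbpedia.org/ontology/",
    "schema": "http://schema.org/"
  },
  "@graph": [
    {"@id": "http://example.com/xml-ug-berlin",
     "schema:Place": {"@id": "dbpedia:Berlin"}},
    {"@id": "dbpedia:Berlin",
     "dbont:populationTotal": 3415091}
  ]
}
""")


def expand(term, context):
    """Expand 'prefix:local' if the prefix is in the context, else return as-is."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return term


def triples(doc):
    """Yield (subject, predicate, object) for each property of each node."""
    context = doc["@context"]
    for node in doc["@graph"]:
        subject = expand(node["@id"], context)
        for key, value in node.items():
            if key.startswith("@"):
                continue  # skip JSON-LD keywords such as @id
            if isinstance(value, dict) and "@id" in value:
                obj = expand(value["@id"], context)  # link to another resource
            else:
                obj = value  # plain literal value
            yield (subject, expand(key, context), obj)


for t in triples(DOC):
    print(t)
```

For real data, a JSON-LD library or one of the conversion tools is the safer choice, since the full specification covers many cases this sketch ignores.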
RDF/XML SYNTAX
<rdf:RDF …>
<rdf:Description rdf:about="http://example.com/xml-ug-berlin">
<sdo:Place>
<rdf:Description rdf:about="http://dbpedia.org/resource/Berlin">
<dbont:populationTotal rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">3415091</dbont:populationTotal>
</rdf:Description>
</sdo:Place>
</rdf:Description>
</rdf:RDF>
• Full example: see linked-data-tooling-xml-examples/example1.xml
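Since RDF/XML is well-formed XML, standard XML tooling can read it. The Python sketch below uses the standard library's ElementTree on a complete version of the snippet above. The namespace declarations on rdf:RDF are assumed, because the slide abbreviates them, and described_resources and population are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DBONT_NS = "http://dbpedia.org/ontology/"

# Assumption: namespace declarations filled in for illustration only.
RDF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sdo="http://schema.org/"
         xmlns:dbont="http://dbpedia.org/ontology/">
  <rdf:Description rdf:about="http://example.com/xml-ug-berlin">
    <sdo:Place>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin">
        <dbont:populationTotal rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">3415091</dbont:populationTotal>
      </rdf:Description>
    </sdo:Place>
  </rdf:Description>
</rdf:RDF>
"""


def described_resources(rdf_xml):
    """Return the rdf:about value of every rdf:Description, in document order."""
    root = ET.fromstring(rdf_xml)
    return [d.get(f"{{{RDF_NS}}}about") for d in root.iter(f"{{{RDF_NS}}}Description")]


def population(rdf_xml, resource):
    """Return dbont:populationTotal of the given resource as int, or None."""
    root = ET.fromstring(rdf_xml)
    for d in root.iter(f"{{{RDF_NS}}}Description"):
        if d.get(f"{{{RDF_NS}}}about") == resource:
            element = d.find(f"{{{DBONT_NS}}}populationTotal")
            if element is not None:
                return int(element.text)
    return None


print(described_resources(RDF_XML))
print(population(RDF_XML, "http://dbpedia.org/resource/Berlin"))
```

One caveat: the same set of triples can be written in RDF/XML in many different ways, so code like this depends on one particular serialization. That is a reason to prefer SPARQL results or a fixed syntax when consuming linked data in an XML tool chain.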
FURTHER SYNTAXES
• See linked-data-tooling-xml-examples folder
◦ Microdata: example1-microdata.html
◦ RDFa: example1-rdfa.html
◦ N-Triples: example1.nt
• Tooling for conversion and validation
◦ Conversion: http://rdf-translator.appspot.com/
◦ Conversion: http://www.easyrdf.org/converter
◦ JSON-LD checking: http://json-ld.org/playground/index.html
◦ RDF/XML validation: http://www.w3.org/RDF/Validator/
EXAMPLE: CONVERTING A TABLE TO LINKED DATA
Files:
• XML file with table: linked-data-tooling-xml-examples/table.xml
• XSLT stylesheet to create turtle file: linked-data-tooling-xml-examples/table2ttl.xsl
• Output in TTL: linked-data-tooling-xml-examples/table.ttl
Lessons learned:
• Conversions are data and vocabulary specific
• Good practices
◦ Use existing linked data vocabularies
◦ Link to existing linked data sources
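The idea behind such a conversion can be sketched in a few lines of Python. This is not the stylesheet from the examples folder: the table layout (row, city, population), the sample values and the table_to_turtle function are all hypothetical and serve only to show the pattern of mapping each row to one triple that reuses an existing vocabulary (the DBpedia ontology) and links to an existing data source (DBpedia resources).

```python
import xml.etree.ElementTree as ET

# Hypothetical input table; element names and values are for illustration
# only and are not authoritative figures.
TABLE_XML = """\
<table>
  <row><city>Berlin</city><population>3415091</population></row>
  <row><city>Hamburg</city><population>1700000</population></row>
</table>
"""

PREFIX_LINES = [
    "@prefix db: <http://dbpedia.org/resource/> .",
    "@prefix dbont: <http://dbpedia.org/ontology/> .",
]


def table_to_turtle(table_xml):
    """Map each table row to one Turtle triple about a DBpedia resource."""
    root = ET.fromstring(table_xml)
    lines = list(PREFIX_LINES)
    for row in root.findall("row"):
        # DBpedia resource names use underscores instead of spaces.
        # Assumption: names contain no other characters needing escaping.
        city = row.findtext("city").strip().replace(" ", "_")
        population = int(row.findtext("population"))
        lines.append(f"db:{city} dbont:populationTotal {population} .")
    return "\n".join(lines) + "\n"


print(table_to_turtle(TABLE_XML))
```

The mapping from column to property is hard-coded, which illustrates why conversions are data and vocabulary specific.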
EXAMPLE: XML FORMAT SPECIFIC CONVERSION
Format “XLIFF”
Files:
• XLIFF Input file: linked-data-tooling-xml-examples/example-xliff.xlf
• XSLT stylesheet to create turtle file: linked-data-tooling-xml-examples/xliff-to-nif.xsl
• Output in TTL: linked-data-tooling-xml-examples/example-xliff.ttl
SERVE YOUR OWN LINKED DATA
• Widely used server: Apache Jena Fuseki
◦ http://jena.apache.org/documentation/serving_data/
• Usage
1. Convert your XML data to RDF (see previous slides)
2. Install Apache Jena Fuseki (with Java available: download > unzip > start server)
3. Add data via Fuseki Web interface http://localhost:3030
4. Run SPARQL queries
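Step 4 can also be done programmatically. The Python sketch below builds a SPARQL query request following the SPARQL protocol (an HTTP GET with a "query" parameter) and asks for results in the XML format. The dataset name "mydata" and the endpoint path are assumptions; check the Fuseki web interface for the actual query URL of your dataset. build_query_request and run_query are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: a dataset called "mydata" exists and exposes a query service
# at this path; adjust to what your Fuseki installation shows.
ENDPOINT = "http://localhost:3030/mydata/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_request(endpoint, query):
    """Build a GET request carrying the query, asking for XML results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


def run_query(endpoint, query):
    """Send the query and return the raw XML result document as text."""
    with urlopen(build_query_request(endpoint, query)) as response:
        return response.read().decode("utf-8")


request = build_query_request(ENDPOINT, QUERY)
print(request.full_url)
# With a running server: print(run_query(ENDPOINT, QUERY))
```

Only the request is built here, so the sketch runs without a server; run_query needs a running Fuseki instance with data loaded.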
QUERYING LINKED DATA VIA XSLT
Files:
• XSLT stylesheet that processes linked data: linked-data-tooling-xml-examples/generate-markup-person.xsl
• ePub file that has the unstructured content
Workflow:
• Stylesheet uses text content as input to linked data query
• Query executed via SPARQL, uses the DBpedia SPARQL endpoint
• Output is available in SPARQL query output format
◦ Also available in an XML syntax
◦ See example at linked-data-tooling-xml-examples/sparql-output.xml
◦ Can then be processed in XSLT
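The XML result format is simple enough to process with any XML tooling. As an illustration outside XSLT, the Python sketch below turns a result document into a list of rows. The sample document is hypothetical (it is not the content of sparql-output.xml), and parse_results is a helper name made up for this example; the namespace is the one defined for the SPARQL Query Results XML Format.

```python
import xml.etree.ElementTree as ET

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical result document, for illustration only.
SAMPLE = """\
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="person"/><variable name="name"/></head>
  <results>
    <result>
      <binding name="person"><uri>http://dbpedia.org/resource/Ada_Lovelace</uri></binding>
      <binding name="name"><literal xml:lang="en">Ada Lovelace</literal></binding>
    </result>
  </results>
</sparql>
"""


def parse_results(xml_text):
    """Return one dict per <result>, mapping variable name to its text value."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


print(parse_results(SAMPLE))
```

An XSLT stylesheet would match the same elements (sparql, results, result, binding) in the same namespace.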
SUMMARY – LINKED DATA AND XML TOOLING
• Easy to produce with existing XML content
◦ Use approaches for structured and unstructured content
◦ Store linked data in turtle syntax
◦ Provide via triple store
• Easy to consume in XML tool chains
◦ Output of SPARQL queries: use XML based SPARQL result format
◦ Queries of public and your own data sets
• Main missing piece: knowledge transfer - e.g. about
◦ Adequate syntaxes in your (XML) workflow
◦ Existing data sets: sustainability, quality, licenses, …
◦ How to interrelate linked data queries with XSLT (= via XML result format)
FURTHER TOPICS
• Linked data support in XML database solutions
◦ Example: MarkLogic
• Combined data and language processing
◦ See FREME project
CONTACTS
Felix Sasaki
E-mail: [email protected]