Linking Media and Data using Apache Marmotta Keynote at LIME 2014 Workshop Sebastian Schaffert and Thomas Kurz



DESCRIPTION

Sebastian Schaffert is CTO and co-founder of RedLink GmbH. He also heads the "Knowledge and Media Technologies" department at Salzburg Research and occasionally lectures at the University of Applied Sciences (FH) Salzburg. He received his diploma in Computer Science in 2001 and his PhD in 2004, both at the University of Munich, Germany. His current research focus is Semantic Web technologies, especially Linked Data, Semantic Search, Information Extraction, and Multimedia Information Systems. Keynote at LIME workshop at ESWC 2014.


Page 1: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linking Media and Data using Apache Marmotta

Keynote at LIME 2014 Workshop

Sebastian Schaffert and Thomas Kurz

Page 2: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Contents

➔Motivation: The Red Bull Content Pool

➔Background:

➔ Linked Media Principles

➔ Media Fragments and Media Ontology

➔Implementation: Linked Media Framework

➔ Red Bull Use Case

➔ ConnectMe Use Case

➔Standardising: The Linked Data Platform

➔Introducing Apache Marmotta

➔Querying for Multimedia Fragments: SPARQL-MM

Timeline: 2009, 2011, 2013, 2014

Page 3: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Motivation: The Red Bull Content Pool

Page 4: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Media (2009)

Linked Media = Linked People + Linked Content + Linked Data

Page 5: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Motivation: The Red Bull Content Pool

➔ online archive containing video and image material related to extreme sports events organised by Red Bull

➔ business-to-business portal where journalists can get material for further broadcasting (mostly for free)

➔ material comes with metadata in the form of tables in word documents:

➔ interview transcriptions (with time interval start/end second)

➔ scene descriptions (with time interval start/end second)

➔ music cue sheets (copyright information about background music tracks)

Page 6: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Motivation: The Red Bull Content Pool (2009)

Page 7: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Motivation: The Red Bull Content Pool

➔Problems:

➔ videos consist of series of scenes with many different persons

➔ scanning through a video to find a particular scene is a huge amount of work

➔ metadata is valuable but not really exploited for searching videos and while playing videos

Page 8: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Can we help Markus?

Name: Markus

Occupation: sports journalist

Company: RegioTV Pinzgau

Objective: create report about cliff diving

Requires: videos, background info, contacts

How can we help Markus?

➔ efficient and precise search in the Red Bull Content Pool

➔ compact and relevant display of background information

➔ contacts (e.g. website, email) of athletes, other journalists, etc.

fast and successful creation of the report

Page 9: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Background: Linked Media Principles

Page 10: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Media Principles (2009)

➔ Linked Data is „read-only“, i.e. the focus was on publishing big datasets, not on interacting with the data

a system for managing media assets needs to be capable of updating resources and their metadata

➔ Linked Data is „data-only“, i.e. a resource is represented either as RDF metadata for machines or as HTML tables for humans, but in all cases it is metadata and not content

a system for managing media assets needs to be capable of managing both media content and metadata about that content

Page 11: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Media Principles (2009)

➔ extend Linked Data for updates using REST principles (HTTP):

➔ GET: returns a resource (as in Linked Data)

➔ POST: creates a new resource and uploads content or metadata

➔ PUT: updates content or metadata of a resource

➔ DELETE: removes a resource and all associated information

➔ extend Linked Data for arbitrary media formats using MIME:

➔ controlled by Accept: (in case of GET) and Content-Type: (in case of PUT/POST) HTTP headers

➔ header value: MIME type (e.g. text/turtle or image/jpeg) and type of relationship (e.g. rel=content or rel=meta)

➔ accessing a resource with GET or PUT redirects to the actual representation specified by MIME type and relationship
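The header convention above can be illustrated with a minimal Python sketch that splits a header value such as `text/turtle; rel=meta` into its MIME type and relationship. The names `MediaRequest` and `parse_linked_media_header` are hypothetical and not part of the Linked Media Framework; the sketch only assumes the `type; rel=value` shape shown on the slide.

```python
from typing import NamedTuple, Optional


class MediaRequest(NamedTuple):
    mime_type: str        # e.g. "text/turtle" or "image/jpeg"
    rel: Optional[str]    # "content", "meta", or None when no rel is given


def parse_linked_media_header(value: str) -> MediaRequest:
    """Split 'text/turtle; rel=meta' into MIME type and relationship."""
    parts = [p.strip() for p in value.split(";")]
    mime_type = parts[0].lower()
    rel = None
    for param in parts[1:]:
        name, _, val = param.partition("=")
        if name.strip().lower() == "rel":
            # Tolerate optional quoting around the parameter value.
            rel = val.strip().strip('"').lower()
    return MediaRequest(mime_type, rel)


print(parse_linked_media_header("text/html; rel=meta"))
```

A server following these principles would presumably use the parsed pair to pick the representation it redirects to; how that redirect target is built is not specified here.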

Page 12: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Media Principles (2009)

➔ Example 1: Retrieve HTML table representation of resource metadata

GET http://data.redlink.io/resource/1234
Accept: text/html; rel=meta

➔ Example 2: Retrieve HTML content of resource

GET http://data.redlink.io/resource/1234
Accept: text/html; rel=content

➔ Example 3: Update resource metadata

PUT http://data.redlink.io/resource/1234
Content-Type: text/turtle; rel=meta

<http://data.redlink.io/resource/1234> mm:hasFragment <http://data.redlink.io/resource/1234#t=0,10> .

Page 13: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Background: Media Fragments URI & Ontology for Media Resources

Page 14: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Media Fragments URI

➔ media content currently treated as „black box binary content“

➔ interaction only via plugin or special browser support

➔ linking to a subsequence of a video not possible

➔ Media Fragments URI: use the „fragment“ part of a URI to encode temporal and spatial subsequences

➔ Examples:

Identify the sequence from second 3 to second 10 of the video:

http://data.redlink.io/resource/cliff_diving.ogg#t=3,10

Identify the spatial box 320x240 at x=160 and y=120 of the video:

http://data.redlink.io/resource/cliff_diving.ogg#xywh=160,120,320,240  
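As a rough illustration of how such URIs can be taken apart, the following Python sketch reads the `t` and `xywh` dimensions from the fragment part. It is a simplification, not a full implementation of the Media Fragments URI specification: it assumes plain seconds for `t` (no `npt:` prefix or clock formats) and integer values for `xywh`, and the function name `parse_media_fragment` is made up for this example.

```python
from urllib.parse import urlsplit, parse_qs


def parse_media_fragment(uri: str) -> dict:
    """Extract temporal (t) and spatial (xywh) dimensions from a URI fragment."""
    fragment = urlsplit(uri).fragment
    params = parse_qs(fragment)          # "t=3,10" -> {"t": ["3,10"]}
    result = {}
    if "t" in params:
        start, _, end = params["t"][0].partition(",")
        # A missing start means 0; a missing end means "until the end".
        result["t"] = (float(start) if start else 0.0,
                       float(end) if end else None)
    if "xywh" in params:
        value = params["xywh"][0]
        unit = "pixel"                   # default unit when no prefix is given
        if ":" in value:
            unit, _, value = value.partition(":")
        x, y, w, h = (int(n) for n in value.split(","))
        result["xywh"] = {"unit": unit, "x": x, "y": y, "w": w, "h": h}
    return result


print(parse_media_fragment("http://data.redlink.io/resource/cliff_diving.ogg#t=3,10"))
```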

Page 15: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Ontology for Media Resources

➔ common data model for representing video metadata:

➔ identification

➔ creation (hasCreator, hasPublisher, ...)

➔ content description (hasLanguage, hasGenre, hasKeyword,...)

➔ rights and distribution (hasPermissions, hasTargetAudience, ...)

➔ technical properties (hasCompression, hasFormat, ...)

➔ fragments (hasFragment, hasChapter, ...)

➔ mapping tables from the most popular media metadata formats to the Ontology for Media Resources (EXIF, MPEG-7, TV-Anytime, YouTube, ID3)

Page 16: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Combining Media Fragments and Media Ontology

➔ use Media Fragment URIs to uniquely identify fragments of media content

➔ browser compatibility

➔ Linked Data compatibility

➔ use Ontology for Media Resources to describe these fragments

➔ RDF compatibility

➔ rich description graph with SPARQL querying

Page 17: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Combining Media Fragments and Media Ontology

@prefix ma: <http://www.w3.org/ns/ma-ont#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dct: <http://purl.org/dc/terms/> .

<http://example.org/v1> a ma:MediaResource ;
    rdfs:label "A sports video" ;
    ma:locator <http://my.videos.org/v1.mp4> ;
    ma:hasFragment <http://example.org/v1#fragment1> ;
    ma:hasFragment <http://example.org/v1#fragment2> .

<http://example.org/v1#fragment1>
    ma:locator <http://my.videos.org/v1.mp4#xywh=percent:26,20,22,80&t=194,198> ;
    dct:subject <http://example.org/person/Connor_Macfarlane> .

<http://example.org/v1#fragment2>
    ma:locator <http://my.videos.org/v1.mp4#xywh=percent:71,0,29,100&t=193,198> ;
    dct:subject <http://example.org/person/Lewis_Jones> .

<http://example.org/person/Connor_Macfarlane> foaf:name "Connor Macfarlane" .
<http://example.org/person/Lewis_Jones> foaf:name "Lewis Jones" .

Page 18: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Combining Media Fragments and Media Ontology

Page 19: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Implementation: The Linked Media Framework

Page 20: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Behind the Scenes: Linked Media Framework

Linked Data Server with updates and uniform management of content and metadata => particularly well-suited for multimedia content and metadata!

Linked Media Principles for resource-centric access to content and metadata

SPARQL Query and SPARQL Update 1.1 for structural updating and querying

Modules for Reasoning, Semantic Search, Linked Data Caching, Versioning, and Social Media

Specialised in Linked Media and Linked Enterprise Content

Code, Installer, Screencasts and more: http://code.google.com/p/lmf/

Page 21: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Media Framework (Architecture)

Page 22: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LMF Semantic Search

Faceted Search over Content and Metadata with Solr-compatible API

RDF Path Language for configurable Metadata Indexing

Multiple Cores with different configurations to adapt to different search requirements

Page 23: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LMF Reasoning

Rule-based reasoning over triples in the LMF triple store to represent implicit knowledge

Reason maintenance records the justifications for each inference

adapted version of sKWRL rule language: more efficient implementation, improved reason maintenance
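The idea of rule-based inference with recorded justifications can be sketched in a few lines of Python. This is a generic forward-chaining illustration under stated assumptions, not the sKWRL language or the LMF reasoner: the single transitivity rule, the `locatedIn` predicate, and the function name `infer_transitive` are all invented for the example.

```python
from itertools import product


def infer_transitive(triples, predicate):
    """Forward-chain a transitivity rule and keep one justification per new triple."""
    known = set(triples)
    justifications = {}   # inferred triple -> the two triples that support it
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that 'known' can grow inside the loop.
        for (s1, p1, o1), (s2, p2, o2) in product(list(known), repeat=2):
            if p1 == p2 == predicate and o1 == s2:
                new = (s1, predicate, o2)
                if new not in known:
                    known.add(new)
                    justifications[new] = ((s1, p1, o1), (s2, p2, o2))
                    changed = True
    return known, justifications


facts = [
    ("Salzburg", "locatedIn", "Austria"),
    ("Austria", "locatedIn", "Europe"),
]
known, why = infer_transitive(facts, "locatedIn")
print(why)
```

Keeping the supporting triples alongside each inferred triple is what would let a system retract an inference once one of its supports is deleted, instead of recomputing everything.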

Page 24: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LMF Linked Data Caching

transparently retrieves linked resources from the Linked Data cloud when needed (e.g. LD Path or SPARQL query)

powerful component for integrating with other information systems exposing their data as Linked Media or Linked Data

adapters for services offering their data in proprietary formats (e.g. YouTube, Vimeo, …)

Page 25: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LMF Classification and Sentiment Analysis

support for statistical text classification: different classifiers can be trained with sample texts for arbitrary categories

suggest most likely category for a text according to similarity with training data

analyse text for positive or negative sentiment (German and English)


Page 26: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LMF Social Media Integration

allows linking to social media resources, e.g. Facebook or Google accounts, videos, interests

allows authentication and data import from selected social media services (Facebook, YouTube, generic RSS)

Page 27: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LMF Versioning

keeps history of updates in the Linked Media Framework

provides information for trust and provenance of data, e.g. annotations added to the system

Page 28: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Use Case: Red Bull Semantic Search Prototype

Page 29: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Media Fragment Search

Page 30: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Spatial and Temporal Fragments

Page 31: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Use Case: LIME Media Player (ConnectMe Project)

Page 32: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LIME Player: Interaction with Fragments

Page 33: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Standardisation: The Linked Data Platform

Page 34: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Data Platform: Introduction

➔ recommendation draft of the LDP working group at W3C

➔ support for „read/write Linked Data“

➔ support for RDF and non-RDF resources

➔ can be used as an alternative to Linked Media Principles

➔ advantage of standardisation and wide adoption

➔ considerably more complex standard and protocol

➔ URL: http://www.w3.org/TR/ldp/

Page 35: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Linked Data Platform: Concepts

➔ access and interaction according to REST webservice principles

➔ GET: returns description of a resource

➔ POST: creates a new resource

➔ PUT: replaces the description of a resource

➔ DELETE: removes the description of a resource

➔ Linked Data Platform Resources (LDP-R)

➔ RDF resources (LDP-RS): RDF description of a resource

➔ non-RDF resources (LDP-NR): arbitrary (media) content

➔ Linked Data Platform Containers (LDP-C)

➔ collection of LDP resources, e.g. „students“, „professors“, „lectures“

➔ basic container (LDP-BC): simple collection of resources with common URI prefix

➔ direct container (LDP-DC): collection with explicit membership (as triple)

➔ indirect container (LDP-IC): collection with implicit membership (based on content)
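To make the interaction model concrete, here is a toy in-memory stand-in for a basic container, written in Python. It is only a sketch of the four operations and the RDF versus non-RDF distinction; it is not an LDP server, it ignores most of the protocol (headers, ETags, paging), and the class `ToyLdpContainer`, its methods, and the list of RDF media types are assumptions made for illustration.

```python
import itertools


class ToyLdpContainer:
    """In-memory stand-in for an LDP basic container (not a real LDP server)."""

    RDF_TYPES = {"text/turtle", "application/ld+json", "application/rdf+xml"}

    def __init__(self, base_uri):
        self.base_uri = base_uri.rstrip("/") + "/"
        self.resources = {}              # URI -> (content_type, body)
        self._ids = itertools.count(1)

    def post(self, content_type, body, slug=None):
        """Create a new member resource and return its URI."""
        name = slug or f"res{next(self._ids)}"
        uri = self.base_uri + name
        self.resources[uri] = (content_type, body)
        return uri

    def get(self, uri):
        """Return the stored (content_type, body) pair."""
        return self.resources[uri]

    def put(self, uri, content_type, body):
        """Replace the stored representation of an existing resource."""
        if uri not in self.resources:
            raise KeyError(uri)
        self.resources[uri] = (content_type, body)

    def delete(self, uri):
        del self.resources[uri]

    def kind(self, uri):
        """Classify a member as RDF source (LDP-RS) or non-RDF source (LDP-NR)."""
        return "LDP-RS" if self.resources[uri][0] in self.RDF_TYPES else "LDP-NR"


container = ToyLdpContainer("http://example.com/container1")
meta = container.post("text/turtle", "<> a <http://example.org/Thing> .", slug="a")
image = container.post("image/jpeg", b"\xff\xd8")
print(container.kind(meta), container.kind(image))
```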

Page 36: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

LDP Basic Containers (LDP-BC)

➔ collection of LDP resources

➔ identification via common URI prefix, e.g.
http://example.com/container1/a
http://example.com/container1/b

➔ can contain both RDF and non-RDF resources at the same time

➔ container is itself an RDF resource

➔ description as RDF:

@base <http://example.com/container1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .

<> a ldp:BasicContainer ;
   dcterms:title "A very simple container" ;
   ldp:contains <a>, <b>, <c> .
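A description like the one above can be generated from a list of member names. The Python sketch below does this by plain string formatting; the function name `basic_container_turtle` is hypothetical, and the sketch assumes a non-empty member list and a title without quotes or backslashes, since it performs no Turtle escaping.

```python
def basic_container_turtle(base, title, members):
    """Render a Turtle description of a basic container with relative member IRIs."""
    contained = ", ".join(f"<{m}>" for m in members)
    lines = [
        f"@base <{base}> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix ldp: <http://www.w3.org/ns/ldp#> .",
        "",
        "<> a ldp:BasicContainer ;",
        f'   dcterms:title "{title}" ;',
        f"   ldp:contains {contained} .",
    ]
    return "\n".join(lines)


ttl = basic_container_turtle(
    "http://example.com/container1/", "A very simple container", ["a", "b", "c"]
)
print(ttl)
```

For anything beyond a demonstration, an RDF library with a proper Turtle serialiser would be the safer choice.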

Page 37: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Introducing Apache Marmotta

Page 38: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Apache Marmotta

➔ a simplification of the Linked Media Framework taking core components:

➔ Linked Data Server with SPARQL 1.1

➔ Linked Data Cache

➔ Versioning, Reasoning

➔ no search, no content analysis

➔ reference implementation of the Linked Data Platform and participation in W3C working group

➔ highly modular and extensible to build custom Linked Data applications (both client and server)

http://marmotta.apache.org

Page 39: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Apache Marmotta: Architecture

Page 40: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Querying Multimedia Fragments

Page 41: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

SPARQL-MM: Introduction

➔ extension of SPARQL with specific multimedia functions and relations, implemented in Apache Marmotta

             Relation Function      Aggregation Function
Spatial      mm:rightBeside         mm:spatialIntersection
             mm:spatialOverlaps     mm:spatialBoundingBox
             …                      …
Temporal     mm:after               mm:temporalIntersection
             mm:temporalOverlaps    mm:temporalIntermediate
             …                      …
Combined     mm:overlaps            mm:boundingBox
             mm:contains            mm:intersection

A list of all functions can be found at: https://github.com/tkurz/sparql-mm/blob/master/sparql-mm/functions.md

Page 42: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

SPARQL-MM: A sample query

Give me the spatio-temporal snippet that shows Lewis Jones right beside Connor Macfarlane.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mm: <http://linkedmultimedia.org/sparql-mm/functions#>
PREFIX ma: <http://www.w3.org/ns/ma-ont#>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT (mm:boundingBox(?l1,?l2) AS ?two_guys) WHERE {
    ?f1 ma:locator ?l1; dct:subject ?p1.
    ?p1 foaf:name "Lewis Jones".
    ?f2 ma:locator ?l2; dct:subject ?p2.
    ?p2 foaf:name "Connor Macfarlane".
    FILTER mm:rightBeside(?l1,?l2)
    FILTER mm:temporalOverlaps(?l1,?l2)
}
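What the two filters and the aggregation compute can be approximated in plain Python on the two fragment locators from the earlier Turtle example. The definitions below are assumptions chosen for illustration and may differ from the actual SPARQL-MM semantics: "right beside" is read as "starts at or beyond the other box's right edge", and the bounding box is taken as the smallest spatio-temporal box enclosing both fragments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    x: float      # left edge, percent of frame width
    y: float      # top edge, percent of frame height
    w: float      # width in percent
    h: float      # height in percent
    start: float  # seconds
    end: float    # seconds


def right_beside(a, b):
    """Assumed reading: a starts at or to the right of b's right edge."""
    return a.x >= b.x + b.w


def temporal_overlaps(a, b):
    """True when the two time intervals share more than a single instant."""
    return max(a.start, b.start) < min(a.end, b.end)


def bounding_box(a, b):
    """Smallest spatio-temporal box enclosing both fragments (assumed semantics)."""
    x, y = min(a.x, b.x), min(a.y, b.y)
    right = max(a.x + a.w, b.x + b.w)
    bottom = max(a.y + a.h, b.y + b.h)
    return Fragment(x, y, right - x, bottom - y,
                    min(a.start, b.start), max(a.end, b.end))


# Values taken from xywh=percent:71,0,29,100&t=193,198 and
# xywh=percent:26,20,22,80&t=194,198 in the Turtle example.
lewis = Fragment(71, 0, 29, 100, 193, 198)
connor = Fragment(26, 20, 22, 80, 194, 198)

matches = right_beside(lewis, connor) and temporal_overlaps(lewis, connor)
two_guys = bounding_box(lewis, connor) if matches else None
print(two_guys)
```

Under these assumptions the result corresponds to the locator fragment `xywh=percent:26,0,74,100&t=193,198`.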

Page 43: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

SPARQL-MM: A sample query

mm:boundingBox(?l1,?l2)

Page 44: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

SPARQL-MM: Demo

DEMO!

Page 45: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Conclusions

Page 46: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Conclusions

➔ semantic media asset management requires management and interaction with both content and metadata

➔ Linked Media Principles (2009) were a first approach to extend Linked Data with support for semantic media asset management

➔ Linked Data Platform (W3C working draft) supersedes Linked Media Principles, as it covers the same aspects and more

➔ semantic media asset management requires specific media access and querying

➔ Media Fragments URI (W3C) to identify media fragments

➔ Ontology for Media Resources (W3C) to describe media fragments

➔ SPARQL-MM to query media fragment descriptions

Page 47: Linking Media and Data using Apache Marmotta  (LIME workshop keynote)

Thanks for your Attention!

Dr. Sebastian Schaffert

Chief Technology Officer

Redlink GmbH

[email protected]