A Mashup-friendly Resource and Metadata Management Framework

MUPPLE workshop, EC-TEL 2008, Sept 17 2008

Hannes Ebner, Royal Institute of Technology (KTH), Stockholm

Matthias Palmér, Uppsala University

Overview

• Mashups and PLEs
• Existing standards and why they don't work
• What is a resource and how can we manage it?
• Design details, used technologies, and interfaces
• A use case
• What is different now? - Conclusions
• Next steps

Our view on mashups and PLEs

• What is a mashup?
  ● Combination of several data sources
  ● Make use of public APIs
  ● Build upon a standard protocol: HTTP
  ● Use standard data formats: XML, JSON
• What is a PLE?
  ● Kind of a mashup
  ● Personally aligned: you pluck what you need
  ● Not necessarily bound to the Web

What is needed for powerful mashups?

• Extensive resource and metadata management
• What does extensive mean?
  ● A differentiation between:
    ● Resource
    ● Metadata
    ● Administrative information (entries)
  ● Sets of entries to be managed together (contexts)
• What do we get?
  ● A flexible way of managing, integrating, and reusing resources
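The separation above (resource, metadata, administrative information, and contexts as sets of entries) can be sketched as a small data structure. This is only an illustration of the concepts, not SCAM's actual data model; the class names, field names, and example values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Administrative wrapper tying a resource to its metadata (hypothetical model)."""
    entry_id: str                                   # identifier within a context
    resource: str                                   # the resource, here just a URI or file name
    metadata: dict = field(default_factory=dict)    # descriptive metadata about the resource
    admin: dict = field(default_factory=dict)       # administrative information about the entry


@dataclass
class Context:
    """A set of entries that is managed together."""
    name: str
    entries: dict = field(default_factory=dict)

    def add(self, entry: Entry) -> None:
        # Entries are indexed by their identifier within the context.
        self.entries[entry.entry_id] = entry


# Example values are made up for illustration.
portfolio = Context("Context 1")
portfolio.add(Entry("1", "http://example.org/paper.pdf",
                    metadata={"title": "A paper"},
                    admin={"creator": "alice"}))
```

Keeping the administrative information separate from the metadata means that a tool can replace or extend the descriptive metadata without touching what the repository needs to manage the entry.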

What does this look like?

[Figure: Context 1 and Context 2, each containing entries (Entry 1, Entry 2, Entry 3); entries carry metadata (MD) and refer to resources such as File 1 or an external URL.]

How do we get it?

• SCAM (Standardized Contextualized Access to Metadata) framework, version 4
• What is it good for?
  ● Provides a unified mechanism for accessing and managing resources and their metadata
  ● Information can be reused among any number of tools
  ● Applications can rely on SCAM instead of their own data and metadata layers

What else is out there? (1/2)

• Content Packaging: IMS CP
  ● Packages resources and describes them with metadata
  ● Used in IMS ePortfolio and IMS Learning Design
  ● Targeted towards transfer between systems
  ● No simple access to packaged resources
• WebDAV extension to HTTP
  ● Makes WWW writable
  ● Supports collections, resources, and links
  ● Provides search and versioning support
  ● Limited metadata through resource properties
  ● Reusing and describing resources in different contexts not possible

What else is out there? (2/2)

• Atom and AtomPub
  ● Based on XML, mostly used for feeds
  ● Creating and updating resources through AtomPub
  ● Basic concepts:
    ● Collections (feed containing entries)
    ● Workspaces (grouping of collections)
    ● Services (grouping of workspaces)
  ● Feeds provide metadata for each entry
  ● No search
  ● No use of external metadata possible
  ● Modification of available concepts is out of scope

Why doesn't it work?

• Purpose was different
• The discussed standards do not add up to a sound architecture
• Missing management of resources and metadata
• Difficult integration in a mashup environment

SCAM may support some of these standards as a complement

What is a resource anyway?

• Vague concept of resources
  ● Common definition: regular files, links to web content
  ● Wider perspective: books, physical persons, events, comments, etc.
• W3C TAG's definition:
  ● No limitation of the scope
  ● Whatever might be identified by a URI
  ● Information resources: have a digital representation
  ● Other resources: no digital representation

How does SCAM manage such resources?

• W3C TAG's differentiation between information and other resources is not enough
  ● Distinction whether the resource or the metadata is managed in SCAM
  ● We introduce Location Types:
    ● Local: Both resource and metadata are managed
    ● Link: Resource is referenced, metadata is managed
    ● Reference: Both resource and metadata are referenced
    ● LinkReference: Same as reference, with additional managed metadata
• Access Control on the level of contexts (sets of entries) or individual entries
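The four location types can be read as combinations of three yes/no questions: is the resource managed, is there locally managed metadata, and is external metadata referenced. A minimal sketch of that reading follows; the enum, its attribute names, and the helper `editable_parts` are assumptions made for illustration and are not taken from the SCAM code base.

```python
from enum import Enum


class LocationType(Enum):
    # value: (resource managed, local metadata managed, external metadata referenced)
    LOCAL = (True, True, False)
    LINK = (False, True, False)
    REFERENCE = (False, False, True)
    LINK_REFERENCE = (False, True, True)

    def __init__(self, resource_managed, metadata_managed, metadata_referenced):
        self.resource_managed = resource_managed
        self.metadata_managed = metadata_managed
        self.metadata_referenced = metadata_referenced


def editable_parts(location):
    """List which parts of an entry the repository itself holds and can change."""
    parts = []
    if location.resource_managed:
        parts.append("resource")
    if location.metadata_managed:
        parts.append("metadata")
    return parts
```

Under this reading, a Reference entry has nothing the repository can edit, while a LinkReference entry adds locally managed metadata on top of the referenced metadata.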

Location types

[Figure: for each location type (Local, Link, LinkReference, Reference), shows which parts of an entry lie inside the repository and which outside. Legend: R = Resource, M = Metadata, C = Cache.]

Resource types (1/2)

• In addition to location type:
  ● Representation type
    ● Does the resource have a digital representation or not?
  ● Builtin type
    ● Does the resource get a special treatment within SCAM?
  ● MIME type
    ● e.g. text/html, application/pdf, ...
  ● Application type
    ● Article, Assignment, Task, Person, ...

Resource types (2/2)

Resources can be divided into:

• Builtin type one of: Context, List, ResultList, User, Group
• Digital resources
  ● Builtin type: None & MIME type & Application type
  ● Uploaded files or links
  ● E.g. "text/html" & Assignment
• Non-digital resources
  ● Builtin type: None & Application type (application specific)
  ● E.g. Event, Task, Concept, Person, Book, Car, House
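The division above can be expressed as a small decision rule over the three type fields. The following is a sketch under assumptions: the function name `classify`, its return labels, and the choice to treat a missing value as `None` are illustrative and not part of SCAM.

```python
# Builtin types as listed on the slide.
BUILTIN_TYPES = {"Context", "List", "ResultList", "User", "Group"}


def classify(builtin_type=None, mime_type=None, application_type=None):
    """Return a coarse category for a resource based on its type fields."""
    if builtin_type is not None:
        if builtin_type not in BUILTIN_TYPES:
            raise ValueError(f"unknown builtin type: {builtin_type}")
        return "builtin"
    if mime_type is not None:
        # A MIME type implies that a digital representation exists.
        return "digital"
    if application_type is not None:
        # Only an application-specific type: no digital representation.
        return "non-digital"
    raise ValueError("not enough type information to classify the resource")
```

For example, `classify(mime_type="text/html", application_type="Assignment")` falls into the digital category, while `classify(application_type="Book")` falls into the non-digital one.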

Which technologies are used?

• RDF serves as a specification language
• Sesame RDF framework
  ● Quad store, support for Named Graphs
  ● High scalability, flexible API, powerful extension API
• Restlet framework, works towards JSR-311
  ● Supports containers, e.g. Tomcat, Jetty

Why do we use RDF?

• Semantics are well defined
• Named Graphs (an identifiable set of triples) closely reflect the system's design
  ● Each entry uses up to three Named Graphs
    ● Administrative entry information
    ● Entry's metadata
    ● The resource itself
  ● Each context is a NG, indexing all NGs of all entries
• Flexible way of managing metadata
  ● SCAM does not have an understanding of the metadata itself
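The Named Graph layout described above can be sketched with a plain dictionary that maps graph names to sets of triples. This is a toy model, not Sesame's API; the URI layout, the `ex:` predicates, and the function `add_entry` are all hypothetical.

```python
from collections import defaultdict

BASE = "http://example.org/scam/ctx1"   # hypothetical context URI


def add_entry(store, context, entry_id, resource_uri, metadata, resource_triples=()):
    """Store one entry as up to three named graphs and index them in the context graph."""
    entry_graph = f"{context}/entry/{entry_id}"
    metadata_graph = f"{context}/metadata/{entry_id}"
    resource_graph = f"{context}/resource/{entry_id}"

    # Graph 1: administrative entry information.
    store[entry_graph].add((entry_graph, "ex:resource", resource_uri))
    store[entry_graph].add((entry_graph, "ex:metadata", metadata_graph))

    # Graph 2: the entry's metadata, stored as-is without interpreting it.
    for predicate, value in metadata.items():
        store[metadata_graph].add((resource_uri, predicate, value))

    graphs = [entry_graph, metadata_graph]

    # Graph 3 (optional): the resource itself, if it is expressed as triples.
    if resource_triples:
        store[resource_graph].update(resource_triples)
        graphs.append(resource_graph)

    # The context graph indexes every named graph of every entry.
    for graph in graphs:
        store[context].add((context, "ex:hasGraph", graph))
    return graphs


store = defaultdict(set)   # graph name -> set of (subject, predicate, object)
graphs = add_entry(store, BASE, "1", "http://example.org/doc", {"dc:title": "Doc"})
```

Because the metadata graph is stored without interpretation, any vocabulary can be used in it, which matches the point that SCAM does not need an understanding of the metadata itself.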

How do we talk to SCAM?

• REST based HTTP interface
• Metadata harvesting
  ● OAI-PMH (both directions)
  ● FIRE/LRE
• Querying via SQI
• Serialization formats
  ● RDF/XML, TriG
  ● JDIL (based on JSON)
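With a REST based interface and several serialization formats, a client would typically pick the format through HTTP content negotiation. The sketch below only builds a request object and does not send it. The base URL, the URL layout, and the media types chosen for TriG and JDIL are assumptions for illustration; the slides do not specify them.

```python
from urllib.request import Request

BASE_URL = "http://localhost:8080/scam"   # hypothetical installation

# Media types per format; only RDF/XML's is a registered standard here,
# the TriG and JDIL values are assumptions about what a server might accept.
ACCEPT = {
    "rdfxml": "application/rdf+xml",
    "trig": "application/trig",
    "jdil": "application/json",
}


def metadata_request(context_id, entry_id, fmt="rdfxml"):
    """Build (but do not send) a GET request for an entry's metadata."""
    url = f"{BASE_URL}/{context_id}/metadata/{entry_id}"   # hypothetical URL layout
    return Request(url, headers={"Accept": ACCEPT[fmt]})


req = metadata_request("1", "42", "jdil")
```

Passing `req` to `urllib.request.urlopen` would perform the call against a running server.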

Confolio - A use case

• Web-based e-portfolio for document and resource management with collaboration support
• SCAM's types are mapped to specific features
  ● Contexts: portfolios
  ● Lists: folders
  ● Entries: items (in folders)
• Loose connections across installations
  ● Supported by references
  ● “Mounting” in external folders or resources
• Builds upon AJAX
  ● Dojo as generic framework
  ● SHAME library for presenting and editing metadata

Confolio - Screenshot

What is different now? - Conclusions

• Introduction of a generic architecture
  ● Especially adapted to the needs of web applications
• Most important innovation: Introduction of an entry
  ● Manages a resource, its metadata and its administrative information
  ● Keeps track of location type and resource types
• Many PLEs or Mashups have a notion of objects, assets or resources. An entry is perhaps a better alternative for capturing a wider range of situations

Next steps

• Stabilization of search API
  ● Needs to respect the ACL
  ● Free-text search
  ● Qualified searches using a subset of SPARQL
• SPARQL endpoint of Sesame could be exposed already now
  ● We don't do it: security issues (ACL)
• Support for a wide range of data formats
  ● Supported now: RDF and JSON

Further information

• Project web site
  ● http://confluence.iml.umu.se/display/SCAM
• Knowledge Management Research group at KTH
  ● http://kmr.nada.kth.se
• It's Open Source
  ● Try it!
  ● Contributions welcome!