A web-based repository service for vocabularies and alignments in the Cultural Heritage domain

Lourens van der Meij, Antoine Isaac, Claus Zinn



• Authors not here

• Projects


Using SW techniques for CH data


Focus on vocabularies and alignments

• Knowledge Organization Systems (KOS) like thesauri are used to describe cultural objects

• Many different KOSs are used in different institutions

• Merging them into one global vocabulary is neither realistic nor desirable


Semantic matching as a solution to tackle semantic heterogeneity


Eliciting needs for a repository

Application cases
• Semantic search and browsing
• (Re-)Indexing

Overall functions
• Uniform access to vocabularies
• Access & management of alignments

Experiment idea: test SW techniques for flexibility, ease of re-use and linking models & data


Existing RDF best practices: SKOS

animals
  NT cats

cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats

domestic cats
  USE cats

wildcats
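The thesaurus entries above can be turned into SKOS statements fairly mechanically. The following is a minimal sketch, not taken from the slides: the `http://example.org/animals/` namespace, the `@en` language tag and all function names are assumptions made for illustration. It maps BT/NT/RT to `skos:broader`/`skos:narrower`/`skos:related`, UF to `skos:altLabel`, and SN to `skos:scopeNote`.

```python
# Sketch: converting thesaurus-style entries into SKOS triples (N-Triples text).
# The EX namespace and the "@en" language tag are hypothetical choices.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
EX = "http://example.org/animals/"  # hypothetical namespace

# Codes that link two concepts, and codes that attach a literal to a concept.
CONCEPT_RELATIONS = {"BT": "broader", "NT": "narrower", "RT": "related"}
LITERAL_RELATIONS = {"UF": "altLabel", "SN": "scopeNote"}

# "domestic cats USE cats" is not a concept of its own: it is covered by the
# UF entry on "cats", which becomes an alternative label.
THESAURUS = {
    "animals": [("NT", "cats")],
    "cats": [
        ("UF", "domestic cats"),
        ("RT", "wildcats"),
        ("BT", "animals"),
        ("SN", "used only for domestic cats"),
    ],
    "wildcats": [],
}


def uri(term):
    return "<" + EX + term.replace(" ", "_") + ">"


def skos(name):
    return "<" + SKOS + name + ">"


def literal(text):
    return '"' + text + '"@en'


def to_triples(thesaurus):
    """Return a list of (subject, predicate, object) strings."""
    triples = []
    for term, entries in thesaurus.items():
        triples.append((uri(term), RDF_TYPE, skos("Concept")))
        triples.append((uri(term), skos("prefLabel"), literal(term)))
        for code, value in entries:
            if code in CONCEPT_RELATIONS:
                triples.append((uri(term), skos(CONCEPT_RELATIONS[code]), uri(value)))
            elif code in LITERAL_RELATIONS:
                triples.append((uri(term), skos(LITERAL_RELATIONS[code]), literal(value)))
    return triples


def to_ntriples(triples):
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(to_triples(THESAURUS)))
```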


Existing RDF best practices: SKOS

Crucial features for a repository
• Vocabulary membership
• Cross-vocabulary mapping properties


Existing RDF best practices: OAEI

From Ontology Alignment Evaluation Initiative

• Mapping cells
  – 2 entities being matched
  – 1 relation type (any!)
  – 1 measure
  – Provide hook for annotations

• Alignments between ontologies as set of cells
  – Can also be annotated

http://oaei.ontologymatching.org
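As an illustration of the cell/alignment structure listed above, here is a minimal data-model sketch. The class and field names are assumptions chosen for readability, not the API of any OAEI tool; treating the measure as a confidence value in [0, 1] is likewise an assumption of this sketch.

```python
# Sketch of an OAEI-style data model: an alignment is a set of mapping cells.
# Class and field names are hypothetical.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cell:
    entity1: str              # URI of the first matched entity
    entity2: str              # URI of the second matched entity
    relation: str             # any relation type, e.g. "=" or a property URI
    measure: float            # assumed here to be a confidence in [0, 1]
    annotations: tuple = ()   # hook for annotations, as (key, value) pairs


@dataclass
class Alignment:
    onto1: str
    onto2: str
    cells: set = field(default_factory=set)
    annotations: dict = field(default_factory=dict)  # alignment-level annotations

    def add(self, cell):
        if not 0.0 <= cell.measure <= 1.0:
            raise ValueError("measure must lie in [0, 1]")
        self.cells.add(cell)

    def about(self, entity):
        """Return all cells in which the given entity takes part."""
        return {c for c in self.cells if entity in (c.entity1, c.entity2)}
```

Because `Cell` is frozen (and therefore hashable), adding the same cell twice leaves the set unchanged, which matches the "set of cells" reading of an alignment.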


Existing RDF best practices: OAEI


Need for a service API?

• Need for dedicated middleware: some requirements beyond basic data access are not met by standard SPARQL
  – Full-text search on labels
  – Ranking of results
  – Access control/authentication
  – Query complexity control
  – LOD data publication strategy
  – Other data exchange formats (JSON)

• APIs are also a good way to structure practices in a domain
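To make a few of these middleware requirements concrete (label search, ranking, a result cap, JSON output), here is a small sketch. It does not describe the repository's actual implementation; the in-memory concept list, the scoring scheme and the `limit` parameter are all assumptions.

```python
# Sketch: ranked label search with JSON output, over an in-memory concept list.
# Data, scoring weights and parameter names are hypothetical.
import json

CONCEPTS = [
    {"uri": "http://example.org/animals/animals", "prefLabel": "animals", "altLabels": []},
    {"uri": "http://example.org/animals/cats", "prefLabel": "cats", "altLabels": ["domestic cats"]},
    {"uri": "http://example.org/animals/wildcats", "prefLabel": "wildcats", "altLabels": []},
]


def score(query, concept):
    """3 = exact preferred label, 2 = exact alternative label, 1 = substring, 0 = no match."""
    q = query.lower()
    pref = concept["prefLabel"].lower()
    alts = [alt.lower() for alt in concept["altLabels"]]
    if pref == q:
        return 3
    if q in alts:
        return 2
    if q in pref or any(q in alt for alt in alts):
        return 1
    return 0


def search(query, concepts, limit=10):
    """Return a JSON array of matches, best first; `limit` caps the result size."""
    hits = [(score(query, c), c) for c in concepts]
    hits = [(s, c) for s, c in hits if s > 0]
    hits.sort(key=lambda sc: (-sc[0], sc[1]["prefLabel"]))
    return json.dumps(
        [{"uri": c["uri"], "label": c["prefLabel"], "score": s} for s, c in hits[:limit]]
    )
```

Capping the result size is one simple form of query complexity control; access control and LOD publication are left out of the sketch.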


API design

• API is inspired by both SKOS and OAEI APIs

• But dedicated to simple vocabularies
  Not fully-fledged ontologies

• Dedicated to vocabularies and alignments
  More than usual terminology repositories

• Alignments are for simple vocabularies
  Restricting OAEI-based functions to SKOS mappings
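One way to read "restricting OAEI-based functions to SKOS mappings" is that a mapping is only accepted when its relation is one of the SKOS mapping properties. The sketch below shows that filter; the tuple layout and function names are assumptions, while the five property names come from the SKOS vocabulary.

```python
# Sketch: keep only mappings whose relation is a SKOS mapping property.
# Mappings are modelled as hypothetical (entity1, relation, entity2, measure) tuples.
SKOS = "http://www.w3.org/2004/02/skos/core#"

SKOS_MAPPING_RELATIONS = {
    SKOS + name
    for name in ("exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch")
}


def is_skos_mapping(relation):
    return relation in SKOS_MAPPING_RELATIONS


def restrict_to_skos(mappings):
    """Drop every mapping whose relation is not a SKOS mapping property."""
    return [m for m in mappings if is_skos_mapping(m[1])]
```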


Distributed service architecture

• Allows a service to serve either vocabularies or alignments or both
  Fitting different stakeholder missions/interests

• One service can sit on several others
  Distribution is thought of as a scalability enabler
  Sends a reassuring message regarding access control
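The idea of one service sitting on several others can be sketched as a composite that exposes the same interface as the services it wraps. This is only an illustration under assumed class and method names, not the actual CATCH architecture.

```python
# Sketch: a federating service that delegates to underlying services.
# Class and method names are hypothetical.
class LocalService:
    """Serves vocabularies, alignments, or both, from local data."""

    def __init__(self, vocabularies=(), alignments=()):
        self.vocabularies = list(vocabularies)
        self.alignments = list(alignments)

    def list_vocabularies(self):
        return list(self.vocabularies)

    def list_alignments(self):
        return list(self.alignments)


class FederatedService:
    """Offers the same interface, answering from several underlying services."""

    def __init__(self, backends):
        self.backends = list(backends)

    def _merge(self, method_name):
        merged = []
        for backend in self.backends:
            for item in getattr(backend, method_name)():
                if item not in merged:  # avoid duplicates across backends
                    merged.append(item)
        return merged

    def list_vocabularies(self):
        return self._merge("list_vocabularies")

    def list_alignments(self):
        return self._merge("list_alignments")
```

Because both classes share an interface, a `FederatedService` can itself be a backend of another one, and each institution keeps control over what its own service exposes.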


Distributed service architecture


CATCH service implementation


CATCH service implementation

Plus: many alignments automatically created in the STITCH project


Driven by “business” interests
E.g., KB has a list of relevant KOSs in its context

[Figure: book collections and the KOSs used to describe them, in the KB context]
Book collection datasets: KB Deposit Coll.; KB Scientific Coll.; Dutch Public Libraries; LC (US Nat. Lib); BnF (French Nat. Lib); DNB (German Nat. Lib); Dutch Book-trade
KOSs: Biblion; NUR; BISAC subject codes; Brinkman; GTT; NBC class.; UNESCO class.; KB Corporatie + Persoon; RAMEAU subject headings; LCSH subject headings; DDC Dewey decimal class.; SWD subject headings; Personennamendatei; LC authority file; Autorités BNF; Doelgroep (audience)
KOS categories: other classifications; domain/discipline classifications; subject thesauri / subj. heading lists; person/corporation data
Legend: overlap between book collections (thickness indicates degree of overlap); vertical adjustment between a coll. and KOSs denotes KOSs' being used to describe that coll.
Johan Stapel


Deployment (1)

Vocabulary and alignment browser


Deployment (2)

RAMEAU (French NL) as linked data
• Interlinked with LCSH (Library of Congress)
• Soon to SWD (German NL)

• Using manual mappings from the MACS project

http://stitch.cs.vu.nl/repository


Deployment (3)

STITCH re-indexing prototype (ISWC 2009)
• Plugged into the KB cataloguing system
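Re-indexing, in the general sense of translating a record's subject annotations from one vocabulary to another through an alignment, can be sketched as follows. This is not the STITCH prototype's method: the function name, the `(source, target, measure)` tuple layout, the `min_measure` threshold and the example URIs are all assumptions.

```python
# Sketch: suggest target-vocabulary concepts for a record, using mappings.
# Tuple layout, threshold and URIs are hypothetical.
def reindex(subjects, mappings, min_measure=0.5):
    """Return target concepts for the given source subjects, best first.

    subjects: iterable of source concept URIs attached to a record
    mappings: iterable of (source, target, measure) tuples
    """
    subjects = set(subjects)
    best = {}
    for source, target, measure in mappings:
        if source in subjects and measure >= min_measure:
            # Keep the strongest evidence found for each target concept.
            best[target] = max(measure, best.get(target, 0.0))
    return sorted(best, key=lambda target: (-best[target], target))


if __name__ == "__main__":
    mappings = [
        ("http://example.org/a/cats", "http://example.org/b/chats", 0.9),
        ("http://example.org/a/cats", "http://example.org/b/felins", 0.6),
        ("http://example.org/a/dogs", "http://example.org/b/chiens", 0.95),
    ]
    print(reindex(["http://example.org/a/cats"], mappings))
```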


Lessons learnt

• Middleware is still useful
  – To match real application requirements
  – To gather communities of practice around new usages

• But SW tools really help building it

• Relevance of existing models like SKOS
  – Only one part of SKOS unused (collections) and one extension required (concept scheme groups)
  – Disclaimer: we were involved in SKOS

• Interest from the Cultural Heritage domain


(Changing landscape of) Issues

• Some basic middleware functions like full-text search are now tackled by vendor-specific SPARQL extensions
  We prefer it that way

• Working out the distributed architecture is difficult
  Progress on federated RDF repositories can be useful

• Versioning/changes MUST be addressed at a fine-grained level (concepts)
  Maybe the issue with the least mature solutions!


Future work

Already started!

CATCHplus: continuing CATCH efforts, bringing them even closer to production

New repository and interface


Current work

• Refinement of HTTP API
  E.g., possibility to search for pairs of related concepts, with constraints
  Closer to SPARQL, but still limiting complexity

• Based on OpenLink Virtuoso
  – Disk-based implementation can handle huge datasets
  – Built-in LOD function & full-text features


Current work

• Architecture is no longer distributed, for now!
  Difficult conflict between requirements
  – Some clients had requirements for SPARQL
  – Federated SPARQL query is (was?) not yet mature

• Named graphs are being experimented with
  – For representing KOS data bundles (file upload)
  – For contextualizing triples (one shortcoming of SKOS/RDF)
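The two uses of named graphs mentioned above can be illustrated with a tiny in-memory quad store: each uploaded file becomes one graph, and the graph name records the context in which a triple was asserted. This is a sketch under assumed names, not the Virtuoso-based implementation.

```python
# Sketch: named graphs as a fourth element on each triple (quads).
# Class and method names, and the graph URIs used with them, are hypothetical.
class QuadStore:
    def __init__(self):
        self.quads = set()  # (subject, predicate, object, graph)

    def upload(self, graph, triples):
        """Store one uploaded KOS file as a bundle under its own graph name."""
        for s, p, o in triples:
            self.quads.add((s, p, o, graph))

    def remove_bundle(self, graph):
        """Withdraw a whole upload without touching other bundles."""
        self.quads = {q for q in self.quads if q[3] != graph}

    def triples(self, graph=None):
        """All triples, or only those asserted in the given graph."""
        return {q[:3] for q in self.quads if graph is None or q[3] == graph}

    def contexts(self, s, p, o):
        """Graphs in which a given triple is asserted (its context)."""
        return {q[3] for q in self.quads if q[:3] == (s, p, o)}
```

With plain triples, a statement shared by two uploads could not be attributed to either one; keeping the graph name makes it possible to remove one bundle and still retain the statement from the other.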


Thanks!

http://stitch.cs.vu.nl/repository