Semantic annotation and search of large virtual heritage collections

Guus Schreiber
Free University Amsterdam


Page 1: Semantic annotation and search of large virtual heritage collections

Semantic annotation and search of large virtual heritage collections

Guus Schreiber

Free University Amsterdam

Page 2: Semantic annotation and search of large virtual heritage collections

Overview

• A non-technical view on the Semantic Web
• Work on Semantic-Web deployment
  – SKOS, RDFa
• Semantic annotation and search in virtual collections: the E-Culture example

Page 3: Semantic annotation and search of large virtual heritage collections

The Web: resources and links

[Figure: two URLs connected by a Web link]

Page 4: Semantic annotation and search of large virtual heritage collections

The Semantic Web: typed resources and links

[Figure: two URLs connected by a Web link, now typed: the painting “Femme au chapeau” (SFMOMA) is linked through the Dublin Core property “creator” to the ULAN entry for Henri Matisse]
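
The typed link in this example can be written down as subject-property-object statements. The Python sketch below is only an illustration: every URI under example.org is a placeholder invented here (the slide does not give the real ULAN or SFMOMA identifiers), and the type names are assumptions. Only the Dublin Core and RDF namespaces are actual ones.

```python
# Typed resources and links as (subject, property, object) triples.
# All example.org URIs are hypothetical placeholders.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"  # Dublin Core "creator"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

PAINTING = "http://example.org/sfmoma/femme-au-chapeau"
MATISSE = "http://example.org/ulan/henri-matisse"

triples = {
    (PAINTING, RDF_TYPE, "http://example.org/types/Painting"),
    (PAINTING, DC_CREATOR, MATISSE),
    (MATISSE, RDF_TYPE, "http://example.org/types/Artist"),
}


def objects(subject, prop):
    """Return all objects reached from `subject` via the typed link `prop`."""
    return {o for s, p, o in triples if s == subject and p == prop}


print(objects(PAINTING, DC_CREATOR))
```

Unlike a plain Web link, the property name tells a program what the link means, so it can ask specifically for the creator of a painting.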

Page 5: Semantic annotation and search of large virtual heritage collections
Page 6: Semantic annotation and search of large virtual heritage collections
Page 7: Semantic annotation and search of large virtual heritage collections

Principle 1: semantic annotation

• Description of web objects with “concepts” from a shared vocabulary

Page 8: Semantic annotation and search of large virtual heritage collections

Principle 2: semantic search

• Search for objects which are linked via concepts (semantic link)

• Use the type of semantic link to provide meaningful presentation of the search results

[Figure labels: urang-utang, orange, ape, great ape]
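
A minimal Python sketch of this principle follows. It assumes that the figure labels form a chain of "broader" links (urang-utang, great ape, ape) and that "orange" is an unrelated concept; the object names and annotations are made up for illustration.

```python
# Hypothetical mini-thesaurus: concept -> its broader concept.
broader = {"urang-utang": "great ape", "great ape": "ape"}

# Hypothetical annotations: object -> concept it is annotated with.
annotations = {"photo-17": "urang-utang", "drawing-3": "ape", "print-8": "orange"}


def path_to(concept, target):
    """Follow broader links from `concept`; return the path to `target` or None."""
    path = [concept]
    while path[-1] != target:
        nxt = broader.get(path[-1])
        if nxt is None:
            return None
        path.append(nxt)
    return path


def search(target):
    """Return {object: chain of concepts that links it to `target`}."""
    results = {}
    for obj, concept in annotations.items():
        path = path_to(concept, target)
        if path is not None:
            results[obj] = path
    return results


for obj, path in search("ape").items():
    print(obj, "matches via", " -> broader -> ".join(path))
```

Returning the path, not just the hit, is what allows the result list to explain why an object matched.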

Page 9: Semantic annotation and search of large virtual heritage collections

Principle 3: multiple vocabularies, or: the myth of a unified vocabulary

• In large virtual collections there are always multiple vocabularies
  – In multiple languages
• Every vocabulary has its own perspective
  – You can’t just merge them
• But you can use vocabularies jointly by defining a limited set of links
  – “Vocabulary alignment”

• It is surprising what you can do with just a few links

Page 10: Semantic annotation and search of large virtual heritage collections

Example: “Tokugawa”

AAT style/period: Edo (Japanese period), Tokugawa

SVCN period: Edo

SVCN is a local in-house thesaurus
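
The effect of a single alignment link can be sketched in Python as follows. The sketch assumes that "Tokugawa" is a label of the AAT concept only, and that the collection object is annotated with the in-house SVCN term; the object name and the data layout are hypothetical.

```python
# Concepts are (vocabulary, label) pairs.
AAT_EDO = ("AAT", "Edo (Japanese period)")
SVCN_EDO = ("SVCN", "Edo")

# Labels under which each concept can be found (assumed from the example).
labels = {AAT_EDO: {"Edo", "Tokugawa"}, SVCN_EDO: {"Edo"}}

# One alignment link between the two thesauri.
aligned = {(AAT_EDO, SVCN_EDO)}

# Hypothetical object annotated with the in-house SVCN term only.
annotations = {"svcn-object-1": SVCN_EDO}


def equivalents(concept):
    """Return the concept plus everything aligned with it, in either direction."""
    found = {concept}
    for a, b in aligned:
        if a == concept:
            found.add(b)
        if b == concept:
            found.add(a)
    return found


def search(term):
    """Find objects whose concept, or an aligned concept, carries `term` as a label."""
    hits = {c for c, ls in labels.items() if term in ls}
    expanded = set().union(*(equivalents(c) for c in hits))
    return sorted(o for o, c in annotations.items() if c in expanded)


print(search("Tokugawa"))
```

Without the entry in `aligned`, a search for "Tokugawa" would find nothing in the SVCN-annotated collection, because that label exists only on the AAT side.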

Page 11: Semantic annotation and search of large virtual heritage collections

A link between two thesauri

Page 12: Semantic annotation and search of large virtual heritage collections

RDF/OWL language constructs

• classes and individuals
• subclasses
• properties
• subproperties
• domain/range of properties
• XML Schema datatypes
• equality, inequality
• inverse, transitive, symmetric, functional properties
• property constraints: cardinality, allValuesFrom, someValuesFrom
• conjunction, disjunction, negation of classes
• hasValue, enumerated type

Page 13: Semantic annotation and search of large virtual heritage collections

How useful are RDF and OWL?

• RDF: basic level of interoperability
• Some constructs of OWL are key:
  – Logical characteristics of properties: symmetric, transitive, inverse
  – Identity: sameAs
• OWL pitfalls
  – Bad: if it is written in OWL it is an ontology
  – Worse: if it is not in OWL, then it is not an ontology
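
To illustrate why the logical characteristics of properties matter, here is a naive forward-chaining sketch in Python. It is not an OWL reasoner: the property names (partOf, hasPart, relatedTo) and the facts are invented for the example, and sameAs substitution is left out.

```python
def infer(triples, transitive=(), symmetric=(), inverse=None):
    """Apply symmetric, inverse and transitive rules until nothing new appears."""
    inverse = inverse or {}
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse:
                new.add((o, inverse[p], s))
            if p in transitive:
                # s -p-> o and o -p-> o2 together give s -p-> o2
                new.update((s, p, o2) for s2, p2, o2 in facts
                           if p2 == p and s2 == o)
        if new <= facts:
            return facts
        facts |= new


facts = infer(
    {
        ("Amsterdam", "partOf", "Netherlands"),
        ("Netherlands", "partOf", "Europe"),
        ("Impressionism", "relatedTo", "Post-Impressionism"),
    },
    transitive={"partOf"},
    symmetric={"relatedTo"},
    inverse={"partOf": "hasPart"},
)
print(("Amsterdam", "partOf", "Europe") in facts)
print(("Europe", "hasPart", "Amsterdam") in facts)
```

Three declarations about properties are enough to derive statements that nobody entered by hand, which is the practical payoff of these few OWL constructs.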

Page 14: Semantic annotation and search of large virtual heritage collections

W3C Semantic Web Deployment Working Group: making vocabularies/thesauri/ontologies available on the Web

• Schema for interoperable RDF/OWL representation of vocabularies
  – SKOS
• Publication guidelines:
  – URI management, representation of versions
• Embedding RDF in (X)HTML pages
  – RDFa

Page 15: Semantic annotation and search of large virtual heritage collections

SKOS: pattern for thesaurus modeling

• Based on ISO standard
• RDF representation
• Documentation: http://www.w3.org/TR/swbp-skos-core-guide/
• Base class: SKOS Concept

Page 16: Semantic annotation and search of large virtual heritage collections
Page 17: Semantic annotation and search of large virtual heritage collections
Page 18: Semantic annotation and search of large virtual heritage collections

Multi-lingual labels for concepts

Page 19: Semantic annotation and search of large virtual heritage collections

Semantic relation: broader and narrower

• No subclass semantics assumed!
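
The SKOS pattern of concepts with multilingual labels and broader/narrower links can be mimicked in plain Python. The sketch below does not use an RDF library; the example.org URIs, the field names and the Dutch labels are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: dict = field(default_factory=dict)  # language tag -> label
    broader: list = field(default_factory=list)     # URIs of broader concepts


APES = "http://example.org/c/apes"              # hypothetical URIs
GREAT_APES = "http://example.org/c/great-apes"

concepts = {c.uri: c for c in [
    Concept(APES, {"en": "apes", "nl": "apen"}),
    Concept(GREAT_APES, {"en": "great apes", "nl": "mensapen"}, broader=[APES]),
]}


def label(uri, lang, fallback="en"):
    """Preferred label in `lang`, falling back to another language if missing."""
    c = concepts[uri]
    return c.pref_label.get(lang, c.pref_label[fallback])


def narrower(uri):
    """Narrower is just the inverse of broader; no subclass reasoning is implied."""
    return [c.uri for c in concepts.values() if uri in c.broader]


print(label(GREAT_APES, "nl"), narrower(APES))
```

Because broader and narrower are ordinary links between concepts, nothing here licenses the conclusion that every instance of one concept is an instance of the other.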

Page 20: Semantic annotation and search of large virtual heritage collections

Indexing a resource with a SKOS concept

• primarySubject is defined as subproperty

Page 21: Semantic annotation and search of large virtual heritage collections

Adding semantics

• Adding OWL statements
• Interpretations of thesaurus relations such as narrower as subclass-of are often imprecise (but can still be useful)
• Learning relations between thesauri is an important form of additional semantics
  – Example: AAT contains styles; ULAN contains artists, but there is no link
  – Availability of this kind of alignment knowledge is extremely useful

Page 22: Semantic annotation and search of large virtual heritage collections

W3C standardization process

• Input: draft specification
• Collect use cases
• Derive requirements
• Create issues list: requirements that cannot be handled by the draft spec
• Propose resolutions for issues
• Continuously: ask for public feedback/comments
• Get consensus on amended spec
• Find two independent implementations for each feature in the spec

Page 23: Semantic annotation and search of large virtual heritage collections

Example issue: relationships between lexical labels

• In the draft SKOS spec, lexical labels of concepts are represented as datatype properties

• Use cases require relations between labels, e.g. “AAT” is an acronym of “Art & Architecture Thesaurus”

• This is a problem because literals have no URI (so they cannot be the subject of an RDF property)

• Possible resolutions:
  – Labels/terms as classes
  – Relaxing constraints on label property
  – …..
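
The issue can be made concrete with a small Python sketch. It models only the rule that a literal cannot be the subject of a statement, and then shows one way around it in which each label becomes a resource with its own URI. This is an illustration of the idea, not the resolution the working group adopted; the property names and example.org URIs are hypothetical.

```python
from typing import NamedTuple


class URI(NamedTuple):
    value: str


class Literal(NamedTuple):
    value: str


def triple(subject, prop, obj):
    """Build a statement; as in RDF, a literal may not be the subject."""
    if not isinstance(subject, URI):
        raise TypeError("subject must be a URI, not a literal")
    return (subject, prop, obj)


EX = "http://example.org/"  # hypothetical namespace
aat = URI(EX + "concept/aat")

# Draft-style modelling: labels are plain literals, so this is rejected.
try:
    triple(Literal("AAT"), "acronymOf", Literal("Art & Architecture Thesaurus"))
except TypeError as err:
    print("rejected:", err)

# Alternative modelling: labels are resources that carry a literal form.
full = URI(EX + "label/art-and-architecture-thesaurus")
acro = URI(EX + "label/aat")
statements = [
    triple(full, "literalForm", Literal("Art & Architecture Thesaurus")),
    triple(acro, "literalForm", Literal("AAT")),
    triple(aat, "prefLabel", full),
    triple(aat, "altLabel", acro),
    triple(acro, "acronymOf", full),  # now expressible
]
```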

Page 24: Semantic annotation and search of large virtual heritage collections

Recipes for vocabulary URIs

• Simplified rule:
  – Use the “hash” variant for vocabularies that are relatively small and require frequent access
    http://www.w3.org/2004/02/skos/core#Concept
  – Use the “slash” variant for large vocabularies, where you do not always want the whole vocabulary to be retrieved
    http://xmlns.com/foaf/0.1/Person

• For more information and other recipes, see: http://www.w3.org/TR/swbp-vocab-pub/
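
The practical difference between the two variants comes from how HTTP treats the fragment: the part after "#" is never sent to the server. The Python sketch below uses the standard-library function `urllib.parse.urldefrag` to show which document a client would request for a few term URIs; the helper name `document_to_fetch` is made up for this example.

```python
from urllib.parse import urldefrag


def document_to_fetch(term_uri):
    """Return the URL an HTTP client requests for `term_uri` (fragment removed)."""
    return urldefrag(term_uri).url


hash_terms = [
    "http://www.w3.org/2004/02/skos/core#Concept",
    "http://www.w3.org/2004/02/skos/core#prefLabel",
]
slash_terms = [
    "http://xmlns.com/foaf/0.1/Person",
    "http://xmlns.com/foaf/0.1/name",
]

# Hash variant: every term resolves to the same single document.
print({document_to_fetch(u) for u in hash_terms})
# Slash variant: every term is its own request, so the server can answer per term.
print({document_to_fetch(u) for u in slash_terms})
```

With hash URIs a client always downloads the whole vocabulary, which is fine when it is small; with slash URIs the server can return only the description of the requested term.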

Page 25: Semantic annotation and search of large virtual heritage collections
Page 26: Semantic annotation and search of large virtual heritage collections

Query for WordNet URI returns “concept-bounded description”

Page 27: Semantic annotation and search of large virtual heritage collections

RDFa: embedding RDF metadata in an (X)HTML file

[Figure: regular HTML, the same HTML with RDFa, and the resulting RDF statements]
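
The original slide content is not preserved here, so the following Python sketch uses a made-up fragment to show the idea: RDFa attributes such as `about` and `property` are added to ordinary markup, and statements can be read back from them. The reader class handles only this flat pattern and is not a conforming RDFa processor; the example.org URI is hypothetical, and the declaration of the `dc:` prefix is omitted for brevity.

```python
from html.parser import HTMLParser

# Hypothetical page fragment with RDFa attributes.
HTML_WITH_RDFA = """
<div about="http://example.org/paintings/1">
  <span property="dc:title">Femme au chapeau</span> by
  <span property="dc:creator">Henri Matisse</span>
</div>
"""


class TinyRDFaReader(HTMLParser):
    """Collects (subject, property, literal) from `about`/`property` attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None
        self.pending = None  # property waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        self.pending = attrs.get("property")

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None


reader = TinyRDFaReader()
reader.feed(HTML_WITH_RDFA)
for t in reader.triples:
    print(t)
```

The page still renders as normal HTML for human readers, while the same file yields machine-readable statements.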

Page 28: Semantic annotation and search of large virtual heritage collections
Page 29: Semantic annotation and search of large virtual heritage collections

More information

Page 30: Semantic annotation and search of large virtual heritage collections

E-Culture demonstrator

• Part of the large Dutch knowledge-economy project MultimediaN

• Partners: VU, CWI, UvA, DEN, ICN

• People: Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga

• Artchive.com, ICN: Rijksmuseum Amsterdam, Dutch ethnology museums (Amsterdam, Leiden), National Library (Bibliopolis)

Page 31: Semantic annotation and search of large virtual heritage collections

Use case: painting style

Find paintings of a similar style

KLIMT, Gustav
Portrait of Adele Bloch-Bauer I
1907
Oil and gold on canvas
138 x 138 cm
Austrian Gallery, Vienna

Page 32: Semantic annotation and search of large virtual heritage collections

How can we find this other ‘Art nouveau’ painting?

MUNCH, Edvard
The Scream
1893
Oil, tempera and pastel on cardboard
91 x 73.5 cm
National Gallery, Oslo

Page 33: Semantic annotation and search of large virtual heritage collections

Issues w.r.t. the use case

• Parse annotation to find matches with thesaurus terms
  – E.g. match artists to ULAN individuals
• Artists-style links
  – AAT contains styles; ULAN contains artists, but there is no link
    • Learn link from corpora
    • Derive it from other annotations
  – Domain-specific rules/reasoning needed
    • see example in SWRL doc
    • Painters may have painted in multiple styles
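
One of the options listed above, deriving the artist-style link from other annotations, might look like the following Python sketch. The annotated works are invented test data and are not a statement about the actual collections; the threshold parameter `min_count` is an assumption added to show how weak evidence could be filtered.

```python
from collections import Counter, defaultdict

# Hypothetical annotated works: (artist, style or None when no style is recorded).
works = [
    ("Gustav Klimt", "Art Nouveau"),
    ("Gustav Klimt", "Art Nouveau"),
    ("Gustav Klimt", "Symbolism"),
    ("Gustav Klimt", None),
    ("Edvard Munch", "Expressionism"),
    ("Edvard Munch", None),
]


def artist_styles(works, min_count=1):
    """Derive artist -> set of styles from works that carry both annotations."""
    counts = defaultdict(Counter)
    for artist, style in works:
        if style is not None:
            counts[artist][style] += 1
    return {a: {s for s, n in c.items() if n >= min_count}
            for a, c in counts.items()}


links = artist_styles(works)
print(sorted(links["Gustav Klimt"]))
```

The result is a set of styles per artist, not a single value, which reflects the point that painters may have worked in multiple styles.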

Page 34: Semantic annotation and search of large virtual heritage collections

Example enrichment

• Learning relations between art styles in AAT and artists in ULAN through NLP of art-historic texts

• But don’t learn things that already exist!

Page 35: Semantic annotation and search of large virtual heritage collections
Page 36: Semantic annotation and search of large virtual heritage collections

Culture Web demonstrator
http://e-culture.multimedian.nl

Page 37: Semantic annotation and search of large virtual heritage collections
Page 38: Semantic annotation and search of large virtual heritage collections
Page 39: Semantic annotation and search of large virtual heritage collections

16 Nov 2006

Page 40: Semantic annotation and search of large virtual heritage collections

Perspectives

• Basic Semantic Web technology is ready for deployment
  – in open knowledge-rich domains
  – Important research issues: scalability, vocabulary alignment, metadata extraction

• Web 2.0 features:
  – Involving community experts in annotation
  – Personalization, myArt

• Social barriers have to be overcome!
  – “open door” policy
  – Involvement of general public => issues of “quality”

• Importance of using open standards
  – Away from custom-made flashy web sites