Transcript
Page 1: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Nederlandse Organisatie voor Wetenschappelijk Onderzoek

A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

LREC 2008, Marrakech, Morocco

Hennie Brugman, MPI for Psycholinguistics, Nijmegen, Netherlands

Véronique Malaisé, Free University, Amsterdam, Netherlands

Laura Hollink, Free University, Amsterdam, Netherlands

Page 2: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Overview

– The CATCH programme and its annotation requirements
– Existing models
– Annotation Meta Model (AMM) and its application to CATCH cases
– Software and infrastructure
– Conclusions

Page 3: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

CATCH – Continuous Access To Cultural Heritage

– Dutch research program funded by NWO
– Apply state-of-the-art methods to the construction and exploitation of digital collections of large Cultural Heritage institutions
– Currently 10 projects, hosted by Cultural Heritage institutions
    – Rijksmuseum Amsterdam, Dutch National Archive, Dutch National Library, Netherlands Institute for Sound and Vision, etc.
– Results and software applicable across institutions and collections

Page 4: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

• Objectives:

- Cross media, cross collection, cross institution annotation of digital objects and segments of objects

- Add new layers of annotation to existing annotations

- Centralize storage and exploitation of annotations generated by CATCH projects

- Apply and showcase annotation recommendation modules/services from several CATCH projects

Page 5: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: a Resource carries properties. The value of a property is either a primitive value (string, date, number, …), for example “some text description”, or a semantic value, for example http://www.beeldengeluid.nl/GTAA#Subject_kunst_]
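The distinction between primitive and semantic values can be sketched in a few lines of Python. This is only an illustration, not the AMM implementation: the class names `Literal`, `Concept` and `Annotation`, the resource identifier and the property names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    """A primitive value: string, date, number, ..."""
    value: Union[str, int, float]


@dataclass(frozen=True)
class Concept:
    """A semantic value, identified by a URI from a vocabulary."""
    uri: str


@dataclass(frozen=True)
class Annotation:
    resource: str                    # hypothetical identifier of the annotated resource
    prop: str                        # hypothetical property name
    value: Union[Literal, Concept]


annotations = [
    Annotation("resource-1", "description", Literal("some text description")),
    Annotation("resource-1", "subject",
               Concept("http://www.beeldengeluid.nl/GTAA#Subject_kunst_")),
]

# Only semantic values can be followed into a thesaurus or ontology.
semantic = [a for a in annotations if isinstance(a.value, Concept)]
```

Keeping the two kinds of value in separate types makes it easy to select the annotations that can be linked to a vocabulary.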

Page 6: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: a Resource with property "artist" and value “Abraham van Beijeren”]

• complete resource
• catalog, metadata
• resource types
    • images
    • text
    • html, xml
    • audio
    • video

Page 7: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: Resource segment (sound-video), property choral:transcription, value “very much”]

Page 8: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: Resource segment (image), property racm-glass:Shape, value “roemer”]

Page 9: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: Resource segment (scanned handwriting), property scratch:transcription, value “boven eener verloting te Amsterdam”]

Page 10: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: the scratch:transcription value “boven eener verloting te Amsterdam” is itself a Resource segment (text); it carries property choice:location with value http://geonames.org/NL/Amsterdam]

Page 11: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

[Diagram: two Resource segments (text), “boven eener verloting te Amsterdam” and “bommenwerpers boven de hoofdstad” ("bombers over the capital"), both linked to http://geonames.org/NL/Amsterdam]

Page 12: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

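[Diagram: two Resource segments (text), “boven eener verloting te Amsterdam” and “bommenwerpers boven de hoofdstad” ("bombers over the capital"), both linked to http://geonames.org/NL/Amsterdam. Further semantic values shown: http://TGN/Amsterdam and http://TGN/NL]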
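Cross linking through a shared semantic value can be sketched as a simple grouping step. The snippet below is an illustration under assumptions: the tuple layout, the property name `location` for both segments, the third (unrelated) entry and the function name `cross_links` are hypothetical and not taken from the CATCH software.

```python
from collections import defaultdict

# (segment text, property, semantic value)
annotations = [
    ("boven eener verloting te Amsterdam", "location",
     "http://geonames.org/NL/Amsterdam"),
    ("bommenwerpers boven de hoofdstad", "location",
     "http://geonames.org/NL/Amsterdam"),
    ("unrelated segment", "location",
     "http://example.org/places/elsewhere"),  # hypothetical entry
]


def cross_links(annotations):
    """Group segments by semantic value; keep values shared by 2+ segments."""
    by_value = defaultdict(list)
    for segment, _prop, value in annotations:
        by_value[value].append(segment)
    return {value: segs for value, segs in by_value.items() if len(segs) > 1}


links = cross_links(annotations)
```

Segments from different collections end up in the same group whenever they point to the same URI, even when their surface text differs.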

Page 13: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotations in CATCH

– Further requirements for the annotation formalism
    – Project and media specific extensions
    – Predefined annotation schemes
    – Generic and specific queries possible
    – Expressive and simple
    – Reuse or include existing annotation models or vocabularies

Page 14: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Existing annotation models

– Semantic web community
    – Focus on (semantic) annotation values
    – Anchoring mainly to complete resources or web pages
– Linguistic annotation community
    – Anchoring to text or time series
    – Usually no semantic values
– Media industry (e.g. MPEG-7)
– Objections
    – Not all media types covered
    – Too complex or specialized
    – Hardly ever annotation of annotations, and of segments of annotation values

Page 15: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotation Meta Model (AMM) – why RDF?

– RDF, RDFS and OWL seem good modeling languages for the domain of annotation: graphs versus hierarchies
– Some of our requirements automatically met:
    – Class and property inheritance
    – Constraints (e.g. domains and ranges for properties)
    – Integration of semantic values
    – Classes and instances for annotation schemes and annotations, respectively
    – General and specific queries
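How class inheritance yields both general and specific queries can be shown with plain Python tuples standing in for RDF triples. This is a minimal sketch, not a real RDF store: the `ex:` instances and the helper names `superclasses` and `instances_of` are assumptions, and the two subclass statements are an assumed reading of the AMM class names.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("amm:ImageObject", SUBCLASS, "amm:AnnotatableObject"),
    ("amm:TextObject", SUBCLASS, "amm:AnnotatableObject"),
    ("ex:scan1", TYPE, "amm:ImageObject"),   # hypothetical instance
    ("ex:text1", TYPE, "amm:TextObject"),    # hypothetical instance
}


def superclasses(cls, triples):
    """Return cls plus every class reachable via rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in triples if s == current and p == SUBCLASS)
    return seen


def instances_of(cls, triples):
    """Instances typed with cls or with any of its subclasses."""
    return sorted(
        s for s, p, o in triples
        if p == TYPE and cls in superclasses(o, triples)
    )


general = instances_of("amm:AnnotatableObject", triples)
specific = instances_of("amm:ImageObject", triples)
```

The general query returns both instances, the specific one only the image. An RDFS-aware repository would derive the same result from the schema without custom code.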

Page 16: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

AMM – Core model

[Diagram: AMM core model. Property labels: anchorsTo, hasCoordinates, hasUnit, feature. Node types (labelled rdfs:type on the slide): AnnotatableObject (twice), Coordinates, Unit]

Page 21: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

AMM – Core model

[Diagram: class hierarchy]

AnnotatableObject: TimeSeriesObject, TextObject, ImageObject

Coordinates: TimeSegment, Region2D, TextSpan
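The two hierarchies can be mirrored as Python dataclasses. This is a sketch under assumptions: the field names, the pairing of each coordinate type with one object type (`ADDRESSES`), and the helper `can_address` are illustrative and not confirmed by the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatableObject:
    name: str


@dataclass(frozen=True)
class TimeSeriesObject(AnnotatableObject):
    """Audio or video."""


@dataclass(frozen=True)
class TextObject(AnnotatableObject):
    """Plain or marked-up text."""


@dataclass(frozen=True)
class ImageObject(AnnotatableObject):
    """Still image."""


@dataclass(frozen=True)
class Coordinates:
    unit: str


@dataclass(frozen=True)
class TimeSegment(Coordinates):
    begin: int
    end: int


@dataclass(frozen=True)
class Region2D(Coordinates):
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class TextSpan(Coordinates):
    begin: int
    end: int


# Assumed pairing: each coordinate type addresses one kind of object.
ADDRESSES = {
    TimeSegment: TimeSeriesObject,
    Region2D: ImageObject,
    TextSpan: TextObject,
}


def can_address(coords, obj):
    """True if this kind of coordinates can select a segment of obj."""
    return isinstance(obj, ADDRESSES[type(coords)])


region = Region2D(unit="pixels", x=0, y=0, width=10, height=10)
image = ImageObject("scan.jpg")   # hypothetical file name
text = TextObject("notes.txt")    # hypothetical file name
```

Here `can_address(region, image)` holds while `can_address(region, text)` does not, which is the kind of constraint the model expresses with property domains and ranges.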

Page 22: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

AMM – Scanned handwriting case

[Diagram: AMM instance graph for the scanned handwriting case]

Nodes: ann, image1, image2, text1, text2

Properties: amm:anchorsTo, amm:addressesRegion, amm:addressesTextSpan, amm:hasUnit, hasText, dc:title, hw:transcription, hw:location

Values: “handwriting.jpg”, (454,107,110,204), “pixels”, “in Amsterdam is het alle dagen feest”, (beginNode, endNode), “Amsterdam”, http://www.geonames.org/places#Amsterdam

Legend: AnnotatableObject, Coordinates
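One plausible reading of the handwriting graph can be written down as triples and queried. The edge directions below are an assumption, since the transcript preserves only the labels: image2 is taken to be a region of image1, text1 its transcription, and text2 a span of text1 that carries the location. The helper names are hypothetical.

```python
triples = [
    ("image1", "dc:title", "handwriting.jpg"),
    ("image2", "amm:anchorsTo", "image1"),
    ("image2", "amm:addressesRegion", (454, 107, 110, 204)),
    ("image2", "amm:hasUnit", "pixels"),
    ("image2", "hw:transcription", "text1"),
    ("text1", "hasText", "in Amsterdam is het alle dagen feest"),
    ("text2", "amm:anchorsTo", "text1"),
    ("text2", "amm:addressesTextSpan", ("beginNode", "endNode")),
    ("text2", "hasText", "Amsterdam"),
    ("text2", "hw:location", "http://www.geonames.org/places#Amsterdam"),
]


def subjects(prop, obj):
    return [s for s, p, o in triples if p == prop and o == obj]


def objects(subj, prop):
    return [o for s, p, o in triples if s == subj and p == prop]


def images_for_location(uri):
    """Walk from a location value back to the titles of the scanned images."""
    titles = []
    for span in subjects("hw:location", uri):
        for text in objects(span, "amm:anchorsTo"):
            for region in subjects("hw:transcription", text):
                for image in objects(region, "amm:anchorsTo"):
                    titles.extend(objects(image, "dc:title"))
    return titles


found = images_for_location("http://www.geonames.org/places#Amsterdam")
```

The walk crosses three annotation layers (location on a text span, transcription of an image region, region of an image), which illustrates layered annotation.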

Page 27: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

AMM – Other cases

– Semantic annotations of segments of text documents
– Manually annotated image regions
– Complex linguistic annotation of co-occurring speech and gesture
– Syntactic annotation of text

Page 28: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Software and infrastructure

– RDF repository, wrapped with AMM web service
    – Stores AMM model, project-specific annotation schemes and annotation data
– Java API, defining and implementing this web service
– Clients:
    – CHOICE@CATCH documentalist support system
    – Integrated multimedia and web based “Annotation and Recommendation” demonstrator for CATCH

Page 29: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

CHOICE Documentalist Support System

Page 30: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Annotation and Recommendation demonstrator

Page 31: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Conclusions

– All requirements seem to be met
– Applicable to a wide range of very different cases
– Repository works efficiently, however not yet tested with a large number of AnnotatableObjects (so far, approx. 50,000)
– Highlights:
    – Layered annotation
    – All media types are or can be supported
    – Annotation with multimedia objects or object segments possible

Page 32: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Thank you

Page 33: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Instances: CHOICE text annotation

[Diagram: AMM instance graph for a CHOICE text annotation]

Nodes: a1, r1

Properties: addressesTextSpan, partOf, dc:title, apoldaSubject, apoldaOntology, apoldaIdentifier

Values: (n1, n2), “AndereTijdenGemmeker.txt”, “Subject”, “http://www.beeldengeluid.nl/Thesaurus/GTAASkosv7.owl#”, Subject_bevelhebbers_

Page 34: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Instances: video/audio annotation

[Diagram: AMM instance graph for co-occurring speech and gesture on one video]

Nodes: a1, a2, a3, textObject, t1, t2, t3, t4

Properties: addressesTimeInterval, addressesTextSpan, partOf, hasTime, next, transcription, partOfSpeech, handshape, dc:title, dc:description

Values: “gesturing_people.mpg” (shown twice), “dit is gesproken tekst” ("this is spoken text"), (beginNode, endNode), http://www.isodatcats.org/part-of-speech#Noun, “3521”, “4692”, “some gesture description”, “3854”, “5290”, http://www.mpi.nl/myShapes#fist
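Co-occurrence of the speech and gesture annotations comes down to an interval overlap test. The sketch below assumes that 3521 and 4692 bound the speech annotation and 3854 and 5290 bound the gesture annotation, as the order of labels on the slide suggests. The time unit is not stated on the slide, and the function name `overlap` is hypothetical.

```python
def overlap(a, b):
    """Return the overlapping (begin, end) of two intervals, or None."""
    begin = max(a[0], b[0])
    end = min(a[1], b[1])
    return (begin, end) if begin < end else None


speech = (3521, 4692)    # assumed: time interval of the transcription
gesture = (3854, 5290)   # assumed: time interval of the gesture description

shared = overlap(speech, gesture)
```

With these values `shared` is `(3854, 4692)`, so the gesture starts while the utterance is still in progress.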

Page 35: A Common Multimedia Annotation Framework for Cross Linking Cultural Heritage Digital Collections

Instances: syntax tree on text

[Diagram: AMM instance graph for a syntax tree on the text “Der Mann geht schnell”]

Text resource: r1, dc:title “german sentence 1”, hasText “Der Mann geht schnell”

Word annotations a1 to a4, each partOf r1, each with addressesTextSpan (abbreviated addressesTS), hasText and syntax:

(n0, n3) “Der”, syntax D
(n4, n8) “Mann”, syntax N
(n9, n13) “geht”, syntax V
(n14, n21) “schnell”, syntax Adv

Phrase annotations, linked by anchorsTo: NP1 (syntax NP), VP1 (syntax VP), S1 (syntax S)
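The syntax tree case can be reproduced with a few dictionaries. This is a sketch under assumptions: the node offsets n0 to n21 are treated as character offsets, NP1 is taken to anchor to the words “Der” and “Mann”, VP1 to “geht” and “schnell”, and S1 to NP1 and VP1. The function name `bracket` is hypothetical.

```python
text = "Der Mann geht schnell"

# Word-level annotations: text span (begin, end) and syntactic category.
words = {
    "a1": ((0, 3), "D"),
    "a2": ((4, 8), "N"),
    "a3": ((9, 13), "V"),
    "a4": ((14, 21), "Adv"),
}

# Phrase-level annotations anchor to other annotations, not to the text.
phrases = {
    "NP1": ("NP", ["a1", "a2"]),
    "VP1": ("VP", ["a3", "a4"]),
    "S1": ("S", ["NP1", "VP1"]),
}


def bracket(node):
    """Render an annotation and everything it anchors to as a bracketed tree."""
    if node in words:
        (begin, end), category = words[node]
        return f"[{category} {text[begin:end]}]"
    category, children = phrases[node]
    return f"[{category} " + " ".join(bracket(child) for child in children) + "]"


tree = bracket("S1")
```

`tree` evaluates to `[S [NP [D Der] [N Mann]] [VP [V geht] [Adv schnell]]]`. Only the word level refers to text spans, and each higher level is an annotation of annotations.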

