Antoine Isaac Europeana – VU University Amsterdam. Dagstuhl Multilingual Semantic Web seminar. Europeana. “A digital library that is a single, direct and multilingual access point to the European cultural heritage.” European Parliament. 24 M objects (images, text, sound and video)
Antoine Isaac
Europeana – VU University Amsterdam
Dagstuhl Multilingual Semantic Web seminar
Europeana
24 M objects (images, text, sound and video)
From over 2,200 libraries, museums, archives
From 33 countries
For everyone
“A digital library that is a single, direct and multilingual access point to the European cultural heritage.”
European Parliament
Multilingual Access in Europeana
Dimensions of multilingual access
Interface
Search (query translation or document translation)
Result presentation
Browsing
Europeana's efforts
Interface translated into 26 languages
Query translation: only prototype
Query result filtering by country/language
Document translation (user enabled)
Semantic contextualization of objects
• Multilingual enrichment/annotation of metadata
Making metadata work for multilingual access
Current metadata in Europeana
Simple object records
Flat (text values)
Without language tags!
Only language-related info on metadata is at collection level
• Can be "mul"
Need to change!
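A minimal sketch of what this limitation means in practice, assuming a hypothetical helper `effective_language` (not part of any Europeana API): with no tag on the value itself, the only fallback is the collection-level language, and a collection marked "mul" tells us nothing about an individual value.

```python
from typing import Optional


def effective_language(value_lang: Optional[str],
                       collection_lang: Optional[str]) -> Optional[str]:
    """Return the best-known language code for a metadata value, or None.

    value_lang: language tag on the value itself (absent in current records).
    collection_lang: language declared for the whole collection; may be "mul".
    """
    if value_lang:
        return value_lang
    # "mul" (multiple languages) cannot be applied to a single value.
    if collection_lang and collection_lang != "mul":
        return collection_lang
    return None


print(effective_language(None, "de"))   # de
print(effective_language(None, "mul"))  # None
print(effective_language("fr", "mul"))  # fr
```

Values that end up with `None` are the ones for which language-specific processing (stemming, translation, label matching) has no reliable starting point.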
a new Europeana Data Model (EDM)
"Semantic layer" of contextual resources (concepts, persons, places, events...)
Networked objects
Cultural artefact
Painting, Sculpture, Building
Exploiting semantic relations, e.g. “broader concept”, “place of birth”, “involved person”…
Multilingual metadata
Fetching already available linked data
http://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset/
E.g., from libraries
Interoperability
Encouraging the use of RDF + common and simple elements
Interoperability
Encouraging the use of common and simple data elements
<skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2308">
  <skos:prefLabel xml:lang="fr">Piano carré</skos:prefLabel>
  <skos:prefLabel xml:lang="it">Pianoforte a tavolino</skos:prefLabel>
  <skos:prefLabel xml:lang="en">Square pianoforte</skos:prefLabel>
  <skos:prefLabel xml:lang="de">Tafelklavier</skos:prefLabel>
  <skos:prefLabel xml:lang="nl">Tafelpiano</skos:prefLabel>
  <skos:prefLabel xml:lang="sv">Taffel</skos:prefLabel>
  <skos:broader>
    <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2273">
      <skos:prefLabel xml:lang="en">Pianofortes</skos:prefLabel>
    </skos:Concept>
  </skos:broader>
</skos:Concept>
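To illustrate how such a record supports multilingual display, here is a small sketch that reads the SKOS concept above with Python's standard library and picks a label per language. The `rdf:RDF` wrapper with namespace declarations is added here so the fragment parses on its own, and the helper names `pref_labels` and `label_for` are illustrative only. Three of the six labels are kept for brevity.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Fragment from the slide, wrapped in rdf:RDF (an assumption) to declare namespaces.
SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2308">
    <skos:prefLabel xml:lang="fr">Piano carré</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Square pianoforte</skos:prefLabel>
    <skos:prefLabel xml:lang="de">Tafelklavier</skos:prefLabel>
    <skos:broader>
      <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2273">
        <skos:prefLabel xml:lang="en">Pianofortes</skos:prefLabel>
      </skos:Concept>
    </skos:broader>
  </skos:Concept>
</rdf:RDF>"""


def pref_labels(concept):
    """Map language code -> preferred label, using direct children only."""
    return {el.get(XML_LANG): el.text
            for el in concept.findall(f"{{{SKOS}}}prefLabel")}


def label_for(labels, wanted, fallback="en"):
    """Return the label in the wanted language, else the fallback language."""
    return labels.get(wanted) or labels.get(fallback)


root = ET.fromstring(SNIPPET)
concept = root.find(f"{{{SKOS}}}Concept")
labels = pref_labels(concept)
broader = concept.find(f"{{{SKOS}}}broader/{{{SKOS}}}Concept")

print(label_for(labels, "de"))        # Tafelklavier
print(label_for(labels, "es"))        # Square pianoforte (fallback to English)
print(broader.get(f"{{{RDF}}}about")) # URI of the broader concept
```

Falling back to English is only one possible policy; a portal could equally fall back to the language of the providing institution.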
Interoperability
mixed nature of eligible contextual resources: dictionaries, synonym/translation lists, thesauri, authority lists, gazetteers…
interplay: “semantic” data next to multilingual data
Simultaneous approaches
Getting richer semantic/multilingual metadata from providers
Fetching third-party contextual data and linking it to “un-contextualized” objects
Linking contextual data from an institution to another more general / more commonly used contextual dataset
• DBpedia.org, VIAF.org…
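The third approach, linking an institution's concepts to a more general dataset, can be sketched as label matching per language. This is a deliberately simple illustration, not the method used by Europeana: the target URIs under `example.org` and the function names are hypothetical, and only the local labels come from the MIMO example.

```python
def normalize(label):
    """Case-insensitive, whitespace-trimmed form used for comparison."""
    return label.strip().casefold()


def build_index(target_concepts):
    """Index (language, normalized label) -> set of target URIs."""
    index = {}
    for uri, labels in target_concepts.items():
        for lang, label in labels.items():
            index.setdefault((lang, normalize(label)), set()).add(uri)
    return index


def link_candidates(local_labels, index):
    """Rank target URIs by the number of languages in which labels agree."""
    votes = {}
    for lang, label in local_labels.items():
        for uri in index.get((lang, normalize(label)), ()):
            votes[uri] = votes.get(uri, 0) + 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Local concept: labels taken from the MIMO example.
local = {"fr": "Piano carré", "en": "Square pianoforte", "de": "Tafelklavier"}

# Hypothetical general-purpose dataset (made-up URIs and labels).
target = {
    "http://example.org/general/square-piano": {"en": "Square pianoforte",
                                                "de": "Tafelklavier"},
    "http://example.org/general/piano": {"en": "Piano", "fr": "Piano"},
}

print(link_candidates(local, build_index(target)))
# [('http://example.org/general/square-piano', 2)]
```

Agreement in several languages is a stronger signal than a single matching string, which is one way multilingual labels help the semantic linking itself.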
Status and challenges
Current status
All this is work in progress and will take time
R&D prototypes (EuropeanaConnect) showing the challenges of gathering appropriate multilingual tools and data
First tests of simple techniques in production portal: GeoNames (places) and GEMET (concepts)
Encouraging, but they illustrate the issues with overly naïve approaches (no NLP) and incomplete data
• Cheval
• Poison
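The slide does not spell out these two cases. The sketch below assumes they refer to ordinary words being matched as if they were names in a gazetteer or thesaurus, for example the French word "cheval" (horse) matching a place called Cheval. The gazetteer entry, its URI, and the word list are toy data invented for illustration.

```python
import re

# Toy gazetteer with a hypothetical URI; real enrichment used GeoNames.
GAZETTEER = {"cheval": "http://example.org/places/cheval"}


def tokens(text):
    """Lower-cased word tokens of a metadata string."""
    return re.findall(r"\w+", text.casefold())


def naive_place_matches(text):
    """Pure string lookup: every token found in the gazetteer is a 'place'."""
    return [GAZETTEER[t] for t in tokens(text) if t in GAZETTEER]


def language_aware_place_matches(text, lang, common_words):
    """Skip tokens that are ordinary words in the language of the record."""
    ordinary = common_words.get(lang, set())
    return [GAZETTEER[t] for t in tokens(text)
            if t in GAZETTEER and t not in ordinary]


title = "Étude de cheval"          # French: "study of a horse"
common = {"fr": {"cheval", "de"}}  # tiny illustrative word list

print(naive_place_matches(title))                         # one false match
print(language_aware_place_matches(title, "fr", common))  # []
```

The language-aware variant needs to know the language of the value, which ties this issue back to the missing language tags in current records.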
http://www.europeana.eu
Problems & requirements
For providers & Europeana
Continue work on metadata
Benchmarking (cf. CHiC lab @ CLEF)
Positioning as consumers and contributors of data (cf. Asun’s slides)
data.europeana.eu
For language-intensive tools and resources
Availability: open resources
Interoperability
Simplicity
• But not always! E.g., not only “first hit” translations
Scale: scalability of tools, number and scope of datasets
Many languages, some lesser-resourced (wrt. English)
Another illustration: VOICES project
Something entirely different but not completely unrelated
Voice-based community-centric mobile services for social development
Easing communication on agricultural trade
Listing of products/prices via phone/radio
Pilot in Mali
Challenges
Data-centric project, but language technology plays a crucial role
Objects should be provided with textual and audio labels (text-to-speech system) in different languages
Local languages: e.g., Bambara
Lack of resources: need low-cost, easy-to-adapt solutions
Victor de Boer, VU Amsterdam (v.de.boer@cs.vu.nl)
Thank you
aisaac@few.vu.nl
http://www.few.vu.nl/~aisaac/
Some slides based on Marlies Olensky and Juliane Stiller, Multilingual Web Workshop, June 11, 2012, Dublin