Classification schemes, thesauri and other Knowledge Organization Systems
- a Linked Data perspective
Antoine Isaac
Pelagios: Linked Pasts
London, July 20-21, 2015
Classification schemes?
Scope: knowledge organization systems (KOS) such as classification systems, thesauri, gazetteers, subject heading lists…
(last-minute addition: also time periods, cf. PeriodO)
Simple Knowledge Organization System
SKOS is for exchanging KOSs as Linked Data (in RDF)
• Better than semi-structured data (CSV)
• Still relatively simple
A SKOS graph
animals
cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats
domestic cats
  USE cats
wildcats
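The thesaurus entry above can be read as a set of SKOS statements. The following is a minimal sketch in plain Python (no RDF library), assuming the conventional mapping of UF/USE to skos:altLabel, BT to skos:broader, RT to skos:related and SN to skos:scopeNote. The `ex:` identifiers are invented for illustration and do not come from the slides.

```python
# Hypothetical identifiers (ex:...) for the three concepts in the example.
TRIPLES = [
    ("ex:animals", "rdf:type", "skos:Concept"),
    ("ex:animals", "skos:prefLabel", '"animals"'),
    ("ex:cats", "rdf:type", "skos:Concept"),
    ("ex:cats", "skos:prefLabel", '"cats"'),
    ("ex:cats", "skos:altLabel", '"domestic cats"'),  # UF / USE
    ("ex:cats", "skos:broader", "ex:animals"),        # BT
    ("ex:cats", "skos:related", "ex:wildcats"),       # RT
    ("ex:cats", "skos:scopeNote", '"used only for domestic cats"'),  # SN
    ("ex:wildcats", "rdf:type", "skos:Concept"),
    ("ex:wildcats", "skos:prefLabel", '"wildcats"'),
]


def objects(subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def to_turtle(triples):
    """Serialize the triples one per line in a Turtle-like notation."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_turtle(TRIPLES))
```

Note that "domestic cats" is not a concept of its own here: the USE reference becomes an alternative label on the concept for "cats".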
Representing semantics
The formal way: OWL Semantic Web ontology language
Used for ontologies that enable machine reasoning
Mother is a class
Parent is the class of entities of type Person that are related to at least one other resource of type Person using the child property
…
Do we want to represent every vocabulary as a formal ontology?
It is possible, but not easy
KOS are large
KOS have softer “semantics”
Parent RelatedTerm Child
KOS have a focus on terminological information
Child UsedFor Offspring
Softer semantics can be useful for many applications!
Europeana and knowledge organisation systems
Create a “semantic layer” on top of cultural heritage objects
From: Stefan Gradmann
Using KOS in the Europeana Data Model
Enhanced descriptive metadata
Using KOS Linked Data
<skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2251">
  <skos:prefLabel xml:lang="">Harpsichord</skos:prefLabel>
  <skos:prefLabel xml:lang="de">Cembalo</skos:prefLabel>
  <skos:prefLabel xml:lang="sv">Cembalo</skos:prefLabel>
  <skos:prefLabel xml:lang="fr">Clavecin</skos:prefLabel>
  <skos:prefLabel xml:lang="it">Clavicembalo</skos:prefLabel>
  <skos:prefLabel xml:lang="en">Harpsichord</skos:prefLabel>
  <skos:prefLabel xml:lang="nl">Klavecimbel</skos:prefLabel>
  <skos:broader>
    <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2239">
      <skos:prefLabel>Harpsichords</skos:prefLabel>
    </skos:Concept>
  </skos:broader>
</skos:Concept>
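A consuming application might read such a record as follows. This is only a sketch using Python's standard library XML parser. It assumes the concept is wrapped in an rdf:RDF element that declares the usual rdf and skos namespaces (the slide shows neither), and it uses a trimmed subset of the labels.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"

# Assumption: the rdf:RDF wrapper and namespace declarations are added here.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2251">
    <skos:prefLabel xml:lang="de">Cembalo</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">Clavecin</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Harpsichord</skos:prefLabel>
    <skos:broader>
      <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2239">
        <skos:prefLabel>Harpsichords</skos:prefLabel>
      </skos:Concept>
    </skos:broader>
  </skos:Concept>
</rdf:RDF>"""


def pref_labels(concept):
    """Map language tag -> preferred label, looking at direct children only."""
    return {
        el.get(XML_LANG, ""): el.text
        for el in concept.findall("skos:prefLabel", NS)
    }


root = ET.fromstring(RDF_XML)
concept = root.find("skos:Concept", NS)
labels = pref_labels(concept)

# The broader concept is nested inside skos:broader in this serialization.
broader = concept.find("skos:broader/skos:Concept", NS)
broader_uri = broader.get(RDF_ABOUT)

print(labels)
print(broader_uri)
```

Because RDF/XML allows several equivalent serializations of the same graph, a production system would more likely use an RDF parser than raw XML paths; the sketch only works for this nested layout.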
Other types of contextual resources
<gn:Feature rdf:about="http://sws.geonames.org/3176959/">
  <gn:name>Florence</gn:name>
  <gn:alternateName xml:lang="ko">피렌체</gn:alternateName>
  <gn:alternateName xml:lang="ja">フィレンツェ</gn:alternateName>
  <gn:alternateName xml:lang="th">ฟลอเรนซ์</gn:alternateName>
  <gn:alternateName xml:lang="bo">ཧྥུ་ལོ� ་རོ� ན་ཟི� འུ་ཡ།</gn:alternateName>
  <gn:alternateName xml:lang="cy">Fflorens</gn:alternateName>
  <gn:alternateName xml:lang="bs">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="hbs">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="hr">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="sq">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="pl">Firence</gn:alternateName>
  <gn:alternateName xml:lang="sl">Firence</gn:alternateName>
  <gn:alternateName xml:lang="lij">Firense</gn:alternateName>
  <gn:population>371517</gn:population>
  <wgs84_pos:lat>43.76667</wgs84_pos:lat>
  <wgs84_pos:long>11.25</wgs84_pos:long>
</gn:Feature>
http://blogs.getty.edu/iris/art-architecture-thesaurus-now-available-as-linked-open-data/
Multilingual search
'uurglazen' in Italy
http://europeana.eu/portal/search.html?query=uurglazen&rows=96&qf=COUNTRY%3Aitaly
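The idea behind this kind of multilingual search can be sketched as label-based query expansion: a query term that matches a label of a concept is expanded with that concept's labels in other languages. The concept identifier and the label set below are illustrative assumptions, not taken from an actual vocabulary or from Europeana's implementation.

```python
# Hypothetical concept with labels in several languages (illustrative only).
CONCEPTS = {
    "ex:hourglass": {
        "nl": ["uurglazen"],
        "en": ["hourglasses"],
        "it": ["clessidre"],
        "fr": ["sabliers"],
    },
}


def expand_query(term, concepts=CONCEPTS):
    """Return the term plus the other labels of every concept it matches."""
    needle = term.casefold()
    expanded = [term]
    for labels_by_lang in concepts.values():
        all_labels = [
            label for labels in labels_by_lang.values() for label in labels
        ]
        if needle in (label.casefold() for label in all_labels):
            expanded.extend(
                label for label in all_labels if label.casefold() != needle
            )
    return expanded


if __name__ == "__main__":
    print(expand_query("uurglazen"))
```

With such an expansion, a Dutch query can retrieve objects whose metadata is only in Italian.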
Vocabularies currently provided to Europeana
Europeana metadata enrichment
Enrichment types and vocabularies
Enrichment type | Target vocabulary | Source metadata fields                            | Number of enriched objects
Places          | GeoNames          | dcterms:spatial, dc:coverage                      | 7M
Concepts        | GEMET, DBpedia    | dc:subject, dc:type                               | 9.2M
Agents          | DBpedia           | dc:creator, dc:contributor                        | 144K
Time            | Semium Time       | dc:date, dc:coverage, dcterms:temporal, edm:year  | 10.2M
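As a rough illustration of what enrichment means for the fields in the table above, the sketch below looks up metadata values in a label index and attaches the matching vocabulary URI. It is a simplification under stated assumptions: the label index is a two-entry stand-in, the matching is exact and case-insensitive, the `enrich` function is hypothetical, and the Time row is left out. It does not describe Europeana's actual enrichment pipeline.

```python
# Source fields per enrichment type, as listed in the table above.
FIELDS = {
    "Places": ["dcterms:spatial", "dc:coverage"],
    "Concepts": ["dc:subject", "dc:type"],
    "Agents": ["dc:creator", "dc:contributor"],
}

# Tiny stand-in label index; a real one would be built from the vocabularies.
LABEL_INDEX = {
    "Places": {"florence": "http://sws.geonames.org/3176959/"},
    "Concepts": {"harpsichord": "http://dbpedia.org/resource/Harpsichord"},
    "Agents": {},
}


def enrich(record):
    """Return (type, field, value, URI) for every exact label match."""
    links = []
    for etype, fields in FIELDS.items():
        for field in fields:
            for value in record.get(field, []):
                uri = LABEL_INDEX[etype].get(value.strip().casefold())
                if uri:
                    links.append((etype, field, value, uri))
    return links


record = {
    "dc:subject": ["Harpsichord"],
    "dcterms:spatial": ["Florence"],
    "dc:creator": ["Unknown"],
}

if __name__ == "__main__":
    for link in enrich(record):
        print(link)
```

Exact string matching is the weakest possible strategy; it ignores ambiguity (several places named Florence, for instance), which is one reason enrichment quality needs to be evaluated.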
Work in progress
Entity-based search and browsing
Annotation
Pundit @ DM2E project http://dm2e.eu
Europeana Channels
Semantic auto-completion
Not only end-user facing functions
Data must be accessible
(Unified) APIs, Linked Data
Data re-users should be able to provide enhanced services to their audience easily, especially in digital humanities
Specific collection and application needs cannot rely on a handful of generic vocabularies
Work needed
Vocabulary management and publication
Europeana developed its own WWI vocabulary based on a subset of LCSH
Terms translated into 10 languages and linked to id.loc.gov
Vocabulary services
http://data.europeana.eu/concept/loc/sh85148236
Representing finer-grained semantics
More precise relationships and formal semantics
For query expansion or data validation
E.g. ISO 25964 and Getty SKOS extensions
Depth level, concept associations
XKOS
Pre-coordinated strings
MADS/RDF
Representing finer-grained semantics?
Finer-grained semantics can be useful, but core models are key
They are what most people will start using
The need for alignment / co-reference / reconciliation
KOS 1: animals, cats, wildcats
KOS 2: animal, human, object
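A first pass at aligning the two example KOS can be sketched as lexical matching on normalized labels. This is a deliberately naive illustration: the plural-stripping rule and the function names are assumptions made for the example, and the output is a list of candidate links to be reviewed, not a set of confirmed equivalences.

```python
KOS_1 = ["animals", "cats", "wildcats"]
KOS_2 = ["animal", "human", "object"]


def normalize(label):
    """Lower-case and strip a trailing plural 's' (a deliberately naive rule)."""
    label = label.strip().casefold()
    if label.endswith("s") and len(label) > 3:
        return label[:-1]
    return label


def candidate_matches(kos_a, kos_b):
    """Pair labels from the two KOS whose normalized forms are identical."""
    index = {normalize(b): b for b in kos_b}
    return [
        (a, index[normalize(a)]) for a in kos_a if normalize(a) in index
    ]


if __name__ == "__main__":
    for a, b in candidate_matches(KOS_1, KOS_2):
        print(f"{a} <-> {b}")
```

Only "animals" and "animal" are paired here. Whether such a pair deserves skos:exactMatch or a weaker mapping relation still depends on how each vocabulary scopes the concept.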
A lot of work (being) done
A long line of work in the KOS community: DESIRE, CARMEN, Renardus, LIMBER, HILT, MSAC, MACS, Crisscross, KoMoHe, FAO…
Continued in the Linked Data context: Pleiades, Wikidata…
MACS: 120K links between Library of Congress Subject Headings (LCSH), RAMEAU, Schlagwortnormdatei (SWD)
Semantic mismatches
Irish vocabulary
From: Runar Bergheim
Norwegian vocabulary
skos:exactMatch
Requires flexible approaches
AMALGAME/CultuurLink:
http://semanticweb.cs.vu.nl/amalgame/
http://cultuurlink.beeldengeluid.nl/
Finding and re-using vocabularies
Well-known or new vocabularies
Wikidata, VIAF, Geonames, Pleiades, DBpedia, LCSH…
Data repositories and inventories
The Data Hub
Vocabulary selection criteria
Available in technically appropriate way
Well-maintained
Documented (including metadata)
Well-connected, e.g. equivalent elements in other vocabularies are indicated
Multilingual
Open
• license stacking hampers re-use
Quality assessment?
Cf. Data on the Web Best Practices
http://www.w3.org/TR/2015/WD-dwbp-20150625/#dataVocabularies
Take-home messages
Efforts across the whole ecosystem
Publishers of vocabularies, Providers of object data, Application developers, Researchers…
Requires getting very different steps right
Implementing standards for data exchange
Designing consuming applications
Not only technical: encouraging open data!