Hideaki Takeda / National Institute of Informatics
Identity and schema for Linked Data
Hideaki Takeda
National Institute of Informatics
takeda@nii.ac.jp
2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA
IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea
How to put the data into a computer?
• How to describe the data?
– The way to describe individual data
• Schema/Class/Concept
– The way to describe relationships among schemas/classes/concepts
• Ontology/Taxonomy/Thesaurus
• How to refer to the data?
– The way to identify individual data
• Identifier
– Relationships among identifiers
Architecture for the Semantic Web
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
The world of instances (Linked Data)
The world of classes (Ontologies)
Layers of Semantic Web
• Ontology
– Descriptions of classes
– RDFS, OWL
– Challenges for ontology building
• Ontology building is difficult by nature
– Consistency, comprehensiveness, logicality
• Alignment of ontologies is even more difficult
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Descriptions on classes
Descriptions on instances
Ontology
Linked Data
Layers of Semantic Web
• Linked Data
– Descriptions of instances (individuals)
– RDF + (RDFS, OWL)
– Pros of Linked Data
• Easy to write (mainly fact descriptions)
• Easy to link (fact-to-fact links)
– Cons of Linked Data
• Difficult to describe complex structures
• Class descriptions are still needed (-> ontology)
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Descriptions on classes
Description on instances
Ontology
Linked Data
Importance of Identifiers for Entities
• Everything should be identifiable!
• Humans can identify things with vague identifiers, or even without identifiers, with help from the context around the things
• On the web, the context is usually not available, and computers can seldom understand the context even if it exists
• So we need identifiers for all things
Identification System
• Identification is one of the primary functions of human information processing
– Naming: e.g., names for people, pets, and some everyday things
• OK if the number of things is not so big
– Systematic identification
• e.g., phone number, postal code, passport number, product number, ISBN
• Needed if the number of things is big enough
• Requirements for systematic identification
– Identifier is stable and sustainable
– Uniqueness is guaranteed
– Identifier publisher is reliable and sustainable
Identification system for Web
• Not so different from conventional identification systems
• Differences
– Cross-system use
– Truly digitized
• Requirements for systematic identification for the web
– Identifier is stable and sustainable (even after an entity may disappear)
– Uniqueness is guaranteed over all systems
– A description should be associated with each identifier
• since entities may not be accessible
– Identifier publisher is reliable and sustainable
Solutions for the Requirements by LOD
• Requirements for systematic identification for the web
– 1. Identifier is stable and sustainable (even after an entity may disappear)
• (up to each identifier publisher)
– 2. Uniqueness is guaranteed over all systems
• URI (not URN)
– 3. A description should be associated with each identifier
• Dereferenceable URI
– If a URI is accessed, a description associated with it should be returned
– 4. Identifier publisher is reliable and sustainable
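Requirement 3 (dereferenceable URIs) is usually met through HTTP content negotiation: the client asks for a machine-readable description of the identifier. The following is a minimal sketch that only prepares such a request and does not send it. The function name `build_rdf_request` and the choice of `text/turtle` as the preferred media type are assumptions made for illustration; whether a given server honors that media type depends on the publisher.

```python
from urllib.request import Request


def build_rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (without sending) a GET request that asks for an RDF description."""
    # The Accept header tells the server which representation the client prefers.
    return Request(uri, headers={"Accept": media_type}, method="GET")


# URI taken from the Library of Congress example later in these slides.
request = build_rdf_request("http://id.loc.gov/authorities/names/n79084664")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; that step is left out here so the sketch runs without network access.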
Some examples: ISBN (International Standard Book Number)
• Abstract
– A unique numeric commercial book identifier
– 13 digits
• Prefix: 978 or 979 (for compatibility with EAN code)
• Group (language-sharing country group): 1 to 5 digits
• Publisher code:
• Item number:
• Check digit: 1 digit
– Management: two layers
• National ISBN Agency – Publisher
• Requirement Satisfaction
– 1. (Stable ID) Maybe (versioning often matters, and sometimes a publisher may re-use an ISBN)
– 2. (Unique ID) Uniqueness is guaranteed, but it is not a URI
– 3. (Dereferenceable) No mechanism (Amazon does it instead!)
– 4. (Reliable publisher) Yes
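The single check digit can be illustrated with the standard ISBN-13 rule, which is not spelled out on the slide: the first twelve digits are weighted alternately by 1 and 3, and the check digit brings the total to a multiple of 10. The function names below are chosen for this sketch only.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits (hyphens allowed)."""
    digits = [int(ch) for ch in first12 if ch.isdigit()]
    if len(digits) != 12:
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check that a 13-digit ISBN carries the correct check digit."""
    digits = [ch for ch in isbn if ch.isdigit()]
    if len(digits) != 13:
        return False
    return isbn13_check_digit("".join(digits[:12])) == int(digits[12])


print(isbn13_check_digit("978-0-306-40615"))
print(is_valid_isbn13("978-0-306-40615-7"))
```

A valid check digit only catches transcription errors; it says nothing about whether the number was ever assigned or has been re-used, which is the stability concern raised above.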
Some examples: DOI (Digital Object Identifier)
• Abstract
– An identifier for scientific digital objects (mostly scientific articles)
– A string without a fixed length: “prefix/suffix”
• Prefix: assigned to publishers
• Suffix: assigned to each object
– Management: three layers
• IDF (International DOI Foundation) – Registration Agency – Publisher
• Requirement Satisfaction
– 1. (Stable ID) Yes (not re-usable)
– 2. (Unique ID) Uniqueness is guaranteed and it is accessible as a URI (http://dx.doi.org/”DOI”)
– 3. (Dereferenceable) Mapping to object pages, but no RDF
– 4. (Reliable publisher) Maybe
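As a small sketch of the “prefix/suffix” structure and of the URI form mentioned above, the code below splits a DOI at its first slash and prepends the resolver address quoted on the slide. The DOI `10.1234/example.5678` is a made-up value for illustration, and the helper names are assumptions of this sketch.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"  # resolver address quoted on the slide


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, slash, suffix = doi.strip().partition("/")
    if not slash or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix DOI: {doi!r}")
    return prefix, suffix


def doi_to_uri(doi: str) -> str:
    """Turn a DOI into an HTTP URI, percent-encoding unsafe suffix characters."""
    prefix, suffix = split_doi(doi)
    return f"{RESOLVER}{prefix}/{quote(suffix, safe='/')}"


# Hypothetical DOI, used only to show the shape of the result.
print(split_doi("10.1234/example.5678"))
print(doi_to_uri("10.1234/example.5678"))
```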
Some examples: DBpedia (as Identifier)
• Abstract
– A Wikipedia page
– Name of the Wikipedia page
• Maintained manually
– Disambiguation page
– Redirect page
• Requirement Satisfaction
– 1. (Stable ID) Maybe (pages sometimes disappear, sometimes change names, sometimes change contents)
– 2. (Unique ID) Uniqueness is mostly guaranteed and it is accessible as a URI
– 3. (Dereferenceable) RDF
– 4. (Reliable publisher) Maybe
Identification of relationship between identifiers
• Co-existence of multiple identification systems in a field
– Difference of coverage
– Difference of viewpoint
• An entity can have multiple identifiers
• Need for mapping between identifiers in different identification systems
• Method: use special properties
– owl:sameAs, (rdfs:seeAlso, skos:exactMatch)
– http://sameas.org
• Some problems
– Logical inconsistency with owl:sameAs
– Maintenance
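Because owl:sameAs is symmetric and transitive, a set of pairwise links partitions identifiers into groups that all denote the same entity. The sketch below computes those groups with a small union-find structure. All `example.org` URIs are hypothetical, and `sameas_clusters` is a name invented for this illustration.

```python
from collections import defaultdict


def sameas_clusters(links):
    """Group identifiers connected by sameAs links (symmetric, transitive)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical identifiers from three imaginary identification systems.
links = [
    ("http://example.org/a/soseki", "http://example.org/b/n79084664"),
    ("http://example.org/b/n79084664", "http://example.org/c/12345"),
    ("http://example.org/a/tokyo", "http://example.org/c/999"),
]
for cluster in sameas_clusters(links):
    print(cluster)
```

The same transitivity explains the inconsistency problem noted above: a single wrong link merges two whole groups, so every identifier in one group is then treated as identical to every identifier in the other.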
Summary for ID
• Identification is a crucial part of LOD
– Data availability
– Data inconsistency
– Data interoperability
• Establishing a good identification system leads to reliable and sustainable LOD.
Structuring Information
• A wide range of structuring information
– Keywords, tags
• A freely chosen word or phrase just indicating some features
– Controlled vocabulary
• Mapping to the fixed set of words or phrases
• e.g., the list of countries, the name authorities
– Classification
• System for classifying entities. Often hierarchical. Class may not carry meaning.
– Taxonomy
• Hierarchical term system for classification. Upper/lower relation usually means general/specific relation
• e.g., the subject headings of LC
– Thesaurus
• System for semantics. More types of relations: (hypernym, hyponym), synonym, antonym, homonym, holonym, meronym
– Ontology
• System of concepts. Concepts rather than words. More varied relations, plus the definitions of concepts
Examples in Library Science
• Many systems in the library community
• Classification
– Universal Decimal Classification (UDC)
• Controlled Vocabulary
– The authority files for person names, organizations, location names
• Library of Congress: 8 million records, MADS & SKOS
• British Library: 2.6 million records, FOAF & BIO (a vocabulary for biographical information)
• National Diet Library (Japan): 1 million records, FOAF
• Deutsche Nationalbibliothek (DNB, Germany): 1.8 & 1.3 million records (names & organizations)
• Virtual International Authority File (VIAF): 4 million records
• Taxonomy
– Subject Heading: LC, NDL,
• Library of Congress: MADS & SKOS
• British Library:
• National Diet Library (Japan): 0.1 million records, SKOS
• Deutsche Nationalbibliothek (DNB, Germany): 0.16 million records
UDC as Linked Data
UDC ELEMENT | DEFINITION | SKOS TERM | UDC SUBPROPERTY
UDC number (notation) | UDC notation is combination of symbols (numerals, signs and letters) that represent a class, its position in the hierarchy and its relation to other classes. Notation is a language-independent indexing term that enables mechanical sorting and filing of subjects. Also called 'UDC number' and 'UDC classmark' | skos:notation | ---
class identifier (URI) | A unique identifier assigned to each UDC class. It identifies the relationship between a class' meaning and its notational representation | skos:Concept | ---
broader class (URI) | Superordinate class: the class hierarchically above the class in question | skos:broader | ---
caption | Verbal description of the class content | skos:prefLabel | ---
including note | Extension of the caption containing verbal examples of the class content (usually a selection of important terms that do not appear in the subdivision) | skos:note | udc:includingNote
application note | Instructions for number building, further extension and specification of the class | skos:note | udc:applicationNote
scope note | Note explaining the extent and the meaning of a UDC class. Used to resolve disambiguation or to distinguish this class from other similar classes | skos:scopeNote | ---
examples | Examples of combination are used to illustrate UDC class building i.e. complex subject statements | skos:example | ---
see also reference | Indication of conceptual relationship between UDC classes from different hierarchies | skos:related | ---
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://udcdata.info/025553">
    <skos:inScheme rdf:resource="http://udcdata.info/udc-schema"/>
    <skos:broader rdf:resource="http://udcdata.info/025461"/>
    <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
    <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
    <skos:related rdf:resource="http://udcdata.info/000016"/>
  </skos:Concept>
</rdf:RDF>
http://udcdata.info/
69,000 records
40 Languages
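To show how the SKOS mapping in the table can be consumed, here is a minimal sketch that reads the UDC concept above with Python's standard XML parser. It assumes the fragment is wrapped in an `rdf:RDF` element with the usual RDF and SKOS namespace declarations. A real application would more likely use an RDF library; plain XML parsing is used here only to keep the sketch dependency-free, and `read_concept` is a name invented for it.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"

# The slide's concept, wrapped in rdf:RDF so that it is a complete XML document.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://udcdata.info/025553">
    <skos:broader rdf:resource="http://udcdata.info/025461"/>
    <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
    <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""


def read_concept(xml_text: str) -> dict:
    """Extract URI, notation, broader links and labels of the first skos:Concept."""
    concept = ET.fromstring(xml_text).find(f"{{{SKOS}}}Concept")
    return {
        "uri": concept.get(f"{{{RDF}}}about"),
        "notation": concept.findtext(f"{{{SKOS}}}notation"),
        "broader": [b.get(f"{{{RDF}}}resource")
                    for b in concept.findall(f"{{{SKOS}}}broader")],
        "labels": {label.get(f"{{{XML}}}lang"): label.text
                   for label in concept.findall(f"{{{SKOS}}}prefLabel")},
    }


concept = read_concept(DOC)
print(concept["notation"], concept["labels"]["en"])
```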
http://id.loc.gov/authorities/names/n79084664.html
<http://id.loc.gov/authorities/names/n79084664>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://www.loc.gov/mads/rdf/v1#PersonalName> .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://www.loc.gov/mads/rdf/v1#Authority> .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#authoritativeLabel>
"Natsume, Sōseki, 1867-1916"@en .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#elementList>
_:bnode7authoritiesnamesn79084664 .
_:bnode7authoritiesnamesn79084664
<http://www.w3.org/1999/02/22-rdf-syntax-ns#first>
_:bnode8authoritiesnamesn79084664 .
_:bnode7authoritiesnamesn79084664
<http://www.w3.org/1999/02/22-rdf-syntax-ns#rest>
_:bnode010 .
_:bnode8authoritiesnamesn79084664
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://www.loc.gov/mads/rdf/v1#FullNameElement> .
_:bnode8authoritiesnamesn79084664
<http://www.loc.gov/mads/rdf/v1#elementValue>
"Natsume, Sōseki,"@en .
_:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first>
_:bnode11authoritiesnamesn79084664 .
_:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:bnode11authoritiesnamesn79084664
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://www.loc.gov/mads/rdf/v1#DateNameElement> .
_:bnode11authoritiesnamesn79084664
<http://www.loc.gov/mads/rdf/v1#elementValue> "1867-1916"@en .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#classification> "PL812.A8" .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#hasExactExternalAuthority>
<http://viaf.org/viaf/sourceID/LC%7Cn+79084664#skos:Concept> .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#isMemberOfMADSCollection>
<http://id.loc.gov/authorities/names/collection_NamesAuthorizedHeadings> .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#isMemberOfMADSScheme>
<http://id.loc.gov/authorities/names> .
<http://id.loc.gov/authorities/names/n79084664>
<http://www.loc.gov/mads/rdf/v1#isMemberOfMADSCollection>
<http://id.loc.gov/authorities/names/collection_LCNAF> .
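The `elementList` in the triples above is an RDF collection: a linked list of blank nodes chained by `rdf:first` and `rdf:rest` and terminated by `rdf:nil`. The following sketch walks such a list over a hand-copied subset of the triples. The blank-node labels are shortened to `_:b7`, `_:b8`, `_:b10` and `_:b11` for readability, and the function names are assumptions of this sketch.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MADS = "http://www.loc.gov/mads/rdf/v1#"

# Subset of the triples above, with shortened blank-node labels.
TRIPLES = [
    ("_:b7", RDF + "first", "_:b8"),
    ("_:b7", RDF + "rest", "_:b10"),
    ("_:b8", MADS + "elementValue", "Natsume, Sōseki,"),
    ("_:b10", RDF + "first", "_:b11"),
    ("_:b10", RDF + "rest", RDF + "nil"),
    ("_:b11", MADS + "elementValue", "1867-1916"),
]


def rdf_list_items(triples, head):
    """Follow rdf:first / rdf:rest from the head node until rdf:nil."""
    lookup = {(s, p): o for s, p, o in triples}
    items, node, seen = [], head, set()
    while node != RDF + "nil":
        if node in seen:
            raise ValueError("cycle in rdf:rest chain")
        seen.add(node)
        items.append(lookup[(node, RDF + "first")])
        node = lookup[(node, RDF + "rest")]
    return items


def element_values(triples, head):
    """Return the mads:elementValue of each list member, in list order."""
    lookup = {(s, p): o for s, p, o in triples}
    return [lookup[(item, MADS + "elementValue")]
            for item in rdf_list_items(triples, head)]


print(element_values(TRIPLES, "_:b7"))
```

The list structure preserves the order of the name parts, which a plain set of triples cannot express on its own.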
http://id.loc.gov/authorities/subjects/sh85008180.html
http://data.bnf.fr/11932084/intelligence_artificielle/
Some examples: Scientific Names for Species and Taxa
• Abstract
– Names for biological species and other taxa (kingdom, division, class, order, family, tribe, genus)
– A string
• Binomial name for species
• Academic societies maintain taxon names individually
– E.g., Papilio xuthus (Asian Swallowtail, ナミアゲハ, 호랑나비)
• Requirement Satisfaction
– 1. (Stable ID) Mostly yes (names sometimes disappear, change, or change contents)
– 2. (Unique ID) Uniqueness is generally guaranteed but, strictly speaking, there is some ambiguity because of changes
– 3. (Dereferenceable) No. Many systems exist, but none covers all species
– 4. (Reliable publisher) Maybe
Taxon | Plants | Algae | Fungi | Animals
Domain | --- | --- | --- | ---
Kingdom | --- | --- | --- | ---
Division/Phylum | -phyta | -phyta | -mycota | ---
Subdivision/Subphylum | -phytina | -phytina | -mycotina | ---
Class | -opsida | -phyceae | -mycetes | ---
Subclass | -idae | -phycidae | -mycetidae | ---
Order | -ales | -ales | -ales | ---
Suborder | -ineae | -ineae | -ineae | ---
Superfamily | -acea | -acea | -acea | -oidea
Family | -aceae | -aceae | -aceae | -idae
Subfamily | -oideae | -oideae | -oideae | -inae
Tribe | -eae | -eae | -eae | -ini
Subtribe | -inae | -inae | -inae | -ina
Genus | --- | --- | --- | ---
Subgenus | --- | --- | --- | ---
Species | --- | --- | --- | ---
Subspecies | --- | --- | --- | ---
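The suffix table can be read as a lookup from name endings to ranks. The sketch below encodes the plant and animal columns and guesses the rank of a taxon name by its longest matching suffix. This is only a heuristic written for illustration: the same ending can mean different ranks in different groups (for example `-idae`), and a name may end in one of these strings by coincidence.

```python
# Suffix-to-rank tables copied from the plant and animal columns above.
SUFFIXES = {
    "plants": {
        "phyta": "Division", "phytina": "Subdivision", "opsida": "Class",
        "idae": "Subclass", "ales": "Order", "ineae": "Suborder",
        "acea": "Superfamily", "aceae": "Family", "oideae": "Subfamily",
        "eae": "Tribe", "inae": "Subtribe",
    },
    "animals": {
        "oidea": "Superfamily", "idae": "Family", "inae": "Subfamily",
        "ini": "Tribe", "ina": "Subtribe",
    },
}


def guess_rank(name: str, group: str):
    """Return the rank suggested by the name's ending, or None if no suffix matches."""
    table = SUFFIXES[group]
    lowered = name.lower()
    # Try longer suffixes first so that "-aceae" wins over "-eae".
    for suffix in sorted(table, key=len, reverse=True):
        if lowered.endswith(suffix):
            return table[suffix]
    return None


print(guess_rank("Papilionidae", "animals"))
print(guess_rank("Rosaceae", "plants"))
print(guess_rank("Rosidae", "plants"))
```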
Ontology
An ontology is an explicit specification of a conceptualization [Gruber]
An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an Ontology is a systematic account of Existence. For AI systems, what "exists" is that which can be represented. When the knowledge of a domain is represented in a declarative formalism, the set of objects that can be represented is called the universe of discourse. This set of objects, and the describable relationships among them, are reflected in the representational vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of AI, we can describe the ontology of a program by defining a set of representational terms. In such an ontology, definitions associate the names of entities in the universe of discourse (e.g., classes, relations, functions, or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these terms. Formally, an ontology is the statement of a logical theory.
Conceptualization
• There are many possible ways to conceptualize the target world
• Trade-off between generality and efficiency
[Figure: three alternative conceptualizations of the same world]
– (1) Classes: object > box > {red box, blue box, yellow box}. Relations: on_desk(A), on(A, B), put(A, B)
– (2) Classes: object > box, where box has the attribute color:{red, blue, yellow}. Relations: on_desk(A), on(A, B), put(A, B)
– (3) Classes: object > {box, desk}, where box has the attribute color:{red, blue, yellow}. Relations: on(A/box, B/object), put(A/box, B/object)
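The difference between modeling color as a subclass and modeling it as an attribute can be sketched in code. The class names below are invented for this illustration, and the two small "worlds" describe the same three boxes in each style.

```python
from dataclasses import dataclass
from enum import Enum


# Style A: one class per color (red box, blue box, ... as subclasses of box).
class Box:
    def __init__(self, name: str):
        self.name = name


class RedBox(Box):
    pass


class BlueBox(Box):
    pass


# Style B: a single box class with a color attribute.
class Color(Enum):
    RED = "red"
    BLUE = "blue"
    YELLOW = "yellow"


@dataclass(frozen=True)
class ColoredBox:
    name: str
    color: Color


world_a = [RedBox("a"), BlueBox("b"), RedBox("c")]
world_b = [ColoredBox("a", Color.RED), ColoredBox("b", Color.BLUE),
           ColoredBox("c", Color.RED)]

# The same question ("which boxes are red?") is asked differently in each style.
red_a = [box.name for box in world_a if isinstance(box, RedBox)]
red_b = [box.name for box in world_b if box.color is Color.RED]
print(red_a, red_b)
```

In style A a new color requires a new class, while in style B it only requires a new attribute value; this is one concrete form of the generality versus efficiency trade-off.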
Types of Ontologies
• Upper (top-level) ontology vs. domain ontology
– Upper ontology: a common ontology across all domains
– Domain ontology: an ontology that is meaningful in a specific domain
• Object ontology vs. task ontology
– Object ontology: an ontology of “things” and “events”
– Task ontology: an ontology of “doing”
• Heavy-weight ontology vs. light-weight ontology
– Heavy-weight ontology: a fully described ontology including concept definitions and relations, in particular in a logical way
– Light-weight ontology: a partially described ontology, typically including only is-a relations
Top-level ontology
• An ontology that covers the whole world!
• Very difficult
– e.g., how does a thing exist?
• Is a thing a four-dimensional existence?
• Does a thing exist three-dimensionally over time?
• Common requirements
– A small number of concepts can cover the world
– Concepts can be used in lower ontologies
– Concepts should be general and abstract
Top-level ontology
• Three approaches
– Formal approach
• Logical formalization
• Fully abstract
• Pros: clean
• Cons: hard to understand
• e.g., Sowa’s top-level ontology, DOLCE
– Linguistic approach
• Use and extension of linguistic concepts
• Partially abstract and partially general
• Pros: understandable
• Cons: limited to the linguistic world
• e.g., Penman Upper Model, WordNet
– Empirical approach
• Use and extension of everyday concepts
• Mostly general
• Pros: understandable and applicable to all the world
• Cons: lack of solid foundation
• e.g., SUMO, Cyc, EDR
Empirical top-level ontology
• SUMO (Suggested Upper Merged Ontology)
– Collection and organization of frequently used concepts
– Simple relationships between concepts
[Figure: part of the SUMO class hierarchy. Classes shown: Entity, BiologicalProcess, ChangeOfState, Transfer, IntentionallyCausedProcess, NaturalProcess, Inorganic, Organic, Substance, Collection, CorpuscularObject, SelfConnectedObject, Process, Object, Abstract, Physical, PathologicProcess, PhysiologicProcess, SocialInteraction, Searching, ChangeOfPossession, Communication, BringingTogether, Meeting, Contest, Cooperation, Impelling, Transportation, Removing, Putting, Impacting, Motion, Separating]
Formal Ontology: DOLCE
• DOLCE (a Descriptive Ontology for Linguistic and Cognitive Engineering)
– Intended as a reference system for top-level ontologies
– Logical definition
– Particular (DOLCE) vs. Universal
• Particular: ontology about things, phenomena, qualities…
• Universal: ontology for describing particulars, such as categories and attributes
Formal Ontology: DOLCE
• Concepts
– Endurant / Perdurant / Quality / Abstract
• Endurant
– “Things”
– An existence over time
– May change its attributes
• Perdurant
– “process”
– No change over time
– May switch from one part to another
• Relations
– Parthood (abstract or perdurant)
– Temporary parthood (endurant)
– Constitution (endurant or perdurant)
– Participation between perdurant and endurant
[Figure: DOLCE taxonomy of categories, listed as abbreviation and name]
ALL Entity; PD Perdurant/Occurrence; ED Endurant; Q Quality; AB Abstract; AS Arbitrary Sum; NPED Non-Physical Endurant; PED Physical Endurant; M Amount of Matter; EV Event; STV Stative; APO Agentive Physical Object; F Feature; POB Physical Object; NAPO Non-agentive Physical Object; NPOB Non-physical Object; SOB Social Object; MOB Mental Object; PRO Process; ST State; ACC Accomplishment; ACH Achievement; AQ Abstract Quality; PQ Physical Quality; TQ Temporal Quality; TL Temporal Location; SL Spatial Location; R Region; TR Temporal Region; Fact; Set; T Time Interval; PR Physical Region; AR Abstract Region; S Space Region
Linguistic top-level ontology
• WordNet
– A lexical reference system
• “Link-based electronic dictionary”
– Concepts
• synset
– Noun 79,689
– Verb 13,508
– Relations
• synonym
• hypernym/hyponym (is-a)
• holonym/meronym (a-part-of)
http://www.cogsci.princeton.edu/cgi-bin/webwn
Linguistic top-level ontology
• WordNet
– Top-level
• { entity, physical thing (that which is perceived or known or inferred to have its own physical existence (living or nonliving)) }
• { psychological_feature, (a feature of the mental life of a living organism) }
• { abstraction, (a general concept formed by extracting common features from specific examples) }
• { state, (the way something is with respect to its main attributes; "the current state of knowledge"; "his state of health"; "in a weak financial state") }
• { event, (something that happens at a given place and time) }
• { act, human_action, human_activity, (something that people do or cause to happen) }
• { group, grouping, (any number of entities (members) considered as a unit) }
• { possession, (anything owned or possessed) }
• { phenomenon, (any state or process known through the senses rather than by intuition or reasoning) }
Summary for structuring information
• Keywords, tags / Controlled vocabulary / Classification / Taxonomy / Thesaurus / Ontology
– The differences are not clear-cut and not important
– The trend is toward more structured ones
– The same requirements as for identification systems
Summary
• Requirements for Successful Structuring Systems
– 1. Entity is stable and sustainable
– 2. Uniqueness is guaranteed over all systems
– 3. A description should be associated with each entity
– 4. System publisher is reliable and sustainable
• Learn from success in the library community
• LOD technology can help
Schema/Vocabulary for LOD
• Class/Concept description
– Axiom of a concept in an ontology
– Database schema for a table in a relational database
– Object definition in object-oriented programming/DB
• Class description in Semantic Web
– RDFS/OWL description for a class
• RDFS: Simple class system
• OWL: Description Logic-based
• Class description in Linked Data
– Mostly RDFS-based (exception: owl:sameAs)
– Simple structure (mostly property-value pairs)
Schema/Vocabulary for LOD
• The importance of sharing schemata
– Interoperability
– Generic applications
• Some famous and frequently used schemata
– Dublin Core
– FOAF (Friend-Of-A-Friend)
– SKOS (Simple Knowledge Organization System)
Usage of Common Vocabularies
Prefix | Namespace | Used by
dc | http://purl.org/dc/elements/1.1/ | 66 (31.88 %)
foaf | http://xmlns.com/foaf/0.1/ | 55 (26.57 %)
dcterms | http://purl.org/dc/terms/ | 38 (18.36 %)
skos | http://www.w3.org/2004/02/skos/core# | 29 (14.01 %)
akt | http://www.aktors.org/ontology/portal# | 17 (8.21 %)
geo | http://www.w3.org/2003/01/geo/wgs84_pos# | 14 (6.76 %)
mo | http://purl.org/ontology/mo/ | 13 (6.28 %)
bibo | http://purl.org/ontology/bibo/ | 8 (3.86 %)
vcard | http://www.w3.org/2006/vcard/ns# | 6 (2.90 %)
frbr | http://purl.org/vocab/frbr/core# | 5 (2.42 %)
sioc | http://rdfs.org/sioc/ns# | 4 (1.93 %)
LDOW2011 Presentation, Christian Bizer (Freie Universität Berlin), 2011
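Shared prefixes are only abbreviations for namespace URIs, so a prefixed name such as `foaf:name` expands mechanically into a full URI. The sketch below does this for a few of the namespaces listed above; `expand` and `compact` are helper names invented for this illustration.

```python
# Prefix-to-namespace pairs copied from the usage table.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(prefixed_name: str) -> str:
    """Expand 'prefix:local' into a full URI using the table above."""
    prefix, colon, local = prefixed_name.partition(":")
    if not colon or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


def compact(uri: str) -> str:
    """Abbreviate a URI with the longest matching namespace, if any."""
    best = max((p for p, ns in PREFIXES.items() if uri.startswith(ns)),
               key=lambda p: len(PREFIXES[p]), default=None)
    return uri if best is None else f"{best}:{uri[len(PREFIXES[best]):]}"


print(expand("foaf:name"))
print(compact("http://purl.org/dc/terms/creator"))
```

Two datasets that both use `foaf:name` therefore use literally the same property URI, which is what makes generic applications over shared vocabularies possible.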
(Simple) Dublin Core
• Started from the library community
• Now maintained by DCMI (Dublin Core Metadata Initiative)
• (Simple) Dublin Core
– Just 15 elements
– Simple is best
– No range restriction
– http://purl.org/dc/elements/1.1/
• 15 elements
– Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
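Since Simple Dublin Core places no range restriction on its 15 elements, a record can be written as plain element and literal pairs. The sketch below serializes such a record as Turtle. The subject URI under `example.org` is hypothetical, the record values are taken from this deck's title slide, and the function names are assumptions of this sketch.

```python
DC = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def turtle_literal(value: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes, quotes and newlines."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_turtle(subject: str, record: dict) -> str:
    """Serialize {element: [values]} as Turtle using the dc: namespace."""
    statements = []
    for element, values in record.items():
        if element not in DC_ELEMENTS:
            raise KeyError(f"not a Simple Dublin Core element: {element}")
        statements.extend(f"    dc:{element} {turtle_literal(v)}" for v in values)
    if not statements:
        raise ValueError("record has no values")
    body = " ;\n".join(statements)
    return f"@prefix dc: <{DC}> .\n\n<{subject}>\n{body} .\n"


record = {
    "title": ["Identity and schema for Linked Data"],
    "creator": ["Hideaki Takeda"],
}
# Hypothetical subject URI, used only for illustration.
print(to_turtle("http://example.org/slides/identity-and-schema", record))
```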
dc terms
• Qualified Dublin Core
– Domain & Range
– More precise terms
• Extension of simple dc
Properties in the /terms/ namespace: abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience, available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date, dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format, hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator, medium, modified, provenance, publisher, references, relation, replaces, requires, rights, rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid
Properties in the /elements/1.1/ namespace: contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title, type
Vocabulary Encoding Schemes: DCMIType, DDC, IMT, LCC, LCSH, MESH, NLM, TGN, UDC
Syntax Encoding Schemes: Box, ISO3166, ISO639-2, ISO639-3, Period, Point, RFC1766, RFC3066, RFC4646, RFC5646, URI, W3CDTF
Classes: Agent, AgentClass, BibliographicResource, FileFormat, Frequency, Jurisdiction, LicenseDocument, LinguisticSystem, Location, LocationPeriodOrJurisdiction, MediaType, MediaTypeOrExtent, MethodOfAccrual, MethodOfInstruction, PeriodOfTime, PhysicalMedium, PhysicalResource, Policy, ProvenanceStatement, RightsStatement, SizeOrDuration, Standard
DCMI Type Vocabulary: Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text
Terms related to the DCMI Abstract Model: memberOf, VocabularyEncodingScheme
Dcterms | subPropertyOf | Domain | Range
contributor | dc:contributor | rdfs:Resource | dcterms:Agent
creator | dc:creator, dcterms:contributor | rdfs:Resource | dcterms:Agent
coverage | dc:coverage | rdfs:Resource | dcterms:LocationPeriodOrJurisdiction
spatial | dc:coverage, dcterms:coverage | rdfs:Resource | dcterms:Location
temporal | dc:coverage, dcterms:coverage | rdfs:Resource | dcterms:PeriodOfTime
date | dc:date | rdfs:Resource | rdfs:Literal
available | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
created | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
dateAccepted | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
dateCopyrighted | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
dateSubmitted | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
issued | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
modified | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
valid | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal
description | dc:description | rdfs:Resource | rdfs:Resource
abstract | dc:description, dcterms:description | rdfs:Resource | rdfs:Resource
tableOfContents | dc:description, dcterms:description | rdfs:Resource | rdfs:Resource
format | dc:format | rdfs:Resource | dcterms:MediaTypeOrExtent
extent | dc:format, dcterms:format | rdfs:Resource | dcterms:SizeOrDuration
medium | dc:format, dcterms:format | dcterms:PhysicalResource | dcterms:PhysicalMedium
identifier | dc:identifier | rdfs:Resource | rdfs:Literal
bibliographicCitation | dc:identifier, dcterms:identifier | dcterms:BibliographicResource | rdfs:Literal
language | dc:language | rdfs:Resource | dcterms:LinguisticSystem
publisher | dc:publisher | rdfs:Resource | dcterms:Agent
relation | dc:relation | rdfs:Resource | rdfs:Resource
source | dc:source, dcterms:relation | rdfs:Resource | rdfs:Resource
conformsTo | dc:relation, dcterms:relation | rdfs:Resource | dcterms:Standard
hasFormat | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
hasPart | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
hasVersion | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
isFormatOf | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
isPartOf | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
isReferencedBy | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
isReplacedBy | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
isRequiredBy | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
isVersionOf | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
references | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
replaces | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
requires | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource
rights | dc:rights | rdfs:Resource | dcterms:RightsStatement
accessRights | dc:rights, dcterms:rights | rdfs:Resource | dcterms:RightsStatement
license | dc:rights, dcterms:rights | rdfs:Resource | dcterms:LicenseDocument
subject | dc:subject | rdfs:Resource | rdfs:Resource
title | dc:title | rdfs:Resource | rdfs:Resource rdfs:Literal
alternative | dc:title, dcterms:title | rdfs:Resource | rdfs:Resource rdfs:Literal
type | dc:type | rdfs:Resource | rdfs:Class
audience | --- | rdfs:Resource | dcterms:AgentClass
educationLevel | dcterms:audience | rdfs:Resource | dcterms:AgentClass
mediator | dcterms:audience | rdfs:Resource | dcterms:AgentClass
accrualMethod | --- | dcmitype:Collection | dcterms:MethodOfAccrual
accrualPeriodicity | --- | dcmitype:Collection | dcterms:Frequency
accrualPolicy | --- | dcmitype:Collection | dcterms:Policy
instructionalMethod | --- | rdfs:Resource | dcterms:MethodOfInstruction
provenance | --- | rdfs:Resource | dcterms:ProvenanceStatement
rightsHolder | --- | rdfs:Resource | dcterms:Agent
http://www.kanzaki.com/docs/sw/dc-domain-range.html
http://dublincore.org/documents/dcmi-terms/
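The subPropertyOf column is what lets an application that only understands Simple Dublin Core still use data written with the more precise dcterms properties: every statement made with a sub-property also holds for its super-properties. The sketch below copies a few rows from the table and derives the implied statements. The sample triple is hypothetical, and the function names are invented for this illustration.

```python
# A few subPropertyOf rows copied from the table above.
SUB_PROPERTY_OF = {
    "dcterms:contributor": ["dc:contributor"],
    "dcterms:creator": ["dc:creator", "dcterms:contributor"],
    "dcterms:date": ["dc:date"],
    "dcterms:issued": ["dc:date", "dcterms:date"],
}


def super_properties(prop: str) -> set:
    """Collect all direct and indirect super-properties of a property."""
    found, stack = set(), [prop]
    while stack:
        current = stack.pop()
        for parent in SUB_PROPERTY_OF.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def entail(triples):
    """Add the triples implied by rdfs:subPropertyOf to the given triples."""
    result = set(triples)
    for subject, prop, obj in triples:
        for parent in super_properties(prop):
            result.add((subject, parent, obj))
    return result


# Hypothetical statement about a document and its creator.
data = [("ex:slides", "dcterms:creator", "ex:takeda")]
for triple in sorted(entail(data)):
    print(triple)
```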
The Friend of a Friend (FOAF)
• Metadata describing persons and their relationships
• Voluntary project
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<#JW>
    a foaf:Person ;
    foaf:name "Jimmy Wales" ;
    foaf:mbox <mailto:[email protected]> ;
    foaf:homepage <http://www.jimmywales.com/> ;
    foaf:nick "Jimbo" ;
    foaf:depiction <http://www.jimmywales.com/aus_img_small.jpg> ;
    foaf:interest <http://www.wikimedia.org> ;
    foaf:knows [
        a foaf:Person ;
        foaf:name "Angela Beesley"
    ] .
<http://www.wikimedia.org>
    rdfs:label "Wikipedia" .
Classes:
| Agent | Document | Group | Image | LabelProperty |
OnlineAccount | OnlineChatAccount |
OnlineEcommerceAccount | OnlineGamingAccount |
Organization | Person | PersonalProfileDocument | Project |
Properties:
| account | accountName | accountServiceHomepage | age |
aimChatID | based_near | birthday | currentProject |
depiction | depicts | dnaChecksum | familyName |
family_name | firstName | focus | fundedBy | geekcode |
gender | givenName | givenname | holdsAccount |
homepage | icqChatID | img | interest | isPrimaryTopicOf |
jabberID | knows | lastName | logo | made | maker | mbox |
mbox_sha1sum | member | membershipClass | msnChatID
| myersBriggs | name | nick | openid | page | pastProject |
phone | plan | primaryTopic | publications |
schoolHomepage | sha1 | skypeID | status | surname | theme
| thumbnail | tipjar | title | topic | topic_interest | weblog |
workInfoHomepage | workplaceHomepage | yahooChatID |
SKOS (Simple Knowledge Organization System)
• Metadata for taxonomy
– Hierarchical structure of concepts
• Invented to represent taxonomies such as subject headings
• Not the same as the subclass relationship among classes
• W3C Recommendation 18 August 2009
SKOS (Simple Knowledge Organization System)
• SKOS Core (hierarchical concept structure)
– skos:semanticRelation
– skos:broaderTransitive
– skos:narrowerTransitive
– skos:broader
– skos:narrower
– skos:related
– skos:prefLabel
– skos:altLabel
– skos:hiddenLabel
subPropertyOf
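In SKOS, `skos:broader` is a sub-property of `skos:broaderTransitive` and is not itself transitive: it records only the direct parent, while the transitive property gives all ancestors. The sketch below computes the ancestors from direct broader links. The `ex:` concept names form a hypothetical three-level hierarchy, and `broader_transitive` is a name invented for this illustration.

```python
# Direct skos:broader links in a hypothetical concept scheme.
BROADER = {
    "ex:mathematical-logic": ["ex:mathematics"],
    "ex:mathematics": ["ex:science"],
    "ex:science": [],
}


def broader_transitive(concept: str) -> set:
    """All ancestors reachable by following skos:broader one or more times."""
    ancestors, stack = set(), [concept]
    while stack:
        current = stack.pop()
        for parent in BROADER.get(current, []):
            if parent not in ancestors:  # also guards against cycles
                ancestors.add(parent)
                stack.append(parent)
    return ancestors


print(BROADER["ex:mathematical-logic"])
print(sorted(broader_transitive("ex:mathematical-logic")))
```

Keeping the two properties separate lets a thesaurus state direct hierarchical links precisely while still supporting queries such as "everything under science".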
SKOS (Simple Knowledge Organization System)
• SKOS Mapping
– skos:mappingRelation
– skos:closeMatch
– skos:exactMatch
– skos:broadMatch
– skos:narrowMatch
– skos:relatedMatch
subPropertyOf
Linked Open Vocabulary (LOV)
• A technical platform for search and quality assessment across the vocabulary ecosystem
– Register schemata
– Search schemata
• http://labs.mondeca.com/dataset/lov/
More Info.
• http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset
Summary for schema
• Some major schemata
– DC, DC terms, FOAF, SKOS …
• More domain-specific schemata
– CIDOC CRM
– PRISM
– …
• Re-using is highly recommended
– LOV