Where are we
Digital Libraries
– Discovery of information
  • Describing Information
    – Metadata
      • MARC
      • Dublin Core
      • MODS
      • METS
      • TEI
      • EAD
      • ......
FUB 2012-2013 Vittore Casarosa – Digital Libraries Part 5 -1
What have we learned
Many types of Metadata
– Administrative
– Descriptive
– Access/Use
– Preservation
– Technical/Structural
– Other ...
Many metadata schemas in the world of Digital Libraries
– Dublin Core
– MARCXML
– MODS
– EAD
– TEI
– Etc.
Most used representation (expression) of metadata
– XML
Representation of knowledge
Description of Information (resources) through metadata is an exercise in “knowledge representation”
Knowledge representation might be language dependent
Knowledge representation is the “Holy Grail” of Computer Science
– Artificial Intelligence
– Expert Systems
– Ontologies
– .....
Many models/languages proposed in the last 40 years
Most of the advances due to “brute force” methods
Three conceptual models of interest to Digital Libraries: FRBR, RDF, DCAM
FRBR
Functional Requirements for Bibliographic Records
Approved by IFLA in 1997, published in 1998
An abstract conceptual model of the ‘bibliographic universe’
It is based on the entity-relationship model (Entities, Attributes, Relationships)
FRBR was defined having in mind the “User Tasks”
– Find
– Identify
– Select
– Obtain
User functions/tasks
Using the data to FIND materials that correspond to the user's stated search criteria
Using the data retrieved to IDENTIFY an entity
– e.g. to confirm that the document described corresponds to the document sought by the user, or to distinguish between two similar documents
Using the data to SELECT an entity that is appropriate to the user's needs
– e.g. to select a text in a language the user understands, or to choose a version of a computer program that is compatible with the hardware and operating system available to the user
Using the data in order to acquire or OBTAIN access to the entity described
FRBR Entities
Group 1 entities
WORK
– a distinct intellectual or artistic creation
EXPRESSION
– the intellectual or artistic realization of a work in the form of alpha-numeric, musical, or choreographic notation, sound, image, object, movement, etc., or any combination of such forms
MANIFESTATION
– the physical embodiment of an expression of a work
ITEM
– a single exemplar of a manifestation
FRBR – Group 1 entities
products of intellectual or artistic endeavour
Examples of Work and Expressions
w1 Henry Gray’s Anatomy of the human body
– e1 text and illustrations for the first edition
– e2 text and illustrations for the second edition
– e3 text and illustrations for the third edition
– ….
w1 J. S. Bach’s The art of the fugue
– e1 the composer’s score for organ
– e2 an arrangement for chamber orchestra by Anthony Lewis
– ….
w1 Jules et Jim (motion picture)
– e1 the original French language version
– e2 the original with English subtitles added
– ….
Examples of <different> Works
William Shakespeare’s Romeo and Juliet
Franco Zeffirelli’s motion picture Romeo and Juliet
Baz Luhrmann’s motion picture William Shakespeare’s Romeo and Juliet
….
Examples of Expressions
Franz Schubert’s Trout quintet (Work)
– e1 the composer’s notated music
– e2 the musical work as performed by Rosina Lhevinne, piano, Stuart Sankey, double bass, and members of the Juilliard String Quartet
– e3 the musical work as performed by Jörg Demus, piano, and the members of the Collegium Aureum
– e4 the musical work as performed by Emanuel Ax, piano, members of the Guarneri String Quartet, and Julius Levine, double bass
– ….
Examples of Manifestations
J. S. Bach’s Six suites for unaccompanied cello (Work)
– e1 performances by Janos Starker recorded partly in 1963 and completed in 1965
  • m1 recordings released on 33 1/3 rpm sound discs in 1966 by Mercury
  • m2 recordings re-released on compact disc in 1991 by Mercury
– e2 performances by Yo-Yo Ma recorded in 1983
  • m1 recordings released on 33 1/3 rpm sound discs in 1983 by CBS Records
  • m2 recordings re-released on compact disc in 1992 by CBS Records
Example of Item
w1 Ronald Hayman’s Playback
– e1 the author’s text edited for publication
  • m1 the book published in 1973 by Davis-Poynter
    – i1 copy autographed by the author
From Work to Item
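The chain from Work down to Item can be read as a nesting of the four Group 1 entities. The Python sketch below is only an illustration of that nesting: the class names, the `label` field and the `walk` helper are assumptions made here, not part of FRBR, and the data is the Playback example from the previous slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    label: str


@dataclass
class Manifestation:
    """The physical embodiment of an expression of a work."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """The intellectual or artistic realization of a work."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    label: str
    expressions: List[Expression] = field(default_factory=list)


def walk(work: Work) -> List[str]:
    """Return one 'work > expression > manifestation > item' path per item."""
    paths = []
    for e in work.expressions:
        for m in e.manifestations:
            for i in m.items:
                paths.append(" > ".join([work.label, e.label, m.label, i.label]))
    return paths


# w1 / e1 / m1 / i1 from the "Example of Item" slide.
playback = Work(
    "Ronald Hayman's Playback",
    [Expression(
        "the author's text edited for publication",
        [Manifestation(
            "the book published in 1973 by Davis-Poynter",
            [Item("copy autographed by the author")],
        )],
    )],
)

for path in walk(playback):
    print(path)
```

Each level holds a list of the level below, which mirrors the one-to-many reading of the examples (one work, several expressions; one expression, several manifestations; one manifestation, several items).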
Family of Works
FRBR – Group 2 entities
entities responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group
FRBR – Group 3 entities
Shakespeare, William, 1564-1616. Hamlet. French.
LC Control No. : 47023612
LCCN Permalink : http://lccn.loc.gov/47023612
Type of Material : Book (Print, Microform, Electronic, etc.)
Personal Name : Shakespeare, William, 1564-1616.
Main Title : ... Hamlet, traduit par André Gide.
Published/Created : [Paris] Gallimard [1946]
Description : 2 p. l., 7-237, [2] p. 17 cm.
CALL NUMBER : PR2779.H3 G5 Copy 1
The same record is shown on four further slides, each marking the part of the record that corresponds to one Group 1 entity: Work, Expression, Manifestation, Item
Attributes of Work
title of the work
form of work
date of the work
other distinguishing characteristic
intended termination
intended audience
context for the work
medium of performance (musical work)
numeric designation (musical work)
key (musical work)
coordinates (cartographic work)
equinox (cartographic work)
Attributes of Expression
title of the expression
form of expression
date of expression
language of expression
other distinguishing characteristic
extensibility of expression
revisability of expression
extent of the expression
summarization of content
context for the expression
critical response to the expression
use restrictions on the expression
sequencing pattern (serial)
expected regularity of issue (serial)
expected frequency of issue (serial)
type of score (musical notation)
medium of performance (musical notation or recorded sound)
scale (cartographic image/object)
projection (cartographic image/object)
presentation technique (cartographic image/object)
representation of relief (cartographic image/object)
geodetic, grid, and vertical measurement (cartographic image/object)
recording technique (remote sensing image)
special characteristic (remote sensing image)
technique (graphic or projected image)
Attributes of Manifestation
title of the manifestation
statement of responsibility
edition/issue designation
place of publication/distribution
publisher/distributor
date of publication/distribution
fabricator/manufacturer
series statement
form of carrier
extent of the carrier
physical medium
capture mode
dimensions of the carrier
manifestation identifier
source for acquisition/access authorization
terms of availability
access restrictions on the manifestation
typeface (printed book)
type size (printed book)
foliation (hand-printed book)
collation (hand-printed book)
publication status (serial)
numbering (serial)
playing speed (sound recording)
groove width (sound recording)
kind of cutting (sound recording)
tape configuration (sound recording)
kind of sound (sound recording)
special reproduction characteristic (sound recording)
colour (image)
reduction ratio (microform)
polarity (microform or visual projection)
generation (microform or visual projection)
presentation format (visual projection)
system requirements (electronic resource)
file characteristics (electronic resource)
mode of access (remote access electronic resource)
access address (remote access electronic resource)
Attributes of Item
item identifier
fingerprint
provenance of the item
marks/inscriptions
exhibition history
condition of the item
treatment history
scheduled treatment
access restrictions on the item
Attributes and user tasks
The “knowledge” described by means of the entities and their attributes is the base upon which a user can perform his/her tasks
Verify the usefulness of the attributes for the performance of each task
User tasks
– Find
– Identify
– Select
– Obtain
Work and users’ tasks
■ = high value □ = medium value ○ = low value
Manifestation and users’ tasks
■ = high value □ = medium value ○ = low value
The “FRBR family”
FRBR: the original framework
– All entities, focusing on Group 1
FRAR (FRAD): Functional Requirements for Authority Records/Data
– Focus on Group 2
– Published in 2009
FRSAR (FRSAD): Functional Requirements for Subject Authority Records/Data
– Focus on ‘aboutness’
– In revision after IFLA review, published in 2010
FRAD
FRSAD
Simple Dublin Core Elements
Definition of elements (or terms) to describe resources

Content        Intellectual Property   Instantiation
Title          Creator                 Date
Subject        Contributor             Format
Description    Publisher               Identifier
Type           Rights                  Language
Source
Relation
Coverage
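As an illustration, a Simple Dublin Core record can be held as a mapping from element name to a list of values, since every element is optional and repeatable. The sketch below makes two assumptions that are not in the slides: the helper name `unknown_elements`, and the way the Library of Congress record for Gide's translation of Hamlet (shown earlier) is mapped onto DC elements.

```python
# The 15 elements of Simple Dublin Core, in the three groups of the table.
DC_ELEMENTS = {
    "content": ["title", "subject", "description", "type",
                "source", "relation", "coverage"],
    "intellectual property": ["creator", "contributor", "publisher", "rights"],
    "instantiation": ["date", "format", "identifier", "language"],
}
ALL_ELEMENTS = {name for group in DC_ELEMENTS.values() for name in group}


def unknown_elements(record):
    """Return the keys of `record` that are not Simple Dublin Core elements."""
    return sorted(set(record) - ALL_ELEMENTS)


# Hypothetical mapping of the catalogue record; each element has a list of values.
hamlet = {
    "title": ["Hamlet, traduit par André Gide"],
    "creator": ["Shakespeare, William, 1564-1616"],
    "publisher": ["Gallimard"],
    "date": ["1946"],
    "language": ["fr"],
    "identifier": ["http://lccn.loc.gov/47023612"],
}

print(len(ALL_ELEMENTS))
print(unknown_elements(hamlet))
print(unknown_elements({"title": ["x"], "author": ["y"]}))
```

The last call reports `author`, which is not one of the 15 elements; the Dublin Core element for that role is `creator`.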
Terminology for DCAM
Dublin Core Abstract Model
Resource
– a resource is anything that has identity. For example, a resource may be an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.
Property
– a property is a specific aspect, characteristic, attribute, or relation used to describe a resource.
Record
– a record is some structured metadata about a resource, comprising one or more properties and their associated values.
DCAM – DC Abstract Model
Dublin Core is used to make descriptions about resources
A description is made up of
– the URI of the resource being described (resource URI)
– one or more statements (about just that one resource)
Each statement is made up of
– a property URI (that identifies a property)
– a value URI (that identifies a value) and/or
– one or more representations of the value (usually a value string)
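The structure just described (a description holds a resource URI and statements; a statement holds a property URI plus a value URI and/or value strings) can be sketched as two small Python classes. The class and field names are assumptions chosen for this illustration, not names defined by DCAM.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    """One statement: a property URI plus a value URI and/or value strings."""
    property_uri: str
    value_uri: Optional[str] = None
    value_strings: List[str] = field(default_factory=list)

    def __post_init__(self):
        # A statement needs at least one way of pointing at its value.
        if self.value_uri is None and not self.value_strings:
            raise ValueError("statement needs a value URI and/or a value string")


@dataclass
class Description:
    """Statements about exactly one resource, identified by its URI."""
    resource_uri: str
    statements: List[Statement] = field(default_factory=list)


d = Description(
    "http://example.org/123",
    [Statement("http://purl.org/dc/terms/title",
               value_strings=["Learning Biology"])],
)
print(d.statements[0].property_uri)
```

The check in `__post_init__` encodes the "and/or" of the slide: a statement with neither a value URI nor a value string is rejected.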
Literal and non-literal values
DCAM supports the distinction between
values that really are strings (literals, indicated with value strings)
– titles (text)
– counts (integers)
– identifiers (string tokens)
– etc.
values that are things, concepts or other non-string resources (non-literals, indicated with a value URI, a vocabulary encoding scheme, one or more value strings)
– Persons
– Documents
– Events
– etc.
Encoding schemes
Values and value strings can be ‘qualified’ by using encoding schemes
A vocabulary encoding scheme is used to indicate a “set of values”, of which the value is a member
– e.g. the value is a member of LCSH
A syntax encoding scheme is used to indicate how the value string is structured
– e.g. the value string is structured according to the W3CDTF rules (“2004-10-12”)
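A syntax encoding scheme lets software check a value string mechanically. As a minimal sketch, the Python function below tests a value string against the date-only W3CDTF forms (YYYY, YYYY-MM, YYYY-MM-DD); the forms with a time of day are deliberately left out, and the function name `is_w3cdtf_date` is an assumption made for this example.

```python
import re
from datetime import date

# Date-only W3CDTF forms: YYYY, YYYY-MM, YYYY-MM-DD.
_W3CDTF_DATE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def is_w3cdtf_date(value_string: str) -> bool:
    """Check a value string against the date-only W3CDTF forms."""
    m = _W3CDTF_DATE.fullmatch(value_string)
    if m is None:
        return False
    # Missing month or day defaults to 1 so the calendar check still works.
    year, month, day = (int(g) if g else 1 for g in m.groups())
    try:
        date(year, month, day)  # rejects month 13, 30 February, etc.
    except ValueError:
        return False
    return True


print(is_w3cdtf_date("2004-10-12"))
print(is_w3cdtf_date("12/10/2004"))
```

The first call accepts the value string from the slide; the second rejects a date written in a different layout, which is exactly the ambiguity a syntax encoding scheme removes.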
Summary of the model
Example of a DCAM description set in DC-TEXT
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( <http://example.org/123> )
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "Learning Biology"
        Language ( "en" )
      )
    )
  )
)
RDF representation
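In RDF the description above amounts to a single triple whose object is a language-tagged literal. The sketch below writes that triple as one N-Triples line using plain Python; the function names are hypothetical and the `dcterms:` prefix is expanded by hand.

```python
def nt_escape(text: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def literal_triple(resource_uri, property_uri, value_string, language=None):
    """Format one statement with a literal value as an N-Triples line."""
    obj = '"%s"' % nt_escape(value_string)
    if language:
        obj += "@" + language
    return "<%s> <%s> %s ." % (resource_uri, property_uri, obj)


line = literal_triple(
    "http://example.org/123",
    "http://purl.org/dc/terms/title",
    "Learning Biology",
    language="en",
)
print(line)
# prints: <http://example.org/123> <http://purl.org/dc/terms/title> "Learning Biology"@en .
```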
Another DCAM description in DC-TEXT
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/ns#> .
DescriptionSet (
  Description (
    ResourceURI ( <http://example.org/person123> )
    Statement (
      PropertyURI ( ex:age )
      LiteralValueString ( "43"
        SyntaxEncodingSchemeURI ( xsd:int )
      )
    )
  )
)
RDF representation
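Here the syntax encoding scheme plays the role of an RDF datatype, so the triple carries a typed literal. A minimal sketch in plain Python, with a hypothetical function name and the `xsd:` and `ex:` prefixes expanded by hand:

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def typed_literal_triple(resource_uri, property_uri, value_string, datatype_uri):
    """Format a statement whose value string has a syntax encoding scheme."""
    escaped = value_string.replace("\\", "\\\\").replace('"', '\\"')
    return '<%s> <%s> "%s"^^<%s> .' % (
        resource_uri, property_uri, escaped, datatype_uri)


line = typed_literal_triple(
    "http://example.org/person123",
    "http://example.org/ns#age",
    "43",
    XSD + "int",
)
print(line)
# prints: <http://example.org/person123> <http://example.org/ns#age> "43"^^<http://www.w3.org/2001/XMLSchema#int> .
```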
Another DCAM description
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://www.example.org/ns#> .
DescriptionSet (
  Description (
    ResourceURI ( <http://example.org/123> )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueURI ( <http://example.org/subject32> )
      VocabularyEncodingSchemeURI ( ex:ExampleSubjects )
      ValueString ( "Biology"
        Language ( "en" )
      )
      ValueString ( "EA32"
        SyntaxEncodingSchemeURI ( ex:SubjectEncoding )
      )
    )
  )
)
RDF representation
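For a non-literal value the RDF graph has more than one triple: the value URI becomes the object of the property, and further triples hang off the value itself. The sketch below assumes the mapping used in the DCMI guidelines for expressing Dublin Core in RDF, namely `dcam:memberOf` for the vocabulary encoding scheme and `rdf:value` for each value string; treat that mapping, and the function name, as assumptions of this example rather than something stated on the slide.

```python
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
DCAM_MEMBER_OF = "http://purl.org/dc/dcam/memberOf"


def non_literal_triples(resource_uri, property_uri, value_uri,
                        scheme_uri=None, value_strings=()):
    """Return N-Triples lines for a statement with a non-literal value.

    value_strings is a sequence of (text, language, datatype_uri) tuples;
    at most one of language and datatype_uri should be set per tuple.
    """
    lines = ["<%s> <%s> <%s> ." % (resource_uri, property_uri, value_uri)]
    if scheme_uri:
        lines.append("<%s> <%s> <%s> ." % (value_uri, DCAM_MEMBER_OF, scheme_uri))
    for text, language, datatype_uri in value_strings:
        obj = '"%s"' % text.replace("\\", "\\\\").replace('"', '\\"')
        if language:
            obj += "@" + language
        elif datatype_uri:
            obj += "^^<%s>" % datatype_uri
        lines.append("<%s> <%s> %s ." % (value_uri, RDF_VALUE, obj))
    return lines


EX = "http://www.example.org/ns#"
triples = non_literal_triples(
    "http://example.org/123",
    "http://purl.org/dc/terms/subject",
    "http://example.org/subject32",
    scheme_uri=EX + "ExampleSubjects",
    value_strings=[("Biology", "en", None),
                   ("EA32", None, EX + "SubjectEncoding")],
)
print("\n".join(triples))
```

The description yields four triples: one linking the resource to its subject, one placing the subject in the vocabulary, and one per value string.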
From DCAM to RDF to .....
Real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way)
In the abstract model they are known as description sets
– for example, a description set might comprise descriptions of both a painting and the artist
Description sets are instantiated, for the purposes of exchange between software applications, in the form of metadata records
Each record conforms to one of the DCMI encoding guidelines (HTML meta tags, XML, RDF/XML, etc.)
It is easy to express a DC description set as an RDF graph, and then express it in RDF/XML
Use of abstract model