Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Page 1: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Metadata Standards and Applications

6. Vocabularies: Attributes and Values

Page 2: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Goals of Session

Understand how different vocabularies are used in metadata
Learn about relationships in vocabularies
Understand methods of encoding vocabularies for various purposes
Learn about how registries are used to document vocabularies

Page 3: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Vocabulary Issues

Where vocabularies occur in metadata
Establishment of formal relationships among terms (where appropriate)
Testing and validation of terms
The role of Metadata Registries

Page 4: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Why bother?

To improve retrieval, i.e., to get an optimum balance of precision and recall
  – Precision: How many of the retrieved records are relevant?
  – Recall: How many of the relevant records did you retrieve?
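The two questions above translate directly into ratios over sets of record identifiers. The following is a minimal sketch, assuming relevance judgments are available as a set; the function name and the record ids are illustrative only.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: record ids the system returned
    relevant:  record ids judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 records retrieved, 3 of them relevant, 6 relevant overall.
p, r = precision_recall(
    {"r1", "r2", "r3", "r9"},
    {"r1", "r2", "r3", "r4", "r5", "r6"},
)
print(p, r)  # 0.75 0.5
```

Retrieving more records tends to raise recall and lower precision, which is why the slide speaks of a balance between the two.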

Page 5: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Improving recall and precision

Controlled vocabularies improve recall by addressing synonyms [attire vs. dress vs. clothing]
Controlled vocabularies improve precision by addressing homographs [bridge (game) vs. bridge (structure) vs. bridge (dental device)]

Page 6: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Types of Controlled Vocabularies

Lists
Synonym Rings
Taxonomy
Thesaurus
[Classification Schemes]
Ontology

Page 7: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Thesauri & Classification

Some knowledge management researchers feel that these are essentially the same, with the primary difference being whether the preferred term is a notation
As the need for machine-readable encoding grows, some additional differences are emerging

Page 8: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Lists

A list is a simple group of terms
Example:
  Alabama
  Alaska
  Arkansas
  California
  Colorado
  . . . .
Frequently used in Web site pick lists and pull-down menus

Page 9: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Synonym Rings

Synonym rings are used to expand queries for content objects
  – If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster
Synonym rings are often used in systems where the underlying content objects are left in their unstructured natural language format
  – The control is achieved through the interface by drawing together similar terms into these clusters
Synonym rings are used in conjunction with search engines and provide a minimal amount of control of the diversity of the language found in the texts of the underlying documents
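Query expansion with a synonym ring can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular search engine: the rings, the documents, and the whitespace tokenization are all hypothetical.

```python
# Hypothetical synonym rings; each ring is a cluster of interchangeable terms.
RINGS = [
    {"attire", "dress", "clothing"},
    {"car", "automobile"},
]

# Hypothetical unstructured documents, keyed by id.
DOCS = {
    1: "Formal attire is required at the gala",
    2: "Winter clothing drive starts Monday",
    3: "Bridge repairs delay traffic",
}


def expand(term):
    """Return every term in the ring that contains `term` (or just the term)."""
    term = term.lower()
    for ring in RINGS:
        if term in ring:
            return set(ring)
    return {term}


def search(term):
    """Return ids of documents containing any term from the expanded query."""
    terms = expand(term)
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if terms & set(text.lower().split())
    )


print(search("dress"))  # [1, 2]
```

Neither matching document contains the word "dress"; both are found only because the ring expands the query. The documents themselves are never changed, which is the sense in which control is achieved through the interface.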

Page 10: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Taxonomies

A taxonomy is a set of preferred terms, all connected by a hierarchy or polyhierarchy
Example:
  Chemistry
    Organic chemistry
      Polymer chemistry
        Nylon
Frequently used in web navigation systems
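A simple hierarchy like the example can be stored as child-to-parent links, which is enough to build the breadcrumb trails used in web navigation. The sketch below assumes the four example terms nest one under another; the function and variable names are illustrative.

```python
# Child -> parent links for the example terms; None marks the root.
# A polyhierarchy would need a list of parents per term instead of one.
PARENT = {
    "Chemistry": None,
    "Organic chemistry": "Chemistry",
    "Polymer chemistry": "Organic chemistry",
    "Nylon": "Polymer chemistry",
}


def breadcrumb(term):
    """Walk up from `term` to the root and return the path root-first."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return list(reversed(path))


print(" > ".join(breadcrumb("Nylon")))
# Chemistry > Organic chemistry > Polymer chemistry > Nylon
```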

Page 11: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Thesauri

A thesaurus is a controlled vocabulary with multiple types of relationships
Example:
  Rice
    UF paddy
    BT Cereals
    BT Plant products
    NT Brown rice
    RT Rice straw
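The Rice entry can be held as a small record keyed by relationship code, and the UF links then let a system route an entry term to its preferred term. This is a minimal sketch; the dictionary layout and the `preferred_term` helper are assumptions made for illustration.

```python
# The Rice entry, keyed by relationship code:
# UF = used for, BT = broader term, NT = narrower term, RT = related term.
THESAURUS = {
    "Rice": {
        "UF": ["paddy"],
        "BT": ["Cereals", "Plant products"],
        "NT": ["Brown rice"],
        "RT": ["Rice straw"],
    },
}


def preferred_term(entry):
    """Map an entry term to its preferred term via the UF (used for) links."""
    wanted = entry.lower()
    for term, rels in THESAURUS.items():
        if wanted == term.lower():
            return term
        if wanted in (uf.lower() for uf in rels.get("UF", [])):
            return term
    return None  # not in the vocabulary


print(preferred_term("paddy"))  # Rice
```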

Page 12: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Ontology

A useful definition: “An arrangement of concepts and relations based on an underlying model of reality.”
  – Ex.: Organs, symptoms, and diseases in medicine
No real agreement on definition: every community uses the term in a slightly different way

Page 13: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Thesaural Relationships

Relationship types:
  – Use/Used For: indicates preferred term
  – Hierarchy: indicates broader and narrower terms
  – Associative: almost unlimited types of relationships may be used
The thesaurus is the most complex format for controlled vocabularies and is widely used.

Page 14: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Page 15: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Z39.19 Types of Concepts

Things and their physical parts
Materials
Activities or processes
Events or occurrences
Properties or states of persons, things, materials or actions
Disciplines or subject fields
Units of measurement
Unique entities

Page 16: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Examples

Birds (things)
Ornithology (discipline)
Feathers (materials)
Flying (activity or process)
Bird counts (event)
Barn Owl (unique entity)

Page 17: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Relationships

Equivalence
Hierarchical
Associative

Page 18: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Equivalence Relationships

Term A and Term B overlap completely

A = B

Page 19: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Hierarchical Relationships

Term A is included in Term B

[Diagram: A inside B]

Page 20: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Associative Relationships

Semantics of terms A and B overlap

[Diagram: A and B partially overlapping]
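The three diagrams (complete overlap, inclusion, partial overlap) can be read as set relations between the scopes of two terms. The sketch below is only a way to make that picture concrete: real thesauri assign relationships by editorial judgment rather than by computing overlaps, and the `classify` function and toy scopes are hypothetical.

```python
def classify(a, b):
    """Compare the scope of two terms, each given as a set of covered items."""
    if a == b:
        return "equivalence"      # A and B overlap completely
    if a < b or b < a:
        return "hierarchical"     # one term is wholly included in the other
    if a & b:
        return "associative"      # scopes overlap only in part
    return "unrelated"


# Toy scopes; the numbers stand for arbitrary items a term might cover.
print(classify({1, 2, 3}, {1, 2, 3}))     # equivalence
print(classify({1, 2}, {1, 2, 3, 4}))     # hierarchical
print(classify({1, 2, 3}, {3, 4, 5}))     # associative
```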

Page 21: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Expressing Relationship

Page 22: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Hierarchy rules

Relationships must be independent of context
Examples:
  – Mice (BT Rodents); Rodents (NT Mice)
  – NOT Mice (BT Pests); Pests (NT Mice)

Page 23: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Hierarchy rules

Terms must represent the same type of entity
Examples:
  – Shoes (BT Footwear); Footwear (NT Shoes)
  – NOT Shoes (BT Shoemaking); Shoemaking (NT Shoes)
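The same-type rule lends itself to a mechanical check once each term carries a concept type, and the reciprocal NT side can be derived from the BT links instead of being entered twice. This is a sketch under assumptions: the type labels, `TYPE_OF`, `PROPOSED_BT`, and both helper functions are hypothetical. The context-independence rule (Mice/Pests) still needs human judgment.

```python
# Hypothetical concept types (in the spirit of the Z39.19 list) for a few terms.
TYPE_OF = {
    "Shoes": "thing",
    "Footwear": "thing",
    "Shoemaking": "activity",
}

# Proposed broader-term links as (narrower, broader) pairs.
PROPOSED_BT = [("Shoes", "Footwear"), ("Shoes", "Shoemaking")]


def same_type_violations(links):
    """Return the links whose two terms are not the same type of entity."""
    return [(nt, bt) for nt, bt in links if TYPE_OF[nt] != TYPE_OF[bt]]


def reciprocal_nt(links):
    """Derive the NT side so every BT link has its matching reciprocal."""
    narrower = {}
    for nt, bt in links:
        narrower.setdefault(bt, []).append(nt)
    return narrower


print(same_type_violations(PROPOSED_BT))      # [('Shoes', 'Shoemaking')]
print(reciprocal_nt([("Mice", "Rodents")]))   # {'Rodents': ['Mice']}
```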

Page 24: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Vocabulary Management

The degree of control over a vocabulary is (mostly) independent of its type
  – Uncontrolled: Anybody can add anything at any time and no effort is made to keep things consistent
  – Managed: Software makes sure there is a list that is consistent (no duplicates, no orphan nodes) at any one time. Almost anybody can add anything, subject to consistency rules
  – Controlled: A documented process is followed for the update of the vocabulary. Few people have authority to change the list. Software may help, but emphasis is on human processes and custodianship
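The "managed" level is the one where software enforces the consistency rules. A minimal sketch of such enforcement follows, assuming a single-parent hierarchy and case-insensitive duplicate detection; the `ManagedVocabulary` class is hypothetical, not any registry's actual implementation.

```python
class ManagedVocabulary:
    """A term list that rejects duplicates and orphan nodes on every add."""

    def __init__(self, root):
        # Maps each term to its parent; the root has no parent.
        self.parent = {root: None}

    def add(self, term, parent):
        key = term.strip()
        # Rule 1: no duplicates (compared case-insensitively).
        if key.lower() in (t.lower() for t in self.parent):
            raise ValueError(f"duplicate term: {key}")
        # Rule 2: no orphan nodes (the parent must already be in the list).
        if parent not in self.parent:
            raise ValueError(f"orphan node: parent {parent!r} is not in the list")
        self.parent[key] = parent


vocab = ManagedVocabulary("Clothing")
vocab.add("Footwear", "Clothing")
vocab.add("Shoes", "Footwear")
```

Anyone may call `add`, but a call such as `vocab.add("shoes", "Footwear")` or `vocab.add("Sandals", "Beachwear")` raises `ValueError`. A controlled vocabulary would add a human approval step in front of this check.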

Page 25: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Informal Vocabularies

New movement towards ‘bottom up’ classification goes by many names:
  – Tagging
  – Social bookmarking
  – Folksonomies
Many in this movement, seeing problems of scale, are moving towards more formalization

Page 26: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Libraries/Museums and Tagging

Penn Tags
  – Still experimental, primarily internal to Penn
  – http://tags.library.upenn.edu/help/
Library of Congress Flickr project
  – Open public tagging, still unclear how results will be used
  – http://www.flickr.com/photos/library_of_congress/
The Art Museum Social Tagging Project
  – Research/software project focused on museum application
  – http://www.steve.museum/

Page 27: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Current Encoding Standards: Authorities

MARC 21
  – Authority Format used for names, subjects, series
  – Classification Format used for subject classification
MADS (a derivative of MARC authorities)
  – Used primarily for names

Page 28: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

MARC 21 Authority Name

Page 29: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

MARC 21 Authority Subject

Page 30: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

MARC 21 Classification LCC

Page 31: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

MARC 21 Classification DDC

Page 32: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

What is MADS?

Metadata Authority Description Schema
  – A companion to MODS for authority data using XML
  – Defines a subset of MARC authority elements using language-based tags
  – Elements have the same definitions as their MODS equivalents
MADS can be used for metadata about people, organizations, events, subjects, time periods, genres, geographics and occupations

Page 33: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

MADS Elements

Authority
  – name
  – titleInfo
  – topic
  – temporal
  – genre
  – geographic
  – hierarchicalGeographic
  – occupation
Related
  – same subelements
Variant
  – same subelements
Note
Affiliation
url
Identifier
fieldOfActivity
Extension
recordInfo
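To show how the authority, variant, and related elements fit together, the sketch below builds a MADS-like record for a topical heading with Python's standard XML library. It is an illustration only: the element names are written in lowercase camelCase, the namespace declaration is omitted, the `type="broader"` attribute value is an assumption, and the output has not been validated against the MADS schema.

```python
import xml.etree.ElementTree as ET

mads = ET.Element("mads")

# Authorized form of the heading.
authority = ET.SubElement(mads, "authority")
ET.SubElement(authority, "topic").text = "Rice"

# Variant form of the same heading (comparable to a thesaurus UF term).
variant = ET.SubElement(mads, "variant")
ET.SubElement(variant, "topic").text = "paddy"

# Related heading; the type value "broader" is an assumption here.
related = ET.SubElement(mads, "related", {"type": "broader"})
ET.SubElement(related, "topic").text = "Cereals"

xml_text = ET.tostring(mads, encoding="unicode")
print(xml_text)
```

The variant and related elements reuse the same subelement (`topic`) as the authority element, which is what "same subelements" refers to.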

Page 34: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Page 35: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

New/Upcoming Standards: Authorities

Functional Requirements for Authority Data (FRAD)
  – A new model for authority information
  – Developed by the IFLA Working Group on Functional Requirements and Numbering of Authority Records (FRANAR)
VIAF (Virtual International Authority File)
  – Prototype at: http://orlabs.oclc.org/viaf/
A Review of the Feasibility of an International Standard Authority Data Number (ISADN)
Simple Knowledge Organization System (SKOS), a W3C standard

Page 36: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Page 37: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Functions of the Authority File

Document decisions
Serve as reference tool
Control forms of access points
Support access to bibliographic files
Link bibliographic and authority files

(Slide from Glenn Patton)

Page 38: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

FRANAR Concept Model, top

Page 39: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

FRANAR Concept Model, bottom

Page 40: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

FRAD person attributes

From FRBR (AACR2 additions to names):
  Dates associated with the person
  Title of person
  Other designation associated with the person
New:
  Gender
  Place of birth
  Place of death
  Country
  Place of residence
  Affiliation
  Address
  Language of person
  Field of activity
  Profession/occupation
  Biography/history

(Slide from Ed Jones)

Page 41: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

VIAF Search Result

Page 42: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

VIAF DNB Display

Page 43: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

SKOS

Simple Knowledge Organization System (SKOS)
  – A World Wide Web Consortium (W3C) standard
  – Based on RDF and OWL
  – Currently resolving “last call” comments, will be finalized in early 2009
  – http://www.w3.org/skos/

Page 44: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

The skos:Concept class allows you to assert that a resource is a conceptual resource. That is, the resource is itself a concept.
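As a concrete illustration, the sketch below writes one concept with a preferred label, an alternative label, and a broader concept. It emits Turtle rather than the RDF/XML shown on the slides, and it uses plain string formatting instead of an RDF library. The `ex:` namespace, the concept names, and the `concept_to_turtle` helper are hypothetical; the SKOS namespace and property names are those of the W3C vocabulary.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
EX_NS = "http://example.org/vocab/"  # hypothetical namespace for our concepts


def concept_to_turtle(local_name, pref_label, alt_labels=(), broader=()):
    """Serialize one concept as Turtle using core SKOS properties.

    Labels are assumed to contain no quotes or backslashes (no escaping is done).
    """
    lines = [f"ex:{local_name} a skos:Concept"]
    lines.append(f'    skos:prefLabel "{pref_label}"@en')
    for label in alt_labels:
        lines.append(f'    skos:altLabel "{label}"@en')
    for parent in broader:
        lines.append(f"    skos:broader ex:{parent}")
    header = f"@prefix skos: <{SKOS_NS}> .\n@prefix ex: <{EX_NS}> .\n\n"
    return header + " ;\n".join(lines) + " .\n"


turtle = concept_to_turtle("rice", "Rice", alt_labels=["paddy"], broader=["cereals"])
print(turtle)
```

Here `skos:prefLabel` and `skos:altLabel` play the roles of the thesaurus preferred term and UF term, and `skos:broader` plays the role of BT.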

Page 45: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

The RDF/XML Encoded Version

Page 46: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Preferred and Alternative Lexical Labels

Page 47: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

The RDF/XML Encoded Version

Page 48: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Registries: the Big Picture

(Adapted from Wagner & Weibel, “The Dublin Core Metadata Registry: Requirements, Implementation, and Experience” JoDI, 2005)

Page 49: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Why Registries?

Support the “interoperability cycle”:
  – Discovery of available schemes and schemas for description of resources
  – Promote reuse of extant schemes and schemas
  – Access to machine-readable and human-readable services
  – Support for crosswalking and translation
Coping with a “state of perpetual metadata heterogeneity” (Bianchi and Petrone)

Page 50: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

What Do Registries Register?

Metadata Schemas (element sets, formats)
  – Crosswalks between metadata schemas
Controlled Vocabularies
  – Mappings between vocabularies
Application Profiles
  – Schema and vocabulary information in combination with specific usage instruction
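A crosswalk is, at its simplest, a table mapping element names in one schema to element names in another. The sketch below applies such a table to a record and reports elements that have no mapping. The local element names, the sample record, and the `crosswalk` function are hypothetical; the target names are Dublin Core elements.

```python
# Hypothetical crosswalk from a local element set to Dublin Core element names.
CROSSWALK = {
    "heading": "title",
    "author": "creator",
    "keyword": "subject",
}


def crosswalk(record, mapping):
    """Translate a record's element names; report elements with no mapping."""
    translated, unmapped = {}, []
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)  # no equivalent in the target schema
        else:
            translated.setdefault(target, []).append(value)
    return translated, unmapped


record = {"heading": "Rice farming", "author": "A. Writer", "shelf": "B12"}
print(crosswalk(record, CROSSWALK))
# ({'title': ['Rice farming'], 'creator': ['A. Writer']}, ['shelf'])
```

Registering the mapping table itself, rather than burying it in code, is what lets others discover and reuse it.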

Page 51: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Dublin Core Registry—Term Level

Page 52: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

NSDL Registry—Property Vocabulary List

Page 53: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

NSDL Registry—Property Vocabulary Detail

Page 54: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Element Detail RDF

Page 55: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Concept Vocabulary Detail

Page 56: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Concept Vocabulary XML Schema

Page 57: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Please Play!

The NSDL Registry has a “sandbox” where anyone can try out the registry software:
  – http://sandbox.metadataregistry.org
Please feel free to play in the Registry Sandbox!
Note: The production registry is open as well, but not for play …

Page 58: Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Acknowledgements

Some slides used here are from presentations by Marcia Zeng and Alistair Miles