A Quick Introduction to Metadata

Michael Day

UKOLN: The UK Office for Library and Information Networking, University of Bath

http://www.ukoln.ac.uk/

[email protected]

Running a Public Library Website

A workshop organised by UKOLN in association with EARL

University of Bath, 15-16 November 1999

Running a Public Library Website, University of Bath, 15-16 November 1999


Presentation Outline

• Some definitions
• Metadata and the Web
  – RDF
• Resource discovery
  – Dublin Core
  – Information Gateways
• Other metadata implementations
  – Digital preservation


Metadata: definitions (1)

Metadata = data about data

“… the Internet-age term for structured data about data” - Joint NSF-EU Working Group on Metadata (1998)

“… structured data about data that imposes order on a disordered information universe” - Carl Lagoze (Cornell University)


Metadata: definitions (2)

“… machine understandable information about web resources or other things” - Tim Berners-Lee (World Wide Web Consortium)

Roles:
• Provides information about resources
• Supports operations carried out on information objects


Metadata: uses

Metadata can support many potential applications:

• Resource discovery
• Content ratings
• E-commerce
• Authentication
• Data management
• Intellectual property rights management
• Digital preservation


Metadata and the Web

Metadata - the missing architectural component from the initial implementation of the Web

• Metadata: RDF (PICS, TCN, MCF, DSig, DC, ...)
• Addressing: URL
• Data format: HTML
• Transport: HTTP


RDF

The Resource Description Framework:
• Part of the W3C (World Wide Web Consortium) Metadata Activity
• Developing a common syntax for expressing assertions about information on the web
  – RDF Syntax Working Group
  – RDF data model and RDF/XML syntax
  – RDF Schema Working Group

http://www.w3.org/Metadata/
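As a rough illustration of what "expressing assertions about information on the web" can mean, the Python sketch below models an assertion as a (resource, property, value) statement. The `Statement` type and the `describe` helper are hypothetical names made up for this illustration and are not part of any RDF toolkit; the example values come from the EARL record in the RDF/XML example later in this presentation.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    """One assertion: a resource has a property with a given value."""
    resource: str  # what is being described, identified by its URI
    prop: str      # the property, itself named by a URI
    value: str     # the value of that property


# Namespace URI as used in the RDF/XML example in this presentation.
DC = "http://purl.org/dc/elements/1.0/"


def describe(resource: str, **properties: str) -> List[Statement]:
    """Build one statement per keyword argument (hypothetical helper)."""
    return [Statement(resource, DC + name, value)
            for name, value in properties.items()]


statements = describe(
    "http://www.earl.org.uk/index.html",
    title="EARL, the Consortium for Public Library Networking",
    creator="EARL Consortium",
)
for s in statements:
    print(s.resource, s.prop, repr(s.value))
```

Naming properties by URI is what lets independently defined vocabularies be mixed in one description without their names clashing.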


Resource discovery

Main approaches:
– Robot-based Web index services (AltaVista, Lycos, etc.)
– Utilising human intelligence to identify and evaluate Internet resources.
  – Links pages
  – Information gateways
– The library cataloguing method, creating bibliographic records for Internet resources in library catalogues (InterCat)


A metadata typology

From simple (Band One) to rich (Band Three):

Band One
(full text indexes)
• Proprietary formats

Band Two
(simple structured generic formats)
• Proprietary formats
• Dublin Core
• ROADS
• IAFA/Whois++ templates

Band Three
(more complex structure, domain specific)
• FGDC
• MARC
(part of larger semantic framework)
• TEI headers
• ICPSR
• EAD
• CIMI

Adapted from: L. Dempsey and R. Heery, “Metadata: a current view of practice and issues”, Journal of Documentation, vol. 54, no. 2, March 1998, pp. 145-172.


The Dublin Core

Dublin Core Metadata Initiative (DCMI):
• An initiative to define a core set of metadata elements for resource discovery on the Internet
• 7 DC workshops
  – “ ... the broadest international, interdisciplinary effort in resource description on the Internet ... the leading initiative for improving resource discovery on the Web” - Stu Weibel (OCLC)

http://purl.oclc.org/dc


DC elements

15 Elements:
• Title
• Subject
• Description
• Creator
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights

Semantics defined in Internet RFC 2413 (1998); now superseded by DC version 1.1
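A small element set like this is easy to check by machine. The Python sketch below lists the 15 element names from this slide and flags any field in a record that is not one of them. The `unknown_elements` function and the sample record (including its deliberately wrong "Author" field) are hypothetical and only for illustration.

```python
# The 15 Dublin Core elements, in the order given on this slide.
DC_ELEMENTS = (
    "Title", "Subject", "Description", "Creator", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def unknown_elements(record: dict) -> list:
    """Return the keys of `record` that are not among the 15 DC elements."""
    known = {name.lower() for name in DC_ELEMENTS}
    return sorted(key for key in record if key.lower() not in known)


# Illustrative record only; "Author" is not a DC element name (Creator is).
record = {
    "Title": "Dorset Library Service",
    "Type": "Text",
    "Author": "unknown",
}
print(len(DC_ELEMENTS))          # 15
print(unknown_elements(record))  # ['Author']
```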


DC qualifiers

DC-4 Workshop (Canberra):

• TYPE, SCHEME and LANGUAGE

DC Data Model working group:

• Element Qualifiers - refine the semantics of a DC element

• Value Qualifiers - give context to the element value by

– indicating how to parse the value, e.g. an ISO 8601 date

– indicating the use of controlled vocabularies, e.g. LCSH, DDC or LCNAF

• Value Components
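To make the idea of a value qualifier concrete, the Python sketch below attaches a scheme label to a value and uses it to decide how to parse the content. The `QualifiedValue` class, the `interpret` function and the scheme labels "ISO8601" and "LCSH" are assumptions chosen for this illustration; they are not taken from the DC Data Model working group's drafts.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Union


@dataclass
class QualifiedValue:
    element: str                  # DC element, e.g. "Date"
    content: str                  # the raw value as recorded
    scheme: Optional[str] = None  # value qualifier (labels here are assumed)


def interpret(value: QualifiedValue) -> Union[date, str]:
    """Parse the content when the scheme says how; otherwise return it as-is."""
    if value.scheme == "ISO8601":
        # Only full calendar dates (YYYY-MM-DD) are handled in this sketch.
        return date.fromisoformat(value.content)
    # A controlled-vocabulary scheme such as "LCSH" tells a reader where the
    # term comes from, but the string itself needs no parsing.
    return value.content


parsed = interpret(QualifiedValue("Date", "1999-08-05", scheme="ISO8601"))
print(parsed.year, parsed.month, parsed.day)  # 1999 8 5
```

Without the scheme, software can only treat "1999-08-05" as an opaque string; with it, the value can be compared and sorted as a date.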


DC syntax

Guidelines and tools developed:
• “Encoding DC Metadata in HTML” (Internet-Draft)
• Data Model working group - “Guidance on expressing DC within the RDF” (working draft)
• Creation tools - e.g., DC-dot: http://www.ukoln.ac.uk/metadata/dcdot/

Some examples ...


DC in HTML (1)

<html>
<head>
  <title>Dorset Library Service</title>
  <link rel="schema.DC" href="http://purl.org/dc">
  <meta name="DC.Title" content="Dorset Library Service">
  <meta name="DC.Subject" content="public libraries; Dorset County Council">
  <meta name="DC.Publisher" content="European Regional Internet Registry/RIPE NCC">
  <meta name="DC.Date" scheme="WTN8601" content="1999-08-05">
  <meta name="DC.Type" content="Text">
  <meta name="DC.Format" content="text/html">
  <meta name="DC.Format" content="3791 bytes">
  <meta name="DC.Identifier" content="http://www.dorset-cc.gov.uk/library.htm">
</head>
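Tags in this form are simple enough to generate from a list of values. The Python sketch below renders (element, content, scheme) tuples in the same layout as the example above. The `dc_meta_tags` function is a hypothetical helper written for this illustration and does not describe how DC-dot or any other creation tool works internally.

```python
from html import escape


def dc_meta_tags(record):
    """Render (element, content, scheme) tuples as HTML <link>/<meta> lines."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc">']
    for element, content, scheme in record:
        # The scheme attribute is only written when a scheme is given.
        scheme_attr = f' scheme="{escape(scheme)}"' if scheme else ""
        lines.append(
            f'<meta name="DC.{element}"{scheme_attr} '
            f'content="{escape(content)}">'
        )
    return "\n".join(lines)


record = [
    ("Title", "Dorset Library Service", None),
    ("Date", "1999-08-05", "WTN8601"),
    ("Type", "Text", None),
]
print(dc_meta_tags(record))
```

Escaping the content matters in practice: a title containing a quotation mark or an ampersand would otherwise break the attribute.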


DC in HTML (2)

<html>
<head>
  <title>Bath and North East Somerset Library and Archives</title>
  <link rel="schema.DC" href="http://purl.org/dc">
  <meta name="DC.Title" content="Bath and North East Somerset Library and Archives">
  <meta name="DC.Subject" content="public libraries; archives; Bath and North East Somerset">
  <meta name="DC.Publisher" content="Bath University">
  <meta name="DC.Date" scheme="WTN8601" content="1999-06-23">
  <meta name="DC.Type" content="Text">
  <meta name="DC.Format" content="text/html">
  <meta name="DC.Format" content="2719 bytes">
  <meta name="DC.Identifier" content="http://hosted.ukoln.ac.uk/libweb/bathnes/">
</head>
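Going the other way, a metadata-aware indexer needs to pull these values back out of a page. The Python sketch below does that with the standard library's `html.parser` module. The `DCMetaParser` class is a hypothetical name for this illustration, and the sample page is a shortened copy of the example above.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect <meta name="DC.*"> tags as (element, content) pairs."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.upper().startswith("DC."):
            # Drop the "DC." prefix and keep the element name.
            self.pairs.append((name[3:], attributes.get("content", "")))


page = """<html><head>
<title>Bath and North East Somerset Library and Archives</title>
<meta name="DC.Type" content="Text">
<meta name="DC.Format" content="text/html">
<meta name="DC.Format" content="2719 bytes">
</head></html>"""

parser = DCMetaParser()
parser.feed(page)
print(parser.pairs)
```

The result is kept as a list of pairs, not a dictionary, because an element such as Format may be repeated.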


DC in RDF/XML

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://www.earl.org.uk/index.html">
    <dc:title>EARL, the Consortium for Public Library Networking</dc:title>
    <dc:creator>EARL Consortium</dc:creator>
    <dc:type>Text</dc:type>
    <dc:format>4699 bytes</dc:format>
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>
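Because RDF/XML is ordinary XML, a generic XML parser is enough to read a simple description like this one. The Python sketch below uses the standard library's `xml.etree.ElementTree` to list the DC properties as (resource, element, value) tuples. It is not a general RDF parser: it assumes one level of `rdf:Description` with DC child elements and the unprefixed `about` attribute, as in the example above, and the embedded document is a shortened copy of that example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.0/"

rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://www.earl.org.uk/index.html">
    <dc:title>EARL, the Consortium for Public Library Networking</dc:title>
    <dc:creator>EARL Consortium</dc:creator>
    <dc:type>Text</dc:type>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(rdf_xml)
triples = []
for description in root.findall(f"{{{RDF_NS}}}Description"):
    subject = description.get("about")  # unprefixed, as in the example
    for prop in description:
        # ElementTree spells qualified names as "{namespace}localname".
        if prop.tag.startswith(f"{{{DC_NS}}}"):
            name = prop.tag[len(DC_NS) + 2:]  # strip "{namespace}"
            triples.append((subject, name, (prop.text or "").strip()))

for triple in triples:
    print(triple)
```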


In abbreviated syntax

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description about="http://www.earl.org.uk/index.html"
      dc:Title="EARL, the Consortium for Public Library Networking"
      dc:Creator="EARL Consortium [email protected]"
      dc:Subject="earl, public libraries, uk, networking, consortium"
      dc:Publisher="EARL Consortium [email protected]"
      dc:Date="1999-10-20"
      dc:Type="Text">
    <dc:format>
      <rdf:Bag rdf:_1="text/html" rdf:_2="4699 bytes" />
    </dc:format>
  </rdf:Description>
</rdf:RDF>


DC Implementations

• DC creation tools
  – DC-dot
  – Nordic Metadata Project - Template
• Metadata-aware indexing tools
  – DESIRE - Combine
• Conversion tools
  – Metadata Cross-walks
  – Nordic Metadata Project - d2m
  – Project BIBLINK
• Interoperability
  – AHDS Gateway


Information Gateways

Roles of gateways:
• Selection
  – Gateways select resources according to some pre-defined criteria (e.g. subject area, some measure of quality)
• Creation of metadata
  – Gateways create simple resource descriptions that can be both searched and browsed
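The difference between searching and browsing the same set of descriptions can be sketched in a few lines of Python. The records below are hypothetical gateway entries that reuse titles and subject terms from the DC examples earlier in this presentation; the `search` and `browse` functions are illustrative names and do not describe how any particular gateway is implemented.

```python
from collections import defaultdict

# Hypothetical gateway records, reusing titles and subjects from the
# DC in HTML examples earlier in this presentation.
records = [
    {"title": "Dorset Library Service",
     "subjects": ["public libraries", "Dorset County Council"]},
    {"title": "Bath and North East Somerset Library and Archives",
     "subjects": ["public libraries", "archives"]},
]


def search(term):
    """Searching: return titles whose title or subjects contain the term."""
    term = term.lower()
    return [r["title"] for r in records
            if term in r["title"].lower()
            or any(term in s.lower() for s in r["subjects"])]


def browse():
    """Browsing: group titles under each subject heading."""
    index = defaultdict(list)
    for r in records:
        for subject in r["subjects"]:
            index[subject].append(r["title"])
    return dict(index)


print(search("archives"))
for subject, titles in browse().items():
    print(subject, "->", titles)
```

Both views are built from the same records, which is why a gateway only has to create each description once.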


The eLib programme

JISC funded:
• Selected gateways (SOSIG, EEVL, OMNI, Biz/ed, History, etc.)
• ROADS: Resource Organisation and Discovery in Subject-based services
  – Developing Web-based tools for information gateways
  – Cross-searching (Whois++)
  – Content creation rules (cataloguing guidelines)

http://www.ilrt.bris.ac.uk/roads/


The RDN

Resource Discovery Network
• Funded by JISC, ESRC and AHRB
• Co-operative network:
  – Independent service providers (hubs)
  – Resource Discovery Network Centre (RDNC)
    – Set service standards
    – Collection management policy
    – Develop strategic partnerships
  – Cross-searching across multiple hubs


Digital preservation

A variety of preservation strategies are available - all are dependent upon the creation, capture and storage of metadata

• Recent initiatives include:
  – Reference Model for an Open Archival Information System (OAIS)
  – Research Libraries Group (RLG) Working Group on Preservation Issues of Metadata
  – Cedars project - funded by JISC under eLib, managed by Consortium of University Research Libraries (CURL)
  – Digital Services Project - National Library of Australia


UKOLN

UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the higher education funding councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath, where it is based.

http://www.ukoln.ac.uk/