Metadata: first principles

Pat Bell

Knowledge, Analysis and Intelligence

Definition

“Metadata is data about data … structured information about a resource”

Instances of metadata

resource: book

metadata: catalogue record

Instances of metadata

resource: record

metadata: corporate file plan

Instances of metadata

resource: person

metadata: directory entry

Instances of metadata

resource: web page

metadata: …

… (Right click on web page) …

… (Select view source) …

Instances of metadata

resource: web page

metadata: tags

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">

<!-- InstanceBeginEditable name="doctitle" -->

<title>HM Revenue &amp; Customs: Child Benefit &amp; Guardian's Allowance</title>

<!-- InstanceBeginEditable name="Metadata" -->

<meta name="title" lang="eng" content="" />

<meta name="description" lang="eng" content="" />

<meta name="keywords" lang="eng" content="" />

<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />

<meta name="DCTERMS.audience" lang="eng" content="all" />

<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />

<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />

<meta name="DC.date.modified" scheme="W3CDTF" content="" />

<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />

<meta name="DC.identifier" scheme="URI" content="" />

<meta name="DC.format" lang="eng" content="text/html"/>

<meta name="DC.language" scheme="ISO639-2/T" content="eng" />

<meta name="DC.publisher" lang="eng" content="HM Revenue and Customs" />

<meta name="eGMS.rights.copyright" lang="eng" content="HM Revenue and Customs" />

<!-- InstanceEndEditable -->
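The tags in the page source above can also be read by software, not just by someone using "view source". A minimal sketch in Python, using only the standard library's html.parser module: the class name MetaCollector, the function extract_metadata and the three-tag SAMPLE string are illustrative assumptions (the sample tags are copied from the page source shown above), not part of any published tool.

```python
from html.parser import HTMLParser

# Three tags copied from the page source shown above (illustrative sample).
SAMPLE = """
<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />
<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />
<meta name="DC.date.modified" scheme="W3CDTF" content="" />
"""


class MetaCollector(HTMLParser):
    """Collect the name, scheme and content of every <meta> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here by HTMLParser.
        if tag != "meta":
            return
        attr = dict(attrs)
        self.records.append({
            "name": attr.get("name"),
            "scheme": attr.get("scheme"),    # None when no encoding scheme is given
            "content": attr.get("content"),  # "" when the tag was left empty
        })


def extract_metadata(html):
    """Return one dict per <meta> tag found in the HTML text."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.records


if __name__ == "__main__":
    for record in extract_metadata(SAMPLE):
        print(record)
```

Empty content values come back as empty strings, which makes unfilled tags (such as DC.date.modified in the sample) easy to spot.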

1st principle: one resource, one description

The resource

The metadata

Title: Mona Lisa Title: Mona Lisa

Creator: Da Vinci Creator: Bell

Relation: (Very distant) Relation:

Uses of metadata today

Resource discovery

Resource administration

Technical support

search

authentication

navigation

disposal

version control

filtering

Intellectual property rights

preservation

Uses of metadata tomorrow: the semantic web

“An extension of the web … that will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can carry out sophisticated tasks for users”

Tim Berners-Lee et al., Scientific American, 17 May 2001

Uses of metadata: building blocks for the semantic web

• Metadata …

• … expressed using the Resource Description Framework (RDF) …

• … in standardised XML (eXtensible Markup Language) documents.

Find out more at the World Wide Web Consortium (W3C): www.w3.org/
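To make the three building blocks concrete, here is a rough sketch that writes a few Dublin Core statements as an RDF description in XML, using Python's standard xml.etree.ElementTree module. The two namespace URIs are the published RDF and Dublin Core element namespaces. The function name to_rdf_xml and the resource address http://example.org/child-benefit are assumptions made for illustration; the sample page leaves its DC.identifier empty, so no real address is implied.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def to_rdf_xml(resource_uri, properties):
    """Serialise simple Dublin Core element/value pairs as RDF/XML (illustrative)."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root,
        "{%s}Description" % RDF_NS,
        {"{%s}about" % RDF_NS: resource_uri},  # the resource being described
    )
    for element, value in properties.items():
        child = ET.SubElement(description, "{%s}%s" % (DC_NS, element))
        child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical resource address; the values come from the sample page's tags.
    print(to_rdf_xml(
        "http://example.org/child-benefit",
        {
            "creator": "HM Revenue and Customs",
            "publisher": "HM Revenue and Customs",
            "format": "text/html",
        },
    ))
```

Only unrefined Dublin Core elements are used here; refined terms such as date issued live in a separate namespace and are left out to keep the sketch short.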

Components of metadata: statement

<meta name="title" lang="eng" content="" />

<meta name="description" lang="eng" content="" />

<meta name="keywords" lang="eng" content="" />

<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />

<meta name="DCTERMS.audience" lang="eng" content="all" />

<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />

<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />

<meta name="DC.date.modified" scheme="W3CDTF" content="" />

<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />

<meta name="DC.identifier" scheme="URI" content="" />

<meta name="DC.format" lang="eng" content="text/html"/>

<meta name="DC.language" scheme="ISO639-2/T" content="eng" />

<meta name="DC.publisher" lang="eng" content="HM Revenue and Customs" />

<meta name="eGMS.rights.copyright" lang="eng" content="HM Revenue and Customs" />

Components of metadata: elements

<meta name="title" lang="eng" content="" />

<meta name="description" lang="eng" content="" />

<meta name="keywords" lang="eng" content="" />

<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />

<meta name="DCTERMS.audience" lang="eng" content="all" />

<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />

<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />

<meta name="DC.date.modified" scheme="W3CDTF" content="" />

<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />

Components of metadata: refinements (qualifiers)

<meta name="title" lang="eng" content="" />

<meta name="description" lang="eng" content="" />

<meta name="keywords" lang="eng" content="" />

<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />

<meta name="DCTERMS.audience" lang="eng" content="all" />

<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />

<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />

<meta name="DC.date.modified" scheme="W3CDTF" content="" />

<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />

2nd principle: dumb-down

A valid value for a refinement must also be valid for the unrefined element

eg date issued (2007-07-25) is still a valid value for plain date

but date updating frequency (monthly) is not a date, so it fails the test
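The dumb-down test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: names are taken to follow the prefix.element.refinement pattern seen in the sample tags, the name DC.date.updatingFrequency is a made-up label for the slide's "date updating frequency" example, and the helper names (dumb_down, is_date, survives_dumb_down) are hypothetical. Only the date element has a validity check here.

```python
from datetime import datetime


def dumb_down(name):
    """Drop the refinement: 'DC.date.issued' becomes 'DC.date' (assumed naming pattern)."""
    return ".".join(name.split(".")[:2])


def is_date(value):
    """True if the value parses as a yyyy-mm-dd calendar date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


# Validity checks for unrefined elements; only date is covered in this sketch.
VALIDATORS = {"DC.date": is_date}


def survives_dumb_down(name, value):
    """Check that a refined statement's value is still valid for the unrefined element."""
    check = VALIDATORS.get(dumb_down(name))
    return True if check is None else check(value)


if __name__ == "__main__":
    print(survives_dumb_down("DC.date.issued", "2007-07-25"))
    # Hypothetical refinement name standing in for "date updating frequency".
    print(survives_dumb_down("DC.date.updatingFrequency", "monthly"))
```

A refinement that fails this check is a sign that it belongs under a different element, not that the check should be loosened.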

Components of metadata: encoding schemes

<meta name="title" lang="eng" content="" />

<meta name="description" lang="eng" content="" />

<meta name="keywords" lang="eng" content="" />

<meta name="eGMS.subject.category" lang="eng" scheme="IPSV" content="Tax, Benefits" />

<meta name="DCTERMS.audience" lang="eng" content="all" />

<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />

<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />

<meta name="DC.date.modified" scheme="W3CDTF" content="" />

<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />

Components of metadata: encoding schemes

Two sorts:

• Controlled vocabulary (Pick list)

eg Library of Congress Subject Headings

• Syntax (Prescribed format)

eg Date format yyyy-mm-dd

(and you can have free text tags, like Title)
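The two sorts of encoding scheme suggest two sorts of check, which might be sketched as below. Everything here is an assumption for illustration: the pick list holds only the two terms that appear in the sample tags and is not the real vocabulary, the syntax pattern covers only the yyyy-mm-dd form, multiple terms are assumed to be comma-separated, and check_value is a hypothetical helper.

```python
import re

# Controlled vocabulary: a tiny illustrative subset, NOT the real pick list.
PICK_LISTS = {"IPSV": {"Tax", "Benefits"}}

# Syntax: only the complete-date form yyyy-mm-dd is checked in this sketch.
SYNTAX = {"W3CDTF": re.compile(r"\d{4}-\d{2}-\d{2}")}


def check_value(scheme, content):
    """Return True if content is acceptable under the named encoding scheme."""
    if scheme is None:
        return True  # free text, like Title: nothing to check
    scheme = scheme.strip()
    if scheme in PICK_LISTS:
        # Assumes several terms are separated by commas, as in "Tax, Benefits".
        terms = [term.strip() for term in content.split(",")]
        return all(term in PICK_LISTS[scheme] for term in terms)
    if scheme in SYNTAX:
        return SYNTAX[scheme].fullmatch(content) is not None
    raise ValueError("unknown scheme: %r" % scheme)


if __name__ == "__main__":
    print(check_value("IPSV", "Tax, Benefits"))
    print(check_value("W3CDTF", "2006-03-24"))
    print(check_value("W3CDTF", "2006/04/01"))
    print(check_value(None, "Any free text title"))
```

Under these assumptions the value 2006/04/01, which the sample page gives for eGMS.disposal.review, would be flagged because it uses slashes where the prescribed format uses hyphens.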

Components of metadata: values

<meta name="title" lang="eng" content="" />

<meta name="description" lang="eng" content="" />

<meta name="keywords" lang="eng" content="" />

<meta name="eGMS.subject.category" lang="eng" scheme="GCL" content="Tax, Benefits" />

<meta name="DCTERMS.audience" lang="eng" content="all" />

<meta name="DC.creator" lang="eng" content="HM Revenue and Customs" />

<meta name="DC.date.issued" scheme="W3CDTF" content="2006-03-24" />

<meta name="DC.date.modified" scheme="W3CDTF" content="" />

<meta name="eGMS.disposal.review" scheme="W3CDTF" content="2006/04/01" />
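Putting the four components together, one statement can be pictured as a small record. The sketch below is an assumption-laden illustration: the Statement class, its field names and parse_statement are hypothetical, and names are assumed to follow the vocabulary.element.refinement pattern used in the tags above (a bare name such as title is treated as an element with no vocabulary prefix).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    """One metadata statement split into its components (illustrative model)."""
    vocabulary: str            # e.g. "DC" or "eGMS"; "" when the name has no prefix
    element: str               # e.g. "date"
    refinement: Optional[str]  # e.g. "issued"; None when unrefined
    scheme: Optional[str]      # e.g. "W3CDTF"; None for free text
    value: str                 # e.g. "2006-03-24"


def parse_statement(name, content, scheme=None):
    """Split a meta tag's name, scheme and content into a Statement."""
    parts = name.split(".")
    if len(parts) == 1:
        vocabulary, element, refinement = "", parts[0], None
    else:
        vocabulary, element = parts[0], parts[1]
        refinement = ".".join(parts[2:]) or None
    clean_scheme = scheme.strip() if scheme else None
    return Statement(vocabulary, element, refinement, clean_scheme, content)


if __name__ == "__main__":
    print(parse_statement("DC.date.issued", "2006-03-24", "W3CDTF"))
    print(parse_statement("DC.creator", "HM Revenue and Customs"))
    print(parse_statement("title", ""))
```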

3rd principle: appropriate values

• Develop policies to support local requirements

• But keep in mind wider needs

• The metadata can be used by people as well as machines

Summary

• Metadata is structured resource description

• A very abstract name for more concrete activities

• For resource discovery and administration, and technical support

• A building block of the semantic web

• Three principles: one to one, dumb-down and appropriate values

• Statements break down into elements, refinements, encoding schemes and values

The role of the information professional

Not tagging huge numbers of resources for someone else

Part of implementing a system (website, EDRM…)

Part of managing the system

Expert and guardian of standards

Guidance to the people who do the tagging
