MAINTAINING QUALITY METADATA: TOWARD EFFECTIVE DIGITAL RESOURCE LIFECYCLE MANAGEMENT

Daniel Gelaw Alemneh
University of North Texas

ICKM 2008

University of North Texas (UNT) Libraries Digital Initiatives

Collaborative Initiatives

• CyberCemetery
  • GPO
  • NARA – Affiliated Archive
• Texas Register Archive
  • Secretary of State's Office
• Texas Laws and Resolutions Archive
  • Secretary of State's Office
• The Portal to Texas History
  • 45 Libraries & Museums
• Web-at-Risk Project
  • California Digital Library
  • New York University
• National Digital Newspaper Program (NDNP)
  • Newspapers published between 1836 and 1922

University of North Texas (UNT) Libraries Digital Initiatives

Library Digital Collections:

• Congressional Research Service Archive
  • 10,000+ CRS Reports
• World War Poster Collection
  • 500 WWI and WWII Posters
• Advisory Commission on Intergovernmental Relations
  • 408 reports = 47,874 pages
• Federal Communications Commission (FCC) Record
  • 136 issues = 43,115 pages (6 of 21 volumes completed)
• Electronic Theses and Dissertations (ETDs)
  • 3000+ more in queue
• Jean-Baptiste Lully (Music) Collection
  • 27 scores = 10,000 pages
• Other digitization projects
  • http://www.library.unt.edu/libraries-and-collections/digital-collections

Metadata Environment

• Metadata-based digital resource management activities
• UNT Libraries metadata: locally qualified Dublin Core-based descriptive metadata
• Detailed technical and preservation metadata elements
• Web-based metadata creation and editing
• Interoperability
  • Metadata crosswalks
    • MODS
    • MARC
    • oai_dc
    • PREMIS
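
A crosswalk of this kind can be pictured as a lookup table from local (element, qualifier) pairs to elements of the target schema. The following is a minimal Python sketch under stated assumptions: the local element and qualifier names, the `LOCAL_TO_OAI_DC` table, and the `crosswalk_to_oai_dc` function are all hypothetical illustrations, not the actual UNT schema or tooling.

```python
# Hypothetical crosswalk: (local element, qualifier) -> unqualified oai_dc element.
# The local names below are illustrative assumptions, not the UNT schema.
LOCAL_TO_OAI_DC = {
    ("title", "officialtitle"): "dc:title",
    ("creator", "aut"): "dc:creator",
    ("date", "creation"): "dc:date",
    ("subject", "LCSH"): "dc:subject",
}


def crosswalk_to_oai_dc(record):
    """Map a list of (element, qualifier, value) triples to oai_dc fields.

    Qualifiers are dropped in the target; unmapped elements are returned
    separately so that nothing is lost silently.
    """
    mapped, unmapped = {}, []
    for element, qualifier, value in record:
        target = LOCAL_TO_OAI_DC.get((element, qualifier))
        if target is None:
            unmapped.append((element, qualifier, value))
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped


# Invented sample record for illustration only.
record = [
    ("title", "officialtitle", "Sample Report"),
    ("creator", "aut", "Doe, Jane"),
    ("date", "creation", "2008-10-23"),
    ("note", "display", "Scanned from microfilm"),
]
mapped, unmapped = crosswalk_to_oai_dc(record)
```

Returning the unmapped triples alongside the mapped fields makes the information lost in a crosswalk visible, which is itself a metadata quality signal.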

Metadata Quality

• The two aspects of digital library data quality:
  • The quality of the data in the objects themselves
  • The quality of the metadata associated with the objects
• Poor metadata quality leads to:
  • Ambiguities
  • Poor recall
  • Poor precision
  • Inconsistency of search results
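
The effect on retrieval can be illustrated with the standard definitions: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that are retrieved. The Python sketch below uses invented record identifiers and creator strings purely for illustration; `precision_recall` is a hypothetical helper name.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two sets of record identifiers."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: four records describe works by the same creator, but
# one was entered with a variant spelling.
creators = {
    "rec1": "Lully, Jean-Baptiste",
    "rec2": "Lully, Jean-Baptiste",
    "rec3": "Lulli, Jean Baptiste",  # variant spelling, same person
    "rec4": "Lully, Jean-Baptiste",
}
relevant = set(creators)  # all four records are about the same person

# An exact-match search on the preferred form misses the variant.
query = "Lully, Jean-Baptiste"
retrieved = {rid for rid, name in creators.items() if name == query}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this invented case every retrieved record is relevant, but one of the four relevant records is never found, so recall drops to 0.75 because of a single inconsistent value.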

Metadata Quality …

• Most common errors:
  • Incorrect data:
    • Letter transposition
    • Letter omission
    • Letter insertion
    • Letter substitution or misstrokes
  • Missing data:
    • Elements and values not present at all (null)
    • Insufficient or incomplete data
  • Ambiguous data:
    • Confusing or inconsistent data, e.g. multiple spellings, multiple possible meanings, mixed cases, initials, etc.
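
The four kinds of incorrect data listed above each correspond to a single edit between the intended value and the typed value. As a minimal sketch (the function name and the sample strings are assumptions made for illustration), a one-edit difference can be classified like this:

```python
def classify_single_error(correct, typed):
    """Classify a one-edit difference between two strings.

    Returns "match", "transposition", "omission", "insertion",
    "substitution", or "other" (more than one edit apart).
    """
    if correct == typed:
        return "match"
    # Skip the common prefix to find the first differing position.
    i = 0
    while i < min(len(correct), len(typed)) and correct[i] == typed[i]:
        i += 1
    if len(correct) == len(typed):
        # Two adjacent letters swapped, everything after them identical.
        if (i + 1 < len(correct)
                and correct[i] == typed[i + 1]
                and correct[i + 1] == typed[i]
                and correct[i + 2:] == typed[i + 2:]):
            return "transposition"
        # One letter replaced, everything after it identical.
        if correct[i + 1:] == typed[i + 1:]:
            return "substitution"
    elif len(correct) == len(typed) + 1 and correct[i + 1:] == typed[i:]:
        return "omission"  # a letter of the correct form is missing
    elif len(typed) == len(correct) + 1 and correct[i:] == typed[i + 1:]:
        return "insertion"  # an extra letter was typed
    return "other"


for typed in ["Texsa", "Texs", "Texaas", "Tezas", "Dallas"]:
    print(typed, "->", classify_single_error("Texas", typed))
```

A check like this only helps when a trusted form is available for comparison, for example a controlled vocabulary term; it cannot detect missing or ambiguous data.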

Factors Influencing Metadata Quality

• Local Requirements:
  • Object heterogeneity
    • What type of objects will the repository contain?
  • Granularity
    • How will they be described?
  • Functionality
    • What functionality is required?
    • How will it be interfaced?

Factors Influencing Metadata Quality …

• Collaborative Requirements:
  • Diversity of users
    • How can diverse information-seeking behaviors best be met?
  • Interoperability
    • Will metadata be meaningful within aggregations of various kinds?
    • What is required for interoperability? (Structure, semantics, & syntax)
  • Digital rights issues
    • Will access restrictions be imposed?
    • Are requirements formal or informal?
    • Are there other access and associated digital rights issues?

Factors Influencing Metadata Quality…

• Training issues
  • Necessary expertise to create and manage rigorous metadata
  • Metadata quality can be determined to a great extent by:
    • knowledge of the source, and
    • knowledge of the methodology used to create the statement
• Cost
  • Rigorous metadata is resource-intensive and costly

UNT Metadata Quality Assurance Mechanisms & Tools

• The two main stages of metadata quality assurance:
  • Pre-ingest
    • 1. Metadata creation tools (templates)
  • Post-ingest
    • 2. Metadata analysis tools (web-based tools)

Quality Assurance Mechanisms and Tools: Templates

1. Metadata Creation Tools (Templates)

• Validation of mandatory elements
• Metadata Template Creator
• Template Reader
• Controlled vocabularies (UNTLBS)
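
A pre-ingest template check of this kind might look like the following minimal Python sketch. Everything in it is an assumption made for illustration: the `TEMPLATE` structure, the list of mandatory elements, the language vocabulary, and the `validate_record` function are hypothetical and do not reproduce the UNT template format.

```python
# Hypothetical template: mandatory elements and per-element controlled
# vocabularies. These names and terms are illustrative assumptions.
TEMPLATE = {
    "mandatory": ["title", "creator", "date", "language"],
    "vocabularies": {"language": {"eng", "spa", "fre", "ger"}},
}


def validate_record(record, template=TEMPLATE):
    """Return a list of problems found in a {element: [values]} record."""
    problems = []
    # A mandatory element needs at least one non-blank value.
    for element in template["mandatory"]:
        values = [v for v in record.get(element, []) if v and v.strip()]
        if not values:
            problems.append(f"missing mandatory element: {element}")
    # Values of controlled elements must come from the vocabulary.
    for element, allowed in template["vocabularies"].items():
        for value in record.get(element, []):
            if value and value not in allowed:
                problems.append(
                    f"{element}: '{value}' not in controlled vocabulary"
                )
    return problems


# Invented record: blank creator, no date, free-text language value.
record = {"title": ["Sample Report"], "creator": [""], "language": ["english"]}
problems = validate_record(record)
for problem in problems:
    print(problem)
```

Rejecting or flagging such records before ingest is generally cheaper than finding the same problems afterwards through analysis reports.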

UNT Metadata Quality Assurance Mechanisms & Tools…

2. Metadata Analysis Tools

• NULL values
• List/browse all values (by each qualifier and element)
• List authority values
• Graphical reports and other fun stuff
  • Clickable maps by institution and collection
  • Word clouds by element
  • Records added over time and other graphical reports
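
The first two post-ingest reports (null values, and a browse list of all values for an element) can be sketched in a few lines of Python. The record layout, the `ELEMENTS` list, and the function names below are assumptions for illustration; the actual UNT tools are web-based and are not reproduced here.

```python
from collections import Counter

ELEMENTS = ["title", "creator", "date"]  # assumed element set


def null_report(records, elements=ELEMENTS):
    """Count, per element, how many records have no non-blank value."""
    counts = Counter()
    for rec in records:
        for el in elements:
            if not any(v and v.strip() for v in rec.get(el, [])):
                counts[el] += 1
    return dict(counts)


def value_counts(records, element):
    """List every distinct value of one element with its frequency."""
    return Counter(v for rec in records for v in rec.get(element, []) if v)


# Invented sample records for illustration only.
records = [
    {"title": ["Report A"], "creator": ["Doe, Jane"], "date": ["2008"]},
    {"title": ["Report B"], "creator": ["Doe, J."], "date": []},
    {"title": ["Report C"], "creator": ["Doe, Jane"]},
]

print(null_report(records))
print(value_counts(records, "creator").most_common())
```

Browsing the value list is what surfaces inconsistencies such as "Doe, Jane" and "Doe, J." appearing side by side, which a null check alone would never reveal.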

Summary

• Determine level of quality required
  • Partners may have much in common, but they have diverse and sometimes conflicting metadata requirements.
• Determine nature of gap and how to close it
  • Effectiveness, efficiency, practicability, scalability
• Machine versus human error handling
  • How much of the process can be automated?
  • Human review of results is still essential (e.g. highlighted items)
• Compromise
  • One size does not fit all!
• Prioritize
  • Resources are very unlikely to be available to meet all requirements
• Test the workflow
  • Test, retest, and evaluate the quality cycle continuously
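
One way to picture the split between machine and human error handling is a script that only highlights suspicious values and leaves the decision to a person. The sketch below groups values whose case, punctuation, and spacing differ but whose text is otherwise the same. The normalization rule, the function names, and the sample values are assumptions for illustration, not a description of the UNT workflow.

```python
import re
from collections import defaultdict


def normalization_key(value):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", value.casefold())
    return " ".join(cleaned.split())


def flag_for_review(values):
    """Return groups of distinct spellings that share a normalized key."""
    groups = defaultdict(set)
    for value in values:
        groups[normalization_key(value)].add(value)
    # Only groups with more than one spelling need a human decision.
    return {k: sorted(vs) for k, vs in groups.items() if len(vs) > 1}


# Invented sample values for illustration only.
values = [
    "Texas. Secretary of State",
    "TEXAS SECRETARY OF STATE",
    "Texas, Secretary of State",
    "Texas Register",
]
flagged = flag_for_review(values)
for key, variants in flagged.items():
    print(key, "->", variants)
```

The script does not choose a preferred form; it narrows the list a reviewer has to look at, which is where automation tends to pay off most safely.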

Questions?

Daniel.alemneh@unt.edu

Thank You!
