A presentation given by Susan McElrath, University Archivist at American University, to my History in the Digital Age course.
SUSAN MCELRATH
UNIVERSITY ARCHIVIST
Digital Projects in Special Collections
AMERICAN UNIVERSITY
MARCH 7, 2012
Digital Collections, Exhibits, and Repositories
• What is the difference?
• Repository
  • multiple collections or institutions
• Collection
  • one collection or theme
• Exhibit
  • one theme – a selection of items
Multi-Institutional Digital Repository
Institutional Digital Repository
Thematic Digital Collection
Digital Exhibit
Digital Exhibit on 1906 San Francisco Fire
Alternate approach to same topic
Digitization Project Planning
• What work needs to be done;
• How it will be done (according to which standards, specifications, best practices);
• Who should do the work (and where);
• How long the work will take;
• How much it will cost, both to "resource" the infrastructure and to do the content conversion.
• http://www.ncecho.org/dig/guide_1planning.shtml
• http://www.nyu.edu/its/humanities/ninchguide/II/
Components of Digitization Projects
• Planning and Project Management
• Selection
• File Formats – master & access derivatives
• Conservation Treatment
• Reformatting
• Metadata Design & Creation
• Quality Control
• Web Platform
  • Open source vs. proprietary systems
• Preservation
Selection Criteria
• Should they be digitized?
  • Research Value
• May they be digitized?
  • Copyright status
• Can they be digitized?
  • Condition
  • Format
• http://www.nedcc.org/resources/leaflets/6Reformatting/06PreservationAndSelection.php
• http://www.dlib.org/dlib/september09/ooghe/09ooghe.html
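The three questions above can be read as successive gates. The sketch below is one possible way to record them for a candidate item; the `Candidate` fields and the wording of the failure reasons are illustrative assumptions, not part of the presentation or the linked guides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A hypothetical record for one item under review."""
    title: str
    research_value: bool    # Should it be digitized?
    rights_cleared: bool    # May it be digitized?
    stable_condition: bool  # Can it be digitized? (condition)
    supported_format: bool  # Can it be digitized? (format)


def screen(item: Candidate) -> List[str]:
    """Return the reasons an item fails selection; an empty list means it passes."""
    reasons = []
    if not item.research_value:
        reasons.append("should: low research value")
    if not item.rights_cleared:
        reasons.append("may: copyright status unresolved")
    if not (item.stable_condition and item.supported_format):
        reasons.append("can: condition or format prevents capture")
    return reasons
```

Returning the list of reasons, rather than a single yes or no, keeps a record of why an item was deferred, which may be useful if its copyright status or condition changes later.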
Digitization Standards
• Technical Standards
  • Federal Agency Digitization Guidelines Initiative (FADGI)
    • http://www.digitizationguidelines.gov/
  • NARA
  • California Digital Library (CDL)
    • http://www.cdlib.org/services/dsc/tools/docs/cdl_gdi_v2.pdf
  • University of Colorado
    • https://www.cu.edu/digitallibrary/cudldigitizationbp.pdf
Metadata Requirements
• Metadata Requirements
  • Descriptive Metadata
  • Technical & Administrative Metadata
• Element Sets and Standards
  • Dublin Core
    • http://dublincore.org/documents/dces/
  • METS/MODS
    • http://www.loc.gov/standards/mods/
    • http://www.loc.gov/standards/mets/
  • VRA Core
    • http://www.loc.gov/standards/vracore/
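To make descriptive metadata concrete, here is a minimal sketch of a simple Dublin Core record serialized as XML with Python's standard library. The element names and namespace come from the Dublin Core Metadata Element Set; the `record` wrapper element and all of the field values are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialize a mapping of Dublin Core element names to values as XML."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical item: every value below is made up for the example.
record = dublin_core_record({
    "title": "Campus photograph (hypothetical item)",
    "creator": "Unknown",
    "date": "1925",
    "type": "StillImage",
    "format": "image/tiff",
    "rights": "Copyright status undetermined",
})
print(record)
```

A production system would more likely store such a record inside a METS wrapper or a platform's own schema; the point here is only that each descriptive field maps to one named element.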
Web Platform Options
• Open Source Software
  • Omeka
  • Greenstone
  • DSpace
  • Fedora
• Proprietary Software
  • CONTENTdm (OCLC)
  • Luna Insight
  • DigiTool
Web Harvesting involves:
• Identifying and collecting web resources
• Providing search capability for archived web collections
• Managing and preserving web resources
Web Harvesting
• The most common web archiving technique uses web crawlers to automate the process of collecting web pages. Web crawlers typically view web pages in the same manner that users with a browser see the Web, and therefore provide a comparatively simple method of remotely harvesting web content.
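The core step a crawler repeats is reading a page and collecting the links it should visit next. The sketch below shows that step only, using Python's standard library on an HTML string rather than a live site; the hostnames, the `LinkCollector` class, and the same-host scoping rule are assumptions made for the example, not a description of any particular archiving tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, href))


def in_scope_links(html, base_url):
    """Return only the links that stay on the same host as the page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return [url for url in collector.links if urlparse(url).netloc == host]


# Hypothetical page content and addresses.
page = '<a href="/about">About</a> <a href="http://other.example/x">Other</a>'
links = in_scope_links(page, "http://archive.example/index.html")
```

A full harvester would also fetch each page, store it with capture metadata, and queue the in-scope links for the next round.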
Web Crawling Problems
• Robots exclusion protocol may deny crawlers access to portions of a website.
• Large portions of a website may be hidden in the deep Web.
• Crawler traps may cause a crawler to download an infinite number of pages, so crawlers are usually configured to limit the number of dynamic pages they crawl.
• Calendars often cause problems for crawlers.
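Two of the problems above can be sketched in a few lines: honoring robots exclusion rules and capping dynamic pages so a calendar's endless "next month" links do not trap the crawler. The example uses Python's `urllib.robotparser`; the robots rules, the URLs, the agent name, the cap of three, and the choice to treat any URL with a query string as dynamic are all assumptions for illustration.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, parsed from text instead of fetched.
robots = RobotFileParser()
robots.parse([
    "User-agent: *",
    "Disallow: /private/",
])

MAX_DYNAMIC_PAGES = 3  # hypothetical cap on query-string URLs


def plan_crawl(urls, agent="example-archiver"):
    """Drop URLs that robots rules exclude or that exceed the dynamic-page cap."""
    allowed = []
    dynamic_seen = 0
    for url in urls:
        if not robots.can_fetch(agent, url):
            continue  # excluded by the robots exclusion protocol
        if urlparse(url).query:
            # Treat URLs with a query string as dynamic pages.
            dynamic_seen += 1
            if dynamic_seen > MAX_DYNAMIC_PAGES:
                continue  # likely a crawler trap, such as a calendar
        allowed.append(url)
    return allowed


candidates = [
    "http://site.example/",
    "http://site.example/private/report.html",
    "http://site.example/calendar?month=1",
    "http://site.example/calendar?month=2",
    "http://site.example/calendar?month=3",
    "http://site.example/calendar?month=4",
]
plan = plan_crawl(candidates)
```

A fixed cap is a blunt instrument: it stops the trap but may also cut off legitimate dynamic content, which is one reason harvested sites need quality review.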
Web Harvesting Resources
• International Internet Preservation Consortium
  • http://netpreserve.org/about/index.php
• Library of Congress
  • http://www.loc.gov/webarchiving
• Archive-It (Service)
  • www.archive-it.org
American University Digital Collections