Local History and Historic Preservation Conference
Building Digital Collections
Part 2: Managing and sharing
Supported by WHRAB
TODAY’S AGENDA
• Introductions
  • Tell us about yourself
• Creating an inventory
  • Starting your inventory
  • Selecting content to preserve
• Managing your collections
  • Organizing collections
  • Management tools
• Storage options
• Access considerations
  • Why provide access?
  • Software options
• Promoting your collections
• Wrap-up and final thoughts
Waterford Public Library/University of Wisconsin Digital Collections
Introductions
• We are…
  • Sarah Grimm, Electronic Records Archivist, Wisconsin Historical Society
  • Emily Pfotenhauer, Recollection Wisconsin Program Manager, WiLS
• You are…
  • What organization do you represent?
  • What digital projects are you currently working on or thinking about?
Eager Free Public Library/University of Wisconsin Digital Collections
LOC and DPOE
The Library of Congress started the Digital Preservation Outreach and Education (DPOE) program in order to foster national outreach and education to encourage individuals and organizations to actively preserve their digital content.
http://www.digitalpreservation.gov/education/
Digital Preservation
Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.
Source: Working group on Defining Digital Preservation, ALA Annual Conference, 6/24/2007
What is Digital Content?
• Digital content is any content that is published or distributed in a digital form, including text, data, sound recordings, photographs and images, motion pictures, and software.
  • Digital materials created from analogue sources
  • Born-digital content
• Digital materials you currently have or create – or expect to have – that you want to preserve.
Well-managed Collections
• Sample characteristics of well-managed collections:
  • Basic information about each collection
  • Minimal metadata for objects (you define)
  • Common file formats
  • Controlled and known storage of content
  • Multiple copies in at least 2 locations
INVENTORY
Philharmonic Chorus Members. Image ID: WHi-92113
Why Do We Identify Content?
• Not all digital content can or should be preserved
• Good digital preservation requires an explicit commitment of resources, which - for most organizations - means planning ahead
• An explicit inventory is the best way to identify content
Food for the Boys in France. Image ID: WHi-35438
First Steps
• Identifying content is a first step to planning for current and future preservation needs
• Ask: what content do I have, will I have, might I have, must I have?
An inventory is the best way to identify what content you have now – and raise awareness in your institution.
Goals
• Identify potential digital content you may need to preserve
• Treat the inventory as a management tool that grows as your preservation program grows
• Use it as a planning tool – e.g., to prepare staff, training, annual growth
• Use as a basis for acquiring content, defining submission agreements, plans
Does your institution have an inventory of your digital content?
Inventory Considerations
• Inventory content is more important than the technology
• Inventory results should be:
  • Documented: an inventory should actually exist
  • Usable: use a simple format to sort, list, etc.
  • Available: accessible to others
  • Scalable: should be able to add content/fields over time
  • Current: update periodically and date it
Inventory Tips
• Don’t let implementing the software become the focus
• Use software you know and have available
• Test it out with a number of people and collections
• Stick with a single format; don't change once you've decided on it
• Be consistent, comprehensive, and concise
How Much Detail to Include
• Inventories can range from general to detailed
• Determine the appropriate level of detail for you
• Factors in determining level of detail:
  • Extent of content to be inventoried
  • Nature & location of content
  • Resources available to complete inventory
  • Timeframe & deadlines for completion
What Do You Have?
• Identify collections of digital materials. (Don’t work at the item level.)
• Provide a brief title and description
• Estimated growth over time
Who is involved?
• Who is currently managing the collection/digital content?
• Who knows the most about it?
• Creator (internal or external) – who created the digital content?
Digital management, collections management, creator: THESE MAY BE DIFFERENT PEOPLE
What does it consist of?
• Medium (6 CDs, 1 hard drive, 115 floppy disks)
• Extent = Format + Amount (600 .pdfs, 30 .doc)
• File Size (MB, GB, TB)
http://www.csgnetwork.com/memconv.html
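The extent and file size of a collection can also be gathered by script rather than by hand. The following is a minimal sketch, assuming the collection sits in a single folder; the function names `summarize_extent` and `human_size` are illustrative and not part of any tool mentioned in these slides.

```python
import os
from collections import Counter


def summarize_extent(root):
    """Count files per extension and add up their sizes for everything under root."""
    counts = Counter()
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Lower-case the extension so ".PDF" and ".pdf" are counted together.
            extension = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[extension] += 1
            total_bytes += os.path.getsize(path)
    return counts, total_bytes


def human_size(num_bytes):
    """Format a byte count using binary units (1 KB = 1024 bytes in this sketch)."""
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"
```

Calling `summarize_extent` on a folder returns the per-format counts (the "Extent") and a byte total that `human_size` turns into an MB/GB/TB figure for the inventory.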
Date Considerations
Inventories should note:
• Date of inventory and updates to it
• Dates associated with the content (1860-1865)
• Date of files – created or modified (2009)
• Date received – if relevant/possible (2011)
Shawano Probate Cases, 1860-1865
Digitized by USG in 2009
Received by WHS in 2011
Content Location
Locations of content are important:
• List primary locations (network drive location, hard drive on Bob’s shelf)
• List locations of all backups/copies (CDs in the storage room, weekly backup tapes)
Remember to update the recorded locations as content moves
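The inventory fields discussed so far (what, who, extent, dates, locations) fit in a simple spreadsheet-style file. Below is a minimal sketch that writes one inventory row per collection as CSV. The column names are assumptions chosen for illustration; the title and dates in the example come from the Shawano example, while the medium, extent, locations and inventory date are hypothetical placeholders.

```python
import csv
import io

# Column names are illustrative; adapt them to your own inventory.
INVENTORY_FIELDS = [
    "title", "description", "managed_by", "created_by",
    "medium", "extent", "total_size",
    "content_dates", "file_dates", "date_received",
    "primary_location", "backup_locations", "inventory_date",
]


def inventory_to_csv(collections):
    """Return CSV text with one row per collection; missing fields are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=INVENTORY_FIELDS)
    writer.writeheader()
    writer.writerows(collections)
    return buffer.getvalue()


example = {
    "title": "Shawano Probate Cases",
    "content_dates": "1860-1865",
    "file_dates": "2009",
    "date_received": "2011",
    # Hypothetical values, for illustration only:
    "medium": "1 hard drive",
    "extent": "600 .pdf",
    "primary_location": "Network drive",
    "backup_locations": "CDs in the storage room; weekly backup tapes",
    "inventory_date": "2013-01-15",
}
```

`inventory_to_csv([example])` produces text that opens directly in a spreadsheet program, which keeps the inventory sortable and easy to extend with new columns.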
Selection Process
Log jam on the St. Croix River, 1886. Wisconsin Historical Society WHi-2364
Why select content to preserve?
• Cost: storage may be cheap, management is not…especially over time
• Discovery and dissemination services: scale, scope, performance, sustainability
• Quality of content may be variable
• Content meets organization’s mission
Selection Criteria
Ask yourself which materials are…
• most significant to your organization?
• most unique?
• highest value?
• most extensive?
• most requested/used?
• easiest?
• oldest?
• newest?
• at risk?
Neville Public Museum of Brown County
Show Stoppers
Stop if or when the answer is NO
• Content
  • Does the content have long term value?
  • Does it fit your scope and mission?
• Technical
  • Is it feasible for you to preserve the content?
• Access
  • Is it possible to make the content available?
  • Are you the only holder of this content?
Add to your inventory
Supplement your inventory with more detailed information about the material you plan to preserve over the long term.
• Access
  • How will the public access the content?
  • Is access restricted? How? For how long?
• Rights
  • Who owns the rights to preserve and disseminate?
• Use
  • What’s the lifespan of the content?
  • Will its value/use change over time?
Add to your inventory
• Data criticality
  • Is it only in digital form?
  • Do we hold the only copy?
• Business/mission criticality
  • If we lose it, what’s the damage to our reputation?
  • How will it impact our function or services?
Charlie Chaplin and Jackie Coogan in The Kid. Image ID: WHi-68423
Selection Exercise
Postal workers sorting mail, 1955. Wisconsin Historical Society WHi-36392
Next Steps
Memorial Union Steps
Analyze the Results
When the inventory is complete, ask yourselves what digital content
• do we have that we didn’t know about?
• should we be keeping that we aren’t now?
• will we create or likely acquire in the future?
• are we required to keep?
• do we need to review?
"Deering Ideal" Stripper Harvester Catalog Cover. Image ID: WHi-27577
ORGANIZE YOUR FILES
• Centralize your files
• Minimize your layers
• Leave breadcrumbs (AKA “READ ME”)
• Determine what you don’t know
IH General Office Mail Room. Image ID: WHi-12016
WHAT NOT TO KEEP?
• Backups/copies/drafts
• Supplementary files that provide no additional long-term value
• Corrupted files
• Same item – different file formats
• Items that don’t fit your organization’s purpose
Boy on Curb near Trash Pile. Image ID: WHi-57208
Goals/Outcomes
• Expanded inventory of content to preserve …and what you can delete (gray areas identified)
• Well-defined and documented selection criteria, policies and procedures
• Better understanding of content for future planning and growth
Greater knowledge = greater control!
Tools
Guitar Maker's Shop. Image ID: WHi-27234
Remove Empty Directories
The application searches and deletes empty directories recursively below a given start folder and shows the result in a well arranged tree
http://sourceforge.net/projects/rem-empty-dir/files/latest/download?source=files
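For readers who prefer a script, the same idea can be sketched in a few lines. This is not how the Remove Empty Directories application itself works internally; it is a minimal illustration, and it only reports empty folders rather than deleting them, so the list can be reviewed first. The function name `find_empty_dirs` is an assumption.

```python
import os


def find_empty_dirs(start):
    """List folders below start that hold no files, directly or in any subfolder."""
    empty = set()
    # topdown=False visits the deepest folders first, so each parent can check
    # whether all of its subfolders were already found to be empty.
    for dirpath, dirnames, filenames in os.walk(start, topdown=False):
        has_content = bool(filenames) or any(
            os.path.join(dirpath, name) not in empty for name in dirnames
        )
        # The start folder itself is never reported.
        if not has_content and dirpath != start:
            empty.add(dirpath)
    return sorted(empty)
```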
Remove Duplicate Files
• Auslogics Duplicate File Finder http://www.auslogics.com/en/software/duplicate-file-finder/
• Similar Images http://similarimages.en.softonic.com/
• VisiPics http://www.visipics.info/index.php?title=Main_Page
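Exact duplicates can also be found by comparing file contents. The sketch below groups files of equal size and then compares a SHA-256 digest of each; it is an illustration only, not the method used by any of the tools listed above, and unlike Similar Images or VisiPics it will not catch images that merely look alike. The names `file_digest` and `find_duplicates` are assumptions.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups (lists of paths) of files under root with identical content."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have an exact duplicate
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Review each group before deleting anything; which copy counts as the master is a decision for a person, not the script.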
Auslogics Duplicate File Finder
Select Search Criteria
Select More Search Criteria
Select Delete Criteria
Image Viewer
IrfanView http://www.irfanview.com/
• Tool with many different capabilities for image manipulation/editing
• For photos, we can easily view an entire folder’s worth of images at one time
Checksums
• Checksums (AKA “hash sums”) are created by programs that run an algorithm against the contents of a file. (There are many free utilities that will perform this function for you.)
• The resulting checksum is a short sequence of letters and/or numbers that, for practical purposes, uniquely identifies that file. (Think “electronic fingerprint.”)
Unix cksum utility
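As an illustration of what such a utility does, here is a minimal sketch using `hashlib` from Python's standard library. The function name `checksum` is an assumption; note that the Unix `cksum` utility uses a CRC algorithm, so its output will differ from the MD5 or SHA-1 values produced here.

```python
import hashlib


def checksum(path, algorithm="md5", chunk_size=65536):
    """Return the hexadecimal checksum of a file's contents."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so that large master files do not need to fit in memory.
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Passing `algorithm="sha1"` switches to SHA-1; an MD5 checksum is 32 hexadecimal characters long and a SHA-1 checksum is 40.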
Why is this a good thing?
• Checksums help maintain the INTEGRITY of your collections because they will tell you when things change over time.
• If two files are exactly the same, the checksums of those files will also be exactly the same (generally speaking)
• If a file becomes corrupted, degraded or is changed in some way, the next time you run the utility on it, the checksum will change
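The behavior described in these points can be seen with a few bytes of data. This is a small sketch using SHA-1 from Python's `hashlib`; the sample text is only a stand-in for file contents, and the name `fingerprint` is illustrative.

```python
import hashlib


def fingerprint(data):
    """Return the SHA-1 checksum of a bytes value as hexadecimal text."""
    return hashlib.sha1(data).hexdigest()


original = b"Shawano Probate Cases, 1860-1865"
exact_copy = bytes(original)                      # same content, separate copy
altered = original.replace(b"1865", b"1866")      # a single character differs

same = fingerprint(original) == fingerprint(exact_copy)   # identical content
differs = fingerprint(original) != fingerprint(altered)   # changed content
```

Here `same` and `differs` are both true: identical content gives identical checksums, and even a one-character change gives a different one.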
MD5summer
• MD5summer http://www.md5summer.org/download.html
• This tool gives you a choice of hashing algorithm: MD5 or SHA-1
• Other tools will give you other options……
How Does it work?
• Open MD5summer
• Select your root folder
• Select “Create Sums”
Create List of files to sum
• Select the files to be added
• Click “Add” or “Add recursively”
• Click “OK”
MD5 sums will start generating
Save the File
Verify Hash Values
• Copy files to another directory (think “backup”)
• Open MD5summer
• Select the files in the new location
• Click “Verify Sums”
Open the Md5sum file
• Find your MD5 file• Click “Open”
MD5sums will be compared
YEAH!
IF THE FILES ARE DIFFERENT……
Uh-Oh!
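The create-then-verify workflow shown in the MD5summer screens can be sketched in script form. This is an illustration of the idea only: it does not read or write MD5summer's own file format, and the names `create_sums` and `verify_sums` are assumptions that simply echo the button labels.

```python
import hashlib
import os


def md5_of(path, chunk_size=65536):
    """Return the MD5 checksum of one file."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_sums(root):
    """Map each file's path (relative to root) to its MD5 checksum."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sums[os.path.relpath(path, root)] = md5_of(path)
    return sums


def verify_sums(root, sums):
    """Compare the files under root with previously recorded checksums."""
    changed, missing = [], []
    for relative_path, expected in sorted(sums.items()):
        path = os.path.join(root, relative_path)
        if not os.path.isfile(path):
            missing.append(relative_path)
        elif md5_of(path) != expected:
            changed.append(relative_path)
    return {"changed": changed, "missing": missing}
```

Run `create_sums` on the master folder, then `verify_sums` on the backup copy with the recorded values. Two empty lists correspond to the "YEAH!" case; anything listed under `changed` or `missing` is the "Uh-Oh!" case. Because this sketch looks files up by path, a renamed file shows up as missing even though its checksum is unchanged.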
Things to remember
Things that will NOT affect checksums
• Moving items from one place to another
• Changing the file name
Run on the master files when a collection is completed
Set up a schedule to run “verify checks” periodically
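The point that moving or renaming a file leaves its checksum untouched can be checked directly. The following is a small sketch; the file and folder names are hypothetical, and the work happens in a temporary folder so nothing on disk is affected.

```python
import hashlib
import os
import shutil
import tempfile


def sha1_of(path):
    """Return the SHA-1 checksum of a small file."""
    with open(path, "rb") as handle:
        return hashlib.sha1(handle.read()).hexdigest()


def checksum_survives_rename():
    """Create a file, move and rename it, and report whether the checksum matched."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "scan_001.tif")
        with open(original, "wb") as handle:
            handle.write(b"example image bytes")
        before = sha1_of(original)

        # Move the file into a new folder under a new name.
        os.mkdir(os.path.join(workdir, "masters"))
        moved = os.path.join(workdir, "masters", "probate_case_001.tif")
        shutil.move(original, moved)

        return before == sha1_of(moved)
```

The checksum depends only on the bytes inside the file, which is why `checksum_survives_rename()` returns `True`.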
St. Mary of the Lake Parish School First Day. Image ID: WHi-98433
STORAGE
Key Decision Points
• How are you going to organize it?
• What are you going to store it on?
• Where are you going to store it?
• How many copies do you need?
Post Office. Image ID: WHi-9135
Factors to consider
• Immediate costs
  • Quantity (size and number of files)
  • Number of copies
  • Media (life span, availability, $$)
• Other resources
  • Expertise (skills required to manage)
  • Services (local vs. hosted)
  • Partners (achieving geographic distribution)
• Institutional constraints
How Many and Where?
• Multiple
  • Minimum: two (2) copies in two locations
  • Optimum: six (6) copies
• Geographically distributed
  • If possible, don’t keep all of your copies onsite
Local storage options
• Local network
• RAID device
• External hard drive
• Archival quality (gold) CDs or DVDs
Take into account potential future storage needs.
Villa Terrace Decorative Arts Museum
Cloud storage options
Commercial options:
• Google Drive
  • Up to 5GB free (approx. 140 high-resolution TIFF files)
  • 25GB = $2.50/month
• Amazon Simple Storage Service (S3)
  • $.095 per GB/month
Institutional options:
• DuraCloud
*Public Records Board Guidance on the Use of Contractors for Records Management Services
*Use of Contractors for Records Management Services
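The figures on this slide are enough for a rough storage estimate. The sketch below derives an average TIFF size from the "5GB is about 140 files" figure and applies the per-GB price quoted for S3. These numbers are taken from the slide and should be treated as examples rather than current prices; the function names are illustrative.

```python
# Figures quoted on the slide; check current pricing before budgeting.
FREE_TIER_GB = 5
TIFFS_IN_FREE_TIER = 140
S3_PRICE_PER_GB_MONTH = 0.095


def average_tiff_mb():
    """Average size in MB implied by '5GB is about 140 high-resolution TIFFs'."""
    return FREE_TIER_GB * 1024 / TIFFS_IN_FREE_TIER


def monthly_storage_cost(file_count, mb_per_file, price_per_gb_month, copies=1):
    """Estimate the monthly cost of storing file_count files, times the number of copies."""
    total_gb = file_count * mb_per_file / 1024
    return total_gb * price_per_gb_month * copies
```

The implied average is about 36.6 MB per TIFF. For a hypothetical collection of 2,000 scans kept as two copies, `monthly_storage_cost(2000, average_tiff_mb(), S3_PRICE_PER_GB_MONTH, copies=2)` comes to roughly $13.57 per month at the quoted rate.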
Access Considerations
Historical Society library stacks, 1896. Wisconsin Historical Society WHi-23281
Why are you providing access to content?
• User demand
• Institutional visibility
• Legal mandates or grant requirements
• Generate revenue
• Contribute to our collective knowledge
South Wood County Historical Museum
What makes a good online collection?
• Publicly accessible.
• Searchable - Includes keywords and other descriptive information (metadata) so users can find what they’re looking for.
• Organized and consistent.
• Based on existing international/national/statewide standards and best practices.
• Uses software that is sustainable (will be around for a long time) and interoperable (can be migrated or shared).
• Respects intellectual property rights.
What are we aiming for?
Content should be delivered to users over time:
• Easily – using current and known technologies
• Coherently – well-documented and presented
• Completely – intact and well-formed
• Correctly – accurately representing content
• Reliably – using well-managed technologies
• Consistently – in accordance with policies
• Fairly – with equity and precedent
Some software options
• CONTENTdm
• ResCarta Web
• PastPerfect Online
• Omeka
Beloit College
CONTENTdm
• Hosted by Milwaukee Public Library through Recollection Wisconsin
• Produced and distributed by OCLC
• Costs:
  • $200 one-time setup fee
  • Annual hosting fees starting at $75
http://content.mpl.org/ashland
ResCarta Web
• Free and open source
• Host it yourself; or hosting available through Northern Micrographics (fee-based)
• ResCarta Foundation – based in La Crosse
http://www.ecpubliclibrary.info/research/general/history.html
PastPerfect Online
• PastPerfect add-on
• Requires PastPerfect MultiMedia Upgrade
• Hosted by PastPerfect
• Costs:
  • $285 set-up
  • $440 annually (price breaks for AASLH members)
http://oshkosh.pastperfect-online.com
Omeka
• Free and open source
• Host it yourself; or subscribe to hosted version, omeka.net
• Developed by the Center for History and New Media, George Mason University
http://uwoshkosh.omeka.net
Promotion
Wisconsin Tourism Sign, Rhinelander, 1930-1942. Wisconsin Historical Society WHi-37927
Potential audiences
• Local residents
• Students and teachers
• Genealogists
• Specialists (e.g. Civil War re-enactors, railroad buffs)
• Academic researchers
• Curious Wisconsinites
• Everyone!
College of Menominee Nation
Stakeholders and partners
• Board
• Staff and/or volunteers
• Local experts
• Community members
• Chamber of Commerce
• Local government
• Students
• Other organizations in your community/county/region
• Who else?
McMillan Memorial Library, Wisconsin Rapids
Encouraging use of your collections
• Organizations are moving away from the “if you build it, they will come” approach – Google is not enough
• Participatory archives concept: shared authority, community engagement
• Bring your content to your audience: find them where they already are
• Let them look behind the curtain and see projects in progress, warts and all
Milwaukee Public Library
PROMOTION – BRAINSTORMING
• What are some ways you’ve had success promoting your digital collections?
• What are cool ideas you’ve seen that you’d like to try?
Marketing ideas
• Add introduction/background information on your own website
  • http://www.newberlinhistoricalsociety.org
• Highlight an item of the day/week/month
  • https://www.facebook.com/lacrosse.history
• Host an opening event
  • Whitefish Bay Public Library
  • College of Menominee Nation
• Host a slide show or exhibition
  • South Wood County Historical Museum
  • Mineral Point Historical Society
Rock County Historical Society
Marketing ideas
• Send someone with a laptop to popular local spots/events to demonstrate digital collections
  • Ask, “Where do people go first to look for this kind of information?” and then, market there!
• Upload a few digitized images to Flickr with descriptions that point back to your related digital and physical collections.
• Contribute to relevant pages on Wikipedia and include references pointing to specific digital materials.
• Request that the Chamber of Commerce and other relevant local organizations link to the new digital collections from their websites.
• Send a press release to local media
EVALUATING IMPACT
Understanding current users…
• Online survey instrument
• Web analytics
• Email subscriber lists
• Visitor forms
Understanding future users…
• Special interest groups (AASLH, SAA, etc.)
• Listservs
• Workshops and conference sessions
WRAPPING UP – FINAL THOUGHTS
Commencement, 1978. UW-Madison Archives
Next steps/To do list
• Create and maintain an inventory
• Develop your selection criteria
• Play with the tools
• Develop a storage management policy
  • E.g., number of copies, locations
• Monitor copies of content for errors/changes
• Evaluate technology to determine your preferred access platform
• Develop a marketing plan
• Determine how you will evaluate the success of your marketing plan
Thank you!
• Sarah Grimm, Wisconsin Historical Society
• Emily Pfotenhauer, WiLS
• Slides and handouts available at http://recollectionwisconsin.org/localhistory2013
South Wood County Historical Museum