In Pursuit of Common Digital Curation Guidelines:
An Exploration of Current Practices at Michigan State
Lisa Schmidt, Electronic Records Archivist
Michigan State University Archives & Historical Collections
March 29, 2010

Overview
• Michigan State University and the MSU Archives & Historical Collections
• Archives 2.0: Policymakers vs. Custodians
• MSU Archives Electronic Records Initiatives
• Digital Curation Planning (DCP) Project

Michigan State University
• Established in 1855 by act of the Michigan Legislature to create an agricultural college
• Nation’s pioneer land grant college
• Tier one research university with significant national and global impact
• Leader in innovation and technology
• 46,648 students: 36,337 undergrad, 10,311 graduate/professional

MSU Archives & Historical Collections
• Official repository for the historical archives of Michigan State University
• Established by Board of Trustees mandate in 1969
– Collect and preserve historical records of MSU
– Provide university community, scholars, and general public with access to records
– Approve final disposition and destruction
• 33,000 cubic feet of university records

MSU Archives & Historical Collections
• Subjects:
– Administration
– Athletics
– Campus buildings and grounds
– Student groups and activities
– Faculty papers and research
• 700+ historical collections related to Michigan and the Great Lakes region

MSU Archives & Historical Collections
• Actively assists MSU units in efficient administration and management of official university records
• Includes management, collection, and preservation of electronic records

Archives 2.0
“The institutional archive needs to assume more of a policy role, identifying records throughout the campus and working to ensure that digital records are both maintained by their creators and kept ready for research use.”
Richard Cox, “The Academic Archives of the Future,” EDUCAUSE Review Magazine, Volume 43

Electronic Records Initiatives
• Document management system
– Exploring both enterprise DMS and guidelines for local level DMSs
• Spartan Archive
– NHPRC-funded project to develop a governance structure and technical infrastructure to accession, provide access to, and preserve electronic records

Electronic Records Initiatives
• Enterprise Business Systems Project (EBSP)
– Multi-year initiative to create streamlined business processes and interconnected administrative systems for MSU’s finance, human resources, and research administration
• Digital Curation Planning Project

Digital Curation Planning Project
• The Problem
• Digital Curation Internship
• Original Digital Preservation Plan Proposal
• New, Current Digital Curation Plan

The Problem
• Michigan State’s growing body of digital assets and information
– Institutional records
– Faculty and student research
– Theses and dissertations
– University publications
– Multimedia collections
– Digital surrogates of cultural material
– Learning objects and course materials

The Problem
• Valuable digital resources created through much time, effort, grant funding, human capital, and research
• Changing technology likely to render digital assets inaccessible absent a long-term management and preservation plan
• Storage limitations

The Problem
• Some campus units have created their own digital repositories
• But no comprehensive, campus-wide digital preservation strategy or guidelines
• No institutional repository

Digital Curation Internship
• Winter 2009
• Intern from University of Michigan School of Information
• Assessed problem space in relation to digital multimedia collections
• Interviewed 7 units

Digital Curation Internship
• Recommendations
– More comprehensive survey needed
– Guidance on best practices in selection, formats, naming conventions, metadata
– Better long-term storage options
– Institutional repository

The Solution: Original DP Proposal
• Digital preservation plan rooted in best practices to provide trustworthy stewardship of digital assets and intellectual property
• Collaboration of MSU Libraries, University Archives, and MATRIX
• Top level buy-in: VP of Libraries, Computing and Technology funding digital preservation analyst position

The Solution: Original DP Proposal
• Engaging digital preservation analyst for one year
• Planning team
– Representatives from other units
– Monthly meetings
– Buy-in and reality check beyond Archives, Libraries, and MATRIX

Original Digital Preservation Plan
• Conduct an environmental scan of the university’s digital assets
• Survey of MSU’s existing digital repositories and technical infrastructure
• Identify best preservation, management, and access practices already on campus

Original Digital Preservation Plan
• Develop policies, procedures, and workflow to standardize MSU’s approach to digital asset management and preservation
• Explore potential collaborations with other institutions and consortia, such as HathiTrust, LOCKSS, CIC

Overly Ambitious!
• Would eventually reach saturation point with broad, all-encompassing inventory
• Impossible to complete in one-year time frame
• Concern over perception of creation of one-size-fits-all data repository, loss of control of digital assets at unit level

New Digital Curation Plan
• Campus-wide, self-selecting survey using web-based questionnaire
• In-depth interviews with select units
• Inventory and appraise digital assets of select units
• Evaluate technical infrastructures, storage needs, metadata schemes, and naming conventions

New Digital Curation Plan
“Stop disciplining data and start herding it.”
Steve Bailey, Managing the Crowd

Anticipated Outcomes
• Guidelines for electronic records appraisal, preferred file formats, metadata, and file naming conventions
• Layered storage solution and file transfer methodologies
• Foundation for the establishment of an institutional repository or institution-wide federation of digital repositories

Storage
• Central IT supports administrative business systems, e-mail, academic support functions
– Pro: More efficient management of electronic records and digital assets
• Tradition of local IT staff managing unit systems
• Poor economy merits closer look at central vs. local IT

Storage
• Central IT developing virtual server environments for local units
• Layered storage, a variety of storage types or levels to meet diverse needs
– Local storage for files of temporary, short-term use
– Permanent long-term storage environment, possibly under custodianship of the Archives

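The two storage layers named on the Storage slide can be sketched as a simple routing rule. This is a minimal illustration only: the `DigitalFile` record, its `archival_value` flag, and the file names are hypothetical, and the sketch assumes an appraisal decision has already been made for each file.

```python
from dataclasses import dataclass


@dataclass
class DigitalFile:
    name: str
    archival_value: bool  # hypothetical flag: appraised as a permanent record?


def storage_tier(f: DigitalFile) -> str:
    """Route a file to one of the two layers named on the slide."""
    if f.archival_value:
        return "permanent long-term storage"
    return "local storage"


# Hypothetical files, for illustration only.
files = [
    DigitalFile("board_minutes_2009.pdf", archival_value=True),
    DigitalFile("draft_flyer_v3.psd", archival_value=False),
]
tiers = {f.name: storage_tier(f) for f in files}
for name, tier in tiers.items():
    print(f"{name} -> {tier}")
```

A real plan would likely need more than two layers and richer criteria (retention period, access frequency, confidentiality); the boolean flag stands in for that appraisal step.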
What is Digital Curation?
“Digital curation is maintaining and adding value to a trusted body of digital information for current and future use… the active management and appraisal of data over the life-cycle of scholarly and scientific materials.”
Digital Curation Centre (www.dcc.ac.uk)

What is Digital Curation?
“Implicit ... are the processes of digital archiving and preservation but it also includes all the processes needed for good data creation and management, and the capacity to add value to data to generate new sources of information and knowledge.”
Digital Curation Centre (www.dcc.ac.uk)

Digital Curation in the Zeitgeist!
• Initiatives at other universities
– Penn State, Ohio State, Duke, Yale
• Interest in project from other institutions
– University of Utah, James Madison, Smithsonian
• Invitation to present at ALA Midwinter
• Invitation to submit journal article
• Approached by NEH to develop proposal with other CIC institutions

Baseline Data Questionnaire
• Informal, web-based survey
• Publicized to potential participants through IT Exchange, MSU News, project website/blog
• Encouraged participation of technology staff and content creators
• Available for two weeks in October 2009

Baseline Data Questionnaire
• Types of digital content
• Digital content making up largest percentage
• Approximate volume of digital content in TB
• Storage media
• File formats
• Formats making up largest percentage

Baseline Data Questionnaire
• Online storage capacity and expansion plans
• Content management systems used
• Digital repository software used
• Presence of confidential data
• Additional comments

Questionnaire Results
• 90 responses
– 23 academic departments
– 31 administrative services units
– 9 research centers
– 27 technology services units

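The four response counts add up to the 90 responses reported. A minimal Python sketch of that tally follows; the variable names are illustrative, and the percentage shares are computed here rather than reported in the survey.

```python
# Response counts by unit type, as reported on the slide.
responses = {
    "academic departments": 23,
    "administrative services units": 31,
    "research centers": 9,
    "technology services units": 27,
}

total = sum(responses.values())  # 23 + 31 + 9 + 27

# Share of each unit type, rounded to one decimal place (derived, not reported).
shares = {unit: round(100 * count / total, 1) for unit, count in responses.items()}

for unit, count in responses.items():
    print(f"{unit}: {count} ({shares[unit]}%)")
print(f"total: {total}")
```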
Questionnaire Results
• Types of digital content varied considerably
• File formats varied considerably
• Storage mostly on hard drives, some combination of removable media and networked storage

Questionnaire Results
• 17 units planned increase of storage capacity, most from 1-10 TB
• Several CMS and/or digital repository implementations

Questionnaire Results
• Great interest and enthusiasm in project
• Anecdotal comments
– “Accumulating more than we can store!”
– Requests for guidance on identifying and handling archive-worthy files at time of creation
– How to choose digital asset management system

One-on-One Interviews
• Large problem space: how to break down?
• Decided to start by focusing on units with content management systems and/or digital repositories
• Informal, two-hour conversations rather than formal interviews
• Held at unit office

One-on-One Interviews
• Digital content, relation to mission of unit
• Content that must be preserved
– Of ongoing use
– Archival, documents activity of unit or university
• File formats
• Storage, including any issues

One-on-One Interviews
• Content management system and/or digital repository
– System used and why it was chosen
– What it’s used for
• Ingest, archival storage/preservation, access processes

One-on-One Interviews
• Metadata stored with or related to content
• File naming conventions

One-on-One Interviews
• MSU Extension/Agriculture and Natural Resources (ANR) Technology Services
– DotNetNuke, SharePoint, Intrafinity Portal (written for MSUE)
• Art & Art History Department
– Master image files stored offline
– Access files stored in MDID
– Metadata cataloged using IRIS

One-on-One Interviews
• Confucius Institute
– Promotes Chinese Language/Culture Education
– Version Cue, Subversion (SVN)
• Department of Theatre
– 75% digital photos, 15% CAD drawings
– In-house CMS based on LAMP
– ResourceSpace digital repository

One-on-One Interviews
• Broadcasting Services
• Center for Research on Mathematics and Science Education (CRMSE)
• Turfgrass Information Center (TIC)
• MATRIX
• National Superconducting Cyclotron Laboratory (NSCL)
• Physical Plant

Prototype: University Relations
• Public relations arm of Michigan State
• Holds records of historical value to the university
• Servers bursting at the seams with digital photos and video

University Relations: Digital Photos
• Hundreds of thousands of digital photos
• 4.6 TB on networked servers
• Nikon RAW NEF, TIFF, JPEG formats
• Some embargoes and use restrictions
• 21,000 images indexed in Extensis Portfolio
• 5,100 images publicly available through NetPublish Portfolio

UR Digital Photos: Value?
• Some of historical/archival significance
• Many of temporary use/value
• Many of no current value, should be disposed of

UR Digital Photos: Curation Needs
• Records inventory
• Appraisal of currently held files
– Identify permanent records of archival value
• Storage
– Preservation space/environment for archival masters
– Public access space for database and low resolution files

University Relations: Digital Video
• MSU Today show on Big 10 Network
• Shot in XDCAM HD
• Shows run 30-60 minutes
• Avid, Open Media Framework, QuickTime formats
– Avid for editing
– QuickTime for access copies

University Relations: Digital Video
• 2 TB networked storage
• 6 TB non-networked storage
• 4 TB internal storage on editing machines
• 16 TB “scratch” storage on Avid Unit network

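Taken together, the four digital video storage figures come to 28 TB, more than half of it scratch space. A minimal Python sketch of that arithmetic follows; the dictionary keys are illustrative labels, and the total and scratch share are derived here, not stated on the slide.

```python
# Digital video storage in TB, as listed on the slide.
video_storage_tb = {
    "networked": 2,
    "non-networked": 6,
    "internal (editing machines)": 4,
    "scratch (Avid network)": 16,
}

video_total_tb = sum(video_storage_tb.values())  # 2 + 6 + 4 + 16

# Portion of the total that is scratch space, as a percentage.
scratch_share = round(100 * video_storage_tb["scratch (Avid network)"] / video_total_tb, 1)

print(f"total video storage: {video_total_tb} TB")
print(f"scratch share: {scratch_share}%")
```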
University Relations: Digital Video
• Access versions uploaded to YouTube with closed captioning
• Tapes sent to Provost’s office
• UR keeps two edited versions
– Show master, including text overlays
– Clean master
• Most usage within 6 months of production

UR Digital Video: Curation Needs
• Preservation guidelines, including format recommendations
• Archival storage
• File transfer workflow
• Ability to provide access or reproduce as needed

University Relations: Next Steps
• Records inventory and appraisal/selection guidelines
– Meet with content creators and users
• Storage options
• Custody and file transfer workflows

Analysis
• Units developed solutions that fit nature of their data, needs of their users
• Some using commercial software, some open source
• Some hold content of archival value, to university and/or the unit
• Need for appraisal and preservation guidelines

Analysis: The Good
• Most units backing up data in some fashion
• Many demonstrate good use of metadata
• Many using repository software to manage digital content
• Many had good access/discovery interfaces
• Many had strong support from management, stable funding
• Open to digital curation guidelines

Analysis: The Not-So-Good
• Little emphasis on/plans for preservation
• Backups/server mirrors too close to production
• Some create and use little or no metadata
• Little in the way of digital curation policies
• Question of support, sustainable funding
• Cultural and financial inertia

Metadata Comparison
• Investigate metadata approaches and schemas used by prototype units
• Compare to Dublin Core metadata elements

Metadata Comparison
• Six units had metadata to share
– MATRIX, Theatre, and MSU Extension based on Dublin Core
– Art & Art History uses IRIS data standard for cataloging and management of art images, metadata based on VRA Core and CCO
– Physical Plant metadata from engineering content management solution used to manage facilities assets
– TIC uses bibliographic indexing terms in Cuadra Star system

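A comparison of a unit's metadata against Dublin Core can be sketched as a crosswalk check. This is a minimal illustration, not the project's actual method: the element list is the 15-element Dublin Core Metadata Element Set, while the function name and the local field names in `example_crosswalk` are hypothetical and not taken from any MSU unit.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def compare_to_dublin_core(crosswalk):
    """Summarize how a unit's local fields line up with Dublin Core.

    `crosswalk` maps each local field name to a Dublin Core element,
    or to None when no equivalent element was identified.
    """
    mapped = {dc for dc in crosswalk.values() if dc is not None}
    unknown = mapped - DUBLIN_CORE
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return {
        "covered": sorted(mapped),
        "missing": sorted(DUBLIN_CORE - mapped),
        "unmapped_local_fields": sorted(
            field for field, dc in crosswalk.items() if dc is None
        ),
    }


# Hypothetical local fields for an image collection.
example_crosswalk = {
    "work_title": "title",
    "photographer": "creator",
    "date_taken": "date",
    "file_type": "format",
    "usage_restrictions": "rights",
    "shelf_location": None,
}

report = compare_to_dublin_core(example_crosswalk)
print("covered:", report["covered"])
print("missing:", report["missing"])
print("no Dublin Core equivalent:", report["unmapped_local_fields"])
```

Fields with no Dublin Core equivalent are the ones that would need local extensions or a richer schema such as VRA Core.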
Digital Curation: Next Steps
• Best/good practices recommendations and guidance
• Foster “Community of Practice” meetings
• NEH grant proposal
• Development of layered storage plans, including transfers to Archives
• More appraisal help
– University Relations
– Other units, especially those with content to transfer to Archives

Digital Curation: Next Steps
• Exploration of unrepresented units with other types of digital content and curation practices
• Three tiers
– Curation plan for MSU publications, to include investigation into the options of creating an institutional repository or using CLOCKSS or other LOCKSS-based distributed preservation solution
– Curation guidelines for material that should be transferred to Archives
– Curation guidelines for non-MSU-related material

Deliverables & Dissemination
• Final report to University for this phase of project
• Project website: http://msudcp.archives.msu.edu/
• Presentations
– ALA Midwinter Meeting, January 2010
– SAA Student Chapter, School of Information, University of Michigan, March 2010
– SAA Annual Meeting, August 2010
• Publications
– Library Resources & Technical Services (LRTS) journal article, date TBD

References
• Cox, Richard. “The Academic Archives of the Future,” EDUCAUSE Review Magazine, Volume 43, http://www.educause.edu/EDUCAUSE+Review/
• Digital Curation Centre (DCC), http://www.dcc.ac.uk
• Michigan State University Archives & Historical Collections, http://www.archives.msu.edu/
• Michigan State University Digital Curation Planning Project, http://msudcp.archives.msu.edu/