A Micro-Services-Based Approach for Curation and Preservation

  • Published on
    14-Feb-2017

Transcript

  • From Preservation to Curation: extending boundaries, creating new services, engaging new users

    Patricia Cruse / Stephen Abrams

    University of California Curation Center

    California Digital Library

    NDIIPP/NDSA Partner Meeting, July 19-21, 2011

  • Our environment circa 2002-2008

    Focus on preservation

    Stakeholders: memory organizations

    Infrastructure: static

    Services: hosted

    Content: museum and library

    Sustainability: ?

  • The changing landscape

    Ever-increasing number, size, and diversity of content

    Ever-increasing diversity of partners and stakeholders

    Decreasing resources

    Inevitability of disruptive change

    Technology

    Institutional mission

    Users' changing expectations and needs

  • What keeps users up at night?

    What is metadata?

    Are there standards or best practices I should be aware of?

    How much will it cost?

    Why should I care about preservation? I just need a place to put my data.

    Where can I get help?

    How can I share my work with my colleagues?

    How can I publish the data associated with my publications?

    How do I fulfill the data management requirements of my grant?

    How can I make sure I get credit?

    Can't my work be included in the Web of Science?

    How can I provide access to my work?

  • Four questions or imperatives?

    How can we best respond organizationally?

    How does our technical landscape change?

    What is the value of our services to our diverse community of users?

    How can we build (or reach) new communities?

  • University of California Curation Center

    Creative partnership between the CDL, the 10 UC campuses, individuals and peer institutions

    A community of shared concern and practice

    A channel to pool and distribute diverse experience, expertise, and resources

    Robust, innovative, and cost-effective solutions to counteract inevitable disruptive change

  • UC Curation Center's environment today

    Organization & Stakeholders: UC libraries, UC community, and beyond

    Focus on curation and entire information lifecycle

    Technology and Infrastructure: simple, flexible, adaptable

    Services: diverse

    Content: agnostic

    Sustainability: a must

    Diagram labels: UC Community, External to UC

    Thanks to MacKenzie Smith, IDCC 2010

  • Data Management Planning (DMP) Tool

    Funding agencies requiring a DMP

    1. Connect researchers with resources

    2. Streamline the process to produce a credible and high-quality plan for managing data

    Eight institutions coming together

    Tool will have multiple phases

  • DMP Tool Out-of-the-Box

    1. For all users

    Step-by-step wizard for generating data management plans

    General guidance for each section: help text and resources relevant to all

    Save a plan as PDF, MS Word, or plain text, or generate a link to a PDF version of the finished plan

    2. For DMP Tool Partners

    Customized links to resources available to all of the institution's researchers

  • EZID: long-term identifiers made easy

    Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation

    Primary Functions

    1. Create persistent identifiers

    2. Manage identifiers over time

    3. Manage associated metadata over time
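
    The three primary functions above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class `ToyRegistry`, its method names, and the placeholder prefix `ark:/99999/demo` are hypothetical and are not EZID's actual API or storage design.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class IdentifierRecord:
    """One persistent identifier with its current target and metadata."""
    identifier: str
    target_url: str
    metadata: dict = field(default_factory=dict)


class ToyRegistry:
    """Hypothetical in-memory registry; not EZID's real interface."""

    def __init__(self, prefix="ark:/99999/demo"):
        self._prefix = prefix      # placeholder prefix (an assumption)
        self._serial = count(1)    # simple counter used to mint new names
        self._records = {}

    def create(self, target_url, **metadata):
        # 1. Create a persistent identifier; the string never changes afterwards.
        identifier = f"{self._prefix}{next(self._serial):04d}"
        self._records[identifier] = IdentifierRecord(
            identifier, target_url, dict(metadata)
        )
        return identifier

    def retarget(self, identifier, new_url):
        # 2. Manage the identifier over time: only the target location moves.
        self._records[identifier].target_url = new_url

    def update_metadata(self, identifier, **changes):
        # 3. Manage associated metadata over time.
        self._records[identifier].metadata.update(changes)

    def resolve(self, identifier):
        return self._records[identifier]


registry = ToyRegistry()
ark = registry.create("https://example.org/data/v1", title="Sample dataset")
registry.retarget(ark, "https://example.org/archive/data/v1")
registry.update_metadata(ark, creator="A. Researcher")
record = registry.resolve(ark)
```

    The point of the sketch is that the identifier string stays fixed while its target and metadata are allowed to change.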

  • EZID supports a wish list for data as a key component of scholarly communication

    Supporting researchers

    Precise identification of a dataset

    Credit to data producers and data publishers

    A link from the traditional literature to the data

    Research metrics for datasets

    Supporting a community

    Business model

    Tiered pricing structure for UC, non-UC, for-profit

    Revenue supports operations and development

    Range of customers: government agencies, research centers, institutions, for-profit

    Working with publishers to expose data as publication

  • Service overview

    Open to the UC community & beyond

    Discipline/content agnostic

    Service delivery: hosted or locally deployed

    Easy-to-use UI or API

    Primary Functions

    1. Deposit

    2. Manage (metadata, versions, etc.)

    3. Share (with other researchers)

    4. Access (expose)

    5. Preserve

  • Merritt's diverse service offering to the community

    Merritt's service offering

    Dark archive for important digital assets

    Bright archive with direct discovery and access

    Preservation back end for existing or new discovery and content management systems

    Integration with distributed data grids

    Local deployments

    Supporting the community

    Business model

    Pricing structure for UC, non-UC, and for-profit

    Pay as you go

    Pay once, store forever

    Revenue supports operation and development

    Range of customers: government agencies, research centers, institutions, for-profit

  • Web Archiving Service

    Capture today's web, build tomorrow's archive

    Primary Functions

    1. Collect web-published content

    2. Manage content

    3. Publish content for public access

  • WAS service overview

    Business model in place

    Pricing structure for UC, non-UC, and for-profit

    Service fee and storage used

    Revenue supports operations and development

    Range of customers: agencies, research centers, academic institutions, researchers, libraries

  • Digital Curation for EXCEL (DCXL) Project: Open source MS Excel add-ins

    Problem statement

    Data are the building blocks of scientific research.

    Many scientists use MS Excel to record, manage, view, graph, and manipulate datasets.

    Excel's current feature set can be a barrier to sharing, verifying, and preserving

    DCXL Outputs

    Requirements (open)

    Open source add-in: interoperable, sharable, publishable, archivable

    New community of practice

  • What an Excel add-in could do

    Some preliminary ideas to better publish, share, and archive

    Permit standardized column headers

    Versioning and standard date formats

    Auto-archiving and persistent ID assignment

    "Speed bumps" to discourage macros et al.

    Participants

    UC3 at the CDL

    UC Campus community

    DataONE

    The broader community

    MS Research

    Gordon and Betty Moore Foundation

    Delivery date: Spring 2012
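
    Two of the preliminary ideas on this slide, standardized column headers and standard date formats, can be illustrated outside Excel. The sketch below is written in Python rather than as an add-in; the function names, the header convention (lower case with underscores), and the list of accepted date layouts are all assumptions made for illustration, not DCXL requirements.

```python
import re
from datetime import datetime

# Date layouts this sketch recognizes (an assumption; note that slash dates
# are read as month/day/year, which a real tool would need to make configurable).
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y/%m/%d")


def standardize_header(header):
    """Lower-case a column header and replace runs of other characters with '_'."""
    return re.sub(r"[^0-9a-z]+", "_", header.strip().lower()).strip("_")


def standardize_date(value):
    """Return the date in ISO 8601 form (YYYY-MM-DD), or raise ValueError."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known layout
    raise ValueError(f"unrecognized date: {value!r}")


header = standardize_header("  Water Temp (C) ")
iso_date = standardize_date("07/19/2011")
```

    Rejecting unrecognized dates instead of guessing keeps ambiguous values visible to the person entering the data.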

  • Vision for a data paper

    Idea: wrap the unfamiliar in a familiar façade

    A data paper minimally consists of a cover sheet and a set of links to archived artifacts

    Cover sheet contains familiar elements: title, date, authors, abstract, and persistent identifier (DOI, ARK, etc.)

    Just enough to permit basic exposure and discovery

    Building a basic data citation

    Indexing by services such as Web of Science, Google Scholar

    Instilling confidence in the identifier's stability
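
    The data paper described on this slide maps onto a small data structure: a cover sheet plus a list of links to archived artifacts. The sketch below is one possible reading; the class names, the citation layout, and every sample value (including the identifier and URL) are placeholders invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CoverSheet:
    """The familiar elements named on the slide."""
    title: str
    date: str
    authors: list
    abstract: str
    identifier: str  # persistent identifier: DOI, ARK, etc.


@dataclass
class DataPaper:
    """A cover sheet plus links to the archived artifacts."""
    cover: CoverSheet
    artifact_links: list = field(default_factory=list)

    def citation(self):
        # Build a basic data citation from cover-sheet elements only.
        # The ordering and punctuation are illustrative, not a standard.
        names = "; ".join(self.cover.authors)
        return f"{names} ({self.cover.date}). {self.cover.title}. {self.cover.identifier}"


paper = DataPaper(
    cover=CoverSheet(
        title="Example survey data",
        date="2011",
        authors=["Doe, J.", "Roe, R."],
        abstract="Placeholder abstract.",
        identifier="doi:10.5072/EXAMPLE",  # placeholder, not a real DOI
    ),
    artifact_links=["https://example.org/archive/survey.csv"],
)
citation = paper.citation()
```

    Because the citation is assembled from the cover sheet alone, an indexing service would not need to open the archived artifacts to expose the work.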

  • Data Publishing at the CDL

    UC Curation Center

    Merritt: Curation repository

    EZID: Persistent ID management and resolution (ARKs, DOIs, et al.)

    Publishing Services Program

    Online journals, with peer review

    Scholarly communication: grey literature to postprints

    Search and display tools (XTF)

  • Lessons learned (and still learning)

    Goal is to work on several fronts to make complex problems smaller

    Don't circle the wagons

    Stop doing what you can't support

    Outsource and/or use third-party components

    Deploy new infrastructure and services that can be used in diverse ways

    Engage with new communities: research community

    Support emerging initiatives

    Collaborate now more than ever!

  • UC3's special agents at the CDL

    Tracy Seneca

    Margaret Low

    Mark Reyes

    Stephen Abrams

    Perry Willett

    Marisa Strong

    Greg Janee

    David Loy

    Scott Fisher

    Carly Strasser

    Trisha Cruse

    John Kunze

    Erik Hetzner

    Lisa Colvin

