A Micro-Services-Based Approach for Curation and Preservation Solutions Stephen Abrams Patricia Cruse John Kunze Perry Willett University of California Curation Center California Digital Library Global Oracle PASIG User Group Meeting, Redwood Shores, May 10-12, 2011


A Micro-Services-Based Approach for Curation and Preservation Solutions

Stephen Abrams
Patricia Cruse
John Kunze
Perry Willett

University of California Curation Center
California Digital Library

Global Oracle PASIG User Group Meeting, Redwood Shores, May 10-12, 2011


Who are we?


What do we do?

• Innovative curation solutions (preservation and access) for the UC community and external partners
– EZID, Merritt repository, Web Archiving Service (WAS)
– Guidelines and best practices
– Integration with other CDL programs (e.g., OA publishing, special collections) and external initiatives (e.g., DataCite, DataONE, HathiTrust, etc.)

Figure: Curation (research, teaching, learning) in relation to the information lifecycle and the scholarly lifecycle. Stage labels: Publish, Preserve, Access, Collect, Discover, Gather, Create, Share.




That’s simple, right?

• Ever increasing number, size, and diversity of content

• Expanded use by new constituencies

Information Center for the Environment

Minnesota Digital Library



• Ongoing (potentially disruptive) changes in technology and user expectation

• Increasing obligations, shrinking budgets

http://www.flickr.com/photos/krbuchholz/3278516200/ http://www.flickr.com/photos/mildlydiverting/32286893


Merritt repository

“How can I meet the data management requirements of my grant?”

“I know my desktop content is at risk; what should I do?”

“What’s a good way to share the data underlying a recent publication?”

“How can I ensure persistent availability?”


Merritt repository

Preservation back-end for existing discovery services

Dark archive for preservation masters

Integration with distributed data grids

Bright archive for preservation and end-user access


Managerially friendly


Model free

application/msword 342.5 KB


Strongly versioned


Easy submission


Merritt micro-services

• Merritt is built from a micro-services toolkit
– IdM/Authn/Authz: LDAP
– Persistent identifiers: EZID
– Persistent storage: CAN/Pairtree/Dflat/Checkm/ReDD
– Fixity: Fixity
– Replication: Replication
– Catalog: Inventory/4store
– Ingest: Ingest/Zookeeper
– Characterization: JHOVE2
– Discovery: XTF
– Transformation
– Notification
– Annotation

Version 2

GhOST/Shibboleth
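
Pairtree, named in the persistent storage entry above, is a convention for turning an identifier into a directory path so that an ordinary file system can hold the objects. The following is a minimal sketch of that mapping, assuming the character-cleaning rules of the published Pairtree specification; the function names `clean_id` and `pairtree_path` are hypothetical and are not taken from Merritt's code.

```python
# Characters that the Pairtree rules hex-encode as ^hh
# (in addition to anything outside visible ASCII).
_HEX_ENCODE = set('"*+,<=>?\\^|')

# Single-character substitutions applied to the remaining characters.
_SUBSTITUTE = {"/": "=", ":": "+", ".": ","}


def clean_id(identifier):
    """Return the identifier with file-system-hostile characters replaced."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append("^%02x" % byte)
        else:
            out.append(_SUBSTITUTE.get(ch, ch))
    return "".join(out)


def pairtree_path(identifier, root="pairtree_root"):
    """Split the cleaned identifier into two-character directory names."""
    cleaned = clean_id(identifier)
    segments = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join([root] + segments)
```

With these definitions, `pairtree_path("ark:/13030/xt12t3")` returns `pairtree_root/ar/k+/=1/30/30/=x/t1/2t/3`.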


Micro-service choreography

Figure: choreography diagram. Component labels: User agent, Ingest, Inventory, Fixity, EZID, Store, Node (three instances), Notification, Identifier. Package labels: SIP, AIP, DIP.
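
Fixity appears in the choreography as its own service. As a rough illustration of what a fixity audit involves, the sketch below records a digest for each stored file and later reports any file whose content is missing or has changed. It assumes SHA-256 digests and an in-memory manifest; the function names are hypothetical, and nothing here describes how Merritt's Fixity service is actually implemented.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_manifest(root):
    """Map each file under root (by relative path) to its current digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root, manifest):
    """Return relative paths that are missing or no longer match the manifest."""
    root = Path(root)
    failures = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel)
    return failures
```

An empty list from `audit` means every recorded file is present and unchanged.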


Micro-services

The Unix philosophy
• “Make each program do one thing well”
• “To do a new job, build afresh rather than complicate old programs by adding new features”
• “Expect the output of every program to become the input to another, as yet unknown, program”
• “Design and build software … to be tried early”
• “Don't hesitate to throw away the clumsy parts and rebuild them”

M. D. McIlroy et al., “Unix time-sharing system: Foreword,” Bell System Technical Journal 57:6, part 2 (1978): 1902

• Complex emergent behavior
• Low barrier, low maintenance, low commitment
• Policy neutral, protocol/platform independent
• The file system is the database

http://www.flickr.com/photos/oskay/265899811
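
The maxim that one program's output should become another's input can be illustrated with a small composition helper. This is only a sketch of the idea: `compose` and the three step functions are hypothetical stand-ins, not actual Merritt micro-services.

```python
from functools import reduce


def compose(*steps):
    """Chain single-purpose steps so each one's output feeds the next."""
    def run(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


# Hypothetical stand-ins for micro-services; each does one thing to a record.
def assign_identifier(record):
    return {**record, "id": "demo:" + record["name"]}


def measure(record):
    return {**record, "size": len(record["content"])}


def summarize(record):
    return "{id} ({size} bytes)".format(**record)


# A new job is built by arranging existing steps, not by extending any of them.
ingest = compose(assign_identifier, measure, summarize)
```

Calling `ingest({"name": "report.txt", "content": b"hello"})` returns `demo:report.txt (5 bytes)`. Adding or swapping a step changes the arrangement only; the existing steps are untouched.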


Curious Oysters, http://www.flickr.com/photos/thecuriousoysters/4458657148/

Questions?


For more information

UC Curation Center
http://www.cdlib.org/[email protected]

Merritt repository
http://merritt.cdlib.org/

UC3: Stephen Abrams, Lisa Colvin, Patricia Cruse, Scott Fisher, Alex Genadinik, Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy, Mark Reyes, Tracy Seneca, Joan Starr, Marisa Strong, Perry Willett

• Abrams, Cruse, Kunze, & Minor, “Curation micro-services: A pipeline metaphor for repositories,” Journal of Digital Information 12:2 (2011)

• Abrams, Kunze, & Loy, “An emergent micro-services approach to digital curation infrastructure,” International Journal of Digital Curation 5:1 (2010): 172-186