Merritt Repository Depositing Content and Providing Access University of California Curation Center Team California Digital Library July 28, 2011 UC3 Summer Webinar Series


Merritt Repository
Depositing Content and Providing Access

University of California Curation Center Team
California Digital Library

July 28, 2011

UC3 Summer Webinar Series


Merritt summary

• Curation repository
  – Supporting long-term preservation and access
  – Publish, share, preserve, discover, (re-)use

• “Model free”
  – There are no prescriptive requirements for content genre, format, structure, or accompanying metadata

• No service fee (for UC affiliates)
  – Contributors are billed only for storage, $1.04/GB/year

Cost of a physical book in offsite storage: $4.62/year
Cost of a digital book in HathiTrust: $0.15/year
Cost of a digital book in Merritt: $0.06/year
Cost of a dataset in Merritt: $1.00/year

For more information, review the June 9 webinar: http://www.cdlib.org/uc3/uc3webinars.html
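The per-object figures above follow from the flat storage rate. As a rough sketch, the arithmetic can be written out as below; the function names are illustrative, and the object sizes it derives are only what the quoted prices imply, not figures stated in the slides.

```python
# Storage rate quoted on the slide, in US dollars per gigabyte per year.
RATE_USD_PER_GB_YEAR = 1.04


def annual_cost_usd(size_gb: float) -> float:
    """Yearly storage charge for an object of the given size in GB."""
    return RATE_USD_PER_GB_YEAR * size_gb


def implied_size_gb(cost_usd_per_year: float) -> float:
    """Object size (GB) that a quoted yearly cost corresponds to at the flat rate."""
    return cost_usd_per_year / RATE_USD_PER_GB_YEAR


if __name__ == "__main__":
    # Quoted per-object costs from the slide; sizes are inferred, not sourced.
    for label, cost in [("digital book", 0.06), ("dataset", 1.00)]:
        size_mb = implied_size_gb(cost) * 1024
        print(f"{label}: ${cost:.2f}/year implies roughly {size_mb:.0f} MB")
```

Read this way, the $0.06 digital book corresponds to a few tens of megabytes and the $1.00 dataset to a little under one gigabyte.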


Master recipe

• Registration (one time) [contributor → UC3, [email protected]]

• Submission [contributor → Merritt]

• Ingest [Merritt]

• Notification [Merritt → contributor]

• Discovery/delivery [consumer → Merritt → consumer]


Registration

• Contact Perry Willett, Merritt service manager [email protected]


Submission

• User interface: manual deposits
• METS feeder: existing DPR workflows
• API: automated deposits


UI submission

• The submission package is always a single file

• An opportunity to supply descriptive metadata


UI submission

• The submission package is always a single file, which may be:
  – For a single object
    • The complete object
    • A multi-file object in a container (zip, gzip, tar.gz)
    • A multi-file object defined by a manifest
  – For a batch of objects
    • A manifest referring to single-file objects
    • A manifest referring to objects in containers
    • A manifest referring to objects defined by manifests


Manifest

• A “packing slip” for an object, providing URLs for all of the object’s file components
  – Object manifest

• Algorithm = adler32, crc32, md2, md5, sha1, sha256, sha384, sha512

• See User’s Guide and online help for more information http://merritt.cdlib.org/

fileURL | hashAlgorithm | hashValue | fileSize | fileName | mimeType...

#%checkm_0.7
#%profile | http://uc3.cdlib.org/registry/ingest/manifest/mrt-ingest-manifest
#%prefix | mrt: | http://merritt.cdlib.org/terms#
#%prefix | nfo: | http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#
#%fields | nfo:fileUrl | nfo:hashAlgorithm | nfo:hashValue | nfo:fileSize | nfo:fileLastModified | nfo:fileName | mrt:mimeType
http://merritt.cdlib.org/samples/call911.jpg | md5 | 47d321056e60944a06973...
http://merritt.cdlib.org/samples/call911.txt | md5 | 77fe42b1055bbabe51648...
#%eof
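A manifest like the sample can be generated rather than typed by hand. The following is a minimal sketch, not Merritt's own tooling: the header lines are copied from the sample, while the helper names and the decision to leave `nfo:fileLastModified` empty are assumptions made for illustration.

```python
import hashlib

# Header lines copied from the sample object manifest on the slide.
HEADER = [
    "#%checkm_0.7",
    "#%profile | http://uc3.cdlib.org/registry/ingest/manifest/mrt-ingest-manifest",
    "#%prefix | mrt: | http://merritt.cdlib.org/terms#",
    "#%prefix | nfo: | http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#",
    "#%fields | nfo:fileUrl | nfo:hashAlgorithm | nfo:hashValue | nfo:fileSize"
    " | nfo:fileLastModified | nfo:fileName | mrt:mimeType",
]


def manifest_row(url: str, content: bytes, file_name: str, mime_type: str) -> str:
    """Build one pipe-delimited row for a file component (hypothetical helper)."""
    digest = hashlib.md5(content).hexdigest()
    # Assumption: nfo:fileLastModified may be left empty.
    columns = [url, "md5", digest, str(len(content)), "", file_name, mime_type]
    return " | ".join(columns)


def build_object_manifest(components) -> str:
    """components: iterable of (url, content_bytes, file_name, mime_type) tuples."""
    rows = [manifest_row(*component) for component in components]
    return "\n".join(HEADER + rows + ["#%eof"]) + "\n"
```

The file contents are passed in as bytes so the digest and size can be computed locally; the URLs must still point at web-accessible copies of the same bytes.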


Manifest

• A “packing slip” for a batch, providing URLs for all of the objects’ file components
  – Batch manifest
    • Batch of single-file objects
    • Batch of container objects
    • Batch of manifest objects

• An Excel macro is available for automatically generating manifests from spreadsheets http://merritt.cdlib.org/docs/merrittManifest.xls

• See User’s Guide and online help for more information http://merritt.cdlib.org/

fileURL | hashAlgorithm | hashValue | fileSize | fileName | primaryID | localID | creator | title | date...
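The spreadsheet-to-manifest conversion can also be sketched in a few lines of code. This is only an illustration of the idea, not the Excel macro itself: it assumes the spreadsheet is exported as CSV with column names matching the field list above, and it emits data rows only, because the header lines for a batch manifest are not shown in the slides.

```python
import csv
import io

# Column order taken from the batch manifest field list on the slide.
BATCH_FIELDS = [
    "fileURL", "hashAlgorithm", "hashValue", "fileSize", "fileName",
    "primaryID", "localID", "creator", "title", "date",
]


def batch_rows_from_csv(csv_text: str) -> list:
    """Turn CSV records into pipe-delimited batch manifest rows.

    Missing columns become empty fields. Values containing a pipe character
    are not escaped here; that case would need separate handling.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        values = [(record.get(field) or "").strip() for field in BATCH_FIELDS]
        if not values[0]:
            raise ValueError("every row needs a fileURL")
        rows.append(" | ".join(values))
    return rows
```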


Metadata

• Submission form
• Batch manifest
• Object component: mrt-erc.txt

erc:
who: Blaine, Tegan Woodward
what: Continuous measurements of atmospheric argon/nitrogen ...
when: 2005
where: ark:/20775/bb21509964

Dublin Kernel | Dublin Core element | Description
who | creator | Responsible person or party
what | title | Content description
when | date | Lifecycle-meaningful date
where | identifier | Locally-meaningful identifier

http://dublincore.org/groups/kernel/spec/
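To make the kernel-to-Dublin-Core correspondence concrete, here is a small sketch that writes an `mrt-erc.txt` body and maps it back to Dublin Core element names. The function names are hypothetical, and the record layout simply mirrors the example above rather than the full kernel specification.

```python
# Mapping taken from the Dublin Kernel / Dublin Core table on the slide.
KERNEL_TO_DC = {
    "who": "creator",
    "what": "title",
    "when": "date",
    "where": "identifier",
}


def build_erc(who: str, what: str, when: str, where: str) -> str:
    """Return text for an mrt-erc.txt component, laid out like the slide example."""
    fields = {"who": who, "what": what, "when": when, "where": where}
    lines = ["erc:"] + [f"{name}: {value}" for name, value in fields.items()]
    return "\n".join(lines) + "\n"


def erc_to_dublin_core(erc_text: str) -> dict:
    """Map kernel elements in an ERC record to Dublin Core element names."""
    dc = {}
    for line in erc_text.splitlines():
        # Split on the first colon only, so ARK identifiers stay intact.
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in KERNEL_TO_DC:
            dc[KERNEL_TO_DC[name]] = value.strip()
    return dc
```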


METS feeder

• METS must conform to a profile documented in the CDL Guidelines for Digital Objects: http://www.cdlib.org/services/dsc/contribute/docs/GDO.pdf
  – METS, all referenced file components, and manifest must be web accessible
  – The Merritt IP address can be provided for configuring firewall rules

• Feeder manifest

http://url/path/mets.xml
http://url/path/mets.xml
...

• Submission

http://feeder.cdlib.org/?userID=id&authCode=passwd&accessGroupID=collection&manifestURL=manifest
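Because the manifest URL is itself passed as a query parameter, it is worth building the submission URL with proper percent-encoding. A minimal sketch, assuming the endpoint and the four parameter names shown above are the complete interface; the function name is made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint and parameter names as shown on the slide.
FEEDER_ENDPOINT = "http://feeder.cdlib.org/"


def feeder_submission_url(user_id: str, auth_code: str,
                          access_group_id: str, manifest_url: str) -> str:
    """Assemble a METS feeder submission URL with encoded query parameters."""
    query = urlencode({
        "userID": user_id,
        "authCode": auth_code,
        "accessGroupID": access_group_id,
        "manifestURL": manifest_url,  # percent-encoded by urlencode
    })
    return f"{FEEDER_ENDPOINT}?{query}"
```

Note that the credentials travel in the query string, so the resulting URL should be treated as sensitive.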


API submission

Field | Required? | Value
filename | optional | File name
file | required | File contents
type | optional | File type: file, container, object-manifest, batch-manifest, container-batch-manifest, single-file-batch-manifest
profile | required | Profile (supplied by UC3)
primaryIdentifier | optional | Primary identifier (ARK)
localIdentifier | optional | Local identifier
digestType | optional | Message digest type: adler-32, crc-32, md2, md5, sha-1, sha-256, sha-384, sha-512


API submission

Field | Required? | Value
digestValue | optional | Message digest value (hexadecimal encoded)
creator | optional | Creator
title | optional | Title
date | optional | Date
note | optional | Descriptive note
responseForm | optional | Response form: anvl, json, xhtml, xml
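The field tables lend themselves to a client-side check before anything is sent. The sketch below encodes only what the two tables state (required versus optional fields and the enumerated values); the function name and the wording of the problem messages are illustrative, and the server may apply rules not captured here.

```python
# Field names and allowed values transcribed from the API submission tables.
REQUIRED_FIELDS = {"file", "profile"}
OPTIONAL_FIELDS = {
    "filename", "type", "primaryIdentifier", "localIdentifier", "digestType",
    "digestValue", "creator", "title", "date", "note", "responseForm",
}
ALLOWED_VALUES = {
    "type": {
        "file", "container", "object-manifest", "batch-manifest",
        "container-batch-manifest", "single-file-batch-manifest",
    },
    "digestType": {
        "adler-32", "crc-32", "md2", "md5",
        "sha-1", "sha-256", "sha-384", "sha-512",
    },
    "responseForm": {"anvl", "json", "xhtml", "xml"},
}


def check_submission_fields(fields: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name in sorted(REQUIRED_FIELDS - fields.keys()):
        problems.append(f"missing required field: {name}")
    for name in sorted(fields.keys() - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        problems.append(f"unknown field: {name}")
    for name, allowed in ALLOWED_VALUES.items():
        if name in fields and fields[name] not in allowed:
            problems.append(f"unsupported {name}: {fields[name]}")
    return problems
```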


API submission

POST /object/ingest HTTP/1.1
Host: merritt.cdlib.org
Content-type: multipart/form-data; boundary=boundary

--boundary
Content-disposition: form-data; name="file"; filename="filename"

file
--boundary
Content-disposition: form-data; name="type"

type
--boundary
Content-disposition: form-data; name="profile"

profile
--boundary
...
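For clients that assemble the request themselves, the multipart body shown above can be produced with the standard library alone. This is a sketch under assumptions: the helper name is invented, the file part is labelled `application/octet-stream` (the slide does not specify a content type), and field values are assumed to be plain text.

```python
import uuid


def encode_multipart(fields: dict, file_name: str, file_bytes: bytes):
    """Encode text fields plus one 'file' part as multipart/form-data.

    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in the data
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n"  # assumption, not from the slide
        "\r\n"
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)
```

Each part is separated from its headers by an empty line, and the final delimiter carries a trailing `--`, as the multipart format requires.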


API submission

• cURL: http://curl.haxx.se/

% curl -s -u user:password -F "file=@manifest" -F "type=manifest-type" -F "profile=profile" -F "localIdentifier=identifier" -F "creator=creator" -F "title=title" http://merritt.cdlib.org/object/ingest
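The same call can be prepared from a script. The sketch below only constructs the request object and does not send it; it assumes HTTP Basic authentication (which is what curl's `-u` option uses by default) and expects the caller to supply an already encoded multipart body and its matching Content-Type value. The function name is hypothetical.

```python
import base64
import urllib.request

# Ingest endpoint as given in the curl example.
INGEST_URL = "http://merritt.cdlib.org/object/ingest"


def build_ingest_request(user: str, password: str,
                         content_type: str, body: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the ingest endpoint with Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        INGEST_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": content_type,  # must carry the multipart boundary
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the submission; that step is left out here so the sketch has no network side effects.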


Ingest

• Primary identifier
  – ARK (required; auto-generated if not supplied)
  – DOI (can optionally be requested)

• Validation

• Characterization

• SIP → AIP
  ISO 14721, Open Archival Information System (OAIS)


Notification

• You will receive two separate email notifications
  – Initial notification that we have received your submission, and that it is queued for subsequent processing
  – Final notification that we have fully processed your submission
    • UC3’s preservation commitment starts at the time of final notification


Initial notification

From: UC3 Merritt Support [mailto:[email protected]]
Sent: Thursday, July 14, 2011 3:28 PM
To: Stephen Abrams
Subject: Completion of submission

Completion of submission - Notification
- Submission ID: bid-4ed4bf45-aa78-4da7-bb65-63b125d88150
- Job(s):
    Number of pending job(s): 1
    Number of completed job(s): 0
    Number of failed job(s): 0
- User agent: slabrams
- Submission date: 2011-07-14T15:27:41-07:00
- Status: QUEUED

Completion of submission - Notification Report
- Submission ID: bid-4ed4bf45-aa78-4da7-bb65-63b125d88150
- Job(s):
    - Job ID: jid-3498bef6-e296-429d-b652-da1f35f8bc04
    - Primary ID: ark:/20775/bb21509964
    - Local ID: http://libraries.ucsd.edu/ark:/20775/bb21509964;b4946677;umi-ucsd-1040
    - Filename: manifest2.txt
    - Object title: Continuous measurements of atmospheric argon/nitrogen as a tracer of air-sea heat flux : models, methods, and data
    - Object creator: Blaine, Tegan Woodward
    - Object date: 2005
    - Status: PENDING
- User agent: slabrams
- Submission date: 2011-07-14T15:27:41-07:00
- Status: QUEUED

With attachment, bid-4ed4bf45-aa78-4da7-bb65-63b125d88150.txt


Final notification

From: UC3 Merritt Support [mailto:[email protected]]
Sent: Thursday, July 14, 2011 3:28 PM
To: Stephen Abrams
Subject: Completion of ingest

Notification Summary
- Submission ID: bid-4ed4bf45-aa78-4da7-bb65-63b125d88150
- Job(s):
    Number of pending job(s): 0
    Number of completed job(s): 1
    Number of failed job(s): 0
- User agent: slabrams
- Queue Priority: 06
- Submission date: 2011-07-14T15:27:41-07:00
- Completion date: 2011-07-14T15:27:53-07:00
- Status: COMPLETED

With attachment, bid-4ed4bf45-aa78-4da7-bb65-63b125d88150.txt

Completion of ingest - Notification Report
- Submission ID: bid-4ed4bf45-aa78-4da7-bb65-63b125d88150
- Job(s):
    - Job ID: jid-3498bef6-e296-429d-b652-da1f35f8bc04
    - Primary ID: ark:/99999/fk4vm4kg6
    - Local ID: ark:/20775/bb21509964
    - Version: 3
    - Filename: manifest2.txt
    - Object title: Continuous measurements of atmospheric argon/nitrogen as a tracer of air-sea heat flux : models, methods, and data
    - Object creator: Blaine, Tegan Woodward
    - Object date: 2005
    - Object state: http://store-stage.cdlib.org:35121/state/2111/ark%3A%2F99999%2Ffk4vm4kg6?t=xhtml
    - Submission date: 2011-07-14T15:27:46-07:00
    - Completion date: 2011-07-14T15:27:53-07:00
    - Status: COMPLETED
- User agent: slabrams
- Queue Priority: 06
- Submission date: 2011-07-14T15:27:41-07:00
- Completion date: 2011-07-14T15:27:53-07:00
- Status: COMPLETED
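Automated depositors may want to check these notifications programmatically. The following sketch assumes the notification text arrives with one `Key: value` field per line, as in the messages shown here; the regular expression, the function names, and the success rule (status `COMPLETED` with zero failed jobs) are illustrative choices rather than a documented Merritt interface.

```python
import re

# One "Key: value" field per line, optionally preceded by "- ".
FIELD = re.compile(r"^\s*(?:-\s*)?([^:]+?):\s*(\S.*)$")


def parse_notification(text: str) -> dict:
    """Collect fields from a notification; later duplicates overwrite earlier ones.

    In the report layout, the submission-level Status comes last, so it wins
    over any job-level Status that appears earlier.
    """
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def ingest_succeeded(fields: dict) -> bool:
    """True when the submission completed and no job failed."""
    failed = int(fields.get("Number of failed job(s)", "0"))
    return fields.get("Status") == "COMPLETED" and failed == 0
```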


Discovery/delivery

• Search


• Browse



Coming soon …

• Enhanced characterization
  – JHOVE2: http://jhove2.org/

• Faceted search/browse
  – XTF (the technology behind ): http://xtf.cdlib.org/

• Investigation of CMS/DAMS-like function through integration with …
  – Islandora/Drupal (in cooperation with UCLA)
  – Alfresco (in cooperation with UCB)
  – Omeka (in cooperation with UCSC)


Questions?


Upcoming webinars

Date/time | Topic
Thursday, August 11, 2:00 pm | EZID: Create and Manage Persistent Identifiers (Joan Starr, UC3/CDL)
Thursday, August 25, 2:00 pm | DCXL (Data Curation Excel) (Carly Strasser, UC3/CDL)
Thursday, Sept. 22, 2:00 pm | Data Management Planning Tool (Patricia Cruse/Tracy Seneca, UC3/CDL)

http://www.cdlib.org/uc3/uc3webinars.html


For more information

UC Curation Center
http://www.cdlib.org/uc3
http://www.cdlib.org/uc3/[email protected]

Stephen Abrams, Lisa Colvin, Patricia Cruse, Scott Fisher, Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy, Mark Reyes, Abhishek Salve, Tracy Seneca, Joan Starr, Carly Strasser, Marisa Strong, Perry Willett

UC3 webinar series
http://www.cdlib.org/uc3/uc3webinars.html

Merritt repository
http://merritt.cdlib.org/
http://merritt.cdlib.org/help
http://merritt.cdlib.org/docs/merritt_handout.pdf
http://merritt.cdlib.org/docs/merritt_user_guide.pdf