Dermot Frost Digital Repository of Ireland Trinity College Dublin

Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DESCRIPTION

Presentation to the Search Solutions conference, London, November 2013. Discusses the use of open source technology, including Solr and Blacklight, to build a search engine spanning multiple content types, file formats and metadata standards from many collections.

Page 1: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Dermot Frost, Digital Repository of Ireland, Trinity College Dublin

Page 2: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Mission

DRI is a trusted digital repository for Humanities and Social Sciences Data

- linking and preserving the rich data held by Irish institutions, with a central internet access point

- Our Cultural & Social Heritage

Page 3: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

International Networks

Page 4: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DRI Platform

Access Preservation

Federated Archives, Storage

Discovery

App App App Linked Logainm

Page 5: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Metadata

Page 6: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Formats

Page 7: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Digital Preservation

Data citation, Permanent IDs

Metrics, funding, allowable costs

Training

Sustained e-infrastructure

Copyright, IPR, licensing, data protection, embargoes

Open metadata, open access

Policy, Services, Systems → Practice

Global Good Data Practice

Page 8: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DRI Presentation

Dr. Una Walker, NIVAL

Photographic archive

Kilkenny Design Workshops

Page 9: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DRI Presentation

Dr. Seathrún Ó Tuairisg, NUIG

RTÉ RnG: 40 years broadcast history

Folklore gathering, béaloideas

Irish Language

Page 10: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DRI Presentation

Robin Adams, TCD Library

Stained Glass Windows

Business records, ephemera

Harry Clarke 1889 to 1931

Page 11: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DRI Presentation

THE MEDIA ENVELOPE

[Screenshot of the Media Envelope interface: a day-by-day timeline (01 January 1973 and 06-07 January 1973) with an hourly schedule, filters for genre, source type and language, a search facility, and user login / personalisation. Sources shown include R.T.E. (Radio Telefís Éireann), Radio Éireann, The Kerryman, The Connacht Tribune and The Irish Press.]

Prof Chris Morash & Dr John Keating, NUIM

Page 12: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Dr. Jane Gray, IQDA

Changing life patterns in Ireland, 1900s to present

Life Stories

"My mother used to make a ball and we used to play ball, she used to make a hurl out of a bit of a board and make the handle a bit thin and you could catch it, no shape or make it only a bit of a board. And she used to make a ball out of a soft set of turf and put an old sock around it"

Page 13: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

REPOSITORY STRUCTURE

Page 14: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Open source components

Custom code to join them together

Page 15: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

OAIS model

Page 16: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Using Hydra (Ruby on Rails)

Obvious to use Blacklight

Therefore use Solr

Objects ingested into Fedora Commons

Use the Solrizer gem to create the Solr index

Search setup
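
As an illustration of the indexing step, here is a minimal sketch of flattening an object's descriptive metadata into a Solr document, which is roughly the job the slide assigns to Solrizer. The function name, the sample identifier and the field suffixes are assumptions made for this sketch (the suffixes imitate the dynamic-field naming style common in the Hydra ecosystem); the actual DRI schema is not shown in the slides.

```python
from typing import Dict, List

# Hypothetical suffixes, modelled on the dynamic-field naming style used in
# the Hydra ecosystem. The real DRI Solr schema is not given in the slides.
SUFFIXES = {
    "searchable": "_tesim",  # tokenised text, stored, indexed, multi-valued
    "facetable": "_sim",     # exact string, indexed, multi-valued
}


def to_solr_document(object_id: str, metadata: Dict[str, List[str]]) -> Dict[str, object]:
    """Flatten {field: [values]} metadata into a flat Solr-style document."""
    doc: Dict[str, object] = {"id": object_id}
    for field_name, values in metadata.items():
        # Index every field twice: once for full-text search, once for facets.
        for suffix in SUFFIXES.values():
            doc[field_name + suffix] = list(values)
    return doc


if __name__ == "__main__":
    print(to_solr_document("dri:example", {"title": ["Stained Glass Windows"]}))
```

The resulting dictionary is the kind of payload that would be posted to Solr's update handler; Blacklight then queries the same fields.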

Page 17: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost
Page 18: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Object metadata all CC0

Search will return metadata on all records

Authorization system will restrict access to the objects

Multi-lingual data (English and Irish at the moment)

Indices for each language

Can search across specific or all

Search setup
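
The two ideas on this slide, per-language indices and open metadata with restricted objects, can be sketched in memory as follows. This is a toy model under stated assumptions, not the DRI implementation: the `Record` class, the `allowed_users` field and the whitespace tokeniser are all hypothetical, and a real deployment would rely on Solr fields and the repository's authorization layer instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Record:
    object_id: str
    titles: Dict[str, str]  # language code -> title, e.g. {"en": ..., "ga": ...}
    allowed_users: Set[str] = field(default_factory=set)  # empty set = public object


def build_indices(records: List[Record]) -> Dict[str, Dict[str, Set[str]]]:
    """Build one inverted index (token -> object ids) per language."""
    indices: Dict[str, Dict[str, Set[str]]] = {}
    for record in records:
        for lang, title in record.titles.items():
            index = indices.setdefault(lang, {})
            for token in title.lower().split():
                index.setdefault(token, set()).add(record.object_id)
    return indices


def search(indices: Dict[str, Dict[str, Set[str]]], term: str,
           langs: Optional[List[str]] = None) -> List[str]:
    """Search the named language indices, or all of them when langs is None.

    Metadata is open, so every matching record is returned regardless of user.
    """
    selected = langs if langs is not None else list(indices)
    hits: Set[str] = set()
    for lang in selected:
        hits |= indices.get(lang, {}).get(term.lower(), set())
    return sorted(hits)


def can_access_object(record: Record, user: Optional[str]) -> bool:
    """Authorization applies to the object itself, not to its metadata."""
    return not record.allowed_users or user in record.allowed_users
```

Keeping the search function unaware of users mirrors the slide's split: discovery works over all metadata, and the access check happens only when an object is requested.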

Page 19: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DATA PRESERVATION

Page 20: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Multi-site repository

Dublin and Maynooth (~25km separation)

Asynchronous replication

Ability to catch errors on the fly

Segregated storage

Master copies with surrogates for public access

Preservation strategy
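
One way to "catch errors on the fly" during replication is to verify a checksum before and after each copy. The sketch below assumes a stored SHA-256 digest per master file and a plain file copy between sites; the function names and the copy mechanism are illustrative assumptions, not a description of how DRI replicates between Dublin and Maynooth.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(master: Path, replica_dir: Path, expected_sha256: str) -> Path:
    """Copy a master file to the replica site, refusing to spread corruption."""
    # Check the source first so a damaged master is never propagated.
    if sha256_of(master) != expected_sha256:
        raise ValueError(f"master copy of {master.name} fails its checksum")
    replica_dir.mkdir(parents=True, exist_ok=True)
    target = replica_dir / master.name
    shutil.copy2(master, target)
    # Check the copy so a transfer error is caught immediately.
    if sha256_of(target) != expected_sha256:
        target.unlink()
        raise ValueError(f"replica of {master.name} fails its checksum")
    return target
```

Because the replication is asynchronous, a failed check can simply leave the object queued for a later retry or for manual repair from the other site.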

Page 21: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Using Ceph as the underlying storage system

Provides POSIX, S3 and block access

Using S3 – potential to move to commercial cloud

Tiered storage and multi-site features

Erasure coding to reduce raw storage needs

Ceph features
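
The saving from erasure coding is simple arithmetic. With n-way replication, raw capacity is n times the usable data; with a k+m erasure code (k data chunks, m coding chunks), it is (k+m)/k times the usable data. The figures below are illustrative assumptions, not DRI's actual configuration.

```python
def raw_needed_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored `copies` times."""
    return usable_tb * copies


def raw_needed_erasure(usable_tb: float, k: int, m: int) -> float:
    """Raw capacity needed with k data chunks plus m coding chunks per object."""
    return usable_tb * (k + m) / k


if __name__ == "__main__":
    # Hypothetical example: 100 TB of usable data.
    print(raw_needed_replication(100, 3))  # 3 copies
    print(raw_needed_erasure(100, 4, 2))   # 4+2 code, tolerates 2 lost chunks
```

In this example both layouts survive the loss of two chunks or copies, but the 4+2 code needs half the raw storage of three-way replication.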

Page 22: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Groups of objects bundled using the BagIt format

Checksums built into the format for error detection

Useful for bulk transport of objects

Potential integration with DARIAH storage testbed

Data representation
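
To show how checksums are built into the format, here is a minimal sketch that writes and validates a bag using only the Python standard library. It follows the basic BagIt layout (a `bagit.txt` declaration, a `data/` payload directory and a `manifest-sha256.txt` file) but handles only flat file names and omits optional tag files; the function names are hypothetical, and a production system would more likely use an existing BagIt library.

```python
import hashlib
from pathlib import Path
from typing import Dict


def make_bag(bag_dir: Path, payload: Dict[str, bytes]) -> None:
    """Write payload files under data/ plus a SHA-256 manifest and declaration."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        lines.append(f"{hashlib.sha256(content).hexdigest()}  data/{name}")
    (bag_dir / "manifest-sha256.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )


def validate_bag(bag_dir: Path) -> bool:
    """Return True only if every file listed in the manifest matches its checksum."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        expected, relative_path = line.split(maxsplit=1)
        payload_file = bag_dir / relative_path
        if not payload_file.is_file():
            return False
        if hashlib.sha256(payload_file.read_bytes()).hexdigest() != expected:
            return False
    return True
```

Running `validate_bag` after a bulk transfer detects both missing files and silent corruption, which is what makes the format convenient for moving groups of objects between sites.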

Page 23: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

USER ACCESS

Page 24: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

Primarily through the Blacklight search interface

Other routes

- Curated collections and virtual galleries
- Georeferenced data: mapping
- Temporal data: timelines
- User-defined collections
- DOI references in papers

User Access

Page 25: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost
Page 26: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

DRI Presentation

Page 27: Information Preservation and Access at the Digital Repository of Ireland - Dermot Frost

- http://projectblacklight.org/
- http://projecthydra.org/
- http://www.fedora-commons.org/
- http://opennebula.org/
- http://lucene.apache.org/solr/
- http://www.ceph.com/
- http://tools.ietf.org/html/draft-kunze-bagit
- http://www.dri.ie
- http://apps.dri.ie/locationLODer/