Dr Sandra Collins Director, Digital Repository of Ireland Royal Irish Academy

Sandra Collins - Building a linked data based content discovery service for the RTÉ Archives

Presentation at WMPA2014, the 1st Winter School on Multimedia Processing and Applications, Dublin, Ireland, January 6-8, 2014. Co-located with MMM 2014, the 20th Anniversary International Conference on MultiMedia Modeling, Trinity College Dublin.

Page 1

Dr Sandra Collins, Director, Digital Repository of Ireland, Royal Irish Academy

Page 2

Mission

DRI is a trusted digital repository for Humanities and Social Sciences Data

- linking and preserving the rich data held by Irish institutions, with a central internet access point

- Our Cultural & Social Heritage

Page 3

DRI Platform

Diagram labels: Access; Preservation; Federated Archives; Storage; Discovery; App, App, App; Linked Logainm

Page 4

Interviews

National Practice Survey

National Steering Committee

National Guidelines

Government adoption

www.oaireland.ie

Growing Digital Preservation & Access Policy

Page 5

Metadata

Page 6

Formats

Page 7

Digital Preservation

Data citation, Permanent IDs

Metrics, funding, allowable costs, training

Sustained e-infrastructure

Copyright, IPR, licensing, data protection

Open metadata, open access

Research Data Alliance 2014

Policy, Services, Systems → Practice

Global Good Data Practice

Page 8

Open source components, custom code engineering

Repository

Page 9

OAIS Model

Page 10
Page 11

Objects ingested into Fedora Commons

Use the Solrizer gem to create the Solr index

Object metadata all CC0

Search will return metadata on all records

Authorization system will restrict access to the objects

Multi-lingual data (English and Irish at the moment)

Indices for each language

Search setup
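
The search setup above can be sketched in a few lines. This is a minimal in-memory illustration in Python, not the Fedora Commons, Solr and Solrizer stack named on the slide; the `Record` and `MetadataIndex` classes, the record fields, and the `can_access` rule are all hypothetical. It shows one index per language, a search that returns metadata for every matching record, and a separate authorization check that gates the object itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical repository record: open metadata plus a protected object."""
    record_id: str
    titles: dict       # language code -> title, e.g. {"en": ..., "ga": ...}
    restricted: bool   # True if the object itself needs authorization


class MetadataIndex:
    """One inverted index per language (an illustrative stand-in for Solr)."""

    def __init__(self):
        # language -> token -> set of record ids
        self._indices = defaultdict(lambda: defaultdict(set))
        self._records = {}

    def add(self, record):
        self._records[record.record_id] = record
        for lang, title in record.titles.items():
            for token in title.lower().split():
                self._indices[lang][token].add(record.record_id)

    def search(self, term, lang):
        """Return metadata for all matching records, restricted or not."""
        ids = self._indices[lang].get(term.lower(), set())
        return sorted((rid, self._records[rid].titles[lang]) for rid in ids)

    def can_access(self, record_id, user_is_authorized):
        """Object access is decided separately from metadata search."""
        record = self._records[record_id]
        return user_is_authorized or not record.restricted


index = MetadataIndex()
index.add(Record("r1", {"en": "Music and dance", "ga": "Ceol agus damhsa"}, restricted=False))
index.add(Record("r2", {"en": "Music and history", "ga": "Ceol agus stair"}, restricted=True))

hits = index.search("music", "en")          # both records appear in results
open_to_public = index.can_access("r2", user_is_authorized=False)
```

Keeping the access decision out of `search` mirrors the slide's split: metadata stays discoverable for every record, while the object behind a restricted record is only released to an authorized user.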

Page 12

Primarily through the Blacklight search interface

Other routes

- Curated collections and virtual galleries
- Georeferenced data – mapping
- Temporal data – timelines
- User defined collections
- DOI references in papers

User Access

Page 13
Page 14

DRI Presentation

Page 15
Page 16

200 RESEARCHERS

74M EURO

30 PARTNERS

40 INVESTIGATORS

8 INSTITUTIONS

1 CENTRE!

Page 17

7X RESEARCH STRANDS

Personal Sensing

Semantic Web

Linked Data

Decision Analytics

Reasoning

Media Analytics

Recommender Systems

Work Packages

Page 18

Linked data based discovery platform

Across multiple RTÉ Archives, media formats

Enhanced data discovery and delivery

Enhanced workflows, digital practices, tools

Digital Preservation, discovery and access

Goal of Archive Discovery Project

Page 19

Two Key Ingredients

1. RDF (Resource Description Framework)

- Graph based data: nodes and arcs
- Identifies objects (URIs)
- Interlinks information (relationships)

2. Vocabularies (Ontologies)

- provide shared understanding of a domain
- organise knowledge in a machine-comprehensible way
- give an exploitable meaning to the data
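
The two ingredients can be illustrated with a small sketch, assuming plain Python tuples rather than any particular RDF library. The `http://example.org/archive/` namespace, the resources under it, and the literal values are hypothetical; the Dublin Core terms and FOAF namespaces stand in for the shared vocabularies. Each triple is one arc between two nodes, and following arcs that share a URI is what interlinks the records.

```python
# Shared vocabularies: Dublin Core terms and FOAF are real, published namespaces.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/archive/"

# Each triple is one arc in the graph: (subject node, predicate, object node).
triples = {
    (EX + "programme/1", DCTERMS + "title", "Evening news bulletin"),
    (EX + "programme/1", DCTERMS + "creator", EX + "person/7"),
    (EX + "photo/42", DCTERMS + "title", "Studio portrait"),
    (EX + "photo/42", DCTERMS + "creator", EX + "person/7"),
    (EX + "person/7", FOAF + "name", "A. Example"),
}


def objects(graph, subject, predicate):
    """All nodes reached from `subject` along arcs labelled `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def subjects(graph, predicate, obj):
    """All nodes that point at `obj` along arcs labelled `predicate`."""
    return {s for s, p, o in graph if p == predicate and o == obj}


# Discovery by following links: what else did the creator of programme/1 make?
related = set()
for creator in objects(triples, EX + "programme/1", DCTERMS + "creator"):
    related |= subjects(triples, DCTERMS + "creator", creator)
related.discard(EX + "programme/1")
```

Because the programme and the photograph name their creator with the same URI and the same vocabulary term, the link between them is found without either record mentioning the other.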

Page 20

Linked Open Data cloud

Over 200 open data sets with more than 26 billion facts, interlinked by 400 million typed links, doubling every 10 months!

Diagram legend: Media, Government, Geo, Publications, User-generated, Life sciences, Cross-domain

Labelled data sets: UK government, BBC, LinkedGeoData

Page 21

1. Data & System survey

2. Architecture Specification

3. Selection of Pilot Data

4. Retrieval and transformation of data

5. Setup & integration of the metadata repository

6. Metadata enhancement

7. Implementation of content discovery

8. Classification & evaluation of discovery content

9. Demonstrator

10. Performance KPIs

11. Enhanced workflows for content processing

Development

Page 22

Schematic Overview

Complex systems, customised software, grown and adapted over time and use

Page 23

Authorisation

Roles (table columns): Public user, Academic researcher, RTÉ researcher, RTÉ journalist, RTÉ Archivist, RTÉ Archives administrator

Search ✔ ✔ ✔ ✔ ✔ ✔

View ✔ ✔ ✔ ✔ ✔ ✔

Amend ✔ ✔ ✔ ✔

Create ✔ ✔

Delete ✔

User mgmt. ✔
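
The authorisation table can be expressed as a small permission matrix. The sketch below is illustrative Python, not the project's implementation. It rests on one loud assumption: the flattened table does not record which columns the ticks sit in, so the code assumes that each row's ticks belong to the right-most roles, meaning the most privileged ones. The names `ROLES`, `TICKS`, `PERMISSIONS` and `is_allowed` are hypothetical.

```python
# Roles in table-column order, least to most privileged.
ROLES = [
    "Public user",
    "Academic researcher",
    "RTÉ researcher",
    "RTÉ journalist",
    "RTÉ Archivist",
    "RTÉ Archives administrator",
]

# Number of ticks in each row of the table.
TICKS = {
    "Search": 6,
    "View": 6,
    "Amend": 4,
    "Create": 2,
    "Delete": 1,
    "User mgmt.": 1,
}

# ASSUMPTION: a row with n ticks grants the action to the last n roles.
# The source table does not confirm which columns are ticked.
PERMISSIONS = {
    action: set(ROLES[len(ROLES) - n:]) for action, n in TICKS.items()
}


def is_allowed(role, action):
    """True if `role` may perform `action`; unknown actions are denied."""
    return role in PERMISSIONS.get(action, set())


public_can_search = is_allowed("Public user", "Search")
public_can_amend = is_allowed("Public user", "Amend")
```

Denying unknown actions by default keeps the sketch conservative: a new action grants nothing until a row is added for it.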

Page 24

OUTCOMES FOR RTÉ

Page 25

OUTCOMES FOR RTÉ

Pilot discovery platform for all media formats

Enabling cross-Institutional, cross-collection curation