Towards Computational Research Objects David De Roure Indianapolis Edition


DESCRIPTION

"Towards a Science of Reproducible Science?" DPRMA Workshop talk at JCDL 2013, Indianapolis, 25th July 2013. Workshop website is http://dprma.oerc.ox.ac.uk/ Paper is David De Roure. 2013. Towards computational research objects. In Proceedings of the 1st International Workshop on Digital Preservation of Research Methods and Artefacts (DPRMA '13). ACM, New York, NY, USA, 16-19. DOI=10.1145/2499583.2499590 http://doi.acm.org/10.1145/2499583.2499590


Page 1: Towards Computational Research Objects

Towards Computational Research Objects

David De Roure

Indianapolis Edition

Page 2: Towards Computational Research Objects

1. A Brief History of Research Objects

2. The motivation for Computational Research Objects

3. (A small illustration)

Page 3: Towards Computational Research Objects

http://www.myexperiment.org/

Packs

Page 4: Towards Computational Research Objects

In contrast to photo-sharing on Flickr or videos on YouTube, the basic unit of sharing in myExperiment is not a single file but rather a package of components that make up an experiment - what we call an Encapsulated myExperiment Object (EMO), and others have called Reproducible Research Objects. Notionally an EMO is a folder containing the various assets associated with an experiment. In the scientific context there are stringent requirements with respect to versioning, ownership, intellectual property and the maintenance of provenance information. We have looked at emerging practice in sharing “pieces of science” in the scientific and scholarly lifecycle, from social sites to digital repositories. myExperiment provides simple and extensible support to better understand requirements as new collaborative practice emerges. In this presentation, we will describe the characteristics of EMOs and present our initial design solution which supports the requirements of encapsulation and preserves our principles of simplicity and interoperability.

Sharing Digital Science

David De Roure, University of Southampton; Carole Goble, University of Manchester

EMOs

Page 5: Towards Computational Research Objects

Iain Buchan

Research Objects

Page 6: Towards Computational Research Objects

[Figure: Paul's Pack / Paul's Research Object. Components: Workflow 16, Workflow 13, QTL, Common pathways, Results, Logs, Metadata, Paper, Slides. Relationship labels: feeds into, produces, included in, published in.]

Page 7: Towards Computational Research Objects

http://www.openarchives.org/ore/terms/aggregates

http://eprints.ecs.soton.ac.uk/id/eprint/20817

OAI-ORE

Page 8: Towards Computational Research Objects

• Workflow – pack contains a number of workflows

• Presentation - encapsulation of a single presentation

• Collection - a number of things (workflows/presentations/papers)

• Heterogeneous - where the workflows do not appear to have a clear common purpose

• Homogeneous - workflows appear to be designed to work together

• Paper - source for a paper

• Tutorial - tutorial material

• Data - collection of data files

• Derived data - results of workflow

• Benchmark - benchmarking data

• Supplementary - stuff associated with a paper

• Noise - tests, tryouts, rubbish

• Oddity - none of the above

Analysis by Sean Bechhofer

Pack analysis Workflow Centric ROs

Page 9: Towards Computational Research Objects

[Figure: provenance graph. Nodes: Metagenome, Sample, Sequencing, Workflow run, Workflow server, Workflow definition, Results, Alice, Lab technician. Relation labels: used, wasGeneratedBy, wasStartedAt "2012-06-21", wasAssociatedWith, wasInformedBy, wasStartedBy, hadPlan, hadRole.]

https://w3id.org/bundle

Stian Soiland-Reyes

Research Object Bundle

Page 10: Towards Computational Research Objects

Join the W3C Community Group www.w3.org/community/rosc

www.researchobject.org

Page 11: Towards Computational Research Objects

Notifications and automatic re-runs

Machines are users too

Autonomic

Curation

Self-repair

New research?

Page 12: Towards Computational Research Objects

The Executable Thesis

new data

new results

executable thesis

PhD Student

Page 13: Towards Computational Research Objects

A new role for the scientific publisher?

Digital library?

The Executable Journal

A thought experiment…

Page 14: Towards Computational Research Objects

[Figure: Knowledge Infrastructure. Labels: Knowledge Objects, Descriptive layer, Observatories, Annotation.]

Page 15: Towards Computational Research Objects

[Figure: Research Objects and Computational Research Objects. Labels: Workflows, Packs, OAI-ORE, W3C PROV.]

Page 16: Towards Computational Research Objects

• Social Objects, designed to facilitate human interpretation (e.g. containing narratives) and shared as part of a (hybrid) sensemaking network

• Machine Objects, semantically described and programmatically accessible, designed for automation, scale and heterogeneity

• Composable with a distributed computational model, such that a Computational Research Object can itself assemble systems of objects, and these systems may consume and produce Computational Research Objects. We can reason about them.

Computational Research Objects
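As a rough illustration of the three properties above, the Python sketch below models a CRO as a record carrying a narrative for human readers (social), structured metadata (machine), and a computation that consumes a CRO and produces a new one (composable). The class name `CRO`, its fields and the `compose` helper are all hypothetical, invented for this sketch; they are not taken from any Research Object specification or from the prototype interpreter.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class CRO:
    """Hypothetical Computational Research Object (illustrative only)."""
    narrative: str                                            # social: text for human readers
    metadata: Dict[str, Any] = field(default_factory=dict)    # machine: structured description
    value: Any = None                                         # payload, e.g. a result dataset
    step: Optional[Callable[[Any], Any]] = None               # computation this object can perform

    def run(self, inp: "CRO") -> "CRO":
        """Consume one CRO and produce a new CRO recording what was done."""
        if self.step is None:
            raise ValueError("this CRO carries no computation")
        return CRO(
            narrative=f"{self.narrative} applied to: {inp.narrative}",
            metadata={"derived_from": [inp.metadata.get("id")],
                      "by": self.metadata.get("id")},
            value=self.step(inp.value),
        )


def compose(first: CRO, second: CRO) -> CRO:
    """Assemble two computational CROs into one that is itself a CRO."""
    return CRO(
        narrative=f"{first.narrative}, then {second.narrative}",
        metadata={"id": f"{first.metadata.get('id')}+{second.metadata.get('id')}"},
        step=lambda x: second.step(first.step(x)),
    )


# Assumed toy data and steps, purely for illustration.
data = CRO("raw samples", {"id": "d1"}, value=[1, 2, 3])
double = CRO("double each sample", {"id": "s1"}, step=lambda xs: [2 * x for x in xs])
total = CRO("sum the samples", {"id": "s2"}, step=sum)

result = compose(double, total).run(data)
print(result.value)      # 12
print(result.metadata)   # {'derived_from': ['d1'], 'by': 's1+s2'}
```

Because the composed pipeline is an ordinary `CRO`, it can be shared, described and composed again in the same way as its parts.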

Page 17: Towards Computational Research Objects

1. I take a digital audio recording and perform a series of analysis tasks leading to a result dataset

2. The environment captures the history of my analysis in a CRO, with descriptions of the input data, the analysis history (workflow) including software, the output data, and a narrative.

3. Another researcher finds CRO (cited in social media), tests it, runs it with different audio data (capturing as a CRO)

4. A data scientist registers the CRO to be run automatically when new data arrives, and configures a post-process so that they are notified if new results meet criteria

5. This common pattern of installing multiple CROs with a post-processor is captured for reuse

Simplest Scenario
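Steps 4 and 5 of the scenario can be sketched as a higher-order function, shown here in Python. This is only one possible reading of the slide: the names `watch`, `meets_criteria` and `notify`, the two toy audio analyses and the 0.8 threshold are all assumptions made for illustration, and plain functions stand in for CROs.

```python
from typing import Any, Callable, List

Analysis = Callable[[Any], Any]


def watch(analyses: List[Analysis],
          meets_criteria: Callable[[Any], bool],
          notify: Callable[[Any], None]) -> Callable[[Any], List[Any]]:
    """Reusable pattern (step 5): install several analyses with one post-processor.

    Returns a handler to be called whenever new data arrives (step 4).
    """
    def on_new_data(data: Any) -> List[Any]:
        results = [analyse(data) for analyse in analyses]   # automatic re-run
        for r in results:
            if meets_criteria(r):                           # post-process step
                notify(r)
        return results
    return on_new_data


# Hypothetical stand-ins for audio analyses captured as CROs (steps 1 to 3).
def peak_level(samples):
    return max(abs(s) for s in samples)


def mean_level(samples):
    return sum(abs(s) for s in samples) / len(samples)


alerts: List[float] = []
handler = watch([peak_level, mean_level],
                meets_criteria=lambda r: r > 0.8,
                notify=alerts.append)

results = handler([0.1, -0.9, 0.5])   # "new data arrives"
print(alerts)                         # [0.9]
```

Capturing the pattern as a function that returns a handler means the same arrangement can be installed again with different analyses or criteria.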

Page 18: Towards Computational Research Objects

• The simple example takes us quickly to the stage of writing programs which act on CROs

• Isn’t this all a bit Computer Sciencey?

• Yes! But it’s not CS for the sake of CS

• It’s CS for “rigour and openness”

• The idea is to establish Computer Science techniques to be able to help design and validate our future research systems

Towards a Science of Reproducibility?

Page 19: Towards Computational Research Objects

Several Scheme concepts map directly into the CRO model:

1. Closures (as mutable objects and first class functions)

2. Environments

3. Continuations

A prototype RO interpreter has been implemented – here is a simple example based on memoization (or should I say roification…)

(For Lisp hackers)

Page 20: Towards Computational Research Objects

> (define (f x) (analyse x))
> (f 10)
;Value: 100
> (define ro1 (roify f))
> ((ro1 'x) 2)
;Value: 4
> ((ro1 'x) 3)
;Value: 9
> ((ro1 'x) 2)
; precomputed
;Value: 4

> (define foo (ro1 'v))
> (foo)
; confirmed(3) = 9
; confirmed(2) = 4
;Value: #t
> (define (analyse x) (+ x x))
> (foo)
; changed(3) = 6 <> 9
;Value: #f
> (define a (delay ((ro1 'x) 5)))
> (a)
;Value: 10
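For readers less familiar with Scheme, here is a rough Python analogue of the session above. It is a guess at the behaviour the transcript shows, not the prototype interpreter itself: the class `RO` and its methods `x` (execute, reusing recorded results) and `v` (validate by re-running recorded inputs) are hypothetical names mirroring the `'x` and `'v` messages, and the lazy `delay` step is left out.

```python
from typing import Any, Callable, Dict


class RO:
    """Hypothetical Python analogue of the object returned by (roify f)."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn
        self.history: Dict[Any, Any] = {}   # recorded input -> recorded result

    def x(self, arg: Any) -> Any:
        """Execute: reuse a recorded result if one exists, else compute and record."""
        if arg in self.history:
            print("precomputed")
            return self.history[arg]
        self.history[arg] = self.fn(arg)
        return self.history[arg]

    def v(self) -> bool:
        """Validate: re-run recorded inputs, most recent first; stop at the first change."""
        for arg, old in reversed(list(self.history.items())):
            new = self.fn(arg)
            if new != old:
                print(f"changed({arg}) = {new} <> {old}")
                return False
            print(f"confirmed({arg}) = {old}")
        return True


def analyse(x):
    return x * x


def f(x):
    return analyse(x)    # looked up at call time, so a redefinition is visible


ro1 = RO(f)
print(ro1.x(2), ro1.x(3), ro1.x(2))   # prints "precomputed", then: 4 9 4
ok_before = ro1.v()                   # confirmed(3) = 9, confirmed(2) = 4


def analyse(x):                       # the analysis changes underneath the RO
    return x + x


ok_after = ro1.v()                    # changed(3) = 6 <> 9
print(ok_before, ok_after)            # True False
```

The recorded history is what lets the object check its own reproducibility: validation fails as soon as the underlying analysis no longer reproduces a stored result.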

Page 21: Towards Computational Research Objects

1. Next steps? Develop more scenarios – including scale, validation, design

2. Higher order functions, e.g. capturing common patterns, seem to be expressive compared to normal workflow mechanics

3. The RO interpreter in Scheme is proof of concept… but actually it could be made operational

4. If nothing else this is a simulation of the/a future and may provide insights

5. Social machines and human computation research involves computational-style descriptions of processes involving humans – exploring in SOCIAM and Smart Society projects

Closing thoughts

Page 22: Towards Computational Research Objects

[email protected]/people/dder
www.scilogs.com/eresearch
@dder

Thanks to Iain Buchan, Sean Bechhofer, Carole Goble and all my colleagues in myExperiment, Wf4Ever, myGrid and FORCE11. Research supported in part by Wf4Ever (FP7-ICT ICT-2009.4 project 270192). Some of these ideas were first presented at the Microsoft e-Science Workshop, Stockholm, December 2011.