30
David De Roure University of Southampton e-Science is about Scientists too The Evolution of the Grid and the Web OGF Semantic Grid Research Group www.semanticgrid.org eResearch Australasia 2007 7/31/2007 | | Slide 2 This is the story of the Semantic Grid It’s a Tale of Two Infrastructures – the Grid and the Web – both evolving And the most important system of all – the scientists, doing science Introduction 26/6/2007| | Slide 2

e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

David De RoureUniversity of Southampton

e-Science is about Scientists tooThe Evolution of the Grid and the Web

OGF Semantic Grid Research Group

www.semanticgrid.org

eResearch Australasia 2007 7/31/2007 | | Slide 2

This is the story of the Semantic Grid

It’s a Tale of Two Infrastructures – the Grid and the Web – both evolving

And the most important system of all – the scientists, doing science

Introduction

26/6/2007| | Slide 2

Page 2: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 7/31/2007 | | Slide 3

1. The search for the missing link DataServicesPeople

2.Evolution of the Web

3.“Ending the Tyranny of the Grid”(or rather, ending the accidental tyranny of the grid mindset)

Overview

26/6/2007| | Slide 3

eResearch Australasia 2007 7/31/2007 | | Slide 4

The Semantic Grid Report 2001

John Taylor

There are a number of grid applications being developed and there is a whole raft of computer technologies that provide fragments of the necessary functionality. However there is currently a major gap between these endeavours and the vision of e- Science in which there is a high degree of easy- to- use and seamless automation and in which there are flexible collaborations and computations on a global scale.

Us

e- Science is about global collaboration in key areas of science and the next generation of infrastructure that will enable it

Page 3: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Missing Link

Grid InfrastructureGrid Infrastructure7/31/2007 | | Slide 5

ScientistsScientists

eResearch Australasia 2007

Why Semantic Web?

Huge potential for Science– making data reusable,

interlinked– making connections

between decoupled content

– generating new intelligence

Automation requires machine-processable descriptionsGrid community talking about metadata and knowledge

7/31/2007 | | Slide 6

Grid Computing

The Semantic

Web

The Semantic

Grid

Web Services

Page 4: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 7/31/2007 | | Slide 7

Building bridges

RDF…RDF…RDF…

eResearch Australasia 2007 23/3/2007 | Semantic Web in e-Science | Slide 8

The Semantic Web layer cakeW3

C

26/6/2007| | Slide 8

Page 5: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 7/31/2007 | | Slide 9

Semantic Grid = Grid + Semantic Web for e- Science

Scale of data and computation

Sca

le o

f In

tero

pera

bilit

y SemanticWeb

ClassicalWeb

SemanticGrid

ClassicalGrid

Based on an idea by Norman Paton

eResearch Australasia 2007 7/31/2007 | | Slide 10

Semantic Grid

The Semantic Web is an extension of the current Web in which information and services are given well-defined meaning, better enabling computers and people to work in cooperation

www.semanticgrid.org

Grid

Grid

Free the data!Free the services!

Free the people!

Page 6: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 7/31/2007 | | Slide 11

My Chemistry Experiment

Box of Chemists

eResearch Australasia 2007 26/2/2007 | myExperiment | Slide 12

X-Raye-Lab

Analysis

Properties

Propertiese-Lab

SimulationVideo

Diff

ract

omet

er

Grid Middleware

StructuresDatabase

CombeChem pilot project

www.combechem.org

Page 7: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

Learning & Teaching workflows

Research & e-Science workflows

Aggregator services: national, commercial

Repositories : institutional, e-prints, subject, data, learning objects

Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules

Harvestingmetadata

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Resource discovery, linking, embedding

Deposit / self-archiving

Peer-reviewed publications: journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Resource discovery, linking, embedding

Deposit / self-archiving

Learning object creation, re-use

Searching , harvesting, embedding

Quality assurance bodies

Validation

Presentation services: subject, media-specific, data, commercial portals

Resource discovery, linking, embedding

The scholarly knowledge cycle.Liz Lyon, Ariadne, July 2003.

This work is licensed under a Creative Commons LicenseAttribution-ShareAlike 2.0

© Liz Lyon (UKOLN, University of Bath), 2003

Reducing Time-to-Discovery by Reducing Time-to-Experiment

eResearch Australasia 2007 7/31/2007 | | Slide 14

The key observation!

“Publication at Source” describes the need to capture data and its context from the outset and maintain a complete end-to-end connection between the laboratory bench and the intellectual chemical knowledge that is published as a result of the investigation.

e- Science = Record and Reuse. Reuse needs provenance

The details of the origins of data are just as important to understanding as their actual values

Page 8: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 26/2/2007 | myExperiment | Slide 15

eResearch Australasia 2007 26/2/2007 | myExperiment | Slide 16

1 1 2 2 1 3 1 4

Sample of 4-flourinatedbiphenyl

Add CoolReflux

Butanone Sample ofK2CO3Powder

Weigh

grammes0.9031

Measure

40 ml

Add

Weigh

2.0719 g

text

3 5

Add

g

Sample ofBr11OCB

2 6

Reflux

2 7

Cool

Water

Measure

30 ml

9

Liquid-liquid

extraction

DCM

Measure

3 of 40 ml

10

Dry

MgSO4

11

Filter(Buchner)

12RemoveSolvent

by RotaryEvaporation

13

Fuse

Silica

14Column

Chromatography

Ether/PetrolRatio

Butanone dried via silica column andmeasured into 100ml RB flask.

Used 1ml extra solvent to wash outcontainer.

Started reflux at 13.30. (Had tochange heater stirrer) Only reflux

for 45min, next step 14:15.

Inorganics dissolve 2layers. Added brine

~20ml.

Organics are yellowsolution

Washed MgSO4 withDCM ~ 50ml

Measure

excess

Observation Types

weight - grammes

measure - ml, drops

annotate - text

temperature - K, °C

Key

Process

Input

Literal

Observation

Add CoolRefluxAddAdd Reflux Cool Dry Filter Remove

Solventby Rotary

Evaporation

Fuse ColumnChromatography

Dissolve 4-flourinatedbiphenyl inbutanone

Add K2CO3powder

Heat at refluxfor 1.5 hours

Cool and addBr11OCB

Heat atreflux untilcompletion

Cool and addwater (30ml)

Combine organics,dry over MgSO4 &filter

Removesolvent invacuo

Liquid-liquid

extraction

Extract withDCM(3x40ml)

Fuse compound to silica &column in ether/petrol

4 8

Add

Add

text

Annotate

Annotate

text

Weigh

Annotate

g

Annotate Annotate

text text

Future Questions

Whether to have many subclasses of processes or fewer with annotations

How to depict destructive processes

How to depict taking lots of samples

What is the observation/process boundary? e.g. MRI scan

1.5918

Combechem

30 January 2004gvh, hrm, gms

Ingredient List

Fluorinated biphenyl 0.9 gBr11OCB 1.59 gPotassium Carbonate 2.07 gButanone 40 ml

image

To

Do

Lis

tP

lan

Pro

ce

ssR

ec

ord

The RDF Graph

26/6/2007| | Slide 16

Page 9: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 26/6/2007| | Slide 17

eResearch Australasia 2007

CombeChem Principles

It’s a Semantic DataGrid

Linking decoupled data “In the Wild” (3rd party sources)

Publish don’t warehouse

Think Holistic – we’re working in the context of the Scholarly Knowledge Cycle

Power of Provenance

A little Semantics goes a long way

Empowerment versus requirements capture

26/6/2007| | Slide 18

Page 10: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

E. Science laboris

Workflows are the new rock and roll.

Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources.

The era of Service Oriented Applications

Repetitive and mundane boring stuff made easier.

The challenge for biology is complexity and heterogeneity, not so much compute.

eResearch Australasia 2007

Kepler

Triana

BPEL

Ptolemy II

Scientific memesAccompany their published outcomes

400 Taverna workflows in the Web Cloud

Taverna

Page 11: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

34,632 sourceforge downloads.Ranked 210 in sourceforge activity (06 June 07)

1,500 per month, 200+ sitesUsers throughout UK, USA, Europe, and SE AsiaSystems biologyProteomicsGene/protein annotationMicroarray data analysisMedical image analysisHeart simulationsHigh throughput screeningPhenotypical studiesPhylogenyText miningPlants, Mouse, HumanAstronomy

Taverna

eResearch Australasia 2007

Recycling, Reuse, Repurposing

Paul meets Jo

Trypanosomiasis cattle workflow reused without change

Identified the biological pathways involved in sex dependence in the mouse model, previously believed to be involved in the ability of mice to expel the parasite

Previously a manual two year study, by Jo, of candidate genes had failed to do this

26/6/2007| | Slide 22

Page 12: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

myGrid Principles

Reuse, reuse and reuse – workflows transcend their application

In the Wild – Integrating 3rd party services

A little Semantics goes a long way

Power of Provenance

Users add value“Come As You Are”. Your desktop app. Understand the rewards system of stakeholders– Jam Today and more / better Jam Tomorrow

26/6/2007| | Slide 23

eResearch Australasia 2007 23/3/2007 | Semantic Web in e-Science | Slide 24

agentslogic

grids

semantic web

grids

semantic web

applications

applications

hybrid

p2p

hybrid

GGF5 Semantic Grid BOFEdinburgh, July 2002

GGF9 Semantic Grid WorkshopChicago, October 2003

GGF11 Semantic Grid Applications WorkshopHawaii, June 2004

Dagstuhl SeminarJuly 2005

GGF16 3rd GGF Semantic Grid WorkshopAthens, February 2006

OGF19 Web 2.0 and the Grid WorkshopNorth Carolina, January 2007

Page 13: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 25

WaveWave 2 2 –– start 2006start 2006

Degree

DataminingGrid

data, knowledge, data, knowledge, semanticssemantics

OntoGrid

InteliGridK-WF Grid

Chemomen tum

A-Ware Sorma

platforms, user platforms, user environmentsenvironments

CoreGRIDvirtual laboratories

UniGrids HPC4U

g-Eclipse

Gredia

GridComp

QosCosGrid

Grid4all

Provenance

AssessGridGridTrust

trust, securitytrust, security

Grid services, Grid services, business modelsbusiness models

ArguGrid Edutain@ Grid

GridEconGridCoord

Nessi-GridChallengers

NextGRIDservice

architecture

Akogrimomobile services

BREINagents & semantics

BeinGridbusiness

experiments

supporting the Grid communitysupporting the Grid community

SIMDATindustrial

simulations

XtreemOS

Linux based Grid

operating system

BeinGridbusiness

experiments

KnowArc

EC-GinBridge

Grid@Asia EchoGrid

international cooperationinternational cooperation

Specific support action

Integrated project

Network of excellence

Specific targeted research projectWave 1 Wave 1 –– start 2004start 2004

EU Funding: 130 M€

Grid Research Projects under FP6Grid Research Projects under FP6

eResearch Australasia 2007 ©

Grid Ontology

Page 14: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

Semantic OGSA

• Semantic Grid Reference Architecture

• Mixed ecosystem of Grid and Semantic Grid services

• Services ignorant of bindings

• Services binding aware but unable to process them

• Services binding aware and capable of processing

• Everything is OGSA compliant27

Optimization

Execution Management

Resourcemanagement

Data

Security

Information Management

Infrastructure Services

Application 1 Application N

Semantic Services

eResearch Australasia 2007

A utility is a directly and immediately useable service with established functionality, performance and dependability

Services are knowledge-assisted (‘semantic’) to facilitate automation and advanced functionality

Service- Oriented Knowledge UtilityNGG3

The architecture comprisesservices which may be instantiated and assembled dynamically, hence the structure, behaviour and location of software is changing at run-time

26/6/2007| | Slide 28

Page 15: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Transformative Application- to

enhance discovery & learning

R&D to enhance technical and social dimensions of future CI

systems

Provisioning -Creation, deployment

and operation of advanced CI

Dan Atkins

26/6/2007| | Slide 29

eResearch Australasia 2007

www.smarttea.org

CombeChem Smart Tea

26/6/2007| | Slide 30

Page 16: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Bioinformatics is not Chemistry

There are many pieces, from many boxes, but no box, and no lid with a complete picture of what the puzzle is supposed to be.

Planning? No.Metadata an afterthought

Carole Goble

26/6/2007| | Slide 31

eResearch Australasia 2007

Key collective activities in e- science

interpretation of data/events

following through decisions/coordinating activities

producing documents& other artifacts

archiving/recovering information

informal and formalcommunication

meetings

http://www.aktors.org/coakting/

26/6/2007| | Slide 32

Page 17: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

How can we weave together distributed discourse and documents?

Add notational and hypermedia structure to discussions & documents

Add notational and hypermedia structure to discussions & documents

Add indices to events in meetings so the meetings themselves become indexed documents

Add indices to events in meetings so the meetings themselves become indexed documents

Make meetings persistent and replayable

Make meetings persistent and replayable

Mike Daw

26/6/2007| | Slide 33

eResearch Australasia 2007 7/31/2007 | | Slide 34

Workshop 1: Kyra Norman and Orchestra Cube

Angela Piccini

performativity, place, space

26/6/2007| | Slide 34

Page 18: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

The Collaborative Semantic Grid, CTS 2006

Stephen Downie

Annotation to associate features and “ground truth” to objects and to share in the community

26/6/2007| | Slide 35

eResearch Australasia 2007

People Principles

e- Science as sense- making

Supporting formal and informal scientific process

Collaboration over artefacts

Scaling up from project to community

26/6/2007| | Slide 36

Page 19: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 7/31/2007 | | Slide 37

Evolution

Time

eResearch Australasia 2007

Web 2.0 Design Patterns

http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

1. The Long Tail

2. Data is the Next Intel Inside

3. Users Add Value

4. Network Effects by Default

5. Some Rights Reserved

6. The Perpetual Beta

7. Cooperate, Don't Control

8. Software Above the Level of a Single Device

26/6/2007| | Slide 38

Page 20: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007 7/31/2007 | | Slide 39

Joint Information Systems Committee 7/31/2007 | | Slide 40

Geoffrey Fox

Page 21: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

7/31/2007 | | Slide 41

Too sophisticated for its own good?

Page 22: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Provide an advanced infrastructure to enable researchers to do exciting new things

Middleware hides the complexity of underlying systems, and this needs standards

Layered model – success is invisible infrastructure

The Grid Mindset

26/6/2007| | Slide 43

eResearch Australasia 2007

Overengineering of standards

Assumption that users will come

Divorces computation from content provision

Service provider mentality

– users seen as consumers of services not producers of value

NB These oppose the characteristics of Web 2.0

When Grids go bad

26/6/2007| | Slide 44

Page 23: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

The Grid Problem

The Web 2.0 community decided Web Services are too complicated so they use HTTP instead.

The Grid community decided Web Services aren’t complicated enough so they invented OGSA.

Matthew Dovey

26/6/2007| | Slide 45

S M NTICW B CLUB∃ ∀∃

You are a member of the

Page 24: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Semantic Web is not a quick win

– Learning curve for concepts and tools

Return on Investment more visible “in the large”

– Need to bootstrap to get value

When Web goes bad

26/6/2007| | Slide 47

eResearch Australasia 2007

Missing Link

Grid InfrastructureGrid Infrastructure

ScientistsScientistsReally simple interfaces to computation,

storage, content and annotationutilities

Developers

Ecosystem

26/6/2007| | Slide 48

Page 25: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Virtual Research Environments

VRE 1Technology-focused

Experimental

Diverse design & development approaches

Stand-alone solutions

VRE 2User- & research practice-focusedDevelopmental Unified design & development approachesIntegrated solutions

CollaborationSupporting small & large-scale researchSupport for single-disciplinary and multi-disciplinary research

26/6/2007| | Slide 49

eResearch Australasia 2007

Virtual Research Environment Approach

Contextual Analysis

IntegrationDesign

Building

Testing

User Needs Analysis

Changeanalysis

Pilots

System analysis

Other Program

mes &

Initiatives Pr

ogra

mm

e Ev

alua

tion

Prog

ram

me

Eval

uatio

n

e-R

esea

rch

Inte

rope

rabi

lity

e-R

esea

rch

Inte

rope

rabi

lity

developers

users

Com

mun

ity

Enga

gem

ent

Com

mun

ity

Enga

gem

ent

26/6/2007| | Slide 50

Page 26: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

David De Roure

[email protected]

Carole Goble

[email protected]/6/2007| | Slide 51

eResearch Australasia 2007 26/6/2007| | Slide 52

Page 27: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Warehouse or Federation

Community web site, and Distributed stores

Multiple myExperiments

Mixed identity regimes: an identity authority

Publish what I want when I want within the group I want

Open Archives Initiativehttp://www.openarchives.org/

Leveraging from the CombeChem project http://www.combechem.org/

cloud

enterprise

personal

laboratory

project

26/6/2007| | Slide 54

Page 28: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Cooperate don’t control

Bring functionality to the user –don’t force them to come to you

26/6/2007| | Slide 55

eResearch Australasia 2007

The Future

A mixture of Grid, Semantic Web and Web 2.0

The Grid is about linking things up so that people can do new stuff, so we need to empower people to do functionality mashups

Use Semantic Web technologies (RDF and Ontologies) to assist

– Mashing up of data

– Finding and using services

– Empowering people

– Working with live feeds, ...

The Grid community can learn from Web 2.0 in terms of how developers and users engage with the new capabilities

– bring new functionality to the users rather than expecting them to come to it, and enable them to participate

– Web 2.0 is compatible with Grid in that it requires robust services underlying it

26/6/2007| | Slide 56

Page 29: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Semantic Grid – metadata management and automation through annotation – is needed more and more to work with decoupled, disconnected, distant resources In the Wilde- is for Empowering Scientists not just Enabling Science

– Think vertically as well as horizontally! – Rise above the pieces, harness the new capabilities,

create an ecosystem of participation

Messages

26/6/2007| | Slide 57

eResearch Australasia 2007

semanticgrid.org

David De Roure

[email protected]

Carole Goble

Geoffrey Fox

Marlon Pierce

Semantic Grid Research Group

See the Call for Participation for the OGF21 Grid and Web 2.0 WorkshopOGF21 Seattle, October 15-19, 2007

26/6/2007| | Slide 58

Page 30: e-Science is about Scientists too - Open Research: Home · CombeChem pilot project . Learning & Teaching workflows Research & e-Science workflows ... experiments, Grids, fieldwork,

eResearch Australasia 2007

Credits, Links and Contacts

Slides

– Stephen Downie, Liz Lyon, Geoffrey Fox, Jeremy Frey, Carole Goble, Angela Piccini , Savas

Teams

– CombeChem

– myGrid

– myExperiment

– CoAKTinG/Memetic

David De Roure

[email protected]

Carole Goble

[email protected]

26/6/2007| | Slide 59