Lessons learnt building OGSA-DAI
EGC 2005
Malcolm Atkinson, Director
www.nesc.ac.uk
15th February 2005


Contents

Why invest in shared software
– Facilitating Applications
– Facilitating Production use
– Improving code quality

The Data Bonanza
OGSA-DAI
International Collaboration

Foundation for economic high-quality e-Infrastructure

Summary & Take Home Message


Eternal Triangle

Applications, Operations and Developers all want reliability, dependability, security, performance and long-term stability.

How do you balance innovation and safe engineering?


Many e-Infrastructures
Mix & Match model

Prefer just one e-Infrastructure
Testing & maintenance limit innovation

One e-Infrastructure
Stable with good management tools

Compromise: One e-Infrastructure – Select services & libraries

Agreed & Simple APIs / mappings wrap all common functions


Generating & Storing Data gets Easier

More, faster & cheaper digital devices
– Higher resolution
– Greater deployment
– Faster duty cycles
– Do they produce required metadata?

Larger, faster & cheaper storage technology
– Economic to store (multiple copies of) primary data
– Economic to store derived data
– Crucial to store sufficient good quality metadata

Digital Communications – higher bandwidth & cheaper
– Practical to access and copy remote data
– Latency not decreasing – get what you need in a few trips


Multi-dimensional Growth

The Number of Data Collections
– Grows rapidly

The Size of each Data Collection
– Grows rapidly

The Complexity of each Data Collection
– Grows rapidly & autonomously

The Interdependencies between Collections
– Grow rapidly

The User communities
– Grow rapidly – dispersed, diverse & mingling


Data Integration is Everything

Motivation
– No business or research team is satisfied with one data resource

Data Curation Expertise: Human centred
Integration: Human centred
Domain-specialist driven

Dynamic specification of combination function
Iterative processes
– Revised request minutes later
– Revised request after months of thought

Sources inevitably heterogeneous
Time-varying content, structure & policies
Robust, stable, steerable integration services
– Higher-level services over multiple resources
– Fundamental requirements for (re)negotiation

Federation or Virtualisation preceding integration, or kit of integration tools to be interwoven with an application?

Scientific Insight needed here

This is the motivation for & home of OGSA-DAI
– Identify the recurrent requirements
– Provide one infrastructure that meets them
– Wide use enables a robust, reliable and supported set of facilities
– Steadily increase power of facilities
– Steadily raise the level of abstraction
– Standardise & achieve multi-national investment


OGSA-DAI Project

OGSA-DAI is one of the Grid Middleware Centre Projects

Collaboration between:
– EPCC
– IBM (+ Oracle in phase 1)
– National e-Science Centre
– Manchester University
– Newcastle University

Project funding:
– OGSA-DAI, 2002-03
  • £3.3 million from the UK Core e-Science funding programme
– DAIT (DAI Two), 2003-06
  • £1.3 million from the UK e-Science Core Programme II

"OGSA-DAI" is a trade mark

Funded by UK’s Department of Trade & Industry + Engineering & Physical Sciences Research Council as part of the e-Science Core Programme

Thanks to Mario Antonioletti for these EPCC slides


Geographically Distributed Team

EPCC Team, Edinburgh

IBM Development Team, Hursley

ESNW, Manchester

Neresc, Newcastle

NeSC, Edinburgh


Communication vital

Web Site

Support Desk

Training

Telephone conferences

Face-to-face meetings

Access Grid meetings

Bugzilla

Twiki

IRC

Email

Mailing Lists


Infrastructure

IBM

EPCC

Test Machines & Databases

Test Machines & Databases

IRC

Email

Mailing Lists

Access Grid

Telephone

NeSC

CVS Repository

Twiki


Basic Operational Model

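[Diagram components: Client; Container hosting the DAISGR (DAI Service Group Registry), GDSF (Grid Data Service Factory) and GDS (Grid Data Service); Data Resource]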


More Complex Behaviour

[Diagram components: Client; two Containers, each with a GDS, a GDT and a Data Resource; a further Data Resource]

– Deliver data back to the client.
– Deliver data to a third party.
– Deliver data to another GDS.

And there's a lot more that you can do …


[Diagram: a Grid Data Service in front of a Data Resource, with the labels Perform Document, Response Document and Result Data]


Predefined Activities

gzipCompression
zipArchive
xslTransform

sqlQueryStatement
sqlStoredProcedure
sqlUpdateStatement
sqlBulkLoadRowset
relationalResourceManager

xPathStatement
xUpdateStatement
xQueryStatement
xmlResourceManagement
xmlCollectionManagement

inputStream
outputStream
DeliverFromURL
DeliverToURL
DeliverToGFTP
DeliverFromGFTP
DeliverToStream
DeliverFromGDT
DeliverToGDT
DeliverToFile
DeliverFromFile

fileWriting
directoryAccess
fileAccess
fileManipulation

Developers encouraged to roll their own – many do
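One way to picture how several activities might be combined in a single request is as a chain in which each activity consumes the output of an earlier one. The sketch below is a toy model only: the `Activity` class, the stream labels and `check_pipeline` are hypothetical names invented for illustration, and none of this is the OGSA-DAI perform-document schema or client API. Only the activity names are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    """Toy stand-in for one activity in a request (hypothetical, not the OGSA-DAI API)."""
    name: str                                         # activity name from the slide
    output: str                                       # made-up label for the stream it produces
    inputs: List[str] = field(default_factory=list)   # labels of the streams it consumes


def check_pipeline(activities):
    """Return the activity names in order, or raise if an input was never produced."""
    produced = set()
    order = []
    for activity in activities:
        missing = [s for s in activity.inputs if s not in produced]
        if missing:
            raise ValueError(f"{activity.name} reads unknown stream(s): {missing}")
        produced.add(activity.output)
        order.append(activity.name)
    return order


# Query, transform, compress, deliver: each step reads the previous step's output.
REQUEST = [
    Activity("sqlQueryStatement", output="rows"),
    Activity("xslTransform", output="report", inputs=["rows"]),
    Activity("gzipCompression", output="archive", inputs=["report"]),
    Activity("DeliverToURL", output="receipt", inputs=["archive"]),
]

if __name__ == "__main__":
    print(" -> ".join(check_pipeline(REQUEST)))
```

The check only verifies that the chain is well formed; a home-grown activity would slot in as one more entry in the list.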


Standardisation is important

http://forge.gridforum.org/projects/dais-wg

[Diagram of standards bodies and related groups or specifications. Labels recovered, in source order: INFOD, GridFTP, DT, TM BoF, ADF BoF, GGF, Arch, Data, ISP, SRM, OGSA, CMM, GIR, GSM, GFS, DAIS, CGS, GRAAP, Policy, IETF, SNMP, Other Standards Bodies, ????, W3C, XQuery, ANSI, SQL, DMTF, CIM, OASIS, WS-DM, WS-RF, WS-N, WSPolicy, WSAddress, OREP, JDBC, JCP, DFDL]


Example Projects Using OGSA-DAI

OGSA-DAI (http://www.ogsadai.org.uk)
AstroGrid (http://www.astrogrid.org/)
BioSimGrid (http://www.biosimgrid.org/)
BioGrid (http://www.biogrid.jp/)
Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
eDiaMoND (http://www.ediamond.ox.ac.uk/)
FirstDig (http://www.epcc.ed.ac.uk/~firstdig/)
GeneGrid (http://www.qub.ac.uk/escience/projects.php#genegrid)
GEON (http://www.geongrid.org/)
IU RGRBench (http://www.cs.indiana.edu/~plale/projects/RGR/OGSA-DAI.html)
myGrid (http://www.mygrid.org.uk/)
N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-80=80)
ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
INWA (http://www.epcc.ed.ac.uk/)


OGSA-DAI User Project classification

OGSA-DAI

Biological Sciences
Physical Sciences
Commercial Applications
Computer Sciences

• FirstDig
• INWA
• Bridges
• AstroGrid
• BioSimGrid
• BioGrid
• eDiaMoND
• myGrid
• ODD-Genes
• N2Grid
• GEON
• MCS
• IU RGRBench
• OGSA-WebDB
• GeneGrid
• GridMiner


OGSA-DAI Downloads

– 690 downloads since May 04
– Actual user downloads, not search engine crawlers
– Does not include downloads as part of GT3.2 releases
– Data from 13 December 04

Total of 966 registered users

Release          Downloads
R1.0 (Jan 03)    107
R1.5 (Feb 03)    110
R2.0 (Apr 03)    254
R2.5 (Jun 03)    294
R3.0 (Jul 03)    792
R3.1 (Feb 04)    655
R4.0 (May 04)    939
R5.0 (Dec 04)    138
Total            3323

Total Downloads

Country          Share
China            25%
United Kingdom   22%
United States    14%
Japan            8%
Unknown          8%
Germany          5%
Italy            3%
Taiwan           2%
Austria          2%


Further Information

The OGSA-DAI Project Site:
– http://www.ogsadai.org.uk

The DAIS-WG site:
– http://forge.gridforum.org/projects/dais-wg/

OGSA-DAI Users Mailing list
– [email protected]
– General discussion on grid DAI matters

Formal support for OGSA-DAI releases
– http://www.ogsadai.org.uk/support
– [email protected]

OGSA-DAI training courses


Platforms & Users

Currently on OGSI (GT3)
– Discontinue support when ~ R6

Currently on WS-I+ (OMII1)
– Will be in next release
– Without asynchronous & third-party data transfers

Currently in Preview on WSRF (GT4)
– Not yet a supported release ~ GT4 release

Users about equally divided
– Some still use R3!

Re-designed architecture
Long list of requested features
Many projects want long-term support commitments


[Architecture diagram. Labels recovered, in source order: DS (Mobius); DS (DAIS); DS (OGSA-DAI); DRAM; initiateDataService( ); Registry; DSDL; Request TADD; DR; Initiates/Manages; 0; n; Id – UUID; performRequest(); Single Service; SessionId – UUID; DID; Type; Format; Txn; Response TADD; Compute & storage resources; Registry2; Local Store; DRs; DRs; Logging Service]


OGSA-DAI & Triangle

Applications, Operations and Developers all want reliability, dependability, security, performance and long-term stability.

One client library
Increasingly important
More abstraction needed

Mostly use client library
Some use protocols
No extra tools yet

No tools or interfaces yet
Motivation for new architecture

Compromise: One e-Infrastructure – Select services & libraries
Develop: Higher-level Client Library & Tools, more Integration, Operations support


OGSA-DAI team needs

Agreed data naming system
– OGSA effort – 3-level: human, abstract & physical address

Addresses of state & data resources
– WS-Addressing

Life Time Management
– WS-RF Resource LifeTime – “imported into OGSA-DAI”

Properties
– WS-Resource Properties – easily implemented look alike

Error reporting
– WS-BaseFaults

Agreed Data Transport Abstraction
– OGSA-Data Design + InfoD meeting at GGF13 Seoul

Most of all we need these standards with APIs – only one of each!


The Ultimate Challenge

Testing
– Large-scale, always on, distributed persistent infrastructure
– Product space of platforms and external components
  • {Oracle, DB2, MySQL, Postgres, …}
  • {Xindice, eXist, …}
  • {files, DFDL, semi-structured, indexed, text-mined, …}
  • {java, J2EE, .NET, …}
  • {OGSI, WS-I+, WSRF, …}
  • {client libraries in: java, C, C++, C#, …}
  • …
– Growing proportion of team effort – though mechanised
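To get a feel for how quickly this product space grows, the sketch below enumerates the combinations using only the values named on the slide. The axis names (`relational_db`, `hosting` and so on) are hypothetical labels chosen for illustration, treating every axis as independent is a simplifying assumption, and each "…" on the slide means the real space is larger than what is counted here.

```python
from itertools import product

# Values copied from the slide; axis names are illustrative labels, not project terms.
AXES = {
    "relational_db": ["Oracle", "DB2", "MySQL", "Postgres"],
    "xml_db": ["Xindice", "eXist"],
    "data_form": ["files", "DFDL", "semi-structured", "indexed", "text-mined"],
    "hosting": ["java", "J2EE", ".NET"],
    "platform": ["OGSI", "WS-I+", "WSRF"],
    "client_library": ["java", "C", "C++", "C#"],
}


def test_matrix(axes):
    """Yield one dict per combination of axis values (the Cartesian product)."""
    names = list(axes)
    for combo in product(*(axes[name] for name in names)):
        yield dict(zip(names, combo))


def matrix_size(axes):
    """Number of combinations: the product of the axis lengths."""
    size = 1
    for values in axes.values():
        size *= len(values)
    return size


if __name__ == "__main__":
    print(matrix_size(AXES), "combinations from the listed values alone")
    print(next(test_matrix(AXES)))
```

Adding one more value to any axis multiplies the whole matrix, which is why a mechanised test harness still consumes a growing share of effort.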

Maintenance
– Fixing bugs (<20%)
– Dealing with context changes (>20%)
– Providing new required functions (~60%)
– Better coding and testing can at best save 20%
– Maintenance is a life sentence
– No remission for good behaviour!

Grows to dominate costs and limit development


To Meet the Challenge 1

Agree an Architecture: OGSA + NextGRID + …
– To partition the problem space
– To raise the level of abstraction & discourse
– To provide a framework for collaboration
  • Environments in which alternative solutions can perform roles
– Incremental progress to agreement
  • Profiles

Invest in APIs
– Protect Application Developer investment
– Protect Middleware subsystem investment
– Clarify requirements
– Specify semantics


To Meet the Challenge 2

Raise the Level of Abstraction
– Greater benefit for Application Developers
– Greater benefit for other Middleware Developers
– Easier comprehension for designers, implementers & exploiters
– Opportunity for implementation improvement increases

Form ≤ 2 International Alliances / Consortia
– Agree on target e-Infrastructure function and properties
– Safety of Open Source – future maintenance always possible
  • But only affordable through alliances
– Agree partitioned R&D task: Country X delivers A and Country Y delivers B
– Incremental development of relationship
  • Compete → partnership → trust → mutual dependence
– Avoid brittle dependency
  • Minimum functionality base platform in which subsystems can work
  • OGSA base profile a good candidate


To Meet the Challenge 3

Desist from Starting from Scratch
– “pencil sharpening” auto-distraction
– Short-term illusion of progress and success
– Long-term – another body of software to maintain
– Division of effort
– Your legacy: Transition and translation problems
– Your legacy: Another body of software to maintain

Darwinian survival of the fittest
– Doesn’t result in “best”
– Expensive & slow

Some diversity and competition valuable
– But don’t let it split users, developers, operations, training, …

Discard ego trips, “nationalism” & excess competition – wasteful and harmful


Observations 1

E-Infrastructure
– Disruptive technology
  • Will change what we do and how we do it
– Opportunity to reap major benefits
  • Is Europe prepared? Education, Education, Education
  • Will we focus effort? Critical mass. Don’t divide & conquer ourselves

Education essential

Must Collaborate Internationally
– To agree, build and operate e-Infrastructure
– To give adequate support to our users
– To afford maintenance and operations
– To facilitate international research, business & decisions


Observations 2 – OGSA-DAI

>30 staff years of effort, Release 5, coming soon R6
3 platforms, >1000 users, world-wide use, diversity
Backed by standards effort – hard work!
User community & User group

Major investment in client-side API
– Beneficial for users, application developers & training
– Provides implementation options
– Undergoing re-design

Flexible and Extensible framework
– Essential for applications
– Contributors build using this – webDB, streaming data, …
– No contributor code shipped with releases yet

Diverse demand for new features
– Diverse & multi-platform
– Looking for reassurance about future support and maintenance

Perhaps 25% of original OGSA-DAI vision built so far


Reserve slides for questions

End of slide show


From OGSA Status and Future, Hiro Kishimoto and Ian Foster, GGF12
Slide originally from Michael Behrens, DISA consultant


[Diagram of OGSA services and supporting specifications, colour-coded by status. Legend: Standard, Evolving, Gap, Hole]

GRID COMPUTING
UTILITY COMPUTING
DISTRIBUTED COMPUTING

Section headings shown: Use Cases & Applications, Core Services, Base Profile

Other labels recovered, in source order: WS-Addressing, Privacy, WS-BaseNotification, CIM/JSIM, WSRF-RAP, WSDM, WS-Security, Naming, OGSA-EMS, OGSA Self Mgmt, GFD-C.16, GGF-UR, Data Model, HTTP(S)/SOAP, Discovery, SAML/XACML, WSDL, WSRF-RL, Trust, WS-DAI, VO Management, Information, Distributed query processing, ASP, Data Centre, Collaboration, Multi Media, Persistent Archive, Data Transport, WSRF-RP, X.509

Provided by David Snelling (Fujitsu) and Mark Linesch (GGF & HP).


Why Invest in OGSArchitecture 4

Integration
Completeness
Abstraction
Cooperation
– OGSA partitions the e-Infrastructure implementation
– Encourages independent concurrent & coordinated
  • Development or evaluation of each part’s standards
  • R&D on implementation of each part
– Promises assembly of the parts
– Basic profile provides context for concurrent R&D
  • Context for each M/W developer to build against
  • Reduced interdependence – each can deliver if others don’t

Focus effort on reaching minimum threshold that makes this work


Back OGSA more

Invest effort in OGSA
– Investigating, evaluating, contributing, commenting
– Implementing profiles
– Using it

It is the ONLY show in town
– Which offers
  • integration, completeness, abstraction
  • A foundation for collaboration

UK focus on Data Design Team
UK efforts in other design teams
– EMS, Grid markets, JSDL, GSM, mySpace, …


Use OGSA for Collaboration

Big push to Reach OGSA Basic Profile
– Sufficient platform, context & framework
– For safely partitioning further R&D

Agree a division of work
– Upgrading / alternative trade-off components
– New components
– Higher level facilities

Minimise duplication
Maximise combined efforts to deliver
– Function, Stability, Quality & Abstraction

Bury the egos, project competition & national pride silos