ATF Progress Report 8/3/2001 Steve Fisher / RAL <[email protected]>


ATF Progress Report

8/3/2001

Steve Fisher / RAL

<[email protected]>


8 March 2001 ATF report - Steve Fisher/RAL

Who are the ATF
• Francesco Giacomini – WP1
• Wolfgang Hoschek – WP2
• Steve Fisher – WP3 – acting chair
• German Cancio – WP4
• Tim Folkes – WP5
• Brian Tierney – Consultant
• Ingo Augustin – WP8, 9 and 10
• Dave Kelsey – Security
• Ian Foster – Consultant
• Carl Kesselman – Consultant
• Fabrizio Gagliardi


Meetings
• A few half days last year
  – Not much achieved
• A workshop of a week with Ian Foster and Carl Kesselman
  – Identified some issues
• 2 day session at beginning of February
  – Beginning to work well together
• 2 day session at beginning of March
• Half day yesterday
• Plan to continue 1.5 days / 2 weeks


Why do we exist?
• To define a viable architecture in terms of a set of components
• To ensure that these are a useful set of components able to interwork to meet the (evolving) requirements
• To alert the PTB to components that appear not to be fully covered by the various WPs.
• For each deliverable, collect input from the WPs and produce a document showing the essential functionality. Inform the PTB if it appears that a component will not have the functionality required by another component. ????
• Make proposals to the PTB on certain technical matters.


First architecture & M9 documents
• We have the first version of a document which defines the architecture of the EU DataGrid.
  – This is a large undertaking
  – Unlike other software projects we have worked on, many parts of the system are new (to us).
  – We are not just re-painting and re-assembling old components but planning to build something new.
• Consequently for now we can only describe what we see as a feasible architecture for the DataGrid.
  – The new components will have to be prototyped – some will just not work.
• The document will evolve during the lifetime of the project.
• Similar comments relate to the M9 document


Standards
• We want to be able to use existing standards where appropriate
• We recognise the importance of being able to interwork with similar projects – especially HEP related ones
• We consider the work of the GGF to be very important as a genuine grassroots attempt to define Grid standards. We plan to work with the GGF both by contributing to the standardisation work and by moving towards these standards as they evolve.
• Decent standards only arrive when you can choose from a range of solutions


Plan
• Produce model of system (UML)
• Define services – then check if there is obvious mapping to an existing WP
  – Most architectural components are services
• Write an architecture/design document based on Model
• Consider month 9 functionality

As we proceeded various issues appeared and we made tentative decisions…


GridServices
• The system is made up of a set of GridServices.
• Each service is accessed via a port as is the case with http or ftp.
• A service is defined by the protocol it speaks. However the normal programmer does not see the protocol itself but accesses it through a client API. In defining the system we choose to think first of the functionality (API) and then of the protocol needed to communicate via that API to the service. We have defined these APIs as Java calls.

• We have also gone inside each service a little to outline how they work and to demonstrate that they can interwork to achieve the desired functionality.
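
To illustrate thinking of the API first and the protocol second, here is a minimal sketch in which the programmer calls a client method and never sees the wire messages. The slides say the real APIs were defined as Java calls; Python is used here only for brevity. Every name (`Transport`, `LoopbackTransport`, `EchoServiceClient`, the `ECHO` verb, the port number) is hypothetical and not taken from the ATF documents.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical carrier for protocol messages to a service port."""

    @abstractmethod
    def request(self, message: str) -> str:
        ...


class LoopbackTransport(Transport):
    """In-memory stand-in for a network connection, for illustration only."""

    def __init__(self, port: int):
        self.port = port
        self.sent = []  # protocol messages that would have gone over the wire

    def request(self, message: str) -> str:
        self.sent.append(message)
        verb, _, argument = message.partition(" ")
        return f"OK {argument}" if verb == "ECHO" else "ERROR unknown verb"


class EchoServiceClient:
    """Client API: the programmer calls echo() and never sees the protocol."""

    def __init__(self, transport: Transport):
        self._transport = transport

    def echo(self, text: str) -> str:
        reply = self._transport.request(f"ECHO {text}")
        status, _, payload = reply.partition(" ")
        if status != "OK":
            raise RuntimeError(reply)
        return payload


transport = LoopbackTransport(port=8443)
client = EchoServiceClient(transport)
result = client.echo("hello grid")
```

Swapping `LoopbackTransport` for a real network transport would leave the client API unchanged, which is the point of defining the API before the protocol.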


No built in restrictions

• Neither the architecture nor the code should have built-in restrictions on access to the various GridServices.

• GridServices each have a policy which may control the access (from a job or user).
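
A minimal sketch of this idea, assuming a policy is simply an object attached to each service: the service code contains no hard-wired restrictions and defers every access decision to its own policy. The class names and the deny-list form of the policy are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical per-service policy: an empty deny set permits everyone."""
    denied_users: set = field(default_factory=set)

    def permits(self, user: str) -> bool:
        return user not in self.denied_users


@dataclass
class GridService:
    name: str
    policy: AccessPolicy = field(default_factory=AccessPolicy)

    def handle(self, user: str, request: str) -> str:
        # The only access check is the service's own policy, not shared code.
        if not self.policy.permits(user):
            raise PermissionError(f"{user} may not use {self.name}")
        return f"{self.name} handled {request!r} for {user}"


open_service = GridService("FileCopier")
restricted = GridService("StorageElement", AccessPolicy(denied_users={"mallory"}))
```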


Masters and Replicas
• Files (normally) identified by a logical name
  – Expressed as a URL
• The ReplicaManager manages the ReplicaCatalog which knows about all replicas of a (physical) Master file
• GridScheduler consults the ReplicaManager
  – Decides to use an existing replica (or the master)
  – Or to make a new replica

• The user does not manage the replicas, the ReplicaManager does
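
The decision described above (use an existing copy or make a new replica) can be sketched as follows. This is an illustration under stated assumptions, not the ATF design: the catalog is a plain dictionary, "closeness" is reduced to a host-name match, and the physical paths and the `se.example.org` host are invented.

```python
from urllib.parse import urlparse

# Hypothetical catalog: logical file name (a URL) -> physical copies, master first.
REPLICA_CATALOG = {
    "http://atlas.hep/foo.in": [
        "http://datastore.rl.ac.uk/atlas/foo.in",  # master (invented path)
        "http://se.example.org/cache/foo.in",      # replica (invented host)
    ],
}


def choose_copy(logical_name: str, preferred_host: str) -> tuple:
    """Return (action, physical_url) for the copy a scheduler would use."""
    copies = REPLICA_CATALOG[logical_name]
    for url in copies:
        if urlparse(url).hostname == preferred_host:
            return ("use-existing", url)
    # No copy near the job: ask for a new replica made from the master.
    return ("make-replica", copies[0])
```

The user only ever supplies the logical name; which physical copy is used stays a decision of the services.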


Security
• We have so far not thought much about security
• We now have a link to the Security group within WP6 – via Dave Kelsey
• Plan to improve in this area


XML and https
• As a general way to develop new services we favour using XML over http(s)
• http(s) looks after the (secure) sending of some message and getting back the answer
• The XML looks after encoding the data in a standard way.
• For example SQL can be embedded within XML – for which there is a “standard” already in existence.
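
As a rough illustration of embedding SQL within XML, the sketch below wraps a statement in a small envelope and reads it back. The element and attribute names are invented for this example and do not follow the existing "standard" the slide alludes to; the https transfer itself is left out.

```python
import xml.etree.ElementTree as ET


def build_sql_request(statement: str) -> bytes:
    """Wrap an SQL statement in a small XML envelope (element names are invented)."""
    root = ET.Element("request", {"service": "SQLDatabaseService"})
    ET.SubElement(root, "sql").text = statement
    return ET.tostring(root, encoding="utf-8")


def read_sql_request(payload: bytes) -> str:
    """Recover the SQL statement on the receiving side."""
    return ET.fromstring(payload).findtext("sql")


payload = build_sql_request("SELECT lfn FROM files WHERE size < 1000")
# In a real deployment the payload would be POSTed over https; that step is omitted here.
```

Note that the XML layer takes care of escaping characters such as `<` inside the statement, so the SQL text survives the round trip unchanged.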


Globus etc.
• It is our goal to define an architecture which is essentially independent of Globus but which can easily be built using components of Globus.
• M9 prototype will be mainly Globus based.
• We recognise that we need compatibility with other projects, yet we also need to be able to develop our own work.
  – Our architecture document is and will be independent but refers and will refer to other documents where appropriate.


Tools
• Publish .ps and .pdf
• Avoid tools not available on Linux
• Try Together as a UML CASE tool.
  – Encourage use of UML


Platforms and Languages
• Middleware will run on i386 and IA64 Linux and on Sparc Solaris. For the desktop, Linux, Solaris and a web interface will be supported.
• WPs 1-5 APIs will be written to support Java and C, with Python and Perl via SWIG.
• WPs 1-5 are encouraged to consider the benefits of the more modern languages such as Java and Python when implementing new code and to note that mixing Java and C++ is difficult!


GridServices
• ComputingElement
• StorageElement
• SQLDatabaseService
• Information and Monitoring
• ServiceIndex
• GridScheduler
• FileCopier
• ReplicaCatalog
• ReplicaManager
• SoftwareRepositoryService
• Network

• Seeking interfaces

• Simple dependencies

• Checking interactions


A job

Executable = /usr/local/atlas.sh
Requirements = TS >= 1GB
Input.LFN = http://atlas.hep/foo.in
argv1 = Input.LFN
Output.LFN = http://atlas.hep/foo.out
Output.SE = http://datastore.rl.ac.uk/
argv2 = Output.LFN

#!/bin/sh
gridcp $1 ~/tmp1
grep higgs ~/tmp1 > ~/tmp2
gridcp ~/tmp2 $2
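
One way to read the job description is that `argv1` and `argv2` name other entries whose values become the script's `$1` and `$2`. The sketch below parses the description and builds that command line. The substitution rule is an interpretation of the slide, and the function names are hypothetical.

```python
JOB_TEXT = """\
Executable = /usr/local/atlas.sh
Requirements = TS >= 1GB
Input.LFN = http://atlas.hep/foo.in
argv1 = Input.LFN
Output.LFN = http://atlas.hep/foo.out
Output.SE = http://datastore.rl.ac.uk/
argv2 = Output.LFN
"""


def parse_job(text: str) -> dict:
    """Split each 'key = value' line at the first '=' only."""
    job = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition("=")
            job[key.strip()] = value.strip()
    return job


def command_line(job: dict) -> list:
    """Build argv, replacing argvN values that name another key by that key's value."""
    args = [job["Executable"]]
    n = 1
    while f"argv{n}" in job:
        ref = job[f"argv{n}"]
        args.append(job.get(ref, ref))
        n += 1
    return args


job = parse_job(JOB_TEXT)
argv = command_line(job)
```

Splitting at the first `=` only keeps values such as `TS >= 1GB` intact.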


SubmitJob


ComputingElement
• The interface to computing power. Its primary functionality is, given a job description, to take that job and undertake to run it.
• The Fabric Coordination Manager (FCM) interfaces to the different kinds of Local Resource Management System (LRMS) such as PBS, LSF or Condor.
• FCM tailored to the LRMS to provide missing functionality

[Figure: diagram with labels WP1, WP4, CE, FCM, LRMS]


StorageElement
• The Storage Element (SE) will have three interfaces. The first will be a file access interface. This will provide the basic file access methods such as get, put and delete

• The second two interfaces are optional, but provide further functionality.
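
A toy version of the first (file access) interface, assuming only the three methods named on the slide. The in-memory dictionary and the error behaviour are choices made for this sketch; the two optional interfaces are not modelled.

```python
class InMemoryStorageElement:
    """Toy Storage Element exposing only the basic file access interface."""

    def __init__(self):
        self._files = {}

    def put(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def get(self, name: str) -> bytes:
        if name not in self._files:
            raise FileNotFoundError(name)
        return self._files[name]

    def delete(self, name: str) -> None:
        if name not in self._files:
            raise FileNotFoundError(name)
        del self._files[name]


se = InMemoryStorageElement()
se.put("foo.out", b"higgs candidate 1\n")
stored = se.get("foo.out")
se.delete("foo.out")
```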


SQLDatabaseService
• Very recently defined service
• Store & retrieve metadata (SQL insert, delete, update, query)
• Build from standardized commodity components
• Use SQL in XML over https
• Can use with any local or remote RDBMS (MySQL, Oracle, DB2, ...)


Information and Monitoring
• Base on GMA from GGF
• Choose Consumer/Producer protocol
• Choose registration protocol
• 2 prototypes planned:
  – LDAP based (pull only)
  – Relational
  – Some mixture??

[Figure: diagram with Consumer, Producer and Registry]
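
The three roles in the diagram can be sketched as below: producers register with a registry, and a consumer looks them up there before pulling data from each producer directly. This is a simplified, in-process illustration; the method names and the topic-based lookup are assumptions, not the GMA specification or the protocols the ATF had yet to choose.

```python
class Registry:
    """Directory where producers advertise which kind of data they offer."""

    def __init__(self):
        self._producers = {}

    def register(self, topic: str, producer: "Producer") -> None:
        self._producers.setdefault(topic, []).append(producer)

    def lookup(self, topic: str) -> list:
        return list(self._producers.get(topic, []))


class Producer:
    def __init__(self, topic: str, readings: list):
        self.topic = topic
        self._readings = readings

    def latest(self):
        """Answer a single (pull) data request."""
        return self._readings[-1]


class Consumer:
    def __init__(self, registry: Registry):
        self._registry = registry

    def query(self, topic: str) -> list:
        # Find producers via the registry, then contact each one directly.
        return [p.latest() for p in self._registry.lookup(topic)]


registry = Registry()
cpu = Producer("cpu-load", [0.2, 0.7])
registry.register(cpu.topic, cpu)
loads = Consumer(registry).query("cpu-load")
```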


ServiceIndex
• For the construction of a distributed “web” of services
• Soft state service registration, simple service lookup
• Query engine can crawl the web of services, to provide better query support.
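
Soft state registration means an entry survives only while the service keeps refreshing it. A minimal sketch, assuming a fixed time-to-live and an explicit clock value passed in by the caller; the class layout and the example URLs are hypothetical.

```python
class ServiceIndex:
    """Soft-state index: entries vanish unless the service re-registers in time."""

    def __init__(self, ttl_seconds: float):
        self._ttl = ttl_seconds
        self._expiry = {}  # service URL -> time at which the entry lapses

    def register(self, url: str, now: float) -> None:
        # Registering again simply refreshes the expiry time.
        self._expiry[url] = now + self._ttl

    def lookup(self, now: float) -> list:
        # Drop lapsed entries, then return what is still alive.
        self._expiry = {u: t for u, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


index = ServiceIndex(ttl_seconds=60)
index.register("https://ce.example.org/", now=0)
index.register("https://se.example.org/", now=0)
index.register("https://ce.example.org/", now=50)  # refresh only the CE
```

With this design a crashed service needs no explicit de-registration: its entry simply lapses.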


GridScheduler
• This service offers reliable job submission, where reliable means that if the job fails for reasons which are independent of the job, it is rescheduled.
• To facilitate this, appropriate job monitoring has to be performed.

• Main task is to find the right SE and CE to use.
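
The notion of reliable submission can be sketched as a retry loop that reschedules only when the failure is independent of the job. The two exception classes, the `run` callback and the CE names are hypothetical; real monitoring and SE/CE selection are far richer than this.

```python
class ResourceFailure(Exception):
    """Failure independent of the job, e.g. the chosen CE went down."""


class JobFailure(Exception):
    """Failure caused by the job itself; rescheduling would not help."""


def submit_reliably(job, computing_elements, run):
    """Try each CE in turn; reschedule only on failures independent of the job."""
    for ce in computing_elements:
        try:
            return ce, run(job, ce)
        except ResourceFailure:
            continue  # the job is not at fault, so try the next CE
    raise ResourceFailure("no computing element could run the job")


def flaky_run(job, ce):
    # Hypothetical runner: the first CE is down, the second one works.
    if ce == "ce-a":
        raise ResourceFailure("ce-a unreachable")
    return f"{job} finished"


used_ce, outcome = submit_reliably("atlas.sh", ["ce-a", "ce-b"], flaky_run)
```

A `JobFailure` is deliberately not caught, so a job that fails for its own reasons is reported rather than resubmitted.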


FileCopier
• Transfer data securely from one physical location to another.
• Initially the file is the transfer unit
• Consider using globus-url-copy


ReplicaCatalog
• Used by the ReplicaManager
• Mapping of a logical file name to one or more physical file names.
  – A replica catalog contains zero or more logical file collections
  – Each file collection contains zero or more logical files
  – Each logical file contains one or more physical files.

• Requests will be sent as XML over https.
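
The containment rules above map naturally onto a small data structure. The sketch below enforces the "one or more physical files" rule and allows empty catalogs and collections. Class and field names are hypothetical, and the physical file path is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalFile:
    name: str
    physical_files: list  # one or more physical file names

    def __post_init__(self):
        if not self.physical_files:
            raise ValueError("a logical file needs at least one physical file")


@dataclass
class FileCollection:
    name: str
    logical_files: dict = field(default_factory=dict)  # zero or more

    def add(self, logical_file: LogicalFile) -> None:
        self.logical_files[logical_file.name] = logical_file


@dataclass
class ReplicaCatalog:
    collections: dict = field(default_factory=dict)  # zero or more

    def physical_names(self, collection: str, logical_name: str) -> list:
        return self.collections[collection].logical_files[logical_name].physical_files


catalog = ReplicaCatalog()
run1 = FileCollection("run1")
run1.add(LogicalFile("foo.in", ["http://datastore.rl.ac.uk/foo.in"]))  # invented path
catalog.collections["run1"] = run1
```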


ReplicaManager
• Knows about file replicas (through the ReplicaCatalog)
• Manages consistent replica creation, selection, moving and deletion.
• Normally file deletion requests go through the ReplicaManager.


SoftwareRepositoryService
• The applications need to be available within the CEs so that they can be run.
• Part of the environment may be provided by the CE.
• Need a service which is able to deal with inter-dependencies and set up the environment and the application so that it can run.
• To do this efficiently will require some kind of caching mechanism
  – Expect to make use of the data management services.
  – Caches of old programs must be eliminated promptly.


Network
• Future network capabilities are expected to include the ability to ask for various Quality of Service (QoS) levels, and the ability to make advanced reservations.
• These advanced network capabilities will be incorporated as they become available.


SubmitJob


Services @ M9

[Figure: diagram of services. Labels: CE, SE, SQLDataBase, ReplicaManager, ReplicaCatalogue, FileCopier, GridScheduler, ServiceIndex, IMS. Groupings: Resources, Collective Services]


ComputingElement @ M9
• Front End (WP1)
  – The ComputingElement will be based on Globus GRAM. The associated information service to which we publish for this release will be Globus GRIS.
  – Additions and modifications to the type of data published to the GRIS will be carried out.
• Back End (WP4)
  – Interim installation system built on existing tools selected during a tools survey.
    • Provides the necessary functionality for installing and configuring computing nodes and for installing/updating system software packages on them.


StorageElement @ M9
• Provide a first release of the API that will work with Castor at CERN.


SQLDatabaseService @ M9
• The core functionality (SQL insert, delete, update and query) will be available.
• Functionality available from a command line tool, a web browser and via an API
• Testing will be done only on MySQL and Oracle.


Information and Monitoring @ M9

• Relational
  – We plan to have a rudimentary system based on the Relational Model by Month 9
  – Producer and Consumer API and basic code library supporting both single and streaming data requests.
  – The directory service (to register producers) will initially be based on SQL or the ServiceIndex.
  – Hope to be far enough along to consider whether to continue or abandon the Relational approach
• LDAP
  – Producer and Consumer API and basic code library supporting only single data requests.
  – The directory service (to register producers) will initially be based on LDAP or the ServiceIndex.
• Presentation
  – Very rudimentary presentation of monitoring data.


ServiceIndex @ M9
• All functionality will be provided in a prototype.


GridScheduler @ M9
• The first release of the Grid Scheduler addresses the submission and monitoring of batch jobs.
• A command-line user interface will allow a user to:
  – submit a job to the Grid Scheduler;
  – monitor its execution;
  – remove it.
• The description of a job (application, input and output data, other requirements) will be expressed in a Job Description Language (JDL).
  – The first version of the JDL will be based on the Condor ClassAds.
• The first prototype will use the Condor matchmaking library.


FileCopier
• All functionality will be provided.


ReplicaCatalog
• By Month 9, the Globus LDAP based implementation may be preferable. Later, an RDBMS model may be used. All functionality described in the Architecture document will be provided.


ReplicaManager
• By month nine, user-directed replica creation and deletion will be available.


Areas poorly covered
• Implementation of managed interface of the SE
• Security – especially authorisation
  – But now we have Dave Kelsey
• Accounting
  – WP1?
• SoftwareRepositoryService


Worries
• The nature of replication
  – Is it just a file copy?
  – Can we do anything for R/W databases?
• “GridMap” file
  – Scalable mechanism to do authorisation world wide – taking into account complex policies imposed primarily by:
    • Countries
    • Experiments
  – Mechanism to relate a file to its owner…
• Can we develop our prototypes fast enough to convince the world of our sanity?


Final remarks
• Version 1 of the documents is available
• They only represent a snapshot of a partial design
• As promised in the proposal – there is a lot of innovation planned
• Need to prototype
• Tension between providing a usable M9 deliverable and looking to the goal.
• ATF will now work on Version 2…