
Session 46 - Principles of workflow management and execution


Page 1: Session 46 - Principles of workflow management and execution

Workflow tutorial @ ISSGC’09

www.lpds.sztaki.hu/gasuc www.portal.p-grade.hu

Gergely Sipos
MTA SZTAKI

[email protected]

EGEE Training and Induction
EGEE Application Porting Support

Page 2: Session 46 - Principles of workflow management and execution

It’s already Day 10…

Page 3: Session 46 - Principles of workflow management and execution

Agenda of the morning

9-10:30 – Lecture room
• Introduction to workflow systems and problems
• P-GRADE Portal as an implementation with demo

Break

11-12:30 – Computer room
• Hands-on: workflows, parameter studies
• Further information and next steps

Page 4: Session 46 - Principles of workflow management and execution

Many of my slides were taken from

• Abu Zafar Abbasi
• Peter Kacsuk
• Johan Montagnat
• Tristan Glatard
• Ewa Deelman

Page 5: Session 46 - Principles of workflow management and execution

Workflow

The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules to achieve, or contribute to, an overall business goal.

• Workflow management system (WFMS) is the software that does it

www.wfmc.org

Workflow Reference Model, 19/11/1998

Page 6: Session 46 - Principles of workflow management and execution

Why use workflows in Grid?

• Build distributed applications through orchestration of multiple services
  – A single job or a single service is good for nothing…
• Integration of multiple teams involved
  – Collaborative work
• Unit of reuse
  – (E-)science requires traceable, repeatable analysis
• (Typically) ease the use of grids
  – Graphical representation

Page 7: Session 46 - Principles of workflow management and execution

Grid Workflow definition examples

Grid workflow can be defined as the composition of grid application services which execute on heterogeneous and distributed resources in a well-defined order to accomplish a specific goal.

R. Buyya

The automation of the processes, which involves the orchestration of a set of Grid services, agents and actors that must be combined together to solve a problem or to define a new service.

Geoffrey Fox [GGF 10]

Page 8: Session 46 - Principles of workflow management and execution

Example: Ultra-short range weather forecast with P-GRADE Portal

Forecasting dangerous weather situations (storms, fog, etc.) is a crucial task in the protection of life and property.

Processed information: surface level measurements, high-altitude measurements, radar, satellite, lightning, results of previously computed models

Requirements:
• Execution time < 10 min
• High resolution (1 km)

[Workflow figure: job groups annotated 25 x, 10 x, 25 x, 5 x]

Execution on a GT2 based Hungarian Grid

Page 9: Session 46 - Principles of workflow management and execution

Example: Montage workflow with Pegasus (and DAGMan)

Montage application
• ~7,000 compute jobs in instance
• ~10,000 nodes in the executable workflow
• same number of clusters as processors
• speedup of ~15 on 32 processors

Pegasus: a Framework for Mapping Complex Scientific Workflows onto Distributed Systems, Ewa Deelman, Gurmeet Singh, Mei-Hui Su, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Karan Vahi, G. Bruce Berriman, John Good, Anastasia Laity, Joseph C. Jacob, Daniel S. Katz, Scientific Programming Journal, Volume 13, Number 3, 2005

Tasks run on NSF’s TeraGrid

Page 10: Session 46 - Principles of workflow management and execution

Example: CancerGrid workflow with gUSE (and WS-PGRADE)

[Workflow figure: generator jobs fan the workflow out; edges and nodes are annotated with multiplicities 1, x1, N, xN and NxM]

N=20e-30e, M=100 ~2.7 billion tasks !!!

CancerGrid Portal

• Workflow is hidden from end users
• Tasks run on Desktop Grids and RDBMS

http://www.cancergrid.eu/

Page 11: Session 46 - Principles of workflow management and execution

Grid WFMS

Source: Jia Yu and Rajkumar Buyya: A Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Volume 3, Numbers 3-4 / September, 2005

Page 12: Session 46 - Principles of workflow management and execution

What does a typical Grid WFMS provide?

• A level of abstraction above grid processes
  – gridftp, lcg-cr, lfc-mkdir, ...
  – condor_submit, globus-job-run, glite-wms-job-submit, ...
  – lcg-infosites, ...
• A level of abstraction above “legacy processes”
  – SQL read/write
  – HTTP file transfer
  – ...
• Automated mapping and execution of tasks on grid resources
  – Submission of jobs
  – Invocation of (Web) services
  – Manage data
  – Catalog intermediate and final data products
• Improve successful application execution
• Improve application performance
• Provide provenance tracking capabilities

Page 13: Session 46 - Principles of workflow management and execution

What does a typical grid workflow consist of?

• Dataflow graph
• Activities
  – Definition of Jobs
  – Specification of services
• Data channels
  – Data transfer
  – Coordination
• Cyclic / acyclic (DAG)
• Conditional statements

Page 14: Session 46 - Principles of workflow management and execution

Data lifecycle in workflows

[Figure: Data Lifecycle in a Workflow Environment. Labels: Data Discovery; Data Analysis Setup; Data Processing; Derived Data and Provenance Archival; Metadata Catalogs; Provenance Catalogs; Component Libraries; Workflow Template Libraries; Data Replica Catalogs; Data Movement Services; Software Catalogs; Workflow Creation; Workflow Mapping and Execution; Workflow Reuse]

Page 15: Session 46 - Principles of workflow management and execution

User interaction

[Figure: Data Lifecycle in a Workflow Environment. Labels: Data Discovery; Data Analysis Setup; Data Processing; Derived Data and Provenance Archival; Metadata Catalogs; Provenance Catalogs; Component Libraries; Workflow Template Libraries; Data Replica Catalogs; Data Movement Services; Software Catalogs; Workflow Creation; Workflow Mapping and Execution; Workflow Reuse. Annotations added on this slide: WF definition tools; WF enactment service; Storages, Catalogs]

Page 16: Session 46 - Principles of workflow management and execution

Layered architecture of WFMS

Grid scheduler, e.g. Condor Schedd

Reliable, scalable execution of independent tasks (locally, across the network), priorities, scheduling

WF scheduler, e.g. Condor DAGMan

Reliable and scalable execution of dependent tasks

WF optimizer, e.g. Pegasus Mapper

A decision system that develops strategies for reliable and efficient execution in a variety of environments

Cyberinfrastructure: Cluster, Condor pool, OSG, EGEE, TeraGrid

Abstract Workflow

Results

Page 17: Session 46 - Principles of workflow management and execution

(Some of the) available grid workflow systems

http://www.gridworkflow.org
Categories for
– Composition tools
– Description languages
  • Scientific
  • Industrial
  • Formalism
– Engines

Some relevant tools for ARC, gLite, Globus, UNICORE grid users
• Condor DAGMan
  – Used as an enactor in P-GRADE Portal, Pegasus, …
  – Uses DAGMan WF language (DAG = Directed Acyclic Graph)
• MOTEUR
  – Interfaced with “pilot job” framework on EGEE (pull style job execution)
  – Uses SCUFL WF language
• gLite WMS
  – Describe workflows in JDL
  – Share Input-Output sandboxes with multiple jobs
• Taverna
  – Mainly for cluster computing
  – ARC interface is available from the University of Lübeck
• …

Page 18: Session 46 - Principles of workflow management and execution

Workflow sharing: MyExperiment

http://www.myexperiment.org/

Page 19: Session 46 - Principles of workflow management and execution

Workflow sharing: MyExperiment

http://www.myexperiment.org/

Page 20: Session 46 - Principles of workflow management and execution

Current and Future Research

• Workflow provenance
  – Reproducibility, traceability trust in vitro simulations
• Flexibility
  – Views at various levels: end user, application developer, grid operator, ...
• Information sources
  – Heterogeneities, inconsistencies
• Automation
  – Manual vs. automated workflow design; reasoning and planning
  – Semantics for operations and data
• Interoperability
  – Reusability of applications
  – Complex workflow built from multiple sources
  – Standards vs future requirements
• Collaborative usage
  – Versioning
  – Change management
• Adaptive computing
  – Workflow refinement adapts to changing execution environment
  – Optimizing execution in multi-dimensional requirement spaces
  – Long-lived workflows

Page 21: Session 46 - Principles of workflow management and execution

P-GRADE Portal

A Grid WFMS

www.portal.p-grade.hu

Page 22: Session 46 - Principles of workflow management and execution

Short History of P-GRADE portal

• Parallel Grid Application Development Environment
• Initial development started in the Hungarian SuperComputing Grid project in 2003
• It has been continuously developed since 2003
• Around 30 man-years of development + training + user support
• Detailed information: http://portal.p-grade.hu/
• Open Source community development since January 2008: https://sourceforge.net/projects/pgportal/
• Current version: 2.8

Page 23: Session 46 - Principles of workflow management and execution

Current P-GRADE Portal related projects

• GGF GIN (Since 2006)
  – Providing the GIN Resource Testing portal
• EU EGEE-II, EGEE-III (2006-2010)
  – Tool recommended for application development
  – Intensively used in new users’ training
• EU SEE-GRID-SCI (2008-2010)
  – Interfacing to DSpace-based workflow storage
  – Infrastructure testing workflows
• EU CancerGrid (2007-2009)
  – Development of new generation P-GRADE (gUSE and WS-PGRADE)
  – Integration with desktop grids
• EU EDGeS (2008-2009)
  – Transparent access to Desktop Grid systems

Page 24: Session 46 - Principles of workflow management and execution

Portal installations

P-GRADE Portal services:
– SEE-GRID infrastructure
– Several VOs of EGEE:
  • Biomed, Astronomy, Central European, NA4, ...
– GILDA: Training VO of EGEE
– Many national Grids (UK National Grid Service, HunGrid, Turkish Grid, etc.)
– US Open Science Grid, TeraGrid
– OGF Grid Interoperability Now (GIN) VO
– …

Portal services and account request: http://portal.p-grade.hu/index.php?m=3&s=0

Account request form on portal login page

Page 25: Session 46 - Principles of workflow management and execution

Multi-Grid portal installation:
www.lpds.sztaki.hu/multi-grid

Page 26: Session 46 - Principles of workflow management and execution

Design principles of P-GRADE portal

• P-GRADE Portal is not only a user interface, it is a
  – General purpose
  – Workflow-level
  – Multi-Grid
  – Application Development and Execution Environment
• P-GRADE Portal includes a high-level middleware layer for orchestrating jobs on grid resources
  – inside a grid
  – among several different grids (and several VOs)
• P-GRADE Portal is grid-neutral:
  – Unlike many existing grid portals it is not tailored to any particular grid type
  – Can be connected to various grids based on different grid middleware
    • LCG-2, gLite, GT2, GT4, ARC, Unicore, etc.
  – Implements the high-level grid middleware services on top of the existing grid middleware services
  – The workflow interface is the same no matter which type of grid is connected to it

Page 27: Session 46 - Principles of workflow management and execution

What is a P-GRADE Portal workflow?

• A directed acyclic graph where
  – Nodes represent jobs (batch programs to be executed on a computing element)
  – Ports represent input/output files the jobs expect/produce
  – Arcs represent file transfer operations
• Semantics of the workflow:
  – A job can be executed if all of its input files are available
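
The execution rule on this slide can be illustrated with a small Python sketch. It is not the portal's implementation (the slides state that enactment relies on Condor DAGMan); the `Job` structure, the job names and the file names are invented for illustration.

```python
from collections import namedtuple

# Hypothetical job description: the files a job reads and the files it writes.
Job = namedtuple("Job", ["name", "inputs", "outputs"])

def run_order(jobs, initial_files):
    """Return job names in an order that respects the rule
    'a job can be executed if all of its input files are available'."""
    available = set(initial_files)
    pending = list(jobs)
    order = []
    while pending:
        # All jobs in `ready` could run in parallel (workflow branch parallelism).
        ready = [j for j in pending if set(j.inputs) <= available]
        if not ready:
            raise ValueError("no runnable job: missing input file or cyclic graph")
        for job in ready:
            order.append(job.name)
            available.update(job.outputs)  # outputs become inputs downstream
            pending.remove(job)
    return order

# Illustrative diamond-shaped fragment: two branches feed one merge job.
jobs = [
    Job("merge", ["a.out", "b.out"], ["result.dat"]),
    Job("left", ["input.dat"], ["a.out"]),
    Job("right", ["input.dat"], ["b.out"]),
]
print(run_order(jobs, ["input.dat"]))  # ['left', 'right', 'merge']
```

Each pass of the loop corresponds to one wave of jobs whose input ports are all satisfied; a graph with a cycle never becomes runnable, which is why the workflow has to be acyclic.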

Page 28: Session 46 - Principles of workflow management and execution

Three levels of parallelism

– PS workflow level: Parameter study execution of the workflow
  Multiple instances of the same workflow process different data files
– Workflow level: Parallel execution among workflow nodes (WF branch parallelism)
  Multiple jobs run in parallel
– Job level: Parallel execution inside a workflow node (MPI job as workflow component)
  Each job can be a parallel program

Page 29: Session 46 - Principles of workflow management and execution

Example: Computational Chemistry

Department of Chemistry, University of Perugia

Solution of the Schrödinger equation for triatomic systems using the time-dependent (RWAVEPR) or time-independent (ABC) method

• Sequential Fortran 90
• A single execution can take between 5 and 10 hours
• ~100 independent jobs to run
• Many simulations at the same time

Page 30: Session 46 - Principles of workflow management and execution

Typical user scenario
Job compilation phase

[Figure labels: Client; Portal server; Grid services. Steps: UPLOAD JOB SOURCE(S); COMPILE – EDIT; DOWNLOAD BINARIES]

Page 31: Session 46 - Principles of workflow management and execution

Typical user scenario
Workflow development phase

[Figure labels: Client; Portal server; Grid services; DSpace WF repository. Steps: START EDITOR; OPEN & EDIT WORKFLOW; ADD BINARIES; SAVE WORKFLOW; IMPORT WORKFLOW]

Page 32: Session 46 - Principles of workflow management and execution

Typical user scenarios
Workflow execution phase

[Figure labels: Client; Portal server; Grid services; MyProxy Certificate servers. Steps: DOWNLOAD PROXY CERTIFICATES; TRANSFER FILES, SUBMIT JOBS; MONITOR JOBS; VISUALIZE JOBS and WORKFLOW PROGRESS; DOWNLOAD (SMALL) RESULTS]

Page 33: Session 46 - Principles of workflow management and execution

Accessing local and remote files

[Figure labels: Portal server; Grid services; Computing elements; Storage elements and File catalogs. File flows: LOCAL INPUT FILES & EXECUTABLES; LOCAL OUTPUT FILES; REMOTE INPUT FILES; REMOTE OUTPUT FILES. Annotation: Only the permanent files!]

Use legacy executables with Grid files without touching the code

Page 34: Session 46 - Principles of workflow management and execution

P-GRADE Portal structural overview

[Figure labels: Web browser; Java Webstart workflow editor; Extended DAGMan WF specification; Extended DAGMan; Globus and gLite command line clients + scripts; EGEE, Globus (and ARC) Grid services + MyProxy service (gLite WMS, LFC, …; Globus GRAM, …); Globus GIIS, gLite BDII; DSpace repository]

Page 35: Session 46 - Principles of workflow management and execution

Web interface - Portlets

Page 36: Session 46 - Principles of workflow management and execution

Email notifications

NOTIFY

Page 37: Session 46 - Principles of workflow management and execution

Workflow portlet

WORKFLOW EDITOR

Page 38: Session 46 - Principles of workflow management and execution

Graphical workflow editing

• To define a graph:
  1. Drag & drop components: jobs and ports
  2. Define their properties
  3. Connect ports by channels (no cycles, no loops)

System generates JDL for each job automatically
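
To give an idea of what an automatically generated job description might look like, here is a minimal Python sketch that renders a gLite-style JDL text from job properties. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are standard gLite JDL attributes, but the slides do not show which attributes the portal actually emits, so the function, its parameters and the file names are assumptions.

```python
def to_jdl(executable, arguments, input_files, output_files):
    """Render a minimal gLite-style JDL description for one job.

    Hypothetical helper: the real portal output is not shown in the slides.
    """
    def quoted_list(items):
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % arguments,
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        # The executable and the local input files travel in the input sandbox.
        "InputSandbox = %s;" % quoted_list([executable] + list(input_files)),
        # Standard streams and local output files come back in the output sandbox.
        "OutputSandbox = %s;" % quoted_list(["std.out", "std.err"] + list(output_files)),
    ]
    return "[\n" + "\n".join("  " + line for line in lines) + "\n]"

jdl = to_jdl("simulate.sh", "-n 10", ["file.in"], ["file.out"])
print(jdl)
```

The point of the sketch is only that everything JDL needs (executable, parameters, files attached to ports) is already known from the graph and the job properties, so the user never has to write JDL by hand.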

Page 39: Session 46 - Principles of workflow management and execution

Workflow Editor: Properties of a job

• Executable file
• Type of executable (Sequential / Parallel)
• Command line parameters
• Which resource to use?
  – Which VO?
  – Broker or Computing element?

Page 40: Session 46 - Principles of workflow management and execution

Workflow Editor: Defining input-output files

File properties
• Type:
  – input: the executable reads
  – output: the executable generates
• File type:
  – local: comes from my desktop
  – remote: comes from an SE
• File: location of the file
• Internal file name: the executable uses this, e.g. fopen("file.in", …)
• File storage type (output files only):
  – Permanent: final result
  – Volatile: temporary data channel

Page 41: Session 46 - Principles of workflow management and execution

How to refer to an I/O file?

Input file
• Local file, client side location: c:\experiments\11-04.dat
• Remote file, LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04.dat
• Remote file, GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/11-04.dat

Output file
• Local file, client side location: result.dat
• Remote file, LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04_-_result.dat
• Remote file, GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/result.dat

Page 42: Session 46 - Principles of workflow management and execution

Upload a workflow from client side or from FTP server

UPLOAD

STORED on FTP server

Page 43: Session 46 - Principles of workflow management and execution

Importing an application

INCOMPLETE WORKFLOW: open it in the editor and save it again

Page 44: Session 46 - Principles of workflow management and execution

Import a workflow from DSpace repository

Page 45: Session 46 - Principles of workflow management and execution

External access to DSpace
http://pgrade-dspace.sztaki.hu

Page 46: Session 46 - Principles of workflow management and execution

Certificate and proxy management Portlet

Page 47: Session 46 - Principles of workflow management and execution

OGF GIN interoperability portal by P-GRADE
Accessing Globus, gLite and ARC based grids/VOs simultaneously

[Figure labels: P-GRADE GEMLCA Portal; GEMLCA Repository; P-GRADE portal; Proxy 1 to Proxy 6]

Page 48: Session 46 - Principles of workflow management and execution

Application execution

Page 49: Session 46 - Principles of workflow management and execution

Fault-tolerant execution

• Utilizing
  – Condor DAGMan’s rescue mechanism
  – EGEE job resubmission mechanism of WMS
• If the EGEE broker leaves a job stuck in a CE’s queue, the portal automatically
  – kills the job on this site and
  – resubmits the job to the broker, prohibiting this site.
• As a result
  – the portal guarantees the correct submission of a job as long as there exists at least one matching resource
  – job submission is reliable even in an unreliable grid
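
The kill-and-resubmit behaviour can be sketched as a loop that keeps a list of prohibited sites. This is an illustration under assumptions, not the portal's code: `try_site` is a hypothetical callback standing in for "submit through the broker and watch whether the job gets stuck", and the site names are invented.

```python
def submit_with_exclusion(job, sites, try_site):
    """Resubmit `job` until it runs, prohibiting every site where it got stuck.

    `try_site(job, site)` is a hypothetical callback: it returns True when the
    job runs on `site` and False when the job is stuck in that site's queue.
    """
    banned = set()
    while True:
        candidates = [s for s in sites if s not in banned]
        if not candidates:
            raise RuntimeError("no matching resource left for %s" % job)
        site = candidates[0]  # stand-in for the broker's matchmaking decision
        if try_site(job, site):
            return site, banned
        # The job is stuck: it would be killed here, and the site is
        # prohibited for the next submission.
        banned.add(site)

healthy = {"ce3.example.org"}
site, banned = submit_with_exclusion(
    "job-1",
    ["ce1.example.org", "ce2.example.org", "ce3.example.org"],
    lambda job, s: s in healthy,
)
print(site, sorted(banned))
```

The loop terminates either with a successful site or when every matching resource has been prohibited, which mirrors the guarantee stated above: submission succeeds as long as at least one matching resource exists.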

Page 50: Session 46 - Principles of workflow management and execution

Information system visualization

Page 51: Session 46 - Principles of workflow management and execution

LFC-SE file browser portlet

Page 52: Session 46 - Principles of workflow management and execution

Compilation support

Page 53: Session 46 - Principles of workflow management and execution

WORKFLOW DEMO

Page 54: Session 46 - Principles of workflow management and execution

From workflows to parameter studies

Advanced execution patterns

Page 55: Session 46 - Principles of workflow management and execution

Scaling up a workflow to a parameter study

Complete workflow

P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs)

P-GRADE Portal: Results produced in the same catalog

Page 56: Session 46 - Principles of workflow management and execution

Advanced parameter studies

• Initial input data
• Generator component(s): generate or cut input into smaller pieces
• Complete workflow
• Collector component(s): aggregate result

P-GRADE Portal: Files in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs)

P-GRADE Portal: Results produced in the same catalog

Page 57: Session 46 - Principles of workflow management and execution

Concept of parameter study workflows

[Figure: a GEN job, several SEQ jobs and a COLL job]

• Generator part generates the input parameter space
• Parameter study part
• Collector part evaluates and integrates the results

Page 58: Session 46 - Principles of workflow management and execution

Turning a WF into a parameter study

By switching at least one of the open input ports into a “PS Input port”, the WF is turned into a Parameter Study

Page 59: Session 46 - Principles of workflow management and execution

Input-output files are stored in SEs

/grid/gilda/sipos/InputImages: Image.0, Image.1
/grid/gilda/sipos/XCoordinates: XCoordinate.0, XCoordinate.1
/grid/gilda/sipos/YCoordinates: YCoordinate.0, YCoordinate.1
/grid/gilda/sipos/Output: ImagePart.0, ImagePart.1, . . .

2 x 2 x 2 = 8 executions of the whole workflow

CROSS PRODUCT of data items
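
The count of 8 executions follows directly from taking the cross product of the three input directories. A minimal Python sketch, using the file names from this slide; the way an output index is paired with a particular combination is an assumption made only for illustration.

```python
from itertools import product

images = ["Image.0", "Image.1"]
xs = ["XCoordinate.0", "XCoordinate.1"]
ys = ["YCoordinate.0", "YCoordinate.1"]

# Every combination of one file per PS input port is one workflow execution.
executions = list(product(images, xs, ys))
print(len(executions))  # 8

# Assumed numbering of outputs: one ImagePart per execution, in product order.
for index, combo in enumerate(executions):
    print("ImagePart.%d" % index, "<-", combo)
```

Adding a third file to any one of the directories would raise the count to 3 x 2 x 2 = 12, which is why parameter studies grow quickly with the number of PS input ports.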

Page 60: Session 46 - Principles of workflow management and execution

Typical data-flow compositions

[Figure: three iteration strategies for an Activity / WF with input sets {A1, A2, A3} and {B1, B2, B3}]

• CROSS ITERATOR (A X B): all-to-all
• DOT ITERATOR: one-to-one
• MATCH ITERATOR (A M B): Ai and Bj are combined if they have a common ancestor

Find these in TAVERNA, MOTEUR
P-GRADE Portal supports this (figure annotation)
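
The difference between the dot and cross iterators can be shown in a few lines of Python. This is only an illustration of the two pairing rules, not code from Taverna, MOTEUR or the portal; the match iterator is left out because it needs provenance information (the ancestors of each data item) that a plain list does not carry.

```python
from itertools import product

A = ["A1", "A2", "A3"]
B = ["B1", "B2", "B3"]

def dot(a, b):
    """Dot iterator: one-to-one pairing of items with the same index."""
    return list(zip(a, b))

def cross(a, b):
    """Cross iterator: all-to-all pairing of items."""
    return list(product(a, b))

print(dot(A, B))         # [('A1', 'B1'), ('A2', 'B2'), ('A3', 'B3')]
print(len(cross(A, B)))  # 9
```

With three items on each input, the dot iterator triggers 3 invocations of the activity and the cross iterator triggers 9.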

Page 61: Session 46 - Principles of workflow management and execution

PS Input Port

Grid Directory instead of FILE reference

Page 62: Session 46 - Principles of workflow management and execution

Parameter generator

Generator can be attached to any parameter input port

Generator can be
• Auto generator: to generate text files
• Custom generator: to generate any content

Generated files are moved into SE by the portal

Page 63: Session 46 - Principles of workflow management and execution

Definition Window of Auto Generator Job

User defines the template of the text file

User puts key(s) into the template

User defines values for the key(s)
• Integer number
• Real number
• Custom set
• …
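
The template-and-keys idea can be sketched with Python's `string.Template`. The `$key` syntax, the key names and the values are assumptions for illustration; the slides do not show the portal's own template syntax, nor whether it combines several keys by cross product as done here.

```python
from itertools import product
from string import Template

# Hypothetical template of the text file, with two keys.
template = Template("temperature = $temp\nmethod = $method\n")

# Values for each key: an integer range and a custom set.
values = {
    "temp": range(100, 400, 100),   # 100, 200, 300
    "method": ["ABC", "RWAVEPR"],
}

# Assumption: one generated file per combination of key values.
keys = sorted(values)
files = [
    template.substitute(dict(zip(keys, combo)))
    for combo in product(*(values[k] for k in keys))
]
print(len(files))  # 6
print(files[0])
```

Each generated text would then become one input file in the grid directory behind a PS input port.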

Page 64: Session 46 - Principles of workflow management and execution

Placement of result

Page 65: Session 46 - Principles of workflow management and execution

Placement of result

Will contain one compressed file for each execution of the workflow.

Use the default value!

Choose a “reliable” Storage Element

Page 66: Session 46 - Principles of workflow management and execution

Executing PS workflows

PS Details for parameter sweep workflow applications

Page 67: Session 46 - Principles of workflow management and execution

Detailed view of a PS workflow

Workflow instances

Overall statistics of workflow instances

Collector job(s)

Generator job(s)

Page 68: Session 46 - Principles of workflow management and execution

PARAMETER STUDY WORKFLOW DEMO

Page 69: Session 46 - Principles of workflow management and execution

Thank you!

[email protected]

Learn once, use everywhere
Develop once, execute anywhere

Page 70: Session 46 - Principles of workflow management and execution

Backup slides to answer questions

Page 71: Session 46 - Principles of workflow management and execution

Proxy delegations

[Figure labels: MyProxy server; P-GRADE Portal server; GILDA services; VOMS server; Proxy; Proxy with VOMS ext.; username, password; Login & psw based authentication; Proxy based authentication]

Page 72: Session 46 - Principles of workflow management and execution

Settings

Portal administrator can
– connect the portal to several grids
– register default resources of the connected grids

Page 73: Session 46 - Principles of workflow management and execution

Settings

User can customize the connected grids by adding and removing resources