WOSE

19th October 2006, NeSC, Edinburgh

Best Practices in Distributed Workflows and Web Services

Rob Allan, Asif Akram, David Meredith
CCLRC Daresbury Laboratory

WOSE Workshop, 19th October 2006, National e-Science Centre, Edinburgh

WOSE Project

Workflow Optimisation for Services in e-Science

• EPSRC-funded project
• In collaboration with Imperial College and Cardiff University
• CCLRC is investigating some user requirements
• Developing use cases based on existing e-Science applications
• e-HTPX: An e-Science Resource for High Throughput Protein Crystallography

http://www.grids.ac.uk/WOSE

Best Practices for Workflows

Best practices for interoperable Web Services

Modular workflow design

Hierarchical workflow exception handling

Compensation mechanisms

Build adaptivity and flexibility into workflow

Account for workflow management

Best Practices for Workflows

Best practices for interoperable Web Services

a) Selection of WS style
b) Abstraction of data model
c) When to use ‘loosely’ or ‘strongly’ typed Web services
d) Approach to writing Web services (code first / WSDL first)

Modular workflow design

Hierarchical workflow exception handling

Compensation mechanisms

Build adaptivity and flexibility into workflow

Account for workflow management

a) Selection of WS style

RPC/encoded: deprecated (only use is for legacy purposes).

RPC/literal: WS-I compliant, but has limitations for validation of complex data (due to namespace issues).

Document/literal (wrapped) style is best (100% WS-I compliant). It supports modelling and validation of complex data from different namespaces in plain XML instance documents.
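As a minimal sketch of the document/literal (wrapped) conventions, the WSDL below defines one operation. The service, operation, element names and example.org namespaces are hypothetical and chosen only for illustration. The points to note are the wrapper element named after the operation, the single message part that references an element rather than a type, and the style="document" / use="literal" binding. The wsdl:service element is omitted for brevity.

```xml
<!-- Hypothetical service: all names and namespaces are illustrative only -->
<wsdl:definitions name="SampleService"
    targetNamespace="http://example.org/sample/wsdl"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://example.org/sample/wsdl"
    xmlns:d="http://example.org/sample/data">

  <wsdl:types>
    <xsd:schema targetNamespace="http://example.org/sample/data"
                elementFormDefault="qualified">
      <!-- Request wrapper element: same name as the operation -->
      <xsd:element name="submitSample">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="sampleId" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <!-- Response wrapper element -->
      <xsd:element name="submitSampleResponse">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="accepted" type="xsd:boolean"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
  </wsdl:types>

  <!-- Each message has exactly one part, referencing an element -->
  <wsdl:message name="submitSampleRequest">
    <wsdl:part name="parameters" element="d:submitSample"/>
  </wsdl:message>
  <wsdl:message name="submitSampleResponse">
    <wsdl:part name="parameters" element="d:submitSampleResponse"/>
  </wsdl:message>

  <wsdl:portType name="SamplePortType">
    <wsdl:operation name="submitSample">
      <wsdl:input message="tns:submitSampleRequest"/>
      <wsdl:output message="tns:submitSampleResponse"/>
    </wsdl:operation>
  </wsdl:portType>

  <wsdl:binding name="SampleSoapBinding" type="tns:SamplePortType">
    <soap:binding style="document"
                  transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="submitSample">
      <soap:operation soapAction=""/>
      <wsdl:input><soap:body use="literal"/></wsdl:input>
      <wsdl:output><soap:body use="literal"/></wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
</wsdl:definitions>
```

Because the SOAP body then contains a single element defined entirely by the schema, the message payload can be validated as an ordinary XML instance document.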

b) Abstraction of data model

Separate the XML Schema elements and complex types defined within the <wsdl:types> section of the WSDL file into separate files.

Recommended, as data can be modelled in different documents according to different namespace and domain / data requirements.

Schema file combining Schema1.xsd and Schema2.xsd:

<schema …>
  <xsd:include schemaLocation="schema1.xsd"/>
  <xsd:import schemaLocation="schema2.xsd"/>
</schema>

WSDL file importing the schema:

<wsdl:definitions …>
  <wsdl:types>
    <xsd:schema …>
      <xsd:import namespace="…" schemaLocation="BulkMRFinal.xsd"/>
    </xsd:schema>
  </wsdl:types>
  ..
</wsdl:definitions>

b) Abstraction of Data Model (Benefits)

Separation of roles – Data model can be developed in isolation from WSDL.

Schema / data model re-usability – Share and re-use schema rather than re-designing schema for each new WS.

Isolation of changing components – Data model can be subject to change. Its isolation limits the impact on other WS components such as concrete WSDL file.

Data model avoids dependencies on WSDL + SOAP namespaces – Good data model design.

Full Schema functionality – XML Schema is powerful (e.g. xsd patterns/regular-expressions, optional elements, enumerations, type restrictions etc)

Extensible Modelling – Schema can be extended without breaking software that uses the original schema through the use of xsd:any / xsd:anyType (wildcard / placeholder for extending schema where necessary).
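The extensibility point can be sketched with a small schema. The type, element names and namespace are hypothetical. The xsd:any wildcard at the end of the sequence leaves room for elements from other namespaces, so documents carrying later extensions should still validate against this original schema.

```xml
<!-- Hypothetical schema: names and namespace are illustrative only -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:s="http://example.org/sample"
            targetNamespace="http://example.org/sample"
            elementFormDefault="qualified">

  <xsd:complexType name="SampleType">
    <xsd:sequence>
      <xsd:element name="sampleId" type="xsd:string"/>
      <!-- Optional element: may be left out of an instance document -->
      <xsd:element name="comment" type="xsd:string" minOccurs="0"/>
      <!-- Extension point: any number of elements from other namespaces.
           processContents="lax" validates them only if a schema is known. -->
      <xsd:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="sample" type="s:SampleType"/>
</xsd:schema>
```

Restricting the wildcard to namespace="##other" keeps extension elements clearly separated from the elements the original schema defines.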

b) Abstraction of Data Model

Abstract the XML Schema(s) used in the <wsdl:types> section of the WSDL file into separate files.

1) Model data logically in separate schema files with different namespaces as required by the data model.

2) Combine separate Schema files as required using <xsd:include> and <xsd:import>

3) Import the schema file(s) into the WSDL file.

4) The result is a smaller, less complex WSDL file.
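The steps above might look as follows for a small data model. The file names (common.xsd, messages.xsd), element names and example.org namespaces are all hypothetical. Note that xsd:include pulls in definitions that share the including schema's target namespace, whereas xsd:import is needed for a different namespace.

A shared-types schema in its own namespace:

```xml
<!-- common.xsd (hypothetical): shared types in their own namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/common"
            elementFormDefault="qualified">
  <xsd:simpleType name="EmailType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[^@\s]+@[^@\s]+"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

A message schema that imports the shared types:

```xml
<!-- messages.xsd (hypothetical): message elements using the shared types -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:c="http://example.org/common"
            targetNamespace="http://example.org/messages"
            elementFormDefault="qualified">
  <xsd:import namespace="http://example.org/common"
              schemaLocation="common.xsd"/>
  <xsd:element name="registerUser">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="email" type="c:EmailType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The WSDL then only needs to import the message schema:

```xml
<!-- WSDL fragment (hypothetical): messages, portType and binding omitted -->
<wsdl:definitions name="UserService"
    targetNamespace="http://example.org/user/wsdl"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <wsdl:types>
    <xsd:schema>
      <xsd:import namespace="http://example.org/messages"
                  schemaLocation="messages.xsd"/>
    </xsd:schema>
  </wsdl:types>
</wsdl:definitions>
```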

c) When to use ‘loose’ or ‘strong’ typing

A strongly typed WS:

Defines a WSDL with a complete definition of a WS operation’s input and output messages in XML schema with tight constraints on the allowed values.

• e.g. xsd:pattern, xsd:restriction, xsd:enumeration, xsd:sequence

Advantages:

• Well defined service interface – all info necessary to invoke the service is encapsulated within the WSDL (client and automation friendly)

• Strong control on the data that enters the business logic of the service

Disadvantages:

• Requires a working knowledge of XML Schema

• Resistant to change in the data model
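A strongly typed message definition could be sketched like this. The element names, value ranges and namespace are hypothetical. Each field has a constrained type, so a validating parser can reject bad input before it reaches the business logic.

```xml
<!-- Hypothetical schema: names, ranges and namespace are illustrative only -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:p="http://example.org/plate"
            targetNamespace="http://example.org/plate"
            elementFormDefault="qualified">

  <!-- Pattern: two capital letters followed by four digits, e.g. AB1234 -->
  <xsd:simpleType name="PlateIdType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z]{2}[0-9]{4}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Enumeration: only these three values are allowed -->
  <xsd:simpleType name="PriorityType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="low"/>
      <xsd:enumeration value="normal"/>
      <xsd:enumeration value="high"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Range restriction on a numeric type -->
  <xsd:simpleType name="TemperatureKelvinType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="400"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Sequence: elements must appear in this order -->
  <xsd:element name="submitPlate">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="plateId" type="p:PlateIdType"/>
        <xsd:element name="priority" type="p:PriorityType"/>
        <xsd:element name="temperature" type="p:TemperatureKelvinType"
                     minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The cost is visible too: changing the plate identifier format means changing the schema and redistributing the contract.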

c) When to use ‘loose’ or ‘strong’ typing

A loosely typed WS:

Uses generic data types in the WSDL which are used to ‘wrap’ data of other formats:

• String – can wrap markup fragments (e.g. XML, name-value pairs, SQL)
• xsd:any / xsd:anyType – placeholders for embedding arbitrary XML
• SOAP attachments (used to send data of any format, e.g. binary)

Advantages:

• Flexible – single WS can handle multiple types of message + message data

• Easy to develop

Disadvantages:

• Incomplete WSDL interface - Requires manual negotiation between client and service to establish format of data wrapped by loose type.

• Prone to message exceptions (requires the WS to be very tolerant in what it accepts)
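For comparison, a loosely typed message definition might look like the sketch below. The element names and namespace are hypothetical. Neither element tells a client what the payload should contain, which is the incomplete interface noted above.

```xml
<!-- Hypothetical schema: names and namespace are illustrative only -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/job"
            elementFormDefault="qualified">

  <!-- String wrapper: the payload could be escaped XML, name-value pairs
       or SQL. Its format must be agreed outside the WSDL. -->
  <xsd:element name="submitJob">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="payload" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Wildcard wrapper: any well-formed XML is accepted unvalidated -->
  <xsd:element name="submitDocument">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:any processContents="skip"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With definitions like these, all checking of the wrapped data has to be done by the service implementation itself.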

d) Approach to writing WS

Two choices:

a) Code first (‘bottom up’)
b) WSDL first (‘contract driven’)

a) Code first (generating WSDL from WS implementation)

Advantages:

• Simple

Disadvantages:

• WSDL files created from source code are always loosely typed (e.g. a source-code String is just a String; xsd:pattern or xsd:restriction constraints cannot be imposed when starting from a String).

• WSDL auto-generation tools can introduce technical dependencies upon the implementation language.

• Not all language-specific data types can be mapped into interoperable XML (‘the Object – XML mismatch’).
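As a rough illustration of the first disadvantage, the fragment below contrasts the kind of element that can be derived from a plain String parameter with a hand-authored equivalent. The element names, pattern and namespace are hypothetical, and real generated output varies by toolkit.

```xml
<!-- Hypothetical schema: names, pattern and namespace are illustrative only -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/phone"
            elementFormDefault="qualified">

  <!-- Code first: a String parameter carries no constraint information,
       so the most a generator can state is "any string" -->
  <xsd:element name="phoneGenerated" type="xsd:string"/>

  <!-- WSDL first: the author can state the allowed format explicitly -->
  <xsd:element name="phoneAuthored">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="[0-9]{3}-[0-9]{4}"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
</xsd:schema>
```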

d) Approach to writing WS

b) WSDL first or ‘contract driven’ approach (generate the WS operation from the WSDL file)

Advantages:

• Authored WSDL can be strongly or loosely typed where necessary

• Interoperability issues between different languages are avoided if starting with plain XML

• Can auto-generate service implementation classes from WSDL.

Disadvantages:

• Developer requires a reasonable knowledge of XML Schema + WSDL

Modular Design

Modules operate as “object oriented black boxes” in the overall workflow, with their own variables, computational logic, dependency constraints and event handlers.

• Group similar or related activities
• Module components should be scalable and replaceable (to serve as repeatable units)
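Assuming a BPEL-style workflow language (WS-BPEL 2.0 syntax is used here as an assumption, not something the slides specify), a module could be sketched as a scope that keeps its variables, fault handling and related activities together. The partner links, operations and message types are hypothetical, and their declarations are omitted, so this is not deployable as written.

```xml
<!-- Sketch only: partner links, operations and message types are hypothetical -->
<scope name="DataReduction"
    xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
    xmlns:m="http://example.org/messages">

  <!-- Variables local to the module -->
  <variables>
    <variable name="reduceRequest" messageType="m:reduceRequest"/>
    <variable name="reduceResult" messageType="m:reduceResponse"/>
  </variables>

  <!-- The module reports its own failures -->
  <faultHandlers>
    <catchAll>
      <invoke partnerLink="logger" operation="logFailure"
              inputVariable="reduceRequest"/>
    </catchAll>
  </faultHandlers>

  <!-- Related activities grouped together -->
  <sequence>
    <invoke partnerLink="reducer" operation="reduce"
            inputVariable="reduceRequest" outputVariable="reduceResult"/>
    <invoke partnerLink="archive" operation="store"
            inputVariable="reduceResult"/>
  </sequence>
</scope>
```

Because everything the module needs is declared inside the scope, it can in principle be replaced by another scope with the same inputs and outputs without touching the rest of the workflow.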

Exception Handling

Types of exception:

• “expected exceptions” (“variations”)
• “unexpected exceptions”

Exception handlers can be defined at different hierarchical levels:

• Global (unexpected)
• Scoped (expected)
• Inline (expected)

Exceptions should be handled at the lowest available level; unrecognized exceptions are passed to a higher level.

[Figure: Exception Handling. A process with global fault handlers (catch blocks) encloses a scope with scoped fault handlers, which in turn encloses an invoke with inline fault handlers.]
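The three handler levels can be sketched in WS-BPEL 2.0 syntax, which is an assumption here because the slides name only process, scope, invoke and catch. The partner links, variables and fault names are hypothetical, and their declarations are omitted, so the process is not deployable as written.

```xml
<!-- Sketch only: partner links, variables and fault names are hypothetical -->
<process name="ShippingWorkflow"
    targetNamespace="http://example.org/workflow"
    xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
    xmlns:f="http://example.org/faults">

  <!-- Global handler: last resort for unexpected faults -->
  <faultHandlers>
    <catchAll>
      <invoke partnerLink="notifier" operation="reportFailure"
              inputVariable="failureReport"/>
    </catchAll>
  </faultHandlers>

  <sequence>
    <scope name="Shipping">
      <!-- Scoped handler: an expected fault for this group of activities -->
      <faultHandlers>
        <catch faultName="f:courierUnavailable">
          <invoke partnerLink="backupCourier" operation="book"
                  inputVariable="shipment"/>
        </catch>
      </faultHandlers>

      <invoke partnerLink="courier" operation="book"
              inputVariable="shipment">
        <!-- Inline handler: an expected fault of this single invocation -->
        <catch faultName="f:addressIncomplete">
          <invoke partnerLink="user" operation="requestAddress"
                  inputVariable="shipment"/>
        </catch>
      </invoke>
    </scope>
  </sequence>
</process>
```

A fault raised by the courier invocation is offered first to the inline catch, then to the scope's handlers, and only reaches the global catchAll if neither lower level recognizes it.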

Compensation Mechanism

“Undoing steps that have already completed successfully”

Reversing the effects of successful activities in the abandoned workflow.

A+B+C: if C fails, may need to roll back to A.

Provision of compensation is the responsibility of the service.

Long running processes require some sort of compensation.
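The A+B+C case can be sketched in WS-BPEL 2.0 syntax (an assumption; the slides do not name a language). The partner links, operations and variables are hypothetical and their declarations are omitted. Each completed step registers a compensation handler that calls an undo operation provided by the partner service. If C faults, the enclosing fault handler triggers those handlers.

```xml
<!-- Sketch only: partner links, operations and variables are hypothetical -->
<scope name="ABC"
    xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">

  <faultHandlers>
    <catchAll>
      <sequence>
        <!-- Compensate successfully completed child scopes, by default
             in reverse order of completion (B, then A) -->
        <compensate/>
        <!-- Pass the original fault to the next level up -->
        <rethrow/>
      </sequence>
    </catchAll>
  </faultHandlers>

  <sequence>
    <scope name="A">
      <compensationHandler>
        <invoke partnerLink="serviceA" operation="cancelA"
                inputVariable="requestA"/>
      </compensationHandler>
      <invoke partnerLink="serviceA" operation="doA"
              inputVariable="requestA"/>
    </scope>

    <scope name="B">
      <compensationHandler>
        <invoke partnerLink="serviceB" operation="cancelB"
                inputVariable="requestB"/>
      </compensationHandler>
      <invoke partnerLink="serviceB" operation="doB"
              inputVariable="requestB"/>
    </scope>

    <!-- If this step faults, the catchAll above runs -->
    <invoke partnerLink="serviceC" operation="doC"
            inputVariable="requestC"/>
  </sequence>
</scope>
```

The workflow only decides when to compensate. What cancelA and cancelB actually do is left to the partner services.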

Adaptive Workflow

Workflow can be static or dynamic depending on whether changes are accommodated at design time or runtime.

Adaptivity and flexibility cater for adjustments/modifications without breaking a workflow.

• Provide “place holders” for future extensions, e.g. replaceable “empty activities”
• Externalize changeable data, e.g. partner service endpoints specified in configuration files
• Use generic/loose data types (rather than strongly typed data)
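Two of these techniques can be sketched briefly. The first fragment uses the WS-BPEL 2.0 empty activity as a placeholder, assuming a BPEL-style engine. Partner links, operations and variables are hypothetical.

```xml
<!-- Sketch only: partner links, operations and variables are hypothetical -->
<sequence xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">
  <invoke partnerLink="collector" operation="collect"
          inputVariable="collectRequest" outputVariable="rawData"/>

  <!-- Placeholder: does nothing today, and can later be replaced by a
       real activity such as a quality check -->
  <empty name="QualityCheckPlaceholder"/>

  <invoke partnerLink="archive" operation="store"
          inputVariable="rawData"/>
</sequence>
```

The second is an entirely hypothetical configuration file format for partner endpoints, not tied to any particular engine. It shows the idea of keeping addresses out of the workflow definition.

```xml
<!-- endpoints.xml: hypothetical format, for illustration only -->
<endpoints>
  <partner name="collector" url="http://example.org/services/collect"/>
  <partner name="archive" url="http://example.org/services/archive"/>
</endpoints>
```

Moving a service to a new host then means editing this file rather than redeploying the workflow.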

Management of Workflow

Workflow management involves:

• Breakpoints: arbitrary locations crucial to the process
• Steering points: e.g. reset, re-schedule, restart, undo, abort, complete, recover, ignore or jump operations
• Persistence: state recorded at break and steering points (allows re-execution of workflow from prior state, and recording/inspection of prior state data values)
