Orchestrating Scientific Workflows with BPEL
Open Middleware Infrastructure Institute
University College London
Software Systems Engineering Group
15 September - 16 September, e-Science Institute Edinburgh
UCL Department of Computer Science
Contacts
Dr Steven Newhouse, Open Middleware Infrastructure Institute
Prof Wolfgang Emmerich, University College London
Mr Liang Chen, University College London
Mr Bruno Wassermann, University College London
Introduction
• BPEL is the industry standard for orchestration of Web services
• BPEL can provide many benefits for scientific workflows
• BPEL relies on a complex set of technologies
• Additional support is required to obtain the benefits
  – Additional language abstractions
  – Tool support
• OMII-BPEL project offers needed additional language abstractions and tool support to hide some of the complexity
• Aim is to give introduction to BPEL and OMII-BPEL in order to enable audience to express their workflow needs in BPEL using our environment
Learning Objectives
• Understand requirements of scientific workflows
• Learn about basic concepts and language constructs of BPEL
• Learn about advanced BPEL concepts required for large-scale workflows
  – E.g., hierarchical composition, asynchronous invocation, message correlation, etc.
• Learn about language extensions provided by OMII-BPEL and how to use them
• Learn about issues of scalability and how to deal with them
• Some tips & tricks for building orchestratable Web services
Assumptions
• You understand XML and its related technologies
  – XML Schema 1.0
  – XML Namespaces
  – XPath 1.0
  – SOAP 1.2
• You know about Web services
  – WSDL 1.1
Agenda
1. Overview of OMII-BPEL Offering
2. Introduction to Workflows
3. Fundamental BPEL Concepts
4. BPEL Language Elements
5. Advanced BPEL Concepts
6. Sedna BPEL Extensions
7. Scalability Considerations
8. Engineering Orchestratable Web Services
9. Summary, Conclusions, Discussion, Q&A
Overview of OMII-BPEL Offering
• Sedna
• ActiveBPEL Engine
• ActiveBPEL Management & Monitoring Console
1. Overview of OMII-BPEL Offering
• Aim is to make benefits of BPEL accessible to application scientists
• For this, need to overcome a number of issues:
  – Provide suitable set of abstractions to simplify creation of scientific workflows
  – Provide tool support to hide complexity of technologies inherent to BPEL
  – Provide integration of various middleware technologies
• This is what OMII-BPEL offers
Overview of OMII-BPEL Offering (continued)
Sedna
• Our visual modelling environment
• Provides access to all native BPEL activities plus Sedna BPEL extensions
• Developed as Eclipse plug-in
• Features include:
  – Pre-deployment validation
  – Integration with ActiveBPEL engine
  – Project management including CVS
  – Context-sensitive wizards
Sedna (continued)
• Package Explorer for managing workflow projects and their related resources
• Integration with Eclipse New Wizard
• Simple Project
• New BPEL Workflow Wizard
• Sedna files
Sedna (continued)
• Overview Page
• Wizards for setting up partners, global variables, and namespaces
• Wizards to inspect and modify setup information
Sedna (continued)
• Process Map
• Palette of activities
• Canvas
• Properties view
• Problems view
• Deployment menu option
  – BPR archive
ActiveBPEL Engine
• Open source implementation of BPEL4WS 1.1 under active development
• Industrial-strength BPEL workflow enactment engine
  – Scalable
  – Persistence
  – Hot-deployment
• Reads BPEL and WSDL files from deployment archive (BPR files)
• Listens for incoming messages to trigger deployed processes
ActiveBPEL Engine Management Console
ActiveBPEL Monitoring Console
2. Introduction to Workflows
• Some background about BPEL
• Scientific Workflows
• Orchestration vs. Choreography
Workflows
• Academic research since the 1970s
• Various graphical notations
• An important impetus for adoption of workflows in business is Enterprise Application Integration (EAI)
EAI & SOA
• EAI refers to the need to integrate a number of different, often heterogeneous, applications across an enterprise
• Simplified by Service Oriented Architecture (SOA)
  – An application architecture
  – Separate functions (of the various applications) are loosely coupled
  – Separate functions can be invoked via well-defined interfaces
• SOA promises reusability and flexibility
• SOA simplifies the EAI problem
  – Services can be re-used in ways that were not anticipated at design time
  – Value of services (applications) can increase by composing services over time
  – Composition vs. extension
SOA & BPEL
• SOA is an application architecture
• Web services are the primary technology for implementing SOAs
• Composition of Web services is needed to realise the vision of SOAs
• BPEL is the primary industry standard for composing Web services
Some BPEL Background
• BPEL is the result of convergence of two previous workflow languages
  – XLANG (Microsoft)
  – WSFL (IBM)
• Supported by vast array of providers
  – IBM, Oracle, Parasoft, Active Endpoints, Microsoft, etc.
• BPEL is a reactive workflow language based on events
• BPEL makes use of the Web services stack
  – SOAP, WSDL, UDDI, WS-Addressing, …
Some BPEL Background (continued)
• An XML-based programming language
• Provides control and data flow constructs for combining Web services
• A BPEL process consists of a set of activities and links between those activities
• Expressed in an XML file that can be read and executed by any conformant BPEL engine
Scientific Workflows
• Proliferation of OGSA Grid computing infrastructures
• Need to combine grid services into larger services or into experiments
  – Apply computationally expensive models
  – Then filter and convert resulting data
• Experiments need to be changed frequently to incorporate new insights and ideas
• Application scientists need ownership of their workflows
Scientific Workflows (continued)
• BPEL is primarily targeted at business workflows
• Scientific workflows differ in a number of ways
• The main difference is one of scale along several dimensions
Scientific workflows | Business workflows
Thousands of service instances (partners) | << service instances
Thousands of basic service invocations; tens of thousands of SOAP messages | << invocations and SOAP messages
Large numbers of sub-workflows for parallel execution | << opportunities for parallel execution
Very large amounts of data to be analysed routinely | << amount of data to be analysed
Orchestration vs. Choreography
Orchestration
  – Central process acts as controller of involved Web services
  – Explicit definition of control and data flow in central process
  – Involved Web services have no knowledge of their involvement in higher-level application
Choreography
  – All involved Web services are aware of their partners and when to invoke operations
  – No central controller
  – Focus on message exchange
Taken from http://blog.whatfettle.com/archives/000250.html
3. Fundamental BPEL Concepts
1. Namespaces
2. Partners
3. Variables
4. Messages
Namespaces
• XML namespaces ensure that XML type definitions are unique (similar to Java package names)
• Important when combining services that were created autonomously
• A namespace is identified via a globally unique identifier called a URI
• A namespace declaration is of the form:
  – xmlns:<prefix>="<URI>"
  – xmlns:test="http://test.com"
• The prefix allows us to use qualified names in XML documents:
  – <test:name>…</test:name>
Namespaces (continued)
• Each BPEL process defines its own targetNamespace
<process name="HelloWorld"
         targetNamespace="http://omii-bpel.ac.uk/helloworld"
         xmlns:tns="http://omii-bpel.ac.uk/helloworld" />
• This serves to qualify the elements of a process (e.g., element declarations, type definitions, messages, portTypes, etc.)
• Also allows WSDL definition to be split across several files (one for interface description, another one for implementation description) and use targetNamespace to import parts
• In the Sedna editor you can specify the target namespace in the new workflow wizard or on the overview page
Namespaces (continued)
Above: The setup area for namespaces and the target namespace on the overview page.
Right: The new workflow wizard allows you to import the process’s target namespace from an existing WSDL definition.
Partners
• We need to define the interactions between a process and its partners
• These interactions occur through Web service interfaces
• In BPEL this relationship is represented via partner links
• In WSDL via partner link types
Partners – WSDL
<partnerLinkType name="BuyerSellerLink"
    xmlns="http://schemas.xmlsoap.org/ws/2003/05/partner-link/">
  <role name="Buyer">
    <portType name="buy:BuyerPortType"/>
  </role>
  <role name="Seller">
    <portType name="sell:SellerPortType"/>
  </role>
</partnerLinkType>
• Each partner link type defines up to 2 roles
• Each role must support exactly one WSDL portType
• In this example portTypes for each role have different namespaces
• In case of asynchronous communication (later) both portTypes can be from same namespace
• A partner link type can also define exactly 1 role
  – That denotes a service that does not impose any requirements on the other service
Partners - BPEL
• In BPEL we use partner links to define the other participants of an interaction
• A partner link has a name used throughout the process to identify a particular partner
• Name is used in activities to obtain list of operations to invoke (from portTypes)
• Partner link specifies the appropriate partner link type
• Has two role attributes:
  – myRole: role of the process in interaction
  – partnerRole: role of the partner in interaction
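As a minimal sketch of how such a partner link might be declared in a BPEL4WS 1.1 process, reusing the BuyerSellerLink partner link type from the WSDL slide. The partner link name "buyer" and the "lns" prefix are assumptions made for illustration only.

```xml
<!-- Sketch only: "buyer" and the "lns" prefix are hypothetical names -->
<partnerLinks>
  <!-- The process plays the Seller role; its partner plays the Buyer role -->
  <partnerLink name="buyer"
               partnerLinkType="lns:BuyerSellerLink"
               myRole="Seller"
               partnerRole="Buyer"/>
</partnerLinks>
```

Activities such as receive, reply and invoke then refer to this partner through partnerLink="buyer".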
Partners - BPEL
• In Sedna use Partner Wizard on overview page to set up partners
• Click Add Partner button for new partners
• Click partner icons to modify existing partners
• Specify URL of file location of partner’s WSDL definition
• Follow guidance given by wizard
• Example of partner with just myRole attribute set
Variables
• BPEL variables can be used to store messages or any other kind of data representing state of process
• Type of a variable can be
  – WSDL message type
  – XML Schema simple type
  – XML Schema element
• An XML Schema complex type must be associated with an element to be used as the type of a BPEL variable
• Only variables of a message type can be used as input/output in operations (invocation of operations on service partners)
• Types can be defined in some XSD, the process WSDL, or some partner WSDL
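The three kinds of variable type listed above could be declared roughly as follows in BPEL4WS 1.1. This is a sketch: the variable names and the tns:employee element are assumptions, and EmployeeInputMsg is borrowed from the Messages slides.

```xml
<!-- Sketch only: variable names and tns:employee are hypothetical -->
<variables>
  <!-- WSDL message type: usable as input/output of operations -->
  <variable name="request" messageType="tns:EmployeeInputMsg"/>
  <!-- XML Schema simple type: internal process state -->
  <variable name="counter" type="xsd:int"/>
  <!-- XML Schema element -->
  <variable name="employee" element="tns:employee"/>
</variables>
```

Exactly one of messageType, type or element is given per variable.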
Variables (continued)
• Two ways to assign value to variable
  – Use assign construct
  – Receipt of message assigned to variable
• Can have global variables and local variables defined in scope
• Scoping Rules
  – Local variable overrides outer variable with same name and type
  – Variable visible in its own scope and all nested scopes
  – Must not have variables with same name but different types in one enclosing scope hierarchy
Variables (continued)
• In Sedna use overview page to set up global process variables
• Click Add Partner button
• In wizard
  – Type variable name
  – Select message type from drop-down list
• Wizard uses set up partners to present list of message types
• Does not currently support XSD types directly (will discuss later)
• Modify variables by clicking icons
Messages
• WSDL message has a unique name and consists of several parts
• Parts are linked to type definitions or element declarations
• WSDL types element allows for definition of types, elements, and import of external schemas
Messages
<types>
  <schema targetNamespace… xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="FirstName" type="xsd:string"/>
    <element name="Age" type="xsd:int"/>
    <complexType name="employeeType">
      <sequence>
        <!-- an element declaration carries either name or ref, not both -->
        <element ref="tns:FirstName"/>
        <element ref="tns:Age"/>
      </sequence>
    </complexType>
  </schema>
  <schema …>
    <import namespace="http://data.country.samples.sedna.bpel.omii.ac.uk"
            schemaLocation="http://localhost:8080/axis/schemas/CCSData.xsd"/>
  </schema>
</types>
Messages
<message name="EmployeeInputMsg">
  <part name="main" type="tns:employeeType"/>
</message>
• Message has a name attribute which must be unique
• part name attribute must be unique within message scope
• type attribute refers to a type definition
• element attribute refers to an element declaration
• When should we define a part via a type def and when via an element declaration?
  – WS-I 1.0 Basic Profile and possibly WSDL 1.2: A message must not contain parts that specify type as well as element attributes
Messages
• Messages form the input and output to operations
• Their parts are transformed into SOAP Body element content
4. BPEL Language Elements
1. Assignments
2. Receive
3. Reply
4. Pick
5. Switch
6. Wait
7. Invoke
8. Process
9. Scope
10. Sequence
11. Flows
12. While
Assignments
• Assign activity to copy data among variables (from source to target)
• Can also use it to construct data
• Can calculate value of expression and store in variable
• Supports
  – XPath functions
  – BPEL functions (extensions to XPath functions)
  – XQuery
  – XSLT
• Powerful but complex
Assignments
• Type compatibility
  – When copying from a variable of WSDL message type to another variable of WSDL message type, both must be of the same WSDL message type
  – When source and target are XML Schema types or elements, the source must contain the element or type associated with the destination
Assignments – Literal XML
• Assign literal XML or any other kind of value to a target variable (e.g., part of a message)
• Selecting assign activity in editor populates corresponding properties view
Assignments – Literal XML
<employee>
<name>Bruno</name>
<age>26</age>
</employee>
• Enter this into From (Literal) field
• Then, select To (container) to specify the target variable and To (part) to specify the part of the message
<message name="CandidateInMsg">
  <part name="payload" type="tns:employeeType"/>
</message>
<complexType name="employeeType">
  <sequence>
    <element name="employee">
      <complexType>
        <sequence>
          <element name="name" type="xsd:string"/>
          <element name="age" type="xsd:int"/>
          …
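For orientation, the literal assignment described on this slide corresponds roughly to the following BPEL4WS 1.1 assign activity. This is a sketch, not the exact output of the editor: the variable name "candidate" is an assumption, standing for a variable of message type CandidateInMsg.

```xml
<!-- Sketch only: "candidate" is a hypothetical variable of type CandidateInMsg -->
<assign>
  <copy>
    <!-- Literal XML given as the content of the from element -->
    <from>
      <employee>
        <name>Bruno</name>
        <age>26</age>
      </employee>
    </from>
    <!-- Target variable and message part -->
    <to variable="candidate" part="payload"/>
  </copy>
</assign>
```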
Assignments – Numerical Value + Query
26
• Enter this into From (Expression) field
• Then, select To (container) to specify the target variable and To (part) to specify the part of the message
• Then enter following To (Query):
/employee/age
<message name="CandidateInMsg">
  <part name="payload" type="tns:employeeType"/>
</message>
<complexType name="employeeType">
  <sequence>
    <element name="employee">
      <complexType>
        <sequence>
          <element name="name" type="xsd:string"/>
          <element name="age" type="xsd:int"/>
          …
Assignments – Copying Between Variables
• Select names of source and target variables
• Select source and/or target variable parts as needed
• Can use several assign statements in a row
Assignments – Copy Between Variables with Query
• Messages
<message name="SimpleEmployeeMsg">
  <part name="payload" type="employeeType"/>
</message>
<message name="ComplexEmployeeMsg">
  <part name="main" type="employeeRecordType"/>
</message>
• Variables
<variable name="source" messageType="SimpleEmployeeMsg"/>
<variable name="target" messageType="ComplexEmployeeMsg"/>
• employeeRecordType
<complexType name="employeeRecordType">
  <sequence>
    <element name="employeeRecord">
      <complexType>
        <sequence>
          <element name="FirstName" type="xsd:string"/>
          <element name="Confidential">
            <complexType>
              <sequence>
                <element name="Age" type="xsd:int"/>
              </sequence>
            </complexType>
          </element>
          …
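A copy between the two variables above might look roughly like this in BPEL4WS 1.1. The sketch assumes that employeeType is structured as on the earlier slides (employee/name, employee/age); the exact root of each query path depends on how the engine exposes a type-based message part.

```xml
<!-- Sketch only: query paths assume the employeeType shown on the earlier slides -->
<assign>
  <copy>
    <from variable="source" part="payload" query="/employee/name"/>
    <to variable="target" part="main" query="/employeeRecord/FirstName"/>
  </copy>
  <copy>
    <from variable="source" part="payload" query="/employee/age"/>
    <to variable="target" part="main" query="/employeeRecord/Confidential/Age"/>
  </copy>
</assign>
```

Each copy element moves one value, so several copies can be grouped inside a single assign.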
Assignments – XPath Functions
• getVariableData('var_name', 'part_name', <query>)
• Making me a year older:
  bpws:getVariableData('source', 'payload', '/employee/age') + 1
• Enter this in From (Expression) field
• Remaining settings stay as before
• Greeting me:
  concat('Hi ', bpws:getVariableData('source', 'payload', '/employee/name'))
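In the underlying BPEL4WS 1.1, an expression entered in the From (Expression) field ends up in the expression attribute of the from element. A sketch, assuming the result is written back to the age field of the same variable:

```xml
<!-- Sketch only: writes the incremented age back into the "source" variable -->
<assign>
  <copy>
    <from expression="bpws:getVariableData('source', 'payload', '/employee/age') + 1"/>
    <to variable="source" part="payload" query="/employee/age"/>
  </copy>
</assign>
```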
Assignments – XPath Functions
• true() – boolean value
• position()=x – accessing array elements
  /…/…/employees[position()=1]
Receive
• Receive activity is a blocking wait for a matching message
• createInstance attribute
  – Can be set to 'yes' when Receive is first activity in process
  – Upon receipt of a matching message, creates new process instance
Reply
• Reply allows process to send a message in response to one previously received
• Reply is used for synchronous interactions among partners
• The combination of a receive and a corresponding reply at a later point in the process represents a request-response operation
• A reply is for the same partner link, portType and operation as its preceding receive
<operation name="echoRequest">
  <input message="tns:echoMessage"/>
  <output message="tns:echoMessage"/>
</operation>
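A receive/reply pair implementing the echoRequest operation above could be sketched as follows in BPEL4WS 1.1. The partner link "client", the portType name tns:echoPT and the variable names are assumptions; both variables would be declared with message type tns:echoMessage.

```xml
<!-- Sketch only: "client", tns:echoPT, "request" and "response" are hypothetical names -->
<sequence>
  <!-- Blocks until an echoRequest message arrives and starts a new process instance -->
  <receive partnerLink="client" portType="tns:echoPT"
           operation="echoRequest" variable="request"
           createInstance="yes"/>
  <!-- ... activities that fill the "response" variable go here ... -->
  <!-- Same partner link, portType and operation as the receive -->
  <reply partnerLink="client" portType="tns:echoPT"
         operation="echoRequest" variable="response"/>
</sequence>
```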
Pick
• Pick awaits occurrence of a set of events
• Executes activity associated with event that occurred
• Events are arrival of a particular message or a timeout
• Arrival of message can be one-way or request-response
• onMessage event
  – Each pick must have at least one onMessage
  – Configure onMessage with partner, operation, variable
  – Pick with onMessage is equivalent to receive
  – Pick with onMessage can be first activity in process (replace receive); in that case no timeout allowed; specify createInstance=yes
Pick
• onAlarm
  – The other type of event
  – If no matching message has been received by the time specified in onAlarm, we execute the onAlarm branch
  – XML time format
    • P6Y5M4DT3H2M1S: 6 years, 5 months, 4 days, 3 hours, 2 minutes, 1 second
    • PT10S: 10 second timeout
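A pick combining both event types might be sketched like this in BPEL4WS 1.1. The partner link, portType, operation and variable names are assumptions; the duration is written as a quoted string because the for attribute holds an expression.

```xml
<!-- Sketch only: "buyer", tns:orderPT, "confirmOrder" and "confirmation" are hypothetical -->
<pick>
  <!-- Taken if the message arrives within 10 seconds -->
  <onMessage partnerLink="buyer" portType="tns:orderPT"
             operation="confirmOrder" variable="confirmation">
    <empty/> <!-- placeholder for the activity handling the message -->
  </onMessage>
  <!-- Taken if no matching message has arrived after 10 seconds -->
  <onAlarm for="'PT10S'">
    <empty/> <!-- placeholder for the timeout handling -->
  </onAlarm>
</pick>
```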
Switch
• Switch allows us to model conditional behaviour
• Select one of a number of branches according to conditions
• Have one or more conditional branches plus an optional otherwise branch
• Branches are evaluated in the order they appear and first branch to evaluate to true is executed
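A switch over the age field of the "source" variable used on the assignment slides might be sketched as follows in BPEL4WS 1.1. The thresholds are made up for illustration, and "<" has to be escaped as &lt; inside an XML attribute.

```xml
<!-- Sketch only: thresholds and branch contents are illustrative -->
<switch>
  <case condition="bpws:getVariableData('source', 'payload', '/employee/age') &lt; 18">
    <empty/> <!-- first branch: evaluated first -->
  </case>
  <case condition="bpws:getVariableData('source', 'payload', '/employee/age') &lt; 65">
    <empty/> <!-- only evaluated if the first condition was false -->
  </case>
  <otherwise>
    <empty/> <!-- executed if no case condition holds -->
  </otherwise>
</switch>
```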
Wait
• Wait specifies delay for a period of time or until a deadline is reached
  – One minute delay: PT1M
  – Deadline: '2005-09-16T16:30:00+00:00'
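In BPEL4WS 1.1 the two forms of wait could be written roughly as below. Both attributes hold expressions, so the literal values are quoted; the deadline is given as a full xsd:dateTime including seconds.

```xml
<!-- Sketch only: two alternative forms, shown in sequence for illustration -->
<sequence>
  <!-- Delay for one minute -->
  <wait for="'PT1M'"/>
  <!-- Block until the given deadline is reached -->
  <wait until="'2005-09-16T16:30:00+00:00'"/>
</sequence>
```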
Invoke
• Invoke allows us to call an operation on a service partner
• Can be one-way (asynchronous)
  – Only specify an input message
• Can be request-response (synchronous)
  – Have input and output message
• Match input variable to input message type of target operation
• Similarly for output variable
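The two kinds of invoke could be sketched as follows in BPEL4WS 1.1. All partner link, portType, operation and variable names here are assumptions chosen for illustration.

```xml
<!-- Sketch only: all names are hypothetical -->
<sequence>
  <!-- Request-response: input and output variables, blocks until the reply arrives -->
  <invoke partnerLink="modelService" portType="m:ModelPT"
          operation="runModel"
          inputVariable="modelRequest" outputVariable="modelResponse"/>
  <!-- One-way: only an input variable, no reply is awaited -->
  <invoke partnerLink="logService" portType="l:LogPT"
          operation="logEvent"
          inputVariable="logEntry"/>
</sequence>
```

The message types of the variables must match the input and output messages of the invoked operations.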
Process
• Process is the top-level element of any workflow
• You must give it a name
  – XML NCName: Letter, _, -, digits
  – No spaces
• Contains targetNamespace and tns prefix
• Defines prefixes for any relevant namespaces
• bpws is the prefix used for BPEL namespace
  – Automatically added by editor
Scope
• Scope is a nested activity similar to a programming language block {}
• Can define its own variables (local variables)
• Can contain one activity
• Aids BPEL engine in management of variables
  – Restricts visibility of variables
  – Restricts lifetime of variables
Sequence
• Sequence contains one or more activities which are performed in the order indicated
• Part of BPEL structured activities
  – Sequential – sequence, switch, while
  – Concurrent – flow
  – Non-deterministic choice – pick
• Need sequence inside scope, if scope contains more than 1 activity
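A scope with a local variable and a nested sequence might be sketched like this in BPEL4WS 1.1. The variable, partner link, portType and operation names are assumptions.

```xml
<!-- Sketch only: all names are hypothetical -->
<scope>
  <variables>
    <!-- Local variable: visible only inside this scope and its nested scopes -->
    <variable name="attempt" type="xsd:int"/>
  </variables>
  <!-- A scope holds one activity, so several steps are wrapped in a sequence -->
  <sequence>
    <assign>
      <copy>
        <from expression="1"/>
        <to variable="attempt"/>
      </copy>
    </assign>
    <invoke partnerLink="modelService" portType="m:ModelPT"
            operation="runModel"
            inputVariable="modelRequest" outputVariable="modelResponse"/>
  </sequence>
</scope>
```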
Flows
• Flow creates a set of concurrent activities nested within it
• Provided to express concurrency
• Complex semantics and awkward to model large-scale concurrency
• Use link to express synchronisation dependencies between activities
• Transition Condition, optional
• Suppress Join Failure, default is no
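A small flow with one link could be sketched as follows in BPEL4WS 1.1. The activity, link, partner link and variable names are assumptions; the third invoke has no links and therefore runs concurrently with the other two.

```xml
<!-- Sketch only: all names are hypothetical -->
<flow>
  <links>
    <link name="modelToFilter"/>
  </links>
  <!-- Source of the link: must complete before the target may start -->
  <invoke name="runModel" partnerLink="modelService" portType="m:ModelPT"
          operation="run" inputVariable="modelIn" outputVariable="modelOut">
    <source linkName="modelToFilter"/>
  </invoke>
  <!-- Target of the link: waits for runModel -->
  <invoke name="filterResults" partnerLink="filterService" portType="f:FilterPT"
          operation="filter" inputVariable="modelOut" outputVariable="filtered">
    <target linkName="modelToFilter"/>
  </invoke>
  <!-- No links: free to run in parallel with the two activities above -->
  <invoke name="logStart" partnerLink="logService" portType="l:LogPT"
          operation="logEvent" inputVariable="logEntry"/>
</flow>
```

An optional transitionCondition attribute on the source element would make the link fire only when the condition holds.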
While
• While provides for the repeated execution of one activity
• Executes until condition does not hold anymore
• Include several activities inside sequence nested within while
• Express condition using XPath and BPEL XPath functions
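A counting loop might be sketched as follows in BPEL4WS 1.1, assuming a variable "counter" of type xsd:int that was initialised before the loop. Note the escaped "<" in the condition attribute.

```xml
<!-- Sketch only: "counter" is a hypothetical xsd:int variable set to 0 beforehand -->
<while condition="bpws:getVariableData('counter') &lt; 10">
  <!-- Several activities per iteration are wrapped in a sequence -->
  <sequence>
    <empty/> <!-- placeholder for the work done in each iteration -->
    <assign>
      <copy>
        <from expression="bpws:getVariableData('counter') + 1"/>
        <to variable="counter"/>
      </copy>
    </assign>
  </sequence>
</while>
```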
5. Advanced BPEL Concepts
1. Hierarchical Composition
2. Asynchronous Invocation
3. Message Correlation
4. Scope Synchronisation
Hierarchical Composition
• Each BPEL process can be represented as a Web service
• Initial receive gives input message
• Final reply provides output message
• Can invoke process as any other Web service providing appropriate input
• Request-response interaction
Hierarchical Composition
• Composition of workflows has many benefits
  – Enables break-down of big workflow into several smaller sub-workflows
  – Can independently implement and test smaller sub-workflows, then compose into bigger workflows
  – Reduction of complexity for modelling & design
  – Re-use of common workflows
  – Reduces size of BPEL files
Asynchronous Invocation
• Asynchronous invocation is possible with BPEL, but complex and not well supported
• Also, SOAP/HTTP – HTTP is a request-response protocol
• One-way operations allow us to model asynchronous interactions
• In BPEL have a one-way invoke and then a receive
Asynchronous Invocation
• Invoke calls an initiate operation on the asynchronous partner
• Receive allows asynchronous partner to return result via callback (onResult operation)
• Need correlation information to be able to match asynchronous response to correct process instance
• What do the Web services look like?
Asynchronous Invocation
• Process WSDL
<portType name="countryCodeCallbackPT">
  <operation name="onResult">
    <input message="tns:CountryCodeResultMsg"/>
  </operation>
</portType>
• Asynchronous Service Partner WSDL
<portType name="CountryCodeServicePT">
  <operation name="initiateCCS">
    <input message="impl:CountryCodeRequestMsg" name="ccs_initiate"/>
  </operation>
</portType>
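On the process side, the two portTypes above would typically be used by a one-way invoke followed by a receive, roughly as sketched below in BPEL4WS 1.1. The partner link and variable names are assumptions, and the correlation information needed to route the callback to the right process instance is left out here.

```xml
<!-- Sketch only: "countryCodeService", "ccsRequest" and "ccsResult" are hypothetical -->
<sequence>
  <!-- One-way invoke: starts the work on the asynchronous partner -->
  <invoke partnerLink="countryCodeService"
          portType="impl:CountryCodeServicePT"
          operation="initiateCCS"
          inputVariable="ccsRequest"/>
  <!-- Callback: the partner delivers the result via the process's own portType -->
  <receive partnerLink="countryCodeService"
           portType="tns:countryCodeCallbackPT"
           operation="onResult"
           variable="ccsResult"/>
</sequence>
```

The partner link would declare both roles: partnerRole for CountryCodeServicePT and myRole for countryCodeCallbackPT.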
Message Correlation
• Asynchronous interactions require information for the correlation of exchanged messages
• A process is available on a port
• A port may host several process instances
• How does the run-time environment determine the correct process instance for an incoming message?
Message Correlation
• Seller and buyer example
  – Buyer needs to return acknowledgement for order asynchronously
  – Use order id to correlate messages
• Implicit correlation – WS-Addressing (WS-A)
• Explicit correlation – correlator fields in request and response messages
• Correlation sets
  – Can be global or local (scope)
  – Visibility rules same as for variables
Message Correlation
• Correlation sets are like late-bound constants, not variables
  – Binding triggered by specially marked operation
  – Only 1 value per iteration
  – No new assignments
  – Operations are: receive, reply, invoke, pick – onMessage
Message Correlation
• Correlation set is a named group of properties
• A given message can carry multiple correlation sets
• Steps for setting up and using correlation sets
1. Set up properties
  – A property can be an XML Schema simple type
  – <bpws:property name="customerID" type="xsd:int"/>
Message Correlation
2. Set up property aliases
  • Maps property to message part
  • Can specify queries to further qualify
  • <bpws:propertyAlias propertyName="cor:customerID"
        messageType="tns:POMessage" part="PO"
        query="/PO/CID"/>
Message Correlation
3. Define correlation sets
  • Named collection of properties
  • <correlationSets … xmlns:cor="http://example.com/supplyCorrelation.wsdl">
      <!-- Order numbers are particular to a customer,
           this set is carried in application data -->
      <correlationSet name="PurchaseOrder"
          properties="cor:customerID cor:orderNumber"/>
    </correlationSets>
Message Correlation
4. Use correlation set in operation
<receive partnerLink="Buyer" portType="SP:PurchasingPT"
         operation="AsyncPurchase" variable="PO">
  <correlations>
    <correlation set="PurchaseOrder" initiate="yes"/>
  </correlations>
</receive>
<invoke partnerLink="Buyer" portType="SP:BuyerPT"
        operation="AsyncPurchaseResponse"
        inputVariable="POResponse">
  <correlations>
    <correlation set="PurchaseOrder" initiate="no" pattern="out"/>
    <correlation set="Invoice" initiate="yes" pattern="out"/>
  </correlations>
</invoke>
Scope Synchronisation
• Serializable Scopes ensure consistent access to shared variables
• Provides concurrency control in governing access to shared variables
• Variable access serializable property set to yes
• Only leaf scopes can be serializable scope
• Variable Access Interference
  – Long-running executions in indexed flow have potential for interference
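In BPEL4WS 1.1 the property mentioned above is the variableAccessSerializable attribute of a scope. A sketch, assuming a shared xsd:int variable "completed" that several concurrent branches increment:

```xml
<!-- Sketch only: "completed" is a hypothetical shared xsd:int variable -->
<scope variableAccessSerializable="yes">
  <!-- Leaf scope: the read and the write of "completed" are protected
       from interleaving with other serializable scopes -->
  <assign>
    <copy>
      <from expression="bpws:getVariableData('completed') + 1"/>
      <to variable="completed"/>
    </copy>
  </assign>
</scope>
```

Keeping such scopes short limits how long concurrent branches have to wait for each other.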
6. Sedna BPEL Extensions
1. Indexed Flows
2. Plug-ins
Indexed Flows
• Indexed Flows are an additional language abstraction developed by OMII-BPEL
• Replaces native BPEL flow
• Editor translates indexed flow into standard BPEL flow
• Indexed flow is a container for any set of activities
• Simplifies modelling of concurrent sets of activities and maintenance
Indexed Flows
• Configuration parameters
  – Index name
  – Start range
  – End range
• Activities executed end range – start range number of times
• Can use value of index in statements
  – #index_name#
  – #i#
Plug-ins
• Domain-specific activities
• Combination of simple Java (how to export into BPEL) and XML descriptor (name, image, Java class)
• Available in palette for drag-n-drop
• Editor automatically adds required partners, namespaces, and variables
Plug-ins
• Different from hierarchical composition
  – Hierarchical composition uses complete workflow with initial receive and eventual reply
  – Plug-in can be any complex snippet of BPEL that is inserted at specified location in workflow
7. Scalability Considerations
1. Tomcat Configuration
2. ActiveBPEL Engine Configuration
3. Distribution
Configuring for Scalability
• In order to execute scalable workflows, need to configure environment appropriately
• BPEL engine can launch large number of concurrent flows
• Need to ensure that adequate resources will be available
  – Memory, Threads, OS File Descriptors
Tomcat configuration
• Tomcat configuration via server.xml or Admin page
  – $CATALINA_HOME/conf/server.xml
  – http://localhost:8080/ (links to the Admin page and others can be found here)
  – Need to set up users and passwords in tomcat-users.xml to use Admin page
Tomcat Configuration
• Axis hosts Web services that will be invoked
• Need to support the degree of concurrency required by Axis
• Configuration is application-specific
• Configuration also depends on available resources
  – What does the OS support in terms of memory, threads, and file descriptors?
• GridSAM Example
  – GridSAM needs to access 10 files per invocation
  – If OS makes 1000 file descriptors available, can have 100 concurrent client requests to GridSAM
Tomcat Configuration
• Parameters
  – maxThreads: the size of the thread pool, i.e. the maximum degree of concurrency we can afford to support given OS resources.
  – acceptCount: size of the Tomcat request queue, i.e. the maximum queue length for incoming connection requests when all possible request processing threads are in use.
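Both parameters are attributes of the HTTP Connector element in server.xml. A sketch of such an entry follows; the values are illustrative assumptions (100 threads matches the GridSAM example of 1000 file descriptors at 10 files per invocation) and need to be tuned per installation.

```xml
<!-- Fragment of $CATALINA_HOME/conf/server.xml; values are illustrative only -->
<Connector port="8080"
           maxThreads="100"
           acceptCount="50"/>
```

Requests beyond maxThreads wait in the queue; once acceptCount connections are queued as well, further connection attempts are refused.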
ActiveBPEL Engine Configuration
• ActiveBPEL configuration is made simple via its Management console
• Parameters
  – Min / Max Work Manager Thread Pool: determines how many process instances the engine can handle concurrently
Distribution
• Memory requirements are high for scientific workflows
• Example memory requirement
• Want some load-balancing
• Can have GridSAM run on one server and ActiveBPEL on another server
• Can have multiple ActiveBPEL instances on several servers to execute sub-workflows
BPEL File Size
• Another memory requirement to consider is parsing of large BPEL files
• Imagine
  – 38 x 200 parallel executions translated into parallel BPEL flows result in a large BPEL file
• Use hierarchical composition
  – Have 38 flows in one workflow
  – 200 in another sub-workflow
8. Engineering Orchestratable Web Services
1. Thread Safety
2. Document Literal vs. RPC
3. Java2WSDL and WSDL2Java
Thread Safety
• Web services used in composition may maintain state
  – Read/write to files
• These Web services may get invoked concurrently by BPEL engine
• Need to ensure that Web services are thread-safe
Document Literal vs. RPC
• Two communication styles
  – Document vs. rpc
• Two encoding choices
  – Literal vs. SOAP encoding
Document Literal vs. RPC
• Literal encoding
  – Body conforms to an XML Schema
  – More flexible, don’t need to fit to method signature
  – Better choice for data interoperability
    • Model data separately from processing it
• SOAP (section 5) encoding
  – A set of encoding rules for serialising objects, object graphs, structs, arrays
  – Can’t transform message with XSLT
  – Can’t validate against an XML Schema
Document Literal vs. RPC
• Document Style
  – Body element contains an arbitrary XML instance
  – SOAP runtime imposes no structure on Body element content and passes XML instance on unchanged
<SOAP-ENV:Envelope xmlns:SOAP-ENV="…">
  <SOAP-ENV:Body>
    <eg:employee xmlns:eg="http://example.com">
      <eg:name>Bruno</eg:name>
    </eg:employee>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Document Literal vs. RPC
• RPC Style
  – SOAP message represents remote procedure call
  – URI identifying transport address for call
  – Name of method to invoke + namespace identifying target service
  – Parameter types and values
  – SOAP runtime imposes message structure
<SOAP-ENV:Envelope xmlns:SOAP-ENV="…">
  <SOAP-ENV:Body>
    <eg:getEmployee xmlns:eg="http://example.com" SOAP-ENV:encodingStyle=…>
      <firstName xsi:type="xsd:string">Bruno</firstName>
    </eg:getEmployee>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Document Literal vs. RPC
• Document Literal makes for better interoperability
• Can validate SOAP message body contents against a schema
• More appropriate model for handling data in SOA
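Validating a document/literal payload against a schema can be sketched with the standard javax.xml.validation API. The schema below is an assumption written for the employee example from the Document Style slide, and the class BodyValidation is hypothetical. The sketch validates the payload (the child of the SOAP Body), not the whole envelope.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class BodyValidation {
    // Hypothetical schema for the employee payload; not taken from the slides.
    static final String EMPLOYEE_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='http://example.com' elementFormDefault='qualified'>"
      + "<xsd:element name='employee'><xsd:complexType><xsd:sequence>"
      + "<xsd:element name='name' type='xsd:string'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element></xsd:schema>";

    /** Returns true if the payload conforms to EMPLOYEE_XSD. */
    static boolean isValid(String payloadXml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(EMPLOYEE_XSD)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(payloadXml)));
                return true;
            } catch (SAXException invalid) {
                return false; // payload is malformed or violates the schema
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("Could not set up validation", e);
        }
    }
}
```

With SOAP encoding there is no such schema for the body as a whole, which is why this kind of check is listed as an advantage of Document Literal.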
Java2WSDL and WSDL2Java
• http://ws.apache.org/axis/java/user-guide.html
• Tools provided by Apache Axis to help develop Web services and their implementations in Java
• WSDL2Java
  – Generates Java implementation (client-side, server-side) from WSDL definition
  – Can also generate JUnit test stub
  – Binding information
  – Mapping of XML data types to Java
  – One interface per binding (doc/lit, rpc)
  – Stub to turn method invocations into SOAP calls
  – Service interface and locator to return stubs
Java2WSDL and WSDL2Java
• Java2WSDL
  – Build Web service starting from Java interface
  – Steps:
    • Provide a Java interface or class
    • Create WSDL using Java2WSDL
    • Create bindings using WSDL2Java
  – Beware:
    • Generates SOAP RPC Web service
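The first step, providing a Java interface, might look like the sketch below. The names EmployeeService and EmployeeServiceImpl are hypothetical; getEmployee is borrowed from the RPC Style example. In a real project the interface would normally be public and live in its own file and package so that Java2WSDL can load it from the classpath.

```java
/** Hypothetical service interface of the kind handed to Java2WSDL. */
interface EmployeeService {
    /** Each method becomes a candidate WSDL operation with typed parts. */
    String getEmployee(String firstName);
}

/** Hypothetical implementation that would sit behind the generated WSDL. */
final class EmployeeServiceImpl implements EmployeeService {
    @Override
    public String getEmployee(String firstName) {
        return "Employee record for " + firstName;
    }
}
```

Because the WSDL is derived from a method signature, the result naturally follows the RPC style noted in the warning above. For document/literal services it is usually easier to start from the schema and WSDL and generate the Java side with WSDL2Java.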
Summary & Conclusions
• OGSA Grid Infrastructures
• Scalability
• Additional Language Abstractions
• Tool Support
• Configuration of Middleware
• Thread Safety
Questions
References
• Oracle Technology Network – BPEL
  – http://otn.oracle.com/bpel
• UCL OMII-BPEL Homepage
  – http://sse.cs.ucl.ac.uk/omii-bpel
Appendix A - SOAP
• Message structure
• Faults
• Communication Styles
Message Structure
• An XML-based protocol for exchanging structured and typed data between applications
• A SOAP message can be transmitted using various transport mechanisms (e.g. HTTP, MOM)
• SOAP message carries requests and responses
• Structure of SOAP message
  – Envelope
    • Top element representing SOAP message
  – Header
    • Optional, contains additional processing or control information (QoS, authentication, transaction)
    • Child elements called header entries
  – Body
    • Mandatory, body entries contain main message content for recipient
Header Attributes
• SOAP intermediaries examine headers for processing information and forward the message
• actor attribute on a header entry notifies an intermediary of a processing opportunity
• mustUnderstand attribute on a header entry: 0 means the recipient can process the entry, 1 means it must
Faults
• Fault element is used as a body entry to indicate failure information
• Fault codes
  – VersionMismatch: incorrect qualification of SOAP message Envelope element
  – MustUnderstand: application could not process header entry (mustUnderstand = 1)
  – Client: message not formed correctly or required information missing; do not resend
  – Server: message okay, but could not be processed by recipient; try again later
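A caller can use the fault code to decide whether resending is worthwhile. The sketch below is a hypothetical mapping: the Client and Server decisions follow the slide, while treating VersionMismatch and MustUnderstand as not worth retrying is an assumption (resending the same message unchanged would presumably fail again).

```java
/** Hypothetical mapping from SOAP 1.1 fault codes to a retry decision. */
enum SoapFaultCode {
    VERSION_MISMATCH("VersionMismatch", false), // assumption: same message fails again
    MUST_UNDERSTAND("MustUnderstand", false),   // assumption: same message fails again
    CLIENT("Client", false),                    // slide: do not resend
    SERVER("Server", true);                     // slide: try again later

    private final String localName;
    private final boolean retryWorthwhile;

    SoapFaultCode(String localName, boolean retryWorthwhile) {
        this.localName = localName;
        this.retryWorthwhile = retryWorthwhile;
    }

    boolean isRetryWorthwhile() {
        return retryWorthwhile;
    }

    /** Parses values such as "SOAP-ENV:Server" or "SOAP-ENV:Client.Authentication". */
    static SoapFaultCode parse(String faultcode) {
        // Drop the namespace prefix, if any.
        String name = faultcode.substring(faultcode.indexOf(':') + 1);
        // Drop any dotted sub-code, keeping only the generic class of fault.
        int dot = name.indexOf('.');
        if (dot >= 0) {
            name = name.substring(0, dot);
        }
        for (SoapFaultCode code : values()) {
            if (code.localName.equals(name)) {
                return code;
            }
        }
        throw new IllegalArgumentException("Unknown fault code: " + faultcode);
    }
}
```

For example, SoapFaultCode.parse("SOAP-ENV:Server").isRetryWorthwhile() returns true, so a workflow could schedule a later retry for that fault.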
Communication Styles
• Document Style
  – Body element contains an arbitrary XML instance
  – SOAP runtime imposes no structure on Body element content and passes XML instance on unchanged

<SOAP-ENV:Envelope xmlns:SOAP-ENV="…">
  <SOAP-ENV:Body>
    <eg:employee xmlns:eg="http://example.com">
      <eg:name>Bruno</eg:name>
    </eg:employee>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Communication Styles (continued)
• RPC Style
  – SOAP message represents remote procedure call
  – URI identifying transport address for call
  – Name of method to invoke + namespace identifying target service
  – Parameter types and values
  – SOAP runtime imposes message structure

<SOAP-ENV:Envelope xmlns:SOAP-ENV="…">
  <SOAP-ENV:Body>
    <eg:getEmployee xmlns:eg="http://example.com" SOAP-ENV:encodingStyle="…">
      <firstName xsi:type="xsd:string">Bruno</firstName>
    </eg:getEmployee>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Appendix B - WSDL
• Elements of WSDL Document
• Bindings
• Operation Overloading
WSDL Document
• WSDL document describes service interface contract in XML
• Elements of a WSDL document
  – definitions: top-level element; contains name, target namespace (tns), namespaces
  – portType: named collection of operations
  – operation: abstract description of service call with input/output messages and fault messages
  – message: description of data that is sent, consisting of parts
  – part: each part has a data type
  – types: contains definitions/imports of data type definitions
WSDL Bindings
• A binding provides information about protocol and concrete data formats
  – Transport & style (e.g., SOAP over HTTP using RPC style)
  – It maps a portType (description of interface) to an actual implementation
  – Can have various bindings (implementations) for same portType
• A port specifies the network address of a service
• A service is a collection of ports
  – Contained ports can specify different network addresses and/or bindings
WSDL Operation Names
• Can use operation overloading within same portType (same operation name, different parameters)
• But according to WS-I Basic Profile 1.0 and probably WSDL 1.2
  – Operation names within a portType must be unique
  – Operation names across portTypes may be identical
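Whether a WSDL 1.1 document relies on operation overloading can be checked mechanically. The sketch below uses the JDK's DOM parser to report operation names that occur more than once within a single portType; the class OperationNameCheck and its method are hypothetical and this is not a full WS-I conformance check.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class OperationNameCheck {
    /** Namespace of WSDL 1.1 elements. */
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    /** Returns "portType/operation" for each name declared more than once in a portType. */
    static Set<String> overloadedOperations(String wsdlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(wsdlXml.getBytes(StandardCharsets.UTF_8)));
            Set<String> duplicates = new TreeSet<>();
            NodeList portTypes = doc.getElementsByTagNameNS(WSDL_NS, "portType");
            for (int i = 0; i < portTypes.getLength(); i++) {
                Element portType = (Element) portTypes.item(i);
                Set<String> seen = new HashSet<>(); // names are tracked per portType
                NodeList ops = portType.getElementsByTagNameNS(WSDL_NS, "operation");
                for (int j = 0; j < ops.getLength(); j++) {
                    String name = ((Element) ops.item(j)).getAttribute("name");
                    if (!seen.add(name)) {
                        duplicates.add(portType.getAttribute("name") + "/" + name);
                    }
                }
            }
            return duplicates;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse WSDL", e);
        }
    }
}
```

An empty result means every operation name is unique within its portType; identical names in different portTypes are not reported, matching the two rules above.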