Available online at www.sciencedirect.com

Information Sciences 178 (2008) 317–339

www.elsevier.com/locate/ins

Efficient execution of composite Web services exchanging intensional data

Chang-Sup Park a,*, Soyeon Park b

a Department of Internet Information Engineering, The University of Suwon, San 2-2 Wau-ri, Bongdam-eup, Hwaseong-si, Gyeonggi-do 445-743, Republic of Korea

b Department of Computer Science, University of Illinois at Urbana-Champaign, 4222 Siebel Center, 201 N. Goodwin Avenue, Urbana, IL 61801, USA

Received 20 September 2006; received in revised form 20 August 2007; accepted 24 August 2007

Abstract

Web service technologies provide a standard means of integrating heterogeneous applications distributed over the Internet. Successive compositions of new Web services using pre-existing ones usually create a hierarchical structure of invocations among a large number of Web services. For the efficient execution of these composite Web services, we propose an approach which exploits intensional XML data, i.e. an XML document that contains special elements representing the calls to Web services, in order to delegate the invocations of the external Web services to some relevant nodes. We formalize an invocation plan for composite Web services in which intensional data is used as their parameters and results, and define a cost-based optimization problem to obtain an efficient invocation plan for them. We provide an A* heuristic search algorithm to find an optimal invocation plan for a given set of Web services and also present a greedy method of generating an efficient solution in a short time. The experimental results show that the proposed greedy method can find a close-to-optimal solution efficiently and has good scalability for a complex call hierarchy of Web services.

© 2007 Elsevier Inc. All rights reserved.

Keywords: Web service; Invocation plan; Intensional data; XML; Optimization

1. Introduction

Web services [36] are emerging as a major technology for the inter-operation and integration of heterogeneous, distributed applications on the Internet using XML and Web protocols. Based on the Service-Oriented Architecture (SOA) paradigm to develop reliable and flexible software, a new application can be easily assembled from pre-existing Web services [13]. Successive compositions of Web services using other Web services usually create a complex structure of interactions among a large number of composite Web services distributed over the Internet.

0020-0255/$ - see front matter © 2007 Elsevier Inc. All rights reserved.

doi:10.1016/j.ins.2007.08.021

* Corresponding author. Tel.: +82 31 229 8048; fax: +82 31 229 8281.
E-mail addresses: [email protected] (C.-S. Park), [email protected] (S. Park).


The concept of intensional XML documents and their applications have been recently proposed in the literature [1,4,19,33]. These are XML documents that contain calls to external Web services. Fig. 1a shows an example intensional XML document which describes a list of hotel information and contains a call to a Web service, GetCurrentRoomRates, provided by a hotel. When a Web service call embedded in an intensional document is activated and completed successfully, the result of the Web service can be integrated into the document. For example, the <service_call> element in Fig. 1a can be replaced by the information on the currently available rooms and their rates returned from the Web service call. We consider that intensional XML data can be exchanged between Web services in the form of the input parameter and output result of the services, which are referred to in this paper as the intensional parameter and result. For example, an input SOAP message to a Web service, GetHotelInfo, shown in Fig. 1b includes intensional parameters containing the calls to two external Web services, i.e., GetBudget from a bank and GetTouristAttractions from a tourist information center, which provide the budget of a given user and the list of famous sites in a given city, respectively. Fig. 1c shows an output message returned from the GetHotelInfo service, which contains an intensional result representing a list of hotel information grouped by nearby places of interest and also embeds a call to another Web service, GetCurrentRoomRates, provided by a particular hotel.

Fig. 1. Example intensional XML documents.
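The replacement of an embedded service call by its result, as described above for Fig. 1a, can be illustrated with a minimal Python sketch. Fig. 1 is not reproduced here, so the document layout, the `service` attribute, the `<param>` children, and the stub `fake_invoke` are all assumptions made for illustration; only the `<service_call>` element name and the GetCurrentRoomRates service come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical intensional document; attribute and child names are assumptions.
DOC = """
<hotels>
  <hotel name="Grand">
    <service_call service="GetCurrentRoomRates">
      <param name="hotel">Grand</param>
    </service_call>
  </hotel>
</hotels>
"""

def fake_invoke(service, params):
    """Stand-in for a real SOAP call; returns an XML element as the result."""
    rooms = ET.Element("rooms", hotel=params["hotel"])
    ET.SubElement(rooms, "room", type="double", rate="120")
    return rooms

def materialize(root, invoke):
    """Replace every <service_call> element with the result of its call."""
    # Snapshot all elements first so children can be replaced while looping.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "service_call":
                params = {p.get("name"): p.text for p in child.findall("param")}
                result = invoke(child.get("service"), params)
                result.tail = child.tail      # keep surrounding whitespace
                parent[index] = result        # one-for-one replacement
    return root

root = materialize(ET.fromstring(DOC), fake_invoke)
print(ET.tostring(root, encoding="unicode"))
```

The sketch performs a single pass, so service calls that are themselves embedded in a returned result would remain intensional, which matches the idea that a receiver may leave some calls unactivated.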

By using an intensional form of data as the parameter or result of a Web service, we can delegate invocations of Web services to some other Web services. If a Web service calls one of its component Web services with an intensional parameter, the invoked one should execute the service calls embedded in the parameter before performing its own business logic. On the other hand, a composite Web service can return an intensional result to its caller containing calls to some of its component Web services. Then, the client of the service can execute a portion of the deferred service calls by itself and delegate the remaining service calls successively to its caller using its own intensional result. Thus, there exist a lot of different ways to invoke composite Web services by exploiting intensional data.

Fig. 2 shows an example of Web service interactions using intensional results. Here, a client or an agent service calls a Web service WS1, and then WS1 invokes another Web service WS2 which is composed of the Web services WS4, WS5, and WS6. In this example, WS2 invokes only WS5 by itself and returns an intensional result to WS1 embedding the service calls to WS4 and WS6. Then, WS1 calls WS6 on behalf of WS2 and returns an intensional result to its client containing the service call to WS4. Finally, the client calls WS4 by itself and receives an extensional result from WS4. As will be described in Section 3, delegating a call to a Web service S to any other Web service by means of an intensional result can affect the set of possible callers of the Web services contained in the composition hierarchy of Web services rooted at S.

Fig. 2. An example of Web service invocations returning intensional results. (Legend: "Int. Result (WSi)" denotes an intensional result embedding the call to WSi; "Ext. Result" denotes an extensional result.)

The XML-based SOAP protocol used for invoking Web services often produces considerable performance overhead for the service client and message transmission network, which is mainly due to the XML-related tasks such as encoding, decoding, and parsing of XML data and the large size of SOAP messages. Previous studies have shown that SOAP is orders of magnitude slower than binary network protocols such as Java RMI and CORBA/IIOP and that the size of the encoded message is typically about 10 times that of its binary representation [10,12]. The result of an experimental evaluation of SOAP for a real business application also indicates that SOAP and XML have poorer performance than a domain-specific messaging protocol and a binary wire format [16].

When we delegate a service call to some other nodes in the system using intensional data, the candidate nodes which can perform activation of the service call typically have different resources, workloads, and communication costs with respect to the target service. Thus, by selecting a node which can process the service invocation and message transmission efficiently over the heavy-weight SOAP protocol, we can reduce the cost of invoking the Web service significantly. In this paper, in an attempt to improve the overall performance of Web service systems, we focus on the optimization of the invocation strategy for a set of given Web services by exploiting intensional data to delegate service invocations to appropriate nodes participating in the hierarchical composition of the Web services. We formally describe an optimal cost invocation plan for Web services which interact hierarchically using intensional data and propose effective methods of generating an optimal invocation plan or an efficient one. The invocation plan can be exploited by the Web service execution servers to determine which of the component or delegated service calls they should activate themselves and which ones they should delegate to other Web services using intensional data.

This paper is organized as follows. In Section 2, we discuss previous work on Web services and intensional XML data, as well as defining a system architecture to support our approach. Section 3 formalizes the concept of a feasible invocation plan for Web services which exchange intensional data as service parameters or results. In Section 4, we describe a cost model for activating a service call from a client, investigate the complexity of the exhaustive search for an optimal solution, and propose some heuristic approaches to finding an optimal solution and an efficient solution based on the A* search and greedy method, respectively. We present experimental results on the effectiveness and performance of the proposed methods in Section 5 and draw our conclusions in Section 6.

2. Background and related work

The deployment of Web services is supported by various standards such as Web Service Description Language (WSDL), Universal Description, Discovery, and Integration (UDDI), and Simple Object Access Protocol (SOAP). These standards support definition of the functions and interfaces of Web services, advertisement of Web services to the community of potential user applications, and binding for remote invocation of Web services, respectively [35,36]. Recent studies on Web services include their high-level modeling and specification, automatic selection and composition, semantic inter-operation, and the use of Web service technologies for developing distributed and collaborative systems in different domains [7,11,20,21,23,26,40].

Recently, several approaches and applications have been proposed which exploit intensional data embedding calls to external functions or Web services [1,4,19,33]. For example, Smart tags in Microsoft Office-based documents (e.g., MS Excel spreadsheets) can link text to one or more actions that can execute custom applications or invoke external .NET Web services [33]. Active XML (AXML) [1] provides a declarative framework for data integration and management on the Web utilizing Web services and intensional XML data, called AXML documents, in a peer-to-peer environment. Each peer node provides AXML services which allow clients to access AXML documents stored in the peer's repository. Defined as parameterized XML queries over AXML documents which may contain calls to external Web services provided by other peers, they can be used to search, gather, and integrate data distributed over the peer nodes.

Abiteboul et al. [2] studied the problem of distributing and replicating XML data and Web services in a peer-to-peer setting by exploiting intensional data. Milo et al. [24] provided document and schema rewriting methods to support the exchange of intensional data between heterogeneous applications. Ruberg et al. [32] proposed an approach to materializing service calls in an AXML document stored in a node in a peer-to-peer environment. They considered delegating the execution of the embedded service calls to other peer nodes and suggested two heuristics to find an efficient strategy to delegate and execute the service calls. Their materialization strategy using the delegation of service calls is similar to our invocation plan using intensional data, but their method differs from ours in many respects. They assumed that a node can delegate the invocation of service calls to any other peer node while no concrete method was provided to delegate service calls to other nodes. The heuristics that they proposed are too simple to reduce the search space effectively and no performance evaluation was presented in their work. Moreover, since they focused on the materialization of the service calls contained in an XML document, the delegation of service calls by means of intensional results was not considered at all in their approach.

Benatallah et al. [5,6] presented the Self-serv system to support the model-driven development and decentralized orchestration of composite Web services. It uses state-charts to express the control-flow dependencies and data-flow dependencies among the Web services composing a new composite Web service. To execute the composite service in a peer-to-peer way, it statically extracts scheduling information for the component services from the state-chart specification of the composite service. The result is used by the coordinators of the component Web services which are responsible for the service invocations and control-flow notifications in executing the composite service. They showed by experiments that the proposed approach can be used to develop and deploy large composite Web services in a short time and can achieve better performance when executing composite services than a centralized coordination approach through the effective distribution of the load among the participant systems. Related to the Self-serv approach, Younas et al. [38] proposed the Network-based composition protocol which exploits active network technology to improve the efficiency of Web service composition based on the peer-to-peer paradigm. It deploys and executes the coordinators of component Web services at the active network nodes located between the distributed Web service systems, which can perform customized computations on the messages flowing through them. Since part of the coordination process is accomplished at the network nodes, the response time of the composite Web services can be reduced. These previous approaches, however, generate a static peer-to-peer execution scheme for the composite Web service according to the result of the state-chart modeling of the service composition. By considering intensional data exchange among component services to delegate service invocations, we can obtain various execution schemes for the component services which have better performance than the previous schemes. In fact, the coordination plan for component services having data-flow dependencies among them can also be produced by delegating the service invocations to the component services using intensional parameters. Thus, our method of generating an optimized invocation plan for component Web services using intensional data can be applied and incorporated into these previous approaches.

Amer-Yahia and Kotidis [3] developed a middleware architecture for exchanging large amounts of XML data between two enterprise applications based on Web services and proposed an approach to optimizing the data transfer process including rewriting the schema between the source and target systems. Our work is different from theirs, since we consider the optimization of the performance of the Web services which interact hierarchically using intensional forms of data.

In [29], we proposed a Web service system architecture to support the distributed invocation of Web services using intensional data. Fig. 3 shows a high-level view of the system architecture which includes application servers, a UDDI registry, and the invocation plan optimizer proposed in this paper. Application servers manage and execute the Web services deployed on them. They have an intensional XML data processor and a performance monitor, as well as a conventional SOAP server for handling input and output messages. The performance monitor continuously measures the workload and performance of the application server as well as the costs of communication with other application servers providing related Web services and estimates the invocation cost of the Web services. We assume the existence of a set of pre-determined Web services which are hierarchically related to each other and which already share their description and access information retrieved from a UDDI registry. The estimated costs are periodically delivered to the invocation plan optimizer, which generates an efficient invocation strategy for the Web services being considered using their description and invocation cost estimates. The details of the optimization method will be presented in Section 4. The resulting strategy, which delegates service calls using intensional data, is then propagated to the relevant application servers. The intensional XML data processor incorporated in the application servers parses the intensional parameters supplied to local Web services and the intensional results received from external Web services, which are delivered in SOAP messages at run-time. Then, it activates a subset of the service calls embedded in the intensional data based on the service invocation plan provided by the plan optimizer.

Fig. 3. A high-level system architecture.
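The last step, in which the intensional XML data processor activates only a subset of the embedded calls, can be sketched as follows. This is a simplified illustration and not the processor from [29]: the flat `plan` dictionary (callee mapped to the node that should activate it) and the function name `split_calls` are hypothetical, and the example plan encodes the Fig. 2 scenario from Section 1.

```python
def split_calls(node, embedded_calls, plan):
    """Partition the service calls visible at `node`.

    `plan` maps each callee to the node that should activate it; this
    flat encoding of an invocation plan is an assumption of the sketch.
    Returns (calls to activate locally, calls to pass on to the caller).
    """
    activate = [c for c in embedded_calls if plan.get(c) == node]
    delegate = [c for c in embedded_calls if plan.get(c) != node]
    return activate, delegate

# Plan for the Fig. 2 narrative: WS2 runs WS5, WS1 runs WS6, the client runs WS4.
plan = {"WS1": "Client", "WS2": "WS1", "WS5": "WS2", "WS6": "WS1", "WS4": "Client"}

# WS2 is defined to call WS4, WS5 and WS6.
at_ws2 = split_calls("WS2", ["WS4", "WS5", "WS6"], plan)
# WS1 receives the intensional result of WS2 with the deferred calls.
at_ws1 = split_calls("WS1", at_ws2[1], plan)
# The client receives the intensional result of WS1.
at_client = split_calls("Client", at_ws1[1], plan)
print(at_ws2, at_ws1, at_client)
```

Each node only needs the part of the plan that concerns the calls it can see, which is why the strategy can be propagated to the relevant application servers ahead of time.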

Using intensional forms of data as the parameters and results of Web services requires some extensions to the existing Web service protocols. For example, the existing method of defining the schema (i.e. type) of input and output messages for Web services in WSDL needs to be extended to support intensional data, which may include service calls. Moreover, dynamic type validation should be performed on the input and output messages of the Web services. In [24], an extended version of XML schema and WSDL was proposed to support the use of intensional data in the description of Web services and a schema enforcement method was also presented for verifying and rewriting intensional parameters and results. These techniques can be utilized to implement our approach.

Finally, there have been some studies which attempted to improve the performance of SOAP for Web services and the remote procedure call (RPC) mechanism used for distributed applications in general [17,18,25,34]. Most of this work concentrated on optimizing the implementation of the methods involved, such as the automatic generation of stub codes, data/objects encoding schemes, and lightweight network protocols, in order to reduce the run-time overhead of each remote call. Yeung and Kelly [37] proposed a method of optimizing distributed Java programs in order to delay and aggregate Java RMI calls having some data dependencies, which are sent to the remote server for concurrent execution in order to reduce the network overhead. However, the proposed call forwarding is restricted to the same remote object providing multiple dependent methods (i.e. functions). To the best of our knowledge, the delegation of the invocation of remote procedure calls to other relevant remote procedures using intensional forms of parameters or results has not been fully investigated in the context of distributed applications, nor in Web services.

3. Invocation plan for Web services using intensional data

In this section, we formalize the concept of a feasible invocation plan for Web services exchanging intensional data. For the sake of simplicity, we will address the issues of the intensional results and intensional parameters separately. First, an intensional result can be utilized by a Web service to delegate some Web service calls to its caller Web service, hence the invocation plans using intensional results are defined based on a caller–callee relationship among the given set of composite Web services. Considering each invocation of a Web service independently, we can represent the call relations among a set of Web services by a directed tree. Second, we consider the sending of an intensional form of parameters to a Web service which includes calls to a set of other Web services instead of some extensional values when the input data of the Web service is dependent on the results of the delegated service calls. Thus, invocation plans exploiting intensional parameters are investigated for a set of Web services which are called by a composite Web service and have a data dependency relationship, which can be represented by a directed acyclic graph.

In Section 3.1, we first investigate the invocation plan exploiting the intensional results of the Web services. Then, in Section 3.2, we consider the issue of delegating Web service invocations using intensional parameters among a set of component Web services.

Fig. 4. An example of a call definition tree and an invocation tree for Web services: (a) a call definition tree DT; (b) DT*; (c) an invocation tree for DT; (d) an infeasible invocation for DT.

3.1. Invocation plan using intensional results

For the sake of simplicity, we assume that there exists at most one sequence of service calls between any pair of Web services and that there is no cycle in it, as shown in Fig. 2. Then, such hierarchical interactions of Web services can be represented in the form of a directed tree rooted at a client or a user agent, which is defined as follows.

Definition 1 [Web service call definition tree]. $DT = (V_d, A_d, W_d)$ is a weighted directed tree which represents the call relations among the given Web services, where $V_d$ is a finite set of vertices representing the Web services,¹ $A_d$ is a set of directed edges between pairs of vertices in $V_d$ which represent service calls between the Web services, and $W_d : A_d \to \mathbb{Z}^{+}$ is a weight function that provides the cost of the service invocation for each arc in $A_d$. A directed path from vertex $v$ to vertex $w$ in $DT$ is called a call definition path from $v$ to $w$ and is denoted by $P_d(v,w)$.
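As a concrete reading of Definition 1, the following sketch stores a call definition tree as a parent map with arc weights and recovers a call definition path by walking upward. The class and method names are hypothetical, the costs are made-up values, and the chain A → B → C → D is only my reading of Fig. 4a, which is not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class CallDefinitionTree:
    root: str
    # parent[w] = v means that service v is defined to call service w
    parent: dict = field(default_factory=dict)
    # weight[(v, w)] = positive integer cost of the arc (v, w)
    weight: dict = field(default_factory=dict)

    def add_call(self, caller, callee, cost):
        if not isinstance(cost, int) or cost <= 0:
            raise ValueError("W_d maps arcs to positive integers")
        if callee in self.parent or callee == self.root:
            raise ValueError("a tree vertex has exactly one defined caller")
        self.parent[callee] = caller
        self.weight[(caller, callee)] = cost

    def path(self, v, w):
        """Return the call definition path P_d(v, w) as a list, or None."""
        walk = [w]
        while walk[-1] != v:
            if walk[-1] not in self.parent:
                return None          # reached the root without meeting v
            walk.append(self.parent[walk[-1]])
        return walk[::-1]

dt = CallDefinitionTree("A")
dt.add_call("A", "B", 3)   # illustrative costs only
dt.add_call("B", "C", 2)
dt.add_call("C", "D", 5)
print(dt.path("A", "D"))   # path from A down to D
```

Storing parents rather than child lists keeps the path lookup trivial, because a tree vertex has a single incoming arc.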

Note that our method which will be described subsequently can also be used for a more general type of interactions among Web services. For example, if a Web service calls a particular Web service more than once or there exist multiple paths of service calls between two Web services, such interactions can be represented as a Directed Acyclic Graph (DAG). However, multiple invocations of the same Web service should be differentiated from each other since their parameters and results may have different values. Thus, the DAG can be transformed into a call definition tree by decomposing the vertices and arcs representing the multiple invocations of the same Web service.
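One way to picture this transformation is to unfold the DAG from the root and give every invocation of a shared service its own copy. The sketch below is an assumption about how the decomposition could be carried out, not the authors' procedure; the `name#k` labels and the function name `unfold` are hypothetical, and the input must be acyclic for the recursion to terminate.

```python
def unfold(dag, root):
    """Unfold a call DAG into tree arcs by copying shared callees.

    `dag` maps a service to the list of services it calls (repeats allowed).
    Every invocation gets a fresh copy labelled 'name#k', so each copy has
    exactly one caller. The input is assumed to be acyclic.
    """
    counter = {}
    arcs = []

    def fresh(name):
        counter[name] = counter.get(name, 0) + 1
        return f"{name}#{counter[name]}"

    def visit(name, label):
        for callee in dag.get(name, []):
            child = fresh(callee)
            arcs.append((label, child))
            visit(callee, child)

    visit(root, fresh(root))
    return arcs

# S calls A and B, and both of them call C: two paths from S to C.
dag = {"S": ["A", "B"], "A": ["C"], "B": ["C"]}
arcs = unfold(dag, "S")
print(arcs)
```

In this example the two invocations of C become the separate vertices `C#1` and `C#2`, so each of them can carry its own parameters and results.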

If a call definition path between a pair of Web services, $v$ and $w$, exists in a call definition tree and is denoted by $P_d(v,w) = (v, s_1, s_2, \ldots, s_n, w)$, where $s_n$ ($n \ge 1$) is a vertex contained in the path, an instance of Web service invocations is possible in which intensional results containing a service call to $w$ are transmitted from $s_n$ to $v$ via a subset of Web services in $(s_1, s_2, \ldots, s_{n-1})$ in the reverse order of the sequence and then $v$ directly invokes $w$. Thus, for a given call definition tree $DT = (V_d, A_d, W_d)$, the directed acyclic graph $DT^{*} = (V_d, A_d^{*}, W_d^{*})$ derived from the transitive closure of $A_d$ over $V_d$ and a weight function defined on $A_d^{*}$ indicates all the possible invocations among the Web services in $DT$ obtained by exploiting intensional results (see Fig. 4a and b). Since each Web service in $DT$ must be invoked once from a Web service or client according to the definition of $DT$, a feasible instance of invoking the Web services can be represented as a directed spanning tree for $DT^{*}$ defined as follows.

¹ In this paper, we use the terms 'Web services' and 'vertices' in a call definition tree or invocation tree interchangeably.


Definition 2 [Web service invocation tree considering intensional results]. For a given call definition tree $DT = (V_d, A_d, W_d)$, $IT = (V_e, A_e, W_e)$ is a weighted directed tree which represents a feasible instance of invoking the Web services in $DT$ using intensional results, where $V_e = V_d$, $A_e \subseteq A_d^{*}$, and $W_e : A_e \to \mathbb{Z}^{+}$ satisfies $W_e \subseteq W_d^{*}$. A directed path from a vertex $v$ to a vertex $w$ in $IT$ is called an invocation path from $v$ to $w$ and is denoted by $P_e(v,w)$.
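The arc set of the derived graph DT*, from which the arcs of an invocation tree are drawn, can be enumerated directly from the parent map of DT. The sketch below only computes the arcs; the weight function on the closure is not specified in this part of the paper, so it is left out. The function name `closure_arcs` is hypothetical, and the chain A → B → C → D is again an assumed reading of Fig. 4a.

```python
def closure_arcs(parent):
    """Arcs of DT*: (v, w) for every proper ancestor v of w in DT.

    parent[w] = v encodes the arc (v, w) of the call definition tree.
    """
    arcs = set()
    for w in parent:
        v = parent[w]
        while True:
            arcs.add((v, w))        # v could invoke w via intensional results
            if v not in parent:     # v is the root, stop climbing
                break
            v = parent[v]
    return arcs

parent = {"B": "A", "C": "B", "D": "C"}
star = closure_arcs(parent)
print(sorted(star))
```

For a chain of four services this yields six arcs, three from DT itself and three added by the closure.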

Fig. 4c shows an example of an invocation tree for the call definition tree in Fig. 4a. Note that not every directed spanning tree for $DT^{*}$ represents a feasible invocation instance for the Web services in $DT$. In Fig. 4d, for example, if A directly calls C, D cannot be invoked by B since the service call to D cannot be sent to B using intensional results. Thus, the tree in Fig. 4d is not a feasible invocation tree for $DT$. We observe that the invocation trees for a call definition tree have the following properties.
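The Fig. 4d argument can be turned into a small check. The rule used below is my operational reading of the delegation mechanism and not a procedure stated in the paper: the call to w originates at its defined caller p and can only travel upward along actual invocations, so the invoker of w has to be p itself or an ancestor of p in the candidate tree. The function names and the parent-map encoding are hypothetical, and the chain A → B → C → D is an assumed reading of Fig. 4a.

```python
def ancestors(parent, x):
    """Yield the proper ancestors of x under the given parent map."""
    while x in parent:
        x = parent[x]
        yield x

def is_feasible(dt_parent, it_parent):
    """Check a candidate invocation tree against a call definition tree.

    Both trees are parent maps: parent[w] = v encodes the arc (v, w).
    """
    if set(dt_parent) != set(it_parent):
        return False                  # every service is invoked exactly once
    # Pass 1: every arc must belong to the closure of the defined calls.
    for w, invoker in it_parent.items():
        if invoker not in ancestors(dt_parent, w):
            return False
    # Pass 2 (assumed rule): the call to w must be able to reach its invoker.
    for w, invoker in it_parent.items():
        p = dt_parent[w]
        if invoker != p and invoker not in ancestors(it_parent, p):
            return False
    return True

dt = {"B": "A", "C": "B", "D": "C"}       # A -> B -> C -> D
fig_4d = {"B": "A", "C": "A", "D": "B"}   # A calls C directly, B calls D
all_by_a = {"B": "A", "C": "A", "D": "A"} # A performs every delegated call
print(is_feasible(dt, fig_4d), is_feasible(dt, all_by_a))
```

Under this rule the Fig. 4d tree is rejected because B never invokes C, so no intensional result carrying the call to D can reach B, which agrees with the explanation in the text.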

Lemma 1. If there is a call definition path $P_d(v,w)$ for a pair of Web services $v$ and $w$ in a call definition tree $DT = (V_d, A_d, W_d)$, there exists no invocation tree for $DT$ which contains an invocation path $P_e(w,v)$ from $w$ to $v$.

Proof. Suppose that there exists an invocation tree $IT = (V_e, A_e, W_e)$ containing the invocation path $P_e(w,v)$ and let $P_e(w,v) = (s_1, s_2, \ldots, s_n)$, where $s_n$ ($n \ge 2$) is a vertex in $IT$. Since $A_e \subseteq A_d^{*}$ by Definition 2, there should be a path $P_d(s_i, s_{i+1})$ in $DT$ for all arcs $(s_i, s_{i+1})$ in $P_e(w,v)$ ($1 \le i \le n-1$). This means that $DT$ has a call definition path $P_d(w,v)$ and thus has a cycle between $v$ and $w$. This is a contradiction since $DT$ is defined as a tree in Definition 1. □

Lemma 2. Assume that a call definition tree $DT$ has a path $P_d(v,w) = (v, s_1, s_2, \ldots, s_n, w)$ ($n \ge 1$) between a pair of Web services $v$ and $w$. If an invocation tree $IT$ for $DT$ has an arc $(v,w)$, it also has invocation paths $P_e(v, s_i)$ for all $s_i$ ($1 \le i \le n$) between $v$ and $w$ in $P_d(v,w)$.

Proof. This lemma can be proved by mathematical induction on the number $n$ of Web services between $v$ and $w$ in $P_d(v,w)$. When $n = 1$, i.e. $P_d(v,w) = (v, s_1, w)$, $v$ can call $w$ only if it invokes $s_1$ and receives an intensional result from it containing a service call to $w$; hence the lemma holds. Now assume that the lemma holds for all $n$ such that $n \le k$ ($k \ge 1$). This means that $|P_d(v,w)| \le k + 1$, where $|P_d(v,w)|$ denotes the length of the path $P_d(v,w)$. When $n = k + 1$, i.e. $|P_d(v,w)| = k + 2$, an intensional result containing the service call to $w$ should be delivered from $s_n$ to $v$, directly or indirectly, before $v$ invokes $w$. That is, an invocation path $P_e(v, s_n)$ from $v$ to $s_n$ should exist in $IT$. Suppose that $P_e(v, s_n) = (r_1, r_2, \ldots, r_m)$, where $r_1 = v$ and $r_m = s_n$. Then, all $r_i$ ($1 < i < m$) should belong to the set $S = \{ s_i \mid 1 \le i \le n-1 \}$ by the definition of the invocation tree and the uniqueness of the path between a pair of vertices in the call definition tree. Thus, $IT$ has paths $P_e(v, r_i)$ for all $r_i$ ($1 < i \le m$), and since $|P_d(r_i, r_{i+1})| \le k + 1$ for all $r_i$ ($1 \le i < m$), an invocation path $P_e(r_i, s)$ should exist in $IT$ for all vertices $s$ between $r_i$ and $r_{i+1}$ in $P_d(r_i, r_{i+1})$, if there are any, by the assumption required for induction. Consequently, for all the vertices $s_i$ between $v$ and $w$ in $P_d(v,w)$, there exists $P_e(v, s_i)$ in $IT$. □

From the above lemmas, we have the following result concerning the feasible invocation plans for the given Web services using intensional results (e.g., see Fig. 5).

Fig. 5. Invocations of Web services using intensional results: (a) feasible invocations; (b) infeasible invocations.

Theorem 1. Let $DT$ be a call definition tree of Web services and $IT$ be an invocation tree for $DT$. Suppose that $DT$ has a call definition path $P_d(v,w)$ between a pair of Web services $v$ and $w$ such that $P_d(v,w) = (v, s_1, s_2, \ldots, s_n, w)$, where $s_n$ ($n \ge 1$) is a vertex in $DT$. If $IT$ has an arc $(v,w)$ (i.e., $v$ invokes $w$), then for all vertices $s_i$ ($1 \le i \le n$) in $P_d(v,w)$, all of the vertices $x$ adjacent to or adjacent from $s_i$ (i.e., the parent vertex and child vertices of $s_i$) in $IT$ are also in the path $P_d(v, s_n)$.

Proof. Assume that the vertex $x$ adjacent to $s_i$ (i.e., the parent vertex of $s_i$) in $IT$ is not in the path $P_d(v, s_n)$ but is an ancestor of $v$ in $DT$. Since $(v,w)$ is in $IT$, there exists an invocation path $P_e(v, s_i)$ in $IT$ by Lemma 2. However, since $P_e(v,x)$ does not exist in $IT$ by Lemma 1, $x$ is not on $P_e(v, s_i)$. This means that $s_i$ should be invoked both from $x$ and another Web service in $P_e(v, s_i)$, which is not a feasible invocation instance of the Web services in $DT$. On the other hand, suppose that a vertex $x$ adjacent from $s_i$ (i.e., a child vertex of $s_i$) in $IT$ is not in the path $P_d(v,w)$ but is a descendant of $w$ in $DT$. Then, since $(s_i,x)$ is in $IT$ and $w$ is on $P_d(s_i,x)$, there exists $P_e(s_i,w)$ in $IT$ by Lemma 2. Since $v$ is not in the invocation path by Lemma 1, $w$ should be called both from $v$ and another Web service in $P_e(s_i,w)$. That is also a contradiction to the definition of an invocation tree. □

3.2. Invocation plan using intensional parameters

In this section, we consider the data dependency relation among the Web services which are components of a composite Web service and formalize the invocation plan for the component Web services which exploit intensional parameters. Parameters of the intensional form may contain calls to some Web services and be used to delegate the service invocations to another Web service. In particular, they can be exploited when data dependency exists between a pair of component Web services which should be successively invoked by a composite Web service and the result of the component Web service invoked earlier is used only to make the input parameter of the other component Web service. For example, assume that a composite Web service S calls Web services A and B and that the input parameter of B is dependent on the result of A. If the result of A is not used in S except for the purpose of generating the input data for B, S can send B an intensional parameter which contains a service call to A instead of calling A by itself, and in this way make B invoke A to obtain extensional input data before B executes its own business logic.

Definition 3 [Data dependency]. For a pair of Web services v and w, there exists data dependency from v to w if the result from w is needed to generate the value of an input parameter of v.

Data dependency between a pair of Web services called by a composite Web service is a partial order relation on the set of component Web services. Thus, we can represent data dependencies among the component Web services as a directed acyclic graph rooted at the composite Web service. We define the Web service dependency graph as follows.

Definition 4 [Web service dependency graph]. DG = (r, Vd, Ad, Wd) is a weighted directed acyclic graph, where r is the vertex denoting a composite Web service, Vd is a set of vertices representing the component Web services which are defined to be called by r, Ad is a union of the set of arcs (v, w) over Vd denoting data dependency from v to w and the set of arcs (r, v) for all v in Vd, and Wd : Ad → Z+ is a weight function that provides the cost of service invocation for each arc in Ad. A directed path from a vertex v to a vertex w in DG is called a dependency path from v to w and is denoted by Pd(v, w).

Fig. 6a shows an example dependency graph which represents the data dependencies among six component Web services invoked by a composite Web service. A dependency graph DG for the given Web services can be used to delegate service calls to some other Web services using intensional parameters. Suppose that a dependency path (s1, s2, . . . , sn) exists in DG, where sn (n ≥ 3) is a vertex in DG. Without considering intensional parameters, the composite Web service r should call the Web services on the path in the reverse order (i.e., from sn to s1) by passing an extensional input parameter to a Web service si which is generated from the result of the previously executed Web service si+1 (1 ≤ i ≤ n−1). However, r can delegate the invocation of the Web services on the path (s2, s3, . . . , sn) to s1 using an intensional form of parameter containing the calls to those Web services, and then s1 can either activate any embedded service call or delegate it to another Web service using an intensional parameter. Consequently, for a given dependency graph DG = (r, Vd, Ad, Wd), the graph DG* = (r, Vd, Ad*, Wd*) derived from the transitive closure Ad* of Ad over Vd and a weight function Wd* defined over Ad* represents all the possible invocations among the Web services in DG obtained by exploiting intensional parameters. Assuming that each Web service call defined in a composite Web service should be invoked only once even if multiple Web services have data dependencies on it, a feasible instance of invoking the component Web services can be represented as a directed spanning tree for DG* which is defined as follows.

Fig. 6. An example of the invocation trees for a dependency graph of Web services: (a) a dependency graph DG; (b) possible invocation trees for DG.

Definition 5 [Web service invocation tree considering intensional parameters]. For a given dependency graph DG = (r, Vd, Ad, Wd), IT = (Ve, Ae, We) is a weighted directed tree which represents a feasible instance of invoking the Web services within it using intensional parameters, where Ve = Vd, Ae ⊆ Ad*, and We : Ae → Z+ satisfies We ⊆ Wd*. A directed path from a vertex v to a vertex w in IT is called an invocation path from v to w and is denoted by Pe(v, w).
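The construction of Ad* and the arc condition Ae ⊆ Ad* of Definition 5 can be illustrated with a small Python sketch. The three-service dependency chain, the function names, and the caller-map encoding of a tree (each component service mapped to its caller) are assumptions made for illustration only. The check covers only the arc condition, which is necessary but not sufficient for feasibility.

```python
def transitive_closure(vertices, arcs):
    """Return the set of arcs (s, x) such that x is reachable from s via arcs."""
    succ = {v: set() for v in vertices}
    for v, w in arcs:
        succ[v].add(w)
    closure = set()
    for s in vertices:
        seen, stack = set(), list(succ[s])
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(succ[x])
        closure.update((s, x) for x in seen)
    return closure


def is_candidate_invocation_tree(caller, components, closure):
    """Check that every component has one caller and every arc lies in Ad*."""
    return set(caller) == set(components) and all(
        (u, w) in closure for w, u in caller.items()
    )


# Hypothetical example: s1 depends on s2, and s2 depends on s3.
components = ["s1", "s2", "s3"]
dependencies = [("s1", "s2"), ("s2", "s3")]
arcs = dependencies + [("r", v) for v in components]  # Ad also contains (r, v)
closure = transitive_closure(["r"] + components, arcs)
```

Here the closure adds the arc (s1, s3), so a plan in which s1 invokes both s2 and s3 passes the arc check, whereas a plan in which s2 invokes s1 does not.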

Fig. 6b shows two example invocation trees for the dependency graph given in Fig. 6a. We observe that a feasible invocation instance for the Web services in DG is a directed spanning tree for DG* which satisfies additional restrictions. Unlike the call definition tree in the previous section, a dependency graph may have multiple paths between a pair of Web services, and invocation trees for such a dependency graph have the following property (see Fig. 7).

Theorem 2. If an invocation tree IT for a dependency graph DG = (r, Vd, Ad, Wd) has an arc (v, w) between a pair of Web services v and w, then for all vertices x such that an arc (x, w) belongs to Ad (i.e., which are dependent on w), there is an invocation path Pe(v, x) in IT.

Proof. The arc (v, w) in IT means that v invokes w directly and receives the result of w. Since x depends on w, an extensional input parameter for x derived from the result of w is needed when x is invoked, and thus it should be sent from v to x directly or indirectly through a subset of Web services in a dependency path from v to x. Therefore, an invocation path Pe(v, x) should exist in IT. □
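The condition of Theorem 2 can be checked mechanically for a candidate plan. The following Python sketch is an illustration under assumptions: the tree is encoded as a caller map, only the data dependency arcs among component services are passed in (the arcs from r are left out), and an invocation path from a vertex to itself is treated as trivially present. The function name and the example graph are hypothetical.

```python
def satisfies_theorem_2(caller, dependencies):
    """caller[w] is the service that invokes w; dependencies holds arcs (x, w)
    meaning that x needs the result of w."""

    def reaches(v, x):
        # An invocation path from v to x exists if walking up from x
        # through successive callers arrives at v.
        while x != v:
            if x not in caller:
                return False
            x = caller[x]
        return True

    dependents = {}
    for x, w in dependencies:
        dependents.setdefault(w, []).append(x)
    return all(
        reaches(v, x)
        for w, v in caller.items()
        for x in dependents.get(w, [])
    )


# Hypothetical example: s1 needs s2 and s3, and s3 needs s2.
deps = [("s1", "s2"), ("s1", "s3"), ("s3", "s2")]
plan_ok = {"s1": "r", "s2": "s1", "s3": "s1"}   # s1 invokes s2, then passes data to s3
plan_bad = {"s1": "r", "s3": "s1", "s2": "s3"}  # s3 invokes s2 but cannot reach s1
```

In the second plan, s3 invokes s2 although s1 also depends on s2 and no invocation path leads from s3 to s1, so the condition fails.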

Fig. 7. A feasible invocation tree considering intensional parameters (legend: data dependency, service invocation, invocation path).


According to the above theorem, we can identify the possible callers of a Web service s on which multiple Web services are dependent in the dependency graph. They should be close enough to the root of the graph to be able to invoke the Web services which are dependent on s. This means that the callers should also be dependent on these Web services, directly or transitively. Invocation trees for a dependency graph also have similar properties to those for a call definition tree, which are described in the following lemmas and theorem.

Lemma 3. If there is a dependency path Pd(v, w) for a pair of Web services v and w in a dependency graph DG = (r, Vd, Ad, Wd), there exists no invocation tree for DG containing an invocation path Pe(w, v) from w to v.

Proof. This lemma can be proved in a similar manner to Lemma 1 by showing that an invocation tree containing an invocation path Pe(w, v) gives rise to a cycle in DG, which contradicts the definition of the acyclic dependency graph DG. □

Lemma 4. If an invocation tree IT for a dependency graph DG = (r, Vd, Ad, Wd) has an arc (v, w) between a pair of Web services v and w, then for all dependency paths Pd(v, w) = (v, s1, s2, . . . , sn, w) (n ≥ 1) from v to w in DG, there should be an invocation path Pe(v, si) in IT for all si in Pd(v, w).

Proof. This lemma can be proved by mathematical induction. Since the invocation arc (v, w) is in IT and the arc (sn, w) belongs to Ad, there is an invocation path Pe(v, sn) in IT by Theorem 2. Suppose that there exists an invocation path Pe(v, si) for 1 < i ≤ n. Then, for the caller x of si in Pe(v, si), there exists an invocation path Pe(x, si−1) by Theorem 2, since the arc (x, si) is in IT and the arc (si−1, si) is in Ad. Thus there is an invocation path Pe(v, si−1) in IT. Consequently, for all the vertices si between v and w in Pd(v, w), there exists Pe(v, si) in IT. □

From the above lemmas, we have the following result concerning the feasible invocation plan of Web services using intensional parameters.

Theorem 3. Let DG be a dependency graph of Web services and IT be an invocation tree for DG. Suppose that DG has a dependency path Pd(v, w) = (v, s1, s2, . . . , sn, w) (n ≥ 1) between a pair of Web services v and w. If IT has an arc (v, w) (i.e., v invokes w), then for all vertices si (1 ≤ i ≤ n) between v and w in Pd(v, w), all vertices x adjacent to or from si (i.e., the parent vertex or the child vertices of si) in IT are also in the path Pd(v, sn).

This theorem can be proved in much the same way as Theorem 1, based on Lemmas 3 and 4. Note that Lemmas 3 and 4 and Theorem 3 are respectively equivalent to Lemmas 1 and 2 and Theorem 1, which concern the properties of the invocation plans for a call definition tree in Section 3.1. Thus, for the Web services having the same structure of call definition relations or dependency relations, there is a duality between the invocation plans using intensional results and those using intensional parameters, considering only the tree-structured dependency relations among the component Web services.

4. Optimizing invocation plans

An invocation tree for a call definition tree or a dependency graph of Web services described in Section 3 represents a global plan to invoke the Web services efficiently using intensional results or parameters. As shown in Section 2, it is generated by the invocation plan optimizer for Web services, assuming that the information on the composition and dependency relation of the Web services can be obtained directly from the Web service providers or indirectly from a registry service such as a UDDI registry [28]. There are usually a large number of invocation trees for a given set of Web services, as described in Section 4.2, and their execution costs, which are defined as the sum of the costs of all the Web service invocations in the invocation trees, differ from one another. Thus, the invocation plan optimizer should be able to find an efficient invocation tree that has a small execution cost.

In this section, we focus on the optimization of the invocation plans for Web services which can return and accept intensional results. We first present our cost model for invoking a Web service from a client node. Then, we analyze the difficulty involved in conducting an exhaustive search for the optimal invocation plan and propose two approaches to finding an optimal solution and an efficient one for the given Web services in order to improve the overall performance of the distributed Web services.

Owing to the duality of the invocation plans described in the previous section, the proposed method can also be used for generating an efficient invocation plan for a set of component Web services which have a tree-structured dependency relation.

4.1. Cost model

In this paper, we use a cost model for the invocation of Web services which is similar to that presented in [32]. Activating a service call typically involves client-side tasks, server-side tasks, and the transmission of input and output messages between the client and server. The client-side tasks include initializing service activation, generating the input SOAP message, decoding the output SOAP message returned from the Web service, and parsing the result XML data to identify new service calls embedded in the intensional result. The server-side tasks consist of decoding the input SOAP message delivered from the client, parsing the decoded XML data, executing the invoked Web service, and then generating the result SOAP message. Note that, without considering replication of a Web service to multiple nodes in the network, the costs of the server-side tasks are the same, regardless of the candidate callers of the Web service. Thus, we define the cost of invoking a Web service s from a client c as the sum of the client-side cost and the communication cost between the client and the server as follows.

$$\mathrm{InvocationCost}(c, s) = \mathrm{Init}(c) + \mathrm{Encode}(c, s) + \mathrm{Decode}(c, s) + \mathrm{Parse}(c, s) + \mathrm{Comm}(c, s) + \mathrm{Comm}(s, c)$$

The meaning of the symbols is described in Table 1. We use the response time as the metric of the cost factors since it has the greatest perceived impact on the user [9,31].
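As a small illustration of the cost definition, the following Python sketch sums the client-side terms and the two transfer terms. The class name, the field names, and the use of milliseconds are assumptions introduced here; the paper only specifies which terms are added.

```python
from dataclasses import dataclass


@dataclass
class ClientSideCost:
    """Client-side terms for a caller c and a service s (assumed unit: ms)."""
    init: float    # Init(c)
    encode: float  # Encode(c, s)
    decode: float  # Decode(c, s)
    parse: float   # Parse(c, s)


def invocation_cost(client: ClientSideCost, comm_request: float, comm_response: float) -> float:
    """InvocationCost(c, s): client-side cost plus Comm(c, s) and Comm(s, c)."""
    return (client.init + client.encode + client.decode + client.parse
            + comm_request + comm_response)


# Hypothetical measurements for one caller/service pair.
example_cost = invocation_cost(ClientSideCost(2.0, 5.0, 4.0, 3.0), 20.0, 30.0)
```

Server-side time is deliberately absent, since it does not depend on which caller is chosen.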

For the cost-based optimization, we need to estimate the invocation costs of the relevant Web services from each composite Web service. Numerous tools and techniques could be used for the measurement and prediction of the workload and performance of the distributed systems as well as the Web service systems [9,22,31]. The client-side cost of each node can be estimated by monitoring the workload and predicting the performance of the node continuously. This estimation may be performed and published by a pre-defined calibrating service in each node as suggested in [32]. The estimation of the communication cost between a pair of nodes can be achieved by monitoring the bandwidth, latency, and traffic over the network between them. To cope with the wide variability of the network performance, however, we need to periodically re-optimize the invocation plan for the given Web services.

4.2. Exhaustive search

A simple approach to optimization is to search the solution space exhaustively by enumerating all the possible invocation trees defined in Section 3.1 and select the one having the minimum execution cost. We can generate all of the possible invocation trees systematically by considering the vertices in the given call definition tree DT in increasing order of their depth. Let IT(k) be the set of possible invocation trees for a subset of vertices in DT whose depths are no greater than k. Then, every invocation tree Ti in IT(k) can be derived from an invocation tree Tj in IT(k − 1) by including Tj as a sub-graph and adding the vertices v of depth k from DT and the new arcs (u, v) from a vertex u in Tj to v. Note that the caller vertex u of v should be selected only among those vertices on the path from the root to the parent vertex of v in Tj.

Table 1
Meaning of the symbols

Symbol          Description
Init(c)         Time to initialize invocation of a Web service at the node of c
Encode(c, s)    Average time to generate an input SOAP message to s at the node of c
Decode(c, s)    Average time to decompose an output SOAP message from s at the node of c
Parse(c, s)     Average time to parse the result (intensional) XML data from s at the node of c
Comm(c, s)      Average time to transfer data from c to s


While this exhaustive search always finds an optimal solution, it requires a huge amount of execution time. Assuming that DT is a perfect tree in which all internal vertices have the same out-degree f and all leaf vertices have the same depth h (see footnote 2), we have

$$|IT(h)| = \sum_{k_1, k_2, \ldots, k_{h-1}} \binom{f^{h-1}}{k_1, k_2, \ldots, k_{h-1}} \left(2^f\right)^{k_1} \left(3^f\right)^{k_2} \cdots \left(h^f\right)^{k_{h-1}} = \left(2^f + 3^f + \cdots + h^f\right)^{f^{h-1}} = \left(\sum_{i=2}^{h} i^f\right)^{f^{h-1}}$$

where $k_1 + k_2 + \cdots + k_{h-1} = f^{h-1}$ and $h \ge 2$.

This means that as the fan-out and height of the call definition tree increase, the number of possible invocation trees grows drastically. For example, even for the small values of f = 3 and h = 4, the number of invocation trees amounts to 99^27 ≈ 7.6 × 10^53. In terms of the number n of Web services, this method has a time complexity of, hence it is difficult to use in a real environment.
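As a rough cross-check of the counting argument, the following Python sketch enumerates invocation trees for small perfect call definition trees using the caller-selection rule of this section (the caller of a vertex is chosen on the invocation path from the root to its parent) and evaluates the closed form for comparison. The function names and the dictionary encoding of trees are illustrative assumptions.

```python
def perfect_tree(f, h):
    """Perfect tree of out-degree f and height h as {vertex: [children]}, root 0."""
    children, frontier, next_id = {}, [0], 1
    for _ in range(h):
        new_frontier = []
        for v in frontier:
            children[v] = list(range(next_id, next_id + f))
            next_id += f
            new_frontier.extend(children[v])
        frontier = new_frontier
    return children


def count_invocation_trees(children, root=0):
    """Count caller assignments obtained by visiting vertices in depth order."""
    parent = {c: p for p, cs in children.items() for c in cs}
    order, i = [root], 0
    while i < len(order):                      # breadth-first vertex order
        order.extend(children.get(order[i], []))
        i += 1
    order = order[1:]
    caller = {}

    def extend(i):
        if i == len(order):
            return 1
        w = order[i]
        path = [parent[w]]                     # invocation path parent(w) .. root
        while path[-1] != root:
            path.append(caller[path[-1]])
        total = 0
        for u in path:                         # every vertex on the path may call w
            caller[w] = u
            total += extend(i + 1)
        del caller[w]
        return total

    return extend(0)


def closed_form(f, h):
    """(2^f + 3^f + ... + h^f) raised to the power f^(h-1)."""
    return sum(i ** f for i in range(2, h + 1)) ** (f ** (h - 1))
```

Enumeration is only practical for very small f and h; for f = 3 and h = 4 only the closed form can be evaluated, which gives 99 raised to the power 27.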

4.3. A* search algorithm

In this section, we provide an algorithm based on the A* search strategy [27] for finding an optimal invocation plan more efficiently than the exhaustive search. It searches for an optimal solution in a state-space graph of possible invocation plans, pruning a part of the search space that does not contain an optimal solution.

The A* search is a general graph search algorithm which is used to find a path from an initial node to a goal node. It is a kind of best-first search approach which uses a heuristic estimation function f(n) to rank each node n by a cost estimate of the best route to a goal node that goes through the node n and visits the nodes in the order of the heuristic estimate. Let f(n) = g(n) + h(n), where g(n) is the cost of the current path from the initial node to n and h(n) is the heuristic estimate of the minimal cost h*(n) of going from n to a goal node. It has been proven that any A* algorithm using a heuristic estimation function h(n) such that h(n) ≤ h*(n) for all nodes n is admissible, i.e., is guaranteed to find the minimal cost path from the start node to a goal node [27].

The A* algorithm which is proposed to find an optimal invocation plan is shown in Fig. 8. Given a call definition tree DT = (Vd, Ad, Wd) whose root vertex is denoted by r, a state s in the search space represents an invocation tree ITs for a sub-graph of DT containing all the vertices whose depth is no greater than ds in DT. The algorithm begins with a start state s0 for r in the set OPEN of active states to be explored. At each stage, it selects and removes the state s from OPEN that has the smallest value of f(s) defined below. f(s) refers to a heuristic estimation of the actual cost f*(s) of an optimal solution that can be achieved from s, i.e., a minimum cost invocation tree for DT containing ITs as a sub-graph. Then, as shown in lines 16–20 in Fig. 8, it expands s by generating its successor states representing the invocation trees which have ITs as a sub-graph and additionally contain the arcs from the vertices in ITs to the vertices of depth ds + 1 in DT (refer to Fig. 9a). The algorithm terminates when a solution state is found from OPEN that has an invocation tree containing all the vertices in DT.

To propose an admissible heuristic function f used in the search algorithm, we consider an auxiliary tree for each state in the search space defined below.

Definition 6. An auxiliary tree for the invocation tree ITs = (Vs, As, Ws) of a state s is ATs = (V, A, W), where V = Vd, A = As ∪ {(u, w) | w ∈ Vd − Vs and, for each w, u is a vertex in Pe(r, v) in ITs or in Pd(v, w) in DT such that Wd*((u, w)) is minimal, where v is a leaf vertex of ITs that is on the path Pd(r, w)}, and W : A → Z+ satisfies W ⊆ Wd*.

Fig. 8. A* search algorithm.

Fig. 9. Execution of the A* algorithm.

2 In this paper, the depth of a vertex in a tree means the number of edges in the path from the root to the vertex. The height of a tree is defined as the depth of the deepest leaf vertex in the tree.

Note that ATs is a spanning tree for DT* which contains ITs as a sub-graph. In addition, for each vertex w in DT* which is not in ITs, an arc that has the minimum invocation cost for w is selected among the incoming arcs into w in DT* and included in ATs (see Fig. 9b). Since an incoming arc for each vertex is selected for ATs without considering the other vertices in Vd − Vs, it may not represent a feasible invocation plan for DT satisfying Theorem 1. However, we have Weight(ATs) ≤ Weight(IT) for all the invocation trees IT for DT which contain ITs. Thus, we define the heuristic function f(s) to estimate the cost of the optimal invocation tree containing ITs by

$$f(s) = \mathrm{Weight}(AT_s) = \mathrm{Weight}(IT_s) + \sum_{w \in V_d - V_s} \min_{u \in P_e(r,v) \cup P_d(v,w)} W_d^*((u, w)) \qquad (1)$$

where v is a leaf vertex of ITs on the path Pd(r, w) in DT, and we have f(s) ≤ f*(s). Assuming that g*(s) is the cost of ITs and h(s) is a heuristic estimate of the minimal cost h*(s) of the invocations of the Web services not included in ITs, we have h(s) = f(s) − g*(s) ≤ f*(s) − g*(s) = h*(s) for all states s in the search space. Therefore, the admissibility property of A* algorithms ensures that the proposed algorithm is guaranteed to generate an optimal-cost invocation plan for any given hierarchical call definitions of Web services.
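The search described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the algorithm of Fig. 8: the call definition tree is given as a dictionary `children`, `cost[(u, w)]` is assumed to hold the estimate Wd*((u, w)) for every vertex w and every ancestor u of w in DT, and all function and variable names are hypothetical. A state is a caller map covering all vertices up to some depth, and the heuristic follows Eq. (1).

```python
import heapq
from itertools import count, product


def a_star_invocation_plan(children, root, cost):
    """Return (total cost, caller map) of a minimum-cost invocation tree."""
    parent = {c: p for p, cs in children.items() for c in cs}
    depth, levels = {root: 0}, [[root]]
    while True:                                   # group vertices by depth
        nxt = [c for v in levels[-1] for c in children.get(v, [])]
        if not nxt:
            break
        for c in nxt:
            depth[c] = len(levels)
        levels.append(nxt)
    height = len(levels) - 1

    def inv_path(caller, v):
        """Vertices on the invocation path Pe(r, v), listed from v up to r."""
        path = [v]
        while path[-1] != root:
            path.append(caller[path[-1]])
        return path

    def heuristic(caller, d):
        """Sum over unplaced w of the cheapest arc from Pe(r, v) or Pd(v, w)."""
        total = 0
        for level in levels[d + 1:]:
            for w in level:
                below, x = [], parent[w]
                while depth[x] > d:               # DT ancestors of w deeper than d
                    below.append(x)
                    x = parent[x]
                candidates = inv_path(caller, x) + below   # x is the leaf v of ITs
                total += min(cost[(u, w)] for u in candidates)
        return total

    tie = count()                                 # tie-breaker so dicts are never compared
    open_heap = [(heuristic({}, 0), next(tie), 0, 0, {})]
    while open_heap:
        _, _, g, d, caller = heapq.heappop(open_heap)
        if d == height:                           # all vertices placed: solution state
            return g, caller
        new_level = levels[d + 1]
        options = [inv_path(caller, parent[w]) for w in new_level]
        for choice in product(*options):          # successor states of depth d + 1
            succ, g2 = dict(caller), g
            for w, u in zip(new_level, choice):
                succ[w] = u
                g2 += cost[(u, w)]
            f2 = g2 + heuristic(succ, d + 1)
            heapq.heappush(open_heap, (f2, next(tie), g2, d + 1, succ))
    return None
```

Because every state has a single predecessor, the state space is a tree and no closed list is kept in this sketch. The number of successors of a state still grows exponentially with the number of vertices on the next level, which is consistent with the scalability concern raised for the A* approach.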

4.4. Greedy algorithm

The A* algorithm proposed in the previous section, although able to find an optimal solution, may experience severe performance degradation as the number of Web services increases, as shown in our experimental result in Section 5. To produce a cost-efficient invocation plan for a large number of Web services in a short time, we propose the greedy method shown in Fig. 10.

Fig. 10. Greedy algorithm.

Fig. 11. Selecting the caller of w in the greedy algorithm.

This algorithm builds an invocation tree IT for a given call definition tree DT incrementally by traversing DT in a breadth-first manner starting from the root vertex. For each vertex w visited during the search, it selects a vertex u among the vertices already contained in IT as the caller of w and adds w and the arc (u, w) into IT. To guarantee that IT is a feasible invocation tree for DT satisfying Theorem 1, u should be selected from those vertices which are on the invocation path Pe(r, v) from the root vertex r to the parent vertex v of w in IT. Note that if w is invoked by a Web service u, the descendants of w in DT can be called only from the vertices among u and its ancestors in IT. That is, a descendant x of w can be called from a vertex t on the path Pe(r, u) as shown in Fig. 11. To generate an efficient invocation plan for the overall Web services, we consider not only the invocation cost for w but also the invocation costs for the descendant Web services of w when selecting the caller of w. Specifically, denoting the set of the descendants of w by Desc(w), we select as the caller of w a vertex u on Pe(r, v) which minimizes the value of

$$W_d^*((u, w)) + \sum_{x \in Desc(w)} \min_{t \in P_e(r,u)} W_d^*((t, x)).$$

This means that for the vertices x in Desc(w) we should repetitively find their callers t having the minimum invocation cost from different sets of possible callers on Pe(r, u) determined by considering the position of the candidate caller u of w on Pe(r, v). Therefore, the time complexity of finding the caller of w is O(|Desc(w)| · l²), where l is the length of Pe(r, v).

However, this task can be performed more efficiently in O(|Desc(w)| · l) by avoiding repetitive cost comparisons using extra memory space in O(|Desc(w)|) as follows. Let us consider the candidate callers of w on Pe(r, v) in increasing order of depth. For a candidate caller u, we store the minimum costs of invoking the descendant vertices x of w from some vertices t in Pe(r, u). Then, we reuse the stored information in the next stage with the next candidate caller of w, as shown in lines 12–23 in the algorithm in Fig. 10. As a result, if we suppose that DT is a perfect tree which has out-degree f and height h and contains n vertices, the proposed algorithm can find an invocation tree for DT in

$$O\left(\sum_{i=1}^{h} |Desc(w)| \cdot i \cdot f^i\right) = O\left(\sum_{i=1}^{h} \frac{f^{h-i+1} - 1}{f - 1} \cdot i \cdot f^i\right) = O\left(\frac{f}{f-1}\, n \log_f^2 n - 2n \log_f n\right) = O\left(n \log^2 n\right).$$
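The caller-selection rule and the reuse of stored minimum costs can be sketched in Python as follows. This is an illustration under assumptions, not a transcription of Fig. 10: `children` encodes DT as a dictionary, `cost[(u, w)]` is assumed to hold Wd*((u, w)) for every vertex w and every ancestor u of w in DT, ties are broken in favor of the candidate closest to the root, and all names are hypothetical.

```python
from collections import deque


def greedy_invocation_plan(children, root, cost):
    """Return (total cost, caller map) built by one breadth-first pass over DT."""
    parent = {c: p for p, cs in children.items() for c in cs}

    def descendants(w):
        out, stack = [], list(children.get(w, []))
        while stack:
            x = stack.pop()
            out.append(x)
            stack.extend(children.get(x, []))
        return out

    caller = {}
    queue = deque(children.get(root, []))
    while queue:
        w = queue.popleft()
        # Candidate callers: the invocation path Pe(r, v), v being the parent of w,
        # listed from the root downwards (increasing depth).
        path = [parent[w]]
        while path[-1] != root:
            path.append(caller[path[-1]])
        path.reverse()

        desc = descendants(w)
        run_min = {x: float("inf") for x in desc}   # cheapest caller of x on Pe(r, u)
        best_u, best_val = None, float("inf")
        for u in path:
            # Pe(r, u) grows by exactly one vertex, so one comparison per
            # descendant updates the stored minimum.
            for x in desc:
                run_min[x] = min(run_min[x], cost[(u, x)])
            val = cost[(u, w)] + sum(run_min.values())
            if val < best_val:
                best_u, best_val = u, val
        caller[w] = best_u
        queue.extend(children.get(w, []))

    total = sum(cost[(u, w)] for w, u in caller.items())
    return total, caller
```

The dictionary `run_min` plays the role of the extra O(|Desc(w)|) memory: it is carried from one candidate caller to the next instead of being recomputed, which removes one factor of l from the cost of choosing the caller of w.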

5. Performance evaluation

This section evaluates the effectiveness and performance of the proposed methods by experiments. In particular, we compared the capability of the methods with that of the Context heuristic search approach proposed by Ruberg et al. [32]. As described in Section 2, their approach attempted to generate an efficient materialization strategy for an intensional XML document by delegating the activation of Web service calls embedded in the document to the appropriate nodes in a peer-to-peer system. Considering a directed acyclic graph representing the data dependencies among the service calls in the document, their strategy delegates the invocation of a set of service calls contained in a sub-graph of the dependency graph to a peer node. To find a solution for a given intensional XML document quickly from the potentially large search space of materialization strategies, a Context heuristic was employed. This heuristic restricts the candidates of the invocation peer for a sub-graph rooted at a Web service s to those peer nodes providing the Web services which are reachable from s or can reach s in the dependency graph. It also makes sure that the selected invocation peer fully materializes the considered sub-graph by activating all the Web service calls contained within it. Thus, if the heuristic is used to obtain an invocation plan for hierarchically composed Web services using intensional data, it can only produce limited forms of invocation plans in which a service call delegated to a Web service cannot be successively delegated to another Web service.

In this section, we present the results of three experiments. The first experiment shows the qualities of the invocation plans generated by the methods being considered for different test call definition trees (DTs) of Web services. The quality of each solution is measured in terms of the total invocation cost of Web services in the plan. In the second experiment, we conducted simulations based on queueing networks in order to model the execution of composite Web services and evaluate the actual performance gains which can be achieved by the proposed optimization method. The third experiment compares the execution time of the considered methods in performing optimization.

For the experiments, we implemented the proposed greedy and A* algorithms as well as the Context heuristic search method in the C++ language using the Standard Template Library (STL). We conducted experiments using a Linux server employing an Intel Xeon 3.2 GHz processor and 6 GB of main memory.

5.1. Quality of solution

In the first experiment, we generated 100 test call definition trees which have a height of 3 or 4 and a fan-out from 0 to 3 selected at random. The invocation costs of the Web services were also randomly chosen between 1 and 65,535. Then, we performed cost-based optimization for each test DT using the considered methods. Fig. 12a shows the costs of the result invocation plans which are arranged in non-decreasing order of the cost of the input DTs. The result indicates that the cost of the optimal invocation plan obtained by the A* algorithm is about 61.7% of the cost of the input DT on average. The average costs of the results produced by the proposed greedy algorithm and the Context heuristic algorithm are about 62.4% and 83.0% of the cost of the input DT, respectively. This cost reduction in the invocation plans was achieved by distributing the activations of some Web services to other Web service providers which can process the service invocations more efficiently. We observe that the proposed greedy algorithm generated more efficient invocation plans than the Context heuristic method for all the test input DTs. This is mainly due to the fact that the Context heuristic can generate only limited forms of invocation strategies as mentioned above; when a subtree of the given dependency tree is delegated to a peer node, all the Web services contained in the subtree should be fully invoked and materialized by that peer node. Meanwhile, the greedy algorithm can produce an invocation plan in which a Web service that has received an intensional parameter or result can successively delegate a part of the service calls in the intensional data to the other relevant Web services.

Fig. 12. Quality of optimization results: (a) cost of the invocation plans generated by the proposed and previous methods (x-axis: call definition tree ID; y-axis: cost; series: Input DT, Greedy Solution, Optimal Solution, Context Heuristic Solution); (b) cost ratio of the optimal solution to the other solutions (x-axis: call definition tree ID; y-axis: cost ratio; series: Optimal/Greedy, Optimal/Context).


Fig. 12b presents the qualities of the greedy solution and the Context heuristic solution, which are evaluated in terms of the ratio of the cost of the optimal solution to that of the greedy or Context solution. Note that most of the greedy solutions are very close to optimal; the greedy algorithm yielded optimal solutions for 71% of the test DTs, and the sub-optimal solution for a DT had a quality which was approximately 96.8% of that of the optimal solution for the same DT on average. However, there is a larger gap between the quality of the invocation plans obtained by the Context heuristic method and that of the optimal solutions.

We also experimented with the various methods to examine the impact of the height and fan-out of the input call definition trees on the quality of the methods. We generated 10 different input DTs having a particular configuration and applied each of the methods to them. Fig. 13 shows the results of the optimization for the DTs having different heights ranging from 2 to 5. The fan-outs of the internal vertices of the DTs were fixed at 2 in Fig. 13a and were randomly selected from 0 to 3 in Fig. 13b. Fig. 13a indicates that as the height of the input DTs increases from 2 to 5, the average cost of the greedy solutions decreases from about 78.8% to 51.2% of the average cost of the input DTs executed without optimization. Thus, the cost reduction achieved by the proposed methods becomes more pronounced as the height of the given call definition trees increases. We observe that the average costs of the Context heuristic solutions for the heights of 2 and 5 are about 85.3% and 67.1% of the average costs of the input DTs, respectively. The quality of the greedy solutions is also very close to optimal; the cost ratio of the optimal solution to the greedy one is 100% for a height of 2 and about 96.5% for a height of 5. This can be compared with the quality of the Context heuristic solution, which is about 80.2% of the optimal solution on average.

Fig. 13. Optimization results for the input DTs having different heights: (a) DTs with the fan-out of 2; (b) DTs with the fan-out from 0 to 3 (x-axis: height of input DT; y-axis: cost; series: Input DT, Context Heuristic Solution, Greedy Solution, Optimal Solution).

Fig. 14. Optimization results for the input DTs having different fan-outs: (a) DTs with the same fan-out in all vertices (x-axis: fan-out of input DT); (b) DTs with different fan-outs in vertices (x-axis: maximum fan-out of the nodes in DT); y-axis: cost; series: Input DT, Context Heuristic Solution, Greedy Solution, Optimal Solution.

Fig. 14 shows the results of the experiments in which the fan-out of the DTs, i.e., the number of Web services called by a composite Web service, was varied. In these experiments, the internal vertices of an input DT have the same out-degree selected from 1 to 4 in Fig. 14a and have different out-degrees whose maximum value is between 1 and 4 in Fig. 14b. The height of the input DTs was fixed at 3 in all experiments. As in the case where the height of the DTs was varied, the extent of the cost reduction in the generated solutions increases as the fan-out of the input DTs increases. Specifically, Fig. 14a shows that the average cost of the greedy solutions decreases from about 75.3% to 59.9% of the average cost of the input DTs while the average cost of the Context heuristic solutions is reduced from about 83.7% to 73.8%. The quality of the greedy solutions is very close to that of the optimal ones (i.e., above 96.3% of the optimal solutions), but it tends to decrease slightly as the fan-out of the input DTs increases. The quality of the Context heuristic solutions ranges from about 78.2% (when the fan-out is 4) to 90.0% (when the fan-out is 1) of the optimal solutions on average.

5.2. Performance modeling

To show the effectiveness of the invocation plans optimized by the proposed greedy method in the performance of hierarchically composed Web services, we developed a performance model based on queueing networks [14]. Queueing network modeling has been widely used to analyze and predict the performance of computer systems as well as Web applications [22,30,38,39]. We constructed and experimented with a queueing network model for composite Web services using an open-source simulation package called Java Modelling Tools (JMT) [8,15]. Each Web service is modeled by a queueing station, and a hierarchical composition of Web services is represented as a network of queueing stations. An M/M/1/k queueing system was used to model the queueing stations in the network, where both the inter-arrival times of the users’ requests and service times of the Web service are exponentially distributed (i.e., generated by Poisson processes). Each queueing station has a single server to process incoming requests and an FCFS queue with finite capacity k where some requests may be dropped. We assume that all the Web services have the same constant service time for their own business operations while they have different average execution times when it comes to processing the invocations of external Web services, which are affected by the network transmission cost of the input and output messages between the client and server of the invoked Web services. We also assume that the service times of the Web services are load-independent.
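For reference, a single M/M/1/k station considered in isolation has closed-form steady-state measures. The sketch below is not the JMT simulation used in the experiments; it assumes that k counts every request a station can hold, including the one in service, and the function and key names are illustrative only.

```python
def mm1k_metrics(arrival_rate, service_rate, capacity):
    """Steady-state measures of one M/M/1/k station (analytic, not simulated).

    arrival_rate: mean request arrival rate (jobs per time unit)
    service_rate: mean service rate of the single server (jobs per time unit)
    capacity:     assumed total number of requests the station can hold
    """
    if arrival_rate <= 0 or service_rate <= 0 or capacity < 1:
        raise ValueError("rates must be positive and capacity at least 1")

    rho = arrival_rate / service_rate
    # Unnormalized state probabilities rho**n for n = 0..k; normalizing the
    # weights directly also covers the rho == 1 case without a special branch.
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]

    blocking = probs[-1]                        # arrivals dropped when full
    throughput = arrival_rate * (1.0 - blocking)
    mean_jobs = sum(n * p for n, p in enumerate(probs))
    response_time = mean_jobs / throughput      # Little's law on accepted jobs

    return {
        "blocking_probability": blocking,
        "throughput": throughput,
        "mean_jobs": mean_jobs,
        "mean_response_time": response_time,
    }
```

Calling `mm1k_metrics(2.0, 4.0, 10)` gives the drop probability, effective throughput, and mean response time of one station; a network of such stations, as in the model described above, generally needs simulation or a network-level analysis instead.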

For the performance simulations, we generated call definition trees of Web services having different heights and fan-outs. The invocation cost of a Web service was chosen at random between 1 and 1000, which represents the average processing time, in milliseconds, of the service invocation consisting of the client-side tasks and message transmissions over the network. We assumed that all the Web services had the same execution time of 50 milliseconds for their business operations. We then used the proposed greedy optimization method and the previous Context heuristic method to produce invocation plans for the input trees. We constructed single-class open queueing networks for the generated plans as well as for the original call definition trees. Then, we carried out experiments with the queueing networks to obtain the values of various system performance indices such as the average response time and throughput under different traffic loads. We considered eight different arrival rates of simultaneous users’ requests in the simulations. Each experiment was allowed to run for 100,000 simulation time units to obtain a steady state distribution.
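The input generation described above can be sketched as follows. This is a minimal illustration, not the authors' generator: it assumes complete trees with a uniform fan-out, a uniformly distributed integer invocation cost, and a cost assigned to the root as well; `ServiceNode`, `build_call_tree`, and `count_services` are hypothetical names.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ServiceNode:
    """One Web service in a call definition tree (illustrative structure)."""
    name: str
    invocation_cost_ms: int            # random cost of invoking this service
    execution_time_ms: int = 50        # fixed business-operation time
    children: list = field(default_factory=list)


def build_call_tree(height, fan_out, seed=0):
    """Build a complete call definition tree with the root at level 0."""
    rng = random.Random(seed)          # seeded for reproducible inputs
    counter = 0

    def make(level):
        nonlocal counter
        node = ServiceNode(f"ws{counter}", rng.randint(1, 1000))
        counter += 1
        if level < height:             # internal vertex: call fan_out services
            node.children = [make(level + 1) for _ in range(fan_out)]
        return node

    return make(0)


def count_services(node):
    """Number of Web services in the subtree rooted at node."""
    return 1 + sum(count_services(child) for child in node.children)
```

A tree built this way can then be handed to an optimizer or translated into a queueing network with one station per node.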

Fig. 15 shows the results of the simulations for a given hierarchical composition of Web services, where the height and fan-out of the call definition tree are 3 and 2, respectively. The results demonstrate considerable performance benefit of the invocation plan generated by the proposed method in comparison with the original composition scheme and the Context heuristic solution. It is observed that the average response time and throughput of the distributed invocation strategy produced by the greedy method are improved by about 23% and 43%, respectively, with respect to the performances of the un-optimized execution of the call definition tree. These figures also show the impact of the incoming rates of users’ requests on the system performances. Fig. 16 presents the simulation results for a call definition tree which has a height and fan-out of 3 and thus consists of 40 Web services. It demonstrates that for the more complex structure of a composite Web service, the invocation strategy based on the proposed optimization method still performs better than the previous approach, as well as the execution of Web services as defined in their compositions. In particular, the system response times of the execution plan generated by the greedy method are reduced to below 68% of those of the original composition on average. Based on the simulation results, we expect that the proposed method can achieve greater improvement in the system performance as the complexity of the hierarchical composition of the Web services increases.
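The figure of 40 Web services follows from summing the levels of a complete tree. The symbols N, h, and f are introduced here for illustration only, assuming the root sits at level 0 and every internal vertex calls exactly f services:

```latex
N(h, f) = \sum_{i=0}^{h} f^{i} = \frac{f^{h+1} - 1}{f - 1} \quad (f > 1), \qquad
N(3, 3) = 1 + 3 + 9 + 27 = 40, \qquad
N(3, 2) = 1 + 2 + 4 + 8 = 15 .
```

Under the same assumption, the tree simulated in Fig. 15 would contain 15 Web services.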

[Figure 15: two plots against the arrival rate of requests, 0.5 to 4 jobs/sec (x-axis), for the Original Plan, Context Heuristic Plan, and Greedy Plan. (a) Average response time (y-axis: mean response time, sec); (b) Average throughput (y-axis: mean throughput, jobs/sec).]

Fig. 15. Simulation results for an input DT having the height of 3 and fan-out of 2.

[Figure 16: two plots against the arrival rate of requests, 0.5 to 4 jobs/sec (x-axis), for the Original Plan, Context Heuristic Plan, and Greedy Plan. (a) Average response time (y-axis: mean response time, sec); (b) Average throughput (y-axis: mean throughput, jobs/sec).]

Fig. 16. Simulation results for an input DT having the height of 3 and fan-out of 3.


5.3. Execution time

In this section, we compare the execution times of the investigated methods in order to evaluate their execution performance. Fig. 17 presents the CPU time required by each method to produce the invocation plans in the first experiment in Section 5.1. The results are arranged in non-decreasing order of the number of Web services contained in the test DTs. The greedy algorithm terminated in less than 10 milliseconds for all of the test DTs, while the A* algorithm took far longer, with a maximum of about 93 hours. Moreover, the execution time of the A* algorithm increases sharply as the number of Web services in a DT grows. The execution time of the Context heuristic method also tends to degrade considerably for a large number of Web services, as in the case of the A* algorithm. Consequently, we observe that the proposed greedy algorithm scales with respect to the number of Web services much better than the A* algorithm and the Context heuristic method.

Fig. 18 shows more experimental results concerning the performance of the greedy algorithm. In the experiment shown in Fig. 18a, the heights of the input DTs were fixed at 4 and their fan-outs were varied from 1 to 10, and in Fig. 18b the fan-outs were fixed at 3 and their heights were varied from 1 to 10. Fig. 18c shows the results of another experiment in which the input DTs have a fan-out of 5 and heights of between 3 and 7, which are arranged in non-decreasing order of the number of Web services contained in the tested input DTs. All the results indicate that the greedy algorithm can perform optimization in a reasonable time even for a large value of the fan-out or height of the Web service compositions and for a large number of Web services.

[Figure 17: execution time in msec (y-axis, logarithmic scale from 1 to 1,000,000,000) against the number of Web services, 4 to 41 (x-axis), for the A-Star Algorithm, Context Heuristic Algorithm, and Greedy Algorithm.]

Fig. 17. Execution time of the considered algorithms.

[Figure 18: three plots for the greedy algorithm. (a) DTs with different fan-outs: number of Web services and execution time (msec) against the fan-out of DT, 1 to 10; (b) DTs with different heights: number of Web services and execution time (msec) against the height of DT, 1 to 10; (c) DTs with the different number of Web services: execution time (msec) against the number of Web services in DT, 5 to 874.]

Fig. 18. Execution performance of the greedy algorithm.


6. Conclusion

In this work, we studied the distributed invocation of Web services exploiting an intensional form of data which may embed calls to external Web services. We formalized a feasible invocation plan for Web services which use intensional results and parameters to delegate the activation of service calls to other relevant nodes providing Web services. Then, we considered the cost-based optimization of the invocation plan for Web services exploiting intensional results. We analyzed the complexity of the exhaustive search of an optimal invocation plan in the solution space and provided an A* heuristic algorithm to find an optimal solution more efficiently than by the exhaustive search. We also suggested a greedy method which can generate an efficient invocation plan quickly. We showed by experiments that the proposed greedy method can find an efficient solution for a given call definition tree of Web services, which is very close to the optimal one. The cost of


the optimized solution is greatly reduced compared to that of the original Web service interactions which deliver only extensional data, and thus the overall performances of the Web services can be improved considerably by using the proposed method. The experimental results also indicate that our method has good execution performance and scalability for the complex hierarchy of interactions among Web services.

The optimization of the invocation plan for a general dependency graph of Web services and the invocation plan considering both intensional parameters and results together will be further investigated in a future work. The proposed approach performs the optimization and generation of an invocation plan based on the cost estimates collected from Web service providers and, hence, we need periodic re-optimization to deal with the highly variable workload in the systems. We plan to study a distributed and dynamic optimization scheme which can be performed at each node to support the autonomy and dynamically changing workload of the individual Web service systems.

References

[1] S. Abiteboul, O. Benjelloun, T. Milo, I. Manolescu, R. Weber, Active XML: A data-centric perspective on web services, Technical Report, No. 381, INRIA Futurs, 2004.

[2] S. Abiteboul, A. Bonifati, G. Cobena, I. Manolescu, T. Milo, Dynamic XML documents with distribution and replication, in: Proceedings of ACM SIGMOD Conference, 2003, pp. 527–538.

[3] S. Amer-Yahia, Y. Kotidis, A web-services architecture for efficient XML data exchange, in: Proceedings of the International Conference on Data Engineering, 2004, pp. 523–534.

[4] Apache Jelly: Executable XML. Available from: <http://jakarta.apache.org/commons/jelly/>.

[5] B. Benatallah, M. Dumas, Q.Z. Sheng, Facilitating the rapid development and scalable orchestration of composite web services, Distributed and Parallel Databases 17 (2005) 5–37.

[6] B. Benatallah, Q.Z. Sheng, M. Dumas, The Self-serv environment for web services composition, IEEE Internet Computing 7 (1) (2003) 40–48.

[7] D. Berardi, D. Calvanese, G. De Giacomo, R. Hull, M. Mecella, Automatic composition of transition-based semantic web services with messaging, in: Proceedings of the 31st VLDB Conference, 2005, pp. 613–624.

[8] M. Bertoli, G. Casale, G. Serazzi, Java modelling tools: an open source suite for queueing network modelling and workload analysis, in: Proceedings of the International Conference on Quantitative Evaluation of Systems, 2006, pp. 119–120.

[9] L. Cherkasova, Y. Fu, W. Tang, A. Vahdat, Measuring and characterizing end-to-end internet service performance, ACM Transactions on Internet Technology 3 (4) (2003) 347–391.

[10] K. Chiu, M. Govindaraju, R. Bramley, Investigating the limits of SOAP performance for scientific computing, in: Proceedings of the IEEE International Symposium on High Performance Distributed Computing (HPDC’02), 2002, pp. 246–254.

[11] F. Curbera, R. Khalaf, N. Mukhi, S. Tai, S. Weerawarana, The next step in web services, Communications of the ACM 46 (10) (2003) 29–34.

[12] D. Davis, M. Parashar, Latency performance of SOAP implementations, in: Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid2002), 2002, pp. 407–412.

[13] T. Erl, Service-Oriented Architecture: Concepts, Technology, and Design, Prentice Hall, 2005.

[14] D. Gross, C.M. Harris, Fundamentals of Queueing Theory, third ed., John Wiley & Sons, Inc., New York, 1998.

[15] Java Modelling Tools. Available from: <http://jmt.sourceforge.net/>.

[16] C. Kohlhoff, R. Steele, Evaluating SOAP for high performance business applications: real-time trading systems, in: Alternate Paper Tracks of WWW2003, 2003. Available from: <http://www2003.org/cdrom/papers/alternate/P872/p872-kohlhoff.html>.

[17] K. Kono, T. Masuda, Efficient RMI: Dynamic specialization of object serialization, in: Proceedings of the 20th International Conference on Distributed Computing Systems, 2000, pp. 308–315.

[18] V. Krishnaswamy, D. Walther, S. Bhola, E. Bommaiah, G. Riley, B. Topol, M. Ahamad, Efficient implementations of java remote method invocation (RMI), in: Proceedings of the 4th USENIX Conference on Object-Oriented Technologies and Systems, 1998, pp. 19–36.

[19] Macromedia Coldfusion MX. Available from: <http://www.macromedia.com/>.

[20] I. Manolescu, M. Brambilla, S. Ceri, S. Comai, P. Fraternali, Model-driven design and deployment of service-enabled web applications, ACM Transactions on Internet Technology 5 (3) (2005) 439–479.

[21] B. Medjahed, A. Bouguettaya, A. Elmagarmid, Composing web services on the semantic web, VLDB Journal 12 (4) (2003) 333–351.

[22] D.A. Menasce, V.A.F. Almeida, Capacity Planning for Web Services: Metrics, Models, and Methods, Prentice-Hall PTR, Upper Saddle River, New Jersey, 2002.

[23] N. Milanovic, M. Malek, Current solutions for web service composition, IEEE Internet Computing 8 (6) (2004) 51–59.

[24] T. Milo, S. Abiteboul, B. Amann, O. Benjelloun, F. Dang Ngoc, Exchanging intensional XML data, in: Proceedings of ACM SIGMOD Conference, 2003, pp. 289–300.

[25] G. Muller, R. Marlet, E.-N. Volanschi, C. Consel, C. Pu, A. Goel, Fast, Optimized Sun RPC using automatic program specialization, in: Proceedings of the 18th International Conference on Distributed Computing Systems, 1998, pp. 240–249.

[26] M. Nagarajan, K. Verma, A.P. Sheth, J.A. Miller, J. Lathem, Semantic interoperability of web services – challenges and experiences, in: Proceedings of the 4th IEEE International Conference on Web Services, 2006, pp. 373–382.

[27] N.J. Nilsson, Artificial Intelligence: A New Synthesis, Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1998.

[28] Oasis UDDI. Available from: <http://uddi.org/>.

[29] C.S. Park, S. Park, Distributed invocation of composite web services, in: Proceedings of the International Conference on Embedded and Ubiquitous Computing (EUC-06), 2006, pp. 385–393.

[30] D. Peng, Y. Yuan, K. Yue, X. Wang, A. Zhou, Capacity planning for composite web services using queueing network-based models, in: Proceedings of the 5th International Conference on Advances in Web-Age Information Management, 2004, pp. 439–448.

[31] R. Rajamony, M. Elnozahy, Measuring client-perceived response times on the WWW, in: Proceedings of USENIX Symposium on Internet Technologies and Systems, 2001, pp. 185–196.

[32] N. Ruberg, G. Ruberg, I. Manolescu, Towards cost-based optimization for data-intensive web service computations, in: Proceedings of Brazilian Symposium on Databases, 2004, pp. 283–297.

[33] D. Singh, Integrating Office XP Smart Tags with the .NET XML Web Services, In the ASP Today Library, 2002. Available from: <http://asptoday.com/Content.aspx?id=1364>.

[34] P.G. Soares, On remote procedure call, in: Proceedings of the Conference of the Centre for Advanced Studies on Collaborative Research, 1992, pp. 215–267.

[35] A. Tsalgatidou, T. Pilioura, An overview of standards and related technology in web services, Distributed and Parallel Databases 12 (2) (2002) 135–162.

[36] Web Service Activity. Available from: <http://www.w3.org/2002/ws>.

[37] K.C. Yeung, P.H.J. Kelly, Optimizing Java RMI Programs by Communication Restructuring, in: Proceedings of ACM/IFIP/USENIX International Middleware Conference, 2003, pp. 324–343.

[38] M. Younas, I. Awan, D. Duce, An efficient composition of web services with active network support, Expert Systems with Applications 31 (2006) 859–869.

[39] M. Younas, I. Awan, R. Holton, D. Duce, A P2P network protocol for efficient choreography of web services, in: Proceedings of the 21st International Conference on Advanced Networking and Applications, 2007, pp. 839–846.

[40] J. Zhang, J. Gong, H. Lin, G. Wang, J. Huang, J. Zhu, B. Xu, J. Teng, Design and development of distributed virtual geographic environment system based on web services, Information Sciences 177 (19) (2007) 3968–3980.