
CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2006; 18:653–666
Published online 8 November 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cpe.972

A static resource allocation framework for Grid-based streaming applications

Liang Chen and Gagan Agrawal∗,†

Department of Computer Science and Engineering, Ohio State University, Columbus, OH 43210, U.S.A.

SUMMARY

A number of applications increasingly rely on, or can potentially benefit from, analysis and monitoring of data streams. To support the processing of streaming data in a Grid environment, we have been developing a middleware system called GATES (Grid-based AdapTive Execution on Streams). Our target applications are those involving high-volume data streams and requiring distributed processing of data arising from a distributed set of sources. This paper addresses the problem of resource allocation in the GATES system. Although resource discovery and resource allocation have been active topics in the Grid community, the pipelined processing and real-time constraint required by distributed streaming applications pose new challenges. We present a resource allocation algorithm that is based on minimal spanning trees. We evaluate the algorithm experimentally and demonstrate that it results in configurations that are very close to optimal, and significantly better than most other possible configurations. Copyright © 2005 John Wiley & Sons, Ltd.

KEY WORDS: Grid middleware; streaming data; resource allocation

1. INTRODUCTION

A number of applications across computer sciences and other science and engineering disciplines increasingly rely on, or can potentially benefit from, analysis and monitoring of data streams. In the stream model of processing, data arrives continuously and needs to be processed in real-time, i.e. the processing rate must match the arrival rate. There are two trends contributing to the emergence of this model. First, scientific simulations and increasing numbers of high-precision data collection instruments (e.g. sensors attached to satellites and medical imaging modalities) are generating data continuously and at a high rate. The second is the rapid improvements in the technologies for the wide-area network (WAN), as evidenced, for example, by the National Lambda Rail (NLR) proposal and the interconnectivity between the TeraGrid and Extensible Terascale Facility (ETF) sites. As a result, the data can often be transmitted faster than it can be stored or accessed from disks within a cluster.

∗Correspondence to: Gagan Agrawal, Department of Computer Science and Engineering, Ohio State University, Columbus, OH 43210, U.S.A.
†E-mail: [email protected]

Received 10 December 2004
Revised 22 February 2005
Accepted 1 March 2005

The important characteristics that apply across a number of stream-based applications are: (1) the data arrive continuously, 24 hours a day and seven days a week; (2) the volume of data is enormous, typically tens or hundreds of gigabytes a day, and the desired analysis could also require large computations; (3) these data often arrive at a distributed set of locations, and all data cannot be communicated to a single site; (4) it is often not feasible to store all data for processing at a later time, thereby requiring analysis in real-time.

We briefly describe two representative examples. The first example is computer vision based surveillance. Multiple cameras shooting images from different perspectives can capture more information about a scene or a set of scenes. This can enable tracking of people and monitoring of critical infrastructure [1]. A recent report indicated that real-time analysis of the capture of more than three digital cameras is not possible on current desktops, as the typical analysis requires large computations. Distributed and Grid-based processing can enable such analysis, especially when the cameras are physically distributed and/or high-bandwidth networking is available. Similar issues arise in online network intrusion detection, which is a critical step for cyber-security. Online analysis of streams of connection request logs and identifying unusual patterns is considered useful for network intrusion detection [2]. To be really effective, it is desirable that this analysis be performed in a distributed fashion, and connection request logs at a number of sites be analyzed. Large volumes of data and the need for real-time response make such analysis challenging.

We view the problem of flexible and adaptive processing of distributed data streams as a Grid computing problem. We believe that a distributed and networked collection of computing resources can be used for analysis or processing of these data streams. Computing resources close to the source of a data stream can be used for initial processing of the data stream, thereby reducing the volume of data that needs to be communicated. Other computing resources can be used for more expensive and/or centralized processing of data from all sources. Because of the real-time requirements, there is a need for adapting the processing in such a distributed environment, and achieving the best accuracy of the results within the real-time constraint.

In view of the above, we have been developing a Grid middleware system called GATES (Grid-based AdapTive Execution on Streams) [3]. The three important aspects of this system are as follows. First, it is designed to use the existing Grid standards and tools to the extent possible. Specifically, our system is built on the Open Grid Services Interface (OGSI) model and uses the initial version of GT 3.0. Second, the system offers a high-level interface that allows the users to specify the algorithm(s) and the steps involved in processing data streams. The users need not be concerned with the details such as discovering and allocating Grid resources, registering their own data stream’s Web services and deploying the Web services. The third significant aspect of our system is that it flexibly achieves the best accuracy that is possible while maintaining the real-time constraint on the analysis.

In this paper, we consider the problem of resource allocation for an application using the GATES system. Although resource discovery and resource allocation have been active topics in the Grid community, the pipelined processing and real-time constraint required by distributed streaming applications pose new challenges. We present a resource allocation algorithm that is based on minimal spanning trees. We evaluate the algorithm experimentally and demonstrate that it results in configurations that are very close to optimal, and significantly better than most other possible configurations.


The rest of this paper is organized as follows. In Section 2, we give an overview of the GATES system. The resource allocation problem and our algorithm are described in Section 3. We evaluate our algorithm in Section 4. We compare our work with related efforts in Section 5 and conclude in Section 6.

2. OVERVIEW OF THE GATES SYSTEM

This section describes the major design aspects of our GATES system.

2.1. Key goals

There are four main goals behind the design of the system.

1. Use the existing Grid infrastructure to the greatest extent possible. Particularly, our system builds on top of the OGSI [4], and uses its reference implementation, Globus 3.0. The Globus support allows the system to carry out automatic resource discovery and matching between the resources and the requirements.

2. Support distributed processing of one or more data streams, by facilitating applications that comprise a set of stages. For analyzing more than one data stream, at least two stages are required. Each stage accepts data from one or more input streams and outputs zero or more streams. The first stage is applied near sources of individual streams, and the second stage is used for computing the final results. However, based on the number and types of streams and the available resources, more than two stages could also be required. All intermediate stages take one or more intermediate streams as input and produce one or more output streams. GATES’s application programmer interfaces (APIs) are designed to facilitate the specification of such stages.

3. Enable the application to achieve the best accuracy, while maintaining the real-time constraint. For this, the middleware allows the application developers to expose one or more adjustment parameters at each stage. An adjustment parameter is a tunable parameter whose value can be modified to increase the processing rate and, in most cases, reduce the accuracy of the processing. Examples of such adjustment parameters are rate of sampling (i.e. what fraction of data-items are actually processed) and size of summary structure at an intermediate stage (which means how much information is retained after a processing stage). The middleware automatically adjusts the values of these parameters to meet the real-time constraint on processing.

4. Enable easy deployment of the application. This is done by supporting a Launcher and a Deployer. The system is responsible for initiating the different stages of the computation at different resources.
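The adjustment parameter in goal 3 can be illustrated with a small sketch for the sampling-rate example. The control rule below (drop the rate to what the stage can sustain, otherwise raise it by a fixed step) is an assumption made purely for illustration; this section does not describe the self-adaptation algorithm that GATES actually uses, and the function name and its parameters are hypothetical.

```python
def adjust_sampling_rate(rate, arrival_rate, processing_rate,
                         step=0.05, min_rate=0.01, max_rate=1.0):
    """Return a new sampling rate (fraction of data items processed).

    rate            -- current sampling rate
    arrival_rate    -- items arriving per second (assumed > 0)
    processing_rate -- items per second the stage can process
    Hypothetical rule, not the GATES self-adaptation algorithm.
    """
    sustainable = processing_rate / arrival_rate
    if arrival_rate * rate > processing_rate:
        # Falling behind the stream: sample only what can be processed.
        rate = sustainable
    else:
        # Spare capacity: recover accuracy gradually.
        rate = min(rate + step, sustainable)
    return max(min_rate, min(max_rate, rate))
```

Lowering the rate trades accuracy for throughput, which is the trade-off the adjustment parameter is meant to expose.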

GATES is also designed to execute applications on heterogeneous resources. The only requirements for executing an application are: (1) support for a Java Virtual Machine (JVM), as the applications are written in Java; (2) availability of Globus 3.0; and (3) a Web server that supports the user application repository. Thus, the applications are independent of processors and operating systems on which they are executed.


Figure 1. Overall system architecture. (Diagram components: Application Developer, Application User, Launcher, Deployer, Grid Resource Manager, a Web Server holding the configuration information, an Application Repository holding the codes for the stages, and GATES Grid Service instances that run the code for stages 1 to N on the data streams.)

2.2. System architecture and design

The overall system architecture is shown in Figure 1. The system distinguishes between an application developer and an application user. An application developer is responsible for dividing an application into stages, choosing adjustment parameters, and implementing the processing at each stage. Moreover, the developer writes an XML file, specifying the configuration information of an application. Such information includes the number of stages and where the stages’ codes are. After submitting the codes to application repositories, the application developer informs an application user of the URL link to the configuration file. An application user is only responsible for starting and stopping an application.

The above design simplifies the task of application developers and users, as they are not responsible for initiating the different stages on different resources. To support the easy deployment and execution, the Launcher and the Deployer are used. The Launcher is in charge of getting configuration files and analyzing them by using an embedded XML parser. To start the application, the user simply passes the XML file’s URL link to the Launcher. The Deployer is responsible for the deployment. Specifically, it: (1) receives the configuration information from the Launcher; (2) consults with a Grid resource manager to find the nodes where the resources required by the individual stages are available; (3) initiates instances of GATES Grid services at the nodes; (4) retrieves the stage codes from the application repositories; and (5) uploads the stage specific codes to every instance, thereby customizing it.

After the Deployer completes the deployment, the instances of the GATES Grid service start to make network connections with each other and execute the stage functionalities. The GATES Grid service is an OGSI Grid service [5] that implements the self-adaptation algorithm and is able to contain and execute user-specified codes.

3. RESOURCE ALLOCATION ALGORITHM

In this section, we describe our algorithm for resource allocation. Initially, we describe the problem and argue about its complexity.

3.1. Problem definition and complexity

The resource allocation problem for a GATES application is essentially that of creating a deployment configuration. A deployment configuration of an application comprises the following components:

1. the number of data sources and their location;

2. the destination, i.e. the node where the final results are needed;

3. the number of stages in the application;

4. the number of instances of each stage;

5. how the instances connect to each other;

6. the node at which each instance is assigned.
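The six components listed above map naturally onto a small record type. The following is a minimal sketch; the class and field names are hypothetical and are not part of the GATES API.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    stage: int   # stage number this instance executes
    index: int   # instance number within the stage
    node: str    # component 6: node the instance is assigned to


@dataclass
class DeploymentConfiguration:
    sources: dict        # component 1: data source name -> location
    destination: str     # component 2: node where final results are needed
    num_stages: int      # component 3
    instances: list = field(default_factory=list)  # components 4 and 6
    links: list = field(default_factory=list)      # component 5: (from, to) pairs

    def instances_per_stage(self):
        """Component 4: number of instances of each stage."""
        counts = {}
        for inst in self.instances:
            counts[inst.stage] = counts.get(inst.stage, 0) + 1
        return counts
```

Components 1 to 3 are inputs known at launch time, while `instances` and `links` are what a resource allocation algorithm has to fill in.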

Figure 2 shows two possible deployment configurations for an application that has four data sources and five stages.

Now, let us consider the problem of creating a deployment configuration for an execution of an application. The components 1, 2, and 3 of the deployment configuration are known when an application is initiated. Therefore, the problem is that of determining components 4, 5, and 6. Determining components 4, 5, and 6 involves both resource discovery and resource allocation. Our focus is on the problem of resource allocation. For resource discovery, we can make use of information services [6] provided by Globus toolkit 3.0 (GT3.0) to collect and aggregate resource information.

Figure 2. Two example deployment configurations. (Diagram: two layouts in which four data sources feed a single destination; one uses three instances of stage 2 and two instances each of stages 3 and 4, the other uses two instances of stage 2 and one instance each of stages 3 and 4.)

One possible approach for resource allocation is to enumerate and compare all possible configurations and find the one that will enable the best performance. However, such an exhaustive search algorithm has at least an exponential complexity. Given an application with m stages, n data sources and k available computing nodes for placement of stages’ instances, the number of possible configurations can be denoted by F(m, n, k), defined by

$$F(2, n, k) = 1$$

$$F(m, n, k) = \sum_{1 \le i \le n} \left( S_n^{(i)} \cdot F(m-1, i, k-i) \cdot P_k^i \right)$$

with $m \ge 3$, $n \ge 1$, $k \ge m \cdot n$, and $P_k^i = k!/(k-i)!$. $S_n^{(i)}$ denotes the Stirling numbers of the second kind. If there are only three stages, the number of all possible configurations is

$$F(3, n, k) = \sum_{1 \le i \le n} \left( S_n^{(i)} \cdot P_k^i \right) \ge \sum_{1 \le i \le n} \left( S_n^{(i)} \cdot P_n^i \right) = n^n$$

The above derivation shows that a lower bound on the complexity of the exhaustive search algorithm is $\Omega(n^n)$. Therefore, it is not practical. Below, we describe our algorithm that can find a deployment configuration in $O(nk^2)$ time.
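The recurrence and the closing identity can be checked numerically. The sketch below evaluates F(m, n, k) directly from the definition above; the function names are ours, and the code assumes Python 3.8 or later for `math.perm`.

```python
from functools import lru_cache
from math import perm  # perm(k, i) = k! / (k - i)!


@lru_cache(maxsize=None)
def stirling2(n, i):
    """Stirling number of the second kind: partitions of n items into i blocks."""
    if n == i:
        return 1
    if i == 0 or i > n:
        return 0
    return i * stirling2(n - 1, i) + stirling2(n - 1, i - 1)


def F(m, n, k):
    """Number of configurations for m stages, n data sources, k nodes (m >= 2)."""
    if m == 2:
        return 1
    return sum(stirling2(n, i) * F(m - 1, i, k - i) * perm(k, i)
               for i in range(1, n + 1))


# The last step of the derivation: sum_i S(n, i) * P(n, i) equals n^n.
n = 4
print(sum(stirling2(n, i) * perm(n, i) for i in range(1, n + 1)), n ** n)
```

For n = 4 both printed values are 256, and F(3, 4, 12), which respects the constraint k ≥ m·n, is already far larger.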

3.2. Algorithm description

The goal of our algorithm is to determine the deployment configuration that gives the application the best chance to achieve the real-time constraint, while still maintaining high accuracy in the processing. In this paper, we only consider static resource allocation, and dynamic resource allocation is a topic for future research. We also assume that except for the final stage in the pipeline, where data from all sources must be combined together, stages in the application could be processed independently for each data source. The main observation in our algorithm design is that data arrival rates at the first one or two stages are typically so high that high network bandwidths are desired at these stages. After data have been processed by these stages, the arrival rates at the following stages typically decrease significantly. Our algorithm is also based on the assumption that computation is typically not the bottleneck in the processing.

The algorithm has two main steps. First, we create a key path corresponding to each data source. Second, we merge these key paths to create a layout tree.

The algorithm proceeds as follows. We initially construct a weighted graph in which every node is viewed as a vertex, and a network connection between two nodes is viewed as an edge. By node, we mean any computing unit, which could be a cluster or a symmetric multi-processor (SMP) machine, and is capable of executing multiple processes. The main idea is that communication bandwidth for processes within a node is much higher than the bandwidth between the nodes. The weight of an edge is the negative value of the network connection’s bandwidth. Given this formulation, our goal is to place stages so as to minimize the network communication time in the processing. Therefore, for every data source, we construct a minimum spanning tree (MST) by making the data source the initial set and applying Prim’s algorithm to the graph. Note that we prefer to apply the MST algorithm, rather than the shortest path algorithm, because it aggressively minimizes the weight at the top level of the tree.
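The construction just described can be sketched as the array-based $O(k^2)$ variant of Prim's algorithm, run on edge weights equal to the negated bandwidth. The graph representation (a dictionary of dictionaries) and the example bandwidth values are assumptions made for illustration only.

```python
def prim_parents(bandwidth, source):
    """Grow an MST from `source` using edge weight = -bandwidth.

    bandwidth -- dict of dicts; bandwidth[u][v] is the link bandwidth
                 between nodes u and v (symmetric, missing = no link)
    Returns parent[v]: the node through which v joined the tree
    (None for the source and for unreachable nodes).
    """
    inf = float("inf")
    best = {v: inf for v in bandwidth}   # lowest weight connecting v to the tree
    parent = {v: None for v in bandwidth}
    best[source] = 0
    in_tree = set()
    while len(in_tree) < len(bandwidth):
        u = min((v for v in bandwidth if v not in in_tree), key=best.get)
        if best[u] == inf:
            break                        # remaining nodes are disconnected
        in_tree.add(u)
        for v, bw in bandwidth[u].items():
            if v not in in_tree and -bw < best[v]:
                best[v] = -bw            # higher bandwidth means lower weight
                parent[v] = u
    return parent


# Hypothetical four-node network: data source S, destination D.
links = {
    "S": {"A": 100, "B": 10},
    "A": {"S": 100, "B": 50, "D": 5},
    "B": {"S": 10, "A": 50, "D": 80},
    "D": {"A": 5, "B": 80},
}
print(prim_parents(links, "S"))
```

In this example the tree reaches D through A and B, avoiding the low-bandwidth direct links S-B and A-D. Each call scans all k nodes once per added node, which matches the $O(k^2)$ cost per data source quoted for the algorithm.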

Next, we seek the key path for each data source. This is the path from the data source node to the destination node in the MST corresponding to each data source. We start to mark nodes in the path, i.e. the parent node of the data source node is marked as the second stage, the grandparent node of the data source node is marked as the third stage, etc.

Ideally, we wish to have the last stage placed on the destination node. However, this will only happen if the number of tree nodes in a key path is equal to the number of stages being deployed. This may not be true in practice, and the number of stages could be either less than or greater than the number of nodes in the path. We make the following adjustments in such cases. If the length of a key path is longer than the number of stages m, we insert some transport stages, called transporters, which simply forward the data they receive. Adding transporters does introduce some overheads. However, we believe that these overheads are nominal compared with the delays caused by choosing another lower bandwidth path, simply because it has exactly m nodes. Moreover, these transporters are always deployed at the end of a pipeline, where the arrival rates are not high. Towards the end of this section, we describe an optimization to reduce the number of transporters if possible. When the length of a key path is shorter, the additional stages are deployed at the parent node of the data source node. The reason again is that higher data arrival rates are typically seen at the beginning of a pipeline.
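The marking rules above can be sketched as follows. The sketch assumes a parent map of the kind Prim's algorithm produces (each node mapped to the node through which it joined the tree grown from the data source), treats the data source itself as stage 1, and requires a key path of at least two nodes. The function names are hypothetical, and details the text leaves open (such as what happens to the destination node on a long path) follow the simplest literal reading.

```python
def key_path(parent, source, destination):
    """Return the tree path from source to destination, source first."""
    path = [destination]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1]


def mark_stages(path, m):
    """Label each node on a key path with the stages it hosts.

    Long path (len(path) >= m): one stage per node, then transporters.
    Short path: the surplus stages go to the node next to the data source.
    """
    if len(path) >= m:
        labels = [[s] for s in range(1, m + 1)]
        labels += [["transporter"] for _ in path[m:]]
    else:
        extra = m - len(path)                      # stages without their own node
        labels = [[1], list(range(2, 3 + extra))]  # crowd them near the source
        labels += [[s] for s in range(3 + extra, m + 1)]
    return list(zip(path, labels))


parent = {"S": None, "A": "S", "B": "A", "D": "B"}
path = key_path(parent, "S", "D")
print(mark_stages(path, 3))           # one transporter at the end
print(mark_stages(["S", "A", "D"], 5))  # stages 2-4 share node A
```

Both adjustments keep the high-rate early stages on the high-bandwidth edges nearest the data source, which is the stated reason for the design.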

Now, we have a key path corresponding to each data source, and we need to create a layout tree. By default, we will have one instance of each stage for each data source, with the exception of the last stage. As we stated earlier, our goal is to minimize the communication time and computation is typically not a bottleneck. Therefore, we proceed as follows. Consider two different key paths which involve the same node. In general, the node could be executing different stages for different key paths. However, if it executes the same stage, we merge the paths. By repeating this step, we can get a layout tree from the set of key paths.
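One way to picture the merge step is to treat every (node, stage) pair as a potential instance and to share any pair that occurs on more than one key path. The sketch below does only that; it is an illustrative reading of the merge rule, and it does not address how paths that diverge again after a shared instance should be reconciled, which the text does not specify.

```python
def merge_key_paths(marked_paths):
    """Merge key paths that place the same stage on the same node.

    marked_paths -- one list per data source of (node, stage) pairs,
                    ordered from the data source to the destination
    Returns (instances, links): the distinct (node, stage) instances and
    the directed links between consecutive instances.
    """
    instances, links = set(), set()
    for path in marked_paths:
        instances.update(path)               # identical pairs collapse into one
        links.update(zip(path, path[1:]))    # consecutive instances are connected
    return instances, links


paths = [
    [("S1", 1), ("A", 2), ("D", 3)],
    [("S2", 1), ("A", 2), ("D", 3)],   # shares stage 2 on node A with S1
    [("S3", 1), ("B", 2), ("D", 3)],
]
instances, links = merge_key_paths(paths)
print(len(instances), len(links))
```

Here the three key paths yield six instances and five links: stage 2 runs once on node A for two data sources and once on node B for the third.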

A layout tree determines the components 4 and 5 of a deployment configuration. To decide the component 6, we need to map a vertex in the layout tree to a computing node. We can query the node information service by specifying the resource requirements of the stage that is supposed to be deployed in this node. Thus, we can get a deployment configuration and then call the launcher program to automatically launch the application.

We now state the complexity of this algorithm. Recall that the number of data sources is n, the number of stages is m, and the number of available nodes is k (k > m). The main cost is in the invocation of Prim’s algorithm. Each invocation takes $O(k^2)$ time and we need n invocations. Thus, the complexity of the algorithm is $O(nk^2)$.

3.3. Additional optimization

Although the overheads introduced by transporters are not critical, we wish to find unnecessary transporters and eliminate them. Assume that a key path is denoted by $N_1, E_1, N_2, E_2, \ldots, N_i, E_i, \ldots, N_m$, where the data source is denoted by $N_1$, the destination is denoted by $N_m$, $N_i$ is the ith node in the path, and $E_i$ is the edge connecting $N_i$ with $N_{i+1}$. If the node $N_i$ is marked as a transporter, $N_{i-1}$ has an edge e in the graph which connects to $N_j$ (j > i), and the weight of e is smaller than that of the edges from $E_1$ to $E_{i-1}$, then the nodes from $N_i$ to $N_{j-1}$ are unnecessary transporters. We can eliminate them and replace the edges from $E_i$ to $E_{j-1}$ with the edge e. As we show in our experiments, an optimized configuration does achieve better performance than the non-optimized configuration.
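A literal reading of this rule can be sketched as below. The sketch uses 0-based list positions, compares bandwidths instead of weights (a smaller weight means a higher bandwidth), applies at most one shortcut, and prefers the farthest reachable node; the last two choices are our assumptions, not something the text prescribes.

```python
def remove_transporters(path, is_transporter, bandwidth):
    """Bypass unnecessary transporters on one key path.

    path           -- nodes from the data source to the destination
    is_transporter -- parallel list of booleans
    bandwidth      -- dict of dicts; bandwidth[u][v] is the link bandwidth
    Returns the (possibly shortened) path.
    """
    for i, flag in enumerate(is_transporter):
        if not flag or i == 0:
            continue
        prev = path[i - 1]
        # Bandwidths of the edges that precede `prev` on the path.
        upstream = [bandwidth[path[t]][path[t + 1]] for t in range(i - 1)]
        for j in range(len(path) - 1, i, -1):      # farthest target first
            bw = bandwidth[prev].get(path[j])
            if bw is not None and all(bw > b for b in upstream):
                return path[:i] + path[j:]         # drop path[i] .. path[j-1]
    return path
```

For example, on a path S, A, B, T1, T2, D where T1 and T2 are transporters and B has a direct link to D that is faster than both S-A and A-B, the function returns S, A, B, D.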

4. EXPERIMENTAL EVALUATION

This section presents results from a number of experiments we conducted to evaluate our algorithm. Specifically, we had the following goals: (1) show that the deployment configuration created and optimized by our algorithm can be as good as the best one among a large number of choices manually enumerated; and (2) demonstrate that the algorithm-created configuration can outperform most of a large number of possible configurations.

One of the challenges in conducting our experiments was to have a setup where network bandwidths can vary significantly and topology can be quite complex and, yet, repeatable and reliable experiments could be conducted. In a local area network (LAN) environment or within a single organization, the bandwidths are unlikely to vary much, and the topology is usually very simple. In such scenarios, the resource allocation problem usually becomes quite trivial. At the same time, a WAN environment does not allow repeatable experiments. Therefore, to conduct our experiments, we set up an environment in which network bandwidths are simulated by inserting delays between packets and network topologies are created randomly. This allowed us to focus on our goal, which was to demonstrate that our algorithm is effective when network bandwidths can vary significantly and topology is quite complex.
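The text does not say how the inserted delays were chosen. A common way to emulate a link of a given bandwidth, shown here purely as an assumed illustration, is to pause after each packet for the time that packet would occupy the link.

```python
def inter_packet_delay(packet_bytes, bandwidth_bps):
    """Seconds to wait per packet so the average rate equals bandwidth_bps.

    packet_bytes  -- packet size in bytes
    bandwidth_bps -- bandwidth to emulate, in bits per second
    """
    return packet_bytes * 8 / bandwidth_bps


# A 1500-byte packet on an emulated 1 Mb/s link.
print(inter_packet_delay(1500, 1_000_000))
```

With this rule a 1500-byte packet on an emulated 1 Mb/s link is followed by a 12 ms pause, and halving the emulated bandwidth doubles the pause.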

The experiments we report were conducted using the counting samples application. The classical counting samples problem is as follows [7]. A data stream comprises a set of integers. We are interested in determining the n most frequently occurring values and their number of occurrences at any given point in the stream. Since it is not possible to store all values, a summary structure must be maintained to determine the frequently occurring values. Gibbons and Matias have developed an approximate method for answering such queries with limited memory [7].

Figure 3. Comparing auto-config, manual-config, and opt-config. (Bar chart of execution time in ms against the size of the data set. Labeled values for 750000 integers: auto-config 324,276.00, opt-config 318,986.80, man-config 307,858.20; for 2000000 integers: auto-config 816,435.17, opt-config 812,009.33, man-config 805,029.33.)

We implemented a distributed version of the counting samples application. In this version, full streams are not forwarded to a central machine, but are instead sent to some machines close to their respective data source. A fixed number of most frequently occurring items at each stream are determined there and forwarded to the central machine, where the final results are computed. This number was 100 in all our experiments. A deployment configuration of this application could be determined manually by comparing a large number of possible configurations. We call this configuration manual-config. The configuration automatically generated by the algorithm is called auto-config. The configuration where we further apply the optimization of removing unnecessary transporters is called opt-config.
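The two-level structure of the distributed application can be sketched as below. Note that this sketch counts exactly with `collections.Counter`; it does not implement the approximate counting samples summary of Gibbons and Matias, and it is only meant to show which data crosses the network between the stages.

```python
from collections import Counter


def local_summary(stream, top=100):
    """Stage near a data source: keep only the `top` most frequent values."""
    return Counter(stream).most_common(top)


def merge_summaries(summaries, n):
    """Central stage: combine the per-stream summaries and report the
    n most frequent values with their combined counts."""
    total = Counter()
    for summary in summaries:
        total.update(dict(summary))   # add the forwarded counts
    return total.most_common(n)


s1 = [1, 1, 1, 1, 2, 2, 3]
s2 = [2, 2, 2, 1, 1, 4]
summaries = [local_summary(s, top=2) for s in (s1, s2)]
print(merge_summaries(summaries, 2))
```

Only the truncated summaries are forwarded, so values dropped at a source (3 and 4 in the example) never reach the central machine. This is the accuracy-for-bandwidth trade-off that the fixed summary size controls.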

Based on these three configurations, we conducted three sets of experiments, which are detailedbelow.

Experiment 1. Our first experiment demonstrated that auto-config and opt-config are almost as good as a manual-config, which is based on enumerating a significant fraction of all possible choices. We had four data sources with fixed locations. We further made six clusters available to run the intermediate stage of the application. The results are shown in Figure 3. We considered two different cases, corresponding to 700 000 and 2 000 000 integers being produced by each data source. When each data source created 700 000 integers, the application using the manual-config is 5.3% faster than that using the auto-config and 3.6% faster than that using opt-config. In the second case, we see that the manual-config is just 1.4% faster than auto-config and 0.8% faster than opt-config.

Thus, the results indicate that the performances in the above three scenarios are very close, i.e. our algorithm is effective. Furthermore, the larger the datasets, the smaller the differences.


Figure 4. Comparing auto-config with 120 other configurations. (Plot of execution time in ms, on an axis from 0 to 1,000,000, for opt-config and the randomly chosen configurations.)

Experiment 2. The second experiment was conducted in the same environment as the first one. We randomly selected 120 out of 1296 configurations that were possible and compared them with the opt-config. We did not carry out an exhaustive evaluation, as the total number of choices was very large. As shown in Figure 4, opt-config outperforms all but one of the randomly selected configurations. The one configuration that performs better than the optimized one is just 2% faster. Almost all of the other configurations resulted in a slow-down by at least a factor of 2, and in many cases, up to a factor of 3. This again shows that our algorithm is effective.

Experiment 3. In the environment we had considered for the first two experiments, the total number of possible deployment configurations was very large. This did not allow us to perform exhaustive comparisons. Therefore, we used a different environment, in which the number of data sources was three and the number of available clusters was four. The network topology was randomly created. The number of possible deployment configurations was now 64, which allowed exhaustive comparisons. We compared the algorithm generated and optimized configuration, opt-config, against all 64 configurations.

The results are shown in Figure 5. opt-config outperforms all possible configurations, and in most cases, by at least a factor of 2. This confirms our observations from the previous two experiments.
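The text does not state how the counts of 1296 and 64 configurations were obtained. One reading that reproduces both numbers, offered here as an assumption, is that the intermediate stage of each data source is independently assigned to one of the available clusters, giving (number of clusters) raised to the power (number of data sources) assignments: $6^4 = 1296$ and $4^3 = 64$. The helper below is hypothetical.

```python
from itertools import product


def enumerate_assignments(num_sources, clusters):
    """Every way to give each data source's intermediate stage one cluster."""
    return list(product(clusters, repeat=num_sources))


print(len(enumerate_assignments(4, range(6))))  # setting of Experiments 1 and 2
print(len(enumerate_assignments(3, range(4))))  # setting of Experiment 3
```

Under this reading the exhaustive comparison in Experiment 3 covers every assignment, while the 120 configurations sampled in Experiment 2 cover a little over 9% of the space.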


Figure 5. Comparing auto-config with all possible configurations. (Plot of execution time in ms, on an axis from 0 to 1,000,000, for opt-config and all man-configs.)

5. RELATED WORK

Resource allocation has been an important topic in the Grid community. Most of the initial work has been on static matching of the resource requirements and the available resources [8–14]. However, none of these efforts considered pipelined or streaming applications. It should be noted that our approach does not require resource requirements to be explicitly stated by an application, in contrast to Condor’s matchmaking [15] or the Aurora system [16]. Much work has been done on resource discovery [17,18], often using mobile agents or objects to perform efficient search. Our focus is on resource allocation, and we assume that one of the existing techniques has been used for resource discovery. Realtor [19] is a protocol for supporting survivability and information assurance by migrating components to safe locations under circumstances of external attack, malfunction, or lack of resources. Our work is distinct in considering resource degradation and application adaptation. Isert and Schwan have developed a system called ACDS, which includes a monitoring and steering tool for adapting stream based computations [20], including assigning alternative resources. In comparison, we consider a more restrictive class of applications, but automate the dynamic resource allocation process more.

In the area of stream processing, the work that is probably the closest to our work is the dQUOB project [21,22]. This system enables continuous processing of SQL queries on data streams. Our work is distinct in the following ways. First, we support an API to allow general processing, and not just SQL queries. Second, the processing can be done in a pipeline of stages. Third, our system is built on top of Globus 3.0 and, thus, exploits the existing support for resource discovery. Data stream processing has also received much attention in the database community [23]. Prominent work in this area has been done at Stanford [24], Berkeley [25], Brown and MIT [16], Wisconsin [26], among others. The focus in this community has largely been on the centralized processing of a single data stream. Our focus is quite different, as we consider distributed processing of distributed data streams, and use Grid resources and standards. Aurora* is a framework for distributed processing of data streams, but only within a single administrative domain [27]. Also, the focus in this work is on scalable communication, and there is no support for adaptation of the processing.

Our work has some similarities with the Grid-based (dynamic) workflow projects, including theSDSC Matrix project‡, work by Abramson and Kommineni at Monash University [28], and byDeelman et al. at ISI [29]. Our work is distinct in considering streaming data with real-time constrainton the processing.

With increasing wide-area bandwidth, the potential for real-time wide-area distributed computing has been recognized by others as well. As part of the Optiputer project, Kim has outlined a proposal for using real-time programming techniques [30]. However, that proposal does not consider the stream model of processing or adaptation of processing to achieve a real-time constraint.

Many researchers have proposed techniques for real-time resource allocation, including, for example, [31]. The problem we consider is different because we focus on pipelined processing of streaming data and wide-area distributed environments.

6. CONCLUSIONS AND FUTURE WORK

Resource allocation has been an important problem in Grid computing and has been widely studied for many application classes. With scientific instruments and experiments that continuously generate data, and increasing network speeds, we expect processing of data streams to be an important application class for Grid computing. This paper has investigated resource allocation for applications that involve processing of distributed data streams.

We have presented a heuristic algorithm for this problem, which is based on minimum spanning trees. Our current evaluation has shown that the algorithm is effective in practice.
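For readers unfamiliar with the underlying building block, the following is a minimal sketch of a minimum spanning tree computation (Kruskal's algorithm with a union-find structure). It is not the allocation algorithm presented in this paper, only the generic graph primitive on which such a heuristic can be built. The node names, the bandwidth values, and the choice of inverse bandwidth as the edge cost are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (node_a, node_b, cost)


def minimum_spanning_tree(nodes: List[str], edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning tree (Kruskal's algorithm)."""
    # Union-find forest: every node starts as the root of its own component.
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        # Walk up to the component root, shortening the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree: List[Edge] = []
    # Consider edges from cheapest to most expensive.
    for a, b, cost in sorted(edges, key=lambda e: e[2]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # The edge joins two separate components, so it cannot form a cycle.
            parent[root_a] = root_b
            tree.append((a, b, cost))
    return tree


# Hypothetical link bandwidths in Mbit/s between two data sources
# and two compute nodes (illustrative values only).
bandwidth = {
    ("src1", "n1"): 100.0,
    ("src2", "n1"): 10.0,
    ("src2", "n2"): 80.0,
    ("n1", "n2"): 50.0,
    ("src1", "n2"): 5.0,
}
nodes = ["src1", "src2", "n1", "n2"]

# Assumption: cost = 1 / bandwidth, so high-bandwidth links are preferred.
edges = [(a, b, 1.0 / bw) for (a, b), bw in bandwidth.items()]
tree = minimum_spanning_tree(nodes, edges)

for a, b, cost in tree:
    print(f"{a} -- {b} (bandwidth {1.0 / cost:.0f} Mbit/s)")
```

With these assumed values the tree keeps the three highest-bandwidth links that connect all four nodes and discards the two slow links.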

Several issues remain to be considered. First, our current algorithm assumes that networking bandwidths between the initial stages are the major bottleneck. In general, the algorithm should be able to take relative computational and communication costs between different stages as the input, and should then determine the bottleneck. Similarly, we currently assume that all but the final stages can process each data source independently. Again, we can generalize this and consider a tree or a DAG representation of the stages. Another direction will be to consider dynamic resource allocation, as available network bandwidths or computational cycles can vary over time. Finally, evaluation of algorithms in a realistic but controlled environment remains a challenge.

‡See http://www.npaci.edu/DICE/SRB/matrix/.


REFERENCES

1. Borovikov E, Sussman A, Davis L. A high-performance multi-perspective vision studio. Proceedings of the International Supercomputing Conference (ICS). ACM Press: New York, 2003.

2. Dokas P, Ertoz L, Kumar V, Lazarevic A, Srivastava J, Tan P. Data mining for network intrusion detection. Proceedings of the NSF Workshop on Next Generation Data Mining, November 2002.

3. Chen L, Reddy K, Agrawal G. GATES: A Grid-based middleware for distributed processing of data streams. Proceedings of the IEEE Conference on High Performance Distributed Computing (HPDC). IEEE Computer Society Press: Los Alamitos, CA, 2004.

4. Foster I, Kesselman C, Nick JM, Tuecke S. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure Working Group, Global Grid Forum, June 2002.

5. Foster I, Kesselman C, Nick J, Tuecke S. Grid services for distributed systems integration. IEEE Computer 2002; 35(6):37–46.

6. Czajkowski K, Fitzgerald S, Foster I, Kesselman C. Grid information services for distributed resource sharing. Proceedings of the 10th IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), August 2001.

7. Gibbons PB, Matias Y. New sampling-based summary statistics for improving approximate query answers. Proceedings of the 1998 ACM SIGMOD Conference. ACM Press: New York, 1998; 331–342.

8. Foster I, Kesselman C, Tuecke S. The anatomy of the Grid: Enabling scalable virtual organizations. International Journal of Supercomputer Applications 2001; 15(3).

9. Chapin S, Katramatos D, Karpovich J, Grimshaw A. Resource management in Legion. Future Generation Computer Systems 1999; 15(5–6):583–594.

10. Xu D, Nahrstedt K, Wichadakul D. QoS-aware discovery of wide-area distributed services. Proceedings of the 1st IEEE/ACM International Symposium on Cluster Computing and the Grid, May 2001; 92–99.

11. Iamnitchi A, Foster I, Nurmi D. A peer-to-peer approach to resource discovery in Grid environments. High Performance Distributed Computing, Edinburgh, U.K., July 2002. IEEE Press: Piscataway, NJ, 2002.

12. Rana OF, Bunford-Jones D, Walker DW, Addis M, Surridge M, Hawick K. Resource discovery for dynamic clusters in computational Grid. Proceedings of the 10th IEEE Heterogeneous Computing Workshop, San Francisco, CA, 2001.

13. Zhang L, Deering S, Estrin D, Shenker S, Zappala D. RSVP: A new resource reservation protocol. IEEE Networks Magazine 1993; 31(9):8–18.

14. Thain D, Tannenbaum T, Livny M. Distributed computing in practice: The Condor experience. Concurrency and Computation: Practice and Experience 2005; 17(2–4):323–356.

15. Raman R, Livny M, Solomon M. Matchmaking: Distributed resource management for high throughput computing. Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing, Chicago, IL, July 1998.

16. Carney D, Cetintemel U, Cherniack M, Convey C, Lee S, Seidman G, Stonebraker M, Tatbul N, Zdonik S. Monitoring streams: A new class of data management applications. Proceedings of the Conference on Very Large Data Bases (VLDB), 2002; 215–226.

17. Jun K, Boloni L, Palacz K, Marinescu DC. Agent-based resource discovery. Proceedings of the 9th Heterogeneous Computing Workshop, May 2000; 43–52.

18. Moreau L. Agents for the Grid: A comparison for Web services (part 1: The transport layer). Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID 2002), Berlin, Germany, 2002; 220–228.

19. Choi BK, Rho S, Bettati R. Dynamic resource discovery for applications survivability in distributed real-time systems. Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2003; 122.

20. Isert C, Schwan K. ACDS: Adapting computational data streams for high performance. Proceedings of the 14th International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun, Mexico, May 2000. IEEE Computer Society Press: Los Alamitos, CA, 2000; 641–646.

21. Plale B. Leveraging runtime knowledge about event rates to improve memory utilization in wide area data stream filtering. Proceedings of the IEEE High Performance Distributed Computing (HPDC), August 2002.

22. Plale B, Schwan K. Dynamic querying of streaming data with the dQUOB system. IEEE Transactions on Parallel and Distributed Systems 2003; 14(4):422–432.

23. Golab L, Ozsu M. Issues in data stream management. SIGMOD Record 2003; 32(2):5–14.

24. Arasu A, Babu S, Widom J. An abstract semantics and concrete language for continuous queries over streams and relations. Proceedings of the 9th International Conference on Data Base Programming Languages (DBPL '03), September 2003.

25. Chandrasekaran S et al. TelegraphCQ: Continuous dataflow processing for an uncertain world. Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2003; 269–280.

26. Viglas S, Naughton J. Rate-based query optimization for streaming information sources. Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, June 2002.


27. Cherniack M, Balakrishnan H, Balazinska M, Carney D, Cetintemel U, Xing Y, Zdonik S. Scalable distributed stream processing. Proceedings of the Conference on Innovative Data Systems Research (CIDR), January 2003.

28. Abramson D, Kommineni J. A flexible IO scheme for Grid workflows. Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), April 2004.

29. Deelman E et al. Mapping abstract complex workflows onto Grid environments. Journal of Grid Computing 2003; 1(1):25–39.

30. Kim KH. Wide-area real-time distributed computing in a tightly managed Grid: An optiputer project (Keynote Abstract). Proceedings of AINA-2004, March 2004; 226–236.

31. Rosu DI, Schwan K, Yalamanchili S, Jha R. On adaptive resource allocation for complex real-time applications. Proceedings of the 18th Real Time Systems Symposium (RTSS), December 1997.
