PhD Candidate: Alex K. Y. Cheung Supervisor: Hans-Arno Jacobsen PhD Thesis Presentation University of Toronto March 28, 2011 MIDDLEWARE SYSTEMS RESEARCH GROUP Resource Allocation Algorithms for Event-Based Enterprise Systems

Page 1: Resource Allocation Algorithms for Event-Based Enterprise Systems

PhD Candidate: Alex K. Y. Cheung
Supervisor: Hans-Arno Jacobsen

PhD Thesis Presentation
University of Toronto
March 28, 2011

MIDDLEWARE SYSTEMS RESEARCH GROUP

Page 2: Introduction to Distributed Content-based Publish/Subscribe

PhD Thesis Presentation, Alex Cheung © 2011

[Figure: a broker overlay connecting one publisher and two subscribers; publications are multicast to matching subscribers. Legend: advertisement path, subscription path, publication path]

• Advertisement (publisher): brand = ‘Honda’, cashback >= $0
• Publication (publisher): brand = ‘Honda’, cashback = $6000
• Subscription (subscriber): brand = ‘Honda’, cashback > $2000
• Subscription (subscriber): brand = ‘Honda’, cashback > $4000
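The matching step in the example above can be sketched in a few lines. This is a minimal illustration, assuming a subscription is a conjunction of (attribute, operator, value) predicates and a publication is a set of attribute/value pairs; the names `matches`, `OPS`, and `sub_other` are hypothetical and not taken from PADRES.

```python
import operator

# Comparison operators allowed in a predicate (illustrative subset).
OPS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(subscription, publication):
    """Return True if every predicate of the subscription holds for the publication."""
    for attribute, op, value in subscription:
        if attribute not in publication:
            return False
        if not OPS[op](publication[attribute], value):
            return False
    return True


# Values taken from the slide's example.
publication = {"brand": "Honda", "cashback": 6000}
sub_2000 = [("brand", "=", "Honda"), ("cashback", ">", 2000)]
sub_4000 = [("brand", "=", "Honda"), ("cashback", ">", 4000)]
# Hypothetical third subscription (not on the slide) that should not match.
sub_other = [("brand", "=", "Honda"), ("cashback", ">", 8000)]

subscriptions = [("sub_2000", sub_2000), ("sub_4000", sub_4000), ("sub_other", sub_other)]
recipients = [name for name, sub in subscriptions if matches(sub, publication)]
print(recipients)
```

A broker would forward the publication only toward the subscriptions in `recipients`, which is what lets the overlay avoid sending traffic where nobody is interested.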

Page 3: Desirable Properties of Distributed Content-based Publish/Subscribe

• Decoupling of data sources and sinks
  ▫ Ease of component addition and removal
• Flexible routing based on message content
  ▫ Efficient use of network resources
• Distributed broker overlay network
  ▫ Scalable
  ▫ Fault tolerant

Page 4: Applications of Publish/Subscribe

• Network and systems monitoring [Mukherjee 1994]
• Business activity monitoring [Fawcett et al. 1999]
• Business process execution [Schuler et al. 2001]
• Workflow management [Cugola et al. 2001]
• Multiplayer online games [Bharambe et al. 2002]
• RSS filtering [Petrovic et al. 2005; Rose et al. 2007]
• Automated service composition [Hu et al. 2008]
• Resource discovery [Yan et al. 2009]

Page 5: Real Deployments of Distributed Publish/Subscribe

• GooPS
  ▫ Google’s pub/sub messaging middleware to integrate web applications (such as Gmail, Google Docs, Google Calendar) on a world-wide scale supporting millions of users
  ▫ Hundreds of brokers with tens of thousands of pub/sub clients
• Yahoo Message Broker
  ▫ Yahoo’s pub/sub middleware to integrate applications with their database system, PNUTS
• SuperMontage
  ▫ Tibco’s pub/sub distribution network for Nasdaq’s quote and order-processing system
• GDSN (Global Data Synchronization Network)
  ▫ A global pub/sub network that allows retailers and suppliers (e.g., Walmart, Target, Metro) to exchange timely and accurate supply chain data

Page 6: Contributions

• Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
• Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
• Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)

Page 7: Problem

• Brokers located in different geographical areas may suffer from uneven load distribution due to
  ▫ Heterogeneous servers
  ▫ Network congestion
  ▫ Different densities and interests of end-users
• Consequences
  ▫ Overloaded brokers introduce high delivery delays and may ultimately crash from running out of memory
  ▫ The system does not scale with the added resources

Page 8: Visualizing the Problem

[Figure: broker overlay with a publisher (P) and subscribers (S)]

Page 9: Overview of Load Balancing Approach

[Figure: broker overlay with a publisher (P) and subscribers (S), showing local load balancing and global load balancing between an offloading broker and a load-accepting broker]

Page 10: Evaluation

• Implemented on a real open source pub/sub system called PADRES
• PlanetLab and a cluster testbed
• Local and global load balancing
• Homogeneous and heterogeneous servers
• Compared against a naive approach

[Figure: Global LB Setup, showing brokers B10 to B12, B20 to B22, B30 to B32, B40 to B42, B50 to B52, and B60 to B62 with publishers (P) and subscribers (S)]

Page 11: Summary

• Load balancing enables the pub/sub system to scale with the number of resources
• Load balancing solutions that are unaware of subscription load and relationships are ineffective
  ▫ Long response time
  ▫ Unstable system

Page 12: Contributions

• Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
• Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
• Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)

Page 13: Problem

• Publishers can join anywhere in the overlay or at the closest broker
• Consequences
  ▫ High delivery delay, leading to a sluggish system
  ▫ High resource usage in terms of matching, network bandwidth, and subscription storage, leading to high IT costs

[Figure: broker overlay with a publisher (P) and subscribers (S)]

Page 14: Approach

• Adaptively move the publisher to the area of matching subscribers
• Two unique solutions
  ▫ POP (Publisher Optimistic Placement)
    Decision is based on the average number of downstream publication deliveries
  ▫ GRAPE (Greedy Relocation Algorithm for Publishers of Events)
    Decision is based on the end-to-end delivery delay, total broker message rate, and user-specified inputs, including the minimization metric (load or delivery delay) and weight

[Figure: broker overlay with a publisher (P) and subscribers (S)]
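To make the idea of a weighted placement decision concrete, here is a small sketch. It is not the GRAPE algorithm itself: it assumes each candidate broker already has an estimated delivery delay and message rate, and it folds the user's minimization metric and weight into a single `weight` in [0, 1]. The names `Candidate` and `choose_broker` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    broker: str
    delivery_delay_ms: float  # estimated end-to-end delay if the publisher connects here
    message_rate: float       # estimated total broker message rate for this placement


def choose_broker(candidates, weight):
    """Pick the broker with the lowest weighted score.

    weight = 1.0 considers delivery delay only; weight = 0.0 considers load only.
    Both metrics are divided by their maximum so that they are comparable.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    max_delay = max(c.delivery_delay_ms for c in candidates) or 1.0
    max_rate = max(c.message_rate for c in candidates) or 1.0

    def score(c):
        return (weight * c.delivery_delay_ms / max_delay
                + (1.0 - weight) * c.message_rate / max_rate)

    return min(candidates, key=score).broker


# Made-up estimates for three candidate brokers.
candidates = [
    Candidate("B1", delivery_delay_ms=40.0, message_rate=900.0),
    Candidate("B2", delivery_delay_ms=90.0, message_rate=300.0),
    Candidate("B3", delivery_delay_ms=60.0, message_rate=500.0),
]

print(choose_broker(candidates, weight=1.0))  # delay only
print(choose_broker(candidates, weight=0.0))  # load only
print(choose_broker(candidates, weight=0.5))  # balanced
```

With these numbers the delay-only setting favors B1, the load-only setting favors B2, and an even weight settles on B3, which shows how the weight trades one metric against the other.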

Page 15: Evaluation

• Implemented on the open source pub/sub system called PADRES
• PlanetLab and a cluster testbed
• Enterprise and random workloads

[Figure callouts: reduced delivery delay by up to 68%; reduced message rate by up to 85%]

Page 16: Summary

• POP is suitable for pub/sub systems that strive for simplicity, such as GooPS
• GRAPE is suitable for systems that strive to minimize a single metric to the extreme, such as system load in sensor networks or delivery delay in SuperMontage

Page 17: Contributions

• Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
• Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
• Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)

Page 18: Problem

• What is the deployment strategy for the broker overlay, publisher assignment, and subscriber assignment that minimizes the broker message rate and the number of allocated brokers?
• Proven to be an NP-complete problem
• Benefits
  ▫ Increase capacity of the system
  ▫ More efficient energy usage of the allocated servers
  ▫ Fewer servers mean lower investment and maintenance costs
  ▫ In line with Green IT, which enterprises such as Google and Yahoo are currently engaged in

Page 19: Approach

• 3-phase design
  ▫ Phase 1: Record the publications delivered to each subscription into bit vectors
  ▫ Phase 2: Use information from the bit vectors to allocate subscriptions to brokers using one of 10 algorithms
  ▫ Phase 3: Construct the broker overlay with 3 optimization techniques and deploy the new configuration
• Most compelling properties
  ▫ Language independent: content-based (XPath, regex, ranged, SQL, composite subscriptions, etc.) and topic-based, such as GooPS
  ▫ Works effectively under any workload (defined or undefined)

Page 20: Phase 1: Subscription Profiling

[Figure: a subscription's bit vector profile. The profile stores the message ID of the first index (B34-M213) and the start of the bit vector; a bit is set to 1 for each publication delivered to the subscription. Publications delivered to the subscription: B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, B34-M226]

• Profile of each subscriber per advertisement is maintained at the subscriber’s first broker
• Cardinality of the bit vector corresponds to the bandwidth requirement of the subscription
• Used to compute the “closeness” between any two subscriptions in the clustering algorithm: closeness = |s_i ∩ s_j|
• Fixed size, so shift left if the next publication is out of the bit vector’s range
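The profile described above can be sketched as follows. This is a minimal illustration rather than the PADRES implementation: it assumes message IDs reduce to integer sequence numbers (213 for B34-M213), and the class and function names are hypothetical. A Python integer serves as the bit set, with bit i standing for message `first_id + i`, so the slide's "shift left" (dropping the oldest entries) becomes a right shift on the integer.

```python
class SubscriptionProfile:
    """Fixed-size bit vector of the publications delivered to one subscription."""

    def __init__(self, size):
        self.size = size
        self.first_id = None  # message ID mapped to index 0
        self.bits = 0         # bit i is set if message first_id + i was delivered

    def record(self, message_id):
        if self.first_id is None:
            self.first_id = message_id
        offset = message_id - self.first_id
        if offset < 0:
            return  # older than the window; ignored in this sketch
        if offset >= self.size:
            # Slide the window so that message_id lands on the last index,
            # discarding the oldest entries.
            shift = offset - self.size + 1
            self.bits >>= shift
            self.first_id += shift
            offset = self.size - 1
        self.bits |= 1 << offset

    def cardinality(self):
        """Number of delivered publications, a proxy for bandwidth requirement."""
        return bin(self.bits).count("1")


def closeness(a, b):
    """|s_i ∩ s_j|: number of publications delivered to both subscriptions."""
    if a.first_id is None or b.first_id is None:
        return 0
    base = min(a.first_id, b.first_id)
    a_bits = a.bits << (a.first_id - base)  # align both vectors to the same base ID
    b_bits = b.bits << (b.first_id - base)
    return bin(a_bits & b_bits).count("1")


# Sequence numbers from the slide's figure.
p = SubscriptionProfile(size=16)
for message in (213, 215, 216, 217, 220, 222, 225, 226):
    p.record(message)

# A second, made-up subscription that shares three of those publications.
q = SubscriptionProfile(size=16)
for message in (215, 216, 221, 226):
    q.record(message)

print(p.cardinality())  # 8
print(closeness(p, q))  # 3
```

Because only a start ID and a fixed number of bits are stored, the profile needs no knowledge of the subscription language, which is consistent with the language-independence claim made for the approach.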

Page 21: Phase 2: Subscription Allocation Algorithms

• MANUAL/(AUTOMATIC)
  ▫ Tree with fanout of 2, manual (random) placement of clients
• Fastest Broker First (FBF)
  ▫ Assign subscriptions randomly to the next most powerful broker
• Bin Packing
  ▫ Like FBF, but assigns the next highest traffic subscription
• PAIRWISE-N, PAIRWISE-K (related approaches in ICDCS’02)
  ▫ Subscription clustering where the number of clusters is given
• CRAM (Clustering with Resource Awareness and Minimization)
  ▫ Dynamically determines the number of clusters
  ▫ Utilizes a new clustering algorithm that is more effective
  ▫ Evaluated with 4 different subscription closeness metrics, with one derived from Banavar et al. in ICDCS’99
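As a rough illustration of subscription clustering with a given number of clusters, the sketch below greedily merges the two clusters that share the most publications until k clusters remain. It is a generic agglomerative scheme, not the PAIRWISE or CRAM algorithms, and it uses plain sets of message IDs in place of bit vectors; `cluster_subscriptions` and the sample profiles are hypothetical.

```python
from itertools import combinations


def cluster_subscriptions(profiles, k):
    """Greedy agglomerative clustering of subscriptions.

    profiles: dict mapping subscription name -> set of delivered message IDs.
    Returns a list of (member_names, merged_profile) tuples with at most k entries.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [([name], set(ids)) for name, ids in sorted(profiles.items())]
    while len(clusters) > k:
        # closeness = size of the intersection of the two profiles
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: len(clusters[pair[0]][1] & clusters[pair[1]][1]),
        )
        members = clusters[i][0] + clusters[j][0]
        merged = clusters[i][1] | clusters[j][1]  # the cluster receives the union
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append((members, merged))
    return clusters


# Made-up profiles: s1/s2 overlap heavily, as do s3/s4.
profiles = {
    "s1": {1, 2, 3, 4},
    "s2": {2, 3, 4, 5},
    "s3": {10, 11},
    "s4": {10, 11, 12},
}
result = cluster_subscriptions(profiles, k=2)
for members, merged in result:
    print(sorted(members), len(merged))
```

Placing each resulting cluster on one broker means a publication shared by its members is delivered to that broker once, which is the intuition behind using closeness to reduce broker message rates.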

21

Page 22: Bin Packing

[Figure: subscribers (S) to be allocated to brokers]

Page 23: Bin Packing’s Allocation Result

[Figure: subscribers (S) allocated to brokers by Bin Packing]
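The Bin Packing allocation can be approximated with a first-fit-decreasing heuristic, sketched below. The details are assumptions: brokers are tried from most to least powerful, subscriptions are taken from highest to lowest traffic, and capacity and traffic share one unit (for example, messages per second). The function name and all numbers are hypothetical.

```python
def bin_pack(subscriptions, brokers):
    """Allocate subscriptions to brokers, first-fit decreasing.

    subscriptions: dict mapping subscription name -> traffic requirement.
    brokers: dict mapping broker name -> capacity.
    Returns a dict mapping each allocated broker -> list of subscription names.
    """
    remaining = dict(brokers)
    # Most powerful broker first.
    order = sorted(brokers, key=lambda b: brokers[b], reverse=True)
    allocation = {}
    # Highest-traffic subscription first.
    by_traffic = sorted(subscriptions.items(), key=lambda kv: kv[1], reverse=True)
    for sub, load in by_traffic:
        for broker in order:
            if remaining[broker] >= load:
                remaining[broker] -= load
                allocation.setdefault(broker, []).append(sub)
                break
        else:
            raise ValueError(f"no broker has capacity for {sub}")
    return allocation


# Made-up workload: five subscriptions, three brokers.
subscriptions = {"s1": 50, "s2": 40, "s3": 30, "s4": 20, "s5": 10}
brokers = {"B1": 100, "B2": 60, "B3": 60}

allocation = bin_pack(subscriptions, brokers)
print(allocation)
```

In this example broker B3 receives no subscriptions, so it would not need to be allocated at all, which is the kind of saving the resource allocation phase aims for.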

Page 24: Phase 3: Broker Overlay Construction

[Figure: broker overlay constructed over the brokers hosting the allocated subscribers (S)]

Page 25: Bin Packing’s Final Overlay

[Figure: final broker overlay with subscribers (S); publishers (P) are placed using GRAPE]

Page 26: Evaluation

• Implemented on the PADRES open source content-based pub/sub project
• Evaluated on a cluster testbed with 80 brokers
• Evaluated on SciNet, an HPC cluster, with 1000 brokers
• Comparison against two related works (Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)
• Homogeneous and heterogeneous scenarios
• Workload saturates the initial deployment (MANUAL)

Page 27: Evaluation Results on SciNet

[Figure callouts: reduced message rate by up to 92%; reduced number of allocated brokers by up to 91%]

Page 28: Summary

• CRAM combines the benefits of
  ▫ Subscription clustering
  ▫ Resource awareness from Bin Packing
  by simultaneously reducing both
  ▫ Broker message rates
  ▫ Number of allocated brokers
• Bit vectors are powerful
  ▫ Language independent (XPath, regex, topics)
  ▫ Effective with any workload distribution

Page 29: Conclusions

• Load balancing increases
  ▫ Availability by circumventing overloads
  ▫ Scalability of the system
• Publisher placement algorithms reduce
  ▫ Broker input load by up to 68%
  ▫ Broker message rate by up to 85%
  ▫ Delivery delay by up to 68%
• Resource allocation algorithms reduce
  ▫ Average broker message rate by up to 92%
  ▫ Number of allocated brokers by up to 91%

Page 30: Future Work

• Self-tuning of load balancing parameters
• React dynamically by growing and shrinking the network in incremental steps
• Improve runtime of the CRAM algorithm by parallelization or reducing its computational complexity
• Model workload with more sophisticated methods, such as stochastic processes, to improve accuracy of load estimation
• Address fault resiliency in each approach

Page 31: Q & A