Resource Allocation Algorithms for Event-Based Enterprise Systems
PhD Candidate: Alex K. Y. Cheung
Supervisor: Hans-Arno Jacobsen
PhD Thesis Presentation
University of Toronto
March 28, 2011
Middleware Systems Research Group
PhD Thesis Presentation, Alex Cheung © 2011
Introduction to Distributed Content-based Publish/Subscribe
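[Figure: a publisher and two subscribers connected through a broker overlay, with multicast delivery at the broker. Legend: advertisement path, subscription path, publication path.]
• Advertisement: brand = ‘Honda’, cashback >= $0
• Publication: brand = ‘Honda’, cashback = $6000
• Subscription (first subscriber): brand = ‘Honda’, cashback > $2000
• Subscription (second subscriber): brand = ‘Honda’, cashback > $4000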
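The matching step in the example above can be sketched in a few lines of Python. This is only an illustration of content-based matching, not the PADRES implementation: the predicate format, the `matches` and `route` helpers, and the subscriber names are assumptions made for this sketch.

```python
import operator

# Comparison operators allowed in a predicate (assumed set for this sketch).
OPS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(subscription, publication):
    """Return True if the publication satisfies every predicate of the subscription."""
    for attr, op, value in subscription:
        if attr not in publication or not OPS[op](publication[attr], value):
            return False
    return True


def route(publication, subscriptions):
    """Return the names of all subscribers whose subscription matches."""
    return [name for name, preds in subscriptions.items() if matches(preds, publication)]


# Subscriptions and publication taken from the slide; the names are hypothetical.
subscriptions = {
    "subscriber_1": [("brand", "=", "Honda"), ("cashback", ">", 2000)],
    "subscriber_2": [("brand", "=", "Honda"), ("cashback", ">", 4000)],
}
publication = {"brand": "Honda", "cashback": 6000}

print(route(publication, subscriptions))
```

A cashback of $6000 satisfies both subscriptions, so the publication is delivered to both subscribers; a cashback of $3000 would reach only the first one.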
Desirable Properties of Distributed Content-based Publish/Subscribe
• Decoupling of data sources and sinks: ease of component addition and removal
• Flexible routing based on message content: efficient use of network resources
• Distributed broker overlay network: scalable and fault tolerant
Applications of Publish/Subscribe
• Network and systems monitoring [Mukherjee 1994]
• Business activity monitoring [Fawcett et al. 1999]
• Business process execution [Schuler et al. 2001]
• Workflow management [Cugola et al. 2001]
• Multiplayer online games [Bharambe et al. 2002]
• RSS filtering [Petrovic et al. 2005; Rose et al. 2007]
• Automated service composition [Hu et al. 2008]
• Resource discovery [Yan et al. 2009]
Real Deployments of Distributed Publish/Subscribe
• GooPS
▫ Google’s pub/sub messaging middleware to integrate web applications (such as Gmail, Google Docs, Google Calendar) on a world-wide scale supporting millions of users
▫ Hundreds of brokers with tens of thousands of pub/sub clients
• Yahoo Message Broker
▫ Yahoo’s pub/sub middleware to integrate applications with their database system, PNUTS
• SuperMontage
▫ Tibco’s pub/sub distribution network for Nasdaq’s quote and order-processing system
• GDSN (Global Data Synchronization Network)
▫ A global pub/sub network that allows retailers and suppliers (e.g., Walmart, Target, Metro) to exchange timely and accurate supply chain data
Contributions
• Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
• Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
• Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)
Problem
• Brokers located in different geographical areas may suffer from uneven load distribution due to
▫ Heterogeneous servers
▫ Network congestion
▫ Different densities and interests of end-users
• Consequences
▫ Overloaded brokers introduce high delivery delays and may ultimately crash from running out of memory
▫ The system does not scale with the added resources
Visualizing the Problem
[Figure: broker overlay with publisher (P) and subscriber (S) clients]
Overview of Load Balancing Approach
• Local load balancing
• Global load balancing
[Figure: broker overlay with publisher (P) and subscriber (S) clients; labels: offloading broker, load-accepting broker]
Evaluation
• Implemented on a real open source pub/sub system called PADRES
• PlanetLab and a cluster testbed
• Local and global load balancing
• Homogeneous and heterogeneous servers
• Compared against a naive approach
[Figure: Global LB Setup. Broker overlay of B10 to B12, B20 to B22, B30 to B32, B40 to B42, B50 to B52, and B60 to B62, with publisher (P) and subscriber (S) clients.]
Summary
• Load balancing enables the pub/sub system to scale with the number of resources
• Load balancing solutions that are unaware of subscription load and relationships are ineffective
▫ Long response time
▫ Unstable system
Contributions
• Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
• Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
• Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)
Problem
• Publishers can join any broker in the overlay, or simply the closest one
• Consequences
▫ High delivery delay, leading to a sluggish system
▫ High resource usage in terms of matching, network bandwidth, and subscription storage, leading to high IT costs
Approach
• Adaptively move publisher to area of matching subscribers
• Two unique solutions
▫ POP (Publisher Optimistic Placement): decision is based on the average number of downstream publication deliveries
▫ GRAPE (Greedy Relocation Algorithm for Publishers of Events): decision is based on the end-to-end delivery delay, total broker message rate, and user specified inputs including the minimization metric (load/delivery delay) and weight
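To make the GRAPE inputs concrete, the sketch below picks a target broker from per-broker estimates of delivery delay and message rate. The slide does not give the scoring formula, so the normalized weighted sum, the `Candidate` type, and the `pick_broker` function are assumptions made for illustration rather than the published algorithm.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    broker: str
    delay_ms: float       # estimated end-to-end delivery delay if the publisher moves here
    message_rate: float   # estimated total broker message rate (msg/s) if it moves here


def pick_broker(candidates, metric="delay", weight=1.0):
    """Greedily pick the candidate with the lowest weighted score.

    metric: the metric to prioritize, "delay" or "load".
    weight: value in [0, 1]; 1.0 considers only the prioritized metric.
    The weighted-sum scoring is an assumption of this sketch.
    """
    if metric not in ("delay", "load"):
        raise ValueError("metric must be 'delay' or 'load'")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")

    # Normalize both metrics so they are comparable; guard against all-zero input.
    max_delay = max(c.delay_ms for c in candidates) or 1.0
    max_rate = max(c.message_rate for c in candidates) or 1.0

    def score(c):
        delay = c.delay_ms / max_delay
        load = c.message_rate / max_rate
        primary, secondary = (delay, load) if metric == "delay" else (load, delay)
        return weight * primary + (1.0 - weight) * secondary

    return min(candidates, key=score).broker


# Hypothetical estimates for three brokers.
candidates = [
    Candidate("B1", delay_ms=40, message_rate=900),
    Candidate("B2", delay_ms=55, message_rate=300),
    Candidate("B3", delay_ms=90, message_rate=250),
]
print(pick_broker(candidates, metric="delay", weight=1.0))
print(pick_broker(candidates, metric="load", weight=1.0))
print(pick_broker(candidates, metric="delay", weight=0.5))
```

With a weight of 1.0 the choice follows the prioritized metric alone; lowering the weight trades some of that metric for the other one.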
Evaluation
• Implemented on the open source pub/sub system called PADRES
• PlanetLab and a cluster testbed
• Enterprise and random workloads
• Reduced delivery delay by up to 68%
• Reduced message rate by up to 85%
Summary
• POP is suitable for pub/sub systems that strive for simplicity, such as GooPS
• GRAPE is suitable for systems that need to minimize one metric to the extreme, such as system load in sensor networks or delivery delay in SuperMontage
Contributions
• Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
• Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
• Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)
Problem
• What is the deployment strategy for the broker overlay, publisher assignment, and subscriber assignment to minimize the broker message rate and number of allocated brokers?
• Proven to be an NP-complete problem
• Benefits
▫ Increase capacity of the system
▫ More efficient energy usage of the allocated servers
▫ Fewer servers mean lower investment and maintenance costs
▫ In line with Green IT, which enterprises such as Google and Yahoo are currently engaged in
Approach
• 3-phase design
▫ Phase 1: Record the publications delivered to each subscription into bit vectors
▫ Phase 2: Use information from the bit vectors to allocate subscriptions to brokers using one of 10 algorithms
▫ Phase 3: Construct the broker overlay with 3 optimization techniques and deploy the new configuration
• Most compelling properties
▫ Language independent: content-based (XPath, regex, ranged, SQL, composite subscriptions, etc.) and topic-based, such as GooPS
▫ Works effectively under any workload (defined or undefined)
Phase 1: Subscription Profiling
• Profile of each subscriber per advertisement, maintained at the subscriber’s first broker
• Each profile stores the message ID of the first index (e.g., B34-M213) and a bit vector; a bit set to 1 marks a publication delivered to the subscription
• Example publications delivered to the subscription: B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, B34-M226
• Cardinality of the bit vector corresponds to the bandwidth requirement of the subscription
• Used to compute the “closeness” between any two subscriptions in the clustering algorithm: closeness = |si ∩ sj|
• Fixed size, so shift left if the next publication is out of the bit vector range
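The profile described above can be sketched as a small Python class. This is a minimal illustration under stated assumptions: message IDs are reduced to integer sequence numbers (the slide uses IDs such as B34-M213), the vector is a plain list of 0/1 values, and the class and function names are hypothetical.

```python
class SubscriptionProfile:
    """Fixed-size bit vector of the publications delivered to one subscription."""

    def __init__(self, size=16):
        self.size = size
        self.first_id = None      # message ID stored at index 0
        self.bits = [0] * size

    def record(self, msg_id):
        """Mark publication msg_id as delivered, shifting left if it is out of range."""
        if self.first_id is None:
            self.first_id = msg_id
        index = msg_id - self.first_id
        if index < 0:
            return                # older than the window: ignored in this sketch
        if index >= self.size:
            # Shift left just enough for msg_id to land on the last index.
            shift = index - self.size + 1
            self.bits = (self.bits[shift:] + [0] * shift)[-self.size:]
            self.first_id += shift
            index = self.size - 1
        self.bits[index] = 1

    def cardinality(self):
        """Number of set bits, a proxy for the subscription's bandwidth requirement."""
        return sum(self.bits)

    def delivered_ids(self):
        """Message IDs currently marked as delivered."""
        return {self.first_id + i for i, bit in enumerate(self.bits) if bit}


def closeness(a, b):
    """closeness = |si ∩ sj|: publications delivered to both subscriptions."""
    return len(a.delivered_ids() & b.delivered_ids())


# s1 uses the sequence numbers from the slide; s2 is a hypothetical second subscription.
s1 = SubscriptionProfile()
for m in (213, 215, 216, 217, 220, 222, 225, 226):
    s1.record(m)

s2 = SubscriptionProfile()
for m in (215, 216, 220, 221, 226):
    s2.record(m)

print("".join(map(str, s1.bits)), s1.cardinality(), closeness(s1, s2))
```

Comparing by message ID rather than by raw index keeps the intersection correct when two profiles start at different first IDs.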
Phase 2: Subscription Allocation Algorithms
• MANUAL/(AUTOMATIC)
▫ Tree with fanout of 2, manual (random) placement of clients
• Fastest Broker First (FBF)
▫ Assign subscriptions randomly to the next most powerful broker
• Bin Packing
▫ Like FBF, but assigns the next highest traffic subscription
• PAIRWISE-N, PAIRWISE-K (related approaches in ICDCS’02)
▫ Subscription clustering where the number of clusters is given
• CRAM (Clustering with Resource Awareness and Minimization)
▫ Dynamically determines the number of clusters
▫ Utilizes a new clustering algorithm that is more effective
▫ Evaluated with 4 different subscription closeness metrics, with one derived from Banavar et al. in ICDCS '99
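The Bin Packing entry above can be sketched as follows. The slide only states that the next highest traffic subscription goes to the next most powerful broker, so the first-fit rule, the capacity units, and the `bin_pack` function are assumptions made for this illustration.

```python
def bin_pack(subscriptions, brokers):
    """Allocate subscriptions to brokers, heaviest subscription first.

    subscriptions: dict of subscription name -> traffic (msg/s)
    brokers: dict of broker name -> capacity (msg/s)
    Returns a dict of broker -> list of subscriptions, allocated brokers only.
    First-fit over brokers sorted by capacity is an assumption of this sketch.
    """
    order = sorted(brokers, key=brokers.get, reverse=True)   # most powerful first
    free = dict(brokers)                                     # remaining capacity
    allocation = {b: [] for b in order}

    for sub in sorted(subscriptions, key=subscriptions.get, reverse=True):
        load = subscriptions[sub]
        for b in order:
            if free[b] >= load:
                free[b] -= load
                allocation[b].append(sub)
                break
        else:
            raise ValueError(f"no broker has capacity for {sub}")

    # Brokers that received nothing are left unallocated.
    return {b: subs for b, subs in allocation.items() if subs}


# Hypothetical workload: traffic and capacity in msg/s.
subscriptions = {"s1": 50, "s2": 40, "s3": 30, "s4": 20, "s5": 10}
brokers = {"B1": 100, "B2": 60, "B3": 60}
print(bin_pack(subscriptions, brokers))
```

In this example two of the three brokers suffice, so the third stays unallocated. The sketch looks at traffic only and ignores how much the subscriptions overlap, which is what the clustering-based algorithms in the list take into account.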
Bin Packing
Bin Packing’s Allocation Result
Phase 3: Broker Overlay Construction
Bin Packing’s Final Overlay
[Figure: final overlay with subscriber (S) and publisher (P) clients; the publishers are labeled GRAPE]
Evaluation
• Implemented on the PADRES open source content-based pub/sub project
• Evaluated on a cluster testbed with 80 brokers
• Evaluated on SciNet, an HPC cluster, with 1000 brokers
• Comparison against two related works (Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)
• Homogeneous and heterogeneous scenarios
• Workload saturates the initial deployment (MANUAL)
Evaluation Results on SciNet
• Reduced message rate by up to 92%
• Reduced number of allocated brokers by up to 91%
Summary
• CRAM combines the benefits of
▫ Subscription clustering
▫ Resource awareness from Bin Packing
• It does so by simultaneously reducing both
▫ Broker message rates
▫ Number of allocated brokers
• Bit vectors are powerful
▫ Language independent (XPath, regex, topics)
▫ Effective with any workload distribution
Conclusions
• Load balancing increases
▫ Availability by circumventing overloads
▫ Scalability of the system
• Publisher placement algorithms reduce
▫ Broker input load by up to 68%
▫ Broker message rate by up to 85%
▫ Delivery delay by up to 68%
• Resource allocation algorithms reduce
▫ Average broker message rate by up to 92%
▫ Number of allocated brokers by up to 91%
Future Work
• Self-tuning of load balancing parameters
• React dynamically by growing and shrinking the network in incremental steps
• Improve runtime of the CRAM algorithm by parallelization or reducing its computational complexity
• Model workload with more sophisticated methods, such as stochastic processes, to improve accuracy of load estimation
• Address fault resiliency in each approach
Q & A