Electrical & Computer Engineering
Team 1: Box Office
17-654: Analysis of Software Artifacts
18-846: Dependability Analysis of Middleware
JunSuk Oh, YounBok Lee, KwangChun Lee, SoYoung Kim, JungHee Jo
Team Members
JunSuk Oh, YounBok Lee, KwangChun Lee, SoYoung Kim, JungHee Jo
http://www.ece.cmu.edu/~ece846/team1/index.html
Baseline Application
• System description
  – Box Office is a system that lets users search for movies and reserve tickets
• Base Features
  – A user can log in
  – A user can search for movies
  – A user can reserve tickets
• Configuration
  – Operating System
    • Server: Windows 2000 Server, Windows XP Professional
    • Client: Windows XP Professional
  – Language
    • Java SDK 1.4.2
  – Middleware
    • Enterprise JavaBeans (EJB)
  – Third-party Software
    • Database: MySQL
    • Web Application Server: JBoss
    • Java IDE: Eclipse, NetBeans
    • J2EE Eclipse Plug-in: Lomboz
Baseline Application - Configuration Selection Criteria
• Operating System
  – Easier to set up the development environment than on the Linux cluster
  – Easier for the team to administer on its own
• JBoss
  – Environment is supported by the teaching assistants
• EJB
  – Popular technology in industry; team members’ preference
• MySQL
  – Easy to install and use
  – Development documentation is easy to obtain
• Eclipse
  – All team members have experience with it
• Lomboz
  – Enables Java developers to build, test and deploy J2EE applications
Baseline Architecture
[Figure: three-tier baseline architecture. Client Tier: Client. Middle Tier: JNDI, Session Bean (session), Entity Beans (cardinfo, login, movie, reserv, user). DB Tier: Database with Pool and DB Tables. Labeled interactions: JNDI Lookup, RPC, DB Access.]
Fault-Tolerance Goals
• Replication Style
  – Passive Replication
• Approach
  – Replication
    • 2 replicas are located on separate machines
  – Sacred Components
    • Replication manager
    • Database
    • Client
  – Fault Detector
    • Client
  – State
    • All beans are stateless
    • State is stored in the database
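FT-Baseline Architecture
[Figure: fault-tolerant baseline architecture. Clients 1 to n (client side, sacred) use the Primary Replica and the Backup Replica, which run on Machine 1 and Machine 2, each with its own JNDI service and Factory. Also shown: Replication Manager and Database. Legend: Sacred, Fault Tolerant.]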
Mechanisms for Fail-Over (1)
• Fault Injector
  – Periodically (every 1 min), the fault injector kills the replicas in turn
• Replication manager
  – 10 seconds after a server fails, the Replication Manager invokes the factory to relaunch the failed replica
• Fail-over mechanism
  – Fault detection
  – Replica location
  – Connection establishment
  – Retry
[Figure: Fail-Over Mechanism. Components: Client, Primary Replica, Backup Replica, a Factory for each replica, Replication Manager, Fault Injector. Steps: 1. Request; 2. Server Failed; 3. Connection established; 4. Retry. Other labels: Replicate, Inject fault.]
Mechanisms for Fail-Over (2)
• Fault Detection
  – Exception handling by the client
    • RemoteException: NoSuchObjectException, ConnectException (RMI)
    • NameNotFoundException (JNDI failure)
• Replica location
  – The client knows in advance which servers it can request service from
• Connection establishment
  – Get a connection to the new replica
  – The server reference must be looked up:
    • When the client requests the service for the first time
    • When the client detects a server failure and sends the request to the other server
  – The client retries the request against the backup replica until service becomes available
• Retry
  – Request the service again
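The four steps above (fault detection, replica location, connection establishment, retry) can be sketched on the client side roughly as follows. This is a minimal illustration and not the team's code: `TicketService`, `ReplicaLocator`, `FailoverClient` and the replica names are hypothetical, the JNDI lookup is hidden behind an interface so the example runs without a server, and it uses generics and lambdas that the Java SDK 1.4.2 named in the slides does not have.

```java
import java.rmi.RemoteException;
import java.util.Arrays;
import java.util.List;
import javax.naming.NameNotFoundException;
import javax.naming.NamingException;

/** Hypothetical remote interface standing in for the Box Office session bean. */
interface TicketService {
    String reserve(String movie) throws RemoteException;
}

/** Hypothetical wrapper around the JNDI lookup for one replica. */
interface ReplicaLocator {
    TicketService lookup(String replicaName) throws NamingException;
}

class FailoverClient {
    private final List<String> replicas;  // replica location: known to the client in advance
    private final ReplicaLocator locator;
    private int current = 0;              // index of the replica in use
    private TicketService service;        // null means "look the reference up again"

    FailoverClient(List<String> replicas, ReplicaLocator locator) {
        this.replicas = replicas;
        this.locator = locator;
    }

    String reserve(String movie, int maxAttempts) {
        Exception lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                if (service == null) {
                    // Connection establishment: first request, or first request after a failure.
                    service = locator.lookup(replicas.get(current));
                }
                return service.reserve(movie);  // initial request, or retry after fail-over
            } catch (RemoteException | NamingException e) {
                // Fault detection: RemoteException covers NoSuchObjectException and
                // ConnectException; NamingException covers NameNotFoundException.
                lastFailure = e;
                service = null;
                current = (current + 1) % replicas.size();  // switch to the other replica
            }
        }
        throw new IllegalStateException("no replica available", lastFailure);
    }
}

class FailoverDemo {
    public static void main(String[] args) {
        // Simulated environment: replica1 is down (JNDI failure), replica2 answers.
        ReplicaLocator locator = name -> {
            if (name.equals("replica1")) {
                throw new NameNotFoundException(name);
            }
            return movie -> "reserved:" + movie;
        };
        FailoverClient client =
                new FailoverClient(Arrays.asList("replica1", "replica2"), locator);
        System.out.println(client.reserve("Some Movie", 3));
    }
}
```

The sketch bounds the number of attempts, whereas the slide describes retrying until service becomes available; an unbounded loop would match the slide more literally at the cost of never returning when both replicas stay down.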
Failover Mechanism (3) - Avoid Duplicate Transaction
• Target case
  – The transaction is stored in the DB, but the client is never informed of the result
• Mechanism
[Figure: normal sequence between Client, Server and Database. 1. Service request; 2. Store to DB; 3. Return result; 4. Inform client.]
[Figure: duplicate detection between Client, Replica 1, Replica 2 and Database. Via Replica 1: 1. Request OP #15; 2. Store OP #15; 3. Success; 4. Inform client. Via Replica 2: 5. Retry OP #15; 6. Check Trx state; 7. Duplicate; 8. Duplicate.]
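The duplicate check in the sequence above can be illustrated with a small sketch. It assumes that the client tags each operation with an id and reuses that id on retry (as OP #15 in the figure suggests) and that the transaction state lives in a table shared by both replicas. `TransactionTable` and `ReservationReplica` are hypothetical names, and an in-memory map stands in for the MySQL table; the slides do not show how the real check is implemented.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the shared database table that records completed operations. */
class TransactionTable {
    private final Map<Integer, String> completed = new HashMap<>();

    /** Stores the operation unless its id is already present; returns true if it was stored. */
    synchronized boolean storeIfAbsent(int opId, String payload) {
        if (completed.containsKey(opId)) {
            return false;
        }
        completed.put(opId, payload);
        return true;
    }
}

/** A replica keeps no state of its own; everything lives in the shared table. */
class ReservationReplica {
    private final TransactionTable table;

    ReservationReplica(TransactionTable table) {
        this.table = table;
    }

    /** Returns "Success" the first time an operation id is seen and "Duplicate" on a retry. */
    String reserve(int opId, String seat) {
        return table.storeIfAbsent(opId, seat) ? "Success" : "Duplicate";
    }
}

class DuplicateDemo {
    public static void main(String[] args) {
        TransactionTable table = new TransactionTable();
        ReservationReplica replica1 = new ReservationReplica(table);
        ReservationReplica replica2 = new ReservationReplica(table);

        // Replica 1 stores OP #15 but fails before the client hears back.
        System.out.println(replica1.reserve(15, "A1"));
        // The client retries OP #15 on replica 2, which finds it already stored.
        System.out.println(replica2.reserve(15, "A1"));
    }
}
```

Because the check and the store happen in one synchronized step, two concurrent retries of the same operation cannot both be stored; in a real database the same effect would typically come from a uniqueness constraint on the operation id.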
Fail-Over Measurements
[Figure: Round Trip Time in Failover (14 Fault Injections). X axis: # of Invocations. Y axis: RTT (ms), logarithmic scale from 10 to 10000.]
  – High Peak: RemoteException
  – Low Peak: NameNotFoundException
Fail-Over Measurements
FD: Fault Detection, CE: Connection Establishment
Decomposition of RTT in Failover (Low Peaks)
  FD      16 ms     7%
  CE      82 ms     38%
  Retry   116 ms    55%
Decomposition of RTT in Failover (High Peaks)
  FD      7661 ms   97%
  CE      93 ms     1%
  Retry   123 ms    2%
RT-FT-Baseline Architecture
• Two steps to the optimization
  – Step 1: Reduce the connection establishment time
    • The client needs to reconnect to an available replica after fault detection
    • Pre-established connection: a Connector on the client side maintains a connection to each replica in the background
    ► Reconnection time disappears, but the graph still shows spikes due to the time spent catching the connection exception
  – Step 2: Reduce the fault detection time
    • Reduce the time spent catching exceptions
      – RemoteException: NoSuchObjectException, ConnectException
    • Place a fault detector on the client side
    • The fault detector updates the status of the replicas periodically
    • Clients know the status of the replicas beforehand
    ► Removes the fault detection time as well as the spikes
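A client-side fault detector of the kind described in step 2 might look like the sketch below. It covers only the periodically refreshed status table; the pre-established connections of step 1 are left out. `LocalFaultDetector`, the `ping` probe and the method names are hypothetical, since the slides do not say how the team's detector probes a replica, and the code uses modern `java.util.concurrent` classes rather than Java SDK 1.4.2.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

class LocalFaultDetector {
    private final List<String> replicas;
    private final Predicate<String> ping;  // hypothetical liveness probe, e.g. a cheap remote call
    private final Map<String, Boolean> status = new ConcurrentHashMap<>();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "local-fd");
                t.setDaemon(true);  // do not keep the JVM alive just for pinging
                return t;
            });

    LocalFaultDetector(List<String> replicas, Predicate<String> ping) {
        this.replicas = replicas;
        this.ping = ping;
    }

    /** Pings every replica once and records whether it answered. */
    void refresh() {
        for (String replica : replicas) {
            boolean alive;
            try {
                alive = ping.test(replica);
            } catch (RuntimeException e) {
                alive = false;  // a failing probe counts as a dead replica
            }
            status.put(replica, alive);
        }
    }

    /** Starts refreshing the status table in the background at a fixed period. */
    void start(long periodMillis) {
        timer.scheduleAtFixedRate(this::refresh, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        timer.shutdownNow();
    }

    /** Returns the first replica currently believed alive, or null if none is. */
    String firstAlive() {
        for (String replica : replicas) {
            if (Boolean.TRUE.equals(status.get(replica))) {
                return replica;
            }
        }
        return null;
    }
}
```

A client would call `start(...)` once and then consult `firstAlive()` before each request, so that a dead replica is skipped without waiting for a `RemoteException`. The ping period bounds how stale the status can be, so a request issued just after a crash can still hit the failed replica.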
RT-FT-Baseline Architecture
[Figure: the Client contains a Connector and a Local FD (fault detector) that track statusServer1 and statusServer2 for Replica 1 and Replica 2. Legend: pinging to check status; establishing connections in the background. Other labels: update, checking.]
Bounded “Real-Time” Fail-Over Measurements
• Fail-over graphs after optimization step 1
• Fail-over graphs after optimization step 2
Analysis on Fail-over Optimization
FD: Fault Detection, CE: Connection Establishment
[Figures: Failover Optimization (Low Peaks) and Failover Optimization (High Peaks), pie charts of the RTT decomposition with the reduced parts highlighted.]
                      Low Peaks                     High Peaks
                      FD       CE       Retry       FD        CE       Retry
Before optimization   16 ms    82 ms    116 ms      7661 ms   93 ms    123 ms
After optimization    0 ms     0 ms     104 ms      0 ms      0 ms     104 ms
Reduction             100%     100%     10.34%      100%      100%     15.45%
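As a check on the Reduction row, the Retry percentages follow directly from the before and after times in the table, and summing the three components gives the total fail-over overhead per peak type:

```latex
\[
\text{Reduction} = \frac{t_{\text{before}} - t_{\text{after}}}{t_{\text{before}}} \times 100\%
\]
\[
\text{Retry, low peaks: } \frac{116 - 104}{116} \approx 10.34\%, \qquad
\text{Retry, high peaks: } \frac{123 - 104}{123} \approx 15.45\%
\]
\[
\text{Total, low peaks: } 16 + 82 + 116 = 214\ \text{ms} \rightarrow 104\ \text{ms}
\quad (\approx 51.4\%\ \text{reduction})
\]
\[
\text{Total, high peaks: } 7661 + 93 + 123 = 7877\ \text{ms} \rightarrow 104\ \text{ms}
\quad (\approx 98.7\%\ \text{reduction})
\]
```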
High Performance: Load Balancing
• Distribute clients’ requests among multiple servers
• Use a separate load balancer to control access to the servers
• Strategy
  – Static load balancing
    • Round robin
    • Assign servers in turn
  – Dynamic load balancing
    • The load balancer periodically checks the current number of clients on each server
    • Dynamically assign a server to each client
  – Simulation strategy
    • Measure RTT on the actual servers A and B
    • Move to the simulation environment
    • Find a working load balancing strategy
    • Confirm the load balancing strategy in the actual environment
    • Find alternative load balancing strategies
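The static and dynamic strategies listed above can be contrasted in a short sketch. The interface and class names (`LoadBalancer`, `RoundRobinBalancer`, `LeastClientsBalancer`) and the `report` method are hypothetical; in particular, how the real load balancer obtains client counts from the servers is not described in the slides, so here the counts are simply pushed in by the caller.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

interface LoadBalancer {
    String chooseServer();
}

/** Static strategy: hand out servers in turn. */
class RoundRobinBalancer implements LoadBalancer {
    private final List<String> servers;
    private int next = 0;

    RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    public synchronized String chooseServer() {
        String server = servers.get(next);
        next = (next + 1) % servers.size();
        return server;
    }
}

/** Dynamic strategy: pick the server that last reported the fewest clients. */
class LeastClientsBalancer implements LoadBalancer {
    private final Map<String, Integer> clientCounts = new LinkedHashMap<>();

    /** Called by the periodic check with a server's current client count. */
    synchronized void report(String server, int clients) {
        clientCounts.put(server, clients);
    }

    public synchronized String chooseServer() {
        String best = null;
        int fewest = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : clientCounts.entrySet()) {
            if (entry.getValue() < fewest) {
                fewest = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;  // null if no server has reported yet
    }
}
```

With counts of two clients on server A and ten on server B, `LeastClientsBalancer` returns A. Round robin needs no feedback from the servers, which makes it the simpler choice when the servers have similar capacity.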
Load Balancing Strategy
[Figure: Strategy 1 (Round Robin). Clients 1 to N ask the Load Balancer, which sits in front of Replica A and Replica B. 1. Which server? 2. Server A. 3. Connect. 4. Which server? 5. Server B. 6. Connect.]
[Figure: Strategy 2 (Check for # of clients). 1. Which server? 2. How many clients? 3. Two / Ten (replies from the two replicas). 4. Server A. 5. Connect.]
Performance Measurements
[Figure: Load Balance Test, RTT of Client 1. X axis: # of Clients, 0 to 50. Y axis: RTT (ms), 0 to 1600. Series: Single Server, Load Balance 1 (Round Robin), Load Balance 2.]
Load Balancing Strategy
• Develop the load balancing strategy using historical data and a simulation system
• Test the load balancing strategy in the simulation environment
• Predict the performance of each load balancing strategy
[Figure: simulation workflow. Data Collection: for each of Server A and Server B, RTT samples Sample k,1, Sample k,2, ..., Sample k,n are recorded for Client k, k = 1 to 50. Load Balancing Algorithm Development: Min Max Load Balancing Algorithm, Round Robin Algorithm. Algorithm Performance Prediction: plots titled Random Load Balancing and Min Max Load Balancing (x axis: Client #).]
More on Strategy
[Figure: simulation procedure for evaluating an allocation.
 Data: for each of Server A and Server B, RTT samples Sample k,1, Sample k,2, ..., Sample k,n for Client k, k = 1 to 50.
 1. Consider X clients.
 2. Min Max Algorithm: allocate Y clients to Server A and X-Y clients to Server B.
 3. Draw random samples from Server A and from Server B.
 4. Average the RTT of the Y clients and of the X-Y clients.
 5. Repeat 1000 times and report the Average RTT.
 Plot: Comparison of Load Balancing Strategy. X axis: Client #, 10 to 50. Y axis: Average RTT, 200 to 800.]
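The procedure on this slide (allocate Y of X clients to Server A and the rest to Server B, draw random RTT samples for each server, average, repeat 1000 times) amounts to a Monte Carlo estimate. The sketch below rests on two loudly stated assumptions: that Sample k,j is the j-th RTT measured on a server while it serves k clients, and that the best split is the one with the lowest estimated average RTT. The slides do not define the Min Max rule, so `bestSplit` is a stand-in and not the team's algorithm, and `AllocationSimulator` is a hypothetical name.

```java
import java.util.Random;

class AllocationSimulator {
    private final double[][] samplesA;  // samplesA[k - 1]: RTT samples (ms) on Server A with k clients
    private final double[][] samplesB;  // same layout for Server B
    private final Random rng;

    AllocationSimulator(double[][] samplesA, double[][] samplesB, long seed) {
        this.samplesA = samplesA;
        this.samplesB = samplesB;
        this.rng = new Random(seed);
    }

    /** Draws one recorded RTT for a server carrying the given number of clients. */
    private double draw(double[][] samples, int clients) {
        if (clients == 0) {
            return 0.0;  // an idle server contributes nothing to the average
        }
        double[] pool = samples[clients - 1];
        return pool[rng.nextInt(pool.length)];
    }

    /** Estimates the mean per-client RTT when y of x clients go to Server A. */
    double estimateAverageRtt(int x, int y, int repetitions) {
        double total = 0.0;
        for (int i = 0; i < repetitions; i++) {
            double rttA = draw(samplesA, y);
            double rttB = draw(samplesB, x - y);
            total += (y * rttA + (x - y) * rttB) / x;
        }
        return total / repetitions;
    }

    /** Tries every split with recorded data and returns the y with the lowest estimate. */
    int bestSplit(int x, int repetitions) {
        int bestY = -1;
        double bestRtt = Double.MAX_VALUE;
        for (int y = 0; y <= x; y++) {
            if (y > samplesA.length || x - y > samplesB.length) {
                continue;  // no measurements recorded for this load level
            }
            double rtt = estimateAverageRtt(x, y, repetitions);
            if (rtt < bestRtt) {
                bestRtt = rtt;
                bestY = y;
            }
        }
        return bestY;
    }
}
```

For two identical servers whose RTT grows with load, the lowest estimate comes from an even split, which is consistent with the later observation that round robin suits servers of similar capacity.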
Server A & B Performance Measurements (RTT)
[Figure: two plots, Server A and Server B. X axis: Client #, 1 to 49. Y axis: Server RTT [ms], 0 to 2000.]
Performance Measurements (II)
[Figure: three rows of plots, labeled Random Load Balancing, Min Max Load Balancing and LP Load Balancing. Panels: Server A and Server B RTT [ms] (0 to 2000) against Client #, and Comparison of Load Balancing Strategy, Average RTT against Client # (10 to 50).]
Other Features
[Figure: load balancer update cycle. Experimental Data from Server (Server A, Server B); Algorithm Testing with Empirical Data & Parameter Updates; Load Balancer Intelligence Update; Clients connect to Server A and Server B. Inset plots: Random Load Balancing, Min Max Load Balancing (x axis: Client #).]
Insights from Measurements
• FT
  – Two different types of peak were measured, corresponding to different exceptions
• RT-FT
  – Connection establishment time was removed
    • Connections are pre-established before the fail-over
    • But the high peaks still remained
  – Fault detection time was removed
    • A watchdog detects the fault before an exception is caught
• RT-FT Performance
  – Round Robin works well in our situation
    • The servers have similar capacity
  – The load balancing algorithm can be selected according to the running environment
• Test Environment
  – Keep the environment clean to reduce jitter
What we learned & accomplished
• What we learned
  – How to handle JBoss
    • First experience for the majority of team members
  – Careful analysis of the test results definitely saves time
  – How to control the factors to get better data
• What we accomplished
  – FT
    • Passive replication strategy
    • Avoid duplicate transactions
  – RT-FT
    • Pre-established connection strategy
    • Local Fault Detector for checking the status of the servers beforehand
  – Performance
    • Implemented static load balancing
    • Implemented dynamic load balancing
    • Simulated several load balancing strategies
Open Issues & Future Challenges
• Open Issues
  – FindAll() doesn’t work on JBoss on Linux
    • It works well on Windows
  – Implementing several load balancing strategies
    • Min Max, LP (Linear Programming) algorithm
• Future Challenges
  – Separate JNDI service
  – Get the server list from the Replication Manager dynamically
  – Try Active Replication
  – Try development without an IDE