Upload
vuongtram
View
218
Download
0
Embed Size (px)
Citation preview
Capturing Network Traffic DynamicsSmall Scales
Stochastic Systems and Modelling in Networking and FinancePart II
Dependable Adaptive Systems and Mathematical ModelingKaiserslautern, August 2006
Rolf Riedi
Dept of Statistics
Rudolf Riedi Rice University stat.rice.edu/~riedi
Model and Physical Reality
PhenomenonPhysical System
StatisticalModel Physical
Model
StochasticModel
Convergence of ON-OFF to fBm
LRDSelf-similarity
Queuing predictionEstimation of LRD
User responsiblefor bursts at large scale
Large scaleswell understood
Rudolf Riedi Rice University stat.rice.edu/~riedi
Failure of classical prediction
• Interarrival times – Not exponential– Not independent
Paxson-Floyd, 1995
Rudolf Riedi Rice University stat.rice.edu/~riedi
Measured Data• Time series (Ak,Zk) collected at gateway of LAN
– k= number of data packet– Ak= arrival time of packet– Bk= size of packet
• Work load until time t:
••Working arrival per m time units
Rudolf Riedi Rice University stat.rice.edu/~riedi
Long Range Dependence
• High variability at large scales caused by correlation
• Auto-covariance function
• LRD– Slowly decaying auto-covariance
• Cox: for ½ < H < 1 (presence of LRD)
– Var-time-plot: simple first diagnostics for LRD
Rudolf Riedi Rice University stat.rice.edu/~riedi
fBm Historic facts• Brown (1820): observes particle motion• Markov (1900+): Markov chains • Einstein (1905): Heat equ for Brownian motion• Wiener (1923): continuous Markov process• Kolmogorov (1930’s): theory of stochastic processes• Kolmogorov (1940): fBm• Levy, Lamperti: H-sssi processes (1962)• Mandelbrot & VanNess: integral representation (1965)
• Adler: fractal path properties of fBm• Samorodnitsky & Taqqu:
self-similar stable motion
Rudolf Riedi Rice University stat.rice.edu/~riedi
Connection-level Analysis and
Modeling of Network Traffic
Rudolf Riedi Rice University stat.rice.edu/~riedi
Aggregate Traffic at small scales• Trace:
– Time stamped headers– Sender-Receiver IP !!
• Large scales– Gaussian– LRD (high variability)
• Small scales– Non-Gaussian– Positive process– Burstiness
Objective :– Origins of bursts
Auckland Gateway (2000)Aggregate Bytes per time
Gaussian: 1%Real traffic: 3%
Kurtosis - Gaussian : 3- Real traffic: 5.8
Mean
99%
0 0.5 1 1.5 2 2.5 3x 105
0
50
100
150
200
250
300
350
400
450histogram
Gaussian: 1%Real traffic: 3%
0 2000 4000 60000
0.5
1
1.5
2
2.5
3x 105
time (1 unit=500ms)nu
mbe
r of b
ytes
Mean
99%
Rudolf Riedi Rice University stat.rice.edu/~riedi
• ON/OFF model – Superposition of sources– Connection level model
• Explains large scale variability: – LRD, Gaussian– Cause: Costumers– Heavy tailed file sizes !!
Bursts in the ON/OFF framework
• Small scale bursts:– Non-Gaussianity– Conspiracy of sources ??– Flash crowds ??
(dramatic increase of active sources)
Rudolf Riedi Rice University stat.rice.edu/~riedi
Non-Gaussianity: A Conspiracy?
• The number of active connections is close to Gaussian; provides no indication of bursts in the load
• Indication for:- No conspiracy of sources- No flash crowds
Load: Bytes per 500 ms
Number of active connections
Mean
99%
Mean
99%
Rudolf Riedi Rice University stat.rice.edu/~riedi
Non-Gaussianity: a case studyTypical bursty arrival(500 ms time slot)
Histogram of load offered in same time bin per connection:One connection dominates
150 Kb
Typical Gaussian arrival(500 ms time slot)
Histogram of load offered in same time bin per connection:Considerable balanced “field” of connections
10 Kb
Rudolf Riedi Rice University stat.rice.edu/~riedi
Non-Gaussianity and Dominance
• Dominant connections correlate with bursts
Circled in Red: Instances where one connections contributes over 50% of load(resolution 500 ms)
Mean
99%
Rudolf Riedi Rice University stat.rice.edu/~riedi
Non-Gaussianity and Dominance
Systematic study: time series separation• For each bin of 500 ms:
remove packets of the ONE strongest connection• Leaves “Gaussian” residual traffic
Overall traffic Residual traffic1 Strongest connection
= +Mean
99%
Rudolf Riedi Rice University stat.rice.edu/~riedi
Separation on Connection Level
Definition:• Alpha connections:
Peak rate > mean arrival rate + 1 std dev
• Beta connections: Residual traffic
• Findings are similar for different time series– Auckland (2000+2001), Berkeley, Bellcore, DEC– 500ms, 50ms, 5ms resolution
Rudolf Riedi Rice University stat.rice.edu/~riedi
Alpha Traffic Component
• There are few Alpha connections – < 1% (AUCK 2000: 427 of 64,087 connections)
– 3% of load
Alpha is extremely bursty
Beta is little bursty
Overall traffic is quite bursty
• Alpha connections cause bursts:
Multifractal spectrum:Wide spectrum means bursty
Balanced (50% alpha) very bursty
Rudolf Riedi Rice University stat.rice.edu/~riedi
Multifractal spectrum: Microscope for Bursts
α=.7 α=.9 α=.8
• Collect points t with same α :
aLarge Deviation type result
Rudolf Riedi Rice University stat.rice.edu/~riedi
Beta Traffic Component
• Constitutes main load• Governs LRD properties of overall traffic• Is Gaussian at sufficient utilization (Kurtosis = 3)
• Is well matched by ON/OFF model
Beta traffic Number of connections = ON/OFF
Variance time plot
Mean
99%
Rudolf Riedi Rice University stat.rice.edu/~riedi
Simple Connection Taxonomy
Careful analysis on connection level shows : this is the onlysystematic reason
Bursts arise from largetransfers over fast links.
RTTrate
bandwidth =But:
Rudolf Riedi Rice University stat.rice.edu/~riedi
Cwnd or RTT?
Correlation coefficient=0.68
RTT has strong influence on bandwidth and dominance.
103
104
105
106
10-1
100
101
102
peak-rate (Bps)
1/R
TT (1
/s)
Correlation coefficient=0.01
103
104
105
10610
2
103
104
105
peak-rate (Bps)
cwnd
(B)
Colorado State University trace, 300,000 packets
cwnd 1/RTT
ratepeak →
Beta Alpha Beta Alpha
Rudolf Riedi Rice University stat.rice.edu/~riedi
Examples of Alpha/Beta Connections
0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 2.2 2.4
-200
0
200
400
600
800
1000
1200
1400
one beta connection
packet arrival time (second)
pack
et s
ize
(byt
es)
forward direction
reverse direction
41.3 41.35 41.4 41.45
-200
0
200
400
600
800
1000
1200
1400
one alpha connection (96078, 196, 80, 59486)
packet arrival time (second)
pack
et s
ize
(byt
es)
forward direction
reverse direction
Alpha connections burst because of short round trip time, not large rate
Notice the different time scales
Rudolf Riedi Rice University stat.rice.edu/~riedi
Modeling Network Traffic
Physical Model• Traffic (user): superposition of ON/OFF
sources requesting files with heavy tailed size• Network: heterogeneous bandwidth
variable sending-rates (fixed per ON/OFF source)
• Explains properties of traffic:–LRD: heavy tailed transfer of beta sources (crowd)
–Bursts: few large transfers of few alpha sources
Mathematical Model•…accommodate this insight within ON-OFF?
Rudolf Riedi Rice University stat.rice.edu/~riedi
Modeling of Alpha Traffic• ON/OFF model revisited:
High variability in connection rates (RTTs)
Low rate = beta High rate = alpha
fractional Gaussian noise Non-Gaussian limit??
+
=
+
+
=
Rudolf Riedi Rice University stat.rice.edu/~riedi
Modeling of Alpha Traffic• ON/OFF model revisited:
High variability in connection rates (RTTs)
Low rate = beta High rate = alpha
fractional Gaussian noise Non-Gaussian limit
0 2000 4000 60000
0.5
1
1.5
2
x 105
time (1 unit=500ms)
num
ber o
f byt
es
0 2000 4000 60000
0.5
1
1.5
2
2.5
3x 105
time (1 unit=500ms)
num
ber o
f byt
es
Rudolf Riedi Rice University stat.rice.edu/~riedi
Towards mathematical models
Renewal reward processes
Rudolf Riedi Rice University stat.rice.edu/~riedi
Stable distributions• Parameters:
• Equivalent definitions:– Stable– Limit of iid sums
• Known special cases:– Gaussian– Cauchy
• Characteristic fct
Rudolf Riedi Rice University stat.rice.edu/~riedi
Different Limits for ON-OFF model
• Recall limitsof ON-OFF sourcesmultiplexed
• Possible limits of renewal reward aggregateWillinger Paxson R Taqqu
Rudolf Riedi Rice University stat.rice.edu/~riedi
Modeling of Alpha Traffic• ON/OFF model revisited:
High variability in connection rates (RTTs)
Low rate = beta High rate = alpha
fractional Gaussian noise stable Levy noise
+
=
+
+
=
Rudolf Riedi Rice University stat.rice.edu/~riedi
Impact: Simulation
• Simulation: ns topology to include alpha links
Simple: equal bandwidth Realistic: heterogeneousend-to-end bandwidth
• Congestion control• Design and management
Rudolf Riedi Rice University stat.rice.edu/~riedi
Inpact: Understanding• LRD
– Large time scales – approx. Gaussian – Client behavior– Bandwidth over Buffer
• Multifractal– Small time scale– Network topology– Control at flow level– Simulation
Structure: Multiplicative AdditiveModel: hybrid tree
Mixture of Gaussian - Stable
packetscheduling
sessionlifetime
networkmanagement
round-triptime
< 1 msec msec-sec minutes hours
Rudolf Riedi Rice University stat.rice.edu/~riedi
Impact: Performance• Beta Traffic rules the small Queues• Alpha Traffic causes the large Queue-sizes
(despite small Window Size)
Alpha connections
Queue-size overlapped with Alpha PeaksTotal
traffic
Rudolf Riedi Rice University stat.rice.edu/~riedi
Self-similar Burst Model• Alpha component = self-similar stable
– (limit of a few ON-OFF sources in the limit of fast time)
• This models heavy-tailed bursts – (heavy tailed files)
• TCP control: alpha CWND arbitrarily large – (short RTT, future TCP mutants)
• Analysis via De-Multiplexing:– Optimal setup of two individual Queues to come closest to
aggregate Queue
De-Multiplexing:Equal critical time-scales
Q-tail ParetoDue to Levy noise
Beta (top) + Alpha
Rudolf Riedi Rice University stat.rice.edu/~riedi
ON-OFF Burst Model• Alpha traffic = High rate ON-OFF source (truncated)• This models bi-modal bandwidth distribution• TCP: bottleneck is at the receiver (flow control
through advertised window)• Current state of measured traffic• Analysis: de-multiplexing and variable rate queue
Beta (top) + Alpha Variable Service Rate Queue-tail Weibull(unaffected) unless
• rate of alpha traffic larger than capacity – average beta arrival • and duration of alpha ON period heavy tailed
Rudolf Riedi Rice University stat.rice.edu/~riedi
On-off parameters
• Free parameters in on-off model?– File size = duration * rate : these variables are dependent– Assuming two of them are independent leads to following models:
• Simulation: same “behavior” for entire traffic results in poor match. Different models (power/patience) for alpha and beta?
Power modelFile size and rate independent
Patience model:File size and duration independent
Real trace Real trace
Simulation using observed size and rate independently
Simulation using observed size and duration independently
Rudolf Riedi Rice University stat.rice.edu/~riedi
Duration (s)
Rat
e (b
ytes
per
sec
)
Duration and Rate (Beta)
0.1 10 1000 10
1k
100k
0
10
20
30
40
50
60
70
80
90
Size (bytes)
Rat
e (b
ytes
per
sec
)
Filesize and Rate (Alpha)
10 1k 100k 10M 10
1k
100k
0
2
4
6
8
10
12
14
16
Size (bytes)
Rat
e (b
ytes
per
sec
)
Filesize and Rate (Beta)
10 1k 100k 10M 10
1k
100k
0
10
20
30
40
50
60
70
80
90
Duration (s)
Rat
e (b
ytes
per
sec
)
Duration and Rate (Alpha)
0.1 10 1000 10
1k
100k
0
2
4
6
8
10
12
Free parameters: statistical analysis
Alpha Beta
Duration- Rate
Size- Rate
X
• Beta users: rate determines file size• Alpha users are “free”
Rudolf Riedi Rice University stat.rice.edu/~riedi
0 1000 2000 3000 4000 5000 6000 70000
1
2
3
4
5
6
7
8
9x 10
5 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
0 1000 2000 3000 4000 5000 6000 70000
1
2
3
4
5
6
7
8
9x 10
5 Bytes per time (Alpha)
Time bin (1 unit = 500ms)
Byt
es
0 1000 2000 3000 4000 5000 6000 70000
1
2
3
4
5
6
7
8
9x 10
5 Bytes per time (Beta)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
2
4
6
8
x 105 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
2
4
6
8
x 105 Bytes per time (Alpha)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
2
4
6
8
x 105 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
Scheme RD:Rate Durationindependent
Total
Scheme SD:Size Durationindependent
Scheme SR: Size Rate
independent
Alpha
Beta
0 2000 4000 60000
0.5
1
1.5
2
2.5
3
x 107 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
1
2
3
4
5
6
x 107 Bytes per time (Beta)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
2
4
6
8
10x 10
6 Bytes per time (Beta)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
2
4
6
8
x 105 Bytes per time (Alpha)
Time bin (1 unit = 500ms)
Byt
es
0 2000 4000 60000
2
4
6
8
x 105 Bytes per time (Beta)
Time bin (1 unit = 500ms)
Byt
es
Real Trace
0 2000 4000 60000
2
4
6
8
x 105 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
Free parameters: SIMULATION
Rudolf Riedi Rice University stat.rice.edu/~riedi
Network-User Driven Traffic model
CONCLUSION: • Rates for alpha drawn to be large, beta drawn to be
small but:– Alpha: power model: Rate independent of Size– Beta: patience factor: Rate independent of Duration
• New limiting results needed for novel ON-OFF settings
0 1000 2000 3000 4000 5000 6000 70000
1
2
3
4
5
6
7
8
9x 10
5 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
0 1000 2000 3000 4000 5000 6000 70000
1
2
3
4
5
6
7
8
9x 10
5 Bytes per time (overall)
Time bin (1 unit = 500ms)
Byt
es
Original trace (Bellcore) Alpha (SR) + Beta (RD)
Rudolf Riedi Rice University stat.rice.edu/~riedi
Model and Physical Reality
PhenomenonPhysical System
StatisticalModel
ON-OFF withasympt. regimes. Renewal Reward.
StochasticModel
Two component model(alpha/beta users)
Self-similar limitsare fBm (multiplexed beta)or Levy stable (fast alpha)
Queuing predictionDetection of alpha users User responsible
for bursts at large scale
Small scalessufficiently understood.
Choice of physical model not clear