Self-Similarity in Network Traffic
Kevin Henkener, 5/29/2002
What is Self-Similarity?
Self-similarity describes the phenomenon where a certain property of an object is preserved with respect to scaling in space and/or time.
If an object is self-similar, its parts, when magnified, resemble the shape of the whole.
Pictorial View of Self-Similarity
The Famous Data
Leland and Wilson collected hundreds of millions of Ethernet packets without loss and with recorded time-stamps accurate to within 100µs.
Data were collected from several Ethernet LANs at the Bellcore Morristown Research and Engineering Center at different times over the course of approximately 4 years.
Why is Self-Similarity Important?
Recently, network packet traffic has been identified as being self-similar.
Current network traffic modeling using the Poisson distribution (etc.) does not take into account the self-similar nature of traffic.
This leads to inaccurate modeling which, when applied to a network as large as the Internet, can lead to large financial losses.
Problems with Current Models
Current modeling predicts that as the number of sources (Ethernet users) increases, the traffic becomes smoother and smoother.
Analysis of the measured traffic shows that it tends to become less smooth and more bursty as the number of active sources increases.
Problems with Current Models Cont’d
Were traffic to follow a Poisson or Markovian arrival process, it would have a characteristic burst length, which would tend to be smoothed by averaging over a long enough time scale.
Rather, measurements of real traffic indicate that significant traffic variance (burstiness) is present on a wide range of time scales.
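The smoothing claim for Poisson traffic can be checked numerically. The following is a minimal sketch, not taken from the original study: it simulates Poisson arrivals (the rate, bin count, and seed are arbitrary assumptions), averages the per-bin counts over blocks of size m, and prints the variance at each scale. For Poisson traffic the variance should fall roughly like 1/m, which is the behavior the measured Ethernet traffic does not show.

```python
import random
import statistics


def poisson_counts(rate, n_bins, rng):
    """Packet counts per unit-length bin for a Poisson arrival process."""
    counts = [0] * n_bins
    t = rng.expovariate(rate)  # exponential interarrival times
    while t < n_bins:
        counts[int(t)] += 1
        t += rng.expovariate(rate)
    return counts


def aggregate(x, m):
    """Average x over non-overlapping blocks of m samples."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance_by_scale(x, scales):
    """Variance of the block-averaged series at each aggregation level."""
    return {m: statistics.pvariance(aggregate(x, m)) for m in scales}


# Assumed parameters, chosen only for illustration.
rng = random.Random(1)
counts = poisson_counts(rate=5.0, n_bins=20000, rng=rng)
variances = variance_by_scale(counts, [1, 10, 100])
for m, v in variances.items():
    print(m, round(v, 4))
```

Each tenfold increase in m should cut the printed variance by roughly a factor of ten. A self-similar trace would decay noticeably more slowly.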
Pictorial View of Current Modeling
Side-by-side View
Definitions and Properties
Long-range Dependence
  covariance decays slowly
Hurst Parameter
  developed by Harold Hurst (1965)
  H is a measure of “burstiness”
  also considered a measure of self-similarity
  0 < H < 1
  H increases as traffic increases
Definitions and Properties Cont’d
Low, medium, and high traffic hours
  as traffic increases, the Hurst parameter increases
  i.e., traffic becomes more self-similar
Self-Similar Measures
Background
Let the time series $X = (X_t : t = 0, 1, 2, \ldots)$ be a covariance stationary stochastic process
autocorrelation function: $r(k)$, $k \ge 0$
assume $r(k) \sim k^{-\beta} L(k)$ as $k \to \infty$, where $0 < \beta < 1$
$L$ is slowly varying at infinity: $\lim_{t \to \infty} L(tx) / L(t) = 1$ for all $x > 0$
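A short worked consequence of the decay assumption above may help. Taking the slowly varying function to be a constant c > 0 (a simplifying assumption made only for this illustration), the autocorrelations are not summable, which is what "long-range dependence" refers to. For contrast, the second line shows a geometrically decaying autocorrelation with an assumed parameter ρ, typical of short-range dependent models, whose sum is finite.

```latex
\sum_{k=1}^{n} r(k) \approx c \sum_{k=1}^{n} k^{-\beta}
  \approx \frac{c}{1-\beta}\, n^{1-\beta} \to \infty
  \quad (n \to \infty,\ 0 < \beta < 1)

\sum_{k=1}^{\infty} \rho^{k} = \frac{\rho}{1-\rho} < \infty
  \quad (0 < \rho < 1)
```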
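Second-order Self-Similar
Aggregated process: $X^{(m)}$ is obtained by averaging $X$ over non-overlapping blocks of size $m$
Exactly
  A process $X$ is called (exactly) self-similar with self-similarity parameter $H = 1 - \beta/2$ if, for all $m = 1, 2, \ldots$:
  $\mathrm{var}(X^{(m)}) = \sigma^2 m^{-\beta}$
  $r^{(m)}(k) = r(k)$, $k \ge 0$
Asymptotically
  $r^{(m)}(k) \to r(k)$ as $m \to \infty$
The aggregated processes have the same correlation structure as the original process
Current model shows aggregated processes tending to pure noise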
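The aggregation step and the variance law can be made concrete with a small sketch. The function names and the toy series below are assumptions for illustration only. The relation beta = 2 - 2H is just H = 1 - β/2 rearranged.

```python
def aggregate(x, m):
    """Aggregated series X^(m): averages of x over non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def beta_from_hurst(hurst):
    """Invert H = 1 - beta/2."""
    return 2.0 * (1.0 - hurst)


def aggregated_variance(sigma2, m, hurst):
    """Variance of X^(m) for an exactly self-similar process: sigma^2 * m^(-beta)."""
    return sigma2 * m ** (-beta_from_hurst(hurst))


x = [3, 5, 4, 8, 1, 3, 6, 2]  # toy series, not measured data
print(aggregate(x, 2))  # [4.0, 6.0, 2.0, 4.0]
print(aggregate(x, 4))  # [5.0, 3.0]

# With H = 0.8 (beta = 0.4), averaging over m = 100 leaves roughly 0.158 of the
# original variance; independent samples would leave 1/100 = 0.01.
print(aggregated_variance(1.0, 100, 0.8))
print(1.0 / 100)
```

The gap between the last two numbers is the sense in which aggregated self-similar traffic stays bursty instead of tending to pure noise.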
Measuring Self-Similarity
time-domain analysis based on the R/S statistic
analysis of the variance of the aggregated processes $X^{(m)}$
periodogram-based analysis in the frequency domain
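Two of the three approaches listed above can be sketched in a few lines; the periodogram method is not shown. This is a simplified illustration, not the procedure used in the original study: the scales, block sizes, and the white-noise test input are assumptions. Both estimators fit a straight line on a log-log plot. For the variance-time method the slope is -β and H = 1 - β/2. For R/S the slope itself estimates H.

```python
import math
import random
import statistics


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def aggregate(x, m):
    """Average x over non-overlapping blocks of m samples."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def hurst_variance_time(x, scales):
    """Variance-time estimate: slope of log var(X^(m)) vs log m is -beta."""
    log_m = [math.log(m) for m in scales]
    log_v = [math.log(statistics.pvariance(aggregate(x, m))) for m in scales]
    beta = -fit_slope(log_m, log_v)
    return 1.0 - beta / 2.0


def rescaled_range(block):
    """R/S statistic of one block; None if the block has zero spread."""
    mean = statistics.fmean(block)
    cum, lo, hi = 0.0, 0.0, 0.0
    for v in block:
        cum += v - mean  # running sum of deviations from the block mean
        lo, hi = min(lo, cum), max(hi, cum)
    s = statistics.pstdev(block)
    return (hi - lo) / s if s > 0 else None


def hurst_rs(x, block_sizes):
    """R/S estimate: slope of log(mean R/S) vs log(block size)."""
    log_n, log_rs = [], []
    for n in block_sizes:
        values = [rescaled_range(x[i:i + n]) for i in range(0, len(x) - n + 1, n)]
        values = [v for v in values if v is not None]
        if values:
            log_n.append(math.log(n))
            log_rs.append(math.log(statistics.fmean(values)))
    return fit_slope(log_n, log_rs)


# White noise has no long-range dependence, so both estimates should sit near 0.5
# (R/S is known to read somewhat high on short blocks).
rng = random.Random(7)
noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
h_vt = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64])
h_rs = hurst_rs(noise, [16, 32, 64, 128, 256, 512])
print(round(h_vt, 3), round(h_rs, 3))
```

On a self-similar trace the same two calls would be expected to return values clearly above 0.5.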
Methods of Modeling Self-Similar Traffic
Two formal mathematical models that yield elegant representations of self-similarity
  fractional Gaussian noise
  fractional autoregressive integrated moving-average processes
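For the first of these models, a minimal sketch is given here under stated assumptions. It uses the standard autocovariance of unit-variance fractional Gaussian noise and draws samples with a Durbin-Levinson (Hosking-style) recursion. That generation method, along with the sample size, seed, and function names, is a choice made for this illustration and is not described in the slides. The recursion costs O(n^2), so it only suits short series.

```python
import math
import random


def fgn_autocov(k, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise at lag k."""
    k = abs(k)
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2.0 * k ** e + abs(k - 1) ** e)


def aggregated_variance(m, hurst):
    """Exact variance of the mean of m consecutive fGn samples."""
    total = sum((m - abs(d)) * fgn_autocov(d, hurst) for d in range(-(m - 1), m))
    return total / (m * m)


def generate_fgn(n, hurst, rng):
    """Draw n fGn samples via the Durbin-Levinson recursion."""
    gamma = [fgn_autocov(k, hurst) for k in range(n)]
    x = [rng.gauss(0.0, math.sqrt(gamma[0]))]
    phi, v = [], gamma[0]  # prediction coefficients and prediction-error variance
    for t in range(1, n):
        refl = (gamma[t] - sum(phi[j] * gamma[t - 1 - j] for j in range(t - 1))) / v
        phi = [phi[j] - refl * phi[t - 2 - j] for j in range(t - 1)] + [refl]
        v *= 1.0 - refl * refl
        mean = sum(phi[j] * x[t - 1 - j] for j in range(t))
        x.append(mean + math.sqrt(v) * rng.gauss(0.0, 1.0))
    return x


def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


# The aggregated variance matches m^(2H-2), i.e. sigma^2 * m^(-beta) with beta = 2 - 2H.
print(aggregated_variance(10, 0.8), 10 ** (2 * 0.8 - 2))

sample = generate_fgn(512, 0.8, random.Random(3))
r1 = lag1_autocorr(sample)
print(round(r1, 3), round(fgn_autocov(1, 0.8), 3))
```

The first print shows that fractional Gaussian noise satisfies the exact self-similarity condition on the variance. The second compares the sample lag-1 correlation with its theoretical value, which the sample version tends to underestimate on short series.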
Results
Ethernet traffic is self-similar irrespective of time
Ethernet traffic is self-similar irrespective of where it is collected
The degree of self-similarity, measured in terms of the Hurst parameter H, is typically a function of the overall utilization of the Ethernet and can be used for measuring the “burstiness” of the traffic
Current traffic models are not capable of capturing the self-similarity property
Results Cont’d
Concentrated periods of congestion are present at a wide range of time scales
This implies the existence of concentrated periods of light network load
These two features cannot be easily controlled by traffic control, i.e., burstiness cannot be smoothed
Results Cont’d
These two implications make it difficult to allocate services such that QoS and network utilization are maximized.
Self-similar burstiness can lead to the amplification of packet loss.
Problems with Packet Loss
Effects in TCP
  TCP guarantees that packets will be delivered and will be delivered in order
  When packets are lost in TCP, the lost packets must be retransmitted
  This wastes valuable resources
Effects in UDP
  UDP sends packets as quickly as possible with no promise of delivery
  When packets are lost, they are not retransmitted
  Repercussions for packet loss in UDP include “jitter” in streaming audio/video etc.
Possible Methods for Dealing with the Self-Similar Property of Traffic
Dynamic Control of Traffic Flow
Structural resource allocation
Dynamic Control of Traffic Flow
Predictive feedback control
  identify the onset of concentrated periods of either high or low traffic activity
  adjust the mode of congestion control appropriately from conservative to aggressive
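One way the two steps above might be wired together is sketched below. Everything in it is hypothetical: the class name, the moving-average window, and the utilization thresholds are assumptions for illustration, and a real predictor would likely exploit the long-range correlation structure and not a plain average. The sketch switches to an aggressive mode once the recent load is clearly low and back to a conservative mode once it is clearly high, with a gap between the thresholds to avoid flapping.

```python
from collections import deque


class ModeSelector:
    """Hypothetical controller: picks a congestion-control mode from recent load."""

    def __init__(self, window=8, high=0.8, low=0.4):
        self.samples = deque(maxlen=window)  # most recent utilization samples
        self.high = high  # assumed threshold for a high-activity period
        self.low = low    # assumed threshold for a low-activity period
        self.mode = "conservative"

    def update(self, utilization):
        """Record one utilization sample in [0, 1] and return the current mode."""
        self.samples.append(utilization)
        level = sum(self.samples) / len(self.samples)
        if level >= self.high:
            self.mode = "conservative"
        elif level <= self.low:
            self.mode = "aggressive"
        # Between the thresholds the previous mode is kept (hysteresis).
        return self.mode


selector = ModeSelector(window=4)
trace = [0.9, 0.9, 0.3, 0.2, 0.1, 0.1, 0.9, 0.95, 0.9, 0.9]  # made-up load trace
modes = [selector.update(u) for u in trace]
print(modes)
```

On this trace the selector stays conservative for the first four samples, turns aggressive once the averaged load drops, and returns to conservative only after the load has been high for a full window.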
Dynamic Control of Traffic Flow Cont’d
Adaptive forward error correction
  retransmission of lost information is not viable because of time constraints (real-time)
  adjust the degree of redundancy based on the network state
  increase level of redundancy when traffic is high
    could backfire, as too much of an increase will only further aggravate congestion
  decrease level of redundancy when traffic is low
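The redundancy rule above, including the risk of backfiring, can be illustrated with a small sketch. The linear rule and the numeric values for base, gain, and cap are assumptions invented for this example. The cap is there so that added redundancy cannot grow without bound as the load rises.

```python
def redundancy_level(utilization, base=0.05, gain=0.30, cap=0.25):
    """Fraction of extra (parity) packets to send for a load in [0, 1].

    Hypothetical rule: redundancy grows linearly with load but is capped,
    so that the extra packets do not themselves aggravate congestion.
    """
    utilization = min(max(utilization, 0.0), 1.0)
    return min(base + gain * utilization, cap)


def offered_load(data_rate, utilization):
    """Total sending rate once redundancy is added to the data rate."""
    return data_rate * (1.0 + redundancy_level(utilization))


for u in (0.0, 0.3, 0.6, 1.0):
    print(u, round(redundancy_level(u), 3), round(offered_load(100.0, u), 1))
```

Below the cap, redundancy rises with traffic. At high load it levels off at 25 percent, which bounds the extra traffic the sender can inject.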
Structural Resource Allocation
Two types:
  bandwidth
  buffer size
Bandwidth
  increase bandwidth to accommodate periods of “burstiness”
  could be wasteful in times of low traffic intensity
Structural Resource Allocation Cont’d
Buffer size
  increase the buffer size in routers (and similar devices) so that they can absorb periods of “burstiness”
  still possible to fill a given router’s buffer and create a bottleneck
Tradeoff
  increase both until they complement each other and begin curtailing the effects of self-similarity
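The bandwidth and buffer trade-off can be explored with a toy discrete-time queue. This is a sketch under simplifying assumptions: arrivals are added first in each time slot, anything above the buffer size is dropped, and then up to service_rate packets are sent. The arrival patterns are made up, with the smooth and bursty traces carrying the same average load.

```python
def simulate_queue(arrivals, service_rate, buffer_size):
    """Return the number of packets dropped by a finite-buffer queue."""
    queue, dropped = 0, 0
    for a in arrivals:
        queue += a
        if queue > buffer_size:
            dropped += queue - buffer_size  # overflow is lost
            queue = buffer_size
        queue = max(queue - service_rate, 0)  # serve up to service_rate packets
    return dropped


smooth = [2] * 100                               # 2 packets every slot
bursty = [8, 8, 4, 0, 0, 0, 0, 0, 0, 0] * 10     # same mean, concentrated bursts

print(simulate_queue(smooth, service_rate=2, buffer_size=10))  # 0
print(simulate_queue(bursty, service_rate=2, buffer_size=10))  # 60
print(simulate_queue(bursty, service_rate=4, buffer_size=10))  # 20
print(simulate_queue(bursty, service_rate=2, buffer_size=16))  # 0
```

With the same average load, only the bursty trace loses packets. In this toy setting doubling the bandwidth reduces the loss, while a larger buffer removes it. The larger buffer does so at the cost of longer queueing delay, which the model does not measure.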