Performance Engineering of the WWW: Application to Dimensioning and Caching
Jean-Chrysostome Bolot, Philipp Hoschka (INRIA)
Main Ideas
Focuses on applying analytic techniques to study performance issues of the WWW: dimensioning and caching
Main contributions:
Show the use of time series analysis to model Web traffic and forecast Web server loads
Show that cache replacement algorithms other than LRU and size-based should be used
Dimensioning and caching what?
WWW = distributed data service
WWW quality = data access quality
Access time = users’ utility function
Decompose access time into:
Time the request takes to reach the server
Time the server takes to process the request
Time the reply takes to get to the client
Time the client takes to process the reply
Use of Time Series for Web Dimensioning
A time series is a sequence of observations of a random process taken sequentially in time
Data can be measured in an active or passive manner
An intrinsic feature of a time series is that typically there are dependencies among adjacent observations
Time series analysis is a large body of principles designed to study such dependencies
To study a time series we need to:
identify a model
fit the model
validate the model
Time Series Modeling
Let’s say we measure round-trip times for packets
Constructing a time series model involves expressing rtt_n in terms of previous observations rtt_{n-i} and error (noise) terms e_n
Noise processes are assumed to be uncorrelated with mean 0 and finite variance (simplest models)
For n > 0: rtt_n = f(rtt_{n-1}, rtt_{n-2}, ..., e_n, e_{n-1}, ...)
Most common models in the literature are linear models, the best known being the autoregressive (AR), the moving average (MA), and the autoregressive moving average (ARMA) models
AR Models
In this model, the current value of the process is expressed as a finite, linear aggregate of previous values of the process and an error term e_t
Ex: z_t = value of a process at time t; for round-trip times, an AR(p) model is:
rtt_t = a_1*rtt_{t-1} + a_2*rtt_{t-2} + ... + a_p*rtt_{t-p} + e_t
Why autoregressive? Because the model is essentially a regression of rtt_t on previous values of itself
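As an illustration (not from the paper), here is a minimal Python sketch that simulates an AR(2) round-trip-time series and recovers its coefficients by ordinary least squares; the coefficient values and series length are arbitrary assumptions:

```python
# Minimal AR(2) sketch: simulate, then re-estimate the coefficients.
# The true coefficients (0.6, 0.3) are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
a1, a2 = 0.6, 0.3                 # assumed AR(2) coefficients
n = 2000
noise = rng.normal(0.0, 1.0, n)   # uncorrelated, zero mean, finite variance
rtt = np.zeros(n)
for t in range(2, n):
    rtt[t] = a1 * rtt[t - 1] + a2 * rtt[t - 2] + noise[t]

# Regress rtt_t on its two previous values (least squares)
X = np.column_stack([rtt[1:-1], rtt[:-2]])  # lag-1 and lag-2 columns
y = rtt[2:]
a_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated coefficients:", a_hat)     # close to [0.6, 0.3]
```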
MA Models
In these models, z_t is expressed as a linear combination of a finite number q of previous error terms (the e’s):
rtt_t = b_0*e_t + b_1*e_{t-1} + b_2*e_{t-2} + ... + b_q*e_{t-q}
It is called an MA process of order q
The weights need not add to 1 or be positive
Mixed AR-MA Models
In order to get more flexibility in fitting actual time series, it is sometimes convenient to include both AR and MA terms in the same model:
rtt_t = a_1*rtt_{t-1} + a_2*rtt_{t-2} + ... + a_p*rtt_{t-p} + e_t - b_1*e_{t-1} - ... - b_q*e_{t-q}
The model uses p + q + 2 unknown parameters
ARMA models have been widely used for video traffic, call requests in telephone networks, and memory references in software systems
Not widely used in computer networking
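A hedged sketch of how an ARMA model might be fit in practice today (not something the paper does): statsmodels treats an ARMA(p,q) as an ARIMA(p,0,q); the trace file name here is a placeholder:

```python
# Fit an ARMA(2,1) model to a measured RTT series with statsmodels.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rtt = np.loadtxt("rtt_trace.txt")    # hypothetical one-column trace file
model = ARIMA(rtt, order=(2, 0, 1))  # p=2 AR terms, d=0, q=1 MA term
result = model.fit()
print(result.params)  # estimated AR/MA coefficients and noise variance
```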
ARIMA Models
ARMA model fitting assumes that the underlying stochastic process is stationary
Recall: a stationary process is one that remains in equilibrium about a constant mean level
Many “real-life” time series are nonstationary
Nonstationary processes do not oscillate around a specific mean
However, they can exhibit homogeneous behavior
ARIMA Models
The most common approach to dealing with them is to use two models: one for the nonstationary part and one for the stationary residual part
Nonstationary time series can be modeled with an integrated model such as the ARIMA model
An ARIMA model of order (p,d,q) is an ARMA model of order (p,q) applied to a series that has been differenced d times
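A toy illustration of what “differenced d times” means (the data are made up):

```python
# Differencing removes polynomial trends: a quadratic trend becomes
# stationary after differencing twice (d = 2).
import numpy as np

series = np.array([1.0, 3.0, 6.0, 10.0, 15.0])  # toy nonstationary series
print(np.diff(series, n=1))  # [2. 3. 4. 5.] -- still trending
print(np.diff(series, n=2))  # [1. 1. 1.]    -- constant after d = 2
```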
Seasonal ARIMA Models
For other nonstationary series, “plain” ARIMA models cannot be used
The most common of such series are those with seasonal trends
Ex: the data for a particular hour in a month-long trace is typically correlated with the hours preceding it as well as with the same hour in preceding days
We can deal with them using seasonal ARIMA models, referred to as ARIMA(p,d,q)x(P,D,Q)_s
Idea: combine two models, one for the entire time series and another only for data points that are s units apart
After Selecting the model...
Once a model has been selected, we have three steps:
Identification: select the values of p and q
Estimation: estimate the values of the coefficients a_1, ..., a_p and b_0, ..., b_q
Diagnostic checking: evaluate how well the fitted model describes the data
Now we have a model; use it to forecast (predict) future values of the process
Application to Web Analysis
Use these models to analyze several data sets from their Web servers:
Number of requests handled
Size of the replies (MBytes)
Want to study variations at the hour granularity (consider averages over a month-long interval)
Main point: there are strong seasonal variations (daily cycles)
Also, observe a trend reflecting the growing number of requests handled by the server over the past year
Used a seasonal ARIMA(2,2,1)x(3,2,0)_24 model
Using the Model to Predict
Want to forecast the number of requests received by the server (important for dimensioning the server)
Forecasting problem: given a series of values observed up to time n, predict the value of the process at some specific time in the future, minimizing some prediction error
Found that ARIMA-based forecasting provides reasonably accurate short- and medium-term predictions
Accuracy for medium- and long-term predictions is limited by the amount of available trace data
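As a hedged sketch of how this could be reproduced today, the seasonal ARIMA(2,2,1)x(3,2,0)_24 model from the previous slide can be fit and used for hour-by-hour forecasts with statsmodels; the trace file name is a placeholder:

```python
# Fit the seasonal ARIMA(2,2,1)x(3,2,0)_24 model and forecast one day ahead.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

requests = np.loadtxt("hourly_requests.txt")  # hypothetical hourly counts
model = SARIMAX(requests,
                order=(2, 2, 1),               # (p, d, q)
                seasonal_order=(3, 2, 0, 24))  # (P, D, Q, s), s = 24 hours
result = model.fit(disp=False)
print(result.forecast(steps=24))  # predicted requests for the next 24 hours
```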
Efficient Cache Replacement Algorithms for Web Proxies
Avoid overloads? => control the amount of data pumped into the network => minimize distant requests => caching
Proxy caching is good iff clients exhibit enough temporal locality in accessing documents
Also, small files are requested more often
Good replacement algorithms are needed
Typically used: LRU and size-based algorithms
Caching Algorithms
Cache algorithms are compared in terms of:
Miss ratio
Normalized resolution time
The lower the miss ratio, the lower the amount of data going through the network
Users don’t care about miss ratios: they want low response times
Quantify the quality of an algorithm in terms of the normalized resolution time T: the ratio of the average resolution time with and without the cache
Cache Algorithms
Let p: miss probability, Tc: average time to access a cache entry, Tnc: average time to access a document not in the cache
Then: T = (Tc + p*Tnc) / Tnc
Assuming Tnc >> Tc, T ~ p
T is minimized when p is minimized
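A quick numeric check with illustrative values (1 ms cache access, 500 ms remote fetch, 30% miss ratio):

```python
# T = (Tc + p*Tnc) / Tnc; with Tnc >> Tc the first term is negligible.
Tc, Tnc, p = 0.001, 0.5, 0.3
T = (Tc + p * Tnc) / Tnc
print(T)  # 0.302, essentially p
```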
Cache Algorithms
The above statement seems to argue for large caches
However:
cache size is limited
the miss ratio is related to the size of the documents stored in the cache
For a given cache size, the number of cached documents, and hence the hit ratio, decreases as the document size increases
Small files are more often requested
These observations have led to algorithms that take into account not only temporal locality but also document sizes
Cache Algorithms
Surprisingly, no cache algorithm takes as an input parameter the time it took to retrieve a given document
A Web cache replacement algorithm should take into account the retrieval time associated with each document in the cache
One way to achieve this: assign weights to documents and use a weight-based replacement algorithm
The weights might be a function of:
the time since the item was last referenced
the time it took to retrieve the item
the expected time-to-live of the item
the size of the document, etc.
Selecting A Replacement Algorithm
The problem can be associated with a function that, given:
the state s of the cache, and
a newly retrieved document,
decides the following:
Should the retrieved document be cached?
If yes and no space is available, which existing entry should be discarded?
State s of the cache:
the set of documents stored
for each document, a set of state variables, which typically include statistical information associated with the document
Selecting a Replacement Algorithm
Examples of state variables for a document:
t_i: time since the document was last referenced
S_i: size of the document
rtt_i: time it took to retrieve the document
ttl_i: time-to-live for the document
Idea: assign a weight to each cached document:
W_i = W(t_i, S_i, rtt_i, ttl_i)
This weight function can be specialized to obtain commonly used algorithms
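A minimal sketch of such a weight-based cache (class and variable names are mine, not the paper’s): on overflow, evict the cached document whose weight W(t_i, S_i, rtt_i, ttl_i) is smallest:

```python
# Weight-based replacement: keep high-weight documents, evict the lowest.
import time

class WeightedCache:
    def __init__(self, capacity_bytes, weight_fn):
        self.capacity = capacity_bytes
        self.weight_fn = weight_fn  # W(t, S, rtt, ttl) -> float
        self.entries = {}           # url -> (data, size, rtt, ttl, last_ref)

    def get(self, url):
        entry = self.entries.get(url)
        if entry is None:
            return None  # miss: caller fetches remotely, then calls put()
        self.entries[url] = entry[:4] + (time.time(),)  # refresh last_ref
        return entry[0]

    def put(self, url, data, size, rtt, ttl):
        # Evict lowest-weight entries until the new document fits
        while self.entries and self._used() + size > self.capacity:
            victim = min(self.entries, key=self._weight)
            del self.entries[victim]
        self.entries[url] = (data, size, rtt, ttl, time.time())

    def _used(self):
        return sum(e[1] for e in self.entries.values())

    def _weight(self, url):
        _, size, rtt, ttl, last_ref = self.entries[url]
        t = max(time.time() - last_ref, 1e-6)  # time since last reference
        return self.weight_fn(t, size, rtt, ttl)
```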
Selecting A Replacement Algorithm
They suggest the following function for Web proxies:
W(t_i, S_i, rtt_i, ttl_i) = (w1*rtt_i + w2*S_i)/ttl_i + (w3 + w4*S_i)/t_i
The second term captures temporal locality; the first term captures the cost associated with retrieving documents
The multiplying factor 1/ttl_i indicates that the cost associated with retrieving a document increases as the useful lifetime of the document decreases
Cache Coherence in Web Proxies
Remote sites exporting time-critical pages usually associate time-to-live values with them
TTLs are not commonly used => don’t use ttl_i
Instead: W(t_i, S_i, rtt_i, ttl_i) = w1*rtt_i + w2*S_i + (w3 + w4*S_i)/t_i
To select the w_i values, we must consider the overall goals:
Maximize the hit ratio?
Minimize the perceived retrieval time for a random user?
Minimize the cache size for a given hit ratio? Etc.
Testing Two Algorithms
Use trace-driven simulations to compare the performance of different schemes: LRU and a scheme that takes into account all state variables
Algorithm 1: W(t_i, S_i, rtt_i, ttl_i) = 1/t_i
Algorithm 2: W(t_i, S_i, rtt_i, ttl_i) = w1*rtt_i + w2*S_i + (w3 + w4*S_i)/t_i
Parameters: w1 = 5000 b/s, w2 = 1000, w3 = 1000 b/s, w4 = 10 s
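Plugging the two algorithms into the WeightedCache sketch above (parameter values as listed on this slide; an illustration, not the authors’ simulator):

```python
# Algorithm 1 ignores everything but recency (an LRU-like policy).
def algorithm_1(t, size, rtt, ttl):
    return 1.0 / t

# Algorithm 2 also weighs retrieval time and document size
# (w1 = 5000 b/s, w2 = 1000, w3 = 1000 b/s, w4 = 10 s, as above).
def algorithm_2(t, size, rtt, ttl, w1=5000, w2=1000, w3=1000, w4=10):
    return w1 * rtt + w2 * size + (w3 + w4 * size) / t

cache = WeightedCache(capacity_bytes=10_000_000, weight_fn=algorithm_2)
```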
Testing Two Algorithms
To compare them, the following performance measures are used:
Miss ratio
Weighted miss ratio: the probability that a document is not in the cache multiplied by the document’s weight
The miss ratio for algorithm 1 is slightly lower than the miss ratio for algorithm 2
The weighted miss ratio (and hence the perceived retrieval time) is much lower for algorithm 2
Conclusion: it is good to use a cache replacement algorithm that takes retrieval time into consideration