Performance Engineering of the WWW: Application to Dimensioning and Caching
Jean-Chrysostome Bolot, Philipp Hoschka (INRIA)


Page 1: Performance Engineering of the WWW: Application to Dimensioning and Caching

Performance Engineering of the WWW: Application to Dimensioning and Caching

Jean-Chrysostome Bolot, Philipp Hoschka
INRIA

Page 2: Performance Engineering of the WWW: Application to Dimensioning and Caching

Main Ideas

Focuses on applying analytic techniques to study performance issues of the WWW: dimensioning and caching

Main contributions:
Show the use of time series analysis to model Web traffic and forecast Web server loads
Show that cache replacement algorithms other than LRU and size-based should be used

Page 3: Performance Engineering of the WWW: Application to Dimensioning and Caching

Dimensioning and caching what?

WWW = distributed data service
WWW quality = data access quality
Access time = users’ utility function

Decompose access time into:
Time the request takes to reach the server
Time the server takes to process the request
Time the reply takes to get to the client
Time the client takes to process the reply

Page 4: Performance Engineering of the WWW: Application to Dimensioning and Caching

Use of Time Series for Web Dimensioning

A time series is a sequence of observations of a random process taken sequentially in time

Data can be measured in an active or passive manner
An intrinsic feature of a time series is that typically there are dependencies among adjacent observations
Time series analysis is a large body of principles designed to study such dependencies

To study a time series we need to:
identify a model
fit the model
validate the model

Page 5: Performance Engineering of the WWW: Application to Dimensioning and Caching

Time Series Modeling

Let’s say we measure round-trip times for packets
Constructing a time series model involves expressing rtt_n in terms of previous observations rtt_{n-i} and some error noise e_n

Noise processes are assumed to be uncorrelated with mean 0 and finite variance (simplest models)

For n > 0, rtt_n = f({rtt_{n-i}}, {e_{n-i}}, i >= 0)

Most common models in the literature are linear models, the best known being the Autoregressive (AR), the Moving Average (MA), and the Autoregressive Moving Average (ARMA) models

Page 6: Performance Engineering of the WWW: Application to Dimensioning and Caching

AR Models

In this model, the current value of the process is expressed as a finite, linear aggregate of previous values of the process and an error term e_t

Ex: z_t = value of a process at time t
rtt_t = φ1·rtt_{t-1} + φ2·rtt_{t-2} + ... + φp·rtt_{t-p} + e_t

Why Autoregressive?
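The AR(p) recursion above is easy to sketch in code. Below is a minimal pure-Python illustration of an AR(2) model; the coefficients φ1 = 0.6, φ2 = 0.3 and the function name are invented for the example, not taken from the paper:

```python
import random

# Hypothetical AR(2) coefficients (invented for illustration):
# rtt_t = phi1*rtt_{t-1} + phi2*rtt_{t-2} + e_t
PHI1, PHI2 = 0.6, 0.3

def ar2_next(history, noise=0.0):
    """One step of the AR(2) recursion from the last two observations."""
    return PHI1 * history[-1] + PHI2 * history[-2] + noise

# Build a short synthetic RTT series (ms) driven by zero-mean noise,
# then forecast the next value by setting the future noise to its mean, 0.
random.seed(1)
rtt = [100.0, 100.0]
for _ in range(50):
    rtt.append(ar2_next(rtt, noise=random.gauss(0.0, 1.0)))

forecast = ar2_next(rtt)  # noise term replaced by its expectation, 0
print(forecast)
```

The name "autoregressive" is visible in the code: the series is regressed on its own past values.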

Page 7: Performance Engineering of the WWW: Application to Dimensioning and Caching

MA Models

In these models z_t is expressed as a linearly dependent combination of a finite number q of previous error terms (the e’s):
rtt_t = θ0·e_t + θ1·e_{t-1} + θ2·e_{t-2} + ... + θq·e_{t-q}

It is called an MA process of order q
The weights need not add to 1 or be positive

Page 8: Performance Engineering of the WWW: Application to Dimensioning and Caching

Mixed AR-MA Models

In order to get more flexibility in fitting actual time series, it is sometimes convenient to include both AR and MA terms in the same model:

rtt_t = φ1·rtt_{t-1} + φ2·rtt_{t-2} + ... + φp·rtt_{t-p} + e_t − θ1·e_{t-1} − ... − θq·e_{t-q}

The model uses p + q + 2 unknown parameters
ARMA models have been widely used for: video traffic, call requests in telephone networks, and memory references in software systems
Not widely used in computer networking
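For a one-step ARMA(p,q) forecast, the unknown future noise e_t is replaced by its mean, 0. A minimal sketch (the function name and the example values are mine, not the paper's):

```python
def arma_forecast(x, e, phi, theta):
    """One-step ARMA(p,q) forecast:
    x_t = phi_1*x_{t-1} + ... + phi_p*x_{t-p}
          + e_t - theta_1*e_{t-1} - ... - theta_q*e_{t-q},
    with the future noise e_t set to its mean, 0.
    x: past observations, e: past noise terms (most recent last)."""
    ar = sum(phi[i] * x[-1 - i] for i in range(len(phi)))
    ma = sum(theta[j] * e[-1 - j] for j in range(len(theta)))
    return ar - ma

# ARMA(1,1) with phi1 = 0.5, theta1 = 0.2:
# forecast = 0.5 * 2.0 - 0.2 * 0.5 = 0.9
print(arma_forecast([1.0, 2.0], [0.3, 0.5], phi=[0.5], theta=[0.2]))
```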

Page 9: Performance Engineering of the WWW: Application to Dimensioning and Caching

ARIMA Models

The ARMA model fitting assumes the underlying stochastic process to be stationary

Recall: a stationary process is one that remains in equilibrium about a constant mean level

Many “real-life” time series are nonstationary
Nonstationary processes do not oscillate around a specific mean
However, they can exhibit homogeneous behavior

Page 10: Performance Engineering of the WWW: Application to Dimensioning and Caching

ARIMA Models

The most common approach to deal with them is to use two models: one for the nonstationary part, one for the stationary residual part

Nonstationary time series can be modeled with an integrated model such as the ARIMA model

An ARIMA model of order (p,d,q) is an ARMA model of order (p,q) that has been differenced d times
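The differencing step is simple to state in code. A pure-Python sketch (the function name is mine): first-order differencing applied d times turns, for example, a linear trend into a stationary constant series:

```python
def difference(series, d=1):
    """Apply first-order differencing d times: x_t -> x_t - x_{t-1}.
    An ARIMA(p,d,q) model is an ARMA(p,q) model fit to the result."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

trend = [3 * t for t in range(6)]     # 0, 3, 6, 9, 12, 15: nonstationary
print(difference(trend, d=1))         # [3, 3, 3, 3, 3]
print(difference(trend, d=2))         # [0, 0, 0, 0]
```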

Page 11: Performance Engineering of the WWW: Application to Dimensioning and Caching

Seasonal ARIMA Models

For other nonstationary series, “plain” ARIMA models cannot be used

The most common of these are series with seasonal trends

Ex: The data for a particular hour in a month-long trace is typically correlated with the hours preceding it as well as with the same hours in preceding days

We can deal with them using seasonal ARIMA models, referred to as ARIMA(p,d,q)×(P,D,Q)_s

Idea: Fit two models, one for the entire time series and another only for data points that are s units apart
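The "data points that are s units apart" part corresponds to seasonal differencing. A minimal sketch, assuming hourly data with a daily cycle (s = 24); the function is illustrative, not from the paper:

```python
def seasonal_difference(series, s=24):
    """Difference data points s units apart: x_t -> x_t - x_{t-s}.
    For hourly Web-server counts, s = 24 removes the daily cycle."""
    return [series[i] - series[i - s] for i in range(s, len(series))]

# A pure daily cycle (period 24) vanishes under seasonal differencing.
cycle = [i % 24 for i in range(72)]
print(all(v == 0 for v in seasonal_difference(cycle, s=24)))  # True
```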

Page 12: Performance Engineering of the WWW: Application to Dimensioning and Caching

After Selecting the model...

Once a model has been selected we have:
identification (select the values of p and q)
estimation (estimate the values of φ1,...,φp and θ0,...,θq)
evaluation (diagnostic checking)

Now we have a model; use it to forecast (predict) future values of the process

Page 13: Performance Engineering of the WWW: Application to Dimensioning and Caching

Application to Web Analysis

Use these models to analyze several data sets from their Web servers:
number of requests handled
size of the replies (Mbytes)

Want to study variations at the hour granularity (consider averages over a month-long interval)

Main point: there are strong seasonal variations (daily cycles)

Also, observe a trend reflecting the growing number of requests handled by the server over the past year

Used a seasonal ARIMA model (2,2,1)×(3,2,0)_24

Page 14: Performance Engineering of the WWW: Application to Dimensioning and Caching

Using the Model to Predict

Want to forecast the number of requests received by the server (important for dimensioning the server)

Forecasting problem: Given a series of values observed up to time n, predict the value of the process at some specific time in the future minimizing some prediction error

Found that the ARIMA-based forecasting provides reasonably accurate short- and medium-term predictions

More accurate medium- and long-term predictions are limited by the amount of available trace data

Page 15: Performance Engineering of the WWW: Application to Dimensioning and Caching

Efficient Cache Replacement Algorithms for Web Proxies

Avoid overloads? => control the amount of data pumped into the network => minimize distant requests => caching

Proxy caching is good iff clients exhibit enough temporal locality in accessing documents

Also, small files are requested more often
Good replacement algorithms are needed
Typically used: LRU and size-based

Page 16: Performance Engineering of the WWW: Application to Dimensioning and Caching

Caching Algorithms

Cache algorithms are compared in terms of:
miss ratio
normalized resolution time

The lower the miss ratio, the lower the amount of data going through the network

Users don’t care about miss ratios: they want low response times

Quantify the quality of an algorithm in terms of the normalized resolution time T: the ratio of the average resolution time with the cache to the average resolution time without it

Page 17: Performance Engineering of the WWW: Application to Dimensioning and Caching

Cache Algorithms

Let p: miss probability
Tc: average time to access a cache entry
Tnc: average time to access a document not in the cache

Then: T = (Tc + p·Tnc) / Tnc

Assuming Tnc >> Tc, T ≈ p
T is minimized when p is minimized
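The approximation T ≈ p follows directly from the definition; a quick numeric check (the timing values are made up for illustration):

```python
def normalized_resolution_time(p, t_cache, t_nocache):
    """T = (Tc + p*Tnc) / Tnc: average resolution time with the
    cache divided by the average resolution time without it."""
    return (t_cache + p * t_nocache) / t_nocache

# With Tnc >> Tc (say 1 ms cache access vs 1000 ms remote fetch),
# T is dominated by the miss probability p.
print(normalized_resolution_time(0.4, t_cache=1.0, t_nocache=1000.0))  # 0.401
```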

Page 18: Performance Engineering of the WWW: Application to Dimensioning and Caching

Cache Algorithms

The above statement seems to argue for large caches
However:
cache size is limited
the miss ratio is related to the size of the documents stored in the cache

For a given cache size, the number of cached documents, and hence the hit ratio, decreases as the document size increases
Small files are more often requested
These observations have led to algorithms that take into account not only temporal locality but also document sizes

Page 19: Performance Engineering of the WWW: Application to Dimensioning and Caching

Cache Algorithms

Surprisingly, no cache algorithm takes as an input parameter the time it took to retrieve a given document

A Web cache replacement algorithm should take into account the retrieval time associated with each document in the cache

One way to achieve this: assign weights to documents and use a weight-based replacement algorithm

The weights might be a function of:
the time since the item was last referenced
the time it took to retrieve the item
the expected time-to-live for the item
the size of the document, etc.

Page 20: Performance Engineering of the WWW: Application to Dimensioning and Caching

Selecting A Replacement Algorithm

The problem can be associated with a function that, given:
the state s of the cache, and
a newly retrieved document

decides the following:
Should the retrieved document be cached?
If yes and no space is available, which existing entry to discard?

State s of the cache:
the set of documents stored
for each document, a set of state variables which typically include statistical information associated with the document

Page 21: Performance Engineering of the WWW: Application to Dimensioning and Caching

Selecting a Replacement Algorithm

Examples of state variables for a document:
t_i: time since the document was last referenced
S_i: size of the document
rtt_i: time it took to retrieve the document
ttl_i: time-to-live for the document

Idea: assign a weight to each cached document:

W_i = W(t_i, S_i, rtt_i, ttl_i)

This weight function can be specialized to obtain commonly used algorithms
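A weight-based replacement policy then evicts the cached document with the smallest weight. A sketch of the mechanism (the function name and dictionary layout are mine; only the idea of evicting the minimum-weight entry is from the paper):

```python
def evict_min_weight(cache, weight_fn):
    """Remove and return the cached document with the smallest weight.
    cache: list of dicts with keys t, S, rtt, ttl (the state variables)."""
    victim = min(cache, key=lambda d: weight_fn(d["t"], d["S"], d["rtt"], d["ttl"]))
    cache.remove(victim)
    return victim

# LRU falls out as the special case W = 1/t_i: the document with the
# largest time-since-last-reference has the smallest weight.
lru = lambda t, S, rtt, ttl: 1.0 / t

cache = [
    {"t": 5.0, "S": 10_000, "rtt": 0.2, "ttl": 3600.0},   # referenced 5 s ago
    {"t": 90.0, "S": 2_000, "rtt": 1.5, "ttl": 3600.0},   # referenced 90 s ago
]
print(evict_min_weight(cache, lru)["t"])  # 90.0 (least recently used)
```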

Page 22: Performance Engineering of the WWW: Application to Dimensioning and Caching

Selecting A Replacement Algorithm

They suggest the following function for Web proxies:

W(t_i, S_i, rtt_i, ttl_i) = (w1·rtt_i + w2·S_i)/ttl_i + (w3 + w4·S_i)/t_i

The second term captures temporal locality and the first term captures the cost associated with retrieving documents

The multiplying factor 1/ttl_i indicates that the cost associated with retrieving a document increases as the useful lifetime of the document decreases

Page 23: Performance Engineering of the WWW: Application to Dimensioning and Caching

Cache Coherence in Web Proxies

Remote sites exporting time-critical pages usually associate time-to-live values with them

TTLs are not commonly used => don’t use ttl_i
Instead:

W(t_i, S_i, rtt_i, ttl_i) = w1·rtt_i + w2·S_i + (w3 + w4·S_i)/t_i

To select the w_i values, one must consider the overall goals:
Maximize the hit ratio?
Minimize the perceived retrieval time for a random user?
Minimize the cache size for a given hit ratio?
Etc.

Page 24: Performance Engineering of the WWW: Application to Dimensioning and Caching

Testing Two Algorithms

Use trace-driven simulations to compare the performance of different schemes: LRU and a scheme that takes into account all state variables

Algorithm 1: W(t_i, S_i, rtt_i, ttl_i) = 1/t_i

Algorithm 2: W(t_i, S_i, rtt_i, ttl_i) = w1·rtt_i + w2·S_i + (w3 + w4·S_i)/t_i

Parameters: w1 = 5000 b/s, w2 = 1000, w3 = 1000 b/s, w4 = 10 s
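With the paper's parameter values, the two weight functions can be written down directly. A sketch (the trace itself is not reproduced here, so the example only shows how the two functions rank two hypothetical documents differently; document values and units are invented for illustration):

```python
# w values from the paper's simulations: w1 = 5000 b/s, w2 = 1000,
# w3 = 1000 b/s, w4 = 10 s.
W1, W2, W3, W4 = 5000.0, 1000.0, 1000.0, 10.0

def weight_alg1(t, S, rtt):
    """Algorithm 1: W = 1/t_i (equivalent to LRU)."""
    return 1.0 / t

def weight_alg2(t, S, rtt):
    """Algorithm 2: W = w1*rtt_i + w2*S_i + (w3 + w4*S_i)/t_i."""
    return W1 * rtt + W2 * S + (W3 + W4 * S) / t

# Two hypothetical documents: a small, cheap-to-fetch one referenced
# recently, and a large, slow-to-fetch one referenced long ago.
recent_cheap = dict(t=5.0, S=0.01, rtt=0.05)    # sizes in Mbytes, times in s
old_expensive = dict(t=500.0, S=2.0, rtt=4.0)

# Under algorithm 1 (LRU) the old document has the smaller weight and
# would be evicted first...
print(weight_alg1(**recent_cheap) > weight_alg1(**old_expensive))   # True
# ...but algorithm 2 ranks it higher, protecting the expensive fetch.
print(weight_alg2(**recent_cheap) < weight_alg2(**old_expensive))   # True
```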

Page 25: Performance Engineering of the WWW: Application to Dimensioning and Caching

Testing two Algorithms

To compare, the following performance measures are used:
miss ratio
weighted miss ratio: the probability that a document is not in the cache multiplied by the document’s weight

The miss ratio for algorithm 1 is slightly lower than the miss ratio for algorithm 2
The weighted miss ratio (and hence the perceived retrieval time) is much lower for algorithm 2
Conclusion: it is good to use a cache replacement algorithm that takes retrieval time into consideration