Fair Scheduling in Web Servers CS 213 Lecture 17 L.N. Bhuyan

Differentiated Service in a Web Cluster: Objective

• Create an arbitrary number of service quality classes and assign a priority weight to each class.

• Provide service differentiation for different user classes in terms of the allocation of CPU and disk I/O capacities

Ref: Demand Driven Service Differentiation in Cluster-based Network Servers, Infocom 2001, by Zhu, Tang and Yang

Target System

Service Differentiation

• Requests of higher classes receive better service than requests of lower classes, especially when the system is heavily loaded

• Requests from lower classes should not be sacrificed for requests from higher classes when the system load is light

Definitions

• Request classes: C1, C2, …, CN

• Corresponding weights: W1, W2, …, WN

• Stretch factors: S1, S2, …, SN

Stretch factor: the ratio of the response time of a request to the service demand of that request

• Average arrival rate of class i: λi

• Average processing rate of class i: μi

• Minimum resource requirement of class i: ρi = λi / μi
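
The definitions above can be made concrete with a minimal sketch. The function names and the sample numbers are illustrative assumptions, not values from the lecture.

```python
def stretch_factor(response_time, service_demand):
    """Ratio of a request's response time to its service demand."""
    if service_demand <= 0:
        raise ValueError("service demand must be positive")
    return response_time / service_demand


def min_resource_requirement(arrival_rate, processing_rate):
    """rho_i = lambda_i / mu_i for one request class."""
    if processing_rate <= 0:
        raise ValueError("processing rate must be positive")
    return arrival_rate / processing_rate


# Hypothetical numbers: a request that needs 0.25 s of service but takes
# 0.5 s to complete, and a class with 40 arrivals/s served at 100 requests/s.
s = stretch_factor(response_time=0.5, service_demand=0.25)
rho = min_resource_requirement(arrival_rate=40.0, processing_rate=100.0)
print(s, rho)
```

A stretch factor of 1 means the request saw no queueing delay; larger values mean the request waited longer relative to its own size.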

Optimization Problem

• Minimize:

F = W1S1 + W2S2 + … + WNSN

subject to: S1 ≤ S2 ≤ … ≤ SN

and S1, S2, …, SN ≤ K

where K > 1 is a predefined bound on the stretch factor
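
As a rough illustration of the formulation above, the sketch below evaluates the objective F and checks the two constraints for one candidate set of stretch factors. It does not solve the optimization; the helper names, weights and stretch values are assumptions made for the example.

```python
def objective(weights, stretches):
    """F = W1*S1 + W2*S2 + ... + WN*SN."""
    if len(weights) != len(stretches):
        raise ValueError("need one stretch factor per weight")
    return sum(w * s for w, s in zip(weights, stretches))


def is_feasible(stretches, bound):
    """Check S1 <= S2 <= ... <= SN and Si <= K for every class."""
    ordered = all(a <= b for a, b in zip(stretches, stretches[1:]))
    bounded = all(s <= bound for s in stretches)
    return ordered and bounded


# Hypothetical three-class system; class 1 has the largest weight.
weights = [4.0, 2.0, 1.0]
stretches = [1.5, 2.0, 3.0]
K = 4.0

print(objective(weights, stretches))  # 4*1.5 + 2*2.0 + 1*3.0 = 13.0
print(is_feasible(stretches, K))      # True
```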

Optimal Solution

Scheduling Optimization for Resource-Intensive Web Requests

In SPAA '99, by Zhu, Smith and Yang

Request Classification

• Static Data

web pages, images, etc.

do not consume many system resources

Request Classification (Cont.)

• Dynamic Data

e-commerce, database searching, personalized information

generated dynamically, so they place greater I/O and CPU demands on the server

processing time is 1 to 2 orders of magnitude longer than for static requests, based on IBM Olympics and Alexandria Digital Library data

Flat Architecture

• Server nodes can process both static and dynamic requests

Master/Slave Architecture

• Server nodes are divided into two groups:

Slave group processes only dynamic requests

Master group can handle both static and dynamic requests

How to partition a cluster

• Questions:

1: Given p nodes, what is the optimal number of master nodes?

2: What percentage of dynamic requests should be processed on masters?

Goal: ensure the master/slave architecture outperforms the flat architecture
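
The second question above can be pictured with a small routing sketch. It assumes a single tunable fraction, here called `theta`, of dynamic requests that stay on master nodes, and it picks the group at random; both the parameter name and the randomized choice are assumptions for illustration, not the policy derived in the paper.

```python
import random


def route(kind, theta, rng):
    """Return 'master' or 'slave' for one request.

    kind  -- 'static' or 'dynamic'
    theta -- assumed fraction of dynamic requests handled by masters (0..1)
    rng   -- random.Random instance, passed in so runs are repeatable
    """
    if kind == "static":
        return "master"  # slaves process only dynamic requests
    if kind != "dynamic":
        raise ValueError("kind must be 'static' or 'dynamic'")
    return "master" if rng.random() < theta else "slave"


rng = random.Random(0)
sample = [route("dynamic", 0.25, rng) for _ in range(1000)]
print(sample.count("master") / len(sample))  # close to theta
```

Setting `theta` to 0 sends every dynamic request to the slave group, while `theta` equal to 1 makes the cluster behave like the flat architecture restricted to the master nodes.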

M/S and Flat Models

Performance Metric

• Stretch factor: the ratio of the response time at a particular load to the response time at no load

• Average stretch factor is better suited than average response time to systems with highly variable task sizes. The average stretch factor indicates the load on a system.
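
A small worked example may help show why the average stretch factor is preferred when task sizes vary widely. The two requests and their timings below are made up for illustration.

```python
# Each tuple is (response time at no load, response time under load), in seconds.
# Hypothetical workload: one small static request and one large dynamic request.
requests = [
    (0.25, 1.0),   # small request, slowed down 4x
    (8.0, 10.0),   # large request, slowed down 1.25x
]

avg_response = sum(loaded for _, loaded in requests) / len(requests)
avg_stretch = sum(loaded / unloaded for unloaded, loaded in requests) / len(requests)

print(avg_response)  # 5.5, dominated by the large request
print(avg_stretch)   # 2.625, reflects the slowdown of both requests
```

The average response time is almost entirely determined by the large request, so the fourfold slowdown of the small request is hidden; the average stretch factor weighs each request by how much it was delayed relative to its own size.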

Evaluation results

• M/S: up to 69% performance improvement over flat

Separation of dynamic and static content

• Resource reservation: up to 68% improvement

• Resource requirement sampling: up to 23% improvement

Performance Guarantees for Internet Services (Gage)

• Environment: Web hosting services

multiple logical web servers (service subscribers) hosted on a single physical web server cluster

• Gage:

guarantees each logical web server a pre-specified level of performance

expressed, for each subscriber, as a distinct number of URL requests serviced per second

Components

Each service subscriber maintains a queue

• Request classification: determines the queue for each incoming request

• Request scheduling: determines which queue to serve next to meet the QoS requirement of each subscriber

• Resource usage accounting: captures the detailed resource usage associated with each subscriber's service requests
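
The classification and accounting components listed above might be sketched as follows. The class name, the choice to classify by the request's host name, and the dictionary layout are all assumptions for illustration; the slides do not specify how Gage implements these components.

```python
from collections import defaultdict, deque


class SubscriberQueues:
    """Per-subscriber request queues with simple resource usage accounting."""

    def __init__(self, host_to_subscriber):
        # Assumed classification rule: map the request's host name to a subscriber.
        self.host_to_subscriber = dict(host_to_subscriber)
        self.queues = defaultdict(deque)
        self.usage = defaultdict(
            lambda: {"cpu_ms": 0.0, "disk_ms": 0.0, "net_bytes": 0}
        )

    def classify(self, request):
        """Request classification: find the queue owner for a request."""
        return self.host_to_subscriber[request["host"]]

    def enqueue(self, request):
        subscriber = self.classify(request)
        self.queues[subscriber].append(request)
        return subscriber

    def account(self, subscriber, cpu_ms, disk_ms, net_bytes):
        """Resource usage accounting: add one request's cost to its subscriber."""
        record = self.usage[subscriber]
        record["cpu_ms"] += cpu_ms
        record["disk_ms"] += disk_ms
        record["net_bytes"] += net_bytes


# Hypothetical subscribers and requests.
gage = SubscriberQueues({"shop.example": "A", "news.example": "B"})
gage.enqueue({"host": "shop.example", "url": "/cart"})
gage.enqueue({"host": "news.example", "url": "/today"})
gage.account("A", cpu_ms=12.0, disk_ms=3.0, net_bytes=1500)
print(len(gage.queues["A"]), gage.usage["A"])
```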

The Gage System

• QoS guarantee

QoS is expressed as a fixed number of generic URL requests, where a generic request represents an average web site access

Currently, a generic request is assumed to be 10 msec of CPU time, 10 msec of disk I/O and 2000 bytes of network bandwidth

Each subscriber is given a fixed number of generic requests.

Other possible QoS metrics: response time, delay jitter, etc.
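
One way to picture the generic request unit is to convert a real request's measured resource usage into a number of generic requests. The conversion rule below (charge by the most heavily used resource) is an assumption chosen for illustration; the slides only give the size of one generic request.

```python
# One generic request, as stated above.
GENERIC = {"cpu_ms": 10.0, "disk_ms": 10.0, "net_bytes": 2000.0}


def generic_units(usage):
    """Express a request's usage in generic requests.

    Assumed rule: the request is charged for its bottleneck resource,
    i.e. the largest ratio of actual usage to the generic amount.
    """
    return max(usage[name] / GENERIC[name] for name in GENERIC)


# Hypothetical dynamic request: CPU-heavy, moderate network output.
usage = {"cpu_ms": 25.0, "disk_ms": 10.0, "net_bytes": 4000.0}
print(generic_units(usage))  # max(2.5, 1.0, 2.0) = 2.5
```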

• Using TCP splicing

Request Scheduling

Two decisions:

• Which request should be serviced next, according to each subscriber's static resource reservation and dynamic resource usage

• Which RPN (request processing node) should service this request, according to the load information on each RPN and the opportunity to exploit access locality
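
The two decisions above can be sketched as two small functions. The specific rules used here (serve the backlogged subscriber that has consumed the smallest share of its reservation, and prefer the node that last served the same URL unless it is too busy) are assumptions for illustration, not Gage's actual algorithm; all names and numbers are hypothetical.

```python
def pick_subscriber(queues, reserved, used):
    """Decision 1: choose whose request to serve next.

    Among subscribers with waiting requests, pick the one with the lowest
    ratio of dynamic usage to static reservation.
    """
    backlogged = [s for s, q in queues.items() if q]
    if not backlogged:
        return None
    return min(backlogged, key=lambda s: used[s] / reserved[s])


def pick_rpn(url, load, last_served, max_load):
    """Decision 2: choose which RPN serves the request.

    Prefer the node that last served this URL (access locality) as long as
    its load is below max_load; otherwise fall back to the least loaded node.
    """
    node = last_served.get(url)
    if node is not None and load[node] < max_load:
        return node
    return min(load, key=load.get)


queues = {"A": ["/x"], "B": ["/y"], "C": []}
reserved = {"A": 100.0, "B": 50.0, "C": 10.0}
used = {"A": 80.0, "B": 10.0, "C": 0.0}
load = {"rpn1": 5, "rpn2": 2}
last_served = {"/y": "rpn1"}

print(pick_subscriber(queues, reserved, used))           # B
print(pick_rpn("/y", load, last_served, max_load=8))     # rpn1 (locality)
print(pick_rpn("/y", load, last_served, max_load=4))     # rpn2 (least loaded)
```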