Sizing the Streaming Media Cluster Solution for a Given Workload

Lucy Cherkasova and Wenting Tang

HPLabs


Capacity Planning Scenarios

• A service provider needs to migrate a media site to a new infrastructure.

• The provider has information about the site workload (the media server logs reflecting past accesses to the site), but mapping the workload requirements onto resource requirements is a hard problem.

• Can we design a tool that helps accomplish these capacity planning tasks?

• The goal of the proposed capacity planning tool is to provide the best cost/performance configuration for support of a known media service workload.


Capacity Planning Framework


Main Components

• Two main components:

– A media workload profiler, MediaProf, that extracts a set of quantitative and qualitative parameters characterizing the service demand;

– The capacity measurements of hardware and software solutions, obtained using a specially designed set of media benchmarks.

• The capacity planning tool matches the requirements of the media service workload profile, SLAs and configuration constraints to produce the best available cost/performance solution.


Basic Benchmarks:

• Single File Benchmark: all clients access the same file (encoded at different bit rates).

• Unique Files Benchmark: all clients access different (unique) files (encoded at different bit rates).

• In our tests, we use sets of files encoded at different bit rates:

– 28 Kb/s (analog modem users)

– 56 Kb/s (analog modem and ISDN users)

– 112 Kb/s (dual ISDN users)

– 256 Kb/s (cable modem users)

– 350 Kb/s (DSL/cable users)

– 500 Kb/s (high-bandwidth users)


Workload-Aware Performance Model of Streaming Media Server Capacity

• How can we compute the expected media server capacity for a realistic workload, given the capacities measured under the basic benchmarks?

• We introduce a cost function that defines the fraction of system resources needed to support a particular stream, depending on:

– the file encoding bit rate, and

– the file access type (streamed from memory or from disk).

• The cost function uses a single value to reflect the combined resource requirements (CPU, disk, memory, and server bandwidth) necessary to support a particular media request.
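As a minimal sketch of such a cost function, assume (the slide does not give the definition) that the cost of one stream is the reciprocal of the benchmark capacity for its bit rate and access type: the Single File Benchmark supplies the capacity for memory accesses and the Unique Files Benchmark the capacity for disk accesses. The capacity numbers and the names `SINGLE_FILE_CAPACITY`, `UNIQUE_FILES_CAPACITY`, and `stream_cost` are hypothetical placeholders, not measurements.

```python
# Hypothetical benchmark results: maximum number of concurrent streams one
# server sustains at each encoding bit rate (Kb/s). Placeholder values only.
SINGLE_FILE_CAPACITY = {56: 600, 112: 400, 256: 200, 500: 100}   # memory accesses
UNIQUE_FILES_CAPACITY = {56: 400, 112: 200, 256: 100, 500: 50}   # disk accesses


def stream_cost(bit_rate_kbps: int, access_type: str) -> float:
    """Fraction of one server's capacity used by a single stream.

    Assumption: cost = 1 / (measured capacity for this bit rate and access type),
    with the capacity of one server normalized to 1.
    """
    if access_type == "memory":
        table = SINGLE_FILE_CAPACITY
    elif access_type == "disk":
        table = UNIQUE_FILES_CAPACITY
    else:
        raise ValueError(f"unknown access type: {access_type!r}")
    return 1.0 / table[bit_rate_kbps]


memory_cost = stream_cost(256, "memory")
disk_cost = stream_cost(256, "disk")
```

With these placeholder capacities, a 256 Kb/s stream served from disk costs twice as much as the same stream served from memory, which is the kind of difference the single-value cost is meant to capture.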


Computing Required System Capacity

Example: a computed load of 4.5 indicates that the considered media workload requires 5 nodes for its support (the load rounded up to a whole number of nodes).

Capacity equation:
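The equation itself did not survive on this slide. One form consistent with the cost function introduced earlier, offered as an assumption rather than as the authors' exact formula, sums the per-stream costs over all encoding bit rates X_i and both access types, and then rounds up to obtain the node count. The symbols N^memory, N^disk, cost^memory, and cost^disk are notation introduced here for illustration.

```latex
\mathit{Demand}
  = \sum_{i} N^{\mathrm{memory}}_{X_i} \cdot \mathit{cost}^{\mathrm{memory}}_{X_i}
  + \sum_{i} N^{\mathrm{disk}}_{X_i} \cdot \mathit{cost}^{\mathrm{disk}}_{X_i},
\qquad
\mathit{Nodes} = \lceil \mathit{Demand} \rceil
```

Here N^memory_{X_i} and N^disk_{X_i} are the numbers of concurrent streams of files encoded at X_i Kb/s that are served from memory and from disk. With Demand = 4.5 this gives Nodes = 5.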


Workload Profiler MediaProf

• MediaProf reflects the access traffic profile for capacity planning goals:

– Evaluates the number of simultaneous (concurrent) connections over time;

– Classifies the simultaneous connections into encoding bit rate bins;

– Classifies the simultaneous connections by file access type: disk vs. memory.
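The three profiling steps above can be illustrated with a small sketch. It is not MediaProf itself: the `Session` record, its fields, and `profile_at` are hypothetical, and the access type is assumed to be already known for each session, whereas the real profiler derives it from the logs.

```python
from collections import Counter
from typing import NamedTuple


class Session(NamedTuple):
    start: int          # session start time, in seconds
    end: int            # session end time, in seconds (exclusive)
    bit_rate_kbps: int  # encoding bit rate of the requested file
    access_type: str    # "memory" or "disk" (assumed precomputed)


def profile_at(sessions, t):
    """Count the connections active at time t per (bit rate, access type) bin."""
    return Counter(
        (s.bit_rate_kbps, s.access_type)
        for s in sessions
        if s.start <= t < s.end
    )


sessions = [
    Session(0, 100, 256, "disk"),
    Session(50, 150, 256, "memory"),
    Session(60, 90, 56, "memory"),
]
profile = profile_at(sessions, 70)
concurrent = sum(profile.values())  # total simultaneous connections at t = 70
```

Evaluating `profile_at` over a sequence of time points gives the per-bin connection counts over time that the later capacity computation needs.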


Segment-based Memory Model

To stream a file from memory, it is not necessary to have the whole file in memory!
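The slide does not spell out the model, so the following is only a rough sketch of the idea under a strong assumption: server memory behaves like an LRU cache over fixed-size file segments. The class `SegmentCache` and its parameters are hypothetical and are not the authors' model.

```python
from collections import OrderedDict


class SegmentCache:
    """LRU cache over (file, segment) pairs: an assumed simplification."""

    def __init__(self, capacity_segments: int):
        self.capacity = capacity_segments
        self._lru = OrderedDict()  # least recently used entry comes first

    def access(self, file_id: str, segment: int) -> str:
        """Return "memory" if the segment is cached, otherwise "disk"."""
        key = (file_id, segment)
        if key in self._lru:
            self._lru.move_to_end(key)     # mark as most recently used
            return "memory"
        self._lru[key] = True              # read from disk, then cache it
        if len(self._lru) > self.capacity:
            self._lru.popitem(last=False)  # evict the least recently used segment
        return "disk"


cache = SegmentCache(capacity_segments=2)
results = [
    cache.access("a", 0),  # first read of the segment
    cache.access("a", 1),  # first read of the segment
    cache.access("a", 0),  # segment 0 is still cached
    cache.access("b", 0),  # evicts ("a", 1)
    cache.access("a", 1),  # evicted, so read from disk again
]
```

In this run file "a" is never fully resident, yet one of its segments is served from memory, which is the point of classifying accesses per segment rather than per file.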


Media Workload Characterization

Example: analysis of the HP Corporate Media Site over a period of one year.

[Figure panels: number of concurrent connections; peak bandwidth requirements; number of requests served from memory; number of requests served from disk]


Overall Capacity Planning Process

There are several logical steps in the capacity planning procedure:

– Computing the media site workload profiles for the different memory sizes of interest. During this initial step, we assume a “single node” cluster: N=1.

– Computing the service demand profile. The service demand profile is an ordered list of pairs: (time duration, service demand).

For example: (300, 4.5), (600, 4), (2000, 3.8), (1000, 3.5)
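A service demand profile of this shape can be built from a series of per-interval demand values. The sketch below assumes that the list is ordered by decreasing demand (as the example suggests) and that each sample covers a fixed interval; the function name and the sample values are hypothetical.

```python
from collections import Counter


def service_demand_profile(demand_per_interval, interval_len=1):
    """Return [(duration, demand), ...] ordered by decreasing demand.

    demand_per_interval: one demand value per fixed-length time interval.
    interval_len: length of each interval, in the profile's time units.
    """
    durations = Counter()
    for demand in demand_per_interval:
        durations[demand] += interval_len
    return [(durations[d], d) for d in sorted(durations, reverse=True)]


# Hypothetical demand samples, one per 100 time units.
samples = [3.5, 4.5, 4.0, 3.5, 4.5]
profile = service_demand_profile(samples, interval_len=100)
```

In practice the demand values would likely be rounded to a fixed precision before grouping, so that nearly equal floating-point values fall into the same bucket.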


Overall Capacity Planning Process (cont.)

– Combining the service demand requirements, the SLAs, and the configuration constraints:

• SLAs: Based on the past workload history, find a configuration that is capable of processing the load 99% of the time;

• Constraints: Based on the past workload history, find a configuration that is utilized under 70% of its capacity 90% of the time.
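One way to read these two requirements is sketched below, using the example profile (300, 4.5), (600, 4), (2000, 3.8), (1000, 3.5). The interpretation is an assumption: the SLA node count is the demand level not exceeded 99% of the time, rounded up, and the constraint node count is the demand level not exceeded 90% of the time, divided by 0.7 and rounded up. The function names are hypothetical, and the capacity of one node is normalized to 1.

```python
import math


def demand_covering(profile, fraction):
    """Smallest demand level that is not exceeded for at least `fraction` of the time.

    profile: list of (duration, demand) pairs.
    """
    total = sum(duration for duration, _ in profile)
    allowed_excess = (1.0 - fraction) * total  # time the load may exceed the level
    seen = 0.0
    for duration, demand in sorted(profile, key=lambda p: p[1], reverse=True):
        seen += duration
        if seen > allowed_excess:
            return demand
    return 0.0


def nodes_required(profile, sla_fraction=0.99, util_fraction=0.90, util_cap=0.70):
    """Node count that satisfies both the SLA and the utilization constraint."""
    nodes_sla = math.ceil(demand_covering(profile, sla_fraction))
    nodes_util = math.ceil(demand_covering(profile, util_fraction) / util_cap)
    return max(nodes_sla, nodes_util)


profile = [(300, 4.5), (600, 4.0), (2000, 3.8), (1000, 3.5)]
sla_demand = demand_covering(profile, 0.99)
util_demand = demand_covering(profile, 0.90)
nodes = nodes_required(profile)
```

Under this reading the SLA alone asks for 5 nodes (demand 4.5), while the utilization constraint asks for 6 (demand 4 divided by 0.7 is about 5.7), so the constraint determines the result for this profile.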


Overall Capacity Planning

Additionally, we need to size the cluster under an appropriate load balancing strategy.


Evaluating Load Balancing Solutions

• For accurate cluster sizing, we need to take into account both the increased processing power (N nodes) and the increased memory size (N times M, where M is the memory size of a single node).

• In our capacity planning tool, we implemented 2 strategies:

– Round Robin;

– LARD (locality-aware): the first access to a file goes to a random node in the cluster, but subsequent requests for the same file are sent to the same node.

• If the outcome of the first iteration is k nodes, then:

– Partition the original workload into k sub-traces W1, W2, …, Wk, where the dispatcher employs the corresponding load balancing strategy;

– Compute the media workload profile for each of W1, W2, …, Wk using MediaProf;

– Merge the computed sub-workload profiles and compute the overall service demand profile.
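The partitioning step can be sketched as follows for both strategies. The trace format (a list of `(timestamp, file_id)` pairs), the function names, and the fixed random seed are assumptions made for illustration; the LARD variant follows only the simplified description given here.

```python
import random


def partition_round_robin(trace, k):
    """Send request i to node i mod k."""
    subtraces = [[] for _ in range(k)]
    for i, request in enumerate(trace):
        subtraces[i % k].append(request)
    return subtraces


def partition_lard(trace, k, seed=0):
    """First access to a file picks a random node; later accesses reuse that node."""
    rng = random.Random(seed)  # seeded so that the partition is reproducible
    node_of_file = {}
    subtraces = [[] for _ in range(k)]
    for request in trace:
        file_id = request[1]
        if file_id not in node_of_file:
            node_of_file[file_id] = rng.randrange(k)
        subtraces[node_of_file[file_id]].append(request)
    return subtraces


# Hypothetical trace of (timestamp, file_id) requests.
trace = [(0, "a"), (1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "a")]
rr = partition_round_robin(trace, 2)
lard = partition_lard(trace, 2)
```

With Round Robin the popular file "a" ends up on both nodes, so each node has to hold its own copy in memory; with the locality-aware split every file is served by exactly one node.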


Performance Results

• For workload generation, we used MediSyn, a publicly available synthetic workload generator.

• The synthetic workload WSYN uses parameters that are typical for enterprise media workloads:

Video Duration

– 20% of videos are 0-2 min long

– 10% of videos are 2-5 min long

– 13% of videos are 5-10 min long

– 23% of videos are 10-30 min long

– 21% of videos are 30-60 min long

– 13% of videos are longer than 60 min

File Encoding Bit Rate

– 5% of videos are encoded at 56 Kb/s

– 20% of videos are encoded at 112 Kb/s

– 50% of videos are encoded at 256 Kb/s

– 25% of videos are encoded at 500 Kb/s


Simulation Environment (cont.)

• The file popularity in WSYN is defined by a Zipf-like distribution with alpha = 1.34.

• Overall, WSYN has 800 files (with a 41 GB storage footprint), and 90% of the requests target 10% of the files (with a 3.8 GB storage footprint).

• Media server capacity:

Let the server memory sizes of interest be 0.5 GB and 1 GB, and let the cost of disk access be 5 times the cost of memory access.
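The popularity figures for WSYN can be sanity-checked with a short calculation. The sketch assumes a pure Zipf-like law in which the request probability of the file at popularity rank i is proportional to 1 / i^alpha; the generator may use a more elaborate model, and the function name is hypothetical.

```python
def zipf_probabilities(num_files: int, alpha: float):
    """Request probability per file, ordered from most to least popular."""
    weights = [1.0 / rank ** alpha for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


probs = zipf_probabilities(800, 1.34)
# Share of requests aimed at the most popular 10% of the files (80 of 800).
top_share = sum(probs[:80])
print(f"top 10% of files receive {top_share:.1%} of requests")
```

The printed share can be compared with the 90% figure quoted for WSYN.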


Capacity Planning (first iteration)

The considered workload requires 5 nodes with memory of 1 GB, or 6 nodes with memory of 0.5 GB.


Round Robin Strategy

Since the RR strategy distributes the requests uniformly across all the machines, it prevents efficient memory usage: the increased cluster memory does not provide additional performance benefits.


LARD Load Balancing

The locality-aware load balancing strategy provides significant performance benefits due to efficient memory usage.


Cluster sizing

• Cluster sizing results for a given synthetic workload are summarized in the following Table:

The locality-aware load balancing strategy utilizes the increased cluster memory more efficiently and requires fewer nodes to support the same traffic.


Conclusion

• We proposed a new unified benchmarking and capacity planning framework that:

– Measures media server capacity via a set of basic benchmarks;

– Derives the resource requirements using a single-value cost function;

– Estimates the service capacity requirements from the proposed media workload profile.

• In future work, we would like to incorporate the availability requirements into the proposed capacity planning framework.