Use of a Levy Distribution for Modeling Best Case Execution Time Variation

DESCRIPTION

Presentation for 11th European Workshop on Performance Engineering. Topic: Use of a Levy Distribution for Modeling Best Case Execution Time Variation

Jonathan Beard, Roger Chamberlain

SBS (Stream Based Supercomputing) Lab, http://sbs.wustl.edu

Work also supported by:

Outline

• Motivation
• Stream Processing
• Optimization Goals
• Methodology
• Distributions
• Results

Streaming Computing

[Figure: a streaming application drawn as compute kernels (Kernel 1, Kernel 2, Kernel 3) connected by streams; Kernel 2 appears twice.]

Streaming Languages

StreamIt, Auto-Pipe, Brook, Cg, S-Net, Scala-Pipe, Streams-C, and many others

Optimization

[Figure: a kernel with options labeled Slow, Medium, Fast, and Super Fast.]

[Figure: Kernels 1, 2, and 3 (Kernel 2 appears twice) to be placed on two four-core processors, multi-core A and multi-core B.]

More allocation choices: each stream can be allocated on NUMA node A or B.

[Figure: kernels A, B, and C in a pipeline, with queue Q1 between A and B and queue Q2 between B and C.]

“Stream” is modeled as a Queue

Streaming on Multi-core Systems

• Commodity multi-core timer availability and latency
• Frequency scaling and core migration
• Measuring modifies the application behavior

Problem: Accurate measurement is very difficult. Is there a way to decide on a model without it?

We want good models for streaming systems on shared multi-core systems (i.e., a cluster)

Derived Information

[Figure: panels labeled Expected and Observed.]

Is there a pattern of minimal variation within the systems we’re running on?

Avg. Service Time = E[ X ] + Error
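The decomposition above can be sketched in a few lines. This is only an illustration under stated assumptions: the sample mean stands in for E[X], the function name split_mean_and_error is hypothetical, and the sample values are made up rather than measured.

```python
from statistics import mean

def split_mean_and_error(observations):
    """Return (sample mean, per-observation errors obs - mean)."""
    expected = mean(observations)  # stand-in for E[X]
    errors = [obs - expected for obs in observations]
    return expected, errors

if __name__ == "__main__":
    # Hypothetical service times in seconds, not measured data.
    samples = [1.00e-5, 1.02e-5, 0.99e-5, 1.10e-5, 1.01e-5]
    expected, errors = split_mean_and_error(samples)
    print(f"mean service time: {expected:.3e} s")
    print("largest error:", max(errors, key=abs))
```

The error terms are what the rest of the deck tries to characterize with a distribution.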

Goal

Find a distribution that characterizes the minimum expected variation of a hardware and software system

Use this characterization to accept or reject models

Process

• Measurement
• Workload definition
• Find a distribution
• Utilize the distribution to aid model selection

Timer Mechanism

[Figure: the code under test asks a dedicated timer thread for the time and receives the time back.]

Timer Thread

rdtsc
• x86 assembly
• varying methods to serialize
• relatively fast
• multiple drift issues

clock_gettime
• POSIX standard
• relatively accurate
• portable
• slower than rdtsc
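To get a feel for the cost of the clock_gettime path, the sketch below estimates the average cost of one clock read. It is not the authors' C++ timer. It assumes Python's time.clock_gettime_ns wrapper is available (Unix) and otherwise falls back to time.perf_counter_ns. The helper names are hypothetical, and interpreter overhead dominates the figure, so treat it as an upper bound. rdtsc is not reachable from the Python standard library and is not shown.

```python
import time

def read_clock_ns():
    """Read a monotonic clock in nanoseconds.

    Uses the POSIX clock_gettime wrapper where Python exposes it;
    the perf_counter_ns fallback is an assumption made for portability.
    """
    if hasattr(time, "clock_gettime_ns") and hasattr(time, "CLOCK_MONOTONIC"):
        return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    return time.perf_counter_ns()

def timer_call_cost_ns(calls=100_000):
    """Estimate the average cost of one clock read, in nanoseconds."""
    start = read_clock_ns()
    for _ in range(calls):
        read_clock_ns()
    stop = read_clock_ns()
    return (stop - start) / calls

if __name__ == "__main__":
    print(f"approx. cost per clock read: {timer_call_cost_ns():.1f} ns")
```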

Two Timing Choices

NUMA Node Variations

Minimize Variation

• Restrict the timer to a single core
• Use the x86 rdtsc instruction with the processor-recommended serializers for each processor type
• Keep processes under test on the same NUMA node as the timer
• Run the timer thread with altered priority to minimize core context swaps

Best Case Execution Time Variation

• no-op instruction implemented in most processors
• usually takes exactly 1 cycle
• no real functional units are involved, so least taxing
• variation observed in execution time should be external to process

Data Collection

• no-op loops calibrated for various nominal times, tied to a single core and run thousands of times
• Execution time measured end to end for each run, environment collected
• Parameters include:
    • Number of processes executing on core
    • Number of context swaps (voluntary, involuntary)
    • Many others
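The collection loop described above might look roughly like the following sketch. It is not the authors' test harness: the function names are hypothetical, an empty Python loop stands in for the no-op instruction, core pinning relies on os.sched_setaffinity (Linux only), and context swap counts come from resource.getrusage (Unix only). On other platforms the sketch degrades to timing alone.

```python
import os
import time

try:
    import resource  # Unix only
except ImportError:
    resource = None

def pin_to_one_core():
    """Pin this process to one allowed core if the OS supports it."""
    if hasattr(os, "sched_getaffinity") and hasattr(os, "sched_setaffinity"):
        try:
            core = min(os.sched_getaffinity(0))
            os.sched_setaffinity(0, {core})
            return core
        except OSError:
            return None  # not permitted in this environment
    return None  # affinity control not available on this platform

def context_switches():
    """Return (voluntary, involuntary) context switch counts so far."""
    if resource is None:
        return (0, 0)
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return (usage.ru_nvcsw, usage.ru_nivcsw)

def timed_noop_run(iterations):
    """Time one no-op loop end to end and count context switches during it."""
    vol_before, invol_before = context_switches()
    start = time.perf_counter_ns()
    for _ in range(iterations):
        pass  # stand-in for the no-op instruction
    elapsed = time.perf_counter_ns() - start
    vol_after, invol_after = context_switches()
    return {
        "elapsed_ns": elapsed,
        "voluntary": vol_after - vol_before,
        "involuntary": invol_after - invol_before,
    }

if __name__ == "__main__":
    pin_to_one_core()
    runs = [timed_noop_run(10_000) for _ in range(100)]
    print("fastest run (ns):", min(r["elapsed_ns"] for r in runs))
    print("slowest run (ns):", max(r["elapsed_ns"] for r in runs))
```

The spread between the fastest and slowest runs is the kind of variation the deck attributes to sources external to the process.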

Levy Distribution

[Figures: execution time error (obs - mean), shown alongside Normal, Gumbel, and Levy distributions.]

Levy Distribution

• Truncation enables mean calculation, but requires fitting to each dataset to find where to truncate
• The truncation parameters are correlated with both the number of processes per core and the expected execution time
• A roughly linear relationship gives an approximate solution to the truncation parameters without refitting
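Why truncation matters can be seen in a short sketch: the untruncated Levy distribution has no finite mean, while the mean over a bounded interval can be computed numerically. The code below uses the standard Levy density and distribution function with location mu and scale c. The function names, the midpoint-rule integration, and the parameter values are assumptions for illustration, not the fitted values from the talk.

```python
import math

def levy_pdf(x, mu, c):
    """Levy density with location mu and scale c (zero for x <= mu)."""
    if x <= mu:
        return 0.0
    d = x - mu
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-c / (2.0 * d)) / d ** 1.5

def levy_cdf(x, mu, c):
    """Levy cumulative distribution function."""
    if x <= mu:
        return 0.0
    return math.erfc(math.sqrt(c / (2.0 * (x - mu))))

def truncated_levy_mean(mu, c, lo, hi, steps=20_000):
    """Mean of a Levy distribution truncated to [lo, hi], by midpoint rule."""
    mass = levy_cdf(hi, mu, c) - levy_cdf(lo, mu, c)  # probability kept
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += x * levy_pdf(x, mu, c) * width
    return total / mass

if __name__ == "__main__":
    # Hypothetical parameters, not values fitted to measured data.
    for upper in (10.0, 100.0):
        mean = truncated_levy_mean(mu=0.0, c=1.0, lo=0.0, hi=upper)
        print(f"truncated at {upper}: mean = {mean:.4f}")
```

The mean keeps growing as the upper truncation point moves out, which is why the truncation point has to be chosen per dataset or approximated from the linear relationship mentioned above.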

Levy Fit

[Figure: Levy fits in four panels: 1 - 5 processes, 6 - 10 processes, 11 - 15 processes, and 16 - 20 processes.]

Test Setup

[Figure: kernel A feeds kernel B through queue Q1.]

Question: Can we use an M/M/1 queueing model to estimate the mean queue occupancy of this system?

Hypothesis: Lower Kullback-Leibler (KL) divergence between expected and realized distribution is associated with higher model accuracy.
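For reference, a discrete KL divergence over histogram bins can be sketched as below. The direction of the divergence (which distribution plays P and which plays Q), the binning, and the example counts are assumptions, since the slides do not specify them.

```python
import math

def normalize(counts):
    """Turn histogram counts into probabilities."""
    total = sum(counts)
    return [count / total for count in counts]

def kl_divergence(p, q):
    """Discrete KL divergence D(P || Q) in nats.

    Bins with p[i] == 0 contribute nothing. If q[i] == 0 where
    p[i] > 0, the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total

if __name__ == "__main__":
    # Hypothetical histogram counts over the same bins, not measured data.
    realized = normalize([40, 30, 20, 10])
    expected = normalize([35, 30, 20, 15])
    print(f"KL divergence: {kl_divergence(realized, expected):.5f} nats")
```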

1. Dedicated thread of execution monitors queue occupancy

2. Calculate the estimated mean queue occupancy using the M/M/1 model

3. Calculate KL Divergence for the arrival process distribution using the truncated Levy distribution noise model
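Step 2 can be sketched with the standard M/M/1 results. The slides do not say whether "mean queue occupancy" counts the item in service, so both forms are shown. The function names and the rates in the usage example are hypothetical.

```python
def mm1_mean_in_system(arrival_rate, service_rate):
    """Mean number of items in an M/M/1 system: rho / (1 - rho)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # utilization
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")
    return rho / (1.0 - rho)

def mm1_mean_waiting(arrival_rate, service_rate):
    """Mean number waiting, excluding the one in service: rho^2 / (1 - rho)."""
    rho = arrival_rate / service_rate
    return mm1_mean_in_system(arrival_rate, service_rate) * rho

if __name__ == "__main__":
    # Hypothetical rates in items per second, not measured values.
    print("in system:", mm1_mean_in_system(1.0, 2.0))
    print("waiting:  ", mm1_mean_waiting(1.0, 2.0))
```

The estimate from step 2 is then compared against the occupancy observed by the monitoring thread in step 1.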

Convolution with Exponential
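As a rough illustration of convolving an exponential density with a noise density, the sketch below works on a uniform grid. The noise density here is a uniform stand-in chosen only to keep the example short. It is an assumption, not the truncated Levy noise model from the talk, and the grid spacing and rate are likewise hypothetical.

```python
import math

def exponential_pdf(x, rate):
    """Exponential density with the given rate (zero for x < 0)."""
    return rate * math.exp(-rate * x) if x >= 0.0 else 0.0

def convolve(f, g, dx):
    """Discrete approximation of the convolution of two sampled densities."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj * dx
    return out

dx = 0.01
grid = [i * dx for i in range(1000)]
expo = [exponential_pdf(x, rate=1.0) for x in grid]
# Hypothetical noise density: uniform on [0, 0.5), a stand-in only.
noise = [2.0 if x < 0.5 else 0.0 for x in grid[:100]]
combined = convolve(expo, noise, dx)

if __name__ == "__main__":
    print(f"area under convolved density: {sum(combined) * dx:.4f}")
```

The area under the result equals the product of the two input areas, which is a quick sanity check on the discretization.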

Conclusions

• The truncated Levy distribution can be used to approximate best case execution time variation (BCETV)
• The distribution of BCETV can be used as a tool to accept or reject a stochastic queueing model based on distributional assumptions
• KL divergence between the expected and convolved distribution highly correlates with queue model accuracy

Parting Notes

Slides available here: sbs.wustl.edu

Timer C++ template code: http://goo.gl/ItJ3jP

Test harness used to collect data: http://goo.gl/U1VG6N