
Tutorial Measuring and Modeling System Performance


Page 1: Tutorial Measuring and Modeling System Performance

Introduction to System Performance Design by Gerrit Muller, University of South-Eastern Norway-NISE

e-mail: [email protected], www.gaudisite.nl

Abstract

What is System Performance? Why should a software engineer have knowledge of the other parts of the system, such as the Hardware, the Operating System and the Middleware? The applications that he/she writes are self-contained, so how can other parts have any influence? This introduction sketches the problem and shows that at least a high-level understanding of the system is very useful in order to get optimal performance.

Distribution

This article or presentation is written as part of the Gaudí project. The Gaudí project philosophy is to improve by obtaining frequent feedback. Frequent feedback is pursued by an open creation process. This document is published as an intermediate or nearly mature version to get feedback. Further distribution is allowed as long as the document remains complete and unchanged.

September 9, 2018; status: preliminary draft; version: 0.5

[Title-slide figure: a UI process retrieves images from a store and displays them on a screen.]

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

What If....

Page 2: Tutorial Measuring and Modeling System Performance

Content of Problem Introduction

content of this presentation

Example of problem

Problem statements


Page 3: Tutorial Measuring and Modeling System Performance

Image Retrieval Performance

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

alternative application code:

event 3*3 -> show screen 3*3

<screen 3*3>
    <row 1>
        <col 1><image 1,1></col 1>
        <col 2><image 1,2></col 2>
        <col 3><image 1,3></col 3>
    </row 1>
    <row 2>
        <col 1><image 2,1></col 1>
        <col 2><image 2,2></col 2>
        <col 3><image 2,3></col 3>
    </row 2>
    <row 3>
        <col 1><image 3,1></col 1>
        <col 2><image 3,2></col 2>
        <col 3><image 3,3></col 3>
    </row 3>
</screen 3*3>

application need: at event 3*3, show 3*3 images instantaneously

Either design realizes this need: one retrieves the images one by one, the other declares the screen content as a whole.

Page 4: Tutorial Measuring and Modeling System Performance

Straightforward Read and Display

[Figure: the UI process retrieves images from a store and displays them on a screen.]

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

What If....

Page 5: Tutorial Measuring and Modeling System Performance

More Process Communication

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

What If....

[Figure: the UI process issues 9 * retrieve to the database and 9 * update to the screen server, which draws on the screen.]

Page 6: Tutorial Measuring and Modeling System Performance

Meta Information Realization Overhead

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

What If....
    each image consists of image data plus meta-information (attributes)
    1 attribute = 1 COM object
    100 attributes / image
    9 images = 900 COM objects
    1 COM object = 80 µs
    9 images = 72 ms

[Figure: image data with attached attributes; the UI process issues 9 * retrieve to the database and 9 * update to the screen server.]

Page 7: Tutorial Measuring and Modeling System Performance

I/O overhead

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

What If....
    I/O on line basis (512^2 image)
    . . .
    9 * 512 * t_I/O, with t_I/O ~= 1 ms, gives ~= 4.6 s
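Both what-if estimates are easy to replay; a minimal Python sketch (the 80 µs per COM object and the ~1 ms per line I/O are the figures assumed on the slides, not measured values):

# Back-of-the-envelope check of the two "what if" overheads above.
IMAGES = 9                    # 3 x 3 screen
ATTRIBUTES_PER_IMAGE = 100    # meta-information per image
T_COM_OBJECT = 80e-6          # seconds per COM-object access (slide figure)

com_objects = IMAGES * ATTRIBUTES_PER_IMAGE      # 900 COM objects
t_meta = com_objects * T_COM_OBJECT              # 0.072 s = 72 ms
print(f"meta-information overhead: {t_meta * 1e3:.0f} ms")

LINES_PER_IMAGE = 512         # I/O on line basis for a 512^2 image
T_IO = 1e-3                   # ~1 ms per I/O (slide figure)
t_io = IMAGES * LINES_PER_IMAGE * T_IO           # ~4.6 s
print(f"line-based I/O overhead: {t_io:.1f} s")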

Page 8: Tutorial Measuring and Modeling System Performance

Non Functional Requirements Require System View

Sample application code:

for x = 1 to 3 {
    for y = 1 to 3 {
        retrieve_image(x, y)
    }
}

can be:
    fast, but very local
    slow, but very generic
    slow, but very robust
    fast and robust
    ...

The emerging properties (behavior, performance) cannot be seen from the code itself!
The underlying platform and neighbouring functions mostly determine the emerging properties.

Page 9: Tutorial Measuring and Modeling System Performance

Function in System Context

[Figure: many functions & services (F&S) run on a stack of Middleware (MW), Operating Systems (OS), and Hardware (HW), all embedded in a usage context.]

The performance and behavior of a function depend on the realizations of the used layers, on functions in the same context, and on the usage context.

Page 10: Tutorial Measuring and Modeling System Performance

Challenge

[Figure: the same stack of Functions & Services on Middleware, Operating Systems, and Hardware as on the previous slide.]

Performance = function(F&S, other F&S, MW, OS, HW)

MW, OS, HW represent well over 100 man-years of work: very complex.

Challenge: how to understand MW, OS, HW with only a few parameters.

Page 11: Tutorial Measuring and Modeling System Performance

Summary of Problem Introduction

Resulting system characteristics cannot be deduced from local code.

Underlying platform, neighboring applications and user context:
    have a big impact on system characteristics
    are big and complex

Models require decomposition, relations and representations to analyse.

Page 12: Tutorial Measuring and Modeling System Performance

Performance Method Fundamentals by Gerrit Muller, HSN-NISE

e-mail: [email protected]

Abstract

The Performance Design Methods described in this article are based on a multi-view approach. The needs are covered by a requirements view. The system design consists of a HW block diagram, a SW decomposition, a functional design and other models dependent on the type of system. The system design is used to create a performance model. Measurements provide a way to get a quantified characterization of the system. Different measurement methods and levels are required to obtain a usable characterized system. The performance model and the characterizations are used for the performance design. The system design decisions with great performance impact are: granularity, synchronization, prioritization, allocation and resource management. Performance and resource budgets are used as tools.

The complete course ASPTM is owned by TNO-ESI. To teach this course a license from TNO-ESI is required. This material is preliminary course material.

September 9, 2018; status: draft; version: 0.2

[Title-slide figure: the incremental loop of determine requirements, model, analyse, simulate, build proto, measure, evaluate; see page 15.]

Page 13: Tutorial Measuring and Modeling System Performance

Positioning in CAFCR

[Figure: positioning in the CAFCR views: Customer objectives, Application, Functional, Conceptual, Realization. The customer What/How columns answer "What does the Customer need in the Product and Why?". Customer needs and performance expectations are diverse, complex, and fuzzy; the Functional view makes them SMART, with timing requirements and external interfaces. Models and analysis plus simulations and measurements link the Conceptual and Realization views to the execution architecture design: threads, interrupts, timers, queues, allocation, scheduling, synchronization, decoupling.]

Page 14: Tutorial Measuring and Modeling System Performance

Toplevel Performance Design Method

1A Collect most critical performance and timing requirements

1B Find system-level diagrams: HW block diagram, SW diagram, functional model(s)

2A Measure performance at 3 levels: application, functions, and micro benchmarks

2B Create performance model: concurrency model, resource model, time-line

3 Evaluate performance, identify potential problems

4 Performance analysis and design: granularity, synchronization, prioritization, allocation, resource management

Re-iterate all steps: are the right requirements addressed; refine diagrams, measurements, models, and improve design

Page 15: Tutorial Measuring and Modeling System Performance

Incremental Approach

[Figure: incremental loop: determine most important and critical requirements; model; analyse constraints and design options; simulate; build proto; measure; evaluate; analyse; repeat.]

Page 16: Tutorial Measuring and Modeling System Performance

Decomposition of System TR in HW and SW

[Figure (original by Ton Kostelijk): system timing requirements (TR) are decomposed into hardware TR and software TR along a time axis from ns via µs and ms to s. The most, and the hardest, TRs are handled by HW; new control TRs appear on the SW side.]

Page 17: Tutorial Measuring and Modeling System Performance

Quantification Steps

[Figure: quantification steps, narrowing in on a true value of 100: order of magnitude 30..300; guestimates 70..140; calibrated estimates 90..115; cycle accurate 99.999..100.001. The methods move from back-of-the-envelope feasibility estimates to measuring, analyzing, and simulating with benchmarks and spreadsheet calculations.]

Page 18: Tutorial Measuring and Modeling System Performance

Iteration

zoom in on detail

aggregate to end-to-end performance

from coarse guestimate to reliable prediction

from typical case to boundaries of requirement space

from static understanding to dynamic understanding

from steady state to initialization, state change and shut down

discover unforeseen critical requirements

improve diagrams and designs

from old system to prototype to actual implementation


Page 19: Tutorial Measuring and Modeling System Performance

Construction Decomposition

[Figure: construction decomposition of a TV: applications (view, PIP, browse, viewport, menu, adjust view, TXT) on services (audio, video, TXT, file system, networking, etc.) on toolboxes and drivers (drivers, scheduler, OS, etc.) on hardware (tuner, frame buffer, MPEG, DSP, CPU, RAM); split into a domain-specific signal-processing subsystem and a generic control subsystem.]

Page 20: Tutorial Measuring and Modeling System Performance

Functional Decomposition

[Figure: functional decomposition of an acquisition-to-display chain: acquisition, acquisition processing, compress/encoding, storage, decompress/decoding, display processing, display.]

Page 21: Tutorial Measuring and Modeling System Performance

An example of a process decomposition of an MRI scanner.

[Figure: process decomposition of an MRI scanner. Scan-control side: scan UI, scan control, acq control, recon control, xDAS, recon, db control, disk. Image-handling side: image handling UI, image handling, archiving control, media, import, export, network, display control, display device hardware. Legend: server process versus UI process.]

Page 22: Tutorial Measuring and Modeling System Performance

Combine views in Execution Architecture

[Figure: the execution architecture combines other architecture views. A functional model (input, receive, demux, process/DCT, store, display) and a SW structure (applications such as play and zap; UI toolkit; processing; menu, txt, tuner; foundation classes such as repository structure, queue, list; hardware abstraction) are mapped onto processes, tasks, threads, and interrupt handlers, running on input hardware (tuner, DVD drive) and computing hardware (CPU, DSP, RAM), driven by deadlines and timing/throughput requirements.]

execution architecture issues: concurrency, scheduling, synchronisation, mutual exclusion, priorities, granularity

Page 23: Tutorial Measuring and Modeling System Performance

Layered Benchmarking Approach

[Figure: layered benchmarking approach; every layer is characterized by typical values, interference, variation, and boundaries:
    applications: end-to-end function duration; used services; interrupts, task switches, OS services
    services/functions (e.g. network transfer, database access, database query): duration, CPU time, footprint, cache
    operating system (interrupt, task switch, OS services): duration, footprint, locality, density, efficiency, overhead; measured with tools
    (computing) hardware (CPU, cache, memory, bus, ...): latency, bandwidth, efficiency]

Page 24: Tutorial Measuring and Modeling System Performance

Micro Benchmarks

Micro-benchmark examples, ranging from infrequent, often time-intensive operations to often repeated operations:

high-level construction: component creation, component destruction
low-level construction: object creation, object destruction, method invocation (same scope, other context)
database: start session, finish session, perform transaction, query, transfer data
network, I/O: open connection, close connection, transfer data
basic programming: function call, loop overhead, basic operations (add, mul, load, store)
OS: memory allocation, memory free, task/thread creation, OS task switch, interrupt response
HW: cache flush, low-level data transfer, power up, power down, boot
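As an illustration of the lowest layer, a minimal micro-benchmark sketch in Python (the operations timed here are arbitrary examples; timeit is the standard-library micro-benchmark helper):

import timeit

def empty():
    pass    # measures pure function-call overhead

# loop overhead: an empty loop body, cost per iteration
t_loop = timeit.timeit("for _ in range(1000): pass", number=10_000) / (10_000 * 1000)
# function-call overhead, cost per call
t_call = timeit.timeit(empty, number=1_000_000) / 1_000_000
# a basic operation: add
t_add = timeit.timeit("a + b", setup="a, b = 1, 2", number=1_000_000) / 1_000_000

print(f"loop ~{t_loop * 1e9:.0f} ns, call ~{t_call * 1e9:.0f} ns, add ~{t_add * 1e9:.0f} ns")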

Page 25: Tutorial Measuring and Modeling System Performance

Modeling and Analysis Fundamentals of Technology by Gerrit Muller, University of South-Eastern Norway-NISE

e-mail: [email protected]

Abstract

This presentation shows fundamental elements for models that are ICT-technology related. Basic hardware functions are discussed: storage, communication and computing, with fundamental characteristics such as throughput, latency, and capacity. A system is built by layers of software on top of hardware. The problem statement is how to reason about system properties when the system consists of many layers of hardware and software.

Distribution

This article or presentation is written as part of the Gaudí project. The Gaudí project philosophy is to improve by obtaining frequent feedback. Frequent feedback is pursued by an open creation process. This document is published as an intermediate or nearly mature version to get feedback. Further distribution is allowed as long as the document remains complete and unchanged.

September 9, 2018; status: preliminary draft; version: 0.5

[Title-slide figure: random data processing performance versus data set size; see page 30.]

Page 26: Tutorial Measuring and Modeling System Performance

Presentation Content Fundamentals of Technology

content of this presentation

generic layering and block diagrams

typical characteristics and concerns

figures of merit

example of picture caching in web shop application


Page 27: Tutorial Measuring and Modeling System Performance

What do We Need to Analyze?

required analysis: how do parameters in the design space result in the relevant non-functional requirements (NFRs) of the system design?

relevant non-functional requirements, for example:
    latency: time from start to finish
    throughput: amount of information transferred or processed per time
    footprint (size): amount of data & code stored
plus their working range, dependencies, realization variability, and scalability.

parameters in the design space, for example:
    message format (e.g. XML)
    network medium (e.g. Ethernet, ISDN)
    communication protocol (e.g. HTTPS, TCP)

Page 28: Tutorial Measuring and Modeling System Performance

Typical Block Diagram and Typical Resources

[Figure: typical block diagram: database server, network, web server, network, three clients, each with a screen. Legend: presentation, computation, communication, storage.]

Page 29: Tutorial Measuring and Modeling System Performance

Hierarchy of Storage Technology Figures of Merit

fast, volatile:
    processor cache L1: latency sub-ns, capacity n kB
    processor cache L2, L3: latency ns, capacity n MB
    main memory: latency tens of ns, capacity n GB

archival, persistent:
    disks: latency ms, capacity n*100 GB
    disk arrays, disk farms: capacity n*10 TB
    robotized media (optical media, tape): latency >s, capacity n PB

Page 30: Tutorial Measuring and Modeling System Performance

Performance as Function of Data Set Size

[Figure: random data processing performance in ops/s (10^3..10^9) as a function of data set size in bytes (10^3..10^15); performance steps down from L1 cache via L3 cache, main memory, hard disk, and disk farm to robotized media.]

Page 31: Tutorial Measuring and Modeling System Performance

Communication Technology Figures of Merit

[Figure: communication figures of merit per interconnect level: on chip: latency sub-ns, frequency n GHz, distance n mm; PCB-level connection: latency n ns, frequency n GHz, distance n mm..n cm; serial I/O: latency tens of ns, frequency n 100 MHz, distance n m; LAN network: latency n ms, frequency n 100 MHz, distance n km; WAN: latency n 10 ms, distance global.]

Page 32: Tutorial Measuring and Modeling System Performance

Multiple Layers of Caching

[Figure: back office server, network, mid office server, network, three clients with screens; caching occurs at multiple layers in every node.]

cache layer                 cache miss penalty    cache hit performance
network layer cache         1 s                   10 ms
file cache                  100 ms                1 ms
virtual memory              10 ms                 100 ns
application cache           1 ms                  10 µs
memory caches L1, L2, L3    100 ns                1 ns

a typical cache is 2 orders of magnitude faster on a hit than on a miss
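The two-orders-of-magnitude rule translates directly into an expected access time; a minimal sketch, assuming a 95% hit ratio (an illustrative number) and the figure's file-cache values:

def avg_access_time(hit_ratio: float, t_hit: float, t_miss: float) -> float:
    """Expected access time for a single cache layer."""
    return hit_ratio * t_hit + (1.0 - hit_ratio) * t_miss

# file cache from the figure: hit ~1 ms, miss penalty ~100 ms
t = avg_access_time(0.95, 1e-3, 100e-3)
print(f"expected access time: {t * 1e3:.1f} ms")   # ~6 ms: the misses dominate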

Page 33: Tutorial Measuring and Modeling System Performance

Why Caching?

Why caching? Design parameters and their effects:
    caching algorithm: keep the frequently used subset, to limit storage needs so they fit in fast local storage instead of long-latency mass storage
    storage location: local storage gives low latency and less communication, instead of long-latency communication
    cache size: fast storage gives low latency
    chunk size: larger chunks pay the latency penalty and communication overhead once
    format: in (pre)processed format, resource-intensive processing is done once

The benefits land on performance (response time), life-cycle cost, and project risk.

Page 34: Tutorial Measuring and Modeling System Performance

Example Web Shop

[Figure: web shop deployment: database server, web server, and clients with screens, connected by networks. Back office data: product descriptions, logistics, ERP, customer relations, financial. Enterprise functions: exhibit products, sales & order intake, order handling, stock handling, financial bookkeeping, update catalogue, advertize, after-sales support, customer relation management, logistics, finance, product management, customer management. Consumer functions: browse products, order, pay, track.]

Page 35: Tutorial Measuring and Modeling System Performance

Impact of Picture Cache

[Figure: the same web shop, now with a picture cache in the mid office server: fast response at the clients; less load and less server costs in the back office (product descriptions, logistics, ERP, customer relations, financial) and on the networks.]

Page 36: Tutorial Measuring and Modeling System Performance

Risks of Caching

Risks of caching, per design choice:
    in (pre)processed format: robustness for application changes; ability to benefit from technology improvements
    larger chunks: robustness for changing context (e.g. scalability)
    local storage, fast storage, frequently used subset: failure modes in exceptional user space; robustness for concurrent applications

The impact shows up as project risk (cost, effort), performance, and life-cycle cost and effort.

Page 37: Tutorial Measuring and Modeling System Performance

Summary

Conclusions:
    Technology characteristics can be discontinuous.
    Caches are an example to work around discontinuities.
    Caches introduce complexity and decrease transparency.

Techniques, Models, Heuristics of this module:
    Generic block diagram: Presentation, Computation, Communication and Storage
    Figures of merit
    Local reasoning (e.g. cache example)

Page 38: Tutorial Measuring and Modeling System Performance

Modeling and Analysis: Measuring by Gerrit Muller, University of South-Eastern Norway-NISE

e-mail: [email protected]

Abstract

This presentation addresses the fundamentals of measuring: what and how to measure, the impact of context and experiment on the measurement, measurement errors, validation of the result against expectations, and analysis of variation and credibility.

Distribution

This article or presentation is written as part of the Gaudí project. The Gaudí project philosophy is to improve by obtaining frequent feedback. Frequent feedback is pursued by an open creation process. This document is published as an intermediate or nearly mature version to get feedback. Further distribution is allowed as long as the document remains complete and unchanged.

September 9, 2018; status: preliminary draft; version: 1.2

[Title-slide figure: measurement error of a system under study; see page 54.]

Page 39: Tutorial Measuring and Modeling System Performance

Presentation Content

content

What and How to measure

Impact of experiment and context on measurement

Validation of results, a.o. by comparing with expectation

Consolidation of measurement data

Analysis of variation and analysis of credibility


Page 40: Tutorial Measuring and Modeling System Performance

Measuring Approach: What and How

what:
    1. What do we need to know? (purpose)
    2. Define quantity to be measured. (initial model)
    3. Define required accuracy.

how:
    4A. Define the measurement circumstances, e.g. by use cases.
    4B. Determine expectation. (historic data or estimation)
    4C. Define measurement set-up.
    5. Determine actual accuracy. (uncertainties, measurement error)
    6. Start measuring.
    7. Perform sanity check: expectation versus actual outcome.

iterate

Page 41: Tutorial Measuring and Modeling System Performance

1. What do We Need? Example Context Switching

[Figure: a test program on top of the operating system (VxWorks) on top of the (computing) hardware: ARM9, 200 MHz CPU, 100 MHz bus.]

What: context switch time of VxWorks running on ARM9, for:
    estimation of total lost CPU time due to context switching
    guidance of concurrency design and task granularity

Page 42: Tutorial Measuring and Modeling System Performance

2. Define Quantity by Initial Model

What (original): context switch time of VxWorks running on ARM9

What (more explicit): the amount of lost CPU time due to context switching, on VxWorks running on ARM9, on a heavily loaded CPU

[Figure: timeline of Process 1, Process 2, and the Scheduler: t_p1,before; p2 pre-empts p1; t_scheduler; t_p2; p1 resumes; t_scheduler; t_p1,after. Compared with t_p1,no switching, the lost CPU time (legend) is t_context_switch = t_p1,loss + t_p2,loss, each loss including the scheduler time t_scheduler.]

Page 43: Tutorial Measuring and Modeling System Performance

3. Define Required Accuracy

The purpose drives the required accuracy: ~10%.

Estimation of the total lost CPU time due to context switching = cost of a context switch (depends on OS and HW) * number of context switches (depends on application).

This estimate guides the concurrency design and the task granularity.

Page 44: Tutorial Measuring and Modeling System Performance

Intermezzo: How to Measure CPU Time?

HW timer, read via I/O with a logic analyzer / oscilloscope:
    high resolution (~10 ns)
    cope with limitations: duration (16/32-bit counter), requires timer access
    requires HW instrumentation

OS timer:
    low resolution (~µs..ms)
    easy access
    lots of instrumentation

Page 45: Tutorial Measuring and Modeling System Performance

4A. Define the Measurement Set-up

experimental set-up: processes P1 and P2 only, no other CPU activities; each pre-emption causes a cache flush

real world: many concurrent processes, with # instructions >> I-cache and # data >> D-cache

Mimic the relevant real-world characteristics.

Page 46: Tutorial Measuring and Modeling System Performance

4B. Case: ARM9 Hardware Block Diagram

[Figure: ARM9 hardware block diagram: on the chip, a 200 MHz CPU with instruction cache and data cache (cache line size: 8 32-bit words) on an on-chip bus; on the PCB, memory connected via a 100 MHz memory bus.]

Page 47: Tutorial Measuring and Modeling System Performance

Key Hardware Performance Aspect

[Figure: memory access time in case of a cache miss: after the memory request, the data memory response starts after 22 cycles; transferring the full cache line (words 1..8) takes 38 cycles in total. At 200 MHz (5 ns cycle) a cache miss costs 38 * 5 ns = 190 ns.]

Page 48: Tutorial Measuring and Modeling System Performance

OS Process Scheduling Concepts

[Figure: OS process scheduling states: create -> New -> Ready; Scheduler dispatch: Ready -> Running; interrupt: Running -> Ready; Wait (I/O / event): Running -> Waiting; I/O or event completion: Waiting -> Ready; exit: Running -> Terminated.]

Page 49: Tutorial Measuring and Modeling System Performance

Determine Expectation

input data HW:
    t_ARM_instruction = 5 ns
    t_memory_access = 190 ns

simple SW model of a context switch:
    save state P1
    determine next runnable task
    update scheduler administration
    load state P2
    run P2

Estimate how many instructions and memory accesses are needed per context switch; calculate the estimated time needed per context switch.

Page 50: Tutorial Measuring and Modeling System Performance

Determine Expectation Quantified

input data HW:
    t_ARM_instruction = 5 ns
    t_memory_access = 190 ns

simple SW model of a context switch, with estimated counts:

step                               memory accesses    instructions
save state P1                      1                  10
determine next runnable task       1                  20
update scheduler administration    1                  10
load state P2                      1                  10
run P2                             2                  50
total                              6                  100

6 * 190 ns = 1140 ns; 100 * 5 ns = 500 ns; together 1640 ns.

Rounding up (as margin) gives expected t_context_switch = 2 µs.
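The expectation is simple enough to script; a minimal sketch using the step estimates above:

T_INSTR = 5e-9      # ARM9 instruction time
T_MEM = 190e-9      # memory access on a cache miss

# (step, memory accesses, instructions), estimates from the slide
steps = [
    ("save state P1",                   1, 10),
    ("determine next runnable task",    1, 20),
    ("update scheduler administration", 1, 10),
    ("load state P2",                   1, 10),
    ("run P2",                          2, 50),
]

mem = sum(m for _, m, _ in steps)       # 6 accesses -> 1140 ns
instr = sum(i for _, _, i in steps)     # 100 instructions -> 500 ns
t_expected = mem * T_MEM + instr * T_INSTR
print(f"expected context switch: {t_expected * 1e9:.0f} ns")  # 1640 ns, round up to 2 µs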

Page 51: Tutorial Measuring and Modeling System Performance

4C. Code to Measure Context Switch

[Figure: Task 1 and Task 2 ping-pong; each iteration in both tasks executes: Cache Flush, Time Stamp Begin, Context Switch, Time Stamp End.]

Page 52: Tutorial Measuring and Modeling System Performance

Measuring Task Switch Time

[Figure: measuring task switch time: a timeline over Process 1, Process 2, and the Scheduler: Start Cache Flush, Time Stamp Begin, Context switch (through the Scheduler), Time Stamp End, repeated back and forth between the two processes.]

Page 53: Tutorial Measuring and Modeling System Performance

Understanding: Impact of Context Switch

[Figure (based on a diagram by Ton Kostelijk): clock cycles per instruction (CPI, 1..3) over time for Task 1 and Task 2: after each scheduler-mediated task switch, the CPI is temporarily elevated until the caches are warm again.]

Page 54: Tutorial Measuring and Modeling System Performance

5. Accuracy: Measurement Error

[Figure: the system under study feeds a measurement instrument; noise, resolution, calibration offset, and instrument characteristics turn the measured signal into a value with measurement error +ε1/−ε2.]

Measurements have stochastic variations and systematic deviations, resulting in a range rather than a single value.

Page 55: Tutorial Measuring and Modeling System Performance

Accuracy 2: Be Aware of Error Propagation

t_duration = t_end - t_start

t_start = 10 +/- 2 µs
t_end = 14 +/- 2 µs
t_duration = 4 +/- ? µs

systematic errors: add linearly
stochastic errors: add quadratically
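In formula form, a minimal worked example (the 2.8 µs value assumes the two stochastic errors are independent):

\varepsilon_{\mathrm{systematic}} = \varepsilon_{\mathrm{start}} + \varepsilon_{\mathrm{end}} = 2 + 2 = 4\,\mu\mathrm{s}
\qquad
\varepsilon_{\mathrm{stochastic}} = \sqrt{\varepsilon_{\mathrm{start}}^2 + \varepsilon_{\mathrm{end}}^2} = \sqrt{2^2 + 2^2} \approx 2.8\,\mu\mathrm{s}

So t_duration = 4 +/- 4 µs in the worst (fully systematic) case, and 4 +/- 2.8 µs for independent stochastic errors.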

Page 56: Tutorial Measuring and Modeling System Performance

Intermezzo Modeling Accuracy

Measurements have stochastic variations and systematic deviations, resulting in a range rather than a single value.

The inputs of modeling ("facts", assumptions, and measurement results) also have stochastic variations and systematic deviations.

Stochastic variations and systematic deviations propagate (add, amplify or cancel) through the model, resulting in an output range.

Page 57: Tutorial Measuring and Modeling System Performance

6. Actual ARM Figures

t_context_switch on an ARM9 at 200 MHz, as function of cache use:

cache setting        t_context_switch
from cache           2 µs
after cache flush    10 µs
cache disabled       50 µs

Page 58: Tutorial Measuring and Modeling System Performance

7. Expectation versus Measurement

expected: t_context_switch = 2 µs (from the simple SW model: 6 memory accesses = 1140 ns plus 100 instructions = 500 ns, 1640 ns rounded up)

measured: t_context_switch = 10 µs

How to explain? Potentially missing in the expectation:
    memory accesses due to instructions: ~10 instruction memory accesses ~= 2 µs
    memory management (MMU context)
    complex process model (parents, permissions)
    bookkeeping, e.g. performance data
    layering (function calls, stack handling)
    the combination of the above issues

However, the measurement seems to make sense.

Page 59: Tutorial Measuring and Modeling System Performance

Conclusion Context Switch Overhead

t_overhead = n_context_switch * t_context_switch

n_context_switch (s^-1)    t_overhead    CPU load overhead
with t_context_switch = 2 µs:
    500                    1 ms          0.1%
    5000                   10 ms         1%
    50000                  100 ms        10%
with t_context_switch = 10 µs:
    500                    5 ms          0.5%
    5000                   50 ms         5%
    50000                  500 ms        50%
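A minimal sketch that reproduces the table:

def cpu_overhead(n_switches_per_s: float, t_switch: float) -> float:
    """Fraction of CPU time lost to context switching."""
    return n_switches_per_s * t_switch

for t_cs in (2e-6, 10e-6):
    for n in (500, 5_000, 50_000):
        print(f"t_cs={t_cs * 1e6:.0f} µs, n={n:>6}/s -> {cpu_overhead(n, t_cs):.1%}")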

Page 60: Tutorial Measuring and Modeling System Performance

Summary Context Switching on ARM9

goal of measurement

Guidance of concurrency design and task granularity

Estimation of context switching overhead

Cost of context switch on given platform

examples of measurement

Needed: context switch overhead ~10% accurate

Measurement instrumentation: HW pin and small SW test program

Simple models of HW and SW layers

Measurement results for context switching on ARM9


Page 61: Tutorial Measuring and Modeling System Performance

Summary Measuring Approach

Conclusions

Measurements are an important source of factual data.

A measurement requires a well-designed experiment.

Measurement error, validation of the result determine the credibility.

Lots of consolidated data must be reduced to essential understanding.

Techniques, Models, Heuristics of this module

experimentation

error analysis

estimating expectations


Page 62: Tutorial Measuring and Modeling System Performance

Colophon

This work is derived from the EXARCH course at CTT, developed by Ton Kostelijk (Philips) and Gerrit Muller. The Boderc project contributed to the measurement approach. Especially the work of Peter van den Bosch (Océ), Oana Florescu (TU/e), and Marcel Verhoef (Chess) has been valuable.

Page 63: Tutorial Measuring and Modeling System Performance

Modeling and Analysis: Budgeting by Gerrit Muller, TNO-ESI, HSN-NISE

e-mail: [email protected]

Abstract

This presentation addresses the fundamentals of budgeting: what a budget is, how to create and use a budget, what types of budgets there are, and what the relation is with modeling and measuring.

Distribution

This article or presentation is written as part of the Gaudí project. The Gaudí project philosophy is to improve by obtaining frequent feedback. Frequent feedback is pursued by an open creation process. This document is published as an intermediate or nearly mature version to get feedback. Further distribution is allowed as long as the document remains complete and unchanged.

September 9, 2018; status: preliminary draft; version: 1.0

[Title-slide figure: budget-based design flow; see page 67.]

Page 64: Tutorial Measuring and Modeling System Performance

Budgeting

content of this presentation

What and why of a budget

How to create a budget (decomposition, granularity, inputs)

How to use a budget


Page 65: Tutorial Measuring and Modeling System Performance

What is a Budget?

A budget is a quantified instantiation of a model.

A budget can prescribe or describe the contributions by parts of the solution to the system quality under consideration.

Page 66: Tutorial Measuring and Modeling System Performance

Why Budgets?

• to make the design explicit

• to provide a baseline to take decisions

• to specify the requirements for the detailed designs

• to have guidance during integration

• to provide a baseline for verification

• to manage the design margins explicitly


Page 67: Tutorial Measuring and Modeling System Performance

Visualization of Budget Based Design Flow

[Figure: budget-based design flow. Measurements on the existing system (micro benchmarks, aggregated functions, applications) and estimates/simulations of the design feed a model, e.g. T_total = T_proc + T_disp with T_proc = t_proc + t_over and T_disp = t_disp + t_over; the model can be more complex than additions. The model plus the specification (SRS: t_boot 0.5 s, t_zap 0.2 s) form the budget, e.g. t_proc 10 + t_over 20 = T_proc 30, t_disp 5 + t_over 20 = T_disp 25, T_total 55. Measurements on the new (proto) system (micro benchmarks, aggregated functions, applications, profiles, traces) feed back into budget and design for tuning.]

Page 68: Tutorial Measuring and Modeling System Performance

Stepwise Budget Based Design Flow

step                                                   example
1A measure old systems                                 micro-benchmarks, aggregated functions, applications
1B model the performance, starting with old systems   flow model and analytical model
1C determine requirements for the new system           response time or throughput
2  make a design for the new system                    explore design space, estimate and simulate
3  make a budget for the new system                    models provide the structure; measurements and estimates provide initial numbers; the specification provides the bottom line
4  measure prototypes and the new system               micro-benchmarks, aggregated functions, applications, profiles, traces
5  iterate steps 1B to 4

Page 69: Tutorial Measuring and Modeling System Performance

Budgets Applied on Waferstepper Overlay

[Figure: overlay budget tree of a waferstepper, in nm:]

process overlay 80
    reticule 15
    process dependency sensor 5
    matched machine 60
        matching accuracy 5
        lens matching 25
        single machine 30
            global alignment accuracy 6
                stage Al. pos. meas. accuracy 4
                off axis pos. meas. accuracy 4
            stage overlay 12
                position accuracy 7
                    interferometer stability 1
                frame stability 2.5
                tracking error phi 75 nrad
                tracking error X, Y 2.5
                    tracking error WS 2
                    tracking error RS 1
            stage grid accuracy 5
            system adjustment accuracy 2
            metrology stability 5
            alignment repro 5
                blue align sensor repro 3
                off axis sensor repro 3

Page 70: Tutorial Measuring and Modeling System Performance

Budgets Applied on Medical Workstation Memory Use

memory budget in Mbytes:

                           obj data   bulk data   code    total
shared code                                       11.0    11.0
User Interface process     3.0        12.0        0.3     15.3
database server            3.2        3.0         0.3     6.5
print server               1.2        9.0         0.3     10.5
optical storage server     2.0        1.0         0.3     3.3
communication server       2.0        4.0         0.3     6.3
UNIX commands              0.2        0           0.3     0.5
compute server             0.5        6.0         0.3     6.8
system monitor             0.5        0           0.3     0.8
application SW total       12.6       35.0        13.4    61.0
UNIX Solaris 2.x                                          10.0
file cache                                                3.0
total                                                     74.0
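Budgets like this live or die by consistency; a minimal sketch that re-checks the row and column sums of the table above:

# (process: obj data, bulk data, code) in Mbytes, from the table
budget = {
    "User Interface process": (3.0, 12.0, 0.3),
    "database server":        (3.2,  3.0, 0.3),
    "print server":           (1.2,  9.0, 0.3),
    "optical storage server": (2.0,  1.0, 0.3),
    "communication server":   (2.0,  4.0, 0.3),
    "UNIX commands":          (0.2,  0.0, 0.3),
    "compute server":         (0.5,  6.0, 0.3),
    "system monitor":         (0.5,  0.0, 0.3),
}
SHARED_CODE, UNIX_SOLARIS, FILE_CACHE = 11.0, 10.0, 3.0

app_total = sum(sum(row) for row in budget.values()) + SHARED_CODE
assert abs(app_total - 61.0) < 0.01                # application SW total
grand_total = app_total + UNIX_SOLARIS + FILE_CACHE
print(f"grand total: {grand_total:.1f} Mbytes")    # 74.0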

Page 71: Tutorial Measuring and Modeling System Performance

Power Budget Visualization for Document Handler

[Figure: power budget visualization for a document handler: blocks for scanner and feeder, UI and control, paper path, procedé, finisher, paper input module, power supplies, and cooling are drawn twice: once with size proportional to power, and once in the physical layout.]

Page 72: Tutorial Measuring and Modeling System Performance

Alternative Power Visualization

[Figure: alternative power visualization: electrical power flows from the power supplies to UI and control, paper path, paper input module, finisher, and procedé, and leaves the system as heat, removed by cooling.]

Page 73: Tutorial Measuring and Modeling System Performance

Evolution of Budget over Time

over time (start later steps only if needed):

fact finding through details
aggregate to end-to-end performance
search for appropriate abstraction level(s)
from coarse guesstimate to reliable prediction
from typical case to boundaries of requirement space
from static understanding to dynamic understanding
from steady state to initialization, state change and shut down
from old system to prototype to actual implementation

Page 74: Tutorial Measuring and Modeling System Performance

Potential Applications of Budget based design

• resource use (CPU, memory, disk, bus, network)

• timing (response, latency, start up, shutdown)

• productivity (throughput, reliability)

• Image Quality parameters (contrast, SNR, deformation, overlay, DOF)

• cost, space, time


Page 75: Tutorial Measuring and Modeling System Performance

What kind of budget is required?

Choose per dimension:
    static or dynamic
    typical case or worst case
    global or detailed
    approximate or accurate

Is the budget based on wish, empirical data, extrapolation, educated guess, or expectation?

Page 76: Tutorial Measuring and Modeling System Performance

Summary of Budgeting

A budget is a quantified instantiation of a model.

A budget can prescribe or describe the contributions by parts of the solution to the system quality under consideration.

A budget uses a decomposition in tens of elements.

The numbers are based on historic data, user needs, first principles and measurements.

Budgets are based on models and estimations.

Budget visualization is critical for communication.

Budgeting requires an incremental process.

Many types of budgets can be made; start simple!

Page 77: Tutorial Measuring and Modeling System Performance

Colophon

The Boderc project contributed to Budget Based Design. Especially the work of Hennie Freriks, Peter van den Bosch (Océ), Heico Sandee and Maurice Heemels (TU/e, ESI) has been valuable.

Page 78: Tutorial Measuring and Modeling System Performance

Formula Based Performance Design by Gerrit Muller, HSN-NISE

e-mail: [email protected]

Abstract

Performance models are mostly simple mathematical formulas. The challenge is to model the performance at an appropriate level. In this presentation we introduce several levels of modeling, labeled zeroth order, first order, second order, et cetera. As illustration we use the performance of MRI reconstruction.

The complete course ASPTM is owned by TNO-ESI. To teach this course a license from TNO-ESI is required. This material is preliminary course material.

September 9, 2018; status: draft; version: 1.0

Page 79: Tutorial Measuring and Modeling System Performance

Theory Block: n Order Formulas

0th order: model the main function, with the parameters relevant for the main function; gives an order of magnitude

1st order: add overhead and/or secondary function(s); gives an estimation

2nd order: add interference effects between main function, overhead and/or secondary functions, and circumstances; more accurate, adds understanding

Page 80: Tutorial Measuring and Modeling System Performance

CPU Time Formula Zero Order

t_cpu_total = t_cpu_processing + t_UI

t_cpu_processing = n_x * n_y * t_pixel

Page 81: Tutorial Measuring and Modeling System Performance

CPU Time Formula First Order

t_cpu_total = t_cpu_processing + t_UI + t_context_switch_overhead

Page 82: Tutorial Measuring and Modeling System Performance

CPU Time Formula Second Order

t_cpu_total = t_cpu_processing + t_UI + t_context_switch_overhead
              + t_stall_due_to_context_switching + t_stall_due_to_cache_efficiency

cache efficiency: signal processing: high efficiency; control processing: low/medium efficiency
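The three model orders as code, a minimal sketch (names and parameter values are illustrative; real values come from measurements):

def t_cpu_total(nx, ny, t_pixel, t_ui, order=2,
                t_cs_overhead=0.0, t_stall_cs=0.0, t_stall_cache=0.0):
    """CPU time formula at increasing model order."""
    t = nx * ny * t_pixel + t_ui            # 0th order: main function
    if order >= 1:
        t += t_cs_overhead                  # 1st order: add overhead
    if order >= 2:
        t += t_stall_cs + t_stall_cache     # 2nd order: interference effects
    return t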

Page 83: Tutorial Measuring and Modeling System Performance

Case MRI Reconstruction

MRI reconstruction

"Test" of performance model on another case

Scope of performance and significance of impact


Page 84: Tutorial Measuring and Modeling System Performance

MR Reconstruction Context

[Figure: MR reconstruction context: magnet, gradients, RF and ADC feed the Reconstruction; the Host handles Acquisition, Storage, Viewing & Printing, and control.]

Page 85: Tutorial Measuring and Modeling System Performance

MR Reconstruction Performance Zero Order

t_recon = n_raw-x * t_fft(n_raw-y) + n_y * t_fft(n_raw-x)

t_fft(n) = c_fft * n * log(n)

[Figure: processing chain: filter (n_raw-x * n_raw-y) -> FFT over n_raw-x columns of length n_raw-y -> FFT over n_y rows of length n_raw-x -> corrections (n_x * n_y).]

Page 86: Tutorial Measuring and Modeling System Performance

Zero Order Quantitative Example

Typical FFT, 1k points ~ 5 msec (scales with 2 * n * log(n))

n_raw-x = 512, n_raw-y = 256, n_x = 256, n_y = 256

using t_recon = n_raw-x * t_fft(n_raw-y) + n_y * t_fft(n_raw-x):

t_recon ~= 512 * 1.2 ms + 256 * 2.4 ms ~= 1.2 s
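A minimal sketch of the zero-order model (c_fft is calibrated so that a 1k-point FFT costs ~5 ms, as on the slide; log base 2 is an assumption):

from math import log2

def t_fft(n, c_fft):
    return c_fft * n * log2(n)

c_fft = 5e-3 / (1024 * log2(1024))     # calibrate: t_fft(1024) = 5 ms

def t_recon(n_raw_x, n_raw_y, n_y):
    # column FFTs over the raw data, then row FFTs over the output rows
    return n_raw_x * t_fft(n_raw_y, c_fft) + n_y * t_fft(n_raw_x, c_fft)

print(f"{t_recon(512, 256, 256):.2f} s")   # ~1.1 s; the slide rounds the FFT times up and arrives at ~1.2 s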

Page 87: Tutorial Measuring and Modeling System Performance

MR Reconstruction Performance First Order

t_recon = t_filter(n_raw-x, n_raw-y)
          + n_raw-x * t_fft(n_raw-y)
          + n_y * t_fft(n_raw-x)
          + t_corrections(n_x, n_y)

t_fft(n) = c_fft * n * log(n)

[Figure: the same chain of filter, FFT, FFT, corrections; the filter and corrections steps are now included in the formula.]

Page 88: Tutorial Measuring and Modeling System Performance

First Order Quantitative Example

Typical FFT, 1k points ~ 5 msec (scales with 2 * n * log(n))

Filter, 1k points ~ 2 msec (scales linearly with n)

Correction ~ 2 msec (scales linearly with n)

Page 89: Tutorial Measuring and Modeling System Performance

MR Reconstruction Performance Second Order

t_recon = t_filter(n_raw-x, n_raw-y)
          + n_raw-x * ( t_fft(n_raw-y) + t_col-overhead )
          + n_y * ( t_fft(n_raw-x) + t_row-overhead )
          + t_corrections(n_x, n_y)
          + t_control-overhead

t_fft(n) = c_fft * n * log(n)

[Figure: the same chain of filter, FFT, FFT, corrections, now with per-column, per-row, and control overhead.]

Page 90: Tutorial Measuring and Modeling System Performance

Second Order Quantitative Example

Typical FFT, 1k points ~ 5 msec (scales with 2 * n * log(n))

Filter, 1k points ~ 2 msec (scales linearly with n)

Correction ~ 2 msec (scales linearly with n)

Control overhead = n_y * t_row_overhead, with t_row_overhead = 10 .. 100 µs

Page 91: Tutorial Measuring and Modeling System Performance

MR Reconstruction Performance Third Order

t_recon = t_read_I/O + t_write_I/O + t_transpose
          + t_filter(n_raw-x, n_raw-y)
          + n_raw-x * ( t_fft(n_raw-y) + t_col-overhead )
          + n_y * ( t_fft(n_raw-x) + t_row-overhead )
          + t_corrections(n_x, n_y)
          + t_control-overhead

number crunching: filter, FFT, and correction computations

overhead: read I/O, write I/O, transpose, malloc/free, bookkeeping, and the column, row, and control overhead

Focus on overhead: reduction of overhead is more important than faster algorithms. This is not an excuse for sloppy algorithms.

Page 92: Tutorial Measuring and Modeling System Performance

Summary Case MRI Reconstruction

MRI reconstruction: system performance may be determined by other factors than the standard facts, e.g. more by overhead and I/O than by the optimized core processing.

==> Identify & measure what is performance-critical in the application.

Page 93: Tutorial Measuring and Modeling System Performance

Physical Models of an Elevator by Gerrit Muller, University of South-Eastern Norway-NISE

e-mail: [email protected]

Abstract

An elevator is used as a simple system to model a few physical aspects. We will show simple kinematic models and we will consider energy consumption. These low-level models are used to understand (physical) design considerations. Elsewhere we discuss higher-level models, such as use cases and throughput, which complement these low-level models.

Distribution

This article or presentation is written as part of the Gaudí project. The Gaudí project philosophy is to improve by obtaining frequent feedback. Frequent feedback is pursued by an open creation process. This document is published as an intermediate or nearly mature version to get feedback. Further distribution is allowed as long as the document remains complete and unchanged.

September 9, 2018; status: preliminary draft; version: 0.4

[Title-slide figure: the first-order model of moving one floor, with coarse second-order correction; see page 105.]

Page 94: Tutorial Measuring and Modeling System Performance

Learning Goals

To understand the need for:
    · various views, e.g. physical, functional, performance
    · mathematical models
    · quantified understanding
    · assumptions (when input data is unavailable yet) and later validation
    · various visualizations, e.g. graphs
    · understanding, and hence modeling, at multiple levels of abstraction
    · starting simple and expanding in detail, views, and solutions gradually, based on increased insight

To see the value and the limitations of these conceptual models.

To appreciate the complementarity of conceptual models to other forms of modeling, e.g. problem-specific models (e.g. structural or thermal analysis), SysML models, or simulations.

Page 95: Tutorial Measuring and Modeling System Performance

Warning

This presentation starts with a trivial problem. Have patience!

Extensions to the trivial problem are used to illustrate many different modeling aspects.

Feedback on correctness and validity is appreciated.

Page 96: Tutorial Measuring and Modeling System Performance

The Elevator in the Building

[Figure: a 40 m building with an elevator to the top floor.]

Inhabitants want to reach their destination fast and comfortably.

The building owner and the service operator have economic constraints: space, cost, energy, ...

Page 97: Tutorial Measuring and Modeling System Performance

Elementary Kinematic Formulas

S_t = position at time t
v_t = velocity at time t
a_t = acceleration at time t
j_t = jerk at time t

v = dS/dt        a = dv/dt        j = da/dt

Position in case of uniform acceleration:

S_t = S_0 + v_0*t + 1/2 * a_0*t^2

Page 98: Tutorial Measuring and Modeling System Performance

Initial Expectations

[Figure: the 40 m building with elevator and top floor.]

v_max = maximum velocity
a_max = maximum acceleration
j_max = maximum jerk
t_top_floor = time to reach the top floor

What values do you expect or prefer for these quantities? Why?

Page 99: Tutorial Measuring and Modeling System Performance

Initial Estimates via Googling

Google "elevator" and "jerk":

v_max ~= 2.5 m/s (relates to motor design and energy consumption)
a_max ~= 1.2 m/s^2 (up) (12% of gravity; weight goes up)
j_max ~= 2.5 m/s^3 (humans feel changes of forces; high jerk values are uncomfortable; relates to control design)

t_top_floor ~= 16 s

numbers from: http://www.sensor123.com/vm_eva625.htm (CEP Instruments Pte Ltd, Singapore)

Page 100: Tutorial Measuring and Modeling System Performance

Exercise Time to Reach Top Floor Kinematic

input data:
    S_t = 40 m
    v_max = 2.5 m/s
    a_max = 1.2 m/s^2 (up)
    j_max = 2.5 m/s^3
    S_0 = 0 m

elementary formulas:
    v = dS/dt, a = dv/dt, j = da/dt
    position in case of uniform acceleration: S_t = S_0 + v_0*t + 1/2 * a_0*t^2

exercises:
    t_top_floor is the time needed to reach the top floor without stopping.
    Make a model for t_top_floor and calculate its value:
        make a 0th order model, based on constant velocity
        make a 1st order model, based on constant acceleration
    What do you conclude from these models?

Page 101: Tutorial Measuring and Modeling System Performance

Models for Time to Reach Top Floor

input data:
Stop floor = 40 m, S0 = 0 m
vmax = 2.5 m/s, amax = 1.2 m/s^2 (up), jmax = 2.5 m/s^3

0th order model, constant velocity:
Stop floor = vmax * ttop floor
ttop floor = Stop floor / vmax = 40 / 2.5 = 16 s

1st order model, constant acceleration (accelerate, cruise at vmax, decelerate):
ttop floor = ta + tv + ta
ta = vmax / amax ~= 2.5 / 1.2 ~= 2 s
S(ta) = 1/2 * amax * ta^2 ~= 0.5 * 1.2 * 2^2 ~= 2.4 m
Slinear = Stop floor - 2 * S(ta)
tv = Slinear / vmax ~= (40 - 2*2.4) / 2.5 ~= 14 s
ttop floor = ta + tv + ta ~= 2 + 14 + 2 ~= 18 s

[Figure: schematic s(t) and v(t) graphs for the 0th order model, and s(t), v(t), a(t) graphs with phases ta, tv, ta for the 1st order model]
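As a cross-check, here is the same pair of models as a minimal Python sketch; the function and variable names are our own, not from the tutorial:

Sample model code:

S_top = 40.0    # m
v_max = 2.5     # m/s
a_max = 1.2     # m/s^2

def t_zeroth_order(S):
    # constant velocity over the whole distance
    return S / v_max

def t_first_order(S):
    # trapezoidal velocity profile: accelerate, cruise at v_max, decelerate
    t_a = v_max / a_max              # time to reach v_max, ~2 s
    S_a = 0.5 * a_max * t_a**2       # distance covered while accelerating, ~2.6 m
    t_v = (S - 2 * S_a) / v_max      # cruise time over the remaining distance
    return t_a + t_v + t_a           # algebraically identical to S/v_max + v_max/a_max

print(t_zeroth_order(S_top))   # 16.0 s
print(t_first_order(S_top))    # ~18.1 s; the slides round to 18 s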


Page 102: Tutorial Measuring and Modeling System Performance

Conclusions Move to Top Floor

Conclusions

vmax dominates the traveling time.

The model for the traveling time over a large height can be simplified into:
ttravel = Stravel / vmax + (ta + tj)


Page 103: Tutorial Measuring and Modeling System Performance

Exercise Time to Travel One Floor

exercise:
Make a model for tone floor and calculate its value.
What do you conclude from this model?

input data:
Stop floor = 40 m, S0 = 0 m
vmax = 2.5 m/s
amax = 1.2 m/s^2 (up)
jmax = 2.5 m/s^3

elementary formulas:
v = dS/dt,   a = dv/dt,   j = da/dt
Position in case of uniform acceleration: S(t) = S0 + v0*t + 1/2*a0*t^2


Page 104: Tutorial Measuring and Modeling System Performance

2nd Order Model Moving One Floor

input data:
Sone floor = 3 m, S0 = 0 m
vmax = 2.5 m/s, amax = 1.2 m/s^2 (up), jmax = 2.5 m/s^3

2nd order model (jerk-limited; over one floor the elevator does not reach vmax):
tone floor = 2*ta + 4*tj
tj = amax / jmax ~= 1.2 / 2.5 ~= 0.5 s
v1 = 1/2 * jmax * tj^2 ~= 0.5 * 2.5 * 0.5^2 ~= 0.3 m/s
v2 = v1 + amax * ta
S1 = 1/6 * jmax * tj^3 ~= 1/6 * 2.5 * 0.5^3 ~= 0.05 m
S2 = S1 + v1*ta + 1/2 * amax * ta^2
S3 = S2 + v2*tj + 1/2 * amax * tj^2 - 1/6 * jmax * tj^3
S3 = 1/2 * St
et cetera

[Figure: schematic s(t), v(t), a(t) graphs with phases tj, ta, tj, tj, ta, tj and intermediate positions S0..S5]


Page 105: Tutorial Measuring and Modeling System Performance

1st Order Model Moving One Floor

input data:
Sone floor = 3 m, S0 = 0 m
vmax = 2.5 m/s, amax = 1.2 m/s^2 (up), jmax = 2.5 m/s^3

1st order model (triangular velocity profile: accelerate over half the distance, decelerate over the other half):
tone floor = 2*ta
S(ta) = 1/2 * amax * ta^2
ta = sqrt( S(ta) / (0.5*amax) )
tone floor = 2*sqrt( S(ta) / (0.5*amax) ) ~= 2*sqrt( 1.5 / (0.5*1.2) ) ~= 2 * 1.6 s ~= 3 s
v(ta) = amax * ta ~= 1.2 * 1.6 ~= 1.9 m/s, which stays below vmax, so the profile is indeed triangular

coarse 2nd order correction:
tone floor = 2*ta + 2*tj
tj ~= 0.5 s
tone floor ~= 2*1.6 + 2*0.5 ~= 4 s

[Figure: schematic v(t) graphs: the triangular profile with phases ta, ta, and the corrected profile with phases tj, ta, ta, tj]
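The "et cetera" of the 2nd order model can also be completed in closed form. A minimal Python sketch, under the assumption of a symmetric jerk-limited S-curve that never reaches vmax (names and the derivation steps in the comments are our own):

Sample model code:

import math

S = 3.0          # m, one floor
v_max = 2.5      # m/s
a_max = 1.2      # m/s^2
j_max = 2.5      # m/s^3

t_j = a_max / j_max    # duration of one jerk phase, ~0.48 s

# 1st order model with coarse 2nd order correction:
# triangular velocity profile plus two jerk phases
t_a1 = math.sqrt((S / 2) / (0.5 * a_max))
print(2 * t_a1 + 2 * t_j)      # ~4.1 s

# 2nd order model: t_one_floor = 2*t_a + 4*t_j, where the distance covered is
# S = v_peak * (t_a + 2*t_j) with v_peak = a_max * (t_a + t_j), giving the quadratic
# t_a^2 + 3*t_j*t_a + (2*t_j^2 - S/a_max) = 0
t_a2 = (-3 * t_j + math.sqrt(t_j**2 + 4 * S / a_max)) / 2
assert a_max * (t_a2 + t_j) < v_max    # peak velocity stays below v_max, as assumed
print(2 * t_a2 + 4 * t_j)      # ~3.7 s, consistent with the ~4 s estimate above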


Page 106: Tutorial Measuring and Modeling System Performance

Conclusions Move One Floor

Conclusions

amax dominates the travel time.

The model for the traveling time over a small height can be simplified into:
ttravel = 2*sqrt( (Stravel/2) / (0.5*amax) ) + 2*tj
(acceleration and deceleration each cover half the distance, as on the previous page)


Page 107: Tutorial Measuring and Modeling System Performance

Exercise Elevator Performance

exercise

Make a model for ttop floor

Take door opening and docking into account

What do you conclude from this model?


Page 108: Tutorial Measuring and Modeling System Performance

Elevator Performance Model

functional model:
close doors -> undock elevator -> move elevator -> dock elevator -> open doors

performance model:
ttop floor = tclose + tundock + tmove + tdock + topen

assumptions:
tclose ~= topen ~= 2 s
tundock ~= 1 s
tdock ~= 2 s
tmove ~= 18 s

outcome:
ttop floor ~= 2 + 1 + 18 + 2 + 2 ~= 25 s
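The same phase model as a minimal Python sketch (the names and the dict layout are our own); keeping the phases in a table makes it easy to change the assumptions later:

Sample model code:

phases_top_floor = {
    "close doors": 2,   # s
    "undock": 1,
    "move": 18,
    "dock": 2,
    "open doors": 2,
}

def travel_time(phases):
    # the trip time is simply the sum of the sequential phase durations
    return sum(phases.values())

print(travel_time(phases_top_floor))   # 25 s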


Page 109: Tutorial Measuring and Modeling System Performance

Conclusions Performance Model Top Floor

Conclusions

The time to move dominates the traveling time.
Docking and door handling are a significant part of the traveling time.

ttop floor = ttravel + televator overhead


Page 110: Tutorial Measuring and Modeling System Performance

Measured Elevator Acceleration

[Figure: measured elevator acceleration versus time; acceleration in m/s^2 on a scale from -1.5 to 1.5, time from 5 to 25 s]

graph reproduced from: http://www.sensor123.com/vm_eva625.htm (CEP Instruments Pte Ltd, Singapore)


Page 111: Tutorial Measuring and Modeling System Performance

Theory versus Practice

What did we ignore or forget?

acceleration differs up and down: 1.2 m/s^2 up versus 1.0 m/s^2 down
slack, elasticity, damping, et cetera of cables, motors, ...
controller impact
...

[Figure: control loop in which the controller drives the elevator from the z set point to the actual z]


Page 112: Tutorial Measuring and Modeling System Performance

Exercise Time to Travel One Floor

exercise

Make a model for tone floor

Take door opening and docking into account

What do you conclude from this model?


Page 113: Tutorial Measuring and Modeling System Performance

Elevator Performance Model

functional model:
close doors -> undock elevator -> move elevator -> dock elevator -> open doors

performance model, one floor (3 m):
tone floor = tclose + tundock + tmove + tdock + topen

assumptions:
tclose ~= topen ~= 2 s
tundock ~= 1 s
tdock ~= 2 s
tmove ~= 4 s

outcome:
tone floor ~= 2 + 1 + 4 + 2 + 2 ~= 11 s


Page 114: Tutorial Measuring and Modeling System Performance

Conclusions Performance Model One Floor

Conclusions

The overhead of docking and of opening and closing the doors dominates the traveling time.
Fast docking and fast door handling therefore have a significant impact on the traveling time.

tone floor = ttravel + televator overhead


Page 115: Tutorial Measuring and Modeling System Performance

Exercise Time Line

Exercise

Make a time line of people using the elevator.

Estimate the time needed to travel to the top floor.

Estimate the time needed to travel one floor.

What do you conclude?

[Figure: example time line from start to ready with concurrent actions 1, 2, and 3]


Page 116: Tutorial Measuring and Modeling System Performance

Time Line; Humans Using the Elevator

[Figure: time line of one trip: press button, wait for elevator, dock, open doors, walk in, select floor, wait for leaving and entering people (minimal waiting time), close doors, undock, move (ttravel), dock, open doors, walk out; scale 0 to 5 sec]

assumptions, human dependent data:
twait for elevator = [0..2 minutes], depends heavily on use
twait for leaving people = [0..20 seconds], idem
twalk in ~= twalk out ~= 2 s
tselect floor ~= 2 s

assumptions, additional elevator data:
tminimal waiting time ~= 8 s
ttravel top floor ~= 25 s
ttravel one floor ~= 11 s

model:
tone floor = tminimal waiting time + twalk out + ttravel one floor + twait
ttop floor = tminimal waiting time + twalk out + ttravel top floor + twait

outcome:
tone floor ~= 8 + 2 + 11 + twait ~= 21 s + twait
ttop floor ~= 8 + 2 + 25 + twait ~= 35 s + twait
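A minimal Python sketch of this time line model (names are our own); twait stays a parameter because it varies so strongly with use:

Sample model code:

t_min_wait = 8    # s, waiting for leaving and entering people
t_walk_out = 2    # s

def end_to_end(t_travel, t_wait=0.0):
    # minimal waiting time + walking out + elevator travel + variable waiting time
    return t_min_wait + t_walk_out + t_travel + t_wait

print(end_to_end(25))   # 35 s + twait, top floor
print(end_to_end(11))   # 21 s + twait, one floor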


Page 117: Tutorial Measuring and Modeling System Performance

Overview of Results for One Elevator

Overview (running totals in parentheses):

top floor (40 m):
  0th order time to move the elevator 40 m    16 s      (16 s)
  1st order correction                        + 2 s
  elevator docking and doors                  + 7 s     (25 s)
  human related                               + 10 s    (35 s)
  waiting time                                + twait   (35 s + twait)

one floor (3 m):
  1st order model + 2nd order correction      3 + 1 s   (4 s)
  elevator docking and doors                  + 7 s     (11 s)
  human related                               + 10 s    (21 s)
  waiting time                                + twait   (21 s + twait)


Page 118: Tutorial Measuring and Modeling System Performance

Conclusions Time Line

Conclusions

The human related activities have a significant impact on the end-to-end time.
The waiting times also have a significant impact on the end-to-end time, and they may vary quite a lot.

tend-to-end = thuman activities + twait + televator travel


Page 119: Tutorial Measuring and Modeling System Performance

Exercise Energy and Power

Exercise

Estimate the energy consumption and the average and peak power needed to travel to the top floor.

What do you conclude?


Page 120: Tutorial Measuring and Modeling System Performance

Energy and Power Model

input data:
St = 40 m, S0 = 0 m
vmax = 2.5 m/s, amax = 1.2 m/s^2 (up), jmax = 2.5 m/s^3
melevator = 1000 kg (incl. counterweight)
mpassenger = 100 kg
g = 10 m/s^2

elementary formulas:
Ekin = 1/2 * m * v^2
Epot = m * g * h
W = dE/dt

1st order model, 1 passenger going up:
Ekin max = 1/2 * m * vmax^2 ~= 0.5 * 1100 * 2.5^2 ~= 3.4 kJ
Wkin max = m * vmax * amax ~= 1100 * 2.5 * 1.2 ~= 3.3 kW
Epot = m * g * h ~= 100 * 10 * 40 ~= 40 kJ
Wpot max ~= Epot / tv ~= 40 / 16 ~= 2.5 kW

ignored: friction and other losses; efficiency of energy transfer

[Figure: schematic graphs of v(t) and a(t) with phases ta, tv, ta, and of Ekin, Wkin, Epot, and Wpot versus time]
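A minimal Python sketch of the energy and power model with the same ballpark numbers (names are our own; friction and transfer losses are ignored, as above):

Sample model code:

m_elevator = 1000.0    # kg, including counterweight, as lumped on the slide
m_passenger = 100.0    # kg, one passenger going up
v_max, a_max, g = 2.5, 1.2, 10.0
height, t_v = 40.0, 16.0

m_moving = m_elevator + m_passenger           # total moving mass assumed above
E_kin_max = 0.5 * m_moving * v_max**2         # ~3.4 kJ
W_kin_max = m_moving * v_max * a_max          # ~3.3 kW peak while accelerating at v_max
E_pot = m_passenger * g * height              # ~40 kJ; the counterweight balances the cage
W_pot_max = E_pot / t_v                       # ~2.5 kW

print(E_kin_max, W_kin_max, E_pot, W_pot_max)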


Page 121: Tutorial Measuring and Modeling System Performance

Energy and Power Conclusions

Conclusions

Epot dominates the energy balance.
Wpot is dominated by vmax.
Wkin causes peaks in power consumption and absorption.
Wkin is dominated by vmax and amax.

Ekin max = 1/2 * m * vmax^2 ~= 0.5 * 1100 * 2.5^2 ~= 3.4 kJ
Wkin max = m * vmax * amax ~= 1100 * 2.5 * 1.2 ~= 3.3 kW
Epot = m * g * h ~= 100 * 10 * 40 ~= 40 kJ
Wpot max ~= Epot / tv ~= 40 / 16 ~= 2.5 kW


Page 122: Tutorial Measuring and Modeling System Performance

Exercise Qualities and Design Considerations

Exercise

What other qualities and design considerations relate to the kinematic models?


Page 123: Tutorial Measuring and Modeling System Performance

Conclusions Qualities and Design Considerations

Examples of other qualities and design considerations:

safety                         vmax
acoustic noise                 vmax, amax, jmax
mechanical vibrations          vmax, amax, jmax
air flow                       vmax, ?
operating life, maintenance    duty cycle, ?
...

[Figure: obstacles in the rails cause vibrations in the cage moving at vmax]


Page 124: Tutorial Measuring and Modeling System Performance

Other Domains

applicability in other domains

Kinematic modeling can be applied in a wide range of domains:
transportation systems (trains, buses, cars, containers, ...)
wafer stepper stages
patient handling in health care equipment
material handling (printers, inserters, ...)
gradient generation in MRI scanners
...


Page 125: Tutorial Measuring and Modeling System Performance

Exercise Multiple Users

Exercise

Assume that a group of people enters the elevator at the ground floor. On every floor one person leaves the elevator.

What is the end-to-end time for someone traveling to the top floor?
What is the desired end-to-end time?
What are potential solutions to achieve this?
What are the main parameters of the design space?


Page 126: Tutorial Measuring and Modeling System Performance

Multiple Users Model

elevator data:
tmin wait ~= 8 s
tone floor ~= 11 s
twalk out ~= 2 s
nfloors = 40 div 3 + 1 = 14
nstops = nfloors - 1 = 13

model:
tend-to-end = nstops * (tmin wait + tone floor) + twalk out + twait

outcome:
tend-to-end ~= 13 * (8 + 11) + 2 + twait ~= 249 s + twait
compare: tnon-stop ~= 35 s + twait

[Figure: time line: press button, wait for elevator, dock, open doors, walk in, select floor, wait for leaving and entering people (minimal waiting time); then another 13 floors of moving, docking, and door handling; finally walk out]
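A minimal Python sketch of the multiple-user model (names are our own), including the "just missed the elevator" worst case discussed on the next page:

Sample model code:

t_min_wait, t_one_floor, t_walk_out = 8, 11, 2   # s
n_floors = 40 // 3 + 1        # 14 floors of 3 m in a 40 m building
n_stops = n_floors - 1        # the elevator stops at 13 floors on the way up

def t_stop_everywhere(t_wait=0):
    # every stop costs the minimal waiting time plus a one-floor move
    return n_stops * (t_min_wait + t_one_floor) + t_walk_out + t_wait

print(t_stop_everywhere())    # 249 s + twait

# just missed the elevator: full trip up (~249 s), return down (~35 s),
# then the own trip up (~249 s)
print(t_stop_everywhere() + 35 + t_stop_everywhere())   # ~533 s ~= 9 minutes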


Page 127: Tutorial Measuring and Modeling System Performance

Multiple Users Desired Performance

Considerations

desired time to travel to the top floor ~< 1 minute
desired waiting time ~< 1 minute

note that twait next = ttravel up + ttravel down
if someone just misses the elevator, then the end-to-end time becomes:
tend-to-end ~= 249 + 35 + 249 = 533 s ~= 9 minutes!

[Figure: time line showing the missed trip up, the return down, and the own trip up]


Page 128: Tutorial Measuring and Modeling System Performance

Design of Elevators System

building

elevators

characteristics

individual

elevators

configuration

of elevators

scheduling

strategy

people(time)

trips(time)

tend-to-end

twait

distribution ofdesign

option

vmax, amax, jmax

npassengers

topen, tclose, tdock, tundock

tmin wait

nelevators

reachable floors

position

usage

Design of a system with multiple elevator

requires a different kind of models: oriented towards logistics


Page 129: Tutorial Measuring and Modeling System Performance

Exceptional Cases

Exceptional Cases

non-functioning elevator
maintenance, cleaning of the elevator
elevator used by people moving household goods
rush hour
special events (e.g. a party, New Year's Eve)
special floors (e.g. a restaurant)
many elderly or handicapped people
playing children


Page 130: Tutorial Measuring and Modeling System Performance

Wrap-up Exercise

Make a list of all visualizations and representations that we used during the exercises.


Page 131: Tutorial Measuring and Modeling System Performance

Summary of Visualizations and Representations

mathematical formulas (e.g. S(t) = S0 + v0*t + 1/2*a0*t^2)

schematic graphs (e.g. the s(t) and v(t) profiles with phases ta, tv, ta)

measurement graph (the measured acceleration in m/s^2 versus time, reproduced from http://www.sensor123.com/vm_eva625.htm, CEP Instruments Pte Ltd, Singapore)

quantification (e.g. ttop floor = tclose + tundock + tmove + tdock + topen ~= 2 + 1 + 18 + 2 + 2 ~= 25 s, and the overview of results with 1st order model, 2nd order correction, docking and doors, human related time, and waiting time)

time line, concurrency (the human and elevator activities from pressing the button to walking out, on a 0 to 5 sec scale)

functional model (close doors, undock elevator, move elevator, dock elevator, open doors)

physical model (the 40 m building with elevator, rails, and cage)

[Figure: thumbnail collage of these visualizations from the preceding pages]
