Page 1: This page intentionally left blank

Mats Björkman, MdH

This page intentionally left blank

Please focus here

Page 2: This page intentionally left blank

Measurement-based Research Methods in Computer Engineering

Mats Björkman, Mälardalens Högskola

Page 3: This page intentionally left blank

Overview

Introduction

Experimental-based research methodology

Statistics

Measurements

Methodology

Examples

Pitfalls

Conclusions

Page 4: This page intentionally left blank

Introduction

Measurement-based research is founded in:

Experimental research methodology

Statistics

Page 5: This page intentionally left blank

Experimental-based research methodology

Overview (repetition)

Comments

Page 6: This page intentionally left blank

Experimental-based research methodology - Overview

Already the old Greeks…

Two main standpoints:

Rational methods: it all comes from the brain, everything can be thought out

Idealistic methods: everything we observe gives us knowledge about the ideal world (e.g. Plato)

Page 7: This page intentionally left blank

Different Methodologies

Rational research meant thinking, thus deductive (logical) methodologies

Idealistic research meant observing and drawing conclusions, thus inductive methodologies

Page 8: This page intentionally left blank

Practice often a mixture

E.g. astronomy, a combination of induction from observations and deduction through e.g. mathematics

Page 9: This page intentionally left blank

Medieval times

Much debate around whether or not God stands above the laws of logic

The question “Why?” important

Research always seen through the glasses of religion

Page 10: This page intentionally left blank

The Scientific Revolution

Bacon, Copernicus, Kepler, Newton etc.

Focus on “How?”

Nature and religion may be treated separately as long as focus is on How

Led to the development of the “traditional” sciences

Page 11: This page intentionally left blank

Modern Science

Karl Popper - important philosopher

Science is a process of testing and refining hypotheses

Induction problem: Can experience be generalized?

Popper says ‘no’, experiments cannot prove general hypotheses

Page 12: This page intentionally left blank

Modern Science

Falsification is the most important feature of science according to Popper

Hypotheses cannot be proven, but they can be falsified by counter-examples

Theories are compared by their expressiveness and by their abilities to withstand falsification

Page 13: This page intentionally left blank

Modern Science

Since hypotheses cannot be generally proven, corroboration and statistics play important roles

Hypothetical-deductive research methods build on these views of science

However, corroboration versus verification can be and is discussed

Page 14: This page intentionally left blank

Hypothetical-deductive methods

Problem formulation

Hypothesis

Deduction to find evaluation criteria

Experiment(s)/observation(s)

Conclusion (corroboration/verification or falsification)

Page 15: This page intentionally left blank

Problem formulation

The research problem is formulated

Typically, a good problem is addressable through empirical studies

Page 16: This page intentionally left blank

Hypothesis

A hypothesis regarding the answer to the research question is formulated

This is the really creative part of research; scientific intuition and a good “educated guess” are important to success

(Popper: The more “risky” the hypothesis, the “better” the result.)

Page 17: This page intentionally left blank

Deduction

From the hypothesis, the criteria to be used to test the hypothesis are deduced

Page 18: This page intentionally left blank

Experiments/observations

The “hard work” part of research

Experiments are set up and/or observations are performed in order to corroborate/verify (or falsify) the hypothesis

Page 19: This page intentionally left blank

Corroboration/verification or falsification

Using the deduced criteria on the results of the experiments/observations leads to either corroboration/verification or falsification of the hypothesis

Page 20: This page intentionally left blank

Then iterate…

Modern scientific research is typically a series of hypothetical-deductive situations; each corroboration/verification or falsification gives input to a new or modified research question, and so on

Through this process, our scientific theories are expanded and refined

Page 21: This page intentionally left blank

What is “the Truth”?

Experimental research is often more quantitative than qualitative

For quantitative results, confidence levels or margins of error are used in attempts to “encircle” the Truth (should it exist)

Experiments are repeated and/or modified until confidence levels or error margins are satisfactory

Page 22: This page intentionally left blank

What is “the Truth”?

For qualitative results, we must also use statistics.

Even if we believe in induction and that the Truth is possible to find, there are always experimental errors and the like that make 100% certainty impossible to reach

Here too, repeated experiments are needed

Page 23: This page intentionally left blank

Conclusions

Experimental research is an iterative process

Potential falsification is important: experiments without risks are not interesting

Page 24: This page intentionally left blank

Conclusions

Examples:

If we know the outcome beforehand, the experiment is of no scientific value.

If there is no way to falsify the hypothesis (e.g. pseudoscience), the experiment is of no scientific value.

Page 25: This page intentionally left blank

Experimental-based research methodology - Comments

100 % does not exist in reality!

Page 26: This page intentionally left blank

Experimental-based research methodology - Comments

In reality, there is always a residual chance/risk that something really weird will happen

Page 27: This page intentionally left blank

Experimental-based research methodology - Comments

Therefore, Popper and his followers are maybe not wrong, but it is kind of irrelevant whether a hypothesis can be “generally proven” or not

(I’m trying to be provocative here…)

Page 28: This page intentionally left blank

Experimental-based research methodology - Comments

If I show that in X percent of all cases, some hypothesis Y holds…

…then according to Popper, this cannot prove the general case…

…but if the remaining 100 – X percent (the risk of Y not holding) is smaller than e.g. the risk of the world being ended by a comet…

…then who cares?

Page 29: This page intentionally left blank

Experimental-based research methodology - Comments

In reality, we always take calculated risks

If a hypothesis is true for all practical purposes, then it is an academic question (a philosophical question) whether or not the hypothesis is TRUE

Page 30: This page intentionally left blank

Experimental-based research methodology - Conclusions

Conclusion: The hypothetical-deductive method is the modern methodology in experimental research

However, not everyone agrees that hypotheses cannot be generally proven (and others don’t care…)

Page 31: This page intentionally left blank

An Experimental Example

The work I did for my PhD thesis around the performance of parallel implementations of communication protocols

Page 32: This page intentionally left blank

Parallel TCP and UDP stacks

On a shared-memory multiprocessor we implemented parallel TCP/IP/Ethernet and UDP/IP/Ethernet stacks

The performance behavior of these stacks gave rise to the research question “What factors limit the performance of these parallel implementations?”

Page 33: This page intentionally left blank

Performance limiting factors

In parallel processing, critical resources must be protected from simultaneous access, in our case by using locks

Hence, these critical sections were the main suspects as performance limiting factors

Page 34: This page intentionally left blank

Our research hypothesis

Our hypothesis then was “locking is the main performance limiting factor”

We built a performance model using only locking and processing

If our hypothesis was right, then the model should behave like the real system

Page 35: This page intentionally left blank

Experiment

Our experiment was to run the model with the same input as the real implementation and compare results

Page 36: This page intentionally left blank

Results

For the TCP stack, results were fairly accurate for low numbers of processors (but far from perfect).

Conclusion: locking “is probably” one major factor (but not the only one)

Page 37: This page intentionally left blank

Results

For the UDP stack, results differed widely.

Conclusion: locking is not a major factor here

Page 38: This page intentionally left blank

Results from conclusions

We need to rethink and refine (iterate)

Locking obviously is one factor, but not the only one

Need to think again and formulate a new hypothesis

Page 39: This page intentionally left blank

New hypothesis

Next to contention for shared software resources, contention for shared hardware resources (e.g. buses, memory) is a likely candidate

New hypothesis: Contention for locks and contention for the bus/memory system are the two main factors

Page 40: This page intentionally left blank

New model

We then built a new model that captured the effects of both locking and bus/memory contention

The same evaluation criterion as before: model and reality should agree

Page 41: This page intentionally left blank

New results

For the new model, the TCP results were very good

Page 42: This page intentionally left blank

New results

While not perfect, UDP results also showed that our new model captured the main behavior of the UDP stack

Page 43: This page intentionally left blank

New results

Conclusion: Lock and bus/memory contention “are” the two main performance limiting factors for the observed implementations

Page 44: This page intentionally left blank

Statistics

Statistics are used for many purposes:

Quantify results

Measure confidence

Statistics needed in corroboration process

Page 45: This page intentionally left blank

Statistics – Result quantification

Assume we are measuring some property P that has a certain (but to us unknown) value V

When measuring, we get a measured value V*

Page 46: This page intentionally left blank

Statistics – Result quantification

How are V and V* related?

It depends on our measurement methods

Ideally, our measurement method gives an exact result, i.e. V* = V

Page 47: This page intentionally left blank

Statistics – Result quantification

However, most measurement methods are:

Statistical by their nature

Inexact

Deliberately simplified (model)

Page 48: This page intentionally left blank

Statistics – Measurement methods

Statistical by nature:

Sampling is a typical example:

Instead of observing a long and possibly continuous process, we take a number of snapshots

These snapshots are statistically representative of the process

Page 49: This page intentionally left blank

Statistics – Measurement methods

Example: Counting cars (vehicles)

We want to know the number of cars/vehicles passing outside Rosenhill on one day

Instead of counting for 24 hours, we can count 10 randomly chosen minutes and multiply by 144.
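
The sampling idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_daily_count`, the per-minute list of counts, and the constant-traffic test data are all hypothetical and not from the talk. The factor 144 comes from 1440 minutes per day divided by 10 sampled minutes.

```python
import random

MINUTES_PER_DAY = 24 * 60  # 1440


def estimate_daily_count(counts_per_minute, sample_size=10, rng=None):
    """Estimate the daily total from `sample_size` randomly chosen minutes."""
    rng = rng or random.Random()
    if len(counts_per_minute) != MINUTES_PER_DAY:
        raise ValueError("expected one count per minute of the day")
    # Pick distinct minutes uniformly at random over the whole day.
    sampled_minutes = rng.sample(range(MINUTES_PER_DAY), sample_size)
    sampled_total = sum(counts_per_minute[m] for m in sampled_minutes)
    scale = MINUTES_PER_DAY / sample_size  # 144 when sample_size == 10
    return sampled_total * scale


# Hypothetical traffic: exactly 3 vehicles every minute (true total 4320).
constant_traffic = [3] * MINUTES_PER_DAY
estimate = estimate_daily_count(constant_traffic, rng=random.Random(0))
```

With constant traffic the estimate is exact. With real, uneven traffic the estimate varies from sample to sample, which is why the confidence in the result has to be quantified.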

Page 50: This page intentionally left blank

Statistics – Measurement methods

Inexactness:

If our measurement tools have lower resolution than the property we are measuring, we introduce measurement errors

Page 51: This page intentionally left blank

Statistics – Measurement methods

Example: Using a simple scale to weigh something will only yield approximate values of the weight

Page 52: This page intentionally left blank

Statistics – Measurement methods

Simplification:

We approximate the problem by introducing an inexact model

Page 53: This page intentionally left blank

Statistics – Measurement methods

Example: Counting cars (again)

We put a rubber tube across the street with a counter that ticks every time the tube is run over by a vehicle wheel (pair)

Page 54: This page intentionally left blank

Statistics – Measurement methods

Example: Counting cars (again)

We get one tick per vehicle axle

We can approximate that all vehicles have 2 axles…

…or use some previously determined value, e.g. 2.042 axles

Page 55: This page intentionally left blank

Statistics – Measurement methods

Note that whereas using 2 axles is a coarser simplification, using 2.042 axles introduces dependencies on some earlier measurements and their reliability and exactness
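
To make the trade-off concrete, here is a small Python sketch of the axle-to-vehicle conversion. The function name and the tick count of 10,000 are hypothetical; only the values 2 and 2.042 come from the slides.

```python
def vehicles_from_axle_ticks(ticks, axles_per_vehicle=2.0):
    """Convert tube-counter ticks (one per axle) into an estimated vehicle count."""
    if axles_per_vehicle <= 0:
        raise ValueError("axles_per_vehicle must be positive")
    return ticks / axles_per_vehicle


ticks = 10_000  # hypothetical counter reading
coarse = vehicles_from_axle_ticks(ticks)             # assumes 2 axles: 5000.0
calibrated = vehicles_from_axle_ticks(ticks, 2.042)  # uses the earlier measurement

# Relative difference between the two estimates: 2.042 / 2 - 1, about 2.1 %.
relative_difference = (coarse - calibrated) / calibrated
```

The calibrated estimate is only as good as the earlier measurement that produced 2.042, so its uncertainty carries over into the result.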

Page 56: This page intentionally left blank

Statistics – Result quantification

Depending on the number of measurements we make, the measurement precision etc., we will get a measured value V* where we can quantify the relation between V* and the true value V

Page 57: This page intentionally left blank

Statistics – Result quantification

With enough knowledge about all parts of our measurement process, we can determine that:

With a confidence of q, the true value V lies in the range [V* − ε₁, V* + ε₂] (with some specific values of q, ε₁, ε₂).
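
As a rough illustration, such an interval can be computed from repeated measurements. The sketch below is not the method used in the talk: it assumes independent measurements, uses a normal approximation (for small samples a Student t quantile would be more appropriate), and produces a symmetric interval, i.e. equal lower and upper margins. The function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, q=0.95):
    """Return (low, high) bounds around the sample mean V* for confidence q."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    v_star = mean(samples)
    z = NormalDist().inv_cdf(0.5 + q / 2)  # about 1.96 for q = 0.95
    margin = z * stdev(samples) / sqrt(n)  # same margin on both sides
    return v_star - margin, v_star + margin


samples = [9.8, 10.1, 10.0, 10.2, 9.9]  # hypothetical repeated measurements
low, high = confidence_interval(samples, q=0.95)
```

Asking for a higher confidence q widens the interval, which is the tension between the two goals of high confidence and small margins.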

Page 58: This page intentionally left blank

Statistics – Result quantification

Two goals then are to:

Have as high confidence (q) as possible in the results, and

have as small interval margins (ε₁, ε₂) as possible.

Page 59: This page intentionally left blank

Statistics – Result quantification

Both confidence levels and margins of error are dependent on the measurement methods we use

Page 60: This page intentionally left blank

Statistics – Result quantification

Therefore, it is a prime task to make measurements as reliable as possible

By removing sources of errors, the results can be improved considerably

Page 61: This page intentionally left blank

Statistics – Result quantification

Repetition is another important point

By repeated experiments, we can obtain better confidence in our results, as well as reduce the margins of error
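
One way to see the effect of repetition is that, for independent measurements with the same spread, the margin of error shrinks with the square root of the number of measurements. The sketch below assumes exactly that (a normal approximation with a known standard deviation); the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(std_dev, n, q=0.95):
    """Half-width of a confidence interval for the mean of n measurements."""
    z = NormalDist().inv_cdf(0.5 + q / 2)
    return z * std_dev / sqrt(n)


# Quadrupling the number of measurements halves the margin of error.
margins = {n: margin_of_error(1.0, n) for n in (10, 40, 160)}
```

The practical consequence is diminishing returns: each further halving of the margin costs four times as many measurements.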

Page 62: This page intentionally left blank

Statistics – Methodology

In order to reduce errors, measurement experiments need to be as controlled as possible

By controlled we mean that we should reduce and (hopefully) control all error sources

Page 63: This page intentionally left blank

Methodology

If there are factors in the environment that cannot be eliminated, they should be measured and quantified

Example: vehicle axles, 2.042 is a quantification of our “conversion error” from axles to vehicles

Page 64: This page intentionally left blank

Methodology - dilemma

Methodology dilemma: Intrusive measurements

Example: Timestamping a code section

Time for timestamping included in reported times…

Page 65: This page intentionally left blank

Timestamping dilemma

[Figure: the code to measure is bracketed by two timestamps. Problem: the time between the two timestamps ≠ the time of the code alone]

Page 66: This page intentionally left blank

Timestamping: reducing error

[Figure: (time between two timestamps around the code to measure) − (time between two back-to-back timestamps) ≈ time of the code to measure]
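
The subtraction pictured on this slide can be sketched in Python with `time.perf_counter_ns`. The helper names, the choice of taking the smallest observed back-to-back gap as the overhead, and the workload are assumptions made for illustration, not the setup used in the talk.

```python
import time


def timestamp_overhead_ns(repeats=1000):
    """Smallest observed gap between two back-to-back clock reads."""
    best = None
    for _ in range(repeats):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        gap = t1 - t0
        best = gap if best is None else min(best, gap)
    return best


def measure_ns(func, overhead_ns):
    """Time one call of func and subtract the estimated timestamping cost."""
    t0 = time.perf_counter_ns()
    func()
    t1 = time.perf_counter_ns()
    return max(0, (t1 - t0) - overhead_ns)  # never report a negative time


def code_to_measure():
    # Hypothetical workload standing in for the code section of interest.
    sum(range(10_000))


overhead = timestamp_overhead_ns()
corrected = measure_ns(code_to_measure, overhead)
```

The correction is only approximate, since the cost of a clock read itself varies between calls.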

Page 67: This page intentionally left blank

Timestamping: reducing error

[Figure: two timestamps bracket N repetitions of the code to measure (1, 2, …, N); time between the timestamps ≅ N ∗ time of the code to measure]
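
A minimal Python sketch of this repetition approach follows. The function name is hypothetical, and the loop overhead is deliberately left in the result, so this is an approximation rather than an exact measurement.

```python
import time


def mean_time_per_run_ns(func, n):
    """Time n back-to-back runs with one timestamp pair; return the mean per run."""
    if n <= 0:
        raise ValueError("n must be positive")
    t0 = time.perf_counter_ns()
    for _ in range(n):
        func()
    t1 = time.perf_counter_ns()
    # The timestamping cost is paid once and spread over n runs.
    return (t1 - t0) / n


per_run = mean_time_per_run_ns(lambda: sum(range(1_000)), n=1_000)
```

Python's standard `timeit` module follows the same idea: `timeit.timeit(func, number=n)` returns the total time for n calls.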

Page 68: This page intentionally left blank

Simulations

One way to avoid intrusive measurements

Problem: Simulations must be verified/corroborated against reality

Page 69: This page intentionally left blank

Simulations

Problem: Simulations lend themselves to simplifications

This means that we introduce modeling errors

These errors must be identified and controlled

Page 70: This page intentionally left blank

Pitfalls

There are many pitfalls in experimental research, especially when it involves measurements

Page 71: This page intentionally left blank

Pitfall: Sampling too sparsely

Example from a very early publication

We performed measurements on protocol stacks in a UNIX environment

Page 72: This page intentionally left blank

Pitfall: Sampling too sparsely

Our controlled measurements did not agree with reality

[Figure: plot comparing the “Controlled” and “Reality” series; axis ticks from 0 to 45]

Page 73: This page intentionally left blank

Pitfall: Sampling too sparsely

Turned out, our measurement points were not representative

[Figure: plot comparing the “Controlled” and “Reality” series; axis ticks from 0 to 45]

Page 74: This page intentionally left blank

Pitfall: Sampling too sparsely

We had chosen measurement points (data sizes) in multiples of 1KB

The UNIX mbuf system puts communication data into 128-byte linked buffers. For 1KB of data, the 8 small buffers are replaced by one large 1KB buffer

Page 75: This page intentionally left blank

Pitfall: Sampling too sparsely

This means:

Sending 1023 bytes means handling of 8 small buffers

Sending 1024 bytes means handling of 1 large buffer

Sending 1025 bytes means handling of 1 large and 1 small buffer
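
The buffer counts above follow from a simple model, sketched here in Python. This is a deliberately simplified reading of the slides (128-byte small buffers, one large buffer per full 1KB), not the actual mbuf allocation code; the function and constant names are hypothetical.

```python
SMALL_BUFFER_BYTES = 128
LARGE_BUFFER_BYTES = 1024


def buffers_needed(size):
    """Return (large, small) buffer counts for `size` bytes in the simplified model."""
    if size < 0:
        raise ValueError("size must be non-negative")
    large, remainder = divmod(size, LARGE_BUFFER_BYTES)
    small = -(-remainder // SMALL_BUFFER_BYTES)  # ceiling division
    return large, small


# The three cases from the slide.
cases = {size: buffers_needed(size) for size in (1023, 1024, 1025)}
# cases == {1023: (0, 8), 1024: (1, 0), 1025: (1, 1)}

# Sampling only at multiples of 1KB never observes a small buffer.
sampled = [buffers_needed(k * 1024) for k in range(1, 5)]
```

In this model every sampled point has zero small buffers, which is the cheapest case, so measurements taken only at those sizes cannot represent the sizes in between.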

Page 76: This page intentionally left blank

Pitfall: Sampling too sparsely

Hence, our measurement points were extreme points, therefore our results deviated from reality’s mean values

[Figure: plot comparing the “Controlled” and “Reality” series; axis ticks from 0 to 45]

Page 77: This page intentionally left blank

Pitfall: Time correlations

When working in a real system, unknown time correlations may occur. Process scheduling is one typical example.

Page 78: This page intentionally left blank

Pitfall: Time correlations

If the arrival time of packets to a host is timestamped in a process, the timestamps will exhibit a pattern correlated to the scheduling of that process.
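
A toy Python model can illustrate how such a pattern arises. It assumes, purely for illustration, that the timestamping process runs only at fixed scheduling instants (every 10 ms here) and stamps all waiting packets when it next runs. The quantum, the arrival times, and the function name are hypothetical.

```python
import math


def observed_timestamps(arrivals_ms, quantum_ms=10):
    """Stamp each packet at the next instant the process is scheduled."""
    return [math.ceil(t / quantum_ms) * quantum_ms for t in arrivals_ms]


arrivals = [1, 4, 9, 12, 18, 23]        # hypothetical true arrival times (ms)
stamps = observed_timestamps(arrivals)  # [10, 10, 10, 20, 20, 30]

# Observed inter-arrival gaps collapse to 0 or a multiple of the quantum.
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]  # [0, 0, 10, 0, 10]
```

The observed gaps say more about the scheduler than about the packet arrivals.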

Page 79: This page intentionally left blank

Pitfall: Clock resolution

Measuring small time quantities can be hard, and clocks may not be trustworthy on small time scales

Page 80: This page intentionally left blank

Pitfall: Clock resolution

Example: uniqtime in UNIX systems

Uniqtime keeps track of clock reads. If the clock has not ticked since the last read, uniqtime adds a small fraction to the time to avoid “time standing still”.

Page 81: This page intentionally left blank

Pitfall: Clock resolution

This means that time intervals of the same order of magnitude as the clock tick cannot be measured accurately
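
The effect can be shown with a toy Python model of a uniqtime-style clock. This is not the real UNIX implementation: the class name, the 10 ms tick, and the 1 µs increment are made-up values chosen only to show the behavior described on the slides.

```python
class UniqTimeClock:
    """Toy clock: coarse ticks, plus a small bump if the tick has not advanced."""

    def __init__(self, tick_us=10_000, bump_us=1):
        self.tick_us = tick_us
        self.bump_us = bump_us
        self._last = None

    def read(self, true_time_us):
        # Round the true time down to the most recent clock tick.
        coarse = (true_time_us // self.tick_us) * self.tick_us
        if self._last is not None and coarse <= self._last:
            # No tick since the last read: avoid "time standing still".
            coarse = self._last + self.bump_us
        self._last = coarse
        return coarse


# Two reads 3000 µs apart within one tick: measured as 1 µs.
clock = UniqTimeClock()
within_tick = clock.read(4_000) - clock.read(1_000)  # evaluates right to left? No: see below
```

Python evaluates the operands of `-` left to right, so the line above reads at 4000 µs first. To keep the reads in time order, the measurement is better written as separate statements:

```python
clock = UniqTimeClock()
start = clock.read(1_000)
end = clock.read(4_000)
within_tick = end - start  # 1, although 3000 µs really elapsed

clock = UniqTimeClock()
start = clock.read(9_000)
end = clock.read(11_000)
across_tick = end - start  # 10000, although only 2000 µs really elapsed
```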

Page 82: This page intentionally left blank

Pitfall: Trusting authorities

Authorities can make errors, too.

Example: ns (and ns-2) are standard simulators for communication.

ns comes with a large set of standard protocols

Page 83: This page intentionally left blank

Pitfall: Trusting authorities

We were interested in investigating the backoff mechanism in Ethernet.

We found out that the Ethernet implementation in ns was broken (and had been so for a long time).

For several years, many people around the world had used a broken protocol in their simulations

Page 84: This page intentionally left blank

Conclusions

Measurement-based research is a prime example of experimental research methodology

Measurements are tricky, but fun!

If you are interested in a career in measurement-based research, study statistics!

Page 85: This page intentionally left blank

The End

That’s all, Folks!