Maths and Statistics Seminar Series 2: Experimental Design John Fenlon [email protected] 1

1

Maths and Statistics Seminar Series

2: Experimental Design

John Fenlon

[email protected]


2

A statistical career

• RA @ Warwick Univ. (1969-71)
• SO, HSO, SSO @ GRI, nr Reading (1971-79)
• SSO, Water Quality, Severn-Trent (1979-83)
• PSO @ GCRI, L’hampton (1983-1995)
• Head of Statistics @ HRI, Littlehampton, then Wellesbourne (1987-2004)
• Director, RISCU & Reader in Statistics @ Univ. Warwick (2003-2011)
• Consultant, RISCU & Teaching Fellow (2011-)

3

‘It is easy to lie with statistics. It is hard to tell the truth without statistics.’

- Andrejs Dunkels

4

Two Farmers

• A uses variety X and gets a yield of 24.1
• B uses variety Y and gets 21.4

• A grows X on a second field and gets 24.2

5

Two types of study

• Observational studies
– An observational study compares populations
• Designed experiments
– An experiment compares treatments

• Example: randomised clinical trial vs. similar study using hospital records

6

Statistical design allows us to

● avoid poor experimentation
● make efficient / ethical use of resources
● distinguish between signal and noise
● choose the right size of experiment
● work with general heterogeneity of experimental material, by grouping into homogeneous blocks
● design a sequence of experiments within an overall research programme

7

Publications with ‘Experimental Design’ or ‘Design of Experiments’ in the title

Year    Experimental Design    Design of Experiments
2005    141                    45
2006    146                    55
2007    169                    57
2008    211                    89
2009    183                    96
2010    204                    82
2011    230                    91
2012    (160)                  (56)

8

Areas of application

• Agricultural field experiments
• Ecological / environmental
• Food processing / baking, etc.
• Industrial (e.g. chemical, pharma., biogenetics)
• Engineering
• Medical
• Transport / environment
• Educational
• Computer experiments

9

Some simple examples

• We wish to compare the potential of several different varieties of wheat. How might we test this, and what inferences can we draw from the results?

• How might we determine what type of battery would give the longest life in a torch?

• It is thought that a new method of road-management might work better at certain types of junction. How should we go about testing this?

10

Some more simple examples

• How should we maximise the yield of a chemical product which we know is dependent on operating temperature and running time?

• A pharmaceutical company has developed a new drug to treat a particular disease. How can they test its efficacy?

• The Government need to test whether genetically-modified (GM) crops have an impact on the environment. How might they go about this?

11

HISTORICAL

• Rothamsted
– Agricultural field trials (Fisher & Yates)
• World War II
– Industrial statistics
• Impact of Computers
• Medicine + other disciplines
• Engineering (Taguchi & robust design)
• Optimal Design
• Computer Experiments

12

Classes of Experimental Problems (Wu and Hamada, 2000)

• Treatment comparisons
• Variable screening
• Response surface exploration
• System optimisation
• System robustness

13

Experimental Units

• Experimental unit: smallest division of the experimental material such that any two units may receive different treatments in the actual experiment

• Examples:
– a plot of land
– a patient in a hospital
– a class of students
– a lump of dough
– a group of animals in a pen
– a specific run on a machine with given conditions

14

Comparative experiments

• Treatments applied at different times / places will almost certainly produce different means

• It is important to compare treatments on material / units that are as similar as possible

• In designed experiments we can infer a causal effect of treatments

15

The Three R’s of Experimental Design

• Replication
• Randomisation
• R... Blocking
• Representativeness

16

Replication

Replication is the process of running the same treatment on different (i.e. independent) experimental units.

Examples:
• Different mice in an assay
• Running a reaction again
• More than one plot in a field trial

It does NOT mean repeating the reading, or sampling within a unit!

17

Replication

• Provides a true estimate of variance
• Helps to guard against the effect of outliers

How much replication is needed depends on

• resources available
• variability of the experimental units
• treatment structure
• size of effect that we want to detect
• relative importance of different comparisons
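As a rough illustration of how unit variability and the size of effect drive replication, here is a minimal sketch using the standard normal-approximation formula for comparing two treatment means, n = 2 (z_{1-α/2} + z_{power})² (σ/δ)². The function name, the default significance level (0.05) and the default power (0.8) are assumptions made for this example, not values from the slides.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_treatment(sigma, delta, alpha=0.05, power=0.8):
    """Approximate replicates per treatment needed to detect a difference
    `delta` between two treatment means when the unit-to-unit standard
    deviation is `sigma` (two-sided test, normal approximation)."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the test
    z_power = z.inv_cdf(power)            # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 * (sigma / delta) ** 2)


# Halving the effect we want to detect roughly quadruples the replication.
n_large_effect = replicates_per_treatment(sigma=1.0, delta=1.0)
n_small_effect = replicates_per_treatment(sigma=1.0, delta=0.5)
print(n_large_effect, n_small_effect)
```

The approximation ignores treatment structure and the relative importance of comparisons, so it is only a starting point for the considerations listed above.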

18

Randomisation

In practice, randomisation means that, once the units for the trial have been selected, it is entirely a matter of chance which unit receives which treatment.

Furthermore, the selection of one particular treatment-unit combination should have no influence on the treatment received by the unit that is adjacent in space or time.

19

RANDOMISATION

• Provides a valid estimate of error
• Guards against bias
– Systematic bias
– Selection bias
– Accidental bias
– Cheating
• Use of ‘blind’, ‘double-blind’ principles in experimental trials

20

Mechanics of randomisation

• ‘Bingo’
• Dice, coins and cards
• Random number tables
• Computer programs
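For the ‘computer programs’ route, a completely randomised allocation might be sketched as follows. The function name, the 12 units and the three treatments A, B, C are hypothetical; the optional seed is there only so that a plan can be reproduced and audited.

```python
import random


def completely_randomised(units, treatments, replicates, seed=None):
    """Assign each treatment to `replicates` units, purely by chance."""
    units = list(units)
    if len(units) != len(treatments) * replicates:
        raise ValueError("need exactly len(treatments) * replicates units")
    rng = random.Random(seed)
    # One label per unit: every treatment repeated `replicates` times.
    labels = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(labels)                   # chance alone decides the pairing
    return dict(zip(units, labels))


plan = completely_randomised(units=range(1, 13), treatments="ABC",
                             replicates=4, seed=1)
for unit, treatment in plan.items():
    print(unit, treatment)
```

Shuffling the full list of labels, rather than drawing a treatment independently for each unit, keeps the replication equal across treatments.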

21

Blocking – local control

• Exploiting variation to advantage
• Compare treatments on homogeneous material
• Eliminate ‘known’ sources of variation in material
• Examples
– Plots in field trials
– Animals in litters
– Times of day
– Patients of a given age / medical history
– Positions in a glasshouse
– Different labs

22

A food chemistry experiment

We wish to compare the nutritional quality of 5 brands of pizza (A, B, C, D and E) using four different labs.

Here the labs are blocks.

Lab    Brands
I      A D B C E
II     E B A C D
III    B A E D C
IV     E D C B A
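A layout of this kind, with every brand appearing once in every lab in an independently shuffled order, can be generated along the following lines. This is a sketch of a randomised complete block design; the function name and the seed are assumptions, and the output will not reproduce the particular orders shown on the slide.

```python
import random


def randomised_blocks(blocks, treatments, seed=None):
    """Return {block: treatments in random order}, randomising separately
    within each block so that each block contains every treatment once."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)        # fresh, independent shuffle per block
        layout[block] = order
    return layout


layout = randomised_blocks(["I", "II", "III", "IV"], "ABCDE", seed=2024)
for lab, order in layout.items():
    print(lab, " ".join(order))
```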

23

Representativeness

How representative is the experiment for the material for which inference is to be made?

Examples:

• a nutrition experiment on a specific breed of cow – is it representative of all cows?

• we test various engine oils on a Ford engine – can we assume that the results will hold for a VW engine?

24

25

26

27

Production scheduling

28

29

Design & Analysis

• Analysis Of Variance (ANOVA)
– A way of partitioning the variance in the experiment and attributing it to the design and treatment components
• A Linear Model
– The Design defines a specific model which determines the analysis.
• The replication error gives us a measure against which we can compare other effects (e.g. Blocks & Treatments)
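To make the partitioning concrete, the sketch below splits the total sum of squares for a randomised block layout into block, treatment and residual parts, then compares the treatment mean square with the residual mean square. The responses are invented purely for illustration and the function name is an assumption.

```python
def rcbd_anova(y):
    """y[block][treatment] -> response. Returns the sums of squares and the
    F ratio for treatments in a randomised complete block design."""
    blocks = list(y)
    treatments = list(y[blocks[0]])
    b, t = len(blocks), len(treatments)

    grand = sum(y[i][j] for i in blocks for j in treatments) / (b * t)
    block_mean = {i: sum(y[i][j] for j in treatments) / t for i in blocks}
    treat_mean = {j: sum(y[i][j] for i in blocks) / b for j in treatments}

    ss_total = sum((y[i][j] - grand) ** 2 for i in blocks for j in treatments)
    ss_blocks = t * sum((m - grand) ** 2 for m in block_mean.values())
    ss_treat = b * sum((m - grand) ** 2 for m in treat_mean.values())
    ss_resid = ss_total - ss_blocks - ss_treat   # what the design leaves over

    df_treat, df_resid = t - 1, (b - 1) * (t - 1)
    f_ratio = (ss_treat / df_treat) / (ss_resid / df_resid)
    return {"ss_total": ss_total, "ss_blocks": ss_blocks,
            "ss_treatments": ss_treat, "ss_residual": ss_resid,
            "f_treatments": f_ratio}


# Hypothetical responses: three blocks, three treatments.
data = {
    "I":   {"A": 9,  "B": 12, "C": 15},
    "II":  {"A": 11, "B": 13, "C": 18},
    "III": {"A": 13, "B": 17, "C": 18},
}
table = rcbd_anova(data)
for name, value in table.items():
    print(f"{name:14s} {value:8.2f}")
```

The three components add up to the total by construction, which is the sense in which the design determines the analysis.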

30

Treatments in Designed Experiments

• Essentially, factors can be classified in two ways: quantitative and qualitative.

• Classical experimental design in agriculture was primarily qualitative (e.g. varieties of wheat, types of feed, etc.), although fertiliser experiments, sowing rates, etc. are quantitative.

• In contrast, industrial experiments were often done with purely quantitative factors, which quite naturally led to consideration of the response surface for an experiment, in the same way that one might look at the response profile for a quantitative response in agriculture.

31

Three simple designs

[Figure: three example layouts of treatments A, B and C]

32

Factorial treatment sets

Allow us to address more than one question in the same analysis

– Not just “Are the treatments different?”
– Are there differences between the levels of each factor?
– Do the differences between the levels of one factor depend on the level of the other factor at which we observe the response? i.e. Is there an interaction?
– Does the form of the interaction between two factors depend on the level of a third factor at which we observe the response?
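One way to see what ‘interaction’ means numerically: in a two-factor table of cell means, the interaction effect in each cell is cell mean − row mean − column mean + grand mean, and all of these are zero when the factors act additively. The sketch below uses two invented 2 × 2 tables; the function name and the numbers are hypothetical.

```python
def interaction_effects(cell_means):
    """Interaction effect per cell of a two-way table of means:
    cell - row mean - column mean + grand mean."""
    rows, cols = len(cell_means), len(cell_means[0])
    grand = sum(map(sum, cell_means)) / (rows * cols)
    row_mean = [sum(r) / cols for r in cell_means]
    col_mean = [sum(r[j] for r in cell_means) / rows for j in range(cols)]
    return [[cell_means[i][j] - row_mean[i] - col_mean[j] + grand
             for j in range(cols)] for i in range(rows)]


# Rows are levels of one factor, columns are levels of the other.
additive = [[10, 12], [14, 16]]   # second factor adds 2 at both row levels
crossing = [[10, 12], [14, 12]]   # effect of second factor reverses sign

no_interaction = interaction_effects(additive)
with_interaction = interaction_effects(crossing)
print(no_interaction)
print(with_interaction)
```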

33

Example: cut-flower trials

            dens 1    dens 2    dens 3
sowing 1
sowing 2
sowing 3

34

Example (oil residues)

Treatments: two reference oils, RL206 and RL133

Design: one furnace, running a.m. or p.m.; two different tubes (3 & 4); two replicates

        7 Dec      8 Dec      12 Dec     13 Dec
a.m.    RL206 4    RL133 4    RL133 3    RL206 3
p.m.    RL133 3    RL206 3    RL206 4    RL133 4

        14 Dec     16 Dec     19 Dec     20 Dec
a.m.    RL133 3    RL206 4    RL133 4    RL206 3
p.m.    RL206 4    RL133 3    RL206 3    RL133 4

35

A Flour experiment

Four flour formulations (four levels of factor A)

× two yeast levels (factor N, low or high)

× two proof times (factor S, short or long)

× degree of mixing (factor Q, two levels)

× dough time delay (factor T, short or long).

This was a 4 × 2^4 experiment with 64 combinations. Only 32 combinations could be used, in two blocks of 16.

36

Table 15.17 (DV1999, p.504): A blocked ½-fraction of a 4 × 2^4 experiment on bread. Only one block (of two) shown.

Block (Day)   Treatment combination   A1   A2   A1A2    N    S    Q    T
1             000011                  -1   -1     1    -1   -1    1    1
              000110                  -1   -1     1    -1    1    1   -1
              001001                  -1   -1     1     1   -1   -1    1
              001100                  -1   -1     1     1    1   -1   -1
              010010                  -1    1    -1    -1   -1    1    1
              010111                  -1    1    -1    -1    1    1   -1
              011000                  -1    1    -1     1   -1   -1    1
              011101                  -1    1    -1     1    1   -1   -1
              100010                   1   -1    -1    -1   -1    1    1
              100111                   1   -1    -1    -1    1    1   -1
              101000                   1   -1    -1     1   -1   -1    1
              101101                   1   -1    -1     1    1   -1   -1
              110011                   1    1     1    -1   -1    1    1
              110110                   1    1     1    -1    1    1   -1
              111001                   1    1     1     1   -1   -1    1
              111100                   1    1     1     1    1   -1   -1

37

A ‘fun’ example : Helicopter

38

Antony & Antony (2001)

• Objective: “to identify the optimal settings of control factors which would maximise the flight time of paper helicopters (with minimum variation)”

• Control factors: those that can be easily controlled and varied by the designer or operator

39

Example of control factors in the helicopter expt (Table 2 from A²)

Control factor    Label    Level 1    Level 2
Paper type        A        Regular    Bond
Body length       B        8 cm       12 cm
Wing length       C        8 cm       12 cm
Body width        D        2 cm       3 cm
No. of clips      E        1          2
Wing shape        F        Flat       Angled
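With six control factors at two levels each, a full factorial would need 2^6 = 64 different helicopters. The sketch below simply enumerates those combinations from the table above; it makes no claim about which subset of runs Antony & Antony actually used, and the variable names are assumptions.

```python
from itertools import product

# Factor levels copied from the control-factor table above.
factors = {
    "Paper type": ("Regular", "Bond"),
    "Body length": ("8 cm", "12 cm"),
    "Wing length": ("8 cm", "12 cm"),
    "Body width": ("2 cm", "3 cm"),
    "No. of clips": (1, 2),
    "Wing shape": ("Flat", "Angled"),
}

# Every combination of one level per factor: the full 2^6 factorial.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(len(runs), "treatment combinations")
print(runs[0])
```

Listing the full set in this way makes it easy to see why screening designs that use only a fraction of the runs are attractive when there are many factors.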

40

41

42

43

A Checklist for Designing Experiments

• Define objectives
• Identify all sources of variation
• Choose a rule for assigning units to treatments
• Specify measurement & procedure
• Run a pilot experiment
• Specify the model
• Outline the analysis
• Determine the number of observations
• Review the above and revise (if nec.)
