

Topic 1: Statistical Analysis

• Assessment Statements

– 1.1.1 State that error bars are a graphical representation of the variability of data

– 1.1.2 Calculate the mean and standard deviation of a set of values.

– 1.1.3 State that the term standard deviation is used to summarize the spread of values around the mean, and that 68% of the values fall within one standard deviation of the mean.

– 1.1.4 Explain how the standard deviation is useful for comparing the means and the spread of data between 2 or more samples

– 1.1.5 Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables.

– 1.1.6 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.


Why use statistics?

• In science, we are always making observations about the world around us.

• Many times these observations result in the collection of measurable, quantitative data.

• Using this data, we can ask questions based upon some of these observations and try to answer those questions.


Essential Statistics Terms

• Mean: the average of the data points

• Range: the difference between the largest and smallest observed values in a data set (a simple measure of the spread of the data)

• Standard Deviation: a measure of how the individual observations of a data set are dispersed or spread out around the mean

• Error bars: Graphical representation of the variability of data
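The four terms above can be sketched in a few lines of Python. This is only an illustration: the function name `summarize` is made up here, the heights are the C1 sample from the worked example later in this deck, and the error-bar limits are taken as mean ± 1 standard deviation, which is one common choice rather than something the slides specify.

```python
from statistics import mean, stdev

def summarize(values):
    """Return the mean, range, sample SD and mean ± 1 SD limits (hypothetical helper)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {
        "mean": m,
        "range": max(values) - min(values),
        "sd": s,
        "error_bar": (m - s, m + s),  # assumption: error bars drawn as mean ± 1 SD
    }

# Heights (cm) of the C1 sample, taken from the worked example in this deck
c1 = [145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182]

summary = summarize(c1)
low, high = summary["error_bar"]
print(f"mean  = {summary['mean']:.2f} cm")
print(f"range = {summary['range']} cm")
print(f"SD    = {summary['sd']:.2f} cm")
print(f"error bar from {low:.2f} cm to {high:.2f} cm")
```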


Standard Deviation

• Standard deviation is used to summarize the spread of values around the mean and to compare the means and the spread of data between two or more samples.

– In a normal distribution, about 68% of all values lie within ±1 standard deviation of the mean. About 95% lie within ±2 standard deviations of the mean.

– A normal distribution forms a bell curve.

– The shape of the curve tells us how close the data points are to the mean: a tall, narrow curve means a small standard deviation; a low, wide curve means a large one.
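The 68% and 95% figures can be checked with Python's standard library. This is a minimal sketch: `fraction_within` is a made-up helper name, and the check applies only to an ideal normal distribution, not to real samples.

```python
from statistics import NormalDist

def fraction_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_1sd = fraction_within(1)
within_2sd = fraction_within(2)
print(f"within ±1 SD: {within_1sd:.1%}")
print(f"within ±2 SD: {within_2sd:.1%}")
```

The result does not depend on which mean and standard deviation are chosen, which is why the rule can be stated for any normal distribution.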


Calculating Standard Deviation

• We can use the formula to solve for standard deviation…

$$s_x = \sqrt{\frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}}$$

Example data: Ward pg. 5

For calculator help, go to http://www.heinemann.co.uk/hotlinks, enter code 4242p and click weblink 1.4a or 1.4b

HOWEVER the easiest way to calculate…find an app online!!!!


Are our results reliable enough to support a conclusion?


Imagine we choose one child at random from each of two classrooms, D8 and C1…

… and compare their heights…


… we find that one pupil is taller than the other.

WHY?


REASON 1: There is a significant difference between the two groups, so pupils in C1 are taller than pupils in D8.

D8: YEAR 7

C1: YEAR 11


REASON 2: By chance, we picked a short pupil from D8 and a tall one from C1.

D8: Sammy (Year 9)

C1: HAGRID (Year 9)


How do we decide which reason is most likely?

MEASURE MORE STUDENTS!!!


If there is a significant difference between the two groups (D8 and C1)…

… the average or mean height of the two groups should be very…

… DIFFERENT


If there is no significant difference between the two groups (D8 and C1)…

… the average or mean height of the two groups should be very…

… SIMILAR


Remember:

Living things normally show a lot of variation, so…


It is VERY unlikely that the mean height of our two samples will be exactly the same

C1 Sample

Average height = 162 cm

D8 Sample

Average height = 168 cm

Is the difference in average height of the samples large enough to be significant?


We can analyse the spread of the heights of the students in the samples by drawing histograms

Here, the ranges of the two samples have a small overlap, so…

… the difference between the means of the two samples IS probably significant.

[Figure: two histograms of frequency (axis marked 2 to 16) against height (cm) in classes 140-149, 150-159, 160-169, 170-179 and 180-189: one for the C1 sample, one for the D8 sample.]


Here, the ranges of the two samples have a large overlap, so…

… the difference between the two samples may NOT be significant.

The difference in means is possibly due to random sampling error

[Figure: two histograms of frequency (axis marked 2 to 16) against height (cm) in classes 140-149, 150-159, 160-169, 170-179 and 180-189: one for the C1 sample, one for the D8 sample.]
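Histogram counts like these can be tallied with a short script. The sketch below uses the heights listed in the worked example later in this deck and the same 10 cm classes as the axes above. The original bar heights did not survive, so these counts are not claimed to match the figure, and `class_counts` is a made-up helper name.

```python
from collections import Counter

def class_counts(heights, width=10, start=140, stop=190):
    """Count how many heights fall in each class (140-149, 150-159, ...)."""
    counts = Counter((h - start) // width for h in heights)
    return {
        f"{lo}-{lo + width - 1}": counts[(lo - start) // width]
        for lo in range(start, stop, width)
    }

# Heights (cm) from the worked example in this deck
c1 = [145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182]
d8 = [148, 153, 157, 161, 162, 162, 163, 167, 172, 172, 175, 177, 183, 185, 187]

for label, sample in (("C1", c1), ("D8", d8)):
    print(f"{label} sample")
    for height_class, count in class_counts(sample).items():
        print(f"  {height_class} cm | {'#' * count} ({count})")
```

Both samples occupy all five classes, so their ranges overlap heavily. This is the situation where a difference in means may be due to sampling error.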


To decide if there is a significant difference between two samples we must compare the mean height for each sample…

… and the spread of heights in each sample.

Statisticians calculate the standard deviation of a sample as a measure of the spread of a sample.

You can calculate standard deviation using the formula:

$$s_x = \sqrt{\frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}}$$

Where:

$s_x$ is the standard deviation of the sample

$\sum$ stands for ‘sum of’

$x_i$ stands for the individual measurements in the sample

$n$ is the number of individuals in the sample
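As a rough check, the formula can be coded directly, reading it as the square root of (Σx² − (Σx)²/n) divided by (n − 1). The function name `sample_sd` is made up for this sketch, and the data are the C1 heights from the worked example later in this deck.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation via sqrt((sum of x^2 - (sum of x)^2 / n) / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Heights (cm) of the C1 sample from the worked example
c1 = [145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182]

s1 = sample_sd(c1)
print(f"formula:           {s1:.2f}")
print(f"statistics.stdev:  {stdev(c1):.2f}")
```

Both lines should agree, since the library function also uses n − 1 in the denominator.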


It is much easier to use the statistics functions on a scientific calculator!

Set calculator on statistics mode

MODE 2 (CASIO fx-85MS)

Clear statistics memory

SHIFT CLR 1 (Scl) =

e.g. for data 25, 34, 13

Enter data

2 5 DT (M+ Button) 3 4 DT 1 3 DT


Calculate the mean

AC SHIFT S-VAR (2 Button) 1 (x̄) = 24

Calculate the standard deviation

AC SHIFT S-VAR 3 (xσn-1) = 10.5357
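The same calculator results can be reproduced in Python for the data 25, 34, 13. This is a sketch using only the standard library; the comparison with the calculator's xσn key is added here for illustration and is not part of the slides.

```python
from statistics import mean, pstdev, stdev

data = [25, 34, 13]

x_bar = mean(data)
sd_sample = stdev(data)        # n - 1 denominator, like the calculator's xσn-1
sd_population = pstdev(data)   # n denominator, like the calculator's xσn

print(f"mean       = {x_bar}")
print(f"SD (n - 1) = {sd_sample:.4f}")
print(f"SD (n)     = {sd_population:.4f}")
```

For a sample of measurements, the n − 1 version is the one used throughout this topic.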


Student’s t-test

The Student’s t-test compares the averages and standard deviations of two samples to see if there is a significant difference between them.

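We start by calculating a number, t.

t can be calculated using the equation:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Where:

$\bar{x}_1$ is the mean of sample 1

$s_1$ is the standard deviation of sample 1

$n_1$ is the number of individuals in sample 1

$\bar{x}_2$ is the mean of sample 2

$s_2$ is the standard deviation of sample 2

$n_2$ is the number of individuals in sample 2

Only the size of t matters when it is compared with the critical value, so the smaller mean can be subtracted from the larger.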
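The equation for t translates into a short function. This is a sketch under stated assumptions: the name `t_statistic` is made up, the function takes the absolute difference of the means so that t is never negative, and the inputs are the rounded means and standard deviations quoted in the worked example that follows.

```python
from math import sqrt

def t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """t = |mean1 - mean2| / sqrt(sd1^2 / n1 + sd2^2 / n2)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(mean1 - mean2) / standard_error

# Rounded summary values for C1 and D8 from the worked example
t = t_statistic(161.60, 10.86, 15, 168.27, 11.74, 15)
print(f"t = {t:.2f}")
```

Swapping the two samples gives the same t, because only the size of the difference is used.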


Worked Example: Random samples were taken of pupils in C1 and D8

Their recorded heights are shown below…

Student heights (cm):

Students in C1: 145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182

Students in D8: 148, 153, 157, 161, 162, 162, 163, 167, 172, 172, 175, 177, 183, 185, 187

Step 1: Work out the mean height for each sample

C1: $\bar{x}_1 = 161.60$

D8: $\bar{x}_2 = 168.27$

Step 2: Work out the difference in means

$\bar{x}_2 - \bar{x}_1 = 168.27 - 161.60 = 6.67$


Step 3: Work out the standard deviation for each sample

C1: s1 = 10.86 D8: s2 = 11.74

Step 4: Calculate $s^2/n$ for each sample

C1: $\frac{s_1^2}{n_1} = 10.86^2 \div 15 = 7.86$

D8: $\frac{s_2^2}{n_2} = 11.74^2 \div 15 = 9.19$


Step 5: Calculate the square root of the sum of the two values from Step 4

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{7.86 + 9.19} = \sqrt{17.05} = 4.13$$

Step 6: Calculate t (Step 2 divided by Step 5)

$$t = \frac{\bar{x}_2 - \bar{x}_1}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{6.67}{4.13} = 1.62$$


Step 7: Work out the number of degrees of freedom

d.f. = n1 + n2 – 2 = 15 + 15 – 2 = 28

Step 8: Find the critical value of t for the relevant number of degrees of freedom

Use the 95% (p=0.05) confidence limit

Critical value = 2.048

Our calculated value of t (1.62) is below the critical value (2.048) for 28 d.f.; therefore, there is no significant difference between the heights of students in the samples from C1 and D8.
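The whole worked example can be run from the raw heights. This is a sketch rather than a replacement for the table lookup: `t_test` is a made-up helper, and the critical value 2.048 is copied from Step 8 rather than computed. Without rounding at each step, t comes out slightly lower (about 1.61) than the 1.62 obtained by hand, and the conclusion is the same.

```python
from math import sqrt
from statistics import mean, stdev

CRITICAL_T = 2.048  # critical value quoted in Step 8 for 28 d.f. at p = 0.05

# Heights (cm) from the worked example
c1 = [145, 149, 152, 153, 154, 154, 158, 160, 166, 166, 166, 167, 175, 177, 182]
d8 = [148, 153, 157, 161, 162, 162, 163, 167, 172, 172, 175, 177, 183, 185, 187]

def t_test(sample1, sample2):
    """Return (t, degrees of freedom) following Steps 1 to 7."""
    n1, n2 = len(sample1), len(sample2)
    difference = abs(mean(sample2) - mean(sample1))        # Steps 1 and 2
    variance_terms = (stdev(sample1) ** 2 / n1
                      + stdev(sample2) ** 2 / n2)          # Steps 3 and 4
    standard_error = sqrt(variance_terms)                  # Step 5
    return difference / standard_error, n1 + n2 - 2        # Steps 6 and 7

t, df = t_test(c1, d8)
significant = t > CRITICAL_T
print(f"t = {t:.2f}, d.f. = {df}")
print("significant difference" if significant else "no significant difference")
```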


Do not worry if you do not understand how or why the test works

Follow the instructions

CAREFULLY

You will NOT need to remember how to do this for your exam