Interpreting data … Drawing and comparing Box and Whisker diagrams (Box plots)

Learning objectives

All: (B grade)

• Calculate quartiles and draw box and whisker diagrams

Most: (A grade)

• Interpret box and whisker diagrams and use them to compare datasets

Some: (Stats GCSE)

• Use all terminology and use the IQR to find “outliers”

A list of data

• The weights (KG) of 15 children:

37, 42, 31, 35, 48, 29, 50, 36, 44, 28, 63, 35, 41, 52, 43

Difficult to UNDERSTAND what these children look like from the list …

• Minimum = 28KG

• Maximum = 63KG

• Range = 35KG

• Mode = 35KG

• Median = 41KG

• Mean = 40.9KG
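The summary values above can be checked with a short Python sketch. It uses only the standard `statistics` module; the variable names `weights` and `summary` are illustrative and not part of the original slides.

```python
import statistics

# The 15 weights (KG) exactly as listed on the slide.
weights = [37, 42, 31, 35, 48, 29, 50, 36, 44, 28, 63, 35, 41, 52, 43]

summary = {
    "min": min(weights),
    "max": max(weights),
    "range": max(weights) - min(weights),
    "mode": statistics.mode(weights),             # most frequent value
    "median": statistics.median(weights),         # middle value once sorted
    "mean": round(statistics.mean(weights), 1),   # rounded to 1 d.p. as on the slide
}

for name, value in summary.items():
    print(f"{name}: {value}")
```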

• The weights (KG) of 15 children:

28, 29, 31, 35, 35, 36, 37, 41, 42, 43, 44, 48, 50, 52, 63

Stem and leaf …

• 28, 29, 31, 35, 35, 36, 37, 41, 42, 43, 44, 48, 50, 52, 63

2 | 8 9
3 | 1 5 5 6 7
4 | 1 2 3 4 8
5 | 0 2
6 | 3

Key: 2 | 9 means 29

ORDERED STEM & LEAF
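One possible way to build the ordered stem and leaf diagram in Python is sketched below. The helper name `stem_and_leaf` is hypothetical, and the sketch assumes whole-number data where the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

weights = [37, 42, 31, 35, 48, 29, 50, 36, 44, 28, 63, 35, 41, 52, 43]

def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem), keeping units digits (leaves) in order."""
    rows = defaultdict(list)
    for value in sorted(values):          # sorting first gives an ORDERED diagram
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)

diagram = stem_and_leaf(weights)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 2 | 9 means 29")
```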

Another useful summary

• A diagram to show:

min (28KG), max (63KG), median (41KG) …

Min Median Max

Median

• ½(n + 1)th piece of data (ordered)

28, 29, 31, 35, 35, 36, 37, 41, 42, 43, 44, 48, 50, 52, 63

15 items of data … n = 15

½(n + 1) = ½(15 + 1) = 8th item

Lower Quartile

• ¼(n + 1)th piece of data (ordered)

28, 29, 31, 35, 35, 36, 37, 41, 42, 43, 44, 48, 50, 52, 63

15 items of data … n = 15

¼(n + 1) = ¼(15 + 1) = 4th item

Upper Quartile

• ¾(n + 1)th piece of data (ordered)

28, 29, 31, 35, 35, 36, 37, 41, 42, 43, 44, 48, 50, 52, 63

15 items of data … n = 15

¾(n + 1) = ¾(15 + 1) = 12th item
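The three position rules, ¼(n + 1), ½(n + 1) and ¾(n + 1), can be sketched as one small Python function. The names `item_at` and `quartiles` are illustrative. The slides only show whole-number positions, so the linear interpolation used for fractional positions is an assumption, and the sketch assumes at least 3 items of data.

```python
def item_at(ordered, position):
    """Value at a 1-based position in ordered data.

    A fractional position is interpolated between its two neighbours
    (an assumption; the slides only use whole-number positions).
    """
    lower = int(position)
    fraction = position - lower
    value = ordered[lower - 1]
    if fraction:
        value += fraction * (ordered[lower] - ordered[lower - 1])
    return value

def quartiles(data):
    """Return (LQ, median, UQ) using the k/4 * (n + 1)th item rule."""
    ordered = sorted(data)
    n = len(ordered)
    return tuple(item_at(ordered, k * (n + 1) / 4) for k in (1, 2, 3))

weights = [37, 42, 31, 35, 48, 29, 50, 36, 44, 28, 63, 35, 41, 52, 43]
lq, median, uq = quartiles(weights)
print(lq, median, uq)   # 4th, 8th and 12th items of the ordered list
```

Other textbooks and software use slightly different quartile rules, so results can differ a little on small datasets.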

Add that to our box plot

• A diagram to show:

min (28KG), lower quartile (35KG), median (41KG), upper quartile (48KG), max (63KG) …

Min LQ Median UQ Max

Some terminology

Alternative names for quartiles:

Min = Q0, LQ = Q1, Median = Q2, UQ = Q3, Max = Q4

Some terminology

UQ – LQ = Interquartile Range (IQR)

Max – Min = Range

Some terminology

Positive skew: median closer to LQ than UQ

Negative skew: median closer to UQ than LQ

Symmetrical distribution
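As a rough illustration, the IQR, range and skew definitions above can be applied to the five values from the children's weights. The function name `describe_box` and its `tolerance` parameter are hypothetical; `tolerance` is an assumed allowance (in data units) for calling two nearly equal gaps "symmetrical".

```python
def describe_box(low, lq, median, uq, high, tolerance=0):
    """Return IQR, range and a skew label from a five-number summary."""
    lower_gap = median - lq    # distance from LQ up to the median
    upper_gap = uq - median    # distance from the median up to UQ
    if abs(lower_gap - upper_gap) <= tolerance:
        skew = "symmetrical"
    elif lower_gap < upper_gap:
        skew = "positive skew"     # median closer to LQ than UQ
    else:
        skew = "negative skew"     # median closer to UQ than LQ
    return {"IQR": uq - lq, "Range": high - low, "Skew": skew}

# Min, LQ, median, UQ, max for the 15 weights (KG).
weights_summary = describe_box(28, 35, 41, 48, 63)
print(weights_summary)
```

For the weights the gaps are 6 and 7, so with no tolerance the label is a (very slight) positive skew.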

Interpreting the box plot

• Easily see lightest / heaviest and range

• The ‘box’ contains the middle 50% of people (the most ‘representative half’)

• The ‘whiskers’ show the lightest 25% and heaviest 25% of people (extremes)

Comparing groups

Boys

Girls

“Lightest girl lighter than lightest boy”

“Heaviest boy heavier than heaviest girl”

“Most representative half of girls generally lighter than most representative half of boys”

Comparing groups

Boys

Girls

“Lightest girl same as lightest boy”

“Heaviest boy same as heaviest girl”

“All of the most representative half of girls lighter than most representative half of boys”

“Three quarters of girls lighter than three quarters of boys”

Ascending height order …

Source: Dr Pearl’s 1938 study of 100,000 non-smokers

Queue split into four groups of 25,000 people

The Queue of DEATH!

Come on guys, this is so SLOW!

Source: Dr Pearl’s 1938 study of 100,000 smokers

Queue split into four groups of 25,000 people

The Queue of DEATH!

Woah there! I’m not ready yet!

smokers

non-smokers

Direct comparisons easy with box plots

23 boys and 11 girls were given a maths test. Their scores are listed below:

Boys: 7, 13, 15, 19, 35, 35, 37, 43, 44, 44, 45, 46, 47, 47, 49, 51, 52, 55, 55, 56, 78, 82, 91

Girls: 7, 18, 23, 47, 58, 63, 68, 72, 72, 75, 87

Use box plots to compare the boys’ and girls’ scores and comment on the differences.

Which scores (if any) might be considered ‘outliers’, and why (or why not)?

         Boys   Girls
Min         7       7
LQ         35      23
Median     46      63
UQ         55      72
Max        91      87

IQR        20      49
Range      84      80
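The table can be reproduced from the raw scores with the sketch below. It relies on the standard library's `statistics.quantiles` with `method="exclusive"`, which places the quartiles at the ¼(n + 1)th, ½(n + 1)th and ¾(n + 1)th items, the same rule as in these slides. The helper name `five_number_summary` is illustrative.

```python
import statistics

boys = [7, 13, 15, 19, 35, 35, 37, 43, 44, 44, 45, 46, 47, 47, 49,
        51, 52, 55, 55, 56, 78, 82, 91]
girls = [7, 18, 23, 47, 58, 63, 68, 72, 72, 75, 87]

def five_number_summary(scores):
    """Min, LQ, median, UQ, max plus IQR and range for one group."""
    lq, median, uq = statistics.quantiles(scores, n=4, method="exclusive")
    return {
        "Min": min(scores), "LQ": lq, "Median": median, "UQ": uq,
        "Max": max(scores), "IQR": uq - lq, "Range": max(scores) - min(scores),
    }

b, g = five_number_summary(boys), five_number_summary(girls)
print(f"{'':8}{'Boys':>6}{'Girls':>6}")
for key in b:
    print(f"{key:8}{b[key]:>6g}{g[key]:>6g}")
```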

Boys: Median – LQ = 11, UQ – Median = 9, so a roughly symmetrical distribution

Girls: Median – LQ = 40, UQ – Median = 9, so negative skew

Box plot of boys’ and girls’ maths scores (B = boys, G = girls)

Axis: maths score out of 100, from 0 to 100
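Where no drawing tools are to hand, the two box plots can be roughed out as text on a shared scale. This is only a sketch: the function name `ascii_box_plot`, the symbols used and the scale of one character per 2 marks are all assumptions made for illustration.

```python
def ascii_box_plot(label, low, lq, median, uq, high, axis_max=100, width=50):
    """Return one text row: '|' whisker ends, '-' whiskers, '=' box, 'M' median."""
    def column(value):
        # Map a score on 0..axis_max to a character column 0..width.
        return int(value * width // axis_max)

    row = [" "] * (width + 1)
    for i in range(column(low), column(high) + 1):
        row[i] = "-"                      # whiskers span min to max
    for i in range(column(lq), column(uq) + 1):
        row[i] = "="                      # box spans LQ to UQ
    row[column(low)] = "|"
    row[column(high)] = "|"
    row[column(median)] = "M"
    return f"{label} " + "".join(row)

# Five-number summaries taken from the table of boys' and girls' scores.
boys_row = ascii_box_plot("B", 7, 35, 46, 55, 91)
girls_row = ascii_box_plot("G", 7, 23, 63, 72, 87)
print(boys_row)
print(girls_row)
```

Printing both rows one under the other puts the boxes on the same axis, which is what makes the comparison direct.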

Looking for ‘outliers’

When do we feel our ‘extreme’ data is just TOO extreme?

Outliers

• High Outliers > UQ + 1.5 × IQR

• Low Outliers < LQ – 1.5 × IQR

E.g. for boys, 1.5 × IQR = 1.5 × 20 = 30

Scores less than LQ – 30 (35 – 30 = 5) are outliers

Scores more than UQ + 30 (55 + 30 = 85) are outliers

The only outlier is the score of 91 … but that is not such an unreasonable score! This is just a guide!

For girls, 1.5 × IQR = 1.5 × 49 = 73.5, so outliers are < 23 – 73.5 = –50.5 or > 72 + 73.5 = 145.5 … so no outliers there!
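The 1.5 × IQR guide can be sketched in Python as follows. The function names `outlier_fences` and `find_outliers` are illustrative, and the quartiles are typed in from the table rather than recalculated.

```python
boys = [7, 13, 15, 19, 35, 35, 37, 43, 44, 44, 45, 46, 47, 47, 49,
        51, 52, 55, 55, 56, 78, 82, 91]
girls = [7, 18, 23, 47, 58, 63, 68, 72, 72, 75, 87]

def outlier_fences(lq, uq, k=1.5):
    """Return (low fence, high fence) = (LQ - k x IQR, UQ + k x IQR)."""
    iqr = uq - lq
    return lq - k * iqr, uq + k * iqr

def find_outliers(scores, lq, uq):
    """Scores strictly below the low fence or strictly above the high fence."""
    low, high = outlier_fences(lq, uq)
    return [score for score in scores if score < low or score > high]

print(outlier_fences(35, 55), find_outliers(boys, 35, 55))    # boys: LQ 35, UQ 55
print(outlier_fences(23, 72), find_outliers(girls, 23, 72))   # girls: LQ 23, UQ 72
```

As the slide says, the fences are only a guide: a flagged score still needs a judgement about whether it is really unreasonable.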

Plenary

1) What’s the point in Box plots?

2) Give some advantages and disadvantages of Box plots over other methods of comparing data.
