Tall Stories or how a simple question doesn’t always have a simple answer

Neil Sheldon
Royal Statistical Society Centre for Statistical Education
neilsheldon.net

Some people think statistics is a branch of mathematics ...

... but they’re wrong. It’s more important than that ...

... it’s a life skill

The purpose of statistics is understanding, not numbers

Understanding statistics is understanding the world around you

Understanding statistics enables you to take better decisions

The statistics tell a story ...

... but first you have to understand the story behind the statistics

Tall Stories or how a simple question doesn’t always have a simple answer

Tall stories

• How tall am I ...
– in absolute terms? What is my height in feet and inches?
– in relative terms? Am I short, or tall, or about average?

Tall stories

• What factors influence the answers to
– the absolute question?
– the relative question?

• Variation, variation, variation!

If only ...

[Charts from the US National Health And Nutrition Examination Survey ‘NHANES III’ 1988-94, one panel per group: Men, white; Men, black; Men, Hispanic; Men, other; Women, white; Women, black; Women, Hispanic; Women, other]

But where do the data come from?

And what are the implications of sampling variation?

NormalSimulation.xlsx
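The spreadsheet itself is not reproduced here. As a rough stand-in, the sketch below draws repeated samples from a normal distribution and reports how much the sample means vary. The mean of 175 cm, the standard deviation of 7 cm, and the function name `sample_means` are illustrative assumptions, not values or names taken from the talk.

```python
import random
import statistics

def sample_means(n, repeats, mu=175.0, sigma=7.0, seed=1):
    """Draw `repeats` samples of size n from Normal(mu, sigma); return their means.

    mu and sigma are illustrative assumptions, not values from the talk.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]

for n in (10, 100, 1000):
    means = sample_means(n, repeats=500)
    # The spread of the sample means shrinks roughly like sigma / sqrt(n).
    print(n, round(statistics.stdev(means), 2))
```

Small samples give noticeably different means from one draw to the next, which is why a single small sample should not be over-interpreted.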

Cross-sectional and longitudinal studies

Age differences in height derived from cross-sectional studies can be the result of differential secular influences among the age cohorts. To determine the magnitude of height loss that accompanies aging, longitudinal studies are required. The authors studied 2,084 men and women aged 17–94 years enrolled from 1958 to 1993 in the Baltimore Longitudinal Study of Aging, Baltimore, Maryland. On average, men's height was measured nine times during 15 years and women's height five times during 9 years. The rate of decrease in height was greater for women than for men. For both sexes, height loss began at about age 30 years and accelerated with increasing age. Cumulative height loss from age 30 to 70 years averaged about 3 cm for men and 5 cm for women; by age 80 years, it increased to 5 cm for men and 8 cm for women. Am J Epidemiol 1999;150:969-77.

Longitudinal data

Overlapping longitudinal data

• Judged by the height of the doorframes he built, medieval man seems to have been short by today’s standards.

• But evidence gathered from 3,000 skeletons reveals that human height has varied little over the past 1,000 years.

• From the 10th century through to the 19th, the average height of adult men was 5ft 7in or 170cm - just 2in below today's average.

• Women were an average of 5ft 2in or 158cm - just over an inch shorter than today.

All the bones in the study came from the medieval St Peter's Church in Barton upon Humber, North East Lincolnshire.

They were shorter back then ...

Based on a modest sample of skeletons from northern Europe, average heights fell from 173.4 cm in the early Middle Ages to a low of roughly 167 cm during the 17th and 18th centuries. Taking the data at face value, this decline of approximately 6.4 cm substantially exceeds any prolonged downturns found during industrialization in several countries that have been studied. Significantly, recovery to levels achieved in the early Middle Ages was not attained until the early 20th century. It is plausible to link the decline in average height to climate deterioration; growing inequality; urbanization and the expansion of trade and commerce, which facilitated the spread of diseases; fluctuations in population size that impinged on nutritional status; the global spread of diseases associated with European expansion and colonization; and conflicts or wars over state building or religion. Because it is reasonable to believe that greater exposure to pathogens accompanied urbanization and industrialization, and there is evidence of climate moderation, increasing efficiency in agriculture, and greater interregional and international trade in foodstuffs, it is plausible to link the reversal of the long-term height decline with dietary improvements. Richard H Steckel

... Or were they?

Variation in height during the day

Did you know that astronauts are up to 2 inches taller while they're in space? As soon as they come back to Earth, though, they return to their normal height.

Imagine that the vertebrae in your back form a giant spring. Pushing down on the spring keeps it coiled tightly. When the force is released, the spring stretches out. In the same way, the spine elongates by up to three percent while humans travel in space.

To some degree, a similar stretching of the spine happens to you every night. When you lie down, gravity isn't pushing down on your vertebrae. Measure your height carefully as soon as you get up or while you are still lying down. You will find that you're about a centimeter or two taller.

Variation in height during the day

Variation with wealth

Mean height of Dutch adults

Time series data

Variation by sex

Men are, on average, taller than women.

But some women are taller than some men.

How can we quantify this?

men and women heights overlap.xls
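One possible way to quantify the overlap, sketched here without access to the spreadsheet, is to ask how likely it is that a randomly chosen woman is taller than a randomly chosen man. The sketch assumes heights in each group are independent and normally distributed; the means and standard deviations are hypothetical, and `prob_woman_taller` is a name invented for this example.

```python
from math import hypot
from statistics import NormalDist

def prob_woman_taller(mu_m, sd_m, mu_w, sd_w):
    """P(random woman taller than random man), assuming independent normal heights."""
    diff = NormalDist(mu_w - mu_m, hypot(sd_m, sd_w))  # distribution of W - M
    return 1.0 - diff.cdf(0.0)

# Hypothetical parameters in cm, chosen only for illustration.
men = NormalDist(175.0, 7.0)
women = NormalDist(162.0, 6.5)

print(round(prob_woman_taller(men.mean, men.stdev, women.mean, women.stdev), 3))
# Shared area under the two density curves (0 = no overlap, 1 = identical).
print(round(men.overlap(women), 3))
```

The two numbers answer slightly different questions: the first compares random individuals, the second compares the shapes of the two distributions.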

Data, data everywhere ...

Human height - Wikipedia, the free encyclopedia.htm

Self-reported height

Genetic variation

• Children of tall parents are, on average,
– as tall as
– taller than
– shorter than
their parents ... ?

• Parents of tall children are, on average,
– as tall as
– taller than
– shorter than
their children ... ?

Regression to the mean

[Charts: two panels plotting C against P, each showing the line C = P]

Individual heights are measured in standard deviations from the male mean or the female mean as appropriate. P is then the average of the father’s height and the mother’s height.
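Regression to the mean can be seen in a small simulation. The sketch below treats C as the child's standardized height (an assumption; the slides do not define C), simulates P and C directly as standardized scores, and uses a correlation of 0.5 picked purely for illustration. The function name `simulate_pairs` is invented for this example.

```python
import random
import statistics

def simulate_pairs(n, r=0.5, seed=2):
    """Return n (P, C) pairs of standardized scores with correlation about r.

    r = 0.5 is an illustrative assumption, not a value from the talk.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        p = rng.gauss(0.0, 1.0)
        # C shares a fraction r of P's deviation, plus independent noise
        # scaled so that C also has standard deviation 1.
        c = r * p + (1.0 - r * r) ** 0.5 * rng.gauss(0.0, 1.0)
        pairs.append((p, c))
    return pairs

pairs = simulate_pairs(50_000)

tall_parents = [(p, c) for p, c in pairs if p > 1.0]
tall_children = [(p, c) for p, c in pairs if c > 1.0]

# Children of tall parents: still above average, but closer to it than their parents.
print(statistics.fmean(p for p, _ in tall_parents),
      statistics.fmean(c for _, c in tall_parents))
# Parents of tall children: the same pattern in the other direction.
print(statistics.fmean(c for _, c in tall_children),
      statistics.fmean(p for p, _ in tall_children))
```

Under these assumptions both questions get the same answer: whichever group is selected for being tall, the other group is on average nearer the mean.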

• Variation, variation, variation

– A very simple question, ‘How tall am I?’, raises many issues to do with variation

– These issues go to the heart of many statistical concepts

• Variation, variation, variation
– variation within groups

This is ‘the usual’ concept of variation: the variability within a population or a sample is measured by the standard deviation or the inter-quartile range.
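Both measures of within-group variation are available in Python's standard library. The sample below is made up for illustration, and `spread` is a helper name invented for this sketch.

```python
import statistics

# Made-up sample of heights in cm, for illustration only.
heights = [158, 162, 165, 167, 170, 171, 173, 176, 180, 188]

def spread(values):
    """Return (standard deviation, inter-quartile range) of a sample."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.stdev(values), q3 - q1

sd, iqr = spread(heights)
print(round(sd, 1), round(iqr, 1))
```

Note that quartiles can be computed by several conventions; `statistics.quantiles` uses its default method here, so other software may give a slightly different inter-quartile range.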

• Variation, variation, variation
– variation between groups

Groups may differ from one another. Sometimes the variation between groups is more important, sometimes it is less important, than the variation within groups. (Analysis of variance treats this in fine detail.)
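The comparison of between-group and within-group variation can be sketched as a split of the total sum of squares, which is the starting point for analysis of variance. The two groups and their heights are made up, and `sum_of_squares_split` is a name invented for this example.

```python
import statistics

def sum_of_squares_split(groups):
    """Split total sum of squares into between-group and within-group parts."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = statistics.fmean(all_values)
    between = sum(
        len(values) * (statistics.fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    within = sum(
        (x - statistics.fmean(values)) ** 2
        for values in groups.values()
        for x in values
    )
    total = sum((x - grand_mean) ** 2 for x in all_values)
    return between, within, total

# Made-up heights in cm, for illustration only.
heights = {
    "group A": [170, 174, 178, 182],
    "group B": [158, 162, 166, 170],
}
between, within, total = sum_of_squares_split(heights)
print(between, within, total)  # between + within equals total
```

When the between-group part dominates, group membership explains most of the variation; when the within-group part dominates, knowing the group tells you little about an individual.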

• Variation, variation, variation
– variation within individuals

Sometimes the attribute to be measured is not constant:

it may have a cyclical variation
it may have a trend over time
... and it may have both

• Variation, variation, variation
– variation over time

Where an attribute is observed to vary with time, the variation may be

cross-sectional: “the older ones were like that when they were young”

longitudinal: “that’s what happens as you get older”

... or a combination of the two
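The difference between the two readings can be made concrete with a toy model. Everything in the sketch below is invented for illustration: a secular trend of 0.1 cm per later birth year, a constant loss of 0.075 cm per year after age 30 (picked so that the loss from 30 to 70 is about 3 cm), and the function name `height_cm`.

```python
def height_cm(birth_year, age):
    """Hypothetical mean height for a birth cohort at a given age.

    Both effects below are invented for illustration:
    - secular trend: each later birth year adds 0.1 cm to peak adult height;
    - ageing: 0.075 cm lost per year after age 30.
    """
    peak = 170.0 + 0.1 * (birth_year - 1900)
    loss = 0.075 * max(0, age - 30)
    return peak - loss

survey_year = 1990

# Cross-sectional: compare 30-year-olds with 70-year-olds in the same survey year.
cross_sectional_gap = (height_cm(survey_year - 30, 30)
                       - height_cm(survey_year - 70, 70))

# Longitudinal: follow one cohort (born 1920) from age 30 to age 70.
longitudinal_gap = height_cm(1920, 30) - height_cm(1920, 70)

print(cross_sectional_gap, longitudinal_gap)
```

In this made-up setting the cross-sectional gap is larger than the longitudinal one, because it mixes the cohort difference with the ageing effect; only the longitudinal comparison isolates what happens as people get older.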

• Variation, variation, variation
– historical variation

A longer-term variation that may be quite distinct from longitudinal or cross-sectional variation

• Variation, variation, variation
– variation in definition

Any attribute being measured or counted has first to be defined. It is very common for definitions to vary from one situation to another

• Variation, variation, variation
– sampling “error” variation

A misnomer, as it is in the nature of samples to vary: it’s not a bug but a feature. We all know that samples vary, but we are often tempted to read more information into a sample than it can actually offer

• Variation, variation, variation
– sampling bias variation

There are many ways in which a non-random sample can be unrepresentative. Opportunity sampling – measuring or counting whatever is at hand – may be the most common and the most dangerous

• Variation, variation, variation
– self-reporting variation

Self-reported data lack objectivity and so can be deeply misleading. They are like anecdotal evidence on a large scale.

• Variation, variation, variation
– variation and correlation

Strong correlations reduce variation: knowing the value of one variable can reduce the uncertainty about another
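A small simulation illustrates the point. The sketch below generates standardized pairs with a chosen correlation r, fits a least-squares line, and compares the spread of y before and after x is taken into account. The data are simulated, the values of r are arbitrary, and `residual_sd` is a name invented for this example.

```python
import random
import statistics

def residual_sd(xs, ys):
    """Standard deviation of y left over after a least-squares line on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    residuals = [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]
    return statistics.stdev(residuals)

rng = random.Random(3)
for r in (0.0, 0.5, 0.9):
    xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    ys = [r * x + (1 - r * r) ** 0.5 * rng.gauss(0.0, 1.0) for x in xs]
    # Knowing x leaves about sqrt(1 - r^2) of the original spread in y.
    print(r, round(statistics.stdev(ys), 2), round(residual_sd(xs, ys), 2))
```

The reduction is modest for moderate correlations and only becomes large when r is close to 1.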

• Variation, variation, variation
– variation by error

And, underpinning everything else, there are the errors we all make in counting and measuring, recording and tabulating. Even if all the other sources of variation are controlled and understood, our own fallibility ensures that there will always be variation in the data.
