
Name ____________________ Maths teacher _____

YEAR 12 Statistics 2013

Evaluate a statistically based report

AS91266 2.11 Level 2 Internal 2 credits Evaluate a statistically based report


Statistics – following the PPDAC cycle (the statistical enquiry cycle)

Problem – This is the stage where you define the question you want to answer. Ask yourself:
how do we go about answering this question?
what do we need to know?
how will we find the information that we need?
what will we do with the information that we collect?
who will find this information useful?
is this information relevant to the problem?

Plan – This stage is about how you will gather the data. Ask yourself:
How would you answer the question now, before you gather the data?
how will we gather this data?
what data will we gather?
what measurement system will we use?
how are we going to record this information?

Data – This stage is concerned with how the data is collected, managed and organised. Ask yourself:
How shall we record the data (in a table)?

Analysis – This stage is about exploring the data, calculating statistics, drawing graphs and interpreting them in terms of the question posed. This is the "I notice, I wonder" stage. Ask yourself:
What is a typical value?
Where are most of the values?
What sort of graph would display the data best?
What sort of scale shall I use for the axes?
Use graphical language, e.g. spread, skew, mean, mode, variation.

Conclusion – This stage is about answering the question in the problem section and providing reasons based on your analysis.


Understanding and using statistical language

We need to use statistical language to describe data and to communicate precise information.

A population can be any group of individuals or measurements that we are interested in finding out about.

A census is a survey of the whole population, asking every individual the same questions.

A sample is part of the population that we can measure to find out about the population, using an inference.

Parameters are measures which describe a population, such as mean, median, inter-quartile range.

Statistics are measures which describe a sample, such as mean, median, inter-quartile range.

An inference is an estimate for a population parameter, based on a sample statistic.

A sample can give useful information about a typical member of the population and about the shape of the population distribution, but not about largest or smallest individuals in the population.

A numerical data set has a distribution which describes how the data varies among the sample or population. A distribution can be described by:

a measure of the centre or typical value such as mean, median, mode.

a measure of the variation or spread, such as IQR or standard deviation

a measure of the shape, such as symmetry or skew

a description of unusual features of the data, such as outliers.

A distribution is best displayed by a graph such as a dot plot, histogram or bar graph.


Measures of Central Tendency

A single number that can represent the typical value of a set of numbers.

The Mean is the sum of a set of numbers divided by how many numbers are in the set.

The Median is the MIDDLE NUMBER (or average of two middle numbers) of a group of numbers listed in order from smallest to largest.

The Mode is the number that occurs the most often in a set of data.

Measure: Mean
Advantages: Easy to calculate. Every value is taken into account.
Disadvantages: Influenced by outliers.

Measure: Median
Advantages: Typical of middle data values. Not affected by outliers or incorrect values.
Disadvantages: Hard to calculate manually.

Measure: Mode
Advantages: For non-numeric data it is the only average you can use.
Disadvantages: May not exist. Tells us nothing about the remaining data.
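The three averages can be checked with a short Python sketch. The scores below are made up for illustration (they are not from this booklet), and the sketch assumes Python 3.8 or later for `multimode`. It shows the point made above: one outlier moves the mean a lot but barely moves the median or the mode.

```python
from statistics import mean, median, multimode

# Made-up scores, used only to illustrate the definitions above.
scores = [2, 3, 3, 4, 5, 5, 5, 5]
# The same scores with one extreme value (an outlier) added.
with_outlier = scores + [40]

for label, data in [("original", scores), ("with outlier", with_outlier)]:
    # multimode returns a list, because a data set can have more than one mode.
    print(label, "mean:", mean(data), "median:", median(data), "mode:", multimode(data))
```

For these made-up scores the mean doubles from 4 to 8 when the outlier is added, while the median only moves from 4.5 to 5 and the mode stays at 5.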


An important idea

Consider the scores for 5 contestants, Adeline, Belinda, Carey, Dorothy and Elika, in a trampoline competition. There are six judges who each give a score from one to six to each contestant. The results for each contestant are shown in the dot plots below.

We need a way of distinguishing between the different spread of data for each distribution. The range does not distinguish between them. We use the standard deviation. Use your calculator in stats mode to calculate the mean and standard deviation for each set of data. Use the x̄ and xσn functions.

[Dot plots: trampoline results for contestants A, B, C, D and E, each on a score axis from 0 to 7]

What do you notice about the distributions of the scores? They all have the same mean and median and range, but the actual distributions are all different.

How does the standard deviation vary with the spread of the data?

Small SD means little spread (most values close to the mean).

Large SD means a lot of spread (many values far from the mean).

Note that a uniform distribution has a middling spread.

contestant   mean   sd
A            3.5    2.1
B            3.5    2.5
C            3.5    1.5
D            3.5    2.2
E            3.5    1.8

The standard deviation is a measure of the spread of a set of data. It is a rough measure of the average distance from the mean.
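As a rough check of this idea, here is a minimal Python sketch. The two score sets are hypothetical (the individual scores behind the booklet's dot plots are not given), chosen so that both have the same mean and range. `pstdev` divides by n, which is assumed here to match the calculator's xσn function.

```python
from statistics import mean, pstdev

# Hypothetical judge scores, NOT the booklet's data: same mean and range,
# but one set is bunched near the middle and the other sits at the extremes.
bunched = [1, 3, 3, 4, 4, 6]
extreme = [1, 1, 1, 6, 6, 6]

def summary(scores):
    """Return the mean, range and population standard deviation (1 d.p.)."""
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),
        "sd": round(pstdev(scores), 1),
    }

print(summary(bunched))
print(summary(extreme))
```

Both sets have mean 3.5 and range 5, so only the standard deviation (1.5 against 2.5) tells them apart.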


Measures of Spread

Measures the variation in a set of numbers.

Low variation means most values are close to the centre. High variation means lots of values are far from the centre.

Range is the difference between the upper and lower extremes (the difference between the maximum and minimum values).

Interquartile range is the difference between the upper and lower quartiles, which is the width of the middle 50% of your data.

Standard deviation is a measure of spread. A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data are spread out over a large range of values. It is affected by outliers and skewed data.
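The range and interquartile range can be sketched in Python as follows. The data are made up, and the quartiles are found as the medians of the lower and upper halves of the sorted data. That is only one of several quartile conventions, so a calculator or spreadsheet may give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Lower and upper quartiles as medians of the two halves (one convention)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2          # the middle value is left out when n is odd
    return median(data[:half]), median(data[-half:])

def spread(values):
    """Return the range and interquartile range of the values."""
    lower_q, upper_q = quartiles(values)
    return {"range": max(values) - min(values), "iqr": upper_q - lower_q}

clean = [1, 3, 5, 7, 9, 11, 13, 15]          # made-up data
with_outlier = [1, 3, 5, 7, 9, 11, 13, 150]  # largest value replaced by an outlier

print(spread(clean))
print(spread(with_outlier))
```

In this example the outlier stretches the range from 14 to 149 but leaves the interquartile range at 8, because the IQR only looks at the middle half of the data.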


Estimating the Mean and Standard Deviation

These data are about the height, weight and age of bears. The measurements were recorded in America.

[Dot plot: Bears, Length, axis from 40 to 80]
Mean: 61.3 in   Standard deviation: 9.4 in

[Dot plot: Bears, Weight, axis from 0 to 600]
Mean: 192 pounds   Standard deviation: 110 pounds

[Dot plot: Bears, Age, axis from 0 to 160]
Mean: 43 months   Standard deviation: 34 months

[Dot plot: heights of year 9 students, axis from 80 to 220]
Mean: 156.7 cm   Standard deviation: 14.6 cm

This dot plot shows the number of words remembered in Kim’s Game by a year 9 class and a year 12 class.

           mean   SD
year 12    13.1   2.4
year 9     9.0    2.8

This dot plot shows the time taken to write a sentence with the dominant hand and the non-dominant hand by a class of year 12 students.

[Dot plot: writing time (secs) for dominant hand and non-dominant hand, axis from 0 to 40]

                    mean       SD
dominant hand       13.6 sec   1.4 sec
non-dominant hand   34.9 sec   4.6 sec


SAMPLING

Statistics involves the collection and analysis of data to find answers to complex problems. When possible, information may be collected from the entire group under investigation. This is called a census. Sampling involves surveying some of the group of interest. A sample survey is carried out to collect information (data) when it is impractical, too expensive or unnecessary to carry out a census.

Situations involving sampling:

Polls to establish public opinion related to politics or current events or business opinion

Market research on goods and services

Scientific experiments or tests eg taking a sample of blood.

Radio and TV ratings are based on the listening and viewing habits of a sample of people. These determine advertising rates and the money available for programming.

Government agencies collect information to make predictions for the needs of schools, hospitals, social services.

Reasons for sampling

Economic: it is too expensive to survey the entire population

Time: collecting data takes time

Availability of information: the entire population may not be accessible

The nature of the method of data collection, e.g. finding the number of hours a light bulb lasts (the test uses up the bulb, so not every bulb can be tested).

The target population
The population under investigation is called the target population. A population is not necessarily people; it could be ice creams, light bulbs, the time taken to send a text, etc.

The Sampling Frame
In order to make a selection from the target population a list called the sampling frame is used. Some examples of possible sampling frames are a school roll, the electoral roll for a district, a map showing the houses in an area, a farmer’s database of her stock. Sometimes the sampling frame may not exactly coincide with the target population, for example using the telephone directory to survey adults in Auckland.

Bias
To be effective a sample should be representative of the target population so that correct conclusions may be reached about the population as a whole. The characteristics of the sample should be the same as the characteristics of the target population. If the sample does not accurately represent the target population the survey method is said to be biased.

Size of the sample
The larger the sample, the more likely your sample statistic is to be a good estimate of the population parameter. However, for estimating the mean or median of populations up to 1000, a sample of about 30 is usually big enough to give a reasonable estimate.

Making an inference
For each sampling method, once the sample units have been selected the data value for each is recorded. Statistics can be calculated from the sample such as the mean, median, range, standard deviation. These statistics can then be used to make an estimate of the parameter for the target population. Making an estimate in this way is called making an inference about the population.


Sampling variation
The variation in a sample statistic from sample to sample due to the variability in the population, the sample size and random variation.

Suppose a sample is taken and a sample statistic, such as a sample mean, is calculated. If a second sample of the same size is taken from the same population, it is almost certain that the sample mean calculated from this sample will be different from that calculated from the first sample. If further sample means are calculated, by repeatedly taking samples of the same size from the same population, then the differences in these sample means illustrate sampling variation.
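Sampling variation can be seen in a small simulation. This is a sketch under stated assumptions: the population is a hypothetical list of the whole numbers 1 to 1000, and the function name `sample_means` and the seed are illustrative choices, not part of the booklet.

```python
import random
from statistics import mean

def sample_means(population, sample_size, repeats, seed=0):
    """Take repeated random samples of the same size and return each sample mean."""
    rng = random.Random(seed)   # fixed seed so the run can be repeated
    return [mean(rng.sample(population, sample_size)) for _ in range(repeats)]

# Hypothetical population: the whole numbers 1..1000 (true mean 500.5).
population = list(range(1, 1001))

# Five samples of size 30 give five different estimates of the mean.
print(sample_means(population, 30, 5))

# "Sampling" the whole population is a census, so every mean is the true mean.
print(sample_means(population, 1000, 3))
```

The sample means from samples of 30 differ from each other and from 500.5, which is sampling variation leading to sampling error. The census has neither.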

Sampling error
The error (in an estimate of a population parameter, based on a sample statistic) caused because data are collected from part of a population rather than the whole population (even if the sample is unbiased).

Sampling error occurs because of sampling variation. Even if identical sampling methods are used, two samples are likely to give different estimates of the population mean, percentage or standard deviation, which are also different from the true population mean, percentage or standard deviation. If the sampling method is valid and reliable, the sample will represent the population but there is likely to be some sampling error in each estimate of a population parameter. If the sample is very small (less than 30 for an estimate of the median or mean, or less than 250 for an estimate of a proportion) then sampling error may cause the sample to be biased and unrepresentative of the population. If the sample is large it may be biased for other reasons (non-sampling error), but it is unlikely to be biased due to sampling error.

An estimate of a population parameter, such as a sample mean or sample proportion, is different for different samples (of the same size) taken from the population. Sampling error is due to sampling variation and is one reason for the difference between an estimate and the true, but unknown, value of the population parameter. The other reason is non-sampling error.

Non-sampling error
The error (in an estimate of a population parameter, based on a sample statistic) caused by human error (in designing, carrying out or contributing to the survey). Non-sampling errors have the potential to cause bias in estimates based on surveys or samples.

To minimise non-sampling error:

Use a sampling frame which is representative of the population

Use a sampling method (random or systematic) which is likely to give a representative sample

Make sure the survey questions are clear, unbiased and easy to answer

Some sources of non-sampling error are more difficult to control:

People who choose not to answer (non-response)

People who don’t tell the truth

There are many types of non-sampling errors, and the names used for them are not consistent. Some examples of non-sampling errors causing bias in the sample are:

The sampling frame or sampling process is such that a specific group is excluded or under-represented in the sample, deliberately or inadvertently. If the excluded or under-represented group is different, with respect to survey issues, then bias will occur.

The sampling process allows individuals to select themselves. Individuals with strong opinions or those with substantial knowledge will tend to be over-represented, creating bias.

Bias will occur if people who refuse to answer have different views of the survey issues from those who respond. This can also happen with people who are never contacted and people who have yet to make up their minds.

If the response rate (the proportion of the sample that takes part in a survey) is low, bias can occur because respondents may tend consistently to have views that are more extreme than those of the population in general.

The wording of questions, the order in which they are asked, and the number and type of options offered can influence survey results.

Answers given by respondents do not always reflect their true beliefs because they may feel under social pressure not to give an unpopular or socially undesirable answer.

Answers given by respondents may be influenced by the desire to impress an interviewer.


Sampling Strategies
Statisticians have developed various methods for taking a sample. To produce a survey that is free from bias the method should ensure that the sampling frame is representative of the target population, and that every unit in the sampling frame has an equal chance of being included in the sample.

1. Simple Random Sampling
This is the mathematical equivalent of drawing the names out of a hat, or sticking a pin randomly in a list of names. It involves:

Allocating a number to every unit in the sampling frame
Generating random numbers
Matching the random numbers generated to the units
Recording what you are interested in about each unit

Notes
1. If the same random number comes up more than once it should be disregarded and another random number generated to replace it.
2. Although the simple random sampling method is free from bias, the sample may turn out not to be representative of the population.
3. A disadvantage of this method is that it is time consuming or impossible to carry out with large populations.
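The steps above can be sketched in Python. The sampling frame (a list of 40 made-up student labels), the function name and the seed are assumptions for illustration only.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n different units from the sampling frame using random numbers."""
    if not 0 <= n <= len(frame):
        raise ValueError("sample size must not exceed the size of the frame")
    rng = random.Random(seed)
    # Step 1: allocate a number (1, 2, 3, ...) to every unit in the frame.
    numbered = dict(enumerate(frame, start=1))
    chosen = set()
    while len(chosen) < n:
        # Step 2: generate a random number. A repeated number changes nothing
        # in the set, so it is in effect disregarded and another one is drawn.
        chosen.add(rng.randint(1, len(frame)))
    # Step 3: match the random numbers to the units.
    return [numbered[number] for number in sorted(chosen)]

# Hypothetical sampling frame: a class list of 40 students.
frame = [f"student_{i}" for i in range(1, 41)]
print(simple_random_sample(frame, 5, seed=1))
```

Python's built-in `random.sample(frame, 5)` does the same job in one call. The longer version is written out only to mirror the steps listed above.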

2. Systematic Sampling
This is the mathematical equivalent of selecting every 10th person on a list. It involves:

Using a random number to find a starting point on the sampling frame
Dividing the total by the sample size (and rounding) to find how many units to count to select the next unit

Notes
1. This method has the advantage of being quicker than simple random sampling.
2. If the target population has recurring patterns in it the sample may not be representative.
3. If the counting number is too large the method becomes awkward.
4. If the counting number is too small the sample may not be representative of the population.
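A minimal sketch of systematic sampling follows, using a hypothetical frame of 100 numbered units. Two choices here are assumptions rather than part of the method as described: the counting number is rounded down (so that no unit can be picked twice), and counting wraps around to the start of the list when it runs off the end.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick n units by counting on a fixed step from a random starting point."""
    if not 1 <= n <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)
    step = len(frame) // n               # counting number, rounded down (assumption)
    start = rng.randrange(len(frame))    # random starting point on the frame
    # Wrap around the end of the list so any start gives n units (assumption).
    return [frame[(start + i * step) % len(frame)] for i in range(n)]

# Hypothetical sampling frame: units numbered 1..100.
frame = list(range(1, 101))
print(systematic_sample(frame, 10, seed=3))
```

With 100 units and a sample of 10 the counting number is 10, so every selected unit ends in the same digit. This is also why recurring patterns in the list can make the sample unrepresentative.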

3. Stratified Sampling
This method is only used when you have evidence that there are subgroups in the population which you expect to have different parameters. It involves:

Splitting the population into layers or strata. Each unit in the population is allocated to one layer, e.g. male/female. The number to be selected from each layer is calculated to be in the same proportion as the number in each layer in the population. For example, from a group of 60 men and 20 women, for a sample of size 8 you would select 6 men and 2 women.
Taking a simple random sample or systematic sample from each layer in the usual way.

Notes
1. This method guarantees that each stratum is represented.
2. The success of the method depends on the choice of strata.
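The proportional calculation in the example (60 men, 20 women, sample of 8) can be written as a short Python sketch. The function name is illustrative. Note that with other numbers the rounded amounts may not add up exactly to the sample size, so they would need adjusting by hand.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Number to select from each stratum, in proportion to its size."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }

# The example from the text: 60 men and 20 women, sample of size 8.
allocation = proportional_allocation({"men": 60, "women": 20}, 8)
print(allocation)   # {'men': 6, 'women': 2}
```

Men make up 60 out of 80, or three quarters of the population, so they get three quarters of the 8 places in the sample.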

4. Cluster sampling
This method chooses a part or parts of the population which is believed to be representative and samples only from that part, using one of the methods above. E.g. surveying people in Ellerslie, as it has an ethnic breakdown which is representative of all Auckland.

Notes
1. It is easier and cheaper than sampling from the whole population.
2. The success of the method depends on the choice of the cluster(s).

5. Non probability sampling

These methods include convenience sampling (e.g. person-on-the-street surveys) and quota sampling (a form of convenience sampling in which target numbers are set for certain groups to ensure representation, such as sampling equal numbers of men and women).

Notes
1. This method assumes that the people encountered in the convenience sample are representative of the population, which may or may not be a valid assumption.
2. If the sample is not representative of the population you can’t make a valid inference about the population.

6. Self selected sampling
An extreme form of non probability sampling is self-selected sampling (e.g. text in your vote).

Notes
1. With self selected sampling it is usual to get responses only from those people who feel strongly about the issue in question.
2. There is a very high chance that a sample obtained with self selected sampling will not be representative of the target population.
3. It is highly likely that an inference made from the sample would not be valid for the target population.

Summary of advantages and disadvantages of different sampling methods

The advantages only apply when the sampling frame is representative of the population.

Method: Simple random sampling
Advantage: Usually representative.
Disadvantage: May be time consuming and expensive to organise for a large population.

Method: Systematic sampling
Advantage: Quicker and easier to organise than a simple random sample but still likely to be representative.
Disadvantage: Cannot be used when there may be cyclic patterns in the data.

Method: Stratified sampling
Advantage: Ensures each identified stratum of the population is represented. Comparisons can be made between strata.
Disadvantage: Requires prior knowledge of the population.

Method: Cluster sampling
Advantage: Usually representative; may be less expensive than simple random sampling.
Disadvantage: Relies on the clusters selected being representative of the population.

Method: Convenience sampling
Advantage: The sampling units are chosen because they are easy to access.
Disadvantage: Unlikely to be representative.

Method: Self-selected sampling
Advantage: This relies on people volunteering to take part in the research.
Disadvantage: Unlikely to be representative.


Survey Methods

Questionnaires and other surveys can be completed in a face-to-face or telephone interview, or self-administered on paper or internet. When choosing a survey method, you need to consider who your target group is, the best way to reach them, the cost, and the time available. Even the best-designed survey will have some non-response which may bias the results. There are advantages and disadvantages for each method.

Method: Written self-administered
Advantages:
Cost is relatively low
Geographic distribution can be wide
Sensitivity issues handled well
Disadvantages:
Response rate may be poor and biased towards more educated and those with an interest in the topic
No knowledge about non-response
Long time between data collection and analysis

Method: Internet self-administered
Advantages:
Low cost (no paper, no data entry costs, no postage)
Data collection is quick
Geographic distribution may be wide
Questionnaires may be complex because the skips are programmed in
Pop-up instructions, videos, voice-overs, animation are available to make it more fun and dynamic
Disadvantages:
Bias against those without internet access
Self selection bias
Non-response bias

Method: Face-to-face interview
Advantages:
Good control of question order
Good quality of responses
Appropriate for some sensitive issues
Disadvantages:
Cost is high
Data collection period is long
Geographic distribution must be clustered
Takes a long time

Method: Telephone interview
Example: Call centre selling. Numbers are selected randomly from a phone book or by generating random numbers.
Advantages:
Cost is relatively low
Geographic distribution can be wide
Sensitivity issues handled well
Disadvantages:
Response rate may be poor and biased towards more educated and those with an interest in the topic
No knowledge about non-response
Long time between data collection and analysis

Questionnaire design must ensure that the questions asked:

Are easy to understand
Are not leading questions (designed to get one particular answer)
Allow for all possible responses

A well designed questionnaire will enable useful information to be collected. A poorly designed questionnaire will result in non-sampling errors due to non-response, biased data and incorrect responses.

More big ideas in statistics:

Reliability describes the repeatability and consistency of a test or sample. A sampling process is reliable if it gives a similar distribution each time it is repeated.

Example:

RELIABILITY AND STATISTICS
Physical scientists expect to obtain exactly the same results every single time, due to the relative predictability of the physical realms. If you are a nuclear physicist or an inorganic chemist, repeat experiments should give exactly the same results, time after time.

Ecologists and social scientists, on the other hand, understand fully that achieving exactly the same results is an exercise in futility. Research in these disciplines incorporates random factors and natural fluctuations and, whilst any experimental design must attempt to eliminate confounding variables and natural variations, there will always be some disparities.

The key to performing a good experiment is to make sure that your results are as reliable as is possible; if anybody repeats the experiment, powerful statistical tests will be able to compare the results and the scientist can make a solid estimate of statistical reliability.

Read more: http://www.experiment-resources.com/definition-of-reliability.html#ixzz1fosgGb00

Validity defines the strength of the final results and whether they can be regarded as accurately describing the real world. A sampling process is valid if it is unbiased and is likely to give a sample that is representative of the population the sample comes from.

Example:


Comparing RELIABILITY and VALIDITY
Reliability and validity are often confused, but the terms describe two different concepts, although they are often closely inter-related. The difference is best summed up with an example:

Example: A researcher devises a new test that measures IQ more quickly than the standard IQ test:

If the new test delivers scores for a candidate of 87, 65, 143 and 102, then the test is neither reliable nor valid, and it is fatally flawed.

If the test consistently delivers a score of 100 when checked, but the candidate’s real IQ is 120, then the test is reliable, but not valid.

If the researcher’s test delivers a consistent score of 118, then that is pretty close, and the test can be considered both valid and reliable.

Questions, questions, questions…

The problem is the big question an investigation is trying to answer, the purpose of an investigation. It is often written as a question.

A survey question is a question asked in a survey or questionnaire in order to get information to help answer the problem question.

A critical question (or worry question or interrogative question) is a question asked by someone interpreting data or reading a statistical report (see the statistical literacy critical questions at the end of this booklet).

STATISTICAL LITERACY

Who needs statistical literacy?

Statistical literacy is needed by data consumers – anyone who tries to evaluate numerical information. Statistical literacy is needed most by journalists, policy analysts, decision makers and by political, economic and social leaders, but most of all by the citizens of a modern democracy.

What should a statistically-literate person be able to do?

Statistical literates should be able to evaluate number-based claims in the media. Consider these newspaper headlines:

Soft Drinks Could Boost Pancreatic Cancer Risk.

Absent Dads cause Earlier Puberty in Girls.

Weddings boost mood

Shooter video games can improve decision making

New heart disease drug does not improve patient outcomes

A statistically-literate reader can tell that the first two claims will have much weaker support because the outcome is not repeatable for a given person: you only get cancer once; you only go through puberty once. Comparing different people weakens the argument.

They can tell that the last three claims have stronger support since the outcomes can be measured before and after the event or condition in question. But only the last one can have fairly strong support. The last study is the one in which the outcome is repeatable AND the subject can be assigned, unknowingly, to either get the new drug or to get a placebo.

Statistical literacy is like speed reading or speed dating. You get more information faster.

Statistical literates can spot the difference between association and causation.

A statistically-literate reader knows that words like “kills”, “causes” and “blame” make “causal claims” that are much more disputable than “association claims” involving words like “attributed to”, “associated with”, “tied to”, “linked to” or “due to”.

A statistically-literate reader can spot obvious errors in news stories. Consider these:

Racial Imbalance Persists at Elite Public Schools. New York Times 11/08/2008.
“at Stuyvesant...2% of blacks, 3% of Hispanics, 24% of whites and 72% of Asians were accepted.” The total of about 100% is suspicious. As written, these are acceptance rates for four separate groups, not parts of a pie, so there is no reason for that total. The report should have said “among those accepted, 2% are blacks, 3% are Hispanics, 24% are whites and 72% are Asians.” Since these are parts of the same “pie”, they should total about 100% (allowing for rounding).

Study says too much candy could lead to prison. AP 9/30/2009.
“Of the children who ate candies or chocolates daily at age 10, 69 percent were later arrested for a violent offense by the age of 34.” This is an incredible statistic. Do you believe eating candy daily can predict criminal behaviour 20 years in advance? No! The truth: “69% of respondents who were violent criminals by the age of 34 years reported that they ate confectionery nearly every day during childhood.” The AP reversed the order: “69% of daily candy-eating kids became violent criminals by 34” is very different from “69% of violent criminals by age 34 had been daily candy-eaters as kids”.
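The size of this reversal can be shown with a short Python sketch. All three counts below are hypothetical, chosen only so that 69% of the violent offenders were daily candy eaters. They are not figures from the study or the news report.

```python
# Hypothetical counts, invented for illustration only.
violent_offenders = 100      # people who were violent offenders by age 34
daily_candy_eaters = 4000    # people who ate candy daily as children
both = 69                    # people who are in both groups

# "69% of violent offenders had been daily candy eaters"
candy_given_violent = both / violent_offenders

# "?% of daily candy eaters became violent offenders"
violent_given_candy = both / daily_candy_eaters

print(f"Of violent offenders, {candy_given_violent:.0%} ate candy daily")
print(f"Of daily candy eaters, {violent_given_candy:.1%} became violent offenders")
```

The same 69 people give 69% when divided by the number of offenders but under 2% when divided by the number of candy eaters, so it matters which group the percentage is "of".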

Statistically-literate readers look for weasel words: count words that imply much but assert very little – words like many, some or few, lots or little, high or low, often or seldom.

Many teens share prescription drugs.
Some elderly get futile care.
Adult video gamers often overweight, depressed.
Older drivers in fewer crashes.

Statistically-literate readers look out for ideas that are vague.

Consider these headlines:

High exposure to BPA linked to low sperm count. How low is low? By choosing a higher sperm count cutoff, the number of men with low sperm count is increased.

Too much TV psychologically harms children. Exactly how much TV is too much? What did they consider harm? What did they consider psychological harm?

A statistically-literate person has an idea of when a relationship is not likely to be causal. Consider the traffic fatality graph at right. As the amount of lemons imported from Mexico increased, the US traffic fatality rate decreased. The lack of a plausible mechanism and the small effect size (a 6% drop from 15.8 to 14.8) all but invite alternate explanations.


A statistically-literate reader knows the difference between frequently and likely.

Car most frequently stolen: Honda Civic
Car most likely to be stolen: Cadillac Escalade

Frequently is a count. Honda Civics are common so the number stolen is higher. Here likely is a rate per car. Cadillac Escalades are less common so the theft-rate is higher.
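The difference between a count and a rate can be sketched in Python. The figures are hypothetical, made up for a "common model" and a "rare model", and are not real theft statistics for any car.

```python
# Hypothetical figures, invented for illustration only.
cars = {
    "common model": {"on_road": 1_000_000, "stolen": 5_000},
    "rare model": {"on_road": 50_000, "stolen": 500},
}

def theft_rate_per_1000(car):
    """Thefts per 1000 cars of that model on the road (a rate, not a count)."""
    return 1000 * car["stolen"] / car["on_road"]

# "Most frequently stolen" compares counts.
most_frequent = max(cars, key=lambda model: cars[model]["stolen"])
# "Most likely to be stolen" compares rates.
most_likely = max(cars, key=lambda model: theft_rate_per_1000(cars[model]))

print("Most frequently stolen:", most_frequent)
print("Most likely to be stolen:", most_likely)
```

Here the common model has ten times as many thefts, but each individual rare model is twice as likely to be stolen (10 per 1000 against 5 per 1000).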

A statistically-literate reader knows the difference between real statistics and speculative statistics. Which counts are real: deaths due to poisoning or drownings versus deaths due to obesity, radon or second-hand smoke? Answer: the former (coroner-certified); the latter are all speculative.

A statistically-literate person knows how to read statements involving rates and percentages. Do these statements say the same thing?

Percentage of women who smoke vs. percentage of smokers who are women.
Death rate of men vs. male rate of death.

In both cases, the answer is “No”. How about these statements?

Percentage of women who smoke vs. percentage of smokers among women.

Here the answer is “Yes.”

Statistical literacy critical questions to ask about the article or report

Purpose of the article or report and identification of the population of interest.

A description of measures and data representations used in the article or report and an evaluation of the appropriateness of these to the purpose.

How was the data displayed in the article?

Are the displays or measures appropriate for the type of data?

Are the displays or measures misleading in any way?

What summary statistics were used in the article?

Do the comments match the graphs/displays given?

Were outliers or extreme values present in the data, and if so, how were they handled?

A description of the sampling or survey method, including reference to sample size when available, used in the article or report and an evaluation of the appropriateness of these to the purpose.

Is the original data available?

What type of data is it, categorical or numerical?

How accurate is the data?

Did the data require cleaning?

Where is the data that was quoted/used in the article from?


What were the survey questions asked?

What was the data collection method?

Were the survey questions appropriate?

Could the survey questions be misinterpreted or not give the data needed?

What were/are the variables of interest?

How were the variables of interest measured?

An evaluation of the validity and bias of the information presented in the media report. This may involve using relevant contextual knowledge. Consider how the author/s of the article or report collected the information, and the assumptions that the author/s made.

Consider any bias present in the article or report. Bias is where the author may have a particular point of view. A biased article or report may still be valid, even though it is one-sided.

It is not enough just to state that an article or report is (or is not) biased or valid - you must comment on why it is biased or not biased, or valid or not valid. Put notes on your report about the bias and validity of your sources of information.

Do the comments (descriptions) made in the article or report reflect accurately the data given?

Are any comments misleading or biased?

Could alternative analyses be made?

Could the data have been interpreted in another way?

What data/information is not present?

A summary of the results of the investigation and an evaluation of the effectiveness of the article or report in meeting the purpose. This may involve using relevant contextual knowledge.

What questions is the article or report answering (what are the investigative questions)?

Who is the article intended to be about (who is the intended population)?

Who is the article or report aimed at (who might be interested in the outcomes)?

What is the purpose of the article or report?

What further information is needed?

Are there any underlying or lurking variables that may have an impact on the outcome?

Are claims made in the article or report valid and/or sensible?