Major Topics/Strands
A. Interpreting Categorical and Quantitative Data (ID): Exploring Data
B. Conditional Probability and the Rules of Probability (CP): Anticipating Patterns in Advance
C. Making Inferences and Justifying Conclusions (IC): Statistical Inference
Slide 3
Why Statistics? Arthur Benjamin, TED Talk 2009: Teach Statistics
over Calculus. http://www.youtube.com/watch?v=BhMKmovNjvc
Slide 4
Why Statistics? (cont'd) Most people will take at most one
Statistics class in their lives. That includes everyone from future
senators to sales clerks, as well as presidents, CEOs, jurors,
doctors, and other decision makers. It's our job to teach them how
to make informed decisions!
Slide 5
Prudential Age Commercial: an awesome data collection example.
http://www.youtube.com/watch?v=C3qj88J7-jA
Slide 6
Slide 7
Types of Variables!!
Categorical Data: M&M colors; gender; whether an individual has a cellular phone
Quantitative Data: height; arm span; distance from home
Slide 8
Graphing Variables
Categorical Data: pie chart, bar chart, two-way table
Quantitative Data: dotplot, stemplot, histogram, scatterplot, time plot
Slide 9
Common Core Math 1 Goals Summarize, represent, and interpret
data on a single count or measurement variable. S-ID.1 Represent
data with plots on the real number line (dot plots, histograms, and
box plots). S-ID.2 Use statistics appropriate to the shape of the
data distribution to compare center (median, mean) and spread
(interquartile range, standard deviation) of two or more different
data sets. S-ID.3 Interpret differences in shape, center, and
spread in the context of the data sets, accounting for possible
effects of extreme data points (outliers).
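As a rough illustration of S-ID.2, the sketch below computes the center and spread measures named in the standard for one data set. It uses the unrestricted-sleep values listed on Slide 55 of this deck; the helper name `summarize` is made up for this example, and the quartiles use Python's default "exclusive" method, which may differ slightly from the convention in a given textbook.

```python
import statistics

# Unrestricted-sleep values copied from Slide 55 of this deck.
scores = [-7.0, 11.6, 12.1, 12.6, 14.5, 18.6, 25.2, 30.5, 34.5, 45.6]

def summarize(data):
    """Return measures of center and spread for one quantitative variable."""
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(data),  # sample standard deviation
    }

summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Comparing the mean with the median, and the standard deviation with the IQR, gives students a quick way to see how much an extreme value pulls on each measure.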
Slide 10
Middle Grades Foundation 6th grade: Develop understanding of
statistical variability. 6.SP.1. Recognize a statistical question
as one that anticipates variability in the data related to the
question and accounts for it in the answers. 6.SP.2. Understand
that a set of data collected to answer a statistical question has a
distribution which can be described by its center, spread, and
overall shape. 6.SP.3. Recognize that a measure of center for a
numerical data set summarizes all of its values with a single
number, while a measure of variation describes how its values vary
with a single number. Summarize and describe distributions. 6.SP.4.
Display numerical data in plots on a number line, including dot
plots, histograms, and box plots. 6.SP.5. Summarize numerical data
sets in relation to their context. a) Reporting the number of
observations. b) Describing the nature of the attribute under
investigation, including how it was measured and its units of
measurement. c) Giving quantitative measures of center (median
and/or mean) and variability (interquartile range and/or mean
absolute deviation), as well as describing any overall pattern and
any striking deviations from the overall pattern with reference to
the context in which the data were gathered. d) Relating the choice
of measures of center and variability to the shape of the data
distribution and the context in which the data were gathered.
Slide 11
Middle Grades Foundation 7th grade: Use random sampling to
draw inferences about a population. 7.SP.1. Understand that
statistics can be used to gain information about a population by
examining a sample of the population; generalizations about a
population from a sample are valid only if the sample is
representative of that population. Understand that random sampling
tends to produce representative samples and support valid
inferences. 7.SP.2 Use data from a random sample to draw inferences
about a population with an unknown characteristic of interest.
Generate multiple samples (or simulated samples) of the same size
to gauge the variation in estimates or predictions. Draw informal
comparative inferences about two populations. 7.SP.3 Informally
assess the degree of visual overlap of two numerical data
distributions with similar variabilities, measuring the difference
between the centers by expressing it as a multiple of a measure of
variability. For example, the mean height of players on the
basketball team is 10 cm greater than the mean height of players on
the soccer team, about twice the variability (mean absolute
deviation) on either team; on a dot plot, the separation between
the two distributions of heights is noticeable. 7.SP.4 Use measures
of center and measures of variability for numerical data from
random samples to draw informal comparative inferences about two
populations. For example, decide whether the words in a chapter of
a seventh-grade science book are generally longer than the words in
a chapter of a fourth-grade science book.
Slide 12
Activity #1 Tennis Balls Using a ruler, measure the diameter of
a tennis ball to the nearest millimeter. Write your measurement on
a Post-it note and place it on the board above our number line.
Describe the distribution.
Slide 13
Activity #2 Peanuts! Don't freak out, there are none in the room!
My students took a sample of unshelled peanuts and measured the
lengths of those peanuts in millimeters. We then created a line
plot.
Slide 14
Middle School Foundation 8.SP.4. Understand that patterns of
association can also be seen in bivariate categorical data by
displaying frequencies and relative frequencies in a two-way table.
Construct and interpret a two-way table summarizing data on two
categorical variables collected from the same subjects. Use
relative frequencies calculated for rows or columns to describe
possible association between the two variables. For example,
collect data from students in your class on whether or not they
have a curfew on school nights and whether or not they have
assigned chores at home. Is there evidence that those who have a
curfew also tend to have chores?
Slide 15
Activity #3 M&M Data Take a pack of snack size M&Ms and
compare it to a pack of regular size M&Ms. Create a two-way
table to compare the data. How can we compare the data? How can
we graph the data?
Slide 16
Middle School Foundation 8th grade: Investigate patterns of
association in bivariate data. 8.SP.1. Construct and interpret
scatter plots for bivariate measurement data to investigate
patterns of association between two quantities. Describe patterns
such as clustering, outliers, positive or negative association,
linear association, and nonlinear association. 8.SP.2 Know that
straight lines are widely used to model relationships between two
quantitative variables. For scatter plots that suggest a linear
association, informally fit a straight line, and informally assess
the model fit by judging the closeness of the data points to the
line.
Slide 17
Activity #4 Typhoons in the Pacific This is a problem I adapted
from problem #6 of the 2013 AP Statistics exam.
Slide 18
Slide 19
Common Core Math 2 Goals S-CP.1 Describe events as subsets of a
sample space (the set of outcomes) using characteristics (or
categories) of the outcomes, or as unions, intersections, or
complements of other events ("or," "and," "not") with visual
representations including Venn diagrams. S-CP.2 Understand that two
events A and B are independent if the probability of A and B
occurring together is the product of their probabilities, and use
this characterization to determine if they are independent.
Slide 20
Common Core Math 2 Goals S-CP.3 Understand the conditional
probability of A given B as P(A and B)/P(B), and interpret
independence of A and B as saying that the conditional probability
of A given B is the same as the probability of A, and the
conditional probability of B given A is the same as the probability
of B. S-CP.4 Construct and interpret two-way frequency tables of
data when two categories are associated with each object being
classified. Use the two-way table as a sample space to decide if
events are independent and to approximate conditional
probabilities.
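A small sketch of S-CP.3 and S-CP.4 may help here: it treats a two-way table as a sample space, computes P(A given B) as P(A and B)/P(B), and checks independence. The counts are hypothetical (invented for illustration, not class data), and the helper `prob` is likewise just a name chosen for this example.

```python
from fractions import Fraction

# Hypothetical counts (not real data): first label = curfew?, second = chores?
table = {
    ("curfew", "chores"): 18,
    ("curfew", "no chores"): 6,
    ("no curfew", "chores"): 9,
    ("no curfew", "no chores"): 3,
}
total = sum(table.values())

def prob(row=None, col=None):
    """P(row and col); leaving an argument as None sums over that variable."""
    count = sum(
        n for (r, c), n in table.items()
        if (row is None or r == row) and (col is None or c == col)
    )
    return Fraction(count, total)

p_chores = prob(col="chores")
p_curfew = prob(row="curfew")
p_both = prob("curfew", "chores")

# Conditional probability: P(chores | curfew) = P(chores and curfew) / P(curfew)
p_chores_given_curfew = p_both / p_curfew

# Independence check from S-CP.2: P(A and B) == P(A) * P(B)
independent = p_both == p_chores * p_curfew

print(p_chores, p_chores_given_curfew, independent)
```

In this made-up table P(chores | curfew) equals P(chores), so the two events are independent; changing any one count is a quick way to let students see the equality break.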
Slide 21
Common Core Math 2 Goals S-CP.5 Recognize and explain the
concepts of conditional probability and independence in everyday
language and everyday situations. S-CP.6 Find the conditional
probability of A given B as the fraction of B's outcomes that also
belong to A, and interpret the answer in terms of the model.
Slide 22
Middle Grade Alignment
CCSS.Math.Content.7.SP.C.5 Understand that the probability of a
chance event is a number between 0 and 1 that expresses the
likelihood of the event occurring. Larger numbers indicate greater
likelihood. A probability near 0 indicates an unlikely event, a
probability around 1/2 indicates an event that is neither unlikely
nor likely, and a probability near 1 indicates a likely event.
CCSS.Math.Content.7.SP.C.6 Approximate the probability of a chance
event by collecting data on the chance process that produces it and
observing its long-run relative frequency, and predict the
approximate relative frequency given the probability.
CCSS.Math.Content.7.SP.C.7 Develop a probability model and use it
to find probabilities of events. Compare probabilities from a model
to observed frequencies; if the agreement is not good, explain
possible sources of the discrepancy.
CCSS.Math.Content.7.SP.C.7a Develop a uniform probability model by
assigning equal probability to all outcomes, and use the model to
determine probabilities of events.
CCSS.Math.Content.7.SP.C.7b Develop a probability model (which may
not be uniform) by observing frequencies in data generated from a
chance process.
Slide 23
Middle Grade Alignment
CCSS.Math.Content.7.SP.C.8 Find probabilities of compound events
using organized lists, tables, tree diagrams, and simulation.
CCSS.Math.Content.7.SP.C.8a Understand that, just as with simple
events, the probability of a compound event is the fraction of
outcomes in the sample space for which the compound event occurs.
CCSS.Math.Content.7.SP.C.8b Represent sample spaces for compound
events using methods such as organized lists, tables and tree
diagrams. For an event described in everyday language (e.g.,
"rolling double sixes"), identify the outcomes in the sample space
which compose the event.
CCSS.Math.Content.7.SP.C.8c Design and use a simulation to generate
frequencies for compound events.
CCSS.Math.Content.8.SP.A.4 Understand that patterns of association
can also be seen in bivariate categorical data by displaying
frequencies and relative frequencies in a two-way table. Construct
and interpret a two-way table summarizing data on two categorical
variables collected from the same subjects. Use relative
frequencies calculated for rows or columns to describe possible
association between the two variables.
Slide 24
Why Probability?
Looking at games of chance: card games, lotteries, fantasy sports, horse racing
Looking at social science data: life, death, medical field, biostatistics
Looking at scientific data: variations in individual measurements are random (example: tennis ball diameter measurements)
Slide 25
Chance Behavior Chance Behavior is unpredictable in the short
run but has a regular and predictable pattern in the long run.
Slide 26
Randomness We call a phenomenon random if individual outcomes
are uncertain but there is nonetheless a regular distribution of
outcomes in a large number of repetitions.
Slide 27
Principles of Randomness
1. Randomness requires a long series of independent trials.
2. The idea is empirical. We can estimate a real-world probability by actually observing many trials (e.g., a simulation combining class data).
3. Short runs give only a rough estimate; several hundred trials may be needed before the estimated probability settles down.
Slide 28
Definition of Probability The probability of any outcome of a
random phenomenon is the proportion of times the outcome would
occur in a very long series of repetitions. That is, the
probability is a long-term relative frequency.
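The long-run idea can be made concrete with a short simulation. The sketch below rolls a simulated fair die and tracks the running proportion of sixes; the function name, the seed, and the choice of "rolling a six" as the outcome are assumptions made for illustration.

```python
import random

def running_proportions(n_rolls, seed=1):
    """Proportion of sixes after each of n_rolls simulated fair-die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    sixes = 0
    props = []
    for i in range(1, n_rolls + 1):
        if rng.randint(1, 6) == 6:
            sixes += 1
        props.append(sixes / i)
    return props

props = running_proportions(10_000)
for n in (10, 100, 1_000, 10_000):
    print(f"after {n:>6} rolls: proportion of sixes = {props[n - 1]:.3f}")
```

Early proportions jump around, while later ones should sit close to 1/6, which is the sense in which probability is a long-term relative frequency.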
Slide 29
Interpreting Probabilities Ex. (a) There is a 0.3 chance of rain
tomorrow. How do you interpret this statement?
Slide 30
Interpreting Probabilities Ex. (a) There is a 0.3 chance of rain
tomorrow. Answer: Over a long run of days with the same conditions
as tomorrow, it rains on about 30% of them. Meteorologists may have
examined 100 days, 200 days, or more; the claim is probably not
based on just 10 days of which 3 had rain.
Slide 31
Interpreting Probabilities Ex. (b) Your probability of winning
at this lottery game is 1/1000. How do you interpret this
statement?
Slide 32
Interpreting Probabilities Ex. (b) Your probability of winning
at this lottery game is 1/1000. Answer: Over a very long run of
plays of this lottery under the same conditions, about 1 play in
1,000 would win. This does not guarantee a win within 1,000 plays;
it may take 1,000, 2,000, or more plays before a win occurs.
Slide 33
Must be Independent!!! For this long-run view of randomness to
apply, the trials must be independent: the outcome of one trial
does not influence the outcome of another. Example: rolling a die.
Rolling a 3 does not influence the probability of rolling a 6 on
the next roll.
Slide 34
Slide 35
Sample Space The sample space S of a random phenomenon is the set
of all possible outcomes.
Slide 36
Event An event is any outcome or set of outcomes of a random
phenomenon. An event is a subset of the sample space.
Slide 37
Probability Model A probability model is a mathematical
description of a random phenomenon consisting of two parts:
1. A sample space
2. A way of assigning probabilities to events
Slide 38
Example #1 Consider a situation in which shoppers were
categorized by gender (M or F) and the type of music purchased (C =
classical, R = rock, K = country, and P = rap).
a) What is the sample space?
b) What probability is associated with each outcome?
c) Describe the event in which a shopper purchased classical.
d) Describe the event in which the shopper was male.
Slide 39
Example #2 An observer stands at the bottom of a freeway
off-ramp and records the turning direction (L = left, R = right) of
each of three successive vehicles.
What is the sample space?
What's the probability of each outcome?
What event has exactly one car turning right?
What event has exactly one car turning left?
What event has all cars turning the same direction?
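One way to check answers to Example #2 is to enumerate the sample space by machine. The sketch below is illustrative only: the helper `event` is a made-up name, and treating the 8 outcomes as equally likely (probability 1/8 each) is an assumption the slide does not state, since real drivers need not turn left and right equally often.

```python
from itertools import product

# All sequences of L/R for three successive vehicles.
sample_space = ["".join(turns) for turns in product("LR", repeat=3)]

def event(predicate):
    """Return the outcomes in the sample space that satisfy the predicate."""
    return [outcome for outcome in sample_space if predicate(outcome)]

exactly_one_right = event(lambda o: o.count("R") == 1)
exactly_one_left = event(lambda o: o.count("L") == 1)
all_same_direction = event(lambda o: len(set(o)) == 1)

print(sample_space)        # 8 outcomes, LLL through RRR
print(exactly_one_right)   # ['LLR', 'LRL', 'RLL']
print(exactly_one_left)    # ['LRR', 'RLR', 'RRL']
print(all_same_direction)  # ['LLL', 'RRR']
```

Each event is literally a sub-list of the sample space, which mirrors the definition of an event as a subset of the sample space.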
Slide 40
Assigning Probability Some outcomes are equally likely and some
are not. Students need to be aware that the total number of
outcomes is not always the denominator of the probability. Let's
start with some scenarios that may or may not be equally likely.
Slide 41
Equally Likely Events (a) Whether a fair die lands on 1,2,3,4,5
or 6. (b) The sum of two fair dice landing on 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12 (c) A fair coin landing on heads or tails when tossed
(d) A fair coin landing on heads or tails when spun on its side.
(e) A tennis racquet landing with the label up or down when spun on
its end
Slide 42
Equally Likely Events (f) Your grade in this course being A, B,
C, D, or F (g) Whether or not California experiences a catastrophic
earthquake within the next year (h) Whether or not your server
correctly brings you the meal you ordered in a restaurant (i)
Whether or not there is intelligent life on Mars (j) Whether or not
a woman will be elected President in the next election.
Slide 43
Equally Likely Events (k) Whether or not a woman will be
elected President before the year 2010. (l) Colors of Reese's Pieces
candies: orange, yellow and brown
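Item (b) above is a useful contrast case: the 36 ordered rolls of two fair dice are equally likely, but the 11 possible sums are not. A minimal sketch that counts them (variable names are chosen for this example):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely ordered rolls give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
sum_probs = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in sum_probs.items():
    print(f"sum {s:>2}: {counts[s]} of 36 rolls, probability {p}")
```

A sum of 7 arises from 6 of the 36 rolls while a sum of 2 arises from only 1, so using "11 possible sums" as the denominator would give the wrong probability.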
Slide 44
Probability Example #1 The heart association claims that only
10% of US adults over age 30 can pass the President's Physical
Fitness Commission's minimum requirements. In a group of 4 randomly
chosen adults, what is the probability that exactly 2 can pass and
2 cannot pass?
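One possible solution path, assuming the 4 adults pass or fail independently with the claimed 10% pass rate, is the binomial formula: C(4, 2) x 0.1^2 x 0.9^2. The sketch below evaluates it; `binomial_pmf` is a helper written for this example, not part of the original worksheet.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 6 arrangements of 2 passes among 4 adults, each with chance 0.1^2 * 0.9^2.
p_two_pass = binomial_pmf(2, 4, 0.10)
print(round(p_two_pass, 4))  # 0.0486
```

The factor of 6 is where students most often slip: each specific arrangement (such as pass, pass, fail, fail) has probability 0.0081, and there are six such arrangements.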
Slide 45
Probability Example #2 Advertising Agency Worksheet
Slide 46
Simulation The imitation of chance behavior, based on a model
that accurately reflects the situation, is called a
simulation.
Slide 47
Simulation Steps
State: What is the question of interest about some chance process?
Plan: Describe how to use a chance device to imitate one repetition of the process. Explain clearly how to identify the outcomes of the chance process and what variable to measure.
Do: Perform many repetitions of the simulation.
Conclude: Use the results to answer the question of interest.
Slide 48
Probability Example #3 Eric Staal, center for the Carolina
Hurricanes, is off to a strong start to the NHL season. He is
getting about 8 shots on goal a game and is making a third of his
shots. What is the probability that Eric scores 4 goals in a
game?
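This example can be worked with the State/Plan/Do/Conclude simulation steps. The sketch below is one way to do so under simplifying assumptions that go beyond the slide: exactly 8 shots every game, each shot scoring independently with probability 1/3, and "scores 4 goals" read as exactly 4. Function names and the seed are illustrative.

```python
import random
from math import comb

def simulate_game(rng, shots=8, p_goal=1 / 3):
    """Plan: one repetition = 8 shots, each a goal with probability 1/3."""
    return sum(rng.random() < p_goal for _ in range(shots))

def estimate_p_four_goals(reps=100_000, seed=2024):
    """Do: repeat many games and record how often exactly 4 goals occur."""
    rng = random.Random(seed)
    hits = sum(simulate_game(rng) == 4 for _ in range(reps))
    return hits / reps

estimate = estimate_p_four_goals()

# Conclude: compare with the exact binomial value, 70 * (1/3)^4 * (2/3)^4.
exact = comb(8, 4) * (1 / 3) ** 4 * (2 / 3) ** 4
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

In class the same plan can be run by hand with a die (say, 1 or 2 means a goal), and the pooled class results compared with the computed value of about 0.17.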
Slide 49
Slide 50
Common Core Math 3 Goals Understand and evaluate random
processes underlying statistical experiments S-IC.1 Understand
statistics as a process for making inferences about population
parameters based on a random sample from that population. Make
inferences and justify conclusions from sample surveys,
experiments, and observational studies S-IC.3 Recognize the
purposes of and differences among sample surveys, experiments, and
observational studies; explain how randomization relates to each.
S-IC.4 Use data from a sample survey to estimate a population mean
or proportion; develop a margin of error through the use of
simulation models for random sampling. S-IC.5 Use data from a
randomized experiment to compare two treatments; use simulations to
decide if differences between parameters are significant. S-IC.6
Evaluate reports based on data.
Slide 51
Middle Grades Alignment 7th grade: Use random sampling to draw
inferences about a population. 7.SP.1. Understand that statistics
can be used to gain information about a population by examining a
sample of the population; generalizations about a population from a
sample are valid only if the sample is representative of that
population. Understand that random sampling tends to produce
representative samples and support valid inferences. 7.SP.2 Use
data from a random sample to draw inferences about a population
with an unknown characteristic of interest. Generate multiple
samples (or simulated samples) of the same size to gauge the
variation in estimates or predictions.
Slide 52
Middle Grades Alignment Draw informal comparative inferences
about two populations. 7.SP.3 Informally assess the degree of
visual overlap of two numerical data distributions with similar
variabilities, measuring the difference between the centers by
expressing it as a multiple of a measure of variability. 7.SP.4 Use
measures of center and measures of variability for numerical data
from random samples to draw informal comparative inferences about
two populations.
Slide 53
Activity #1 - The "1 in 6 wins" Game As a special promotion for
its 20-ounce bottles of soda, a soft drink company printed a
message on the inside of each cap. Some of the caps said "Please
try again," while others said "You're a winner!" The company
advertised the promotion with the slogan "1 in 6 wins a prize."
Seven friends each buy one 20-ounce bottle of the soda at a local
convenience store. The clerk is surprised when three of them win a
prize. Is this group of friends just lucky, or is the company's
claim inaccurate?
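A die works well as the chance device for this activity (a roll of 1 means "winner"). For a larger number of repetitions, the sketch below simulates the same process in code, assuming the company's claim is true and that the seven bottles are independent; the function names and seed are made up for this example.

```python
import random

def winners_among_friends(rng, friends=7, p_win=1 / 6):
    """One repetition: count winners among 7 bottles if 1 in 6 really wins."""
    return sum(rng.random() < p_win for _ in range(friends))

def estimate_p_at_least(k=3, reps=100_000, seed=7):
    """Proportion of repetitions in which at least k of the friends win."""
    rng = random.Random(seed)
    hits = sum(winners_among_friends(rng) >= k for _ in range(reps))
    return hits / reps

p_hat = estimate_p_at_least()
print(f"estimated P(3 or more winners out of 7) = {p_hat:.3f}")
```

The exact binomial value is about 0.096, so three winners out of seven is unusual but not wildly so if the claim is true; whether that counts as convincing evidence against the company is a good discussion question.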
Slide 54
Activity #2 Sleep Deprivation Source: Rossman et al., NSF
Project. Researchers have established that sleep deprivation has a
harmful effect on visual learning. But do these effects linger for
several days, or can a person make up for sleep deprivation by
getting a full night's sleep on subsequent nights? A recent study
investigated this question by randomly assigning 21 subjects to one
of two groups: one group was deprived of sleep on the night
following training and pre-testing with a visual discrimination
task, and the other group was permitted unrestricted sleep on that
first night. Both groups were then allowed as much sleep as they
wanted on the following two nights. All subjects were then
re-tested on the third day.
Slide 55
Sleep Deprivation Data Subjects' performance on the test was
recorded as the minimum time (in milliseconds) between stimuli
appearing on a computer screen for which they could accurately
report what they had seen on the screen. The values listed are
improvement scores between pre-test and re-test; negative values
indicate worse performance on the re-test.
Sleep deprivation (n = 11): -14.7, -10.7, -10.7, 2.2, 2.4, 4.5, 7.2, 9.6, 10.0, 21.3, 21.8
Unrestricted sleep (n = 10): -7.0, 11.6, 12.1, 12.6, 14.5, 18.6, 25.2, 30.5, 34.5, 45.6
Slide 56
Did sleep deprivation cause the difference in performance? Or is
there another possible explanation?
Slide 57
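Rerandomizing Simulation
Place 21 cards (one per subject, each labeled with that subject's value) in a bag.
If there is no difference in treatment effects, each subject's value would be the same as in the original study regardless of group.
How large a difference in group means arises from different random assignments?
Mix your cards and draw 10 to represent the unrestricted group. Compare your mean to 19.82, the unrestricted group's mean in the study, and report it.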
Slide 58
Physical simulation can be tedious
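Software can take over the card shuffling. The sketch below is one possible re-randomization of the 21 values from the data slide: it reshuffles them into groups of 10 and 11 many times and records how often the difference in means is at least as large as the observed one. The function name, seed, and number of repetitions are assumptions for illustration, not the applet or tool used in the original session.

```python
import random
from statistics import mean

deprived = [-14.7, -10.7, -10.7, 2.2, 2.4, 4.5, 7.2, 9.6, 10.0, 21.3, 21.8]
unrestricted = [-7.0, 11.6, 12.1, 12.6, 14.5, 18.6, 25.2, 30.5, 34.5, 45.6]

# Observed difference in group means: 19.82 - 3.90 = 15.92
observed_diff = mean(unrestricted) - mean(deprived)

def rerandomize(reps=10_000, seed=11):
    """Proportion of random reassignments with a difference >= observed."""
    rng = random.Random(seed)
    cards = deprived + unrestricted  # the 21 "cards"
    count = 0
    for _ in range(reps):
        rng.shuffle(cards)
        # First 10 cards play the unrestricted group, the other 11 the deprived.
        diff = mean(cards[:10]) - mean(cards[10:])
        if diff >= observed_diff:
            count += 1
    return count / reps

p_value = rerandomize()
print(f"observed difference = {observed_diff:.2f}")
print(f"proportion of shuffles at least this large = {p_value:.4f}")
```

If only a small proportion of shuffles produce a difference as large as the one observed, random assignment alone is an unlikely explanation for the study's result.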
Slide 59
Final Thoughts: sleep deprivation
Research question: Do the effects of sleep deprivation on visual learning last for several days?
Idea: suppose there's no treatment effect. Could the differences be due to random assignment alone? Re-randomize many times.
What would you conclude?