Foundations of Inferential Statistics
PADM 582
University of La Verne
Soomi Lee, Ph.D.
Overview
• Review Descriptive Statistics
• Some in-class exercises
1. Hypothesis
2. Probability
3. Significance
Why should we care?
Levels of Measurement
• Quantitative (interval, ratio)
• Ordinal
• Nominal (qualitative)
Respondent ID | Months in Unemployment | Level of Satisfaction in New Job Training Program (scale 1-5) | Gender (1 = female; 0 = male)
1 | 13 | 5 | 1
2 | 22 | 4 | 0
3 | 16 | 3 | 0
4 | 4 | 4 | 1
5 | 9 | 2 | 1
Ways to Summarize Data
1. List the Sample
2. Frequency Table
3. Summary Statistics
– Central tendency: mean, median, mode
– Dispersion (spread): standard deviation, range, variance
– Skewness, kurtosis
4. Graphs and Charts
– Histogram (quantitative)
– Bar, column, pie charts (ordinal, nominal)
– Line chart: historical data (time series data)
• Correlation: description of the relationship between two variables
– Correlation coefficient, coefficient of determination
– Visualization: scatter plot
In-class Exercise 1
Should the department send the dean the mean age or the median age?
Member | Age
Williamson | 64
Campbell | 31
Gonzales | 65
Marquez | 27
Seymour | 35
Sandoval | 40
Weber | 33
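To compare the two candidates in In-class Exercise 1, the mean and the median of the seven ages can be computed directly. This is a minimal sketch using Python's standard `statistics` module; the dictionary name `ages` is just an illustrative choice.

```python
from statistics import mean, median

# Ages from In-class Exercise 1.
ages = {
    "Williamson": 64, "Campbell": 31, "Gonzales": 65, "Marquez": 27,
    "Seymour": 35, "Sandoval": 40, "Weber": 33,
}

values = list(ages.values())
mean_age = mean(values)      # sum of ages divided by the number of members
median_age = median(values)  # middle value after sorting

print(f"Mean age:   {mean_age:.1f}")   # Mean age:   42.1
print(f"Median age: {median_age}")     # Median age: 35
```

The two oldest members (64 and 65) pull the mean well above the median, which is the kind of gap the exercise asks you to think about.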
In-Class Exercise 2
Who is correct?
Case Worker | Case Load
Williamson | 43
Campbell | 57
Gonzales | 35
Marquez | 87
Seymour | 36
Sandoval | 93
Weber | 45
Kim | 48
Meier | 41
Becker | 40
In-Class Exercise 3
Skewness = -22.46
What does this number mean to the chief custodial engineer?
In-Class Exercise 4
There is really no fairness issue to worry about. Is it true?
Police | Fire
610 | 570
590 | 580
650 | 700
650 | 600
640 | 480
580 | 690
550 | 740
550 | 450
Mean = 603 | Mean = 601
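The two means in In-Class Exercise 4 are nearly identical, so a measure of spread is needed to say anything more. The sketch below computes the sample standard deviation and the range for each column with the standard `statistics` module. Treating these as sample (n - 1) standard deviations is an assumption, since the slide does not say which version is intended.

```python
from statistics import mean, stdev

# Values from In-Class Exercise 4.
police = [610, 590, 650, 650, 640, 580, 550, 550]
fire = [570, 580, 700, 600, 480, 690, 740, 450]

for name, values in (("Police", police), ("Fire", fire)):
    spread = max(values) - min(values)  # range: largest minus smallest value
    print(
        f"{name}: mean = {mean(values):.2f}, "
        f"sd = {stdev(values):.1f}, range = {spread}"
    )
```

The standard deviation for the Fire column comes out at roughly 104 against roughly 42 for Police, so equal means do not imply that the two groups are distributed alike.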
Summary Statistics
What you will learn in Chapter 7
1. The difference between samples and populations (again)
2. The importance of…
• The null hypothesis
• The research hypotheses
3. How to judge a good hypothesis
Hypothesis (Ch.7)
What is a hypothesis?
• An “educated guess”
• Their role is to reflect the general problem statement or question that is driving the research
• Translates the problem or research question into a form that can be tested.
Samples and Populations
• Population
– The large group to which you would like to generalize your findings
• Sample
– The smaller, representative group of the population that is used to do the research
• Sampling error – a measure of how well a sample represents the population
The Null Hypothesis
• A statement that two or more things are equal or unrelated to one another
H0: μ1 = μ2
– The starting point: it is accepted as true in the absence of other information
– A benchmark against which actual outcomes are compared
The Research Hypothesis
• Statement that there is a relationship between two variables
• Two types…
1. Nondirectional H1: X̄1 ≠ X̄2
• Reflects a difference; direction is not specified
• Two-tailed test
2. Directional H1: X̄1 > X̄2
• Reflects a difference; direction is specified
• One-tailed test
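The practical difference between a nondirectional (two-tailed) and a directional (one-tailed) hypothesis shows up once a test statistic is converted into a probability, which the probability and significance slides cover later. The sketch below assumes a z statistic and a standard normal distribution; the value 1.8 is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def p_values(z):
    """Return (two_tailed, upper_one_tailed) probabilities for a z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    upper_tail = 1 - std_normal.cdf(z)              # directional: group 1 > group 2
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))   # nondirectional: either direction
    return two_tailed, upper_tail

two_tailed, one_tailed = p_values(1.8)  # hypothetical test statistic
print(f"two-tailed p = {two_tailed:.3f}, one-tailed p = {one_tailed:.3f}")
```

With this hypothetical value the one-tailed probability is about .036 and the two-tailed probability about .072, so the same data fall on different sides of a .05 cutoff depending on which research hypothesis was stated in advance.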
Differences between Null and Research Hypotheses
Null Hypothesis | Research Hypothesis
No relationship between variables | Relationship between variables
Refers to the population | Refers to the sample
Indirectly tested | Directly tested
Written using Greek symbols | Written using Roman symbols
Implied hypothesis | Explicit hypothesis
What Makes a Good Hypothesis?
• Stated in a declarative form rather than as a question
• Defines an expected relationship between variables
• Reflects the theory or literature on which it is based
• Brief and to the point
• Testable – includes variables that can be measured
Group Work
What you will learn in Chapter 8
1. Understanding probability is basic to understanding statistics
2. Characteristics of the “normal” curve
– i.e., the bell-shaped curve
3. All about z scores
– Computing them
– Interpreting them
Probability
Why Probability?
• Basis for the normal curve
– Provides a basis for understanding the probability of a possible outcome
• Basis for determining the degree of confidence that an outcome is “true”
– Example: Are changes in student scores due to a particular intervention that took place or to chance alone?
The Normal Curve (the Bell-Shaped Curve)
• Visual representation of a distribution of scores
• Three characteristics…
1. Mean, median, and mode are equal to one another
2. Perfectly symmetrical about the mean
3. Tails are asymptotic (get closer to the horizontal axis but never touch)
[Figure: The Normal Curve]
• In general, many events occur right in the middle of a distribution, with few at each end.
More Normal Curve 101
• For all normal distributions…
– Almost 100% of scores will fit between -3 and +3 standard deviations from the mean.
– So…distributions can be compared
– Between different points on the X-axis, a certain percentage of cases will occur.
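The percentages behind these statements can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library to compute the share of cases that falls within 1, 2, and 3 standard deviations of the mean of a normal distribution; the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Share of cases between -k and +k standard deviations from the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k):.2%}")
# within ±1 SD: 68.27%
# within ±2 SD: 95.45%
# within ±3 SD: 99.73%
```

Because these shares depend only on the distance from the mean in standard deviation units, they hold for any normal distribution, whatever its mean and standard deviation.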
[Figure: What’s Under the Curve?]
The z Score
• A standard score that is the result of dividing the amount that a raw score differs from the mean of the distribution by the standard deviation.
• What about those symbols?
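Written out, the definition is z = (X − X̄) / s, where X is a raw score, X̄ is the mean of the distribution, and s is its standard deviation. The sketch below applies that formula to a small set of hypothetical scores. It uses the population standard deviation (`pstdev`) as an assumption; swap in `stdev` if the sample version is wanted.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    m = mean(values)
    s = pstdev(values)  # assumption: population standard deviation
    return [(x - m) / s for x in values]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores (mean 5, SD 2)
for raw, z in zip(scores, z_scores(scores)):
    print(f"raw = {raw}, z = {z:+.1f}")
```

A raw score of 9 sits two standard deviations above the mean (z = +2.0), while a raw score of 2 sits one and a half standard deviations below it (z = -1.5).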
The z Score
• A z score is a common way to standardize data onto one scale so that comparisons can be made.
• The z score is like a common yardstick for all types of data.
[Figure: Using the Computer: Calculating z Scores]
The z Score
• Scores below the mean are negative (left of the mean) and those above are positive (right of the mean)
• A z score is the number of standard deviations from the mean
• z scores across different distributions are comparable
What z Scores Represent
• The areas of the curve that are covered by different z scores also represent the probability of a certain score occurring.
• So try this one…
– In a distribution with a mean of 50 and a standard deviation of 10, what is the probability that one score will be 70 or above?
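One way to work the "70 or above" question is to convert 70 to a z score and read off the area in the upper tail. The sketch below does this with `statistics.NormalDist`, assuming the scores are normally distributed, which is the setting of these slides.

```python
from statistics import NormalDist

dist = NormalDist(mu=50, sigma=10)  # assumed normal: mean 50, SD 10

z = (70 - 50) / 10            # 70 is two standard deviations above the mean
p_above = 1 - dist.cdf(70)    # area under the curve at or above 70

print(f"z = {z:.1f}, P(score >= 70) = {p_above:.4f}")
# z = 2.0, P(score >= 70) = 0.0228
```

So a score of 70 or above is expected a little over 2% of the time under these assumptions.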
What z Scores Really Represent
• Knowing the probability that a z score will occur helps you determine how extreme a z score must be before you conclude that a factor other than chance produced the outcome
• Keep in mind… z scores are typically reserved for populations
Hypothesis Testing & z Scores
• Any event can have a probability associated with it.
– Probability values help determine how “unlikely” the event might be
– The key: an outcome with less than a 5% chance of occurring by chance alone is treated as a significant result
Group Work: Using the Computer
What you will learn in Chapter 9
1. What significance is and why it is important
– Significance vs. Meaningfulness
2. Type I Error
3. Type II Error
4. How inferential statistics works
Statistical Significance
The Concept of Significance
• Any difference between groups that is due to a systematic influence rather than chance
– Must assume that all other factors that might contribute to differences are controlled
If Only We Were Perfect…
• Significance level
– The risk associated with not being 100% positive that what occurred in the experiment is a result of what you did or what is being tested
• The goal is to eliminate competing reasons for differences as much as possible.
• Statistical significance
– The degree of risk you are willing to take that you will reject a null hypothesis when it is actually true.
[Table: The World’s Most Important Table: Different Types of Errors]
Type I Errors (Level of Significance)
• The probability of rejecting a null hypothesis when it is true
• Conventional levels are set between .01 and .05
• Usually represented in a report as p < .05
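A small simulation can make the Type I error rate concrete. The sketch below repeatedly draws samples from a population in which the null hypothesis really is true (mean 0, standard deviation 1), runs a two-tailed z test at the .05 level each time, and counts how often the null is rejected anyway. The function name, sample size, number of trials, and seed are all illustrative assumptions.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Estimate how often a true null hypothesis is rejected at level alpha."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff, about 1.96
    rejections = 0
    for _ in range(trials):
        # The null is true here: every sample comes from mean 0, SD 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1 / n ** 0.5)  # (mean - 0) / (SD / sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Share of true nulls rejected: {rate:.3f}")
```

The estimated share should land close to .05, which is what setting the significance level at .05 means: about 1 in 20 true null hypotheses is rejected by chance alone.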
Type II Errors
• The probability of accepting a null hypothesis when it is false
• As your sample characteristics become closer to the population, the probability that you will accept a false null hypothesis decreases
Significance Versus Meaningfulness
• A study can be statistically significant but not very meaningful
• Statistical significance can only be interpreted for the context in which it occurred
• Statistical significance should not be the only goal of scientific research
– Significance is influenced by sample size…we’ll talk more about this later.
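To illustrate how sample size drives significance, the sketch below holds a small difference fixed and varies only the sample size in a two-tailed z test. All numbers are hypothetical: a 1-point difference from the population mean and a population standard deviation of 15.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(diff, sd, n):
    """Two-tailed p-value for a mean difference `diff` with sample size n."""
    standard_error = sd / sqrt(n)  # shrinks as n grows
    z = diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same hypothetical 1-point difference, three different sample sizes.
results = {n: two_tailed_p(diff=1, sd=15, n=n) for n in (25, 400, 10_000)}
for n, p in results.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>6}: p = {p:.4g} ({verdict} at .05)")
```

With n = 25 the p-value is about .74, while with n = 10,000 the same 1-point difference is far below .05. The difference is no more meaningful in the second case; only the sample is larger.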
How Inference Works
• A representative sample of the population is chosen.
• A test is given, means are computed and compared
• A conclusion is reached as to whether the scores are statistically significant
• Based on the results of the sample, an inference is made about the population.
Test of Significance
1. State the null hypothesis.
2. Set the level of risk associated with the null hypothesis.
3. Select the appropriate test statistic.
4. Compute the test statistic (obtained) value.
5. Determine the value needed to reject the null hypothesis using the appropriate table of critical values.
6. Compare the obtained value to the critical value.
7. If the obtained value is more extreme, reject the null hypothesis.
8. If the obtained value is not more extreme, accept the null hypothesis.
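The eight steps can be sketched in code for one simple case. The example below assumes a one-sample, two-tailed z test with a known population standard deviation, which is only one of many possible test statistics, and it computes the critical value rather than looking it up in a table. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Walk through the test-of-significance steps for a one-sample z test."""
    # Steps 1-2: the null hypothesis is that the sample comes from a population
    # with mean `pop_mean`; `alpha` is the level of risk.
    # Steps 3-4: choose the z statistic and compute the obtained value.
    obtained = (sample_mean - pop_mean) / (pop_sd / sqrt(n))
    # Step 5: critical value for a two-tailed test (computed, not from a table).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Steps 6-8: reject the null only if the obtained value is more extreme.
    reject = abs(obtained) > critical
    return obtained, critical, reject

# Hypothetical data: sample of 36 with mean 106; population mean 100, SD 15.
obtained, critical, reject = one_sample_z_test(106, 100, 15, 36)
print(f"obtained = {obtained:.2f}, critical = {critical:.2f}, reject null: {reject}")
# obtained = 2.40, critical = 1.96, reject null: True
```

Here the obtained value (2.40) is more extreme than the critical value (1.96), so under these assumptions the null hypothesis is rejected at the .05 level.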
Next Week
• Homework 3 due: next week in class
• Extra credit homework (eligible only for those who scored below 50 on hw1): also due next week in class