Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Page 1: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Foundations of Inferential Statistics

PADM 582

University of La Verne

Soomi Lee, Ph.D

Page 2: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Overview

• Review Descriptive Statistics
• Some in-class exercises

1. Hypothesis
2. Probability
3. Significance

Page 3: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Levels of Measurement

Why should we care?

• Quantitative (interval, ratio)
• Ordinal
• Nominal (qualitative)

Respondent ID | Months in Unemployment | Level of Satisfaction in New Job Training Program (scale 1-5) | Gender (1=female; 0=male)
1 | 13 | 5 | 1
2 | 22 | 4 | 0
3 | 16 | 3 | 0
4 | 4 | 4 | 1
5 | 9 | 2 | 1

Page 4: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Ways to Summarize Data

1. List the Sample
2. Frequency Table
3. Summary Statistics
– Central tendency: mean, median, mode
– Dispersion (spread): standard deviation, range, variance
– Skewness, kurtosis
4. Graphs and Charts
– Histogram (quantitative)
– Bar, column, pie charts (ordinal, nominal)
– Line chart: historical data (time series data)

• Correlation: description of the relationship between two variables
– Correlation coefficient, coefficient of determination
– Visualization: scatter plot

Page 5: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

In-Class Exercise 1

Should the department send the dean the mean age or the median age?

Member | Age
Williamson | 64
Campbell | 31
Gonzales | 65
Marquez | 27
Seymour | 35
Sandoval | 40
Weber | 33
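One way to approach Exercise 1 is to compute both measures and see how far apart they are. The sketch below uses only the seven ages listed in the table; the variable names are illustrative. The two oldest members pull the mean well above the median, which is the point the exercise seems meant to raise.

```python
from statistics import mean, median

# Ages from the In-Class Exercise 1 table.
ages = {
    "Williamson": 64,
    "Campbell": 31,
    "Gonzales": 65,
    "Marquez": 27,
    "Seymour": 35,
    "Sandoval": 40,
    "Weber": 33,
}

mean_age = mean(ages.values())      # sum of ages (295) divided by 7
median_age = median(ages.values())  # middle value of the sorted ages

print(f"mean age   = {mean_age:.1f}")
print(f"median age = {median_age}")
```

The mean comes out near 42 while the median is 35, so the median better describes a typical member of this small, skewed group.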

Page 6: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

In-Class Exercise 2

Who is Correct?

Case Worker | Case Load
Williamson | 43
Campbell | 57
Gonzales | 35
Marquez | 87
Seymour | 36
Sandoval | 93
Weber | 45
Kim | 48
Meier | 41
Becker | 40

Page 7: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

In-Class Exercise 3

Skewness = -22.46

What does this number mean to the chief custodial engineer?

Page 8: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

In-Class Exercise 4

Police | Fire
610 | 570
590 | 580
650 | 700
650 | 600
640 | 480
580 | 690
550 | 740
550 | 450
Mean = 603 | Mean = 601

"There is really no fairness issue to worry about." Is it true?
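The two means in Exercise 4 are nearly identical, so the question turns on dispersion. The following sketch compares the spread of the two columns using the sample standard deviation and the range. The slide does not say what the numbers measure, so they are treated here as unitless values.

```python
from statistics import mean, stdev

# Values from the In-Class Exercise 4 table.
police = [610, 590, 650, 650, 640, 580, 550, 550]
fire = [570, 580, 700, 600, 480, 690, 740, 450]

police_mean, fire_mean = mean(police), mean(fire)
police_sd, fire_sd = stdev(police), stdev(fire)  # sample standard deviations
police_range = max(police) - min(police)
fire_range = max(fire) - min(fire)

print(f"Police: mean={police_mean:.1f}  sd={police_sd:.1f}  range={police_range}")
print(f"Fire:   mean={fire_mean:.1f}  sd={fire_sd:.1f}  range={fire_range}")
```

The Fire values are spread far more widely than the Police values even though the averages match, so similar means alone do not settle the fairness question.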

Page 9: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Summary Statistics

Page 10: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Hypothesis (Ch.7)

What you will learn in Chapter 7

1. The difference between samples and populations (again)
2. The importance of…
• The null hypothesis
• The research hypothesis
3. How to judge a good hypothesis

Page 11: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

What is a hypothesis?

• An “educated guess”

• Its role is to reflect the general problem statement or question that is driving the research

• It translates the problem or research question into a form that can be tested

Page 12: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Samples and Populations

• Population
– The large group to which you would like to generalize your findings

• Sample
– The smaller, representative group of the population that is used to do the research

• Sampling error
– A measure of how well a sample represents the population

Page 13: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The Null Hypothesis

• A statement that two or more things are equal or unrelated to one another

H0: μ1 = μ2

– The starting point; it is accepted as true in the absence of other information
– A benchmark against which to compare actual outcomes

Page 14: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The Research Hypothesis

• A statement that there is a relationship between two variables

• Two types…
1. Nondirectional: H1: X̄1 ≠ X̄2
• Reflects a difference; direction is not specified
• Two-tailed test
2. Directional: H1: X̄1 > X̄2
• Reflects a difference; direction is specified
• One-tailed test

Page 15: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Differences between Null and Research Hypotheses

Null Hypothesis | Research Hypothesis
No relationship between variables | Relationship between variables
Refers to the population | Refers to the sample
Indirectly tested | Directly tested
Written using Greek symbols | Written using Roman symbols
Implied hypothesis | Explicit hypothesis

Page 16: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

What Makes a Good Hypothesis?

• Stated in a declarative form rather than as a question
• Defines an expected relationship between variables
• Reflects the theory or literature on which it is based
• Brief and to the point
• Testable: includes variables that can be measured

Page 18: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Group Work

Page 19: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Probability

What you will learn in Chapter 8

1. Understanding probability is basic to understanding statistics
2. Characteristics of the “normal” curve
– i.e., the bell-shaped curve
3. All about z scores
– Computing them
– Interpreting them

Page 20: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Why Probability?

• Basis for the normal curve
– Provides a basis for understanding the probability of a possible outcome

• Basis for determining the degree of confidence that an outcome is “true”
– Example: Are changes in student scores due to a particular intervention that took place, or to chance alone?

Page 21: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The Normal Curve (the Bell-Shaped Curve)

• Visual representation of a distribution of scores

• Three characteristics…
1. Mean, median, and mode are equal to one another
2. Perfectly symmetrical about the mean
3. Tails are asymptotic (they get closer to the horizontal axis but never touch it)

Page 22: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The Normal Curve (the Bell-Shaped Curve)

Page 23: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

• In general, many events occur right in the middle of a distribution with few on each end.

The Normal Curve (the Bell-Shaped Curve)

Page 24: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

More Normal Curve 101

Page 25: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

More Normal Curve 101

• For all normal distributions…

– Almost 100% of scores (about 99.7%) fall between -3 and +3 standard deviations from the mean.

– So…distributions can be compared

– Between different points on the X-axis, a certain percentage of cases will occur.
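The percentages under the normal curve can be checked numerically. This minimal sketch uses the standard identity that the share of a normal distribution within k standard deviations of the mean equals erf(k / √2). The function name `share_within` is illustrative and not from the slides.

```python
from math import erf, sqrt


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within ±{k} SD: {share:.2%}")
```

The results are roughly 68%, 95%, and 99.7%, which is why "almost 100%" of scores sit within three standard deviations of the mean.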

Page 26: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

What’s Under the Curve?

Page 27: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

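The z Score

• A standard score that is the result of dividing the amount that a raw score differs from the mean of the distribution by the standard deviation.

z = (X − X̄) / s

• What about those symbols? z is the z score, X is the raw score, X̄ is the mean of the distribution, and s is the standard deviation.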
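The verbal definition translates directly into code: subtract the mean from the raw score, then divide by the standard deviation. In this sketch the function name `z_score` is illustrative, and the example distribution (mean 50, standard deviation 10) is the one used in a later slide of this deck.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Number of standard deviations that a raw score lies from the mean."""
    return (raw - mean) / sd


# A distribution with mean 50 and standard deviation 10.
z_high = z_score(70, 50, 10)  # raw score above the mean gives a positive z
z_low = z_score(35, 50, 10)   # raw score below the mean gives a negative z

print(f"z for a raw score of 70: {z_high:+.2f}")
print(f"z for a raw score of 35: {z_low:+.2f}")
```

A raw score of 70 is two standard deviations above the mean (z = +2.00), and a raw score of 35 is one and a half below it (z = -1.50).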

Page 28: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The z Score

• A common statistical way of standardizing data on one scale, so that a comparison can take place, is to use a z score.

• The z score is like a common yardstick for all types of data.

Page 29: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Using the Computer: Calculating z Scores

Page 30: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The z Score

• Scores below the mean are negative (left of the mean) and those above are positive (right of the mean)

• A z score is the number of standard deviations from the mean

• z scores across different distributions are comparable

Page 31: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

What z Scores Represent

• The areas of the curve that are covered by different z scores also represent the probability of a certain score occurring.

• So try this one…
– In a distribution with a mean of 50 and a standard deviation of 10, what is the probability that one score will be 70 or above?
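The "try this one" question can be worked through in two steps: convert 70 to a z score, then find the area under the normal curve to the right of that z. The sketch below assumes the scores are normally distributed and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

mean, sd, cutoff = 50, 10, 70

# Step 1: how many standard deviations above the mean is the cutoff?
z = (cutoff - mean) / sd

# Step 2: area under the curve at or above the cutoff.
p_at_or_above = 1 - NormalDist(mu=mean, sigma=sd).cdf(cutoff)

print(f"z = {z:.1f}")
print(f"P(score >= {cutoff}) = {p_at_or_above:.4f}")
```

Under the normality assumption, a score of 70 or above corresponds to z = 2 and occurs with a probability of a little over 2%.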

Page 32: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D
Page 33: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

What z Scores Really Represent

• Knowing the probability that a z score will occur can help you determine how extreme a z score you can expect before determining that a factor other than chance produced the outcome

• Keep in mind… z scores are typically reserved for populations

Page 34: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Hypothesis Testing and z Scores

• Any event can have a probability associated with it.
– Probability values help determine how “unlikely” the event might be
– The key: if the outcome has less than a 5% chance of occurring by chance alone, you have a significant result
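To make the 5% rule concrete, the sketch below converts a z score into the probability of seeing a result at least that extreme by chance. It assumes a two-tailed test and a normal distribution. The function name `two_tailed_p` and the example z values are illustrative, not taken from the slides.

```python
from statistics import NormalDist


def two_tailed_p(z: float) -> float:
    """Probability of a z score at least this far from 0, in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


for z in (1.0, 1.96, 2.5):
    p = two_tailed_p(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"z = {z:4.2f}  p = {p:.4f}  -> {verdict} at the 5% level")
```

A z score of about 1.96 sits right at the 5% boundary for a two-tailed test; anything more extreme counts as significant under this convention.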

Page 35: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Using the Computer: Group Work

Page 36: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Statistical Significance

What you will learn in Chapter 9

1. What significance is and why it is important
– Significance vs. Meaningfulness
2. Type I Error
3. Type II Error
4. How inferential statistics works

Page 37: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The Concept of Significance

• Any difference between groups that is due to a systematic influence rather than chance
– Must assume that all other factors that might contribute to differences are controlled

Page 38: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

If Only We Were Perfect…

• Significance level
– The risk associated with not being 100% positive that what occurred in the experiment is a result of what you did or what is being tested

• The goal is to eliminate competing reasons for differences as much as possible.

• Statistical significance
– The degree of risk you are willing to take that you will reject a null hypothesis when it is actually true.

Page 39: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

The World’s Most Important Table: Different Types of Errors

Page 40: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Type I Errors (Level of Significance)

• The probability of rejecting a null hypothesis when it is true

• Conventional levels are set between .01 and .05

• Usually represented in a report as p < .05
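A small simulation can show what a Type I error rate of .05 means in practice. In this sketch every sample is drawn from a population where the null hypothesis is true, so every rejection is a false alarm. The population values, sample size, number of trials, and random seed are all arbitrary assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 0.05
MU, SIGMA = 50, 10     # hypothetical population in which the null is true
N, TRIALS = 30, 2000   # sample size and number of simulated studies

# Two-tailed critical z value for the chosen significance level.
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

false_rejections = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU) / (SIGMA / N ** 0.5)
    if abs(z) > critical_z:
        false_rejections += 1  # null is true, so this rejection is a Type I error

type_i_rate = false_rejections / TRIALS
print(f"Share of true nulls rejected: {type_i_rate:.3f}")
```

The share of false rejections should land close to the chosen significance level, which is what "the probability of rejecting a null hypothesis when it is true" describes.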

Page 41: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Type II Errors

• The probability of accepting a null hypothesis when it is false

• As your sample characteristics become closer to the population, the probability that you will accept a false null hypothesis decreases

Page 42: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Significance Versus Meaningfulness

• A study can be statistically significant but not very meaningful

• Statistical significance can only be interpreted for the context in which it occurred

• Statistical significance should not be the only goal of scientific research

– Significance is influenced by sample size…we’ll talk more about this later.

Page 43: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

How Inference Works

• A representative sample of the population is chosen.

• A test is given, and means are computed and compared.

• A conclusion is reached as to whether the scores are statistically significant

• Based on the results of the sample, an inference is made about the population.

Page 44: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Test of Significance

1. State the null hypothesis.
2. Set the level of risk (level of significance) associated with the null hypothesis.
3. Select the appropriate test statistic.
4. Compute the test statistic (obtained) value.
5. Determine the value needed to reject the null hypothesis using the appropriate table of critical values.
6. Compare the obtained value to the critical value.
7. If the obtained value is more extreme, reject the null hypothesis.
8. If the obtained value is not more extreme, accept the null hypothesis.
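The eight steps can be traced with a one-sample z test. Everything numeric in this sketch is a hypothetical example rather than course data: a population mean of 50 and standard deviation of 10, a sample of 25 with a mean of 54, and a two-tailed test at the .05 level. The critical value is computed with `statistics.NormalDist` instead of being read from a printed table.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: null hypothesis: the sample comes from a population with mean 50.
pop_mean, pop_sd = 50, 10   # hypothetical population values
sample_mean, n = 54, 25     # hypothetical sample results

# Step 2: set the level of risk.
alpha = 0.05

# Steps 3-4: choose the z statistic and compute the obtained value.
obtained_z = (sample_mean - pop_mean) / (pop_sd / sqrt(n))

# Step 5: critical value for a two-tailed test (in place of a table lookup).
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

# Steps 6-8: compare the obtained value to the critical value and decide.
reject_null = abs(obtained_z) > critical_z

print(f"obtained z = {obtained_z:.2f}, critical z = {critical_z:.2f}")
print("Reject the null hypothesis" if reject_null else "Accept the null hypothesis")
```

With these made-up numbers the obtained value (2.00) is slightly more extreme than the critical value (about 1.96), so the null hypothesis is rejected.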

Page 45: Foundations of Inferential Statistics PADM 582 University of La Verne Soomi Lee, Ph.D

Next Week

• Homework 3 due: next week in class

• Extra credit homework (only for those who scored below 50 on HW 1): also due next week in class