
Foundations of Inferential Statistics

PADM 582
University of La Verne

Soomi Lee, Ph.D.

Overview

• Review Descriptive Statistics
• Some in-class exercises
1. Hypothesis
2. Probability
3. Significance

Why should we care?

• Quantitative (interval, ratio)
• Ordinal
• Nominal (qualitative)

Levels of Measurement

Respondent ID | Months in Unemployment | Level of Satisfaction in New Job Training Program (scale 1-5) | Gender (1=female; 0=male)
1 | 13 | 5 | 1
2 | 22 | 4 | 0
3 | 16 | 3 | 0
4 | 4 | 4 | 1
5 | 9 | 2 | 1


1. List the Sample
2. Frequency Table
3. Summary Statistics
– Central tendency: mean, median, mode
– Dispersion (spread): standard deviation, range, variance
– Skewness, kurtosis
4. Graphs and Charts
– Histogram (quantitative)
– Bar, column, pie charts (ordinal, nominal)
– Line chart: historical data (time series data)

• Correlation: description of the relationship between two variables
– Correlation coefficient, coefficient of determination
– Visualization: scatter plot
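As a rough illustration of the correlation coefficient and the coefficient of determination, the sketch below computes Pearson's r by hand for two columns of the five-respondent table (months in unemployment and satisfaction). Treating the 1-5 satisfaction scale as interval data is an assumption made only for this example, and the function name `pearson_r` is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Values copied from the respondent table
months = [13, 22, 16, 4, 9]
satisfaction = [5, 4, 3, 4, 2]

r = pearson_r(months, satisfaction)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

With only five respondents the result says little about any population; the point is only to show how r and r squared relate.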

Ways to Summarize Data


Should the department send the dean the mean age or the median age?

In-Class Exercise 1

Member Age

Williamson 64

Campbell 31

Gonzales 65

Marquez 27

Seymour 35

Sandoval 40

Weber 33
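One way to explore the mean-versus-median question is to compute both for the seven ages listed. This is a minimal sketch using Python's standard `statistics` module; the dictionary name `ages` is only illustrative.

```python
from statistics import mean, median

# Ages copied from the table above
ages = {
    "Williamson": 64, "Campbell": 31, "Gonzales": 65, "Marquez": 27,
    "Seymour": 35, "Sandoval": 40, "Weber": 33,
}

values = list(ages.values())
mean_age = mean(values)      # pulled upward by the two members in their 60s
median_age = median(values)  # middle value of the sorted ages
print(f"mean = {mean_age:.1f}, median = {median_age}")
```

The gap between the two figures is what the exercise asks you to interpret: the mean is sensitive to the two oldest members, while the median is not.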


Who is Correct?

In-Class Exercise 2

Case Worker Case Load

Williamson 43

Campbell 57

Gonzales 35

Marquez 87

Seymour 36

Sandoval 93

Weber 45

Kim 48

Meier 41

Becker 40


What does this number mean to the chief custodial engineer?

Skewness = -22.46

In-Class Exercise 3


Police Fire

610 570

590 580

650 700

650 600

640 480

580 690

550 740

550 450

Mean=603 Mean=601

In-Class Exercise 4

There is really no fairness issue to worry about. Is it true?
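A quick way to probe this claim is to look past the means to the spread. The sketch below assumes the Police and Fire columns are two samples of the same kind of measurement (the slides do not say what is being measured) and compares their means and sample standard deviations.

```python
from statistics import mean, stdev

# Values copied from the Police and Fire columns above
police = [610, 590, 650, 650, 640, 580, 550, 550]
fire = [570, 580, 700, 600, 480, 690, 740, 450]

for name, values in (("Police", police), ("Fire", fire)):
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    print(f"{name}: mean = {mean(values):.1f}, "
          f"SD = {stdev(values):.1f}, "
          f"range = {max(values) - min(values)}")
```

Nearly identical means can hide very different dispersion, which is why the summary statistics listed earlier include measures of spread as well as central tendency.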

Summary Statistics

What you will learn in Chapter 7

1. The difference between samples and populations (again)

2. The importance of…
• The null hypothesis
• The research hypotheses

3. How to judge a good hypothesis

Hypothesis (Ch.7)

What is a hypothesis?

• An “educated guess”

• Its role is to reflect the general problem statement or question that is driving the research

• It translates the problem or research question into a form that can be tested


Samples and Populations

• Population
– The large group to which you would like to generalize your findings

• Sample
– The smaller, representative group of the population that is used to do the research

• Sampling error
– A measure of how well a sample represents the population


The Null Hypothesis

• A statement that two or more things are equal, or unrelated, to one another

H0: μ1 = μ2

– The starting point: accepted as true in the absence of other information

– A benchmark against which actual outcomes are compared


The Research Hypothesis

• A statement that there is a relationship between two variables

• Two types…
1. Nondirectional H1: X̄1 ≠ X̄2
• Reflects a difference; direction is not specified
• Two-tailed test
2. Directional H1: X̄1 > X̄2
• Reflects a difference; direction is specified
• One-tailed test


Null Hypothesis | Research Hypothesis
No relationship between variables | Relationship between variables
Refers to the population | Refers to the sample
Indirectly tested | Directly tested
Written using Greek symbols | Written using Roman symbols
Implied hypothesis | Explicit hypothesis

Differences between Null and Research Hypotheses

What Makes a Good Hypothesis?

• Stated in a declarative form rather than a question

• Defines an expected relationship between variables

• Reflects the theory or literature on which they are based

• Brief and to the point

• Testable – includes variables that can be measured





Group Work


1. Understanding probability is basic to understanding statistics

2. Characteristics of the “normal” curve
– i.e., the bell-shaped curve

3. All about z scores
– Computing them
– Interpreting them

Probability

Why Probability?

• Basis for the normal curve
– Provides a basis for understanding the probability of a possible outcome

• Basis for determining the degree of confidence that an outcome is “true”
– Example: Are changes in student scores due to a particular intervention that took place or to chance alone?


• Visual representation of a distribution of scores

• Three characteristics…
1. Mean, median, and mode are equal to one another
2. Perfectly symmetrical about the mean
3. Tails are asymptotic (get closer to the horizontal axis but never touch)

The Normal Curve (the Bell-Shaped Curve)


• In general, many events occur right in the middle of a distribution with few on each end.



More Normal Curve 101

• For all normal distributions…

– Almost 100% (about 99.7%) of scores fall between -3 and +3 standard deviations from the mean.

– So…distributions can be compared

– Between different points on the X-axis, a certain percentage of cases will occur.
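The percentages under the curve can be checked numerically. This is a small sketch using only the standard library; it relies on the fact that, for any normal distribution, the share of cases within k standard deviations of the mean equals erf(k / √2). The function name `share_within` is illustrative.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {share_within(k):.2%}")
```

Because these shares do not depend on the particular mean or standard deviation, scores from different normal distributions can be compared on the same footing.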


What’s Under the Curve?

The z Score

• A standard score that is the result of dividing the amount that a raw score differs from the mean of the distribution by the standard deviation.

• What about those symbols?
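The slide's formula did not survive in this text, so the notation below is an assumption based on the verbal definition above: the raw score minus the mean, divided by the standard deviation.

```latex
z = \frac{X - \bar{X}}{s}
```

Here X is the individual raw score, X̄ is the mean of the distribution, and s is its standard deviation. Some texts write μ and σ instead when the mean and standard deviation describe a whole population.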


The z Score

• A common statistical way of standardizing data on one scale so a comparison can take place is using a z-score.

• The z-score is like a common yard stick for all types of data.


Using the Computer: Calculating z Scores

The z Score

• Scores below the mean are negative (left of the mean) and those above are positive (right of the mean)

• A z score is the number of standard deviations from the mean

• z scores across different distributions are comparable
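To make these properties concrete, the sketch below converts the ten case loads from In-Class Exercise 2 into z scores. Treating those ten case workers as the entire population (and therefore using the population standard deviation) is an assumption made for illustration.

```python
from statistics import mean, pstdev

# Case loads copied from In-Class Exercise 2
case_loads = {
    "Williamson": 43, "Campbell": 57, "Gonzales": 35, "Marquez": 87,
    "Seymour": 36, "Sandoval": 93, "Weber": 45, "Kim": 48,
    "Meier": 41, "Becker": 40,
}

values = list(case_loads.values())
mu = mean(values)       # mean of the distribution
sigma = pstdev(values)  # population standard deviation (assumption: these ten are the population)

# z = (raw score - mean) / standard deviation
z_scores = {name: (load - mu) / sigma for name, load in case_loads.items()}

for name, z in sorted(z_scores.items(), key=lambda item: item[1]):
    print(f"{name:<11} {z:+.2f}")
```

Case workers below the mean get negative z scores and those above get positive ones, and each value reads directly as a number of standard deviations from the mean.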


What z Scores Represent

• The areas of the curve that are covered by different z scores also represent the probability of a certain score occurring.

• So try this one…
– In a distribution with a mean of 50 and a standard deviation of 10, what is the probability that one score will be 70 or above?
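One way to work this example, assuming the scores are normally distributed: convert 70 to a z score and take the area under the curve to its right. The sketch uses the standard library's complementary error function; the helper name `upper_tail` is illustrative.

```python
import math

def upper_tail(score, mean, sd):
    """P(X >= score) for a normal distribution with the given mean and SD."""
    z = (score - mean) / sd
    # Area to the right of z under the standard normal curve
    return 0.5 * math.erfc(z / math.sqrt(2))

p = upper_tail(70, mean=50, sd=10)
print(f"z = {(70 - 50) / 10:.1f}, P(score >= 70) = {p:.4f}")
```

A score of 70 sits two standard deviations above the mean, so the probability is a little over 2%.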


What z Scores Really Represent

• Knowing the probability that a z score will occur can help you determine how extreme a z score you can expect before determining that a factor other than chance produced the outcome

• Keep in mind… z scores are typically reserved for populations


Hypothesis Testing & z Scores

• Any event can have a probability associated with it.
– Probability values help determine how “unlikely” the event might be
– The key: if an outcome has less than a 5% chance of occurring by chance alone, you have a significant result


Using the Computer

Group Work


1. What significance is and why it is important
– Significance vs. meaningfulness

2. Type I Error
3. Type II Error
4. How inferential statistics works

Statistical Significance

The Concept of Significance

• Any difference between groups that is due to a systematic influence rather than chance
– Must assume that all other factors that might contribute to differences are controlled


If Only We Were Perfect…

• Significance level
– The risk associated with not being 100% positive that what occurred in the experiment is a result of what you did or what is being tested

• The goal is to eliminate competing reasons for differences as much as possible.

• Statistical significance
– The degree of risk you are willing to take that you will reject a null hypothesis when it is actually true.


The World’s Most Important Table: Different Types of Errors

Type I Errors (Level of Significance)

• The probability of rejecting a null hypothesis when it is true

• Conventional levels are set between .01 and .05

• Usually represented in a report as p < .05


Type II Errors

• The probability of accepting (failing to reject) a null hypothesis when it is false

• As your sample characteristics become closer to those of the population, the probability that you will accept a false null hypothesis decreases


Significance Versus Meaningfulness

• A study can be statistically significant but not very meaningful

• Statistical significance can only be interpreted for the context in which it occurred

• Statistical significance should not be the only goal of scientific research

– Significance is influenced by sample size…we’ll talk more about this later.


How Inference Works

• A representative sample of the population is chosen.

• A test is given, and the means are computed and compared

• A conclusion is reached as to whether the difference between the means is statistically significant

• Based on the results of the sample, an inference is made about the population.


Test of Significance

1. State the null hypothesis.
2. Set the level of risk associated with the null hypothesis.
3. Select the appropriate test statistic.
4. Compute the test statistic (obtained) value.
5. Determine the value needed to reject the null hypothesis using the appropriate table of critical values.
6. Compare the obtained value to the critical value.
7. If the obtained value is more extreme, reject the null hypothesis.
8. If the obtained value is not more extreme, accept the null hypothesis.
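The eight steps can be sketched in code for one simple case, a two-tailed one-sample z test. Everything specific here is an assumption for illustration: the choice of the z test itself, the population mean of 50 and standard deviation of 10 (borrowed from the earlier z score example), and the hypothetical sample mean of 53 from 36 cases. The critical value comes from `statistics.NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist

# Step 1: null hypothesis: the sample comes from a population with mean 50
pop_mean = 50.0
pop_sd = 10.0        # assumed known population standard deviation
sample_mean = 53.0   # hypothetical sample result
n = 36               # hypothetical sample size

# Step 2: level of risk (Type I error rate)
alpha = 0.05

# Steps 3-4: the test statistic is z; compute the obtained value
standard_error = pop_sd / math.sqrt(n)
z_obtained = (sample_mean - pop_mean) / standard_error

# Step 5: critical value for a two-tailed test at alpha
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Steps 6-8: compare and decide
reject_null = abs(z_obtained) > z_critical
print(f"obtained z = {z_obtained:.2f}, critical z = {z_critical:.2f}")
print("reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```

With these made-up numbers the obtained value falls short of the critical value, so the null hypothesis is not rejected. A larger sample with the same mean difference could change that decision, which is the sample-size effect mentioned under significance versus meaningfulness.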


Next Week

• Homework 3 due: next week in class

• Extra credit homework (eligible only for those who got scores below 50 on hw1) due also next week in class