URBP 204A QUANTITATIVE METHODS I Statistical Analysis Lecture II Gregory Newmark San Jose State University (This lecture accords with Chapters 6,7, & 8 of Neil Salkind’s Statistics for People who (Think They) Hate Statistics )


URBP 204A QUANTITATIVE METHODS I

Statistical Analysis Lecture II

Gregory Newmark
San Jose State University

(This lecture accords with Chapters 6, 7, & 8 of Neil Salkind’s Statistics for People Who (Think They) Hate Statistics)


Populations and Samples
• Populations
  – All the people in a specified group of people
    • The population of Students at SJSU
    • The population of Students in Urban Planning at SJSU
    • The population of Students in 204A this semester
• Samples
  – A portion of a larger population selected for study
    • A 500 person Sample of Students at SJSU
    • A 50 person Sample of Students in Urban Planning
    • A 15 person Sample of Students in 204A this semester


Populations and Samples
• Ideally, research covers entire populations
  – “Medicine X always cures the common cold”
• Financially, research is expensive
  – “We can’t afford to test Medicine X on everyone”
• Practically, we test samples of a population
  – “We can afford to test Medicine X on 1,000 people”
• Hopefully, those samples represent the actual population well
  – “For our results to be generalizable, our 1,000 people should approximate the characteristics of everyone”


Populations and Samples
• Sampling Error
  – A measure of how well a sample approximates the characteristics of the larger population
  – The difference between a sample statistic (i.e., a value measured in the sample) and a population parameter (i.e., the corresponding value in the population)
  – Lower sampling error means higher precision
  – Higher precision means more generalizability
  – Valuable research has a high degree of generalizability
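The idea of sampling error can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 5,000 heights is simulated (mean 66 inches, standard deviation 3 inches), and the helper names `sampling_error` and `typical_error` are made up for this example. The sample sizes echo the 15, 50, and 500 person samples from the earlier slide.

```python
import random
import statistics

random.seed(204)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (in inches) of 5,000 students.
population = [random.gauss(66, 3) for _ in range(5000)]
population_mean = statistics.mean(population)  # the population parameter


def sampling_error(sample_size):
    """Sample mean minus population mean for one random sample."""
    sample = random.sample(population, sample_size)
    return statistics.mean(sample) - population_mean


def typical_error(sample_size, trials=200):
    """Average absolute sampling error over repeated random samples."""
    return statistics.mean(
        abs(sampling_error(sample_size)) for _ in range(trials)
    )


for n in (15, 50, 500):
    print(f"n = {n:3d}: typical sampling error = {typical_error(n):.2f} inches")
```

Under these assumptions, larger random samples tend to land closer to the population mean, which is the sense in which lower sampling error means higher precision.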


Questions and Hypotheses
• Research Questions (Problem Statements)
  – What you are trying to investigate
• Hypotheses
  – Translate the research question into a testable form


Hypotheses
• Null Hypothesis
  – Assumption that no relationship exists in the population
  – Statements of equality
  – Examples
    • “There is no relationship between reaction time and problem solving ability”
    • “There is no difference in the average GRE scores of women and men”
  – Purposes (the Null Hypothesis cannot be tested directly)
    • Starting point for research
      – Until you prove a difference, you have to assume none exists
    • Benchmark to compare observations
      – Defines a range within which an observed difference may be due to chance


Hypotheses
• Research Hypothesis
  – Definitive statement that a relationship exists in a sample
  – Statements of inequality
  – Examples
    • “There is a positive relationship between reaction time and problem solving ability”
    • “There is a difference in the average GRE scores of women and men”
  – Two Types
    • Non-directional – there is a difference but its direction is unspecified
    • Directional – there is a difference and its direction is specified
  – Purpose – to provide a hypothesis for direct testing


Hypotheses
• Should be stated in a clear, forceful, declarative form
  – “Students who complete all assignments will get higher grades in 204A than those who do not.”
• Should be expressed succinctly
  – Avoid excessive verbiage that can confuse your readers
• Should posit an expected relationship between variables
  – This will focus the research and avoid a ‘scattershot’ approach
• Should reflect theory or literature
  – This ensures that the researcher has investigated the issue in advance
• Should be testable
  – One can actually carry out the research
  – Defines how measurement will happen


Hypotheses Quotes
• The great tragedy of Science - the slaying of a beautiful hypothesis by an ugly fact.
  – Thomas H. Huxley (1825 - 1895)
• There are two possible outcomes: If the result confirms the hypothesis, then you've made a measurement. If the result is contrary to the hypothesis, then you've made a discovery.
  – Enrico Fermi (1901 - 1954)
• It is a good morning exercise for a research scientist to discard a pet hypothesis every day before breakfast. It keeps him young.
  – Konrad Lorenz (1903 - 1989)
• For every fact there is an infinity of hypotheses.
  – Robert M. Pirsig (1928 - 2017)


Inferential Statistics
• Descriptive Statistics describe a data set
  – “The average height in this class is 5’6” with a standard deviation of 3”.”
• Inferential Statistics are used to make inferences from sample data to populations
  – “Based on our class data, we infer that the average height at SJSU is 5’6” with a standard deviation of 3”.”


The Normal Curve
• Visual representation of a distribution of scores with the following characteristics
  – Mean, median, and mode are the same
  – Symmetry around the mean (or mode or median)
  – Tails of the curve approach zero asymptotically


The Normal Curve
• We can exploit these properties of the normal curve to compare distributions with different means and standard deviations by converting them into standard scores based on the standard deviation
• Basically, we can compare curves by expressing scores in units of their standard deviations


Z-Scores
• A commonly used standardized score
• Represent the number of standard deviations a raw score falls from the mean
• Result of dividing the amount that a raw score differs from the mean of a distribution by the standard deviation of that distribution
• Z = (X - Xbar) / s, where Z = z score; X = individual score; Xbar = mean; s = standard deviation
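The definition above translates directly into a one-line function. The sketch below is illustrative only; the function name `z_score` and the example values (a class mean of 66 inches with a standard deviation of 3 inches) are assumptions, not data from the lecture.

```python
def z_score(raw_score, mean, std_dev):
    """Number of standard deviations a raw score falls from the mean."""
    return (raw_score - mean) / std_dev


# Hypothetical values: class mean of 66 inches, standard deviation of 3 inches.
print(z_score(69, 66, 3))  # 1.0  (one standard deviation above the mean)
print(z_score(63, 66, 3))  # -1.0 (one standard deviation below the mean)
```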


Z-Scores
• Characteristics
  – Z scores above the mean are:
    • Positive
    • To the right of the mean
    • In the upper half of the distribution
  – Z scores below the mean are:
    • Negative
    • To the left of the mean
    • In the lower half of the distribution
  – Z scores have associated probabilities


Z-Scores
• Every z score has an associated probability
• We can use that property to test hypotheses
• This property enables inferential statistics
• We can assess whether an event is due to chance or reflects some research finding
• Typically, we reject the null hypothesis if the observed event would have less than a 5% chance of occurring by chance alone
• In that case, the research hypothesis likely makes more sense
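As a rough sketch of how a z score maps to a probability, Python's standard library provides `statistics.NormalDist`. The helper names `chance_of_at_least` and `rejects_null` are invented for this example, and the decision rule shown assumes a one-tailed (upper tail) comparison against the 5% level mentioned above.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def chance_of_at_least(z):
    """Probability of a z score this far or farther above the mean."""
    return 1 - standard_normal.cdf(z)


def rejects_null(z, alpha=0.05):
    """One-tailed decision: reject if the event is rarer than alpha."""
    return chance_of_at_least(z) < alpha


for z in (1.0, 2.0):
    print(f"z = {z}: chance = {chance_of_at_least(z):.4f}, "
          f"reject null = {rejects_null(z)}")
```

This plays the same role as the z score table in the back of the book: it returns the area under the normal curve beyond a given z score.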


Class Lab
• Have everyone report their height in inches
• Determine class mean
• Determine class standard deviation
• Calculate z score for your height
• What percentage of the class is taller than you? (see chart in back of book or online)
• Have everyone move the data into SPSS and repeat the experiment
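For anyone who wants to check the lab by hand before moving to SPSS, the steps can be sketched in Python. The heights below are made-up placeholder data, not actual class results, and `percent_taller` is a hypothetical helper. The sketch assumes heights are roughly normally distributed and uses the sample standard deviation (dividing by n - 1).

```python
import statistics
from statistics import NormalDist

# Hypothetical class data: reported heights in inches.
heights = [62, 64, 65, 66, 66, 67, 68, 69, 70, 73]

class_mean = statistics.mean(heights)
class_sd = statistics.stdev(heights)  # sample standard deviation (n - 1)


def percent_taller(my_height):
    """Estimated percentage of the class taller than my_height,
    read from the normal curve rather than counted directly."""
    z = (my_height - class_mean) / class_sd
    return 100 * (1 - NormalDist().cdf(z))


my_height = 69
print(f"mean = {class_mean}, sd = {class_sd:.2f}")
print(f"z = {(my_height - class_mean) / class_sd:.2f}")
print(f"about {percent_taller(my_height):.0f}% of the class is taller")
```

With only a handful of students, the percentage read from the normal curve can differ from a direct count, which is worth comparing in the lab.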


The Normal Curve

The Normal Law
by W.J. Youden (1900 - 1971)

THE
NORMAL
LAW OF ERROR
STANDS OUT IN THE
EXPERIENCE OF MANKIND
AS ONE OF THE BROADEST
GENERALIZATIONS OF NATURAL
PHILOSOPHY ... IT SERVES AS THE
GUIDING INSTRUMENT IN RESEARCHES
IN THE PHYSICAL AND SOCIAL SCIENCES AND
IN MEDICINE, AGRICULTURE, AND ENGINEERING.
IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE
INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT


Statistical Significance
• Refers to whether an observed effect is due to chance or to systematic influence.
  – “There is a positive statistically significant relationship between GDP and average life span.”
  – Statistical significance makes the null hypothesis a less attractive explanation than the research hypothesis
• Ideally, research would control for all other factors, but in practice there will be uncontrolled error.
  – “There is a chance that a low GDP nation will have a higher average life span, due to unaccounted for factors.”
• Researchers ultimately define the level of certainty they are willing to accept in determining significance.
  – “There is a 1 in 20 chance that the observed effect is not due to the hypothesized reason, and we can live with that.”
  – This is called the significance level (or critical p-value).


Significance Levels can Vary


Statistical Significance

• To review:
  – First, hypothesize a relationship
    • Null Hypothesis means no relationship (often implied)
    • Research Hypothesis means there is a relationship
  – Second, test the research hypothesis
    • Define your significance level
    • Do your experiment
  – Third, based on your findings either:
    • Reject the null and accept the research hypothesis
    • Accept the null and reject the research hypothesis


Statistical Significance

• Data and Dating
  – Is this enough to reject the null hypothesis?


Statistical Significance
• Null Hypotheses can be either true or false
  – If true, there is an equality
  – If false, there is an inequality
• The Null Hypothesis cannot be directly tested
  – This presents a problem because one might reject the null when it is true (Type I) or accept it when it is false (Type II)
  – Four options:
    • No Problem: Accept the Null Hypothesis when there is truly no difference between groups
    • Type I Error (False Positive): Reject the Null Hypothesis when there is truly no difference between groups
    • Type II Error (False Negative): Accept the Null Hypothesis when there truly are differences between groups
    • No Problem: Reject the Null Hypothesis when there truly are differences between groups
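The four options can be written as a small lookup, and a simulation can show how often a Type I error occurs when the null hypothesis really is true. This is a sketch under loud assumptions: the population (mean 100, standard deviation 15), the sample size of 30, and the use of a two-tailed z test with a known standard deviation are all choices made for this example, not part of the lecture. The names `decision_outcome` and `type_i_error_rate` are hypothetical.

```python
import random
from statistics import NormalDist, mean


def decision_outcome(null_is_true, null_rejected):
    """Name the cell of the four-option table for one decision."""
    if null_is_true and null_rejected:
        return "Type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "Type II error (false negative)"
    return "No problem"


def type_i_error_rate(alpha=0.05, trials=2000, n=30, seed=26):
    """Share of simulated experiments that reject a true null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Assumed population: mean 100, SD 15, so the null (mean = 100) is true.
        sample = [rng.gauss(100, 15) for _ in range(n)]
        z = (mean(sample) - 100) / (15 / n ** 0.5)
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
        if p_value < alpha:
            rejections += 1
    return rejections / trials


print(decision_outcome(null_is_true=True, null_rejected=True))
print(f"Simulated Type I error rate: {type_i_error_rate():.3f}")
```

Under these assumptions the simulated rate should sit near the chosen significance level, which is the risk of a false positive the researcher agreed to accept.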


Significant vs. Meaningful
• Statistically significant does not always imply the finding is meaningful
  – “There is a statistically significant ¼ inch difference in the heights of women and men.”
  – “There is a statistically significant $0.50 difference in the per capita tax returns of married couples versus singles.”
• With large enough samples, even very small differences will almost always be statistically significant.
• The researcher needs to assess the meaning of the outcomes by considering their context.
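The effect of sample size can be illustrated with the quarter-inch height example. The sketch below assumes two equal-sized groups, a common standard deviation of 3 inches, and a two-tailed z test for the difference in means; these are illustrative assumptions, and `two_tailed_p` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(difference, std_dev, n_per_group):
    """Two-tailed p-value for a difference between two group means,
    assuming equal group sizes and a known common standard deviation."""
    standard_error = std_dev * sqrt(2 / n_per_group)
    z = difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same quarter-inch difference, increasingly large samples.
for n in (50, 500, 5000, 50000):
    p = two_tailed_p(0.25, 3, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per group = {n:5d}: p = {p:.4f} ({verdict})")
```

The difference never changes in size, only in statistical significance, which is why context is needed to judge whether it matters.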


Statistical Significance Revisited

• Steps:
  – State hypothesis
  – Set significance level associated with null hypothesis
  – Select statistical test (we will learn these soon)
  – Computation of obtained test statistic value
  – Computation of critical test statistic value
  – Comparison of obtained and critical values
    • If obtained > critical, reject the null hypothesis
    • If obtained < critical, stick with the null hypothesis
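The comparison step can be sketched as follows. Since the lecture has not yet introduced specific tests, this example assumes a z statistic and reads the critical value from the standard normal curve; the function names `critical_z` and `decide` are hypothetical, and the one-tailed case assumes the research hypothesis predicts a higher value.

```python
from statistics import NormalDist


def critical_z(alpha=0.05, tails=2):
    """Critical z value that cuts off alpha in one tail or split across two."""
    return NormalDist().inv_cdf(1 - alpha / tails)


def decide(obtained_z, alpha=0.05, tails=2):
    """Compare obtained and critical values, as in the steps above."""
    critical = critical_z(alpha, tails)
    # A two-tailed test compares the size of z; a one-tailed (upper) test
    # compares z itself.
    obtained = abs(obtained_z) if tails == 2 else obtained_z
    if obtained > critical:
        return "reject the null hypothesis"
    return "stick with the null hypothesis"


print(f"one-tailed critical z: {critical_z(0.05, 1):.3f}")
print(f"two-tailed critical z: {critical_z(0.05, 2):.3f}")
print("z = 1.8, one-tailed:", decide(1.8, tails=1))
print("z = 1.8, two-tailed:", decide(1.8, tails=2))
```

The same obtained value can lead to different decisions depending on whether the test is one tailed or two tailed, because the critical value changes.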


Statistical Significance Revisited

• One Tailed Test


Statistical Significance Revisited

• Two Tailed Test


Inferential Statistics Revisited

• Inference allows decisions to be made about populations based on information about samples.
• Steps:
  – Take a representative sample
  – Test each member of the sample
  – Analyze data to determine if variation is due to chance (accept null hypothesis) or statistically significant (accept research hypothesis)
  – Conclusions inferred about population
