Psych 230 Psychological Measurement and Statistics

Page 1: Psych 230 Psychological Measurement and Statistics

Psych 230

Psychological Measurement and Statistics

Page 2: Psych 230 Psychological Measurement and Statistics

Last Time….

• Hypothesis testing reviewed

• Statistical Errors reviewed

• Z-test reviewed

• One sample T-Test

Page 3: Psych 230 Psychological Measurement and Statistics

This Time….

• Reporting statistics

• Two sample T-Test

• For Independent samples

Page 4: Psych 230 Psychological Measurement and Statistics

Statistical Testing

1. State the hypotheses (H0 and H1)

2. Decide which test to use

3. Calculate the obtained value

4. Calculate the critical value (size of α)

5. Make our conclusion

Page 5: Psych 230 Psychological Measurement and Statistics

Statistical tests (so far)

• The statistical tests we have used so far concentrate on finding whether a sample is representative of a known population

• Two characteristics of these tests:
  – one sample is drawn
  – we know the population mean

• Z-test
  – we also know the population variance

• T-test (one sample)
  – we do not know the population variance

Page 6: Psych 230 Psychological Measurement and Statistics

So far

• Hardly any of these tests are used in psychological research.

• They were used to introduce you to the logic behind statistical testing.
  – Incrementally.

• The z-test showed us how to test the likelihood that a sample came from a population where we know μ and σx

Page 7: Psych 230 Psychological Measurement and Statistics

So Far Continued

• Rarely do we know both of those population parameters.

• It did introduce you to critical values of a statistical test and what they mean.

• Next we learned how to do a single sample t-test.

• Again, this test is not used often in psychological research
  – Rarely do we know the population parameter μ

Page 8: Psych 230 Psychological Measurement and Statistics

So Far continued

• Even though it isn’t used much, it did introduce you to the t distribution.

• From now until the end of this course we will be learning about statistical tests which are used in this field.

• These statistical tests are useless without the methods that go with them.

Page 11: Psych 230 Psychological Measurement and Statistics

Introduction to experimental methods

• Over the next few lectures we will be covering statistics which are used for various types of experiments.

• What is an experiment?

• It is a research tool which helps scientists infer causality.

• The method is born from the criteria set by a philosopher named David Hume

Page 12: Psych 230 Psychological Measurement and Statistics

Hume’s Criteria

• A cause must precede its effect

• A cause and its effect must be spatially and temporally contiguous
  – The cause and the effect touch.

• There must be a necessary connection between cause and effect.
  – Without the cause you don’t see the effect.

Page 13: Psych 230 Psychological Measurement and Statistics

Experimental Methods

• John Stuart Mill expanded on this and proposed various methods.
  – The method of his which is most closely related to experimental methods is his Method of Difference.

• An experimenter takes two groups which are identical in every respect except for one thing. If the effect is present only after the introduction of that one thing, that one thing is the cause.

• Experimental methods try to create conditions which approximate these three criteria.

Page 14: Psych 230 Psychological Measurement and Statistics

Experimental Methods

• For example, let’s take the simplest experiments.
  – Experimenters randomize the participants
  – Why do they do that? They randomly assign the participants to the two groups in order to try to eliminate effects due to individual differences.

• In other words, individuals have a multitude of other causes for their behavior.
  – These other causes are called error.
  – Since error is random, if the assignment to conditions is also random, then the errors due to individual differences in one group should, on average, cancel out those in the other group.

• Randomization helps satisfy Hume’s third criterion by making the two groups as identical to each other as possible when it comes to individual differences.
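Random assignment as described above can be sketched in a few lines. This is only an illustration: the participant labels, the group size of 17 per condition, and the seed are assumptions made for the example, not part of the lecture.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seeded generator keeps the sketch repeatable
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# 34 hypothetical participants, labelled P01 to P34 (illustrative only).
participants = [f"P{i:02d}" for i in range(1, 35)]
friendly, unfriendly = randomly_assign(participants, seed=230)

print(len(friendly), len(unfriendly))
```

Because every participant is equally likely to land in either group, individual differences are spread across the two conditions by chance rather than by any systematic rule.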

Page 15: Psych 230 Psychological Measurement and Statistics

Experimental Method

• Control of the independent variable is also a necessary component of a true experiment.
  – By having control of the independent variable the experimenter can try to come close to meeting the standards outlined by Mill.

Page 16: Psych 230 Psychological Measurement and Statistics

Today

• The first method, along with its corresponding statistic, is the bivalent experiment with unrelated groups.
  – Unrelated groups means that the individuals in the control condition are not the same individuals as those in the experimental condition.

• The corresponding statistic is the two sample T-Test with unrelated groups

Page 17: Psych 230 Psychological Measurement and Statistics

Two Sample T-test

Page 18: Psych 230 Psychological Measurement and Statistics

Two Sample T-Test

• Important!

• Make sure you understand:

– What a dependent variable, an independent variable and a condition are

Page 19: Psych 230 Psychological Measurement and Statistics

Two Sample T-Test

• The tests we have used so far assume that we know the population mean of the variable we are interested in

• We compared a sample to the population to see if this sample is “representative” of the population

• None of these really help us when it comes to inferring cause in an experimental setting.
  – The population is not identical to the experimental group or sample in every way but the independent variable.

• In most psychological experiments, though, we simply do not know what the population scores are
  – Experiment: We want to know if the degree of friendliness of wait staff in a restaurant (very friendly or not friendly) affects the size of tips

• How to conduct this experiment?

Page 20: Psych 230 Psychological Measurement and Statistics

Two Sample T-Test

• Before, we compared a sample to the population

• Now, we compare one sample to another sample

• In this experiment, we wish to compare the performance of two groups (friendly vs. non-friendly)
  – we instruct some waiters to be extremely friendly, and instruct other waiters to be very unfriendly. Then we compare the tips received by the two groups.

• Although we will examine the difference between the two samples, we are really interested in whether the two population means are different
  – we want to generalize these results

Page 21: Psych 230 Psychological Measurement and Statistics

Two Sample T-Test

• Friendly group: We calculate X̄friendly, which we use to estimate μfriendly

• Unfriendly group: We calculate X̄unfriendly, which we use to estimate μunfriendly

• Then, we want to know whether these two μ’s are the same
  – do all of the observations come from the same distribution?

Page 22: Psych 230 Psychological Measurement and Statistics

Two Sample T-Test

[Figure: tip amount for the unfriendly condition and the friendly condition]

Page 23: Psych 230 Psychological Measurement and Statistics

Assumptions of the Two Sample T-Test

1. The dependent scores measure an interval or ratio variable

2. The populations of raw scores form at least roughly normal distributions

3. The populations have homogeneous variance. Homogeneity of variance means that the variances of all populations being represented (σ²x) are equal.

Page 24: Psych 230 Psychological Measurement and Statistics

Two Sample T-Test

1. State the hypotheses (H0 and H1)

2. Decide which test to use

3. Calculate the obtained value

4. Calculate the critical value (size of α)

5. Make our conclusion

Page 25: Psych 230 Psychological Measurement and Statistics

Independent Samples T-test

Page 27: Psych 230 Psychological Measurement and Statistics

1. State the Hypotheses

• What do we predict in our wait staff friendliness experiment?

• We probably predict that the friendliness of the wait staff makes some difference to the tip size

• So, we predict that the average tip given to friendly wait staff is different than the average tip given to unfriendly wait staff

• We can write this as: μfriendly ≠ μunfriendly

• Or: μfriendly − μunfriendly ≠ 0

• Is this H0 or H1? One-tailed or two-tailed?

Page 28: Psych 230 Psychological Measurement and Statistics

1. State the Hypotheses

• H1 : μfriendly ≠ μunfriendly
  – there is a difference in the tip size between the two conditions

• The null hypothesis (H0) always states that there is no relationship between the variables
  – friendliness does not affect tip size

• H0 : μfriendly = μunfriendly

Page 29: Psych 230 Psychological Measurement and Statistics

2. Decide which test to use

• Are we comparing a sample to a population?
  – Yes: Z-test if we know the population std dev
  – Yes: One-sample T-test if we do not know the population std dev
  – No: Keep looking

• How many samples are we comparing?
  – Two: Use the Two-sample T-test

• Are the samples independent or related?
  – Independent: Use Independent Samples T-test
  – Related: Use Related Samples T-test
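The questions above form a small decision tree, which can be sketched as a function. The function name and its parameters are hypothetical, chosen only for this illustration, and the sketch assumes that a study not comparing a sample to a population has exactly two samples.

```python
def choose_test(compares_sample_to_population,
                knows_population_sd=False,
                samples_are_independent=True):
    """Walk the decision questions in order and return the name of the test."""
    if compares_sample_to_population:
        # One sample against a known population mean.
        if knows_population_sd:
            return "z-test"
        return "one-sample t-test"
    # Otherwise we assume two samples are being compared with each other.
    if samples_are_independent:
        return "independent-samples t-test"
    return "related-samples t-test"


# The wait staff experiment: two separate groups of waiters.
print(choose_test(compares_sample_to_population=False,
                  samples_are_independent=True))
```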

Page 30: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

• As before, we will calculate a Tobt from our data and compare it to Tcrit

• As before, Tobt is calculated by:

\[ t_{obt} = \frac{(\text{Result of Study}) - (\text{Population mean for } H_0)}{\text{Standard Error}} \]

• Now, though, the result of the study that we are calculating is the difference between the two means we observed
  – difference between the average tip size for friendly waitstaff vs. the average tip size for unfriendly waitstaff

Page 31: Psych 230 Psychological Measurement and Statistics

Plotting the difference between μ1 and μ2

[Figure: distribution of the difference between the population means]

Page 32: Psych 230 Psychological Measurement and Statistics

Plotting the difference between μ1 and μ2

[Figure: distribution of the difference between the population means, with Tcrit = −1.833 and Tcrit = +1.833 marked and Tobt = −1.71 between them]

Page 33: Psych 230 Psychological Measurement and Statistics

Real differences

Page 35: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

• As before, we will calculate a tobt from our data and compare it to tcrit

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

What are these terms?

Page 36: Psych 230 Psychological Measurement and Statistics

μ1 − μ2

• This is your expected term.

• Remember, any of these tests can be summarized as (observed − expected) / error

• In other words, this is your null hypothesis.

• The two populations which these samples came from are not different.

• Whatever the means are, if our populations are the same this term is going to equal zero.

Page 37: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

• As before, we will calculate a tobt from our data and compare it to tcrit

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

What about this one?

Page 38: Psych 230 Psychological Measurement and Statistics

\( s_{\bar{X}_1 - \bar{X}_2} \)

• This one is your error term. It combines the standard errors of the two samples: the means are subtracted, but the squared standard errors are added.

• Depending on your data there are different ways to calculate this term.

• Since we are talking about experiments, we are going to go over the case of the two groups having an identical number of individuals.

• There are other ways to calculate this term if you have unequal sample sizes, but we are not going to go over how to do them manually.
  – That’s what computers are for.

Page 39: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

• What’s \( s_{\bar{X}_1 - \bar{X}_2} \)?

• \( s_{\bar{X}_1 - \bar{X}_2} \) is the standard error of the difference between means

• This is similar to the standard error terms we’ve had before, but less intuitive to think about

Page 40: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

The caret means the estimate is unbiased. If you just have an s, you need to do your own unbiasing by subtracting 1 from n in the denominator.

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\hat{s}_1^2}{n_1} + \frac{\hat{s}_2^2}{n_2}} \]
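As a rough sketch of the error term, the two helper functions below compute an unbiased variance estimate (dividing by n − 1) and the standard error of the difference between means. The function names and the numbers passed to them are illustrative assumptions, not values from the lecture.

```python
import math


def unbiased_variance(scores):
    """Estimated population variance: sum of squared deviations over n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def se_of_difference(var1, n1, var2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(var1 / n1 + var2 / n2)


# Hypothetical raw scores: mean 4, squared deviations sum to 8, so variance is 8 / 3.
print(unbiased_variance([2, 4, 4, 6]))

# Hypothetical summary values: two groups of 16, each with estimated variance 8.0.
print(se_of_difference(8.0, 16, 8.0, 16))
```

Note that the two variance terms are added under the square root, even though the statistic of interest is a difference between means.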

Page 41: Psych 230 Psychological Measurement and Statistics

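3. Calculate tobt

• The computational formula for a two sample t-test with independent groups does away with the carets, and is actually easier to handle if you’re starting from raw data (even though it looks more intimidating)

• If you have to do one by hand from raw data this is the way to go.

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\left[\dfrac{\left(\sum X_1^2 - \frac{(\sum X_1)^2}{n_1}\right) + \left(\sum X_2^2 - \frac{(\sum X_2)^2}{n_2}\right)}{n_1 + n_2 - 2}\right]\left[\dfrac{1}{n_1} + \dfrac{1}{n_2}\right]}}, \qquad df = (n_1 - 1) + (n_2 - 1) \]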
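The computational formula can be sketched directly from raw scores. The two score lists below are made up for illustration (they are not data from the lecture), and the function name is hypothetical. The null hypothesis value μ1 − μ2 is assumed to be zero.

```python
import math
import statistics


def t_obt_from_raw(x1, x2):
    """Independent-samples t from raw scores, using sums and sums of squares."""
    n1, n2 = len(x1), len(x2)
    # Sum of squared deviations for each group, computed without the means.
    ss1 = sum(x * x for x in x1) - sum(x1) ** 2 / n1
    ss2 = sum(x * x for x in x2) - sum(x2) ** 2 / n2
    pooled_variance = (ss1 + ss2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.fmean(x1) - statistics.fmean(x2)) / standard_error
    df = (n1 - 1) + (n2 - 1)
    return t, df


# Hypothetical tip percentages for five waiters per condition.
friendly_tips = [22, 25, 19, 24, 25]
unfriendly_tips = [20, 18, 21, 19, 22]

t, df = t_obt_from_raw(friendly_tips, unfriendly_tips)
print(round(t, 2), df)
```

With equal group sizes, this pooled error term gives the same value as dividing each estimated variance by its own n and adding the results.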

Page 42: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

• After conducting our friendliness experiment, we get the following data:

               Friendly   Unfriendly
Mean tip (%)   23         20
Subjects       17         17
Est Variance   9.0        7.5

Page 43: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

• After conducting our friendliness experiment, we got the following data:

               Friendly     Unfriendly
Mean tip (%)   X̄1 = 23      X̄2 = 20
Subjects       n1 = 17      n2 = 17
Est Variance   ŝ1² = 9.0    ŝ2² = 7.5

Page 44: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

               Friendly     Unfriendly
Mean tip (%)   X̄1 = 23      X̄2 = 20
Subjects       n1 = 17      n2 = 17
Est Variance   ŝ1² = 9.0    ŝ2² = 7.5

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\hat{s}_1^2}{n_1} + \frac{\hat{s}_2^2}{n_2}} \]

Page 45: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

               Friendly     Unfriendly
Mean tip (%)   X̄1 = 23      X̄2 = 20
Subjects       n1 = 17      n2 = 17
Est Variance   ŝ1² = 9.0    ŝ2² = 7.5

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\hat{s}_1^2}{n_1} + \frac{\hat{s}_2^2}{n_2}} = \sqrt{\frac{9}{17} + \frac{7.5}{17}} = \sqrt{.53 + .44} = \sqrt{.97} = .98 \]

Page 46: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

               Friendly     Unfriendly
Mean tip (%)   X̄1 = 23      X̄2 = 20
Subjects       n1 = 17      n2 = 17
Est Variance   ŝ1² = 9.0    ŝ2² = 7.5
Error term     .98

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

Page 49: Psych 230 Psychological Measurement and Statistics

3. Calculate tobt

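               Friendly     Unfriendly
Mean tip (%)   X̄1 = 23      X̄2 = 20
Subjects       n1 = 17      n2 = 17
Est Variance   ŝ1² = 9.0    ŝ2² = 7.5

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

\[ t_{obt} = \frac{(23 - 20) - 0}{.98} = \frac{3}{.98} = 3.06, \qquad df = (17 + 17) - 2 = 32 \]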
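The same calculation can be checked from the summary statistics alone. The sketch below uses the means and estimated variances from the example with 17 subjects per group, as in the calculation; the function name is hypothetical. Carrying the unrounded error term through gives a t of about 3.05, slightly below the 3.06 obtained by hand after rounding the error term to .98.

```python
import math


def independent_t(mean1, var1, n1, mean2, var2, n2, null_difference=0.0):
    """t for two independent samples from their means and estimated variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    t = ((mean1 - mean2) - null_difference) / standard_error
    df = (n1 + n2) - 2
    return t, df, standard_error


# Friendly: mean 23, variance 9.0; unfriendly: mean 20, variance 7.5; n = 17 each.
t, df, se = independent_t(23, 9.0, 17, 20, 7.5, 17)
print(round(se, 3), round(t, 2), df)
```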

Page 50: Psych 230 Psychological Measurement and Statistics

Plotting the difference between μ1 and μ2

[Figure: distribution of the difference between the population means, with Tobt = +3.06 marked]

Page 51: Psych 230 Psychological Measurement and Statistics

4. Calculate the critical value

• Assume α = 0.05

• We are looking for any difference, therefore it will be a two-tailed test

• df = (n1 + n2) − 2 = (17 + 17) − 2 = 32

Two-tailed Test
df    α = .05   α = .01
32    2.021     2.704

tcrit = ±2.021

Page 52: Psych 230 Psychological Measurement and Statistics

Tcrit and Tobt

[Figure: distribution of the difference between the population means, with Tcrit = −2.021 and Tcrit = +2.021 marked and Tobt = +3.06 beyond the upper critical value]

Page 53: Psych 230 Psychological Measurement and Statistics

5. Make our Conclusion

• As Tobt is inside the rejection region, we reject H0 and accept H1

• We conclude that friendliness in waitstaff is significantly related to tip size
  – Friendlier waitstaff get significantly bigger tips

tobt = +3.06
tcrit = ±2.021
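The decision rule for a two-tailed test can be sketched as a single comparison: the obtained value falls in the rejection region when its absolute value exceeds the critical value. The function name is hypothetical; the numbers are the obtained and critical values quoted in the slides.

```python
def reject_null_two_tailed(t_obt, t_crit):
    """True when t_obt lies beyond either critical value of a two-tailed test."""
    return abs(t_obt) > abs(t_crit)


# Wait staff example: t_obt = +3.06 against t_crit = ±2.021.
print(reject_null_two_tailed(3.06, 2.021))

# The earlier plot: t_obt = -1.71 against t_crit = ±1.833.
print(reject_null_two_tailed(-1.71, 1.833))
```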

Page 54: Psych 230 Psychological Measurement and Statistics

Another example

• Does cramming for exams hurt or help grades?

• We run an experiment where one group is randomly selected and instructed to cram for an exam. Another random group is instructed to study at regular intervals. The data are as follows:

                  Cram   Regular
Score (X̄)         43     48
Subjects (n)      31     31
Est. Var. (ŝ²)    64     83.6

Page 55: Psych 230 Psychological Measurement and Statistics

1. State the Hypotheses

• One-tailed or two-tailed?

• What do we predict?

• HA: μcram ≠ μregular

• H0: μcram = μregular

Page 56: Psych 230 Psychological Measurement and Statistics

2. Decide which test to use

• Are we comparing a sample to a population?

• How many samples are we comparing?
  – Two: Use the Two-sample T-test

• Are the samples independent or related?
  – Independent: Use Independent Samples T-test
  – Related: Use Related Samples T-test

Page 58: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

                  Cram   Regular
Score (X̄)         43     48
Subjects (n)      31     31
Est. Var. (ŝ²)    64     83.6

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

First calculate the error term

Page 60: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

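                  Cram   Regular
Score (X̄)         43     48
Subjects (n)      31     31
Est. Var. (ŝ²)    64     83.6

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\hat{s}_1^2}{n_1} + \frac{\hat{s}_2^2}{n_2}} = \sqrt{\frac{64}{31} + \frac{83.6}{31}} = \sqrt{2.06 + 2.70} = \sqrt{4.76} = 2.18 \]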

Page 63: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

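                  Cram   Regular
Score (X̄)         43     48
Subjects (n)      31     31
Est. Var. (ŝ²)    64     83.6

\[ t_{obt} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = (n_1 + n_2) - 2 \]

\[ t_{obt} = \frac{(43 - 48) - 0}{2.18} = \frac{-5}{2.18} = -2.29, \qquad df = (31 + 31) - 2 = 60 \]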

Page 64: Psych 230 Psychological Measurement and Statistics

4. Calculate the critical value

• α = 0.05

• Two-tailed test

• df = (n1 − 1) + (n2 − 1)

• df = (31 − 1) + (31 − 1) = 60

Two-tailed Test
df    α = .05   α = .01
60    2.000     2.660

Tcrit = ±2.000

Page 65: Psych 230 Psychological Measurement and Statistics

Tcrit and Tobt

[Figure: distribution of the difference between the population means, with Tcrit = −2.0 and Tcrit = +2.0 marked and Tobt = −2.3 beyond the lower critical value]

Page 66: Psych 230 Psychological Measurement and Statistics

5. Make our Conclusion

• Tcrit = ±2.0

• Tobt = −2.3

• As Tobt is inside the rejection region, we reject H0 and accept H1

• We conclude that cramming is significantly related to test performance
  – cramming leads to significantly worse test performance
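A result like this is usually reported in a compact form. The sketch below recomputes the cramming example from its summary statistics and formats it in the style "t(df) = value, p < alpha". The function names are hypothetical, and the critical value of 2.000 is taken from the table for df = 60 rather than computed.

```python
import math


def independent_t(mean1, var1, n1, mean2, var2, n2):
    """t and df for two independent samples, with a null difference of zero."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error, (n1 + n2) - 2


def report_t(t_obt, df, t_crit, alpha=0.05):
    """Format a two-tailed result; '<' means the result is significant at alpha."""
    relation = "<" if abs(t_obt) > t_crit else ">"
    return f"t({df}) = {t_obt:.2f}, p {relation} {alpha}"


# Cram: mean 43, variance 64; regular: mean 48, variance 83.6; n = 31 each.
t, df = independent_t(43, 64, 31, 48, 83.6, 31)
report = report_t(t, df, t_crit=2.000)
print(report)
```

The sign of t carries the direction of the effect: a negative value here means the cramming group scored lower than the regular-study group.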

Page 67: Psych 230 Psychological Measurement and Statistics

Cause and Effect

• True experiments
  – The experimenter controls the IV
  – We can make a causal argument from the results

• Intact group design experiments
  – (e.g. IV = gender, age, genes, etc.)
  – All you can say is whether or not there is a difference between groups

Page 68: Psych 230 Psychological Measurement and Statistics

The End

• Have a good weekend.

Page 69: Psych 230 Psychological Measurement and Statistics

Related Sample T-test

Page 70: Psych 230 Psychological Measurement and Statistics

Related Samples T-Test

• In a matched samples design, the researcher matches each subject in one condition with a subject in the other condition
  – we do this so that we have more comparable subjects in the conditions

• In a repeated measures design, each subject is tested under all conditions of the independent variable

• These designs lower the variability due to individual differences in a study and make significance easier to achieve

• But they’re not appropriate for many studies

Page 71: Psych 230 Psychological Measurement and Statistics

Related Samples T-Test

• Direct difference method: find the difference between the scores and treat the differences as raw scores

• D is the difference between the scores in condition one and two

• D̄ is the mean difference:

\[ \bar{D} = \frac{\sum D}{n} \]

Page 72: Psych 230 Psychological Measurement and Statistics

Related Samples T-Test

• t ratio for related samples:

\[ t_{obt} = \frac{\bar{D}}{\sqrt{\dfrac{\sum D^2 - \frac{(\sum D)^2}{n}}{n(n - 1)}}}, \qquad df = n - 1 \]

Page 73: Psych 230 Psychological Measurement and Statistics

Example - Related Samples T-Test

• Researchers have developed a new medication therapy for Capgras syndrome and wish to see if this drug can help patients who suffer from this disorder. Five patients were selected to receive this drug, with their scores on a Capgras scale measured before and after treatment. The scores were as follows:

Subject   Before   After
1         11       8
2         16       11
3         20       15
4         17       11
5         10       11

Page 74: Psych 230 Psychological Measurement and Statistics

1. Decide which test to use

• Are we comparing a sample to a population?
  – Yes: Z-test if we know the population std dev
  – Yes: One-sample T-test if we do not know the population std dev
  – No: Keep looking

• How many samples are we comparing?
  – Two: Use the Two-sample T-test

• Are the samples independent or related?
  – Independent: Use Independent Samples T-test
  – Related: Use Related Samples T-test

Page 75: Psych 230 Psychological Measurement and Statistics

2. State the Hypotheses

• What are our hypotheses? What do we predict?

• We probably predict that the drug will make some difference to the symptoms of Capgras syndrome

• So, we predict that there will be a difference between the severity of symptoms before and after treatment

• We can write this as: μdrug ≠ μno drug

• Or: μD ≠ 0, where D = difference score

Page 76: Psych 230 Psychological Measurement and Statistics

2. State the Hypotheses

• H1 : μD ≠ 0
  – there is a difference in the symptoms before and after treatment

• The null hypothesis (H0) always states that there is no relationship between the variables
  – the drug will not affect the disorder

• H0 : μD = 0

Page 77: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

• Because we have the same group of people (or a matched group of people) in both of our groups, we can take a little short cut in finding the Tobt

• We can transform the data by calculating difference scores for each of our subjects, and using these scores to find Tobt

• When we do this, calculating Tobt is the same as computing the one-sample t-test we discussed last week

Page 78: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

Subject   Before   After
1         11       8
2         16       11
3         20       15
4         17       11
5         10       11

\[ \bar{D} = \frac{\sum D}{n} \]

Page 80: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

Subject   Before   After   Difference   Difference²
1         11       8       +3           9
2         16       11      +5           25
3         20       15      +5           25
4         17       11      +6           36
5         10       11      −1           1

ΣD = 18
ΣD² = 96

\[ \bar{D} = \frac{\sum D}{n} = \frac{18}{5} = 3.6 \]

Page 82: Psych 230 Psychological Measurement and Statistics

3. Calculate Tobt

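D̄ = 3.6
ΣD² = 96
ΣD = 18

\[ t_{obt} = \frac{\bar{D}}{\sqrt{\dfrac{\sum D^2 - \frac{(\sum D)^2}{n}}{n(n - 1)}}}, \qquad df = n - 1 \]

\[ t_{obt} = \frac{3.6}{\sqrt{\dfrac{96 - \frac{18^2}{5}}{5(5 - 1)}}} = \frac{3.6}{\sqrt{\dfrac{31.2}{20}}} = 2.88 \]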
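The direct difference method can be sketched as follows, using the before and after scores from the Capgras example. The function name is hypothetical, differences are taken as before minus after (so a positive difference means fewer symptoms after treatment), and the null hypothesis value of the mean difference is assumed to be zero.

```python
import math


def related_samples_t(before, after):
    """t for related samples: treat the difference scores as raw scores."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    sum_d = sum(diffs)                    # sum of the difference scores
    sum_d2 = sum(d * d for d in diffs)    # sum of the squared difference scores
    mean_d = sum_d / n
    standard_error = math.sqrt((sum_d2 - sum_d ** 2 / n) / (n * (n - 1)))
    return mean_d / standard_error, n - 1


before = [11, 16, 20, 17, 10]
after = [8, 11, 15, 11, 11]

t, df = related_samples_t(before, after)
print(round(t, 2), df)
```

Once the difference scores are formed, the calculation is the same as a one-sample t-test of the mean difference against zero.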

Page 83: Psych 230 Psychological Measurement and Statistics

Plotting the difference between μ1 and μ2

[Figure: distribution of the difference between the population means, with Tobt = +2.88 marked]

Page 84: Psych 230 Psychological Measurement and Statistics

4. Calculate the critical value

• Assume α = 0.05

• We are looking for any difference, therefore it will be a two-tailed test

• df = n − 1

• df = (5 − 1) = 4

Two-tailed Test
df    α = .05   α = .01
4     2.776     4.604

tcrit = ±2.776

Page 85: Psych 230 Psychological Measurement and Statistics

tcrit and tobt

[Figure: distribution of the difference between the population means, with Tcrit = −2.776 and Tcrit = +2.776 marked and Tobt = +2.88 beyond the upper critical value]

Page 86: Psych 230 Psychological Measurement and Statistics

5. Make our Conclusion

• tcrit = ±2.776

• tobt = +2.88

• As tobt is inside the rejection region, we reject H0 and accept H1

• We conclude that the new drug significantly alters symptoms of Capgras syndrome
  – Symptoms decrease significantly on medication
  – t(4) = 2.88, p < 0.05