Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Page 1: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Psych 230

Psychological Measurement and Statistics

Pedro Wolf

September 9, 2009

Page 2: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

So Far

• Stem and leaf plots
• Bar plots
• Summarizing scores using frequency
– how a frequency distribution is created
• Graphing frequency distributions
– bar graphs, histograms, polygons
• Types of distribution
– normal, skewed, bimodal
• Relative frequency and the normal curve

Page 3: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

the Normal Curve

How likely is it that a certain score will occur?

Page 4: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Today….

• Summarizing scores using central tendency
– what is central tendency?
• The Mode
– what it is, how to calculate it, & when to use it
• The Median
– what it is, how to calculate it, & when to use it
• The Mean
– what it is, how to calculate it, & when to use it
– applying the mean to research

Page 5: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

• Range
– what it is & how to calculate it
• Variance
– what it is & how to calculate it
• Standard Deviation
– what it is & how to calculate it
• Variability and the Normal Distribution
• Population Variance and Standard Deviation

Page 6: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Central Tendency

Page 7: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Why do we need a measure of Central Tendency?

• Often we would like to know the most typical or representative score of a dataset
– How many drinks do students consume a week?
– What are the political beliefs of students?
– What is people’s favorite color?
– How much do lawyers get paid?
– What is the temperature in London?
• There are different ways to calculate a typical score.
– Each way has advantages and disadvantages. The best choice depends on:
• Type of data
• Distribution of data

Page 8: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

What is a Measure of Central Tendency?

• Measures of central tendency answer the question:
– “Are the scores generally high scores or generally low scores?”
• They allow us to compare values:
– Average high / low temp in May in Tucson: 90ºF / 53ºF
– Average high / low temp in May in St. Petersburg: 59ºF / 42ºF

• A statistic that indicates where the center of the distribution tends to be located

Page 9: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Measures of Central Tendency

• There are three commonly used measures of central tendency

– Mode
– Median
– Mean

• There is no single, perfect, measure of central tendency

Page 10: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Measures of Central Tendency

[Figure: three example frequency distributions, each with scores 1 to 9 on the X axis and Frequency (0 to 5) on the Y axis]

Page 11: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• The following are the salaries of the 15 employees of a small consulting company

$82,000  $64,000  $36,400  $34,000  $29,200
$29,200  $29,200  $28,000  $26,800  $26,800
$26,800  $24,400  $24,400  $24,400  $24,400

• What is the typical salary of an employee in this company?

• How can different measures of central tendency be used to make different arguments?

Page 12: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Mode

Page 13: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

What is the Mode?

• The mode is the score that has the highest frequency in the data

• The mode is always used to describe central tendency when the scores reflect a nominal scale of measurement

• Can also be used for other scales of measurement

• Scores: 2, 3, 4, 4, 5, 5, 5
• Mode = 5

Page 14: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Mode

• Can find the mode by inspection (as opposed to computation)

• Simply the score with the highest frequency

Page 16: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Mode from Raw Scores

• What is the mode of the following data:

14 10  1  2 15  4
14 14 13 15 11 15
13  2 12 13 14 13
14 25 47  1 14 15

Mode = 14
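Finding the mode by inspection amounts to counting how often each score occurs. A minimal Python sketch of that count, using the raw scores from this example (the function name `modes` is illustrative and not part of the lecture), might look like this:

```python
from collections import Counter

# Raw scores from the slide
scores = [14, 10, 1, 2, 15, 4,
          14, 14, 13, 15, 11, 15,
          13, 2, 12, 13, 14, 13,
          14, 25, 47, 1, 14, 15]

def modes(values):
    """Return every value tied for the highest frequency, in sorted order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes(scores))  # [14]
```

Returning a list rather than a single value means a bimodal dataset reports both of its modes.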

Page 18: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Mode from Frequency Table

• What is the mode of the following data:

Value   Frequency
6       5
5       6
4       3
3       1
2       1
1       1

Mode = 5

Page 20: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Mode from Graphs

[Figure: bar graph of Quiz #1 Scores, with Score (1 to 5) on the X axis and Frequency (0 to 140) on the Y axis]

Mode = 5

Page 21: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Unimodal Distributions

• When a graph has one hump (such as on the normal curve) the distribution is called unimodal

Page 22: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Bimodal Distributions

• When a graph shows two scores that are tied for the most frequently occurring score, it is called bimodal.

Page 23: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• The following are the salaries of the 15 employees of a small consulting company.

$82,000  $64,000  $36,400  $34,000  $29,200
$29,200  $29,200  $28,000  $26,800  $26,800
$26,800  $24,400  $24,400  $24,400  $24,400

• What is the modal salary?
• $24,400
– is this a good description of the typical salary?

Page 24: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Mode

• Advantages:
– can be used with nominal data
– easily identified
– unaffected by extreme scores
– bimodal datasets may suggest interesting subgroups
• Disadvantages:
– not necessarily a unique score
– not very precise
– cannot be manipulated mathematically

Page 25: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Median

Page 26: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

What is the Median?

• The median is the middle score of the data; the score that divides the data in half

• The median is the score at the 50th percentile

– you did this in your homework when calculating the quartiles

• The median is used to summarize ordinal or highly skewed interval or ratio scores

Page 27: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to Find the Median

• When data are normally distributed, the median is the same score as the mode.

• When data are not normally distributed, use the following procedure:
– arrange the scores from lowest to highest.
– if there is an odd number of scores, the median is the score in the middle position.
– if there is an even number of scores, the median is the average of the two scores in the middle.
– position of the median = (N+1)/2

Page 29: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Median from Raw Scores

• What is the median of the following data:

14 10  1  2 15  4
14 14 13 15 11 15
13  2 12 13 14 13
14 25 47  1 14 15

• First, arrange in order of magnitude

 1  1  2  2  4 10
11 12 13 13 13 13
14 14 14 14 14 14
15 15 15 15 25 47

Page 30: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Median from Raw Scores

 1  1  2  2  4 10
11 12 13 13 13 13
14 14 14 14 14 14
15 15 15 15 25 47

• Number of scores: N = 24
• N is even, so the median is the average of the middle two scores:
• Median position = (N+1)/2 = (24+1)/2 = 25/2 = 12.5
• Average of the 12th and 13th scores
• (13 + 14) / 2 = 13.5
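The sort-then-pick-the-middle procedure can be sketched in a few lines of Python. This is an illustrative helper (the name `median` is an assumption, not from the lecture) that handles both the odd and the even case:

```python
def median(values):
    """Middle score of the sorted data; average of the two middle scores if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd N: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: average the middle pair

scores = [14, 10, 1, 2, 15, 4,
          14, 14, 13, 15, 11, 15,
          13, 2, 12, 13, 14, 13,
          14, 25, 47, 1, 14, 15]

print(median(scores))  # 13.5
```

With N = 24 the middle pair is the 12th and 13th sorted scores (13 and 14), matching the hand calculation.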

Page 31: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Median from Frequency Table

• What is the median of the following data:

X    f
6    5
5    6
4    3
3    1
2    1
1    1

Page 34: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example - Median from Frequency Table

• What is the median of the following data:

X    f
6    5
5    6
4    3
3    1
2    1
1    1

N = 17
Median position = (N+1)/2 = (17+1)/2 = 9th score
Median = 5
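Counting up to the 9th score in a frequency table is a cumulative-frequency walk. A small Python sketch of that walk, using the table from this slide (the helper names `score_at` and `median_from_table` are illustrative assumptions):

```python
# Frequency table from the slide: score -> frequency
freq_table = {6: 5, 5: 6, 4: 3, 3: 1, 2: 1, 1: 1}

def score_at(table, position):
    """Return the score at a 1-based position when all scores are sorted ascending."""
    cumulative = 0
    for score in sorted(table):
        cumulative += table[score]
        if cumulative >= position:
            return score
    raise ValueError("position is larger than the number of scores")

def median_from_table(table):
    n = sum(table.values())
    if n % 2 == 1:
        return score_at(table, (n + 1) // 2)
    return (score_at(table, n // 2) + score_at(table, n // 2 + 1)) / 2

n = sum(freq_table.values())
median = median_from_table(freq_table)
mode = max(freq_table, key=freq_table.get)  # score with the highest frequency
print(n, median, mode)  # 17 5 5
```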

Page 35: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Median

• Advantages:
– useful for skewed distributions
– unaffected by extreme scores
– useful for dividing sets of scores into two halves (for example, high and low scorers in an exam)
• Disadvantages:
– does not take into account extreme scores
– cannot be manipulated mathematically

Page 36: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Mean

Page 37: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

What is the Mean?

• The mean is the score located at the exact mathematical center of a distribution– the “average”

• The mean is used to summarize interval or ratio data in situations when the distribution is symmetrical and unimodal

• By far the most commonly used measure of central tendency

Page 38: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to Find the Mean

• The symbol for the sample mean is $\bar{X}$

• The formula for the sample mean is:

$\bar{X} = \frac{\sum X}{N}$

Page 40: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• Calculate the mean of the following data:

• 12, 15, 17, 12, 13, 9, 1, 6, 3, 12, 12, 16, 17
• Mode = 12
• Median = 12

N = 13
$\sum X$ = 12+15+17+12+13+9+1+6+3+12+12+16+17

$\sum X = 145$

$\bar{X} = \sum X / N = 145 / 13 = 11.154$
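As a quick check of the arithmetic, the same mean can be computed in Python by summing the scores and dividing by N. The variable names here are illustrative:

```python
scores = [12, 15, 17, 12, 13, 9, 1, 6, 3, 12, 12, 16, 17]

total = sum(scores)   # sum of X
n = len(scores)       # N
mean = total / n      # X-bar = (sum of X) / N

print(total, n, round(mean, 3))  # 145 13 11.154
```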

Page 41: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• The following are the salaries of the 15 employees of a small consulting company

$82,000  $64,000  $36,400  $34,000  $29,200
$29,200  $29,200  $28,000  $26,800  $26,800
$26,800  $24,400  $24,400  $24,400  $24,400

• What is the mean salary?

• $510,000 / 15 = $34,000
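To see how the three measures can support different arguments about the "typical" salary, the following sketch computes all of them with Python's standard `statistics` module. The median value is not stated on the slides; it is simply what the data give.

```python
import statistics

salaries = [82000, 64000, 36400, 34000, 29200,
            29200, 29200, 28000, 26800, 26800,
            26800, 24400, 24400, 24400, 24400]

mode = statistics.mode(salaries)      # most frequent salary
median = statistics.median(salaries)  # 8th of the 15 sorted salaries
mean = statistics.mean(salaries)      # total payroll divided by 15

print(mode, median, mean)  # 24400 28000 34000
```

The two large salaries pull the mean well above both the mode and the median, which is why the choice of measure matters for skewed data like this.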

Page 42: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Sample Mean vs. Population Mean

• $\bar{X}$ is the sample mean. This is a sample statistic.

• The mean of a population is a parameter. It is symbolized by $\mu$ (pronounced “mew”)

• $\bar{X}$ is used to estimate the corresponding population mean $\mu$

Page 44: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Your Turn - Mean

• For the mean, we need $\sum X$ and N
• We know that N = 18
• What is $\sum X$?

• 10+11+12+13+13+13+13+14+14+14+14+14+14+15+15+15+15+17

$\sum X = 246$

Page 45: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Your Turn - Mean

$\bar{X} = \frac{\sum X}{N} = \frac{246}{18} = 13.67$

Page 46: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Weighted mean

• The mean of a group of means
• Sometimes you want to compare groups with different numbers of scores
• Suppose you have 4 class averages: 75, 78, 72, 80.
• How do you find the mean?
• (75+78+72+80)/4 = 76.25
• Only works if every class has the same number of people

Page 47: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Formula for Weighted Mean

$\bar{X}_w = \frac{\sum fX}{N_{tot}}$

Page 48: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Weighted mean

• If the Ns of the groups of scores differ…
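A short Python sketch of the weighted mean follows. The four class averages are the ones from the earlier slide; the class sizes are hypothetical values chosen only for illustration, since the slides do not give them.

```python
class_means = [75, 78, 72, 80]   # class averages from the slide
class_sizes = [20, 30, 25, 25]   # hypothetical Ns, assumed for this example

def weighted_mean(means, sizes):
    """Sum of (group size * group mean) divided by the total number of scores."""
    total = sum(f * x for f, x in zip(sizes, means))
    return total / sum(sizes)

unweighted = sum(class_means) / len(class_means)
weighted = weighted_mean(class_means, class_sizes)

print(unweighted)  # 76.25
print(weighted)    # 76.4
```

With these assumed sizes the larger class (mean 78) counts for more, so the weighted mean sits slightly above the simple average of the four means.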

Page 49: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Which measure to use?

Page 50: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Central Tendency - Normal Distributions

• On a perfect normal distribution, all three measures of central tendency are located at the same score: mean = median = mode

Page 51: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Central Tendency - Normal Distributions

• As the mean uses all of the information in the data, it is the preferred one to use in this case

Page 52: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Central Tendency - Skewed Distributions

• Use the median to summarize a highly skewed distribution

Page 53: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Which measure to use?

• Generally, the mean is the best measure of central tendency
– All scores count in computing the mean
• Unless:
– Nominal data: use the mode
– Highly skewed data: use the median

Page 54: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Samples and Populations

• The sample mean provides a better estimate of the central tendency of a population than the sample median.

• The more observations in our sample, the closer the sample mean will be to the population mean.

• Often, the sample mean will be more accurate than any one individual score

Page 55: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Applying the Mean to Research

Page 56: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Why the Mean?

• So, the mean is usually the best “model” of our data
– The best summary of the distribution
– And, we can do statistics on it
– This makes it ideal for comparing groups

Page 57: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Using the mean

• Prediction
– without knowing anything else, the mean is our best estimate
• Describing a score’s location
– a deviation score indicates a raw score’s location and frequency
• Describing the population mean
– ultimately we want to describe the population
• Summarizing Experiments

Page 58: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Summarizing Experiments

• We compute the mean every time we have a sample of normally distributed scores

• The first step in an experiment is usually to compute the mean of the dependent variable

Page 60: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• An experiment was conducted to test the efficacy of a new diet drug. The drug was administered to rats in various dosages and the rats’ food consumption was measured. The data are as follows:

Rat   0mg   2mg   4mg   6mg
A     12    10    13    7
B     13    10    12    5
C     9     10    11    7
D     10    10    7     5
E     18    10    7     4

$\bar{X}$ (0mg) = 12.4

$\bar{X}$ (2mg) = 10

$\bar{X}$ (4mg) = 10

$\bar{X}$ (6mg) = 5.6
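Computing one mean per condition is the same calculation repeated for each column of the table. A minimal Python sketch, with the data keyed by dosage (the dictionary layout is an illustrative choice):

```python
# Food consumption for rats A-E under each dosage, from the table above
consumption = {
    "0mg": [12, 13, 9, 10, 18],
    "2mg": [10, 10, 10, 10, 10],
    "4mg": [13, 12, 11, 7, 7],
    "6mg": [7, 5, 7, 5, 4],
}

# Mean of the dependent variable (food consumption) for each condition
condition_means = {dose: sum(xs) / len(xs) for dose, xs in consumption.items()}

for dose, mean in condition_means.items():
    print(dose, mean)
```

Note that the 2mg and 4mg conditions share a mean of 10 even though their individual scores look quite different, which is one reason a measure of variability is needed alongside the mean.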

Page 61: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Graphing Experimental Results

• Plot the independent variable on the X axis and the dependent variable on the Y axis

• Create a bar graph when the independent variable is a nominal or ordinal variable

• Create a line graph when the independent variable is an interval or a ratio variable

Page 62: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Bar Graphs

• The bar above each condition on the X axis is placed to the height on the Y axis that corresponds to the mean score for that condition

Page 63: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Line Graphs

• A line graph uses straight lines to connect adjacent data points

Page 64: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Today….

• Understanding variability
– what can it tell us
• Range
– what it is & how to calculate it
• Variance
– what it is & how to calculate it
• Standard Deviation
– what it is & how to calculate it
• Variability and the Normal Distribution
• Population Variance and Standard Deviation

Page 65: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Understanding Variability

Page 66: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Understanding Variability

• The mean gives us a good measure of the central tendency of our data - the average value

• Mean age in the class = 20.34

• Does this tell us all we need to know about the distribution of ages in the class?

• We also need a measure of how spread out the scores are

Page 67: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Understanding Variability

• Measures of variability describe the extent to which scores in a distribution differ from each other

• The mean is our best estimate of central tendency. What is our best estimate of variability?

Page 68: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variability - Example

• Same means but different variability

Page 70: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variability - Example

• Three variations of the normal distribution

Page 71: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Range

Page 72: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

What is the Range?

• The range indicates the distance between the two most extreme scores in a distribution

• Range = highest score – lowest score

Page 73: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Range - Example

• What is the range of the following dataset?

12, 16, 18, 23, 11, 10, 9, 4, 23, 15, 14, 13

• Range = highest score – lowest score

• Range = 23 - 4

• Range =19

Page 74: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Range - Your turn

• What are the ranges of the following datasets?

– 2, 6, 12, 10, 0

– 4, 7, 5, 8, 6

– 6, 6, 6, 6, 6

• 1) 12 - 0 = 12

• 2) 8 - 4 = 4

• 3) 6 - 6 = 0
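The range calculation is a one-liner in Python. The sketch below (the function name `score_range` is an illustrative assumption) reproduces the worked example and the three practice datasets:

```python
def score_range(values):
    """Range = highest score - lowest score."""
    return max(values) - min(values)

print(score_range([12, 16, 18, 23, 11, 10, 9, 4, 23, 15, 14, 13]))  # 19
print(score_range([2, 6, 12, 10, 0]))  # 12
print(score_range([4, 7, 5, 8, 6]))    # 4
print(score_range([6, 6, 6, 6, 6]))    # 0
```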

Page 75: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Range

• Though the range does give some idea of the spread of data, it is quite a crude measure

• Based on highest and lowest values, and so reflects the least typical scores

Page 76: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variance and Standard Deviation

Page 77: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variance and Standard Deviation

• Most psychological research involves interval or ratio scores which approximate a normal distribution

• In these situations, we use two, similar, measures of variability, known as the variance and the standard deviation

• A measure of how different the scores are from each other

Page 78: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variance and Standard Deviation

• A measure of how different the scores are from each other
– calculated by measuring how much the scores differ from the mean. (Remember: the mean is our best estimate of central tendency)

• The variance and standard deviation indicate how much the scores are spread out around the mean

• Mean hours of tv watched weekly = 7.99

• Is the number of hours of tv between 6 and 8 hours or between 1 and 30 hours?

Page 79: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variance and Standard Deviation

• We want a measure of how much the scores are spread out around the mean

• Why not just take an average of the distance between each score and the mean?

– Data: 4, 6, 8, 10, 12

– Mean = 8

• Cannot do this, because the sum of the deviations always equals 0

– positive deviations cancel out negative deviations
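The cancelling-out can be verified directly. This short Python sketch computes each score's deviation from the mean for the data above and sums them:

```python
data = [4, 6, 8, 10, 12]
mean = sum(data) / len(data)           # 8.0

deviations = [x - mean for x in data]  # distance of each score from the mean
print(deviations)       # [-4.0, -2.0, 0.0, 2.0, 4.0]
print(sum(deviations))  # 0.0
```

Because the negative and positive deviations always balance, their plain average cannot serve as a measure of spread.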

Page 80: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Variance and Standard Deviation

• So, we want a measure which is like the average of the deviations, but which is calculated differently

• Conceptually, we can think of the variance and standard deviation as the typical amount that each score differs from the mean

Page 81: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Variance

• One solution to the problem of deviations canceling out is to square the deviations

• Why? All the positive deviations will stay positive and all negative deviations will become positive - no canceling out

• Data: 4, 6, 8, 10, 12
• Mean = 8
• Sum of the deviations $\sum (X - \bar{X})$ =
– [(4-8)+(6-8)+(8-8)+(10-8)+(12-8)] =
– [(-4)+(-2)+(0)+(2)+(4)] = 0

Page 82: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Variance

• What happens when we square the deviations?

• Data: 4, 6, 8, 10, 12
• Mean = 8
• Sum of the squared deviations $\sum (X - \bar{X})^2$ =
• $[(4-8)^2 + (6-8)^2 + (8-8)^2 + (10-8)^2 + (12-8)^2]$ =
• $[(-4)^2 + (-2)^2 + (0)^2 + (2)^2 + (4)^2]$ =
• [(16) + (4) + (0) + (4) + (16)] = 40

Page 83: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Variance

• Does 40 reflect the average deviation from the mean?

• To get an average deviation, we should divide by the number of scores (N)

$\frac{\sum (X - \bar{X})^2}{N} = \frac{40}{5} = 8$

• This statistic is known as the variance ($S_X^2$)

Page 85: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• The sample variance is the average of the squared deviations of scores around the sample mean

$S_X^2 = \frac{\sum (X - \bar{X})^2}{N}$

Note: we can use a simpler formula to calculate the variance by hand

Page 89: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance - Example

• Estimates of professor’s age:

• Data: 36, 34, 45, 31, 35, 38, 36

• N = 7

• $\bar{X}$ = (36+34+45+31+35+38+36) / 7 = 36.43

• $\sum (X - \bar{X})^2$ =

$[(36-36.43)^2 + (34-36.43)^2 + (45-36.43)^2 + (31-36.43)^2 + (35-36.43)^2 + (38-36.43)^2 + (36-36.43)^2]$ =

$[(-0.43)^2 + (-2.43)^2 + (8.57)^2 + (-5.43)^2 + (-1.43)^2 + (1.57)^2 + (-0.43)^2]$ =

0.184 + 5.904 + 73.444 + 29.484 + 2.044 + 2.464 + 0.184 = 113.708

$S_X^2 = \frac{\sum (X - \bar{X})^2}{N}$

Page 91: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance - Example

• N = 7

• $\bar{X}$ = 36.43

• $\sum (X - \bar{X})^2$ = 113.708

$S_X^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{113.708}{7} = 16.244$
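The definitional formula (average squared deviation from the mean, dividing by N) can be sketched in Python as follows. The function name `variance` is illustrative; Python's built-in `statistics.pvariance` also divides by N and should give the same result.

```python
def variance(values):
    """Average of the squared deviations around the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

ages = [36, 34, 45, 31, 35, 38, 36]  # estimates of the professor's age
print(round(variance(ages), 3))      # 16.245
print(variance([4, 6, 8, 10, 12]))   # 8.0
```

The unrounded result (about 16.245) differs slightly from the hand calculation's 16.244 because the hand calculation rounds the mean to 36.43 and truncates each squared deviation.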

Page 93: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance - Your Turn

• Find the variance of the following dataset: 3,3,5,9

• N = 4

• $\bar{X}$ = 5

• $\sum (X - \bar{X})^2$ =

$[(3-5)^2 + (3-5)^2 + (5-5)^2 + (9-5)^2]$ =
$[(-2)^2 + (-2)^2 + (0)^2 + (4)^2]$ =
[(4) + (4) + (0) + (16)] = 24

$S_X^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{24}{4} = 6$

Page 94: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Variance - pros and cons

• Pros:

• The variance is a legitimate measure of variability

• Usefully communicates the relative variability of scores

• We will use it extensively in further statistics

Page 95: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

The Variance - pros and cons

• Cons:

• The variance doesn’t make much sense as a measure of the “average deviation”

– we have squared all of the deviations, so the result is unrealistically large and is in squared units

– Professor’s Age example: • Data: 36, 34, 45, 31, 35, 38, 36

• Mean = 36.43

• Variance = 16 (16 what? 16 squared years)

Page 96: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

What is the Standard Deviation?

• The variance is a squared deviation score. To convert it back to the original scale, we can take the square root of the variance. This is known as the standard deviation ($S_X$).

• The standard deviation indicates the “average deviation” from the mean, the consistency in the scores, and how far scores are spread out around the mean

• The larger the value of $S_X$, the more the scores are spread out around the mean, and the wider the distribution

Page 97: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Standard Deviation

• The sample standard deviation is the square root of the variance

$S_X = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$

Note: You can use a simpler formula to calculate the variance by hand.

Page 99: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Standard Deviation - Example

• Find the standard deviation of the following dataset: 3,3,5,9

• N = 4

• $\bar{X}$ = 5

• $\sum (X - \bar{X})^2$ = $[(3-5)^2 + (3-5)^2 + (5-5)^2 + (9-5)^2]$ = $[(-2)^2 + (-2)^2 + (0)^2 + (4)^2]$ = [(4) + (4) + (0) + (16)] = 24

$S_X = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{24}{4}} = \sqrt{6} = 2.45$

Page 101: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Standard Deviation - Example

• Estimates of professor’s age:

• Data: 36, 34, 45, 31, 35, 38, 36

• N = 7

• $\bar{X}$ = 36.43

• $\sum (X - \bar{X})^2$ = 113.708

$S_X = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{113.708}{7}} = \sqrt{16.244} = 4.03$
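Taking the square root of the variance is the only extra step. A minimal Python sketch covering both standard deviation examples (the function name is illustrative; `statistics.pstdev` in the standard library performs the same divide-by-N calculation):

```python
import math

def standard_deviation(values):
    """Square root of the average squared deviation around the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)

print(round(standard_deviation([3, 3, 5, 9]), 2))                    # 2.45
print(round(standard_deviation([36, 34, 45, 31, 35, 38, 36]), 2))    # 4.03
```

Unlike the variance, the result is back in the original units, so 4.03 can be read as roughly 4 years.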

Page 102: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Normal Distribution and the Standard Deviation

• The standard deviation is a measure of how far scores are from the mean, on average

$S_X = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$

Page 103: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Normal Distribution and the Standard Deviation

• The standard deviation is related to the normal distribution

• On any normal distribution, approximately 0.34 of the scores lie between the mean and the score one standard deviation higher than the mean

• And, of course, 0.34 lie between the mean and the score one standard deviation lower than the mean

• So, 0.68 of the scores (or 68%) are within one standard deviation of the mean

Page 104: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Normal Distribution and the Standard Deviation

• Approximately 34% of the scores in a perfect normal distribution are between the mean and the score that is one standard deviation from the mean.

Page 106: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Normal Distribution and the Standard Deviation

• Why is this useful?
• You score 85 on an exam - is this good?
• If you know scores on an exam are normally distributed, that the mean score was 80 and the standard deviation was 5, what percentage of people did you score better than?

• You were above the mean: better than 50%
• You are one standard deviation above the mean: better than another 34%
• Therefore, you scored better than 84% of people
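The 50% + 34% reasoning can be checked against the normal curve itself. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later), with the exam's mean of 80 and standard deviation of 5:

```python
from statistics import NormalDist

exam = NormalDist(mu=80, sigma=5)

below_85 = exam.cdf(85)                       # proportion scoring below 85
within_one_sd = exam.cdf(85) - exam.cdf(75)   # proportion between 75 and 85

print(round(below_85, 3))       # 0.841
print(round(within_one_sd, 3))  # 0.683
```

The exact proportions (about 0.841 and 0.683) agree with the rounded 84% and 68% figures used above.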

Page 107: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Normal Distribution and the Standard Deviation

Page 109: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Normal Distribution and the Standard Deviation

[Figure: histogram of HEIGHT, with HEIGHT (51 to 81) on the X axis and Frequency (0 to 14) on the Y axis; the values 62.68 and 70.38 are marked]

Mean=66.51

Var=14.646

StdDev=3.827

Page 110: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Computing the Sample Variance and Standard Deviation

Page 111: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Computing the Variance and Standard Deviation

• The previous formulas we used for calculating variance and standard deviation showed that we were computing a measure of the average deviation of scores from the mean

• However, if you have to do it by hand you can use some quicker and easier formulas for calculating these statistics.

Page 112: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• The computing formula for the sample variance is:

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

Page 113: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Computing the Variance - Example

• Estimates of professor’s age:

• Data: 36, 34, 45, 31, 35, 38, 36

• N = 7

• ∑X² = 9403

• (∑X)² = 65025

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

Page 114: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Computing the Variance - Example

• Estimates of professor’s age:

• Data: 36, 34, 45, 31, 35, 38, 36

• N = 7

• ∑X² = 9403

• (∑X)² = 65025

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{9403 - \frac{65025}{7}}{7} = \frac{9403 - 9289.286}{7} = \frac{113.714}{7} \approx 16.244$$
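The worked example can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original slides; the function names `variance_computing` and `variance_definitional` are hypothetical, and both divide by N as the sample formula in this section does.

```python
def variance_computing(scores):
    """Sample variance via the computing formula: (sum(X^2) - (sum(X))^2 / N) / N."""
    n = len(scores)
    sum_of_squared_xs = sum(x * x for x in scores)  # square each score, then sum
    sum_of_xs_squared = sum(scores) ** 2            # sum the scores, then square
    return (sum_of_squared_xs - sum_of_xs_squared / n) / n


def variance_definitional(scores):
    """Sample variance as the average squared deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


# Estimates of the professor's age, taken from the slide.
ages = [36, 34, 45, 31, 35, 38, 36]
print(variance_computing(ages))
print(variance_definitional(ages))
```

Both functions should agree up to floating-point rounding, which is the point of the computing formula: same result, less arithmetic by hand.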

Page 115: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

New Terminology and Notation Review

• The Sum of Squared Xs: ∑X²

– first square each raw score and then sum the squared Xs.

• The Sum of Xs, Squared: (∑X)²

– first sum the raw scores and then square that sum.

Page 116: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Review

• Find ∑X² for the following data:

4, 6, 7, 3, 2, 7, 2, 5

– first square each raw score and then sum the squared Xs.

(4)² + (6)² + (7)² + (3)² + (2)² + (7)² + (2)² + (5)² =

16 + 36 + 49 + 9 + 4 + 49 + 4 + 25 = 192

Page 117: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Review

• Find (∑X)² for the following data:

4, 6, 7, 3, 2, 7, 2, 5

– first sum the raw scores and then square that sum.

(4 + 6 + 7 + 3 + 2 + 7 + 2 + 5)² =

(36)² = 1296
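The two review quantities are easy to confuse, so here is a small Python sketch (an illustration added to the slides, with hypothetical variable names) that computes both for the same data.

```python
scores = [4, 6, 7, 3, 2, 7, 2, 5]

# Sum of squared Xs: square each raw score first, then add them up.
sum_of_squared_xs = sum(x ** 2 for x in scores)

# Sum of Xs, squared: add the raw scores first, then square the total.
sum_of_xs_squared = sum(scores) ** 2

print(sum_of_squared_xs)  # 192
print(sum_of_xs_squared)  # 1296
```

The only difference is the order of the two operations, which is why the placement of the parentheses in the notation matters.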

Page 118: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• (∑X)² = 65025

• ∑X² = 9403

• N = 7

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

Page 119: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• (∑X)² = 65025

• ∑X² = 9403

• N = 7

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

$$S_X^2 = \frac{9403 - \frac{65025}{7}}{7}$$

Page 120: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• (∑X)² = 65025

• ∑X² = 9403

• N = 7

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

$$S_X^2 = \frac{9403 - \frac{65025}{7}}{7}$$

$$S_X^2 = \frac{9403 - 9289.29}{7}$$

Page 121: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• (∑X)² = 65025

• ∑X² = 9403

• N = 7

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

$$S_X^2 = \frac{9403 - \frac{65025}{7}}{7}$$

$$S_X^2 = \frac{9403 - 9289.29}{7}$$

$$S_X^2 = \frac{113.71}{7}$$

Page 122: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Variance

• (∑X)² = 65025

• ∑X² = 9403

• N = 7

$$S_X^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

$$S_X^2 = \frac{9403 - \frac{65025}{7}}{7}$$

$$S_X^2 = \frac{9403 - 9289.29}{7}$$

$$S_X^2 = \frac{113.71}{7}$$

$$S_X^2 = 16.24$$

Page 123: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

How to find the Standard Deviation

• The computing formula for the sample standard deviation is:

$$S_X = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$$

Page 124: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Computing the Standard Deviation - Example

• Estimates of professor’s age:

• Data: 36, 34, 45, 31, 35, 38, 36

• N = 7

• ∑X² = 9403

• (∑X)² = 65025

• $S_X^2 = 16.244$

• $S_X = \sqrt{16.244} = 4.03$

$$S_X = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$$
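A quick way to check the hand calculation is sketched below in Python. It is an illustration added to the slides, not the course's own code; it compares the computing formula with the standard library's `statistics.pstdev`, which also divides by N.

```python
import math
import statistics

# Estimates of the professor's age, taken from the slide.
ages = [36, 34, 45, 31, 35, 38, 36]
n = len(ages)

# Computing formula for the variance, then the square root for the standard deviation.
variance = (sum(x * x for x in ages) - sum(ages) ** 2 / n) / n
std_dev = math.sqrt(variance)

print(round(std_dev, 2))  # 4.03

# statistics.pstdev divides by N, matching the sample formula used in this section.
print(math.isclose(std_dev, statistics.pstdev(ages)))
```

Note that `statistics.stdev` divides by N - 1 instead, so it would give a slightly larger value than the formula on this slide.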

Page 125: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Applying Variability to Research

Page 126: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• An experiment was conducted to test the efficacy of a new diet drug. The drug was administered to rats in various dosages, and each rat’s food consumption was measured. The data are as follows:

Rat   0mg   2mg   4mg   6mg
A     12    10    13    7
B     13    10    12    5
C     9     10    11    7
D     10    10    7     5
E     18    10    7     4

X̄ (0mg) = 12.4

X̄ (2mg) = 10

X̄ (4mg) = 10

X̄ (6mg) = 5.6

Page 127: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Example

• An experiment was conducted to test the efficacy of a new diet drug. The drug was administered to rats in various dosages, and each rat’s food consumption was measured. The data are as follows:

Rat   0mg   2mg   4mg   6mg
A     12    10    13    7
B     13    10    12    5
C     9     10    11    7
D     10    10    7     5
E     18    10    7     4

X̄ (0mg) = 12.4, S_X = 3.13

X̄ (2mg) = 10, S_X = 0

X̄ (4mg) = 10, S_X = 2.52

X̄ (6mg) = 5.6, S_X = 1.2
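The per-dose means and standard deviations can be recomputed with a short Python sketch. This is an added illustration with hypothetical names (`food_by_dose`, `summary`); it assumes the sample standard deviation divides by N, as in the formulas earlier in this section, which is what `statistics.pstdev` does.

```python
import statistics

# Food consumption for rats A to E at each dose, taken from the table.
food_by_dose = {
    "0mg": [12, 13, 9, 10, 18],
    "2mg": [10, 10, 10, 10, 10],
    "4mg": [13, 12, 11, 7, 7],
    "6mg": [7, 5, 7, 5, 4],
}

# For each dose, store (mean, standard deviation dividing by N).
summary = {
    dose: (statistics.mean(values), statistics.pstdev(values))
    for dose, values in food_by_dose.items()
}

for dose, (mean, sd) in summary.items():
    print(f"{dose}: mean = {mean:.1f}, SD = {sd:.3f}")
```

The 2mg and 4mg groups share a mean of 10 but differ in spread, which is why the mean alone does not describe the data. The slide values 3.13 and 2.52 appear to be truncated rather than rounded, so the printed values may differ in the last digit.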

Page 128: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Rat data - population distributions

[Figure: population distributions for the rat data; x-axis from 0 to 14. Curves labelled 6mg (Mean = 5.6), 2mg (Mean = 10) and 0mg (Mean = 12.4)]

Page 129: Psych 230 Psychological Measurement and Statistics Pedro Wolf September 9, 2009

Putting it all together….

• Descriptive statistics - the three steps of analyzing any set of data are:

1. Consider the scale of measurement and the shape of the distribution

2. Describe where most participants scored, usually by computing the mean

3. Describe the variability of the scores, usually by computing the sample standard deviation