1 Multiple-choice example

1 Multiple-choice example. 2 Solution The mean and median would very RARELY have EXACTLY the same value. The median is indeed the middle score (the 50

Multiple-choice example

Solution

• The mean and median would very RARELY have EXACTLY the same value.

• The median is indeed the middle score (the 50th percentile).

• The median is LESS vulnerable than the mean to extreme scores.

• We have our answer.
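
A minimal sketch in Python (not part of the original slides) of why the median is less affected than the mean by a single extreme score. The scores are invented purely for illustration.

```python
from statistics import mean, median

# Hypothetical scores, made up for this illustration only.
scores = [7, 8, 9, 10, 12]
with_extreme = scores + [60]   # the same scores plus one extreme value

# The mean is pulled strongly towards the extreme score;
# the median moves only slightly.
print("without extreme score:", mean(scores), median(scores))
print("with extreme score:   ", mean(with_extreme), median(with_extreme))
```

Note also that, even in this tiny data set, the mean and median do not have exactly the same value.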

Example

Solution

• A grouping variable is a set of CODE NUMBERS, not real scores.

• Yes.

• It’s absolutely essential.

• We have our answer.

Last week

1. How to enter data into SPSS.

2. How to obtain various displays and graphs of the Caffeine data.

The SPSS Data Editor

Works in two display modes:

1. VARIABLE VIEW, in which you enter information about the variables in your data set;

2. DATA VIEW, in which you enter your data.

Entering data into SPSS

• When entering data into SPSS, begin in Variable View.

• Good work in Variable View confers benefits both at the stage of data entry and when you are viewing the output.

Variable View completed

• Variable View controls the appearance of Data View.

• Variable View also controls the appearance of the output. The graphs I produced with SPSS all had clear labels, for example.

• Variable View creates your WORKING ENVIRONMENT.

Actually, value labels

Between subjects experiments

• In the caffeine experiment, each participant is tested under only ONE of the conditions making up the independent variable.

• In this experiment, the conditions making up the independent variable are said to vary BETWEEN SUBJECTS, and the experiment is said to be of BETWEEN SUBJECTS design.

Within subjects experiments

• In WITHIN SUBJECTS experiments, each participant is tested under ALL conditions.

• The Stroop experiment, which I described in the first lecture, was of within subjects design: each participant performed under all three conditions.

Grouping variables

• When entering data from a between subjects experiment, you need a GROUPING VARIABLE.

• A GROUPING VARIABLE is a column of code numbers or VALUES, each number representing the condition under which a score was achieved.

• We can let 1 = ‘Placebo’ and 2 = ‘Caffeine’, where 1 and 2 are VALUES and ‘Placebo’ and ‘Caffeine’ are VALUE LABELS.
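
The idea of values and value labels can be sketched outside SPSS. The Python below is only an illustration: the variable names are my own and the scores are invented, but the coding scheme (1 = ‘Placebo’, 2 = ‘Caffeine’) is the one on the slide.

```python
# Value labels for the grouping variable, as defined on the slide.
value_labels = {1: "Placebo", 2: "Caffeine"}

# A hypothetical data set laid out as in Data View: one row per participant,
# with a column of code numbers (group) and a single column of scores.
group = [1, 1, 1, 2, 2, 2]
score = [8, 10, 9, 12, 11, 14]   # made-up scores, not the real caffeine data

# Replace each code number by its label, much as SPSS does when
# value labels are displayed instead of values.
labelled = [value_labels[g] for g in group]

for label, s in zip(labelled, score):
    print(label, s)
```

The code numbers only identify the condition; they are not scores and should not be averaged.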

Part of Data View

• In Data View, all the scores from the experiment appear as a SINGLE COLUMN of numbers.

• The Group variable appears as a single column of CODE NUMBERS.

Seeing the value labels

• To see the value labels in Data View (instead of the values), click Value Labels in the View menu.

• Seeing the value labels helps you avoid transcription errors when inputting data.

Graphs and displays

We looked at two kinds of graphs:

1. those that are pictures of DISTRIBUTIONS;

2. those that SUMMARISE DISTRIBUTIONS by picturing the statistics.

Histograms and Stem-&-leaf displays

• Histograms and stem-and-leaf displays are pictures of DISTRIBUTIONS.

A histogram

Stem-and-leaf display

Bar charts and box plots

• BAR CHARTS and BOX PLOTS are pictures of the STATISTICS of distributions.

Bar chart (with error bars)

[Figure: bar chart in which the bars show the means and the error bars show the standard deviations.]

Types of bar chart

SPSS offers three types of bar chart:

1. Simple

2. Clustered

3. Stacked

I do not like stacked bar charts, so I shall consider only the Simple and Clustered types.

Suppose there are two grouping variables in your data set

• We have run another caffeine experiment.

• But this time, we want to include the participant’s gender in the data set.

• We add a second grouping variable, Gender, where 1 = Male and 2 = Female.

• As well as comparing the performance of the Caffeine and Placebo groups, we also want to compare the mean scores of the female and male participants.

Two grouping variables

Clustered bar chart

Explanation

• Here the ‘category variable’ is Experimental Condition.

• The ‘Cluster variable’ is Gender.

• It looks as if the males are a little better under the Placebo condition.

• It looks as if the females are a little better under the Caffeine Condition.
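
A clustered bar chart pictures one mean for every combination of the category variable and the cluster variable. The Python sketch below computes those cell means. It is an illustration only: the rows are invented (chosen merely to mimic the pattern described above), and the names are my own rather than anything produced by SPSS.

```python
from collections import defaultdict
from statistics import mean

condition_labels = {1: "Placebo", 2: "Caffeine"}
gender_labels = {1: "Male", 2: "Female"}

# Hypothetical rows: (condition code, gender code, score). Made-up numbers.
rows = [
    (1, 1, 10), (1, 1, 12), (1, 2, 8), (1, 2, 9),
    (2, 1, 9), (2, 1, 11), (2, 2, 13), (2, 2, 12),
]

# Collect the scores belonging to each condition-by-gender cell.
cells = defaultdict(list)
for condition, gender, score in rows:
    cells[(condition, gender)].append(score)

# One mean per cell: these are the heights of the bars in a clustered chart.
cell_means = {key: mean(values) for key, values in cells.items()}

for (condition, gender), m in sorted(cell_means.items()):
    print(condition_labels[condition], gender_labels[gender], m)
```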

Boxplots

[Figure: boxplots, with labels marking the medians, the upper and lower quartiles, the whiskers, an outlier and an extreme score.]
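
The parts of a boxplot can be computed by hand. The sketch below rests on two assumptions that are not stated on the slide: that an outlier lies more than 1.5 box-lengths beyond the box and an extreme score more than 3 box-lengths beyond it (my understanding of the SPSS convention), and that Python's default quartile method is close enough to the one SPSS uses. The scores are invented.

```python
from statistics import quantiles

# Hypothetical scores, made up for illustration.
scores = [4, 7, 8, 9, 9, 10, 11, 12, 12, 19, 30]

# Lower quartile, median and upper quartile.
q1, med, q3 = quantiles(scores, n=4)
box_length = q3 - q1   # the interquartile range

def classify(x):
    """Label a score by its distance beyond the nearer edge of the box."""
    distance = max(q1 - x, x - q3, 0)
    if distance > 3 * box_length:
        return "extreme score"
    if distance > 1.5 * box_length:
        return "outlier"
    return "within whiskers"

for x in scores:
    print(x, classify(x))
```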

Lecture 6

Descriptive statistics with SPSS

The three most important properties of a distribution

1. Its typical value, AVERAGE or CENTRAL TENDENCY, measured by the MEAN, the MEDIAN and the MODE.

2. The SPREAD or DISPERSION of scores around the average value, measured by the STANDARD DEVIATION and RANGE STATISTICS such as the SIMPLE RANGE, the INTERQUARTILE and the SEMI-INTERQUARTILE RANGES.

3. The SHAPE of the distribution.
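
The measures of central tendency and spread listed above can be sketched with Python's standard library. This is an illustration with invented scores, not SPSS output, and the quartile method is Python's default, which may differ slightly from the one SPSS uses.

```python
from statistics import mean, median, mode, stdev, quantiles

# Hypothetical scores, made up for illustration.
scores = [6, 8, 9, 9, 10, 11, 12, 13, 14, 15, 21]

# Central tendency
m = mean(scores)
med = median(scores)
mo = mode(scores)

# Spread
sd = stdev(scores)                       # sample standard deviation
simple_range = max(scores) - min(scores)
q1, _, q3 = quantiles(scores, n=4)       # lower and upper quartiles
iqr = q3 - q1                            # interquartile range
semi_iqr = iqr / 2                       # semi-interquartile range

print("mean, median, mode:", round(m, 2), med, mo)
print("SD, range, IQR, semi-IQR:", round(sd, 2), simple_range, iqr, semi_iqr)
```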

Results of the caffeine experiment

Obtaining descriptive statistics

• Descriptive statistics such as the mean and standard deviation can be obtained through a variety of SPSS procedures.

• The MEANS procedure gives the means and standard deviations of data in different groups.

Finding the Means procedure

The Means procedure

• Our data set has a grouping variable. The Means option in Compare Means is the best one to start with.

• Note that the Descriptive Statistics procedure is only suitable for data sets that have no grouping variables.

• Descriptives would only calculate the statistics of ALL the scores.

The Descriptive Statistics procedure does not allow comparison between groups and is inappropriate for grouped data.
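
The difference between statistics for each group and statistics for ALL the scores can be sketched as follows. This is plain Python rather than SPSS, the scores are invented, and the names are my own.

```python
from statistics import mean, stdev

value_labels = {1: "Placebo", 2: "Caffeine"}

# Hypothetical data: a grouping variable and a column of made-up scores.
group = [1, 1, 1, 1, 2, 2, 2, 2]
score = [8, 10, 9, 11, 12, 11, 14, 13]

# Split the single column of scores according to the grouping variable.
by_group = {}
for g, s in zip(group, score):
    by_group.setdefault(g, []).append(s)

# Statistics for each group separately (what a groupwise summary reports).
group_means = {value_labels[g]: mean(v) for g, v in by_group.items()}
group_sds = {value_labels[g]: stdev(v) for g, v in by_group.items()}

# Statistics of ALL the scores, ignoring the grouping variable.
overall_mean = mean(score)

print(group_means, group_sds)
print("all scores:", overall_mean)
```

The overall mean hides the difference between the two groups, which is why a groupwise summary is the better starting point for grouped data.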

The Means dialog box

The variables in your data set are listed here initially.

Select an item by clicking on it: it will appear in inverse video.

The central arrows will come live and you can use one of them to transfer the item to the appropriate box.

Click here to order additional statistics.

The Options dialog box

Click the Continue button to return to the Means dialog.

To add extra items, click the first item (to select it); then, holding the Control key down, click on the other items you want to include.

Click the black arrow in the central pillar to transfer the statistics to the right hand panel.

The SPSS Output Viewer

The output

• There are two tables.

• The Case Processing Summary warns you if SPSS has failed to include any data in the analysis.

• The Report is a table containing the statistics you asked for.

The Case Summary

• Clearly, SPSS found no problems with these data.

• All the data have been included in the analysis.

The Report

• The Placebo and Caffeine means are 9.25 and 11.90, respectively. The corresponding medians are 9.00 and 12.50.

• As we saw before, the simple range of values in either distribution is 15 hits.

Editing SPSS output

• If you don’t like a table in the Output, you can alter its appearance dramatically.

• You can remove information you do not need.

• You can change the shape of the table.

• The first move is to double-click on the image in the Viewer.

Changing the table

• Double-click on the item concerned, to produce a shaded border.

The Output Editor

Click on Pivot to access the Pivot menu.

The Pivot menu

A more compact table

Leaving the Editor

• Click the File drop-down menu (in the Editor) and choose Close.

• Click anywhere in the Viewer outside the rectangle, which will then disappear.

Some revision

Percentiles

• A PERCENTILE is the VALUE or SCORE below which a specified percentage or proportion of the distribution lies.

• The 30th percentile is the value below which 30% of the distribution lies.

• The 70th percentile is the value below which 70% of scores lie.
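
The definition can be checked directly by counting. In the sketch below, the function name and the data (the whole numbers 1 to 100) are my own, chosen only to make the proportions easy to see.

```python
def proportion_below(scores, value):
    """Proportion of the scores that lie below the given value."""
    return sum(s < value for s in scores) / len(scores)

# Hypothetical distribution: the whole numbers 1 to 100.
scores = list(range(1, 101))

# 30% of these scores lie below 31, and 70% lie below 71.
print(proportion_below(scores, 31))
print(proportion_below(scores, 71))
```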

The 30th and 70th percentiles

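[Figure: distributions marking the 30th percentile, with a proportion of 0.30 below it and 0.70 above, and the 70th percentile, with 0.70 below it and 0.30 above.]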

The median is the 50th percentile

[Figure: distribution with the 50th percentile (median) marked; a proportion of 0.50 lies on each side.]

The 2.5th and 97.5th percentiles

• The distribution of IQ has a mean of 100 and an SD of 15.

• 130 is 2 SDs above the mean.

• Below that value lies approximately 0.95 + 0.025 = 0.975 or 97.5% of the distribution.

• So 130 is approximately the 97.5th percentile.

• Similarly, the value 70 lies 2 SDs below the mean. Only about .025 = 2.5% of values lie below that. An IQ of 70 is approximately the 2.5th percentile.

[Figure: normal curve with 0.95 (95%) of the area lying between mean − 1.96×SD and mean + 1.96×SD, and 2½% (.025) in each tail.]
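
These figures can be checked against the normal distribution itself. The sketch below uses Python's standard library rather than SPSS and assumes, as the slide does, that IQ is normally distributed with a mean of 100 and an SD of 15. It shows why the figure uses 1.96 SDs while the text rounds this to 2 SDs.

```python
from statistics import NormalDist

# Assumed population: normal, mean 100, SD 15 (as stated on the slide).
iq = NormalDist(mu=100, sigma=15)

# Proportion of the distribution lying below 130 and below 70
# (exactly 2 SDs above and below the mean).
below_130 = iq.cdf(130)
below_70 = iq.cdf(70)

# The exact 97.5th and 2.5th percentiles (1.96 SDs from the mean).
p975 = iq.inv_cdf(0.975)
p025 = iq.inv_cdf(0.025)

print(round(below_130, 4), round(below_70, 4))
print(round(p975, 1), round(p025, 1))
```

The exact percentiles fall slightly inside 70 and 130, so treating those two values as the 2.5th and 97.5th percentiles is a convenient approximation.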

Populations and samples

• I measure the IQs of 20 people.

• The POPULATION is the reference set containing the IQs of ALL people.

• The 20 IQs I have are merely a subset or SAMPLE from the population.

• Suppose I calculate their mean and SD.

• I might find that M = 108.0 and SD = 16.1.

Statistics versus parameters

• STATISTICS are characteristics of SAMPLES; PARAMETERS are characteristics of POPULATIONS.

• A normal population has two parameters: 1. the mean; 2. the standard deviation.

• The IQ population has a mean of 100 and an SD of 15.

The meaning of ‘error’

• The population mean and standard deviation are 100 and 15, respectively, whereas the sample mean and standard deviation are 108.0 and 16.1.

• We usually study samples in order to make inferences about the populations from which they have been drawn.

• But if the statistics of samples are taken as ESTIMATES of parameters, they are likely to be in error, to at least some extent.

Sampling variability

• Population characteristics are fixed.

• But the values of the corresponding statistics wobble around the values of the parameters.

• This ‘wobble’ is known as SAMPLING VARIABILITY or SAMPLING ERROR.

Sampling variability

• Suppose I take 25 people’s IQs and calculate the mean. It might be 95.1.

• I do this 4000 times, calculating the value of the mean each time.

• I command SPSS to draw a histogram of the distribution of the MEANS of the 4000 samples.
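
The sampling exercise can be imitated by simulation. The sketch below is not the procedure used for the slides: it draws pseudo-random IQs in Python, assuming a normal population with mean 100 and SD 15, and the seed is arbitrary.

```python
import random
from statistics import mean, stdev

random.seed(1)   # arbitrary seed, so that the run is repeatable

def sample_mean(n=25, mu=100, sigma=15):
    """Mean of one random sample of n IQs from the assumed population."""
    return mean([random.gauss(mu, sigma) for _ in range(n)])

# Take 4000 samples of 25 IQs each, keeping the mean of each sample.
means = [sample_mean() for _ in range(4000)]

# The sample means wobble around the population mean of 100,
# but vary much less than individual IQs do.
print("lowest and highest sample mean:", round(min(means), 1), round(max(means), 1))
print("mean and SD of the sample means:", round(mean(means), 2), round(stdev(means), 2))
```

A histogram of the values in `means` would picture the distribution of the sample mean.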

Distribution of the sample mean

Distribution of the sample mean

• The values of the mean range from 89.9 to 110.0.

• SAMPLING implies SAMPLING VARIABILITY or SAMPLING ERROR.

The laws of large numbers

• A set of theorems in probability theory.

• Their import is that if you make the sample size large enough, the mean of the sample will be as close as you wish to the value of the population mean.

• So if we take the mean and standard deviation of 4000 IQs, their values should be close to 100 and 15, respectively.
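
A simulation gives a feel for this. The sketch below assumes a normal IQ population with mean 100 and SD 15 and uses pseudo-random numbers; the sample sizes and the seed are arbitrary choices of mine.

```python
import random
from statistics import mean, stdev

random.seed(2)   # arbitrary seed, so that the run is repeatable

# Draw samples of increasing size and record the mean and SD of each.
results = {}
for n in (25, 400, 4000):
    sample = [random.gauss(100, 15) for _ in range(n)]
    results[n] = (mean(sample), stdev(sample))

# With larger samples, the statistics tend to lie closer
# to the parameters (100 and 15).
for n, (m, sd) in results.items():
    print(n, round(m, 2), round(sd, 2))
```

Any single run can buck the trend for the smaller samples; the theorems concern what happens as the sample size grows.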

Distribution of 4000 IQs

The 2.5th and 97.5th percentiles

• Since this is a very large sample from a normal population, we should find that the 97.5th percentile is close to 130 (2 SDs above the mean) and the 2.5th percentile is close to 70 (2 SDs below the mean).

• To obtain these percentiles, we can use SPSS’s Frequencies procedure.
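
For comparison, here is a sketch of the same calculation outside SPSS. It simulates 4000 IQs in Python (assuming a normal population with mean 100 and SD 15, and an arbitrary seed) instead of using the lecture's data file, and Python's percentile method may differ slightly from the one SPSS uses.

```python
import random
from statistics import quantiles

random.seed(3)   # arbitrary seed, so that the run is repeatable

# A simulated sample of 4000 IQs from the assumed population.
iqs = [random.gauss(100, 15) for _ in range(4000)]

# Cutting the distribution into 40 equal parts gives cut points at
# 2.5%, 5%, ..., 97.5%; the first and last are the percentiles we want.
cuts = quantiles(iqs, n=40)
p025, p975 = cuts[0], cuts[-1]

print("2.5th percentile:", round(p025, 1))
print("97.5th percentile:", round(p975, 1))
```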

The Frequencies procedure

• It’s in the Descriptive Statistics menu.

• Click on Frequencies… to enter the Frequencies dialog box.

The Frequencies dialog

Click on the Display frequency tables box to remove the arrow. This avoids a huge output of cumulative frequencies.

The output

• As you would expect, the mean is close to 100.

• The mean and median have similar values: both are 99.90 to 2 decimal places.

• The 2.5th and 97.5th percentiles are close to 70 and 130, respectively.

Exercises

• I have made the SPSS data file for the 4000 IQs available on my Website.

• Command SPSS to draw a histogram of the distribution.

• Use the Descriptives procedure to find the mean and standard deviation of the distribution.

• I suggest you try the Frequencies procedure to obtain the 2.5th and 97.5th percentiles.

The caffeine data

• I have also made the caffeine data available on the site.

• Try using the Means procedure to obtain descriptive statistics of the Caffeine and Placebo distributions.

• Try producing boxplots and a bar chart.

• Introduce a Gender grouping variable into the data and obtain a clustered bar chart.