G89.2229 Lect 13M

• Why might data be missing in psychological studies?

• Missing data patterns

• Overview of statistical approaches

• Example

G89.2229 Multiple Regression Week 13 (Monday)

What leads to missing data?

• Experimental studies
  » equipment failure
  » experimenter error
  » subject noncompliance
  » subject dropout
  » data entry error
  » SOLUTION: collect more data

What leads to missing data?

• Observational (nonexperimental) studies
  » equipment failure
  » observer/coder error
  » subject refusal
  » subject loss to follow-up
  » design/measure changes
  » nested interview questions
  » SOLUTION: collect more data

Missing data patterns

• Terms suggested by Rubin
  » Rubin (1976), Little & Rubin (1987)

• In some cases, the data are MISSING COMPLETELY AT RANDOM (MCAR)
  » Which data point is missing cannot be predicted by any variable, measured or unmeasured.
    • Prob(M|Y) = Prob(M)
  » The missing data pattern is ignorable. Analyzing the available complete cases gives unbiased estimates, although discarding cases costs statistical power.

Missing data patterns

• In other cases, the data are MISSING AT RANDOM (MAR)
  » Which data point is missing is systematically related to subject characteristics, but these are all measured.
    • Conditional on observed variables, missingness is random.
    • Prob(M|Y) = Prob(M|Y_observed)
  » E.g., less educated respondents might not answer a certain question.
  » Missingness can be treated as ignorable.

Missing data patterns

• When data are NOT MISSING AT RANDOM (NMAR)
  » Data are missing because of a process related to the value that is unavailable.
    • Someone was too depressed to come report about depression.
    • An abused woman is not allowed to meet the interviewer.
  » The missing data pattern is not ignorable.
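The three mechanisms can be contrasted with a small simulation. This is a minimal sketch, not part of the lecture: the covariate x, the outcome y, the true slope of 0.7, and the missingness probabilities are all made-up values chosen for illustration. The true mean of y is 0, so any drift in the complete-case results reflects the missingness mechanism.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(mechanism, n=40000, seed=1):
    """Return (complete-case mean of y, complete-case slope of y on x)."""
    rng = random.Random(seed)
    obs_x, obs_y = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)            # always observed (hypothetical covariate)
        y = 0.7 * x + rng.gauss(0, 1)  # outcome that may go missing
        if mechanism == "MCAR":
            p_miss = 0.35                    # same chance for everyone
        elif mechanism == "MAR":
            p_miss = 0.6 if x < 0 else 0.1   # depends only on observed x
        elif mechanism == "NMAR":
            p_miss = 0.6 if y < 0 else 0.1   # depends on the unseen y itself
        else:
            raise ValueError(mechanism)
        if rng.random() >= p_miss:           # y was recorded for this case
            obs_x.append(x)
            obs_y.append(y)
    return statistics.fmean(obs_y), ols_slope(obs_x, obs_y)


for mech in ("MCAR", "MAR", "NMAR"):
    mean_y, slope = simulate(mech)
    print(f"{mech}: complete-case mean of y = {mean_y:.2f}, slope of y on x = {slope:.2f}")
```

In this setup the raw complete-case mean of y drifts away from 0 under both MAR and NMAR, but the regression of y on x stays close to 0.7 under MAR because the analysis conditions on the variable that drives missingness. Under NMAR the slope is distorted as well.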

Statistical Approaches

• Listwise deletion
  » If a person is missing on any analysis variable, that person is dropped from the analysis.

• Pairwise deletion
  » Correlations/covariances are computed using all available pairs of data.

• Imputation of missing data values

• Model-based use of complete data
  » E-M (expectation-maximization) approach
  » Illustration in Excel
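The difference between listwise and pairwise deletion is easiest to see on a tiny data set. The sketch below uses hypothetical variables x1, x2 and y with invented values, where None marks a missing entry. It is only an illustration of the two rules, not the analysis used in the course.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data; None marks a missing value.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.0, None, 3.5, 5.0, None, 7.0],
    "y":  [1.5, 2.5, None, 4.5, 5.0, 6.5],
}


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)


def listwise_rows(data):
    """Row indices with no missing value on ANY variable."""
    names = list(data)
    n = len(data[names[0]])
    return [i for i in range(n) if all(data[v][i] is not None for v in names)]


def pairwise_rows(data, a, b):
    """Row indices where both variables of one pair are present."""
    return [i for i in range(len(data[a]))
            if data[a][i] is not None and data[b][i] is not None]


complete = listwise_rows(data)
for a, b in combinations(data, 2):
    pw = pairwise_rows(data, a, b)
    r_list = pearson([data[a][i] for i in complete], [data[b][i] for i in complete])
    r_pair = pearson([data[a][i] for i in pw], [data[b][i] for i in pw])
    print(f"{a},{b}: listwise n={len(complete)} r={r_list:.3f}; "
          f"pairwise n={len(pw)} r={r_pair:.3f}")
```

Listwise deletion uses the same (smaller) set of rows for every correlation, while pairwise deletion uses a different subset for each pair, so the entries of a pairwise correlation matrix rest on different sample sizes.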

Classic Cohen & Cohen advice

• Create a dummy code for who has missing data (M)
  » Find out what variables are related to missingness.

• Insert the mean or some other value for missing values in the IV, and fit a multiple regression with the full data plus variable M.
  » The procedure has been criticized for underestimating variance.

• Current text reflects a compromise
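The dummy-code-plus-mean-substitution recipe, and the reason it is criticized, can be sketched in a few lines. The IV values below are invented for illustration; in practice the filled-in IV and the indicator M would then both be entered as predictors in the regression.

```python
import statistics

# Hypothetical IV values; None marks a missing observation.
x = [4.0, None, 7.0, 5.0, None, 9.0, 6.0, None]

observed = [v for v in x if v is not None]
x_mean = statistics.fmean(observed)

# M: dummy code, 1 = value was missing, 0 = value was observed.
m = [1 if v is None else 0 for v in x]

# Mean substitution: plug the observed mean into every missing slot.
x_filled = [x_mean if v is None else v for v in x]

var_observed = statistics.variance(observed)
var_filled = statistics.variance(x_filled)

print("M =", m)
print(f"variance of observed values:  {var_observed:.3f}")
print(f"variance after mean fill-in:  {var_filled:.3f}")
```

Every imputed case sits exactly at the mean, so it adds nothing to the sum of squares while still increasing n. The variance of the filled-in variable is therefore smaller than the variance of the observed values.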

Case Study: Depression Following Miscarriage

» Neugebauer et al (1992) American Journal of Public Health, 82, 1332-1339.

» Neugebauer et al (1992) American Journal of Obstetrics and Gynecology, 166, 104-109.

• Neugebauer and his colleagues recruited women who sought treatment for miscarriages and measured their levels of depression at 2 weeks, 6 weeks and 26 weeks post miscarriage.

• The study built on a case-control study of the causes of miscarriage that successfully recruited nearly 80% of eligible women. Neugebauer and his colleagues enrolled approximately 85% of the women in the initial study.

• 382 women were initially enrolled.

Neugebauer Missing Data

• Some women were not available in the first two weeks following miscarriage, and others were not available in the subsequent two-week windows for follow-up measurement.

• Only 166 women were measured at all three time points.

• Missing observations were not related to:
  » SES, ethnicity, parity (# of pregnancies), # of previous miscarriages

• Those with missing observations were:
  » somewhat younger, with fewer living children

Pattern of missing data

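2 Wk   6 Wk   26 Wk     N
  0      0      0      166
  0      0      1       30
  0      1      0       12
  0      1      1       24
  1      0      0       88
  1      0      1       26
  1      1      0       36

(1 = missing at that time point, 0 = observed; the 0 0 0 row is the 166 women measured at all three time points.)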
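The pattern counts can be tallied to check them against the enrollment figure and to see how much is missing at each wave. This sketch assumes that 1 codes a missing measurement and 0 an observed one, which is inferred from the 166 women in the 0 0 0 row rather than stated on the slide.

```python
# (2 Wk, 6 Wk, 26 Wk) pattern -> number of women; 1 is assumed to mean "missing".
patterns = {
    (0, 0, 0): 166,
    (0, 0, 1): 30,
    (0, 1, 0): 12,
    (0, 1, 1): 24,
    (1, 0, 0): 88,
    (1, 0, 1): 26,
    (1, 1, 0): 36,
}

waves = ("2 Wk", "6 Wk", "26 Wk")
total = sum(patterns.values())

# Number of women missing at each wave, summed over all patterns.
missing_by_wave = {
    wave: sum(n for pattern, n in patterns.items() if pattern[i] == 1)
    for i, wave in enumerate(waves)
}

complete_share = patterns[(0, 0, 0)] / total

print("total enrolled:", total)
for wave in waves:
    print(f"{wave}: {missing_by_wave[wave]} missing, "
          f"{total - missing_by_wave[wave]} observed")
print(f"share with complete data: {complete_share:.1%}")
```

The counts sum to the 382 women initially enrolled, and listwise deletion would keep fewer than half of them.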

Means for different groups

[Figure: means plotted on a vertical axis from 10.0 to 28.0 against a horizontal axis from 0 to 30, with one series for each missing data pattern: 0,0,0; 0,0,1; 0,1,0; 0,1,1; 1,0,0; 1,0,1; 1,1,0]