Categorical Data Analysis: Stratified Analyses, Matching, and Agreement Statistics Biostatistics 510 13-15 March 2007 Carla Talarico


Categorical Data Analysis:

Stratified Analyses, Matching, and Agreement Statistics

Biostatistics 510

13-15 March 2007

Carla Talarico


Overview

• Variable stratification

• Cochran-Mantel-Haenszel (CMH) statistics

• Matching and matched data

• Agreement statistics

– McNemar’s Test

– Cohen’s Kappa


Stratification by a Third Variable

• Exposure of interest

• Disease outcome

• Third variable, e.g., confounder

[Diagram: exposure E → disease D?, with third variable C]


Confounding

• The observed effect of exposure on disease may be distorted by a third variable (the “confounder”) that is associated with both the exposure and the disease

• Reflects the fact that epidemiologic research is conducted among humans with unevenly distributed characteristics

• Results from a lack of comparability between the exposed and unexposed groups in the base population


Controlling for Confounding

• Design phase of studies

– Randomization in experimental studies

– Restriction

– Matching

• Analysis phase

– Stratified analysis

– Model fitting


Stratified Analyses: The CMH Option in SAS

• Gives a stratified statistical analysis of the relationship between Exposure (E) and Disease (D), after controlling for a Confounder (C):

proc freq;
  tables C * E * D / cmh;
run;

• Can simultaneously stratify by multiple confounders:

proc freq;
  tables C1 * C2 * E * D / cmh;
run;


Estimates of Common Relative Risk for 2x2 Tables

• Adjusted odds ratio (OR) and relative risk (RR) for stratified 2x2 tables with 95% CL

• Obtain OR and RR estimates for association between Exposure and Disease, adjusted for the Confounder

• For this course, report the Mantel-Haenszel estimate of the common odds ratio, $\mathrm{OR}_{MH}$
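As a rough illustration of what the adjusted estimate does, the Python sketch below computes the crude odds ratio and the Mantel-Haenszel common odds ratio, $\mathrm{OR}_{MH} = \sum_k (a_k d_k / n_k) \,/\, \sum_k (b_k c_k / n_k)$, by hand. The counts in `strata` are hypothetical, chosen so that both stratum-specific ORs equal 2 while the crude OR is inflated; the function names are illustrative and are not SAS output.

```python
# Hypothetical counts, one tuple per stratum of the confounder C:
# (a, b, c, d) = (exposed & diseased, exposed & not diseased,
#                 unexposed & diseased, unexposed & not diseased)
strata = [
    (40, 10, 20, 10),   # stratum-specific OR = (40*10)/(10*20) = 2
    (5, 20, 10, 80),    # stratum-specific OR = (5*80)/(20*10)  = 2
]

def mantel_haenszel_or(tables):
    """Mantel-Haenszel estimate of the common odds ratio across strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def crude_or(tables):
    """Odds ratio from the single 2x2 table that ignores the strata."""
    a, b, c, d = (sum(cell) for cell in zip(*tables))
    return (a * d) / (b * c)

or_mh = mantel_haenszel_or(strata)
or_crude = crude_or(strata)
print(f"crude OR = {or_crude:.2f}, OR_MH = {or_mh:.2f}")
```

In this made-up example the crude OR is 4.5 while the adjusted OR is 2, the kind of gap that suggests confounding by the stratification variable.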


Breslow-Day Test for Homogeneity of the Odds Ratios

• For stratified 2x2 tables

• Null hypothesis is that the ORs are equal across all strata

– Under H0, the test statistic follows a $\chi^2$ distribution with q – 1 df, where q is the number of strata

• Alternative hypothesis is that at least one stratum-specific OR differs from other stratum-specific ORs
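The homogeneity test can be sketched in Python as follows. This is a minimal version of the unadjusted Breslow-Day statistic (no Tarone correction), assuming hypothetical counts and illustrative function names; it is not a reproduction of SAS internals. For each stratum it finds the expected count in the first cell under the common OR by solving a quadratic, then sums the squared deviations divided by their variances.

```python
import math

def mh_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio for tables given as (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def expected_cell(table, psi):
    """Expected count in cell a, given the margins and a common odds ratio psi."""
    a, b, c, d = table
    n1, m1, n = a + b, a + c, a + b + c + d
    if abs(psi - 1.0) < 1e-12:
        return n1 * m1 / n
    # Solve A*(n - n1 - m1 + A) = psi*(n1 - A)*(m1 - A) for A.
    qa = 1.0 - psi
    qb = (n - n1 - m1) + psi * (n1 + m1)
    qc = -psi * n1 * m1
    root_disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    low, high = max(0, n1 + m1 - n), min(n1, m1)
    for root in ((-qb + root_disc) / (2 * qa), (-qb - root_disc) / (2 * qa)):
        if low - 1e-9 <= root <= high + 1e-9:
            return root
    raise ValueError("no admissible root for this table")

def breslow_day(tables):
    """Unadjusted Breslow-Day statistic for homogeneity of the odds ratios."""
    psi = mh_odds_ratio(tables)
    stat = 0.0
    for table in tables:
        a, b, c, d = table
        n1, m1, n = a + b, a + c, a + b + c + d
        e_a = expected_cell(table, psi)
        e_b, e_c, e_d = n1 - e_a, m1 - e_a, n - n1 - m1 + e_a
        var = 1.0 / (1 / e_a + 1 / e_b + 1 / e_c + 1 / e_d)
        stat += (a - e_a) ** 2 / var
    return stat

# Hypothetical strata with clearly different ORs (2 and 8).
tables = [(40, 10, 20, 10), (20, 20, 10, 80)]
bd_stat = breslow_day(tables)
# With q = 2 strata, df = q - 1 = 1, so the chi-square tail area reduces to erfc.
# This shortcut is valid only for 1 df.
p_value = math.erfc(math.sqrt(bd_stat / 2))
print(f"Breslow-Day chi-square = {bd_stat:.3f}, p = {p_value:.4f}")
```

When every stratum has the same OR, the expected counts equal the observed counts and the statistic is essentially zero.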


$\chi^2_{BD}$ (cont'd)

• If H0 is rejected for the $\chi^2_{BD}$ test:

– There is evidence for heterogeneity of ORs across strata; not appropriate to report the adjusted common OR

– Report the stratum-specific ORs when effect modification is present


CMH Statistic 1: Nonzero Correlation

• Tests the null hypothesis of no association vs. the alternative hypothesis that there is a linear association between the row and column variables in at least one stratum

• Both row and column variables have to be ordinal

• Under H0, the test statistic ~ $\chi^2$ with 1 df


CMH Statistic 2: Row Mean Scores Differ

• Tests the null hypothesis of no association vs. the alternative hypothesis that the mean scores of the table rows are unequal for at least one stratum

• Useful only when the column variable is ordinal

• Under H0, the test statistic ~ $\chi^2$ with (r – 1) df, where r is the number of rows


CMH Statistic 3: General Association

• Tests the null hypothesis of no association vs. the alternative hypothesis that there is some kind of association between the row and column variables for at least one stratum

• Does not require the row or column variable to be ordinal

• Under H0, the test statistic ~ $\chi^2$ with (r – 1)(c – 1) df, where r and c are the numbers of rows and columns
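For stratified 2x2 tables, all three CMH statistics have 1 df and take the same value, so one hand calculation covers them. The Python sketch below assumes hypothetical counts and an illustrative function name, and applies no continuity correction. It sums, across strata, the difference between the observed first cell and its expectation under no association, and divides the squared sum by the summed hypergeometric variances.

```python
import math

def cmh_2x2(tables):
    """CMH chi-square (1 df) for a list of 2x2 tables given as (a, b, c, d)."""
    diff = 0.0
    var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        n1, n0 = a + b, c + d          # row totals (exposed, unexposed)
        m1, m0 = a + c, b + d          # column totals (diseased, not diseased)
        diff += a - n1 * m1 / n        # observed minus expected under H0
        var += n1 * n0 * m1 * m0 / (n * n * (n - 1))
    return diff * diff / var

# Hypothetical strata: (exposed & diseased, exposed & not diseased,
#                       unexposed & diseased, unexposed & not diseased)
tables = [(40, 10, 20, 10), (5, 20, 10, 80)]
cmh_stat = cmh_2x2(tables)
# A chi-square with 1 df is a squared standard normal, so erfc gives the tail area.
p_value = math.erfc(math.sqrt(cmh_stat / 2))
print(f"CMH chi-square = {cmh_stat:.3f}, p = {p_value:.4f}")
```

For larger r x c tables the three statistics differ and require the row and column scores, which this sketch does not attempt.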


Matching

• Controls for confounding more efficiently than an unmatched design

• Done in the design phase of a study

• Gain statistical efficiency in effect estimation


Matching (cont'd)

• Select comparison participants into a study such that they are the same (or nearly the same) on certain variable(s)

• Matched design requires a matched analysis

• Once you match on a variable, the effect of that variable cannot be estimated in your data set


Matched Data and the AGREE Option in SAS

• AGREE option computes tests and measures of agreement for square tables (where the number of rows equals the number of columns)

title "McNemar's Test for highchol and hibmi for pill and non-pill";

proc freq data=pairs;
  tables hichol1*hichol2 hibmi1*hibmi2 / agree norow nocol;
run;


AGREE Option in SAS

• AGREE option generates:

– McNemar’s Test (2x2 tables)

– Kappa

– Weighted Kappa (tables larger than 2x2)


McNemar’s Test of Symmetry for Matched Samples

• For 2x2 tables

• Appropriate when you have data from matched pairs of subjects with a dichotomous (yes/no) outcome

• Null hypothesis of marginal homogeneity

– Example: Werner data set of matched pairs, comparing the proportion with high cholesterol among women who take the birth control pill to the proportion among women who do not take the pill

• Under H0, the test statistic follows a $\chi^2$ distribution with 1 df


McNemar’s Test for Matched Proportions

• Werner data set with age-matched pairs

Frequency (Percent)      Pill: High Chol=1   Pill: High Chol=2   Total
No Pill: High Chol=1     21 (22.83)          21 (22.83)          42 (45.65)
No Pill: High Chol=2     23 (25.00)          27 (29.35)          50 (54.35)
Total                    44 (47.83)          48 (52.17)          92 (100.00)

$$\chi^2_M = \frac{(21 - 23)^2}{21 + 23} = 0.0909$$

• There are 92 pairs.

• 45.65% of the No Pill group have high cholesterol.

• 47.83% of the Pill group have high cholesterol.
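The arithmetic on this slide can be checked with a short Python sketch. It uses the two discordant counts from the Werner table (21 and 23); the function name `mcnemar` is illustrative and is not part of SAS. The p-value relies on the fact that a chi-square with 1 df is a squared standard normal, and no continuity correction is applied.

```python
import math

def mcnemar(b, c):
    """McNemar chi-square (1 df) from the two discordant-pair counts."""
    stat = (b - c) ** 2 / (b + c)
    # Upper-tail area of a chi-square with 1 df, via the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Discordant pairs: 21 (No Pill high, Pill not high) and 23 (No Pill not high, Pill high).
stat, p_value = mcnemar(21, 23)
print(f"McNemar chi-square = {stat:.4f}, p = {p_value:.3f}")
```

Only the discordant pairs enter the statistic; the 21 and 27 concordant pairs carry no information about the difference between the two marginal proportions.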


Simple Kappa Coefficient (Cohen’s Kappa)

• Measure of inter-rater agreement, corrected for chance

• Scale from -1 to +1

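– $\kappa = +1$ when there is perfect agreement

– $\kappa = 0$ when the agreement equals that expected by chance

• Magnitude of Kappa reflects the strength of the agreement, beyond chance

$$\kappa = \frac{P_0 - P_e}{1 - P_e}$$

where $P_0$ is the observed proportion of agreement and $P_e$ is the proportion of agreement expected by chance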
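A minimal Python sketch of the kappa formula follows. The function name is illustrative; the 2x2 counts are the Werner matched-pair frequencies (21, 21, 23, 27), reused here only as a convenient square table for the arithmetic. The observed agreement is the proportion on the diagonal, and the chance agreement is computed from the row and column totals.

```python
def cohen_kappa(table):
    """Simple (unweighted) kappa for a square table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed agreement: proportion of observations on the diagonal.
    p_obs = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement: sum over categories of (row share) * (column share).
    p_exp = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

werner = [[21, 21],
          [23, 27]]
kappa = cohen_kappa(werner)
print(f"kappa = {kappa:.4f}")
```

Here the observed agreement (48/92) is only slightly above the chance agreement (4248/8464), so kappa is close to zero.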


Cohen’s Kappa (cont'd)

• SAS gives 95% CI for Kappa

• Kappa Guidelines (Landis and Koch)

Kappa Statistic    Strength of Agreement
<0.00              Poor
0.00 – 0.20        Slight
0.21 – 0.40        Fair
0.41 – 0.60        Moderate
0.61 – 0.80        Substantial
0.81 – 1.00        Almost perfect
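The guideline table can be expressed as a small lookup, sketched below in Python. The function name is hypothetical, and treating each upper bound as inclusive (so 0.20 is "Slight" and 0.205 is "Fair") is an assumption of this sketch, since the table leaves the gaps between bands unspecified.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch strength-of-agreement label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and +1")
    if kappa < 0.0:
        return "Poor"
    # Upper bounds are treated as inclusive (an assumption of this sketch).
    bands = [
        (0.20, "Slight"),
        (0.40, "Fair"),
        (0.60, "Moderate"),
        (0.80, "Substantial"),
        (1.00, "Almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label

for value in (-0.10, 0.04, 0.50, 0.75, 0.95):
    print(value, landis_koch_label(value))
```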


Good Resources for Categorical Data Analysis and SAS

• SAS: Categorical Data Analysis Using The SAS System by Maura E. Stokes, Charles S. Davis, and Gary G. Koch. 2nd Ed, SAS Institute Inc., Cary, NC, 2000.

• See pages 155-156 of Biostat 510 course pack

• Kappa: “The Measurement of Observer Agreement for Categorical Data,” by J. Richard Landis and Gary G. Koch. Biometrics 33(1):159-174, 1977