Biostatistics course Part 13 Effect measures in 2 x 2 tables Dr. Sc. Nicolas Padilla Raygoza Department of Nursing and Obstetrics Division Health Sciences and Engineering University of Guanajuato Campus Celaya-Salvatierra


Biostatistics course

Part 13

Effect measures in 2 x 2 tables

Dr. Sc. Nicolas Padilla Raygoza

Department of Nursing and Obstetrics

Division of Health Sciences and Engineering

University of Guanajuato

Campus Celaya-Salvatierra


Biosketch

Medical Doctor, Autonomous University of Guadalajara.

Pediatrician, certified by the Mexican Council of Certification in Pediatrics.

Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.

Master of Science in Epidemiology, Atlantic International University.

Doctor of Science in Epidemiology, Atlantic International University.

Associate Professor B, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, University of Guanajuato, Campus Celaya-Salvatierra, Mexico.


Competencies

The reader will obtain the Risk Ratio or Odds Ratio from a 2 x 2 table.

The reader will calculate the 95% confidence interval for the RR or OR.

The reader will identify potential confounders and/or interactions.

The reader will apply the Mantel-Haenszel method for the RR, the OR and the Chi-squared test.


Introduction

In part 12 of the course, we tested the association between two categorical variables.

Now, we review the methods used to measure the association.

We will work with binary variables, so we will use 2 x 2 tables.


Example

A nurse in a poor area of Mexico was informed that many local children attending the nursery had respiratory infections.

She designed a cohort study to investigate the problem.

Over the following years, 1000 children were followed.

The main research question was: Is attending nursery associated with respiratory infection?


Example

Attending nursery    Respiratory infection: Yes    Respiratory infection: No    Total
                     n       %                     n       %
Yes                  37      33.9                  72      66.1                 109
No                   43      4.8                   848     95.2                 891
Total                80      8.0                   920     92.0                 1000


Risk Ratio (RR)

In health research, the term "risk" is used instead of proportion. For example:

The risk of infection among children attending day care was 33.9%.

Thus, the risk ratio is the ratio of two proportions.

The risk of respiratory infection for those attending the nursery is: 37 / (37 + 72) = 37/109 = 0.339

The risk of respiratory infection in children not attending day care is: 43 / (43 + 848) = 43/891 = 0.048

The risk ratio (RR) is the ratio of these two risks.

Risk ratio = 0.339 / 0.048 = 7.06


Risk Ratio (RR)

In general, the risk ratio can be obtained with the following formula, where a, b, c and d are the frequencies in the 2 x 2 table.

Exposure    Outcome: Yes    Outcome: No    Total
Yes         a               b              a + b
No          c               d              c + d
Total       a + c           b + d          N

$$RR = \frac{a / (a + b)}{c / (c + d)}$$
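As a minimal sketch of the general formula, the following Python function computes the risk ratio from the four cell counts. The function name `risk_ratio` is illustrative and not part of the course material. With unrounded risks, the nursery data give about 7.03; the 7.06 reported on the slide comes from dividing the rounded risks 0.339 and 0.048.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2 x 2 table.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Nursery data: 37 of 109 attendees and 43 of 891 non-attendees were infected.
rr = risk_ratio(37, 72, 43, 848)
print(round(rr, 2))
```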


Odds Ratio (OR)

The Odds Ratio (OR) is the ratio of the odds of the outcome among the exposed to the odds of the outcome among the non-exposed.

The odds of infection among children attending the nursery are: 37 / 72 = 0.514

The odds of infection among children not attending day care are: 43 / 848 = 0.051

The Odds Ratio is the ratio of these two odds: OR = 0.514 / 0.051 = 10.08

In general, the Odds Ratio is found with the following formula:

$$OR = \frac{a / b}{c / d} = \frac{ad}{bc}$$
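The same calculation can be sketched in Python. The function name `odds_ratio` is an illustrative choice. Using unrounded odds, the nursery data give an OR of about 10.13; the slide's 10.08 results from rounding each odds to three decimals before dividing.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table: (a / b) / (c / d) = ad / bc."""
    odds_exposed = a / b
    odds_unexposed = c / d
    return odds_exposed / odds_unexposed


# Nursery data: exposed row (37, 72), unexposed row (43, 848).
or_value = odds_ratio(37, 72, 43, 848)
print(round(or_value, 2))
```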


Confidence intervals

In the analysis of the data on children attending day care or not, we can use either the RR or the OR to measure the effect of attendance at the nursery.

Each value is only an estimate, so it should be reported with a confidence interval.

An approximate 95% confidence interval for the RR is found using an error factor (EF):

Minimum value: RR / EF

Maximum value: RR x EF

$$EF = \exp\left(1.96\sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}}\right)$$


Confidence intervals

The CI for the data on children who attend day care or not is:

$$EF = \exp\left(1.96\sqrt{\frac{1}{37} - \frac{1}{109} + \frac{1}{43} - \frac{1}{891}}\right) = 1.48$$

RR = 7.06

Minimum value: 7.06 / 1.48 = 4.77

Maximum value: 7.06 x 1.48 = 10.45

95% CI = 4.77 to 10.45
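The interval above can be reproduced with a short Python sketch. The function name `rr_confidence_interval` and its return layout are assumptions made for illustration. Because the code keeps full precision for the RR (about 7.03 rather than 7.06), its limits differ slightly from the slide's 4.77 to 10.45.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate CI for the risk ratio using the error factor (EF).

    Returns (rr, ef, lower, upper), where lower = rr / ef and upper = rr * ef.
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR), as in the EF formula.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ef = math.exp(z * se_log_rr)
    return rr, ef, rr / ef, rr * ef


rr, ef, lower, upper = rr_confidence_interval(37, 72, 43, 848)
print(f"RR = {rr:.2f}, EF = {ef:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```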


Confidence intervals

An approximate 95% confidence interval for the OR is found using the following formula:

Minimum value: OR / EF

Maximum value: OR x EF

$$EF = \exp\left(1.96\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}\right)$$


Confidence intervals

The CI for the data on children who attend day care or not is:

$$EF = \exp\left(1.96\sqrt{\frac{1}{37} + \frac{1}{72} + \frac{1}{43} + \frac{1}{848}}\right) = 1.65$$

OR = 10.08

Minimum value: 10.08 / 1.65 = 6.11

Maximum value: 10.08 x 1.65 = 16.63

95% CI = 6.11 to 16.63
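A matching Python sketch for the odds ratio follows. The name `or_confidence_interval` is illustrative. The code uses the unrounded OR (about 10.13), so the limits it prints are close to, but not identical with, the slide's 6.11 to 16.63.

```python
import math


def or_confidence_interval(a, b, c, d, z=1.96):
    """Approximate CI for the odds ratio using the error factor (EF).

    Returns (odds_ratio, ef, lower, upper).
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): square root of the sum of reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ef = math.exp(z * se_log_or)
    return odds_ratio, ef, odds_ratio / ef, odds_ratio * ef


or_value, ef, lower, upper = or_confidence_interval(37, 72, 43, 848)
print(f"OR = {or_value:.2f}, EF = {ef:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```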


Which measure is best?

Risk Ratios are calculated for cross-sectional and cohort studies.

The formula for the 95% confidence interval for the RR requires larger sample sizes than the one for the OR.

ORs are calculated for case-control and cross-sectional studies.

In case-control studies it is not possible to calculate risks, and therefore the RR cannot be calculated.

There is an advantage in using the OR: it is a consistent measure of effect, unlike the RR.


Example (Cont…)

The study of Mexican children showed a strong association between exposure (attending nursery) and outcome (respiratory infection).

However, such an association may be confounded by other factor(s).

For example, although children who attend day care seem to have a 7 times higher risk of respiratory infection, the cause of the infection could also be something else that is associated with going to day care.

In other words, attending the nursery may be a marker of another exposure that causes respiratory infection.

If this is true, we say that the association between respiratory infection and attendance at the nursery is confounded.


How to identify a potential confounder?

To evaluate a potential confounder, we should consider three aspects:

The exposure

The outcome

The confounder


Example

The nurse is interested in the association between day care attendance and presence of respiratory infection, but is aware that children might be exposed to other factors that cause respiratory infection.

For example, overcrowding at home is a risk factor for respiratory infection.

It is therefore a potential confounder of the association between attendance at day care and respiratory infections.


Confounders

For a variable to be a potential confounder, it must meet three conditions. It must:

be an independent risk factor for the outcome of interest

be associated with the exposure of interest

not be on the causal pathway between exposure and outcome.


Confounders

How do we check these conditions in the study of Mexican children?

Condition 1 for confounding: risk factor for the outcome of interest.

Is there an association between overcrowding and respiratory infection (RI)?

Overcrowding at home    RI: Yes    RI: No    Risk of RI
Yes                     54         55        54/109 = 0.5
No                      21         870       21/891 = 0.02

RR = 25

95% CI = 15.72 to 39.75

X² = 311.67

P << 0.05


Confounders

How do we check these conditions in the study of Mexican children?

Condition 2 for confounding: association with the exposure.

Is there an association between overcrowding and attendance at child care?

Overcrowding at home    Attendance at nursery: Yes    Attendance at nursery: No
Yes                     43                            66
No                      35                            856

X² = 170.39

P << 0.05
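The X² values reported for conditions 1 and 2 are consistent with the Pearson Chi-squared statistic for a 2 x 2 table without continuity correction; that the slides used exactly this variant is an assumption. A minimal Python sketch, with the illustrative function name `chi_squared_2x2`:

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared for a 2 x 2 table, without continuity correction.

    Uses the shortcut N (ad - bc)^2 / (row1 * row2 * col1 * col2).
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Condition 1: overcrowding versus respiratory infection.
chi2_outcome = chi_squared_2x2(54, 55, 21, 870)
# Condition 2: overcrowding versus nursery attendance.
chi2_exposure = chi_squared_2x2(43, 66, 35, 856)
print(round(chi2_outcome, 2), round(chi2_exposure, 2))
```

Both statistics have 1 degree of freedom, so any value above 3.84 corresponds to p < 0.05.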


Confounders

How do we check these conditions in the study of Mexican children?

Condition 3 for confounding: is the potential confounder on the causal pathway?

In this example, it is unlikely that overcrowding at home is caused by attending child care, so overcrowding is not a step between the exposure and the outcome.


Do we have a confounder?

In this study, overcrowding satisfies the three conditions necessary for a confounding variable:

It is an independent risk factor for the outcome of interest. Overcrowding is associated with respiratory infection.

It is associated with the exposure of interest. Overcrowding is associated with attendance at the nursery.

It is not on the causal pathway. Overcrowding is unlikely to be a consequence of attendance at nursery.


Stratified tables

We now know that the data must be analyzed further to account for the effect of overcrowding.

To adjust for a confounding variable, we stratify the 2 x 2 table of interest.

The table without stratification is called the raw (crude) table. It can be divided into strata defined by the confounding variable.

The sample is divided into two groups; within each group, the status of overcrowding is the same.

The two groups are: with overcrowding and without overcrowding.


Stratified tables

We want to find out whether child care attendance is associated with respiratory infection when comparing children within the same category of overcrowding.

The raw table for the relationship between respiratory infection and child care attendance:

Attendance at nursery    Respiratory infection: Yes    Respiratory infection: No    Total
                         n       %                     n       %
Yes                      37      33.9                  72      66.1                 109
No                       43      4.8                   848     95.2                 891
Total                    80      8.0                   920     92.0                 1000


Stratified tables

The tables stratified by overcrowding status are:

Overcrowding

               Respiratory infection: Yes    Respiratory infection: No    Total
Nursery: Yes   61                            14                           75
Nursery: No    5                             21                           26
Total          66                            35                           101

RR = 4.23; X² = 32.88; p < 0.0001; 95% CI 1.91 to 9.37

Without overcrowding

               Respiratory infection: Yes    Respiratory infection: No    Total
Nursery: Yes   10                            24                           34
Nursery: No    4                             861                          865
Total          14                            885                          899

RR = 63.6; X² = 178.84; p < 0.0001; 95% CI 21.01 to 192.56
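The stratum-specific risk ratios can be checked with a few lines of Python. This is only a sketch; the dictionary keys and the helper name `risk_ratio` are illustrative, and each tuple holds the counts (a, b, c, d) read row by row from the stratified tables.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio: risk in the exposed row divided by risk in the unexposed row."""
    return (a / (a + b)) / (c / (c + d))


# One 2 x 2 table per overcrowding stratum: (a, b, c, d).
strata = {
    "overcrowding": (61, 14, 5, 21),
    "without overcrowding": (10, 24, 4, 861),
}

stratum_rr = {name: risk_ratio(*cells) for name, cells in strata.items()}
for name, value in stratum_rr.items():
    print(f"{name}: RR = {value:.2f}")
```

The two values differ widely (about 4.2 versus about 63.6), which is why the stratified results are worth inspecting before summarizing them in a single adjusted estimate.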


Stratified tables

Do you think that attendance at nursery is a risk factor for respiratory infection among children without overcrowding?

Yes, children attending day care have about 63 times the risk of respiratory infection of those who do not attend nursery.

The p value indicates a strong association between attendance at day care and respiratory infection in the group without overcrowding.


Stratified tables

Do you think that attendance at nursery is a risk factor for respiratory infection in the group with overcrowding?

Yes, children attending day care have more than 4 times the risk of respiratory infection of those not attending the nursery.

The p value indicates a strong association between attendance at day care and respiratory infection in this group.

Within each stratum, the association between attendance at day care and respiratory infection is now independent of overcrowding at home.


Comparison of results

How do these results compare with those of the raw table?

The raw table shows a strong relationship between attendance at day care and respiratory infection. The RR differs between the two stratified tables, but a statistically significant association remains in both.

                        RR      95% CI             X²        P-value
Raw                     7.06    4.77 to 10.45      111.88    <0.05
Overcrowding            4.23    1.91 to 9.37       32.88     <0.05
Without overcrowding    63.6    21.01 to 192.56    178.84    <0.05


Adjusted Risk Ratios

The nurse does not want to show the data divided into strata; she prefers a global estimate of the effect of nursery attendance on respiratory infection, adjusted for overcrowding.

This can be done by calculating the RR with the Mantel-Haenszel method.

First, consider the 2 x 2 table in each stratum, where the subscript e denotes the stratum:

Exposure    Disease: Yes    Disease: No    Total
Yes         a_e             b_e
No          c_e             d_e
Total                                      n_e


Risk Ratios from Mantel-Haenszel

The adjusted (summary) RR can be obtained with:

$$RR_{MH} = \frac{\sum a_e (c_e + d_e) / n_e}{\sum c_e (a_e + b_e) / n_e}$$

This gives a weighted average of the RRs estimated in each stratum, giving more weight to the strata with larger sample sizes.


Adjusted Risk Ratio

We calculate the RR adjusted for overcrowding with the Mantel-Haenszel formula:

Overcrowding

               Respiratory infection: Yes    Respiratory infection: No    Total
Nursery: Yes   61                            14                           75
Nursery: No    5                             21                           26
Total          66                            35                           101

Non-overcrowding

               Respiratory infection: Yes    Respiratory infection: No    Total
Nursery: Yes   10                            24                           34
Nursery: No    4                             861                          865
Total          14                            885                          899

$$RR_{MH} = \frac{61(5 + 21)/101 + 10(4 + 861)/899}{5(61 + 14)/101 + 4(10 + 24)/899} = \frac{15.70 + 9.62}{3.71 + 0.15} = \frac{25.32}{3.86} = 6.56$$
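The Mantel-Haenszel risk ratio can be sketched in Python as a pair of running sums over the strata. The function name `mantel_haenszel_rr` is illustrative. Without intermediate rounding the result is about 6.55, against the 6.56 obtained above from terms rounded to two decimals.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


# Strata: overcrowding, then non-overcrowding.
strata = [(61, 14, 5, 21), (10, 24, 4, 861)]
rr_mh = mantel_haenszel_rr(strata)
print(round(rr_mh, 2))
```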


Adjusted Odds Ratio

The adjusted OR is calculated in a similar way to the adjusted RR.

$$OR_{MH} = \frac{\sum a_e d_e / n_e}{\sum b_e c_e / n_e}$$

Exposure    Disease: Yes    Disease: No    Total
Yes         a_e             b_e
No          c_e             d_e
Total                                      n_e


Adjusted Odds Ratio

In a cross-sectional study on the use of quinfamide after amoebic dysentery, the number of carriers of Entamoeba histolytica was reported.

                  Non-carrier    Carrier    Total
Quinfamide        100            54         154
Non quinfamide    15             72         87
Total             115            126        241


Adjusted Odds Ratio

We calculate the OR adjusted for residence area with the Mantel-Haenszel formula:

Urban

                   Non-carrier    Carrier    Total
Quinfamide: Yes    35             39         74
Quinfamide: No     10             51         61
Total              45             90         135

Rural

                   Non-carrier    Carrier    Total
Quinfamide: Yes    65             14         79
Quinfamide: No     5              21         26
Total              70             35         105

$$OR_{MH} = \frac{(35 \times 51 / 135) + (65 \times 21 / 105)}{(39 \times 10 / 135) + (14 \times 5 / 105)} = \frac{13.2 + 13}{2.89 + 0.67} = \frac{26.2}{3.56} = 7.4$$
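A minimal Python sketch of the Mantel-Haenszel odds ratio, applied to the urban and rural tables, is given here. The function name `mantel_haenszel_or` is illustrative; the cell order (a, b, c, d) follows the tables row by row, with non-carriers in the first column.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Strata: urban, then rural.
strata = [(35, 39, 10, 51), (65, 14, 5, 21)]
or_mh = mantel_haenszel_or(strata)
print(round(or_mh, 1))
```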


Mantel-Haenszel X²

The nurse now knows that the association between respiratory infection and nursery attendance remains after adjusting for overcrowding, the confounding variable.

Now she wants to use a Chi-squared test to assess the significance of this association, adjusted for the confounder.

This can be done by calculating the Mantel-Haenszel X² test.


Mantel-Haenszel X²

To calculate the Chi-squared test adjusted for the confounder, we calculate the Mantel-Haenszel Chi-squared. The null hypothesis is that there is no association between nursery attendance and respiratory infection.

H0: OR = 1.

$$X^2_{MH} = \frac{\left[\sum a_e - \sum E(a_e)\right]^2}{\sum V(a_e)}$$


Mantel-Haenszel X²

We should go step by step, beginning with the 2 x 2 table of each stratum.

Exposure    Disease: Yes    Disease: No    Total
Yes         a_e             b_e
No          c_e             d_e
Total                                      n_e


Mantel-Haenszel X²

The Mantel-Haenszel Chi-squared test can be thought of as a weighted combination of the individual Chi-squared tests of each table.

To calculate the Mantel-Haenszel Chi-squared test, we need three values from each table:

a_e: number of ill and exposed

E(a_e): expected value of a_e

V(a_e): variance (standard error squared) of a_e

where E(a_e) is the row total times the column total divided by the grand total:

$$E(a_e) = \frac{(a_e + b_e)(a_e + c_e)}{n_e}$$

$$V(a_e) = \frac{(a_e + b_e)(c_e + d_e)(a_e + c_e)(b_e + d_e)}{n_e^2 (n_e - 1)}$$


Example

Overcrowding table:

a = 61

E(a) = 75 x 66 / 101 = 49.01

V(a) = (75 x 66 x 26 x 35) / (101² x (101 - 1)) = 4.42

Non-overcrowding table:

a = 10

E(a) = 34 x 14 / 899 = 0.53

V(a) = (34 x 14 x 865 x 885) / (899² x (899 - 1)) = 0.50

To obtain the Mantel-Haenszel Chi-squared (the Chi-squared adjusted for overcrowding), we add these values from the two strata, using the formula:

$$X^2_{MH} = \frac{\left[\sum a_e - \sum E(a_e)\right]^2}{\sum V(a_e)}$$


Example

To obtain the Mantel-Haenszel Chi-squared (the Chi-squared adjusted for overcrowding), we add the values from the two strata and apply the formula:

                    a     E(a)     V(a)
Overcrowding        61    49.01    4.42
Non-overcrowding    10    0.53     0.50
Total               71    49.54    4.92

$$X^2_{MH} = \frac{(71 - 49.54)^2}{4.92} = 93.60$$
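The step-by-step calculation can be sketched in Python by accumulating the observed count, its expected value and its variance over the strata. The function name `mantel_haenszel_chi2` is illustrative. Without intermediate rounding the statistic comes out close to 93.65, slightly above the 93.60 obtained from values rounded to two decimals.

```python
def mantel_haenszel_chi2(strata):
    """Mantel-Haenszel chi-squared statistic (1 degree of freedom).

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    sum_a = 0.0
    sum_expected = 0.0
    sum_variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # E(a): row total times column total over grand total.
        sum_expected += (a + b) * (a + c) / n
        # V(a): product of the four margins over n^2 (n - 1).
        sum_variance += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return (sum_a - sum_expected) ** 2 / sum_variance


# Strata: overcrowding, then non-overcrowding.
strata = [(61, 14, 5, 21), (10, 24, 4, 861)]
chi2_mh = mantel_haenszel_chi2(strata)
print(round(chi2_mh, 2))
```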


Confounding or no confounding?

How do we decide if there is confounding?

There are no statistical tests to demonstrate confounding.

We calculate statistical tests and measures of effect for the raw and stratified tables.

Then, we calculate the summary (adjusted) measures and tests, compare them with the raw ones, and conclude whether there is confounding or not.


Confounding or no confounding?

If there is an important difference between the raw and adjusted estimates, we say that the association of interest is confounded by another factor.

Consider the data on children attending nursery and respiratory infection.

After adjusting for overcrowding, the RR decreases from 7.06 to 6.56.


Possible effects of confounding

Generally there is more than one confounder. Confounders can have different effects:

The association under study may be significant before adjusting for a confounder and not significant after.

The association may remain significant after adjusting for a confounder, but with a less significant p-value.

Strata can show opposite results; in this case, it is better to show the stratified results. This is interaction or effect modification.

A confounder can hide an existing relationship.
