Page 1: Biostatistics Case Studies 2007

Biostatistics Case Studies 2007

Peter D. Christenson

Biostatistician

http://gcrc.labiomed.org/Biostat

Session 5:

Demonstrating Lack of Treatment Effect: Equivalence or Non-inferiority

Page 2: Biostatistics Case Studies 2007

Terminology

Superiority and/or Inferiority Study:

• Two or more treatments are assumed equal and the study is designed to find overwhelming evidence of a difference.

• Usually, one treatment is a control, sham, or placebo.

• Most common comparative study type.

• It is rare to assess only one of superiority or inferiority (“one-sided” statistical tests), unless one of them is biologically impossible.

Page 3: Biostatistics Case Studies 2007

Terminology

Equivalence Study:

• Two treatments are assumed to differ and the study is designed to find overwhelming evidence that they are equal.

• Usually, the quantity of interest is a measure of biological activity or potency and “treatments” are drugs or lots or batches of drugs.

• AKA, bioequivalence.

• Sometimes used to compare clinical outcomes for two active treatments, e.g., statins or vaccines, if neither treatment can be considered standard or accepted. This usually requires large numbers of subjects.

Page 4: Biostatistics Case Studies 2007

Terminology

Non-Inferiority Study:

• Usually a new treatment or regimen is compared with an accepted treatment or regimen or standard of care.

• The new treatment is assumed inferior to the standard and the study is designed to show overwhelming evidence that it is at least nearly as good, i.e., non-inferior. It may have other advantages, e.g., oral vs. injected administration.

• A negative inferiority study fails to detect inferiority, but does not necessarily give evidence for non-inferiority.

• The accepted treatment is usually known to be efficacious already, but an added placebo group may also be used.

• The distinguishing feature is an attempt to prove negativity, not the one-sidedness of the inference.

Page 5: Biostatistics Case Studies 2007

Case Study

Page 6: Biostatistics Case Studies 2007
Page 7: Biostatistics Case Studies 2007
Page 8: Biostatistics Case Studies 2007

pASA+PPI = 1.5%

Demonstrate: pclop – pASA+PPI ≤ 4%

N=145/group Power=80% for what?

Page 9: Biostatistics Case Studies 2007

Typical Analysis: Inferiority or Superiority

H0: pclop – pASA+PPI = 0%

H1: pclop – pASA+PPI ≠ 0%

H1 → therapies differ

α = 0.05

Power = 80% for Δ=|pclop - pASA+PPI| =?

[Figure: three 95% CIs for pclop – pASA+PPI, each plotted against 0. CI entirely above 0: clop inferior. CI entirely below 0: clop superior. CI containing 0: no difference detected*.]

[Not used in this paper]

* and 80% chance that a Δ of (?) or more would be detected.

Page 10: Biostatistics Case Studies 2007

Typical Analysis: Inferiority or Superiority

H0: pclop – pASA+PPI = 0%

H1: pclop – pASA+PPI ≠ 0%

H1 → therapies differ

α = 0.05

Power = 80% for Δ=|pclop - pASA+PPI| =?

[Not used in this paper]

So, N=331/group → 80% chance that a Δ of 4% or more would be detected.

Detectable Δ = 5.5%-1.5%=4%
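
The N=331/group figure can be checked with the usual normal-approximation formula for comparing two proportions. The sketch below is only an illustration under that assumption; the function name is hypothetical, and the slides do not say which formula or software produced the number.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group N for a two-sided test of H0: p1 = p2.

    Assumption: pooled-variance normal approximation, no continuity
    correction. Hypothetical helper, not taken from the slides.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)          # two-sided critical value
    z_beta = z(power)
    p_bar = (p1 + p2) / 2               # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# pASA+PPI = 1.5% and pclop = 5.5%, i.e. a detectable difference of 4%
n_superiority = n_per_group_two_proportions(0.015, 0.055)
print(n_superiority)
```

With these inputs the approximation should land on the slide's 331 per group; a continuity-corrected or exact method would give a somewhat larger N.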

Page 11: Biostatistics Case Studies 2007

Typical Analysis: Inferiority or Superiority

H0: pclop – pASA+PPI = 0%

H1: pclop – pASA+PPI ≠ 0%

H1 → therapies differ

α = 0.05

Power = 80% for Δ=|pclop - pASA+PPI| =4%

[Not used in this paper]

Note that this could be formulated as two one-sided tests (TOST):

H0: pclop – pASA+PPI ≤ 0%

H1: pclop – pASA+PPI > 0%

H1 → clop inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = 4%

H0: pclop – pASA+PPI ≥ 0%

H1: pclop – pASA+PPI < 0%

H1 → clop superior

α = 0.025

Power = 80% for pclop - pASA+PPI = -4%

Page 12: Biostatistics Case Studies 2007

Demonstrating Equivalence

H0: |pclop – pASA+PPI| ≥ E%

H1: |pclop – pASA+PPI| < E%

H1 → therapies “equivalent”, within E

[Not used in this paper]

Note that this could be formulated as two one-sided tests (TOST):

H0: pclop – pASA+PPI ≤ -4%

H1: pclop – pASA+PPI > -4%

H1 → clop non-superior

α = 0.025

Power = 80% for pclop - pASA+PPI = 0%

H0: pclop – pASA+PPI ≥ 4%

H1: pclop – pASA+PPI < 4%

H1 → clop non-inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = 0%

Page 13: Biostatistics Case Studies 2007

Demonstrating Equivalence

H0: |pclop – pASA+PPI | ≥ 4%

H1: |pclop – pASA+PPI | < 4%

H1 → equivalence

α = 0.05

Power = 80% for pclop - pASA+PPI = 0

[Figure: three 95% CIs for pclop – pASA+PPI on an axis marked at -4, 0, and 4. CI entirely above -4: clop non-superior. CI entirely below 4: clop non-inferior. CI entirely between -4 and 4: equivalence*.]

* both non-superior and non-inferior.
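
As a rough illustration of the two one-sided tests behind an equivalence claim, the sketch below applies simple Wald z-tests to two observed proportions. The function name and the event counts are hypothetical, and the Wald standard error is an assumption chosen for brevity; it breaks down (division by zero) when both groups have no events.

```python
from math import sqrt
from statistics import NormalDist


def tost_two_proportions(x1, n1, x2, n2, margin, alpha=0.025):
    """Two one-sided Wald tests for |p1 - p2| < margin (hypothetical helper).

    Returns (difference, p-value for non-superiority,
             p-value for non-inferiority, equivalence flag).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled Wald SE
    nd = NormalDist()
    # H0: diff <= -margin  vs  H1: diff > -margin  (non-superior)
    p_non_superior = 1 - nd.cdf((diff + margin) / se)
    # H0: diff >= +margin  vs  H1: diff < +margin  (non-inferior)
    p_non_inferior = nd.cdf((diff - margin) / se)
    equivalent = max(p_non_superior, p_non_inferior) < alpha
    return diff, p_non_superior, p_non_inferior, equivalent


# Made-up counts, for illustration only (not data from the paper)
large_trial = tost_two_proportions(20, 1000, 18, 1000, margin=0.04)
small_trial = tost_two_proportions(6, 145, 2, 145, margin=0.04)
print(large_trial[3], small_trial[3])
```

Equivalence is declared only when both one-sided tests reject, which is the same as requiring the whole 95% CI to sit between -4% and 4%.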

Page 14: Biostatistics Case Studies 2007

This Paper: Inferiority and Non-Inferiority

Apparently, two one-sided tests (TOST), but only one explicitly powered:

H0: pclop – pASA+PPI ≤ 0%

H1: pclop – pASA+PPI > 0%

H1 → clop inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = ?%

H0: pclop – pASA+PPI ≥ 4%

H1: pclop – pASA+PPI < 4%

H1 → clop non-inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = 0%

The authors chose E=4% as the largest difference between therapies at which they would still be considered equivalent.

Page 15: Biostatistics Case Studies 2007

This Paper: Inferiority and Non-Inferiority

Decisions:

[Figure: three 95% CIs for pclop – pASA+PPI on an axis marked at -4, 0, and 4. CI entirely above 0: clop inferior. CI entirely below 4: clop non-inferior. CI entirely between 0 and 4: “non-clinical” inferiority*.]

* clop is statistically inferior, but not enough for clinical significance.

Observed Results: pclop = 8.6%; pASA+PPI = 0.7%; 95% CI for pclop – pASA+PPI = 3.4 to 12.4

[Figure: the observed CI on the same axis, lying entirely above 0 and reaching about 12.]

Clop inferior
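
The decision rules on this slide can be written as a small function of the CI limits. This is a minimal sketch; the function name and the "inconclusive" label for a CI that rejects neither hypothesis are assumptions, not terms from the slides.

```python
def classify_ci(lower, upper, margin=4.0):
    """Classify a 95% CI for pclop - pASA+PPI, in percentage points.

    Hypothetical helper mirroring the two one-sided tests:
      lower > 0       rejects H0: difference <= 0       (clop inferior)
      upper < margin  rejects H0: difference >= margin  (clop non-inferior)
    """
    inferior = lower > 0
    non_inferior = upper < margin
    if inferior and non_inferior:
        return "non-clinical inferiority"
    if inferior:
        return "clop inferior"
    if non_inferior:
        return "clop non-inferior"
    return "inconclusive"  # assumed label: neither H0 is rejected


# Observed 95% CI reported on the slide: 3.4 to 12.4
decision = classify_ci(3.4, 12.4)
print(decision)
```

Because both tests use α = 0.025 one-sided, reading the decision off a single two-sided 95% CI is consistent with the hypotheses as stated.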

Page 16: Biostatistics Case Studies 2007

Power for Test of Clopidogrel Non-Inferiority

H0: pclop – pASA+PPI ≥ 4%

H1: pclop – pASA+PPI < 4%

H1 → clop non-inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = 0%
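
The N=145/group quoted earlier appears to correspond to this non-inferiority test. The sketch below assumes the common normal-approximation formula with both true rates equal to 1.5% and a 4% margin; the function name is hypothetical and the slides do not state which method the authors used.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_noninferiority(p_standard, p_new, margin,
                               alpha=0.025, power=0.80):
    """Per-group N to reject H0: p_new - p_standard >= margin.

    Assumption: unpooled normal approximation, one-sided alpha.
    Hypothetical helper, not taken from the slides.
    """
    z = NormalDist().inv_cdf
    variance = p_standard * (1 - p_standard) + p_new * (1 - p_new)
    # Distance between the assumed true difference and the margin
    distance = margin - (p_new - p_standard)
    return ceil((z(1 - alpha) + z(power)) ** 2 * variance / distance ** 2)


# Both therapies truly at 1.5%, margin E = 4%
n_noninferiority = n_per_group_noninferiority(0.015, 0.015, 0.04)
print(n_noninferiority)
```

Under these assumptions the result should match the 145 per group on the earlier slide, which suggests the study was powered for non-inferiority at a true difference of 0%.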

Page 17: Biostatistics Case Studies 2007

Power for Test of Clopidogrel Inferiority

H0: pclop – pASA+PPI ≤ 0%

H1: pclop – pASA+PPI > 0%

H1 → clop inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = 7.3%

Detectable Δ = 8.8%-1.5%=7.3%
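
To see roughly where the 7.3% detectable difference comes from, the sketch below computes approximate power for the one-sided inferiority test at N=145 per group. It assumes a normal approximation with a pooled standard error under H0; the function name is hypothetical, and the result is only expected to be near 80%, not to reproduce the slide exactly.

```python
from math import sqrt
from statistics import NormalDist


def power_inferiority_test(p_standard, p_new, n_per_group, alpha=0.025):
    """Approximate power to reject H0: p_new - p_standard <= 0.

    Assumption: normal approximation, pooled SE under H0 and unpooled SE
    under the alternative. Hypothetical helper, not from the slides.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)
    p_bar = (p_standard + p_new) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt((p_standard * (1 - p_standard)
                   + p_new * (1 - p_new)) / n_per_group)
    diff = p_new - p_standard
    return nd.cdf((diff - z_alpha * se_null) / se_alt)


# pASA+PPI = 1.5%, N = 145 per group
power_at_7_3 = power_inferiority_test(0.015, 0.088, 145)  # difference 7.3%
power_at_4_0 = power_inferiority_test(0.015, 0.055, 145)  # difference 4%
print(round(power_at_7_3, 2), round(power_at_4_0, 2))
```

The second call illustrates why N=145 is thin for the inferiority question: at a true difference equal to the 4% margin, the approximate power is well below 80%.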

Page 18: Biostatistics Case Studies 2007

Conclusions: This Paper

• In this paper, clop was so inferior that investigators were apparently lucky to have enough power for detecting it. The CI was too wide with this N for detecting a smaller therapy difference.

• Investigators justify testing non-inferiority of clop only (and not of Aspirin + Nexium) with the lessened desirability of combination therapy (?).

• This would have been a good approach to size and power for a new competing therapy against a standard, if the N for detecting clop inferiority had also been considered.

• Note that power calculations were based on actual percentages of subjects, whereas cumulative 12-month incidence was used in the analysis. I know of no power calculations for equivalence tests that use survival analysis.

Page 19: Biostatistics Case Studies 2007

Conclusions: General

• “Negligibly inferior” would be a better term than non-inferior.

• All inference can be based on confidence intervals.

• Pre-specify the comparisons to be made. Cannot test for both non-inferiority and superiority.

• The study can be powered for only one comparison or for multiple comparisons, e.g., non-inferiority and inferiority. Power can be different for different comparisons.

• Very careful consideration must be given to choice of margin of equivalence (4% here). The study is worthless if others in the field would find your margin too large.

Page 20: Biostatistics Case Studies 2007

FDA Guidelines

• http://www.fda.gov/cder/guidance/4155fnl.pdf

• FDA has at least 4 major concerns:

1. Need strong evidence that standard treatment is effective.

2. Must have acceptable margin of equivalence that is much smaller than the effect of the standard over placebo.

3. Trial design must be very close to that which established the effectiveness of the standard treatment.

4. Study conduct must be high quality. This sounds like business-speak about “excellence”, but it really refers to the fact that superiority studies are by nature conservative: e.g., non-compliance and misclassification bias the results toward no effect. In a non-inferiority study the same flaws also bias the results toward no difference, which makes it easier to falsely prove the aim.

Page 21: Biostatistics Case Studies 2007

Appendix: Possible Errors in Study Conclusions

Typical study to demonstrate superiority/inferiority

                            Truth: H0: No Effect      Truth: H1: Effect

Study Claims: No Effect     Correct (Specificity)     Error (Type II)

Study Claims: Effect        Error (Type I)            Correct (Sensitivity)

Type I error: Set α=0.05, so Specificity=95%

Power: Maximize. Choose N for 80%

Page 22: Biostatistics Case Studies 2007

Appendix: Graphical Representation of Power

[Figure: sampling distributions of the effect under H0 and HA for N=100 per group, plotted against Effect (Group B mean – Group A mean). Larger Ns give narrower curves.]

H0: true effect=0

HA: true effect=3

Effect in study=1.13

\\\ = Probability of concluding HA if H0 is true: 5%

/// = Probability of concluding H0 if HA is true: 41%

Power=100-41=59%

Note greater power if larger N, and/or if true effect>3, and/or less subject heterogeneity.

Typical study to demonstrate superiority/inferiority
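
The three ways of gaining power noted on this slide can be illustrated numerically. The slide does not give the subject-to-subject standard deviation, so the value below (9.7) is purely an assumption, back-solved so that the sketch lands near the slide's 59% at N=100 per group and a true effect of 3. The function name is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(true_effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    Counts only the rejection tail in the direction of the true effect;
    the opposite tail is negligible for the cases shown here.
    """
    nd = NormalDist()
    se = sd * sqrt(2 / n_per_group)          # SE of the mean difference
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(true_effect / se - z_crit)


ASSUMED_SD = 9.7  # assumption: not stated on the slide

baseline = power_two_means(3, ASSUMED_SD, 100)
larger_n = power_two_means(3, ASSUMED_SD, 200)
larger_effect = power_two_means(4, ASSUMED_SD, 100)
less_heterogeneity = power_two_means(3, 7.0, 100)
print(round(baseline, 2), round(larger_n, 2),
      round(larger_effect, 2), round(less_heterogeneity, 2))
```

Each of the three variations should come out above the baseline, in line with the note on the slide.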

Page 23: Biostatistics Case Studies 2007

Appendix: Online Study Size / Power Calculator

www.stat.uiowa.edu/~rlenth/Power

Does NOT include tests for equivalence or non-inferiority or non-superiority