Biostatistics Case Studies 2007
Peter D. Christenson
Biostatistician
http://gcrc.labiomed.org/Biostat
Session 5:
Demonstrating Lack of Treatment Effect: Equivalence or Non-inferiority
Terminology
Superiority and/or Inferiority Study:
• Two or more treatments are assumed equal and the study is designed to find overwhelming evidence of a difference.
• Usually, one treatment is a control, sham, or placebo.
• Most common comparative study type.
• It is rare to assess only one of superiority or inferiority (“one-sided” statistical tests), unless the other is biologically impossible.
Terminology
Equivalence Study:
• Two treatments are assumed to differ and the study is designed to find overwhelming evidence that they are equal.
• Usually, the quantity of interest is a measure of biological activity or potency, and the “treatments” are drugs or lots or batches of drugs.
• Also known as bioequivalence.
• Sometimes used to compare clinical outcomes for two active treatments, e.g., statins or vaccines, if neither treatment can be considered standard or accepted. This usually requires large numbers of subjects.
Terminology
Non-Inferiority Study:
• Usually a new treatment or regimen is compared with an accepted treatment or regimen or standard of care.
• The new treatment is assumed inferior to the standard and the study is designed to show overwhelming evidence that it is at least nearly as good, i.e., non-inferior. It may have other advantages, e.g., oral vs. injected administration.
• A negative inferiority study fails to detect inferiority, but does not necessarily give evidence for non-inferiority.
• The accepted treatment is usually known to be efficacious already, but an added placebo group may also be used.
• The distinguishing feature is an attempt to prove negativity, not the one-sidedness of the inference.
Case Study
pASA+PPI = 1.5%
Demonstrate: pclop – pASA+PPI ≤ 4%
N=145/group Power=80% for what?
Typical Analysis: Inferiority or Superiority
H0: pclop – pASA+PPI = 0%
H1: pclop – pASA+PPI ≠ 0%
H1 → therapies differ
α = 0.05
Power = 80% for Δ=|pclop - pASA+PPI| =?
[Not used in this paper]
[Figure: three possible 95% CIs for pclop – pASA+PPI, plotted against 0. CI entirely above 0: clop inferior. CI entirely below 0: clop superior. CI containing 0: no difference detected*.]
* and 80% chance that a Δ of (?) or more would be detected.
So, N=331/group → 80% chance that a Δ of 4% or more would be detected.
Detectable Δ = 5.5%-1.5%=4%
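The N = 331 per group quoted above can be checked with a short calculation. This is a minimal sketch that assumes the usual normal-approximation formula for comparing two proportions (pooled variance under H0, unpooled under H1); the slides do not say which formula or software was used, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_superiority(p1, p2, alpha=0.05, power=0.80):
    """Per-group N for a two-sided test of p1 vs p2 (normal approximation).

    Assumption: pooled variance under H0, unpooled variance under H1.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    n = (z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p1 - p2) ** 2
    return ceil(n)


# 1.5% on ASA+PPI vs 5.5% on clop, i.e. a detectable difference of 4%
n_superiority = n_per_group_superiority(0.015, 0.055)
print(n_superiority)  # 331
```

Under these assumptions the calculation agrees with the N on the slide; other approximations (e.g. with a continuity correction) would give a somewhat different N.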
Typical Analysis: Inferiority or Superiority
H0: pclop – pASA+PPI = 0%
H1: pclop – pASA+PPI ≠ 0%
H1 → therapies differ
α = 0.05
Power = 80% for Δ=|pclop - pASA+PPI| =4%
[Not used in this paper]
Note that this could be formulated as two one-sided tests (TOST):
H0: pclop – pASA+PPI ≤ 0%
H1: pclop – pASA+PPI > 0%
H1 → clop inferior
α = 0.025
Power = 80% for pclop - pASA+PPI = 4%
H0: pclop – pASA+PPI ≥ 0%
H1: pclop – pASA+PPI < 0%
H1 → clop superior
α = 0.025
Power = 80% for pclop - pASA+PPI = -4%
Demonstrating Equivalence
H0: |pclop – pASA+PPI| ≥ E%
H1: |pclop – pASA+PPI| < E%
H1 → therapies “equivalent”, within E
[Not used in this paper]
Note that this could be formulated as two one-sided tests (TOST):
H0: pclop – pASA+PPI ≤ -4%
H1: pclop – pASA+PPI > -4%
H1 → clop non-superior
α = 0.025
Power = 80% for pclop - pASA+PPI = 0%
H0: pclop – pASA+PPI ≥ 4%
H1: pclop – pASA+PPI < 4%
H1 → clop non-inferior
α = 0.025
Power = 80% for pclop - pASA+PPI = 0%
Demonstrating Equivalence
H0: |pclop – pASA+PPI | ≥ 4%
H1: |pclop – pASA+PPI | < 4%
H1 → equivalence
α = 0.05
Power = 80% for pclop - pASA+PPI = 0%
[Figure: three possible 95% CIs for pclop – pASA+PPI, on an axis marked at -4, 0, and 4. CI entirely above -4: clop non-superior. CI entirely below 4: clop non-inferior. CI entirely between -4 and 4: equivalence*.]
* both non-superior and non-inferior.
This Paper: Inferiority and Non-Inferiority
Apparently, two one-sided tests (TOST), but only one explicitly powered:
H0: pclop – pASA+PPI ≤ 0%
H1: pclop – pASA+PPI > 0%
H1 → clop inferior
α = 0.025
Power = 80% for pclop - pASA+PPI = ?%
H0: pclop – pASA+PPI ≥ 4%
H1: pclop – pASA+PPI < 4%
H1 → clop non-inferior
α = 0.025
Power = 80% for pclop - pASA+PPI = 0%
The authors chose E = 4% as the largest difference at which the therapies would still be considered equivalent.
This Paper: Inferiority and Non-Inferiority
[Figure: three possible 95% CIs for pclop – pASA+PPI, on an axis marked at -4, 0, and 4. CI entirely above 0: clop inferior. CI entirely below 4: clop non-inferior. CI entirely between 0 and 4: “non-clinical” inferiority*.]
* clop is statistically inferior, but not enough for clinical significance.
Decisions:
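Observed Results: pclop = 8.6%; pASA+PPI = 0.7%; 95% CI for pclop – pASA+PPI = 3.4% to 12.4%
[Figure: this CI plotted on an axis for pclop – pASA+PPI marked at -4, 0, and 4; it lies entirely above 0 and extends beyond 4.]
Conclusion: clop inferior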
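The decision rules from the preceding slides can be written as a small function of the confidence limits. This is only an illustrative sketch: the function name and the "inconclusive" label are hypothetical and do not come from the slides, and all values are in percentage points.

```python
def classify_ci(lower, upper, margin):
    """Classify a 95% CI for (pclop - pASA+PPI), in percentage points.

    Assumption: decisions are read directly off the CI, as on the slides.
    """
    inferior = lower > 0           # CI lies entirely above 0
    non_inferior = upper < margin  # CI lies entirely below the margin
    if inferior and non_inferior:
        return '"non-clinical" inferiority'
    if inferior:
        return "clop inferior"
    if non_inferior:
        return "clop non-inferior"
    return "inconclusive"  # hypothetical label for a CI covering 0 and the margin


# Observed 95% CI of 3.4 to 12.4 with the authors' margin of 4
decision = classify_ci(3.4, 12.4, margin=4.0)
print(decision)  # clop inferior
```

With the observed interval, the lower limit is above 0 and the upper limit is above the margin, so only the inferiority conclusion holds.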
Power for Test of Clopidogrel Non-Inferiority
H0: pclop – pASA+PPI ≥ 4%
H1: pclop – pASA+PPI < 4%
H1 → clop non-inferior
α = 0.025
Power = 80% for pclop - pASA+PPI = 0%
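The N = 145 per group from the case study appears to correspond to this test. The sketch below assumes a normal-approximation formula with both true rates equal to 1.5%, a margin of 4%, and one-sided α = 0.025; the formula choice and the function name are assumptions, not something stated on the slides.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_noninferiority(p, margin, alpha=0.025, power=0.80):
    """Per-group N for a non-inferiority test when the true difference is 0.

    Assumption: both groups have true event rate p (normal approximation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = 2 * p * (1 - p)  # variance of the difference, times n
    return ceil((z_alpha + z_beta) ** 2 * variance / margin ** 2)


n_noninferiority = n_per_group_noninferiority(0.015, 0.04)
print(n_noninferiority)  # 145
```

Under these assumptions the result matches the N = 145 per group shown in the case study.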
Power for Test of Clopidogrel Inferiority
H0: pclop – pASA+PPI ≤ 0%
H1: pclop – pASA+PPI > 0%
H1 → clop inferior
α = 0.025
Power = 80% for pclop - pASA+PPI = 7.3%
Detectable Δ = 8.8%-1.5%=7.3%
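As a rough check of the detectable difference, the power of the one-sided inferiority test can be computed at N = 145 per group for true rates of 1.5% and 8.8%. This is a sketch under an assumed normal approximation (pooled variance under H0, unpooled under H1); the function name is hypothetical and the slides do not state how the 7.3% was obtained.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p_control, p_new, n, alpha=0.025):
    """Approximate power of a one-sided test that p_new exceeds p_control.

    Assumption: normal approximation with n subjects per group.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    p_bar = (p_control + p_new) / 2
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p_control * (1 - p_control) + p_new * (1 - p_new))
    z = ((p_new - p_control) * sqrt(n) - z_alpha * null_sd) / alt_sd
    return NormalDist().cdf(z)


# 1.5% on ASA+PPI vs 8.8% on clop, with the study's 145 per group
power_inferiority = power_one_sided(0.015, 0.088, 145)
print(round(power_inferiority, 2))
```

Under this approximation the power comes out close to 80%, consistent with the detectable Δ of 7.3% quoted above.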
Conclusions: This Paper
• In this paper, clop was so inferior that investigators were apparently lucky to have enough power for detecting it. The CI was too wide with this N for detecting a smaller therapy difference.
• Investigators justify testing non-inferiority of clop only (and not of Aspirin + Nexium) with the lessened desirability of combination therapy (?).
• This would be a good approach to study size and power for a new therapy competing against a standard, if the N needed to detect clop inferiority had also been considered.
• Note that the power calculations were based on raw percentages of subjects, whereas the analysis used cumulative 12-month incidence. I know of no power calculations for equivalence tests based on survival analysis.
Conclusions: General
• “Negligibly inferior” would be a better term than non-inferior.
• All inference can be based on confidence intervals.
• Pre-specify the comparisons to be made. Cannot test for both non-inferiority and superiority.
• Power the study for one comparison only or for several, e.g., non-inferiority and inferiority. Power can differ between comparisons.
• Very careful consideration must be given to choice of margin of equivalence (4% here). The study is worthless if others in the field would find your margin too large.
FDA Guidelines
• http://www.fda.gov/cder/guidance/4155fnl.pdf
• FDA has at least 4 major concerns:
1. Need strong evidence that standard treatment is effective.
2. Must have acceptable margin of equivalence that is much smaller than the effect of the standard over placebo.
3. Trial design must be very close to that which established the effectiveness of the standard treatment.
4. Study conduct must be high quality. This sounds like business-speak about “excellence”, but it’s really referring to the fact that superiority studies are by nature conservative: e.g., non-compliance and misclassification bias the results toward no effect. Those flaws in a non-inferiority study have the same bias, making it easier to falsely prove the aim.
Appendix: Possible Errors in Study Conclusions
Typical study to demonstrate superiority/inferiority

                          Truth: No Effect (H0)    Truth: Effect (H1)
Study claims No Effect    Correct (specificity)    Error (Type II)
Study claims Effect       Error (Type I)           Correct (power, sensitivity)

Set α=0.05, so specificity=95%.
Power (sensitivity): maximize; choose N for 80%.
Appendix: Graphical Representation of Power
Typical study to demonstrate superiority/inferiority
[Figure: two curves over the effect axis (Group B mean – Group A mean), one for H0: true effect=0 and one for HA: true effect=3, with N=100 per group and effect in study=1.13. Larger Ns give narrower curves.]
\\\ = Probability of concluding HA if H0 is true: 5%.
/// = Probability of concluding H0 if HA is true: 41%. Power=100-41=59%.
Note greater power if larger N, and/or if true effect>3, and/or less subject heterogeneity.
Appendix: Online Study Size / Power Calculator
www.stat.uiowa.edu/~rlenth/Power
Does NOT include tests for equivalence or non-inferiority or non-superiority