2014-10-22 EUGM | WEI | Moving Beyond the Comfort Zone in Practicing Translational Stats

Moving beyond the comfort zone in practicing translational statistics

L.J. Wei, Harvard University

Why are we staying in a “Comfort Zone”?

Generally following a fixed pattern for conducting studies

Are we like lawyers?

Avoiding delay of review processes?

What is the goal of a clinical study?

Use efficient and reliable procedures to obtain robust, clinically interpretable results with respect to risk-benefit perspectives at the patient’s level.

What are the problems?

The conventional way to conduct trials gives us fragmentary information

Lack of clinically meaningful totality evidence

Difficult to use the trial results for the management of future patients

A Few Methodology Issues

1. Estimation vs. testing

P-value provides little clinical information about treatment effectiveness

The size of the effects (efficacy and toxicity) matters

Design using interval estimates is quite flexible

Almost everything we want to know via testing, we can get from estimation

TREAT study for EPO CV safety

If we follow the patients up to 48 months, the control arm's average stroke-free time is 46.9 months and the Darb arm's is 46 months.

The difference is 0.9 month (0.95 CI: 0.4m, 1.4m) with p<0.001 (very significant).
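As a rough illustration of how an interval estimate already carries the testing information, the sketch below backs a standard error and a two-sided p-value out of the TREAT interval quoted above. It assumes the interval is a symmetric normal-approximation (Wald) interval, which the slides do not state; the function name is ours.

```python
import math

def p_value_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back out SE, z and a two-sided p-value for H0: effect = 0 from a 95% CI.

    Assumes the interval is estimate +/- z_crit * SE (an assumption made
    here for illustration, not something the slides state).
    """
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Difference in average stroke-free time: 0.9 month, CI (0.4, 1.4)
se, z, p = p_value_from_ci(0.9, 0.4, 1.4)
print(f"SE = {se:.3f} months, z = {z:.2f}, p = {p:.4f}")
```

Under that assumption the implied p-value is below 0.001, in line with the slide, while the interval also shows that the effect itself is about one month over four years.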

2. How do we define a primary endpoint with multiple outcomes?

What is current practice?

Component/composite analyses

Efficacy and toxicity (how to connect them together?)

Disease burden measure?

Competing risks problem?

Informative dropout?

Example: Beta-Blocker Evaluation of Survival (BEST) Trial (NEJM, 2001)

Study: Bucindolol vs. placebo

Patients with advanced chronic heart failure; n = 2707

Average follow-up: 2 years

Primary endpoint: overall survival

Hazard ratio for death = 0.90 (p-value = 0.1)

BEST Trial

Possible solutions?

Using the patient’s disease burden or progression information during the entire follow-up to define the “responder”

Creating more than one response category: ordinal categorical response

Brian Claggett’s thesis paper

BEST Example: 8 Categories

1: No events
2: Alive, non-HF hospitalization only
3: Alive, 1 HF hosp.
4: Alive, >1 HF hosp.
5: Late non-CV death (>12 months)
6: Late CV death (>12 months)
7: Early non-CV death (<12 months)
8: Early CV death (<12 months)
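The eight categories can be read as a rule that maps each patient's follow-up history to one ordinal score. The sketch below is one possible encoding, not the BEST analysis code: the field names are hypothetical, and because the slide leaves a death at exactly 12 months unclassified, it is treated as late here by assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PatientHistory:
    # All field names are hypothetical; they are not taken from the BEST data set.
    hf_hospitalizations: int = 0
    non_hf_hospitalizations: int = 0
    death_month: Optional[float] = None  # None if alive at end of follow-up
    cv_death: bool = False               # only meaningful when death_month is set

def best_category(p: PatientHistory, early_cutoff: float = 12.0) -> int:
    """Map a follow-up history to the ordinal categories 1 (best) to 8 (worst)."""
    if p.death_month is not None:
        # Assumption: a death at exactly the cutoff counts as late.
        if p.death_month < early_cutoff:
            return 8 if p.cv_death else 7
        return 6 if p.cv_death else 5
    if p.hf_hospitalizations > 1:
        return 4
    if p.hf_hospitalizations == 1:
        return 3
    if p.non_hf_hospitalizations > 0:
        return 2
    return 1

print(best_category(PatientHistory(hf_hospitalizations=1)))           # category 3
print(best_category(PatientHistory(death_month=5.0, cv_death=True)))  # category 8
```

Death takes precedence over hospitalization in this encoding, so every patient falls into exactly one category.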

3. How to handle dropouts or competing risks?

LOCF? BOCF?

MMRM (model based)

Pattern mixture model (cannot handle non-random missing)

Using responder analysis with different ways to define informative dropouts for sensitivity analysis

4. Analysis of Covariance

Compare two treatments with baseline adjustments via regression models

For nonlinear models, different adjustments may lead to incoherent results

The inadequacy of the Cox ANCOVA

Possible solutions?

Using the augmentation method by Tsiatis et al.; Tian et al.

No need to pre-specify the baseline covariates, but a set of potential covariates in the adjustment process

5. Data monitoring

Heavily utilizing p-value or conditional power

A low conditional power may indicate that the sample size is too small or there is no real treatment difference

Using estimation and prediction for monitoring?
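To make the ambiguity of conditional power concrete, here is a minimal sketch using the standard B-value formulation for a one-sided test. The function name, parameters, and interim values are illustrative assumptions, not taken from any trial discussed in this talk.

```python
from statistics import NormalDist

def conditional_power(z_interim, info_frac, drift=None, alpha=0.025):
    """Conditional power of a one-sided level-alpha test (B-value formulation).

    z_interim : interim Z statistic
    info_frac : information fraction t, with 0 < t < 1
    drift     : assumed expected value of the final Z statistic;
                defaults to the current-trend value z_interim / sqrt(t)
    """
    nd = NormalDist()
    t = info_frac
    b = z_interim * t ** 0.5   # B-value: B(t) = Z(t) * sqrt(t)
    if drift is None:
        drift = b / t          # current-trend estimate of the drift
    z_crit = nd.inv_cdf(1.0 - alpha)
    # B(1) - B(t) ~ Normal(drift * (1 - t), 1 - t), independent of B(t)
    return 1.0 - nd.cdf((z_crit - b - drift * (1.0 - t)) / (1.0 - t) ** 0.5)

# Hypothetical interim look at half of the planned information
print(round(conditional_power(2.0, 0.5), 3))  # strong interim trend
print(round(conditional_power(0.5, 0.5), 3))  # weak interim trend
```

A small value from the second call cannot by itself say whether the treatment is ineffective or the trial is simply too small, which is the concern raised above; an interval estimate of the effect at the interim look addresses that question more directly.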

6. Stratified medicine (personalized medicine)?

A negative trial does not mean the treatment is no good for anyone

A positive trial does not mean it works for everyone

The usual subgroup analysis is not adequate to address this issue

Need a built-in pre-specified procedure for identifying patients who benefit from treatment

7. Identify patients who respond to the new therapy (predictive enrichment)

8. How to monitor safety?

What is the conventional way?

Component-wise tabulation or analysis?

No information about multiple AE events at the patient level

Graphical method?

9. Quantifying treatment contrast (difference)?

Should be a model-free parameter

Using difference of means, median, etc.

For censored data, using a constant hazard ratio (heavily model-based)?

Model-based measure is difficult to interpret or validate

Issues for the hazard ratio estimate

Hazard ratio estimate is routinely used for designing, monitoring and analyzing clinical studies in survival analysis

Model Free Parameter for Treatment Contrast

* Considering a two treatment comparison study in “survival analysis”

* How do we quantify the treatment difference?

• Median failure time (may not be estimable);

• t-year survival rate (not an overall measure)?

• A constant hazard ratio over time with the log-rank test

Eastern Cooperative Oncology Group

E4A03 trial to compare low- and high-dose dexamethasone for naïve patients with multiple myeloma

The primary endpoint is the survival time

n=445

The trial stopped early at the second interim analysis; the low dose was superior.

Patients on the high-dose arm then received the low dose, and follow-up for overall survival continued.

A Cancer Study Example

[Figure: survival probability curves for Group 1 and Group 2; y-axis: Probability (0.0 to 1.0); x-axis: Month (0 to 40)]

The proportional hazards assumption is not valid

The PH estimator is estimating a quantity which cannot be interpreted and, worse, depends on the study-specific censoring distributions

Any model-based treatment contrast has such issues (need a model-free parameter)

The logrank test is not powerful

Conventional analysis:

Log-rank test: p=0.47

Hazard Ratio: HR=0.87 (0.60, 1.27)

What is the alternative way for survival analysis?

Using the area under the curve of the Kaplan-Meier estimate up to a fixed time point

Restricted mean survival time

Model-free and a global measure of efficacy

Can be estimated even under heavy censoring
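The area-under-the-curve idea can be sketched in a few lines of plain Python. This is a minimal illustration with hypothetical toy data, not the software used for the analyses in this talk; the function names are ours, and subjects censored at an event time are treated as still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_just_after)] for each distinct event time.

    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= removed
    return steps

def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM step function on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # last rectangle, up to the truncation time
    return area

# Toy data (hypothetical): one censored observation at month 3
print(rmst([2, 3, 5], [1, 0, 1], tau=6))
```

With no censoring the result equals the sample mean of min(T, tau), which is a convenient sanity check for any implementation.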

Cancer Study Example

Restricted Mean (up to 40 months): 35.4 months vs. 33.3 months

∆ = 2.1 (0.1, 4.2) months; p=0.04

Ratio of Survival time = 35.4/33.3 = 1.06 (1.00, 1.13)

Ratio of time lost = 6.7/4.6 = 1.46 (1.02, 2.13)
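The three point estimates on this slide all follow from the two restricted means and the 40-month truncation time, with "time lost" being the truncation time minus the restricted mean. A small sketch of that arithmetic (the function name is ours, and the confidence intervals cannot be reproduced without patient-level data):

```python
def rmst_contrasts(rmst_a, rmst_b, tau):
    """Point estimates of the difference, the RMST ratio and the time-lost ratio."""
    return {
        "difference": rmst_a - rmst_b,
        "rmst_ratio": rmst_a / rmst_b,
        # Restricted mean time lost is tau - RMST; the worse arm is in the numerator
        "time_lost_ratio": (tau - rmst_b) / (tau - rmst_a),
    }

out = rmst_contrasts(35.4, 33.3, tau=40.0)
for name, value in out.items():
    print(f"{name}: {value:.2f}")
```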

10. Post-marketing/safety studies?

It is not appropriate to use an event driven procedure to conduct a safety study.

The event rate is low, the exposure time matters

Requires a lot of resources (large or long-term study)

Meta analysis; observational studies

CV safety study for anti-diabetes drugs

Event driven studies, that is, we need to have a pre-specified # of events so the resulting confidence interval for the treatment difference is “narrow”

For example, the upper bound of the 95% confidence interval is less than 1.3
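To see why such a requirement drives the size of the study, the sketch below uses the common approximation Var(log HR) ≈ 4/d for a 1:1 randomized trial with d events. The function and its parameters are illustrative assumptions, not the design calculation of any specific trial.

```python
import math
from statistics import NormalDist

def events_needed(margin=1.3, true_hr=1.0, power=None, conf_level=0.95):
    """Approximate number of events so the upper CI limit for the HR is below `margin`.

    Uses Var(log HR) ~ 4 / d for 1:1 randomization (an approximation).
    power=None : the observed HR is assumed to equal `true_hr` exactly
    power=0.9  : the upper limit falls below `margin` with probability 0.9
                 when the true HR is `true_hr`
    """
    nd = NormalDist()
    z_ci = nd.inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    z_pow = 0.0 if power is None else nd.inv_cdf(power)
    gap = math.log(margin) - math.log(true_hr)
    return math.ceil(4.0 * ((z_ci + z_pow) / gap) ** 2)

print(events_needed())           # observed HR exactly 1
print(events_needed(power=0.9))  # 90% chance of meeting the bound when the true HR is 1
```

Because the requirement is stated in events rather than patients, a low event rate translates directly into a large or long study.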

The EXAMINE trial (alogliptin)

NEJM, October 3, 2013

                    Placebo              Alogliptin           Difference             Ratio
RMST (24 months)    21.9 (21.7, 22.2)    22.0 (21.8, 22.3)    -0.08 (-0.39, 0.24)    1.00 (0.98, 1.01)
RMST (30 months)    27.1 (26.7, 27.4)    27.2 (26.9, 27.5)    -0.12 (-0.56, 0.33)    1.00 (0.98, 1.01)

Interval limits by amount of data used:

                            Whole data        1/2 of data       1/3 of data       1/4 of data
                            N=5380            N=2690            N=1793            N=1345
Hazard Ratio                0.824, 1.129      0.775, 1.213      0.742, 1.283      0.711, 1.338
Difference in RMST (30m)    -0.558, 0.325     -0.721, 0.500     -0.852, 0.647     -0.980, 0.753
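One way to read these intervals is to check how their widths grow as the data shrink. The sketch below compares each width, relative to the whole-data width, with the square root of the reduction in sample size; this is our reading aid under the usual large-sample approximation, not a calculation shown in the talk. The hazard ratio intervals are compared on the log scale.

```python
import math

n = [5380, 2690, 1793, 1345]
hr_ci = [(0.824, 1.129), (0.775, 1.213), (0.742, 1.283), (0.711, 1.338)]
rmst_ci = [(-0.558, 0.325), (-0.721, 0.500), (-0.852, 0.647), (-0.980, 0.753)]

def relative_widths(intervals, log_scale=False):
    """Width of each interval divided by the width of the first (whole-data) one."""
    widths = [
        (math.log(hi) - math.log(lo)) if log_scale else (hi - lo)
        for lo, hi in intervals
    ]
    return [w / widths[0] for w in widths]

expected = [math.sqrt(n[0] / k) for k in n]  # 1, about sqrt(2), sqrt(3), 2
hr_rel = relative_widths(hr_ci, log_scale=True)
rmst_rel = relative_widths(rmst_ci)

for k, e, h, r in zip(n, expected, hr_rel, rmst_rel):
    print(f"N={k}: sqrt factor {e:.2f}, log-HR width x{h:.2f}, RMST width x{r:.2f}")
```

Both sets of intervals widen at close to the square-root rate, so the table can be read as showing how each criterion would fare with a smaller study: the upper hazard ratio limit stays below 1.3 with one third of the data (1.283) and exceeds it with one quarter (1.338).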

11. Meta analysis for safety issues

Nissen and Wolski (2007) performed a meta analysis to examine whether Rosiglitazone (Avandia, GSK), a drug for treating type 2 diabetes mellitus, significantly increases the risk of MI or CVD related death.

Example: Effect of Rosiglitazone on MI or CVD Deaths

Avandia was introduced in 1999 and is widely used as monotherapy or in fixed-dose combinations with either Avandamet or Avandaryl.

The original approval of Avandia was based on its ability in reducing blood glucose and glycated hemoglobin levels.

Initial studies were not adequately powered to determine the effects of this agent on micro- or macro- vascular complications of diabetes, including cardiovascular morbidity and mortality.

Example: Effect of Rosiglitazone on MI or CVD Deaths

However, the effect of any anti-diabetic therapy on cardiovascular outcomes is particularly important because more than 65% of deaths in patients with diabetes are from cardiovascular causes.

Of 116 screened studies, 48 satisfied the inclusion criteria for the analysis proposed in Nissen and Wolski (2007).

42 studies were reported in Nissen and Wolski (2007); the remaining 6 studies have zero MI or CVD death

10 studies with zero MI events

25 studies with zero CVD related deaths

Event Rates from 0% to 2.70% for MI

Event Rates from 0% to 1.75% for CVD Death

[Figure: two panels, MI and CVD Death; x-axis: Log Odds Ratio]

MI: 95% CI: (1.03, 1.98); p-value = 0.03 (in favor of the control)

CVD Death: 95% CI: (0.98, 2.74); p-value = 0.06

Questions

Rare events?

How to utilize studies with 0/0 events?

Validity of asymptotic inference?

Exact inference?

Choice of effect measure?

Between Study Heterogeneity?

Common treatment effect or study specific treatment effect?

The number of studies not large?
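The questions about 0/0 studies and the choice of effect measure are linked: a study with no events in either arm has an undefined odds ratio but a well-defined risk difference of zero. The sketch below illustrates this with a Mantel-Haenszel pooled risk difference on hypothetical toy counts. It is not the exact inference procedure behind the results reported in this talk.

```python
def odds_ratio(e1, n1, e0, n0):
    """Sample odds ratio; None when it is undefined (zero in the denominator)."""
    denom = (n1 - e1) * e0
    return None if denom == 0 else (e1 * (n0 - e0)) / denom

def mh_risk_difference(studies):
    """Mantel-Haenszel pooled risk difference over (e1, n1, e0, n0) tuples.

    e1, n1 : events and patients on treatment; e0, n0 : the same for control.
    """
    num = den = 0.0
    for e1, n1, e0, n0 in studies:
        w = n1 * n0 / (n1 + n0)  # Mantel-Haenszel weight
        num += w * (e1 / n1 - e0 / n0)
        den += w
    return num / den

# Hypothetical counts: one study with events, one study with 0/0 events
with_events = (2, 100, 1, 100)
double_zero = (0, 200, 0, 200)

print(odds_ratio(*double_zero))                       # None: undefined
print(mh_risk_difference([with_events]))              # pooled RD without the 0/0 study
print(mh_risk_difference([with_events, double_zero])) # pooled RD with the 0/0 study
```

Including the 0/0 study pulls the pooled risk difference toward zero, so whether such studies are dropped is not a neutral choice.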

             Exact Inference              Asymptotic Inference
MI           Estimate: 0.18%              Estimate: 0.19%
             95% CI: (-0.08, 0.38)%       95% CI: (0.02, 0.42)%
             P-value = 0.27               P-value = 0.03
CVD Death    Estimate: 0.063%             Estimate: 0.11%
             95% CI: (-0.13, 0.23)%       95% CI: (0.00, 0.31)%
             P-value = 0.83               P-value = 0.05

Summary

Could we modify our statistical training?

Teaching young generations “how, where and what to learn”

Learning from doing a project with mentoring?

Could we have a coherent approach from the beginning to the end for a research project?

George Box: Instead of figuring out the optimal solution to a wrong problem, try to get A solution to a right problem.

Asking ourselves “What is the question?”
