Classic Studies and Pitfalls in Clinical Trials · 2017-01-26 · Classic Studies and Pitfalls in Clinical Trials Brian G. Feagan MD Senior Scientific Director, Robarts Clinical Trials

Classic Studies and Pitfalls in Clinical Trials

Brian G. Feagan MD

Senior Scientific Director, Robarts Clinical Trials Inc.

Professor of Medicine, Epidemiology and Biostatistics

Western University

London, Ontario, Canada

Topics To Be Covered

• Assessment of trial design - a “consumers” guide

• Standard IBD trial designs - methodological issues

• Common Pitfalls – examples of what went wrong!

• New trial designs

Some Key Issues in Assessing RCTs

• Randomized?

• Blinded?

• A priori sample size calculation - adequate statistical power?

• Clinically relevant endpoint?

• Clinically relevant effect size?

• Dropouts/missing data

Candidate Endpoints

• Mortality

• Hospitalization/Surgery

• Complications

• Symptoms (Patient Reported Outcomes)

• Endoscopy

• Histopathology (UC)

Kaplan-Meier CD-Related Hospitalization: CHARM

n=778 randomized to adalimumab (ADA) 40 mg EOW or weekly, or placebo, through 56 weeks

[Figure: Kaplan-Meier curves of % hospitalization (0-30%) by days since randomization (0-350), placebo vs. adalimumab, with Week 2 marked]

3-month hospitalization risk: Placebo 7.3%, ADA 1.6% (RR reduction: 78%)

12-month hospitalization risk: Placebo 13.9%, ADA 5.9% (RR reduction: 57%)

Feagan et al. Gastroenterology 2008;135(5):1493-9
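The relative risk reductions quoted for CHARM can be checked with a short calculation. This is a minimal sketch, not the trial's analysis: the function name is illustrative, and the inputs are the rounded percentages from the slide, so the 12-month result comes out slightly above the reported 57% (which was presumably derived from unrounded data).

```python
def relative_risk_reduction(control_risk: float, treated_risk: float) -> float:
    """Return (control - treated) / control; both risks in the same units."""
    if control_risk <= 0:
        raise ValueError("control_risk must be positive")
    return (control_risk - treated_risk) / control_risk


# Rounded hospitalization risks (%) as shown on the CHARM slide.
rrr_3m = relative_risk_reduction(7.3, 1.6)
rrr_12m = relative_risk_reduction(13.9, 5.9)

# Absolute risk reduction at 12 months, in percentage points.
arr_12m = 13.9 - 5.9

print(f"3-month RRR:  {rrr_3m:.1%}")
print(f"12-month RRR: {rrr_12m:.1%}")
print(f"12-month ARR: {arr_12m:.1f} percentage points")
```

The absolute reduction is printed alongside because a large relative reduction can correspond to a modest absolute one when the baseline risk is low.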

Time to Surgery in Patients with Severe UC

Järnerot G et al. Gastroenterology. 2005;128: 1805-11.

Complications: Growth Failure

• Growth and development a special problem in pediatrics

• Markowitz trial of 6-MP - 3.8 cm increase in height with active treatment vs. placebo

• Only 56 patients required to detect a 3cm difference at an alpha error of 0.05!
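Why a continuous endpoint such as height needs so few patients can be illustrated with the standard normal-approximation formula for comparing two means. This is a sketch under assumptions the slide does not state: 80% power and a standard deviation of 4 cm are hypothetical values chosen for illustration, and `n_per_group_means` is not a function from the trial's analysis plan.

```python
import math
from statistics import NormalDist


def n_per_group_means(delta: float, sd: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm to detect a mean difference `delta` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# delta = 3 cm is from the slide; sd = 4 cm and power = 0.80 are assumptions.
total_patients = 2 * n_per_group_means(delta=3.0, sd=4.0)
print(total_patients)
```

Under these assumed inputs the total matches the 56 patients on the slide, but other combinations of standard deviation and power would give the same figure.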

Trial Designs in IBD

[Diagram: each design shown with its randomization point (R); the cross-over design shows two]

• Induction & maintenance (ACT I & II, PRECiSE-1)

• Maintenance (ACCENT I, CHARM, P2)

• Induction only (Targan, CLASSIC-I, Singleton)

• Induction & maintenance cross-over (NACSG)

Courtesy William Sandborn

CLASSIC: Adalimumab in Active Crohn’s Disease

Remission (CDAI < 150) at Week 4

[Bar chart: % of subjects in remission, by treatment group]
Placebo (n = 74): 12%
40/20 mg (n = 74): 18% (P = 0.36)
80/40 mg (n = 75): 24% (P = 0.06)
160/80 mg (n = 76): 36% (P = 0.001)

Hanauer S, et al. Gastroenterology. 2004;127:332.

Induction Only: What are the issues?

• Timing of primary endpoint

• Not sufficient for regulatory approval

• Patient acceptance

• Dose in maintenance

Maintenance Therapy - Adalimumab in Crohn's Disease: CHARM

[Bar charts: patients (%) with remission (CDAI <150) and with response, by treatment arm]

Arms: Placebo (n=170) / Adalimumab SC, 40 mg EOW (n=172) / Adalimumab SC, 40 mg q-week (n=157)

Remission (CDAI <150): 12 / 36*** / 41***
Response (CDAI ≥100 vs baseline): 18 / 43*** / 49***
Response (CDAI ≥70 vs baseline): 16 / 41*** / 48***

***p<0.001 vs placebo

Colombel et al. Gastroenterology 2007;132:52

Median Time to Loss of Response Through Week 54

Week 2 Responders

Hanauer S. Lancet. 2002 May 4; 359(9317): 1541-9.

Open-Label Induction Followed by Randomization of Responders to Maintenance

What Are the Issues?

• Tends to overestimate the effect size

• Duration?

• Analysis – simple proportions vs. survival

• Very feasible - attractive

Ultra 2 Study Design

[Study design diagram: randomization at baseline (Week 0) to adalimumab (ADA 160 mg, then ADA 80 mg, then ADA 40 mg eow) or placebo; timeline marked at Weeks 0, 2, 4, 8, 12, 20, 32 and 52, with Mayo assessments and two co-primary endpoints (remission)]

Patients could switch to ADA OL 40 mg eow if inadequate response at / after Week 12

Data on file Abbott Laboratories

Randomized Induction and Maintenance - “Treat Right Through” - What Are the Issues?

• High risk - insufficient numbers of patients to answer the maintenance question

• “missing data”

• Potential effects on patient selection

• Need to handle multiplicity of testing

• Estimation of sample size to answer the maintenance question dependent upon induction efficacy, retention rate

NACSG MTX Design

[Design diagram: steroid-dependent patients randomized 2:1 to MTX or placebo. At 16 weeks: remission and off corticosteroids? If yes, randomized 1:1 to MTX or placebo. Outcome at 40 weeks: no ensuing relapse (CDAI increase of 100 points or introduction of treatment for active disease)]

Part II Results: Remission at Week 40

[Bar chart: % remission at Week 40]
MTX: 65.0%
Placebo: 38.9%
P = 0.015

Feagan et al. N Engl J Med. 2000;342:1627-1632

Randomized Induction Followed by Re-Randomization of Responders to Maintenance

What Are the Issues?

• Logistically complex

• “Funnel effect” – need to overfill the induction component

• How to handle placebo responders

• Integral to sequential designs (use of open label “feeder” to power maintenance)
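The "funnel effect" can be put in numbers: if only responders are re-randomized, the induction phase must enroll enough patients to yield the maintenance sample. The sketch below is illustrative only; `induction_enrollment` is a hypothetical helper, and the target of 300 responders and the 50% response rate are made-up inputs rather than figures from any trial in this deck.

```python
import math


def induction_enrollment(n_maintenance: int, response_rate: float) -> int:
    """Patients to enroll in induction so that about `n_maintenance`
    responders are available for re-randomization."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(n_maintenance / response_rate)


# Hypothetical: 300 responders needed, half of induction patients respond.
needed = induction_enrollment(n_maintenance=300, response_rate=0.5)
print(needed)
```

A lower induction response rate inflates the enrollment target further, which is one reason an open-label "feeder" cohort is used to power the maintenance phase.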

Lessons Learned: Common Pitfalls

• Lack of clear PK/PD relationship

• Inadequate dose finding

• High placebo rates/poor patient selection/choice of outcomes

• Type II error

• Poor trial design

• Multiple competing goals

• “Over-engineering”

Lack of Clear PK/PD Relationship

• Early drug development is problematic because usually small numbers of patients are evaluated

• Strong PK/PD relationship is invaluable for establishing POC and dose proportionality

• e.g., RDP58 therapy for UC

[Bar chart: % “success”]
Placebo: 32%
RDP58-200: 71%
RDP58-300: 72%

Travis S. et al. Inflammatory Bowel Diseases. 2005;11(8):713-719

Inadequate Dose Finding

• Small molecules usually linear dose relationship

• Monoclonals usually flat

• Assumptions about effective dose based on RA data

[Bar chart: clinical remission (CDAI score < 150), remission (%) by treatment group]

Targan SR et al. N Engl J Med. 1997;337:1029-1035

CLASSIC: Remission at Week 4

[Bar chart: percentage of subjects in remission, by treatment group]
Placebo (N = 74): 12%
40/20 mg (N = 74): 18% (P = 0.36)
80/40 mg (N = 75): 24% (P = 0.06)
160/80 mg (N = 76): 36% (P = 0.001)

Hanauer et al. Gastroenterology. 2006;130(2):323-33

The Placebo Effect

• Complex – determined by multiple factors

• Major issue in IBD trials, especially CD

• Outcome measure specific

• Can really ruin your day!

Sargramostim (Recombinant GM-CSF 6 μg/kg/d SQ)

Active CD, N = 124

[Line chart: median CDAI score (0-350) by treatment week (0, 2, 4, 6, 8), sargramostim vs. placebo, with the remission and 100-point response levels marked; P values shown: 0.75, 0.19, 0.03, 0.006, 0.004]

Korzenik JR, et al. N Engl J Med. 2005;98:S247

High Placebo Rates

NOVEL 1 (USA)

Placebo Response (CDAI ≥ 100 pt decline)

[Bar chart: response or remission (%), placebo vs. sargramostim, by country (AUS, CAN, UK, BRA, RUS, UKR); countries with 10 or more subjects enrolled; no. subjects shown: 80, 25, 43, 28, 49, 44; non-responder imputation]

CDAI and the Placebo Response

• Minimize by:

• Reduce concomitant medication

• Short duration for induction studies

• Reduce the number of clinic visits

• Confirm objective evidence of inflammation at entry

• Robust endpoints (remission CDAI <150)

CRP and Placebo-Induced Remission

[Bar chart: percent of patients in remission (0-35%) by CRP threshold (>5, >7.5, >10, >15; axis labeled CRP in mg/dL); values shown on the chart: 174, 123, 104, 193; p<0.001, p<0.001]

Will et al. Gastroenterology April 2005;128(4) Suppl 2:A-48 Abstract #338

What is the Role of Endoscopy?

van Dullemen H et al. Gastroenterology 1995 Jul;109(1):129-35

Recent Negotiations with FDA on Ulcerative Colitis Endpoints

• Entry criteria are patients with rectal bleeding and a minimum of Grade 2 endoscopic changes (friability)

• Primary endpoint is no rectal bleeding and no friability on endoscopy

Modified Baron = 0

UCCS = 1

Riley score = 0

MLN-02: Clinical Remission at Week 6

[Bar chart: % remission]
Placebo: 15%
0.5 mg/kg: 34%
2.0 mg/kg: 33%
Overall P = 0.030; pairwise P values shown: 0.021, 0.015

Feagan B, et al. N Engl J Med. 2005

Type II Errors

[Line chart: patients in remission (%) (CDAI 150) by treatment week (2, 4, 8, 10)]
Prednisolone 40 mg qd tapered q 2 wk to 25 mg (N=88)
Budesonide 9 mg qd x 8 wk tapered to 6 mg (N=88)
*P = 0.22; †P <0.001; ‡P = 0.12

Rutgeerts et al. N Engl J Med. 1994;331:842-845

Type II Errors

• 1 - beta = statistical power

• Beta (the type II error rate) is commonly set at 10-20%

• Many trials look for huge effect size

• Special problem of non-inferiority trials
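The link between sample size and type II error can be made concrete with a normal-approximation power calculation for two proportions. This is a sketch under stated assumptions: `power_two_proportions` is an illustrative helper, the arm size of 88 echoes the arm sizes shown for the budesonide vs. prednisolone trial, and the 50% vs. 60% remission rates are hypothetical, not results from that trial.

```python
import math
from statistics import NormalDist


def power_two_proportions(p1: float, p2: float, n_per_group: int,
                          alpha: float = 0.05) -> float:
    """Approximate power (1 - beta) of a two-sided test of p1 vs. p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return NormalDist().cdf(abs(p1 - p2) / se - z_alpha)


# Hypothetical true remission rates of 50% vs. 60% with 88 patients per arm.
power_small = power_two_proportions(0.50, 0.60, n_per_group=88)
power_large = power_two_proportions(0.50, 0.60, n_per_group=400)
print(f"power with 88 per arm:  {power_small:.2f}")
print(f"power with 400 per arm: {power_large:.2f}")
```

With a modest true difference, a trial of this size has well under 80% power, so a non-significant result does not show that the treatments are equivalent.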

[Figure: AZA vs. placebo, y-axis 0-100, x-axis duration of trial (0-15 months)]

Candy S. Gut. 1995 Nov;37(5):674-8

Reasons Trials Fail: Over-Engineering

Trial Design Complexity: ACCENT I

Multiple goals = “over-engineering”

[Flow diagram]
• All patients, n = 573: infliximab 5 mg/kg infusion at Week 0
• Evaluation at Week 2: responders n = 335 (58%)
• Single dose group: placebo (n = 110)
• 3 dose induction groups: 5 mg/kg (n = 113); 5 mg/kg, with 10 mg/kg at Week 14 (n = 112)
• Timeline: Weeks 0, 2, 6, 14, 22, 30, 38, 46, 54

Hanauer SB et al. Lancet. 2002 May 4;359(9317):1541-9

ACCENT I: What Questions Were Asked vs. What Questions Were Answered?

Efficacy in maintenance

Optimal dose in maintenance

x Single vs. three dose induction

Steroid-sparing

x Dose escalation for secondary failure

x Prevention of surgery

x Mucosal healing

The Next Generation Trials: Active Comparators vs. Placebo - Non-inferiority vs. Superiority?

Population: patients receiving corticosteroids

• Superiority: new corticosteroid-sparing agent vs. placebo; response rate 30% vs. 45%; clinically significant difference 15%; sample size N = 320

• Non-inferiority: new corticosteroid-sparing agent vs. TNF antagonist; response rate 30%, one-sided 95% CI / MCID 12.5%; clinically insignificant difference 7.5%; sample size N = 1500!

Assumptions: 2-sided alpha = 0.05, beta = 0.20
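The gap between the two sample sizes follows from the usual normal-approximation formulas, sketched below. The function names are illustrative, and the non-inferiority inputs (a common 30% response rate, a 7.5% margin, one-sided alpha of 0.05) are an assumed reading of the slide; its exact inputs are not fully recoverable, so the sketch is not expected to reproduce the 1500 figure, only the direction and rough size of the inflation.

```python
import math
from statistics import NormalDist

_norm = NormalDist()


def n_superiority(p_control: float, p_new: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm to detect p_new vs. p_control (two-sided alpha)."""
    z = _norm.inv_cdf(1 - alpha / 2) + _norm.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    return math.ceil(z ** 2 * variance / (p_new - p_control) ** 2)


def n_noninferiority(p: float, margin: float,
                     alpha_one_sided: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm to exclude a deficit of `margin`, assuming both
    treatments truly share response rate `p`."""
    z = _norm.inv_cdf(1 - alpha_one_sided) + _norm.inv_cdf(power)
    return math.ceil(z ** 2 * 2 * p * (1 - p) / margin ** 2)


total_superiority = 2 * n_superiority(0.30, 0.45)
total_noninferiority = 2 * n_noninferiority(0.30, 0.075)
print(total_superiority, total_noninferiority)
```

Under the slide's stated assumptions the superiority calculation gives the 320 shown. The non-inferiority total is several times larger because the margin (7.5%) is half the superiority difference (15%), and sample size scales with the inverse square of that quantity.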

Novel RCT Designs:

Cluster Randomization

• Groups of patients randomized (not individuals)

• Cluster can be a practice, hospital, community

Reasons for Adopting


• To avoid treatment group “contamination”

• Administrative convenience

• To obtain cooperation of investigators

• Ethical considerations

• To enhance patient compliance

Bland (2004)

Cluster Randomization Trials Published 1981-2003

[Figure: cluster randomization trials published, 1981-2003]

Impact of Cluster Randomization on the Design and Analysis of a Trial

• Degree of similarity in outcomes within a cluster must be accounted for in design and analysis

• Can be quantified by an ICC = ρ [~Pearson’s r]

• ρ = 1.0 = absolute correlation versus

• ρ = 0.0 = absolute independence

• Reduces efficiency of statistical testing/inflates sample size

• Inferences at cluster level - no penalty, therefore efficient for policy questions
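How the ICC inflates sample size is usually expressed through the design effect, 1 + (m - 1)ρ, where m is the cluster size. The sketch below applies it to a layout of 40 practices with 60 patients each, the REACT I numbers in this deck. The ICC of 0.05 is a hypothetical value chosen for illustration, not an estimate from REACT, and the function names are not from any trial software.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from randomizing clusters of equal size."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters: int, cluster_size: int, icc: float) -> float:
    """Number of independent patients the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# 40 practices x 60 patients; icc = 0.05 is an assumed, illustrative value.
deff = design_effect(60, 0.05)
ess = effective_sample_size(40, 60, 0.05)
print(f"design effect: {deff:.2f}, effective sample size: {ess:.0f}")
```

At ρ = 0 the clustered sample is worth all 2400 patients. At ρ = 1 it is worth only the 40 clusters, which matches the two extremes listed above.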

REACT I: 40 GI COMMUNITY PRACTICES

[Design diagram: 20 practices assigned to the treatment algorithm, 20 practices to usual care]

Each practice to provide data on 60 patients

Therapeutic Algorithm for Active Luminal CD (Moderate Risk), Without Fistula

[Flowchart with Yes/No branches at each remission check; steps as they appear on the slide:]
• GCS (Bud vs Pred depending on disease activity and localization); 5-ASA; antibiotics
• Evaluate in 4 wks - remission? (HBS ≤ 4)
• Taper GCS, re-evaluate in 12 wks - remission? No maintenance therapy
• Add TNF antagonist + AZA or MTX (GCS as needed)
• Re-evaluate in 12 wks - remission? Continue combination maintenance therapy
• Increase TNF antagonist to weekly dose; switch antimetabolite
• Re-evaluate in 12 wks - remission? Continue combination maintenance therapy
• Switch TNF blocker
• Consider resection

REACT Primary Efficacy Measure

• Proportion of patients in remission at the practice level at the end of the 12-month follow-up period

• Remission is defined as a Harvey-Bradshaw score (HBS) ≤ 4 without the use of steroids for the treatment of CD

REACT: Time to first hospitalization, surgery or complication

[Kaplan-Meier curves: hospitalisation, surgery or complications (%) by time (0-24 months); conventional management 34.7% vs. early combined immunosuppression 27.4%]

HR (95% CI) = 0.73 (0.62, 0.86), p <0.001

Khanna R, et al. Lancet. 2015 Nov 7;386(10006):1825–34

REACT: Serious Disease and Drug-Related Complications and Mortality

                                         Early Combined       Conventional
                                         Immunosuppression    Management
                                         N (%)                N (%)           P Value
Worsening Crohn’s disease
  Abscess                                32 (3.0)             33 (3.7)        0.36
  Fistula                                29 (2.7)             39 (4.3)        0.03
  Stricture/bowel obstruction            67 (6.2)             82 (9.1)        0.01
Serious worsening disease                98 (9.0)             96 (10.7)       0.65
Serious extra-intestinal manifestations  47 (4.3)             50 (5.6)        0.37
Serious drug-related complications       10 (0.9)             10 (1.1)        0.84
Deaths
  Cardiovascular                         2 (0.2)              5 (0.5)
  Thromboembolic                         1 (0.1)              1 (0.1)
  Cancer                                 3 (0.3)              2 (0.2)
  Infection                              1 (0.1)              1 (0.1)
  Other                                  0 (0.0)              1 (0.1)
  Total Mortality                        7 (0.7)              10 (1.1)        0.33

Khanna R, et al. Lancet. 2015 Nov 7;386(10006):1825–34

Three Key Differences: REACT II vs. REACT I

1. Decision-making driven by endoscopy

2. Protocolized usual care arm

3. Deep remission as the primary outcome
