Adaptive Design Methods in Clinical Trials
vct.qums.ac.ir/portal/file/?184695/Adaptive-Design-Methods-in-Clinical-Trials.pdf



Chapman & Hall/CRC

Taylor & Francis Group

6000 Broken Sound Parkway NW, Suite 300

Boca Raton, FL 33487-2742

© 2007 by Taylor & Francis Group, LLC

Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper

10 9 8 7 6 5 4 3 2 1

International Standard Book Number-10: 1-58488-776-1 (Hardcover)

International Standard Book Number-13: 978-1-58488-776-8 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Chow, Shein-Chung, 1955-

Adaptive design methods in clinical trials / Shein-Chung Chow and Mark Chang.

p. cm. -- (Biostatistics ; 17)

Includes bibliographical references and index.

ISBN 1-58488-776-1

1. Clinical trials. 2. Adaptive sampling (Statistics). 3. Experimental design. 4. Clinical trials--Statistical methods. I. Chang, Mark. II. Title.

R853.C55.C53 2006

610.7’4--dc22 2006048510

Visit the Taylor & Francis Web site at

http://www.taylorandfrancis.com

and the CRC Press Web site at

http://www.crcpress.com


Series Introduction

The primary objectives of the Biostatistics Book Series are to provide useful reference books for researchers and scientists in academia, industry, and government, and also to offer textbooks for undergraduate and/or graduate courses in the area of biostatistics. This book series will provide comprehensive and unified presentations of statistical designs and analyses of important applications in biostatistics, such as those in biopharmaceuticals. A well-balanced summary will be given of current and recently developed statistical methods and interpretations for both statisticians and researchers/scientists with minimal statistical knowledge who are engaged in the field of applied biostatistics. The series is committed to providing easy-to-understand, state-of-the-art references and textbooks. In each volume, statistical concepts and methodologies will be illustrated through real-world examples.

On March 16, 2004, the FDA released a report addressing the recent slowdown in innovative medical therapies being submitted to the FDA for approval, "Innovation/Stagnation: Challenge and Opportunity on the Critical Path to New Medical Products." The report describes the urgent need to modernize the medical product development process — the Critical Path — to make product development more predictable and less costly. Through this initiative, the FDA took the lead in the development of a national Critical Path Opportunities List to bring concrete focus to these tasks. As a result, two years later the FDA released a Critical Path Opportunities List that outlines 76 initial projects to bridge the gap between the quick pace of new biomedical discoveries and the slower pace at which those discoveries are currently developed into therapies. The Critical Path Opportunities List consists of six broad topic areas: (i) development of biomarkers, (ii) clinical trial designs, (iii) bioinformatics, (iv) manufacturing, (v) public health needs, and (vi) pediatrics. As indicated in the Critical Path Opportunities Report, biomarker development and streamlining clinical trials are the two most important areas for improving medical product development. Streamlining clinical trials calls for advancing innovative trial designs such as adaptive designs to improve innovation in clinical development. These 76 initial projects address the most pressing scientific and/or technical hurdles causing major delays and other problems in the drug, device, and/or biologic


development process. Among these six topics, biomarker development and streamlining clinical trials are the two most important areas for improving medical product development.

This volume provides useful approaches for the implementation of adaptive design methods in clinical trials in pharmaceutical research and development. It covers statistical methods for various adaptive designs such as adaptive group sequential design, N-adjustable design, adaptive dose-escalation design, adaptive seamless phase II/III trial design (drop-the-losers design), adaptive randomization design, biomarker-adaptive design, adaptive treatment-switching design, adaptive-hypotheses design, and any combination of the above designs. It will be beneficial to biostatisticians, medical researchers, pharmaceutical scientists, and reviewers in regulatory agencies who are engaged in the areas of pharmaceutical research and development.

Shein-Chung Chow


Preface

In recent years, the use of adaptive design methods in clinical trials has attracted much attention from clinical investigators and biostatisticians. Adaptations (i.e., modifications or changes) made to the trial and/or statistical procedures of on-going clinical trials based on accrued data have been in practice for years in clinical research and development. In the past several decades, we have adopted statistical procedures in the literature and applied them directly to the design of clinical trials as originally planned, ignoring the fact that adaptations, modifications, and/or changes have been made to the trials. As pointed out by the United States Food and Drug Administration (FDA), these procedures, however, may not be motivated by best clinical trial practice. Consequently, they may not be the best tools to handle certain situations. Adaptive design methods in clinical research and development are attractive to clinical scientists and researchers for the following reasons. First, they reflect medical practice in the real world. Second, they are ethical with respect to both efficacy and safety (toxicity) of the test treatment under investigation. Third, they are not only flexible but also efficient in clinical development, especially for early phase clinical development. However, there are issues regarding the adjustment of treatment effect estimates and p-values. In addition, there is a concern that the use of adaptive design methods in a clinical trial may lead to a totally different trial that is unable to address the scientific/medical questions the trial is intended to answer.

In practice, there existed no universal definition of adaptive design methods in clinical research until recently, when the Pharmaceutical Research and Manufacturers of America (PhRMA) gave a formal definition. Most literature focuses on adaptive randomization with respect to covariate, treatment, and/or clinical response; adaptive group sequential design for interim analysis; and sample size re-assessment. In this book, our definition is broader. Adaptive design methods include any adaptations, modifications, or changes of trial and/or statistical procedures that are made during the conduct of the trials. Although adaptive design methods are flexible and useful in clinical research, little or no regulatory guidance is available. The purpose of this book is to provide a comprehensive and unified presentation of the principles


and methodologies in adaptive design and analysis with respect to adaptations made to trial and/or statistical procedures based on accrued data of on-going clinical trials. In addition, this book is intended to give a well-balanced summary of current regulatory perspectives and recently developed statistical methods in this area. It is our goal to provide a complete, comprehensive, and updated reference and textbook in the area of adaptive design and analysis in clinical research and development.

Chapter 1 provides an introduction to basic concepts regarding the use of adaptive design methods in clinical trials and some statistical considerations of adaptive design methods. Chapter 2 focuses on the impact on target patient populations as the result of protocol amendments. Also included in this chapter is the generalization of statistical inference, which is drawn based on data collected from the actual patient population as the result of protocol amendments, to the originally planned target patient population. Several adaptive randomization procedures that are commonly employed in clinical trials are reviewed in Chapter 3. Chapter 4 studies the use of adaptive design methods in the case where hypotheses are modified during the conduct of clinical trials. Chapter 5 provides an overall review of adaptive design methods for dose selection, especially in dose finding and dose-response relationship studies in early clinical development. Chapter 6 introduces the commonly used adaptive group sequential design methods in clinical trials. Blinded procedures for sample size re-estimation are given in Chapter 7. Statistical tests for seamless phase II/III adaptive designs and statistical inference for switching from one treatment to another adaptively, together with the corresponding practical issues that may arise, are studied in Chapter 8 and Chapter 9, respectively. Bayesian approaches for the use of adaptive design methods in clinical trials are outlined in Chapter 10. Chapter 11 provides an introduction to the methodology of clinical trial simulation for evaluation of the performance of adaptive design methods under various adaptive designs that are commonly used in clinical development. Case studies regarding the implementation of adaptive group sequential design, adaptive dose-escalation design, and adaptive seamless phase II/III trial design in clinical trials are discussed in Chapter 12.

From Taylor & Francis, we would like to thank David Grubbs and Dr. Sunil Nair for providing us with the opportunity to work on this book. We would like to thank colleagues from the Department of Biostatistics and Bioinformatics and the Duke Clinical Research Institute (DCRI) of Duke University School of Medicine and Millennium Pharmaceuticals, Inc., for their support during the preparation of this book. We wish to express our gratitude to the following individuals for their encouragement and support: Robert Califf, M.D. and John Hamilton, M.D. of


Duke Clinical Research Institute and Duke University Medical Center; Nancy Simonian, M.D., Jane Porter, M.S., Andy Boral, M.D., and Jim Gilbert, M.D. of Millennium Pharmaceuticals, Inc.; Greg Campbell, Ph.D. of the U.S. Food and Drug Administration; and many friends from academia, the pharmaceutical industry, and regulatory agencies.

Finally, the views expressed are those of the authors and not necessarily those of Duke University School of Medicine and Millennium Pharmaceuticals, Inc. We are solely responsible for the contents and errors of this edition. Any comments and suggestions will be very much appreciated.

Shein-Chung Chow, Ph.D.
Duke University School of Medicine, Durham, NC

Mark Chang, Ph.D.
Millennium Pharmaceuticals, Inc., Cambridge, MA


Contents

1 Introduction
  1.1 What Is Adaptive Design?
  1.2 Regulatory Perspectives
  1.3 Target Patient Population
  1.4 Statistical Inference
  1.5 Practical Issues
    1.5.1 Moving target patient population
    1.5.2 Adaptive randomization
    1.5.3 Adaptive hypotheses
    1.5.4 Adaptive dose-escalation trials
    1.5.5 Adaptive group sequential design
    1.5.6 Adaptive sample size adjustment
    1.5.7 Adaptive seamless phase II/III design
    1.5.8 Adaptive treatment switching
    1.5.9 Bayesian and hybrid approaches
    1.5.10 Clinical trial simulation
    1.5.11 Case studies
  1.6 Aims and Scope of the Book

2 Protocol Amendment
  2.1 Actual Patient Population
  2.2 Estimation of Shift and Scale Parameters
    2.2.1 The case where µActual is random and σActual is fixed
  2.3 Statistical Inference
    2.3.1 Test for equality
    2.3.2 Test for non-inferiority/superiority
    2.3.3 Test for equivalence
  2.4 Sample Size Adjustment
    2.4.1 Test for equality
    2.4.2 Test for non-inferiority/superiority
    2.4.3 Test for equivalence
  2.5 Statistical Inference with Covariate Adjustment
    2.5.1 Population and assumption
    2.5.2 Conditional inference
    2.5.3 Unconditional inference
  2.6 Concluding Remarks

3 Adaptive Randomization
  3.1 Conventional Randomization
  3.2 Treatment-Adaptive Randomization
  3.3 Covariate-Adaptive Randomization
  3.4 Response-Adaptive Randomization
  3.5 Issues with Adaptive Randomization
  3.6 Summary

4 Adaptive Hypotheses
  4.1 Modifications of Hypotheses
  4.2 Switch from Superiority to Non-Inferiority
  4.3 Concluding Remarks

5 Adaptive Dose-Escalation Trials
  5.1 Introduction
  5.2 CRM in Phase I Oncology Study
  5.3 Hybrid Frequentist-Bayesian Adaptive Design
  5.4 Simulations
  5.5 Concluding Remarks

6 Adaptive Group Sequential Design
  6.1 Sequential Methods
  6.2 General Approach for Group Sequential Design
  6.3 Early Stopping Boundaries
  6.4 Alpha Spending Function
  6.5 Group Sequential Design Based on Independent p-Values
  6.6 Calculation of Stopping Boundaries
  6.7 Group Sequential Trial Monitoring
  6.8 Conditional Power
  6.9 Practical Issues

7 Adaptive Sample Size Adjustment
  7.1 Sample Size Re-estimation without Unblinding Data
  7.2 Cui–Hung–Wang's Method
  7.3 Proschan–Hunsberger's Method
  7.4 Muller–Schafer Method
  7.5 Bauer–Kohne Method
  7.6 Generalization of Independent p-Value Approaches
  7.7 Inverse-Normal Method
  7.8 Concluding Remarks

8 Adaptive Seamless Phase II/III Designs
  8.1 Why a Seamless Design Is Efficient
  8.2 Step-Wise Test and Adaptive Procedures
  8.3 Contrast Test and Naive p-Value
  8.4 Comparison of Seamless Designs
  8.5 Drop-the-Loser Adaptive Design
  8.6 Summary

9 Adaptive Treatment Switching
  9.1 Latent Event Times
  9.2 Proportional Hazard Model with Latent Hazard Rate
    9.2.1 Simulation results
  9.3 Mixed Exponential Model
    9.3.1 Biomarker-based survival model
    9.3.2 Effect of patient enrollment rate
    9.3.3 Hypothesis test and power analysis
    9.3.4 Application to trials with treatment switch
  9.4 Concluding Remarks

10 Bayesian Approach
  10.1 Basic Concepts of Bayesian Approach
    10.1.1 Bayes rule
    10.1.2 Bayesian power
  10.2 Multiple-Stage Design for Single-Arm Trial
    10.2.1 Classical approach for two-stage design
    10.2.2 Bayesian approach
  10.3 Bayesian Optimal Adaptive Designs
  10.4 Concluding Remarks

11 Clinical Trial Simulation
  11.1 Simulation Framework
  11.2 Early Phases Development
    11.2.1 Dose limiting toxicity (DLT) and maximum tolerated dose (MTD)
    11.2.2 Dose-level selection
    11.2.3 Sample size per dose level
    11.2.4 Dose-escalation design
  11.3 Late Phases Development
    11.3.1 Randomization rules
    11.3.2 Early stopping rules
    11.3.3 Rules for dropping losers
    11.3.4 Sample size adjustment
    11.3.5 Response-adaptive randomization
    11.3.6 Utility-offset model
    11.3.7 Null-model versus model approach
    11.3.8 Alpha adjustment
  11.4 Software Application
    11.4.1 Overview of ExpDesign studio
    11.4.2 How to design a trial with ExpDesign studio
    11.4.3 How to design a conventional trial
    11.4.4 How to design a group sequential trial
    11.4.5 How to design a multi-stage trial
    11.4.6 How to design a dose-escalation trial
    11.4.7 How to design an adaptive trial
  11.5 Examples
    11.5.1 Early phases development
    11.5.2 Late phases development
  11.6 Concluding Remarks

12 Case Studies
  12.1 Basic Considerations
    12.1.1 Dose and dose regimen
    12.1.2 Study endpoints
    12.1.3 Treatment duration
    12.1.4 Logistical considerations
    12.1.5 Independent data monitoring committee
  12.2 Adaptive Group Sequential Design
    12.2.1 Group sequential design
    12.2.2 Adaptation
    12.2.3 Statistical methods
    12.2.4 Case study — an example
  12.3 Adaptive Dose-Escalation Design
    12.3.1 Traditional dose-escalation design
    12.3.2 Adaptation
    12.3.3 Statistical methods
    12.3.4 Case study — an example
  12.4 Adaptive Seamless Phase II/III Design
    12.4.1 Seamless phase II/III design
    12.4.2 Adaptation
    12.4.3 Methods
    12.4.4 Case study — some examples
    12.4.5 Issues and recommendations

Bibliography

Index


CHAPTER 1

Introduction

In clinical research, the ultimate goal of a clinical trial is to evaluate the effect (e.g., efficacy and safety) of a test treatment as compared to a control (e.g., a placebo control, a standard therapy, or an active control agent). To ensure the success of a clinical trial, a well-designed study protocol is essential. A protocol is a plan that details how a clinical trial is to be carried out and how the data are to be collected and analyzed. It is the most critical document in clinical trials, since it ensures the quality and integrity of the clinical investigation in terms of its planning, execution, conduct, and the analysis of the data. During the conduct of a clinical trial, adherence to the protocol is crucial. Any protocol deviations and/or violations may introduce bias and variation to the data collected from the trial. Consequently, the conclusions drawn from the analysis of the data may not be reliable and hence may be biased or misleading. For marketing approval of a new drug product, the United States Food and Drug Administration (FDA) requires that at least two adequate and well-controlled clinical trials be conducted to provide substantial evidence regarding the effectiveness of the drug product under investigation (FDA, 1988). However, under certain circumstances, the FDA Modernization Act (FDAMA) of 1997 includes a provision (Section 115 of FDAMA) to allow data from a single adequate and well-controlled clinical trial to establish effectiveness for risk/benefit assessment of drug and biological candidates for approval. The FDA indicates that substantial evidence regarding the effectiveness and safety of the drug product under investigation can only be provided through the conduct of adequate and well-controlled clinical studies.
According to the FDA 1988 guideline for Format and Content of the Clinical and Statistical Sections of New Drug Applications, an adequate and well-controlled study is defined as a study that meets the following characteristics: (i) objectives, (ii) methods of analysis, (iii) design, (iv) selection of subjects, (v) assignment of subjects, (vi) participants of studies, (vii) assessment of responses, and (viii) assessment of effect. In the study protocol, it is essential to clearly state the study objectives. Specific hypotheses that reflect the study objectives should be provided in the study protocol. The study design must be valid in order to provide



a fair and unbiased assessment of the treatment effect as compared to a control. The target patient population should be defined through the inclusion/exclusion criteria to assure the disease conditions under study. Randomization procedures must be employed to minimize potential bias and to ensure comparability between treatment groups. Criteria for assessment of the response should be pre-defined and reliable. Appropriate statistical methods should be employed for assessment of the effect. Procedures such as blinding for the minimization of bias should be employed to maintain the validity and integrity of the trial.
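As a concrete illustration of the conventional randomization procedures mentioned above, the sketch below generates a 1:1 allocation schedule using permuted blocks. This example is not taken from the book; the function name, block size, and arm labels are illustrative choices.

```python
import random

def permuted_block_schedule(n_subjects, block_size=4, arms=("A", "B"), seed=2007):
    """Generate a 1:1 treatment allocation schedule using permuted blocks.

    Within each block, exactly half the slots go to each arm, so the
    allocation stays balanced after every completed block while the
    order within a block remains unpredictable.
    """
    assert block_size % len(arms) == 0, "block size must be a multiple of the number of arms"
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule[:n_subjects]

schedule = permuted_block_schedule(12)
print(schedule)
print("A:", schedule.count("A"), "B:", schedule.count("B"))  # 6 and 6
```

With 12 subjects and blocks of 4, three complete blocks are used, so the two arms end up with exactly 6 subjects each regardless of the random order inside the blocks.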

In clinical trials, it is not uncommon to adjust trial and/or statistical methods at the planning stage and during the conduct of clinical trials. For example, at the planning stage of a clinical trial, as an alternative to the standard randomization procedure, an adaptive randomization procedure based on treatment response may be considered for treatment allocation. During the conduct of a clinical trial, some adaptations (i.e., modifications or changes) to trial and/or statistical procedures may be made based on accrued data. Typical examples of adaptations of trial and/or statistical procedures of on-going clinical trials include, but are not limited to, the modification of inclusion/exclusion criteria, the adjustment of study dose or regimen, the extension of treatment duration, changes in study endpoints, and modification in study design such as group sequential designs and/or multiple-stage flexible designs (Table 1.1). Adaptations to trial and/or statistical procedures of on-going clinical trials will certainly have an immediate impact on the target patient population and, consequently, on statistical inference on the treatment effect for the target patient population. In practice, adaptations or modifications to trial and/or statistical procedures of on-going clinical trials are necessary, which not only reflect real medical practice on the actual patient population with the disease under study, but also increase the probability of success for identifying the clinical benefit of the treatment under investigation.

Table 1.1 Types of Adaptation in Clinical Trials

Adaptation                Examples
Prospective (by design)   Interim analysis
                          Stop trial early due to safety, futility/efficacy
                          Sample size re-estimation, etc.
On-going (ad hoc)         Inclusion/exclusion criteria
                          Dose or dose regimen
                          Treatment duration, etc.
Retrospective∗            Study endpoint
                          Switch from superiority to non-inferiority, etc.

∗Adaptation at the end of the study prior to database lock or unblinding.
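The response-based adaptive randomization mentioned above can be sketched with the classical randomized play-the-winner urn rule. This is an illustration only, not a procedure from the book; the function name, the RPW(1, 1) parameters, and the simulated success probabilities are hypothetical.

```python
import random

def rpw_allocation(n_subjects, p_success, alpha=1, beta=1, seed=2006):
    """Randomized play-the-winner RPW(alpha, beta) urn for two arms.

    The urn starts with `alpha` balls per arm. A success on an arm adds
    `beta` balls of that arm; a failure adds `beta` balls of the opposite
    arm, so later patients are more likely to be assigned to the arm
    that has responded better so far.

    p_success maps each arm to a (simulated) probability of response.
    """
    rng = random.Random(seed)
    urn = {"A": alpha, "B": alpha}
    assignments = []
    for _ in range(n_subjects):
        total = urn["A"] + urn["B"]
        arm = "A" if rng.random() < urn["A"] / total else "B"
        assignments.append(arm)
        success = rng.random() < p_success[arm]  # simulated clinical response
        other = "B" if arm == "A" else "A"
        urn[arm if success else other] += beta
    return assignments

alloc = rpw_allocation(100, {"A": 0.7, "B": 0.3})
print("A:", alloc.count("A"), "B:", alloc.count("B"))  # A typically receives more patients
```

The skew toward the better arm is exactly the ethical appeal of response-adaptive randomization noted above, and also the source of the inferential complications discussed in Chapter 3.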

The remainder of this chapter is organized as follows. In the next section, a definition regarding the so-called adaptive design is given. Section 1.2 provides regulatory perspectives regarding the use of adaptive design methods in clinical research and development. Sections 1.3 and 1.4 describe the impact of an adaptive design on the target patient population and statistical inference following adaptations of trial and/or statistical procedures, respectively. Practical issues that are commonly encountered when applying adaptive design methods in clinical research and development are briefly outlined in Section 1.5. Section 1.6 presents the aims and scope of the book.

1.1 What Is Adaptive Design?

On March 16, 2006, the FDA released a Critical Path Opportunities List that outlines 76 initial projects to bridge the gap between the quick pace of new biomedical discoveries and the slower pace at which those discoveries are currently developed into therapies (see, e.g., http://www.fda.gov/oc/initiatives/criticalpath). The Critical Path Opportunities List consists of six broad topic areas: (i) development of biomarkers, (ii) clinical trial designs, (iii) bioinformatics, (iv) manufacturing, (v) public health needs, and (vi) pediatrics. As indicated in the Critical Path Opportunities Report, biomarker development and streamlining clinical trials are the two most important areas for improving medical product development. Streamlining clinical trials calls for advancing innovative trial designs such as adaptive designs to improve innovation in clinical development.

In clinical investigation of treatment regimens, it is not uncommon to consider adaptations (i.e., modifications or changes) in early phase clinical trials before initiation of large-scale confirmatory phase III trials. We will refer to the application of adaptations to clinical trials as adaptive design methods in clinical trials. The adaptive design methods are usually developed based on observed treatment effects. To allow wider flexibility, adaptations in clinical investigation of treatment regimens may include changes of sample size, inclusion/exclusion criteria, study dose, study endpoints, and methods for analysis (Liu, Proschan, and Pledger, 2002). Along this line, the PhRMA Working Group defines an adaptive design as a clinical study design that uses accumulating data to decide how to modify aspects of the study as it continues, without undermining the validity and integrity of the trial (Gallo


et al., 2006). As indicated by the PhRMA Working Group, the adaptation is a design feature aimed to enhance the trial, not a remedy for inadequate planning. In other words, changes should be made by design and not on an ad hoc basis. By-design changes, however, do not reflect real clinical practice. In addition, they do not allow flexibility. As a result, in this book we will refer to an adaptive design of a clinical trial as a design that allows adaptations or modifications to some aspects (e.g., trial and/or statistical procedures) of the trial after its initiation without undermining the validity and integrity of the trial (Chow, Chang, and Pong, 2005). Adaptations or modifications of on-going clinical trials that are commonly made to trial procedures include eligibility criteria, study dose or regimen, treatment duration, study endpoints, laboratory testing procedures, diagnostic procedures, criteria for evaluability, assessment of clinical responses, deletion/addition of treatment groups, and safety parameters. In practice, during the conduct of the clinical trial, statistical procedures, including the randomization procedure for treatment allocation, study objectives/hypotheses, sample size re-assessment, study design, data monitoring and interim analysis procedures, the statistical analysis plan, and/or methods for data analysis, are often adjusted in order to increase the probability of success of the trial while controlling the pre-specified type I error. Note that in many cases, an adaptive design is also known as a flexible design (EMEA, 2002).

Adaptive design methods are very attractive to clinical researchers and/or sponsors due to their flexibility, especially when there are priority changes for budget/resources and timeline constraints, scientific/statistical justifications for study validity and integrity, medical considerations for safety, regulatory concerns for review/approval, and/or business strategies for go/no-go decisions. However, there is little or no information available in regulatory requirements as to what level of flexibility in modifications of trial and/or statistical procedures of on-going clinical trials would be acceptable. It is a concern that the application of adaptive design methods may result in a totally different clinical trial that is unable to address the scientific/medical questions/hypotheses the clinical trial is intended to answer. In addition, an adaptive design suffers from the following disadvantages. First, it may result in a major difference between the actual patient population, as the result of adaptations made to the trial and/or statistical procedures, and the (original) target patient population. The actual patient population under study could be a moving target depending upon the frequency and extent of modifications (flexibility) made to study parameters. Second, statistical inferences such as confidence intervals and/or p-values on the treatment effect of the test treatment under study may not be reliable. Consequently, the observed clinical results may not be


reproducible. In recent years, the use of adaptive design methods in clinical trials has attracted much attention from clinical scientists and biostatisticians.

In practice, adaptation or modification made to the trial and/or statistical procedures during the conduct of a clinical trial based on accrued data is usually recommended by the investigator, the sponsor, or an independent data monitoring committee. Although the adaptation or modification is flexible and attractive, it may introduce bias and consequently has an impact on statistical inference on the assessment of treatment effect for the target patient population under study. The complexity could be substantial depending upon the adaptation employed. Basically, the adaptation employed can be classified into three categories: prospective (by design) adaptation, concurrent (or on-going ad hoc) adaptation (by protocol amendment), and retrospective adaptation (after the end of the conduct of the trial, before database lock and/or unblinding). As can be seen, the on-going ad hoc adaptation has higher flexibility, while prospective adaptation is less flexible. Both types of adaptation require careful planning. It should be noted that statistical methods for certain kinds of adaptation may not be available in the literature. As a result, some studies with complicated adaptation may be more successful than others.

Depending upon the types of adaptation or modification made, commonly employed adaptive design methods in clinical trials include, but are not limited to: (i) an adaptive group sequential design, (ii) an N-adjustable design, (iii) an adaptive seamless phase II/III design, (iv) a drop-the-loser design, (v) an adaptive randomization design, (vi) an adaptive dose-escalation design, (vii) a biomarker-adaptive design, (viii) an adaptive treatment-switching design, (ix) an adaptive-hypotheses design, and (x) any combination of the above. An adaptive group sequential design is an adaptive design that allows for prematurely terminating a trial due to safety, efficacy, or futility based on interim analysis results, while an N-adjustable design is referred to as an adaptive design that allows for sample size adjustment or re-estimation based on the observed data at interim. A seamless phase II/III adaptive trial design refers to a program that addresses within a single trial objectives that are normally achieved through separate trials in phases IIb and III (Inoue, Thall, and Berry, 2002; Gallo et al., 2006). An adaptive seamless phase II/III design would combine two separate trials (i.e., a phase IIb trial and a phase III trial) into one trial and would use data from patients enrolled before and after the adaptation in the final analysis (Maca et al., 2006). A drop-the-loser design is a multiple-stage adaptive design that allows dropping the inferior treatment groups. An adaptive randomization design refers to a design


that allows modification of randomization schedules. An adaptive dose-escalation design is often used in early phase clinical development to identify the maximum tolerable dose, which is usually considered the optimal dose for later phase clinical trials. A biomarker-adaptive design is a design that allows for adaptations based on the response of biomarkers such as genomic markers. An adaptive treatment-switching design is a design that allows the investigator to switch a patient's treatment from an initial assignment to an alternative treatment if there is evidence of lack of efficacy or safety of the initial treatment. An adaptive-hypotheses design refers to a design that allows changes in hypotheses based on interim analysis results. Any combination of the above adaptive designs is usually referred to as a multiple adaptive design. In practice, depending upon the study objectives of clinical trials, a multiple adaptive design with several adaptations may be employed at the same time. In this case, statistical inference is often difficult, if not impossible, to obtain. These adaptive designs will be discussed further in later chapters of this book.
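To make the N-adjustable idea concrete, here is a rough sketch of what such a design re-computes at interim: the standard normal-approximation sample-size formula for a two-sided, two-sample comparison, re-evaluated at the interim estimate of the standardized effect size. The particular α, power, and effect sizes below are illustrative, not from the book.

```python
import math
from statistics import NormalDist

def required_n_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-arm sample size for a two-sided two-sample z-test.

    Uses the textbook normal-approximation formula
        n = 2 * ((z_{1-alpha/2} + z_{1-beta}) / delta)^2,
    where delta is the standardized effect size (mu_T - mu_C) / sigma.
    """
    z = NormalDist().inv_cdf
    n = 2 * ((z(1 - alpha / 2) + z(power)) / effect_size) ** 2
    return math.ceil(n)

# Planned under an optimistic effect size of 0.5 ...
n_planned = required_n_per_arm(0.5)        # 63 per arm
# ... but the interim estimate is smaller, so the sample size is revised upward.
n_reestimated = required_n_per_arm(0.35)   # 129 per arm
```

Re-estimating with an unblinded interim effect estimate is exactly the step that, as the text notes, must be planned so that the overall type I error remains controlled.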

In recent years, the use of these adaptive designs has received much attention. For example, the Journal of Biopharmaceutical Statistics (JBS) published a special issue (Volume 15, Number 4) on Adaptive Design in Clinical Research in 2005 (Pong and Luo, 2005). This special issue covers many statistical issues related to the use of adaptive design methods in clinical research (see, e.g., Chang and Chow, 2005; Chow, Chang, and Pong, 2005; Chow and Shao, 2005; Hommel, Lindig, and Faldum, 2005; Hung et al., 2005; Jennison and Turnbull, 2005; Kelly, Stallard, and Todd, 2005; Kelly et al., 2005; Li, Shih, and Wang, 2005; Proschan, 2005; Proschan, Leifer, and Liu, 2005; Wang and Hung, 2005). The PhRMA Working Group also published an executive summary on adaptive designs in clinical drug development to facilitate wide usage of adaptive designs in clinical drug development (Gallo et al., 2006). This book is intended to address concerns and/or practical issues that may arise when applying adaptive design methods in clinical trials.

1.2 Regulatory Perspectives

As pointed out by the FDA, modification of the design of an experiment based on accrued data has been in practice for hundreds, if not thousands, of years in medical research. In the past, we have had a tendency to adopt statistical procedures in the literature and apply them directly to the design of clinical trials (Lan, 2002). However, since these procedures were not motivated by clinical trial practice, they may not be the best tools to handle certain situations. The impact of any


adaptations made to trial and/or statistical methods before, during, and after the conduct of the trial could be substantial.

The flexibility in design and analysis of clinical trials in early phases of drug development is very attractive to clinical researchers/scientists and sponsors. However, its use in late phase II or phase III clinical investigations has led to regulatory concerns regarding the limitation of interpretation and extrapolation of trial results. As there is an increasing need for flexibility in design and analysis of clinical trials, the European Agency for the Evaluation of Medicinal Products (EMEA) published a concept paper on points to consider on methodological issues in confirmatory clinical trials with flexible design and analysis plans (EMEA, 2002, 2006). The EMEA's points to consider discuss prerequisites and conditions under which the methods could be acceptable in confirmatory phase III trials for regulatory decision making. The principal prerequisite for all considerations is that the methods under investigation can provide correct p-values, unbiased estimates, and confidence intervals for the treatment comparison(s) in an actual clinical trial. As a result, the use of an adaptive design not only raises the importance of well-known problems of studies with interim analyses (e.g., lack of a sufficient safety database after early termination and over-running), but also poses new challenges to clinical researchers.

From a regulatory point of view, blinded review of the database at interim analyses is a key issue in adaptive design. During these blinded reviews, the statistical analysis plan is often largely modified. At the same time, more study protocols are being submitted in which little or no information on statistical methods is provided and relevant decisions are deferred to a statistical analysis or even the blinded review, which has led to a serious regulatory concern regarding the validity and integrity of the trial. In addition, determining the resultant actual patient population of the study after the adaptations of the trial procedures, especially when changes to the inclusion/exclusion criteria are made, is a challenge to the regulatory review and approval process. A commonly asked question is whether the adaptive design methods have resulted in a totally different trial with a totally different target patient population. In this case, is the usual regulatory review and approval process still applicable? However, there is little or no information in regulatory guidances or guidelines regarding regulatory requirements or perceptions as to the degree of flexibility that would be accepted by the regulatory agencies. In practice, it is suggested that regulatory acceptance be justified based on the validity of statistical inference on the target patient population.

It should be noted that although adaptations of trial and/or statistical procedures are often documented through protocol amendments,


standard statistical methods may not be appropriate and may lead to invalid inference/conclusions regarding the target patient population. As a result, it is recommended that appropriate adaptive statistical methods be employed. Although several adaptive design methods for obtaining valid statistical inferences on treatment effects available in the literature (see, e.g., Hommel, 2001; Liu, Proschan, and Pledger, 2002) are useful, they should be performed in a completely objective manner. In practice, however, it can be very difficult to achieve this objectivity in clinical trials due to external influences and the different interests of the investigators and sponsors.

As a result, it is strongly recommended that a guidance/guideline for adaptive design methods be developed by the regulatory authorities to avoid any intentional or unintentional manipulation of the adaptive design methods in clinical trials. The guidance/guideline should describe in detail not only the standards for use of adaptive design methods in clinical trials, but also the level of modification in an adaptive design that is acceptable to the regulatory agencies. In addition, any changes in the process of regulatory review/approval should also be clearly indicated in such a guidance/guideline. It should be noted that adaptive design methods have been used in the review/approval process of regulatory submissions for years, though this may not have been recognized until recently.

1.3 Target Patient Population

In clinical trials, patient populations with certain diseases under study are usually described by the inclusion/exclusion criteria. Patients who meet all inclusion criteria and none of the exclusion criteria are qualified for the study. We will refer to this patient population as the target patient population. For a given study endpoint, such as clinical response, time to disease progression, or survival in the therapeutic area of oncology, we may denote the target patient population by (µ, σ), where µ is the population mean of the study endpoint and σ denotes the population standard deviation of the study endpoint. For a comparative clinical trial comparing a test treatment and a control, the effect size of the test treatment adjusted for standard deviation is defined as

    (µT − µC)/σ,

where µT and µC are the population means for the test treatment and the control, respectively. Based on the collected data, statistical inference such as a confidence interval and p-value on the effect


size of the test treatment can then be made for the target patient population.
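As a small numerical illustration (the sample data are invented here), the effect size above can be estimated from two independent samples by plugging the sample means and a pooled standard deviation into the formula, i.e., the usual Cohen's d estimate:

```python
import math

def pooled_sd(sample_t, sample_c):
    """Pooled sample standard deviation of two independent samples."""
    nt, nc = len(sample_t), len(sample_c)
    mt, mc = sum(sample_t) / nt, sum(sample_c) / nc
    vt = sum((x - mt) ** 2 for x in sample_t) / (nt - 1)
    vc = sum((x - mc) ** 2 for x in sample_c) / (nc - 1)
    return math.sqrt(((nt - 1) * vt + (nc - 1) * vc) / (nt + nc - 2))

def effect_size(sample_t, sample_c):
    """Sample analogue of (mu_T - mu_C) / sigma."""
    mean_t = sum(sample_t) / len(sample_t)
    mean_c = sum(sample_c) / len(sample_c)
    return (mean_t - mean_c) / pooled_sd(sample_t, sample_c)

d_hat = effect_size([1.2, 2.1, 2.9, 3.4], [0.8, 1.5, 2.2, 2.6])
```

The point estimate d_hat is what interim adaptations (e.g., sample size re-estimation) are typically driven by.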

In practice, as indicated earlier, it is not uncommon to modify trial procedures due to some medical and/or practical considerations during the conduct of the trial. Trial procedures of a clinical trial refer to the operating procedures, testing procedures, and/or diagnostic procedures that are to be employed in the clinical trial. As a result, trial procedures of a clinical trial include, but are not limited to, the inclusion/exclusion criteria, the selection of study dose or regimen, treatment duration, laboratory testing, diagnostic procedures, and criteria for evaluability. In clinical trials, we refer to statistical procedures of a clinical trial as the statistical procedures and/or statistical models/methods that are employed at the planning, execution, and conduct of the trial, as well as in the analysis of the data. Thus, statistical procedures of a clinical trial include power analysis for sample size calculation at the planning stage, the randomization procedure for treatment allocation prior to treatment, modifications of hypotheses, changes in study endpoint, and sample size re-estimation at interim during the conduct of the trial. As indicated in the FDA 1988 guideline and the International Conference on Harmonization (ICH) Good Clinical Practices (GCP) guideline (FDA, 1988; ICH, 1996), a well-designed protocol should detail how the clinical trial is to be carried out. Any deviations from the protocol and/or violations of the protocol will not only distort the original patient population under study, but will also introduce bias and variation into the data collected from the trial. Consequently, conclusions drawn based on statistical inference obtained from the analysis results of the data may not apply to the original target patient population.

In clinical trials, the inclusion/exclusion criteria, study dose or regimen, and/or treatment duration are often modified due to slow enrollment and/or safety concerns during the conduct of the trial. For example, at screening, we may disqualify too many patients with stringent inclusion/exclusion criteria. Consequently, enrollment may be too slow to meet the timeline of the study. In this case, a typical approach is to relax the inclusion/exclusion criteria to increase enrollment. On the other hand, the investigators may wish to have the flexibility to adjust the study dose or regimen to achieve the optimal clinical benefit of the test treatment during the trial. The study dose may be reduced when there are significant toxicities and/or adverse experiences. In addition, the investigators may wish to extend the treatment duration to (i) reach the best therapeutic effect or (ii) achieve the anticipated event rate based on accrued data during the conduct of the trial. These modifications of trial procedures are commonly encountered in clinical trials.


Modifications of trial procedures are usually accomplished through protocol amendments, which detail the rationales for changes and the impact of the modifications.

Any adaptations made to the trial and/or statistical procedures may introduce bias and/or variation into the data collected from the trial. Consequently, they may result in a similar but slightly different target patient population. We will refer to such a patient population as the actual patient population under study. As mentioned earlier, in practice, it is a concern whether adaptations made to the trial and/or statistical procedures could lead to a totally different trial with a totally different target patient population. In addition, it is of interest to determine whether statistical inference obtained based on clinical data collected from the actual patient population could be applied to the originally planned target patient population. These issues will be studied in the next chapter.

1.4 Statistical Inference

As discussed in the previous section, modifications of trial procedures will certainly introduce bias/variation into the data collected from the trial. The sources of these biases and variations can be classified into one of the following four categories: (i) expected and controllable, (ii) expected but not controllable, (iii) unexpected but controllable, and (iv) unexpected and not controllable. For example, additional bias/variation is expected but not controllable when there is a change in study dose or regimen and/or treatment duration. For changes in laboratory testing procedures and/or diagnostic procedures, bias/variation is expected but controllable by (i) having experienced technicians perform the tests or (ii) conducting appropriate training for inexperienced technicians. Bias/variation due to patient non-compliance with trial procedures is usually unexpected but is controllable by improving the procedures for patient compliance. Additional bias/variation due to unexpected and uncontrollable sources is usually referred to as the random error of the trial.

In practice, appropriate statistical procedures should be employed to identify and eliminate/control these sources of bias/variation whenever possible. In addition, after the adaptations of the trial procedures, especially the inclusion/exclusion criteria, the target patient population has been changed to the actual patient population under study. In this case, how to generalize the conclusions drawn based on statistical inference of the treatment effect derived from clinical data observed from the actual patient population to the original target patient population is a challenge to clinical scientists. It should be noted, however, that


although all modifications of trial procedures and/or statistical procedures are documented through protocol amendments, this does not imply that the collected data are free of bias/variation. Protocol amendments should not only provide rationales for changes but also detail how the data are to be collected and analyzed following the adaptations of trial and/or statistical procedures. In practice, it is not uncommon to observe the following inconsistencies following major adaptations of trial and/or statistical procedures of a clinical trial: (i) the right test for the wrong hypotheses, (ii) the wrong test for the right hypotheses, (iii) the wrong test for the wrong hypotheses, and (iv) the right test for the right hypotheses but with insufficient power. Each of these inconsistencies will result in invalid statistical inferences and conclusions regarding the treatment effect under investigation.

Flexibility in statistical procedures of a clinical trial is very attractive to the investigator and/or sponsors. However, it suffers the disadvantage of invalid statistical inference and/or misleading conclusions if the impact is not carefully managed. Liu, Proschan, and Pledger (2002) provided a solid theoretical foundation for adaptive design methods in clinical development under which not only a general method for point estimation, confidence intervals, hypothesis testing, and overall p-values can be obtained, but also its validity can be rigorously established. However, they do not take into consideration the fact that the target patient population has become a moving target patient population as the result of adaptations made to the trial and/or statistical procedures through protocol amendments. This issue will be discussed further in the next chapter.

The ICH GCP guideline suggests that a thoughtful statistical analysis plan (SAP), which details the statistical procedures (including models/methods), be employed for data collection and analysis. Any deviations from or violations of the SAP could decrease the reliability of the analysis results, and consequently the conclusions drawn from these analysis results may not be valid.

In summary, the use of adaptive design methods in clinical trials may have an impact on statistical inference on the target patient population under study. Statistical inference obtained based on data collected from the actual patient population, as the result of modifications made to the trial procedures and/or statistical procedures, should be adjusted before it can be applied to the original target patient population.

1.5 Practical Issues

As indicated earlier, the use of adaptive design methods in clinical trials has received much attention because it allows adaptations of


trial and/or statistical procedures of on-going clinical trials. The flexibility for adaptations to study parameters is very attractive to clinical scientists and sponsors. However, from a regulatory point of view, several questions have been raised. First, what level of adaptation to the trial and/or statistical procedures would be acceptable to the regulatory authorities? Second, what are the regulatory requirements and standards for the review and approval process of clinical data obtained from adaptive clinical trials with different levels of adaptation to trial and/or statistical procedures of on-going clinical trials? Third, has the clinical trial become a totally different clinical trial after the adaptations to the trial and/or statistical procedures for addressing the study objectives of the originally planned clinical trial? These concerns must be addressed by the regulatory authorities before the adaptive design methods can be widely accepted in clinical research and development.

In addition, from the scientific/statistical point of view, there are also some concerns regarding (i) whether the modifications to the trial procedures have resulted in a similar but different target patient population, (ii) whether the modifications of hypotheses have distorted the study objectives of the trial, and (iii) whether the flexibility in statistical procedures has led to a biased assessment of the clinical benefit of the treatment under investigation. In this section, practical issues associated with the above questions that are commonly encountered when applying adaptive design methods to on-going clinical trials are briefly described. These issues include the moving target patient population as the result of protocol amendments, adaptive randomization, adaptive hypotheses, adaptive dose-escalation trials, adaptive group sequential designs, adaptive sample size adjustment, adaptive seamless phase II/III trial designs, dropping the losers adaptively, adaptive treatment switching, Bayesian and hybrid approaches, clinical trial simulation, and case studies.

1.5.1 Moving target patient population

In clinical trials, it is important to define the patient population with the disease under study. This patient population is usually described based on eligibility criteria, i.e., the inclusion and exclusion criteria. This patient population is referred to as the target patient population. As indicated in Chow and Liu (2003), a target patient population is usually roughly defined by the inclusion criteria and then fine-tuned by the exclusion criteria to minimize the heterogeneity of the patient population. When adaptations are made to the trial and/or statistical procedures,


especially the inclusion/exclusion criteria, during the conduct of the trial, the mean response of the primary study endpoint of the target patient population may be shifted, with heterogeneity in variability. As a result, adaptations made to trial and/or statistical procedures could lead to a similar but different patient population. We will refer to this resultant patient population as the actual patient population. In practice, it is a concern that a major (or significant) adaptation could result in a totally different patient population. During the conduct of a clinical trial, if adaptations are made frequently, the target patient population is in fact a moving target patient population (Chow, Chang, and Pong, 2005). As a result, it is difficult to draw accurate and reliable statistical inference on the moving target patient population. Thus, in practice, it is of interest to determine the impact of adaptive design methods on the target patient population and, consequently, on the corresponding statistical inference and power analysis for sample size calculation. More details are given in the next chapter.
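One way to picture the shift, anticipating the treatment in the next chapter, is to write the actual population as (µ + ε, Cσ), where ε is the shift in mean and C the inflation of the standard deviation caused by the amendments (the notation and the numbers below are an illustrative sketch, not results from the book):

```python
def shifted_effect_size(mu: float, sigma: float, eps: float, c: float) -> float:
    """Effect size in the actual population (mu + eps, c * sigma).

    eps is the shift in mean and c the inflation factor of the standard
    deviation attributed to protocol amendments; eps = 0 and c = 1
    recover the original target population.
    """
    return (mu + eps) / (c * sigma)

original = shifted_effect_size(mu=0.5, sigma=1.0, eps=0.0, c=1.0)
# A modest downward mean shift plus extra variability already dilutes
# the effect size the trial was powered for.
actual = shifted_effect_size(mu=0.5, sigma=1.0, eps=-0.1, c=1.2)
```

Because each amendment can change ε and C again, the quantity being estimated keeps moving, which is exactly the inferential difficulty described above.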

1.5.2 Adaptive randomization

In clinical trials, randomization models such as the population model, the invoked population model, and the randomization model, with the methods of complete randomization and permuted-block randomization, are commonly used to ensure a balanced allocation of patients to treatments within either a fixed total sample size or a pre-specified block size (Chow and Liu, 2003). The population model refers to the concept that clinicians can draw conclusions for the target patient population based on the selection of a representative sample drawn from the target patient population by some random procedure (Lehmann, 1975; Lachin, 1988). The invoked population model refers to the process of selecting investigators first and then selecting patients at each selected investigator's site. As can be seen, neither the selection of investigators nor the recruitment of patients at the selected investigator's site is random. However, treatment assignment is random. Thus, the invoked population model allows the analysis of the clinical data as if they were obtained under the assumption that the sample is randomly selected from a homogeneous patient population. The randomization model refers to the concept of randomization or permutation tests based on the fact that the study site selection and patient selection are not random, but the assignment of treatments to patients is random. The randomization model/method is a critical component in clinical trials because statistical inference based on the data collected from the trial relies on the probability distribution of the


sample, which in turn depends upon the randomization procedure employed.

In practice, however, it is also of interest to adjust the probability of assignment of patients to treatments during the study to increase the probability of success of the clinical study. This type of randomization is called adaptive randomization because the probability of the treatment to which a current patient is assigned is adjusted based on the assignments of previous patients. The randomization codes based on the method of adaptive randomization cannot be prepared before the study begins. This is because the randomization process is performed at the time a patient is enrolled in the study, whereas adaptive randomization requires information on previously randomized patients. In practice, the method of adaptive randomization is often applied with respect to treatment, covariate, or clinical response. The corresponding adaptive randomization is therefore known as treatment-adaptive randomization, covariate-adaptive randomization, or response-adaptive randomization. Adaptive randomization procedures could have an impact on the sample size required for achieving a desired statistical power and, consequently, on statistical inference on the test treatment under investigation. More details regarding the adaptive randomization procedures described above and their impact on sample size calculation and statistical inference are given in Chapter 3 of this book.
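A classical response-adaptive rule that Chapter 3 makes precise is the randomized play-the-winner urn (Wei and Durham): each success on an arm adds a ball for that arm, so later patients are more likely to be assigned to whichever treatment has performed better. The two-arm sketch below is a simulation with invented success probabilities and trial size, not the book's formulation:

```python
import random

def play_the_winner(responses, n_patients=100, seed=0):
    """Randomized play-the-winner urn for two arms, 'A' and 'B'.

    `responses` maps each arm to a success probability, used here only
    to simulate patient outcomes. The urn starts with one ball per arm;
    every observed success adds a ball of the same arm, skewing future
    assignments toward the better-performing treatment.
    """
    rng = random.Random(seed)
    urn = ["A", "B"]
    assignments = []
    for _ in range(n_patients):
        arm = rng.choice(urn)              # draw a ball = assign treatment
        assignments.append(arm)
        if rng.random() < responses[arm]:  # simulated binary response
            urn.append(arm)                # success: add a ball for this arm
    return assignments

alloc = play_the_winner({"A": 0.7, "B": 0.3})
```

Because the assignment probabilities depend on accumulating outcomes, the allocation codes cannot be generated in advance, which is precisely the operational point made in the paragraph above.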

1.5.3 Adaptive hypotheses

Modifications of hypotheses during the conduct of a clinical trial commonly occur for the following reasons: (i) an investigational method has not yet been validated at the planning stage of the study, (ii) information from other studies is necessary for planning the next stage of the study, (iii) there is a need to include new doses, and (iv) there are recommendations from a pre-established data safety monitoring committee (Hommel, 2001). In clinical research, it is not uncommon to have more than one set of hypotheses for an intended clinical trial. These hypotheses may be classified as primary hypotheses and secondary hypotheses depending upon whether they address the primary or secondary study objectives. In practice, a pre-specified overall type I error rate is usually controlled for testing the primary hypotheses. However, if the investigator is interested in controlling the overall type I error rate for testing secondary hypotheses, then techniques for multiple testing are commonly employed. Following the ideas of Bauer (1999), Kieser, Bauer, and Lehmacher (1999), and Bauer and Kieser (1999) for general multiple testing problems, Hommel (2001) applied the same techniques


to obtain more flexible strategies for adaptive modifications of hypotheses based on accrued data at interim, by changing the weights of hypotheses, changing an a priori order, or even including new hypotheses. The method proposed by Hommel (2001) enjoys the following advantages. First, it is very general in the sense that it can be applied to any type of multiple testing problem. Second, it is mathematically correct. Third, it is extremely flexible, allowing not only changes to the design, but also changes to the choice of hypotheses or their weights during the course of the study. In addition, it allows the addition of new hypotheses. Modifications of hypotheses can certainly have an impact on statistical inference for the assessment of treatment effect. More discussion is given in Chapter 4 of this book.

1.5.4 Adaptive dose-escalation trials

In clinical research, the response in a dose-response study could be a biological response for safety or efficacy. For example, in a dose-toxicity study, the goal is to determine the maximum tolerable dose (MTD). On the other hand, in a dose-efficacy response study, the primary objective is usually to address one or more of the following questions: (i) Is there any evidence of a drug effect? (ii) What is the nature of the dose-response relationship? and (iii) What is the optimal dose? In practice, it is always a concern how to evaluate the dose-response relationship with limited resources within a relatively tight time frame. This concern led to proposed designs that allow fewer patients to be exposed to toxicity and more patients to be treated at potentially efficacious dose levels. Such designs also allow pharmaceutical companies to fully utilize their resources for the development of new drug products. In Chapter 5, we provide a brief background of dose-escalation trials in oncology. We will review the continual reassessment method (CRM) proposed by O'Quigley, Pepe, and Fisher (1990) for phase I oncology trials, and we will study in detail the hybrid frequentist-Bayesian adaptive approach for both efficacy and toxicity (Chang, Chow, and Pong, 2005).
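The Bayesian updating at the heart of the CRM can be sketched with a one-parameter power model evaluated on a grid. The skeleton (prior guesses of toxicity probabilities), the Exp(1) prior, the grid bounds, and the target rate below are illustrative assumptions, not the specification reviewed in Chapter 5.

```python
import numpy as np

def crm_next_dose(skeleton, doses_given, toxicities, target=0.25):
    """One-parameter power-model CRM in the spirit of O'Quigley, Pepe,
    and Fisher (1990): P(toxicity | dose k) = skeleton[k] ** a, with an
    Exp(1) prior on a, updated numerically on a grid. Returns the dose
    level whose posterior mean toxicity probability is closest to the
    target, together with the posterior mean toxicity probabilities."""
    skeleton = np.asarray(skeleton, dtype=float)
    a = np.linspace(0.05, 5.0, 1000)           # grid for the parameter a
    post = np.exp(-a)                          # Exp(1) prior density
    for k, y in zip(doses_given, toxicities):  # multiply in each likelihood term
        p = skeleton[k] ** a
        post = post * (p if y else 1.0 - p)
    w = post / post.sum()                      # discrete posterior weights
    post_tox = np.array([(w * skeleton[k] ** a).sum()
                         for k in range(len(skeleton))])
    return int(np.argmin(np.abs(post_tox - target))), post_tox

skeleton = [0.05, 0.10, 0.20, 0.35, 0.50]      # assumed prior toxicity guesses
# Three patients treated at levels 2, 2, 3; one toxicity observed at level 3.
next_level, post_tox = crm_next_dose(skeleton, [2, 2, 3], [0, 0, 1])
```

After each cohort the posterior is re-computed and the next patient is assigned to the level whose estimated toxicity probability is closest to the target, which is how the design steers patients away from overly toxic doses.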

1.5.5 Adaptive group sequential design

In practice, flexible trials usually refer to trials that utilize interim monitoring based on group sequential and adaptive methodology for (i) early stopping for clinical benefit or harm, (ii) early stopping for futility, (iii) sample size re-adjustment, and (iv) re-designing the study in midstream. An adaptive group sequential design is very popular for the following two reasons. First, the clinical endpoint


is a moving target. The sponsors and/or investigators may change their minds regarding the clinically meaningful effect size after the trial starts. Second, it is common practice to request a small budget at the design stage and then seek supplemental funding for increasing the sample size after seeing the interim data.

To protect the overall type I error rate in an adaptive design with respect to adaptations in some design parameters, many authors have proposed procedures using observed treatment effects. This leads to the justification for the commonly used two-stage adaptive design, in which the data from the two stages are independent and the first-stage data are used for adaptation (see, e.g., Proschan and Hunsberger, 1995; Cui, Hung, and Wang, 1999; Liu and Chi, 2001). In recent years, the concept of the two-stage adaptive design has led to the development of the adaptive group sequential design, which refers to a design that uses observed (or estimated) treatment differences at interim analyses to modify the design and sample size adaptively (e.g., Shen and Fisher, 1999; Cui, Hung, and Wang, 1999; Posch and Bauer, 1999; Lehmacher and Wassmer, 1999).
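To give the flavor of how such two-stage procedures preserve the type I error rate, the following is a minimal sketch of Fisher's product criterion (Bauer and Köhne, 1994), one classical combination rule for independent stage-wise p-values; it is presented here as a standard illustration rather than as any of the specific procedures cited above, and the significance level is an assumed example value.

```python
import math

def fisher_combination_cutoff(alpha=0.05):
    """Critical value c_alpha for Fisher's product test with two stages:
    reject if p1 * p2 <= c_alpha, where c_alpha = exp(-q / 2) and q is
    the (1 - alpha) quantile of a chi-square with 4 degrees of freedom.
    For 4 df the survival function has the closed form
    S(x) = exp(-x / 2) * (1 + x / 2), so q is found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(200):                 # bisection on S(x) = alpha
        mid = (lo + hi) / 2
        if math.exp(-mid / 2) * (1 + mid / 2) > alpha:
            lo = mid
        else:
            hi = mid
    q = (lo + hi) / 2
    return math.exp(-q / 2)

def two_stage_reject(p1, p2, alpha=0.05):
    """Reject the null if the product of the independent stage-wise
    p-values falls below the pre-specified cutoff."""
    return p1 * p2 <= fisher_combination_cutoff(alpha)
```

Because the cutoff is fixed in advance, any design adaptation between the two stages that leaves the stage-wise p-values valid and independent does not inflate the overall level; at alpha = 0.05 the cutoff is roughly 0.0087.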

In clinical research, it is desirable to speed up the trial and at the same time reduce its cost. The ultimate goal is to get the products to the marketplace sooner. As a result, flexible methods for adaptive group sequential design and monitoring are key to achieving this goal. With the availability of new technology such as electronic data capture, adaptive group sequential designs in conjunction with the new technology will provide an integrated solution to the logistical and statistical complexities of monitoring trials in flexible ways without biasing the final conclusions. Further discussion regarding the application of adaptive group sequential designs in clinical trials can be found in Chapter 6.

1.5.6 Adaptive sample size adjustment

As indicated earlier, an adaptive design is very attractive to sponsors in early clinical development because it allows modifications of the trial to meet specific needs during the trial within a limited budget/resources and target timelines. However, an adaptive design suffers from a loss of power to detect a clinically meaningful difference for the target patient population under the actual patient population, due to the bias/variation introduced to the trial as the result of changes in study parameters during the conduct of the trial. To account for the expected and/or unexpected bias/variation, statistical procedures for sample size calculation must be adjusted to achieve the desired power.


For example, if the study dose, regimen, and/or treatment duration have been adjusted during the conduct of the trial, not only may the actual patient population be different from the target patient population, but the baseline for the clinically meaningful difference to be detected may also have changed. In this case, the sample size required for achieving the desired power of correctly detecting a clinically meaningful difference, based on clinical data collected from the actual patient population, definitely needs adjustment.

It should be noted that procedures for sample size calculation based on power analysis of an adaptive design with respect to specific changes in study parameters are very different from the standard methods. The procedures for sample size calculation could be very complicated for a multiple adaptive design (or a combined adaptive design) involving more than one study parameter. In practice, statistical tests for a null hypothesis of no treatment difference may not be tractable under a multiple adaptive design. Chapter 7 provides several methods for adaptive sample size adjustment that are useful for multiple adaptive designs.
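As a simple numerical sketch of why adjustment is needed (illustrative values only, not the procedures of Chapter 7), suppose a trial is sized for a standardized effect size of 0.5, but modifications to the trial shift the actual patient population so that the detectable standardized effect shrinks to 0.4; the required sample size then grows accordingly.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sample z-test detecting a
    standardized effect size (mean difference divided by sigma):
    n = 2 * (z_{1-alpha/2} + z_{power})^2 / effect_size^2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_a + z_b) ** 2 / effect_size ** 2)

# Planned effect size 0.5 under the target population ...
n_planned = n_per_group(0.5)        # 63 per group
# ... but after modifications the detectable standardized effect
# is assumed to shrink to 0.8 * 0.5 = 0.4, so the sample size grows.
n_adjusted = n_per_group(0.8 * 0.5)  # 99 per group
```

Even this mild, single-parameter shift raises the per-group sample size from 63 to 99; with multiple simultaneous adaptations the recalculation is correspondingly more involved.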

1.5.7 Adaptive seamless phase II/III design

A phase II clinical trial is often a dose-response study, where the goal is to find the appropriate dose level for the phase III trials. It is desirable to combine phases II and III so that the data can be used more efficiently and the duration of drug development can be reduced. A seamless phase II/III trial design refers to a program that addresses within a single trial objectives that are normally achieved through separate trials in phases IIb and III (Gallo et al., 2006). An adaptive seamless phase II/III design is a seamless phase II/III trial design that uses data from patients enrolled before and after the adaptation in the final analysis (Maca et al., 2006). Bauer and Kieser (1999) provide a two-stage method for this purpose, where the investigators can terminate the trial entirely or drop a subset of regimens for lack of efficacy after the first stage. As pointed out by Sampson and Sill (2005), their procedure is highly flexible, and the distributional assumptions are kept to a minimum, which makes the design useful in a number of settings. However, because of the generality of the method, it is difficult, if not impossible, to construct confidence intervals. Sampson and Sill (2005) derived a uniformly most powerful conditionally unbiased test for a normal endpoint. For other types of endpoints, no comparable results are available; thus, it is suggested that computer trial simulation be used in such cases. More information is provided in Chapter 8.


1.5.8 Adaptive treatment switching

For the evaluation of the efficacy and safety of a test treatment for progressive diseases such as cancer and HIV, a parallel-group, active-control, randomized clinical trial is often conducted. Under such a trial, qualified patients are randomly assigned to receive either an active control (a standard therapy or a treatment currently available in the marketplace) or the test treatment under investigation. Patients are allowed to switch from one treatment to the other, for ethical reasons, if there is a lack of response or there is evidence of disease progression. In practice, it is not uncommon for up to 80% of patients to switch from one treatment to another. This certainly has an impact on the evaluation of the efficacy of the test treatment. Despite allowing a switch between the two treatments, many clinical studies compare the test treatment with the active-control agent as if no patients had ever switched. Sommer and Zeger (1991) referred to the treatment effect among patients who complied with treatment as biological efficacy. Branson and Whitehead (2002) widened the concept of biological efficacy to encompass the treatment effect as if all patients adhered to their original randomized treatments in clinical studies allowing treatment switching.

The problem of treatment switching is commonly encountered in cancer trials. In cancer trials, most investigators would allow patients to come off their current treatment and switch to another treatment (either the study treatment or a rescue treatment) upon disease progression, for ethical reasons. However, treatment switching during the conduct of the trial presents a challenge to clinical scientists (especially biostatisticians) regarding the analysis of primary study endpoints such as median survival time. Under certain assumptions, Shao, Chang, and Chow (2005) proposed a method for the estimation of median survival time when treatment switching occurs during the course of the study. Several methods for adaptive treatment switching are reviewed in Chapter 9.

1.5.9 Bayesian and hybrid approaches

Drug development is a sequence of decision-making processes in which decisions are made based on constantly updated information. The Bayesian approach naturally fits this mechanism. However, in the current regulatory setting, the Bayesian approach is not yet accepted as the basis for the approval of a drug. Therefore, it is desirable to use Bayesian approaches to optimize the trial and increase the probability


of success under the current frequentist criteria for approval. In the near future, it is expected that drug approval criteria will become Bayesian. In addition, a fully Bayesian approach is important because it can provide more informative results and optimality criteria for drug approval based on the risk-benefit ratio, rather than on a subjectively (arbitrarily) set α = 0.05, as frequentists do.

1.5.10 Clinical trial simulation

It should be noted that for a given adaptive design, it is very likely that adaptations will be made to more than one study parameter simultaneously during the conduct of the clinical trial. To assess the impact of changes in a specific study parameter, a typical approach is to perform a sensitivity analysis by fixing the other study parameters. In practice, the assessment of the overall impact of changes in each study parameter is almost impossible due to possible confounding and/or masking effects among changes in study parameters. As a result, it is suggested that a clinical trial simulation be conducted to examine the individual and/or overall impact of changes in multiple study parameters. In addition, the performance of a given adaptive design can be evaluated through the conduct of a clinical trial simulation in terms of its sensitivity, robustness, and/or empirical probability of reproducibility. It should be noted, however, that a clinical trial simulation should be conducted in such a way that the simulated clinical data reflect the real situation of the clinical trial after all of the modifications are made to the trial procedures and/or statistical procedures. In practice, it is therefore suggested that assumptions regarding the sources of bias/variation resulting from modifications of the on-going trial be identified and taken into consideration when conducting the clinical trial simulation.
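As a minimal illustration of the mechanics of clinical trial simulation (an assumed toy design, not one of the book's case studies), the empirical power of a simple two-arm trial can be estimated by simulating the whole trial repeatedly; the sample size, effect size, and number of simulation runs below are illustrative assumptions.

```python
import random
from statistics import NormalDist, mean

def simulate_power(n_per_arm, effect, sigma=1.0, alpha=0.05,
                   n_sim=4000, seed=12345):
    """Empirical power of a two-sample z-test, estimated by simulating
    the trial n_sim times under the assumed treatment effect."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, sigma) for _ in range(n_per_arm)]      # control
        y = [rng.gauss(effect, sigma) for _ in range(n_per_arm)]   # test arm
        se = (2 * sigma ** 2 / n_per_arm) ** 0.5
        z = (mean(y) - mean(x)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim

# A design sized for roughly 80% power at a standardized effect of 0.5.
power = simulate_power(n_per_arm=63, effect=0.5)
```

In the same way one would, for an adaptive design, build the assumed sources of bias/variation into the data-generating step and re-estimate power, type I error, or reproducibility under each contemplated modification.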

1.5.11 Case studies

As pointed out by Li (2006), the use of adaptive design methods provides a second chance to re-design the trial after seeing data internally or externally at interim. However, it may introduce so-called operational biases such as selection bias, method of evaluation, early withdrawal, and modification of treatments. Consequently, the adaptation employed may inflate the type I error rate. Li (2006) suggested a couple of principles for implementing adaptive designs in clinical trials: (i) the adaptation should not alter trial conduct, and (ii) the type I error rate should be preserved. Following these principles, some studies with complicated adaptations may be more successful than others. The successful


experience with certain adaptive designs in clinical trials is important to investigators in clinical research and development. For illustration purposes, some successful case studies, including the implementation of an adaptive group sequential design (Cui, Hung, and Wang, 1999), an adaptive dose-escalation design (Chang and Chow, 2005), and an adaptive seamless phase II/III trial design (Maca et al., 2006), are provided in the last chapter of this book.

1.6 Aims and Scope of the Book

This is intended to be the first book entirely devoted to the use of adaptive design methods in clinical trials. It covers the statistical issues that may occur at various stages of the adaptive design and analysis of clinical trials. Our goal is to provide a useful desk reference and a state-of-the-art examination of this area for scientists and researchers engaged in clinical research and development, those in government regulatory agencies who make decisions in the pharmaceutical review and approval process, and biostatisticians who provide statistical support for clinical trials and related clinical investigations. More importantly, we would like to provide graduate students in the areas of clinical development and biostatistics an advanced textbook on the use of adaptive design methods in clinical trials. We hope that this book can serve as a bridge among the pharmaceutical industry, government regulatory agencies, and academia.

The scope of this book covers statistical issues that are commonly encountered when modifications of study procedures and/or statistical procedures are made during the course of the study. In this chapter, the definition, regulatory requirements, target patient population, and statistical issues of adaptive design and analysis for clinical trials have been discussed. In the next chapter, the impact of modifications made to trial procedures and/or statistical procedures, as the result of protocol amendments, on the target patient population, statistical inference, and power analysis for sample size calculation is discussed. In Chapter 3, various adaptive randomization procedures for treatment allocation are discussed. Chapter 4 covers adaptive design methods for modifications of hypotheses, including the addition of new hypotheses after the review of interim data. Chapter 5 provides an overall review of adaptive design methods for dose selection, especially in dose-finding and dose-response relationship studies in early clinical development. Chapter 6 introduces the commonly used adaptive group sequential designs in clinical trials. Blinded procedures for sample size re-estimation are given in Chapter 7. Statistical tests for adaptive seamless phase II/III designs and statistical inference for switching from one treatment to another


adaptively, together with the corresponding practical issues that may arise, are studied in Chapter 8 and Chapter 9, respectively. Bayesian and hybrid approaches for the use of adaptive design methods in clinical trials are outlined in Chapter 10. Chapter 11 provides an introduction to the methodology of clinical trial simulation for evaluating the performance of adaptive design methods under the various adaptive designs that are commonly used in clinical development. Case studies regarding the implementation of an adaptive group sequential design, an adaptive dose-escalation design, and an adaptive seamless phase II/III trial design in clinical trials are discussed in Chapter 12.

For each chapter, whenever possible, real examples from clinical trials are included to demonstrate the use of adaptive design methods in clinical trials, including clinical/statistical concepts, interpretations, and their relationships and interactions. Comparisons regarding the relative merits and disadvantages of the adaptive design methods in clinical research and development are discussed whenever deemed appropriate. In addition, where applicable, topics for future research are provided. All computations in this book were performed using version 8.20 of SAS. Other statistical packages such as S-Plus can also be applied.


CHAPTER 2

Protocol Amendment

In clinical trials, it is not uncommon to modify the trial and/or statistical procedures of on-going trials due to scientific/statistical justifications, medical considerations, regulatory concerns, and/or business interests/decisions. When modifications of trial and/or statistical procedures are made, a protocol amendment must be filed with the individual institutional review boards (IRBs) for review/approval before implementation. As discussed in the previous chapter, a major (or significant) adaptation of the trial and/or statistical procedures of an on-going clinical trial could alter the target patient population of the trial and consequently lead to a totally different clinical trial that is unable to answer the scientific/medical questions the trial is intended to address. In this chapter, we will examine the impact of protocol amendments on the target patient population through the assessment of a shift parameter, a scale parameter, and a sensitivity index. The impact of protocol amendments on the power for detecting a clinically significant difference and on the corresponding statistical inference is also studied.

In the next section, a shift parameter, a scale parameter, and a sensitivity index that provide useful measures of the change in the target patient population as the result of protocol amendments are defined. Section 2.2 provides estimates of the shift and scale parameters of the target patient population and of the sensitivity index, both conditionally and unconditionally, assuming that the resultant actual patient population after modifications is random. The impact of protocol amendments on statistical inference and on power analysis for sample size calculation is discussed in Sections 2.3 and 2.4, respectively. Section 2.5 provides statistical inference for the treatment effect when there are protocol amendments, assuming that changes in the protocol are made based on one or a few covariates. A brief concluding remark is given in the last section of this chapter.

2.1 Actual Patient Population

As indicated earlier, in clinical trials it is not uncommon to modify the trial and/or statistical procedures of on-going trials. However, it should be noted that any adaptation made to the trial and/or statistical procedures


may introduce bias and/or variation to the data collected from the trial. Most importantly, it may result in a similar but slightly different target patient population. We will refer to such a patient population as the actual patient population under study. After a given protocol amendment, we denote the actual patient population by (µ1, σ1), where µ1 = µ + ε is the population mean of the primary study endpoint and σ1 = Cσ (C > 0) is the population standard deviation of the primary study endpoint. In this chapter, we will refer to ε and C as the shift and scale parameters of the target patient population (µ, σ) after the modification is made. As can be seen, the difference between the actual patient population (µ1, σ1) and the original target patient population (µ, σ) can be characterized as follows:

    \left| \frac{\mu_1}{\sigma_1} \right| = \left| \frac{\mu + \varepsilon}{C\sigma} \right| = |\Delta| \left| \frac{\mu}{\sigma} \right|,

where

    \Delta = \frac{1 + \varepsilon/\mu}{C}

is a measure of the change in the signal-to-noise ratio of the actual patient population as compared to the original target population. Chow, Shao, and Hu (2002) refer to ∆ as the sensitivity index. As a result, the signal-to-noise ratio of the actual patient population can be expressed as the sensitivity index times the effect size (adjusted for standard deviation) of the original target patient population. For example, when ε = 0 and C = 1 (i.e., the modification made has no impact on the target patient population), then ∆ = 1 (i.e., there is no difference between the target patient population and the actual patient population after the modification). As another example, if ε = 0 and C = 0.5 (i.e., there is no change in mean but the variation has been reduced by 50% as the result of the modification), then ∆ = 2, which is an indication of sensitivity (i.e., the actual effect size to be detected is much larger than the originally planned effect size under the target patient population) as the result of the modifications made.

Table 2.1 Changes in Sensitivity Indices With Respect to ε/µ and C

              Inflation of Variability    Deflation of Variability
    ε/µ (%)     C (%)       ∆               C (%)       ∆
    −20         100         0.800           —           —
    −20         110         0.727           90          0.889
    −20         120         0.667           80          1.000
    −20         130         0.615           70          1.143
    −10         100         0.900           —           —
    −10         110         0.818           90          1.000
    −10         120         0.750           80          1.125
    −10         130         0.692           70          1.286
     −5         100         0.950           —           —
     −5         110         0.864           90          1.056
     −5         120         0.792           80          1.188
     −5         130         0.731           70          1.357
      0         100         1.000           —           —
      0         110         0.909           90          1.111
      0         120         0.833           80          1.250
      0         130         0.769           70          1.429
      5         100         1.050           —           —
      5         110         0.955           90          1.167
      5         120         0.875           80          1.313
      5         130         0.808           70          1.500
     10         100         1.100           —           —
     10         110         1.000           90          1.222
     10         120         0.917           80          1.375
     10         130         0.846           70          1.571
     20         100         1.200           —           —
     20         110         1.091           90          1.333
     20         120         1.000           80          1.500
     20         130         0.923           70          1.714

Note: the entry for ε/µ = −10% and C = 70%, printed as 1.571 in the source, has been corrected to 1.286 = 0.9/0.7 in accordance with the definition of ∆.

In practice, each modification to the trial procedures may result in a similar but slightly different actual patient population. Denote by (µi, σi) the actual patient population after the ith modification of the trial procedures, where µi = µ + εi and σi = Ciσ, i = 0, 1, . . . , m. Note that i = 0 reduces to the original target patient population (µ, σ); that is, when i = 0, ε0 = 0 and C0 = 1. After m protocol amendments (i.e., m modifications are made to the study protocol), the resultant actual patient population becomes (µm, σm), where

    \mu_m = \mu + \sum_{i=1}^{m} \varepsilon_i \quad \text{and} \quad \sigma_m = \Big( \prod_{i=1}^{m} C_i \Big) \sigma.

It should be noted that (εi, Ci), i = 1, . . . , m are in fact random variables. As a result, the resultant actual patient population following certain modifications to the trial procedures is a moving target patient population rather than a fixed target patient population. It should be noted that the effect of εi could be offset by Ci for a given modification i, as well as by (εj, Cj) for another modification j. As a result, estimates of the effects of (εi, Ci), i = 1, . . . , m are difficult, if not impossible, to obtain. In practice, it is desirable to limit the combined effects of (εi, Ci), i = 0, . . . , m to an acceptable range for a valid and unbiased assessment of the treatment effect on the target patient population based on clinical data collected from the actual patient population. Table 2.1 provides changes in ∆ with respect to various values


Figure 2.1 3-D Plot of Sensitivity Index Versus ε/µ and C.

of ε/µ and C. As can be seen from Table 2.1, a shift in the mean of the target patient population (i.e., ε) may be offset by the scale parameter C, i.e., by an inflation or reduction of the variability as the result of the modification made.

For example, a shift of 10% (−10%) in the mean could be offset by a 10% inflation (reduction) of the variability. As a result, ∆ may not be sensitive, due to the confounding (or masking) effect between ε and C. However, when C (ε) remains unchanged, ∆ is a reasonable measure of the sensitivity, since ∆ moves away from unity (no change) as ε (C) increases. To provide a better understanding of the changes in ∆ with respect to various values of ε/µ and C, a three-dimensional plot is given in Figure 2.1. Figure 2.1 confirms that when C (ε) remains unchanged, ∆ moves away from unity (i.e., away from no change in the target patient population) as ε (C) increases.
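The entries of Table 2.1 follow directly from the definition ∆ = (1 + ε/µ)/C; a few of them, including the offsetting example just described, can be reproduced as follows.

```python
def sensitivity_index(eps_over_mu, c):
    """Sensitivity index Delta = (1 + eps/mu) / C, where eps_over_mu is
    the relative mean shift (as a fraction) and c is the scale change
    of the standard deviation (as a fraction)."""
    return (1 + eps_over_mu) / c

# A -10% mean shift largely offset by a 10% inflation of variability ...
delta_offset = sensitivity_index(-0.10, 1.10)   # about 0.818 (Table 2.1)
# ... versus a +20% shift with variability deflated to 70% of its value.
delta_large = sensitivity_index(0.20, 0.70)     # about 1.714 (Table 2.1)
```

The first case stays close to unity, illustrating how ε and C can mask one another, while the second moves far from unity, illustrating genuine sensitivity.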

2.2 Estimation of Shift and Scale Parameters

The shift and scale parameters (i.e., ε and C) of the target population after a modification (or a protocol amendment) is made can be estimated by

    \hat{\varepsilon} = \hat{\mu}_{\mathrm{Actual}} - \hat{\mu}

and

    \hat{C} = \hat{\sigma}_{\mathrm{Actual}} / \hat{\sigma},

respectively, where (\hat{\mu}, \hat{\sigma}) and (\hat{\mu}_{\mathrm{Actual}}, \hat{\sigma}_{\mathrm{Actual}}) are some estimates of (µ, σ) and (µActual, σActual), respectively. As a result, the sensitivity index can be estimated by

    \hat{\Delta} = \frac{1 + \hat{\varepsilon}/\hat{\mu}}{\hat{C}}.

[Figure 2.2 Plot of N(x; µ, σ²) versus N(x; µµ, σ² + σµ²). The solid line is N(x; µ, σ²) and the dotted line is N(x; µµ, σ² + σµ²).]

Estimates for µ and σ can be obtained based on data collected prior to any protocol amendments. Assume that the response variable x is distributed as N(µ, σ²). Let x_{ji}, i = 1, . . . , n_j; j = 0, . . . , m, be the response of the ith patient after the jth protocol amendment. The total number of patients is then given by

    n = \sum_{j=0}^{m} n_j.

Note that n_0 is the number of patients in the study prior to any protocol amendments. Based on x_{0i}, i = 1, . . . , n_0, the maximum likelihood estimates of µ and σ² can be obtained as follows:

    \hat{\mu} = \frac{1}{n_0} \sum_{i=1}^{n_0} x_{0i},   (2.1)

    \hat{\sigma}^2 = \frac{1}{n_0} \sum_{i=1}^{n_0} (x_{0i} - \hat{\mu})^2.   (2.2)

To obtain estimates for µActual and σActual, for illustration purposes, we will only consider the case where µActual is random and σActual is fixed. For other cases, such as (i) µActual is fixed but σActual is random, (ii) both µActual and σActual are random, (iii) µActual, σActual, and n_j are all random, and (iv) µActual, σActual, n_j, and m are all random, estimates for µActual and σActual may be obtained following a similar idea.

2.2.1 The case where µActual is random and σActual is fixed

For convenience’s sake, we set µActual = µ and σActual = σ for the deri-vation of estimates of ε and C. Assume that x conditional on µ, i.e.,x|µ=µActual follows a normal distribution N(µ, σ 2). That is,

x|µ=µActual ∼ N(µ, σ 2), (2.3)

where µ is distributed as N(µµ, σ 2µ) and σ, µµ, and σµ are some unknown

constants. Thus, the unconditional distribution of x is a mixed normaldistribution given below

N(x;µ, σ 2)N(µ;µµ, σ 2µ)dµ =

1√2πσ 2

1√

2πσ 2µ

∫ ∞

−∞e− (x−µ)2

2 σ2− (µ−µµ)2

2 σ2µ dµ,

(2.4)

where x ∈ (−∞, ∞). It can be verified that the above mixed normaldistribution is a normal distribution with mean µµ and variance σ 2+σ 2

µ

(see proof given in the next section). In other words, x is distributed asN(µµ, σ 2 + σ 2

µ).As it can be seen from Figure 2.2 when µActual is random and σActual

is fixed, the shift from the target patient population to the actual pop-ulation could be substantial, especially for large µµ and σ 2

µ.
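The stated marginal distribution N(µµ, σ² + σµ²) can also be checked by simulation: draw µ from N(µµ, σµ²), then draw x from N(µ, σ²), and compare the sample moments of x with the theoretical values. The parameter values below are arbitrary illustrative choices.

```python
import random

def sample_marginal(mu_mu, sigma_mu, sigma, n, seed=7):
    """Draw from the hierarchical model mu ~ N(mu_mu, sigma_mu^2),
    x | mu ~ N(mu, sigma^2), and return the unconditional x sample."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        mu = rng.gauss(mu_mu, sigma_mu)   # random population mean
        xs.append(rng.gauss(mu, sigma))   # response given that mean
    return xs

xs = sample_marginal(mu_mu=1.0, sigma_mu=0.5, sigma=2.0, n=200_000)
m = sum(xs) / len(xs)
v = sum((x - m) ** 2 for x in xs) / len(xs)
# Theory: mean = mu_mu = 1.0, variance = sigma^2 + sigma_mu^2 = 4.25.
```

The sample mean and variance agree with µµ and σ² + σµ² up to Monte Carlo error, which is the content of the claim proved analytically in the text.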

Maximum Likelihood Estimation. Given a protocol amendment j and independent observations x_{ji}, i = 1, 2, . . . , n_j, the likelihood function is given by

    l_j = \prod_{i=1}^{n_j} \left( \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_{ji} - \mu_j)^2}{2\sigma^2}} \right) \frac{1}{\sqrt{2\pi\sigma_\mu^2}}\, e^{-\frac{(\mu_j - \mu_\mu)^2}{2\sigma_\mu^2}},   (2.5)

where µ_j is the population mean after the jth protocol amendment. Thus, given m protocol amendments and observations x_{ji}, i = 1, . . . , n_j; j = 0, . . . , m, the likelihood function can be written as

    L = \prod_{j=0}^{m} l_j = \left( 2\pi\sigma^2 \right)^{-n/2} \prod_{j=0}^{m} \left[ e^{-\sum_{i=1}^{n_j} \frac{(x_{ji} - \mu_j)^2}{2\sigma^2}} \frac{1}{\sqrt{2\pi\sigma_\mu^2}}\, e^{-\frac{(\mu_j - \mu_\mu)^2}{2\sigma_\mu^2}} \right].   (2.6)


Hence, the log-likelihood function is given by

    LL = -\frac{n}{2} \ln\left( 2\pi\sigma^2 \right) - \frac{m+1}{2} \ln\left( 2\pi\sigma_\mu^2 \right)
         - \frac{1}{2\sigma^2} \sum_{j=0}^{m} \sum_{i=1}^{n_j} (x_{ji} - \mu_j)^2
         - \frac{1}{2\sigma_\mu^2} \sum_{j=0}^{m} (\mu_j - \mu_\mu)^2.   (2.7)

Based on (2.7), the maximum likelihood estimates (MLEs) of µµ, σµ², and σ² can be obtained as follows:

    \hat{\mu}_\mu = \frac{1}{m+1} \sum_{j=0}^{m} \hat{\mu}_j,   (2.8)

where

    \hat{\mu}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} x_{ji},   (2.9)

    \hat{\sigma}_\mu^2 = \frac{1}{m+1} \sum_{j=0}^{m} (\hat{\mu}_j - \hat{\mu}_\mu)^2,   (2.10)

and

    \hat{\sigma}^2 = \frac{1}{n} \sum_{j=0}^{m} \sum_{i=1}^{n_j} (x_{ji} - \hat{\mu}_j)^2.   (2.11)

Note that \partial LL / \partial \mu_j = 0 leads to

    \sigma_\mu^2 \sum_{i=1}^{n_j} x_{ji} + \sigma^2 \mu_\mu - \left( n_j \sigma_\mu^2 + \sigma^2 \right) \mu_j = 0.

In general, when σµ² and σ² are comparable and n_j is reasonably large, σ²µµ is negligible as compared to σµ² ∑_{i=1}^{n_j} x_{ji}, and σ² is negligible as compared to n_j σµ². Thus, we have the approximation (2.9), which greatly simplifies the calculation. Based on these MLEs, estimates of the shift parameter (i.e., ε) and the scale parameter (i.e., C) can be obtained as

    \hat{\varepsilon} = \hat{\mu}_\mu - \hat{\mu}, \qquad \hat{C} = \hat{\sigma}_{\mathrm{Actual}} / \hat{\sigma},

respectively, where \hat{\mu} and \hat{\sigma} are the estimates (2.1) and (2.2) based on the data collected prior to any protocol amendment, and \hat{\sigma}_{\mathrm{Actual}} is the square root of (2.11). Consequently, the sensitivity index can be estimated by simply replacing ε, µ, and C in ∆ with their corresponding estimates.
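The estimators (2.8) through (2.11) translate directly into code; the per-amendment data below are made-up numbers used only to exercise the formulas.

```python
def mle_across_amendments(groups):
    """MLEs (2.8)-(2.11): groups[j] holds the observations collected
    after the j-th protocol amendment (j = 0 is the original protocol).
    Returns (mu_mu_hat, sigma_mu_sq_hat, sigma_sq_hat)."""
    m_plus_1 = len(groups)                       # m + 1 cohorts in total
    n = sum(len(g) for g in groups)              # total number of patients
    mu_j = [sum(g) / len(g) for g in groups]                          # (2.9)
    mu_mu = sum(mu_j) / m_plus_1                                      # (2.8)
    sigma_mu_sq = sum((mj - mu_mu) ** 2 for mj in mu_j) / m_plus_1    # (2.10)
    sigma_sq = sum((x - mj) ** 2
                   for g, mj in zip(groups, mu_j)
                   for x in g) / n                                    # (2.11)
    return mu_mu, sigma_mu_sq, sigma_sq

# Hypothetical data: two amendments after the original protocol (m = 2).
groups = [[9.8, 10.1, 10.4], [11.0, 10.6, 11.2], [12.1, 11.7]]
mu_mu, s_mu_sq, s_sq = mle_across_amendments(groups)
```

Note that (2.8) and (2.10) average over cohorts with equal weight 1/(m + 1), while (2.11) pools the within-cohort deviations over all n patients; this is exactly the structure that changes under the conditional estimates (2.13) through (2.15) given in the Remarks below.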

Remarks  In the above derivation, we account for the sequence of protocol amendments by assuming that the target patient population has changed (or shifted) after each protocol amendment. Alternatively, if the shift in the target patient population is due not to the sequence of protocol amendments but to random protocol deviations or violations, we may obtain the following alternative (conditional) estimates.

Given m protocol deviations or violations and independent observations x_ji, i = 1, . . . , n_j; j = 0, . . . , m, the likelihood function can be written as

    L = \prod_{j=0}^{m}\prod_{i=1}^{n_j} l_{ji} = \prod_{j=0}^{m}\left[\left(2\pi\sigma^2\right)^{-n_j/2}e^{-\sum_{i=1}^{n_j}\frac{(x_{ji}-\mu_j)^2}{2\sigma^2}}\left(2\pi\sigma_\mu^2\right)^{-n_j/2}e^{-\frac{n_j(\mu_j-\mu_\mu)^2}{2\sigma_\mu^2}}\right].

Thus, the log-likelihood function is given by

    LL = -\frac{n}{2}\ln\left(2\pi\sigma^2\right) - \frac{n}{2}\ln\left(2\pi\sigma_\mu^2\right) - \frac{1}{2\sigma^2}\sum_{j=0}^{m}\sum_{i=1}^{n_j}(x_{ji}-\mu_j)^2 - \frac{1}{2\sigma_\mu^2}\sum_{j=0}^{m}n_j(\mu_j-\mu_\mu)^2.    (2.12)

As a result, the maximum likelihood estimates of µ_µ, σ²_µ, and σ² are given by

    \hat\mu_\mu = \frac{1}{n}\sum_{j=0}^{m}\sum_{i=1}^{n_j} x_{ji} = \hat\mu,    (2.13)

    \hat\sigma_\mu^2 = \frac{1}{n}\sum_{j=0}^{m}n_j\left(\hat\mu_j-\hat\mu_\mu\right)^2,    (2.14)

    \hat\sigma^2 = \frac{1}{n}\sum_{j=0}^{m}\sum_{i=1}^{n_j}\left(x_{ji}-\hat\mu_j\right)^2,    (2.15)

where

    \hat\mu_j \approx \frac{1}{n_j}\sum_{i=1}^{n_j} x_{ji}.
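The only difference from (2.8) and (2.10) is that each period is now weighted by its sample size n_j and averaged over n rather than over m + 1. A minimal sketch of (2.13) through (2.15), with a hypothetical function name:

```python
import numpy as np

def mle_random_deviations(groups):
    """Sample-size-weighted MLEs (2.13)-(2.15) under random
    protocol deviations or violations."""
    n = sum(len(g) for g in groups)
    w = np.array([len(g) for g in groups])                         # n_j
    mu_j = np.array([g.mean() for g in groups])
    mu_mu = np.concatenate(groups).mean()                          # (2.13): grand mean
    sigma2_mu = float((w * (mu_j - mu_mu) ** 2).sum()) / n         # (2.14)
    sigma2 = sum(((g - g.mean()) ** 2).sum() for g in groups) / n  # (2.15)
    return mu_mu, sigma2_mu, sigma2
```

On the same two-period toy data as before, µ̂_µ becomes the grand mean 2.4 instead of the unweighted average 2.5 of the period means.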

Similarly, under random protocol violations, the estimates of µ_µ, σ²_µ, and σ² can also be obtained based on the unconditional probability distribution described in (2.4) as follows. Given m protocol amendments and observations x_ji, i = 1, . . . , n_j; j = 0, . . . , m, the likelihood function can be written as

    L = \prod_{j=0}^{m}\prod_{i=1}^{n_j}\frac{1}{\sqrt{2\pi\left(\sigma^2+\sigma_\mu^2\right)}}\,e^{-\frac{(x_{ji}-\mu_\mu)^2}{2(\sigma^2+\sigma_\mu^2)}}.

Hence, the log-likelihood function is given by

    LL = -\frac{n}{2}\ln\left[2\pi\left(\sigma^2+\sigma_\mu^2\right)\right] - \sum_{j=0}^{m}\sum_{i=1}^{n_j}\frac{(x_{ji}-\mu_\mu)^2}{2\left(\sigma^2+\sigma_\mu^2\right)}.    (2.16)

Based on (2.16), the maximum likelihood estimates of µ_µ, σ²_µ, and σ² can be easily found. However, it should be noted that the MLEs of µ_µ and σ²_* = σ² + σ²_µ are unique, but the MLEs of σ² and σ²_µ are not. Thus, we have

    \hat\mu = \hat\mu_\mu = \frac{1}{n}\sum_{j=0}^{m}\sum_{i=1}^{n_j} x_{ji},
    \qquad
    \hat\sigma^2_* = \frac{1}{n}\sum_{j=0}^{m}\sum_{i=1}^{n_j}\left(x_{ji}-\hat\mu\right)^2.

In this case, the sensitivity index is equal to 1. In other words, random protocol deviations or violations (or the sequence of protocol amendments) have no impact on statistical inference on the target patient population. It should be noted, however, that in practice the sequence of protocol amendments usually results in a moving target patient population. As a result, the above estimates of µ_µ and σ²_* are often misused and misinterpreted.

2.3 Statistical Inference

To illustrate the impact of m protocol amendments on statistical inference regarding the target patient population, for simplicity we focus only on statistical inference on ε, C, and ∆ for the case where µ_Actual is random and σ_Actual is fixed. For the cases where (i) µ_Actual is fixed and σ_Actual is random, (ii) both µ_Actual and σ_Actual are random, (iii) µ_Actual, σ_Actual, and n_j are all random, and (iv) µ_Actual, σ_Actual, n_j, and m are all random, the idea described in this section can be applied similarly.

First, we note that the test statistic depends on the sampling procedure, which is a combination of protocol amendment and randomization. The following theorem is useful. We will frequently use the well-known fact that a linear combination of independent variables, each with a normal or asymptotically normal distribution, follows a normal distribution. Specifically,

Theorem 2.1  Suppose that X|µ ∼ N(µ, σ²) and µ ∼ N(µ_µ, σ²_µ). Then

    X \sim N\left(\mu_\mu,\ \sigma^2+\sigma_\mu^2\right).    (2.17)

Proof.  Consider the characteristic function of the normal distribution N(t; µ, σ²):

    \varphi_0(w) = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty}e^{iwt-\frac{(t-\mu)^2}{2\sigma^2}}\,dt = e^{iw\mu-\frac{1}{2}\sigma^2w^2}.    (2.18)

For X|µ ∼ N(µ, σ²) and µ ∼ N(µ_µ, σ²_µ), the characteristic function of X, after exchanging the order of the two integrations, is given by

    \varphi(w) = \int_{-\infty}^{\infty}e^{iw\mu-\frac{1}{2}\sigma^2w^2}\,N\!\left(\mu;\mu_\mu,\sigma_\mu^2\right)d\mu = e^{-\frac{1}{2}\sigma^2w^2}\int_{-\infty}^{\infty}e^{iw\mu}\,N\!\left(\mu;\mu_\mu,\sigma_\mu^2\right)d\mu.

Note that, by (2.18),

    \int_{-\infty}^{\infty}e^{iw\mu}\,N\!\left(\mu;\mu_\mu,\sigma_\mu^2\right)d\mu = e^{iw\mu_\mu-\frac{1}{2}\sigma_\mu^2w^2}

is the characteristic function of the normal distribution N(µ_µ, σ²_µ). It follows that

    \varphi(w) = e^{iw\mu_\mu-\frac{1}{2}\left(\sigma^2+\sigma_\mu^2\right)w^2},

which is the characteristic function of N(µ_µ, σ² + σ²_µ). This completes the proof.
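Theorem 2.1 is easy to check by simulation: drawing the random mean first and then the observation given that mean should reproduce the stated marginal mean and variance. The parameter values below are arbitrary illustrations:

```python
import numpy as np

# Monte Carlo check of Theorem 2.1: if mu ~ N(mu_mu, s2_mu) and
# X | mu ~ N(mu, s2), then marginally X ~ N(mu_mu, s2 + s2_mu).
rng = np.random.default_rng(0)
mu_mu, s2_mu, s2 = 1.0, 0.25, 4.0

mu = rng.normal(mu_mu, np.sqrt(s2_mu), size=200_000)  # random population mean
x = rng.normal(mu, np.sqrt(s2))                       # observation given mu

print(round(x.mean(), 2), round(x.var(), 2))  # close to 1.0 and 4.25
```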

Given protocol amendment j and that x_ji, i = 1, 2, . . . , n_j, are independent and identically distributed (i.i.d.) normal N(µ_j, σ²),

    \hat\mu_j = \frac{1}{n_j}\sum_{i=1}^{n_j}x_{ji}

is conditionally normally distributed with mean µ_j and variance σ²/n_j, i.e.,

    \hat\mu_j \sim N\!\left(\mu_j,\ \frac{\sigma^2}{n_j}\right).

In addition,

    \mu_j \sim N\!\left(\mu_\mu,\ \sigma_\mu^2\right).

Hence, from the above theorem, \hat\mu_j is normally distributed with mean µ_µ and variance σ²/n_j + σ²_µ, i.e.,

    \hat\mu_j \sim N\!\left(\mu_\mu,\ \frac{\sigma^2}{n_j}+\sigma_\mu^2\right),    (2.19)


and

    \hat\mu \sim N\!\left(\mu_\mu,\ \frac{\sigma^2}{(m+1)^2}\sum_{j=0}^{m}\frac{1}{n_j}+\frac{\sigma_\mu^2}{m+1}\right),    (2.20)

where

    \hat\mu = \frac{1}{m+1}\sum_{j=0}^{m}\hat\mu_j.    (2.21)

If n_j is random, i.e., n_j follows a distribution, say f_{n_j}, then we have the mixture

    \hat\mu \sim \sum_{k}f_k\,N\!\left(\mu_\mu,\ \frac{\sigma^2}{(m+1)^2}\cdot\frac{1}{k}+\frac{\sigma_\mu^2}{m+1}\right).    (2.22)

Note that at the end of the study, σ² and σ²_µ can be replaced with their estimates if m and the n_j are sufficiently large. For two independent samples, we have

    \hat\mu_1-\hat\mu_2 \sim N\!\left(\mu_{\mu_1}-\mu_{\mu_2},\ \sigma_p^2\right),

where

    \sigma_p^2 = \frac{\sigma_1^2}{(m_1+1)^2}\sum_{j=0}^{m_1}\frac{1}{n_{1j}}+\frac{\sigma_{\mu_1}^2}{m_1+1}+\frac{\sigma_2^2}{(m_2+1)^2}\sum_{j=0}^{m_2}\frac{1}{n_{2j}}+\frac{\sigma_{\mu_2}^2}{m_2+1},    (2.23)

and m_1 and m_2 are the numbers of modifications made to the two groups, respectively.

2.3.1 Test for equality

To test whether there is a difference between the mean response of a test compound and that of a placebo control or an active control agent, the following hypotheses are usually considered:

    H_0:\ \mu_1=\mu_2 \quad\text{vs}\quad H_a:\ \mu_1\neq\mu_2.

Under the null hypothesis, the test statistic is given by

    z = \frac{\hat\mu_1-\hat\mu_2}{\hat\sigma_p},    (2.24)

where \hat\mu_1 and \hat\mu_2 can be estimated from (2.8) and (2.9), and \hat\sigma_p^2 can be estimated using (2.23) with the variance estimates from (2.10) and (2.11). Under the null hypothesis, the test statistic follows the standard normal distribution in large samples. Thus, we reject the null hypothesis at the α level of significance if |z| > z_{α/2}.


2.3.2 Test for non-inferiority/superiority

As indicated by Chow, Shao, and Wang (2003), the problems of testing non-inferiority and (clinical) superiority can be unified by the following hypotheses:

    H_0:\ \mu_1-\mu_2\le\delta \quad\text{vs}\quad H_a:\ \mu_1-\mu_2>\delta,

where δ is the non-inferiority or superiority margin. When δ > 0, rejection of the null hypothesis indicates superiority of the test compound over the control. When δ < 0, rejection of the null hypothesis indicates non-inferiority of the test compound relative to the control. Under the null hypothesis, the test statistic

    z = \frac{\hat\mu_1-\hat\mu_2-\delta}{\hat\sigma_p}    (2.25)

follows the standard normal distribution in large samples. Thus, we reject the null hypothesis at the α level of significance if z > z_α.

It should be noted that the α level for testing non-inferiority or superiority should be 0.025 rather than 0.05, because when δ = 0 the test statistic should be the same as that for testing equality. Otherwise, one could claim superiority with a δ close to zero merely to obtain an easy statistical significance. In practice, the choice of δ plays an important role in the success of a clinical trial. It is suggested that δ be chosen so that it is both statistically and clinically justifiable. The European Agency for the Evaluation of Medicinal Products recently issued a draft points-to-consider guideline on the choice of the non-inferiority margin (EMEA, 2004). Along this line, Chow and Shao (2006) provided some statistical justification for the choice of δ in clinical trials.

2.3.3 Test for equivalence

For testing equivalence, the following hypotheses are usually considered:

    H_0:\ |\mu_1-\mu_2|>\delta \quad\text{vs}\quad H_a:\ |\mu_1-\mu_2|\le\delta,

where δ is the equivalence limit. The null hypothesis is rejected, and the test compound is concluded to be equivalent to the control, if

    \frac{\hat\mu_1-\hat\mu_2-\delta}{\hat\sigma_p}\le-z_\alpha \quad\text{and}\quad \frac{\hat\mu_1-\hat\mu_2+\delta}{\hat\sigma_p}\ge z_\alpha.    (2.26)

It should be noted that the FDA recommends an equivalence limit of (80%, 125%) for bioequivalence based on geometric means using log-transformed data.
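Read as two one-sided tests, decision rule (2.26) can be sketched as follows. The critical value is hard-coded for α = 0.05 (z_{0.95} ≈ 1.645) to stay within the standard library, and the function name is ours:

```python
def conclude_equivalence(mu1_hat, mu2_hat, sigma_p, delta):
    """Decision rule (2.26): conclude equivalence when both one-sided
    statistics clear the alpha = 0.05 critical value z_{0.95}."""
    z_alpha = 1.6448536269514722  # z_{0.95}
    lower = (mu1_hat - mu2_hat + delta) / sigma_p  # tests mu1 - mu2 > -delta
    upper = (mu1_hat - mu2_hat - delta) / sigma_p  # tests mu1 - mu2 <  delta
    return lower >= z_alpha and upper <= -z_alpha
```

With a zero observed difference, a standard error of 0.1, and a limit δ = 0.5, both one-sided statistics clear ±1.645 and equivalence is concluded; shrinking the limit to δ = 0.1 does not.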


2.4 Sample Size Adjustment

2.4.1 Test for equality

The hypotheses for testing equality of two independent means can be written as

    H_0:\ \varepsilon=\mu_1-\mu_2=0 \quad\text{vs}\quad H_a:\ \varepsilon\neq0.

Under the alternative hypothesis that ε ≠ 0, the power of the above test is given by

    \Phi\!\left(\frac{\varepsilon}{\sigma_p}-z_{\alpha/2}\right)+\Phi\!\left(\frac{-\varepsilon}{\sigma_p}-z_{\alpha/2}\right)\approx\Phi\!\left(\frac{|\varepsilon|}{\sigma_p}-z_{\alpha/2}\right).    (2.27)

Since the true difference ε is unknown, we can estimate the power by replacing ε in (2.27) with its estimate. As a result, the sample size needed to achieve the desired power of 1 − β can be obtained by solving the equation

    \frac{|\varepsilon|}{\sigma_e\sqrt{2/n}}-z_{\alpha/2}=z_\beta,    (2.28)

where n is the sample size per group, and

    \sigma_e=\sqrt{\frac{n}{2}\,\sigma_p^2}=\sqrt{\frac{\sigma^2}{(m+1)^2}\sum_{j=0}^{m}\frac{n}{n_j}+\frac{n\sigma_\mu^2}{m+1}}    (2.29)

under the homogeneous-variance condition and a balanced design. This leads to the sample size formula

    n=\frac{4\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2\sigma_e^2}{\varepsilon^2},    (2.30)

where ε, σ², σ²_µ, m, and r_j = n_j/n are estimates available at the planning stage of a given clinical trial. Since σ_e² itself depends on n, the sample size can be solved iteratively. Note that if n_j = n/(m+1), then

    n=\frac{4\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2\left(\sigma^2+\frac{n\sigma_\mu^2}{m+1}\right)}{\varepsilon^2}.

Solving the above equation for n, we have

    n=\frac{1}{R}\cdot\frac{4\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2\left(\sigma^2+\sigma_\mu^2\right)}{\varepsilon^2}=\frac{1}{R}\,n_{\mathrm{classic}},    (2.31)


where R is the relative efficiency given by

    R=\frac{n_{\mathrm{classic}}}{n}=\left(1+\frac{\sigma_\mu^2}{\sigma^2}\right)\left(1-\frac{4\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2}{\varepsilon^2}\cdot\frac{\sigma_\mu^2}{m+1}\right).    (2.32)

Table 2.2 Relative Efficiency

    m    σ²_µ/σ²    σ²_µ/ε²    R
    0    0.05       0.005      0.83
    1    0.05       0.005      0.94
    2    0.05       0.005      0.98
    3    0.05       0.005      0.99
    4    0.05       0.005      1.00

    Note: α = 0.05, β = 0.1.

Table 2.2 gives R for various values of m and σ²_µ. As can be seen from Table 2.2, an increase in m results in an increase in R and hence a decrease in the required sample size n. Consequently, if the possibility of protocol amendments is ignored at the planning stage, there could be a significant decrease in the desired power of the intended trial. Note that when there is no amendment (i.e., m = 0 and σ²_µ = 0), R = 1.
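Formula (2.32) is easy to check numerically. With the normal quantiles z_{0.975} ≈ 1.960 and z_{0.90} ≈ 1.282 hard-coded (to avoid a dependency), the sketch below reproduces the R column of Table 2.2 to within rounding; the function name is ours:

```python
def relative_efficiency(m, ratio_var, ratio_eps,
                        z_half_alpha=1.959964, z_beta=1.281552):
    """R in (2.32); ratio_var = sigma_mu^2/sigma^2 and
    ratio_eps = sigma_mu^2/eps^2.  Default quantiles correspond to
    alpha = 0.05 (two-sided) and beta = 0.10."""
    shrink = 4.0 * (z_half_alpha + z_beta) ** 2 * ratio_eps / (m + 1)
    return (1.0 + ratio_var) * (1.0 - shrink)

# Reproduce Table 2.2 (sigma_mu^2/sigma^2 = 0.05, sigma_mu^2/eps^2 = 0.005):
for m in range(5):
    print(m, round(relative_efficiency(m, 0.05, 0.005), 2))
```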

2.4.2 Test for non-inferiority/superiority

The hypotheses for testing non-inferiority or (clinical) superiority of a mean can be written as

    H_0:\ \varepsilon=\mu_1-\mu_2\le\delta \quad\text{vs}\quad H_a:\ \varepsilon>\delta,

where δ is the non-inferiority or (clinical) superiority margin. Under the alternative hypothesis that ε > δ, the power of the above test is given by

    \Phi\!\left(\frac{\varepsilon-\delta}{\sigma_e\sqrt{2/n}}-z_\alpha\right).    (2.33)

The sample size required to achieve the desired power of 1 − β can be obtained by solving the equation

    \frac{\varepsilon-\delta}{\sigma_e\sqrt{2/n}}-z_\alpha=z_\beta.    (2.34)

This leads to

    n=\frac{2\left(z_{1-\alpha}+z_{1-\beta}\right)^2\sigma_e^2}{(\varepsilon-\delta)^2},    (2.35)


where σ_e² is given by (2.29), and ε, σ², σ²_µ, m, and r_j = n_j/n are estimates available at the planning stage of a given clinical trial. The sample size can be solved iteratively. If n_j = n/(m+1), then the sample size can be written explicitly as

    n=\frac{1}{R}\cdot\frac{2\left(z_{1-\alpha}+z_{1-\beta}\right)^2\sigma^2}{(\varepsilon-\delta)^2}=\frac{1}{R}\,n_{\mathrm{classic}},    (2.36)

where R is the relative efficiency given by

    R=\frac{n_{\mathrm{classic}}}{n}=1-\frac{2\left(z_{1-\alpha}+z_{1-\beta}\right)^2}{(\varepsilon-\delta)^2}\cdot\frac{\sigma_\mu^2}{m+1}.    (2.37)
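Because σ_e² in (2.29) itself depends on n, (2.35) has to be solved iteratively; a simple fixed-point iteration converges quickly whenever the amendment-variability term is modest. The function name, the planning fractions r_j = n_j/n, and the starting guess are illustrative assumptions:

```python
import math

def n_noninferiority(eps, delta, sigma2, sigma2_mu, r,
                     z_alpha=1.644854, z_beta=1.281552):
    """Fixed-point solution of (2.35) with sigma_e^2 from (2.29).

    r: anticipated fractions n_j / n for j = 0, ..., m (summing to 1).
    Default quantiles give alpha = 0.05 (one-sided) and beta = 0.10."""
    m = len(r) - 1
    n = 100.0  # arbitrary starting guess
    for _ in range(200):
        sigma2_e = sigma2 / (m + 1) ** 2 * sum(1.0 / rj for rj in r) \
                   + n * sigma2_mu / (m + 1)               # (2.29)
        n = 2.0 * (z_alpha + z_beta) ** 2 * sigma2_e / (eps - delta) ** 2
    return math.ceil(n)
```

With ε = 0.5, δ = 0, σ² = 1, and no between-amendment variability (σ²_µ = 0) over two equal periods, this reduces to the classical one-sided two-sample size; setting σ²_µ = 0.01 inflates it noticeably.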

2.4.3 Test for equivalence

The hypotheses for testing equivalence can be written as

    H_0:\ |\varepsilon|=|\mu_1-\mu_2|>\delta \quad\text{vs}\quad H_a:\ |\varepsilon|\le\delta,

where δ is the equivalence limit. Under the alternative hypothesis that |ε| ≤ δ, the power of the test is given by

    \Phi\!\left(\frac{\delta-\varepsilon}{\sigma_e\sqrt{2/n}}-z_\alpha\right)+\Phi\!\left(\frac{\delta+\varepsilon}{\sigma_e\sqrt{2/n}}-z_\alpha\right)-1
    \approx 2\,\Phi\!\left(\frac{\delta-|\varepsilon|}{\sigma_e\sqrt{2/n}}-z_\alpha\right)-1.    (2.38)

As a result, the sample size needed to achieve the desired power of 1 − β can be obtained by solving the equation

    \frac{\delta-|\varepsilon|}{\sigma_e\sqrt{2/n}}-z_\alpha=z_{\beta/2}.    (2.39)

This leads to

    n=\frac{2\left(z_{1-\alpha}+z_{1-\beta/2}\right)^2\sigma_e^2}{(|\varepsilon|-\delta)^2},    (2.40)

where σ_e² is given by (2.29). If n_j = n/(m+1), then the sample size is given by

    n=\frac{1}{R}\cdot\frac{2\left(z_{1-\alpha}+z_{1-\beta/2}\right)^2\sigma^2}{(|\varepsilon|-\delta)^2},    (2.41)


where R is the relative efficiency given by

    R=\frac{n_{\mathrm{classic}}}{n}=1-\frac{2\left(z_{1-\alpha}+z_{1-\beta/2}\right)^2}{(|\varepsilon|-\delta)^2}\cdot\frac{\sigma_\mu^2}{m+1}.    (2.42)
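For a fixed planning value of σ_e², formula (2.40) is a one-liner; the sketch below treats σ_e² as given rather than iterating, and the function name and example inputs are our own:

```python
def n_equivalence(eps, delta, sigma2_e,
                  z_alpha=1.644854, z_half_beta=1.644854):
    """Sample size (2.40) per group with sigma_e^2 taken as fixed.
    Defaults give alpha = 0.05 and beta = 0.10 (z_{1-beta/2} = z_{0.95})."""
    return 2.0 * (z_alpha + z_half_beta) ** 2 * sigma2_e \
        / (abs(eps) - delta) ** 2

# e.g. true difference 0, equivalence limit 0.3, sigma_e^2 = 1:
print(n_equivalence(0.0, 0.3, 1.0))  # about 240.5 per group
```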

2.5 Statistical Inference with Covariate Adjustment

As indicated earlier, statistical methods for analyzing clinical data should be modified when there are protocol amendments during the trial, since any protocol deviations and/or violations may introduce bias into conclusions drawn as if no changes had been made to the study protocol. For example, the target patient population of a clinical trial is typically defined through the patient inclusion/exclusion criteria. If the inclusion/exclusion criteria are modified during the trial, then the resulting data may not come from the target patient population of the original protocol, and the statistical methods of analysis have to be modified to reflect this change. Chow and Shao (2005) modeled the population deviations due to protocol amendments using covariates and developed valid statistical inference, which is outlined in this section.

2.5.1 Population and assumption

In this section, for convenience, we denote the target patient population by P₀. Parameters related to P₀ are indexed by the subscript 0. For example, in a comparative clinical trial comparing a test treatment and a control, the effect size of the test treatment (for a given study endpoint) is µ₀T − µ₀C, where µ₀T and µ₀C are, respectively, the population means of the test treatment and the control for patients in P₀.

Suppose that there are a total of K possible protocol amendments. Let P_k be the patient population after the kth protocol amendment, k = 1, . . . , K. As indicated earlier, a protocol change may result in a patient population similar to, but slightly different from, the original target patient population; i.e., P_k may differ from P₀, k = 1, . . . , K. For example, when patient enrollment is too slow due to stringent patient inclusion/exclusion criteria, a typical approach is to relax the inclusion/exclusion criteria to increase enrollment, which results in a patient population larger than the target patient population. Because of the possible population deviations due to protocol amendments, standard statistical methods may not be appropriate and may lead to invalid inference or conclusions regarding the target patient population.


Let µ_k be the mean of the study variable for the patient population P_k after the kth protocol amendment, k = 1, . . . , K. Note that the subscript T or C indicating the test treatment or the control is omitted in the general discussion for simplicity. Suppose that, for each k, clinical data are observed from n_k patients, so that the sample mean \bar{y}_k is an unbiased estimator of µ_k, k = 0, 1, . . . , K. Ignoring the differences among the P_k's results in the estimator \bar{y}=\sum_k n_k\bar{y}_k/\sum_k n_k, which is an unbiased estimator of a weighted average of the µ_k's, not of the originally defined treatment effect µ₀.

In many clinical trials, protocol changes are made in terms of one or a few covariates. Modifying the patient inclusion/exclusion criteria, for example, may involve patient age or ethnic factors. Treatment duration, study dose/regimen, and factors related to laboratory testing or diagnostic procedures are other examples of such covariates. Let x be a (possibly multivariate) covariate whose values are distinct for different protocol amendments. Throughout this section we assume that

    \mu_k=\beta_0+\beta'x_k, \qquad k=0,1,\ldots,K,    (2.43)

where β₀ is an unknown parameter, β is an unknown parameter vector whose dimension is the same as that of x, β′ denotes the transpose of β, and x_k is the value of x under the kth amendment (or under the original protocol when k = 0). If values of x differ within a fixed population P_k, then x_k is a characteristic of x, such as the average of all values of x within P_k.

Although µ₁, . . . , µ_K are different from µ₀, model (2.43) relates them through the covariate x. Statistical inference on µ₀ (or, more generally, on any function of µ₀, µ₁, . . . , µ_K) can be made based on model (2.43) and the data from P_k, k = 0, 1, . . . , K.

2.5.2 Conditional inference

We first consider inference conditional on a fixed set of K ≥ 1 protocol amendments. Following the notation of the previous section, let \bar{y}_k be the sample mean based on the n_k observations from P_k, which is an unbiased estimator of µ_k, k = 0, 1, . . . , K. Again, a subscript T or C should be added when we consider the test treatment or the control. Under model (2.43), the parameters β₀ and β can be unbiasedly estimated by

    \begin{pmatrix}\hat\beta_0\\ \hat\beta\end{pmatrix}=(X'WX)^{-1}X'W\bar{y},    (2.44)

where \bar{y}=(\bar{y}_0,\bar{y}_1,\ldots,\bar{y}_K)', X is the matrix whose kth row is (1, x_k'), k = 0, 1, . . . , K, and W is the diagonal matrix whose diagonal elements are n₀, n₁, . . . , n_K. Here, we assume that the dimension of x is less than or equal to K, so that (X'WX)^{-1} is well defined. To estimate µ₀, we may use the unbiased estimator

    \hat\mu_0=\hat\beta_0+\hat\beta'x_0.

For inference on µ₀, we need to derive the sampling distribution of \hat\mu_0. Assume first that, conditional on the given protocol amendments, the data from each P_k are normally distributed with a common standard deviation σ. Since each \bar{y}_k is distributed as N(µ_k, σ²/n_k), and it is reasonable to assume that data from different P_k's are independent, \hat\mu_0 is distributed as N(µ₀, σ²c₀) with

    c_0=(1,x_0')(X'WX)^{-1}(1,x_0')'.

Let s_k² be the sample variance based on the data from P_k, k = 0, 1, . . . , K. Then (n_k − 1)s_k²/σ² has the chi-square distribution with n_k − 1 degrees of freedom and, consequently, (N − K)s²/σ² has the chi-square distribution with N − K degrees of freedom, where

    s^2=\sum_k(n_k-1)s_k^2/(N-K)

and N = \sum_k n_k. Confidence intervals for µ₀ and tests of hypotheses about µ₀ can be carried out using the t-statistic t=(\hat\mu_0-\mu_0)/\sqrt{c_0s^2}.

When the P_k's have different standard deviations and/or the data from P_k are not normally distributed, we have to use approximations, assuming that all the n_k's are large. By the central limit theorem, when all the n_k's are large, \hat\mu_0 is approximately normally distributed with mean µ₀ and variance

    \tau^2=(1,x_0')(X'WX)^{-1}X'W\Sigma X(X'WX)^{-1}(1,x_0')',    (2.45)

where Σ is the diagonal matrix whose kth diagonal element is the population variance of P_k, k = 0, 1, . . . , K. Large-sample statistical inference can be made using the z-statistic z=(\hat\mu_0-\mu_0)/\hat\tau (which is approximately standard normal), where \hat\tau is the same as τ with the kth diagonal element of Σ estimated by s_k², k = 0, 1, . . . , K.

2.5.3 Unconditional inference

In practice, protocol amendments are usually made in a random fashion; that is, the investigator of an on-going trial decides to make a protocol change with a certain probability and, in some cases, changes are made based on the accrued data of the trial. Let C_K denote a particular set of K protocol amendments as described in the previous sections, and let 𝒞 be the collection of all possible sets of protocol amendments. For example, suppose that there are a total of M possible protocol amendments indexed by 1, . . . , M. Let C_K be a subset of K ≤ M integers, i.e.,

    C_K=\{i_1,\ldots,i_K\}\subset\{1,\ldots,M\}.

Then C_K denotes the set of protocol amendments i₁, . . . , i_K, and 𝒞 is the collection of all subsets of {1, . . . , M}. In a particular problem, C_K is chosen based on a (random) decision rule ξ (often referred to as the adaptation rule), and P(C_K) = P(ξ = C_K), the probability that C_K is the realization of the protocol amendments, is between 0 and 1, with \sum_{C_K\in\mathcal{C}}P(C_K)=1.

CK∈C P(CK) = 1.For a particular CK , let zCK be the z-statistic defined in the previous

section and let L(zCK |ξ = CK) be the conditional distribution of zξ givenξ = CK . Suppose that L(zCK |ξ = CK) is approximately standard normalfor almost every sequence of realization of ξ . We now show thatL(zξ ), theunconditional distribution of zξ , is also approximately standard normal.According to Theorem 1 in Liu, Proschan, and Hedger (2002),

L(zξ ) = E

[∑

CK∈CL(zCK |ξ = CK)Iξ=CK

]

, (2.46)

where Iξ=CK is the indicator function of the set {ξ = CK} and the ex-pectation E is with respect to the randomness of ξ . For every fixed realnumber t, by the assumption,

P(zCK ≤ t|ξ = CK) → Φ(t) almost surely, (2.47)

where Φ is the standard normal distribution function. Multiplying Iξ=CK

to both sides of (2.47) leads to

P(zCK ≤ t|ξ = CK)Iξ=CK → Φ(t)Iξ=CK almost surely. (2.48)

Since the left-hand side of (2.48) is bounded by 1, by the dominatedconvergence theorem,

E [P(zCK ≤ t|ξ = CK)Iξ=CK ] → E [Φ(t)Iξ=CK ]

= Φ(t)E [Iξ=CK ]

= Φ(t)P(CK).

It follows from (2.46) and (2.47) that

P(zξ ≤ t) =∑

CK∈CE [P(zCK ≤ t|ξ = CK)Iξ=CK ]

→∑

CK∈CΦ(t)P(CK)

= Φ(t).

Hence, large sample inference can be made using the z-statistic zξ .


It should be noted that the finite-sample distribution of z_ξ given by (2.46) may be very complicated. Furthermore, assumption (2.47), namely that L(z_{C_K}|ξ = C_K) is approximately standard normal for almost every sequence of realizations of ξ, has to be verified in each application (via the construction of z_{C_K} and asymptotic theory such as the central limit theorem). Assumption (2.47) certainly holds when the adaptation rule ξ and z_{C_K} are independent for each C_K ∈ 𝒞. For example, when the protocol changes modify the patient inclusion/exclusion criteria, the adaptation rule ξ is related to patient recruiting and is thus typically independent of the main study variable, such as drug efficacy. Another example is an adjustment of the study dose/regimen for safety reasons, which may be approximately independent of drug efficacy. Some other examples can be found in Liu, Proschan, and Pledger (2002).

Example 2.1  A placebo-controlled clinical trial was conducted to evaluate the efficacy of an investigational drug for the treatment of patients with asthma. The primary study endpoint is the change in FEV1 (forced expiratory volume in one second), defined as the difference between the FEV1 after treatment and the baseline FEV1. During the conduct of the clinical trial, the protocol was amended twice due to slow enrollment. At each protocol amendment, a modification to the inclusion criterion regarding the baseline FEV1 was made. In the original protocol, patients were included if and only if their baseline FEV1 was between 1.5 liters and 2.0 liters. At the first and second protocol amendments, the range of the baseline FEV1 in this inclusion criterion was changed to 1.5 liters to 2.5 liters and 1.5 liters to 3.0 liters, respectively. Some summary statistics are given in Table 2.3.

We first consider the analysis conditional on the two protocol amendments; i.e., the sample size (number of patients) in each period of the trial (before or after each protocol amendment) is treated as fixed.

Table 2.3 Summary Statistics in the Asthma Trial Example

                 Baseline     Number       Baseline    FEV1 Change   FEV1 Change
                 FEV1 Range   of Patients  FEV1 Mean   Mean          S.D.
    Test drug    1.5 ∼ 2.0     9           1.86        0.31          0.14
                 1.5 ∼ 2.5    15           2.30        0.42          0.14
                 1.5 ∼ 3.0    16           2.79        0.54          0.16
    Placebo      1.5 ∼ 2.0     8           1.82        0.16          0.15
                 1.5 ∼ 2.5    16           2.29        0.19          0.13
                 1.5 ∼ 3.0    16           2.84        0.20          0.14


Following the procedure described above, with the mean baseline FEV1 as the (one-dimensional) covariate x in assumption (2.43), we obtain the estimates \hat\beta_0 = −0.14 and \hat\beta = 0.25 for the test drug (according to formula (2.44)). Then an estimate of µ₀, the population mean for the test drug under the original protocol, is \hat\mu_0 = −0.14 + 0.25 × 1.86 = 0.33. The estimate of τ according to formula (2.45), with the elements of Σ estimated by the sample variances, is \hat\tau = 0.04. Similarly, for the placebo, the population mean under the original protocol is estimated as 0.099 + 0.038 × 1.82 = 0.17, with an estimated τ of 0.04. Thus, an estimate of the population mean difference between the test drug and placebo under the original protocol is 0.33 − 0.17 = 0.16, with an estimated standard error of √2 × 0.04 = 0.057. The approximate p-value for the one-sided null hypothesis that the test drug mean is no larger than the placebo mean is 0.0021, and the approximate p-value for the two-sided null hypothesis that the test drug mean equals the placebo mean is 0.0042.

If we ignore the protocol amendments and combine all the data, then the estimate of the mean difference between the test drug and placebo is 0.25, which is clearly biased. On the other hand, one may perform an analysis based only on the patients enrolled before the first protocol amendment, i.e., 9 patients in the test drug group and 8 patients in the placebo group. The resulting estimate of the mean difference between the test drug and placebo is 0.15, with an estimated standard error of √2 × 0.047 = 0.066. The approximate p-value is 0.0116 for the one-sided null hypothesis and 0.0232 for the two-sided null hypothesis. The test drug effect appears less significant, but this is because only the data gathered before the first protocol amendment are used.
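The conditional analysis for the test drug can be reproduced from the summary statistics in Table 2.3 via the weighted least-squares formula (2.44). Carrying full precision gives β̂ ≈ 0.25 and µ̂₀ ≈ 0.31; a value near 0.33 arises when β̂₀ and β̂ are rounded to two decimals before being combined:

```python
import numpy as np

# Test-drug summary from Table 2.3: sample sizes, mean baseline FEV1
# (the covariate x), and mean FEV1 change (the response y).
n = np.array([9.0, 15.0, 16.0])
x = np.array([1.86, 2.30, 2.79])
y = np.array([0.31, 0.42, 0.54])

X = np.column_stack([np.ones_like(x), x])
W = np.diag(n)

# Weighted least squares (2.44): (X'WX)^{-1} X'W y.
beta0, beta1 = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
mu0_hat = beta0 + beta1 * x[0]  # estimated mean under the original protocol
print(round(beta1, 2), round(mu0_hat, 2))  # 0.25 0.31
```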

We now consider the unconditional analysis. As discussed above, if the adaptation rule is independent of the z-statistic for every possible way of making protocol amendments, then the z-statistic is still valid for unconditional analysis. In the asthma trial example, the protocol amendments were made because of slow enrollment and, hence, can be viewed as a process independent of the FEV1 change scores. Thus, the conclusions based on the conditional and unconditional analyses in this example are the same.

2.6 Concluding Remarks

As indicated in this chapter, it is not uncommon to issue protocol amendments during the conduct of a clinical trial. Protocol amendments are necessary to describe what changes have been made and the rationales behind those changes, in order to ensure the validity and integrity of the clinical trial. As a result of the modifications, the original target patient population under study could have become a similar but different patient population. If modifications are made frequently during the conduct of the trial, the target patient population is in fact a moving target patient population. In practice, there is a risk that major (or significant) modifications made to the trial procedures and/or statistical procedures could lead to a totally different trial, one that cannot address the scientific/medical questions the clinical trial was intended to answer. Thus, it is of interest to measure the impact of each modification made to the trial procedures and/or statistical procedures after a protocol amendment. In this chapter, it is suggested that independent estimates be obtained for ε, C, and ∆, which measure, respectively, the shift in mean response, the inflation/reduction of the variability of the response, and the effect size of the primary study endpoint. The estimates of ε, C, and ∆ are useful in signaling how the target patient population has changed as a result of the protocol amendments. In practice, it should be noted that reliable estimates of ε, C, and ∆ may not be available for trials with small sample sizes. In other words, the estimates of ε, C, and ∆ may not be accurate and reliable, because it is possible that only a few observations are available after a protocol amendment, especially when there are a number of protocol amendments. In addition, although we consider the case where the sample size after a protocol amendment is random, the number of protocol amendments is itself a random variable, which further complicates the already complicated procedure for obtaining accurate and reliable estimates of ε, C, and ∆. In practice, it should be recognized that protocol amendments do not come free of cost. The potential for introducing additional bias/variation as a result of the modifications should be carefully evaluated before issuing a protocol amendment. It is important to identify, control, and hopefully eliminate/minimize the sources of bias/variation. For good clinical and/or statistical practice, it is therefore strongly suggested that the number of protocol amendments be limited to a small number, such as 2 or 3, in a given clinical trial.

In current practice, standard statistical methods are applied to the data collected from the actual patient population regardless of the frequency of changes (protocol amendments) made during the conduct of the trial, provided that the overall type I error is controlled at the pre-specified level of significance. This, however, has raised a serious regulatory/statistical concern as to whether the resulting statistical inferences (e.g., independent estimates, confidence intervals, and p-values) drawn about the originally planned target patient population, based on clinical data from the actual patient population (as the result of the modifications made via protocol amendments), are accurate and reliable. As discussed in this chapter, the impact on statistical inference due to protocol amendments could be substantial, especially when there are major modifications that result in a significant shift in the mean response and/or inflation of the variability of the response of the study parameters. It is suggested that a sensitivity analysis with respect to changes in the study parameters be performed to provide a better understanding of the impact of such changes on statistical inference. Thus, regulatory guidance on what range of changes in the study parameters is considered acceptable is needed. As indicated earlier, adaptive design methods are very attractive to clinical researchers and/or sponsors because of their flexibility, especially in trials in early clinical development. It should be noted, however, that there is a high risk that a clinical trial using adaptive design methods may fail, in terms of its scientific validity and/or its ability to provide useful information with the desired power, especially when the trial is relatively small and there are a number of protocol amendments. In addition, missing values pose a statistical challenge to clinical researchers. The causes of missing values may or may not be related to the changes or modifications made in the protocol amendments. In either case, missing values must be handled carefully to provide an unbiased assessment and interpretation of the treatment effect.

For some types of protocol amendments, the method proposed by Chow and Shao (2005) gives valid statistical inference for characteristics (such as the population mean) of the original patient population. The key assumption in handling population deviation due to protocol amendments is assumption (2.43), which has to be verified in each application. Although a more complicated model (such as a non-linear model in x) may be considered, model (2.43) leads to simple derivations of the sampling distributions of the statistics used for inference. The other difficult issue in handling protocol amendments (or, more generally, adaptive designs) is that the decision rule for protocol amendments (or the adaptation rule) is often random and related to the main study variable through the accrued data of the on-going trial. Chow and Shao (2005) showed that if an approximate pivotal quantity conditional on each realization of the adaptation rule can be found, then it is also approximately pivotal unconditionally and can be used for unconditional inference. Further research on the construction of approximate pivotal quantities conditional on the adaptation rule in various problems is needed.

After some modifications are made to the trial and/or statistical procedures, not only may the target patient population have become a similar but different patient population, but the sample size may also fail to achieve the desired power for detecting a clinically important effect size of the test treatment at the end of the study. In practice, we expect to lose power when the modifications have led to a shift in the mean response and/or inflation of the variability of the primary study endpoint. As a result, the originally planned sample size may have to be adjusted. In this chapter, it is suggested that the relative efficiency at each protocol amendment be taken into consideration in deriving an adjustment factor for the sample size in order to achieve the desired power. More details regarding the adjustment of sample size when various adaptive designs/methods are used in clinical trials are given in the subsequent chapters of this book.


CHAPTER 3

Adaptive Randomization

Randomization plays an important role in clinical research. For a given clinical trial, appropriate use of a randomization procedure not only ensures that the subjects selected for the clinical trial are a truly representative sample of the target patient population under study, but also provides an unbiased and fair assessment of the efficacy and safety of the test treatment under investigation. As pointed out in Chow and Liu (2003), statistical inference on the efficacy and safety of a test treatment relies on the probability distribution of the primary study endpoints of the trial, which in turn depends on the randomization model/method employed for the trial. An inadequate randomization model/method may violate the primary distributional assumption and consequently distort statistical inference. As a result, the conclusions drawn from the clinical data collected in the trial may be biased and/or misleading.

Based on the allocation probability (i.e., the probability of assigning a patient to a treatment), the randomization procedures commonly employed in clinical trials can be classified into four categories: conventional randomization, treatment-adaptive randomization, covariate-adaptive randomization, and response-adaptive randomization. Conventional randomization refers to any randomization procedure with a constant treatment allocation probability. Commonly used conventional randomization procedures include simple (or complete) randomization, stratified randomization, and cluster randomization. Unlike the conventional procedures, the treatment allocation probabilities of adaptive randomization procedures usually vary over time, depending upon the cumulative information on the previously assigned patients. Like the conventional procedures, treatment-adaptive randomization schedules can still be prepared in advance. For covariate-adaptive and response-adaptive randomization procedures, however, the randomization codes are usually generated dynamically in real time, because the randomization depends on the patient information on covariates or responses observed up to the time the randomization is performed. Treatment-adaptive randomization and covariate-adaptive randomization are usually considered in order to reduce treatment imbalance or the deviation from the target sample size ratio between treatment groups. Response-adaptive randomization, on the other hand, emphasizes the ethical consideration that it is desirable to provide patients with the better/best treatment based on the knowledge about the treatment effects available at that moment.

In practice, conventional randomization procedures could result in severe treatment imbalance at some time point during the trial or at the end of the trial, especially when there is a time-dependent heterogeneous covariate that relates to the treatment responses. Treatment imbalance could decrease the statistical power for demonstrating the treatment effect of the intended trial and consequently compromise the validity of the trial. In this chapter, we attempt to provide a comprehensive review of various randomization procedures from each category.

In the next section, the conventional randomization procedures are briefly reviewed. In Section 3.2, we introduce some commonly used treatment-adaptive randomization procedures in clinical trials. Several covariate-adaptive randomization procedures and response-adaptive randomization methods are discussed in Sections 3.3 and 3.4, respectively. In Section 3.5, some practical issues in adaptive randomization are examined. A brief summary is given in the last section of this chapter.

3.1 Conventional Randomization

As mentioned earlier, the treatment allocation probability of a conventional randomization procedure is a fixed constant. As a result, the experimenters can prepare the randomization codes in advance. Conventional randomization procedures are commonly employed in clinical trials, particularly in double-blind randomized clinical trials. In what follows, we introduce some commonly employed conventional randomization procedures, namely, simple randomization, stratified randomization, and cluster randomization.

Simple randomization

Simple (or complete) randomization is probably the most commonly employed conventional randomization procedure in clinical trials. Consider a clinical trial comparing the efficacy and safety of k treatments in patients with a certain disease. Under simple randomization, each patient is randomly assigned to the ith treatment group with a fixed allocation probability p_i (i = 1, …, k), where Σ_{i=1}^k p_i = 1. The allocation probabilities are often expressed as the ratio between the sample size n_i of the ith treatment group and the overall sample size n = Σ_{i=1}^k n_i, i.e., p_i = n_i/n, which is usually referred to as the sample size ratio of the ith treatment group. In the interest of treatment balance, an equal allocation probability for each treatment group (i.e., p_i = p for all i) is usually considered, which has the following advantages. First, it has the most (optimal) statistical power for correct detection of a clinically meaningful difference under the condition of equal variances. Second, it is ethical in the sense of equal toxicity (Lachin, 1988). In practice, however, it may be of interest to have an unequal allocation between treatment groups. For example, it may be desirable to assign more patients to the test treatment group than to a placebo group. It should be noted, however, that a balanced design may not achieve the optimal power when there is heterogeneity in variance between treatment groups; the optimal power is achieved only when the sample size ratios are proportional to the standard deviations of the groups.

Simple (complete) randomization for a two-arm parallel-group clinical trial can easily be performed by assuming that the treatment assignments are independent Bernoulli random variables with success probability 0.5. In practice, treatment imbalance inevitably occurs, even by chance alone. Since this treatment imbalance could result in a decrease in power for detecting a clinically meaningful difference, it is of interest to examine the probability of imbalance. Denote the two treatments under study by treatment A and treatment B, respectively. Let D_n = N_A(n) − N_B(n) be the measure of the imbalance in treatment assignment at stage n, where N_A(n) and N_B(n) are the sample sizes of treatment A and treatment B at stage n, respectively. Then the imbalance D_n is asymptotically normally distributed with mean 0 and variance n. Therefore, the probability of imbalance, for a real value r > 0, is given by (see, e.g., Rosenberger et al., 2002; Rosenberger and Lachin, 2002)

P(|D_n| > r) = 2[1 − Φ(r/√n)].   (3.1)
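As a quick numerical illustration, formula (3.1) can be evaluated with the error function and checked against a small Monte Carlo experiment. The sketch below is not from the text; the function names are ours:

```python
import math
import random

def prob_imbalance(n, r):
    """Equation (3.1): P(|D_n| > r) = 2[1 - Phi(r / sqrt(n))]."""
    phi = 0.5 * (1.0 + math.erf(r / math.sqrt(n) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

def simulate_imbalance(n, r, trials=2000, seed=1):
    """Monte Carlo estimate of P(|D_n| > r) under simple 1:1 randomization."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # each assignment moves D_n by +1 (treatment A) or -1 (treatment B)
        d = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        hits += abs(d) > r
    return hits / trials
```

For n = 1000 and r = 63 (about two standard deviations), both the normal approximation and the simulation give a probability of imbalance of roughly 0.046.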

The total sample size for an unbalanced design with homogeneous variance is given by

n = (1/R) · 4(z_{1−α/2} + z_{1−β})² σ²/δ²,

where the relative efficiency R is a function of the sample size ratio k = n_2/n_1 between the two groups, i.e.,

R = 4/(2 + k + 1/k).


Table 3.1 Relative Efficiencies

Sample Size Ratio, k    Relative Efficiency, R    Pr(Efficiency < R)
1.0                     1.000                     1
1.5                     0.960                     0.11
2.0                     0.889                     0.01
2.5                     0.816                     0.001

Note: n = 64.

Let

r = n_2 − n_1 = [(k − 1)/(1 + k)] n.

Then

P(|D_n| > r) = 2[1 − Φ(((k − 1)/(1 + k)) √n)].

As can be seen from Table 3.1, Pr(R < 0.96) = 11% and Pr(R < 0.889) = 1%. Thus, Pr(0.889 < R < 0.96) = 10%.
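The entries of Table 3.1 can be reproduced from the two formulas above. The Python sketch below uses our own helper names and n = 64, as in the table:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rel_efficiency(k):
    """R = 4 / (2 + k + 1/k) for sample size ratio k = n2/n1."""
    return 4.0 / (2.0 + k + 1.0 / k)

def prob_less_efficient(k, n):
    """P(realized efficiency < R) = P(|D_n| > r), with r = n(k - 1)/(k + 1)."""
    r = (k - 1.0) / (k + 1.0) * n
    return 2.0 * (1.0 - norm_cdf(r / math.sqrt(n)))

rows = [(k, round(rel_efficiency(k), 3), round(prob_less_efficient(k, 64), 3))
        for k in (1.0, 1.5, 2.0, 2.5)]
```

The computed rows match Table 3.1 to the precision shown (the k = 2 entry evaluates to about 0.008, reported as 0.01 in the table).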

Stratified randomization

As discussed above, simple (complete) randomization does not assure balance between treatment groups. The impact of treatment imbalance could be substantial, and it can become critical when there are important covariates. In this case, stratified randomization is usually recommended to reduce treatment imbalance. For stratified randomization, the target patient population is divided into several homogeneous strata, which are usually determined by some combination of covariates (e.g., patient demographics or patient characteristics). Within each stratum, a simple (complete) randomization is then employed. Similarly, the treatment imbalance of stratified randomization for a clinical trial comparing two treatment groups can be characterized asymptotically by the following probability of imbalance (see, e.g., Hallstrom and Davis, 1988)

P(|D| > r) = 2[1 − Φ(r/√Var(D))],   (3.2)

where

Var(D) = (Σ_{i=1}^s b_i + s)/6,


s is the number of strata, b_i is the block size of the ith stratum, and

D = Σ_{i=1}^s (N_i − 2A_i),

in which N_i and A_i are the total number of patients and the number of patients in treatment A within the ith stratum, respectively.

When the number of strata is large, it is difficult to achieve treatment balance across all strata. This imbalance decreases the power of statistical analyses such as the analysis of covariance (ANCOVA).
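A small Python sketch (our own helper names; the block sizes are hypothetical) evaluates (3.2). For example, ten strata with blocks of size 4 give Var(D) = 50/6 ≈ 8.3:

```python
import math

def stratified_var(block_sizes):
    """Var(D) = (sum_i b_i + s)/6 for s strata with block sizes b_i."""
    return (sum(block_sizes) + len(block_sizes)) / 6.0

def prob_imbalance_stratified(block_sizes, r):
    """Equation (3.2): P(|D| > r) = 2[1 - Phi(r / sqrt(Var(D)))]."""
    z = r / math.sqrt(stratified_var(block_sizes))
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
```

With ten strata of block size 4, the probability of a total imbalance exceeding 5 is roughly 8%.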

Cluster randomization

In certain trials, the appropriate unit of randomization may be some aggregate of individuals. This form of randomization is known as cluster randomization or group randomization. Cluster randomization is employed by necessity in trials in which the intervention is by nature designed to be applied at the cluster level, such as community-based interventions. Under simple cluster randomization, the degree of imbalance can be derived as for simple randomization, i.e.,

P(|D_ncluster| > r) = 2[1 − Φ(r/√ncluster)],

where

D_ncluster = N_clusterA(ncluster) − N_clusterB(ncluster).

If each cluster contains k subjects, the number of clusters is ncluster = n/k, and the cluster-level imbalance relates to the subject-level imbalance D_n by

D_ncluster = D_n/k = N_A(n)/k − N_B(n)/k.

Thus we have

P(|D_n|/k > r) = 2[1 − Φ(r/√(n/k))],

which can be rewritten as

P(|D_n| > r) = 2[1 − Φ(r/√(kn))].

It should be noted that the analysis of a cluster-randomized trial is very different from that of a trial randomized at the individual-subject level. A cluster-randomized trial requires adequate numbers of both individual subjects and clusters.


Remarks For a given sample size, the statistically most powerful design is the design whose allocation probabilities are proportional to the standard deviations of the groups. For binary responses, Neyman's treatment allocation with the following allocation ratio leads to the most powerful design:

r = n_a/n_b = [p_a(1 − p_a)/(p_b(1 − p_b))]^{1/2},   (3.3)

where p_a and p_b are the response proportions for treatment A and treatment B, respectively. Note that for the most powerful design the target imbalance is r_0 ≠ 0, and the power of a design can be measured by the quantity

P(|D| > r − r_0).
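The Neyman rule (3.3) is easy to check numerically. The sketch below (our own names, with illustrative response rates) also verifies that Neyman allocation yields a smaller variance for the estimated difference than an equal split of the same total sample size:

```python
import math

def neyman_ratio(pa, pb):
    """Equation (3.3): r = n_a/n_b = sqrt(p_a(1 - p_a) / (p_b(1 - p_b)))."""
    return math.sqrt(pa * (1.0 - pa) / (pb * (1.0 - pb)))

def var_diff(pa, pb, na, nb):
    """Variance of the estimated proportion difference for group sizes na, nb."""
    return pa * (1.0 - pa) / na + pb * (1.0 - pb) / nb

# Neyman allocation of n = 100 subjects versus an equal split
pa, pb, n = 0.5, 0.9, 100
r = neyman_ratio(pa, pb)        # standard deviations 0.5 and 0.3 -> r = 5/3
na = n * r / (1.0 + r)
v_neyman = var_diff(pa, pb, na, n - na)
v_equal = var_diff(pa, pb, n / 2, n / 2)
```

Here v_neyman < v_equal, illustrating that power is maximized when allocation follows the group standard deviations.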

3.2 Treatment-Adaptive Randomization

Treatment-adaptive randomization is also known as variance-adaptive randomization. The purpose of treatment-adaptive randomization is to achieve a more balanced design, or to reduce the deviation from the target treatment allocation ratio, by using a varying allocation probability. Commonly used treatment-adaptive randomization models in clinical research and development include block randomization, the biased-coin model, and various urn models. To introduce these randomization procedures, consider a two-arm parallel-group randomized clinical trial comparing a test treatment (A) with a control (B).

Block randomization

In block randomization, the allocation probability is a fixed constant until one of the two treatment groups reaches its target number. After the target number is reached in one of the two treatment groups, all future patients in the trial are assigned to the other treatment group. As a result, the final assignments of a block randomization are deterministic. It should be noted that although the block size of a block randomization can vary, a small block size reduces the randomness; the minimum block size commonly chosen in clinical trials is two, which leads to an alternating assignment of the two treatments. Under this variance-adaptive randomization, the imbalance is eliminated whenever the target number of patients has been exactly randomized. Note that when there are two treatment groups, block randomization is sometimes referred to as truncated binomial randomization. The allocation probability (to treatment A) is defined as

P = 0     if N_A(j − 1) = n/2,
    1     if N_B(j − 1) = n/2,
    0.5   otherwise,

where N_A(j − 1) and N_B(j − 1) are the sample sizes of treatment A and treatment B at stage j − 1, respectively, and n/2 is the target number for each group.
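A minimal Python sketch of this truncated binomial scheme (our own function name) shows that every block of even size n ends exactly n/2 : n/2 balanced:

```python
import random

def truncated_binomial(n, seed=0):
    """Block randomization for one block of even size n: a fair coin is used
    until one arm reaches its target n/2, after which the remaining slots are
    assigned deterministically to the other arm."""
    rng = random.Random(seed)
    na = nb = 0
    seq = []
    for _ in range(n):
        if na == n // 2:
            arm = 'B'          # A has hit its target: force B
        elif nb == n // 2:
            arm = 'A'          # B has hit its target: force A
        else:
            arm = 'A' if rng.random() < 0.5 else 'B'
        seq.append(arm)
        na += arm == 'A'
        nb += arm == 'B'
    return seq
```

The deterministic tail is what makes the scheme vulnerable to selection bias in unblinded trials.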

Efron’s biased coin model

Efron (1971) proposed a biased-coin design to balance treatment assignments. The allocation rule to treatment A is defined as follows:

P(δ_j | ∆_{j−1}) = 0.5     if N_A(j − 1) = N_B(j − 1),
                   p       if N_A(j − 1) < N_B(j − 1),
                   1 − p   if N_A(j − 1) > N_B(j − 1),

where δ_j is a binary indicator for the treatment assignment of the jth subject, i.e., δ_j = 1 if treatment A is assigned and δ_j = 0 if treatment B is assigned, ∆_{j−1} = {δ_1, …, δ_{j−1}} is the set of treatment assignments up to subject j − 1, and p > 1/2. The imbalance is measured by

|D_n| = |N_A(n) − N_B(n)|.

The limiting balance properties can be obtained by random-walk methods as follows:

lim_{m→∞} Pr(|D_{2m}| = 0) = 1 − (1 − p)/p,

lim_{m→∞} Pr(|D_{2m+1}| = 1) = 1 − (1 − p)²/p².

Note that for an odd number of patients, the minimum imbalance is 1. It can be seen that as p → 1 we achieve perfect balance, but such a procedure is deterministic.
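The rule and its limiting balance property can be checked by simulation. The sketch below (our own names) estimates Pr(|D_n| = 0) at an even n for p = 2/3, for which 1 − (1 − p)/p = 1/2:

```python
import random

def efron_trial(n, p=2/3, seed=0):
    """Simulate n assignments under Efron's biased-coin design; return D_n."""
    rng = random.Random(seed)
    d = 0  # D_j = N_A(j) - N_B(j)
    for _ in range(n):
        if d == 0:
            prob_a = 0.5
        elif d < 0:
            prob_a = p         # A under-represented: favor A
        else:
            prob_a = 1.0 - p   # A over-represented: favor B
        d += 1 if rng.random() < prob_a else -1
    return d

# estimate Pr(|D_100| = 0); the limit for p = 2/3 is 1 - (1-p)/p = 1/2
trials = 4000
balanced = sum(efron_trial(100, seed=s) == 0 for s in range(trials)) / trials
```

The estimate lands near 1/2, and for odd n the imbalance is always odd, so it is never smaller than 1.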

Lachin’s urn model

Lachin’s urn model is another typical example of variance-adaptive ran-domization. The model is described as follows. Suppose that there areNA white balls and NB red balls in an urn initially. A ball is randomlydrawn from the urn without replacement. If it is a white ball, the patientis assigned to receive treatment A; otherwise, the patient is assigned toreceive treatment B. Therefore, if NA and NB are the target sample sizesfor treatment groups A and B, respectively, the target sample size ratio

Page 71: Adaptive Design Methods in Clinical Trailsvct.qums.ac.ir/portal/file/?184695/Adaptive-Design-Methods-in-Clinical-Trials.pdfof adaptive design and analysis in clinical research and

54 ADAPTIVE DESIGN METHODS IN CLINICAL TRIALS

(or balance) is always reached if the total planned number of patientsis reached. The treatment allocation probability for treatment group Ain a trial comparing two treatment groups is

P(A) =n2

− NA( j − 1)NA + NB − ( j − 1)

.

Although Lachin’s urn model can result in a perfect balance design afterall patients are randomized, the maximum imbalance occurs when halfof the treatment allocations are completed, which is given by

Pmax(|Dn| > r) = 2[

1 − Φ(

2rn

(n − 1))]

.
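Because the draws are without replacement, generating a whole Lachin urn sequence is equivalent to shuffling the target numbers of A and B balls. A Python sketch (our own names):

```python
import random

def lachin_sequence(n, seed=0):
    """Lachin's urn (random allocation rule): draw n/2 'A' and n/2 'B' balls
    without replacement; a random shuffle yields the same distribution."""
    rng = random.Random(seed)
    urn = ['A'] * (n // 2) + ['B'] * (n // 2)
    rng.shuffle(urn)
    return urn

def prob_a(n, na_so_far, j):
    """Allocation probability for treatment A at the j-th assignment:
    P(A) = (n/2 - N_A(j-1)) / (n - (j-1))."""
    return (n / 2.0 - na_so_far) / (n - (j - 1))
```

Every generated sequence ends exactly balanced, and the allocation probability degenerates to 0 or 1 once one arm has filled its quota.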

Friedman-Wei’s urn model

The Friedman-Wei urn model is a popular model that can reduce possible treatment imbalance (see, e.g., Friedman, 1949; Wei, 1977; Rosenberger and Lachin, 2002). The model is described below. Suppose that an urn contains a white balls and a red balls. For each treatment assignment, a ball is drawn at random and then replaced. If the ball is white, treatment A is assigned; if it is red, treatment B is assigned. In addition, b balls of the color opposite to that of the chosen ball are added to the urn. Note that a and b can be any reasonable non-negative numbers, and the drawing procedure is repeated for each treatment assignment. Denote this urn design by UD(a, b). The allocation rule for UD(a, b) can then be defined mathematically as follows:

P(δ_j = 1 | ∆_{j−1}) = [a + bN_B(j − 1)]/[2a + b(j − 1)].   (3.4)

Note that UD(a, 0) is nothing but simple (complete) randomization.

Let D_n be the absolute difference in the number of subjects between the two treatment groups after the nth treatment assignment. Then D_n forms a stochastic process with possible values d ∈ {0, 1, 2, …, n}, with D_0 = 0. The (n + 1)th-stage transition probabilities are given by (see also Wei, 1977)

Pr(D_{n+1} = d − 1 | D_n = d) = 1/2 + bd/[2(2a + bn)],   (3.5)
Pr(D_{n+1} = d + 1 | D_n = d) = 1/2 − bd/[2(2a + bn)],
Pr(D_{n+1} = 1 | D_n = 0) = 1,

where n ≥ d ≥ 1. Let P(d, n) = 1/2 + bd/[2(2a + bn)] denote the probability of a step toward balance. P(d, n) is monotonically increasing in d, monotonically decreasing in n, and tends to 1/2 as n increases for any fixed d > 0. Therefore, UD(a, b) forces the trial toward balance when severe imbalance occurs, and it can also ensure the balance of a relatively small trial. It should be noted that UD(a, b) behaves like complete randomization as n increases.

The transition probabilities in (3.5) can be used recursively to calculate the probability of an imbalance of degree d at any stage of the trial:

Pr(D_{n+1} = d) = Pr(D_{n+1} = d | D_n = d − 1) Pr(D_n = d − 1)
               + Pr(D_{n+1} = d | D_n = d + 1) Pr(D_n = d + 1).   (3.7)

For moderate or large n, the imbalance is approximately normally distributed, D_n ~ N(0, n(a + b)/(3b − a)). As a result, the probability of imbalance for a large sample size n can be expressed as

P(|D_n| > r) = 2{1 − Φ(r √[(3b − a)/(n(a + b))])}.   (3.8)
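The allocation rule (3.4) can be verified to agree with the transition probabilities in (3.5): when the signed imbalance N_A − N_B equals d > 0, the probability of a balance-restoring assignment works out to exactly 1/2 + bd/[2(2a + bn)]. A Python sketch (our own names; a = 1 and b = 2 are illustrative):

```python
import random

def ud_alloc_prob_A(a, b, j, nb):
    """Allocation rule (3.4): P(assign A to subject j) given N_B(j-1) = nb."""
    return (a + b * nb) / (2 * a + b * (j - 1))

def toward_balance_prob(a, b, n, d):
    """P(the (n+1)th assignment reduces the imbalance) when N_A - N_B = d > 0."""
    nb = (n - d) // 2  # N_B(n) implied by n assignments and imbalance d
    return 1.0 - ud_alloc_prob_A(a, b, n + 1, nb)

def simulate_ud(n, a=1.0, b=2.0, seed=0):
    """Simulate UD(a, b); return the signed imbalance N_A - N_B."""
    rng = random.Random(seed)
    na = nb = 0
    for j in range(1, n + 1):
        if rng.random() < ud_alloc_prob_A(a, b, j, nb):
            na += 1
        else:
            nb += 1
    return na - nb
```

In simulation the final imbalance stays far smaller than the √n-scale imbalance of complete randomization, as the transition probabilities suggest.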

Remarks The urn procedure is relatively easy to implement. It forces a small-scale trial to be balanced but approaches complete randomization as the sample size increases. It is less vulnerable to selection bias than the permuted-block design, the biased-coin design, or the random allocation rule. As n increases, the potential selection bias approaches that of complete randomization, for which the expected selection bias is zero. The urn design can also be extended to prospectively stratified trials, whether the number of strata is small or large.

The urn design can easily be generalized to the case of multiple-group comparisons (Wei, 1978; Wei, Smythe, and Smith, 1986). It can be generalized even further by using different values of a and b for different groups in the urn model.

3.3 Covariate-Adaptive Randomization

Covariate-adaptive randomization is usually considered to reduce covariate imbalance between treatment groups; it is therefore also known as adaptive stratification. The allocation probability under covariate-adaptive randomization is modified over time during the trial based on the cumulative information about baseline covariates and treatment assignments. Covariate-adaptive randomization methods include Zelen's model, the Pocock-Simon model, Wei's marginal urn design, minimization, and Atkinson's optimal model, which are briefly described below.


Zelen’s model

Zelen’s model (Zelen, 1974) requires a simple randomization sequence.When the imbalance reaches a certain threshold, the next subject willbe forced to be assigned to be the group with fewer subjects. Let Nik(n)be the number of patients in stratum i = 1, 2, . . . , s of the kth treatmentk = 1, 2. When patient n+1 in stratum i is ready to be randomized, onecomputes Di(n) = Ni1(n) − Ni2(n). For an integer c, if |Di(n)| < c, thenthe patient is randomized according to schedule; otherwise, the patientwill be assigned to the group with fewer subjects, where the constantcan be c = 2, 3, or 4 as Zelen suggested.

Pocock-Simon’s model

Similar to the Zelen’s model, Pocock and Simon (1975) proposed analternative covariate-adaptive randomization procedure. We followRosenberger and Lachin’s descriptions of the method (Rosenberger andLachin, 2002). Let Nijk(n), i = 1, . . . , I, j = 0, 2, . . . , ni, and k = 1, 2(1 = treatment A, 2 = treatment B) be the number of patients in stra-tum j of covariate i on treatment k after n patients have been random-ized. Note that

∏Ii=1 ni = s is the total number of strata in the trial.

Suppose the (n + 1)th patient to be randomized is a member of stratar1, . . . , rI of covariates 1, . . . , I. Let Di(n) = Niri1 − Niri2. Define the fol-lowing weighted difference measure D(n) =

∑Ii=1 wi Di(n), where wi are

weights chosen depending on which covariates are deemed of greaterimportance. If D(n) is less than 1/2, then the weighted difference mea-sure indicates that B has been favored thus far for that set, r1, . . . , rI , ofstrata and the patient n+ 1 should be assigned with higher probabilityto treatment A, and vice versa. If D(n) is greater than 1/2, Pocock andSimon (1975) suggested biasing a coin with

p =c∗ + 1

3

and implementing the following rule: if D(n) < 1/2, then assign the nextpatient to treatment A with probability p; if D(n) > 1/2, then assignthe next patient to treatment B with probability p; and if D(n) = 1/2,then assign the next patient to treatment A with probability 1/2, wherec∗ ∈ [1/2, 1].

Note that if c∗ = 1, we have a rule very similar to Efron's biased-coin design described in the previous section. If c∗ = 2, we have the deterministic minimization method proposed by Taves (1974) (see also Simon, 1979). Many other rules could be derived following Zelen's rule and Taves's minimization method, with a biased-coin twist to add randomization. Efron (1980) described one such rule and applied it to a clinical trial in ovarian cancer research.

Pocock and Simon (1975) also generalized their covariate-adaptive randomization procedure to more than two treatments. They suggested that the following allocation rule be applied:

p_k = c∗ − 2(Kc∗ − 1)k/[K(K + 1)],   k = 1, …, K,

where K is the number of treatments.
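The two-treatment procedure can be sketched as follows (Python, our own names; two binary covariates with equal weights are illustrative, and c∗ = 1.5 gives p = (c∗ + 1)/3 ≈ 0.83):

```python
import random

def pocock_simon_assign(rng, levels, counts, weights, c_star=1.5):
    """Assign one patient (arm 0 = A, arm 1 = B) via the Pocock-Simon biased
    coin applied to the weighted marginal imbalance D(n).

    counts[i][level] is [N_A, N_B] for level `level` of covariate i;
    levels[i] is the new patient's level of covariate i."""
    d = sum(w * (counts[i][lev][0] - counts[i][lev][1])
            for i, (w, lev) in enumerate(zip(weights, levels)))
    p = (c_star + 1.0) / 3.0
    if d < 0.5:
        arm = 0 if rng.random() < p else 1   # A has been disfavored: favor A
    elif d > 0.5:
        arm = 1 if rng.random() < p else 0   # favor B
    else:
        arm = 0 if rng.random() < 0.5 else 1
    for i, lev in enumerate(levels):
        counts[i][lev][arm] += 1
    return arm

# two binary covariates, equal weights, randomly arriving patients
rng = random.Random(7)
counts = [[[0, 0] for _ in range(2)] for _ in range(2)]
for _ in range(400):
    pocock_simon_assign(rng, [rng.randrange(2), rng.randrange(2)], counts, [1.0, 1.0])
```

After a few hundred patients, every marginal level of every covariate stays close to a 1:1 split, which is the point of the procedure.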

Wei’s marginal urn design

In practice, when the number of covariates results in a large number of strata with small stratum sizes, using a separate urn in each stratum could result in treatment imbalance within strata. Wei (1978) proposed the marginal urn design to solve this problem. Instead of using a separate urn for each unique stratum, he suggested using the urn with maximum imbalance to perform each randomization. For a new subject with covariate values r(1), …, r(I), the treatment imbalance within each of the corresponding urns is calculated, and the urn with the greatest imbalance is used to generate the treatment assignment for the next subject. A ball is drawn from that urn and replaced; meanwhile, b balls representing the opposite treatment are added to each of the urns corresponding to that patient's covariate values. Wei (1978) called this approach a marginal urn design because it tends to balance treatment assignments within each category of each covariate marginally, and thus also jointly (Rosenberger and Lachin, 2002).

Imbalance minimization model

Imbalance minimization allocation has been advocated as an alternative to stratified randomization when there are large numbers of prognostic variables (Birkett, 1985). The allocation of a patient is determined as follows. A new patient is first classified according to the prognostic variables of interest. He/she is then tentatively assigned to each treatment group in turn, and a summary measure of the resulting treatment imbalance is calculated for each tentative assignment. The measure of imbalance is obtained by summing, over every level of each prognostic variable, the absolute value of the excess number of patients receiving one treatment rather than the other. The two measures are compared, and the final allocation is made to the group that minimizes the imbalance measure. As indicated by Birkett (1985), imbalance minimization helps preserve power.


Although minimization has been widely used in clinical trials, there is a concern that the deterministic nature of the allocation may enable the investigator to break the code, which could bias the enrollment of patients (Ravaris et al., 1976; Gillis and Ratkowsky, 1978; Weinthrau et al., 1977).

Atkinson optimal model

Atkinson (1982) considered a linear regression model and sought to minimize the variance of the treatment contrasts in the presence of important covariates. The allocation rule is given by

p_k = d_A(k, ξ_n)/Σ_{k=1}^K d_A(k, ξ_n),   (3.9)

where

ξ_n = arg max_ξ {|A′M⁻¹(ξ)A|⁻¹},   (3.10)

in which M = X′X is the p × p dispersion matrix from n observations and A is an s × p matrix of contrasts, s < p. More details regarding Atkinson's optimal model can be found in Atkinson and Donev (1992).

3.4 Response-Adaptive Randomization

Response-adaptive randomization is a randomization technique in which the allocation of patients to treatment groups is based on the responses (outcomes) of the previous patients. The purpose is to provide patients with the better/best treatment based on the knowledge about treatment effects available at that moment. As a result, response-adaptive randomization takes the ethical concern into consideration. Well-known response-adaptive models include the play-the-winner (PW) model, the randomized play-the-winner (RPW) model, Rosenberger's optimization model, the bandit model, and the optimal model with finite population. In what follows, these response-adaptive randomization models are briefly described.

Play-the-winner model

The play-the-winner (PW) model can easily be applied to clinical trials comparing two treatments (e.g., treatment A and treatment B) with binary outcomes (i.e., success or failure). The PW model assumes that the previous subject's outcome is available before the next patient is randomized, and the treatment assignment is based on the treatment response of the previous patient: if a patient responds to treatment A, the next patient is assigned to treatment A; similarly, if a patient responds to treatment B, the next patient is assigned to treatment B. If the assessment of the previous patient is not yet available, the treatment assignment can be based on the last patient with an available response assessment, or the patient can be assigned randomly to treatment A or B. It is obvious that this model lacks randomness.
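A sketch of the PW rule (our own names; responses are supplied as pre-specified success indicators, and switching treatments after a failure is the standard convention, which the text leaves implicit):

```python
import random

def play_the_winner(outcomes_a, outcomes_b, n, seed=0):
    """PW rule: stay on the same treatment after a success, switch after a
    failure; the first patient is assigned by a coin flip. outcomes_a and
    outcomes_b are success indicators consumed in order per arm."""
    rng = random.Random(seed)
    arm = 'A' if rng.random() < 0.5 else 'B'
    ia = ib = 0
    assignments = []
    for _ in range(n):
        assignments.append(arm)
        if arm == 'A':
            success = outcomes_a[ia]; ia += 1
        else:
            success = outcomes_b[ib]; ib += 1
        if not success:
            arm = 'B' if arm == 'A' else 'A'
    return assignments
```

With an always-successful arm A and an always-failing arm B, at most the first assignment can be B; the sequence then locks onto A, illustrating the model's lack of randomness.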

Randomized play-the-winner model

The randomized play-the-winner (RPW) model is a simple probabilistic model used to sequentially randomize subjects in a clinical trial (see, e.g., Rosenberger, 1999; Coad and Rosenberger, 1999). The RPW model is especially useful for clinical trials comparing two treatments with binary outcomes, and it assumes that the previous subject's outcome is available before the next patient is randomized. At the start of the clinical trial, an urn contains α_A balls for treatment A and α_B balls for treatment B, where α_A and α_B are positive integers. For convenience's sake, we denote these balls as type A and type B balls. When a subject is recruited, a ball is drawn and replaced. If it is a type A ball, the subject receives treatment A; if it is a type B ball, the subject receives treatment B. When a subject's outcome becomes available, the urn is updated: a success on treatment A or a failure on treatment B generates b additional type A balls, where b is a positive integer, and likewise a success on treatment B or a failure on treatment A generates b additional type B balls. In this way, the urn builds up more balls representing the more successful treatment.

There are some interesting asymptotic properties of the RPW model. Let Na/N be the proportion of subjects assigned to treatment A out of N subjects. Also, let qa = 1 − pa and qb = 1 − pb be the failure probabilities. Further, let F be the total number of failures. Then we have (Wei and Durham, 1978)

lim_{N→∞} Na/Nb = qb/qa,   (3.11)

lim_{N→∞} Na/N = qb/(qa + qb),

lim_{N→∞} F/N = 2 qa qb/(qa + qb).

Note that for balanced randomization, E(F/N) = (qa + qb)/2.

Since treatment assignment in the RPW model is based on the responses of previous patients, it is not optimized with respect to any clinical endpoint. It is reasonable to randomize treatment assignment based on some optimality criterion, such as minimizing the expected number of treatment failures. This leads to the so-called optimal designs.
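The RPW urn scheme and its limiting allocation proportion qb/(qa + qb) can be checked with a short simulation. The sketch below is illustrative (function names and parameter defaults are our own); it assumes immediate outcomes and the urn update rule described above.

```python
import random

def simulate_rpw(p_a, p_b, n, alpha=1, b=1, seed=0):
    """Simulate the RPW urn for n subjects; return the fraction assigned to A.

    A success on A or a failure on B adds b type-A balls; a success on B
    or a failure on A adds b type-B balls (outcomes assumed immediate).
    """
    rng = random.Random(seed)
    balls_a = balls_b = alpha          # alpha_A = alpha_B = alpha initial balls
    n_a = 0
    for _ in range(n):
        if rng.random() < balls_a / (balls_a + balls_b):   # draw with replacement
            n_a += 1
            if rng.random() < p_a:     # success on A
                balls_a += b
            else:
                balls_b += b
        else:
            if rng.random() < p_b:     # success on B
                balls_b += b
            else:
                balls_a += b
    return n_a / n

# With p_a = 0.7, p_b = 0.4, the limit q_b/(q_a + q_b) = 0.6/0.9 = 2/3.
avg_prop = sum(simulate_rpw(0.7, 0.4, 5000, seed=s) for s in range(5)) / 5
```

Averaging a few seeded runs smooths the well-known variability of the RPW allocation proportion.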


Optimal RPW model

Adaptive designs have long been proposed for ethical reasons. The basic idea is to skew allocation probabilities to reflect the response history of patients, hopefully giving a patient a greater than 50% chance of receiving the treatment performing better thus far in the trial. The optimal randomized play-the-winner (ORPW) model minimizes the number of failures in the trial.

There are three commonly used efficacy endpoints in clinical trials, namely, the simple proportion difference (pa − pb), the relative risk (pa/pb), and the odds ratio (pa qb/(pb qa)), where qa = 1 − pa and qb = 1 − pb are the failure rates. These can be estimated consistently by replacing pa and pb with p̂a and p̂b, the proportions of observed successes in treatment groups A and B, respectively. Suppose that we wish to find the optimal allocation r = na/nb that minimizes the expected number of treatment failures na qa + nb qb, which is mathematically given by

r∗ = arg min_r {na qa + nb qb}   (3.12)
   = arg min_r { [r/(1 + r)] n qa + [1/(1 + r)] n qb }.

For the simple proportion difference, the asymptotic variance is given by

pa qa/na + pb qb/nb = (1 + r)(pa qa + r pb qb)/(n r) = K,   (3.13)

where K is some constant. Solving (3.13) for n yields

n = (1 + r)(pa qa + r pb qb)/(r K).   (3.14)

Substituting (3.14) into (3.12), we obtain

r∗ = arg min_r { (r qa + qb)(pa qa + r pb qb)/(r K) }.   (3.15)

Taking the derivative of (3.15) with respect to r and setting it equal to zero, we have

r∗ = (pa/pb)^(1/2).
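This closed-form solution can be verified numerically. The sketch below (names our own) grid-searches the objective in (3.15) with K set to 1, since the minimizer does not depend on K, and compares the result to (pa/pb)^(1/2).

```python
# Grid-search check (K = 1) that r* = (p_a/p_b)^(1/2) minimizes
# the objective (r*q_a + q_b)(p_a*q_a + r*p_b*q_b)/r from (3.15).
def objective(r, p_a, p_b):
    q_a, q_b = 1 - p_a, 1 - p_b
    return (r * q_a + q_b) * (p_a * q_a + r * p_b * q_b) / r

p_a, p_b = 0.7, 0.4
grid = [0.2 + 0.0001 * i for i in range(40000)]          # r in (0.2, 4.2)
r_star = min(grid, key=lambda r: objective(r, p_a, p_b))
analytic = (p_a / p_b) ** 0.5                            # about 1.323
```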

Note that r∗ does not depend on K. Also note that the limiting allocation for the RPW rule, qb/qa, is not optimal for any of the three measures. It is also interesting to note that none of


Table 3.2 Asymptotic Variance with RPW

Measure                 r∗                       Asymptotic Variance

Proportion difference   (pa/pb)^(1/2)            pa qa/na + pb qb/nb
Relative risk           (pa/pb)^(1/2) (qb/qa)    pa qb^2/(na qa^3) + pb qb/(nb qa^2)
Odds ratio              (pb/pa)^(1/2) (qb/qa)    pa qb^2/(na qa^3 pb^2) + pb qb/(nb qa^2 pb^2)

Source: Rosenberger and Lachin (2002), p. 176.

the optimal allocation rules yields Neyman allocation, given by (Melfi and Page, 1998)

r∗ = (pa qa/(pb qb))^(1/2),

which minimizes the variance of the difference in sample proportions. Note that Neyman allocation would be unethical when pa > pb (i.e., more patients would receive the inferior treatment; see Table 3.2).

Because the optimal allocation depends on the unknown binomial parameters, we must develop a sequential design that approximates the optimal design. The rule for the proportion difference is simply to replace the unknown success probabilities in the optimal allocation rule by the current estimates of the proportions of successes (i.e., p̂a,n and p̂b,n) observed in each treatment group thus far. This leads to the so-called sequential maximum likelihood procedure. Alternatively, we can use a Bayesian approach such as the bandit allocation rule, where different optimality criteria can be utilized.

Bandit model

A bandit allocation rule is a Bayesian approach that utilizes prior information on unknown parameters in conjunction with incoming data to determine the optimal treatment assignment at each stage of the trial (Hardwick and Stout, 1991, 1993, 2002). The weighting of returns is known as discounting, which consists of multiplying the payoff of each outcome by the corresponding element of a discount sequence. The properties of any given bandit allocation rule depend upon the associated discount sequence and prior distribution.

Consider a two-armed bandit (TAB) design for the two-proportion difference. The procedure can be described as follows.


(1) Binary outcomes Xia and Xib for the two treatment groups are Bernoulli random variables:

Xia ∼ B(1, pa),   (3.16)
Xib ∼ B(1, pb),   i = 1, 2, . . . , n.

(2) The prior distributions are assumed to be beta:

pa ∼ Beta(a0, b0),   (3.17)
pb ∼ Beta(c0, d0).

(3) At stage m ≤ n, the posteriors of pa and pb are given by

(pa | k, i, j) ∼ Beta(a, b),   (3.18)
(pb | k, i, j) ∼ Beta(c, d),

where k = Σ_{l=1}^{m} δla is the number of subjects assigned to treatment A among the first m, i = Σ_{l=1}^{k} Xla, j = Σ_{l=1}^{m−k} Xlb, and

a = i + a0,
b = k − i + b0,
c = j + c0,
d = m − k − j + d0.   (3.19)

Thus, the posterior means of pa and pb at stage m are given by

Em[pa] = a/(a + b),

and

Em[pb] = c/(c + d),

where Em[·] denotes expectation under the model.

(4) Two commonly used discount sequences {1, β1, β2, . . . , βn} are the n-horizon uniform sequence, with all βi = 1, and the geometric sequence, with all βi = β (0 < β < 1).

(5) The allocation rule δ is defined to be a sequence (δ1, δ2, . . . , δn), where δi = 1 if the ith subject receives treatment A and δi = 0 if the ith subject receives treatment B. It is required that the decision δi at stage i depend only upon the information available at that time (not the


future). The two commonly used allocation rules are the Uniform Bandit and the Truncated Gittins Lower Bound.

The truncation here refers to a rule whereby, if a state is reached such that the final decision cannot be influenced by any further outcomes, then the treatment with the best success rate is used for all further subjects.

Uniform Bandit   The n-horizon uniform TAB uses prior and accumulated information to minimize the number of failures during the trial. Let Fm(i, j, k, l) denote the minimal possible expected number of failures remaining in the trial if m patients have already been treated and there were i successes and j failures on treatment A, and k successes and l failures on treatment B. (Note that one parameter can be eliminated since m = i + j + k + l.) The algorithmic approach is based on the observation that if A were used on the next patient, then the expected number of failures for patients m + 1 through n would be

Fm^A(i, j, k, l) = Em[pa] F_{m+1}(i + 1, j, k, l) + Em[1 − pa] (1 + F_{m+1}(i, j + 1, k, l)).   (3.20)

If B were used, we would have

Fm^B(i, j, k, l) = Em[pb] F_{m+1}(i, j, k + 1, l) + Em[1 − pb] (1 + F_{m+1}(i, j, k, l + 1)).   (3.21)

Therefore, F satisfies the recurrence

Fm(i, j, k, l) = min{ Fm^A(i, j, k, l), Fm^B(i, j, k, l) },   (3.22)

which can be solved by dynamic programming, starting with patient n and proceeding backward toward the first patient. The computation is of order O(n⁴).
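The backward recursion (3.20)–(3.22) can be sketched with memoized recursion for small n. The implementation below is illustrative (function names our own); it assumes independent beta priors and updates the state (i, j, k, l) as outcomes accrue.

```python
from functools import lru_cache

def min_expected_failures(n, a0=1, b0=1, c0=1, d0=1):
    """Minimal expected failures for the n-horizon uniform two-armed bandit
    with independent Beta(a0, b0) and Beta(c0, d0) priors on p_a and p_b.
    State (i, j, k, l) = (successes on A, failures on A, successes on B,
    failures on B); the stage is m = i + j + k + l."""

    @lru_cache(maxsize=None)
    def F(i, j, k, l):
        m = i + j + k + l
        if m == n:                            # trial over: no failures remain
            return 0.0
        ea = (i + a0) / (i + j + a0 + b0)     # posterior mean of p_a
        eb = (k + c0) / (k + l + c0 + d0)     # posterior mean of p_b
        fa = ea * F(i + 1, j, k, l) + (1 - ea) * (1 + F(i, j + 1, k, l))
        fb = eb * F(i, j, k + 1, l) + (1 - eb) * (1 + F(i, j, k, l + 1))
        return min(fa, fb)                    # recurrence (3.22)

    return F(0, 0, 0, 0)

# Under uniform priors any fixed allocation expects n/2 failures;
# the bandit strategy does strictly better.
opt = min_expected_failures(8)
```

For n = 1 the value is exactly 0.5, and for n = 8 it falls below the fixed-allocation value of 4.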

Gittins Lower Bound   According to a theorem of Gittins and Jones (see Berry and Fristedt, 1985), for bandit problems with geometric discounting and independent arms, there exists for each arm an index with the property that, at any given stage, it is optimal to select next the arm with the higher index. The index for an arm, the Gittins index, is a function only of the posterior distribution and the discount factor β.

The existence of the Gittins index removes many computational difficulties associated with other bandit problems.

Remarks   For small samples, the allocation rule can be implemented by means of dynamic programming (Hardwick and Stout, 1991). Sequential treatment allocation can also be based on other optimality criteria. For example, Hardwick and Stout (2002) developed an allocation rule based on maximizing the likelihood of making the correct


decision by utilizing a curtailed equal allocation rule with a minimal expected sample size. The optimization is given for any fixed |pa − pb| = ∆ among the curtailed (pruned) equal allocation rules. The pruning refers to a rule whereby, if a state is reached such that the sign of the final observed difference in success rates for the two groups cannot be influenced by any further outcomes, then the trial is stopped. The pruning could result in an insufficient sample size or power for clinical trials.

Rosenberger et al. (2002) used computer simulations to compare the ORPW, Neyman allocation, the RPW rule, and equal allocation. They found that the RPW rule tends to be highly variable for larger values of pa and pb. The adaptive structure of sequential designs induces dependencies that can result in extra-binomial variability, and this increased variability will decrease power to some extent. ORPW reduces the expected number of failures relative to equal allocation, by around 3 or 4 failures when pa and pb are small to moderate. When pa and pb are large, the reductions are more moderate, and it is questionable whether adaptive designs would improve much over equal allocation when a test is based on the proportion difference. For the RPW design, for example, if pa = 0.7 and pb = 0.9 with a sample size of 192, the RPW design has power 0.88 with an expected 31.5 failures, while the equal allocation design for 162 patients has power 0.90 with 32.4 failures.

The RPW rule does not require instantaneous outcomes, or even that they be available before randomization of the next subject. Investigators can update the urn whenever a subject's outcome is ascertained. The effect of this is to "slow" the adaptation, and hence there will be less benefit to subjects, particularly those recruited early. If the response delay is very large, it could be practically impossible to implement the RPW rule.

Bandit model for finite population

The bandit allocation rule discussed in the previous section is optimal in the sense that it minimizes the number of failures in the trial. In what follows, we discuss an optimality criterion defined over the entire patient population with the disease, and compare five different randomization procedures (Berry and Eick, 1995) under this criterion.

Suppose that the "patient horizon" is N. Each of the N patients is to be treated with one of two treatments, A and B. Treatment allocation is sequential for the first n patients, and the response is dichotomous and immediate. Let Zj, j = 1, . . . , N, denote the response of patient j: Zj = 1 for success and Zj = 0 for failure. The probability of a success with


treatment A is pa and with B is pb. We have that

E[Zj | pa, pb] = P[Zj = 1 | pa, pb] = pa if patient j receives treatment A,
E[Zj | pa, pb] = P[Zj = 1 | pa, pb] = pb if patient j receives treatment B.

Since treatment allocation for patients 1 to n is sequential, treatment assignment can depend upon the responses of all previously treated patients. However, treatment assignment for patients n + 1 to N can depend only on the responses of patients 1 to n. In all the procedures we consider, these latter patients receive the treatment with the larger fraction of successes among the first n. (If the two observed success proportions are equal, then the treatment with the greater number of observations is given to patients n + 1 to N.) Let D be the class of all treatment allocation procedures satisfying these restrictions.

The conditional worth W of procedure τ ∈ D (given pa and pb) is

Wτ(pa, pb) = Eτ[ Σ_{j=1}^{N} Zj | pa, pb ],   (3.23)

where the distribution of the Zj's is determined by τ. This can be no greater than N max{pa, pb}. The conditional expected successes lost (ESL) using τ is

Lτ(pa, pb) = N max{pa, pb} − Wτ(pa, pb).   (3.24)

This function is obviously non-negative for all τ.

Allocation Procedures   Berry and Eick (1995) considered four adaptive procedures and compared them with a balanced randomized design, or equal randomization (ER). All of these procedures are members of D. We describe the procedures on the basis of the way they allocate treatments to the first n patients. We assume for convenience that n is an even integer.

Procedure ER: Half of the first n patients are randomly assigned to treatment A and the other half to B. For comparison purposes, it does not matter whether patients are randomized in pairs or in blocks of larger size.

Procedure JB (J. Bather): Treatments A and B are randomly assigned to patients 1 and 2 so that one patient receives each. Suppose that during the trial m patients have been treated, 2 ≤ m < n, and assume that sa, fa, sb, fb successes and failures have been observed on A and B, respectively (sa + fa + sb + fb = m). Define

λ(k) = (4 + √k)/(15k).   (3.25)

Let λa = λ(sa + fa) and λb = λ(sb + fb). Procedure JB randomizes between the respective treatments, except that the randomization


probabilities depend upon the previously observed responses. Let

q = sa/(sa + fa) − sb/(sb + fb) + 2(λa − λb),   (3.26)

then under procedure JB, the next patient (patient m + 1) receives treatment A with probability

[λa/(λa + λb)] exp(q/λa) for q ≤ 0,

and

1 − [λb/(λa + λb)] exp(−q/λb) for q > 0.
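A minimal sketch of the JB assignment probability, following (3.25)–(3.26) and the assignment rule above (function names ours; the negative exponent in the q > 0 branch is our reconstruction of the rule):

```python
import math

def jb_prob_a(sa, fa, sb, fb):
    """Probability that the next patient receives treatment A under
    procedure JB, per (3.25)-(3.26) and the rule above (as reconstructed)."""
    lam = lambda k: (4 + math.sqrt(k)) / (15 * k)
    la, lb = lam(sa + fa), lam(sb + fb)
    q = sa / (sa + fa) - sb / (sb + fb) + 2 * (la - lb)
    if q <= 0:
        return la / (la + lb) * math.exp(q / la)
    return 1 - lb / (la + lb) * math.exp(-q / lb)

p_next = jb_prob_a(8, 2, 3, 7)   # A: 8/10 successes, B: 3/10
```

With symmetric data the rule is an even coin flip; when A is clearly ahead, the assignment probability is skewed strongly toward A.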

Procedure TAB is a two-armed bandit procedure in which the first n patients are randomized based on the current probability distribution of (pa, pb), assuming a uniform prior density on (pa, pb):

π(pa, pb) = 1 on (0, 1) × (0, 1).   (3.27)

The next patient receives treatment A with probability equal to the current probability that pa > pb. This probability is

∫_0^1 ∫_0^u u^{sa} (1 − u)^{fa} v^{sb} (1 − v)^{fb} dv du × {B(sa + 1, fa + 1) B(sb + 1, fb + 1)}^{−1},   (3.28)

where B(·, ·) is the complete beta function:

B(a, b) = ∫_0^1 u^{a−1} (1 − u)^{b−1} du.   (3.29)
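The posterior probability that pa > pb can be approximated by Monte Carlo, since under the uniform prior pa and pb have independent Beta(sa + 1, fa + 1) and Beta(sb + 1, fb + 1) posteriors. A sketch (names and defaults ours):

```python
import random

def prob_a_better(sa, fa, sb, fb, draws=200000, seed=0):
    """Monte Carlo estimate of P(p_a > p_b): under uniform priors the
    posteriors are Beta(sa+1, fa+1) and Beta(sb+1, fb+1)."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(sa + 1, fa + 1) > rng.betavariate(sb + 1, fb + 1)
               for _ in range(draws))
    return hits / draws

post_prob = prob_a_better(7, 3, 4, 6)
```

With no data the estimate sits near 0.5; with 7/10 successes on A versus 4/10 on B it moves well above it.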

Procedure PW (Play-the-winner/Switch-from-loser): The first patient receives treatment A or B with equal probability 0.5. For patients 2 to n, the treatment given to the previous patient is used again if it was successful; otherwise the other treatment is used.
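A short simulation contrasting procedure PW with equal randomization illustrates the expected-failure saving; this sketch (names and parameter values ours) assumes immediate binary responses:

```python
import random

def pw_failures(p_a, p_b, n, seed=0):
    """One run of play-the-winner/switch-from-loser; returns total failures."""
    rng = random.Random(seed)
    arm = rng.choice("AB")                     # first patient: fair coin
    failures = 0
    for _ in range(n):
        success = rng.random() < (p_a if arm == "A" else p_b)
        if not success:
            failures += 1
            arm = "B" if arm == "A" else "A"   # switch from the loser
    return failures

p_a, p_b, n, reps = 0.8, 0.3, 100, 500
avg_pw = sum(pw_failures(p_a, p_b, n, seed=s) for s in range(reps)) / reps
eq_expected = n * ((1 - p_a) + (1 - p_b)) / 2  # equal allocation: 45 failures
```

PW spends a fraction qb/(qa + qb) of the time on the better arm, so its average failure count falls well below the equal-allocation value.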

Procedure RB (Robust Bayes): This strategy is optimal in the following two-armed bandit problem. Suppose that the uniform prior density of (pa, pb) is given, and the discount sequence β = (β1, β2, . . . , βN) is defined by

βi = 1 for 1 ≤ i ≤ n,
βi = N − n for i = n + 1,
βi = 0 for i > n + 1.   (3.30)

In effect, all N patients have equal weight: the first n patients each carry weight 1, while stage n + 1 carries weight N − n, representing the N − n patients treated after the trial. Procedure RB maximizes

∫_0^1 ∫_0^1 Wτ(pa, pb; β) dpa dpb   (3.31)


over all τ ∈ D, where

Wτ(pa, pb; β) = Eτ[ Σ_{j=1}^{N} βj Zj | pa, pb ].

This maximum can be found using dynamic programming. The starting point is after the n patients in the trial have responded. The subsequent expected number of successes is N − n times the maximum of the current expected values of pa and pb. If both treatments are judged equally effective at any stage, then procedure RB randomizes the next treatment assignment.

Procedure RB comprises a dynamic programming process. The symmetry of the uniform prior distribution implies that the treatments are initially equivalent. For the first patient, one is chosen at random. If the first patient has a success, then the second patient receives the same treatment. If the first patient has a failure, then the second patient receives the other treatment. Thus, procedure RB imitates procedure PW for the first two treatment assignments. The same treatment is used as long as it is successful, again imitating PW. However, after a failure, switching to the other treatment may or may not be optimal. If the data sufficiently strongly favor the treatment that has just failed, then that treatment will be used again.

When following procedure RB, if the current probability of success for treatment A (which is the current expected value of pa) is greater than that for treatment B, then treatment A may or may not be optimal for the next patient. If the current number of patients on treatment A, sa + fa, is smaller than the number of patients on B, sb + fb, then A is indeed optimal. However, if sa + fa is greater than sb + fb, then, for sufficiently large N, treatment B is optimal irrespective of the current expected values of pa and pb. Procedure RB tends to assign the currently superior treatment, but less so for large N than for small N. As N increases, gathering information early on by balancing the treatment assignments becomes important. Thus, assignment to the two treatments tends to be more balanced when N is large than when it is small.

Some comparisons between these procedures are possible without detailed calculations. Procedure ER is designed to obtain information from the first n patients that will help patients outside the trial. Because it gives maximal information about pa − pb, its performance relative to the other procedures improves as N increases.

Procedure RB is designed to perform well on the average for anyn, N, pa, and pb. Of the five procedures described, it alone specificallyuses the value of N, giving it an advantage over the other procedures.


Procedure PW ignores most of the accumulating data; its treatment assignments are based not on sufficient statistics but only on the result for the previous patient. On the other hand, since PW tends to allocate patients to both treatments except when one or both p's are close to 1, it should perform well when N is large.

Procedures JB and TAB are quite similar. Both randomize allocations so that the currently superior treatment is more likely to be assigned.

As indicated above, RB maximizes (3.31) over all procedures in D. Thus, procedures PW, JB, and TAB will not perform better than RB when averaged over pa and pb. However, they might outperform RB for some N and some moderately large set of (pa, pb). The computer simulations showed that they do not.

Remarks   Berry and Eick (1995) conducted computer simulations comparing the five methods above with N = 100; 1,000; 10,000; and 100,000. Their main conclusion is that a balanced randomized design is nearly optimal when the disease is relatively common, e.g., when N is moderately large (such as N ≥ 10,000). However, when a substantial portion of the patient population is involved in the trial (as with a rare form of cancer), adaptive procedures can perform substantially better than balanced randomization. Many relevant questions need to be answered before these adaptive allocations can provide practical advantages, including: (i) How relevant is the condition? (ii) How effective are treatments A and B? (iii) Are other effective treatments available? (iv) How long will it take to discover a new treatment that is clearly superior to both A and B? In addition, if a Bayesian approach is used, should pa and pb have different priors because the control is an approved drug?

Adaptive models for ordinal and continuous outcomes

Ordinal Outcome   Ivanova and Flournoy (2001) developed an urn model, called the ternary urn model, for categorical outcomes. In this section, we introduce an urn model for ordinal outcomes. This model falls into Rosenberger's treatment-effect mapping framework, whose allocation rule is given by

P(δj | ∆j−1) = g(Ej−1),

where g is a function of the treatment effect Ej−1 observed at stage j − 1.

We propose here a response-adaptive model with ordinal outcomes

for multiple treatments. Suppose there are K treatment groups in thetrial and the primary response is ordinal with M categories. Without


loss of generality, let Cj (j = 1, . . . , M) be the integer scale for the ordinal response, with a higher score indicating a more desirable outcome. The response-adaptive urn model is defined as follows. There are K types of balls in an urn, initially ai balls of type i. The treatment assignment for a patient is determined by the ball type randomly drawn from the urn with replacement: if a ball of type k is drawn, the patient is assigned treatment k. The responses of all treated patients are then observed. If a patient on treatment i has response Cj at stage n (i.e., after n patients have been treated), then Cj balls of type i are added to the urn. This procedure is repeated for the treatment assignment of all patients in the trial.
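The proposed ordinal urn can be sketched as follows; the function name, initial ball counts, and score distributions are illustrative assumptions:

```python
import random

def ordinal_urn_trial(score_probs, n, init_balls=5, seed=0):
    """Simulate the response-adaptive urn for K treatments with ordinal
    scores C_j = j (j = 1..M); score_probs[k][j-1] = P(score j | treatment k).
    A response with score C_j adds C_j balls of the assigned treatment's type."""
    rng = random.Random(seed)
    balls = [init_balls] * len(score_probs)
    for _ in range(n):
        k = rng.choices(range(len(balls)), weights=balls)[0]   # draw w/ replacement
        score = rng.choices(range(1, len(score_probs[k]) + 1),
                            weights=score_probs[k])[0]
        balls[k] += score                                      # reinforce arm k
    return balls

# Treatment 0 tends to score higher on an M = 3 scale, so its ball
# count should come to dominate the urn; average a few seeded runs.
totals = [0, 0]
for s in range(10):
    b = ordinal_urn_trial([[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]], n=500, seed=s)
    totals[0] += b[0]
    totals[1] += b[1]
```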

Normal Outcome   Let Yi be a continuous variable representing the response of the ith patient, treated with either A or B under the adaptive design. Assume responses are instantaneous and normally distributed. Suppose μa and μb are population characteristics representing the effects of treatments A and B, respectively (a larger value of μ indicates a better result). For the ith patient, we define an indicator variable δi, which takes the value 1 or 0 according to whether the patient is treated with A or B. The adaptive allocation rule is then described as follows:

For the initial two patients, we randomly assign one to each of treatments A and B. For patient i + 1 (2 < i ≤ n), we assign him/her to treatment A with probability

Pa(δi+1 | δ1, . . . , δi, Y1, . . . , Yi) = [ Φ( (μ̂a − μ̂b) / (σ̂p √(1/ia + 1/ib)) ) ]^α,   (3.32)

where α is a constant that can be determined by an optimality criterion, Φ(·) is the standard normal cumulative distribution function, ia and ib are the numbers of patients on treatments A and B at stage i, and the pooled variance is

σ̂p² = [ (ia − 1) σ̂a² + (ib − 1) σ̂b² ] / (ia + ib − 2),

and μ̂a = Ȳa and μ̂b = Ȳb. Note that Bandyopadhyay and Biswas (1997) suggested using the allocation probability Φ((μ̂a − μ̂b)/T), where T is a constant.
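Allocation rule (3.32) can be sketched as follows (function names ours; α entering as an exponent, with α = 1 as the default, is our reading of the formula):

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_assign_a(ya, yb, alpha=1.0):
    """Allocation probability for the next patient: a probit mapping of the
    two-sample z-statistic, raised to the power alpha (alpha = 1 default)."""
    ia, ib = len(ya), len(yb)
    ma, mb = sum(ya) / ia, sum(yb) / ib
    va = sum((y - ma) ** 2 for y in ya) / (ia - 1)   # sample variances
    vb = sum((y - mb) ** 2 for y in yb) / (ib - 1)
    sp2 = ((ia - 1) * va + (ib - 1) * vb) / (ia + ib - 2)   # pooled variance
    z = (ma - mb) / math.sqrt(sp2 * (1 / ia + 1 / ib))
    return norm_cdf(z) ** alpha

p_next_a = prob_assign_a([5.1, 6.0, 5.5], [4.0, 4.4, 4.2])
```

With equal observed means the rule assigns A with probability 0.5; as the observed advantage of A grows, the probability moves toward 1.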

Survival Outcome   Rosenberger and Seshaiyer (1997) proposed a treatment-effect mapping g(S) = 0.5(1 + S), where S is the centered and scaled logrank test statistic. We suggest using the optimal model proposed in the previous section, since the logrank test statistic is asymptotically normally distributed.


3.5 Issues with Adaptive Randomization

Rosenberger and Lachin (2002) discussed the issues with adaptive designs. They classified the potential biases, including accrual bias, accidental bias, and selection bias, as summarized below.

Accrual bias

The RPW or other adaptive designs may lead to a unique type of bias, i.e., accrual bias, whereby volunteers may wish to be recruited later in the study to take advantage of the benefit of previous outcomes. Earlier subjects are more likely to receive the inferior treatment.

Accidental bias

Efron (1971) introduced the term accidental bias to describe the bias in estimation of a treatment effect induced by an unobserved covariate. The bias in the estimate of the treatment effect, (E(α̂) − α)², is minimized when treatment assignment is balanced, where α and α̂ are the true treatment effect and the treatment effect estimated through linear regression, respectively. The bound on the bias due to unbalanced treatment assignments is controlled by the maximum eigenvalue of the covariance matrix of the treatment assignment sequence. The eigenvalues for different randomization models are presented in Table 3.3.

Accidental bias does not appear to be a serious problem for any of the randomization models discussed so far, except for the truncated binomial design. More details regarding accidental bias can be found in Rosenberger and Lachin (2002).

Table 3.3 Accidental Bias for Various Randomization Models

Model Name                             Maximum Eigenvalue, λmax

Complete randomization                 1
Lachin's allocation rule               1 + 1/(n − 1)
Stratified Lachin's allocation rule    1 + 1/(m − 1)
Truncated binomial model               √(πn/3) ≤ λmax ≤ √(n/2)
Friedman–Wei's urn model               1 + (2/3)(ln n)/n + O(n⁻¹)

Note: n = sample size, and m = sample size within each stratum.


Selection bias

Selection bias refers to biases that are introduced into an unmasked study because an investigator may be able to guess the treatment assignment of future patients based on knowing the treatments assigned to past patients. Patients usually enter a trial sequentially over time. Staggered entry allows the possibility for a study investigator to alter the composition of the groups by attempting to guess which treatment will be assigned next. Based on whichever treatment is guessed to be assigned next, the investigator can then choose the next patient scheduled for randomization to be one whom the investigator considers better suited for that treatment. One of the principal concerns in an unmasked study is that a study investigator might attempt to "beat the randomization" and recruit patients in a manner such that each patient is assigned to whichever treatment group the investigator feels is best suited to that individual patient (Rosenberger and Lachin, 2002).

Blackwell and Hodges (1957) developed a model for selection bias. Under this model, selection bias can be measured by the so-called expected bias factor,

E(F) = E(G − n/2), (3.33)

where G is the total number of correct guesses and n/2 is the number of patients in each of the two groups.

Blackwell and Hodges (1957) showed that the optimal strategy for the experimenter upon randomizing the jth patient is to guess treatment A when NA(j − 1) < NB(j − 1) and B when NA(j − 1) > NB(j − 1). When there is a tie, the experimenter guesses with equal probability. This is called the convergence strategy. The expected bias factors under the convergence strategy for various randomization models are presented in Table 3.4.
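The expected bias factor for the random allocation rule, 2^(n−1)/C(n, n/2) − 1/2 (Table 3.4), can be verified by simulating the convergence strategy; the sketch below (names ours) assumes a forced-balance allocation of n/2 patients per arm:

```python
import math
import random

def expected_bias_factor(n):
    """Closed-form E(F) for the random allocation rule (Blackwell-Hodges):
    2^(n-1)/C(n, n/2) - 1/2."""
    return 2 ** (n - 1) / math.comb(n, n // 2) - 0.5

def simulate_bias_factor(n, reps=20000, seed=0):
    """Monte Carlo E(G - n/2) under the convergence strategy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        seq = [1] * (n // 2) + [0] * (n // 2)   # 1 = treatment A
        rng.shuffle(seq)                         # random allocation rule
        na = nb = correct = 0
        for t in seq:
            if na < nb:
                guess = 1                        # guess the lagging arm
            elif na > nb:
                guess = 0
            else:
                guess = int(rng.random() < 0.5)  # tie: guess at random
            correct += int(guess == t)
            na += t
            nb += 1 - t
        total += correct - n / 2
    return total / reps

n = 8
exact = expected_bias_factor(n)     # 128/70 - 0.5, about 1.33
approx = simulate_bias_factor(n)
```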

Inferential analysis

Analyses based on a randomization model are completely different from traditional analyses using hypothesis tests of population parameters under the Neyman–Pearson paradigm. The most commonly used basis for the development of a statistical test is the concept of a population model, where it is assumed that the sample of patients is representative of a reference population and that the patient responses to treatment are independent and identically distributed from a distribution dependent on unknown population parameters. A null hypothesis under a population model is typically based on the equality of parameters from known distributions. Permutation tests, or randomization tests, are nonparametric tests. The null hypothesis of a permutation test is that the assignment of treatment A versus B has no effect on the responses of the


Table 3.4 Expected Selection Bias Factors Under Convergence Strategy

Model                                  Expected Bias Factor, E(F)

Complete randomization                 0
Lachin's allocation rule               2^(n−1)/C(n, n/2) − 1/2
Stratified Lachin's allocation rule    M [2^(m−1)/C(m, m/2) − 1/2]
Truncated binomial model               (n/2^(n+1)) C(n, n/2)
Friedman–Wei's urn model               Σ_{i=1}^{n} [1/2 + β E(|D_{i−1}|)/(2(2α + β(i − 1)))] − n/2

Note: n = sample size, m = number of patients per block, and M = number of blocks.

n patients randomized in the study. The essential feature of a permutation test is that under the randomization null hypothesis, the set of observed responses is assumed to be a set of deterministic values that are unaffected by treatment. The observed difference between the treatment groups depends only upon the way in which the n patients were randomized. Permutation tests are assumption-free, but depend explicitly upon the particular randomization procedure used.

A number of questions arise about the permutation test. (1) What measure of extremeness, or test statistic, should be used? The most general family of permutation tests is the family of linear rank tests. Linear rank tests are used often in clinical trials, and the family includes such tests as the traditional Wilcoxon rank-sum test and the logrank test. (2) Which set of permutations of the randomization sequence should be used for comparison? (3) If the analysis of a clinical trial is based on a randomization model that does not in any way involve the notion of a population, how can results of the trial be generalized to determine the best care for future patients? This weakness, however, exists in the population model too.
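A minimal two-sample permutation test using the difference in means as the test statistic (rather than a linear rank statistic) illustrates the mechanics; function names and data are ours:

```python
import random

def permutation_test(x, y, reps=10000, seed=0):
    """Two-sample permutation test with the difference in means as the
    test statistic; returns a two-sided Monte Carlo p-value."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    nx = len(x)
    observed = abs(sum(x) / nx - sum(y) / len(y))
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)                    # re-randomize group labels
        xs, ys = pooled[:nx], pooled[nx:]
        if abs(sum(xs) / nx - sum(ys) / len(ys)) >= observed:
            extreme += 1
    return extreme / reps

p_null = permutation_test([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
p_clear = permutation_test([1, 2, 3], [10, 11, 12])
```

With a clear separation of groups the p-value approaches its minimum possible value for the sample size (2/C(6, 3) = 0.1 here); with heavy overlap it is unremarkable.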

Power and sample size

For the urn UD(α, β) design, if the total sample size n = 2m is specified, a perfectly balanced design with na = nb = m will minimize the quantity

η = 1/na + 1/nb,


which is η = 2/m. If n is not known beforehand, it is interesting to know how many extra observations are needed for UD(0, β) to reduce η to a value less than or equal to 2/m. That is, we continue taking observations until na and nb satisfy

1/na + 1/nb ≤ 2/m.   (3.34)

If we write na + nb = 2m + ν, then ν is the number of additional observations required by UD(0, β) to satisfy this condition. It follows (Wei, 1978) that for large m,

Pr(ν ≤ z) ≈ Φ[(3z)^(1/2)] − Φ[−(3z)^(1/2)].   (3.35)

For large m, Pr[ν ≤ 4] is approximately 0.9995, and thus UD(0, β) needs at most 4 extra observations to satisfy the above inequality, i.e., to achieve the same efficiency as the perfectly balanced randomization allocation rule.
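The approximation (3.35) at z = 4 can be checked directly (function names ours):

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_extra_obs(z):
    """Large-m approximation (3.35): Pr(nu <= z) = Phi(sqrt(3z)) - Phi(-sqrt(3z))."""
    s = math.sqrt(3 * z)
    return norm_cdf(s) - norm_cdf(-s)

p4 = prob_extra_obs(4)   # close to the 0.9995 quoted in the text
```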

3.6 Summary

In this chapter, we have discussed several types of adaptive randomization. Theoretically, outcomes (efficacy or safety) under response-dependent randomization are not independent; therefore, population-based inferential analysis distinguishes the two types of adaptive designs. However, randomization-based analysis (the permutation test) can be used under both types of adaptive randomization. When a test statistic for a non-adaptively randomized trial is to be used for an adaptive randomization trial, the corresponding power/sample size calculation for the non-adaptive randomization trial can also be used. For small sample sizes, permutation methods can be used for inferential analysis, confidence interval estimation, and power/sample size estimation. An adaptive randomization can be either optimal or intuitive; the outcome can be binary, ordinal, or continuous; and the adaptive approach can be Bayesian or non-Bayesian. Our discussion has focused on the case of two treatment groups, but it can easily be extended to multiple arms.


CHAPTER 4

Adaptive Hypotheses

Modifications of hypotheses of on-going clinical trials based on accrued data can certainly have an impact on the statistical power for testing the hypotheses with the pre-selected sample size. Modifications of hypotheses of on-going trials commonly occur during the conduct of a clinical trial for the following reasons: (i) an investigational method has not yet been validated at the planning stage of the study, (ii) information from other studies is necessary for planning the next stage of the study, (iii) there is a need to include new doses, and (iv) recommendations from a pre-established data (safety) monitoring committee (DMC) (Hommel, 2001). In addition, to increase the probability of success, the sponsors may switch a superiority hypothesis (originally planned) to a non-inferiority hypothesis. In this chapter, we will refer to adaptive hypotheses as modifications of hypotheses of on-going trials based on accrued data. Adaptive hypotheses can certainly affect the clinically meaningful difference (e.g., effect size, non-inferiority margin, or equivalence limit) to be detected, and consequently the sample size must be adjusted to achieve the desired power.

In this chapter, we will examine the impact of a modification to hypotheses on the type I error rate, the statistical power of the test, and the sample size for achieving the desired power. For a given clinical trial, the situations where hypotheses are modified as deemed appropriate by the investigator or as recommended by an independent data safety monitoring board (DSMB) after the review of interim data during the conduct of the clinical trial are described in the next section. In Section 4.2, the choice of non-inferiority margin, the change in statistical inference, and the impact on sample size calculation when switching from a superiority hypothesis to a non-inferiority hypothesis are discussed. Multiple hypotheses such as independent versus dependent and/or primary versus secondary hypotheses are discussed in Section 4.3. Also included in this section is a proposed decision theory approach for testing multiple hypotheses. A brief concluding remark is given in the last section.


4.1 Modifications of Hypotheses

In clinical trials with planned data monitoring for safety and interim analyses for efficacy, a recommendation for modifying or changing the hypotheses is commonly made after the review of interim data. The purpose of such a recommendation is to ensure the success of the clinical trial in identifying the best possible clinical benefits for the patients who enter the trial. In practice, the following situations are commonly encountered.

The first commonly seen situation for modifying hypotheses during the conduct of a clinical trial is switching a superiority hypothesis to a non-inferiority hypothesis. For a promising compound, the sponsor would prefer an aggressive approach and plan a superiority study. The study is usually powered to compare the promising compound with a placebo control or an active-control agent. However, the results may not support superiority at the interim analysis. In this case, instead of declaring the failure of the superiority trial, the independent data monitoring committee may recommend switching from testing the superiority hypothesis to testing a non-inferiority hypothesis. The switch from a superiority hypothesis to a non-inferiority hypothesis will certainly increase the probability of success of the trial, because the study objective has been modified to establish non-inferiority rather than show superiority. Note that the concept of switching a superiority hypothesis to a non-inferiority hypothesis is accepted by regulatory agencies such as the U.S. FDA, provided that the impact of the switch on statistical issues (e.g., the determination of the non-inferiority margin) and on inference (e.g., appropriate statistical methods) for the assessment of treatment effect is well justified. More details regarding the switch from a superiority hypothesis to a non-inferiority hypothesis are given in the next section.

Another commonly seen situation where the hypotheses are modified during the conduct of a clinical trial is the switch from a single hypothesis to a composite hypothesis or multiple hypotheses. A composite hypothesis is defined as a hypothesis that involves more than one study endpoint. These study endpoints may or may not be independent. In many clinical trials, in addition to the primary study endpoint, some clinical benefits may be observed based on the analysis/review of interim data from secondary endpoints for efficacy and/or safety. It is then of particular interest to the sponsor to change from testing a single hypothesis for the primary study endpoint to testing a composite hypothesis for the primary endpoint in conjunction with several secondary endpoints for clinical benefits, or multiple hypotheses for the primary endpoint and the secondary endpoints. More details regarding testing multiple hypotheses are given in Section 4.3.


Other situations where the hypotheses are modified during the conduct of a clinical trial include (i) a change in hypotheses due to a switch in study endpoints, (ii) dropping ineffective treatment arms, and (iii) an interchange between the null hypothesis and the alternative hypothesis. These situations are briefly described below.

In cancer trials, there is no universal agreement regarding which study endpoint should be used as the primary study endpoint for evaluation of the test treatment under investigation. Response rate, time to disease progression, and survival are commonly used study endpoints in cancer clinical trials (see Williams, Pazdur, and Temple, 2004). A typical approach is to choose one study endpoint as the primary endpoint for efficacy. Power analysis for sample size calculation is then performed based on the primary endpoint, and other study endpoints are considered secondary endpoints for clinical benefits. After the review of the interim data, the investigator may consider switching the primary endpoint to a secondary endpoint if no evidence of substantial efficacy in terms of the originally selected primary endpoint (e.g., response rate) is observed, but a significant improvement in efficacy is detected in one of the secondary endpoints (e.g., time to disease progression or median survival time).

For clinical trials comparing several treatments or several doses of the same treatment with a placebo or an active-control agent, a parallel-group design is usually considered. After the review of the interim data, it is desirable, on ethical grounds, to drop the treatment groups or dose groups that either show no efficacy or exhibit serious safety problems. It is also desirable to modify the dose and/or dose regimen for patients who are still on the study to achieve the best clinical results. As a result, the hypotheses and the corresponding statistical methods for testing the treatment effect must be modified for a valid and fair assessment of the effect of the test treatment under investigation. More details regarding dropping the losers are discussed in Chapter 8.

In some cases, we may consider switching the null hypothesis and the alternative hypothesis. For example, a pharmaceutical company may conduct a bioavailability study of the relative bioavailability of a newly developed formulation as compared to the approved formulation by testing the null hypothesis of bioinequivalence against the alternative hypothesis of bioequivalence. The idea is to reject the null hypothesis and conclude the alternative hypothesis. After the review of the interim data, the sponsor may realize that the relative bioavailabilities of the two formulations are not similar. As a result, instead of establishing bioequivalence, the sponsor may wish to demonstrate superiority in bioavailability for the new formulation.


4.2 Switch from Superiority to Non-Inferiority

As indicated in the previous section, it is not uncommon to switch from a superiority hypothesis (the originally planned hypothesis) to a non-inferiority hypothesis during the conduct of a clinical trial. The purpose of this switch is to increase the probability of success. For testing superiority, if we fail to reject the null hypothesis of non-superiority, the trial is considered a failure. On the other hand, the rejection of the null hypothesis of inferiority provides the opportunity for testing superiority without paying any statistical penalty, due to the closed testing procedure.

When comparing a test treatment with a standard therapy or an active control agent, as indicated by Chow, Shao, and Wang (2003), the problem of testing non-inferiority and superiority can be unified by the following hypotheses:

H0 : ε ≤ δ versus Ha : ε > δ,

where ε = µ2 − µ1 is the difference in mean responses between the test treatment (µ2) and the active control agent (µ1), and δ is the clinical superiority or non-inferiority margin. In practice, when δ > 0, the rejection of the null hypothesis indicates clinical superiority over the reference drug product. When δ < 0, the rejection of the null hypothesis implies non-inferiority against the reference drug product. Note that when δ = 0, the above hypotheses are referred to as hypotheses for testing statistical superiority, which is often confused with clinical superiority.
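As an illustrative sketch of this unified one-sided test (the function name and numbers are ours, and the standard error of the estimated difference is treated as known):

```python
from statistics import NormalDist

def unified_test(eps_hat, se, delta, alpha=0.05):
    # One-sided z-test of H0: eps <= delta versus Ha: eps > delta,
    # where eps_hat estimates eps = mu2 - mu1 and se is its standard error.
    # delta > 0: clinical superiority; delta < 0: non-inferiority;
    # delta = 0: statistical superiority.
    z = (eps_hat - delta) / se
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha

# Non-inferiority with margin delta = -0.2, observed difference 0.05, SE 0.1
z, p, reject = unified_test(0.05, 0.1, -0.2)
print(round(z, 2), round(p, 4), reject)  # 2.5 0.0062 True
```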

Non-inferiority margin

One of the major considerations in a non-inferiority test is the selection of the non-inferiority margin. A different choice of non-inferiority margin may affect the method of analyzing clinical data and consequently may alter the conclusion of the clinical study. As pointed out in the guideline by the International Conference on Harmonization (ICH), the determination of non-inferiority margins should be based on both statistical reasoning and clinical judgment. Despite the existence of some studies, there is no established rule or gold standard for the determination of non-inferiority margins in active control trials.

According to the ICH E10 Guideline, a non-inferiority margin may be selected based on past experience in placebo control trials with valid design under conditions similar to those planned for the new trial, and the determination of a non-inferiority margin should not only reflect uncertainties in the evidence on which the choice is based, but also be suitably conservative. Furthermore, as a basic frequentist statistical principle,


the hypothesis of non-inferiority should be formulated with population parameters, not estimates from historical trials. Along these lines, Chow and Shao (2006) proposed a method of selecting non-inferiority margins with some statistical justification. The non-inferiority margin proposed by Chow and Shao depends on population parameters, including parameters related to the placebo control if it were not replaced by the active control. Unless a fixed (constant) non-inferiority margin can be chosen based on clinical judgment, a fixed non-inferiority margin not depending on population parameters is rarely suitable. Intuitively, the non-inferiority margin should be small when the effect of the active control agent relative to placebo is small or the variation in the population under investigation is large. Chow and Shao's approach ensures that the efficacy of the test therapy is superior to placebo when non-inferiority is concluded. When necessary or desired, their approach can produce a non-inferiority margin that ensures that the efficacy of the test therapy relative to placebo can be established with great confidence.

Because the proposed non-inferiority margin depends on population parameters, the non-inferiority test designed for the situation where the non-inferiority margin is fixed has to be modified in order to apply to the case where the non-inferiority margin is a parameter. In what follows, Chow and Shao's method for the determination of the non-inferiority margin is described.

Chow and Shao's Approach. Let θT, θA, and θP be the unknown population efficacy parameters associated with the test therapy, the active control agent, and the placebo, respectively. Also, let δ ≥ 0 be a non-inferiority margin. Without loss of generality, we assume that a large value of the population efficacy parameter is desired. The hypotheses for non-inferiority can be formulated as

H0 : θT − θA ≤ −δ versus Ha : θT − θA > −δ. (4.1)

If δ is a fixed pre-specified value, then standard statistical methods can be applied to testing hypotheses (4.1). In practice, however, δ is often unknown.

One existing approach constructs the value of δ from a placebo-controlled historical trial; for example, δ = a fraction of the lower limit of the 95% confidence interval for θA − θP based on some historical trial data (see, e.g., CBER/FDA Memorandum, 1999). Although this approach is intuitively conservative, it is not statistically valid because (i) if the lower confidence limit is treated as a fixed value, then the variability in the historical data is ignored; and (ii) if the lower confidence limit is treated as a statistic, then this approach violates the basic frequentist statistical principle, i.e., the hypotheses being tested should not involve any estimates from current or past trials.


From a statistical point of view, the ICH E10 Guideline suggests that the non-inferiority margin δ should be chosen to satisfy at least the following two criteria:

Criterion 1. The ability to claim that the test therapy is non-inferior to the active-control agent and is superior to the placebo (even though the placebo is not considered in the active-control trial).

Criterion 2. The non-inferiority margin should be suitably conservative, i.e., variability should be taken into account.

A fixed δ (i.e., one that does not depend on any parameters) is rarely suitable under Criterion 1. Let ∆ > 0 be the clinical superiority margin that would be used if a placebo-controlled trial were conducted to establish the clinical superiority of the test therapy over a placebo control. Since the active control is an established therapy, we may assume that θA − θP > ∆. However, when θT − θA > −δ (i.e., the test therapy is non-inferior to the active control) for a fixed δ, we cannot ensure that θT − θP > ∆ (i.e., that the test therapy is clinically superior to the placebo) unless δ = 0.

Thus, it is reasonable to consider non-inferiority margins depending on unknown parameters. Hung et al. (2003) summarized the approach of using a non-inferiority margin of the form

δ = γ (θA − θP), (4.2)

where γ is a fixed constant between 0 and 1. This is based on the idea of preserving a certain fraction of the active control effect θA − θP: the smaller θA − θP is, the smaller δ. How to select the proportion γ, however, is not discussed.

Following the idea of Chow and Shao (2006), we now derive a non-inferiority margin satisfying Criterion 1. Let ∆ > 0 be the clinical superiority margin if a placebo control were added to the trial. Suppose that the non-inferiority margin δ is proportional to ∆, i.e., δ = r∆, where r is a known value chosen at the beginning of the trial. To be conservative, r should be ≤ 1. If the test therapy is not inferior to the active-control agent and is superior to the placebo, then both

θT − θA > −δ and θT − θP > ∆ (4.3)

should hold. Under the worst scenario, i.e., when θT − θA achieves its lower bound −δ, the largest possible δ satisfying (4.3) is given by

δ = θA − θP − ∆,

which leads to

δ = r/(1 + r) · (θA − θP).    (4.4)

From (4.2) and (4.4), γ = r/(r + 1). If 0 < r ≤ 1, then 0 < γ ≤ 1/2.
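This algebra is easy to verify numerically (a sketch with illustrative values; under the worst case θT = θA − δ, the implied placebo superiority margin ∆ = δ/r is attained exactly):

```python
# Numeric check of (4.4): with delta = r * Delta and the worst case
# thetaT = thetaA - delta, requiring thetaT - thetaP = Delta to hold
# with equality gives delta = (r / (1 + r)) * (thetaA - thetaP).
r = 0.5
theta_a, theta_p = 3.0, 1.0          # illustrative values

delta = r / (1.0 + r) * (theta_a - theta_p)  # margin (4.4)
Delta = delta / r                            # implied superiority margin
theta_t = theta_a - delta                    # worst case

assert abs((theta_t - theta_p) - Delta) < 1e-12
print(delta, Delta)
```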


The above argument in determining δ takes Criterion 1 into account, but is not conservative enough, since it does not consider the variability. Let θ̂T and θ̂P be sample estimators of θT and θP, respectively, based on data from a placebo-controlled trial. Assume that θ̂T − θ̂P is normally distributed with mean θT − θP and standard error SE_T−P (which is true under certain conditions, or approximately true by the central limit theorem for large sample sizes). When θT = θA − δ,

P(θ̂T − θ̂P < ∆) = Φ( [∆ + δ − (θA − θP)] / SE_T−P ),    (4.5)

where Φ denotes the standard normal distribution function. If δ is chosen according to (4.4) and θT = θA − δ, then the probability that θ̂T − θ̂P is less than ∆ is equal to 1/2. In view of Criterion 2, a value much smaller than 1/2 for this probability is desired, because it is the probability that the estimated test therapy effect is not superior to that of the placebo.

Since the probability in (4.5) is an increasing function of δ, the smaller δ is (the more conservative the choice of the non-inferiority margin), the smaller the chance that θ̂T − θ̂P is less than ∆. Setting the probability on the left-hand side of (4.5) to ε with 0 < ε ≤ 1/2, we obtain

δ = θA − θP − ∆ − z_1−ε SE_T−P,

where z_a = Φ⁻¹(a). Since ∆ = δ/r, we obtain

δ = r/(1 + r) · (θA − θP − z_1−ε SE_T−P).    (4.6)

Comparing (4.2) and (4.6), we obtain

γ = r/(1 + r) · [1 − z_1−ε SE_T−P / (θA − θP)],

i.e., the proportion γ in (4.2) is a decreasing function of a type of noise-to-signal ratio (or coefficient of variation).

As indicated by Chow and Shao (2006), the above non-inferiority margin (4.6) can also be derived from a slightly different point of view. Suppose that we actually conduct a placebo-controlled trial with superiority margin ∆ to establish the superiority of the test therapy over the placebo. Then the power of the large-sample t-test for hypotheses θT − θP ≤ ∆ versus θT − θP > ∆ is approximately equal to

Φ( (θT − θP − ∆)/SE_T−P − z_1−α ),

where α is the level of significance. Assume the worst scenario θT = θA − δ and let β be a given desired level of power. Then, setting the power to β leads to

(θA − θP − ∆ − δ)/SE_T−P − z_1−α = z_β,


i.e.,

δ = r/(1 + r) · [θA − θP − (z_1−α + z_β) SE_T−P].    (4.7)

Comparing (4.6) with (4.7), we have

z_1−ε = z_1−α + z_β.

For α = 0.05, the following table gives some examples of values of β, ε, and z_1−ε.

β       ε        z_1−ε
0.36    0.1000   1.282
0.50    0.0500   1.645
0.60    0.0290   1.897
0.70    0.0150   2.170
0.75    0.0101   2.320
0.80    0.0064   2.486
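The table can be reproduced from the relation z_1−ε = z_1−α + z_β together with ε = 1 − Φ(z_1−ε); the sketch below uses only the standard library and agrees with the tabulated values up to the rounding of β:

```python
from statistics import NormalDist

nd = NormalDist()
alpha = 0.05
for beta in (0.36, 0.50, 0.60, 0.70, 0.75, 0.80):
    z1e = nd.inv_cdf(1.0 - alpha) + nd.inv_cdf(beta)  # z_(1-eps) = z_(1-alpha) + z_beta
    eps = 1.0 - nd.cdf(z1e)                           # implied epsilon
    print(f"{beta:.2f}  {eps:.4f}  {z1e:.3f}")
```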

As a result, we arrive at the following conclusions with respect to the non-inferiority margin given by (4.6).

1. The non-inferiority margin (4.6) takes variability into consideration, i.e., δ is a decreasing function of the standard error of θ̂T − θ̂P. It is an increasing function of the sample sizes, since SE_T−P decreases as the sample sizes increase. Choosing a non-inferiority margin depending on the sample sizes does not violate the basic frequentist statistical principle; in fact, it cannot be avoided when the variability of sample estimators is considered. Statistical analysis, including sample size calculation at the trial planning stage, can still be performed. In the limiting case (SE_T−P → 0), the non-inferiority margin in (4.6) is the same as that in (4.4).

2. The ε value in (4.6) represents a degree of conservativeness. An arbitrarily chosen ε may lead to highly conservative tests. When sample sizes are large (SE_T−P is small), one can afford a small ε. A reasonable value of ε and the sample sizes can be determined at the planning stage of the trial.

3. The non-inferiority margin in (4.6) is non-negative if and only if θA − θP ≥ z_1−ε SE_T−P, i.e., the active control effect is substantial or the sample sizes are large. We might take our non-inferiority margin to be the larger of the quantity in (4.6) and 0 to force the non-inferiority margin to be non-negative. However, it may be wise not to do so. Note that if θA is not substantially larger than θP, then non-inferiority testing is not justifiable since, even with δ = 0 in (4.1), concluding Ha in (4.1) does not imply that the test therapy is superior to the placebo. Using δ in (4.6), testing hypotheses (4.1) converts to testing the superiority of the test therapy over the active control agent when δ is actually negative. In other words, when θA − θP is smaller than a certain margin, our test automatically becomes a superiority test, and the property P(θ̂T − θ̂P < ∆) = ε (with ∆ = |δ|/r) still holds.

4. In many applications, there are no historical data. In such cases, parameters related to the placebo are not estimable and, hence, a non-inferiority margin not depending on these parameters is desired. Since the active control agent is a well-established therapy, let us assume that the power of the level-α test showing that the active control agent is superior to the placebo by the margin ∆ is at the level η. This means that, approximately,

θA − θP ≥ ∆ + (z_1−α + z_η) SE_A−P.

Replacing θA − θP − ∆ in the expression for δ preceding (4.6) by the lower bound given in the previous expression, we obtain the non-inferiority margin

δ = (z_1−α + z_η) SE_A−P − z_1−ε SE_T−P.

To use this non-inferiority margin, we need some information about the population variance of the placebo group. As an example, consider the parallel design with two treatments, the test therapy and the active control agent. Assume that the same two-group parallel design would have been used if a placebo-controlled trial had been conducted. Then

SE_A−P = √(σ²_A/n_A + σ²_P/n_P)

and

SE_T−P = √(σ²_T/n_T + σ²_P/n_P),

where σ²_k is the asymptotic variance of √n_k (θ̂_k − θ_k) and n_k is the sample size under treatment k. If we assume σ_P/√n_P = c, then

δ = (z_1−α + z_η) √(σ²_A/n_A + c²) − z_1−ε √(σ²_T/n_T + c²).    (4.8)

Formula (4.8) can be used in two ways. One way is to replace c in (4.8) with an estimate. When no information from the placebo control is available, a suggested estimate of c is the smaller of the estimates of σ_T/√n_T and σ_A/√n_A. The other way is to carry out a sensitivity analysis by using δ in (4.8) for a number of c values.
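The sensitivity analysis over c can be sketched as follows (all design values are hypothetical, and the function name is ours; z-quantiles come from statistics.NormalDist):

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()

def margin_4_8(sigma2_a, n_a, sigma2_t, n_t, c, alpha=0.05, eta=0.80, eps=0.05):
    # Non-inferiority margin (4.8); c plays the role of sigma_P / sqrt(n_P)
    z = nd.inv_cdf
    return ((z(1.0 - alpha) + z(eta)) * sqrt(sigma2_a / n_a + c ** 2)
            - z(1.0 - eps) * sqrt(sigma2_t / n_t + c ** 2))

# Sensitivity analysis: delta as a function of the assumed value of c
for c in (0.0, 0.05, 0.10, 0.15):
    print(c, round(margin_4_8(1.0, 100, 1.0, 100, c), 4))
```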

Statistical inference

When the non-inferiority margin depends on unknown population parameters, statistical tests designed for the case of a constant non-inferiority margin may not be appropriate. Valid statistical tests for hypotheses (4.1) with δ given by (4.2) are derived in the CBER/FDA Memorandum (1999), Holmgren (1999), and Hung et al. (2003), assuming that (i) γ is known and (ii) historical data from a placebo-controlled trial are available and the so-called "constancy condition" holds, i.e., the active control effects are equal in the current and the historical patient populations. In this section, we derive valid statistical tests for the non-inferiority margin given in (4.6) or (4.8). We use the same notation as in the previous section.

Tests based on historical data under the constancy condition. We first consider tests involving the non-inferiority margin (4.6) in the case where historical data from a placebo-controlled trial assessing the effect of the active control agent are available and the constancy condition holds, i.e., the effect θA0 − θP0 in the historical trial is the same as θA − θP in the current active control trial, if a placebo control were added to the current trial. It should be emphasized that the constancy condition is a crucial assumption for the validity of the results.

Assume that the two-group parallel design is adopted in both the historical and current trials and that the sample sizes are, respectively, nA0 and nP0 for the active control and placebo in the historical trial, and nT and nA for the test therapy and active control in the current trial. Without the normality assumption on the data, we adopt the large-sample inference approach. Let k = T, A, A0, and P0 index, respectively, the test therapy and active control in the current trial and the active control and placebo in the historical trial. Assume that nk = lk·n for some fixed lk and that, under appropriate conditions, the estimators θ̂k of the parameters θk satisfy

√n_k (θ̂_k − θ_k) →d N(0, σ²_k)    (4.9)

as n → ∞, where →d denotes convergence in distribution. Also, assume that consistent estimators σ̂²_k of σ²_k are obtained. The following result can be established.


Theorem 4.1 We have

[θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0 − z_1−ε ŜE_T−P) − (θT − θA + δ)] / ŜE_T−C →d N(0, 1),    (4.10)

where

ŜE_T−P = √(σ̂²_T/n_T + σ̂²_P0/n_P0)

is an estimator of SE_T−P = √(σ²_T/n_T + σ²_P0/n_P0), and ŜE_T−C is an estimator of SE_T−C, the standard deviation of θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0), i.e.,

SE_T−C = √( σ²_T/n_T + σ²_A/n_A + (r/(1+r))² (σ²_A0/n_A0 + σ²_P0/n_P0) ).

Proof: From result (4.9), the independence of data from different groups, and the constancy condition,

[θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0) − (θT − θA + (r/(1+r))(θA − θP))] / SE_T−C →d N(0, 1).    (4.11)

From the consistency of σ̂²_k and the fact that √n·SE_T−C is a fixed constant,

(ŜE_T−P − SE_T−P) / SE_T−C = √n(ŜE_T−P − SE_T−P) / (√n·SE_T−C) = op(1)

and

ŜE_T−C / SE_T−C − 1 = √n(ŜE_T−C − SE_T−C) / (√n·SE_T−C) = op(1),

where op(1) denotes a quantity converging to 0 in probability. Then

[θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0 − z_1−ε ŜE_T−P) − (θT − θA + δ)] / ŜE_T−C
    = { [θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0) − (θT − θA + (r/(1+r))(θA − θP))] / SE_T−C
        − (r/(1+r)) z_1−ε (ŜE_T−P − SE_T−P) / SE_T−C } · (SE_T−C / ŜE_T−C)
    = { [θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0) − (θT − θA + (r/(1+r))(θA − θP))] / SE_T−C − op(1) } [1 + op(1)],

and result (4.10) follows from result (4.11) and Slutsky's theorem.


Then, when the non-inferiority margin in (4.6) is adopted, the null hypothesis H0 in (4.1) is rejected at approximately level α if

θ̂T − θ̂A + (r/(1+r))(θ̂A0 − θ̂P0 − z_1−ε ŜE_T−P) − z_1−α ŜE_T−C > 0.
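This rejection rule translates directly into code (an illustrative sketch with hypothetical numbers; `se2` holds the estimated variances σ̂²_k/n_k of the four group estimators):

```python
from math import sqrt
from statistics import NormalDist

def reject_h0(est, se2, r, alpha=0.05, eps=0.05):
    # Test of (4.1) with margin (4.6), using historical placebo-controlled
    # data under the constancy condition.  Group labels: T, A (current
    # trial) and A0, P0 (historical trial); se2[k] estimates sigma_k^2/n_k.
    z = NormalDist().inv_cdf
    g = r / (1.0 + r)
    se_tp = sqrt(se2["T"] + se2["P0"])
    se_tc = sqrt(se2["T"] + se2["A"] + g ** 2 * (se2["A0"] + se2["P0"]))
    stat = est["T"] - est["A"] + g * (est["A0"] - est["P0"] - z(1.0 - eps) * se_tp)
    return stat > z(1.0 - alpha) * se_tc

est = {"T": 10.0, "A": 10.0, "A0": 12.0, "P0": 8.0}  # hypothetical estimates
se2 = {k: 0.04 for k in est}                          # each SE = 0.2
print(reject_h0(est, se2, r=1.0))  # True: large historical control effect
```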

Impact on sample size

Using result (4.10), we can approximate the power of this test by

Φ( (θT − θA + δ)/SE_T−C − z_1−α ).

Using this formula, we can select the sample sizes nT and nA to achieve a desired power level (say β), assuming that nA0 and nP0 are given (from the historical trial). Assume that nT/nA = λ is chosen. Then nT should be selected as a solution of

θT − θA + (r/(1+r)) [θA − θP − z_1−ε √(σ²_T/n_T + σ²_P0/n_P0)]
    = (z_1−α + z_β) √( σ²_T/n_T + λσ²_A/n_T + (r/(1+r))² (σ²_A0/n_A0 + σ²_P0/n_P0) ).    (4.12)

Although equation (4.12) does not have an explicit solution in terms of nT, its solution can be obtained numerically once initial values for all parameters are given.

Remarks

The constancy condition. The use of historical data usually increases the power of the test for hypotheses with a non-inferiority margin depending on parameters in the historical trial. On the other hand, using historical data without the constancy condition may lead to invalid conclusions. As indicated in Hung et al. (2003), checking the constancy condition is difficult. In this subsection we discuss a method of checking the constancy condition under an assumption much weaker than the constancy condition.

Note that the key is to check whether the active control effect θA − θP in the current trial is the same as θA0 − θP0 in the historical trial. If we assume that the placebo effects θP and θP0 are the same (which is much weaker than the constancy condition), then we can check whether θA = θA0 using the data under the active control in the current and historical trials.


Tests without historical data. We now consider tests when the non-inferiority margin (4.8) is chosen. Following the same argument as in the proof of result (4.10), we can establish that

[θ̂T − θ̂A + (z_1−α + z_η) ŜE_A−P − z_1−ε ŜE_T−P − (θT − θA + δ)] / ŜE_T−A →d N(0, 1),    (4.13)

where

ŜE_k−l = √(σ̂²_k/n_k + σ̂²_l/n_l).

Hence, when the non-inferiority margin in (4.8) is adopted, the null hypothesis H0 in (4.1) is rejected at approximately level α if

θ̂T − θ̂A + (z_1−α + z_η) ŜE_A−P − z_1−ε ŜE_T−P − z_1−α √(σ̂²_T/n_T + σ̂²_A/n_A) > 0.

The power of this test is approximately

Φ( (θT − θA + δ)/SE_T−A − z_1−α ).

If nT/nA = λ, then we can select the sample sizes nT and nA to achieve a desired power level (say β) by solving

θT − θA + (z_1−α + z_η) √(λσ²_A/n_T + σ²_P/n_P) − z_1−ε √(σ²_T/n_T + σ²_P/n_P)
    = (z_1−α + z_β) √(λσ²_A/n_T + σ²_T/n_T).

4.3 Concluding Remarks

For large-scale clinical trials, a data safety monitoring committee (DMC) is usually established to monitor safety and/or perform interim efficacy analyses of the trial based on accrued data at some pre-scheduled time or when the trial has achieved a certain number of events. Based on the interim results, the DMC may recommend modification of the study objectives and/or hypotheses. As an example, suppose that a clinical trial was designed as a superiority trial to establish the superiority of the test treatment as compared to a standard therapy or an active control agent. However, after the review of the interim results, it is determined that the trial will not be able to achieve the study objective of establishing superiority with the treatment effect size observed at interim.


The DMC does not recommend stopping the trial based on the futility analysis. Instead, the DMC may suggest modifying the hypotheses for testing non-inferiority. This modification raises a critical statistical/clinical issue regarding the determination of a non-inferiority margin. Chow and Shao (2006) proposed a statistical justification for determining a non-inferiority margin based on accrued data at interim, following ICH guidance.


CHAPTER 5

Adaptive Dose-Escalation Trials

In clinical research, the response in a dose-response study could be a biological response for safety or efficacy. For example, in a dose-toxicity study, the goal is to determine the maximum tolerable dose (MTD). On the other hand, in a dose-efficacy response study, the primary objective is usually to address one or more of the following questions: (i) Is there any evidence of a drug effect? (ii) What is the nature of the dose response? and (iii) What is the optimal dose? In practice, it is always a concern how to evaluate the dose-response relationship with limited resources within a relatively tight time frame. This concern led to proposed designs that allow fewer patients to be exposed to toxicity and more patients to be treated at potentially efficacious dose levels. Such designs also allow pharmaceutical companies to fully utilize their resources for the development of more new drug products (see, e.g., Arbuck, 1996; Babb, Rogatko, and Zacks, 1998; Babb and Rogatko, 2001; Berry et al., 2002; Bretz and Hothorn, 2002).

The remainder of this chapter is organized as follows. In the next section, we provide a brief background on dose-escalation trials. The concept of the continued reassessment method (CRM) in phase I oncology trials is reviewed in Section 5.2. In Section 5.3, we propose a hybrid frequentist-Bayesian adaptive approach. In Section 5.4, several simulations are conducted to evaluate the performance of the proposed method. Concluding remarks are presented in Section 5.5.

5.1 Introduction

For dose-toxicity studies, the traditional escalation rules (TER), which are also known as the "3 + 3" rules, are commonly used in early phase oncology studies. The "3 + 3" rule is to enter three patients at a new dose level and then enter another three patients when a dose limiting toxicity (DLT) is observed. An assessment of the six patients is then performed to determine whether the trial should be stopped at that level or the dose should be increased. Basically, there are two types of "3 + 3" rules, namely the traditional escalation rule (TER) and the strict traditional escalation rule (STER). TER does not allow dose de-escalation, but STER


does when two of three patients have DLTs. The "3 + 3" rules can be generalized to the "m + n" TER and STER escalation rules. Chang and Chow (2006a) provided a detailed description of general "m + n" designs with and without dose de-escalation. The corresponding formulas for sample size calculation can be found in Lin and Shih (2001).

Recently, many new methods have been developed, such as the assessment of dose response using multiple-stage designs (Crowley, 2001) and the continued reassessment method (CRM) (see, e.g., O'Quigley, Pepe, and Fisher, 1990; O'Quigley and Shen, 1996; Babb and Rogatko, 2004). In the CRM, the dose-response relationship is continually reassessed based on cumulative data collected from the trial. The next patient who enters the trial is then assigned to the potential MTD level. This approach is more efficient than the usual TER with respect to allocating patients near the MTD. However, the efficiency of CRM may be at risk due to delayed responses and/or a constraint on dose jumps in practice (Babb and Rogatko, 2004). In recent years, the use of adaptive design methods for characterizing dose-response curves has become very popular (Bauer and Rohmel, 1995). An adaptive design is a dynamic system that allows the investigator to optimize the trial (including design, monitoring, operating, and analysis) with cumulative information observed from the trial. For Bayesian adaptive designs for dose-response trials, some researchers suggest the use of a loss/utility function in conjunction with dose assignment based on minimization/maximization of the loss/utility function (e.g., Gasparini and Eisele, 2000; Whitehead, 1997).

In this chapter, we use an adaptive method that combines the CRM and utility-adaptive randomization (UAR) for multiple-endpoint trials (Chang, Chow, and Pong, 2005). The proposed UAR is similar to response-adaptive randomization (RAR); it is an extension of RAR to the multiple-endpoint case. In the UAR scheme, the probability of assigning a patient to a particular dose level is determined by its normalized utility value (ranging from 0 to 1). The CRM could be a Bayesian, a frequentist, or a hybrid frequentist-Bayesian-based approach. The proposed method has the advantage of achieving an optimal design by means of adaptation to the accrued data of an on-going trial. In addition, the CRM could provide a better prediction of the dose-response relationship by selecting an appropriate model, as compared to methods based simply on the observed response. In practice, it is not uncommon for the observed response rate to be lower in a higher-dose group than in a lower-dose group, even when the higher-dose group in fact has a higher true response rate. The use of the CRM avoids this problem by using a monotonic function, such as a logistic function, in the model. The proposed adaptive method deals with multiple endpoints in two ways.


The first approach is to model each dose-endpoint relationship, with or without constraints among the models. Each model is usually from a monotonic function family, such as the logistic or some power function. The second approach is to combine the multiple endpoints into a single utility index, and then model the utility using a more flexible function family, such as the hyper-logistic function proposed in this chapter.

5.2 CRM in Phase I Oncology Study

The CRM was originally used in phase I oncology trials (O'Quigley, Pepe, and Fisher, 1990). The primary goal of phase I oncology trials is not only to assess the dose-toxicity relationship, but also to determine the MTD. Due to the potentially high toxicity of the study drug, in practice usually only a small number of patients (e.g., 3 to 6) are treated at each ascending dose level. The most common approach is the "3 + 3" TER with a pre-specified sequence for dose escalation. However, this ad hoc approach has been found to be inefficient and often underestimates the MTD, especially when the starting dose is too low. The CRM was developed to overcome these limitations. The estimation or prediction from the CRM is weighted by the number of data points; therefore, if the data points are mostly around the estimated value, the estimation is more accurate. The CRM assigns more patients near the MTD; consequently, the estimated MTD is much more precise and reliable. In practice, this is the most desirable operating characteristic of the Bayesian CRM. In what follows, we briefly review the CRM approach.

Dose-toxicity modeling

In most phase I dose-response trials, it is assumed that there is a monotonic relationship between dose and toxicity. This ideal relationship suggests that the biologically inactive dose is lower than the active dose, which is in turn lower than the toxic dose. To characterize this relationship, the choice of an appropriate dose-toxicity model is important. In practice, the logistic model is often utilized:

$$p(x) = [1 + b\exp(-ax)]^{-1},$$

where p(x) is the probability of toxicity associated with dose x, and a and b are positive parameters to be determined. Practically, p(x) is equivalent to the toxicity rate or dose limiting toxicity (DLT) rate as defined by the Common Toxicity Criteria (CTC) of the United States National Cancer Institute. Denote by θ the probability of DLT (or DLT rate) at the MTD.


Then, the MTD can be expressed as

$$\mathrm{MTD} = \frac{1}{a}\ln\left(\frac{b\theta}{1-\theta}\right).$$

If we can estimate a and b, or their posterior distributions (for the common logistic model, b is predetermined), we will be able to determine the MTD or provide predictive probabilities for the MTD. The choice of the toxicity rate at the MTD, θ, depends on the nature of the DLT and the type of the target tumor. For an aggressive tumor and a transient, non-life-threatening DLT, θ could be as high as 0.5. For a persistent DLT and less aggressive tumors, it could be as low as 0.1 to 0.25. A commonly used value is somewhere between 0 and 1/3 ≈ 0.33 (Crowley, 2001).
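The MTD formula follows directly from solving p(MTD) = θ in the logistic model, and can be checked numerically. The parameter values below are hypothetical, chosen only to illustrate the algebra:

```python
import math

def mtd(a, b, theta):
    """Dose x solving p(x) = theta for the logistic model
    p(x) = 1 / (1 + b * exp(-a * x))."""
    return math.log(b * theta / (1.0 - theta)) / a

# Illustrative (hypothetical) parameter values
a, b, theta = 0.05, 150.0, 1.0 / 3.0

x = mtd(a, b, theta)
p = 1.0 / (1.0 + b * math.exp(-a * x))  # plugging back in should recover theta
```

Plugging the computed dose back into the logistic model recovers θ, confirming the inversion.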

Dose-level selection

The initial dose given to the first patients in a phase I study should be low enough to avoid severe toxicity but high enough to allow observation of some activity or potential efficacy in humans. A commonly used starting dose is the dose at which 10% mortality (LD10) occurs in mice. The subsequent dose levels are usually selected based on the following multiplicative set:

$$x_i = f_{i-1}\,x_{i-1} \quad (i = 2, \ldots, k),$$

where f_i is called the dose escalation factor. The highest dose level should be selected such that it covers the biologically active dose but remains lower than the toxic dose. In general, the CRM does not require pre-determined dose intervals. However, pre-determined doses are often used for practical convenience.
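The multiplicative dose set is straightforward to generate. The starting dose and escalation factors below (a modified-Fibonacci-style sequence) are illustrative assumptions, not values from the text:

```python
def dose_levels(start_dose, factors):
    """Generate doses x_1..x_k with x_i = f_{i-1} * x_{i-1}."""
    doses = [start_dose]
    for f in factors:
        doses.append(doses[-1] * f)
    return doses

# Hypothetical starting dose and escalation factors
levels = dose_levels(10.0, [2.0, 1.67, 1.5, 1.4])
# levels is roughly [10, 20, 33.4, 50.1, 70.1]
```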

Reassessment of model parameters

The key is to estimate the parameter a in the response model. An initial assumption or a prior for the parameter is necessary in order to assign patients to dose levels based on the dose-toxicity relationship. The estimate of a is continually updated based on cumulative data observed from the trial. The estimation method can be either a Bayesian or a frequentist approach. The Bayesian approach leads to the posterior distribution of a; frequentist approaches, such as the maximum likelihood estimate or the least-squares estimate, are straightforward to compute. Note that the Bayesian approach requires a prior distribution for the parameter a, and it provides the posterior distribution of a as well as predictive probabilities for the MTD. The frequentist, Bayesian, and hybrid frequentist-Bayesian-based approaches, in conjunction with response-adaptive randomization, will be further discussed in this chapter.


Assignment of next patient

The updated dose-toxicity model is usually used to choose the dose level for the next patient. In other words, the next patient enrolled in the trial is assigned to the currently estimated MTD based on the dose-response model. Practically, this assignment is subject to safety constraints, such as a limited dose jump and delayed responses. Assignment of patients to the most recently updated MTD is intuitive. It leads to the majority of patients being assigned to dose levels near the MTD, which allows a more precise estimate of the MTD with a minimum number of patients.

5.3 Hybrid Frequentist-Bayesian Adaptive Design

When the Bayesian approach is used for a multiple-parameter response model, some numerical irregularities cannot be easily resolved. In addition, Bayesian methods require extensive computation for a multiple-parameter model. To overcome these limitations, a hybrid frequentist-Bayesian method is useful. The use of utility-adaptive randomization allows the allocation of more patients to the superior dose levels and fewer patients to the inferior dose levels. This adaptive method is not only ethically desirable but also has a favorable benefit/risk ratio (such as benefit-safety and/or benefit-cost).

The adaptive model

In this section, we use the adaptive design outlined in Figure 5.1 for dose-response trials or phase II/III combined trials. We start with several dose levels, with or without a placebo group, followed by prediction of the dose-response relationship using Bayesian or other approaches based on accrued real-time data. The next patient is then randomized to a treatment group based on the utility-adaptive or response-adaptive randomization algorithm. We may allow inferior treatment groups to be dropped when there are too many groups, based on results from the analysis of the accrued data using Bayesian and/or frequentist approaches. It is expected that the predictive dose-response model, in conjunction with utility-adaptive randomization, could lead to a design that assigns more patients to the superior arms and, consequently, a more efficient design. We will illustrate this point further via computer trial simulations. In what follows, technical details for establishing dose-response relationships using the CRM in conjunction with utility-adaptive randomization are provided.


Figure 5.1 Bayesian adaptive trial.

Utility-based unified CRM adaptive approach

The utility-based unified CRM adaptive approach can be summarized by the following steps.

Step 1: Construct a utility function based on the trial objectives;
Step 2: Propose a probability model for the dose-response relationship;
Step 3: Construct prior probability distributions of the parameters in the response model;
Step 4: Form the likelihood function based on incremental information on treatment response during the trial;
Step 5: Reassess the model parameters or calculate the posterior probability of the model parameters;
Step 6: Update the expected utility function based on the dose-response model;
Step 7: Determine the next action or make adaptations, such as changing the randomization or dropping inferior treatment arms;
Step 8: Collect additional trial data and repeat Steps 5 to 7 until the stopping criteria are met.
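The eight steps can be sketched as a generic loop. Everything in this sketch (the one-parameter logistic model, the grid prior, the single-endpoint utility equal to the response probability) is illustrative scaffolding under stated assumptions, not the authors' implementation:

```python
import math
import random

def posterior_update(prior, data, model):
    """Step 5: discrete (grid) posterior over a one-dimensional parameter."""
    post = {}
    for a, g in prior.items():
        like = 1.0
        for x, r in data:                       # r = 1 response, 0 otherwise
            p = model(x, a)
            like *= p if r else (1.0 - p)
        post[a] = g * like
    z = sum(post.values())
    return {a: w / z for a, w in post.items()}

def expected_utility(post, doses, model):
    """Step 6: posterior-expected response probability at each dose."""
    return [sum(g * model(x, a) for a, g in post.items()) for x in doses]

def run_trial(doses, prior, model, true_p, max_n, seed=1):
    random.seed(seed)
    data, post = [], dict(prior)
    for _ in range(max_n):
        u = expected_utility(post, doses, model)              # Steps 6-7
        x = random.choices(doses, weights=[max(v, 1e-6) for v in u])[0]
        r = 1 if random.random() < true_p[doses.index(x)] else 0
        data.append((x, r))                                   # Steps 4/8: accrue
        post = posterior_update(prior, data, model)           # Step 5
    return post, data

model = lambda x, a: 1.0 / (1.0 + 150.0 * math.exp(-a * x))  # assumed 1-parameter logistic
doses = [20, 40, 70, 95, 120]
prior = {a: 0.2 for a in (0.04, 0.06, 0.08, 0.10, 0.12)}     # uniform grid prior
true_p = [0.02, 0.07, 0.37, 0.73, 0.52]
post, data = run_trial(doses, prior, model, true_p, max_n=30)
```

Because the randomization weights track the current expected utility, later patients tend to concentrate at the better-responding dose levels, which is the behavior the steps above are designed to produce.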

Construction of utility function

Let X = {x1, x2, . . . , xk} be the action space, where xi is the coded value for an action of anything that would affect the outcomes or decision making, such as a treatment, a withdrawal of a treatment arm, a protocol amendment, stopping the trial, an investment of advertising for the prospective


drug, or any combination of the above. xi can be either a fixed dose or a variable dose given to a patient. If action xi is not taken, then xi = 0.

Let y = {y1, y2, . . . , ym} be the outcomes of interest, which can be efficacy or toxicity of a test drug, the cost of the trial, etc. Each of these outcomes yj is a function of the action, yj(x), x ∈ X. The utility is defined as

$$U = \sum_{j=1}^{m} w_j = \sum_{j=1}^{m} w(y_j), \qquad (5.1)$$

where U is normalized such that 0 ≤ U ≤ 1 and the wj are pre-specified weights.

Probability model for dose response

Each of the outcomes can be modeled by the following generalized probability model:

$$\Gamma_j(\mathbf{p}) = \sum_{i=1}^{k} a_{ji}\, x_i, \qquad j = 1, \ldots, m, \qquad (5.2)$$

where

$$\mathbf{p} = \{p_1, \ldots, p_m\}, \qquad p_j = P(y_j \geq \tau_j),$$

and τj is a threshold for the jth outcome. The link function Γj(·) is a generalized function of all the probabilities of the outcomes. For simplicity, we may consider

$$\Gamma_j(p_j) = \sum_{i=1}^{k} a_{ji}\, x_i, \qquad j = 1, \ldots, m, \qquad (5.3)$$

and

$$p_j(x, a) = \Gamma_j^{-1}\left(\sum_{i=1}^{k} a_{ji}\, x_i\right), \qquad j = 1, \ldots, m. \qquad (5.4)$$

The essential difference between (5.2) and (5.3) is that the former models the outcomes jointly, while the latter models each outcome independently. Therefore, for (5.3), $\Gamma_j^{-1}$ is simply the inverse function of Γj. However, for (5.2), $\Gamma_j^{-1}$ is not a simple inverse function, and sometimes an explicit solution may not exist. Γj can be used to model the constraints between outcomes, e.g., the relationship between the two blood pressures. Using the link functions could reduce some of the irregular models in the modeling process when there are multiple outcome endpoints.

For the univariate case, the logistic model is commonly used for a monotonic response. However, for a utility, we usually do not know whether it is monotonic or not. Therefore, we propose the following model:

$$p_j(x, a) = \left(a_{j1}\exp(a_{j2}x) + a_{j3}\exp(-a_{j4}x)\right)^{-m}, \qquad j = 1, \ldots, m, \qquad (5.5)$$

where the aji and m are usually positive values, and m is suggested to be 1. Note that when aj1 = 1, aj2 = 0, and m = 1, (5.5) degenerates to the common logistic model. When aj1 = aj3 = 1, aj2 = 0, and aj4 = 2, it reduces to the hyperbolic tangent model. Note that the aji must be determined such that 0 ≤ pj(x, a) ≤ 1 for the dose range x under study. For convenience, we will refer to (5.5) as a hyper-logistic function.

The hyper-logistic function is especially useful in modeling a utility index because it is not necessarily monotonic. However, the range of each parameter should be carefully determined before modeling. It is suggested that the corresponding shapes be examined for various parameter values. Some of the different shapes that can be generated by the hyper-logistic function are presented in Figure 5.2.

Figure 5.2 Curves of hyper-logistic function family.
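The suggested shape examination can be sketched as follows. The parameter values are arbitrary illustrations, chosen only to contrast a monotonic (logistic) special case with a single-peaked, non-monotonic curve:

```python
import math

def hyper_logistic(x, a1, a2, a3, a4, m=1):
    """Hyper-logistic function (5.5): (a1*exp(a2*x) + a3*exp(-a4*x))**(-m)."""
    return (a1 * math.exp(a2 * x) + a3 * math.exp(-a4 * x)) ** (-m)

xs = [i * 5.0 for i in range(1, 25)]          # dose range 5..120

# Logistic special case: a1 = 1, a2 = 0, m = 1 -> monotonically increasing
logistic = [hyper_logistic(x, 1.0, 0.0, 150.0, 0.08) for x in xs]

# Non-monotonic (umbrella-shaped) case: both exponential terms active
umbrella = [hyper_logistic(x, 0.08, 0.03, 180.0, 0.08) for x in xs]

increasing = all(b > a for a, b in zip(logistic, logistic[1:]))
peaked = max(umbrella) not in (umbrella[0], umbrella[-1])  # interior maximum
```

The second parameter set peaks in the interior of the dose range, the kind of shape a utility index can take when efficacy rises with dose but toxicity eventually dominates.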

A special case worth mentioning here is the following single-utility index (or combined outcomes) model, where the probability is based on the utility rather than on each outcome:

$$p = P(U \geq \tau). \qquad (5.6)$$


Unlike the probability models defined based on individual outcomes, such as a single efficacy or safety outcome, the probability models defined based on utility often have a single mode.

Prior distribution of parameter tensor a

The Bayesian approach requires the specification of a prior probability distribution for the unknown parameter tensor a = {aji}:

$$a \sim g_{j0}(a), \qquad j = 1, \ldots, m, \qquad (5.7)$$

where gj0(a) is the prior probability distribution for the jth endpoint.

Likelihood function

The next step is to construct the likelihood function. Given n observations, with yji associated with endpoint j and dose xmi for the ith patient, the likelihood function can be written as

$$f_{jn}(r\,|\,a) = \prod_{i=1}^{n}\left[\Gamma_j^{-1}(a_{jm_i}\, x_{m_i})\right]^{r_{ji}}\left[1 - \Gamma_j^{-1}(a_{jm_i}\, x_{m_i})\right]^{1-r_{ji}}, \qquad j = 1, \ldots, m, \qquad (5.8)$$

where

$$r_{ji} = \begin{cases} 1, & \text{if } y_{ji} \geq \tau_j \\ 0, & \text{otherwise,} \end{cases} \qquad j = 1, \ldots, m. \qquad (5.9)$$
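A minimal sketch of the likelihood (5.8) for one endpoint, assuming a logistic inverse link (so Γ⁻¹ maps the linear predictor a·x to a probability); the dose values and thresholded responses are fabricated for illustration:

```python
import math

def inv_link(eta):
    """Assumed inverse link: logistic, Gamma^{-1}(eta) = 1/(1+exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def likelihood(a, observations):
    """Binomial likelihood (5.8) for one endpoint: product of p^r * (1-p)^(1-r),
    with linear predictor a * x at each patient's assigned (centered) dose x."""
    like = 1.0
    for x, r in observations:
        p = inv_link(a * x)
        like *= p if r == 1 else (1.0 - p)
    return like

# Fabricated data: (centered dose, thresholded response r_ji as in (5.9))
obs = [(-2.0, 0), (-1.0, 1), (0.5, 0), (1.5, 1), (2.0, 1)]

# Crude grid search for the slope a that maximizes the likelihood
grid = [0.2 * i for i in range(1, 26)]
a_hat = max(grid, key=lambda a: likelihood(a, obs))
```

The same likelihood is what the Bayesian posterior (below) multiplies against the prior, and what the frequentist estimators maximize or fit.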

Assessment of parameter a

The assessment of the parameters in the model can be carried out in different ways: Bayesian, frequentist, or hybrid. The Bayesian and hybrid approaches assess the probability distribution of the parameter, while the frequentist approach provides a point estimate of the parameter.

Bayesian approach For the Bayesian approach, the posterior probability distribution can be obtained as follows:

$$g_j(a\,|\,r) = \frac{f_{jn}(r\,|\,a)\,g_{j0}(a)}{\int f_{jn}(r\,|\,a)\,g_{j0}(a)\,da}, \qquad j = 1, \ldots, m. \qquad (5.10)$$

We then update the probabilities of the outcomes, or the predictive probabilities,

$$p_j = \int \Gamma_j^{-1}\left(\sum_{i=1}^{k} a_{ji}\, x_i\right) g_j(a\,|\,r)\,da, \qquad j = 1, \ldots, m. \qquad (5.11)$$


Note that the Bayesian approach is computationally intensive. Alternatively, we may consider a frequentist approach to simplify the calculation, especially when limited knowledge about the prior is available and a non-informative prior is to be used.

Maximum likelihood approach The maximum likelihood estimates of the parameters are given by

$$\hat{a}_{j,\mathrm{MLE}} = \arg\max_{a}\{f_{jn}(r\,|\,a)\}, \qquad j = 1, \ldots, m, \qquad (5.12)$$

where $\hat{a}_{j,\mathrm{MLE}}$ is the parameter set for the jth outcome. After having obtained $\hat{a}_{j,\mathrm{MLE}}$, we can update the probability using

$$p_j(x, \hat{a}) = \Gamma_j^{-1}\left(\sum_{i=1}^{k} \hat{a}_{ji,\mathrm{MLE}}\, x_i\right), \qquad j = 1, \ldots, m. \qquad (5.13)$$

Least squares approach The least squares approach minimizes the difference between the predictive probabilities and the observed rates, as given below:

$$\hat{a}_{j,\mathrm{LSE}} = \arg\min_{a}\{L_j(a)\}, \qquad j = 1, \ldots, m, \qquad (5.14)$$

where

$$L_j(a) = \sum_{i=1}^{k}\left(\hat{p}_j(x_i) - p_j(x_i, a)\right)^2,$$

and $\hat{p}_j(x_i)$ is the observed rate at dose $x_i$. We then update the probability

$$p_j(x, \hat{a}) = \Gamma_j^{-1}\left(\sum_{i=1}^{k} \hat{a}_{ji,\mathrm{LSE}}\, x_i\right), \qquad (5.15)$$

where $\hat{a}_{ji,\mathrm{LSE}}$, the components of $\hat{a}$, form the parameter set for the jth outcome.

Hybrid frequentist-Bayesian approach Although the Bayesian CRM allows the incorporation of prior knowledge about the parameters, some difficulties arise when it is used in a model with multiple parameters. The difficulties include (i) the computational burden and (ii) numerical instability in evaluating the posterior probability. A solution is to use a frequentist approach to estimate all of the parameters and then use the Bayesian approach to re-estimate the posterior distribution of some of the parameters. This hybrid frequentist-Bayesian approach allows the incorporation of prior knowledge about the parameter distribution while avoiding the computational burden and numerical instability. Details regarding the hybrid method are illustrated in the simulations of Section 5.4.
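A toy sketch of the hybrid idea under stated assumptions: least squares fixes a1 and a3 of a hyper-logistic curve (with a2 fixed at 0.03 and C = 1, mirroring the simulation settings later in the chapter), and a grid posterior is then computed for a4 alone. All data values, grids, and the per-dose sample size are fabricated for illustration:

```python
import math

DOSES = [20, 40, 70, 95, 120]
OBS = [0.02, 0.07, 0.37, 0.73, 0.52]           # fabricated observed rates

def p(x, a1, a3, a4):
    """Hyper-logistic response with a2 fixed at 0.03, C = 1 (cf. (5.17))."""
    return 1.0 / (a1 * math.exp(0.03 * x) + a3 * math.exp(-a4 * x))

def sse(a1, a3, a4):
    return sum((o - p(x, a1, a3, a4)) ** 2 for x, o in zip(DOSES, OBS))

# Frequentist step: crude grid least squares over (a1, a3, a4)
a1_hat, a3_hat, _ = min(
    ((a1, a3, a4)
     for a1 in [0.06 + 0.005 * i for i in range(9)]
     for a3 in [150.0 + 10.0 * j for j in range(6)]
     for a4 in [0.01 + 0.01 * k for k in range(10)]),
    key=lambda t: sse(*t))

# Bayesian step: grid posterior for a4 alone (uniform prior, binomial likelihood)
N_PER_DOSE = 10                                 # fabricated sample size per dose
grid = [0.01 + 0.005 * k for k in range(19)]

def loglik(a4):
    ll = 0.0
    for x, o in zip(DOSES, OBS):
        r = round(o * N_PER_DOSE)               # responders at this dose
        q = min(max(p(x, a1_hat, a3_hat, a4), 1e-9), 1.0 - 1e-9)
        ll += r * math.log(q) + (N_PER_DOSE - r) * math.log(1.0 - q)
    return ll

w = [math.exp(loglik(a4)) for a4 in grid]
posterior = [v / sum(w) for v in w]             # posterior over the a4 grid
```

Only the single remaining parameter is treated in a Bayesian manner, which is exactly what keeps the posterior computation one-dimensional and numerically stable.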


Determination of next action

As mentioned earlier, the actions or adaptations taken should be based on the trial objectives or the utility function. A typical action is a change of the randomization schedule. From the dose-response model, since each dose is associated with a probability of response, the expected utility function is given by $U = \sum_{j=1}^{m} p_j(x, a)\,w_j$. Two approaches, deterministic and probabilistic, can be taken. The former refers to the optimal approach, where actions are taken to maximize the expected utility, while the latter refers to adaptive randomization, where the treatment assignment of the next patient is not fully determined by the algorithm.

Optimal approach As mentioned earlier, in the optimal approach, the dose level assigned to the next patient is based on maximization of the expected utility, i.e.,

$$x_{n+1} = \arg\max_{x_i} U = \arg\max_{x_i} \sum_{j=1}^{m} p_j w_j.$$

However, this deterministic rule is often not feasible in practice.

Utility-adaptive randomization approach Many response-adaptive randomization (RAR) procedures can be used to increase the expected response. However, these adaptive randomization procedures are difficult to apply directly to the case of multiple endpoints. As an alternative, the following utility-adaptive randomization algorithm is proposed. This utility-adaptive randomization, which combines ideas from the randomized play-the-winner rule (Rosenberger and Lachin, 2003) and Lachin's urn models, is based on the following. The target randomization probability for the xi group is proportional to the current estimate of the utility or response rate of that group, i.e., $U(x_i)/\sum_{i=1}^{K} U(x_i)$, where K is the number of groups. When a patient is randomized into the xi group, the randomization probability for this group should be reduced. This leads to the following proposed randomization model.

The probability of randomizing a patient to group xi is proportional to the corresponding posterior probability of the utility or response rate, i.e.,

$$\pi(x_i) = \frac{1}{c}\left(\frac{U(x_i)}{\sum_{i=1}^{K} U(x_i)} - \frac{n_i}{N}\right), \qquad (5.16)$$

where the normalization factor

$$c = \sum_{i}\left(\frac{U(x_i)}{\sum_{i=1}^{K} U(x_i)} - \frac{n_i}{N}\right),$$


ni is the number of patients that have been randomized to the xi group, and N is the total estimated number of patients in the trial. We will refer to this model as the utility-offset model.
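A minimal sketch of the utility-offset allocation probabilities in (5.16), with made-up utilities and enrollment counts:

```python
def utility_offset_probs(utilities, n_assigned, N):
    """Randomization probabilities pi(x_i) under the utility-offset model:
    (normalized utility share) minus (fraction of planned enrollment used),
    renormalized by c so the probabilities sum to one.
    Note: if an arm is far over-enrolled its raw value can go negative;
    handling of that boundary case is left open here."""
    total_u = sum(utilities)
    raw = [u / total_u - n / N for u, n in zip(utilities, n_assigned)]
    c = sum(raw)                      # normalization factor
    return [r / c for r in raw]

# Hypothetical current state of a 5-arm trial
probs = utility_offset_probs(
    utilities=[0.02, 0.07, 0.37, 0.73, 0.52],
    n_assigned=[1, 2, 8, 12, 7],
    N=100,
)
```

The subtraction of $n_i/N$ is what throttles an arm that has already received many patients, so allocation drifts toward the target proportions rather than overshooting them.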

Rules for dropping losers For ethical and economic reasons, we may consider dropping some inferior treatment groups. The issue is how to identify the inferior arms with certain statistical assurance. There are many possible rules for dropping an ineffective dose level. For example, just to name two: (i) the maximum sample size ratio between groups exceeds a threshold Rn and the number of patients randomized exceeds NR; or (ii) the maximum utility difference satisfies Umax − Umin > δu and its naive confidence interval width is less than a threshold δuw. The naive confidence interval is calculated as if Umax and Umin were observed response rates with the corresponding sample sizes at each dose level. Note that the normalized utility index U ranges from 0 to 1. Note also that, optionally, we may choose to retain the control group or all groups for the purpose of between-group comparisons.

Stopping rule Several stopping rules are available for stopping a trial. For example, we may stop the trial when

(i) General rule: the total sample size exceeds a threshold N; or
(ii) Utility rule: the maximum utility difference Umax − Umin > δu and its naive confidence interval width is less than δuw; or
(iii) Futility rule: Umax − Umin < δf and its naive confidence interval width is less than δfw.
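The three rules can be checked mechanically at each interim look; the thresholds below are placeholders, not values recommended by the text:

```python
def should_stop(n_total, u_max, u_min, ci_width,
                N=100, delta_u=0.3, delta_uw=0.2, delta_f=0.05, delta_fw=0.2):
    """Return (stop?, reason) per the general, utility, and futility rules."""
    if n_total >= N:
        return True, "general"
    if (u_max - u_min) > delta_u and ci_width < delta_uw:
        return True, "utility"
    if (u_max - u_min) < delta_f and ci_width < delta_fw:
        return True, "futility"
    return False, ""
```

The utility and futility rules both require the naive confidence interval to be narrow, so that early stopping is triggered only when the utility spread is estimated with adequate precision.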

5.4 Simulations

Design settings

Without loss of generality, a total of five dose levels was chosen for the trial simulations. The response rates, p(U > u), associated with each dose level are summarized in Table 5.1. These response rates were not chosen from the hyper-logistic model, but arbitrarily, in the interest of reflecting common practice.

Table 5.1 Assumed Dose-Response Relationship for Simulations

Dose Level            1     2     3     4     5
Dose                  20    40    70    95    120
Target Response Rate  0.02  0.07  0.37  0.73  0.52


Response model

The probability model considered was p(x, a) = P(U ≥ u) under the hyper-logistic model with three parameters (a1, a3, a4), i.e.,

$$p(x, a) = C\left(a_1 e^{0.03x} + a_3 e^{-a_4 x}\right)^{-1}, \qquad (5.17)$$

where a1 ∈ [0.06, 0.1] and a3 ∈ [150, 200]. The use of the scale factor C is a simple way to ensure that 0 ≤ p(x, a) ≤ 1 in the simulation program.

Prior distribution

Two non-informative priors for the parameter a4 were used for the trial simulations, defined over [0.05, 0.1] and [0.01, 0.1], respectively:

$$a_4 \sim g_0(a_4) = \begin{cases} \dfrac{1}{b-a}, & a \leq a_4 \leq b \\ 0, & \text{otherwise.} \end{cases} \qquad (5.18)$$

Reassessment method

In the simulations, the hybrid frequentist-Bayesian method was used. We first used the frequentist least squares method to estimate the three parameters ai (i = 1, 3, 4), and then used the estimated a1 and a3, together with the prior for a4, in the model with the Bayesian method to obtain the posterior probability distribution of the parameter a4 and the predictive probabilities.

Utility-adaptive randomization

Under the utility-offset model described earlier, the probability of randomizing a patient to dose level xi is given by

$$\pi(x_i) = \begin{cases} \dfrac{1}{K}, & \text{before an observed responder,} \\[2mm] C\left(\dfrac{p(x_i)}{\sum_{i=1}^{K} p(x_i)} - \dfrac{n_i}{N}\right), & \text{after an observed responder,} \end{cases} \qquad (5.19)$$

where p(xi) is the response rate, or the predictive probability in the Bayesian sense, ni is the number of patients that have been randomized to the xi group, N is the total estimated number of patients in the trial, and K is the number of dose groups.

Rules of dropping losers and stopping rule

In the simulations, no losers were dropped. The trial was stopped when the number of subjects randomized reached the pre-specified maximum number.


Figure 5.3 A typical example of simulation results (panel title: "Statistics from Trial Simulations"): predictive, observed, and true response rates, and the percentage of subjects, plotted by dose level (1-5).

Simulation results

Computer simulations were conducted to evaluate the operating characteristics. Four different sample sizes (i.e., N = 20, 30, 50, and 100) and two different non-informative priors were used. The outcomes from a typical simulation are presented in Figure 5.3. It can be seen that the predicted response rates are reasonably good even when the observed (simulated) response rates show a sine shape. The average results from 1000 simulations per scenario are summarized in Table 5.2.

We can see that the number of patients randomized into each group is approximately proportional to the corresponding utility or response rate in each dose group, which indicates that the utility-adaptive randomization is efficient in achieving the target patient allocation among treatment groups. Compared to a traditional balanced design with multiple arms, the adaptive design smartly allocates the majority of patients to the desirable dose levels. In all cases, the predicted rates are similar to the simulated rates. This is reasonable since non-informative priors were used. The precision of the predictive rate at each dose level is measured by its standard deviation:

$$\delta_p = \sqrt{\frac{1}{N_s}\sum_{i=1}^{N_s}\left[\hat{p}_i(x) - \bar{p}(x)\right]^2}, \qquad (5.20)$$

where Ns is the number of simulations, $\hat{p}_i(x)$ is the simulated or predicted response rate in the ith simulation, and $\bar{p}(x)$ is the mean response rate at dose level x. The hybrid method provides estimates with reasonable precision when there are 50 subjects (10 subjects per arm on average) or more. The precision decreases as the sample size decreases, but it is not bad even with only 20 patients. The relationships between the number of subjects and the precision for the dose levels of most interest (levels 3, 4, and 5) are presented in Figure 5.4. The predicted response rates do not seem to be sensitive to the non-informative priors: the simulation results for the two non-informative priors are very similar. Note that the precision depends very much on the number of parameters used in the response model; the more parameters, the less precision. In the simulations, a three-parameter hyper-logistic model was used. If a single-parameter model were used, the precision of the predicted response rates would be greatly improved.

Table 5.2 Comparisons of Simulation Results#

Scenario   Dose Level           1     2     3     4     5
           Target rate          0.02  0.07  0.37  0.73  0.52
n* = 100   Simulated rate       0.02  0.07  0.36  0.73  0.52
           Predicted rate       0.02  0.07  0.41  0.68  0.43
           Standard deviation   0.00  0.01  0.08  0.07  0.03
           Number of subjects   1.71  4.78  25.2  41.8  26.6
n* = 50    Simulated rate       0.02  0.07  0.36  0.73  0.52
           Predicted rate       0.02  0.07  0.40  0.65  0.41
           Standard deviation   0.00  0.02  0.11  0.09  0.04
           Number of subjects   1.02  2.48  12.6  20.5  13.4
n* = 30    Simulated rate       0.02  0.05  0.36  0.73  0.51
           Predicted rate       0.02  0.07  0.40  0.63  0.40
           Standard deviation   0.00  0.02  0.13  0.11  0.05
           Number of subjects   1.00  1.62  7.50  11.9  8.00
n* = 20    Simulated rate       0.02  0.06  0.34  0.72  0.51
           Predicted rate       0.02  0.07  0.37  0.58  0.38
           Standard deviation   0.00  0.02  0.15  0.14  0.06
           Number of subjects   1.00  1.03  4.68  7.60  5.68
n** = 50   Simulated rate       0.02  0.07  0.36  0.73  0.51
           Predicted rate       0.02  0.07  0.41  0.65  0.41
           Standard deviation   0.00  0.02  0.11  0.09  0.04
           Number of subjects   1.02  2.53  12.7  20.5  13.3

# From simulation software: ExpDesign Studio by www.CTriSoft.net
* Uniform prior over [0.05, 0.1]; ** uniform prior over [0.01, 0.1].
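The precision measure (5.20) is simply the standard deviation of the predicted rate across simulation runs; a sketch with fabricated per-run predicted rates at one dose level:

```python
import math

def precision(rates):
    """delta_p of (5.20): standard deviation of rates over Ns simulation runs."""
    ns = len(rates)
    mean = sum(rates) / ns
    return math.sqrt(sum((r - mean) ** 2 for r in rates) / ns)

# Fabricated predicted rates at one dose level over 8 simulated trials
runs = [0.41, 0.38, 0.44, 0.40, 0.39, 0.43, 0.42, 0.37]
dp = precision(runs)
```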


[Figure 5.4 plots the standard deviation of the predictive rate (roughly 0.02 to 0.16) against sample size n from 20 to 100, with separate curves for dose levels 3, 4, and 5.]

Figure 5.4 Relationship between sample size and standard deviation of predictive rate.

5.5 Concluding Remarks

The proposed utility-adaptive randomization has the desirable property that the proportion of subjects assigned to a treatment is proportional to its response rate or predictive probability. Assigning more patients to the superior groups allows a more precise evaluation of those groups with a relatively small number of patients. The hybrid CRM approach with the hyper-logistic model gives reliable predictive results regarding dose response with a minimal number of patients. It is important to choose proper ranges for the parameters of the hyper-logistic model, which can be managed once one becomes familiar with how its parameters affect the shape of the hyper-logistic curve. In the case of low response rates, the sample size is expected to increase accordingly. The hybrid approach allows users to combine the computational simplicity and numerical stability of the frequentist method with the flexibility in priors and the predictive nature of the Bayesian approach. The proposed method can be used in a combined phase II/III study to accelerate the drug development process. However, there are some practical


issues, such as how to perform a thorough review while the trial is ongoing, that must be considered for the safety of the patients. Since the proposed Bayesian adaptive design is multiple-endpoint oriented, it can be used in various situations. For example, for ordinal responses (e.g., CTC grades), we can treat the different levels of response as different endpoints, model them separately, and calculate the expected utility based on the models. Alternatively, we can form the utility first by assigning different weights to the response levels and then model the utility.


CHAPTER 6

Adaptive Group Sequential Design

In clinical trials, it is not uncommon to perform data safety monitoring and/or interim analyses based on accrued data up to a certain time point during the conduct of a clinical trial. The purpose is not only to monitor the progress and integrity of the trial, but also to take action regarding early termination (if there is evidence that the trial will put subjects at unreasonable risk or that the treatment is ineffective) or modifications to the study design in compliance with ICH GCP for data standards and data quality. In most clinical trials, the primary reasons for conducting interim analyses of accrued data are (Jennison and Turnbull, 2000): (i) ethical considerations, (ii) administrative reasons, and (iii) economic constraints. In practice, since clinical trials involve human subjects, it is ethical to monitor the trials to ensure that individual subjects are not exposed to unsafe or ineffective treatment regimens. When a trial is found to be negative (i.e., the treatment appears to be ineffective), there is an ethical imperative to terminate the trial early. Ethical considerations indicate that clinical data should be evaluated in terms of the safety and efficacy of the treatment under study based on accumulated data, in conjunction with updated clinical information from the literature and/or other clinical trials.

From the administrative point of view, interim analyses are necessary to ensure that the clinical trials are being executed as planned. For example, it is always a concern whether subjects who meet the eligibility criteria are from the correct patient population (i.e., one representative of the target patient population). Also, it is important to assure that the trial procedures, dose/dose regimen, and treatment duration adhere to the study protocol. An early examination of interim results can reveal problems of the trial (such as protocol deviations and/or protocol violations) early, and immediate actions can then be taken to remedy the issues and problems detected by interim analyses. An early interim analysis can also verify critical assumptions made at the planning stage of the trial. If serious violations of the critical assumptions are found, modifications or adjustments must be made to ensure the quality and integrity of the trial.

The remainder of this chapter is organized as follows. In the next section, basic concepts of sequential methods in clinical trials are


introduced. In Section 6.2, a unified approach for group sequential designs for normal, binary, and survival endpoints is introduced. In Section 6.3, the boundary function proposed by Wang and Tsiatis (1987) is considered for construction of the stopping boundaries based on equal information intervals. Also included in this section is a discussion of the use of unequal information intervals under a two-stage design, which is of practical interest in clinical trials. Section 6.4 introduces a more flexible (adaptive) design, i.e., the error-spending approach, where the information intervals and the number of analyses are not pre-determined, but the error-spending function is. The relationship between the error-spending function and Wang and Tsiatis' boundary function is also discussed in this section. A proposed approach for formulating the group sequential design based on independent p-values from the sub-samples of the different stages is described in Section 6.5. The formulation is valid regardless of the methods used for calculation of the p-values. Trial monitoring and conditional powers for assessment of futility for comparing means and comparing proportions are derived in Sections 6.7 and 6.8, respectively. Practical issues are given in the last section.

6.1 Sequential Methods

As pointed out by Jennison and Turnbull (2000), the concept of sequential statistical methods was originally motivated by the need to obtain clinical benefits under certain economic constraints. For a trial with a positive result, early stopping means that a new product can be exploited sooner. If a negative result is indicated, early stopping ensures that resources are not wasted. Sequential methods typically lead to savings in sample size, time, and cost when compared with standard fixed sample procedures. Interim analyses enable management to make appropriate decisions regarding the allocation of limited resources for continued development of a promising treatment.

In clinical trials, sequential methods are used when there are formal interim analyses. An interim analysis is an analysis intended to assess the treatment effect with respect to efficacy or safety at any time prior to the completion of a clinical trial. Because interim analysis results may introduce bias to the subsequent clinical evaluation of subjects who enter the trial, all interim analyses should be carefully planned in advance and described in the study protocol. Under special circumstances, there may be a need for an interim analysis that was not planned originally. In this case, a protocol amendment describing the rationale for such an interim analysis should be implemented prior to


any clinical data being unblinded. For many clinical trials of investigational products, especially those that attract public attention for their major health significance, an external independent group or data monitoring committee (DMC) should be formed to perform clinical data monitoring for safety and interim analyses for efficacy. The U.S. FDA requires that the role/responsibility and function/activities of a DMC be clearly stated in the study protocol to maintain the integrity of the clinical trial (FDA, 2000, 2005; Offen, 2003).

Basic concepts

A (fully) sequential test is a test conducted based on accrued data after every new observation is obtained. A group sequential test, as opposed to a fully sequential test, is a test performed based on accrued data at some pre-specified intervals rather than after every new observation (Jennison and Turnbull, 2000).

Error inflation For a conventional single-stage trial with one-sided α = 0.025, the null hypothesis (H0) is rejected if the statistic z ≥ 1.96. For a sequential trial with K analyses, if at the k-th analysis (k = 1, 2, . . . , K) the value of Zk is sufficiently large, we reject H0 and stop the trial. It is not appropriate to simply apply a level-α one-sided test at each analysis, since the multiple tests would lead to an inflation of the type I error rate. In fact, the actual α level is the probability of rejecting H0 at any of the K analyses, which exceeds the nominal level: for K = 5, the actual α level is 0.071, nearly 3 times as big as the 0.025 significance level applied at each individual analysis.
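The inflation is easy to check numerically. The sketch below (an illustrative Monte Carlo, not from the book) builds the correlated interim z-statistics for K equally spaced looks from cumulative sums of independent normal increments under H0 and estimates the probability of crossing 1.96 at any look:

```python
import numpy as np

def overall_alpha(K=5, z_crit=1.96, n_reps=200_000, seed=1):
    """Estimate P(reject H0 at any of K looks) when each look uses the
    same one-sided critical value z_crit and H0 is true."""
    rng = np.random.default_rng(seed)
    # Equal information increments: Z_k = S_k / sqrt(k), where S_k is the
    # cumulative sum of k iid N(0, 1) increments under H0.
    increments = rng.standard_normal((n_reps, K))
    z_paths = increments.cumsum(axis=1) / np.sqrt(np.arange(1, K + 1))
    return float(np.mean((z_paths >= z_crit).any(axis=1)))

print(round(overall_alpha(), 3))  # close to 0.071 for K = 5
```

For K = 1 the estimate falls back to the nominal 0.025, confirming that the inflation comes entirely from the repeated looks.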

Stopping boundary Stopping boundaries consist of a set of critical values against which the test statistics calculated from the actual data are compared to determine whether the trial should be terminated or continued. For example, Figure 6.1 provides a set of critical values as stopping boundaries. In other words, if the observed sample mean at a given stage falls outside the boundaries, we terminate the trial; otherwise, the trial continues.

Boundary scales Many different scales can be used to construct the stopping boundaries. The four commonly used scales are the standardized z-statistic, the sample-mean scale, the error-spending scale, and the sum-mean scale. In principle, these scales are equivalent to one another after appropriate transformation. Among the four, the sample-mean and error-spending scales have the most intuitive interpretations. As an example, consider the hypothesis for testing the difference between two independent sample means. The scales are defined as follows:


[Figure 6.1 plots stopping boundaries on the standard-z scale (from −3.38 to 3.38) against analysis stages 1 to 4.]

Figure 6.1 Stopping boundaries.

• Sample-mean scale: θk = x̄Ak − x̄Bk.

• Standardized z-statistic: Zk = θk √Ik, where Ik = nk/(2σ²) is the information level.

• Sum-mean scale: Dk = Σ_{i=1}^{k} x̄Ai − Σ_{i=1}^{k} x̄Bi.

• Error-spending scale: α(sk), which is also known as the probability scale.

When the number of interim analyses increases, there are too many possible designs with the desired α and power, so in practice it is difficult to select the most appropriate design for a given need. It is therefore suggested that a simple function with a few parameters be used to define the preferred stopping boundaries; the O'Brien-Fleming, Pocock, Lan-DeMets-Kim, and Wang and Tsiatis boundary functions are useful for this purpose.

Optimal/flexible multiple-stage designs In early phase cancer trials, it is undesirable to stop a study early when the test drug is promising. On the other hand, it is desirable to terminate the study as early as possible when the test treatment is not effective, due to ethical considerations. For this purpose, an optimal or flexible multiple-stage design is often employed to determine whether the test treatment holds sufficient promise to warrant further testing. In practice, optimal or flexible multiple-stage designs are commonly employed in single-arm phase II cancer trials. These include optimal multiple-stage designs such as the minimax design and Simon's optimal two-stage design (Simon, 1989; Ensign et al., 1994) and flexible multiple-stage designs (see, e.g., Chen, 1997; Chen and Ng, 1998; Sargent and Goldberg, 2001).


The concept of an optimal two-stage design is to permit early stopping when a moderately long sequence of initial failures occurs. Denote the numbers of subjects studied in the first and second stages by n1 and n2, respectively. Under a two-stage design, n1 patients are treated at the first stage. If there are fewer than r1 responses, the trial is stopped. Otherwise, stage 2 is implemented by enrolling the other n2 patients. A decision regarding whether the test treatment is promising is then made based on the response rate of the N = n1 + n2 subjects. Let p0 be the undesirable response rate and p1 be the desirable response rate (p1 > p0). If the response rate of a test treatment is at the undesirable level, one may reject it as an ineffective compound with a high probability, and if its response rate is at the desirable level, one may not reject it as a promising compound with a high probability. As a result, under a two-stage trial design, it is of interest to test the following hypotheses:

H0 : p ≤ p0 versus Ha : p ≥ p1.

Rejection of H0 (or Ha) means that further (or no further) study of the test treatment should be carried out. Note that under the above hypotheses, the usual type I error is the false positive of accepting an ineffective drug, and the type II error is the false negative of rejecting a promising compound. An alternative to the optimal two-stage design described above is the so-called flexible two-stage design (Chen and Ng, 1998), which simply assumes that the sample sizes are uniformly distributed on a set of k consecutive possible values.
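The operating characteristics of such a two-stage design follow directly from binomial probabilities. The sketch below is a minimal illustration; the design parameters n1 = 10, r1 = 1, n = 29, r = 5 and the rates p0 = 0.10, p1 = 0.30 are assumed for illustration, not taken from the book:

```python
from math import comb

def bpmf(k, n, p):
    # Binomial probability mass function.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def bcdf(k, n, p):
    # Binomial CDF; returns 0 for negative k.
    return sum(bpmf(j, n, p) for j in range(k + 1)) if k >= 0 else 0.0

def two_stage_oc(n1, r1, n, r, p):
    """P(reject the drug), P(early termination), and E[N] for a two-stage
    design: stop and reject the drug if <= r1 responses among the first n1
    patients; otherwise treat n - n1 more and reject if <= r responses
    among all n patients."""
    pet = bcdf(r1, n1, p)  # probability of early termination
    p_reject = pet + sum(
        bpmf(x, n1, p) * bcdf(r - x, n - n1, p)
        for x in range(r1 + 1, n1 + 1)
    )
    expected_n = n1 + (1 - pet) * (n - n1)
    return p_reject, pet, expected_n

rej0, pet0, en0 = two_stage_oc(10, 1, 29, 5, p=0.10)  # under p = p0
rej1, _, _ = two_stage_oc(10, 1, 29, 5, p=0.30)       # under p = p1
print(f"alpha = {1 - rej0:.3f}, power = {1 - rej1:.3f}, "
      f"PET(p0) = {pet0:.2f}, E[N | p0] = {en0:.1f}")
```

Here "alpha" is the chance of accepting an ineffective drug and "power" the chance of carrying a promising one forward, matching the error definitions above; early termination keeps the expected sample size under p0 well below n.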

For the comparison of multiple arms, Sargent and Goldberg (2001) proposed a flexible optimal design that allows clinical scientists to select the treatment to proceed to further testing based on other factors when the difference in the observed response rates between treatment arms falls into the interval [−δ, δ], where δ is a pre-specified quantity. The proposed rule is that if the observed difference in the response rates of the treatments is larger than δ, then the treatment with the highest observed response rate is selected. On the other hand, if the observed difference is less than or equal to δ, other factors may be considered in the selection. It should be noted that under this framework it is not essential that the very best treatment be definitely selected; rather, it is important that a substantially inferior treatment not be selected when a superior treatment exists.

Remarks Note that in a classic design with a fixed sample size, either a one-sided α of 0.025 or a two-sided α of 0.05 can be used, because they lead to the same results. In group sequential trials, however, two-sided tests should not be used, for the following reason. For a trial that is designed to allow early stopping for efficacy, when the test drug is significantly worse than the control, the trial may continue in an attempt to claim


efficacy instead of stopping. This is due to the difference in the continuation region between a one-sided and a two-sided test, which in turn affects the consequent stopping boundaries. As a result, it is suggested that the relative merits and disadvantages of using a one-sided versus a two-sided test be carefully evaluated. It should be noted that if a two-sided test is used, the significance level α could be inflated (for a futility design) or deflated (for an efficacy design).

6.2 General Approach for Group Sequential Design

Jennison and Turnbull (2000) introduced a unified approach for group sequential trial design, which is briefly described below. Consider a group sequential study consisting of up to K analyses. Thus, we have a sequence of test statistics {Z1, . . . , ZK}. Assume that these statistics follow a joint canonical distribution with information levels {I1, . . . , IK} for the parameter θ. Thus, we have

Zk ∼ N(θ√Ik, 1), k = 1, . . . , K,

where Cov(Zk1, Zk2) = √(Ik1/Ik2), 1 ≤ k1 ≤ k2 ≤ K.

Table 6.1 provides formulas for sample size calculations for different types of study endpoints in clinical trials, while Table 6.2 summarizes the unified formulation for the different endpoint types under a group sequential design. As an example, for the logrank test in a time-to-event analysis, the information can be expressed as

Ik = r/(1 + r)² · dk = r/(1 + r)² · Nk/σ²,

where dk is the expected number of deaths, Nk is the expected number of patients, and r is the sample size ratio.

Let T0 and Tmax be the accrual time and the total follow-up time, respectively. Then, under the condition of exponential survival distributions, we have

dik = (Nik/T0) (T0 − (e^{λi T0} − 1)/(λi e^{λi T})), T > T0; i = 1, 2; k = 1, . . . , K,

where dik is the number of deaths in group i at stage k, and Nik is the number of patients in group i at stage k. Furthermore,

σ² = (N1k + N2k)/(d1k + d2k) = (1 + r)/(ξ1 + rξ2),


Table 6.1 Sample Sizes for Different Types of Endpoints

One mean:            n  = (z1−α + z1−β)² σ²/ε²
Two means:           n1 = (z1−α + z1−β)² σ² (1 + 1/r)/ε²
One proportion:      n  = (z1−α + z1−β)² σ²/ε²,  σ² = p(1 − p)
Two proportions:     n1 = (z1−α + z1−β)² σ² (1 + 1/r)/ε²,  σ² = p̄(1 − p̄),
                     p̄ = (n1 p1 + n2 p2)/(n1 + n2)
One survival curve:  n  = (z1−α + z1−β)² σ²/ε²,
                     σ² = λ0² (1 − (e^{λ0 T0} − 1)/(T0 λ0 e^{λ0 Ts}))^{−1}
Two survival curves: n1 = (z1−α + z1−β)² σ² (1 + 1/r)/ε²,
                     σ² = (r σ1² + σ2²)/(1 + r),
                     σi² = λi² (1 − (e^{λi T0} − 1)/(T0 λi e^{λi Ts}))^{−1}

Note: r = n2/n1; λ0 = expected hazard rate; T0 = uniform patient accrual time; Ts = trial duration. The logrank test is used for the comparison of the two survival curves.

Table 6.2 Unified Formulation for Sequential Design

Single mean:         Zk = (x̄k − µ0)√Ik,     Ik = nk/σ²
Paired means:        Zk = d̄k√Ik,            Ik = nk/σ²
Two means:           Zk = (x̄Ak − x̄Bk)√Ik,  Ik = (σA²/nAk + σB²/nBk)^{−1}
One proportion:      Zk = (pk − p0)√Ik,      Ik = nk/σ²,  σ² = p(1 − p)
Two proportions:     Zk = (pAk − pBk)√Ik,    Ik = (1/σ²)(1/nAk + 1/nBk)^{−1},
                     σ² = p̄(1 − p̄)
One survival curve:  Zk = Sk/√Ik,            Ik = dk = Nk/σ²,
                     σ² as given in Table 6.1
Two survival curves: Zk = Sk/√Ik,            Ik = r dk/(1 + r)² = r Nk/((1 + r)² σ²),
                     σ² as given in Table 6.1


where

ξi = 1 − (e^{λi T0} − 1)/(T0 λi e^{λi T}).

Note that in practice, we may first choose c and α in the stopping boundaries using the conditional probabilities, and then perform the sample size calculation based on the determined boundaries and the corresponding conditional probabilities to achieve the desired power.

6.3 Early Stopping Boundaries

In clinical trials, it is desirable to stop a trial early if the treatment under investigation is ineffective. On the other hand, if strong (or highly significant) evidence of efficacy is observed, we may also terminate the trial early. In what follows, we discuss boundaries for early stopping of a given trial due to (i) efficacy, (ii) futility, and (iii) efficacy or futility, assuming that there are a total of K analyses in the trial.

Early efficacy stopping

For the case of early stopping for efficacy, we consider testing the one-sided null hypothesis H0 : µA ≤ µB, where µA and µB could be means, proportions, or hazard rates for treatment groups A and B, respectively. Figure 6.2 illustrates stopping boundaries for efficacy based on the standardized Z score. The decision rules for early stopping for efficacy are, for k = 1, . . . , K − 1,

If Zk < αk, continue to the next stage;
If Zk ≥ αk, stop and reject H0,

[Figure 6.2 plots efficacy stopping boundaries on the standard-z scale (from −2.96 to 2.96) against analysis stages 1 to 4.]

Figure 6.2 Stopping boundary for efficacy.


and, at the final analysis,

If ZK < αK, stop and accept H0;
If ZK ≥ αK, stop and reject H0.

In addition to Pocock's test and O'Brien and Fleming's test, Wang and Tsiatis (1987) proposed a family of two-sided tests indexed by a parameter ∆, also based on the standardized test statistic Zk. Wang and Tsiatis' test includes Pocock's and O'Brien-Fleming's boundaries as special cases. As a result, in this section we focus on the method proposed by Wang and Tsiatis (1987), whose boundary function is given by

αk = αK (k/K)^{∆−1/2}, (6.1)

where the constant αK is a function of K, α, and ∆.
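Given a final critical value αK (e.g., from Table 6.3a), the interim boundaries follow directly from (6.1). A minimal sketch, with a function name of our own choosing:

```python
def wang_tsiatis(alpha_K, K, delta):
    # a_k = alpha_K * (k/K)**(delta - 1/2); alpha_K is the final critical
    # value that controls the overall type I error (tabulated, e.g., in
    # Table 6.3a).
    return [alpha_K * (k / K) ** (delta - 0.5) for k in range(1, K + 1)]

# Delta = 0.3, K = 5, final boundary alpha_5 = 2.1697 from Table 6.3a:
bounds = wang_tsiatis(2.1697, K=5, delta=0.3)
print([round(b, 3) for b in bounds])  # decreasing from about 2.99 to 2.1697
```

For ∆ < 1/2 the boundaries decrease across looks (O'Brien-Fleming-like conservatism early on); ∆ = 1/2 gives a flat, Pocock-type boundary.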

Note that the sample sizes N0 in Table 6.3b (also Tables 6.4b and 6.5b) were generated using ExpDesign Studio® based on an effect size δ0 = 0.1. Therefore, for an effect size δ, the sample size is given by N = N0 (0.1/δ)².

Example 6.1 (normal endpoint) Suppose that an investigator is interested in conducting a clinical trial with 5 analyses for comparing a test drug (T) with a placebo (P). Based on information obtained from a pilot study, data from the test drug and the placebo seem to have a common variance, i.e., σ² = σ1² = σ2² = 4, with µT − µP = 1. Assuming these observed values are true, it is desirable to select a maximum sample size such that there is an 85% (1 − β = 0.85) power for detecting such a difference between the test drug and the placebo at the 2.5% (one-sided α = 0.025) level of significance. The Wang-Tsiatis stopping boundary with ∆ = 0.3 is used.

The stopping boundaries are given in Table 6.3a, from which α5 = 2.1697. Since the effect size is δ = (µT − µP)/σ = 0.5, the required sample size for a classic design with no planned interim analyses is

Nfixed = 3592 (0.1/0.5)² = 144.

The maximum sample size is given by

Nmax = 3898 (0.1/0.5)² = 156,

while the expected sample size under the alternative hypothesis is

N = 2669 (0.1/0.5)² = 107.
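These rescalings can be reproduced with a one-line helper (a sketch; the round-up convention is our assumption):

```python
from math import ceil

def rescale(N0, delta, delta0=0.1):
    # Table entries N0 assume effect size delta0 = 0.1; for effect size
    # delta, N = N0 * (delta0/delta)**2 (rounded up -- our convention).
    return ceil(N0 * (delta0 / delta) ** 2)

delta = 0.5  # (mu_T - mu_P)/sigma = 1/2 in Example 6.1
print(rescale(3592, delta))  # fixed design: 144
print(rescale(3898, delta))  # maximum sample size: 156
print(rescale(2669, delta))  # expected size under Ha: 107
```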


Table 6.3a Final Efficacy Stopping Boundary αK

 ∆     K = 1    K = 2    K = 3    K = 4    K = 5
 0     1.9599   1.9768   2.0043   2.0242   2.0396
 0.1            1.9936   2.0258   2.0503   2.0687
 0.2            2.0212   2.0595   2.0870   2.1085
 0.3            2.0595   2.1115   2.1452   2.1697
 0.4            2.1115   2.1850   2.2325   2.2662
 0.5            2.1774   2.2891   2.3611   2.4132
 0.6            2.2631   2.4270   2.5403   2.6245
 0.7            2.3642   2.6061   2.7807   2.9185
 0.8            2.4867   2.8297   3.0900   3.3074
 0.9            2.6306   3.1022   3.4820   3.8066
 1.0            2.7960   3.4268   3.9566   4.4252

Note: Equal information intervals, one-sided α = 0.025.

Table 6.3b Maximum Sample Size and Expected Sample Size Under Ha

 ∆     K = 1   K = 2       K = 3       K = 4       K = 5
 0     3592    3616/3161   3652/2986   3674/2884   3689/2828
 0.1           3644/3080   3685/2920   3713/2829   3732/2774
 0.2           3691/3014   3739/2855   3771/2768   3794/2717
 0.3           3758/2966   3829/2804   3870/2720   3898/2669
 0.4           3850/2941   3962/2773   4031/2695   4078/2649
 0.5           3968/2937   4161/2780   4285/2715   4374/2682
 0.6           4128/2965   4436/2828   4662/2792   4834/2785
 0.7           4316/3012   4809/2923   5195/2933   5520/2975
 0.8           4548/3088   5288/3063   5914/3135   6487/3249
 0.9           4821/3188   5881/3242   6860/3400   7788/3593
 1.0           5130/3307   6586/3454   8011/3696   9427/3974

Note: Entries are Nmax/expected sample size under Ha. Equal information intervals, one-sided α = 0.025, power = 85%, effect size = 0.1.

Early futility stopping

For the case of early stopping for futility, we similarly consider testing the one-sided null hypothesis H0 : µA ≤ µB, where µA and µB could be means, proportions, or hazard rates for treatment groups A and B, respectively. Figure 6.3 illustrates stopping boundaries for early futility


Table 6.4a Final Symmetric Futility Stopping Boundary βK

 ∆     K = 1    K = 2    K = 3    K = 4    K = 5
 0     1.9599   1.9546   1.9431   1.9316   1.9224
 0.1            1.9500   1.9339   1.9201   1.9098
 0.2            1.9419   1.9224   1.9063   1.8926
 0.3            1.9316   1.9063   1.8857   1.8696
 0.4            1.9155   1.8834   1.8581   1.8374
 0.5            1.8972   1.8535   1.8202   1.7938
 0.6            1.8765   1.8191   1.7754   1.7398
 0.7            1.8512   1.7777   1.7238   1.6790
 0.8            1.8237   1.7364   1.6698   1.6169
 0.9            1.7950   1.6916   1.6169   1.5572
 1.0            1.7651   1.6480   1.5641   1.4998

Note: Equal information intervals, one-sided α = 0.025.

Table 6.4b Maximum Sample Size and Expected Sample Size Under Ho

 ∆     K = 1   K = 2       K = 3       K = 4       K = 5
 0     3592    3608/2616   3636/2433   3656/2287   3672/2202
 0.1           3626/2496   3658/2305   3683/2178   3705/2096
 0.2           3655/2396   3700/2174   3734/2053   3759/1974
 0.3           3701/2321   3770/2054   3817/1924   3855/1845
 0.4           3761/2267   3874/1960   3956/1815   4020/1728
 0.5           3845/2241   4024/1901   4165/1742   4285/1648
 0.6           3950/2238   4223/1880   4454/1713   4662/1618
 0.7           4073/2254   4459/1889   4805/1723   5131/1632
 0.8           4207/2285   4726/1923   5196/1761   5651/1675
 0.9           4352/2327   4999/1971   5594/1813   6167/1732
 1.0           4500/2376   5266/2026   5963/1868   6632/1788

Note: Entries are Nmax/expected sample size under Ho. Equal information intervals, one-sided α = 0.025, δ = 0.1, power = 85%.

based on the Z score. The decision rules for early stopping for futility are, for k = 1, . . . , K − 1,

If Zk < βk, stop and accept H0;
If Zk ≥ βk, continue to the next stage,

and, at the final analysis,

If ZK < βK, stop and accept H0;
If ZK ≥ βK, stop and reject H0.


[Figure 6.3 plots futility stopping boundaries on the standard-z scale (from −1.58 to 1.58) against analysis stages 1 to 4.]

Figure 6.3 Stopping boundary for futility.

We propose an inner futility stopping boundary that is symmetric to the Wang-Tsiatis efficacy stopping boundary on the z-scale, and a triangular boundary; they are given below, respectively:

• Symmetric boundary:

βk = 2βK √(k/K) − βK (k/K)^{∆−1/2}. (6.2)

• Triangular boundary:

βk = βK (k − k0)/(K − k0), where k0 = [K/2] + 1 (6.3)

and [x] is the integer part of x.
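Both futility boundaries are easy to evaluate. A minimal sketch (function names are ours), using βK = 1.8765, the value from Table 6.4a for ∆ = 0.6 and K = 2:

```python
from math import sqrt, floor

def symmetric_futility(beta_K, K, delta):
    # (6.2): beta_k = 2*beta_K*sqrt(k/K) - beta_K*(k/K)**(delta - 1/2)
    return [2 * beta_K * sqrt(k / K) - beta_K * (k / K) ** (delta - 0.5)
            for k in range(1, K + 1)]

def triangular_futility(beta_K, K):
    # (6.3): beta_k = beta_K*(k - k0)/(K - k0), with k0 = [K/2] + 1
    k0 = floor(K / 2) + 1
    return [beta_K * (k - k0) / (K - k0) for k in range(1, K + 1)]

print([round(b, 4) for b in symmetric_futility(1.8765, K=2, delta=0.6)])
print([round(b, 4) for b in triangular_futility(1.8765, K=4)])
```

Note that both formulas reduce to βK at the final look (k = K), and the triangular boundary starts negative so that futility stopping is effectively inactive at the earliest looks.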

Example 6.2 (binary endpoint) Suppose that an investigator is interested in conducting a group sequential trial comparing a test drug with a placebo. The primary efficacy study endpoint is a binary response. Based on information obtained in a pilot study, the response rates for the test drug and the placebo are given by 20% (p1 = 0.20) and 30% (p2 = 0.30), respectively. Suppose that a total of 2 (K = 2) analyses are planned. It is desirable to select a maximum sample size in order to have an 85% (1 − β = 0.85) power at the 2.5% (one-sided α = 0.025) level of significance. The effect size is

δ = (p2 − p1)/√(p̄(1 − p̄)) = (0.3 − 0.2)/√(0.25(1 − 0.25)) = 0.23094.


If the symmetric inner boundaries are used with ∆ = 0.6, the design is optimal in the sense of minimizing the expected sample size under the null hypothesis. From Table 6.4a, β2 = 1.8765, and from (6.2) we have

β1 = 2βK √(k/K) − βK (k/K)^{∆−1/2} = 2(1.8765)√(1/2) − 1.8765 (1/2)^{0.6−0.5} = 0.90294.

From Table 6.4b, the sample size needed for a fixed sample size design is

Nfixed = 3592 (0.1/0.23094)² = 674.

The maximum sample size and the expected sample size under the null hypothesis are

Nmax = 3950 (0.1/0.23094)² = 742

and

Nexp = 2238 (0.1/0.23094)² = 420.

Early efficacy-futility stopping

For the case of early stopping for efficacy or futility, we similarly consider testing the one-sided null hypothesis H0 : µA ≤ µB, where µA and µB could be means, proportions, or hazard rates for treatment groups A and B, respectively. Figure 6.4 illustrates stopping boundaries for early efficacy/futility stopping. The decision rules are then given by

If Zk < βk (k = 1, . . . , K), stop and accept H0;
If Zk ≥ αk (k = 1, . . . , K), stop and reject H0.

The stopping boundaries are the combination of the previous efficacy and futility stopping boundaries, i.e.,

• Symmetric boundary:

αk = αK (k/K)^{∆−1/2}, βk = 2βK √(k/K) − βK (k/K)^{∆−1/2}. (6.4)

• Triangular boundary:

αk = αK (k/K)^{∆−1/2}, βk = βK (k − k0)/(K − k0), (6.5)


[Figure 6.4 plots combined efficacy/futility stopping boundaries on the standard-z scale (from −3.38 to 3.38) against analysis stages 1 to 4.]

Figure 6.4 Stopping boundary for efficacy or futility.

where the starting point of the inner boundary is given by k0 = [K/2] + 1.

Example 6.3 (survival endpoint) Suppose that an investigator is interested in conducting a survival trial with 3 (K = 3) analyses at the 2.5% level of significance (one-sided α = 0.025) with an 85% (1 − β = 0.85) power. Assume that the median survival time is 0.990 year (λ1 = 0.7/year) for group 1 and 0.693 year (λ2 = 1/year) for group 2, with accrual time T0 = 1 year and study duration Ts = 2 years. Using

σi² = λi² (1 − (e^{λi T0} − 1)/(T0 λi e^{λi Ts}))^{−1},

Table 6.5a Final Stopping Boundaries αK = βK

 ∆     K = 1    K = 2    K = 3    K = 4    K = 5
 0     1.9599   1.9730   1.9902   2.0028   2.0143
 0.1            1.9856   2.0074   2.0235   2.0373
 0.2            2.0074   2.0361   2.0568   2.0717
 0.3            2.0396   2.0821   2.1096   2.1303
 0.4            2.0866   2.1521   2.1946   2.2256
 0.5            2.1487   2.2532   2.3301   2.3737
 0.6            2.2290   2.3898   2.5035   2.5896
 0.7            2.3301   2.5678   2.7458   2.8871
 0.8            2.4530   2.7929   3.0593   3.2798
 0.9            2.6000   3.0685   3.4475   3.7759
 1.0            2.7722   3.3947   3.9183   4.3823

Note: Equal information intervals, one-sided α = 0.025.


Table 6.5b Maximum Sample Size and Expected Sample Size Under Ho and Ha

 ∆     K = 1   K = 2             K = 3              K = 4              K = 5
 0     3592    3636/2722/3142    3698/2558/2942     3744/2420/2826     3785/2346/2764
 0.1           3680/2615/3050    3758/2449/2868     3818/2333/2760     3868/2260/2698
 0.2           3755/2534/2974    3860/2342/2793     3937/2236/2692     3995/2167/2628
 0.3           3868/2484/2918    4024/2251/2728     4128/2140/2628     4207/2073/2567
 0.4           4037/2472/2893    4280/2193/2689     4442/2067/2587     4562/1994/2527
 0.5           4265/2497/2897    4662/2181/2687     4931/2034/2584     5136/1950/2527
 0.6           4573/2568/2937    5209/2226/2726     5675/2059/2630     6038/1959/2578
 0.7           4984/2695/3017    5985/2348/2816     6775/2164/2724     7427/2047/2677
 0.8           5525/2891/3143    7078/2580/2967     8402/2397/2879     9574/2272/2830
 0.9           6238/3180/3327    8619/2972/3207     10805/2835/3131    12885/2736/3083
 1.0           6238/3180/3327    10784/3591/3591    14358/3590/3590    17953/3591/3591

Note: Entries are Nmax/expected sample size under Ho/expected sample size under Ha. Equal information intervals, one-sided α = 0.025, δ = 0.1, power = 85%.

we obtain σ1² = 0.7622 and σ2² = 1.303. Note that

δ = (λ2 − λ1)/σ = (1 − 0.7)/√(0.7622 + 1.303) = 0.20876.

The stopping boundaries (∆ = 0.1) are given by α3 = β3 = 2.0074 (Table 6.5a). Thus, we have

α1 = 3.1152, α2 = 2.3609; β1 = −0.79723, β2 = 0.91721.

Thus,

Nfixed = (zα/2 + zβ)²(σ1² + σ2²)/(λ2 − λ1)² = (1.96 + 1.0364)²(0.7622 + 1.303)/(1 − 0.7)² = 206.

The maximum sample size is given by

Nmax = 3758 (0.1/0.20876)² = 862.

The expected sample size under the null hypothesis is given by

No = 2449 (0.1/0.20876)² = 562.


Table 6.6 Error Spending Functions

  O'Brien-Fleming   α1(s) = 2{1 − Φ(zα/2/√s)}
  Pocock            α2(s) = α log[1 + (e − 1)s]
  Lan-DeMets-Kim    α3(s) = αs^θ, θ > 0
  Hwang-Shih        α4(s) = α[(1 − e^{−ζs})/(1 − e^{−ζ})], ζ ≠ 0

Source: Chow, Shao, and Wang (2003)

The expected sample size under the alternative hypothesis is given by

Na = 2868 (0.1/0.20876)² = 658.
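The arithmetic of Example 6.3.3 can be checked numerically. The sketch below assumes the variance formula for an exponential survival endpoint given above, and uses the ∆ = 0.1, K = 3 constants from Table 6.5b, which are tabulated for δ = 0.1 and hence rescale by (0.1/δ)²:

```python
import math

def sigma_sq(lmbda, T0, Ts):
    # Variance term for an exponential survival endpoint with accrual
    # period T0 and total study duration Ts (formula above).
    return lmbda**2 / (1 - (math.exp(lmbda * T0) - 1) / (T0 * lmbda * math.exp(lmbda * Ts)))

s1 = sigma_sq(0.7, 1, 2)   # group 1, lambda1 = 0.7/year -> 0.7622
s2 = sigma_sq(1.0, 1, 2)   # group 2, lambda2 = 1/year   -> 1.303
delta = (1.0 - 0.7) / math.sqrt(s1 + s2)           # standardized effect, 0.20876
n_fixed = (1.96 + 1.0364)**2 * (s1 + s2) / 0.3**2  # classic (fixed) design, 206
# Table 6.5b is computed for delta = 0.1, so sample sizes rescale by (0.1/delta)^2:
n_max = 3758 * (0.1 / delta)**2   # maximum sample size, 862
n_H0  = 2449 * (0.1 / delta)**2   # expected N under H0, 562
n_Ha  = 2868 * (0.1 / delta)**2   # expected N under Ha, 658

print(round(s1, 4), round(s2, 3), round(delta, 5))
print(round(n_fixed), round(n_max), round(n_H0), round(n_Ha))
```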

6.4 Alpha Spending Function

Lan and DeMets (1983) proposed to distribute (or spend) the total probability of false positive risk as a continuous function of the information time in group sequential procedures for interim analyses. If the total information scheduled to accumulate over the maximum duration T is known, the boundaries can be computed as a continuous function of the information time. This continuous function of the information time is referred to as the alpha spending function, denoted by α(s). The alpha spending function is an increasing function of information time: it is 0 when the information time is 0 and equals the overall significance level when the information time is 1. In other words, α(0) = 0 and α(1) = α. Let s1 and s2 be two information times, 0 < s1 < s2 ≤ 1, and denote by α(s1) and α(s2) the corresponding values of the alpha spending function at s1 and s2. Then,

0 < α(s1) < α(s2) ≤ α.

α(s1) is the probability of type I error one wishes to spend at information time s1. For a given alpha spending function α(s) and a series of standardized test statistics Zk, k = 1, . . . , K, the corresponding boundaries ck, k = 1, . . . , K are chosen such that under the null hypothesis

P(Z1 < c1, . . . , Zk−1 < ck−1, Zk ≥ ck) = α(k/K) − α((k − 1)/K).

Some commonly used alpha spending functions are summarized in Table 6.6.
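The four spending functions of Table 6.6 can be evaluated directly. The sketch below assumes a one-sided overall level α = 0.025 and illustrative values θ = 1 and ζ = 1; every function spends the full α at information time s = 1:

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal distribution
alpha = 0.025     # one-sided overall significance level

def obrien_fleming(s):
    # alpha1(s) = 2{1 - Phi(z_{alpha/2} / sqrt(s))}
    z_half = N.inv_cdf(1 - alpha / 2)
    return 2 * (1 - N.cdf(z_half / sqrt(s)))

def pocock(s):
    # alpha2(s) = alpha * log[1 + (e - 1) s]
    return alpha * log(1 + (exp(1) - 1) * s)

def lan_demets_kim(s, theta=1.0):
    # alpha3(s) = alpha * s^theta, theta > 0
    return alpha * s ** theta

def hwang_shih(s, zeta=1.0):
    # alpha4(s) = alpha * (1 - e^{-zeta s}) / (1 - e^{-zeta}), zeta != 0
    return alpha * (1 - exp(-zeta * s)) / (1 - exp(-zeta))

for s in (0.25, 0.5, 0.75, 1.0):
    print(s, [round(f(s), 5) for f in (obrien_fleming, pocock, lan_demets_kim, hwang_shih)])
```

Note how the O'Brien-Fleming function spends almost nothing early (it is very concave), while the Pocock function spends alpha nearly linearly.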


We now introduce the procedure for sample size calculation based on Lan-DeMets' alpha spending function, i.e.,

α(s) = αs^θ, θ > 0.

Although the alpha spending function approach does not require a fixed maximum number of interim analyses or equally spaced analyses, those assumptions are necessary in order to calculate the sample size under the alternative hypothesis. The sample size calculation can be performed using computer simulations. As an example, in order to achieve a 90% power at the one-sided 2.5% level of significance, it is necessary to have nfixed = 84 subjects per treatment group for the classic design. The maximum sample size needed for achieving the desired power with 5 interim analyses using the Lan-DeMets type alpha spending function with θ = 2 can be calculated as 92 subjects.
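The simulation such a calculation rests on can be sketched as follows, assuming K = 5 equally spaced looks, one-sided α = 0.025, and spending function α(s) = αs² (the parameter value is illustrative). Stopping boundaries are found empirically so that the fraction of null paths crossing at look k matches the incremental alpha; the power for a candidate sample size would then be estimated by re-running the paths with the drift implied by the alternative:

```python
import numpy as np

alpha, K, theta = 0.025, 5, 2.0
spend = alpha * (np.arange(1, K + 1) / K) ** theta  # cumulative alpha spent at each look
pi_k = np.diff(np.concatenate(([0.0], spend)))      # incremental alpha per look

rng = np.random.default_rng(1)
M = 200_000
# Under H0, simulate standardized statistics at equally spaced information times:
# z_k = B(t_k)/sqrt(t_k) where B is standard Brownian motion and t_k = k/K.
increments = rng.standard_normal((M, K)) / np.sqrt(K)
z = np.cumsum(increments, axis=1) / np.sqrt(np.arange(1, K + 1) / K)

bounds = []
alive = np.ones(M, dtype=bool)   # paths that have not yet crossed a boundary
for k in range(K):
    zk = z[alive, k]
    n_cross = int(round(pi_k[k] * M))   # paths allowed to cross at look k
    c_k = np.sort(zk)[-n_cross]         # empirical boundary on the z-scale
    bounds.append(float(c_k))
    alive[alive] = zk < c_k             # crossed paths stop here
print(np.round(bounds, 3))  # first boundary near Phi^{-1}(1 - 0.001) = 3.09
```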

6.5 Group Sequential Design Based on Independent P-values

In this section, we discuss the n-stage adaptive design based on individual p-values from each stage proposed by Chang (2005). In an adaptive group sequential design with K stages, a hypothesis test is performed at each stage, followed by certain actions according to the outcomes. Such actions could be early stopping for efficacy or futility, sample size re-estimation, modification of randomization, or other adaptations. At the kth stage, a typical set of hypotheses for the treatment difference is given by

H0k : ηk1 ≥ ηk2 vs. Hak : ηk1 < ηk2, (6.6)

where ηk1 and ηk2 are the treatment responses (e.g., mean, proportion, or survival) at the kth stage. We denote the test statistic for H0k and the corresponding p-value by Tk and pk, respectively.

The global test for the null hypothesis of no treatment effect can be written as an intersection of the individual hypotheses at the different stages:

H0 : H01 ∩ . . . ∩ H0K (6.7)

Note that a one-sided test is assumed in this chapter. At the kth stage, the decision rules are given by

Stop for efficacy if Tk ≤ αk,
Stop for futility if Tk > βk,
Continue with adaptations if αk < Tk ≤ βk,   (6.8)

Page 141: Adaptive Design Methods in Clinical Trailsvct.qums.ac.ir/portal/file/?184695/Adaptive-Design-Methods-in-Clinical-Trials.pdfof adaptive design and analysis in clinical research and

124 ADAPTIVE DESIGN METHODS IN CLINICAL TRIALS

where αk < βk (k = 1, . . . , K − 1) and αK = βK. For convenience's sake, αk and βk are referred to as the efficacy and futility boundaries, respectively.

To reach the kth stage, a trial has to pass all stages up to the (k − 1)th stage, i.e.,

0 ≤ αi < Ti ≤ βi ≤ 1 (i = 1, . . . , k − 1).

Therefore, the cumulative distribution function of Tk is given by

ϕk(t) = Pr(Tk < t, α1 < T1 ≤ β1, . . . , αk−1 < Tk−1 ≤ βk−1)
      = ∫_{α1}^{β1} · · · ∫_{αk−1}^{βk−1} ∫_0^t f_{T1...Tk} dtk dtk−1 · · · dt1,   (6.9)

where f_{T1...Tk} is the joint probability density function of T1, . . . , Tk, and ti is the realization of Ti.

The joint probability density function f_{T1...Tk} in (6.9) can be any density function. However, it is desirable to choose Ti such that f_{T1...Tk} has a simple form. Note that when ηk1 = ηk2, the p-value pk from the sub-sample at the kth stage is uniformly distributed on [0,1] under H0, and the pk (k = 1, . . . , K) are mutually independent. These two desirable properties can be used to construct test statistics for adaptive designs.

The simplest form of the test statistic at the kth stage is given by

Tk = pk, k = 1, . . . , K.   (6.10)

Due to the independence of the pk, f_{T1...Tk} = 1 under H0 and

ϕk(t) = t ∏_{i=1}^{k−1} Li,   (6.11)

where Li = βi − αi. For convenience's sake, when the upper limit of the product is smaller than the lower limit, the (empty) product ∏(·) is defined to be 1. It is obvious that the error rate (α spent) at the kth stage is given by

πk = ϕk(αk).   (6.12)

When efficacy is claimed at a certain stage, the trial is stopped. Therefore, the type I errors at different stages are mutually exclusive. Hence, the experiment-wise type I error rate can be written as

α = Σ_{k=1}^{K} πk.   (6.13)

From (6.13), the experiment-wise type I error rate is given by

α = Σ_{k=1}^{K} αk ∏_{i=1}^{k−1} Li.   (6.14)


Note that (6.14) is the necessary and sufficient condition for determination of the stopping boundaries (αi, βi).

6.6 Calculation of Stopping Boundaries

Two-stage design

For a two-stage design, (6.14) becomes

α = α1 + α2(β1 − α1).   (6.15)

For convenience's sake, a table of stopping boundaries is constructed using (6.15) (Table 6.7). The adjusted p-value is given by

p(t, k) = t if k = 1, and p(t, k) = α1 + (β1 − α1)t if k = 2.   (6.16)

K-stage design

For K-stage designs, it is convenient to define a function for Li and αi. Such functions could be of the form

Lk = b(1/k − 1/K),   (6.17)

and

αk = c k^θ α,   (6.18)

Table 6.7 Stopping Boundaries for Two-Stage Designs

                         α1
  β1     0.000   0.005   0.010   0.015   0.020
  0.15   0.1667  0.1379  0.1071  0.0741  0.0385
  0.20   0.1250  0.1026  0.0789  0.0541  0.0278
  0.25   0.1000  0.0816  0.0625  0.0426  0.0217
  0.30   0.0833  0.0678  0.0517  0.0351  0.0179
  0.35   0.0714  0.0580  0.0441  0.0299  0.0152
  0.40   0.0625  0.0506  0.0385  0.0260  0.0132
  0.50   0.0500  0.0404  0.0306  0.0206  0.0104
  0.80   0.0312  0.0252  0.0190  0.0127  0.0064
  1.00   0.0250  0.0201  0.0152  0.0102  0.0051

Note: Body entries are α2. One-sided α = 0.025.
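Table 6.7 follows directly from (6.15) and (6.16); a minimal sketch:

```python
def alpha2(alpha, alpha1, beta1):
    # Solve alpha = alpha1 + alpha2 * (beta1 - alpha1), i.e. equation (6.15),
    # for the final-stage efficacy boundary alpha2.
    return (alpha - alpha1) / (beta1 - alpha1)

def adjusted_p(t, k, alpha1, beta1):
    # Stage-wise adjusted p-value, equation (6.16).
    return t if k == 1 else alpha1 + (beta1 - alpha1) * t

# Reproduce two entries of Table 6.7 (one-sided alpha = 0.025):
print(round(alpha2(0.025, 0.010, 0.25), 4))  # 0.0625
print(round(alpha2(0.025, 0.005, 0.20), 4))  # 0.1026
```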


where b, c, and θ are constants. Because of the equality Lk = βk − αk, the futility boundaries become

βk = b(1/k − 1/K) + c k^θ α.   (6.19)

Note that the coefficient b determines how fast the continuation band shrinks when a trial proceeds from one stage to the next, and θ determines the curvature of the stopping boundaries αk and βk. Substituting (6.17) and (6.18) into (6.14) and solving for the constant c leads to

c = [Σ_{k=1}^{K} { b^{k−1} k^θ ∏_{i=1}^{k−1} (1/i − 1/K) }]^{−1}.   (6.20)

When K, b, and θ are pre-determined, (6.20) can be used to obtain c. Then, (6.18) and (6.19) give the stopping boundaries αk and βk. For convenience, the constant c is tabulated for different values of b, θ, and K (Table 6.8).

Note that when θ < 0, θ > 0, and θ = 0, the efficacy stopping boundaries are a monotonically decreasing function of k, an increasing function of k, and a constant, respectively. When the constant b increases, the continuation band bounded by the futility and efficacy stopping boundaries shrinks faster. It is suggested that b be small enough such that all βk < 1. Also, it should be noted that LK = 0.
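Equations (6.17)-(6.20) can be implemented in a few lines; the sketch below reproduces an entry of Table 6.8 and the GSD 1 boundaries of Table 6.10 (K = 4, b = 0.5, θ = 1, one-sided α = 0.025):

```python
from math import prod

def c_constant(K, b, theta):
    # Constant c from (6.20): the normalizer that makes the p-value
    # boundaries alpha_k = c * k**theta * alpha satisfy (6.14).
    total = sum(
        b ** (k - 1) * k ** theta * prod(1 / i - 1 / K for i in range(1, k))
        for k in range(1, K + 1)
    )
    return 1 / total

def boundaries(K, b, theta, alpha):
    # Efficacy (alpha_k) and futility (beta_k) boundaries from (6.17)-(6.19).
    c = c_constant(K, b, theta)
    a = [c * k ** theta * alpha for k in range(1, K + 1)]
    bta = [b * (1 / k - 1 / K) + ak for k, ak in zip(range(1, K + 1), a)]
    return a, bta

print(round(c_constant(4, 0.5, 1), 4))    # 0.5267, as in Table 6.8
a, bt = boundaries(4, 0.5, 1, 0.025)
print([round(x, 4) for x in a])           # [0.0132, 0.0263, 0.0395, 0.0527]
```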

Trial Examples

To illustrate the group sequential methods described in the previous sections, in what follows, two examples concerning clinical trials with group sequential designs are given. For illustration purposes, these two examples have been modified slightly from actual trials.

Example 6.1 Consider a two-arm comparative oncology trial comparing a test treatment with a control. Suppose that the primary efficacy endpoint of interest is time to disease progression (TTP). Based on data from previous studies, the median time for TTP is estimated to be 8 months (hazard rate = 0.08664) for the control group and 10.5 months (hazard rate = 0.06601) for the test treatment group. Assume that there is uniform enrollment with an accrual period of 9 months and that the total study duration is expected to be 24 months. The logrank test will be used for the analysis. Sample size calculation will be performed under the assumption of an exponential survival distribution.

Assuming that the median time for TTP for the treatment group is 10.5 months, the classic design requires a sample size of 290 per treatment group in order to achieve an 80% power at the α = 0.025 level


Table 6.8 Constant c in (6.18)

                              K
  b      θ      2       3       4       5       6
  0.25  −0.5    0.9188  0.8914  0.8776  0.8693  0.8638
         0.0    0.8889  0.8521  0.8337  0.8227  0.8154
         0.5    0.8498  0.8015  0.7776  0.7635  0.7540
         1.0    0.8000  0.7385  0.7087  0.6911  0.6795
  0.50  −0.5    0.8498  0.7989  0.7733  0.7578  0.7475
         0.0    0.8000  0.7347  0.7023  0.6830  0.6702
         0.5    0.7388  0.6581  0.6190  0.5960  0.5808
         1.0    0.6667  0.5714  0.5267  0.5009  0.4840
  0.75  −0.5    0.7904  0.7196  0.6840  0.6626  0.6483
         0.0    0.7273  0.6400  0.5972  0.5718  0.5549
         0.5    0.6535  0.5509  0.5022  0.4738  0.4553
         1.0    0.5714  0.4571  0.4052  0.3757  0.3567
  1.00  −0.5    0.7388  0.6512  0.6074  0.5811  0.5635
         0.0    0.6667  0.5625  0.5120  0.4823  0.4627
         0.5    0.5858  0.4683  0.4138  0.3825  0.3622
         1.0    0.5000  0.3750  0.3200  0.2894  0.2699

of significance (based on a one-sided test). Suppose that an adaptive group sequential design is considered in order to improve efficiency and allow some flexibility in the trial. Under an adaptive group sequential design with one interim analysis, a sample size of 175 patients per treatment group is needed for controlling the overall type I error rate at 0.025 and for achieving the desired power of 80%. The interim analysis allows for early stopping for efficacy with stopping boundaries α1 = 0.01, β1 = 0.25, and α2 = 0.0625 from Table 6.7. The maximum sample size allowed for adjustment is nmax = 350. The simulation results are summarized in Table 6.9, where EESP and EFSP stand for early efficacy stopping probability and early futility stopping probability, respectively.

Note that power is the probability of rejecting the null hypothesis. Therefore, when the null hypothesis is true, the power is equal to the type I error rate α. From Table 6.9, it can be seen that the one-sided α is controlled at the 0.025 level as expected. The expected sample sizes under both hypothesis conditions are smaller than the sample size for


Table 6.9 Operating Characteristics of Adaptive Methods

  Median Time
  Test   Control   EESP    EFSP    Expected N   Power (%)
  0      0         0.010   0.750   216          2.5
  10.5   8         0.440   0.067   254          79

Note: 1,000,000 simulation runs.

the classic design (290/group). The power under the alternative hypothesis is 79%.

To demonstrate the calculation of the adjusted p-value, assume that (i) the naive p-values from the logrank test (or other tests) are p1 = 0.1 and p2 = 0.07, and (ii) the trial stops at stage 2. Therefore, t = p2 = 0.07 > α2 = 0.0625, and we fail to reject the null hypothesis of no treatment difference.
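Because the stage-wise p-values are independent U(0,1) under the null hypothesis, the null row of Table 6.9 can be checked by direct simulation. The sketch below ignores the survival-specific details and simulates only the p-values:

```python
import random

random.seed(42)
a1, b1, a2 = 0.01, 0.25, 0.0625   # boundaries from Table 6.7
M = 1_000_000
eesp = efsp = reject = 0
for _ in range(M):
    p1 = random.random()          # under H0, stage p-values are independent U(0,1)
    if p1 <= a1:
        eesp += 1; reject += 1    # early stop for efficacy
    elif p1 > b1:
        efsp += 1                 # early stop for futility
    elif random.random() <= a2:   # otherwise continue; stage-2 p-value decides
        reject += 1
print(eesp / M, efsp / M, reject / M)  # approx. 0.010, 0.750, 0.025
```

The overall rejection probability matches α = α1 + (β1 − α1)α2 = 0.01 + 0.24 × 0.0625 = 0.025, confirming the type I error control seen in Table 6.9.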

Example 6.2 Consider a phase III, randomized, placebo-controlled, parallel-group study for evaluation of the effectiveness of a test treatment in adult asthmatics. Efficacy will be assessed based on the forced expiratory volume in one second (FEV1). The primary endpoint is the change in FEV1 from baseline. Based on data from phase II studies and other resources, the difference in FEV1 change from baseline between the test drug and the control is estimated to be δ = 8.18% with a standard deviation σ of 18.26%. The classic design requires 87 subjects per group for achieving an 85% power at the significance level of α = 0.025 (based on a one-sided test). Alternatively, a 4-stage adaptive group sequential design is considered for allowing comparisons at various stages. The details of the design and the corresponding operating characteristics are given in Table 6.10 and Table 6.11, respectively.

Note that the two sequential designs and the classic design have the same expected sample size under the alternative hypothesis condition. The two sequential designs have an increase in power as compared to the classic design.

6.7 Group Sequential Trial Monitoring

Data monitoring committee

As indicated in Offen (2003), the stopping rule can only serve as a guide for stopping a trial. For clinical trials with a group sequential design, a Data Monitoring Committee (DMC) is usually formed. The DMC will


Table 6.10 Four-Stage Adaptive Design Specifications

                          Stage k
  Design Scenario         1         2         3         4
  GSD 1    αi             0.01317   0.02634   0.03950   0.05267
           βi             0.38817   0.15134   0.08117   0.05267
           N              60        120       170       220
  GSD 2    αi             0.00800   0.01600   0.02400   0.03200
           βi             0.78200   0.28200   0.11533   0.03200
           N              50        95        135       180

Note: For design 1, b = 0.5, θ = 1, and c = 0.5267. For design 2, b = 1.032, θ = 1, and c = 0.32.

usually evaluate all aspects of a clinical trial, including integrity, quality, benefits, and the associated risks, before a decision regarding the termination of the trial can be reached (Ellenberg, Fleming, and DeMets, 2002). In practice, it is recognized that even when all aspects of the conduct of the clinical trial adhere exactly to the conditions stipulated during the design phase, the stopping rule chosen at the design phase may not be used directly, because there are usually complicated factors that must be dealt with before such a decision can be made. In what follows, the rationales for closely monitoring a group sequential trial are briefly outlined.

DMC meetings are typically scheduled based on the availability of its members, which may be different from the schedules as specified

Table 6.11 Operating Characteristics of Various Designs

  Design Scenario      Expected N   Range of N   Power (%)
  Classic   Ho         87           87–87        2.5
            Ha         87                        85
  GSD 1     Ho         85           60–220       2.5
            Ha         87                        94
  GSD 2     Ho         92           50–180       2.5
            Ha         87                        89

Note: 500,000 and 100,000 simulation runs for each Ho and Ha scenario, respectively.


at the design phase. In addition, the enrollment may be different from the assumption made at the design phase. The deviation of the analysis schedule will affect the stopping boundaries. Therefore, the boundaries should be re-calculated based on the actual schedules.

In practice, the variability of the response variable is usually unknown. At interim analysis, an estimate of the variability can be obtained based on the actual data collected for the interim analysis. However, the estimate may be different from the initial guess of the variability at the design phase. The deviation of the variability will certainly affect the stopping boundaries. In this case, it is of interest to study the likelihood of success of the trial based on currently available information such as conditional and predictive power and repeated confidence intervals. Similarly, the estimate of the treatment difference in response could be different from the original expectation at the design phase. This could lead to the use of an adaptive design or sample size re-estimation (Jennison and Turnbull, 2000).

In general, efficacy is not the only factor that will affect the DMC's recommendation regarding stopping or continuing the trial. Safety factors are critical for the DMC to make an appropriate recommendation to stop or continue the trial. The term benefit-risk ratio is probably the most commonly used composite criterion in assisting the decision making. In this respect, it is desirable to know the likelihood of success of the trial based on the conditional power or the predictive power.

In many clinical trials, the company may need to make critical decisions on its spending during the conduct of the trials due to financial constraints. In this situation, the concept of the benefit-risk ratio is also helpful when viewed from the financial perspective. In practice, the conditional and/or predictive powers are tools for decision making. In group sequential trials, the simplest tool commonly used for determining whether to continue or terminate a trial is the use of sequential stopping boundaries. The original methodology for group sequential boundaries required that the number and timing of interim analyses be specified in advance in the study protocol. Whitehead (1983, 1994) introduced another type of stopping boundary method (the Whitehead triangle boundaries), which permits unlimited analyses as the trial progresses and is referred to as a continuous monitoring procedure. In general, group sequential methods allow for symmetric or asymmetric boundaries. Symmetric boundaries demand the same level of evidence to terminate the trial early and claim either lack of a beneficial effect or establishment of a harmful effect. Asymmetric boundaries might allow less evidence for a negative harmful trend before an early termination is recommended. For example, the O'Brien-Fleming sequential boundary might be used for


monitoring beneficial effects, while a Pocock-type sequential boundary could provide guidance for safety monitoring.

A practical and yet complicated method is to use the operating characteristics desired for the design, which typically include the false positive error rate, the power curve, the sample size distribution or information levels, the estimates of treatment effect that would correspond to early stopping, the naive confidence interval, repeated confidence intervals, the curtailment (conditional and predictive powers), and the futility index. Both the conditional power and the predictive power are the likelihood of rejecting the null hypothesis conditional on the current data. The difference is that the conditional power is based on a frequentist approach while the predictive power is based on a Bayesian approach. The futility index is a measure of the likelihood of failure to reject H0 at the kth analysis given that Ha is true. The defining property of a (1 − α)-level sequence of repeated confidence intervals (RCIs) for θ is

Pr{θ ∈ Ik for all k = 1, . . . , K} = 1 − α.

Here each Ik (k = 1, . . . , K) is an interval computed from the information available at analysis k. The calculation of the RCI at analysis k is similar to that of the naive confidence interval, but z1−α is replaced with αk, the stopping boundary on the standard z-statistic. For example, CI = d ± z1−ασ and RCI = d ± αkσ (Jennison and Turnbull, 2000).
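A minimal sketch of this substitution, where d and se denote the interim estimate and its standard error and the first-look boundary value is illustrative (roughly O'Brien-Fleming-like, not taken from the book):

```python
def naive_ci(d, se, z=1.96):
    # Naive (fixed-sample) confidence interval.
    return (d - z * se, d + z * se)

def repeated_ci(d, se, c_k):
    # Repeated confidence interval at analysis k: replace z_{1-alpha}
    # with the group sequential stopping boundary c_k on the z-scale.
    return (d - c_k * se, d + c_k * se)

d, se = 0.12, 0.05   # illustrative interim estimate and standard error
c1 = 2.797           # illustrative first-look boundary (c_1 > z, so the RCI is wider)
print(naive_ci(d, se))
print(repeated_ci(d, se, c1))
```

Because the early-look boundary exceeds z1−α, the RCI is wider than the naive interval, reflecting the repeated looks at the data.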

The conditional power method can be used to assess whether an early trend is sufficiently unfavorable that reversal to a significant positive trend is very unlikely or nearly impossible. The futility index can also be used to monitor a trial. Prematurely terminating a trial with a very small futility index might be inappropriate, and the same holds for continuing a trial with a very high futility index.

Principles for monitoring a sequential trial

Ellenberg, Fleming, and DeMets (2002) shared their experiences on DMCs and provided the following six useful principles for monitoring a sequential trial.

Short-term versus long-term treatment effects When early data from the clinical trial appear to provide compelling evidence for short-term treatment effects, and yet the duration of patient follow-up is insufficient to assess long-term treatment efficacy and safety, early termination may not be warranted and could perhaps even raise ethical issues. The relative importance of short-term and long-term results depends on the clinical setting.

Early termination philosophies Three issues should be addressed. First, what magnitude of estimated treatment difference, and over what


period of time, would be necessary before a beneficial trend would be sufficiently convincing to warrant early termination? Second, should the same level of evidence be required for a negative trend as for a positive trend before recommending early termination? Third, for a trial with no apparent trend, should the study continue to the scheduled termination?

Responding to early beneficial trends Determination of the optimal length of follow-up can be difficult in a clinical trial having an early beneficial trend. Ideally, evaluating the duration of treatment benefit while continuing to assess possible side effects over a longer period of time would provide the maximum information for clinical use. However, for patients with life-threatening diseases such as heart failure, cancer, or advanced HIV/AIDS, strong evidence of substantial short-term therapeutic benefits may be compelling even if it is unknown whether these benefits are sustained over the long term. Under these circumstances, early termination might be justified to take advantage of this important short-term benefit, with some plan for continued follow-up implemented to identify any serious long-term toxicity.

Responding to early unfavorable trends When an unfavorable trend emerges, three criteria should be considered by a DMC as it wrestles with the question of whether trial modification or termination should be recommended: (i) Are the trends sufficiently unfavorable that there is very little chance of establishing a significant beneficial effect by the completion of the trial? (ii) Have the negative trends ruled out the smallest treatment effect of clinical interest? (iii) Are the negative trends sufficiently strong to conclude a harmful effect?

While the conditional power argument only allows a statement of failure to establish benefit, the symmetric or asymmetric boundary approach allows the researchers to rule out beneficial effects for a treatment or, with more extreme results, to establish harm. When early trends are unfavorable in a clinical trial that is properly powered, stochastic curtailment criteria would generally yield monitoring criteria for termination similar to the symmetric lower boundary for lack of benefit. However, in underpowered trials having unfavorable trends, the stochastic curtailment criteria for early termination would generally be satisfied earlier than criteria based on the group sequential lower boundary for lack of benefit. Most trials comparing a new intervention to a standard of care or a control regimen do not set out to establish that the new intervention is inferior to the control. However, some circumstances may in fact lead to such a consideration.

Responding to unexpected safety concerns Statistical methods are least helpful when an unexpected and worrying toxicity profile


begins to emerge. In this situation there can be no pre-specified statistical plan, since the outcome being assessed was unanticipated.

Responding when there are no apparent trends In some trials, no apparent trends of either beneficial or harmful effects emerge as the trial progresses to its planned conclusion. In such instances, a decision must be made as to whether the investment of participant, physician, and fiscal resources, as well as the burden of the trial to patients, remains viable and compelling.

6.8 Conditional Power

Conditional power at a given interim analysis in group sequential trials is defined as the power of rejecting the null hypothesis at the end of the trial conditional on the observed data accumulated up to the time point of the planned interim analysis. For many repeated significance tests, such as Pocock's test, O'Brien and Fleming's test, and Wang and Tsiatis' test, the trial can only be terminated under the alternative hypothesis. In practice, this is usually the case if the test treatment demonstrates substantial evidence of efficacy. However, it should be noted that if the trial indicates strong evidence of futility (lack of efficacy) during the interim analysis, it is unethical to continue the trial. Hence, the trial may also be terminated under the null hypothesis. However, except for the inner wedge test, most repeated significance tests are designed for early stopping under the alternative hypothesis. In such a situation, the analysis of conditional power (or, equivalently, futility analysis) can be used as a quantitative method for determining whether the trial should be terminated prematurely.

Comparing means

Let xij be the observation from the jth subject (j = 1, . . . , ni) in the ith treatment group (i = 1, 2). The xij, j = 1, . . . , ni, are assumed to be independent and identically distributed normal random variables with mean µi and variance σi². At the time of the interim analysis, it is assumed that the first mi of the ni subjects in the ith treatment group have already been observed. The investigator may want to evaluate the power for rejection of the null hypothesis based on the observed data and an appropriate assumption under the alternative hypothesis. More specifically, define

x̄a,i = (1/mi) Σ_{j=1}^{mi} xij and x̄b,i = (1/(ni − mi)) Σ_{j=mi+1}^{ni} xij.


At the end of the trial, the following Z test statistic is calculated:

Z = (x̄1 − x̄2)/√(s1²/n1 + s2²/n2)
  ≈ (x̄1 − x̄2)/√(σ1²/n1 + σ2²/n2)
  = [(m1 x̄a,1 + (n1 − m1)x̄b,1)/n1 − (m2 x̄a,2 + (n2 − m2)x̄b,2)/n2] / √(σ1²/n1 + σ2²/n2).

Under the alternative hypothesis, we assume µ1 > µ2. Hence, the power for rejecting the null hypothesis can be approximated by

1 − β = P(Z > zα/2)
      = P( [(n1 − m1)(x̄b,1 − µ1)/n1 − (n2 − m2)(x̄b,2 − µ2)/n2] / √((n1 − m1)σ1²/n1² + (n2 − m2)σ2²/n2²) > τ )
      = 1 − Φ(τ),

where

τ = [zα/2 √(σ1²/n1 + σ2²/n2) − (µ1 − µ2) − ((m1/n1)(x̄a,1 − µ1) − (m2/n2)(x̄a,2 − µ2))] × [(n1 − m1)σ1²/n1² + (n2 − m2)σ2²/n2²]^{−1/2}.

As can be seen from the above, the conditional power depends not only upon the assumed alternative hypothesis (µ1, µ2) but also upon the observed values (x̄a,1, x̄a,2) and the amount of information that has been accumulated (mi/ni) at the time of the interim analysis.
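The formula for τ translates directly into code. The sketch below is a straightforward implementation under the notation above; as a sanity check, with no interim data (mi = 0) the conditional power reduces to the unconditional power:

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()

def conditional_power(mu1, mu2, xa1, xa2, m1, m2, n1, n2, s1sq, s2sq, z):
    # Conditional power for comparing means: 1 - Phi(tau), with tau as
    # defined above. (mu1, mu2) is the assumed alternative; (xa1, xa2)
    # are the observed interim means of the first m_i subjects.
    num = (z * sqrt(s1sq / n1 + s2sq / n2) - (mu1 - mu2)
           - ((m1 / n1) * (xa1 - mu1) - (m2 / n2) * (xa2 - mu2)))
    den = sqrt((n1 - m1) * s1sq / n1**2 + (n2 - m2) * s2sq / n2**2)
    return 1 - N.cdf(num / den)

# Sanity check: m_i = 0 gives the unconditional power; here delta is
# chosen to yield roughly 85% power at n = 100 per group with sigma^2 = 1.
cp = conditional_power(0.423754, 0.0, 0.0, 0.0, 0, 0, 100, 100, 1.0, 1.0, 1.95996)
print(round(cp, 3))  # ~0.85
```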

Comparing proportions

When the responses are binary, similar formulas can also be obtained. Let xij be the binary response observed from the jth subject (j = 1, . . . , ni) in the ith treatment group (i = 1, 2). Again, the xij, j = 1, . . . , ni, are assumed to be independent and identically distributed binary variables with mean pi. At the time of the interim analysis, it is also assumed that the first mi of the ni subjects in the ith treatment group have been observed. Define

x̄a,i = (1/mi) Σ_{j=1}^{mi} xij and x̄b,i = (1/(ni − mi)) Σ_{j=mi+1}^{ni} xij.


At the end of the trial, the following Z test statistic is calculated:

Z = (x̄1 − x̄2)/√(x̄1(1 − x̄1)/n1 + x̄2(1 − x̄2)/n2)
  ≈ (x̄1 − x̄2)/√(p1(1 − p1)/n1 + p2(1 − p2)/n2)
  = [(m1 x̄a,1 + (n1 − m1)x̄b,1)/n1 − (m2 x̄a,2 + (n2 − m2)x̄b,2)/n2] / √(p1(1 − p1)/n1 + p2(1 − p2)/n2).

Under the alternative hypothesis, we assume p1 > p2. Hence, the power for rejecting the null hypothesis can be approximated by

1 − β = P(Z > zα/2)
      = P( [(n1 − m1)(x̄b,1 − p1)/n1 − (n2 − m2)(x̄b,2 − p2)/n2] / √((n1 − m1)p1(1 − p1)/n1² + (n2 − m2)p2(1 − p2)/n2²) > τ )
      = 1 − Φ(τ),

where

τ = [zα/2 √(p1(1 − p1)/n1 + p2(1 − p2)/n2) − (p1 − p2) − ((m1/n1)(x̄a,1 − p1) − (m2/n2)(x̄a,2 − p2))] × [(n1 − m1)p1(1 − p1)/n1² + (n2 − m2)p2(1 − p2)/n2²]^{−1/2}.

Similarly, the conditional power depends not only upon the assumed alternative hypothesis (p1, p2) but also upon the observed values (x̄a,1, x̄a,2) and the amount of information that has been accumulated (mi/ni) at the time of the interim analysis.
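The binary case reuses the same τ with σi² replaced by pi(1 − pi); a sketch under the notation above:

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()

def conditional_power_props(p1, p2, xa1, xa2, m1, m2, n1, n2, z):
    # Conditional power for comparing proportions: tau from above with
    # sigma_i^2 = p_i (1 - p_i).
    v1, v2 = p1 * (1 - p1), p2 * (1 - p2)
    num = (z * sqrt(v1 / n1 + v2 / n2) - (p1 - p2)
           - ((m1 / n1) * (xa1 - p1) - (m2 / n2) * (xa2 - p2)))
    den = sqrt((n1 - m1) * v1 / n1**2 + (n2 - m2) * v2 / n2**2)
    return 1 - N.cdf(num / den)

# Interim data exactly on the assumed alternative (p1 = 0.6 vs p2 = 0.4,
# 50 of 100 subjects per group observed): the conditional power exceeds
# the unconditional power because half the trial already shows the effect.
print(round(conditional_power_props(0.6, 0.4, 0.6, 0.4, 50, 50, 100, 100, 1.95996), 3))
```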

6.9 Practical Issues

The group sequential procedures for interim analyses are basically in the context of hypothesis testing, which is aimed at pragmatic study objectives, i.e., which treatment is better. However, most new treatments such as cancer drugs are very expensive, very toxic, or both. As a result, only if the degree of benefit provided by the new treatment exceeds some minimum clinically significant requirement will it be considered for the treatment of the intended medical conditions. Therefore, an adequate well-controlled trial should be able to provide not only qualitative evidence of whether the experimental treatment is effective


but also quantitative evidence from the unbiased estimation of the size of the effectiveness or safety over placebo given by the experimental therapy. For a fixed sample design without interim analyses for early termination, it is possible to achieve both qualitative and quantitative goals with respect to the treatment effect. However, with a group sequential procedure, the size of the benefit of the experimental treatment estimated by the maximum likelihood method is usually overestimated because of the choice of stopping rule. Jennison and Turnbull (1990) pointed out that the sample mean might not even be contained in the final confidence interval. As a result, estimation of the size of the treatment effect has received a lot of attention. Various estimation procedures have been proposed, such as the modified maximum likelihood estimator (MLE), the median unbiased estimator (MUE), and the midpoint of the equal-tailed 90% confidence interval. For more details, see Cox (1952), Tsiatis et al. (1984), Kim and DeMets (1987), Kim (1989), Chang and O'Brien (1986), Chang et al. (1989), Chang (1989), Hughes and Pocock (1988), and Pocock and Hughes (1989).

The estimation procedures proposed in the above literature require extensive computation. On the other hand, simulation results (Kim, 1989; Hughes and Pocock, 1988) showed that the alpha spending function corresponding to the O'Brien-Fleming group sequential procedure is very concave and allocates only a very small amount of the total nominal significance level to the early interim analyses; hence, the bias, variance, and mean square error of the point estimator following the O'Brien-Fleming procedure are also the smallest. Current research mainly focuses on estimation of the size of the treatment effect for the primary clinical endpoints on which the group sequential procedure is based. However, there are many other secondary efficacy and safety endpoints to be evaluated in the same trial. The impact of early termination of the trial based on the results from the primary clinical endpoints on the statistical inference for these secondary clinical endpoints is unclear. In addition, group sequential methods and the estimation procedures that follow them have so far concentrated on the population average. On the other hand, inference on variability is sometimes also of vital importance for certain classes of drug products and diseases. Research on estimation of variability following early termination is still lacking. Other areas of interest for interim analyses include clinical trials with more than two treatments and bioequivalence assessment. For group sequential procedures for trials with multiple treatments, see Hughes (1993) and Proschan et al. (1994). For a group sequential bioequivalence testing procedure, see Gould (1995).


CHAPTER 7

Adaptive Sample Size Adjustment

In clinical trials, it is desirable to have a sufficient number of subjects in order to achieve a desired power for correctly detecting a clinically meaningful difference if such a difference truly exists. For this purpose, a pre-study power analysis is often conducted for sample size estimation under certain assumptions, such as the variability associated with the observed response of the primary study endpoint (Chow, Shao, and Wang, 2003). If the true variability is much less than the initial guess, the study may be over-powered. On the other hand, if the variability is much larger than the initial guess, the study may not achieve the desired power; in other words, the results observed from the study may be due to chance alone and may not be reproducible. Thus, it is of interest to adjust the sample size adaptively based on accrued data at interim.

Adaptive sample size adjustment includes planned and unplanned (unexpected) sample size adjustment. Planned sample size adjustment refers to sample size re-estimation at interim analyses in a group sequential clinical trial design or an N-adjustable clinical trial design. Most unplanned sample size adjustments are due to changes made to on-going study protocols and/or unexpected administrative looks based on accrued data at interim. Chapter 2 provides an adjustment factor for sample size as the result of protocol amendments. In this chapter, we will focus on the case of planned sample size adjustment in a group sequential design.

In the next section, statistical procedures for sample size re-estimation without unblinding the treatment codes are introduced. Statistical methods such as Cui, Hung, and Wang's approach, Proschan and Hunsberger's method, the Muller-Schafer method, and Bauer and Kohne's approach for sample size re-estimation with unblinded data in group sequential trial designs are given in Sections 7.2 through 7.5, respectively. Other methods, such as the generalization of independent p-value approaches and the inverse-normal method, are discussed in subsequent sections. Some concluding remarks are given in the last section of this chapter.


7.1 Sample Size Re-estimation without Unblinding Data

In clinical trials, the sample size is determined by a clinically meaningful difference and information on the variability of the primary endpoint. Since the natural history of the distribution is usually not known, or the test treatment under investigation is a new class of drug, the estimate of variability for the primary endpoint used for sample size estimation may not be adequate at the planning stage of the study. As a result, the planned sample size may need to be adjusted during the conduct of the trial if the observed variability of the accumulated response on the primary endpoint is very different from that used at the planning stage. To maintain the integrity of the trial, it is suggested that sample size re-estimation be performed without unblinding the treatment codes if the study is conducted in a double-blind fashion. Procedures have been proposed for adjusting the sample size during the course of the trial without unblinding and without altering the significance level (Gould, 1992, 1995; Gould and Shih, 1992). For simplicity, let us consider a randomized trial with two parallel groups comparing a test treatment and a placebo. Suppose that the response of the primary endpoint is normally distributed. Then the total sample size required to achieve a desired power of 1 − β for a two-sided alternative hypothesis can be obtained using the following formula (see, e.g., Chow, Shao, and Wang, 2003):

N = 4σ²(zα/2 + zβ)² / ∆²,

where ∆ is the difference of clinical importance. In general, σ² (the within-group variance) is unknown and needs to be estimated based on previous studies. Let σ*² be the within-group variance specified for sample size determination at the planning stage of the trial. At the initiation of the trial, we expect the observed variability to be similar to σ*² so that the trial will have sufficient power to detect the difference of clinical importance. However, if the variance turns out to be much larger than σ*², we will need to re-estimate the sample size without breaking the randomization codes. If the true within-group variance is in fact σ′², then the sample size needed to achieve the desired power of 1 − β at the α level of significance for a two-sided alternative is given by

N′ = N σ′² / σ*²,

where N is the planned sample size calculated based on σ*². However, σ′² is usually unknown and must be estimated from the accumulated data available from a total of n of the N patients. One simple approach to estimating σ′² is based on the sample variance calculated from the n responses, which is given by

s² = (1/(n − 1)) Σi Σj (yij − ȳ)²,

where yij is the jth observation in group i and ȳ is the overall sample mean, j = 1, . . . , ni; i = 1 (treatment), 2 (placebo); and n = n1 + n2. If n is large enough for the mean difference between groups to provide a reasonable approximation to ∆, then it follows that σ′² can be estimated by (Gould, 1995)

σ′² = ((n − 1)/(n − 2)) (s² − ∆²/4).

Note that this estimate of the within-group variance σ′² does not require knowledge of the treatment assignments, and hence the blinding of the treatment codes is maintained. However, one disadvantage of this approach is that it depends upon the true mean difference, which is not calculated from the blinded data and remains unknown.
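As a quick illustration, the displayed formulas can be combined into a short blinded re-estimation routine. The sketch below (the function and variable names are ours, not from the text) assumes a normally distributed endpoint and a fixed clinically important difference ∆:

```python
def blinded_sample_size(responses, delta, N_planned, var_planned):
    """Blinded sample size re-estimation (sketch of the simple approach above).

    The pooled variance s^2 of the blinded responses overstates the
    within-group variance by about delta^2/4, so
        sigma'^2 = (n - 1)/(n - 2) * (s^2 - delta^2/4),
    and the adjusted total sample size is N' = N * sigma'^2 / sigma*^2.
    """
    n = len(responses)
    ybar = sum(responses) / n
    s2 = sum((y - ybar) ** 2 for y in responses) / (n - 1)
    sigma2 = (n - 1) / (n - 2) * (s2 - delta ** 2 / 4)
    return N_planned * sigma2 / var_planned
```

For example, if the blinded variance estimate comes out 30% larger than the planning value σ*², the total sample size grows by the same 30%.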

Alternatively, Gould and Shih (1992) and Gould (1995) proposed a procedure based on the concept of the EM algorithm for estimating σ′² without a value for ∆. This procedure is briefly outlined below. Suppose that n observations, say yi, i = 1, . . . , n, on a primary endpoint have been obtained from n patients. The treatment assignments for these patients are unknown. Gould and Shih (1992) and Gould (1995) considered randomly allocating these n observations to either of the two groups, assuming that the treatment assignments are missing at random, by defining the following treatment indicator πi:

πi = 1 if the treatment is the test drug,
πi = 0 if the treatment is placebo.

The E step is to obtain the provisional values of the expectation of πi (i.e., the conditional probability that patient i is assigned to the test drug given yi), which is given by

P(πi = 1|yi) = ( 1 + exp{(µ1 − µ2)(µ1 + µ2 − 2yi)/(2σ²)} )⁻¹,

where µ1 and µ2 are the population means of the test drug and the placebo, respectively. The M step obtains the maximum likelihood estimates of µ1, µ2, and σ after updating the πi by the provisional values obtained from the E step in the log-likelihood function of the interim observations, which (up to an additive constant) is given by

l = −n log σ − (1/(2σ²)) Σ [πi(yi − µ1)² + (1 − πi)(yi − µ2)²].

The E and M steps are iterated until the values converge. Gould and Shih (1992) and Gould (1995) indicated that this procedure estimates the within-group variance quite satisfactorily but fails to provide a reliable estimate of µ1 − µ2. As a result, the sample size can be adjusted without knowledge of the treatment codes. For sample size re-estimation procedures without unblinding treatment codes with respect to binary clinical endpoints, see Gould (1992, 1995). A review of methods for sample size re-estimation can be found in Shih (2001).
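A minimal sketch of such an EM iteration is given below; the starting values and the fixed iteration count are our own choices, and the returned quantity is the within-group variance σ² being estimated:

```python
import math

def em_blinded_variance(y, iters=200):
    """EM estimate of the within-group variance from blinded data,
    a sketch of the Gould-Shih procedure: treatment labels are treated
    as missing at random, with the E step computing
    P(pi_i = 1 | y_i) = 1 / (1 + exp{(mu1-mu2)(mu1+mu2-2y_i)/(2 sigma^2)}).
    """
    mu1, mu2 = max(y), min(y)                  # crude starting values (our choice)
    sigma = (max(y) - min(y)) / 4 or 1.0
    n = len(y)
    for _ in range(iters):
        # E step: provisional probability that subject i received the test drug
        p = [1.0 / (1.0 + math.exp((mu1 - mu2) * (mu1 + mu2 - 2 * yi) / (2 * sigma ** 2)))
             for yi in y]
        # M step: weighted ML updates of mu1, mu2, sigma
        w1 = sum(p)
        mu1 = sum(pi * yi for pi, yi in zip(p, y)) / w1
        mu2 = sum((1 - pi) * yi for pi, yi in zip(p, y)) / (n - w1)
        sigma = math.sqrt(sum(pi * (yi - mu1) ** 2 + (1 - pi) * (yi - mu2) ** 2
                              for pi, yi in zip(p, y)) / n)
    return sigma ** 2
```

Consistent with the remark above, the variance estimate is usually serviceable while the implied estimate of µ1 − µ2 is not reliable.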

7.2 Cui–Hung–Wang’s Method

For a given group sequential trial, let Nk and Tk be the planned sample size and the test statistic at stage k. Thus, we have

Tk = ( √Nk / (√2 σ) ) [ (1/Nk) Σ_{i=1}^{Nk} xi − (1/Nk) Σ_{i=1}^{Nk} yi ].

Let NL and TL denote the planned cumulative sample size from stage 1 to stage L and the weighted test statistic from stage 1 to stage L, respectively. For a group sequential trial without sample size adjustment, the test statistic for the mean difference between two groups at a later stage can be expressed as a weighted combination of the statistics from the sub-samples of the previous stages as follows (see, e.g., Cui, Hung, and Wang, 1999):

T_{L+j} = T_L ( N_L / N_{L+j} )^{1/2} + w_{L+j} [ (N_{L+j} − N_L) / N_{L+j} ]^{1/2},   (7.1)

where

w_{L+j} = Σ_{i=N_L+1}^{N_{L+j}} (xi − yi) / √( 2(N_{L+j} − N_L) ),   (7.2)

in which xi and yi are observations from treatment groups 1 and 2, respectively, and M_L denotes the adjusted cumulative sample size from stage 1 to stage L.

For group sequential trials with sample size adjustments, let M be the total sample size after adjustment and N be the original planned sample size. We may consider adjusting the sample size based on the effect size as follows:

M = ( δ / ∆L )² N,   (7.3)

where δ is the expected difference (effect size) given by

δ = (µ2 − µ1) / σ,


Table 7.1 Effects of Sample Size Adjustment Using the Original Test Statistic and Stopping Boundaries

Time of N change, tL     0.20    0.4     0.6     0.8     ∞
Type I error rate, α     0.038   0.035   0.037   0.033   0.025
Power                    0.84    0.91    0.94    0.96    0.61

Note: δ = 0.03, α = 0.025, and power = 0.9 give N = 250/group; true ∆ = 0.21. Sample size adjustment is based on Equation 7.3.
Source: Cui, Hung, and Wang (1999).

and ∆L is the observed effect size ∆µL/σ at stage L. Based on the adjusted sample sizes, the test statistic T_{L+j} becomes

U_{L+j} = T_L ( N_L / N_{L+j} )^{1/2} + w*_{L+j} [ (N_{L+j} − N_L) / N_{L+j} ]^{1/2},   (7.4)

where

w*_{L+j} = Σ_{i=N_L+1}^{M_{L+j}} (xi − yi) / √( 2(M_{L+j} − N_L) ).   (7.5)

Cui, Hung, and Wang (1999) showed, both mathematically and by means of computer simulation, that using U_{L+j} with the original boundaries of the group sequential trial does not inflate the type I error rate (see Table 7.1 and Table 7.2).
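For concreteness, (7.4)–(7.5) can be computed as below; the function name is ours, and the paired differences are assumed standardized so that σ = 1:

```python
import math

def chw_statistic(T_L, N_L, N_Lj, stage2_diffs):
    """Cui-Hung-Wang weighted statistic U_{L+j} of (7.4)-(7.5).

    T_L          : test statistic at the interim analysis (stage L)
    N_L, N_Lj    : originally planned cumulative sample sizes
    stage2_diffs : x_i - y_i for i = N_L+1, ..., M_{L+j}, where M_{L+j}
                   is the adjusted cumulative sample size
    The second-stage data keep the originally planned weight
    sqrt((N_{L+j} - N_L)/N_{L+j}) even if the sample size was changed.
    """
    M_Lj = N_L + len(stage2_diffs)
    w_star = sum(stage2_diffs) / math.sqrt(2 * (M_Lj - N_L))
    return T_L * math.sqrt(N_L / N_Lj) + w_star * math.sqrt((N_Lj - N_L) / N_Lj)
```

With all second-stage differences equal to zero and T_L = 2.0, N_L = 100, N_{L+j} = 200, the statistic reduces to the first-stage contribution alone, 2√(1/2) ≈ 1.414.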

Example 7.1
Cui, Hung, and Wang (1999) gave the following example. A phase III two-arm trial for evaluating the effect of a new drug for prevention of myocardial infarction in patients undergoing coronary artery bypass graft surgery has a sample size of 600 [300] patients per group to detect a 50% reduction in incidence, from 22% to 11%, with 95% power. At the interim analysis based on data from 600 [300] patients, however, the test group has a 16.5% incidence rate. If this incidence rate is the true rate, the power is about 40%. Using the Cui-Hung-Wang method to increase the sample size to 1400 per group [unrealistically large], the power is 93% based on their simulation with 20,000 replications. (The calculations seem to be incorrect: all the sample sizes are twice as large as they should be.)

Table 7.2 Effects of Sample Size Adjustment Using the New Test Statistic and the Original Stopping Boundaries

Time of N change, tL     0.20    0.4     0.6     0.8     ∞
Type I error rate, α     0.025   0.025   0.025   0.025   0.025
Power                    0.86    0.90    0.92    0.91    0.61

Note: N = 250/group, α = 0.025, true ∆ = 0.21. Sample size adjustment is based on Equation 7.3.
Source: Cui, Hung, and Wang (1999).

Remarks
Cui-Hung-Wang's method has the following advantages. First, the adjustment of the sample size is easy. Second, the same stopping boundaries from the traditional group sequential trial can be used. The disadvantages include (i) the sample size adjustment is somewhat ad hoc and does not aim at a target power; and (ii) weighting outcomes differently for patients from different stages is difficult to explain clinically.

7.3 Proschan–Hunsberger’s Method

For a given two-stage design, Proschan and Hunsberger (1995) and Proschan (2005) considered adjusting the sample size at the second stage based on the evaluation of the conditional power given the data observed at the first stage. We will refer to their method as the Proschan-Hunsberger method. Let Pc(n2, zα|z1, δ) be the conditional probability that Z exceeds zα given that Z1 = z1 and δ = (µx − µy)/σ, based on n = n1 + n2 observations. That is,

Pc(n2, zα|z1, δ) = Pr(Z > zα | Z1 = z1, δ)

= Pr[ ( n1(Ȳ1 − X̄1) + n2(Ȳ2 − X̄2) ) / √(2σ²n) > zα | Z1 = z1, δ ]

= Pr[ ( n2(Ȳ2 − X̄2) − n2δσ ) / √(2n2σ²) > ( zα√(2σ²n) − z1√(2n1σ²) − n2δσ ) / √(2n2σ²) | δ ].

Treating the estimated σ as the true σ, we have

Pc(n2, zα|z1, δ) = 1 − Φ[ ( zα√(2n) − z1√(2n1) − n2δ ) / √(2n2) ].   (7.6)

Since Z1 is normally distributed, the type I error rate of this two-stage process without early stopping is given by

∫_{−∞}^{∞} Pc(n2, zα|z1, 0) φ(z1) dz1 = ∫_{−∞}^{∞} { 1 − Φ[ ( zα√(2n) − z1√(2n1) ) / √(2n2) ] } φ(z1) dz1.


Proschan and Hunsberger (1995) showed that, without adjustment of the rejection region, the type I error rate caused by sample size adjustment could be as high as

αmax = α + 0.25 exp(−zα²/2).
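Plugging in a one-sided α = 0.025 shows how severe this worst case is (the normal quantile is hardcoded below to keep the snippet dependency-free):

```python
import math

# Worst-case type I error when n2 is chosen after seeing z1 but the
# rejection region is left at z_alpha: alpha_max = alpha + 0.25*exp(-z_alpha^2/2).
alpha = 0.025
z_alpha = 1.959964            # Phi^{-1}(1 - 0.025)
alpha_max = alpha + 0.25 * math.exp(-z_alpha ** 2 / 2)
print(round(alpha_max, 4))    # about 0.0616 -- the nominal level more than doubles
```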

In the interest of controlling the type I error rate at the nominal level, we may consider modifying the rejection region from zα to zc such that

∫_{−∞}^{∞} Pc(n2, zc|z1, 0) φ(z1) dz1 = α,   (7.7)

where zc is a function of z1, n1, and n2. Since the integral is the constant α, we wish to find zc such that Pc(n2, zc|z1, 0) depends only upon a function of z1, say A(z1), i.e.,

Pc(n2, zc|z1, 0) = A(z1). (7.8)

From (7.7) and (7.8), we can solve for zc as follows

zc = ( √n1 z1 + √n2 zA ) / √(n1 + n2),   (7.9)

where zA = Φ⁻¹(1 − A(z1)) and A(z1) is any increasing function with range [0, 1] satisfying

∫_{−∞}^{∞} A(z1) φ(z1) dz1 = α.   (7.10)

To find the function A(z1), it may be convenient to write A(z1) in the form A(z1) = f(z1)/φ(z1). Equation (7.9), with A(z1) satisfying (7.10), provides the critical value for rejecting the null hypothesis while protecting the overall α. The next step is to choose the additional sample size n2 at stage 2. At stage 1 we have the empirical estimate δ̂ = (ȳ1 − x̄1)/σ of the standardized treatment difference. We may wish to power the trial to detect a value somewhere between the originally hypothesized difference and the empirical estimate. Whichever target difference δ we use, if we plug zc from (7.9) into (7.6), we obtain the conditional power

Pc(n2, zc|z1, δ) = 1 − Φ( zA − √(n2/2) δ ).   (7.11)

Supposing that the desired power is 1 − β2, we can immediately obtain the required sample size as follows:

n2 = 2(zA + zβ2)² / δ².   (7.12)

Plugging this into (7.9), we have

zc = ( δ√(n1/2) z1 + (zA + zβ2) zA ) / √( n1δ²/2 + (zA + zβ2)² ).   (7.13)


Note that zc is the rejection cutoff in terms of (√n1 z1 + √n2 z2)/√n, the usual z score on all 2n observations. If we use the empirical estimate δ̂, (7.12) and (7.13) become

n2 = n1(zA + zβ2)² / z1²,   (7.14)

and

zc = ( z1² + (zA + zβ2) zA ) / √( z1² + (zA + zβ2)² ).   (7.15)

The fact that the value of β2 can be changed after observing Z1 = z1 underscores the flexibility of the procedure. Note that (7.14) is the sample size formula to achieve power 1 − β2 in a fixed sample test at level A(z1). This interpretation allows us to see the benefit of extending our study relative to starting a new one: if A(z1) < α, we would be better off starting a new study than extending the old one. We can extend the test procedure to a two-stage design with negative or positive early stopping, with the stopping rules at the first stage being

Stop and do not reject H0 if z1 < zcl;
continue to stage 2 if zcl ≤ z1 ≤ zcu;
stop and reject H0 if z1 > zcu.   (7.16)

Then (7.7) should be modified as follows:

α1 + ∫_{zcl}^{zcu} Pc(n2, zc|z1, 0) φ(z1) dz1 = α,   (7.17)

where φ(z1) is the standard normal density function and α1 = ∫_{zcu}^{+∞} φ(z1) dz1.

Define

Pc(z1; zc; δ) =
  0,                  z1 < zcl;
  Pc(n2, zc|z1, δ),   zcl ≤ z1 ≤ zcu;
  1,                  z1 > zcu.   (7.18)

Note that

α1 = ∫_{zcu}^{+∞} φ(z1) dz1 = ∫_{zcu}^{+∞} Pc(z1; zc; 0) φ(z1) dz1.

Then (7.17) can be written as

α = ∫_{−∞}^{∞} Pc(z1; zc; 0) φ(z1) dz1,   (7.19)


Table 7.3 Corresponding Values of zcu for Different zcl

                       α0 = 1 − Φ(zcl)
α        .10    .15    .20    .25    .30    .35    .40    .45    .50
0.025    2.13   2.17   2.19   2.21   2.22   2.23   2.25   2.26   2.27
0.050    1.77   1.82   1.85   1.88   1.89   1.91   1.93   1.94   1.95

Source: Proschan and Hunsberger (1995).

which is in the same format as (7.7). Let Pc(z1; zc; δ) = A(z1), i.e.,

A(z1) =
  0,                  z1 < zcl;
  Pc(n2, zc|z1, δ),   zcl ≤ z1 ≤ zcu;
  1,                  z1 > zcu.   (7.20)

Note that (7.10)–(7.18) are still valid when A(z1) is any increasing function with range [0, 1] of the form

A(z1) =
  0,              z1 < zcl;
  f(z1)/φ(z1),    zcl ≤ z1 ≤ zcu;
  1,              z1 > zcu.   (7.21)

Proschan and Hunsberger (1995) gave the following linear error function:

A(z1) = 1 − Φ(√2 zα − z1).

Given the observed treatment difference δ̂, we can use (7.12) or (7.14) to re-estimate the sample size and substitute (7.21) into (7.13) or (7.15) to determine the rejection cutoff zc. Note that this is a design without early stopping.

Example 7.2
Consider the following circular function:

A(z1) =
  0,                       z1 < zcl;
  1 − Φ(√(zcu² − z1²)),    zcl ≤ z1 ≤ zcu;
  1,                       z1 > zcu.   (7.22)

For each zcl, the corresponding zcu can be calculated by substituting (7.22) into (7.10), and zc can then be calculated from (7.13) or (7.15) with the A(z1) of (7.22). For convenience, zcl and the corresponding zcu are tabulated in Table 7.3 for overall one-sided α = 0.025 and 0.05.
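Under the circular error function (7.22), zA = √(zcu² − z1²) on the continuation region, so (7.14) and (7.15) are easy to evaluate. The sketch below uses our own function name and takes zβ2 as an input (e.g., 1.2816 for 90% conditional power):

```python
import math

def ph_second_stage(z1, z_cu, z_beta2, n1):
    """Second-stage sample size (7.14) and rejection cutoff (7.15) using the
    empirical effect estimate, with z_A = sqrt(z_cu^2 - z1^2) from the
    circular error function of Example 7.2 (requires z_cl <= z1 <= z_cu)."""
    z_A = math.sqrt(z_cu ** 2 - z1 ** 2)
    s = z_A + z_beta2
    n2 = n1 * s ** 2 / z1 ** 2                     # per-group stage-2 sample size
    zc = (z1 ** 2 + s * z_A) / math.sqrt(z1 ** 2 + s ** 2)
    return n2, zc
```

For instance, with z1 = 1.5, zcu = 2.27 (the Table 7.3 entry for α = 0.025, α0 = .50), zβ2 = 1.2816, and n1 = 100, this gives n2 ≈ 396 and zc ≈ 2.20.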


7.4 Muller–Schafer Method

Muller and Schafer (2001) showed how one can make any data-dependent change in an on-going adaptive trial and still preserve the overall type I error. To achieve this, all one needs to do is preserve the conditional type I error of the remaining portion of the trial. Therefore, the Muller-Schafer method is a special conditional-error approach.

7.5 Bauer–Kohne Method

Two-stage design

Bauer-Kohne’s method is based on the fact that under H0 the p-value forthe test of a particular null hypothesis, H0k, in a stochastically indepen-dent sample, which is generally uniformly distributed on [0,1], wherecontinuous test statistics are assumed (Bauer and Kohne, 1994, 1996).Usually, however, one obtains conservative combination tests when un-der H0 the distribution of the p-values is stochastically larger than theuniform distribution.

Moreover, under H0 the distribution of the resultant p-value is stochastically independent of previously measured random variables. Hence, provided H0 is true, data-dependent planning (e.g., a sample size change) does not destroy the convenient property of independently and uniformly distributed p-values, but some care is needed (Liu, Proschan, and Pledger, 2002). The modification considered here implies that data are not allowed to be pooled over the whole trial: data from the stages before and after the adaptive interim analysis have to be looked at separately, and in the analysis, p-values from the partitioned samples have to be used. These are general measures of "deviation" from the respective null hypotheses. There are many ways to test the intersection H0 of two or more individual null hypotheses based on independently and uniformly distributed p-values for the individual tests (Hedges and Olkin, 1985; Sonnemann, 1991). Fisher's criterion, using the product of the p-values, has good properties. One obvious question is whether combination tests could be used that explicitly weight the p-values by the sample sizes. If the sample size itself is open to adaptation, then clearly the answer is no: to derive critical regions for test statistics explicitly containing random sample sizes, one would need to know or pre-specify their distribution, which contradicts the intended flexibility of the general approach.

Table 7.4 Stopping Boundaries α0, α1, and Cα

                          α1 at α0 =
α        Cα        0.3      0.4      0.5      0.6      0.7      1.0
0.1      0.02045   0.0703   0.0618   0.0548   0.0486   0.0429   0.02045
0.05     0.00870   0.0299   0.0263   0.0233   0.0207   0.0183   0.00870
0.025    0.00380   0.0131   0.0115   0.0102   0.0090   0.0080   0.00380

Source: Table 1 of Bauer and Kohne (1994).

Let P1 and P2 be the p-values for the sub-samples obtained from the first stage and the second stage, respectively. Fisher's criterion leads to rejection of H0 at the end of the trial if

P1 P2 ≤ cα = exp(−χ²_{4,1−α}/2),   (7.23)

where χ²_{4,1−α} is the (1 − α)-quantile of the central χ² distribution with 4 degrees of freedom (see Table 7.4). The decision rules at the first stage are:

P1 ≤ α1: stop the trial and reject H0;
P1 > α0: stop the trial and accept H0;
α1 < P1 ≤ α0: continue to the second stage.   (7.24)

For determination of α1 and α0, the overall type I error rate is given by

α1 + ∫_{α1}^{α0} ∫_0^{cα/P1} dP2 dP1 = α1 + cα ln(α0/α1).   (7.25)

Setting this error rate equal to α and using the relationship cα = exp(−χ²_{4,1−α}/2),

we have

α1 + ln(α0/α1) exp(−χ²_{4,1−α}/2) = α.   (7.26)
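Equation (7.26) has no closed form for α1, but the left-hand side is monotone in α1 on (cα, α0), so a few bisection steps suffice. A sketch (the χ² quantile is passed in so the snippet needs no statistics library; the function name is ours):

```python
import math

def bk_alpha1(alpha, alpha0, chi2_quantile):
    """Solve (7.26) for the first-stage efficacy bound alpha1, given the
    overall alpha, the futility bound alpha0, and the (1-alpha)-quantile
    of the chi-square distribution with 4 degrees of freedom."""
    c_alpha = math.exp(-chi2_quantile / 2)
    lo, hi = c_alpha, alpha0          # f(a1) = a1 + c*ln(alpha0/a1) - alpha is
    for _ in range(100):              # increasing on [c_alpha, alpha0]; bisect
        mid = (lo + hi) / 2
        if mid + c_alpha * math.log(alpha0 / mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With α = 0.05, α0 = 0.5, and χ²_{4,0.95} ≈ 9.4877, this returns α1 ≈ 0.0233, matching Table 7.4.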

The decision rules at the final stage are:

P1 P2 ≤ exp(−χ²_{4,1−α}/2): reject H0;
otherwise: accept H0.

Assuming that zi and ni are the standardized mean and the sample size for the sub-sample at stage i, under the condition that n1 = n2, the uniformly most powerful test is given by

(z1 + z2)/√2 ≥ z_{1−α}.

Equivalently,

Φ⁻¹(1 − P1) + Φ⁻¹(1 − P2) ≥ √2 Φ⁻¹(1 − α).


Three-stage design

Let pi (i = 1, 2, 3) be the p-values for the hypothesis tests based on the sub-samples obtained at stages 1, 2, and 3, respectively.

Decision Rules:

Stage 1:

Stop with rejection of H0 if p1 ≤ α1;
stop without rejection of H0 if p1 > α0;
otherwise continue to stage 2.

Stage 2:

Stop with rejection of H0 if p1 p2 ≤ cα2 = exp(−χ²_{4,1−α2}/2);
stop without rejection of H0 if p2 > α0;
otherwise continue to stage 3.

Stage 3:

Stop with rejection of H0 if p1 p2 p3 ≤ dα = exp(−χ²_{6,1−α}/2);
otherwise stop without rejection of H0.

To avoid qualitative interaction between stages and treatments, choose

cα2 = dα/α0.

Then, no values of p3 ≥ α0 can lead to the rejection of H0 because theprocedure would have stopped beforehand. On the other hand, if

α1 ≥ cα2/α0,

then no p2 ≥ α0 can lead to the rejection of H0. Note that

α1 + ∫_{α1}^{α0} ∫_0^{dα/(α0 p1)} dp2 dp1 + ∫_{α1}^{α0} ∫_{dα/(α0 p1)}^{α0} ∫_0^{dα/(p1 p2)} dp3 dp2 dp1

= α1 + (dα/α0)(ln α0 − ln α1) + dα(2 ln α0 − ln dα)(ln α0 − ln α1) + (dα/2)(ln² α0 − ln² α1).

Setting this equal to α, we can then solve for α1 given α and α0. For α = 0.025, dα = 0.000728.
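The value dα = 0.000728 quoted above corresponds to an overall α = 0.025 via dα = exp(−χ²_{6,1−α}/2) and can be checked directly (the χ² quantile is hardcoded to keep the snippet dependency-free):

```python
import math

chi2_6_975 = 14.4494                 # 0.975-quantile of chi-square with 6 df
d_alpha = math.exp(-chi2_6_975 / 2)
print(round(d_alpha, 6))             # about 0.000728, as quoted in the text
```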

7.6 Generalization of Independent p-Value Approaches

General Approach

Table 7.5 Stopping Boundaries

α0    α1       α2
.4    .0265    .0294
.6    .0205    .0209
.8    .0137    .0163

Source: Bauer and Kohne (1995).

Consider a clinical trial with K stages in which a hypothesis test is performed at each stage, followed by some actions that depend on the analysis results. Such actions could be early futility or efficacy stopping, sample size re-estimation, modification of randomization, or other adaptations. The objective of the trial (e.g., testing the efficacy of the experimental drug) can be formulated as a global hypothesis test, which is the intersection of the individual hypothesis tests from the interim analyses:

H0 : H01 ∩ . . . ∩ H0K , (7.27)

where H0k (k = 1, . . . , K) is the null hypothesis tested at the kth interim analysis. Note that the H0k have some restrictions: rejection of any H0k (k = 1, . . . , K) must lead to the same clinical implication (e.g., the drug is efficacious); otherwise the global hypothesis cannot be interpreted. In what follows, H0k will be based on the sub-sample from each stage, with the corresponding test statistic denoted by Tk and p-value denoted by pk. The stopping rules are given by

Stop for efficacy if Tk ≤ αk;
stop for futility if Tk > βk;
continue with adaptations if αk < Tk ≤ βk,   (7.28)

where αk < βk (k = 1, . . . , K − 1) and αK = βK. For convenience, αk and βk are called the efficacy and futility boundaries, respectively.

To reach the kth stage, a trial has to pass stages 1 through k − 1; therefore, the CDF of Tk is given by

ϕk(t) = Pr(Tk < t, α1 < T1 ≤ β1, . . . , αk−1 < Tk−1 ≤ βk−1)
      = ∫_{α1}^{β1} · · · ∫_{αk−1}^{βk−1} ∫_0^t f_{T1···Tk} dtk dtk−1 · · · dt1,   (7.29)

where f_{T1···Tk} is the joint PDF of T1, . . . , Tk. The error rate (α spent) at the kth stage is given by Pr(Tk < αk), that is,

πk = ϕk(αk).   (7.30)


When efficacy is claimed at a certain stage, the trial is stopped. Therefore, the type I errors at different stages are mutually exclusive, and the experiment-wise type I error rate can be written as

α = Σ_{k=1}^{K} πk.   (7.31)

Equation (7.31) is the key to determining the stopping boundaries, as illustrated in the next four sections with two-stage adaptive designs.

There are two different p-values that can be calculated: the unadjusted p-value and the adjusted p-value. Both are measures of the statistical strength of the treatment effect. The unadjusted p-value (pu) is associated with πk when the trial stops at the kth stage, while the adjusted p-value is associated with the overall α. The unadjusted p-value corresponding to an observed t when the trial stops at the kth stage is given by

pu(t; k) = ϕk (t) , (7.32)

where ϕk(t) is obtained from (7.29). The adjusted p-value corresponding to an observed test statistic Tk = t at the kth stage is given by

p(t; k) = Σ_{i=1}^{k−1} πi + pu(t; k),   k = 1, . . . , K.   (7.33)

Note that the adjusted p-value is a measure of the statistical strength for rejecting H0. The later H0 is rejected, the larger the adjusted p-value and the weaker the statistical evidence.

Selection of Test Statistic

Without loss of generality, assume H0k is a test for the efficacy of the experimental drug, which can be written as

H0k : ηk1 ≥ ηk2 vs. Hak : ηk1 < ηk2, (7.34)

where ηk1 and ηk2 are the treatment responses (mean, proportion, or survival) in the two comparison groups at the kth stage. The joint PDF f_{T1···Tk} in (7.29) can be any density function; however, it is desirable to choose Tk such that f_{T1···Tk} has a simple form. Note that when ηk1 = ηk2, the p-value pk from the sub-sample at the kth stage is uniformly distributed on [0,1] under H0. This desirable property can be used to construct test statistics for adaptive designs.

In what follows, two different combinations of the p-values will be studied: (i) the linear combination of p-values (Chang, 2005), and (ii) the product of p-values. The linear combination is given by

Tk = Σ_{i=1}^{k} wki pi,   k = 1, . . . , K,   (7.35)


where wki > 0 and K is the number of analyses planned in the trial. Two interesting cases of (7.35) will be studied: the test based on the individual p-value for the sub-sample obtained at each stage, and the test based on the sum of the p-values from the sub-samples.

The test statistic using the product of p-values is given by

Tk = Π_{i=1}^{k} pi,   k = 1, . . . , K.   (7.36)

This form was proposed by Bauer and Kohne (1994) using Fisher's criterion. Here it is generalized without using Fisher's criterion so that the selection of stopping boundaries is more flexible. Note that pk in (7.35) and (7.36) is the p-value from the sub-sample at the kth stage, while pu(t; k) and p(t; k) in the previous section are the unadjusted and adjusted p-values, respectively, calculated from the test statistic based on the cumulative sample up to the kth stage, where the trial stops.

Test Based on Individual P-values

The test statistic in this method is based on the individual p-values from the different stages. The method is referred to as the method of individual p-values (MIP). Defining the weighting function as wki = 1 if i = k and wki = 0 otherwise, (7.35) becomes

Tk = pk. (7.37)

Due to the independence of pk, fT1...Tk = 1 under H0 and

ϕk(t) = t Π_{i=1}^{k−1} Li,   (7.38)

where Li = βi − αi and, for convenience, the empty product Π_{i=1}^{0}(·) is defined to be 1. The experiment-wise type I error rate is given by

α = Σ_{k=1}^{K} αk Π_{i=1}^{k−1} Li.   (7.39)

Equation (7.39) gives the necessary and sufficient condition for determining the stopping boundaries (αi, βi).

For a two-stage design, (7.39) becomes

α = α1 + α2(β1 − α1).   (7.40)
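(7.40) inverts immediately: α2 = (α − α1)/(β1 − α1). A one-line sketch (the function name is ours):

```python
def mip_alpha2(alpha, alpha1, beta1):
    """Second-stage efficacy bound for MIP, from alpha = alpha1 + alpha2*(beta1 - alpha1)."""
    return (alpha - alpha1) / (beta1 - alpha1)
```

For α = 0.025, α1 = 0.010, and β1 = 0.25, this gives α2 = 0.0625, the value tabulated in Table 7.6.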


Table 7.6 Stopping Boundaries with MIP

             α1
β1     0.000    0.005    0.010    0.015    0.020
0.15   0.1667   0.1379   0.1071   0.0741   0.0385
0.20   0.1250   0.1026   0.0789   0.0541   0.0278
0.25   0.1000   0.0816   0.0625   0.0426   0.0217
0.30   0.0833   0.0678   0.0517   0.0351   0.0179
0.35   0.0714   0.0580   0.0441   0.0299   0.0152
0.40   0.0625   0.0506   0.0385   0.0260   0.0132
0.50   0.0500   0.0404   0.0306   0.0206   0.0104
0.80   0.0312   0.0252   0.0190   0.0127   0.0064
1.00   0.0250   0.0201   0.0152   0.0102   0.0051

Note: Table entries are α2. One-sided α = 0.025.

For convenience, a table of stopping boundaries has been constructed using (7.40) (Table 7.6). The adjusted p-value is given by

p(t; k) =
  t,                  if k = 1,
  α1 + (β1 − α1)t,    if k = 2.
(7.41)
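As a quick numerical check, the two-stage MIP boundary and adjusted p-value can be computed directly from (7.40) and (7.41). This is an illustrative sketch, not the author's SAS program; the function names are ours.

```python
def mip_alpha2(alpha, alpha1, beta1):
    # Solve (7.40), alpha = alpha1 + alpha2 * (beta1 - alpha1), for alpha2
    return (alpha - alpha1) / (beta1 - alpha1)

def mip_adjusted_p(t, k, alpha1, beta1):
    # Adjusted p-value (7.41); t is the stage-wise naive p-value
    return t if k == 1 else alpha1 + (beta1 - alpha1) * t

# Reproduce one Table 7.6 entry: alpha1 = 0.01, beta1 = 0.25 gives alpha2 = 0.0625
print(round(mip_alpha2(0.025, 0.01, 0.25), 4))  # -> 0.0625
```

The same two-line computation reproduces any other cell of Table 7.6.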

Test Based on Sum of P-values

This method is referred to as the method of sum of p-values (MSP). The test statistic in this method is based on the sum of the p-values from the sub-samples. Defining the weights as wki = 1, (7.37) becomes

Tk = Σ_{i=1}^{k} pi, k = 1, . . . , K. (7.42)

For two-stage designs, the α spent at stage 1 and stage 2 are given by

Pr(T1 < α1) = ∫_0^{α1} dt1 = α1, (7.43)

and

Pr(T2 < α2, α1 < T1 ≤ β1) =
  ∫_{α1}^{β1} ∫_{t1}^{α2} dt2 dt1,  for β1 ≤ α2,
  ∫_{α1}^{α2} ∫_{t1}^{α2} dt2 dt1,  for β1 > α2,
(7.44)

respectively. Carrying out the integrations in (7.44) and substituting the results into (7.33), it is immediately obtained that

α =
  α1 + α2(β1 − α1) − (1/2)(β1² − α1²),  for β1 ≤ α2,
  α1 + (1/2)(α2 − α1)²,                 for β1 > α2.
(7.45)


Table 7.7 Stopping Boundaries with MSP

              α1
β1      0.000    0.005    0.010    0.015    0.020
0.05    0.5250   0.4719   0.4050   0.3182   0.2017
0.10    0.3000   0.2630   0.2217   0.1751   0.1225
0.15    0.2417   0.2154   0.1871   0.1566   0.1200
0.20    0.2250   0.2051   0.1832   0.1564   0.1200
>0.25   0.2236   0.2050   0.1832   0.1564   0.1200

Note: Table entries are α2. One-sided α = 0.025.

Various stopping boundaries can be chosen from (7.45). See Table 7.7 for examples of the stopping boundaries.

The adjusted p-value equals t if the trial stops at stage 1 and is obtained by replacing α2 with t in (7.45) if the trial stops at stage 2:

p(t; k) =
  t,                                     k = 1,
  α1 + t(β1 − α1) − (1/2)(β1² − α1²),    k = 2 and t ≥ β1,
  α1 + (1/2)(t − α1)²,                   k = 2 and t < β1,
(7.46)

where t = p1 if the trial stops at stage 1 (k = 1) and t = p1 + p2 if the trial stops at stage 2 (k = 2).
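The MSP boundary equation (7.45) and adjusted p-value (7.46) are easy to evaluate numerically. The sketch below (our own helper functions, assuming a one-sided overall α) reproduces the Table 7.7 entry for α1 = 0.01, β1 = 0.25 and the adjusted p-value 0.0228 computed later in this chapter.

```python
import math

def msp_alpha2(alpha, alpha1, beta1):
    # Solve (7.45) for alpha2; try the beta1 > alpha2 branch first
    a2 = alpha1 + math.sqrt(2.0 * (alpha - alpha1))
    if a2 < beta1:
        return a2
    # beta1 <= alpha2 branch
    return (alpha - alpha1 + (beta1**2 - alpha1**2) / 2.0) / (beta1 - alpha1)

def msp_adjusted_p(t, k, alpha1, beta1):
    # Adjusted p-value (7.46); t = p1 at stage 1, t = p1 + p2 at stage 2
    if k == 1:
        return t
    if t >= beta1:
        return alpha1 + t * (beta1 - alpha1) - (beta1**2 - alpha1**2) / 2.0
    return alpha1 + (t - alpha1)**2 / 2.0

print(round(msp_alpha2(0.025, 0.01, 0.25), 4))        # Table 7.7: 0.1832
print(round(msp_adjusted_p(0.17, 2, 0.01, 0.25), 4))  # -> 0.0228
```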

Test Based on Product of P-values

This method is known as the method of products of p-values (MPP). The test statistic in this method is based on the product of the p-values from the sub-samples. For two-stage designs, (7.37) becomes

Tk = ∏_{i=1}^{k} pi, k = 1, 2. (7.47)

The α spent in the two stages are given by

Pr(T1 < α1) = ∫_0^{α1} dt1 = α1 (7.48)

and

Pr(T2 < α2, α1 < T1 ≤ β1) = ∫_{α1}^{β1} ∫_0^{min(α2, t1)} (1/t1) dt2 dt1. (7.49)


Table 7.8 Stopping Boundaries with MPP

            α1
β1     0.005    0.010    0.015    0.020
0.15   0.0059   0.0055   0.0043   0.0025
0.20   0.0054   0.0050   0.0039   0.0022
0.25   0.0051   0.0047   0.0036   0.0020
0.30   0.0049   0.0044   0.0033   0.0018
0.35   0.0047   0.0042   0.0032   0.0017
0.40   0.0046   0.0041   0.0030   0.0017
0.50   0.0043   0.0038   0.0029   0.0016
0.80   0.0039   0.0034   0.0025   0.0014
1.00   0.0038   0.0033   0.0024   0.0013

Note: Table entries are α2. One-sided α = 0.025.

Equation (7.49) can also be written as

Pr(T2 < α2, α1 < T1 ≤ β1) =
  ∫_{α1}^{β1} ∫_0^{α2} (1/t1) dt2 dt1,                                      for α2 ≤ α1,
  ∫_{α1}^{α2} ∫_0^{t1} (1/t1) dt2 dt1 + ∫_{α2}^{β1} ∫_0^{α2} (1/t1) dt2 dt1, for α2 > α1.
(7.50)

Carrying out the integrations in (7.50) and substituting the results into (7.33), it is immediately obtained that

α =
  α1 + α2 ln(β1/α1),  for α2 ≤ α1,
  α2 + α2 ln(β1/α2),  for α2 > α1.
(7.51)

Note that the stopping boundaries based on Fisher's criterion are special cases of (7.51) with α1 = 0 and β1 = 1, where

α2 = exp[−(1/2)χ²_4(1 − α)], (7.52)

that is, α2 = 0.0038 for α = 0.025. Examples of the stopping boundaries using (7.51) are provided in Table 7.8.
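Because (7.51) is transcendental in α2 when α2 > α1, a simple bisection recovers the boundaries in Table 7.8 as well as the Fisher special case. This is an illustrative sketch with our own function names, not code from the text.

```python
import math

def mpp_alpha2(alpha, alpha1, beta1):
    # Invert (7.51) for alpha2 by bisection; the alpha spent is increasing in alpha2
    def spent(a2):
        if alpha1 > 0 and a2 <= alpha1:
            return alpha1 + a2 * math.log(beta1 / alpha1)
        return a2 + a2 * math.log(beta1 / a2)
    lo, hi = 1e-12, beta1
    for _ in range(200):
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if spent(mid) < alpha else (lo, mid)
    return (lo + hi) / 2.0

print(round(mpp_alpha2(0.025, 0.01, 0.15), 4))  # Table 7.8: 0.0055
print(round(mpp_alpha2(0.025, 0.0, 1.0), 4))    # Fisher's criterion: 0.0038
```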

The adjusted p-value equals t if the trial stops at stage 1 and is obtained by replacing α2 with t in (7.51) if the trial stops at stage 2:

p(t; k) =
  t,                   k = 1,
  α1 + t ln(β1/α1),    k = 2 and t ≤ α1,
  t + t ln(β1/t),      k = 2 and t > α1,
(7.53)


where t = p1 if the trial stops at stage 1 (k = 1) and t = p1p2 if the trial stops at stage 2 (k = 2).

Rules for Sample Size Adjustment

The primary rule for adjustment is based on the ratio of the initial estimate of effect size (E0) to the observed effect size (E), specifically,

N = |E0/E|^a N0, (7.54)

where N is the newly estimated sample size, N0 is the initial sample size, which can be estimated from a classic design, a is a constant, and

E = (ηi2 − ηi1)/σi. (7.55)

Under a large-sample assumption, the common variance for the two treatment groups is given by

σi² =
  σi²,                                                 for a normal endpoint,
  η̄i(1 − η̄i),                                          for a binary endpoint,
  η̄i² [1 − (e^{η̄i T0} − 1)/(T0 η̄i e^{η̄i Ts})]^{−1},     for a survival endpoint,
(7.56)

where η̄i = (ηi1 + ηi2)/2 and the logrank test is assumed to be used for the survival analysis. Note that the standard deviations for proportion and survival endpoints have several versions; there are usually slight differences in sample size or power among the different versions.

The sample size adjustment in (7.54) is subject to the following additional constraints: (i) the new sample size should be smaller than Nmax (due to financial and/or other constraints) and greater than or equal to Nmin (the sample size at the interim analysis); and (ii) if E and E0 have different signs, no adjustment will be made.
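The adjustment rule (7.54) together with constraints (i) and (ii) can be sketched as follows (an illustrative implementation; the variable names and the example numbers are ours):

```python
def adjusted_sample_size(n0, e0, e, a=2.0, n_min=None, n_max=None):
    # Rule (7.54): N = |E0/E|^a * N0, subject to constraints (i) and (ii)
    if e == 0 or e0 * e < 0:        # no observed effect, or opposite signs: no adjustment
        return n0
    n = abs(e0 / e) ** a * n0
    if n_max is not None:
        n = min(n, n_max)           # constraint (i): financial/other cap
    if n_min is not None:
        n = max(n, n_min)           # constraint (i): at least the interim sample size
    return round(n)

# Hypothetical numbers: initial effect size 0.25, observed 0.20, N0 = 175 per group
print(adjusted_sample_size(175, 0.25, 0.20, a=2, n_min=175, n_max=350))  # -> 273
```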

Operating characteristics

The operating characteristics are studied using the following example, which is modified slightly from an oncology trial.

Example 7.3 In a two-arm comparative oncology trial, the primary efficacy endpoint is time to progression (TTP). The median TTP is estimated to be 8 months (hazard rate = 0.08664) for the control group, and 10.5 months (hazard rate = 0.06601) for the test group. Assume a uniform enrollment with an accrual period of 9 months and a total study duration of 24 months. The log-rank test will be used for the analysis.


Table 7.9 Operating Characteristics of Adaptive Methods

Median Time (months)                      Expected   Power (%)
Test    Control    EESP     EFSP         N          MIP/MSP/MPP
0       0          0.010    0.750        216        2.5/2.5/2.5
9.5     8          0.174    0.238        273        42.5/44.3/45.7
10.5    8          0.440    0.067        254        78.6/80.5/82.7
11.5    8          0.703    0.015        219        94.6/95.5/96.7

Note: 1,000,000 simulation runs.

An exponential survival distribution is assumed for the purpose of sample size calculation.

When there is a 10.5-month median time for the test group, the classic design requires a sample size of 290 per group to achieve 80% power at a one-sided level of significance α of 0.025. To increase efficiency, an adaptive design with an interim sample size of 175 patients per group is used. The interim analysis allows for early efficacy stopping with stopping boundaries (from Tables 7.6, 7.7, and 7.8) α1 = 0.01, β1 = 0.25, and α2 = 0.0625 (MIP), 0.1832 (MSP), 0.00466 (MPP). The sample size adjustment is based on the rules described above with a = 2. The maximum sample size allowed for adjustment is Nmax = 350. The simulation results are presented in Table 7.9, where the abbreviations EESP and EFSP stand for early efficacy stopping probability and early futility stopping probability, respectively.

Note that power is the probability of rejecting the null hypothesis. Therefore, when the null hypothesis is true, the power is the type I error rate α. From Table 7.9, it can be seen that the one-sided α is controlled at the 0.025 level as expected for all three methods. The expected sample sizes under all four scenarios are smaller than the sample size for the classic design (290/group). In terms of power, MPP has about 1% more power than MSP, and MSP has about 1% more power than MIP. If the stopping boundaries are changed to α1 = 0.005 and β1 = 0.2, then the power (median TTP = 10.5 months for the test group) will be 76%, 79%, and 82% for MIP, MSP, and MPP, respectively. The detailed results are not presented.

To demonstrate how to calculate the adjusted p-value (e.g., using MSP), assume that the naive p-values from the logrank test (or another test) are p1 = 0.1 and p2 = 0.07, and that the trial stops at stage 2. Therefore, t = p1 + p2 = 0.17 < α2 = 0.1832, and the null hypothesis of no treatment difference is rejected. In fact, the adjusted p-value is 0.0228 (< 0.025), which is obtained from (7.46) using t = 0.17 and α1 = 0.01.
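The claim that the one-sided α is controlled at 0.025 can also be checked with a small Monte Carlo sketch of the two-stage MSP design (our own code, not the author's SAS program; under H0 the stage-wise p-values are taken as independent uniforms):

```python
import random

def msp_reject(p1, p2, alpha1, beta1, alpha2):
    # Two-stage MSP decision rule: True if H0 is rejected
    if p1 <= alpha1:
        return True            # early efficacy stopping
    if p1 > beta1:
        return False           # early futility stopping
    return p1 + p2 <= alpha2   # final analysis on T2 = p1 + p2

random.seed(1)
runs = 200_000
rejections = sum(
    msp_reject(random.random(), random.random(), 0.01, 0.25, 0.1832)
    for _ in range(runs)
)
print(round(rejections / runs, 3))  # close to the nominal 0.025
```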


Remarks

With respect to the accuracy of the proposed methods, a large portion of the stopping boundaries in Tables 7.6 through 7.8 was validated using computer simulations with 10,000,000 runs for each set of boundaries (α1, β1, α2). To conduct an adaptive design using the proposed methods, follow the steps below:

Step 1: If MIP is used for the design, use (7.40) or Table 7.6 to determine the stopping boundaries (α1, β1, α2), and use (7.41) to calculate the adjusted p-value when the trial is finished.

Step 2: If MSP is used for the design, use (7.45) or Table 7.7 to determine the stopping boundaries, and use (7.46) to calculate the adjusted p-value.

Step 3: If MPP is used for the design, use (7.51) or Table 7.8 to determine the stopping boundaries, and use (7.53) to calculate the adjusted p-value.

To study the operating characteristics before selecting an optimal design, simulations must be conducted under various scenarios. A SAS program for the simulations can be obtained from the author. The program has fewer than 50 lines of executable code and is very user-friendly.

7.7 Inverse-Normal Method

Lehmacher and Wassmer (1999) proposed the inverse-normal method. The test statistic that results from the inverse-normal method of combining independent p-values (Hedges and Olkin, 1985) is given by

(1/√k) Σ_{i=1}^{k} Φ^{−1}(1 − pi), (7.57)

where Φ^{−1}(·) denotes the inverse of the standard normal cumulative distribution function. The proposed approach uses the classical group sequential boundaries for the statistics in (7.57). Since the Φ^{−1}(1 − pi), i = 1, 2, . . . , K, are independent and standard normally distributed, the proposed approach maintains α exactly for any (adaptive) choice of sample size.

Example 7.4 Lehmacher and Wassmer gave the following example to demonstrate their method. In a randomized, placebo-controlled, double-blind study involving patients with acne papulopustulosa (Plewig's grade II–III), the effect of treatment with a combination of 1% chloramphenicol (CAS 56-75-7) and 0.5% pale sulfonated shale oil versus the alcoholic


vehicle (placebo) was investigated (Fluhr et al., 1998). After 6 weeks of treatment, the reduction of bacteria from baseline, examined on agar plates (log CFU/cm²; CFU, colony-forming units), in the active group as compared with the placebo group was assessed. The available data were from 24 and 26 patients in the combination drug and placebo groups, respectively. The combination therapy resulted in a highly significant reduction in bacteria as compared with placebo using a two-sided t-test for the changes (p = 0.0008).

They further illustrated the method as follows. Suppose that it was intended to perform a three-stage adaptive Pocock design with α = 0.01, and the first interim analysis was planned after 2 × 12 patients. The two-sided critical bounds for this method are u1 = u2 = u3 = 2.873 (Pocock, 1977). After n1 = 12 patients per group, the test statistic of the t-test is 2.672 with one-sided p-value p1 = 0.0070, resulting from an observed effect size x̄11 − x̄21 = 1.549 and an observed standard deviation s1 = 1.316. The study should be continued since

Φ^{−1}(1 − p1) = 2.460 < 2.873.

The observed effect is fairly near significance. We therefore plan the second interim analysis to be conducted after observing the next 2 × 6 patients, i.e., the second interim analysis will be performed after fewer patients than the first. The t-test statistic of the second stage is equal to 1.853 with one-sided p-value p2 = 0.0468 (x̄12 − x̄22 = 1.580, standard deviation of the second stage s2 = 1.472). The test statistic becomes

(1/√2)[Φ^{−1}(1 − p1) + Φ^{−1}(1 − p2)] = 2.925 > 2.873,

yielding a significant result after the second stage of the trial. The corresponding approximate 99% repeated confidence intervals (RCIs) are (0.12, 3.21) and (0.21, 2.92) for the first and second stages, respectively.
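The arithmetic in this example can be verified in a few lines (a sketch using Python's standard library; small discrepancies arise because the reported p-values are rounded):

```python
from math import sqrt
from statistics import NormalDist

inv = NormalDist().inv_cdf   # standard normal quantile function, Phi^{-1}

p1, p2 = 0.0070, 0.0468
z1 = inv(1 - p1)                            # stage-1 statistic, about 2.46
z2 = (inv(1 - p1) + inv(1 - p2)) / sqrt(2)  # combined statistic, about 2.92

print(round(z1, 2), round(z2, 2), z2 > 2.873)
```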

7.8 Concluding Remarks

One of the purposes of adaptive sample size adjustment based on data accrued at interim in clinical trials is not only to achieve statistical significance with a desired power for detecting a clinically significant difference, but also to have the option of stopping the trial early for safety or futility/benefit. To maintain the integrity of the trial, it is strongly recommended that an independent data monitoring committee (DMC) be established to perform (safety) data monitoring and interim analyses for efficacy/benefit, regardless of whether the review or analysis is blinded or unblinded. Based on sample size re-assessment, the DMC can then recommend one of the following: (i) continue the trial


with no changes; (ii) decrease/increase the sample size for achieving statistical significance with a desired power; (iii) stop the trial early for safety, futility, or benefit; or (iv) modify the study protocol. In a recent review article, Montori et al. (2005) indicated that there has been a significantly increasing trend in the percentage of randomized clinical trials that are stopped early for benefit in the past decade. Pocock (2005), however, criticized that the majority of the trials stopped early for benefit do not have the correct infrastructure/system in place, and hence the decisions/recommendations for stopping early may not be appropriate. Chow (2005) suggested that the so-called reproducibility probability be carefully evaluated if a trial is to be stopped early for benefit, in addition to an observed small p-value.

It should be noted that current methods for adaptive sample size adjustment are mostly developed in the interest of controlling the overall type I error rate at the nominal level. These methods, however, are conditional and may not be feasible for reflecting current best medical practice. As indicated in Chapter 2, the use of adaptive design methods in clinical trials could result in a moving target patient population after protocol amendments. As a result, the sample size required in order to have an accurate and reliable statistical inference on the moving target patient population necessarily has to be adjusted unconditionally. In practice, it is then suggested that current statistical methods for group sequential designs be modified to incorporate the randomness of the target patient population over time. In other words, sample sizes should be adjusted adaptively to account for random boundaries at different stages.


CHAPTER 8

Adaptive Seamless Phase II/III Designs

A seamless phase II/III trial design is a program that addresses within a single trial the objectives that are normally achieved through separate trials in phases IIb and III (Gallo et al., 2006; Chang et al., 2006). An adaptive seamless phase II/III design is a combination of phase II and phase III, which aims at achieving the primary objectives normally achieved through the conduct of separate phase II and phase III trials, and would use data from patients enrolled before and after the adaptation in the final analysis (Maca et al., 2006). In a seamless design, there is usually a so-called learning phase that serves the same purpose as a traditional phase II trial, and a confirmatory phase that serves the same objective as a traditional phase III study. Compared to traditional designs, a seamless design can not only reduce the sample size but also shorten the time to bring a positive drug candidate to the marketplace. In this chapter, for illustration purposes, we will discuss different seamless designs and their utilities through real examples. We will also discuss the issues that are commonly encountered, followed by some recommendations for assuring the validity and integrity of a clinical trial with a seamless design.

In the next section, the efficiency of an adaptive seamless design is discussed. Section 8.2 introduces step-wise tests with respect to some adaptive procedures employed in a seamless trial design. Statistical methods based on the concepts of contrast tests and naive p-values are described in Section 8.3. Section 8.4 compares several seamless designs. Drop-the-losers adaptive designs are discussed in Section 8.5. A brief summary is given in the last section of this chapter.

8.1 Why a Seamless Design Is Efficient

The use of a seamless design enjoys the following advantages. There are opportunities for savings (i) when a drug is not working, through early stopping for futility, and (ii) when a drug has a dramatic effect, through early stopping for efficacy. A seamless design is efficient because there is no lead time between the learning and confirmatory phases. In addition,


the data collected in the learning phase are combined with the data obtained in the confirmatory phase for the final analysis.

The most notable difference between an adaptive seamless phase II/III design and the traditional approach with separate phase II and phase III trials is the control of the overall type I error rate (α) and the corresponding power for correctly detecting a clinically meaningful difference. In the traditional approach, the actual α is equal to αII αIII, where αII and αIII are the type I error rates controlled at phase II and phase III, respectively. If two phase III trials are required, then

α = αII αIII αIII.

In a seamless adaptive phase II/III design, the actual α = αIII; if two phase III studies are required, then α = αIII αIII. Thus, the α for a seamless design is actually 1/αII times larger than that of the traditional design. Similarly, we can "steal" power (permissibly) by using a seamless design. Here power refers to the probability of correctly detecting a true, not hypothetical, treatment difference. In a classic design, the actual power is given by

power = powerII × powerIII,

while in a seamless adaptive phase II/III design, the actual power is

power = powerIII,

which is 1/powerII times larger than that of the traditional design.

In practice, not much power is gained by combining the data from different phases. Table 7.9 provides a summary of the results (power) using individual p-values from the sub-sample of each stage (MIP), the sum of the p-values from different stages (MSP), and Fisher's combination of the p-values from the sub-samples of each stage (MPP) under a two-group parallel design. The power differences between the methods are small.
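The α and power relations above are simple products. For illustration, with hypothetical phase-level values (the numbers below are ours, not from the text):

```python
# Hypothetical phase-level operating characteristics
alpha_II, alpha_III = 0.10, 0.025
power_II, power_III = 0.80, 0.90

# Traditional program: alpha and power multiply across phases
trad_alpha = alpha_II * alpha_III
trad_power = power_II * power_III

# Seamless phase II/III design: only the confirmatory values apply
seamless_alpha = alpha_III
seamless_power = power_III

print(round(trad_alpha, 4), round(trad_power, 2))   # 0.0025 0.72
print(seamless_alpha, seamless_power)               # 0.025 0.9
print(round(seamless_alpha / trad_alpha, 1))        # 1/alpha_II = 10.0
```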

8.2 Step-Wise Test and Adaptive Procedures

Consider a clinical trial with K stages. At each stage, a hypothesis test is performed, followed by some actions that are dependent on the analysis results. In practice, three possible actions are often considered: (i) an early futility stopping, (ii) an early stopping due to efficacy, or (iii) dropping the losers (i.e., inferior arms, including pre-determined sub-populations, are dropped after the review of analysis results at each stage). In practice, the objective of the trial (e.g., testing for efficacy of the experimental drug) can be formulated as a global hypothesis testing problem. In other words, the null hypothesis


ADAPTIVE SEAMLESS PHASE II/III DESIGNS 163

is the intersection of the individual null hypotheses at each stage, that is,

H0 : H01 ∩ . . . ∩ H0K, (8.1)

where H0k (k = 1, . . . , K) is the null hypothesis at the kth stage. Note that the H0k have some restrictions; that is, the rejection of any H0k (k = 1, . . . , K) will lead to the common clinical implication of the efficacy of the study medication. Otherwise, the global hypothesis cannot be interpreted. The following step-wise test procedure is often employed. Let Tk be the test statistic associated with H0k. Note that for convenience's sake, we will only consider one-sided tests throughout this chapter.

Stopping Rules

To test (8.1), we first need to specify the stopping rules at each stage. The stopping rules considered here are given by

  Stop for efficacy          if Tk ≤ αk,
  Stop for futility          if Tk > βk,
  Drop losers and continue   if αk < Tk ≤ βk,
(8.2)

where αk < βk (k = 1, . . . , K − 1), and αK = βK. For convenience's sake, we denote αk and βk as the efficacy and futility boundaries, respectively. The test statistic is defined as

Tk = ∏_{i=1}^{k} pi, k = 1, . . . , K, (8.3)

where pi is the naive p-value for testing H0i based on the sub-sample collected at the ith stage. It is assumed that pk is uniformly distributed under the null hypothesis H0k.
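The stopping rules (8.2) applied to the product statistic (8.3) can be written as a simple loop (an illustrative sketch; the boundary values in the example are the ones used later in Section 8.4):

```python
def stagewise_test(p_values, alphas, betas):
    # Apply stopping rules (8.2) to T_k = p_1 * ... * p_k from (8.3);
    # alphas/betas are the efficacy/futility boundaries, with alphas[-1] == betas[-1]
    t = 1.0
    for k, (p, a, b) in enumerate(zip(p_values, alphas, betas), start=1):
        t *= p
        if t <= a:
            return ("stop for efficacy", k)
        if t > b:
            return ("stop for futility", k)
    return ("continue / drop losers", len(p_values))

# Two-stage example with alpha1 = 0.01, beta1 = 1, alpha2 = beta2 = 0.0033
print(stagewise_test([0.05, 0.03], [0.01, 0.0033], [1.0, 0.0033]))
# -> ('stop for efficacy', 2), since T2 = 0.0015 <= 0.0033
```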

8.3 Contrast Test and Naive p-Value

In clinical trials with M arms, the following general one-sided contrast hypothesis testing problem is often considered:

H0 : L(u) ≤ 0 vs. Ha : L(u) = ε > 0, (8.4)

where

L(u) = Σ_{i=1}^{M} ci ui


is a linear combination of the ui, i = 1, . . . , M, in which the ci are contrast coefficients satisfying

Σ_{i=1}^{M} ci = 0,

and ε is a pre-specified constant. In practice, ui can be the mean, proportion, or hazard rate for the ith group, depending on the study endpoint. Under the null hypothesis of (8.4), a contrast test can be obtained as follows:

Z = L(û) / √var(L(û)), (8.5)

where û = (ûi) is an unbiased estimator of u with

ûi = (1/ni) Σ_{j=1}^{ni} xij.

Let

ε = E(L(û)), v² = var(L(û)), (8.6)

where a homogeneous variance under H0 and Ha is assumed. It is also assumed that the ûi, i = 1, . . . , M, are mutually independent. Without loss of generality, assume that ci ui > 0 is an indication of efficacy. Then, for a superiority design, if the null hypothesis H0 given in (8.4) is rejected for some ci satisfying Σ_{i=1}^{M} ci = 0, then there is a difference among the ui, i = 1, . . . , M.

Let û be the mean for a normal endpoint (the proportion for a binary endpoint, or the MLE of the hazard rate for a survival endpoint). Then, by the Central Limit Theorem, the asymptotic distributions of the test statistic and the pivotal statistic can be obtained as follows:

Z = L(û|H0)/v ∼ N(0, 1), (8.7)

and

Z = L(û|Ha)/v ∼ N(ε/v, 1), (8.8)

where

v² = Σ_{i=1}^{M} ci² var(ûi) = σ² Σ_{i=1}^{M} ci²/ni = θ²/n, (8.9)

and

θ² = σ² Σ_{i=1}^{M} ci²/fi, (8.10)


in which ni is the sample size for the ith arm, fi = ni/n is the size fraction, n = Σ_{i=1}^{M} ni, and σ² is the variance of the response under H0. Thus, the power is given by

power = Φ0( (ε√n − θ z1−α) / θ ), (8.11)

where Φ0 is the cumulative distribution function of N(0, 1). Thus, the sample size required for achieving a desired power can be obtained as

n = (z1−α + z1−β)² θ² / ε². (8.12)

It can be seen from the above that (8.11) and (8.12) include the commonly employed one-arm and two-arm superiority and non-inferiority designs as special cases. For example, for a one-arm design, c1 = 1, and for a two-arm design, c1 = −1 and c2 = 1. Note that a minimal sample size is required when the response and the contrasts have the same shape under a balanced design. If, however, an inappropriate set of contrasts is used, the sample size could be several times larger than that of the optimal design.
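Formulas (8.10) and (8.12) are straightforward to evaluate. The sketch below checks the two-arm special case (c1 = −1, c2 = 1, balanced), where (8.12) reduces to the classic two-sample formula; the function name and the numeric example are ours:

```python
from statistics import NormalDist

def contrast_sample_size(c, f, sigma, epsilon, alpha=0.025, beta=0.20):
    # Total n from (8.12), with theta^2 from (8.10): theta^2 = sigma^2 * sum(c_i^2 / f_i)
    z = NormalDist().inv_cdf
    theta2 = sigma**2 * sum(ci**2 / fi for ci, fi in zip(c, f))
    return (z(1 - alpha) + z(1 - beta))**2 * theta2 / epsilon**2

# Two-arm balanced design, sigma = 10, true difference epsilon = 2.5
n_total = contrast_sample_size([-1.0, 1.0], [0.5, 0.5], sigma=10.0, epsilon=2.5)
print(round(n_total / 2))  # per-group n; the classic formula gives about 251
```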

When the shape of the ci is similar to the shape of the ui, the test statistic is a most powerful test. However, the ui are usually unknown. In this case, we may consider the adaptive contrasts cik = ûi,k−1 at the kth stage, where ûi,k−1 is the observed response at the previous (i.e., the (k − 1)th) stage. Thus, we have

Zk = Σ_{i=1}^{M} (ûi,k−1 − ūk−1) ûik, (8.13)

which is, conditional on the data at the (k − 1)th stage, normally distributed. It can be shown that the correlation between Zk−1 and Zk is zero, since the conditional p-value of Zk is independent of Zk−1 (or pk−1). It follows that Z1 and Z2 are independent.

8.4 Comparison of Seamless Designs

There are many possible seamless adaptive designs. It is helpful to classify these designs into the following four categories according to their design features and adaptations: (i) a fixed number of regimens, which includes stopping early for futility, biomarker-informed stopping early for futility, and stopping early for futility/efficacy with sample size re-estimation; (ii) a flexible number of regimens, which includes a flexible number of regimens, adaptive hypotheses, and response-adaptive randomization; (iii) population adaptation, where the number of patient groups can be changed from the learning phase to the confirmatory


phase; and (iv) combinations of (ii) and (iii). For seamless adaptive designs in category (iii), patient groups are often correlated, such as the entire patient population and a sub-population with certain genomic markers. When all patient groups are mutually independent, categories (iii) and (iv) are statistically equivalent. Chang (2005) discussed various designs in category (i) and proposed the use of a Bayesian biomarker-adaptive design in a phase II/III clinical program.

In this section, we will focus on seamless adaptive designs with a flexible number of regimens. We will compare four different seamless adaptive designs with normal endpoints. Each design has five treatment groups, including a control group, in the learning phase. Since there are multiple arms, contrast tests are considered for detecting a treatment difference under the null hypothesis

H0 : Σ_{i=1}^{5} ci ui ≤ 0,

where ci is the contrast for the ith group, which has an expected response of ui. The test statistic is T = Σ_{i=1}^{5} ci ûi. The four seamless designs considered here include (i) a five-arm group sequential design; (ii) an adaptive hypotheses design, where the contrasts ci change dynamically according to the shape of the response (ui) for achieving the most power; (iii) a drop-the-losers design, where inferior groups (losers) are dropped, but two groups and the control are kept in the confirmatory phase; and (iv) a keep-the-winner design, where only the best group and the control are kept in the confirmatory phase. Since the maximum power is achieved for a balanced design when the shape of the contrasts is consistent with the shape of the response (Stewart and Ruberg, 2000; Chang and Chow, 2006; Chang, Chow, and Pong, 2006), in the adaptive hypotheses approach the contrasts in the confirmatory phase are re-shaped based on the observed responses in the learning phase. Three different response and contrast shapes are presented (see Table 8.1). The powers of the adaptive designs are summarized in Table 8.2, where the generalized Fisher combination method (Chang, 2005) with efficacy and futility stopping boundaries of α1 = 0.01, β1 = 1, and α2 = 0.0033 is used.

Table 8.1 Response and Contrast Shapes

Shape       u1    u2    u3    u4    u5    c1      c2      c3     c4      c5
Monotonic   1.0   2.0   3.5   4.0   4.5   −1.9    −0.9    0.1    1.1     1.6
Convex      1.0   1.0   4.0   1.0   3.0   −1.0    −1.0    2.0    −1.0    1.0
Step        1.0   3.4   3.4   3.4   3.4   −1.92   0.48    0.48   0.48    0.48


Table 8.2 Power (%) of Contrast Test

                               Contrast
Response     Design         Monotonic   Wave   Step
Monotonic    Sequential     96.5        27.1   71.0
             Adaptive       83.4        50.0   70.0
             Drop-losers    94.6        72.7   87.9
             Keep-winner    97.0        84.9   94.2
Wave         Sequential     26.5        95.8   23.3
             Adaptive       49.5        82.1   48.0
             Drop-losers    56.1        85.6   54.6
             Keep-winner    67.7        88.7   66.6
Step         Sequential     42.6        14.6   72.4
             Adaptive       41.0        26.4   54.6
             Drop-losers    64.9        49.6   78.9
             Keep-winner    77.5        64.6   87.9

Note: σ = 10, one-sided α = 0.025, interim n = 64/group. Expected total n = 640 under the alternative hypothesis for all the designs. For simplicity, assume the best arm is correctly predetermined at the interim analysis for the drop-losers and keep-winner designs.

It can be seen that the keep-the-winner design is very robust across different response and contrast shapes. It can also be seen from Tables 8.1 and 8.2 that when the contrast shape is consistent with the response shape, the test gives the most power regardless of the type of design. When the response shape is unknown, an adaptive design (possibly drop-the-losers) is the most powerful design. Note that the examples provided here control the type I error under the global null hypothesis.

8.5 Drop-the-Loser Adaptive Design

In pharmaceutical research and development, it is desirable to shorten the time for the conduct of clinical trials, data analysis, submission, and the regulatory review and approval process in order to bring the drug product to the marketplace as early as possible. Thus, any designs that can help to achieve these goals are very attractive. As a result, various strategies have been proposed. For example, Gould (1992) and Zucker et al. (1999) considered using blinded data from the pilot phase to design a clinical trial and then combining the data for the final analysis. Proschan and Hunsberger (1995) proposed a strategy which allows re-adjustment of


the sample size based upon unblinded data so that a clinical trial is properly powered. Other designs have been proposed to combine studies conducted at different phases of traditional clinical development into a single study. For example, Bauer and Kieser (1999) proposed a two-stage design which enables investigators to terminate the trial entirely or drop a subset of the regimens for lack of efficacy at the end of the first stage. Their procedure is highly flexible, and the distributional assumptions are kept to a minimum. Bauer and Kieser's method has the advantage of allowing hypothesis testing at the end of the confirmation stage. However, it is difficult to construct confidence intervals. Brannath, Koenig, and Bauer (2003) examined adjusted and repeated confidence intervals in group sequential designs where the responses are normally distributed. Such confidence intervals are important for interpreting the clinical significance of the results.

In practice, drop-the-losers adaptive designs are useful in combining phase II and phase III clinical trials into a single trial. In this section, we will introduce the application of drop-the-losers adaptive designs under the assumption of normal distributions. The concept, however, can be similarly applied to other distributions. The drop-the-losers adaptive design consists of two stages with a simple data-based decision made at the interim. In the early phase of the trial, the investigators may administer K experimental treatments (say τ1, . . . , τK) to n subjects per treatment. A control treatment, τ0, is also administered during this phase. Unblinded data on patient responses are collected at the end of the first stage. The best treatment group (based on the observed mean) and the control group are retained, and the other treatment groups are dropped at the second stage.
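The two-stage selection step can be sketched in simulation. The following is a minimal illustration, not the authors' code; the number of arms, per-arm sample size, and effect sizes are hypothetical, with arm 0 assumed truly best.

```python
import random

random.seed(2007)

K, n = 4, 500                      # K experimental arms, n subjects per arm (hypothetical)
arm_means = [0.5, 0.0, 0.0, 0.0]   # hypothetical true means; arm 0 is best; sigma = 1
control_mean = 0.0

# Stage 1: observe responses on all K experimental arms plus the control.
stage1 = [[random.gauss(m, 1.0) for _ in range(n)] for m in arm_means]
control1 = [random.gauss(control_mean, 1.0) for _ in range(n)]

# Interim decision: retain the arm with the largest observed mean; drop the losers.
best = max(range(K), key=lambda a: sum(stage1[a]) / n)

# Stage 2: only the selected arm and the control continue enrolling.
stage2 = [random.gauss(arm_means[best], 1.0) for _ in range(n)]
control2 = [random.gauss(control_mean, 1.0) for _ in range(n)]

pooled_best = stage1[best] + stage2   # winner's data from both stages
pooled_ctrl = control1 + control2
diff = sum(pooled_best) / len(pooled_best) - sum(pooled_ctrl) / len(pooled_ctrl)
print(best, round(diff, 2))
```

Note that naively pooling the winner's data across stages overstates the effect because of selection at the interim; this is precisely why the conditional inference of Cohen and Sackrowitz (1989) and Sampson and Sill (2005) is needed.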

Cohen and Sackrowitz (1989) provided an unbiased estimate using data from both stages. Cohen and Sackrowitz considered using a conditional distribution on a specific event to construct conditionally unbiased estimators. Following a similar idea, the test statistic and confidence intervals can be derived. The specific conditional distribution used in the final analysis depends on the outcomes from the first stage. Provided that the set of possible first-stage events on which one conditions forms a partition of the sample space, the conditional assertions also hold unconditionally. In our setting, conditional level-α tests are unconditional level-α tests. In other words, if all of the null hypotheses are true for all of the treatments, the probability of rejecting a null hypothesis is never greater than α, regardless of which treatment is selected (Sampson and Sill, 2005).

Sampson and Sill assume the following ordered outcome after the first stage,

Q = {X : X1 > X2 > · · · > Xk},


ADAPTIVE SEAMLESS PHASE II/III DESIGNS 169

so that τ1 continues into the second stage (other outcomes would be handled equivalently by relabeling). Typically, at the end of the trial, we want to make inferences on the mean µ1 of treatment τ1 and compare it with the control (e.g., test H0 : µ1 − µ0 ≤ ∆10 or construct a confidence interval about µ1 − µ0). Therefore, we can construct uniformly most powerful (UMP) unbiased tests for hypotheses concerning ∆1 based upon the conditional distribution of W given (X*, T). To see this, note that the test is based on the conditional event Q. Lehmann (1983) gives a general proof for this family, which shows that conditioning on sufficient statistics associated with nuisance parameters causes them to drop from the distribution. In addition, the theorem states that the use of this conditional distribution to draw inferences about a parameter of interest yields hypothesis testing procedures which are uniformly most powerful unbiased unconditionally (i.e., UMPU before conditioning on the sufficient statistics). The test statistic is

W = [n0(nA + nB) / ((n0 + nA + nB)σ²)] (Z − Y0),

which has the conditional density

W ∼ fQ(W | ∆1, X*, T) = (CN/D) exp{ −(W − G∆1)² / (2G) },

where

Z = (nA XM + nB Y) / (nA + nB),

XM = max(X1, . . . , Xk) = the maximum observed treatment mean at the first stage,

G = n0(nA + nB) / ((n0 + nA + nB)σ²),

T = [n0 Y0 + (nA + nB) Z] / (n0 + nA + nB),

D = Φ[ √(nA(nA − nB)) (σ²W + (nA + nB)(T − X2)) / (nA(nA + nB)σ) ],

and CN is the normalization constant, which involves integrating W over the real line.

In order to test

H0 : µ1 − µ0 ≤ ∆10 versus Ha : µ1 − µ0 > ∆10,

we consider the following function:

FQ(W | ∆10, X*, T) = ∫_{−∞}^{W} fQ(t | ∆10, X*, T) dt.


We can use this function to obtain a critical value, Wu, through FQ(Wu | ∆10, X*, T) = 1 − α. Note that a 100(1 − α)% confidence interval, [∆L, ∆U], can be constructed from

FQ(Wobs | ∆L, X*, T) = 1 − α/2

and

FQ(Wobs | ∆U, X*, T) = α/2,

because the conditional distribution of W given (X*, T) has a monotone likelihood ratio. Computer programs are available at

http://www.gog.org/sdcstaff/mikesill

or

http://sphhp.buffalo.edu/biostat/.

To illustrate the calculation, Sampson and Sill (2005) simulated a data set from a design where k = 7, µ1 = · · · = µ7 = 0, nA = nB = 100, n0 = 200, and σ = 10. The experimental data have been relabeled in decreasing order. Suppose we want to test

H0 : µ1 − µ0 ≤ 0 versus H1 : µ1 − µ0 > 0.

The simulated data are given as follows:

X1 = 1.8881, X2 = 0.9216, X3 = 0.0691, X4 = −0.3793,
X5 = −0.3918, X6 = −0.8945, X7 = −0.9276,
Y = 0.7888, Y0 = −0.4956.

Based on these data, Wobs = 1.83. Thus, the rejection region is given by (2.13, ∞). Hence, we fail to reject H0. The 95% confidence interval is given by (−0.742, 3.611).
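The observed statistic can be reproduced directly from the definitions above. A minimal sketch follows; the critical value 2.13 and the confidence limits require the conditional density fQ itself and are not recomputed here.

```python
# Summary data from Sampson and Sill's simulated example
nA = nB = 100
n0 = 200
sigma = 10.0
X_M = 1.8881      # largest first-stage mean (the arm carried forward)
Ybar = 0.7888     # second-stage mean of the selected arm
Y0 = -0.4956      # control-group mean

Z = (nA * X_M + nB * Ybar) / (nA + nB)              # pooled mean of the selected arm
G = n0 * (nA + nB) / ((n0 + nA + nB) * sigma ** 2)  # scaling constant
W = G * (Z - Y0)                                    # test statistic
T = (n0 * Y0 + (nA + nB) * Z) / (n0 + nA + nB)      # sufficient statistic for the nuisance mean

print(round(W, 2))   # 1.83, matching W_obs in the text
```

With these sample sizes G = 1, so W is simply the difference between the selected arm's pooled mean and the control mean.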


8.6 Summary

The motivation behind the use of an adaptive seamless design is probably the possibility of shortening the development time of a new medication. As indicated earlier, an adaptive seamless phase II/III design is not only flexible but also efficient as compared to separate phase II and phase III studies. However, the benefits and drawbacks of implementing an adaptive seamless phase II/III design must be carefully weighed against each other. In practice, not all clinical developments may be candidates for such a design. Maca et al. (2006) proposed a list of criteria for determining the feasibility of the use of an adaptive seamless design in a clinical development plan. These criteria include endpoints and enrollment, clinical development time, and logistical considerations, which are briefly outlined below.

One of the most important feasibility considerations for an adaptive seamless design is the amount of time that a patient needs in order to reach the endpoint which will be used for dose escalation. If the endpoint duration is too long, the design could result in unacceptable inefficiencies. In this case, a surrogate marker with a much shorter duration might be used. Thus, Maca et al. (2006) suggested that well-established and understood endpoints (or surrogate markers) be considered when implementing an adaptive seamless design in clinical development. It should be noted, however, that if the goal of a phase II program is to learn about the primary endpoint to be carried forward into phase III, an adaptive seamless design would not be feasible. As the purpose of an adaptive seamless design is to shorten the development time, whether the design would achieve the study objectives within a reduced time frame is another important factor for feasibility consideration, especially when the adaptive seamless trial is the only pivotal trial required for regulatory submission. In the case where there are two pivotal trials, whether the second seamless trial can shorten the overall development time should be taken into feasibility consideration as well. Logistical considerations include drug supply and drug packaging. It is suggested that development programs which do not have costly or complicated drug regimens would be better suited to adaptive seamless designs.

Although adaptive seamless phase II/III designs are efficient and flexible as compared to traditional separate phase II and phase III studies, the potential impact on statistical inference and p-values after adaptations are made should be carefully evaluated. It should be noted that although more adaptations allow higher flexibility, they could result in a much more complicated statistical analysis at the end of the trial.


CHAPTER 9

Adaptive Treatment Switching

For the evaluation of the efficacy of a test treatment for a progressive disease such as cancer or HIV, a parallel-group active-control randomized clinical trial is often conducted. Under this study design, qualified patients are randomly assigned to receive either an active control (a standard therapy or a treatment currently available in the marketplace) or the test treatment under investigation. Patients are allowed to switch from one treatment to another due to ethical considerations such as lack of response or evidence of disease progression. In practice, it is not uncommon that up to 80% of patients may switch from one treatment to another. This certainly has an impact on the evaluation of the efficacy of the test treatment. Despite allowing a switch between the two treatments, many clinical studies compare the test treatment with the active control agent as if no patients had ever switched. Sommer and Zeger (1991) referred to the treatment effect among patients who complied with treatment as biological efficacy. In practice, the survival time of a patient who switched from the active control to the test treatment might be on average longer than his/her survival time would have been had he/she adhered to the original treatment (either the active control or the test treatment), if switching is based on prognosis to optimally assign patients' treatments over time. We refer to the difference caused by treatment switch as the switching effect. The purpose of this chapter is to discuss some models for treatment switch with switching effect and methods for statistical inference under these models.

The remainder of this chapter is organized as follows. In the next section, the latent event times model under the parametric setting is described. In Section 9.2, the concept of latent hazard rate is considered by incorporating the switching effect into the latent hazard functions. Statistical inferences are also derived in this section using Cox's regression with some additional covariates and parameters. A simulation study is carried out in Section 9.3 to examine the performance of the method as compared to two other methods, one ignoring the data from patients who switched and the other including the switching data but ignoring the switching effect. A mixed exponential model is considered to assess the


total survival time in Section 9.4. Some concluding remarks are given in the last section of this chapter.

9.1 Latent Event Times

Suppose that patients are randomly assigned to two treatment groups: a test treatment and an active control. Consider first the case where there is no treatment switch and the study objective is to compare the efficacy of the two treatments. Let T1, . . . , Tn be independent non-negative survival times and C1, . . . , Cn be independent non-negative censoring times that are independent of the survival times. Thus, the observations are Yi = min(Ti, Ci) and δi, where δi = 1 if Ti ≤ Ci and δi = 0 if Ti > Ci. Assume that the test treatment acts multiplicatively on a patient's survival time, i.e., an accelerated failure time model applies. Denote the magnitude of this multiplicative effect by e^{−β}, where β is an unknown parameter. Assume further that the survival time distribution under the active control has a parametric form Fθ(t), where θ is an unknown parameter vector and Fθ(t) is a known distribution when θ is known. Let ki be the treatment indicator for the ith patient, i.e., ki = 1 for the test treatment and ki = 0 for the active control. Then, the distribution of the survival time is given by

P(Ti ≤ t) = Fθ(e^{βki} t), t > 0. (9.1)

If Fθ has a density fθ, then the density of Ti is given by e^{βki} fθ(e^{βki} t), t > 0.

Consider now the situation where patients may switch their treatments and the study objective is to compare the biological efficacy. Let Si > 0 denote the ith patient's switching time. Branson and Whitehead (2002) introduced the concept of latent event time in the simple case where only patients in the control group may switch. We define the latent event time in the general case as follows. For a patient with no treatment switch, the latent event time is the same as his/her survival time. For patient i who switches at time Si, the latent event time Ti is an abstract quantity defined to be the patient's survival time that would have been observed if this patient had not switched the treatment. For patients who switch from the active control group to the test treatment group, Branson and Whitehead (2002) suggested the following model conditional on Si:

Ti =_d Si + e^{β} (Ti − Si), (9.2)

where =_d denotes equality in distribution. That is, the survival time for a patient who switched from the active control to the test treatment can be back-transformed to the survival time that would have been observed if the patient had not switched. For the case where patients may switch


from either group, model (9.2) can be modified as follows:

Ti =_d Si + e^{β(1−2ki)} (Ti − Si), (9.3)

where ki is the indicator for the original treatment assignment, not for the treatment after switching.

Model (9.2) or (9.3), however, does not take into account the fact that treatment switch is typically based on prognosis and/or the investigator's judgment. For example, a patient in one group may switch to another because he/she does not respond to the originally assigned treatment. This may result in a somewhat optimal treatment assignment for the patient and a survival time longer than those of patients who did not switch. Ignoring such a switching effect will lead to a biased assessment of the treatment effect. Shao, Chang, and Chow (2005) considered the following model conditional on Si:

Ti =_d Si + e^{β(1−2ki)} w_{k,η}(Si) (Ti − Si), (9.4)

where η is an unknown parameter vector and the w_{k,η}(S) are known functions of the switching time S when η and k are given. Typically, w_{k,η}(S) should be close to 1 when S is near 0, i.e., the switching effect is negligible if switching occurs very early. Note that

lim_{S↓0} w_{k,η}(S) = 1.

An example is

w_{k,η}(S) = exp(η_{k,0} S + η_{k,1} S²),

where the η_{k,l} are unknown parameters.

Under models (9.1) and (9.4), the distributions of the survival times for patients who switched treatments are given by (conditional on Si)

P(Ti ≤ t) = P(Ti ≤ Si + e^{β(1−2ki)} w_{k,η}(Si)(t − Si))
          = Fθ(e^{βki}[Si + e^{β(1−2ki)} w_{k,η}(Si)(t − Si)])
          = Fθ(e^{βki} Si + e^{β(1−ki)} w_{k,η}(Si)(t − Si))

for ki = 0, 1. The distributions for patients who never switch are

Fθ(e^{βki} t), ki = 0, 1.

Assume that Fθ has a density fθ. For convenience, we denote Si = ∞ for a patient i who never switches. Then, the conditional likelihood


function given the Si is

L(θ, β, η) = ∏_{i: Si=∞} [e^{βki} fθ(e^{βki} Yi)]^{δi} [1 − Fθ(e^{βki} Yi)]^{1−δi}
           × ∏_{i: Si<∞} [e^{β(1−ki)} w_{ki}(Si) fθ(e^{βki} Si + e^{β(1−ki)} w_{ki}(Si)(Yi − Si))]^{δi}
           × [1 − Fθ(e^{βki} Si + e^{β(1−ki)} w_{ki}(Si)(Yi − Si))]^{1−δi}.

Let γ = (θ, β, η). The parameter vector can be estimated by solving the following likelihood equation:

∂ log L(γ)/∂γ = 0. (9.5)

Under some regularity conditions, the estimate of γ is asymptotically normal with mean vector 0 and covariance matrix

[E ∂² log L(γ)/(∂γ ∂γ′)]⁻¹ Var[∂ log L(γ)/∂γ] [E ∂² log L(γ)/(∂γ ∂γ′)]⁻¹, (9.6)

which can be estimated by substituting γ with its estimate. Statistical inference can then be carried out based on these asymptotic results.

Branson and Whitehead (2002) proposed an iterative parameter estimation (IPE) method for the statistical analysis of data with treatment switch. The idea of the method is to relate the distributions of the survival times of the two treatments under a parametric model. Thus, under model (9.2), IPE can be described as follows. First, an initial estimate β̂ of β is obtained. Then, latent event times are estimated as

T̂i = Si + e^{β̂} (Ti − Si)

for patients who switched their treatments. Next, a new estimate of β is obtained by using the estimated latent event times as if they were the observed data. Finally, this procedure is iterated until the estimate of β converges.

Note that although a similar IPE method can be applied under model (9.4), it is not recommended for the following reason. If initial estimates of the model parameters are obtained by solving the likelihood equation given in (9.5), then iteration does not increase the efficiency of the estimates and hence adds unnecessary computational complexity. On the other hand, if the initial estimates are not solutions of the likelihood equation given in (9.5), then they are typically not efficient, and the estimates obtained by IPE (if they converge) may not be as efficient as the solutions of the likelihood equation (9.5). Thus, directly solving


the likelihood equation (9.5) produces estimates that are either more efficient or computationally simpler than the IPE estimates.

9.2 Proportional Hazard Model with Latent Hazard Rate

The above parametric approach for latent event times is useful. However, statistical analysis under such a parametric model may not be robust against model mis-specifications. For survival data in clinical trials, we may alternatively consider Cox's proportional hazard model, which is a semi-parametric model.

Let F(t) be the distribution of the survival time and f(t) be its corresponding density. Then, the hazard rate at time t is defined as

λ(t) = f(t)/[1 − F(t)].

Cox's proportional hazard model is then given by

λ_{ki}(t) = λ0(t) e^{βki}, (9.7)

where ki is the treatment indicator and λ0(t) is left unspecified. In a more general setting, we can replace ki in (9.7) by a covariate vector associated with the ith patient and β by a parameter vector. Under model (9.7), if there is no treatment switch, an estimator of β can be obtained by maximizing the following partial likelihood function:

L(β) = ∏_i ( e^{βki} / Σ_{j∈Ri} e^{βkj} )^{δi}, (9.8)

where Ri is the set of patients who are alive and under observation just before time Ti. When there is treatment switch but the switching effect is ignored (i.e., patients switch treatments at random), model (9.7) can be modified by replacing ki with the time-dependent covariate ki(t) defined as

ki(t) = { 1 − ki,  t ≥ Si,
        { ki,      t < Si,

where Si is the switching time for the ith patient and 0 ≤ t < ∞. Note that, by definition, Si = ∞ if the ith patient never switched. This reduces to a special case of the proportional hazard model with time-dependent covariates (see, e.g., Kalbfleisch and Prentice, 1980; Cox and Oakes, 1984).
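In software, a time-dependent covariate such as ki(t) is typically represented by splitting each patient's follow-up into (start, stop, covariate] intervals. A small sketch, with a helper name of our own choosing:

```python
import math

def covariate_intervals(k, S, Y):
    """Split follow-up (0, Y] into intervals carrying the covariate k_i(t):
    the original assignment k before the switch time S, and 1 - k afterwards.
    S = inf encodes a patient who never switched."""
    if S >= Y:                      # never switched during observed follow-up
        return [(0.0, Y, k)]
    return [(0.0, S, k), (S, Y, 1 - k)]

# Control patient (k = 0) who switched at month 5 and was followed to month 12:
print(covariate_intervals(0, 5.0, 12.0))       # [(0.0, 5.0, 0), (5.0, 12.0, 1)]
# Test patient (k = 1) who never switched:
print(covariate_intervals(1, math.inf, 10.0))  # [(0.0, 10.0, 1)]
```

This long format is what standard Cox software with time-varying covariates consumes, with one row per interval.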

Consider the case where the switching effect w_{k,η}(Si), in which η is an unknown parameter vector, may depend on prognosis and/or the investigator's assessment. Instead of including the switching effect in the latent event times model (9.4), we include it in the proportional hazard


model as follows:

λ_{ki}(t) = λ0(t) e^{βki(t)} w_{k,η}(t, Si), (9.9)

where

w_{k,η}(t, Si) = { w_{k,η}(Si),  t ≥ Si,
                 { 1,            t < Si.

We refer to this model as the latent hazard rate model since λ_{ki}(t) in (9.9) corresponds to a latent event time and hence can be treated as a latent hazard rate. Under the latent hazard rate model (9.9), the partial likelihood is given by

L(β, η) = ∏_i ( e^{βki(Ti)} w_{ki,η}(Ti, Si) / Σ_{j∈Ri} e^{βkj(Ti)} w_{kj,η}(Ti, Sj) )^{δi}. (9.10)

Estimators of β and η can be obtained by solving

∂ log L(γ)/∂γ = 0,

where γ = (β, η). Under some regularity conditions, these estimators are asymptotically normal with mean vector 0 and covariance matrix as given in (9.6). Based on these asymptotic results, statistical inference can be carried out.

If log w_{k,η}(s) is linear in η, such as

w_{k,η}(s) = e^{η_{k,0} s + η_{k,1} s²},

then model (9.9) is another special case of the proportional hazard model with time-dependent covariates, since the switching effect term can be written as

w_{k,η}(t, Si) = e^{η_{ki,0} Si(t) + η_{ki,1} Si²(t)}, (9.11)

where

Si(t) = { Si,  t ≥ Si,
        { 0,   t < Si,

which can be treated as another time-dependent covariate. That is, model (9.9) is the proportional hazard model with time-dependent covariates ki(t) and the additional covariate Si(t). Thus, the parameter vector is given by

γ = (β, η_{0,0}, η_{0,1}, η_{1,0}, η_{1,1}),


and it can be estimated by solving

Σ_i δi ( Zii − Σ_{j∈Ri} Zij e^{γ′Zij} / Σ_{j∈Ri} e^{γ′Zij} ) = 0,

where

Zij = ( kj(Ti), (1 − kj) Sj(Ti), (1 − kj) Sj²(Ti), kj Sj(Ti), kj Sj²(Ti) ).

The resulting estimator, denoted by γ̂, is asymptotically normal with mean 0 and a covariance matrix that can be estimated by the sandwich estimator

B⁻¹AB⁻¹,

where

A = Σ_i δi ( Zii − Σ_{j∈Ri} Zij e^{γ̂′Zij} / Σ_{j∈Ri} e^{γ̂′Zij} ) ( Zii − Σ_{j∈Ri} Zij e^{γ̂′Zij} / Σ_{j∈Ri} e^{γ̂′Zij} )′

and

B = Σ_i δi ( Σ_{j∈Ri} Zij e^{γ̂′Zij} / Σ_{j∈Ri} e^{γ̂′Zij} ) ( Σ_{j∈Ri} Zij e^{γ̂′Zij} / Σ_{j∈Ri} e^{γ̂′Zij} )′ − Σ_i δi Σ_{j∈Ri} Zij Zij′ e^{γ̂′Zij} / Σ_{j∈Ri} e^{γ̂′Zij}.

The above results can easily be extended to the case where there are other time-independent and/or time-dependent covariates in model (9.9). The latent event times approach described in the previous section coincides with the latent hazard rate model when the survival time distribution is exponential, i.e.,

F(t) = 1 − e^{−t/θ}

with an unknown parameter θ > 0. However, for other types of survival time distributions, the two approaches are different. If we apply the latent event times model (9.4) under the semi-parametric approach in which Cox's proportional hazard model is used for data without treatment switch, then the resulting latent hazard model, in which log λ_{ki}(t)/λ0(t) depends on t, is rather complicated. Statistical inference under such models could be very difficult.

9.2.1 Simulation results

In this section, we study the finite-sample performance of the Cox proportional hazard model with switching effect through simulation. Consider a clinical trial comparing a test treatment with an active


control agent. Suppose 300 patients per treatment group are planned. The survival times were generated according to the exponential distribution with hazard rate 0.0693 for the active control group (mean = 14.43 months) and 0.0462 for the treatment group (mean = 21.65 months). For both treatment groups, the random censoring time was generated according to the uniform distribution over the range of 15–20 months. This results in a censoring percentage of 24.6% for the active control group and 34.6% for the treatment group. The patient treatment switch time was generated according to the exponential distribution with a mean of 7.22 months for the active control group and a mean of 10.82 months for the treatment group. The switching rate is about 67% for both the active control group and the treatment group. This choice of switching rate is within the 60–80% range observed in practice for patients who switch from one treatment to another.

For each combination of parameters (β and η_{k,j}), 1000 simulation runs were performed. The results are summarized in Table 9.1. As can be seen from Table 9.1, the estimators based on the method of latent hazard rate perform well. The relative bias is within 3% in all cases. The performance of the estimator of β (the treatment effect) is generally better than that of the estimators of the η's (switching effects) in terms of both relative bias and the coefficient of variation. We note that the estimated standard deviation has very little bias. Furthermore, the coverage probability of the asymptotic confidence interval is close to the nominal level of 95%.

In the simulation, two other methods were also considered for the purpose of comparison. One method is to estimate β under model (9.7) with w_{k,η}(t; Si) ≡ 1 (i.e., ignoring the switching effect), which is clearly biased. The other method is to estimate β under model (9.7) based on data from patients who adhered to their originally randomized treatments (i.e., ignoring data from patients who switched). Simulation results indicated

Table 9.1 Simulation Results Based on 1000 Simulation Runs

                            Proposed Method                       Other*    Other†
Parameter               β       η0,0    η0,1    η1,0    η1,1      β         β
True value             −0.406   0.100   0.009   0.080   0.010    −0.406    −0.406
Mean of estimates      −0.396   0.098   0.009   0.082   0.010    −0.393     0.033
SD of estimates         0.128   0.052   0.004   0.049   0.004     0.147     0.094
Mean of estimated SD    0.129   0.053   0.004   0.049   0.004     0.148     0.093
Coverage probability    0.951   0.951   0.949   0.952   0.956     0.951     0.003

*The method that ignores data from patients who switched.
†The method that ignores the switching effect, i.e., uses the model with w_{ki,η}(t; Si) ≡ 1.


that this method has little bias in estimating β. However, the estimator from this method has a larger standard deviation (less efficiency). The efficiency gain from using the data of patients who switched is about 15% under the switching rate of 67%. This gain is not as large as one might hope because the proposed method estimates four additional parameters η_{k,j}. However, a 15% gain in efficiency in terms of standard deviation amounts to approximately a 24% reduction in sample size. For example, suppose that a sample size of 100 is required in the case of no switching. If the switching rate is 67% and we ignore data from patients who switched, then we need a sample size of approximately 300 in order to retain the efficiency. On the other hand, with the same switching rate and the proposed method, the sample size required to retain the same efficiency is 228.
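The sample-size arithmetic follows from the fact that the required n scales with the variance of the estimator. Using the standard deviations of β̂ reported in Table 9.1:

```python
sd_proposed = 0.128   # SD of beta-hat, proposed method (Table 9.1)
sd_ignore = 0.147     # SD of beta-hat when switchers are discarded (Table 9.1)

gain = sd_ignore / sd_proposed - 1                       # efficiency gain in SD terms
n_ignore = 300                                           # n needed when discarding switchers
n_proposed = n_ignore * (sd_proposed / sd_ignore) ** 2   # n scales with the variance

print(round(gain, 3), round(n_proposed))   # 0.148 227
```

The computed gain is about 15%, and the implied sample size is about 227, close to the 228 quoted in the text (the small discrepancy presumably reflects rounding in the reported standard deviations).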

9.3 Mixed Exponential Model

As indicated earlier, treatment switch is a common and flexible medical practice in cancer trials due to ethical considerations. Treatment switch is, in fact, a response-adaptive switch. Due to the treatment switch, however, the treatment effect can only be partially observed, and the effects of different treatments are difficult to separate from each other. In this case, the commonly used exponential models with a single parameter may not be appropriate. Alternatively, a mixed exponential model (MEM) with multiple parameters is more flexible and hence suitable for a wide range of applications (see, e.g., Mendenhall and Hader, 1958; Susarda and Pathala, 1965; Johnson et al., 1994).

In clinical trials, the target patient population often consists of two or more subgroups based on heterogeneous baseline characteristics (e.g., the patients could be a mixture of second-line and third-line oncology patients). The median survival time of the third-line patients is usually shorter than that of the second-line patients. If the survival times of the two subgroup populations are modeled by exponential distributions with hazard rates λ1 and λ2, respectively, then the survival distribution of the total population in the trial is a mixed exponential distribution with probability density function

P1 λ1 e^{−λ1 t} + P2 λ2 e^{−λ2 t}, t > 0,

where t is the survival time and P1 and P2 (fixed or random) are the proportions of the two sub-populations. Following an idea similar to that of Mendenhall and Hader (1958), the maximum likelihood estimates of the parameters λi and Pi can be obtained. In clinical trials, a patient's treatment is often switched in the middle of the study because there is


a biomarker, such as disease progression, indicating the failure of the initial treatment regimen. If the test drug is more effective than the control, then the majority of patients in the control group will switch to the test drug. In this case, the response or survival difference between the two treatment groups will be dramatically reduced in comparison to the case without treatment switching. Moreover, if the test drug is much more effective in treating a patient after disease progression than before disease progression, ignoring the switching effect could lead to the erroneous conclusion that the test drug is inferior to the control when in fact it is not. This biomarker-based treatment switch is obviously not a random switch but a response-adaptive treatment switch. In what follows, we focus on the application of the mixed exponential model to a clinical trial with biomarker response-adaptive treatment switching (Chang, 2005).

9.3.1 Biomarker-based survival model

In cancer trials, there are often signs/symptoms (or, more generally, biomarkers) that indicate the state of the disease and the ineffectiveness or failure of a treatment. A cancer patient often experiences several episodes of progressed disease before death. Therefore, it is natural to construct a survival model based on the disease mechanism. In what follows, we consider a mixed exponential model, which is derived from the more general mixed Gamma model. Let τi be the time from the (i − 1)th disease progression to the ith disease progression, where i = 1, . . . , n. The τi are assumed to be mutually independent with probability density functions fi(τi). The survival time t for a subject can then be written as

t = Σ_{i=1}^{n} τi. (9.13)

Note that the nth disease progression is death. The following lemma regarding the distribution of a linear combination of two random variables is useful.

Lemma. Let x and y have joint density f(x, y), and define z = ax + by with a > 0. Then the probability density function of z is given by

fz(z) = (1/a) ∫_{−∞}^{∞} f((z − by)/a, y) dy. (9.14)

Proof.

Fz(z) = P(Z ≤ z) = ∫∫_{ax+by≤z} f(x, y) dx dy = ∫_{−∞}^{∞} ∫_{−∞}^{(z−by)/a} f(x, y) dx dy. (9.15)

Page 200: Adaptive Design Methods in Clinical Trailsvct.qums.ac.ir/portal/file/?184695/Adaptive-Design-Methods-in-Clinical-Trials.pdfof adaptive design and analysis in clinical research and

ADAPTIVE TREATMENT SWITCHING 183

Taking the derivative with respect to z and exchanging the order of the two limit processes, (9.14) is immediately obtained.

Corollary. When x and y are independent,

fz(z) = (1/a) ∫_{−∞}^{∞} fx((z − by)/a) fy(y) dy. (9.16)

Theorem. If n independent random variables τi, i = 1, . . . , n, are exponentially distributed with parameters λi, i.e.,

τi ∼ fi(τi) = λi e^{−λi τi}, τi ≥ 0,

where λi ≠ λk for i ≠ k, then the probability density function of the random variable t = Σ_{i=1}^{n} τi is given by

f(t; n) = Σ_{i=1}^{n} λi e^{−λi t} / ∏_{k=1, k≠i}^{n} (1 − λi/λk), t > 0. (9.17)

Proof. By mathematical induction. When n = 2, the Lemma (9.14) gives (for λ_1 ≠ λ_2)

f(t; 2) = λ_1 λ_2 ∫_0^t exp(−λ_1 t − (λ_2 − λ_1)τ_2) dτ_2 = λ_1 e^{−λ_1 t}/(1 − λ_1/λ_2) + λ_2 e^{−λ_2 t}/(1 − λ_2/λ_1).

Therefore, (9.17) is proved for n = 2. Now assume (9.17) holds for some n ≥ 2; it will be proven that (9.17) also holds for n + 1. From (9.17) and Corollary (9.16), we have

f(t; n + 1) = ∫_0^t f(t − τ_{n+1}; n) f_{n+1}(τ_{n+1}) dτ_{n+1}

= ∫_0^t ∑_{i=1}^{n} [ λ_i e^{−λ_i (t − τ_{n+1})} / ∏_{k=1, k≠i}^{n} (1 − λ_i/λ_k) ] λ_{n+1} e^{−λ_{n+1} τ_{n+1}} dτ_{n+1}

= ∑_{i=1}^{n} [ 1 / ∏_{k=1, k≠i}^{n} (1 − λ_i/λ_k) ] [ λ_i e^{−λ_i t}/(1 − λ_i/λ_{n+1}) + λ_{n+1} e^{−λ_{n+1} t}/(1 − λ_{n+1}/λ_i) ]

= ∑_{i=1}^{n+1} λ_i e^{−λ_i t} / ∏_{k=1, k≠i}^{n+1} (1 − λ_i/λ_k).

This completes the proof.

For the exponential distributions f_i(τ_i) = λ_i e^{−λ_i τ_i}, by the above theorem, the probability density function of t, which is a mixed Gamma distribution, is given by

f(t; n) = ∑_{i=1}^{n} w_i λ_i e^{−λ_i t},  t > 0,    (9.18)

where λ_i ≠ λ_k for i ≠ k and the weights w_i are given in (9.20) below.

For disease progression, it is usually true that λ_i > λ_k for i > k. Note that f(t; n) does not depend on the order of the λ_i in the sequence, and

lim_{λ_n → +∞} f(t; n) = f(t; n − 1).

The survival function S(t) can be easily obtained from (9.18) by integration and is given by

S(t; n) = ∑_{i=1}^{n} w_i e^{−λ_i t},  t > 0, n ≥ 1,    (9.19)

where the weight is given by

w_i = [ ∏_{k=1, k≠i}^{n} (1 − λ_i/λ_k) ]^{−1}.    (9.20)

The mean survival time and its variance are given by

µ = ∑_{i=1}^{n} w_i/λ_i  and  σ² = 2 ∑_{i=1}^{n} w_i/λ_i² − µ²,    (9.21)

respectively. When n = 1, w_1 = 1 and (9.19) reduces to the exponential distribution. It can be shown that the weights have the properties ∑_{i=1}^{n} w_i = 1 and, for n ≥ 2, ∑_{i=1}^{n} w_i λ_i = 0.
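The weight and moment formulas above can be checked numerically. The following sketch (plain Python, with helper names of our own choosing) evaluates (9.19)-(9.21) for n = 2 and verifies the stated weight properties:

```python
import math

def weights(lams):
    # w_i = prod_{k != i} (1 - lam_i/lam_k)^(-1), eq. (9.20); rates must be distinct
    return [1.0 / math.prod(1 - li / lk for k, lk in enumerate(lams) if k != i)
            for i, li in enumerate(lams)]

def survival(t, lams):
    # S(t; n) = sum_i w_i exp(-lam_i t), eq. (9.19)
    w = weights(lams)
    return sum(wi * math.exp(-li * t) for wi, li in zip(w, lams))

def mean_survival(lams):
    # mu = sum_i w_i / lam_i, eq. (9.21); equals sum_i 1/lam_i for a sum of exponentials
    w = weights(lams)
    return sum(wi / li for wi, li in zip(w, lams))

lams = [1.0, 5.0]          # hazard rates from the examples below
w = weights(lams)          # [1.25, -0.25]
mu = mean_survival(lams)   # 1/1 + 1/5 = 1.2
```

Note that the weights need not be positive; only their combination forms a proper density.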

9.3.2 Effect of patient enrollment rate

In this section, we examine the effect of the accrual duration on the survival distribution. Let N be the number of patients enrolled and let (0, t_0) be the patient enrollment period, defined as the time elapsed from the first patient enrolled to the last patient enrolled. Also, let t denote the time elapsed from the beginning of the trial. Denote by f_d(t) and f_e(τ_e), where τ_e ∈ [0, t_0], the probability density functions of failure (death) and of the enrollment time, respectively. The failure function (or the


probability of death before time t) can be expressed as

F(t) = ∫_0^t f_d(τ) dτ = ∫_0^t ∫_0^{min(τ, t_0)} f(τ − τ_e) f_e(τ_e) dτ_e dτ.    (9.22)

For uniform enrollment, the enrollment-time density is

f_e(τ_e) = 1/t_0 for τ_e ∈ [0, t_0], and 0 otherwise,

and, with the probability density function (9.18), (9.22) becomes

F(t) = ∫_0^t ∫_0^{min(τ, t_0)} ∑_{i=1}^{n} w_i λ_i e^{−λ_i (τ − τ_e)} / t_0 dτ_e dτ.

After the integration, we have

F(t) = (1/t_0) { t + ∑_{i=1}^{n} (w_i/λ_i) [e^{−λ_i t} − 1] },  t ≤ t_0;
F(t) = (1/t_0) { t_0 + ∑_{i=1}^{n} (w_i/λ_i) [e^{−λ_i t} − e^{−λ_i (t − t_0)}] },  t > t_0.    (9.23)

Differentiating with respect to t, we obtain

f(t) = (1/t_0) ( 1 − ∑_{i=1}^{n} w_i e^{−λ_i t} ),  t ≤ t_0;
f(t) = (1/t_0) ∑_{i=1}^{n} w_i [e^{−λ_i (t − t_0)} − e^{−λ_i t}],  t > t_0.    (9.24)

The survival function is then given by

S(t) = 1 − F(t), (9.25)

and the number of deaths among N patients can be written as

D(t) = NF(t). (9.26)

Note that (9.23) is useful for sample size calculation with a nonparametric method. For n = 1, (9.26) reduces to the number of deaths under the exponential survival distribution, i.e.,

D = R [ t − (1/λ)(1 − e^{−λt}) ],  t ≤ t_0;
D = R [ t_0 − (1/λ)(e^{λ t_0} − 1) e^{−λt} ],  t > t_0,

where R = N/t_0 is the uniform enrollment rate.
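Equation (9.23) can be checked against simulation. The sketch below (our own helper names; n = 2 with the rates used later in Example 9.1) evaluates (9.23) and compares it with a Monte Carlo estimate of the probability of death by calendar time t:

```python
import math
import random

def F_uniform(t, lams, w, t0):
    # Eq. (9.23): failure probability by calendar time t under uniform enrollment on [0, t0]
    if t <= t0:
        return (t + sum(wi / li * (math.exp(-li * t) - 1) for wi, li in zip(w, lams))) / t0
    return (t0 + sum(wi / li * (math.exp(-li * t) - math.exp(-li * (t - t0)))
                     for wi, li in zip(w, lams))) / t0

random.seed(1)
lams, t0, t = [1.0, 5.0], 2.0, 1.5
w = [1.25, -0.25]                      # weights from (9.20) for rates (1, 5)
analytic = F_uniform(t, lams, w, t0)

# Monte Carlo: enrollment time uniform on [0, t0]; death = sum of two exponential phases
runs = 200_000
hits = sum(1 for _ in range(runs)
           if random.uniform(0, t0) + random.expovariate(1.0) + random.expovariate(5.0) <= t)
mc = hits / runs
```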

Parameter estimate

It is convenient to use the paired variable (t_j, δ_j), defined as (t_j, 1) for a failure time t_j and (t_j, 0) for a censored time t_j. The likelihood can then be expressed as

L = ∏_{j=1}^{N} [f(t_j)]^{δ_j} [S(t_j)]^{1−δ_j},    (9.27)

where the probability density function f(t) and survival function S(t) are given by (9.18) and (9.19), respectively, for instantaneous enrollment, and by (9.24) and (9.25) otherwise. Note that for an individual whose survival time is censored at t_j, the contribution to the likelihood is the probability of surviving beyond that point in time, i.e., S(t_j). To reduce the number of parameters in the model, we can assume that the hazard rates form a geometric sequence, i.e., λ_i = a λ_{i−1}, or equivalently λ_i = a^i λ_0, i = 1, 2, . . . , n. This leads to a two-parameter model regardless of n, the number of progressions. The maximum likelihood estimates of λ_0 and a can then be obtained through numerical iteration.
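As a minimal sketch of this numerical estimation (with free rates λ_1, λ_2 for n = 2 rather than the geometric parameterization, and a crude grid search standing in for a proper optimizer), the censored likelihood (9.27) can be maximized as follows:

```python
import math
import random

def loglik(l1, l2, data):
    # log-likelihood (9.27) under the two-rate mixed exponential model
    w1, w2 = 1 / (1 - l1 / l2), 1 / (1 - l2 / l1)
    ll = 0.0
    for t, d in data:
        f = w1 * l1 * math.exp(-l1 * t) + w2 * l2 * math.exp(-l2 * t)
        s = w1 * math.exp(-l1 * t) + w2 * math.exp(-l2 * t)
        ll += math.log(f) if d else math.log(s)   # censored subjects contribute S(t_j)
    return ll

rng = random.Random(3)
Ts = 3.2   # study duration; observations beyond Ts are censored
data = []
for _ in range(1000):
    tau = rng.expovariate(1.0) + rng.expovariate(5.0)   # true rates (1, 5)
    data.append((min(tau, Ts), 1 if tau < Ts else 0))

# coarse grid search over (l1, l2); a real implementation would iterate numerically
grid = [(a / 100, b / 100)
        for a in range(60, 141, 10) for b in range(300, 701, 25)]
l1_hat, l2_hat = max(grid, key=lambda p: loglik(p[0], p[1], data))
mu_hat = 1 / l1_hat + 1 / l2_hat   # eq. (9.28)
```

Consistent with the simulation findings reported below, the mean survival estimate is far more stable than the estimate of the larger hazard rate.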

Example 9.1 To illustrate the use of the mixed exponential model for obtaining the maximum likelihood estimates of the two parameters λ_1 and λ_2, independent x_{1j} and x_{2j}, j = 1, . . . , N, were generated from two exponential distributions with parameters λ_1 and λ_2, respectively. Let τ_j = x_{1j} + x_{2j}. Then τ_j has a mixed exponential distribution with parameters λ_1 and λ_2. Let t_j = min(τ_j, T_s), where T_s is the duration of the study. The paired variables (t_j, δ_j), j = 1, . . . , N, were then used to obtain the maximum likelihood estimates of λ_1 and λ_2. Using (9.21) and the invariance principle of maximum likelihood estimators, the maximum likelihood estimate of the mean survival time µ can be obtained as

µ̂ = ∑_{j=1}^{2} ŵ_j/λ̂_j = 1/λ̂_1 + 1/λ̂_2.    (9.28)

For each of three scenarios (i.e., λ_1 = 1, λ_2 = 1.5; λ_1 = 1, λ_2 = 2; λ_1 = 1, λ_2 = 5), 5000 simulation runs were performed. The means and coefficients of variation of the estimated parameters are summarized in Table 9.2. As can be seen from Table 9.2, the mixed exponential model performs well: it gives an excellent estimate of the mean survival time in all three cases, with virtually no bias and a coefficient of variation of less than 10%. The maximum likelihood estimate of λ_1 is reasonably good, with a bias of less than 6%. However, λ_2 is over-estimated by about 5% to 15%, with large coefficients of variation ranging from 30% to 40%. The bias increases as the percentage of censored observations increases. Thus, it is suggested that the maximum likelihood estimate of the mean survival time, rather than that of the hazard rates, be used to assess the effect of a test treatment.

Table 9.2 Simulation Results with Mixed Exponential Model

              λ1     λ2     µ    |  λ1     λ2     µ    |  λ1     λ2     µ
True          1.00   1.50   1.67 |  1.00   2.00   1.50 |  1.00   5.00   1.20
Mean*         1.00   1.70   1.67 |  1.06   2.14   1.51 |  1.06   5.28   1.20
CV*           0.18   0.30   0.08 |  0.20   0.37   0.08 |  0.18   0.44   0.09
PDs (%)              93          |         96          |         96
Censors (%)          12          |         8           |         5

Note: Study duration T = 3.2 with quick enrollment. Number of subjects N = 100.
*Mean and coefficient of variation of the estimates from 5000 runs for each scenario.

9.3.3 Hypothesis test and power analysis

In a two-arm clinical trial comparing the treatment difference in survival, the hypotheses can be written as

H_0: µ_1 ≥ µ_2  versus  H_a: µ_1 < µ_2.    (9.29)

Note that the hazard rates for the two treatment groups may change over time. In practice, proportional hazards do not generally hold for a mixed exponential model. In what follows, we introduce two different approaches for hypothesis testing: a nonparametric method and a simulation method.

Nonparametric method

In most clinical trials, there are some censored observations. In this case, the parametric method is no longer valid. Alternatively, nonparametric methods such as the logrank test (Marubini and Valsecchi, 1995) are useful. Note that procedures for sample size calculation using the logrank test under the assumption of an exponential distribution are available in the literature (see, e.g., Marubini and Valsecchi, 1995; Chang and Chow, 2005). In what follows, we derive a formula for sample size calculation under the mixed exponential distribution based on the logrank statistic. The total number of deaths required for a one-sided logrank test of the treatment difference between two equal-sized independent groups is given by

D = [ z_{1−α} + 2 z_{1−β} √θ/(1 + θ) ]² [ (1 + θ)/(1 − θ) ]²,    (9.30)

where the hazard ratio is

θ = ln F_2(T_s) / ln F_1(T_s),    (9.31)

T_s is the trial duration, and F_k(T_s) is the proportion of patients with the event in the kth group. The relationship between F_k(T_s), the enrollment duration t_0, the trial duration T_s, and the hazard rates is given by (9.23). From (9.23) and (9.26), the total number of patients required when enrollment is uniformly distributed over (0, t_0) can be obtained as

N = [ z_{1−α} + 2 z_{1−β} √θ/(1 + θ) ]² [ (1 + θ)/(1 − θ) ]² / (F_1 + F_2).    (9.32)

Example 9.2 Assume a uniform enrollment with a duration of t_0 = 10 months and a trial duration of T_s = 14 months. At the end of the study, the proportions of failures are F_1 = 0.8 and F_2 = 0.75 for the control (group 1) and the active drug (group 2), respectively. Choose a power of 90% and a one-sided α = 0.025. The hazard ratio calculated from (9.31) is θ = 1.29. From (9.32), the total number of patients required is N = 714. If hazard rates are given instead of the proportions of failures, we may use (9.23) to calculate the proportions of failures first.
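A direct transcription of (9.31)-(9.32) as reconstructed here can be sketched as follows (the z values for one-sided α = 0.025 and 90% power are hard-coded; the function name is our own):

```python
import math

def logrank_sample_size(F1, F2, z_alpha, z_beta):
    # theta from (9.31); total N from (9.32), equal allocation, one-sided test
    theta = math.log(F2) / math.log(F1)
    d = (z_alpha + 2 * z_beta * math.sqrt(theta) / (1 + theta)) ** 2 \
        * ((1 + theta) / (1 - theta)) ** 2
    return theta, d / (F1 + F2)

# Example 9.2 inputs: F1 = 0.8, F2 = 0.75, one-sided alpha = 0.025, power = 90%
theta, N = logrank_sample_size(0.8, 0.75, z_alpha=1.96, z_beta=1.2816)
```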

Simulation method

Computer simulation is a very useful tool that can be applied directly to almost all hypothesis testing problems and to power analysis for sample size calculation. It can be used with or without censoring, and it can easily be applied to a trial with treatment switching. The simulation algorithm is as follows:

Step 1: Generate simulation data under H_0

Generate x_i and y_i independently from a mixed exponential distribution with parameters λ_{11} and λ_{12}, the hazard rates under the null hypothesis H_0, using the method described in the previous section.

Step 2: Find the distribution of the test statistic T under H_0

For each set of data, calculate the test statistic T as defined (e.g., the maximum likelihood estimate of the mean difference). By repeating step 1 M times, M values of the test statistic T are obtained and its distribution can be formed. The precision of the distribution increases as the number of replications M increases.

Step 3: Calculate the p-value

Sort the T values obtained in step 2 and calculate the test statistic T based on the observed data from the trial. The p-value is the proportion of the simulated T values that are larger (or smaller) than the observed T.

Step 4: Calculate the test statistic under H_a

For the power calculation, data under H_a must be generated (i.e., generate x_i, i = 1, . . . , N, from a mixed exponential distribution with parameters λ_{11} and λ_{12}, and y_i, i = 1, . . . , N, from another mixed exponential distribution with parameters λ_{21} and λ_{22}, as described in the previous section). Calculate the test statistic as in step 2.

Step 5: Calculate the power of the test

Repeat step 4 M times to form the distribution of the test statistic under H_a. The power of the test with N subjects per group is the proportion of the simulated test statistics T whose values exceed the critical value T_c.

Note that steps 4 and 5 are for power analysis and sample size calculation. For hypothesis testing, only the distribution of the test statistic under H_0 is required; for power and sample size calculation, the distributions under both H_0 and H_a are required. This simulation approach is useful in dealing with adaptive treatment switching.
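The five steps above can be sketched as follows. This is a deliberately simplified version: the difference in sample means stands in for the MLE-based mean-survival difference used in the text, and there is no censoring; all rates are the hypothetical ones from Table 9.3.

```python
import random
import statistics

def draw_mixed_exp(l1, l2, n, rng):
    # survival time = time to progression + time from progression to death
    return [rng.expovariate(l1) + rng.expovariate(l2) for _ in range(n)]

def simulate_distribution(l_x, l_y, n, runs, rng):
    # distribution of the stand-in test statistic T = mean(y) - mean(x)
    return sorted(statistics.mean(draw_mixed_exp(*l_y, n, rng))
                  - statistics.mean(draw_mixed_exp(*l_x, n, rng))
                  for _ in range(runs))

rng = random.Random(7)
n, runs = 100, 500
null = simulate_distribution((1.0, 5.0), (1.0, 5.0), n, runs, rng)   # Steps 1-2
t_c = null[int(0.975 * runs)]                                        # one-sided critical value
alt = simulate_distribution((1.0, 5.0), (0.7, 1.5), n, runs, rng)    # Step 4
power = sum(t > t_c for t in alt) / runs                             # Step 5
```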

9.3.4 Application to trials with treatment switch

As indicated earlier, it is not uncommon to switch treatments when there is evidence of lack of efficacy or disease progression. Due to the natural course of the disease, most patients will have disease progression during the trial. When this happens, the clinician often gives an alternative treatment. Patients who were initially treated with the standard therapy are often switched to the test drug, but patients who were initially treated with the test drug are not necessarily switched to the standard therapy; rather, they could be switched to a different drug that has an effect similar to that of the control.

For a typical patient in the kth group who experiences disease progression and dies, the time to disease progression is assumed to follow an exponential distribution with hazard rate λ_{k1}, and the time from progression to death another exponential distribution with hazard rate λ_{k2}. Due to treatment switching, a patient in treatment 1 will have a hazard rate of λ*_{12} = λ_{22} after switching to treatment 2,

and a patient in treatment 2 will have a hazard rate of λ*_{22} = λ_{12} after switching to treatment 1. Further, assume all patients eventually have progressive disease and will switch treatments if the trial lasts long enough. Under these conditions, the probability density function and survival function for the kth group are given by

f*_k(t) = w*_{k1} λ_{k1} e^{−λ_{k1} t} + w*_{k2} λ*_{k2} e^{−λ*_{k2} t}    (9.33)

and

S*_k(t) = w*_{k1} e^{−λ_{k1} t} + w*_{k2} e^{−λ*_{k2} t},    (9.34)

respectively, where

w*_{k1} = [1 − λ_{k1}/λ*_{k2}]^{−1}  and  w*_{k2} = [1 − λ*_{k2}/λ_{k1}]^{−1}.
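A quick numerical check confirms that (9.33) and (9.34) form a consistent density/survival pair; the sketch below (our own helper names) verifies S*(0) = 1 and f* = −dS*/dt for group 1's rates after switching:

```python
import math

def switch_weights(l_k1, l_k2s):
    # w*_{k1}, w*_{k2} for first-phase rate l_k1 and post-switch rate l_k2s
    return 1 / (1 - l_k1 / l_k2s), 1 / (1 - l_k2s / l_k1)

def f_star(t, l_k1, l_k2s):
    # density (9.33)
    w1, w2 = switch_weights(l_k1, l_k2s)
    return w1 * l_k1 * math.exp(-l_k1 * t) + w2 * l_k2s * math.exp(-l_k2s * t)

def s_star(t, l_k1, l_k2s):
    # survival function (9.34)
    w1, w2 = switch_weights(l_k1, l_k2s)
    return w1 * math.exp(-l_k1 * t) + w2 * math.exp(-l_k2s * t)

# group 1 switches to the test drug after progression: l_11 = 1, l*_12 = l_22 = 1.5
l1, l2 = 1.0, 1.5
h = 1e-6
deriv = (s_star(0.8 + h, l1, l2) - s_star(0.8 - h, l1, l2)) / (2 * h)
```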

The likelihood for the kth group is similar to (9.27), i.e.,

L*_k = ∏_{j=1}^{N} [f*_k(t_j)]^{δ_j} [S*_k(t_j)]^{1−δ_j}.    (9.35)

The likelihood for the two groups combined is given by

L* = L*_1 L*_2.

Under the assumption that all patients will eventually switch treatments, maximizing L* is equivalent to maximizing both L*_1 and L*_2. After obtaining the maximum likelihood estimates of λ_{11}, λ_{22}, λ_{21}, and λ_{12} using (9.35), one can calculate the mean survival times µ_1 and µ_2 for the two groups using

µ_k = 1/λ_{k1} + 1/λ_{k2}.    (9.36)

µ̂_k is called the estimator of the latent survival time µ_k. The latent survival time can be interpreted as (i) the survival time that would have been observed if the patient had not switched treatment, or (ii) the overall survival benefit of the drug in treating patients with different baseline severities (e.g., 2nd-line and 3rd-line oncology patients). The first interpretation assumes that the drug has that magnitude of re-treatment effect, which implies that the investigator should not switch treatment at all. The second interpretation is appropriate regardless of the re-treatment effect of the drug. For testing the hypotheses

H_0: µ_1 ≥ µ_2  versus  H_a: µ_1 < µ_2,

the test statistic is defined as

T = µ̂_2 − µ̂_1.

Example 9.3 Suppose that a clinical trial comparing two parallel treatment groups was conducted to demonstrate that the test drug (group 2) is better than the control (group 1) in survival. The study duration was 3.2 years and enrollment was quick (i.e., t_0 = 0). The trial protocol allowed the investigator to switch a patient's treatment if disease progression was observed. Suppose that λ_{11} = 1, λ_{12} = 5, λ_{21} = 0.7, and λ_{22} = 1.5. The latent mean survival times for the two groups calculated using (9.21) are µ_1 = 1.2 and µ_2 = 2.095 for the control and test groups, respectively. The latent survival times indicate that the test drug is better than the control in survival. However, if the treatment switch is not taken into account, the mean survival times will be calculated as

µ_1 = 1/λ_{11} + 1/λ_{22} = 1.667

and

µ_2 = 1/λ_{21} + 1/λ_{12} = 1.629,

which could lead to the wrong conclusion that the control is better than the test drug. Therefore, it is critical to account for the switching effect in the statistical analysis.
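The arithmetic of Example 9.3 is easy to verify; the sketch below computes both the latent means (9.36) and the naive means in which each group runs at the other arm's second-phase rate after the switch:

```python
# Example 9.3 rates: (time-to-progression, post-progression) hazards per group
l11, l12 = 1.0, 5.0    # control
l21, l22 = 0.7, 1.5    # test drug

# latent means, eq. (9.36): no switching
mu1_latent = 1 / l11 + 1 / l12
mu2_latent = 1 / l21 + 1 / l22

# naive means: after progression, each group uses the other arm's second-phase rate
mu1_naive = 1 / l11 + 1 / l22
mu2_naive = 1 / l21 + 1 / l12
```

The sign of the comparison flips: the test drug wins on latent means but appears to lose on the naive ones.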

Note that the nonparametric method may not be appropriate for trials with treatment switching. In this case, it is suggested that computer simulation be used. Table 9.3 presents the simulation results. The results indicate that under H_0, the critical point for rejecting the null hypothesis is T_c = 0.443 for the test statistic (the observed mean survival difference). Using T_c = 0.443 to run the simulation under H_a with N = 250 patients per group shows that the trial has 86% power for detecting the difference. The distributions of the test statistic under H_0 and H_a are plotted in Figure 9.1. Under H_a, there is about a 3% over-estimate of µ_1 and a 3% under-estimate of µ_2, and about a 13% under-estimate of the mean difference µ_2 − µ_1, with a standard deviation of 0.34. The expected standard deviation of the mean survival difference is

σ = √(σ_1² + σ_2²) = 1.88,

where σ_1² and σ_2² are calculated from (9.21). We can see that the method of computer simulation improves the precision at the cost of accuracy.

Remarks For patients who are never switched, regardless of treatment failure or other reasons, the probability density function and survival


Table 9.3 Simulation Results with Treatment Switching

Under H_0 condition:
              λ11    λ12    µ1     λ21    λ22    µ2     µ2−µ1
True          1      5      1.2    1      5      1.2    0.00
Mean*         1.03   5.17   1.20   1.02   5.13   1.20   0.004
SD*           0.11   1.63   0.12   0.11   1.60   0.12   0.22
PDs (%)              96                   96
Censors (%)          5                    5

Under H_a condition:
              λ11    λ12    µ1     λ21    λ22    µ2     µ2−µ1
True          1      5      1.2    0.7    1.5    2.10   0.90
Mean*         1.00   5.25   1.24   0.71   1.64   2.06   0.82
SD*           0.14   1.81   0.17   0.07   0.42   0.20   0.34
PDs (%)              96                   89
Censors (%)          11                   12

*Mean and standard deviation of the estimates (5000 runs).

[Figure 9.1 Distribution of the test statistic with n = 250 (5000 runs), under H_0 (λ_{11} = λ_{21} = 1, λ_{12} = λ_{22} = 5) and H_a (λ_{11} = 1, λ_{12} = 5; λ_{21} = 0.7, λ_{22} = 1.5); critical point = 0.443.]


function are given by (9.18) and (9.19), respectively. Denote by P_s the proportion of patients who are willing to switch. Then the distribution of the survival time for a typical patient in the trial is given by

f_k(t) = P_s f*_k(t) + (1 − P_s) f_k(t),

which is the unconditional distribution. For patients who have already switched and for patients who have not switched by a certain time, the conditional distributions will differ; this case is not studied here. It is assumed that re-treatment of a patient with the same drug after disease progression is not an option, and that all patients will eventually develop progressive disease and switch treatments at the time of disease progression if the trial lasts long enough. In the simulation, the sample size N was adjusted several times in order to meet the power requirement. Note that every time N changes, the critical point also changes due to the nature of the test statistic.

9.4 Concluding Remarks

Branson and Whitehead’s method considered the use of latent eventtimes to model a patient’s observed survival time after switching andthe survival time that would have been observed if this patient hadnot switched treatment. Their model does not take into account thefact that a treatment switch is often based on the observed effect ofthe current treatment. For example, the survival time of a patient whoswitches from the active control to the test treatment might be longerthan his survival time if he had adhered to the original treatment.Therefore, Branson and Whitehead’s model is a model for random treat-ment switching with a constant latent hazard rate over time. In fact,even in the case of random switching, the hazard rate increases afterthe switch, and the later the switch occurs, the bigger the hazard rateafter the switch. Shao, Chang, and Chow’s method considered a gener-alized time-dependent Cox’s proportional hazard model and providedmaximum likelihood estimates (MLE) of the parameters. However, themethod for hypothesis testing was not provided.

On the other hand, the biomarker-based mixed exponential modelprovides a great flexibility for modeling. The maximum likelihood es-timate does not require the collection of biomarker data in the trial.Instead, the time to biomarker response can be estimated using one ofthe hazard rates in the mixed exponential model. One can use the datato improve the maximum likelihood estimates of the parameters if thebiomarker response data are available.


CHAPTER 10

Bayesian Approach

As pointed out by Woodcock (2005), Bayesian approaches to clinical trials are of great interest in the medical product development community because they offer a way to gain valid information in a manner that is potentially more parsimonious of time, resources, and investigational subjects than current methods. The need to streamline the product development process without sacrificing important information has become increasingly apparent. Temple (2005) also indicated that the FDA's reviewers have already used some of the thinking processes that involve Bayesian approaches, although the Bayesian approaches themselves are not implemented. In the Bayesian paradigm, initial beliefs concerning a parameter of interest (discrete or continuous) are expressed by a prior distribution. Evidence from further data is then modeled by a likelihood function for the parameter. The normalized product of the prior and the likelihood forms the so-called posterior distribution. Based on the posterior distribution, conclusions regarding the parameter of interest can then be drawn. The possible use of Bayesian methods in clinical trials has been studied extensively in the literature in recent years. See, for example, Brophy and Joseph (1995), Lilford and Braunholtz (1996), Berry and Stangl (1996), Gelman, Carlin, and Rubin (2003), Spiegelhalter, Abrams, and Myles (2004), Goodman (1999, 2005), Louis (2005), and Berry (2005).

Bayesian approaches for dose-escalation trials were discussed in Chapter 4. In this chapter, our focus is placed on different utilities of Bayesian approaches in clinical trials. In the next section, some basic concepts of the Bayesian approach, such as Bayes' rule and Bayesian power, are given. Section 10.2 discusses the determination of the prior distribution. In Section 10.3, Bayesian approaches to multiple-stage designs for a single-arm trial are discussed. Section 10.4 introduces the use of Bayesian optimal adaptive designs in clinical trials. Some concluding remarks are given in the last section of this chapter.

10.1 Basic Concepts of Bayesian Approach

In this section, basic concepts of the Bayesian approach, such as Bayes' rule and Bayesian power, are briefly described.


10.1.1 Bayes rule

Let θ be the parameter of interest. Denote the prior distribution of θ by π(θ), and let f(x|θ) be the sampling distribution of X given θ. Basically, a Bayesian approach involves the following four elements:

• The joint distribution of (θ, X), which is given by

ϕ(θ, x) = f(x|θ) π(θ);    (10.1)

• The marginal distribution of X, which can be obtained as

m(x) = ∫ ϕ(θ, x) dθ = ∫ f(x|θ) π(θ) dθ;    (10.2)

• The posterior distribution of θ, which can be obtained by Bayes' formula

π(θ|x) = f(x|θ) π(θ) / m(x);    (10.3)

• The predictive probability distribution, which is given by

P(y|x) = ∫ P(y|x, θ) π(θ|x) dθ.    (10.4)

To illustrate the use of a Bayesian approach, the following examples are useful. We first consider the case where the study endpoint is discrete.

Example 10.1 Assume that X follows a binomial distribution with probability of success p, i.e., X ∼ B(n, p). Also assume that the parameter of interest p follows a beta distribution with parameters α and β, i.e., p ∼ Beta(α, β). Thus, the prior distribution is given by

π(p) = p^{α−1} (1 − p)^{β−1} / B(α, β),  0 ≤ p ≤ 1,    (10.5)

where

B(α, β) = Γ(α) Γ(β) / Γ(α + β).

Furthermore, the sampling distribution of X given p is given by

f(x|p) = (n choose x) p^x (1 − p)^{n−x},  x = 0, 1, . . . , n.    (10.6)

Thus, the joint distribution of (p, X) is given by

ϕ(p, x) = (n choose x) p^{α+x−1} (1 − p)^{n−x+β−1} / B(α, β),    (10.7)

and the marginal distribution of X is given by

m(x) = (n choose x) B(α + x, n − x + β) / B(α, β).    (10.8)

As a result, the posterior distribution of p given X = x can be obtained as

π(p|x) = p^{α+x−1} (1 − p)^{β+n−x−1} / B(α + x, β + n − x) = Beta(α + x, β + n − x).    (10.9)
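The conjugate update (10.9) reduces to simple parameter arithmetic; a minimal sketch (our own function names, with B(α, β) computed on the log scale for stability):

```python
from math import lgamma, exp

def beta_fn(a, b):
    # B(a, b) = Gamma(a)Gamma(b)/Gamma(a+b)
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def posterior_params(alpha, beta, x, n):
    # Beta(alpha, beta) prior with x successes in n trials
    # -> Beta(alpha + x, beta + n - x) posterior, eq. (10.9)
    return alpha + x, beta + n - x

a_post, b_post = posterior_params(2, 3, x=7, n=10)   # hypothetical prior and data
post_mean = a_post / (a_post + b_post)
```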

For another example, consider the case where the study endpoint is a continuous variable.

Example 10.2 Assume that X follows a normal distribution with mean θ and variance σ²/n, i.e., X ∼ N(θ, σ²/n). Also assume that the parameter of interest θ follows a normal distribution with mean µ and variance σ²/n₀, i.e., θ ∼ N(µ, σ²/n₀). Thus, we have

π(θ|X) ∝ f(X|θ) π(θ).

As a result, the posterior distribution of θ given X can be obtained as

π(θ|X) = C e^{−(X−θ)² n/(2σ²)} e^{−(θ−µ)² n₀/(2σ²)},    (10.10)

where C is a constant not depending on θ. It can be verified that (10.10) is a normal distribution with mean (n₀µ + nX)/(n₀ + n) and variance σ²/(n₀ + n), i.e.,

θ|X ∼ N( (n₀µ + nX)/(n₀ + n), σ²/(n₀ + n) ).

Now, based on (10.10), we can make predictions concerning future values of X by taking into account the uncertainty about its mean θ. For this purpose, we rewrite X = (X − θ) + θ so that X is the sum of two independent quantities, i.e., (X − θ) ∼ N(0, σ²/n) and θ ∼ N(µ, σ²/n₀). As a result, the predictive probability distribution can be obtained as (see, e.g., Spiegelhalter, Abrams, and Myles, 2004)

X ∼ N( µ, σ² (1/n + 1/n₀) ).    (10.11)

Note that if we observe the first n₁ observations (i.e., the mean of the first n₁ observations, x̄_{n₁}, is known), then the predictive probability distribution is given by

X|x̄_{n₁} ∼ N( (n₀µ + n₁ x̄_{n₁})/(n₀ + n₁), σ² ( 1/(n₀ + n₁) + 1/n ) ).    (10.12)
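The normal-normal update (10.10)-(10.11) is likewise just parameter arithmetic; a minimal sketch under hypothetical inputs (prior worth n₀ = 20 observations, n = 80 new observations):

```python
def posterior(mu0, n0, xbar, n, sigma2):
    # theta | X ~ N((n0*mu0 + n*xbar)/(n0 + n), sigma2/(n0 + n)), from (10.10)
    return (n0 * mu0 + n * xbar) / (n0 + n), sigma2 / (n0 + n)

def predictive(mu0, n0, n, sigma2):
    # X ~ N(mu0, sigma2*(1/n + 1/n0)), eq. (10.11)
    return mu0, sigma2 * (1 / n + 1 / n0)

m_post, v_post = posterior(mu0=0.0, n0=20, xbar=1.0, n=80, sigma2=4.0)
m_pred, v_pred = predictive(mu0=0.0, n0=20, n=80, sigma2=4.0)
```

The posterior mean is a precision-weighted compromise between the prior mean and the sample mean, which is exactly the shrinkage behavior (10.10) describes.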

The above basic Bayesian concepts can easily be applied to some classical designs in clinical trials. However, they may affect the power and consequently the sample size calculation. For illustration purposes, consider the following example.

Example 10.3 Consider a two-arm parallel-group design comparing a test treatment and a standard therapy or an active control agent. Under the two-arm trial, it can be verified that the power is a function of the effect size ε (see, e.g., Chow, Shao, and Wang, 2003). That is,

power(ε) = Φ₀( √n ε/2 − z_{1−α} ),    (10.13)

where Φ₀ is the cumulative distribution function of the standard normal distribution. Suppose the prior distribution of the uncertain ε is π(ε). Then the expected power is given by

P_exp = ∫ Φ₀( √n ε/2 − z_{1−α} ) π(ε) dε.    (10.14)

In practice, numerical integration is usually employed to evaluate (10.14). To illustrate the implications of (10.14), we assume a one-sided α = 0.025 (i.e., z_{1−α} = 1.96) and the following prior for ε:

π(ε) = 1/3 for ε = 0.1, 0.25, 0.4, and 0 otherwise.    (10.15)

Conventionally, we would use the mean (median) of the effect size, ε = 0.25, to design the trial and perform the sample size calculation assuming that ε = 0.25 is the true effect size. For the two-arm balanced design with β = 0.2 (i.e., power = 80%), the classical approach gives the following sample size:

n = 4(z_{1−α} + z_{1−β})²/ε² = 4(1.96 + 0.842)²/0.25² = 502.    (10.16)

On the other hand, the Bayesian approach based on the expected power (10.14) yields

P_exp = (1/3)[ Φ₀(0.1√n/2 − z_{1−α}) + Φ₀(0.25√n/2 − z_{1−α}) + Φ₀(0.4√n/2 − z_{1−α}) ]

= (1/3)[ Φ₀(0.1√502/2 − 1.96) + Φ₀(0.25√502/2 − 1.96) + Φ₀(0.4√502/2 − 1.96) ]

= (1/3)[ Φ₀(−0.8397) + Φ₀(0.8407) + Φ₀(2.5211) ]

= (1/3)(0.2005 + 0.7997 + 0.9942) = 0.6648 ≈ 66%.    (10.17)

As can be seen from the above, the expected power is only 66%, which is lower than the desired power of 80%. As a result, in order to reach the desired power, the sample size must be increased.
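The calculation in (10.17) can be reproduced directly; the sketch below evaluates (10.14) for the discrete prior (10.15), using only the standard library (Φ via the error function):

```python
from math import erf, sqrt

def phi(x):
    # standard normal CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

def expected_power(n, prior, z_alpha=1.96):
    # Eq. (10.14) for a discrete prior {effect size: probability}
    return sum(p * phi(sqrt(n) * eps / 2 - z_alpha) for eps, p in prior.items())

prior = {0.1: 1 / 3, 0.25: 1 / 3, 0.4: 1 / 3}   # the prior (10.15)
p_exp = expected_power(502, prior)               # about 0.665
```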

If π(ε) follows a normal distribution N(µ, σ²/n₀), then the expected power can be obtained using the predictive distribution by evaluating the probability that the critical event occurs, i.e.,

P( X > z_{1−α} σ/√n ).

This gives

P_exp = Φ( √(n₀/(n₀ + n)) ( √n µ/σ − z_{1−α} ) ).    (10.18)

As indicated earlier, the total sample size required to achieve the desired power is a function of the effect size ε, i.e.,

n(ε) = 4(z_{1−α} + z_{1−β})²/ε².    (10.19)

Thus, the expected total sample size can be obtained as

n_exp = ∫ [4(z_{1−α} + z_{1−β})²/ε²] π(ε) dε.    (10.20)

For a flat prior, i.e., π(ε) = 1/(b − a) for a ≤ ε ≤ b, we have

n_exp = ∫_a^b [4(z_{1−α} + z_{1−β})²/ε²] [1/(b − a)] dε = 4(z_{1−α} + z_{1−β})²/(ab).    (10.21)

This gives the following sample size ratio:

R_n = n_exp/n = ε²/(ab).

As can be seen from the above, if ε = 0.25, α = 0.025, β = 0.2, n = 502, a = 0.1, and b = 0.4 (note that (a + b)/2 = ε), then

R_n = 0.25²/((0.1)(0.4)) = 1.56.

This indicates that the frequentist approach could substantially under-estimate the sample size required for achieving the desired power.
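The flat-prior inflation factor is easy to confirm numerically; the sketch below compares (10.19) at the prior mean with the expected sample size (10.21):

```python
za, zb = 1.96, 0.842          # one-sided alpha = 0.025, power = 80%

def n_classical(eps):
    # eq. (10.16)/(10.19)
    return 4 * (za + zb) ** 2 / eps ** 2

def n_expected_flat(a, b):
    # eq. (10.21): flat prior on [a, b] gives n_exp = 4(z_a + z_b)^2 / (a*b)
    return 4 * (za + zb) ** 2 / (a * b)

ratio = n_expected_flat(0.1, 0.4) / n_classical(0.25)   # R_n = eps^2/(a*b)
```

The ratio exceeds 1 because 1/ε² is convex, so averaging over the prior inflates the required n relative to plugging in the prior mean.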

10.1.2 Bayesian power

For testing the null hypothesis H₀: θ ≤ 0 against the alternative hypothesis H_a: θ > 0, we define the result to be significant at the Bayesian level α_B if

P_B = P(θ < 0 | data) < α_B.

Note that the Bayesian significance can be easily found based on the posterior distribution. For the case where the data and the prior both follow a normal distribution, the posterior distribution is given by

π(θ | x̄) = N( (n0 µ + n x̄)/(n0 + n), σ²/(n0 + n) ).   (10.22)

Thus, Bayesian significance is achieved if the parameter estimate X̄ satisfies

X̄ > ( √(n0 + n) z_{1−αB} σ − n0 µ ) / n.   (10.23)

Note that the Bayesian power is then given by

PB(n) = Φ( µ √(n0(n0 + n)) / (σ √n) − √(n0/n) z_{1−αB} ).   (10.24)

Example 10.4 For illustration purposes, consider a phase II hypertension study comparing a test treatment with an active control agent. Suppose the primary endpoint is the reduction in systolic blood pressure (SBP). Assume that the estimated treatment effect is normally distributed, i.e.,

θ ∼ N( µ, 2σ²/n0 ).

The design is targeted to achieve Bayesian power 1 − βB at the Bayesian significance level αB = 0.2. For the sample size calculation, the sample mean difference

θ̂ ∼ N( θ, 2σ²/n ),

where n is the sample size per group. For a large sample size, we can assume that σ is known. In this case, the sample size n is the solution of the following equation:

Φ( µ √(n0(n0 + n)) / (σ √(2n)) − √(n0/n) z_{1−αB} ) = 1 − βB.   (10.25)

This leads to

µ √(n0(n0 + n)) / (σ √(2n)) − √(n0/n) z_{1−αB} = z_{1−βB}.   (10.26)

The above can be rewritten as follows:

A n + B √n + C = 0,   (10.27)

where

A = 2σ² z²_{1−βB} − µ² n0,
B = 4σ² z_{1−βB} z_{1−αB} √n0,
C = 2σ² z²_{1−αB} n0 − µ² n0².   (10.28)

As a result, we can solve (10.27) for n, which is given by

n = ( (−B + √(B² − 4AC)) / (2A) )².   (10.29)
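Rather than using the closed form, n can also be obtained directly from (10.25) by root finding. The sketch below uses simple bisection; the inputs µ = 0.15, σ = 1, n0 = 100, and αB = βB = 0.2 are illustrative assumptions, not values from the text.

```python
from statistics import NormalDist

Phi, inv = NormalDist().cdf, NormalDist().inv_cdf

def bayes_power(n, mu=0.15, sigma=1.0, n0=100, alpha_b=0.2):
    # Left-hand side of (10.25): Bayesian power of the two-arm design
    z = inv(1 - alpha_b)
    return Phi(mu * (n0 * (n0 + n)) ** 0.5 / (sigma * (2 * n) ** 0.5)
               - (n0 / n) ** 0.5 * z)

def solve_n(target_power=0.8, lo=50.0, hi=5000.0):
    # Bisection; requires bayes_power(lo) < target_power < bayes_power(hi)
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if bayes_power(mid) < target_power:
            lo = mid
        else:
            hi = mid
    return hi

n_b = solve_n()   # per-group sample size solving (10.25) for these inputs
```

Bisection is robust here because the Bayesian power is bounded above by Φ(µ√n0/(σ√2)) as n grows, so one should check that the target power is attainable before solving.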

10.2 Multiple-Stage Design for Single-Arm Trial

As indicated in Chapter 6, in phase II cancer trials, it is undesirable to stop a study early when the test drug is promising, and it is desirable to terminate the study as early as possible when the test treatment is not effective, due to ethical considerations. For this purpose, a multiple-stage single-arm trial design is often employed to determine whether the test treatment is promising enough to warrant further testing. For multiple-stage single-arm trials, the classical method and the Bayesian approach are commonly employed. In this section, these two methods are briefly described.


10.2.1 Classical approach for two-stage design

The most commonly used two-stage design in phase II cancer trials is probably Simon's optimal two-stage design (Simon, 1989). The concept of Simon's optimal two-stage design is to permit early stopping when a moderately long sequence of initial failures occurs. Thus, under a two-stage trial design, the hypotheses of interest are given below:

H0 : p ≤ p0 versus Ha : p ≥ p1,

where p0 is the undesirable response rate and p1 is the desirable response rate (p1 > p0). If the response rate of the test treatment is at the undesirable level, one would like to reject it as an ineffective treatment with high probability, and if its response rate is at the desirable level, one would like to not reject it as a promising compound with high probability. Note that under the above hypotheses, the type I error is the false positive of accepting an ineffective drug, and the type II error is the false negative of rejecting a promising compound.

Let n1 and n2 be the numbers of subjects in the first and second stages, respectively. Under a two-stage design, n1 patients are treated at the first stage. If there are r1 or fewer responses, the trial is stopped. Otherwise, an additional n2 patients are recruited and tested at the second stage. A decision regarding whether the test treatment is promising is then made based on the response rate of the n = n1 + n2 subjects. Note that rejection of H0 (or Ha) means that further (or no further) study of the test treatment should be carried out. Simon (1989) proposed selecting the optimal two-stage design that achieves the minimum expected sample size under the null hypothesis. Let nexp and Pet be the expected sample size and the probability of early termination after the first stage, respectively. Thus, we have

nexp = n1 + (1 − Pet)n2.

At the end of the first stage, we would terminate the trial early and accept the null hypothesis if r1 or fewer responses are observed. As a result, Pet is given by

Pet = Bc(r1; n1, p),

where Bc(r1; n1, p) denotes the cumulative binomial probability P(X ≤ r1). Thus, we reject the test treatment at the end of the second stage if r or fewer total responses are observed. The probability of rejecting the test treatment with success probability p is then given by

Bc(r1; n1, p) + Σ_{x=r1+1}^{min(n1, r)} B(x; n1, p) Bc(r − x; n2, p),


where B(x; n1, p) denotes the binomial probability mass function. For specific values of p0, p1, α, and β, Simon's optimal two-stage design can be obtained as the two-stage design that satisfies the error constraints and minimizes the expected sample size when the response rate is p0.

Example 10.5 Assume that the undesirable and desirable response rates under the null and alternative hypotheses are given by 0.05 and 0.25, respectively. For a given one-sided α = 0.05 with a desired power of 80%, we obtain the sample size and the operating characteristics as follows. The sample size required at stage 1 is n1 = 9. The cumulative sample size at stage 2 is n = 17. The actual overall α and the actual power are 0.047 and 0.812, respectively. The stopping rules are specified as follows. At stage 1, stop and accept the null hypothesis if the response rate is less than or equal to 0/9; otherwise, continue to stage 2. The probability of stopping for futility is 0.63 when H0 is true and 0.075 when Ha is true. At stage 2, stop and accept the null hypothesis if the response rate is less than or equal to 2/17; otherwise, stop and reject the null hypothesis.
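The operating characteristics quoted in Example 10.5 can be verified directly from the two expressions above. A sketch with the stated design (n1 = 9, r1 = 0, n = 17, and a final cutoff of r = 2 responses):

```python
from math import comb

def b(x, n, p):    # binomial probability B(x; n, p)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def bc(r, n, p):   # cumulative binomial Bc(r; n, p) = P(X <= r)
    return sum(b(x, n, p) for x in range(r + 1))

def p_reject_drug(p, n1=9, r1=0, n=17, r=2):
    # Probability of accepting H0: stopping for futility at stage 1,
    # or observing r or fewer total responses at stage 2
    n2 = n - n1
    return bc(r1, n1, p) + sum(b(x, n1, p) * bc(r - x, n2, p)
                               for x in range(r1 + 1, min(n1, r) + 1))

alpha = 1 - p_reject_drug(0.05)    # actual type I error, about 0.047
power = 1 - p_reject_drug(0.25)    # actual power, about 0.812
pet_h0 = bc(0, 9, 0.05)            # P(early stop | H0), about 0.63
pet_ha = bc(0, 9, 0.25)            # P(early stop | Ha), about 0.075
```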

10.2.2 Bayesian approach

Assume that X ∼ B(n, p) and p ∼ Beta(a, b). At the first stage with n1 subjects, the posterior distribution of p given X is given by

π1(p | x) = Beta(a + x, b + n1 − x).   (10.30)

Similarly, at the end of the second stage with n subjects, the posterior distribution of p given X is

π(p | x) = Beta(a + x, b + n − x).   (10.31)

As a result, the cutoff point for stopping the trial is chosen such that the Bayesian power at the first stage is 1 − β1 at the Bayesian significance level αB1. Similarly, n is chosen such that the Bayesian power at the second stage is 1 − β at the Bayesian significance level αB. Based on (10.30) and (10.31), the conditional power and predictive power can be derived. Given x responses out of n1 patients at the first stage, the probability (or conditional power) of having at least y additional responses out of the n2 additional patients at the second stage is given by

P(y | x, n1, n2) = Σ_{i=y}^{n2} C(n2, i) (x/n1)^i (1 − x/n1)^{n2−i}.   (10.32)

However, the Bayesian approach provides a different look. Assume a uniform prior distribution for p over [0, 1]. Thus, the sampling distribution of X given p is given by

P(X = x | p) = C(n1, x) p^x (1 − p)^{n1−x}.

Since

P(a < p < b ∩ X = x) = ∫_a^b C(n1, x) p^x (1 − p)^{n1−x} dp,

and

P(X = x) = ∫_0^1 C(n1, x) p^x (1 − p)^{n1−x} dp,

the posterior distribution of p given X can be obtained as

P(a < p < b | X = x) = [∫_a^b C(n1, x) p^x (1 − p)^{n1−x} dp] / [∫_0^1 C(n1, x) p^x (1 − p)^{n1−x} dp]
= [∫_a^b p^x (1 − p)^{n1−x} dp] / B(x + 1, n1 − x + 1),

where

B(x + 1, n1 − x + 1) = Γ(x + 1) Γ(n1 − x + 1) / Γ(n1 + 2).

Thus, the posterior distribution of p, conditional on X = x responses out of n1, is a Beta distribution, i.e.,

π(p | x) = p^x (1 − p)^{n1−x} / B(x + 1, n1 − x + 1).   (10.33)

As a result, the predictive power (which is different from the frequentist's conditional power), i.e., the predictive probability of having at least y responders out of n2 additional patients given the observed response rate of x/n1 at the first stage, is given by

P(y | x, n1, n2) = ∫_0^1 P(X ≥ y | p, n2) π(p | x) dp
= ∫_0^1 Σ_{i=y}^{n2} C(n2, i) p^i (1 − p)^{n2−i} [p^x (1 − p)^{n1−x} / B(x + 1, n1 − x + 1)] dp.   (10.34)

Carrying out the integration, we have

P(y | x, n1, n2) = Σ_{i=y}^{n2} C(n2, i) B(x + i + 1, n2 + n1 − x − i + 1) / B(x + 1, n1 − x + 1).


Note that the above results follow directly from the fact that

∫_0^1 p^a (1 − p)^b dp = B(a + 1, b + 1).   (10.35)

As a result, the conditional and/or predictive power can be used in designing sequential or other adaptive trials.
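A sketch comparing the frequentist conditional power (10.32) with the Bayesian predictive power computed from the closed form above; the interim data (x = 3 responses out of n1 = 10 patients, with n2 = 10 additional patients and at least y = 3 further responses required) are illustrative assumptions.

```python
from math import comb, exp, lgamma

def beta_fn(a, b):
    # Beta function B(a, b), computed via log-gamma for numerical stability
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def conditional_power(y, x, n1, n2):
    # (10.32): binomial tail with p fixed at the observed rate x / n1
    p = x / n1
    return sum(comb(n2, i) * p ** i * (1 - p) ** (n2 - i)
               for i in range(y, n2 + 1))

def predictive_power(y, x, n1, n2):
    # Beta-binomial tail: the binomial tail averaged over the
    # posterior Beta(x + 1, n1 - x + 1) of p
    return sum(comb(n2, i) * beta_fn(x + i + 1, n2 + n1 - x - i + 1)
               for i in range(y, n2 + 1)) / beta_fn(x + 1, n1 - x + 1)

cp = conditional_power(3, 3, 10, 10)   # about 0.617
pp = predictive_power(3, 3, 10, 10)    # about 0.633
```

Setting y = 0 recovers the whole beta-binomial distribution, so the predictive probability sums to 1, which gives a quick sanity check on the closed form.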

10.3 Bayesian Optimal Adaptive Designs

In practice, different adaptations and choices of priors, with many possible probabilistic outcomes (good or bad), could lead to different types of adaptive designs. How to select an efficient adaptive design among these designs has become an interesting question to investigators. In this section, we propose evaluating a so-called utility index for choosing a Bayesian optimal adaptive design. The utility index is an indicator of patients' health outcomes. A Bayesian optimal design is a design that has maximum expected utility under financial, time, and other constraints.

For illustration purposes, we apply the approach to choosing a Bayesian optimal design among three commonly considered two-arm designs in pharmaceutical research and development. The three designs include the classical approach of two separate two-arm trials (i.e., a two-arm phase II design followed by a two-arm phase III design) and two different seamless phase II/III designs, which are group sequential designs with the O'Brien-Fleming boundary and the Pocock boundary, respectively. For each design, we calculate the utility index and weight it by its prior probability to obtain the expected utility for the design. The Bayesian optimal design is then the one with maximum expected utility. For convenience, we consider three scenarios of prior knowledge, which are given in Table 10.1.

Assume that there is no dose selection issue (i.e., the dose has been determined by the toxicity and biomarker responses in early studies). For the classical approach of two separate two-arm designs, we consider a phase IIb trial and a phase III trial (assuming that one phase III study is sufficient for regulatory approval). For the phase II study, we assume δ = 0.2, one-sided α = 0.1, and power = 0.8. Thus, the total sample size required is

Table 10.1 Prior Knowledge

Scenario   Effect Size   Prior Prob.
1          0             0.2
2          0.1           0.2
3          0.2           0.6


Table 10.2 Characteristics of Classic Phase II and III Designs

Scenario, i   Effect Size   Prior Prob. π   Prob. of Continue to Phase III, Pc   Phase III Power, P3
1             0             0.2             0.1                                  0.025
2             0.1           0.2             0.4                                  0.639
3             0.2           0.6             0.9                                  0.996

n1 = 450. For the phase III trial, we assume

δ = 0.2(0) + 0.2(0.1) + 0.6(0.2) = 0.14,

which was calculated from Table 10.1. Furthermore, assuming α = 0.025 (one-sided) and power = 0.9, the total sample size required is n = 2144. If the phase II study does not show statistical significance, the phase III trial will not be conducted. Note that this rule is not always followed in practice. The probability of continuing to phase III is the weighted continuation probability given in Table 10.2, i.e.,

Pc = Σ_{i=1}^{3} Pc(i) π(i) = 0.2(0.1) + 0.2(0.4) + 0.6(0.9) = 0.64.

Thus, the expected sample size for the phase II and phase III trials as a whole is given by

N = (1 − Pc) n1 + Pc n = (1 − 0.62)(450) + 0.62(2144) ≈ 1500.

The overall expected power is then given by

P = Σ_{i=1}^{3} Pc(i) π(i) P3(i) = (0.2)(0.1)(0.025) + (0.2)(0.4)(0.639) + (0.6)(0.9)(0.996) = 0.59.

In conclusion, the classical approach of separate phase II and phase III designs (i.e., a phase II design followed by a phase III design) has an overall power of 59% with an expected total (combined) sample size of 1500. On the other hand, for the seamless phase II/III designs, we assume


Table 10.3 Characteristics of Seamless Phase II/III Design with OB Boundary

Scenario, i   Effect Size   Prior Prob. π   Nexp   Power
1             0             0.2             1600   0.025
2             0.1           0.2             1712   0.46
3             0.2           0.6             1186   0.98

(i) δ = 0.14, (ii) one-sided α = 0.025, (iii) power = 0.90, (iv) the O'Brien-Fleming efficacy stopping boundary, and (v) a symmetric futility stopping boundary. Suppose there is one planned interim analysis when 50% of the patients are enrolled, so that the design has one interim analysis and one final analysis. Thus, we have:

• The sample sizes for the two analyses are 1,085 and 2,171, respectively.
• The sample size ratio between the two groups is 1.
• The maximum sample size for the design is 2,171.
• Under the null hypothesis, the expected sample size is 1,625.
• Under the alternative hypothesis, the expected sample size is 1,818.

The decision rules are specified as follows. At stage 1, accept the null hypothesis if z1 < 0, reject the null hypothesis if z1 ≥ 2.79, and otherwise continue. At stage 2, accept the null hypothesis if z < 1.97 and reject the null hypothesis if z ≥ 1.97. The stopping probabilities at the first stage are given by

• Stopping for H0 when H0 is true is 0.5000,
• Stopping for Ha when H0 is true is 0.0026,
• Stopping for H0 when Ha is true is 0.0105,
• Stopping for Ha when Ha is true is 0.3142.

The operating characteristics are summarized in Table 10.3. Furthermore, the average total expected sample size can be obtained as

Σ π(i) Nexp(i) = 0.2(1600) + 0.2(1712) + 0.6(1186) = 1374,


Table 10.4 Characteristics of Seamless Phase II/III Design with Pocock Boundary

Scenario, i   Effect Size   Prior Prob. π   Nexp   Power
1             0             0.2             1492   0.025
2             0.1           0.2             1856   0.64
3             0.2           0.6             1368   0.996

and the average power is given by

Σ π(i) Power(i) = 0.2(0.025) + 0.2(0.46) + 0.6(0.98) = 0.69.

Now, consider another type of adaptive seamless phase II/III trial design using Pocock's efficacy stopping boundary and the symmetric futility stopping boundary. Based on the same parameter specifications, i.e., α = 0.025, power = 0.9, mean difference = 0.14, and standard deviation = 1, we have:

• The sample sizes for the two analyses are 1,274 and 2,549, respectively.
• The sample size ratio between the two groups is 1.
• The maximum sample size for the design is 2,549.
• Under the null hypothesis, the expected sample size is 1,492.
• Under the alternative hypothesis, the expected sample size is 1,669.

The decision rules are specified as follows. At stage 1, accept the null hypothesis if the p-value > 0.1867 (i.e., z1 < 0.89), reject the null hypothesis if the p-value ≤ 0.0158 (i.e., z1 ≥ 2.149), and otherwise continue. At stage 2, accept the null hypothesis if the p-value > 0.0158 (i.e., z < 2.149) and reject the null hypothesis if the p-value ≤ 0.0158. The stopping probabilities at the first stage are given by

• Stopping for H0 when H0 is true is 0.8133,
• Stopping for Ha when H0 is true is 0.0158,
• Stopping for H0 when Ha is true is 0.0538,
• Stopping for Ha when Ha is true is 0.6370.

Other operating characteristics are summarized in Table 10.4.


Table 10.5 Comparison of Classic and Seamless Designs

Design    Average Nexp   Average Power   Expected Utility
Classic   1500           0.59            $0.515B
OB        1374           0.69            $0.621B
Pocock    1490           0.73            $0.656B

Furthermore, the average total expected sample size can be obtained as

Σ π(i) Nexp(i) = 0.2(1492) + 0.2(1856) + 0.6(1368) = 1490,

and the average power is given by

Σ π(i) Power(i) = 0.2(0.025) + 0.2(0.64) + 0.6(0.996) = 0.73.

In addition, we may compare these designs from a financial perspective. Assume that the per-patient cost of the trial is about $50K and that the value of regulatory approval, before deducting the cost of the trial, is $1B. For simplicity, potential time savings are not included in the calculation. We consider the following expected utility:

Expected utility = (average power)($1B) − (average Nexp)($50K).

Table 10.5 summarizes the comparison of the classical separate phase II and phase III design and the two seamless phase II/III designs using the O'Brien-Fleming and Pocock boundaries.

As can be seen from Table 10.5, Pocock's design is the best among the three designs based on either power or expected utility.
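The comparison in Table 10.5 can be reproduced from the average powers and average expected sample sizes derived above; a sketch taking the value of approval as $1B and the per-patient cost as $50K:

```python
# (average power, average expected sample size) for each candidate design
designs = {
    "Classic": (0.59, 1500),
    "OB":      (0.69, 1374),
    "Pocock":  (0.73, 1490),
}

APPROVAL_VALUE = 1_000_000_000   # $1B value of regulatory approval
COST_PER_PATIENT = 50_000        # $50K cost per patient

def expected_utility(avg_power, avg_n_exp):
    return avg_power * APPROVAL_VALUE - avg_n_exp * COST_PER_PATIENT

utilities = {d: expected_utility(p, n) for d, (p, n) in designs.items()}
best = max(utilities, key=utilities.get)   # the Bayesian optimal design
```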

10.4 Concluding Remarks

The Bayesian approach has several advantages in pharmaceutical development. First, it provides the opportunity to continually update the information/knowledge regarding the test treatment under study. Second, the Bayesian approach is a decision-making process that ties specifically to a particular trial, a clinical development program, and a company's portfolio in pharmaceutical development. In practice, regulatory agencies require that the type I error rate be controlled, with many externally mandated restrictions when implementing a Bayesian design, which might limit the application of Bayesian methods in clinical trials. In essence, once Bayesian methods become more familiar to clinical scientists, they will face fewer externally mandated restrictions.

In the past several decades, the process of pharmaceutical development has been criticized for not being able to bring promising and safe compounds to the marketplace in a timely fashion. As a result, there is increasing demand from political bodies and consumer groups to make drug development more efficient, safer, and faster. However, there is a concern that we may abandon fundamental scientific principles of pharmaceutical (clinical) research and development. Alternatively, it is suggested that a Bayesian approach be used because it will lead to more rapid and more economical drug development without sacrificing good science. However, the use of the Bayesian approach in pharmaceutical development is not widely accepted by regulatory agencies such as the U.S. FDA, although it has been used more in certain therapeutic areas of medical device development. It should be noted that the Bayesian approach is useful for some diseases, such as cancer, in which there is a burgeoning number of biomarkers available for modeling the disease's progress. These biomarkers will enable a patient's disease progression to be monitored more accurately. Consequently, a more accurate assessment of the patient's outcome can be made. In recent years, trials in early phases of clinical development (e.g., phases I or II) have become increasingly Bayesian, especially in the area of oncology. Moreover, strategic planning and portfolio management, such as formal utility assessment and decision-making processes, in some pharmaceutical companies are becoming increasingly Bayesian. The use of the Bayesian approach in various phases of pharmaceutical development will become evident to decision makers in the near future.


CHAPTER 11

Clinical Trial Simulation

Clinical trial simulation is a process that uses computers to mimic the conduct of a clinical trial by creating virtual patients and extrapolating (or predicting) clinical outcomes for each virtual patient based on pre-specified models (Li and Lai, 2003). The primary objective of clinical trial simulation is multi-fold. First, it is to monitor the conduct of the trial, project outcomes, anticipate problems, and recommend remedies before it is too late. Second, it is to extrapolate (or predict) clinical outcomes beyond the scope of the previous studies from which the existing models were derived, using modeling techniques. Third, it is to study the validity and robustness of the trial under various assumptions of study designs. Clinical trial simulation is often conducted to verify (or confirm) the models depicting the relationships between inputs such as dose, dosing time, patient characteristics, and disease severity and clinical outcomes such as changes in signs and symptoms or adverse events within the study domain. In practice, clinical trial simulation is often used at the planning stage of a clinical trial to predict potential clinical outcomes under different assumptions and various design scenarios, for better planning of the actual trial.

Clinical trial simulation is a powerful tool in pharmaceutical development. The concept of clinical trial simulation is very intuitive and easy to implement. In practice, clinical trial simulation is often considered a useful tool for evaluating the performance of a test treatment under a model with complicated situations. It can achieve this goal with minimal assumptions while controlling the type I error rate effectively. It can also be used to visualize the dynamic trial process, from patient recruitment, drug distribution, treatment administration, and pharmacokinetic processes to biomarker development and clinical responses. In this chapter, we review the application of clinical trial simulation in both early and late phases of pharmaceutical development.

In the next section, the framework of a clinical trial simulation is briefly outlined. Section 11.2 provides an overview of clinical trial simulations commonly employed in early phase clinical development, while Section 11.3 discusses some theoretical and practical issues that are commonly encountered when conducting clinical trial simulations in late phase clinical development. Section 11.4 introduces a software product developed by CTriSoft Intl for clinical trial simulation, with a demonstration. A number of examples concerning early and late phase development are given in Section 11.5. Brief concluding remarks are given in the last section.

11.1 Simulation Framework

The framework of clinical trial simulation is rather simple. It consists of the trial design, study objectives (hypotheses), model, and statistical tests. For the trial design, critical design features such as (i) a parallel or crossover design, (ii) a balanced or unbalanced design, (iii) the number of treatment groups, and (iv) the adaptation algorithms need to be clearly specified. Hypotheses such as testing for equality, superiority, or non-inferiority/equivalence can then be formulated for achieving the study objectives. A statistical model is then needed to generate virtual patients and extrapolate (or predict) clinical outcomes. We can then evaluate the performance of the test treatment by studying the statistical properties of the tests derived under the null and alternative hypotheses.

More specifically, we begin a clinical trial simulation by choosing a statistical model under a valid trial design with various assumptions according to the trial setting. We then simulate the trial by creating virtual patients and generating the clinical outcomes for each virtual patient based on the model specifications under the null hypothesis H0, repeating this a large number of times (say m times). For each simulation run, we calculate the test statistic. The m test statistic values numerically constitute a distribution of the test statistic. Similarly, we repeat the process to simulate the trial under the alternative hypothesis Ha m times. The m test statistic values obtained represent the distribution of the test statistic under the alternative hypothesis. These two distributions can be used to determine the critical region for a given α level of significance, the p-value for given data, and the corresponding power for the given critical region. To provide a better understanding, Figure 11.1 provides a flowchart of the general simulation framework.

Note that the computer simulation starts with generating data under the null hypothesis. The data are often generated from a simple distribution, such as a normal distribution for continuous variables, a binary distribution for discrete variables, or an exponential distribution for time-to-event data. The generation of simulated data occurs only once per simulation run. The adaptation algorithm is usually applied later in the case of a sequential trial design or an N-adjustable design (sample size re-estimation). In some cases, such as response-adaptive

Figure 11.1 Simulation framework. (The flowchart shows the loop, repeated over the n simulation runs: trial design specification and formulation of the objective as a hypothesis test; generation of data under the null and under the alternative hypothesis; application of the adaptation algorithm; calculation of the test statistic; accumulation of the distributions of the test statistic under H0 and Ha; determination of the critical region based on α; and determination of the p-value and power.)

randomization, the randomization may have to be generated step by step. It should be noted that, under the above-mentioned framework, other operating characteristics such as the stopping boundary and conditional probability can also be obtained.
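The loop just described can be sketched in a few lines. Below is a minimal, non-adaptive two-arm example with normally distributed responses; the effect size, sample size, and number of runs (m = 2000) are illustrative assumptions.

```python
import random
from statistics import mean

def run_trial(delta, n_per_arm, rng):
    # Create virtual patients and compute the two-sample z statistic
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    treated = [rng.gauss(delta, 1.0) for _ in range(n_per_arm)]
    return (mean(treated) - mean(control)) / (2.0 / n_per_arm) ** 0.5

def simulate(delta, n_per_arm=100, m=2000, seed=1):
    rng = random.Random(seed)
    return sorted(run_trial(delta, n_per_arm, rng) for _ in range(m))

null_stats = simulate(delta=0.0)            # distribution under H0
alt_stats = simulate(delta=0.4, seed=2)     # distribution under Ha

critical = null_stats[int(0.95 * len(null_stats))]   # empirical 5% critical value
power = mean(z >= critical for z in alt_stats)       # empirical power
```

The empirical critical value should be close to the theoretical z-value of 1.645, and the empirical power close to the analytical power for this effect size; the agreement improves as m grows.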

11.2 Early Phases Development

Clinical trial simulation has been widely used in early phase development, such as pharmacokinetic and pharmacodynamic (PK/PD) modeling, assessment of QT/QTc prolongation, and optimal dose-finding strategies (Kimko and Duffull, 2003). To illustrate the application of simulation in early phase clinical trials, we focus on an early phase oncology study (e.g., a dose-escalation trial) because (i) typically only a very small number of patients are involved, due to the potentially high risk of toxicity of the test drug (how to obtain a precise estimate of the dose-toxicity relationship is challenging to the investigator); (ii) the traditional escalation methods are ad hoc and lack scientific or statistical justification (the reliability of the traditional escalation method is always a concern to the investigator); and (iii) many approaches based on clinical trial simulation have been proposed in the literature. See, for example, O'Quigley, Pepe, and Fisher (1990), Crowley (2001), CTriSoft Intl (2002), and Babb and Rogatko (2004). Our intention is not only to review or compare these methods and make a recommendation, but also to use several typical methods as vehicles to illustrate the application of clinical trial simulation.

In a typical early phase dose-escalation trial, we often begin with treatment at a low dose that is very likely to be safe. Then, small cohorts of patients are treated at progressively higher doses until drug-related toxicity reaches a pre-determined level. The primary objectives are not only to determine the maximum tolerated dose (MTD), but also to characterize the dose limiting toxicity (DLT) of the test treatment under investigation. In what follows, the key study parameters that must be specified when conducting a simulation of a dose-escalation trial are briefly described.

11.2.1 Dose limiting toxicity (DLT) and maximum tolerated dose (MTD)

The definitions of dose limiting toxicity (DLT) and maximum tolerated dose (MTD) need to be specified in a simulation of a dose-escalation trial. As indicated in Chapter 5, drug toxicity is considered tolerable if the toxicity is acceptable, manageable, and reversible. A dose limiting toxicity (DLT) is defined based on the Common Terminology Criteria (CTC). For example, any adverse event from the CTC categories of grade 3 or higher related to treatment is considered a DLT. The maximum tolerated dose (MTD) is defined as the dose level at which the DLT occurs with at least a certain frequency, e.g., one-third.

11.2.2 Dose-level selection

Next, we need to select the initial dose for the simulation. As discussed in Chapter 5, the commonly used starting dose is the dose at which 10% mortality (LD10) occurs in mice. The subsequent dose levels are usually selected based on the multiplicative set xi = f_{i−1} x_{i−1} (i = 2, . . . , k), where fi is the dose-escalation factor. Two commonly used sequences of dose-escalation factors are the Fibonacci numbers (2, 1.5, 1.67, 1.60, 1.63, 1.62, 1.62, . . . ) and the modified Fibonacci numbers (2, 1.65, 1.52, 1.40, 1.33, 1.33, . . . ). Note that the highest dose level in the trial should be selected such that it covers the biologically active dose.


11.2.3 Sample size per dose level

In principle, the number of patients to be treated per dose level should be small enough to limit potential safety issues, but large enough to allow the investigators to determine the optimal dose levels. In practice, recommendations vary between one and eight patients per dose level. It is suggested that three to six patients be treated per lower dose level, or a minimum of three per dose level with a minimum of five near the MTD. This information is necessarily provided for the conduct of the clinical trial simulation.

11.2.4 Dose-escalation design

For the clinical trial simulation, we need to specify which dose-escalation design will be used. Commonly employed dose-escalation designs include (i) the traditional dose-escalation design and (ii) the two-stage dose-escalation design. As indicated in Chapter 5, the traditional dose-escalation design utilizes the traditional escalation rules (TER) or strict traditional escalation rules (STER). The two-stage dose-escalation design utilizes a two-stage escalation algorithm (TSER), where in the first stage only a single patient is treated at each dose level. When a DLT is observed, the traditional "3 + 3" rule is applied.

If the continual reassessment method (CRM) is to be used, the dose-toxicity model and the prior distribution of its parameters should be specified. In practice, the logistic model p(x) = [1 + b exp(−ax)]^{−1} is often utilized, where p(x) is the probability of toxicity associated with dose x, and a and b are positive parameters to be determined. The updated dose-toxicity model is then used to determine the dose level for the next patient.
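A simulation of the traditional "3 + 3" escalation rule gives a feel for how the operating characteristics of such designs are evaluated; the true dose-toxicity probabilities below are illustrative assumptions, and the rule is sketched in a simplified form.

```python
import random

def three_plus_three(p_tox, rng):
    """Simulate one trial; return the index of the declared MTD (-1 if none)."""
    for d, p in enumerate(p_tox):
        dlt = sum(rng.random() < p for _ in range(3))       # first cohort of 3
        if dlt == 1:
            dlt += sum(rng.random() < p for _ in range(3))  # expand to 6
        if dlt >= 2:
            return d - 1           # too toxic: MTD is the previous dose level
        # 0/3 DLTs, or at most 1/6: escalate to the next dose level
    return len(p_tox) - 1          # highest dose reached without excess toxicity

rng = random.Random(7)
p_tox = [0.05, 0.10, 0.20, 0.35, 0.50]   # assumed true DLT rate per dose level
mtds = [three_plus_three(p_tox, rng) for _ in range(2000)]
dist = {d: mtds.count(d) / len(mtds) for d in range(-1, len(p_tox))}
```

Repeating the trial many times yields the distribution of the declared MTD across dose levels, which is exactly the kind of operating characteristic that such a simulation is meant to reveal.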

11.3 Late Phases Development

Although clinical trial simulation has been used for several decades, it has not received much attention until recently (Maxwell, Domenet, and Joyce, 1971; Parmigiani, 2002; Kimko and Duffull, 2003). Since clinical trial simulation is useful for evaluating very complicated models with minimal assumptions, it has become a very popular tool for searching for an optimal trial design in late-phase clinical development. In this section, we focus on simulation for the adaptive designs discussed in the previous chapters of this book, because the statistical theory of an adaptive design can be very complicated, with analytical solutions available only under strong assumptions; for some complicated adaptive designs, analytical solutions might not exist at all. When conducting a simulation for adaptive trial designs, the key adaptation rules must be specified. These rules are briefly described below.

11.3.1 Randomization rules

In clinical trials, it is desirable to randomize more patients to the superior treatment groups. This can be accomplished by increasing the probability of assigning a patient to a treatment group as the evidence of a higher response rate in that group accumulates. The response-adaptive randomization rule can be the randomized play-the-winner rule or the utility-offset model. Response-adaptive randomization requires unblinding the data, which may not be feasible in real time; there is often a delayed response, i.e., the next patient must be randomized before the responses of previous patients are known. Therefore, it is practical to unblind the data several times during the trial, i.e., group sequential response-adaptive randomization, instead of fully sequential adaptive randomization.

11.3.2 Early stopping rules

It is desirable to stop a trial when the efficacy or futility of the test drug becomes obvious during the trial. To stop a trial prematurely, we provide a threshold for the number of subjects randomized and at least one of the following:

(1) Utility rules: The difference in response rate between the most responsive group and the control group exceeds a threshold, and the corresponding two-sided 95% naive confidence interval lower bound exceeds a threshold.

(2) Futility rules: The difference in response rate between the most responsive group and the control is lower than a threshold, and the corresponding two-sided 90% naive confidence interval upper bound is lower than a threshold.
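The two checks above can be sketched as follows; the thresholds, the minimum sample size, and the naive normal-approximation intervals are illustrative choices, not prescriptions from the text:

```python
import math

def naive_ci(p1, n1, p2, n2, z):
    """Naive normal-approximation CI for the difference in response rates."""
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

def stopping_decision(p_ctl, n_ctl, p_best, n_best,
                      eff_diff=0.1, eff_bound=0.0,
                      fut_diff=0.05, fut_bound=0.1, min_n=350):
    """Illustrative utility/futility stopping check (all thresholds are examples)."""
    if n_ctl + n_best < min_n:
        return "continue"
    diff = p_best - p_ctl
    lo95, _ = naive_ci(p_ctl, n_ctl, p_best, n_best, 1.96)   # two-sided 95%
    _, hi90 = naive_ci(p_ctl, n_ctl, p_best, n_best, 1.645)  # two-sided 90%
    if diff > eff_diff and lo95 >= eff_bound:
        return "stop for efficacy"
    if diff < fut_diff and hi90 < fut_bound:
        return "stop for futility"
    return "continue"
```

In a simulation loop, this decision would be evaluated at each unblinding point, and the trial record would note whether and why it stopped early.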

11.3.3 Rules for dropping losers

In addition to response-adaptive randomization, one can also improve the efficiency of a trial design by dropping some inferior groups (losers) during the trial. To drop a loser, we provide two thresholds for (1) the maximum difference in response rate between any two dose levels, and (2) the corresponding two-sided 90% naive confidence lower bound. We may choose to retain all the treatment groups without dropping a loser, and/or to retain the control group with a certain randomization rate for the purpose of statistical comparisons between the active groups and the control.

11.3.4 Sample size adjustment

Sample size determination requires anticipation of the expected treatment effect size, defined as the expected treatment difference divided by its standard deviation. It is not uncommon that the initial estimate of the effect size turns out to be too large or too small, which consequently leads to an under-powered or over-powered trial. Therefore, it is desirable to adjust the sample size according to the effect size for an on-going trial. The sample size adjustment is determined by a power function of the treatment effect size, i.e.,

N = N0 (E0max/Emax)^a,        (11.1)

where N is the newly estimated sample size, N0 is the initial sample size, and a is a constant. The effect size Emax is defined as

Emax = (pmax − p1)/σ,   σ^2 = p(1 − p),   p = (pmax + p1)/2,

where pmax and p1 are the maximum response rate and the control response rate, respectively, and E0max is the initial estimate of Emax.
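Equation (11.1) with the definitions above can be transcribed directly (the numerical planning values in the test below are illustrative):

```python
import math

def adjust_sample_size(n0, e0_max, p_max, p1, a=2.0):
    """N = N0 * (E0max / Emax)^a  (Eq. 11.1), with
    Emax = (pmax - p1)/sigma, sigma^2 = p(1-p), p = (pmax + p1)/2."""
    p_bar = (p_max + p1) / 2.0
    sigma = math.sqrt(p_bar * (1.0 - p_bar))
    e_max = (p_max - p1) / sigma
    return n0 * (e0_max / e_max) ** a

# If the observed rates match the planning assumption, N stays near N0;
# a smaller observed effect size inflates N, a larger one shrinks it.
```

With a = 2, the adjustment is inversely proportional to the squared observed effect size, mirroring the usual sample-size formula for comparing two rates.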

11.3.5 Response-adaptive randomization

Conventional randomization refers to any randomization procedure with a constant treatment allocation probability, such as simple randomization. Unlike conventional randomization, response-adaptive randomization is a randomization in which the probability of allocating a patient to a treatment group is based on the responses of the previous patients. The purpose is to improve the overall response rate in the trial. There are many different algorithms, such as the randomized play-the-winner (RPW), the utility-offset model, and the maximum utility model. The generalized randomized play-the-winner rule, denoted by RPW(n1, n2, . . . , nk; m1, m2, . . . , mk), can be described as follows.

• Step 1: Place ni balls of the ith color (corresponding to the ith treatment) into an urn (i = 1, 2, . . . , k), where k is the number of treatment groups. Initially, there are N = ∑ ni balls in the urn.


Table 11.1 Bias in Rate Due to Adaptive Randomization

Dose Level           1     2     3     4     5

Target rate          0.5   0.1   0.2   0.7   0.55
Number of patients   130   14    23    362   171
Observed rate        0.47  0.08  0.16  0.70  0.53
Standard deviation   0.10  0.08  0.10  0.05  0.08
Bias in rate         0.03  0.02  0.04  0.00  0.02

N = 700; 10,000 simulations.

• Step 2: Randomly choose a ball from the urn. If it is the ith color, assign the next patient to the ith treatment group.

• Step 3: Add mi balls of the ith color to the urn for each response observed in the ith treatment group. This creates more chances for choosing the ith treatment.

• Step 4: Repeat Steps 2 and 3.

When ni = n and mi = m for all i, we simply write RPW(n, m) for RPW(n1, n2, . . . , nk; m1, m2, . . . , mk).
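The urn scheme above can be sketched as follows; this is an illustrative implementation that assumes each response is observed immediately, whereas, as noted earlier, real trials often have delayed responses:

```python
import random

def rpw_assign(counts, rng):
    """Draw a ball from the urn: returns group i with probability counts[i]/sum."""
    r = rng.random() * sum(counts)
    for i, c in enumerate(counts):
        r -= c
        if r < 0:
            return i
    return len(counts) - 1

def simulate_rpw(true_rates, n_patients, n0=1, m=1, seed=None):
    """RPW(n0, m): start with n0 balls per group, add m balls per response."""
    rng = random.Random(seed)
    k = len(true_rates)
    counts = [n0] * k            # balls of each color in the urn
    n_assigned = [0] * k
    responses = 0
    for _ in range(n_patients):
        i = rpw_assign(counts, rng)
        n_assigned[i] += 1
        if rng.random() < true_rates[i]:  # response observed immediately
            counts[i] += m                # Step 3: add m balls of color i
            responses += 1
    return n_assigned, responses
```

Over many simulated trials, the group with the higher true response rate accumulates balls and therefore patients, which is exactly the source of the estimation bias discussed next.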

Note that the commonly used estimators that are based on the assumption of independent samples are often biased under adaptive randomization such as RPW(1,1). See Table 11.1.

11.3.6 Utility-offset model

To have a high probability of achieving the target patient distribution among the treatment groups, the probability of assigning a patient to a group should be proportional to the corresponding predicted or observed response rate minus the proportion of patients that have already been assigned to the group, i.e.,

Pr(i) = 1/k,                              if SR = 0,
Pr(i) = (1/c) max{0, Ri/SR − ni/N},       if SR > 0,        (11.2)

where the normalization factor

c = ∑i (Ri/SR − ni/N),


Table 11.2 Comparison of Simulation Results

N     Dose Level        1     2     3     4     5

      Target rate       0.02  0.07  0.37  0.73  0.52
50    Predicted rate    0.02  0.07  0.40  0.65  0.41
30    Predicted rate    0.02  0.07  0.40  0.63  0.40
20    Predicted rate    0.02  0.07  0.37  0.58  0.38

ni is the number of patients that have been randomized to the ith group, Ri is the observed response rate for the ith group, SR = R1 + R2 + · · · + Rk, k is the number of dose groups, and N is the total estimated number of patients in the trial. The maximum utility model for adaptive randomization always assigns the next patient to the group that has the highest response rate based on the current estimate of either the observed or the model-based predicted response rate.
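A sketch of the allocation rule in Eq. (11.2) follows; note that, so the probabilities are guaranteed to sum to one, this illustration normalizes by the sum of the clipped (non-negative) offsets:

```python
def utility_offset_probs(R, n, N):
    """Allocation probabilities in the spirit of the utility-offset model (Eq. 11.2).

    R: observed (or predicted) response rate per group
    n: patients randomized to each group so far
    N: total estimated number of patients in the trial
    """
    k = len(R)
    S = sum(R)
    if S == 0:
        return [1.0 / k] * k          # no responses yet: equal allocation
    raw = [max(0.0, R[i] / S - n[i] / N) for i in range(k)]
    c = sum(raw)                      # normalize over the positive offsets
    if c == 0:
        return [1.0 / k] * k
    return [w / c for w in raw]
```

Groups that are already over-represented relative to their share of the total response get a clipped (zero) allocation probability, steering new patients toward under-represented, well-responding groups.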

11.3.7 Null-model versus model approach

It is interesting to compare the model and null-model approaches. When the sample size is larger than 20 per group, there is no obvious advantage of using the model-based method with respect to precision and accuracy (Table 11.2). Therefore, the null-model approach will be used in the subsequent simulations.

11.3.8 Alpha adjustment

The α-adjustment is required when (i) there are multiple comparisons with more than two groups involved, (ii) there are interim looks, i.e., early stopping for futility or efficacy, and (iii) there is a response-dependent sampling procedure such as response-adaptive randomization or unblinded sample size re-estimation. When samples or observations from the trial are not independent, the response data are no longer normally distributed. Therefore, the p-value based on a normal distribution assumption should be adjusted; equivalently, the alpha should be adjusted if the p-value is not. For the same reason, the other statistical estimates based on the normal assumption should also be adjusted.


Table 11.3 Alpha Inflation/Deflation

Randomization   Inflated FWE α   Adjusted α

RPW(1,0)        0.146            0.015
RPW(1,1)        0.143            0.007

Note: Five groups with total N = 100. H0 rate = 0.5.

The adjusted alpha can be found easily using simulation (see the example in Table 11.3). Run simulations under the null hypothesis H0 to find the adjusted alpha for the pairwise comparisons such that the family-wise error rate = 0.025. Then run simulations under the alternative hypothesis Ha using the adjusted alpha to find the power for various sample sizes and other scenarios.
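The first step of this recipe can be sketched as follows; the naive two-proportion z-test and the quantile rule for picking the adjusted alpha are illustrative choices, not the software's exact method:

```python
import math
import random

def min_pairwise_pvalue(k, n, p0, rng):
    """One H0 trial: k-1 active groups vs. control, all with true rate p0.
    Returns the smallest one-sided p-value from a naive two-proportion z-test."""
    ctl = sum(rng.random() < p0 for _ in range(n)) / n
    smallest = 1.0
    for _ in range(k - 1):
        trt = sum(rng.random() < p0 for _ in range(n)) / n
        pbar = (ctl + trt) / 2.0
        se = math.sqrt(max(2.0 * pbar * (1.0 - pbar) / n, 1e-12))
        z = (trt - ctl) / se
        smallest = min(smallest, 0.5 * math.erfc(z / math.sqrt(2.0)))
    return smallest

def adjusted_alpha(null_pvalues, target_fwe=0.025):
    """Empirical target_fwe-quantile of the simulated minimum null p-values:
    rejecting whenever min p <= this value gives FWE ~ target_fwe."""
    s = sorted(null_pvalues)
    return s[max(int(target_fwe * len(s)) - 1, 0)]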

11.4 Software Application

In this section, we will introduce the application of clinical trial simulation through the software ExpDesign Studio® developed by CTriSoft Intl (2002), with demonstrations (www.CTriSoft.net).

11.4.1 Overview of ExpDesign Studio

ExpDesign Studio is an integrated environment for designing clinical trials. It is a user-friendly statistical software package that consists of five main components: Conventional Trial Design (CTD), Sequential Trial Design (STD), Multi-Stage Trial Design (MSTD), Dose-Escalation Trial Design (DETD), and Adaptive Trial Design (ATD). Conventional trial design is probably the most commonly used design in practice, and there are over 160 procedures for sample size calculation in CTD for various trial designs. STD covers a broad range of sequential trials, including methods for different numbers of experimental groups, different endpoints (e.g., mean, proportion, and survival), different hypotheses (e.g., tests for difference and equivalence), and different stopping boundaries. MSTD provides three optimal designs, namely the minimax (MinMax) design, the MinExp design, and the maximum utility (MaxUtility) design; these designs minimize the maximum sample size, minimize the expected sample size, and maximize the utility index, respectively. DETD provides researchers with an efficient way to search for the optimal design for dose-escalation trials with different criteria by means of computer simulations. It includes the traditional escalation rules, restricted escalation rules, two-stage escalation algorithms, and customized algorithms with a variety of dose intervals such as the Fibonacci series, the modified Fibonacci series, and customized dose-interval series. It also provides three different toxicity models. ATD in ExpDesign Studio allows a user to simulate trials under specific adaptive designs. One can use response-adaptive randomization to assign more patients to superior treatment groups or drop the losers (the inferior groups). One may stop a trial prematurely to claim efficacy or futility based on the observed data, or modify the sample size based on the observed treatment difference. One may also conduct simulations using Bayesian or frequentist modeling approaches or a non-parametric approach. All design reports are generated through an automated procedure.

Figure 11.2 The ExpDesign integrated environment.

ExpDesign Studio covers many statistical tools required for designing a trial. It is helpful to get familiar with the functions of the icons on the toolbar. The black-and-white icons on the left-hand side of the toolbar are standard for all word processors. The first group of four color icons is the starting point to launch the four different types of designs: Conventional Trial Design, Sequential Trial Design, Multi-Stage Trial Design, and Dose-Escalation Trial Design (see Figures 11.2 and 11.3). Alternatively, one may click one of the four buttons in the ExpDesign™ Studio to start the corresponding design. The next set of three color icons is for launching Design Example, Computing Design Parameters, and generating Design Report. Following these are four color icons for the toolkits, including the Graphic Calculator, Distribution Calculator, Confidence Interval Calculator, and TipDay. One can move the mouse over any icon on the toolbar to see the Tiptext, which describes what the icon is for.

Figure 11.3 ExpDesign start window.

11.4.2 How to design a trial with ExpDesign Studio

To get started, we first double-click on the ExpDesign Studio icon. Then, in the ExpDesign start window, we select the trial design we wish to implement.

11.4.3 How to design a conventional trial

To design a conventional trial, we simply follow the steps below.

• Step 1: Click Conventional Trial Design on the toolbar.
• Step 2: Select the desired option from the Design Option Panel.


Figure 11.4 The conventional design window.

• Step 3: Select a method from the list of methods available.
• Step 4: Enter appropriate values for the selected design.
• Step 5: Click on Compute to calculate the required sample size.
• Step 6: Click on Report to view the report of the selected design.
• Step 7: Click on Print to print the desired output.
• Step 8: Click on Copy-Graph to copy the graph, and use Paste or Paste-Special under the Edit menu to paste it into other applications (Figure 11.4).

11.4.4 How to design a group sequential trial

To design a group sequential trial, we simply follow the steps below.

• Step 1: Click on Group Sequential Design on the toolbar.
• Step 2: Select the desired option from the Design Option Panel.
• Step 3: Select a method from the list of methods available.
• Step 4: Enter appropriate values for the selected design.
• Step 5: Click on Compute to generate the design.
• Step 6: Click on Report to view the design report.


Figure 11.5 Sequential design window.

• Step 7: Click on Print to print the desired output.
• Step 8: Click on Copy-Graph to copy the graph, and use Paste or Paste-Special under the Edit menu to paste it into other applications.
• Step 9: Click Save to save the design specification or report (Figure 11.5).

11.4.5 How to design a multi-stage trial

To design a multi-stage trial, we simply follow the steps below.

• Step 1: Click on Multi-Stage Design on the toolbar.
• Step 2: Select the desired option from the Multi-Stage Design window, or open an existing design by clicking Open.
• Step 3: Enter appropriate values for the selected design in the textboxes.
• Step 4: Click on Compute to generate the valid designs.
• Step 5: Click on Report to view the report of the selected design.
• Step 6: Click on Print to print the desired output.
• Step 7: Click on Save to save the design specification or report (Figure 11.6).


Figure 11.6 Multi-stage design window.

11.4.6 How to design a dose-escalation trial

To design a dose-escalation trial, we simply follow the steps below.

• Step 1: Click Dose-Escalation Design on the toolbar.
• Step 2: Enter appropriate values for your design on the Basic Spec Panel.
• Step 3: Select the Dose-Response Model, Escalation Scheme, and Dose Interval Spec, or open an existing design by clicking Open.
• Step 4: Click on Compute to generate the simulation results.
• Step 5: Click on Report to view the design report.
• Step 6: Click on Print to print the desired output.
• Step 7: Click on Save to save the design specification or report (Figures 11.7 and 11.8).

When clicking on Customized Design for the Escalation Scheme (see Figure 11.7), a second window will pop up for further selection of the escalation rules. When done, click OK to return to the Dose-Escalation Design window.


Figure 11.7 Dose-escalation design window.

Figure 11.8 Dose-escalation customization window.


Figure 11.9 The adaptive design window.

11.4.7 How to design an adaptive trial

To design an adaptive trial, we simply follow the steps below.

• Step 1: Click on Adaptive Design on the toolbar.
• Step 2: Follow the steps specified in the Simulation Setup panel.
• Step 3: Specify parameters in each of the steps.
• Step 4: Click on Run to generate the simulation results.
• Step 5: Click on Report to view the design report.
• Step 6: Click on Print to print the desired output.
• Step 7: Click on Save to save the design specification or report (Figure 11.9).

11.5 Examples

In this section, several examples regarding the application of clinical trial simulation in the early and late phases of pharmaceutical development are provided. For early phase development, we will consider a phase I oncology study for dose escalation with different traditional escalation rules (TER) and the continual reassessment method (CRM). For late phase development, we will consider phase II oncology trials comparing two treatment groups under different adaptive design settings, such as an adaptive group sequential design, an adaptive-randomization design, an N-adjusted design (sample size re-estimation), a drop-the-loser design, and an adaptive dose-response design.

Table 11.4 Simulation Results from TER

Dose Level     1    2    3    4    5    6    7

Dose           30   45   68   101  151  228  342
DLT rate (%)   1    1.2  1.5  2.2  3.7  8.4  25
Mean n         3.1  3.1  3.1  3.2  3.5  5.0  5.0
MTD rate (%)   0    0    1    2    6    57   33

11.5.1 Early phases development

To illustrate the application of clinical trial simulation in the early phases of clinical development, we consider the example of a dose-escalation trial as described in the previous section. Suppose the study objective is to determine the maximum tolerable dose (MTD) of a newly developed test treatment. In what follows, the application of clinical trial simulation for dose escalation with different traditional escalation rules (TER) and the continual reassessment method (CRM) is discussed.

Example 11.1 Simulation example using TER and TSER For a planned dose-escalation trial, suppose we are interested in choosing an appropriate dose-escalation algorithm from the traditional escalation rule (TER) and the two-stage escalation rule (TSER) described earlier. Suppose that, based on results from pre-clinical studies, the toxicity (DLT rate) was estimated to be 1% at the starting dose of 30 mg/m2 (1/10 of the lethal dose). The DLT rate at the MTD was defined as 17%, and the MTD was estimated to be 300 mg/m2. The true dose toxicity was assumed to follow a logistic model, i.e.,

logit(p) = −4.952 + 0.011 × Dose,

where p is the probability of DLT (the DLT rate). There were 7 planned dose levels with a constant dose-increment factor of 1.5. Suppose only one dose-level de-escalation was allowed. Twenty thousand simulation runs were conducted using ExpDesign Studio (www.CTriSoft.net) for each of the two algorithms. The results are summarized in Tables 11.4 through 11.6.

Table 11.5 Simulation Results from TSER

Dose           30   45   68   101  151  228  342

Mean n         1.1  1.1  1.2  1.3  1.5  5.9  5.2
MTD rate (%)   0    0    0    0    2    69   29
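As a check on the setup, the assumed dose-toxicity model can be evaluated at the planned dose levels to recover the DLT rates used in the simulation (a sketch; the dose list follows the planned levels above):

```python
import math

def dlt_rate(dose):
    """Assumed true model for this example: logit(p) = -4.952 + 0.011 * Dose."""
    logit = -4.952 + 0.011 * dose
    return 1.0 / (1.0 + math.exp(-logit))

planned_doses = [30, 45, 68, 101, 151, 228, 342]   # increment factor ~1.5
rates_pct = [round(100.0 * dlt_rate(d), 1) for d in planned_doses]
```

Evaluating the model this way confirms roughly 1% DLT at the starting dose and roughly the target 17% DLT rate near the stated MTD of 300 mg/m2.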

The true MTD lies somewhere between dose levels 6 and 7. As can be seen from Tables 11.4 and 11.5, the two-stage design requires fewer patients at the dose levels lower than the MTD. Table 11.6 indicates that there is a large gain in the expected sample size with TSER, i.e., 17 compared with 26 in TER. On average, both methods cause 2 DLTs each, and 5 patients are treated above the MTD. The TSER increases precision slightly compared with TER (66 versus 70 in the dispersion of simulated MTDs). In the given scenario, both TER and TSER underestimate the true MTD (by about 13%). This bias can be reduced using the continual reassessment method (CRM), which is discussed below.

Example 11.2 Simulation example using CRM We repeat the above clinical trial simulation using the Bayesian continual reassessment method (CRM). The logistic model used for the dose-toxicity relationship is given by

Pr(x = 1) = [1 + 150 exp(−ax)]^−1,

where the parameter a follows a uniform prior distribution over the range [0, 0.3]. The next patient is assigned to the dose level whose predicted response rate is closest to the target DLT rate of 0.17, as defined earlier for the MTD. No response delay is considered. Due to safety concerns, no dose jump is allowed. Ten thousand simulations were conducted using ExpDesign Studio® version 2.0, where 21 subjects were used for each trial (simulation). The results are summarized in Table 11.7.

Table 11.6 Comparison of TER and TSER

Method      MTD  Simulated MTD  σMTD  N   DLTs  n

3 + 3 TER   300  257            70    26  2     5
Two-stage   300  258            66    17  2     5.2

σMTD = dispersion of simulated MTDs, N = expected sample size, n = patients treated above the MTD.

Table 11.7 Simulation Results with CRM

Dose Level          1    2    3    4    5     6     7

Dose                30   45   68   101  151   228   342
PDLT (%)            1    1.2  1.5  2.2  3.7   8.4   25
Predicted PDLT (%)  0.9  1.1  1.4  2.2  4.1   10.3  30
σDLT (%)            1    1    0.3  0.6  1.5   5.2   15
n                   1    1    1    1    1.24  9.24  6.5

PDLT = DLT rate; the predicted PDLT and σDLT are the predicted rate and its standard deviation; n = number of patients.

In this example, the CRM produces excellent predictions of the DLT rates. One of the advantages of the CRM is that it produces the posterior parameter distribution and the predicted probability of DLT for each dose level (in fact, for any dose), and thus allows us to select an unplanned dose level as the MTD. In the current case, the true MTD is 300 mg/m2 with a DLT rate PDLT = 0.17, which is an unplanned dose level. As long as the dose-response model is appropriately selected, bias can be avoided. This is an advantage compared with the TER or TSER discussed in the previous example. The simulations also indicate that the number of DLTs per trial is 2.5, and the number of patients treated at doses higher than the MTD is 6.5 per trial. These numbers are larger than those for TER and TSER, which can be viewed as a trade-off for the bias reduction.
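The dose-selection step of this CRM example can be sketched as follows; the discrete grid prior and the one-level escalation cap (standing in for the "no dose jump" restriction) are illustrative choices:

```python
import math

def predicted_dlt_rates(doses, a_grid, weights, b=150.0):
    """Posterior-averaged DLT rate at each dose under Pr(x=1) = [1 + b*exp(-a*x)]^-1."""
    return [sum(w / (1.0 + b * math.exp(-a * x)) for a, w in zip(a_grid, weights))
            for x in doses]

def next_dose_level(rates, current_level, target=0.17):
    """Level whose predicted rate is closest to the target DLT rate,
    capped at a one-level escalation (no dose jumps)."""
    best = min(range(len(rates)), key=lambda i: abs(rates[i] - target))
    return min(best, current_level + 1)
```

After each patient, the posterior weights would be updated and the predicted rates recomputed, so the recommended level tracks the accumulating toxicity data.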

11.5.2 Late phases development

To investigate the effect of the adaptations, we will compare the classic, group sequential, and adaptive designs with regard to their operating characteristics using computer simulations. In what follows, each example represents a different trial design. Examples 11.3 through 11.6 assume a phase II oncology trial with two treatment groups. The primary endpoint is tumor response (PR and CR), and the estimated response rates for the two groups are 0.2 and 0.3, respectively. We use simulation to calculate the sample size required, given a one-sided alpha = 0.05 and power = 80%. Note that a classical fixed sample size design with 600 subjects will have a power of 81.4% at one-sided α = 0.025. The total number of responses per trial is 150 based on 10,000 simulations.

Table 11.8 Group Sequential Design

Dose Level           Control  Active

Number of patients   244      244
Response rate        0.2      0.3
Observed rate        0.198    0.303
Standard deviation   0.028    0.033

Example 11.3 Flexible design with sample size re-estimation In practice, since the power of a trial depends upon the estimated effect size, it is desirable to have a design that allows modification of the sample size at some point during the conduct of the trial. Let us design a trial that allows sample size re-estimation and evaluate the robustness of the design. In order to control the family-wise error rate (FWE) at 0.025, the alpha must be adjusted to 0.023, which can be done by computer simulation under the null hypothesis. The average sample size is 960 under the null hypothesis. Using the algorithm for sample size re-estimation in (11.1), where E0max = 0.1633 and a = 2, the design has 92% power with an average sample size of 821.5. Now assume that the true response rates are not 0.2 versus 0.3 for the two treatment groups; instead, they are 0.2 and 0.28, respectively. We want to know what power the flexible design will have. Keep everything else the same (including E0max = 0.1633), but change the response rates to 0.2 and 0.28 for the two dose levels and run the simulation again. It turns out that the design has 79.4% power with an average sample size of 855. Given the two response rates 0.2 and 0.28, a design with a fixed sample size of 880 has a power of 79.4%; hence there is a saving of 25 patients by using the flexible design. If the response rates are 0.2 and 0.3, the required sample size for 92.1% power is 828 with the fixed sample size design, which means that the flexible design saves 6–7 subjects. In short, a flexible design adjusts the sample size to preserve power when the observed effect size differs from the expected one, whereas a traditional design with a fixed sample size simply gains or loses power depending on the observed effect size.

Example 11.4 Design with play-the-winner randomization To randomize more patients to the superior treatment group using response-adaptive randomization, the same example with a sample size of 600 subjects is used. The commonly used response-adaptive randomization is RPW(1,1), i.e., one initial ball for each group and one additional ball of the corresponding color for each response. The data will be unblinded after every 100 new patients. The adjusted alpha is found to be 0.02. The design has 77.3% power with an average sample size of 600. On average, there are 223 subjects in dose level 1 and 377 in dose level 2. The expected number of responses is 158. Compared with the classical design, the RPW design gains 8 responses, but loses 4% in power.

Example 11.5 Group sequential design with one interim analysis Group sequential designs are very popular in clinical trials. For the two-arm trial, the adjusted alpha is found to be 0.024. The maximum number of subjects is 700. The trial will stop if 350 or more patients are randomized and one of the following criteria is met: (1) the efficacy (utility) stopping criterion: the maximum difference in response rate between any dose and the control is larger than 0.1, with the lower bound of the two-sided 95% naive confidence interval greater than or equal to 0.0; or (2) the futility stopping criterion: the maximum difference in response rate between any dose and dose level 1 is less than 0.05, with the upper bound of the one-sided 95% naive confidence interval less than 0.1.

The average total number of subjects for each trial is 488. The total number of responses per trial is 122. The probability of correctly predicting the most responsive dose level is 0.988 based on observed rates. Under the alternative hypothesis, the probability of early stopping for efficacy is 0.505, and the probability of early stopping for futility is 0.104. The power for testing the treatment difference is 0.825.

Example 11.6 Adaptive design permitting early stopping and sample size re-estimation Sometimes it is desirable to have a design permitting both early stopping and sample size modification.

With an initial sample size of 700 subjects, a grouping size of 350, and a maximum sample size of 1000, the one-sided adjusted alpha is found to be 0.05. The simulation results are presented as follows:

The maximum sample size is 700. The trial will stop if 350 or more patients are randomized and one of the following criteria is met: (1) the efficacy (utility) stopping criterion: the maximum difference in response rate between any dose and the control is larger than 0.1, with the lower bound of the two-sided 95% naive confidence interval greater than or equal to 0.0; or (2) the futility stopping criterion: the maximum difference in response rate between any dose and the control is less than 0.05, with the upper bound of the one-sided 95% naive confidence interval less than 0.1. The sample size will be re-estimated when 350 subjects have been randomized. When the null hypothesis is true (p1 = p2 = 0.2), the average total number of subjects for each trial is 398.8. The probability of early stopping for efficacy is 0.0096, and the probability of early stopping for futility is 0.9638.

Table 11.9 Design with Early Stopping and n-Re-estimation

Dose Level           Control  Active

Number of patients   271      272
Response rate        0.2      0.3
Observed rate        0.198    0.303
Standard deviation   0.028    0.032

When the alternative hypothesis is true (p1 = 0.2, p2 = 0.3), the average total number of subjects for each trial is 543.5. The total number of responses per trial is 136 (Table 11.9). The probability of correctly predicting the most responsive dose level is 0.985 based on observed rates. The probability of early stopping for efficacy is 0.6225, and the probability of early stopping for futility is 0.1546. The power for testing the treatment difference is 0.842.

Examples 11.7 through 11.9 consider the same scenario: a six-arm study with response rates of 0.5, 0.4, 0.5, 0.6, 0.7, and 0.55 for dose levels 1 through 6, respectively.

Example 11.7 Conventional design with multiple treatment groups With 800 subjects, a 0.5 response rate under H0, and a grouping size of 100, we found the one-sided adjusted α to be 0.0055. The total number of responses per trial is 433. The probability of correctly predicting the most responsive dose level is 0.951 based on observed rates. The power for testing the maximum effect comparing any dose level to the control is 80%. The powers for comparing each of the 5 dose levels to the control are 0, 0.008, 0.2, 0.796, and 0.048, respectively.

Example 11.8 Response-adaptive design with multiple treatment groups To further investigate the effect of Randomized Play-the-Winner randomization RPW(1,1), a design with 800 subjects, a grouping

Table 11.10 Design with RPW(1,1) under H0

Dose Level        1      2      3      4      5      6

Response rate    0.5    0.5    0.5    0.5    0.5    0.5
Observed rate    0.50   0.49   0.49   0.49   0.49   0.49


234 ADAPTIVE DESIGN METHODS IN CLINICAL TRIALS

Table 11.11 Design with RPW(1,1) Under Ha

Dose Level         1      2      3      4      5      6

No. of subjects   200     74    100    133    176    116
Response rate     0.5    0.4    0.5    0.6    0.7    0.55
Observed rate     0.50   0.39   0.49   0.59   0.70   0.54

size of 100, and a response rate of 0.2 under the null hypothesis is simulated (Table 11.10). The one-sided adjusted α is found to be 0.016. Using this adjusted alpha and response rates 0.5, 0.4, 0.5, 0.6, 0.7, and 0.55 for dose levels 1 through 6, respectively, the simulation indicates that the designed trial has 86% power and 447 responders per trial on average (Table 11.11). In comparison to 80% power and 433 responders for the design with simple randomization RPW(1,0), the adaptive randomization is superior in both power and number of responders. The simulation results also indicate that there are biases in the estimated mean response rates in all dose levels except dose level 1, where a fixed randomization rate is used.

The average total number of subjects for each trial is 800. The total number of responses per trial is 446.8. The probability of correctly predicting the most responsive dose level is 0.957 based on observed rates. The power for testing the maximum effect comparing any dose level to the control (dose level 1) is 0.861 at a one-sided significance level (alpha) of 0.016. The powers for comparing each of the 5 dose levels to the control are 0, 0.008, 0.201, 0.853, and 0.051, respectively.
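For reference, RPW(1,1) randomization can be sketched as a simple urn model. This is our own toy simplification (it randomizes all arms, whereas the example above keeps a fixed allocation rate for dose level 1):

```python
# Toy Randomized Play-the-Winner RPW(1,1) urn: start with one ball per arm;
# each observed response adds one ball for that arm, skewing future allocation.
import random

def rpw_trial(true_rates, n_subjects, seed=1):
    rng = random.Random(seed)
    urn = [1] * len(true_rates)                   # RPW(1,1) initial composition
    counts = [0] * len(true_rates)
    responses = 0
    for _ in range(n_subjects):
        arm = rng.choices(range(len(true_rates)), weights=urn)[0]
        counts[arm] += 1
        if rng.random() < true_rates[arm]:        # simulate a binary response
            urn[arm] += 1
            responses += 1
    return counts, responses

counts, responses = rpw_trial([0.5, 0.4, 0.5, 0.6, 0.7, 0.55], 800)
```

Arms with higher response rates accumulate more balls and therefore tend to receive more subjects, which is the source of both the extra responders and the estimation bias noted above.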

Example 11.9 Adaptive design with dropping the losers Implementing the mechanism of dropping losers can also improve the efficiency of a design. With 800 subjects, a grouping size of 100, a response rate of 0.2 under the null hypothesis, and a fixed randomization rate in dose level 1 at 0.25, an inferior group (loser) will be dropped if the maximum difference in response between the most effective group and the least effective group (loser) is greater than 0 with the lower bound of the one-sided 95% naive confidence interval greater than or equal

Table 11.12 Bias in Rate with Dropping Losers Under H0

Dose Level        1      2      3      4      5      6

Response rate    0.5    0.5    0.5    0.5    0.5    0.5
Observed rate    0.50   0.46   0.46   0.46   0.46   0.46


Table 11.13 Bias in Rate with Dropping Losers under Ha

Dose Level         1      2      3      4      5      6

No. of subjects   200     26     68    172    240     95
Response rate     0.5    0.4    0.5    0.6    0.7    0.55
Observed rate     0.50   0.37   0.46   0.57   0.69   0.51

to 0. Using the simulation, the adjusted alpha is found to be 0.079. From the simulation results in Tables 11.12 and 11.13, biases can be observed with this design. The design has 90% power with 467 responders. The probability of correctly predicting the most responsive dose level is 0.965 based on observed rates. The powers for comparing each of the 5 dose levels to the control (dose level 1) are 0.001, 0.007, 0.205, 0.889, and 0.045, respectively. The design is superior to both RPW(1,0) and RPW(1,1).

Example 11.10 Dose-response trial design The trial objective is to find the optimal dose with the highest response rate. There are 5 dose levels and 30 planned subjects in each simulation (Table 11.14). The hyper-logistic model is defined with the parameters a3 ∈ [20, 100] and a4 ∈ [0, 0.05]. The RPW(1,1) is used for the randomization. The simulation results show that the probability of correctly predicting the most responsive dose level is 0.992 by the model and only 0.505 based on observed rates.

11.6 Concluding Remarks

From classic design to group sequential design to adaptive design, eachstep forward increases in complexity and at the same time improves theefficiency of clinical trials as a whole. Adaptive designs can increase the

Table 11.14 Dose-Response Trial Design

Dose Level          1       2       3       4       5

No. of subjects     15      30      50      85     110
Response rate      0.2     0.3     0.6     0.7     0.5
Observed rate      0.193   0.294   0.593   0.691   0.489
Predicted rate     0.098   0.181   0.406   0.802   0.379
σ_obs              0.185   0.209   0.221   0.204   0.226
σ_prd              0.074   0.073   0.104   0.151   0.056


number of responses in a trial and provide more benefits to the patients in comparison to the classic design. With sample size re-estimation, an adaptive design can preserve the power even when the initial estimates of the treatment effect and its variability are inaccurate. In the case of a multiple-arm trial, dropping inferior arms or response-adaptive randomization can improve the efficiency of a design dramatically. Finding analytic solutions for adaptive designs is theoretically challenging. However, computer simulation makes it easier to achieve an optimal adaptive design. It allows a wide range of test statistics as long as they are monotonic functions of the treatment effect. Adjusted alphas and p-values due to response-adaptive randomization and other adaptations with multiple comparisons can be determined easily using computer simulations. The bias in point estimation with adaptive designs has not been completely resolved by computer simulation. However, the bias can be ignored in practice by using a proper grouping size (cluster) such that there are only a limited number of adaptations (<8).

Simulations have many applications in clinical trials. In the early phases of clinical development, with many uncertainties (variables), clinical trial simulation can be used to assess the impact of these variables. Clinical trial simulation with the Bayesian approach can produce desirable operating characteristics that answer questions raised in the early development phase, such as the posterior distribution of toxicity. There are a few cautions regarding clinical trial simulation. The quality of simulation results depends heavily on the quality of the pseudo-random numbers. Most built-in random number generators in computer languages and software tools are of poor quality; they should not be used directly unless the underlying algorithm is known to be sound. Implementing a high-quality random number generator requires only a dozen or so lines of computer code (Press et al., 2002; Gentle, 1998), as implemented in ExpDesign Studio®. The common steps for conducting a clinical trial simulation include: defining the objectives; analyzing the problem; assessing the scope of the work and proposing the time and resources required to complete the task; examining the assumptions; obtaining and validating the data source, if applicable; nailing down evaluation criteria for the various trial designs/scenarios; selecting an appropriate software tool; outlining the computer algorithms for the simulation; implementing and validating the algorithms; proposing scenarios to simulate; conducting the simulations; interpreting the results and making recommendations; and, last but not least, addressing the limitations of the performed simulations.
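In Python, for example, the caution about generator quality can be addressed by explicitly choosing a documented, modern generator rather than an unspecified built-in one; NumPy's default_rng uses the PCG64 bit generator:

```python
# Use NumPy's Generator (PCG64 by default), seeded for reproducibility,
# instead of an unspecified built-in random number generator.
import numpy as np

rng = np.random.default_rng(seed=12345)

# e.g., simulate binary responses for 1000 subjects with a 0.2 response rate
responses = rng.binomial(n=1, p=0.2, size=1000)
observed_rate = responses.mean()
```

Seeding the generator also makes simulation results reproducible, which helps when the algorithms must be validated as part of the steps listed above.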

We have demonstrated that clinical trial simulation can be used for various complex designs. By comparing the operating characteristics of


each design provided by clinical trial simulation, we are able to choose an optimal design or development strategy. Computer simulation is commonly seen in the statistics literature related to clinical trials, mostly for power calculations and data analyses. We will see more pharmaceutical companies using clinical trial simulation in their clinical development planning to streamline the drug development process, increase the probability of success, and reduce the cost and time-to-market. Ultimately, clinical trial simulation will bring the best treatment to the patients. When conducting clinical trial simulation for an adaptive design, the following is recommended in order to protect the integrity of the trial: Specify in the trial protocol (i) the type of adaptive design, (ii) details of the adaptations to be used, (iii) the estimator for the treatment effect, and (iv) the hypotheses and the test statistics. Run the simulation under the null hypothesis and construct the distribution of the test statistic. To estimate the sample size for the adaptive design, run the simulation under the alternative hypothesis and construct the distribution of the test statistic. Sensitivity analyses are suggested to simulate potential major protocol deviations that would impact the validity of the trial simulations. To achieve an optimal design, a Bayesian or frequentist-Bayesian hybrid adaptive approach should be used. There are several simulation tools available on the Web. For the adaptive designs discussed in this section, ExpDesign Studio® can be used.
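The recommendation to construct the null distribution of the test statistic by simulation can be sketched as follows. This is an illustrative toy, not ExpDesign Studio's algorithm: simulate the maximum dose-versus-control z-statistic under H0 and read the adjusted critical value off the empirical 97.5th percentile.

```python
# Simulate the null distribution of the max-z over k dose-vs-control
# comparisons and take the empirical quantile as the adjusted critical value.
import random
from math import sqrt

def max_z_under_h0(p0, n, k, rng):
    """Largest z among k dose-vs-control comparisons when all rates equal p0."""
    se = sqrt(2 * p0 * (1 - p0) / n)            # naive SE at the null rate
    ctrl = sum(rng.random() < p0 for _ in range(n)) / n
    return max((sum(rng.random() < p0 for _ in range(n)) / n - ctrl) / se
               for _ in range(k))

rng = random.Random(0)
stats = sorted(max_z_under_h0(0.5, 100, 5, rng) for _ in range(2000))
critical_value = stats[int(0.975 * len(stats))]  # exceeds 1.96 due to multiplicity
```

Because the five comparisons share the control arm, the simulated critical value is larger than the single-comparison value of 1.96, which is exactly the kind of multiplicity adjustment the text says is easy to obtain by simulation.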


CHAPTER 12

Case Studies

As indicated earlier, the adaptations or modifications made to a clinical trial include prospective adaptation (by design), concurrent or ongoing adaptation (ad hoc), and retrospective adaptation (at the end of the trial and prior to database lock or unblinding). Different adaptations or modifications could lead to different adaptive designs with different levels of complexity. In practice, it is suggested that prospective (by design) adaptation be considered at the planning stage of a clinical trial (Gallo et al., 2006), although this may not reflect real practice in the conduct of clinical trials. Li (2006) pointed out that the use of adaptive design methods (either by-design adaptation or ad hoc adaptation) provides a second chance to re-design the trial after seeing data internally or externally at interim. However, it may introduce so-called operational biases such as selection bias, method of evaluation, early withdrawal, and modification of treatments. Consequently, the adaptation employed may inflate the type I error rate. Uchida (2006) also indicated that these biases could be translated into information (assessment) biases, which may include (i) patient enrollment, (ii) differential dropouts in favor of one treatment, (iii) crossover to the other treatment, (iv) protocol deviation due to additional medications/treatments, and (v) differential assessment of the treatments. As a result, it is difficult to interpret the clinically meaningful effect size for the treatments under study (see also Quinlan, Gallo, and Krams, 2006).

In the next section, basic considerations when implementing adaptive design methods in clinical trials are given. Successful experiences with the implementation of the adaptive group sequential design (see, e.g., Cui, Hung, and Wang, 1999), the adaptive dose-escalation design (see, e.g., Chang and Chow, 2005), and the adaptive seamless phase II/III trial design (see, e.g., Maca et al., 2006) are discussed in Section 12.2, Section 12.3, and Section 12.4, respectively.

12.1 Basic Considerations

As discussed in the early chapters of this book, the motivation behind the use of adaptive design methods in clinical trials includes (i) the flexibility in modifying trial and statistical procedures for identifying the best clinical


benefits of a compound under study and (ii) the efficiency in shortening the development time of the compound. In addition, adaptive designs provide the investigator a second chance to re-design the trial with more relevant data observed (internally) or clinical information available (externally) at interim. The flexibility and efficiency are very attractive to investigators and/or sponsors. However, a major adaptation may alter trial conduct and consequently result in a biased assessment of the treatment effect. Li (2006) suggested a couple of principles for implementing adaptive designs in clinical trials: (i) adaptation should not alter trial conduct, and (ii) the type I error should be preserved. Following these principles, some studies with complicated adaptations may be more successful than others. In what follows, some basic considerations for implementing adaptive design methods in clinical trials are discussed.

12.1.1 Dose and dose regimen

Dose selection is an integral part of clinical development. An inadequate selection of dose for a large confirmatory trial could lead to a failure of the development of the compound under study. Traditional dose-escalation and/or dose de-escalation studies are not efficient. The objective of dose or dose-regimen selection is not only to select the best dose group but also to drop the least efficacious or unsafe dose groups with the limited number of patients available. Under this consideration, adaptive designs with appropriate adaptations in selection criteria and decision rules are useful.

12.1.2 Study endpoints

Maca et al. (2006) suggested that well-established and well-understood study endpoints or surrogate markers be considered when implementing adaptive design methods in clinical trials, especially when the trial is to learn about the primary endpoints to be carried forward into later-phase clinical trials. An adaptive design would not be feasible for clinical trials without well-established or well-understood study endpoints due to (i) the uncertainty of the treatment effect and (ii) the fact that a clinically meaningful difference cannot be determined.

12.1.3 Treatment duration

For a given study endpoint, treatment duration is critical in order to reach the optimal therapeutic effect. If the treatment duration is short relative to the time needed to enroll all patients planned for the study,


then an adaptive design such as a response-adaptive randomization design is feasible. On the other hand, if the treatment duration is too long, too many patients would be randomized during the period, which could result in unacceptable inefficiencies. In this case, it is suggested that an adaptive biomarker design be considered.

12.1.4 Logistical considerations

Logistical considerations relevant to the feasibility of adaptive designs in clinical trials include, but are not limited to, (i) drug management, (ii) site management, and (iii) procedural considerations. For costly and/or complicated dose regimens, drug packaging and drug supply could be a challenge to the use of adaptive design methods in clinical trials, especially when the adaptive design allows dropping the inferior dose groups. Site management involves the selection of qualified study sites and patient recruitment for the trial. For some adaptive designs, the recruitment rate is crucial to the success of the trial, especially when the intention of the trial is to shorten the development time. Procedural considerations concern the decision processes and the dissemination of information needed to maintain the validity and integrity of the trial.

12.1.5 Independent data monitoring committee

When implementing an adaptive design in a clinical trial, an independent data monitoring committee (DMC) is necessary for maintaining the validity and integrity of the clinical trial. A typical example is the implementation of an adaptive group sequential design, which can not only allow stopping a trial early due to safety and/or futility/efficacy, but can also address sample size re-estimation based on the review of unblinded data. In addition, the DMC conveys some limited information to investigators or sponsors about treatment effects, procedural conventions, and statistical methods, together with recommendations, so that the adaptive design methods can be implemented with less difficulty.

12.2 Adaptive Group Sequential Design

12.2.1 Group sequential design

Group sequential design is probably one of the most commonly used clinical trial designs in clinical research and development. As indicated in Chapter 6, the primary reasons for conducting interim analyses of


accrued data are probably (i) ethical considerations, (ii) administrative reasons, and (iii) economic constraints. Group sequential design is very attractive because it allows stopping a trial early due to (i) safety, (ii) futility, and/or (iii) efficacy. Moreover, it also allows adaptive sample size adjustment at interim, in either a blinded or unblinded manner, through an independent data monitoring committee (DMC).

12.2.2 Adaptation

The basic adaptation strategy for an adaptive group sequential design is that one or more interim analyses may be planned. In practice, it is desirable to stop a trial early if the test compound is found to be ineffective or unsafe. However, it is not desirable to terminate a trial early if the test compound is promising. To achieve these goals, data safety monitoring and interim analyses for efficacy must be performed. Note that how to control the overall type I error rate, and how to determine the treatment effect at which the trial should be powered at the time of the interim analyses, are the critical issues for this adaptation.

At each interim analysis, an adaptive sample size adjustment based on the unblinded interim results and/or external clinical information available at interim may be performed. In practice, at the planning stage of a clinical trial, a pre-study power analysis is usually conducted based on some initial estimates of the within- or between-patient variation and the clinically meaningful difference to be detected. This crucial information is usually not available, or it is available (e.g., data from small pilot studies) with a high degree of uncertainty (Chuang-Stein et al., 2006). Lee, Wang, and Chow (2006) showed that the sample size obtained based on estimates from small pilot studies is highly unstable. Thus, there is a need to adjust the sample size adaptively at interim. Mehta and Patel (2003) and Offen et al. (2006) also discussed other situations where sample size re-estimation at interim is necessary. The use of an independent data monitoring committee (DMC) would be the critical issue for this adaptation.

Other adaptations, such as adaptive hypotheses switching from a superiority trial to a non-inferiority trial, may be considered. In practice, interim results may indicate that the trial will never achieve statistical significance at the end of the trial. In this case, the sponsors may consider changing the hypotheses or study endpoints to increase the probability of success of the trial. A typical example is to switch from superiority hypotheses to non-inferiority hypotheses. At the end of the trial, the final analysis will be performed for testing non-inferiority rather than superiority. Note that superiority can still be tested after non-inferiority has


been established, without paying any statistical penalty, due to the closed testing procedure. Note that the determination of the non-inferiority margin would be the challenge for this adaptation.

12.2.3 Statistical methods

For the adaptation of interim analyses, the statistical methods described in Chapter 6 for controlling the overall type I error rate are useful. For the adaptation of sample size re-estimation, the method proposed by Cui, Hung, and Wang (1999), Fisher's combination of p-values, the error-function method, the inverse normal method, or a linear combination of p-values can be used. For the adaptation of switching hypotheses, the statistical methods discussed in Chapter 4 are useful. As indicated by Chuang-Stein et al. (2006), since the weighting of the normal statistics will not, in general, be proportional to the sample size for that stage, the method does not use the sufficient statistics (the unweighted mean difference and estimated standard deviation from the combined stages) for testing, and is therefore less efficient (Tsiatis and Mehta, 2003). Additional discussion on efficiency can be found in Burman and Sonesson (2006) and Jennison and Turnbull (2006a).
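As a concrete illustration of one of the methods named above, the inverse normal method combines stage-wise p-values via z_i = Φ⁻¹(1 − p_i) with pre-specified weights satisfying w1² + w2² = 1; the equal weights used below assume two stages of equal planned size.

```python
# Inverse normal combination of two stage-wise p-values (standard method).
from statistics import NormalDist

def inverse_normal_combination(p1, p2, w1, w2):
    nd = NormalDist()
    z = w1 * nd.inv_cdf(1 - p1) + w2 * nd.inv_cdf(1 - p2)
    return 1 - nd.cdf(z)                 # combined one-sided p-value

w = 0.5 ** 0.5                           # w1 = w2 = 1/sqrt(2)
p_combined = inverse_normal_combination(0.03, 0.04, w, w)   # about 0.005
```

Because the weights are fixed in advance rather than proportional to the realized stage sizes, the combined statistic is not based on the sufficient statistics, which is exactly the efficiency point raised by Tsiatis and Mehta (2003).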

12.2.4 Case study — an example

For illustration purposes, consider the example given in Cui, Hung, and Wang (1999). This example considers a phase III two-arm trial for evaluating the effect of a new drug for the prevention of myocardial infarction (MI) in patients undergoing coronary artery bypass graft surgery. It was estimated that a sample size of 300 patients per group would give 95% power for detecting a 50% reduction in incidence rate, from 22% to 11%, at the one-sided significance level of α = 0.025. Although the sponsor was confident about the incidence rate of 22% in the control group, they were not sure about the 11% incidence rate in the test group. Thus, an interim analysis was planned to allow for sample size re-estimation based on the observed treatment difference. The interim analysis was scheduled for when 50% of the patients were enrolled and had had their efficacy assessment. An adaptive group sequential design using the method of Fisher's combination of stage-wise p-values was considered. The decision rules were: at stage 1, stop for futility if the stage-wise p-value p1 > α0, and stop for efficacy if p1 ≤ α1; at the final stage, if p1 p2 ≤ Cα, claim efficacy; otherwise, claim futility. The stopping boundary was chosen from Table 7.4: the futility boundary α0 = 0.5, the efficacy stopping boundary α1 = 0.0102 at stage 1, and Cα = 0.0038 at


the final stage. The upper limit of the sample size is Nmax = 800 per group. The futility boundary was used to stop the trial in the case of a very small effect size, because in such a case, continuing the trial would result in an unrealistically large sample size, or in Nmax with insufficient power. The adaptive group sequential design would have 99.6% power when the incidence rates were 22% and 11%, and 80% power when the incidence rates were 22% and 16.5%.
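The decision rules of this example can be written directly as code (the boundary values α0 = 0.5, α1 = 0.0102, and Cα = 0.0038 are taken from the text):

```python
# Two-stage decision rules with Fisher's combination p1 * p2 at the final stage.
ALPHA0, ALPHA1, C_ALPHA = 0.5, 0.0102, 0.0038

def stage1_decision(p1):
    if p1 > ALPHA0:
        return "stop for futility"
    if p1 <= ALPHA1:
        return "stop for efficacy"
    return "continue to stage 2"

def final_decision(p1, p2):
    return "efficacy" if p1 * p2 <= C_ALPHA else "futility"
```

For instance, p1 = 0.2 continues to stage 2, and a stage-2 p-value of 0.01 then yields p1 · p2 = 0.002 ≤ 0.0038, so efficacy is claimed.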

At the interim analysis, based on data from 300 patients, an incidence rate of 16.5% was observed in the test group and 11% in the control group. If these were the true incidence rates, the power for the classical design would be about 40%. Under the adaptive group sequential design, the sample size was re-estimated to be 533 per group. If 16.5% and 11% are the true incidence rates, the conditional power is 88.6%.
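Conditional power itself can be obtained by simulation: fix the interim data, simulate only the remaining patients under assumed true rates, and count how often the final test rejects. The sketch below is illustrative only; the hypothetical event counts, the critical value z = 1.96 (one-sided α = 0.025), and the naive final z-test are our assumptions, not the book's exact computation.

```python
# Simulation-based conditional power for a two-arm trial with binary outcomes,
# where a lower incidence in the treatment group is beneficial.
import random
from math import sqrt

def conditional_power(events_trt, events_ctl, n_interim, n_final,
                      p_trt, p_ctl, z_crit=1.96, n_sims=1000, seed=7):
    rng = random.Random(seed)
    m = n_final - n_interim                     # remaining patients per group
    hits = 0
    for _ in range(n_sims):
        t = events_trt + sum(rng.random() < p_trt for _ in range(m))
        c = events_ctl + sum(rng.random() < p_ctl for _ in range(m))
        pt, pc = t / n_final, c / n_final
        se = sqrt(pt * (1 - pt) / n_final + pc * (1 - pc) / n_final)
        if se > 0 and (pc - pt) / se > z_crit:  # reject: treatment rate lower
            hits += 1
    return hits / n_sims

# hypothetical interim data: 20/150 events (treatment) vs. 33/150 (control)
cp = conditional_power(20, 33, 150, 533, p_trt=0.11, p_ctl=0.22)
```

The same function, evaluated over a grid of candidate final sample sizes, gives a simple way to find the smallest n_final achieving a target conditional power.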

Remarks Note that the trial was originally designed without allowing for sample size re-estimation. The sponsor requested sample size re-estimation and was rejected by the FDA. The trial eventually failed to demonstrate statistical significance. In practice, it is recommended that the adaptation for sample size re-estimation be considered in the study protocol, and that an independent data monitoring committee (DMC) be established to perform the sample size re-estimation based on the review of unblinded data at interim, to maintain the validity and integrity of the trial.

12.3 Adaptive Dose-Escalation Design

12.3.1 Traditional dose-escalation design

As discussed in Chapter 5, the traditional "3 + 3" escalation rule is commonly considered in phase I dose-escalation trials in oncology. The "3 + 3" rule is to enter three patients at a new dose level and then enter another three patients if a dose limiting toxicity is observed. The assessment of the six patients is then performed to determine whether the trial should be stopped at that level or the dose should be increased. The goal is to find the maximum tolerated dose (MTD). The traditional "3 + 3" escalation rule (TER) is not efficient with respect to the number of dose limiting toxicities and the estimation of the MTD. There is a practical need for a better design method that will reduce the number of patients and the number of DLTs, and at the same time provide a more precise estimate of the MTD. We will use the Bayesian continual reassessment method (CRM) to achieve these goals.
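A simplified simulation of the "3 + 3" rule makes its behavior easy to study. The rules coded here are one common simplification, assumed for illustration: escalate on 0/3 DLTs, expand the cohort to 6 on 1/3, and declare the previous level the MTD on 2 or more DLTs.

```python
# Toy "3 + 3" dose-escalation simulation over a vector of true DLT rates.
import random

def three_plus_three(dlt_rates, seed=3):
    rng = random.Random(seed)
    level = 0
    while level < len(dlt_rates):
        dlts = sum(rng.random() < dlt_rates[level] for _ in range(3))
        if dlts == 1:                       # expand the cohort to 6 patients
            dlts += sum(rng.random() < dlt_rates[level] for _ in range(3))
        if dlts >= 2:
            return level - 1                # MTD = previous level (-1: none)
        level += 1
    return len(dlt_rates) - 1               # highest tested level tolerated

# true DLT rates as in Table 12.1
mtd = three_plus_three([0.010, 0.012, 0.017, 0.026, 0.042, 0.071, 0.141, 0.309])
```

Running such a simulation many times yields the operating characteristics (average patients, average DLTs, probability of identifying each level as the MTD) that the case study below compares against the CRM.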


12.3.2 Adaptation

The basic adaptation strategy for an adaptive dose-escalation trial design is a change in the traditional escalation rule (TER). As discussed in Chapter 5, the traditional 3 + 3 dose-escalation rule is not efficient. As a result, an m + n dose-escalation rule may be considered, with some pre-specified selection criteria based on the dose limiting toxicity (DLT). Other adaptations that are commonly considered include the application of adaptation to design characteristics such as the selection of the starting dose, the determination of the dose levels, prior information on the maximum tolerable dose (MTD), the dose-toxicity model (Figure 12.1), the stopping rules, and the statistical methods.

12.3.3 Statistical methods

As indicated earlier, many methods, such as the assessment of dose response using multiple-stage designs (Crowley, 2001) and the continued reassessment method (CRM), are available in the literature for the assessment of dose-escalation trials. With the CRM, the dose-response relationship is continually re-assessed based on the accumulated data collected from the trial. The next patient who enters the trial is then assigned to the potential MTD level. This approach is more efficient than the usual TER with respect to the allocation of the MTD. However, the efficiency of CRM may be at risk due to delayed responses and/or a constraint on dose jumps in practice (Babb and Rogatko, 2004). Chang and Chow (2005) proposed an adaptive method that combines CRM and utility-adaptive randomization (UAR) for multiple-endpoint trials. The proposed UAR is an extension of response-adaptive randomization (RAR). Note that the CRM could be a Bayesian, a frequentist, or a hybrid frequentist-Bayesian approach. As pointed out by Chang and Chow (2005), this method has the advantage of achieving the optimal design by means of adaptation to the accrued data of an on-going trial. In addition, CRM can provide a better prediction of the dose-response relationship by selecting an appropriate model, as compared to a method based simply on the observed responses.

12.3.4 Case study — an example

A trial is designed to establish the dose-toxicity relationship and to identify the maximum tolerable dose (MTD) of a compound for the treatment of patients with metastatic androgen-independent prostate cancer. Based on pre-clinical data, the estimated MTD is about 400 mg/m2. The modified


[Figure: "Toxicity Model" — the probability of DLT (0.0 to 1.0) plotted against dose level (mg/m2), with the MTD marked at the target toxicity rate θ.]

Figure 12.1 Example of dose-toxicity model.

Fibonacci series is chosen for the dose levels (Table 12.1). Eight dose levels are considered in this trial, with the option of adding more dose levels if necessary. The initial dose level is chosen to be 30 mg/m2, at which about 10% of deaths (MELD10) occur in mice, after verification that no lethal and no life-threatening effects were seen in another species. The toxicity rate (i.e., the DLT rate) at the MTD defined for this indication/population is 17%.

We compare the operating characteristics of the traditional escalation rule (TER) design and the CRM design. In the CRM, the following logistic model is used:

    p = 1/(1 + 150 exp(−ax)),

where the prior distribution for the parameter a is flat over (0, 0.12). Using ExpDesign Studio®, the simulation results are as follows. The TER predicts the MTD at dose level 7 (the dose level closest to the true MTD from below) with a probability of 75%. The CRM predicted the MTD at dose level 7 with a probability of 100% (5000 out of 5000 simulations). The TER requires on average 18.2 patients and 2.4 DLTs per trial. The CRM requires only 12 patients, with 4 DLTs per trial.

Remarks A dose-escalation trial is an early phase trial with flexible adaptation for dose selection using CRM. It is suggested that the protocol


Table 12.1 Dose Levels and DLT Rates

Dose Level      1      2      3      4      5      6      7      8

Dose (mg/m2)    30     60     99     150    211    280    373    496
DLT rate        0.010  0.012  0.017  0.026  0.042  0.071  0.141  0.309

should be submitted to the regulatory agencies for review and approval prior to the initiation of the trial. The design characteristics, such as the starting dose, the dose levels, prior information on the MTD, the dose-toxicity model, the escalation rule, and the stopping rule, should be clearly stated in the study protocol. In practice, there may be situations where the next patient has enrolled before the response from the previous patient is obtained. In this case, the efficiency of the TER and the CRM may be reduced. There may also be a limitation on dose escape, which may also reduce the efficiency of the CRM. To investigate this, a simulation was conducted using ExpDesign Studio with cluster randomization of size 3 patients (i.e., 2 patients have enrolled before the response is obtained). In the simulation, we allow for one level of dose escape. Under this scenario, the CRM requires only 12 patients with 1.8 DLTs. It can be seen that, with the limitation on dose escalation, the average number of DLTs per trial is reduced from 4 to 2 without sacrificing precision. This is because with 12 patients there is enough precision to eliminate the precision loss due to the delayed response. Note that when we use the CRM, we need to do the modeling or simulation in real time so that the next dose level can be determined quickly.

12.4 Adaptive Seamless Phase II/III Design

12.4.1 Seamless phase II/III design

A seamless phase II/III trial design is a design that combines a traditional phase IIb trial and a traditional phase III trial into a single trial. As a result, the study objectives of a seamless design are the combination of the objectives that have traditionally been addressed in separate trials. This type of design closes the time gap that would have occurred between the two trials, which are traditionally conducted separately. Maca et al. (2006) define an adaptive seamless phase II/III design as a seamless phase II/III design that uses data from patients enrolled before and after the adaptation in the final analysis. Thus, the feasibility and/or efficiency of an adaptive seamless phase II/III trial design depends upon the adaptation employed. In practice, it is possible to combine studies into a seamless design within or across phases of clinical development. Seamless designs are useful in early clinical development since there are opportunities in the seamless transition between phase IIb (the learning phase) and phase III (the confirming phase).

12.4.2 Adaptation

The basic adaptation strategy for a seamless phase II/III trial design is to drop the inferior arms (or drop the losers) based on some pre-specified criteria at the learning phase. The best arms and the control arm are retained and advanced to the confirming phase. Other commonly employed adaptations include (i) an enrichment process at the learning phase of the trial and (ii) a change in treatment allocation at the confirming phase. The enrichment process is commonly employed to identify sub-populations that are most likely to respond to the treatment, or to suffer from adverse events, based on some pre-specified criteria on genomic biomarkers. Changing treatment allocation at the confirming phase not only assigns more patients to superior arms but also increases the probability of success of the trial. In practice, it is not uncommon to apply an adaptation to the primary study endpoint. For example, at the learning phase, the treatment effect may be assessed based on a short-term primary efficacy endpoint (or a surrogate endpoint or some biomarkers), while a long-term primary study endpoint is considered at the confirming phase. It should be noted that the impact on statistical inference should be carefully evaluated when implementing an adaptation in a seamless trial design.

12.4.3 Methods

For an adaptive seamless phase II/III trial design, if the trial is not stopped for lack of efficacy after the first stage, we proceed to the second stage. At the end of the second stage, we may calculate the second-stage p-value based upon the disjoint sample of the second stage only. The final analysis is conducted by combining the two p-values into a single test statistic using a pre-defined combination function. The method proposed by Sampson and Sill (2005) for dropping the losers in the normal case and the contrast test with the p-value combination method suggested by Chang, Chow, and Pong (2006) are useful. Note that a data-dependent combination rule should not be used after the first stage. It is suggested that a prior be considered at the planning phase.
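As a minimal illustration of the combination-function idea (not the Sampson and Sill or the Chang, Chow, and Pong procedure itself), Fisher's combination of two independent stage-wise p-values uses the fact that, under the null hypothesis, the statistic -2(ln p1 + ln p2) is chi-square distributed with 4 degrees of freedom:

```python
import math

def fisher_combination(p1, p2):
    """Combine two independent stage-wise p-values with Fisher's method.

    Under H0, -2 * (ln p1 + ln p2) follows a chi-square distribution with
    4 df, whose survival function has the closed form e^(-x/2) * (1 + x/2).
    """
    stat = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)
```

For example, two stage-wise p-values of 0.025 each combine to roughly 0.005, which would fall well below a final significance level of 0.025.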


Thall, Simon, and Ellenberg (1989) proposed a two-stage design for trials with binary outcomes. In their proposed two-stage design, the first stage is used to select the best treatment, and the second stage includes just the selected treatment together with the control. As indicated by Thall, Simon, and Ellenberg (1989), the inclusion of the control in the first stage is crucial because it allows results from that stage to be pooled with the data observed in the second stage. On the other hand, Schaid, Wieand, and Therneau (1990) considered a time-to-event endpoint. Their design not only allows stopping the trial early for efficacy after the first stage, but also allows more than one treatment to continue into the second stage. Stallard and Todd (2003) generalized these designs to allow for the possibility of more than two stages by using error-spending functions. The design is applicable to general endpoints such as normal, binary, ordinal, or survival endpoints. These methods are useful for adaptive seamless phase II/III trial designs.

Remarks  Note that the methods considered above are based on the same primary endpoint. As an alternative, Todd and Stallard (2005) considered an adaptive group sequential design that incorporates treatment selection based upon a short-term endpoint, followed by a confirmation stage comparing the selected treatment with the control in terms of a long-term primary endpoint.

12.4.4 Case study — some examples

In what follows, several examples are provided to illustrate the implementation of adaptive seamless phase II/III trial designs in clinical trials. The first three examples are adopted from Maca et al. (2006).

Example 12.1 Adaptive Treatment Selection  Similar to the study design described in Bauer and Kieser (1999), an adaptive seamless phase II/III two-stage design is considered to evaluate the efficacy and safety of a compound given in combination with methotrexate to patients with active rheumatoid arthritis (RA) who have had two or more inadequate responses to anti-TNF therapy (Maca et al., 2006). The objectives of this trial include treatment selection and efficacy confirmation. The primary clinical endpoint is ACR-20 (American College of Rheumatology 20 score) at 6 months. The first stage will be used for dose selection, while the second stage is for efficacy confirmation. At the end of the second stage, data obtained from the second stage and the relevant data from the first stage will be combined for the final analysis using Fisher's combination test at the significance level of 0.025. In this adaptive seamless phase II/III two-stage design, early stopping is allowed for futility, but not for efficacy.


Qualified subjects will be randomly assigned to five treatment arms (4 active and 1 placebo) in equal ratio. At the end of the first stage, a planned interim analysis will be conducted. The best active treatment arm and the placebo arm will be retained and advanced to the second stage for efficacy confirmation. The best treatment arm will be selected based on a set of pre-defined efficacy, safety, and immunogenicity criteria. Based on interim results, the following decisions will be made based upon the clinical endpoint in conjunction with information obtained from an early endpoint: (i) the best treatment group (safe and efficacious) will be advanced to the second stage; (ii) the least efficacious and/or unsafe treatment groups will be dropped; and (iii) the futility requirement will be evaluated. To ensure the validity and integrity of the trial design, two committees will be established: one is an external independent data monitoring committee (DMC) and the other is an internal Executive Decision Committee (EDC).

Example 12.2 Confirming Treatment Effect  A seamless phase II/III trial design is employed to confirm a substantial treatment effect in patients with neuropathic pain (Maca et al., 2006). The primary endpoint considered in this trial is the PI-NRS score change from baseline to the 8th week of treatment. Qualified patients will be randomly assigned to four treatment groups (three active doses and one placebo) in equal ratio, and two interim analyses are planned before the end of the enrollment period. One adaptation is to select the best dose and change the treatment allocation at the interim analyses. At the second interim analysis, one dose will be selected to continue for confirmation of the treatment effect. The final analysis will use a p-value combination test to confirm the superiority of the selected dose over placebo.

Compared to the traditional approach of a phase IIb trial with three doses and placebo based on a short-term endpoint (e.g., 2-week change from baseline) followed by a confirmatory phase III trial with a long-term (e.g., 8-week) endpoint, this adaptive design combines the two trials in one study with a single protocol and uses information on the long-term endpoint from patients in both the first (learning) and the second (confirmatory) phases of the study. Note that a longitudinal model for the primary endpoint over time (i.e., at 2 weeks, 4 weeks, 6 weeks, and 8 weeks) can be used to improve the estimate of the sustained treatment effect.

Example 12.3 Confirming Efficacy in a Sub-population  Maca et al. (2006) presented an example concerning a clinical trial using a two-stage adaptive design for patients with metastatic breast cancer. This trial has two objectives: first, to select a patient population at the first stage; second, to confirm efficacy based on the hazard ratio between treatment and control at the second stage. In addition, a genomic biomarker (expression level) was used. Patients were initially randomized into three treatment arms (two active and one control) in equal ratio. One interim analysis was planned for this study, when the trial reached approximately 60% of the targeted events. At the interim analysis, two sub-populations were defined based on whether each subject's biomarker expression level exceeded a pre-defined cutoff value. Data were analyzed for efficacy evaluation and safety assessment in both populations. A decision was made as to which patients would be advanced to the second stage. Moreover, futility was also assessed at this interim analysis. The final analysis was performed using the inverse normal p-value combination test for the selected population to confirm the superiority of the selected dose over control. In contrast to a conventional design with one phase IIb trial for population selection followed by a confirmatory phase III study, this trial design was able to achieve the same objectives within a single trial.
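The inverse normal p-value combination test mentioned above can be sketched as follows; the equal weights are an assumption for illustration (in practice the weights are pre-specified, typically from the planned stage-wise sample sizes):

```python
import math
from statistics import NormalDist

def inverse_normal_combination(p1, p2, w1=0.5, w2=0.5):
    """Inverse normal combination of two independent stage-wise p-values.

    The weights satisfy w1 + w2 = 1 and must be pre-specified. Under H0,
    the combined statistic z = sqrt(w1)*z1 + sqrt(w2)*z2 is standard
    normal, where z_i is the normal quantile of 1 - p_i.
    """
    nd = NormalDist()
    z = (math.sqrt(w1) * nd.inv_cdf(1.0 - p1)
         + math.sqrt(w2) * nd.inv_cdf(1.0 - p2))
    return 1.0 - nd.cdf(z)
```

For example, two stage-wise p-values of 0.05 each combine to about 0.01 with equal weights, reflecting the accumulation of evidence across the two stages.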

Example 12.4  The objective of this trial in patients with asthma is to confirm a sustained treatment effect, measured as FEV1 change from baseline to 1 year of treatment. Initially, patients are equally randomized to four doses of the new compound and placebo. Based on early studies, the estimated FEV1 changes at 4 weeks are 6%, 13%, 15%, 16%, and 17% (with a pooled standard deviation of 18%) for the placebo (dose level 0) and dose levels 1, 2, 3, and 4, respectively. One interim analysis is planned when 50% of patients have their short-term efficacy assessment (4-week change from baseline). The interim analysis will lead to either picking the winner (the arm with the best observed response) or early stopping for futility with a very conservative stopping boundary. The selected dose and placebo will be used at stage 2. The final analysis will use the sum of the stage-wise p-values from both stages. (Note that Fisher's combination of p-values was not used because it does not provide a design with futility stopping only. In other words, Fisher's combination is not efficient for an adaptive design with no early efficacy stopping.) The stopping boundaries are taken from Table 7.7: α1 = 0, β1 = 0.25, α2 = 0.2236.

The decision rule will be: if p1 > β1, stop the trial; if p1 ≤ β1, proceed to the second stage. At the final analysis, if p1 + p2 ≤ 0.2236, claim efficacy; otherwise, claim futility. Here p1 is the p-value from a contrast test based on the sub-sample from stage 1 (see Table 12.2 for the contrasts).
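The decision rule above can be written directly in code; this is an illustrative sketch of the two-stage rule with the quoted boundaries (α1 = 0, β1 = 0.25, α2 = 0.2236):

```python
def msp_decision(p1, p2=None, alpha1=0.0, beta1=0.25, alpha2=0.2236):
    """Two-stage decision rule based on the sum of stage-wise p-values.

    Stage 1: stop for efficacy if p1 <= alpha1 (impossible here since
    alpha1 = 0, i.e., no early efficacy stopping), stop for futility if
    p1 > beta1, otherwise continue. Stage 2: claim efficacy if
    p1 + p2 <= alpha2, otherwise claim futility.
    """
    if p1 <= alpha1:
        return "stop: efficacy"
    if p1 > beta1:
        return "stop: futility"
    if p2 is None:
        return "continue to stage 2"
    return "efficacy" if p1 + p2 <= alpha2 else "futility"
```

For instance, a stage-1 p-value of 0.10 continues to stage 2, and a stage-2 p-value of 0.10 then yields an efficacy claim since 0.10 + 0.10 = 0.20 ≤ 0.2236.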

With the maximum sample size of 120 per group, the power is 91.2% at an overall one-sided α level of 0.025. The expected total sample size is 240 and 353 under the null hypothesis and the alternative hypothesis, respectively. As mentioned earlier, the method controls the type I error under the global null hypothesis.

Table 12.2 Seamless Design

Arms                     0      1      2      3      4
4-week FEV1 change    0.06   0.12   0.13   0.14   0.15
Contrast             −0.54   0.12   0.13   0.14   0.15
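The stage-1 contrast test that produces p1 can be sketched as follows, under a normal approximation with a common standard deviation and equal per-arm sample size (both assumptions for illustration; the interim per-arm sample size of 60 below is hypothetical):

```python
import math
from statistics import NormalDist

def contrast_test_pvalue(means, contrasts, sd, n_per_arm):
    """One-sided p-value for a single contrast across k arms.

    Uses the normal approximation
        T = sum(c_i * xbar_i) / (sd * sqrt(sum(c_i^2) / n)),
    which is standard normal when the contrast of the true means is zero.
    """
    assert abs(sum(contrasts)) < 1e-9, "contrast coefficients must sum to 0"
    effect = sum(c * m for c, m in zip(contrasts, means))
    se = sd * math.sqrt(sum(c * c for c in contrasts) / n_per_arm)
    return 1.0 - NormalDist().cdf(effect / se)

# Observed 4-week FEV1 changes and contrasts from Table 12.2;
# the pooled standard deviation of 0.18 is as stated in Example 12.4.
p1 = contrast_test_pvalue(means=[0.06, 0.12, 0.13, 0.14, 0.15],
                          contrasts=[-0.54, 0.12, 0.13, 0.14, 0.15],
                          sd=0.18, n_per_arm=60)
```

Note that the contrast coefficients sum to zero by construction, so if all arms have the same true mean, the expected contrast is zero and the test statistic is centered at zero.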

12.4.5 Issues and recommendations

Adaptive seamless trial designs are very attractive to sponsors in pharmaceutical development. They help identify the best treatment in a more efficient way, with a certain accuracy and reliability, within a relatively short time frame. In practice, the following issues are commonly encountered when implementing a seamless trial design.

Clinical development time

As indicated earlier, the primary rationale behind the use of an adaptive seamless trial design is to shorten the development time. Thus, it is important to consider whether a seamless development program would accomplish the goal of reducing the development time. In practice, the time saving is clear if the seamless trial is the only pivotal trial required for regulatory submission. However, if the seamless trial is one of two pivotal trials required for registration, the second pivotal trial must also be completed in a timely fashion in order to shorten the overall development time. During the planning stage, the additional time required for the second pivotal trial must be included in the evaluation of the overall clinical development time.

Statistical inference

As indicated by Maca et al. (2006), data analysis for seamless trials may be problematic due to bias induced by the adaptation employed. This bias of the maximum likelihood estimate of the effect of the selected treatment over the control could lead to inaccurate coverage for the associated confidence interval. As a result, it is suggested that the test comparing the selected treatment with the control be adjusted to give a correct type I error rate. Estimates of the treatment effect must also be adjusted to avoid statistical bias and to produce confidence intervals with the correct coverage probability. Brannath, Koenig, and Bauer (2003) derived a set of repeated confidence intervals by exploiting the duality of hypothesis tests and confidence intervals. This approach, however, is strictly conservative if the trial stops at the interim analysis. Alternatively, Posch et al. (2005) proposed a more general approach: to use a different combination test for each null hypothesis. Stallard and Todd (2005) evaluated the bias of the maximum likelihood estimate and proposed a bias-adjusted estimate. Using a stage-wise ordering of the sample space, they also constructed a confidence region for the selected treatment effect.

Decision process

Seamless phase II/III trials usually involve critical decision making at the end of the learning phase. A typical approach is to establish an independent data monitoring committee (DMC) to monitor the ongoing trial. For an adaptive seamless phase II/III trial design, the decision process may require additional expertise not usually represented on DMCs, which may require a sponsor's input, or at least a sponsor's ratification of the DMC's recommendation. Maca et al. (2006) pointed out several critical aspects that are relevant to the decision process in seamless phase II/III designs. These critical aspects include (i) the composition of the decision board, (ii) the process for producing analysis results, (iii) sponsor representation, (iv) information inferable from the selection decision, and (v) regulatory perspectives.

For the composition of the decision board, it is not clear whether all of the study objectives or adaptations of a seamless trial at the learning phase should be addressed by the same board or by separate boards. There seems to be no universal agreement on which approach is correct. Maca et al. (2006) suggested that if it is decided that a single board would suffice, then, at a minimum, it should be strongly considered whether the composition of the board should be broadened to include individuals not normally represented on a DMC who have the proper perspective and experience in making the selection decision; for example, individuals with safety monitoring expertise may not have relevant experience in dose selection. On the other hand, if separate boards are used, then the members of the board making the selection decision should in general only review unblinded data at the selection point and should only see results relevant to the decision they are charged to make. For the process of producing analysis results, the results should be produced by an independent statistician and/or programmer, who should provide the results directly to the appropriate DMC for review.


For sponsor representation, as indicated in a recent FDA guidance, the following principles should be followed (FDA, 2005):

• Sponsor representatives who will participate in the recommendation, or be allowed to ratify the recommendation, are adequately distanced from trial activities; i.e., they do not have other trial responsibilities and should have limited direct contact with people who are involved in the day-to-day management of the trial.

• Sponsor representation is minimal to meet the needs; i.e., the smallest number of sponsor representatives who can provide the necessary perspective is involved, and these individuals see the minimum amount of unblinded information needed to participate in the decision process, and only at the decision point.

• Appropriate protections and firewalls are in place to ensure that knowledge is appropriately limited; e.g., procedures and responsibilities are clearly documented and understood by all parties involved, confidentiality agreements reflecting these are produced, secure data access and transfer processes are in place, etc.

For information inferable from the selection decision, as a general principle, knowledge regarding which treatment groups continue into the confirming phase should be perceived to provide only minimal information, without potentially biasing the conduct of the trial. Care should be taken to limit this information to personnel who might draw inferences from a particular selection decision. For regulatory perspectives, it is strongly recommended that a regulatory reviewer be consulted when implementing adaptive seamless trial designs in appropriate clinical development programs, in order to maintain the validity and integrity of the trial.


Bibliography

Arbuck, S. G. (1996). Workshop on phase I study design. Annals of Oncology, 7, 567–573.

Atkinson, A. C. (1982). Optimum biased coin designs for sequential clinical trials with prognostic factors. Biometrika, 69, 61–67.

Atkinson, A. C. and Donev, A. N. (1992). Optimum Experimental Designs. Oxford University Press, New York, New York.

Babb, J. S. and Rogatko, A. (2001). Patient specific dosing in a cancer phase I clinical trial. Statistics in Medicine, 20, 2079–2090.

Babb, J. S. and Rogatko, A. (2004). Bayesian methods for cancer phase I clinical trials. In Advances in Clinical Trial Biostatistics, Geller, Nancy L. (ed.). Marcel Dekker, New York, New York.

Babb, J., Rogatko, A., and Zacks, S. (1998). Cancer phase I clinical trials. Efficient dose escalation with overdose control. Statistics in Medicine, 17, 1103–1120.

Bandyopadhyay, U. and Biswas, A. (1997). Some sequential tests in clinical trials based on randomized play-the-winner rule. Calcutta Statistical Association Bulletin, 47, 67–89.

Banerjee, A. and Tsiatis, A. A. (2006). Adaptive two-stage designs in phase II clinical trials. Statistics in Medicine, in press.

Bauer, P. (1999). Multistage testing with adaptive designs (with discussion). Biometrie und Informatik in Medizin und Biologie, 20, 130–148.

Bauer, P. and Kieser, M. (1999). Combining different phases in development of medical treatments within a single trial. Statistics in Medicine, 18, 1833–1848.

Bauer, P. and Kohne, K. (1994). Evaluation of experiments with adaptive interim analysis. Biometrics, 50, 1029–1041.

Bauer, P. and Kohne, K. (1996). Evaluation of experiments with adaptive interim analyses. Biometrics, 52, 380 (Correction).

Bauer, P. and Konig, F. (2006). The reassessment of trial perspectives from interim data – A critical view. Statistics in Medicine, 25, 23–36.

Bauer, P. and Rohmel, J. (1995). An adaptive method for establishing a dose-response relationship. Statistics in Medicine, 14, 1595–1607.

Bechhofer, R. E., Kiefer, J., and Sobel, M. (1968). Sequential Identification and Ranking Problems. University of Chicago Press, Chicago, Illinois.

Berry, D. A. (2005). Introduction to Bayesian methods III: Use and interpretation of Bayesian tools in design and analysis. Clinical Trials, 2, 295–300.

Berry, D. A. and Eick, S. G. (1995). Adaptive assignment versus balanced randomization in clinical trials: A decision analysis. Statistics in Medicine, 14, 231–246.


Berry, D. A. and Fristedt, B. (1985). Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London.

Berry, D. A., Muller, P., Grieve, A. P., Smith, M., Parke, T., Blazek, R., Mitchard, N., and Krams, M. (2002). Adaptive Bayesian designs for dose-ranging drug trials. In Case Studies in Bayesian Statistics V. Lecture Notes in Statistics. Springer, New York, 162–181.

Berry, D. A. and Stangl, D. K. (1996). Bayesian Biostatistics. Marcel Dekker, New York, New York.

Birkett, N. J. (1985). Adaptive allocation in randomized controlled trials. Controlled Clinical Trials, 6, 146–155.

Bischoff, W. and Miller, F. (2005). Adaptive two-stage test procedures to find the best treatment in clinical trials. Biometrika, 92, 197–212.

Blackwell, D. and Hodges, J. L. Jr. (1957). Design for the control of selection bias. Annals of Mathematical Statistics, 28, 449–460.

Brannath, W., Koenig, F., and Bauer, P. (2003). Improved repeated confidence bounds in trials with a maximal goal. Biometrical Journal, 45, 311–324.

Brannath, W., Posch, M., and Bauer, P. (2002). Recursive combination tests. Journal of American Statistical Association, 97, 236–244.

Branson, M. and Whitehead, J. (2002). Estimating a treatment effect in survival studies in which patients switch treatment. Statistics in Medicine, 21, 2449–2463.

Bretz, F. and Hothorn, L. A. (2002). Detecting dose-response using contrasts: Asymptotic power and sample size determination for binary data. Statistics in Medicine, 21, 3325–3335.

Bronshtein, I. N., Semendyayev, K. A., Musiol, G., and Muehlig, H. (2004). Handbook of Mathematics. Springer-Verlag, Berlin, Heidelberg.

Brophy, J. M. and Joseph, L. (1995). Placing trials in context using Bayesian analysis. GUSTO revisited by reverend Bayes [see comments]. Journal of American Medical Association, 273, 871–875.

Burman, C. F. and Sonesson, C. (2006). Are flexible designs sound? Biometrics, in press.

Chaloner, K. and Larntz, K. (1989). Optimal Bayesian design applied to logistic regression experiments. Journal of Statistical Planning and Inference, 21, 191–208.

Chang, M. (2005). Bayesian adaptive design with biomarkers. Presented at IBC's Second Annual Conference on Implementing Adaptive Designs for Drug Development, November 7–8, 2005, Nassau Inn, Princeton, New Jersey.

Chang, M. (2005). A simple n-stage adaptive design, submitted.

Chang, M. (2005). Adaptive design based on sum of stagewise p-values, submitted.

Chang, M. and Chow, S. C. (2005). A hybrid Bayesian adaptive design for dose response trials. Journal of Biopharmaceutical Statistics, 15, 667–691.

Chang, M. and Chow, S. C. (2006a). Power and sample size for dose response studies. In Dose Finding in Drug Development, N. Ting (ed.). Springer, New York, New York.


Chang, M. and Chow, S. C. (2006b). An innovative approach in clinical development - Utilization of adaptive design methods in clinical trials, submitted.

Chang, M., Chow, S. C., and Pong, A. (2006). Adaptive design in clinical research - Issues, opportunities, and recommendations. Journal of Biopharmaceutical Statistics, 16, 299–309.

Chang, M. N. (1989). Confidence intervals for a normal mean following group sequential test. Biometrics, 45, 249–254.

Chang, M. N. and O'Brien, P. C. (1986). Confidence intervals following group sequential test. Controlled Clinical Trials, 7, 18–26.

Chang, M. N., Wieand, H. S., and Chang, V. T. (1989). The bias of the sample proportion following a group sequential phase II trial. Statistics in Medicine, 8, 563–570.

Chen, J. J., Tsong, Y., and Kang, S. (2000). Tests for equivalence or non-inferiority between two proportions. Drug Information Journal, 34, 569–578.

Chen, T. T. (1997). Optimal three-stage designs for phase II cancer clinical trials. Statistics in Medicine, 16, 2701–2711.

Chen, T. T. and Ng, T. H. (1998). Optimal flexible designs in phase II cancer clinical trials. Statistics in Medicine, 17, 2301–2312.

Chevret, S. (1993). The continual reassessment method in cancer phase I clinical trials: A simulation study. Statistics in Medicine, 12, 1093–1108.

Chow, S. C. (2005). Randomized trials stopped early for benefit. Presented at Journal Club, Infectious Diseases Division, Duke University School of Medicine, Durham, North Carolina.

Chow, S. C. (2006). Adaptive design methods in clinical trials. International Chinese Statistical Association Bulletin, January 2006, 37–41.

Chow, S. C., Chang, M., and Pong, A. (2005). Statistical consideration of adaptive methods in clinical development. Journal of Biopharmaceutical Statistics, 15, 575–591.

Chow, S. C. and Liu, J. P. (2003). Design and Analysis of Clinical Trials, 2nd ed. John Wiley and Sons, New York, New York.

Chow, S. C. and Shao, J. (2002). Statistics in Drug Research. Marcel Dekker, New York, New York.

Chow, S. C. and Shao, J. (2005). Inference for clinical trials with some protocol amendments. Journal of Biopharmaceutical Statistics, 15, 659–666.

Chow, S. C. and Shao, J. (2006). On margin and statistical test for non-inferiority in active control trials. Statistics in Medicine, 25, 1101–1113.

Chow, S. C., Shao, J., and Hu, Y. P. (2002). Assessing sensitivity and similarity in bridging studies. Journal of Biopharmaceutical Statistics, 12, 385–400.

Chow, S. C., Shao, J., and Wang, H. (2003). Sample Size Calculation in Clinical Research. Marcel Dekker, New York, New York.

Chuang-Stein, C. and Agresti, A. (1997). A review of tests for detecting a monotone dose-response relationship with ordinal response data. Statistics in Medicine, 16, 2599–2618.

Chuang-Stein, C., Anderson, K., Gallo, P., and Collins, S. (2006). Sample size re-estimation, submitted.


Coad, D. S. and Rosenberger, W. F. (1999). A comparison of the randomized play-the-winner and the triangular test for clinical trials with binary responses. Statistics in Medicine, 18, 761–769.

Coburger, S. and Wassmer, G. (2003). Sample size reassessment in adaptive clinical trials using a bias corrected estimate. Biometrical Journal, 45, 812–825.

Cohen, A. and Sackrowitz, H. B. (1989). Exact tests that recover interblock information in balanced incomplete block design. Journal of American Statistical Association, 84, 556–559.

Conaway, M. R. and Petroni, G. R. (1996). Designs for phase II trials allowing for a trade-off between response and toxicity. Biometrics, 52, 1375–1386.

Cox, D. R. (1952). A note on the sequential estimation of means. Proceedings of the Cambridge Philosophical Society, 48, 447–450.

Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data. Monographs on Statistics and Applied Probability, Chapman and Hall, London.

Crowley, J. (2001). Handbook of Statistics in Clinical Oncology. Marcel Dekker, New York, New York.

CTriSoft Intl. (2002). Clinical Trial Design with ExpDesign Studio, www.ctrisoft.net.

Cui, L., Hung, H. M. J., and Wang, S. J. (1999). Modification of sample size in group sequential trials. Biometrics, 55, 853–857.

DeMets, D. L. and Ware, J. H. (1980). Group sequential methods for clinical trials with a one-sided hypothesis. Biometrika, 67, 651–660.

DeMets, D. L. and Ware, J. H. (1982). Asymmetric group sequential boundaries for monitoring clinical trials. Biometrika, 69, 661–663.

Dent, S. F. and Eisenhauer, E. A. (1996). Phase I trial design: Are new methodologies being put into practice? Annals of Oncology, 6, 561–566.

Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a control. Journal of American Statistical Association, 50, 1076–1121.

Efron, B. (1971). Forcing a sequential experiment to be balanced. Biometrika, 58, 403–417.

Efron, B. (1980). Discussion of "Minimum chi-square, not maximum likelihood." Annals of Statistics, 8, 469–471.

Ellenberg, S. S., Fleming, T. R., and DeMets, D. L. (2002). Data Monitoring Committees in Clinical Trials – A Practical Perspective. John Wiley and Sons, New York, New York.

EMEA (2002). Point to Consider on Methodological Issues in Confirmatory Clinical Trials with Flexible Design and Analysis Plan. The European Agency for the Evaluation of Medicinal Products Evaluation of Medicines for Human Use. CPMP/EWP/2459/02, London, UK.

EMEA (2004). Point to Consider on the Choice of Non-inferiority Margin. The European Agency for the Evaluation of Medicinal Products Evaluation of Medicines for Human Use. London, UK.

EMEA (2006). Reflection paper on Methodological Issues in Confirmatory Clinical Trials with Flexible Design and Analysis Plan. The European Agency for the Evaluation of Medicinal Products Evaluation of Medicines for Human Use. CPMP/EWP/2459/02, London, UK.

Ensign, L. G., Gehan, E. A., Kamen, D. S., and Thall, P. F. (1994). An optimal three-stage design for phase II clinical trials. Statistics in Medicine, 13, 1727–1736.

Faries, D. (1994). Practical modification of the continual reassessment method for phase I cancer clinical trials. Journal of Biopharmaceutical Statistics, 4, 147–164.

FDA (1988). Guideline for Format and Content of the Clinical and Statistical Sections of New Drug Applications. The United States Food and Drug Administration, Rockville, Maryland.

FDA (2000). Guidance for Clinical Trial Sponsors on the Establishment and Operation of Clinical Trial Data Monitoring Committees. The United States Food and Drug Administration, Rockville, Maryland.

FDA (2005). Guidance for Clinical Trial Sponsors. Establishment and Operation of Clinical Trial Data Monitoring Committees (Draft). The United States Food and Drug Administration, Rockville, Maryland. http://www.fda.gov/cber/qdlns/clintrialdmc.htm.

Follmann, D. A., Proschan, M. A., and Geller, N. L. (1994). Monitoring pairwise comparisons in multi-armed clinical trials. Biometrics, 50, 325–336.

Friedman, B. (1949). A simple urn model. Comm. Pure Appl. Math., 2, 59–70.

Gallo, P., Chuang-Stein, C., Dragalin, V., Gaydos, B., Krams, M., and Pinheiro, J. (2006). Adaptive design in clinical drug development – An executive summary of the PhRMA Working Group (with discussions). Journal of Biopharmaceutical Statistics, 16, 275–283.

Gasparini, M. and Eisele, J. (2000). A curve-free method for phase I clinical trials. Biometrics, 56, 609–615.

Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003). Bayesian Data Analysis, 2nd ed. Chapman and Hall/CRC, New York, New York.

Gillis, P. R. and Ratkowsky, D. A. (1978). The behaviour of estimators of the parameters of various yield-density relationships. Biometrics, 34, 191–198.

Goodman, S. N. (1999). Toward evidence-based medical statistics I: The p-value fallacy. Annals of Internal Medicine, 130, 995–1004.

Goodman, S. N. (2005). Introduction to Bayesian methods I: Measuring the strength of evidence. Clinical Trials, 2, 282–290.

Goodman, S. N., Zahurak, M. L., and Piantadosi, S. (1995). Some practical improvements in the continual reassessment method for phase I studies. Statistics in Medicine, 14, 1149–1161.

Gould, A. L. (1992). Interim analyses for monitoring clinical trials that do not materially affect the type I error rate. Statistics in Medicine, 11, 55–66.

Gould, A. L. (1995). Planning and revising the sample size for a trial. Statistics in Medicine, 14, 1039–1051.

Gould, A. L. (2001). Sample size re-estimation: Recent developments and practical considerations. Statistics in Medicine, 20, 2625–2643.


Gould, A. L. and Shih, W. J. (1992). Sample size re-estimation without unblinding for normally distributed outcomes with unknown variance. Communications in Statistics – Theory and Methods, 21, 2833–2853.

Hallstrom, A. and Davis, K. (1988). Imbalance in treatment assignments in stratified blocked randomization. Controlled Clinical Trials, 9, 375–382.

Hardwick, J. P. and Stout, Q. F. (1991). Bandit strategies for ethical sequential allocation. Computing Science and Stat., 23, 421–424.

Hardwick, J. P. and Stout, Q. F. (1993). Optimal allocation for estimating the product of two means. Computing Science and Stat., 24, 592–596.

Hardwick, J. P. and Stout, Q. F. (2002). Optimal few-stage designs. Journal of Statistical Planning and Inference, 104, 121–145.

Hawkins, M. J. (1993). Early cancer clinical trials: Safety, numbers, and consent. Journal of the National Cancer Institute, 85, 1618–1619.

Hedges, L. V. and Olkin, I. (1985). Statistical Methods for Meta-analysis. Academic Press, New York, New York.

Hellmich, M. (2001). Monitoring clinical trials with multiple arms. Biometrics, 57, 892–898.

Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75, 800–803.

Holmgren, E. B. (1999). Establishing equivalence by showing that a specified percentage of the effect of the active control over placebo is maintained. Journal of Biopharmaceutical Statistics, 9, 651–659.

Hommel, G. (2001). Adaptive modifications of hypotheses after an interim analysis. Biometrical Journal, 43, 581–589.

Hommel, G. and Kropf, S. (2001). Clinical trials with an adaptive choice of hypotheses. Drug Information Journal, 33, 1205–1218.

Hommel, G., Lindig, V., and Faldum, A. (2005). Two-stage adaptive designs with correlated test statistics. Journal of Biopharmaceutical Statistics, 15, 613–623.

Horwitz, R. I. and Horwitz, S. M. (1993). Adherence to treatment and health outcomes. Archives of Internal Medicine, 153, 1863–1868.

Hothorn, L. A. (2000). Evaluation of animal carcinogenicity studies: Cochran-Armitage trend test vs. multiple contrast tests. Biometrical Journal, 42, 553–567.

Hughes, M. D. (1993). Stopping guidelines for clinical trials with multiple treatments. Statistics in Medicine, 12, 901–913.

Hughes, M. D. and Pocock, S. J. (1988). Stopping rules and estimation problems in clinical trials. Statistics in Medicine, 7, 1231–1242.

Hung, H. M. J., Wang, S. J., Tsong, Y., Lawrence, J., and O’Neill, R. T. (2003). Some fundamental issues with non-inferiority testing in active controlled trials. Statistics in Medicine, 22, 213–225.

Hung, H. M. J., Cui, L., Wang, S. J., and Lawrence, J. (2005). Adaptive statistical analysis following sample size modification based on interim review of effect size. Journal of Biopharmaceutical Statistics, 15, 693–706.

ICH (1996). International Conference on Harmonization Tripartite Guideline for Good Clinical Practice.


ICH E9 Expert Working Group (1999). Statistical principles for clinical trials (ICH Harmonized Tripartite Guideline E9). Statistics in Medicine, 18, 1905–1942.

Inoue, L. Y. T., Thall, P. F., and Berry, D. A. (2002). Seamlessly expanding a randomized phase II trial to phase III. Biometrics, 58, 823–831.

Ivanova, A. and Flournoy, N. (2001). A birth and death urn for ternary outcomes: Stochastic processes applied to urn models. In Probability and Statistical Models with Applications. Charalambides, C. A., Koutras, M. V., and Balakrishnan, N. (eds.). Chapman and Hall/CRC Press, Boca Raton, Florida, 583–600.

Jennison, C. and Turnbull, B. W. (1990). Statistical approaches to interim monitoring of medical trials: A review and commentary. Statistical Science, 5, 299–317.

Jennison, C. and Turnbull, B. W. (2000). Group Sequential Methods with Applications to Clinical Trials. Chapman and Hall/CRC Press, New York, New York.

Jennison, C. and Turnbull, B. W. (2003). Mid-course sample size modification in clinical trials based on the observed treatment effect. Statistics in Medicine, 22, 971–993.

Jennison, C. and Turnbull, B. W. (2005). Meta-analysis and adaptive group sequential design in the clinical development process. Journal of Biopharmaceutical Statistics, 15, 537–558.

Jennison, C. and Turnbull, B. W. (2006a). Adaptive and non-adaptive group sequential tests. Biometrika, 93, in press.

Jennison, C. and Turnbull, B. W. (2006b). Efficient group sequential designs when there are several effect sizes under consideration. Statistics in Medicine, 25, in press.

Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions, Vol. 1. John Wiley and Sons, New York, New York.

Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York, New York.

Kelly, P. J., Sooriyarachchi, M. R., Stallard, N., and Todd, S. (2005). A practical comparison of group-sequential and adaptive designs. Journal of Biopharmaceutical Statistics, 15, 719–738.

Kelly, P. J., Stallard, N., and Todd, S. (2005). An adaptive group sequential design for phase II/III clinical trials that select a single treatment from several. Journal of Biopharmaceutical Statistics, 15, 641–658.

Kieser, M., Bauer, P., and Lehmacher, W. (1999). Inference on multiple endpoints in clinical trials with adaptive interim analyses. Biometrical Journal, 41, 261–277.

Kieser, M. and Friede, T. (2000). Re-calculating the sample size in internal pilot study designs with control of the type I error rate. Statistics in Medicine, 19, 901–911.

Kieser, M. and Friede, T. (2003). Simple procedures for blinded sample size adjustment that do not affect the type I error rate. Statistics in Medicine, 22, 3571–3581.


Kim, K. (1989). Point estimation following group sequential tests. Biometrics, 45, 613–617.

Kimko, H. C. and Duffull, S. B. (2003). Simulation for Designing Clinical Trials. Marcel Dekker, New York, New York.

Kramar, A., Lehecq, A., and Candalli, E. (1999). Continual reassessment methods in phase I trials of the combination of two drugs in oncology. Statistics in Medicine, 18, 1849–1864.

Lachin, J. M. (1988). Statistical properties of randomization in clinical trials. Controlled Clinical Trials, 9, 289–311.

Lan, K. K. G. (2002). Problems and issues in adaptive clinical trial design. Presented at New Jersey Chapter of the American Statistical Association, Piscataway, New Jersey, June 4, 2002.

Lan, K. K. G. and DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70, 659–663.

Lan, K. K. G. and DeMets, D. L. (1989). Group sequential procedures: Calendar versus information time. Statistics in Medicine, 8, 1191–1198.

Lee, Y., Wang, H., and Chow, S. C. (2006). A bootstrap-median approach for stable sample size determination based on information from a small pilot study, submitted.

Lehmacher, W., Kieser, M., and Hothorn, L. (2000). Sequential and multiple testing for dose-response analysis. Drug Information Journal, 34, 591–597.

Lehmacher, W. and Wassmer, G. (1999). Adaptive sample size calculations in group sequential trials. Biometrics, 55, 1286–1290.

Lehmann, E. L. (1975). Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, San Francisco.

Lehmann, E. L. (1983). Theory of Point Estimation. Wiley, New York, New York.

Li, H. I. and Lai, P. Y. (2003). Clinical trial simulation. In Encyclopedia of Biopharmaceutical Statistics, Chow, S. C. (ed.). Marcel Dekker, New York, New York, 200–201.

Li, N. (2006). Adaptive trial design – FDA statistical reviewer’s view. Presented at the CRT 2006 Workshop with the FDA, Arlington, Virginia, April 4, 2006.

Li, W. J., Shih, W. J., and Wang, Y. (2005). Two-stage adaptive design for clinical trials with survival data. Journal of Biopharmaceutical Statistics, 15, 707–718.

Lilford, R. J. and Braunholtz, D. (1996). For debate: The statistical basis of public policy: A paradigm shift is overdue. British Medical Journal, 313, 603–607.

Lin, Y. and Shih, W. J. (2001). Statistical properties of the traditional algorithm-based designs for phase I cancer clinical trials. Biostatistics, 2, 203–215.

Liu, Q. (1998). An order-directed score test for trend in ordered 2×K tables. Biometrics, 54, 1147–1154.

Liu, Q. and Chi, G. Y. H. (2001). On sample size and inference for two-stage adaptive designs. Biometrics, 57, 172–177.

Liu, Q. and Pledger, G. W. (2005). Phase 2 and 3 combination designs to accelerate drug development. Journal of American Statistical Association, 100, 493–502.


Liu, Q., Proschan, M. A., and Pledger, G. W. (2002). A unified theory of two-stage adaptive designs. Journal of American Statistical Association, 97, 1034–1041.

Lokhnygina, Y. (2004). Topics in design and analysis of clinical trials. Ph.D. Thesis, Department of Statistics, North Carolina State University, Raleigh, North Carolina.

Louis, T. A. (2005). Introduction to Bayesian methods II: Fundamental concepts. Clinical Trials, 2, 291–294.

Maca, J., Bhattacharya, S., Dragalin, V., Gallo, P., and Krams, M. (2006). Adaptive seamless phase II/III designs – Background, operational aspects, and examples, submitted.

Marubini, E. and Valsecchi, M. G. (1995). Analysing Survival Data from Clinical Trials and Observational Studies. John Wiley and Sons, New York, New York.

Maxwell, C., Domenet, J. G., and Joyce, C. R. R. (1971). Instant experience in clinical trials: A novel aid to teaching by simulation. J. Clin. Pharmacol., 11, 323–331.

Mehta, C. R. and Tsiatis, A. A. (2001). Flexible sample size considerations using information-based interim monitoring. Drug Information Journal, 35, 1095–1112.

Mehta, C. R. and Patel, N. R. (2005). Adaptive, group sequential and decision theoretic approaches to sample size determination, submitted.

Melfi, V. and Page, C. (1998). Variability in adaptive designs for estimation of success probabilities. In New Developments and Applications in Experimental Design, IMS Lecture Notes Monograph Series, 34, 106–114.

Mendenhall, W. and Hader, R. J. (1958). Estimation of parameters of mixed exponentially distributed failure time distributions from censored life test data. Biometrika, 45, 504–520.

Montori, V. M., Devereaux, P. J., Adhikari, N. K. J., Burns, K. E. A. et al. (2005). Randomized trials stopped early for benefit – A systematic review. Journal of American Medical Association, 294, 2203–2209.

Muller, H. H. and Schafer, H. (2001). Adaptive group sequential designs for clinical trials: Combining the advantages of adaptive and classical group sequential approaches. Biometrics, 57, 886–891.

Neuhauser, M. and Hothorn, L. (1999). An exact Cochran-Armitage test for trend when dose-response shapes are a priori unknown. Computational Statistics & Data Analysis, 30, 403–412.

Offen, W. W. (2003). Data Monitoring Committees (DMC). In Encyclopedia of Biopharmaceutical Statistics, Chow, S. C. (ed.). Marcel Dekker, New York, New York.

Offen, W., Chuang-Stein, C., Dmitrienko, A., Littman, G., Maca, J., Meyerson, L., Muirhead, R., Stryszak, P., Boddy, A., Chen, K., Copley-Merriman, K., Dere, W., Givens, S., Hall, D., Henry, D., Jackson, J. D., Krishen, A., Liu, T., Ryder, S., Sankoh, A. J., Wang, J., and Yeh, C. H. (2006). Multiple co-primary endpoints: Medical and statistical solutions. Drug Information Journal, in press.

O’Quigley, J., Pepe, M., and Fisher, L. (1990). Continual reassessment method: A practical design for phase I clinical trials in cancer. Biometrics, 46, 33–48.


O’Quigley, J. and Shen, L. (1996). Continual reassessment method: A likelihood approach. Biometrics, 52, 673–684.

Parmigiani, G. (2002). Modeling in Medical Decision Making. John Wiley and Sons, West Sussex, England.

Paulson, E. (1964). A selection procedure for selecting the population with the largest mean from k normal populations. Annals of Mathematical Statistics, 35, 174–180.

Pocock, S. J. (2005). When (not) to stop a clinical trial for benefit. Journal of American Medical Association, 294, 2228–2230.

Pocock, S. J. and Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics, 31, 103–115.

Pong, A. and Luo, Z. (2005). Adaptive design in clinical research. A special issue of the Journal of Biopharmaceutical Statistics, 15, No. 4.

Posch, M. and Bauer, P. (1999). Adaptive two stage designs and the conditional error function. Biometrical Journal, 41, 689–696.

Posch, M. and Bauer, P. (2000). Interim analysis and sample size reassessment. Biometrics, 56, 1170–1176.

Posch, M., Bauer, P., and Brannath, W. (2003). Issues in designing flexible trials. Statistics in Medicine, 22, 953–969.

Posch, M., Konig, F., Brannath, W., Dunger-Baldauf, C., and Bauer, P. (2005). Testing and estimation in flexible group sequential designs with adaptive treatment selection. Statistics in Medicine, 24, 3697–3714.

Proschan, M. A. and Hunsberger, S. A. (1995). Designed extension of studies based on conditional power. Biometrics, 51, 1315–1324.

Proschan, M. A. (2005). Two-stage sample size re-estimation based on a nuisance parameter: A review. Journal of Biopharmaceutical Statistics, 15, 539–574.

Proschan, M. A., Leifer, E., and Liu, Q. (2005). Adaptive regression. Journal of Biopharmaceutical Statistics, 15, 593–603.

Proschan, M. A. and Wittes, J. (2000). An improved double sampling procedure based on the variance. Biometrics, 56, 1183–1187.

Proschan, M. A., Follmann, D. A., and Waclawiw, M. A. (1992). Effects of assumption violations on type I error rate in group sequential monitoring. Biometrics, 48, 1131–1143.

Proschan, M. A., Follmann, D. A., and Geller, N. L. (1994). Monitoring multi-armed trials. Statistics in Medicine, 13, 1441–1452.

Quinlan, J. A., Gallo, P., and Krams, M. (2006). Implementing adaptive designs: Logistical and operational considerations, submitted.

Ravaris, C. L., Nies, A., Robinson, D. S., and Ives, J. O. (1976). Multiple-dose controlled study of phenelzine in depression-anxiety states. Archives of General Psychiatry, 33, 347–350.

Robins, J. M. and Tsiatis, A. A. (1991). Correcting for non-compliance in randomized trials using rank preserving structural failure time models. Communications in Statistics – Theory and Methods, 20, 2609–2631.

Rom, D. M. (1990). A sequentially rejective test procedure based on a modified Bonferroni inequality. Biometrika, 77, 663–665.


Rosenberger, W. F. and Lachin, J. (2002). Randomization in Clinical Trials. John Wiley and Sons, New York, New York.

Rosenberger, W. F. and Seshaiyer, P. (1997). Adaptive survival trials. Journal of Biopharmaceutical Statistics, 7, 617–624.

Rosenberger, W. F., Stallard, N., Ivanova, A., Harper, C. N., and Ricks, M. L. (2001). Optimal adaptive designs for binary response trials. Biometrics, 57, 909–913.

Sampson, A. R. and Sill, M. W. (2005). Drop-the-loser design: Normal case (with discussions). Biometrical Journal, 47, 257–281.

Sargent, D. J. and Goldberg, R. M. (2001). A flexible design for multiple armed screening trials. Statistics in Medicine, 20, 1051–1060.

Schaid, D. J., Wieand, S., and Therneau, T. M. (1990). Optimal two stage screening designs for survival comparisons. Biometrika, 77, 659–663.

Shao, J., Chang, M., and Chow, S. C. (2005). Statistical inference for cancer trials with treatment switching. Statistics in Medicine, 24, 1783–1790.

Shen, Y. and Fisher, L. (1999). Statistical inference for self-designing clinical trials with a one-sided hypothesis. Biometrics, 55, 190–197.

Shih, W. J. (2001). Sample size re-estimation – A journey for a decade. Statistics in Medicine, 20, 515–518.

Shirley, E. (1977). A non-parametric equivalent of Williams’ test for contrasting increasing dose levels of treatment. Biometrics, 33, 386–389.

Siegmund, D. (1985). Sequential Analysis: Tests and Confidence Intervals. Springer, New York, New York.

Simon, R. (1979). Restricted randomization designs in clinical trials. Biometrics, 35, 503–512.

Simon, R. (1989). Optimal two-stage designs for phase II clinical trials. Controlled Clinical Trials, 10, 1–10.

Sommer, A. and Zeger, S. L. (1991). On estimating efficacy from clinical trials. Statistics in Medicine, 10, 45–52.

Sonnemann, E. (1991). Kombination unabhängiger Tests. In Biometrie in der chemisch-pharmazeutischen Industrie 4, Stand und Perspektiven, Vollmar, J. (ed.). Gustav Fischer, Stuttgart.

Spiegelhalter, D. J., Abrams, K. R., and Myles, J. P. (2004). Bayesian Approaches to Clinical Trials and Health-Care Evaluation. John Wiley and Sons, Chichester, West Sussex, England.

Stallard, N. and Todd, S. (2003). Sequential designs for phase III clinical trials incorporating treatment selection. Statistics in Medicine, 22, 689–703.

Stallard, N. and Todd, S. (2005). Point estimates and confidence regions for sequential trials involving selection. Journal of Statistical Planning and Inference, 135, 402–419.

Stewart, W. and Ruberg, S. J. (2000). Detecting dose response with contrasts. Statistics in Medicine, 19, 913–921.

Susarla, V. and Pathala, K. S. (1965). A probability distribution for time of first birth. Journal of Scientific Research, Banaras Hindu University, 16, 59–62.


Taves, D. R. (1974). Minimization – A new method of assigning patients to treatment and control groups. Clinical Pharmacol. Ther., 15, 443–453.

Temple, R. (2005). How FDA currently makes decisions on clinical studies. Clinical Trials, 2, 276–281.

Thall, P. F., Simon, R., and Ellenberg, S. S. (1989). A two-stage design for choosing among several experimental treatments and a control in clinical trials. Biometrics, 45, 537–547.

Todd, S. (2003). An adaptive approach to implementing bivariate group sequential clinical trial designs. Journal of Biopharmaceutical Statistics, 13, 605–619.

Todd, S. and Stallard, N. (2005). A new clinical trial design combining Phases 2 and 3: Sequential designs with treatment selection and a change of endpoint. Drug Information Journal, 39, 109–118.

Tsiatis, A. A. and Mehta, C. (2003). On the inefficiency of the adaptive design for monitoring clinical trials. Biometrika, 90, 367–378.

Tsiatis, A. A., Rosner, G. L., and Mehta, C. R. (1984). Exact confidence intervals following a group sequential test. Biometrics, 40, 797–803.

Tukey, J. W. and Heyse, J. F. (1985). Testing the statistical certainty of a response to increasing doses of a drug. Biometrics, 41, 295–301.

Uchida, T. (2006). Adaptive trial design – FDA view. Presented at the CRT 2006 Workshop with the FDA, Arlington, Virginia, April 4, 2006.

Wald, A. (1947). Sequential Analysis. Dover Publications, New York, New York.

Wang, S. J. and Hung, H. M. J. (2005). Adaptive covariate adjustment in clinical trials. Journal of Biopharmaceutical Statistics, 15, 605–611.

Wang, S. K. and Tsiatis, A. A. (1987). Approximately optimal one-parameter boundaries for group sequential trials. Biometrics, 43, 193–200.

Wassmer, G. (1998). A comparison of two methods for adaptive interim analyses in clinical trials. Biometrics, 54, 696–705.

Wassmer, G., Eisebitt, R., and Coburger, S. (2001). Flexible interim analyses in clinical trials using multistage adaptive test designs. Drug Information Journal, 35, 1131–1146.

Wei, L. J. (1977). A class of designs for sequential clinical trials. Journal of American Statistical Association, 72, 382–386.

Wei, L. J. (1978). The adaptive biased-coin design for sequential experiments. Annals of Statistics, 6, 92–100.

Wei, L. J. and Durham, S. (1978). The randomized play-the-winner rule in medical trials. Journal of American Statistical Association, 73, 840–843.

Wei, L. J., Smythe, R. T., and Smith, R. L. (1986). K-treatment comparisons with restricted randomization rules in clinical trials. Annals of Statistics, 14, 265–274.

Weintraub, M., Jacox, R. F., Angevine, C. D., and Atwater, E. C. (1977). Piroxicam (CP 16171) in rheumatoid arthritis: A controlled clinical trial with novel assessment features. Journal of Rheumatology, 4, 393–404.

White, I. R., Babiker, A. G., Walker, S., and Darbyshire, J. H. (1999). Randomisation-based methods for correcting for treatment changes: Examples from the Concorde trial. Statistics in Medicine, 18, 2617–2634.

White, I. R., Walker, S., and Babiker, A. G. (2002). Randomisation-based efficacy estimator. Stata Journal, 2, 140–150.

Whitehead, J. (1993). Sample size calculation for ordered categorical data. Statistics in Medicine, 12, 2257–2271.

Whitehead, J. (1994). Sequential methods based on the boundaries approach for the clinical comparison of survival times (with discussions). Statistics in Medicine, 13, 1357–1368.

Whitehead, J. (1997). Bayesian decision procedures with application to dose-finding studies. International Journal of Pharmaceutical Medicine, 11, 201–208.

Williams, D. A. (1971). A test for difference between treatment means when several dose levels are compared with a zero dose control. Biometrics, 27, 103–117.

Williams, D. A. (1972). Comparison of several dose levels with a zero dose control. Biometrics, 28, 519–531.

Williams, G., Pazdur, R., and Temple, R. (2004). Assessing tumor-related signs and symptoms to support cancer drug approval. Journal of Biopharmaceutical Statistics, 14, 5–21.

Woodcock, J. (2005). FDA introductory comments: Clinical studies design and evaluation issues. Clinical Trials, 2, 273–275.

Zelen, M. (1974). The randomization and stratification of patients to clinical trials. Journal of Chronic Diseases, 27, 365–375.

Zucker, D. M., Wittes, J. T., Schabenberger, O., and Brittain, E. (1999). Internal pilot studies II: Comparison of various procedures. Statistics in Medicine, 18, 3493–3509.


Index

A

Accelerated failure time model, 174
Accidental bias, 70
Accrual bias, 70
Accrued data, 9, 18, 45, 90, 245
  adaptations based on, 2, 5
  analysis, 93
  design modifications based on, 6
  determination of non-inferiority margin based on, 88
  interim analysis based on, 109
  modification of hypothesis based on, 15, 75
  sample size adjustment based on, 137, 158
Active-control
  agent, 76, 77, 80
  parallel-group, 18
Adaptation(s), 3, 5, 16, 19, 64, 90, 242–243
  ad hoc, 2, 5, 239
  after final analysis, 17, 161
  algorithms, 212
  basic strategy, 242, 245
  biomarkers and, 6
  of design characteristics, 245
  dose-escalation trial, 245
  effect on clinical trials, 2, 240
  effects of, 2, 230, 240
  examples, 2, 242
  flexibility, 12
  of interim analyses, 243
  population, 165
  prospective, 2, 239
  retrospective, 2, 5, 239
  rule, 41, 42, 45, 216
  of sample size, 146, 243
  for seamless phase II/III trial designs, 248
  of switching hypotheses, 243
Adaptive design(s)
  basic considerations when implementing, 239–241
  bias and, 19
  biomarker, 6
  computer simulation and, 236
  defined, 3–6
  disadvantages, 16, 45
  dose-escalation, 244–247
  for dose-response trials, 19
  drop-the-loser, 167–170, 234–235
  four-stage specifications, 129
  group sequential, 5, 15–16, 107–136, 228, 241–244, 249
    adaptation, 242–243
    case study, 243–244
    commonly used, 20
    implementation, 20, 239, 241–244
    statistical, 243
  hybrid frequentist-Bayesian, 15, 93–100
  key issue, 7
  methods, 3, 5, 8
    advantages of, 19, 45
    basic considerations when implementing, 239–241
    Bayesian approach, 195
    disadvantages of, 19, 21
    for dose-response curves, 90
    impact of, 13
    theoretical basis for, 11
  multiple, 6, 17
  n-stage, 123
  overall type I error rate, 16
  permitting early stopping and sample size re-estimation, 232
  response, 233
  seamless phase II/III, 247–254
  selection, 205
  statistical theory, 215
  two-stage, 16
  window, 227


Adaptive dose-escalation trial, 12, 15, 89–105, 245
Adaptive group sequential design, 5, 15–16, 107–136, 228, 249
  adaptation, 242–243
  case study, 243–244
  commonly used, 20
  implementation, 20, 239, 241–244
  statistical, 243
Adaptive hypothesis design, 6, 166
Adaptive model(s), 93
  for ordinal and continuous outcomes, 68–69
  response, 68
Adaptive randomization, 2, 12, 13–14, 47–73
  bias and, 218
  design, 5
    definition of, 5–6
  procedures, 14, 20
Adaptive sample size adjustment, 12, 16–17, 137–159
  at interim, 242
Adaptive seamless phase II/III design, 5, 17, 161–171
  benefits and drawbacks, 171
  case studies, 247–254
  traditional approach vs., 162
Adaptive stratification, 55
Adaptive treatment switching, 12, 18, 173–193
  biomarker response, 182
  design, 5, 6
  simulation approach, 189
Adjusted p-value, 125, 128, 150, 152, 156
Allocation probability, 47, 54, 217
  for covariate-adaptive randomization, 55
  defined, 53
  equal, 49
  fixed, 48
  varied, 52
Allocation rule
  in Atkinson optimal model, 58
  balanced randomization, 73
  bandit, 61, 64
  in Efron’s biased coin model, 53
  in Friedman-Wei’s urn model, 54, 70, 72
  Lachin’s, 70, 72
    stratified, 70, 72
  in Pocock-Simon’s model, 57
  in truncated binomial model, 70, 72
Alpha spending function, 122–123, 136
Alternative hypothesis, 77
  conditional power and, 133
  sample size adjustment and, 35, 115, 122, 138
  simulation and, 212–213
  true, 233
  two-stage design and, 203
Analysis of covariance (ANCOVA), 51
Asthma, 42–43, 251
Atkinson optimal model, 55, 58

B

Balance
  design, 35
  perfect, 54
Balanced randomized design, 65, 68
Bandit allocation rule, 61, 64
Bandit model, 58, 61–68
  for finite population, 64–68
Bauer and Kohne’s method, 14, 137, 146–148, 151, 168, 249
Bayes rule, 196–200
Bayesian adaptive design, 15, 195
  for dose-response trials, 90
  hybrid frequentist, 15, 93–100
Bayesian approach, 18, 92, 195–210
  advantages, 209–210
  bandit allocation rule as, 61
  basic concepts, 195–201
  hybrid-frequentist, 98–99
  predictive power as, 131
  prior probability and, 97
  simulation with, 236
Bayesian optimal design, 205–209
Bias
  accidental, 70
  accrual, 70
  adaptation and, 10
  adaptive design and, 19
  adaptive randomization and, 218
  expected factor, 71
  group size and, 236
  minimization, 2
  operational, 239
  potential risks for, 44
  in rate with dropping losers, 234, 235


  reduction, 229, 230
  selection, 55, 71
  statistical, 239
Biased coin design, 53
Binary endpoint, 118, 155, 164
Bioequivalence, 77
Biological efficacy, 18, 173
Biomarker-adaptive design, 5, 6
Block randomization, 52–53
  permuted, 13
Boundary scales, 109–110

C

Cancer trials, 18, 68, 77, 201. See alsoOncology

biomarkers and, 182, 210breast, 251optimal/flexible multiple-stage design,

110ovarian, 57parallel-group active control

randomization, 173prostatic, 245responding to early beneficial trends

in, 132treatment-switch in, 181two-stage design in phase II, 202

Case studies, 19–20, 239–254
Chow and Shao’s approach, 79
Clinical trial simulation, 12, 19, 211–237
    early phase, 228–230
        considerations, 213–215
            dose-escalation design, 215
            dose-level selection, 214
            dose limiting toxicity and maximum tolerated dose, 214
            sample size per dose level, 215
        examples, 227–230
            with CRM, 229–230
            with TER and TSER, 228–229
    examples, 227–235
    framework, 212–213
    late phase, 230–235
        considerations, 215–220
            alpha adjustment, 219–220
            early stopping rules, 216
            null-model vs. model approach, 219
            randomization rules, 216
            response-adaptive randomization, 217–218
            rules for dropping losers, 216–217
            sample size adjustment, 217
            utility-offset model, 218–219
        examples, 230–235
            adaptive design permitting early stopping, 232–233
            adaptive design with dropping the losers, 234–235
            conventional design with multiple treatments, 233
            design with play-the-winner randomization, 231–232
            dose-response trial design, 235
            flexible design with sample size re-estimation, 231
            group sequential design, 232
            response-adaptive design with multiple treatments, 233–234
    software application, 220–227
        designing with, 222–227
            adaptive trial, 227
            conventional trial, 222–223
            dose-escalation trial, 225–226
            group sequential trial, 223–224
            multi-stage trial, 224
        overview, 220–222
Closed testing procedure, 78, 243
Cluster randomization, 47, 48, 51–52, 247
Combined adaptive design, 17
Common toxicity criteria, 91
Comparing means, 108, 133–134
Comparing proportions, 108, 134–135
Complete randomization, 13, 47, 48. See also Simple randomization
    accidental bias, 70
    urn procedure and, 55

Complete randomization design, 55
Conditional inference, 39–40
Conditional power, 131, 133–135, 142, 203, 244
Confidence interval, 4, 8, 11, 79, 169, 252
    asymptotic, 180
    final, 136
    naive, 100, 131, 216
    simulation, 222

Constancy condition, 84, 86
Continual reassessment method (CRM), 15, 89, 90, 215, 228, 244
    in reduction of bias, 229
Conventional randomization, 48–52, 217
Convergence strategy, 71



Covariate-adaptive randomization, 55–58
Covariate-adjustment, 38–43
Cox’s proportional hazard model, 177, 179
CRM. See Continual reassessment method (CRM)
CTriSoft, 103, 212, 220, 228
Cui-Hui-Wang method, 137, 140–142

D

Data monitoring committee, 76, 109, 128–131, 158, 241, 253
Data safety monitoring board, 75
DLT. See Dose limiting toxicity (DLT)
DMC. See Data monitoring committee
Dose, 212
    adjustment, 2
    assignment based on minimization/maximization of function, 90
    efficacious, 15, 89
    maximum tolerable, 6, 89, 214
    optimal, strategies for finding, 213
    reduction, 8
    toxicity and, 91

Dose de-escalation, 89
Dose-efficiency response study, 15, 89
Dose-escalation design, 5, 20, 239, 244–247
Dose escalation factor, 92, 214
Dose-escalation trial, 213
    adaptive, 12, 15, 89–105, 245
    design, 225
    simulation for, 214, 220
Dose-level selection, 92
Dose limiting toxicity (DLT), 89, 92, 214, 244
    selection criteria based on, 245

Dose regimen, 2, 39, 42, 77, 108, 240
Dose response, 15, 89
    study, 15, 17
    curves, 90
    model, 99
    using multiple-stage designs, 90
Dose-toxicity modeling, 91–92
Dose-toxicity study, 15, 89
Drop the loser design, 5, 167–170
Dropping losers, 12, 77, 162, 234, 248
    rules, 100, 101, 216

E

Early efficacy-futility stopping, 119–122
Early efficacy stopping, 114–116, 156, 251
Early futility stopping, 116–119, 156, 162
Early phases development, 213–215
Early stopping boundaries, 114–122
Effect size, 8, 115, 198, 239
    clinically meaningful, 16
    sample size and, 140, 217
    sensitivity index and, 24
Efficacy, 42, 89, 107
    biological, 18, 173, 174
    Chow and Shao’s approach, 79
    comparisons, 48
    early (See Early efficacy)
    early stopping for, 114, 161, 162
    endpoints, 42, 60, 77, 118, 128, 251
    evaluation, 18
    lack of, 17
    premature termination of trial and, 5, 221
    unbiased and fair assessment, 47

Efron’s biased coin model, 53
EM algorithm, 139
EMEA. See European Agency for the Evaluation of Medicinal Products
Equal randomization, 65
Error inflation, 109
Ethical consideration, 18, 48, 77, 107, 110, 173, 201, 242
European Agency for the Evaluation of Medicinal Products, 6, 34
Expected bias factor, 71
Extrapolate, 211, 212

F

Family experiment-wise, 151
FEV1 change, as endpoint parameter, 42, 128, 251
Flexible design, 4, 7, 231
    with sample size re-estimation, 231
Flexible trials, 15
Forced expiratory volume in one second. See FEV1 change, as endpoint parameter
Friedman-Wei’s urn model, 54–55
    accidental bias, 70
    selection bias, 72



Futility, 2
    assessment, 108, 250
    based on interim analyses, 5
    early stopping for, 15, 88
    rules, 100, 216
Futility design, 112
Futility index, 131
Futility inner stopping boundary, 118

G

GCP. See Good Clinical Practices
Genomic markers, 6, 166
Gittins lower bound, 63–64
Good Clinical Practices, 9, 108
Group sequential design
    adaptive, 5, 15–16, 107–136, 228, 249
        adaptation, 242–243
        case study, 243–244
        commonly used, 20
        implementation, 20, 239, 241–244
        statistical, 243
    based on independent p-values, 123–125
    five-arm, 166
    general approach for, 112–114
    with one interim analysis, 232
    sample size adjustment in, 137

H

Heart failure, 132
HIV/AIDS, 18, 132, 173
Homogeneity, 13, 35, 50, 164
Hybrid, 15, 89
Hybrid frequentist-Bayesian adaptive design, 15, 93–100
Hyper-logistic function, 91, 96, 104
Hypothesis test, 123, 148, 187–189, 213
    global, 149
    null, 149

I

Imbalance minimization model, 57–58
Inclusion/exclusion criteria, 2, 3, 7, 11, 38
    modification, 9
Individual p-value, 151
Inferential analysis, 71–72
Interactive parameter estimation (IPE), 176–177
IPE. See Interactive parameter estimation (IPE)

K

k-stage design, 125–126

L

Lachin’s urn model, 53–54
Lan-DeMets-Kim functions, 110, 122–123
Late phases development, 215–220
Latent event times, 174–177
Latent hazard rate, 177–181
Linear combination of p-values, 150, 243
Log-likelihood function, 29, 30, 31, 139
Log-rank test, 155
Long-term treatment, 131

M

Marginal distribution, 196, 197
Maximum likelihood estimates (MLE), 28–31, 139, 181, 186, 190
Maximum likelihood estimator, 136
Maximum tolerable dose, 6, 15, 89, 90, 214, 228, 244
    based on dose response model, 93
    defined, 214
    expression of, 92
    in simulation studies, 214
Median survival time, 18, 77, 120, 181
Median unbiased estimator, 136
Method of individual p-values (MIP), 151–152, 156, 157, 162
Method of products of p-values (MPP), 153, 156
Method of sum of p-values (MSP), 152, 156, 162
MIP. See Method of individual p-values (MIP)
Mixed exponential model, 173, 181–193
Mixed normal distribution, 28
MLE. See Maximum likelihood estimates (MLE); Maximum likelihood estimator (MLE)
Moving target patient population, 12–13, 25, 31, 44, 159
MPP. See Method of products of p-values (MPP)



MSP. See Method of sum of p-values (MSP)
Muller-Schafer method, 146
Multiple adaptive design, 6, 17
Multiple-endpoint oriented, 105
Multiple-stage design, 110, 224
    for single-arm trial, 201–205
Multiple testing, 14–15

N

N-adjustable design, 5, 213
Naive p-value, 163–165
National Cancer Institute, 91
Non-inferiority, 165
    hypothesis, 76
    margin, 34, 36, 75, 78–84, 243
    switch from superiority to, 2, 78–87
    test for, 34, 36, 212
Non-informative priors, 101, 102, 103
Nonparametric method, 185, 187–188
Normal endpoint, 17, 115, 155, 164
Normal outcome, 69
Null hypothesis, 17, 72, 109
    hazard rates under, 188
    interchange between alternative hypothesis and, 77
    one-sided, 43
    under population model, 72
    rejection of, 34, 78, 133, 143, 191
    in simulation framework, 213
    test, 149
    test statistic and, 33

O

O’Brien and Fleming’s test, 115, 133, 136
O’Brien-Fleming boundary, 110, 130, 205, 207, 209
O’Brien-Fleming error spending functions, 122
O’Brien-Fleming group sequential procedure, 136
O’Brien-Fleming test, 115, 133
Oncology, 8
    trials, 15, 18 (See also Cancer trials)
        mixed exponential model, 181
        phase I, 89, 91–93, 227
            dose-escalation, 244
            dose-level selection, 92
            dose-toxicity modeling, 91–92
            reassessment of model parameters, 92–93
        phase II, 228
        two-arm comparative, 126, 155

Operating characteristics, 102, 155–156, 203, 207
    of adaptive methods, 128
    comparisons, 236, 246
    desirable, 236
    of various designs, 129
Optimal allocation, 60, 61
Optimal/flexible multiple-stage designs, 110–111
Optimal randomized play-the-winner model, 60–61
Optimal two-stage design, 110–111, 202–203
Ordinal outcome, 68–69

P

Parallel group, 18, 174, 198
Patient population
    actual, 4, 6, 10, 23–26
    genomic markers, 166
    homogeneous, 13
    selection, 251
    target, 2, 3, 8–10, 31
        conclusions for, 13
        moving, 12–13, 25, 159
        shift in, 30
        for stratified randomization, 50, 107
        subgroups of, 181
Permutation test, 71–72
Pharmacodynamics, 213
Pharmacokinetics, 211, 213
Phase I oncology study. See Oncology, trials, phase I
PhRMA Working Group, 3–4, 6
Play-the-winner model, 58–59
Pocock, 136, 158, 159
Pocock boundary, 110, 131, 205, 208, 209
Pocock error spending functions, 122
Pocock-Simon’s model, 56–57
Pocock’s test, 115, 133
Population model, 13, 71, 72
Posterior distribution, 64, 92, 195, 230
    of toxicity, 236
Posterior means, 62
Power, 64, 118
    analysis, 9, 13, 17, 51, 77, 187
        pre-study, 242



    conditional, 130, 131, 133–135, 244
    for detection of clinically important effect, 45
    function, 91
    insufficient, 11, 244
    optimal, 49
    predictive, 130, 131, 203
    protocol amendments on, 23
    statistical, 14, 48, 51, 73
    stealing, 162
    test for equivalence, 37
    test for non-inferiority/superiority, 36
Predictive power, 130, 131, 203
Prior distribution, 61, 62, 101, 195, 196, 215, 246
    Bayesian approach and, 92
    binomial, 203
    of parameter tensor, 97
    uniform, 67, 229

Product of p-values, 150, 151, 153–155
Proportional hazard model, 177–181
Proschan-Hunsberger’s method, 137, 142–145
Prospective trials, 2
    adaptation, 5, 239
    stratification, 55
Protocol amendment(s), 5, 23–46, 95, 108
    actual patient population, 23–26
    estimation of shift and scale parameters, 26–31
    sample size adjustment, 35–38
    statistical inference, 31–34
    statistical inference with covariate adjustment, 38–43

Q

QTc prolongation, 213

R

Randomization
    cluster, 51–52, 247
    model, 13, 47, 71, 72, 99
        accidental bias, 70
    simple, 48–50
    stratified, 50–51
Randomized play-the-winner model, 59, 216, 217
Reassessment method, 101
    continual, 15, 89, 90, 215, 228, 229, 244

Relative efficiency, 36, 37, 38, 49, 50

Repeated confidence interval, 131
Reproducibility, 19, 159
Response-adaptive randomization, 14, 47–48, 58–70, 90, 92, 165, 216, 236, 241
    defined, 58
Retrospective adaptation, 2, 5, 239
Robustness, 19, 211, 231

S

Sample size
    adjustment, 12, 15, 16–17, 35–38, 137–159
    calculation, 90, 112, 114
    change, 3
    fixed, 119
    insufficient, 64
    maximum, 116, 118
    power analysis for calculation, 13, 20, 23, 77
    power and, 72–73
    pre-selected, 75
    re-assessment, 4
    saving in, 108
Sample size ratio, 49–50, 112
    maximum, 100
Sample size re-estimation, 2, 5, 9, 20, 123, 130, 212, 228
    flexible design with, 231
    unblinded, 219
    without unblinding, 138–140

Sampling distribution, 40, 196
Scale parameter, 23, 26, 29
Seamless design, 161–171
    comparisons, 165–167
    contrast test and naive p-value, 163–165
    drop-the-loser adaptive design, 167–170
    efficiency, 161–162
    objectives, 247
    step-wise test and adaptive procedures, 162–163
Seamless phase II/III design, 161–171, 247–254
Selection bias, 19, 71, 239
    expected factors, 72
    sample size and, 55
Sensitivity
    analyses, 19, 45, 84, 237
    index, 23



        changes in, 25
        estimation of, 27
        indication of, 24
Sequential methods, 108–112
    basic concepts, 109–112
Shift parameter, 23, 29
Short-term treatment, 132
    efficacy, 248, 251
    endpoints, 249
    long-term treatment vs., 131
Simon’s two-stage design, 110, 202
Simple randomization, 48–50, 217, 234
    for two-arm parallel group, 49
Statistical analysis plan, 4, 7, 11
Statistical inference, 2, 10–11, 31–34, 84–86
    conclusions from, 9
    with covariate adjustment, 38–43
    impact of protocol amendments on, 23
    valid, 45
    validity, 7
Statistical procedures, 2, 3
    defined, 9
    documentation, 7, 11
    for identifying best clinical benefit, 239–240
    modifications, 20, 23, 45, 239–240, 444
    for sample size re-estimation, 137

Step-wise test, 162–163
STER. See Strict traditional escalation rule (STER)
Stopping boundary, 109, 110, 131, 213, 243
    for efficacy, 114, 120, 207, 208, 243
    for futility, 118, 120, 207
Stopping rule(s), 100, 149, 163, 245
    choice of, 136
    clear statement of, 247
    early, 216
    in first stage, 144, 203
    as guide, 128
    in second stage, 203
Stratified randomization, 47, 50–51
Strict traditional escalation rule (STER), 89–90, 215
Sum of p-values, 152–153
Superiority margin, 34, 36, 80

    switch to non-inferiority margin, 2, 78–87
Survival endpoint, 120, 155, 164
Survival outcome, 69
Switch from superiority to non-inferiority, 2, 78–87
Switching effect, 173, 182
    Cox’s proportional hazard model with, 179
    in statistical analyses, 191

Symmetric boundary, 118, 119

T

Target patient population, 8–10
    defining, 2
    moving, 12–13, 25, 31, 44, 159
    statistical inference and, 31
    for stratified randomization, 50
    subgroups, 181
TER. See Traditional escalation rules (TER)
Test for equality, 33, 35
Test for equivalence, 34, 37
Test for non-inferiority/superiority, 34, 36
Tests without historical data, 87
Time-dependent covariate, 177
Time-to-event analysis, 112
Traditional escalation rules (TER), 89–90, 215
Treatment-adaptive randomization, 14, 47–48, 52–55
Treatment imbalance, 48, 57
    impact of, 50
    reducing, 54
    of stratified randomization, 50
Treatment switching, 5
    adaptive, 6, 12, 18, 173–193
    in cancer trials, 18
Trial procedure, 24
Triangular boundary, 118
Truncated-binomial randomization, 52
    accidental bias, 70
Truncation, defined, 63
Two-arm bandit, 61, 66
Two-stage design, 108, 125–126, 142, 146–147
    classical approach for, 202–203
    optimal, 110
    patient population for, 229
    for trials with binary outcomes, 249

Type I error rate, 14, 16, 75, 109, 124, 127, 141
    control of, 159, 162, 209, 211, 242
    family experiment-wise, 151
    inflation of, 239



U

Unadjusted p-value, 150
Unbalanced design, 49, 212
Unconditional inference, 40–43
Uniform bandit, 63
Uniformly most powerful unbiased, 169
United States National Cancer Institute, 91
Urn design, 54
    Wei’s marginal, 57
Utility-adaptive randomization, 90, 93, 99–100, 101, 245
Utility-based unified CRM adaptive approach, 94
Utility function, 90, 99
    construction of, 94–95
Utility index, 91, 96, 100, 205, 220
Utility-offset model, 100, 101, 217, 218–219

V

Variance-adaptive randomization, 52
Variation, 1, 24
    coefficient of, 180, 186
    controlling and eliminating sources of, 10
    patient, 242
    statistical procedures and, 10

Virtual patients, 211, 212

W

Wang-Tsiatis’ boundary, 118
Wei’s marginal urn design, 57
Whitehead triangle boundaries, 130
Wilcoxon rank-sum test, 72
Without unblinding, 138–140

Z

Zelen’s model, 56
