409

Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

  • Upload
    others

  • View
    13

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,
Page 2: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Textbook of Clinical Trials

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 3: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Textbook of Clinical Trials

Edited by

David MachinDivision of Clinical Trials and Epidemiological Sciences,

National Cancer Centre, Singapore and UK Children’s Cancer Study Group,University of Leicester, UK.

Simon DayMedicines and Healthcare products Regulatory Agency,

London, UK

Sylvan GreenArizona Cancer Centre, Tucson, Arizona, USA

Section Editors

Brian EverittInstitute of Psychiatry, London, UK

Stephen GeorgeDepartment of Biostatistics and Informatics,

Duke University Medical Centre, Durham, NC, USA

Page 4: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Copyright 2004 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester,West Sussex PO19 8SQ, England

Telephone (+44) 1243 779777

Email (for orders and customer service enquiries): [email protected] our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in anyform or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms ofthe Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing AgencyLtd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests tothe Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate,Chichester, West Sussex PO19 8SQ, England, or emailed to [email protected], or faxed to (+44) 1243 770620.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. Itis sold on the understanding that the Publisher is not engaged in rendering professional services. If professional adviceor other expert assistance is required, the services of a competent professional should be sought.

Other Wiley Editorial Offices

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA

Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA

Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany

John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809

John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appearsin print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data

Textbook of clinical trials / edited by David Machin ... [et al.].p. cm.

Includes bibliographical references and index.ISBN 0-471-98787-5 (alk. paper)

1. Clinical trials. I. Machin, David.

R853.C55T49 2004610′.72′4–dc22

2003065147

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

ISBN 0-471-98787-5

Typeset in 10/12pt Times by Laserwords Private Limited, Chennai, IndiaPrinted and bound in Great Britain by Antony Rowe Ltd, Chippenham, WiltshireThis book is printed on acid-free paper responsibly manufactured from sustainable forestryin which at least two trees are planted for each one used for paper production.

Page 5: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Contents

Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1 Brief History of Clinical Trials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3S. Day and F. Ederer

2 General Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11David Machin

3 Clinical Trials in Paediatrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45Johan P.E. Karlberg

4 Clinical Trials Involving Older People . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55Carol Jagger and Antony J. Arthur

5 Complementary Medicine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63P.C. Leung

CANCER 85

6 Breast Cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87Donald A. Berry, Terry L. Smith and Aman U. Buzdar

7 Childhood Cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101Sharon B. Murphy and Jonathan J. Shuster

8 Gastrointestinal Cancers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117Dan Sargent, Rich Goldberg and Paul Limburg

9 Haematologic Cancers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141Charles A. Schiffer and Stephen L. George

Page 6: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

vi CONTENTS

10 Melanoma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149P.-Y. Liu and Vernon K. Sondak

11 Respiratory Cancers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159Kyungmann Kim

CARDIOVASCULAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

12 Cardiovascular . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175Lawrence Friedman and Eleanor Schron

DENTISTRY AND MAXILLO-FACIAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

13 Dentistry and Maxillo-facial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193May C.M. Wong, Colman McGrath and Edward C.M. Lo

DERMATOLOGY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

14 Dermatology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211Luigi Naldi and Cosetta Minelli

PSYCHIATRY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

15 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235B.S. Everitt

16 Alzheimer’s Disease . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243Leon J. Thal and Ronald G. Thomas

17 Anxiety Disorders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255M. Katherine Shear and Philip W. Lavori

18 Cognitive Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273Nicholas Tarrier and Til Wykes

19 Depression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297Graham Dunn

REPRODUCTIVE HEALTH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

20 Contraception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315Gilda Piaggio

21 Gynaecology and Infertility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337Siladitya Bhattacharya and Jill Mollison

RESPIRATORY 357

22 Respiratory 359Anders Kallen

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401

Page 7: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Contributors

TONY ARTHUR School of Nursing, Faculty of Medicine and Health Sciences,University of Nottingham, Queens Medical Centre,Nottingham NG7 2UH, UK, Email:[email protected]

DONALD BERRY Department of Biostatistics, UTMD Anderson Cancer Centre,1400 Holcombe Blvd, Box 447, Houston, TX 77030, USA,Email: [email protected]

S. BHATTACHARYA Department of Obstetrics and Gynaecology, AberdeenMaternity Hospital, Cronhill Road, Aberdeen AB25 2ZD, UK,Email: [email protected]

ARMAN BUZDAR Department of Breast Medical Oncology, The University ofTexas, M.D. Anderson Cancer Centre, Texas, USA

SIMON DAY Licensing Division, Medicines and Healthcare productsRegulatory Agency, Room 13-205 Market Towers, 1 NineElms Lane, London SW8 5NQ, UK,Email: [email protected]

GRAHAM DUNN Biostatistics Group, School of Epidemiology and HealthSciences, Stopford Building, Oxford Road, Manchester M139PT, UK, Email: [email protected],[email protected]

FRED EDERER The EMMES Corporation, 401 North Washington Street, Suite700, Rockville, MD 20850, USA, Email: [email protected]

Page 8: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

viii CONTRIBUTORS

BRIAN EVERITT Box 20, Biostatistics Department, Institute of Psychiatry,Denmark Hill, London SE5 8AF, UK, Email:[email protected]

LARRY FRIEDMAN National Heart, Lung, and Blood Institute, Building 31, Room5A03, Bethesda, MD 20892 2482, USA, Email:[email protected]

STEPHEN GEORGE Department of Biostatistics and Bioinformatics, DukeUniversity Medical Centre, Durham, NC 27710, USA, Email:[email protected]

RICH GOLDBERG Medical Oncology, Mayo Clinic, 200 1st Street SW,Rochester, MN 55905, USA, Email:[email protected]

SYLVAN GREEN Arizona Cancer Centre, 1515 N Campbell Ave, PO Box245024, Tucson, AZ 85724, USA, Email:[email protected]

CAROL JAGGER Trent Institute for HSR, Department of Epidemiology andPublic Health, University of Leicester, 22–28 Princess RoadWest, Leicester LE1 6TP, UK, Email: [email protected]

ANDERS KALLEN AstraZeneca R&D, Lund, Sweden, Email:[email protected]

JOHAN KARLBERG Clinical Trials Centre, Faculty of Medicine, The University ofHong Kong, Hong Kong SAR, PR China, Email:[email protected]

KYUNGMANN KIM Department of Biostatistics & Medical Informatics, Universityof Wisconsin, 600 Highland Avenue, Box 4675, Madison, WI53792, USA, Email: [email protected]

PHILIP LAVORI VACSPCC (151K), 795 Willow Road, Menlo Park, CA 940252539, USA, Email: [email protected]

P.C. LEUNG Department of Orthopaedics & Traumatology, The ChineseUniversity of Hong Kong, Room 74026, 5th Floor, ClinicalSciences Building, Prince of Wales Hospital, Shatin, HongKong, Email: [email protected]

PAUL LIMBURG Gastroenterology and Hepatology, Mayo Clinic, 200 1st StreetSW, Rochester, MN 55905, USA, Email:[email protected]

Page 9: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRIBUTORS ix

P.Y. LIU SWOG Statistical Centre, MP 557, Fred Hutchinson CancerResearch Centre, 1100 Fairview Avenue North, PO Box19024, Seattle, WA 98109 1024, USA, Email:[email protected]

EDWARD LO Dental Public Health, Faculty of Dentistry, The University ofHong Kong, 34 Hospital Road, Hong Kong, Email:[email protected]

DAVID MACHIN Division of Clinical Trials and Epidemiological Sciences,National Cancer Centre Singapore, 11 Hospital Drive,Singapore 169610, Singapore, Email: [email protected],[email protected]

COLAMN MCGRATH Dental Public Health, Faculty of Dentistry, The University ofHong Kong, 34 Hospital Road, Hong Kong, Email:[email protected]

COSETTA MINELLI Medical Statistics Group, Department of Epidemiology andPublic Health, University of Leicester, 22–28 Princess RoadWest, Leicester LE1 6TP, UK, Email: [email protected]

JILL MOLLISON Department of Public Health, University of Aberdeen,Polwarth Building, Foresterhill, Aberdeen AB25 2ZD, UK,Email: [email protected]

SHARON B. MURPHY Children’s Cancer Research Institute, The University of TexasHealth Science Centre at San Antonio, 7703 Floyd Curl Drive,MC 7784, San Antonio, TX 78229-3900, USA, Email:[email protected]

LUIGI NALDI Clinica Dermatologica, Ospedali Riuniti, L.go Barozzi 1,24100 Bergamo, Italy, Email: [email protected]

GILDA PIAGGIO Special Programme of Research, Development and ResearchTraining in Human Reproduction, Department of ReproductiveHealth and Research, World Health Organisation, 1211Geneva 27, Switzerland, Email: [email protected]

DANIEL SARGENT Mayo Clinic, 200 First Street, SW, Kahler 1A, Rochester, MN55905, USA, Email: [email protected]

CHARLES SCHIFFER Division of Oncology and Hematology/Oncology, KarmanosCancer Institute, Wayne State University School of Medicine,Detroit, MI 48201, USA.

ELEANOR SCHRON National Heart, Lung, and Blood Institute, Bethesda, MD20892 2482, USA, Email: [email protected]

Page 10: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

x CONTRIBUTORS

KATHERINE SHEAR Panic Anxiety and Traumatic Grief Program, WesternPsychiatric Institute and Clinic, 3811 O’Hara Street,Pittsburgh, PA 15213, USA, Email: [email protected],[email protected]

JON SHUSTER PO Box 100212, College of Medicine, University of Florida,Gainesville, FL 32610 0212, USA, Email:[email protected]

TERRY SMITH UTMD Anderson Cancer Centre, 1400 Holcombe Blvd, Box447, Houston, TX 77030, USA, Email:[email protected]

VERNON K. SONDAK University of Michigan Medical Centre, Division of SurgicalOncology, 1500 E Medical Centre Drive, 3306Cancer/Geriatrics Ctr Box 0932, Ann Arbor, MI 48109 0932,USA, Email: [email protected]

NICHOLAS TARRIER University of Manchester, Academic Division of ClinicalPsychology Education & Research Building, WythenshaweHospital, Southmoor Road, Manchester M23 9LT, UK, Email:[email protected]

LEON THAL Department of Neurosciences, University of California, SanDiego, 9500 Gilman Drive, San Diego, CA 92037, USA,Email: [email protected]

RONALD THOMAS Department of Family and Preventative Medicine, andNeurosciences, UCSD 9500 Gilman Drive, La Jolla, CA92039 0645, USA, Email: [email protected]

MAY C.M. WONG Dental Public Health, Faculty of Dentistry, The University ofHong Kong, 34 Hospital Road, Hong Kong, Email:[email protected]

TIL WYKES Department of Psychiatry, De Crespigny Park, London SE58AF, UK, Email: [email protected]

Page 11: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Preface

This Textbook of Clinical Trials is not a textbookof clinical trials in the traditional sense. Rather, itcatalogues in part both the impact of clinical tri-als – particularly the randomised controlled trial–on the practice of medicine and allied fieldsand on the developments and practice of medicalstatistics. The latter has evolved in many waysthrough the direct needs of clinical trials and theconsequent interaction of statistical and clinicaldisciplines. The impact of the results from clin-ical trials, particularly the randomised controlledtrial, on the practice of clinical medicine andother areas of health care has been profound. Inparticular, they have provided the essential under-pinning to evidence-based practice in many disci-plines and are one of the key components for reg-ulatory approval of new therapeutic approachesthroughout the world.

Probably the single most important contribu-tion to the science of comparative clinical trialswas the recognition, more than 50 years ago, thatpatients should be allocated to the options underconsideration at random. This was the founda-tion for the science of clinical trial research andplaced the medical statistician at the centre of theprocess. Although the medical statistician may beat the centre, he or she is by no means alone.Indeed the very nature of clinical trial research ismultidisciplinary so that a ‘team’ effort is always

needed from the concept stage, through design,conduct, monitoring and reporting.

Some of the developments impacting on clini-cal trials have been truly statistical in nature, forexample Cox’s proportional hazards model, whileothers such as the intention-to-treat (ITT) princi-ple are – in some sense – based more on expe-rience. Other important statistical developmentshave not depended on technical advancement,but rather on conceptual advancement, such asthe now standard practice of reporting confidenceintervals rather then relying solely on p-valuesat the interpretation stage. Of major importanceover this same time period has been the expansionin data processing capabilities and the range ofanalytical possibilities only made possible by thetremendous development in computing power.However, despite many advances, the majorityof randomised controlled trials remain simple indesign – most often a comparison between tworandomised groups.

On the medical side there have been manychanges including new diseases that raisenew issues. Thus, as we write, SARS hasemerged: the final extent of the epidemicis unknown, diagnosis is problematical andno specific treatment is available. In moreestablished diseases there have been majoradvances in the types of treatment available, bethey in surgical technique, cancer chemotherapy

Page 12: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

xii PREFACE

or psychotropic drugs. Advances in medicaland associated technologies are not confinedto curative treatments but extend, for example,to diagnostic methods useful in screening fordisease, vaccines for disease prevention, drugsand devices for female and male contraception,and pain relief and psychological supportstrategies in palliative care.

Clinical trials imply some intervention affect-ing the subjects who are ultimately recruited intothem. These subjects will range from the veryhealthy, perhaps women of a relatively young agerecruited to a contraceptive development trial,to those (perhaps elderly) patients in terminaldecline from a long-standing illness. Each groupstudied in a clinical trial, from unborn child toaged adult, brings its own constraint on the ulti-mate design of the trial in mind. So too does therelative efficacy of the current standard. If theoutcome is death and the prognosis poor, thenbolder steps may be taken in the choice of treat-ments to test. If the disease is self-limiting or

the outcome cosmetic then a more conservativeapproach to treatment options would be justified.

In all this activity the choice of clinical trialdesign and its ultimate conduct are governedby essential ethical constraints, the willingnessof subjects to consent to the trial in questionand their right to withdraw from the trial shouldthey wish.

Thus the Textbook of Clinical Trials addressessome of these and many other issues as theyimpact on patients with cancer, cardiovasculardisease, dermatological, dental, mental and oph-thalmic health, gynaecology and respiratory dis-eases. In addition, chapters deal with issues relat-ing to complementary medicine, contraceptionand special issues in children and special issuesin older patients. A brief history of clinical tri-als and a summary of some pertinent statisticalissues are included.

David Machin, Simon Day and Sylvan Green

Page 13: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INTRODUCTION

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 14: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

1

Brief History of Clinical Trials

S. DAY1 AND F. EDERER2

1Licensing Division, Medicines and Healthcare products Regulatory Agency, London, UK2The EMMES Corporation, Rockville, MD 20850, USA

The modern-day birth of clinical trials is usuallyconsidered to be the publication in 1948 bythe UK Medical Research Council1 of a trialfor the treatment of pulmonary tuberculosis withstreptomycin. However, earlier but less welldocumented examples do exist. The comparativeconcept of assessing therapeutic efficacy hasbeen known from ancient times. Slotki2 cites adescription of a nutritional experiment involvinga control group in the Book of Daniel from theOld Testament:

Then Daniel said to the guard whom the masterof the eunuchs had put in charge of Hananiah,Mishael, Azariah and himself, ‘Submit to us thistest for ten days. Give us only vegetables to eatand water to drink; then compare our looks withthose of the young men who have lived on thefood assigned by the king, and be guided in yourtreatment of us by what you see.’ The guard listenedto what they said and tested them for ten days. Atthe end of ten days they looked healthier and werebetter nourished than all the young men who hadlived on the food assigned them by the king. Sothe guard took away the assignment of food andthe wine they were to drink, and gave them onlythe vegetables.

Daniel lived around the period 800BC and althoughit may not be possible to confirm the accuracyof the account, what is clear is that when thispassage was written–around 150BC–the ideascertainly existed.

The passage from Daniel describes not just acontrol group, but a concurrent control group.This fundamental element of clinical research didnot begin to be widely practised until the latterhalf of the twentieth century.

Much later than the book of Daniel, but stillvery early, is an example from the fourteenthcentury: it is a letter from Petrarch to Boccacetocited by Witkosky:3

I solemnly affirm and believe, if a hundredor a thousand of men of the same age, sametemperament and habits, together with the samesurroundings, were attacked at the same time by thesame disease, that if one followed the prescriptionsof the doctors of the variety of those practicing atthe present day, and that the other half took nomedicine but relied on Nature’s instincts, I have nodoubt as to which half would escape.

The Renaissance period (fourteenth to sixteenthcenturies) provides other examples including an

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 15: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

4 TEXTBOOK OF CLINICAL TRIALS

unplanned experiment in the treatment of battle-field wounds. Packard4 describes how the surgeonAmbroise Pare was using the standard treatmentof pouring boiled oil over soldiers’ wounds dur-ing a battle to capture the castle of Villaine in 1537.When he ran out of oil, he resorted to a mixture ofegg yolks, oil of roses and turpentine. The supe-riority of the new ‘treatment’ became evident thenext day:

I raised myself very early to visit them, whenbeyond my hope I found those to whom I appliedthe digestive medicament feeling but little pain,their wounds neither swollen nor inflamed, andhaving slept through the night. The others to whomI had applied the boiling oil were feverish withmuch pain and swelling about their wounds. ThenI determined never again to burn thus so cruelly byarquebusses.

Perhaps the most famous historical example ofa planned controlled, comparative, clinical trialis from the eighteenth century: that where Lind5

found oranges and lemons to be the most effectiveof six dietary treatments for scurvy on board ships.He wrote:

On the 20th of May, 1747, I took twelve patientsin the scurvy, on board the Salisbury at sea. Theircases were as similar as I could have them. They allin general had putrid gums, the spots and lassitude,with weakness of their knees. They lay togetherin one place, being a proper apartment for thesick in the fore-hold; and had one diet commonto all, viz. water-gruel sweetened with sugar inthe morning; fresh mutton-broth often times fordinner; at other times puddings, boiled biscuit withsugar etc. And for supper, barley and raisins, riceand currants, sago and wine, or the like. Two ofthese were ordered each a quart of cyder a day.Two others took twenty-five gutts of elixir vitriolthree times a day, upon an empty stomach; using agargle strongly acidulated with it for their mouths.Two others took two spoonfuls of vinegar threetimes a day, upon an empty stomach; having theirgruels and their other food well acidulated with it,as also the gargle for their mouths. Two of theworst patients, with the tendons in the ham rigid(a symptom none of the rest had) were put undera course of sea-water. Of this they drank half apint every day, and sometimes more or less as it

operated, by way of gentle physic. Two others hadeach two oranges and one lemon given them everyday. These they eat with greediness, at differenttimes, upon an empty stomach. They continued butsix days under this course, having consumed thequantity that could be spared. The two remainingpatients, took the bigness of a nutmeg three timesa day of an electuary recommended by a hospital-surgeon, made of garlic, mustard-feed, rad. raphan,balsam of Peru, and gum myrrh; using for commondrink barley-water well acidulated with tamarinds;by a decoction of which, with the addition ofcremor tartar, they were greatly purged three orfour times during the course.

The consequence was, that the most sudden andvisible good effects were perceived from the useof the oranges and lemons; one of those who hadtaken them, being at the end of six days fit forduty. The spots were not indeed at that time quiteoff his body, or his gums sound; without any othermedicine, than a gargle of elixir vitriol, he becamequite healthy before we came into Plymouth, whichwas on the 16th June. The other was the bestrecovered of any in his condition; and being nowdeemed pretty well, was appointed nurse, to the restof the sick.

Pierre-Charles-Alexandre Louis, a nineteenth-century clinician and pathologist, introduced thenumerical aspect to comparing treatments.6 Hisidea was to compare the results of treatments ongroups of patients with similar degrees of disease,i.e. to compare ‘like with like’:

I come now to therapeutics, and suppose that youhave some doubt as to the efficacy of a particularremedy: How are you to proceed?. . .You wouldtake as many cases as possible, of as similar adescription as you could find, and would counthow many recovered under one mode of treatment,and how many under another; in how short a timethey did so, and if the cases were in all respectsalike, except in the treatment, you would have someconfidence in your conclusions; and if you werefortunate enough to have a sufficient number offacts from which to deduce any general law, itwould lead to your employment in practice of themethod which you had seen oftenest successful.

‘Like with like’ was an important step forwardfrom Lind’s treatment of scurvy. Note, although

Page 16: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BRIEF HISTORY OF CLINICAL TRIALS 5

early in Lind’s passage he says that: ‘Their caseswere as similar as I could have them’, laterhe acknowledges that the two worst cases bothreceived the same treatment: ‘Two of the worstpatients, with the tendons in the ham rigid (asymptom none of the rest had) were put undera course of sea-water’. It remained for BradfordHill, more than a century later, to use a formalmethod for creating groups of cases that were ‘inall respects alike, except in the treatment’.

RANDOMISATION

The use of randomisation was a contribution bythe statistician R.A. Fisher in agriculture (see, forexample, Fisher,7 Fisher and McKenzie8). Fisherrandomised plots of crops to receive differenttreatments and in clinical trials there were earlyschemes to use ‘group randomisation’: patientswere divided into two groups and then thetreatment for each group was randomly selected.The Belgian medicinal chemist van Helmont9

described an early example of this:

Let us take out of the hospitals, out of the Camps,or from elsewhere, 200, or 500 poor People thathave Fevers, Pleurisies, &c, Let us divide them intohalfes, let us cast lots, that one half of them mayfall to my share, and the others to yours,. . . we shallsee how many funerals both of us shall have: Butlet the reward of the contention or wager, be 300florens, deposited on both sides.

Amberson and McMahon10 used group randomi-sation in a trial of sanocrysin for the treatmentof pulmonary tuberculosis. Systematic assign-ment was used by Fibiger,11 who alternatelyassigned diphtheria patients to serum treatmentor an untreated control group. Alternate assign-ment is frowned upon today because knowledgeof the future treatment allocations may selectivelybias the admission of patients into the treatmentgroup.12 Diehl et al.13 reported a common coldvaccine study with University of Minnesota stu-dents as subjects where blinding and randomassignment of patients to treatments appears tohave been used:

At the beginning of each year. . . students wereassigned at random. . . to a control group or anexperimental group. The students in the controlgroups. . . received placebos. . . All students thoughtthey were receiving vaccines. . . Even the physi-cians who saw the students. . . had no informationas to which group they represented.

However Gail14 points out that although thisappears to be a randomised clinical trial, a furtherunpublished report by Diehl clarifies that this isanother instance of systematic assignment:

At the beginning of the study, students whovolunteered to take these treatments were assignedalternately and without selection to control groupsand experimental groups.

Bradford Hill, in the study of streptomycin inpulmonary tuberculosis,1 used random samplingnumbers in assigning treatments to subjects, sothat the subject was the unit of randomisation. Thisstudy is now generally acknowledged to be the‘first properly randomised clinical trial’.

Later Bradford Hill and the British MedicalResearch Council continued with further ran-domised trials: chemotherapy of pulmonary tuber-culosis in young adults,15 antihistaminic drugsin the prevention and treatment of the commoncold,16 cortisone and aspirin in the treatmentof early cases of rheumatoid arthritis17,18 andlong-term anticoagulant therapy in cerebrovascu-lar disease.19

In America, the National Institutes of Healthstarted its first randomised trial in 1951. It wasa National Heart Institute study of adrenocorti-cotropic hormone (ACTH), cortisone and aspirinin the treatment of rheumatic heart disease.20 Thiswas followed in 1954 by a randomised trial ofretrolental fibroplasia (now known as retinopathyof prematurity), sponsored by the National Insti-tute of Neurological Diseases and Blindness.21

During the four decades following the pioneer-ing trials of the 1940s and 1950s, there was alarge growth in the number of randomised trialsnot only in Britain and the US, but also in Canadaand mainland Europe.

Page 17: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

6 TEXTBOOK OF CLINICAL TRIALS

BLINDING

The common cold vaccine study published byDiehl et al.,13 cited earlier, in which Universityof Minnesota students were alternately assignedto vaccine or placebo, was a masked (or blinded)clinical trial:

All students thought they were receiving vaccines. . .Even the physicians who saw the students. . . had noinformation as to which group they represented.

Blinding was used in the early Medical ResearchCouncil trials in which Bradford Hill was involved.Thus, in the first of those trials, the study of strep-tomycin in tuberculosis,1 the X-ray films wereviewed by two radiologists and a clinician, eachreading the films independently and not knowingif the films were of C (control, bed-rest alone) orS (streptomycin and bed-rest) cases.

Bradford Hill22 (Reproduced with permission)noted in respect of using such blinding andrandomisation:

If [the clinical assessment of the patient’s progressand of the severity of the illness] is to be usedeffectively, without fear and without reproach, thejudgements must be made without any possibilityof bias, without any overcompensation for anypossible bias, and without any possible accusationof bias.

Not simply overcoming bias, but overcomingany possible accusation of bias is an importantjustification for blinding and randomisation.

In the second MRC trial, the antihistaminecommon cold study,16 placebos, indistinguishablefrom the drug under test, were used. Here,Bradford Hill noted:

. . . in [this] trial. . . feelings may well run high. . .

either of the recipient of the drug or the clinicalobserver, or indeed of both. If either were allowedto know the treatment that had been given, I believethat few of us would without qualms accept thatthe drug was of value–if such a result came out ofthe trial.

ETHICS

Experimentation in medicine is as old as medicineitself. Some experiments on humans have beenconducted without concern for the welfare ofthe subjects, who have often been prisoners ordisadvantaged people. Katz23 provides examplesof nineteenth-century studies in Russia and Ire-land of the consequences of infecting people withsyphilis and gonorrhoea. McNeill24 describeshow during the same period in the US, physiciansput slaves into pit ovens to study heat stroke, andpoured scalding water over them as an experi-mental cure for typhoid fever. He even describeshow one slave had two fingers amputated in a‘controlled trial’, one finger with anaesthesia andone without!

Unethical experiments on human beings havecontinued into the twentieth century and havebeen described by, for example, Beecher,25

Freedman26 and McNeil.24 In 1932 the US Pub-lic Health Service began a study in Tuskegee,Alabama, of the natural progression of untreatedsyphilis in 400 black men. The study continueduntil 1972, when a newspaper reported that thesubjects were uninformed or misinformed aboutthe purpose of the study.26 Shirer,27 amongst oth-ers, describes how during the Nazi regime from1933 to 1945, German doctors conducted exper-iments, mainly on Jews, but also on Gypsies,mentally disabled persons, Russian prisoners ofwar and Polish concentration camp inmates. TheNazi doctors were later tried for their atrocities in1946–1947 at Nuremberg and this led to the writ-ing, by three of the trial judges, of the NurembergCode, the first international effort to lay downethical principles of clinical research.28 Principle1 of the Nuremberg Code states:

The voluntary consent of the human subject isabsolutely essential. This means that the personinvolved should have legal capacity to give consent;should be so situated as to be able to exercisefree power of choice, without the intervention ofany element of force, fraud, deceit, duress, over-reaching, or other ulterior form of constraint orcoercion; and should have sufficient knowledge andcomprehension of the elements of the subject matter

Page 18: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BRIEF HISTORY OF CLINICAL TRIALS 7

involved as to enable him to make an understandingand enlightened decision.

Other principles of the Code are that experimentsshould yield results for the good of society,that unnecessary suffering and injury should beavoided, and that the subject should be free towithdraw from the experiment at any time andfor any reason.

Other early advocates of informed consentwere Charles Francis Withington and WilliamOsler. Withington29 realised, the ‘possible con-flict between the interests of medical science andthose of the individual patient’, and concludedin favour of ‘the latter’s indefensible rights’.Osler30 insisted on informed consent in medi-cal experiments. Despite this early advocacy, andthe 1946–1947 Nuremberg Code, the applicationof informed consent to medical experiments didnot take hold until the 1960s. Hill,31 based onhis experience in a number of early randomisedclinical trials sponsored by the Medical ResearchCouncil, believed that it was not feasible to drawup a detailed code of ethics for clinical trialsthat would cover the variety of ethical issues thatcame up in these studies, and that the patient’sconsent was not warranted in all clinical tri-als. Gradually the medical community came torecognise the need to protect the reputation andintegrity of medical research and in 1955 a humanexperimentation code was adopted by the Pub-lic Health Council in the Netherlands.32 Later,in 1964, the World Medical Assembly issued theDeclaration of Helsinki33 essentially adopting theethical principles of the Nuremberg Code, withconsent being ‘a central requirement of ethicalresearch’.34 The Declaration of Helsinki has beenupdated and amended several times: Tokyo, 1975;Venice, 1983; Hong Kong, 1989; Cape Town,1996; and Edinburgh, 2000.

DATA MONITORING

In the modern randomised clinical trial, the accu-mulating data are usually monitored for safetyand efficacy by an independent data monitoring

committee. In 1968 the first such committee wasestablished, serving the Coronary Drug Project,a large multicentre trial sponsored in the UnitedStates by the National Heart Institute of theNational Institutes of Health.35,36 In 1967, after apresentation of interim outcome data by the studyco-ordinators to all participating investigators ofthe Coronary Drug Project, Thomas Chalmersaddressed a letter to the policy board chairmanexpressing concern:

that knowledge by the investigators of early nonsta-tistically significant trends in mortality, morbidity,or incidence of side effects might result in someinvestigators–desirous of treating their patients inthe best possible manner, i.e., with the drug thatis ahead–pulling out of the study or unblindingthe treatment groups prematurely. (Canner35, repro-duced with permission from Elsevier)

Following this, a data and safety monitoring com-mittee was established for the Coronary DrugProject. It consisted of scientists who were not con-tributing data to the study, and thereafter the prac-tice of sharing accumulating outcome data with thestudy’s investigators was discontinued. The datasafety and monitoring committee assumed respon-sibility for deciding when the accumulating datawarranted changing the study protocol or termi-nating the study.

The first formal recognition of the need forinterim analyses, and the recognition that suchanalyses affect the probability of the type Ierror, came with the publication in the 1950sof papers on sequential clinical trials by Bross37

and Armitage.38 The principal advantage of asequential trial over a fixed sample size trial, isthat when the length of time needed to reach anendpoint is short, e.g. weeks or months, the samplesize required to detect a substantial benefit fromone of the treatments is less.

In the 1970s and 1980s solutions to interim anal-ysis problems came about in the form of group se-quential methods and stochastic curtailment.39 – 41

In the group sequential trial, the frequency ofinterim analyses is usually limited to a smallnumber, say between three and six. The Pocockboundaries42 use constant nominal significance

Page 19: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

8 TEXTBOOK OF CLINICAL TRIALS

levels; the Haybittle–Peto boundary43,44 usesstringent significance levels for all except the finaltest; in the O’Brien–Fleming method,45 stringencygradually decreases; in the method by Lan andDeMets,46 the total type I error probability is grad-ually spent in a manner that does not require thetiming of analyses to be prespecified. More detailsof these newer methods in the development of clin-ical trials are given in the next chapter.

Recent years have seen a huge increase in thenumbers of trials carried out and published, andin the advancement of methodological aspectsrelating to trials. Whilst many see the birthof clinical trials (certainly in their modern-dayguise) as being the Medical Research Councilstreptomycin trial,1 there remains some contro-versy (see, for example, D’Arcy Hart,47,48 Gill,49

and Clarke50). However, it is interesting to notethat one of the most substantial reviews of histor-ical aspects of trials is based on work for a 1951M.D. thesis.51 Bull cites 135 historical examplesand other supporting references–but no mentionof Bradford Hill and the Medical Research Coun-cil. The modern-day story of clinical trials per-haps begins where Bull ended.

ACKNOWLEDGEMENTS

This chapter is based heavily on work byF. Ederer in: Armitage P, Colton T, eds, Ency-clopedia of Biostatistics. Chichester: John Wiley& Sons (1998).

REFERENCES

1. Medical Research Council. Streptomycin treat-ment of pulmonary tuberculosis. Br Med J (1948)2: 769–82.

2. Slotki JJ. Daniel, Ezra, Nehemiah, Hebrew Textand English Translation with Introductions andCommentary Soncino Press: London (1951) 1–6.

3. Witkosky SJ. The Evil That Has Been Said ofDoctors: Extracts From Early Writers (translatedwith annotations by T.C. Minor). The CincinnatiLancet-Clinic, Vol. 41, New Series Vol. 22 (1889)447–8.

4. Packard FR. The Life and Times of Ambroise Pare,2nd edn. New York: Paul B. Hoeber (1921)27,163.

5. Lind J. A Treatise of the Scurvy. Edinburgh: SandsMurray Cochran (1753) 191–3.

6. Louis PCA. The applicability of statistics to thepractice of medicine. London Med Gazette (1837)20: 488–91.

7. Fisher RA. The arrangement of field experiments.J Min Agric (1926) 33: 503–13.

8. Fisher RA, McKenzie WA. Studies in crop varia-tion: II. The manurial response of different potatovarieties. J Agric Sci (1923) 13: 315.

9. van Helmont JB. Oriatrike or Physik Refined(translated by J. Chandler). London: LodowickLoyd (1662).

10. Amberson B, McMahon PM. A clinical trial ofsanocrysin in pulmonary tuberculosis. Am RevTuberc (1931) 24: 401.

11. Fibiger I. Om Serum Behandlung of Difteri.Hospitalstidende (1898) 6: 309–25, 337–50.

12. Altman DG, Bland JM. Treatment allocation incontrolled trials: why randomise? Br Med J (1999)318: 1209.

13. Diehl HS, Baker AB, Cowan DW. Cold vaccines:an evaluation based on a controlled study. J AmMed Assoc (1938) 111: 1168–73.

14. Gail MH. Statistics in action. J Am Stat Assoc(1996) 91: 1–13.

15. Medical Research Council. Chemotherapy of pul-monary tuberculosis in young adults. Br Med J(1952) i: 1162–8.

16. Medical Research Council. Clinical trials of anti-histaminic drugs in the prevention and treatmentof the common cold. Br Med J (1950) ii:425–31.

17. Medical Research Council. A comparison ofcortisone and aspirin in the treatment of earlycases of rheumatoid arthritis – I. Br Med J (1954)i: 1223–7.

18. Medical Research Council. A comparison ofcortisone and aspirin in the treatment of earlycases of rheumatoid arthritis – II. Br Med J (1955)ii: 695–700.

19. Hill AB, Marshall J, Shaw DA. A controlled clin-ical trial of long-term anticoagulant therapy incerebrovascular disease. Q J Med (1960) 29(NS):597–608.

20. Rheumatic Fever Working Party. The evolutionof rheumatic heart disease in children: five-yearreport of a co-operative clinical trial of ACTH,cortisone, and aspirin. Circulation (1960) 22:505–15.

21. Kinsey VE. Retrolental fibroplasia. AMA ArchOphthalmol (1956) 56: 481–543.

Page 20: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BRIEF HISTORY OF CLINICAL TRIALS 9

22. Hill AB. Statistical Methods in Clinical andPreventative Medicine. Edinburgh: Livingstone(1962).

23. Katz J. The Nuremberg Code and the NurembergTrial. A reappraisal. J Am Med Assoc (1996) 276:1662–6.

24. McNeill PM. The Ethics and Politics of HumanExperimentation. Cambridge: Press Syndicate ofthe University of Cambridge (1993).

25. Beecher HK. Ethics and clinical research. NewEngl J Med (1966) 274: 1354–60.

26. Freedman B. Research, unethical. In: Reich WT,ed., Encyclopedia of Bioethics. New York: FreePress (1995) 2258–61.

27. Shirer WL. The Rise and Fall of the Third Reich.New York: Simon and Schuster (1960).

28. US Government Printing Office. Trials of WarCriminals before the Nuremberg Military Tri-bunals under Control Council Law No. 10, Vol. 2.Washington: US Government Printing Office(1949) 181–2.

29. Withington CF. Time Relation of Hospitals toMedical Education. Boston: Cupples Uphman(1886).

30. Osler W. The evolution of the idea of experiment.Trans Congr Am Phys Surg (1907) 7: 1–8.

31. Hill AB. Medical ethics and controlled trials. BrMed J (1963) 1: 1043–9.

32. Netherlands Minister of Social Affairs and Health.4 World Med J (1957) 299–300.

33. World Medical Assembly. Declaration of Helsinki:recommendations guiding physicians in biomed-ical research involving human subjects. WorldMedical Assembly, Helsinki (1964).

34. Faden RR, Beauchamp T, King NMP. A Historyof Informed Consent. New York: Oxford Univer-sity Press (1986).

35. Canner P. Monitoring of the data for adverseor beneficial treatment effects. Contr Clin Trials(1983) 4: 467–83.

36. Friedman L. The NHLBI model: a 25-year history.Stat Med (1993) 12: 425–31.

37. Bross I. Sequential medical plans. Biometrics(1952) 8: 188–295.

38. Armitage P. Sequential tests in prophylactic andtherapeutic trials. Q J Med (1954) 23: 255–74.

39. Armitage P. Interim analysis in clinical trials. StatMed (1991) 10: 925–37.

40. Friedman LM, Furberg CD, DeMets DL. Funda-mentals of Clinical Trials, 2nd edn. Boston: Wright(1985).

41. Pocock SJ. Clinical Trials: A Practical Approach.Chichester: John Wiley & Sons (1983).

42. Pocock SJ. Group sequential methods in thedesign and analysis of clinical trials. Biometrika(1977) 64: 191–9.

43. Haybittle JL. Repeated assessment of results ofcancer treatment. Br J Radiol (1971) 44: 793–7.

44. Peto R, Pike MC, Armitage P, Breslow NE, CoxDR, Howard SV, Mantel N, McPherson K, Peto J,Smith P. Design and analysis of randomized clin-ical trials requiring prolonged observation of eachpatient. 1. Introduction and design. Br J Cancer(1976) 34: 585–612.

45. O’Brien PC, Fleming TR. A multiple testing pro-cedure for clinical trials. Biometrics (1979) 35:549–56.

46. Lan KKG, DeMets DL. Discrete sequential bound-aries for clinical trials. Biometrika (1983) 70:659–63.

47. D’Arcy Hart P. History of randomised controlledtrials (letter to the Editor). The Lancet (1972) i:965.

48. D’Arcy Hart P. Early controlled clinical trials(letter to the Editor). Br Med J (1996) 312: 378–9.

49. Gill DBEC. Early controlled trials (letter to theEditor). Br Med J (1996) 312: 1298.

50. Clarke M. Early controlled trials (letter to theEditor). Br Med J (1996) 312: 1298.

51. Bull JP. The historical development of clinicaltherapeutic trials. J Chron Dis (1959) 10: 218–48.

Page 21: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

2

General IssuesDAVID MACHIN

National Cancer Centre, Singapore and United Kingdom Children’s Cancer Study Group,University of Leicester, UK

INTRODUCTION

Just as in any other field of scientific andmedical research, the choice of an appropriatedesign for a clinical trial is a vital element.In many circumstances, and for many of thetrials described in this text, these designs maynot be overly complicated. For example, alarge majority will compare two therapeuticor other options in a parallel group trial. Inwhich case the analytical methods used fordescription and analysis too may not be overlycomplicated. The vast majority of these aredescribed in basic medical statistics textbooksand implemented in standard software packages.Nevertheless there are circumstances in whichmore complex designs, such as sequential trials,are utilised and for which specialist methods arerequired. There are also, often rather complex,statistical problems associated with monitoringthe progress of clinical trials, their interimanalysis, stopping rules for early closure and thepossibility of extending recruitment beyond thatinitially envisaged.

Although the clinical trial itself may not beof complex design in the statistical sense, the

associated trial protocol should carefully describe(and in some detail) the elements essential forits conduct. Thus the protocol will describethe rationale for the trial, the eligible groupof patients or subjects, the therapeutic optionsand their modification should the need arise. Itwill also describe the method of patient allo-cation to these options, the specific clinicalassessments to be made and their frequency,and the major endpoints to be used for eval-uation. It will also include a justification ofthe sample size chosen, an indication of theanalytical techniques to be used for summaryand comparisons, and the proforma for datacollection.

Of major concern in all aspects of clinical trialdevelopment and conduct is the ethical necessitywhich is written into the Declaration of Helsinkiof 19641 to ensure the well-being of the patientsor subjects under study. This in itself requiresthat clinical trials are well planned and conductedwith due concern for the patient’s welfare andsafety. It also requires that the trial is addressingan important question, the answer to which willbring eventual benefit to the patients themselvesor at least to future patients.

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 22: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

12 TEXTBOOK OF CLINICAL TRIALS

EVIDENCE-BASED MEDICINE

In many circumstances trials have been con-ducted that are unrealistically small, some unnec-essarily replicated while others have not beenpublished as their results have not been consid-ered of interest. It has now been recognised thatto obtain the best current evidence with respectto a particular therapy, all pertinent clinical trialinformation needs to be obtained, and if circum-stances permit, the overview is completed by ameta-analysis of the trial results. This recogni-tion has led to the Cochrane Collaboration and aworldwide network of overview groups address-ing numerous therapeutic questions.2 In certainsituations this has brought definitive statementswith respect to a particular therapy. For othersit has led to the launch of large-scale confirma-tory trials.

Although it is not appropriate to review themethodology here, it is clear that the ‘overview’process has led to many changes to the wayin which clinical trial programmes have devel-oped. They have provided the basic informationrequired in planning new trials, impacted on anappropriate trial size, publication policy and veryimportantly raised reporting standards. They areimpacting directly on decisions that affect patientcare and questioning conventional wisdom inmany areas.

TYPES OF CLINICAL TRIALS

In broad terms, the types of trials conducted inhuman subjects may be divided into four phases.These phases represent the stages in, for example,the development of a new drug which requiresearly dose finding and toxicity data in man,indications of potential activity, comparisonswith a standard to determine efficacy and then(in certain circumstances) post-marketing trials.The nomenclature of Phase I, II, III and IV hasbeen developed for drug development purposesand there may or may not be exact parallels inother applications. For example, a trial to assessthe value of a health educational programme will

Table 2.1. Objectives of the trials of different phasesin the development of drug (after Day3)

Phase Objective

I The earliest types of studies that arecarried out in humans. They aretypically done using small numbers ofhealthy subjects and are to investigatepharmacodynamics, pharmacokineticsand toxicity.

II Carried out in patients, usually to findthe best dose of drug and to investigatesafety.

III Generally major trials aimed atconclusively demonstrating efficacy.They are sometimes calledconfirmatory trials and, in the contextof pharmaceuticals, typically are thestudies on which registration of a newproduct will be based.

IV Studies carried out after registration of aproduct. They are often for marketingpurposes as well as to gain broaderexperience with using the newproduct.

be a Phase III study, as will a trial comparingtwo surgical procedures for closing a cleft palate.In both these examples, any one of the PhaseI, II or IV trials would not necessarily beconducted. The objectives of each phase in atypical development programme for a drug aresummarised in Table 2.1.

Without detracting from the importance ofPhase I, II and IV clinical trials, the main focusof this text is on Phase III comparative trials.In this context, reference will often be made tothe ‘gold standard’ randomised controlled trial(RCT). This does not imply that this is the onlytype of trial worthy of conduct, but rather that itprovides a benchmark against which other trialdesigns are measured.

PHASE I AND II TRIALS

The traditional outline of a series of clinicaltrials moving sequentially through Phases I to

Page 23: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 13

IV is useful to consider in an idealistic set-ting, although in practice the sequential manneris not always followed (for reasons that willbecome clear).

Pocock4 (pp. 2–3) also gives a convenientsummary of the four phases while Temple5 givesa discussion of them with emphasis from aregulatory perspective. Whether the sequentialnature of the four phases is adhered to or not, theobjectives of each phase are usually quite clearlydefined.

As we have indicated in Table 2.1, Phase I stud-ies aim to investigate the metabolism of a drugand its pharmacodynamics and pharmacokinet-ics. Typical pharmacokinetic data would allow,for example, investigation of peak drug concen-trations in the blood, the half-life and the timeto complete clearance. Such studies will assist indefining what doses should be used and the dos-ing frequency (once daily, twice daily, hourly)for future studies. These Phase I studies (certainlythe very first ones) are almost always undertakenin healthy volunteers and would naturally be thevery first studies undertaken in humans. However,later in a drug development programme it may benecessary to study its effects in patients with spe-cific diseases, in those taking other medicationsor patients from special groups (infants, elderly,ethnic groups, pregnant women).

Most of the objectives of a Phase I study canoften be met with relatively few subjects – manystudies have fewer than 20 subjects. In essence,they are much more like closely controlledlaboratory experiments than population-basedclinical trials.

Broadly speaking, Phase II trials aim to setthe scene for subsequent confirmatory Phase IIItrials. Typically, although exceptions may occur,these will be the first ‘trials’ in patients. Theyare also the first to investigate the existenceof possible clinical benefits to those patients.However, although efficacy is important in PhaseII it may often be in the form of surrogate, forexample, tumour response rather than survivaltime in a patient with cancer. Along with efficacy,these studies will also be the first to give somedetailed data on side effects.

Although conducted in patients, Phase II trialsare typically still highly controlled and usehighly defined (often narrow) patient groups sothat extraneous variation is kept to a minimum.These are very much exploratory trials aimedat discovering if a compound can show usefulclinical results. Although it is not common,some of these trials may have a randomisedcomparison group.

NON-RANDOMISED EFFICACY STUDIES

HISTORICAL CONTROLS

In certain circumstances, when a new treatmenthas been proposed, investigators have recruitedpatients in single-arm studies. The results fromthese patients are then compared with informationon similar patients having (usually in the past)received a relevant standard therapy for thedisease in question. However, such comparisonsmay well be biased in many possible ways, suchthat it may not be reasonable to ascribe thedifference (if any) observed to the treatmentsthemselves. Nevertheless it has been argued thatusing regression models to account for possibleconfounding variables may correct such biases,6

but this is at best a very uncertain procedureand is not often advocated. Similar problemsarise if all patients are recruited prospectivelybut allocation to treatment is not made atrandom. Of course, there will be situationsin which randomisation is not feasible andthere is no alternative to the use of historicalcontrols or non-randomised prospective studies.One clear example of this is the early evaluationof the Stanford Heart Transplant Program inwhich patients could not be randomised toreceive or not a donor heart. Many carefulanalyses and reviews of this unique data sethave been undertaken and these have establishedthe value of the programme, but progress wouldhave been quicker (and less controversial) hadrandomisation been possible.

In the era of evidence-based medicine, infor-mation from non-randomised but comparative

Page 24: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

14 TEXTBOOK OF CLINICAL TRIALS

studies is categorised as providing weaker evi-dence than randomised trials (see Altman,7

p. 3279).

PHASE III CONTROLLED TRIALS

EQUIPOISE AND UNCERTAINTY

As indicated, the randomised controlled trial isthe standard against which other trial designs maybe compared. One such trial, and there are manyother examples described in subsequent chapters,compared conventional treatment, C, with acomplementary medicine alternative in patientswith severe burns.8 The complementary medicinewas termed Moist Exposed Burns Ointment(MEBO). One essential difference between thetwo treatments was that C covered the wounds(dressed) whilst MEBO left them exposed (notdressed). See Figure 2.1.

In this trial patients with severe burns wereemergency admissions requiring immediate treat-ment, so that once eligibility was confirmed,consent obtained, randomisation immediately fol-lowed and treatment was then commenced. Sucha trial is termed a two-treatment parallel groupdesign. This is the most common design for com-parative clinical trials. In these trials subjects areindependently allocated to receive one of severaltreatment options. No subject receives more thanone of these treatments.

In addition there is genuine uncertainty as towhich of the options is best for the patient. Itis this uncertainty which provides the necessary

equipoise, as described by Freedman9 and Weijeret al.10 to justify random allocation to treatmentafter due consent is given. Enkin,11 in a debatewith Weijer et al.,10 provides a counter view.

There are at least two aspects of the eligibilityrequirements that are important. The first is thatthe patient indeed has the condition (here severeburns) and satisfies all the other requirements.There must be no specific reasons why thepatient should not be included. For example, insome circumstances pregnant or lactating women(otherwise eligible) may be excluded for fearof impacting adversely either on the foetus orthe newborn child. The second is that all (heretwo) therapeutic options are equally appropriatefor this particular patient. Only if both theseaspects are satisfied should the patient be invitedto consent to participate in the trial. There will becircumstances in which a patient may be eligiblefor the trial but the attending physician feels (forwhatever reason) that one of the trial options is‘best’ for the patient. In which case the patientshould receive that option, no consent for the trialis then required and the randomisation would nottake place. In such circumstances, the clinicianshould not randomise the patient in the hope thatthe patient will receive the ‘best’ option. Then,if he or she did not, withdraw the patient fromthe trial.

The consent procedure itself will vary fromtrial to trial and will, at least to some extent,depend on local ethical regulations in the countryin which the trial is being conducted. The ideal isfully informed and written consent by the patient

Patientspresentingwith partial

degreeburns

Eligibleand

consentingsubjects

Source: Reproduced from Ang et al.,8

Randomallocation

totreatment

MEBO

Assessment

ConventionalDressing

Figure 2.1. Randomised controlled trial to compare conventional treatment and Most Exposed Burns Ointment(MEBO) for the treatment of patients with partial degree burns

Page 25: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 15

him or herself. However, departures from thismay be appropriate. For example, such departuresmay concern patients with severe burns who maybe unconscious at admission, very young childrenfor whom a proxy must be used to obtain theconsent for them, or patients with hand burns thatare so severe that they affect their ability to writetheir signature.

Clearly, during the period in which patients arebeing assessed for eligibility and their consentobtained, both the attending physician and thepatient will be fully aware of the potential optionsbeing compared in the RCT. However, neithermust be aware, at this stage, of the eventualtreatment allocation. It is important thereforethat the randomisation list, for the current aswell as for future patients, is held by a neutralthird party. In most circumstances, this shouldbe an appropriate trial office that is contactedby the responsible clinician once eligibility andconsent are obtained. This contact may be madeby telephone, fax, direct access by modem intoa trial database, email or the web – whichever isconvenient in the particular circumstance. It isthen important that therapy is instituted as soonas practicable after the randomisation is obtained.

In specific cases, the randomisation can be con-cealed within opaque and sealed envelopes whichare distributed to the centres in advance of patientrecruitment. Once a patient is deemed eligible,the envelope is taken in the order specified in aprescribed list, opened and the treatment therebyrevealed. Intrinsically, there is nothing wrongwith this process but, because of the potentialfor abuse, it is not regarded as entirely satisfac-tory. However, in some circumstances it will beunavoidable; perhaps a trial is being conductedin a remote area with poor communications. Insuch cases, every precaution should be taken toensure that the process is not compromised.

The therapeutic options should be well des-cribed within the trial protocol and details ofwhat to do, if treatment requires modificationor stopping for an individual patient should begiven. Stopping may arise either when patientsmerely refuse to take further part in the trial orfrom safety concerns with a therapy under test.

STANDARD OR CONTROL THERAPY

In the early stages of the development of a newtherapy it is important to compare this with thecurrent standard for the disease in question. Incertain circumstances, the ‘new’ therapy maybe compared against a ‘no treatment’ control.For example, in patients receiving surgery forthe primary treatment of head and neck cancerfollowed by best supportive care, the randomisedcontrolled trial may be assessing the value ofadding post-operative chemotherapy. In this casethe ‘control’ group are those who receive noadjuvant treatment, whilst the ‘test’ group receivechemotherapy. In certain circumstances, patientsmay receive a placebo control. For example,in the randomised controlled trial conductedby Chow et al.12 in those with advanced livercancer, patients are randomised to receive eitherplacebo or tamoxifen. In this trial both patientsand the attending physicians are ‘blinded’ tothe actual treatment given to individual patients.Such a ‘double-blind’ or ‘double-masked’ trialis a design that reduces any potential biasto a minimum. Such designs are not possiblehowever in many circumstances and neither arethose with a ‘no treatment’ control. In manysituations, the ‘control’ will be the current bestpractice against which the new treatment willbe compared. Should this turn out to be betterthan current practice then this, in its turn, maybecome standard practice against which futuredevelopments will be compared.

In general there will be both baseline andfollow-up information collected on all patients.The baseline (pre-randomisation) informationwill be that required to determine eligibil-ity together with other information required todescribe the patients recruited to the trial togetherwith those variables which are thought likely toinfluence prognosis. The key follow-up informa-tion will be that which is necessary to determinethe major endpoint(s) of the randomised con-trolled trial. Thus in the example of the burnspatients these may be when the unhealed bodysurface area finally closes or the size and sever-ity of the resulting scars. To establish the first ofthese, the burns areas may have to be monitored

Page 26: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

16 TEXTBOOK OF CLINICAL TRIALS

on a daily basis to determine exactly when theendpoint is achieved, whereas the latter may be asingle assessment at the anniversary of the initialburn accident. Pre-trial information on these end-points, possibly from clinical experience or otherstudies, will usually form the basis of the antic-ipated difference between treatments (the effectsize), help determine the trial size and be thevariables for the statistical comparisons.

LARGE SIMPLE TRIALS

It has become recognised over time, particularlyin the fields of cardiovascular disease and cancer,that there are circumstances where small ther-apeutic advantages may be worthwhile demon-strating. In terms of trial size, the smaller thepotential benefit, essentially the effect size, thenthe larger the trial must be in order to be rea-sonably certain that the small benefit envisagedreally exists at all.

Trials of this size, often involving manythousands of patients, are a major undertaking. Tobe practical, they must be in common diseases inorder to recruit the required numbers of patientsin a reasonable time frame. They must be testinga treatment that has wide applicability and can beeasily administered by the clinicians responsibleor the patients themselves. The treatments mustbe relatively non-toxic else the small benefit willbe outweighed by the side effects. The trialsmust be simple in structure and restricted asto the number of variables recorded, so thatthe recruiting clinicians are not overburdened bythe workload attached to large numbers of trialpatients going through the clinic. They also needto be simple in this respect, for the responsibletrial centre to cope with the large amounts ofpatient data collected.

One example of such a trial tests the value ofaspirin in patients with cardiovascular disease.13

This trial concerned a very common disease usinga very simple and low-cost treatment taken astablets with very few side effects. The resultingestimates of absolute survival gain were (asexpected) small but the benefits in public health

terms enormous. Similar types of trials havebeen conducted in patients with breast cancer,one in particular compared the three adjuvanttreatment possibilities: tamoxifen, anastrozole ortheir combination.14

INTERVENTION AND PREVENTION TRIALS

The focus so far has been on randomised con-trolled trials in patients with medical conditionsrequiring treatment or a medical interventionof some sort. Such designs do apply to situa-tions such as trials in normal healthy womenin which alternative forms of contraception arebeing tested.15 However, there are quite differentsituations in which the object of a trial is to eval-uate alternative strategies for preventing diseaseor for detecting its presence earlier than is rou-tine. For example, intervention trials to encourage‘safe sex’ to prevent the spread of AIDS or breastscreening trials to assess the value of early detec-tion of the disease on subsequent treatment andprognosis. In such cases, it may be impossible torandomise on an individual subject basis. Thus aninvestigator may have to randomise communitiesto test out different types of health promotionor different types of vaccines, when problemsof contamination or logistics, respectively, meanthat it is better to randomise a group rather thanan individual. This is the approach adopted byDonner et al.16 in a trial to compare a reducedantenatal care model with a standard care model.For this trial, because of the clustered randomisa-tion of the alternatives on a clinic-to-clinic basis,the Zelen17 single consent design was utilised.

PHASE IV TRIALS – POST-MARKETINGSURVEILLANCE

Within a regulatory framework, Phase IV trialsare generally considered as ‘post-registration’trials: that is, trials of products that already havea marketing authorisation. However, within thispost-registration period studies may be carriedout for a variety of purposes, some within

Page 27: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 17

their existing licence and others out-with thatlicence. Studies may also be undertaken incountries where a marketing authorisation hasnot been approved, in which case they areregulatory or Phase III-type studies, at leastfor that country. Studies may be undertaken toexpand the indications listed on a marketingauthorisation either for a different disease or adifferent severity of the indicated condition. Theymay be undertaken to gain more safety data fornewly registered products: this latter situation ismore usually what is considered as a true PhaseIV study.

Historically, pharmaceutical companies used tocarry out studies that were solely for market-ing purposes and answered very few (if any)research questions. These were termed ‘seedingstudies’ although with tighter ethical constraintssuch studies are now very rare, if indeed theytake place at all. Certainly the ‘hidden’ objectiveof many Phase IV studies carried out by pharma-ceutical companies may be to increase sales, butif the means of doing so is via answering a usefulscientific or medical question then this should beof benefit both to society and to the company.

Many trials organised by academic depart-ments should also be considered as Phase IVstudies. Classic examples such as the RISCGroup trials13 looking at the cardiovascular ben-efits of aspirin are studies of licensed products toexpand their use.

ALTERNATIVE DESIGNS

For illustrative purposes we have used the two-arm parallel group RCT but these designs can begeneralised to compare three or more groups asappropriate. In addition, there are other designsin common use which include 2 × 2 factorial andcrossover designs.18

MORE THAN TWO GROUPS

Although not strictly a different design, a parallelgroup trial with more than two treatments to com-pare does pose some difficulties. For example,

how is the size of a trial comparing three treat-ments, A, B and C, determined, since there arenow three possible anticipated effect sizes thatone can use for planning purposes? These corre-spond to the treatment comparisons A–B, A–C

and B –C. The number of patients required foreach of these comparisons may give three verydifferent sample sizes. It is then necessary todecide which of these will form the basis for thefinal trial recruitment target, N . The trial will thenrandomise the patients equally into the three treat-ment groups. In many circumstances, a three-armtrial will tend to require some 50% more patientsthan a two-arm trial comparing two of the threetreatments under consideration.

Once the trial has been completed, the result-ing analysis (and reporting) is somewhat morecomplex than for a simple two-group comparison.However, it is the importance of the questionsposed, rather than the ease of analysis, whichshould determine the design chosen. A goodexample of the use of such a design is the previ-ously mentioned trial in post-menopausal patientswith breast cancer in which three options arecompared.14

Nevertheless practical considerations may ruleout this choice of design. The design posesparticular problems in data monitoring. Forexample, if an early advantage appears to favourone particular treatment and this suggests the trialmight be stopped early as a consequence. Thenit may not be clear whether the randomisationbetween the other treatment groups should orshould not continue. Were the trial to stopearly, then the questions relating to the othercomparisons are unlikely to have been resolvedat this stage. Should the (reduced) trial continuethen there may be very complex issues associatedwith its analysis and reporting.

In addition, a potentially serious problem ofbias can arise. At the onset of the trial, theclinician has to assess whether or not all thethree options (A, B or C) available for treatmentare suitable for the particular patient underexamination. If any one of these were not thoughtto be appropriate (for whatever reason), then thepatient’s consent would not be sought to enter

Page 28: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

18 TEXTBOOK OF CLINICAL TRIALS

the trial and to be randomised. Suppose later inthe trial, an interim analysis suggests recruitmentto A is no longer necessary and that arm isclosed. For future patients, the clinician now hasto assess whether or not only the two options (Bor C) are suitable for the particular patient underexamination. As a consequence, the patients nowgoing into the trial are no longer potentials for A

and hence may be somewhat different than thoseentering at the earlier stages. Although this willnot bias the final comparison between B and C, itdoes imply that there will be a bias if the patientsentering at this stage are compared with thosereceiving A.

In the above, we have assumed that thethree treatments A, B and C are, in a sense,unrelated albeit all suitable for the patients inmind. However, if they were related, for example,perhaps three doses of the same drug, then anote of this structure may change the approachto design from that outlined here.

FACTORIAL DESIGNS

In a trial conducted by Chan et al.19 patientswith dyslipidaemia in visceral obesity wererandomised to either atorovastin alone, fish oilalone, both or neither (placebo) to investigatetheir influence on lipid levels. This trial may takethe form of no treatment (1), atorovastin alone(a), fish oil alone ( f) or both atorovastin and fishoil (af). These alternative options are summarisedin Figure 2.2.

In contrast to the two-group parallel design ofFigure 2.1, where MEBO is contrasted with con-ventional treatment C, the factorial design posestwo questions simultaneously. Those patientsassigned to groups I and II are compared withthose receiving III and IV, to assess the valueof fish oil. This estimates the effect of fish oiland is termed the ‘main effect’ of that treat-ment. Those assigned to I and III are comparedto those receiving II and IV to assess the maineffect of atorovastin. In most situations the finaltrial size will require fewer patients than wouldbe necessary if the two questions were posed intwo distinct parallel group trials of the format ofFigure 2.2.

Eligible andconsenting

patients withdyslipidaemia

in visceralobsesity

Randomallocation

totreatment

I - Placebo

II - Atorovastinalone

III - Fish oil alone

IV - Atorovastinand fish oil

Figure 2.2. Randomised 2 × 2 factorial trial todetermine the value of atorovastin, fish oil or bothin patients with dyslipidaemia in visceral obesity.Reproduced from Chan et al.19

In addition, the factorial design allows thecomparison of groups (I and IV) with (II andIII) and this estimates the so-called fish oil byatorovastin interaction. For example, suppose boththe main effect of fish oil alone and the maineffect of atorovastin alone prove to be beneficial inthis context, then an interaction would arise if thecombination treatment fish oil–atorovastin givesa benefit greater (or lesser) than the sum of theseconstituent parts. As is the situation here, factorialdesigns can be of a double placebo type, wheresubjects of Group I of Figure 2.2 receive doubleplacebo, one representing each treatment factor.

In principle, the 2 × 2 or 22 design can beextended to the 2k (k > 2). Circumstances wherethis kind of design may be useful are if perhapsthe first two factors are major therapeutic orcurative options and the third is a factor fora secondary question not associated directlywith efficacy but (say) to relieve pain in suchsubjects. However, the presence of a third factorof whatever type increases the complexity ofthe trial design (not itself a particularly difficultstatistical problem) which may have implicationson the patient consent procedures and the timingof randomisation(s). Nevertheless, a 23 designin a trial of falls prevention in the elderlyhas been successfully conducted by Day et al.20

Piantadosi21 describes several examples of the useof these designs.

However, Green22 has issued a cautionary notethat some of the assumptions behind the use of

Page 29: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 19

factorial designs may not be entirely justified andso any proposals for the use of such designsshould be considered carefully.

CROSSOVER TRIALS

In contrast to the design of Figure 2.1, in thecrossover trial each patient will receive bothtreatment options, one followed by the other intwo periods of time. The two treatments will begiven either in the order A followed by B (AB)or the reverse (BA). The essential features of acrossover design are summarised in Figure 2.3for the trial conducted by Hong et al.23 in patientswith erectile dysfunction.

Typically, in the two-period, two-treatmentcrossover trial, for eligible patients there is a run-in stage in which the subject receives neithertreatment. At the end of this randomisation toeither AB or BA takes place. Following activetreatment in Period I (in effect either A orB depending on the randomisation), there is awash-out interval in which again no treatmentis given, after which Period II commences andthe (other) treatment of the sequence is given.The characteristics of this design, for example thepossible run-in and the wash-out period, implythat only certain types of patients in which activetreatment can be withheld in this way are suitableto be recruited. Further, there is an implicationthat the patient returns to essentially the samestate at the beginning of Period II as he or shewas in at the commencement of Period I. This

ensures that the between-treatment comparison(A v B) within the patient remains unaffectedby anything other than the change in treatmentitself and random variation. These considerationstend to restrict the applicability of the designto patients with chronic conditions such as, forexample, arthritis, asthma or migraine. Senn24

describes in careful and comprehensive detail therole of crossover designs in clinical studies.

A clear advantage of the design is that thepatient receives both options and so the analysisincludes within-patient comparisons which aremore sensitive than between-patient comparisons,implying that such trials would require fewersubjects than a parallel group design comparingthe same treatment options.

EQUIVALENCE TRIALS

In certain situations, a new therapy may bringcertain advantages over the current standard,possibly in a reduced side-effects profile, easieradministration or cost. Nevertheless, it may notbe anticipated to be better with respect to theprimary efficacy variable. Under such conditions,the new approach may be expected to be atleast ‘equivalent’ to the standard in relation toefficacy.

Trials to show that two (or more) treatments are‘equivalent’ to each other pose special problemsin design, management and analysis.25 ‘Provingthe null hypothesis’ in a significance testingscenario is never possible: the strict interpretation

Patientswith

erectiledysfunction

Randomallocation

to sequenceof

treatments

Korean redginseng

Wash-out

Placebo

Korean redginseng

Placebo

Source: Reproduced from Hong et al.,23 with permission.

Figure 2.3. Randomised placebo controlled, two-period crossover trial of Korean red ginseng in patients witherectile dysfunction

Page 30: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

20 TEXTBOOK OF CLINICAL TRIALS

when a statistically significant difference hasnot been found is that ‘there is insufficientevidence to demonstrate a difference’. Smalltrials typically fail to detect differences betweentreatment groups but not necessarily becauseno difference exists. Indeed it is unlikely thattwo different treatments will ever exert trulyidentical effects.

A level of ‘therapeutic equivalence’ should bedefined and this is a medical question, not astatistical one. For example in a study of weightgain in pre-term infants, if two treatments showmean increases in weight to within 25 g per weekthen they may be considered as therapeuticallyequivalent. Note that in this example 25 g isnot the mean weight gain that is expected perweek – we would hope that would be muchmore. But if infants receiving one feedingregimen had a mean increase of 150 g per weekthen we would consider an alternative treatmentto be equivalent if the mean weight gain werebetween 125 and 175 g per week.

Conventionally, to show a treatment difference,we would state the null hypothesis as being that

there is no difference between the treatments andthen look for evidence to refute that null hypoth-esis. In the case of equivalence we specify therange of equivalence, � (25 g per week in theabove example) and then test two null hypothe-ses. We test that the observed difference is statis-tically significantly greater than −�; and that theobserved difference is statistically significantlyless than +�. In practice it is much easier toconsider a confidence interval for the differencebetween the treatment means and draw this ona graph with the agreed limits of equivalence.Figure 2.4 shows various scenarios. Some casesshow equivalence, some fail to show equivalence;some cases show a statistically significant differ-ence, others fail to show a difference. Note thatit is quite possible to show a statistically signif-icant difference between two treatments yet alsodemonstrate therapeutic equivalence. These arenot contradictory statements but simply a real-isation that although there is evidence that onetreatment works better than another, the size ofthe benefit is so small that it has little or nopractical advantage.

Not equivalent

Uncertain

Uncertain

Uncertain

Equivalent

Equivalent

Not equivalent

Equivalent

Yes

Yes

Yes

No

Yes

Yes

Yes

No

Statisticalsignificance?

−∆ O +∆True difference

Source: Reproduced from Jones et al.,86 with permission from the BMJ Publishing Group.

Examples of possible results of using the confidence interval approach: −∆ to +∆ isthe prespecified range of equivalence; the horizontal lines correspond to possibletrial outcomes expressed as confidence intervals, with the associated significancetest result shown on the left; above each line is the decision concerning equivalence.

Figure 2.4. Schematic diagram to illustrate the concept of equivalence

Page 31: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 21

The analysis and interpretation can be quitestraightforward but the design and managementof equivalence trials is often much more complex.In general, careless or inaccurate measurement,poor follow-up of patients, poor compliance withstudy procedures and medication all tend to biasresults towards no difference. Since we are tryingto offer evidence of equivalence, poor studydesign and procedures may therefore actuallyhelp to hide treatment differences. In general,therefore, the quality of equivalence trials shouldbe demonstrably high.

One special subset of equivalence trials istermed ‘non-inferiority’ trials. Here we only wishto be sure that one treatment is ‘not worse than’or is ‘at least as good as’ another treatment: if it isbetter, that is fine (even though superiority wouldnot be required to bring it into common use).All we need is to get convincing evidence thatthe new treatment is not worse than the standard.We still have the same design and managementconsiderations but here, looking at Figure 2.4, wewould only be concerned with whether or not thelower limit of the confidence interval is greaterthan the non-inferiority margin (−�).

SEQUENTIAL TRIALS

There has been an implicit assumption in thetrial designs discussed above, that the total (andfinal) sample size is determined at the designstage and before recruitment commences to thetrial in question. This fixed sample size approachessentially implies that the data collected duringthe conduct of the trial will only be examinedfor efficacy once the trial has closed to patientaccrual. However, the vast majority of Phase IIItrials will tend to recruit patients over perhapsan extended interval of time and so informationon efficacy will be accumulated over this period.A sequential trial is one designed to utilise thisaccumulating knowledge to better effect. Perhapsto decrease the final trial size if the data areindicating an advantage to one of the treatmentsand this can be firmly established at an early stageor to extend the trial size in other circumstances.

In fact, Donaldson et al.26 give exampleswhere trials that had been conducted using a

fixed sample size approach might have beencurtailed earlier had a sequential design beenutilised. Fayers et al.27 describe the issues facedwhen designing a sequential trial using α-interferon in patients with renal carcinoma. Theaccumulating patient data from this trial crossedan early termination boundary which inferred anadvantage to α-interferon.28

A fully sequential design will monitor the trialpatient-by-patient as the information on the trialendpoint is observed from them. Alternatively,a ‘group’ sequential trial will utilise informationfrom successive groups of patients. Computerprograms to assist in the implementation of thesedesigns are available29 and a review of some ofthe issues is given by Whitehead.30

The solid lines of Figure 2.5 indicate stoppingboundaries for declaring a statistically significantdifference between treatments A and B. If thebroken boundary is crossed, the trial stops,concluding that no significant difference wasfound between the treatments. Potentially, thenumber of preferences could continue indefinitelybetween the upper solid and broken lines orbetween the lower solid and broken lines; in sucha case no conclusion would ever be reached.

In contrast to the open sequential design,the closed design of Figure 2.6 will continueto recruit to a pre-stated maximum or until aboundary is crossed. Thus this design has a finitesize and a conclusion will always be drawn.Again the solid lines indicate stopping boundariesfor declaring a statistically significant differencebetween treatments and if the broken line iscrossed, concluding that no significant differencewas found between the treatments.

For more complex designs implementing asequential option remains possible. For example,a factorial structure of the treatments does permita sequential approach to trial design as the twoquestions can be monitored separately. Neverthe-less, if one of the factors, say the chemotherapyoption in the example we discussed previously,is stopped by crossing a boundary during thesequential monitoring, then the recruiting clini-cian will only have to put two rather than fourchoices to the patients. Again, the subsequent

Page 32: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

22 TEXTBOOK OF CLINICAL TRIALS

0 10 20 30 40

Number of preferences

−20

−10

0

10

20

50

Exc

ess

pref

eren

ces

(A −

B)

A significantlybetter than B

B significantlybetter than A

No significantdifference

Source: Reproduced from Day3 (Figure 24), with permission.

Open sequential design. The solid lines indicate stopping boundaries fordeclaring a statistically significant difference between treatments A and B.If the broken boundary is crossed, then the study stops, concluding thatno significant difference was found between the treatments. Potentially thenumber of preferences could continue indefinitely between the upper solidand broken lines or between the lower solid and broken lines; in such a caseno conclusion would ever be reached.

Figure 2.5. Open sequential design

patients recruited may then be somewhat dif-ferent from those at the first stage of the trial.This will then impact on the form of the subse-quent analysis.

In contrast, the three-arm trial cannot be soeasily designed in a sequential manner althoughboth Whitehead31 and Jennison and Turnbull32

address the problem. They do not however appearto give an actual clinical example of their use.

There are however several problems associ-ated with the use of sequential designs. Theserange from difficulties of financing a trial ofuncertain size, making sure the data is fully up-to-date as the trial progresses, to the more tech-nical concerns associated with the calculation ofthe appropriate confidence intervals. However,Whitehead,31 see also Jennison and Turnbull,32

has argued very persuasively that all these objec-tions can be resolved. Nevertheless, in relative

terms the use of sequential designs is still some-what limited.

ZELEN’S DESIGNS

Although not strictly an alternative ‘design’in the sense of those of this section, Zelen’srandomised consent design combines aspects ofdesign with problems associated with obtainingconsent from patients to participate in clinicaltrials. They were motivated by the difficultiesexpressed by clinicians in obtaining consent fromwomen who they wished to recruit to trialswith breast cancer.17,33 Essentially subjects arerandomised to one of two treatment groups.Those who were randomised to the standardtreatment (conventional dressing in Figure 2.1)are all treated with it. For these patients noconsent to take part in the trial is sought. On

Page 33: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 23

0 10 20 30 40

Number of preferences

−20

−10

0

10

20

Exc

ess

pref

eren

ces

(A −

B)

A significantlybetter than B

B significantlybetter than A

No significantdifference

Source: Reproduced from Day3 (Figure 4), with permission.

Closed sequential design. The solid lines indicate stopping boundariesfor declaring a statistically significant difference between treatments Aand B. For example, if out of ten patients expressing a preference forone or other treatment, nine preferred treatment B and only one preferredA, then the study would stop, concluding that B is significantly better than A.If the broken boundary is crossed, then the study stops and the conclusionis drawn that no significant difference was found between the treatments.

Figure 2.6. Closed sequential design

the other hand, those who are randomised tothe experimental treatment (MEBO dressing inFigure 2.1) are asked for their consent; if theyagree they are treated with the experimentaltreatment; if they disagree they are treated withthe standard treatment. This is known as Zelen’ssingle consent design. An alternative is that thoserandomised to the standard treatment may alsobe asked if they accept that treatment; again,they are actually given their treatment of choice.This latter double consent design is describedin Figure 2.7; in either case, the analysis mustbe by intention-to-treat, that is based on thetreatment to which patients were randomised, notthe treatment they actually received.

The properties of these designs have beenexamined in some detail by Altman et al.34 whoconcluded that: ‘There are serious statistical argu-ments against the use of randomised consentdesigns, which should discourage their use’. In

Eligible patients

Randomise

Compare

Standard (G1)

Obtain consent

Treat withstandard

Yes

Treat withnew

Evaluate Evaluate

No

New (G2)

Obtain consent

Treat withnew

Yes

Treat withstandard

Evaluate Evaluate

No

Source: After Altman et al.34 (Figure 3).

Figure 2.7. The sequence of events to follow inZelen’s double randomised consent design, seekingconsent in conjunction with randomisation

Page 34: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

24 TEXTBOOK OF CLINICAL TRIALS

any event, they have rarely been used in prac-tice although they continue to be advocated.35

Nevertheless, in the large antenatal care trialdescribed by Donner et al.16 the Zelen singleconsent design is utilised. However, this is acluster randomised trial and the issues are some-what different.

BAYESIAN METHODS

The essence of Bayesian methodology in the con-text of the design of clinical trials is to incorpo-rate relevant information into the design process.At a later stage, Bayesian methods may assistin data monitoring (as trial data accumulate) andwith the final analysis and interpretation of the(now complete) trial data. In theory the informa-tion available and which is pertinent to the trial inquestion can be summarised into a prior distribu-tion. This may include (hard) information fromthe published literature and/or elicited clinicalopinion. The mean of such a distribution wouldcorrespond to the possible effect size which canthen be assessed by the design team as clinicallyworthwhile in their context and hence used forsample size estimation purposes. The same prior,or that prior updated from new external evidenceaccumulated during the course of the trial, maybe used by the trial Data Monitoring Committee(DMC) to help form the basis of recommenda-tions with respect to future conduct of the trial.Perhaps to close the trial as efficacy has beenclearly established or increase the planned size ofthe trial as appropriate. Finally once the trial datais complete, these can be combined with the prior(or updated prior) to obtain the posterior distribu-tion, from which Bayesian estimates of treatmenteffect and corresponding credibility intervals canbe calculated.

However, despite the feasibility of the aboveapproach few trials to date have implementeda full Bayesian approach. A review of articlesin the British Medical Journal from 1996 toNovember 1999 found no examples.7 Never-theless, Spiegelhalter et al.36 show convincinglyhow the concept of optimistic and sceptical priordistributions obtained from the clinical teams

may be of assistance in interpreting the resultsof a trial. Despite some technical difficulties, it isfairly certain that Bayesian methods will becomemore prominent in Phase II trial methodology.For example, Tan et al.37 suggest how such ideasmay be useful in a Phase II programme.

RANDOMISATION AND ALLOCATIONTO TREATMENT

We have indicated above that randomisation ofpatients to the treatment they receive is animportant part of the ‘gold’ standard. In fact it isthe key element. The object of randomisation is tohelp ensure that the final comparison of treatmentoptions is as unbiased as possible, that is, that anydifference or lack of difference observed betweentreatments in efficacy is not due to the method bywhich patients are chosen for the options understudy. For example, if the attending clinicianchose which of MEBO or C should be given toeach patient, then any differences observed maybe due, at least in part, to the selection processitself rather than to a true difference in efficacy.

Apart from the possible effect of the allocatedtreatments themselves, observed differences mayarise through the play of chance alone or possiblyan imbalance of patients with differing prognosesin the treatment groups or both. The object ofthe statistical analysis will be to take accountof any imbalance and assess the role of chance.Some of the imbalance in the major prognosticvariables may be avoided by stratifying therandomisation by prognostic group and ensuringthat an equal number of patients are allocatedwithin each stratum to each of the options. Thismay be achieved by arranging the randomisationto be balanced within predetermined blocks ofpatients within each of the strata. Blocks areusually chosen as neither too small nor too large,four or six are often used. Sometimes the blocksize, perhaps between these options, is chosenat random for successive sequences of patientswithin a stratum of patient types.

Alternatively, the balance of treatments bet-ween therapeutic options can be made using a

Page 35: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 25

dynamic allocation procedure. In such schemes,during the randomisation procedure a patient isidentified to belong to a predetermined categoryaccording to certain covariates. This categorymay, for example, be defined as those of a par-ticular age, gender and tumour stage group. Oncethe category is determined, then randomisation tothe treatment options may proceed as describedabove. One option, however, is to allocate thenext patient to the treatment with the fewestpatients already assigned within that category. Inwhich case, the allocation at that stage is deter-ministic. A better option in such circumstances isto weight the randomisation, perhaps in the ratioof 3:2 in favour of the option with the fewestpatients. Clearly, if numbers are equal, the ran-domisation would revert to 1:1.

We have implicitly assumed that, for two treat-ments, a 1:1 randomisation will take place. Forall practical purposes, this will be statisticallythe most efficient. However, the particular con-text may suggest other ratios. For example, ifthe patient pool is limited for whatever reason,then the clinical team may argue that they shouldobtain more information within the trial fromthe test treatment rather than the well-knownstandard. Perhaps, there is a concern with thetoxicity profile rather than just the efficacy perse. In such circumstances, a randomisation ratioof say 2:3 or 1:2 in favour of the test therapymay be decided. However, some loss of statisticalpower will ensue and this loss should be quan-tified before a decision on the allocation ratio isfinally made.

ENDPOINTS

DEFINING THE ENDPOINT(S)

The protocol for every clinical trial will detailthe assessments to be made on the patientsrecruited. Some of these assessments may focuson aspects of the day-to-day care of the patientwhilst others may focus more on those measureswhich will be necessary in order to determine thetrial endpoint(s) for each subject. It is importantthat these endpoints are unambiguously defined

so that they can be determined for each patientrecruited to the trial. It is good practice todefine which endpoint is the major endpoint ofthe trial as this will be used to determine trialsize and be the main focus for the efficacyevaluation. In many situations, there may beseveral endpoints of interest, but in this case itis important to order them in order of priority orat least to identify those of primary or secondaryimportance. If there are too many endpointsdefined, then the multiplicity of comparisons thenmade at the analysis stage may result in spuriousstatistical significance. This is a major concern ifendpoints for health-related quality of life andhealth economic evaluations are added to thealready established more clinical endpoints.

SINGLE MEASURES

In some trials a single measure may be sufficientto determine the endpoint in each patient. Forexample, the endpoint may be the diastolicblood pressure measured at a particular time, say28 days post-randomisation in each patient. Inthis case the treatment groups will be summarisedby the respective means. In some situations theendpoint may be patient response, for example,the patient becomes normo-tensive following aperiod of treatment. Those who respond aretermed successes and those that do not failures.In this case, the treatment groups will besummarised by the proportion of responders. If,on the other hand, the patients are categorisedas: normo-tensive, still hypotensive but diastolicblood pressure (DBP) nevertheless reduced, orstill hypotensive and DBP not improved, thenthis would correspond to an ordered categoricalvariable. Alternatively, the endpoint may bedefined as the time from randomisation andinception of treatment for the patient to becomenormo-tensive. In this situation repeated (saydaily) measures of DBP will be made untilthe value recorded is normo-tensive (as definedin the protocol). The interval between the dateof randomisation and the date of recording thefirst occurrence of a normo-tensive recording isthe endpoint measure of interest. Such data areusually summarised using survival time methods.

Page 36: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

26 TEXTBOOK OF CLINICAL TRIALS

A particular feature of time-to-event studiesoccurs when the endpoint cannot be determined.For example, in the trial monitoring the DBPit may be that a patient never becomes normo-tensive during the trial observation period. Inwhich case the time from randomisation untilthe end of the trial observation period representsthe time a patient has been under observationbut has not yet become normo-tensive. Such asurvival time is termed censored and is oftendenoted by, say, 28+, which here means thepatient has been observed for 4 weeks but stillremains hypotensive. In contrast, an observationof 28 means the patient has been observed for4 weeks and became normo-tensive on the lastobservation day.

REPEATED MEASURES

In the trial taking repeated DBP assessments,these are recorded in order to determine a singleoutcome – ‘time to becoming normo-tensive’. Inother situations, the successive values of DBPthemselves may be utilised in making the formalcomparisons. If the number of observations madeon each subject is the same, then the analysismay be relatively straightforward, perhaps usingrepeated measures analysis of variance. On theother hand, if the numbers of observationsrecorded varies or if the intervals betweensuccessive observations vary from patient topatient or if there is occasional missing data, thenthe summary and analysis of such data may bequite complex. One option is to calculate the areaunder the curve (AUC) and use this as a singlemeasure for each patient, thus avoiding the useof more complex analytical methods.38

However, the AUC method is now beingsuperseded somewhat in Phase III trials by theuse of general estimating equations and multi-level modelling. The technical details are beyondthe scope of this book but most good statisticspackages39 now include facilities for these typesof analyses. Nevertheless, these methods have notyet had much impact on the reporting of clinicaltrials, although a good example of their use hasbeen provided by Brown et al.40

QUALITY OF LIFE

In many trials, endpoints such as the percent-age of patients responding to treatment, survivaltime or direct measures such as DBP have beenused. In other situations, more psychosocial mea-sures have been utilised such as pain scores, per-haps measured using a visual analogue scale, andemotional functioning scores, perhaps assessedby patients completing a questionnaire them-selves. Such self-completed questionnaires havealso been developed to measure aspects of qual-ity of life (QoL) in patients undergoing treatmentfor their disease. One such instrument is the SF-36 of Ware and Sherbourne,41 part of which isreproduced in Figure 2.8.

The QoL domains measured by these instru-ments may then be used as the definitive end-points for clinical trials in certain circumstances.For example, in patients with terminal cancerthe main thrust of therapy may be for palliation(rather than cure) so that aspects of QoL maybe the primary concerns for any comparison ofalternative approaches to management and careof such patients. If a single aspect of this QoLmeasured at one time point is to be used forcomparison purposes, then no new principles arerequired either for trial design purposes or anal-ysis. On the other hand, and more usually, theremay be several aspects of the QoL instrumentthat may need to be compared between treat-ment groups and these features will usually beassessed over time. This is further complicatedby often-unequal numbers of assessments avail-able from each patient caused either by missingassessments in the series for a variety of reasonsrelated or unrelated to their health status or per-haps in terminal patients by their death. Fayersand Machin42 and Fairclough43 discuss these fea-tures of QoL data in some detail.

As we have discussed previously, there isalso a problem associated with the numerousstatistical tests of significance of the multipleQoL outcomes. These pose problems of inter-pretation which have also been addressed byFayers and Machin42 (Chapter 14). In short, acautious approach is needed to ensure apparently

Page 37: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 27

The SF-36TM Health Survey

Instructions for Completing the Questionnaire

EXAMPLE

Please answer every question. Some questions may look like others, but eachone is different. Please take the time to read and answer each questioncarefully by filling in the bubble that best represents your response.

This is for your review. Do not answer this question. The questionnairebegins with the section Your Health in General below.

For each question you will be asked to fill in a bubble, in each line.

Please begin answering the questions now.

1. In general, would you say your health is:

1. How strongly do you agree or disagree with each of the following statements?

2. Compared to one year ago, how would you rate your health now ?

Please turn the page and continue

© Medical Outcomes Trust and John E. Ware, Jr. —All Rights Reserved

a) I enjoy listening to music.b) I enjoy reading magazines.

Stronglyagree

Stronglydisagree

Agree DisagreeUncertain

Your Health in General

Excellent Very good

Much betternow than one

year ago

Somewhat betternow than one

year ago

Somewhatworse now than

one year ago

Much worsenow than one

year ago

About thesame as one

year ago

Good Fair Poor

Source: Reproduced from Ware and Sherbourne,41 with permission.

For permission to use contact: Dr John Ware, Medical Outcomes Trust, 20 Park PlazaSuite 1014, Boston, MA 02116-4313, USA

Figure 2.8. 36-Item Short Form Health Survey (SF-36)

Page 38: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

28 TEXTBOOK OF CLINICAL TRIALS

‘statistically significant’ differences are trulythose. One way to overcome this problem is forthe clinical protocol to rank the domains of QoLto be measured in terms of their relative impor-tance and to confine the formal statistical testsand confidence intervals to these only.

ECONOMIC EVALUATION

Most trials are intended primarily to addressquestions of efficacy. Safety is frequently animportant (though secondary) objective. Healtheconomics is becoming increasingly importantand is often now evaluated as part of a ran-domised controlled trial.

There are four main types of cost analyses thatare usually considered:

• cost minimisation, simply to determine the besttreatment to minimise the total cost of treatingthe disease;

• cost effectiveness, a trade-off between the costof caring for a patient and the level of efficacyoffered by a treatment;

• cost benefit, a trade-off between the cost ofcaring for a patient and the overall benefit (notrestricted to efficacy);

• cost utility, the trade-off between costs andall measures of ‘utility’ which may includeefficacy, quality of life, greater life expectancyor increased productivity.

One of the big difficulties with pharmacoeco-nomic evaluations is determining what indirectcosts should be considered. Direct costs are usu-ally easier: costs of medication, costs of thosegiving the care (doctors, nurses, health visitors)and the basic costs of occupation of a hospi-tal bed. Indirect costs include loss of earningsand productivity, loss of earnings and produc-tivity of spouses or other family members whomay care for a sick relative and contribution tohospital/pharmacy overhead costs. Because of theambiguity associated with these indirect costs,most pharmacoeconomic evaluations performedas part of a clinical trial tend to focus solely ondirect costs.

If we were to design a trial primarily to com-pare costs associated with different treatmentswe would follow the basic ideas of blinding andrandomisation and then record subsequent costsincurred by the patient and the health provider.A very careful protocol would be necessary todefine which costs are being considered so thatthis is measured consistently for all patients.A treatment that is not very effective might,for example, result in the patient needing morefrequent consultations. The increased physician,nurse and other paramedical personnel contacttime would then be recorded as a cost but itneeds to be clear whether patient travel costs,for example (still direct costs, but not to thehealth service) are included, or not. However,most trials are aimed primarily at assessing effi-ciency and a limitation of investigating costs ina clinical trial is that the schedule of, and fre-quency of, visits by the patient to the physicianmay be very different to what it would be inroutine clinical practice. Typically patients aremonitored more frequently and more intenselyin a trial setting than in routine clinical prac-tice. The costs recorded, therefore, in a clinicaltrial may well be different (probably greater butpossibly less) than in clinical practice. This issometimes put forward as a major objection topharmacoeconomic analyses carried out in con-junction with clinical trials. The same limitationdoes, of course, apply to efficacy evaluations: theoverall level of efficacy seen in clinical trialsis often not realised in clinical practice. How-ever, if we keep in perspective that, in a clini-cal trial, it is the relative efficacy of one treat-ment over another (even if one of them is aplacebo) then this limitation, whilst still impor-tant, can be considered less of an overall objec-tion. The same argument should be applied inpharmacoeconomic evaluations and the relativeincrease/decrease in costs of one treatment overanother can be reported.

Recommendations on trials incorporating healtheconomics assessment have been given by theBMJ Economic Evaluation Working Party.44

Neymark et al.45 discuss some of the method-ological issues as they relate to cancer trials.

Page 39: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 29

TRIAL SIZE

When designing a new trial, a realistic assessmentof the potential benefit (the anticipated effectsize) of the proposed test therapy must be madeat the onset. The history of clinical trials researchsuggests that, in certain circumstances, ratherambitious or over-optimistic views of potentialbenefit have been claimed at the design stage.This has led to trials of inadequate size for thequestions posed.

The retrospective review by Machin et al.46 ofthe published trials of the UK Medical ResearchCouncil in solid tumour cancers is summarisedin the funnel plot of Figure 2.9. The benefitobserved, as expressed by the hazard ratio (HR)for the new treatment, is plotted against thenumber of deaths reported in the trial publication.Those trials within the left-hand section of thefunnel have relatively few deaths observed andso will be of correspondingly low power. Of

these trials, some have an observed HR thatis above the hatched horizontal line. This linehas been drawn at a level that is thought torepresent a clinically worthwhile advantage to thetest treatment. These specific points would havebeen outside the funnel had they been estimatedfrom more observed deaths. Thus we mightconclude from Figure 2.9 that, had these trialsbeen larger, the corresponding treatments mayhave been observed to bring worthwhile benefit,rather than being dismissed as ‘not statistically’significant.

However, it is a common error to assume thatthe lack of statistical significance following atest of hypothesis implies no difference betweengroups. Conversely, a statistically significantresult does not necessarily imply a clinicallysignificant (important) result. Nevertheless, themessage of Figure 2.9 is that potentially usefultherapies may be overlooked if the trials aretoo small.

100 200 300 600500400

Number of deaths

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1.2

1.4

1.6

30

2.01.8

HR

Source: After Machin et al.46

BETTER

WORSE

Designed to demonstrate equivalence

Designed to demonstrate recurrence free survival

CE02 PE

BO02

BR03

LU03b BO01

BAACE01

ME

CEBHNA HNF

HNE OV01

BR01

HNB

HND

NE

PR01PR01

LU01 LU03a

HNCLU05

LU02

LU04

LU02 LU06

LU07LU08

LU09LU09

CEA CR01CR01

BR02

Figure 2.9. Retrospective review of UK Medical Research Council trials in solid tumours published prior to 1996

Page 40: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

30 TEXTBOOK OF CLINICAL TRIALS

ANTICIPATED (PLANNING) EFFECT SIZE

A major factor in determining the size of aRCT is the anticipated effect size or clinicallyworthwhile difference. In broad terms if thisis not large then it should be of sufficientclinical, scientific or public health importance towarrant the consequentially large trial that willbe required to answer the question posed. Ifthe anticipated effect is large, the RCT will berelatively small in which case the investigatorsmay need to question their own ‘optimistic’ viewof the potential benefit. In either case, a realisticview of the possible effect size is important.In practice, it is usually important to calculatesample size for a range of values of the effectsize. In this way the sensitivity of the resultingsample sizes to this range of values will provideoptions for the investigating team.

Estimates of the anticipated effect size maybe obtained from the available literature, formalmeta-analyses of related trials or may be elicitedfrom clinical opinion. In circumstances wherethere is little prior information available, Cohen47

has proposed a standardised effect size, �. In thecase when the difference between two treatmentsA and B is expressed by the difference betweentheir means (µA − µB) and σ is the standarddeviation (SD) of the endpoint variable which isassumed to be a continuous measure, then � =(µA − µB)/σ . A value of � ≤ 0.1 is considereda small standardised effect, � ≈ 0.5 as moderateand � ≥ 1 as large (see also Day3). Experiencehas suggested that in many clinical areas thesecan be taken as a good practical guide for designpurposes. However, for large simple trials, theequivalent of effects sizes as small as � = 0.05or less may be clinically important.

SAMPLE SIZE

Once the trial has been concluded, then a formaltest of the null hypothesis of no differencebetween treatments is often made. We emphasiselater that it is always important to provide anassociated confidence interval for the estimate oftreatment difference observed. The test of the nullhypothesis has an associated false positive rate

and the alternative hypothesis a false negativerate. The former is variously known also as theType I error rate, test size or significance level, α.The latter is the Type II error rate β, and 1 − β

is the power. When designing a clinical trial itis often convenient to think in hypothesis-testingterms and so set α and β and a specific effectsize � for consideration. For determining the sizeof a trial, α and β are typically taken as small,for example α = 0.05 (5%) and β = 0.1 (10%)or equivalently the power 1 − β = 0.9 (90%)is large.

If the trial is ultimately to compare the meansobtained from the two treatment groups, thenwith randomisation to each treatment in equalnumbers, the total sample size, N , is given by

N = 4(z1−α/2 + z1−β)2

�2, (2.1)

where z1−α/2 and z1−β are obtained from tablesof the standardised normal distribution for givenα and β.

If we set in equation (2.1) a two-sided α =0.05 and a power of 1 − β = 0.9, then z1−α/2 =z0.975 = 1.96 and z1−β = z0.9 = 1.2816, so thatN = 42.028/�2 ≈ 42/�2. For large, moderateand small sizes of � of 1, 0.5 and 0.1 thecorresponding sample sizes are 42, 168 and 4200,respectively. More realistically these may berounded to 50, 200 and 4500. For a large simpletrial with � = 0.05, this implies 16 000 patientsmay be recruited.

This basic equation has to be modified toadapt to the specific trial design (parallel group,factorial, crossover or sequential), the type ofrandomisation, the allocation ratio, as well as theparticular type of endpoint under consideration.Machin et al.48 provide examples for manydifferent situations.

A good clinical trial design is that which willanswer the question posed with the minimumnumber of subjects possible. An excessively largetrial not only incurs higher costs but is also uneth-ical. Too small a trial size leads to inconclusiveresults, since there is a greater chance of missingthe clinically important difference, resulting in a

Page 41: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 31

waste of resources. As Pocock4 states, this toois unethical.

MONITORING TRIAL PROGRESS

DATA MONITORING COMMITTEES

It is clear that a randomised controlled trial is amajor undertaking, which clearly involves humansubjects in the process. Thus, as we have stated,it is important that some form of equipoise inrespect to the treatments under test is requiredto justify the randomisation. However, once thetrial is in progress, information accumulates andas it does so it may be that the initial equipoisebecomes disturbed. Indeed the very point of aclinical trial is to upset the equipoise in favour ofthe best (if indeed one truly is) treatment.

Clearly there will be circumstances whensuch early information may be sufficient toconvincingly answer the question posed by thetrial. In which case the trial should close tofurther patient entry. One circumstance whenthis will arise is when the actual benefit farexceeds that which the design team envisaged.For example, Lau et al.49 stopped a trial inpatients with resectable hepatocellular carcinomaafter early results on 43 patients suggesteda substantial benefit to adjuvant intra-arterialiodine-131-labelled lipiodol. Their decision wassubsequently criticised by Pocock and White,50

who suggested the result was ‘too good to betrue’ as early stopping may yield biased estimatesof the treatment effect. A confirmatory trial isnow in progress to substantiate or refute thesefindings.51 Essentially, although very promising,in this case the trial results as published providedinsufficient evidence for other clinical teams toadopt the test therapy for their patients.

Nevertheless in this, and for the majority ofclinical trials, it is clearly important to monitorthe accumulating data. It has also been recognisedthat such monitoring should be reviewed (not bythe clinical teams involved in entering patientsinto the trial themselves) but by an independentDMC. The membership and remit of a DMC willusually depend on the particular trial(s) under

review. For example, the European Organizationfor the Research on the Treatment of Cancerhas a standing committee of three clinical andone statistical member, none of whom areinvolved in any way with the trials under review.This independent DMC reviews reports on trialprogress prepared by the data centre teams andmakes specific recommendations to the relevanttrial coordinating group. Early thoughts on thestructure of DMCs for the UK Medical ResearchCouncil Cancer Therapy Committee are providedby Parmar and Machin.52 To emphasise theimportance of ‘independence’, such committeessometimes choose the acronym IDMC.

SAFETY

Although an IDMC will be concerned withthe relative efficacy of the treatments undertest, issues of safety will also be paramount inmany circumstances. In many cases, safety issuesmay dominate the early stages of a trial whenrelatively new and untested treatment modalitiesare first put into wider use, whereas in thelater stages detailed review of safety may notbe required as no untoward experiences havebeen observed in the early stages. In contrast,serious safety issues may force a recommendationfor early closure of the trial even in situationswhere early indications of benefit in terms ofefficacy are present. Clearly the role of theIDMC or (in view of the ‘safety’ aspects) theIDMSC is to provide a balanced judgement onthese possibly conflicting aspects when makingtheir report. This judgement will derive from thecurrent evidence from the trial itself, externalevidence perhaps on new information since thetrial was inaugurated, and their own collectiveexperience.

INTERIM ANALYSIS AND EARLYSTOPPING RULES

At the planning stage of a clinical trial the designteam will be aware of the need to monitor theprogress of the trial by reports to an IDMSC. Onthese occasions the data centre responsible for the

Page 42: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

32 TEXTBOOK OF CLINICAL TRIALS

conduct of the trial will expect to prepare reportson many aspects of trial progress includingespecially safety and efficacy. This requirement isoften detailed in the trial protocol. The detail mayspecify those aspects that are likely to be of majorconcern and also the timing (often expressed interms of patient numbers or events observed) ofsuch reports.

An interim report may include a formal(statistical) comparison of treatment efficacy.This comparison will then be repeated on theaccumulating data for each IDMSC and finallyfollowing the close of the trial once the relevantdata are to hand. These repeated statistical testsraise the possibility of an increased chanceof falsely declaring a difference in efficacybetween treatments. To compensate for this,methods of adjusting for the multiple looksat the accumulating data have been devised.Many of these are reviewed by Piantadosi53

(Chapter 10).Several of these methods of interim analy-

sis also include ‘stopping rules’; that is theyincorporate procedures, or boundaries which oncecrossed by the data under review, imply that thetrial should terminate. However, all these meth-ods are predicated on obtaining timely and com-plete data, very rapid analysis and report writingand immediate review by the IDMSC.54 Theyalso focus on only one aspect (usually efficacy)and so do not provide a comprehensive view ofthe whole situation.

The nature of the essential balance requiredbetween a formal statistical approach to interimlooks at the data and the less structured natureof IDMSC decision-making is provided byAshby and Machin,55 Machin56 and Piantadosi53

(Chapter 10.8). Parmar et al.57 describe anapproach for monitoring large trials usingBayesian methods.

REPORTING CLINICAL TRIALS

The first rule after completing a clinical trialis to report the results – whether they are pos-itive, negative or equivocal. Selective reportingwhereby results of positive studies tend to be

published and negative studies tend not to bepublished presents a distorted view of the truesituation. This approach to reporting is par-ticularly important for clinical trial overviewsand meta-analysis where it is clearly impor-tant to be able to include all relevant stud-ies (not just the published ones) in the overallsynthesis.

The second aspect of reporting is the standardof reporting, particularly the amount of neces-sary detail given in any trial report. The mostbasic feature that has repeatedly been empha-sised is to give estimates (with confidence inter-vals) of treatment effects and not just p-values.Guidelines for referees (useful also for authors)have been published in several journals includ-ing the British Medical Journal.58 The Con-solidation of the Standards of Reporting Trials(CONSORT) statement is an international rec-ommendation adopted by many leading medicaljournals.59

One particular feature of the CONSORT state-ment is that the outcomes of ‘all’ patients ran-domised to a clinical trial are to be reported.Thus a full note has to be provided on those,for example, who post-randomisation refuse theallocation and perhaps then insist on the competi-tor treatment. Two examples of how the patientflow through a trial is summarised are given inFigure 2.10.

It is of some interest to note that the writingteam for Lau et al.49 were encouraged by thejournal to include information on late (post-interim analysis) randomisations in their report.It is clear that no such stipulation was requiredof the MRC Renal Cancer Collaborators.28

As indicated, the statistical guidelines referredto, and the associated checklists for statisticalreview of papers for international journals,60

require confidence intervals (CIs) to be givenfor the main results. These are intended as animportant prerequisite to be supplemented bythe p-value from the associated hypothesis test.Methods for calculating CIs are provided inmany standard statistical packages as well as thespecialist software of Altman et al.61

Page 43: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 33

350 patients randomlyassigned treatment

174 assignedinterferon-α

176 assignedMPA

167 includedin interimanalysis

7 excluded becauseentered after interim

analysis

168 includedin interimanalysis

8 excluded becauseentered after interim

analysis

116 liver resections for diagnosisof hepatocelluar carcinoma

43 eligible patients randomised

21 patients in131l-lipiodal group

21 patients followed upand completed trial

22 patients followed upand completed trial

18 patients received131l-lipiodal

3 patients did not

22 patients incontrol group

Source: After MRC Renal Cancer Collaborators;28 Lau et al.49

Figure 2.10. Trial profiles following the CONSORT Guidelines

INTENTION-TO-TREAT (ITT)

As we have indicated, once patients have beenrandomised they should start treatment as spec-ified in the protocol as soon as it is practi-cably possible. For the severely burnt patientseither MEBO or C dressings can be immediatelyapplied. On the other hand if patients, once ran-domised, have then to be scheduled for surgery,then there may be considerable delay before ther-apy is activated. This delay may provide a periodin which the patients change their mind aboutconsent or indeed in those with life-threateningillness some may die before the scheduled dateof surgery. Thus, the number of patients actu-ally starting the protocol treatment allocated maybe less than the number randomised to receiveit. The ‘intention-to-treat’ principle is that oncerandomised the patient is retained in that groupfor analysis whatever occurs, even in situationswhere a patient after consent is randomised to(say) A but then refuses and even insists onbeing treated by option B. The effect of sucha patient is to dilute the estimate of the true dif-ference between A and B. However, if such apatient was analysed as if allocated to treatment

B, then the trial is no longer properly ran-domised and the resulting comparison may beseriously biased.

However, in certain circumstances, the ‘inten-tion-to-treat’ may be replaced or supplemented bya ‘per protocol’ summary.62 For example, if thetoxicity and/or side-effects profile of a new agentare to be summarised, any analysis includingthose patients who were randomised to the drugbut then did not receive it (for whatever reason)could seriously underestimate the true levels. Ifthis is indeed appropriate for such endpoints,then the trial protocol should state that such ananalysis is intended from the onset.

One procedure that used to be in widespreaduse was, once the protocol treatment and follow-up were complete, and all the trial-specificinformation collected on a patient, to review thesedata in detail. This review would, for example,check that the patient eligibility criteria weresatisfied and that there had been no importantprotocol deviations while on treatment. Anypatients following this review, then found to beineligible or protocol violators would then, inprinciple, be set aside and excluded from thetrial results.

Page 44: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

34 TEXTBOOK OF CLINICAL TRIALS

One particular problem is one in which patientsare recruited to a trial on the basis of clinicalexamination during which a biopsy specimen istaken and sent for review. In the meantime thepatient is randomised and treatment commencedbut once the report is returned the patient isfound not to comply with the eligibility criteria.The above review process would automaticallyexclude this patient, whereas Freedman andMachin63 argue otherwise.

Usually, this review would not be blind to thetreatment received – in fact even if the trial isdouble blind – there may be clues once the dataare examined in this way as to which treatment iswhich. As a consequence, this process would tendto exclude more patients on the more aggressivetreatment. For example, the review conductedby Machin et al.46 of some early randomisedtrials in patients with cancer conducted by theUK Medical Research Council showed that theearlier publications systematically reported onfewer patients in the more aggressive treatmentarm despite a 1:1 randomisation. The exclusionof a larger proportion of patients receiving themore aggressive therapy would tend to bias theresults in its favour. Thus any patients whohad ‘difficulties’ with the treatment, perhaps themore sick patients, were not included in theassessment of its efficacy. This type of exclusionwas widespread practice, the consequences ofwhich included the development of ITT policiesand standards for reporting clinical trials. Thelatter policy insisting that the progress of all therandomised patients be reported.

In general, the application of ITT is con-servative in the sense that it will tend todilute between-treatment differences. Piaggio andPinol64 have pointed out that for equivalence tri-als ITT will not be conservative but will tend tofavour the equivalence hypothesis.

SUMMARISING THE RESULTS

It is not relevant to review all the analyticaltechniques that may be utilised in summarisingclinical trial data. Many of these are described in

the texts we reference at the end of this chapterand many are covered in ICH E9 Expert WorkingGroup.65 However several aspects are fundamen-tal and these include the statistical significancetest, confidence intervals and analysis adjustedfor confounding (usually prognostic) variables.

The form of these techniques will depend onthe design and especially the type of endpointvariable under consideration. Thus if two meansare to be compared then comparisons are madeusing the Student t-test, for two proportions itis the χ2 test which might be extended for anordinal endpoint to the χ2 test for trend. Ina survival time context, the difference betweentreatments may, under certain conditions, be sum-marised by use of the hazard ratio (HR) andtested using the logrank test.66 Finally, whetherthe endpoint variable is binary, ordered cate-gorical, continuous or a survival time, the cor-responding between-treatment comparisons maybe adjusted for baseline characteristics of thepatients themselves using the appropriate regres-sion techniques. These are made by logisticregression, ordered logistic regression, linearregression and Cox’s proportional hazards regres-sion respectively.

Should the design include cluster randomisa-tion, then Bland and Kerry67 state that the analy-sis is done by group rather than on an individualsubject basis.

TESTS OF HYPOTHESES AND CONFIDENCEINTERVALS

If the data are continuous and can be summarisedby the corresponding mean values in each of thetwo treatment groups A and B, then a simplecomparison is made using the difference d =xA − xB and the test of the null hypothesis ismade using

z = d

SE(d)(2.2)

Formally, this tests the null hypothesis thatthe ‘true’ difference between treatments, δ, iszero. For large trials z will have, under theassumption that the null hypothesis of equal

Page 45: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 35

means is true, an approximately standard normaldistribution from which the corresponding p-value can be obtained.

However, as has been pointed out it is veryimportant to report the observed difference d

together with an associated confidence intervalfor the true difference δ. In large samples, the100(1 − α)% CI is given by

d − z1−α/2SE(d) to d + z1−α/2SE(d) (2.3)

where z1−α/2 is obtained from tables of thenormal distribution. For a 95% CI z1−α/2 =z0.975 = 1.96.

Such a CI provides a sense of the precisionwith which the observed difference between thetwo treatments is provided by the data. In broadterms, the width of the interval is determined bythe number of subjects recruited, the larger thenumber the narrower the corresponding CI. Ingeneral, the 95% CI will exclude the null value(zero in this instance) if the corresponding p-value <0.05.

Although d provides a simple summary ofthe between-treatment group differences it isimportant to verify if this remains unchangedwhen taking full account of baseline character-istics: sometimes termed confounding variablesor covariates. This is often achieved by usingregression techniques to adjust the observed dif-ference, d , by the values of the baseline variables.In most circumstances, there will be some chanceof imbalances in the values of the variables thatmay arise following randomisation. The adjust-ment may affect the value of d itself as well asthe associated standard error, SE(d), and hencethe CI. Such adjustments for important covari-ates affecting prognosis may result in the esti-mate d being reduced, essentially unchanged orincreased – which of these occurs will depend onthe trial data itself.

GRAPHS

Graphical presentation of data is invaluable tocommunicate results in published journal arti-cles or in presentations or posters presented at

scientific conferences. There are many types ofgraphics that can be used but careful thoughtshould be given to the purpose of any graph.Graphs used for exploratory data analysis mayinclude histograms, scatter plots, etc. but thesemay not be appropriate for the final presentationof data. When presenting the results of clinicaltrials, the comparative nature of trials should bekept in mind and graphics produced that help inthe communication of differences between treat-ments. In trials using time as an endpoint measurethe Kaplan–Meier survival curves provide an ele-gant summary (Figure 2.11).66

NUMBER NEEDED TO TREAT (NNT)

Although many summary measures, for examplea difference in response rates or the hazardratio, are utilised in clinical trials a measureunique to this context is the number neededto treat. This is one very convenient way ofassessing the treatment benefit from trials with abinary outcome. From the result of a randomisedtrial comparing a new treatment with a standardtreatment, the NNT is the number of patientswho need to be treated with the new treatmentrather than the standard (control) treatment inorder for one additional patient to benefit. Itcan be obtained for any trial that has reporteda binary outcome.

The NNT is calculated as the reciprocal ofthe difference between treatments where this isexpressed as a difference of two proportions(say) pT and pC for test and control treatmentsunder study. Thus NNT = 1/(pT − pC) and alarge treatment effect thus leads to a small NNT.A therapy that will lead to one life saved forevery 10 patients treated is clearly better than acompeting treatment that saves one life for every50 treated. When there is no treatment effects therisk difference is zero and the NNT is infinite.

A confidence interval for the NNT is obtainedby taking reciprocals of the values defining theconfidence interval for the treatment differenceitself. However, as Altman7 has pointed out thereare some difficulties if the treatment effect thatis not statistically significant and the confidence

Page 46: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

36 TEXTBOOK OF CLINICAL TRIALS

1.0

0.9

0.8

0.7

0.6

0.5

0.4

0.3

0.2

0.1

0.00 2 4 6 8 10 12 14 16 18

Cum

ulat

ive

surv

ival

PlaceboTMX60TMX120

13074120

n

11369

114

O

131.469.595.1

E

-1.161.39

HR

-(0.86, 1.38)(1.07, 1.81)

95% CI for HR

Placebo

TMX60

TMX120

Months from randomisation

No. at risk (Death)PlaceboTMX60TMX120

130 (0)74 (0)

120 (0)

77 (52)39 (35)66 (53)

48 (27)20 (19)26 (40)

18 (9)11 (1)11 (5)

14 (3)8 (3)5 (5)

11 (3)6 (2)3 (2)

7 (2)5 (1)1 (0)

4 (1)5 (0)0 (0)

4 (0)5 (0)0 (0)

32 (15)12 (5) 16 (9)

Source: After Chow et al.12

Figure 2.11. Kaplan–Meier estimates of survival in patients with inoperable hepatocellular carcinoma bydouble-blind treatment group

interval includes the null value of 0 (see also thecomments by Hutton68). The NNT can also beobtained for survival analysis. For these studiesthe NNT is not a single number, but variesaccording to time since the start of treatment.

MULTIPLE COMPARISONS

In a clinical trial in which two groups arebeing compared, the formal statistical test of this

comparison has an associated two-sided test sizeα. This is set as the boundary below whichthe p-value, calculated from the data for theprimary endpoint of the trial, must fall to bedeclared statistically significant. In this case thenull hypothesis of no difference between groupsis then rejected.

When this approach is utilised in the analysisof a clinical trial comparing two groups and thereis truly no difference between the two groups,

Page 47: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 37

despite this ‘no difference’ there is a 100α%probability of a statistically significant result andthe false rejection of the null hypothesis.

If more than one endpoint is measured for thetwo-group study in question then the situationbecomes more complex. For example, if aclinical trial is comparing two treatments (Aand B) but there are three different independentoutcomes being measured, then there are threecomparisons to make between A and B and,in theory at least, three statistical tests. Inthis circumstance it can be shown that thefalse positive rate is no longer 100α% butapproximately 300α%. In fact for k (assumedindependent) outcome measures the false positiverate is approximately 100kα%. Clearly, thefalse positive rate increases as the number ofcomparisons made increases.

In order to retain the false positive rate as100α% the Bonferroni correction is often sug-gested. This implies only declaring differences asstatistically significant at the 100α% level if theobserved p-value < α/k. In the case of α = 0.05and k = 3, this implies p-value < 0.017. Equiv-alently, and preferably, multiply the observed p-value by k and declare this significant if lessthan α.69

Similar considerations apply equally to CIs.One approach that has been used to overcome thisdifficulty is to quote 99% CIs rather than 95% CIswhenever more than a single outcome is regardedas primary. Thus the UK Prospective DiabetesStudy Group70 report 21 distinct endpoints,ranging from fatal myocardial infarction to deathfrom unknown cause, and provide the 99%confidence intervals for the corresponding 21relative risks comparing tight with less tightcontrol of blood pressure. This is a ‘half-wayhouse’ proposal, since 0.01 (= 1 − 0.99) liesbetween taking no account of the multiplicityand retaining 0.05 to define significance and theBonferroni corrected value of 0.05/21 = 0.0024.

Similar considerations of multiplicity may alsoapply in situations other than trials with multi-ple endpoints. For example, Green22 highlightsthis problem with respect to trials of a facto-rial design.

The problem of multiple comparisons is partic-ularly acute in QoL assessments in clinical trials.Thus guidelines by Staquet et al.71 on reportingsuch trials explicitly state: ‘. . . in the case of mul-tiple comparisons, attention must be paid to thetotal number of comparisons, to the adjustment,if any, of the significance level, and to the inter-pretation of the results’. Perneger72 concludesthat: ‘Simply describing what tests of significancehave been performed, and why, is generally thebest way of dealing with multiple comparisons’.In contrast the American Journal of Public Healthrecommend the use of a method of adjustmentdue to Holm.73 From a clinical trials perspec-tive these issues are reviewed by Proschan andWaclawiw.74

SUBGROUP ANALYSIS

In designing a RCT, sample size is usually deter-mined by considering a clinically worthwhileeffect which will be estimated from the trial databy a comparison of all patients randomised to onegroup as compared to the other. It is recognisedthat the precision with which this effect size isestimated may be improved by a stratified anal-ysis adjusted for baseline prognostic variableswhich are known to influence outcome irrespec-tive of the treatment received. However, if treat-ments are compared within these strata (therebyignoring information on patients not in that stra-tum) it is clear that the patient numbers must beless than the RCT as a whole. Thus any suchcomparisons will usually lack sufficient statisticalpower and hence may be unreliable.

A common mistake is to observe a p-valuein excess of 0.05 (and hence judged naively asnot significant) but then to report subgroup analy-ses – perhaps repeating the treatment comparisonwithin each disease stage group. In some cir-cumstances, one of these subgroup comparisonsmay be ‘significant’ (p-value < 0.05) which willalmost certainly imply that those within the othergroups will not since the ‘all-data-combined’analysis is not significant. In extreme cases, sub-group analysis can appear to favour one treatmentin one subgroup and the other in the other sub-group(s). This may then lead to a false conclusion

Page 48: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

38 TEXTBOOK OF CLINICAL TRIALS

that the new treatment works for one group butnot for the other. If a subgroup analysis is plannedfor at the design stage, adjustment for this shouldbe built into the sample size considerations.

BASELINE COMPARISONS

Although the standard of reporting of randomisedcontrolled trials has improved in many medi-cal journals, there remain many who have notyet adopted the full rigours of the CONSORTGuidelines. There are also many situations inwhich inappropriate and substandard analyses areconducted. Particular examples include statisti-cal significance tests of pre-randomisation (base-line) variables, often describing demographic andpatient eligibility criteria in the different treat-ment groups, despite the allocation to groupshaving been made by randomisation so that anyobserved differences are by definition random.75

OTHER MISUSED APPROACHESTO ANALYSIS

Anderson76 catalogues some commonly misusedapproaches used in the analysis of clinical trials.These include in, for example, cancer clinicaltrials with dual endpoints of tumour responseand overall survival, the analysis of survival bytumour response itself. In these cases survival iscompared between those patients who respondand those who do not. Such analysis may lead toa false conclusion that response ‘caused’ longersurvival. Anderson states categorically that suchcomparisons are wrong unless an appropriatemethodology such as one involving a landmark isused. Similar considerations apply if comparisonsare made between groups established on the basisof the amount of (protocol defined) treatmentreceived, dose intensity or toxicity. Andersonet al.77 provide details of the landmark approach.

A common mistake is for investigators toprovide treatment-specific CIs in their reportsfor the endpoint(s) of concern, rather than fora relevant measure of treatment comparison suchas their difference or a hazard ratio.

COMPETING RISKS

In some situations, a patient may fail followingapparently successful treatment from one of C

(≥2) so-called competing risk types, for example,a local recurrence or a distant metastasis, whichare competing causes of ultimate mortality in can-cer patients. If relapse is the outcome of interestin the clinical trial, then usually it is the first eventthat is of primary importance to the clinician.Thus, in a randomised trial of adjuvant therapyin resected Dukes’ C colorectal carcinoma, 17-1A monoclonal therapy was thought to be mosteffective against individually dispersed cells andless effective against local satellite tumour nod-ules or cell aggregates.78 Since the 17-1A anti-body should be most effective in preventing ordelaying distant metastases after surgery, distantmetastasis as a first event was thus a key endpointin this trial.

In the analysis of competing patterns of failure,the Kaplan–Meier method and the associatedlogrank test are frequently used to estimatethe comparative rates of, for example, localrecurrence and distant metastasis in patientsreceiving alternative treatments for their cancer intwo distinct calculations. In one, local recurrenceas the first event is taken as the event ofinterest. In this situation, patients who do nothave a local recurrence, or who have localrecurrence as a second or subsequent event,irrespective of whether or not they have distantmetastasis, are treated as censored observations inthe calculations. In the other calculations, distantmetastasis as the first event is taken as the eventof interest.

A preferred method, described in detail by Taiet al.,79 is to use the cause-specific cumulativeincidence functions and comparisons betweengroups via the test developed by Gray.80 Gelmanet al.81 have pointed out that the Kaplan–Meiermethod produces estimates that are appreciablyhigher than those of the competing risks method.Substantial differences have also been notedin data from patients with Ewing’s sarcoma.79

However a counter viewpoint on the use of thetwo approaches is given by Farley et al.,82 using

Page 49: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 39

illustrations from clinical trials in contraceptivedevelopment.

SOFTWARE FOR STATISTICAL ANALYSIS

There are many statistical packages available forthe summary and analysis of clinical trials andadditional features are continually being added.Pertinent features that distinguish the packagesare the flexibility and quality of graphical outputand the presentation of statistical tables or results.

Over the last decades, there have been manystatistical developments that have impacted on,for example, the endpoints that can be assessed,the design, size, analysis and summary. Theseinclude the Kaplan–Meier method for summaris-ing survival time studies, the logrank test andmost influential of all the associated Cox pro-portional hazards model which allows between-treatment comparisons adjusted for potentiallyconfounding prognostic variables assessed atbaseline. The Cox model can also accommodatetime-dependent variables, that is variables thatare assessed post-randomisation. These devel-opments would have remained theoretical innature but for parallel developments in statisti-cal software.

CLINICAL DATA MANAGEMENT

GOOD QUALITY DATA

Although it is rather outside the scope of thistext, a vital component to the eventual success ofany clinical trial protocol is the quality of theassociated clinical record forms (CRFs). GoodCRFs will be pleasant to the eye, logical inlayout, comprehensive yet focused on the keyinformation required, easy to complete and easyto process.

A key feature of any scientific study is theimplication that the data generated are of highquality. That is, that the observations madeare carefully recorded at the point and timeof observation, and then passed through tothe analysis stage without change. All this isequally true for clinical trials of whatever design.

However, the problems associated with ensuringthis process is indeed in place in, for example,a large multinational trial involving prolongedfollow-up of subjects are considerable. Indeeda whole new industry of Clinical ResearchOrganisations (CROs) has developed to guaranteethis process. Further, the Regulatory Agenciesrightly demand a guarantee of high quality datain any submission made to them for productregistration.

Information from randomised controlled trialsprovides key information when the pharmaceu-tical and allied industries apply to register anew drug or device with the relevant regulatoryauthorities. The authorities themselves imposecertain constraints on the way in which trialsare conducted – these will include basics withrespect to a justification of a sample size forthe trial but will also specify standards.65 Thesetoo have impacted on the ways that trials aredesigned, conducted, analysed and reported.

PRINCIPLES OF QUALITY DATAMANAGEMENT

In clinical trials, subjects are usually enteredone at a time, and their responses to treatmentmonitored sequentially. Regular monitoring oftrial progress, especially during the early stages,is advisable, and prompt attention to data errors,inconsistencies or missing items on the CRFsis required, so that corrections can be madeimmediately. If inadequate control is exercised inthe management of clinical trials data, subsequentanalysis may be delayed or at worst wrong.

The use of computers for data processing andstatistical analysis requires careful planning andexecution by experienced personnel. Errors areliable to happen in the recording of data in theCRFs, as well as in the transfer of data tothe computer using a database managementsoftware.

In a typical trial the sequence of data collectionmight be registration and randomisation, a recordof the patient’s name or code, trial number, dateof birth, date of randomisation and allocatedtreatment (if the trial is not blind) will be

Page 50: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

40 TEXTBOOK OF CLINICAL TRIALS

entered into the database. The on-study form,which contains other demographic and baselineclinical information would be completed andadded to the database in due course. Patientsare then anticipated to return for follow-upassessments often over an extended period oftime. Since patients do not all enter the trial atthe same time, the amount of data collected atany one time may vary considerably as the trialprogresses, and the management of the data ona computer becomes increasingly complex andrequires skilful intervention.

For repeated follow-up evaluations, a sepa-rate record is normally kept for each evalua-tion. For example, in a double-blind randomisedplacebo controlled trial on inoperable hepatocel-lular carcinoma,12 information on ECOG perfor-mance status, Okuda staging, serum albumin andbilirubin is collected. In addition, QoL is evalu-ated not only on entry into the trial, but also atmonthly intervals. Since patients may drop out ofa trial at any time for a variety of reasons this fur-ther complicates the management of the database.

Quality data management via computers entailscareful planning and execution by well-traineddata managers. As the processes of data tran-scription and entry into computers are highlysusceptible to producing errors, a series of checksfor validity and completeness should be carriedout immediately upon the return of CRFs. Itemsthat are commonly verified include range checksfor clinical parameters such as serum levels.These may be excessively high or extremely low(due to out-of-range or wrong units recorded),or in the case of qualitative variables, a non-permitted code.

Logical checks identify any inconsistency ininformation that may be captured in differentparts of CRFs. For instance, dates of randomisa-tion, follow-ups and death carry important infor-mation in clinical trials where survival is theendpoint. Thus it is important to check that thesedates as well as other crucial ones have beenrecorded and entered correctly.

Routine checks on missing items or formsshould also be undertaken, as any missinginformation could be due to oversight in data

transcription, electronic data entry or delay inreturns of CRFs, rather than the information nothaving been collected. Queries should be raisedfor any discrepancies identified by these checks.The data should be edited accordingly afterclarification with the investigator or comparisonwith source documents.

Although information in the CRF should bechecked manually and discrepancies resolved bythe data manager prior to data entry, on-line edit-ing checks via computer provides an additionalmeans of detecting errors or missing data. Thevalidation rules would normally be specified andthe associated data checks programmed in paral-lel with the building of the database.

SOFTWARE FOR CLINICAL DATAMANAGEMENT

Several specialised software tools for managingclinical trial data are commercially available. Anearly one written for the UK Medical ResearchCouncil was COMputer PAckage for ClinicalTrails (COMPACT)83 and this included many ofthe features necessary for handling ragged datasets which arise in clinical trails involving pro-longed follow-up. However, current requirementsof GCP demand, for example, more intensiveaudit trail facilities that were not part of the earlysystems. The newer systems, unlike standardspreadsheets, incorporate all the basic features ofa good clinical data management system.

An ideal clinical data management process isone that delivers valid and accurate data whichaid the maintaining of data integrity throughfacilities for verifying and validating data, as wellas through the implementation of an audit trailto document database modifications. They alsoallow for automatic reporting of discrepancies,customised reports can be created and distributedto external sources such as the investigatorsfor resolution and have the ability to efficientlyhandle repeated follow-up evaluations and trackstatus of patients throughout the trial.

These systems also provide global libraries tostore the definitions of standard code lists, andstandard validation or derivation criteria, to opti-mise information input and access throughout the

Page 51: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 41

operation. Standard codes can easily be com-puted by defining appropriate derivation rules inthe global libraries. These codes and rules canbe tailored to meet the varying requirements ofindividual trials and can be re-used in subse-quent databases.

In multi-centre trials involving diverse and dis-tant locations, it is possible with these clinicaldata management systems to automatically propa-gate the study definitions, including amendments,to all study sites. At each site, the local datacan then be independently managed using thesame validation rules, derivation criteria and codelists. This is implemented by means of remotedata entry.

With multiple sites and many users accessingthe same network, possibly performing differenttasks, for example database design, entry andresolution by data managers, data retrieval, queryand analysis by statisticians, the security of thesystem is of primary concern. Thus a clinicaltrial system should allow network-wide securitystandards to be enforced by enabling systemadministrators to assign, monitor and controlaccess to sensitive clinical data records.

For the purpose of statistical analyses, thereshould also be fast, flexible and easy extractionof data into a variety of user-defined file formats,such as those of the more commonly usedstatistical packages.

Ideally, a good clinical data management sys-tem should have facilities for randomising treat-ments, and registering all information captured atthis stage directly into the database.

McFadden84,85 provides further details onmany aspects of clinical data management andassociated areas.

SELECTED FURTHER READINGON CLINICAL TRIALS

GENERAL

Day S. Dictionary for Clinical Trials. Chichester: JohnWiley & Sons (1999).

Redmond C, Colton T, eds. Biostatistics in ClinicalTrials. Chichester: John Wiley & Sons (2000).

Senn SJ. Statistical Issues in Drug Development.Chichester: John Wiley & Sons (1997).

Williams CJ, ed. Introducing New Treatments forCancer: Practical, Ethical and Legal Problems.Chichester: John Wiley & Sons (1992).

DESIGN AND CONDUCTOF CLINICAL TRIALS

Buyse ME, Staquet MJ, Sylvester RJ. Cancer ClinicalTrials: Methods and Practice. Oxford: Oxford Uni-versity Press (1984).

Crowley J, ed. Handbook of Statistics in ClinicalOncology. New York: Marcel Dekker (2000).

Jadad A. Randomised Controlled Trials: A User’sGuide. London: British Medical Journal Books(1998).

Jennison C, Turnbull BW. Group Sequential Methodswith Applications to Clinical Trials. Boca Raton:Chapman & Hall/CRC (2000).

Matthews JNS. Introduction to Randomised ControlledClinical Trials. London: Arnold (2000).

McFadden ET. Management of Data in Clinical Trials.New York: John Wiley & Sons (1997).

Piantadosi S. Clinical Trials: A Methodologic Perspec-tive. New York: John Wiley & Sons (1997).

Pocock SJ. Clinical Trials: A Practical Approach.Chichester: John Wiley & Sons (1983).

Senn SJ. Cross-over Trials in Clinical Research. 2ndedn. Chichester: John Wiley & Sons (2002).

Whitehead J. Design and Analysis of Sequential Clini-cal Trials, Revised 2nd edn. Chichester: John Wiley& Sons (1997).

Wooding WM. Planning Pharmaceutical Clinical Tri-als. New York: John Wiley & Sons (1994).

QUALITY OF LIFE

Fairclough DL. Design and Analysis of Quality ofLife Studies in Clinical Trials. Andover: CRC Press(2002).

Fayers PM, Machin D. Quality of Life: Assessment,Analysis and Interpretation. Chichester: John Wiley& Sons (2000).

Staquet MJ, Hays R, Fayers PM. Quality of Life Clin-ical Trials: Methods and Practice. Oxford: OxfordMedical Publications (1998).

SAMPLE SIZE

Machin D, Campbell MJ, Fayers PM, Pinol APY.Sample Size Tables for Clinical Studies. Oxford:Blackwell Science (1997).

Page 52: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

42 TEXTBOOK OF CLINICAL TRIALS

EVIDENCE-BASED MEDICINE

Altman DG, Chalmers I, eds. Systematic Reviews.London: British Medical Journal Books (1995).

Sutton AJ, Abrams KR, Jones DR, Sheldon TA,Song F. Methods of Meta-Analysis in MedicalResearch. Chichester: John Wiley & Sons (2000).

REFERENCES

1. World Medical Association. World Medical Asso-ciation Declaration of Helsinki, 1996. Republishedin JAMA (1997) 277: 925–6.

2. Chalmers I, Sackett D, Silagy C. Cochrane collab-oration. In: Redmond C, Colton T, eds, Biostatis-tics in Clinical Trials. Chichester: John Wiley &Sons (2000) 65–7.

3. Day S. Dictionary for Clinical Trials. Chichester:John Wiley & Sons (1999).

4. Pocock SJ. Clinical Trials: A Practical Approach.Chichester: John Wiley & Sons (1983).

5. Temple R. Current definitions of phases of inves-tigation and the role of the FDA in the conduct ofclinical trials. Am Heart J (2000) 139: S133–5.

6. Gehan EA, Freireich EJ. Non-randomized con-trols in cancer clinical trials. New Engl J Med(1974) 290: 198–203.

7. Altman DG. Statistics in medical journals: somerecent trends. Stat Med (2000) 19: 3275–89.

8. Ang ES-W, Lee S-T, Gan CS-G, See PG-J,Chan Y-H, Ng L-H, Machin D. Evaluating therole of alternative therapy in burn wound manage-ment: randomized trial comparing Moist ExposedBurn Ointment with conventional methods in themanagement of patients with second-degree burns.Medscape Gen Med (2001) 3(2): 3.

9. Freedman B. Equipoise and the ethics of clinicalresearch. New Engl J Med (1987) 317: 141–5.

10. Weijer C, Shapiro SH, Glass KC. Clinical equi-poise and not the uncertainty principle is the moralunderpinning of the randomised controlled trial. BrMed J (2000) 321: 756–8.

11. Enkin MW. Clinical equipoise and not the uncer-tainty principle is the moral underpinning of therandomised controlled trial. Br Med J (2000) 321:756–8.

12. Chow PK-H, Tai B-C, Tan C-K, Machin D, John-son PJ, Khin M-W, Soo K-C. No role for high-dose tamoxifen in the treatment of inoperablehepatocellular carcinoma: an Asia–Pacific double-blind randomised controlled trial. Hepatology(2002) 36: 1221–6.

13. RISC Group. Risk of myocardial infarction anddeath during treatment with low dose aspirin and

intravenous heparin in men with unstable coronaryartery disease. Lancet (1990) 336: 827–30.

14. The ATAC (Arimidex, Tamoxifen Alone orin Combination) Trialists’ Group. Anastrozolealone or in combination with tamoxifen versustamoxifen alone for adjuvant treatment of post-menopausal women with early breast cancer: firstresults of the ATAC randomised trial. Lancet(2002) 359: 2131–9.

15. Piya-Anant M, Koetsawang S, Patrasupapong N,Dinchuen P, d’Arcangues C, Piaggio G, Pinol A.Effectiveness of cyclofem in the treatment ofdepo medroxyprogesterone acetate induced amen-orrhea. Contraception (1998) 57: 23–8.

16. Donner A, Piaggio G, Villar J, Pinol A, Al-Mazrou Y, Ba’aqeel H, Bakketeig L, Belizan JM,Berendes H, Carroli G, Farnot U, Lumbiganon P.Methodological considerations in the design of theWHO antenatal care randomised controlled trial.Paediat Perinatal Epidemiol (1998) 12(Suppl. 2):59–74.

17. Zelen M. A new design for randomized clinicaltrials. New Engl J Med (1979) 300: 1242–5.

18. Wooding WM. Planning Pharmaceutical ClinicalTrials. New York: John Wiley & Sons (1994).

19. Chan DC, Watts GF, Mori TA, Barrett PH, BeilinLJ, Redgrave TG. Factorial study of the effectsof atorvastatin and fish oil on dyslipidaemia invisceral obesity. Eur J Clin Invest (2002) 32:429–36.

20. Day L, Fildes B, Gordon I, Fitzharris M,Flamer H, Lord S. Randomised factorial trial offalls prevention among older people living in theirown homes. Br Med J (2002) 325: 128–31.

21. Piantadosi S. Factorial designs. In: Redmond C,Colton T, eds, Biostatistics in Clinical Trials.Chichester: John Wiley & Sons (2000) 193–200.

22. Green S. Factorial designs with time-to-event endpoints. In: Crowley J, ed, Handbook of Statisticsin Clinical Oncology. New York: Marcel Dekker(2000) Chapter 9, 161–71.

23. Hong B, Ji YH, Hong JH, Nam KY, Ahn Ty. Adouble-blind crossover study evaluating the effi-cacy of Korean red gingseng in patients witherectile dysfunction: a preliminary report. J Urol(2002) 168: 2070–3.

24. Senn SJ. Cross-over Trials in Clinical Research.2nd edn. Chichester: John Wiley & Sons (2002).

25. Simon R. Therapeutic equivalence trials. In:Crowley J, ed, Handbook of Statistics in ClinicalOncology. New York: Marcel Dekker (2000)Chapter 10, 173–87.

26. Donaldson AN, Whitehead J, Stephens R, Ma-chin D. A simulated sequential analysis based ondata from two MRC trials. Br J Cancer (1993) 68:1171–8.

Page 53: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GENERAL ISSUES 43

27. Fayers PM, Cook PA, Machin D, Donaldson AN,Whitehead J, Ritchie A, Oliver RTD, Yuen P. Onthe development of the Medical Research Counciltrial of α-interferon in metastatic renal carcinoma.Stat Med (1994) 13: 2249–60.

28. Medical Research Council Renal Cancer Collab-orators. Interferon-α and survival in metastaticrenal carcinoma: early results of a randomisedcontrolled trial. Lancet (1999) 353: 14–17.

29. MPS Research Unit. PEST 4: Operating Manual.Reading: University of Reading (1993).

30. Whitehead J. Use of the triangular test in sequen-tial clinical trials. In: Crowley J, ed, Handbook ofStatistics in Clinical Oncology. New York: MarcelDekker (2000) Chapter 12, 211–28.

31. Whitehead J. Design and Analysis of SequentialClinical Trials, revised 2nd edn. Chichester: JohnWiley & Sons (1997).

32. Jennison C, Turnbull BW. Group SequentialMethods with Applications to Clinical Trials. BocaRaton: Chapman & Hall/CRC (2000).

33. Zelen M. Randomised consent trials. Lancet (1992)340: 375.

34. Altman DG, Whitehead J, Parmar MKB, Sten-ning SP, Fayers PM, Machin D. Randomised con-sent designs in cancer clinical trials. Eur J Cancer(1995) 31A: 1934–44.

35. Machin D, Lee ST. The ethics of randomisedtrials in the context of cleft palate research. PlasticReconstruct Surg (2000) 105: 1566–8.

36. Spiegelhaleter DJ, Freedman LS, Parmar MKB.Bayesian approaches to randomized trials. J RStatist Soc (1994) A 157: 357–416.

37. Tan S-B, Machin D, Tai B-C, Foo K-F, Tan E-H.A Bayesian reassessment of two Phase II trials ofgemcitabine in metastatic nasopharyngeal cancer.Br J Cancer (2002) 86: 843–50.

38. Matthews JNS, Altman DG, Campbell MJ, Roys-ton JP. Analysis of serial measurements in medicalresearch. Br Med J (1990) 300: 230–5.

39. StataCorp. STATA Statistical Software: Release7.0. Stata Corporation, College Station, TX(2001).

40. Brown JE, King MT, Butow PN, Dunn SM,Coates AS. Patterns over time in quality of life,coping and psychological adjustment in late stagemelanoma patients: an application of multilevelmodels. Qual Life Res (2000) 9: 75–85.

41. Ware JE, Sherbourne CD. The MOS 36-ItemShort Form Health Survey (SF-36) I: conceptualframework and item selection. Med Care (1991)30: 473–83.

42. Fayers PM, Machin D. Quality of Life: Assess-ment, Analysis and Interpretation. Chichester:John Wiley & Sons (2000).

43. Fairclough DL. Design and Analysis of Qualityof Life Studies in Clinical Trials. Andover: CRCPress (2002).

44. BMJ Economic Evaluation Working Party. Guide-lines for authors and peer reviewers of economicsubmissions to the BMJ. Br Med J (1996) 313:275–83.

45. Neymark N, Kiebert W, Torfs K, Davies L, Fay-ers P, Hillner B, Gelber R, Guyatt G, Kind P,Machin D, Nord E, Osoba D, Revicki D, Schul-man K, Simpson K. Methodological and statisticalissues of quality of life (QoL) and economic evalu-ation in cancer clinical trials: report of a workshop.Eur J Cancer (1998) 34: 1317–33.

46. Machin D, Stenning SP, Parmar MKB, Fay-ers PM, Girling DJ, Stephens RJ, Stewart LA,Whaley JB. Thirty years of Medical ResearchCouncil randomized trials in solid tumours. ClinOncol (1997) 9: 20–8.

47. Cohen J. Statistical Power Analysis for the Behav-ioral Sciences, 2nd edn. New Jersey: LawrenceEarlbaum (1988).

48. Machin D, Campbell MJ, Fayers PM, Pinol APY.Sample Size Tables for Clinical Studies. Oxford:Blackwell Science (1997).

49. Lau WY, Leung TWT, Ho SKW, Chan M,Machin D, Lau J, Chan ATC, Yeo W, Mok TSK,Yu SCH, Leung NWY, Johnson PJ. Adjuvantintra-arterial iodine-131-labelled lipiodol for re-sectable hepatocellular carcinoma: a prospectiverandomised trial. Lancet (1999) 353: 797–801.

50. Pocock SJ, White I. Trials stopped early: too goodto be true? Lancet (1999) 353: 797–801.

51. Tan SB, Machin D, Cheung YB, Chung YFA,Tai BC. Following a trial that stopped early: whatnext for adjuvant intra-arterial iodine-131-labelledlipiodol in resectable hepatocellular carcinoma? JClin Oncol (2002) 20: 1709.

52. Parmar MKB, Machin D. Monitoring clinical tri-als: experience of, and proposals under consid-eration by, the Cancer Therapy Committee ofthe British Medical Research Council. Stat Med(1993) 12: 497–504.

53. Piantadosi S. Clinical Trials: A Methodologic Per-spective. New York: John Wiley & Sons (1997).

54. Machin D. Discussion of ‘The what, why and howof Bayesian clinical trials monitoring’. Stat Med(1994) 13: 1385–9.

55. Ashby D, Machin D. Stopping rules, interim anal-ysis and data monitoring committees. Br J Cancer(1993) 68: 1047–50.

56. Machin D. Interim analysis and ethical issuesin the conduct of trials. In: Williams CJ, ed,Introducing New Treatments for Cancer: Practical,Ethical and Legal Problems. Chichester: JohnWiley & Sons (1992) Chapter 15.

Page 54: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

44 TEXTBOOK OF CLINICAL TRIALS

57. Parmar MKB, Griffiths GO, Spiegelhalter DJ,Souhami RL, Altman DG, van der Scheuren E.Monitoring of large randomised clinical trials:a new approach with Bayesian methods. Lancet(2001) 358: 375–81.

58. Altman DG, Gore SM, Gardner MJ, Pocock SJ.Statistical guidelines for contributors to medicaljournals. In: Altman DG, Machin D, Bryant TN,Gardner MJ, eds, Statistics with Confidence, 2ndedn. London: British Medical Journal Books(2000) 171–90.

59. Begg C, Cho M, Eastwood S, Horton R, Moher D,Olkin I, Pitkin R, Rennie D, Schultz KF, Simel D,Stroup DF. Improving the quality of reporting ran-domized controlled trials: the CONSORT state-ment. J Am Med Assoc (1996) 276: 637–9.

60. Gardner MJ, Machin D, Campbell MJ, AltmanDG. Statistical checklists. In: Altman DG,Machin D, Bryant TN, Gardner MJ, eds, Statisticswith Confidence, 2nd ed. London: British MedicalJournal Books (2000) 101–10.

61. Altman DG, Machin D, Bryant TN, Gardner MJ.Statistics with Confidence, 2nd edn. London:British Medical Journal Books (2000).

62. Lewis JA, Machin D. Intention to treat: whoshould use ITT. Br J Cancer (1993) 68: 647–50.

63. Freedman LS, Machin D. Pathology review incancer research. Br J Cancer (1993) 68: 1047–50.

64. Piaggio G, Pinol APY. Use of the equivalenceapproach in reproductive health. Stat Med (2001)20: 3571–87.

65. ICH E9 Expert Working Group. Statistical princi-ples for clinical trials: ICH harmonised TripartiteGuideline. Stat Med (1999) 18: 1905–42.

66. Parmar MKB, Machin D. Survival Analysis: APractical Approach. Chichester: John Wiley &Sons (1995).

67. Bland JM, Kerry SM. Statistics notes: trials ran-domised in clusters. Br Med J (1997) 315: 600.

68. Hutton JL. Number needed to treat: properties andproblems. J Roy Stat Soc A (2000) 163: 381–402.

69. Altman DG. Practical Statistics for Medical Re-search. London: Chapman & Hall (1991).

70. UK Prospective Diabetes Study Group. Tightblood pressure control and risk of macrovas-cular and microvascular complications in type2 diabetes: UKPDS 38. Br Med J (1998) 317:703–12.

71. Staquet MJ, Berzon RA, Osoba D, Machin D.Guidelines for reporting results of quality oflife assessments in clinical trials. In: Staquet MJ,Hays RD, Fayers PM, eds, Quality of LifeAssessment in Clinical Trials: Methods and

Practice. Oxford: Oxford University Press (1998)337–47.

72. Perneger TV. What’s wrong with Bonferroniadjustments. Br Med J (1998) 316: 1236–8.

73. Holm . A simple sequentially rejective multipletest procedure. Scand J Public Health (1979) 6:65–70.

74. Proschan MA, Waclawiw MA. Practical guide-lines for multiplicity adjustment in clinical trials.Contr Clin Trials, (2000) 21: 527–39.

75. Altman DG, Dore CJ. Randomisation and base-line comparisons in clinical trials. Lancet (1990)335: 149–53.

76. Anderson JR. Commonly misused approaches inthe analysis of cancer clinical trials. In: Crowley J,ed, Handbook of Statistics in Clinical Oncology.New York: Marcel Dekker (2000) Chapter 24.

77. Anderson JR, Cain KC, Gelber RD. Analysis ofsurvival by tumor response. J Clin Oncol (1983)1: 710–19.

78. Pugh RNH, Murray-Lyon IM, Dawson JL, PietroniMC, William R. Transection of the oesophagus forbleeding oesophageal varices. Br J Surg (1973) 8:646–9.

79. Tai B-C, Machin D, White I, Gebski V. Compet-ing risks analysis of patients with osteosarcoma: acomparison of four different approaches. Stat Med(2001) 20: 661–84.

80. Gray R. A class of k-sample tests for comparingthe cumulative incidence of competing risk. AnnStat (1988) 16: 1141–54.

81. Gelman R, Gelber R, Henderson IC, Coleman CNand Harris JR. Improved methodology for ana-lyzing local and distant recurrence. J Clin Oncol(1990) 8: 548–55.

82. Farley TMM, Ali MM, Slaymaker E. Competingapproaches to analysis of failure times withcompeting risks. Stat Med (2001) 20: 3601–10.

83. Chilvers CED, Fayers PM, Freedman LS, Green-wood RM, Machin D, Palmer N, Westlake AJ.Improving the quality of data in randomized clin-ical trials: the COMPACT computer package. StatMed (1988) 7: 1165–70.

84. McFadden ET. Management of Data in ClinicalTrials. New York: John Wiley & Sons (1997).

85. McFadden ET. Data management and coordina-tion. In: Redmond C, Colton CT, eds, Biostatisticsin Clinical Trials, Chichester: John Wiley & Sons(2000) 158–67.

86. Jones B, Jarvis P, Lewis JA, Ebbutt AF. Trials toassess equivalence: the importance of rigorousmethods. Br Med J (1996) 313: 36–9.

Page 55: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

3

Clinical Trials in PaediatricsJOHAN P.E. KARLBERG

Clinical Trials Centre and Department of Paediatrics,The University of Hong Kong, Hong Kong SAR

WHY SHOULD WE DO CLINICAL TRIALSIN CHILDREN?

Children are subject to many of the same diseasesas adults, and are often treated with the samedrugs and biological products. However, manydrugs on the market used to treat children areinadequately labelled for use with paediatricpatients; and many carry disclaimers stating thatsafety and effectiveness in paediatric patientshave not been established. Information about thesafety and effectiveness of treatments for somepaediatric age groups is particularly difficultto find. Even today, no treatment is availablefor many of the thousands of rare and seriousdiseases that largely affect neonates, infants andchildren. Most drugs used to treat commondiseases in both children and adults have not beeninvestigated in children at all. Over 50% of alldrugs prescribed in paediatric practice are either‘unlicensed’ or ‘off label’.

The paediatric medical community has fordecades tried to persuade regulatory authoritiesand the pharmaceutical industry to test new drugsin the paediatric population in parallel with theadult studies. The motto of the campaign has been

‘Children are not simply Small Adults’ and itsGuidelines for the Ethical Conduct of Studies toEvaluate, published in 1995, reported that:

• In 1973, 78% of medications included a dis-claimer or lack of dose information for children.

• In 1991, 81% of listed drugs were restrictedfor certain age groups.

• In 1992, 79% of 19 new molecular entitiesapproved were not labelled for use in children.

As a result of effectively being denied accessto well-studied drugs, paediatricians either don’ttreat children with potentially beneficial medica-tions, or treat them with medications based eitheron adult studies or anecdotal empirical experi-ence in children. Such non-validated administra-tion of medications may place more children atrisk than if the drugs were administered as partof well-designed, controlled clinical trials. Thereis therefore a moral imperative to formally studydrugs in children, so they can enjoy equal accessto existing as well as new therapeutic agents.

The US National Institute of Health (NIH)published regulations in 1999 clearly definingwhat human studies could be funded by NIH

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 56: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

46 TEXTBOOK OF CLINICAL TRIALS

to the exclusion of paediatric subjects. Theexclusionary circumstances were:

• Research topic irrelevant to children;• Laws or regulations barring inclusion of chil-

dren;• The knowledge is available for children or will

be obtained from another ongoing study;• The relative rarity of the condition in children;• The number of children is limited;• Insufficient data are available in adults to judge

potential risk in children.

Not until recently have children been more reg-ularly included in clinical studies to investi-gate drugs. Considerable differences betweenthe pharmacokinetics and pharmacodynamicsof drugs in children and in adults frequentlymake it impossible to bridge conclusions fromdata obtained in adults. Children cannot evenbe considered a homogeneous group, sinceage groups differ in their absorption, distri-bution, metabolisation and excretion of drugsand their effect on developing organ sys-tems. The anatomical structure of children’sorgans differ from adults, causing different phar-macodynamic characteristics observed duringchildhood.

The lack of paediatric safety informationin product labelling exposes paediatric patientsto the risk of age-specific adverse reactionsunexpected from adult experience. The absenceof paediatric testing and labelling may alsoexpose paediatric patients to ineffective treatmentthrough under-dosing, or may deny paediatricpatients therapeutic advances. Failure to developa paediatric formulation of a drug or biologicalproduct, where younger paediatric populationscannot take the adult formulation, may alsodeny paediatric patients access to importantnew therapies.

Three conclusions can therefore be drawnabout paediatric drug studies: studies must bemade in different age groups; describing the phar-macokinetics and pharmacodynamics is crucial;and the safety of drugs must be studied to identifypotential severe side effects.

REGULATORY ISSUES OF CLINICAL TRIALSIN CHILDREN?

Regulatory authorities in the US and Europehave in recent years taken important steps toaddress the problem of inadequate paediatric test-ing and inadequate paediatric use information indrug and biological product labelling. But theseefforts have, thus far, not substantially increasedthe number of products entering the marketplacewith adequate paediatric labelling. The regulatoryauthorities have therefore concluded that addi-tional steps are necessary to ensure the safetyand effectiveness of drug and biological productsfor paediatric patients. Manufacturers of new andmarketed drugs and biological products must nowevaluate the safety and effectiveness of the prod-ucts in paediatric patients if the product is likelyto be used in a substantial number of children, orprovide a more meaningful therapeutic benefit topaediatric patients than existing treatments.

Since 2000 in both the US and Europe,pharmaceutical companies have been obligedto include paediatric data in all new drugapplications and licence extensions provided thatsubstantial use of the drug in children and ameaningful therapeutic benefit is expected. Thestrength of this legislation is however differentin the two regions–and so is the extension ofmarket exclusivity.

In recent years an independent ‘Orphan’ drugregulation has been in force in the countries ofthe European Community as well as in the US.This creates incentives for the development ofdrugs for rare serious diseases, but is unlikelyto achieve effective improvement in paediatricdrug therapy. The Food and Drug Administration(FDA) Modernization Act established economicincentives for pharmaceutical manufacturers toconduct paediatric studies on drugs for whichpatent protection or exclusivity is availableunder the Drug Price Competition and PatentTerm Restoration Act or the Orphan Drug Act.These provisions attach six additional months ofmarketing exclusivity to any existing exclusivityor patent protection of a drug for which FDA hasrequested paediatric studies.

Page 57: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS IN PAEDIATRICS 47

However, there is likely to be a consensusduring the coming years–at least in the ICHGCP regions–over requirements for conductingclinical trials on new drugs and other thera-pies in children. But before this consensus canbe reached, a number of points have to beaddressed and discussed, underlined by the fol-lowing two examples.

EXAMPLE 1–ONGOING DISCUSSIONSOF THE CONSENT PROCESS IN

PAEDIATRIC TRIALS

The significant increase in the number of chil-dren participating in clinical trials continues toraise ethical and procedural concerns. The FDAaddressed this issue in April 2001, calling oninstitutional review boards to review study proto-cols that include children and ensure they adoptsafeguards to protect young research participants.A group in the US is currently examining the‘best practices’ related to research involving chil-dren. The study will address:

• Process for obtaining informed consent andassent from children and their parents or legalrepresentatives.

• How well participants in paediatric studies andtheir guardians understand direct benefits andrisks of study involvement?

• Definition of “minimal risk” related to healthyand ill child study participants.

• Whether regulations and policies should varyfor children of different ages (for example,teenagers and infants)?

• Appropriateness of payments to children, par-ents, or legal representatives for participationin research.

• Role of IRBs in monitoring compliance withregulations related to paediatric studies.

EXAMPLE 2–ONGOING DISCUSSIONS OFTHE LEGISLATION OF PAEDIATRIC TRIALS

Based on feedback from a consultation document,the European Commission was expecting toprepare draft legislation on paediatric medicinal

development by Autumn 2002. This legislationis considered by many to be pressing, creatingthe conditions needed to improve medicines forchildren. Nearly all involved parties in Europesupported a legal and regulatory frameworkfor improving child health, especially regardingthe labelling of medicines. The consultationdocument concluded:

• A robust ethical framework for European pae-diatric research needs to be created, includ-ing guidance for informed consent, ethicalreview, recruitment of subjects, and safetyand oversight.

• A robust paediatric clinical study infrastructureneeds to be created in Europe, since as a resultof reluctance to perform such studies up tonow, there is a serious shortage of trained andexperienced people and centres of excellence.

• Greater cooperation should be stimulatedbetween public sector research and private sec-tor research in paediatrics, in the interest ofdeveloping a European dimension to improv-ing medicines for children.

• A clear framework should be developed forassembling international data and informationregarding paediatric trials and medicines–toensure that unnecessary trials are not carriedout in Europe, and that European paediatricianshave the benefit of up-to-date and comprehen-sive information regarding medicinal productsfor their patients, wherever in the world thatinformation has been generated.

• A greater public dialogue is required in Europeregarding the benefits and risks of paediatricresearch for individual children participatingin research, as well as for public healthin general.

ICH GUIDELINE ON PAEDIATRIC STUDIES

The International Conference on Harmonisa-tion (ICH) Guideline E11–Clinical Investigationof Medicinal Products in the Pediatric Popu-lation–became operational in January 2001 inthe United States, Europe and Japan. The E11

Page 58: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

48 TEXTBOOK OF CLINICAL TRIALS

guideline outlines critical issues in paediatricdrug development. In summary, this new andimportant ICH document states that paediatricpatients should be given medicines that havebeen appropriately evaluated for their use. It sayssafe and effective therapy in paediatric patientsrequires the timely development of informationon the proper use of medicinal products in pae-diatric patients of various ages and, often, paedi-atric formulations of those products. The goal ofthis guideline is to encourage and facilitate timelypaediatric medicinal product development inter-nationally. This guideline addresses five issues,namely considerations when initiating a paedi-atric programme, the timing of its initiation, typeof studies, age categories and ethics.

WHEN INITIATING A PAEDIATRICPROGRAMME? (SUMMARY POINTS OF ICH

GCP E11)

The decision to proceed with a paediatric devel-opment programme for a certain medicinal prod-uct requires consideration of factors such as:

• Prevalence of the condition in the paedi-atric population;

• Seriousness of the condition;• Availability and suitability of alternative treat-

ments;• Unique paediatric indications;• Unique paediatric-specific endpoints;• Age ranges of paediatric patients likely to

be treated;• Unique paediatric safety concerns;• Unique paediatric formulation development.

The most common considerations when dis-cussing the need and timing of a paediatric pro-gramme are:

• Most important is the presence of a serious orlife-threatening disease for which the medici-nal product represents a potentially importantadvance in therapy. This situation suggests rel-atively urgent and early initiation of paedi-atric studies.

• For medicinal products for diseases predom-inantly or exclusively affecting paediatricpatients, the entire development programmewill be conducted in the paediatric population,except for initial safety and tolerability data,which will usually be obtained in adults.

• For medicinal products intended to treat seri-ous or life-threatening diseases occurring inboth adults and paediatric patients, for whichthere are currently no (or limited) therapeuticoptions, there is need for relatively urgent andearly initiation of paediatric studies.

• For medicinal products intended to treat otherdiseases and conditions there is less urgency.Trials would usually begin at later phases ofclinical development or, if a safety concernexists, even after a substantial post-marketingperiod in adults. Testing of these medicinalproducts in the paediatric population wouldusually not begin until Phase II or III–sincevery early initiation of testing in paediatricpatients might needlessly expose them to acompound of no benefit.

TYPES OF STUDIES (SUMMARY POINTSOF ICH GCP E11)

Selection of the type of study should be on thesame principles as studies planned for adults.However, several considerations are of specificimportance for paediatric studies. Some of themost important are:

• When a medicinal product is to be usedin the paediatric population for the sameindication(s) as those studied and approved inadults, the disease process is similar in adultsand paediatric patients, and the outcome islikely to be comparable, extrapolation fromadult efficacy can be appropriate. In such cases,pharmacokinetic (PK) studies in all the ageranges of paediatric patients likely to receivethe medicinal product, together with safetystudies, may provide adequate information.

• When a medicinal product is to be used inyounger paediatric patients for the same indi-cation(s) as those studied in older paediatric

Page 59: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS IN PAEDIATRICS 49

patients, the disease process is similar, and theoutcome is likely to be comparable, extrapola-tion of efficacy from older to younger paedi-atric patients may be possible. In such cases,pharmacokinetic studies in the relevant agegroups of paediatric patients together withsafety studies may be sufficient.

• Many diseases in preterm and term newborninfants are unique or have unique manifesta-tions precluding extrapolation of efficacy fromolder paediatric patients and call for novelmethods of outcome assessment.

• Where the disease course/outcome of therapyin paediatric patients is expected to be similarto adults, but the appropriate blood levelsare not clear, it may be possible to usemeasurements of a pharmacodynamic (PD)effect related to clinical effectiveness. Thus,a PK/PD approach combined with safety andother relevant studies could avoid the need forclinical efficacy studies.

• When unique indications are being sought forthe medicinal product in paediatric patients,or when the disease course and outcome oftherapy are likely to be different in adults andpaediatric patients, clinical efficacy studies inthe paediatric population are needed.

Pharmacokinetics

Pharmacokinetic studies generally should be per-formed to support formulation development anddetermine pharmacokinetic parameters in differ-ent age groups. Pharmacokinetic studies in thepaediatric population are generally conducted inpatients with the disease. Single-dose or steady-state studies are the choice of pharmacoki-netic study:

• For medicinal products that exhibit linear phar-macokinetics in adults, single-dose pharma-cokinetic studies in the paediatric populationmay be sufficient.

• When there is a nonlinearity in absorption,distribution and elimination in adults anddifference in duration of effect between singleand repeated dosing in adults suggests steady-state studies in the paediatric population.

Special considerations should be taken whenblood is drawn more than once in paedi-atric subjects, such as in PK/PD studies. Sev-eral approaches can be used to minimise theamount of blood drawn and/or the number ofvenipunctures:

• Use of sensitive assays;• Use of laboratories experienced in handling

small volumes;• Using routine clinical blood samples for phar-

macokinetic analysis;• Use of indwelling catheters;• Use of population pharmacokinetics and sparse

sampling.

Efficacy

The principles in study design, statistical consid-erations and choice of control groups are detailedin other ICH guidelines and apply to paediatricefficacy studies. But there are also certain featuresunique to paediatric studies.

• Extrapolation of efficacy from studies in adultsto paediatric patients, or from older to youngerpaediatric patients, as mentioned above.

• For efficacy studies it may be importantto employ different endpoints for specificage groups.

• Measurement of subjective symptoms requiresdifferent assessment instruments for patients ofdifferent ages.

• The response to a medicinal product may varyamong patients because of the developmentalstage of the patient.

Safety

ICH guidelines (E2 and E6) describe adverseevent reporting and apply to paediatric studies.But there are certain safety aspects unique topaediatric studies.

• Medicinal products may affect physical andcognitive growth and development, and theadverse event profile may differ in paediatricpatients, compared with adults.

Page 60: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

50 TEXTBOOK OF CLINICAL TRIALS

• The dynamic processes of growth and devel-opment may not manifest an adverse eventat once, but at a later stage of growthand maturation.

• Long-term studies or surveillance data maybe needed to determine possible effects onskeletal, behavioural, cognitive, sexual andimmune maturation and development.

• Post-marketing surveillance may provide im-portant safety and/or efficacy information forthe paediatric population.

• Age-appropriate, normal laboratory values andclinical measurements should be used inadverse event reporting.

AGE CLASSIFICATION OF PAEDIATRICPATIENTS (SUMMARY POINTS OF ICH

GCP E11)

Decisions on how to stratify studies and data byage need to take into consideration developmentalbiology and pharmacology. The identification ofwhich ages to study should be medicinal product-specific and justified.

• Preterm newborn infants: Preterm newborninfants have a unique pathophysiology andresponses to therapy. The complexity of andethical considerations involved in studyingpreterm newborn infants requires a carefulprotocol development with expert input fromneonatologists and neonatal pharmacologists.Only rarely can we extrapolate efficacy fromstudies in adults or in older paediatric patientsto the preterm newborn infant.

• Term newborn infants (0 to 27 days): Newborninfants are more mature than preterm newborninfants, but many of the physiologic andpharmacologic principles for preterm infantsalso apply to them.

• Infants and toddlers (28 days to 23 months):This is a period of rapid CNS maturation,immune system development and total bodygrowth. By 1–2 years of age, clearance ofmany drugs on a mg/kg basis may exceed adultvalues and then it may be dependent on specificpathways of clearance.

• Children (2 to 11 years): Most pathways ofdrug clearance are exceeding adult values.Changes in clearance of a drug may bedependent on maturation of specific metabolicpathways. The protocols should ascertainassessment of the effect of the medicinal prod-uct on growth and development. Recruitmentof patients should ensure adequate represen-tation across the age range in this category.Puberty can affect the activity of enzymes thatmetabolise drugs, and dose requirements forsome medicinal products may decrease dramat-ically.

• Adolescents (12 to 16–18 years (dependent onregion)): This is a period of sexual maturationand medicinal products may interfere with theactions of sex hormones. Medicinal productsand illnesses that delay or accelerate theonset of puberty can have a profound effectand may affect final height. Many diseasesare also influenced by the hormonal changesaround puberty and hormonal changes maythus influence the results of clinical studies.Noncompliance is a special problem andcompliance checks are important.

ETHICAL ISSUES IN PAEDIATRIC STUDIES(SUMMARY POINTS OF ICH GCP E11)

The paediatric population represents a vulnerablesubgroup. Therefore, the following special mea-sures are needed to protect the rights of paediatricstudy participants.

• Participants in clinical studies are expected tobenefit from the clinical study, except underspecial circumstances.

• When protocols involving the paediatric pop-ulation are reviewed, there should be IRB/IECmembers or experts consulted by the IRB/IECwho are knowledgeable in paediatric ethical,clinical and psychosocial issues.

• Paediatric study participants are dependenton their parent(s)/legal guardian to assumeresponsibility for their participation in clinicalstudies. Participants of appropriate intellectualmaturity should personally sign and date either

Page 61: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS IN PAEDIATRICS 51

a separately designed, written assent form orthe written informed consent.

• Information that can be obtained in a lessvulnerable, consenting population should notbe obtained in a more vulnerable population orone in which the patients are unable to provideindividual consent.

• Studies in handicapped or institutionalisedpaediatric populations should be limited todiseases or conditions found principally inthese populations, or when it is expectedthat the disease may alter the effects of amedicinal product.

• To minimise risk in paediatric clinical studies,those conducting the study should be trainedand experienced in studying the paediatric pop-ulation, including the evaluation and manage-ment of potential paediatric adverse events.

• In designing studies, every attempt shouldbe made to minimise the number of partici-pants and of procedures, consistent with goodstudy design.

• To ensure that experiences of the study sub-jects are positive and to minimise discomfortand distress.

ETHICAL CONSIDERATIONS IN PAEDIATRICSTUDIES

Of all the problems surrounding research inchildren, the one that poses perhaps the mostcomplex question is research ethics. Childrenare not legally able to provide consent and theextent to which children are able to understandthe meaning, risks and potential benefits ofparticipating in clinical trials varies enormouslyaccording to age and background. For thisreason it may be appropriate to address somepoints related to the IRB review, including theinformed consent process, in paediatric trialsmore specifically than outlined in the ICH GCPE11 guideline. One document that addresses thistopic at more depth is the Review and AwardCodes for the NIH Inclusion of Children Policyfrom 1999. The following partly originates fromthis document, but also incorporates sourceslisted at the end of this chapter.

First studies that promise no demonstrable ben-efits to the child participating in the study orto children in general should not be conducted,irrespective of the minimal nature of the atten-dant risks. The risks include discomfort, incon-venience, pain, fright, separation from parentsor surroundings, effects on growth and develop-ment of organs, and size or volume of biologi-cal samples.

The proposed research must be of value tochildren in general and, in most instances, to theindividual child subject:

• The research design must take into consider-ation the unique physiology, psychology andpharmacology of children and their specialneeds and requirements as research subjects.

• The design should minimise risk while max-imising benefits.

• The study design must take into account theracial, ethnic, gender and socioeconomic char-acteristics of the children and their parents.

• A placebo/observational control group maybe acceptable when there is no commonlyaccepted therapy, or the commonly used ther-apy is of questionable efficacy, or the com-monly used therapy has high frequency of sideeffects, i.e. larger than the benefits.

PAEDIATRIC INFORMED CONSENT

Children are not legally able to provide consentand the extent to which children are able to under-stand the meaning, risks and potential benefits ofparticipating in clinical trials varies enormouslyaccording to age and background. Children arecounted as members of a vulnerable populationat risk for exploitation and are given special pro-tection in clinical research. In paediatric trials,just as in adult trials, materials in an understand-able language, opportunities to discuss the trial,and freedom to withdraw without penalty mustbe provided to potential subjects.

Investigators are ultimately held responsiblefor ensuring adequate informed consent. Morethan two decades of inquiry into the processof consent have shown that adults are less than

Page 62: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

52 TEXTBOOK OF CLINICAL TRIALS

adequately informed about risks, benefits and par-ticipation in research. The process is even moreproblematic for research involving individualswith limited abilities in decision-making. Theevolving psychological and emotional develop-ment of children and adolescents presents chal-lenges to paediatric investigators not encounteredwhen dealing with adult subjects. Unless oppos-ing evidence is identified, capacity to understandand provide informed consent has long beenassumed in adults. Results from studies in healthyand sick children suggest that also children havethis capacity. Several investigators have evalu-ated the degree to which minors from schoolage through adolescence are capable of provid-ing assent. Even very young children demonstrateinquisitiveness about the proposed research. Bythe age of nine, children can understand purpose,risk and the right to withdraw from the study.Even seven-year-old children can understand thepurpose of a study. Such observations supportthe requirement by most ethics boards that assentbe obtained in children aged seven and older.However, information regarding scientific versustherapeutic study objectives for both research andalternative treatment is not well understood inseven to 20-year-old subjects. Paediatric subjectscan thus provide an informed agreement to partic-ipate, but the assent process should be conductedusing discussions that encourage questions.

Obtaining Informed Permission–Assent–toParticipate

Regulations permit studies involving minimalrisk in children, with the provision that permis-sion from parents and assent from subjects areobtained. Research involving greater than min-imal risk, but providing potential direct benefitto the child, is also permitted with the sameprovision. There are some exceptions to therequirement for assent and consent. Assent isnot necessary for research expected to directlybenefit the child. Assent must be an activeaffirmation from any child with an intellectualage of seven years or older. Assent should beobtained from children who are competent to

understand; and the purpose, risks and bene-fits of a study should be explained to them.The following guideline has therefore been pro-posed:

No greater than minimal risk

• Assent of the child and permission of at leastone parent.

Greater than minimal risk and prospect ofdirect benefit

• Assent of the child and permission of at leastone parent.

• Anticipated benefit justifies the risk.• Anticipated benefit is as least as favourable as

alternative approaches.

Greater than minimal risk and no prospect ofdirect benefit

• Assent of the child and permission of bothparents.

• Likely to yield generalisable knowledge aboutthe child’s disorder.

SPECIFIC PROBLEMS OF PAEDIATRICSTUDIES

SUBJECT RECRUITMENT

Insufficient enrollment of children is the mostcommon reason for discontinuing paediatric stud-ies. Creating and expanding networks for paedi-atric pharmacology studies, such as in the US andEurope, are steps in the right direction to recruitenough subjects. Many reasons for this poorrecruitment rate for paediatric studies include:

• Strict inclusion and exclusion criteria;• Limited size of the paediatric population;• That each age group has to be consid-

ered separately;• Inconvenience for the parents in having their

children participate in a clinical study;• Fear of making one’s own child available as a

‘guinea pig for research’.

Page 63: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS IN PAEDIATRICS 53

• Doctors are wary of jeopardising the doc-tor–patient relationship, or losing the trustof parents.

EARLY TESTING

There are no healthy paediatric volunteers. Thelack of volunteers for Phase I studies is aspecial problem and makes the planning oftherapeutic studies in children difficult. Therequirements for paediatric study designs are forthis and other reasons different from studies inadults. Alternative study designs and alternativestatistical methods are required.

STUDY MANAGEMENT

To obtain a sufficient number of subjects requiresa large number of study centres. Moreover, thecost for each individual step of a paediatric studyis usually higher than for studies in adults–bothto pharmaceutical companies, as sponsors of thestudies, and to the participating doctors. Forinstance, explaining the nature of a study–toobtain permission from parents and ensure theircooperation during the course of the study–isa very time-consuming process. Explanatorymaterial and information has to be prepared notonly for parents, but also adapted for the children.Caring for the children during their visits tothe study centre also requires creativity, patienceand time.

FINAL COMMENTS

Faced with heavy workloads, paediatricians mayoften be reluctant to assume what looks likethe extra work of clinical trials. But a shortageof investigators is not the only problem thatslows paediatric trials. It takes many subjects tosatisfy the requirements for an adult drug to beadequately studied in children–and frequently thepopulation of paediatrics with a certain diseasedoes not exist. So not only do studies need to bedesigned to use small populations efficiently; theyalso need to be designed with children in mind.

Just taking adult protocols, then changing the agein the inclusion criteria and the dose, is not goodenough. With a limited number of investigatorsand a limited number of potential subjects, studydesign is critical for successful development ofnew safe and life-saving therapeutic entities forpaediatric usage.

FURTHER READING

PUBLICATIONS

• Shirkey H. Therapeutic orphans (editorial com-ment). J Pediatr (1968) 72: 119–20.

• Rogers LC, Cocchetto DM. Labeling prescriptiondrugs for pediatric use in the United States. ApplClin Trials (1997) 50–6.

• Tarnowski KJ, Allen DM, Mayhall C, Kelly PA.Readability of pediatric biomedical research infor-med consent forms. Pediatr (1990) 85: 58–62.

• Susman EJ, Dorn LD, Fletcher JF. Participation inbiomedical research: the consent process as viewedby children, adolescents, young adults, and physi-cians. J Pediatr (1992) 121: 547–52.

• Ethical Conduct of Studies to Evaluate Drugs inPediatric Populations. Committee on Drug. Pedi-atrics (1995) 95: 286–94.

• Wechsler J. Science, pediatric studies, and surro-gates. Appl Clin Trials (1999) 28–33.

• Nahata MC. Lack of pediatric drug formulations.Pediatrics (1999) 104: 607–9.

• Conroy S, McIntyre J, Choonara I. Unlicensed andoff label drug use in neonates. Arch Dis Childh:Fetal Neonatal Edit (1999) 80: F142–5.

• Conroy S, McIntyre J, Choonara I, Stephenson T.Drug trials in children: problems and the wayforward. Br J Clin Pharmacol (2000) 49: 93–7.

• Kauffman RE. Clinical trials in children: problemsand pitfalls. Paediatr Drugs (2000) 2: 411–18.

• Simar MR. Pediatric drug development; the Interna-tional Conference of Harmonization Focus on Clin-ical Investigation in Children. Drug Inform J (2000)34: 809–19.

OFFICIAL DOCUMENTS/GUIDELINES

• Review and award codes for the NIH inclusion ofchildren policy March 26, 1999.

• Regulation (EC) No. 141/2000 of the EuropeanParliament and the Council of 16 December 1999on orphan medicinal products.

• World Medical Association, Declaration of Helsinki(amended October 2000).

Page 64: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

54 TEXTBOOK OF CLINICAL TRIALS

• Food and Drug Administration, Department ofHealth and Human Services, The Pediatric Exclusiv-ity Provision January 2001 Status Report to Cong-ress.

• Food and Drug Administration, Additional Safe-guards of Children in Clinical Investigations ofFDA-Regulated Products: Interim Rule, FederalRegister 66 (79) 20589 (24 April 2001).

• Food and Drug Administration, Update of listof approved drugs for which additional paediatric

information may produce health benefits in thepaediatric population (May 20, 2001), DocketNo. 98N0056ICH Topic Ell, ‘Clinical Investiga-tion of Medicinal Products in the Pediatric Popu-lation’. CPMP/ICH/2711/99, January 2001, adoptedJuly 2000.

• European Commission, Better Medicines for Chil-dren: Proposed regulatory actions on paediatricmedicinal products (European Commission, Brus-sels, Belgium, 2002).

Page 65: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

4

Clinical Trials Involving Older PeopleCAROL JAGGER AND ANTONY J. ARTHUR

Department of Epidemiology and Public Health, University of Leicester, Leicester, UK

As few diseases or conditions present for the firsttime in later life, there are few treatments pre-scribed solely to older people. There is also littleconsensus on the definition of ‘the elderly’ sinceageing can be considered a continuous processfrom birth to death. However the increasing like-lihood of illness other than that under treatmentand greater mental and physical frailty with age-ing means that older people can be inherentlydifferent to younger adults and the numerousphysiological changes that accompany the ageingprocess may alter the way in which older peoplerespond to drugs.1

By 2010 in most of the developed countries,the 65+ age group will form over 15% of thetotal population and over 20% in Japan. In theUK, those aged 65 years and over make up 18%of the population but they receive nearly half ofall prescriptions.2 Despite this, few trials, unlessspecially designed and conducted in this agegroup, have sufficient numbers of older people,particularly the ‘oldest-old’, to provide evidenceof efficacy, even for treatments of diseasesand conditions that are seen predominantly inlater life. A recent review of clinical trials inParkinson’s disease, where prevalence increaseswith age and incidence peaks between 70 and

80 years of age, found only 38% of studiesincluded subjects over 75 years of age.3

Trials of the efficacy of interventions shouldcover the age groups who are affected.4 Olderpeople have been explicitly excluded throughthe use of a maximum age for eligibility andobviously such trials provide little informationabout the efficacy of treatments in older agegroups. However implicit exclusion is also com-mon, through criteria such as the presence of co-morbid conditions. In addition certain recruitmentmethods may result in study populations witholder people an under-representation of the gen-eral population likely to be treated. In these casesit may be difficult for the clinician to be aware ofthe paucity of older people studied, resulting inthe late recognition of serious side effects whendrugs tested on predominantly younger adult pop-ulations are finally released and prescribed tolarger numbers of older people. Perhaps the mostfamous, or infamous, case of this was benoxapro-fen, a non-steroidal anti-inflammatory drug mar-keted as Opren, which was withdrawn after areport of the death of five elderly patients whohad taken the drug.5

In this chapter we will explore, in moredepth, the reasons why older people face a

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 66: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

56 TEXTBOOK OF CLINICAL TRIALS

number of barriers at each stage of a trial:eligibility, recruitment, gaining informed consentand follow-up. In addition we will discuss strate-gies for increasing the number of older people inclinical trials, so that in future, those responsiblefor the treatment of older people will be able tobase their practice on high-quality evidence.

ELIGIBILITY

Despite recommendations to the contrary, olderpeople are still being excluded from clinicalresearch on the basis of age alone, shown by ananalysis of studies reported in four leading jour-nals (BMJ, Gut, the Lancet and Thorax) whichfound 35% (170) excluded older people withno justifiable reason.6 Reviews of trials in acutemyocardial infarction7 and Parkinson’s disease,3

conditions where prevalence and incidence arestrongly related to advancing age, found that tri-als published later were more likely to excludeolder subjects. Moreover, since more women thanmen survive to older age and in some cases, suchas cardiovascular disease, women develop dis-eases later in life than men, exclusion on the basisof age disproportionately disadvantages women.8

Operating an upper age limit for trials hasoften been used to limit the problem of co-morbid conditions and drug interactions that mayoccur with increasing age, in the belief thatmost adverse drug reactions in older peopleare simply a consequence of advancing age. Areview of pertinent studies suggests that thismay be misguided since the physiological andfunctional characteristics of the patient, ratherthan chronological age per se, appear to be themost important in drug interactions.9

Even when age limits are not imposed, olderpatients are often implicitly excluded because ofother exclusion criteria or because of investigator,cultural or other biases in enrollment.8 In thereview of acute myocardial infarction trials,comparison of the age distributions of patientsin trials with and without age exclusions showedno differences, suggesting that factors other thanexplicit age restrictions were at play.7

The process of patient selection and recruit-ment mostly aims to produce an homogeneousstudy population with the purpose of increas-ing the statistical power to detect the effects ofdrugs.10 The resulting clinical trial, conducted in‘sterile’ conditions, bears little resemblance topractice and cannot be extrapolated to the gen-eral population. Indeed, although tight eligibilitycriteria may aim to produce very similar par-ticipants, inter-patient variability is such that atruly homogeneous group of patients is difficult,if not impossible, to identify. Important prog-nostic variables will be measured at baseline,but even if study participants are the same onthese criteria, they will still vary in the courseof their disease and on unmeasured prognosticfactors.11 Thus the gain in attempting to study agroup of homogeneous patients is outweighed bythe loss in generalisability and clinical applica-bility of the results. Even when treatment trialsare specifically designed for older people, overlystringent exclusion criteria can produce highlyskewed and non-representative patient popula-tions. Many trials of treatments for Alzheimer’sdisease have excluded patients with behaviouralproblems despite such problems being commonwith increasing cognitive impairment. Since thereis considerable scope for improving such symp-toms with drugs that enhance cognition, thesetrials may well be missing opportunities.12

Studying a narrow group of patients alsomisses the potential to identify subgroups ofpatients who may respond particularly well to thedrug under test. A trial comparing the efficacyof sertraline and nortriptyline in major depres-sion included patients aged 60 years and over,but a subgroup analysis of the 76 patients aged70 years and over suggested that treatment withsertraline may confer even greater benefit in thisolder age group than patients aged 60 years andover.13 In trials of intervention packages or ser-vices rather than drugs, similar tensions existbetween maximising the detection of a significanteffect of the intervention in a population unen-cumbered with concurrent illness, and a need toassess effectiveness as close as possible to deliv-ery in practice after the trial. The advantages of

Page 67: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS INVOLVING OLDER PEOPLE 57

wide eligibility criteria for entering patients intoclinical trials are summarised in Box 4.1.11

Box 4.1 Advantages of wide eligibilitycriteria for entering patients intoclinical trials

1. Easier screening and recruitment. Largetrials aremore feasibleand economical.

2. Large study sizes reduce random error,providing more reliable overall results.

3. Wider applicability of results. There-fore greater clinical and publichealth impact.

4. Greater opportunity to test sub-group hypotheses.

Source: Reproduced from Yusuf et al.11

RECRUITMENT

The recruitment, in sufficient numbers and withinthe desired time frame, of motivated and compli-ant subjects, representative of the wider groupultimately receiving treatment, is the goal of allwho design and execute clinical trials. Recruit-ing motivated participants is a problem for allclinical trials but particular difficulties are evi-dent when recruiting older patients. Clinical tri-als are likely to involve more regular monitoringand follow-up assessments than would routinelytake place in practice and this in itself may betoo burdensome for older people who may haveother health problems, which they may perceiveas more important, or lack access to transport.A longer enrollment phase, although deterringsome older subjects, may allow a more com-plete collection of data and more accurate pre-diction of patient compliance, again highlightingthe tension between pragmatic and explanatorytrials.14 If possible, assessments should be offeredat home or, where this is impossible, transporta-tion to clinics provided at times convenient tothe subject.

Although the experience of earlier trials onstrategies to maximise recruitment may notbe immediately transferable across time andplace, they may provide researchers with ideas.Experiences in recruiting older people to tri-als have been described in the treatment ofhypertension with both pharmacological15 andnon-pharmacological interventions16 and in tri-als of exercise.17 Many of the reports simplydescribe the experience of one or two particularstrategies, though mass mailing, media advertis-ing, community-based screening, clinical practicescreening, participant referrals and other recruit-ment methods have been compared in a trial ofthe efficacy of weight loss and sodium reductionfor preventing hypertension in the elderly.16 Thisstudy concluded that mass mailing of a brochureor letter describing the study resulted in the great-est yield in terms of percent randomised (76%;N = 737) though it is less clear whether thisapplied to all subgroups of the population. Tri-als recruiting volunteers, although producing apopulation who may be more likely to remainthroughout the length of the study, provide littleevidence of applicability to the general elderlypopulation. Older volunteers tend to be morelikely than younger ones to be healthy and liv-ing independently, of particular importance fortrials of interventions involving exercise sincevolunteers may not be the subjects most likelyto benefit.17

Rarely does one single strategy succeed inrecruiting adequate numbers of representativepatients. It is important therefore that the char-acteristics of participants are regularly monitoredthroughout the trial, and compared to the gen-eral population, so that, if necessary, specificdemographic groups, such as the oldest-old orparticular ethnic groups may be targeted. Suchmixed-mode recruitment has produced represen-tative samples of high-risk older people for a trialof geriatric evaluation and management.18 Thefinal sample should aim to be as representativeas possible and a list of strategies that could beused if shortfalls occur during recruitment shouldbe developed at the design stage of the trial.

Page 68: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

58 TEXTBOOK OF CLINICAL TRIALS

GAINING INFORMED CONSENT

Obtaining informed consent from patients beforerandomisation is a universal requirement, althoughlegal requirements across countries may differ.As with eligibility and recruitment, the means ofgaining informed consent from subjects enrollingshould be addressed at the design stage of the trialand the information that the patient requires to giveinformed consent is listed in Box 4.2. A synopsisof the practicalities in obtaining informed consentfor clinical trials has been reported, stressing thatgaining informed consent ‘should not be seen asan exercise in bureaucratic form filling, but as anessential part of the trial requiring time, insight andcommunication skills’.19

Box 4.2 Patient information necessary forinformed consent

1. Diagnosis.2. Available treatments and treatment

on trial.3. Potential risks and benefits of treat-

ment.4. Concept of a clinical trial (includ-

ing randomisation, use of placebos,double-blind procedures).

5. Discomforts or inconveniences associ-ated with assessments.

6. Number of follow-up visits or extratravel for trial.

Clinicians may see relaying the concept of arandomised controlled trial as admittance of igno-rance about the best treatment for the patient, ormay make ageist assumptions concerning the abil-ity of older people to consent to a trial. Further-more, the clinical trial design is complex and evenif explained carefully, patients may not under-stand fully enough to give true informed consent.A qualitative study, as part of a set of trials ofthe effectiveness of treatments for men with uri-nary retention and benign prostatic disease, foundthat, although information given was accurately

recalled, subjects found the concept of randomisa-tion difficult to accept and were confused by termssuch as ‘trial’ and ‘random’ which have differentmeanings to lay and professional groups.20

Studies examining significant predictors ofenrollment into trials are equivocal in their find-ings. A systematic review of literature on informedconsent found evidence of impaired understand-ing of the informed consent information in oldersubjects and those with less formal education,21

whilst the Recruitment and Enrollment Assess-ment in Clinical Trials study, part of the Car-diac Arrythmia Suppression Trial (CAST), did notfind education differences in enrollers and non-enrollers, although enrollers were more likely tohave read the informed consent themselves andto have understood it.22 The ability to understandinformation about clinical trials, particularly therandomisation process, may well be correlatedwith level of education.23 An instrument to assessunderstanding of information given to ascertaininformed consent for ambulatory trials has beendeveloped, but its disadvantages are that it is study-specific and it was tested on relatively young andwell-educated subjects.24

Family members have also been found to playan important role in the informed consent pro-cess, approval by family members, particularlyspouses, being associated with successful enroll-ment in a cardiovascular trial.25 This study alsofound that the majority (96%) of those approachedby a physician agreed to enrol, compared to 66%of those approached by an experienced cardio-vascular nurse. Reasons given by a subsample ofthose enrolled by the physician were predomi-nantly the trust and respect subjects had for theirdoctor, though a small number admitted to agree-ing through fear.

Much healthcare provision is imperialistic andthis may re-enforce the belief, held by some olderpeople, that all decisions relating to their treatmentrest with their doctor. It should be remembered thatnot all older people however want active treatmentin all cases and there may be reluctance to takemedication for certain conditions. A recent trialof selective serotonin reuptake inhibitors in the

Page 69: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS INVOLVING OLDER PEOPLE 59

treatment of depression and anxiety in community-dwelling older people found that, whilst 11 of67 people with clinically diagnosed depressionand/or a phobic anxiety disorder fulfilled oneor more of the exclusion criteria, 89% of theremaining subjects eligible refused medication,26

inferring that the process by which older peoplemake decisions to participate in clinical trials is acomplex one, meriting further research.

The nature of the trial may well be anotherimportant factor in the decision by older peo-ple to enrol. Trials including invasive proceduressuch as venepuncture, which may be necessary todetermine compliance may not be seen as nec-essary to the subject and may therefore be morelikely to lead to refusal.27 Our experience of non-randomised studies of the health of older peo-ple suggests that, on the whole, older people willparticipate in lengthy interviews and are keen toassist with research that they perceive will helpothers,28 confirming findings from cardiovasculartrials.25 When the study involves an assessmentat an outpatients’ clinic or an invasive proceduresuch as providing a blood sample, refusal rates canincrease noticeably, making it necessary for trialpersonnel to explain clearly not only the need forrandomisation but also the importance of subse-quent assessments to monitor health. This should,of course, be balanced by any inconvenience thepatient might incur by extra visits.

After World War II, the Nuremberg coderequired the ‘voluntary consent of the human sub-ject’ in experimental research and that ‘the per-son should have legal capacity to give consent’.Ethicists argue that to make a properly informedchoice to enter a clinical trial, subjects shouldnot only have been provided with the necessaryinformation about the trial but should also haveunderstood it. The two aspects of having sufficientunderstanding to give consent without coercion arekey, but if taken literally, for example by beingrequired to pass a test of competency, research onthe efficacy of treatments and management strate-gies for dementia patients, particularly those withadvanced dementia, would be impossible.29 Theincreasing prevalence and incidence of dementiawith advancing age may also pose problems for

gaining informed consent generally for trials, notjust those specifically for dementia treatments.

Currently, informed consent is usually gainedfrom proxies on behalf of dementia patients,although technically only the subject may provideconsent to be entered into a trial. Within clinicalcare there has been encouragement for patients toprepare advanced directives or living wills to coverthe eventuality that they may not have the capacityto agree to treatment being given or withdrawn. Atfirst sight this might appear a solution for dementiaresearch also, but the strong motivational factorsfor individuals with clinical care are unlikely tobe present for dementia research.30 In addition,the number of people preparing living wills isstill very small and tends to be the well-educated,higher social class groups. A more realistic futuregoal might be that people are encouraged to nameproxies and state broad beliefs about research inadvanced directives.

Rather than immediately approaching a proxyfor consent with dementia patients, it may bebest to promote the pragmatic view of decision-making capacity that if an individual appears com-petent then they are.31 Dementia patients have beenshown to be capable of understanding and differ-entiating the risk/benefit ratio between differenttreatments and of expressing their contentmentwith having a proxy make decisions on involve-ment in research although the proxies themselvestended to be more protective with their relativesthan with themselves.31 A more pressing problemis the lack of suitable proxies to provide informedconsent on behalf of patients, one trial of palliativecare of patients with advanced dementia who hadbeen hospitalised finding that almost half (72/146)of eligible patients could not be enrolled.32 In onlyfour cases was this due to the proxy refusing con-sent, the proxies themselves lacking the capacityto understand the protocol in 18% of cases and inalmost one-third no functional proxy being found.None of the patients for whom a proxy could notbe found had made a living will.

It is when older people with dementia are res-ident in nursing homes–with their dependencyand vulnerability–that the process of informedconsent is most complex.33 Certainly the high

Page 70: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

60 TEXTBOOK OF CLINICAL TRIALS

prevalence of dementia within long-term care set-tings exacerbates the problems already stated,although the more immediate barrier from aresearcher’s point of view may be the home-owner acting as gatekeeper. Experience of con-ducting censuses of older residents in all typesof long-term care within Leicestershire over thelast 20 years has shown, from the falling responserates of homes, the increasing difficulty overtime in gaining access to residents or staff.34,35

It is important that the inherent problems inconducting research within long-term care set-tings are recognised by researchers at the out-set and that the shortage of staff time and oftenrigid regimes in homes are taken into consid-eration when designing trials, particularly thoseof interventions.36 In these care settings, carefulexplanation to home-owners and staff of the pur-pose of the trial is likely to be vital to the successof the study.

FOLLOW-UP

Even when older people have been enrolledinto trials, greater risk of the development ofco-morbid conditions, cognitive impairment andpolypharmacy will mean that they are more likelyto withdraw before the final outcome assessment.To some extent this can be planned for in advanceby allowing for a realistic withdrawal rate in thesample size calculation. Provision of informationabout the trial should not be considered a ‘once andfor all’ activity at the commencement of the trial,and opportunities to re-enforce the importance ofthe participants’ role in the success of the trial(for example at interim assessments) should beexploited. More flexible timing of follow-up visitsmay also help and these days should not pose anyproblems for analysis.

Missing data is still likely to be present andmore complex data analysis techniques should beused to maximise the use of the data that arepresent. Some statistical packages for repeatedmeasures data analysis–a common analysis fortrials with regular follow-ups–ignore cases withdata missing. Newer techniques such as multi-level modelling and random effects models can

accommodate incomplete data. Finally, outcomessuch as mortality that may be easy to measure andimportant for younger populations may, in olderpeople, be valued less than quality of life and theability to function independently.37

CONCLUSIONS

If clinicians and other professionals caring forolder people are to provide optimal treatment andthe elderly are to benefit from new advances intreatments, decisions need to be based on firmevidence of efficacy in older people. At present thisis noticeably lacking in many aspects of care. Wehave discussed the reasons why older people havebeen and are still being excluded both explicitlyand implicitly from trials. We cannot give anydefinitive solutions to ensure that older peopleare recruited and enrolled in large numbers intotrials, since the setting for the trial (community,nursing home, outpatient clinic) will influence thefeasibility of different design options as well asthe country in which the research is conducted.However we list below important features thatshould be considered when designing trials offuture therapies that may ultimately be used bylarge numbers of older people:

• Aim for as wide eligibility criteria as possibleto ensure smaller random error, a wider appli-cability of results and a greater opportunity totest pre-planned subgroup hypotheses.

• At the design stage, agree a list of strategiesfor recruiting specific subgroups (the veryelderly, ethnic minorities) if these becomeunder-represented during recruitment.

• Regularly monitor the characteristics of subjectsenrolled to ensure good representation of thegeneral population.

• Give careful thought to the information to begiven to subjects and the method by which it willbe given, to gain informed consent. Considerwhether and when consent will need to beobtained from a proxy.

• If possible offer home assessments or, wherethis is impossible, provide transportation to

Page 71: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CLINICAL TRIALS INVOLVING OLDER PEOPLE 61

clinics at times convenient to the subject andtheir caregivers.

• Design a realistic withdrawal rate into thesample size calculation.

Finally, many clinical trials fail because of poorrecruitment, lack of adherence to protocols andcontamination between trial arms. The problemsoutlined in this chapter may mean that this is aparticular issue for trials involving older people.There are useful lessons to be learnt from theseexperiences, yet by definition, these are rarelyshared in the published literature. Methodologicalissues that arise from others’ mistakes in carryingout clinical trials involving older people need tobe aired and discussed in journals. If the qualityof the evidence is improved then older people canexpect to see an improvement in the quality of theirhealth care.

REFERENCES

1. Stewart RB. Clinical research and the geriatricpatient: an assessment of needs. Clin Res PractDrug Regulat Affairs (1985) 3: 477–500.

2. Statistical Bulletin. Statistics of prescriptions dis-pensed in the Family Health Service Authorities:England 1984–1994 (1995).

3. Mitchell SL, Sullivan EA, Lipsitz LA. Exclusionof elderly subjects from clinical trials for Parkin-son’s Disease. Arch Intern Med (1997) 157:1393–8.

4. ICH Harmonised Tripartite Guidelines. Studies insupport of special populations: Geriatrics (1993).

5. Taggart HMcA, Allardice JM. Fatal cholestaticjaundice in elderly patients taking benoxaprofen.Br Med J (1982) 284: 1372.

6. Bugeja G, Kumar A, Banerjee AK. Exclusion ofelderly people from clinical research: a descriptivestudy of published reports. Br Med J (1997) 315:1059.

7. Gurwitz JH, Col NF, Avorn J. The exclusion ofthe elderly and women from clinical trials inacute myocardial infarction. JAMA (1992) 268:1417–22.

8. Wenger NK. Exclusion of the elderly and womenfrom coronary trials. Is their quality of carecompromised? JAMA (1992) 268: 1460–1.

9. Gurwitz JH, Avorn J. The ambiguous relationbetween aging and adverse drug reactions. Ann IntMed (1991) 114: 956–66.

10. Yastrubetskaya O, Chiu E, O’Connell S. Is goodclinical research practice for clinical trials goodclinical practice? Int J Geriat Psychiatry (1997)12: 227–31.

11. Yusuf S, Held P, Teo K. Selection of patients forrandomized controlled trials: implications of wideor narrow eligibility criteria. Stat Med (1990) 9:73–86.

12. Schneider LS, Olin JT, Lyness SA, Chui HC. Eli-gibility of Alzheimer’s disease clinic patientsfor clinical trials. J Am Geriatr Soc (1997) 45:923–8.

13. Finkel SI, Richter EM, Clary CM. Comparativeefficacy and safety of sertraline versus nortriptylinein major depression in patients 70 and older. IntPsychogeriat (1999) 11: 85–99.

14. Applegate WB, Curb JB. Designing and executingrandomized clinical trials involving elderly per-sons. J Am Geriatr Soc (1990) 38: 943–50.

15. Petrovich H, Byington R, Bailey G, et al. SystolicHypertension in the Elderly Program (SHEP)Part 2: Screening and recruitment. Hypertension(1991) 17(Suppl II): 16–23.

16. Whelton PK, Bahnson J, Appel LJ, et al. Recruit-ment in the Trial of Nonpharmacologic Interventionin the Elderly (TONE). J Am Geriatr Soc (1997)45: 185–93.

17. Halbert JA, Silagy CA, Finucane P, Withers RT,Hamdorf PA. Recruitment of older adults for arandomized, controlled trial of exercise advice in ageneral practice setting. J Am Geriatr Soc (1999)47: 477–81.

18. Boult C, Boult L, Morishita L, Pirie P. Solicitingdefined populations to recruit samples of high-riskolder adults. J Gerontol: Med Sci (1998) 53(5):M379–84.

19. Wager E, Tooley PJH, Emanuel MB, Wood SF.How to do it. Get patients’ consent to enter clinicaltrials. Br Med J (1995) 311: 734–7.

20. Featherstone K, Donovan JL. Random allocationor allocation at random? Patients’ perceptions ofparticipation in a randomised controlled trial. BrMed J (1998) 317: 1177–80.

21. Sugarman J, McCrory DC, Hubal RC. Gettingmeaningful informed consent from older adults: astructured literature review of empirical research.J Am Geriatr Soc (1998) 46: 517–24.

22. Gorkin L, Schron EB, Handshaw K et al. Clini-cal trial enrollers vs. nonenrollers: the CardiacArrhythmia Suppression Trial (CAST) Recruit-ment and Enrollment Assessment in Clinical Trials(REACT) project. Contr Clin Trials (1996) 17:46–59.

23. Pucci E, Belardinelli N, Signorino M, Angeleri F.Patients’ understanding of randomised controlled

Page 72: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

62 TEXTBOOK OF CLINICAL TRIALS

trials depends on their education. Br Med J (1999)318: 875.

24. Miller CK, O’Donnell DC, Searight HR, BarbarashRA. The Deaconess Informed Consent Comprehen-sion Test: an assessment tool for clinical research.Pharmacotherapy (1996) 16(5): 872–8.

25. DeLuca SA, Korcuska LA, Oberstar BH, Rosen-thal ML, Welsh PA, Topol EJ. Are we promotingtrue informed consent in cardiovascular clinical tri-als? J Cardiovasc Nurs (1995) 9(3): 54–61.

26. Stevens T, Katona C, Manela M, Watkin V, Liv-ingston G. Drug treatment of older people withaffective disorders in the community: lessons froman attempted clinical trial. Int J Geriat Psychiatry(1999) 14: 467–72.

27. Ameer B, Burlingame MB. Difficulties enrollingelderly patients in pharmacokinetic studies. JGeriatr Drug Ther (1990) 4: 61–7.

28. Clarke M, Jagger C, Anderson J, Battcock T, Kel-ly F, Campbell Stern M. The prevalence of demen-tia in a total population – the comparison of twoscreening instruments. Age Ageing (1991) 20:396–403.

29. Agarwal MR, Ferran J, Ost K, Wilson KCM.Ethics of ‘informed consent’ in dementiaresearch – the debate continues. Int J GeriatPsychiatry (1996) 11: 801–6.

30. Sachs GA. Advance consent for dementia research.Alzh Dis Assoc Disord (1994) 8: 19–27.

31. Sachs GA, Stocking CB, Stern R, Cox DM, Hou-gham G, Sachs RS. Ethical aspects of dementiaresearch: informed consent and proxy consent. ClinRes (1994) 42: 403–12.

32. Baskin SA, Morriss J, Ahronheim JC, Meier D,Morrison RS. Barriers to obtaining consent indementia research: implications for surrogatedecision-making. J Am Geriatr Soc (1998) 46:287–90.

33. Becker D. Informed consent in demented patients:a question of hours. Med Law (1993) 12:271–6.

34. Campbell Stern M, Jagger C, Clarke M et al. Resi-dential care for elderly people: a decade of change.Br Med J (1993) 306: 827–30.

35. Lindesay J, Jagger C, Parker G et al. The Trentcensus of older people in residential care. Reportto NHS Executive Trent (1999).

36. Eisch JS, Colling J, Ouslander J, Hadley BJ,Campbell E. Issues in implementing clinicalresearch in nursing home settings. J New York StateNurses Assoc (1991) 22: 18–21.

37. Grimley Evans J. Evidence-based and evi-dence-biased medicine. Age Ageing (1995) 24:461–3.

Page 73: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

5

Complementary MedicinePING-CHUNG LEUNG

Institute of Chinese Medicine, The Chinese University of Hong Kong, Hong Kong

INTRODUCTION

Medicine has been an art of healing. Althoughthere is no absolute account on its history ofdevelopment in the prehistorical and extremeprimitive days, it must be closely related to thevery ancient people’s eating habits and animalbehaviour. Ancient people fallen sick must preferlight meals with plenty of drinks. The latter mightmean fruits and plant-related products, which arethe forerunners of medicinal herbs.

Ancient people lived with animals: eitherkeeping them as domestic friends or observingthem closely in the fields. Animal instincts andbehaviour lent the ancient people much wisdomof healing. When dogs and cats bit and ate upspecial grasses and leaves when they fell sick,followed by vomiting or diarrhoea, sometimesbringing out special unwanted ingested food orworms, the ancient people noted the specialgrasses and leaves. When they desired to clearup their guts under difficult circumstances, theyrecalled those grasses and leaves their animalsate and hence they imitated the animals, hopingto achieve the same healing. In this way, theprimitive art of healing started.

What followed must have been more andmore observations on more and more grassesand leaves which became considered as ‘herbs’.Herb taking as a means to remove symptomsand ailments is, therefore, the standard earlystage of the healing art in human history.1,2 Thevaluable observations and experiences were keptuntil today.

All primitive tribal populations today are stillusing herb treatment, as the standard popularmethod of healing. The practice does not rule outtrial uses of new herbs and their combinations,but the major practice depends on past experienceand documentations. These early clinical trialswere not the result of imagination but were ini-tiated after observations on the anecdotal effectsof different herbs.3,4

Traditionally there was no real need for large-scale clinical trials for complementary medicine.The need came only when scientific healersbecame interested in complementary medicineand started making use of herbs and other meth-ods in their attempts to supplement modernmedicine. They wanted to know whether, by util-ising the same logic of analysis commonly prac-tised in modern medicine, they could prove that

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 74: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

64 TEXTBOOK OF CLINICAL TRIALS

herbal treatment constituted a logical substitutionor supplement to scientific medicine.

This chapter explores the promises and fal-lacies of clinical trials in herbal medicine andsome other complementary medicine; identifiesthe similarities and difficulties, the developmentsand limitations.

TYPES OF COMPLEMENTARY MEDICINE

The current main stream of medicine is thescientific variety. Other forms of health careoutside the main stream fall into the category ofcomplementary or alternative medicine.

If one uses history as the criterion of iden-tification and considers ancient medicine wasequivalent to complementary medicine, one seesfour main systems of ancient healing. They are:Chinese, Indian, Greek and Egyptian. Geograph-ically, the four systems are separated and yet,the nearby areas do carry similarities. China andIndia certainly did communicate. So did Greeceand Egypt. China probably also obtained infor-mation from Greece, i.e. Europe, later in historythrough the ‘silk route’.5

The four different systems have two mainunique features. The Greek and Egyptian systemsconcentrate on the use of single herbs, while theChinese and Indian systems use multiple combi-nations. Combined formulae are most commonlyprescribed in Chinese herbal medicine.

After thousands of years, the four ancientsystems of medicine still survive well. Greekmedicine in Europe has established itself as ahomeopathic healing art, while the other threesystems enjoy persistent but varying popularities.

In the modern sense, alternative/complemen-tary medicine includes not only the herbalstreams, but any other form of medicine that isunrelated to the modern scientific stream. Whenthe American Medical Association did a survey inthe USA aimed at the revelation of the populari-ties and users of alternative medicine, 17 modali-ties were targeted.6 These included the following:

1. Relaxation techniques2. Herbal medicine

3. Massage4. Chiropractice5. Spiritual healing by others6. Megavitamins7. Self-help groups8. Imagery9. Commercial diets

10. Folk remedies11. Lifestyle diets12. Energy healing13. Homeopathy14. Hypnosis15. Biofeedback16. Acupuncture17. Self-prayer

Of these varieties, the one that commanded thehighest popularity was acupuncture as a form ofpain control.

The author cannot possibly be knowledge-able about all the varieties of complementarymedicine, and is not able to discuss all theirclinical trials. Rather, he will concentrate on thetwo varieties that he is familiar with, herbalmedicine and acupuncture. While discussions arebeing made, examples of clinical trials will begiven, based on personal interests and experi-ences.

FUNDAMENTAL CONSIDERATIONS FORCLINICAL TRIALS ON CHINESE MEDICINE

How should clinical trials of Chinese medicine beconducted? Are there differences between suchtrials and others designed for modern medicine?

We have explained earlier that originally,complementary medicine and its practitioners didnot demand clinical trials. However, clinical trialsare indicated for modern scientists because oncethe efficacy is proven, an alternative methodologyof treatment can be endorsed.7

If modern medicine were not totally successful,there would be a real need for supplementingwith alternative medicine. Generally speaking,the successes of modern medicine are wellknown in most areas. It is therefore necessaryto look to complementary medicine only in

Page 75: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 65

those areas where the scientific main stream hasdeficiencies.

WHERE ARE THE DEFICIENCIES?

The deficient areas lie where modern medicine,in spite of recent advances, yet fails to getgood solutions.

Modern medicine has developed from the logicof modern science which follows the deductiveapproach. The problem is first thoroughly under-stood by identifying the cause. The cause couldthen be removed by working out an effectivemeans. In the situation of a disease when thecause is simple and straightforward, removing itwould be easy. On the other hand, when the causeis either complicated, not well understood or mul-tiple, removal becomes difficult or impossible.Simple disease-inducing causes include straight-forward infections and simple hormonal deficien-cies. The former is easily tackled with an efficientantibiotic while the latter could be treated withhormonal replacement.

When the causative agent is not yet thor-oughly known, e.g. viral infections, treatmentbecomes difficult.

When the cause is complicated, e.g. in allergicconditions, treatment does not guarantee effec-tive results.

When the cause is complicated, e.g. involv-ing many factors such as physiological, socialand psychological, modern scientific medicinebecomes obviously deficient or incapable.8 – 10

Therefore the deficient areas in modern medi-cine that deserve contributions from complemen-tary medicine include:

• Allergic conditions• Autoimmune diseases• Cancers• Chronic pain• Chronic derangements• Degenerative diseases• Nerve damages• Viral infections• Other areas that modern conventional ther-

apy fails.

INDICATIONS AND PHILOSOPHY OFAPPLYING COMPLEMENTARY MEDICINE

Current medical treatment emphasises the effec-tiveness and statistical chances of obtaining goodresults. Modern medicine has developed as directcorrective measures. Hence when it is effective,the probability of arriving at good results isvery high. Unless not available, there is there-fore no reason why modern medicine should notbe endorsed as the primary mainline treatment.

Although there are still confident herbal prac-titioners who believe and declare that whateverthe modern scientific practitioners can do, theycould substitute with other herbal remedies, thenumber who remain that committed is getting lessand less. Indeed today, most herbal practitionersaccept the role of functioning as supplementaryor alternative healers in a combined effort of cureand care.11

In this context, complementary herbal treat-ment is seldom used as the only healing modality.Instead, it is often given as an adjuvant treatment,either together with the mainline or after comple-tion of the mainline. Users of herbal preparations,moreover, frequently look forward to a tonic sup-portive effect, rather than a curative effect.

HOW DOES HERBAL MEDICINE REALLYWORK?

Traditionally the system of herbal medicine wasbuilt on the rich experience of herb users orherbalists, accumulated over more than two thou-sand years in China since the early Chineseculture. For some reason, while basic medicalsciences, e.g. anatomy and physiology, devel-oped gradually in European territories around theRenaissance period, Chinese healers never feltthe need to explore these basic medical sciences.Without sound knowledge of anatomy and phys-iology, i.e. biological structure and function ofthe human body, it would not be possible toexplore abnormal structures and functions, i.e.pathology. Without understanding the pathology,it would not be possible to apply a direct meansof removing the pathology. Herbal practitioners

Page 76: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

66 TEXTBOOK OF CLINICAL TRIALS

therefore try to heal, not by direct confrontationwith the pathological problem, but by indirectlysupporting the individual to overcome his owndifficulties.12,13

HOW DOES THE INDIVIDUAL OVERCOMEHIS OWN PATHOLOGICAL PROBLEMS?

Firstly by surviving the harmful disturbancesimposed by the pathological processes. Secondlyby supporting the unaffected organs and systemsin their proper functions. Thirdly by preventingfuture pathological mishappenings while thecurrent problem is being solved.

The herbal practitioner has means to suppressthe symptoms which are manifestations of thepathology. Suppression of symptoms like cough,diarrhoea or dyspnoea helps the sick individualto survive.

While waiting for the pathological damagesto heal naturally, the unaffected organs andsystems need to be supported to maintain theirefficient function, which in turn will support theoverall function and metabolic harmony of theliving individual.

Prevention in the modern biological sensefrequently refers to an immunological mecha-nism through which the individual becomes moreresistant to future attacks of similar pathologi-cal nature.

The main focus of disease management forChinese medicine is often the control of adversesymptoms. The ultimate goal is maintaining thewell-being of the biological system. The aeti-ological consideration is therefore not directedtowards the actual cause of the disease (of whichthe herbal expert has no idea), but a generalconceptual state of the biological balance of thehuman bodily functions. The ancient healers cor-related this conceptual state with the Taoist phi-losophy and imagined that bodily function waskept at a balanced state between Yin and Yang(i.e. negative and positive). Any loss of balanceled to ailment and disease.

The aim of treatment is therefore to main-tain the balance. Yin and Yang includes othercontrasting opposing forces like cold and heat,

superficial and deep, empty and solid. The causesof imbalance could be traced to a lack of bal-ance of any pair of opposing forces. In theactual treatment, therefore, all efforts are spenton the maintenance of balance, by a supple-ment of the deficient force, or a decrement ofthe excessive.

Since the pathological causes of the symptoma-tology are unclear to the herbal expert, he wouldneed to observe the changes of symptoms andadjust his day-to-day protocol accordingly. Thisapproach differs very much from conventionalmodern medicine, which successfully identifies apathological cause of disease, chooses a methodof cure with good chance of success, then admin-isters it with all effort and commitment, until thetotal removal of the pathology is achieved.

While the aetiology, epidemiology and naturalcourse of a disease affect the design of clinicaltrials for modern medicine, it is now clear thatin Chinese medicine, there is little analogy ofaetiological and epidemiological considerations.The course of events in a disease, for a herbalexpert, is the appearance of the symptoms: theloss of biological well-being due to the lackof balance between the vital forces. The aimof treatment is the re-establishment of balance;once balance is re-established, either naturally orthrough herbal intervention, well-being will bere-established. Treatment consists of a dynamicapplication of symptomatic relief with the goalof re-establishing the balance.14

Clinical trials for Chinese medicine or herbalmedicine therefore could follow the line ofthought for scientific planning on data collectionand subsequent data meta-analysis. However, thepre-treatment data would be confined mainly tosymptomatology. Other parameters would carrylittle weight for the herbal expert; but couldstill be included for more scientific knowledgein clinical trials.

GENERAL CONSIDERATIONS FOR CLINICALTRIALS ON HERBS

In the modern scientific world, only up-to-datemethodologies should be adopted. The set of

Page 77: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 67

common methodologies for conducting clinicaltrials on modern medicine has been logical,useful and has made wonderful contributions tothe clinical testing of new drugs and new meth-ods of clinical treatment. The proper analysisof data and the use of statistics have revealedthe trustworthiness of certain accumulated expe-rience, while at the same time the fallacies ofsome even well accepted and widely practisedmethods.15

The common methodology of random selec-tion, blinding and placebo control, followed bystatistical analysis, should be adopted. In thedesign of the trial, good clinical practice shouldbe the aim. However, due to the nature of theherbs, which come from different origins andcarry different species, it is not uncommon toencounter situations where basic principles thatcannot be strictly kept.16

THE OLD APPROACHES

The herbal experts fervently respect case reportsand anecdotal reports, particularly when resultsappear promising. Of course the reason behindit is that they don’t make use of statistics.Moreover, they believe that treatment resultsare different with different patients. Once goodresults are known to be possible, the expert couldtry to achieve equally good results by wiselymanipulating the varieties of treatment.

In this chapter, we do not endorse thistraditional approach. We do want to apply modernassessment tools for a better understanding ofherbal or Chinese medicine treatment while at thesame time we need not argue against the valueof anecdotal observations in Chinese medicine.After all, the development of this system ofhealing depends solely on anecdotal analysis.

Good clinical practice insists that the pre-scribed drug for the clinical trial should bethoroughly known and uniform. However, usingherbal preparations for clinical trials faces diffi-culties of thorough technical knowledge and uni-formity.

Pharmaceutical tests demand that details beknown about the chemistry, the mode of action

and metabolic pathways before clinical tests beconducted. What is the chemistry of specificherbs? What are the pathways of action andmetabolic degradation? Are there adverse effectsin the process of metabolism? A lot of workhas been done in the past 50 years on this basicunderstanding and not much has come out. Eachand every herb contains so much complicatedchemistry that many years of research mightnot yield much fruit. Actually at least fourhundred herbs are popular and possess recordsof therapeutic action and impressive efficacy. Todemand thorough knowledge on just this popularproportion of herbs is not practical, not to speakof the less commonly used extra one to twothousand varieties.17

Uniformity is another difficult area. Strictlyspeaking, since herbs are agricultural products,uniformity should start with the sites of agricul-tural production. The sites of production havedifferent weather, different soil contents, and themethods of plantation are also different. At themoment, maybe over 50% of popular Chineseherbs are produced on special farms in China.However, these farms are scattered over differentprovinces, which have widely different climaticand soil environments. Good agricultural practicedemands that environmental and nurturing proce-dures be uniformly ensured. Procedures includesoil care, watering, fertilisers, pest prevention andharvests. When such procedures are not uniformand there are no means to ensure a common prac-tice, good agricultural practice is not possible.

Not only is there a lack of uniformity in themode of herb production, but different speciesof the same herb are found or planted in dif-ferent regions and provinces. These differentspecies may have different detailed chemical con-tents. Herbal experts have extensive experienceand knowledge about some special correlationsbetween the effectiveness of particular herbs andtheir sites of production. Some commonly usedherbs are even labelled jointly with the best sitesof production. With the development of molec-ular biology, coupled with modern means ofassessment for active ingredients within a chem-ical product, species-specific criteria could be

Page 78: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

68 TEXTBOOK OF CLINICAL TRIALS

identified, using the ‘finger-printing’ technique.Uniformity today should include screening using‘finger-printing’ techniques.

When we consider the other 50% of herbs thatare only available from the wilderness, i.e. aroundmountains, highlands or swamps, and cannot begrown on agricultural farms, the insistence onproduct uniformity becomes even more difficult.

Putting together what we have discussed, tostrictly insist on good clinical practice in clinicaltrials for herbal medicine is largely impossible.We have to accept a compromise. Indeed in thepast 50 years, many attempts have been made ata proper analytical study of herbal preparations.The intention was: to put the herb to processesof extraction, analyse the important ingredients,then try to work out the chemical formulae whichcould be responsible for the clinical effects.

Extraction eliminates the useless componentsand concentrates the effective components, whichnot only cuts down the volume of herbs used butalso intensifies the biological actions. Knowingthe actual effective ingredients and workingout the chemical formulae would be ideal formodernisation of herbal preparations with theaim of converting the preparations into properpharmaceuticals.

In spite of the efforts and resources put intoherbal extractions and chemical analyses in thepast 50 years, successful examples have not beenimpressive. The results of such efforts certainlydo not match the resources put in.18

The unsatisfactory outcome has initiated a newapproach. Instead of following the scientific path-way already taken by pharmaceuticals, which hasshown too many difficulties rather than promises,a more practical line has been endorsed. Sincemost, if not all, the herbs have been used for hun-dreds of years, there should be sufficient amountof reliability on the safety and efficacy of theherbs. The safety and efficacy of the herbs arealready well documented, but their practical util-isation in specific clinical circumstances needsto be further established. The traditional use ofthe herbs had been focused on symptomatic con-trol. Nowadays, the aim of clinical managementis directed towards curing of a disease entity.

We need to acquire an updated understanding onthe effectiveness of the herbal preparations ondisease entities. That is why we could not be sat-isfied with records on efficacy alone, but shouldstart a series of clinical trials to further prove theefficacy of the herbs.19

The National Institutes of Health of the UnitedStates have openly endorsed the approach ofaccepting traditional methods of healing as safemeasures and then putting them to proper clini-cal trials.20 The recognition of acupuncture as apractical effective means of pain control startedin 1998.21 The subsequent formation of a spe-cial section devoted to research on complemen-tary/alternative treatment followed. The NationalCentre for Complementary, Alternative Treat-ment (NCCAM) was properly formed and givena substantially large budget.

Clinical trials to be discussed within thischapter follow the efficacy-driven principle. Theyare planned strictly according to the principles setout under the modern philosophy of clinical tri-als aimed at the production of objective evidenceof the effectiveness of the methods used. It ishowever understood that product uniformity andquality could not be absolutely guaranteed andthat although GMP (Good Manufacturing Pro-cess) could be assured, GCP (Good Clinical Prac-tice) could not be absolutely ensured because ofthe lack of guarantee for any herbal preparations.

In our discussion full reference will be givento what is being recommended in China, whichundoubtedly harbours most activity in Chi-nese medicine.

HERBAL DRUGS IN CHINA

In 1999 the National Bureau on Drug Controldefined new drugs as ‘a manufactured productfor medical treatment that is produced for the firsttime or an old product reproduced with differentformulation and different indications’.

New drugs are divided into five categories:

I. Group 1Artificial derivatives from Chinese herbsNewly discovered Chinese herbs and deriva-tives

Page 79: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 69

Extracts from Chinese herbs and derivativesExtracts from decortions of herbs

II. Group 2Herbal injectionsHerbal preparations processed inside animalsExtracts from complex decortions

III. Group 3New preparations of decortionsCombined herbal and chemical preparationsImported herbal preparations

IV. Group 4Converted formularyCultivated herbal and domestic animal prepa-rations

V. Group 5Herbal preparations with extended uses

STAGES RECOMMENDED FOR HERBALRESEARCH

The usual four stages are recommended.Phase I : Study on the general acceptance of

the human being after consumption of the herbalpreparation. Normally Phase I refers to a toxicitystudy. The code of practice given under ‘Codeof Practice for the Scrutiny of New Drugs’ inChina, however, recommends that the generalwell-being of the individual after consumption beobserved.22 The logic of skipping toxicity tests isprobably based on an assumption that Chineseherbal preparations have been used safely forcenturies, therefore a special toxic screening isnot necessary. The author has strong reservationson this attitude and would recommend thattoxicity clearance should remain the first phaseof clinical trials.

Phase II : Study on the safety and efficacywhile working out the effective dosage.

Phase III : Expands on Phase II study, col-lecting more reliable confirmation on the safetyand efficacy.

Phase IV : Further study on the safety andefficacy after the new drug is put to market. Moreobservations on adverse effects are expected.

It has been pointed out that, as far as herbalmedicine is concerned, it is not unusual tofind that correlation does not exist between

laboratory research and clinical trials. Whenstudies on the pharmacology, pharmacodynamicsand pharmacokinetics are carried out after theclinical trials, positive values, in support of theclinical observations, might not be impressive.

The possible explanation to this observationmay lie in the fact that the clinical consump-tion of the herbal preparations involves multiple,complex in vivo biological interactions, whereaslaboratory tests consist of only simple unidirec-tional biological interactions.

HOW DO CONCEPTS OF TRADITIONALHEALING AFFECT CLINICAL TRIALS ON

CHINESE MEDICINE?

Earlier in this chapter, the author has alreadytouched on the unique concepts in Chinesemedicine, which are different from modern sci-entific medicine. The application of modern con-cepts in the direction of clinical trials leads to theinevitable sacrifice of some of the fundamentalprinciples. Experienced herbal experts, therefore,might not like to participate.

The following list includes the importantconcepts in Chinese medicine practice beingsacrificed:

1. SymptomandsyndromeIdentification PrincipleFollowing this principle, the herbal expertadjusts details of his treatment according toobservations on the day-by-day changes ofsymptomatology. He may then use differentdrugs for the same symptoms or use the samedrug for different symptoms. Proper clinicaltrials could only use a uniform choice oftreatment modality. This violates the symptomand syndrome identification principle.

2. Holistic ApproachChinese medicine emphasises holistic careand holistic response, whereas clinical trialsprefer objective, specific data as endpoints.The inclusion of specific data into herbalresearch probably does not invite objectionfrom the herbal expert, as long as general datalike different aspects of well-being, i.e. qualityof life, are included.

Page 80: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

70 TEXTBOOK OF CLINICAL TRIALS

3. Response to Pathological ProcessesChinese medicine emphasises the responsesof healthy organs to diseases. The ability ofthe healthy organs to respond to pathologicalchanges ensures that the individual wouldbe able to better resist adversities. Modernclinical trials aim mostly at diseased organs.

4. Old System of Clinical ObservationHerbal experts utilise a system of clinicalobservations which today might be consideredobsolete and over-subjective. This system ofclinical signs includes tongue observation,pulse detection and a collection of subjectivefeelings.23 Modern clinical trials insist onobjective data that could be monitored. Wetherefore have to either develop means toobjectively assess the subjective signs in thetongue and the pulse or we sacrifice the wholesystem of observations. Herbal experts mightnot appreciate either choice.

5. Strong TraditionHerbal experts place genuine confidence onanecdotal observations and experience of sin-gle patients. Insisting on the need to investi-gate collective observations and condemningsingle-case experience would not be wel-comed by herbal experts. This conceptualdifference directly affects the participationand cooperation of the traditional and mod-ern experts.

While thoroughly recognising the unique natureof Chinese medicine and having pointed out thelack of harmony between the old tradition andmodern science, one realises that the currentcompromise adopted in China is to insist on amodern scientific approach as far as possible.Hence in standard textbooks, the following areadvocated24 as standard instructions for clinicaltrials:

1. Use the principles of randomisation, blindingand repetition.

2. Adopt good protocols for clinical trials.3. Avoid bias at all cost.4. Eliminate chance factors.5. Establish new standards of clinical assessment.

6. Establish unique outcome studies.7. Establish unique quality of life assessments.8. Insist on using modern statistics.

ADVERSE EFFECTS

Historically, great herbal masters in China inthe ancient days did produce records on adverseeffects and toxic problems of some herbs. Asearly as the Han dynasty (second century)documents were produced on herbs that needto be utilised with care or extreme care.25 Thistradition was followed closely in the subsequentcenturies.26

More reports were available on methods andmeans with which toxicities and adverse effectscould be reduced.27

In spite of the good past experience, the preva-lent belief is that Chinese medicinal herbs aresafe. On the other hand, more and more reportsappeared on adverse effects and toxicities, andnon-users of herbs tend to exaggerate the reports.

When new preparations come into the market,the innovative processes of extraction and/orproduction might have produced or initiated newpossibilities of adverse effects or toxicity. Thisexperience is already well recorded in a numberof modernised preparations, particularly those forinjection.28 Among the adverse effects, allergicreactions are commonest.

To date, standard instructions on clinical trialsfor Chinese medicine define adverse drug reac-tion in exactly the same way as modern scientificclinical trials, and explanations of the reactionshave been identically identified.29

Categories of adverse reactions include thefollowing:

1. Reactions to herbsReactions are defined as harmful and unex-pected effects while the standard dosages areused in certain drug trials. It is speciallypointed out that for Chinese medicine, theharmful reactions could be due to the qual-ity of the herb and poor choice of indica-tion. These reactions do not include aller-gic responses.

Page 81: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 71

2. Dosage-related adverse effectsUsing an unnecessarily high dose could induceexcessive effects, side effects or even toxiceffects. Secondary effects like electrolyteimbalance might also be observed.

3. Dosage-unrelated adverse effectsThese adverse effects could be the resultof unfavourable preparation, contaminants inthe herbs, sensitivity of the consumer, aller-gic reactions or specific inductive effects ofthe herb.

4. Drug interactionsClassically, records are available in old Chi-nese medicinal literature on combined effectsof herbs, their facilitatory and antagonisticeffects. Nowadays, not only drug interactionsbetween herbs are important, but possibleinteractions between herbs and commonlyused pharmaceutical preparations are becom-ing issues of great concern since users ofherbal preparations are greatly increasing. Inthe area of anaesthesia, drug interactionsbetween herbs and modern medicine couldinduce life-threatening reactions. Table 5.1illustrates some studies currently done on thisissue.30

5. Delayed adverse effectsAdverse effects of delayed nature includeinduction to cancer formation, foetal abnor-malities and even blockage of bacterial sensi-tivities.

6. Drug dependenceThere might be suspicions that herbal prepa-rations might lead to drug dependence. Apartfrom a few opium-related herbs, Chinese herbsin fact are well known to be non-addictivebecause of their gross lack of specificities.

From the above account it might appearobvious that adverse effects in clinical trialsusing Chinese medicine in fact follow closely theexperience encountered in other drug trials.

As far as the grading of adverse effects isconcerned, it would be appropriate to categorisethe effects as mild, moderate and severe.

With regard to the overall assessment ofadverse effects, a convenient recommendation

for Chinese medicine trials is that of Naranjo.31

Naranjo’s system of grading adverse effectsaccording to fact-finding results is shown inTable 5.2. Overall assessment:

ADR confirmed ≥9ADR likely 5–8ADR possible 1–4ADR unlikely ≤0

Detection and recording of adverse effectsshould bear different emphases at different phasesof the trial, e.g. Phase I trial aims at detection ofadverse effects in relation to dosage, Phase II andIII collect details, whereas Phase IV is concernedmainly with marketed drugs.

Whatever is the situation, detection of adverseeffects should include both clinical observationsand laboratory data, and detection should befollowed with follow-up observations. The sum-mation of observations should be thoroughlyanalysed so that explanations of the adverseeffects may eventually be worked out.

REPORTING OF ADVERSE EFFECTS

It is currently required in China that adverseeffects should be reported to the relevant mon-itoring body as soon as possible. Once a drug ismarketed, adverse effects should be continuouslyreported to the National Control Bureau, withinthe first five years.

Adverse effects detected at the post-marketPhase IV might be particularly important for Chi-nese medicine trials. Since herbal preparationsdo not have clear, definite information about theeffective contents of the herbs, bias and chancesmight be more likely than other trials on simplerdrugs at the early phases. The large trial popu-lation during Phase IV gives a better chance ofelimination of bias and allows a better oppor-tunity of objective detection of adverse effects.During the Phase IV trial, the following aspectswould deserve particular attention:

1. Actual danger level of the adverse effects.Degree of danger of course depends on the

Page 82: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

72 TEXTBOOK OF CLINICAL TRIALS

Table 5.1. Examples of Herb-Drug Interaction

Herb Drug Interaction Mechanism

Radix SalviaeMiltiorrhizae(Danshen)

Warfarin Increased INRProlonged PT/PTT

Danshen decreaseselimination of Warfarinin rats

Radix AngelicaeSinensis (Danggui)

Warfarin Increased INR andwidespread bruising

Danggui containscoumarins

Ginseng (Radix Ginseng) Alcohol Increased alcoholclearance

Ginseng decreases theactivity of alcoholdehydrogenase andaldehydedehydrogenase in mice

Garlic Warfarin Increased INR Post-operative bleedingand spontaneous spinalepidural haemorrhage

Herbal ephedrae (MaHuang)

Pargyline, Isoniazid,Furazolidone

Headache, nausea,vomiting, bellyache,blood pressureincrease

Pargyline, Isoniazid, andFurazolidone interferewith the inactivation ofnoradrenalin anddopamine; ephedrine inherbal ephedrine canpromote the release ofnoradrenalin anddopamine

Ginkgo Biloba Aspirin Spontaneous hyphema Ginkgolides are potentinhibitors of (PAF)

Cornu cervipantotrichum Fructuscrataegi

adrenomimetic Strengthens the effect ofincreasing bloodpressure

Natural MAOIs in Cornucervi pantotrichum,Fructus crataegi andRadix polygonimultiflori inhibited themetabolism ofadrenomimetic,levodopa and opium

Radix polygoni multiflori Levodopa Increased blood pressureand heart rate

Opium Central excitation

Bitter melon Chlorpropamide Decreased urea glucose Bitter melon decreasedthe concentration ofblood glucose

Liquorice Oral contraceptives Hypertension, oedema,hypokalaemia

Oral contraceptive mayincrease sensitivity toglycyrrhizin acid

St. John’s wort Warfarin Decreased INR Decreased the activity ofWarfarin

Cyclosporin Decreasedconcentration inserum

Page 83: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 73

Table 5.1. (continued)

Herb Drug Interaction Mechanism

Radix Isatidis(Banlangen)

Trimethoprin (TMP) Significantly increaseanti-inflammationeffect

Liu Shen pill Digoxin Frequent ventricularpremature beat

Tamarind Aspirin Increased thebioavailability ofaspirin

Yohimbine Tricyclic antidepressants Hypertension

Note : ACE angiotension-converting enzymeINR international normalised ratioPT prothrombin timePTT partial thromboplastin timePAF platelet-activating factorAUC area under the concentration/time curveMAOIs monoamine oxidase inhibitors

Source: Reproduced from De Smet and D’Arcy,30 with permission from Springer-Verlag.

Table 5.2. Questions to be asked about adverse Drug Reactions

Yes No Not clear

1. Are there decisive records about the ADR? +1 0 02. Are the ADRs found after consumption of

other drugs?+2 −1 0

3. Are the ADRs improved after consumptionof antidote?

+1 0 0

4. Are there repeated ADRs or repeatedadministration?

+2 −1 0

5. Are there other predisposing factors? −1 +2 06. Are there ADRs after placebo? −1 +1 07. Has the blood level of drug giving ADRs

been investigated?+1 0 0

8. Do the ADRs correlate to dosages? +1 0 09. Is there past history? +1 0 0

10. Is there objective proof +1 0 0

incidence of occurrence. The requirement fortreatment and the financial implications arealso important.

2. More thorough studies at Phase IV should beconsidered according to epidemiological prin-ciples. Randomised controlled trials shouldbe insisted on. Cohort studies might be con-venient and useful, but there need to bemarkedly obvious differences between series

of comparisons before results can be instruc-tive. Case reports might still be useful, butmight function as special warning signals tocalls for more serious studies.32

QUALITY OF LIFE

While clinical trials aim at a thorough scien-tific understanding of the effectiveness of specific

Page 84: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

74 TEXTBOOK OF CLINICAL TRIALS

forms of treatment, endpoints of measurementare set to give objective standards of evaluation.Primary endpoints are unique, focused, specificcriteria which indicate the situation of the targetagainst which the trial is directed. Changes ofprimary endpoints illustrate the efficacy directly.Secondary endpoints are supplementary crite-ria created to support observations on changesand efficacy. Secondary endpoints become moreimportant when, predictably, primary endpointsdo not give clear-cut, impressive results. Sec-ondary endpoints become more important whenprimary endpoints are expected to change slowlyand would be particularly important when chronicproblems are being faced.

Since Chinese medicine, under most circum-stances, does not operate via a direct confronta-tion route but rather acts indirectly to supportthe healthy organs and helps to maintain vitalityand prevent functional deterioration, critical anddetailed assessment of the secondary endpoints istherefore of utmost importance.

Quality of life (QoL) is an important aspecton the assessment of care given to the chron-ically ill. QoL often measures the competencyof the care and the ethical standard of thesociety in mental disorders and other disor-ders that demonstrate strong social orientations.Not infrequently, using technical endpoints asresults of clinical trials a reasonable outcomeis observed, and yet patients might not be sat-isfied with their QoL. QoL is therefore multi-focal: it differs between developed and under-developed areas, it also differs under differ-ent cultural circles.33 Different countries andregions therefore try to develop their own datato be included within their own studies onQoL.34 Meanwhile global, generally acceptableQoL charts are also being planned, examined andvalidated.35

Before an acceptable general data chart isready, one has to accept the achievements alreadyrevealed in different fields. Generally speaking,QoL data sheets take in information about phys-iological well-being, psychological well-being,social well-being and the individual’s subjec-tive feeling on the treatment received and the

rehabilitation underway. Different specialties andspecial areas of concern have created charts oftheir own and all these provide valuable infor-mation when equivalent studies come up. Usu-ally they are adopted right away or after vali-dation. Hence there are charts already developedfor children and the elderly, and different med-ical specialties and subspecialties likewise havecreated their own charts. Just to mention a few,special QoL charts are available for the men-tally ill, cardiovascular diseases, rheumatologi-cal disease, respiratory problems, gynaecologicalproblems and special infections.36 QoL charts forChinese medicinal studies need to be developed.

WHAT ARE THE RECOMMENDATIONS FORCLINICAL TRIALS OF CHINESE MEDICINE?

The simplest thing to do is to make reference tothe available charts in whichever clinical trial isbeing conducted and think about amendments tomake them even more suitable.

ARE THERE UNIQUE FEATURES THAT NEEDTO BE OBSERVED?

There are features related to health which arederived from the philosophy of Chinese medicineever since its initial development. Chinese peoplein all walks of life are influenced by thisphilosophy without being aware of it at all stagesof their life. The belief that health depends ona harmony between contrasting forces promptsthe individual to feel either ‘hot or cold’,‘light’ or ‘heavy’, ‘sick inside’ or ‘sick outside’.After treatment, the feeling might remain, mightreverse or might get balanced. The feeling issubjective, but in any clinical trial including thedata of QoL, could one ignore the outcome of thephilosophical guideline responsible for the wholesystem of healing art?

Henceforth, it is obvious the QoL studiesare particularly important for clinical trials ofChinese medicine and research should be doneon special inclusions of data which are uniquefor Chinese medicine.

Page 85: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 75

EXAMPLES OF CLINICAL TRIALS ONCHINESE MEDICINE

To give more solid information about clinicaltrials on Chinese medicine being done in HongKong, the following paragraphs are devoted tosummaries of such trials.

Synopsis I

Name of Study Medication: PhyllanthusSP. Compound

Title of Study: A Prospective Randomised,Double-Blind, Placebo-Controlled, ParallelStudy to Evaluate the Effect of PhyllanthusSP. Compound in the Treatment of ChronicHepatitus B Virus Infection.

Study Centre: Single-centre

Objective:

Primary

• To evaluate the efficacy of normalisa-tion of liver enzyme, seroconversion ofHbeAg and disappearance of HBV DNAin serum.

• To evaluate the safety of Phyllanthus SP.Compound in patients with hepatitis B.

Secondary

• Proportion of patients with end-of-treat-ment HbeAg seroconversion (HbeAgto anti-Hbe, normalisation of ALT anddisappearance of HBV DNA at the endof treatment).

• Proportion of patients with HbeAg toanti-Hbe.

• Proportion of patients with sustainednormalisation of ALT.

• Proportion of patients with undetectableHBV DNA.

Design: A single-centre, prospective ran-domised, double-blind, placebo-controlled,

parallel study. Patients will be randomisedto one of the four treatment groups andtreated for a duration of 6 months.

Study Population: A minimum of 85hepatitis B patients will be enrolled, 25subjects per treatment group, 10 subjectsin the control group, a total of four groups.

Definition of Endpoints:

• The primary safety endpoint is tolerabil-ity.

• Tolerability failure is defined as a per-manent discontinuation of PhyllanthusPLUS as the result of an adverse event.

• The primary endpoint is a reduction inHBV DNA level from the baseline.

• The secondary endpoint is HbeAg nega-tive, anti-Hbe positive and a decrease inALT level from baseline.

Study Regimen: Subjects will be randomlyand alternatively assigned to receive Phyl-lanthus PLUS or placebo for 6 monthsprospective parallel study.

Duration of Treatment: 6 months

Statistical Methods: Efficacy: Summarystatistics for the change of HBV DNA,HbsAG, HbeAg and ALT from baselinewill be generated and provided for eachtreatment group.

Safety

• The incidence of adverse events andlaboratory toxicity will be summarisedby treatment group and severity. Changefrom baseline in vital signs will besummarised by treatment group.

• Group differences with an error proba-bility of less than 5% (p < 0.05) will beconsidered statistically significant.

Page 86: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

76 TEXTBOOK OF CLINICAL TRIALS

• The statistical analyses will be madewith SPSS for Windows 10.0.

Synopsis II

Name of Study TCM: Danggui BuxueTang

Title of Study: A Randomised, Double-Blind, Comparison Study of the Effect ofDanggui Buxue Tang with Oestradiol onMenopausal Symptoms and Quality of Lifein Hong Kong Chinese Women.

Study Centre: Single-centre

Objective:

Primary

• To compare the effects of DangguiBuxue Tang with Oestradiol on meno-pausal symptoms of hot flushes andsweating.

• To evaluate the safety of Danggui BuxueTang in patients with menopausal symp-toms.

Secondary

• To evaluate the quality of life of thepatients with menopausal symptoms.

Design: A single-centre, randomised, dou-ble-blind and comparison study. Subjectswill be randomised to one of the two treat-ment groups and treated for a duration of6 months with follow-up of 18 months.

Study Population: A minimum of 100patients with menopausal symptoms will beenrolled, 50 subjects per treatment group.

Definition of Endpoints:

• The primary safety endpoint is tolera-bility. Tolerability failure is defined as

a permanent discontinuation of DangguiBuxue Tang as the result of an adverseevent.

• The primary efficacy endpoint is thechange in severity and frequency of hotflushes and night sweats.

• The secondary efficacy endpoint is thechanges in score for the domains mea-sured in the Menopause Specific Qualityof Life Questionnaire.

Study Regimen: Subjects will be randomlyand alternatively assigned to receive Dang-gui Buxue Tang or placebo for 6 months.

Duration of Treatment: 6 months treat-ment period and 18-month follow-up.

Statistical Methods:

• Data will be processed to give groupmean values and standard deviationswhere appropriate.

• Mann–Whitney U-test will be used tocompare the differences between the twogroups of treatment. Group differenceswith an error probability of less than 5%(p < 0.05) will be considered statisti-cally significant.

• The statistical analyses will be madewith SPSS 10.0 for Windows.

Synopsis III

Name of Study TCM: Danggui BuxueTang

Title of Study: A Randomised ComparisonStudy of the Effect of Danggui Buxue Tangwith Tranexamic Acid on DysfunctionalUterine Bleeding and Quality of Life inHong Kong Chinese Women.

Study Centre: Single-centre

Page 87: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 77

Objective:

Primary

• To compare the effects of DangguiBuxue Tang with Tranexamic acid onmenstrual blood loss per month.

• To compare the patient’s satisfactionbetween using Danggui Buxue Tang andTranexamic acid.

• To evaluate the safety of Danggui BuxueTang in patients with dysfunctional uter-ine bleeding.

Secondary

• To evaluate the improvement of anaemia.• To evaluate the status of iron deficiency.• To evaluate the unwanted side effects.

Design: A single-centre, randomised com-parison study. Subjects will be randomisedto one of the two treatment groups andtreated for a duration of 6 months withfollow-up of 24 months.

Study Population: A minimum of 125patients with dysfunctional uterine bleedingwill be enrolled, 63 subjects in the DangguiBuxue Tang group and 62 subjects in theTranexamic acid group.

Definition of Endpoints:

• The primary safety endpoint is tolera-bility. Tolerability failure is defined asa permanent discontinuation of DangguiBuxue Tang as the result of an adverseevent.

• The primary efficacy endpoint is changein frequency and severity of menstrualbleeding.

• The secondary efficacy endpoint is im-proving anaemia and iron deficiency.

Study Regimen: Subjects will be ran-domly and alternatively assigned to receive

Danggui Buxue Tang or Tranexamic acidfor 6 months treatment and 24 monthsfollow-up.

Duration of Treatment: 6 months treat-ment and 24 months follow-up.

Statistical Methods:

• Data will be processed to give groupmean values and standard deviationswhere appropriate.

• All the data in the outcome measure arecontinuous variables. Mann–Whitney U-test will be used to compare the differ-ences between the two groups of treat-ment. The level of significance will beset at p < 0.05.

• The statistical analyses will be madewith SPSS 10.0 for Windows.

Synopsis IV

Name of Study TCM: Formula A andFormula B

Title of Study: A Randomised, Double-Blind, Placebo-Controlled Study on theClinical Effects of Integrated WesternMedicine and Traditional Chinese Medicinefor Diabetic Foot Ulcer.

Study Centre: Multi-centre

Objective:

Primary

• To evaluate the wound healing effect ofFormula A and Formula B in patientswith diabetic foot ulcer.

• To evaluate the safety of Formula Aand Formula B in patients with diabeticfoot ulcer.

Page 88: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

78 TEXTBOOK OF CLINICAL TRIALS

Secondary

• To evaluate the effect of control of thelocal infection.

Design: A multi-centre, randomised, dou-ble-blind, placebo-controlled study. Sub-jects will be randomised to one of the twotreatment groups and treated for a durationof 6 months.

Study Population: A minimum of 80diabetic foot ulcer patients will be enrolled,40 subjects per treatment group.

Definition of Endpoints:

• The primary safety endpoint is tolerabil-ity. Tolerability failure is defined as apermanent discontinuation of formula Aand Formula B as the result of an adverseevent.

• The primary efficacy endpoint is diabeticfoot ulcer healing and to avoid legamputation.

• The secondary efficacy endpoint is thecontrol of local infection.

Study Regimen: Subjects will be randomlyand alternatively assigned to receive For-mula A and Formula B or placebo of a6-month prospective parallel study.

Duration of Treatment: 6 months

Statistical Methods:

• Results will be presented as the mean ±SE per group.

• Group differences with an error proba-bility of less than 5% (p < 0.05) will beconsidered statistically significant.

• The statistical analysis will be made withSPSS 10.0 for Windows.

Synopsis V

Name of Study TCM: Relieve WheezingTablet

Title of Study: A Randomised, Double-Blind, Placebo-Controlled Parallel Study ofthe Effect of Relieve Wheezing Tablet inthe Treatment of Childhood Asthma.

Study Centre: Single-centre

Objective:

Primary

• To evaluate the medication score, includ-ing daily use of inhaled steroids.

• To evaluate the symptom score, includ-ing cough on daytime and nighttime,wheeze/chest tightness on daytime andnighttime, degree of shortness of breathon exertion.

Secondary

• To evaluate the spirometry lung functionresult.

• To evaluate the breakthrough attacksrequiring medical attention from A/Edoctors, family physicians on hospital-isation.

• To evaluate the degree of skin allergy.• To evaluate the changes in peripheral

blood and Eosinophilic Cationic Pro-tein (ECP).

Design: A single-centre, randomised, dou-ble-blind, placebo-controlled, parallel study.Subjects will be randomised to one of thetwo treatment groups and treated for aduration of 6 months.

Study Population: A minimum of 80patients with moderate to severe perennialasthma will be enrolled, 40 subjects pertreatment group.

Page 89: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 79

Definition of Endpoints:

• The primary safety endpoint is tolera-bility. Tolerability failure is defined asa permanent discontinuation of RelieveWheezing Tablet as the result of anadverse event.

• The primary efficacy endpoint is achange in improving the symptoms ofasthmatic children and use of inhaledsteroids.

• The secondary efficacy endpoint is im-provement of lung function.

Study Regimen: Subjects will be ran-domly and alternatively assigned to receiveRelieve Wheezing Tablet or placebo for6 months.

Synopsis VI

Name of Study Medication: Danshen andRadix Puerariae Compound

Title of Study: A Prospective Randomised,Double-Blind, Placebo-Controlled, ParallelStudy to Evaluate the Effect of a HerbalPreparation with Compound Formula ofDanshen and Radix Puerariae as Cardio-vascular Tonic in Cardiac Patients.

Study Centre: Single-centre

Objective:

Primary

• To evaluate the safety of Danshen andRadix Puerariae Compound as adjunc-tive therapy in patients with coronaryartery disease.

• To evaluate the efficacy of treatmentand secondary prevention of cardiovas-cular diseases.

Secondary

• To evaluate the lipid and homocysteine-lowering effect of Danshen and RadixPuerariae Compound.

Design: A single-centre, prospective ran-domised, double-blind, placebo-controlled,parallel study. Patients will be randomisedto one of the two treatment groups andreceive Danshen and Radix Puerariae Com-pound or placebo for a duration of 24weeks.

Study Population: A total of 100 patientswith Coronary Artery Disease (CAD) willbe enrolled, 50 subjects treated with Dan-shen and Radix Pruerariae Compound and50 treated with placebo.

Definition of Endpoints:

• The primary safety endpoint is tolerabil-ity.

• Tolerability failure is defined as a per-manent discontinuation of Danshen andRadix Puerariae Compound as the resultof an adverse event.

• The primary endpoint is improving car-diovascular function (endothelial func-tion and carotid intima-medial thickness)from the baseline.

• The secondary endpoint is decrease ofplasma lipid and homocysteine lev-els.

Study Regimen: Subjects will be randomlyassigned to receive Danshen and RadixPuerariae Compound (TCM) or placebofor 24 weeks in a prospective parallelstudy.

Duration of Treatment: 24 weeks

Duration of Project: 30 months

Page 90: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

80 TEXTBOOK OF CLINICAL TRIALS

Statistical Analyses

• The statistical significance of changesbetween TCM and placebo groups willbe assessed by one-way analysis ofvariance.

• Group differences with an error proba-bility of less than 5% (p < 0.05) wereconsidered statistically significant.

• The statistical analyses were made withSPSS for Windows 10.0.

ACUPUNCTURE

Acupuncture is a practical procedure using aspecial needle to enter special regions of thehuman body surface by which symptoms sufferedby the patient are removed. Like other aspects ofChinese medicine, it aims at symptom control,not at the treatment of a specific disease entity.The most popular use must be in the field ofpain control.

In 1998, the National Institutes of Health in theUSA held a consensus conference on the use ofacupuncture for pain control. The conclusion wasthat acupuncture should be accepted as an effec-tive means of pain control for musculo-skeletalproblems and under other specific situations.21

Since then, interest in the use of acupuncture inthe United States grew and many clinical studieswere started.

Of course, acupuncture has been used for thecontrol of other symptoms. Examples includenerve damages, allergic conditions like rhinitis,asthmatic attacks and general feelings of being‘unwell’, often labelled as ‘derangement’.

How are clinical trials on acupuncture beingconducted? Could the clinical trials on acupunc-ture be put in line with modern epidemiologicalrequirements? Or would it be even more difficultcompared with herbal medicine? We have firstof all, to look at the procedures involved and theexplanations given to the effects produced.

Acupuncture involves the insertion of thin nee-dles, through specific acupoints on the body

surface, to varying depths of soft tissue, thenallowing the needles to stay for some minutes.While the needles stay inside the soft tissue,the puncturist may give regular or occasionalrotary movements on the nail. In recent years,acupuncturists have applied direct electrical cur-rent stimulation, so as to unify the stimulations,widen the effects and save manpower. Acupunc-ture is an invasive process directly aiming at theremoval of symptoms. It is easy to imagine then,that patients would not agree to participate in astudy where they would not be able to enjoy thepuncture treatment and function as recipients of‘sham’ puncture. Likewise, if there were otherplacebo punctures which fulfil the requirementof randomisation and placebo control, very fewpatents would be willing to participate.37,38

However, studies on placebo acupuncture havestarted and the varieties included the following:

1. Placebo points–entry points are sites outsidethe acupuncture meridians.

2. Sham puncture–puncture lightly then with-draw.

3. Hiding entry points–while entry points arehidden, it might be possible to achieve realplacebo effect. Hiding of entry points may beachieved by puncturing through plastic tubesor soft plastic blocks.

4. Camouflage puncture by which a needle is justtaped to the skin.

None of these methods could be endorsedbecause the requirements for placebo in the strictsense are far from being satisfied; most recipi-ents could differentiate right away whether it istrue or false puncture. The conventional appli-cation of acupuncture depends on a subjectivefeeling of ‘numbness’ felt within the puncturedarea. Puncturing without checking this feelingis not considered appropriate. This requirementmakes ‘placebo’ puncture impossible. The useof electrical stimulation is a means to enhancethe effects in modern situations where there isinsufficient experience on acupoint identificationand actual puncturing techniques. It is also con-sidered as a method of modernising acupunc-ture. When electrical stimulation is used, placebo

Page 91: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 81

becomes absolutely impossible because the elec-trical stimulation is always felt.

The considerations are further complicated bythe theories of acupuncture. There are two accept-able theories–the neurological and the humoral.The neurological theory observes that since someof the meridians and most of the acupunc-ture points are related to the peripheral nerves,stimulation of these points leading to physio-logical effects could be working via neurolog-ical pathways, possibly through proprioceptivereceptors.39 The humoral theory assumes thatneedle stimulation produces humoral (hormonal)reactions, manifested as serological appearanceof functional factors which possess pain controleffects and other regulatory functions.40

Whichever theory is at work, it specifiesthat the tiny area of puncture is producingchain reactions, either directly or indirectly. Anapparently harmless, non-productive action onthe skin and soft tissue, imitating acupuncture,might trigger off similar effects and would be farfrom being a placebo.

Therefore standard epidemiological planningfor clinical trials in Chinese medicine includingacupuncture would be very difficult, if at allpossible. Randomisation would not be acceptableto patients, whereas in situations of acupuncture,if sham puncture is insisted on, it is bothunacceptable to patients and falls short of theplacebo requirements.

Carefully planned cohort studies aiming tocompare the effects of different means of paincontrol and other treatment expectations aretherefore the only reasonable means to objec-tively look at the clinical effects of acupuncture.Many cohort studies of this nature are being donefor the study of back pain, neck pain, arthritis ofthe knee, etc. The effects of puncture were com-pared with conventional techniques using phys-iotherapy and other means.

Another common application for acupunc-ture is the restoration of nerve function. Dam-aged neurological tissues suffer from a real lackof regeneration. Peripheral damage feasible forrepair carries reasonable promise. When cellbodies are involved, either intra-cranially or in

the spinal cord, loss of neurological and sec-ondary muscular functions becomes permanent.Acupuncture is widely used under such difficultsituations. Although many reports of impressiveresults are available, it is difficult to realise thereal value. Scientific data coming from well-planned cohort studies for the observation offunctional restoration is still difficult to inter-pret, since the damage could not be uniform andthe factors affecting the different aspects of reha-bilitation and functional return are multiple andcomplicated.

We are therefore still relying on carefulcase studies. However, one does not need toget upset with the obvious limitations. Afterall, in the field of rehabilitation, experience inthe last three decades has already shown thatqualified, broad, trustworthy clinical trials arenot possible.41 Although meta-analysis has ruledout the absolute scientific justification for allthe rehabilitation attempts like different formsof physiotherapy, massage, bracing and eveninvasive means like injections and surgery, wecould still rely on them because we have torelease our patients from suffering. We are allaware of the fact that we are not certain whichpatient is the best candidate to receive whichtreatment.42

CONCLUSION

Complementary medicine does not have anyhistory of modern scientific development. Itbuilt up its knowledge relying on observationsand experience. Now that we try to makeuse of this traditional stream of medicine ina scientific world, we need to explain why itworks in the area of our concern. Very often,these areas are not well served by scientificmedicine. This makes the scientific explanationseven more important.

The way to go about giving scientific expla-nations to healing processes involves the appli-cation of methodologies that are well known andaccepted by all clinical scientists. The standardway to start a scientific approach to clinicaltrials using traditional Chinese medicine would

Page 92: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

82 TEXTBOOK OF CLINICAL TRIALS

just be an application of the same methodolo-gies. However this approach is not feasible andwould probably remain unfeasible in spite ofenthusiasms. We are barred from a smooth appli-cation of the scientific methodology, basicallybecause of the different philosophy behind thetraditional Chinese way of healing. Moreover, thelack of knowledge about the exact chemistry ofthe active component of the herbal remedy whenherbal drug trials are being carried out furtherjeopardises the validity of the clinical trials.

In spite of the essential difficulties, efficacy-driven trials are still carried out, utilising prin-ciples of evidence-based medicine. As long asthe scientific gap is successfully narrowed, thepractical use of complementary medicine willbecome safer, more logical and would deservewider applications. At the same time, workerson complementary medicine should workout aunique, relevant system of assessment for theclinical effects.

REFERENCES

1. Bensky D, Gamble A. Chinese Herbal Medicine.Materia Medica. Seattle: Eastland Press (1993).

2. Tang W, Eisenberg G. Chinese Drugs of PlantOrigin: Chemistry, Pharmacology and Use inTraditional and Modern Medicine. New York:Springer-Verlag (1992).

3. Su WT. History of Drugs. Beijing: United MedicalUniversity Press (1992).

4. McGuire MB. Ritual Healing in Suburban Amer-ica. Newbrunswick: Rutgers University Press(1988).

5. Kong YC. Formulation’s from South-West Asia.Beijing: Chinese University Press (1998).

6. Eisenberg DM, Roger DB. Trends in alternativemedicine use in the USA 1990–1997. JAMA(1998) 280(18): 1569–75.

7. Lai SL. Clinical Trials of Traditional ChineseMateria Medica, Chapter 1. Guangdong: People’sPublishing House (2000).

8. Leung PC. Editorial – Seminar series on evi-dence-based alternative medicine. Hong KongMed J (2001) 7(4): 332–4.

9. Kaptchuk TJ, Eisenberg DM. The persuasive ap-peal of alternative medicine Ann Int Med (1998)129(12): 1061–5.

10. Eisenberg DM, Kessler RC, Foster C. Unconven-tional medicine in the USA – prevalence, costs

and patterns of use. New Engl J Med (1998)328(4): 246–52.

11. Fair WR. The role of complementary medicine inurology. J Urol (1999) 162: 411–20.

12. Leung PC. Traditional Chinese Medicine in China.Seminar on Health Care in Modern China,Yale–China Association (2001).

13. Leung PC. Development of traditional Chinesemedicine in Hong Kong and its implications fororthopaedic surgery. Hong Kong J Orthopaed surg(2002) 6(1): 1–5.

14. Ho T. An anthropologist’s view on traditionalchinese medicine. In: A Practical Guide to ChineseMedicine. Singapore: World Scientific (2002).

15. Chalmers I. The Cochrane Collaboration: prepar-ing, maintaining and disseminating systemic re-views of the effects of health care. Ann NY AcadSci (1993) 203: 166–72.

16. Tyler VE. Herbs of Choice: The Therapeutic useof Phytomedicinals. New York: PharmaceuticalProduct Press (1994).

17. Ernst E. Harmless herbs? A review of the recentliterature. Am J Med (1996) 104: 170–8.

18. Bisset Ng. Herbal Drugs and Phytopharmaceuti-cals. Boca Raton: Medpharm Scientific Publisher(1982).

19. Leung PC. Clinical trials in Chinese medicine –the efficacy driven approach. Hospital AuthorityConvention, Hong Kong (2001).

20. NIH. National Centre for Complementary andAlternative Medicine Five Year Strategic Plan2001–2005. NIH Document (2000).

21. NIH. Acupuncture. NIH Consensus Statement(1997) 15(5): 1–34.

22. Wang PY. New Chinese Medicine: Research, Pro-duction and Registration. China: Chinese Medi-cine Press (1995).

23. Chen KC. Research in Chinese Medicine. Beijing:China Health Publisher (1996).

24. Lai SL, ed., Clinical Trials of Traditional Chi-nese Materia Medica, Chapter 2, 3, 4. Guang-dong: People’s Publishing House (2001).

25. Chang CK (Han Dynasty). Discussions of Fever.Classics.

26. Li ZC (Ming Dynasty). Materia Medica. Classics.27. Suen TM (Tang Dynasty). Important Prescrip-

tions. Classics.28. National Bureau on Drug Supervision. Regulations

on Clinical Trials Using Drugs and Herbs. BureauPublication (1999).

29. Suen TY, Wang SF, Wang KL. Adverse Effects ofDrug Treatment. Beijing: People’s Health Press(1998).

30. De Smet P, D’Arcy PF, eds. Drug interactionswith herbal and other non-toxic remedies. In:Mechanisms of Drug Interactions. Berlin: Sprin-ger-Verlag, (1996).

Page 93: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COMPLEMENTARY MEDICINE 83

31. Sackett DL. Clinical Epidemiology, 2nd edn.Boston: Little, Brown & Co. (1991) 297–9.

32. Uppsala Monitoring Centre, WHO. Safety moni-toring of medicinal products: guidelines for set-ting up and running a pharmacovigilance centre(2000).

33. Jaeschke R, Singer J. Measurement of health sta-tus. Ascertaining the minimally important differ-ence. Contr Clin Trials (1989) 10: 407–15.

34. Wyrwick KW, Nienaber NA. Linking clinical rel-evance and statistical significance in health relatedquality of life. Med Care (1999) 37: 469–78.

35. Stewart AL, Greenfield GS, Hayo RD. Functionalstatus and well-being of patients with chronicconditions. JAMA (1989) 262: 907–13.

36. Keininger DL. Why develop IQOD? InternationalHealth Related Quality of Life Outcome Database.QoL News (1999) Sept/Dec (23): 2.

37. Park J, White AD, Lee H, Ernst ED. Develop-ment of a new sham needle. Acupunct Med (1999)17(2): 347–52.

38. Streiberger K, Kleinhenz J. Introducing a placeboneedle into acupuncture research. The Lancet(1998) 352: 364–5.

39. Godfrey CM, Morgan P. A controlled trial of thetheory of acupuncture in musculoskeletal pain. JRheumatol (1978) 5: 121–4.

40. Lewith GT, Machin D. On the evaluation of theclinical effects of acupuncture. Pain (1983) 16:111–27.

41. Nachemson E. Causes and treatment of low backpain. American Academy of Orthopaedic SurgeryLecture Series (1989).

42. Spitzer WO, Leblanc FE. Scientific approach tothe assessment and management of activity-relatedspinal disorders. Spine (1987) 12(Suppl. 1): 51–9.

Page 94: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CANCER

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 95: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

6

Breast CancerDONALD A. BERRY, TERRY L. SMITH AND AMAN U. BUZDAR

Department of Biostatistics, The University of Texas M. D. Anderson Cancer Center, Houston, TX 77030, USA

INTRODUCTION

More than 200 000 women in the United Statesare diagnosed annually with breast cancer. About40 000 women die from the disease each year.Among women, it is the most common malig-nancy, and is exceeded only by lung cancer as theleading cause of cancer death. Although the riskof breast cancer is substantially higher in olderwomen, many cases occur in young women. Ofcases diagnosed in the US in 1998, 5% occurredin women under the age of 35, 30% in womenaged 35 to 49, 31% in women aged 50 to 64, and33% in women aged 65 or older.1 For US women,the lifetime risk of developing breast cancer isabout 13%. About 0.1% of US women carry aninherited mutation of a breast/ovarian cancer sus-ceptibility gene, BRCA1 or BRCA2, and so havea lifetime risk in excess of 50%.

From 1940 to the early 1980s, breast cancerincidence in the US increased by a fractionof a percent per year when adjusted for age.Chiefly because of the widespread disseminationof screening mammography beginning in theearly 1980s, invasive breast cancer incidence hasincreased by 3–4% per year into the 1990s. Overthe same period, the rate of ductal carcinoma

in situ (DCIS) increased by about sixfold, fromabout 5 cases per 100 000 women in 1980 to morethan 30 per 100 000 in 1998.

Despite the increasing incidence of breastcancer among US women since the early 1980s,and perhaps indirectly because of this increase,the annual age-adjusted rate of breast cancermortality has decreased by almost 2% per yearin the 1990s. Researchers generally attributethis improvement in breast cancer survival toincreased use of screening mammography and toimprovements in the treatment of breast cancer.Efforts are underway to better delineate therelative impacts of factors influencing breastcancer survival (see CISNET2 for additionalinformation).

An important development for breast cancerresearch in the 1980s and 1990s was not directlyrelated to science. These years saw the for-mation of strong advocacy groups that workedto promote research in breast cancer. Federalfunding has increased more than sixfold since1990, and grass-roots action has resulted inan unprecedented programme of breast cancerresearch funding administered by the Depart-ment of Defense. In addition, patient advocateshave become highly educated about research

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 96: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

88 TEXTBOOK OF CLINICAL TRIALS

issues and many serve regularly alongside profes-sional scientists on various governmental boardsguiding the direction of research expendituresand treatment recommendations. Patient advo-cates also serve on cooperative group committeesthat plan clinical trials in breast cancer, institu-tional review boards, and data safety monitor-ing boards.

Advocacy groups have worked to increase thenumber of women who participate in clinical tri-als. The Clinical Trial Initiative of the NationalBreast Cancer Coalition Fund (NBCCF) main-tains a registry of clinical trials and urges womenwith breast cancer to participate (see NBCCF3).Before a clinical trial can be included in theirregistry, experts from the NBCCF ascertain thatit addresses an important, novel research ques-tion related to breast cancer, and that its designis scientifically rigorous and employs appropriateand meaningful outcomes.

STAGING

Breast cancer is staged using a system developedby the American Joint Committee on Cancer,and based on the size and other characteristicsof primary tumour (T), the status of ipsilaterallymph nodes (N), and the presence or absenceof distant metastases (M) (AJCC4 and Singletaryet al.5). The stage of disease, ranging from 0to IV, is based on combinations of these TNMrankings.

Stage 0 consists of ductal and lobular carci-noma in situ (DCIS, LCIS), non-invasive andpossibly non-malignant forms of the disease.Stages I to III are invasive stages in which thetumour is confined to the breast or its immedi-ate vicinity. Higher stage indicates larger primarytumours or greater locoregional tumour involve-ment. Patients having evidence of distant metas-tasis are classified as Stage IV.

Distribution of disease stage at the timeof breast cancer diagnosis varies by country,depending on the health care system’s approachto diagnosis and reporting. In the US, the approxi-mate proportions of women diagnosed with Stage

0 through Stage IV disease are 21%, 42%, 29%,5% and 4%, respectively.1 An additional tumourclassification method based on histopathologicexamination has limited discrimination abilitybecause 70–80% of tumours are of a single type:infiltrating ductal carcinoma.

PROGNOSIS

Breast cancer is heterogeneous. Many breastcancers are slowly growing and their carrierssurvive for many years and die of other causes.Other tumours are very aggressive and mayhave spread to distant sites by the time theprimary tumour is diagnosed. This heterogeneityhas implications for research in all phases of thedisease, beginning with screening and diagnosticmethods through the evaluation of treatments foradvanced disease.

Stage is the most widely recognised deter-minant of patient outcome. Stage IV disease isgenerally regarded to be incurable, with mediansurvival in the range of 18 to 24 months, althougha small fraction of patients with Stage IV diseaseachieve complete remission following systemicchemotherapy, and survive for many years.6 Onthe other hand, patients with Stage I disease, con-sisting of a small primary tumour and no involvedlymph nodes, have at least a 90% probabilityof being disease-free after five years. Lymphnode involvement is associated with a worseprognosis, with five-year disease-free rates rang-ing from 50–75%. Tumour grade, proliferativeactivity and menopausal status play relativelyminor roles.

Although stage is an important prognosticfactor, it is of limited use as a determinantof treatment outcome. The relative benefitsof treatment are reasonably consistent acrossstages – although the absolute benefit can bemuch greater for higher stage disease. Much cur-rent research focuses on factors that may predictclinical benefit from certain treatment approaches(‘predictive factors’) in contrast to the more con-ventional ‘prognostic factors’ which are regardedas indicators of general tumour aggressiveness,irrespective of type of therapy.

Page 97: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BREAST CANCER 89

The best-studied predictive factor is oestrogen-receptor (ER) status, which is an importantindicator of whether a tumour will respond tohormonal treatment. Tamoxifen and other selec-tive oestrogen-receptor modulators (SERMs) arehighly effective in patients with hormone-sensitive breast cancer, but they have nobenefit in patients whose tumours are ER neg-ative and progesterone-receptor negative (EarlyBreast Cancer Trialists’ Collaborative Group[EBCTCG]7). Patients who benefit from SERMsmay also benefit from aromatase inhibitors.8

HER-2 (also referred to as HER-2/neu, ErbB2,c-erbB-2) is a member of the epidermal growthfactor receptor family that is overexpressed in20% to 40% of breast tumours, and has beencited in numerous reports as conveying poorprognosis.9 Studies in early breast cancer havesuggested that patients with HER-2 positivetumours are more likely to benefit from anthra-cycline therapy.10 – 12 Trastuzumab, a monoclonalantibody against HER-2, is effective in delay-ing progression in Stage IV disease that over-expresses HER-2,13 and is being evaluated forits efficacy in treating HER-2-overexpressing pri-mary tumours.

HISTORICAL PERSPECTIVE ON CLINICALTRIALS IN BREAST CANCER

SURGERY AND RADIOTHERAPY

Scientific understanding of the biology of breastcancer has changed radically in the past 50 years.Results of large randomised trials have played amajor role in this transition. From the nineteenthcentury and up into the 1970s, breast cancer wasunderstood to be a local/regional disease thatspread by direct extension along lymphatic path-ways to distant sites. This concept gave rise to thesurgical methods promoted by W.S. Halsted14 – 16

around the turn of the twentieth century, i.e.,extensive resection of the breast, regional lym-phatics, lymph nodes and muscle. This surgi-cal technique, known as radical mastectomy,remained the principal approach to treatment of

breast cancer throughout the first half of the cen-tury, sometimes combined with radiotherapy.

When the concept of large-scale randomisedclinical trials to investigate alternative therapieswas proposed in the 1960s, controversy aroseamong breast cancer researchers as well as inother medical fields. In a heated exchange, aprominent breast cancer surgeon denounced suchstudies as ‘a great leap backward in the treat-ment of breast cancer’.17 Despite such opposition,pioneers in the field persisted in designing tri-als to address important therapeutic questions ofthe time, and moreover, were able to persuadepatients to participate in this novel idea of assign-ing treatment by randomisation. These early tri-als compared various surgical and radiotherapyapproaches. In a trial of almost 1700 womenimplemented in 1971, there were no significantsurvival differences between conventional radi-cal mastectomy, total mastectomy with radiation,and total mastectomy with removal of axillarynodes.18,19 Results of this and other trials of theera challenged long-held views of the disease andgradually convinced researchers that their con-cept of breast cancer as a local disease whichcould best be treated by radical local treatmenttechniques was incorrect. Rather, breast cancercame to be understood as a systemic disease thatcould benefit from systemic therapy, and radi-cal local therapies were no longer regarded asessential for prolonging survival.

CHEMOTHERAPY

Cytotoxic agents for treatment of solid tumourswere first developed in the 1950s. Breast cancerproved to be highly sensitive to several of these,when used as single agents in small trials. Subse-quently, combinations of these cytotoxic agentswere evaluated, one of the earliest being theCooper regimen (cyclophosphamide, methotrex-ate, 5-fluorouracil, vincristine and prednisone).20

With the understanding of breast cancer as a sys-temic disease and the proven sensitivity of breastcancer cells to cytotoxic agents, the stage was setfor the rapid development of adjuvant chemother-apy once this concept was introduced in the

Page 98: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

90 TEXTBOOK OF CLINICAL TRIALS

1970s. A randomised trial comparing surgery fol-lowed by combination chemotherapy to surgeryalone demonstrated that disease recurrence couldbe significantly reduced using this adjuvant ther-apy approach.21

The introduction of doxorubicin for treatmentof breast cancer is illustrative of the series ofclinical trials typically undertaken for the devel-opment of new agents. Small trials conductedin solid tumours in the early 1970s establishedsafety and dosing, and these were quickly fol-lowed by Phase II trials of the agent in metastaticbreast cancer. Subsequently, doxorubicin wasevaluated in combination with other agents, andrandomised trials established that higher responserates could be achieved in metastatic disease withcombinations that included doxorubicin. Thesesuccesses prompted the introduction of variousdoxorubicin and other anthracycline-containingcombinations as adjuvant therapy for primarybreast cancer. Known by such acronyms as‘FAC’ = ‘CAF’, ‘FEC’, ‘AC’, these combina-tions continue to play a prominent role in thetreatment of breast cancer.22,23 Anthracycline-containing therapies further reduce the risk ofrecurrence and favourably impact survival inearly breast cancer.24

HORMONAL THERAPY

Hormonal therapy is a key component of therapywhen tumours are hormone-receptor positive.Early trials focused on ovarian ablation bysurgery or chemical means. The anti-oestrogenagent tamoxifen was introduced in the 1970s,at a time when there was high regard for thepotential of cytotoxic agents, but little interestin hormonal therapies. Early small trials inmetastatic breast cancer were equivocal and couldhave led to abandoning the agent. However, theweight of evidence from laboratory studies andseveral small trials pointed to superior efficacywith prolonged administration in ER positivedisease. After a series of large randomised trials,tamoxifen is now regarded as standard therapyfor pre- and post-menopausal women with ERpositive tumours.25 Tamoxifen may be the single

most important advance in treating breast cancer.Questions remain about the optimum treatmentduration even though a trial conducted by theNational Surgical Adjuvant Breast and BowelProject (NSABP) comparing 5 and 10 years oftamoxifen therapy concluded there was little orno advantage to longer therapy.26

HIGH-DOSE CHEMOTHERAPY WITH BONEMARROW TRANSPLANT OR STEM

CELL SUPPORT

An unresolved question in therapy of breast can-cer that has presented an unusual challenge forthe conduct of clinical trials is that of high-dose chemotherapy supported by autologous bonemarrow transplant or peripheral blood progenitorcells. Ten trials addressing the question of high-dose versus standard-dose chemotherapy havebeen reported. Two of these were subsequentlydiscredited following an international investiga-tion. Only two of the remaining eight trialsentered more than 200 patients. Financial issues,patient and physician acceptance and competingtreatment strategies have compromised accrual,and it is unclear if ongoing trials can be com-pleted. The available evidence suggests that high-dose therapy provides little or no benefit forpatients regardless of their disease stage.27,28

MAMMOGRAPHY

Eight large randomised trials conducted since1963 assessed the value of screening mam-mography for reducing breast cancer mortal-ity. These are of particular interest for thescrutiny they have undergone in recent years.The preponderance of evidence from the ran-domised trials indicates a benefit associated withscreening mammography.29 – 31 However, a meta-analysis concluded that six of the eight trialswere seriously flawed and the remaining twotrials showed insignificant breast-cancer mortal-ity differences between the screened and non-screened groups.32 The National Cancer Instituterecommends screening mammography every 1to 2 years for women aged 40 and older, while

Page 99: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BREAST CANCER 91

recognising that there are risks associated withfalse-positive results.

PREVENTION

Beginning in the 1990s, coinciding with thedetection of methods for identifying women athigh risk of breast cancer, the first large-scaletrials were mounted to determine if the incidenceof breast cancer could be reduced in targetedhigh-risk groups. These trials established thatbreast cancer incidence could be greatly reducedby daily doses of tamoxifen.33,34 This reductionwas due entirely to a lower incidence of ERpositive tumours with no change in the incidenceof ER negative tumours. This suggests thatprophylactic tamoxifen will not have as greatan impact on survival as it does on incidence,although none of the prevention trials addresssurvival as an endpoint. An ongoing trial (STAR:the Study of Tamoxifen and Raloxifene) willrecruit 22 000 women at high risk of breast cancerin order to compare the effects of tamoxifen andanother SERM, raloxifene, on the incidence ofprimary breast cancer.35

MAJOR TRIAL GROUPS

One of the largest cooperative groups conduct-ing trials in breast cancer in the US is theNSABP. Trials from this group are often referredto by their ‘B’ numbers, e.g., B-06, which estab-lished the equivalence of lumpectomy to totalmastectomy.36 Other major cooperative groupsconducting clinical trials in breast cancer are theEastern Cooperative Oncology Group (ECOG),Cancer and Leukemia Group B (CALGB), South-west Oncology Group (SWOG), Breast CancerInternational Research Group (BCIRG), Euro-pean Organisation for Research and Treatment ofCancer (EORTC), North Central Cancer Treat-ment Group (NCCTG), and the National CancerInstitute of Canada (NCIC).

An important information resource regardingthe benefits of treatment for early breast canceris the Early Breast Cancer Trialists’ Collabo-rative Group (EBCTCG). This group, based at

Oxford University, serves as a centre for datasynthesis rather than actual conduct of clinicaltrials. Beginning in 1983, this group has col-lected data from virtually all major randomisedtrials conducted in early breast cancer, publishedor not. Data from more that 200 000 womenhave been analysed, using statistical techniquesfor meta-analysis, with results published at theend of each five-year analysis cycle, beginning in1985. These publications have addressed the roleof radiation, ovarian ablation, polychemotherapy,tamoxifen and quality of life; these ‘overview’articles are frequently cited in support of treat-ment approaches.7,24,25,37 – 42 The weaknesses ofmeta-analysis have been widely discussed in thestatistical literature, chief among these being theissue of heterogeneity among the trials beingcombined. For example, the overviewers com-bine various therapeutic regimens under the sin-gle rubric ‘polychemotherapy’. However, theseoverview reports have allowed researchers to reli-ably assess moderate-size treatment effects whichcould not have been detected in individual trials.Treatments causing even moderate reductions inmortality, if implemented widely among womenwith breast cancer, could prevent or delay thou-sands of deaths due to the disease. The meta-analysis has also addressed questions of treatmentefficacy within subsets, for example the confir-mation of benefit of adjuvant tamoxifen in ERpositive pre-menopausal women7 as well as inpost-menopausal women.

TIME-DEPENDENT HAZARDS

In this section we address a methodologicalissue that arises quite generally in survivalanalysis. Consider disease-free survival. This isthe usual primary outcome measure in evaluatingadjuvant therapies, with results presented inthe form of Kaplan–Meier survival curves andcompared using statistical tests that take intoaccount the entire survival distributions. Thesimple hazard function, which is in effect thederivative of the survival curve, can serve as aneffective graphical aid to understanding treatment

Page 100: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

92 TEXTBOOK OF CLINICAL TRIALS

and covariate effects. Moreover, it can revealimportant effects that are not apparent in thesurvival curves, themselves.

We will use trial CALGB 8541 as anexample.43 This trial considered three differ-ent dose schedules of CAF (cyclophosphamide,doxorubicin and 5-fluorouracil) in node positivebreast cancer. The schedules consisted of fourcycles of CAF at 600, 60, 600 mg/m2 (high dose),six cycles at 400, 40, 400 mg/m2 (moderate dose)or four cycles at 300, 30, 300 mg/m2 (low dose).The primary endpoint was disease-free survival,which is shown in Figure 6.1 for the three dosegroups using Kaplan–Meier plots. Details of thecomparison are provided in the original report,and will not be repeated here.

Time-to-event curves such as those in Figure6.1 do not tell the whole story regarding anybenefit of increasing dose and dose intensity. Aclearer picture is contained in plots of hazard overtime. The hazard in any particular time periodis the proportion of events occurring during thattime period in comparison with the number ofpatients who are at risk at the beginning of theperiod. For example, if there are 100 patientsin a group and 10 of these recur in the firstyear, then the first-year hazard is 10%. Goinginto the second year, only 90 patients are atrisk. If another 10 recur in the second year,

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

0 2 4 6 8 10 12 14 16 18

Years

High

Mod

Low

Figure 6.1. Disease-free survival proportion for thethree CAF dose groups of CALGB 8541

then the second-year hazard is 10/90 = 11%.When calculating hazards from survival plotssuch as those in Figure 6.1 (which incorporatecensored observations), we subtract the currentyear’s survival proportion from the previousyear’s survival proportion and divide by theprevious year’s survival proportion. The resultingyearly values are shown in Figure 6.2.

Some authors like to smooth hazard estimatesover time. We prefer to show the raw estimates.The reason is that each time period provides a‘nearly independent’ trial of therapeutic compar-isons. Depending on what assumptions are madeabout the underlying survival distribution, thesetrials may not be truly independent, but eventsthat have occurred previously are set aside anda ‘new trial’ is begun. Each time period hasthe potential for confirming observations madein other time periods.

A striking observation from Figure 6.2 is thatall three hazards decrease over time (after year2). This is a reflection of the heterogeneity ofbreast cancer. The most aggressive tumours recurearly, yielding the high hazards evident in thefirst few years. Once their tumours have recurred,patients are removed from the at-risk population.The remaining tumours tend to be less aggressiveand so they recur at a lower rate.

0.00

0.05

0.10

0.15

0 2.5 5 7.5 10 12.5 15

Years

High

Mod

Low

Figure 6.2. Hazards for the three CAF dose groups ofCALGB 8541, derived from Figure 6.1

Page 101: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BREAST CANCER 93

As regards treatment-arm effect, the apparentbenefit of a regimen of high-dose chemotherapyis restricted to the first five years or so. Actually,the hazard for a high dose is lower than thoseof the other two arms in each of the first sixyears (although it is not much lower in years4, 5 and 6, and it is not much lower than thatfor a moderate-dose regimen in any of the sixyears). In view of the ‘near independence’ of thesix time periods, this observation is impressive.Another important observation from Figure 6.2 isthat after five years the risks of all three groupscome together, with the annual risk of recurrencebeing approximately 5% in all three groups.

The reduction in hazard of recurrence for highversus low doses is 14% over the 18 years offollow-up (95% confidence interval: 6–22%).This is an average over these years (weightedover time because of differences in at-risk samplesizes over time), but since there is no reductionat all in the later years, the overall reduction isbeing carried by the early years. Restricting to thefirst three years, the reduction is 24% (13–33%).A benefit of chemotherapy that is restricted to thefirst few years is typical in breast cancer trials. Animplication is that a hazard reduction seen earlyin a trial, say one with a median of three yearsof follow-up, will deteriorate over time. This isbecause the comparison will eventually involveaveraging over periods where there is no longera treatment benefit.

In the later years, the hazards of about 5%are very similar to the annual hazard for nodenegative breast cancer patients. Interestingly,convergence to about 5% applies irrespective ofthe number of positive lymph nodes. Figure 6.3shows this effect. It gives hazard plots for threecategories of positive nodes: 1–3, 4–9 and 10 ormore (for the three dose groups combined). Earlyin the trial, patients with 10+ positive nodes havea very high annual recurrence rate of 20–30%.However, after five years or so, the annual hazardis about 5% in all three groups. A patient witha large number of positive nodes who has notexperienced recurrence in the first five years orso has the same updated prognosis as a patientwith a small number of positive nodes, including

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

0 2.5 5 7.5 10 12.5 15

Years

10+

4−9

1−3

*

Figure 6.3. Hazards for the three categories ofpositive lymph nodes (1–3, 4–9 and 10 or more)for CALGB 8541. There are few patients at risk in thelater years, especially in the 10+ group, and for tworeasons. One is that this was the smallest group tostart with (174 of the 1550 patients in the trial), andthe other is that most recurred early. For example, theasterisk at 13 years indicates a time point at whichthere were only 24 patients at risk, and three of theserecurred in the 13th year

no positive nodes. The effects of both the numberof positive nodes and dose of CAF have elapsedafter five years.

An important aspect of CALGB 8541 is therole of tumour HER-2/neu expression and inparticular its interaction with dose of CAF.10

HER-2/neu assessment was carried out for asubset of 992 patients from the original study. Itsinteraction with dose was shown to be significantin a multivariate proportional hazards model.But the manner of interaction is easiest tounderstand using hazards. Figure 6.4 shows theeffect of dose of CAF separately for patientswith HER-2/neu negative tumours (n = 720) andHER-2/neu positive tumours (n = 272). HER-2/neu negatives show no dose effect. The entirebenefit of high dose over moderate dose andhigh dose over low dose that is observed inthese patients is concentrated in patients whosetumours are HER-2/neu positive. Moreover, this

Page 102: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

94 TEXTBOOK OF CLINICAL TRIALS

High

Mod

Low

High

Mod

Low

0.00

0.05

0.10

0.15

0.20

0 1 2 3 4 5 6 7 8 9 10 11 12

Year

0.00

0.05

0.10

0.15

0.20

0 1 2 3 4 5 6 7 8 9 10 11 12

Year

Figure 6.4. Annual disease-free survival hazards for a subset of patients (n = 992) in CALGB for whomexpression of HER-2/neu in the patient’s tumour was assessed. Patients in the left-hand panel had tumours thatwere HER-2/neu negative and the tumours of those in the right-hand panel were HER-2/neu positive

benefit occurs through a reduction in hazard ineach of the first three to four years. Again,each year is a separate study and so each ofthese years provides a separate confirmation ofthe overall conclusion. The hazard reduction inthe first three years for high dose as comparedwith the other two groups combined was 65%among patients whose tumours were HER-2/neupositive. HER-2/neu overexpression apparentlyconveys a poor prognosis for lower doses butnot for a high dose – it might even provide afavourable prognosis for a high dose.

Many of the above conclusions would havebeen difficult or impossible without consideringhazards over time. A final comment regardinghazards relates to the common problem ofpredicting survival results into the future forpatients already accrued to a trial. ConsiderFigure 6.1. Some patients have as little as10 years of follow-up information. As morefollow-up information becomes available, therewill be no change in these curves prior to the 10-year time point, but they may change subsequentto 10 years. Because the focus is on patients whohave not yet recurred, the way the curves willchange depends on the hazards beyond 10 years.

The information available about these hazardsis shown in Figure 6.2. For predicting whenand whether a patient recurs, hazards should beconsidered one year at a time, and based on thecurrent year of follow-up.

ASSESSING LONG-TERM IMPACTSOF THERAPY

Showing that a cancer therapy is beneficial usinglogrank tests or proportional hazards regressionmodels or whatever other analysis one uses, doesnot allow for concluding the nature of the benefit.It may be that some patients are cured of theirdisease; or the therapy may delay the disease’sprogress in some patients; or the effect may bea mixture of the two. Deciding among thesepossibilities may be possible when all or almostall events occur in a modest amount of follow-up time. In primary breast cancer, a goodlyproportion of patients never recur. Therefore,such a decision is difficult or impossible to make.

Figure 6.5 illustrates the difficulty in discrim-inating between cure and prolonging survival inbreast cancer trials. Consider a clinical trial that

Page 103: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BREAST CANCER 95

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

0 5 10 15 20

Years

Current

Prolong

Cure

Sur

viva

l Pro

b

Figure 6.5. Hypothetical survival curves comparingcure and prolonging survival

is designed to evaluate a new therapy, one thatmay improve survival. Suppose further that in thepopulation of interest, the current annual rate ofbreast cancer mortality is 8%. The correspond-ing survival distribution is shown by the dashedsurvival curve labelled ‘current’ in Figure 6.5.It assumes exponential survival (although thesame effect holds for any parametric form) andso the current median survival is 8.66 years.The goal of the new therapy is to improve thisby 1/3 to 11.55 years. If this happens then itmay be through prolonging every patient’s sur-vival by reducing the annual mortality rate to6% – the curve labelled ‘prolong’ – or by leav-ing the annual rate unchanged (at 8%) for mostof the patients but curing a fraction of them – thecurve labelled ‘cure’. To have the same median(11.55 years) as ‘prolong’ implies a cure rateof 17.1%.

In view of the sampling variability present inempirical survival information, it is impossibleto discriminate between the ‘prolong’ and ‘cure’curves shown in Figure 6.5 on the basis ofresults of even impossibly large clinical trials.Indeed, the critical part of the follow-up periodfor this discrimination is 20 years and beyond,and few trials have followed patients for this

long. Moreover, information beyond 20 yearsis relatively sparse because earlier events andcompeting risks (such as cardiovascular disease)will have removed patients from the at-riskpopulation. To make inferential matters worse,there is an enormous array of possible curves thatare similar to the two shown in the figure, withsome having cure rates and others not. Finally,the ‘current’ survival distribution assumes thatall breast cancer is fatal (although survival timesvary). More realistically, some breast cancer(including some invasive as well as in situ breastcancer) will never kill the patient. Decidingwhether a new therapy cures some patients iseven more difficult if a proportion of patients isassumed to have non-fatal disease.

This inability to distinguish between curingpatients and prolonging survival has furtherimplications in the evaluation of screening anddiagnostic methods. It is possible that breastcancers become lethal or not in their veryearly development, as suggested by studies oftumour markers purported to identify especiallyaggressive tumours. If this is so, then earlydetection may not help, and the observed benefitsof therapy, however substantial, may be the resultof slowing the progress of the disease rather thancuring it. Such slowing may be beneficial whetherit comes early or late in the disease.

ADAPTIVE DESIGNS OF CLINICAL TRIALS

Adaptive designs have the dual goals of efficientlearning from all relevant results and effectivetreatment of patients. They are more flexible thanconventional designs, and have application in allphases of drug development. Such designs canbe implemented using Bayesian methodology asa means to incorporate new information into thetrial design.

Designs of clinical trials for breast cancerare usually static in the sense that the samplesize and any prescription for assigning treatment,including the randomisation of patients, are fixedin advance. While designs may include stoppingrules, such as the two-stage Phase II trial design

Page 104: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

96 TEXTBOOK OF CLINICAL TRIALS

of Simon,44 or the interim comparisons in PhaseIII designs,45,46 the criteria for early stoppingare very conservative and therefore few trialsactually stop early. The simplicity of trialswith static design makes them solid inferentialtools. The sample sizes tend to be large, witha straightforward treatment comparison as theobjective. Despite their virtues, static trials resultin slow and unnecessarily costly development ofnew therapeutic agents.

The tradition of drug development is to evalu-ate a single drug at a time. Given the fast pace ofcurrent new drug discovery (there are hundredsof known experimental drugs with potential bene-fits in breast cancer), these inefficient evaluationmethods are no longer adequate. In addition tothe traditional focus on false-positive and false-negative errors in standard drug testing, anotherkind of error applies to drugs not under investi-gation. Every such drug is a false neutral. Giventhe limited resources available to the medicalestablishment to develop new therapies, resourceallocation must be approached in a more rationalway. This is as true in breast cancer, for which arelatively large number of women are willing toparticipate in clinical trials, as it is for other formsof cancer. Pharmaceutical companies and medi-cal researchers generally must be able to considerhundreds of drugs for development at the sametime. Static trials inhibit the simultaneous pro-cessing of many drugs. They cannot efficientlyaddress dose–response questions or prioritisationof similar agents when many drugs are under con-sideration. Dynamic designs that are integratedwith the drug development process are necessaryfor reasonable progress in medical research.

Using an adaptive design means examiningthe accumulating data periodically – or even con-tinually – with the goal of modifying the trial’sdesign. These modifications depend on whatthe data show about the unknown hypotheses.Among the modifications possible are stoppingearly, restricting eligibility criteria, expandingaccrual to additional sites, extending accrualbeyond the trial’s original sample size if itsconclusion is still not clear, dropping arms ordoses, and adding arms or doses. All of these

possibilities are considered in light of the accu-mulating information.

Adaptive designs also include unbalanced ran-domisation, in which the degree of imbalancedepends on the accumulating data. For example,arms that give more information about thehypothesis in question or that are performingbetter than other arms can be weighted moreheavily.47 Current (Bayesian) probabilities thateach of several doses or agents surpass standardor placebo therapy are calculated. These calcu-lations use all information from patients treatedto date. A new patient is then assigned totreatment randomly, with weights proportionalto these probabilities. The assignments involvesome degree of randomisation, but all patientsare more likely to receive treatments that are per-forming better. Those that are doing sufficientlypoorly become inadmissible in the sense that theirassignment weight becomes 0. When and if welearn that a new agent is effective (or ineffective),we stop the trial. Patients in the trial benefit fromdata collected in the trial. The explicit goal isto treat patients more effectively, but in additionwe learn about the new agents more efficiently.Initially we evaluate each design’s frequentistoperating characteristics using Monte Carlo sim-ulation, possibly modifying the parameters of theassignment algorithm to achieve the desired char-acteristics.

Adaptive designs are being used increasinglyin cancer trials. This is true for trials sponsored bypharmaceutical companies, and more generally.A variety of trials at The University of TexasM.D. Anderson Cancer Centre (MDACC) areprospectively adaptive. For example, we arebuilding the foundation for a Phase II trialfor evaluating drugs for breast cancer that ismore a process than a trial. The idea is anextension of more general adaptive assignmentstrategies. We start with a number of treatmentarms plus a control – possibly a standard therapy.We randomise to the arms and learn about theirrelative efficacy as the trial proceeds. Arms thatperform better get used more often. An arm thatperforms sufficiently poorly gets dropped. Anarm that does well enough graduates to Phase III,

Page 105: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BREAST CANCER 97

and if it does sufficiently well it might evenreplace the control. As more treatments becomeavailable, they are added to the mix and theprocess can continue indefinitely.

A trial of a new agent for treatment ofmetastatic breast cancer is being compared to thecurrent standard therapy in a dynamic mannerthat allows the incorporation of newly availabletreatments in the randomisation process, as wellas the elimination of treatments when a lack ofimproved efficacy can be established. Patients arerandomised to treatments with weights propor-tional to the probability that a treatment is betterthan the standard therapy. The result is that supe-rior therapies move through quickly and poorertherapies get dropped. Patients in the trial are pro-vided with better treatment (when the arms arenot equally good). Patients outside the trial getaccess to better treatments more rapidly.

Dose-finding trials of new agents are also con-ducted adaptively at MDACC, with dose assign-ment based on Bayesian updating of a modelwhich relates dose and toxicity, using resultsfrom preceding patients. The model is the con-tinual reassessment method or CRM.48,49 Eachpatient is assigned to the dose having a prob-ability of toxicity closest to some predeterminedtarget value. This is the Bayesian posterior proba-bility calculated from the data available up to thatpoint (and so it is based on sufficient statistics).

The CRM more effectively finds the maximumtolerated dose (MTD) than does the conventional3 + 3 design.50 A way in which both the 3 +3 and CRM designs are crude is the needto pause accrual while waiting for toxicityinformation.51 Such pauses are inefficient andthey cause logistical problems. Trials should bepaused or stopped if there are safety concerns,not because the design cannot get out of its ownway. In getting information about toxicity (orefficacy), there is seldom a magical dose that thenext patient must get. All doses are potentiallyinformative. Rather than stopping, one shoulduse a design that models dose–response (toxicityand efficacy) and is able to assign a next doseeven though patients previously treated are not

yet fully evaluable. Other improvements to dose-finding methods are underway. These includethe simultaneous incorporation of efficacy resultsinto the design, and the use of toxicity severityrather than the usual assumption that toxicity isdichotomous.

CONCLUSION

Breast cancer clinical trials are not fundamentallydifferent from those of other cancers. However,breast cancer stands out for several reasons. First,it is common, and it is becoming even morecommon with the improvements in and greateruse of detection methods. That implies a greaterability to investigate the potential for therapeuticagents and combinations. As a consequence, therehave been hundreds of randomised clinical trialsconducted in breast cancer, more by far than inany other cancer. Second, it is a disease that isfatal in only a minority of cases. Third, patientadvocates in the breast cancer community havebeen very influential, both as a research forceand a political force, in lobbying for researchfunding. Fourth, breast cancer has been shownto be sensitive to a number of chemotherapiesand hormonal therapies. The advances that havebeen made in breast cancer therapy are moreimpressive than for any other type of cancer,except for testicular cancer and some forms ofleukaemia that commonly affect children. Theseadvances have been built on a foundation ofclinical trials.

REFERENCES

1. Surveillance, Epidemiology, and End ResultsProgram (SEER) Public-Use Data (1973–1998),National Cancer Institute, DCCPS, SurveillanceResearch Program, Cancer Statistics Branch,Bethesda, MD, August 2000 submission; releasedApril 2001 [online database, available at http://seer.cancer.gov].

2. Cancer Intervention and Surveillance ModelingNetwork (CISNET) [online information site].Available through the National Cancer Institute,US government, at http://cisnet.cancer.gov/about.

3. National Breast Cancer Coalition Fund (NBCCF).A patient guide to quality breast cancer care

Page 106: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

98 TEXTBOOK OF CLINICAL TRIALS

[online information site]. Available at http://www.natlbcc.org/nbccf.

4. American Joint Committee on Cancer (AJCC)[online information site]. Available at http://www.cancerstaging.org.

5. Singletary SE, Allred C, Ashley P, Bassett LW,Berry D, Bland KI, Borgen PI, Clark G, Edge SB,Hayes DF, Hughes LL, Hutter RBP, Morrow M,Page DL, Recht A, Theriault RL, Thor A, WeaverDL, Wieand HS, Greene FL. Revision of theAmerican Joint Committee on Cancer StagingSystem for Breast Cancer. J Clin Oncol (2002)20: 3628–36.

6. Greenberg PA, Hortobagyi GN, Smith TL, ZieglerLD, Frye DK, Buzdar AU. Long-term follow-upof patients with complete remission followingcombination chemotherapy for metastatic breastcancer. J Clin Oncol (1996) 14: 2197–205.

7. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Tamoxifen for early breast cancer: anoverview of the randomised trials. Lancet (1998)351: 1451–67.

8. Buzdar A, Howell A. Advances in aromataseinhibition: clinical efficacy and tolerability in thetreatment of breast cancer. Clin Cancer Res (2001)7: 2620–35.

9. Yamauchi H, Stearns V, Hayes DF. When is atumor marker ready for prime time? A case studyof c-erbB-2 as a predictive factor in breast cancer.J Clin Oncol (2001) 19: 2334–56.

10. Thor AD, Berry DA, Budman DR, Muss HB,Kute T, Henderson IC, Barcos M, Cirrincione C,Edgerton S, Allred C, Norton L, Liu ET. erbB-2,p53, and efficacy of adjuvant therapy in lymphnode-positive breast cancer. J Natl Cancer Inst(1998) 90: 1346–60.

11. Paik S, Bryant J, Tan-Chiu E, Yothers G, Park C,Wickerham DL, Wolmark N. HER2 and choice ofadjuvant chemotherapy for invasive breast cancer:National Surgical Adjuvant Breast and BowelProject Protocol B-15. J Natl Cancer Inst (2000)92: 1991–8.

12. Ravdin PM, Green S, Albain KS, Boucher V,Ingle J, Pritchard K, Shepard L, Davidson N,Hayes DF, Clark GM, Martino S, Osborne CK,Allred DC. Initial report of the SWOG biologi-cal correlative study of c-erbB-2 expression as apredictor of outcome in a trial comparing adju-vant CAF T with tamoxifen (T) alone. Proc AmSoc Clin Oncol (1998) 17: 97a.

13. Slamon DJ, Leyland-Jones B, Shak S, Fuchs H,Paton V, Bajamonde A, Fleming T, Eiermann W,Wolter J, Pegram M, Baselga J, Norton L. Useof chemotherapy plus a monoclonal antibodyagainst HER2 for metastatic breast cancer thatoverexpresses HER2. N Engl J Med (2001) 344:783–92.

14. Halsted WS. The results of operations for the cureof cancer of the breast performed at the JohnsHopkins Hospital from June 1889 to January 1894.Ann Surg (1894) 20: 497–555.

15. Halsted WS. The results of radical operations forthe cure of carcinoma of the breast. Ann Surg(1907) SLVI: 1–19.

16. Buchanan EB. A century of breast cancer surgery.Cancer Invest (1996) 14: 371–7.

17. Haagensen CD. A great leap backward in thetreatment of carcinoma of the breast. JAMA (1973)224: 1181–3.

18. Fisher B, Montague E, Redmond C, Barton B,Borland D, Fisher ER, Deutsch M, Schwarz G,Margolese R, Donegan W, Volk H, Konvolinka C,Gardner B, Cohn I, Lesnick G, Cruz AB, Law-rence W, Nealon T, Butcher H, Lawton R. Com-parison of radical mastectomy with alternativetreatments for primary breast cancer: a first reportof results from a prospective randomized clinicaltrial. Cancer (1977) 39: 2827–39.

19. Fisher B, Jeong J-H, Anderson S, Bryant J, FisherER, Wolmark N. Twenty-five-year follow-up of arandomized trial comparing radical mastectomy,total mastectomy, and total mastectomy followedby irradiation. N Engl J Med (2002) 347: 567–75.

20. Cooper RG. Combination chemotherapy in hor-mone-resistant breast cancer. Proc Am AssocCancer Res (1969) 10: abstr 57.

21. Bonadonna G, Brusamolino E, Valagussa P,Rossi A, Brugnatelli L, Brambilla C, DeLena M,Tancini G, Bajetta E, Musumeci R, Veronesi U.Combination chemotherapy as an adjuvanttreatment in operable breast cancer. N Engl J Med(1976) 294: 405–10.

22. Hortobagyi GN. Anthracyclines in the treatmentof cancer. Drugs (1997) 54(Suppl): 1–7.

23. Weiss RB. The anthracyclines: will we ever finda better doxorubicin? Semin Oncol (1992) 19:670–86.

24. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Polychemotherapy for breast cancer:an overview of the randomised trials. Lancet(1998) 352: 930–42.

25. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Tamoxifen for early breast cancer(Cochrane Review) [online database, available athttp://cochrane.de/cochrane]. Cochrane DatabaseSyst Rev (2001) 1: CD000486.

26. Fisher B, Dignam J, Bryant J, Wolmark N. Fiveversus more than five years of tamoxifen forlymph node-negative breast cancer: updated find-ings from the National Surgical Adjuvant Breastand Bowel Project B-14 randomized trial. J NatlCancer Inst (2001) 93: 684–90.

27. Farquhar C, Basser R, Hetrick S, Lethaby A,Marjoribanks J. High dose chemotherapy and

Page 107: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

BREAST CANCER 99

autologous bone marrow or stem cell transplanta-tion versus conventional chemotherapy for womenwith metastatic breast cancer (Cochrane Review)[online database, available at http://cochrane.de/cochrane]. Cochrane Database Syst Rev (2003) 1:CD003142.

28. Farquhar C, Basser R, Marjoribanks J, Lethaby A.High dose chemotherapy and autologous bonemarrow or stem cell transplantation versus con-ventional chemotherapy for women with earlypoor prognosis breast cancer (Cochrane Review)[online database, available at http://cochrane.de/cochrane]. Cochrane Database Syst Rev (2003) 1:CD003139.

29. Miller AB, Baines CJ, To T, Wall C. CanadianNational Breast Screening Study: 1 – Breast can-cer detection and death rates among women aged40 to 49 years. Can Med Assoc J (1992) 147:1459–76.

30. Miller AB, Baines CJ, To T, Wall C. CanadianNational Breast Screening Study: 2 – Breast can-cer detection and death rates among women aged50 to 59 years. Can Med Assoc J (1992) 147:1477–88.

31. Nystrom L, Andersson I, Bjurstam N, Frisell J,Nordenskjold B, Rutqvist LE. Long-term effectsof mammography screening: updated overview ofthe Swedish randomised trials. [erratum in Lancet(2002) 360: 724]. Lancet (2002) 359: 909–19.

32. Gøtzsche PC, Olsen O. Is screening for breastcancer with mammography justifiable? Lancet(2000) 355: 129–34.

33. Fisher B, Costantino JP, Wickerham DL, Red-mond CK, Kavanah M, Cronin WM, Vogel V,Robidoux A, Dimitrov N, Atkins J, Daly M, Wie-and S, Tan-Chiu E, Ford L, Wolmark N. Tamox-ifen for prevention of breast cancer: report ofthe National Surgical Adjuvant Breast and BowelProject P-1 Study. J Natl Cancer Inst (1998) 90:1371–88.

34. IBIS investigators. First results from the Interna-tional Breast Cancer Intervention Study (IBIS-I):a randomised prevention trial. Lancet (2002) 360:817–24.

35. Vogel VG. Follow-up of the breast cancer preven-tion trial and the future of breast cancer preventionefforts. Clin Cancer Res (2001) 7: 4413s–18s.

36. Fisher B, Anderson S, Bryant J, Margolese RG,Deutsch M, Fisher ER, Jeong J-H, Wolmark N.Twenty-year follow-up of a randomized trialcomparing total mastectomy, lumpectomy, andlumpectomy plus irradiation for the treatment ofinvasive breast cancer. N Engl J Med (2002) 347:1233–41.

37. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Effects of adjuvant tamoxifen andof cytotoxic therapy on mortality in early breast

cancer: an overview of 61 randomized trialsamong 28,896 women. N Engl J Med (1988) 319:1681–91.

38. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Treatment of Early Breast Cancer,Vol. I, Worldwide Evidence 1985–1990. Oxford:Oxford University Press (1990).

39. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Systemic treatment of early breastcancer by hormonal, cytotoxic, or immune ther-apy: 133 randomised trials involving 31,000 recur-rences and 24,000 deaths among 75,000 women.Lancet (1992) 339: 1–15; 71–85.

40. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Effects of radiotherapy and surgery inearly breast cancer: an overview of the randomizedtrials. N Engl J Med (1995) 333: 1444–55.

41. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Ovarian ablation in early breastcancer: overview of the randomised trials. Lancet(1996) 348: 1189–96.

42. Early Breast Cancer Trialists’ Collaborative Group(EBCTCG). Favourable and unfavourable effectson long-term survival of radiotherapy for earlybreast cancer: an overview of the randomisedtrials. Lancet (2000) 355: 1757–70.

43. Budman DR, Berry DA, Cirrincione CT, Hender-son IC, Wood WC, Weiss RB, Ferree CR, MussHB, Green MR, Norton L, Frei III E. Dose anddose intensity as determinants of outcome in theadjuvant treatment of breast cancer. J Natl CancerInst (1998) 90: 1205–11.

44. Simon R. Optimal two-stage designs for phase IIclinical trials. Control Clin Trials (1989) 10: 1–10.

45. O’Brien PC, Fleming TR. A multiple testing pro-cedure for clinical trials. Biometrics (1979) 35:549–56.

46. Lan KKG, DeMets DL. Discrete sequential bound-aries for clinical trials. Biometrika (1983) 70:659–63.

47. Berry DA, Fristedt B. Bandit Problems: Sequen-tial Allocation of Experiments. London: Chapman-Hall (1985).

48. O’Quigley J, Pepe M, Fisher L. Continualreassessment method: a practical design for phaseI clinical trials in cancer. Biometrics (1990) 52:673–84.

49. Goodman S, Zahurak M, Piantadosi S. Some prac-tical improvements in the continual reassessmentmethod for phase I studies. Stat Med (1995) 14:1149–61.

50. Ahn C. An evaluation of phase I cancer clinicaltrial designs. Stat Med (1998) 17: 1537–49.

51. Thall PF, Lee JJ, Tseng C-H, Estey E. Accrualstrategies for phase I trials with delayed patientoutcome. Stat Med (1999) 18: 1155–69.

Page 108: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

7

Childhood Cancer

SHARON B. MURPHY1 AND JONATHAN J. SHUSTER2

1Children’s Cancer Research Institute, University of Texas Health Science Centre, San Antonio,TX 78229-3900, USA

2College of Medicine, University of Florida, Gainesville, FL 32610-0212, USA

INTRODUCTION

There are substantial differences in the conductof clinical cancer research in children com-pared to adults. First, childhood cancer is com-paratively rare. According to statistics releasedby the National Cancer Institute SEER pro-gramme in 1999,1 it is estimated that approxi-mately 12 400 children and adolescents, youngerthan 20 years of age, are diagnosed annuallywith cancer. Stated another way, the averageannual age-adjusted incidence rate for all child-hood cancers is 150 per million persons, aged<20. This is under 2% of the total cancersdiagnosed in the USA. Despite the rarity andnotwithstanding the spectacular success in treat-ment of paediatric cancer, compared to inci-dence and mortality rates of cancer occurringamong adults, cancer is the leading cause ofdeath from disease among children and adoles-cents. Only accidents and firearms kill more chil-dren than cancer. Further, the distribution of can-cer diagnoses in children is very different fromthat in adults. There are a number of major

tumours, such as Wilm’s tumour, retinoblastoma,rhabdomyosarcoma, neuroblastoma, Ewing’s sar-coma and osteogenic sarcoma, for example, thatare either exclusively or predominantly paediatricin nature. In contrast, carcinomas of differenti-ated epithelial tissues, like the aerodigestive tractor breast or prostate, do not occur in children.Thanks to the usual lack of co-morbid conditionsand concomitant illnesses, children usually havea greater tolerance to cancer therapy than adults.Taken together with the differing spectrum ofcancer seen, the host differences related to agenecessitate that paediatric studies of anti-cancerdrug dosage, efficacy and safety are needed.Recognising that children are not just small adultsand that special considerations apply, the FDAhas issued regulations mandating the testing ofnew drugs in paediatric patients.

Given the fact that modern treatments result incure of 75–80% of all children and adolescentswith cancer who are managed appropriately, thelong-term consequences of therapy for childrenare also potentially much greater than in adults,as the therapy can interfere with normal growthand development, leaving them exposed for

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 109: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

102 TEXTBOOK OF CLINICAL TRIALS

decades at risk for serious sequelae, majororgan disturbance, such as cardiac damage andcognitive dysfunction, or second malignancies.Children diagnosed with cancer generally alsohave less of a problem with competing mortalityrisks, when compared to adult cancer.

Given these factors, to adequately size child-hood cancer research studies, a substantial pro-portion of the incident cases must be enrolled.In fact, in some situations, such as randomisedPhase III trials of new regimens compared toalready effective front-line treatments, nation-wide multi-institutional trials are a necessity.These trials will need to enroll nearly every childin the target population with the disease beingstudied for three to five years, with many years offollow-up needed to assess long-term outcomes.Accrual duration in childhood trials may be con-siderably longer than a corresponding adult trial.However, since clinical practice closely approx-imates that of the ongoing study, it is rare thatprogress from an external source ever renders thestudy question obsolete. Finally, given the rar-ity of childhood cancer, and the desirability ofstudy designs of maximal efficiency, it is oftendesirable to conduct ‘2 × 2 factorial studies’,where two interventions are used in the same trial(Standard vs. Standard + A vs. Standard + B vs.Standard + A + B). Such designs carry some riskwhere there is a qualitative interaction betweenthe two interventions. For example, this wouldoccur if the impact of A is highly dependentupon whether B is given or not. Hence the choiceof randomised interventions needs to take thispitfall into account when such factorial designsare considered.

Based on previous trials and internal registrydata of the major national paediatric coopera-tive oncology groups, Table 7.1 provides esti-mates of the potential accrual to the Children’sOncology Group (COG), a consortium of about230 American, Canadian, European and Aus-tralasian medical centres. COG was formed in2000 by the merger of the Pediatric Oncol-ogy Group (POG), the Children’s Cancer Group(CCG), the Intergroup Rhabdomyosarcoma StudyGroup (IRSG) and the National Wilm’s Tumor

Table 7.1. Major categories of paediatric cancer andprojected annual accrual of the Children’s OncologyGroup

Leukaemia

Infant ALLa: 50‘Standard Risk’ B-Precursor ALLa: 1100‘High Risk’ B-Precursor ALLa: 600T-Cell ALLa: 240Philadelphia Chromosome Positive (Ph+) ALLa: 20B-Cell (SIg+)ALLa: 45ANLLa: 300

Lymphoma

Hodgkin’s Disease: 250‘Early Stage’ NHLa: 90‘Advanced Stage’ Lymphoblastic NHLa: 90‘Advanced Stage’ Large Cell NHLa: 80‘Advanced Stage’ Small Non-Cleaved Cell NHLa: 125

Brain Tumours

Astrocytoma: 140Medulloblastoma: 140Glioma: 70Ependymoma: 35

Sarcomas

Ewing’s Sarcoma: 80Osteosarcoma: 170‘Low Risk’ Rhabdomyosarcoma: 64‘Intermediate Risk’ Rhabdomyosarcoma: 99‘High Risk’ Rhabdomyosarcoma: 21

Kidney

Wilm’s Tumour: 480

Embryonal

‘Low Risk’ Neuroblastoma: 200‘Intermediate Risk’ Neuroblastoma: 110‘High Risk’ Neuroblastoma: 150Hepatoblastoma: 65Germ Cell Tumours: 25

aALL = Acute Lymphoblastic LeukaemiaAML = Acute Non-Lymphoblastic LeukaemiaNHL = Non-Hodgkin’s Lymphoma.

Study Group (NWTSG). Virtually every US orCanadian hospital with a childhood paediatrichaematology/oncology division belongs to COG.Note that the anticipated annual accrual to COGtrials does not mirror incidence figures, as there isa substantial gap between incident rates and rates

Page 110: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 103

of referral of newly diagnosed paediatric cancerpatients to member institutions of COG and par-ticipation in clinical trials. Ross et al.2 analysed21 026 incident paediatric cancer cases, diag-nosed from 1989–91, and compared observedto expected numbers of cases seen at memberinstitutions of POG and CCG and found vastlydifferent ratios (observed/expected) depending onage and site, and to a much less extent, geo-graphic region. According to this survey, 92%of children aged less than 15 years in the USreceived their care at a CCG or POG institu-tion, thereby providing at least a mechanismapproximating population-based studies. How-ever, the ratio (O/E) for 15–19 year olds wasonly 0.21, pointing to an adolescent gap inaccess to national cancer clinical trials at qual-ified institutions.

As seen in Table 7.1, the Children’s OncologyGroup runs Phase III clinical trials in a widevariety of tumours, with accrual ranging fromas few as 20 patients per year to over 1000 peryear. The Children’s Oncology Group is alsoheavily involved in correlative science, pilotstudies of potential Phase III interventions, aswell as standard Phase I (dose escalation) studiesand Phase II (early efficacy) studies. The groupplaces special emphasis on translational research(biologic correlation studies) and cancer control(supportive care studies to limit long-term sideeffects and epidemiologic studies to learn aboutthe aetiology of childhood cancer). Due to thepresumably genetic origin of most forms ofchildhood cancer and the short lag time betweensymptoms and diagnosis, prevention trials andscreening trials are difficult to do in paediatriccancer, with neuroblastoma,3 which is basedon urinary screening for elevated levels ofcatecholamines, a notable exception.

This chapter is organised into several sections.In the first section, major accomplishments inthe area of childhood cancer treatment are dis-cussed. In the following section, examples arecited where translational research has affectedthe design of paediatric cancer trials. In the suc-ceeding section, the typical methods of design-ing trials for the Children’s Oncology Group

are presented. A special section dealing with theethical aspects and unique considerations affect-ing the conduct of clinical trials in children isalso included. The final section is devoted to alook into the future.

HISTORY AND PERSPECTIVESON IMPORTANT PAEDIATRIC

CANCER CLINICAL TRIALS

Statistically and clinically significant improve-ments have been achieved in all major formsof childhood cancers through conduct of well-organised single institution and cooperative groupclinical trials which have resulted in sequen-tial and steady improvement in survival ratessince the 1960s when curative treatments werefirst devised. SEER data document that the over-all childhood cancer mortality rates have con-sistently declined throughout the 1975–95 timeperiod.1 Documentation of the overall progressachieved by POG investigators has been reported,demonstrating significant improvements in over-all survival (OS) and event-free survival (EFS)for 8 of 10 disease areas, in a sample of over 7000children and adolescents treated between 1976and 1989.4 Similar results have been achieved byCCG and by European national paediatric coop-erative clinical trials organisations. There is alsoevidence that children and adolescents with acutelymphocytic leukaemia (ALL), non-Hodgkin’slymphoma (NHL), Wilm’s tumour, medulloblas-toma and rhabdomyosarcoma enjoy a significantsurvival advantage when treated according towell-defined protocols, compared to paediatricpatients not enrolled on protocols and treated out-side of paediatric cancer centres.5 Most probablythe inclusion benefit related to participation inclinical trials is a result of a number of factors,including the rigorous process of protocol devel-opment, incorporation of rapid pathology reviewand reference laboratories, defined staging prac-tices and procedures, on-study review of radio-therapy port films, and close monitoring for toxi-city and efficacy. Some of the important advancesachieved in treatment of paediatric cancers arelisted in Table 7.2.

Page 111: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

104 TEXTBOOK OF CLINICAL TRIALS

Table 7.2. Examples of important advances resultingfrom paediatric cancer clinical trials

• Adjuvant chemotherapy improves survival from20% to 70% in non-metastatic osteosarcoma ofthe extremity.50

• Doxorubicin improves outcome when added toother chemotherapy for Ewing’s sarcoma51 andthe addition of ifosfamide and etoposide tovincristine, adriamycin, cyclophosphamide andactinomycin results in greater benefit.52

• Radiation therapy does not improve survival forpatients receiving chemotherapy with Stage I andII, Wilm’s tumour,53,54 Stage Irhabdomyosarcoma55 or localised non-Hodgkin’slymphoma.56

• Demonstration of improved event-free survival inhigh-risk neuroblastoma receiving myeloablativetherapy in conjunction with autologous bonemarrow transplantation and subsequent treatmentwith 13-cis-retinoic acid compared tochemotherapy alone.57

• Attainment of 80% 4-year event-free survival ratesfor standard risk B-precursor ALL.58

• Achievement of 78% EFS for patients withloco-regional embryonal rhabdomyosarcomathrough intensification of chemotherapy inIntergroup Rhabdomyosarcoma Study (IRS)-IV.59

Success in treatment of the most commonform of paediatric malignancy, acute lymphoblas-tic leukaemia (ALL), has been most gratifying.Indeed, a major reason for improvements in over-all survival for childhood cancer in general is dueto improvement in survival rates for ALL, whichaccounts for roughly a third of paediatric cancer.1

With modern chemotherapy, 97–99% of childrencan be expected to attain complete remission, andit is not inconceivable to predict that modifica-tions of the currently most successful protocolswill boost long-term leukaemia-free survival ratesto as high as 85–90%. Treatment success hasbeen achieved through post-induction intensifi-cation/consolidation and re-induction treatments,effective treatments (‘prophylaxis’) for subclinicalcentral nervous system leukaemia, and prolongedanti-metabolite-based continuation treatments of24–36 months duration. Advances have beenachieved by many single institutions and coopera-tive groups treating childhood leukaemias, includ-ing investigators at St. Jude Children’s Research

Hospital in Memphis who pioneered a ‘Total Ther-apy’ curative approach beginning in the 1960s. Itis beyond the scope of this chapter to review thetreatment advances achieved through clinical trialsfor ALL by the BFM (Berlin–Frankfurt–Munster)Group, POG, CCG, the Dana Farber Consor-tium, the Medical Research Council/UKALL, theDutch Childhood Leukaemia Study Group, theFrench Acute Lymphoblastic Leukaemia Coopera-tive Group (FRALLE) and the Italian Associationof Paediatric Haematology–Oncology (AIEOP),but the interested reader may consult reviewssummarising the spectacular progress achieved intreatment of ALL.6

The lymphomas, Hodgkin’s disease (HD) andnon-Hodgkin’s lymphomas (NHL), are the thirdmost common form of paediatric malignancy,next in frequency behind leukaemias and tumoursof the central nervous system. Currently 80–90%of all children and adolescents with malignantlymphomas are curable with optimal multidis-ciplinary management, based on immunopatho-logic classification, staging for determinationof disease extent, and design and selection ofrisk-adapted therapies. Paediatric investigators atStanford, beginning in 1970, first pioneered com-bined modality treatment for children with HDand demonstrated that low-dose involved fieldradiotherapy combined with multiple cycles ofchemotherapy (MOPP or MOPP/ABVD) resultedin cure of 90% of paediatric patients.7 Similarlyoutstanding rates of disease control with com-bined modality management of paediatric HDhave since been reported by others, establishingthe curability of HD in nearly all cases, such thatthe thrust of current trials in paediatric HD istowards reduction of serious late effects of HDtreatments, such as secondary malignancies, par-ticularly leukaemia, infertility, pulmonary fibrosisand restrictive lung disease, serious cardiac prob-lems and premature death.

The non-Hodgkin’s lymphomas occurringamong children and adolescents are virtuallyall high-grade, diffuse malignancies, differingmarkedly from the distribution of histologic typestypically seen among older adults. Staging systemsin use for childhood and adult NHL also differ.8

Page 112: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 105

Ninety percent of localised NHLs, regardless ofhistology, are readily cured by nine weeks ofchemotherapy without radiation.9

Progress in the treatment of paediatric solidtumours has been equally striking in the last30 years as the progress in treating child-hood leukaemias and lymphomas, and may beattributable to development of accurate diag-nostic methods and systems of disease stagingand effective multimodal treatments combin-ing surgery, chemotherapy and radiation. Curerates for rhabdomyosarcoma have increased fromapproximately 25% in 1970 to greater than 75%currently, to 60–70% for non-metastatic bonesarcomas, to over 80% for Wilm’s tumour, over90% for retinoblastoma, over 90% for infantsand children with localised neuroblastoma, andto over half of all children with brain tumours.

Aims of current trials are to increase orpreserve high cure rates, decrease acute toxicityand long-term adverse sequelae of treatment,decrease costs and improve the quality of lifefor children with readily curable cancers. Patientswith high risk or metastatic disease at diagnosisor those who recur after front-line therapiescontinue to pose challenges and should properlybenefit from pilot trials and Phase I or II studiesof new treatments.

PROGNOSTIC FACTORS, TRANSLATIONALRESEARCH AND THERAPEUTICALLY

RELEVANT RISK GROUPS

Successful childhood cancer research is in largepart dependent upon its translational researchprogramme. Over the past three decades, initialdiagnosis and classification of childhood cancerhas become far more sophisticated, as laboratoryscientists have collaborated closely with clinicalinvestigators. In addition, special biological andpharmacological studies, conducted during andafter treatment, offer tools to clinical investigatorsthat were never previously available. As a result,paediatric oncologists, surgeons, pathologists andcollaborating statisticians have the opportunityand the obligation to design and stratify trials

specifically for biologically defined, risk-adaptedsubsets of patients. For example, the NationalWilms’ Tumor Study-5, a therapeutic trial andbiology study, was designed to reduce treatmentintensity for the subgroup(s) of patients with themost favourable prognosis and intensify treat-ment for the patients with the least favourableprognosis, based on stage, histology (favourableor unfavourable, anaplastic, rhabdoid and clearcell types), tumour size and bilaterality, and toinvestigate the impact of loss of heterozygos-ity (LOH) of chromosome 16q and 1p on two-year relapse-free survival through collection oftumour and normal kidney tissue for DNA anal-ysis and banking.

Perhaps the best example of important trans-lational research that has led directly to thera-peutic implications is in the collection of bonemarrow specimens for cytogenetic studies inchildhood acute lymphocytic leukaemia (ALL).10

While classical karyotype analysis is typicallyinformative in 60–70% of the patients, impor-tant genetic markers can now be identified byprobes, using FISH (fluorescence in situ hybridi-sation) or PCR (polymerase chain reaction) invirtually all patients. Translocations, such as thet(4;11),11 – 13 t(9;22)14 – 16 and t(1;19),17,18 conferan adverse prognosis and lead to targeting thepatients for more aggressive therapy. On the otherhand, patients with the cryptic t(12;21) geneticlesion encoding the TEL-AML1 transcript,19 – 21

with hyperdiploid leukaemia identified by flowcytometric measurement of DNA index (typically53+ chromosomes in their primary clone),22,23 orwith specific trisomies detected by FISH, such as4, 10 and 17,23,24 have a more favourable out-come and can be targeted for less intensive treat-ment. As an example of the latter, POG investi-gators designed a trial (#9201) with less intensechemotherapy for ALL patients with lesser risk ofrelapse, defined by initial white blood cell counts<50 000, age between 1 and 10 years, absence ofCNS disease, and presence of one or both of thefollowing: DNA index >1.16, and/or trisomies ofchromosomes 4 and 10 by FISH.

In addition to the well-recognised prognos-tic importance of initial white blood cell count,

Page 113: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

106 TEXTBOOK OF CLINICAL TRIALS

age at diagnosis, extramedullary disease andblast cell genetic features in B-precursor ALLand their significance for stratification and trialdesign, the early response to therapy, presenceof minimal residual disease (MRD), and pharma-cologic and pharmacokinetic variables are alsopredictive of outcome. Slow early response toinduction treatment is predictive of an adverseoutcome and can be defined in several ways:slow clearance of circulating blast cells to oneweek of prednisone or multiagent induction, orgreater than 25% marrow blasts on day 7 (orday 14) of treatment. Quantitation of MRD byimmunologic methods or PCR assay of rear-ranged T-cell-receptor or immunoglobulin heavy-chain genes of leukaemic blasts as clonal markersof leukaemia in patients in clinical remissionhas been shown to identify patients at elevatedrisk for relapse, a factor which (arguably) shouldbe taken into account in assigning alternativetreatment.25,26 Wide variability in absorption oforally administrated chemotherapy, such as 6-mercaptopurine, and inter-patient variability insystemic exposure to both methotrexate and 6-mercaptopurine are important determinants ofoutcome in ALL.27,28 Graham et al.29 uncovereda pharmacologic interaction between methotrex-ate and cytosine arabinoside when given simulta-neously, and demonstrated a correlation betweenhost drug levels and adverse outcome. Individ-ual variability in response to cancer treatment issurely related to genetic polymorphisms in drug-metabolising enzymes, transporters, receptors andother drug targets, and suggests that these geneticdifferences may form a solid scientific basis foroptimising therapies within the context of clinicaltrials.30

Given the plethora of prognostic factors nowknown for most paediatric malignancies, a prag-matic and rational approach to clinical trialsdesign and stratification consists of risk assign-ment by a combination of clinical and biologicalfactors identified through multivariate analysisto be of prognostic significance. Treatment isthen tailored to risk status, commonly consider-ing variables such as patient age, extent of dis-ease and tumour biology. For example, the risk

assignments for the Intergroup Rhabdomyosar-coma Study V (shown in Table 7.3) are based onfavourable prognostic factors identified in Stud-ies I–IV, conducted from 1972 through 1991,and include (1) undetectable distant metastases atdiagnosis; (2) primary sites in the orbit and non-parameningeal head/neck and genitourinary non-bladder/prostate regions; (3) grossly completesurgical removal of localised tumour at diagno-sis; (4) embryonal/botryoid histology; (5) tumoursize ≤5 cm; and (6) age younger than 10 yearsat diagnosis. Patients defined as shown intolow, intermediate and high risk are predicted tohave an estimated three-year EFS rate of 88%,55–76% and <30%, respectively.31

Similarly, significant advances in translationalresearch is neuroblastoma, which accounts for 8to 10% of all childhood cancers, have resulted ina refined risk-related approach to therapy basedon the age of the patient, the stage of the tumouraccording to the International NeuroblastomaStaging System (INSS), histopathologic features,the number of N-myc copy numbers and theploidy of tumour cells (Table 7.4).

Because childhood cancer is rare, national ref-erence laboratories have been established to anal-yse and store samples from the membership ofthe Children’s Oncology Group as well as otherlarge institutions and other international paedi-atric clinical trials organisations. Such laborato-ries help the research programme in terms of sci-entific expertise, quality control and correlativescience. Few institutions can afford to maintainsuch laboratories solely for their own paediatriccancer patients, and web-based informatics appli-cations afford access to the most sophisticatedon-line resources and information even in smallerremote centres.

STUDY DESIGN FOR CHILDHOODCANCER TRIALS

PHASE I STUDY DESIGN

Because childhood cancer is rare and the responseto conventional treatment good, most childrennever experience recurrent disease and are thus

Page 114: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 107

Tabl

e7.

3.R

isk

grou

pas

sign

men

tsfo

rin

terg

roup

Rha

bdom

yosa

rcom

aSt

udy

Gro

upst

udy

V

Ris

k(p

roto

col)

Stag

eG

roup

Site

Size

Age

His

tolo

gyM

etas

tasi

sN

odes

Trea

tmen

t

Low

,sub

grou

pA

1I

Favo

urab

lea

orb

<21

EMB

M0

N0

VA

(D96

02)

1II

Favo

urab

lea

orb

<21

EMB

M0

N0

VA

+X

RT

1III

Orb

iton

lya

orb

<21

EMB

M0

N0

VA

+X

RT

2I

Unf

avou

rabl

ea

<21

EMB

M0

N0

orN

XV

ALo

w,s

ubgr

oup

B1

IIFa

vour

able

aor

b<

21EM

BM

0N

1V

AC

+X

RT

(D96

02)

1III

Orb

iton

lya

orb

<21

EMB

M0

N1

VA

C+

XR

T1

IIIFa

vour

able

(exc

ludi

ngor

bit)

aor

b<

21EM

BM

0N

0or

N1

orN

XV

AC

+X

RT

2II

Unf

avou

rabl

ea

<21

EMB

M0

N0

orN

XV

AC

+X

RT

3Io

rII

Unf

avou

rabl

ea

<21

EMB

M0

N1

VA

C(+

XR

T,G

pII)

3Io

rII

Unf

avou

rabl

eb

<21

EMB

M0

N0

orN

1or

NX

VA

C(+

XR

T,G

pII)

Inte

rmed

iate

2III

Unf

avou

rabl

ea

<21

EMB

M0

N0

orN

XV

AC

±To

po+

XR

T(D

9803

)3

IIIU

nfav

oura

ble

a<

21EM

BM

0N

1V

AC

±To

po+

XR

T3

IIIU

nfav

oura

ble

b<

2 1E M

BM

0N

0o r

N1

orN

XV

AC

±To

po+

XR

T1

or2

or3

Ior

IIor

IIIFa

vour

able

orun

favo

urab

lea

o rb

<21

ALV

/UD

SM

0N

0or

N1

orN

XV

AC

±To

po+

XR

T

4Io

rII

orIII

orIV

Favo

urab

leor

unfa

vour

able

ao r

b<

1 0E M

BM

1N

0o r

N1

o rN

XV

AC

±To

po+

XR

T

Hig

h(D

9802

)4

IVFa

vour

able

orun

favo

urab

lea

o rb

≥10

EMB

M1

N0

orN

1or

NX

CPT

-11,

VA

C+

XR

T

4IV

Favo

urab

leor

unfa

vour

able

ao r

b<

21A

LV/U

DS

M1

N0

orN

1or

NX

CPT

-11,

VA

C+

XR

T

Favo

urab

le=

orbi

t/eye

lid,h

ead

and

neck

(exc

ludi

ngpa

ram

enin

geal

),ge

nito

urin

ary

(not

blad

der

orpr

osta

te)a

ndbi

liary

trac

t.U

nfav

oura

ble

=bl

adde

r,pr

osta

te,e

xtre

mity

,par

amen

inge

al,t

runk

,ret

rope

rito

neal

,pel

vis,

othe

r.a

=tu

mou

rsi

ze≤5

cmin

diam

eter

;b=

tum

our

size

>5

c min

d ia m

e te r

.EM

B=

embr

yona

l,bo

tryo

id,o

rsp

indl

e-ce

llrh

abdo

myo

sarc

omas

orec

tom

esen

chym

omas

with

embr

yona

lRM

S.A

LV=

alve

olar

rhab

dom

yosa

rcom

asor

ecto

mes

ench

ymom

asw

ithal

veol

arR

MS,

UD

S=

undi

ffere

ntia

ted

sarc

omas

.N

0=

regi

onal

node

scl

inic

ally

noti

nvol

ved;

N1

=re

gion

alno

des

clin

ical

lyin

volv

ed;N

X=

node

stat

usun

know

n.V

AC

=vi

ncri

stin

e,ac

tinom

yci n

D,c

yclo

phos

pham

ide;

XR

T=

radi

othe

rapy

;Top

o=

topo

teca

n;G

p=

Gro

up;C

PT-1

1=

irin

otec

an.

Sour

ce:

Rep

rodu

ced

from

Ran

eyet

al.31

(p.2

18),

with

perm

issi

on.

Page 115: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

108 TEXTBOOK OF CLINICAL TRIALS

Table 7.4. International Neuroblastoma Staging System (INSS)

Stage 1: Localized tumour continued to the area of origin; complete gross resection, with or withoutmicroscopic residual disease; identifiable ipsilateral and contralateral lymph node negative fortumour.

Stage 2A: Unilateral with incomplete gross resection; identifiable ipsilateral and contralateral lymph nodenegative for tumour.

Stage 2B: Unilateral with complete or incomplete gross resection; with ipsilateral lymph node positive fortumour; identifiable contralateral lymph node negative for tumour.

Stage 3: Tumour infiltrating across midline with or without regional lymph node involvement; or unilateraltumour with contralateral lymph node involvement; or midline tumour with bilateral lymph nodeinvolvement.

Stage 4: Dissemination of tumour to distant lymph nodes, bone marrow, liver or other organs except asdefined in stage 4S.

Stage 4S: Localized primary tumour as defined in stage 1 or 2, with dissemination limited to liver, skin orbone marrow

Risk group and protocol assignment schema: POG and CCG

INSSstage Age (y)

N-mycstatus

Shimadahistology

DNAploidy

Riskgroup/study

1 0–21 Any Any Any Low2A and 2B <1 Any Any Any Low

≥1–21 Nonamplifieda Any NA Low≥1–21 Amplifiedb Favourable NA Low≥1–21 Amplified Unfavourable NA High

3 <1 Nonamplified Any Any Intermediate<1 Amplified Any Any High≥1–21 Nonamplified Favourable NA Intermediate≥1–21 Nonamplified Unfavourable NA High≥1–21 Amplified Any NA High

4 <1 Nonamplified Any Any Intermediate<1 Amplified Any Any High≥1–21 Any Any NA High

4S <1 Nonamplified Favourable >1 Low<1 Nonamplified Any 1 Intermediate<1 Nonamplified Unfavourable Any Intermediate<1 Amplified Any Any High

aN-myc copy number ≤10.bN-myc copy number >10.

POG, Pediatric Oncology Group; CCG, Children’s Cancer Group; INSS, International Neuroblastoma Staging System; NA, not applicable.

Source: Reproduced from Castleberry,60 (pp. 926, 930), with permission from Elsevier.

not eligible for trials of new agents. Phase Itrials are designed to estimate the maximal tol-erated dose of a drug, to determine the natureand frequency of toxicities, and to define thedrug pharmacokinetics. While eligibility varies,patients have typically failed front-line therapyand usually they will also have failed second-linetherapy. Because of the small number of pae-diatric patients eligible for Phase I trials, most

are accomplished as multi-institutional collab-orations. Paediatric drug development requiresseparate Phase I studies (i.e., separate and dis-tinct from studies done in adults) because pae-diatric patients may tolerate either higher orlower levels of drugs and may exhibit toxici-ties unique to children. Separate trials warrantingemphasis may also reflect unique agents activein paediatric tumours, differing from agents that

Page 116: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 109

are of the highest priority for cancers commonamong adults.

The basic design is to begin at about 80%of the adult maximal tolerated dose. Patients areentered in cohorts and treated at increasing doses.At each level, three patients are typically accrued.If there is no dose-limiting toxicity amongst thethree patients, the dose is raised to the nextlevel (usually a 20–30% escalation), in succes-sive cohorts of patients with no intrapatient doseescalation. If two or all three of these initiallyaccrued patients experience dose-limiting toxic-ity (DLT), the maximum tolerated dose (MTD)will have been deemed exceeded. Finally, if onepatient amongst the initial three patients experi-ences dose-limiting toxicity, an additional threepatients are accrued. If six patients are needed,a dose escalation will occur if a total of one insix (i.e. zero of the next three) has dose-limitingtoxicity. If two or more (i.e. one or more ofthe next three) experience dose-limiting toxic-ity, the maximal tolerated dose will be deemedto have been exceeded. The MTD is definedas the dose level immediately below the levelat which two patients in three to six experi-ence DLT. The definition of dose-limiting toxicitycan vary from study to study, but it generallyfalls into two categories: (a) Grade 3, 4 or 5non-haematologic toxicity other than (1) Grade3 nausea/vomiting; (2) Grade 3 transaminaseelevation; and (3) Grade 3 fever/infection and(b) Grade 4 myelosupression, that lasts morethan 7 days, which requires transfusions twice in7 days, or causes a delay in therapy exceeding14 days. While the study is temporarily closedafter accrual of each set of three patients in orderto assess patient-specific responses and toxicities,a patient reservation system is used to obtainplaces when and if the study reopens. Phase Itrials often require the evaluation of many doselevels. At a given dose level, the probabilities ofdeclaring that the MTD has been exceeded are9.3%, (50%) and [83%], when the true probabil-ities of dose-limiting toxicities are respectively0.1, (0.3) and [0.5].

Consensus guidelines established by Americanand European investigators for the conduct of

paediatric Phase I trials have been established.32

A problem recently identified is the determinationof MTDs in paediatric trials that are lowerthan those defined in adult patients, which mayrelate to differences in the intensity of priortherapy between adult and paediatric patientsentered onto Phase I trials. There is a well-established association between prior therapyand reduced tolerance to myelotoxic drugs.If current paediatric Phase I trials in heavilypretreated patients define MTDs that tend to belower than those determined in adult patientswith minimal prior therapy, then application ofthe paediatric MTD to less heavily pretreatedpaediatric patients, e.g., in Phase II trials, maybe problematic.

PHASE II STUDY DESIGN

The specific purpose of a Phase II trial is todetermine activity, i.e., to develop estimates ofthe response rate of patients with specific tumourtypes to a particular drug or novel combination.Eligible patients typically will have relapsed ona front-line therapy, and the prospect of a cureis unlikely. Typically, the dependent variableis an objective all or none response variablesuch as achievement of a complete or partial(>50%) response. Interim results are maskedfrom the participants until the study closes toaccrual and response information for all patientshas been established. There are three types ofPhase II trial designs that depend upon thestudy objectives.

Testing Activity

The most common is ‘proving activity’. Forthese studies, a fixed objective response rate isspecified for activity (null hypothesis), and thegoal is to reject the hypothesis in favour ofthe alternate hypothesis that the response rate isgreater than this fixed figure. Generally, since thenumber of Phase II agents that can be testedis large in comparison to patient availability,sequential designs are preferred. However, asSimon33 pointed out, it is rarely advantageous to

Page 117: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

110 TEXTBOOK OF CLINICAL TRIALS

go beyond two stages. Two excellent referenceswith regard to Phase II design are Simon33 andShuster34 The designs of Simon33 stop at the firststage only if lack of activity is demonstrated.His argument is that patients should benefit fromactive drugs. However, in paediatrics, due tothe relative scarcity of patients with recurrentdisease, designs that stop early for either lack ofactivity or proven activity are preferred.

Historical Comparison

Another strategy for defining efficacy would beto prove a response rate is superior to thatseen in an historical control study. The responserate of the new study is statistically comparedto that of the control therapy. Makuch andSimon35 have provided methods to determinethe sample size requirements for these studies.Chang et al.36 have extended this to two-stagedesigns (i.e., a sequential approach that couldsave patient resources).

Randomised Phase II Comparison

Due to a limited availability of patients, it isexceedingly rare that a randomised comparisonof a new agent to a control is feasible in apaediatric Phase II study. However, such studieshave been done. See McWilliams et al.37 foran example from childhood neuroblastoma. Asabove, two-stage or group sequential designs arethe preferred method. The programme EAST38

can be used for designs that allow for bothearly acceptance and early rejection of the nullhypothesis that the new treatment is equivalentto the control treatment.

In paediatric oncology, with limited patientnumbers, only one or two cooperative Phase IItrials are conducted with each new agent, andall malignancies refractory to standard therapyare typically combined into a single paediatricPhase II trial, usually stratified by histology. Notsurprisingly, Phase II trials of novel multiagentregimens provide greater evidence of activitythan single agent Phase II trials and offerconsiderable possibility of therapeutic benefit.39

PHASE III DESIGN

These studies typically ask a randomised questionabout either survival or event-free survival (thetime from study entry to the earliest of inductionfailure, relapse, second cancer, or death of anycause). Intent-to-treat40 is the analysis of choicefor efficacy, with other analysis done as sec-ondary supportive inference. For treatment ques-tions where the randomised divergence is con-siderably after study entry or where a significantnumber of failures are expected to occur beforedivergence, a delayed randomisation is typicallydone as close to the divergence point as possible.For these randomisations, the dependent variablewould be event-free survival from the randomi-sation date.

Phase III studies are typically designed assum-ing either proportional hazards or the cure modelof Sposto and Sather.41 In either case, the designsare group sequential in nature with plannedinterim analyses. In the case of proportional haz-ards, the O’Brien–Fleming method42 is used. Thereader is referred to Shuster43 for specific details.Nearly all Phase III childhood cancer trials arerun either as two-armed studies or as 2 × 2 fac-torial studies. It is rare that sufficient numbers ofpaediatric cancer patients are available to conductthree-armed studies, except perhaps in ALL, themost commonly occurring malignancy. The typeof questions utilised in 2 × 2 factorial studiesmust be such that the expectation is for no ‘quali-tative interaction’ between the two interventions.A qualitative interaction between treatments Aand B would occur if a standard regimen plus Ais superior to the standard regimen alone, but thestandard plus A plus B is inferior to the standardplus B. For example, if a study is to randomiseleukaemia patients to receive or not receive regi-men A, designed to have an impact on the CNS,while at the same time to receive or not receiveregimen B, designed to have an impact on mar-row remission, a factorial design would seemappropriate. Essentially, we can run two studiesfor the price of one. If the two interventions havemuch in common, this would be a contraindi-cation for a factorial design. In contrast, if wewished to ask if the same drug had an impact in

Page 118: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 111

induction therapy (first intervention) and in main-tenance therapy (second intervention), there is, atleast intuitively, the plausibility that the advan-tage of both interventions over just one may bezero or even harmful.

Phase III studies done in cooperative groupsare required by the NCI to have a Data Safetyand Monitoring Board which reviews the study ata minimum of every six months for toxicity andat planned intervals for efficacy, until it releasesthe study to the study committee. The releasecan occur no sooner than the earlier of (1) allsubjects have completed the planned interventionor (2) the study was closed early and a newintervention is needed for patients on one orboth arms. Any release prior to the planned dateof final analysis requires approval of the board.Double-blind Phase III studies are rarely feasibledue to the toxic nature of cancer treatment.However, they are encouraged for studies ofsupportive care, as long as the intervention isgiven in a pill form, and has no major known sideeffects requiring special medical monitoring.

Negative questions are often posed for paedi-atric cancer. For such studies, a very high curerate of at least 85% has been shown possible on aconventional regimen. The question posed is canwe do ‘almost as well’ with reduced therapy? Toanswer such questions with confidence requireslarge numbers, and it is rare that even the entirepatient resources of COG are sufficient to addressthis in a randomised manner. For example, if adisease has a historical 4-year remission rate of90%, and an accrual rate of 200 patients per year,a randomised study would take 6 years of accrual(10-year duration) to have 95% power to detecta degradation to 85% under reduced therapy atp = 0.20, one-sided. (Note that the typical valuesof type I error and power are reversed.) A single-arm study would require 315 patients to ask thesame question of a fixed standard of 90% vs. areduction to 85% (nearly a 75% reduction in sam-ple size). While the benefits of reduced therapymay be obvious, such studies carry considerablerisk and must be carefully monitored for earlyevidence that the reduction in therapy is unsafeand is associated with an inferior outcome.

ANCILLARY STUDIES

In paediatric cancer, there is considerable activityin translational research (see above). This cantake the form of biologic studies, late effects, orin controlling acute side effects. These studiesare designed on a case-by-case basis. Examplesinclude the conduct of case–control ‘tissuebank’ studies to establish a promising prognosticmarker. Cases are defined as patients failinga protocol (typically a relapse) and controlsare long-term successes. These studies can bedone using sequential designs, typically two-stage designs. Other typical studies might lookat cognitive impairment (multivariate analysisof variance of neuropsychological variables),acute toxicity of a specified type (typical Chi-square test), the prognostic significance of serialpharmacologically measured drug levels (time-dependent covariate in survival analysis), orexploratory analysis (e.g. microarrays).

ETHICAL AND OTHER SPECIALCONSIDERATIONS AFFECTING CONDUCTOF TRIALS IN CHILDREN WITH CANCER

Children and adolescents constitute a specialvulnerable population of research subjects, oftengrouped with other special classes, like thementally retarded, mentally ill and prisoners.There are special federal protections which applyto all research involving children as subjectswhich are covered by Subpart D of Part 46 ofTitle 45 of the Code of Federal Regulations(45 CFR 46), requiring that institutional reviewboards (IRBS) give consideration to the degreeof risk, the benefit to child subjects, the natureof the knowledge to be gained, permission ofthe parent or guardian, and the concurrence ofthe child subjects, known as assent. A child’scapacity to give assent is conditioned by his orher developmental level.44,45

Subsequent to the promulgation of the originalrules, adopted in 1983 and modified in 1991,there has been nearly continuous debate andcontroversy surrounding safeguards for all humansubjects of research and for children especially.

Page 119: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

112 TEXTBOOK OF CLINICAL TRIALS

The tragic death of an 18-year-old researchsubject in 1999 in a gene-transfer trial ata major research university in which humansubjects were not protected, adverse events hadnot been reported and financial conflicts ofinterest were involved, served to trigger severalnew federal initiatives to further strengthenprotections of human research subjects in clinicaltrials,46 including the imposition of sanctions oninvestigators who fail to adhere to regulations. Asthis chapter goes to press, the federal Office forProtection from Research Risks (OPRR) has beenreorganised, expanded and renamed the Officefor Human Research Protections (OHRP) andtransferred to the Office of the Secretary, Healthand Human Services (HHS) and the NationalBiothetics Advisory Commission, at the requestof the President, has undertaken a sweepingexamination of the ethical and policy issues inthe oversight of human research in the UnitedStates (see www.bioethics.gov). As a result, theethical and regulatory framework within whichpaediatric cancer clinical trials are conducted,now and in the future, will continue to evolve,and investigators must remain abreast.

Specific ethical issues impacting statisticiansinvolved in collaborative research include ensur-ing confidentiality, data and safety monitoring,and problems and pitfalls in interpretation ofinterim analyses and planning studies to answernegative questions.47 A negative question, e.g.,what is the minimum therapy needed to producecure?, has particular relevance for paediatric can-cer trials which are (often) aimed at reduction ofthe acute or delayed effects of cancer treatmenton the growing child.

Notwithstanding the strict ethical guidelinesand regulations surrounding research in children,there is substantial and even increasing pressureto enroll children in clinical trials as a resultof other federal policies and recent legislation,including the Food and Drug Administration’s(FDA’s) 1998 paediatric rule, the paediatric pro-visions of the FDA Modernization Act (FDAMA)of 1997, and the sweeping Children’s Health Actof 2000 (PL 106–310), the sum of which is

certain to increase paediatric clinical trials, partic-ularly drug trials. Federal NIH policies promul-gated in 1998 were aimed at increasing the par-ticipation of children in research so that adequatedata would be developed to support the treatmentfor disorders affecting adults which also affectchildren, and rules mandated that children (i.e.,individuals under age 21) must be included inall human subjects research unless there are sci-entific and ethical reasons not to include them.The FDA rules and regulations48 require phar-maceutical manufacturers to assess the safetyand effectiveness of new drugs and biologics inpaediatric patients and established powerful eco-nomic incentives for manufacturers (six-months’extension of market exclusivity) on any drugfor which FDA requested paediatric studies (seewww.fda.gov/cder/cancer for further informationon regulatory aspects of paediatric oncology drugdevelopment).

In addition to ethical and regulatory issueswhich impact the conduct of paediatric trials,there are also practical problems associated withclinical cancer research in children. Due to anunderstandably greater concern for long-termadverse consequences of treatment in a popula-tion of patients, the majority of whom are likelyto be cured and alive for decades at risk for lateeffects, it is absolutely essential that long-termfollow-up and serial surveillance of survivors isbuilt into the studies. While follow-up is essen-tial, it is also exceedingly difficult and expensiveto maintain, as children and adolescents grow up,go away to school, leave home, marry, changename, etc. The frequency and severity of lateeffects also tend to progress with time off treat-ment, making follow-up beyond 15 or 20 or30 years critical and identification of risk factorsfor the development of these late consequencesof treatment essential. For example, Lipshultzet al.49 studied 120 survivors of childhood ALLor osteogenic sarcoma who had been treated withdoxorubicin a mean of 8.1 years earlier (range2–14 years) and compared their cardiac func-tion to a control population, and evaluated theimpact of gender, age at diagnosis, length oftime since completion of therapy, and dosage and

Page 120: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 113

cumulative dose of doxorubicin on cardiac sta-tus. Calculating sex-specific standardised scoresor z scores (expressed as the number of standarddeviations above or below the value for the nor-mal controls) for cardiac contractility, wall thick-ness and afterload, the results of univariate andmultivariate analysis showed that female sex andhigher cumulative dose of doxorubicin were asso-ciated with depressed contractility, that there wasan association between younger age at diagnosisand reduced left ventricular wall thickness andincreased afterload, and that the prevalence andseverity of abnormalities increased with longerfollow-up.49 Such studies typify the challenge ofmethodologic and statistical issues in the studyof late effects of childhood cancer, the greatestchallenge being data collection.

A LOOK INTO THE FUTUREOF CHILDHOOD CANCER RESEARCH

Despite the progress of the last half century thereremain a number of challenges in childhood can-cer. The focus of research in certain patient sub-sets with very high cure rates will be on qualityof life endpoints. For example, retinoblastoma iscurable in nearly 100% of cases, so preservationof sight and reduction of second malignancies(not survival) are now considered to be the pri-mary goals and endpoints, and trials avoidingenucleation and eliminating external beam ther-apy are now the norm.

One would hope that future therapies for child-hood cancer will be developed which would bemore rational, less empirical and less toxic, rely-ing more on strategies for growth control (e.g.,anti-angiogenesis) and regulation of gene expres-sion and cell proliferation, and/or induction ofapoptotic pathways or blocking of anti-apoptoticsignals, than on cytotoxic or ablative treatments.Assuming that deregulated and/or mutated cellu-lar proto-oncogenes or loss of tumour suppressorgenes are the proximate cause(s) of most formsof childhood cancer, then the genes and/or theirprotein products will very likely be the targetsfor the next generation of paediatric anti-cancer

agents, many of which will likely be orphan drugsfor orphan diseases.

With advances in translational research, thepie (universe of childhood cancer patients) willbe divided into smaller but more homogeneousslices than ever before. International collabora-tion will probably be required in a substantialsegment of cancer types in order to obtain suf-ficient patient numbers to conduct randomisedtrials. Enlightened partnerships between industryand academia, with the assistance of the FDA andNCI, will be needed for efficient development ofnew agents.

Finally, the skill sets necessary to conductpaediatric cancer research are expanding. Tradi-tionally the field involved paediatric haematolo-gist/oncologists, surgeons, radiation oncologists,pathologists, nurses, clinical research associates,pharmacologists, epidemiologists and biostatisti-cians. Today, diagnostic imagers, bench scien-tists, geneticists, pharmacists, clinical psychol-ogists, health economists and others also playsignificant roles in the research. In the future,other fields of expertise will surely need to beadded to the team. The cooperation of a multi-disciplinary team and prompt referral of patientsto paediatric cancer centres participating in clini-cal trials will be critical to achieving future goalsof refining and improving therapy.

REFERENCES

1. National Cancer Institute, SEER Program. CancerIncidence and Survival among Children and Ado-lescents: United States SEER Program 1975–1995.NIH Publ. No. 99–4649. Bethesda, MD: NationalCancer Institute (1999).

2. Ross JA, Severson RK, Pollock BH, Robison LL.Childhood cancer in the United States. A geo-graphic analysis of cases from the Pediatric Coop-erative Clinical Trials Groups. Cancer (1996) 77:201–7.

3. Woods WG, Tuchman M, Robison LL et al. Apopulation-based study of the usefulness ofscreening for neuroblastoma. Lancet (1996) 348:1682–7.

4. Pediatric Oncology Group. Progress against child-hood cancer: the Pediatric Oncology Group expe-rience. Pediatrics (1992) 89(4): 597–600.

Page 121: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

114 TEXTBOOK OF CLINICAL TRIALS

5. Murphy SB. The national impact of clinical coop-erative group trials for pediatric cancer. MedPediat Oncol (1995) 24: 279–80.

6. Pui CH. Childhood Leukemias. Cambridge: Cam-bridge University Press (1999).

7. Donaldson SS, Link MP. Combined modalitytreatment with low-dose radiation and MOPPchemotherapy for children with Hodgkin’sdisease. J Clin Oncol (1987) 5(5): 742–9.

8. Murphy SB. Classification, staging, and endresults of treatment of childhood non-Hodgkin’slymphomas: dissimilarities from adults. Sem Oncol(1980) 7: 332–9.

9. Link M, Shuster J, Donaldson S, Berard CW,Murphy SB. Treatment of children and youngadults with early-stage non-Hodgkin’s lymphoma.New Engl J Med (1997) 337(18): 1259–66.

10. Shuster JJ, Carroll AJ, Look TA et al. Manage-ment of cytogenetic data in multi-center leukemiatrials. Comput Methods Programs Biomed. (1993)40: 269–77.

11. Pui CH, Frankel LS, Carroll AJ et al. Clinicalcharacteristics and treatment outcome of child-hood acute lymphoblastic leukemia with thet(4;11)(q21;q23): a collaborative study of 40 cases.Blood (1991) 77: 440–7.

12. Pui CH, Carroll AJ, Raimondi SC et al. Child-hood acute lymphoblastic leukemia with thet(4;11)(q21;q23): an update. Blood (1994) 83:2384–5.

13. Heerema NA, Sather HN, Ge J et al. Cytogeneticstudies of infant acute lymphoblastic leukemia:poor prognosis of infants with t(4;11) – a reportof the Children’s Cancer Group. Leukemia (1999)13: 679–86.

14. Ribeiro RC, Broniscer A, Rivera GK et al. Phila-delphia chromosome-positive acute lymphoblas-tic leukemia in children: durable responses tochemotherapy associated with low initial whiteblood cell counts. Leukemia (1997) 11: 1493–6.

15. Uckun FM, Nachman JB, Sather HN et al. Poortreatment outcome of Philadelphia chromosome-positive pediatric acute lymphoblastic leukemiadespite intensive chemotherapy. Leukemia Lym-phoma (1999) 33: 101–6.

16. Arico M, Valsecchi MG, Camitta B et al. Out-come of treatment in children with Philadel-phia chromosome-positive acute lymphoblasticleukemia. N Engl J Med (2000) 342: 998–1006.

17. Crist WM, Carroll AJ, Shuster JJ et al. Poor prog-nosis of children with pre-B acute lymphoblasticleukemia is associated with the t(1;19)(q23;p13):a Pediatric Oncology Group study. Blood (1990)76: 117–22.

18. Uckun FM, Sensel MG, Sather HN et al. Clinicalsignificance of translocation t(1;19) in childhoodacute lymphoblastic leukemia in the context

of contemporary therapies: a report from theChildren’s Cancer Group. J Clin Oncol (1998) 16:527–35.

19. Shurtleff SA, Buijs A, Behm FG et al. TEL/AML1fusion resulting from a cryptic t(12;21) is themost common genetic lesion in pediatric ALL anddefines a subgroup of patients with an excellentprognosis. Leukemia (1995) 9: 1985–9.

20. Rubnitz JE, Shuster JJ, Land VJ et al. Case–control study suggests a favorable impact ofTEL rearrangement in patients with B-lineageacute lymphoblastic leukemia treated with anti-metabolite-based therapy: a Pediatric OncologyGroup study. Blood (1997) 89: 1143–6.

21. Raimondi SC, Shurtleff SA, Downing JR et al.12p abnormalities and the TEL gene (ETV6) inchildhood acute lymphoblastic leukemia. Blood(1997) 90: 4559–66.

22. Trueworthy R, Shuster J, Look T et al. Ploidyof lymphoblasts is the strongest predictor oftreatment outcome in B-progenitor cell acutelymphoblastic leukemia of childhood: a PediatricOncology Group study. J Clin Oncol (1992) 10:606–13.

23. Heerema NA, Sather HN, Sensel MG et al. Prog-nostic impact of trisomies of chromosomes 10, 17,and 5 among children with acute lymphoblasticleukemia and high hyperdiploidy (>50 chromo-somes). J Clin Oncol (2000) 18: 1876–87.

24. Harris MB, Shuster JJ, Carroll A et al. Trisomyof leukemic cell chromosomes 4 and 10 identifieschildren with B-progenitor cell acute lymphoblas-tic leukemia with a very low risk of treatmentfailure: a Pediatric Oncology Group study. Blood(1992) 79: 3316–24.

25. Cave H, van der Werff ten Bosch J, Suciu Set al. Clinical significance of minimal residual dis-ease in childhood acute lymphoblastic leukemia.European Organization for Research and Treat-ment of Cancer – Childhood Leukemia Cooper-ative Group. N Engl J Med (1998) 339: 591–8.

26. Campana D, Pui CH. Detection of minimal resid-ual disease in acute leukemia: methodologicadvances and clinical significance. Blood (1995)85: 1416–34.

27. Evans WE, Crom WR, Abromowitch M et al.Clinical pharmacodynamics of high-dose metho-trexate in acute lymphocytic leukemia. Identifica-tion of a relation between concentration and effect.New Engl J Med (1986) 314: 471–7.

28. Koren G, Farrazini G, Sulh H et al. Systemicexposure to mercaptopurine as a prognostic factorin acute lymphocytic leukemia in children. NewEngl J Med (1990) 323: 17–21.

29. Graham ML, Shuster JJ, Kamen BA et al.Changes in red blood cell methotrexate phar-macology and their impact on outcome when

Page 122: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CHILDHOOD CANCER 115

cytarabine is infused with methotrexate in thetreatment of acute lymphocytic leukemia in chil-dren: a Pediatric Oncology Group study. Clin Can-cer Res (1996) 2: 331–7.

30. Evans WE, Relling MV. Pharmacogenomics:translating functional genomics into rational ther-apeutics. Science (1999) 286: 487–91.

31. Raney RB, Anderson JR, Barr FG et al. Rhab-domyosarcoma and undifferentiated sarcoma inthe first two decades of life: a selective reviewof intergroup Rhabdomyosarcoma Study Groupexperience and rationale for intergroup rhab-domyosarcoma study V. J Ped Hematol/Oncol(2001) 23: 215–20.

32. Smith MD, Ho P, Ungerleider R et al. The con-duct of Phase I trials in children with cancer. JClin Oncol (1998) 16: 966–78.

33. Simon R. Optimal two-stage designs for phase IIclinical trials. Contr Clin Trials (1989) 10: 1–10.

34. Shuster JJ. Optimal two-stage designs for singlearm Phase II cancer trials. J. Pharm. Statist (2002)12: 39–51.

35. Makuch R, Simon R. Sample size considerationsfor non-randomized comparitive studies. J ChronDis (1980) 33: 175–81.

36. Chang M, Shuster J, Kepner J. Group sequentialdesigns for phase II trials. Contr Clin Trials (1999)20: 353–64.

37. McWilliams N, Hayesm F, Green A et al. Cyclo-phosphamide/doxorubicin vs. cisplatin/teniposidein the treatment of children older than 12 monthsof age with disseminated neuroblastoma: a Pedi-atric Oncology Group randomized Phase II study.Med Pediat Oncol (1995) 24: 176–80.

38. Mehta C. EaST (Early Stopping in Clinical Trials).Cambridge, MA: Cytel Software Corp. (1992).

39. Weitman S, Ochoa S, Sullivan J et al. PediatricPhase II cancer chemotherapy trials: a PediatricOncology Group study. J Ped Hematol/Oncol(1997) 19: 187–91.

40. Armitage P. Controversies and achievements inclinical trials. Contr Clin Trials (1984) 5: 67–72.

41. Sposto R, Sather H. Determining the duration ofcomparative clinical trials while allowing for cure.J Chronic Dis (1985) 38: 683–90.

42. O’Brien PC, Fleming TR. A multiple testing pro-cedure for clinical trials. Biometrics (1979) 35:549–56.

43. Shuster J. Power and sample size for phaseIII clinical trials of survival. Chapter 7. In:Crowley J, ed., Handbook of Statistics in ClinicalOncology. New York: Marcel Dekker (2001).

44. Jonsen AR. Research involving children: recom-mendations of the National Commission for theProtection of Human Subjects of Biomedical andBehavioral Research. Pediatrics (1978) 62(2):131–6.

45. Leikin SL. Minors’ assent or dissent to medicaltreatment. J Pediat (1983) 102(2): 169–76.

46. Shalala D. Protecting research subjects – whatmust be done. N Engl J Med (2000) 343(11):808–10.

47. Shuster J et al. Ethical issues in cooperativecancer therapy trials from a statistical viewpoint,II. Specific issues. Am J Ped Hematol/Oncol(1985) 7(1): 64–70.

48. Hirschfeld S et al. Pediatric oncology: regulatoryinitiatives. The Oncologist (2002) 5: 441–4.

49. Lipshultz SE, Lipsitz SR, Mone SM, Goorin AMet al. Female sex and higher drug dose as riskfactors for late cardiotoxic effects of doxorubicintherapy for childhood cancer. New Engl J Med(1995) 332(26): 1738–43.

50. Link MP, Goorin AM, Miser AW, Green AA,Pratt CB, Belasco JB, Pritchard J et al. The effectof adjuvant chemotherapy on relapse-free survivalin patients with osteosarcoma of the extremity.New Engl J Med (1986) 314: 1600–6.

51. Smith M. The impact of doxorubicin dose inten-sity on survival of patients with Ewing’s sarcoma.J Clin Oncol (1991) 9: 889–91.

52. Grier HE, Krailo MD, Tarbell NJ et al. The addi-tion of ifosfamide and etoposide to standardchemotherapy in Ewing’s sarcoma/primitive neur-oectodermal tumor of bone: a Children’s CancerGroup/Pediatric Oncology Group study. New EnglJ Med (2003) 348: 694–701.

53. D’Angio G, Breslow N, Beckwith B, Evans A,Baum E, DeLorimier A et al. Treatment of Wilms’tumor: Results of the Third National Wilms’Tumor Study. Cancer (1989) 64: 349–60.

54. D’Angio G, Evans A, Breslow N, Beckwith B,Bishop H, Feigl P et al. The treatment of Wilms’tumor: Results of the National Wilms’ TumorStudy. Cancer (1976) 38: 633–46.

55. Maurer H, Beltangady M, Gehan E et al. Theintergoup rhabdomyosarcoma study – 1. Cancer(1988) 61: 1215.

56. Link M, Donaldson S, Berard C, Shuster J, Mur-phy SB. Results of treatment of childhood local-ized non-Hodgkin’s lymphoma with combinationchemotherapy with or without radiotherapy. NewEngl J Med (1990) 322: 1169–74.

57. Matthay KK, Villablanca JG, Seeger RC, StramDO, Harris RE, Ramsay NK, Swift P et al. Treat-ment of high-risk neuroblastoma with inten-sive chemotherapy, radiotherapy, autologous bonemarrow transplantation, and 13-cis retinoic acid.New Engl J Med (1999) 341: 1165–73.

58. Smith M, Arthur D, Camitta B, Carroll AJ,Crist W, Gaynon P, Gelber R, Heerema N,Link M, Murphy SB et al. Uniform approach to

Page 123: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

116 TEXTBOOK OF CLINICAL TRIALS

risk classification and treatment assignment forchildren with acute lymphoblastic leukemia. J ClinOncol (1996) 14(1): 18–24.

59. Baker KS, Anderson JR, Link MP, Grier HE et al.Benefit of intensified therapy for patients with

local or regional embryonal rhabdomyosarcoma:results from the Intergroup Rhabdomyosarcomastudy IV. J Clin Oncol (2000) 18: 2427–34.

60. Castleberry R. Biology and treatment of neurob-lastoma. Ped Clin North Am (1997) 44: 919–37.

Page 124: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

8

Gastrointestinal CancersDAN SARGENT, RICH GOLDBERG AND PAUL LIMBURG

Mayo Clinic, Rochester, MN 55905, USA

INTRODUCTION

Cancers of the gastrointestinal tract account forapproximately 20% of all new cancer cases inthe United States, and the same proportion ofcancer-related deaths. In this discussion we willuse a broad definition of GI cancer, includingany cancer of a digestive organ. In this def-inition we include cancers of the oesophagus,gastro-oesophageal junction, stomach, pancreas,gallbladder, bile duct, liver, small and large intes-tine, rectum and anus. Incident cancers of theoesophagus, stomach, pancreas, liver, large intes-tine and rectum all exceed 10 000 a year in theUnited States. In addition to the high prevalenceand the large number of cancer sites within theGI tract, the prognosis of patients with GI cancersvaries greatly. For example, patients with can-cers of the large intestine, when discovered earlyin the course of disease, have 5-year survivalrates exceeding 90%. In contrast, the prognosisfor patients with pancreatic cancer is very poor,with a 5-year survival rate of less than 5% acrossall stages.

Incidence rates for GI cancers show a similardiversity. In the past 50 years, the incidence ratesfor liver and gastric cancers in the US have

fallen substantially. For example, in 1930, gastriccancer was the most common cancer diagnosis.By 1994, gastric cancer had fallen to 12th inincidence among cancers. In contrast, the ratesof colon and rectal cancer have remained verystable. Incidence rates for GI cancers also varygreatly worldwide: gastric cancer is tenfold moreprevalent in Asia than in the US.

One common feature in all GI cancers isthe prognostic importance of staging. The TNMsystem has been widely adopted to describe thepatient’s disease status at the time of detection,and has great relevance to the choice of therapyand eventual outcome in all GI cancers. Theimportance of early detection is clear, and someGI cancers are sufficiently frequent and amenableto detection to allow cost-effective screening.

In this chapter we will review, for the majorsites of the GI tract, the important clinical trialsthat have been conducted. Whenever possible,we will highlight the methodological and designissues of these trials, in an effort to provideinsight into their results. We will describe,through this review, how the current standardtreatments in each disease site have evolved, aswell as presenting some of the most pressingissues for future research.

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 125: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

118 TEXTBOOK OF CLINICAL TRIALS

OESOPHAGEAL CANCER

Oesophageal cancer is an area where contro-versy as to the appropriate and optimal ther-apy exists in almost every aspect. In patientswith localised disease (stage 1–3), the roles ofsurgery, radiotherapy and chemotherapy, alone orin combination, have all been both advocated andquestioned. In advanced disease, it seems clearthat chemotherapy regimes have provided somedegree of progress, albeit limited.

LOCALISED DISEASE

In the past two decades, a large number ofrandomised clinical trials, involving thousandsof patients, have investigated the contributionsof radiotherapy or chemotherapy, both aloneand in combination, in the pre-operative andpost-operative settings or as definitive therapywithout surgery. Pre-operative radiotherapy, as asingle modality, has been shown in two relativelysmall randomised trials to provide no additionalbenefit compared to surgery. These two trials,reported by Launois et al.1 and Gignoux et al.,2

randomised 124 and 208 patients, respectively.Chemotherapy as a single modality added tosurgery was investigated in 440 patients byKelsen et al.3 and shown to have no advantageover surgery alone. The larger sample size ofthis study lends credence to this result. The twomodalities have also been compared to each otheras single agents,4 and no difference in patientoutcomes were observed. Based on these resultsit seems clear that single modality therapy haslimited if any impact on patient outcome.

Recently, interest has focused on combinedradiochemotherapy regimens in the pre-operativesetting. The results in this regard have beenconflicting. Four studies have been conducted,three with negative results and one with apositive conclusion. Bosset et al.5 randomised297 patients to pre-operative chemoradiationfollowed by surgery versus surgery alone, andfound no evidence of a difference in overallsurvival (a relative risk for survival between thetwo arms of 1.0), though they did observe an

advantage in disease-free survival in the treatedgroup. In smaller trials, Le Prise et al.6 and Urbaet al.7 reached the same conclusion based on86 and 100 randomised patients, respectively.In contrast, Walsh et al.,8 in a trial of 113patients, found a striking survival advantage forthe combined modality pre-operative approach,with a median survival of 16 months in themultimodality arm compared to 11 months inthe surgery alone arm (p = 0.01). However,the Walsh study has been criticised for severalfactors, including the small sample size, poorerthan expected survival for the surgery alonecontrol group, and the fact that the study wasstopped early at an unplanned interim analysis.In an effort to resolve this controversy, a largemulticentre randomised trial was mounted inthe US, with an accrual goal of 500 patients.Unfortunately, accrual to the trial was very slow,and the trial was closed early, far short of itsaccrual goal. Currently, the combined modalitypre-operative approach has been widely adopted,despite the conflicting evidence of benefit.

Additional controversy exists in this settingas to whether surgery itself is beneficial. TheRadiation Therapy Oncology Group (RTOG) hasconducted two randomised trials that have notincluded surgery as part of the treatment. Her-skovic et al.9 randomised 129 patients to radi-ation alone versus combined chemoradiotherapy.The study was stopped early (planned sample sizeof 150 patients) when the first planned interimanalysis showed a significant survival advantageto the combined modality group. The RTOGthen followed that study with a study comparingtwo doses of radiotherapy, both combined withchemotherapy.10 This study was also stoppedearly, in this case due to a lack of any additionalbenefit in the high-dose radiation arm. No tri-als to date have compared a surgical approach toa non-surgical approach, such a trial would sci-entifically be highly desirable but the practicalfeasibility of such a trial is questionable.

Based on these results, it is clear that thereis no consensus as to a ‘standard of care’for patients with localised oesophageal cancer,and that there is a great need for additional

Page 126: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 119

clinical trials. Historically, trials in this settinghave tended to be small and underpowered fordetecting moderate effects on outcome. Larger,more definitive trials should be conducted.

ADVANCED DISEASE

Trials in advanced oesophageal cancer havebeen plentiful, though attention in this settinghas focused more on Phase II trials than ran-domised Phase III trials. A multitude of agentshave been investigated, alone and in combina-tion. It is clear that progress has been made;over the last 20 years median survival foradvanced oesophageal cancer has increased from3 months to 6–9 months or greater. The empha-sis on Phase II trials, in an attempt to find apromising new approach, is certainly appropriategiven the modest results available from currentchemotherapies.

GASTRIC CANCER

While the incidence of gastric cancer has declinedin the United States over several decades, 21 600new diagnoses and 12 400 deaths were stillexpected in 2002.11 The nearly 40% cure ratethat these numbers imply likely results froma better natural history than oesophageal orpancreatic cancer, early detection via endoscopy,improvements in surgery, and the post-operativeuse of chemotherapy with radiation for patientswith resected disease. While gastric cancer isunusual among GI primary sites because of thelarge number of antineoplastic agents that showsome activity (as measured by tumour responserate), in the advanced disease setting even the mostactive combination chemotherapy regimens resultin remissions that generally last for only a fewmonths and median survivals of less than one year.

LOCALISED DISEASE

The ideal operation for gastric cancer, includ-ing the issues of limited versus total gastrectomyand extended versus more limited lymph nodedissection, has been a matter of controversy. Tri-als done in the 1980s and 1990s have led to the

conclusion that the most important surgical prin-ciple is achievement, when possible, of a patho-logically negative resection (an R0 resection).However, patients have improved post-operativequality of life if some of the stomach is retained,and most surgeons resect only as much of thestomach as is needed to achieve pathologicallyfree margins. The rich lymphatic networks ofthe stomach can sometimes result in apparentlyclear margins, yet residual intralymphatic diseasemay be present in ‘skip areas’. This has impli-cations regarding post-operative treatment, andsuggests a potential role for adjuvant radiationto the tumour bed and regional structures.

Many surgeons, particularly those in Japan,advocate extended lymph node dissections asa means to improve outcome due to the cen-tral location of the stomach with many lymphnode-bearing areas at risk for metastatic spread.In a landmark study the Dutch Gastric CancerGroup employed a single Japanese surgeon totrain participating Dutch surgeons to perform theclassical Japanese extended lymphadenectomy.12

These investigators randomised 711 eligiblepatients to resection of the primary tumour withclear gastric margins and either standard (D1)or extended lymphadenectomy (D2). Three-yearsurvival rates were 56% and 58% respectively forthe two cohorts, suggesting no advantage to moreaggressive surgery. The British Medical ResearchCouncil conducted a similar, albeit smaller (400patients) trial that confirmed this finding.13

The adjuvant therapy of gastric cancer, mainlyusing 5-FU based regimens, has been a mat-ter of investigation for many years. Many ran-domised trials of chemotherapy versus surgeryalone have been reported and these individualtrials have generally been negative. A meta-analysis of 21 randomised controlled trials con-ducted worldwide, that included 3962 patientswith 1840 allocated to surgery alone and 2122allocated to adjuvant chemotherapy, did show amodest potential benefit for treatment.14 The oddsratio (OR) in favour of chemotherapy was 0.84overall, but the principal benefit was confined topatients enrolled in trials done in Asia (n = 888patients, OR 0.58) as opposed to Western patients

Page 127: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

120 TEXTBOOK OF CLINICAL TRIALS

(n = 3074, OR 0.96). This finding lends somesupport to the possibility of a geographically orethnicity based difference in the natural historyof this disease, a finding supported by some epi-demiologic evidence. Studies of post-operativeradiation versus surgery alone have not shownany advantages, although interpretation of thelimited data addressing this issue is problematic.

Adjuvant radiation and chemotherapy usedin combination has recently been shown to beadvantageous in North American patients. In a603 patient study, patients were randomised toeither surgery alone or to surgery followed bycombined modality therapy.15 In the treatmentarm patients were given 5-FU plus leucovorinbefore and after 4500 cGY to the gastric bed(with radiosensitising 5-FU + leucovorin admin-istered for four consecutive days at the beginningand three days near the end of the radiation).Adjuvant chemoradiotherapy led to a significantmedian survival advantage of 36 compared to27 months (p = 0.005) and a reduction in localregional relapse rate to 67% compared to 82%.In addition to these outcome improvements, twoimportant patterns of care findings were noted.The trial recommended but did not demand atleast a D1 resection and noted that a D2 resec-tion was preferred. However, when operativereports were analysed, only 10% of patients hadD2 resections, 36% D1 resections, with the bal-ance having less aggressive surgery. Secondly,pretreatment radiation field review by a singleradiation oncologist indicated that 35% of sub-mitted treatment plans contained major or minordeviations from the protocol, indicating a needfor education of surgeons and radiation oncol-ogists as to the preferred procedures in thesesettings. Some readers have raised the possibilitythat the chemotherapy and radiation were benefi-cial mainly because of suboptimal surgery in thiscohort of patients.

ADVANCED DISEASE

Palliative therapy does make a meaningful dif-ference to many patients with advanced dis-ease in gastric cancer, whose median survival

with supportive care alone is around threemonths. One or more agents from virtually allclasses of chemotherapy drugs have demonstra-ble activity, and median survivals approachingone year have been reported with several com-bination chemotherapy regimens. One examplerepresentative of modern Phase III trials ran-domised patients to epirubicin, cisplatin and 5-FU(ECF) versus 5-FU, doxorubicin and methotrex-ate (FAMtx).16 The overall response rate was45% compared to 21% (p = 0.0002) and theoverall survival was 8.9 months compared to5.7 months (p = 0.0009) for ECF over FAMtx.Despite the intensive nature of these two reg-imens, and other combinations tested to date,the beneficial effects in terms of improvedpatient longevity have been modest. Earlierdetection, improvements in the management oflocal regional disease, and the testing of newagents seem to provide the best avenues towardsbetter outcomes.

PANCREATIC CANCER

Pancreas cancer has a very poor prognosis. Itaffects approximately 27 000 new patients eachyear in the US, and is fatal in approximately95% of cases. As in all GI cancers, therapyincludes surgery, radiotherapy and chemotherapy,depending on the disease stage.

LOCALISED DISEASE

In the setting of resectable or locally advanceddisease, both radiotherapy and chemotherapy, andthe combination, have been tested extensively.Studies conducted prior to the mid-1990s tendedto be small and underpowered, which has led toa variety of conflicting results.

In locally advanced disease, the Gastrointesti-nal Tumor Study Group (GITSG) randomised227 patients to three arms: radiotherapy alone,or radiotherapy at two different dose levelsgiven with chemotherapy (5-FU).17 Accrual tothe radiotherapy alone arm was stopped early dueto poor results. Two studies have investigated the

Page 128: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 121

need for chemoradiotherapy versus chemotherapyalone, with conflicting results. Klaassen et al.,18

in a two-arm randomised study of 191 patients,found no advantage for combined therapy ver-sus chemotherapy alone, while GITSG19 reportedthat overall survival was improved with the addi-tion of radiation to chemotherapy in a two-armstudy of 43 patients. The small sample sizes ofall these trials make definitive conclusions diffi-cult, but there is little evidence to support a rolefor radiation alone in this setting.

In the setting of a complete surgical resec-tion, several small randomised studies have sug-gested a benefit to post-operative chemotherapyor chemoradiotherapy. None of these trialsenrolled greater than 114 patients, limiting theability to draw conclusions. The recent reportby Neoptolemos et al.20 provided much moreconclusive evidence in this regard. In thisstudy, 541 eligible patients were randomised toreceive post-operative chemotherapy (6 monthsof post-surgical treatment), chemoradiotherapy (a10-day course of radiotherapy accompanied bychemotherapy), both or neither (the design wasnot a true 2 × 2 factorial because clinicians wereallowed to choose to participate in either one orboth randomisations). In this study, there was nobenefit to the chemoradiotherapy, while a clearbenefit was observed for the chemotherapy groupcompared to the no treatment group (median sur-vivals of 19.7 versus 14.0 months). Interestingly,the authors of that study did not conclude thata no treatment arm was inappropriate for futuretrials, in fact they are currently sponsoring athree-arm trial in 990 patients of two differentchemotherapy approaches (5-FU with folinic acidor gemcitabine) versus surgery alone. This trialhas been criticised for including patients withinvolved margins after surgery and also primariesarising in the ampulla and bile ducts.

ADVANCED DISEASE

Chemotherapy has been considered the standardof care in the US for advanced pancreatic cancer,despite the lack of any randomised trial demon-strating a survival benefit for chemotherapy ver-sus no treatment. The use of chemotherapy was

justified by the occasional tumour response thatwas observed. Single agent therapy with 5-FUhas been used as the control arm for multiplerandomised trials, with the assumption that 5-FUwas at worst a toxic placebo, thus if a newexperimental regimen were shown superior to5-FU, it would indeed have improved efficacywhen compared to no treatment. Burris et al.21

reported a Phase III randomised trial with 126patients that showed an improved overall survivalfor gemcitabine compared to 5-FU alone (mediansurvivals of 5.7 versus 4.4 months respectively,p = 0.003). The Burris trial established gemc-itabine as a new standard of care in this setting.Ongoing and future trials will likely use gemc-itabine as a base, comparing gemcitabine aloneto a multi-drug chemotherapy regimen includinggemcitabine.

A recently completed trial in pancreatic cancercan be used to illustrate the need for careful con-sideration of an agent prior to Phase III testing.Due in large part to the dismal prognosis and lim-ited treatment options available for patients withpancreatic cancer, pressure has been applied torapidly introduce novel agents into Phase III tri-als. The goal is to seek to speed the process oftesting a new agent by avoiding the Phase II stageof testing. Such was the case in a randomisedPhase III trial reported by Moore et al.,22 where anovel agent (a matrix metalloproteinase inhibitor(MMPI)) was tested against gemcitabine in 277patients. In this trial the MMPI had significantlyinferior outcome compared to gemcitabine. Thetrial was carefully and appropriately designed toallow early stopping if the results were extreme,which in this case they were. A Phase II trial mayhave identified the lack of efficacy of this agentprior to its large-scale testing.

COLORECTAL CANCER

Colorectal cancer is the most common malig-nancy in the GI tract. Not surprisingly, it is alsothe GI cancer that has been the most extensivelyinvestigated in clinical trials.

Likely as the direct result of these intensiveresearch efforts, considerable progress has been

Page 129: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

122 TEXTBOOK OF CLINICAL TRIALS

made in many facets of colorectal cancer,including chemoprevention, early detection andtreatment.

CHEMOPREVENTION

Cancer chemoprevention can be defined as theuse of nutritional or pharmaceutical agents toprevent, inhibit or reverse carcinogenesis at apre-invasive stage of disease. Candidate agentsare often identified through a combination ofepidemiological and laboratory-based research.Since most subjects enrolled onto chemopre-vention trials are generally healthy (except fortheir increased cancer risk), minimal toxicityrepresents an important criterion for select-ing candidate agents. Colorectal adenomas arecommonly employed as intermediate endpointbiomarkers to facilitate more rapid comple-tion of colorectal cancer chemoprevention trials.To date, several colorectal cancer chemopre-vention agents have been investigated, includ-ing fibre, antioxidant vitamins, calcium andnonsteroidal anti-inflammatory drugs (NSAIDs).Selective cyclooxygenase-2 enzyme inhibitors,which may have a better safety profile than tradi-tional NSAIDs, have received considerable atten-tion in this context as well. Further discussionregarding the current status of these agents is pro-vided below.

Dietary fibre represents a heterogeneous mix-ture of complex materials derived primarily fromplant cell walls. Extensive observational datacollected over more than three decades suggestthat fibre might help to prevent colorectal neo-plasia by diluting or adsorbing faecal carcino-gens, reducing colonic transit time, altering bileacid metabolism, or increasing short-chain fattyacid production. However, high fibre interven-tions have not been associated with a reducedrisk for recurrent colorectal adenomas in fiveclinical trials.23 – 27 In fact, one small randomisedstudy observed a higher adenoma recurrence rateamong subjects in the active fibre interventiongroup.25 It remains possible that administration ofdietary fibre at an earlier point in tumourigenesis(for example, prior to first adenoma formation)might have a more appreciable anti-carcinogenic

effect. Nonetheless, the existing data do notsupport a major role for this agent in colorectalcancer chemoprevention.

Antioxidant vitamins such as the retinoids,carotenoids, ascorbic acid and alpha-tocopherolmay prevent carcinogen formation by neutral-ising free radicals within the intestinal lumen.Although somewhat inconsistent, the prepon-derance of data from case–control and cohortstudies support an inverse association betweenantioxidant vitamin intake and colorectal can-cer risk. Four colorectal cancer chemopreven-tion trials have investigated antioxidant vitaminsat different doses and in various combinations.One relatively small study found that recur-rent adenomas were less common among sub-jects treated with vitamin A (30 000 IU perday), vitamin C (1 g per day) and vitamin E(70 mg per day) over a mean intervention periodof 17.8 months.28 Another three-year chemopre-vention trial reported a 69% reduction in thenumber of recurrent colorectal polyps among sub-jects randomised to receive multiple antioxidants(beta-carotene, selenium, vitamin C, vitamin E)plus calcium versus placebo compounds.29 Con-versely, two larger trials of vitamin C and vita-min E yielded unremarkable results.30,31 Thus,definitive evidence for a protective benefit fromantioxidant vitamins on colorectal cancer riskremains to be demonstrated.

Calcium may serve as a colorectal cancerchemoprevention agent through at least twomechanisms: functionally removing toxic bileacids from the faecal stream and decreasing cellu-lar proliferation in the large bowel mucosa. Datacompiled from 24 observational studies yieldeda summary risk estimate of 0.86 (95% CI =0.74–0.98) for colorectal cancer among sub-jects with high versus low calcium intakes.32 Inaddition to encouraging data from the relativelysmall clinical trial of multiple antioxidants pluscalcium mentioned above,29 the Calcium PolypPrevention Study found that treatment with cal-cium carbonate at 3 g per day for 4 years wasassociated with a statistically significant 15%reduction in recurrent colorectal adenoma risk,as compared to placebo.33 Further data regarding

Page 130: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 123

the chemopreventive potential of calcium (andvitamin D) are being collected from the largeWomen’s Health Initiative Clinical Trial,34 whichshould help to clarify whether or not applicationof this agent to average-risk subjects has measur-able value.

NSAIDs are a structurally diverse class ofpharmaceutical agents that appear to reduceproliferation, delay cell cycle progression andinduce apoptosis in epithelially-lined tissues.Extensive data from rodent models suggest thatNSAID administration can reduce gastrointesti-nal tumour incidence and/or multiplicity by upto 80%. In human populations, regular NSAIDuse has been associated with decreased col-orectal cancer risk in numerous observationalstudies. Despite a consistent demonstration ofprobable benefit, NSAIDs have not been rig-orously evaluated in colorectal cancer chemo-prevention trials until recently. The Physicians’Health Study (n = 22 071 subjects), which wasa randomised, double-blind, placebo-controlledtrial of aspirin 325 mg every other day to preventcardiovascular disease, did analyse colorectalcancer incidence rates as a secondary endpoint.After a mean follow-up period of 5 years, nostatistically significant difference was observedbetween the active and placebo groups (RR =1.15; 95% CI = 0.80–1.65).35 Extension of thefollow-up period to 12 years did not apprecia-bly alter the risk estimate (RR = 1.03; 95% CI =0.83–1.28).36 However, certain limitations of thePhysicians’ Health Study trial design, such asthe relatively low aspirin dose and lack of uni-form colorectal cancer surveillance guidelines,may have hindered its ability to detect a protec-tive association.

The chemopreventive effects of traditionalNSAIDs are thought to result primarily frominhibition of cyclooxygenase-2 (COX-2). Selec-tive COX-2 inhibitors like celecoxib and rofe-coxib are therefore being aggressively pursued aspotential colorectal cancer prevention agents. Inthe first trial to be reported, celecoxib 400 mgtwice per day was associated with statisticallysignificant reductions in both the mean num-ber and total burden of colorectal polyps among

subjects with familial adenomatous polyposis.37

Additional COX-2 inhibitor trials are ongoing toconfirm the initial findings and to evaluate theeffect of these agents on sporadic colorectal can-cer risk.

A number of other candidate agents, includ-ing oestrogen compounds, ursodeoxycholic acid,difluoromethylornithine and Bowman–Birk in-hibitor, have shown promising results in cell cul-ture experiments, animal model systems and/orobservational studies. Further data regardingthese (and other) potential colorectal cancerchemopreventive agents are anticipated in thenear future as new Phase I, II and III clinicaltrials are organised and completed.

EARLY DETECTION

Due to a variety of factors, colorectal cancersare very amenable to early detection. First, thebiology of colorectal carcinogenesis is becom-ing increasingly well understood, as evidencedby continued expansion of knowledge regard-ing the molecular events associated with differentstages in the adenoma–carcinoma sequence. Thisrelatively slow process typically requires sev-eral years to progress from normal mucosa toadvanced neoplasia, which affords a clear oppor-tunity for detecting lesions at an asymptomaticstage. Second, there are a variety of possiblescreening methods that range from non-invasivestool tests or imaging studies to invasive endo-scopic evaluations. Third, due to the high inci-dence of colon cancer, such screening may becost-effective in terms of screening costs versusyears of life saved. Fourth, the high incidence ofcolon cancer provides a motivation for many indi-viduals to seek out screening. Based on these andother considerations, several randomised trials ofvarious screening methods have been conducted.

With respect to faecal occult blood testing,three large clinical trials have shown that regularscreening may reduce colorectal cancer mortal-ity by 13–33%.38 – 40 In two trials from Europe,subjects (n = 61 933 and n = 150 251) were ran-domised to undergo screening every other yearversus usual care. In the Minnesota Colon Cancer

Page 131: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

124 TEXTBOOK OF CLINICAL TRIALS

Study, subjects (n = 46 551) were randomisedto annual screening, biennial screening or usualcare. Follow-up in these studies ranged from11–18 years. Interestingly, only one trial foundthat programmatic screening was associated witha statistically significant reduction in colorectalcancer incidence.40 These data suggest that pre-invasive adenomas (arguably the most relevantscreening target) are poorly detected by faecaloccult blood testing. Thus, despite the inclusionof faecal occult blood testing in widely endorsedcolorectal cancer screening guidelines,41,42 fur-ther pursuit of more sensitive and specific stoolbiomarkers is needed.

Direct examination of the distal colorectum byflexible sigmoidoscopy represents another optionfor colorectal cancer screening. However, thisprocedure is at least moderately invasive andmay be associated with transient discomfort. Assuch, adherence to recommendations for initialand repeat flexible sigmoidoscopies was recentlyevaluated in the Prostate, Lung, Colorectal andOvarian (PLCO) Cancer Screening Trial. Amongsubjects randomised to the screening interventionarm (n = 17 713), 83% completed the baselineflexible sigmoidoscopy. Additionally, 87% ofsubjects who were eligible for repeat testingafter three years complied with the follow-upevaluation.43 At present, the effects of flexiblesigmoidoscopy screening on colorectal cancerincidence in the PLCO trial cohort remainunknown. An even larger flexible sigmoidoscopyscreening trial is underway at 14 centres inthe United Kingdom and Wales (n = 170 432randomised subjects).44 When available, datafrom these two trials should be highly informativeregarding the utility of flexible sigmoidoscopyscreening to reduce colorectal cancer incidencerates in the general population.

Colonoscopy is currently the gold standardfor structural evaluation of the large intes-tine. Cost-effectiveness models suggest thatone-time screening colonoscopy between ages50–54 years may be a rationale colorectal cancerprevention approach.45 Existing early detectionguidelines support a slightly more conserva-tive strategy (i.e. colonoscopy every 10 years, in

the absence of symptoms or other known riskfactors). However, screening colonoscopy has notyet been investigated in a randomised clinicaltrial, with the exception of one ongoing feasi-bility study.46 Two novel methods of colorectalcancer screening, CT colonography and DNA-based stool assays, are currently being tested inpopulation-based clinical trials as well. Resultsfrom these studies are anticipated in the nearfuture and may necessitate further modificationof current early detection algorithms.

TREATMENT: COLON CANCER

Localised Disease

Surgery is the primary modality for the treat-ment of localised colon cancer. Depending ondisease stage, surgery alone produces 5-year sur-vival rates of 50% to greater than 90%. Asopposed to gastric and rectal cancer, however,the surgical technique for colon cancer resec-tion has been the subject of limited investigationin randomised clinical trials. One large surgicaltrial recently completed accrual of approximately900 patients. In this study, patients were ran-domised to either the standard ‘open’ colectomyor a ‘minimally invasive’ laproscopically assistedcolectomy.47 The trial’s primary endpoint wascancer recurrence, and it was designed to demon-strate equivalence of the two approaches. Thetrial also included extensive quality of life andcost-effectiveness assessments.

The value of adjuvant treatment for patientswith stage 3 colon cancer (cancer able to becompletely resected, but with positive lymphnodes in the resection specimen) is well accepted.The first trial conducted with a positive result wasconducted by the North Central Treatment group,initiated in 1978.48 This was a three-arm trial,with a sample size of approximately 135 patientsper arm, comparing no post-surgical treatment toadjuvant treatment with either levamisole aloneor 5-FU plus levamisole. The initial results ofthis trial indicated a moderate but statisticallysignificant benefit for the 5-FU plus levamisolearm compared to control. Given the novelty of

Page 132: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 125

this result, in a decision that likely would neverbe made in the current day, the investigativeteam decided to embark on a larger, confirmatorytrial prior to the release of the results to theoncology community. This confirmatory trial,known as Intergroup trial 0035, enrolled over1200 patients to the same three arms as theinitial trial. Intergroup 0035 clearly demonstratedimproved overall survival in patients treated withadjuvant 5-FU and levamisole.49 These findingslead in part to the 1990 consensus statementfrom the National Cancer Institute that patientswith stage 3 colon cancer who are unable toenter a clinical trial should be offered adjuvanttreatment with 5-FU plus levamisole unless thereare contraindications.50

A number of clinical trials were in progressat the time of the publication of the beneficialresults from the use of adjuvant 5-FU pluslevamisole. Several of these trials included ano post-surgical treatment control arm, and thusthese trials were closed prior to reaching theiraccrual goals due to ethical reasons. Included inthis list of trials that were closed prematurelywere five Phase III randomised trials testing5-FU plus leucovorin versus no post-surgicaltreatment control. The results from three of thesetrials were pooled for analysis;51 the other twowere reported separately.52,53 In each of theseanalyses, adjuvant 5-FU plus leucovorin showeda survival advantage compared to control. Insubsequent studies, throughout the 1990s, variousinvestigative groups conducted trials comparingvarious different schedules and combinationsof 5-FU combined with either leucovorin orlevamisole. None of these trials demonstrated astatistically significant improvement in survivalbetween study arms, although through such trialsit did become clear that 6 months of 5-FU plusleucovorin was at least as effective as 12 monthsof 5-FU plus levamisole.54,55 As a result, thecurrent standard of care in the United States (asof 2002) for stage 3 colon cancer is 6 months of5-FU plus leucovorin.

In the discussion in the preceding paragraph,all of the regimens discussed were based onthe delivery of 5-FU as a short-term bolus

infusion. Based on promising results in theadvanced disease setting (as discussed below),multiple clinical trials have been conducted usingregimens based on a long-term infusion with5-FU. Intergroup trial 0153 directly compareda bolus to an infusional 5-FU based regimenin a randomised Phase III trial of 1078 patients(terminated early at an interim analysis–originalplanned sample size of 1800 patients).56 In thistrial no difference in efficacy was observedbetween the arms, although the toxicity profiledid differ substantially. Based on these results,two recent Phase III randomised trials in theUnited States have used control arms of 6 monthsof bolus 5-FU plus leucovorin. However inEurope, regimens using short-term 24–48 hour5-FU infusions are more popular.

Future efforts in the adjuvant treatment of stage3 colon cancer will likely be directed in twoareas – first, to improving the treatment optionsavailable, and second, and relatedly, to tailoringtherapy to the individual patient. In the treatmentarea, for the time being, new studies will ran-domise patients to treatments based on adding anew treatment to a 5-FU and leucovorin regimen.For example, the two recent large Phase III trialsin the United States compared 5-FU and leucov-orin to either a 5-FU, leucovorin and irinotecan(trial C89803) or 5-FU, leucovorin and oxali-platin (trial C-06). In somewhat of a leap of faith,the trial currently being planned by the US Gas-trointestinal Intergroup will compare 5-FU, leu-covorin and irinotecan to 5-FU, leucovorin andoxaliplatin. This leap is necessitated by the newrealities of conducting trials in the adjuvant colonsetting – namely, that patient outcome has beensufficiently improved that significant follow-upis required in order to obtain sufficient events topower a study. Both the C89803 and C-06 trialswill require a minimum of three years of follow-up prior to any formal analysis. The options forconducting a follow-up study are thus to wait atleast three years, or to push ahead, assuming thatat least one of the experimental regimens willprove superior to the current standard. A dis-cussion of the second area, tailoring therapy to

Page 133: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

126 TEXTBOOK OF CLINICAL TRIALS

the individual patient, will be deferred until thenext section.

One additional insight into the conduct of clin-ical trials in GI cancers may be gained by exam-ining the steady increase in the sample sizes thathas occurred in stage 3 colon clinical trials overthe past two decades. In trials conduced in theearly 1980s, sample sizes of 100–200 per armwere typical,48,52 with some exceptions (such asthe NSABP C-01 trial, with approximately 380patients per arm).57 With such a sample size, thestudy provided adequate power to detect only arelatively large effect. Fortunately, 5-FU, whencombined with either levamisole or leucovorin,did provide a rather large effect, with a reductionin the hazard of death by approximately 25%.58

However, the likelihood of a subsequent treat-ment advance by such a magnitude is unlikely,and smaller advances may indeed be clinicallyrelevant. Therefore, more modern trials in stage 3disease have included sample sizes of 1600 (trialC89803), 2400 (trial C-06) and 4900 patients fora four-arm trial (the QUASAR trial).54 As ther-apy continues to improve, the sample size neces-sary to detect further incremental advances willcontinue to grow. One possible strategy for prac-tically conducting such large trials is discussed inthe next section.

As opposed to the adjuvant treatment of stage3 disease, the benefit of adjuvant treatment forstage 2 (node negative) disease is unclear. Inmany previous trials, patients with stage 2 diseasehave been pooled together with stage 3 patients.The sample size for such trials has typically beenbased on an analysis pooling the data from bothpatient groups. For a variety of reasons, patientswith stage 3 disease have typically constituteda majority of the enrollment to such trials, thuseach individual trial has been underpowered todetect a moderate benefit of treatment in stage2 patients. Due to the limited sample size ineach trial, two attempts have been made to pooldata from several trials in order to gain a suffi-cient sample size to draw a definitive conclusionregarding the value of adjuvant therapy in stage 2disease. However, the two analyses have reporteddiffering conclusions. One analysis, reported by

Mamounas et al.,59 pooled data from four ran-domised trials conducted by the NSABP. In noneof these four trials was there a direct randomisedcomparison between treatment and control. Intheir analysis, the authors estimated the magni-tude of the difference in outcome between the twostudy arms in each of the four studies. They thencompared whether this difference in outcome dif-fered by patient stage. The authors concluded thatthe treatment effect within each study was similarbetween stage 2 and stage 3 patients, and sinceit had been previously demonstrated that treat-ment is beneficial in stage 3 patients, they con-cluded that treatment is also beneficial in stage2 patients.

The second investigation60 used a more directapproach. In this analysis, the study team pooledthe data from stage 2 patients who had partic-ipated in five randomised trials of 5-FU plusleucovorin versus control. They found no statis-tically significant benefit of treatment, based ona pooled sample size of just over 1000 patients.Due to the excellent outcome of stage 2 patients,with an approximately 80% 5-year survival inuntreated patients, even this pooled sample sizehad poor power to detect a small but possi-bly important improvement in patient outcome(only 60% power to detect an 85% 5-year sur-vival in treated patients). Thus, the benefit of5-FU based adjuvant therapy in stage 2 diseaseremains unclear, and further pooled analyses willlikely be necessary. A large randomised trial ofa monoclonal antibody in the setting of stage 2disease has recently completed accrual of over1700 patients (trial C9581); the results of thistrial should help clarify the appropriate treatmentfor patients with stage 2 disease.

Advanced Disease

It is likely that more clinical trials have beenconducted in advanced colon cancer than in anyother GI disease site. This is due to the highincidence of the cancer, and the fact that it is to atleast some degree sensitive to chemotherapeuticagents. Trials in advanced colon cancer typicallyinclude patients with advanced rectal cancer, as

Page 134: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 127

the response to chemotherapy has not been shownto depend on the precise site of the patient’sdisease within the colorectum.

The drug 5-FU has been the mainstay of treat-ment for colorectal cancer for over 40 years.From 1950–1990, a multitude of trials were con-ducted in an effort to improve the efficacy of5-FU based regimens, by changing methods ofadministration, combining it with various supple-mental agents (such as leucovorin or levamisole),or changing the dose and schedule. Regardingthe timing of administration, the clear result ofmultiple studies is that, among regimens where5-FU is delivered as a bolus injection, the partic-ulars of the administration have a definite impacton toxicity, some impact on tumour response,but little impact on patient survival. The addi-tion of leucovorin to 5-FU has been demon-strated in a meta-analysis to provide increasedefficacy in terms of response rate compared to5-FU alone.61 In another meta-analysis, a sched-ule where the 5-FU is delivered by a continuousinfusion has been shown to provide an advantagein both toxicity and overall survival compared tobolus schedules.62,63 However, the improvementin median survival was modest at 0.8 months,thus many practitioners (at least in the UnitedStates) have continued to administer the bolus 5-FU based regimens based on perceived benefitsof patient and physician convenience.

After 40 years of testing variations on a 5-FUtheme, two more recent developments have addedexcitement to the advanced colorectal cancerclinical trials arena. The first is the introductionof oral 5-FU based regimens. The oral method ofdelivery offers clear benefits in terms of patientpreference. However, an oral approach would notlikely be accepted if it did not provide at leastequivalent efficacy to an IV approach. Therefore,two large equivalence trials have been conductedcomparing an oral to an IV regimen. These twotrials, one reported by Hoff et al.64 and the otherby van Cutsem et al.,65 enrolled 605 and 602patients respectively, and were formally designedto test the equivalence of the oral regimen to theIV approach. In both cases, formal equivalencewas declared.

At almost the same time as the introductionof oral 5-FU based agents for advanced colorec-tal cancer, new chemotherapeutic agents havebeen added to 5-FU with promising results. Basedon results with the agent irinotecan in patientswho had failed a 5-FU based regimen,66,67 tri-als with irinotecan were performed in the settingof patients with previously untreated advanceddisease. As reported by Saltz et al.68 and Douil-lard et al.,69 irinotecan, when added to 5-FU andleucovorin, resulted in improved time to progres-sion and overall survival when compared to 5-FU and leucovorin alone in first-line treatmentof advanced disease. These two relatively largetrials (683 and 387 patients, respectively) estab-lished a new standard of care in this setting. Inthe Saltz trial irinotecan was added to a bolus5-FU schedule, while the Douillard trial addedirinotecan to an infusional 5-FU regimen, thusthe optimal method in which to give 5-FU withthe new agent remained unclear. Recently, thedrug oxaliplatin has shown promising activitywhen combined with 5-FU and leucovorin in sev-eral studies,70,71 with reported median survivalsequaling or exceeding 18 months.

The proven efficacy of irinotecan as second-line therapy in patients who fail 5-FU basedtherapy has complicated the design of first-line advanced disease trials. Traditionally, overallsurvival has been used as the primary endpointfor such studies, and extending the patient’slongevity remains the ultimate goal. However,given that there are second- and even third-linetherapies with proven benefit, the relative meritsof overall survival as the primary outcome fora trial warrant a reconsideration. Consider thedesign used in the Saltz trial,68 where irinotecanplus 5-FU and leucovorin was compared to 5-FU and leucovorin. In this trial, patients whoprogressed on the 5-FU and leucovorin armwere able to receive irinotecan off study, as itwas approved for the second-line indication. Theavailability of this effective second-line agentprovided at least the theoretical possibility thatthe two primary study arms could show nodifference in terms of overall survival, eventhough irinotecan was beneficial to patients on

Page 135: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

128 TEXTBOOK OF CLINICAL TRIALS

both arms of the study. For this reason, timeto tumour progression was specified as theprimary endpoint for the trial. In retrospect,the addition of irinotecan as a component ofthe initial treatment resulted in both improvedtime to progression and overall survival, makingthe result clear. However, these factors must betaken into consideration for future trials, where atminimum data on the use of second- and third-line therapy should be collected.

TREATMENT: RECTAL CANCER

Rectal cancer is second to colon cancer amongGI malignancies in the number of new casesper year. When the initial diagnosis for rectalcancer is as advanced disease, i.e. not surgicallycompletely resectable, its primary treatment is inthe same manner as for advanced colon cancer.However, the optimal adjuvant treatment forrectal cancer is the issue of considerable study.Questions abound as to the importance of surgicaltechnique, the value of radiation therapy, theoptimal chemotherapy regimen and the timing oftherapy, either pre- or post-resection.

Surgery/Adjuvant Therapy

Prior to 1990, external beam radiotherapy in thepost-operative setting was considered by many asthe standard of care in the United States, basedprimarily on an observed benefit in lowering therisk of local recurrence. In particular, radiationas a single agent added to surgery had neverbeen shown to improve overall survival com-pared to surgery alone. In a randomised study of204 patients, Krook et al.72 demonstrated a bene-fit in overall survival of post-operative combinedtherapy with radiation and 5-FU and semus-tine compared to radiation alone. The 1990 NIHconsensus statement concluded that ‘Combinedpostoperative chemotherapy and radiation ther-apy improves local control and survival in stage IIand III patients and is recommended’.50 In a sub-sequent study conducted by the US GI Intergroup,two questions were asked in a 2 × 2 factorialdesign: is semustine necessary, and can therapy

be optimised by using continuous infusion 5-FUbased therapy as opposed to bolus. All patientsin this study received radiation. This study of680 patients concluded that (1) semustine is notnecessary, and (2) infusional 5-FU based ther-apy during the radiotherapy provides a survivaladvantage compared to bolus therapy.73

Two studies conducted by the National Sur-gical Breast and Bowel Project (NSABP) havequestioned the value of radiation in the post-operative setting. In the Krook and O’Connellstudies mentioned above, all patients receivedradiation, and the studies focused on the rela-tive benefit of different chemotherapy regimens.In contrast, NSABP study R-01 tested three armsin a randomised manner in 574 patients: no post-surgical treatment, post-operative radiation andpost-operative chemotherapy. A survival benefitwas observed for the chemotherapy arm com-pared to the no treatment arm, but this advan-tage was not observed in the radiation alonearm.74 The NSABP followed this study with atwo-arm randomised trial of 741 patients com-paring chemotherapy alone to chemotherapy plusradiation.75 The results of this trial showed noimprovement in overall survival for the combinedmodality arm, although there was a statisticallysignificant improvement in the rate of local recur-rence associated with radiation. Despite these twoconsistent results, radiation continues to be com-monly used in the post-operative treatment ofstage 2 and 3 rectal cancer.

Increasingly, practitioners are turning to deliv-ering radiotherapy for rectal cancer in the pre-operative setting. There are several theoreticaladvantages to the pre-operative approach. Per-haps most importantly, from the patient’s per-spective, pre-operative therapy may shrink thetumour sufficiently to allow a sphincter-sparingresection. Pre-operative radiotherapy has beenshown to improve outcome compared to no treat-ment in a large randomised trial of the SwedishRectal Cancer Trial group. This trial randomised1168 patients to a two-arm trial of a shortcourse (25 Gy in 1 week) of pre-operative radia-tion compared to no pre-operative treatment, andshowed a statistically significant improvement in

Page 136: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 129

both local recurrence rate and overall survival.76

In the United States, the standard pre-operativeregimen is to deliver the 5-week post-operativecourse of 50.4 Gy pre-operatively. The efficacyof this approach has never been established ina randomised trial. A comparison of these twopre-operative approaches is clearly warranted.

Regardless of the specifics of the pre-operativeapproach, a burning question concerns whetherthe pre- or post-operative approach provides thebest outcome. Two randomised trials have beenattempted in the United States, and both wereclosed early far short of their accrual goals dueto poor accrual. However, an ongoing trial inEurope has been more successful, with accrual ofover 600 patients to one of the two approaches.77

The 50.4 Gy long course radiation is being usedin both arms, and both arms are receiving thesame chemotherapy regimen in combination withthe radiation.

In addition to the controversies present inchemotherapy and radiation therapy, there isconsiderable interest in the optimal surgicalmanagement of this disease. In particular, thesurgical approach of total mesorectal excision(TME) has been promoted as an importantsurgical advance. Based on case-series and otherhistorical data, proponents of TME have claimedsignificant reductions in local recurrence ratesand improved overall survival compared tostandard surgery.78 However, TME has neverbeen tested against non-TME surgery in arandomised trial, and such a trial is unlikely toever be conducted. In a large randomised trial of1861 patients conducted by the Dutch ColorectalCancer Group, pre-operative radiation was shownto reduce the rate of local recurrence comparedto no radiation when all patients received TMEsurgery.79 In this early report, with a medianfollow-up of 2 years in living patients, there wasno improvement in overall survival for patientsreceiving radiation.

In summary, it is clear that rectal cancer isan area where randomised clinical trials havemade several important contributions to improv-ing patient outcomes. Post-operative chemother-apy and chemoradiotherapy, and pre-operative

radiation therapy have been shown to reduce thelocal recurrence rate and improve overall survivalbased on large randomised trials. It is also clearthat considerable work remains to define the opti-mal timings and combinations of the differenttreatment modalities.

SELECTED METHODOLOGIC ISSUESWITH EXAMPLES

The number and variety of large Phase IIIrandomised trials that have been conducted in GIcancers brings to light a number of clinical trialsmethodology issues that have direct relevance tothe conduct and conclusions that can be drawnfrom such trials. In this section we will highlightthree such issues: subgroup analysis, the study ofprognostic and predictive factors, and monitoringof clinical trials for safety.

SUBGROUP ANALYSIS

Studies are often conducted in multiple possibleheterogeneous groups of patients in order toachieve a necessary sample size, or to completea trial more quickly. The conduct subgroupanalyses, that is, to examine the result of atrial separately in different groups of randomisedpatients, is extremely important. In fact, suchanalyses can help demonstrate the robustness ofa result: if a study finding can be shown to beconsistent in different patient groups, such aspatients of different ages, or patients enrolledfrom different countries, then the overall resultcannot be explained by an extreme result inone subset of patients. Additionally, subgroupanalyses are important in generating hypothesesfor future study. However, caution must beexercised when examining the results of subsetanalyses, particularly if they were not pre-specified in advance.

One example of a subgroup analysis of greatcontroversy in the setting of GI cancers iswhether post-surgical adjuvant treatment is ben-eficial for patients with stage 2 colon cancer,as discussed previously. Historically, most trials

Page 137: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

130 TEXTBOOK OF CLINICAL TRIALS

in adjuvant colon cancer have included patientswith both stage 2 and 3 disease. The majority ofpatients in such trials have been stage 3 patients,and subset analyses within these trials have con-sistently shown 5-FU based chemotherapy to bebeneficial in this group. However, because themajority of patients in such trials have not beenstage 2, and because the prognosis for the stage2 patients is overall more favourable, with fewerpatient deaths and thus less statistical power,the individual studies have not shown consistentresults in the subset of B2 patients. As mentionedabove, two pooled analyse have been conducted,with conflicting results. Both analyses are to becommended for obtaining and using the individ-ual patient data from each of the included trials,and for having very complete follow-up for thestudies used. However, both analyses could becriticised for other methodologic issues. The first,by the NSABP investigators,59 pooled data frommultiple trials with different treatment arms, noneof which compared directly a no treatment armto what would be considered a standard treat-ment by current standards. The second, by theIMPACT investigators,60 did pool results fromrandomised trials of no post-surgical treatment totherapy with 5-FU and leucovorin, the current USstandard of care. However, these investigators didnot include data from two large trials testing 5-FUand levamisole versus control, which despite hav-ing the 5-FU modulated by a different agent didindeed test a very similar question, with 5-FU andlevamisole shown in large randomised trials togive results indistinguishable from 5-FU and leu-covorin. Additionally, the IMPACT investigatorsused a less powerful analysis than might be pos-sible. The original trials included in the IMPACTanalysis included patients with both stage 2 and3 cancers. In these trials, treatment was provenbeneficial overall. Therefore, the most relevantquestion is whether the effect of treatment dif-fered between the stage 2 patients and the stage 3patients. This question could be tested by obtain-ing all of the data from the trials, examiningthe degree of benefit overall, and then testingfor a stage by treatment interaction. This is themost statistically powerful method for testing the

treatment benefit in both subgroups of patients;the absence of a significant interaction impliesthat there is no evidence that the benefit of treat-ment differs by patient subgroup.

Two adjuvant rectal trials conducted by theNSABP, trials R-0174 and R-02,75 can be usedto demonstrate a different feature of subgroupanalyses. The first trial, R-01, demonstrated asignificant benefit in terms of overall survivalfor the addition of chemotherapy followingresection compared to no post-surgical treatment.However, in subgroup analyses, this benefitseemed to be limited to the male patients,which could not be explained. The NSABP isto be commended for treating this finding ashypothesis generating, and testing the hypothesisin their next study R-02. In study R-02, therandomisation scheme differed for men andwomen, with females being randomised to twoarms and males to four arms. The results of R-02 did not demonstrate the need for differenttreatment for the two genders, which put theissue to rest after being tested as appropriate ina randomised trial. This experience demonstratesthe value of confirming a finding that results froma subgroup analysis prior to accepting the resultinto clinical practice.

TUMOUR MARKER STUDIES

A second area of considerable interest and debatein the GI cancer community regards the useof putative prognostic and predictive markers toguide the choice of therapy for an individualpatient. Hundreds if not thousands of studies havebeen done on markers based on immunohisto-chemistry, flow cytometry, chromosomal markerssuch as allelic loss and microsatellite instability,pathologic features, and many others. Unfortu-nately, few if any of these markers have madetheir way into clinical practice. The reasons forthis lack of progress are many,80 we will focushere on three that are directly related to thelater stages of clinical trials: analyses confinedto patient subgroups, inadequate sample size andimproper design.

In any report investigating tumour markersbased on patients from clinical trials, the rate

Page 138: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 131

of sample collection is an important issue. Inthe case of retrospective trials conducted on tis-sue obtained from a clinical trial, often tissue isavailable on only a subset of the patients whoentered the initial trial. In these retrospectivemarker studies, comparisons are frequently madebetween characteristics such as baseline demo-graphics and/or tumour stage for the patientswhose tumours were used in the marker studyand those whose tissues were not used. How-ever, even if the characteristics for the patientswho were used in the analysis and those whowere not appear similar, the results of such stud-ies could still be biased. Such an example hasbeen described by Pajak et al.,81 who reported ona study originally conducted by Grignon et al.82

In their study, in patients with advanced prostateadenocarcinoma, Grignon et al. studied the rela-tionship between tumour p53 status and progno-sis in 129 of 456 patients entered onto a trialconducted by the Radiation Therapy OncologyGroup (RTOG study 86-10). When reanalysedby Pajak et al., it was shown that the distribu-tion of treatment received and combined Glea-son Score of patients who had p53 assessedwas identical to those who did not have p53assessed. However, the survival of those patientswho had a p53 determination performed on theirtumour, and were thus included in the study, wassignificantly worse than those not included inthe study.

This example demonstrates the possible pitfallsof performing marker studies on patient subsets.Despite such examples, the approach of collect-ing a set of patients with tissue available, testinga possible prognostic or predictive marker, andreporting the results continues to be the methodby which most markers are examined. The rea-sons for this are many: expediency, ethical issuesof mandatory tissue submission, and policies ofinformed consent and institutional review boards.It does raise the issue of whether for futureprospective studies, tissue submission should bemandatory prior to patient entry on a therapeu-tic clinical trial. Such a discussion raises ethicaland legal issues that are beyond the scope ofthis work, but such discussions are ongoing for

several Phase III trials that are in developmentin the US.

SAFETY MONITORING

As a final clinical trials methodologic issue,we consider the process of monitoring patientsafety in clinical trials. Clinical trials havebeen conducted for many years, and detailedand effective methods have been developed forensuring the safety of participants. One of thefundamental tenants of clinical trials is theprogression of an agent or regimen througha series of trials, starting in small, typicallysingle-centre Phase I studies, on to somewhatlarger, possibly multi-centre Phase II trials, thenon to Phase III trials. The sequential nature ofsuch trials, among other features, allows forconsiderable experience with an agent or drugcombination prior to a large, multi-centre trial.The recent pressure towards developing andtesting agents more rapidly, although beneficialin many ways, does challenge this establishedmechanism for establishing the safety of an agent.More Phase I/II, Phase II/III, or even Phase I/IIItrials are being conducted, the result of whichis that agents or combinations are being pushedinto the multi-centre setting more rapidly that inthe past. This warrants an examination of whycaution is warranted as agents are taken fromPhase I testing to larger trials.

Phase I trials have a successful history as thefirst step in testing new cancer therapies. How-ever, several factors must be considered as anew agent or combination of agents is takenfrom a Phase I trial to a Phase II or Phase IIItrial. These factors relate to possible differencesbetween patients entered onto Phase I trials andthose entered on later trials. First, Phase I tri-als often include patients with any type of solidtumour. This speeds the completion of the trial,and is justified because in many cases the tolera-bility of an agent is not related to the location ofthe patient’s primary tumour. However, in someinstances the tolerability of an agent may dif-fer in patients with different tumour types. Sec-ond, Phase I trials are frequently conducted at

Page 139: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

132 TEXTBOOK OF CLINICAL TRIALS

highly specialised cancer centres, where medicalpersonnel experienced with detailed monitoringand rapid intervention reduce the consequencesof and future risk for toxicity whenever possi-ble. In contrast, Phase II and III trials are oftenconducted in the community setting, where theclinical staff may be using a new agent or com-bination of agents for the first time.

A final reason for caution is that patients areenrolled onto Phase I trials only after failing allstandard therapies. This has several implications.Patients enrolled on Phase I trials have typicallybeen previously exposed to cytotoxic agents, thusthey may be physically or emotionally morerobust and tolerate a new therapy better than apatient in the first-line setting. Additionally, onlypatients that have tolerated their initial therapyacceptably are likely to choose enrollment, orbe considered candidates, for a Phase I trial. Afinal issue with respect to patients who havebeen previously treated is that patients and theirphysicians may have a sense of the rate of diseaseprogression such that patients with the mostvirulent disease are often not even considered forenrollment in a Phase I trial.

In part due to the combination of these variousfactors, study sponsors, including the NationalCancer Institute, have developed sophisticatedsystems to aid in the timely identification oftoxicity problems in ongoing studies. Despitethese systems, gaps remain. For example foragents that are commercially available, expeditedreporting of severe but expected events may notbe required. If such an event were occurring ata greater frequency than expected, and expeditedreporting was not required, a multi-centre trialmay not detect such an incidence for weeks ormonths due to the nature of the process of datacollection, editing, entry and analysis. This hasled some groups to propose supplements to thestandard systems to collect data on all severeevents in a timely manner.83

Even when data is reported in a timely manner,care must be taken in interpreting the datareceived. As an example, consider the experiencein two large Phase III randomised trials reportedon by Rothenberg et al.84 The Rothenberg et al.

report was prompted by the finding of anunexpected number of early deaths on two GIcancer trials, one in advanced colorectal cancerand the other in adjuvant cancer. In one of thetrials in the review, 23 patients were reportedto have died within 60 days of initiation oftherapy. When these 23 events were reportedto the group operations office for this trial,only 10 were deemed to have been related totreatment. However, upon independent review,16 of the 23 events were deemed treatment-related or treatment-induced.

Several conclusions can be drawn regard-ing toxicity in clinical trials. First, as the paceof drug development continues to quicken, itis likely that there will continue to be agentspushed into large trials prior to full and exten-sive Phase I and II testing. Second, effectivesystems must be established to monitor toxic-ity in trials of all phases. Third, an independentassessment of the attribution of an event maybe beneficial, as local investigators may be hes-itant to attribute a poor event to a treatment.In addition, new metrics (such as 60-day all-cause mortality85) and new terminology (suchas treatment-induced, treatment-exacerbated andtreatment-unrelated deaths84) may be helpful instandardising the reporting of adverse events inclinical trials.

CASE STUDY: 5-FU PLUS LEUCOVORININ COLON CANCER

As is clear, the history of clinical trials in GIcancer is long and has been very successful. Asan example illustrating several facets of boththe past history of GI clinical trials and issuesthat will likely be faced again in future studies,here we present a case study of the development,establishment and replacement of what was oncethe US standard of care for advanced colorectalcancer and, as of 2002, remains the standard foradjuvant stage 3 colon cancer, the ‘Mayo Clinic’bolus regimen of 5-FU and leucovorin deliveredfor 5 consecutive days every 4 or 5 weeks.

The activity of fluorinated pyrimidines in thetreatment of GI cancers has been reviewed

Page 140: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 133

extensively. 5-Fluorouracil (5-FU) is the mostubiquitous of the fluorinated pyrimidines, whichat least in part exert their antineoplastic effect byinhibiting the activity of the enzyme thymidylatesynthase (TS), which in turn interferes with DNAsynthesis in dividing cells. Often agents designedto improve the efficacy of fluorinated pyrimidinesare combined with these agents in an effortto preferentially sensitise tumour cells relativeto host cells to the agent(s). Leucovorin is anagent commonly used in such a setting. TheMayo regimen of 5-FU and leucovorin is thus acombination of an active chemotherapy agent, 5-FU, with a ‘biochemical modulator’ leucovorin.

Prior to the early 1980s, 5-FU was primarilyadministered as a single agent. Administeredin this fashion, it was associated with limitedactivity and moderate toxicity. Response ratesfor metastatic colorectal cancer were low, in theneighbourhood of 10%, and these responses wereshort-lived, lasting on average a few months.61

Based on pre-clinical laboratory studies,86 – 88

the addition of leucovorin to cell culture with oneof the metabolites of 5-FU, fluorodeoxuridylatemonophosphate (FdUMP), resulted in enhancedbinding to and inhibition of TS as comparedto the binding when FdUMP was used alone.This improved inhibition of thymidylate synthaseresulted in inhibited DNA synthesis and resultedin enhanced tumour shrinkage. Depending on themodel systems, optimal concentration of leucov-orin ranged from leucovorin 1–20 mmol/L.89 – 93

These studies supported the use of leucovorindoses ranging from 10 to 600 mg/m2 in clini-cal trials where leucovorin was added to 5-FUin an effort to improve on 5-FU’s single agentactivity. While such laboratory studies providedbasic information on the modulation of 5-FUusing leucovorin, the applicability of these resultsto humans with colorectal cancer was unclear.Based on clinical experience, individuals withcolorectal cancer clearly exhibit significant het-erogeneity in their response to treatment. Thesequence of administration of 5-FU and leucov-orin, the optimal concentration of leucovorin, andthe appropriate interval of 5-FU and leucovorinadministration all were variables to be studied to

explore the efficacy of 5-FU and leucovorin ininhibiting tumour growth.

Early investigators studying the biochemicalmodulation of 5-FU with leucovorin in the treat-ment of colorectal and gastric cancers includedMachover and colleagues.94,95 The Machoverregimen consisted of administering high-doseleucovorin at 200 mg/m2/d prior to 5-FU at adose of 370 mg/m2/d, with both drugs givenconsecutively for 5 days. With this dose ofleucovorin, the blood level is approximately10–20 µmol/L.96 In large part to lower the costof the regimen (leucovorin was very expensiveat the time), the ‘Mayo’ regimen was devised touse the identical 5-FU schedule to the Machoverregimen, but to use low-dose leucovorin at a doseof 20 mg/m2/d, which resulted in blood levels of1–2 µmol/L.

This regimen was first tested as part of a ran-domised Phase II study in advanced unresectablecolorectal cancer.97 Three of the treatment armsare relevant for this discussion: (1) 5-FU as a sin-gle agent administered at a dose of 500 mg/m2/dby IV bolus for 5 consecutive days every5 weeks; (2) the Machover regimen repeated at4 weeks, 8 weeks and every 5 weeks thereafter;and (3) the Mayo regimen repeated at the samefrequency as the Machover regimen. In this trial,provision was made in the protocol to escalatethe 5-FU dose on any treatment arm if there wasno observed myelosuppression or significant non-haematologic toxicity during the previous treat-ment course. When the toxicity was analysedafter treatment of the first 100 patients, the start-ing dose of 5-FU for the Mayo regimen wasincreased to 425 mg/m2/d in order to producedefinite but tolerable toxicity that was of simi-lar magnitude between the six treatment arms.98

The original combination of low-dose leucovorinwith 370 mg/m2/d of 5-FU for 5 consecutivedays was empiric; no formal Phase I trial of thisregimen had ever been performed. In the 208 eli-gible patients entered on the three study arms ofinterest, the overall response rates were 10% for5-FU alone, 26% for the Machover regimen, and43% for the Mayo regimen. Both leucovorin reg-imens demonstrated significant improvement in

Page 141: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

134 TEXTBOOK OF CLINICAL TRIALS

response rate and overall survival compared to5-FU alone.

Concurrent to the previously mentioned study,investigators at the Roswell Park MemorialCancer Institute (RPMI) began testing a reg-imen of leucovorin 500 mg/m2/d with 5-FU600 mg/m2/d given for 6 consecutive weeks fol-lowed by a 2-week rest period.99 In a small study,the RPMI regimen was shown to significantlyimprove the tumour response rate compared tosingle agent 5-FU. Shortly thereafter, the RPMIand Mayo regimens were compared in a ran-domised trial of 366 patients.100 In this trial, theobjective response rates and overall survival wassimilar between the two arms. The toxicity profileof the two regimens did differ, but no clear win-ner was identified. Based largely on cost consid-erations, investigators from the Mayo Clinic andthe North Central Cancer Treatment group choseto pursue the Mayo regimen for future testing.

The activity seen with the combination ofleucovorin and 5-FU in the advanced diseasesetting naturally led to the evaluation of severalof these regimens in the adjuvant treatment ofpatients with stage 2 and 3 colon cancer. Ina study that was suspended after accrual of317 patients (based on the results of a largetrial that demonstrated 5-FU plus levamisole wasan effective treatment in this setting49), patientswith resected stage 2 or 3 colon cancer wererandomised to the Mayo 5-FU plus leucovorinregimen for 6 months or to a no treatment controlarm.53 The 5-year survival for treated patientswas 74%, compared to 63% in the control group(p = 0.02). This result established the efficacyof the Mayo 5-FU plus leucovorin regimen inthe adjuvant setting.

Following this small study, a large trial wasconducted to test four different combinationsof 5-FU with leucovorin and/or levamisole inpatients with stage 2 and 3 colon cancer.

The regimens included the Mayo 5-FU plusleucovorin regimen for 6 months, 5-FU pluslevamisole for 12 months, 5-FU with high-doseleucovorin (the RPMI regimen) for 8 months,or 5-FU plus leucovorin plus levamisole for12 months. In this study of 3759 patients, results

were similar between the Mayo and RPMI 5-FUplus leucovorin programmes, and the 5-FU plusboth leucovorin and levamisole regimen.55 Basedon the essentially identical activity profiles ofthese regimens, the choice between the two 5-FU and leucovorin regimens (Mayo and RPMI)has been based on issues related to schedule(some patients preferred weekly therapy overfive consecutive days of treatment), cost (at thetime of these studies leucovorin was expensive),toxicity profile and clinician’s preference.

From the late 1980s until the year 2000,the Mayo regimen of 5-FU and leucovorin wasregarded as the standard of care for advancedcolon cancer. As discussed previously, in the late1990s and early 2000s, several randomised tri-als were conducted in both the US and Europe inwhich infusion-based 5-FU regimens or regimensthat combine 5-FU with CPT-11 or oxaliplatinhave demonstrated improved patient outcomescompared to those seen with the Mayo regi-men. In addition, the oral agent capecitabinehas been approved as an alternative to IV 5-FUin advanced disease. Thus it appears that inthe advanced disease setting, the Mayo 5-FU+ leucovorin regimen has been replaced as thestandard of care, indeed a welcome advance.In the adjuvant setting, no randomised trial hasbeen completed that demonstrates improved over-all survival for a multiple drug combination, orequivalence for an oral regimen, although trials inboth areas have completed accrual and are await-ing results. Increased toxicity has clearly beendemonstrated for multiple drug combinations inthe adjuvant setting,84 demonstrating the value ofwaiting for the results from these definitive tri-als before adopting a promising new therapy asa standard of care in the community.

REFERENCES

1. Launois B, Delarue D, Campion JP, Kerbaol M.Preoperative radiotherapy for carcinoma of theesophagus. Surg, Gynecol Obstet (1981) 153:690–2.

2. Gignoux M, Roussel A, Paillot B, Gillet M,Schlag P, Favre J-P, Dalesio O, Buyse M,

Page 142: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 135

Duez N. The value of preoperative radiotherapyin esophageal cancer: results of a study of theE.O.R.T.C. World J Surg (1987) 11: 426–32.

3. Kelsen DP, Ginsberg R, Pajak TF, Sheahan DG,Gunderson L, Mortimer J, Estes N, Haller DG,Ajani J, Kocha W, Minsky BD, Roth JA. Che-motherapy followed by surgery compared withsurgery alone for localized esophageal cancer.New Engl J Med (1998) 339: 1979–84.

4. Kelsen DP, Minsky BD, Smith M, Beitler J,Niedzwiecki D, Chapman D, Bains M, Burt M,Heelan R, Hilaris B. Preoperative therapy foresophageal cancer: a randomized comparison ofchemotherapy versus radiation therapy. J ClinOncol (1990) 8: 1352–61.

5. Bosset J-F, Gignoux M, Triboulet J-P, Tiret E,Mantion G, Elias D, Lozach P, Ollier J-C, PavyJ-J, Mercier M, Sahmoud T. Chemoradiotherapyfollowed by surgery compared with surgery alonein squamous-cell cancer of the esophagus. NewEngl J Med (1997) 337: 161–7.

6. Le Prise E, Etienne PL, Meunier B, Maddern G,Hassel MB, Gedouin D, Boutin D, Campion JP,Launois B. A randomized study of chemother-apy, radiation therapy, and surgery versus surgeryfor localized squamous cell carcinoma of theesophagus. Cancer (1994) 73: 1779–84.

7. Urba SG, Orringer MB, Turrisi A, Iannettoni M,Forastiere A, Strawderman M. Randomized trialof preoperative chemoradiation versus surgeryalone in patients with locoregional esophagealcarcinoma. J Clin Oncol (2001) 19: 305–13.

8. Walsh TN, Noonan N, Hollywood D, Kelly A,Keeling N, Hennessy TPJ. A comparison ofmultimodal therapy and surgery for esophagealadenocarcinoma. New Engl J Med (1996) 335:462–7.

9. Herskovic A, Martz K, Al-Sarraf M, Leich-man L, Brindle J, Vaitkevicius V, Cooper J,Byhardt R, Davis L, Emami B. Combined che-motherapy and radiotherapy compared withradiotherapy alone in patients with cancer of theesophagus. New Engl J Med (1992) 326: 1593–8.

10. Minsky BD, Berkey B, Kelsen DK, Ginsberg R,Pisansky T, Martenson J, Komaki R, Okawara G,Rosenthal, S. Preliminary results of intergroupINT 0123 randomized trial of combined modalitytherapy (CMT) for esophageal cancer: standardvs. high dose radiation therapy. Proc Am Soc ClinOncol (2000) 19: 239a (abstr. 927).

11. Jemal A, Thomas A, Murray T, Thun M. Cancerstatistics, 2000. CA: a Cancer J Clin (2002) 52:23–47.

12. Bonenkamp JJ, Hermans J, Sasako M, van deVelde JH. Extended lymph node dissection forgastric cancer. New Engl J Med (1999) 340:908–14.

13. Cuschieri A, Weeden S, Fielding J, Bancewicz J,Craven J, Joypaul V, Sydes M, Fayers P. Patientsurvival after D1 and D2 resections for gastriccancer: long-term results of the MRC random-ized surgical trial. Surgical Cooperative Group.Br J Cancer (1999) 79: 1522–30.

14. Janunger KG, Hafstrom L, Nygren P, Gli-melius B. A systematic overview of chemother-apy effects in gastric cancer. Acta Oncol (2001)40: 309–26.

15. Macdonald JS, Smalley SR, Benedetti J, Hun-dahl SA, Estes NC, Haller DG, Ajani JA, Gun-derson LL, Jessup JM, Martenson JA. Chemora-diotherapy after surgery compared with surgeryalone for adenocarcinoma of the stomach or gas-troesophageal junction. New Engl J Med (2001)345: 725–30.

16. Webb A, Cunningham D, Scarffe JH, Harper P,Norman A, Joffe JK, Hughes M, Mansi J, Find-lay M, Hill A, Oates J, Nicolson M, Hickish T,O’Brien M, Iveson T, Watson M, Underhill C,Wardley A, Meehan M. Randomized trial com-paring epirubicin, cisplatin, and fluorouracil ver-sus fluorouracil, doxorubicin, and methotrexatein advanced esophagogastric cancer. J Clin Oncol(1997) 15: 261–7.

17. Moertel CG, Frytak S, Hahn RG, O’Connell MJ,Reitemeier RJ, Rubin J, Schutt AJ, Weiland LH,Childs DS, Holbrook MA, Lavin PT, Livs-tone E, Spiro H, Knowlton A, Kalser M, Bar-kin J, Lessner H, Mann-Kaplan R, Ramming K,Douglas HO, Thomas P, Nave H, Bateman J,Lokich J, Chaffey J, Corson JM, Zamcheck N,Noval JW. Therapy of locally unresectable pan-creatic carcinoma: a randomized comparison ofhigh dose (6000 Rads) radiation alone, moder-ate dose radiation (4000 Rads +5−fluorouracil),and high dose radiation +5−fluorouracil. Cancer(1981) 48: 1705–10.

18. Klaassen DJ, MacIntyre JM, Catton GE, Eng-strom PF, Moertel CG. Treatment of locallyunresectable cancer of the stomach and pancreas:a randomized comparison of 5-fluorouracil alonewith radiation plus concurrent and maintenance5-fluourouracil – an Eastern Cooperative Oncol-ogy Group Study. J Clin Oncol (1985) 3: 373–8.

19. Gastrointestinal Tumor Study Group. Treatmentof locally unresectable carcinoma of the pan-creas: comparison of combined-modality therapy(chemotherapy plus radiotherapy) to chemother-apy alone. J Natl Cancer Inst (1988) 80: 751–5.

20. Neoptolemos JP, Dunn JA, Stocken DD, Al-mond J, Link K, Beger H, Bassi C, Falconi M,Pederzoli P, Dervenis C, Fernandez-Cruz L,Lacaine F, Pap A, Spooner D, Kerr DJ, Friess H,Buchler MW, Members of the European Study

Page 143: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

136 TEXTBOOK OF CLINICAL TRIALS

Group for Pancreatic Cancer. Adjuvant chemora-diotherapy and chemotherapy in resectable pan-creatic cancer: a randomised controlled trial.Lancet (2001) 358: 1576–85.

21. Burris HA, Moore MJ, Andersen J, Green MR,Rothenberg ML, Modiano MR, Cripps MC, Port-enoy RK, Storniolo AM, Tarassoff P, Nelson R,Dorr FA, Stephens CD, Von Hoff DD. Improve-ments in survival and clinical benefit with gem-citabine as first-line therapy for patients withadvanced pancreas cancer: a randomized trial. JClin Oncol (1997) 15: 2403–13.

22. Moore MJ, Hamm J, Eisenberg P, Dagenais M,Hagan K, Fields A, Greenberg B, Schwartz B,Ottaway J, Zee B, Seymour L. A comparisonbetween gemcitabine (GEM) and the matrixmetalloproteinase (MMP) inhibitor BAY12-9566(9566) in patients (pts) with advanced pancreaticcancer. Proc Am Soc Clin Oncol (2000) 19: 240a(abstr. 930).

23. McKeown-Eyssen GE, Bright-See E, Bruce WR,Jazmaji V, Cohen LB, Pappas SC, Saibil FG. Arandomized trial of a low fat high fibre diet in therecurrence of colorectal polyps. Toronto PolypPrevention Group. J Clin Epidemiol (1994) 47:525–36.

24. MacLennan R, Macrae F, Bain C, Battistutta D,Chapuis P, Gratten H, Lambert J, Newland RC,Ngu M, Russell A. Randomized trial of intake offat, fiber, and beta carotene to prevent colorec-tal adenomas. The Australian Polyp PreventionProject. J Natl Cancer Inst (1995) 87: 1760–6.

25. Bonithon-Kopp C, Kronborg O, Giacosa A,Rath U, Faivre J. Calcium and fibre supplemen-tation in prevention of colorectal adenoma recur-rence: a randomised intervention trial. EuropeanCancer Prevention Organisation Study Group.Lancet (2000) 356: 1300–6.

26. Alberts DS, Martinez ME, Roe DJ, Guillen-Rodriguez JM, Marshall JR, van Leeuwen JB,Reid ME, Ritenbaugh C, Vargas PA, Bhatta-charyya AB, Earnest DL, Sampliner RE. Lackof effect of a high-fiber cereal supplement onthe recurrence of colorectal adenomas. PhoenixColon Cancer Prevention Physicians’ Network.New Engl J Med (2000) 342: 1156–62.

27. Schatzkin A, Lanza E, Corle D, Lance P, Iber F,Caan B, Shike M, Weissfeld J, Burt R, CooperMR, Kikendall JW, Cahill J. Lack of effect ofa low-fat, high-fiber diet on the recurrenceof colorectal adenomas. Polyp Prevention TrialStudy Group. New Engl J Med (2000) 342:1149–55.

28. Ponz de Leon M, Roncucci L. Chemopreventionof colorectal tumors: role of lactulose and ofother agents. Scand J Gastroenterol Suppl (1997)222: 72–5.

29. Hofstad B, Almendingen K, Vatn M, Ander-sen SN, Owen RW, Larsen S, Osnes M. Growthand recurrence of colorectal polyps: a double-blind 3-year intervention with calcium andantioxidants. Digestion (1998) 59: 148–56.

30. McKeown-Eyssen G, Holloway C, Jazmaji V,Bright-See E, Dion P, Bruce WR. A randomizedtrial of vitamins C and E in the preventionof recurrence of colorectal polyps. Cancer Res(1988) 48: 4701–5.

31. Greenberg ER, Baron JA, Tosteson TD, Free-man DH Jr, Beck GJ, Bond JH, Colacchio TA,Coller JA, Frankl HD, Haile RW. A clinical trialof antioxidant vitamins to prevent colorectal ade-noma. Polyp Prevention Study Group. New EnglJ Med (1994) 331: 141–7.

32. Bergsma-Kadijk JA, van’t Veer P, Kampman E,Burema J. Calcium does not protect againstcolorectal neoplasia. Epidemiology (1996) 7:590–7.

33. Baron JA, Beach M, Mandel JS, van Stolk RU,Haile RW, Sandler RS, Rothstein R, SummersRW, Snover DC, Beck GJ, Bond JH, Green-berg ER. Calcium supplements for the preventionof colorectal adenomas. Calcium Polyp Preven-tion Study Group. New Engl J Med (1999) 340:101–7.

34. The Women’s Health Initiative Study Group.Design of the women’s health initiative clinicaltrial and observational study. Contr Clin Trials(1998) 19: 61–109.

35. Gann PH, Manson JE, Glynn RJ, Buring JE,Hennekens CH. Low-dose aspirin and incidenceof colorectal tumors in a randomized trial. J NatlCancer Inst (1993) 85: 1220–4.

36. Sturmer T, Glynn RJ, Lee IM, Manson JE, Bur-ing JE, Hennekens CH. Aspirin use and colorec-tal cancer: post-trial follow-up data from thePhysicians’ Health Study. Ann Int Med (1998)128: 713–20.

37. Steinbach G, Lynch PM, Phillips RK, WallaceMH, Hawk E, Gordon GB, Wakabayashi N,Saunders B, Shen Y, Fujimura T, Su LK, LevinB. The effect of celecoxib, a cyclooxygenase-2 inhibitor, in familial adenomatous polyposis.New Engl J Med (2000) 342: 1946–52.

38. Jorgensen OD, Kronborg O, Fenger C. A ran-domised study of screening for colorectal cancerusing faecal occult blood testing: results after13 years and seven biennial screening rounds.Gut (2002) 50: 29–32.

39. Scholefield JH, Moss S, Sufi F, Mangham CM,Hardcastle JD. Effect of faecal occult bloodscreening on mortality from colorectal cancer:results from a randomised controlled trial. Gut(2002) 50: 840–4.

Page 144: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 137

40. Mandel JS, Church TR, Bond JH, Ederer F,Geisser MS, Mongin SJ, Snover DC, Schu-man LM. The effect of fecal occult-blood screen-ing on the incidence of colorectal cancer. NewEngl J Med (2000) 343: 1603–7.

41. Winawer SJ, Fletcher RH, Miller L, Godlee F,Stolar MH, Mulrow CD, Woolf SH, Glick SN,Ganiats TG, Bond JH, Rosen L, Zapka JG, OlsenSJ, Giardiello FM, Sisk JE, Van Antwerp R,Brown-Davis C, Marciniak DA, Mayer RJ. Col-orectal cancer screening: clinical guidelinesand rationale. Gastroenterology (1997) 112:594–642.

42. Smith RA, Cokkinides V, von Eschenbach AC,Levin B, Cohen C, Runowicz CD, Sener S,Saslow D, Eyre HJ. American Cancer Societyguidelines for the early detection of cancer. CA:a Cancer J Clin (2002) 52: 8–22.

43. Weissfeld JL, Ling BS, Schoen RE, BresalierRS, Riley T, Prorok PC. Adherence to repeatscreening flexible sigmoidoscopy in the Prostate,Lung, Colorectal, and Ovarian (PLCO) CancerScreening Trial. Cancer (2002) 94: 2569–76.

44. UK Flexible Sigmoidoscopy Screening TrialInvestigators. Single flexible sigmoidoscopyscreening to prevent colorectal cancer: baselinefindings of a UK multicentre randomised trial.Lancet (2002) 359: 1291–300.

45. Ness RM, Holmes AM, Klein R, Dittus R. Cost-utility of one-time colonoscopic screening forcolorectal cancer at various ages. Am J Gastroen-terol (2000) 95: 1800–11.

46. Winawer SJ, Zauber AG, Church TR, Mandel-son M, Feld A, Bond JH. National ColonoscopyStudy (NCS) preliminary findings: a random-ized clinical trial of general population screeningcolonoscopy. Gastroenterology (2002) T1560.

47. Stocchi L, Nelson H. Laparoscopic colectomyfor colon cancer: trial update. J Surg Oncol(1998) 68: 255–67.

48. Laurie JA, Moertel CG, Fleming TR, WieandHS, Leigh JE, Rubin J, McCormack GW, Gerst-ner JB, Krook JE, Mailliard J, Twito DI, MortonRF, Tschetter LK, Barlow JF. Surgical adjuvanttherapy of large-bowel carcinoma: an evalua-tion of levamisole and the combination of lev-amisole and fluorouracil. J Clin Oncol (1989) 7:1447–56.

49. Moertel CG, Fleming TR, Macdonald JS, HallerDG, Laurie JA, Goodman PJ, Ungerleider JS,Emerson WA, Tormey DC, Glick JH, VeederMH, Mailliard JA. Levamisole and fluorouracilfor adjuvant therapy of resected colon carcinoma.New Engl J Med (1990) 322: 352–8.

50. NIH Consensus Conference. Adjuvant therapyfor patients with colon and rectal cancer. J AmMed Assoc (1990) 264: 1444–50.

51. International Multicentre Pooled Analysis ofColon Cancer Trials (IMPACT) Investigators.Efficacy of adjuvant fluorouracil and folinic acidin colon cancer. Lancet (1995) 345: 939–44.

52. Francini G, Petrioli R, Lorenzini L, Mancini S,Armenio S, Tanzini G, Marsili S, Aquino A,Marzocca G, Civitelli S, Mariani L, De Sando D,Bovenga S, Lorenz M. Folinic acid and 5-fluorouracil as adjuvant chemotherapy in coloncancer. Gastroenterology (1994) 106: 899–906.

53. O’Connell MJ, Mailliard JA, Kahn MJ, Mac-donald JS, Haller DG, Mayer RJ, Wieand HS.Controlled trial of fluorouracil and low-doseleucovorin given for 6 months as postoperativeadjuvant therapy for colon cancer. J Clin Oncol(1997) 15: 246–50.

54. QUASAR Collaborative Group. Comparison offluorouracil with additional levamisole, higher-dose folinic acid, or both, as adjuvant chemother-apy for colorectal cancer: a randomised trial.Lancet (2000) 355: 1588–96.

55. Haller DG, Catalano PJ, Macdonald JS. Fluo-rouracil (FU), leucovorin (LV) and levamisole(LEV) adjuvant therapy for colon cancer: five-year final report of INT-0089. Proc Am Soc ClinOncol (1998) 17: 256a (abstr. 982).

56. Poplin E, Benedetti N, Estes D, Haller DG,Mayer R, Goldberg R, Macdonald J. Phase IIIrandomized trial of bolus 5-FU/leucovorin/levamisole versus 5-FU continuous/infusion/levamisole as adjuvant therapy for high riskcolon cancer (SWOG 9415/INT-0153). Proc AmSoc Clin Oncol (2000) 19: 240a.

57. Wolmark N, Fisher B, Rockette H. Postoperativeadjuvant chemotherapy or BCG for colon cancer:results from NSABP protocol C-01. J NatlCancer Inst (1988) 80: 30–6.

58. Sargent DJ, Goldberg RM, Jacobson SD, Mac-donald JS, Labianca R, Haller DG, Shepherd LE,Seitz JF, Francini G. A pooled analysis of adju-vant chemotherapy for resected colon cancer inelderly patients. New Engl J Med (2001) 345:1091–7.

59. Mamounas E, Wieand S, Wolmark N, Bear HD,Atkins JN, Song K, Jones J, Rockette H. Com-parative efficacy of adjuvant chemotherapy inpatients with Dukes’ B versus Dukes’ C coloncancer: results from four National Surgical Adju-vant Breast and Bowel Project Adjuvant Stud-ies (C-01, C-02, C-03, and C-04). J Clin Oncol(1999) 17: 1349–55.

60. International Multicentre Pooled Analysis of B2Colon Cancer Trials (IMPACT B2) Investigators.Efficacy of adjuvant fluorouracil and folinic acidin B2 colon cancer. J Clin Oncol (1999) 17:1356–63.

Page 145: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

138 TEXTBOOK OF CLINICAL TRIALS

61. Advanced Colorectal Cancer Meta-analysis Pro-ject. Modulation of fluorouracil by leucovorinin patients with advanced colorectal cancer:evidence in terms of response rate. J Clin Oncol(1992) 10: 896–903.

62. Meta-Analysis Group in Cancer. Efficacy ofintravenous continuous infusion of fluorouracilcompared with bolus administration in advancedcolorectal cancer. J Clin Oncol (1998) 16: 301–8.

63. Meta-Analysis Group in Cancer. Toxicity offluorouracil in patients with advanced colorectalcancer: effect of administration schedule andprognostic factors. J Clin Oncol (1998) 16:3537–41.

64. Hoff PM, Ansari R, Batist G. Comparison oforal capecitabine versus intravenous fluorouracilplus leucovorin as first-line treatment in 605patients with metastatic colorectal cancer: resultsof a randomized phase III study. J Clin Oncol(2001) 19: 2282–92.

65. Van Cutsem E, Twelves C, Cassidy J. Oralcapecitabine compared with intravenous flu-orouracil plus leucovorin in patients withmetastatic colorectal cancer: results of a largephase III study. J Clin Oncol (2001) 19:4097–106.

66. Rougier P, Van Cutsem E, Bajetta E. Randomisedtrial of irinotecan versus fluorouracil by continu-ous infusion after fluorouracil failure in patientswith metastatic colorectal cancer. Lancet (1998)352: 1407–12.

67. Cunningham D, Pyrhonen S, James RD. Ran-domised trial of irinotecan plus supportive careversus supportive care alone after fluorouracilfailure for patients with metastatic colorectal can-cer. Lancet (1998) 352: 1413–18.

68. Saltz LB, Cox JV, Blanke C, Rosen LS, Fehren-bacher L, Moore MJ, Maroun JA, Ackland SP,Locker PK, Pirotta N, Elfring GL, Miller LL.Irinotecan plus fluorouracil and leucovorin formetastatic colorectal cancer. New Engl J Med(2000) 343: 905–14.

69. Douillard JY, Cunningham D, Roth AD,Navarro M, James RD, Karasek P, Jandik P,Iveson R, Carmichael J, Alakl M, Gruia G,Awad L, Rougier P. Irinotecan combined withfluorouracil compared with fluorouracil aloneas first-line treatment for metastatic colorectalcancer: a multicentre randomised trial. Lancet(2000) 355: 1041–7.

70. De Gramont A, Figer A, Seymour M, Home-rin M, Hmissi A, Cassidy J, Boni C, Cortes-Funes H, Cervantes A, Freyer G, PapamichaelD, Le Bail N, Louvet C, Hendler D, de Braud F,Wilson C, Morvan F, Bonetti A. Leucovorin and

fluorouracil with or without oxaliplatin as first-line treatment in advanced colorectal cancer. JClin Oncol (2000) 18: 2938–47.

71. Goldberg RM, Morton RF, Sargent DJ, Fuchs C,Ramanathan S, Williamson S, Findlay B,NCCTG, CALGB, ECOG, SWOG, NCIC.N9741 oxaliplatin (Oxal) or CPT-11 + 5-fluorouracil (5FU)/leucovorin (LV) or oxal +CPT-11 in advanced colorectal cancer (CRC).Initial toxicity and response data from a GIintergroup study. Proc Am Soc Clin Oncol (2002)21: 128a.

72. Krook JE, Moertel CG, Gunderson LL, Wie-and HS, Collins RT, Beart RW, Kubista TP,Poon MA, Meyers WC, Mailliard JA, Twito DI,Morton RF, Veeder MH, Witzig TE, Cha S, Vid-yarthi SC. Effective surgical adjuvant therapyfor high-risk rectal carcinoma. New Engl J Med(1991) 324: 709–15.

73. O’Connell MJ, Martenson JA, Wieand HS,Krook JE, Macdonald JS, Haller DG, Mayer RJ,Gunderson LL, Rich TA. Improving adjuvanttherapy for rectal cancer by combining protracted-infusion fluorouracil with radiation therapy aftercurative surgery. New Engl J Med (1994) 331:502–7.

74. Fisher B, Wolmark N, Rockette H, Redmond C,Deutsch M, Wickerham DL. Postoperative adju-vant chemotherapy or radiation therapy for rectalcancer: results from NSABP Protocol R-01. JNatl Cancer Inst (1988) 80: 21–9.

75. Wolmark N, Wieand HS, Hyams DM, Colan-gelo L, Dimitrov NV, Romond EH, Wexler M,Prager D, Cruz AB Jr, Gordon PH, PetrelliNJ, Deutsch M, Mamounas E, Wickerham DL,Fisher ER, Rockette H, Fisher B. Random-ized trial of postoperative adjuvant chemotherapywith or without radiotherapy for carcinoma of therectum: National Surgical Adjuvant Breast andBowel Project Protocol R-02. J Natl Cancer Inst(2000) 92: 388–96.

76. Swedish Rectal Cancer Trial. Improved survivalwith preoperative radiotherapy in resectable rec-tal cancer. New Engl J Med (1997) 336: 980–7.

77. Sauer R, Fietkau R, Wittekind C, Martus P,Rodel C, Hohenberger W, Jatzko G, Sabitzer H,Karstens JH, Becker H, Hess C, Raab R. Adju-vant versus neoadjuvant radiochemotherapy forlocally advanced rectal cancer. A progressreport of a phase-III randomized trial (protocolCAO/ARO/AIO-94). Strahlenther Onkol (2001)177: 173–8.

78. Enker WE, Thaler HT, Cranor ML, Polyak T.Total mesorectal excision in the operative treat-ment of carcinoma of the rectum. J Am Coll Surg(1995) 181: 335–46.

Page 146: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GASTROINTESTINAL CANCERS 139

79. Kapiteijn E, Marijnen CAM, Nagtegaal ID, Put-ter H, Steup WH, Wiggers T, Rutten HJT, Pahl-man L, Glimelius B, Han J, van Krieken HJM,Leer JWH, van de Velde JH. Preoperative radio-therapy combined with total mesorectal excisionfor resectable rectal cancer. New Engl J Med(2001) 345: 638–46.

80. Hammond MEH, Taube SE. Issues and barri-ers to development of clinically useful tumormarkers: a development pathway proposal. SeminOncol (2002) 29: 213–21.

81. Pajak TF, Clark GM, Sargent DJ, McShane LM,Hammond MEH. Statistical issues in tumormarker studies. Arch Pathol Lab Med (2000) 124:1011–15.

82. Grignon D, Caplan R, Sarkar F. p53 status andprognosis of locally advanced prostatic adenocar-cinoma. J Natl Cancer Inst (1997) 89: 158–65.

83. Sargent DJ, Goldberg RM, Mahoney MR, Hill-man DW, McKeough T, Hamilton SF, Darcy JM,Anderson VL, Krook JE, O’Connell MJ. Rapidreporting and review of an increased incidenceof a known adverse event. J Natl Cancer Inst(2000) 92: 1011–13.

84. Rothenberg ML, Meropol NJ, Poplin EA, VanCutsem E, Wadler S. Mortality associated withirinotecan plus bolus fluorouracil/leucovorin:summary findings of an independent panel. J ClinOncol (2001) 19: 3801–7.

85. Goldberg RM, Sargent DJ, Morton RF, MahoneyMR, Krook JE, O’Connell MJ. Sensing toxicityand adjusting ongoing clinical trials protocolsto enhance patient safety: the history and per-formance of the NCCTG ‘Real Time ToxicityMonitoring Program’. J Clin Oncol (2002) 20:4591–6.

86. Ullman B, Lee M, Martin DW Jr, Santi DV.Cytotoxicity of 5-fluoro-2-deoxyuridine: require-ment for reduced folate co-factors and antago-nism by methotrexate. Proc Natl Acad Sci (1978)75: 980–3.

87. Evans RM, Laskin JD, Hakala MT. Effect ofexcess folates and deoxyinosine on the activityand site of action of 5-fluorouracil. Cancer Res(1981) 41: 3288–95.

88. Waxman S, Bruckner H. The enhancement of 5-fluorouracil antimetabolic activity by leucovorin,menadione, and alpha-tocopherol. Eur J CancerClin Oncol (1982) 18: 685–92.

89. Keyomarsi K, Moran RG. Folinic augmentationof the effects of fluoropyrimidines on murine andhuman leukemic cells. Cancer Res (1986) 46:5229–35.

90. Houghton JA, Williams LG, Loftin SK. Rela-tionship between the dose rate of [6RS] leu-

covorin administration, plasma concentrations ofreduced folates, and pools of 5,10 methylenete-trahydrafolates and tetrahydrafolates in humancolon adenocarcinoma xenografts. Cancer Res(1990) 50: 3493–502.

91. Houghton JA, Williams LG, Loftin SK. Fac-tors that influence the therapeutic activity of5-fluorouracil [6RS] leucovorin combinationsin colon adenocarcinoma xenografts. CancerChemother Pharmacol (1992) 30: 423–32.

92. Houghton JA, Williams LG, Cheshire PJ. Influ-ence of the dose of [6RS] leucovorin onreduced folate pools and 5-fluorouracil-mediatedthymidylate synthase inhibition in human colonadenocarcinoma xenografts. Cancer Res (1990)50: 3940–6.

93. Cao S, Frank C, Rustum YM. Role of fluo-ropyrimidine scheduling and (6R,S) leucovorindose in a preclinical animal model of colorec-tal carcinoma. J Natl Cancer Inst (1996) 88:430–6.

94. Machover D, Goldschmidt E, Chollet P. Treat-ment of advanced colorectal and gastric ade-nocarcinomas with 5-fluorouracil and high-dosefolinic acid. J Clin Oncol (1986) 4: 685–96.

95. Machover D, Schwartenberg L, Goldschmidt E.Treatment of advanced colorectal and gastricadenocarcinomas with 5-FU combined with high-dose folinic acid: a pilot study. Cancer Treat Rep(1982) 66: 1803–7.

96. Rustum YM, Trave F, Zakrzewski SF. Biochem-ical and pharmacologic basis for potentiation of5-fluorouracil action by leucovorin. NCI Monogr(1987) 5: 165–70.

97. Poon MA, O’Connell MJ, Moertel CG. Bio-chemical modulation of fluorouracil: evidence ofsignificant improvement in survival and qualityof life in patients with advanced colorectal car-cinoma. J Clin Oncol (1989) 7: 1407–18.

98. O’Connell MJ. A phase III trial of 5-fluorouraciland leucovorin in the treatment of advancedcolorectal cancer: a Mayo Clinic/North CentralCancer Treatment Group study. Cancer (1989)63: 1026–30.

99. Petrelli N, Herrara L, Rustum Y. A prospectiverandomized trial of 5-fluorouracil vs. 5-flu-orouracil and high dose leucovorin +5−fluoro-uracil and methotrexate in previously untreatedpatients with advanced colorectal carcinoma. JClin Oncol (1987) 5: 1559–65.

100. Buroker TR, O’Connell MJ, Wieand HS. Ran-domized comparison of two schedules of flu-orouracil and leucovorin in the treatment ofadvanced colorectal cancer. J Clin Oncol (1994)12: 14–20.

Page 147: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

9

Haematologic Cancers

CHARLES A. SCHIFFER1 AND STEPHEN L. GEORGE2

1Karmanos Cancer Institute, Wayne State University School of Medicine, Detroit, MI 48201, USA2Department of Biostatistics and Bioinformatics, Duke University Medical Centre, Durham, NC 27710, USA

INTRODUCTION

The story of therapeutic research for acutemyeloid leukaemia (AML) is one of mixedsuccess. AML has been studied extensively bothin the clinic and in the laboratory, in part becauseof the accessibility of cells for in vitro testing, andhas served as a model for the elucidation of manyof the principles of anti-neoplastic therapy andtranslational research in infectious disease andtransfusion medicine supportive care. Completeremission after initial therapy is achieved inabout two-thirds of patients, a significant fractionof whom can be cured with additional post-remission treatment. Unfortunately, however, thegreat majority of patients eventually relapse andsuccumb to complications of the disease andits treatment. The incidence of AML spans theentire age spectrum but is most common in adultsgreater than 60 years of age and the prognosis isparticularly poor in these individuals.

Large numbers of randomised trials havebeen performed in patients with AML includ-ing comparative evaluations of different dosesand types of chemotherapeutic agents, the use

of haematopoietic growth factors, stem celltransplantation in first remission and modula-tion of various mechanisms of intrinsic drugresistance.1 – 7 Some of these trials have, in fact,changed the standard care of the disease butmost have unfortunately been negative in thatthe newer approaches failed to improve remissionrates and overall survival. Intensive treatmentregimens have improved outcome in youngerpatients, but clinical trials in patients 60 yearsand older by the Cancer and Leukemia Group B(CALGB) and others have shown remarkablyconsistent poor results, with complete remis-sion (CR) rates of around 50%, and only 10%of patients surviving four years.1,8,9 Indeed, itis arguable that the prognosis of older patientswith AML has not changed in the last 15 years.Therefore, it is imperative that new therapies areevaluated in as efficient a fashion as possible.There are a number of issues which can serveas impediments to new drug development, someof which are idiosyncratic to AML. This chapterwill review some of these problems and sug-gest and discuss some of the statistical issues intrial design.

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 148: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

142 TEXTBOOK OF CLINICAL TRIALS

BACKGROUND

AML occurs as a consequence of an acquiredmutation in a haematopoietic stem cell whichresults in a failure of normal maturation and dif-ferentiation of myeloid cells and an accumulationof juvenile leukaemia cells or ‘blasts’. By mecha-nisms which are not well understood, the expan-sion of this malignant clone suppresses normalblood cell formation and patients usually presentwith symptoms related to absence of normalblood cells including weakness and fatigue due toanaemia, infection because of decreased numberof normal white blood cells, and bleeding becauseof marked decreases in the number of platelets.Because this is a systemic disease, initial treat-ment is with chemotherapy, generally includingan anthracycline and cytarabine (ara-c), adminis-tered for three and seven days respectively. Thisinduces a period of low blood counts for threeto four weeks at which time the patient is at riskfor bleeding and infection and invariably requirestransfusions of red blood cells, platelets andthe use of systemic antibiotics. Should therapybe successful, a complete remission is obtainedwhich is defined as normal blood counts and bonemarrow with no evidence of leukaemia.10 It isknown that small amounts of leukaemia remain,which cannot be detected morphologically withthe microscope, because without further therapy,leukaemia invariably relapses, generally withinsix months. Remission rates are approximately75% in younger patients but are only 50% inolder patients (greater than 60 years of age).

There have been major improvements in infec-tious disease and transfusion supportive care sothat, today, the major cause of initial treatmentfailure is drug-resistant leukaemia (i.e. persis-tence of leukaemia after treatment). Randomisedtrials comparing different types of anthracyclines,different doses and schedules of administrationof ara-c, and the use of growth factors for sup-portive care, have not improved these inductionresults. With appropriate post-remission therapy,approximately 35% of patients less than 50 yearsof age remain disease-free after three years, withalmost all of these patients being functionally

cured of their leukaemia; however, less than 10%of patients greater than 60 years of age remain inlong-term remission.1,6,8,9 Multiple randomisedtrials evaluating high-dose therapy with eitherautologous or allogeneic stem cell rescue haveproduced results similar to those achieved withchemotherapy alone, although the causes of treat-ment failure vary somewhat, with lower ratesof relapse with transplant approaches offset byhigher treatment associated mortality.6 In addi-tion, transplant approaches are generally suitableonly for younger patients.

AML is also an extremely heterogeneous dis-order biologically. Multiple subtypes can beidentified by cytogenetic or molecular studiesof the leukaemia cells. Some of these sub-types (predominantly found in younger patients)have an excellent prognosis with cure rates ofapproximately 60%, whereas other subtypes, gen-erally characterised by chromosome loss andduplication, have almost no patients cured withchemotherapy alone.11 The latter is much morecommon in older patients, particularly those inwhom AML developed as a progression of aprior bone marrow disorder either of a myelo-proliferative nature or more commonly followinga myelodysplastic syndrome.

Cytogenetic and molecular characterisation ofthe type of AML can be critical as evidencedby the remarkable results achieved in recentyears in acute progranulocytic leukaemia (APL),a subgroup representing about 10% of patientswith AML, predominantly in younger adultsand children.12 All patients with APL have abalanced translocation between chromosomes 15and 17, resulting in a mutation of the genecoding for the retinoic acid nuclear receptor.By complex mechanisms, this confers uniquesensitivity to an oral retinoid, all-trans retinoicacid (ATRA), which has appreciably fewer sideeffects than traditional chemotherapy. A seriesof randomised trials have elucidated the optimalmeans of combining ATRA with chemotherapysuch that more than 70% of patients withthis previously devastating leukaemia can becured.13 Interestingly, APL also has a uniquesensitivity to arsenical compounds. It is hoped

Page 149: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

HAEMATOLOGIC CANCERS 143

that similar strategies with different compoundscan be discovered for other AML variants withdiscrete activating mutations, as has recently alsobeen achieved in patients with chronic myeloidleukaemia (CML) with the use of the tyrosinekinase inhibitor imatinib mesylate (STI571),which specifically targets the abnormal enzymeproduced by the bcr/abl mutation characteristicof CML.14

Some studies have suggested differential res-ponsiveness of AML subtypes to different typesof chemotherapy. In particular, patients withmore favourable balanced translocations seemto benefit from high-dose ara-c-based consoli-dation therapy. In contrast, patients with poorprognosis chromosomal changes do not bene-fit from these more intensive chemotherapeuticapproaches,15 although a fraction may be curedwith the graft versus leukaemia effect inducedby allogeneic stem cell transplantation. Stem celltransplant is currently not a possibility for olderpatients. Because of this, many treatment coop-erative groups have devised different therapeuticapproaches for older and younger patients, withmanipulations of stem cell transplantation beingevaluated in the latter group.

IMPROVING THERAPY FOR OLDERPATIENTS WITH AML

Rates of complete remission are much lower andremission duration more abbreviated in patientsgreater than the age of 60 years as a consequenceof more intrinsic drug resistance and more base-line organ dysfunction than are encountered inyounger individuals. New therapeutic approachesshould focus both on increasing remission ratesas well as on prolonging remission and enhanc-ing the cure fraction of such patients. Many AMLstudies have focused on older patients becauseof the large numbers of such patients availablefor studies as well as the feeling that the overallresults of therapy are so poor that it would be pos-sible to identify truly active agents very rapidlybecause differences with historical or randomisedcontrols would be obvious.

However, there are a number of both practicaland biologic issues complicating the conduct ofsuch trials:

• Evaluation of post-remission manipulationsis made more difficult by the low com-plete response rate, so that less than 50%of patients initially entered on trial are eli-gible for post-remission treatment. In addi-tion, many such patients are not candidatesfor intensive therapy because of compromisedorgan function from toxicities encountered dur-ing induction, and because many older patientsdo not recover fully normal blood counts evenafter a significant anti-leukaemic response dur-ing induction.

• In addition, many older individuals declinepost-remission treatment, preferring to spendtheir remaining time outside of the hospital,as far from aggressive medical ministrationsas possible.

Thus, randomised studies of new therapies intro-duced post-remission need larger numbers ofpatients to account for this drop-off in patientsas the study progresses. This represents a majorissue since only a small fraction of such patientsare captured for clinical trials.

Furthermore:

• AML in older individuals is extremely hetero-geneous. Some therapies might be appropriateonly for certain AML subtypes and positiveeffects can be missed when tested in the over-all AML population. This may be particularlytrue for newer targeted therapies.

• A focus on patients with highly resistantdisease represents a particularly high hurdlefor new therapies and treatments. Modest, butnonetheless important, benefits which couldbe of value to other patient groups could bemissed by studying only patients in very poorprognostic groups.

These problems are particularly relevant, be-cause there is no shortage of new agents whichcould and should be evaluated in AML. In

Page 150: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

144 TEXTBOOK OF CLINICAL TRIALS

addition to a continued supply of cytotoxicdrugs, there will be large numbers of anti-angiogenesis compounds, immune modulators,signal transduction inhibitors (either with specificor more generic enzymatic targets), as well asnew and less acutely toxic approaches to stem celltransplant. Many of the non-cytotoxic therapeuticapproaches also have the allure of oral treatmentwith potentially much less toxicity.

If an agent can be safely added to theusual dose of conventional therapy, it mightbe most efficient to utilise the new therapy inboth induction and consolidation, thereby perhapsmaximising the chance to detect anti-leukaemicactivity. Possible study designs for trials of newpost-remission therapies are shown in Table 9.1,where conventional therapy might refer to afew courses of low-dose ara-c which results ina median remission duration of approximatelyeight months and less than 10% long-termdisease-free survival.9 This is slightly better thanobservation without treatment which producesvery few if any long-term disease-free survivorsand shorter CR durations. The choice amongthe various randomised approaches might beinfluenced by the unique features of the agentbeing tested. Also, given the very poor resultsobserved with standard therapy, it could beargued that a straightforward Phase II trial inwhich the new agent is evaluated alone couldhave merit, although the usual problems withhistorical controls and patient selection would beissues. However, a number of anti-cancer agentshave been approved by the FDA in recent yearsunder an accelerated approval mechanism, based

Table 9.1. Possible study designs evaluating newagents in post-remission therapy

Phase II studiesNew agent alone

Randomised Phase III studiesObservation vs. new agentConventional vs. new agentConventional vs. conventional + new agentConventional followed by observation vs. new agentNew agent in both induction and consolidation

on Phase II data alone which showed benefit inpatients with resistant disease and otherwise fewtherapeutic options.

STATISTICAL ISSUES IN DESIGNAND ANALYSIS

Because of the nature of AML and its treat-ment, several statistical issues in the design andanalysis of clinical trials need special attention.Four of these are discussed: factorial designs,outcome measures, competing risks and statisti-cal modelling.

FACTORIAL DESIGNS

The treatment phases for AML are conventionallydivided into an initial phase of induction therapyand, for those achieving a complete remissionduring this phase, a post-remission or mainte-nance therapy phase. The post-remission phaseis sometimes further divided into earlier consol-idation therapy and later maintenance therapy,but for our purposes here, two phases are suf-ficient. It is natural to design studies to comparetherapies in each of these two phases, leading tofactorial designs, in which patients are randomlyassigned to one of two or more induction thera-pies (the first factor) and then to one of two ormore maintenance therapies (the second factor).With two possible treatment assignments in eachphase, this is a 2 × 2 factorial design, a commonand well-known statistical design. Much has beenwritten about this design applied in the clinicaltrials setting.16 The twist in the current situation isthat the second randomisation is applicable onlyfor patients who respond to the induction ther-apy. As noted above, in the case of older AMLpatients, only about 50% of all patients enteredon study may respond and, thus, be eligible andmedically suitable for the second randomisation.

It is typical to separate the objectives of suchstudies into a comparison of induction regi-mens with respect to response rates and, sep-arately, a comparison of maintenance regimenswith respect to the length of remission, disease-free survival or overall survival. For example,

Page 151: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

HAEMATOLOGIC CANCERS 145

CALGB 8923 was a randomised clinical trialof this type involving AML patients at least60 years old.2,9 The induction phase involved arandomisation between GM-CSF, a haematopoi-etic growth factor, and placebo following initialchemotherapy. The hypothesis was that the GM-CSF would reduce infectious complications andperhaps increase the response rate. Respondingpatients were to be randomised to receive oneof two post-remission regimens, cytarabine aloneor a combination of cytarabine and mitoxantrone.Overall, 388 participants were randomised to theinduction therapies; 205 (53%) achieved a CR,but only 169 (44%) were randomised in the post-remission phase.

One of the problems with the usual approach tothese designs is that there is no direct estimationor testing of the four possible treatment poli-cies implied in the design. The policies aredefined by selecting one of two induction ther-apies followed by one of two post-remissiontherapies, if a response is obtained and thepatient consents to continue. One paper dealsdirectly with this issue, making efficient use ofdata from all patients.17 There are also method-ologic issues about when the randomisation tothe post-remission therapy should be done. Forexample, if both randomisations are done at thetime of study entry with a planned intent totreat analysis, then the inevitable (and antici-pated) large patient drop-out can substantiallycomplicate evaluation of the second therapeuticmanoeuvre.

OUTCOME MEASURES

There are various choices for outcome measuresin clinical trials involving AML patients. Theprimary ones are:

• Response rate – the proportion of patients whoachieve an initial clinical response to the induc-tion therapy is referred to as the response rate.In older AML patients, as in all leukaemiapatients, the critical category is the completeresponse (CR) rate, although partial responsesare sometimes included in Phase II trials where

attention is focused on identifying activity ofan agent, no matter how small. Achievement ofa complete response is sine qua non for long-term control of disease. However, the CR rateis a very imperfect surrogate for more mean-ingful clinical outcome measures describedbelow, and has been defined differently by dif-ferent leukaemia treatment groups, and shouldnever be used as a substitute for them, espe-cially in Phase III clinical trials. The primaryrole for the CR rate is as a measure of clinicalactivity in Phase II trials.

• Event-free survival (EFS) – this is the timefrom the start of study until a failure torespond, relapse (for those achieving a CR)or death, whichever occurs first. This outcomemeasure is a good measure of the overallcontrol of disease from the start of therapyand combines the effects of induction andpost-remission therapies. In a Phase III trial,all randomised patients contribute to anyanalysis of EFS under the usual intent-to-treat approach.18 Standard techniques for time-to-event data (survival methods) are used indesign and analysis for EFS, and for the othertime-to-event endpoints defined next.19

• Disease-free [or relapse-free] survival (DFS) –this is a standard outcome measure in trialsof adjuvant therapy for solid tumours, but inAML trials, DFS refers to the survival timespent free of disease. Thus, DFS is applicableonly to patients who achieve a CR. It is definedas the time from achieving a CR to relapseor death, whichever occurs first. Since patientswho fail to achieve a CR are excluded, thismeasure is unsuitable as an overall assessmentof therapy. However, it is useful for compar-ing two or more post-remission therapies aslong as it is recognised that the distribution ofDFS is not representative of the result to beexpected for all patients.

• Length of remission (LR) – the length ofremission is ordinarily defined as the timefrom achieving CR to the time of relapse, withdeaths in remission counted as censored obser-vations. This measure suffers from the sameproblems as DFS and, in addition, the usual

Page 152: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

146 TEXTBOOK OF CLINICAL TRIALS

Kaplan–Meier estimation is no longer valid(see discussion below on competing risks).

• Overall survival (OS) – the time from the startof study to death is an obviously critical out-come measure for any generally fatal diseaselike AML in older adults. It has the virtue ofbeing unambiguously defined and captures aresult of obvious significance. However, thereare often difficulties in interpretation, partic-ularly if multiple therapies are given, or ifpatients cross over to the alternative therapyafter relapse. Nevertheless, the importance ofoverall survival is so fundamental that it shouldalways be analysed, even if it is not used as theprimary outcome measure.

• Other outcome measures – there are someother measures occasionally used in AMLstudies, particularly measures of quality of life(QOL).20 Some attempt to measure survivalor related time to event measures adjusted forquality of life. For example, the Q-TWiSTmethod discounts survival time spent with anunacceptable level of adverse symptoms dueto treatment.21 Such methods attempt to quan-tify the generally accepted notion that sim-ply prolonging survival is not a sufficientobjective. Improved quality of life is equallyimportant.

COMPETING RISKS

For some purposes in designing and analysing tri-als of therapy for older AML patients, it is infor-mative to use the techniques of competing risksanalysis.22 That is, rather than using a compositemeasure such as EFS, one can break this measureinto its constituent parts by considering the timeto each outcome separately. The term ‘compet-ing risks’ refers to various risks of failure, eachcompeting with the others. This terminology orig-inally arose in the context of analysing variouscauses of death, but it is applicable more gener-ally. The fundamental problem from a statisticalperspective is that the risks cannot be assumedto be operating independently from each other.Thus, methods which assume such independence,such as the Kaplan–Meier life table analysis,

which treats other risks as independent censoringmechanisms, are inherently flawed. One way toproperly account for the dependence is throughthe use of the cumulative incidence curve, atopic that has been extensively explored in recentyears.23

STATISTICAL MODELS

Statistical models are heavily used in AMLtrials. The usual time-to-event measures (EFS,DFS, OS) are often handled non-parametricallyin the primary analysis (e.g., Kaplan–Meierestimates, logrank tests, etc.), but the semi-parametric proportional hazards regression modelis surely the most commonly used method toadjust for covariates in the analysis.19 In addition,because of the nature of AML, increasing interestis being focused on so-called cure models,in which it is hypothesised that an unknownfraction (c) of patients are cured (or at leastwill have long-term control of disease) andthe rest (1 − c) are not.24 Interest then focuseson estimating c, comparing the cure rates ofvarious treatments, identifying factors predictiveof c, and identifying prognostic factors for thetime-to-failure in the patients not cured. Inolder patients with AML, this model has notbeen used as much due to the obviously lowvalue of c, but it has been important in otherleukaemias.

SUMMARY

Acute myeloid leukaemia in the older patientis a common and important disease which isrelatively resistant to current therapies. Carefulconsideration of the particular characteristics ofthis disorder is required when designing clinicaltrials. There will be a large number of compoundsavailable for evaluation in upcoming years andit is desirable that such studies be conductedusing the most efficient and informative designsto rapidly identify therapies which lengthensurvival and increase the fraction of patients whoare cured.

Page 153: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

HAEMATOLOGIC CANCERS 147

REFERENCES

1. Mayer RJ, Davis RB, Schiffer CA, Berg DT,Powell BL, Schulman P, Omura GA, Moore JO,McIntyre OR, Frei E. 3. Intensive postremissionchemotherapy in adults with acute myeloidleukemia. Cancer and Leukemia Group B. N EnglJ Med (1994) 331: 896–903.

2. Stone RM, Berg DT, George SL, Dodge RK, Pa-ciucci PA, Schulman P, Lee EJ, Moore JO, Po-well BL, Schiffer CA. Granulocyte-macrophagecolony-stimulating factor after initial chemother-apy for elderly patients with primary acute myel-ogenous leukemia. N Engl J Med (1995) 332:1671–7.

3. Rowe JM, Andersen JW, Mazza JJ, Bennett JM,Paietta E, Hayes FA, Oette D, Cassileth PA, Sta-dtmauer EA, Wiernik PH. A randomized placebo-controlled phase III study of granulocyte-macro-phage colony-stimulating factor in adult patients(> 55 to 70 years of age) with acute myeloge-nous leukemia: a study of the Eastern Coopera-tive Oncology Group (E1490). Blood (1995) 86:457–62.

4. Lowenberg B, Suciu S, Archimbaud E, Ossenkop-pele G, Verhoef GE, Vellenga E, Wijermans P,Berneman Z, Dekker AW, Stryckmans P, Schou-ten H, Jehn U, Muus P, Sonneveld P, DardenneM, Zittoun R. Use of recombinant GM-CSF dur-ing and after remission induction chemother-apy in patients aged 61 years and older withacute myeloid leukemia: final report of AML-11,a phase III randomized study of the LeukemiaCooperative Group of European Organisationfor the Research and Treatment of Cancer andthe Dutch Belgian Hemato-Oncology CooperativeGroup. Blood (1997) 90: 2952–61.

5. Godwin JE, Kopecky KJ, Head DR, Willman CL,Leith CP, Hynes HE, Balcerzak SP, AppelbaumFR. A double-blind placebo-controlled trial ofgranulocyte colony-stimulating factor in elderlypatients with previously untreated acute myeloidleukemia: a Southwest oncology group study(9031). Blood (1998) 91: 3607–15.

6. Cassileth PA, Harrington DP, Appelbaum FR,Lazarus HM, Rowe JM, Paietta E, Willman C,Hurd DD, Bennett JM, Blume KG, Head DR,Wiernik PH. Chemotherapy compared with autol-ogous or allogeneic bone marrow transplantationin the management of acute myeloid leukemiain first remission. N Engl J Med (1998) 339:1649–56.

7. Baer MR, George SL, Dodge RK, O’LoughlinKL, Minderman H, Caligiuri MA, Powell BL,Kolitz JE, Schiffer CA, Bloomfield CD, Larson

RA. Phase III study of the multidrug resis-tance modulator PSC-833 in previously untreatedpatients 60 years of age and older with acutemyeloid leukemia: Cancer and Leukemia GroupB study 9720. Blood (2002) 100: 1224–32.

8. Schiffer CA, Dodge R, Larson RA. Long-termfollow-up of Cancer and Leukemia Group Bstudies in acute myeloid leukemia. Cancer (1997)80: 2210–14.

9. Stone RM, Berg DT, George SL, Dodge RK, Pa-ciucci PA, Schulman PP, Lee EJ, Moore JO, Pow-ell BL, Baer MR, Bloomfield CD, Schiffer CA.Postremission therapy in older patients with denovo acute myeloid leukemia: a randomized trialcomparing mitoxantrone and intermediate-dosecytarabine with standard-dose cytarabine. Blood(2001) 98: 548–53.

10. Cheson BD, Cassileth PA, Head DR, SchifferCA, Bennett JM, Bloomfield CD, Brunning R,Gale RP, Grever MR, Keating MJ. Report ofthe National Cancer Institute-sponsored work-shop on definitions of diagnosis and response inacute myeloid leukemia. J Clin Oncol (1990) 8:813–19.

11. Byrd JC, Mrozek K, Dodge RK, Carroll AJ, Ed-wards CG, Arthur DC, Pettenati MJ, Patil SR,Rao KW, Watson MS, Koduru PR, Moore JO,Stone RM, Mayer RJ, Feldman EJ, Davey FR,Schiffer CA, Larson RA, Bloomfield CD, Can-cer and Leukemia Group. Pretreatment cytoge-netic abnormalities are predictive of inductionsuccess, cumulative incidence of relapse, andoverall survival in adult patients with de novoacute myeloid leukemia: results from Cancer andLeukemia Group B (CALGB 8461). Blood (2002)100: 4325–36.

12. Tallman MS, Andersen JW, Schiffer CA, Appel-baum FR, Feusner JH, Woods WG, Ogden A,Weinstein H, Shepherd L, Willman C, Bloom-field CD, Rowe JM, Wiernik PH. All-trans reti-noic acid in acute promyelocytic leukemia: long-term outcome and prognostic factor analysis fromthe North American Intergroup protocol. Blood(2002) 100: 4298–302.

13. Fenaux P, Chastang C, Chevret S, Sanz M, Dom-bret H, Archimbaud E, Fey M, Rayon C, HuguetF, Sotto JJ, Gardin C. A randomized compari-son of all transretinoic acid (ATRA) followedby chemotherapy and ATRA plus chemother-apy and the role of maintenance therapy innewly diagnosed acute promyelocytic leukemia.The European APL Group. Blood (1999) 94:1192–200.

14. Druker BJ, Sawyers CL, Kantarjian H, Resta DJ,Reese SF, Ford JM, Capdeville R, Talpaz M.Activity of a specific inhibitor of the BCR-ABL tyrosine kinase in the blast crisis of

Page 154: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

148 TEXTBOOK OF CLINICAL TRIALS

chronic myeloid leukemia and acute lymphoblasticleukemia with the Philadelphia chromosome. NEngl J Med (2001) 344: 1038–42.

15. Bloomfield CD, Lawrence D, Byrd JC, Carroll A,Pettenati MJ, Tantravahi R, Patil SR, Davey FR,Berg DT, Schiffer CA, Arthur DC, Mayer RJ.Frequency of prolonged remission duration afterhigh-dose cytarabine intensification in acute mye-loid leukemia varies by cytogenetic subtype. Can-cer Res (1998) 58: 4173–9.

16. Peterson B, George SL. Sample size requirementsand length of study for testing interaction in a2 × k factorial design when time-to-failure is theoutcome. Contr Clin Trials (1993) 14: 511–22.

17. Lunceford JK, Davidian M, Tsiatis AA. Estima-tion of survival distributions of treatment policiesin two-stage randomization designs in clinical tri-als. Biometrics (2002) 58: 48–57.

18. Lachin JM. Statistical considerations in the intent-to-treat principle. Contr Clin Trials (2000) 21:167–89.

19. Hosmer DW, Lemeshow S. Applied Survival Ana-lysis. New York: John Wiley & Sons (1999).

20. Chow SC, Ki FY. Statistical issues in quality-of-life assessment. J Biopharm Stat (1996) 6: 37–48.

21. Gelber RD, Goldhirsch A, Cole BF, Wieand HS,Schroeder G, Krook JE. A quality-adjusted timewithout symptoms or toxicity (Q-TWiST) analysisof adjuvant radiation therapy and chemotherapyfor resectable rectal cancer. J Natl Cancer Inst(1996) 88: 1039–45.

22. Gooley TA, Leisenring W, Crowley J, Storer BE.Estimation of failure probabilities in the presenceof competing risks: new representations of oldestimators. Stat Med (1999) 18: 695–706.

23. Klein JP, Rizzo JD, Zhang MJ, Keiding N. Sta-tistical methods for the analysis and presentationof the results of bone marrow transplants. Part I:unadjusted analysis. Bone Marrow Transpl (2001)28: 909–15.

24. Ghitany ME, Maller R, Zhou S. Estimating theproportion of immunes in censored samples: asimulation study. Stat Med (1995) 14: 39–49.

Page 155: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

10

Melanoma

P.-Y. LIU1 AND VERNON K. SONDAK2

1Fred Hutchinson Cancer Research Centre, Seattle, WA 98109 1024, USA2University of Michigan Comprehensive Cancer Centre, Ann Arbor, MI 48109 0932, USA

INTRODUCTION

Randomised Phase III clinical trials are thegold standard for medical decision making,particularly in terms of adjuvant therapies wherea modest incremental benefit is sought. However,there is sometimes marked disagreement amongclinicians in their interpretation of Phase III trialresults. Nowhere is this more evident than inthe arena of adjuvant therapy of resected ‘high-risk’ melanoma. In this chapter, we will reviewthe results of several key randomised trials andattempt to reconcile at times conflicting clinicalinterpretations.

BACKGROUND

A basic familiarity with malignant melanoma isrequired in order to understand the statistical andclinical issues presented herein.

The prognosis of localised cutaneous melanomais based on several well-defined factors. Patho-logic analysis of the primary tumour can predictthe likelihood of regional and distant metastasis

and death from melanoma. Clinically localisedmelanomas are grouped into three prognosticcategories based on the thickness of the pri-mary tumour as measured by the pathologistusing a micrometer built into the microscope eye-piece (Breslow’s thickness). Melanomas less than1.0 mm in thickness have an overall excellentprognosis with relatively minimal interventionand are considered ‘low-risk’ lesions. Melanomasbetween 1.0 and 3.9 mm are considered to beintermediate risk, while melanomas 4.0 mm orgreater are considered ‘high-risk’ tumours. Thepresence of ulceration of the primary tumourincreases the risk of metastasis and death withinany given thickness category.1

The thickness is highly predictive of the riskof regional lymph node metastasis, with nodalinvolvement in <5% of melanomas that are<1.0 mm versus >30% in melanomas ≥4.0 mm.Intermediate-thickness melanomas have an inter-mediate risk of nodal spread, on the order of 20%.The prognostic significance of the presence ofnodal metastasis far outweighs the significance oftumour thickness: a thin or intermediate-thicknessmelanoma with nodal metastases generally has

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 156: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

150 TEXTBOOK OF CLINICAL TRIALS

a worse prognosis than a thick melanoma withnegative nodes. Once nodal metastasis has beendocumented, the number of involved nodes is thestrongest predictor of subsequent outcome, alongwith the manner of detection of the metastasis.Melanoma in clinically enlarged nodes portendsa worse prognosis than melanoma in clinicallynormal nodes.1

The mainstay of treatment for localised orregionally metastatic melanoma is surgery. Ade-quate wide excision of the primary tumour site(generally taking a margin of 1 to 2 cm of nor-mal skin around the visible edge of the melanomaor biopsy scar) is highly efficacious in controllingdisease at the primary site.2,3

Three main options are available for stag-ing regional nodes in patients with cutaneousmelanoma: clinical staging, surgical staging bycomplete (elective) lymph node dissection, andsurgical staging by sentinel lymph node biopsy.

CLINICAL STAGING

Physical examination is the mainstay of clini-cal staging of the regional nodes. Any palpa-ble lymph nodes that are ≥1 cm in maximumdiameter or very hard or fixed to adjacent struc-tures must be considered highly suspicious formetastatic involvement. Unfortunately, both thespecificity and sensitivity of physical examina-tion for detecting melanoma nodal metastases arelow. In muscular or obese patients, even rela-tively large lymph node metastases can be missedon physical examination. Lymph nodes may beenlarged after a biopsy procedure due to reactivehyperplasia without containing metastasis. Mostimportantly, metastatic involvement of normal-sized lymph nodes cannot be identified by phys-ical examination.

Radiologic studies–computed tomography (CT)and positron emission tomography (PET)–are alsoavailable to clinically stage the regional nodes.CT shares many of the deficiencies of physicalexamination: enlarged nodes may not be malig-nant, and normal-sized nodes harbouring metas-tases will be deemed normal. PET is more sensitivethan CT for differentiating melanoma-containing

from reactive nodes, but is still not able to identifymicroscopic foci of melanoma in normal nodes.4

Currently, neither PET nor CT are routinely rec-ommended for clinical staging.

For patients with low-risk melanomas, i.e.,those that are <1 mm in Breslow’s depth andhave no evidence of ulceration or significantregression, clinical staging by physical exami-nation is standard practice. Currently, surgicalstaging is used in the majority of patients withhigher-risk lesions. For any patient with clinicallyevident nodal involvement, a complete therapeu-tic lymph node dissection is associated with curein about 20% to 40% of patients.

SURGICAL STAGING BY COMPLETE(ELECTIVE) LYMPH NODE DISSECTION

Elective removal of clinically normal regionalnodes identifies evidence of metastasis about 20%of the time, and is clearly a more accurate deter-minant of nodal status than clinical staging. Ret-rospective reviews suggested a survival advan-tage for elective node dissection compared toclinical staging with subsequent therapeutic nodedissection at the time of nodal recurrence.5 Todate, however, no prospective study has demon-strated an overall survival advantage for electivenode dissection.3,6 Although the lack of a demon-strated benefit is not the same as the demonstra-tion of no benefit, elective dissection of clinicallynormal nodes is not considered standard practicefor cutaneous melanoma at the present time. Itis clear, however, that elective node dissectionresults in durable regional disease control in thevast majority of patients, and failures within thedissected nodal basin are quite uncommon.

SURGICAL STAGING BY SENTINEL LYMPHNODE BIOPSY

Sentinel lymph node biopsy is based on theconcept that lymphatic fluid from an area ofskin drains specifically to an initial node ornodes (‘sentinel nodes’) prior to disseminatingto other nodes in the same or nearby basins.Morton et al. described a reliable method for

Page 157: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

MELANOMA 151

identification and removal of the sentinel nodedraining the site of a cutaneous melanoma.7 Theyshowed conclusively that the pathologic status ofthe sentinel node accurately determines whethermelanoma cells have metastasised to that spe-cific lymph node basin.8 An important aspectof sentinel node biopsy is a detailed histologicexamination of the sentinel lymph nodes. Gen-erally, this examination is more thorough thanis practical to perform on the larger numberof nodes obtained during elective node dissec-tion. This more detailed pathologic analysis, com-bined with the ability to identify sentinel nodesthat are outside the defined boundaries of aregional basin, makes sentinel node biopsy themost sensitive and specific test for nodal metas-tasis currently available. The prognostic value ofsentinel node status has been demonstrated inmultiple studies. In published multivariate anal-yses, histologic status of the sentinel nodes isthe most powerful predictor of disease-specificsurvival.9 Overall, 5-year disease-specific sur-vival is >80% for patients with negative sentinelnodes, compared to about 50% for patients withone or more positive sentinel nodes. Importantly,patients with positive sentinel nodes go on toelective complete lymph node dissection. Amongpatients with negative sentinel nodes, only 4%or fewer ultimately experience a clinically evi-dent relapse within the nodal basin. Thus, sentinelnode biopsy matches the excellent regional con-trol achieved by elective node dissection whilesubjecting fewer patients to the morbidity of thecomplete node dissection procedure.

ADJUVANT THERAPY FOR MELANOMA

The development of effective adjuvant therapyhas been a long-standing goal of melanomaresearchers, and the subject of over 100 ran-domised clinical trials involving a host of dif-ferent agents.10 Adjuvant therapy is the systemicor regional administration of drugs or radiationto patients after apparently successful surgery,in an effort to minimise the risk of subsequentrecurrence. Although many patients are cured bysurgery, some benefit from adjuvant treatment

while others will relapse regardless of adjunc-tive measures. Currently there are no predictivemethods to distinguish one group of patientsfrom another, therefore it is necessary to treatall patients in hopes of gaining an incrementalbenefit for a select few. Hence, in addition tothe overall level of efficacy, clinicians evaluatetoxicity, convenience, cost-effectiveness and theprospects of post-relapse salvage therapy whendeciding whether to employ adjuvant therapy.Virtually all of these factors can be determinedaccurately only in randomised trials.

In 1995, high-dose interferon-α2b (IFN-α2b)was approved by the United States Food andDrug Administration, based on the positiveresults of a single, randomised Phase III clinicaltrial, E1684. The FDA’s decision was consideredcontroversial at the time. Subsequent randomisedtrials involving the same basic interferon regimenhave not only failed to put this controversy torest, but have in fact enhanced it.

ADJUVANT INTERFERON CLINICAL TRIALS

E1684

Eastern Cooperative Oncology Group (ECOG)trial E1684, with 280 eligible patients with thickprimary (≥4.00 mm) or node-positive melanomawho were randomly assigned after surgery toobservation or post-operative adjuvant treat-ment with IFN-α2b for one year, demonstratedstatistically-significant improvements in relapse-free and overall survival for patients randomisedto the interferon arm. IFN-α2b therapy increasedthe median relapse-free survival by 9 months(1.72 years for IFN-α2b patients versus 0.98 yearsfor observation patients) and produced a relative42% improvement in the 5-year relapse-free sur-vival rate (37% for IFN-α2b patients versus 26%for observation patients). In addition, IFN-α2btherapy significantly increased median overall sur-vival by 1 year (3.82 years for IFN-α2b patientsversus 2.78 years for observation patients) andproduced a 24% relative improvement in the5-year overall survival rate (46% for IFN-α2bpatients versus 37% for observation patients).11

Page 158: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

152 TEXTBOOK OF CLINICAL TRIALS

Side effects were common and frequently severe,but even when adjusted for time with toxicity, theresults favoured adjuvant IFN-α2b therapy.12

E1690

A subsequent Intergroup adjuvant therapy trial,E1690, also compared the high-dose IFN-α2b toobservation after complete resection of all knowndisease.13 This was a three-arm trial involving608 eligible patients. The eligibility criteria werethe same as for E1684, except for the fact thatelective node dissection was not required forpatients entered onto E1690 with thick primarymelanomas and clinically negative nodes. Resultsof this trial confirmed the relapse-free survivaladvantage seen in E1684 but with no survivaladvantage observed.

E1694

In light of the discordant survival results inE1684 and E1690, the initial results of anotherIntergroup trial, E1694, have received intensescrutiny. This trial compared one year of high-dose interferon not to an observation controlas in the two earlier studies, but rather to twoyears of a ganglioside vaccine called GMK.This was the largest of the three trials, with774 eligible patients between two study arms.For the first time, staging of the lymph nodesby sentinel node biopsy was performed in asignificant fraction of patients. Gangliosides arecarbohydrate antigens found on the surface ofmelanoma cells, as well as normal cells of neuralcrest origin and tumour cells of other types. Apilot randomised trial suggested a relapse-freesurvival benefit in patients who were treatedwith purified ganglioside GM2 (the specificganglioside in the GMK vaccine) plus BCGcompared to those treated with BCG alone.14 InMay 2000, the E1694 trial’s independent DataSafety Monitoring Committee concluded that thehigh-dose interferon arm was associated withhighly significantly improved relapse-free andoverall survival, and mandated that the studyresults be disclosed early.15

CLINICAL CONSIDERATIONS

RELAPSE-FREE SURVIVAL VERSUSOVERALL SURVIVAL

It has been the authors’ experience that clinicianstend to view clinical trial results as dichotomous,that is, ‘positive’ or ‘negative’. Moreover, par-ticularly for adjuvant therapy trials, the accep-tance of a clinical trial as ‘positive’ is oftenrestricted to trials demonstrating a statisticallysignificant benefit in overall survival. From thisperspective, there seems to be an obvious discrep-ancy among the two observation-controlled trials:E1684 demonstrated seemingly striking benefitsfrom the high-dose interferon regimen in bothrelapse-free and overall survival, whereas E1690validated only the relapse-free survival benefitwith no survival difference. However, the impor-tance of relapse-free survival may be worth closerexamination in the current setting.

Statistically it is commonly known that, com-pared to overall survival, disease relapse is a lessobjective endpoint because it depends on the def-inition of relapse as well as the frequency andmethod of detection. Defining relapse is less ofan issue in the adjuvant setting since patientsenter the study with no detectable disease andthereafter any new disease found is considereda relapse. In a well-conducted clinical trial theinterval and method of disease assessment arespecified in the protocol and generally compliedwith by trialists, thereby rendering relapse-freesurvival a more reliable endpoint than in othersituations. From the purely clinical viewpoint,patients have made clear that they are willingto accept even toxic adjuvant therapies that pro-vide improvements in relapse-free survival, evenif they do not result in any prolongation of over-all survival. This observation has been directlyvalidated in melanoma patients,16 and representsthe perception that time spent without signs orsymptoms of recurrent cancer is inherently ofvalue even in the absence of prolongation of totallifespan. In addition, relapse-free survival oftenrepresents a truer reflection of the biologic activ-ity of an adjuvant therapy since randomised trials

Page 159: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

MELANOMA 153

rarely include rigorous controls on post-relapsesalvage therapy. The confounding effect of suchtreatment on overall survival is unknown andnot assessable.

RECONCILING THE STUDY RESULTS BASEDON CLINICAL CONSIDERATIONS

Two of the three randomised Phase III trials ofhigh-dose interferon, E1684 and E1690, demon-strate a relapse-free survival advantage. The thirdtrial, E1694, also shows a relapse-free survivalbenefit but with GMK vaccine and not obser-vation as the control treatment. The implicationof this design difference is discussed in detailbelow. Nevertheless, many consider there is uni-formity of evidence that high-dose interferon hasbiologic activity in at least delaying relapse aftersurgical therapy. This fact alone, combined withthe lack of proven alternatives, is enough formany patients to choose interferon therapy in theabsence of consensus regarding the overall sur-vival benefit.

Crossover to interferon therapy upon relapsemight have partially affected the outcome of atleast one study. The original trial, E1684, wasunlikely to have been affected by crossover fortwo reasons. Surgical staging of the regionalnodes by complete (elective or therapeutic) nodedissection was required. Hence, few patientswere likely to experience regional relapse orother resectable recurrence, where secondaryresection and delayed adjuvant interferon couldbe employed. Most relapses occurred in non-resectable distant sites. In recent medical practice,interferon is rarely employed for the treatment ofmeasurable metastatic disease.

In contrast, the E1690 trial required only clin-ical staging of the regional nodes, and surgerywas not required for patients with thick primarytumours and clinically negative nodes. Amongall relapsed patients (n = 114 in the high-doseinterferon arm and n = 121 in the observationcontrol arm), 54% on high-dose interferon and45% on observation experienced regional recur-rence only. Retrospective data collection indi-cated more patients relapsing on the observation

arm received subsequent interferon-α-containingregimens (31% vs. 15%) and/or biochemotherapy(17% vs. 6%).

While there is some evidence of differentialpost-relapse treatment received, concluding thatthe lack of interferon survival benefit observedin E1690 is due to these differences is notjustified. Making this conclusion presupposessurvival efficacy from these salvage therapies,which cannot be substantiated with currentlyavailable data. In addition, comparing outcomesby post-relapse treatment groups provides littleuseful information because patients were notrandomised to salvage treatment strategies uponrelapse. As is inherent in observational data,unknown patient selection factors cannot beaccounted for by analysis techniques and theirimpact can easily remain even after adjustingfor known prognostic factors. Therefore, althoughavailable data appear compatible with the notionthat initial observation after surgery followed byhigh-dose interferon in case of resectable relapsepresents an alternative strategy to routine use ofadjuvant high-dose interferon, this study offersno proof for the conjecture. The conservativeconclusion is that salvage treatment differenceis a possible confounding factor that limits theconfidence regarding the lack of overall survivalbenefit of high-dose interferon from study E1690.

STATISTICAL CONSIDERATIONS

Although clinical factors clearly impact on theinterpretation of the three trials, our main goal isto examine the statistical aspects of these trialsto determine the extent to which they actuallypresent ‘conflicting’ information. We focus firston E1684 and E1690.

STATISTICAL TESTS EMPLOYED ANDPRESENTATION OF RESULTS

One source of confusion could be due to the factthat one-sided p-values were presented for E1684but two-sided p-values were presented for E1690.Since all comparisons involved were one-sided

Page 160: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

154 TEXTBOOK OF CLINICAL TRIALS

in nature (i.e., is high-dose interferon superior toobservation after surgery), we use all one-sidedp-values (p1) in this discussion. In addition,all hazard ratios are expressed as observationarm versus treatment arm ratios. Thus, a hazardratio >1 indicates an excess of hazard in theobservation arm, or treatment advantage.

Another possible source of confusion couldbe the fact that, in E1684, statistically signifi-cant p-values for relapse-free and overall sur-vival differences by the stratified logrank test(adjusted for disease burden and presentation atinitial diagnosis versus recurrent nodal diseasestatus) were reported (Table 2 of Ref. 11). Butwhen Cox regression analysis was performed,further adjusting for age, time from diagnosis torandomisation and ulceration status of the pri-mary tumour, a significant interferon over obser-vation benefit was presented only for those withnodal disease (Table 4 of Ref. 11). The haz-ard ratio was 1.64 for relapse-free survival and1.49 for overall survival with p1 = 0.01 in bothcases. However, these hazard ratios (presentedin their reciprocals as interferon over observa-tion ratios, 0.61 and 0.67, in actuality) werelabelled ‘Treatment with IFN’ without referenceto the positive nodal disease subset. An interac-tion term between the interferon treatment andthe thick primary, no nodal disease patient cat-egory was actually included in the Cox mod-els and the results were presented in the sametable with the label ‘CS1/PS1 + IFN’. The haz-ard ratios were 0.36 and 0.34 respectively forrelapse-free survival and overall survival. Theseinteraction hazard ratios translated into observa-tion over interferon hazard ratios of 0.60 and 0.50for relapse-free and overall survival in patientswith thick primary tumours and pathologicallynegative nodes, reflecting the occurrence thatinterferon-treated patients fared worse than theobservation patients in this subset. For the readerswho did not appreciate these details of the Coxmodelling, the hazard ratios for the nodal diseasesubset could have been over-interpreted as theCox model treatment effects for the study as awhole, which were not presented in the originalpublication. Such misinterpretation might have

contributed to an exaggerated impression of theoverall survival benefit from E1684.

TRIAL SIZE, OVERALL RESULTSAND OTHER ASPECTS

To interpret the combined results E1684 andE1690, it is useful to compare the study param-eters and overall results. Tables 10.1–10.3 areextracted mainly from Ref. 13. Since there wasnot a low-dose interferon arm in E1684, onlythe high-dose interferon and observation arms ofE1690 are included in the tables. Due to thelimitations of data availability, all randomisedpatients regardless of eligibility determination arepresented for consistency.

The tables indicate that when E1690 resultsbecame available, the study had 50% morepatients than E1684, reflecting wider participa-tion from the US Melanoma Intergroup. Thepatient enrollment periods were non-overlapping.Although the updated data for E1684 had longerfollow-up at the time of E1690 publication, more

Table 10.1. E1684 and E1690 study characteristics

Study E1684 E1690∗

Participatinggroups

ECOG ECOG, SWOG,CALGB,MDACC

Patient accrualperiod

1984–1990 1991–1995

N (allrandomised)

286 427

Medianfollow-up(years)

6.9 4.3

Event count: RFS 197 241Event count: OS 175 190

∗High-dose interferon and observation arms only.

Table 10.2. E1684 and E1690 patient disease stagedistribution

Diseasestage

T4N0

T1-4 N+(occult)

T1-4 N+(overt)

N+Recurrent

E1684 11% 12% 14% 63%E1690 26% 11% 12% 50%

Page 161: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

MELANOMA 155

Table 10.3. E1684 and E1690 results

StudyHazard

ratio 95% CI p-Value∗

Relapse-free survivalE1684∗∗ 1.43 (1.08, 1.89) 0.002E1690 1.28 (1.0, 1.65) 0.03

Overall survivalE1684∗∗ 1.32 (0.98, 1.77) 0.03E1690 1.0 (0.75, 1.33) 0.50

∗One-sided p-value by stratified logrank test.∗∗Ref. 21.

events were analysed for E1690 from the largersample size and the fact that few events occurredafter 5 years. The main known patient char-acteristic difference was in the distribution ofdisease stage. There were more node-negativepatients (26% vs. 11%) and fewer recurrent dis-ease patients (63% vs. 50%) in E1690, repre-senting a somewhat more favourable prognosis.It may be worth pointing out that, among thosewith nodal disease, there did not appear to be sur-vival differences between newly diagnosed andrecurrent disease patients. The more favourablerelapse and survival experiences of the obser-vation patients in E1690 compared to those inE1684 (5-year relapse-free survival of 35% vs.26% and overall survival of 54% vs. 37%) remainlargely unexplained by known factors. Regard-ing the treatment outcome, the magnitude of theinterferon benefit was smaller in E1690 thanin E1684 for both relapse-free survival (hazardratio 1.43 vs. 1.28) and overall survival (haz-ard ratio 1.32 vs. 1.00). The larger event countsin E1690 resulted in narrower confidence inter-vals. As offered by the authors as one plausibleconclusion,13 the combined evidence from thesetwo trials seems to indicate that, for node posi-tive and thick primary, node-negative melanomapatients, treatment of high-dose interferon pro-longs relapse-free survival. Survival benefit, if itexists, may be more limited.

It is worth pointing out that E1690 wasdesigned with not one but two primary compar-isons, comparing high-dose interferon and low-dose interferon to observation (but not to each

other) with a one-sided p-value of 0.0125 foreach comparison to maintain an overall one-sidedtype I error rate of 0.025 for the study. Whenthe results were presented, however, one-sided p-values less than 0.025 were treated as statisticallysignificant for each comparison, representing astudy-wide, one-sided type I error rate of 0.05 ora two-sided error rate of 0.10. Also, per designthe study was sized so that the power for eachindividual comparison was 0.83. In other words,the type II error rate for each comparison was0.17 for an approximate study-wide type II errorrate of 0.34. Should the true magnitude of benefitfrom both interferon regimens be the same, thepower to detect both effects in the same studywas close to 0.66. With the inflated type I errorrate in the end, the overall power would increasesomewhat but would likely remain less than ade-quate for detecting reasonable effects from bothtreatment arms. Hence, the question about thelow-dose interferon regimen’s treatment effectwas essentially unanswered in this study, yet clin-icians seem to have uniformly concluded thatlow-dose interferon is inactive in E1690.

WHAT DOES E1694 TELL US?

E1694 was designed to detect a GMK vaccinebenefit over interferon as the contemporarytreatment standard. As is often practiced withsuperiority designs, the trial would be stoppedat planned interim analyses if the hypothesisedvaccine benefit could be definitively ruled out.This provision was incorporated in the studydesign in the following manner. Instead ofthe typical, highly stringent interim p-valuerequirements, the GMK vaccine needed onlyto be inferior to interferon at a fixed, one-sided p-value of 0.05 for relapse-free survivalin order to consider study termination at interimanalyses. Such evidence might not establish thevaccine inferiority but would certainly rule outits superiority.

Considering the substantially more favourablevaccine toxicity profile, a more appropriate trialdesign might have sought to demonstrate theequivalence of the two agents in their efficacy

Page 162: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

156 TEXTBOOK OF CLINICAL TRIALS

rather than the superiority of the vaccine. In factthe Data and Safety Monitoring Committee inthis case seemed to have followed the equiva-lence principle and disclosed the study resultsonly when there was decisive evidence that theGMK vaccine was inferior to high-dose inter-feron in both relapse-free survival (p1 = 0.0015)

and overall survival (p1 = 0.009).15 Because noobservation control arm was incorporated in thestudy design, the clinical interpretation of E1694in this respect is subject to debate. Obviously,if it were known that the GMK vaccine hadsome level of clinical efficacy, the finding thathigh-dose interferon was significantly better inboth disease-free and overall survival would beof great clinical significance and would substan-tiate the benefits identified in the initial E1684trial. Without this knowledge, some have main-tained the possibility of a deleterious vaccineeffect and insisted that the study cannot be usedto give information on the non-design comparisonof interferon versus observation.

Unfortunately, no credible evidence exists thatthe GMK vaccine is either beneficial or delete-rious. It is likely that the GMK vaccine actedessentially as placebo and the study providedfurther validation that high-dose interferon wasefficacious over no treatment in both relapse-freeand overall survival. But we do not know thisfor certain. As the dramatic survival differencebetween E1684 and E1690 observation patientsamply illustrates,13 comparison of patient out-comes in the GMK vaccine arm to historicalcontrols in the other two trials offers few cluesto the efficacy of the vaccine.

Data were presented that, among the vaccine-treated patients, those displaying antibody respon-ses had a trend towards favourable outcomesRef. 17. Even assuming that the analyses cor-rected for the inherent responder versus non-responder bias,20 the results still cannot be usedto establish a causal relationship between vaccineresponse and favourable outcome. As pointed outin numerous publications, response to treatmentcould simply serve as a selection mechanismwherein responders represented a better progno-sis group. One may contend that it is difficult to

reconcile a trend in favour of antibody respon-ders with speculations of a deleterious effect ofthe vaccine resulting from production of ‘block-ing’ antibodies. However, it is known that effectsof prognostic factors such as disease stage caneasily overwhelm any treatment effects.

DID ANY SUBSET OF PATIENTS BENEFITMORE FROM INTERFERON?

The predominant subcategories of high-riskmelanoma patients are those having thick primarytumours with clinically or pathologically nega-tive nodes and those having documented involve-ment of the nodes. Among the node-positivepatients, subsets include those with 1, 2 to 3 and≥4 nodes; patients with clinically evident ver-sus microscopic nodal involvement; and patientsfound to have nodal involvement at the time ofinitial presentation versus those developing recur-rent disease in the nodes.

The initial findings of E1684 indicated that thesubset of patients with thick primary tumours andpathologically negative nodes had no benefit, andperhaps even a detrimental effect, from adjuvantinterferon.11 The veracity of this finding wascalled into question from the outset, becauseof the small number of node-negative patients(a total of 31 out of 280 eligible patients, or11%) and an imbalance in a major prognosticfactor (ulceration of the primary tumour) biasingthe results in favour of the observation arm.In contrast, subset analysis of the results oftrial E1690 found that the relapse-free survivalbenefit for patients with thick primary tumoursand clinically negative nodes (making up 25% ofthe eligible patient population) was identical tothat for the study population as a whole.13 Subsetanalysis of E1694 showed the greatest interferonover vaccine benefit for the subset of thick, node-negative patients.15

Indeed, in each of the three clinical trials,subset analysis indicated a different group asobtaining the most benefit from high-dose inter-feron: the subset with one single positive nodein E1684; the subset with two to three positivenodes in E1690; and the node-negative subset

Page 163: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

MELANOMA 157

in E1694. The authors properly suggested that,taken together, there was no indication of prefer-ential treatment effect in any one subset.15 Theseresults exemplify the lack of reliability of sub-set results, a phenomenon previously discussed inregard to other melanoma clinical trials.18 With-out appropriate study size for adequate powerwithin subsets, and control for inflated typeerrors stemming from multiple testing, post hocsubset analyses suffer both high false-positiveand high false-negative rates.

CONCLUSIONS

Three randomised trials evaluating high-doseinterferon, involving over 1600 patients, havebeen conducted, yet its treatment benefit remainscontroversial. The combined evidence indicatesthat, for high-risk melanoma patients, treatmentof high-dose interferon prolongs relapse-freesurvival. Survival benefit is less certain. Thereis no credible evidence to suggest that interferonexerts a differential effect in different subsets of‘high-risk’ patients.

There are many reasons why high-dose inter-feron has not been uniformly embraced by physi-cians and patients around the world, even thoughit is the only adjuvant therapy yet shown to haveany sustained impact on relapse-free survival.When the three trials are looked at in the lightof statistical principles, what seem to be glaringdifferences are more plausibly regarded as under-standable variations reflecting trial design andanalysis, combined with the fluctuations inher-ent in human clinical trials conducted over timein similar yet subtly different patient populations.

While it is easy to conclude that furtherresearch is necessary to determine if high-doseinterferon α-2b improves overall survival, there isin fact little chance that definitive further researchwill take place. Only one current clinical trial,the Sunbelt Melanoma Trial, is comparing oneyear of high-dose interferon to a control group.This study includes only patients with a singlepositive sentinel node identified at the time ofinitial presentation.19 As such, it is comprised

of a far more homogeneous patient populationthan any prior clinical trial, potentially enhancingthe scientific validity. Of note, this group nowconstitutes by far the largest fraction of ‘high-risk’ melanoma patients being seen and treatedin the United States today, yet less than 10%of participants in the three prior trials combinedwere from this category. Unfortunately, this trialis likely to be small compared to the most recentIntergroup trials and, regardless of the results, itwill not directly address the role of interferon inall of the other high-risk categories.

It is now nearly 20 years since the designof clinical trial E1684, and 8 years since theFDA’s approval of high-dose interferon-α forthe adjuvant therapy of high-risk melanoma, andwe may never fully know to what extent thistoxic and inconvenient regimen improves overallsurvival. The implications of that statement areprofound, and the burden they place on clinicaltrialists is clear: design and analyse our trialscarefully to have the greatest probability of aclear and unambiguous result.

REFERENCES

1. Balch CM, Soong SJ, Gershenwald JE, et al. Pro-gnostic factors analysis of 17,600 melanomapatients: validation of the American Joint Com-mittee on Cancer melanoma staging system. J ClinOncol (2001) 19: 3622–34.

2. Veronesi U, Cascinelli N. Narrow excision (1-cm margin): a safe procedure for thin cutaneousmelanoma. Arch Surg (1991) 126: 438–41.

3. Balch CM, Soong S, Ross MI, et al. Long-termresults of a multi-institutional randomized trialcomparing prognostic factors and surgical resultsfor intermediate thickness melanomas (1.0 to4.0 mm). Intergroup Melanoma Surgical Trial.Ann Surg Oncol (2000) 7: 87–97.

4. Wagner JD, Schauwecker D, Davidson D, et al.Prospective study of fluorodeoxyglucose-positronemission tomography imaging of lymph nodebasins in melanoma patients undergoing sentinelnode biopsy. J Clin Oncol (1999) 17: 1508–15.

5. Balch CM. The role of elective lymph nodedissection in melanoma: rationale, results, andcontroversies. J Clin Oncol (1988) 6: 163–72.

6. Cascinelli N, Morabito A, Santinami M, MacKieRM, Belli F. Immediate or delayed dissection

Page 164: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

158 TEXTBOOK OF CLINICAL TRIALS

of regional nodes in patients with melanoma ofthe trunk: a randomised trial. WHO MelanomaProgramme. Lancet (1998) 351(9105): 793–6.

7. Morton D, Wen D, Wong J, et al. Technical detailsof intraoperative lymphatic mapping for earlystage melanoma. Arch Surg (1992) 127: 392–9.

8. Morton DL, Thompson JF, Essner R, et al. Vali-dation of the accuracy of intraoperative lymphaticmapping and sentinel lymphadenectomy for early-stage melanoma: a multicenter trial. Multicen-ter Selective Lymphadenectomy Trial Group. AnnSurg (1999) 230: 453–65.

9. Gershenwald JE, Thompson W, Mansfield PF,et al. Multi-institutional melanoma lymphaticmapping experience: the prognostic value ofsentinel lymph node status in 612 stage I orII melanoma patients. J Clin Oncol (1999) 17:976–83.

10. Sondak VK, Wolfe JA. Adjuvant therapy formelanoma. Curr Opin Oncol (1997) 9: 189–204.

11. Kirkwood JM, Strawderman MH, Ernstoff MS,et al. Interferon alfa-2b adjuvant therapy of high-risk resected cutaneous melanoma: the EasternCooperative Oncology Group trial EST 1684. JClin Oncol (1996) 14: 7–17.

12. Cole BF, Gelber RD, Kirkwood JM, et al. Qua-lity-of-life – adjusted survival analysis of inter-feron alfa-2b adjuvant treatment of high-riskresected cutaneous melanoma: an Eastern Cooper-ative Oncology Group study. J Clin Oncol (1996)14: 2666–73.

13. Kirkwood JM, Ibrahim JG, Sondak VK, et al.High- and low-dose interferon alfa-2b in high-riskmelanoma: first analysis of Intergroup trial E1690/S9111/C9190. J Clin Oncol (2000) 18: 2444–54.

14. Livingston PO, Wong GYC, Adluri S, et al. Im-proved survival in stage III melanoma patientswith GM2 antibodies: a randomized trial ofadjuvant vaccination with GM2 ganglioside. J ClinOncol (1994) 12: 1036–44.

15. Kirkwood JM, Ibrahim JG, Sosman JA, et al.High-dose interferon alfa-2b significantly prolongsrelapse-free and overall survival compared withthe GM2-KLH/QS-21 vaccine in patients withresected stage IIB–III melanoma: results ofIntergroup trial E1694/S9512/C509801. J ClinOncol (2001) 19: 2370–80.

16. Kilbridge KL, Weeks JC, Sober AJ, et al. Patientpreferences for adjuvant interferon alfa-2b treat-ment. J Clin Oncol (2001) 19: 812–23.

17. Kirkwood JM, Ibrahim J, Lawson DH, et al.High-dose interferon alfa-2b does not diminishantibody response to GM2 vaccination inpatients with resected melanoma: results of themulticenter Eastern Cooperative Oncology Groupphase II trial E2696. J Clin Oncol (2001) 19:1430–6.

18. Sondak VK. Multi-institutional melanoma vac-cine trial [Letter]. Ann Surg Oncol (1996) 3:588–9.

19. McMasters KM, Sondak VK, Lotze MT, RossMI. Recent advances in melanoma staging andtherapy. Ann Surg Oncol (1999) 6: 467–75.

20. Anderson JR, Cain KC, Gelber RD. Analysis ofsurvival by tumor response. J Clin Oncol (1983)1: 710–19.

21. Kirkwood JM, Manola J, Ibrahim J, et al. A pooledanalysis of ECOG and Intergroup trials of adjuvanthigh-dose interferon for melanoma. Accepted forpublication in Clinical Cancer Research (2004).

Page 165: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

11

Respiratory Cancers

JOAN H. SCHILLER1 AND KYUNGMANN KIM2

Departments of 1Medicine and 2Biostatistics and Medical Informatics, University of Wisconsin,Madison, WI 53792, USA

INTRODUCTION

Carcinoma of the lung and bronchus is estimatedto account for almost 13% of all new cancercases, but to be responsible for about 157 200deaths in the United States in 2003.1 Thisrepresents more than 28% of all deaths due tocancer, exceeding the number of deaths due to thenext four leading cancers, colon, breast, prostateand pancreas, all combined.

The incidence of the disease continues to rise,particularly in women and blacks, and thus is likelyto present a significant public health problem foryears to come. Cigarette smoking is attributed asthe cause of 80% to 90% of lung cancer cases, withthe risk for lung cancer among smokers being 20to 30 times that among non-smokers. Other riskfactors include exposure to asbestos and radon.Asbestos exposure, known to cause malignantmesothelioma, increases the risk for lung cancer,especially among smokers. There are limited dataon molecular and genetic profile as a risk factor,and familial predisposition to lung cancer. Diet’srole in lung cancer is even less obvious.

Despite the significant reduction in smoking,especially among the male population since the

late 1970s, the incidence of lung cancer is still ris-ing because of long latency and a steady increasein smoking among the female population.

With the litigation and subsequent settlementbetween the tobacco industry and the state gov-ernments in the US, the marketing effort of theUS tobacco industry has shifted to the emergingmarkets in Asia and Eastern Europe. As a conse-quence, the smoking-related public health problemis predicted to pose a serious threat to the nationalsecurity in a country such as China, in which smok-ing has increased dramatically in recent years.

The best investment for prevention of smoking-related cancer incidence and death, as well asother diseases such as cardiovascular and otherpulmonary diseases, appears to be in smokingcessation and prevention of taking up the smok-ing habit among teenagers and females.

CLASSIFICATIONS

Lung cancer consists of four major histologicaltypes: adenocarcinoma, squamous cell carcinoma,large-cell carcinoma and small-cell carcinoma.Because of the unique biological features of small-cell lung cancer (SCLC), its staging and treatment

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 166: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

160 TEXTBOOK OF CLINICAL TRIALS

differ radically from the other three types of lungcancer, which are collectively called non-small-cell lung cancer (NSCLC).

Besides histological classification, lung can-cer is also classified according to the Tumour,Node and Metastasis (TNM) staging and theInternational Staging Classification. The TNMstaging system is applied primarily to NSCLCand consists of three components, each accord-ing to primary tumour (T), nodal involvement(N) and distant metastasis (M) as summarisedin Table 11.1. The International Staging Classi-fication summarised in Table 11.2 is based on

the TNM staging.2 It is this classification thatforms the basis for management and treatment ofpatients with NSCLC.

A staging system entirely different from thatfor NSCLC is used for patients with small-cellcarcinoma of the lung. Small-cell lung canceris clinically categorised into two stages: limitedand extensive. Limited-stage SCLC is definedas tumours confined to one hemithorax and itsregional lymph nodes that can be encompassedin a tolerable irradiation field. Extensive-stageSCLC is defined as any extent of disease beyondthat classification.

Table 11.1. TNM staging

Primary tumour (T)

TX Tumour proven by the presence of malignant cells in bronchopulmonary secretions but not visualisedroentgenographically or bronchoscopically, or any tumour that cannot be assessed as in a retreatmentstaging

T0 No evidence of primary tumourTis Carcinoma in situT1 A tumour that is 3.0 cm or less in greatest dimension, surrounded by lung or visceral pleura, and without

evidence of invasion proximal to a lobar bronchus at bronchoscopyT2 A tumour more than 3.0 cm in greatest dimension, or a tumour of any size that either invades the visceral

pleura or has associated atelectasis or obstructive pneumonitis extending to the hilar region. Atbronchoscopy, the proximal extent of demonstrable tumour must be within a lobar bronchus or at least2.0 cm distal to the carina. Any associated atelectasis or obstructive pneumonitis must involve less thanan entire lung

T3 A tumour of any size with direct extension into the chest wall (including superior sulcus tumours),diaphragm, or the mediastinal pleura or pericardium without involving the heart, great vessels, trachea,oesophagus or vertebral body, or a tumour in the main bronchus within 2 cm of the carina withoutinvolving the carina

T4 A tumour of any size with invasion of the mediastinum or involving the heart, great vessels, trachea,oesophagus, vertebral body for carina or presence of malignant pleural effusion; a satellite nodulewithin the same lobe

Nodal involvement (N)

NX Regional lymph nodes cannot be assessedN0 No demonstrable metastasis to regional lymph nodesN1 Metastasis to lymph nodes in the peribronchial or the ipsilateral hilar region, or both, including direct

extensionN2 Metastasis to ipsilateral mediastinal lymph nodes and subcarinal lymph nodesN3 Metastasis to contralateral mediastinal lymph nodes, contralateral hilar lymph nodes, ipsilateral or

contralateral scalene or supraclavicular lymph nodes

Distant metastasis (M)

M0 No distant metastasisM1 Distant metastasis, including pulmonary nodule not in the same lobe as the primary tumour

Page 167: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY CANCERS 161

Table 11.2. International Staging Classification forlung cancer

Five-year survival (%)

Stage TNM subsetClinical

stagePathological

stage

IA T1, N0, M0 61 67IB T2, N0, M0 38 57IIA T1, N1, M0 34 55IIB T2, N1, M0; T3, N0,

M024 39

IIIA T3, N1, M0; T1–3,N2, M0

9 25

IIIB T4, any N, M0; anyT, N3, M0

13 23

IV Any T, any N, M1 1 –

INCIDENCE

In the year 2002, 1 284 900 new cases of invasivecancer were expected in the United States,excluding carcinoma in situ of any site exceptthe urinary bladder and also excluding basal andsquamous cell cancers of the skin. Lung canceris estimated to account for 13% (169 400 cases)of all new cancer cases, 14% (90 200) in malesand 12% (79 200) in females.

The annual age-adjusted incidence rate of lungcancer in the male population has been in asteady decline since its peak in the early 1980s.However, that in the female population appearsto be still increasing, although the rate of increasehas slowed in the late 1990s.

PROGNOSIS

Prognosis for patients diagnosed with lung canceris dismal, with less than 15% surviving longerthan 5 years from the time of diagnosis, and itis highly dependent on stage of the disease asindicated in Table 11.2.

SALIENT FEATURES OF SMALL-CELLLUNG CANCER

Small-cell lung cancer, which makes up a quarterto a third of all lung cancer at diagnosis, differsfrom NSCLC in a number of important ways.

First, it has a more rapid clinical course andnatural history, with the rapid development ofmetastases, symptoms and eventually death. Leftuntreated, the median survival time is typically12–15 weeks for patients with local disease and6–9 weeks for those with advanced disease.Second, it exhibits features of neuroendocrinedifferentiation in many patients, which maybe distinguishable histopathologically and isassociated with paraneoplastic syndromes. Third,unlike NSCLC, SCLC is exquisitely sensitive toboth chemotherapy and radiotherapy, althoughresistant disease often develops.

Due to sensitivity of patients with SCLC tochemotherapy, it can pose challenges in designof clinical trials for drug development as will bediscussed in some detail.

CLINICAL TRIALS IN LUNG CANCER

Clinical trials have resulted in significant seminaltrials which have led to changes in the man-agement of these patients. Those seminal studiesin screening, chemoprevention and treatment areoutlined.

SCREENING AND EARLY DETECTION

Three US randomised screening studies failed todetect an impact of screening high-risk patientswith chest radiographs or sputum cytology onmortality, although earlier stage cancers weredetected in the screened groups.3 – 5 These studieshave been criticised for a number of potentialmethodological and statistical problems, such asover-diagnosis and analysing data by survivalrather than mortality.6

Recently, several clinical studies have demon-strated that early stage lung cancers can bedetected with the use of spiral CT that wouldnot have been detected by routine chest X-ray.7

Spiral CT is a CT scan which does not evaluatethe mediastinum and thus does not use contrast orrequire the presence of a radiologist, employs lowdoses of radiation and can be completed withinone patient ‘breath’. Because it can be donerapidly and does not require a radiologist to be

Page 168: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

162 TEXTBOOK OF CLINICAL TRIALS

present, it is being used in some centres to screenfor lung cancers in high-risk populations. How-ever, it has not been determined whether thereis a survival benefit with this technique.6,7 Giventhe availability of this scanning technique in thecommunity, it is imperative that clinical trials becompleted to determine if the early detection ofsmall tumours results in improved survival thatis not a result of lead time or length bias.

TREATMENT: NON-SMALL-CELLLUNG CANCER

Treatment of NSCLC is dependent primarily onstage of disease at the time of diagnosis and stage,in turn, is dependent upon the size of the tumour(T), location of nodal involvement (N), if any, andpresence or absence of distant metastases (M). Thecurrent TNM staging classification is shown inTable 11.1 and the stage grouping in Table 11.2.

Stage I Disease

A lobectomy is the treatment of choice for stage INSCLC, with cure rates of 60–80% reported.Within stage I, patients with T2, N0 disease donot fare as well as those with T1, N0 cancers.In approximately 20% of patients with medicalcontraindications to surgery but with adequatepulmonary function, high-dose radiotherapy willresult in cure. No role of adjuvant chemotherapyfor stage I NSCLC has been identified.

Chemoprevention: Patients with a resected stageI NSCLC are at high risk of approximately 1%per year for the development of second lungcancers, prompting a number of ongoing clini-cal trials looking at the role of chemoprevention.Surprisingly, several randomised studies havedemonstrated that the use of vitamin A or oneof its derivatives at best, does not prevent lungcancer in smokers and at worst, may increasethe risk of developing it.8 – 10 Preliminary stud-ies have suggested that selenium may reduce theincidence of lung cancer and total cancer mortal-ity. In a multi-centre, double-blind, randomised,placebo-controlled trial, 1312 patients were ran-domised to receive either selenium or placebo.

The selenium group had fewer total carcino-mas, including lung cancer with a relative riskof 0.54 and a 95% confidence interval of (0.30to 0.98) (p = 0.04).11 This has formed the basisfor an intergroup chemoprevention trial which isnow ongoing.

Stage II and ‘Non-Bulky’ IIIA Disease

Treatment of locally advanced NSCLC is oneof the most controversial issues in the manage-ment of lung cancer. Treatment options includesurgery for less-advanced disease, or radiother-apy, either of which has been given with orwithout chemotherapy for control of micrometas-tases. Interpretation of the results of clinical tri-als involving patients with locally advanced dis-ease has been clouded by a number of issues,including changing diagnostic techniques, differ-ent staging systems and heterogeneous patientpopulations that may have disease that rangesfrom ‘non-bulky’ stage IIIA (clinical N1 nodes,with N2 nodes discovered only at the time ofsurgery or mediastinoscopy), to ‘bulky’ N2 nodes(enlarged adenopathy clearly visible on chestX-ray films, or multiple nodal level involvement),to clearly inoperable stage IIIB disease.

Post-operative Thoracic Radiotherapy: Thetreatment for stage II and selected IIIA NSCLCpatients is surgical resection. However, manyof these patients will relapse, prompting numer-ous trials evaluating the role of post-operativeradiotherapy or chemotherapy. A meta-analysisexamining the role of post-operative radiother-apy (PORT) found that patients randomised toreceive PORT actually had an inferior survivalto those randomised to observation alone.12 Ina meta-analysis of 2128 patients in nine clini-cal trials of post-operative radiotherapy, a 7%survival decrement from radiation was identi-fied. However, this particular analysis includeda number of trials from the 1960s and 1970swhen staging was highly inaccurate and relativelyoutmoded radiation therapy technologies wereutilised. In addition, several of the trials includedin this report aggressively treated patients with

Page 169: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY CANCERS 163

no evidence of nodal involvement or those withearly nodal involvement only, a group that bytoday’s standards would not be subjected to post-operative radiation therapy. More recent studieslooking at the role of PORT have concludedthat PORT does not prolong survival, but doesenhance local control. The most comprehensiverandomised trial in this regard was performedby the Lung Cancer Study Group and it demon-strated major improvement in intrathoracic dis-ease control.13 For those patients receiving tho-racic radiotherapy, the intrathoracic failure ratewas only 3%, compared to 43% for patients notreceiving post-operative radiotherapy, althoughno significant survival advantage was identified.

Adjuvant Chemotherapy: Given the propensityof these resected patients to relapse with distantdisease, adjuvant post-operative chemotherapyhas been of significant interest. A meta-analysispublished in 1995 found a small improvement insurvival with post-operative adjuvant chemother-apy that borderlined on statistical significance(p = 0.08),14 leading some clinicians to concludethat adjuvant chemotherapy was of benefit. How-ever, a randomised intergroup study has beencompleted in which patients were randomisedto receive either radiotherapy plus chemotherapy(cisplatin and etoposide for four cycles) or radio-therapy alone. The median and long-term survivalof the two arms was nearly identical.15 Onceagain, the fact that the meta-analysis includedolder studies, in which chemotherapy regimens,staging and other clinical characteristics weredifferent, may account for this discrepancy.Although the role of neoadjuvant chemotherapyis under investigation, it cannot be routinely rec-ommended until the results of randomised clinicaltrials confirm clinical benefit.

Pre-operative Chemotherapy plus Surgery:There have been two small randomised stud-ies involving surgery with or without pre-operative chemotherapy which popularised thisapproach. Both involved 60 patients and bothreport response rates of 35–62% following induc-tion chemotherapy. Both have also reported pro-longed survival, prompting early closure of both

trials. In the European trial, the median survivaltime was 26 months for patients receiving pre-operative chemotherapy plus surgery, comparedto 8 months for patients treated with surgeryalone.16 In the MD Anderson trial, the mediansurvival of the 32 patients randomised to thesurgery-alone group was 11 months comparedto 64 months in the 28 patients randomised tothe combined-modality arm.17 Of note, how-ever, is the fact that updated results of the MDAnderson trial, while still statistically significant,showed a narrowing of the survival curves, with amedian survival of 14 months and 21 months forthe surgery alone and combined modality arms,respectively.18

A larger trial has recently been reported.19

Three hundred and fifty-five patients with stage I,II or IIIA disease were randomised to threecycles of chemotherapy followed by surgery orto surgery alone. Median survival (37 monthsvs. 26 months) and 2-year survival (52% vs.59%) were not statistically different betweenthe two groups. However, a subset analysis inwhich patients who died within 150 days of peri-operative problems were excluded revealed a0.77 reduction in risk which was statistically sig-nificant (p = 0.03). Other subset analysis lookedat outcome by patient stage and found thatthe patients with N0/N1 disease who receivedchemo/surgery had a hazard ratio of 0.68, com-pared to patients with N2 disease, where the haz-ard ratio was 1.04.

Despite the results of the Depierre trial,many clinicians continue to use pre-operativechemotherapy for patients with stage IIIA dis-ease. An intergroup study evaluating chemo/RTvs. chemo/RT surgery has recently been com-pleted; these results are eagerly awaited.

Locally Advanced ‘Bulky’Stage IIIA/IIIB Disease

The optimal treatment for bulky stage IIIA andstage IIIB disease is also controversial. Currentinvestigational efforts are directed at identify-ing the optimal combined-modality approach,involving treatments directed at local control

Page 170: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

164 TEXTBOOK OF CLINICAL TRIALS

of the disease, i.e., surgery or radiotherapy,and micrometastatic disease, i.e., chemother-apy. Possibilities include radiotherapy only, pre-operative chemotherapy, or chemotherapy plusradiotherapy.

Chemotherapy plus Radiation Therapy: Chemo-therapy plus radiotherapy is the treatment ofchoice for patients with bulky or inopera-ble stage III disease. Two randomised studieshave demonstrated an improvement in medianand long-term survival with chemotherapy fol-lowed by radiation therapy versus radiotherapyalone.20,21 More recently, two randomised trialshave shown that concurrent chemoradiotherapyresults in prolonged survival, albeit at the expenseof enhanced toxicity, compared to sequentialtreatment.22,23 Other active areas of investiga-tion include choice of chemotherapy, fractiona-tion and treatment fields.

Recently, weekly, low-dose ‘sensitising’ che-motherapy plus radiation therapy has becomepopular, primarily due to lower toxicities whenadministered with radiotherapy than ‘standard’dose chemotherapy.24 However, this schedule hasnever been looked at in a formal phase III setting,so its relative efficacy compared to standard dosechemotherapy has not been rigorously assessed.

Stage IV Disease

Several meta-analyses have demonstrated thatchemotherapy improves survival in patients withmetastatic NSCLC (approximately 10% 1-yearsurvival untreated vs. 35–40% 1-year sur-vival with treatment),25,26 particularly if thechemotherapy is platin-based.14 In the past10 years, numerous different cytotoxic drugshave become available for the treatment oflung cancer patients. These include, amongothers, vinorelbine, the taxanes (docetaxel andpaclitaxel), gemcitabine and the topoisomeraseI inhibitors (irinotecan and topotecan). Ran-domised studies have shown that these agentsimprove survival when combined with cisplatin,as compared to cisplatin alone,27,28 or the otheragent alone.29,30 However, there is probably lit-tle difference in outcome between agents when

combined with cisplatin, although there are cleardifferences in toxicity and cost.31,32

Second-Line Chemotherapy

Docetaxel was recently approved for the second-line treatment of NSCLC, based upon twoclinical trials. One trial compared two doses ofdocetaxel with best supportive care, and found animprovement in median and long-term survival,despite a low response rate of 7%.33 The othertrial compared docetaxel to either vinorelbine orifosfamide (the treatment physician was allowedto choose) and found an improvement in long-term, although not median survival.34

‘Targeted’ Therapy

Given the overall poor results with standard cyto-toxic therapies and the number of advances thathave been made recently in our understandingof the biology of cancer, a strong interest hasemerged in targeting pathways unique to neo-plastic cells. One such example is the epidermalgrowth factor receptor (EGFr), which has beenfound to be expressed in the majority of patientswith lung cancer. Based upon two phase II tri-als in previously treated NSCLC patients, inwhich response rates of 10–20% were found,35,36

two phase III trials were initiated comparingchemotherapy plus an EGFr inhibitor, ZD1839,with chemotherapy in untreated NSCLC. Some-what surprisingly, no benefit was observed inthese trials.37,38 These unexpected findings haveresulted in clinical researchers, statisticians andthe pharmaceutical industry re-aiming the princi-ples of study design.

TREATMENT: SMALL-CELL LUNG CANCER

Small-cell lung cancer differs from NSCLC ina number of important ways: (1) it has a morerapid clinical course and natural history, with therapid development of metastases, symptoms anddeath; (2) it exhibits features of neuroendocrinedifferentiation in many patients which maybe distinguishable histopathologically and is

Page 171: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY CANCERS 165

associated with paraneoplastic syndromes; and(3) unlike NSCLC, SCLC is exquisitely sensitiveto both chemotherapy and radiotherapy, althoughresistant disease often develops. Because of therapid development of distant disease and itsextreme sensitivity to the cytotoxic effects ofchemotherapy, this mode of therapy forms thebackbone of treatment for this disease.

First-Line Therapy

A number of combination chemotherapeutic reg-imens are available for SCLC. With thesechemotherapy regimens, overall response rates of75–90% and complete response rates of 50%for localised disease can be anticipated. Forextensive-stage disease, overall response ratesof about 75% with complete response rates of25% are common. Despite these high responserates, however, the median survival time remainsabout 14 months for limited-stage disease and7–9 months for extensive-stage disease. Lessthan 5% of extensive-stage patients have long-term survival of greater than 2 years.

A phase III randomised trial has been reportedin abstract form, in which patients with SCLCwere randomised to the control arm of etoposideand cisplatin, versus cisplatin and the topoiso-merase I inhibitor, irinotecan.39 Median survivaland 1-year survival was 420 days and 60% inthe cisplatin/irinotecan arm and 300 days and40% in the cisplatin/etoposide arm. If ongo-ing phase III studies confirm these results, cis-platin/irinotecan would become the first combi-nation of chemotherapy to improve survival overcisplatin/etoposide in SCLC patients in decades.

Second-Line Therapy

No curative regimens for patients with recur-rent disease have been identified. Topotecan hasa 20–40% response rate in patients with ‘sen-sitive’ SCLC, those patients who relapsed twoor more months after their first-line therapy,with a median survival of 22–27 weeks. Forpatients with ‘refractory’ disease which pro-gressed through or within 3 months of comple-tion of first-line therapy, the response rate in

phase II studies is only between 3% and 11%.Median survival is about 20 weeks.40 Results of arandomised trial comparing topotecan with CAV(cyclophosphamide, adriamycin and vincristine)as second-line therapy revealed no difference inresponse rates, duration of response, or survivalbetween the two groups.41

Chemotherapy plus Chest Irradiation

Numerous studies have been done with chemo-therapy and thoracic radiotherapy for patientswith limited-stage SCLC. Conflicting results havebeen attributed to differences in chemother-apy regimens and different schedules integrat-ing chemotherapy and thoracic radiation, con-current, sequential and ‘sandwich’ approaches.Two recent meta-analyses concluded that thoracicradiation does result in a small but significantimprovement in survival and major control ofthe disease in the chest, although no conclusionscould be made regarding the optimal sequencingof chemotherapy and thoracic radiation.25,42

Fractionation of Radiotherapy: For limited-stageSCLC, thoracic radiotherapy has been knownto improve survival, but the best ways of inte-grating chemotherapy and thoracic radiotherapyare uncertain. In order to settle this question, aphase III randomised clinical trial was conductedin which 417 patients with limited SCLC wererandomised to receive a total of 45 Gy of radio-therapy, either twice-daily over a 3-week period oronce-daily over a 5-week period, concurrently withfour 21-day cycles of cisplatin plus etoposide.43

Twice-daily radiotherapy improved median sur-vival as compared with once-daily radiotherapy(23 months vs. 19 months, p = 0.04). However,grade 3 or 4 oesophagitis was significantly morefrequent with twice-daily than with once-dailyfractionation (32% vs. 16%, p < 0.001).

Prophylactic Cranial Irradiation

Numerous trials have demonstrated that prophy-lactic brain irradiation (PCI) does not enhancesurvival, but does decrease the risk of brain

Page 172: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

166 TEXTBOOK OF CLINICAL TRIALS

metastases without a decrease in mental func-tion.44 However, a recent meta-analysis demon-strated a small but statistically and clinicallysignificant improvement in survival with PCI.45

METHODOLOGIC ISSUES

With the traditional cytoreductive and cytotoxicchemotherapy, both as single agents and in com-bination, there are well-established and accepteddesigns for phases I, II and III clinical tri-als. In general these designs are based on theparadigm that with the increased myelosuppres-sion, tumour cells are more likely to be killed,leading to shrinkage of tumours, and that thereis a monotonically increasing dose–response anddose–toxicity relationship. It is also assumed thattumour shrinkage will eventually lead to clinicalbenefit such as prolonged survival or improvedquality of life. In essence, tumour shrinkage hasserved as a surrogate for clinical benefit.

PHASE I CLINICAL TRIALS

In typical phase I clinical trials with acute dose-limiting toxicities as the primary endpoint, astandard dose-escalation scheme with a cohortof fixed number of patients treated at each doselevel is used to estimate the so-called maximumtolerated dose (MTD) or safe dose46,47 to beused in subsequent phase II studies. However,the choice of the initial dose and dose levelshave been rather ad hoc. Worse yet, the standarddose-escalation design does not provide a well-defined basis for estimation of the MTD andis known to have several shortcomings suchas slow dose escalation at the beginning andunderestimation of the targeted dose-limitingtoxicity.48 Most critically, the standard designdoes not provide an estimate of the probabilityof toxicity at the recommended dose level.Nevertheless this standard dose-escalation designfor phase I clinical trials has served a usefulfunction in this setting.

In order to avoid slow dose-escalation andunderestimation, a number of variations on the

standard methods have been proposed.49,50 Thesemethods typically include initial dose-escalationin a single patient and a later switch-over tostandard dose-escalation at the earliest indicationof dose-limiting toxicity, and may also includeintra-patient dose-escalation.50

In order to address the failure of the standarddose-escalation designs to provide an estimateof the probability of toxicity at the recom-mended dose level, several new approacheshave been proposed, including the continualreassessment method.51 This method is primar-ily based on Bayesian statistical modelling of thedose–toxicity relationship with a targeted toxicityprobability for the MTD. With radiotherapy con-cerned with late-onset toxicities as the primaryendpoint, the standard dose-escalation design forphase I clinical trials is inadequate because of thelong-term follow-up required for late-onset toxic-ities associated with radiotherapy. With late-onsettoxicities, the continual reassessment method hasbeen extended by statistical models for the distri-bution for time to toxicity and by pooling toxicityinformation across patients receiving the samedose level.52

PHASE II CLINICAL TRIALS

In phase II clinical trials with cytotoxic chemo-therapy, multi-stage designs with objective tumourresponse defined as shrinkage of tumour by morethan 50% as the primary endpoint are widelyused.53 – 55 These are essentially sequential designsin the sense that a decision to treat additionalpatients for establishment of clinical efficacy ispredicated by the observed clinical efficacy orsafety with the patients from the previous stages.This is primarily to avoid treating patients withseemingly ineffective therapy.

Typically these designs are based on testsof statistical hypotheses with specific minimallyacceptable and maximally unacceptable tumourresponse rates associated with type I and IIerror probabilities. Given these design parame-ters available from historical data, there are manycandidate designs. In order to select a design,one may use either the minimax or the opti-mality criterion.56 Subject to type I and II error

Page 173: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY CANCERS 167

probabilities, the minimax designs minimised themaximum sample size required, while the opti-mal designs minimised the expected sample sizeunder the null hypothesis that the true responserate is less than or equal to the maximallyunacceptable response rate. Oftentimes they arevery disparate, causing confusions to those notso statistically sophisticated. A graphical searchmethod may be used to search for what appearsto be a compromise between the minimax andthe optimal designs with more desirable practi-cal features such as having much smaller max-imum sample size than the optimal design andmuch smaller expected sample size than the min-imax design.57

PHASE III CLINICAL TRIALS

Overall survival typically being the ultimatecriterion for evaluation of the efficacy of cancertreatment in phase III clinical trials, a traditionalrandomised, controlled design with time to deathdue to all causes as the primary endpoint hasbecome recognised as a golden standard forestablishment of standard therapies in cancer.However, depending on the disease setting, otherendpoints such as time to disease progression,time to treatment failure, etc. may be appropriateas a surrogate endpoint despite the problemsassociated with the surrogate endpoint.58

It has been argued that the traditional way ofmoving to phase III trials based on the resultsof phase II trials perhaps was the cause of fail-ure of many experimental therapies including therecent failure of experimental therapies includingnovel targeted therapies such as matrix metallo-proteinase inhibitors and epidermal growth factorreceptor inhibitors.59 One approach is to com-bine phase II and III trials into two-stage designsinvolving selection and testing based on accept-able primary endpoints or in combination withauxiliary endpoints for phase III trials.60,61 Asequential Bayesian phase II/III design has beenproposed for a non-small-cell lung cancer involv-ing an adjuvant adenovirus for p53.62 In thisBayesian design, local control of unresectablestage II or III NSCLC and overall survival are

considered simultaneously in a parametric mix-ture model.

SMALL-CELL LUNG CANCER

As was noted earlier, small-cell lung cancer isknown to be biologically distinct from other his-tologic subtypes of lung cancer in both laboratoryand clinical studies. It is the most chemosensitivetype of lung cancer and as a consequence it posessome difficulties in development of investiga-tional cytotoxic drugs. For example, there is eth-ical concern for testing investigational cytotoxicdrugs in previously untreated small-cell lung can-cer patients.

As a result of these observations, it hasbeen suggested that different phase II designsbe used depending on whether patients hadbeen previously treated with cytotoxic drugs orhave relapsed following treatment with cytotoxicdrugs.63 Also depending on whether patients arerefractory to or have relapsed during previoustreatment, different values for minimally accept-able and maximally unacceptable response ratesshould be used in phase II clinical trials. Differentconsiderations should be given to elderly patientsor patients with poor prognosis as well.

TARGETED THERAPY AND CYTOSTATICDRUGS

Advances in molecular biology and cancer genet-ics coupled with biotechnology are bringing fortha number of new novel agents which appearto target molecular pathways such as cancerinitiation, angiogenesis, invasion or metastasis.Examples include antiangiogenesis agents, epi-dermal growth factor receptor inhibitors, pro-tein kinase inhibitors, matrix metalloproteinaseinhibitors and other so-called molecular targetedtherapies. These new agents are not expectedto shrink tumours. Instead they are expected toinhibit tumour growth or prevent metastasis asthey have demonstrated in a number of animalmodels. With the emergence of these differentclasses of agents with entirely different mode of

Page 174: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

168 TEXTBOOK OF CLINICAL TRIALS

action and expected therapeutic effects, the tradi-tional designs for phase I, II and III clinical trialsappear no longer adequate.64 – 66

With these cytostatic agents, it is unclearwhether there is a clear dose–toxicity anddose–response relationship to help guide usin determining the most appropriate dose forphase II and III clinical trials. Indeed theparadigm for dose-escalation designs for cyto-toxic agents for phase I clinical trials appears nolonger relevant as acute toxicities may not bemeaningful with such agents.

This obviously calls for new methods for esti-mating a safe, but effective dose in phase I clini-cal trials. In such a setting with cytostatic drugs,it was suggested that a biological endpoint otherthan toxicity be used in phase I trials to define thedose for subsequent phase II trials.64 For phase IIpreliminary efficacy screening trials, single-armdesigns can be used in which comparisons aremade with historical control data. Sequentiallymeasured times to disease progression withineach patient who have failed previous treatmentmay be used in phase II designs where statisti-cal hypotheses regarding a hazard ratio of timesto disease progression before and after treatmentwith cytostatic drugs can be tested.65 Consideringthe heterogeneity of cancer, one may wish to dis-tinguish antiproliferative activity attributable tocytostatic drugs from less aggressive disease inphase II screening trials. In that setting, one mayuse a randomised discontinuation design in whichall patients are treated initially with the cytostaticdrug and only those whose disease is stable arerandomised in a double-blind fashion to the samecytostatic drug, active vs. placebo.66

As illustrated above, these new classes ofdrugs will challenge the existing paradigm fordesign, conduct and analysis of phase I, II andIII clinical trials in cancer. These challenges arecertainly not unique to lung cancer clinical trials.Clinical investigators and statisticians need towork more closely to address these challengesin developing most efficient and relevant clinicaltrial designs to help advance the care of patientswith lung cancer.

REFERENCES

1. Jemal A, Murray T, Samuels A, Ghafoor A,Ward E, Thun MJ. Cancer statistics, 2003. CACancer J Clin (2003) 53: 5–26.

2. Mountain CF. Revisions in the international sys-tem for staging of lung cancer. Chest (1997) 111:1710–17.

3. Frost JK, Ball WC, Levin M, Tockman MS, BakerRR, Carter D, Eggelston JC, Erozan YS, GuptaPK, Khouri NF, Marsh BR, Stitik FP. Early lungcancer detection: results of the initial (prevalence)radiologic and cytologic screening in the JohnsHopkins study. Am Rev Respir Dis (1984) 130:549–54.

4. Flehinger BJ, Melamed MR, Zaman MB, Hee-lan RT, Perchick WB, Martini N. Early lung can-cer detection: results of the initial (prevalence)radiologic and cytologic screening in the Memo-rial Sloan–Kettering study. Am Rev Respir Dis(1984) 130: 555–60.

5. Fontana RS, Sanderson DR, Taylor WF, Wool-ner LB, Miller WE, Muhm JR, Uhlenhopp MA.Early lung cancer detection: results of the initial(prevalence) radiologic and cytologic screening inthe Mayo Clinic Study. Am Rev Respir Dis (1984)130: 561–5.

6. Patz EF, Goodman PC, Bepler G. Screening forlung cancer. N Engl J Med (2000) 343: 1627–33.

7. Henschke CI, Naidich DP, Yankelevitz DF, Mc-Guinness G, McCauley DI, Smith JP, Libby DM,Pasmantier MW, Vazquez M, Koizumi J,Flieder D, Altorki NK, Miettinen OS. Early LungCancer Action Project: initial findings on repeatscreening. Cancer (2001) 92: 153–9.

8. van Zandwijk N, Dalesio O, Pastorino U, deVries N, van Tinteren H. Euroscan, a randomizedtrial of vitamin A and N-acetylcysteine in patientswith head & neck cancer or lung cancer. J NatlCancer Inst (2000) 92: 977–86.

9. Omenn GS, Goodman GE, Thornquist MD,Balmes J, Cullen MR, Glass A, Keogh JP, Mey-skens FL, Valanis B, Williams JH, Barnhart S,Hammar S. Effects of a combination of betacarotene and vitamin A on lung cancer andcardiovascular disease. N Engl J Med (1996) 334:1150–5.

10. The Alpha-Tocopherol, Beta Carotene CancerPrevention Study Group. The effect of vitamin Eand beta carotene on the incidence of lung cancerand other cancers in male smokers. N Engl J Med(1994) 330: 1029–35.

11. Clark LC, Combs GF, Turnbull BW, Slate EH,Chalker DK, Chow J, Davis LS, Glover RA, Gra-ham GF, Gross EG, Krongrad A, Lesher JL, ParkHK, Sanders BB, Smith CL, Taylor JR. Effects of

Page 175: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY CANCERS 169

selenium supplementation for cancer prevention inpatients with carcinoma of the skin: a random-ized controlled trial. J Am Med Assoc (1996) 276:1957–63.

12. PORT Meta-analysis Trialists Group. Postopera-tive radiotherapy in non-small-cell lung cancer:systematic review and meta-analysis of individ-ual patient data from nine randomised controlledtrials. Lancet (1998) 352: 257–63.

13. The Lung Cancer Study Group. Effects of post-operative mediastinal radiation on completelyresected stage II and stage III epidermoid cancerof the lung. N Engl J Med (1986) 315: 1377–81.

14. Non-small Cell Lung Cancer Collaborative Group.Chemotherapy in non-small cell lung cancer: ameta-analysis using updated data on individualpatients from 52 randomised clinical trials. Br MedJ (1995) 311: 899–909.

15. Keller SM, Adak S, Wagner H, Herskovic A,Komaki R, Brooks BJ, Perry MC, Livingston RB,Johnson DH for the Eastern Cooperative Oncol-ogy Group. A randomized trial of postopera-tive adjuvant therapy in patients with completelyresected stage II or IIIA non-small cell lung can-cer. N Engl J Med (2000) 343: 1217–22.

16. Rosell R, Gomez-Codina J, Camps C, Maestre J,Padille J, Canto A, Mate JL, Li S, Roig J, Olaz-abal A, Canela M, Ariza A, Skacel Z, Morera-Prat J, Abad A. A randomized trial comparingpreoperative chemotherapy plus surgery withsurgery alone in patients with non-small-cell lungcancer. N Engl J Med (1994) 330: 153–8.

17. Roth JA, Fossella F, Komaki R, Ryan MB, Put-nam JB, Lee JS, Dhingra H, DeCaro L, Chasen M,McGavran M, Atkinson EN, Hong WK. A ran-domized trial comparing perioperative chemother-apy and surgery with surgery alone in resectablestage IIIA non-small-cell lung cancer. J Natl Can-cer Inst (1994) 86: 673–80.

18. Roth JA, Atkinson EN, Fossella F, Komaki R,Ryan MB, Putnam JB, Lee JS, Dhingra H, De-Caro L, Chasen M, Hong WK. Long-term follow-up of patients enrolled in a randomized trialcomparing perioperative chemotherapy and sur-gery alone in resectable stage IIIA non-small-celllung cancer. Lung Cancer (1998) 21: 1–6.

19. Depierre A, Milleron B, Moro-Sibilot, Chevret S,Quoix E, Lebeau B, Braun D, Breton J-L, Le-marie E, Gouva S, Paillot N, Brechot J-M, Jan-icot H, Lebas F-X, Terrioux P, Clavier J,Foucher P, Monchatre M, Coetmeur D, Level M-C, Leclerc P, Blanchon F, Rodier J-M, Thiber-ville L, Villeneuve A, Westeel V, Chastang C.Preoperative chemotherapy followed by surgerycompared with primary surgery in resectablestage I (except T1N0), II, and IIIa non-small celllung cancer. J Clin Oncol (2002) 20: 247–53.

20. Dillman RO, Herndon J, Seagren SL, Eaton WL,Green MR. Improved survival in stage III non-small cell lung cancer: seven-year follow-up ofcancer and leukemia group B (CALGB) 8433. JNatl Cancer Inst (1996) 88: 1210–15.

21. Sause W, Kolesar P, Taylor S, Johnson D, Liv-ingston R, Komaki R, Emami E, Curran W,Byhardt R, Fisher B. Final results of phase III trialin regionally advanced unresectable non-small celllung cancer: Radiation Therapy Oncology Group(RTOG) 88-08 and Eastern Cooperative OncologyGroup (ECOG) 4588. Chest (2000) 117: 358–64.

22. Furuse K, Fukuoka M, Kawahara M, Nishikawa H,Takada Y, Kudoh S, Katagami N, Ariyoshi Y.Phase III study of concurrent versus sequentialthoracic radiotherapy in combination with mit-omycin, vindesine, and cisplatin in unresectablestage III non-small-cell lung cancer. J Clin Oncol(1999) 17: 2692–9.

23. Curran W, Scott C, Langer C, Komaki R, Lee J,Hauser S, Movsas B, Wasserman T, Russell A,Byhardt R, Sause W, Cox J. Phase III comparisonof sequential vs. concurrent chemoradiation forpatients with unresected stage III non-small celllung cancer (NSCLC): initial report of the Radia-tion Therapy Oncology Group (RTOG) 9410. ProcAm Soc Clin Oncol (2000) 18: 484a.

24. Choy H, Akerley W, Safran H, Graziano S,Chung C, Williams T, Cole B, Kennedy T.Multiinstitutional phase II trial of paclitaxel,carboplatin, and concurrent radiation therapy forlocally advanced non-small-cell lung cancer. JClin Oncol (1998) 16: 3316–22.

25. Soquet PJ, Chauvin F, Boissel JP, Cellerino R,Cormier Y, Ganz PA, Kaasa S, Pater JL, Quoix E,Rapp E, Tunarello D, Williams J, Woods BL,Bernard JP. Polychemotherapy in advanced non-small cell lung cancer: a meta-analysis. Lancet(1993) 342: 19–21.

26. Marino P, Pampallona S, Preatoni A, Cantoni A,Inveinezzi F. Chemotherapy vs. supportive care inadvanced non-small cell lung cancer: results of ameta-analysis of the literature. Chest (1994) 106:861–5.

27. Sandler A, Nemunaitis J, Denham C, von Pawel J,Cormier Y, Gatzemeier U, Mattson K, Mane-gold C, Palmer M, Gregor A, Nguyen B, Niyik-iza C, Einhorn L. Phase III trial of gemcitabineplus cisplatin versus cisplatin alone in patientswith locally advanced or metastatic non-small celllung cancer. J Clin Oncol (2000) 18: 122–30.

28. Wozniak AJ, Crowley JJ, Balcerzak SP, WeissGR, Spiridonidis CH, Baker LH, Albain KS,Kelly K, Taylor SA, Gandara DR, Livingston RB.Randomized trial comparing cisplatin with cis-platin plus vinorelbine in the treatment of advanced

Page 176: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

170 TEXTBOOK OF CLINICAL TRIALS

non-small-cell lung cancer: a Southwest OncologyGroup study. J Clin Oncol (1998) 16: 2459–65.

29. Lilenbaum RC, Herndon J, List M, Desch C, Wat-son D, Holland J, Weeks JC, Green MR. Sin-gle agent (SA) versus combination chemotherapy(CC) in advanced non-small cell lung cancer: aCALGB randomized trial of efficacy, quality oflife (QoL), and cost effectiveness. Proc Am SocClin Oncol (2002) 20: 1a.

30. Sederholm C. Gemcitabine (G) compared withgemcitabine plus carboplatin (GC) in advancednon-small cell lung cancer (NSCLC): a phase IIIstudy by the Swedish Lung Cancer Study Group(SLUSG). Proc Am Soc Clin Oncol (2002) 20:1162a.

31. Kelly KJ, Crowley J, Bunn PA, Presant CA,Grevstad PK, Moinpour CM, Ramsey SD, Woz-niak AJ, Weiss GR, Moore DF, Israel VK, Liv-ingston RB, Gandara DR. Randomized phase IIItrial of paclitaxel plus carboplatin versus vinorel-bine plus cisplatin in the treatment of patients withadvanced non-small-cell lung cancer: a SouthwestOncology Group trial. J Clin Oncol (2001) 19:3210–18.

32. Schiller JH, Harrington D, Belani CP, Langer C,Sandler A, Krook J, Zhu Z, Johnson DH for theEastern Cooperative Oncology Group. Compari-son of four chemotherapy regimens for advancednon-small cell lung cancer. N Engl J Med (2002)346: 92–8.

33. Shepherd F, Dancey J, Ramlau R, Mattson K,Gralla R, O’Rourke M, Levitan N, Gressot L,Vincent M, Burkes R, Coughlin S, Kim Y, Be-rille J. Prospective randomized trial of docetaxelversus best supportive care in patients withnon-small cell lung cancer previously treatedwith platinum-based chemotherapy. J Clin Oncol(2000) 18: 2095–103.

34. Fossella FV, DeVore R, Kerr RN, Crawford J,Natale RR, Dunphy F, Kalman L, Miller V, LeeJS, Moore M, Gandara D, Karp D, Vokes E,Kris M, Kim Y, Gamza F, Hammershaimb L.Randomized phase III trial of docetaxel ver-sus vinorelbine or ifosfamide in patients withadvanced non-small cell lung cancer previouslytreated with platinum-containing chemotherapyregimens. J Clin Oncol (2000) 18: 2354–62.

35. Fukuoka M, Yano S, Giaccone G, Tamura T, Nak-agawa K, Douillard J-Y, Nishiwaki Y, Van-steenkiste JF, Kudo S, Averbuch S, Macleod A,Feyereislova A, Baselga J. Final results from aphase II trial of zd1839 (‘Iressa’) for patients withadvanced non-small cell lung cancer (IDEAL 1).Proc Am Soc Clin Oncol (2002) 20: 298a.

36. Kris MG, Natale RB, Herbst RS, Lynch TJ, Pra-ger D, Belani CP, Schiller JH, Kelly K, Spiri-donidis C, Albain KS, Brahmer JR, Sandler A,

Crawford J, Lutzker SG, Lilenbaum R, Helms L,Wolf M, Averbuch S, Ochs J, Kay A. A phase IItrial of zd1839 (‘Iressa’) in advanced non-smallcell lung cancer patients who had failed platinum-and docetaxel-based regiments (IDEAL 2). ProcAm Soc Clin Oncol (2002) 20: 292a.

37. Johnson D, Herbst R, Giaccone G, Schiller J,Natale R, Miller V, Wolf M, Holton A, Aver-buch S, Grous J. Zd1839 (‘Iressa’) in combinationwith paclitaxel and carboplatin in chemotherapy-naive patients with advanced non-small-cell lungcancer: results from a phase III clinical trial(Intact2). Ann Oncol (2002) 13(Suppl 5): 127–8.

38. Giaccone G, Johnson DH, Manegold C, ScagliottiGV, Rosell R, Wolf M, Rennie P, Ochs J, Aver-buch S, Fandi A. A phase III clinical trial ofzd1839 (‘Iressa’) in combination with gemcitabineand cisplatin in chemotherapy-naive patients withadvanced non-small-cell lung cancer (Intact 1).Ann Oncol (2002) 13(Suppl 5): 2.

39. Giaccone G, Johnson DH, Manegold C, ScagliottiGV, Rosell R, Wolf M, Rennie P, Ochs J, Aver-buch S, Fandi A. Irinotecan plus cisplatin com-pared with etoposide plus cisplatin for extensivesmall cell lung cancer. N Engl J Med (2002) 346:85–91.

40. Schiller J, Kim K, Hutson P, DeVore R, Glick J,Stewart J, Johnson D. Phase II study of topotecanin patients with extensive-stage small cell carci-noma of the lung: an Eastern Cooperative Oncol-ogy Group trial (E1592). J Clin Oncol (1996) 14:2345–52.

41. von Pawel J, Schiller J, Sheperd F, Kleisbauer J,Chrysson N, Steward D, Fields S, Clark P, Pal-mer M, Depierre A, Carmichael J, Krebs J,Ross G, Gralla R. Topotecan versus CAV for thetreatment of recurrent small cell lung cancer. JClin Oncol (1999) 17: 658–67.

42. Pignon J, Arriagada R, Ihde D, Johnson D,Perry M, Souhami R, Brodin O, Joss R, Kies M,Lebeau B, Onoshi T, Osterlind K, Tattersall M,Wagner H. A meta-analysis of thoracic radiotherapy for small-cell lung cancer. N Engl J Med(1992) 327: 618–24.

43. Turrisi AT, Kim K, Blum R, Sause WT, Liv-ingston RB, Komaki R, Wagner H, Aisner S,Johnson DH. Twice-daily compared with once-daily thoracic radiotherapy in limited small-celllung cancer treated with concurrently with cis-platin and etoposide. N Engl J Med (1999) 340:265–71.

44. Arriagada R, Le Chevalier T, Borie F, Riviere A,Chomy P, Monnet I, Tardivon A, Viader F,Tarayre M, Benhamou S. Prophylactic cranialirradiation for patients with small-cell lung cancerin complete remission. J Natl Cancer Inst (1995)87: 183–90.

Page 177: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY CANCERS 171

45. Auperin A, Arriagada R, Pignon J, Pechoux C,Gregor A, Stephens R, Kristjansen P, Johnson B,Ueoka H, Wagner H, Aisner J. Prophylactic cra-nial irradiation for patients with small cell lungcancer in complete remission. N Engl J Med(1999) 341: 476–84.

46. Schneiderman MA. Mouse to man: statisticalproblems in bringing a drug to clinical trial. ProcFifth Berkeley Symp Math Stat Prob (1967) 4:855–66.

47. Geller NR. Design of phase I and II clinical trialsin cancer: a statistician’s view. Cancer Invest(1984) 2: 483–91.

48. Storer B, DeMets DL. Current phase I/II designs:are they adequate? J Clin Res Drug Dev (1987) 1:21–30.

49. Storer B. Design and analysis of phase I clinicaltrials. Biometrics (1989) 45: 925–37.

50. Simon R, Freidlin B, Rubinstein L, Arbuck SG,Collins J, Christian MC. Accelerated titration de-signs for phase I clinical trials in oncology. J NatlCancer Inst (1997) 89: 1138–47.

51. O’Quigley J, Pepe M, Fisher L. Continual re-assessment method: a practical design for phase Iclinical trials in cancer. Biometrics (1990) 46:33–48.

52. Cheung YK, Chappell R. Sequential designs forphase I clinical trials with late-onset toxicities.Biometrics (2000) 56: 1177–82.

53. Gehan E. The determination of the number ofpatients required in a preliminary and a follow-uptrial of a new chemotherapeutic agent. J ChronicDis (1961) 13: 346–53.

54. Lee YJ, Staquet M, Simon R, Catane R, Mug-gia F. Two-stage plans for patient accrual inphase II cancer clinical trials. Cancer Treat Rep(1979) 63: 721–61.

55. Fleming TR. One-sample multiple testing proce-dures for phase II trials. Biometrics (1982) 38:143–51.

56. Simon R. Optimal two-stage designs for phase IIclinical trials. Contr Clin Trials (1989) 10: 1–10.

57. Jung SH, Carey M, Kim K. Graphical search fortwo-stage designs for phase II clinical trials. ContrClin Trials (2001) 22: 367–72.

58. Fleming TR. DeMets DL. Surrogate endpoints inclinical trials: are we being misled? Ann Int Med(1996) 125: 605–13.

59. Scher HI, Heller G. Picking the winners in a seaof plenty? Clin Cancer Res (2002) 8: 400–404.

60. Thall PF, Simon R, Ellenberg SS. Two-stageselection and testing designs for comparativeclinical trials. Biometrika (1988) 75: 303–10.

61. Schaid DJ, Wieand S. Therneau TM. Optimaltwo-stage screening designs for survival compar-isons. Biometrika (1990) 77: 507–13.

62. Inoue LYT, Thall PF, Berry DA. Seamlessly ex-panding a randomized phase II trial to phase III.Biometrics (2002) 58: 823–31.

63. Moore TD, Korn EL. Phase II trial design consid-erations for small-cell lung cancer. J Natl CancerInst (1992) 8: 150–4.

64. Korn EL, Arbuck SG, Pluda JM, Simon R, KaplanRS, Christian MC. Clinical trials designs forcytostatic agents: are new approaches needed? JClin Oncol (2001) 19: 265–72.

65. Mick R, Crowley JJ, Carroll RJ. Phase II clinicaltrial design for noncytotoxic anticancer agents forwhich time to disease progression is the primaryendpoint. Proc Am Assoc Clin Res (1999) 59: 228a.

66. Rosner GL, Stadler W, Ratain MJ. Randomizeddiscontinuation design: application to cytostaticantineoplastic agents. J Clin Oncol (2002) 20:4478–84.

Page 178: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 179: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

12

CardiovascularLAWRENCE FRIEDMAN AND ELEANOR SCHRON

National Heart, Lung, and Blood Institute, Bethesda, MD 20892 2482, USA

INTRODUCTION

Most of the principles in developing, managingand analysing clinical trials in cardiovasculardiseases are the same as for other conditions.There are some special aspects, however, that wewill review here, and then provide examples forin the subsequent sections.

A major point that influences many of the clini-cal trials is that in developed countries, and unfor-tunately more and more in developing nations,cardiovascular disease is common. Atherosclero-sis and hypertension are the primary causes ofcardiovascular disease in adults, although thereare many contributing factors to these (oftentermed risk factors). Although clinical trials havebeen conducted in other causes of cardiovas-cular disease, including congenital conditions,there are far fewer trials in these areas. There-fore, because in developed countries most heartdisease, stroke and peripheral vascular diseasesare due to atherosclerosis and hypertension, andbecause most of the cardiovascular disease clini-cal trials have been conducted in developed coun-tries, this chapter will emphasise those.

Because cardiovascular diseases are common,small treatment benefits may yield important

public benefits, particularly if the treatment issimple and inexpensive, such as aspirin. Againbecause the condition is common, when there arepotential treatments that are simple to administer,trials may be conducted in communities outsideof academic health centres, with their specialexpertise and facilities.

Another important consideration is that athero-sclerotic and hypertensive cardiovascular dis-eases are chronic conditions, often taking decadesto develop and lasting many years after beingfirst diagnosed. Clinical trials, therefore, may beinitiated well before the development of risk fac-tors (sometimes called primordial prevention),after the development of risk factors, but beforethe occurrence of organ damage (primary pre-vention), or after organ damage has occurred(secondary prevention). Some interventions arepotentially useful in all three settings, but others,particularly expensive or invasive approaches,may be best suited for secondary prevention.The relative importance of risk factors and clin-ical findings may also differ, affecting the likelyimpact of the interventions. After a heart attack,for example, how well the surviving myocardiumfunctions in pumping blood may be more impor-tant in determining longevity than cholesterol

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 180: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

176 TEXTBOOK OF CLINICAL TRIALS

level (though the latter has been clearly shownto affect recurrent infarctions and death). Often,treatments for atherosclerotic or hypertensive car-diovascular diseases do not cure the underlyingconditions. Rather, they may reduce the likeli-hood of having clinical sequelae or control theserious consequences of disease.

As noted, the common causes of cardiovas-cular diseases (atherosclerosis and hypertension)are multifactorial in origin. Atherosclerosis maybe influenced by cholesterol level, blood pres-sure, cigarette smoking, obesity, physical activ-ity, inflammatory processes, diabetes, age andgenetics, plus other factors. Hypertension may beinfluenced by things such as diet (intake of saltand various nutrients), obesity, physical activ-ity, stress or emotion, and genetics. With regardto genetic influences, it is not thought that thecommon cardiovascular diseases or their risk fac-tors are influenced by single genes. Rather, thereare likely to be many genes that interact withenvironmental conditions to effect most commoncardiovascular diseases. Because of the multiplerisk factors, clinical trial interventions that alterindividual factors might yield only modest reduc-tions in clinical outcomes such as myocardialinfarction or death from cardiovascular causes.Antihypertensive drugs, though, that lower bloodpressure regardless of the reason for the hyper-tension, have been shown to give impressivereductions in stroke,1 much of which is due tohypertension. On the other hand, treating hyper-tension has led to only modest reductions in heartdisease, probably because the other risk factorsfor heart disease were unchanged. The multifac-torial nature of much cardiovascular disease hasled some to design trials that have attempted tointervene on several factors simultaneously.

Whether as part of a clinical trial, or becauseof usual clinical care, many participants in car-diovascular disease clinical trials are on multi-ple interventions. This can affect adherence tostudy protocol and may lead to various druginteractions.

We typically think of cardiovascular disease asaffecting the heart, as in coronary heart disease,or the brain, as in stroke, but other parts of

the body, such as the kidneys or the legs (asin intermittent claudication), may be affected.Even within a single organ such as the heart,the presentation of cardiovascular disease maytake various forms, such as angina pectoris,myocardial infarction, cardiac arrhythmias ofdifferent sorts, heart failure and sudden death.The interventions studied in clinical trials maybe directed at any of those, though someinterventions, such as blood pressure loweringdrugs, may affect more than one outcome.

One issue in designing and interpreting theresults of cardiovascular disease trials is choice ofthe outcome. As noted, interventions may affectonly one aspect of the disease. Therefore, inves-tigators would prefer to use as the outcome ofinterest only that most likely to be modified bythe treatment. But for outcomes such as cause-specific mortality, that is not easy. Even a ratherbroad outcome such as death due to cardiovascu-lar disease has limitations, because many deathsare unwitnessed and autopsies much less commonthan in the past. When finer splits are used, suchas death due to arrhythmia or myocardial infarc-tion, the difficulties mount.2 Similar problemsexist for non-fatal events, such as myocardialinfarction. In the Framingham Heart Study, morethan 25% of myocardial infarctions were ‘silent’,that is, occurred without symptoms and were onlyrecognised on electrocardiographic examination.3

Extra efforts need to be made to identify these,as they convey considerable risk of death, eventhough they are asymptomatic.

Because many of the cardiovascular diseaseconditions are common, there is rarely a short-age of people with the condition of interestfor most clinical trials. Particularly with mul-ticentre, in fact, multinational, trials, there areadequate numbers of potential participants sothat studies using clinical outcomes are feasi-ble. It is usually not necessary to resort to trialswith surrogate endpoints on account of partici-pant unavailability. However, when a trial seeksspecial subtypes of cardiovascular disease, sub-ject availability becomes more of an issue. It isstill usually unnecessary to conduct trials withsurrogate endpoints, but extra efforts do have

Page 181: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 177

to be made to identify and enroll the partic-ipants. Connected with this, many physicianssub-specialise in particular types of cardiovas-cular disease. Involving the kind of physicianmost likely to have knowledge of and accessto the relevant patient population is thereforeessential.4

The large size of many cardiovascularclinical trials means that stratification to ensurebalance among key baseline factors is usuallyunnecessary, with the notable exception of site,in multicentre trials. Site is almost always astratification variable. Beyond that, investigatorsshould stratify on at most a very few variables.For example, for studies of heart failure, weoften stratify by left ventricular ejection fraction;for studies of arrhythmia, we might stratify bytype of arrhythmia and ejection fraction; forprimary prevention trials, we might stratify byage and sex; and for trials of blood pressurelowering, we would stratify by prior use ofantihypertensive drugs.

One feature of trials in cardiovascular dis-ease that always needs to be considered is thedramatic reduction in mortality from heart dis-ease and stroke over the past few decades inmost developed countries.5 As a result of acombination of improved prevention and muchbetter medical care, death rates in developedcountries have decreased to a level that makesmortality outcome studies less feasible. From apublic health and a patient standpoint, this iscertainly a happy state to be in, but it meansthat clinical trials must be designed with theexpectation that the event rates may be consider-ably less than expected. The trials need eitherto be much larger, or else the outcome needsto be a combination of death and other clini-cal events.

The remainder of this chapter will considerissues in specific trials. It is divided into trialsof drugs or biologics, trials of devices andsurgical procedures, and trials of lifestyle orother non-pharmacologic interventions. For fullerdiscussions of various cardiovascular diseaseclinical trials, see the book, Clinical Trials inCardiovascular Disease.6

TRIALS OF PHARMACOLOGIC AGENTS

Pharmaceutical agents are the most commoninterventions tested in clinical trials of cardio-vascular disease. Most trials of drugs are sim-ilar in structure and design to trials in anyother field of medicine. A few points, how-ever, should be made. Because, as noted above,most cardiovascular disease takes decades todevelop, there is a long period when peoplehave few if any symptoms. The likely exis-tence of atherosclerosis, for example, is usu-ally determined by the presence of risk factors,such as hypertension, hyperlipidaemia or fam-ily history, advanced age in people who havethe typical lifestyle of most Western countries,and the use of sophisticated imaging methods.Prevention of the sequelae of atherosclerosis istherefore quite feasible. Because participants intrials of primary prevention are asymptomatic,several principles apply. First, serious or trou-blesome adverse events due to interventions inpeople who are generally healthy are unaccept-able and not tolerated by the participants. There-fore, only drugs that are well-characterised (andare presumably safe and well-tolerated) are gen-erally studied in prevention trials. Second, therate of clinical events is likely to be low. Unlessthe trials use surrogate outcomes, they need tobe very large (thousands and sometimes tens ofthousands of participants) and long (often fiveyears or more). Third, people who are asymp-tomatic and consequently notice no obvious ben-efit from treatment may have trouble adhering tothe regimen, especially in a long trial. Therefore,a considerable ‘drop-out’ rate must be built intothe sample estimate, increasing the size of thetrial even more.

Among the first large clinical trials in car-diovascular disease were trials of lipid lower-ing. The Coronary Drug Project, which beganin the 1960s, tested five interventions (clofibrate,nicotinic acid, dextrothyroxine, and two doses ofequine estrogen) against a placebo in men with ahistory of a myocardial infarction.7,8 These earlyefforts at lipid modification were only modestlysuccessful. The interventions had major adverse

Page 182: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

178 TEXTBOOK OF CLINICAL TRIALS

events and three were stopped before the sched-uled end of the trial. In addition to the adverseevents, the amount of lipid lowering was rela-tively small, in general, so the lack of improve-ment in mortality, the primary outcome, wasperhaps not surprising. Nicotinic acid was shownto reduce non-fatal reinfarction, however, andpost-trial follow-up disclosed a significant reduc-tion in mortality.9 A key finding in the CoronaryDrug Project was that the mortality rate in thecontrol group was only two-thirds of that pre-dicted when the study was started. This proba-bly reflected selection of better risk participants,but improved care may also have played a role.This phenomenon is one that many cardiovas-cular trials conducted since the Coronary DrugProject have had to take into account in the sam-ple size estimates.

The next large lipid-lowering trial was theLipid Research Clinics Coronary Primary Preven-tion Trial.10 This trial compared cholestryramineresin versus placebo to see if there would be a dif-ference in the primary outcome of coronary heartdisease death or non-fatal myocardial infarctionin 3806 men free of prior evidence of heartdisease, but with hyperlipoproteinaemia. Therewere 155 events in the intervention group and187 in the placebo group (one-sided p < 0.05).This study was one of the first to show bene-fits from lipid lowering, even though some ques-tioned the significance of the results, given theone-sided test.

More definitive outcomes from cholesterollowering had to wait for the development ofagents that were more effective in loweringlipids and, importantly, better tolerated. The clin-ical trials of hydroxymethylglutaryl coenzyme A(HMG-CoA) reductase inhibitors (‘statins’) wereprimarily conducted in the 1990s. Trials suchas the Scandinavian Simvastatin Survival Study(4S),11 the Cholesterol and Recurrent Events(CARE) trial12 and the MRC/BHF Heart Pro-tection Study13 have clearly demonstrated thatcholesterol lowering in both people with knownheart disease and in those at high risk, but with-out evidence of heart disease, leads to impressive

reductions in all-cause mortality and in coronaryheart disease events.

The 4S trial compared simvastatin againstplacebo in 4444 participants with known coro-nary heart disease and elevated serum cholesterol.The intervention lowered low-density lipoproteincholesterol by 35% and mortality by 30%.11 TheCARE trial compared pravastatin against placeboin 4159 post-myocardial infarction patients. Thebaseline serum cholesterol level was somewhatlower in this trial than in the 4S trial. As in4S, there was greater than a 30% reduction inlow-density lipoprotein cholesterol and a 24%relative benefit in the primary outcome of coro-nary heart disease death plus non-fatal myocar-dial infarction.12 The MRC/BHF Heart ProtectionStudy extended these finding to those who hadcoronary heart disease or were otherwise at highrisk, regardless of the baseline cholesterol level.13

As a result of these and other trials of choles-terol lowering and trials of blood pressure reduc-tion, new evidence-based guidelines for treatmentof risk factors such as hyperlipidaemia and hyper-tension have been developed and widely dissem-inated. Therefore, regardless of whether the trialis one of primary prevention or in people withknown end-organ damage, the control group mustbe adequately treated. This means that the eventrate in the control group will be less than it hasbeen in past years, making it even more difficultto detect benefit from a new intervention.

An example of a recent trial designed to mimicclinical practice illustrates some of these issues.The Antihypertensive and Lipid-Lowering Treat-ment to Prevent Heart Attack Trial, or ALLHAT,compared treatment, in a blinded fashion, begin-ning with three different antihypertensive agentsagainst the control, which was thiazide diuretictreatment, in more than 40 000 people aged 55or over who had hypertension and at least oneother risk factor. Thus, the hypertension compo-nent of ALLHAT used an ‘active control’ arm.The primary outcome of this component wasfatal coronary heart disease or non-fatal myocar-dial infarction. ALLHAT also included a lipid-lowering agent in over 10 000 of the enrolledparticipants in a factorial design. The primary

Page 183: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 179

outcome for the lipid component was all-causemortality; this part of ALLHAT was an open, ornon-blinded study.14 – 16

One of the antihypertensive agents, an alpha-adrenergic blocker, was stopped early becausealthough there was little difference in the primaryoutcome, there was a significant increase inheart failure in the alpha-adrenergic blocker arm,compared with the diuretic arm. The other twoantihypertensive treatments, a calcium channelblocker and an angiotensin-converting enzymeinhibitor, continued to the scheduled end of thetrial. There were no differences between eitherof these arms and the active control thiazidediuretic arm for the primary outcome. There weresome differences in secondary outcomes, with thediuretic being superior to the other agents forheart failure, for example. The blood pressurecomponent of ALLHAT showed that in an activecontrol trial, a very large sample size needed to beused to achieve adequate power, even when theprimary outcome was a combination of events.Also contributing to the need for a large samplewas the fact that about 30% of the participantswho had a follow-up visit at five years haddiscontinued the study drug.

With respect to the lipid component, therewas no significant difference in total mortality,despite the fact that many other studies haveshown benefit from lipid-lowering treatments.17

One explanation may be that there was onlya modest difference in low-density lipoproteincholesterol between the two groups. Almost 30%of the control group participants were receivingnon-study lipid-lowering therapy by the endof the trial. The non-blinded design probablyhelped foster that. The public campaigns aimedat getting people to reduce their cholesterol levelsundoubtedly played a role. Also, people whowere thought to require lipid-lowering therapyand those already on such therapy were noteligible to be enrolled. Because only thosealready entered in ALLHAT for the hypertensioncomponent were candidates for the lipid-loweringcomponent, the originally expected number ofabout 20 000 enrollees turned out to be 10 355,further limiting the study power.

Rates of death in people who have had amyocardial infarction used to be quite high. Mod-ern therapy has reduced those rates remarkably.Thus, trials using mortality alone as an end-point may no longer be feasible even in sur-vivors of a heart attack, unless a very highrisk group is studied. This has led to increaseduse of combination endpoints, such as car-diovascular mortality plus non-fatal myocardialinfarction. The Heart Outcomes Prevention Eval-uation (HOPE) study compared the angiotensin-converting enzyme inhibitor ramipril againstplacebo in 9297 people with either known vas-cular disease or diabetes plus another risk factor.The primary outcome was myocardial infarction,stroke or death from cardiovascular causes. Thus,even though this was a high-risk sample, andthe sample size was considerable, it was neces-sary to have a combination endpoint to achieveadequate power. There was a highly significantand clinically impressive reduction in the primaryoutcome from ramipril.18

A similar study is the Prevention of Eventswith Angiotensin-Converting Enzyme InhibitorTherapy, or PEACE.19 This trial is not yetcompleted, but it too compares an angiotensin-converting enzyme inhibitor (trandolapril) againststandard therapy in over 8000 people withdocumented coronary heart disease and a leftventricular ejection fraction of at least 40%.The reason for the ejection fraction criterionis that angiotensin-converting enzyme inhibitorshave been shown to be beneficial in thosewith heart failure or low ejection fraction. Thiseligibility criterion, however, also means thatthe event rate is lower than if those withimpaired ejection fraction were included. So inorder to have adequate power, even in thisrelatively large study, a combination of eventsis necessary as the primary outcome. Originally,the sample size was set at 14 000, and theprimary outcome was cardiovascular death andnon-fatal myocardial infarction. Early in the trial,primarily for feasibility reasons, the sample sizewas reduced to 8100 and the primary outcomeexpanded to include the need for coronaryrevascularisation procedures. Procedures such as

Page 184: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

180 TEXTBOOK OF CLINICAL TRIALS

need for revascularisation are often included aspart of the endpoint. This can be appropriate, butis subject to considerable bias if the trial is notblinded, which PEACE is.

Antihypertensive agents and lipid-loweringdrugs have generally been approved by drugregulatory agencies on the basis of their effectson blood pressure and cholesterol, rather than ontheir effects on clinical outcomes. The clinicaloutcomes, however, are so important that manytrials have successfully tested their effects ondeath, myocardial infarction and stroke.11 – 16,18,20

Cardiac ventricular arrhythmias are known tocorrelate with total and sudden death. Therefore,for years, it was thought that drugs that reducedcardiac arrhythmias should be approved onthe basis of their antiarrhythmic effect, onthe assumption that they would be clinicallybeneficial. However, when the trials were donethat looked at clinical outcomes, it was seen thatarrhythmia suppression was not a good surrogatefor mortality.

The Cardiac Arrhythmia Suppression Trial(CAST) tested whether suppression of ventric-ular arrhythmias by any of three antiarrhythmicdrugs would reduce the incidence of sudden car-diac death. In the first part of this trial, over1700 patients whose ventricular arrhythmias weresuppressed by encainide, flecainide or moricizinewere randomly assigned to the drug that wasmost effective in suppressing the arrhythmia ormatching placebo. However, two of the drugs,encainide and flecainide, were soon seen to sig-nificantly increase both sudden death and all-cause mortality and they were discontinued earlyin the study.21 The study was continued withmoricizine as the only antiarrhythmic drug. Thistoo was stopped ahead of schedule because ofadverse trends in mortality.22 As a result ofCAST, the use of surrogate outcomes in manyclinical trials has been seriously questioned.

Another feature of many drug trials in car-diovascular disease is that a ‘stepped care’approach is used. This is common in trialsof blood pressure lowering. Because a sin-gle drug is often either insufficiently effec-tive or not well tolerated by the participant,

use of second, third and even fourth choicedrugs is built into the protocol. For example,in the Systolic Hypertension in the Elderlyprogram,20 a multicentre, randomised, double-blind, placebo-controlled, community-based trial,4736 participants with isolated systolic hyperten-sion were randomised to receive either chlorthali-done, 12.5 mg daily, or matching placebo. Thegoal systolic blood pressure differed for eachparticipant depending upon initial systolic bloodpressure. If the blood pressure remained abovethe goal at two consecutive monthly visits, thedose was increased to 25 mg of chlorthalidoneor matching placebo daily. If the participantswere still above the goal at two consecutive vis-its, 25 mg of atenolol daily or matching placebowas added. In participants who still did notreach the goal systolic blood pressure, the dosewas increased to 50 mg of atenolol or matchingplacebo. If atenolol was contraindicated, 0.05 to0.1 mg of reserpine or matching placebo dailywas substituted. Blood pressure above a pri-ori established escape levels, despite maximalstepped-care therapy or corresponding placebo,was an indication for prescribing open-labelactive drug therapy.20

Some trials of pharmaceutical agents comparestrategies, rather than drugs. Recently, the AtrialFibrillation Follow-up Investigation of RhythmManagement (AFFIRM) evaluated which of twoapproaches for treating patients with atrial fib-rillation was better.23,24 Participants had to haveatrial fibrillation, be over age 65 (or under 65 andat high risk for stroke) and the enrolling physi-cian had to deem it appropriate to treat patientsas part of the assigned strategy for up to fiveyears. AFFIRM included 4060 people, enrolled atover 200 sites in Canada and the United States,who were randomised to either rhythm or ratecontrol strategies for managing their atrial fib-rillation. Investigators could select from variousoptions on an approved menu of pharmacologicand non-pharmacologic therapies. Cardioversionand antiarrhythmic drugs were used to maintainsinus rhythm (called ‘rhythm control’). Agentssuch as digitalis, calcium channel blockers andβ-blockers, or ablation of the atrioventricular

Page 185: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 181

junction and pacemaker implantation, were usedto control the ventricular response rate from theatrial fibrillation (called ‘rate control’). The trialshowed that there was no significant difference inthe primary outcome (all-cause mortality), thoughthere was a trend favouring the rate control group,and there were fewer adverse effects in the ratecontrol group.

TRIALS OF DEVICES AND SURGICALPROCEDURES

Devices and surgical procedures are commonlyused in patients with heart disease. Examples ofdevices are replacements for heart valves, stentsthat help keep coronary vessels that have beenopened patent, cardiac pacemakers and cardiacdefibrillators. Examples of surgical proceduresare corrections of congenital abnormalities, coro-nary artery bypass grafts (CABG) and aneurysmresection. Trials of various surgical proceduresare usually surgery versus medical treatment orsurgery versus device implantation. Examplesare coronary artery bypass graft procedures inpatients with ischaemia that are compared againstuse of thrombolytic agents or against implanta-tion of coronary artery stents, or coronary bypassgraft procedures in patients with stable anginapectoris that have been compared against bestmedical therapy. Less often, there are trials com-paring one surgical procedure against another.Devices may be compared against surgery, asnoted, or against medical care, or sometimes,against another device.

Obviously, as with all clinical trials, the ques-tions in these kinds of trials need to be importantand the answer relevant, the study needs to beappropriately designed and carried out, and thedata must be properly analysed. In addition, thereare certain features of such studies that need tobe considered. First, there is an unavoidable inte-gration of the intervention being employed andthe technique with which it is done. The skillof the investigator is far more important thanin, for example, drug trials. Unless the inves-tigator has considerable surgical competence or

experience in implanting a device, an interven-tion might be claimed to be not beneficial, oreven harmful, when in the hands of a more skilledoperator it would be beneficial. This was seen inthe Department of Veterans Affairs trial compar-ing surgical and medical management of anginapectoris.25 Thirteen hospitals participated in thistrial. Three of the hospitals had surgical mortalityconsiderably greater than the other 10. The resultscomparing surgery against medical care werefavourable for surgery among patients at high riskof death from their disease, even when all 13 hos-pitals were included in the analysis. However, forlower risk patients, only the comparison involv-ing the 10 hospitals with better surgical resultsshowed benefit from surgery. These data mayreflect normal variation, but they raise the issueof requiring a certain level of experience fromthe surgeons before they participate in a trial.

The issue of skill of the operators who insert adevice also needs to be emphasised. Most recentclinical trials of device implantation require thatthe operators have experience with a certain min-imum number of devices before being allowedto participate in the trial. This does not guar-antee that only highly skilled operators will beinvolved, but it means that the trial is a bettertest of how the device will perform in close tooptimal circumstances. An example is the expe-rience required of investigators and the establish-ment of minimum standards for the device andlead systems used in the Antiarrhythmics VersusImplantable Defibrillators (AVID) trial.26

A second, related issue is how broadly thestudy results can be generalised if only themost experienced surgeons participate in thetrial. After a drug study shows benefit froma new pharmaceutical agent, presumably mostpractitioners are able to administer the drug ina safe, effective manner. Transferring surgicaltechnique and skill from investigators in the trial toothers is less straightforward. Similarly, if a deviceis shown to be beneficial, and then used morewidely by less well-trained operators, the resultswill not be as positive as in the clinical trial setting.

Trials of both devices and surgical proceduresaffect the way in which the primary trial outcome

Page 186: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

182 TEXTBOOK OF CLINICAL TRIALS

is assessed. Because of the invasive nature ofthe intervention, it is likely, indeed expected, thatthere will be an early adverse experience associ-ated with the procedure. The trauma involved; theconsequences of anaesthesia, particularly if gen-eral anaesthesia is used; and the risks of infectionwill almost inevitably lead to morbidity and per-haps mortality early after the intervention. There-fore, the study needs to be designed such thatit lasts long enough for any hoped-for benefitto overcome the early unfavourable experience.Sometimes, the expected benefit does not appearfor quite some time. Not only the investigators,but institutional ethics committees and prospec-tive study participants need to understand thisimplication.

An example is the Program on the SurgicalControl of Hyperlipidemia.27,28 This trial com-pared partial ileal bypass surgery against medicaltherapy in patients with a prior myocardial infarc-tion. The goal was to decrease the absorption oflipids, thereby reducing serum cholesterol. Forthe first two years of the trial, there was lit-tle difference in the primary outcome, all-causemortality, with the surgical group doing slightlyworse than the control group. The curves crossedafter about three years, and at the scheduled endof the trial, there was a non-significant trend infavour of surgery. The study investigators fol-lowed the participants after the formal end of thetrial. The trend in favour of surgery continued andfive years after the end, the mortality differencewas statistically significant.28

This study shows that data monitoring com-mittees need to consider how long to wait tosee if the benefit appears and counterbalances theknown risks. Even though the primary outcomewas not sufficiently adverse early in the trial tojustify stopping, other factors combined with lackof benefit might have influenced a monitoringcommittee to do so. For example, in POSCH,there were side effects such as diarrhoea and,more seriously, a higher rate of kidney stonesand gallstones.27 The slight early adverse trend inmortality plus the increased morbidity could haveled to a decision to stop the study prematurely.

Changes in surgical technique or modificationsin devices while the study is being conductedcan cause difficulties in interpreting the results ofclinical trials. If, partway through a study, there isan important change in the intervention, depend-ing on the outcome, it may be hard to reach aclear conclusion about the possible benefits ofthe intervention. In the past, implantable car-dioverter defibrillators required a thoracotomy.This carried considerable risk that needed tobe considered when inserting the device. Leadsthat could be inserted transvenously were sub-sequently developed, reducing the early com-plication rate. The AVID trial and the Cana-dian Implantable Defibrillator Study (CIDS) trial,both of which compared implantable cardioverterdefibrillators against medical therapy in peo-ple resuscitated from cardiac arrest, were in theprocess of enrolling patients when the switchin practice from primarily using thoracotomy-based defibrillators to transvenous defibrillatorsoccurred.26,29 Because both types of defibrilla-tor performed similarly, there was no problem incombining the results. This was particularly thecase because it was shown significantly in AVIDand with a strong trend in CIDS that the defib-rillator was more effective in reducing mortalitythan antiarrhythmic drug therapy. If there wereno difference, or if drug therapy turned out to besuperior overall, questions about the validity ofthe trial might have been raised, given the mid-study switches in device use.

Trials of cardiovascular devices involve sev-eral other issues that are not common to otherkinds of trials. One is the decision during studydesign as to the primary question. The device isdeveloped to meet certain specifications. Theseinclude minimising the possibility of rejection bythe patient, reducing the likelihood of develop-ment of thrombi and emboli, physical character-istics such as size and weight, and, importantly,does it do what it is designed to do. Does adefibrillator detect and convert life-threateningrhythm disturbances? Does a pacemaker detectand correct severe bradycardia? Can a stent beeasily employed and will it retain its structuralintegrity? These are engineering questions that

Page 187: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 183

should be addressed and satisfactorily answeredbefore a clinical trial is conducted. The clinicaltrial should be designed to answer the questionsposed by the clinician. Will the device reducemortality and/or morbidity, what is the resteno-sis/occlusion rate, and what are the risks and sideeffects? The answers to these questions incor-porate the structural and functional aspects ofthe device, the skill of the person inserting thedevice, and the often unknown biologic interac-tion between patients and the device.

The fact that only devices designed andfully expected to be mechanically functionalare used raises a serious ethical issue. If thedevice defibrillates, for example, how can itbe withheld from someone with known life-threatening cardiac arrhythmias? This was facedin the AVID trial.30,31 The rationale for thetrial was that the balance between expectedreduction in death due to arrhythmias versusdeath from other causes, plus adverse events suchas infection, and the possible seriously impairedquality of life, was uncertain.

A key issue is the risk level of the patientsbeing enrolled. If the patients are at truly veryhigh risk of arrhythmic death, even thoughoptimal medical therapy is being used, then itmight be inappropriate to randomise them tomedical therapy if a possibly useful device orsurgical procedure exists. The defibrillator trialsthat were done showed that in moderately high-risk patients, the use of the defibrillator savedlives with an acceptable number of adverseevents. If the risk level is less, however, as mightbe the case in patients with better left ventricularejection fraction, the balance between benefitand potential harm might remain unknown, andthe use of a defibrillator might not be justified,absent clear demonstration of benefit in a clinicaltrial.32,33

Trials have looked at various ways of identi-fying patients at sufficiently high risk to see ifdefibrillators are beneficial, but not at so highrisk that it would be unethical to randomise. TheMulticentre Automatic Defibrillator ImplantationTrial (MADIT) compared use of an implanteddefibrillator versus conventional medical therapy

in 196 patients with heart failure, a prior myocar-dial infarction, left ventricular ejection fractionless than or equal to 35%, a documented episodeof asymptomatic unsustained ventricular tachy-cardia, and inducible, non-suppressible ventricu-lar tachyarrhythmia on electrophysiologic testing.In this very high-risk group of patients, the defib-rillator led to highly significant reductions inall-cause and cardiac mortality.34 A subsequentstudy by the same group of investigators (MADITII) assessed whether the implantable defibrilla-tor would reduce mortality in patients with aprior myocardial infarction and left ventricularejection fraction less than or equal to 30%. Elec-trophysiologic testing was not used to identifyhigh-risk patients. Because the risk of mortalitywas lower in this study than in the prior study,742 patients were randomised to receive eitherthe implantable defibrillator or conventional med-ical therapy. Here too, there was a significantreduction in mortality in the defibrillator group.35

One trial of a device that raised considerablequestions about ethics was the Randomised Eval-uation of Mechanical Assistance for the Treat-ment of Congestive Heart Failure (REMATCH),which was conducted from 1997 to 2001. Thistrial compared use of a left ventricular assistdevice versus medical therapy in 129 patientswith end-stage heart failure who were not can-didates for cardiac transplantation. The one-yearsurvival was 52% in the group receiving the leftventricular assist device and 25% in the medicaltherapy group, a highly significant difference. Attwo years, the rates of survival were 23% and 8%.There were over twice as many serious adverseevents (infection, bleeding, device malfunction)in the device group as in the medical arm.36,37

Because the expected survival rate in thesepatients was so low, there were many questionsabout the ethics of randomisation. It was knownthat the device was mechanically sound, andworked in the short-term as a bridge to trans-plantation. The justification for the trial was thatlong-term benefit, either for survival or qualityof life, was unknown. For a more extensive dis-cussion of the design issues and ethics of thiskind of trial, using REMATCH as an example,

Page 188: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

184 TEXTBOOK OF CLINICAL TRIALS

see the report of a conference on mechanical car-diac support.38

It is uncommon (and often impractical) fortrials of devices or surgical procedures to beblinded. Occasionally, however, this can bedone. One such trial was Mode Selection Trialin Sinus Node Dysfunction (MOST).39 Dual-chamber (atrioventricular) pacing was comparedwith single-chamber (ventricular) pacing in 2010patients with clinically important bradycardia.The primary outcome was death or non-fatalstroke. All patients received a dual-chamberdevice, but in those randomised to single-chamber pacing, only one lead was activated,therefore mode of pacing was randomised ratherthan type of device. Physician investigators andpatients were blinded regarding whether thepatient was in the dual- or single-chamber arm;cross-over at the last follow-up was 31.4% forthose assigned to ventricular pacing, almost halfof these due to pacemaker syndrome. This was incontrast to another study that inserted only single-chamber devices in those randomised to thatgroup and dual-chamber devices only in thoserandomised to the dual-chamber group. Here thecross-over rate was 2.7%.40 This latter trial wasnot blinded, but cross-over from single- to dual-chamber pacing would have required anotherprocedure, accounting for the low cross-over rate.

As with many drug trials, trials of devicescan look at either single devices (or upgradesof these devices as they become available) orclasses of devices. The AVID26 trial comparedthe use of advanced-generation units with tieredtherapy capable of antitachycardia pacing, car-dioversion and defibrillation, as well as brady-cardia pacing, made by more than one company,against any of several drugs (though primarilyamiodarone), thus testing whether the strategy ofusing implantable defibrillators was preferable toa strategy of pharmacologic approach. This kindof trial is more likely to be done by public organ-isations, such as the National Institutes of Health,than by industry. Industry-supported trials, suchas the previously discussed MADIT,34 almostalways compare a single manufacturer’s device.

To the extent that the devices in the ‘strategy-approach’ trial are similarly effective, the resultscan be more broadly generalised to a class ofdevices. There is a risk, however, that devicesmade by one manufacturer will be better thanothers, blurring the outcome of the trial.

Another feature of device and surgical trialsis that unlike most drugs (vaccines being anexception) that need to be administered regularly,devices are implanted and expected to workfor a long time, and surgery, unless reversed,can be life-long. Batteries and other componentsmay need to be replaced, but unless there areproblems, they last for years. This is generally astrength of such trials. There is less problem withcompliance to protocol and long-term follow-up is not only feasible, but desirable. Theseveral coronary artery bypass graft surgery trialsassessed outcome 10, and in some cases morethan 20 years after the initial procedure.41

If a drug trial turns out not to show benefitfrom the drug, simply stopping administrationis usually sufficient. But what if the device orsurgery trial turns out not to show benefit? Whatis the obligation of the investigator, especially ifthe device or procedure is shown by the trial tobe harmful? The Coronary Artery Bypass Graft(CABG) Patch trial42 compared transthoracicimplantation of cardioverter defibrillators againstcontrol in patients undergoing CABG surgery.At the end of an average 32 months follow-up, there was no significant mortality differencebetween the groups. All patients were given theresults of the trial and subsequent therapy wasindividualised. All patients were urged to haveelectrophysiologic testing to see if they were athigh risk of serious arrhythmia, and thus possiblyin need of the defibrillator in the future. About40% of the patients in the intervention groupelected to have the device turned off or removed[J.T. Bigger, personal communication].

TRIALS OF BEHAVIOUR CHANGE

Trials of behaviour change are fairly commonin heart disease. They include trials aimed at

Page 189: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 185

smoking prevention or cessation; diet changefor weight loss, blood pressure reduction orcholesterol reduction; and exercise for risk factorreduction and better outcomes in people with aheart attack or heart failure.

These sorts of trials have several aspects thatmake them different from most other trials. First,there is considerable cross-over or recidivism bythe study participants. The interventions are onesthat many can either implement on their ownor stop because they are difficult to maintain.Second, partly as a consequence of the first, thetrials are quite resource intensive on the part ofthe interventionists. Implantation of devices orsurgical procedures require major efforts by theinvestigator or surgeon, but usually only on a one-time basis. Trials of behaviour change requireconsiderable effort on a continuing basis to helpparticipants stay with a change from what mayhave been a life-long habit of smoking, eatingpoorly or sedentary lifestyle. Third, again as aconsequence of the first two factors, the studyduration is often much shorter than with othertypes of trials. Getting people to adhere to anexercise programme for months, let alone years,is extraordinarily difficult. And of course volun-teers willing to be randomised to exercise or noexercise will have more of an interest in exercisethan the general public. Therefore, those allo-cated to the no-exercise programme group willhave more of a tendency to cross-over. Fourth,because of the generally shorter duration of thetrials, surrogate outcomes are more often usedthan in other trials in heart disease. Rather thanassessing clinical outcomes such as heart dis-ease or stroke, the behaviour trials will often useweight change, biochemical measures, or attitudeor knowledge assessed by questionnaires or inter-views. Fifth, standardisation of the interventionand measurement of the degree of compliance aremore complicated. Unless highly controlled feed-ing studies are performed, we have only modestlygood ways of assessing overall food and nutri-ent intake. This is particularly so if maintenanceof caloric intake is one of the objectives of thetrial, as weight would not be able to serve as amarker of change in diet. Sixth, societal changes

and pressure can affect the trials in major ways.If there are changes in restaurant or workplacesmoking regulations during the time of a triallooking at ways to get people to stop smoking,the likely trends in the control group as a result ofthe new regulations will make detection of benefitfrom the intervention more difficult.

Examples of successful trials of diet inter-vention are the Dietary Approaches to StopHypertension (DASH) trial43 and a subsequenttrial of the DASH diet with different lev-els of sodium intake.44 The first DASH studyenrolled 459 adults with systolic blood pressureunder 160 mm Hg and diastolic pressure 80 to95 mm Hg. Participants were randomly allocatedto one of three groups: a diet rich in fruits andvegetables; a diet rich in fruits and vegetablesplus low-fat dairy products and reduced satu-rated fat (‘combination’ diet); or a control dietwith fruit, vegetable and fat content similar tothat commonly eaten in the United States. At theend of eight weeks, both of the intervention dietsreduced blood pressure, with a greater reductionfrom the diet containing low-fat dairy products.The second DASH study enrolled 412 partici-pants in a factorial design trial. Participants wereassigned to either the DASH combination dietor the control diet and to any of three levels ofsodium intake. The moderate and low sodiumintake diets reduced blood pressure in both theDASH diet and control diet groups. The DASHdiet led to lower blood pressure than the controldiet at each sodium level.

In the DASH studies, the food was spe-cially prepared and provided to the partici-pants. Whether or not the DASH-type diet canbe maintained, over time, in people obtain-ing their food in the usual way, was stud-ied in PREMIER, a trial of 810 participantswhose blood pressure is greater than opti-mal or who have mild hypertension.45 Partic-ipants were randomised to one of three arms:advice only; comprehensive lifestyle interven-tion using behavioural approaches; and a com-bined comprehensive lifestyle intervention plusthe DASH diet. The behavioural approachesconsisted of 18 counselling sessions. Unlike

Page 190: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

186 TEXTBOOK OF CLINICAL TRIALS

DASH, the participants were not provided spe-cially prepared foods. The primary outcome wassystolic blood pressure six months after randomi-sation. At six months, the systolic blood pressurehad decreased by 11.1 mm Hg in the combinedgroup, 10.5 mm Hg in the behavioural interven-tion group, and 6.6 mm Hg in the advice onlygroup. The largest reductions in diastolic bloodpressure and in the percentage of people withhypertension were also seen in the combinedgroup, with the behavioural intervention group aclose second. Thus, a combination of behaviouralapproaches and dietary changes can result inmeaningful blood pressure reduction. Even so,the DASH diet, unlike in the previous feed-ing studies,43,44 did not contribute significantlybeyond the comprehensive lifestyle interventionalone. The adoption of the diet in PREMIERwas not as intensive as in the feeding studies.Whether the changes observed in PREMIER per-sist at 18 months is being assessed.45

A different kind of behavioural interventiontrial was the Enhancing Recovery in CoronaryHeart Disease (ENRICHD)46 study. This trialenrolled 2481 patients at 73 hospitals who hadhad a myocardial infarction within the previous28 days. In addition, all participants had depres-sion, low social support, or both. Because depres-sion and poor social support are associated withincreased mortality after a heart attack, it wasthought that intervening on those factors mightlead to improved survival. Those randomisedto intervention received counselling; the controlgroup received usual medical care. Both groupsreceived information on heart disease risk factors.Although the intervention decreased depressionand improved social support, there was no dif-ference at three years in the primary outcome ofdeath or recurrent myocardial infarction (24.1%vs. 24.2%).47

The ENRICHD study raises several issues.First, despite the association between depressionand heart disease, treatment of depression maynot lead to change in mortality from heart disease.That is, depression may not be a causative factor.Second, the observed improvement in depressionand social support may not have been of great

enough magnitude to alter mortality. This is whathappened in some of the early trials of cholesterollowering that failed to show improvement inmortality. The early lipid-lowering drugs werenot as effective as the current ones in reducingcholesterol. Third, the measures of depressionand social support, particularly when obtainedshortly after a major event such as a heart attack,might not reflect true ‘baseline’. Fourth, however,even though mortality and recurrent infarctionwere unchanged, the apparent improvements indepression and social support are not trivialfindings. Unlike surrogate outcome variables thathave little clinical meaning, these outcomes areclinically important in their own right.

Often, it is more appropriate to conductbehaviour change trials in community settings,with one group of communities compared againstanother. The changes, in order to be effective,need to be community-wide. An example ofefforts to improve response time to symptoms ofa heart attack was the Rapid Early Action forCoronary Treatment (REACT) trial. This studyinvolved 10 matched pairs of cities. One groupof cities received intervention through the media,community organisations, and professional andpatient education in an effort to improve theresponse time in the event of symptoms of anacute myocardial infarction. The other group ofcities served as the control. The primary outcomewas time from symptom onset to arrival in thehospital emergency department. REACT showedincreased use of the emergency medical system,but no difference between the groups in time toarrival at the hospital.48

There have been several trials aimed at chang-ing multiple risk factors for heart disease in com-munity settings.49 – 51 These trials showed smalldifferences between the intervention communitiesand the control communities in selected variables.In general, however, they were not particularlysuccessful in achieving large differences despiteintensive education efforts. The reasons are prob-ably multiple. Among them are that the interven-tions were not delivered in sufficiently persua-sive manners and that the control communitiesshowed changes because of national attention to

Page 191: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 187

the need to stop smoking, improve diet and getbetter preventive medical attention.

The less than outstanding results from thesecommunity-wide efforts at behaviour changeillustrate both the difficulties in achieving behav-iour change and the problems in community, asopposed to individual, interventions.

SUMMARY

There are certain features that need to be con-sidered in cardiovascular disease trials, whetherthey are primary prevention or secondary preven-tion. Those features include the chronic natureof the common cardiovascular diseases, the mul-tiple risk factors responsible for those diseases,and the fact that huge numbers of people developcardiovascular disease of one form or another. Inthis chapter, we have reviewed these and otherfactors, and described selected trials of drugs,devices and surgical procedures, and behaviouralinterventions that exemplify these features. Over-all, however, trials of cardiovascular diseases aredesigned, conducted and analysed in ways similarto trials of other conditions.

Future trials may include targeting and ‘opti-mising’ interventions based on genotype and theuse of new diagnostic and imaging techniques,potentially yielding better characterisation of thedisease or condition pathology.

REFERENCES

1. Chobanian AV, Bakris GL, Black HR, CushmanWC, Green LA, Izzo JL, et al. Seventh reportof the Joint National Committee on prevention,detection, evaluation, and treatment of high bloodpressure. Hypertension (2003) 42: 1206–52.

2. Greene HL, Richardson DW, Barker AH, RodenDM, Capone RJ, Echt DS, et al. Classification ofdeaths after myocardial infarction as arrhythmicor nonarrhythmic (the Cardiac Arrhythmia PilotStudy). Am J Cardiol (1989) 63: 1–6.

3. Kannel WB. Silent myocardial ischemia andinfarction: insights from the Framingham Study.Cardiol Clin (1986) 4: 583–91.

4. Bardy GH, Lee KL, Mark DB, Poole JE, Fish-bein DP, Singh SN, et al. The Sudden Car-diac Death-Heart Failure Trial (SCD-HeFT). In:

Woosley RL, Singh SN, eds, Arrhythmia Treat-ment and Therapy. New York: Marcel Dekker(2000) 323–42.

5. National Institutes of Health and National Heart,Lung and Blood Institute. Morbidity & Mortality:2002 Chart Book on Cardiovascular, Lung, andBlood Diseases. U.S. Department of Health andHuman Services, Public Health Service, NationalInstitutes of Health (2002).

6. Clinical Trials in Cardiovascular Disease. A Com-panion to Braunwald’s Heart Disease. Philadel-phia: W.B. Saunders (1999).

7. Canner PL, Klimt CR. The Coronary Drug Project.Experimental design features. Contr Clin Trials(1983) 4: 313–32.

8. The Coronary Drug Project Research Group.Clofibrate and niacin in coronary heart disease.JAMA (1975) 231: 360–81.

9. Canner PL, Berge KG, Wenger NK, Stamler J,Friedman L, Prineas RJ, et al. Fifteen year mor-tality in Coronary Drug Project patients: long-termbenefit with niacin. J Am Coll Cardiol (1986) 8:1245–55.

10. Lipid Research Clinics Coronary Primary Preven-tion Trial. The Lipid Research Clinics CoronaryPrimary Prevention Trial results. I. Reduction inincidence of coronary heart disease. JAMA (1984)251: 351–64.

11. The Scandinavian Simvastatin Survival Study.Randomised trial of cholesterol lowering in 4444patients with coronary heart disease: the Scandi-navian Simvastatin Survival Study (4S). Lancet(1994) 344: 1383–9.

12. Sacks FM, Pfeffer MA, Moye LA, Rouleau JL,Rutherford JD, Cole TG, et al. The effect ofpravastatin on coronary events after myocardialinfarction in patients with average cholesterollevels. Cholesterol and Recurrent Events Trialinvestigators. N Engl J Med (1996) 335: 1001–9.

13. MRC/BHF Heart Protection Study of cholesterollowering with simvastatin in 20,536 high-riskindividuals: a randomised placebo-controlled trial.Lancet (2002) 360: 7–22.

14. ALLHAT Collaborative Research Group. Majorcardiovascular events in hypertensive patientsrandomized to doxazosin vs chlorthalidone: theantihypertensive and lipid-lowering treatment toprevent heart attack trial (ALLHAT). JAMA (2000)283: 1967–75.

15. The ALLHAT Officers and Coordinators for theALLHAT Collaborative Research Group. Majoroutcomes in moderately hypercholesterolemic,hypertensive patients randomized to pravastatinvs usual care. The Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial(ALLHAT-LLT). JAMA (2002) 288: 2998–3007.

Page 192: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

188 TEXTBOOK OF CLINICAL TRIALS

16. The ALLHAT Officers and Coordinators forthe ALLHAT Collaborative Research Group.Major outcomes in high-risk hypertensive patientsrandomized to angiotensin-converting enzymeinhibitor or calcium channel blocker vs diuretic.The Antihypertensive and Lipid-Lowering Treat-ment to Prevent Heart Attack Trial (ALLHAT).JAMA (2002) 288: 2981–97.

17. National Institutes of Health and National Heart,Lung and Blood Institute. Third Report of theNational Cholesterol Education Program (NCEP)Expert Panel on Detection, Evaluation, and Treat-ment of High Blood Cholesterol in Adults (AdultTreatment Panel III). U.S. Department of Healthand Human Services, Public Health Service(2001).

18. Yusuf S, Sleight P, Pogue J, Bosch J, Davies R,Dagenais G. Effects of an angiotensin-converting-enzyme inhibitor, ramipril, on cardiovascularevents in high-risk patients. The Heart OutcomesPrevention Evaluation Study Investigators. N EnglJ Med (2000) 342: 145–53.

19. Pfeffer MA, Domanski M, Rosenberg Y, Verter J,Geller N, Albert P, et al. Prevention of eventswith angiotensin-converting enzyme inhibition(the PEACE study design). Prevention of Eventswith Angiotensin-Converting Enzyme Inhibition.Am J Cardiol (1998) 82: 25H–30H.

20. SHEP Cooperative Research Group. Prevention ofstroke by antihypertensive drug treatment in olderpersons with isolated systolic hypertension. Finalresults of the Systolic Hypertension in the ElderlyProgram (SHEP). JAMA (1991) 265: 3255–64.

21. Echt DS, Liebson PR, Mitchell LB, Peters RW,Obias-Manno D, Barker AH, et al. Mortality andmorbidity in patients receiving encainide, fle-cainide, or placebo. The Cardiac Arrhythmia Sup-pression Trial. N Engl J Med (1991) 324: 781–8.

22. The Cardiac Arrhythmia Suppression Trial IIInvestigators. Effect of the antiarrhythmic agentmoricizine on survival after myocardial infarction.N Engl J Med (1992) 327: 227–33.

23. The Planning and Steering Committee of theAFFIRM Study for the NHLBI AFFIRM Inves-tigators. Atrial Fibrillation Follow-up Investiga-tion of Rhythm Management – the AFFIRM studydesign. Am J Cardiol. (1997) 79: 1198–202.

24. The Atrial Fibrillation Follow-up Investigation ofRhythm Management (AFFIRM) Investigators. Acomparison of rate control and rhythm controlin patients with atrial fibrillation. N Engl J Med(2002) 347: 1825–33.

25. Detre K, Peduzzi P, Murphy M, Hultgren H,Thomsen J, Oberman A, et al. Effect of bypasssurgery on survival in patients in low- andhigh-risk subgroups delineated by the use of

simple clinical variables. Circulation (1981) 63:1329–38.

26. The AVID Investigators. Antiarrhythmics Ver-sus Implantable Defibrillators (AVID) – rationale,design, and methods. Am J Cardiol (1995) 75:470–5.

27. Buchwald H, Varco RL, Matts JP, Long JM, FitchLL, Campbell GS, et al. Effect of partial ilealbypass surgery on mortality and morbidity fromcoronary heart disease in patients with hyperc-holesterolemia. Report of the Program on the Sur-gical Control of Hyperlipidemia (POSCH). N EnglJ Med (1990) 323(14): 946–55.

28. Buchwald H, Varco RL, Boen JR, Williams SE,Hansen BJ, Campos CT, et al. Effective lipidmodification by partial ileal bypass reduced long-term coronary heart disease mortality and mor-bidity: five-year posttrial follow-up report fromthe POSCH. Program on the Surgical Control ofthe Hyperlipidemias. Arch Intern Med (1998) 158:1253–61.

29. Connolly SJ, Gent M, Roberts RS, Dorian P,Roy D, Sheldon RS, et al. Canadian implantabledefibrillator study (CIDS): a randomized trial ofthe implantable cardioverter defibrillator againstamiodarone. Circulation (2000) 101: 1297–302.

30. Epstein AE. AVID necessity. PACE Pacing ClinElectrophysiol (1993) 16: 1773–5.

31. Fogoros RN. An AVID dissent. PACE Pacing ClinElectrophysiol (1994) 17: 1707–11.

32. The AVID Investigators. A comparison of anti-arrhythmic-drug therapy with implantable defib-rillators in patients resuscitated from near-fatalventricular arrhythmias. N Engl J Med (1997) 337:1576–83.

33. Domanski MJ, Epstein A, Hallstrom A, Sak-sena S, Zipes DP. Survival of antiarrhythmicor implantable cardioverter defibrillator treatedpatients with varying degrees of left ventriculardysfunction who survived malignant ventriculararrhythmias. J Cardiovasc Electrophysiol (2002)13: 580–3.

34. Moss AJ. Background, outcome, and clinicalimplications of the Multicentre Automatic Defib-rillator Implantation Trial (MADIT). Am J Cardiol(1997) 80: 28F–32F.

35. Moss AJ, Zareba W, Hall WJ, Klein H, WilberDJ, Cannom DS, et al. Prophylactic implantationof a defibrillator in patients with myocardialinfarction and reduced ejection fraction. N EnglJ Med (2002) 346: 877–83.

36. Rose EA, Moskowitz AJ, Packer M, Sollano JA,Williams DL, Tierney AR, et al. The REMATCHtrial: rationale, design, and end points. Ran-domised Evaluation of Mechanical Assistance forthe Treatment of Congestive Heart Failure. AnnThorac Surg (1999) 67: 723–30.

Page 193: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CARDIOVASCULAR 189

37. Rose EA, Gelijns AC, Moskowitz AJ, Heitjan DF,Stevenson LW, Dembitsky W, et al. Long-termmechanical left ventricular assistance for end-stage heart failure. N Engl J Med (2001) 345:1435–43.

38. Stevenson LW, Kormos RL, Bourge RC, Geli-jns A, Griffith BP, Hershberger RE, et al. Mechan-ical cardiac support 2000: current applications andfuture trial design, June 15–16, 2000, Bethesda,Maryland. J Am Coll Cardiol (2001) 37: 340–70.

39. Lamas GA, Lee KL, Sweeney MO, Silverman R,Leon A, Yee R, et al. Ventricular pacing or dual-chamber pacing for sinus-node dysfunction. NEngl J Med (2002) 346: 1854–62.

40. Connolly SJ, Kerr CR, Gent M, Roberts RS,Yusuf S, Gillis AM, et al. Effects of physiologicpacing versus ventricular pacing on the risk ofstroke and death due to cardiovascular causes.Canadian Trial of Physiologic Pacing Investiga-tors. N Engl J Med (2000) 342: 1385–91.

41. Yusuf S, Zucker D, Peduzzi P, Fisher LD,Takaro T, Kennedy JW, et al. Effect of coronaryartery bypass graft surgery on survival: overviewof 10-year results from randomised trials by theCoronary Artery Bypass Graft Surgery TrialistsCollaboration. Lancet (1994) 344: 563–70.

42. Bigger JT Jr. Prophylactic use of implanted car-diac defibrillators in patients at high risk for ven-tricular arrhythmias after coronary-artery bypassgraft surgery. Coronary Artery Bypass Graft(CABG) Patch Trial Investigators. N Engl J Med(1997) 337: 1569–75.

43. Appel LJ, Moore TJ, Obarzanek E, Vollmer WM,Svetkey LP, Sacks FM, et al. A clinical trial ofthe effects of dietary patterns on blood pressure.DASH Collaborative Research Group. N Engl JMed (1997) 336: 1117–24.

44. Sacks FM, Svetkey LP, Vollmer WM, Appel LJ,Bray GA, Harsha D, et al. Effects on bloodpressure of reduced dietary sodium and the Dietary

Approaches to Stop Hypertension (DASH) diet. NEngl J Med (2001) 344: 3–10.

45. Writing Group of the PREMIER CollaborativeResearch Group. Effects of comprehensive lifestylemodification on blood pressure control. Mainresults of the PREMIER clinical trial. JAMA (2003)289: 2083–93.

46. The ENRICHD Investigators. Enhancing recoveryin coronary heart disease (ENRICHD) studyintervention: rationale and design. Psychosom Med(2001) 63: 747–55.

47. Writing Committee for the ENRICHD Investi-gators. Effects of treating depression and lowperceived social support on clinical events aftermyocardial infarction. The Enchancing Recoveryin Coronary Heart Disease Patients (ENRICHD)randomized trial. JAMA (2003) 289: 3106–16.

48. Luepker RV, Raczynski JM, Osganian S, Gold-berg RJ, Finnegan JR Jr, Hedges JR, et al. Effectof a community intervention on patient delay andemergency medical service use in acute coronaryheart disease: the Rapid Early Action for Coro-nary Treatment (REACT) trial. JAMA (2000) 284:60–7.

49. Farquhar JW, Fortmann SP, Flora JA, Taylor CB,Haskell WL, Williams PT, et al. Effects of com-munitywide education on cardiovascular diseaserisk factors. The Stanford Five-City Project. JAMA(1990) 264: 359–65.

50. Luepker RV, Murray DM, Jacobs DR, Mittel-mark MB, Bracht N, Carlaw R, et al. Communityeducation for cardiovascular disease prevention:risk factor changes in the Minnesota Heart HealthProgram. Am J Public Health (1994) 84: 1383–93.

51. Carleton RA, Lasater TM, Assaf AR, FeldmanHA, McKinlay S, Pawtucket Heart Health Pro-gram Writing Group. The Pawtucket Heart HealthProgram: community changes in cardiovascularrisk factors and projected disease risk. Am J PublicHealth (1995) 85: 777–85.

Page 194: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 195: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

13

Dentistry and Maxillo-facialMAY C.M. WONG, COLMAN MCGRATH AND EDWARD C.M. LODental Public Health, Faculty of Dentistry, The University of Hong Kong, Hong Kong

INTRODUCTION

SCOPE OF THE CHAPTER

Dentistry is concerned with the prevention andtreatment of diseases and disorders of the teeth,gums (periodontium) and oral cavity. The twomost common oral diseases are dental caries(tooth decay) and periodontal (gum) diseases.Data from the Global Oral Health Databankof the World Health Organization (http://www.whocollab.od.mah.se/index.html) reports that atleast half of the children and nearly all of theadults in most countries throughout the worldhave been affected by dental caries. In addition,findings from epidemiological surveys throughoutthe world have reported that less than 10% oftheir adult population have no periodontal disease(completely healthy gums). One of the morelife-threatening diseases of the oral cavity isoral cancer, primarily cancer involving the oralmucosa (lining of the oral cavity). The prevalenceof oral cancer varies from country to country, inmost countries it accounts for less than 1% ofthe total cancer incidence whereas in the Indiansubcontinent it can account for 30–50% of thetotal cancer incidence.1

Aside from oral diseases there are a numberof conditions or disorders of the oral cavity thatare of concern. Malalignment and malocclusionof teeth (crooked teeth) is prevalent and severein many countries and most report a growingdemand for orthodontic treatment to correct themalocclusion. In the US, it is estimated thataround half of the population are in need of somekind of orthodontic treatment to improve theirocclusion.2 Another problem has been the needfor replacement of missing teeth; congenitallyabsent or lost because of caries or periodontaldisease. Removable prosthesis (dentures or falseteeth) and fixed prosthesis (bridges) as well asimplants (screw in teeth) have been used toaddress these problems.

As the scope of dentistry is very wide, itwould not be possible to include all kinds ofclinical trials in dental research in this chapter.Instead, the discussion here will focus on themore common oral diseases and conditions.First, the disease aetiology and measurementsof dental caries and periodontal disease willbe presented. Second, clinical trials methodsused in dentistry will then be outlined andillustrated with examples. Third, the designs ofclinical trials conducted in the areas of dental

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 196: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

194 TEXTBOOK OF CLINICAL TRIALS

caries, oral rehabilitation, periodontal diseaseand orthodontics will be discussed. Fourth, thecurrent issues of evidence-based dentistry andhierarchical data analysis will also be discussed.Last, the impact of clinical trials on dentalpractice will be summarised.

DISEASE AETIOLOGY AND MEASUREMENTS

Evidence from animal and epidemiological stud-ies shows that dental caries arise from dem-ineralisation of tooth hard tissue due to organicacids produced by plaque bacteria on the toothsurface.3 Frequent intake of fermentable carbo-hydrates, especially sugars, has been shown tobe related to the development of dental caries.4

The most common measure used in clinical tri-als to quantify the severity of dental caries is tocount the number of decayed, filled and missing(due to caries) teeth or tooth surfaces, the DMFTor DMFS index.5 The current treatment approachfor dental caries emphasises prevention of the dis-ease by strengthening the tooth, such as the useof fluorides and fissure sealants, modification ofthe diet, such as the use of sugar substitutes, andappropriate health behaviours. Cavities producedby the caries process can be filled by variousdirect and indirect restorative materials.

Periodontal disease is characterised by theinflammatory response of the gums and its sequelto the toxic substances produced by the plaquebacteria.6 The current treatment approach forperiodontal disease emphasises primary and sec-ondary prevention of the disease through theremoval of plaque by mechanical and chemi-cal means, e.g. toothbrushing and the use ofmouthrinses. There are also various surgical andnon-surgical ways to treat the periodontal pock-ets that are formed in more advanced periodontaldisease states. Many indices have been used inclinical trials to quantify the amount of plaque onthe tooth surfaces, ranging from a simple dichoto-mous scale of presence or absence of visibleplaque7 to recording the thickness of the plaque atthe gum margin8 or the area of tooth surface cov-ered by plaque.9 Gingivitis is usually measuredby the presence or absence of bleeding after gen-tle probing7 or in an ordinal scale using various

clinical signs.10 For more advanced periodon-tal disease, usually the depth of the periodontalpocket and/or the attachment loss are measuredin millimetres using marked probes.

Following the developments in medical re-search, patient-based outcomes measures havealso been developed and employed in dentalresearch. These have focused primarily on theconcept of patient satisfaction and health-relatedquality of life measures. A number of question-naires have been developed recently to measurethe oral health-related quality of life of peo-ple, for example, the Oral Health Impact Profile(OHIP) and the General Oral Health Assess-ment Index (GOHAI).11 Some of these have beenused in clinical studies to assess the outcome ofdental treatment,12,13 to supplement the clinicalassessments.

CLINICAL TRIALS METHODS

In dental research, the clinical trials methodsused mainly follow those developed and adoptedin medical research. The basic design principlesand considerations are very similar, thus thefollowing discussion on clinical trials methodscan be kept short and precise for general areas,with specific areas unique in dentistry discussedin more detail. A few references on clinical trialmethodology and some from the dental literatureare recommended for general reading.14 – 17

RANDOMISED CONTROLLED TRIALS

As with the developments in medical research,randomised controlled trials (RCTs) have becomethe gold standard in conducting clinical trialsin dentistry. The key features of RCTs aretreatment modalities being assigned randomly tothe subjects and the existence of a control group.The controls can either be concurrent controls asin parallel studies or self-controls as in crossoverstudies. The treatment received in the controlgroup can be placebo or standard treatment. Inthe perfect setting, RCTs should also be double-blinded which requires that both the subjects and

Page 197: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 195

the examiners/observers involved in the trials arenot aware of the assignment of the treatmentmodalities to the subjects, thus reducing anybiases in the comparison of the groups besidesrandomisation.

Informed consent, ethical consideration, datamonitoring and pre-study sample size calculationare also important issues in conducting RCTs.Subjects should be informed about the researchprotocols, their roles as participants in thestudies, the different treatment modalities, thepossibility of any side-effects or risks arisingfrom receiving the treatments, and their rights ofdiscontinuing participation. After the explanationof the above details to the subjects, the subjectsshould be given ample opportunity and time toask any questions and to discuss the trial withfamily and friends. Written consent is normallyrequired, however under special circumstances,verbal consent may be employed.18 It is goodpractice that clinical trials only be conducted afterapproval from local ethical committees in orderto protect human rights.

Data monitoring is especially important inlarge-scale, multicentre RCTs and usually adata monitoring committee is established tomonitor the data collected during the study.The committee needs to monitor the data forpatient safety and statistical significance, whilekeeping their findings confidential to prevent theintroduction of bias.15

PARALLEL AND CROSSOVERSTUDY DESIGNS

In the parallel study design, concurrent groupsof subjects are involved in the study and thecomparison of the different modalities is the com-parison of between-subject variation. When thenumber of treatment modalities increases, thecorresponding sample size required in order toachieve a particular level of power and signifi-cance needs to be increased greatly. Crossoverstudy design is a self-controlled study design,subjects serve as their own controls. Thus, thecomparison of the different modalities is the com-parison of within-subject variation. The use of the

subjects as their own controls prevents confound-ing by many characteristics that may influencethe outcome. In a crossover study each subject isgiven the different treatments (or treatment andplacebo) under comparison, one after another.Each subject is his/her own control. The sequenceof assignment is generally randomised, so thatthis is in effect a type of RCT. A ‘washing-outperiod’ may be required between treatments, topermit the effects of the previous treatment todisappear. However, since subjects who partici-pate in clinical trials with crossover design needto receive all treatment regimens and undergo‘washing-out periods’ between treatments, it canmake the investigation periods of the clinical tri-als very long and not feasible.19

Crossover trials are frequently employed inoral hygiene studies where treatment effects arereversible. An example is a single-blind, short-term crossover clinical trial where the plaqueremoval performance of three commerciallyavailable manual toothbrushes was compared.20

A sample of 25 dental hygiene students partici-pated (age 19 to 42 years old). The participantswere instructed to refrain from toothbrushing orflossing for 24 hours before the trial. A pre-brushing plaque index using disclosing solutionwas performed on each participant. One of thethree test brushes was then randomly assigned toeach participant, and they were allowed to brushfor 90 seconds without the aid of a mirror. Apost-brushing plaque index was then performedon each participant. This procedure was repeatedtwice more at 2-week intervals so that each par-ticipant was tested with all three toothbrushes.

Crossover design will not be applicable ifthe treatment has protracted ‘carry-over’ effect,i.e. the effect of the treatment is not reversible.In this situation, either parallel or split-mouthdesign should be adopted. For example in thecase of dental caries, since most carious lesionsoccur in the pit and fissure on the occlusalsurfaces of the posterior teeth, the effectivenessof sealing these pits and fissures in order toprevent dental caries is studied. In studies forcomparing the effectiveness of fissure sealantcompared to non-sealed teeth or sealant with

Page 198: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

196 TEXTBOOK OF CLINICAL TRIALS

different active ingredients to prevent caries,crossover design is not applicable as once theteeth are sealed, the process is not reversible.Thus for these studies, either parallel designor split-mouth design would be used. In thesetting of parallel design, subjects are assignedto different concurrent test groups; while inthe setting of split-mouth design, different teethof the same group of subjects are assigned todifferent test groups at the same study period.21

SPLIT-MOUTH DESIGN

Split-mouth design is one of the self-controlledstudy designs, that is unique in dentistry. Thisdesign is characterised by subdividing the mouthof the subjects into homogeneous within-patientexperimental units such as quadrants (upper left,upper right, lower left and lower right sides ofthe mouth), sextants (upper left posterior, upperanterior, upper right posterior, lower left poste-rior, lower anterior and lower right posterior),contralateral (left and right) or ipsilateral (upperand lower) quadrants or sextants or a symmetricalcombination of these. With these within-patientexperimental units, a range of two to six differenttreatment modalities can be randomly assigned tothe experimental units.22 The number of treat-ment modalities usually equals the number ofwithin-patient experimental units. For instance,in a study where two treatment modalities arecompared, the within-patient experimental unitswould usually be the left or right sides of themouth. In a study where four treatment modali-ties are compared, the within-patient experimen-tal units could be the four quadrants of the mouth.The split-mouth design has been the principalresearch tool in periodontal clinical trials to com-pare different treatment modalities. In the peri-odontal literature, at least 11 different types ofsplit-mouth design have been described.22

The major advantage of using the split-mouthdesign, like the crossover design, is that subjectsserve as their own controls, thus this design maybe more efficient than designs with between-subject comparisons. However, in contrast tothe crossover design, since the subjects are

receiving all the treatment modalities at thedifferent parts of the mouth concurrently, thestudy period of the investigation could be shorter.The study period could then be of the sameduration as if the parallel design was used, butthe number of subjects used could be reduced.In an investigation of the efficiency of split-mouth design compared to whole-mouth design(with the use of parallel study design), it wasconcluded that when disease characteristics aresymmetrically distributed over the within-patientexperimental units, the split-mouth design couldprovide moderate to large gains in relativeefficiency. For periodontal disease, division ofthe mouth into two experimental units, either leftand right or upper and lower sides, provided thegreatest symmetry of the disease characteristics.22

Besides the distribution of the disease, oneshould also consider the carry-across effectsarising from the split-mouth designs. Carry-across effects occur where treatment performedin one experimental unit can affect the treat-ment response in other experimental units ofthe mouth. These effects cannot be estimatedfrom split-mouth data, thus unless prior knowl-edge indicates that no carry-across effects exist,reported estimates of treatment efficacy arepotentially biased. Thus, researchers shouldweigh the potential gain in precision against apotential decrease in validity when using split-mouth designs in clinical trials.23 In a study tocompare the effectiveness of two electric tooth-brushes in plaque removal, a split-mouth designwas used in which either the first (upper right)and third (lower left) quadrants or the second(upper left) and fourth (lower right) quadrantsof 84 subjects were brushed with one or theother toothbrush in a random assignment.24 Inthis situation, since the distribution of plaqueinside the mouth is symmetrical, and the use ofdifferent toothbrushes in brushing different partsof the mouth should not affect the other parts,i.e. no carry-across effect, the use of a split-mouth design should be appropriate. In studiesto compare the effectiveness of fluoride tooth-paste in preventing dental caries, split-mouthdesign is not advisable as the fluorides from the

Page 199: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 197

toothpaste could go freely within the mouth, i.e.with a carry-over effect. For these studies, paral-lel design should be more appropriate.

BLINDING

In order to achieve double-blinding, the subjectsand examiners/observers should not be able to tellwhich treatment modalities have been assigned tothe subjects. This can be done, for example, in amouthrinse study comparing the test mouthrinsewith placebo; the placebo mouthrinse is madewith the same appearance and taste as the testmouthrinse, so that the subjects would not be ableto distinguish the two by sight and they are nottold which mouthrinse they have been assigned.On the other hand, the examiners/observersare also not able to reach the informationfor the assignment of the mouthrinse to thesubjects.25 However, in many other situations, wecan only blind the examiners/observers but notthe subjects (single-blinding). In the study wementioned above concerning the comparison ofthe effectiveness of three electric toothbrushes inplaque removal, it is inevitable that the subjectswill know which toothbrush they were using,thus in this situation, the best that one canachieve is to blind the investigator from knowingwhich toothbrushes have been assigned to whichquadrants of the subjects.24 There are situationswhere even single-blinding is not feasible. In astudy comparing the performance of two dentalfilling materials, amalgam (metal) versus resin-modified glass ionomer cement (tooth colour), itwould be very difficult to blind the investigator asone can distinguish the two by their appearance.26

In conducting clinical trials, one should use themaximum degree of blindness that is possible. Instudying the prevalence of caries and fluorosisof children from a water-fluoridated site anda nearby non-fluoridated site, children weretransported to a common examination site soas to blind the investigators from knowing theresidence of the children.27 Instead of having theinvestigators examining the children at the sitesand then recognising the fluoride content in thewater, this method of transporting the children

to a common examination site demonstrates aninnovative way for researchers to maximise thedegree of blindness.

CLINICAL TRIALS IN DENTISTRY

CARIES PREVENTION AND TREATMENTSTUDIES

The aims for these prevention studies are toinvestigate the effectiveness of different ways ofpreventing dental caries. These include differentmethods of strengthening the teeth (such as theuse of fluorides in different forms), modificationof diet (such as the use of sugar substitutes),or modification of health behaviours (such astooth brushing techniques and habits, oral edu-cation programmes). The target populations forthese studies are mainly children, the elderly andspecial needs groups. Most of the clinical trialsare phase I or phase II types. For those studiesinvestigating the effectiveness of different formsof fluorides (in the form of toothpaste, topicalfluorides, sealant), randomisation of the assign-ment of groups with different regimes (includingthe control group) can be done at the individuallevel with parallel design. In a study to com-pare the effectiveness of two toothpastes withdifferent concentration of fluoride to arrest rootcarious lesions, 201 subjects with at least oneroot carious lesion were recruited from dentalschool patients. They were randomly assigned touse either Prevident 5000 Plus (5000 ppm F) orColgate Winterfresh Gel (1100 ppm F), both con-taining sodium fluoride in the same silica base.Measurements of lesion hardness, area, distancefrom the gingival margin, cavitation and plaquewere recorded at baseline and 3 months later bya single examiner.28

For those studies involving the modification ofdiet and behaviour, field studies were used moreoften because randomisation of the assignment ofgroups with different regimes would be easier toachieve at the school or community level. In a3-year community intervention trial to determinethe caries preventive effect of sugar-substitutedchewing gum among Lithuanian school children,

Page 200: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

198 TEXTBOOK OF CLINICAL TRIALS

a total of 602 children, aged 9–14 years, fromfive secondary schools in Kaunas, Lithuaniawere recruited. Baseline clinical and radiographiccaries examinations were given. The schoolswere randomly allocated to receive one of the fiveinterventions: sorbitol/carbamide gum; sorbitolgum; xylitol gum; control gum; and no gum.Children in the four active intervention groupswere asked to chew at least five pieces of gumper day, preferably after meals. The children werere-examined clinically after 1, 2 and 3 years, andradiographically after 3 years.29

In both the above examples, parallel designwas adopted. However, studies like the com-parison of two different fissure sealant materialscould be carried out using the split-mouth design.In a study to compare the retention and the cariespreventive effect of a glass ionomer developedfor fissure sealing (Fuji III) and a chemicallypolymerised resin-based fissure sealant (Delton),179 7-year-old children with at least one pair ofpermanent first molars that were caries-free oronly had incipient lesions were recruited fromschools. A split-mouth design was adopted byassigning the two sealing materials randomly tothe contralateral teeth. Follow-up examinationsfor sealant retention were done after 6 months,1 year, 2 years and 3 years.30

In order to determine the level of fluoride thatshould be used (dose finding), some studies havefocused on comparing the effectiveness of differ-ent concentrations of fluorides in caries preven-tion. For example, in a randomised, double-blindstudy comparing the anti-caries effectiveness ofsodium fluoride dentifrices containing 1700 ppm,2200 ppm and 2800 ppm fluoride ion relative toan 1100 ppm fluoride ion control, a populationof 5439 schoolchildren, aged 6–15 years, wasrecruited from an urban central Ohio area witha low fluoride content water supply (<0.3 ppm).Subjects were stratified according to gender,age and baseline DMFS scores derived fromthe visual–tactile baseline examination. Randomallocation of the four treatment groups was done:0.243% sodium fluoride (1100 ppm fluoride ion),0.376% sodium fluoride (1700 ppm fluoride ion),0.486% sodium fluoride (2200 ppm fluoride ion)

and 0.619% sodium fluoride (2800 ppm fluo-ride ion). All products were formulated with thesame fluoride-compatible silica abrasive. Sub-jects were examined by visual–tactile and radio-graphic examination at baseline and after 1, 2 and3 years of using the sodium fluoride dentifrices.31

The aims of the caries treatment studies wereto investigate the performance of the differentmaterials or different approaches used to fill upthe decayed cavities in the mouth in terms ofbond strength and retention of the materials.As the treatments being delivered cannot bereversed, thus crossover design is not applicablefor these studies. Among these studies, the use ofsplit-mouth design was more common. In a studyto compare the clinical performance of two glassionomer cements, ChemFlex (Dentsply DeTrey)and Fuji IX GP (GC), when used with theatraumatic restorative treatment (ART) approachin China, 89 schoolchildren aged between 6and 14 years who had bilateral matched pairsof carious posterior teeth were included. Asplit-mouth design was used in which the twomaterials were randomly placed on contralateralsides. The performance of the restorations wasassessed directly and also indirectly from die-stone replicas at baseline and after 6, 12 and24 months.32

ORAL REHABILITATION STUDIES

One concern of oral rehabilitation studies isto compare a range of treatment modalities toreplace missing teeth including removable andfixed dental prostheses, and the use of implants.Depending on the treatment modalities, parallel,split-mouth and crossover designs have beenapplied in these studies. In a 5-year parallelstudy to compare implant-retained mandibularoverdentures (IRO) with complete dentures (CD),61 and 60 patients were randomly assigned tothe IRO and CD groups. The clinical aspects andpatient satisfaction were measured.33

In a split-mouth study, sandblasted and acid-etched (SLA) implants (recently introduced toreduce the healing period between surgery and

Page 201: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 199

prosthesis) were compared to titanium plasma-sprayed (TPS) implants under loaded condi-tions 1 year after placement. Thirty-two healthypatients with comparable bilateral edentuloussites and no discrepancies in the opposing denti-tion were recruited. The surgical procedure wasperformed by the same operator and was iden-tical for all the test and control sites. Abut-ment connection was carried out at 35 Ncm6 weeks post-surgery for test sites and 12 weeksfor the controls by the same dentist who wasblinded to the type of surface of the implant.Provisional restoration was fabricated and a newtightening was performed after 6 weeks. Simi-lar gold–ceramic restorations were cemented onthe same type of solid abutments on both sites.Clinical measures and radiographic changes wererecorded by the same operator who was blindedto the type of surface of the implant, 1 year post-surgery.34

In a study to compare two designs of maxil-lary implant overdentures, a crossover trial wasdesigned to measure differences in patient satis-faction with maxillary long-bar implant overden-tures with and without palatal coverage opposedby a fixed mandibular implant-supported prosthe-sis. A mandibular fixed prosthesis was insertedin 13 total edentulous participants, who werethen divided into two groups. One group (n = 7)

received maxillary long-bar overdentures withpalate, then long-bar overdentures without palate.The other group (n = 6) received the same treat-ments in the reverse order. For each overdenturesdesign, mastication tests and patient satisfactionwere assessed 2 months after the fitting of theoverdenture to allow adaptation.35

Besides clinical outcomes, very often patient-based outcomes such as patient satisfactionand quality of life measures were also mea-sured in these oral rehabilitation studies. As inthe above quoted example, patients were askedto rate (1) their general satisfaction with theupper prosthesis; (2) satisfaction on the physi-cal functions of the prosthesis such as reten-tion, stability, comfort, ease of cleaning, etc.;and (3) satisfaction on the psychosocial func-tions such as a esthetics, self-confidence, etc.

using both the Visual Analogue Scale and theCategory Scale.35 In another study, an oral-specific quality of life measure, the Oral HealthImpact Profile,36 was used to measure theimpact of the clinical intervention on quality oflife. Three groups of subjects were compared,edentulous/edentate subjects who requested andreceived complete implant-stabilised oral pros-theses (IG, n = 26), edentulous/edentate subjectswho requested implants but received conven-tional dentures (CDG1, n = 22), and edentuloussubjects who had new conventional completedentures (CDG2, n = 35).13 OHIP is one ofthe most comprehensive instruments available inevaluating oral health-related quality of life.

TRIALS RELATED TO PERIODONTAL DISEASE

In order to prevent periodontal disease, plaqueremoval and prevention of calculus depositwas one of the key concerns. Thus ways toimprove toothbrushing (mechanical means toremove plaque) or the use of mouthrinse ortoothpaste with active ingredients like chlorhex-idine and pyrophosphate ion (chemical means toremove plaque and to prevent calculus deposi-tion) were investigated. Two main streams oftherapy existed: non-surgical and surgical ther-apy. The aims for these studies were to investi-gate the effectiveness of the treatment modalitiesin reducing the depth of the periodontal pocketor improving periodontal attachment level.

Similar to the caries research, different studydesigns in RCT have been applied to periodon-tal research; one particular design issue that hasbeen discussed more in periodontal research com-pared to other areas in dental research is theconsideration of therapeutic equivalency. This isimportant because clinical trials for testing supe-riority and for testing equivalency should havedifferent designs and there has been a tendencyby the clinicians in periodontal research to con-fuse the difference between the two.37 Clinicaltrials whose purposes are to show equivalence oftwo or more treatments have traditionally utilisedmethods for demonstrating superiority. Thus if nostatistical differences are found, this only demon-strates that the treatments are not superior to the

Page 202: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

200 TEXTBOOK OF CLINICAL TRIALS

others, they may not be equivalent.19 The sam-ple size requirements for both equivalence andsuperiority studies investigating products used inperiodontal regeneration have been investigated.It was found that since equivalence clinical trialsrequire much smaller differences between groups,thus much larger sample sizes are required.38

ORTHODONTIC STUDIES

In orthodontic studies, two main concerns are theeffectiveness of early orthodontic treatment forClass II malocclusion and the value of maxillaryarch expansion for the treatment of posteriorcrossbite. Relatively speaking, fewer RCTs aredone in this research area. Some researchers haveargued that even though RCT has become thegold standard of conducting medical research,it could only apply to a very narrow spectrumof orthodontic questions. One quoted exampleis ‘it would be nearly impossible to enrollfully-informed subjects into any study whosealternatives are of markedly different morbidity:extraction versus non-extraction or orthodonticsversus surgery’.39 Three confusions (or inertia)have been summarised in explaining why themove to conduct RCTs in orthodontic researchhas been slow. First, there is a remarkablelevel of non-acceptance that the highest level ofevidence is derived from RCTs and retrospectiveinvestigation is regarded as more useful. Second,there is the argument that it is not ethicalto subject patients to a random allocation oftreatments if it is already known that one of thetreatments is superior. Third, there is a feelingthat RCTs are very large and difficult to manageand then they are expensive and a large amountof funding is required.40 O’Brien has providedsolutions to the above confusions: (1) one shouldaccept that RCTs derive the highest level ofevidence and retrospective investigation still hasa great value in generating questions for RCTs;(2) one should not simply believe that mosttreatments are superior to another without beingtested in an unbiased manner, thus it is actuallyunethical to provide treatment that has notbeen properly evaluated; (3) careful planning and

monitoring of the trials are all that is neededfor conducting RCTs. He has also urged journaleditors to promote the publication of RCTs.Hopefully, there should be a paradigm shiftin conducting RCTs in orthodontic research infuture years.

CURRENT ISSUES

EVIDENCE-BASED DENTISTRY

One of the major implications of conductingclinical trials is to provide scientific evidencefor the clinicians when they are making clinicaldecisions. Evidence-based medicine was definedas ‘the conscientious, explicit and judicious useof current best evidence in making decisionsabout the care of individual patients’.41 Den-tal researchers started addressing the issues ofevidence-based dentistry (EBD) in the 1990sand a series of articles has been published toaddress the concepts of EBD, the misunderstand-ings of EBD, the barriers to EBD and the pro-cesses of finding, evaluating and applying theevidence.42 – 48 Various studies making use of theEBD approach have been applied to differentareas in dental research and the journal Evidence-Based Dentistry has also been published sinceNovember 1998.

Systematic literature review is the foundationof the EBD approach. It differs from narrativereview as in the latter case the authors or expertsdo not use standardised ways to retrieve articlesand summarise information. Thus different writ-ers may arrive at different conclusions for thesame question. In systematic review, standards infinding, evaluating and synthesising evidence arereported and thus the conclusion is more reliable.

The Cochrane Collaboration is an internationalorganisation that ‘has developed in response toArchie Cochrane’s call for systematic, up-to-datereviews of all relevant RCTs of health care’(http://www.cochrane.org). In an influential book,Cochrane49 drew attention to the importanceof organising critical summaries of all relevantRCTs by specialty or subspecialty areas of healthcare, so that people can make more informed

Page 203: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 201

decisions. The first ‘Cochrane Centre’ wasopened and funded in the UK in October 1992, tofacilitate systematic reviews of randomised con-trolled trials across all areas of health care. Cur-rently, there are 15 Cochrane Centres around theworld. However, the Cochrane Centres are notdirectly responsible for preparing and maintain-ing systematic reviews. This is the responsibil-ity of international collaborative review groups,which also maintain registers of systematicreviews currently being prepared or planned, sothat unnecessary duplication of effort can beminimised and collaboration promoted. Currentlythere are about 50 review groups covering allof the important areas of health care and den-tistry included in the Cochrane Oral Health Group(http://www.cochrane-oral.man.ac.uk). The prin-cipal output of the collaboration is the CochraneReviews which are published electronically insuccessive issues of The Cochrane Database ofSystematic Reviews. Ten Cochrane Reviews havebeen finished in the Cochrane Oral Health Group,including: fluoride gels for preventing dentalcaries in children and adolescents; guided tissueregeneration for periodontal infra-bony defects;interventions for preventing oral mucositis ororal candidiasis for patients with cancer receivingchemotherapy; interventions for treating burn-ing mouth syndrome, oral mucositis, oral leuko-plakia, oral candidiasis and oral lichen planus;orthodontic treatment for posterior crossbites; andpotassium nitrate toothpaste for dentine hyper-sensitivity. Twenty-seven protocols are registeredcurrently to review various fluoride products inpreventing caries, various regimes of interven-tions for replacing missing teeth, and variousorthodontic treatments protocols, etc.

HIERARCHICAL AND MULTILEVELDATA ANALYSIS

Hierarchical data (or clustered data) is common indental research, as people may have up to 32 teethand taking measurements from multiple teeth ofthe same individual is very typical in the data col-lection of clinical trials in dentistry. For example,in caries prevention studies, the individual teeth

of the subjects will be examined and in periodon-tal research, usually six sites of each tooth willbe examined; all these measurements are possi-bly correlated or clustered. More examples havebeen given by Macfarlane and Worthington50

and Gilthorpe et al.51 With these correlated orclustered data, conventional statistical methods,which assume observations being independent,are not appropriate for the analysis. Thus spe-cial statistical analysis is required when data havea hierarchical structure. Analysing data withoutrecognising the hierarchical structure and treat-ing the observations as independent will lead toa spurious increase in the sample size, and corre-sponding spurious significant relationship.52 Forexample in periodontal research, one might takeup to six site measurements for each tooth andwith 28 teeth (not including the wisdom teeth) foran individual, the total number of observationsfor one individual could then be 168. Thus it iseasy to have thousands of observations with only20 subjects. In a study, the total number of sitesassessed was 2236 distributed on 559 teeth in 22subjects.53 One way to overcome this problem isto aggregate the raw observations to the highestlevel of the data structure, for example, site/teethlevel measurements can be aggregated to the levelof an individual subject like the DMFT/DMFSindex, mean probing pocket depth, etc. Thus onemeasurement was taken from each individual andconventional statistical analysis could then beapplied. However, aggregation has several draw-backs, the most obvious of which is the loss ofdetailed information. Therefore, the aggregatedmeasure differentiates poorly both trait and sever-ity of the measures made at the disaggregatedlevel. Furthermore, aggregation is of little valuewhen the focus of interest lies at a lower level ofthe data hierarchy, for example sites with peri-odontal pockets.54

‘Multilevel modelling’ (MLM)55 or equiv-alently ‘hierarchical linear modelling’56 is aclass of techniques developed to analyse hier-archical data. It was first adopted to anal-yse educational data. These techniques are themodified version of statistical methods availablefor the analysis of single-level data structures

Page 204: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

202 TEXTBOOK OF CLINICAL TRIALS

(e.g. multiple regression, logistic regression, log-linear modelling) for the analysis of data withhierarchical structure. One could carry out themultilevel data analysis through the use ofsoftware specially written for MLM, like MLwiN(http://multilevel.ioe.ac.uk), or one could write upmacros in other statistical softwares like SAS,S-plus.57 In carrying out MLM, one should spec-ify the number of levels in the model and thenthe variables incorporated at each of the levels.Back to the example of the periodontal study inwhich the number of sites assessed was 2236 dis-tributed on 559 teeth in 22 subjects.53 One ofthe analyses performed by the researchers wasa multilevel analysis on the factors affecting thechange in probing pocket depth at the sites overthe course of the treatment. A three-level modelwas built with site as level 1, tooth as level 2and subject as level 3. At the site level, 12 vari-ables were constructed (e.g. presence of plaqueat the site, treatment received at the site); at thetooth level, three variables were included (e.g.tooth type) and at the subject level, 19 vari-ables were constructed (e.g. age, gender, smokinghabit). The above analysis was performed usingthe software MLn (the non-WindowsTM versionof MLwiN). With the use of MLM, it is then pos-sible to investigate the change in probing pocketdepth at the site with the consideration of theeffects from the site itself, the tooth that it belongsto and the subject overall. In order to evaluatewhether a three-level model is necessary, oneshould test the ‘null model’ that no indepen-dent variables were included and then check thesignificance of the variance at each level andfit the model accordingly. Several other studiesusing multilevel modelling in analysing caries,periodontal and orthodontics data have also beenpublished.54,57 – 59

IMPACT OF TRIALS ON DENTAL PRACTICE

DIET AND DENTAL CARIES

Evidence of the role of diet, particularly sugars,in relation to dental caries has largely beencollated from animal experiences or in vitro

studies. Human studies have largely been of theobservational type: world-wide epidemiologicalstudies, ‘before and after’ studies, and studiesamong people with both high and low sugarconsumption. Very few interventional studies onhuman subjects have been conducted,4,60 and areunlikely to be undertaken in the future giventhe difficulties of placing groups of people onrigid dietary regimes for long periods of time andbecause of ethical issues. The main conclusionof studies relating to sugar and dental carieshas been that (1) consumption of sugar, evenat high levels, is associated with only a smallincrease in caries increment if the sugar is takenup to four times a day and none between meals;(2) consumption of sugar both between meals andat meals is associated with a marked increasein caries increment.61,62 These conclusions haveshaped key dental education messages of oralhealth promotion campaigns relating to dietand dental health around the world and alsoformed the basis of dentist–patient dental healtheducation relating to diet and dental caries.63

Other trials have provided evidence of variationsin caries incidence with different types of sugars,notably, the low caries rate associated with theuse of sugar alcohols like xylitol.60 This has led tomore widespread use of non-carcinogenic sugaralternatives in drinks and foodstuffs.64 However,this may be attributed more to their low caloricvalue than their low cariogenicity.

WATER FLUORIDATION

Evidence of the effectiveness of water fluorida-tion has largely been based on cross-sectionaland ecological studies, ‘before and after’ stud-ies, and fewer cohort or case–control studies. Norandomised controlled trials have been reportedin the dental literature. Systematic reviews ofthe effectiveness of water fluoridation have con-cluded that it is an effective, efficient and safemethod of preventing dental caries and possiblypromotes equity in oral health in society.65 Thestudies have examined the relationship betweendental caries experience and the fluoride contentof the water supply and have shown clearly the

Page 205: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 203

association between increased fluoride concentra-tion in the drinking water and decreased dentalcaries experience in the population. However, thestudies have suggested that there is little ben-efit where water fluoride concentrations exceed1 ppm. These findings have resulted in the imple-mentation of water fluoridation in many indus-trialised and developing countries where centralwater supplies have made it feasible to do so. Itremains a key World Health Organization goalfor oral health.

However, a number of studies relate to thenegative influences of water fluoridation ondental and general health, primarily on the effectsof water fluoridation in producing dental fluorosis(tooth mottling) among the population.66 Fluorideat a concentration of 1 ppm is likely to producea small increase in dental mottling. However, inthe most part such mottling is unlikely to beof aesthetic concern. Despite strong evidence ofthe effectiveness and safety of water fluoridationsome communities have ceased to fluoridate theirwater supplies because of legal issues, socialacceptance and concern about the additionalbenefits of fluoridated water where other sourcesof fluoride are readily accessible.

ALTERNATIVES TO WATER FLUORIDATION

A wide range of alternative methods for admin-istering systemic fluoride have been suggested inthe literature, particularly milk fluoridation andsalt fluoridation.67 Extensive literature describ-ing the study of fluoride compounds administeredwith calcium-rich food, as well as clinical tri-als and laboratory experiments with fluoridatedmilk, have demonstrated the effectiveness of milkfluoridation in caries prevention.68 However, thecriticism of decreased bioavailability of the fluo-ride, the cost and administrative burden involved,and conflicting evidence of efficacy has resultedin few community milk fluoridation programmes.

Salt fluoridation has also been advocated asan alternative to water fluoridation. Evidence ofthe effectiveness of salt fluoridation has largelycome from test and control community studiesin several different countries.69 Despite the fact

that salt fluoridation offers the consumer a choiceto use fluoride supplements or not, there areonly a handful of countries where it is widelyavailable and consumed. Concerns about theappropriate dosage (a minimum of 200 mg/l F isrecommended) and safety for general health mayimpede its widespread implementation.70

Fluoride supplements in the form of tabletsto drops have long been considered an alterna-tive to water fluoridation. Although the effec-tiveness of fluoride supplements was endorsedby many small clinical studies, closer examina-tion of the experimental conditions of these, theirmethods and the analysis of their results under-mined confidence in their findings. More modern,well-conducted clinical trials of supplements sug-gest that today, in children also exposed to flu-oride from other sources such as toothpaste, themarginal effect of fluoride supplements is verysmall and there is substantial risk of fluorosis ifsupplements are used by young children.71 Thishas resulted in changes to recommended fluoridedosage schedules and deferment of the age com-mencing the use of supplements, implementedin many countries. Overall, poor compliance andpotential risks of fluorosis make fluoride supple-ments a poor public health measure and they areinfrequently prescribed in dental practice.72

FLUORIDATED TOOTHPASTE

The daily use of fluoride toothpaste is a highlyeffective method of delivering fluoride to thetooth surface and has proved to be a major aidto caries prevention.73 The concentration of flu-oride at 1000 ppm F has been suggested as asafe and effective means of preventing caries.74

Although evidence suggests that toothpastes withhigher fluoride concentrations are more effectivein preventing dental caries, because of safetyconcerns dentifrices exceeding 1500 ppm F areonly sold by prescription in many countries.75

However, for children trials have suggested thata lower concentration of fluoride in dentifrices(250–500 ppm F) be used and that only a mini-mum amount (less than 5 mm) should be placedon the brush to minimise risk.76 Some trials have

Page 206: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

204 TEXTBOOK OF CLINICAL TRIALS

suggested that combining more than one fluorideagent is more effective than using one source offluoride agent in preventing dental caries. How-ever, different formulations of toothpaste appearto have similar effectiveness.77 To some extentthe use of dentifrice has removed the need forprofessionally applied fluoride agents, except inspecial circumstances.

OTHER FORMS OF TOPICAL FLUORIDE

Many forms of professionally applied fluoridehave been studied, including solutions, gels orfoams of sodium fluoride, stannous fluoride,organic amine fluoride, acidulated phosphate flu-oride and non-aqueous fluoride varnishes in analcoholic solution of natural resins and difluorosi-lane agents covered by a polyurethane coating.All of these professionally applied topical agentshave anti-caries benefits, although the benefitsand ease of application vary.78 However, a recentsystematic review of the periodic scientific liter-ature undertaken to determine the strength of theevidence for the efficacy of professional cariespreventive methods applied to high-risk individ-uals, and the efficacy of professionally appliedmethods to arrest or reverse non-cavitated cariouslesions, concluded that the strength of the evi-dence was judged to be fair for fluoride varnishesand insufficient for all other methods.79 In dentalpractice professionally applied fluoride is infre-quently employed owing to more widespread useof other fluoride sources, reports of inconclusiveevidence and because of health care reimburse-ment for such preventive procedures.

FISSURE SEALANTS

Most carious lesions occur in the pit and fis-sure on the occlusal surface of posterior teeth.Over the years clinical trials have demonstratedthe effectiveness of sealing these fissures andpits in preventing dental caries.21 Light-curingand auto-polymerising sealants are equally effec-tive. However, the cost-effectiveness of fissuresealants in clinical trials remains questionable.80

Thus, fissure sealants should be employed on

clinical grounds, on patients with special needs, ahistory of extensive caries in the primary deten-tion or caries involving one or more molars.81

It is important that they are reviewed at regu-lar intervals.

TREATMENT OF CARIES LESIONS

A key focus of research has been the perfor-mance of direct posterior restorations (fillings),the longevity and reasons for failure of directresin-based composite (RBC), amalgam and glassionomer cement (GIC) restorations in stress-bearing posterior cavities. Predominantly studieshave been either of the longitudinal or retro-spective cross-sectional type, with few controlledclinical studies. GIC perform significantly worsecompared with amalgam and RBC.82 However,reasons for placement and replacement of directrestorations in dental practice relates to manyfactors, and aesthetic and safety concerns haveresulted in an increased use of RBC or GICrestorations in posterior teeth.83 The handling andfluoride leaching properties of GIC have madethem popular in general practice.84

REPLACEMENT OF MISSING TEETH

A range of treatment modalities to replace miss-ing teeth have been studied, including remov-able and fixed dental prostheses and the use ofimplants. Increasingly these studies have incorpo-rated patients’ perceptions of outcomes. Evidencehas largely been collated from longitudinal orcase–control studies with relatively few RCTs.Implant-retained overdentures are reported tobe superior to complete dentures.33 In addition,implants are useful in the treatment of par-tial edentulism. However, the widespread use ofimplants in practice has been limited by a numberof factors including health care cover and costs,operator experience and appropriateness for indi-vidual cases.

Another contentious issue has been the useof resin-bonded bridges (RBBs) which providea greater degree of conservation of tooth struc-ture of abutments compared with designs of con-ventional fixed prostheses in the treatment of

Page 207: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 205

partial edentulism.85 A key concern has beenthe longevity of RBBs, however studies suggestthat with appropriate case selection, preparationdesign and cementation they are a viable treat-ment option compared to conventional bridges.Increasingly RBBs are being employed in den-tal practice in the treatment of short edentu-lous spaces.

TRIALS RELATING TO PERIODONTALDISEASE

A key focus of periodontal trials has been theneed for plaque control to prevent periodon-tal diseases and for the maintenance of peri-odontal health.86 Primarily studies have focusedon mechanical methods of plaque control. Stud-ies have shown that the most important plaquecontrol method is toothbrushing; precise tech-nique is less important than the result, which isremoval of plaque without causing damage tothe teeth or gums.87 It is widely promoted toestablish toothbrushing practice as a daily routinefrom childhood.

Additional methods of mechanical plaque con-trol include interdental cleaning aids such as den-tal floss. While such aids have been shown to beeffective in plaque control with minimum damageif used correctly, they are generally prescribeddepending on the individual’s periodontal healthand their ability to use them appropriately.88

Chemical antimicrobial agents may be a usefuladjunct measure to managing periodontal health.In particular the use of chlorhexidene in thechemical control of plaques has widely beenadvocated, particularly in acute phases and inpreventing post-surgical infection.89 However,with the long-term use of chlorhexidene thereis a tendency for it to stain (extrinsic) teeth. Inmore recent times the addition of antimicrobialagents to dentifrices to aid plaque control hasbecome commonplace.90 The use of chemicalagents in the removal of plaque, while effective,is not recommended over the use of mechanicalagents.91

In the treatment of periodontal disease many tri-als have concluded that non-surgical periodontal

therapy is more appropriate than surgical peri-odontal therapy, and that surgical therapy shouldonly be considered when sites fail to respondto non-surgical methods despite adequate oralhygiene.92 State-funded and third-party payers oforal health care usually require detailed justifica-tion for surgical periodontal care.

ORTHODONTICS

While there has been considerable growth inthe practice of orthodontics there is a dearthof evidence-based research, particularly RCTs.40

A contentious issue has been the timing oforthodontics and the need for early orthodon-tic intervention. The evidence relating to earlyorthodontic treatment is inconclusive, with theresult that many clinicians decide, on a case-by-case basis, when to provide orthodontictreatment.93 Another key concern has been thevalue of maxillary arch expansion for the treat-ment of posterior crossbite. A Cochrane Reviewon the subject was unable to propose recom-mendations based on inadequate trials.94 Lackof evidence relating to the value of orthognathicsurgery versus orthodontic camouflage in thetreatment of mandibular deficiency, and also asto the need for extraction of teeth for orthodon-tic purposes, has resulted in clinicians decid-ing on a case-by-case basis without any clinicalguidelines.95

CONCLUSION

Currently the lack of high-quality research withindentistry, namely the lack of RCTs, has impededthe identification of best dental practice andthe implementation of evidence-based practicewithin dentistry. There is however widespreadrecognition of these problems and concertedefforts to undertake more collaborative high-quality research that can inform policies andguidelines to be implemented in dental practice.

REFERENCES

1. Stjernsward J, Stanley K, Eddy D, Tsechkovski M,Sobin L, Koza I, Notaney KH. National cancer

Page 208: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

206 TEXTBOOK OF CLINICAL TRIALS

control programs and setting priorities. CancerDetect Prevent (1986) 9: 113–24.

2. Proffit WR, Fields HW, Morav LJ. Prevalence ofmalocclusion and orthodontic treatment need inthe United States: estimates from the NHANESIII survey. Int J Adult Orthodont Orthognat Surg(1998) 13: 97–106.

3. Bowden GHW, Edwardsson S. Oral ecology anddental caries. In: Thylstrup A, Fejerskov O, eds,Textbook of Clinical Cariology. Copenhagen:Munksgaard (1994) 45–70.

4. Gustafsson BE, Quensel CE, Lanke LKS. TheVipeholm dental caries study. The effect ofdifferent levels of carbohydrate intake on cariesactivity in 436 individuals observed for five years(Sweden). Acta Odontol Scand (1954) 11: 232–64.

5. World Health Organization. Oral Health Surveys:Basic Methods, 4th edn. Geneva: WHO (1997).

6. Zambon JJ. Periodontal diseases: microbial fac-tors. Ann Periodontol (1996) 1: 879–925.

7. Ainamo J, Bay I. Problems and proposals forrecording gingivitis and plaque. Int Dental J(1975) 25: 229–35.

8. Silness J, Loe H. Periodontal disease in preg-nancy. II. Correlation between oral hygiene andperiodontal condition. Acta Odontol Scand (1964)22: 112–35.

9. Turesky S, Gilmore ND, Glickman I. Reducedplaque formation by the chloromethyl analogue ofVictamine C. J Periodontol (1970) 41: 41–3.

10. Loe H, Silness J. Periodontal disease in preg-nancy. I. Prevalence and severity. Acta OdontolScand (1963) 21: 533–51.

11. Slade GD, ed. Measuring Oral Health and Qualityof Life. Chapel Hill: University of North Carolina(1997).

12. Dolan TA. The sensitivity of the geriatric oralhealth assessment index to dental care. J DentalEduc (1997) 61: 37–46.

13. Allen PF, McMillan AS, Locker D. An assess-ment of sensitivity to change of the Oral HealthImpact Profile in a clinical trial. Commun DentOral Epidemiol (2001) 29: 175–82.

14. Pocock SJ. Clinical Trials: A Practical Approach.Chichester, UK: John Wiley & Sons (1983).

15. Dennison DK. Components of a randomized clin-ical trial. J Periodont Res (1997) 32: 430–8.

16. Koch GG, Paquette DW. Design principles andstatistical considerations in periodontal clinicaltrials. Ann Periodontol (1997) 2: 42–63.

17. Friedman LM, Furberg CD, DeMets DL. Fun-damentals of Clinical Trials, 3rd edn. Mosby:Springer (1998).

18. Lau G. Ethics. In: Kalberg J, Tsang K, eds, Intro-duction to Clinical Trials. Hong Kong: The Uni-versity of Hong Kong (1998) 73–112.

19. Burns DR, Elswick RK Jr. Equivalence testingwith dental clinical trials. J Dental Res (2001) 80:1513–17.

20. McDaniel TF, Miller DL, Jones RM, Davis MS,Russell CM. Effects of toothbrush design andbrushing proficiency on plaque removal. CompendContin Educ Dent (1997) 18: 572–7.

21. Morphis TL, Toumba KJ, Lygidakis NA. Fluoridepit and fissure sealants: a review. Int J PaediatDent (2000) 10: 90–8.

22. Hujoel PP, Loesche WJ. Efficiency of split-mouthdesigns. J Clin Periodontol (1990) 17: 722–8.

23. Hujoel PP, DeRouen TA. Validity issues in split-mouth trials. J Clin Periodontol (1992) 19: 625–7.

24. Dorfer CE, Berbig B, von Bethlenfalvy ER,Staehle HJ, Pioch T. A clinical study to comparethe efficacy of 2 electric toothbrushes in plaqueremoval. J Clin Periodontol (2001) 28: 987–94.

25. Corbet EF, Tam JO, Zee KY, Wong MC, Lo EC,Mombelli AW, Lang NP. Therapeutic effects ofsupervised chlorhexidine mouthrinses on untreatedgingivitis. Oral Dis (1997) 3: 9–18.

26. Donly KJ, Segura A, Kanellis M, Erickson RL.Clinical performance and caries inhibition ofresin-modified glass ionomer cement and amalgamrestorations. J Am Dental Assoc (1999) 130:1459–66.

27. Stephen KW, Macpherson LM, Gilmour WH,Stuart RA, Merrett MC. A blind caries andfluorosis prevalence study of school-children innaturally fluoridated and nonfluoridated townshipsof Morayshire, Scotland. Commun Dent OralEpidemiol (2002) 30: 70–9.

28. Lynch E, Baysan A, Ellwood R, Davies R, Peters-son L, Borsboom P. Effectiveness of two fluoridedentifrices to arrest root carious lesions. Am J Dent(2000) 13: 218–20.

29. Machiulskiene V, Nyvad B, Baelum V. Cariespreventive effect of sugar-substituted chewinggum. Commun Dent Oral Epidemiol (2001) 29:278–88.

30. Poulsen S, Beiruti N, Sadat N. A comparison ofretention and the effect on caries of fissuresealing with a glass-ionomer and a resin-basedsealant. Commun Dent Oral Epidemiol (2001) 29:298–301.

31. Biesbrock AR, Gerlach RW, Bollmer BW, FallerRV, Jacobs SA, Bartizek RD. Relative anti-cariesefficacy of 1100, 1700, 2200, and 2800 ppmfluoride ion in a sodium fluoride dentifrice over1 year. Commun Dent Oral Epidemiol (2001) 29:382–9.

32. Lo EC, Luo Y, Fan MW, Wei SH. Clinical inves-tigation of two glass-ionomer restoratives usedwith the atraumatic restorative treatment approachin China: two-years results. Caries Res (2001) 35:458–63.

Page 209: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DENTISTRY AND MAXILLO-FACIAL 207

33. Meijer HJ, Raghoebar GM, Van’t Hof MA, Geert-man ME, Van Oort RP. Implant-retained mandibu-lar overdentures compared with complete dentures;a 5-years’ follow-up study of clinical aspects andpatient satisfaction. Clin Oral Implant Res (1999)10: 238–44.

34. Roccuzzo M, Bunino M, Prioglio F, Bianchi SD.Early loading of sandblasted and acid-etched(SLA) implants: a prospective split-mouth com-parative study. Clin Oral Implant Res (2001) 12:572–8.

35. de Albuquerque RF Jr, Lund JP, Tang L, LariveeJ, de Grandmont P, Gauthier G, Feine JS. Within-subject comparison of maxillary long-bar implant-retained prostheses with and without palatalcoverage: patient-based outcomes. Clin OralImplant Res (2000) 11: 555–65.

36. Slade GD, Spencer AJ. Development and evalua-tion of the Oral Health Impact Profile. CommunDental Health (1994) 11: 3–11.

37. Duke SP, Garrett S. Equivalence in periodontaltrials: a description for the clinician. J Periodontol(1998) 69: 650–4.

38. Gunsolley JC, Elswick RK, Davenport JM. Equiv-alence and superiority testing in regeneration clin-ical trials. J Periodontol (1998) 69: 521–7.

39. Johnston LE Jr. Let them eat cake: the strugglebetween form and substance in orthodontic clin-ical investigation. Clin Orthodont Res (1998) 1:88–93.

40. O’Brien K. Editorial: Is evidence-based orthodon-tics a pipedream? J Orthodont (2001) 28: 313.

41. Sackett DL, Rosenberg WM, Gary JA, HaynesRB, Richardson WS. Evidence based medicine:what it is and what it isn’t. Br Med J (1996) 312:71–2.

42. Sutherland SE. The building blocks of evidence-based dentistry. J Can Dental Assoc (2000) 66:241–4.

43. Sutherland SE. Evidence-based dentistry: Part I.Getting started. J Can Dental Assoc (2001) 67:204–6.

44. Sutherland SE. Evidence-based dentistry: Part II.Searching for answers to clinical questions: howto use MEDLINE. J Can Dental Assoc (2001) 67:277–80.

45. Sutherland SE. Evidence-based dentistry: Part IV.Research design and levels of evidence. J CanDental Assoc (2001) 67: 375–8.

46. Sutherland SE. Evidence-based dentistry: Part V.Critical appraisal of the dental literature: papersabout therapy. J Can Dental Assoc (2001) 67:442–5.

47. Sutherland SE. Evidence-based dentistry: Part VI.Critical appraisal of the dental literature: papersabout diagnosis, etiology and prognosis. J CanDental Assoc (2001) 67: 582–5.

48. Sutherland SE, Walker S. Evidence-based den-tistry: Part III. Searching for answers to clinicalquestions: finding evidence on the Internet. J CanDental Assoc (2001) 67: 320–3.

49. Crochane AL. Effectiveness and Efficiency. Ran-dom Reflections on Health Services. London:Nuffield Provincial Hospital Trust (1972).

50. Macfarlane TV, Worthington HV. Some aspectsof data analysis in dentistry. Commun DentalHealth (1999) 16: 216–19.

51. Gilthorpe MS, Maddick IH, Petrie A. Introductionto multilevel modelling in dental research. Com-mun Dental Health (2000) 17: 222–6.

52. Altman DG, Bland JM. Units of analysis. Br MedJ (1997) 314: 1874.

53. Axtelius B, Soderfeldt B, Attstrom R. A multi-level analysis of factors affecting pocket probingdepth in patients responding differently to peri-odontal treatment. J Clin Periodontol (1999) 26:67–76.

54. Gilthorpe MS, Griffiths GS, Maddick IH, ZamzuriAT. The application of multilevel modellingto periodontal research. Commun Dental Health(2000) 17: 227–35.

55. Goldstein H. Multilevel Statistical Models, 2ndedn. London: Edward Arnold (1995).

56. Bryk AS, Raudenbush S. Hierarchical LinearModels: Applications and Data Analysis Methods.Newbury Park: Sage Publications (1992).

57. Hannigan A, O’Mullane DM, Barry D, Schafer F,Robers AJ. A re-analysis of a caries clinical trialby survival analysis. J Dental Res (2001) 80:427–31.

58. Albandar JM, Goldstein H. Multi-level statisticalmodels in studies of periodontal diseases. JPeriodontol (1992) 63: 690–5.

59. Gilthorpe MS, Cunningham SJ. The application ofmultilevel, multivariate modelling to orthodonticresearch data. Commun Dental Health (2000) 17:236–42.

60. Scheinin A, Makinen KK. Turku sugar studies.An overview. Acta Odontol Scand (1976) 34:405–8.

61. Rugg-Gunn AJ. Nutrition, diet and dental publichealth. Commun Dental Health (1993) 10: 47–56.

62. Sheiham A. Dietary effects on dental diseases.Public Health Nutr (2001) 4: 569–91.

63. Duggal MS, van Loveren C. Dental considera-tions for dietary counselling. Int Dental J (2001)51: 408–12.

64. Linke HA. Sugar alcohols and dental health.World Rev Nutr Diet (1986) 47: 134–62.

65. Tresaure ET, Chestnutt IG, Whiting P, McDon-agh M, Kleijnen J. The York review–a systematicreview of public water fluoridation. Br Dental J(2002) 192: 495–7.

Page 210: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

208 TEXTBOOK OF CLINICAL TRIALS

66. McDonagh MS, Whiting PF, Wilson PM, SuttonAJ, Chestnutt I, Cooper J, Misso K, Bradley M,Treasure E, Kleijnen J. Systematic review ofwater fluoridation. Br Med J (2000) 321: 855–9.

67. O’Mullane DM. Systemic fluorides. Adv DentalRes (1994) 8: 181–4.

68. Marino R. Should we use milk fluoridation? Areview. Bull Pan Am Health Org (1995) 29:287–98.

69. Stephen KW. Fluoride prospects for the newmillennium – community and individual patientaspects. Acta Odontol Scand (1999) 57: 352–5.

70. Bergmann KE, Bergmann RL. Salt fluoridationand general health. Adv Dental Res (1995) 9:138–43.

71. Riordan PJ. Fluoride supplements for young chil-dren: an analysis of the literature focusing on ben-efits and risks. Commun Dental Oral Epidemiol(1999) 27: 72–83.

72. Burt BA. The case for eliminating the use ofdietary fluoride supplements for young children.J Public Health Dent (1999) 59: 269–74.

73. Clarkson JJ, McLoughlin J. Role of fluoride inoral health promotion. Int Dental J (2000) 50:119–28.

74. Stephen KW. Dentifrices: recent clinical findingsand implications for use. Int Dental J (1993) 43:549–53.

75. Ripa LW. Clinical studies of high-potency flu-oride dentifrices: a review. J Am Dental Assoc(1989) 118: 85–91.

76. Warren JJ, Levy SM. A review of fluoride den-tifrice related to dental fluorosis. Paediat Dent(1999) 21: 265–71.

77. Holloway PJ, Worthington HV. Sodium fluorideor sodium monofluorophosphate? A critical viewof a meta-analysis on their relative effectivenessin dentifrices. Am J Dent (1993) 6: S55–8.

78. Newbrun E. Topical fluorides in caries preventionand management: a North American perspective.J Dental Educ (2001) 65: 1078–83.

79. Bader JD, Shugars DA, Bonito AJ. A systematicreview of selected caries prevention and man-agement methods. Commun Dent Oral Epidemiol(2001) 29: 399–411.

80. Deery C. The economic evaluation of pit andfissure sealants. Int J Paediat Dent (1999) 9:235–41.

81. Smallridge J. UK National Clinical Guidelines inPaediatric Dentistry. Management of the stainedfissure in the first permanent molar. Int J PaediatDent (2000) 10: 79–83.

82. Hickel R, Manhart J, Gracia-Godoy F. Clinicalresults and new developments of direct posteriorrestorations. Am J Dent (2000) 13: 41D–54D.

83. Wilson NH, Burke FJ, Mjor IA. Reasons forplacement and replacement of restorations ofdirect restorative materials by a selected group ofpractitioners in the United Kingdom. Quintess Int(1997) 28: 245–8.

84. Yip HK, Smales RJ. Glass ionomer cements usedas fissure sealants with the atraumatic restorativetreatment (ART) approach: review of literature. IntDent J (2002) 52: 67–70.

85. Botelho M. Resin-bonded prostheses: the currentstate of development. Quintess Int (1999) 30:525–34.

86. Loe H. Oral hygiene in the prevention of cariesand periodontal disease. Int Dental J (2000) 50:129–39.

87. Westfelt E. Rationale of mechanical plaque con-trol. J Clin Periodontol (1996) 23: 263–7.

88. Iacono VJ, Aldrege WA, Luckus H, Schwartz-stein S. Modern supragingival plaque control. IntDental J (1998) 48: 290–7.

89. Jones CG. Chlorhexidine: is it still the goldstandard? Periodontology (2000) 15: 55–62.

90. Sheen S, Pontefract H, Moran J. The benefits oftoothpaste – real or imagined? The effectivenessof toothpaste in the control of plaque, gingivitis,periodontitis, calculus and oral malodour. DentalUpdate (2001) 28: 144–7.

91. Sheiham A. Is the chemical prevention of gin-givitis necessary to prevent severe periodontitis?Periodontology 2000 (1997) 15: 15–24.

92. Greenstein G. Nonsurgical periodontal therapy in2000: a literature review. J Am Dental Assoc(2000) 131: 1580–92.

93. Kluemper GT, Beeman CS, Hicks EP. Early ortho-dontic treatment: what are the imperatives? J AmDental Assoc (2000) 131: 613–20.

94. Harrison JE, Ashby D. Orthodontic treatment forposterior crossbites. Cochrane Database SystemRev (2001) 1: CD000979.

95. Berg R. Orthodontic treatment – yes or no? Adifficult decision in some cases. J OrofacialOrthopaed (2001) 62: 410–21.

Page 211: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 212: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

14

Dermatology

LUIGI NALDI1 AND COSETTA MINELLI2

1Department of Dermatology, Ospedali Riuniti di Bergamo, Bergamo, Italy2Unit of Clinical Epidemiology, Pharmacology Research Institute, M.Negri-Villa Camozzi, Ranica, Italy

WHAT IS DERMATOLOGY?

Dermatology deals with disorders affecting theskin and associated specialised structures such ashair and nails. The skin is a biological barrierbetween ourselves and the outside world con-sisting of a stratified epithelium, an underlyingconnective tissue, i.e., dermis, and a fatty layerusually designed as ‘subcutaneous’. The skin isnot a simple inert covering of the body buta sensitive dynamic boundary. It offers protec-tion against infections, ultraviolet radiation andtrauma. It is essential for controlling water andheat loss and contributes to the synthesis of sub-stances such as vitamin D. The skin is also animportant organ of social and sexual contact.Body perception, which is deeply rooted withinthe culture of any given social group, is largelyaffected by the appearance of the skin and itsassociated structures.

Extensive disorders affecting the skin may dis-rupt its homeostatic functions resulting in a prop-erly speaking ‘skin failure’, needing intensivecare with hydration, maintenance of caloric bal-ance and temperature. However, this is a rare

event occurring with conditions such as exten-sive bullous disorders or exfoliative dermatitis.The most usual health consequence of skin dis-orders is connected with the discomfort of symp-toms, such as itching and burning or pain, whichfrequently accompany skin lesions and interferewith everyday life and sleeping. Moreover, vis-ible lesions may result in a loss of confidenceand disrupt social relations. Feelings of stigmati-sation and major changes in lifestyle caused by achronic skin disorder such as psoriasis have beenrepeatedly documented in population surveys.1

Additional problems may arise under diverse cir-cumstances: the exudation or loss of substancesthat interfere locally with the barrier function(and dressing); the shedding of scales wheneverexcessive desquamation occurs; the need to pre-vent contact dissemination in the case of trans-missible diseases.

A LARGE NUMBER OF DIFFERENT SKINDISEASES

Unlike most other organs that usually countaround 50 to 100 diseases, the skin has a

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 213: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

212 TEXTBOOK OF CLINICAL TRIALS

complement of 1000 to 2000 conditions and over3000 dermatological categories can be found inthe International Classification for Disease ver-sion 9 (ICD-9). This is partly justified by the skinbeing a large and visible organ. Beside disordersprimarily affecting the skin, there are cutaneousmanifestations with most of the major systemicdiseases (e.g., vascular and connective tissue dis-eases). The classification of skin disorders is farfrom satisfactory (Table 14.1). Currently, there isa widespread use of symptom-based or purelydescriptive terms, such as parapsoriasis or pytiri-asis rosea, which reflects our limited understand-ing of the causes and pathogenetic mechanismsof a large number of skin disorders.

Skin diseases as a whole are very commonin the general population. A limited number ofprevalence surveys have documented that skindisorders may affect 20–30% of the general popu-lation at any one time. The most common diseasesare also the most trivial ones. They include suchconditions as mild eczematous lesions, mild tomoderate acne, benign tumours and angiomatouslesions. More severe skin disorders, which may

have an impact in terms of physical disability oreven mortality, are rare or very rare. They include,among others, autoimmune bullous diseases, suchas pemphigus, severe pustular and erythrodermicpsoriasis, generalised eczematous reactions, andsuch malignant tumours as malignant melanomaand lymphoma. The disease frequency may showvariations according to age, sex and geographicarea. Eczema is common at any age while acneis decidedly more frequent among male adoles-cents. Skin tumours are particularly frequent inaged white populations. Infestations and infec-tions such as scabies, pyoderma and dermato-phytosis predominate in developing countries andsome urban pockets of developed countries. Inmany cases, skin diseases are minor health prob-lems, which may be trivialised in comparison withother more serious medical conditions. However,as mentioned above, skin manifestations are vis-ible and may cause more distress to the publicthan more serious medical problems. The issue iscomplicated by the fact that many skin disordersare not present in the population as a yes or nophenomenon, but as a spectrum of severity. The

Table 14.1. Operational classification of skin diseases

Anatomicaldistribution Morphology Pathology

Pathogeneticprocess Aetiology

Genodermatoses X XNevi and other development defects X XMechanical and thermal injuries XPhotodermatoses X XEczemas X X X XLichenoid disorders X XDisorders of keratinisation X XPsoriasis X XInfections and infestations X XDisorders of skin colour XBullous eruptions X X XDisorders of sebaceous glands X X XDisorders of sweat glands X XImmune-related diseases X XUrticaria X X XVascular disorders X X XDisorders of hair X X XDisorders of nails X X XDisorders of subcutaneous fat X X XTumours X X X

Page 214: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 213

public’s perception of what constitutes a ‘disease’requiring medical advice may vary according tocultural issues, the social context, resources andtime. Minor changes in health policy may have alarge health and financial impact simply becausea large number of people may be concerned. Forexample, most of the campaigns conducted toincrease the public awareness of skin cancer haveled to a large increase in the number of benignskin conditions such as benign melanocytic nevibeing evaluated and excised.

Large variations can be documented amongdifferent countries in terms of health serviceorganisation for treating skin disorders. A roughindicator of these variations is the density ofdermatologists ranging, in Europe, from about1:20 000 in Italy and France to 1:150 000 in theUnited Kingdom.

In general, only a fraction of those withskin diseases are expected to seek medical helpwhile an estimated large proportion opt forself-medication. Pharmacists occupy a key rolein advising the public on the use of over-the-counter products. Primary care physiciansseem to treat the majority of people amongthose seeking medical advice. Primary care ofdermatological problems seems to be impreciselydefined with a large overlap with specialistactivity. Most of the dermatologist’s workloadaround the world is concentrated in the outpatientdepartment. In spite of the vast number ofdermatological diseases, it has been documentedthat just a few categories account for about 70%of all dermatological consultations. Brief, moredetailed descriptions of the most frequent skincategories are given below while skin cancer isdealt with in another section.

Generally speaking, dermatology requires alow technology clinical practice. Clinical exper-tise is mainly dependent on the ability to recog-nise a skin disorder quickly and reliably which,in turn, depends to a large extent on the aware-ness of a given clinical pattern, based on pre-vious experience and on the exercised eye of avisually literate physician.2 Complementary diag-nostic procedures include skin biopsy, patch test-ing and immunopathology.

A peculiar aspect of dermatology is the pos-sible option for topical treatment. This treatmentmodality is ideally suited to localised lesions, themain advantage being the restriction of the effectto the site of application and the limitation ofsystemic side effects. A topical agent is usuallydescribed as a vehicle and an active substance,the vehicles being classified as powder, grease,liquid or combinations such as pastes and creams.

Much traditional topical therapy in dermatol-ogy has been developed empirically with so-called magistral formulations. Most of theseproducts seem to rely on physical rather thanchemical properties for their effects and it maybe an arbitrary decision to appoint one spe-cific ingredient as the ‘active’ one. Physicaleffects of topical agents may include detersion,hydration and removal of keratotic scales. Theborder between pharmacological and cosmetolog-ical effects may be imprecisely defined and theterm ‘cosmeceuticals’ is sometimes employed.3

It should be noted that the evaluation of even themost recent cosmetic products is far from beingsatisfactory. In addition to pharmacological treat-ment, a number of non-pharmacological treat-ment modalities exists including phototherapyor photochemotherapy and minor surgical proce-dures such as electrodessication and criotherapy.Large variations in treatment modalites for thesame condition have been documented, whichmainly reflect local traditions and preferences.4,5

ACNE

The term acne refers to a group of disorderscharacterised by abnormalities of the sebaceousglands. Acne vulgaris is the most commoncondition and is characterised by polymor-phous lesions, including comedones (black-heads), inflammatory lesions such as papules orpustules, and scars, affecting the face and less fre-quently the back and shoulders. A combination offactors are considered as pathogenetic, includingthe hormonal influence of androgens, seborrhea,abnormalities in the bacterial flora with over-growth of Propionibacterium Acnei, and pluggingof pilosebaceous openings. Mild degrees of acne

Page 215: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

214 TEXTBOOK OF CLINICAL TRIALS

are extremely common amongst teenagers (morethan 80%) and decrease in later life. The preva-lence of moderate to severe acne has been esti-mated at about 14% in 15–24 year-olds, 3% in25–34 year-olds and about 1% in 35–54 year-olds. It is likely that the vast majority of suf-ferers of mild acne do not seek medical advice.Around 70% of those affected with acne experi-ence shame and embarrassment because of theiracne. Criteria for treatment include clinical sever-ity, as judged by the extension and presence ofinflammatory lesions, and the degree of psycho-logical distress from the disease. The aim oftreatment is to prevent scarring, limit diseaseduration and reduce psychological stress. Mildacne is usually treated by topical modalities suchas benzoyl peroxide or tretinoin, while moderateseverity acne is treated by systemic antibiotics orantiandrogens in women. Oral isotretinoin is usedunder specialist supervision for severe unrespon-sive disease. There are a number of publishedsystems for measuring the severity of acne.6

These vary from sophisticated systems with upto 100 potential grades to simple systems with 4or 5 grades. A specially designed acne disabilityindex has also been devised to assess the psycho-logical impact of the disease and disability, andhas been found to correlate well with severity asmeasured by an objective grading system, evenif a small group experiences disability which isout of proportion with their severity.7

ATOPIC DERMATITIS

Typically, this condition is characterised by itch-ing, dry skin and inflammatory lesions especiallyinvolving skin creases. Patients suffering fromatopic dermatitis may also develop IgE-mediatedallergic diseases such as bronchial asthma orallergic rhinitis. An overall cumulative preva-lence of between 5% and 20% has been suggestedby the age of 11. Around 60–70% of children areclear of significant disease by their mid-teens.Even if genetic factors seem to play a majorrole, environmental factors such as allergens andirritants are important and there is reasonable evi-dence to suggest that the prevalence has increased

two to threefold over the last 30 years. There issome evidence that it may be possible to preventatopic dermatitis in high-risk children born to par-ents with atopic disease by restricting maternalallergens and reducing house dust mite levels.8,9

No treatment has been shown to alter the naturalhistory of established eczema and the mainstayof treatment is emollients, which moisturise theskin, and topical steroids.

PSORIASIS

This is a chronic inflammatory disorder charac-terised by red scaly areas, which tend to affectextensory surfaces of the body and scalp. Itsoverall prevalence is about 1–3% and males areaffected more frequently than females. Severalvarieties have been described including guttate,pustular and erythrodermic psoriasis. In about3% of cases it may associate with a peculiararthritis. Significant disability has been docu-mented with psoriasis. Multifactorial heredity isusually considered for disease causation. Thisimplies interaction between a genetic predispo-sition and environmental factors. Heritability, ameasure that quantifies the overall role of geneticfactors, ranges from 0.5 to 0.9. Acute infections,physical trauma, selected medications and psy-chological stress are usually viewed as triggers.Pustular varieties of psoriasis have been stronglylinked with smoking. Sun exposure usually tem-porarily improves the disease. Altered kinetics ofepidermal cells has been repeatedly documented.The lesions are visible and may itch, sting andbleed easily. The aim of treatment is to achieveshort-term suppression of symptoms and long-term modulation of disease severity, improvingthe quality of life with minimal side effects.Topical agents such as vitamin D derivatives,dithranol and steroids can be used for short-termcontrol of the disease. Ultraviolet B phototherapy,psoralen plus ultraviolet A phototherapy (PUVA)and systemic agents such as methotrexate orcyclosporine are employed to control extensivelesions that fail to respond to topical agents.Relapse is common upon withdrawal. Outcomes

Page 216: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 215

that matter to the patient include disease suppres-sion and duration of remission, patient satisfac-tion and autonomy and disease-related quality oflife. A number of clinical activity scores havebeen developed, the most popular being the Pso-riasis Area and Severity Index–PASI.10 There isno documented evidence that such indexes are areliable proxy for the above-mentioned outcomes.In the long term, a simple measure such as thenumber of patients reaching complete or nearlycomplete stable remission appears as the mostrelevant outcome variable.

LEG ULCERS

Venous and arterial leg ulcers are recognisedas the most common chronic wounds in West-ern populations. A skin ulcer has been definedas a loss of dermis and epidermis produced bysloughing of necrotic tissue. Ulcers persisting for4 weeks or more have been rather arbitrarilyclassified as chronic ulcers. Based on popula-tion surveys, the point prevalence of leg ulcersranges from 0.1% to 1.0% and increases withage. Venous ulcers are the end result of super-ficial or deep venous insufficiency and a venousorigin is diagnosed in about 80% of cases. Arte-rial ulceration may be regarded as a multistepprocess, starting, in general, with a systemic vas-cular derangement such as atherosclerosis. Theprognosis of leg ulcers is less than satisfactory,with about one-quarter of subjects not healing inover 2 years and the majority of patients hav-ing recurrence. In a large-scale clinical study,the healing time varied according to the dimen-sion of the ulcers, their duration and the mobil-ity of the patient. The quality of life of ulcerpatients may be severely affected. Social iso-lation, depression and negative self-image havebeen associated with ulcers in a high percent-age of patients.11 A number of studies point tothe less than satisfactory management of ulcerpatients in the community, including the lack ofany clinical assessment leading to long periods ofineffective or inappropriate treatment and delaysin instituting effective pain-relieving strategies.Ulcer clinics in vascular surgical services in

the UK proved to offer advantages over hometreatment.12 The overwhelming rates of recur-rence clearly suggest that more attention shouldbe paid to prevention.

SKIN DISORDERS AND CLINICAL TRIALMETHODS: ADAPTING STUDY DESIGN TO

SETTING AND DISEASE

As for other disciplines, the last few decadeshave seen an impressive increase in the numberof clinical trials carried out in dermatology.However, there are indications that the upsurgeof clinical research has not been paralleled by arefinement in clinical trial methodology and thequality of randomised control trials (RCTs) indermatology falls well below the usually acceptedstandards.13 – 15 In this section we would liketo mention some issues which deserve specialattention when designing a randomised clinicaltrial in this speciality area. There is a needfor innovative thinking in dermatology to makeclinical research address the important issues andnot simply ape the scientific design.

RANDOMISATION

It can be estimated that there are at leasta thousand rare or very rare skin conditionswhere no single randomised trial has beenconducted. These conditions are also those whichcarry a higher burden in terms of physicaldisability and mortality. The annual incidencerate of many of them is lower than 1 caseper 100 000 and frequently less than 1 caseper 1 000 000. International collaboration andinstitutional support is clearly needed. There areno examples of such an effort.

For quite different reasons, there are also com-mon skin conditions where randomised clinicaltrials have been rarely performed. These con-ditions include several varieties of eczematousdermatitis (e.g., nummular eczema), psoriasis(e.g., guttate psoriasis) and urticaria (e.g., pres-sure urticaria), a number of exanthematic reac-tions (e.g., pytiriasis rosea), rosacea, and common

Page 217: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

216 TEXTBOOK OF CLINICAL TRIALS

infections such as warts and molluscum con-tagious. One alleged difficulty with mountingrandomised clinical trials in dermatology is thevisibility of skin lesions and the considerationthat much more than in other areas, patientsself-monitor their disease and may have precon-ceptions and preferences about specific treatmentmodalities.16 The decision to treat is usually dic-tated by subjective issues and personal feelings.As we will consider below, there is a need to edu-cate physicians and the public about the value ofrandomised trials to assess interventions in der-matology. The need to evaluate the attitudes ofpatients and to educate should be clearly con-sidered when planning a study and developingmodalities to obtain an informed consent fromthe patient.

Randomised clinical trials are usually designedin dermatology with an expected large effect fromthe test treatment and most trials do not recruitmore than a few dozen patients. In small tri-als there may be substantial differences in groupsizes that will reduce the precision of the esti-mated differences in treatment effect and hencethe efficiency of the study. As a consequence,block randomisation may be preferable. On theother hand, a substantial imbalance may persist inprognostic characteristics, and minimisation canbe used to make small groups more similar withrespect to major prognostic variables.17 There issome evidence that the group sizes of clinicaltrials, apparently based on simple randomisationand published in a number of leading dermatolog-ical journals, tend to be much more similar thanexpected by chance (unpublished data from theEuropean Dermatoepidemiology Network psori-asis project). The cluster around equal samplesizes may be due to publication bias, failure toreport blocking, or even to the rectification of anunsatisfactory imbalance by adding extra patientsto one treatment.

In many instances, the management of achronic skin disorder is a multiple step pro-cess where different phases can be identified.For example, at least two phases are usuallyconsidered when treating psoriasis: a clearancephase, which involves a more intensive treatment

approach with the aim of clearing existing pso-riasis lesions, and a maintenance phase, with themain aim of preventing disease relapse. The dif-ferent phases are not necessarily well separatedin time. Long-term disease-modifying strategiescan be adopted at the same time when a treat-ment modality for reaching clearance has beenstarted. An example is the treatment of atopicdermatitis by topical steroids and diet. Most ran-domised clinical trials in dermatology use a sim-plified approach to evaluating treatment effectsand most of them analyse the effect of a singlemanoeuvre over a limited time span. One as yetnot fully explored issue is the potential for com-bining different treatment approaches in a simul-taneous or subsequent order. There are examplesof combinations of such treatment modalities ascalcipotriol and ultraviolet B radiation in psoria-sis treatment, but other rational combinations arenot fully explored. A way of addressing the issueis by relying on factorial design. An exampleof such a design would be a randomised clin-ical trial of the effect of a low-allergen dietcompared with an unrestricted diet in atopicwomen during pregnancy and breast-feeding onthe subsequent development of atopic disordersin children where women are randomised toall the possible combinations of restricted andunrestricted dietetic measures during the peri-ods examined.

BLINDING

There are several reasons for considering blind-ing as a major bias-reducing procedure in ran-domised clinical trials of skin disorders. Firstly,it is expected that physicians and patients aresubject to strong, though difficult to document,hopes and prejudices about the optimal care ofskin disorders. This is reflected, for example,in the large variations of treatment proceduresfor the same condition which have been repeat-edly documented in different areas of derma-tology. Secondly, most outcome measures aresoft end points involving subjective judgement,which may be influenced to a significant extentby the previous knowledge of the treatment

Page 218: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 217

adopted. Thirdly, the visibility of lesions mayinfluence the decision to rely or not on a giventreatment to a larger extent as compared withsituations where disease variables are not so obvi-ous. On the other hand, there may be prob-lems with blinding which may be difficult orimpossible to solve, like with trials comparingcomplex procedures such as ultraviolet light radi-ation and drug regimens. An issue which war-rants more attention than it is often given inrandomised trials is the possibility that certain‘marker variables’ occur, together with obviousside effects. These variables, observable duringtreatment, may in part unblind the trial, even ata subliminal level.18 This is an issue with theuse of topical retinoids and the associated mildcutaneous irritation, which may be noticed butnot reported at all as a ‘side effect’. It is quitecommon to find randomised clinical trials wherethe authors claim blindness in situations wheretreatment modalities are responsible for frequentand obvious side effects. In 1982, for example,a trial was published examining three differenttherapeutic strategies for psoriasis: oral etretinateassociated with topically applied betamethasone,oral etretinate associated with topically appliedplacebo, and oral placebo associated with topi-cal betamethasone.19 Systemic retinoids such asetretinate are responsible for common side effectswhich are reminiscent of vitamin A overdosage,including dryness of the skin and mucous mem-branes, while topical steroids commonly producea transitory blanching effect. It is difficult toaccept blindness in the trial when there is no addi-tional information on how blinding was actuallyassured. It is worth mentioning that the dropoutrates showed large variations among the differ-ent trial arms because of alleged side effectsof treatment.

One way to overcome the problem of anunachievable blindness and avoid the influenceof the researcher’s subjective judgement is toplan the study so that the clinician who treats thepatient is not the same one who judges the effectof the therapy. This way the second clinicianscan be blind to the treatment assigned to thepatient.

STANDARDISATION OF TREATMENTMODALITIES AND ACCESSORY CARE

Independently of the ‘active’ intervention admin-istered, accessory non-pharmacological treatmentand skin care seem to play a significant rolein the outcome of most skin disorders. It iscommon sense that emollients may improve dryskin and wet soaks may help to dry exudat-ing lesions. As a consequence, accessory carerequires careful standardisation. However, whileit is relatively easy to ensure that the pharma-cological treatment is conducted in an appro-priate way (particularly timing and administra-tion route), non-pharmacological accessory careis prone to a larger variability that is affectedby social and cultural factors among others. Toa greater extent, variability may affect topicaltreatment as compared with systemic treatment.Topical treatment is usually more cumbersomein comparison with systemic treatment and maywell depend on the physician’s and patient’s con-sistency. As documented in randomised clinicaltrials of the retinoid derivative tazarotene in pso-riasis, the modalities of application may play asignificant role in tolerability and side effects.20

Once again a well-informed patient as well as anactive and supportive clinician are vitally impor-tant. The issue of standardisation is also impor-tant for assessing compliance when the treatmentis self-applied by the patient. If indeed there arelimitations with such methods as tablet count forassessing compliance with systemic agents, thelimitations are even greater when similar methodsare used to monitor the consumption of topicalagents in the absence of strict rules to define a‘single dose’. The amount consumed cannot bemonitored if patients are not carefully instructedon how to apply the topical agent. The observedcompliance behaviour may range from compliant,overusing, erratic using and dropping out.

DIFFERENT STUDY SETTINGS ANDDISPARATE DISEASE SEVERITY

We have already mentioned that the contents ofprimary care for many skin disorders are impre-cisely defined as opposed to specialist clinical

Page 219: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

218 TEXTBOOK OF CLINICAL TRIALS

practice, with possibly large overlapping areas.In addition, it has been noticed that there may bewide variations in terms of severity within anygiven diagnostic category, with conditions rang-ing from subclinical manifestations, e.g., psori-atic ‘markers’, to skin failure, e.g., erythrodermicor generalised pustular psoriasis. Moreover, itshould be noticed that for any given diseasethere might be clinical variants, which may havepeculiar prognostic features and responses totreatment, e.g., guttate psoriasis versus plaquepsoriasis. As a consequence, it is of the out-most importance that entry criteria in RCTs ofskin disorders are defined as precisely as possi-ble. This should include as a major requirementthe definition of the study setting, clinical variety,disease severity and duration, previous treatmentsand concomitant systemic disorders.

It should be noticed that the severity assess-ment of most skin disorders implies an under-standing of the many influences of the diseaseon the patient’s life, including disease-associateddiscomfort, level of disability and social disrup-tion. Most of these influences are better expressedas a continuum of severity rather than a yes or nophenomenon. On the other hand, there are prac-tical advantages in trying to translate the contin-uum into a limited number of workable severitycategories. The main advantage is a better com-pliance with the discrete nature of most clinicaldecisions where thresholds are usually requiredfor implementing interventions (examples of cat-egorical classifications of a severity continuumare tumour staging and arterial hypertensiondefinition). Unfortunately, for many inflamma-tory skin disorders no reliable severity criteriahave been developed. Even when such criteriaare available, there is uncertainty about sever-ity thresholds. Consequently, large variations areexpected to occur among different RCTs and, infact, have been documented on several occasions.A rather common attitude in published RCTsis the lack of entry criteria and severity defini-tions, so that the patient population appears tobe recruited in a vacuum.15,21 One habit whichshould be discouraged is to include broad diag-nostic categories that lack specificity like, for

example, the category of ‘steroid responsive der-matoses’ or ‘itching disorders’.

OUTCOME MEASURES

There are obvious analogies between the prob-lems implied in the development of severity cri-teria and those implied in outcome measures.They both consist of measures that must havethe properties of validity and reliability. In addi-tion, outcome measures must be responsive tochange, i.e., they must have the ability to iden-tify what may be small but nevertheless clini-cally important changes. On the other hand, withseverity criteria the intent is more to discriminatebetween individuals. If an instrument is usefulto discriminate between severity levels of psori-asis, it does not necessarily mean that it will beable to detect changes which are important as aresult of treatment within these categories. Thedistinction between the two aims (i.e., describingvariations between individuals and assessing vari-ations within individuals) is frequently blurredwhen developing measurement systems for skindisorders. In spite of the fact that, from a clini-cal point of view, distinguishing between diseaseseverity levels may represent a different issueas compared with assessing clinically importantchanges in individual patients, the two issues areusually dealt with by relying on identical scalesystems in dermatology.

There are indications that many score sys-tems employed in dermatology lack the basicrequirements for reliability and validity. Evena simple measure such as the approximate per-centage of area involved in a skin disease isprone to wide inter- and intra-observer varia-tions if the evaluation methods are not clearlyspecified.22 In spite of their lacking basic require-ments, a large number of different scales havebeen developed for such common disorders aspsoriasis or atopic dermatitis (Table 14.2). Oneexample is the ‘Psoriasis Area Severity Index’(PASI).10 This index is obtained by summingup the scores concerning three features of pso-riasis, namely the body district affected, theseverity of the condition (judged by the degreeof erythema, infiltration and desquamation) and

Page 220: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 219

Table 14.2. Measures used in the outcome evaluation of selected skin diseases

Disease Clinical scalesDisease-specific Quality of Life

measures

Acne6,37,38 • Lesion counting (papule, pustule andcomedone counts)

• Plewing and Kligman grading system• Cunliffe score (Leeds technique)• Cook’s photonumeric method• American Academy of Dermatology

(AAD) classification• Allen and Smith photographic method• Fluorescence photography• GAGS (Global Acne Grading System)

• ADI (Acne Disability Index)• CADI (Cardiff Acne Disability Index)• APSEA (Assessment of the

Psychological and Social Effects ofAcne)

• AQL (Acne Quality of Life) index

Atopicdermatitis38–40

• SCORAD (severity Scoring of AtopicDermatitis)

• SASSAD (Six-Area, Six-Sign AtopicDeramtitis) severity index

• ADASI (Atopic Dermatitis Area andSeverity Index)

• EASI (Eczema Area and SeverityIndex)

• Rajka and Langerland scoring system• SSS (Simple Scoring System)• BCSS (Basic Clinical Scoring System)• ADSI (Atopic Dermatitis Severity

Index)• SIS (Skin Intensity Score)• ADAM (Assessment Measure for

Atopic Dermatitis)• Nottingham Eczema Severity Score

• EDI (Eczema Disability Index)

Psoriasis22,41 • Severity scores based on individualsigns (involved Body Surface Area,erythema, induration, desquamation)

• PASI (Psoriasis Area and SeverityIndex)

• SAPASI (Self-administered PASI)• Ultrasound evaluation of the

thickness of psoriasis

• PDI (Psoriasis Disability Index)• PLSI (Psoriasis Life Stress Inventory)

Leg ulcers42,43 • Clinical skin score• Simple wound measurements• Planimetric wound area

measurements

Dermatologicaldiseases as aclass38,44,45

• DIDS (Dermatology Index ofDisease Severity)

• DLQI (Dermatology Life QualityIndex)

• CDLQI (Children’s Dermatology LifeQuality Index)

• IMPACT (Impact of Skin DiseaseScale)

• SKINDEX

Page 221: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

220 TEXTBOOK OF CLINICAL TRIALS

the extension of the disease. The last two arejudged according to the body district analysed.Although the PASI score has been widely used, itis largely unsatisfactory.22 It has never been stan-dardised and there is limited testing for inter- andintra-rater reliability. Validity is another issue. Ithas never been demonstrated that the weightsarbitrarily attributed to each item in the PASIscore actually reflect the clinical severity oflesions. PASI is only relying on the derma-tologist’s judgement of a few clinical featuresof psoriasis and there is increasing awarenessthat the patient’s judgement is equally impor-tant. An additional drawback of PASI is thatsimilar scores can be attributed to varieties ofpsoriasis which differ clinically and in terms ofresponse to treatment.23 The ‘Self-AdministeredPASI’ (SAPASI), which asks the patient to makethe same evaluation as the physician for PASI,does not escape the limitations we have pointedto for PASI as an outcome measure.24

To overcome the problems arising from subjec-tive judgement, more ‘objective’ measures havebeen repeatedly advocated, such as the use ofultrasound to evaluate the thickness of psoriasisplaques.23 In fact, any measurement is fully justi-fied only when it represents a good surrogate forclinically important outcomes, such as the patientdisability and quality of life.25

The notion of responsiveness to change expres-ses the idea that any measure used in atrial should be sensitive to ‘clinically importantchanges’ in response to therapy.26 A conceptualdifficulty arises in specifying what a clinicallyimportant change is. With most scales developedin dermatology the issue remains fraught sinceno ‘gold standard’ has gained wide acceptance.It should be considered that the ‘outcome’ ofthe treatment refers to ‘all the possible resultsthat stem from preventive or therapeutic interven-tions’ and consists of several separate dimensions(e.g., discomfort and disability), which may bebroken down into components and simple mea-surable items. Any given measure achieves itsvalue only to the extent to which it serves as aproxy for an outcome component. For example,

if the PASI index accurately quantifies disabil-ity or discomfort, then it may be of value as asurrogate outcome measure for psoriasis. Whatmay be a relevant outcome variable is a mat-ter of judgement, based on the knowledge of thedisease, the patients’ requirements and the val-ues of society. The outcome of skin disordersthat affect the quality rather than the quantity oflife is expected to be largely culture-dependent.It is our conviction that the development of a‘gold standard’ requires a deep understandingof patients’ requirements and expectations fromtreatment. In the lack of reliable scales, trials withthe simplest and most objective outcome vari-ables are preferable. Such measures as completeremission or recurrence should be preferred, pro-vided that these categories are clearly defined.Clearly, remission or recurrence are events whichoccur with a lower frequency as compared withless dramatic variations in disease activity mea-sured by clinical scales. This, in turn, affects thesample size calculation.

There are at least two different choices whenanalysing outcome measures expressed by anygiven score system. We might compare thedifference between the initial and final scores inthe treatment and control groups or, alternatively,ignore the pre-test scores and simply analyse thescores after treatment. There are two importantanalytical reasons to consider in the use ofchange scores.27 The first is that the subtractionof scores before treatment has the effect ofremoving stable individual differences betweensubjects, thereby increasing the power of thestatistical test. The second reason, which is ofrelatively minor importance in a randomisedtrial, is that there may be overall differencesbetween the two groups at the baseline, and theuse of change scores can potentially correct forthese differences. The usual presentation of scoredata over time (e.g., PASI score) is to buildup a curve based on the mean score values ofthe treatment and control groups. A commonbut inappropriate analysis of these curves isto apply two separate sample tests on meanscore values at several time points (Figure 14.1).The means may not represent a good descriptor

Page 222: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 221

0 2 4 6

Time (wk)

0

1

2

3

4

5

6

7

8

9

10

PAS

I sco

re *

*

*

*

CalcipatriolBetamethasonep < 0.001

Source: Reproduced from Kragballe et al.,46 with permission from Elsevier.

The figure shows a common but unsatisfactory modality of analysing PASI scores collected serially in an RCT. Themean score is calculated at different time points and a graph is presented with lines joining the means at the differenttime points for the experimental and control group. In the graph, ‘errors’ bars are attached at each time point and anindicator of statistical significance is placed by each time point to summarise the results of separate significance tests.The curve joining the means may not be a good descriptor of a typical curve for an individual and no account in theanalysis is taken of the fact that measurements at different time points are from the same subjects and are likely to becorrelated. The number of statistical tests performed and the choice of time points to be tested are additionalproblematic issues. Further, dividing the results into ‘significant’ and ‘not significant’ introduces an artificialdichotomy into serial data.

Figure 14.1. Problematic analysis of PASI score over time

of a typical curve for an individual and theseparate analyses of different time points does notconvey information on how individual subjectsrespond over time. Moreover, this practice canbe criticised on statistical grounds because ofmultiple potentially data-driven statistical testsand because the values over time are notindependent and one time point is likely toinfluence successive time points.28 It should benoticed that the information from each patientmight be diluted when comparing the mean, or

better the median, of indexes such as PASI indifferent treatment groups. In addition, the scoreof patients who leave the study prematurely andare lost to follow-up cannot be evaluated andthe ‘intention-to-treat’ analysis may be difficultto perform. In this respect, the use of simpleclinical variables (e.g., the number of total orpartial remissions) could be more informative.A remedy has been proposed for the analysis ofserial measurements. In the first stage, a suitablesummary of the response in each individual,

Page 223: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

222 TEXTBOOK OF CLINICAL TRIALS

such as a rate of change or an area under thecurve, is identified and calculated. Subsequently,these summary measures are analysed by simplestatistical techniques.28

In conclusion, the situation of outcome eval-uation for many skin disorders could be brieflysummarised as follows. (1) Doctor-centrism: con-siderably more attention has been given to thedermatologists’ views on treatment effects thanto those of the patients. (2) Biologic reduction-ism: most of the interest has been concentrated onskin involvement. Only recently have psycholog-ical and social factors relevant to outcome beenappraised. (3) Process, i.e., what happens alongthe way, rather than outcome: assessment ofinvolved areas by ‘objective’ quantitative indexeshas been preferred to hazarding definition of syn-thetic relevant outcomes (e.g., clearing).

PHASE I AND PHASE II STUDIES

The ordered development of treatment modal-ities according to well-identifiable phases29 isthe exception rather than the rule in dermatol-ogy. There are several reasons for this situation.Many treatments are non-pharmacological inter-ventions (e.g., ultraviolet phototherapy) which donot need to comply with the regulatory require-ments for drug development, and there are nostrict guidelines on how to assess them at anearly clinical phase. Secondly, in spite of beingso common, the resources allocated to the studyof skin disorders are limited as compared withother clinical areas. As a consequence, our under-standing of pathomechanisms is limited, as itis the development of disease-specific therapy.Until the causation of the main skin disorders isunravelled, disparate therapies with impreciselydefined biological activities will continue to beavailable and many treatments will enter the ther-apeutic arena serendipitously. It was the case ofa renal-transplant recipient with psoriasis whoseskin lesions cleared with cyclosporine that led tostudies demonstrating the efficacy of that drug.30

Similar considerations can be made for such treat-ment modalities as topical vitamin D in psoriasis

or the use of minoxidil in androgenetic alopecia.It is widely accepted that a phase I study isone that examines the initial introduction of adrug in human beings with the treatment testedeither in normal volunteers or in patients. Themain issues are the pharmacokinetics, pharmaco-dynamics and tolerability of the drug being testedwith a focus on assessing inter-patient variability.While problems with systemic drugs in dermatol-ogy do not differ from those usually encounteredin other speciality areas, some peculiarities existwith the assessment of topical drugs. Penetrationwithin the deep epidermal layers and dermis isa parameter of particular interest since it clearlyaffects the local activity of the drug itself. On theother hand, pharmacokinetic parameters describ-ing such a penetration are less stringent as com-pared with systemic drugs. The assessment can beperformed on normal or diseased skin. Relevantmethods are those which allow measurements ofthe concentration of the drug in the skin, in agiven time, after topical application, while con-centration gradients are formed. Such profiles areusually obtained by direct invasive techniques(e.g., skin biopsy) using topically applied radi-olabelled drugs. In some instances, a close corre-lation has been documented between the barrierfunction of the horny layer, its reservoir functionand the resulting penetration into the skin. Pen-etration into human skin can thus be predictedfrom drug quantification in horny layer strip-pings. This allows non-radioactive methods ofdrug dosage, like high-performance liquid chro-matography, to be applied. Indirect measurementssuch as urinary excretion or blood levels are alsoanalysed as parameters indicative of the systemicadsorption of the drug and possible toxicity. Inmany instances, it may be of interest to per-form penetration studies in the same patient withthe drug being applied on the involved versusthe uninvolved skin. Whenever the horny layerbarrier is disrupted, penetration within the dis-eased area is usually facilitated. In addition toadsorption, tolerability of a locally applied drugmay be of interest. This is usually evaluated bystudying local reactions with increasing concen-trations of the drug. All the above-mentioned

Page 224: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 223

studies are usually conducted on a few healthysubjects or voluntary patients and in an uncon-trolled way. Measurement error is a crucial issue,which needs standardisation and careful evalua-tion at the design level.

For a limited number of topical drugs phar-macodynamic parameters have been developed.An example is the blanching or vasoconstric-tion assay, which has been employed to screennew topical steroids for clinical efficacy. Thebioavailability of steroids from topical formula-tion has been rather improperly defined as therelative absorption efficiency of a drug, as deter-mined by the release of the steroid from its for-mulation. Its subsequent penetration through thestratum corneum and viable epidermis into thedermis would produce the characteristic blanch-ing effect. This effect is measured through scoresthat have a subjective component and need care-ful standardisation. There have also been someattempts to identify biologic pharmacodynamicmarkers of some chronic skin disorders like pso-riasis to be used at an early stage of drugdevelopment.31 However, these indexes are basedon cross-sectional studies and there is still lim-ited information on their modifications with dis-ease activity.

According to the Food and Drug Administra-tion (FDA) regulations, a phase II study is thefirst controlled clinical study that evaluates theeffectiveness of a drug for a given specific thera-peutic use in patients. It is also the first study toevaluate the risks of a drug’s side effects. Such astudy is typically a well-controlled, very closelymonitored trial that tests a relatively small, nar-rowly defined patient population, usually num-bering no more than a few hundred patients. Ifthe criterion is the number of patients recruited,then most randomised clinical trials in dermatol-ogy would come under this definition.

Study designs that are frequently employed at apreliminary stage in drug development are within-patient control studies, i.e., crossover and self-controlled studies or simultaneous within-patientcontrol studies. In dermatology they are alsoused, albeit improperly, at a more advanced stage.In a survey of more than 350 published RCTs

of psoriasis (unpublished data), a self-controlleddesign accounted for one-third of all the studiesexamined and was relied on at any stage indrug development. Crossover studies are studieswhere patients are randomly allocated to studyarms, where each arm consists of a sequenceof two or more treatments given consecutively.These trials allow the response of a subject toa given treatment A to be contrasted with thesame subject’s response to treatment B. There aresome inconsistencies with the definition of self-controlled studies provided by different authors.We consider as self-controlled studies thoseclinical trials where patients act simultaneouslyas their own control. A prerequisite for this kindof study is the existence of pair organs, e.g., eyes,which can be treated by a locally applied drugin the lack of any significant systemic effect.From our definition we exclude either thosestudies where a single treatment is administeredto patients and a ‘before–after’ comparison iscarried out, and the so-called ‘N-of-one’ RCTs,where different time periods are randomised in asingle patient to different treatment.

The main advantage of a within-patient studyover a parallel concurrent study is a statisticalone. A within-patient study obtains the same sta-tistical power with far fewer patients, while atthe same time reducing problems of variabil-ity between the populations confronted. Within-patient studies may be useful when studyingconditions that are uncommon or show a highdegree of patient-to-patient variability. On theother hand, within-patient studies impose restric-tions and artificial conditions, which may under-mine validity and generalisability of results andmay also raise some ethical concern. The washoutperiod of a crossover trial as well as the treatmentschemes of a self-controlled design, which entailsapplying different treatments to various parts ofthe body, do not seem to be fully justifiable froman ethical point of view. In fact, they don’t satisfythe principle of providing the patients enrolledin clinical trials with the best-proven diagnosticand therapeutic method. By necessity these stud-ies are restricted to the evaluation of short-termoutcomes. A higher degree of collaboration from

Page 225: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

224 TEXTBOOK OF CLINICAL TRIALS

half body quarter of body left-right limbs

left-right arms four limbs two patches

Source: Reproduced from Naldi et al.,32 with permission.

Figure 14.2. Design of intra-patient comparison in 26 trials concerning dithranol short-contact therapy ofpsoriasis

the patients is requested as compared with otherstudy designs. Clearly, the impractical treatmentmodalities in self-controlled studies or washoutperiod in crossover studies may be difficult forthe patient to accept. In this kind of study thenumber of dropouts may be higher when com-pared to parallel group designs. In a survey of

26 self-controlled trials on short-contact dithranolin psoriasis (Figure 14.2), which had a mediannumber of 16 patients (range 5 to 63), half ofthe trials experienced dropout.32 Dropouts mayhave more pronounced effects in a within-patientstudy as compared to other study designs becauseeach patient contributes a large proportion of

Page 226: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 225

the total information, and the design is sensi-tive to departure from the ideal plan. The sit-uation is compounded in self-controlled studieswhere the dropping out from the study may becaused by observing a difference in treatmenteffect between the parts in which the patient hasbeen ‘split up’. In this case, given that dropoutsare related to a difference in treatment effectbetween interventions, the estimate of the effectof the intervention could be incorrect and falselyequalised. There are several more problems to beconsidered. Contamination of treated areas andsystemic absorption may complicate the interpre-tation of self-controlled studies, while crossoverstudies require that the disease lasts long enoughto allow the investigator to expose patients toeach of the experimental treatments and mea-sure the response. Also the treatment must beone that does not permanently alter the diseaseor process under study. Carry-over and periodeffects may clearly compound the analyses.33

Generalisability is an issue of concern in within-patient controls. Not only entry criteria are usu-ally greatly restricted, e.g., symmetrical lesions,but also outcome measures need to be selectedamong those reflecting short-term changes in dis-ease activity. Such issues as patient satisfactionand quality of life are obviously beyond the scopeof a self-controlled design. It is surprising thatself-controlled designs have been the preferreddesign in situations like topical immunotherapyof alopecia areata or short-contact therapy ofpsoriasis where patient satisfaction and mainte-nance of effects over time (e.g., maintenance ofthe hair restoration to an acceptable extent) arevitally important.

PHASE III TRIALS

From phase III studies we request randomisedtrials that gather additional information regardingthe effectiveness and safety of a treatment, underconditions which are closer to the usual clinicalpractice as compared with phase II trials. Theyshould study those clinical outcomes that areof major interest to physicians and patients (as

opposed to those driven by surrogate end points)and last longer than phase II trials. The distinctionbetween phase II and phase III trials is blurred indermatology, where most randomised trials aresmall and, being short-term, employ surrogatemeasures in well-selected groups of patients. Afew points are worth mentioning when discussingthe design of phase III studies in the area ofdermatology.

PATIENT MOTIVATION AND PREFERENCE

It has already been mentioned that one of themain concerns of patients suffering from a skindisorder is the visibility of lesions and, muchmore than in other areas, the patient self-monitorshis/her disease. Patients’ motivations and previ-ous experience are obvious crucial points whenentering a trial. Motivations and expectations arelikely to influence clinical outcome of all treat-ments, but they may have a more crucial role insituations where ‘soft end points’ are of concernas in dermatology. Commonly, more than 20%of the patients entering randomised clinical trialsof psoriasis experience improvement on placeboindependently of the initial disease extension.Motivations are equally important in pragmatictrials where different packages of managementare evaluated, such as in the comparison of a self-administered topical product for psoriasis withhospital-based therapy like phototherapy. Tradi-tionally, motivation is seen as a characteristic ofthe patient that is assumed not to change with thenature of the intervention. However, it has beenargued that it is more realistic to view motivationin terms of the ‘fit’ between the nature of thetreatment and the patient’s wishes and percep-tions, especially with complex interventions thatrequire the patient’s active participation. We havealready mentioned that the boundary betweendisease and non-disease is particularly shady indermatology. On the other hand, the public isconfronted with a great deal of uncontrolled andsometimes misleading or unrealistic messages onhow to improve the body’s appearance. All in all,there is a need to ensure that patient informationand motivations are taken into proper considera-tion when designing and analysing clinical trials

Page 227: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

226 TEXTBOOK OF CLINICAL TRIALS

on skin disorders. The issue is not only a mat-ter of ‘informed consent’. There is a need tostudy the influences that determine patients’ pref-erences and to understand how these may affectthe outcome of clinical trials. A distinction shouldbe drawn between an informed choice based onfactual data–such as a reliable estimate of therisks and benefits of interventions–and attitudestowards treatment based on emotional aspectsand preconceptions. In recent years, a numberof design variants on the traditional randomisedtrial have been proposed to take into accountpatients’ preferences. They include the partiallyrandomised patient preference design and theso-called randomised consent or Zelen design.34

These designs have never gained wide acceptanceand none have ever been used in dermatology.The shift from a paternalistic attitude, wherebyenrollment decisions are made by doctors, to thechoice freely exercised by individual patients islikely to affect the composition of populations inclinical trials. However, when agreement to enrolis based on patients’ preferences for individualtreatments, as in the Zelen design, the groupassembled is unlikely to mirror the target popula-tion of all the eligible patients. There is a need tostudy the influences that determine patients’ pref-erences and understand how these may influencethe final outcome of a trial. In a recent survey,Dutch patients affected by psoriasis consideredthe safety issue and long-term management asmore important than fast clearing.16 It was alsoimportant to them to have a vote in the selectionof the treatment. It is worth mentioning that thelarge majority of RCTs in psoriasis are short-termstudies dealing with short-term clearance ratesthat are assessed by the treating physicians. Thereis room for testing study designs that allow fordifferent preference assessment strategies.

ENTRY CRITERIA

The definition of the study population is of par-ticular importance in dermatology where largevariations in disease severity and different clini-cal subgroups may exist–e.g., plaque versus gut-tate psoriasis. In addition, there may be problems

with variations in disease severity over time. Thisis commonly observed with chronic inflamma-tory skin diseases characterised by a relapsingcourse such as atopic dermatitis or psoriasis. insituations where a variable time-course of theclinical condition is expected, it may be advis-able to proceed with sequential evaluations usingstandardised criteria to judge the stability of thedisease over time. Quite surprisingly, informa-tion about the stability of the clinical condition isoften neglected in clinical trial reports. A reviewthat focused on the selection of patients withpsoriasis examined more than 60 clinical trialsbetween 1988 and 198921 and documented thatinformation about the stability of the conditionwas missing in more than 70% of the studies.

Exclusion criteria have the function of select-ing the ‘more suitable’ patients among all possi-ble candidates (e.g., excluding patients in whomthe treatment under investigation is contraindi-cated). This selection also has the aim of reducingfactors of variability in the study population, inorder to maximise the chance of detecting andquantifying the treatment effect (e.g., excludingpatients who are too young or too old). Thebest way to provide an account of the selec-tion process is a log that lists the included andexcluded patients and specifies the reasons forexclusion. This is rarely found in clinical trialreports concerning skin disorders. An example ofhow far exclusion criteria may operate and limitthe possibility of generalising the study resultsis offered by a clinical trial on the effectivenessof a Chinese herbal extract called ‘Dabao’ inthe treatment of alopecia androgenetica.35 Amongthe 3000 patients available to take part in thetrial, only 396 were eventually selected to berandomised in the treatment or placebo group.Such exclusion must be a warning when interpret-ing the actual effectiveness of Dabao on malesaffected by alopecia androgenetica. It is quiteplausible that a similar selection process operatesin many RCTs concerning skin disorders.

PLACEBO USE

There are still controversies about the use ofplacebo in randomised control trials. It is widely

Page 228: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 227

accepted that ‘in any medical study every patientshould be assured of the best proven diagnosticand therapeutic method’. As a consequence,the use of placebo should be proscribed whena ‘proven’ therapeutic method exists. In spiteof these principles, studies which breach theethical principle are still commonly conductedwith the approval of regulatory agencies andinstitutional review boards. It is widely acceptedthat placebo-controlled trials have high internalvalidity, but they may be difficult to apply toclinical practice in situations where alternativeinterventions of proven efficacy already exist. Inthese circumstances, the information of clinicalvalue is the effect size of the new intervention ascompared with the alternative treatment strategy.The use of placebo may sometimes undermine thevalidity of the study if the treatment falls shortof patients’ expectations, resulting in reducedcompliance and a large dropout rate. Some yearsago a placebo-controlled trial was published onthe effect of ebastine, an H1 receptor antagonist,in chronic urticaria.36 A number of other non-sedative antihistamine drugs of proven efficacywere available when the trial was conducted. Onemight argue that it is unethical to deprive thepatients in the placebo group of any effectivetherapy, even if only for a limited time (14 daysin this study). As a matter of fact, the authorsreported a high number of dropouts due to thelack of effect in the placebo group. A remarkon the possible misinterpretation of the results ofplacebo-controlled trials comes from this study.The authors’ conclusion that ‘ebastine representsan effective and well tolerated alternative to othernon-sedative antihistamine drugs in the treatmentof chronic urticaria’ is likely to be true but farfrom proven.

Researchers may have a number of differentoptions for their choice of placebo or compari-son intervention in randomised clinical trials but,in practice, many regulatory agencies still con-sider placebo controls as the ‘gold standard’.Placebo controls are usually required for the eval-uation of symptom relief or short-term modifica-tion of disorders of moderate severity even whenan alternative treatment is available. The usual

but questionable claim that justifies this practicefrom an ethical point of view is that withhold-ing the active therapy does not necessarily affectthe long-term prognosis. The above-mentionedissues of symptomatic relief and moderate sever-ity disorders are commonly encountered in der-matology and, in fact, a large number of placebo-controlled RCTs are conducted in this area evenwhen alternative therapies exist. The results ofdelaying or withholding the treatment may notbe straightforward in dermatology. However,there is no question that an extraordinary largenumber of similar molecules employed for thesame clinical indications can be found in thisarea. These molecules are mostly assessed inplacebo-controlled RCTs rather than in compar-ative RCTs. Examples include topical steroids,oral antihistamines, antifungal drugs and topi-cal antibacterial drugs. More than 200 treatmentmodalities were identified in a recent survey ofpublished clinical trials of psoriasis with only afew comparative trials. There is a need to estab-lish criteria for the use of placebo in dermatology.They should be developed with the active andinformed participation of the public and shouldbe considered by review boards and regulatoryagencies. Pragmatic randomised trials contrast-ing alternative therapeutic regimens are urgentlyneeded to inform clinical decisions.

TIME FRAME FOR EVALUATION ANDOUTCOME MEASURES IN CONTEXT

This discussion will focus on chronic inflam-matory skin disorders like psoriasis or atopicdermatitis. There is a necessary link betweenthe time frame for evaluation and the measuresadopted to assess clinical response; therefore thetwo issues should be dealt with together. Manychronic inflammatory skin diseases do not neces-sarily have a progressive deteriorating course, butthey may vary in severity over time causing prob-lems that are similar to those encountered withmany psychiatric disorders and some rheumaticdiseases. Whenever a definite cure is not rea-sonably attainable, it is common to distinguishbetween short, intermediate (usually measurable

Page 229: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

228 TEXTBOOK OF CLINICAL TRIALS

within months) and long-term outcomes. We havealready mentioned that clearing the disease in theshort term is different from maintaining clearanceover time, and long-term results are not simplypredictable from short-term outcomes. Most ofthe score systems available for skin disordersseem to fit best with the clearance issue. On theother hand, it is not easy to define what repre-sents a clinically significant long-term change inthe disease status. This is an even more diffi-cult task than defining outcome for other clinicalconditions, such as cancer or ischaemic heart dis-eases, where mortality or major hard clinical endpoints (e.g., myocardial infarction) are of particu-lar interest. In the long-term, the way the diseaseis controlled and the treatment side effects arevitally important. It has been documented thatcompliance with the duration of the treatment islimited and is worst with topical treatments.16

Measures of the quality of life appear ratherattractive. However, what represents an importantchange for most quality of life measures is impre-cisely defined especially if one considers a long-term time frame for evaluation. Clearly, treatmenteffects can be seen from different perspectivesand several dimensions can be taken into account.However, in view of the limitations of the avail-able measures in the long term, simple and cheapoutcome measures applicable in all patients seemto be preferable. These may include the numberof patients in remission, the number of hospi-tal admissions or ambulatory consultations andmajor disease flare ups. Dropouts merit specialattention. In chronic inflammatory skin diseasesthat lack hard end points, they may stronglyreflect dissatisfaction with treatment. Whateverthe outcome measure adopted, dropouts cannotsimply be ignored because the patients who donot provide PASI, Disability or Quality of Lifescores might be different from those who do.Analysis by randomised group irrespective ofsubsequent changes is the method recommendedfor the analysis of clinical trials. This analysisposes special problems when relying on quan-titative scores. It is suggested that every effortshould be made to ensure that patients have a

complete assessment at withdrawal and are fol-lowed up. If some categorical outcome variableis also considered, e.g., hospital admission, therelation between the score value at withdrawaland the final outcome may be explored.

OTHER ISSUES

The most precise definition of the profile of anintervention requires a perspective on the risksand benefits, which is wider than the one usu-ally provided by any single RCT. For manychronic skin diseases, efficacy data are derivedfrom short-term RCTs, whereas patients tendto be treated over years. The main issues ofsafety and long-term effectiveness are usuallyaddressed in the context of observational studies,i.e., phase IV studies. One of the best examplesof such a study is the PUVA follow-up study,a cohort study of more than 1400 patients whohad received a first course of PUVA-treatment in1977. These patients are still being followed upand provide information on disease associationsand prognostic factors. The study pointed to adose-related increased risk of non-melanoma skincancer in PUVA-treated patients. We lack simi-lar studies for many other systemic treatments ofpsoriasis, including methotrexate, retinoids andcyclosporine. The safety profiles of most sys-temic antihistamines are also imprecisely defined.Observational studies may represent the most fea-sible way to study the usefulness of long-termtreatment strategies for chronic inflammatory skindiseases, when disease modification rather thansymptom control becomes a desired outcome. Ashas been proposed for some rheumatologic disor-ders, e.g., rheumatoid arthritis, drug survival, etc.the interval individual patients remain on an agentmay offer an indication for long-term acceptabil-ity that takes into account adverse effects, lack orloss of effect and patients’ preference.

A final mention should be made of thoseactivities that aim at summarising the results ofseveral RCTs on the same issue. There is a largeburden of small RCTs13 addressing disparateclinical questions, as well as a lack of consensus

Page 230: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 229

Table 14.3. List of the systematic reviews on skin conditions already available,or in an advanced stage of development, in the Cochrane Library (CochraneSkin Group, August 2000)

Completed reviewsSurgical treatments for ingrowing toenailsTopical treatments for fungal infections of the skin and nails of the footMinocycline for acne vulgaris: efficacy and safetyInterventions for guttate psoriasisSystemic treatments for metastatic cutaneous melanomaAntistreptococcal interventions in the treatment of guttate and plaque psoriasis

Reviews undergoing the editorial processDrug treatments for discoid lupus erythematosus (DLE)Laser resurfacing for the improvement of facial acne scarring

Protocols under conversion to reviewsSystemic treatments for fungal infections of the skin of the footAntihistamines for atopic eczemaInterventions for toxic epidermal necrolysis (TEN)Complementary therapies for acneLocal treatments for common wartsInterventions for photodamaged skinInterventions for chronic palmoplantar pustular psoriasis

Source: Reproduced from the Cochrane Library.

on the management of many skin disorders. Thiscreates an increasing emphasis on systematicreviews, and a Cochrane Skin Group has beenestablished within the Cochrane Collaborationin 1997. A list of systematic reviews alreadyavailable within the Cochrane Library is reportedin Table 14.3.

In the light of the increasing role system-atic reviews may play with informing clinicalpractice, special care should be devoted to set-ting priorities so that the most important ques-tions are addressed first. Otherwise, they wouldrisk amplifying the irrelevant issues. The impor-tance of the involvement of consumers cannotbe underestimated. The first analyses from sys-tematic assessment of published RCTs point tosome ‘peculiarities’ of dermatology that havealready been discussed in the previous sections,and include among others:

1. The ‘moving boundary’ between cosmetologyand medicine.

2. The need to develop study designs that addressquestions posed by chronic recurrent diseases.

3. The limitations of available outcome measuresthat neglect patients’ needs and expectations.

4. Problems with external generalisability likethe lack of adequate description of the studypopulations and study settings.

5. The lack of comparative RCTs.6. The overwhelming role of pharmaceutical

industries with defining priorities.

REFERENCES

1. McKenna KE, Stern RS. The outcomes movementand new measures of the severity of psoriasis. JAm Acad Dermatol (1996) 34: 534–8.

2. Webster GF. Is dermatology slipping into its anec-dotage? Arch Dermatol (1995) 131: 149–50.

3. Vermeer BJ, Gilchrest BA. Cosmeceuticals – aproposal for rational definition, evaluation, andregulation. Arch Dermatol (1996) 132: 337–40.

4. Peckham PE, Weinstein GD, McCullough JL. Thetreatment of severe psoriasis: A national survey.Arch Dermatol (1987) 123: 1303–7.

5. Farr PM, Diffey BL. PUVA treatment of psoriasisin the United Kingdom. Br J Dermatol (1991) 124:365–7.

Page 231: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

230 TEXTBOOK OF CLINICAL TRIALS

6. Doshi A, Zaheer A, Stiller MJ. A comparison ofcurrent acne grading systems and proposal of anovel system. Int J Dermatol (1997) 36: 416–18.

7. Motley RJ, Finlay AY. Practical use of a disabilityindex in the routine management of acne. Clin ExpDermatol (1992) 17: 1–3.

8. Zeigher RS, Heller S, Mellon MH, Forsythe AB,Hamburger RN, Schatz M. Effect of combinedmaternal and infant food-allergen avoidance ondevelopment of atopy in early infancy: a ran-domized study. J Allerg Clin Immunol (1989) 84:72–89.

9. Arshad SH, Matthews S, Gant C, Hide DW. Ef-fect of allergen avoidance on development ofallergic disorders in infancy. Lancet (1992) 339:1493–7.

10. Frederiksson AJ, Peterssonn DC. Severe psoria-sis: oral therapy with a new retinoid. Dermato-logica (1978) 157: 238–44.

11. Phillips T, Stanton B, Provan A, Lew R. A studyof the impact of leg ulcers on quality of life:financial, social, and psychological implications.J Am Acad Dermatol (1994) 31: 49–53.

12. Moffatt CJ, Franks PJ, Oldroyd M, Bosanquet N,Brown P, Greenhalgh RM, McCollum CN. Com-munity clinics for leg ulcers and impact on heal-ing. BMJ (1992) 305: 1389–92.

13. Williams HC, Seed P. Inadequate size of ‘nega-tive’ clinical trials in dermatology. Br J Dermatol(1993) 128: 317–26.

14. Lehmann HP, Robinson KA, Andrews JS, Hol-loway V, Goodman SN. Acne therapy: a method-ologic review. J Am Acad Dermatol (2000) 47:231–40.

15. Naldi L, Svensson A, Diepgen T, Elsner P, GrobJJ, Coenraads PJ, Bavinck JN, Williams H. Euro-pean Dermato-Epidemiology Network. Random-ized clinical trials for psoriasis 1977–2000: theEDEN survey. J Invest Dermatol (2003) 120:738–41.

16. Van de Kerkhof PCM, De Hoop D, De Korte J,Cobelens SA, Kuipers MV. Patient complianceand disease management in the treatment ofpsoriasis in the Netherlands. Dermatology (2000)200: 292–8.

17. Day S. Commentary: Treatment allocation by themethod of minimisation. Br Med J (1999) 319:947–8.

18. Boateng F, Sampson A, Schwab B. Statisticalanalysis of possible bias of clinical judgementsdue to observing an on-therapy marker variable.Stat Med (1996) 15: 1747–55.

19. Christiansen JV, Holm P, Moller R, Reymann F,Schmidt H. Etretinate (Tigason) and betametha-sone valerate (Celeston valerate) in the treatmentof psoriasis. A double-blind, randomized, multi-center trial. Dermatologica (1982) 165: 204–7.

20. Weinstein GD. Safety, efficacy and duration oftherapeutic effect of tazarotene in the treatmentof plaque psoriasis. Br J Dermatol (1996) 135(Suppl): 32–6.

21. Petersen LJ, Kristensen JK. Selection of patientsfor psoriasis clinical trials: a survey of the recentdermatological literature. J Dermatol Treat (1992)3: 171–6.

22. Ashcroft DM, Li Wan Po A, Williams HC, Grif-fiths CEM. Clinical measures of severity and out-come in psoriasis: a critical appraisal of their qual-ity. Br J Dermatol (1999) 141: 185–91.

23. Marks R, Barton SP, Shuttleworth D, Finlay AY.Assessment of disease progress in psoriasis. ArchDermatol (1989) 125: 235–40.

24. Feldman SR, Fleischer AB Jr, Reboussin DM,Rapp SR, Exum ML, Clark DR, Nune L. Theself-administered psoriasis area and severity indexis valid and reliable. J Invest Dermatol (1996) 106:183–6.

25. Krueger GG, Feldman SR, Camisa C, Duvic M,Elder JT, Gottlieb AB, et al. Two considerationsfor patients with psoriasis and their clinicians:what defines mild, moderate and severe psoriasis?What constitutes a clinically significant improve-ment when treating psoriasis? J Am Acad Derma-tol (2000) 43: 281–3.

26. Husted JA, Cook RJ, Farewell VT, Gladman DD.Methods for assessing responsiveness: a criticalreview and recommendations. J Clin Epidemiol(2000) 53: 459–68.

27. Norman GR. Issues in the use of change scoresin randomized trials. J Clin Epidemiol (1989) 42:1097–105.

28. Matthews JNS, Altman DG, Campbell MJ, Roys-ton P. Analysis of serial measurements in medicalresearch. Br Med J (1990) 300: 230–5.

29. Temple R. Current definitions of phases of inves-tigation and the role of the FDA in the con-duct of clinical trials. Am Heart J (2000) 139:S133–5.

30. Krueger GG. Psoriasis therapy–observational orrational? New Engl J Med (1993) 328: 1845–6.

31. Gottlieb AB. Product development for psoria-sis: Clinical challenges and opportunities. In:Roenigk HH, Maibach HI, eds, Psoriasis, 3rd edn.New York: Marcel Dekker (1998) 421–33.

32. Naldi L, Carrel CF, Parazzini F, Cavalieri d’OroL, Cainelli T. Development of anthralin short-contact therapy in psoriasis: survey of pub-lished clinical trials. Int J Dermatol (1992) 31:126–30.

33. Louis TA, Lavori PW, Bailar JC, Polansky M.Crossover and self-controlled designs in clinicalresearch. New Engl J Med (1984) 310: 24–31.

Page 232: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DERMATOLOGY 231

34. Lambert MF, Wood J. Incorporating patient pref-erences into randomized trials. J Clin Epidemiol(2000) 53: 163–6.

35. Kessels AG, Cardynaals RL, Borger RL, Go MJ,Lambers JC, Knottnerus JA, Knipschild PG. Theeffectiveness of the hair-restorer ‘Dabao’ in maleswith alopecia androgenetica. A clinical experi-ment. J Clin Epidemiol (1991) 44: 439–47.

36. Peyri J. Ebastine in chronic urticaria: a double-blind placebo-controlled study. J Dermatol Treat(1991) 2: 51–3.

37. Gupta MA, Johnson AM, Gupta AK. The devel-opment of an Acne Quality of Life scale: reli-ability, validity, and relation to subjective acneseverity in mild to moderate acne vulgaris. ActaDermato-Venereol (1998) 78: 451–6.

38. Finlay AY. Quality of life measurement in der-matology: a practical guide. Br J Dermatol (1997)136: 305–14.

39. Charman C, Williams H. Outcome measures ofdisease severity in atopic eczema. Arch Dermatol(2000) 136: 763–9.

40. Emerson RM, Charman CR, Williams HC. TheNottingham Eczema Severity Score: preliminaryrefinement of the Rajka and Langeland grading.Br J Dermatol (2000) 142: 288–97.

41. Ashcroft DM, Li Wan Po A, Williams HC, Grif-fiths CEM. Quality of life measures in psoriasis:

a critical appraisal of their quality. J Clin PharmTher (1998) 23: 391–8.

42. Kantor J, Margolis DJ. Efficacy and prognosticvalue of simple wound measurements. Arch Der-matol (1998) 134: 1571–4.

43. Lindholm C, Bjellerup M, Christensen OB, Zed-erfeldt B. Quality of life in chronic leg ulcerpatients. An assessment according to the Notting-ham Health Profile. Acta Dermato-Venereol (1993)73: 440–3.

44. Faust HB, Gonin R, Chuang TY, Lewis CW,Melfi CA, Farmer ER. Reliability testing of thedermatology index of disease severity (DIDS).An index for staging the severity of cutaneousinflammatory disease. Arch Dermatol (1997) 133:1443–8.

45. Chren MM, Lasek RJ, Quinn LM, Mostow EN,Zyzanski SJ. Skindex, a quality of life measurefor patients with skin disease: reliability, validity,and responsiveness. J Invest Dermatol (1996) 107:707–13.

46. Kragballe K, Gjertsen BT, De Hoop D, Karls-mark T, van de Kerkhof PC, Larko O, Nieboer C,Roed-Petersen J, Strand A, Tikjob G. Double-blind, right/left comparison of calcipotriol andbetamethasone valerate in treatment of psoriasisvulgaris. Lancet (1991) 337: 193–6.

Page 233: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

PSYCHIATRY

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 234: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

15

OverviewB.S. EVERITT

Biostatistics and Computing Department, Institute of Psychiatry, King’s College, London, UK

The mentally ill have always been with us–tobe feared, marvelled at, laughed at, pitied ortortured, but all too seldom cured.

Alexander and Selesnick, The History of Psychi-atry, 1967. (Allen & Unwin, Sydney, Australia)

INTRODUCTION

In his dictionary of psychology, the late Pro-fessor Stuart Sutherland defines psychiatry as‘the medical speciality that deals with mentaldisorders’. An almost equally brief definitionappears in Campbell’s Psychiatric Dictionary,namely, ‘the medical speciality concerned withthe study, diagnosis, treatment and prevention ofbehaviour disorders’. In terms of either defini-tion it would appear that psychiatry has a longhistory; Pythagoreans, for example, employeda form of music therapy with emotionally illpatients,1 and Aretaeus (AD 50–130) observedmentally ill patients and did careful follow-upstudies on them. As a result, he established thatmanic and depressive states often occur in thesame individual and that lucid intervals generallyexist between manic and depressive periods.

But a thousand years on such a seeminglyenlightened approach to the mentally ill had been

largely abandoned in favour of viewing the insaneas wild beasts who should be kept constantly infetters. Indeed according to Foucalt,2 ‘madnessborrowed its face from the mask of the beast’. Inearly medieval times beating, incarceration andrestraint were the ‘treatments’ endured by themajority of the mentally ill. Insanity was almostuniversally regarded as a spiritual trial which onehad to undergo as a punishment for vice, a testof faith, or a method of purging sin–a form ofPurgatory on Earth–which could be dealt withonly by spiritual remedies such as exorcism orbeing locked up in a church overnight. Graduallyother approaches to treatment were introducedalthough most were equally harsh, bleeding,vomiting and purging for mentally ill patientswere common, as were more whimsical formsof treatment such as whirling or spinning amadman round on a pivot. These treatments werein addition to the continued use of manacles andchains for restraint. Apart from their harshness,what these treatments also had in common wasthat they were mostly ineffective.

It was not until the seventeenth century thatthe tide of opinion seems to have turned againstrough treatment. For example, on 18 July 1646the Court of Governors of Bethlem Hospital

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 235: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

236 TEXTBOOK OF CLINICAL TRIALS

ordered ‘that no officer or servant shall giveany blows or ill language to any of the madfolks on pain of loosing his place’ and at thesame hospital in 1677 the Governors propoundeda rule that ‘No Officer or Servant shall beator abuse any Lunatik, nor offer any force tothem, but upon absolute, Necessity, for the bettergoverning of them’. As a substitute for coercion,some institutes housing the insane began tooffer kindness, attention to health, cleanlinessand comfort. Reformers such as John Monropioneered the introduction of ‘moral treatment’which stressed the value of occupation to combatthe dangers of idleness, and the need for patientsto be dealt with tenderly and with affection.Such an approach was now considered to bemore likely to restore reason than harshnessor severity.

But although there was an increasing desirefor caring to replace constraint in dealing withthe mentally disturbed, drugs such as corium,digitalis, antimony and chloral were still usedto quieten patients, replacing physical fetterswith pharmacological ones. And despite the bestefforts of the advocates of the moral treatmentapproach, asylums housing the insane oftenremained depressing and degrading places untilwell into the twentieth century, as is illustratedby the following account of a visit by a newlyappointed psychiatrist in 1953 to the chronicward of a mental hospital in Cambridge in theUnited Kingdom:

I was taken in by someone who had a key to unlockthe door and lock it behind you. The crashing ofkeys in the lock was an essential part of asylum lifethen just as it is today in jail. This led into a bigbare room, overcrowded with people, with scrubbedfloors, bare wooden tables, benches screwed to thefloor, people milling around in shapeless clothing.There was a smell in the air of urine, paraldehyde,floor polish, boiled cabbage and carbolic soap – theasylum smell. Some wards were full of tousled,apathetic people just sitting in a row because fortwenty years the nurses had been saying ‘sit down,shut up’. Others were noisy. At the back of theward were the padded cells, in which would beone or two patients, smeared with faeces, shouting

obscenities at anybody who came near. A scene ofhuman degradation.

And, as we shall see in the next section,several new twentieth century treatments wereequally as harsh as those used 200 years earlier,and in the main, almost equally ineffective inproducing a cure. One positive change fromearlier times, however, was that now someclinicians began to take the first small steps toevaluating treatments scientifically by makingqualitative and quantitative observations andmeasurements. Empiricism was, at last, about toplay a role in psychiatric practice.

PSYCHIATRIC TREATMENT AND ITSEVALUATION: THE EARLY TWENTIETH

CENTURY

In the 1920s Dr Henry A. Cotton3 proposed a the-ory relating focal infection to mental disorders, inparticular the functional psychoses. According toDr Cotton:

The so called functional psychoses we believetoday to be due to a combination of manyfactors, but the most constant one is the intra-cerebral, bio-chemical cellular disturbance arisingfrom circulating toxins originating in chronicfoci of infection, situated anywhere in the body,associated probably with secondary disturbanceof the endocrin system. Instead of consideringthe psychosis as a disease entity, it should beconsidered as a symptom, and often a terminalsymptom of a long continued masked infection, thetoxaemia of which acts directly on the brain.

Dr Cotton identified infection of the teeth andtonsils as the most important foci to be consid-ered, but the stomach and in female patients, thecervix could also be sources of infection responsi-ble, according to Dr Cotton’s theory, for the men-tal condition of the patient. The logical treatmentfor the mentally ill resulting from Dr Cotton’stheory was surgical elimination of the chronicallyinfected tissue, all infected teeth and tonsils cer-tainly and for many patients, colectomies. Addi-tionally female patients might require enucleation

Page 236: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

OVERVIEW 237

of the cervix, or in some cases complete removalof fallopian tubes and ovaries. Such treatmentwas, according to Dr Cotton, enormously suc-cessful with, out of 1400 patients treated, only42 needing to remain in hospital.

The focal infection theory of functional psy-choses was not universally accepted, neither werethe striking results said to have been obtainedby the removal of these infections. So in 1922Drs Kopeloff and Cheney4 of the New York StatePsychiatric Institute undertook a study to investi-gate Dr Cotton’s proposed treatment in the spiritof, in their own words: ‘an approach free fromprejudice and without preconceived ideas as tothe possible results’. To achieve this laudableif somewhat pious aim, Kopeloff and Cheneyplanned their study in the form of an experiment.All the patients were divided into two groupsas nearly identical as possible. All members ofone group received operative treatment for fociof infection in teeth and tonsils, while membersof the other group received no such treatment andconsequently could be regarded as controls. Nodoubt Kopeloff and Cheney’s study would havebeen hard pressed to have gained ethical approvaltoday, but despite its ethical and probable sci-entific limitations it did produce results (sum-marised here in Table 15.1) that cast grave doubtsover removal of focal infections as a treatmentfor some types of mental illness, and indirectlyat least, drove a nail into the coffin of Dr Cotton’stheory as to the cause of these conditions.

Dr Cotton’s suggested treatment for patientswith functional psychoses was severe, but notmore so than other ‘physical therapies’ which

became popular in the 1930s and 1940s. Insulincoma, for example, required patients to be givenlarge doses of insulin which, by lowering theblood sugar, induced a comatose state from whichthey would be rescued by a large dose of glucose(if they were amongst the lucky ones–somepatients died). According to Sargant and Slater,5

(reproduced with permission of Elsevier Science)‘reliable statistics are mostly in favour of thetreatment’, although this claim needs to beconsidered alongside their recommendation as tohow to select patients for treatment:

It is rarely indeed that facilities will exist forthe treatment by a full course of insulin of allschizophrenics coming under observation, and it istherefore important not to waste the treatment onpatients not very likely to respond while denying itto the favourable cases.

Perhaps the most severe of the physical therapieswas a lobotomy, where the brain was cut witha knife. The operation was pioneered by EgasMoniz, a Lisbon neurologist, and later taken upenthusiastically by psychiatrists such as WilliamSargant of St. Thomas’s Hospital in the UnitedKingdom. Evaluation of the effectiveness ofthe therapy was largely anecdotal, and evenan enthusiast such as Sargant knew that theoperation was often performed at a price:

It is probable that the highest powers of the intellectare affected detrimentally, and if the patient showslittle sign of this in his day-to-day behaviour it maybe because the daily routine of existence makeslittle call on his best powers. We recognise too thattemperamental qualities also are not unaffected, that

Table 15.1. Results from Kopeloff and Cheney’s study4

Demential praecox Manic depressive

Controls Operated Controls Operated

Number of cases 15 17 15 9Recovered – – 5 4Improved 5 5 8 1Total benefited 5 5 13 5Unimproved 10 12 2 4Left hospital 3 5 6 3

Page 237: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

238 TEXTBOOK OF CLINICAL TRIALS

the reduction in self-criticism may lead to tactlessand inconsiderate behaviour, and that the moreimmediate translation of thought and feeling intoaction can show itself in errors of judgement. Thedamage, once done, is irreparable. . .

Sargant and Slater5

Both insulin therapy and lobotomies were slowlyphased out as treatments for the mentally ill, butanother of the physical therapies introduced inthe mid-twentieth century, electric shock (ECT),remains in use to this day largely because it hasbeen found to be effective in a number of studies(see next section). This treatment, introducedby Cerletti and Bini, consists of producingconvulsions in a patient by means of passingan electric current through two electrodes placedon the forehead. The idea that such convulsionsmight help the mentally ill patient was not new;as long ago as 1798, for example, Weickhardt hadrecommended the giving of camphor to the pointof producing vertigo and epileptic fits.

ECT was (and is) used primarily in the treat-ment of patients with severe depression. Earlyclaims for its effectiveness bordered on the mirac-ulous. Batt,6 for example, reported a recoveryrate of 87%. Fitzgerald7 was only slightly lessoptimistic, suggesting the figure was 78%. In nei-ther report however was there any attempt togather data on recovery rates in concurrent con-trols. Despite this, other psychiatrists acceptedthe quoted recovery rates as an indication of theeffectiveness of ECT. Typical is the followingquotation from Napier:8

It is a remarkable advance that a type of case inwhich the outlook was formerly so problematicalcan now be offered with some confidence theprospect of restoration in a matter of weeks.

(Reproduced with permission of the BritishJournal of Psychiatry)

Some researchers attempted to evaluate ECT bycomparing their results with those from historicalcontrols or from concurrent patients who forone reason or another had not been offered thetreatment of choice (ECT). But such studieslargely only illustrated the weaknesses of such

an approach. That by Karagulla,9 for example,compared results for six groups of patients. Twogroups, men and women, had been treated atthe Royal Edinburgh Hospital for Mental andNervous Disorders in the years 1900–39 (beforethe advent of ECT). The other four groupshad been treated in the years 1940–48, two(men and women) by ECT and two others (menand women) not using ECT. It requires littleimagination to suppose that the historical controlsseen during the period 1900–39 are of littleuse in evaluating ECT; any difference betweenthe recovery rates for the periods 1900–39and 1940–48 in favour of the latter could beexplained by many other factors than treatmentwith ECT. The differences between the ECTgroups and the concurrent controls are alsovirtually impossible to assess since the decisionto use ECT on a patient was a subjective oneby the clinicians involved. There is no way ofknowing whether the treated and untreated groupsare comparable.

But the evaluation of treatments in medicinein general and psychiatry in particular was aboutto be placed on a scientifically far firmer footing,by the introduction and then the increasing useand acceptance of the controlled clinical trial.

PSYCHIATRIC TREATMENTS AND THEIREVALUATION: THE 1950s ONWARDS

Kopeloff and Cheney’s study of removal of focalinfections as a treatment for particular forms ofmental illness has many aspects of a modernclinical trial, although it is missing that mostessential component, random allocation. Kopeloffand Cheney decided themselves in regard to eachpatient whether they should be operated on orwhether they should be a control.

It was Fisher who recognised the need forrandomisation to treatment groups in medical,biological and agricultural experiments, and theeventual adoption of the principle into the eval-uation of treatments has led to what Sir DavidCox has called ‘the most important contribu-tion of 20th Century statistics’, the randomised

Page 238: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

OVERVIEW 239

controlled clinical trial. In such trials patientsare assigned to treatment groups according tochance. Prior to Fisher’s randomisation principlebeing adopted, most of the early studies to com-pare competing treatment for the same conditioninvolved arbitrary, non-systematic schemes fordeciding which treatment a patient would receive.The first trial with a properly randomised controlgroup was that for streptomycin in the treatmentof pulmonary tuberculosis, carried out by Brad-ford Hill in 1947. The first psychiatrist to advo-cate the use of Fisher’s experimental approachin the evaluation of psychiatric treatments, par-ticularly the physical treatments, appears to havebeen Lewis.10 (Reproduced with permission fromOxford University Press) In his paper he criticisespast studies and of a controlled trial he concludes:

An organised experiment would demand muchthat has not hitherto been practicable, includ-ing voluntary acceptance by independent hospi-tals and clinics of an agreed procedure for theselection, management, evaluation of mental state,and follow-up investigation of treated, as well asof control cases. Such an experiment, as R.A.Fisher (1942) has demonstrated, requires muchforethought and self-discipline on the part of thosewho carry it out.

For many psychiatrists practising at the timethis was not altogether welcome news. Thephysical therapies, such as insulin coma andpsychosurgery remained in use, with advocatesof these treatments retaining their enthusiasm,apparently untroubled by the usual requirementsof rational scientific scepticism. Demands thatclinical trial methodology be adopted to evaluatetreatments whose effectiveness most psychiatristsalready took for granted, fell largely on deaf ears.

But slowly matters did improve. In the early1950s Miller and his colleagues randomly allo-cated ten schizophrenic patients to each ofthree alternative treatments, ECT, Pentothal andPentothal plus non-convulsive stimulation, andassessed them before treatment began, after thecessation of treatment, and then again two weekslater.11 Although the sample size was totally inad-equate to demonstrate any significant treatment

differences, the use of randomisation representeda great improvement over earlier studies.

Other small trials of ECT were reported in the1950s but it was not until 1965 that the MRCpublished the results of the first large-scale trialof the treatment. This was a multicentre trialinvolving 55 clinicians and 269 patients. As wellas demonstrating the effectiveness of ECT inthe treatment of depression, the MRC trial alsoshowed that a large multicentre trial in psychiatrywas feasible. The trial did, however, have somecritics. In a letter to the British Medical Journal,Sargant12 wrote:

There is no psychiatric illness in which bedsideknowledge and long clinical experience pays betterdividends; and we are never going to learn howto treat depressions properly from double-blindsampling in an MRC statistician’s office.

(Sargant W Antidepressant drugs. Br Med J.(196) 1; 1495. Reproduced with permission fromthe BMJ Publishing Group)

At the end of the 1940s and the beginning ofthe 1950s, the physical treatments introduced intopsychiatry 30 years earlier still formed the core ofmost psychiatrists’ treatment armoury. But mat-ters were about to change; in the 1950s severalentirely new types of drugs were to be intro-duced in psychiatric practice. In the main thediscovery of these drugs was not based on ascientific knowledge of brain chemicals, rathertheir discovery was for the most part serendip-ity, resulting from acute observations made byclinicians such as Henri Laborit (the effectsof the antihistamine promethazine, from whichdeveloped chlorpromazine), and John Cade whofirst described the value of lithium in manicdepression by observing its effect on a num-ber of patients. The tricyclic antidepressantsand the Selective Serotonin Reuptake Inhibitors(SSRIs), which had fewer side effects in treatingdepression, were also discovered in the 1950s.Finally, almost by accident, Leo Sternback in1957 identified the benzodiazepines for treatingmild anxiety.

The need to establish whether or not thesenewly discovered compounds were effective

Page 239: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

240 TEXTBOOK OF CLINICAL TRIALS

in treating mentally disturbed patients greatlyincreased most psychiatrists’ appreciation of theneed to use clinical trials for evaluating treat-ments. And after 1960 the increasing need tosatisfy regulatory authorities (prior to 1960 onlythe USA had such a body overseeing the intro-duction of new drugs into general use, butthe thalidomide tragedy changed the situationdramatically) meant that the randomised con-trolled clinical trial has now become establishedas the ‘gold standard’ for evaluating compet-ing therapies, although even as late as 1963some psychiatrists were unwilling to acceptthat such an approach was necessary; this isfrom the preface of a 1963 edition of Sar-gant and Slater5 commenting on controlled clin-ical trials:

If they fail to demonstrate any differences betweena placebo and a drug which everybody knows tobe effective, this means only that the work has notbeen done well enough.

Fortunately it is difficult to imagine such a viewbeing expressed in a major psychiatric text nowa-days. Over the last 40 years the use of clinicaltrials in psychiatry, particularly for evaluatingnew drugs, has become widespread. A quotationfrom one of the psychiatric champions of thisapproach, Michael Shepherd,13 remains almostthe perfect model for the modern scientific viewthat psychiatrists should have in the evalua-tion of psychotropic drug therapies in particular,and in the evaluation of psychiatric treatmentsin general:

The clinician is compelled to hold the balancebetween the scales of laboratory data on theone hand and stochastic theory on the other.Though his experience and judgement are essentialit will be necessary for him to adopt a moreexperimental role in the future if he is to co-operatefully with the pharmacologist and the statisticianwhose techniques he should understand if fullweight is to be given to observations made in theclinical setting.

(Shepherd, 1959, reproduced with permission)

SUMMARY

Treatment of the mentally ill has made giantstrides in the last 50 years. Drug treatment ofschizophrenia, depression and anxiety disordershave, in randomised clinical trials, been foundto be effective and have done much to alleviatethe misery of these conditions. Drug treatmentof mental illness works by altering in some waythe chemistry of the body. Chlorpromazine, forexample, has been shown to interfere with theaction of the neurotransmitter dopamine. But themodern view of mental illness, that it has bothpsychological and physical dimensions, impliesthat effective treatment must aim to ease the suf-fering of the mind as well as correcting possi-ble abnormalities of chemistry. And so, in the1970s, behavioural psychotherapy began to beused to treat particular disorders. More recentlycognitive therapy has been introduced. This pro-vides a simple, straightforward treatment regi-men which lasts weeks rather than years, andabove all permits the patients to make sense of,and thus hopefully control, their psychologicalproblems.

Clinical trials in psychiatry initially involvedthe evaluation of drug treatments. More recently,however, psychological therapies have been sub-jected to the rigours of the randomised clini-cal trial, and there has been a growing aware-ness that the theoretical and logistical prob-lems of such trials differ from those of theaverage drug trial. Consequently three of thechapters in this section concentrate on clinicaltrials of psychological treatments as now usedin psychiatry. In Chapter 17 Katherine Shearand Philip Lavori discuss the many problemsassociated with assessing treatments for anx-iety, in particular how interventions work inthe community settings where they will even-tually be used. In Chapter 18 Nicholas Tar-rier and Til Wykes give a masterly overviewof how clinical trials have been used to eval-uate the effectiveness of cognitive behaviouraltreatments of psychosis. Finally in Chapter 19Graham Dunn considers the many issues thatarise in applying clinical trial methodology to

Page 240: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

OVERVIEW 241

the use of psychotherapy for treating depression.The difficulties of undertaking clinical trialsin this area are clearly identified in all thepapers, but these difficulties should be seenas a challenge to psychiatrist and statisticianalike, in what must been seen as the long termgoal of alleviating the misery that is mentalillness.

REFERENCES

1. Gordon BL. Medicine Throughout Antiquity.Philadelphia: Davies (1949).

2. Foucalt M. Madness and Civilization: A Historyof Insanity in the Age of Reason. London: Tavis-tock (1967) (transl. from the French by RichardHoward; originally published as Histoire de laFolie).

3. Cotton HA. The etiology and treatment of the so-called functional psychoses. Summary of resultsbased upon the experience of four years. Am JPsychiat (1922) 2: 156–91.

4. Kopeloff N, Cheney CO. Studies in focal infec-tion: its presence and elimination in the func-tional psychoses. Am J Psychiat (1922) 2:139–56.

5. Sargant W, Slater E. An Introduction to PhysicalMethods of Treatment in Psychiatry. Edinburgh:Livingstone (1944).

6. Batt JC. One hundred depressive psychoses treatedwith electrically induced convulsions. J Mental Sci(1943) 89: 289–96.

7. Fitzgerald OWS. Experiences in the treatment ofdepressive states by electrically induced convul-sions. J Mental Sci (1943) 89: 73–80.

8. Napier FJ. Death from electrical convulsion ther-apy. J Mental Sci (1944) 90: 875–8.

9. Karagulla S. Evaluation of electrical convulsiontherapy as compared with conservative methodsof treatments of depressive states. J Mental Sci(1950) 96: 1060–91.

10. Lewis AJ. On the place of physical treatment inpsychiatry. Br Med Bull (1946) 3: 22–34.

11. Miller DH, Chang J, Cumming E. A com-parison between unidirectional current non-convulsive electrical stimulation given withReiter’s machine, standard alternating currentelectron shock (Cerletti method) and pentothalin chronic schizophrenia. Am J Psychiat (1953)189: 226–40.

12. Sargant W. Antidepressant drugs. Br Med J (1965)1: 1495.

13. Shepherd M. Evaluation of drugs in the treatmentof depression. Can Psychiat Assoc J (1959) 4:120–8.

Page 241: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

16

Alzheimer’s Disease∗

LEON J. THAL AND RONALD G. THOMASDepartment of Neurosciences, University of California San Diego, San Diego, CA 92037, USA

BACKGROUND

HISTORY

Descriptions of patients with dementia appearin the Bible, Roman and Greek writings, andShakespeare. However, it was not until the earlyportion of the 20th century that Alzheimer’s dis-ease (AD) was identified as a specific diseaseentity. The index case, Frau Auguste D, a 50-year-old woman, was admitted to the FrankfurtInsane Asylum in 1901 (see Table 16.1, time-line). She was found to be suffering from a pre-senile dementia with memory loss, generalisedcognitive impairment, delusions and hallucina-tions. Upon her death in 1906, brain exam-ination revealed generalised cerebral atrophyand two microscopic lesions called plaques andtangles.1 The disease was subsequently named‘Alzheimer’s disease’ by Kraepelin.2 However,Kraepelin, the dominant figure of the day, definedAD as a presenile dementia occurring betweenthe ages of 40 and 60 so that individuals whohad similar clinical presentations in their later

∗ Supported by Alzheimer’s Disease Cooperative Study Grant#AGO 10483.

years were considered to not have AD. Thus, ADwas considered to be a rare disorder causing onlypresenile dementia.

Over the succeeding decades, advances weremade largely based on pathological studies ofAD. Amyloid-like staining was noted by Dirvy.3

Terry4 and Kidd5 described the ultrastructureof the neurofibrillary tangle as a paired helicalfilament. In 1968, Blessed et al.6 carried outautopsies in an elderly cohort and found cerebralatrophy as well as plaques and tangles. Theseauthors concluded that these elderly subjects hadAD and that there was no difference betweenearly-onset and late-onset AD.

At that time, the prevailing opinion in theUnited States was that dementia after age 65was primarily due to cerebral atherosclerosis. Itwas not until the mid-1970s that physicians andscientists in the US recognised that most elderlyindividuals with dementia suffered from AD.Katzman7 noted the high prevalence of dementiaand pointed out that AD was the fourth mostcommon cause of death in the US. At the sametime, modern neurochemical techniques wereapplied to the study of AD brain tissue. Threelaboratories independently described the loss ofcholinergic markers in AD.8 – 10 At the same time,

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 242: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

244 TEXTBOOK OF CLINICAL TRIALS

Table 16.1. AD timeline

1907 Index case, Frau Auguste D described by Alzheimer1910 Kraepelin calls AD a rare presenile dementia1927 Dirvy identified amyloid in the plaque1963 Terry and Kidd describe the ultrastructure of the neurofibrillary tangle1975 NIA established–AD is its primary focus1976/77 Selective loss of cholinergic markers reported in AD1976 Katzman reports that AD is the fourth leading cause of death in the US1980 Alzheimer’s Association formed1984 Glenner sequences amyloid and postulates that it causes AD1984 NIA initiated Alzheimer Centers programmes1987 First gene for AD on chromosome 211991 Alzheimer’s Disease Cooperative Study funded by NIA1992 Presenilin 1 mutation on chromosome 14 for early-onset AD1993 Tetrahydroaminoacridine (Cognex), first cholinesterase inhibitor is marketed1993 Apo E4 allele identified as a risk factor for late-onset AD1995 Presenilin 2 mutation on chromosome 1 for early-onset AD1996 Donepezil (Aricept), second AChE inhibitor, marketed1997 Vitamin E and selegiline delay time to endpoints in AD including progression of disease2000 Rivastigmine (Exelon), third AChE inhibitor, marketed

the National Institute on Aging (NIA) was formedin 1975 and its first director, Dr Robert Butler,identified AD as the most important diseaseof aging. In 1980, the Alzheimer’s Associationwas formed to represent public interest in thisdisorder. It has subsequently grown to more than250 national chapters.

Initially, the NIA focused on the fundingof individual investigator grants and some pro-gramme project grants. Developments occurredrapidly, and in 1984, Dr George Glenner iso-lated amyloid from the blood vessel wall ofAlzheimer tissue and postulated that amyloidwas the causative agent of AD.11 By 1987, thefirst gene for early-onset familial AD, whichlater was found to be the β-amyloid precur-sor protein gene, was linked to chromosome21.12 In 1993, an association was reportedbetween the presence of the E4 allelic variant ofapolipoprotein E and the development of AD.13

This was the first genetic risk factor describedfor late-onset AD. In the same year, the firstacetylcholinesterase (AChE) inhibitor, tetrahy-droaminoacridine (Cognex) was approved andmarketed. This drug was developed based on theempirical observation of a decrease in choliner-gic functioning originally made in 1976. During

this same time, two additional genes coding forpresenilin 1, located on chromosome 14, and pre-senilin 2, located on chromosome 1, were foundto be responsible for additional families withearly-onset AD.14,15 Additional cholinesteraseinhibitors were approved in 1996 and 2000. In1997, the first trial demonstrating a delay in timeto endpoints was published using the antioxidant,vitamin E, and the monoamine oxidase inhibitor,selegiline.16

By 1984, the NIA recognised the need forthe study of well-characterised brain tissue forpatients with AD. It initiated the Alzheimer’sdisease centres programme by funding five cen-tres in 1984 with gradual growth of the cen-tres’ programme to 30 by 2000. In 1991, theNIA also funded the Alzheimer’s Disease Coop-erative Study (ADCS) with a mandate to bothdevelop new instrumentation for the testing ofpatients with AD and to carry out clinical tri-als for promising agents that would not be testedby the pharmaceutical industry, such as com-pounds lacking patent protection, compounds inthe public domain for the treatment of otherdisorders, and molecules derived from individ-ual investigator laboratories or small biotechnol-ogy companies.

Page 243: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ALZHEIMER’S DISEASE 245

Over the past four decades, interest in ADhas markedly increased. Funding has increasedto approximately $400 million per year by theFederal government. Citations in Index Medicushave increased from fewer than 100 in 1970 tomore than 3000 per year by 2000.

EPIDEMIOLOGY

Epidemiological studies of AD have been extre-mely useful in defining prevalence, incidenceand risk factors. Numerous prevalence studieshave been carried out in the US, Europe andChina. Overall, these studies indicate that theprevalence of AD is under 1% below theage of 65. After age 65, the prevalence risesrapidly, doubling with every five-year epoch. Theprevalence exceeds 30% by age 90 (see Kawasand Katzman17 for a review).

Epidemiological studies have also been usefulin defining both risk and protective factorsfor the development of AD (Table 16.2). Themost important risk factor is age. A secondimportant risk factor is family history. This wasrecognised by Heston et al.18 who noted thatthe cumulative risk for AD was significantlyhigher for parents and siblings of patients than forthe general population as a whole. This findinghas been subsequently confirmed by many otherindividuals.19,20 The discovery of early-onset ADfamilies further strengthened the importance ofgenetics as a risk factor. Finally, the associationof the apo E, E4 allele and AD represents agenetic risk for late-onset AD. In addition, anumber of more minor risk factors have beenidentified including female sex, head injury andlow education.

Table 16.2. Risk factors for AD

Major risk factors Protective factorsAge Higher educationPositive family history NSAIDsAD genes Oestrogens

Minor risk factors Non-risk factorsLow education AluminiumHead trauma Vascular diseaseFemale sex

Epidemiological studies have also identifiedthe use of non-steroidal anti-inflammatory drugs(NSAIDs) and oestrogens as potential protectivefactors (see Kawas and Katzman17 for a review).These epidemiologic observations plus studieson the basic biological mechanisms of NSAIDsand oestrogens led to the development of severalclinical drug trials designed to determine whetheror not NSAIDs or oestrogens can either slow therate of decline in patients with established AD orprevent AD in normal elderly.

Epidemiological studies have also been usefulin determining that exposure to aluminium andcerebrovascular disease are not associated withthe development of AD.

GENETICS

Genetic inheritance of AD and familial clusteringwas first reported in the 1950s21 and 1960s,22

then confirmed in a pathological series in the1970s.23 Inheritance was complex but appearedto be autosomal dominant in a small proportionof cases, inherited as a complex trait in upto 50% of the population and sporadic in theremainder. Since these initial studies were carriedout, inheritance for early-onset AD has beenconfirmed in patients carrying mutations for APP,presenilin 1 and presenilin 2. In addition, late-onset AD has been linked to the apo E, E4

allele as a risk factor gene. Thus, at present,approximately one-half of known AD cases mayhave some underlying genetic component.

PATHOLOGY

The macroscopic pathology of AD consistsprimarily of cerebral atrophy and dilatation ofthe ventricles. The hallmarks of the disease aretwo microscopic lesions. The first, neurofibrillarytangles, are intraneuronal paired helical filamentsconsisting predominantly of hyperphosphorylatedtau protein. The second is the senile plaque, apathological inclusion in the neuropil consistingof damaged neuritic endings, compact and diffuseamyloid, and other proteins. The number ofboth plaques and tangles correlates moderatelywell with the degree of dementia. There is

Page 244: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

246 TEXTBOOK OF CLINICAL TRIALS

also a significant neuronal loss and loss ofsynapses. The loss of synapses correlates bestwith the severity of dementia (see Terry et al.24

for a review).

PATHOGENESIS AND AETIOLOGY

At the present time, the cause of AD remainsunknown. However, the best hypothesis atpresent is that misprocessing of amyloid and itsdeposition in the brain results in nerve cell dam-age followed by loss of neurons and the demen-tia syndrome. Data supporting this hypothesis isderived from studies of early-onset familial ADfamilies. Individuals carrying obligate genes forAD on chromosomes 1, 14 and 21 all developAD at an early age. In addition, fibroblasts trans-fected with DNA from these individuals eitherproduce excess amounts of both a-beta 1-40 and1-42 or undergo a shift in the ratio of a-beta pro-duction favouring the production of more a-beta1-42. This shift is important since a-beta 1-42 ishighly insoluble and represents the predominantform of a-beta present in the senile plaque. Indi-viduals carrying one or two apo E4 alleles alsoappear to deposit more a-beta in their brains thanindividuals lacking an E4 allele. Finally, individ-uals with Down’s syndrome carry an extra copyof chromosome 21, have three genes for amy-loid, and overproduce this protein. All developthe pathological features of AD if they live longenough. Thus, the deposition of amyloid appearsto be the central feature in the pathogenesis ofAD. However, this hypothesis awaits formal test-ing (see Selkoe25 for a review).

HISTORY OF CLINICAL AD TRIALS PRIORTO 1976

Prior to the discovery of a cholinergic defi-ciency in the brains of patients with AD, drugschosen for clinical testing in AD were chosenbased on the premise that cerebrovascular insuf-ficiency caused AD. Thus, numerous therapeu-tic modalities were tried including: vasodilators,anticoagulants, hyperbaric oxygen and cerebral

metabolic-enhancing agents known as nootrop-ics. None of these agents were proven to beefficacious for the treatment of AD, although theuse of ergoloid alkaloids, a class of agents withboth cerebral metabolic-enhancing and vasodi-lating activity, did produce a minor degree ofimprovement on subjective rating scales. Withthe demonstration of a cholinergic deficiency inAD, development of these compounds in the USlargely ceased.

GENERAL ISSUES IN AD CLINICAL TRIALS

DIFFICULTY OF DIAGNOSIS

There are many problems associated with thedevelopment of drugs and the conduct of clin-ical trials in AD. At present, there is no bio-logical marker for the disease during life. Thus,clinicians are never 100% certain of the diag-nosis. However, recent clinicopathological seriesreveal that the diagnostic accuracy for ADnow generally exceeds 80% and approaches90%, especially for cases selected for clinicaldrug trials.26 – 29 More recently, approximately15–20% of AD patients have also been foundto have extrapyramidal features. Examination oftheir brains reveals the presence of neocorticaland subcortical Lewy bodies as well as abundantplaques but few tangles indicating the presenceof dementia associated with Lewy bodies.30 – 32

Thus, in any contemporary trial, approximately10% of individuals will turn out to have anotherdisease at autopsy and approximately 20% willhave Lewy bodies accompanying the neuropatho-logical changes of AD.

PATIENTS

Patients with AD always have memory impair-ment as the core feature. However, many otherfeatures may be present such as difficultieswith language, praxis, visuospatial relations andbehaviour. There is substantial heterogeneity inthe clinical presentation of the patient popula-tion. In addition, patients decline at varying ratesover time which leads to increased variability in

Page 245: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ALZHEIMER’S DISEASE 247

response measures. These differences in patientpopulation characteristics and change over timeare largely responsible for the need to includereasonably large samples in AD clinical trials.

ENDPOINTS

The endpoints studied in AD clinical trialsdepend primarily on the question being asked inthe trial. Early trials of cholinesterase inhibitorswere designed to detect treatment–placebo differ-ences in cognition over relatively short periodsof time. For these trials, the primary endpointsconsisted of a cognitive measure to determinethe specificity of the agent on important cogni-tive endpoints and a clinical global impressionto make certain that the overall effect was suf-ficiently robust to be clinically significant. Trialsexamining agents designed to alter the rate ofdecline have generally used a difference in slopeor a difference at endpoint in cognitive and globalmeasures. One recent trial used the time to devel-opment of functional endpoints such as insti-tutionalisation, death, loss of activities of dailyliving (ADL) or progression to a more severestage of dementia as a composite endpoint.16

Finally, in primary and secondary prevention tri-als where enrolled patients are either normal ormildly cognitively impaired upon enrollment, thetime from enrollment to diagnosis of AD is oftenused as the endpoint.

SURVIVAL ANALYSIS IN AD

While the use of endpoint differences andchanges in the rate of decline are currently themost frequent approach being used in many ADstudies, the use of survival analysis is becom-ing increasingly frequent. There are a number ofinherent advantages to survival analysis in AD.First, endpoints can be real-life events rather thanartificial constructs such as the amount of changeon a psychometric test. Events such as death andinstitutionalisation require little interpretation andclearly possess face validity. Second, survivalanalysis naturally allows the combination of mul-tiple endpoints; also, any patient who reaches one

of several prespecified endpoints or who termi-nates from the study provides useful data. Drop-outs for advancing disease in a longitudinal studyare less of a statistical problem because infor-mation concerning whether or not that individualreached an endpoint may be available. Third, theuse of survival analysis allows patients who reachan endpoint (usually the diagnosis of AD) to exitthe study and seek alternative treatments with-out impacting the statistical analysis. This featuremay potentially enhance recruitment for long-term, placebo-controlled survival trials. Fourth,survival analysis allows for comparison of theentire group despite varying lengths of follow-up,i.e., no imputation. Fifth, it is usually more infor-mative unless the incidence is low. The potentialdisadvantage of survival analysis in AD trials isthat the time to reach certain endpoints (such asinstitutionalisation) is likely to be more variableand affected by social support systems than therate of change on a cognitive measure. Also, iflarge numbers of patients drop out of the studywithout reaching the defined study endpoint, thevalidity of the study may be open to question.Some examples of useful endpoints or milestonesin AD patients include death, institutionalisation,loss of basic ADLs, loss of instrumental ADLsand decline in the disease stage.

PHASE 1 TRIALS

Phase 1 trials for AD are carried out to deter-mine the general tolerability of the agent andmaximum tolerated dose. These trials commonlyutilise fewer than 100 subjects exposed to drug.Early phase 1 trials are generally single doseexposure carried out in normal elderly sub-jects to determine the maximum tolerated dose.Subsequently, the tolerability of multiple dailydoses is evaluated in brief trials lasting forone to two weeks. More recently, trials involv-ing multiple daily dosing have been carried outin early AD patients rather than normal con-trols in so-called ‘bridging studies’. The advan-tage of this approach is that if the metabolismof the drug differs between AD patients and

Page 246: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

248 TEXTBOOK OF CLINICAL TRIALS

healthy normal controls, the doses tolerated byAD patients will be found early in the drugdevelopment process. Early phase 1 studies focuson tolerability, side effects and pharmacokinet-ics. Subsequent phase 1 studies are often carriedout to look for food interactions and interac-tions with other commonly used pharmaceuti-cal products.

PHASE 2 TRIALS

Phase 2 trials are classically designed to explorethe dose range of an agent and to establish aninitial determination of efficacy. They generallyutilise 100–500 subjects. Due to the time andcost involved in the drug development process,many sponsors are currently carrying out com-bined phase 2/3 studies. Most phase 2 studiesare carried out as multi-arm, parallel, placebo-controlled trials. The maximum dose used in sucha trial is approximately one-half to two-thirdsof the maximum tolerated dose determined dur-ing phase 1 testing. Two, three or four doses aregenerally employed and compared to placebo. Insome trial designs, an arm of an already-approvedagent may be added as a positive control. Mostphase 2 trials designed for symptomatic treatmentare approximately six months in duration in orderto meet both European and US regulatory guide-lines. Endpoints in these trials are generally treat-ment–placebo differences on a cognitive scale,most commonly the Alzheimer’s Disease Assess-ment Scale–Cognitive Component (ADAS-Cog),and a Clinician’s Global Impression (CGI).

The number of subjects needed to demonstrateefficacy in clinical trials with continuous responsemeasures depends on the relationship between theeffect size sought, the standard deviation of theoutcome measure, and other parameters such astype 1 error, type 2 error, drop-out rate, drop-in rate, and base rate for the control group. Forexample, for a treatment trial seeking to detecta four-point difference on the ADAS-Cog at sixmonths assuming a standard deviation of nine,100 subjects per group would be required in atwo-arm trial with a power of 90% and an alpha(type 1 error) of 5%.

Few phase 2 trials are designed to examine theability of the agent to slow decline in AD. Suchefficacy-oriented studies are infrequently carriedout because of the need for a large sample sizeand long duration. In general, studies designed toslow decline are carried out in phase 3 clinicaltrials. A few phase 2/3 trials have been carriedout in an attempt to look for both cognitiveand global improvement and for alteration in therate of decline over the course of one year. Forone-year trials designed to slow decline in AD,using a typical outcome measure in which thestandard deviation of the rate of change is equalto the one-year decline, a typical study using80% power and two-sided testing with an alpha(type 1 error) of 5% would require 63 subjectsper group (assuming no drop-outs) comparingdrug to placebo for significance to detect a 50%decrease in rate of decline (Table 16.3). Moststudies are powered to detect 25–40% decreasesin rate of decline and therefore require largersample sizes.

PHASE 3 TRIALS

Many AD trials are currently carried out as com-bined phase 2/3 studies. Depending on the num-ber of arms, these trials generally utilise 300–600subjects per trial. A complete phase 3 pro-gramme (for an FDA new drug application) willinclude at least two pivotal trials and will utilise1000–3000 subjects to both test efficacy and todetermine the side-effect profile of the agent. Inrecent years, several different trial designs havebeen utilised. For short-term trials designed toimprove cognition, three trial designs have beenused: crossover designs, randomised control par-allel designs (RCPD) and enrichment designs

Table 16.3. Same size calculations for AD slope trial

Reduction in rateof decline (%)

Subjects pergroup (N)

Total samplesize (N)

25 251 50250 63 12675 28 56

Page 247: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ALZHEIMER’S DISEASE 249

(a notable variant of RCPD). The use of thecrossover design presents a number of problemsfor the study of AD patients. This design assumesthat there are no carry-over or period effects ofthe drug and that the treatment response is thesame in both periods. This is rarely the case. Themajor advantage of this design is in the economyof subjects because each subject acts as its owncontrol. However, because AD is not a static dis-order but the change is over time, carry-over andperiod effects are often present and this designis now rarely used. Randomised, controlled, par-allel design studies have two major advantages:the control population is concurrent (thus anyperiod effects are balanced) and uncontaminatedby drug exposure and there are no period effects.The disadvantage of this design is that more sub-jects are required to answer the central question.An enrichment design combines features of bothcrossover and parallel designs. It was used in sev-eral cholinesterase inhibitor trials including oneof the US multicentre tacrine trials.33 Patientsdemonstrating improvement after initial tacrinetreatment were withdrawn from drug, washed out,and subsequently randomised to drug or placeboand analysed for response in a double-blind par-allel phase. Non-responders were dropped fromthe trial prior to the double-blind phase, therebyresulting in enrollment of an enriched popula-tion in the double-blind phase. While this designhas many advantages including individual dosetitration for each patient, its major disadvan-tage is that all subjects are exposed to drug atsome point during the study allowing for possi-ble carry-over effects. Also, there is no placebo-treated population throughout the study on whichto base an estimate of adverse events in the con-trol population. Thus, parallel, placebo-controlleddesigns currently predominate in phase 3 symp-tomatic studies. These studies are similar tothe description of phase 2/3 symptomatic studiesdescribed above.

Slowing decline in AD remains an importantgoal since if slowing can actually be accom-plished, the treatment effect size will continue toincrease with the duration of treatment. Naturalhistory studies indicate that for most commonly

used neuropsychological instruments, the annualrate of change is approximately equal to the one-year standard deviation of change. Thus, for theADAS-Cog, the rate of change is approximately7.0 + / − 7.8 points per year.34 Examination ofrate of change studies indicates that the rate ofdecline on common instruments is reasonablyconstant during the middle stages of dementia butis slower in early and more severe dementia.35

The rate of change is predictable for groups butquite variable for individual patients. Rate ofdecline is likely to be influenced by the distribu-tion of disease subtypes such as the presence ofthe Lewy body variant of AD.36 The gene dosageof apo E4 does not appear to significantly alterthe rate of decline over one year.37 Knowledgeof the rate of decline allows for the accurate com-putation of sample sizes for studies designed toslow cognitive decline in AD. Most contempo-rary phase 3 studies examining rate of decline inAD are designed to detect group differences ofapproximately 25–35% since most clinical inves-tigators and AD advocacy groups believe thatfinding smaller effect sizes would not result inclinically useful drugs.

In addition to trials designed to slow declinein AD, one trial of vitamin E and selegiline16

utilised survival analysis in a 2 × 2 factorialdesign to examine the time to important endpointsin patients with AD. One advantage of the 2 ×2 factorial design is that two agents can betested simultaneously. In addition, interactionsbetween the two agents can be examined. Athird advantage of this design is that 75% ofsubjects are randomised to treatment therebyenhancing enrollment. The disadvantage of the2 × 2 factorial design is that negative interactions(sub-additivity) can occur thereby reducing the2 × 2 factorial design to a four-arm study whichresults in a significant loss of power in theanalysis. In the 2 × 2 factorial design study ofselegiline and vitamin E, the primary endpointwas the time to reach any of the following:death, institutionalisation, loss of basic ADLs andprogression from moderate to severe dementia.Although this study was useful in determiningthat treatment could cause a delay in the time to

Page 248: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

250 TEXTBOOK OF CLINICAL TRIALS

these endpoints, this trial design could not resolvethe issue as to whether or not the delay in the timeto endpoints resulted as a consequence of drugtreatment alone or a change in brain structureas a consequence of the potential neuroprotectiveeffect of these two agents.

In an attempt to intervene earlier in thecourse of the disease, individuals with mildcognitive impairment (MCI) are currently beingstudied. Patients with MCI generally presentwith a memory complaint. When evaluated, theydemonstrate the poor recall and rapid forgettingbut are otherwise generally normal with respectto cognitive functioning and ADLs. However,patients identified with MCI convert to AD at arate of 12–15% per year. In contrast, unselectednormal subjects over the age of 65 develop ADat a rate of only 1% per year.38 This high rate ofconversion has allowed for the development ofclinical trials of a reasonable size using the MCIpatient population. The endpoint of such clinicaltrials is the time to development of clinicallydiagnosable AD analysed with survival analysis.Several such trials have been initiated recentlyexamining the effects of vitamin E, cholinesteraseinhibitors and NSAIDs to delay the time to thedevelopment of AD in patients with MCI.

PRIMARY PREVENTION TRIALS

Numerous studies have indicated that AD isuncommon before the age of 65. However,after the age of 65, the prevalence doublesapproximately every five years reaching anaverage prevalence of over 30% by 90 yearsof age.39,40 Given that the prevalence doublesevery five years, delaying the onset of appearanceof the disease by five years would result in a50% reduction of prevalence in one generation.Delaying onset by 10 years would halve theprevalence again for a total reduction of 75%.Thus, primary prevention would yield the greatestcost benefit.

The earliest pathological changes of AD mayoccur 20–30 years before the clinical expres-sion of disease.41 Age-specific incidence of ADincreases with aging (Table 16.4).

Table 16.4. Age-specific incidence of AD

Age Incidence (%) Cases/1000/yr

65–69 0.6 670–74 1.0 1075–79 2.0 2080–84 3.3 3385–89 4.0 40

Thus, several strategies can be considered forprimary prevention trials. In the first strategy,healthy individuals would be treated in an attemptto delay the onset of disease. The main advantageof this design is that the results would be gener-alisable to other healthy individuals. Because ofthe large sample size and high costs associatedwith this strategy, enrichment strategies shouldalso be considered. These include enrolling: oldersubjects, subjects with a positive family historyof AD, and subjects at risk because of the pres-ence of an apo E4 allele. Several primary pre-vention trials for AD are currently preparing toget underway utilising some of these strategies.A second strategy would be to find subjects whoare already randomised to compounds of interestin trials for other indications to which cognitiveendpoints could be added. This has already beensuccessfully accomplished within the frameworkof the Women’s Health Initiative (WHI) whereapproximately 8000 non-demented women overthe age of 65 have been enrolled and randomisedto hormone replacement therapy (HRT) to deter-mine the effects of these agents on coronaryartery disease and osteoporotic fractures. Cog-nitive endpoints have now been added to thistrial to determine the effect of HRT on incidenceof dementia. A third approach would be to ran-domise to drug treatment a non-demented popula-tion being studied for another purpose such as anepidemiological study of cardiovascular disease.

If individuals are entered into a study entirelyon the basis of age, a large sample size willbe required using population-based, non-enrichedentry criteria. For example, for individuals ata mean age of 75 who have normal cognitionon initial evaluation, dementia will appear at arate of approximately 1.5% per annum. Over

Page 249: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ALZHEIMER’S DISEASE 251

five years, the cumulative incidence of dementiawould be 7.5%. For a two-arm study withalpha = 0.05 and power = 80%, 2000 subjectsper group would be required to detect a 30%effect size, not accounting for losses secondary todeath. With such low incidence, even these largesamples will yield only 150 cases of dementiain the untreated group. If a drug reduced theincidence of AD by 50%, 75 cases would exist inthe treated group. To allow for losses to follow-upand death, approximately 2500 subjects wouldbe needed per group for a total sample size of5000. These sample sizes would only allow foran 80% probability of detecting a 30% increasein disease incidence.

Several primary prevention trials to preventdementia or AD have been initiated with mostutilising an enrichment strategy.

1. Women’s Health Initiation Memory Study(WHIMS)–8000 normal women >65 years ofage randomised to HRT or placebo.

2. Preventing postmenopausal memory loss andAlzheimer’s disease with Replacement Estro-gens (PREPARE) study–900 normal elderlywomen with a family history of AD ran-domised to HRT or placebo.

3. Gingko study–3000 normal elderly subjects(age >75) treated with Ginkgo or placebo.

4. AD Anti-Inflammatory Prevention Trial (AD-APT)–2600 normal elderly subjects with apositive family history of AD randomised toone of two anti-inflammatory drugs or placebo.

MISCELLANEOUS ISSUES

PRIMARY, SECONDARY AND TERTIARYTREATMENTS

Treatment of existing symptoms represents ter-tiary treatment and is representative of all ofthe currently approved drugs for AD. Secondarytreatment refers to treatment when minimal butnot full-blown disease is present. This is bestexemplified by the treatment of patients with MCIwho have minimal symptomatology. Finally, pri-mary prevention refers to intervention before any

clinical symptoms of disease appear. This will bethe ultimate goal in treating AD.

SYMPTOMATIC VS STRUCTURAL CHANGE

Treatments that slow progression in AD mayproduce underlying structural change within thebrain. One could logically reason that if treatmentcaused slowing of the rate of decline, it mayhave a permanent underlying effect on the brain.Unfortunately, a change in slope in a measureof cognition is insufficient evidence to provethat an underlying brain structural effect hasoccurred. Additional trial design manoeuvresmust be utilised. The most widely utilisedmanoeuvre is that of withdrawal. If the effectinduced by the drug is purely symptomatic, eventhough individuals are better at the end of thetrial, they should decay back to the curve ofuntreated subjects when the drug is withdrawn(Figure 16.1). This was clearly demonstratedin the 26-week Aricept trial in which patientswere removed from Aricept and cognitive scoresdropped to placebo levels after a six-weekwash-out.42 An alternative clinical manoeuvreto demonstrate the same effect would be arandomised start design. In this experiment, halfthe group is started on drug and half started onplacebo. If the drug has a purely symptomaticeffect, individuals begun on placebo then crossedover to active drug will ‘catch up’ to those startedon drug at the beginning of the trial. If the drughas a structural effect on the brain, individualsstarted on drug later in the course of the treatmentperiod will fail to ‘catch up’.

In addition to various trial manoeuvres, thedemonstration of a structural change within thebrain could be used to support a structural effectfor a therapeutic agent. For example, in a rateof decline trial, the maintenance of hippocampalvolume or the maintenance of a synaptic num-ber, as demonstrated on an imaging study, wouldserve as a direct demonstration that a pharma-cological agent produced a difference in brainstructure. Finally, a biochemical marker could beutilised. For example, in Parkinson’s disease ifa ‘dopaminergic sparing’ agent were developed,

Page 250: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

252 TEXTBOOK OF CLINICAL TRIALS

Randomizedphase

Active

Randomizedphase

Active

Placebophase

Disease-modifyingeffect

Placebo

Symptomaticeffect

(a)

Randomized withdrawal designP

erfo

rman

ce

Per

form

ance

Time Time

(b)

Randomized start design

Activephase

Placebo

Symptomaticeffect

Disease-modifyingeffect

Figure 16.1. Randomised start and randomised withdrawal

it might be reflected in higher homovanillic acidlevels in the cerebrospinal fluid after drug with-drawal at the end of the treatment period. For AD,a clear-cut biochemical marker does not yet exist.

N-OF-ONE DESIGN

This manoeuvre is not particularly useful for drugdevelopment but is often used in the clinical set-ting to determine continued response to drug.An example of an N-of-one design in an ide-alised setting would be to answer the question ofwhether or not a patient on a cognitive-enhancingagent, such as a cholinesterase inhibitor, is stillresponding to drug. This question is an impor-tant one since AD patients continue to declinewhile on treatment. After one or two years ontreatment, the clinician is faced with the decisionas to whether or not to continue treatment. Inan idealised version, a patient could be blindlycrossed over to placebo to examine for a with-drawal effect. If the patient was on a symp-tomatic treatment and continuing to respond, adecline in cognition should be observed follow-ing drug withdrawal. This manoeuvre could thenbe followed by blindly restarting the patient on

drug and looking for improvement. In reality, thismanoeuvre is often carried out in the clinic withsimple withdrawal of the agent, retesting, rein-troduction and retesting but without the use ofplacebo. While still useful, the lack of a blindedcrossover to placebo limits the interpretation ofthe results of this manoeuvre.

ETHICAL ISSUES

At present, available drugs to treat AD are symp-tomatic. Several agents are currently approvedand before enrolling patients in any clinical trial,disclosure and discussion of these agents with thepatient and their caregiver is necessary. In addi-tion, vitamin E and selegiline have been reportedto delay the time to endpoints in moderatelyadvanced AD patients. The results of the vitaminE study also need to be discussed with patientsbefore enrolling them in clinical trials since theuse of both cholinesterase inhibitors and vitaminE is currently the standard of care in the US.

A vigorous debate has emerged in the USregarding the ethics of placebo-controlled tri-als. Some have argued that placebo-controlled

Page 251: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ALZHEIMER’S DISEASE 253

trials are unethical since they deny patients theuse of approved agents while enrolled in thetrial. Individuals espousing this viewpoint claimthat drug development can continue using add-on studies or by demonstrating equivalence ofa new agent to an existing agent. Others arguethat placebo-controlled trials are ethical since cur-rently available agents are entirely symptomaticand do not alter the underlying course of the dis-ease. In addition, patients are free to not enterplacebo-controlled trials if they wish to receivecurrently approved medications. There is con-cern that comparator studies may result in thedevelopment and licensing of many marginal orineffective agents since currently approved agentsare of marginal efficacy and may produce nega-tive results in some clinical trials. It is possiblethat if a new agent were compared to an approvedagent and the true effect size in the approvedagent was minimal, the newly tested drug wouldbe approved even though the effect size maynot have been sufficient to warrant approval ina placebo-controlled trial. A major drawback ofusing active control groups is the requirementof larger sample sizes. At present, most newagents being tested in the US are compared toplacebo. If and when agents are developed thatcan clearly alter the course of the disease, long-term, placebo-controlled trials will become bothunethical and socially unacceptable.

REFERENCES

1. Alzheimer A. Uber eine eigenartige Erkrankungder Hirnrinde. Allgem Z Psychiatr-Gerich Med(1907) 64: 146–8.

2. Kraepelin E. Psychiatrie: ein Lehrbuch fur Studi-erende und Artze. Leipzig: Verlag v. JohannAmbrosius Barth (1910).

3. Dirvy P. Etude histochemique des plaques seniles.J Belge Neurol Psychiat (1927) 27: 643–57.

4. Terry RD. Neurofibrillary tangles in Alzheimer’sdisease. J Neuropathol Exp Neurol (1963) 22:629–42.

5. Kidd M. Paired helical filaments in electronmicroscopy of Alzheimer’s disease. Nature (1963)197: 192–3.

6. Blessed G, Tomlinson BE, Roth M. The associa-tion between quantitative measures of dementia

and of senile change in the cerebral grey mat-ter of elderly subjects. Br J Psychiat (1968) 114:797–811.

7. Katzman R. The prevalence of malignancy ofAlzheimer’s disease. Arch Neurol (1976) 33:217–18.

8. Davies P, Maloney AJF. Selective loss of cen-tral cholinergic neurons in Alzheimer’s disease.Lancet (1976) 2: 1403.

9. Bowen DM, Smith CB, White P, Davison AN.Neurotransmitter-related enzymes and indices ofhypoxia in senile dementia and other abiotrophies.Brain (1976) 99: 459–96.

10. Perry EK, Perry RH, Blessed G, Tomlinson BE.Necropsy evidence of central cholinergic deficitsin senile dementia. Lancet (1977) 1: 189.

11. Glenner GG, Wong CW. Alzheimer’s disease: ini-tial report of the purification and characteriza-tion of a novel cerebrovascular amyloid pro-tein. Biochem Biophys Res Commun (1984) 120:885–90.

12. St. George-Hyslop PH, Tanzi RE, Polinsky PJ,et al. The genetic defect causing familial Alzhei-mer’s disease maps on chromosome 21. Science(1987) 235: 885–90.

13. Corder EH, Saunders AM, Strittmatter W, et al.Gene dose of apolipoprotein E type 4 allele andthe risk of Alzheimer’s disease in late onsetfamilies. Science (1993) 261: 921–3.

14. Schellenberg GD, Bird TD, Wijsman EM, et al.Genetic linkage evidence for a familial Alzhei-mer’s disease locus on chromosome 14. Science(1992) 258: 668–71.

15. Levy-Lahad E, Wijsman EM, Nemens E, et al. Afamilial Alzheimer’s disease locus on chromo-some 1. Science (1995) 269: 970–3.

16. Sano M, Ernesto C, Thomas RG, et al. A con-trolled trial of selegiline, α-tocopherol, or bothas treatment for Alzheimer’s disease. New EnglJ Med (1997) 336: 1216–22.

17. Kawas CH, Katzman R. Epidemiology of demen-tia and Alzheimer’s disease. In: Terry RD, Katz-man R, Bick KL, Sisodia SS, eds, Alzheimer Dis-ease. Philadelphia: Lippincott Williams & Wilkins(1999) 95–114.

18. Heston LL, Mastri AR, Anderson VE, White J.Dementia of the Alzheimer type. Clinical genetics,natural history, and associated conditions. ArchGen Psychiat (1982) 38: 1085–90.

19. Heyman A, Wilkinson WE, Stafford JA, HelmsMJ, Sigmon AH, Weinberg T. Alzheimer’s dis-ease: a study of epidemiological aspects. Ann Neu-rol (1984) 15: 335–41.

20. Amaducci LA, Fratiglioni L, Rocca WA, et al.Risk factors for clinically diagnosed Alzheimer’sdisease: a case–control study of an Italian popu-lation. Neurology (1986) 36: 922–31.

Page 252: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

254 TEXTBOOK OF CLINICAL TRIALS

21. Sjogren T, Sjogren H, Lindgren GH. Morbus Alz-heimer and morbus Pic: a genetic, clinical, andpatho-anatomical study. Acta Psychiat NeurolScand (1952) 8: 9–152.

22. Larsson T, Sjogren T, Jacobsen G. Senile demen-tia: a clinical, sociomedical and genetic study.Acta Psychiat Scand (1963) 39(Suppl 167): 1–259.

23. Heston LL. The genetics of Alzheimer’s disease.Arch Gen Psychiat (1977) 34: 976–81.

24. Terry RD, Masliah E, Hansen LA. The neuro-pathology of Alzheimer’s disease and the struc-tural basis of its cognitive alterations. In: TerryRD, Katzman R, Bick KL, Sisodia SS, eds, Alz-heimer Disease. Philadelphia: Lippincott Williams& Wilkins (1999) 188–203.

25. Selkoe DJ. Molecular pathology of Alzheimer’sdisease: the role of amyloid. In: Growdon JH,Rossor MN, eds, The Dementias Boston: Butter-worth Heinemann (1998) 257–84.

26. Wade JPH, Mirsen TR, Hachinski VC, Fish-man M, Lau C, Merskey H. The clinical diagno-sis of Alzheimer’s disease. Arch Neurol (1987) 44:24–9.

27. Morris J, McKeel D, Fulling K, Torack R, Berg L.Validation of clinical diagnostic criteria for Alz-heimer’s disease. Ann Neurol (1988) 24: 17–22.

28. Burns A, Luthert P, Levy R, Jacoby R, Lantos P.Accuracy of clinical diagnosis of Alzheimer’sdisease. Br Med J (1990) 301: 1026.

29. Galasko D, Hansen L, Katzman R, Wiederholt W,Masliah E, Terry R, Hill LR, Lessin P, Thal LJ.Clinical–neuropathological correlations in Alz-heimer’s disease and related dementias. ArchNeurol (1994) 51: 888–95.

30. Dickson D, Davies P, Mayeux R, et al. DiffuseLewy body disease. Acta Neuropathol (1987) 75:8–15.

31. Hansen L, Salmon D, Galasko D, et al. The Lewybody variant of Alzheimer’s disease: a clinical andpathological entity. Neurology (1990) 40: 1–8.

32. Perry R, Irving D, Blessed G, Fairbairn A, Per-ry E. Senile dementia of Lewy body type. J NeurolSci (1990) 95: 119–39.

33. Davis K, Thal L, Gamzu E, et al. A double-blind,placebo-controlled multicenter study of tacrine forAlzheimer’s disease. New Engl J Med (1992) 327:1253–9.

34. Thal LJ, Carta A, Clarke WR, et al. A one-yearmulticenter placebo-controlled study of acetyl-l-carnitine in patients with Alzheimer’s disease.Neurology (1996) 47: 705–11.

35. Stern RG, Mohs RC, Davidson M, et al. A lon-gitudinal study of Alzheimer’s disease: mea-surement, rate, and predictors of cognitivedeterioration. Am J Psychiat (1994) 151: 390–6.

36. Klauber MR, Hofstetter CR, Hill LR, Thal LJ.Patterns of decline in the Lewy body variantof Alzheimer’s disease. Arch Psychiat ShanghaiNews (1992) 4(Suppl): 50–3.

37. Growdon JH, Locascio JJ, Corkin S, Gomez-Isla T, Hyman BT. Apolipoprotein E genotypedoes not influence rates of cognitive declinein Alzheimer’s disease. Neurology (1996) 47:444–8.

38. Petersen RC, Smith GE, Waring SC, Ivnik RJ,Tangalos EC, Kokmen E. Mild cognitive impair-ment: clinical characterization and outcome. ArchNeurol (1999) 56: 303–8.

39. Rocca WA, Hofman A, Brayne C, et al. Fre-quency and distribution of Alzheimer’s diseasein Europe. A collaborative study of 1960–1990prevalence findings. Ann Neurol (1991) 30:381–90.

40. Evans DA, Funkenstein HH, Albert MS, et al.Prevalence of Alzheimer’s disease in a communitypopulation of older persons. Am Med Assoc J(1989) 262: 2551–6.

41. Braak H, Braak E. Diagnosis of Alzheimer’s dis-ease: development of cytoskeletal changes andstaging of Alzheimer related intraneuronal pathol-ogy. Neurobiol Aging (1994) 15: S141.

42. Rogers SL, Farlow MR, Doody RS, Mohs R, Fri-edhoff LT. A 24-week, double-blind, placebo-controlled trial of donepezil in patients withAlzheimer’s disease. Donepezil Study Group.Neurology (1996) 50: 136–45.

Page 253: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

17

Anxiety Disorders

M. KATHERINE SHEAR1 AND PHILIP W. LAVORI2

1Western Psychiatric Institute and Clinic, Pittsburgh, PA 15213, USA2VA Medical Center, Menlo Park, CA 94025 2539, USA

INTRODUCTION

Anxiety disorders are the most prevalent psychi-atric conditions in the community with a life-time community prevalence of 20–30%.1 Thesedisorders can be seriously impairing, reducingquality of life and causing disability. Recentstudies suggest some forms of anxiety are asso-ciated with early mortality. Many who sufferfrom anxiety disorders have other serious medicalproblems, such as depression, pulmonary disease,cardiovascular illness and neurological condi-tions. Prevalent and debilitating, anxiety disor-ders are serious, persistent illnesses that warranttreatment. Clinical trials are needed to establishefficacy of promising interventions and to deter-mine the best ways to deliver efficacious treat-ments in different contexts.

Methods for conducting efficacy trials in anx-iety disorders have evolved over the past fewdecades. Reliable diagnostic instruments andsymptom severity scales have been developedand tested. Strategies for medication adminis-tration have been identified and manuals writ-ten to standardise these procedures. Cogni-tive behavioural treatment methods have been

specified and explained in manualised format.Treatment training and adherence measures areavailable. These methodological advances meanthat studies of the efficacy of new interventionscan be conducted efficiently and with confidence.

Given the availability of efficacious treatments,researchers are now turning their attention tostudies that test these interventions in the commu-nity settings where they will be used, and in clin-ical contexts (such as maintenance of response)that go beyond the phase of acute illness that isthe focus of most efficacy studies. With this shiftin focus, new methodological problems appear.Generic problems that need to be addressed indesigning such studies, often known as effec-tiveness studies, have been described in theliterature.2 In this chapter we discuss method-ological issues pertaining to effectiveness stud-ies of anxiety disorders. We identify some keyfeatures of these disorders and consider the prob-lems they create for study questions and studydesign. Solutions to methodological problems inclinical trials often require trade-offs, and theproblems we discuss are posed in this way. Weprovide our view of the best way to manage these

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 254: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

256 TEXTBOOK OF CLINICAL TRIALS

problems, and in some cases, make suggestionsfor methodological innovations.

Clinical researchers regularly make method-ological choices regarding subject recruitment,selection and characterisation of subjects, proce-dures for enrollment, assignment to experimen-tal group, experimental manipulation, outcomeassessment and follow-up process. Methods cho-sen will place specific limits upon what can belearned from a study. Thus, it is fundamental thatstudy methodology be driven by the question theresearcher seeks to answer. However, unlike effi-cacy studies, in effectiveness studies, the questionis not always clear. Defining the study question isthe first problem for the effectiveness researcher.Most experienced clinical researchers are expertin conducting efficacy studies to answer the ques-tion ‘Does a new treatment produce better resultsthan a control condition for a well-defined con-dition, under tightly controlled circumstances ofuse?’ Both psychosocial and pharmacologicaltreatment researchers have successfully under-taken such studies, and thus are poised to testefficacy hypotheses for new interventions.

The field of effectiveness research is farless developed. Investigators move forward inunmarked terrain as they decide upon the mostimportant next questions. For example, a study ofLong Term Strategies in the Treatment of PanicDisorder (MH045963-6) currently in progressunder the direction of the authors is designed toanswer the question ‘Should non-responders toan initial trial of CBT receive medication or anadditional dose of CBT?’ This important ques-tion is not addressed by efficacy studies of eithermedication or psychotherapy. Having articulatedsuch a question, decisions must be made aboutwhat methodological approach should be used,and what problems to anticipate. For example,Principal Investigators of the Long Term Strate-gies Panic study had to confront the issue ofwhat the right duration of the initial CBT trialmight be, and what level of non-response to ini-tial trial should be chosen to define intake into therandomised maintenance trial. Decisions such asthese are not trivial, since neither the most impor-tant questions, nor the best way to approach a

given question, is obvious. Given this ambigu-ity, we suggest anxiety disorder researchers mightbe guided by some of the key features of thesedisorders (Table 17.1). We discuss the method-ologic relevance of five such features: (1) anxietydisorders are characterised by high communityprevalence; (2) diagnostic boundaries are ambigu-ous, both between pathological and normal anx-iety and among the different anxiety disorders;(3) phobic fear and avoidance is prevalent in thesedisorders; (4) anxiety disorders are treatable usingeither medication or cognitive–behavioural inter-ventions; and (5) anxiety disorders frequently co-occur with other disorders. Each of these featureswill affect decisions about the research questionand the choice of methods.

FEATURES OF ANXIETY DISORDERS THATIMPACT STUDY METHODOLOGY

HIGH COMMUNITY PREVALENCE

Anxiety disorders are highly prevalent in thecommunity. The high prevalence means there aremany patients in need of treatment. Epidemiolog-ical studies document that most of these patientsdo not present for care in a specialty mentalsetting.3 Instead, they can be found in a range ofcommunity service settings. Even among thosewho do seek specialty mental health treatment,only a subset will be enticed to an academic med-ical clinic, regardless of the incentives provided.For those who seek treatment at a communitymental health setting, usual practice diagnosticprocedures cannot be relied upon to identify anx-iety disorders.4 It is clear that we need to knowhow to recognise and treat the people with anx-iety disorders who most researchers never see.Put another way, we need to study those whodo not participate in studies. This obvious para-dox underscores the principle that effectivenessstudies will not be straightforward.

The job is not simply a matter of runningan efficacy study in one or more communitysettings. Doing so would be important only ifthere are serious questions about whether patientsin such settings respond to proven treatments.

Page 255: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 257

If this is the case, it is important to framethe specific questions the study should answer,based on the reasons for predicting responsedifferences. For example, if researchers suspectseverity is an important treatment moderator,it might be important to conduct a standardrandomised efficacy trial in settings with patientsof varying severity. Likewise, some patients haveco-occurring symptoms or syndromes, such asserious medical illness, along with an anxietydisorder such as panic disorder. It might makesense to recruit patients from medical clinics intoan efficacy trial in order to study the influenceof the medical illness on the treatment of thetarget condition. Other examples of parametersthat might be predicted to render uncertainoutcome with a proven efficacious treatmentinclude organisational features of the setting orsocio-economic status of the patient. Specificconsiderations like these ought to drive theimportant design decisions such as where the

research will be conducted and in how manydifferent kinds of settings.

A different kind of research question mightbe driven by the subject paradox (how tostudy patients who do not participate in stud-ies), for example ‘What is the most success-ful way of recruiting and engaging individualswho do not seek treatment in a research clinic?’The investigator might compare a public edu-cation programme to a professional educationalintervention. Or, the research aim might focus onevaluating alternative screening strategies in dif-ferent settings. Another example might be ‘Howmuch diversity of setting should be incorporatedinto a study?’ In addressing this question theresearcher might investigate the variation of clin-ical presentation, treatment acceptance, or out-come across different ethnic or socio-economicgroups. Alternatively, the investigators mightexamine the effect of different organisationalstructures or the impact of the organisational

Table 17.1. Implications of features of anxiety disorders for research design

Feature Research issues

1. High community prevalence Settings: which and how manyRecruitment: reaching the unstudied patientAssessment: measuring outside the research clinicAwareness: bridging the patient’s knowledge gapHuman subjects protection: make or buyTechnology: the right machine for the settingComorbidity: adjusting to increased variability

2. Poorly defined nosologic boundaries Normal vs disordered: a question of excessDifferential diagnosis: core symptoms overlapPluripotency: treatments with broad efficacyDouble counting: symptomsEndpoints: ranking outcomesAiming low: focus on preventing relapseStability: the time frame for outcome

3. Phobic fear and avoidance Evasion: measuring the avoidant subjectFear: recruiting the anxious subjectIdentification: personal choice vs avoidance

4. Discordant models of the disorders Acknowledgement: both models have treatment successesControl: paying attention to the other interventionComparing: accommodating preference for modalityTargets: agreeing on the goalsDissemination: thinking ahead about the audience

Page 256: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

258 TEXTBOOK OF CLINICAL TRIALS

climate5 on outcomes or study the implementa-tion of organisational interventions to optimisethe likelihood of dissemination of a treatment.6

Whatever studies are done, it is clear that foranxiety disorders, researchers need to extend theirreach if they seek to make an impact on the greatmajority of individuals who suffer from theseconditions. Methods need to be devised to studypatients in primary care and specialty medicalsettings, dental offices, churches, schools, com-munity centres, and a range of other communityservice or support settings (e.g. domestic vio-lence or homeless shelters, or even highly utilisedcommercial operations such as supermarkets7 ordepartment stores). The use of such settings todeliver care may be particularly relevant forpatients with anxiety disorders who have pho-bic restrictions, and are unable to travel outsidea restricted area.

Designing a new clinical trial for an anxi-ety disorder outside of an established researchcentre raises other problems. Investigators makedeliberate decisions about where to recruit, assessand treat patients, as well as whether to carryout any of these activities in more than onekind of setting. Existing clinical research meth-ods for recognising and recruiting affected indi-viduals may be too cumbersome to work in asetting where research activities are not custom-ary. For example, a busy primary care practice oreven a mental health facility may not be orientedtowards identifying and tracking individuals whomeet criteria for anxiety disorders. Frequentlystaff in such places are very busy, very dedi-cated and sometimes opinionated about what isbest for their patients. The researcher who comesto study usual practice may be seen as challeng-ing the skills, competence or even integrity of thestaff. Still, recent studies in primary care8 havesucceeded in overcoming these barriers and havedone much to provide information to inform pro-cesses to optimise care.

Protocol-driven treatments face additional bar-riers to acceptance in settings other than theresearch clinic. Assessment of outcomes is hardenough in a research clinic; assuring goodfollow-up and reliability of measurement in

non-traditional research settings will tax the inge-nuity of the next generation of effectivenessresearchers. Recent work using adaptive testingmethods holds promise as a technique. Giventhese challenges, it is tempting to suggest thatresearchers concentrate on one research setting,and hope or assume that results will generalise.But the decision to limit the setting has uncer-tain implications for interpreting and generalisingresults. There is a trade-off between generalis-ability and the cost of dealing with heterogeneityof setting. These costs must be borne, and themethodological challenges met, in order to pro-duce research-grade answers to the question ofeffectiveness.

In working in almost any non-mental healthcommunity setting, the investigator must addressstigma and self-criticism that can be associ-ated with the idea of having a mental disorder.Researchers need to take steps to minimise thedifficulties that may be caused by identifying aperson as ill, especially when the person in ques-tion has not already identified their symptoms asproblematic. In such a situation, the news maycome as an unwelcome surprise, or may be per-ceived as insulting or embarrassing. The newlydiagnosed individual may feel suddenly stigma-tised and this may lead to a rejection of thediagnosis and/or the researcher bearing the news.There may be anger or discouragement towardsthe community setting in which the person soughthelp. The researcher needs to be sensitive to thesepossibilities and proactive in dealing with unto-ward reactions associated with identification ofan anxiety disorder. For example, if there is adecision to recruit subjects in a non-psychiatricsetting such as a church or supermarket, theresearcher would need to include an introductoryphase of the work that addresses fears and stigmaassociated with a diagnosis. This can be done in avariety of ways. A community educational phasemight be undertaken prior to initiation of recruit-ment. Individual or group consciousness raisingmight be offered. Focus groups are a very usefulstrategy being increasingly used by researchers.In this situation, small groups of individuals withdifferent types of anxiety might be invited to

Page 257: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 259

participate in a focus group. Participants are paidand the group leader might guide the group indiscussion of topics such as the meaning of hav-ing an anxiety disorder, the perceived responseof others, including family, friends and/or thecommunity at large. A focus group might alsobe asked to discuss how researchers might bestapproach undiagnosed people in the communitywho suffer from these disorders, or the groupmight be asked how to best present treatmentoptions, or how to explain and encourage par-ticipation in research. Thus armed, the researcherwill be more successful in recruiting and retainingsubjects for a clinical trial. An example of a veryinnovative approach to the problem of commu-nity recruitment9 utilised an intensive telephoneengagement strategy in which mothers of innercity minority children were invited to identifyand problem-solve an important difficulty theywere experiencing. Only after the intake recruiterhad successfully helped with this practical prob-lem did they explain the availability of servicesfor other kinds of problems. This approach wasshown to significantly increase attendance at thefirst clinic appointment. Whatever the approach,it is clear that the prospective patient researchvolunteer must be given opportunities to under-stand their anxiety symptomatology as a treat-able condition underlying what may be just anawareness of limitation or fear. These individu-als further need to decide for themselves whichtreatment programme they wish to access. Theresearcher needs to present a clinical trial inthis context.

Sometimes stigma can be best addressed at thelevel of the service provider–such as a primarycare physician, or administrators and serviceproviders in different kinds of agencies. In orderto access patients in a given facility it may bevery important to first understand the headachesof the facility administrators. A researcher whotakes the time to both identify and respond tothe problems faced by those attempting to delivercare will be rewarded with a much higher level ofenthusiasm and support for the research project.Researchers in the field of services research haveunderstood and successfully accomplished this

kind of work.10 Partnering with administrators indifferent service agencies to find ways to improvetheir efforts is likely to provide easier access tosubjects and better support for implementation ofstudy procedures. Careful attention to such issuescan determine the feasibility of the study.

Assessment Strategies in Community Settings

The standard research diagnostic interview andfollow-up batteries were designed to achievecareful, reliable descriptions of different well-specified phenotypes of psychiatric illness. Whilehighly successful in meeting this goal, suchinstruments have not been designed to maximiseefficiency and minimise patient and staff burden.These extensive and time-consuming inventorieswill not survive transplantation into a primarycare setting, a hospital emergency room or adental office, let alone a church or school. Instead,radically simplified tools must be developed thatutilise innovative statistical and psychometricmethods (e.g. adaptive testing) and/or studysample sizes must be increased to compensatefor extra variance.

The assessment strategy used in a communitysetting may need to be altered in other ways aswell. No matter how prevalent an anxiety disor-der may be in the community, it will be lower inthe community setting, compared to the preva-lence in the enriched intake stream of a specialtyanxiety clinic. The odds on a disease may easilyvary fivefold or more from clinic to commu-nity. Given that the specificity and sensitivity ofthe diagnostic instruments will be no better inthe community setting, and may well be worse,the ‘Bayes factor’ of the test (sensitivity dividedby 1 − specificity) will be smaller in the commu-nity setting. For example, if the sensitivity andspecificity both decline from 90% to 80% thenthe Bayes factor declines from 0.9/(1 − 0.9) = 9to 0.8/(1 − 0.8) = 4. If both the prior odds ofdisease and the Bayes factor are lower, the posi-tive predictive value will also be lower (the oddsof disease given a positive test are just the priorodds of disease times the Bayes factor). In thenumerical example above, given only a fivefold

Page 258: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

260 TEXTBOOK OF CLINICAL TRIALS

difference in prior odds, the posterior odds on dis-ease given a positive test may vary by an orderof magnitude from clinic to community, mak-ing interpretation of intake diagnosis problematic.Multi-step diagnostic procedures may be neces-sary to avoid over-diagnosis.11

Research Recruitment

Study subjects volunteer to participate in research.In a clinical trial, the manner of presentation oftreatment options may make or break a study.The highly selected population of patients whopresent to an academic centre often come pre-conditioned to the value of research protocols,and may have specifically sought out the clinicbecause of its reputation as a research centre.The potential research volunteer in a communitysetting has not voted for research ‘with hisfeet’, and may need a gradual, informative andupbeat approach, to accept the idea of protocol-driven treatment and randomisation. Institutionalreview boards may well regard placebo controlas especially unattractive in such a context, andmay also be concerned about ‘overselling’ thepotential benefits of research to patients. Yet,there is reason to believe that the patient in thecommunity context may be the one with the mostto gain from participation in research, becauseof the likelihood that her illness would otherwisego unrecognised, or the equally disadvantageouslikelihood of inadequate treatment.

Context-Relevant Treatment Protocols

To optimise study results, strategies must bedeveloped for providing protocol treatments ina context-relevant manner. This may includeadjusting to the absence of third-party payers, ormaking use of setting-specific para-professionalpersonnel for some of the interventions. Or, itmay mean incorporating ‘escalation’ strategiesinto the treatment protocol, so that subjectsidentified with substantial needs are transferredto a more traditional setting.

Practical and Administrative Issues

Human subjects review may need to be coordi-nated among several kinds of organisations. Somemay be willing to enter into agreements to acceptthe investigator’s home institutional review, oth-ers may need to develop their own review pro-cesses and obtain Federal-Wide Accreditation.Template agreements that have been shown towork would be a valuable resource.

The investigator must choose methods of datacapture and processes for the data edit cycle thatwork in diverse settings at sites that are distantfrom the coordinating institution. Data monitor-ing, correction of errors and tracking of follow-upare all affected. Technological limitations needto be respected. For example, fax-based meth-ods may be more easily deployed than internet-based methods, especially in settings where a faxmachine is already in use. On the other hand,as personal digital assistants become ubiquitous,patient follow-up may be individualised, remoteand remotely cued. We can imagine technol-ogy that rings a telephone number or sends aninstant message, asking for self-report follow-up, and that can schedule and connect the sub-ject with a live interviewer, all implemented onthe same small wireless device, that might becheap enough to give away as a free incentiveto participation.

Despite the difficulties associated with export-ing clinical trials to the community, it is clearthat the next generation of clinical research inanxiety disorders needs to be rolled out into thesettings where individuals with these disorderslive and work. In addition to the many issuesrelated to the setting of a study in the commu-nity, there are many design considerations relatedto which patients should be included in a givenstudy. Patients in different community settingsare likely to be heterogeneous in different ways,and to differ from patients who seek treatment attraditional research clinics. Existing studies doc-ument a high rate of comorbidity among anxietydisorders, between anxiety disorders and depres-sion, and between anxiety disorders and medicalillnesses. There is also comorbidity of anxiety

Page 259: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 261

disorders with for example psychotic disorders12

and substance abuse.13,14 Such comorbidity mayincrease the likelihood that a patient seeks treat-ment at research clinics, and therefore it is pos-sible that studies in the community will haveto deal with less comorbidity than studies inthe research clinic. Nevertheless, an effectivenessresearcher must decide how to manage comorbid-ity. There are many consequences of decisionsto include or exclude comorbidities from studyeligibility criteria. There are a variety of assess-ment considerations that are different in comorbidversus non-comorbid subjects. Symptoms of co-occurring depression or substance abuse may bedifficult to disentangle from anxiety symptoms.Many medical disorders produce symptoms ofautonomic nervous system activation, as do anx-iety disorders. Such medical comorbidity may beespecially likely in primary care or medical clinicsettings. The trade-off between heterogeneity andits attendant increase in measurement variance,and homogeneity and its attendant restrictionson generalisability, must be carefully considered.In addition, rigid exclusion criteria may be lessacceptable in community settings than in the spe-cialty research clinic; patients who are surprisedby a diagnosis may be disappointed if they areruled out from studies by being ‘too compli-cated’. An alternative for the researcher is tosimply accept comorbidity and heterogeneity ofthe population and evaluate a treatment that tar-gets a specific symptom, behavioural pattern orsymptom cluster, without regard to the contextin which it occurs. To make this decision theresearcher accepts the ‘noise’ this will cause inthe system and powers the trial accordingly.

Other considerations related to patient het-erogeneity include the fact that illness severityand typical background treatment history mayvary across settings. Patients in some settingsmay have already received multiple treatments,while in other settings they may be treatmentnaıve. Given the findings from multiple studiesthat have documented that affective and anxi-ety disorders are under-recognised and under-treated in the community, it is likely that patientsrecruited from non-mental health settings will

have had little exposure to proven efficacioustreatment. Often such patients have sought helpfrom clergy or other informal sources. In the caseof anxiety disorders, the awareness of the ‘irra-tionality’ of symptoms often means these individ-uals suffer in silence, embarrassed to reveal theirself-perceived defects. Such patients are oftenenormously relieved when they learn that theirdisorder is understood. Even when treatment hasapparently been offered, it may be less vigorousthan the versions that have been proven effi-cacious in clinical trials. Inadequate doses anddurations of pharmacotherapy may be the rule,and specific psychotherapies may be offered inname only. It may be particularly important notto assume (for example) that a patient has demon-strated a lack of response to treatment, and there-fore be ruled out as ineligible.

If patients are identified in settings other thanthe specialty clinic, they may not view theiranxiety disorder (which may be news to them)as the main problem they should be concernedwith (along with their hypertension, maculardegeneration, current spousal abuse or arthritis).They may be unwilling to make accommodationsin schedules and may have needs for unusualavailability of research staff in time and space.Some patients may not understand the usualstandard procedures for treatment in a mentalhealth clinic. They may need to be approachedin an accommodating way.

POORLY DEFINED NOSOLOGICBOUNDARIES

A second feature of anxiety disorders is thatthe boundaries between normal and pathologi-cal anxiety and among the pathological disordersare ill defined. Unlike most psychiatric disorders,the symptoms that comprise the diagnostic cri-teria for anxiety disorders are recognisable innormal people every day. The pathological stateis defined by excess. However, the definition ofexcess is not precise. Because anxiety is a nor-mal emotion, it is not always clear where theboundary between normal and pathological lies.This is particularly true in the context of stressful

Page 260: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

262 TEXTBOOK OF CLINICAL TRIALS

life events and ongoing difficulties. The bound-ary with normal may arise in defining a clinicalpopulation in need of treatment. Boundary issuesare also relevant to treatment targets and defini-tion of remission. In general, there is no consen-sus on what comprises remission of an anxietydisorder. We discuss this problem and suggestsome ways it might be addressed. The problemof the boundary between normal and pathologicalis not a question raised only in the area of anx-iety disorders, but rather is a continued questionin the ongoing discussion related to definitionsof psychopathology. A recent paper15 provides agood summary of current issues. As these authorspoint out, it is also relevant to consider the rela-tionship between mental and physical disorders.These considerations are important for clinicalresearchers to keep in mind but a detailed dis-cussion of the various issues is beyond the scopeof this chapter.

However, as noted above, anxiety is a normalemotion, and so its pathological state must bedistinct from normal variation. It is best toexperience anxiety in moderation. While anxietycan be disabling in excess, a deficiency of anxietycan also be impairing. The question of howmuch anxiety is optimal is not a philosophicalone. Rather, it is one of the conundrums thatcurrently face the clinical trials investigator.Namely, the investigator must decide how muchsymptom relief is optimal and how much issufficient to declare a meaningful response totreatment. Given that anxiety is normal, isthere some expected floor for the intensity ofanxiety symptoms, or is symptomatic anxietyqualitatively different?

Another design question relates to the levelof anxiety that results in optimal long-termoutcome. Still another relates to the definitionof remission of a given disorder. The fieldhas not reached consensus on how to defineremission for any of the anxiety disorders. Thisis a critical methodological problem that needsto be addressed. Investigators need to considerwhether there is a way to overshoot the mark oris less always more? This is a serious question,as researchers are not agreed upon whether it

is useful to have some anxiety symptoms inorder to keep coping functions operative and/orprovide ‘toughening up’ experiences. Perhapssome continued symptomatology is a good ideato encourage continued exposure. The continuedpresence of low-level symptoms may increasethe chances that the patient does not becomecomplacent16 and/or provide opportunities toconfirm the absence of more severe symptoms.

On the other hand, since anxiety disorders areclearly debilitating, perhaps it is best to eliminatesymptoms as fully as possible. Perhaps if weleave residual symptoms, this indicates that wehave not eliminated the underlying vulnerabilityand relapse will be more likely. Ideally, wewould like to eliminate pathological anxietywhile leaving ‘normal’ anxiety intact. Yet thisdistinction may be difficult to define. If we havea pharmaceutical compound that reduces anxiety,might we overshoot the mark? If so, would thatbe as problematic as undertreatment? Commonsense, and the results of a famous study,17 suggestthat a moderate level of anxiety is associated withoptimal performance in situations like test-taking.Laurence Olivier is known to have suffered, asmany actors do, from tremendous stage fright.His view of this was that this fear was anessential motivator that ensured his performancewould be undertaken with the highest possiblefocus and concentration. Every researcher knowsthat the approach of the deadline for grantsubmission generates substantial anxiety whichagain motivates the highest possible level ofenergy and productivity.

Threshold issues relate to the decision tobegin as well as the decision to end treatment.At what point do we declare anxiety to beat a clinically significant level that warrantsintervention? If Laurence Olivier experiencedintense anxiety at each performance, should he betreated? The goal of treatment of an unhappy butsuccessful person should be first and foremost toprevent failure (inability to perform, because ofparalysing fear or shoddy performance, becauseof cavalier attitude) while, if possible, reducingthe discomfort of unhappiness. In this context weecho a famous quote of Freud, concerning the

Page 261: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 263

goals of psychoanalysis vis-a-vis unhappiness.Anxiety clearly exists on a continuum yet atreatment decision is a binary one. We do notattempt to administer a partial treatment. Thedecision of who to treat, of the minimal levelof symptomatology an eligible subject may havewill have implications for interpreting and actingon study results. It is likely that there is a distancefrom the boundary with normality associatedwith optimal effect of treatment. The closer tothe boundary, the more likely the study willshow non-specific or placebo effects. The fartherfrom the boundary (i.e. the more severe andcomplicated the symptoms are) the less likelythat the treatment will be fully effective. Oneconsideration in deciding who to treat in aresearch study of anxiety disorders is the lifecontext and the personal context in which theanxiety disorder symptoms arise.

Considerations Related to Life Contextand Individual Psychology

Because of the salience of environmental stimulias a trigger for normal anxiety, and the impor-tance of coping mechanisms and social supportsas responses to anxiety, it might be argued thatthese context measures are of particular impor-tance in anxiety studies. Little is known about therelationship between onset, course and treatmentof anxiety disorders and these external factors.There is a need to examine what the nature ofthese relationships may be. For example, it is notknown whether faulty coping mechanisms play arole in the vulnerability to one or another of theanxiety disorders. If so, perhaps this should bea target of a treatment intervention. If not, per-haps coping skills are variable across individualsand/or across stressors and may act as a modera-tor of treatment response. In this case, improvingcoping may be a strategy for treatment of non-responders.

Strong social support is well known to be animportant contributor to a sense of safety. Anx-iety disorder patients experience the world asmore dangerous. Safety is not necessarily theopposite of danger, but a sense of safety can

mitigate the perception of likelihood of dangerand/or the perception of consequences of the dan-ger. Anecdotally, some anxiety disorder patientsare thought to have unusually good interpersonalskills. Turning to others may be one way a patientwith panic disorder copes with a world perceivedas persistently and unpredictably frightening. Forother individuals with anxiety disorders, anxietymay be exacerbated by relationships with oth-ers. A patient with social phobia fears scrutinyby others and this may motivate them to avoidrelationships or to concentrate on developing asmall group of ‘safe’ people. Someone with OCDmay fear contamination from others, or an OCDpatient may fear harming other people. Eithermay lead them to have reduced social contactsand less overall sense of safety. It is also pos-sible that deficits in internal representations ofother people can lead to problems in regula-tion of emotions. This can contribute to anxietysymptoms and perhaps even to the onset of anx-iety symptomatology. There is some indicationthat relationships with others help regulate neu-roendocrine and autonomic nervous system func-tioning. Whether to include measures of socialsupport and attachment into clinical trials in anx-iety disorders is a decision researchers must beginto consider. Such information is an additionalpatient burden. However, it may be difficult todetermine optimal interventions for patients in thecommunity if researchers do not begin to addresssome of these issues.

Also ill defined are boundaries between dif-ferent diagnostic categories with similar symp-tomatology, and frequent comorbidity. Given thefact that fears, worries, somatic symptoms andbehavioural manifestations are shared across dis-orders, distinctions can sometimes be blurred.There have been changes in diagnostic criteriasets since DSM III, especially for panic disor-der and generalised anxiety disorder (see below).There has also been a change in the relationshipbetween panic and agoraphobia, and betweenthese disorders and other DSM IV anxiety disor-ders. These changes reflect growing recognitionof the occurrence of panic and phobic symptomsacross disorders. In addition, the core generalised

Page 262: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

264 TEXTBOOK OF CLINICAL TRIALS

anxiety disorder (GAD) symptom, worry, is alsofrequently found in other anxiety disorders andin depression. The content of worry apparentlydiffers in depression and GAD.18 However, thisis a nuance for an assessment strategy, and again,time is required to tease this apart.

The diagnostic groups that comprise the anx-iety disorders share cognitive manifestations offear or worry, behavioural manifestations ofefforts to cope with the anxious thoughts (pho-bia or compulsions), and somatic symptoms thatoften accompany the anxiety. Yet each disorderhas a different ‘flavour’ of symptoms, and eachmay reflect different aetiological underpinnings.The similarities have implications for cliniciansand researchers alike and will have a bearingon the design of new treatments and dissemina-tion of efficacious interventions. As study intakemoves into the community, we can expect thediagnostic overlap to increase, if only becausemild versions of disorders are harder to separatethan severe ones. Efficacy studies have focusedon specific diagnostic groups, rather than on anx-iety as a loose complex of symptoms. This meansthat the ostensible usefulness of study resultsdepends on clear, reliable identification of thespecific disorders in patients, at best a dubiousproposition in the community setting. However,it should be recognised that while the efficacystudies have been carried out in carefully con-structed ‘pure cultures’, the results of those stud-ies show a startling uniformity of options fortreatment across the spectrum of anxiety dis-orders. It may be that the careful and expen-sive nosologic dissection characteristic of the firstgeneration of clinical trials is added to the pre-cision and power of those studies, but may berelaxed in the next generation of effectivenessstudies, making a virtue of necessity.

We now know how to reliably identify anxietydisorders and we know how to provide effica-cious acute interventions, but these demonstra-tions have occurred only in the research clinic.The traditional decision point for clinical inter-vention, i.e. clinical (DSM IV) diagnosis, isfairly clear, though there are remaining contro-versies about psychiatric diagnosis. For the most

part, these are beyond the scope of this paper.Researchers have developed tools to use forscreening, diagnosis and severity ratings of anxi-ety symptoms. Ensuring that clinicians are awareof these and that the instruments are user-friendlyis a focus of current work. There are many exist-ing publications related to different assessmentinstruments, so we will not provide this infor-mation here. Instead, we suggest that even witha good instrument, there are some difficulties inestablishing clearly a single target condition, andthat the attempt may be a useless diversion ofeffort if discriminatory precision is less importantthan inclusiveness of intake.

The high rate of co-occurrence of anxiety dis-orders is an area of confusion that concernsthe diagnostic nomenclature. For this and othermore theoretical reasons, there is controversy inthe field with regard to whether different DSMIV diagnoses describe truly distinct illness cat-egories. Moreover, even if they are different,their co-occurrence creates measurement prob-lems. For example, if a patient in treatment for apanic disorder has a co-occurring specific phobiaof heights, should avoidance of bridges be rateda symptom of panic disorder, of height phobiaor both?

As noted above, it may be possible to aim cur-rent established treatments on the generic symp-toms that occur across the disorders: fear or worry,somatic symptoms and behavioural changes suchas avoidance or compulsive ritualising. The broadspectrum effects of serotonin-active medicationslend themselves to such an approach, as dothe psychosocial treatments which may reducegeneric cognitive, somatic and behavioural symp-toms across disorders. An investigator planning acommunity study needs to decide whether to testtreatments in the classical disorder-specific trial orin a more broadly-based group of patients definedby the common symptoms and behaviours of theanxiety disorders. We think that the latter choicedeserves serious consideration.

Defining Outcomes and Measuring Results

Katschnig and Amering19 point to the consider-able complexity of symptoms in panic disorder,

Page 263: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 265

suggesting that spontaneous and situational panicattacks, anticipatory anxiety, phobic avoidance,disabilities, comorbid depression and substanceabuse must be considered. One might add to thisthe presence of other anxiety disorders, personal-ity disorders and physical illnesses. These authorslist methodological difficulties that emerge fromthis complexity. They raise questions such aswhich of the phenomena are most important inassessing course and outcome, what are the rel-evant time intervals for symptom assessment, atwhat point should the clinician consider that theillness has transitioned to a remitted state? Suchconsiderations are relevant for each anxiety dis-order. All are comprised of multiple domainsof symptoms, including panic, anticipatory anx-iety, worry, phobia, obsessions or compulsions.Since the diagnostic criteria do not require thepresence of each of these, it is possible to meetcriteria for one or another anxiety disorder withprominent symptoms in one domain and none inanother. Over time this may change. In some situ-ations different domains within a disorder may benegatively correlated. For example, an individualmay experience a reduction in panic attacks whilebecoming increasingly phobic. Is this improve-ment or worsening of the overall condition? Apatient with OCD may experience a decrease inobsessions as the compulsive behaviours growand become instantiated. Should this be con-sidered a change in severity? A person with aphobia may experience lower overall impairmentand/or fear as their phobic behaviour becomemore fixed, and they begin to accommodate thephobia in their lives. Does this mean the phobiais in partial remission? What if the intensity ofsymptoms is actually worse than when the dis-order was first diagnosed, and yet there is lessimpairment? Similarly, if an individual with OCDhas prominent obsessions and intermittent com-pulsions are they better off, worse off, or the sameas if the opposite is true? What is the role ofimpairment and/or quality of life in determiningoutcome? What criteria should we use for ill-ness severity? What about treatment response orremission? It is clear that response entails a clin-ically significant, noticeable change in symptom

levels while remission entails a return to func-tioning with symptoms at a level that they causeno noticeable distress. Studies are underway toidentify precise markers of these important clin-ical transitions.

The fact that the symptoms of a single dis-order do not necessarily travel together createsdifficulties in defining the endpoint for a treat-ment. Such a ‘carousel course’ (Figure 17.1) ofsymptoms leads to assessment quandaries thatcan be daunting. Again, taking the case of panicdisorder with agoraphobia, if a patient starts treat-ment with several full panic attacks per week,and then has a marked reduction in panic attacks,but continues to have frequent limited symptomepisodes, and remains moderately agoraphobic,how much improved is the patient compared tobaseline? How should life context be factoredinto assessment of illness severity? If a socialphobic gets a new job which requires less pub-lic performance than previously, but the job isbelow his or her level of competence, social anx-iety symptoms may diminish noticeably but is thepatient really better?

Several authors have drawn attention to theseproblems and the general recommendations havebeen to use composite measures of severity overan extended period of time. Such composite mea-sures are available for most of the disorders, andmost are quite user-friendly: The Yale–BrownObsessive Compulsive Scale has been widelyused and is available in a self-report format.The same is true for the more recently devel-oped Panic Disorder Severity Scale. The SocialPhobia Inventory (SPIN) is also brief and com-prehensive. There is little agreement in the fieldabout the one or two best measures for each disor-der. The same measures can sometimes be usedfor screening diagnosis and outcome though itmakes sense to pick the instrument most rele-vant to the goal of the assessment. The use of acomposite measure presupposes that it is possi-ble in principle to rank order the outcomes of thepatients, although there may be many outcomesthat are distinct but not ordered. From a statisticalpoint of view, the ability to reliably order patientoutcomes into as few as four or five categories

Page 264: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

266 TEXTBOOK OF CLINICAL TRIALS

0%

20%

40%

60%

80%

100%

Time 0 Time 1 Time 2 Time 3 Time 4

Anticipatory Anxiety Panic Agoraphobia

Figure 17.1. Panic disorder as an example of a ‘carousal’ symptom pattern

provides considerable increase in the power todetect treatment differences, compared to a sim-ple dichotomy of response/no response. Thereare diminishing returns even to perfectly reli-able orderings with more than five levels. Giveneven modest unreliability, it may not pay to pushcomposite measures beyond a few levels of dis-crimination. The challenge posed by the ‘carouselcourse’, and the pleiotropic outcomes, of anxietydisorder is fundamental: the ability to rank patientoutcomes is the most basic feature of scientificmeasurement and study.

As studies extend into the community, theywill explore the less severe forms of the disor-der, and may be even more vulnerable to theproblem discussed above. This raises the possi-bility that the target of measurement should notbe improvement (alone) but prevention of signif-icant worsening. The advantage of this approachis that it may move the measurement into a morereliable regime, in which there is less controversyabout the meaning of the outcome. The disad-vantage is that it may also require large sample

sizes, in order to detect modest effects on lowprobability outcomes.

Choosing a Time Frame for OutcomeAssessment

The specifics of time frame are also contro-versial. While a group of senior panic disorderresearchers achieved consensus on a recommen-dation of an optimal period of four weeks forassessment of symptoms, and a minimum of twoweeks, these recommendations are not alwaysfollowed. In fact, frequently symptom status isreported without specification of the time frameof the assessment. The issues around time courseare further complicated by variability betweendomains and within a domain, depending onlife circumstances. Some domains of symptoms,such as phobic symptoms, are very stable, and achange in them, even over a fairly brief period,e.g. a few weeks, can be an indicator of changein illness severity or course. (A caveat here is thechange in life context.) However, other symptoms

Page 265: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 267

are very unstable. For example, it is typical forpanic attacks to occur in clusters and then tosubside. The problem is further compounded bydifficulties inherent in rating panic frequency.Anticipatory anxiety can be far worse if thereis a specific environmental demand to confrontan anxiety-provoking situation. For example, if asocial phobic must go to a wedding. This raisesthe question of the time frame over which differ-ent types of symptoms should be assessed, andthe situations in which the symptoms should beevaluated. Again, a focus on long-term preven-tion of serious worsening may help.

We do not have definitive answers to thesemyriad questions, but suggest that we should bepaying attention to these assessment challenges.It may be possible to undertake secondary dataanalyses that target these questions. In the mean-time, we suggest that outcome assessment musttake into consideration multiple domains to makea meaningful judgment of response or remission.Further, it makes sense to establish and publishconventions for raters so that others can under-stand clearly results of studies. Reports of studyresults rarely describe conventions for rating pho-bia, including changes in life context and/or situ-ational demands. Many published panic disorderstudies use panic attack frequency as the onlyoutcome. Reporting conventions should be broad-ened to address these issues.

Phobic Fear and Avoidance

A third issue specific to anxiety disorders isthe occurrence of phobic symptomatology. Oneof the trickiest problems in anxiety disorderstreatment is the assessment and management ofavoidance. Avoidance is a natural reaction to fearand is usually successful in reducing anxiety inthe short run. However, the longer-term effect isvirtually always to increase anxiety. Avoidancealso causes substantial functional impairment.Avoidance may lead a social phobic to seriouslycurtail his or her education, or resist careerdevelopment because of fear of speaking in agroup. The net result can be highly significantto income and productivity. Thus, avoidance is

both a coping mechanism and a symptom. By itsnature, it can be difficult to measure, since manyanxiety disorder patients try to avoid thinkingabout anxiety-provoking situations, in additionto avoiding confronting these situations. Thismeans that asking a person if there is anythingthey are avoiding often results in under-reporting.It is necessary to inquire about avoidance byasking specific questions, and this can be time-consuming. Some behaviour therapists argue thatphobias can only be assessed using a behaviouralchallenge protocol. However they are measured,it is clear that phobic symptoms are important asthey are among the strongest and most consistentpredictors of long-term outcome.

Avoidance can also play a role in silencing anx-iety symptoms and reducing the impetus to seekhelp. This may be one way that phobic symptomsact to worsen the course of illness. Silencing ofsymptoms is also reminiscent of the hypertensionanalogy where serious consequences result fromlack of awareness of symptoms and difficultyadhering to treatment regimens. In fact, phobicavoidance has now been found, like hyperten-sion, to be a predictor of cardiovascular mortality,at least for men. A further issue related to thesilencing of distress is that it can be difficult todistinguish pathological from normal avoidancebehaviours. Phobic symptomatology may becomeso integrated into the patient’s life that it seemsnormal. Avoidance of some situations may betreated as though they are simply life choices.The patient may say that he or she simply doesnot enjoy shopping in a mall when the fact is thatthey are afraid to go to a mall because they mayhave a panic attack. The problem of distinguish-ing normal from pathological anxiety is broaderthan the issues related to phobia.

This realm of symptoms causes methodolog-ical problems that involve both assessment andtreatment. Phobic symptoms entail avoidance ofcues that evoke fear, anxiety or other dyspho-ric affects. An individual with phobic avoidancewill make every effort to evade exposure to thefeared situation. Evasion often extends to think-ing or talking about the situation. This meansthat the phobic individual cannot be counted on

Page 266: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

268 TEXTBOOK OF CLINICAL TRIALS

to talk about their symptoms spontaneously. Infact, to obtain a clear picture of the extent ofbehavioural avoidance often requires a detailedinquiry. Such an assessment takes time and isnot desirable in many community settings. Aninvestigator must decide whether this time isworth the trade-off of information. The answerto this question will be influenced by the typeof study and the population being studied. How-ever, it is important for researchers to be awarethat more co-occurring phobia has been consis-tently associated with poorer response to treat-ment and greater likelihood of relapse.20,21 Theclinical significance of phobic symptoms under-scores the need to attend to this component ofanxiety symptomatology.

THE TWO CULTURES: DISCORDANTMODELS OF THE ANXIETY DISORDERS

A fourth characteristic of anxiety disorders isbased upon the fact that there are two powerfulmodels of these disorders that are not yet fullyintegrated. Specifically, neurobiological (gener-ally biomedical) and learning theory (generallyacademic psychological) researchers use differentparadigms to explain symptom onset and to guidetreatment. When studying treatment in commu-nity settings, it is important that neither groupignore the other. In anxiety disorders, perhapsmore than any other conditions, there is a needto build on information obtained from both ofthese academic disciplines, given that each fieldcan claim clinical results.

Incorporating Information from Biomedicineand Academic Psychology in Study Designs

Anxiety disorders are unusual in that they havebeen the focus of intensive and more or lessindependent study by both biomedical/psychiatricand behavioural/psychological researchers. Effi-cacious treatments have been devised by eachgroup. Yet, most treatment studies test inter-ventions in one, but not both areas. The exis-tence of two very different types of efficaciousintervention for each of the anxiety disorders

presents some especially challenging method-ological issues for which there is no simplesolution. The practice of ignoring the findingsof the other modality when conducting stud-ies is increasingly problematic. If not specif-ically instructed, pharmacotherapists may varywidely in their knowledge and use of effi-cacious behavioural interventions. This varia-tion can be highly problematic for a treatmentstudy. On the other hand, much effort must gointo controlling the interaction of the pharma-cotherapists with the patient. Researchers mustdecide how much behavioural intervention themedication therapist should administer. Compli-cated and time-consuming procedures are oftenrequired to ensure that such interventions are pro-vided uniformly.

On the other hand, patients in the communityoften receive medication that can be efficaciousfor treating anxiety disorders, even before pre-senting to the cognitive behavioural therapist fortreatment. Investigators must decide how to man-age this situation. Should such patients be elimi-nated from a CBT trial? Should they be eligibleand left on medication that is not fully effective?Or should all medication be discontinued? Eachof these decisions is problematic since a partiallyeffective medication can affect outcome whetherit is continued or discontinued. Omission of thisincreasingly large group of patients, on the otherhand, can also be an important threat to general-isability of the study.

Another problem for researchers is how todecide whether to compare medication and psy-chosocial treatments, and if so, to decide how bestto do so. There is clearly a need in the field toaddress this problem, but the solution to the prob-lem is not trivial. Among the problems are thatpatients often have treatment preferences. Manywill simply refuse to participate in a study inwhich they must agree to be assigned to a treat-ment modality at random. Others will agree anddrop out when they receive an unwanted treat-ment assignment. There are a number of possibledesigns for a comparison treatment study. Theseinclude a full factorial design (Figure 17.2) inwhich each active treatment is compared to an

Page 267: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 269

Activepsychotherapy

‘Placebo’psychotherapy

Nopsychotherapy

X

X

X

X

X

X

X

X

X

Active medication Placebo medication No pill

Figure 17.2. Full factorial design

inactive (placebo) control treatment and no treat-ment, and the two treatments are combined inall possible combinations. While this is the mostcomplete design it is often impractical because ofthe treatment combinations (or lack of treatment)or because of the number required. It is difficultto undertake such a study at a single site and thenthere are problems with multiple sites in equiva-lence of providing all treatments and in minimis-ing patient heterogeneity. An alternative class ofdesigns has been described recently22 that allowspatients and investigators to describe preferencesin advance of randomisation and then be ran-domised within their preference set (‘equipoisestratum’).

Other design issues are related to the differ-ent putative underpinnings of symptoms as con-ceptualised from different points of view. Thesedifferent viewpoints sometimes translate to dif-ferent treatment targets. For example a CBTapproach to panic disorder focuses on underly-ing fear of bodily sensations, while the pharma-cotherapist targets bursts of autonomic arousal.Pharmacotherapists and psychosocial researcherstypically use different assessment strategies, andmay or may not accept those of the other camp. Inrecent years, a series of multisite studies under-taken as a collaboration between neurobiologicand cognitive behavioural scientists has produced

a more comfortable meeting ground for bothgroups. Still, there are disagreements.

In addition to the scientific differences, thereare social and political differences between thetwo groups of researchers that can complicatemethodological decisions in treatment trials. Theinvestigators need to be clear about who the audi-ence for their results will be. Design decisionsmay influence who will listen to their results.Ideally, a study can be designed so that it will beconvincing to any treatment researcher. However,there are turf issues that may influence the mutualacceptance. Clinicians and researchers from onecamp may feel the other is poaching on their turf.This is more likely to occur if there is insufficientattention to the issues of efficacy of the alterna-tive treatments.

Guild issues are prominent in this field, andfew pharmacotherapists understand the princi-ples and techniques of administering cognitivebehavioural treatment. Similarly many psychoso-cial researchers are not well informed about phar-macotherapy. Investigators from each group tendto have strong allegiance to the unique validity oftheir own methods. At times, there has been ran-cour and contentiousness between them, thoughthis has improved in recent years. In the fewinstances that there has been a head-to-head com-parison of medication and cognitive behavioural

Page 268: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

270 TEXTBOOK OF CLINICAL TRIALS

treatment, they have been similar in efficacy. Itis not yet clear when or how combination treat-ments might be best administered. There is aneed to take into consideration both biomedi-cal and behavioural–psychological perspectivesin designing treatment studies.

CONCLUSIONS

Although proven efficacious treatments havebeen identified for each of the anxiety disorders,the work of clinical trials remains unfinished.There are many unanswered questions, and muchleft to study in order to inform clinicians abouthow to optimise treatment decisions for patientswith these debilitating conditions. In spite ofachievements in documenting treatment efficacyover the past decades, treatment research has justscratched the surface. Innovations are needed intreatment development and in dissemination ofproven interventions. To accomplish this there isa need for innovations in methodology. Efficacystudies, designed to meet US FDA regulatoryneeds,23 will continue to have a role in the clinicalresearch pipeline. But, there is a need for newclinical methods to support studies before andafter efficacy. It is not our purpose to provide acomprehensive review of such methods. Rather,we have focused on a few key issues in anxietydisorders that require special consideration.

Existing clinical trials in anxiety disorders, likethose in other areas of psychiatry, have providedinformation telling us which treatments are activein reducing target symptoms. Unanswered are amyriad of critical questions that relate to daily lifedecisions in the clinic. For example, do impair-ments as well as symptoms respond to efficacioustreatments? If so, what is the time course ofresponse? What is the optimum dose and dura-tion of treatment to achieve maximal results?How often can we produce remission with exist-ing treatments? What is the best way to defineremission? Is maintenance treatment needed afterremission is achieved? If so, how long? What ifa patient does not achieve symptom remission?How should such a patient be managed over the

long run? Do patients with complex comorbidconditions respond to treatment in a way that issimilar or different than those with less comor-bidity? Can a clinician be confident that provenefficacious treatments are appropriately utilisedin a patient whose symptoms meet criteria forthe target disorder, but who differs in demo-graphic characteristics, social supports, or otherways from those seen in efficacy studies? Howclosely must procedures in the community followthose used in research studies in order to achievethe same results?

These and other questions like them are oftenbroadly grouped under the rubric of ‘effective-ness’ studies. We focus especially on characteris-tics of anxiety disorders that make these decisionscomplicated and that comprise methodologicalchallenges for researchers. We confess that wemay raise more questions than we can answer.However, where possible, we will at least pro-vide suggestions about possible ways to addressthe problems.

Decisions about primary, secondary and tertiaryprevention interventions are not so well specified.There is accumulating evidence for psychologicaland neurobiological precursor states for anxietydisorders and for psychosocial risk factors. Thisinformation raises hopes for the possibility of pri-mary prevention. Clearly it would be advantageousto be able to intervene early, before the devel-opment of the disorder and, ideally, even beforethe onset of a precursor state. Once established,anxiety disorders are chronic relapsing conditions.With or without treatment, patients are likely toexperience symptoms that wax and wane, to mean-der in and out of full-fledged symptom states indifferent configurations, to experience temporary,partial or incomplete states of remission. We needmore information about how to manage anxietydisorders, once established, in order to best pre-vent complications and recurrence of full symp-tomatic episodes.

REFERENCES

1. Kessler RC, McGonagle KA, Zhao S, Nelson CB,et al. Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States:

Page 269: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

ANXIETY DISORDERS 271

Results from the National Comorbidity Survey.Arch Gen Psychiat (1994) 51(1): 8–19.

2. Hohmann A, Shear MK. Community based inter-vention research: coping with the noise of reallife in study design. Am J Psychiatr (2002) 159:201–7.

3. Kessler RC, Soukup J, Davis RB, Foster DF,Wilkey SA, Van Rompay MI, Eisenberg DM. Theuse of complementary and alternative therapies totreat anxiety and depression in the United States.Am J Psychiat (2001) 158(2): 289–94.

4. Shear MK, Greeno C, Kang J, Ludewig D,Frank E, Swartz HA, Hanekamp M. Diagnosis ofnon-psychotic patients in community settings. AmJ Psychiat (2000) 157(4): 581–7.

5. Glisson C, Hemmelgarn A. The effects of organi-zational climate and interorganizational coordina-tion on the quality and outcomes of children’s ser-vice systems. Child Abuse Neglect (1998) 22(5):401–21.

6. Rosenheck R. Stages in the implementation ofinnovative clinical programs in complex organiza-tions. J Nerv Ment Dis (2001) 189(12): 812–21.

7. Swartz H, Shear MK, Frank E. Supermarket Psy-chotherapy. Psych Services (in press).

8. Roy-Byrne PP, Katon W, Cowley DS, Russo JE,Cohen E, Michelson E, Parrot T. Panic disorder inprimary care biopsychosocial differences betweenrecognized and unrecognized patients. Gen HospPsychiat (2000) 22(6): 405–11.

9. McKay M, McCadam K, Gonzales J. Addressingthe barriers to mental health services for innercity children and their caretakers. Commun MentalHealth J (1996) 32(4): 353–61.

10. Essock S, Goldman H. Outcomes and evaluation:system, program and clinician level measures.In: Minkoff K, Pollack D, eds, Managed MentalHealth Care in the Public Sector: A Survival Man-ual. Chronic Mental Illness, Vol. 4. Amsterdam:Harwood Academic (1997) 295–307.

11. Dohrenwend BP, Shrout PE. Toward the develop-ment of a two-stage procedure for case identifica-tion and classification in psychiatric epidemiology.Res Commun Mental Health (1981) 2: 295–323.

12. Dixon L, Green-Paden L, Delahanty J, Luck-sted A, Postrado L, Hall Jo. Variables associated

with disparities in treatment of patients withschizophrenia and comorbid mood and anxietydisorders. Psychiat Serv (2001) 52(9): 1216–22.

13. Kandel DB, Huang F-Y, Davies M. Comorbiditybetween patterns of substance use dependenceand psychiatric syndromes. Drug Alcohol Depend(2001) 64(2): 233–41.

14. Brady KT. Comorbid posttraumatic stress disorderand substance use disorders. Psychiat Ann (2001)31(5): 313–19.

15. Widiger TA, Sankis LM. Adult psychopathology:issues and controversies. Ann Rev Psychol (2000)51: 377–404.

16. Fava GA, Rafanelli C, Ottolini F, Ruini C, Caz-zaro M, Grandi S. Psychological well-being andresidual symptoms in remitted patients with panicdisorder and agoraphobia. J Affect Disord (2001)65(2): 185–190.

17. Yerkes RM, Dodson JD. The relation of strengthof stimulus to rapidity of habit formation. J CompNeurol Psychol (1908) 18: 459–82.

18. Diefenbach GJ, McCarthy-Larzelere A, William-son DA, Mathews A, Manguno-Mire GM, BentzBG. Anxiety, depression, and the content ofworries. Depress Anxiety (2001) 14: 247–50.

19. Katschnig H, Amering M. The long-term courseof panic disorder and its predictors. J ClinPsychopharm (1998) 18(6, Suppl 2) 6S–11S.

20. Scheibe G, Albus M. Predictors and outcome inpanic disorder: a 2-year prospective follow-upstudy. Psychopathology (1997) 30: 177–84.

21. Cowley DS, Flick SN, Roy-Byrne PP. Long-termcourse and outcome in panic disorder: a naturalis-tic follow-up study. Anxiety (1996) 2: 13–21.

22. Lavori PW, Rush AJ, Wisniewski SR, Alpert J,Fava M, Kupfer DJ, Nierenberg A, Quitkin FM,Sackeim HA, Thase ME, Trivedi M. Strength-ening clinical effectiveness trials: equipoise-stratified randomization. Biol Psychiat (2001)50(10): 792–801.

23. Rubinow DR. Practical and ethical aspects ofpharmacotherapeutic evaluation. In: Ginsburg BE,Carter BF, eds, Premenstrual Syndrome: Ethicaland Legal Implications in a Biomedical Perspec-tive. New York: Plenum Press (1987) 47–55.

Page 270: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

18

Cognitive Behaviour Therapy

NICHOLAS TARRIER1 AND TIL WYKES2

1School of Psychiatry and Behavioural Sciences, University of Manchester, Manchester M23 9LT, UK2Institute of Psychiatry, De Crespigny Park, London SE5 8AF, UK

BACKGROUND

We have chosen in this chapter to provide anoverview of the difficulties for the investigationof psychological therapies using the methodologyof randomised control trials. In order to do sowe have selected studies of a new treatment forpsychosis, cognitive behaviour therapy (CBT).This is a new therapy that, following a periodof development, has now resulted in four largerandomised control trials.

Cognitive behaviour therapy is a therapy thattargets the symptoms of one disorder, schizophre-nia. The lifetime risk is 1%. This disorder is char-acterised by a cluster of specific symptoms thatare typically divided into two categories, positiveand negative. Positive symptoms include auditoryhallucinations and delusions, both of which pro-duce much distress. Negative symptoms includelack of drive, emotional apathy as well as povertyof speech and social withdrawal.

In many, if not most, cases the disorder followsa relapsing course.1 A significant proportion,but not all, people suffering from the disorderhave poor outcomes, i.e. with high levels ofdependence on continuing psychiatric care, low

levels of financial independence and little socialfulfillment. There is some underlying variationin the disorder,2 which is probably affectedby interactions with other clinical, social andenvironmental demands and supports such aslife events (death of parent), absence of asupportive family (or presence of a critical one)and economic conditions (high unemployment).

Several different sorts of psychological thera-pies have been developed to address the follow-ing outcomes:

• Total number of symptoms• Distress caused by symptoms• Relapse• Social functioning• Family engagement• Quality of life• Skills/thinking style, e.g. problem solving,

coping skills.

The currently accepted treatment for the posi-tive symptoms of schizophrenia is medication.This has been shown to reduce them significantlyand does reduce relapse. However, it also hascosts as well as benefits in that there is the risk

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 271: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

274 TEXTBOOK OF CLINICAL TRIALS

of developing side effects such as tremor, rest-lessness and uncontrollable mouth movements.Most side effects disappear on stopping the med-ication but there is the chance that the mouthmovements will develop into a condition knownas tardive dyskinesia that is irreversible. Somepatients, about one-third, also continue to expe-rience positive symptoms despite adequate dosesof medication. It is these symptoms that were thetargets for change in this further development ofa psychological therapy, CBT.

Because of the potential risks of long-termmedication and the unpleasant side effects alsoexperienced on short-term treatment, consumersof mental health services have been extremelypositive about the development of psychologicaltreatment. This has led to further pressure onfunders of health service research to providemore data on acceptable alternatives or adjunctsto medication treatment. Hence the recent trialsof CBT in the UK sponsored by either theUK Department of Health directly, governmentresearch agencies or large UK research charities.

WHAT IS COGNITIVE BEHAVIOURTHERAPY?

The main developmental roots for CBT have beenin depression and anxiety. This began over 20years ago but more recently the approach hasbeen applied to people with schizophrenia. Thislater development produced changes in the waythe intervention is presented, although the under-lying model of change may be similar to thatadopted for the other disorders. The main aim ofthe intervention is to reduce distress, disabilityand emotional disturbance as well as the relapseof the acute symptoms.3 Cognitive behaviourtherapies are active and structured therapeuticmethods and should be distinguished from psy-choeducation which tends to be simple, didacticand educational. Brief educational packages havebeen shown to be ineffective either with families4

or with individual patients.5

Although there are specific components ofCBT that would be accepted by all its proponents,these ingredients may be given in different

proportions by different groups of professionalsand for different individuals within a singleservice. Below is a list of the ones that we haveidentified as being used by most groups:

• Engagement with the client.• Problem identification.• Agreeing on a collaborative formulation of the

problems to be assessed.• Use of alternative explanations to challenge

delusional and dysfunctional thoughts.• Establishing the link between thoughts and

emotions.• Encouraging the patient to examine alternative

views of events.• Encouraging the patient to examine the link

between thoughts and behaviour.• Use of behavioural experiments to reality test.• The setting of behavioural goals and targets.• Developing coping strategies to reduce psy-

chotic symptoms.• Development and acquisition of relapse pre-

vention strategies.

Some groups have also included:

• Improvement in self-esteem.• Increasing social support and social networks.• Schema focused therapy.

TREATMENT DEVELOPMENT

New treatments usually evolve through a numberof stages. Initially the problem is identified andsuggestions, involving theoretical and pragmaticelements in varying degrees, are advanced for itssolution. Innovative case studies are carried out.Replications and developments in other case stud-ies and case series follow. The next stage consistsof uncontrolled and small exploratory controlledtrials. These are often innovative but methodolog-ically weak. Finally the ‘gold standard’ of evalua-tion, the large randomised controlled trial (RCT),is carried out if the new treatment is showing suf-ficient promise. RCTs are increasingly large andmethodologically rigorous and therefore moreexpensive, often now involving numerous sitesand large numbers of patients. A further theme

Page 272: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 275

is that of identifying what is responsible for theimproved outcome following the treatment; thatis, the trial includes an explanatory element. Tri-als that identify the key components responsiblefor the changes are essential to the further devel-opment of treatment and to the disseminationof the treatment package into the wider healthservice. However, this lack of an accepted the-oretical base does not (and should not) preventa number of different and successful treatmentinnovations from being introduced into healthservices. Pragmatic trials, which address the issueof whether or not a new treatment works withina routine service setting, are usually large, simpleand multi-centred and evaluate a small number ofoutcomes. These trials do not address the ques-tion of why a treatment works. But this under-standing is essential because the costs of caremight be reduced considerably if it is discov-ered that a rather simple and cheap componentof treatment is responsible for the majority ofthe variance in treatment outcome. These treat-ment extensions, although important, rarely getadequate funding following the initial innova-tive RCTs.

The development and evaluation of CBT forpsychosis is no different from the developmentof other treatments in mental health and hasfollowed a characteristic path. Numerous casestudies were published, some as far back asthe 1950s. For example Dr A.T. Beck initiallyworked with psychotic patients and publisheda case study of the cognitive treatment of apatient suffering from delusional disorder6 beforemoving to start his seminal work on depres-sion. Other case studies were published in the1970s and 1980s (see Tarrier7,8 and Haddocket al.9 for reviews), but it was not until the recentdecade that randomised controlled trials were car-ried out. Small trials with methodological weak-nesses were initially published. For example,Tarrier et al.10 compared coping training withproblem solving but assessments were not blindand drop-outs were not included in the analy-sis. Garety et al.11 compared cognitive behaviourtherapy to treatment-as-usual but again assess-ments were not blind and group allocation was

not random. Drury et al.12,13 evaluated cognitivetherapy with acutely ill patients, but the treat-ment included individual and group treatmentof patients and families while assessments wereneither independent nor blind. However, threemedium size methodologically robust trials ofCBT variants have been carried out with chronicschizophrenic patients,14 – 16 and one large multi-site trial with recent onset acute patients (theSoCRATES Trial17). It is therefore appropriate toreview not only these trials but also the changesin clinical trial methods in this field in order tobegin to define the most optimal strategy for thefuture evaluation of this and other psychologi-cal therapies.

WHY CARRY OUT CLINICAL TRIALSOF PSYCHOLOGICAL TREATMENTS?

PURPOSES AND OUTCOMES

There are a number of different beneficiariesfrom clinical trials. From the health servicesperspective there is an increase in knowledgeabout what treatments are likely to providethe most benefits (see section on Evidence-Based Practice below). In addition, for clinicalacademics there may be elements of the design ofa trial that will allow certain models of aetiologyor treatment efficacy to be tested which caninform theories of the disorder as well as leadingto improvements in treatment. For therapiststhe trial may produce clinical improvementsthat mean that the participants can make healthgains and for the patients the treatments mayprovide them with changes that are valued, suchas increased social inclusion. So it cannot beassumed that there is a single purpose for carryingout a clinical trial. These different purposeschange the type of trial performed, particularly inrelation to how outcomes are defined. We have setout a number of different outcomes below whichmay be variously valued by different groups(in the list respectively health service, clinicalacademic, therapist and participant) and whichcould be targets in CBT trials. It cannot beassumed that all groups will value all outcomes

Page 273: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

276 TEXTBOOK OF CLINICAL TRIALS

to the same extent, or that the same outcomewould be measured in the same way from thedifferent viewpoints. For instance, symptoms canbe measured as a simple change over treatment,by a threshold amount or by the effects on theemotional life of the patient, for example thedistress caused by the symptom.

Possible outcomes of treatment:

• The occurrence or frequency of a particularevent: e.g. number of relapses, time to relapse(survival functions).

• The use of services or other resources: e.g.days spent in hospital, use of communitymental health care.

• Improvement in symptoms at a level assumedto be of clinical significance: e.g. at least20% or 50% improvement, return to withinnormal range.

• Change in a single symptom or other continu-ous outcome that is considered central or pri-mary to the disorder: e.g. severity of delusionsor hallucinations.

• Change in psychopathology that is general orsecondary to the disorder: e.g. scores on astandard measure of psychopathology, severityof distress or anxiety.

• Changes to other important aspects of theperson’s life: e.g. social functioning, numberof friends, quality of life.

Trials are also expensive and so the chancesof funding are dependent on the types of trialswanted by the funding agencies. The mainbeneficiary (and funding) of clinical trials is thehealth service who would prefer pragmatic ratherthan model testing trials. But in the UK therehas also been a new trend that may also affectthe type of trial–the inclusion of mental healthservice users (consumers) and, where appropriate,carers on the trial management committees. Thereare examples of this; users and carers wererepresented in this way on a trial of CBT in dualdiagnosis patients18 and of effectiveness of familyinterventions,19 and also on the research steeringgroup which generates the research designs forthe Centre for Recovery in Severe Psychosis

at the Institute of Psychiatry in London. Theinvolvement of service users in clinical trials inthe UK is now defined in guidelines providedby the Consumers in Research Unit within theDepartment of Health. This new undertaking doesnot seem to be prevalent in other countries.

The difficulty for research into psychologicaltreatments is that studies are usually funded frompublic resources even at the early stages. Thisis in contrast to trials of medications whereparticular companies are not only required tocarry out specific research for licensing but arealso likely to benefit financially from the resultsof trials. Unlike drugs, psychological treatmentsdo not have a specific product champion andtherefore have to compete with other health caretrials for scarce resources.

THERAPEUTIC RELEVANCE

In addition to the list of possible outcomes abovethere are other measures that may be essentialin the assessment of outcome in a trial. Forinstance, if one of the hypotheses is that thetherapy works through a specific mechanism thena sole outcome measure without recourse to eitherqualitative and/or process measures would notprovide a test of this hypothesis. This is anextension of the sorts of questions stipulatedby a clinical academic but is also essential tothe health services. It may be that the treatmentprovides its effects through a simple mechanismwhich could be provided in a less sophisticatedway, that is not requiring high levels of trainingand supervision.

It has been suggested that psychological ther-apies may all work through a common pathway:that the non-specific effects of psychotherapymay account for much of the effect of treatmentoutcome.20,21 This is hardly surprising as psycho-logical therapies have much in common with eachother–they involve, for example, an interaction,negotiation of goals, an agenda for each session.The improvement could be produced by thesecommonalities and not through the specific modelof therapy adopted. For example, treatments thatwere designed as non-specific placebo controls(e.g. befriending in Sensky et al.16 and supportive

Page 274: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 277

counselling in Tarrier et al.15) performed muchbetter than expected, although never better thanCBT. Therefore, the choice of a comparisongroup is extremely important. If psychologicaltherapy is compared to treatment-as-usual (TAU),which includes less individual attention than thepsychological therapy, its effectiveness may bedue to shared common themes of psychologicaltherapy not to ingredients of a particular therapy.Tarrier et al.10 investigated the effects of expecta-tion of therapeutic benefit by the use of a demandand counter-demand manipulation. Half of theparticipants were told that they should expecttherapeutic benefit to accrue with their progressthrough treatment (demand condition) and theother half were told that they should expect ben-efit but that it would not be apparent until afterthe end of treatment and post-treatment assess-ment (counter-demand condition). This manipu-lation of expectancies had no effect on clinicalmeasures, suggesting that at least the anticipa-tion of treatment benefit was not influential inthis patient group.

Alternatively, as psychological therapies in-clude specific attributes in common it may bewrong to conclude, in a comparison of two typesof psychological therapy, that CBT is not thebest form of therapy when the two treatmentsdo not differ significantly from each other.There is always the danger that the study willbe underpowered to demonstrate an advantageof CBT when the non-specific control groupdoes better than expected. However, CBT maybe significantly better than TAU, whereas thealternative may not give such an advantage.When the health services have to decide which ofseveral forms of psychological therapy to chooseto add to their therapeutic armantarium, selectingCBT would be their best choice.

ACUTE CARE, MAINTENANCE THERAPYAND DURABLE EFFECTS

Schizophrenia is most often a chronic relapsingcondition. If we take the metaphors from treat-ments with medication then psychological ther-apy could be provided in a number of differ-ent ways.

• Acute antibiotic treatment which kills offthe bacteria causing the disease (intensivepsychological treatment which changes a keyfactor in the psychological make-up of theindividual, e.g. cognitive behaviour therapy forpanic disorder22).

• Acute treatment of symptom exacerbationsand maintenance treatment, e.g. asthma treat-ment with steroids followed by maintenancewith Salbutamol and/or sodium chromoglycate(CBT for chronic depression23).

• Prophylactic treatment for malaria (e.g. de-briefing treatments for possible post-traumaticstress disorder24).

Psychological therapy often sets itself the sametarget as treatment using antibiotics, with an acutephase followed by a follow-up during whichthere is no active treatment. This protocol mainlyresulted from the lack of specialist input in thehealth services, making it imperative to rationservices. It also follows a set of expectationsthat come from the behavioural tradition in thetreatment of psychological problems and recentCBT interventions for anxiety disorders whereinterventions are brief and the effects durable.For example, Figure 18.1 shows the effects ofimipramine and CBT for panic disorder from atrial by Clark et al.22 In their trial the drug andthe psychological treatment had similar effects atthe end of treatment, but psychological treatmenthad a more permanent effect and the differencesbetween the two treatments were significant atfollow-up. The improvement was predicted bythe change in cognitions following treatment. Inother words, the psychological treatment changeda maintenance factor for the disorder.

This expectation of intensive treatment pro-ducing durable gains may not be appropriatefor schizophrenia as it is a relapsing conditionthat, in some cases, may have a deterioratingcourse. Furthermore, residual symptoms may bepresent between episodes of exacerbation. Resid-ual positive symptoms at discharge are a riskfactor for relapse.5 Many maintenance and causalfactors have been proposed and are in multipledomains, such as social, biological as well as

Page 275: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

278 TEXTBOOK OF CLINICAL TRIALS

20

25

30

35

40

45

baseline post-treatment

15 monthfollow-up

CBTImipramine

Source: Reproduced from Clark et al.,22 with permission.

Figure 18.1. Psychological treatment for panic disorder

psychological. It is possible that CBT could havea successful effect on one of these factors butfail later when other factors become crucial in theprogress of the disorder. This would be shown asa successful outcome at post-treatment but a lackof durability of gains at follow-up. However, theusual interpretation of this pattern of results isthat the effect on the disorder was only tempo-rary. If gains were ‘only temporary’ in a groupof patients who were chosen because their symp-toms were ‘residual’ then even this ‘temporary’gain would be welcome. This set of results, ratherthan dismissing the treatment effect, actually gen-erates a further question–how do we maintain thegains made during treatment when treatment iswithdrawn?

The therapeutic protocol adopted for schizo-phrenia with medication is to provide medicationintensively at the acute stage that is followed bymaintenance treatment at lower dose of similardrugs. It may be that psychological treatmentneeds to be provided in an equivalent way.

An alternative mechanism and pattern ofresults for CBT could be improvement in onefactor, such as self-esteem, which then allows fur-ther improvements in other factors to occur, suchas increased social support through the exten-sion of a support network by increased socialcontact. This would produce an improvement atpost-treatment and even greater gains at follow-up. However, it would appear that CBT wasnot only durable but conferred greater benefits

as time passes, although it would not be clearto the research team how this latter improve-ment came about. This poses the question ofhow do we explain increases in effect sizepost-therapy which cannot be explained merelyby the loss to follow-up of those people forwhom the therapy conferred hardly any benefitat all.

Trials of acute CBT have shown significanteffects of therapy mostly at the cessation oftreatment but always after a follow-up period.Figure 18.2 provides data from Gould et al.25 ofthe effect sizes of seven trials calculated from thefollowing equation:

Effect size = (Mt − Mc)/SDc

where Mt is the mean of the treatment group,Mc is the mean of the control group and this isdivided by the standard deviation of the controlgroup of participants.

The mean effect size for the trials studiedby Gould et al. is 0.65 (95% CI 0.56–0.71).This average is called a medium to large effectsize according to Cohen26 (p. 40). Patients con-tinued to improve over the follow-up periodwith the combined effect size cited by Gouldet al.25 of 0.93 for the four studies reporting afollow-up period. These results are encouraginggiven that schizophrenia is a relapsing condi-tion where life events and other stressors maytrigger new episodes of illness. However, the

Page 276: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 279

interpretation of the results of individual trialshas since changed, mainly because the acceptedstandards for trials have changed. Several trialsthat make up this figure are methodologicallyweak with difficulties in random assignment,blindness of ratings, adequate outcome assess-ment and problematic or unsophisticated analyses(see below).

FAIRNESS AND CHANGING STANDARDSIN TRIAL DESIGN AND REPORTING

Standards have changed and what was reportedin papers a number of years ago would havebeen adequate and satisfactory for the times.However, there are now clear guidelines onhow trials should be reported, formalised inthe CONSORT Statement.27,28 CONSORT is achecklist and flow diagram that were designedto improve the quality of reports of randomisedcontrolled trials. The checklist gives detailedinstruction on describing the study’s method anddesign, assignment and randomisation, masking(blinding), participant flow and follow-up, andanalysis. The flow diagram provides readers witha clear picture of the progress of all participantsin the trial, from the time they are referred tothe trial until the end of their involvement. It

should include the number assessed for eligibilityfor the trial, reasons for exclusion, who wasrandomised and what happened to them prior tofinal assessment and analysis of the trial results.

These standards on reporting, by implication,provide strong recommendations to researchersabout what they need to consider and actionwhen designing and managing a trial. Table 18.1contains a list of those points of the design oranalysis that can seriously bias the interpretationof the results.

The majority of the current CBT trials donot conform to the reporting guidelines as setout in here. For some trials the significantdiscrepancies between Table 18.1 and the trialmay lead to a biased interpretation of the results.For example, in Drury et al.12,13 it is not clearwhat specific therapy is provided as a varietyof different components were being tested atthe same time. In Garety et al.11 the participantswere not randomly allocated to treatment groups.Kuipers et al.14 had no blind assessment oftreatment outcomes. Sensky et al.16 recruited byrepeatedly canvassing local services for referrals.However, the current meta-analyses do showthat despite these methodological difficultiesthere seem to be significant changes in overallsymptoms following treatment with CBT.

0

0. 5

1

1. 5

2Post-treatment effectsize

Follow-up effect size(where available)

Studies cited1. Drury et al, 19962. Garety et al, 19943. Kuipers et al, 1997, 19984. Milton et al, 19785. Sensky et al, 20006. Tarrier et al, 19937. Tarrier et al, 1998, 1999

1 2 3 4 5 6 7

Source: Reproduced from Gould et al.,25 with permission from Elsevier.

Figure 18.2. Effect sizes of CBT trails

Page 277: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

280 TEXTBOOK OF CLINICAL TRIALS

Table 18.1. Items that should be included in reports of randomised trials

Heading Subheading Descriptor

Title Identify the study as a randomised trialAbstract Use a structured formatIntroduction State prospectively defined hypothesis, clinical objectives, and

planned subgroup or covariate analysesMethods Protocol Describe

Planned study population with inclusion or exclusion criteriaPlanned interventions: their nature, content and timingPrimary and secondary outcome measure(s) and the minimum

important difference(s), and indicate how the target sample sizewas estimated

Reasons for statistical analyses chosen, and whether these werecompleted on an intention-to-treat basis

Mechanisms for maintaining intervention quality, adherence toprotocol and assessment of fidelity

Prospectively defined stopping rules (if warranted)Assignment Describe

Randomisation (e.g. individual, cluster, geographic)Allocation schedule methodMethod of allocation concealment

Masking (blinding) DescribeMechanism for maintaining blind and allocation schedule controlEvidence for successful blinding

Results Participant flow andfollow-up

Provide a trial profile summarising participant flow, numbers andtiming of randomisation assignment, interventions, andmeasurements for each randomised group

Analysis State estimated effect of intervention on primary and secondaryoutcome measures, including a point estimate and measure ofprecision (confidence interval)

State results in absolute numbers when feasible (for example, 10/20,not 50%)

Present summary data and appropriate descriptive and interferentialstatistics in sufficient detail to permit alternative analyses andreplication

Describe prognostic variables by treatment group and any attemptto adjust

Describe protocol deviationsDiscussion State specific interpretations of study findings, including sources of

bias and imprecision (internal validity) and discussion of externalvalidity, including appropriate quantitative measures whenpossible

State general interpretation of the data in light of the availableevidence

Source: Modified from Begg et al.27 and Moher et al.28

EVIDENCED-BASED PRACTICE

There is considerable current enthusiasm forevidence-based health care in general and mentalhealth care, but there is a current debate onhow evidence-based practice can be consistently

implemented in routine settings both in the UK29

and abroad.30,31

Evidence-based practice is the delivery ofinterventions for which there is strong scientificevidence that they improve relevant patientoutcomes. Although the type of scientific

Page 278: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 281

evidence does vary, the gold standard for treat-ment outcome is the RCT. Where several trialsexist they can be considered together throughmeta-analysis. Knowledge concerning evidence-based practice accrues through the accumulatingresults of efficacy and effectiveness studies.

Thus the purpose of evidence-based practice is(a) to ensure that the wealth of research evidenceinforms clinical practice so that those who arein receipt of treatment will receive the treatmentthat is the best available and represents thecurrent knowledge base, and (b) to ensure thatplanning and policy is determined by empiricalevidence, for those purchasing services to be ableto make informed choices and for those receivingservices to be empowered by such knowledge.Furthermore, the establishment of an evidence-based practice knowledge base of what worksallows the practice of mental health servicesand individual clinicians to be compared tothe evidence base. This increases accountabilityand establishes guidelines for good practice andimproves the quality of mental health services.Limitations in evidence also set the researchagenda for the future.

There are, however, critics of the colla-tion of data for evidence-based practice. Thismainly focuses around the use of specific meta-analytic techniques that have very limited entrycriteria. The Cochrane database, for example,provides valuable searches and evaluations ofrandomised control trials with strict criteria forentry. Although the evidence may be strong fora particular practice it may be based on a verysmall number of studies. The main criteria forexclusion are the lack of randomisation of theparticipants within the trial and the lack of dataon all those participants who entered the trial.Although clearly the results of such trials shouldbe less weighted in the final evaluation, suchinformation may be valuable when few other dataare available.

TRIALS METHODOLOGY AND AIMS

Efficacy trials are devised to test whether the ther-apy has an effect overall on the outcomes of inter-est. They are carried out in relatively controlled

environments, usually by sophisticated universitybased research teams, and often involve highlyexpert therapists. For CBT trials the outcomesof interest are a reduction in overall psychoticsymptoms, reductions in relapse or reduced rateson admission to hospital, reduced psychopathol-ogy and improvements in functioning. These tri-als may also include various control groups andprocess measures to help understand why thetreatments work. An effectiveness trial attemptsto more closely resemble the real world of rou-tine services, inclusion criteria are wider sothe sample treated is more heterogeneous andincludes the atypical patients, and the therapistsare recruited from the routine services. The mea-sured outcomes are reduced to the minimum andtend to be gross measures that are clinically sig-nificant such as relapse or hospital admission;health economic measures to assess cost are alsodesirable. In special cases an equivalence trialmay be designed, in which a new treatment isexpected to match the clinical efficacy of anestablished treatment but may have other bene-fits, for example in terms of acceptability or cost.These trials have special methodological featuresthat distinguish them from simple comparativetrials.32

PARTICIPANT SAMPLES

Recruitment Bias

Figure 18.3 shows how the patient flow in a studyshould be described. The box of particular inter-est is the one at the very top that describes thosewho have been assessed for eligibility. In orderto prevent bias in recruitment the best methodfor ascertaining samples of patients for a trialis to recruit them from a cohort of patients incontact with a service that covers a geographicarea (as in Tarrier et al.15 and Lewis et al.17).This ensures that the people who are in the trialdo represent those who have the disorder. Inthe UK it is largely assumed that those patientswith schizophrenia in contact with the serviceswill represent those with the disorder requiringclinical intervention. For example, Tarrier et al.15

Page 279: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

282 TEXTBOOK OF CLINICAL TRIALS

Assesssed foreligibility (n = …)

Randomised (n = …)

Excluded (n = …)

Not meeting inclusion criteria (n = …)

Refused to participate (n = …)

other reasons (n = …)

Allocated to intervention (n = …)

Received allocated intervention (n = …)

Did not receive allocated intervention (give reasons) (n = …)

Analysed (n = …)

Excluded from analysis (give reasons) (n = …)

Lost to follow up (n = …)(give reasons)

Discontinued intervention(n = …) (give reasons)

Allocated to intervention (n = …)

Received allocated intervention (n = …)

Did not receive allocated intervention (give reasons) (n = …)

Analysed (n = …)

Excluded from analysis (give reasons) (n = …)

Lost to follow up (n = …)(give reasons)

Discontinued intervention(n = …) (give reasons)

En

rollm

ent

Allo

cati

on

Fo

llow

up

An

alys

is

Figure 18.3. Participant recruitment and flow

screened all patients who might have a diagno-sis of schizophrenia in a number of NHS trusts,selected those who achieved predetermined crite-ria and examined their notes further. All putativecandidates following this procedure were inter-viewed to ascertain whether they satisfied theentry criteria. This method has been used as agold standard and other trials of CBT have usedthe data from Tarrier et al.15 to compare withtheir study sample in order to conclude that their

sample was representative (see Sensky et al.16).What a comparison of samples allows is justthat–if the samples are similar then the resultsof the trials can be usefully compared, but thisinformation cannot be used as evidence of samplerepresentativeness.

Convenience samples which recruit from clinicattenders or, even more problematically, patientsreferred to the project by their clinicians are atrisk of selection bias. The referrer may only select

Page 280: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 283

those possible participants who they view asgood candidates for the treatment or converselypatients who are difficult or treatment refractory.Recruitment of referred patients is unfortunatelythe norm.14,16 Even though it may be possibleto compare the recruited sample to the wholepopulation of patients who may be eligible interms of socio-demographic and clinical servicecontact, this will not be enough data to ruleout a systematic bias. In the treatment of panicdisorder, Klein33,34 has argued vigorously that incomparisons of psychotherapy verses drug, a pillplacebo–drug comparison is necessary to ensurethat the sample is not atypical since the efficacyof the drug (in this case imipramine) is wellestablished. This is largely an argument abouthow representative or typical any sample is, givena reliance on convenience samples.

Selection

There are a number of different factors that needto be considered as part of the recruitment pro-cess such as service delivery system, academicsupport, socio-economic status of the area andgeography (urban, suburban and rural area). It isunlikely that these will have a specific interac-tion with the outcome from therapy, but as thesefactors will affect the generalisation of the trialresults it is probably important for the sample torepresent a variety.

But ethnicity and cultural mix may potentiallyaffect therapy outcomes. As we know very littleabout how to target psychological therapy todifferent cultural groups, it seems reasonableto start investigating a new treatment with aculturally homogeneous group and in later trialsmodify to accommodate cultural diversity, ifsuch modification would be a requirement ofeffectiveness in cultural subgroups.

Diagnosis

In psychological therapies, especially in the fieldof psychosis, there has been a dilemma aboutwhether to adopt medical diagnosis as entry cri-teria to studies. Some clinical psychologists (e.g.

Bentall et al.35) would prefer the adoption ofsymptomatic entry criteria as schizophrenia is aterm covering a group of people with a wide vari-ety of abnormal experiences. So some trials haveas their entry criteria a specific symptom experi-enced as distressing rather than membership of asingle diagnostic category.11,36 However, even inthese studies some patients were excluded on thebasis of diagnosis because of not fulfilling othercriteria (see below), and it was certainly the viewof one of the authors (TW) that in feasibility stud-ies of group CBT some patients with diagnosesother than schizophrenia, e.g. personality disor-der or bipolar affective disorder, did not respondin similar ways to the patients with diagnosesof schizophrenia or schizoaffective disorder. Cur-rent CBT studies have generally included patientsfrom the schizophrenia spectrum and it is cer-tainly the view of some CBT therapists that thetype of therapy offered to people with bipolaraffective disorder is different from that designedfor schizophrenia.37

Even when diagnosis is used there are toomany different systems to choose from (e.g.clinical case note diagnosis, research diagnoses(RDC), DSMIV, ICD10, etc.). The choice of adifferent system will change the characteristicsof the sample. For instance, if people are drawnon RDC criteria they will not necessarily be aschronic as those fulfilling the DSMIV criteria.

Exclusion Criteria

As well as criteria for inclusion into trials moststudies also exclude people on the basis ofspecific issues. In trials of psychological therapyfor psychosis one usual criterion is that thepeople who enter the trial are those whosesymptoms have remained despite adequate dosesof medication. The group chosen on this basisis extremely chronic and refractory and providesan extremely stringent test of the efficacy ofpsychological treatment.

A further thorny issue is that of co-morbidsubstance abuse. Most studies will exclude indi-viduals when the abuse is severe, but the cri-teria for severity are rarely set out clearly so

Page 281: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

284 TEXTBOOK OF CLINICAL TRIALS

that it is impossible to compare between trials.Patients who are recruited from inner city areasare unlikely to be free of recreational drug use.A small consumption of cannabis may not affectthe therapy efficacy, but it is not clear whetherany use of class A drugs affects the therapeu-tic effect of psychological treatment. A morerecent trial has been designed to test the effi-cacy of CBT and family intervention to treatdual diagnosis patients (those diagnosed as suf-fering from schizophrenia and substance abuse)in which the substance abuse is thought toincrease the risk of poor outcomes in the primarydisorder.18 In this case severe substance abusewas an entry criterion.

Again some people may have a co-morbidorganic condition such as epilepsy that may war-rant exclusion, although most trials again wouldevaluate whether the organic condition is primar-ily responsible for the symptoms of the disorderwhich they are trying to alleviate. Deterioratingbrain disorders such as Alzheimer’s disease maybe a reasonable exclusion criterion as CBT relieson the carry-over of changes in one session tosubsequent sessions. Similarly, people who havelearning disabilities may also have some difficul-ties with CBT as it is currently devised, althoughtherapists have extended treatment for depressionto the learning disabilities field. Current trials alsodo not support the idea that lower IQ preventstherapeutic changes.38 But all current trials dohave a lower cut-off for IQ, usually around 65.

Drop-out or Lost to Follow-up

Two main issues affect the inferences about thetrial results. The first is the effect of those peoplewho drop out of the therapy and the secondis those people who are lost to contact at anystage of the trial. Different systems of dealingwith drop-outs can be adopted. Some systemsassume that the person would not have changedat all since leaving the trial (LOCF), but thisapproach has its problems.39 But assuming that thegroup who drop out would have performed in thesame way as those who remained also producesdifficulties. Drop-outs may be those people who

might never have achieved any change followingtherapy. Clearly if a treatment produces high levelsof drop-outs this might imply something about theacceptability of treatment. A precise description ofdrop-outs is required but, from the trials submittedso far this is missing in all but a few cases.

More research on drop-outs is clearly required.But in the area of mental health in particular, theresearch is difficult, if not impossible, to carryout. The new guidelines on research governance40

do not allow for the harassing of people whohave dropped out of trials for their reasonsfor dropping out or for data on their currenthealth status. However, it is not only of interestacademically, as it provides some information onthe veracity of the theory underlying the disorder,but also essential to inform the health services.For example, Tarrier et al.41 reported that patientswho dropped out of treatment tended to be male,unemployed and unskilled, single, with a lowlevel of educational attainment and a low pre-morbid IQ. They had a lengthy duration of illnessalthough at the time of discontinuation they werenot severely ill and functioned at a reasonablelevel. They were likely to be paranoid but notsuspicious of the therapist. They were unlikelyto be grandiose. They did not understand therationale for therapy or the potential for benefitbut feared it could make them worse.

It is not clear whether it is appropriate orethical to collect personal information that iskept for routine monitoring purposes for a personwho has dropped out of a trial. This informationmay consist of health service contacts kept onhealth care databases such as case notes aswell as information from third parties. For trialsinvolving people with severe disorders, third-party information from key workers is nearlyalways included as part of the measurement ofoutcome. The lack of data on drop-out may affectthe relevance and benefit of the trial results to thewider community. It may therefore be unethicalnot to collect as much of it as possible. Theinterpretation of new legal rights such as the newUK Human Rights Act should make the positionof researchers clearer, but it is also possible thatthe idiosyncratic interpretations made by local

Page 282: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 285

Research Ethics committees will lead to furtherconfusion in this already complicated area.

PLATFORM AND ORDNANCE

A naval military analogy between the vehicleof delivery, the platform (e.g. battleship, frigate,etc.) and what is delivered, the ordnance (e.g.shell, missile, etc.) is helpful in understandingthe difference between service organisation andtherapy.42 In terms of this analogy the platformwould be aspects of the mental health service,such as assertive outreach, case managementand so on; whereas ordnance would consistof different types of therapeutic intervention,such as CBT and family interventions. Thisdistinction is useful in clarifying what is beingtested. For example a trial of different serviceorganisations (platforms) would be the UK 700trial,43 in which 708 psychotic patients in fourcentres were randomly assigned to standard orintensive case management. In this trial the onlyspecific difference between the two trial limbswas the number of patients the case managershad in their case loads. No investigation wasmade about the therapeutic input that the casemanagers implemented. The results indicated thatthere was no advantage in clinical or socialoutcomes of intensive case management. Incontrast there are examples of therapy trials inwhich a comparison was made between CBTplus routine care, supportive counselling plusroutine care, and routine care alone for chronicpatients15 and acute patients.17 In these trialspatients are recruited across a number of sites sothat variations in routine care and service deliveryare accommodated. It may be questioned whethertrials of services are of much value if they donot include effective therapies. A battleship isunlikely to perform well in a naval engagementfiring a bow and arrow!

BACKGROUND SERVICES

The background mental health services and theiraccessibility may affect trials in a number of

ways. Recruitment may be affected by what ser-vices are already available and who has accessto them. For example, recruitment is likely tobe different if there is free universally availablehealth care provided by a service committed toresearch and development. A large proportion ofthe population will use this service and poten-tially be available for recruitment and eligible forthe trial. This is essentially what happens in theUK National Health Service. This case is verydifferent when health care is provided, funded byreimbursements in a fragmented manner to cer-tain groups of the population by private serviceswho are unlikely to be committed to research. Inthis case the proportion of the population avail-able for recruitment will be much reduced andbiased towards certain subgroups who may, forwhatever reasons, have no access to private care.Here recruitment is likely to be highly selectiveand potentially biased. The provision of differentservices to different income groups or other popu-lation subgroups mitigates against representative-ness of trial populations in the USA, Australiaand some European countries44,45 and compro-mises the value of such trials.

RANDOMISATION

Purpose

To give an equivalent chance of a recruit beingin any of the groups in the trial design. Someresearchers think that one of the purposes isto balance the groups on every factor thatmay be relevant to the treatment response, butpurely random allocation will not provide suchmatching. If there is strong evidence that aparticular factor may affect the outcome thenthis should be included as a factor in theanalysis. In the past researchers have said theyhave provided evidence of the equivalence oftheir samples in analyses of pre-treatment groupcomparisons and on the basis of finding nostatistically significant differences on factorspertinent to the treatment response have thennot included these factors in their analyses oftherapeutic outcome. The current advice is notto carry out such pre-treatment comparisons

Page 283: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

286 TEXTBOOK OF CLINICAL TRIALS

but to include pertinent factors in the outcomeassessment. However, there is a need for a cleardescription of the people who dropped out oftreatment in relation to those who remained asthis may bias the interpretation of the results.

For studies of psychological treatments in psy-chosis the most relevant factors are listed below.

• The chronicity of the illness, measured inmonths or years since first diagnosis, is likelyto affect treatment outcome because it is wellknown in the field of psychiatry that thosewith longer illnesses may be less likely tochange. Tarrier et al.15 found that this modestlypredicted treatment outcome in an intensiveCBT trial.

• The duration of untreated psychosis (DUP) isthe time spent experiencing symptoms priorto the diagnosis and treatment of the disor-der. Several studies suggest that DUP affectsthe success of other treatments, particularlymedication, and it may be that this is also afactor in the efficacy of psychological disor-ders.

• The severity of the symptoms has been shownto affect treatment outcome.15,38 In the Lon-don–East Anglia trial the best outcomes fol-lowing CBT were found for those people whosaid they were not absolutely certain that theirdelusions were true. The effect of this samefactor was also alluded to several years agoin a small trial by Watt et al.46 But, althoughthe outcome at post-treatment was affected bythis factor there was no measurable influenceat follow-up nine months after the end oftherapy.47

• Gender was investigated by Gould et al.25 intheir meta-analysis of CBT trials. They foundno relationship between effect size and theproportion of men in the trial but as theythemselves point out no data were available forthe specific outcomes for men and women thatprecludes a definitive evaluation. However,young men are usually thought to have a pooreroutcome and are more likely to drop out.41

• Intellectual status has been suggested bycritics of psychological therapy to be a bar

to significant treatment effects. Although onerecent study has not found this to be true,38

trialists should consider this factor in theiranalysis if only to counter such criticisms.

• Interactions with other treatments need to beconsidered where these are variables within thegroup of patients entered into the trial. Themost pertinent for CBT studies is the issue ofthe use of medication. Medication is now oftendivided into two main types, typical antipsy-chotics which have been available for a numberof years and atypical antipsychotic medicationswhich have become available recently. Mostpublished CBT studies were carried out beforethe wide availability of these newer medica-tions. However, medication was not a predic-tor of outcome in Garety et al.38 or Tarrieret al.15 Kuipers et al.47 comment that in theirCBT group, because symptomatic improve-ment would be achieved, these patients wouldbe less likely to be prescribed clozapine (anatypical antipsychotic) and would generally beprescribed lower doses of medication. Thesepredictions, although only a trend towards sig-nificant, were shown in their data. Pinto et al.48

chose their sample on the basis of a failure of atleast two medications to reduce positive symp-toms. In their study, which was not method-ologically strong, the effect size was extremelylarge with the combined effect of CBT andthe new medication (clozapine) producing aneffect size of 2.18.49 This suggests that med-ication should be taken into account in ran-domisation or at least in subgroup analyses.

Entry to the study may be stratified if thevariable has a known interaction with treatmentor the variable can be used as a covariate in theanalyses. Being clear about which variables mayinteract with psychological treatment is essentialat the outset of the trial because the trial must bedefined and have a sufficiently large sample to beadequately powered to test for these effects.

Details

Details of the process of randomisation must besupplied in the paper. For instance in Kuipers

Page 284: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 287

et al.14 a randomised permuted blocks allocationwas adopted in each centre which contributedparticipants to the trial. Other studies withmultiple centres (e.g. Sensky et al.16) randomisedparticipants at each centre, this they then arguedallows them to control for within-centre effectsand allows them to test between-centre effects oftreatment efficacy.

Blindness

In clinical trials blindness usually refers totwo aspects of the trial. The first relates tothe allocation of participants to the differenttreatment limbs so that the allocation process isindependent and concealed from those involvedin the assessment or treatment. This preventspeople from choosing who to put into the trialon the basis of the patient’s own preference,resulting in more enthusiastic people being inthe treatment arm, or the research worker’spreference which may result in those with morefavourable prognoses being allocated to theexperimental treatment.

The second use of the word blindness relatesto the concealment of which treatment theparticipant received from those involved withassessment, especially of outcome. This is anextremely important issue and one that is difficultto ensure. The aim is to prevent any bias,conscious or otherwise, entering the assessmentprocess through knowledge of which treatmentthe participant received. For example, knowledgethat the hypothesis to be tested was that CBTwould be better that treatment-as-usual becauseprevious studies had demonstrated this may biasan assessor to rating the patient as more improvedif they knew the patient had received CBT.

The importance of adequate concealment wasdemonstrated in a study by Moher et al.50 whoexamined the quality of concealment in treatmenttrials in circulatory and digestive disease, mentalhealth, obstetrics and childbirth. The examinedtrials had already passed a number of qualityassessments and been included in a numberof meta-analyses. They found that trials withpoorer quality blinding were associated with

an increased estimate of benefit of 34%. Thisreplicated a similar earlier finding of Schulzet al.51 who also reported exaggerated treatmentefficacy of 30–40% in trials with inadequateconcealment. There is good evidence that thepoorer the trial methodology the better, and moreinflated, the treatment results obtained.

The assessment and treatment procedures mustbe separate and independent, in other words theperson who carries out the assessment should bedifferent from the person who delivers the treat-ment. This is not always the case in published tri-als, for example Brooker et al.52,53 trained mentalhealth nurses in family intervention and assessedthe effectiveness of the intervention by havingthe nurses perform the assessments. It couldbe argued that any problem of bias could beavoided in cases such as these by performingthe assessments through patient self-report. How-ever, this does not address the problem of socialapproval that may introduce bias where patientsgive results they think their therapist would wantto receive.

Independent assessors and therapists will notensure that assessors remain naive to treatmentallocation. Accidental knowledge of allocationcan be minimised by using separate adminis-trative procedures and geographically separatingtherapists from assessors in terms of office loca-tion and administrative procedure. This shouldprevent assessors bumping into patients aboutto receive therapy and such similar accidents.Patient allocation should be multiply coded sothat learning of one patient’s allocation does notbreak the whole trial code. Patients should beinstructed not to reveal any detail of their treat-ment or who has treated them to the assessors atthe start of the trial and before each assessment.It is unlikely that this will be fool-proof but itwill minimise revelations. See Tarrier et al.15 forfurther details of efforts to maintain blindness ina clinical trial of CBT.

Opinions differ as to whether verificationof maintenance of blindness is desirable. It ispossible to ask assessors to guess the allocationof trial participants. This can be used as evidenceof successful blindness.15,54 Assessors should not

Page 285: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

288 TEXTBOOK OF CLINICAL TRIALS

be informed that they will be asked to guess asthis would prime them to the task. Guessing isless likely to be successful when there are morethan two treatment groups. With two groups, oreven more than two, an assessor could adopt astrategy that patients who improved should bein the experimental treatment group because thiswould be in line with the study hypothesis. If thetrial had been successful this strategy would havebeen correct and the assessor would most likelyhave guessed right in many cases although for thewrong reasons. This would not be an indicationthat the assessor knew of the treatment allocationand was hence biased in their assessment butthat they knew who improved which aided themin guessing group allocation. The problem forthe trial investigators here would be that theirassessors appear not to have been blind. If theassessors were not able to guess correctly usingthis strategy it would probably mean that theexperimental treatment had not been effective andthe trial was a failure anyway. Having assessorsguess allocation holds the investigators hostage tofortune, although with multiple treatment groupsit can be effective in demonstrating blindness.

Even if assessors do maintain blindness totreatment allocation they will still be aware of thetiming of the assessment, pre-, post-treatment orfollow-up. Thus the only way to ensure blindnessof both treatment allocation and assessment timeis to separate the gathering of information fromits rating. Thus all assessment interviews shouldbe audio-taped independently of their ratingand rating should be carried out by a differentassessor who is unaware of the allocation orassessment. This would also allow the audio-tapes to be edited of any accidental revelation

of identifiers. To be successful interviews needto follow a protocol as to the procedure ofthe interview so that adequate and sufficientinformation is available to make ratings. In moststudies of CBT in general and for psychosisin particular, the process for blind allocation israrely described, for example Kuipers et al.14

In contrast, Sensky et al.16 and Tarrier et al.15

both describe the method for ensuring blindnessand the maintenance of allocation of subjectsto groups.

PROTOCOL

Design Protocol

There are various ways of testing whethera particular treatment is efficacious but theaccepted method is to compare the treatment witha placebo control that allows for a comparisonof client expectations of improvement duringtherapy with the active ingredient itself. Wehave discussed above the importance of thesenon-specific factors in psychological treatments.Social contact, social support and the modellingof interpersonal behaviour are all an integralpart of psychological therapy. Tests of theeffectiveness of individual CBT have used avariety of designs. Table 18.2 below gives theoutline of the main recent trials.

There are a variety of designs that will allowthe examination of both the effectiveness andspecificity of the effect of CBT above theeffects found for psychological interventions ingeneral in this group. The results show thatthere are significant effects over treatment-as-usual. There is also one study16 that shows a

Table 18.2. Designs for randomised control trials of CBT for psychosis

Comparison withtreatment-as-usual

Comparison withalternative

‘placebo’ therapy

Comparison with ‘placebo’therapy and

treatment-as-usual

Garety et al.11 Sensky et al.16 Tarrier et al.10

Kuipers et al.14,47 Drury et al.12,13 Tarrier et al.15,73

Barrowclough et al.18 SOCRATES17

Page 286: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 289

difference between CBT and a ‘placebo’ therapy(befriending), but only at follow-up. However,Tarrier et al.15 found a significant differencebetween CBT and TAU but no overall differencebetween the two therapies at any stage of thestudy.55 Analysis of specific symptoms foundthat there was a significant advantage of CBTover supportive counselling in the treatment ofhallucinations.56

Treatment Protocol

It is essential to have a clear and unambiguoustreatment protocol for psychological treatments.However, even when a manual is available itis much harder to evaluate exactly whether theprotocol has been adhered to. In treatments withmedication this process is relatively easy as thedose and timing of the treatment can be veri-fied using simple procedures. For psychologicaltreatment the verification process relies on tapedinterviews of treatment sessions that are thenrated later for fidelity with the treatment pro-tocol. However, there are several problems thatmay interfere with this process. Firstly the patientmust agree to the recording of the session and insome studies, e.g. Chadwick et al.,57 the patientsrefused to have any sessions taped. Once tapedsessions have been collected the independent rat-ing must answer a number of questions:

• Does the session represent the treatment tobe provided? In other words is it possible todifferentiate the experimental treatment fromthe placebo treatment. Sensky et al.16 andTarrier et al.15 were able to show that theirindependent assessors was able to assign 100%and 97% of the tapes rated to the appropriatetreatment arm.

• Is the experimental treatment manual beingadhered to? This requires that the researchershave a specific rating scale that will allow therating of key areas of their treatment. Had-dock et al.58 has developed a rating scale (theCognitive Therapy Scale for Psychosis–CTS-Psy) to assess quality of therapy. This allowsassessment of general (e.g. interpersonal effec-tiveness) and specific (e.g. guided discovery)

aspects of therapy to be assessed. However,the scale allocates equal weighting to all items.There is, as yet, no empirical evidence to sup-port such equal weightings, and it may wellbe, for example, that ‘agenda setting’ is lessimportant than the ‘choice of intervention’.

• Does the progression of therapy cover all thekey topics of the manual? This requires thatseveral sessions of therapy are recorded atdifferent times and that the content of theseis scored for the timing of the interventionsin the treatment programme. This is probablythe most important part of rating CBT trialsbecause although it can be clear how manysessions are provided to a patient it is notclear whether the content given is the same.CBT researchers (personal communications,including one of the authors, NT) observethat some patients are able to travel throughthe whole manual whereas others cover muchless. So although the therapy duration may beequivalent, exposure to the complete protocolcan be different. This dosage of treatment maybe an important factor in defining treatmentoutcome as some patients are clearly gettingmore treatment than others. However, thenumber of treatment sessions is not related toeffect size as presented in Gould et al.25

Progression through therapy may be affected by anumber of factors such as the level of disturbanceor cognitive impairment of the patient. Therapistswill also differ in their ability to progress therapyand their skills in different aspects–determinedby skills, training, profession, trial providedtraining (see Tarrier et al.59). So far in CBTtrials these factors have not been investigated inany detail.

INDIVIDUALISED TREATMENTS

Turkington and Siddle60 claim that a case formu-lation is essential to treating psychotic patientssuccessfully. We do not disagree that a caseformulation is desirable but with psychoticpatients it is not always possible and a purelysymptomatic approach has to be adopted on

Page 287: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

290 TEXTBOOK OF CLINICAL TRIALS

occasions.61 In spite of our support for case for-mulation the evidence from general adult mentalhealth that a case formulation is necessary is poorand the results equivocal. Schulte et al.62 treateda mixed group of 120 phobic patients with a stan-dardised treatment and individualised behaviourtherapy based on functional behaviour analysis.They also included a yoked control group inwhich treatment was based not on the individ-ual’s assessment but that of the yoked patient.The standardised treatment group showed themost improvement and patients who acted asyoked controls improved as well as the otherpatients. Similarly Emmelkamp et al.63 allocated22 obsessive–compulsive patients to either tailor-made cognitive behavioural therapy or standard-ised exposure in vivo therapy. There were nosignificant differences between groups but thegroup sizes were small (n = 11 in each group).Jacobson et al.64 treated 30 distressed maritalcouples with either a manual-based version ofmarital therapy or a clinically flexible versionof the same treatment in which treatment planswere individually based and the number of treat-ment sessions were not specified. Both treat-ments resulted in significant improvements atpost-treatment but at six-month follow-up thecouples treated with the structured format weremore likely to have deteriorated and flexiblytreated couples were more likely to have main-tained their treatment gains. There appears littleadvantage of case formulation-based treatmentover a standard package. This result is not sur-prising given the sample sizes of these stud-ies. Standard treatment programmes are effectivefor a wide range of psychological disorders andeven if an individualised treatment was superiorthe difference in effect sizes will most proba-bly be small and the sample size to significantlydemonstrate such a difference would necessarilybe large. Therefore, the studies that have beendone are massively underpowered. To substanti-ate this Tarrier and Calam65 have estimated thesample sizes required to show significant differ-ences with 80% power and 0.05 significance levelbased on the data provided in the published reportof Emmelkamp et al.63 On the basis of their data

the numbers in each group required to show a sig-nificant difference for the five outcome measureswould be between 25 and in excess of 15 000,with a median of 800 patients in each treatmentgroup. The issue of case formulation-based treat-ment versus protocol-based treatment is unlikelyto be resolved by a direct head-to-head compari-son, which would be too large and costly.

TREATMENT COMPONENTS

CBT treatments also differ in other ways fromeach other. Although they have a basic set ofingredients the emphasis may be placed differ-ently. For example, the different emphases onbehavioural activation and cognitive schema inthe changes in thinking thought to be the causeof the treatment effect. Changing behaviour canhave an effect on thinking as studies of CBT forpanic disorder have discovered.22 Patients in theClark study were treated with behavioural acti-vation programmes that are embedded in CBT.They showed that the prediction of outcome wasdependent on one main factor, cognitions abouttheir bodily sensations. The behavioural experi-ments seemed to have an effect on cognition. But,other groups in the field of psychosis emphasisemore distal stimuli such as the developmentalpath of the delusion. The particular componentof CBT that accounts for most of the variance inoutcome has not yet been differentiated and thesemore subtle differences are not used in meta-analyses of the treatment studies. Figure 18.4shows the effect sizes taken from Gould et al.25

on a scale devised by ourselves on the amount ofbehavioural activation that the treatment empha-sises. As can be seen from the graph the effectsize is increased when more behavioural activa-tion is included. It may be that a simple change inbehaviour via a behavioural experiment may pro-vide enough evidence to reduce delusional con-viction. For instance Birchwood and Chadwick66

suggest that the perceived powerfulness of anauditory hallucination directly predicts the dis-tress experienced. Adopting one successful cop-ing strategy may provide enough evidence toreduce the perceived power of an auditory hal-lucination and increase the amount of perceived

Page 288: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 291

00.10.20.30.40.50.60.70.8

Sensky00

Kuipers97

Garety94

Tarrier98

Tarrier93

Amount of behavioural activation in CBT

Effe

ct s

ize

from

Gou

ld e

t al.

(200

1)

Effect size

Figure 18.4. How much ‘B’ in CBT for psychosis

control the patient has over their symptoms.Wykes et al.67 provides some evidence of thisrelationship in a waiting list control trial of groupCBT for auditory hallucinations. If this is truethen successful behavioural experiments shouldalways be included and should predict successfultreatment outcome. As yet no study has attemptedto measure this process.

Turkington and Siddle60 maintain that cogni-tive therapy with psychotic populations will resultin long-term improvement because it involvesschema change whereas cognitive behaviour ther-apy (as carried out by Tarrier et al.15) will resultin short-term change only. They go on to saythat ‘schema change seems vital in terms ofthe durability of any achieved benefits’ (p. 302).However, there is little evidence that schema arecausative in psychosis or that change is importantfor treatment effects (see Figure 18.4). The directtransport of Beck’s model of depression to psy-chosis in the absence of evidence for its explana-tory value in this population has been criticised.68

The evidence for the effect of schema work onoutcome is anyway sparse even in the area ofdepression for which it was designed. Jacobsonrandomised 150 people to three treatment arms:(i) behavioural activation, (ii) behavioural activa-tion and work on dysfunctional thoughts, and(iii) total CBT with work on cognitive schema.The results of this trial showed no differencesin outcome either at post-treatment or follow-up between the three groups. The effect of theother components of CBT was no different to

the effects of behavioural activation alone (seeFigure 18.5).

OUTCOMES

Current thinking from methodologists on trialsin psychiatry is that designs should be simplifiedand that outcome measures should be kept to aminimum. In particular the use of rating scalesshould be restricted to ‘one or two which arebest understood’ (Johnson,69 p. 229). Follow-upshould be carried out on ‘few occasions ratherthan many’ and entry criteria should be as broadas possible.69 This advice is rather in conflictwith that given previously which suggests that,in the treatment of psychosis, multiple outcomeswhich reflect the complex nature of the disorderand its effects should be used70 and that data onthe process of therapy are essential. The multipleeffects of therapy may, but not necessarily, havea common outcome in relapse prevention or totalsymptoms. For instance cognitive therapy shouldaffect cognition, CBT should affect cognitionsand behaviour, and family interventions shouldaffect families’ interactions. All these effectscould in some way change the outcome of thedisorder but via multiple pathways. Because ofthese multiple effects, multiple outcomes may berecommended in order to differentiate the routeto effectiveness in explanatory trials.

All current trials measure overall symptoms.However, they all do so in different ways andeven when the measure appears to be a standard

Page 289: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

292 TEXTBOOK OF CLINICAL TRIALS

0

5

10

15

20

25

30

35

Pre Post 6 monthfollow-up

Behaviouralactivation

Activation andAutomatic thoughts

Complete CBT,including schemawork

Source: Data from Jacobsen et al.75

Figure 18.5. Evidence of the effect of schema work in depression

measure (e.g. BPRS71) it is often adapted. Forinstance Kuipers et al.14 added items to the BPRSmaking it difficult to compare their results withothers adopting the conventional version of thesame instrument. The use of non-standard instru-ments to measure awareness of stigma, copingskills, etc. prevents comparisons being made anddoes need replication with standardised ratingscales with no known psychometric properties.

In all medical trials a statistically significantdifference in outcome may provide little benefitto the patients. What needs to be defined for trialsof CBT for psychosis is the clinical significanceof outcomes. This was alluded to earlier in thischapter. Clinically significant outcomes may bereductions in the distress associated with thedisorder. Currently clinical significance is definedas the sorts of improvements that are achievedin drug trials–20% change in symptoms. Thismay be a low threshold for what could beachieved through psychological therapy. Manytrials of psychological therapy adopt only astatistically significant test of effectiveness, butTarrier et al.15 and Sensky et al.16 adopt a 50%change criterion for their measures, althoughKuipers et al.14 use the lower threshold of 20%.Where trials use such a threshold of achievingclinical significance or not, comparisons can bemade by comparing the Numbers Needed to Treat(NNT), which represents the number of patientsthat need to be treated to achieve one clinicallysignificant outcome.72

CONCLUSIONS

Because psychological therapy has no productchampion as found in the drug industry, prag-matic trials are needed initially to convince peo-ple that therapy is worthwhile. However, theseneed to be followed by explanatory trials thatcan establish the specificity of the treatment. Cur-rently the trials in the field of psychosis havemainly been pragmatic and these have shown thatthe therapy is worthwhile with improvements inpositive and negative symptoms at post-treatmentand follow-up in some studies. However, the tri-als that have been designed to test the specificityof treatment have not been so successful. Veryfew differences emerge between CBT for psy-chosis and alternative therapies are shown at post-treatment. Of the two studies with long follow-ups one showed an advantage for CBT over thealternative therapy16 but the other showed equiv-alent benefits of both therapies (CBT and sup-portive counselling) over routine care.55,73 Oneresolution of this conflict may be the nature ofthe therapy chosen. In the Sensky et al.16 studythe comparison condition was befriending whichis described as a therapy with equivalent amountsof therapist contact but where the content was adiscussion focusing on neutral topics such as hob-bies, sports and current affairs where the therapistis instructed to be empathic but non-directive.Even this therapy was associated with reductionsin symptoms at the end of therapy that may be

Page 290: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 293

a testament to the paucity of the social contactand lack of warm relationships of people withcontinuing active psychosis. However, the effectsof this therapy were not sustained to a follow-upwhere the patients in the comparison therapy con-dition actually got worse. In the Tarrier et al.15

study the comparison therapy chosen was sup-portive counselling in which the therapist triedto achieve a supportive relationship which fos-tered rapport and provided emotional support.This therapy proved to be successful in reduc-ing symptoms over the course of the therapy andfollow-up. This result suggests that some of theessential ingredients of CBT encompassed withinthe counselling framework may be either sharedwith supportive counselling or are as effectiveas these other ingredients. It is of course possi-ble that within the model of schizophrenia whichencompasses a vulnerability stress model that thetwo forms of therapy may work via differentpathways. Supportive counselling may work byemphasising self-esteem through rapport withinthe therapeutic relationship. Unlike befriendingthis support produces more stable changes thatare durable to follow-up. However, it should benoted that supportive counselling did significantlyworse at treating hallucinations when comparedto CBT on this symptom alone.56

Currently there is no evidence that cognitivebehaviour therapy works via a cognitive system,although training in coping skills has been shownto improve coping.74 None of the studies haveyet produced analyses showing that the cognitivechange established during therapy is the keyto later improvement. In fact few have evenprovided analyses which test this possibility.Because of the complex nature of the aetiologyof psychosis with its multiple causal processes itmay be impossible to identify a single route tochange. There may be many idiosyncratic routesthat will only be established in large trials withseveral hundred patients included. One such trialis taking place in Manchester, the Socrates trial.17

In this trial hundreds of acute patients who are atthe beginning of their illness have been providedwith therapy and followed up over a long period.It is only through these sorts of studies that it

will be possible to establish routes to change thatcan then inform the development of therapy. It isonly by these later developments that it will bepossible to develop training and therefore providelarger numbers of people with psychosis witheffective psychological therapy.

REFERENCES

1. Wiersma D, Nienhuis FJ, Sloof CJ, Geil R. Nat-ural course of schizophrenic disorders: a 15 yearfollow-up of a Dutch incidence cohort. SchizophrBull (1998) 24: 75–85.

2. British Psychological Society. Recent Advances inUnderstanding Mental Illness and Psychotic Expe-riences. Leicester: British Psychological Society(2000).

3. Fowler D, Garety P, Kuipers E. Cognitive therapyfor psychosis: formulation, treatment, effects andservice implications. J Mental Health (1998) 7:123–33.

4. Tarrier N, Barrowclough C, Vaughn C, Bam-rah JS, Porceddu K, Watts S, Freeman HL. Thecommunity management of schizophrenia: a con-trolled trial of a behavioural intervention with fam-ilies. Br J Psychiat (1988) 153: 532–42.

5. Cunningham-Owens DG, Carroll A, Fattah S,Clyde Z, Coffey I, Johnstone EC. A randomised,controlled trial of a brief educational packagefor schizophrenic outpatients. Acta Psychiat Scand(2001) 103: 362–9.

6. Beck AT. Successful outpatient psychotherapywith a schizophrenic with delusion based onborrowed guilt. Psychiatry (1952) 15: 305–12.

7. Tarrier N. Psychological treatment of schizo-phrenic symptoms. In: Kavanagh D, ed., Schizo-phrenia: An Overview and Practical Handbook.London: Chapman & Hall (1992).

8. Tarrier N. Modification and management of resid-ual psychotic symptoms. In: Birchwood M, Tar-rier N, eds, Psychological Approaches to Schizo-phrenia: Innovations in Assessment, Treatment andServices. Chichester: John Wiley & Sons (1992).

9. Haddock G, Tarrier N, Spaulding W, Yusupoff L,Kinney C, McCarthy E. Individual cognitive-behaviour therapy in the treatment of schizophre-nia: a review. Clin Psychol Rev (1998) 18:821–38.

10. Tarrier N, Beckett R, Harwood S, Baker A,Yusupoff L, Ugarteburu I. A trial of two cognitivebehavioural methods of treating drug-resistantresidual psychotic symptoms in schizophrenicpatients. I: Outcome. Br J Psychiat (1993) 162:524–32.

Page 291: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

294 TEXTBOOK OF CLINICAL TRIALS

11. Garety P, Kuipers E, Fowler D, Chamberlain F,Dunn G. Cognitive-behaviour therapy for drugresistant psychosis. Br J Med Psychol (1994) 67:259–71.

12. Drury V, Birchwood M, Cochrane R, Macmil-lan F. Cognitive therapy and recovery from acutepsychosis. I. Impact on psychotic symptoms. Br JPsychiat (1996) 169: 593–601.

13. Drury V, Birchwood M, Cochrane R, Macmil-lan F. Cognitive therapy and recovery from acutepsychosis. II. Impact on recovery time. Br J Psy-chiat (1996) 169: 602–7.

14. Kuipers E, Garety P, Fowler D, Dunn G, Beb-bington P, Freeman D, Hadley C. London–EastAnglia randomised controlled trial of cognitive-behavioural therapy for psychosis: 1. Effects ofthe treatment phase. Br J Psychiat (1997) 171:319–27.

15. Tarrier N, Yusupoff L, Kinney C, McCarthy E,Gledhill A, Haddock G, Morris J. A randomisedcontrolled trial of intensive cognitive behaviourtherapy for patients with chronic schizophrenia.Br Med J (1998) 317: 303–7.

16. Sensky T, Turkington D, Kingdon D, Scott JL,Scott J, Siddle R, O’Carroll M, Barnes TRE. Arandomised controlled trial of cognitive-behavioraltherapy for persistent symptoms in schizophreniaresistant to medication. Arch Gen Psychiat (2000)57: 165–72.

17. Lewis SW, Tarrier N, Haddock G, Bentall R,Kinderman P, Kingdon D, Siddle R, Drake R,Everitt J, Leadley K, Benn A, Glazebrook K,Haley C, Akhtar S, Davies L, Palmer S, Fara-gher B, Dunn G. Randomised controlled trial ofcognitive-behaviour therapy in early schizophre-nia: acute phase outcomes. Br J Psychiat (2002)181(Suppl 43): 91–7.

18. Barrowclough C, Haddock G, Tarrier N, Lewis S,Moring J, O’Brian R, Schofield N, McGovern J.Randomised controlled trial of motivational inter-viewing and cognitive behavioural interventionfor schizophrenia patients with associated drugor alcohol misuse. Am J Psychiat (2001) 158:1706–13.

19. Barrowclough C, Tarrier N, Sellwood W, Quinn J,Mainwaring J, Lewis S. A randomised controlledeffectiveness trial of a needs based psychosocialintervention service for carers of schizophrenicpatients. Br J Psychiat (1999) 174: 506–11.

20. Horvath A, Symonds D. Relationship betweenworking alliance and outcomes in psychotherapy:a meta-analysis. J Counsel Psychol (1991) 38:139–49.

21. Wampold BE, Mondin GW, Moody M, Stich F,Benson K, Ahn H. A meta-analysis of outcomestudies comparing bona fide psychotherapies:

empirically, “all must have prizes”. Psychol Bull(1997) 122: 203–15.

22. Clark DM, Salkovskis PM, Hackmann A, Mid-dleton H, Anastasiades P, Gelder M. A compar-ison of cognitive therapy, applied relaxation andimipramine in the treatment of panic disorder. BrJ Psychiat (1994) 164: 759–69.

23. Paykel ES, Scott J, Teasdale JD, Johnson AL,Garland A, Moore R, Jenaway A, Cornwall PL,Hayhurst H, Abbott R, Pope M. Prevention ofrelapse in residual depression by cognitive ther-apy. Arch Gen Psychiat (1999) 56: 829–35.

24. Yammey G. Psychologists question debriefing fortraumatised employees. Br Med J (2000) 320: 140.

25. Gould RA, Mueser KT, Bolton E, Mays V,Goff D. Cognitive therapy for psychosis in schizo-phrenia: an effect size analysis. Schizophr Res(2001) 48: 335–42.

26. Cohen J. Statistical Power Analysis for the Behav-ioral Sciences. Hillsdale, NJ: Lawrence Erlbaum(1988).

27. Begg C, Cho M, Eastwood S, Horton R, Moher D,Olkin I, Pitkin R, Rennie D, Schulz KF, Simelk D,Stroup DF. Improving the quality of reportingof randomized controlled trials: The CONSORTStatement. J Am Med Assoc (1996) 276: 637–9.

28. Moher D, Schulz K, Altman D. The CONSORTStatement: revised recommendations for improv-ing the quality of reports of parallel grouprandomisation. J Am Med Assoc (2001) 285:1987–91.

29. Geddes J, Reynolds S, Streiner D, Szatmari P,Haynes B. Evidence-based practice in mentalhealth. Evidence-Based Mental Health (1998) 1:4–5.

30. Barlow DH. Health care policy, psychotherapyresearch and the future of psychotherapy. AmPsychol (1996) 51: 1050–8.

31. Drake RE, Goldman HH, Leff HS, Lehman AF,Dixon L, Mueser KT, Torrey WC. Implementingevidence-based practices in routine mental servicesettings. Psychiat Serv (2001) 52: 179–82.

32. Jones B, Jarvis P, Lewis JA, Ebbutt AF. Trials toassess equivalence: the importance of rigorousmethods. Br Med J (1996) 313: 36–9.

33. Klein DF. Preventing hung juries about therapystudies. J Consult Clin Psychol (1996) 64: 81–7.

34. Klein DF. Discussion of ‘methodological contro-versies in the treatment of panic disorders’. BehavRes Ther (1996) 34: 849–53.

35. Bentall RP, Jackson HF, Pilgrim D. Abandoningthe concept of schizophrenia: some implications ofvalidity arguments for psychological research intopsychotic phenomena. Br J Clin Psychol (1989)27: 303–24.

Page 292: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

COGNITIVE BEHAVIOUR THERAPY 295

36. Wykes T, Parr A-M, Landau S. Group treatmentfor auditory hallucinations – a waiting list con-trolled study. Br J Psychiat (1999) 175: 180–5.

37. Lam D, Bright J, Jones S, Hayward P, Schuck N,Chisholm D, Sham P. Cognitive therapy for bipo-lar illness – a pilot study of relapse prevention.Cognit Ther Res (2000) 24: 503–20.

38. Garety P, Fowler D, Kuipers E, Freeman D,Dunn G, Bebbington P, Hadley C, Jones S. Lon-don–East Anglia randomised controlled trial ofcognitive-behavioural therapy for psychosis. IIPredictors of outcome. Br J Psychiat (1997) 171:319–27.

39. Everitt BS. The analysis of longitudinal data:beyond MANOVA. Br J Psychiat (1998) 172:7–10.

40. Department of Health. Research GovernanceFramework for Health and Social Care. Depart-ment of Health, UK (2001).

41. Tarrier N, Yusupoff L, McCarthy E, Kinney C,Wittkowski A. Some reason why patients suffer-ing from chronic schizophrenia fail to continue inpsychological treatment. Behav Cognit Psychother(1998) 26: 177–81.

42. Marshall M. A systematic review of the effec-tiveness of acute day hospitals versus admission.Paper presented at The Making Mental Health Ser-vices Effective: Now and Tomorrow Conference,7–8 September 2000, Manchester, UK (2000).

43. Burns T, Creed F, Fahy T, Thompson S, Tyrer P,White I, & the UK 700 Group. Intensive versusstandard case management for severe psychoticillness: a randomised trial. Lancet (1999) 353:2185–9.

44. Wykes T, Tarrier N, Lewis S. Outcome and Inno-vation in the Psychological Treatment of Schizo-phrenia. Chichester: John Wiley & Sons(1998).

45. Tarrier N. What can be learned from clinicaltrials? Reply to Devilly and Foa (2001). J ConsultClin Psychol (2001) 69: 117–18.

46. Watt F, Powell G, Austin S. Modification of delu-sional beliefs. Br J Med Psychol (1983) 46:359–63.

47. Kuipers E, Fowler D, Garety P, Chisholm D,Freeman D, Dunn G, Bebbington P, Hadley C.London–East Anglia randomised controlled trialof cognitive-behavioural therapy for psychosis. IIIFollow-up and economic evaluation at 18 months.Br J Psychiat (1998) 171: 319–27.

48. Pinto A, La Pia S, Mennella R, Giorgio D, DeSi-mone L. Cognitive-behavioral therapy and cloza-pine for clients with treatment-refractory schizo-phrenia. Psychiat Serv (1999) 50: 901–4.

49. Rector NA, Beck AT. Cognitive-behavioral ther-apy for schizophrenia: an empirical review. J NervMent Dis (2001) 189: 278–87.

50. Moher D, Pham B, Jones A, Cook DJ, Jadad AR,Moher M, Tugwell P, Klassen TP. Does qualityof reports of randomised trials affect estimates ofintervention efficacy reported in meta-analyses?Lancet (1998) 352: 609–13.

51. Schulz KF, Chalmers I, Hayes RJ, Altman DG.Empirical evidence of bias: dimensions of method-ological quality associated with estimates of treat-ment effect in controlled trials. J Am Med Assoc(1995) 280: 178–80.

52. Brooker C, Tarrier N, Barrowclough C, Butter-worth A, Goldberg D. Training community psy-chiatric nurses for psychosocial intervention:report of a pilot study. Br J Psychiat (1992) 160:836–44.

53. Brooker C, Falloon I, Butterworth A, Goldberg D,Graham-Hole V, Hillier V. The outcome of train-ing community psychiatric nurses to deliver psy-chosocial intervention. Br J Psychiat (1994) 165:222–30.

54. Basoglu M, Marks I, Livanou M, Swinson R.Double-blindness procedure, rater blindness, andratings of outcome. Arch Gen Psychiat (1997) 54:744–8.

55. Tarrier N, Kinney C, McCarthy E, Humphreys L,Wittkowski A, Morris J. Two year follow-up ofcognitive-behaviour therapy and supportive coun-selling in the treatment of persistent positivesymptoms in chronic schizophrenia. J Consult ClinPsychol (2000) 68: 917–22.

56. Tarrier N, Kinney C, McCarthy E, Wittkowski A,Yusupoff L, Gledhill A, Morris J. Are some typesof psychotic symptoms more responsive to CBT?Behav Cognit Psychother (2001) 29: 45–55.

57. Chadwick P, Sambrooke S, Rasch S, Davies E.Challenging the omnipotence of voices: groupcognitive behavior therapy for voices. Behav ResTher (2000) 38: 993–1003.

58. Haddock G, Devane S, Bradshaw T, McGovern J,Tarrier N, Kinderman P, Baguley I, Lancashire S,Harris N. An investigation into the psychometricproperties of the Cognitive Therapy scale forpsychosis (CTS-Psy). Behav Cognit Psychother(2001) 29: 93–106.

59. Tarrier N, Barrowclough C, Haddock G, McGov-ern J. The dissemination of innovative cognitive-behavioural treatments for schizophrenia. J MentalHealth (1999) 8: 569–82.

60. Turkington D, Siddle R. Improving understandingand coping in people with schizophrenia bychanging attitudes. Psychiat Rehab Skills (2000)4: 300–20.

61. Tarrier N, Haddock G. Cognitive-behaviour ther-apy for the treatment of schizophrenia: a caseformulation approach. In: Hofman SG, ThompsonMC, eds, Handbook of Psychological Treatments

Page 293: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

296 TEXTBOOK OF CLINICAL TRIALS

for Severe Mental Disorders. New York: GuilfordPress (2001).

62. Schulte D, Kunzel R, Pepping G, Schulte-Bahrenberg T. Tailor-made versus standardisedtherapy of phobic patients. Adv Behav Res Ther(1992) 14: 67–92.

63. Emmelkamp PMG, Bouman TK, Blaauw E. Indi-vidualised versus standardised therapy: a com-parative evaluation with obsessive–compulsivepatients. Clin Psychol Psychother (1994) 1:95–100.

64. Jacobson NS, Schmaling KB, Holtzworth-MunroeA, Katt JL, Wood LF, Follette VM. Research-structured vs clinically flexible versions of sociallearning-based marital therapy. Behav Res Ther(1989) 27: 173–80.

65. Tarrier N, Calam R. New developments incognitive-behavioural case formulation. Epidemi-ological, systemic and social context: an integra-tive approach. Cognit Behav Psychother (2002)30: 311–28.

66. Birchwood M, Chadwick P. The omnipotence ofvoices: testing the validity of cognitive model.Psychol Med (1997) 21: 1345–53.

67. Wykes T, Reeder C, Corner J, Williams C,Everitt B. The effects of neurocognitive remedi-ation on executive processing in patients withschizophrenia. Schizophr Bull (1999) 25: 291–308.

68. Patience DA. Cognitive-behaviour therapy forschizophrenia. Br J Psychiat (1994) 165: 266–7.

69. Johnson T. Clinical trials in psychiatry: back-ground and statistical perspective. Stat Meth MedRes (1998) 7: 209–34.

70. Barrowclough C, Tarrier N. Psychosocial inter-ventions with families and their effects on thecourse of schizophrenia: a review. Psychol Med(1984) 14: 629–42.

71. Ventura J, Green M, Shaner A, Liberman R.Training and quality assurance with the BriefPsychiatric Rating Scale: the drift busters. Int JMeth Psychiat Res (1993) 3: 221–44.

72. Sackett DL, Richardson WS, Rosenberg W,Haynes RB. Evidence-Based Medicine. London:Churchill Livingstone (1998).

73. Tarrier N, Wittkowski A, Kinney C, McCarthy E,Morris J, Humphreys L. The durability of theeffects of cognitive behaviour therapy in thetreatment of chronic schizophrenia: twelve monthsfollow-up. Br J Psychiat (1999) 174: 500–4.

74. Tarrier N, Sharpe L, Beckett R, Harwood S,Baker A, Yusupoff L. A trial of two cognitivebehavioural methods of treating drug-resistantresidual psychotic symptoms in schizophrenicpatients. II: Treatment-specific changes in copingand problem-solving skills. Social PsychiatPsychiat Epidemiol (1993) 28: 5–10.

75. Jacobson NS, Dobson KS, Truax PA, Addis ME,Koerner K, Gollan JK, Gortner E, Prince SE. JConsult Clin Psychol (1996) 64: 295–304.

Page 294: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

19

DepressionGRAHAM DUNN

Biostatistics Group, School of Epidemiology & Health Sciences, University of Manchester, Manchester, UK

INTRODUCTION

The purpose of the present chapter is to explorethe pitfalls in and challenges to the valid estima-tion of the effects of psychotherapies. Although‘depression’ appears in the title, the discussionwill be relevant to psychological treatments forany mental illness (including psychotic disorderssuch as bipolar depression and schizophrenia).Most of the illustrative examples, however, willrefer to the treatment of depression. Right fromthe start, despite pointing out all of its poten-tial problems, I will assume that by far the bestway of trying to estimate treatment effects is viathe use of a randomised controlled trial (RCT).I have little sympathy with the increasingly pop-ular view that we can learn much of real valueabout treatment effects from systematically col-lected outcome data in routine clinical practice(see Dunn1 for a critique of this view). Nor do Ihave any sympathy for the often-heard view thatRCTs and the use of statistical methods are inap-propriate vehicles for the evaluation of somethingas complex as psychotherapy. As we have writ-ten elsewhere: ‘Clinicians who claim that statis-tical methods are inappropriate for the evaluationof psychotherapies because they are limited to

analysing means, or do not account for individualdifferences, are simply revealing their ignoranceof statistics and of recent developments in statis-tical methodology’.2

A belief in the fundamental role of randomi-sation, however, does not imply that the naıveimplementation of RCTs in outcome researchcannot lead to some invalid or unsafe conclu-sions. The design of the trial and statisticalanalysis of the results have to be appropriateto the setting. Psychotherapy involves complexinteractions between patient and therapist andsometimes (as in group therapy) involves theinteraction of a group of patients with each otheras well as with their therapist. It is not as sim-ple as taking a tablet! A psychotherapy trial islikely to be far more complex, both in its imple-mentation and in the analysis and interpretationof the subsequent results, than most drug trials.There are also far more opportunities for invalidinferences concerning treatment effects.

First, and often primarily, we are concernedwith internal validity : the valid estimation of acausal effect of treatment from the data actuallycollected (given set of patients, therapists, treat-ment centres and psychotherapy actually deliv-ered). Are the group differences we see the causal

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 295: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

298 TEXTBOOK OF CLINICAL TRIALS

effects of treatment? Or can they be explained byother factors? Later we may be concerned withexternal validity : generalisation of the inferredcausal effects to other patients, therapists, treat-ment centres and, perhaps, other forms of psy-chotherapy. To help clarify the discussion I havemade much use of Rubin’s counterfactual modelof causality and its use in the estimation of treat-ment effects.3,4 The kernel of the present discus-sion, applied to the problems of patient choiceand non-adherence to treatment, can be foundin Dunn.5

THE CAUSAL EFFECT OF TREATMENT

Mr Smith has suffered from severe depression,on and off, for several years. Six months ago hisfamily doctor advised him to undergo a courseof psychotherapy. He accepted this advice, hashad several what he thinks were very helpfulsessions with the psychotherapist, and now isfeeling considerably better. Let’s assume, for thesake of argument, that he has a total score of 10points on the Beck Depression Inventory (BDI),6

having started with a score of 20 six months ago.What is the effect of psychotherapy? How dowe measure this effect? Putting it another way,what proportion, if any, of the drop from 20to 10 points might be attributed to the receiptof therapy? Or perhaps I should have written:‘What proportion might validly be attributed tothe receipt of therapy?’ But, before we attemptto answer this question, let us consider anotherpatient, Mrs Jones, who also suffers from chronicdepression. Like Mr Smith, she was advised tohave psychotherapy by her family doctor sixmonths ago but, for various reasons, she nevermanaged to keep any of her appointments andhas not received any help from the therapist. Athird patient, Mr Adams, refused outright to haveanything to do with the therapist. Mrs Jones’present BDI score is 12 and that for Mr Adamsis 15. For Mrs Jones and Mr Adams one mightask what would have been the effect of therapyoffered if they had actually received it? Whatmight their BDI scores have been?

In the above paragraph we are trying to find away to estimate what may be called the causaleffect of a treatment. The essence of the solutionto the problem is a comparison. For each ofthe three patients, Mr Smith, Mrs Jones andMr Adams, there are two possible outcomes ofthe referral to see a psychotherapist. The firstis to receive therapy and have the severity ofdepression measured after a given interval afterthe onset of the course of treatment. The secondis to fail to get the offered help, but againhave the severity of depression measured afterthe allotted time. Let the variable T representtreatment received. It has two possible values,T = t (therapy) and T = c (no therapy). I use‘c’ for no therapy to indicate that it can beregarded as a control condition. Let i indicatethe identity of the patient (i = 1 for Mr Smith, 2for Mrs Jones and 3 for Mr Adams). Finally, letYT (i) indicate the final BDI score for patient i

after receiving treatment option T . There are twopotential outcomes for each of the three patients,as indicated in the following table:

Patient BDI with BDI withouttherapy therapy

Mr Smith Yt(1) Yc(1)

Mrs Jones Yt(2) Yc(2)

Mr Adams Yt(3) Yc(3)

We define the causal effect of the receipt oftherapy as the difference between the BDIscore for the ith patient after therapy andthe corresponding BDI score after receiving notherapy. That is, by the difference Yt(i) − Yc(i).It is a random variable that varies from onepatient to another. Unfortunately, it can neverbe observed. The obvious problem is that eachpatient receives one of the treatment conditions,or the other, but not both. Either the patientreceives psychotherapy or he/she does not. Thatis, the ith patient provides a value for eitherYt(i) or Yc(i), but not both. Mr Smith providesYt(1) but not Yc(1), Mrs Jones provides Yc(2)

but not Yt(2), and so on. We provide a statisticalsolution to this problem in the following section,

Page 296: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 299

but we will re-emphasise the point that the causaleffect of psychotherapy is the comparison of theoutcome actually observed with that which wouldhave been observed if, contrary to fact, the othertreatment option had been taken.

Similar arguments apply to the comparison ofthe effects of different types of psychotherapy,or to the comparison of a specific type of psy-chotherapy with, for example, a psychopharma-cological intervention such as a tricyclic antide-pressant. The essence is always to try to get anestimate of the difference between the patient’sobserved response with that which would havebeen observed if the patient had received thealternative treatment.

COMPARISON OF GROUP AVERAGESAND THE ROLE OF RANDOMISATION

Now let’s assume that we have access to alarge population of eligible patients – the targetpopulation about which we wish to draw causalinferences about the value of psychotherapyor counselling. And let us concentrate on theaverage causal effect (ACE)3,4 of the therapyfor this target population. The average for thepopulation is called an expected value in statisticsand the ACE can therefore be written as:

ACE = E[Yt(i) − Yc(i)] (1)

where the expectation E[·] is over all values of i.From the mathematical properties of expectations(averages) it follows that:

ACE = E[Yt(i)] − E[Yc(i)] (2)

This simple formula shows us that informationon different patients can be used to estimateE[Yt(i)] and E[Yc(i)] separately and the differ-ence between these two expectations (averages)can be used to estimate the average of the dif-ferences (i.e. the ACE). We can observe theYt(i)’s in patients receiving therapy and we canalso observe the Yc(i)’s for those in the con-trol condition. All that we need is to be surethat the observed averages for the treated (ther-apy) and untreated (control) patients are unbiased

estimates of E[Yt(i)] and E[Yc(i)], respectively.In general, however, the average of the Yt(i)’sfor the whole of the population (i.e. all possi-ble i’s) is not the same as the average of theYt(i)’s for those patients who have happened toreceive the treatment (psychotherapy). Expressedmathematically:

E[Yt(i)] �= E[Yt(i)|T = t] �= E[Yt(i)|T = c](3)

and

E[Yc(i)] �= E[Yc(i)|T = t] �= E[Yc(i)|T = c](4)

where ‘|’ means ‘given that’. To summarise,the ACE is defined by the difference betweenE[Yt(i)] and E[Yc(i)] but what we actuallyobserve are the estimators of E[Yt(i)|T = t] andE[Yc(i)|T = c]. How do we ensure that ourobserved averages are also valid estimators ofE[Yt(i)] and E[Yc(i)]? If we are able to do thisthen we have replaced an impossible-to-observecausal effect on an individual patient with apossible-to-estimate average of the causal effectsfor our target population.4

If both E[Yt(i)] = E[Yt(i)|T = t] and E[Yc(i)]= E[Yc(i)|T = c] then the potential outcomes(both the Yt(i)’s and the Yc(i)’s) are statisticallyindependent of the mechanism of assigning (orchoosing) treatment options. Otherwise, they arenot. If either the patient’s family doctor, or thepatient himself, or both, were to decide whichtreatment option to choose then it is almostcertain that this choice will not be statisticallyindependent of the potential outcome. This is thefamiliar problem of confounding. The differencein observed outcomes may arise from the fact thatthe patients with the best (or worst) prognosis,on average, might be the ones that opt fortherapy. The observed outcomes in this situationmight tell us something about the selectionmechanism (treatment choice) but are not veryinformative about the causal effect of therapy.Knowing the values of all of the prognosticvariables, together with a little knowledge ofexperimental design, might lead us to matchor stratify the patients prior to estimation ofthe treatment effects. But we cannot guarantee

Page 297: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

300 TEXTBOOK OF CLINICAL TRIALS

that we are aware of all possible confounders.There is always the possibility that we have notthought of, or forgotten, something that is vitallyimportant. Although we may be able to convinceourselves that we have not missed an importantconfounder, the only way we can ensure that wecan convince a sceptical reviewer is to allocatetreatment options randomly. Random allocationensures that both E[Yt(i)] = E[Yt(i)|T = t] andE[Yc(i)] = E[Yc(i)|T = c] providing that t andc are the allocated treatments (not, necessarily,those actually received). Randomisation is theonly sure way of coping with all confounders,and it copes with them irrespective of whetherwe are aware of them or not. Randomisationdoes not guarantee that treatment groups willbe exactly comparable in any given comparison,but it does ensure that on average there will becomparability. Our conclusion is that if we wishto be sure that we are estimating the desired ACEwe need an RCT.

An essential corollary of randomisation is thatwe obtain outcome data on all of the randomisedpatients and that we calculate our group averagesfrom the patients as they were randomised andnot according to whether they actually receivedor adhered to the treatment option that they wereallocated to. This is the intention-to-treat (ITT)principle (see, for example, Sheiner and Rubin7).If we do not use ITT then the fundamentalassumptions concerning our estimates of thecausal effect of treatment (ACE) no longer hold.Loss to follow-up (i.e. a failure of the patientto provide outcome data) is a major threat to allRCTs but, in this part of the discussion we willsimplify matters by assuming that outcome has,indeed, been obtained for all patients enteringthe trial. But what if some of the patientschoose a treatment option other than the one theywere randomly assigned to? Or perhaps somepatients adhere to the allocated treatment muchless than others – they turn up to the occasionalsession of therapy, for example, but not all ofthose which had been planned. This will clearlydilute (attenuate) the effect we wish to estimate.In fact, our ACE estimator (the differencebetween the observed mean outcomes for the two

randomly allocated groups) provides us with anestimate of the causal effect of offering treatment(i.e. randomisation) rather than the effect ofactually receiving it. It is a valid estimator of acausal effect but many investigators (particularlypsychotherapists!) might claim that it is anestimator of the wrong effect. As an estimatorof the causal effect of receiving therapy theITT estimate is likely to be biased. However,many other investigators might be convincedthat this is the estimator of real interest – itmeasures the effect of a decision to treat ina given way and is therefore vitally importantfor people involved in making these decisions(or, at the very least, those paying for them!).It is the standard approach to the analysis ofdrug trials and that usually expected by theregulatory authorities such as the US Food andDrug Administration (FDA) and the UK NationalInstitute for Clinical Excellence (NICE).

CHOICE OF, AND ADHERENCE TO, ANAPPROPRIATE FORM OF PSYCHOTHERAPY

What constitutes the active treatment for ourrequired comparison? There are several com-mon forms of psychotherapy that are regu-larly used for patients with depression, includ-ing behaviour therapy, cognitive behaviour ther-apy (CBT),8 interpersonal psychotherapy (IPT),9

brief dynamic psychotherapy.10 Usually the ther-apy involves the treatment of individual patients,but there is also the possibility of working withgroups of patients with similar problems, or withthe patient and his or her family. For a gen-eral review, see for example, Scott11 or Roth andFonaghy.12 If our aim is to evaluate the efficacyof one of these forms or models of treatment,or to compare its efficacy with another modelof psychotherapy or even pharmacotherapy, thenit must be self-evident that we need to be ableto describe explicitly and precisely what treat-ment using any of the specified models actuallyinvolves; i.e. they must be standardised. Crits-Christoph and Gladis13 give two main reasonsfor standardisation. First, from a clinical view-point, there is a need to be able to describe what

Page 298: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 301

actually seems to work (or does not) so thatclear treatment recommendations can be made toother potential therapists. Second, from a researchviewpoint, therapies need to be replicable. Stan-dardisation of psychotherapies, however, is noteasy, and it is a topic beyond both the scopeof the present chapter and the competence ofthe present writer. Briefly, it involves the cre-ation of a detailed treatment manual, the selectionand subsequent training of appropriate therapistsin the use of the manual, certification of thera-pists based on adherence to the treatment model,and continued assessment of therapist adherenceand competence during a clinical trial.13 Clearly,when critically appraising the results of a particu-lar RCT, we need to be able to convince ourselvesthat the therapy has been undertaken as intended,and that the therapy as given was exactly whatit is said to be. For this we need a publishedtreatment manual and a well-validated method ofmeasuring adherence to the therapy as describedin the manual.

CHOICE OF AN APPROPRIATECONTROL GROUP

Standardisation of psychotherapy might bethought to be a difficult problem but it is oftenfar more difficult to come up with a valid andconvincing control condition. Crits-Christoph andGladis13 consider this as perhaps the single mostvexing problem for research into the outcome ofpsychotherapy. Too often, we see that researchershave used ‘no treatment’ or ‘waiting-list’ controls.Too often we see the phrase ‘routine care’ usedfor the control condition when, in many circum-stances it implies routine neglect. It is importantthat when patients are invited to take part in anRCT they are convinced that they will receive ade-quate levels of advice, support and care if they areallocated to the nominal control condition. Other-wise, why should they consent to randomisation?Otherwise, why should an ethics committee grantits approval for the trial? It is also important thatthe test psychotherapy is being compared to a carepackage that might be regarded as potentially as

good as the therapy on offer. The test therapy, forexample, might involve supportive counselling inaddition to the specific elements implied by thepsychotherapeutic model, and the natural controlcondition would be the receipt of the same levelof support in the absence of the psychotherapeuticelements under test. If, however, we wish to evalu-ate supportive counselling itself, then we still havea problem. Trialists often refer to ‘equipoise’ injustification of randomisation in an RCT. To main-tain equipoise we need to be convinced that thecontrol group patients are at least provided withthe best-available routine care and that they arenot allocated to a condition that might cause harm.

I will illustrate the choice of control groups byreferring to a particularly well-known and influ-ential psychotherapy trial. The National Instituteof Mental Health (NIMH) Treatment of Depres-sion Collaborative Research Program (TDCRP)14

involved the use of four treatment arms. Groups 1and 2 received CBT and IPT, respectively. Group3 received pharmacotherapy with imipramine(administered double blind) together with acare package called ‘clinical management (CM)’.Finally, Group 4 received a pill placebo (againadministered double blind) and CM. Elkin et al.14

state that ‘The CM component of both phar-macotherapy conditions was introduced into thestudy to ensure standard clinical care, to max-imize compliance, and to address ethical con-cerns regarding the use of a placebo on depressedpatients. The CM component provided guide-lines, not only for the management of medicationand side effects and review of the patient’s clini-cal status, but also for providing the patient withsupport and encouragement and direct advice ifnecessary. Although specific psychotherapeuticinterventions were proscribed (especially thosethat might overlap with the two psychotherapies),the CM component approximated a “minimalsupportive therapy” condition.’

The essence of the imipramine–CM andplacebo–CM conditions was to provide a fullystandardised package of clinical care, either ofwhich could be used as a control group for theevaluation of the efficacy of the psychotherapies.So, the two major questions addressed by the

Page 299: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

302 TEXTBOOK OF CLINICAL TRIALS

NIMH TDCRP study were ‘(1) Is there evidenceof the effectiveness of each of the psychothera-pies, as compared both with the standard refer-ence treatment of imipramine–CM and with theplacebo plus CM (PLA–CM) control condition?(2) Are there any differences in the effective-ness of the two psychotherapies?’ These ques-tions emphasise the comparative nature of thisand any other well-designed RCT. When one asksquestions about the effectiveness of psychother-apy one should always add the rider ‘relativeto what?’ A valid and well-standardised controlcondition is as vital to the comparison as is thestandardised package of therapy. Crits-Christophand Gladis,13 referring to the TDCRP trial, com-ment that whilst the placebo–CM is perhaps notthe ideal control condition for psychotherapy, itserves a practical function. That is, if a spe-cific psychotherapy can do no better than theplacebo–CM control should the psychotherapybe pursued as a treatment option? Beware ofauthors who make claims about the improvementof patients in a particular treatment group with-out reference to that in other comparison groups.Roth and Fonaghy’s12 (p. 64) comment that thesmall differences between the four TDCRP trialgroups, in terms of their outcome, is due to theunexpectedly good outcome under placebo–CM(explained by the fact that it contains non-specificelements of psychotherapy) seems to be missingthe point.

CHOICE OF ASSESSMENT METHODAND OUTCOME MEASURES

It is very difficult to see how one could possi-bly design a so-called double blind RCT in thefield of psychotherapy evaluation. The patientsare likely to know what is going on, unless theyhave been deceived by their therapists, and itwould be rather bizarre if the therapists wereunaware of what treatment was being offered!Blind assessment by a third party (a clinician orresearch worker not involved in the provision oftherapy or clinical support) is often the preferredoption, but even here it is frequently difficult

to maintain blindness. The therapists themselvesshould not undertake the assessment of outcome.One should always bear in mind, however, thatirrespective of who carries out the assessment,there is always the possibility of subjective biasesin the assessments. The Hamilton Rating Scalefor Depression, HRSD15 and the Beck Depres-sion Inventory, BDI6 are the two most commonlyused measures of depressive symptoms in RCTsfor the treatment of depression. In fact, they arefrequently both used within the same RCT toassess different aspects of symptomology. TheHRDS is a clinician-rated measure, based onan interview with the patient, which gives moreweight to the ‘biological’ or somatic symptoms ofdepression, whilst the BDI is a patient-completedquestionnaire which concentrates more on thecognitive aspects. There have been suggestionsthat different forms of therapy (drugs as opposedto psychological treatments, for example) mighthave a differential effect on these outcome mea-sures (drugs doing better according to the HRDSand the BDI favouring CBT, for example). Theexpected treatment group by outcome measureinteraction needs to be specified (and preferablypublished) as part of the trial protocol and, if itis regarded as being important, the trial needs tobe powered accordingly. In reality, it is hard toimagine a convincing justification for a trial ofthe size and expense needed for such a test.

What about missing outcome assessments?Drop-outs and other sources of missing data leadto real problems for the valid estimation of theeffects of treatment. A detailed discussion of thistopic is beyond the scope of the present chapter,but it must be stressed that the only effective wayof dealing with missing data is to ensure thatthere are none. Investigators should make everyeffort to ensure that outcome data (however brief)are collected on all of the patients randomised,irrespective of their subsequent treatment history.In particular, data collection should not beabandoned simply because the patient has nottaken up the offer of therapy or has not adheredto the prescribed course of treatment.16,17 Thereis no logical reason why patients should refuseto be assessed even though they have decided

Page 300: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 303

that therapy is not for them, although, in practice,the two types of protocol violation are likely togo together.

But what if you have drop-outs and hap-hazardly missing data? What is the best wayof dealing with them in the statistical analy-sis? If we use a naıve complete-case analysis(that is, base the inferences concerning causaleffects on those patients with complete data)then we are likely to have two problems: lackof statistical efficiency (low statistical power)and bias. Bias may be caused, for example, bythe drop-outs not occurring completely at ran-dom. Ideally, analyses should be available on allavailable data (and should include where pos-sible all patients randomised to the competingtreatment arms) and should compensate for theobserved patterns of drop-out. Possibilities fordealing with the missing values include impu-tation (ranging from rather crude and unsatisfac-tory methods – at least from the point of viewof estimation – such as last-observation-carried-forward to the much more realistic stochasticimputation methods including hot-decking andmultiple imputation), the use of inverse probabil-ity weighting and, finally, a full likelihood analy-sis based on statistical models for both the miss-ing data process and for the outcome given that ithas been assessed. For further details, readers arereferred to a series of reviews in Statistical Meth-ods in Medical Research, 8(1), particularly theprimer on multiple imputation by Schafer.18 Theuse of inverse probability weights is widespreadin survey statistics but has only occasionally beenused to allow for drop-outs in RCTs. The useof this technique in longitudinal clinical trials isexplained and illustrated in Everitt and Pickles.19

Its use is also illustrated in a recent depressiontrial by Dowrick et al.20 Finally, a discussion ofthe analysis of longitudinal data with drop-outs,paying particular attention to the NIMH TDCRPtrial, is provided by Gibbons et al.21

CENTRE, GROUP AND THERAPIST EFFECTS

It is clear that the outcome of psychotherapy isdependent upon characteristics of the therapist.

These include training and experience of thetherapist, degree of adherence to the therapeuticmodel (use of a manual, for example) and thecapacity to develop a therapeutic alliance withthe patient (see, for example, Crits-Christophet al.22 and Roth and Fonaghy12). In a multi-centre RCT there are also likely to be differencesin the effectiveness of the collaborating clinicalcentres. Therapists in some centres may haveconsiderably more experience in the use of agiven treatment approach than in others, reflected,for example, in their degree of adherence to agiven therapeutic model. In addition, if patientsare treated as groups rather than individuallythere are also likely to be differences betweengroups arising not only from the characteristicsof the patients and of the therapist but also frominteractions between the patients.23 If they geton well together the group might thrive. If, onthe other hand, there is a particularly disruptiveor difficult patient within a particular group thenthe group as a whole may not do as well as itmight otherwise have done.

Consider a hypothetical single-centre RCT withtwo treatment arms. In Group A all the patientsreceive CBT individually from an internationallyrespected pioneer of CBT, Dr Garner. In GroupB all the patients, again individually, receive sup-portive counselling (SC) from a recently-trainedcommunity psychiatric nurse, Mr Martin. Let’sassume that Dr Garner’s patients do considerablybetter than those of Mr Martin. What can we inferabout the causal effect of CBT from such a trial?What are the threats to the validity of the trial? DrGarner is likely to be a very experienced, highly-skilled and highly-motivated ‘brand champion’.Mr Martin, on the other hand, lacks experience.Is the observed difference due to the difference inabilities and experience of the two therapists, oris it an effect of CBT? We cannot tell. The twoeffects are completely confounded in this simpledesign. This is a severe threat to the internal valid-ity of the trial. If, however, we believe that theobserved differences are an effect of CBT, thenwhat? We still cannot be sure that there is notsome attribute of Dr Garner that enables him to

Page 301: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

304 TEXTBOOK OF CLINICAL TRIALS

be particularly successful in delivering this par-ticular variant of CBT. Could other clinicians usethe same model and achieve the same or, at least,comparable results? We do not know. This is athreat to the external validity or generalisabilityof the trial’s findings.

In practice, many RCTs, which otherwise haveadmirable design characteristics and quality con-trol procedures, involve the use of only two orthree highly-skilled and experienced therapists.They are often the academic clinicians who havebeen involved in the development or modificationof the therapy under evaluation. Does this inval-idate the findings of these trials? No, but it doeslimit their generalisability. They should, perhaps,be regarded as the equivalent of the pharmacother-apist’s ‘Phase II’ drug trials, being a necessarypreliminary to the design and conduct of a full‘Phase III’ evaluation using a large and repre-sentative sample of therapists. It would be inap-propriate and certainly difficult to justify a largemulti-centre trial involving large numbers of ther-apists without first being able to establish that the‘experts’ or ‘brand champions’ are able to achievepromising results. If the latter cannot demonstrateworthwhile effects then it would be pointless tomove on to the larger trial. If they can, however,we then (but only then) need to ask how well thetherapy might work in routine clinical practice.

In a large multi-centre trial we need to involveas many therapists as possible. Each therapist islikely to be based in only one of the centres andto be delivering treatment in only one arm of thetrial (i.e. therapists are nested within both centresand treatments), but it is possible for a therapistto deliver more than one of the forms of therapyin a comparative trial (i.e. therapists, like centres,are crossed with treatments). These designs haveimplications for the statistical analysis and for thevalidity of statistical inferences based on theseanalyses.24 – 26 Both centre and therapist effectsshould be incorporated into an appropriate statis-tical model. Using the notation of Roberts,26 sucha model, for a quantitative outcome measure, forexample, will have the form:

yijk = α + λj +∑

p

βpxijkp + ujk + εijk (5)

Here yijk is the outcome of the ith patient of thekth therapist within the j th treatment arm of thetrial. Assuming that λj is zero in the control arm,λj is the effect of the treatment effect for thej th arm of the trial. Each xijkp is the baselinemeasurement of the pth patient characteristic(such as a demographic or other prognosticvariable, including treatment centre) and βp is thecorresponding regression coefficient. The termujk is the average effect of the kth therapistwithin the j th treatment arm of the trial. It isa random variable (i.e. randomly varying fromone therapist to another) with an assumed meanof zero and variance of σj

2. This variance mayvary from one arm of the trial to another (thereis no a priori reason why the variation betweentherapists within different arms of the trial shouldbe the same) and, in particular, if the control armdoes not involve the use of therapists at all, then,for the controls σj

2 = 0 (i.e. in this situation thereare no therapist effects in the control group). Inthe case of the possibility of one or more arms ofthe trial involving group therapies, the statisticalmodel would be even more complex.

Models such as that described in equation (5)are called random effects, random regression ormultilevel models.27 Technical details of theiruse are beyond the scope of this chapter andinterested readers are referred to Roberts26 foran illustration of their use in the context ofRCTs involving therapist effects. What readersshould note, however, is that failure to allowfor appropriate therapist effects in the statisticalanalysis (assuming that they are present in thedata) is likely to lead to spurious statisticalsignificance (i.e. the stated p-values will betoo low) and estimated confidence intervals orstandard errors for treatment effects that are toooptimistic (i.e. smaller than they should be). Acorollary of this is that, even when the analysisis correct, a trial whose sample size has not beendetermined after allowance for the possibility oftherapist effects is very likely to be underpowered(too small!). This is the same problem as thosefaced by the designers of cluster randomised trials(see, for example, Donner28). Again, Roberts26

provides details of the required adjustments to

Page 302: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 305

sample size calculations, on the assumption thattherapist variation is the same for each of thearms of the trial.

But there is more to the problem of therapisteffects than can be solved by the technicaldevice of allowing for them in an appropriatestatistical model. Nor is the main problem oneof generalising from the impact of therapists in agiven trial to the wider community of therapists.We started the discussion in the present sectionby comparing the outcome of CBT as deliveredby Dr Garner with that of SC as delivered byMr Martin. We pointed out that the requiredtreatment effect is fully confounded with thedifference between the two therapists. Now letus move on to a larger trial in which each ofthe patients in Group A receives CBT from arandomly selected therapist from a team of, say50, experienced and highly competent cognitivetherapists. Each of the patients in Group B, onthe other hand, receives SC from a randomlyselected therapist from a team of experiencedand highly competent interpersonal therapists.We still have a problem. Again, the requiredtreatment effect (the difference between outcomesfor CBT and SC) cannot be disentangled fromthe difference between the average effects ofthe two groups of therapists. In general, theλj in equation (5) can either be interpreted asthe average of the within-Group j therapisteffects or as an effect of therapy j – thatis, the two interpretations are equivalent. ‘Atthe present time, researchers and consumers ofpsychotherapy research findings are left with abasic dilemma when interpreting the findingsof studies focusing on the efficacy of specifictreatments: how to disentangle the effects dueto the therapeutic approach from those dueto the particular therapists who have carriedout the approach. It is particularly pressingwhen different therapists carry out each of thetreatments in a comparative outcome study.’29

The cognitive therapists and counsellors in theabove hypothetical trial (or even in a real onesuch as the NIMH TDPRC study) might differ inlots of ways and these therapist differences maybe the causal effects of the treatment difference,

not the difference in psychotherapeutic approach.Consider therapeutic competence, for example.The CBT therapists might, for example, beeither more or less competent than their SCcounterparts. But how could we assess this? Howcould we possibly compare the competence ofDr Garner as a cognitive therapist, for example,with that of Mr Martin as a counsellor. It isakin to asking whether I am more competent asa statistician than my scientific colleague is asa laboratory worker. And moving to a crosseddesign (both types of therapy being offered byevery therapist) does not solve our problem. If DrGarner, for example, were to be experienced andhighly competent as both a cognitive therapistand an interpersonal therapist we still would notbe able to compare the competencies in the twoapproaches. Elkin29 (Reproduced with permissionof Oxford University Press) concluded that ‘Wemay never be able to truly “disentangle” theeffects due to the therapist from those dueto the therapy, because they may often beinherently intertwined and also very interactivewith particular patient attributes.’

‘The treatment conditions being compared. . . are, in actuality, “packages” of particulartherapeutic approaches and the therapists whochose and are chosen to administer them.’30

The interpretation of the results of RCTs shouldexplicitly acknowledge this fact. As well asvery carefully defining both the treatment andthe control conditions, authors should providecritical information about the therapists carryingout the treatments, and the information shouldbe included in the dissemination concerningsupposedly empirically validated results.29 Thelatter is particularly important when we cometo systematic review and/or meta-analysis of theresults from a disparate collection of individualtrials, and in the formulation of any subsequentclinical guidelines based upon the results ofthese trials.

WHAT WORKS FOR WHOM?

The question implies a belief that there is noconstant treatment effect. That is, it implies that

Page 303: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

306 TEXTBOOK OF CLINICAL TRIALS

a given form of treatment has a greater effect onsome patients than it does on others; that the receiptof psychotherapy A will be more beneficial forMr Smith than the receipt of psychotherapy B,for example, but that B might be better than Afor Mrs Jones. Mr Smith has a particular attribute(presenting symptoms, clinical or family history,for example) that indicates therapy A. Mrs Jones,on the other hand, has characteristics that indicatetherapy B. In the epidemiological literature this iscalled ‘effect modification’ – a particularly usefulterm as it should remind us that ‘causal effect’implies comparison of observed outcome with thatwhich would have been observed under differentcircumstances. In terms of statistical modelling(analysis of variance, or covariance, for example)it will provide an example of a treatment groupby patient attribute interaction, where the attributecould be one of a potentially vast range of measuresmade on the patients at or prior to randomisation.Supposed examples of such interactions are rarelyconvincing. Even if based on a valid statisticalanalysis (i.e. a test of an appropriate two-wayinteraction) they are usually ‘discovered’ as partof a post hoc ‘fishing trip’. More frequently theirexistence has been based on an invalid analysis.All too often the investigators are looking for aso-called ‘predictor of outcome’ by searching inthe relevant treatment group for patient attributesassociated with good outcome. This tells us nothingabout effect modification – the same attributesmight lead to the better outcomes within the controlgroup(s). One should always remember that validinferences from an RCT involve comparison ofthe randomised groups. Here we are concernedwith the question ‘Does the treatment effect (e.g.comparison of outcomes in Groups A and B)depend on, say, patient attribute C?’ The identityof attribute C should be clearly specified in the trialprotocol, together with a prior estimate of the sizeof the proposed interaction. The sample size for thetrial should then be determined such that there issufficient power to detect this interaction throughthe use of an appropriate statistical significancetest. One good candidate for attribute C mightbe patient preference,31 but there is little, if any,

methodologically sound work in this area. Crits-Christoph and Gladis13 comment that two of thelargest randomisedclinical trialseverundertaken toevaluatepsychotherapies (althoughnotspecificallyfor depression) failed to provide much support forspecified patient – treatment interactions.22,32

ESTIMATION OF CAUSAL EFFECTSIN AN RCT WITH PATIENT CHOICE

Consider a hypothetical RCT in which 200 eligi-ble depressed patients have been randomly allo-cated to receive either counselling plus routinecare (T = t) or routine care alone (T = c). Forsimplicity, assume that all of the 100 patientsallocated to routine care receive exactly that (theydo not have access to counselling unless theyhave been allocated to that treatment arm of thetrial). Of the 100 patients offered counselling,however, only 70 accept the offer. A fixed timeinterval after randomisation (six months, say)the patients’ clinical status (improved versus notimproved) is assessed and used as the primaryoutcome of the trial. The effects of either treat-ment allocation or treatment actually receivedare to be estimated from the differences betweenaverage outcomes as before, the only differencebeing that we are averaging binary outcomes(1 = improved; 0 = not improved, say) to obtainobserved proportions. The results of this hypo-thetical trial are summarised in Table 19.1 (notethat we have simplified the issue by assumingthere are no drop-outs).

The estimate of the ITT effect is both simpleand familiar. The proportion of those receivingcounselling who improve is 0.70 (i.e. 70/100)and the corresponding proportion for the control

Table 19.1. Results of a hypothetical trial of coun-selling

T = t T = c

Improved Total Improved Total

Comply 60 70Do not comply 10 30Overall 70 100 50 100

Page 304: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 307

group is 0.50 (i.e. 50/100). The difference (theACE for being offered counselling) is 0.20. Forreaders who prefer NTT (the reciprocal of thedifference between the two proportions), this is5 (i.e. 1/0.20). But what about estimating thecausal effect of receiving treatment? There aretwo commonly used, but invalid, methods ofanalysis–analysis per protocol or analysis astreated. There is also the correct (correctness, ofcourse, being vitally dependent on the validityof a few key assumptions) but much less famil-iar estimator – the complier average causal effect(CACE).33 – 35 The per protocol analysis com-pares the outcome in those people in the coun-selling group who actually receive counsellingwith that in the control group (i.e. it excludespatients who have violated the treatment proto-col from the analysis). Here the difference is60/70 − 50/100(= 0.36). The as treated analysiscompares outcome in those patients who receivecounselling with that in those who do not receiveit (all patients are included in this analysis). Hereit is 60/70 − 60/130(= 0.40). The problem withboth of these estimators is that it is impossibleto interpret them as a causal effect in the senseof comparing potential outcomes on the samepatient. The patient groups are not comparable.The estimated effects are merely associations,subject to confounding. And association, as youall know, does not imply causality!

What about the CACE? This is an estimateof the difference between the outcome in thecompliers (i.e. those who accepted and receivedthe offered counselling) with that which wouldhave been expected in the same patients if theyhad not been offered counselling. This is wherewe need two key assumptions.36,37 The first one iseasy to defend for a randomised trial. The secondneeds a bit more careful thought.

Assumption 1: the proportion of patients whoare potential compliers is the same in the tworandomly allocated groups. This follows directlyfrom the random allocation mechanism.

Assumption 2: the proportion of potential non-compliers who improve is independent of treat-ment allocation. In other words, it makes no dif-ference to the outcome of a patient who would

refuse the offer of counselling whether or not theyare in the group actually offered counselling. Theoffer, in itself, is not beneficial.

Assumption 1 allows us to estimate the propor-tion of potential compliers in the control group.In our example it is 70/100. The estimated num-ber of non-compliers in the control group is 30and the number of compliers is 70.

Assumption 2 allows us to estimate the propor-tion (number) of patients who improve amongstthe non-compliers in the control group. In ourexample the number of patients who improve inthis group is estimated to be 10 (the propor-tion is 10/30). Now, there were a total of 50patients who were observed to improve in thecontrol group and therefore the estimated numberof potential compliers who improve in the controlgroup must be 40 (that is 50 − 10). Otherwisethe numbers do not add up! So, the proportionof patients improving in the counselling groupamongst those who actually receive counselling is60/70. The proportion in the corresponding con-trol group (i.e. those who would have acceptedthe offer) is estimated to be 40/70. The CACEestimator is the difference between these twoproportions, 60/70 − 40/70(= 0.29). The corre-sponding NNT is 3.5.

Note that in the above example the potentialcompliers did better than the non-compliers,irrespective of which treatment arm they wereallocated to. This is not unexpected and nottoo difficult to rationalise. But now consider asecond, more ‘difficult’ example. The results of asecond hypothetical trial are shown in Table 19.2.The ITT effect (ACE) is estimated by 50/100 −30/100(= 0.20). The corresponding NTT is 5.The CACE estimate is 35/70 − 15/70(= 0.29).

Table 19.2. Results of a second hypothetical trial ofcounselling

T = t T = c

Improved Total Improved Total

Comply 35 70Do not comply 15 30Overall 50 100 30 100

Page 305: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

308 TEXTBOOK OF CLINICAL TRIALS

The corresponding NNT is again 3.5. But notethat this time the potential compliers in thecontrol group are doing a lot worse than thenon-compliers (15/70 vs 15/30). Again, this isreasonably straightforward to rationalise. Thepatients who accept the offer of counselling arethose with the worst prognosis or, equivalently,those that turn it down (or would turn it downif offered it) are those who are getting betteranyway. The latter do not need treatment. Butwhat this should do is prompt the data analystto ask whether Assumption 2 is really justified.Might the offer of help on its own be of benefit?And if so, of how much benefit? Or perhapsthose patients in the control group who wouldhave accepted the offer feel let down (resentfuldemoralisation) and do worse than they wouldotherwise have done if they had known nothingabout the possible offer of help.

In practice, we do not have to work rightthrough the estimation from first principles in theabove way. It can be shown that the requiredestimates can be obtained from the followingsimple formula:33,38,39

CACE = ITT estimate for outcome

ITT estimate for receipt of treatment(6)

This formula applies in situations where bothof the measures of outcome and treatmentreceived are binary (i.e. both the ITT effects aredifferences between proportions) or where bothare quantitative (i.e. both the ITT effects aredifferences between means), or one is binary andthe other quantitative. So, for the first exampleabove, CACE = (70/100 − 50/100)/(70/100 −0) = 0.29, as before.

RANDOMISED CONSENT AND PATIENTPREFERENCE DESIGNS

A serious issue in the design of RCTs concernsthe amount of information given to the patientabout the aims of the trial. So-called informedconsent is a prerequisite for most trials but itis not always obvious what ‘informed consent’

actually means or whether, strictly speaking, itis ever possible. In the context of our exampleillustrating the effect of patient compliance to atreatment offer, is it ever ethically justified torandomise and then only seek consent to treatin the group allocated to receive therapy? Thisis an example of Zelen’s40 original form of therandomised consent design. All patients in thetrial are asked to provide outcome data, of course,but those in the control group may never knowthat they had taken part in a trial. Is this reallya serious ethical problem? This design wouldcircumvent the potential problem of resentfuldemoralisation amongst the controls. I will notattempt to answer the question raised.

In an attempt to solve some of the serious prob-lems surrounding the issue of patient preference,Brewin and Bradley41 (see also Bradley42 and Sil-verman and Altman43) have proposed what theydescribe as the patient preference design. In thisdesign, eligible patients are told about the reasonsfor the trial and the treatments on offer. Patientswho do not have a strong preference (that is,they are prepared to be randomised) are enteredinto a conventional RCT. Those patients witha strong preference are offered the treatment oftheir choice. So, for the comparison of two treat-ments, A and B, for example, the patient prefer-ence trial finishes up with four groups: those whoprefer A; those without preference who are ran-domly allocated to A; those who prefer B; thosewithout preference who are randomly allocatedto B. In the context of the present discussion, thecomparison of the randomly allocated groups canlead to an ACE or CACE estimate as describedabove. But what can the two patient preferencegroups provide? Merely an estimate of associa-tion. Like per protocol or as treated estimators,they do not appear to be able to provide estimatesof causal effects. And for this reason they can-not be used to check the (external) validity of theestimates of causal effects provided by the ran-domised groups. Whether the difference betweenthe two preference groups is the same as or com-pletely different from that provided within thecore RCT, so what? What does it tell us? Thatthere are selection effects? The treatment effect

Page 306: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 309

may, indeed, be different in those patients with-out a strong preference (i.e. those prepared to berandomised) when compared to the rest, but therest cannot provide the valid information fromwhich we can test whether it is true. But per-haps readers should see the results of such a trialand decide for themselves. An example of theuse of a patient preference design is providedby a recently published trial of counselling fordepression.44,45

Another design possibility which, in my view,has much more promise is to simply ask theparticipants of a conventional RCT what theirpreferences are prior to randomisation. The aimhere is not to allow patient preference to influ-ence treatment received (but in the presence ofnon-compliance this will be inevitable) but toincorporate patient preferences into the analysisof the resulting data. This has been tried by Torg-erson et al.31 Although the latter authors do notpursue all of the possibilities in terms of esti-mating treatment effects, the design offers ways,at least partially, of testing the validity of theassumptions necessary for the above CACE esti-mator, or, equivalently, looking for a poor prog-nosis/demoralising effect in the potential com-pliers of the control group. Getting preferenceinformation prior to randomisation would alsoimprove the precision of the estimates of theCACE, but this is well beyond the scope ofthe present chapter – for further information, seeFisher-Lapp and Goetghebeur.39 The latter articlewill also provide a suitable entry to the literatureon adjustment for partial compliance (i.e. regres-sion models for the response in terms of levelsof compliance).

CONCLUSIONS

The design and analysis of a convincing RCTfor the estimation of the effects of psychotherapyare difficult. It is not safe to simply assume thatthe theoretical and logistical problems are similarto those of the average drug trial. Life hereis much more complex. Psychotherapy (at leastin its individual form) involves the interaction

of two people (the patient and the therapist)and it is the involvement of these two peoplethat is the essence of the complexity. Added tothis are the problems of the choice of adequatecontrol groups (in particular, the absence ofa convincing placebo) and the impossibilityof conducting a trial that is double blind. Inthe critical appraisal of such trials we shouldnot, perhaps, be searching for methodologicalperfection but, instead, be aware of the pitfallsto valid inferences concerning treatment effectsand temper our judgements accordingly (and thisapplies just as much to the trials that we havebeen involved in as it does to the trials of otherinvestigators).

This chapter has not considered systematicreview and meta-analysis of trials of psychother-apies but the authors (and appraisers) of suchstudies should be fully aware of all the method-ological pitfalls of the individual trials. A meta-analysis of a series of trials that have naıvelyignored random therapist effects, for example,or ignored the structure of a group therapytrial, simply summarises the faulty analyses ofthe originals. Unfortunately, the consumers ofmeta-analyses (particularly if they are producedunder the auspices of such august bodies as theCochrane Collaboration) seem to place far toomuch faith in their findings. Consumers need tobe aware that the authors of systematic reviewsare capable of missing subtle (or not so subtle)methodological flaws in the original trials. Con-sumers should resist taking the conclusions of theauthors of these meta-analyses, and the clinicalguidelines that result from them, on trust. In orderto critically appraise a published systematic viewone needs to know about not only the mechanism(and quality) of the review itself, but also have adetailed knowledge of what the reviewers shouldhave been looking for in the way of method-ological problems in the original trials. Reportingguidelines such as CONSORT46,47 are having asubstantial impact on the quality of clinical trials,and on the appraisal methodologies of system-atic reviewers. In the case of psychotherapy trials,however, the CONSORT recommendations onlycover a small part of the key components of the

Page 307: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

310 TEXTBOOK OF CLINICAL TRIALS

trial. Sticking to CONSORT guidelines is neces-sary for a good trial report, but is not sufficient. Ihope the present chapter succeeds in stimulatingreaders to think of other aspects of such trials thatneed to be equally well reported.

REFERENCES

1. Dunn G. Statistical methods for measuring out-comes. In: Tansella M, Thornicroft G, eds, Men-tal Health Outcome Measures. London: Gaskell(2001) 5–18.

2. Garety P, Dunn G, Fowler D, Kuipers E. Theevaluation of cognitive behaviour therapy for psy-chosis. In: Wykes T, Tarrier N, Lewis S, eds, Out-come and Innovation in Psychological Treatmentof Schizophrenia. Chichester: John Wiley & Sons(1998) 101–18.

3. Rubin DB. Estimating causal effects of treatmentsin randomized and nonrandomized studies. J EducPsychol (1974) 66: 688–701.

4. Holland PW. Statistics and causal inference (withdiscussion). J Am Stat Assoc (1986) 81: 945–70.

5. Dunn G. The challenge of patient choice and non-adherence to treatment in randomised controlledtrials of counselling or psychotherapy. UnderstandStat (2002) 1: 19–29.

6. Beck AT, Ward CH, Mendelson M, et al. Aninventory for measuring depression. Arch GenPsychiat (1961) 4: 561–71.

7. Sheiner LB, Rubin DB. Intention-to-treat and thegoals of clinical trials. Clin Pharmacol Ther(1995) 57: 6–15.

8. Fennell MJV. Cognitive-behaviour therapy fordepressive disorders. In: New Oxford Textbookof Psychiatry. Oxford: Oxford University Press(2000) 1394–405.

9. Markovitz JC, Weissman MM. Interpersonal psy-chotherapy for depression and other disorders.In: New Oxford Textbook of Psychiatry. Oxford:Oxford University Press (2000) 1411–20.

10. Ursano RJ, Ursano AM. Brief individual psycho-dynamic psychotherapy. In: New Oxford Textbookof Psychiatry. Oxford: Oxford University Press(2000) 1421–32.

11. Scott J. Psychological treatments for depression:an update. Br J Psychiat (1995) 167: 289–92.

12. Roth A, Fonaghy P. What Works for Whom? ACritical Review of Psychotherapy Research. NewYork: The Guilford Press (1996).

13. Crits-Christoph P, Gladis M. The evaluation ofpsychological treatment. In: New Oxford Textbookof Psychiatry. Oxford: Oxford University Press(2000) 1259–69.

14. Elkin I, Shea T, Watkins JT, et al. National Insti-tute of Mental Health Treatment of DepressionCollaborative Research Program (1989). Arch GenPsychiat (1989) 46: 971–81.

15. Hamilton M. A rating scale for depression. J.Neurology Neurosurgery & Psychiatry (1960) 23:56–61.

16. Lavori PW. Clinical trials in psychiatry: shouldprotocol deviation censor patient data? Neuropsy-chopharmacology (1992) 6: 39–48.

17. Siqueland L, Chittams J, Frank A, et al. The pro-tocol deviation patient: characterization and impli-cations for clinical trials research. Psychother Res(1998) 8: 287–306.

18. Schafer D. Multiple imputation: a primer. StatMeth Med Res (1999) 8: 3–15.

19. Everitt BS, Pickles A. Statistical Aspects of theDesign and Analysis of Clinical Trials. London:Imperial College Press (1999).

20. Dowrick C, Dunn G, Ayuso-Mateos J-L, et al.Problem solving treatment and group psychoed-ucation for depression: a multicentre randomisedcontrolled trial. Br Med J (2000) 321: 1450–5.

21. Gibbons RD, Hedecker D, Elkin I, et al. Someconceptual and statistical issues in analysis oflongitudinal psychiatric data. Arch Gen Psychiat(1993) 50: 739–50.

22. Crits-Christoph P, Siqueland L, Blaine J, et al.Psychological treatments for cocaine dependence:National Institute for Drug Abuse CollaborativeCocaine Treatment Study. Arch Gen Psychiat(1999) 56: 493–502.

23. Burlinghame G, Kircher J, Taylor S. Methodologi-cal considerations in group psychotherapy research:past, present and future practices. In: Fuhriman A,Burlinghame G, eds, Handbook of Group Psy-chotherapy and Counseling. New York: John Wiley& Sons. (1994) 41–80.

24. Martindale C. The therapist-as-fixed-effect fallacyin psychotherapy research. J Consult Clin Psychol(1978) 46: 1526–30.

25. Crits-Christoph P, Mintz J. Implications of ther-apist effects for the design and analysis of com-parative studies of psychotherapies. J Consult ClinPsychol (1991) 59: 20–6.

26. Roberts C. The implications of variation in out-come between health professionals for the designand analysis of randomized clinical trials. Stat Med(1999) 18: 2605–15.

27. Goldstein H. Multilevel Statistical Models, 2ndedn. London: Arnold (1995).

28. Donner A. Design and Analysis of Cluster Ran-domization Trials in Health Research. London:Arnold (2000).

29. Elkin I. A major dilemma in psychotherapy out-come research: disentangling therapists from ther-apies. Clin Psychol: Sci Pract (1999) 6: 10–32.

Page 308: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

DEPRESSION 311

30. Elkin I,Parloff MB,Hadlet SW,Autry JH.NationalInstitute of Mental Health Treatment of DepressionCollaborative Research Program: background andresearch plan. Arch Gen Psychiat (1985) 42:305–16.

31. Torgerson DJ, Klaber-Moffett J, Russell IT. Patientpreferences in randomised trials: threat or opportu-nity? J Health Serv Res Policy (1996) 1: 194–7.

32. Project MATCH Research Group. Matching alco-holism treatments to client heterogeneity: ProjectMATCH posttreatment drinking outcomes. J Stud-ies Alcohol (1997) 58: 7–29.

33. Angrist JD, Imbens GW, Rubin DB. Identificationof causal effects using instrumental variables (withdiscussion). J Am Stat Assoc (1996) 91: 444–55.

34. Little R, Yau LHY. Statistical techniques for ana-lyzing data from prevention trials: treatment ofno-shows using Rubin’s causal model. PsycholMeth (1998) 3: 147–59.

35. Little R, Rubin DB. Causal effects in clinical andepidemiological studies via potential outcomes:concepts and analytical approaches. Ann RevPublic Health (2000) 21: 121–45.

36. Bloom HS. Accounting for no-shows in exper-imental evaluation designs. Eval Rev (1984) 8:225–46.

37. Sommer A, Zeger SL. On estimating efficacyfrom clinical trials. Stat Med (1991) 10: 45–52.

38. Cuzick J, Edwards R, Segnan N. Adjusting fornon-compliance and contamination in randomizedclinical trials. Stat Med (1997) 16: 1017–29.

39. Fisher-Lapp K, Goetghebeur E. Practical proper-ties of some structural mean analyses of the effect

of compliance in randomized trials. Contr Clin Tri-als (1999) 20: 531–46.

40. Zelen M. A new design for randomized clinicaltrials. New Engl J Med (1979) 300: 1242–5.

41. Brewin C, Bradley C. Patient preferences andrandomised clinical trials. Br Med J (1989) 299:313–15.

42. Bradley C. Designing medical and educationalintervention studies. Diabet Care (1993) 16:509–18.

43. Silverman WA, Altman DG. Patient’s preferencesand randomised trials. Lancet (1996) 347: 171–4.

44. Ward E, King M, Lloyde M, et al. Randomizedcontrolled trial of non-directive counselling,cognitive-behaviour therapy, and usual generalpractitioner care for patients with depression.I: Clinical effectiveness. Br Med J (2000) 321:1383–8.

45. Bower P, Byford S, Sibbald B, et al. Random-ized controlled trial of non-directive counselling,cognitive-behaviour therapy, and usual generalpractitioner care for patients with depression.II: Cost effectiveness. Br Med J (2000) 321:1389–92.

46. Moher D, Schultz KF, Altman DG. The CON-SORT statement: revised recommendations forimproving the quality of reports of parallel-group randomized trials. Ann Int Med (2001) 134:657–62.

47. Altman DG, Schultz KF, Moher D, et al. Therevised CONSORT statement for reporting ran-domized trials: explanation and elaboration. AnnInt Med (2001) 134: 663–94.

Page 309: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

REPRODUCTIVE HEALTH

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 310: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

20

ContraceptionGILDA PIAGGIO∗

Department of Reproductive Health and Research, World Health Organisation, Geneva, Switzerland ∗The viewsexpressed in this paper are solely those of the author and do not necessarily reflect the views of the World

Health Organization.

INTRODUCTION

Contraception deals with the prevention of preg-nancy. The basic pillar of family planning isa wide spectrum of contraceptive methods thatenables men and women to make informedchoices about timing and size of family. Effec-tive and safe methods should be available suchthat they fit the needs of women and men invery diverse social and cultural settings world-wide. There should be reversible methods that pro-tect against sexually transmitted infections, can becontrolled by women, can be used by adolescentsand by breast-feeding women. The choice of acontraceptive method involves personal decisionsand depends on the stage of life, family situationor civil status, age, preferences and health profileof individuals and couples. Contraceptive researchand development has resulted in a variety of avail-able methods and continues to address importantissues to fill gaps in the available portfolio and inknowledge about method safety and effectiveness.

Contraceptives are generally used by healthyindividuals to prevent pregnancy and some alsoserve prophylactic purposes. They need to be

very safe so that it does not offset the benefitsobtained from their use, and this emphasizes theimportance of addressing the safety concerns.Both contraceptive efficacy and risks should bewell defined to enable the user and the prescriberto make the best choice of a contraceptivemethod.1

The development of effective and safe methodsof contraception poses special challenges. First,to achieve an understanding of the complex phys-iology of reproduction. Second, the effectivenessof many methods depends on a successful inter-action between men and women. Some methodshave to be used by the man or the couple butfailure (pregnancy) is always observed in thewoman. Third, for some methods, behaviouraland social factors are critical, determining com-pliance, which is very difficult to assess.

Contraceptive methods can in general be clas-sified into hormonal and non-hormonal meth-ods. Hormonal methods used by women includeoral contraceptive pills (OCs), injectable prepara-tions, implants, hormone-releasing devices (vagi-nal rings and progestogen-releasing IUDs) andpost-coital oral pills (visiting pills and emergency

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 311: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

316 TEXTBOOK OF CLINICAL TRIALS

contraceptive (EC) pills). Non-hormonal meth-ods used by women include intrauterine devices(IUDs), barrier methods (diaphragm and femalecondom), spermicides, natural methods (calendarand lactational amenorrhoea) and sterilisation,as well as immunocontraceptives, that are beingdeveloped. Hormonal male methods consist ofinjectable preparations and implants, still underdevelopment. Non-hormonal ones comprise con-doms, withdrawal and sterilisation (vasectomyand vas occlusion), while immunocontraceptivesor vaccines are under development.

These broad classes of contraceptive methodsdiffer in the length of the acting period, in themechanism of action, in the interval and way ofadministration or insertion, in the possibility ofcontrol by the woman, in their effectiveness andin their possible effects on health and indicationsfor their use. They also differ in the way theymeet the interests of men and women in differentsocial and cultural settings.

CONTRACEPTION METHODS:AN OVERVIEW

Table 20.1 presents a list of currently used con-traceptive methods with the interval of actionrequired, the pregnancy rates under typical andperfect use and the main safety concerns. Exten-sive and detailed descriptions of old and newcontraceptive methods are available.2,3 A com-prehensive review of the literature on contracep-tive efficacy was done by Trussell and Kost.4

HORMONAL CONTRACEPTIVESFOR WOMEN

Hormonal methods prevent conception by inhibit-ing ovulation or preventing implantation orchanging the quality of cervical mucus and thuspreventing sperm access to the cervix. Oral meth-ods exert their action if administered within acycle, while injectable preparations, implants andhormone-releasing devices are long-acting.

Oral Contraceptives

OCs comprise combined oral contraceptives(COCs) and progesterone-only pills (POPs).

Modern low-dose COCs contain a combinationof oestrogen and a progestin (20 to 35 mcg ofoestrogen and 150 mcg or less of levonorgestrel,or 200 to 300 mcg of norgestrel or 400 to1000 mcg of norethindrone or the equivalent ofanother progestin). There are monophasic formu-lations, with constant daily doses of oestrogenand progestin, biphasic ones, in which the doseof progestin changes in each of two periods andtriphasic ones, in which the dosages change ineach of the three seven-day periods of pill intakeduring the 21 days of pill cycle.

COCs prevent conception through the suppres-sion of ovulation via hypothalamic and pituitaryeffects and progestin-mediated alterations in theconsistency and properties of cervical mucus. Itis still unconfirmed if the mechanism of actionalso includes alterations in the endometrial liningand of tubal transport mechanisms.

POPs have a lower dose of progestin thando COCs (75 mcg of norgestrel or 350 mcg ofnorethindrone). They prevent conception througha combination of mechanisms including suppres-sion of ovulation, alteration of cervical mucus, ofthe endometrium and of the fallopian tubes.

Synthetic oestrogens were first developed inthe early 1930s and the more potent ethinyloestradiol in 1938, while synthetic orally activeprogestins were first produced in the early1950s. In this decade the first generation pro-gestins, like ethynodiol and lynesterol weredeveloped. OCs became available in the UnitedStates in 1959. A major breakthrough in thedevelopment of OCs was the finding that theoestrogen and progestin acted synergistically toinhibit the pituitary. This allowed the transi-tion from high-dose to low-dose, of both theoestrogen and the progestin. Low-dose oestro-gen COCs have less frequent complaints aboutbreast tenderness, nausea and leg cramps.5 Infor-mation on efficacy and common side-effectswas obtained from randomised clinical trials(RCTs),6,7 one of them comparing COCs andPOPs.7 The progestins that have been mostwidely studied are norethindrone (or norethis-terone) and levonorgestrel (second-generation).Around the 1990s, three new progestins have

Page 312: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 317

Table 20.1. Contraceptive methods available, their characteristics, typical failure and discontinuation rates andsafety concerns

Type MethodDuration of

action

Typical one yearpregnancy

rate per 100woman-years

Perfect one yearpregnancy

rate per 100woman-years Safety concerns

Hormonal for womenOCs COCs Daily 6–8 0.1 Cardiovascular diseases,

depression, hepaticadenomas

POPs Daily 1 (breastfeeding) 0.5 UnknownInjectables DMPA 3-month 0.3 0.3 None

NET-EN 2-month 0.3 0.3 NoneCombined once-a-month 0.3 0.3 Same as for COCs for severe

pathologiesImplants Norplant 5-year 0.1 0.1 Infection at implant site

Jadelle 5-year Infection at implant siteImplanon 3-year Infection at implant site

Vaginal rings Combined 3–12 months <3 <3 Vaginal irritation (at insertion)Lesions

Low-dose Lng 3–12 months 4.5 3.2 Vaginal irritation (at insertion)Lesions

Non-hormonal for womenIUD Copper 8–10 yrs 0.8 0.6 Increased menstrual bleeding,

anaemiaUterine perforationPID 20–30 days post-insertion

Lng-releasing 5–7 yrs Not available Not available STD riskBarrier Female condom Coitus-related 21 5 None

Diaphragmw/spermicide

Coitus-related 20 6 Toxic shock syndromeUrinary tract infections

Cervical cap Coitus-related 20 (nulliparous) 9 (nulliparous) Toxic shock syndrome40 (parous) 26 (parous)

Spermicides Spermicides Coitus-related 26 6 Vaginal infection and irritationNatural Periodic

abstinenceDaily 20 1–9 None

Lactationalamenorrhoea

Duration ofbreastfeeding

2 0.5 None

Sterilisation Female Permanent 0.5 0.5 Infection, anesthesiacomplications

Non-hormonal for menBarrier Male condom Coitus-related 14 3Natural Withdrawal Coitus-related 19 4Sterilisation Male (vasectomy) Permanent 0.2 0.1 Cancer of prostate and testes

been introduced (third-generation): norgestimate,desogestrel and gestodene.

OCs are likely to affect lipid and carbohydratemetabolism and the coagulation system, whichseem to be predictive of cardiovascular problems.The effect of low-dose OCs on these physiologi-cal functions has been shown to be non-existentor small.8 – 11 Another safety concern regards can-cer. Some studies have reported that the use ofhormonal contraceptives is protective of cancer

of the ovary and the endometrium.12 However,a possible link between the use of OCs for along period and breast cancer risk among youngwomen and cancer of the cervix is still a con-cern. Side-effects of COCs are nausea, headaches,dizziness, spotting, weight gain, breast tendernessand chloasma. For POPs, the main side-effect ismenstrual irregularities.

The association between OCs and cardiovas-cular diseases, namely venous thrombosis (VTE),

Page 313: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

318 TEXTBOOK OF CLINICAL TRIALS

ischaemic heart disease and cerebrovascular dis-ease, has been the object of a controversy.There seems to be a small increase in the riskof VTE, larger for the OCs containing third-generation progestins compared to those con-taining second-generation progestins. However,‘modern, low oestrogen dose OCs are extremelysafe if used appropriately in young women’.13

The most recent evidence suggests that myocar-dial infarction and stroke are rare and limited towomen who smoke cigarettes or have hyperten-sion or other cardiovascular risk factors.

OCs constitute the most common form ofsteroidal hormonal contraception and it is alsothe most common method of reversible contra-ception in countries other than China. It is esti-mated that 60 to 80 million women are OC usersworldwide.14 COCs are a safe method of contra-ception, only not to be recommended for womenwith cardiovascular diseases, diabetes mellitus,cancer or smoking.15 POPs can be taken by lac-tating women, but are not recommended in casesof thromboembolism or vein thrombosis.

OCs require daily attention by the womanand they have a high discontinuation rate: inprogrammes, it has been reported that less than50% of the women who start use continuetreatment after one year.16,17 Both COCs and POPsare very effective under perfect use,15 and undertypical use they are still effective (Table 20.1).

Injectable Preparations

The most common injectable contraceptive isthe progestin-only preparation depot-medroxy-progesterone acetate (DMPA), that provides con-traceptive protection for three months. Nor-ethisterone enantate (NET-EN) is also a progestin-only preparation that provides protection for twomonths. Combined oestrogen–progestin once-a-month injectable contraceptives are Cyclofem,which combines DMPA with an oestrogen, andMesigyna, which combines NET-EN with anoestrogen. Injectable preparations are long-actingand require the intervention of health care profes-sionals to administer the injection.

The mechanism of action of DMPA is sup-pression of ovulation and changes in the cervical

mucus and the endometrium. Combined injectablepreparations seem to have a mechanism of actionsimilar to COCs.

DMPA was first used as a contraceptivein the 1960s. Subsequently other alternativeinjectable contraceptives were developed amongwhich NET-EN gained widespread use. Once-a-month injectables were developed with thepurpose of overcoming the inconvenience of thedisruption of the menstrual bleeding pattern ofprogesterone-only preparations. WHO undertookthe evaluation and optimisation of the dose andoestrogen/progesterone ratio of Cyclofem andMesigyna.18 Also, the Chinese Injectable No.1, with a complicated administration schedule,was developed in China. A multicentre trialwas important to decide between the 100 mgor 150 mg dose for DMPA.19 A number ofclinical trials showed that NET-EN was highlyeffective.20 Other trials determined that NET-ENneeds to be administered every two months andalso compared DMPA and NET-EN.21

Regarding safety concerns, a large multicentrestudy provided reassurance that the use of DMPAwas not associated with cancer22 and thus DMPAwas registered in the United States as a long-acting contraceptive.23 Headache is a commoncomplaint, side-effects are weight gain and delayin the return of fertility. Menstrual irregularitiesare frequent, including prolonged and heavybleeding, mostly during the first months of use,and long periods of amenorrhoea.

About 16 million women worldwide are usersof injectable contraceptives: 13 million DMPAusers in 90 different countries, 1 million NET-EN users and 2 million once-a-month injectablesusers in Latin America and China, and introduc-tory studies are being conducted in several othercountries.24

Progesterone-only contraceptives can be takenby breastfeeding women when they do not haveaccess to other methods. It is not recommendedfor women with multiple risk factors for arte-rial cardiovascular disease or with unexplainedvaginal bleeding. A theoretical concern is theeffect on the neonate for breastfeeding women<6 weeks postpartum.

Page 314: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 319

Injectable contraceptives are very effective: 0.3pregnancies per 100 woman-years in the firstyear of use for DMPA, NET-EN and combinedinjectables15 (Table 20.1).

Implants

Implants used for contraception in women consistof a silicone rubber tube or capsule insertedsubdermally in the arm, containing a steroidor progestin released through it at a constantrate for several years. Implants are thus long-acting and require the intervention of health careprofessionals for insertion and removal. Norplantis the most widely used implant, consisting of sixlevonorgestrel-releasing rods with contraceptiveaction during five years. Jadelle is a two-rodlevonorgestrel implant with a five-year duration.Implanon is a single implant releasing 3-keto-desogestrel during three years. Another singleimplant releases ST 1435, a progestin rapidlymetabolised, making the implant suitable forlactating women.

The mechanism of action, similarly to that ofPOPs, includes a combination of effects, therebeing indications that ovulation suppression is notthe only one,25 since only about 50% of womenshow suppression of progesterone levels.26

Norplant became available in the United Statesin 1991, after regulatory approval based on largeclinical trials, which provided information ondiscontinuation rates and side-effects.27 NorplantII is a two-rod levonorgestrel implant easierto insert and remove and less conspicuousthan Norplant, with a modified manufacturingdesign, but its development was stopped aftertrials comparing it with Norplant showed lowerefficacy. The pregnancy rates were found todepend on the type of tubing used to manufacturethe implant, the soft tubing being an improvementover the hard tubing.

The main safety concern of implants is infec-tion at the implant site, otherwise they are consid-ered safe. The main side-effect observed amongNorplant users is disturbances in the menstrualbleeding pattern, with episodes of prolonged andheavy bleeding, mostly during the first months of

use. Common complaints are headache, weightgain, mood change and depression. The safety ofNorplant has been confirmed.28

It is estimated that one million women areNorplant users. Patterns of use are similar toother progestogen-only contraceptives. It is veryeffective, with 0.1 pregnancies per 100 woman-years in the first year of use.15

Vaginal Rings

Vaginal rings are devices releasing either acombination of a progestin and an oestrogen oronly a progestin, of which the most common arelevonorgestrel and progesterone.

The mechanism of action of levonorgestrel-releasing rings is similar to that of POPs, butwith an increased effect on cervical mucus.

The first ring was progesterone-only, then theprogesterone dose was reduced and combinedrings were developed. Several designs werestudied before an active core ring surroundedby an active silastic membrane was developed,leading to multi-compartment rings. A low-doselevonorgestrel contraceptive ring (20 mcg/day)was studied in WHO multicentre trials.29,30

Safety concerns related to the levonorgestrelring are menstrual disturbances, vaginal symp-toms, lesions and repeated expulsion.

Vaginal rings are attractive because they canbe discontinued easily by the woman herself andare thus under her control, they do not requiredaily attention like the OCs, and they are notcoitus-related like the condoms. The one yearpregnancy rates of combined rings were lessthan 3 pregnancies per 100 woman-years in amulticentre trial31 (Table 20.1).

Emergency Contraception

Emergency contraception (EC) based on pills isa post-coital method that is recommended up tothree to five days after an act of unprotected inter-course. The standard EC method was the Yuzperegimen of COCs (ethinylestradiol 100 mcg andlevonorgestrel 0.5 mg or dl-norgestrel 1.0 mgrepeated 12 hours later). A superior regimen

Page 315: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

320 TEXTBOOK OF CLINICAL TRIALS

consists of two 0.75 mg doses of levonorgestrel12 hours apart32 or a single 1.5 mg dose.33 Singledoses of the anti-progestin mifepristone, rangingfrom 10 mg to 600 mg is another method. EC isa back-up method and cannot be used regularly.

Regarding the mechanism of action, if unpro-tected intercourse occurs within a few days ofovulation, the only time when fertilisation canoccur, ECs will exert their effect prior to implan-tation being completed (day 6–7 after fertilisa-tion) and thus an established pregnancy wouldnot be disrupted.34,35 If EC is administered whena woman is already pregnant there is evidencefrom a study with pregnant women that ‘ethinyloestradiol is not a reliable abortifacient . . . andthat its efficiency as a postcoital contraceptivemay be limited to a relatively short period fol-lowing ovulation prior to implantation’.36

Although EC started in the 1980s in Europewith the Yuzpe regimen, it was only in 1997that the FDA in the United States declaredthe regimen safe and effective.37 The standardEC method until the late 1990s was the Yuzperegimen, when levonorgestrel was shown in atrial to be more effective and have a better side-effect profile.32 It has been registered in 1999in the United States and is available now in atleast 80 countries around the world. Mifepristonestarted to be used as an EC method at the doseof 600 mg. RCTs have shown that 10 mg canbe used instead of higher doses, with a reducedeffect of delay of menses observed with higherdoses.38

EC pills are relatively benign and they poseno serious safety concerns. Nausea and vomitingare common with high-oestrogen regimens. Lev-onorgestrel and mifepristone regimens have a bet-ter side-effect profile than the Yuzpe regimen.32,38

A concern with mifepristone is the delay ofmenses, mainly with high doses.38

In women receiving EC up to 72 hours afterunprotected intercourse, 1.1% pregnancy rateshave been observed after levonorgestrel with typ-ical use and 3.2% after Yuzpe.32 With the coitus-to-treatment interval extended to 120 hours, preg-nancy rates were 1.1% to 1.3% after mifepristone

doses ranging from 10 mg to 600 mg.38 With per-fect use, the corresponding figures in the citedtrials were 0.8% after levonorgestrel, 1.9% afterYuzpe and 0.3% to 1% after mifepristone.

NON-HORMONAL CONTRACEPTIVESFOR WOMEN

IUDs

Intrauterine devices (IUDs) are inert intrauterinerings or plastic devices with or without drugloading (copper or levonorgestrel). They are long-acting and require the intervention of health careprofessionals for insertion and removal. IUDsinserted after an unprotected coitus are alsoeffective as EC.

The mechanism of action of IUDs is to inhibitsperm and ovum transport and fertilisation.

IUDs were first introduced for contraceptivepurposes at the beginning of the 1900s. The firstIUD consisted of a loop of silk thread. Then ametal copper-releasing ring was developed. Inthe 1960s plastic coils became popular, like theLippes Loop. The IUD used in the 1970s, theplastic Dalkon Shield, was associated with highpregnancy rates and high infection rates. Safetyproblems with old devices included the risk ofcontracting pelvic inflammatory disease, and thatof septic abortion and infertility, with consequenthigh discontinuation rates. Randomised clinicaltrials (RCTs) published in 1975 compared theDalkon Shield with the Lippes Loop D and thenew Copper-7 and Copper-T 200. The superiorityof collared Copper-T was thus established.39 – 41

In the early 1980s trials were conducted includingNOVA T, MLCu250, Copper-T 220C and MLCu375.42 In other trials the Copper-T 380 showedsuperiority over the MLCu 375.43 – 44 Many IUDtrials were conducted in China to try to designcopper IUDs adapted to Chinese women. A majordevelopment was that of steroid-releasing IUDs.Devices releasing 20 mcg/day of levonorgestrelhave been shown to be very effective.47

High efficacy and low risk have been observedfor copper IUDs and the levonorgestrel-releasingIUD in large trials47 – 50 and there has been

Page 316: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 321

a progressive increase in effectiveness withcontinued research.26

IUDs are the most commonly used contra-ceptive methods after sterilisation, and the mostcommonly used reversible method in China. Itis estimated that about 120 million women areIUD users worldwide.51 The method is not rec-ommended for pregnant women and those withcurrent sexually transmitted infections or at riskof acquiring them.

Compliance is not a problem with IUDs,but there could instead be discontinuation forseveral reasons. New copper devices and the newhormone-releasing IUD combine low pregnancyrates with low discontinuation rates. One-yearpregnancy rates for the most efficient IUDs are<1 per 100 woman-years for the copper IUDs,0.1–0.2 for the levonorgestrel-releasing IUD and0.6–0.8 for the TCu-380A IUD.15

Barrier Methods

Barrier methods used by women are the dia-phragm, the female condom and spermicides.The importance of developing effective dualprotection barrier methods that provide protectionagainst sexually transmitted diseases (STDs) hasincreased in the last years with the spread ofthe HIV/AIDs epidemic. Condoms are barriermethods providing this feature of dual protection.

Barrier methods prevent conception by avoid-ing contact between sperm and the ovum. Theyact as a mechanical barrier (condom, diaphragm)or by inactivating the sperm (spermicides) or both(diaphragm with spermicide and cervical cap).

The female condom has become an importantalternative because it is under the woman’scontrol and can provide women with the ability ofprotecting themselves against STDs in situationswhere men refuse to use condoms. It is coitus-related and thus pregnancy can be the result ofeither method failure or inconsistency of use.Effectiveness: 5 pregnancies per 100 woman-years in the first year of use under perfect use(effective), 21 under typical use (only somewhateffective).15

The diaphragm is an elastic membrane withcavity rim, which may be attractive to poten-tial users but it lacks the advantage of protec-tion against STDs. A new microbicide-releasingdiaphragm is being developed to address thisconcern. Safety concerns for the diaphragm area history of toxic shock syndrome and allergyto latex.15 It is effective under perfect use(Table 20.1).

Spermicides are in the form of creams, jellies,foams in pressurised containers, foaming tabletsor suppositories. They are not very effective whenused by themselves, but can be used in combi-nation with other methods. Effectiveness: 6 preg-nancies per 100 woman-years in the first year ofuse if used correctly and consistently (effective),26 under typical use (only somewhat effective).15

Self-administered topical preparations with sper-micidal and microbicidal activities are being stud-ied, that provide both contraceptive and anti-infection protection and are under the control ofthe woman. The cellulose sulphate gel is one suchpreparation.

The cervical cap or sponge is a mushroomcap-shaped device releasing a spermicide andwhose concave side is applied over the cervix. Itsmaximal insertion time is 24 hours. It is effectiveunder perfect use (Table 20.1).

Natural Methods

Periodic abstinence restricts intercourse to theinfertile phase of the woman’s cycle, whichdepends on the ability of the woman to identifythe fertile period.52 It acts through preventionof fertilisation. It is effective under perfect use(Table 20.1).

The lactational amenorrhoea method is anaccepted method of contraception when the inter-est of the woman is birth-spacing, since it hasbeen observed that breastfeeding women withinsix months of delivery who are amenorrhoeichave <2% risk of becoming pregnant.53 Effec-tiveness: 0.5 pregnancies per 100 woman-yearsin the first year of perfect use (very effective), 2under typical use (effective).15

Page 317: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

322 TEXTBOOK OF CLINICAL TRIALS

Sterilisation

Sterilisation in women is an effective surgicalprocedure involving the blockage of the fallopiantubes, which transport mature ova from theovaries to the uterus. The most widely practisedtechniques are minilaparotomy and laparoscopy.Sterilisation is the only permanent contraceptivemethod and the most prevalent, 180 millioncouples have been reported to be sterilised(male or female). Research is in progress todevelop a non-surgical procedure. Effectiveness:0.5 pregnancies per 100 woman-years in the firstyear of use (always very effective).15

ImmunoContraceptives

Research is in progress for the development ofa female vaccine based on the human chorionicgonadotrophin molecule (hCG), for action afterfertilisation and before implantation.

HORMONAL CONTRACEPTIVES FOR MEN

Hormonal injectable methods for men that reducethe production of spermatozoa are based onweekly injections of testosterone. Research isin progress for the development of a longer-acting injectable preparation, namely at four-week intervals. Preparations based on DMPA andtestosterone and on NET-EN and testosterone areunder study.

Implants for men are also under investiga-tion. A depot androgen/progestin combinationhas recently demonstrated high contraceptiveefficacy with satisfactory short-term safety andrecovery of spermatogenesis.54 In this trial, a hor-monal implant was given every four months toreplace testosterone and the progestin DMPA wasinjected every three months.

NON-HORMONAL CONTRACEPTIVES FORMEN

Barrier Methods

Condoms used by men are tubes closed spher-ically on one side, normally made of a latex

membrane 0.06 to 0.07 mm thick.55 Most arelubricated, and some contain spermicides. Thefeature of dual protection and the mechanism ofaction are the same as those described for thefemale condom.

Research on the male condom has dealt withefficacy and acceptability issues. Old condomswere made of hard material, acceptability waslow and they were not very resistant to adversestorage conditions. Improvement has been madewith the latex condoms. A new polyurethanecondom was compared with latex condoms inRCTs.56

Male condom is the most widely used of bar-rier methods but its use is not more widespreadbecause it is often not accepted, mainly by themale partner. Condoms have practically no riskof side-effects. The only concern has been allergyto latex in latex condoms.

Effectiveness: 3 pregnancies per 100 woman-years in the first year of use if used correctlyand consistently (effective), 14 under typical use(only somewhat effective).15

Natural Methods

Withdrawal, or coitus interruptus, is a loweffectiveness method (Table 20.1) which dependson the man successfully withdrawing the penisfrom the vagina before ejaculation starts, and thuspreventing fertilisation.

Sterilisation

Vasectomy in the male is a simple surgicalprocedure, but it is questionable in some culturalsettings due to the incision and non-reversibility.Research is in progress for the development of areversible procedure.

A possible association between vasectomy andprostate cancer was a safety concern, but obser-vational studies have shown that if there issuch an association, the increased risk in vasec-tomised men compared to non-vasectomised menis small.57

Sterilisation is the only permanent contra-ceptive method for men. Effectiveness: 0.1–0.2

Page 318: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 323

pregnancies per 100 woman-years in the first yearof use (always very effective).15

ImmunoContraceptives

Research on vaccines for men is in progress basedon antibodies that neutralise the biological effectof the gonadotrophin hormone-releasing hormone(GnRH) and follicle-stimulating hormone (FSH),with a resulting oligospermia or azoospermia.

CLINICAL TRIAL METHODSIN CONTRACEPTION

Observational studies constitute the source ofinformation for comparisons of efficacy, discon-tinuation rates or safety across broad classesof contraceptive methods, for example implantsand IUDs. Women cannot usually be randomisedto different broad classes of methods becausethe woman’s choice of contraceptive is deter-mined by social, cultural and personal reasons.An exception to this was one large RCT whichallocated women at random to OCs or to vagi-nal methods (consisting of diaphragms, jellies,creams or foams).58 However, results were diffi-cult to interpret because there were many womenswitching methods and lost to follow-up.59,60

RCTs, on the other hand, have been animportant tool to find new safe and efficientregimens or devices within each broad class andimprove existing ones by answering questionsabout the best compound, the best dose, the bestinterval or route of administration (compounds)or the best physical properties (devices).

Sometimes partially randomised trials are usedto compare two types of hormonal contraceptiveswithin the same broad class with a non-hormonalone, used as a placebo control group. Forexample, in a WHO trial (data not published)on the effect of two injectable contraceptives(DMPA and NET-EN) on lipid and lipoproteinmetabolism, women requesting an injectablecontraceptive were allocated at random to thetwo preparations, and a group of non-hormone-releasing IUD users was the control. In another

WHO trial under preparation, two types ofimplants will be compared with regard to efficacyin preventing pregnancies, allocating at randomthe type of implant to women choosing implants.To assess the effect of hormones on the bleedingpattern, IUD users or sterilised women willconstitute a control group.

Clinical trials generally include an insufficientnumber of women to provide conclusive informa-tion on rare events, like cancer and cardiovasculardiseases.26 However, a careful documentation ofserious adverse events and predisposing risk fac-tors in the conduct of clinical trials should alwaysbe provided.1

General principles applying to the conductof clinical trials and post-registration assess-ment of steroidal contraceptive drugs have beenestablished.1,61 The development of a new con-traceptive method involves a long process until itis registered and reaches the market. The method-ology used depends on the stage of developmentand will be treated separately for Phase I/II andPhase III trials.

PHASE I/II TRIALS

Objectives

Phase I trials deal with drug safety and aimto determine an acceptable drug dosage, andalso study drug metabolism and bioavailability.In contraceptive research, Phase I trials areconducted to investigate the pharmacology ofsteroidal contraceptive or other contraceptivedrugs in healthy volunteers.61 Phase I trials mustbe preceded by initial toxicity studies in rodentsand pharmacokinetic studies in primates, whichgive an indication for the dose used in the firstclinical study.61

When a contraceptive has been assessed tobe safe in Phase I trials, its research progressesto Phase II trials, using the optimal dose andadministration schedule. Contraceptive Phase IItrials are small-scale investigations into the effec-tiveness and safety of a contraceptive method,carried out on closely monitored patients. Theyhave the goal to establish its mechanisms of

Page 319: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

324 TEXTBOOK OF CLINICAL TRIALS

action, metabolic effects and provide prelimi-nary estimates of the frequency of common side-effects, its effectiveness and its acceptability. Itis recommended to previously conduct repeated-dose toxicity and reproduction studies in animals.Phase II studies are conventionally subdividedinto Phase IIA, studies on the pharmacology ofthe drug in patients and Phase IIB, definitivedose-finding studies.61

Since steroidal contraceptives are used byhealthy people, it is desirable to assess the mini-mum effective dose at the initial stages of clinicaltesting.61 This can be done already in Phase Itrials instead of in later stages. The direct assess-ment of efficacy in small trials is not possiblebecause with reasonably effective contraceptivespregnancy is a rare event. There exist surro-gate variables for efficacy of a steroid drug forpregnancy prevention. Phase I trials on contra-ceptives, therefore, are often also used to lookat these surrogates of efficacy in addition tosafety issues, so that Phase I and Phase II tri-als are combined to evaluate both safety andendocrinological endpoints. Examples of surro-gates of efficacy are hormone levels as indica-tors of inhibition of ovulation in contraceptivesfor women, sperm concentration in long-actingandrogen–progestogen formulations for men asan indicator of inhibition of spermatogenesis, andamount of serum hCG antibodies in immunocon-traceptive trials for women. Serological and clin-ical diagnoses of pregnancies are also conducted.

In the case of hormonal contraceptives forwomen, the following clinical pharmacologicalparameters should be studied to assess theinhibition of ovulation:1

1. Hormonal activity: the nature and hormonalactivity of a steroid contained in the contra-ceptive and its principal metabolites shouldbe described.

2. Pharmacological action on ovarian function:the mechanism by which the contraceptiveeffect is attained should be described. Thisinvolves, in the first place, a description ofthe ovarian function in women with normalovulatory function, measuring plasma concen-tration of ovarian steroids and gonadotrophins

and conducting ultrasound of ovaries. Thus,information is obtained on the extent to whichovarian function is suppressed. Time to onsetof action and to return to normal ovarian func-tion after discontinuation should be studied.

3. Other pharmacological effects on the repro-ductive system: effects on the endometriumand on the cervical mucus.

4. Effects on other endocrine systems: pituitary,adrenal, thyroid.

5. Metabolic effects: effects on hemostatic vari-ables, plasma lipids and carbohydrate metabolism.For products not containing an oestrogenand suppressing oestrogen secretion, effect onbone mineral density and bone metabolism.

Design

For contraception in women, recruitment intoPhase I trials is conducted among volunteers ofreproductive age, not pregnant or lactating, regu-larly menstruating, identified in family planningclinics or selected community groups, who areIUD users or sterilised, and therefore, not at riskof becoming pregnant. Users of other hormonalcontraceptives than the one being studied are notacceptable because the method might interferewith the assessment of clinical and laboratoryparameters. Other selection criteria depend onthe contraceptive being studied. For example, forcontraceptive vaccines, acute hypersensitivity tothe carrier should be excluded, and if reversibil-ity cannot be assured, participants should beno younger than 25 years and have had chil-dren previously.

A series of sequential studies using differentdose levels are conducted to assess the minimumeffective dose. These studies involve doses thatare two or three times the initial dose. For eachdose level, a study is conducted in 10–20 healthyvolunteers.61

Selection criteria for Phase II trials on women,different to those mentioned for Phase I trials, arebeing currently exposed to the risk of pregnancyand have proven fertility. At this stage, ifvolunteers participating in Phase I studies areIUD users and they are willing to continue, thenthe IUD should be removed to assess efficacy.

Page 320: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 325

Phase IIB trials require about 50–100 subjectsto assess efficacy and side-effects of the dosagedetermined in early trials (Phase I–IIA).

Examples

Examples of Phase I and Phase II clinical trialsare the ones conducted with injectable prepa-rations to evaluate well-known potent syntheticprogestins in combination. A Phase I trial testedthe use of progesterone as an alternative.62

Several examples for injectable contraceptivesare summarised in a review by Newton et al.24

An early pharmacological trial on Cyclofem with11 women involved one pre-treatment cycle, a3-month treatment phase with an injection ofCyclofem every 28 days and then a 3-monthrecovery phase. It confirmed that ovulation wasinhibited and that inhibition of luteal activity per-sists after the last injection for several cycles.63

A comparative non-randomised study of Cyclo-fem and Mesigyna with 15 women, 8 receiv-ing Cyclofem and 7 Mesigyna, involved onepre-treatment cycle, three treatment cycles of28 days and a 90-day follow-up period. Theresults showed that the suppressive effect ofCyclofem was greater than Mesigyna.64

A four-arm trial of reduced dose of medrox-yprogesterone acetate and oestradiol cypionate inCyclofem recruited 88 women into the follow-ing groups: Cyclofem full dose, Cyclofem halfdose, DMPA full dose, DMPA half dose. All fourpreparations were found to be effective in inhibit-ing ovulation for at least one month after theinjection, and the combined preparations showedmore regular bleeding patterns.65

Metabolic Studies

Specific Phase II studies on biochemical variablesare conducted when required. These variablesinclude lipid and lipoprotein metabolism, coagu-lation, fibrinolysis and platelet function as well asother physiological events such as vaginal bloodloss.61 The parallel group designs have been themost common design in this type of study, but fac-torial designs have also been used. Newton et al.24

describe examples of these studies for injectable

once-a-month preparations. The crossover designhas been used in a metabolic study to comparethree different progestogens (norgestrel, norethis-terone and medroxyprogesterone) in treatmentperiods of 3-week duration immediately precededby 3 weeks of ‘wash out’.64

Ethical Considerations

When conducting Phase I/II trials, the fact thatcontraceptive methods are used by healthy indi-viduals implies a different risk/benefit assess-ment compared with therapeutic drugs for life-threatening diseases. This justifies the assessmentof the minimum effective dose at early stages ofdevelopment.

When volunteers are advised on the risks andbenefits of the study in order to seek theirinformed consent, the specific risks of receivinga steroidal contraceptive should be explained.

PHASE III TRIALS

Objectives

After a contraceptive is shown to be reasonablyeffective in Phase II trials, it is essential to com-pare it with the current standard contraceptive(s)within the same broad class in a large trial involv-ing a substantial number of patients, with the goalof establishing its efficacy.26 Phase III trials per-mit more refined estimates of safety, effectivenessand acceptability in comparison with a standard.In contraceptive research, this information pro-vides the basis for introducing a hormonal contra-ceptive into family planning practice in field trialsin various settings, as a prerequisite for registra-tion with drug regulatory authorities (see introduc-tory trials in the other Issues section later).

Design and Trial Size

The most common design to compare meth-ods within each broad class of contracep-tives has been the parallel group design,with simple randomisation in single-centre tri-als, and stratified by centre in multicentretrials. This was the case for the develop-ment of OCs,5 – 7 injectables,18,19,22,32,33,38,55,67

Page 321: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

326 TEXTBOOK OF CLINICAL TRIALS

implants,27 IUDs,40 – 46 condoms68 and ECregimens.32,38,56

The control used in RCTs to compare effi-cacy of methods is typically an active control,since a placebo control would not be ethical.Examples of comparisons of new versus standard,respectively, are the following: NET-EN ver-sus DMPA (injectables), Norplant II versus Nor-plant (implants), steroid-releasing versus copperIUDs, polyurethane condom versus latex con-dom, Yuzpe versus levonorgestrel regimens (EC).Placebo controls have been used to assess effi-cacy of a treatment to improve the bleedingpattern disrupted by the use of progestin-onlycontraceptives.

In contraceptive trials, the main end-point forefficacy is based on pregnancies, a rare event.The number of subjects required per group todetect as significant a difference between groupscorresponding to a doubling of the rate, in a two-sided 5% level test, with 80% power, is usuallylarge (1140 for a control rate of 2%, 4700 for acontrol rate of 0.5%).

When the effect of two factors is of interest andif an interaction is likely to be present, the samplesize needed is approximately double in a four-arm2 × 2 trial than in a two-arm trial comparing thetwo doses of only one component. This might bea reason for which factorial designs have not beencommonly used in contraceptive efficacy trials.In the study of bleeding patterns among users ofprogestogen-only contraceptives, an example ofa factorial design is provided by a trial compar-ing the effect of low-dose aspirin and vitaminE alone or in combination on Norplant-inducedprolonged bleeding.69

For registration of a steroidal contraceptive,some regulatory agencies require clinical trialswith 200 (FDA) or 400 subjects completing twoyears of observation, while some others requireeven less.26 It is clear that this number does notprovide sufficient power to detect a differencein rare events with the control. Nor does itprovide sufficient precision for a confidenceinterval estimation of a rare event: 5 eventsobserved in 200 subjects gives a rate of 2.5%with 95% CI of 1% to 10%. On the other

hand, the Committee for Proprietary MedicinalProducts (CPMP) recommends that 20 000 cyclesbe observed, which at 13 28-day cycles peryear is equivalent to 1540 women-years or 770women followed completely for two years. Thiscalculation is based on the criterion that thedifference between the upper 95% confidencelimit for the Pearl index (number of pregnanciesper 100 women-years) and the point estimatedoes not exceed 1.1

Recruitment

Participants in Phase III contraceptive trials areusually recruited in family planning clinics. Amajority of attendants requesting contraception infamily planning clinics (other than STD clinics)are healthy. On arrival, subjects (women or men)or couples requesting or using the method understudy are screened for eligibility. An eligibilitycriterion common to contraceptive efficacy trialsis good general health, but others are specificto the contraceptive method, depending on thecorresponding safety concerns and eligibilitycriteria.15

Trial participants are not therefore a randomsample from women in reproductive age, andtheir particular characteristics affect externalvalidity, making difficult the generalisation ofresults to wider populations.4,70 First, womenchoosing a particular broad class of contraceptiveare likely to be self-selected. For example,implants are often selected by older women.Second, clinicians are likely to select differenttypes of women for different contraceptives.Third, women who are long-term users of amethod and are happy with it do not come tothe clinic and are less likely to be enrolled.

According to current ethical principles, all eli-gible subjects have to provide informed consentbefore being enrolled into the trial. In contra-ceptive trials, obtaining this consent from ado-lescents is problematic because some countriesrequire a minimum legal age to provide consent.Consent from relatives is not always possible dueto the need to maintain confidentiality in sensitiveissues like contraception.

Page 322: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 327

Randomisation, Allocation Concealmentand Blinding

Randomisation in contraceptive RCTs is achievedin a similar way to RCTs in other areas, by the useof a computer-generated randomisation sequence.In multicentre trials the randomisation is usuallystratified by centre, done in blocks, and preparedcentrally. Treatments are allocated by assigningthe next consecutive subject number in the ran-domisation sequence to the next enrolled subject.

Allocation concealment strategies includesealed opaque envelopes (unblinded trials) andpacking of drugs by a central company (blindedtrials). Many multicentre multinational RCTshave included settings with poor telecommunica-tion systems, in which central telephone randomi-sation as a strategy for allocation concealmentwas not a feasible option.

Most clinical trials comparing implants, IUDsand other devices cannot be blinded becauseinsertion or placement of the device usuallyimplies that both the administrator and theuser will see it, touch it or smell it. Thesituation is similar in sterilisation trials in whichsurgical procedures are compared. Some trialscomparing IUDs or sterilisation techniques canbe blinded to the woman but not to the deviceor procedure administrator. Depending on thetreatments being compared, many clinical trialscomparing injectable preparations, drugs for ECand possibly spermicides can be blinded to users,treatment administrators and outcome evaluators(‘double blind’).

Blinding in contraceptive trials can avert biasafter treatment allocation by preventing thefollowing causes of bias. First, it is possible thatthe health care provider or the user will tend todiscontinue one treatment more than the other.Second, ascertainment bias could be introducedin the evaluation of subjective outcomes, likelesions in contraceptive rings trials. The delayin the recognition of pregnancy, the imprecisionin the estimate of the date of conception andthe occurrence of chemical pregnancies notedabove are sources of uncertainty which also pavethe way for the introduction of ascertainmentbias. Bias could still be present even in blinded

trials due to unblinding caused by ancillaryinformation, like differential side-effects from thetreatments being compared. For example, in ECtrials, higher doses of a compound might causenausea more frequently than lower doses.

Effectiveness and Efficacy of ContraceptiveMethods: Theoretical Model

Effectiveness of a method can be defined as ‘theproportionate reduction in fecundability causedby the use of a method’.4 As such, it isnot measurable because one would have tocompare the rate of conception under use ofthe method with that in the same populationnot practising contraception nor lactating. Thecommon use of effectiveness is to denote howwell a method works. Sometimes efficacy is usedwith this meaning.

Steiner et al.71 proposed a theoretical model(see also Refs. 4 and 51) in which the couple’sability to conceive and the timing and frequencyof intercourse determine the unobservable preg-nancy rate in the absence of contraception. In thepresence of (perfect or imperfect) contraceptiveprotection this pregnancy rate is reduced, deter-mining the ‘typical’ pregnancy rate. This typicalrate is composed of the perfect use pregnancy rateand the imperfect use pregnancy rate, weightedby the proportion of each type of user.

A measure of efficacy that implies a com-parison with the same treated population underplacebo is the proportion of pregnancies pre-vented out of the expected pregnancies, or pre-ventable fraction, given by 1− observed preg-nancy rate/expected pregnancy rate.

The contraceptive method efficacy is the pre-ventable fraction under conditions of perfect use,and the effectiveness is the preventable fractionunder conditions of typical use. The differencebetween these two rates depends on both thepregnancy rates under each condition and the pro-portions of the two types of users.52

Estimation of Pregnancy Rates and Efficacyin Non-coitus Related Methods

Sterilisation, which acts continuously and is per-manent, and methods which act continuously but

Page 323: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

328 TEXTBOOK OF CLINICAL TRIALS

are reversible, like IUDs and long-acting hor-monal methods, are non-coitus related methodsin the sense they do not require any particularaction by the user to be effective.

If there is no daily monitoring of follicu-lar growth or urinary metabolites, the estimatesof efficacy by the preventable fraction are con-structed with external estimates of probabilitiesof conception72 and with data on pattern and tim-ing of intercourse. The latter is difficult to obtainin large trials comparing regular use contracep-tives, therefore the common measure of how wella contraceptive method works in preventing preg-nancy is failure, or the occurrence of pregnancyin the period of time during which the contracep-tive is used.4 Thus, efficacy in this loose senseis evaluated in the woman and (inversely) mea-sured in number of pregnancies per woman-timeof observation (typically per 100 woman-years).

For reversible methods (for example IUDs andlong-acting hormonal methods), the assessmentof the pregnancy status might be difficult dueto the following sources of uncertainty:52,70

(1) when the decision to stop using a method ismade, the pregnancy might be recognised afterthe method is stopped; (2) imprecision in theestimate of the date of conception when theestimate is based on the date of start of thelast menstrual period; (3) occurrence of early‘chemical’ pregnancies, of which a considerablepercentage is lost before reaching the stage ofclinical pregnancy and (4) early foetal losses,which might be unnoticed by the woman.

In clinical trials comparing regular contracep-tives, women are usually required to return tothe clinic at specified intervals during a follow-up period. The timing of reporting pregnanciesvaries among women. It is important that themethod of counting pregnancies does not dependon this timing. The ‘active follow-up’ preventsthis problem by defining a cut-off date and con-tacting women three months later to learn theirpregnancy status at the cut-off date.70

One of the main problems affecting data qualityis the loss to follow-up in trials comparing regularuse contraceptives, that require long periods offollow-up. Bardin and Sivin26 discuss the bias

introduced in comparative trials by the failure toobserve all subjects through the completion of thestudy. The magnitude of the bias depends on theproportion of subjects lost to follow-up and theoutcome mean or proportion in this group.

The estimation of the pregnancy rate using thePearl index has been shown to be not appropriatedue to the decline in fertility of the cohortbeing followed with duration of the contraceptiveuse. This decline in fertility has been illustratedby Sivin and Schmidt42 from long-term studies,where a progressive increase in the effectivenessof each device with age was observed, as well asa wider difference in failure rates among devicesand a progressive increase in effectiveness withcontinued research.

Life table techniques have been the recom-mended analysis technique for years, using thesingle decrement method, in which women whoexit for other reasons than pregnancy are cen-sored at the time of exit.52,70,73 The estimation ofthe pregnancy rate is given by the cumulative lifetable rate (net rate). The daily life table method,using the Kaplan–Meier product-limit estimateof the cumulative pregnancy rate (net rate) givessimilar results and leads naturally to the logranktest to compare groups.73

A difficulty in the estimation of pregnancy ratesis the presence of other reasons of discontinua-tion. For IUDs, the commonly analysed discon-tinuation reasons are expulsion, medical removal(due to pain, bleeding or pelvic inflammatory dis-ease), non-medical removal (wish to become preg-nant, no further need of contraception) and lossto follow-up. The use of net rates from life tabletechniques deals with competing causes by cen-soring. This approach has been criticised by Taiet al.74 because it assumes independence of thedifferent reasons for discontinuation. They arguethat the estimates obtained by this approach areoverestimates of the rate for each of the reasons,as can be seen by the fact that the sum of the prob-abilities of discontinuation due to all the reasonsis greater than one. They are hypothetical ratesthat would occur if discontinuations for other rea-sons could not take place. Tai et al. recommend

Page 324: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 329

the use of cumulative incidence rates, that esti-mate the pregnancy rate in the presence of com-peting causes. They also argue in favour of thecompeting risk methodology to compare cumula-tive incidences between two methods (two IUDs).Farley discussed these recommendations, con-cluding that ‘Kaplan–Meier estimates are moreappropriate when estimating the effectiveness ofa contraceptive method, . . . (and the) cumulativeincidence estimates are more appropriate whenmaking programmatic decisions regarding contra-ceptive methods’.75

Behavioural Patterns and the Estimationof Efficacy of Contraceptive Methods

Methods that are used around the time of inter-course, like barrier methods and spermicides, arecoitus-related and require a high degree of usercompliance with the correct way of using themethod in order to prevent pregnancy reliably.52

For these methods, a pregnancy can be the resultof either a method failure or lack of use or incor-rect use. OCs are not coitus-related but have tobe taken daily by women, posing similar types ofproblems. Similarly, periodic abstinence relies forits effectiveness on rules of when to abstain fromsexual intercourse in order to avoid pregnancy,and users may depart from these rules.

In order to separate a method failure froma lack of use or an incorrect use of a methodwhich is coitus-related, investigators denotedpregnancies in which the method had not beenused or had been incorrectly used as ‘userfailures’. Pregnancies in which the method hadbeen correctly used were denoted as ‘methodfailures’. Trussell70 illustrated the inadequacyof computing pregnancy rates corresponding tothese two sources using the same denominatorthat includes all exposure from both ‘perfect’and ‘imperfect’ use. He proposed to collectinformation on ‘imperfect’ use for all months ofexposure, or alternatively obtain information oncorrect and consistent use at the end of the trial,and calculate separate rates for ‘perfect’ users andfor ‘imperfect’ users.

For comparative trials, this issue is addressedby conducting a stratified analysis by imperfect

and perfect use. The comparison of the effective-ness between treatments for all cycles (whetherperfect or imperfect use took place) provides thetreatment effect under conditions of typical use.The comparison of the efficacy between treat-ments is given by a subgroup analysis with cyclesof perfect use.

Estimation of Pregnancy Rates, Effectivenessand Efficacy of Emergency Contraceptives

In trials comparing EC methods, it is feasible toknow the pattern and timing of intercourse: pro-tocols usually require a single unprotected act ofintercourse, and its date is reported by the woman,as well as the date of start of the last menstrualperiod. A crude estimate of efficacy is the pro-portion of observed pregnancies (number of preg-nancies divided by the number of women treated),but this measure is affected by the distribution oftiming of intercourse with respect to the woman’scycle. The number of pregnancies occurring in thesame population under no use of method is unob-servable. It is estimated by the expected number ofpregnancies, calculated by multiplying the numberof women having unprotected intercourse on eachday of the menstrual cycle by the probability ofconception on that day, using external probabilitiesof conception.72 The proportion of pregnanciesprevented, or preventable fraction, is given by (1−[observed pregnancies/expected pregnancies]). Atechnique for the construction of confidence inter-vals for the preventable fraction is available, usingvariance–covariance matrices from the externalestimates of conception probabilities.72 The day ofovulation, and thereof the day of the cycle in whichintercourse took place, is usually estimated fromthe date of the last menstrual period as reported bythe women, and thus subject to imprecision.

The success of EC depends on not havingfurther unprotected acts of intercourse duringthe same cycle, since the EC treatment doesnot prevent pregnancies resulting from theseacts.32,38 Therefore user compliance can affectthe effectiveness of the method. If the ECtreatment includes two doses, its success alsodepends on the treatment compliance, i.e., onthe woman taking the second dose (at home)

Page 325: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

330 TEXTBOOK OF CLINICAL TRIALS

at the correct interval. The estimation of typicaland perfect use is therefore relevant in EC trials.When comparing groups, this has been achievedby two analyses, one including all subjects and asubgroup analysis with perfect users.

Caveats in Comparing Efficacy andEffectiveness Between Groups:

Intention-to-Treat and Subgroup Analysis

In RCTs, the comparison of estimates of effective-ness obtained with two treatments corresponds toan intention-to-treat (ITT) analysis (in the absenceof lost to follow-up), while that of efficacy corre-sponds to a subgroup analysis of perfect users. Inlarge RCTs, the proportions of perfect and imper-fect users are likely to be similar in the treatmentsbeing compared, so that differences in effective-ness between two treatments will depend mainlyon differences between the pregnancy rates underthe two treatments. Thus, the comparisons of effec-tiveness between treatments within the RCT arenot biased (internal validity).

On the other hand, the comparison of efficacybetween treatments has the limitations of a sub-group analysis. In the first place, the advantagesof randomisation are lost, since imperfect usersare excluded from the analysis. When the sub-group analysis is based on subject characteristicsthat are not affected by treatment, like baselinevariables, each smaller subgroup is like a smallerrandomised trial.76 But when the subgroup isdefined by a variable observed after randomi-sation and potentially affected by the treatment,then the treatment effect may influence classi-fication into the subgroup. The treatment effectobserved in the subgroup would then be biased.This caveat is illustrated by an RCT to comparemifepristone and levonorgestrel for EC. The mainvariable to define perfect use is adhering to theprotocol requirement of not having further actsof unprotected intercourse before the start of nextmenses. Mifepristone is known to delay ovulationand thus is associated with a delay in the startof menses, while levonorgestrel is not.33,38 Thisprovides women under both regimens with a dif-ferential opportunity to violate the requirement,and then the effects of treatment under perfect

use and under typical use are likely to be of dif-ferent magnitude. In the second place, unless thetrial was designed to have sufficient power at thesubgroup level, a relevant treatment effect in thesubgroup will not be detected, or the confidenceinterval estimate of the effect will be imprecise.Stratification into perfect and imperfect users isanother way of reporting results.32,38

Assessment of Side-effects and Acceptabilityof Contraceptive Methods

In women using contraceptives regularly, infor-mation on side-effects and complaints such asnausea, vomiting, diarrhoea, fatigue, dizziness,headache, lower abdominal pain and breast ten-derness, as well as adverse events, can be col-lected in follow-up visits. Differences betweengroups in events which have a rate of 5 or moreper 100 can be detected with small trials. Ratesof 1 to 5 per 100 require larger trials. Detectionof differences for lower rates would require evenlarger trials.26

Side-effects of EC are complaints (or signsand symptoms) of nausea, vomiting, diarrhoea,fatigue, dizziness, headache, lower abdominalpain and breast tenderness, and are treated asbinomial proportions. Delay in the return of nextmenses, a time-to-event outcome, is undesirablebecause it gives the opportunity for more acts ofunprotected intercourse and is a source of anxietyfor the woman. It has been found to be a concernwith mifepristone, mainly with high doses.

Acceptability of a contraceptive methoddepends not only on the characteristics of themethod but on the service delivery setting andthe socio-demographic and economic factors of aparticular country.24 It can be assessed by ques-tions to the user regarding satisfaction, willing-ness to recommend the method to others and topay to have access to the method. Many side-effects of regular use contraceptives are reflectedin discontinuation. Some of these discontinua-tion reasons are related to the acceptability of themethod by the user. For long-acting hormonalmethods, for example, the main discontinuationreason is disturbances in the menstrual bleedingpattern, largely determined by cultural and social

Page 326: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 331

factors. An Egyptian study on the acceptability ofonce-a-month injectable contraceptives found dif-ferences between women discontinuing and thosecontinuing in all measures of acceptability.77 Fac-tors important in determining acceptability were:age, contraceptive history, learning about injecta-bles, the husband’s attitude and knowing aboutanother user’s satisfaction.

OTHER ISSUES

Vaginal Bleeding Patterns

Hormonal contraception is often associated withdisturbances in the vaginal bleeding pattern.These disorders are common with the use ofprogestogen-only hormonal methods and they donot imply a health risk per se, since it has beenshown that the amount of blood loss is not aproblem. They may be tolerated by the woman,and this depends on cultural and behaviouralpatterns. The measurement of bleeding patternscan be achieved by direct questions to women,by their completing menstrual diaries or bymeasuring blood loss.26

The most used method of analysis of men-strual diaries is the reference period method,78

which was standardised by WHO using a 90-dayreference period.79 It consists in analysing bleed-ing/spotting records in women’s menstrual diarycards by taking fixed-length segments of time (thereference period, for which a 90-day segment hasbeen used as a convention) and deriving measuresof bleeding pattern, or indices. The following10 indices have been recommended:80 numberof bleeding/spotting days, number of spottingdays, number of bleeding/spotting episodes, num-ber of spotting-only episodes, mean, range andmaximum value of lengths of bleeding/spottingepisodes, mean, range and maximum valueof lengths of bleeding-free intervals. Theseindices have been analysed using box–whiskerplots and non-parametric analysis techniques.To summarise the information provided, Belseyand Carlson81 conducted a principal componentanalysis with data from women using differentcontraceptives, and concluded that most of theessential information about a woman’s bleeding

pattern was contained in four indices: numberof bleeding/spotting episodes, mean length ofepisodes, mean length of bleeding-free intervalsand the range of bleeding-free interval lengths.Based on the indices, the following clinicallyimportant patterns are derived:80 no bleeding(amenorrhoea), prolonged, frequent, infrequentand irregular bleeding.

The 90-day reference period method wasapplied to diary data collected from womentreated with Cyclofem, Mesigyna, a low-doselevonorgestrel-releasing ring and DMPA takingpart in Phase III WHO clinical trials. Amongwomen using once-a-month injectable and thelevonorgestrel-releasing ring the incidence ofacceptable patterns was higher than amongDMPA users, although the patterns were differentfrom those of normally menstruating women.82

Several placebo controlled RCTs have beenconducted to investigate the therapeutic effec-tiveness of one or more treatments for bleedingirregularities. An example is given by a trial com-paring the bleeding pattern of untreated DMPAusers with groups treated with ethinyl oestradiolor oestrone sulphate.83

Equivalence Trials

In equivalence trials, the objective is to showthat a new contraceptive or contraceptive devicethat has advantages with regard to side-effects,user preference or cost has equivalent efficacyto that of the standard one. Failure to detect adifference in a conventional significance test doesnot imply equivalence and a significant differencemay correspond to equivalence within a marginof clinical relevance (or margin of equivalence,denoted by �). A confidence interval for thedifference between the methods, on the absoluteor relative scale, on the other hand, is meaningfulbecause it contains all likely values for the effect.If these are all smaller than �, then the methodscan be considered equivalent.

For trials in the reproductive field, it has beenreported that the first difficulty encountered withthe corresponding trials was that the ‘equiva-lence’ nature of the trial had not been recognisedby the design teams. Therefore no statement of

Page 327: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

332 TEXTBOOK OF CLINICAL TRIALS

an equivalence hypothesis with the specificationof the margin of equivalence had been formu-lated, nor had the sample size been calculatedwith the objective to demonstrate equivalence.This resulted in underpowered trials to demon-strate equivalence within a clinically relevant dif-ference. In only a few cases was an equivalencehypothesis stated with a clear specification of themargin of equivalence.84

An example of an equivalence trial is givenby the WHO Yuzpe-levonorgestrel trial.32 TheYuzpe regimen using combined oral contracep-tives had been used in EC as an effective methodto prevent unwanted pregnancy. However, likeother regimens containing oestrogen, it is asso-ciated with side-effects like nausea and vom-iting. The progestogen regimen levonorgestrelwas shown to be better tolerated and equally ormore effective, and it was recommended as abetter alternative to the Yuzpe regimen. Anotherexample is a trial establishing the equivalencebetween a single dose and a split dose of 1.5 mglevonorgestrel for EC.33

Introductory Trials and Phase IV Trials

Introductory trials are field studies to assessacceptability, effectiveness, continuation of use,side-effects and service-related needs of a methodin specific populations, in the context of familyplanning services.61 They are expanded Phase IIItrials. Some countries may require to conductthese trials in a network of 5–10 centres, includ-ing an acceptability component. Such studiesmight involve 1000 to 5000 subjects.

Phase IV trials are those conducted after adrug has been approved for marketing, to fur-ther investigate adverse effects of the drug.Very rare events cannot be rigorously assessedbefore the contraceptive drug reaches the mar-ket because even Phase III trials do not havesufficient power. Strategies for post-registrationsurveillance of contraceptive drugs are reportsof adverse reactions, large-scale experimentalstudies, formal epidemiological studies and indi-rect correlational studies.61 The most commonlyused consists of epidemiological studies. Post-registration RCTs are costly, lack sufficient power

to detect uncommon but important reactions, can-not last long enough to identify long-term effectsand the experimental group cannot be comparedto a placebo.61,85 This last limitation implies thatwhen comparing two active treatments throughan RCT, the absence of effect does not mean thateither has no risk compared to a placebo. Anotherlimitation of RCTs as a strategy at this stage ofdevelopment of the contraceptive is that RCTsare conducted on healthy women and the riskof adverse reactions might be relevant in womenwith risk conditions.60

Systematic Reviews

Systematic reviews on contraceptive methods areavailable in the Cochrane Library.86 A search wasdone using the word ‘contraception’, obtaining45 hits out of 1456 total. A systematic reviewwas included in the list if it included comparisonsof efficacy, side-effects or acceptability of thesemethods or effectiveness of treatments for bleed-ing irregularities induced by them. Trials includ-ing contraceptives as treatment for complicationsor diseases and those comparing treatments forcomplications due to contraceptive use other thanbleeding irregularities were not included. Subfer-tility trials were not included. The title and ifnecessary the abstract were examined to assesswhether the review was eligible. The 22 reviewssatisfying these criteria are listed in Table 20.2.

As an example, the systematic review ‘Inter-ventions for emergency contraception’ included33 trials, most of which were conducted in China.The authors conducted 46 meta-analyses with dif-ferent comparisons and various outcomes com-prising efficacy (pregnancies) and side-effects,including delay of menses. For the mifepristonedose-comparisons they grouped the doses used indifferent trials in low, mid and high doses.87

ACKNOWLEDGEMENTS

The author is greatful to Dr TMM Farley foruseful discussions regarding the structure of thechapter, and to Dr P D Griffin for proof-readingthe section on Phase I/II trials.

The UNDP/UNFPA/WHO/World Bank Spe-cial Programme of Research, Development

Page 328: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 333

Table 20.2. Systematic reviews in The Cochrane Database of Systematic Reviews addressing efficacy orside-effects of contraceptive methods

Method Stage Review

OCs Complete review Biphasic versus monophasic oral contraceptives for contraceptionComplete review Biphasic versus triphasic oral contraceptives for contraceptionProtocol Triphasic versus monophasic oral contraceptives for contraceptionProtocol Skin patch and vaginal ring versus combined oral contraceptives for

contraceptionProtocol Comparison of acceptability of low-dose oral contraceptives

containing norethisteroneInjectables Protocol Treatment of vaginal bleeding irregularities induced by progestin-only

contraceptivesImplants Protocol Subdermal implantable contraceptives versus other forms of reversible

contraceptives as effective methods of preventing pregnancyEC Complete review Interventions for emergency contraceptionIUDs Complete review Frameless versus classical intrauterine device for contraception

Complete review Hormonally impregnated intrauterine systems (IUSs) versus otherforms of reversible contraceptives as effective methods ofpreventing pregnancy

Barrier Complete review Condom effectiveness in reducing heterosexual HIV transmissionComplete review Diaphragm versus diaphragm with spermicides for contraceptionComplete review Sponge versus diaphragm for contraceptionProtocol Cervical cap versus diaphragm for contraceptionProtocol Female condom for preventing heterosexually transmitted HIV

infection in womenProtocol Non-latex versus latex condoms for contraception

Lactationalamenorrhoea

Protocol Lactational amenorrhoea for family planning

Sterilisation Complete review Minilaparotomy and endoscopic techniques for tubal sterilisation

and Research Training in Human Reproduc-tion, Department of Reproductive Health andResearch, WHO, Geneva, Switzerland, supportedthe author during her work on this chapter.

REFERENCES

1. Committee for Proprietary Medicinal Products(CPMP). Clinical investigation of steroid contra-ceptives in women. The European Agency for theEvaluation of Medicinal Products (2000).

2. Van Look PFA, Perez-Palacios G. ContraceptiveResearch and Development 1984 to 1994. Oxford:Oxford University Press (1994).

3. Rabe T, Runnebaum B. Fertility Control – Updateand Trends. Berlin: Springer-Verlag (1999).

4. Trussell J, Kost K. Contraceptive failure in theUnited States: a critical review of the literature.Stud Fam Plann (1987) 18(5): 237–83.

5. Dionne P, Vickerson F. A double-blind compari-son of two oral contraceptives containing 50 mug. and 30 mu g. ethinyl estradiol. Curr Ther ResClin Exp (1974) 16(4): 281–8.

6. WHO Task Force on Oral Contraceptives. Arandomized, double-blind study of six combinedoral contraceptives. Contraception (1982) 25:231–41.

7. WHO Task Force on Oral Contraceptives. Arandomized, double-blind study of two combinedand two progesterone-only oral contraceptives.Contraception (1982) 25: 243–52.

8. Fotherby K. Update on lipid metabolism and oralcontraception. Br J Fam Plann (1990) 15: 23–6.

9. Gaspard UJ. Metabolic effects of oral contracep-tives. Am J Obstet Gynecol (1987) 157: 1029–41.

10. Runnebaum B, Rabe T. New progestogens in oralcontraceptives. Am J Obstet Gynecol (1987) 157:1059–63.

11. Sabra A, Bonnar J. Hemostatic system changesinduced by 50 mcg and 30 mcg estrogen/proges-togen oral contraceptives. J Reprod Med (1983)28: 85–91.

12. WHO. Oral contraceptives and neoplasia. Reportof a WHO Scientific Group, 817. WHO TechnicalReport Series (1992).

Page 329: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

334 TEXTBOOK OF CLINICAL TRIALS

13. Farley TMM, Meirik O, Collins J. Cardiovasculardisease and combined oral contraceptives: review-ing the evidence and balancing the risks. HumanReprod Update (1999) 5(6): 721–35.

14. Rabe T, Runnebaum B. The future of oral hor-monal contraception. In: Rabe T, Runnebaum B,eds. Fertility Control – Update and Trends. Berlin:Springer-Verlag (1999) 73–89.

15. RHR/WHO. Improving access to quality carein family planning: medical eligibility criteria,second edition. WHO/RHR/00.02 (2000).

16. Hagenfeldt K. Contraceptive research and devel-opment today: an overview. In: Van Look PFA,Perez-Palacios G, eds. Contraceptive Researchand Development 1984 to 1994. Oxford: OxfordUniversity Press (1994) 3–22.

17. Rivera R. Oral contraceptives: the last decade.In: Van Look PFA, Perez-Palacios G, eds. Contra-ceptive Research and Development 1984 to 1994.Oxford: Oxford University Press (1994).

18. WHO Task Force of Long-Acting SystemicAgents for Fertility Regulation. A multicentredphase III comparative study of two hormonal con-traceptive preparations given once-a-month byintramuscular injection. II. The comparison ofbleeding patterns. Contraception (1989) 40(5):531–51.

19. WHO Task Force of Long-Acting SystemicAgents for Fertility Regulation. A multicentredphase III comparative trial of 150 mg and 100 mgof depot-medroxyprogesterone acetate given everythree months: Efficacy and side-effects. Contra-ception (1986) 34: 223–35.

20. Hall P. Long-acting formulations. In: Dicz-falusy E, Bygdeman M, eds. Fertility RegulationToday and Tomorrow. New York: Raven Press(1987) 119–41.

21. WHO Task Force of Long-Acting SystemicAgents for Fertility Regulation. A multinationalcomparative clinical trial of long-acting injectablecontraceptives: norethisterone enantate given intwo dosage regimens and depot-medroxyproges-terone acetate, final report. Contraception (1983)28: 1–20.

22. WHO Collaborative Study of Neoplasia andSteroid Contraceptives. Breast cancer and depot-medroxyprogesterone acetate: a multinationalstudy. Lancet (1991) 338: 833–8.

23. WHO Collaborative Study of Neoplasia andSteroid Contraceptives. Depot-medroxyproges-terone acetate (DMPA) and risk of epithelial ovar-ian cancer. Int J Cancer (1991) 49: 191–5.

24. Newton JR, d’Arcangues C, Hall PE. Once-a-month combined injectable contraceptives. JObstet Gynaecol (1994) 14: S1–34.

25. Croxatto HB, Diaz S, Miranda P, Elamsson K,Johansson ED. Plasma levels of levonorgestrel

in women during longterm use of Norplants.Contraception (1980) 22(6): 583–96.

26. Bardin CW, Sivin I. Lessons learnt from clinicaltrials. In: Michal F, ed. Safety Requirements forContraceptive Steroids. WHO (1989) 400–13.

27. Sivin I. International experience with Norplantand Norplant-2 contraceptives. Stud Fam Plann(1988) 19: 81–94.

28. International Collaborative Post-Marketing Sur-veillance of Norplant. Post-marketing surveil-lance of Norplant((R)) contraceptive implants: II.Non-reproductive health(1). Contraception (2001)63(4): 187–209.

29. Koetsawang S, Ji G, Krishna U, Cuadros A,Dhall GI, Wyss R, et al. Microdose intravaginallevonorgestrel contraception: a multicentre clinicaltrial. I. Contraceptive efficacy and side effects.World Health Organization. Task Force on Long-Acting Systemic Agents for Fertility Regulation.Contraception (1990) 41(2): 105–24.

30. Koetsawang S, Ji G, Krishna U, Cuadros A,Dhall GI, Wyss R, et al. Microdose intravaginallevonorgestrel contraception: a multicentre clinicaltrial. II. Expulsions and removals. WorldHealth Organization. Task Force on Long-Acting Systemic Agents for Fertility Regulation.Contraception (1990) 41(2): 125–41.

31. Sivin I, Mishell DR Jr, Victor A, Diaz S, Alvarez-Sanchez F, Nielsen NC, et al. A multicenterstudy of levonorgestrel–estradiol contraceptivevaginal rings. I-Use Effectiveness. An internationalcomparative trial. Contraception (1981) 24(4):377–92.

32. WHO Task Force on Postovulatory Methods ofFertility Regulation. Randomised controlled trialof levonorgestrel versus the Yuzpe regimen ofcombined oral contraceptives for emergency con-traception. Task Force on Postovulatory Methodsof Fertility Regulation. Lancet (1998) 352(9126):428–33.

33. Von Hertzen H, Piaggio G, Ding J, Chen J, Sonj S,Bartfai G, et al. Low dose mifepristone and tworegimens of levonorgestrel for emergency con-traception: a WHO multicentre randomised trial.Lancet (2002) 360: 1803–10.

34. Wilcox AJ, Weinberg CR, Baird DD. Timing ofsexual intercourse in relation to ovulation: effectson the probability of conception, survival of thepregnancy, and sex of the baby. N Engl J Med(1995) 333(23): 1517–21.

35. Wilcox AJ, Weinberg CR, Baird DD. Post-ovula-tory ageing of the human oocyte and embryofailure. Human Reprod (1998) 13(2): 394–7.

36. Bacic M, Wesselius dC, Diczfalusy E. Failure oflarge doses of ethinyl estradiol to interferewith early embryonic development in the human

Page 330: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

CONTRACEPTION 335

species. Am J Obstet Gynecol (1970) 107(4):531–4.

37. Grimes DA. Emergency contraception – expand-ing opportunities for primary prevention. N EnglJ Med (1997) 337(15): 1078–9.

38. WHO Task Force on Postovulatory Methods ofFertility Regulation. Comparison of three singledoses of mifepristone as emergency contraception:a randomised trial. Task Force on PostovulatoryMethods of Fertility Regulation. Lancet (1999)353(9154): 697–702.

39. Tatum HJ. Comparative experience with newermodels of the Copper T in the United States. In:Hefnawi F, Segal SJ, eds. Analysis of IntrauterineContraception. New York: American ElsevierPublications (1975) 155–63.

40. Jain AK. Safety and effectiveness of intrauterinedevices. Contraception (1975) 11(3): 243–59.

41. Sivin I, Stern J. Long-acting, more effective cop-per T IUDs: a summary of U.S. experience,1970–75. Stud Fam Plann (1979) 10(10): 263–81.

42. Sivin I, Schmidt F. Effectiveness of IUDs: areview. Contraception (1987) 36(1): 55–84.

43. Champion CB, Behlilovic B, Arosemena JM,Randic L, Cole LP, Wilkens LR. A three-yearevaluation of TCu 380 Ag and multiload Cu 375intrauterine devices. Contraception (1988) 38(6):631–9.

44. Cole LP, Potts DM, Aranda C, Behlilovic B,Etman ES, Moreno J, et al. An evaluation of theTCu 380Ag and the Multiload Cu375. Fertil Steril(1985) 43(2): 214–7.

45. Rowe P. Clinical performance of copper IUDs.Proceedings of a meeting: a new look at IUDs.In: Bardin CW, Mishell DR, eds. Proceedingsfrom the Fourth International Conference on IUDs,Newton, MD. Butterworth Heinemann (1994)13–31.

46. Sastrawinata S, Farr G, Prihadi SM, Hutapea H,Anwar M, Wahyudi I, et al. A comparative clin-ical trial of the TCu 380A, Lippes Loop D andMultiload Cu 375 IUDs in Indonesia. Contracep-tion (1991) 44(2): 141–54.

47. Sivin I, Stern J, Coutinho E, Mattos CE, el Mah-goub S, Diaz S, et al. Prolonged intrauterine con-traception: a seven-year randomized study of thelevonorgestrel 20 mcg/day (LNg 20) and the Cop-per T380 Ag IUDS. Contraception (1991) 44(5):473–80.

48. Luukkainen T, Allonen H, Nielsen NC, NygrenKG, Pyorala T. Five years’ experience of intrauter-ine contraception with the Nova-T and the Copper-T-200. Am J Obstet Gynecol (1983) 147(8):885–92.

49. WHO Task Force on the Safety and Efficacy ofFertility Regulation. The TCu 380A, the TCu220C, Multiload 250 and Nova T IUDs at 3, 5

and 7 years of use. Results from three randomizedtrials. Contraception (1990) 42: 141–58.

50. WHO. Long-term reversible contraception. Twelveyears of experience with the TCu380A andTCu220C. Contraception (1997) 56(6): 341–52.

51. Wagner H. Intrauterine contraception: past, presentand future. In: Rabe T, Runnebaum B, eds. Fertil-ity Control – Update and Trends. Berlin: Springer-Verlag (1999) 151–71.

52. Farley TM. Reproduction. In: Armitage P, Col-ton T, eds.Encyclopedia of Biostatistics (1998).

53. Labbok M, Jennings VH. Advances in fertilityregulation through ovulation prediction duringlactation (lactational amenorrhoea method) andduring the menstrual cycle. In: Van Look PFA,Perez-Palacios G, eds. Contraceptive Researchand Development 1984 to 1994. Oxford: OxfordUniversity Press (1994).

54. Turner L, Conway AJ, Jimenez M, Liu PY, For-bes E, McLachlan RI, Handelsman DJ. Contra-ceptive efficacy of a depot progestin and androgencombination in men. J of Clinical Endocrinologyand Metabolism (2003) 88(10): 4659–67.

55. Rabe T, Vladescu E, Runnebaum B. Contracep-tion: historical development, current status andfuture aspects. In: Rabe T, Runnebaum B, eds.Fertility Control – Update and Trends. Berlin:Springer-Verlag (1999) 29–72.

56. Population Council Center for Biomedical Re-search. Annual Report 1991. New York: ThePopulation Council (1991).

57. Farley TMM, Meirik O, Metha S, Waites GMH.The Safety of vasectomy: recent concerns: Bul-letin of the World Health Organization (1993)71(3-4): 413–9.

58. Fuertes-de la Haba A, Bangdiwala IS, Pelegrina I.Success of randomization in a controlled contra-ceptive experiment. J Reprod Med (1973) 11(4):142–8.

59. Potts M, Feldblum PJ, Chi I, Liao W, Fuertes-deLa HA. The Puerto Rico oral contraceptive study:an evaluation of the methodology and results of afeasibility study. Br J Fam Plann (1982).

60. Skegg DCG. The epidemiological assessment ofsafety of hormonal contraceptives: a methodolog-ical review. In: Michal F, ed. Safety Requirementsfor Contraceptive Steroids. WHO (1989) 21–37.

61. Special Programme of Research, Developmentand Research Training in Human Reproduc-tion. Guidelines for the toxicological and clini-cal assessment and post-registration surveillanceof steroidal contraceptive drugs. In: Michal F, ed.Safety Requirements for Contraceptive Steroids.WHO (1989) 415–54.

62. Garza-Flores J, Fatiikun T. Future directions infertility regulation. In: Negro-Vilar A, Perez-Palacios G, eds. Reproduction, Growth and

Page 331: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

336 TEXTBOOK OF CLINICAL TRIALS

Development. New York: Raven Press (1991)383–97.

63. Fotherby K, Benagiano G, Toppozada HK, et al.A preliminary pharmacological trial of the monthlyinjectable contraceptive Cycloprovera. Contracep-tion (1982) 25: 261–72.

64. Aedo AR, Landgren BM, Johannisson E, Dicz-falusy E. Pharmacokinetic and pharmacodynamicinvestigations with monthly injectable contra-ceptive preparations. Contraception (1985) 31:453–69.

65. WHO Task Force of Long-Acting SystemicAgents for Fertility Regulation. A multicen-tred pharmacokinetic, pharmacodynamic study ofonce-a-month injectable contraceptives. I. Differ-ent doses of HRP112 and of DepoProvera. Con-traception (1987) 36(4): 441–57.

66. Silfverstolpe G, Gustafson A, Samsioe G, Svan-borg A. Lipid metabolic studies in oophorec-tomized women: effects of three differentprogestogens. Acta Obstet Gynecol Scand (1979)Suppl 88: 89–95.

67. WHO Task Force of Long-Acting SystemicAgents for Fertility Regulation. A multicentredphase III comparative study of two hormonal con-traceptive preparations given once-a-month byintramuscular injection: I. Contraceptive efficacyand side effects. World Health Organization. Con-traception (1988) 37(1): 1–20.

68. Family Health International. Contraceptive tech-nology and family planning research. FamilyHealth International (1992).

69. d’Arcangues C, Piaggio G, Brache V, BenHaissa R, Hazelden C, Mossai R et al. Effective-ness of Vitamin E and low-dose aspirin, alone orin combination, on Norplant induced prolongedbleeding. Contraception (2004) (in press).

70. Trussell J. Methodological pitfalls in the analysisof contraceptive failure. Stat Med (1991) 10(2):201–20.

71. Steiner M, Dominik R, Trussell J, Hertz-Picciott I.Measuring contraceptive effectiveness: a con-ceptual framework. Obstet Gynecol (1996) 88(3Suppl): 24S–30S.

72. Trussell J, Rodriguez G, Ellertson C. Updatedestimates of the effectiveness of the Yuzpe reg-imen of emergency contraception. Contraception(1999) 59(3): 147–51.

73. Farley TM. Life-table methods for contraceptiveresearch. Stat Med (1986) 5(5): 475–89.

74. Tai BC, Peregoudov A, Machin D. A compet-ing risk approach to the analysis of trialsof alternative intra-uterine devices (IUDs) forfertility regulation. Stat Med (2001) 20(23):3589–600.

75. Farley TM, Ali MM, Slaymaker E. Compet-ing approaches to analysis of failure times withcompeting risks. Stat Med (2001) 20(23):3601–10.

76. Yusuf S, Wittes J, Probstfield J, Tyroler HA. Anal-ysis and interpretation of treatment effects in sub-groups of patients in randomized clinical trials.JAMA (1991) 266(1): 93–8.

77. Hassan EO, el Nahal N, el Hussein M. Accept-ability of the once-a-month injectable contracep-tives Cyclofem and Mesigyna in Egypt. Contra-ception (1994) 49(5): 469–88.

78. Rodriguez G, Faundes-Latham A, Atkinson LE.An approach to the analysis of menstrual patternsin the critical evaluation of contraceptives. StudFam Plann (1976) 7(2): 42–51.

79. Belsey EM, Farley TM. The analysis of menstrualbleeding patterns: a review. Contraception (1988)38(2): 129–56.

80. Belsey EM, Machin D, d’Arcangues C. The anal-ysis of vaginal bleeding patterns induced by fertil-ity regulating methods. World Health OrganizationSpecial Programme of Research, Development andResearch Training in Human Reproduction. Con-traception (1986) 34(3): 253–60.

81. Belsey EM, Carlson N. The description of men-strual bleeding patterns: towards fewer measures.Stat Med (1991) 10(2): 267–84.

82. Fraser IS. Vaginal bleeding patterns in womenusing once-a-month injectable contraceptives.Contraception (1994) 49(4): 399–420.

83. Said S, Sadek W, Rocca M, Koetsawang S, Kir-wat O, Piya-Anant M, et al. Clinical evaluationof the therapeutic effectiveness of ethinyl oestra-diol and oestrone sulphate on prolonged bleed-ing in women using depot medroxyprogesteroneacetate for contraception. World Health Organiza-tion, Special Programme of Research, Develop-ment and Research Training in Human Reproduc-tion, Task Force on Long-acting Systemic Agentsfor Fertility Regulation. Human Reprod (1996) 11Suppl 2: 1–13.

84. Piaggio G, Pinol AP. Use of the equivalenceapproach in reproductive health clinical trials. StatMed (2001) 20(23): 3571–7.

85. Petitti DB, Shapiro S. Strategies for post-registra-tion surveillance of contraceptive steroids. In:Michal F, ed. Safety Requirements for Contracep-tive Steroids. WHO (1989) 335–47.

86. The Cochrane Library (Issue 4). Oxford: UpdateSoftware (2002).

87. Cheng L, Gulmezoglu AM, Ezcurra E, Van LookPFA. Interventions for emergency contraception.The Cochrane Library. 2001. Oxford: UpdateSoftware (2001).

Page 332: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

21

Gynaecology and Infertility

SILADITYA BHATTACHARYA1 AND JILL MOLLISON2

1Department of Obstetrics and Gynaecology, University of Aberdeen, Aberdeen, UK2Department of Public Health, University of Aberdeen, Aberdeen, UK

INTRODUCTION

The randomised clinical trial is widely acceptedas the gold standard for scientific evaluation oftreatments. In the current climate of clinical gov-ernance, data from clinical trials are consideredto represent the highest level of evidence that canbe used to inform effective treatment strategies.Yet, there are fewer trials in gynaecology in com-parison with other disciplines. Those reportedin the literature account for a minority of pub-lished papers in major journals.1 Gynaecologi-cal trials incorporate a wide spectrum of clinicalconditions and proposed interventions. Womencan differ substantially in terms of age, physi-cal and psychological disability; while treatmentscan range from drug therapy to surgical proce-dures, from information provision to physiother-apy and dietary advice. The aim of this chapter isto examine, test and explore the basic principlesof clinical trials in gynaecology. An overviewof different types of trials is provided and refer-ence will be made to specific challenges, includ-ing identification of sample populations, choiceof appropriate outcomes and tools, randomisa-tion and arrangements for follow-up. Examples

are drawn from general gynaecology, infertilityand fertility control. Trials in obstetrics, con-traception and gynaecological oncology, whichare discussed elsewhere, will be excluded fromour discussion.

TAXONOMY OF CLINICAL TRIALS

Clinical trials may be classified in a numberof ways (see Table 21.1). Some of these arediscussed below.

PHASE I CLINICAL TRIALS

These preliminary studies generally address drugsafety rather than efficacy, and may be performedon healthy volunteers. Examples include stud-ies of drug metabolism and bio-availability ofrecombinant gonadotrophins in infertile women.2

Most phase I trials are either directly or indi-rectly supported by the pharmaceutical industryand involve relatively small numbers of subjects.Women are required to adhere to a strict proto-col and agree to fairly extensive evaluation ofteninvolving multiple investigations such as blood

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 333: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

338 TEXTBOOK OF CLINICAL TRIALS

Table 21.1. Taxonomy of clinical trials

Phased trials Phase IPhase IIPhase IIIPhase IV

Conduct PragmaticExplanatory

Design Parallel groupCrossoverFactorialPatient preferenceCluster randomisation

Randomisation TrueQuasi-randomisation

counts, biochemistry, endocrine profile and liverand kidney function tests. In this context, it maybe useful to be aware of the fact that finding‘normal’ subjects for such trials in reproductivemedicine can sometimes be challenging as a largeproportion of young, fit, healthy women mayeither be on oral contraception or actively tryingfor a pregnancy.

PHASE II CLINICAL TRIALS

These are also fairly small-scale investigationsinto the efficacy and safety of a drug and requireclose monitoring of each patient. Sometimesthey can be employed as a screening processto screen drugs, which are either potentiallyinactive or toxic. They may also be usedto determine the most appropriate dose androute of administration of a drug. Examplesof these types of trials include those involvingthe use of misoprostol for medical terminationof pregnancy.3 Where patient acceptability ofthe route of administration, i.e. vaginal, oralor sublingual, is an important outcome, thesetrials may need to break out of the traditionalmould of strictly controlled explanatory trials andassume the pragmatic approach associated withphase III trials.

PHASE III CLINICAL TRIALS

After a drug has been shown to be reasonablyeffective it is essential to compare it with the

current standard management for the same con-dition in a large trial involving a substantialnumber of patients. This is also the design usedfor non-pharmacological interventions, which areincreasing in number. The majority of the tri-als referred to in this chapter are phase III tri-als. This is the point of evaluation followingwhich interventions are introduced into clini-cal practice.

PHASE IV CLINICAL TRIALS

Even after a treatment finds general acceptance,unanswered questions about its safety and long-term effectiveness continue to be addressed inthe context of phase IV trials. The long-termimplications of new methods of treatment ofmenorrhagia such as endometrial ablation are stillunder evaluation a decade after the results ofthe first phase III trials were reported. Medium-term data have been presented in a number ofpublications.4,5

PRAGMATIC AND EXPLANATORY TRIALS

In terms of design, clinical trials are oftendescribed as either explanatory or pragmatic.Explanatory trials measure efficacy–the benefit atreatment produces under ideal conditions. Prag-matic trials measure effectiveness–the benefit thetreatment produces in routine clinical practice.6

Examples of the former include evaluation ofdrugs used to treat menorrhagia or those usedto undertake medical termination of pregnancy.The aim is to assess the outcome of a new drugunder controlled conditions using a homogeneousgroup of patients. Eligibility criteria are strict,and protocol violations are not allowed. In anexplanatory trial comparing recombinant folliclestimulating hormone with a urinary preparation,any woman who fails to receive the appropriatedrug in the prescribed dose will be excluded fromthe study on grounds of protocol violation.

In contrast, a pragmatic trial aims to mirrorthe normal variations between patients that occurin real life. For example, a pragmatic trial ofmedical versus surgical treatment for menorrha-gia will include all women with a subjective

Page 334: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 339

complaint of heavy menstrual loss. Women ran-domised to drug treatment, who find the interven-tion unacceptable and elect for surgery, do notface disqualification from the trial. This some-what relaxed policy is justified on the groundsthat women’s decisions to reject their allocatedtreatment are likely to reflect real-life situa-tions and can actually be interpreted as a mea-sure of dissatisfaction. Furthermore, the treatmentoffered to patients in the surgery arm may notbe identical, as operations may be performed bymore than one surgeon, each with a slightly dif-ferent technique. A similar attitude would applyto a pragmatic trial of physiotherapy for prolapseor counselling for premenstrual syndrome, whereidentical interventions cannot be guaranteed bydifferent physiotherapists or counsellors.

There are other differences between the twotypes of trials. Blinding is more likely to be usedin an explanatory trial such as one comparingoral metformin with placebo in women withpolycystic ovarian syndrome. Pragmatic trialsmay also be blinded, but this is often not feasible(for example, in surgical versus medical trials),nor always desirable. There is also less of acompulsion to use placebos, as the objective isto compare the new intervention, not with aplacebo, but with the ‘gold standard’ or bestof the existing treatments. Clinician and patientbiases caused by the absence of blinding maynot necessarily be detrimental to the trial, butcould actually be seen to be part of the responseto treatment. The outcome in a pragmatic trialsuch as one comparing oral clomiphene citrate(a drug treatment) versus expectant managementin unexplained infertility incorporates the totaldifference between the interventions that arebeing evaluated. This may include the effect ofthe treatment as well as the associated placeboeffect as this best reflects the likely clinicalresponse in practice.6

TRIAL DESIGN

SIMPLE PARALLEL GROUP

This is the simplest and commonest trial designinvolving a comparison between two groups,

i.e. an experimental versus a control group. Asthis type of trial is most easily understood byresearchers as well as patients, examples abound.Occasionally a trial may have three arms, e.g.intrauterine insemination (IUI) versus ovarianstimulation + intrauterine insemination versusin vitro fertilisation (IVF) for the treatment ofunexplained infertility.7 Sample sizes for suchtrials will be dictated by the minimum significantdifference in outcome between any two arms.Due to the nature of the interventions, expecteddifferences between the arms will vary. Thenumber of women required to show a clinicallysignificant difference in pregnancy rates betweenIUI and IVF is smaller than that necessary toshow a difference between IUI and stimulatedIUI, and will ultimately be the one chosen forthis trial.

CROSSOVER

Crossover trials have the advantage of usingwomen as their own controls, thus reducingthe sample size required. Women are randomlyexposed to either the control or the interventionarm first, followed by the other. Often a ‘washoutperiod’ is introduced between the two arms toreduce the risk of contamination. Unfortunatelythis design is more suited for medical treatmentsof chronic conditions as opposed to surgicaltrials or infertility trials. In the first group thepracticalities of the situation render such a designimpossible. In the second, a definite outcomesuch as pregnancy has the natural effect ofpreventing women from completing later phasesof the trial.8 In such situations, exaggeratedestimates of treatment effect can occur, leadingto erroneous clinical decisions. From a practicalpoint of view, only data from the first phaseof the crossover trial may be valid. However,this design may well be suitable for exploringdrug treatment of chronic conditions such aspremenstrual syndrome or sexual function.

FACTORIAL

Factorial designs are often efficient as they canaddress two questions within the context of a

Page 335: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

340 TEXTBOOK OF CLINICAL TRIALS

Women with unexplainedinfertility

Randomisation

A

Expectant management

B

Clomiphene

C

Intra-uterine insemination(IUI)

D

Clomiphene + intra-uterineinsemination (IUI)

Figure 21.1

single trial. Women with unexplained infertilitycan be randomised to receive either expectantmanagement (no treatment), insemination alone,clomiphene alone or clomiphene and insemina-tion treatment as shown in Figure 21.1.

When the two treatments (factors) do notinteract with one another, this design has theadvantage of requiring half the number of patientsthat would be required if two parallel RCTs wereto be conducted. The advantage is self-evident. Inthis case, the effect of IUI alone can be assessedby comparing groups C and D with A and B,while the effect of clomiphene can be evaluatedby comparing B and D versus A and C. Theadvantage is self-evident. In this case, the effectof IUI alone can be assessed by comparing groupsC and D with A and B, while the effect ofclomiphene can be evaluated by comparing B andD versus A and C.

CLUSTER RANDOMISED TRIALS INGYNAECOLOGY AND INFERTILITY

A potential problem that can occur with ran-domised controlled trials is where the intervention

is not targeted at individual patients, but atgroups of patients. This can happen where theintervention is an information package for themanagement of menorrhagia in primary care9

or a clinical guideline for the management ofinfertility.10 In these studies, randomising patientsto receive management using the informationpackage or clinical guidelines would have intro-duced contamination, since GPs would havebeen expected to manage both study (informationleaflet, clinical guidelines) and control patients.Potentially this could underestimate the true effec-tiveness of the intervention. Therefore, clustersof patients (e.g. general practices) were randomlyallocated to receive intervention (e.g. informationleaflets, clinical guidelines) or control. Clusterrandomisation should only be carried out whenthere is a strong justification for doing so.

The primary implication of cluster randomisedtrials is that the measurements on individualsare not statistically independent of one another;that is measurements from individuals within thesame cluster will be correlated to one another.This has implications in the design (e.g. sample

Page 336: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 341

size), conduct (e.g. informed consent), analysisand reporting. Cluster randomised trials shouldadjust for this clustering when determining thenumber of patients required. The sample sizethat would be required if patients were to berandomised must be inflated by a factor whichtakes into account the extent of the clusteringand the size of the cluster.11 The extent ofthe correlation is measured by the intra-clustercorrelation coefficient (ICC)12 and researchers arerequired to have some indication of this, in orderthat the study can be adequately powered. Studiesthat fail to adequately inflate the sample size willsuffer from a type II error.

Similarly, the correlated responses obtainedfrom each cluster have an implication for thestatistical analysis, since standard statistical tests(e.g. t-test) assume that observations are inde-pendent of one another. There are a number ofapproaches to analysing cluster randomised tri-als and these are detailed elsewhere.13 Failure toaccount for the correlated responses in the anal-yses will result in an increased type I error.

Clustering of outcomes can also occur in infer-tility trials where alternative treatments are beingcompared. For example, in randomised controlledtrials comparing IVF with ICSI the unit of allo-cation varies between patients,14 oocytes15 andcycles.16 Often, outcomes such as implantationrate and fertilisation rate are considered. Theseare both expressed as percentages out of the totalnumber of oocytes retrieved. Hence, in trials thatrandomise patients (couples) or cycles and reportimplantation or fertilisation rates, there will beclustering of the outcome since oocytes are clus-tered within patients or cycles.14,16 Hence, inthese studies adjustment should be made in theanalysis to adjust for the correlated outcomesassessed (on each oocyte) within patients (orcycles). In trials that randomise by patients andreport fertilisation of implantation rates, someadjustment is required. However, for outcomessuch as live birth rate or pregnancy rate noadjustment is required since the percentages areexpressed out of the total number of patients ran-domised. Bhattacharya et al.16 randomised cyclesand reported implantation rates per transferred

embryo. However, they noted that the differ-ence in implantation rates was likely to be widerthan that reported due to failure to adjust forthe clustering of embryos/oocytes transferred toeach woman. Studies where oocytes have beenrandomised have no clustering implications sinceoocytes retrieved from the same women are ran-domly allocated to receive ICSI or IVF.

When conducting randomised controlled trialsin infertility, consideration should be given to theunit of randomisation and the outcome measuresto be applied. When there is implicit clusteringin the data, the statistical analysis should accountfor this using the methods described above.13

QUASI-RANDOMISED TRIALS

These are controlled experimental studies wheretreatment allocation is performed on the basis ofpatient unit numbers or days of the week whenthe patients are recruited. Although this design oftreatment allocation affords an element of chance,it cannot be considered to be genuine randomi-sation. This type of design may still appeal tothose involved in laboratory trials involving incu-bation or cryopreservation of human embryos.In these cases, it may be easier and cheaper touse a certain protocol for all embryos on alter-nate days or alternate weeks rather than changethe protocol or a freezer setting for each embryoor each woman. The consequent loss of allo-cation concealment will lead to serious inclu-sion bias as some patients may be deliberatelyexcluded. This, is turn, can exaggerate treatmenteffects.

PATIENT PREFERENCE TRIAL

A potential problem in some randomised trialsarises when patients or their clinicians refuse tobe randomised on grounds of strong treatmentpreferences. Exclusion of these patients mayaffect the generalisability of the results as par-ticipants may not be representative. Yet recruit-ment of these patients may introduce substantialbias especially when it is impossible to blindthem. In addition, compliance and satisfactionmay be higher with the preferred intervention.17

Page 337: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

342 TEXTBOOK OF CLINICAL TRIALS

This is particularly so when the ‘new’ treatmentis only available within the context of the trialor when, as in trials in unexplained infertility,one of the arms comprises a ‘no treatment’ or‘expectant management’ group. Dissatisfactionwith the allocation may lead to differential com-pliance and follow-up resulting in groups whichcannot then be assumed to be similar. The out-come measures could also be affected by howsatisfied patients are with their allocated treat-ment. The effect of patient preference on outcomewould depend to a great extent on the specificoutcomes being assessed. If the principal out-come is death or live birth, then the effect ofpatient preferences is likely to be small. If theprincipal outcome is satisfaction with care, thenthe effect of patient preference is large.18 Undersuch circumstances the conventional randomisedtrial will underestimate the relationship betweenthe intervention and the outcome, i.e. show theminimum effect size. Conversely a comparisonbetween two groups of patients who have chosentheir treatment and thus optimised their treatmentchoice will be considered to represent the max-imum effect size. An intervention in questionwill have an effect size between the minimumand maximum as derived from the randomisedand the preference part of a partially randomisedpatient preference trial.18

To deal with patient preferences within atrial, the use of a partially randomised patientpreference (PRPP) trial has been suggested.19

Patients with strong preferences are allowed theirdesired treatment. Those without such views aresubjected to randomisation. For example in a trialof medical and surgical termination of pregnancywe end up with four groups–randomised tomedical, randomised to surgical, prefer medicaland prefer surgical.

Potential disadvantages with PRPP trials includeeffects of the trial size. The size of a total PRPPcohort will need to be much larger than for a con-ventional randomised controlled trial. As alreadymentioned, the size of the randomised cohort needsto be the same as in a conventional trial. In addi-tion, the number of patients in the non-randomisedpreference arms needs to be of equivalent size. The

numbers quickly add up to generate a total sam-ple size double that for a conventional trial. Thishas the predictable effect of adding to the cost andduration of the trial. Entry of a disproportionatenumber of patients into either the randomised orthe preference arms is also a problem, as the trialwill not be completed unless the appropriate num-ber have been recruited into the two components ofthe trial. The situation may be further complicatedby patients favouring one treatment over another,making comparison of the two groups in the pref-erence arm more difficult.

A further problem with this approach lies inthe analysis. Any comparison using the non-randomised groups is unreliable because ofunknown and uncontrolled confounders. Patientpreference designs have been used in trials oftermination of pregnancy20,21 and menorrhagia.22

The evidence to support the use of PRPP trialscompared with conventional randomised trials islimited. A randomised comparison of the twostrategies by Cooper et al.22 suggested that theextra cost and complexity were not justified inthe context of medical versus surgical treatmentof menorrhagia.

A conventional randomised trial could addressthe effect that patient preference has on outcomeby recording this information before allocation.23

This would allow resources to be concentrated onrecruiting as many patients as possible into therandomised comparison group but would allowstratification of the results by initial preference.

EQUIVALENT TRIALS

Often in reproductive health care the aim is toshow that one treatment is as effective (equiv-alence), or no less effective (non-inferior), asanother. The methodology for equivalence tri-als differs to that of superiority trials in design,analysis and interpretation. In designing equiva-lence trials, attention must be given to definingan equivalence margin. This is the differencein effect that would be deemed to be ‘clini-cally insignificant’.24 In comparison with supe-riority trials, larger sample sizes are neededto demonstrate equivalence. In the analysis ofequivalence trials, conventional statistical testing

Page 338: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 343

has little relevance and interpretation of resultsshould be conducted though use of confidenceintervals in relation to the predefined equiva-lence margin.25 Statistical significance is demon-strated if the upper and lower limits of the 95%confidence interval do not cross the equivalencemargin.25 In superiority trials, the most conser-vative analysis is by intention to treat (ITT). Inan equivalence trial, however, a ‘per protocol’(PP) analysis is usually considered statisticallymore conservative. This is because an ITT anal-ysis may blur the comparison between groups andlead to an increased chance of declaring the twotreatments as equivalent when they are not. Thedecision about which should be the primary formof analysis (ITT or PP) in an equivalence studyis not straightforward.26 It depends on the par-ticular characteristics of the study, including thedefinitions adopted for the ITT and PP analysesand the risk of bias.27

PERFORMING AN RCT

SYSTEMATIC REVIEW

A systematic review of the literature is an essen-tial component of the pre-trial work-up. It enablesthe researcher to define the clinical question inthe light of work that has gone before and assessthe need for a trial. It also provides vital infor-mation about the limitations of previous trials,outcome measures used and nature of follow-up.This is useful in planning the design of the pro-posed study. A recent systematic review28 hasidentified typical problems associated with pre-vious trials in unexplained infertility includingsmall sample sizes, inappropriate outcome mea-sures (pregnancy rates per cycle) and lack of costdata. Similar information relating to other top-ics in gynaecology can also be obtained fromCochrane reviews.

DEFINING THE STUDY POPULATION

Definition of the study sample is a vital part ofany clinical trial. Unfortunately this aspect of trialdesign can be contentious. The diagnostic crite-ria of many gynaecological conditions continue

to generate debate amongst clinicians. Disagree-ment about the definition of a particular conditioncan lead to dismissal of the conclusions of a trialas irrelevant. Certain terms continue to pose par-ticular problems. For example, infertility whichis defined as ‘the lack of pregnancy after oneyear of regular unprotected coital exposure’ refersto the lack of an exposure, i.e. pregnancy ratherthan a particular disorder. There may be contrib-utory factors from both sexes, and definitions ofsubgroups such as unexplained infertility or poly-cystic ovarian syndrome vary widely. Variationin laboratory procedures (such as semen analy-ses) may affect the diagnosis of male infertilitywhile the investigations used for tubal patency(laparoscopy versus hysterosalpingogram) mayhave an effect on the identification of endometrio-sis in infertile women.

With menorrhagia, the problem is different.The conventional textbook definition of menor-rhagia (menstrual blood loss >80 mls) is imprac-tical and the pragmatic approach is to includeall women with a subjective complaint of heavymenstrual loss. This encourages a situation wherecritics can question the external validity of tri-als where women have been included eitheron the basis of objective measurement of men-strual blood loss or on pragmatic grounds withincreased self-reported bleeding. Some puristswill argue that efficacy of treatments for men-orrhagia cannot be evaluated accurately withoutpatients with ‘genuine’ pathology. However, theoutcome of such trials may not necessarily beapplicable to the vast majority of clinical situa-tions where menstrual loss is not routinely mea-sured. From a clinical point of view, it is probablymore useful to use a pragmatic approach andrecruit all women on the basis of a subjectivecomplaint of menorrhagia.

Studies on urinary incontinence also requireappropriate definition of the population group,as it is well known that women who attendurodynamic clinics constitute a small proportionof the total number of women in the communitysuffering from urinary incontinence. Althoughthere is no way of completely avoiding selectionbias, explicit description of the eligibility criteria

Page 339: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

344 TEXTBOOK OF CLINICAL TRIALS

allows the readers to draw their own conclusionsregarding the applicability of the data to their ownspecific contexts. Those performing secondaryresearch can also use these data to assessheterogeneity between trials.

A specific problem associated with infertilitytrials is the question of how to deal with themale partner. Conventionally it is the womanwho undergoes treatment, and it is she whois considered to be the participant in trialsand subjected to recruitment, randomisation andfollow-up. However, in trials where satisfaction,acceptability and costs are outcomes, it is perhapsappropriate to seek the male partners’ viewsas well.

An important aspect of the choice of the studypopulation involves the effect of the participantson generalisability of the findings. Althoughstudy participants may meet eligibility criteria,participation is voluntary and volunteers may dif-fer from the general population in terms of gen-eral health, co-interventions, educational level,motivation and ability to follow a protocol. Eth-nic minorities may be missed on account ofunfamiliarity with the language of the question-naires used.

INTERVENTIONS

Due to its unique mix of medical and surgi-cal workload, gynaecology offers a number ofdiverse interventions that need to be tested inthe context of clinical trials. Some examples areshown in Table 21.2.

DEFINING OUTCOMES

For any trial, it is crucial to have an a priorihypothesis and a clinically relevant primary out-come on which the power calculation is based.This essential primary step prevents the sacri-fice of quality for expediency and discouragesopportunistic and inappropriate manipulation ofcollected data. Outcomes of choice include thosethat are purely clinical, as well as others whichmay be patient-centred or economic. The pre-cise nature of the primary and secondary out-comes will depend on the type of trial and

Table 21.2. Types of interventions subjected toclinical trials in gynaecology

Intervention Examples of trials

Packages of care Information packages in usein general practice forappropriate treatment andreferral in menorrhagia

The value of guidelines ininfertility for generalpractitioners

Surgical techniques Hysterectomy versusendometrial ablation

Different types ofendometrial ablation, e.g.TCRE versus Laser

Drug trial Placebo versus tranexamicacid for menorrhagia

Comparison ofdifferent treatmentmodalities

Medical versus surgicaltermination of pregnancy

Mirena IUS versusendometrial ablation formenorrhagia

Expectant treatment versusIVF for unexplainedinfertility

Laboratorytechniques

In vitro fertilisation (IVF)versus intra-cytoplasmicsperm injection

Alternative methods ofcryopreservation of humanembryos

Place of care One-stop specialist clinicversus general clinic

Out-patient versus in-patientendometrial ablation

Investigations Effectiveness of methods ofscreening for chlamydiatrachomatis

Hysterosalpingography as atest of tubal patency

The post-coital test in thediagnosis of infertility

Hysteroscopy in thediagnosis of menorrhagia

its clinical context. This may involve differ-ent levels of observation and analysis, incor-porating the individual, the family and thecommunity.29

Page 340: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 345

CLINICAL OUTCOMES

Clinical outcomes are essential components ofany clinical trial. They should be meaningfulto other clinicians and have a direct bearing onthe women’s appreciation of the burden of dis-ease. Generally speaking, outcomes which arerepresented as discrete or categorical variablesare often more meaningful than continuous out-comes. For example, the proportion of womenrequiring blood transfusion after hysterectomy isseen to be a more clinically relevant outcome thanthe volume of blood lost during the procedure.The proportion of women in whom the uterus isempty at 24 hours may be a more meaningful out-come than the mean number of hours required formedical termination. In reproductive medicine,many clinical trials have tended to choose surro-gate markers such as number of oocytes retrievedor fertilised as primary outcome rather than livebirth or pregnancy. The reason for such a choiceis not difficult to guess. Fewer participants arerequired to achieve a ‘significant’ difference ifthe outcome is represented as a continuous vari-able (such as number of oocytes) instead of adichotomous variable such as live birth. This inturn reduces the sample size and costs, allowinga single centre to perform a trial that would oth-erwise require far higher levels of funding and amulti-centre approach.

Explanatory trials usually rely on a single clin-ical outcome. For example, in a trial comparingdrug treatments for menorrhagia, menstrual bloodloss in millilitres may well be an appropriate pri-mary outcome. Other physiological or biochemi-cal outcomes such as haemoglobin level, volumeurinary loss, extent of endometriosis visualisedby laparoscopy, number of ovarian follicles seenon ultrasound scan and serum estradiol levels fol-lowing ovarian stimulation may also be used indifferent situations. Unfortunately, they may notalways correlate well with the clinically relevantoutcomes–certainly from the patients’ perspec-tive. In certain cases, costs and a need to optfor a modest and realistic sample size dictatethe need for surrogate outcomes in preferenceto more robust but less common substantive out-comes. Thus bone mineral density rather than the

incidence of hip fracture may be chosen as a prin-cipal outcome in trials of hormone replacementtherapy.30

Pragmatic trials usually require the evaluationof more than one outcome measure in order tocome to a decision about the effectiveness, risks,costs and acceptability of an intervention. Forexample, in surgical trials of menorrhagia out-comes should include satisfaction with treatment,menstrual flow, pain, premenstrual syndrome andperiod of recovery. Sometimes when the impactof a disease spreads beyond the individual to awider group such as the family, GPs or carers,outcomes may need to be expanded to include awider group. This may be relevant in trials of uri-nary incontinence or HRT. It is however impor-tant to remember that it is the primary outcome onwhich the power calculation is usually based, andthe one that the trial is best designed to address.

QUALITY OF LIFE

Quality of life (QOL) is now accepted by mostclinicians as an important outcome in clinicaltrials.31 However the term is sometimes usedloosely and without a clear understanding of whatit means.32 Since QOL is considered to be acomplex concept comprising physical, emotionaland other dimensions, most questionnaires incommon use not only assess the detailed aspectsof QOL but also provide a summary score foroverall health status.33 Generic measures suchas short form health survey (SF-36)34 broadlyassess physical, mental and social health and canbe used to compare conditions and treatments.Although the number of such instruments incurrent use is rapidly increasing, there is aremarkable level of consistency between them.33

Other methods include tools focusing on a sin-gle aspect such as pain, anxiety as well as indi-vidualised measures in which patients themselvesdefine and rate the most important aspects oftheir quality of life.35 A number of condition-specific tools, which can be used either indepen-dently or to supplement generic measures, havebeen developed.36 Examples include the King’sCollege Questionnaire for Urinary Incontinence37

Page 341: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

346 TEXTBOOK OF CLINICAL TRIALS

and Menstrual Distress questionnaire.38 TheEndometriosis Health Profile-3039 and The Meno-pause Rating Scale (MRS).40

A systematic review by Sanders et al.41 showedthat despite the plethora of instruments, theprevalence of reporting on quality of life remainslow, increasing from 1% in 1980 to 4% in 1997.There is also a general unwillingness to askpatients to supplement questionnaire-based datawith personal responses, and lack of appreciationabout the critical importance of response rates.

Patients themselves can find it difficult to dis-tinguish between quality of life and health statusor to rate their health without a point of reference.At the same time, the effects of age and chang-ing expectations need to be adjusted for wheninterpreting QOL scores. Overall, QOL offersa superior way of assessing treatment successin trials involving general gynaecology (such asmenorrhagia, urinary incontinence, menopause,pre-menstrual tension) where interventions aretargeted at women with benign but debilitatingillnesses that compromise several key areas ofday-to-day life. On the other hand, women seek-ing fertility treatment or abortion services arenot necessarily unwell. The aim of treatmentis to enhance their physical and mental well-being rather than correct a pre-existing deficitin health status. Existing instruments do not dis-criminate between these two broad groups andfurther refinements are needed with respect toassessing positive aspects of general and sexualhealth as opposed to the conventionally used neg-ative aspects.42 Meanwhile, simple global ques-tions on self-reported health or QOL continue tobe useful as prognostic measures for stratificationof treatment allocation and as important outcomemeasures alongside purely clinical outcomes.

PATIENT SATISFACTION

There continues to be a general lack of agreementabout the mechanisms which produce satisfac-tion, as well as the meaning of the word ‘satisfac-tion’ itself which has been defined as an ‘evalu-ation based on the fulfillment of expectations’.43

The conventional view is that satisfaction reflects

the sum total of a number of patient-related fac-tors, including expectations, characteristics andpsychosocial determinants.44 Over the past fewyears, patient satisfaction has become increas-ingly accepted as a measure of quality in healthservices and a valid outcome in randomised clin-ical trials.45 This is particularly significant in thecurrent climate of delivery of health care whichaims to provide a patient-centred service withgreater public involvement in planning.46 Thepurpose of patient satisfaction measurement is todescribe health care services from the patient’spoint of view, measure the ‘process’ of care andevaluate health care.44 The particular strength ofusing satisfaction as an outcome is related to theunique circumstances of certain gynaecologicaltrials such as those used for menorrhagia wherenot only the interventions but also the clinicaloutcomes may be dissimilar. In a trial of hys-terectomy versus endometrial ablation, womenwould be expected to be amenorrhoeic follow-ing hysterectomy but not after ablation. Here,comparison of amenorrhoea rates is unlikely tobe helpful in comparing the two groups, whilesatisfaction is not only a robust measure of treat-ment success, but also incorporates the sum totalof a woman’s experience of the alternative treat-ment arms including discomfort, recovery timeand side-effects. A similar argument can be usedto justify the use of the same outcome for trialscomparing surgical and non-surgical treatment ofurinary incontinence.

Despite their widespread use in clinical trials,assessment of patient satisfaction has been criti-cised on theoretical and methodological groundsand their practical use questioned.47 Relativelyfew patients express open dissatisfaction withtreatment.48 Indeed satisfaction rates of 80%or more are reported by most hospital-basedstudies.49 There is also little evidence to indi-cate that expressions of satisfaction result fromthe fulfillment of expectations; in some situ-ations it is difficult to establish the fact thatexpectations exist at all. High satisfaction rat-ings do not necessarily mean that women havehad good experiences in relation to the ser-vice as satisfaction may well make allowances

Page 342: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 347

for mitigating circumstances. If the aim is toprovide women with a voice, it is important notto rely on satisfaction with treatment as a sin-gle outcome but to prioritise methods of access-ing women’s experience of interventions and themeaning and value they attach to them.47 Thereare no off-the-shelf questionnaires that are com-pletely satisfactory50 and qualitative studies havedemonstrated that high satisfaction rates cannotbe taken as proof of positive experience. Manytools mentioned in the literature are not val-idated, while many expressions of satisfactionmay not be evaluations at all.51,52 Dissatisfac-tion may be more useful as a minimum level ofnegative experience and may be of potential usein benchmarking exercises. At the moment mostclinical trials in gynaecology attempt to mea-sure satisfaction using a number of direct andindirect questions. Some of these questions havebeen repeated at various points during follow-upto assess change in satisfaction rates over time.Despite the obvious shortcomings of the existingsystem, there has been an opportunity to refineand validate some of these questionnaires throughrepeated use in a series of related trials.4 Accept-ability has been measured by direct questions andby other tools such as the Semantic Differen-tial technique in the context of menorrhagia andtermination.20 – 22

In other areas such as infertility, satisfactionwith treatment is more difficult to assess as theeffect of the desired outcome (live birth) is pre-dominant even where treatment is invasive orunpleasant. Conversely there is dissatisfactionwith treatment where the outcome is failure tofall pregnant. Some attempts have been madein recent trials to specifically address separatelysatisfaction with ‘treatment’ as opposed to satis-faction with ‘outcome’. This area is deserving offurther study.

ECONOMIC EVALUATION

With the emergence of new methods of treatmentcomes an increasing awareness of the need tostudy not just the clinical effectiveness but alsothe cost-effectiveness of alternative treatments.Pragmatic clinical trials are the standard approach

not only for evaluating interventions, but alsocomparing costs.53 The costs of treatments areusually estimated using information about thequantities of the resources used. For examplethe resources used for hysterectomy include thestaff time involved, the consumables used andthe length of the subsequent in-patient stay.To estimate the cost of treatment, informationabout this resource use is combined with unitcost estimates, which provide a fixed monetaryvalue to each cost-generating item.54 The totalcost is then the weighted sum of quantities ofresources used where weights are unit costs.Carrying out an economic evaluation alongsidea randomised trial allows detailed information tobe collected about the quantities used by eachpatient in the study. Such information allows acost for each patient, producing a patient-specificcost data. This is turn reduces the extent towhich comparison between the groups is basedon assumptions about resource use. Howeverrandomised trials are not necessarily the only wayor necessarily the best way to address economicquestions.55 There is an important role for othermethods, including modelling.

In the context of RCTs however there is anurgent need to revise the way in which health eco-nomic outcomes are addressed within a clinicaltrial. While cost outcomes are generally regardedas secondary outcomes, the rationale for a for-mal sample size calculation with adequate powerfor the planned analysis is still relevant given thelarge variability in costs between individuals.56

This is even more relevant where subsets areused for cost data for practical reasons. A recentreview has identified an urgent need to improvethe statistical analysis and interpretation of costdata in RCTs.54 This is particularly relevant to theprovision of descriptive statistics relating to costs.As cost data are typically skewed, the median canbe interpreted as the typical cost for individuals.However, it is the mean cost that is important forpolicy decisions as it is this value, multiplied bythe number of patients, which gives an estimateof the total cost of an intervention.

Table 21.3 provides some examples of outcomemeasures used in different types of gynaecological

Page 343: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

348 TEXTBOOK OF CLINICAL TRIALS

Table 21.3. Outcomes in gynaecological trials

Clinical area Outcomes Comments

Infertility • Live birth rate per couple• Live birth rate per treatment• Clinical pregnancy rate per couple• Clinical pregnancy rate per

treatment• Biochemical pregnancy rate• Fertilisation rate• Implantation rate• Multiple pregnancy• Morbidity (e.g. ovarian

hyperstimulation)• Costs

Although live birth per couple is the mostrobust outcome, it demands large samplesizes and a longer duration of follow-up.

Live birth/clinical pregnancy rate pertreatment is still used in many trials.

Multiple pregnancy and its effect onmaternal and perinatal morbidity isincreasingly being acknowledged as animportant outcome of fertility trials.

Menorrhagia • Satisfaction• Acceptability• Quality of life• Menstrual blood loss• Bleeding and pain scores• Morbidity• Repeat surgery• Haemoglobin level• Amenorrhoea rates• Costs

Satisfaction and QOL are clinically moreuseful than objective measurement ofmenstrual blood loss or amenorrhoearates, especially when trials comparetreatments such as hysterectomy whichguarantees amenorrhoea versus theMirena intrauterine system orendometrial ablation which do not.

Satisfaction with treatment may notcorrespond to amenorrhoea rates.

Long-term follow-up is important in theevaluation of all new technologies.

Urogynaecology • Satisfaction• Acceptability• Quality of life• Symptom relief• Objective measurement of urinary

loss• Surgical morbidity, repeat surgery• Length of hospital stay• Urodynamic assessment• Costs

Symptom relief and objective assessment ofbladder function may not necessarilycorrespond with quality of life orsatisfaction.

Long-term follow-up is necessary foreffective evaluation of treatments.

Hormone replacementtherapy

• Hip fracture• Cardiovascular disease• Menopausal symptoms• Quality of life• Satisfaction• Acceptability• Bone mineral density• Serum lipid profile• Side-effects and morbidity

Historically, surrogate outcomes like lipidprofile and bone density have been morepopular than rates of cardiovasculardisease or fracture.

Termination ofpregnancy

• Efficacy: evacuation of the uterus• Acceptability• Morbidity• Quality of life• Costs

Quality of life is difficult to assess in thecontext of termination.

Long-term follow-up difficult.

Page 344: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 349

trials. A crude list such as this is useful, if only toillustrate the specific demands of different clinicalareas. Overall, due to the limitations of using ‘pure’clinical outcomes in benign gynaecology, ‘satis-faction’ and ‘quality of life’ (however defined)have found widespread acceptance as appropri-ate outcomes. In other areas such as infertility,‘satisfaction’ is meaningless without the promiseof live birth while even the most invasive anduncomfortable treatment may be perceived to beentirely acceptable if it leads to pregnancy. In gen-eral, even when relevant, purely clinical outcomesmay lead to potential conflicts between the clin-icians’ and patients’ points of view. A numberof health state measures incorporating validatedand reliable scales have been developed to addressthis very issue.57 These may be generic or disease-specific. Most pragmatic trials will use a numberof outcomes from the above categories. At thesame time, it is best, in very large trials, to con-centrate on a few simple outcomes–for reasons ofconvenience and efficiency.58 There is also a sta-tistical drawback to the use of multiple outcomes.The greater the number of outcomes, the higheris the possibility of one of them reaching statisti-cal significance on the basis of chance alone. It isimportant to consider relevance of outcome mea-sures to the stakeholders. It is thus important topredefine primary and secondary outcomes. Theextent to which a trial changes practice will dependon the outcomes chosen.

SAMPLE AND SAMPLE SIZE

The sample size refers to the number of womenneeded to provide adequate power (usually 80%to 90%) in order to show that the findings of thetrial are not merely due to chance. The samplesize for each trial is usually calculated with theprimary outcome in mind. Although secondaryoutcomes are often investigated and subgroupanalyses performed, the power of an RCT toprovide conclusive answers to these may belimited. The statistical approach to determiningsample size is the power calculation, whichdetermines how likely the study is to producea statistically significant result for a difference

between groups of a certain magnitude. It isimportant to ensure that the study is designedto detect significant differences if they exist.Conversely if the statistical power is low, theresults of the trial will be questionable as thenumbers may have been too small to detectgenuine differences. In general, a clinicallysignificant difference in the primary outcomeshould be identified as the point of reference fora sample size calculation. Intimate knowledgeof the clinical area is crucial for this. Forexample a 20% difference in satisfaction ratebetween two forms of treatment for incontinencemay be considered to be clinically important.Conversely, against a background of low livebirth rates, a difference of 5% to 10% maybe enough to change clinical practice in aninfertility trial.

In determining the sample size adequate atten-tion should also be paid to the possibility ofsample attrition and the need for any future sub-group analysis. For example, in abortion trials, ahigh non-response to follow-up should be antici-pated and the sample size inflated accordingly.21

In infertility trials, where it may be clinicallyimportant to assess the effect of the interven-tion in different clinical groups, a similar exercisewill ensure meaningful subgroup analysis. At thesame time, aiming for unrealistically large samplesizes is counterproductive and should be avoided.With a large sample size it is almost always pos-sible to reject any null hypothesis (type I error).Conversely samples which are too small have ahigh risk of failing to demonstrate a real differ-ence (type II error). The latter is more frequent ingynaecological trials. In small trials, a subgroupanalysis based on tiny numbers of patients shouldbe perceived as a hypothesis generating exercise.

RANDOMISATION

Randomisation involves allocating women togroups such that individual characteristics donot influence the nature of the intervention. Forexample in a trial of treatments for menorrhagia,the aim is to avoid bias by distributing factors thatmay influence outcome, such as age, parity, dys-menorrhea, premenstrual syndrome and uterine

Page 345: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

350 TEXTBOOK OF CLINICAL TRIALS

fibroids, randomly between treatment groups. Itis anticipated that any difference in outcome ispurely due to the treatment and not influenced byone or more of these other characteristics. Ran-dom allocation does not guarantee that the groupswill be identical but it does ensure that any dif-ferences between them are due to chance alone.

Randomisation also facilitates the concealmentof the type of treatment from the researchersand subjects to further reduce bias in treatmentcomparison. Thus it ensures that women with ahigher BMI (body mass index) are not prefer-entially allocated to endometrial ablation ratherthan hysterectomy. In addition, it leads to treat-ment groups which are random samples of thepopulation sampled and thus makes valid the useof standard statistical tests based on probabil-ity theory.

While the simplest method of randomisationis tossing a coin, in practice, this is not anaccepted method of treatment allocation. Themain reason for this is the lack of an audit trailthat makes it difficult to confirm that the ran-dom allocation was done correctly. For thesereasons the random allocation should be deter-mined in advance, preferably by using pseu-dorandom numbers generated by a mathemat-ical process. After the randomisation list hasbeen prepared (by someone who will not beinvolved in recruitment), it must be made avail-able to researchers. Although the process of ran-domisation can occur at the recruitment pointthis is preferably done at long range, by tele-phone or even the internet. If envelopes are used,these must be opaque, as researchers could the-oretically hold envelopes to a lamp in orderto read what is written inside. For the samereason these envelopes should be sequentiallynumbered so that the recruiter has to take thenext envelope. Differences in outcome betweentreatment groups are considerably larger in tri-als where allocation concealment is not strictlyenforced as this produces a clear bias. Tele-phone randomisation, either by means of an oper-ator or a computer-operated 24-hour phone line,is ideal for large trials and especially multi-centre trials. Although potentially more efficient,

internet-based randomisation systems continue togenerate concerns about ensuring security andconfidentiality of patient details. Practical prob-lems with randomisation may arise in surgicaland laboratory-based trials where randomisationmay need the assistance of nursing or techni-cal staff.

While simple randomisation techniques will,on average, allocate equal numbers to each arm,occasionally, even in large trials, groups of dif-ferent sizes can result. Block randomisation canbe used to keep the numbers in each group veryclose at all times. In a trial of two alternative sur-gical treatments for menorrhagia we might wantto ensure that each surgeon treats similar num-bers of women by either method. Stratified ran-domisation produces a separate randomisation listfor each surgeon (stratum) so that we get verysimilar numbers of patients receiving each treat-ment within each stratum. If envelopes are used,this may involve separate lists of random num-bers and separate piles of sealed envelopes foreach surgeon. We may additionally use blocks toensure that there is a balance of treatments withineach stratum. While stratified randomisation canbe extended to two or more stratifying variables,we have to be careful to include only a few strata,to prevent generating extremely small subgroups.Stratification by centre is standard practice inmulti-centre trials.

In small studies with several important prog-nostic variables such as infertility trials, ran-dom allocation may not provide adequate bal-ance within the groups. The lack of numbers maymake it difficult to stratify for all the importantvariables. Here, it is still possible to achieve bal-ance using minimisation, which is based on theconcept that the next patient to enter the trialis allocated to whichever treatment would min-imise the overall imbalance between groups atany stage of the trial. Even in small trials thisprovides groups that are comparable across sev-eral prognostic factors. It is important to specifyexactly which prognostic variables are to be usedand to say how they are to be grouped. Forexample age, previous pregnancy and durationof infertility are important prognostic factors for

Page 346: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 351

fertility. Minimisation in this context will requirea statement about the actual age groups, forexample <30 years and ≥30 years. Minimisationis crucial in infertility trials where a clinically sig-nificant difference in live birth rates associatedwith alternative treatments is small and easilyoverpowered by the effect of prognostic factorssuch as age, parity and previous pregnancy.

Occasionally we allocate a group of subjectstogether rather than individuals to treatments. Forexample, in a health promotion study carriedout in general practices, we might need toapply the intervention to all patients in thepractice. This may involve display of publicitymaterial in the waiting room, for example. Inthis situation, we may need to keep groups ofpatients separate in order to avoid contamination.In a different setting, if we are providing aspecial physiotherapist to advise patients in award, it would be difficult for the nurse to visitsome patients and not others. If we are providingtraining to the patients or their carers, we do notwant the subjects receiving training to pass onwhat they have learned to controls. This might bedesirable in general, but not in a trial. A group ofsubjects allocated to a treatment together is calleda cluster. Clusters must be taken into accountin the design as the use of clusters reduces thepower of the trial and so requires an increase insample size.

A practical problem relating to randomisationconcerns the emotive nature of some of the con-ditions under evaluation such as infertility ortermination of pregnancy. Some women may beunwilling to accept the extra stress of participat-ing in a trial over and above what is already acomplex and psychologically challenging experi-ence. There may also be compelling social rea-sons why women undergoing termination are lesslikely to opt for randomisation, comply with trialprotocols and follow-up arrangements. Infertilecouples may be required to fund their treatmentthemselves. This could influence their decision torefuse to participate in a trial where the experi-mental arm (such as assisted hatching) is substan-tially more expensive than standard treatment,unless the trial organisers offer to absorb the extra

costs. Often there is an imperative to provide‘treatment’ at the request of the couples. Thismakes it difficult to recruit couples into a clin-ical trial where one of the options is ‘expectantmanagement’.

CONCEALMENT OF ALLOCATION

The unpredictability of the randomisation pro-cess can only be successful if followed byallocation concealment, i.e. concealment of thesequence until patients have been assigned totheir groups.59 This ensures strict implementa-tion of a random allocation sequence withoutforeknowledge of treatment assignments. Aware-ness of the next treatment allocation could leadto exclusion of certain women based on theirprognosis because they would have been allo-cated to the perceived inappropriate group. Forexample, in a trial of unexplained infertility,women with a prolonged duration of infertilitycould be excluded if the next treatment allocationwere known to be a ‘no treatment’ arm. Adequateconcealment would ensure that the decision toaccept or reject a participant should be made andinformed consent obtained without prior knowl-edge of the nature of the assignment.

Trials that use inadequate or unclear allocationconcealment have tended to yield 40% larger esti-mates of effect compared to those which usedadequate concealment.60 – 62 Trials with poorlyconcealed allocation also generated greater het-erogeneity in results, i.e. the results fluctuatedextensively above and below the estimates frombetter studies.60

BLINDING

Double blinding seeks to prevent ascertainmentbias, protects the sequence after allocation andcannot always be implemented.1 As in the casewith allocation concealment, lack of blindingmay lead to exaggerated estimate effects of treat-ment. A survey of trials in gynaecology foundthat investigators could have used double blind-ing more often.1 When used, methods of doubleblinding were poorly reported and rarely eval-uated. It is recommended that authors provide

Page 347: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

352 TEXTBOOK OF CLINICAL TRIALS

adequate information about the methods usedto ensure double blinding. This should includedetails such as the type of intervention (cap-sules/tablets), and efforts made to duplicate thecharacteristics of the treatment (taste, appear-ance, route of administration). In addition it isimportant to be explicit about the methods usedto control the allocation schedule, such as loca-tion of the schedule during the trial, details ofwhen the code was broken for analysis and thecircumstances under which the code could bebroken for individual cases (adverse reactions).Finally there should be a statement about theperceived success or failure of the double blind-ing efforts.

EXCLUSIONS

Exclusions can occur due to eventual discov-ery about ineligibility, deviations from protocol,withdrawals or losses to follow-up. Exclusionsbefore randomisation do not affect the internalvalidity of the trial but can compromise general-isability. For most pragmatic trials it is importantto keep the eligibility criteria to a minimum. Inpractice it is unusual to find significant qualitativedifferences between women in trials and thosein the general population. Exclusions after trialentry represent a further source of bias withinan RCT as any erosion over the course of thetrial from those initially randomised groups isnot likely to be random in nature. The acceptedmethod of primary analysis in all cases is by‘intention to treat’, i.e. analysis of patients inthe originally assigned groups regardless of anybreaches of protocol.63 This can prove unnervingfor clinicians especially in the context of surgicaltrials. For example in a trial comparing hysterec-tomy versus endometrial ablation many clinicianswould find it difficult to accept results of anal-ysis of amenorrhoea rates by intention to treatarguing that it is inappropriate to include hys-terectomised women in the ablation group as thiswould lead to an overestimation of amenorrhoearates. Investigators can also do secondary analy-ses, preferably pre-planned based on only thoseparticipants who fully complied with the trial pro-tocol (per protocol) or who received a particular

treatment irrespective of randomised assignment(analysis by treatment received). Secondary anal-yses are acceptable, as long as researchers labelthem as such, and as non-randomised compar-isons. The advantage of randomisation is entirelylost when investigators exclude participants andin effect present a non-randomised comparison asthe primary result, i.e. similar to a cohort study.Exclusions of participants can lead to misleadingresults.64 Researchers sometimes exclude patientson the basis of outcomes that happen before treat-ment has begun such as pregnancy in a couplewith infertility. Although this may seem sensi-ble in as much as the event of interest occurredindependent of the treatment, the same argumentcould be used for excluding pregnancies in a nointervention arm of the trial.

It is important to attempt to minimise exclu-sions and be explicit about those cases whereexclusions occurred. This can be enforced byminimising the delay between randomisation andinitiation of treatment. This can be particularlyrelevant to infertility trials where couples couldfall pregnant before treatment can start or wherethe intervention is conditional on a set of clini-cal criteria. For example, in couples randomisedto IVF or ICSI it may be more efficient todelay randomisation until after oocyte recoveryso that women who have failed to respond togonadotrophin stimulation are not included.

FOLLOW-UP

It is important to pre-determine the length andtype of follow-up for each trial. The precisecircumstances and the time interval will dependon the nature of the trial. In fertility trialsthe traditional method was to express outcomesas pregnancy rates per cycle. This meant theduration of follow-up was brief. For more robustoutcomes like pregnancy rate per woman, it maybe necessary to extend the follow-up for threeto six cycles depending on the nature of thetreatment. A further 9 months need to be addedon to allow live birth per couple to be usedas an outcome. For menorrhagia trials, 80% ofre-treatments occur within 2 years, making thisan acceptable duration for follow-up in the first

Page 348: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 353

instance. A prolonged period of follow-up ofup to 5 years would be ideal as many womencould expect the effects of their treatment to waneover time and long-term complications of therapyto surface. This would appear to be equallytrue for urogynaecology trials. For termination ofpregnancy, follow-up has to be kept short as theloss to follow-up is high and many women maynot wish to be contacted at a later date. For HRTtrials, which genuinely wish to address crucialoutcomes such as rates of fracture, cardiovasculardisease or Alzheimer’s disease, follow-up mayneed to be extended to tens of years. Thisobviously raises significant ethical, logistic andfinancial issues which may well need to be takeninto account whilst planning such trials.

DATA COLLECTION

Data in a trial are usually collected from sourcessuch as case notes, local clinic databases andpatient questionnaires. Occasionally interviewsmay be used to explore areas which are not capa-ble of being probed adequately with question-naires. General practitioners, local and nationaldatabases may also be accessed to obtain clini-cal information such as retreatment rates or seri-ous complications about patients who are lost tofollow-up.

CONDUCT

Recruitment

To avoid recruitment bias, it is important totarget all eligible women and record all refusals.It may be helpful to obtain some baselineclinical details about them in order to exploreany major differences between participants andnon-participants, which could affect the externalvalidity of the trial.

Trial Co-Ordination

Following informed consent, it is important toobtain baseline information by filling in datasheetsor questionnaires prior to randomisation. Subse-quent data collection should occur at the pre-specified times and an efficient system of timely

reminders put into place. In pragmatic trials itis often important to distinguish those womenwho no longer wish to continue with the allo-cated treatment from those who wish to terminatetheir involvement with the trial and do not wish tobe contacted for follow-up or have questionnairessent to them. Hopefully the numbers in this lat-ter group should be small but their wishes shouldbe respected.

Data Entry and Analysis

This is an important aspect of the trial and errorshere can lead to significant bias. As mentionedabove, analysis should be by intention to treat.Each woman should be analysed as though shehad received the intervention to which she hadbeen randomised. This minimises any bias dueto non-random removal of participants fromthe trial. The exception is explanatory trials,usually phase I and II drug trials, where strictrules of exclusion for protocol violation apply.Occasionally it may be important from a clinicalpoint of view to perform as separate analysisby treatment received. This should be clearlydescribed as such and should be used to assessthe primary outcome. Intention to treat can causemuch consternation among clinicians particularlyin surgical trials where some outcomes may seemabsurd–for example continuing menstrual lossin women allocated to hysterectomy who didnot undergo the operation but were analysed byintention to treat.

Presenting Results

Analysis should follow the original plan set outin the protocol and the CONSORT recommen-dations should be observed. Particularly helpfulis a trial chart which sets out in an explicitmanner any exclusions or loss to follow-up.Results of subgroup analyses should be treatedwith caution and used mainly as hypothesis-generating exercises in most modest-sized trials.There should be a conscious attempt to limitdiscussion to the results generated by the trialand avoid speculation.

Page 349: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

354 TEXTBOOK OF CLINICAL TRIALS

ETHICS OF TRIALS

The scientific rationale for conducting trials iscollective equipoise. Clinicians need to be gen-uinely uncertain about the best treatment. In sucha clinical situation, there should be no conflictbetween the interests of those participating in atrial and those who stand to gain in the future.The important issue is that participants are alsoin personal equipoise and give informed consent.

Despite awareness of its importance, there isevidence that some doctors do not seem to takeinformed consent as seriously as they should.65

This may well be because participants seem tobe less willing to be randomised, when they aregiven more preliminary data, and made aware ofany accumulating evidence of effectiveness. Inmany trials, a significant number of participantsemerge from consultations expecting to benefitpersonally by their participation.

Some infertility-related procedures are de-scribed as ‘licensed treatment’ under the aegis ofthe Human Fertilisation and Embryology Author-ity (HFEA) in the United Kingdom. Clinicaldata pertaining to licensed treatments (includingdonor insemination, IVF and ICSI) are confi-dential and may not be revealed to researchers(including clinicians) who are not covered bythe institutional HFEA treatment licence, withoutthe explicit permission of the couple. This cancreate problems in accessing data, particularlyfollow-up data from notes or databases. Further-more, trials involving manipulation of gametesand embryos need separate approval from theHFEA in addition to approval from the localethics committee.

For all clinical trials, it is sensible from an ethi-cal and financial point of view to have clear stop-page rules as part of the original study design. Anindependent data monitoring committee shouldbe available to review the results of an interimanalysis. Early stopping should only occur underpre-planned, well-specified circumstances such amarked superiority or toxicity of one arm ofthe study which is greater than that originallyhypothesised. Examples include stopping a trialevaluating the use of prophylactic antibioticsduring hysteroscopic surgery where the control

arm demonstrates a significantly higher rate ofinfection.66 Alternatively, in a trial comparing apolicy of single versus double embryo transfer(in order to prevent twin pregnancies) it may beappropriate to stop if the pregnancy rate in thesingle embryo group becomes unacceptably low.

CONCLUSION

Clinical trials in gynaecology have lagged behindthose in other disciplines in terms of overallnumbers as well as quality. There are few largemulti-centre trials, particularly surgical trials. Theclinical population is heterogeneous and interven-tions under scrutiny diverse. Some treatments,such as those for infertility and unwanted fer-tility target women (and their partners) who havespecific reproductive health needs but are oth-erwise in good health. Trials also need to beable to compare interventions that cross differenttreatment boundaries. Trialists in this field needto design more pragmatic trials with clinicallymeaningful outcome measures. In gynaecologythese should be quality of life and satisfaction;in infertility, live birth rates per couple/woman.Finally, the importance of collecting cost datacannot be overstated in terms of planning gynae-cological services which are effective, acceptableand affordable.

REFERENCES

1. Schulz KF, Grimes DA, Altman DG, Hayes RJ.Blinding and exclusions after allocation in ran-domised controlled trials: survey of published par-allel group trials in obstetrics and gynaecology.BMJ (1996) 312(7033): 742–4.

2. Out HJ, Schnabel PG, Rombout F, Geurts TB,Bosschaert MA, Coelingh Bennink HJ. A bioe-quivalence study of two urinary follicle stimulat-ing hormone preparations: Follegon and Metrodin.Human Reprod (1996) 11(1): 61–3.

3. El-Refaey H, Rajasekar D, Abdalla M, Calder L,Templeton A. Induction of abortion with Mifepri-stone (RU 486) and oral or vaginal Misoprostol.N Engl J Med (1995) 332: 983–7.

4. The Aberdeen Endometrial Ablation Group. Arandomised trial of hysterectomy versus endome-trial ablation for the treatment of dysfunctional

Page 350: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

GYNAECOLOGY AND INFERTILITY 355

bleeding: clinical psychological and economic out-come at four years. Br J Obstet Gynaecol (1999)106: 360–6.

5. Cooper KG, Jack SA, Parkin DE, Grant AM. Five-year follow up of women randomised to medicalmanagement or transcervical resection of theendometrium for heavy menstrual loss: clinicaland quality of life outcomes. BJOG: Int J ObstetGynaecol (2001) 108(12): 1222–8.

6. Roland M, Torgerson DJ. Understanding controlledtrials: what are pragmatic trials? BMJ (1998) 316:285.

7. Goverde AJ, McDonnell J, Vermeiden JPW, SchatsR, Rutten FFH, Schoemaker J. Intrauterine insem-ination or in-vitro fertilisation in idiopathic sub-fertility and male subfertility: a randomised trialand cost-effectiveness analysis. Lancet (2000) 355:13–18.

8. Khan KS, Daya S, Collins JA, Walter SD. Empir-ical evidence of bias in infertility research: over-estimation of treatment effect in crossover trialsusing pregnancy as the outcome measure. FertilSteril (1996) 65(5): 939–45.

9. Fender GRK, Prentice A, Gorst T, Nixon RM,Duffy SW, Day NE, et al. Randomised controlledtrial of educational package on managementof menorrhagia in primary care: the Angliamenorrhagia education study. BMJ (1999) 318:1246–50.

10. Morrison J, Carroll L, Twaddle S, Cameron I,Grimshaw J, Leyland A, et al. Pragmatic ran-domised controlled trial to evaluate guidelines forthe management of infertility across the primarycare–secondary care interface. BMJ (2001) 322:1–7.

11. Kerry SM, Bland JM. Sample size in clusterrandomisation. BMJ (1998) 316: 549.

12. Kerry SM, Bland JM. The intracluster correlationcoefficient in cluster randomisation. BMJ (1998)316: 1455.

13. Donner A, Klar N. Design and analysis of clusterrandomization trials in health research, 1st edn.London: Arnold (2000).

14. Aboulgar MA, Mansour RT, Serour JI, Amin YM,Kamal A. Prospective controlled randomised studyof in-vitro fertilisation versus intracytoplasmicsperm injection in the treatment of tubal factorinfertility with normal semen parameters. FertilSteril (1996) 66: 753–6.

15. Ruiz A, Remohi J, Minguez Y, Guanes P, SimonC, Pellicer A. The role of in vitro fertilisationand intracytoplasmic sperm injection in coupleswith unexplained infertility after failed intrauterineinsemination. Fertil Steril (1997) 68: 171–3.

16. Bhattacharya S, Hamilton MPR, Shabban M, Kha-laf Y, Seddler M, Ghobara T, et al. A randomisedcontrolled trial of in-vitro fertilisation (IVF) and

intra-cytoplasmic sperm injection (ICSI) in non-male factor infertility. Lancet (2001) 357: 2075–9.

17. Torgerson D, Sibbald B. Understanding clinicaltrials: what is a patient preference trial? BMJ(1998) 316: 360.

18. Brocklehurst P. Br J Obstet Gynaecol (1997) 104:1332–5.

19. Brewin CR, Bradley C. Patient preferences andrandomised clinical trials. BMJ (1989) 299:313–15.

20. Henshaw RC, Naki SA, Russell IT, et al. Com-parison of medical abortion with vacuum aspi-ration; women’s preferences and acceptability oftreatment. Br Med J (1993) 307: 714–17.

21. Ashok PW, Kidd A, Flett GM, Fitzmaurice A,Graham W, Templeton A. A randomised compari-son of medical abortion and surgical vacuum aspi-ration at 10–13 weeks gestation. Human Reprod(2002) 17(1): 92–8.

22. Cooper KG, Parkin DE, Garret AM, Grant AM.A randomised comparison of medical and hystero-scopic management in women consulting a gynae-cologist for treatment of heavy menstrual loss. BrJ Obstet Gynaecol (1997) 104: 1360–6.

23. Torgerson DJ, Klaber-Moffett J, Russell I. Patientpreferences in randomised trials: threat or oppor-tunity? J Health Serv Policy (1996) 1(4): 194–7.

24. Piaggio G, Pinol APY. Use of the equivalenceapproach in reproductive health clinical trials. StatMed (2001) 20: 3571–87.

25. Jones B, Jarvis P, Lewis JA, Ebbutt AF. Trials toassess equivalence: the importance of rigorousmethods. BMJ (1996) 313: 36–9.

26. Rohmel J. Therapeutic equivalence investigations:statistical considerations. Stat Med (1998) 17:1703–14.

27. Ebbutt AF, Frith L. Practical issues in equivalencetrials. Stat Med (1998) 17(15 & 16): 1691–1701.

28. Pandian Z, Bhattacharya S, Nikolaou N, Vale L,Templeton A. In-vitro fertilisation for unexplainedsubfertility (Cochrane Review). In: The CochraneLibrary, Issue 4. Oxford: Update Software (2002).

29. Roland M, Torgerson DJ. Understanding controlledtrials: what outcomes should be measured? BMJ(1998) 317: 1075–80.

30. Cranney A, Welch V, Adachi JD, Guyatt G,Krolicki N, Griffith L, Shea B, Tugwell P,Wells G. Etidronate for treating and preventingpostmenopausal osteoporosis (Cochrane Review).In: The Cochrane Library, Issue 2. Oxford: UpdateSoftware (2002).

31. Patrick DL, Bergner M. Measurement of healthstatus in the 1990s. Ann Rev Public Health (1990)11: 1165–83.

32. Gill TM, Feinstein AR. A critical appraisal ofthe quality of life measurements. JAMA (1994)272(8): 619–26.

Page 351: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

356 TEXTBOOK OF CLINICAL TRIALS

33. Fayers PM, Sprangers MAG. Understanding self-rated health. Lancet (2002) 359: 187–8.

34. Ware JE, Sherbourne CD. The MOS 36-item shortform health survey (SF-36). Conceptual frame-work and item selection. Med Care (1992) 30:473–83.

35. McGee H, Hannah M, O’Boyle CA, Hickey A,O’Malley K, Joyce CRB. Assessing the quality oflife of the individuals: the SEIQOL with a healthand gastroenterology unit population. Pschol Med(1991) 21: 749–59.

36. Guyatt GH, Bombardier C, Tugwell PX. Measur-ing disease specific quality of life in clinical trials.Can Med Assoc J (1986) 134: 889–95.

37. Kelleher CJ, Cardozo LD, Khullar V, Salvatore S.A new questionnaire to assess the quality of life ofurinary incontinent women. Br J Obstet Gynaecol(1997) 104(12): 1374–9.

38. Moos RH. Typology of menstrual cycle symp-toms. Am J Obstet Gynecol (1969) 103: 390–402.

39. Jones G, Kennedy S, Barnard A, Wong J, Jenkin-son C. Development of an endometriosis quality-of-life instrument: The Endometriosis HealthProfile-30. Obstet Gynecol (2001) 98(2): 258–64.

40. Schneider HP, Heinemann LA, Rosemeier HP,Potthoff P, Behre HM. The Menopause RatingScale (MRS): comparison with Kupperman indexand quality-of-life scale SF-36. Climacteric (2000)3(1): 50–8.

41. Sanders C, Egger M, Donovan J, Tallon D, FrankelS. Reporting on quality of life in randomisedcontrolled trials: bibliographic study. BMJ (1998)317: 1191–4.

42. Graham WJ. Outcomes and effectiveness in repro-ductive health. Social Sci Med (1998) 47(12):1925–36.

43. Williams B. Patient satisfaction: a valid concept?Social Sci Med (1994) 38: 509–16.

44. Sitzia J, Wood N. Patient satisfaction: a review ofissues and concepts. Social Sci Med (1997) 45(12):1829–43.

45. Bury M. Doctors, patients and interactions inhealth care. In: Health and Illness in a ChangingSociety. London: Routledge (1997).

46. Department of Health. The New NHS: Modern andDependable. London: HMSO (1998).

47. Williams B, Coyle J, Healy D. The meaningof patient satisfaction: an explanation of highreported levels. Social Sci Med (1998) 47(9):1351–9.

48. Hopton JL, Howie JGR, Porter MD. The need tolook at the patient in General Practice satisfactionsurveys. J Fam Prac (1993) 10: 82–7.

49. Williams SJ, Calnan M. Key determinants of con-sumer satisfaction in general practice. J Fam Prac(1991) 8: 237–42.

50. McIver S, Meredith P. There for the asking: canthe government’s planned annual survey reallymeasure patient satisfaction? Health Serv J (1998)19: 26–7.

51. Leimkuhler A, Muller U. Patient satisfaction–artifact or social fact. Nervenarzt (1996) 67:765–73.

52. Owens DJ, Batchelor C. Patient satisfaction andthe elderly. Social Sci Med (1996) 42: 1483–91.

53. Drummond MF, Davies L. Economic analysisalongside clinical trials. Revisiting the method-ological issues. Int J Technol Assess Health Care(1991) 7: 561–73.

54. Barber JA, Thompson SG. Analysis and interpre-tation of cost data in randomised controlled tri-als: review of published studies. BMJ (1998) 317:1195–200.

55. Fayers PM, Hand DJ. Generalisation from phaseIII clinical trials: survival, quality of life and healtheconomics. Lancet (1997) 350: 1025–7.

56. Briggs A. Economic evaluation and clinical trials:size matters. BMJ (2000) 321: 1362–3.

57. Bowling A. Measuring Disease: A Review ofDisease Specific Quality of Life MeasurementScales. Mitton Keynes: Open University Press(1994).

58. Peto R, Collins R, Gray R. Large scale randomisedevidence: large sample trials and overviews oftrials. J Clin Epidemiol (1995) 48: 23–40.

59. Schulz KF and Grimes DA. Generation of alloca-tion sequences in randomised trials: chance notchoice. Lancet (2002) 359: 515–19.

60. Schulz KF, Chalmers I, Hayes RJ, Altman DG.Empirical evidence of bias: dimensions of method-ological quality associated with estimates of treat-ment effects in controlled trials. JAMA (1995) 273:408–12.

61. Moher D, Pham B, Jones A, et al. Does qualityof reports of randomised trials affect estimatesof intervention efficacy reported in meta-analysis.Lancet (1998) 352: 609–13.

62. Juni P, Altman D, Egger M. Assessing quality ofcontrolled trials. BMJ (2001) 323: 42–6.

63. Pocock SJ. Clinical Trials: A Practical Approach.Chichester: John Wiley & Sons (1983).

64. Schulz KF, Grimes DA. Sample size slippages inrandomised trials: exclusions and the lost andwayward. Lancet (2002) 359: 781–5.

65. Edwards SJ, Lilford RJ, Hewison J. The ethics ofrandomised controlled trials from the perspectivesof patients, the public and health care profession-als. BMJ (1998) 317: 1209–12.

66. Bhattacharya S, Parkin DE, Reid TMS, Abramo-vich DR, Mollison J, Kitchener HC. A prospectiverandomised study of the effects of prophylacticantibiotics on the incidence of bacteraemia follow-ing hysteroscopic surgery. Eur J Obstet GynaecolReprod Biol (1995) 63: 37–40.

Page 352: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 353: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

22

Respiratory

ANDERS KALLENAstraZeneca, Lund, Sweden

AIRWAYS DISEASES

Airway obstruction is a common and impor-tant feature of some respiratory diseases. It canbe acute, ‘semi-chronic’ (e.g. due to cancer)or chronic. The chronic obstructive airway dis-eases can be divided into whether the obstruc-tion is reversible or not. In the former case thepatient usually has asthma, in the latter casechronic obstructive pulmonary disease, abbrevi-ated COPD. These disease concepts lack precisedefinitions, and the division is only meant asa first approximation. Both diseases are inflam-matory diseases mainly of the lower respiratorytract: in asthma there is an inflammatory pro-cess mainly in the central airways, whereas theinflammation of COPD is predominantly periph-eral with progressive destruction of lung tis-sue. Inflammation in the upper respiratory tract,i.e. rhinitis, is characterised by both acute andchronic conditions, the most distinctive beingseasonal hayfever.

This chapter will discuss clinical trials in thethree diseases asthma, COPD and rhinitis, withthe focus on the first of these.

MEDICAL BACKGROUND

Asthma

From a clinical point of view, asthma presentsitself by recurrent breathlessness, cough orwheeze caused by variable or intermittent nar-rowing of the intra-pulmonary airways. Theseverity of these symptoms has a wide range,from very mild intermittent with symptoms onlyupon provocation, to severe persistent with largeimpact on daily life. There is no precise definitionof asthma, and therefore the prevalence is hard toestablish. We know, however, that it is commonerin children than in adults, and more common inboys than in girls.1 A figure for children around10% and half that for adults is probably closeto reality in most of the western world. There ishowever a definite regional inhomogeneity withregions with much higher prevalence and regionswhere the disease is rare. Most epidemiologicalstudies seem, however, to agree that the preva-lence is rising.2

The high prevalence of asthma gives a poorprediction of the impact of the disease on thecommunity, because the overwhelming majority

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 354: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

360 TEXTBOOK OF CLINICAL TRIALS

of asthmatics are mild sufferers with symptomsconfined to wheezing after exercise or breath-lessness in association with an upper respiratorytract infection. At the other end of the spectrum,asthma is a crippling, life-threatening diseasewith acute severe attacks requiring emergencyroom treatment. In the western world, about 1%of adults and 2% of children need medical atten-tion for asthmatic symptoms.3,4

Many factors are known to cause narrowingof intra-pulmonary airways. The sensitivity tosuch stimuli varies between individuals but undernormal circumstances the concentration of suchsubstances is too low to produce symptoms inhealthy subjects. Asthmatic patients are moreor less characterised by a high sensitivity tosuch stimuli, a phenomenon called non-specificbronchial hyperresponsiveness. The most com-mon cause of non-specific bronchial irritation isexercise and, for many, this may be the only man-ifestation of their asthma.

It is essential to make a clear distinctionbetween this non-specific hyperresponsivenessand the allergic reactions. Allergy is an immuno-logical reaction to a specific environmentalagent. Hyperresponsive bronchi, in addition toresponding in an exaggerated fashion to exoge-nous stimuli, will also respond in an enhancedfashion to inflammatory mediators released inthe bronchial wall as a result of an allergicreaction. Thus a trivial allergic reaction in ahyperresponsive bronchus may provoke a largebronchoconstrictive response. There is little, ifany, relationship between the degree of atopyand non-specific hyperresponsiveness. Instead thedegree of non-specific hyperresponsiveness isassociated with the degree of inflammation in therespiratory tract.5

Asthma may start at any age. When startingduring childhood and adolescence it is likelyto be associated with atopy, as compared towhen symptoms start later in life. Most asthmaticpatients have perennial symptoms, but a minorityshows a seasonal variation, sometimes confinedto periods with air-borne pollen, sometimes tothe winter months. Thus different asthmatics mayhave symptoms during different periods of the

year, with long periods of absolute or relativerelief between attacks of varying severity.

In general, asthma carries a favourable prog-nosis because the bronchial inflammation doesnot usually cause permanent tissue damage. How-ever, in a subgroup of subjects, irreversiblebronchial obstruction develops later in life.6

COPD

Chronic obstructive pulmonary disease is char-acterised by long-term, in general progressive,irreversible obstruction of the flow of air out ofthe lungs. To a large extent it is comprised of tworelated disease:

1. Chronic bronchitis, whose clinical definitionis productive cough (from bronchial secre-tion) on most days for 3 months/year fortwo consecutive years. The mucus hyper-secretion comes from hypertrophied bronchialglands and increases the risk of bacteriallung infections.

2. Emphysema, which has a pathological defini-tion with enlargement of the alveoli due to thedestruction of the walls between them. Thesewalls contain elastic fibres, so their destruc-tion reduces the elasticity of the lung, leadingto collapse, and thus obstruction, of airways.

The disease entities asthma, chronic bronchi-tis and emphysema are in no way mutuallyexclusive: a given patient can have symptomsfrom more than one. The definitions of the lasttwo does not imply that the patient has airwayobstruction, so not everyone with these diseaseshas COPD.

COPD is believed to affect more than halfa billion people worldwide, causing perhaps3 million deaths annually. When diagnosed, thisis often in a relatively late stage of the disease,with less than 50% of lung function remaining, sothe majority of cases are at any specified point intime undiagnosed. The prevalence of diagnosedCOPD is about 5%, and is increasing.

Pathologically COPD is a disease with per-iferal inflammation (thus rather a bronchioli-tis than bronchitis) with progressive lung tissue

Page 355: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 361

destruction. In the western world, by far themost important factor responsible for COPD issmoking; it has been said to be responsible forup to 90% of COPD patients.7 However, onlyabout 15–20% of all cigarette smokers developCOPD. The mechanism seems to be that cigarettesmoke attracts cells (neutrophils, macrophagesand cytotoxic T-cells as opposed to eosinophilsand T-helper cells in asthma) to the lungs thatpromote inflammation, and these are stimulatedto release elastase, an enzyme that breaks downthe elastic fibres in lung tissue. Normally thelungs are protected against this enzyme by theelastase inhibitors, among them α1-antitrypsin,which is produced in the liver (congenital defi-ciency of this enzyme is another, but rare, cau-sation for emphysema). Air pollution has beensuspected to have similar effects as smoking, butit is unclear to what extent that is an importantaetiological factor for COPD. Also, there is a highCOPD incidence in women in Asia attributed tocooking fumes.8

The typical COPD patient has been smok-ing 20 or more cigarettes a day for more than20 years and presents with a chronic cough,shortness of breath (dyspnea) and frequent res-piratory infections. If the underlying disease ismainly emphysema, shortness of breath may bethe only symptom. Initially the dyspnea onlycomes during physical exercise, but as the diseaseprogresses it occurs already on minimal exertion.For the patient with chronic bronchitis dominat-ing, the major symptoms are chronic cough andsputum production. The sputum may be clear butis usually coloured and thick as bacterial coloni-sation is common.

Rhinitis and Nasal Polyposis

The upper respiratory tract is to some extentlike terminal bronchioli without smooth muscles.Instead the nose has venous sinusoids and themajor reason for obstruction of the upper airwaytract is vasodilation of capacitance vessels andoedema while secretion can contribute. Anotherdifference to the lower tract is that stimulationof nervous irritant receptors in the nose results in

sneezing, which is the cleaning reflex of the upperairways corresponding in a way to coughing,which is the cleaning reflex of the lower airways.

Inflammation in the upper respiratory tract,rhinitis, presents as one or more of the symptomsnasal congestion, rhinorrhea (i.e. runny nose),sneezing and itching. Chronic inflammatory con-ditions can in predisposed individuals result inbenign protrusions of nasal polyps into the nasalcavity, polyposis.

Rhinitis can be allergic or non-allergic, wherethe latter is characterised by presence of symp-toms of varying severity. Allergic rhinitis can beseasonal as hay fever (SAR = Seasonal Aller-gic Rhinitis), or perennial. In the latter case thesymptoms can be due to continuous exposureto allergens like the house dust mite, or maypresent themselves intermittently as episodes trig-gered by allergens like, e.g., grass pollen. Despitethe common inflammatory denominator for aller-gic rhinitis and polyposis, there is no evidencethat the two conditions are closely linked, or thatallergy plays a major role in the aetiology ofnasal polyposis.

Rhinitis is a very common disease, but sur-prisingly little is known about its epidemiology.The nose has a filter function, and is thereforeexposed to a much larger amount of inhaled aller-gens per square centimetre than bronchi, espe-cially when the allergens are large. SAR is dueto air-borne plant pollen. From a clinical point ofview the most widely distributed ones are thoseof grasses, but some tree pollen, including birchand olive tree, are also important, as is ragweed.It is important to note that the pollen season foran individual plant species varies from one coun-try or region to another. Also, whereas the seasonfor air-borne pollen is limited to perhaps half ayear in temperate zones, in warmer climates itis so long that what seems to be perennial symp-toms may be provoked by multiple and sequentialseasonal allergies.

Patients with nasal polyposis suffer from aseries of symptoms, just as in rhinitis but inparticular nasal blockage and an absence of smell.The prevalence is not known, there are fewepidemiological studies, but it is probably in the

Page 356: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

362 TEXTBOOK OF CLINICAL TRIALS

range of 1–4%. The diagnosis of polyps requiresappropriate inspection of the nasal cavities by atrained physician.

CURRENT TREATMENTS

The respiratory tract has a limited repertoire ofresponses to irritation or other stimulation. In thenose vasodilation leads to decreased airway cali-bre and nasal blockage. The bronchi may changetheir calibre or alter the amount of glandularsecretion produced, leading to obstruction. Thereis oedema, hyperaemia and cellular infiltration ofthe wall of the tract. Afferent nerves may signalinformation to the brain stem to produce sneezing(upper tract) or cough or the sensation of breath-lessness (lower tract). The relative importance ofthese factors varies between individuals, and dif-ferent drugs interfere with different factors.

Drugs for chronic obstructive respiratory dis-eases are given either systemically, as tablets,or by local administration using an inhalationdevice. When it comes to inhaled products, itis important to note that a treatment consists oftwo objects, a drug to be delivered to the bodyand an inhalation device used for this deliver-ance. We will not discuss devices here, only drugclasses. It is however important to understand thatthe amount of drug delivered to the airways mayvary considerably from one inhalator to another.9

The same might be true of the distribution patternwithin the lungs, with potential consequences forthe effectiveness of the treatment.

Bronchodilator Drugs

There are three basic groups of bronchodilatordrugs–β2-agonists (today by far the largest),xanthines and anticholinergics.10 Their modes ofaction differ somewhat. We discuss each class ofdrugs separately.

β2-Agonists bind to the β-adrenergic receptorand stimulate the intracellular accumulation ofthe signal substance cAMP (cyclic adenosinemonophosphate). There are now three knowntypes of β-adrenergic receptors in the humanbody: stimulation of the β1-receptor causes

cardiac stimulation and intestinal inhibition,whereas stimulation of the β2-receptor results inbronchodilatation, vasodilatation, stimulation ofskeletal muscles and uterine contractile inhibi-tion. A third type, β3-receptors, cause lipolysis.11

The development within this drug class hasbeen towards more and more potent and selectiveβ2-agonists. The first generation of drugs wereshort-acting with a duration of action of, at most,4–6 hours. Lately a few long-acting drugs, withduration of action superseding 12 hours, havebeen introduced.12

β2-agonists are of benefit to the majority ofasthmatic patients because of the bronchodilatorproperty; rapid-acting ones are often given asrescue medication for relief of symptoms. Thedrug class does however have actions other thansmooth muscle relaxation that may contributeto their long-term therapeutic effect in asthmaand motivate their use in COPD: they stimulatemucociliary function in the airways, restorenormal clearance of bronchial secretion andinhibit microvascular permeability in the airwaysleading to decreased mucosal oedema.

Side-effects are a consequence of the bindingto receptors in tissues and organs outside thelung: tremour by binding to receptors in skeletalmuscle, tachycardia by binding to receptors in theheart (this problem has been reduced as the drugsbecame highly selective, but there are some β2-receptors in the heart as well) and hypokalemia,due to a redistribution from extra- to intracellularspaces. In general tolerance develops rapidly tothe extra-pulmonary effects, so these are usuallymild or absent in patients, though individualvariation in the sensitivity can make the use ofthese drugs impossible in the occasional patient.

Xanthines , the most well-known member ofwhich is caffeine but the most widely usedone as treatment for airway obstruction is theo-phylline, are potent smooth muscle relaxantsby acting directly in the intracellular messengercAMP. Thus they have about the same phar-macological actions as β2-agonists. But sincethey act intracellularly and not by binding to areceptor on the cell surface, the effect is moregeneralised and the side-effects are somewhat

Page 357: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 363

different and potentially more serious than thoseof β2-agonists. The most important ones relateto the gastrointestinal, cardiovascular and centralnervous systems. At the start of treatment withoral theophylline, most patients will experiencesome caffeine-like symptoms including irritabil-ity and nausea, symptoms which usually fadeaway after a few days. For that reason, however,treatment is usually initiated in subtherapeuticdoses and progressively increased over a periodof 1–2 weeks.13

The serious side-effects, in contrast to thecaffeine-like ones, are well correlated to plasmaconcentrations. In clinical practice theophyllineconcentration in plasma has to be monitored anddose adjusted so that the plasma concentrationlies within a therapeutic window. Because ofthis the use of xanthines has diminished overthe last 10 years as alternative treatment hasbecome available.

Anticholinergic drugs have been used sinceancient times for the treatment of asthma. Theuse of various plant derivatives has evolvedthrough synthetic atropine to more selectivebronchodilating anticholinergic agents with fewerside-effects than atropine.14

The bronchodilating effect of this drug classis due to their antagonism of the binding ofacetylcholine (from the vagal nerve) to themuscarinic receptors of bronchial smooth muscle.These drugs are particularly used in treatingreversible airway obstruction in COPD.

The side-effects of anticholinergic agents aredue to blockade of muscarinic M2-receptors inother organs and include dryness of the mouth,blurred vision, urine retention and difficulty inmicturition, tachycardia, flushing and lighthead-edness.

Corticosteroids

That glucocorticosteroids (GCSs) have a thera-peutic effect on asthma, rhinitis and other anti-inflammatory diseases has been known for along time and is due to their being manmadeanalogues of an endogenous anti-inflammatorysteroid–cortisol. Cortisol is in a way nature’s

own remedy for inflammation: if we removethe adrenal glands inflammatory reactions aregreatly exacerbated. Regulation of endogenouscortisol is complex, involving the hypothala-mic–pituitary–adrenal (HPA) axis. During asevere inflammatory response, elevated levels ofcytokines stimulate centres in the brain, leadingto an increase of cortisol in the circulation therebyattenuating the inflammatory response. It is nowbelieved that even at normal levels, endogenoushormones will regulate inflammation. The GCSmode of action is by binding to a glucocorticoidreceptor within the cell’s cytosol. When used fortreating, e.g., asthma GCSs lead to a reduction ofairway inflammation, mucous hypersecretion andairway reactivity while restoring the integrity ofthe airways.15

Originally GCSs were given systemically asasthma treatment. There are however well knownside-effects of high does of oral GCSs over along time that limits that usage. These include,but are not limited to, osteoporosis, hypertension,adrenal insufficiency and Cushingoid features aswell as growth retardation in children. Concernabout these side-effects diminished the use of oralGCS as an asthma treatment. Inhaled GCSs haveimproved the benefit/risk ratio. Since adminis-tration is aimed directly at the site of inflam-mation, lower doses can be used, giving lowerGCS concentrations in plasma with largely neg-ligible systemic side-effects as a result. InhaledGCSs are now widely accepted as first-line anti-inflammatory therapy for asthma.16

To evaluate the long-term side-effects ofinhaled GCSs is difficult. They are rare at dosesgiven in asthma treatment, so large numbers ofpatients and long-term clinical studies are needed.Some information can be gained by studyingthe endogenous cortisol levels. As already men-tioned, the endogenous cortisol level is controlledby the highly complex HPA axis. Introduction ofexogenous GCS in the plasma will affect this axisand lead to a suppression of the endogenous cor-tisol levels, the degree of which is determinedby the plasma concentrations and the potency ofthe drug. Thus, the degree of suppression is a

Page 358: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

364 TEXTBOOK OF CLINICAL TRIALS

measure of the amount of active (on the HPAaxis) exogenous GCS in the body.

Other Drugs

Vasoconstrictors are used extensively in rhinitis.Topical α-agonistic sympatomimetics effectivelyand promptly alleviate the nasal blockage. Theyhave no effect on rhinorrhea, nasal itch orsneezing.17

Antihistamines are used for rhinitis, mainly asrescue medication. Their main effect is to blockperipheral H1-receptors which limits vasodilata-tion in the nasal mucosa. They have an effect onnasal itching, sneezing and discharge, but little orno effect on nasal congestion and blockage.18

Disodium cromoglycate (DSCG) and nedo-cromil sodium DSCG has been used as a pro-phylactic anti-asthma drug, mainly by childrenand young adults.19 To be effective it should beadministered four times per day. Originally itsmechanism of action was proposed to be sta-bilisation of mast cells, though that is probablynot the case. Taken immediately before exposure,DSCG affects the asthmatic reactions induced byvarious stimuli. However, after discontinuation oflong-term treatment, DSCG seems not to havemodulated the bronchial hyperresponsiveness orthe underlying inflammatory reaction.

Nedocromil sodium is another non-steroidalsubstance with anti-inflammatory properties invitro. It acts as an inhibitor at several levelsof neurogenic inflammation in asthma. Clinicalstudies have demonstrated improvements in air-way functions, including a reduction of bronchialhyperreactivity, but it does not protect againstmaximal airway narrowing, which is an impor-tant feature of inhaled corticosteroids.

Both DSCG and nedocromil are remarkablyfree from side-effects. They are also used forrhinitis, with effects similar to antihistamines.

Leukotriene modifiers The cysteinyl leukotrienesare products of the arachidonic acid metabolismwith effects that mimic many features of asthma,e.g. by increasing eosinophil migration, mucus

production, airway wall oedema and causingbronchospasm. Oral leukotriene receptor antago-nists, to be administered once or twice daily, areavailable along with an oral leukotriene synthe-sis inhibitor, which has to be administered fourtimes daily.

Leukotriene modifiers improve airway func-tion and decrease the need for additional mainte-nance and rescue asthma therapies. Leukotrienemodifiers also attenuate bronchospasm inducedby allergens, exercise, cold air, salicylates andexercise. In patients with chronic, persistentasthma, results from clinical studies indicatethat inhaled corticosteroids have a more con-sistent and greater average effect than antileu-cotriene drugs.20

The long-term safety of leukotriene modi-fiers is still not clear. Some patients reducingtheir oral corticosteroids when treatment withantileukotriene drugs has been initiated havedeveloped a special type of vasculitis calledChurg–Strauss syndrome. However, it might bethe unmasking of a pre-existing condition and notinduced by the leukotriene modifier per se.

MEASUREMENT SCALES

When measuring the status of the chronicobstructive airways disease in a subject we canrely either on data obtained at a visit to the clinic,or we can monitor the patient over a longer periodby daily recordings at home.

At a visit to the clinic the primary focus isusually to obtain an objective, indirect, measureof airways narrowing – a lung function test. Sucha test measures some functional index of theairway calibre in some kind of experimentalsetting. We will discuss various such indices andexperimental settings and what they measure.

In the latter case, long-term daily recordings,we provide the patient with a diary card and usu-ally ask him/her to record twice daily informationrelating to symptoms of the disease under study.In addition patients are often given a device toobtain an objective measurement of lung functionat home, traditionally a peak flow meter.

These two approaches are in no way mutuallyexclusive – in a long-term study we can make

Page 359: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 365

experimental manoeuvres of the first kind. As anexample there is virtually no long-term asthmatrial that does not measure FEV1 on visits to theclinic. However, for the present discussion weconsider experimental approaches and diary cardapproaches separately, except that single FEV1

measurements at the clinic will be discussedalong with diary cards.

Lung Function Measurements

Airway narrowing leads to an increased resis-tance to the airflow. The airway resistance canbe measured directly with body plethysmogra-phy (in a ‘body-box’), an expensive and rathercomplicated procedure. Another way of measur-ing lung function is by flow measurements, whichuses much more inexpensive apparatus, a spirom-eter. However spirometric measurements, to bediscussed below, depend not only on airwaysresistance, but also on lung volumes.

When doing a spirometric manoeuvre thepatient takes a maximum deep breath and thenexhales as rapidly as possible as much as

possible. The spirometer records the exhaledvolume as a function of time, V (t). From thiscurve (Figure 22.1) a number of spirometricindices can be obtained. The most widely usedmeasure is the forced expiratory volume inone second (V (1), denoted FEV1), followedby the forced vital capacity (V (∞), denotedFVC). If the expiratory effort has been markedlyinadequate it is usually obvious from the trace.

By calculating numerically the derivative ofV (t) we obtain the expiratory airflow. Its maxi-mum value, which usually occurs within 100 ms,is the peak expiratory flow (PEF). Often thespirometric result is shown by plotting the flowagainst the volume exhaled. From this curvewe can identify both PEF and FVC (but notFEV1), but can also define new measurements,like FEF25%, which is the flow when 25% ofthe FVC has been exhaled. Another measureof current interest is FEF25 – 75%, which is theamount of volume expired per second when theexhaled volume increased from 25% to 75% ofFVC. It is considered to measure effects in thesmall airways.

FEV1

FVC

PEF

FEF25%

FEF75% FVC

0 1 2 3 4 5 6

Volume Exhaled (L)

Vol

ume

Exh

aled

(L)

Time (s)

0

2

4

6

8

10

12

14

Flo

w (

L/s)

0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.50.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

4.0

4.5

5.0

Figure 22.1. Illustration of some spirometric measurements

Page 360: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

366 TEXTBOOK OF CLINICAL TRIALS

A full spirometric manoeuvre consists ofmeasurement of the inspiratory part also. Theinspiratory vital capacity (IVC) is a measure ofthe functional residual capacity (FRC) and is animportant measure in COPD patients.

If performed correctly, the spirometric test ishighly reproducible but somewhat effort-depen-dent. Different parameters are effort-dependentto different degrees: e.g. FEV1 is less dependentthan FVC, since it only needs maximum effort for1 s. The direct measurement of airway resistance(Raw), which is done in the body box, is effort-independent, but has a poor reproducibility. Sincethe spirometry has a good reproducibility, anduses fairly simple and portable equipment, itis most useful for clinical purposes. In specialsituations, however, the assessment of resistancemight be preferable.

PEF is much more effort-dependent than FEV1,but it can also be measured by a much cheaperapparatus than a spirometer. Such a peak flowmeter is often provided to the patients forself-monitoration at home. Instructions are thengiven to fill in a diary card and to contact thehealthcare service when PEF has dropped for afew consecutive days below prespecified levels.In the same way, PEF can be monitored with thissimple device in a long-term study by recording,often twice daily, in a diary card.

There is a diurnal variation in FEV1 andother lung function measurements. It is thereforeimportant when comparing different such mea-surements obtained at different visits to the clinicfor the same patient, that these measurements aretaken at approximately the same time of the day.

Lung function measurements can be followedin order to assess effects, but also to characterisedisease severity. However lung function is afunction of both gender, age and ‘size of patient’.Therefore a lung function parameter cannot bejudged on an absolute scale – an FEV1 of 2.4 Lmeans different things for a young, tall boyand an old, tiny woman. A measure of diseaseseverity would be the ratio of the actual FEV1

and the would-be, and unmeasurable, FEV1

the patient should have without the obstructiveairway disease. As a remedy for the latter

various predicted formulas have been obtainedfor different lung function parameters. Thus,e.g., a key disease severity parameter is theFEV1 in percent of predicted normal, both forasthma and COPD. There are a number ofsuch formulae available, generally depending ondemographic variables like race, gender, ageand height.22 It should be emphasised, however,that these measures cannot be anything butrather approximative ones, since the predictednormal values are not exact counterparts tothe unknown lung function without disease! If thelung function is between 80% and 120% of thepredicted normal value, it is in general consideredto be ‘normal’.

Another disease characteristic obtained fromlung function measurements is the reversibility.This is an index obtained from a very simplesingle-dose monitoring experiment: we measureFEV1, give a rapid-acting β2-agonist and wait30 min (typically) to measure FEV1 again. Theclassical reversibility is then obtained as

reversibility

= 100 × FEV1(after) − FEV1(before)

FEV1(before)

A value in excess of 15% was previously consid-ered indicative of reversible airways obstruction,though later guidelines21 use 12%. The basis forthese numbers is somewhat unclear – it is prob-ably chosen in order to be ‘certain’ that there isan effect: the variability in FEV1 is such that anumerical increase per se is not a definite proofof an improved lung function.

Upper Airway Function Tests

There are also upper airway function tests simi-lar to the lung function indices discussed above.They are however much less used, since symptomscores are considered of overriding importance inrhinitis studies. Resistance can be estimated bytwo different techniques: posterior rhinomanome-try in which values are obtained by probes placedin the mouth, and anterior rhinomanometry inwhich a device in the nose is used. Less compli-cated, and expensive methods for assessing nasal

Page 361: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 367

patency rely on the measurement of peak nasalflow either on inspiration (PNIF) or expiration(PNEF). We do not discuss these methods in anyfurther detail.

DESIGNS FOR EXPERIMENTAL ASTHMATRIALS

For asthma studies, there are a number ofexperimental designs to measure various aspectsof the therapeutic effect based on objective lungfunction measures. For this section, let E denotean index of lung function. In most real-life casesthis is FEV1, but the discussion is not restrictedto this case.

We can group the designs in two groups: eitherthe response after administration(s) of a studydrug is followed, which can be done by time orby increasing doses, or the protective effect ofthe study drug to some provocation is assessed.

Single Dose Monitoring

This type of experiment is simple. Consider oneindividual on one occasion when this experimentis performed. We first take a baseline measure-ment, E0, give the study drug and then followlung function at predetermined timepoints afterstudy drug administration. This provides us withan approximation of a response curve E (t), wherewe use E(0) = E0 (though technically it wasobtained at a timepoint t < 0). From this curvea number of measures can be obtained for fur-ther analysis. The two most important measuresderived from the curve E (t) are

1. The average level, defined as area under thecurve (of the polygonal approximation wehave observed to the response curve) dividedby observational time. This we denote by Eav .

2. The maximal level, Emax.

We can also compute tmax, the time at whichEmax occurred. This is sometimes useful. Otherpotential measures are related to the conceptresponders ‘onset of action’ and ‘duration ofeffect’. Tradition has it that for FEV1, effect

is declared at a timepoint t if there is a 15%increase compared to baseline at that time. Basedon such a concept, we can define: the time ofonset as the timepoint (if any) at which thepolygonal curve cuts the line E = 1.15 × E0 forthe first time (for rapid-acting bronchodilatorsone usually added the restriction that this shouldoccur within 30 min). The ending of effect occursat the timepoint on the polygonal approximationwhich is followed by at least two observationsbelow the line E = 1.15 × E0, provided that twomeasurements were taken. If only one was taken,that will suffice and if no measurement was foundbelow the line, censor the end of effect to the lastmeasured time. The duration is then the time fromonset of action to end of effect.

The main problem with these definitions ofonset and duration of action is not the arbitrarynumber 15%, but the fact that effect is measuredby relating to baseline. This is not appropriate,since lung function has a clear diurnal variation.It might be a reasonable approximation for a fewhours, the perceived time of clinical efficacy ofa short-acting β2-agonist, but will produce anincorrect result if used for a longer period. Infact, there are studies in which a patient receivingplacebo as treatment has had a definite increasein lung function already on the first measurementafter treatment administration (changes in themeans – not individual spurious events), so theuse of baseline as a reference when declaringeffects should very much be questioned.

A related problem is to define responders.As the name suggests, a responder is a subjectwho responds to the treatment. Traditionallythis has been decided based on the maximalincrease from baseline. The discussion aboveimplies that this is not necessarily a good wayto go. To actually measure effect, we need torelate the measurements to the measurementsobtained without drug administration. However,since asthma is not a stable disease, these must betaken simultaneously. And this is impossible! Inclinical trials we do not really need this conceptat all, except for descriptive purposes. We willreturn to the question of duration in the discussionof an example.

Page 362: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

368 TEXTBOOK OF CLINICAL TRIALS

One lesson, however, from the discussionis that effect for many of these variables isoften clinically measured as percent change. Thismeans that

�effect = �E/E ≈ � ln(E)

which by integration motivates why many lungfunction indices should be analysed on a loga-rithmic scale. We analyse these types of trialswith multiplicative models, which is justified bythis observation.

Challenge Tests

A challenge test is similar to the single dose mon-itoring test, except that most of the monitoringtakes place after a provocation of some kind.Two important cases of challenge are exercise,either a treadmill test or using a cycle ergometer,and an allergen to which the patient is allergic.A baseline measurement E0 is taken, often afteradministration of study drug. Then the provo-cation is done and lung function followed. Inmost cases there are two phases in the reac-tion found. First there is an immediate reactionwith bronchoconstriction within minutes whichlasts 1–2 hours. Several hours later there is adelayed reaction with a much slower and sus-tained time course.

Typically an exercise test is followed only dur-ing the immediate reaction, the actual existenceof a delayed reaction is controversial. The protec-tive effect of the study drug can be measured bymaximal decrease in lung function from baseline:

IndexEIB = 100 × (E0 − Emin)/E0

and we only need to follow the patient untilwe know he has attained the low turning point,whereafter he is given a high dose of a bron-chodilator in order to restore lung function.Because of the intrinsic variability in lung func-tion measurements spurious local minima canoccur in the measurement series – it is importantthat the investigator has certified that the globalminima has occurred before stopping. The most

common lung function measurement here is againFEV1. It should be noted that a better definition ofthe index would be IndexEIB = 100 × Emin/E0,since then the analysis could be done on the mul-tiplicative scale as discussed above!

For allergen challenge test we are more inter-ested in the whole response for 10–12 hours,in order to study both the immediate reac-tion and the late reaction. The immediatereaction (EAR = early asthmatic reaction) is anepisode of acute bronchoconstriction which peaksbetween 10 and 20 min after inhalation andresolves within 1.5–2 hours. The late reac-tion (LAR = late asthmatic reaction) is probablyan inflammation mediated bronchoconstrictionwhich starts about 3 hours after allergen inhala-tion and does not resolve for many hours. Aller-gen challenge tests are potentially dangerous, andare therefore not much favoured as a mode ofstudying asthma.

If they are, we need to measure FEV1 repeat-edly during the first hour, and then more sparselyduring the next 7–8 hours (perhaps once anhour). The EAR is most often defined as themaximum percent reduction in FEV1 (from base-line) occurring in the first hour after challenge,whereas the LAR is defined as the maximumpercent reduction in FEV1 (again from baseline)occurring between 3 and 7 hours after challenge.Alternatively we compute the area under thecurve for the first hour and for the period between3 and 7 hours after challenge and use that as anefficacy measure in much the same way as forthe single dose monitoring experiment.

Hyperresponsiveness Studies

The level of airway responsiveness to a non-specific stimulus is measured by exposing sub-jects to the stimulus and measuring the response.There are a number of dialects of this test,by varying the selected stimulus, the mode ofadministration of it and the method of assessingthe response.

The most commonly used stimuli are metha-choline and histamine, though small doses ofallergens can also be used. Methacholine and

Page 363: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 369

histamine produce similar responses, but thelatter has more side-effects and can only beadministered safely in concentrations up to32 mg/ml, whereas methacholine can be usedsafely in concentrations up to 256 mg/ml (thesenumbers should be compared to the clinicaldefinition of hyperresponsiveness which is thatthe provocation dose (PD20, see below) is ≤8 mg/ml). The stimuli is administered from anaerosol which can be done in different ways.Suffice it to note that one can either do itwith or without a dosimeter which controls thedose. Response is generally measured either asFEV1 or as airway resistance (or its inverse,conductance).

Technically the subject first inhales salineand then inhales progressively increasing, oftendoubled, doses of the stimuli from the aerosolat 3-min intervals. There is a measurement aftereach dose administration, so we can considerthe response to be a function of the lastconcentration or dose given. In both cases thesaline inhalation produces the baseline value.From this dose–response curve (I call it that,though sometimes it is a concentration–responsecurve) different characteristics can be computed.A general dose–response curve is sigmoidalin shape which is well approximated with aloglinear portion over most of its responserange. We can, however, not clinically obtaininformation on much more than the lower partof this dose–response curve, which means thattraditional measures for dose–response curves(ED50 and slope) are not usually estimatable. Wecan think of the effect of the drugs as a parallelshift of the response curve so that if a givenresponse is obtained with dose D without thedrug, it takes dose ρD (with, hopefully, ρ > 1)to get that response with the drug. To estimateρ in this type of study we fix a level, expressedas percent decrease in the response, and estimatethe dose of stimuli needed to obtain that level.The dose which gives a decrease of x% in theresponse variable is denoted PDx (or PCx if we donot control doses). For FEV1 we usually computePD20, whereas for Raw a higher percentage canbe used.

The actual algorithm for estimation of PDx canvary. The following suggestion is justified by thisdescription of the dose–response curve.

1. If there is a dose with less than x% decreasefollowed by a dose with more than x%,loglinear interpolation (of log D vs. response)is done.

2. If the first dose provoked a fall in excess ofx%, we cannot do loglinear interpolation. Inthat case we do a linear interpolation back tobaseline and obtain a dose corresponding to afall of x% from this. However, we never goback more than to half the first dose given.

3. If the last dose produced a fall of less thanx%, we extrapolate loglinearly, but only up totwice the highest dose given.

What to use as dose can also be discussed.If we do not control the dose, we must usethe concentration given. If we control the dose,we can choose to use cumulative doses or lastdose without much difference in the final results,when provocation doses are compared (becausefor a geometric series, the sum is essentiallyproportional to the next dose, and we compareratios). In general the use of the cumulative doseseems to be favoured.

The measure PD20 is not limited to the possibleinterpretation discussed above (as the relativedose potency ρ of the stimuli). If the effect isdue to changes in both position and shape, themeasure can still be used. For epidemiologicalpurposes another index has been introduced, thetwo-point slope, which is the percent declinefrom baseline to last dose, divided by last dose.Though this index has a clear interpretation (aspercent decrease per unit dose), this interpretationis wrong since the decrease is not linear withdose – instead it is virtually zero until it becomeslinear with log-dose.

Note: It has been suggested that you cannotestimate PDx if there has not been a fall of x%.Technically this means that we should note it asmissing. This might be sensible for the caring ofthe patient, where this is perceived as no hyper-responsiveness. However, for a clinical study,

Page 364: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

370 TEXTBOOK OF CLINICAL TRIALS

where treatments are compared, it is imperativeto do an estimation. Setting it to missing meansthat the analysis loses the information that a highdose is needed to achieve the specified decrease!

EXERCISE TESTS IN COPD

Since a progressive decline in physical fitness isthe main characteristic of COPD, exercise testsare useful for a proper evaluation of treatmenteffects in these patients. In these tests exercisecan be either walking, running (treadmill tests)or bicycling. The basic design of the test can be,in its purest form

1. to determine the maximal workload sustain-able, or

2. to fix the workload at some level, anddetermine the endurance time.

An example of the first kind is to measurethe distance walked in a prespecified time,like 6 or 12 min. The second kind counterpartwould be to fix (individually) the pace whichat walking should be done and then measuretime walked. It is believed that the secondkind of experiment is more relevant in thestudy of COPD – that it correlates better withbreathlessness and disability. The first kind oftest is probably much influenced by attitude andexpectation. We should also note that the secondkind of test should provide a lower metabolicand respiratory stress than the first one and thatthe limiting factor in an exercise test does nothave to be the physical fitness – COPD patientsmay well fail due to muscular fatigue beforegeneral fatigue.

In practice many tests used constitute a com-promise between these two approaches: a spe-cific time schedule is designed so that the workload is held fixed for some fixed time, thenincreased for ‘a step’ for another period of time,etc. Typical cycle-ergometry and treadmill testshave this design, as has the so-called shuttlewalking test in which the patient walks at agiven pace for one minute, then increases thepace for another minute, etc., all according toa well-defined protocol. The natural outcome of

these experiments is an endurance time, thoughfor some cycle-ergometry tests you could alter-natively use the total workload (but these shouldbe heavily correlated).

For a comparison of some exercise tests inCOPD, see Ref. 23.

In conjunction with these tests measurementsof breathlessness are usually done. There aredifferent tests available. A much used dyspnoeascore is the Modified Borg scale,24 in whichdyspnoea is scored on a 0–10 scale before andafter the exercise test. Alternatively one can usea visual analogue scale with the same effect.An alternative scale is the Transitional DyspnoeaIndex,25 for which we first rate three factors(functional impairment, magnitude of task andmagnitude of effort) on baseline, each on scales0–4 (well, 4–0 actually – the scale is reversed),and then rate the changes over the exercisedirectly on a scale from −3 to +3.

EXPOSURE STUDIES ON RHINITIS

For allergic rhinitis there are two study designsof the experimental type available. Both areexposure studies, one in the natural season, onein an artificial season:

1. The Park study. In this study the subjects areexposed to pollen over a 1–2 day period bywalking around in a park. There are two mainproblems with this type of study – it is highlydependent on season and the patients oftenfind it very boring.

2. The experimental Nasal Allergen ChallengeArtificial Season model. In this type of studythe subjects are artificially exposed to pollenfor some period. This can be either as sprayapplication for a few consecutive days, orin an Environmental Exposure Unit (EEU)in which subjects are exposed to pollen ina special room for, typically, 3 hours on anumber of consecutive days. In this room thereis a flow of air to which the pollen is addedand evenly distributed in the air by fans.

In both cases we measure nasal symptoms asoutcome variable. Both these studies are parallel

Page 365: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 371

groups in design, but effects can often bedemonstrated with rather small patient numbers.

LONG-TERM CLINICAL STUDIES WITHDIARY CARDS

In a diary card study, the patient is provided witha diary card to fill in various information aboutthe status of his disease under investigation, oftentwice daily. For most asthma/COPD studies,the patients also measure PEF. It is importantthat the patient uses the same peak flow meterthroughout the study, since different brands havedifferent scales, and there is a considerablewithin-brand variability as well. In additionto this, some symptom scoring is requested.This can be either an overall assessment ofsymptoms, or assessments of specific symptoms,like wheezing, shortness of breath and coughfor asthma. Finally, for asthma/COPD studies,the use of rescue medication, usually a short-acting β2-agonist, should be entered into the diarycard. With the increased use of IT, paper-baseddiary cards are more and more replaced withelectronic counterparts, which has the potentialbenefit of monitoring when the recordings aredone. Some such devices can also contain aspirometer, which makes it possible to replacethe somewhat variable PEF measurement withthe more accepted FEV1 measurement. The factthat the electronic device can be programmed sothat it only accepts data obtained at the timepointwhen it should be obtained, increases the validityof this type of data. The FEV1 measurementsrecorded with a portable spirometer should bemore valid than PEF data obtained by a peakflow meter and manually recorded on a paper-based diary card. Our discussion will primarilyrelate to the old paper-based diary cards with aconcomitant peak flow meter. We leave it to thereader to assess potential changes that occur forelectronically fetched data.

In terms of basic design we have two types oflong-term clinical studies in asthma:

1. Studies in which treatments are fixed through-out the period under investigation. An arm in

such an study might be, e.g., budesonide Tur-buhaler 200 µg b.i.d.

2. Studies in which the treatment is not fixedthroughout the period under investigation. Insuch studies we can either vary the doseof the investigational product, or vary thedose of some concomitant treatment. Onetypical such study has an arm in whichtreatment is initiated with a high dose of agiven GCS, which is then reduced accordingto some scheme until the patient is nolonger controlled on the present dose. Avariant are the steroid sparing studies, inwhich a fixed dose of some investigationaltreatment is given throughout the study periodand concomitant with this treatment someGCS is given, the dose of which is thenreduced in steps. For inhaled GCS, oral steroidsparing studies have been performed in thisway, for other anti-inflammatory drugs likeleukotriene modifiers inhaled steroid sparingstudies are relevant.

The usage of the diary card data varies betweenthese two types of data. In studies with fixedtreatments they define the primary efficacy vari-ables, whereas in studies with varying treatmentdoses, dose changes are conditioned on the diarycard variables and these therefore act only as con-trol variables.

Diary card data in a long-term clinical studyoften represents a considerable amount of data,as measured in megabytes on disk. The numberof megabytes, however, does not truly reflect itsinformation value. Data is not obtained in a verycontrolled fashion. Morning values are generallyconsidered slightly sharper than evening values,since sleep is comparatively similar amongpatients and data should be obtained and recordedimmediately after waking up.

For that reason, peak flow obtained in themorning is often considered the primary variableof interest in a long-term diary card study on asth-matics. The day-to-day variability, for a symp-tomatic asthmatic, can be considerable. However,using the mean of all values over a prespecifiedperiod, as long as possible, generally provides

Page 366: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

372 TEXTBOOK OF CLINICAL TRIALS

us with a measure that has proven to be usefulin many clinical studies. An alternative efficacymeasure is to use FEV1 measurements obtained atvisits to the clinic. Though each individual FEV1

measurement so obtained is much more reliablethan a single PEF measurement, the overall meanover a treatment period of daily recorded PEFmeasurements obtained in the morning is, in ourexperience, a more efficient variable for demon-strating differences between treatments in lungfunction. Since most treatments are mainly symp-tomatic, integrated measures over time are therelevant ‘endpoint’ measures.

When using FEV1 obtained at visits to theclinic as primary variable in a long-term clinicaltrial, we must take the diurnal variation of lungfunction into account. Thus it is important thatFEV1 is measured at approximately the same timeof day on each visit. To obtain maximal efficiencywe also need to schedule the patients for visitsto the clinic early in the morning (around 8a.m.), with approximately the same argumentas given for peak flow morning measurementsabove. The possibility of enforcing this will verymuch determine the effectiveness of the FEV1

variable in discriminating between treatments.In COPD studies lung function is also of

interest, but for this disease the symptomaticbenefit is stressed more. For COPD, rating ofnight sleep, breathlessness, coughing and chesttightness seem to be accepted symptoms toinclude in diary cards.

For rhinitis, the symptoms recorded are nasalblockage, rhinorrhea, sneezing and/or itchy nosewhich sometimes are combined into the nasalindex score, which is the sum of them. Inaddition to this, eye symptoms are recorded asa secondary variable. The most readily availableobjective measure in the clinical trial setting iseither the Peak Nasal Inspiratory Flow (PNIF) orPeak Nasal Expiratory Index. Of these the PNIFparameter seems to be the most discriminative.

The data in diary cards can be used indifferent ways to compute variables for usein statistical comparisons. As already indicated,in fixed dose studies period mean values are oftencomputed, not only of PEF measurements, but

also of symptom scores and of the use of rescuemedication. Because of the intrinsic variability inthe underlying disease it is important to computemeans over long periods, preferably the fulltreatment period. This means that, for some drugsat least, the mean will contain data from a periodof onset of action, though the effect of that willbe minor in long-term studies.

Mean values of symptom scores do not seemvery meaningful to most clinicians in assessingthe actual response. An alternative is to computethe percentage of symptom-free days, which issomewhat simpler to interpret clinically. Simi-larly it might be useful to compute the percentageof days with no rescue medication.

When many symptoms are recorded individu-ally, one approach to the analysis of the data is tocompute the sum of the symptoms (as the nasalindex score), but an alternative is to analyse themsimultaneously in a multivariate analysis.

Even more useful, sometimes, is to collect datawithin a day, or adjacent days, to form newmeasures. One simple such measure for asthma isto define a patient to have control of the asthma, ifthere are no symptoms and the patient did not userescue medication. The percentage of such dayswith asthma control can be a useful summarymeasure for some patient populations, typicallyrather mild ones. A variant of this is to definemild exacerbations, or episodes, of asthma fromdiary cards by looking for worsening of lungfunction and/or increase in rescue consumptionand/or symptoms. The exact criteria for suchepisodes probably need to be adjusted to thepatient population under study, and to the studydesign. In order to avoid spurious events, it mightbe a good idea to define an event to have occurredfor two consecutive days in order to be labelledan episode. From an analysis point of view wecan analyse time to first such exacerbation orthe percentage of episode-free days. Analysis ofexacerbation is even more relevant in COPDstudies than in asthma studies.

One final note on response data in studies onasthmatic patients, especially PEF, and the dis-ease asthma in general. When we interpret diarycard data, obtained over a longer period, we must

Page 367: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 373

interpret it on a group mean level. A discussionon individual responders is virtually meaning-less. A responder refers to a patient that respondsto the investigational treatment. This cannot beassessed on the basis of diary card data, sincethe underlying disease is, by definition, vary-ing – what seems to be a clear response couldwell be a period of good asthma control totallyunrelated to drug effect (in some cases a studyeffect) and the converse. This is obvious onceone has inspected placebo data in a long-termstudy. However this does not exclude that one candefine responders according to some criteria andcompare percent responders between treatments,since that is a comparison on group level.

So far we have considered studies with fixedtreatments during the investigational period. In astudy which tries to identify the minimal doseon which the patient is controlled, the obviousendpoint for analysis is this minimal dose. This istrue irrespective of whether the dose in questionrefers to the investigational drug or to someconcomitant drug (as e.g. in the oral sparingstudies alluded to above).

More explicit examples of this will be demon-strated later.

QUALITY OF LIFE

Asthma and COPD are chronic disorders thatcan place considerable restrictions on the physi-cal, emotional and social aspects of the lives ofpatients. Assessments of the patient’s own per-ception of the impact of asthma on their life,of general well-being, is known as measurementof quality of life. Quality of life may be use-ful for assessing the degree of morbidity, e.g. inorder to evaluate the health economic impact ofthe diseases in the community. It is assessed byquestionnaires that include a large set of phys-ical and psychological characteristics assessingthe general functioning and well-being in the con-text of lifestyle. Quality of life scales are eithergeneral and not specifically designed for patientswith asthma or COPD, or they are more specificdisease-related but, as of today, in general notapplicable to the general population due to cul-tural differences.

General health status scales such as the Sick-ness Impact Profile with 136 items26 have beenproposed. A compromise between lengthy ques-tionnaires and single-item measures of healthhas also been proposed. The Nottingham HealthProfile with 45 items and SF-36 (a Measuresof Sickness short-form general health survey)are now widely used and validated. The SF-36Health Status Questionnaire is based on 36 itemsselected to represent eight health concepts (phys-ical, social and role functioning; mental health;health perception; energy/fatigue; patin; and gen-eral health).27 Its quality of life scales have beenshown to correlate with the severity of asthma,but it has yet to demonstrate any superiority overthe simpler, diary card-based symptom scores fordemonstrating effect in clinical trials.

For COPD the St. George’s Respiratory Ques-tionnaire has gained importance in later years.It is perhaps the most comprehensive question-naire for evaluation of quality of life in airwaysdiseases and allows for direct numerical compar-isons to be made among patients, study popu-lations and therapies, and has sensitivity whenapplied to mild as well as severe disease.28 Itwas developed by Paul Jones at St. George’sHospital in London in 1990 and is designed tomeasure impact on overall health, daily life andperceived well-being. The measure consists of 50(76 responses) items that produce three domainscores and one overall score. The domains are:symptoms (severity and frequency), activity (thatcause or are limited by breathlessness) and impact(on social life and psychological disturbancescaused by the airways disease).

CLINICAL TRIAL METHODS

HOW TO AVOID BIAS

Blinding

Most effect measurements of the respiratorydiseases discussed here are influenced to anon-negligible degree by the patients’ expec-tations. One typical example of this is thatin some double-blind, placebo-controlled single

Page 368: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

374 TEXTBOOK OF CLINICAL TRIALS

dose trials measuring bronchodilation, there isan immediate response in lung function also inthe placebo group, which probably is due to(false) expectations. The classical methods toavoid expectation bias, blinding and randomisa-tion, are therefore important. A clinical study inthis area should follow a double-blind approachin which study drugs are prepacked in accordancewith a suitable randomisation schedule, and sup-plied to the trial centre(s) labelled only with thesubject number and the treatment period so thatno one involved in the conduct of the trial isaware of the specific treatment allocated to anyparticular subject, not even as a code letter.29

The code should not be broken until all deci-sions concerning data validity have been takenand documented.

Many studies in the respiratory area concerninhalation products, where there are not only,say, two different drugs involved, but also twodifferent inhalers (or perhaps one drug in twodifferent inhalers). To maintain blinding in thosesituations one often needs to resort to the ‘double-dummy’ technique. This means that for eachinhaler there has to be two clones: one with activedrug and one with placebo. On each inhalationoccasion, the subject has to inhale not only fromthe inhaler with active substance, but also fromthe other inhalers, but containing placebo. Thismight lead to a large number of inhalations peroccasion, which in turn might lead to incompletecompliance. Note that the use of different inhalersimplies a consideration on the order in whichthese should be taken. Carry-over effects fromone type of inhaler might dictate which shouldbe taken first/last, whereas in other situations abalanced scheme might be called for.

Rhinitis studies pose a special problem in termsof blinding because the double dummy techniqueis not considered appropriate – there is a fear thatadditional placebo material may clear the airwaysof drug so that the response is different withand without simultaneous placebo administration.This is a problem mainly when two differentdrugs inhaled through different devices are to becompared. The partial remedy that is most oftenused is to include a placebo group, and let half

the group have one device and the other half theother device. That way, at least, the patient doesnot know whether he gets active drug or not.

Open labelled studies might be acceptable forsome systemic effects studies where the outcomevariable is the plasma concentrations of somemarker, or in long-term safety studies.

Randomisation

Studies in respiratory diseases must also beproperly randomised, so that prognostic fac-tors are distributed between groups by chancealone, which minimises selection bias. For manyoutcome variables some prognostic factors areknown, and it is important for the credibility ofthe result that the choice of group for individ-ual patients has not been made by the inves-tigator. The observed outcome can still be dueto an imbalance of prognostic factors betweengroups, but randomisation at least means that thiswas produced by chance alone, not by the par-ties involved in the study (provided the study isalso blinded).

By the way, this last point is why it ismeaningless to do group comparability testing atbaseline. The p-value computed from a statisticaltest is a measure of how certain one is that the‘null hypothesis’ is wrong, how improbable theresult is assuming that the null hypothesis istrue. For a test comparing baseline data (i.e. dataobtained before randomisation), say the mean, fortwo treatments, the p-value is a measure of howunlikely the observed group mean difference is.If this is small, say p < 5%, this means that anunlikely event has occurred, or that the groupsare not equal on average. However, if we trustour randomisation procedures, there is no factorthat can explain why the two groups should notbe equal except for chance alone, so we mustconclude that an unlikely event has occurred.The unlikely event, i.e. the difference observed atbaseline, might have consequences on the results.But the extent of that effect does not depend onthe p-value of a test at baseline, it depends on thecorrelation of the effect variable with the baselinevariable in question. This means that a large

Page 369: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 375

baseline difference might have no consequencesat all, or a small difference at baseline mighthave large consequences. In the respiratory areait is very rare that the latter is the case. Ifobserved differences at baseline causes concern,the robustness of the results should be checkedwith respect to this issue, not a separate testat baseline.

Other Sources of Bias

Another way to risk selection bias, also in arandomised, double-blind study, is to excludedata obtained on treatment. To exclude patientson data obtained prior to first dose cannotin itself produce bias. However the prognosticfactors for respiratory trials, like FEV1 in percentof predicted normal and reversibility, are onlyestimates of time-varying entities and are notprecise enough that we can actually claim thata patient violating some inclusion criteria isnot necessarily an appropriate patient for thetrial. They are essential in order to focus theinvestigator on the appropriate patient population,but once a violation to the protocol criteria hasemerged it might be appropriate to use the patientin the statistical analysis. With Senn,30 I considerthe protocol a guide to the physician, not thestatistician.

Protocol violations after the first dose shouldnot in general invalidate the patient for thestatistical analysis. If such a protocol violation isconfounded with treatment effects, their omissionmight bias the result. However, the fact that theyare protocol violations might in itself imply thatthe measurements are improper measurements,which is another type of bias. This problemis illustrated in respiratory trials by the use ofrescue medication.

During an asthma or COPD trial patients areusually provided with short-acting β2-agonists touse as rescue medication. This means that somemeasurements of lung function and symptomswill be influenced by this add-on therapy, and thevalidity of those measurements (as direct treat-ment measures) will therefore be questioned. Iftreatments have the same effect they pose no

problem, since they should occur with similarfrequency in different groups. However, a moreeffective treatment is expected to have less con-sumption of rescue medication. As a consequencethere is a bias towards no effect by includ-ing those measurements when computing periodmeans for instance. If we ignore them (i.e. con-sider them to be missing) we introduce a biasin the same direction, since we only count thedays when the patient was, relatively speaking,symptom-free. There seems to be no easy wayout of this dilemma, and the approach we havetaken is to ignore the additional information onrecently taken rescue medication for the analy-sis, but instead plot, descriptively only, for eachday the proportion of patients that takes rescueclose enough to peak flow measurement. Hope-fully, and this is usually the case, the main resultand this graph gives the same message on effect.At least this approach should be conservative.

TRIAL DESIGN CONSIDERATIONS

From a bird’s-eye perspective, there are two typesof responses for respiratory diseases:

1. Immediate responses that disappear withina short period of time. This includes thefast bronchodilating effect of β2-agonists,responses to various provocations and spe-cific systemic effects that are measured bymarkers in the blood (like plasma cortisol forglucocorticosteroids and serum potassium forβ2-agonists). Many of these studies are singledose studies.

2. Long term studies addressing effects on symp-toms or average lung function of the underly-ing condition.

Crossover Trials

For the first type of studies, the crossover designis well suited. In such a study each subject israndomised to a sequence of treatments, and actsas his own control for treatment comparisons.In many cases this is attractive, because thewithin-subject variability is smaller than the

Page 370: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

376 TEXTBOOK OF CLINICAL TRIALS

between-subject variability which means that asmaller study is required for the same power, ascompared to a parallel group study. Numerousvariations exist, e.g. trials in which each subjectreceives only a subset of the treatments studied(incomplete design), and trials where the sametreatment is repeated within a subject.

However, there are caveats with crossoverstudies. The primary caveat is the possiblepresence of carry-over effects (in fact, non-equalcarry-over effects), which might bias treatmentcomparisons. When deciding if a crossoverdesign is appropriate for a particular study, wetherefore must convince ourself, beforehand, thatwe can get rid of possible carry-over effectsby separating the various treatment periods withwashout periods during which no treatmentis given. For a single-dose short-acting β2-agonist trial a washout period need in generalonly be a few days. A trial which studiescortisol depression after short periods of GCSadministration should have washout periods of1–2 weeks, though shorter ones are acceptablein single dose studies.

When periods in crossover trials containrepeated dosing over a few weeks, and the actualexperiment is performed at the end of such aperiod, it is often unnecessary to have drug-freewashout periods between periods. But to take thatstep, one must make plausible that taking a newtreatment directly after another does not by itselfhave any effects on the variables to analyse.

Parallel Group Trials

Since the treatment for respiratory diseases ingeneral is to achieve a prolonged improvementof the underlying condition, the most importanttrials need to extend over longer periods. Themost natural design for them is the parallel groupdesign, where the subjects are randomised to oneof a number of arms, each arm being allocated adifferent treatment. These treatments will includethe investigational product at one or more doses,possibly including a placebo (dose = 0) arm, andpossibly some active control treatments at one ormore doses.

Sequential Trials

Sequential trials are not much used in therespiratory area. A long-term study can obviouslynot be done this way, since the total study periodwould be enormous. For experimental studies,on the other hand, the setup is often of suchcomplexity that the clinic need to have clearspecifications on patient numbers before the startof the study for their planning.

Interim Analysis

Interim analyses should not be needed in trialsin the respiratory area, except possibly for firstdose in man studies on new drugs – but then itis a drug issue and not a therapeutic area issue.Interim surveys of safety data might well bejustified in the beginning of clinical programmes,but for efficacy issues they are seldom needed.

HANDLING OF MISSING VALUES

The experimental type of studies are often verydifficult to perform clinically. To get good qualitydata, it is extremely important that the investiga-tor with staff has a good knowledge and experi-ence in this type of study. It is in general better todo such studies in one or two experienced cen-tres with many patients, as compared to usingmany centres with fewer patients – despite thefact that the study might take longer to perform.With those premises missing values are, in ourexperience, a negligible problem since in gen-eral experimental sequences are complete. Whenmissing values occur, they are due to discontin-uations between experimental sessions and theseare few and there is no problem in analysing theresulting unbalanced study.

It is a completely different issue with thelong-term clinical studies. Here patients not onlydiscontinue treatment, but there are also missingobservations during the period. For the fixedtreatment trial the purpose is in general to achievea steady state, on group level, on the treatmentand then compare the level of the measuredvariable in steady state between groups. For

Page 371: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 377

patients reaching steady state we therefore ingeneral have a number of data points measuringsteady state level, in the case of diary cards quitea few. The efficacy variable is then usually asummary statistic of these data points, like aperiod mean of a diary card variable. Missingvalues during this period does therefore notconstitute a major problem – we take the meanof available data.

Sometimes a long-term clinical study containsexperimental procedures, like a methacholineprovocation test. Or just spirometry at the clinicalvisit, at least FEV1. This is then done at leastonce pre-randomisation and then again only afew times on treatment, in particular on the lastprotocol visit. The effect variable should notbe defined as the change from baseline to lastprotocol visit, but as the change from baselineto the last visit on treatment the patient attended.Specifically, instead of analysing the change inlog PD20 from visit 2 to visit 8, we define theefficacy variable as the change from visit 2 tothe last visit on treatment, which might be visit4, 6 or 8 in a particular study. Technically this isequivalent to what is called the last value carriedforward, or the last value extended, principle, butthere is no need to use that label if we define theefficacy variable appropriately.

This still leaves us with one key prob-lem – what if we do not have any efficacy mea-surements on treatment to use. To avoid thatproblem in diary card studies, it is often betterto define the full treatment period as the periodover which to compute summary statistics. Atleast that provides an effect measurement for eachindividual who has started to fill in the diarycards. The drawback is that the period mean forone patient can be the mean of 90 data points,whereas for another it is the mean of only a fewdata points. The next step is in general to analysethese period means with an ANOVA, and thenthe information of the precision of the computedmean is lost. On balance, it seems better to havean effect measurement on each patient.

For data obtained on clinical visits, the riskof having no measurement at all on treatmentis not negligible. The omission of such patients

from the analysis means that there might be apotential bias in the end result. To understandthis, assume that there are no withdrawals ingroup A, but half the patients withdraw fromgroup B because of insufficient efficacy. Theremaining patients in group B are then the oneswho needed less treatment. That patient grouphas a corresponding subgroup in group A ofapproximately the same size (as a consequence ofproper blinding and randomisation procedures),but group A also contains another subgroup ofpatients, corresponding to the ones that droppedout from group B. The remaining groups aretherefore not really comparable, and inferencedrawn from available data might be misleading!

However, there is no simple, trustworthy,remedy for this. Our approach is to use availabledata for the analysis, hoping that the potentialbias is conservative (which it probably is in mostcases for respiratory trials). However, if there is alarge difference in withdrawal rates between thegroups, it is logical to do the primary analysis onwithdrawal data to assert group differences.

When describing diary card data, daily meanvalue curves by treatment are useful. Whencomputing these mean values, missing valuespose great problems in that raw mean valuescan produce very misleading impressions ongroup behaviour. To see why, consider a placeboarm in a diary card study in asthma in whichthe patients with worsening of symptoms dropout progressively (the worse the symptoms, theearlier they drop out). As the patients withlow response values drop out, the group meanwill increase, so the temporal behaviour of themean values will indicate that the placebo groupincreases in effect with time. However, this effectis solely due to withdrawals!

To avoid this culprit when plotting the tempo-ral behaviour of variables some kind of impu-tation of data is needed, in order to keep thedenominator the same when computing mean val-ues within a treatment group. The simplest suchimputation is to use the last value extended (LVE)approach, in which the last value for a withdrawalis extended to later timepoints. Using this prin-ciple, the mean values plotted can be interpreted

Page 372: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

378 TEXTBOOK OF CLINICAL TRIALS

as follows: the mean at time t is the mean ofthe last recorded measurements up to and includ-ing time t. When using this principle for diarycard variables like PEF it is often better notto take only the last measurement, but rathera mean of a few measurements. More sophisti-cated approaches based on some kind of multipleimputation technique for missing data can alsobe considered, but the add-on value of doing thatis probably very small for the average study inrespiratory diseases.

MULTIPLE COMPARISONS

A respiratory trial usually contains a numberof effect variables, and often also a numberof different treatments. Thus there are multiplecomparisons to be done. This poses a majorproblem, because of the risk of over-emphasisingfluke significances because of many comparisons.

To handle the many effect variables we there-fore have to predefine which one is to be con-sidered the primary one. It is from the result onthis variable that the overall statistical conclusionfrom the study can be drawn. In general one studycan have a few different objectives that are notclosely related (like efficacy and safety), and thena primary variable for each objective should beappropriate. However, it is probably a too sta-tistical approach to focus only on the primaryvariable when trying to understand the results ofa clinical trial. No variable fetches all aspects ofa respiratory disease, and the approach should beto select the most sensitive variable as primaryvariable, to decide on the overall conclusion, butthen a number of secondary variables should beso described that one gets an overall picture ofwhat is ‘going on’.

When it comes to the problem of multipletreatment comparisons, the study logic should bestructured in terms of well spelled out objectives.To prove efficacy might mean one comparison,to estimate a relative dose potency anotheranalysis. With precisely formulated questions themultiplicity problem here should at least diminishsubstantially. This approach will be illustrated inwhat follows.

SAMPLE SIZE DETERMINATIONS

In order to certify that a proposed study is ofan appropriate size, a sample size justification isneeded in the protocol. It cannot be justified eth-ically to succumb a number of patients to a studyprotocol if there is no hope whatsoever to demon-strate what you want to demonstrate. Similarly, ifthe study is heavily overdimensionalised we haveput an unnecessary number of patients at what-ever risk the study can carry with it, and thatis not ethical too. However, sample size deter-mination is there to ethically justify the studyin advance – it has no consequences when theresults are obtained.

In the respiratory area many test hypothesesare stated in terms of mean values, and forsuch variables the sample size is (essentially)proportional to the ratio (σ/�)2, where σ is theresidual standard deviation and � is the meandifference we do not want to miss. When using amultiplicative model for a variable, these entitiesrefer to the logarithm of the variable in question.Note that σ means different things in a crossovertrial and in a parallel group trial – in the formercase it refers to a within-patient variability (moreexactly

√2× the residual standard deviation of

the ANOVA) and in the latter to a between-patient variability. Also what is relevant is theresidual standard deviation from the proposedanalysis of variance, which might contain abaseline adjustment.

The following table shows some typical valuesof the sample size parameters that can be used forasthma trials. Each example will be discussed inmore detail below.

Increase in Design σ � (σ/�)2

PEF morning(L/min)

PG 40–45 10–20 4–20

Symptom score(0–3 scale)

PG 0.4–0.5 0.05–0.15 6–100

FEV1 (L) PG 0.4–0.5 0.05–0.1 15–100ln(AUC FEV1)

(L)XO 0.07–0.10 0.05–0.10 0.5–4

2 log(PD20) XO 0.9–1.1 1–2 0.8–4

Here the range is not a range – the lower numberfor σ represents an optimistic number, the larger

Page 373: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 379

number a conservative one. Similarly, for �

the range is more of a typical range for whichto dimensionalise, not a range on what can beobtained. To obtain numbers (per group in theparallel group case) from this we should multiplyby approximately 25.

For the crossover measurements of the table,we just note that the AUC refers to AUC-basedaverage over the full period and that for thatvariable the pre-dose FEV1 value is used ascovariate in the analysis. For the PD20 case nobaseline covariate is used.

For the parallel group measurements we usebaseline covariate. For PEF morning a baselineis obtained as the mean value over a numberof measurements, typically 1–2 weeks, and thenthe effect variable as the mean of 1–3 monthsof data. The shorter the periods, the largerthe residual standard deviation. Similarly, forFEV1 the table refers to a measurement both atbaseline and end of treatment, but the treatmentvalue could well be a mean of a number ofmeasurements. Moreover, the FEV1 data refers tothe situation when visits to the clinic are spreadout over the morning, the European style, asdiscussed earlier. With visits scheduled at 8 a.m.precisely larger effects can be expected.

Concerning symptoms scores, these too areobtained from period means of diary card data,and relate to a typical asthma study. Changesin symptom scores are often small in studies inasthmatics with mild–moderate severity, sincethey do not have many symptoms on entry. Inrhinitis studies a combination of symptom scoresis often done. If we use the TNS discussed earlierwe typically have a standard deviation of about1.3 and effect sizes of 0.5–1, giving a (σ/�)2 of2–8. Typically, therefore, rhinitis studies can besmaller than asthma studies.

For COPD, exacerbation rates are more impor-tant as outcome variable. A rate of one exac-erbation per year can be used in sample sizecalculations.

SUBGROUP ANALYSIS

When doing statistics on trials in respiratorymedicine, the question of subgroup analysis

inevitably presents itself, as in so many areas ofmedical statistics. It is however no more sensibleto do such analysis on data on lung functionthan in the general case.30 Ultimately it has todo with the generalisability of the results ofa trial.

In airways diseases, asthma in particular, thedisease severity varies among patients. Thusthe magnitude of the response attainable willvary between patients. How much will partlydepend on what we measure. If we measurelung function, patients with small lungs, like lit-tle women, will be expected to have smallernumerical effects than patients with large lungvolumes, like tall men. This does not meanthat the actual benefit to the patient is less,only that the outcome variable suffers fromthis deficiency. We could remedy this partiallyby measuring lung function in percent of pre-dicted normal, which tries to capture size dif-ferences, instead. But that is only a partialremedy to a larger problem – that there is alarge heterogeneity in response sizes for someoutcome variables, which does not necessarilyreflect different clinical responses. Compliance tostudy procedures might also influence the mea-sured responses.

This heterogeneity in the disease populationis not a problem for the proper conclusions ofa clinical trial. Consider a randomised parallelgroup study in which treatments A and Bare compared. If properly conducted observedtreatment differences should be explained bydifferent treatment effects alone, and any claimfrom the study should relate to the relativeeffect of the two treatments. If it turns outthat the effects differ, say, between men andwomen, that should not jeopardise the outcomeof the trial, since a proper randomisation impliesthat the sex distribution is similar in the twogroups. We get problems only when we try tocompare effect sizes between studies. Differencesin effect sizes could well be explained not onlyby different patient populations, including genderdistribution, but also by different compliance tostudy procedures in the trials.

Page 374: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

380 TEXTBOOK OF CLINICAL TRIALS

This discussion implies that estimated treat-ment effects, at least for lung function variables,should be interpreted with caution. It is oftenmuch better to try to transform the informationon the effect scale to a dose scale, as will beextensively discussed in sections to come.

Another implication of the discussion is thatif we start to compare different subgroups inan asthma trial we can expect to find differentresponses. One such example is the multi-centretrial, in which we have many centres, oftenfrom different countries, with different patientpopulations and medical traditions. Differenteffect sizes should not come as a surprise,and do not necessarily indicate interpretationalproblems in terms of overall effects. We wouldalso expect that if we compare the effects in asubgroup of mild asthmatics to a subgroup ofmore severe asthmatics within the trial, they willnot be identical.

Subgroup analysis will by definition be lesspowered than the analysis of the full analysis dataset. As a consequence, subgroup analysis is nota sensible thing to do in a single trial, unlesswell specified in advance, and then the studysize should be large enough for this subgroupanalysis. If not specified in advance, the risk ofa fluke significance because of the ever-presentmultiplicity problem is too large, leading to falseconclusions.

However, subgroup analysis is in generaltotally unnecessary. If we want to see if theresponse differs between men and women, as anexample, the proper approach is not to analysethe subgroup of men and the subgroup of womenseparately. It is to analyse the full data set withan interaction factor for treatment and sex. Ifthis interaction test is statistically significant, weknow that there is a difference in the response formen and women. We can even quantify it, if weso choose. Similar adjustments can be done fordisease indices like lung function in percent ofpredicted normal or reversibility. But the add-onvalue of doing this remains to be proven, exceptthat it might reduce the residual variance in theanalysis of variance model.

PHASE I/II STUDIES

EFFICACY STUDIES

In terms of efficacy, not much can be done ina phase I trial. These trials, mainly concernedwith tolerability and pharmacokinetics, give noreal clue to whether a new drug actually worksor not. Note that in general a respiratory drugmust be very well tolerated to be useful, sincethere are so many efficacious and safe drugs onthe market.

When trying to establish that a new drug iseffective in asthma or not and to estimate clini-cally relevant doses, the approach differs betweenthe drugs that have more or less immediateeffect on lung function, typically bronchodila-tors, and the ones that work more indirectly onlung function, via the inflammatory process, asglucocorticosteroids. For bronchodilators we canuse small-scale experimental studies, whereas foranti-inflammatory drugs we typically need long-term studies from the very start.

To establish efficiency is conceptually simple:all we need to do is to show that a givendose of the drug is superior to placebo. Thereis however no true placebo treatment in long-term asthma or COPD trials – at a minimumthe patients need to be provided a short-actingβ2-agonist to be used as rescue medication. Allnew drugs are therefore studied on top of somebaseline treatment, which in most cases is notvery constant. For example, a GCS treatmentis taken in addition to rescue medication. Itpotentially becomes a problem when you wantto introduce a new rescue treatment, which is tobe taken when needed, and you need to documentlong-term effects.

The next question, possibly posed in the samestudy, is which is the relevant dose-span for fur-ther investigation? For a dose–response study(including the simple one with only placebo andone dose) the choice of patients is important. Themajority of asthmatics, the mild ones, do not needmuch therapy and because of the relative impre-cision in the measurement tools, many patientswill have no measurable response in many ofthe tests discussed. For a bronchodilator study

Page 375: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 381

of the crossover type, where the design containsa number of experimental days when response isfollowed after a dose, some bronchoconstrictionis needed in order to see an effect. By defini-tion the bronchoconstriction varies with time foran asthmatic, so the measured response will notonly depend on the dose of the bronchodilator,but also on the degree of airway narrowing onthe particular day the experiment is done. Thusit is difficult to assess efficacy for the individ-ual patient, which we handle in clinical studiesby considering means. But more importantly, inorder to achieve dose–response we need suffi-ciently many patients with sufficiently many dayson which they can respond. The selection of suchpatients is not easy!

For long-term diary card studies we have asimilar problem. The effect measures are rathernoisy, and we generally need somewhat largestudies to measure a signal through all the noise.In parts of the world with a widespread healthcare system most asthmatics are rather welltreated. In particular the use of GCS alreadyin fairly mild asthma in Europe, Australia andCanada seems to have made the majority ofasthmatics more or less symptom-free for most ofthe time. That in turn means that the traditionaldiary card might have difficulty in catchingany responses.

From the foregoing discussion it should beclear that the magnitude of response in aparticular variable is very difficult to assess: ifit is small, is it because the patients studieddid not have much room for improvement orwas it because the drug was of minor efficacy?The only way, it seems, to actually assess thedegree of response is to compare it to somethingwe know, by experience, to work – to includean active control in the clinical trial. In mypersonal view, a clinical dose – response trialin asthmatics without an active control has verylittle information value. Also note that we shouldnot need placebo in order to prove efficacy – itshould suffice if we could prove that there isa dose–response relationship (this is in fact thepoint with the expression ‘show dose–response’:to prove that the drug has a pharmacological

effect). This does not rule out (if the slope ispositive) that small doses have a negative effectand larger doses a positive effect, somethingthat should be borne in mind when interpretingthe result.

What can be done in a placebo-controlleddose–response study without an active controlis to estimate the minimal effective dose (MED).This addresses the question: ‘Which is the lowestdose, of those studied, that is proven effective’and tests, in a recursive manner to controlsignificance level, the doses from highest tolowest with placebo. There are a number ofalgorithms available that can be used, but the keypoint about MED is that it is the lowest dose thatwe from this study can claim is effective. Thusthe result depends heavily on the size of the studyand choice of patient population, a property theaverage physician probably would not like MEDto have. Thus there is a great danger in usingMED as defined here for decision making. Inmy view MED is more likely to lead to falsedecisions than correct ones.

The information we actually want from adose–response study is the shape of the dose–re-sponse curve, to allow us to pick the ‘best’ dose.Not a detailed shape, a simple approximationwhich can be used to derive insights from. Aslong as there is a monotonous dose–responsecurve a more complicated description than theone provided by the formula

E = E0 + Emax/(1 + (ED50/D)γ ) (1)

is rarely needed. This formula contains fourparameters: E0 is the baseline level of theresponse variable, corresponding to placebo.Emax is the maximal effect attainable, ED50 is thedose required to obtain 50% of this and finally,γ is a sensitivity parameter which measures howmuch the response changes with changes in dose.The shape of this function is a sigmoidal curvewith the extremely important property that overmuch of the range (say from E0 + 0.2Emax toE0 + 0.8Emax) it can be well approximated bya straight line E = a + b ln D. A descriptionof such a dose–response curve should be the

Page 376: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

382 TEXTBOOK OF CLINICAL TRIALS

purpose of the dose–response trial, not to discussthe individual doses that were actually chosen tobe used in the study.

Identifying the dose–response curve, however,does not give you a hint on how well thetreatment compares to competing treatments. Forthat purpose it is wise to include an active controlin the trial. We can then use the dose–responsecurve to estimate the dose of the new drug thatproduces the same effect as the active controldoes, hopefully with confidence limits.

Example: Bronchodilation

The bronchodilating effects of two long-actingβ2-agonists, we call them A and B, each withits own inhalation device, were compared bygiving single dose administrations, followed byrepeated measurements of FEV1 over a 12-hour period. The following five treatments were

studied in a randomised, double-blind, double-dummy crossover study: 6, 12 and 24 µg of drugA, 50 µg of drug B and placebo.

In Figure 22.2 we see the geometric meanvalues, expressed as percent increase from themeasurement taken prior to treatment adminis-tration. The reason for plotting geometric, andnot arithmetic, means is that results are often tobe expressed as percent increases, and then datashould be analysed on a logarithmic scale andgeometric means are therefore more natural thanarithmetic ones. As a consequence, differencesare unnatural entities to discuss and should bereplaced by ratios.

To actually analyse the data we want someoverall summary statistic that includes both themaximal effect and the duration of response andwe use the area-based average FEV1,av over12 hours. If we want to keep the idea of analysingon a multiplicative scale, we need to compute the

0 2 4 6 8 10 12

Time since treatment administration (hours)

A 6 µgA 12 µgA 24 µgB 50 µgPlacebo

−10

−6

2

−2

6

10

14

18

22

26

FE

V1

(% in

crea

se fr

om b

asel

ine)

Figure 22.2. Geometric mean values, expressed as percent increase from the baseline measurement, of FEV1

measurements over 12 hours for individual treatments

Page 377: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 383

area all the way down to zero. Alternatively, wecould integrate over the baseline measurement,but then the area could be negative and we wouldbe forced to do the final analysis on the originalscale. We have chosen the former approach.

The ANOVA uses the model

ln(FEV1,av) = patient + treatment

+ period + ln(FEV1,base)

To have baseline as a covariate in a single dosestudy is rather essential. If we observe lungfunction over a short time period, baseline isvery important and we could probably just aswell use FEV1,av/FEV1,base as effect variable.However, when we observe over a longer timeperiod, the influence of baseline should diminishand after many hours could probably just aswell be ignored. By using it as a covariate,we get a reasonable compromise between thesetwo extremes.

Based on the results from this analysis we canaddress various questions:

1. Which doses of drug A can we claim to beeffective? To find this out we compare them,from highest to lowest dose, with placebo.Here is the result in tabular form:

Mean 95% ConfidenceTreatment ratio limits

24 µg of A 1.214 1.177, 1.25212 µg of A 1.176 1.140, 1.2126 µg of A 1.154 1.119, 1.190

Mean ratio relates to placebo. We see thatthe mean effect is 15–21% larger than itis for placebo, and the confidence intervalsclearly show that all these comparisons werestatistically significant. So we can claim that6 µg is an effective dose of drug A, withoutcompromising the significance level (see thediscussion on MED).

2. Which dose of drug A has the same meaneffect as the reference treatment, 50 µg of

drug B? To do this, we fit (weighted linearregression to keep track of the uncertainties ofthe means31) a straight line to drug A meansvs. log-dose and estimate the dose that has thesame mean effect as the reference treatment.This is illustrated in Figure 22.3. The actualdose estimate was 9.3 µg with 95% confi-dence limits 3.4–19 µg. As a consequence wefind that 24 µg of drug A as a single dose hasgreater bronchodilating effect over 12 hoursthan 50 µg of drug B.

3. What about duration of effect? Looking atFigure 22.2 we see from the placebo curveindications of the diurnal variation that isknown for lung function. To define durationby identifying when individual curves cross aline, say 15% above baseline, does not seemappropriate – if the placebo curve drops, youmight still have a good response even whenyou are back to baseline. A more statisticallysound approach would be to rephrase thequestion as ‘is there still an effect after12 hours’ and then compare the treatments toplacebo at that point in time. This is done inthe following table:

Mean 95% ConfidenceTreatment ratio limits

24 µg of drug A 1.246 1.187, 1.30912 µg of drug A 1.176 1.120, 1.2346 µg of drug A 1.168 1.112, 1.22650 µg of drug B 1.200 1.144, 1.260

Again mean ratio relates to placebo and wehave responses that are between 17% and 25%larger than that after 12 hours. Thus all activetreatments clearly have a duration in excessof 12 hours.

SYSTEMIC EFFECTS

Effects of anti-asthma drugs are in general notconfined to the lungs. Both β2-agonists andGCSs have receptors on/in cells throughout thebody. Since the drugs are cleared through thebloodstream they will therefore have systemic

Page 378: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

384 TEXTBOOK OF CLINICAL TRIALS

101

Dose (µg)

2.50

2.55

2.60

2.65

2.70

mea

n F

EV

1 (L

)

Figure 22.3. Treatment mean values for 12-hour average FEV1 with fitted log-linear dose–response curve fordrug A and estimation of Deq relative to 50 µg of drug B

effects (albeit perhaps not measurable). In con-trast to the anti-asthmatic effects, these effectscan be measured both in healthy volunteers andin patients.

GCSs are synthetic cousins to an endogenous,anti-inflammatory, substance, cortisol. Given this,we can compare the pharmacodynamic systemiceffects of different GCSs by comparing theireffects on endogenous cortisol levels. This has theadded advantage over drug plasma concentrationsthat it accounts for differences in potency indecreasing plasma levels of cortisol (which isdone by negative feedback on the HPA axis).It is important to state at this point that wedo not study endogenous cortisol levels becausethey themselves represent a dangerous side-effect. They are studied because they are sensitivemarkers for the ‘amount’ of exogenous GCS inthe body!

With this model in mind we can use cortisolin plasma as an index of the systemic burdenof therapeutically given GCS. By measuringthe effect on cortisol we get a rather directmeasure of the overall potency and concentrationin plasma of the GCS. We can however notmeasure it timepoint by timepoint and compareto measurements without drug, since the levelof cortisol is determined as a balance betweenproduction and elimination (with a half-life ofabout 1.5 hours) and the GCS acts on theproduction side. We therefore need to study alonger period. The cortisol levels in plasma havea diurnal rhythm which is very pronounced, sothe most appropriate study to do is to giverepeated doses of the GCS until a new steadystate has been reached and then measure p-cortisol for 24 hours. A typical schedule is everyother hour. The most useful variable to study is

Page 379: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 385

the area under the curve for those 24 hours. Insteady state, when there is a 24-hour periodicity,this is proportional to the amount producedduring 24 hours.

The actual clinical consequences of the levelsattained are very hard to assess. They aresurrogate measures of the long-term effects,but as such they should provide useful relativeinformation on different GCSs, though we cannever expect effects on p-cortisol to be a perfectpredictor of say relative risk of osteoporosis. Thedistribution pattern in the body of the GCS mightbe of some consequence for this.

Example: Comparison of Plasma CortisolDose–Response Curves

We want to compare two inhaled steroids (withinhalation devices), call them A and B interms of their degree of suppression on theplasma cortisol level. The comparison is done inhealthy volunteers in a randomised, open seven-way crossover study. Each treatment periodconsists of 4 days, and there was a washoutperiod of at least 2 days between each suchtreatment period. Each steroid was given inthree doses: 200, 400 and 1000 µg b.i.d. forA and 200, 375 and 1000 µg b.i.d. for B.The seventh treatment was a placebo treatment.Blood samples were measured every second hourduring the last 24 hours in each treatment period(10 p.m. to 10 p.m.). In all 21 healthy volunteersparticipated in the study and all completed alltreatment periods.

The effect of the fact that the study is openis hard to assess. If the administration of oneof the drugs is associated with more stress thanthe administration of the other, this might biasthe result. However, this seems unlikely, anddoing the study open has the benefit that fewerinhalations are required on each occasion.

To analyse the study, we first do an ANOVA.It is done on the logarithm of the concentrationswith standard factors for a crossover study: sub-ject, treatment and period. As a first presentationof the results we can compare all active treat-ments to placebo:

Mean 95% ConfidenceTreatment ratio (%) limits p

200 µg of A 97.4 73.3, 129.3 0.85400 µg of A 94.1 70.9, 125.0 0.671000 µg of A 70.0 52.7, 92.9 0.014200 µg of B 76.4 57.6, 101.5 0.063375 µg of B 53.5 40.3, 71.1 0.000031000 µg of B 9.1 6.9, 12.1 <0.00001

Here the mean ratio is presented as a percentage.For instance, 76.4% for 200 µg of drug B meansthat there is a suppression of 100 − 76.4% =23.6%.

This result does not tell much about how thedrugs compare. To do that we can fit parallelnon-linear dose–response curves to the meaneffect data, adjusting for precision by using aweighted non-linear regression.31 We assume forthis analysis that given enough steroids, thecortisol levels go down to zero. The result isgraphically shown in Figure 22.4.

With the appropriate parametrisation here, weobtain that the relative dose potency is estimatedto be 3.7, with 95% confidence limits 2.7 and 6.4.Thus, in terms of depressing cortisol levels, B isestimated to be about four times more potent thanA (remember that a letter stands for a GCS plusa device – change device and this relation mightchange). Or put in other words: to achieve thesame average depression in cortisol, we can usea four times larger dose of A than of B.

Having obtained this result, the immediatequestion is: ‘How relevant is this result to thetarget group of the drug – the asthmatic patient?’.We cannot extrapolate these results to patients‘as is’. There is a basic difference between ahealthy volunteer and an asthmatic – the latterhas an ongoing inflammatory process. This meansthat the dynamic system regulating cortisol isdisturbed (compared to healthy) and we canexpect smaller absolute effects of a given doseof the GCS. So a typical patient might have alarger ED50 than a typical healthy volunteer. Bythe same token we can expect different patientsto vary considerably in this respect.

However, there is no reason to expect thatdifferent GCS should act differently in patients

Page 380: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

386 TEXTBOOK OF CLINICAL TRIALS

102 103

Dose (µg)

0

20

40

60

80

100

120

140

160

Mea

n co

rtis

ol le

vel 2

4 ho

urs

(nm

ol/L

)

Figure 22.4. Estimated dose–response mean value curves for treatments A (to the right) and B (to the left)

and in healthy volunteers, i.e. there is no reasonto claim that the relative effect of two GCS,as measured by the potency ratio, ρ, shoulddiffer between patients and healthy volunteers, orbetween different categories of patients for thatmatter. If such differences turn out to be the case,the reason must be that the systemic dose differsbetween healthy volunteers and patients, andthen, most likely, between patients of differentdegrees of severity in their disease.

PHASE III STUDIES

DOCUMENTING EFFICACY AND SAFETY

Most drugs for obstructive airway diseases aregiven for maintenance treatment, and the mainpoint to document is the level of disease controlof the proposed treatment. At the same timeit is important to document the adverse eventprofile, since most of the drugs in this therapeuticarea are considered very safe, and safety must

not be an issue of a new drug. Thus thepivotal confirmatory trials for the airway diseasesasthma, COPD and rhinitis are parallel group,diary card studies typically spanning from amonth up to a year.

Asthma Trials

For asthma the typical study length for an efficacystudy seems to be 3 months, though occasionallylonger studies are needed. As already discussed,there is a continuous scale of severity of each ofthe respiratory diseases. The severity of asthmacan be classified into a few groups, each of whichhas its recommended medical treatment. Boththe classification and the recommended treatmentare, like the disease under study, somewhatvarying with time. The following classificationis meant only to be indicative on what type ofcriteria are used for such classification.

Intermittent: These patients have normal lungfunction between occasional exacerbations

Page 381: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 387

with symptoms at most once a week. FEV1 inpercent of predicted normal should be ≥ 80%.

Mild persistent: Now symptoms appear weekly,but not many times a day and exacerbationsmay affect both activity and sleep. FEV1 inpercent of predicted normal should be ≥ 80%.

Moderate persistent: Symptoms appear daily,and affect both activity and sleep. There arenight-time symptoms weekly and FEV1 iswithin 60–80% of predicted normal.

Severe persistent: These patients have contin-uous symptoms, frequent exacerbations andnight-time symptoms, which limits physicalactivity. FEV1 in percent of predicted normalis ≤ 60%.

This classification borrows from the GINAclassification.32 However, that classification alsouses a concept of variability which we donot discuss.

To describe the patient’s disease severityin a clinical trial we use data obtained ata visit to the clinic before randomisation. Inaddition, a diary card is provided for a run-inperiod to assess symptoms and use of rescuemedication. Inclusion into a study is often basedon these measurements. A more practically,oriented classification of the severity of asthma isbased on the use of GCS in the patient’s regulartreatment: no GCS (intermittent–mild), inhaledGCS up to 400 µg/day (mild persistent), inhaledGCS in the range 400–1000 µg/day (moderate)and inhaled GCS ≥ 1000 µg/day (severe). Sothe daily dose of background GCS treatmentcan be used as an indicator of asthma severity.Another classification is based on PC20.33

The best way to use the information in diarycards might vary between patient groups. Asalready explained, the traditional use of diarycard data is to assess changes in period means.This is expected to work best in patients withmoderate asthma. In the intermittent group oneshould not expect effects of any considerablemagnitude because the lung function is close tonormal, and the patient is, for most of the time,symptom-free. In the group of the most severepatients, patients often have obtained an irre-versible component to the disease and therefore

show little improvement in lung function. Insteadthe focus for studies in severe patients might beon how a new treatment can substitute for an oldtreatment without compromising the patient: e.g.how much oral steroids can be spared by tak-ing inhaled GCS, or how much inhaled steroidscan be spared when taking a concomitant leu-cotriene modifier.

The main objective of the confirmatory trialsis to show efficacy, and thus require a placebocontrol. This was discussed earlier. Concerningdoses, for the majority of drugs for airwaydiseases, there is not one dose that is appropriatefor all patients. Instead a range of doses hasto be justified and documented in the clinicalprogramme. The general discussion on MED anddose–response is relevant here too.

Example: A Diary Card Study with FixedTreatment Arms

The main outcome variable in a diary cardtrial with fixed treatment arms is some kind ofsummary measure (typically a mean value) over alonger period, presumed to represent, on a grouplevel, for most part a steady state situation. Wealso need a corresponding measure during thebaseline/run-in period for the statistical analysis.

Figure 22.5 shows the estimated daily meansof a 3-month clinical trial in asthma. The studywas a multi-centre, double-blind, double-dummystudy of 3 months’ treatment with investigationaldrugs. There were 52 centres in seven differentcountries worldwide and the randomised treat-ments were placebo, two dose levels, 100 and600 µg daily dose, of the GCS A and one doselevel, a daily dose of 200 µg of the GCS B. Inall 547 patients were enrolled, of which 472 wererandomised and 383 completed the study. Therewere twice as many withdrawals in the placebogroup as in the other groups. The mean agewas 44 years, the mean FEV1 in percent of pre-dicted normal was 70% and the mean reversibilitywas 24%. All patients were on inhaled steroidswhen entering the study – the mean daily dosewas 850 µg ranging from 500 to 1500. Thus thepopulation must be characterised as being mod-erate–severe.

Page 382: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

388 TEXTBOOK OF CLINICAL TRIALS

To plot the temporal behaviour of the effectof the four treatments, simple mean values areexpected to produce a bias towards no effect. Atleast when comparison is done to placebo: in thisgroup there were more withdrawals and many ofthese can be expected to be due to reasons thatare correlated to (lack of) treatment effects. Someweeks into the treatment period, raw mean valuesfor this group will therefore mainly includepatients that are not in desperate need of GCStreatment. This will introduce a selection biaswhich will partly hide the effect of the treatments.To avoid this problem, that different days willcontain different patients, we have pre-processeddata when plotting Figure 22.5 to make sure alldays contain data from all patients. The way thisis done is as follows:

1. Linear interpolation is done in order to imputeall missing values between the first and the lastrecorded day for each patient.

2. If the patient withdrew from the study beforethe 90th treatment day, the mean of the lastthree recorded days is extended from thelast recorded day plus one to day 90 post-randomisation.

Then daily means are computed by treatment.In Figure 22.5 an additional operation has beenmade: for each treatment we compute a baselineby taking the average of the run-in means andsubtract that from all daily means. This is donein order to highlight changes since that is whatthe analysis focuses on.

The statistical analysis of this data uses theperiod means from the individual patients: themean is computed for the run-in period, which isused as a baseline, and then for the treatmentperiod (from the day of randomisation andonwards). The change from baseline is used aseffect variable, and the analysis is an ANOVAwith treatments and centre as factors and baseline

−20−25

−20

−15

−10

−5

0

5

10

15

20 40 60 80 1000

Days since randomisation

Cha

nge

in P

EF

Mor

ning

(L/

min

) fr

om b

asel

ine

A 100 µg

A 600 µg

Placebo

B 200 µg

Figure 22.5. Estimated daily mean values of the change from baseline in PEF morning. See text forcomputational details

Page 383: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 389

as covariate. The following table shows theadjusted mean values for the effect variablefor the four treatment groups, adjusted to acommon baseline value (the mean over the fullstudy population):

95% ConfidenceTreatment Mean SEM limits

placebo −10.5 3.3 −17.0, −4.0A 100 µg −0.3 3.3 −6.8, 6.2A 600 µg 7.4 3.3 1.0, 13.9B 200 µg 2.0 3.4 −4.7, 8.8

From this analysis we also find that there is astatistically significant, negative, dependence ofthe change in PEF on the baseline PEF andthat the estimated residual standard devotionwas 35.9 L/min. As is common for this kind ofdata, the explanatory power of the analysis issmall: only 8% of the variability is explained bythe model.

Other diary card variables are often analysedthe same way as was shown for PEF morning,by first computing individual period means.Symptom scores and rescue medication arehowever variables for which the average valueis not necessarily easy to interpret. Symptomscores are really ordered categorical data, andeven though the average gives a hint of theamount of symptoms, it is not clear that e.g.(2 + 2)/2 means the same as (1 + 3)/2. Forrescue medication, the problem is mainly thatwe use the mean as a measure of location fora distribution which, for some patients, might beskew. Also the distribution of period means overpatients may well be skew.

For symptom scores and rescue medication it istherefore often useful to compute the percentageof symptom-free days, or the percentage of dayswith no rescue, instead of period means. Atleast the former is often for mild patients amore efficient measure than the correspondingperiod mean. And it is clinically easier tointerpret. This idea can be carried one stepfurther: we can introduce the concept of an‘asthma controlled day’, as one in which there are

no asthma symptoms and no rescue medicationwas needed. The percentage of such days isoften a useful variable, at least in studies onmild–moderate asthmatics. Modification to a‘well-controlled’ day for more severe patientpopulations is possible.

Another approach to diary card data is to mea-sure time to first exacerbation. In a populationlike this, this is not expected to produce betterresults than the ones presented but, as already dis-cussed, this approach might be useful in milderpopulations. For illustration, we have used thefollowing definition of an exacerbation for thepresent study (which was in moderate–severepatients): for there to be a mild exacerbation oneof the following three criteria should be fulfilledon two consecutive days:

1. Morning or evening PEF should be more than20% below the period mean during run-in.

2. Rescue medication should be up at least fourinhalations above the period mean during run-in.

3. The patient woke up during night-time andtook rescue medication.

By scanning the diary cards we can, for eachpatient, compute the time to first exacerbation, ifthere is one, otherwise the patient is censored.

With this definition, we can present resultsas Kaplan–Meier plots for the time to event,as in Figure 22.6. We see a picture similar tothat conveyed by the period means. Statisticallywe can compare groups either by log-rank testsor semiparametric Cox regression. In the tablebelow treatments are compared to placebo basedon a Cox regression model:

Hazard 95% Confidenceratio limits p-Value

A 200 µg vsplacebo

0.7813 0.5642, 1.082 0.137

A 600 µg vsplacebo

0.6731 0.4812, 0.9414 0.0207

B 200 µg vsplacebo

0.717 0.5127, 1.003 0.0519

Page 384: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

390 TEXTBOOK OF CLINICAL TRIALS

A 100 µg

A 600 µg

Placebo

B 200 µg

0 10 20 30 40 50 60 70 80 900.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

Days since randomisation

Frac

tion

patie

nts

with

no

exac

erba

tion

Figure 22.6. Kaplan–Meier plot of the time to a mild exacerbation

We see that the hazard for a mild exacerbationfor the high dose of A, e.g., is estimated to beabout 67% of that of placebo. As expected in thispatient group, the results are, from a statisticalpoint of view, somewhat weaker than the onesobtained from the analysis of period means.

A Dose Reduction Study

To compare the efficacy of two GCSs, a ran-domised, double-blind parallel group study withtwo treatment arms (one for each GCS) wasdesigned. The objective is to estimate the rela-tive dose potency by starting each arm on a highdose of the GCS, treating for some weeks, step-ping down the dose and treating for some weeks,making a further step down, etc. This is done untilthe asthma is no longer controlled. This way weobtain for each patient the lowest dose on whichthe patient had asthma control (the one previous

to the last one). This we call the MED (MinimalEffective Dose) for the patient.

Such a study needs a detailed definition of whatis meant by asthma control. There is no suchuniversal definition, but lack of asthma control isessentially the same thing as a mild exacerbation,as discussed above. So the working definitionabove could be used, and the algorithm is to stepdown until a mild exacerbation occurs.

In such a study the diary card variables per seare not of independent use, they should not becompared between groups, except possibly thedata on the highest dose. Instead it is expectedthat the mean values are similar in the twoarms over the treatment period, what varies isthe underlying dose producing those effects. Theeffect variable of interest is the MED, which isto be compared between the groups.

The nature of MED is such that the best way ofanalysing it is not immediate. On the one hand

Page 385: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 391

it will be rather discrete in nature, with only afew possible levels. On the other hand the mostinformative way of expressing the result is tosay how much more was needed on average forone as compared to the other (like that the MEDfor A was on average 125% that for B ). Weadvocate for that reason that MED is analysedunder a multiplicative ANOVA model, i.e. afterlog transformation of the dose, provided thatMED > 0 in all cases. The most appropriate wayto do this is to regard the data as interval censoreddata for the analysis.

One final comment on the design of long-termasthma trials. In many instances, especially fordose–response studies, it is informative for theinterpretation of the results to relate the observedeffects to the ‘highest effects attainable’. Thiscan be done within a study, so that the patientsare put on a heavy treatment, consisting of ahigh dose of a GCS and a long-acting β2-agonist(or whatever is considered necessary to get thepatient in as good a condition as possible), eitherduring run-in, in a period after a run-in periodor by adding on a period at the end of thestudy with a similar treatment. The purpose ofthis is to be able to quantify the response interms of what can actually be achieved in thepatients under study. If we put this referenceperiod at the end of the study, we must makecertain that all patients, including withdrawals,pass it in order to avoid having problems witha selection bias. If we put this reference periodbefore randomisation, we might have carry-overeffects into the randomised treatments with theirpotential problems. But having such a periodas reference often helps in the interpretation ofthe results.

COPD Studies

For COPD the classification of disease severitycan be made as follows:34

Mild: This is what is called the smoker’s cough.The patient has FEV1 > 60% of predictednormal, no breathlessness and is in generalunknown to the health care system.

Moderate: The patient has breathlessness onexertion and FEV1 in the range 40–60% ofpredicted normal.

Severe: The patient has breathlessness on every-day activities and FEV1 < 40% of predictednormal.

What is to be proved in a clinical programme fora COPD drug depends on what the claim of thedrug is. We can crudely divide effects on COPDinto two groups: symptomatic effects and diseasemodifying effects. The natural history of COPDis one of an accelerated progressive declinein lung function leading up to a distressful,premature, death. This decline in lung functionleads to progressive symptoms and diminishedexercise endurance. Symptomatic effects relateto the alleviation of symptoms and improvementof quality of life, whereas disease modifyingeffects are effects that lessen the decline rate inlung function.

A drug with symptomatic effects on COPDshould work on a rather short time-scale. Itshould lead to improved symptoms, fewer exacer-bations and better performance on exercise tests.Many drugs that were originally anti-asthmadrugs have been tried, and licensed, for theCOPD indication. Part of their effect may wellbe due to the reversible component that manyCOPD patients have in their disease – in otherwords an anti-asthma effect within the COPD. Inorder to claim effects above this, studies havebeen performed in which one tries to excludepatients with reversible components by usingexclusion criteria on patients with a reversibilitytest above 15% of baseline. To claim that short-term effects seen in the population are due tonon-anti-asthma effects because of such exclu-sion criteria is obviously not correct – a patientwith reversible airways obstruction can well havea reversibility of less than 15% on a particularoccasion. Since COPD is a disease affecting smallairways, it seems more logical to base short-term effects on measurements related to theseairways, as opposed to FEV1. However, regula-tory requirements make FEV1 the primary effi-cacy variable in COPD studies – at least as ofthis writing – though emphasis is made on the

Page 386: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

392 TEXTBOOK OF CLINICAL TRIALS

symptoms and/or exercise tests also. In fact theCPMP guidelines require two primary efficacyvariables in COPD studies – one should be FEV1

and the other a symptom score.35

A COPD study for a drug with primarilysymptomatic effects is in general a long-termstudy, probably with diary cards, not verydifferent from the asthma trial discussed above.Prevention of exacerbations is perhaps the mostimportant aspect of COPD treatment, so a 6-month study is the minimum.

A COPD drug which claims disease modifyingproperties has a heavier burden of proof onit. The effect of disease modifying is that therate of decline in lung function is reduced. Toprove this, we need to do long-term studiesover 3–5 years, or perhaps more, in which lungfunction is measured repeatedly. The statisticalanalysis should focus on the rate of decline,which could be done using a linear mixedeffects model.

Rhinitis Trials

Classification of rhinitis patients into groupsaccording to severity is lacking. The accepteddivision is between occasional and continuousexpression of symptoms, i.e. between seasonalallergic rhinitis and perennial rhinitis. The rhinitissymptoms are the same, so the measurements,notably symptom scores, are the same. As alreadyindicated symptoms are often recorded in diarycards for blockage, runny nose, sneezing/itchyand eye symptoms, and the sum of the first threemake up the Combined Nasal Symptom scoreor Nasal Index Score, which is a useful primaryvariable for clinical trials.

The difference between seasonal and perennialrhinitis lies more in the study design/conduct. Forperennial rhinitis the situation is similar to thatfor asthma or COPD in that the patient can startthe trial almost any time. For hay fever, however,the study must be conducted over a rather shortperiod of pollen exposure. What makes thesetrials more difficult is that ideally the patientsshould be included during the onset of the pollenseason to get baseline data, then followed overone to several pollen peaks with treatment. The

intensity of the rhinitis is dependent on pollencounts in the air, and lack of treatment effectscan well be due to insufficient pollen exposure.Therefore concomitant collection of pollen data isnot only useful but almost necessary when tryingto understand lack of effect in such studies.

THERAPEUTIC EQUIVALENCE

One of the challenges for drug development isto prove that a new treatment is therapeuticallyequivalent to a reference treatment. In the area ofrespiratory diseases this problem appears in twodifferent settings – when we want to register anew formulation, most often a new inhaler, andin market claims of equality of two treatments.The background and motivation of these differsomewhat, so we discuss them separately.

Bioequivalency of Two Devices

Bioequivalency refers to a specific problem.Assume that a drug is delivered as a tablet or insome other form, such that it must pass throughthe bloodstream before reaching its site of action.Then the plasma concentration profiles of thedrug define the clinical effect. This reasoningis the rationale for the bioequivalence concept:to prove that two formulations are bioequiva-lent, show that the plasma concentration profilesare sufficiently similar. From that we could thenlogically infer that the therapeutic effects are suf-ficiently similar to have the same therapeuticeffect. This is in general a rather straightfor-ward problem, requiring only small pharmacoki-netic studies.

For many years there has been a well-definedmethod to establish bioequivalency in this situ-ation. We reduce the general question of similarplasma concentration curves to key measures ofrate and extent of absorption, including AUC.The ratio of two AUCs measures the relativebioavailability of the two formulations and therequirement is that the ratio of the means (anal-ysed under a multiplicative model) should haveconfidence limits within 80–125%.

For inhaled respiratory products, however, thesite of action, the lungs, lie prior to the blood-stream (i.e. when the drug appears in the blood

Page 387: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 393

it has in general had its desired pharmacolog-ical effect). Thus plasma concentrations cannotpredict effect by pure logic! Equal delivereddose does not by logic imply the same effectfor different inhalers, because different inhalerscould deposit the drug in different parts of thelungs. For that reason, to bridge from one inhala-tion device to another is not necessarily a sim-ple case of measuring plasma concentrations. Asof this writing there is substantial confusion onhow to proceed with bioequivalence studies forinhalers. We will discuss some aspects of theproblem here.

The first aspect is that what you inhale areparticles, and these will be deposited differentlydepending on size. To give an equivalent effect,we therefore need equivalent in vitro performanceof the two inhalers. For the rest of the discussionwe assume that in vitro data is similar for thetwo inhalers.

The basic assumption is that since whatappears in the bloodstream does not have tohave passed the site of action, systemic exposure,as measured by drug concentrations in thecirculation, is not necessarily enough to concludeefficacy. However, similar systemic exposureshould be sufficient to deduce similar systemiceffects, and therefore reduce much of the questionof bioequivalent side-effects to the classicalbioequivalence method.

Logically, if we measure blood concentrationsand conclude that the systemic exposure is thesame, the outstanding question is if the drug hasbeen delivered to the site of action. For a nasalspray it is hard to see how it can fail to do so.For a nasal spray, therefore, to require anythingbeyond in vitro data and pharmacokinetic datamight be an overkill. For an orally inhaled drugthe situation is more complicated.

If we consider a bronchodilator as an example,we need the drug to hit the receptors of thecontracted muscles. To check that the drug hashit these, we can therefore do a pharmacody-namic study, e.g. a single dose study in whichFEV1 is followed for a number of hours, or abronchoprovocation study if that is preferred. Asuggested design is to study two or three doses

of each inhaler, in order to see not only that theresponse is similar, but also that there is a similarsensitivity to changes in dose. These are, rela-tively speaking, simple studies to perform.

The next question is, which should the deci-sion rule be for bioequivalency for such astudy – when are two inhalers considered to besimilar? There seems to be two approaches usedover the last few years. One is to use the wordcomparability. This is, for good reasons, not welldefined and essentially means that there is adose–response relationship on each device andthat, numerically, the mean on each dose level issimilar in the eye of the regulator. Thus there isno true statistical decision plan associated withthe study and it is not used as proof per se,only as supportive information to what in vitroand PK data provides. The other approach isstrictly statistical. In this case we should com-pute the relative dose potency with confidencelimits. At present, in the case of bioequivalencyfor pMDIs for albuterol (salbutamol in the US),FDA requires that the 90% confidence limits forthis parameter should be contained in the inter-val 2/3–1.5. The justification for these limits isnot clear, but they imply that the mean effect isso similar for the two pMDIs that they could beswitched on the market.

When it comes to decide bioequivalencyfor orally inhaled anti-inflammatory drugs theproblem is even more difficult, since there are, asof today, no designs available that can provide thekind of answers discussed for bronchodilators, toa reasonable price. The problem is that in moststudies the dose–response appears rather flat, sovery large studies are needed for the type ofdecision plan that was indicated above.

Marketing Therapeutic Equivalence

The other aspect of therapeutic equivalence isto show that a new treatment is as effective asan old one, whereas it has some other benefitscompared to it.

Proving that two treatments are equivalenthas, however, a long history in the context ofmedicine. The traditional way was to misuse the

Page 388: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

394 TEXTBOOK OF CLINICAL TRIALS

p-value technology – if we could not demonstratea difference (p > 5%) the treatments are equal.This is obviously wrong (it is similar to ‘notproven guilty’ in court, which means only that,not that one is proven innocent), which is bynow acknowledged by most, but not all, work-ers in the field. A theoretically valid approachbecame legitimised by the ICH guidelines,29

which defines an algorithm borrowed from theoriginal bioequivalency concept for plasma con-centrations. First you prespecify some limits (cor-responding to the 80–125% above) and if your95% (sic!) confidence interval for the mean dif-ference is contained within this prespecified inter-val, you can declare therapeutic equivalence. Thisapproach is however complicated, when youreffect scale has no obvious interpretation, like alung function scale or a symptom scale. To bea sensible approach, the prespecified limits mustimply that clinicians do consider such a smalldifference to be of no clinical consequence – thepredetermined limits must be agreed on. It is hardto forsee that this can actually be done in thefield of respiratory medicine, since many effectchanges mean different things depending on whatpopulation is studied.

There is however a sensible, though expensive,approach available – the one used for demon-strating bioequivalency of bronchodilators. Thekey there is to translate from the effect scaleto the dose scale, by studying dose–response.Almost all drugs in the respiratory area (thoughthere are exceptions) are available in multi-ple doses, and when two treatments (drug plusinhaler) are to be compared, at least one ofthem can in general be varied on some kind ofdose scale.

The simplest such design is as follows. Toprove that dose a of a treatment A is therapeuti-cally equivalent to a treatment B (dose need notbe specified), we can study doses a/2 and 2a of A(not dose a!) and treatment B. By assuming a lin-ear dose–response relationship (versus log-dose)for A, we can estimate the dose of A that has thesame effect as treatment B. Illustrations of this,with more than two doses of A were illustratedearlier. Now, if the 90% confidence limit for the

dose of A that has the same effect as treatmentB lies between a/2 and 2a it is reasonable thatdose a of A is equivalent from a clinical pointof view to treatment B from an efficacy pointof view. This is because half the dose has lesseffect and twice the dose more effect. Implic-itly this assumes that the clinical response to asuboptimal dose is to double this, which is whatis done with most drugs in the respiratory area.In general, to draw the conclusion of therapeuticequivalence, large studies might be needed.36

Often one tries to establish the therapeuticequivalence in one clinical study, and comparebenefits in the other. Basically I do not think thisis the way to go about this kind of problem – inmost cases it is probably a problem that shouldbe discussed in terms of therapeutic ratio, asdiscussed in the next section.

The Real Issue – The Therapeutic Ratio

Dose–response studies provide us, at best, withdoses for further investigation. However, whethera drug ends up being superior or inferior towhat is on the market is not determined by whatdose it is given in. What is the point in halvingthe nominal dose, if you get twice as muchadverse effects?

The appropriate measure here is the therapeuticratio. The therapeutic ratio for a drug relatesthe positive effect to the negative effect. Tounderstand it, assume that effects, both positiveand negative, are measured on a scale from0 to 100. Also assume, for the time being,that the dose–response curves for positive andnegative effects are parallel. Then a therapeuticratio for this drug of 2 means that twice aslarge a dose is needed to get the same negativeeffect as positive effect. Thus we can definethe therapeutic ratio for drug A as T R(A) =ED50(side-effect)/ED50(effect). If we have twodrugs, we can define the relative therapeuticindex of A to B as T R = T R(A)/T R(B), whichis equivalent to the ratio ρ(effect)/ρ(side-effect),where ρ is the relative dose potency for thetwo drugs with respect to the indicated effect.This means that we can estimate not only therelative therapeutic index, but also confidence

Page 389: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 395

intervals to the estimate, by assessing the relativedose potencies. Moreover, we can define thetherapeutic index as a ratio of relative dosepotencies without assuming that the effect and theside-effect curves are parallel. To be meaningfulit only requires that the dose–response curvesfor efficacy are parallel and the dose–responsecurves for the side-effect are parallel. We canestimate the therapeutic index by combiningeffects from two studies – one on efficacy andone on side-effect, but better still is to obtain allthe information in one study.

The first problem to solve is the precisedefinition of outcome variables, both positive andnegative. Different results can be obtained byusing different outcome variables. It is thereforeimportant that the precise objective is spelt outand the outcome variables related to this. Thefollowing example is an illustration.

Example: Estimating the Relative TherapeuticIndex

In order to assess the relative usefulness of along-acting β2-agonist, call it A, and a short-acting one, which we call B, we want to compareone topical effect, bronchodilation, and onesystemic effect, suppression of serum potassium.The study was of crossover design with single-dose administrations and serial measurement ofboth variables taken.37 In order to be able to getmeaningful estimates of the relative therapeuticindex we need many patients, relatively speaking,and in order to obtain a simpler study we chooseto measure the maximal effect on each parameterand compare that.

Thus a randomised, double-blind, six-periodcrossover study was designed with the followingsingle dose treatments: placebo, 6 µg, 24 µgand 72 µg of drug A and 200 µg and 1800 µgof drug B. Each treatment period consistedof a single dose administration which wasfollowed for 4 hours and from each experimentalsequence the maximal FEV1 value and theminimal S-potassium value was extracted forstatistical analysis.

Figure 22.7 demonstrates the main result. Wesee (period and baseline adjusted) treatment

means together with straight line approximationsto the dose–response curves for each variable.In order to make the results more interpretablemean values (and lines) are expressed as apercentage of the mean value for placebo. Fromthese straight lines we can estimate the relativedose potency, as discussed earlier, for eachvariable separately:

95% ConfidenceVariable ρ limits

FEV1 147 65, 534S-potassium 60 41, 91

Thus we see that in terms of efficacy, the long-acting drug A is almost 150 times more efficientthan the short-acting B. On the side-effect side,A is 60 times more potent, so from this data wesee that the relative therapeutic index is estimatedto be 146.6/59.75 = 2.4. To obtain confidencelimits is somewhat involved since we need to takeinto account the covariation of the two variables.How to do this is outlined in Ref.38 The resultis that the relative therapeutic index is estimatedto be 2.5 with (approximative) 95% confidencelimits 1.02 and 9.0 (p = 0.046).

The conclusion from this is, that in terms of thevariables of this analysis treatment A is estimatedto be 2.5 times ‘better’, but it is certified that itis ‘better’ than treatment B. Of course, this resultis not better than data. From Figure 22.7 we seethat the lowest dose of each drug has a very smallaverage effect on serum potassium. It could (andshould) be questioned if these really are on thelog-linear part of the dose–response curve. Wecan repeat the analysis by only incorporating thedoses 24 and 72 µg of A and 1800 µg of B forserum potassium.

OTHER ISSUES

PHASE IV

Much phase IV work focuses on comparisonswith competitor products in order to demonstratethe advantages of the new product. This has been

Page 390: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

396 TEXTBOOK OF CLINICAL TRIALS

100 101 102 103

Dose

104

108

112

116

120

124

max

imal

FE

V1

(per

cent

of p

lace

bo)

100 101 102 103

Dose

84

86

88

90

92

100

94

96

98

min

imal

S–P

otas

sium

(pe

rcen

t of p

lace

bo)

Figure 22.7. Adjusted mean values for each treatment and outcome variable

discussed in previous chapters and will not berepeated here. In addition to that, special safetyissues might have to be addressed, which mightcall for large-scale studies in order to study somerare event.

PHARMACO-ECONOMICS

Asthma is a costly illness – its costs can accountfor as much as 1–1.5% of all resources inthe health care sector. The costs are, however,unevenly distributed among the asthmatics; itis not uncommon that 10% of the most severecases (usually patients with uncontrolled asthma)account for over 50% of the total cost. Thereshould thus be room for a significant costreduction by improving disease control.

The costs can, basically, be divided intothree categories (usually only the first, althoughsometimes also the second category is measured):

1. Direct costs, defined as health care resourcesconsumed, include costs associated with drugs,

devices, consultations with physicians, emer-gency room visits and hospitalisations.

2. Indirect costs, defined as lost productivity,include time off work or school, either patientor relative, premature retirement and death.Indirect costs may account for up to 50% ofthe cost of asthma.

3. Intangible costs, contain factors related toquality of life (grief, fear, unhappiness,pain, etc.).

Within the direct costs, drugs and general prac-titioner visits can, crudely, be considered costsof managing controlled asthma, whereas emer-gency room and hospital costs can be assumedto relate to treatment failure. Assuming this tobe about 75% of the total cost, the major partof the costs of asthma appear to be a result ofinadequately controlled disease. Thus, the goal isto get control of the asthma, which (for example)can be done by patient education39 and prophy-lactic therapy. As an example of the latter, it

Page 391: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 397

has been demonstrated that the introduction ofhigh-dose inhaled steroids in patients with severeasthma reduced the number of days of hospital-isation by 80%.40 It should also be noted thata part of the drug costs consists of rescue med-ication, like short-acting β2-agonists, and is initself a sign of the disease not being adequatelyunder control, and that an increase in prophylac-tic therapy, like inhaled steroids, might decreasethese costs.

So, the economics of asthma inform us thata large population of mild-to-moderate asthmat-ics has a low daily cost. For some of thesepatients, the disease becomes uncontrolled andcostly, with hospitalisations and time off work orschool, possibly progressing into early retirementor death. Some of these cases are probably due tobad compliance with treatment regimens. Interna-tional guidelines therefore stress that prophylactictreatment should be introduced at an early stagein asthma treatment, resulting in an increase indrug and general practitioner costs, but hopefullyleading to reduced hospitalisation costs and indi-rect costs. As an observation on this topic, it canbe mentioned that the cost of the avoidance of oneadmission to hospital will pay for about 3 yearsof treatment with inhaled steroids.41

Apart from collecting data on costs, it is alsoimportant to measure the individual’s qualityof life and the effectiveness of various inter-ventions (where effectiveness, as compared toefficacy, ideally should measure the effects ofan intervention in clinical practice and also inunits the patients care about). This is perhapsespecially important if a new medical treatmentincreases the total costs, as it then becomesimportant to relate these extra costs to theadditional effectiveness gained. Currently recom-mended effectiveness variables include the num-ber of symptom- and episode-free days. In cost-effectiveness analyses, if both costs and effec-tiveness are higher for one of the treatments, thedifference in costs is divided by the difference ineffectiveness to obtain a cost–effectiveness ratio.This ratio is, for example, expressed as: com-pared to treatment A, treatment B costs $x persymptom-free day gained. That ratio thus gives

support in answering the question on whetherthe additional effectiveness (or quality of life orasthma control) can justify the extra costs.

Furthermore, it is important to keep in mindthat costs often are country-specific, e.g. in thecase of an international clinical trial, and someadaptation is needed before translating resultsfrom one country to another. This is so becausethe outcome of a new treatment will depend onthe local medical tradition, drug pricing and theunit costs of other health care resources, anda number of social conditions like the labourmarket. In general, though, as the interest for‘value for money’ and cost-containment in thehealth care sector grows, the importance of thesekind of evaluations is likely to increase. Fromclinical trials it is thus important to measure thesekind of variables (both costs and effectiveness),which then can be used as the basis for acost–effectiveness analysis.

META-ANALYSIS

Meta-analysis is in general useful when the focusis on an issue for which data is available ina number of studies, but each individual studyis underpowered for that particular issue. Suchissues might be questions related to specialsubgroups or issues of rare events. Obviousexamples of the latter are special, and rare,adverse events. In this respect the respiratory areadoes not differ much from other areas. The resultsobtained so far within the respiratory area frommeta-analysis are not impressive.42

REFERENCES

1. Beasley R. Worldwide variation in the prevalenceof symptoms of asthma, allergic rhinoconjunc-tivitis and atopic eczema. Lancet (1998) 351:1225–32.

2. Chadwick DJ, Cardew G. The Rising Trends inAsthma, Ciba Foundation. Chichester: John Wiley& Sons (1997).

3. Peat J, Li J. Reversing the trend: reducing theprevalence of asthma. J Allergy Clin Immunol(1999) 103: 1–10.

4. Sears MR. Epidemiology of childhood asthma.Lancet (1997) 350: 1015–20.

Page 392: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

398 TEXTBOOK OF CLINICAL TRIALS

5. Barnes PJ. The changing face of asthma. Q J Med(1987) 63: 359–65.

6. Lange P, Parner J, Vestbo J, Schnohr P, Jensen G.A 15-year follow-up study of ventilatory functionin adults with asthma. N Engl J Med (1998) 339:1194–200.

7. Sherman CB. The health consequences of cigarettesmoking. Pulmonary diseases. Med Clin N Am(1992) 2: 355–75.

8. Buist SA. Smoking and other risk factors. In: Mur-ray JF, Nadel JA, eds. Textbook of RespiratoryMedicine, 2nd edn. Philadelphia: WB Saunders(1994) 1259–87.

9. Selroos O, Pietinalho A, Riska H. Delivery de-vices for inhaled asthma medication Clinicalimplications of differences in effectiveness. ClinImmunother (1996) 6: 273–99.

10. Selroos O. Bronchial asthma, chronic bronchi-tis and pulmonary parenchymal diseases. In:Moren F, Dolovich MB, Newhouse MT, New-man SP, eds. Aerosols in Medicine. Principles,Diagnosis and Therapy, 2nd edn. Amsterdam:Elsevier Science (1993) 261–89.

11. Bartnes PJ. Beta-adrenergic receptors and theirregulation. Am J Respir Crit Care Med (1995) 152:838–60.

12. Lofdahl CG, Chung KF. Long-acting β2-adreno-ceptor agonists: a new perspective in the treatmentof asthma. Eur Respir J (1991) 4: 218–24.

13. Rall TW. Drugs used in the treatment of asthma:the methylxanthines, cromolyn sodium, and otheragents. In: Gilman AG, Rall TW, Nies AS, Tay-lor P, eds. Goodman and Gilman’s The Pharmaco-logical Basis of Therapeutics, 8th edn. New York:Pergamon Press (1990) 618–37.

14. Gandevia B. Historical review of the use ofparasympatholytic agents in the treatment ofrespiratory disorders. Postgrad Med J (1975)51(Suppl 7): 13–20.

15. Brattsand R, Selroos O. Glucocorticosteroids. In:Page CP, Metzger WJ, eds. Drugs and the Lung.New York: Raven Press (1994) 101–220.

16. The British Guidelines on asthma management.Thorax (1997) 52(Suppl 1): S1–21.

17. Howarth PH. The medical treatment of chronicrhinitis. In: Busse WW, Holgate ST, eds. Asthmaand Rhinitis. Boston: Blackwell Scientific (1995)1415–28.

18. Simons FER. Antihistamines. In: Mygind N, Nac-lerio R, eds. Allergic and Non-allergic Rhini-tis. Clinical Aspects. Copenhagen: Munksgaard(1993) 123–36.

19. Bernstein JA, Bernstein IL. Cromones. In: BarnesPJ, Grunstein MM, Leff AR, Woolcock AJ, eds.Asthma. Philadelphia: Lippincott-Raven (1997)1647–66.

20. Barnes NC. Effects of antileukotrienes in thetreatment of asthma. Am J Respir Crit Care Med(2000) 161: S73–6.

21. ATS guidelines.22. Quanjer PH, Tammeling GJ, Cotes JE, Peder-

sen OF, Peslin R, Yernault JC. TI: lung volumesand forced ventilatory flows. Report WorkingParty: Standardization of Lung Function Tests.European Community for Steel and Coal. EurRespir J (1993) 6(Suppl 16): 5–40.

23. Oga T, Nishimura K, Tsukino M, Hajiro T, IkedaA, Izumi T. The effects of oxitropium bromideon exercise performance in patients with stablechronic obstructive pulmonary disease. A compar-ison of three different exercise tests. Am J RespirCrit Care Med (2000) 161: 1897–901.

24. Borg GAV. Psychophysical bases of perceivedexertion. Med Sci Sports Exercise (1982) 14:377–81.

25. Mahler DA, Wells CK, Feinstein AR. The mea-surement of dyspnea. Contents, interobserveragreement, and physiologic correlates of two newclinical indexes. Chest (1984) 85: 751–8.

26. Berger M, et al. The Sickness Impact Profile:development and final revision of a health statusmeasure. Med Care (1981) 19: 787–805.

27. Stewart AL, Hays RD, Ware JE Jr. The MOSshort-form general health survey. Reliability andvalidity in a patient population. Med Care (1988)26: 724–35.

28. Jones PW, Quirk FH, Baveystock CM. The StGeorge’s Resp questionnaire. Resp Med (1991) 85:25–31.

29. ICH Harmonised Tripartite Guideline. Statisticalprinciples in clinical trials. Stat Med (1999) 18:1905–42.

30. Senn S. Statistical Issues in Drug Development.Statistics in Practice. Chichester: John Wiley &Sons (1997).

31. Kallen A, Larsson P. Dose response studies: howdo we make them conclusive? Stat Med (1999) 18:629–41.

32. Global Initiative for Asthma. National Institutes ofHealth. National Heart, Lung, and Blood InstitutePublication 95-3659 (1995).

33. Woolcock AJ. Therapies to control the airwayinflammation of asthma. Eur J Respir Dis (1986)69(Suppl 147): 166–74.

34. BTS Guidelines for the management of chronicobstructive pulmonary disease. Thorax (1997)52(Suppl 5): S1–28.

35. Committee for proprietary medicinal products(CPMP). Points to consider on clinical investi-gation of medicinal products in the treatment ofpatients with chronic obstructive pulmonary dis-ease (COPD). London: EMEA (1998).

Page 393: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

RESPIRATORY 399

36. Kallen A, Larsson P. On the definition of ther-apeutic equivalence. Drug Inform J (2000) 34:349–54.

37. Rott Z, Bocskei C, Poszi M, Juhasz G, Larsson P,Rosenborg J. On the relative therapeutic indexbetween formoterol Turbuhaler and salbutamolpressurized metered dose (pMDI) inhaler in asth-matic patients. Eur Respir J (1998) 12(Suppl 28):324s.

38. Kallen A. A note on therapeutic equivalence andtherapeutic ratio with application to studies inrespiratory diseases. Drug Inform J (2001) 35:1495–505.

39. Liljas B, Lahdensuo A. Is asthma self-manage-ment cost-effective? Patient Educ Counsel (1997)33: 97–104.

40. Adelroth E, Thompson S. Advantages of high-dose inhaled budesonide. Lancet (1988) i: 476.

41. Blainey D, Lomas D, Beale A, Partridge M. Thecost of acute asthma: how much is preventable?Health Trends (1991) 22: 151–3.

42. Jadad RJ, Moher M, Brownman GP, Booker L,Sigouin C, Fuentes M, Stevens R. Systemic re-views and meta-analyses on treatment of asthma:critical evaluations. BMJ (2000) 320: 537–40.

Page 394: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

Index

Note: page numbers in italics refer to figures and tables and boxes

α-blockers 179acetylcholinesterase (AChE)

inhibitors 244acne 213–14

disability index 214outcome measures 219treatment 214

acupuncture 80–1cohort studies 81humoral theory 81neurological theory 81pain control 68placebo 80–1

acute lymphoblastic leukaemia(ALL)

childhood 103, 104, 112–13translational research 105

acute myeloid leukaemia (AML)141–6

competing risks 146complete response (CR) rate

144, 145cure models 146cytogenetic characterisation

142disease-free survival 145, 146drug-resistant 142, 143event-free survival 145, 146factorial design of trials

144–5

GM-CSF 145induction regimens 144–5Kaplan–Meier survival curves

146length of remission 145–6maintenance regimens 144–5molecular characterisation 142older patients 143–4, 146outcome measures 145–6overall survival 146post-remission therapy 142,

143, 144, 145QoL 146response rate 145statistical issues in

design/analysis of trials144–6

subtypes 142treatment failure 142treatment phases 144

acute progranulocytic leukaemia(APL) 142–3

adaptive design of trials 95–7adolescents 50

cancer trials 103adverse events

herbal medicine 70–1, 72–3,73–4

monitoring in paediatric clinicaltrials 49–50

reportingin Chinese medicine

71, 73in gastrointestinal cancer

trials 132age limit, upper for trials 56air pollution 361airway

diseases 359–73narrowing 360upper, function tests 366–7

airway resistance 365measurement 366–7

albuterol (salbutamol) 393allergic disease

atopic dermatitis 214respiratory 360rhinitis 361, 370–1

allocation to treatment 24–5cognitive behavioural therapy

287, 288contraception trials 327depression 300dynamic procedure 25gynaecology 350

concealment 351infertility 341, 350

concealment 351random 14

alopecia androgenetica 226

Textbook of Clinical Trials. Edited by D. Machin, S. Day and S. Green 2004 John Wiley & Sons, Ltd ISBN: 0-471-98787-5

Page 395: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

402 INDEX

Alzheimer’s disease 243–53aetiology 246amyloid 243, 244

deposition 246apo E4 allele 246, 250apolipoprotein E 244bridging studies 247–8cholinergic marker loss 243decline slowing 248, 249, 251diagnosis 246endpoints 247epidemiology 245ethics 252–3familial clustering 245genetics 245, 246mild cognitive impairment

studies 250N-of-one design 252pathogenesis 246pathology 245–6patients 246–7Phase I trials 247–8Phase II trials 248Phase III trials 248–50placebo-controlled trials

252–3prevalence 245, 250primary prevention trials

250–1randomised start/withdrawal

studies 252rate of change studies 249secondary prevention 251structural change 251–2survival analysis 247, 249–50symptomatic change 251–2synapse loss 246tertiary prevention 251trials 56, 246–51

Alzheimer’s Disease AssessmentScale–Cognitive Component(ADAS-Cog) 248

Alzheimer’s Disease CooperativeStudy (ADCS) 244

amalgam 204anastrozole 15aneurysm resection 181angiotensin converting enzyme

(ACE) inhibitors 179anthracycline therapy 89, 90

AML 142antiangiogenesis agents 167antiarrhythmic drugs 180Antiarrhythmics Versus

Implantable Defibrillators(AVID) trial 181, 182, 183,184

anticholinergic drugs 363anticipated effect size 30antidepressants, tricyclic 239antihistamines 364Antihypertensive and

Lipid-Lowering Treatment toPrevent Heart Attack Trial(ALLHAT) 178–9

antihypertensive drugs176, 180

antimicrobial agents in toothpaste205

antioxidant vitamins, colorectalcancer prevention 122

anxiety disorders 255–70avoidance 267–8Bayes factor of test 259co-occurring

symptoms/syndromes257, 264

community recruitment 259community settings 256–7,

258, 260, 266assessment strategies

259–60efficacious medication 268

comorbidity 260–1data capture/edit cycle 260diagnostic category definitions

263–4diagnostic groups 264diagnostic instruments 255discordant models 268–70effectiveness studies 255–6efficacy trials 257, 264escalation strategies 260exclusion criteria 261features 257full factorial study design

268–9generalised 263–4heterogeneity/homogeneity

trade-off 261high community prevalence

256–61illness severity 261individual psychology

263–4life context 263–4measurement targets 266multi-step diagnostic

procedures 260normal state 262nosological boundary definition

261–8optimal long-term outcome

262

outcomeassessment 266–7definition 264–6

pathological state 262phobic symptoms 267–8prevalence 259primary care setting 258recruiting individuals not

seeking treatment 257research recruitment 260result measurement 264–6self-criticism 258severity 257

composite measures 265social support 263specialty clinics 259stigma 258, 259symptoms

complexity 264–5elimination 262severity scales 255silencing 267stability 266–7

treatmentbackground history 261cognitive behavioural 255,

256, 268decisions 262–3efficacious 268medication 268–9protocol-driven 258protocols 260psychosocial 268–9

apo E4 allele 246, 250apolipoprotein E 244area under the curve (AUC) 26

bioequivalency 392glucocorticosteroids 385

Aricept trial 251aromatase inhibitors 89asbestos exposure 159aspirin, cardiovascular disease

15, 16, 175assent to participate, paediatric

clinical trials 52asthma 359–60

anticholinergic drugs 363β2-agonists 362challenge tests 368control 390–1, 396–7controlled days 389cost–effectiveness ratio of

treatment 397crossover studies 381data pre-processing 388diary card studies 371–3,

387–90

Page 396: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 403

disease severity 379dose–response curves 369,

381–2dose–response study 381–2,

391duration of action 367effectiveness variables 397efficacy studies 380–3,

386–91experimental trials 367–70glucocorticosteroids 363, 371,

381diary card studies 387–90dose reduction study 390–1level of use 387

highest effects attainable 391hyperresponsiveness studies

368–70Kaplan–Meier survival curves

389, 390minimal effective dose (MED)

381, 390–1missing data 388onset of action 367prophylactic treatment 397QoL 373reference period 391rescue medication 375, 389responders 367, 373response curve 367safety studies 386–91sample size parameters 378–9severity classification 386–7single dose monitoring 367–8subgroup analysis 379–80symptom scores 389

asthmatic reactions, early/late368

atenolol 180atherosclerosis 175, 176, 177atopy 216atorvastatin 18atrial fibrillation, rate control 181Atrial Fibrillation Follow-up

Investigation of RhythmManagement (AFFIRM)180–1

atrioventricular junction ablation180–1

audit trail facilities 40auditory hallucinations, perceived

powerfulness 290–1average causal effect (ACE) 299,

307–8avoidance behaviours 267–8

β-blockers 180β2-agonists 362, 380

bronchodilatation 382–3relative therapeutic index 395,

396systemic effects 383–4

baseline comparisons 38baseline information 15Bayesian methods 24Beck Depression Inventory (BDI)

298, 302benoxaprofen (Opren) 55benzodiazepines 239betamethasone 217bias 6

clinician 339patient 339recruitment 281–3respiratory disease studies

373–5, 377bioequivalency 392–3blinding 6

cardiovascular device trials184

cognitive behavioural therapystudies 287–8

contraception trials 327dentistry 197explanatory trials 339gynaecology studies 351–2infertility studies 351–2pragmatic trials 339respiratory diseases 373–4rhinitis studies 374skin disorders 216–17

blood taking, paediatric clinicaltrials 49

Boccaceto 3body plethysmography 365Bonferroni correction 37Bowman–Birk inhibitor 123brain tumour, childhood 105breast cancer 87–97

adaptive designs of trials95–7

adjuvant therapy 90aggressive 92anastrozole 15anthracycline therapy 89, 90CAF treatment regime 92–4chemotherapy 89–90

high-dose with bone marrowtransplant/stem cellsupport 90

low-dose versus high dose92–4

continual reassessment method(CRM) 97

curing 94dose-finding trials 97doxorubicin 90HER-2 89HER-2/neu expression 93–4hormonal therapy 90incidence 87long-term therapy impact

94–5mammography 87, 90–1meta-analysis 91new agent trials 97oestrogen-receptor (ER) status

89, 90, 91ovarian ablation 91patient advocates 87–8polychemotherapy 91prevention 91prognosis 88–9prolonging survival 94quality of life 91radiotherapy 89, 91raloxifene 91recurrence 92, 93research promotion 87–8staging 88stopping rules 95–6surgery 89tamoxifen 15, 89, 90, 91time-dependent hazards 91–4treatment benefits 88trial groups 91

breastfeeding, contraception 321breathlessness measurements 370bridges, dental 193

resin-bonded 204–5bronchial carcinoma 159bronchial hyperresponsiveness

non-specific 360studies 368–70

bronchitis, chronic 360, 361bronchodilator drugs 362–3

bioequivalency 393efficacy studies 380–3

bronchoprovocation studies 393burns, severe 14Butler, Robert 244

χ2 test 34Cade, John 239calcipotriol 216calcium, colorectal cancer

prevention 122–3calcium channel blockers 179,

180

Page 397: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

404 INDEX

Canadian Implantable DefibrillatorStudy (CIDS) 182

canceradaptive design of trials 96–7aggressive treatment 34chemoprevention 122–3distant metastases 38local recurrence 38solid tumours, retrospective

review 29see also named cancers

cancer, childhood 101–13accrual duration in trials 102ancillary studies 111categories 101, 102clinical research problems 112control 103cure rate 101doxorubicin 112–13ethics 111–12incidence 101minimal residual disease 106national reference laboratories

106negative questions 111outcome prediction 106prognostic factors 105–6reduced therapy 111risk assignment 106risk groups 106solid tumour treatment 105study design 108, 110–11survival improvement 103translational research 103,

105, 106, 111treatments 101–2

advances 103, 104effects 112success 104

Cancer and Leukemia Group B(CALGB), breast cancer trial91, 92–4

Cardiac Arrhythmia SuppressionTrial (CAST) 180

cardiac defibrillators, implantable182

cardiovascular device trials181–4

blinding 184risk levels of patients 183strategy approach 184

cardiovascular disease 175–87adherence to trial treatments

177aspirin 15, 16, 175atrial fibrillation rate control

181

behaviour change trials 184–7causes 175, 176community settings for trials

186congenital abnormality

correction 181control group

active 178–9treatment 178

depression 186genetic factors 176left ventricular ejection fraction

179, 183multifactorial 176multiple interventions 176oral contraceptives 317–18pharmacological agent trials

177–81presentation 176primary prevention 175, 178primordial prevention 175rhythm control 180–1secondary prevention 175social support 186stepped care approach 180strategy comparisons 180–1stratification of trials 177surrogate outcomes 185ventricular arrhythmias 180

cardioversion 180, 184care settings, older people trials

59–60causal effects 306–8

complier average 307, 308,309

causality, counterfactual model298

cause-specific cumulativeincidence functions 38

CAV (cyclophosphamide,adriamycin and vincristine)165

celecoxib 123cervical cap/sponge 321children 50

informed consent 111see also cancer, childhood;

paediatric clinical trialsChildren’s Health Act (2000, US)

112Children’s Oncology Group

(COG) 102–3Chinese medicine 64–5

adverse effects 70–1, 72–3,73–4

alopecia androgenetica 226disease management 66

endpoints 74herbal drugs 68–71, 72, 73–4holistic approach 69Naranjo’s grading of adverse

effects 71old system of clinical

observation 70quality of life 73–4recommendations for clinical

trials 74–5, 76 –9, 80–1response to pathological

processes 70strong tradition 70symptom identification

principle 69traditional healing concepts

69–70unique features 74

chlorhexidine 205chlorthalidone 180Cholesterol and Recurrent Events

(CARE) trial 178cholesterol levels 175–6

lowering 178, 182cholestyramine resin 178cholinesterase inhibitors 244,

252chronic myeloid leukaemia (CML)

143chronic obstructive pulmonary

disease (COPD) 359,360–1

anticholinergic drugs 363β2-agonists 362diary card studies 371–3, 392disease modifying effects 391,

392exacerbation rates 379exercise tests 370measurement scales 364–7QoL 373rescue medication 375reversibility 391severity classification 391symptomatic benefit 372,

391–2cigarette smoking 159

cessation 159COPD 361habit change 185prevention of uptake 159pustular psoriasis 214

cisplatinnon-small-cell lung cancer

164small-cell lung cancer 165

Page 398: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 405

clinical observation, old system(herbal medicine) 70

clinical record forms (CRFs) 39data transfer 40

Clinical Research Organisations(CROs) 39

clinical trialsalternative designs 17–24designs 339–40types 12–24, 337–9

Clinician’s Global Impression(CGI) 248

clofibrate 177–8clomiphene citrate 339clozapine 286cluster randomisation 340–3,

351Cochrane Collaboration 12

Cochrane Skin Group 229contraception 332, 333dentistry 200–1evidence-based practice 281

cognitive behavioural therapy273–93

acute 278anxiety disorders 255, 256,

268assessors 287–8background services 285benefits 278blindness 287–8case formulation 290changing standards in trial

design/reporting 279,280

choice of 300–1chronicity of illness 286clinical trial purpose/outcome

275–6co-morbid conditions 284

substance abuse 283–4components 274development 274–5, 275diagnosis 283dropouts from study 284–5durability 278duration of untreated psychosis

286effectiveness trials 281,

288–9efficacy trials 281entry criteria 283evaluation 275evidence-based practice

280–5exclusion criteria 283–4funding for trials 276

gender 286intellectual status 286IQ cut-off 284lost to follow-up 284–5mean effect size 278–9multiple effects 291number needed to treat (NNT)

292outcomes of treatment 276,

281, 291–2clinical significance 292prediction 290

panic disorders 269participant samples 281–5patient allocation 287, 288progression through therapy

289protocol 288–9randomisation 285–8randomised controlled trials

274–5rating scales 291recruitment 285

bias 281–3reporting guidelines 279selection of patients 283specificity 288–9standardised exposure therapy

290substance abuse 283–4symptoms

measurement 291–2severity 286

therapeutic relevance 276–7therapists 287

effects 303–4treatment 289

components 290–1individualised 289–90interactions 286

cognitive impairment, mild250

cognitive therapy, schema change291

cohort studies, acupuncture 81coitus interruptus 322colon cancer

adjuvant treatment 124–5,132–4

advanced disease126–8

localised disease 124–6stage 2 disease 126stage 3 disease 126surgery 124treatment 124–8

colonoscopy 124

colorectal cancer 121–9chemoprevention 122–3early detection 123–4

combined oral contraceptives(COCs) 316–18

common cold vaccine 6COMPACT (Computer Package

for Clinical Trials) 40complementary medicine 63–82

indications 65–6philosophy 65–6severe burns 14types 64

complier average causal effects(CACE) 307, 308, 309

computed tomography (CT), spiralfor lung cancer detection161–2

computers 39data management 40security of system 41software 40–1

condomsfemale 321male 322

confidence intervals 32, 35, 37treatment-specific 38

confoundingdepression studies 299–300variables 13

consentrandomised controlled trials

14–15see also informed consent

Consolidation of the Standards ofReporting Trials(CONSORT) guidelines 32,33, 279, 353

continual reassessment method(CRM) 97

contraception 315–33acceptability of methods

330–1barrier methods 322

women 321controls 326discontinuation 330–1dose levels 324–5efficacy

end-point 326surrogates 324

emergency 319–20effectiveness/efficacy studies

329–30pregnancy rate 329–30side-effects 331

equivalence trials 331–2

Page 399: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

406 INDEX

contraception (continued )ethics of trials 325, 326–7hormonal methods 315, 316

men 322vaginal bleeding patterns

331women 316–20, 324

implants 319, 322informed consent for trials

326–7injectable 318–19, 322intention to treat (ITT) 330introductory trials 332men 316

hormonal 322non-hormonal 317, 322–3

methods for clinical trials323–33

natural methods 321, 322non-hormonal methods

315–16men 317, 322–3women 317, 320–2

Phase I trials 323–5Phase II trials 323–5

metabolic studies 325Phase III trials 325–31

acceptability of methods330–1

allocation concealment 327behavioural patterns 329blinding 327design 325–6effectiveness/efficacy 327,

329–30objectives 325pregnancy rate estimation

328–9randomisation 327recruitment 326–7side-effects assessment

330–1trial size 325–6

Phase IV trials 332progesterone-only 316–18randomised controlled trials

323, 325–31side-effects 330–1sterilisation 322–3subgroup analysis 330systematic reviews 332–3vaginal bleeding patterns 331women 315–16

barrier methods 321hormonal methods 316–20,

324

non-hormonal methods317, 320–2

Phase I trials 324–5controls

contraception 326no treatment 15Phase III trials 15–16placebo 15

coping skills/strategies 263,290–1

copper IUDs 320–1coronary artery bypass graft 181,

184Coronary Artery Bypass Graft

(CABG) Patch trial 184Coronary Drug Project 7, 177–8corticosteroids see

glucocorticosteroidscortisol 363, 384

dose–response curves 385–6cost analysis 28cost benefit 28cost effectiveness 28

ratio 397cost minimisation 28cost utility 28costs

direct 28, 396indirect 28, 396intangible 396

Cotton, Henry A 236–7Cox, David 238Cox proportional hazards model

39melanoma 154

Cox regression models 389cranial irradiation, prophylactic in

small-cell lung cancer165–6

crossover study design 223, 339Alzheimer’s disease 249asthma 381caveats 376dentistry 195–6respiratory diseases 381

crossover trials 19respiratory diseases 375–6

cyclo-oxygenase 2 (COX-2)inhibitors 123

Cyclofem 325, 331cyclosporine 214cytarabine 145cytostatic drugs 167–8

Dabao (Chinese herbal medicine)226

Danggui Buxue Tang (Chineseherbal medicine) 76–7

Daniel, Book of (Old Testament)3

Danshen and Radix PuerariaeCompound (Chinese herbalmedicine) 79–80

dataaccumulation in adaptive design

96aggregation 201checking 40collection sequence 39–40cost 347diary cards 371, 372graphical presentation 35hierarchical analysis 201–2imputation 303, 377–8inverse probability weighting

303likelihood analysis 303management 39–41

quality 39–40software 40–1

missingasthma studies 388depression trials 302–3older people trials 60respiratory disease trials

376–8monitoring 7–8, 31

committees 31dentistry studies 195

multilevel analysis 201–2pre-processing for asthma

studies 388processing 39quality 39–40transcription/entry to computers

40Declaration of Helsinki 7, 11defibrillators 183, 184dementia

associated with Lewy bodies246

patients and informed consent59

see also Alzheimer’s diseasedental caries 193

aetiology 194diet 202measurement 194prevention 197–8split-mouth study design

196–7treatment 197–8, 204

dental filling materials 197

Page 400: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 407

dental floss 205dental fluorosis 203dental prostheses 193dentistry 193–205

blinding of studies 197clinical trials 197–200Cochrane Collaboration

200–1control groups 196crossover study design 195–6dental practice impact of trials

202–5evidence-based 200–1multilevel modelling 201–2parallel study design 195–6patient satisfaction 199QoL 199randomised controlled trials

194–5split-mouth study design

196–7, 198systematic literature review

200–1dentures, complete 198, 204depot-medroxyprogesterone

acetate (DMPA) 318–19,322

trials 325, 331depression 297–310

causal effect of treatment298–9

estimator 300confounding 299–300external validity of trials 298group averages comparison

299–300internal validity of trials

297–8loss to follow-up 300missing data 302–3observed outcomes 299older people trials 56patient preference designs

308–9psychotherapy 298–9random allocation 300randomisation 299–300randomised consent 308randomised controlled trials

297dermatitis, atopic 214

outcome measures 219dermatology 211–29

classification 211–12non-pharmacological treatments

213primary care 213

randomised trials 215–16score systems 218, 220topical treatment 213see also skin disorders

desogestrel 317dextrothyroxine 177–8diaphragm, contraceptive 321diary card studies 377

asthma 387–90COPD 392long-term 371–3, 381

data analysis 372diet

dental caries 202intervention trials 185low-allergen 216

Dietary Approaches to StopHypertension (DASH) trial185–6

dietary changecardiovascular disease studies

185dentistry studies 197–8

dietary fibre, colorectal cancerprevention 122

difluoromethylornithine 123digitalis 180disease-free survival 145disodium cromoglycate 364dithranol 224docetaxel 164dose–response curves

asthma 369, 381–2cortisol 385–6

dose–response studies 394–5asthma 381–2, 391

double-blind trial 15double-dummy technique 374Down’s syndrome 246doxorubicin

breast cancer 90childhood cancer 112–13

drugsherbal medicine interactions

71, 72–3metabolism 13potency ratio 386psychiatric 236therapeutic equivalence

392–4therapeutic ratio 394–5

dyslipidaemia in visceral obesity18

dyspnoea score 370

Early Breast Cancer Trialists’Collaborative Group(EBCTCG) 91

early stopping rules see stoppingrules

Eastern Cooperative OncologyGroup (ECOG) 151–2

ebastine 227economic evaluation 28

gynaecology studies 347, 348,349

infertility studies 349respiratory diseases 396–7

effectiveness studiesanxiety disorders 255–6cognitive behavioural therapy

281contraception 327, 329–30pragmatic trials 338–9variables 397

efficacy 13documenting in respiratory

disease studies 386–92herbal medicine 68, 69interim report 32older people trials 55paediatric clinical trials 49relative 28

efficacy studiesasthma 380–3, 386–91cognitive behavioural therapy

281contraception 327, 329–30explanatory trials 338–9non-randomised 13–14respiratory diseases 380–3

electric shock treatment 238, 239eligibility

older people trials 56–7requirements for randomised

controlled trials 14, 15wide criteria 57

emphysema 360encainide 180endometrial ablation 346

exclusions from trials 352Endometriosis Health Profile 346endpoints 25–6

Chinese medicine 74defining 25dual 38measures 35, 36–7multiple 37single measures 25–6statistical analysis 39

Page 401: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

408 INDEX

Enhancing Recovery in CoronaryHeart Disease (ENRCHD)study 186

enrichment trial designs 248–9enrolment predictors 58epidermal growth factor receptor

(EGFr) 164, 167equipoise 14equivalence trials 19–21

contraception 331–2infertility 342–3periodontal diseases 199–200see also therapeutic equivalence

erectile dysfunction, Korean redginseng 19

escalation strategies, anxietydisorders 260

ethical necessity 11ethics 6–7

Alzheimer’s disease 252–3childhood cancer studies

111–12contraceptive trials 325,

326–7dentistry studies 195gynaecology trials 354infertility trials 354paediatric clinical trials 50–1,

51–2randomisation 183–4small-cell lung cancer trials

167ethinyl oestradiol 316, 331

emergency contraception319–20

etoposide 165etretinate 217event-free survival 145, 146evidence-based practice 12,

13–14cognitive behavioural therapy

280–5dentistry 200–1

Ewing’s sarcoma, competing risks38

exclusion criteria 56exercise programmes 185exercise tests 370explanatory trials 338–9

clinical outcomes 345external validity of trials 298

factorial design 18–19, 339–40faecal occult blood testing

123–4false-negative errors 96

false-neutral errors 96false-positive errors 96fertilisation rates 341fish oil, dyslipidaemia in visceral

obesity 18Fisher, Ronald A 5, 238–9flecainide 180fluoridation of water 197, 198,

202–3alternatives 203

fluorideprofessional application to teeth

204supplements 203

topical 2045-fluorouracil

colon cancer adjuvant treatment125, 126

with leucovorin 132–4colon cancer treatment 127–8gastric cancer adjuvant therapy

119–20leucovorin combination

132–4pancreatic cancer 120–1post-surgical adjuvant treatment

for gastrointestinal cancers130

rectal cancer adjuvant therapy128

focal infection theory 236–7focus groups 258–9follicle-stimulating hormone

(FSH) 323follow-up information 15

repeated evaluation 40Food and Drug Administration

(FDA, US)consent in paediatric clinical

trials 47Modernization Act 46, 112

forced expiratory volume in onesecond (FEV1) 365–6

asthma challenge tests 368diary cards 371, 372electronic devices 371

forced vital capacity 365Formula A and Formula B

(Chinese herbal medicine)77–8

Framingham Heart Study 176full factorial study design, anxiety

disorders 268–9functional residual capacity (FRC)

366

gangliosides 152gastric cancer 119–20

adjuvant therapy 119–20advanced disease 120extended lymph node dissection

119localised disease 119–20palliative therapy 120

gastrointestinal cancers 117adverse event reporting 132incidence 117post-surgical adjuvant treatment

129–30prognosis 117safety monitoring in trials

131–2staging 117subgroup analysis 129–30toxicity in trials 132tumour marker studies 130–1see also named cancers

gemcitabinenon-small-cell lung cancer

164pancreatic cancer 121

general estimating equations 26General Oral Health Assessment

Index (GOHAI) 194gestodene 317gingivitis 194glass ionomer cements 198, 204Glenner, George 244glucocorticosteroids 363–4

asthma 363, 371, 381diary card studies 387–90dose reduction study 390–1level of use 387

plasma cortisol dose–responsecurves 385–6

side-effects 363–4systemic effects 383–5

GMK ganglioside vaccine 152,153, 155–6

gold standard 12gonadotrophin hormone-releasing

hormone (GnRH) 323graft-versus-leukaemia effect

143granulocyte–macrophage colony

stimulating factor (GM-CSF)145

graphs 35group sequential methods 7gynaecology

blinding 351–2cluster randomised trials

340–3, 351

Page 402: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 409

data collection 353economic evaluation 347,

348, 349ethics 354exclusions from trials 352follow-up 352–3interventions 344outcome

definition 344–5measures 347, 348, 349

patient satisfaction 346–7QoL 345–6randomisation 349–51recruitment to trials 353results presentation 353sample size 349stopping rules 354study population 343–4subgroup analysis 349, 353treatment allocation 350

concealment 351

haematological cancers 141–6Hamilton Rating Scale for

Depression (HRSD) 302hay fever 361, 392Haybittle–Peto boundary 8hazard ratio (HR) 29, 34

melanoma 154health economics 28Heart Outcomes Prevention

Evaluation (HOPE) study179

Helmont, Jan Baptista van 5hepatocellular carcinoma

iodine-131-labelled lipiodol31, 33

Kaplan–Meier survival curves36

HER-2 89HER-2/neu expression in breast

cancer 93–4herbal medicine 63, 65–6

adverse effects 70–1, 72–3,73–4

alopecia androgenetica 226China 68–71, 72, 73–4clinical trials 66–8disease concept 66drug interactions 71, 72–3efficacy 68, 69extraction 68methodologies 67old approaches 67–8post-marketing surveillance

69

safety 68, 69stages recommended for

research 69toxicity clearance 69treatment aim 66

herbschemistry 67finger-printing techniques 68growing 67species 67uniformity 67, 68

hierarchical linear modelling201–2

Hill, A Bradford 5, 6, 239histamine 368–9history of clinical trials 3–8Hodgkin’s disease, childhood

104holistic approach, Chinese

medicine 69hormone replacement therapy

250, 345follow-up of subjects 353

human chorionic gonadotrophin(hCG) 322

Human Fertilisation andEmbryology Authority(HFEA, UK) 354

Human Research Protections(OHRP, US) 112

Human Rights Act (UK) 284–5hydroxymethylglutaryl coenzyme

A (HMG-CoA) reductaseinhibitors 178

hyperlipidaemia 177treatment guidelines 178

hyperlipoproteinaemia 178hypertension 175, 176, 177

treatment guidelines 178hypothesis testing 34–5hysterectomy 345, 346

exclusions from trials 352

ifosfamide 164ileal bypass surgery 182imatinib mesylate (STI571) 143imipramine

panic disorders 277psychotherapy study 301–2

immunocontraceptives 322, 323implantation rates 341imputation 303, 377–8

linear interpolation 388multiple 378

in vitro fertilisation (IVF) 339,341

infants 50infertility

blinding 351–2cluster randomised trials

340–3, 351data collection 353dropouts from study 349economic evaluation 349equivalence trials 342–3ethics 354exclusions from trials 352follow-up 352–3intention to treat (ITT) 343minimisation 350–1outcome

clusters 341definition 344–5measures 348, 349

patientspreference trials 341–2satisfaction 347

per protocol analysis 343QoL 345–6quasi-randomised trials 341randomisation 349–51recruitment to trials 341–2,

353results presentation 353sample size 349stopping rules 354study population 343–4subgroup analysis 349, 353systematic review 343treatment allocation 341, 350

concealment 351trials 339, 340

informed consent 7children 111contraception trials 326–7dentistry studies 195information understanding

instrument 58older people trials 58–60paediatric clinical trials 47,

51–2proxies 59randomised controlled trials

308inspiratory vital capacity (IVC)

366insulin coma 237intention to treat (ITT) 33–4,

300contraception 330infertility 343psychotherapy 306–7, 308

Page 403: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

410 INDEX

interferon α (IFN-α), renalcarcinoma 21, 33

interferon α2b (IFN-α2b)melanoma 154, 155, 156–7

adjuvant trials 151–2, 153interim analyses 7internal validity of trials 297–8International Conference on

Harmonisation (ICH),guideline on paediatricstudies 47–51, 51

intervention trials 15intra-cluster correlation coefficient

(ICC) 341intracytoplasmic sperm injection

(ICSI) 341intrauterine contraceptive device

(IUD) 320–1, 328–9intrauterine insemination (IUI)

339inverse probability weighting

303iodine-131-labelled lipiodol 31,

33irinotecan

colon cancer treatment 127–8non-small-cell lung cancer

164small-cell lung cancer 165

isotretinoin 214

Jadelle (contraceptive implant)319

Kaplan–Meier survival curves35, 38, 39

AML 146asthma 389, 390breast cancer 91–2pregnancy rate estimation

328, 329King’s College Questionnaire for

Urinary Incontinence 345Korean red ginseng 19Kraepelin E 243

Laborit, Henri 239Lan and DeMets method 8large simple trials 15last value extended (LVE) 377–8left ventricular assist device 183leg ulcers 215

outcome measures 219legislation, paediatric clinical trials

47

leucovorincolon cancer adjuvant treatment

125, 126with 5-FU 132–4

colon cancer treatment 127gastric cancer adjuvant therapy

120post-surgical adjuvant treatment

for gastrointestinal cancers130

leukaemiachildhood 103, 104drug-resistant 142see also acute lymphoblastic

leukaemia (ALL); acutemyeloid leukaemia (AML)

leukotriene modifiers 364levamisole 126, 134levonorgestrel 316, 319

emergency contraception319–20

equivalence trial 332IUDs 320–1vaginal ring 331

Lewy bodies 246life table techniques 328lifestyle change 185likelihood analysis 303Lind, James 4, 5lipid lowering trials 177–8, 180Lipid Research Clinics Coronary

Primary Prevention Trial178

living wills 59lobotomy 237logrank test 34, 39

breast cancer 94Long Term Strategies in the

Treatment of Panic Disorder256

Louis, Pierre-Charles-Alexandre4

low density lipoprotein (LDL)cholesterol 178, 179

lung cancer 159–68chemotherapy 166–7classifications 159–60, 161clinical trials 161–6cytostatic drugs 167–8dose escalation 166dose-limiting toxicity 166early detection 161–2incidence 161latency 159maximum tolerated dose 166methodology of trials 166–8minimax trial design 166–7

non-small-cell 160adjuvant adenovirus for p53

167adjuvant chemotherapy

163chemoprevention 162chemotherapy 164chemotherapy plus radiation

therapy 164locally advance bulky stage

IIIA/IIIB disease163–4

metastatic 164non-bulky stage IIIA disease

162–3post-operative thoracic

radiotherapy 162–3pre-operative chemotherapy

plus surgery 163stage I disease 162stage II disease 162–3stage IV disease 164targeted therapy 164treatment studies 162–4

optimality criterion 166–7prognosis 161screening 161–2small-cell 159–60, 161

chemotherapy 165chemotherapy plus chest

irradiation 165ethics of trials 167first-line therapy 165methodology of trials 167prophylactic cranial

irradiation 165–6radiotherapy fractionation

165recurrent disease 165second-line therapy 165treatment 164–6

staging 160, 161targeted therapy 167–8TNM staging system 160

lung function measurements365–6

mammography, breast cancer 87,90–1

mandibular deficiency 205mastectomy, radical 89matrix metalloproteinase inhibitor

(MMPI) 167pancreatic cancer 121

maxillary arch expansion 200,205

Page 404: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 411

maximum tolerated dose (MTD)lung cancer 166paediatric clinical trials 109

measurement scales in respiratorydiseases 364–7

medroxyprogesterone acetate,depot (DMPA) 318–19,322

trials 325, 331melanoma 149–57

adjuvant therapy 151antibody responses 156Breslow’s thickness 149–50,

156Cox models 154GMK ganglioside vaccine

152, 153, 155–6hazard ratio 154high-risk subcategories 156IFN-α2b 151–2, 153, 154,

155, 156–7lymph node dissection 150,

151metastases 149–50, 153overall results of trials 154–5overall survival 152–3, 154,

156p-values 153–4, 155patient characteristics 155post-relapse salvage therapy

151, 153prognosis 149, 156regional node staging 150–1

clinical 150surgical 150–1, 153

relapse-free survival 152–3,154, 156

sentinel lymph node biopsy150–1

statistical aspects of trials153–7

subset analysis 156–7surgery 150trial size 154–5type I error rate 155

Menopause Rating Scale 346menorrhagia 345

patient satisfaction 346randomisation 350study population 343

menstrual diary analysis 331Menstrual Distress questionnaire

346mental illness 235Mesigyna 325, 331meta-analysis

breast cancer 91

reporting 32respiratory diseases

397methacholine 368–9

provocation test 377methodologies in herbal medicine

67methotrexate 214mifepristone 319, 330milk fluoridation 203minimal effective dose (MED)

381, 390–1minimisation in infertility trials

350–1mitoxantrone 145Mode Selection Trial in Sinus

Node Dysfunction (MOST)184

modern medicine 64deficiencies 65

Modified Borg scale 370molecular targeted therapies

167monitoring progress of trials

31–2Moniz, Egas 237Monro, John 236Monte Carlo simulation 96more than two group design

17–18moricizine 180mouthrinse trials 197, 199MRC/BHF Heart Protection Study

178Multicentre Automatic

Defibrillator ImplantationTrial (MADIT) 183, 184

multilevel modelling 26, 60dentistry 201–2

multiple comparisons 36–7myelotoxic drug tolerance 109myocardial infarction

176, 183

Naranjo’s grading of adverseeffects in Chinese medicine71

Nasal Allergen ChallengeArtificial Season model370–1

nasal patency assessment 366–7nasal polyposis 361–2National Bioethics Advisory

Commission (US) 112National Breast Cancer Coalition

fund (NBCCF, US) ClinicalTrial Initiative 88

National Bureau on Drug Control(China) 68

National Centre forComplementary, AlternativeTreatment (NCCAM, US)68

National Institute on Aging (NIA,US) 244

National Institutes of Health (NIH,US) 5

acupuncture for pain control80

complementary medicine 68Inclusion of Children Policy

51paediatric subject exclusion

45–6Nazi regime (Germany) 6nedocromil sodium 364neuroblastoma

childhood 105staging system 108translational research 106

neurofibrillary tangles 245newborn infants 50nicotinic acid 177–8non-Hodgkin’s lymphoma,

childhood 104–5non-inferiority trials 21non-randomised studies 13–14non-steroidal anti-inflammatory

drugs (NSAIDs)Alzheimer’s disease 245colorectal cancer prevention

123norethindrone 316norethisterone enantate (NET-EN)

318–19, 322norgestimate 317norgestrel 316dl-norgestrel 319Norplant (contraceptive implant)

319nortriptyline, older people trials

56Nottingham Health Profile 373null hypothesis 30, 34–5number needed to treat (NNT)

35–6, 292psychotherapy 307, 308

Nuremburg Code 6–7

O’Brien–Fleming method8, 110

oesophageal cancer 118–19advanced 118localised 118–19

Page 405: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

412 INDEX

oesophageal cancer (continued )pre-operative

radiochemotherapy 118survival 119

oestradiol cypionate 325oestrogens

Alzheimer’s disease 245compounds 123equine 177–8oral contraceptives 316synthetic 316

oestrone sulphate 331older people, clinical trials

55–61AML 143–4, 146atherosclerosis 177decision making 58–9eligibility 56–7follow-up 60informed consent 58–60nature of trial 59nursing home residents 59–60understanding 59

oocytes 341retrieval 345

oral cancer 193oral contraceptives 316–18

cardiovascular disease317–18

safety concerns 317Oral Health Impact Profile (OHIP)

194, 199oral hygiene studies 195oral rehabilitation studies 198–9orphan drug regulation 46orthodontic treatment 193, 200

randomised controlled trials200, 205

timing 205Osler, William 7osteogenic sarcoma, childhood

105, 112–13outcome measures

clusters 341gynaecology 347, 348, 349infertility 348, 349pragmatic trials 349skin disorders 218, 219,

220–2, 227–8outcomes

clinical 345definition in

gynaecology/infertility344–5

primary 349secondary 349

overdentures

implant-retained mandibular198, 204

maxillary implant 199overview groups 12overviews, reporting 32oxaliplatin 127, 134

p-value 35, 37p53 tumour marker 131

adjuvant adenovirus 167pacemaker implantation 181, 184paclitaxel 164paediatric clinical trials 45–6

age classification of patients50

assent to participate 52dose-limiting toxicity 109drug development 108–9early testing 53efficacy 49ethics 50–1, 51–2ICH Guideline 47–51informed consent 47, 51–2initiation 48legislation 47maximum tolerated dose 109recruitment 52–3regulatory issues 46–7safety 49–50safety information lack 46study management 53types of studies 48–50see also cancer, childhood

pancreatic cancer 120–1panic attacks 267panic disorders

cognitive behaviouraltreatments 269

imipramine 277placebo–drug comparison 283

parallel study designcontraceptives 325–6dentistry 195–6randomised control 248–9respiratory diseases 376, 379simple 339

Pare, Ambroise 4Park study 370–1partially randomised patient

preference (PRPP) 342pathological process response 70patient motivation in skin

disorders 225–6patient preference trials

infertility 341–2psychotherapy 308–9skin disorders 225–6

patient satisfactiongynaecology 346–7infertility 347skin disorders 225

peak expiratory flow (PEF) 365,366

diary cards 371, 372, 389electronic devices 371

peak nasal flow 367, 372Pearl index 326, 328Pentothal 239per protocol analysis 343per protocol summary 33periodontal diseases 193

aetiology 194equivalence trials 199–200measurement 194randomised controlled trials

199split-mouth study design 196trials 199–200, 205

Petrarch 3pharmaceutical companies,

paediatric data requirements46

pharmacists, skin disorders 213pharmacodynamics 13

in children 46paediatric clinical trials 49

pharmacoeconomics 396–7see also economic evaluation

pharmacokinetics 13paediatric clinical trials 46,

48–9Phase I trials 12–13, 337–8

Alzheimer’s disease 247–8childhood cancer 108, 110–11contraception 323–5lung cancer 166respiratory diseases 380–6safety monitoring 131–2skin disorders 222–5

Phase II trials 12–13, 338Alzheimer’s disease 248childhood cancer 109–10contraception 323–5historical comparison 110lung cancer 166–7oesophageal cancer 119proving activity 109randomised comparison 110respiratory diseases 380–6safety monitoring 131–2skin disorders 222–5testing activity 109–10within-patient control studies

223–5

Page 406: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 413

Phase III trials 12, 14–16, 338Alzheimer’s disease 248–50childhood cancer 102, 103,

110–11colon cancer adjuvant treatment

125contraception 325–31control therapy 15–16cure model 110double-blind for supportive

cancer care 111lung cancer 167proportional hazards 110respiratory diseases 386–95safety monitoring 131–2skin disorders 225–8standard therapy 15–16

Phase IV trials 12, 15–16, 338respiratory diseases 395–6

photochemotherapy 213phototherapy 213

PUVA 214, 228ultraviolet B 214, 216

Phyllanthus SP (Chinese herbalmedicine) 75–6

pilosebaceous opening plugging213

placebo 6acupuncture 80–1Alzheimer’s disease 252–3controls 15psychotherapy study 301–2rhinitis studies 374skin disorder trials 226–7

plaque control 205plaque index, pre-/post-brushing

195plaque removal 199

blinding of study 197mechanical 205split-mouth study design 196

Pocock boundaries 7–8pollen allergy 361, 370–1

season onset 392post-marketing surveillance

15–16adverse effects in Chinese

medicine 71, 73herbal medicine 69paediatric clinical trials 50

potency ratio 386pragmatic trials 338–9

clinical outcomes 345outcome measures 349

pravastatin 178pre-randomisation variables 38pregnancy rate

emergency contraception329–30

estimation 328–9PREMIER trial 185–6presenilin 244preterm newborn infants 50Prevention of Events with

Angiotensin-ConvertingEnzyme Inhibitor Therapy(PEACE) trial 179–80

prevention trials 15primary care

anxiety disorders 258dermatology 213

primary prevention 178Alzheimer’s disease 251cardiovascular disease 175

primordial prevention 175progesterone 319progesterone-only pills (POPs)

316–18progestins 316–17

implants 319Program on the Surgical Control

of Hyperlipidaemia(POSCH) 182

Propionibacterium acne 213proportional hazards regression

models 94protein kinase inhibitors 167psoralen plus ultraviolet A

phototherapy (PUVA) 214follow-up study 228

psoriasis 214–15dithranol treatment 224outcome measures 219randomised controlled trials

225, 226treatment 216trials 217

Psoriasis Area and Severity Index(PASI) 215, 218, 220, 221

psychiatry 235early twentieth century

treatment 236–8late twentieth century 238–40physical therapies 237–8, 239randomisation 239removal of foci of infection

236–7see also named conditions

psychosischronicity 286duration of untreated 286functional 236–7

psychotherapy 276–7adherence to 300–1

assessment method 302–3brief dynamic 300causal effects 306–8centre effects 303–4choice of 300–1clinical management 301–2comparison of types 299components 288control group 301–2depression 298–9dropouts from study 302–3effect modification 306efficacy evaluation 300equipoise 301intention to treat (ITT) 306–7interpersonal 300, 305missing data 302–3number needed to treat (NNT)

307outcome features 302–3patient preference designs

308–9randomisation 301randomised consent 308standardisation 300–1therapist effects 303–5treatment manual 301treatment protocol 289

Q-TWiST method 146quality of life (QoL) 26, 27, 28

AML 146asthma 373breast cancer 91Chinese medicine 73–4COPD 373data sheets 74dentistry 199evaluation 40gynaecology 345–6infertility 345–6leg ulcers 215multiple comparisons 37skin disorders 225, 228

quasi-randomised trials 341questionnaires 26, 27, 28

Radiation Therapy OncologyGroup (RTOG) 118

radiochemotherapy, pre-operative118

radiotherapy, post-operativethoracic for non-small-celllung cancer 162–3

radon exposure 159raloxifene 91

Page 407: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

414 INDEX

random effects models 60randomisation 5, 13, 24–5, 297

block 216, 350cluster 340–3cognitive behavioural therapy

285–8concealed 15contraception trials 327depression studies 299–300ethics 183–4gynaecology studies 349–51infertility studies 349–51list 15psychiatry 239psychotherapy 301ratio 25respiratory disease studies

374–5skin disease trials 215–16stratification 24, 350unbalanced 96understanding by subjects 58

randomised consent 226, 308randomised control parallel trial

designs 248–9randomised controlled trials

(RCTs) 14, 238–9causal effects 306–8cluster 340–3cognitive behavioural therapy

274–5comparison of randomised

groups 306consent 14–15contraception 323, 325–31cost data 347dentistry 194–5depression 297eligibility requirements 14, 15equipoise 301informed consent 308multicentre 303orthodontics 200, 205periodontal diseases 199placebo-controlled 227skin disorders 218, 228

psoriasis 225, 226summarising results of

several trials 228–9systematic review 343therapist effects 303–5

Randomised Evaluation ofMechanical Assistance forthe Treatment of CongestiveHeart Failure (REMATCH)183–4

Rapid Early Action for CoronaryTreatment (REACT) trial186

recruitment 34bias in cognitive behavioural

therapy 281–3gynaecology trials 353infertility trials 353older people trials 56, 57paediatric clinical trials 52–3strategies 57

rectal cancerpostoperative radiotherapy

129preoperative radiotherapy

128–9surgery/adjuvant therapy

128–9total mesorectal excision

(TME) 129treatment 128–9

referees, guidelines 32regression models 13regression techniques 35Regulatory Agencies 39regulatory authorities, paediatric

clinical trials 46–7Relieve Wheezing Tablet (Chinese

herbal medicine) 78–9remission, length of 145–6renal carcinoma, IFN-α 21, 33reporting of clinical trials 32, 33rescue medication 375

asthma 389β2-agonists 380

resin-based composite (RBC)204

respiratory cancers see lung cancerrespiratory diseases 359–97

airway diseases 359–73bias 377bioequivalency 392–3blinding 373–4bronchodilatation 382–3crossover trials 375–6, 381current treatments 362–4design of trials 375–6diary card studies 371–3, 377dropout from study 377efficacy

documenting 386–92studies 380–3

inhalation products 374,392–3

interim analysis 376lung function measurements

365–6

measurement scales 364–7meta-analysis 397methods for trials 373–80missing data 376–8multiple comparisons 378parallel study design 376, 379pharmacoeconomics 396–7Phase I/II studies 380–6Phase III trials 386–95Phase IV studies 395–6protocol violations 375randomisation 374–5safety documenting 386–92sample size determinations

378–9sequential trials 376subgroup analysis 379–80systemic effects of drugs

383–6systemic exposure to drugs

393therapeutic equivalence

392–5treatment costs 396

respiratory tract, upper,inflammation 361

retinoblastoma, childhood 105all-trans retinoic acid (ATRA)

142retinopathy of prematurity

(retrolental fibroplasia) 5retrospective review, solid tumour

cancers 29reversibility of lung function 366rhabdomyosarcoma

childhood 105risk assignment 106, 107

rheumatic heart disease 5rhinitis 359, 361–2

blinding of studies 374exposure studies 370–1glucocorticosteroids 363perennial 392placebo groups 374seasonal 392studies 392

rhinomanometry, posterior 366risks, competing 38–9rofecoxib 123

safetyasthma studies 386–91documenting in respiratory

disease studies 386–92herbal medicine 68, 69paediatric clinical trials 49–50

Page 408: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

INDEX 415

safety monitoring 7, 31gastrointestinal cancer trials

131–2paediatrics 46

St George’s RespiratoryQuestionnaire 373

salbutamol (albuterol) 393salt fluoridation 203sample size 30–1

dentistry studies 195determination 37respiratory diseases 378–9

Sargant, William 237–8, 239Scandinavian Simvastatin Survival

Study (4S) 178schema change 291schizophrenia

acute care 277–9cognitive behavioural therapy

273–93maintenance therapy 277–9medication 278recruitment bias 281–3residual symptoms 277–8

scurvy 4, 5seasonal allergic rhinitis 361secondary prevention

Alzheimer’s disease 251cardiovascular disease 175

selective oestrogen-receptormodulators (SERMs) 89,91

selective serotonin reuptakeinhibitors (SSRIs) 239

selegiline, Alzheimer’s disease249–50, 252

selenium 162Self-Administered Psoriasis Area

and Severity Index (PASI)220

self-controlled study design 223senile plaque 245sequential trials 21–2

closed 21, 23open 21, 22respiratory diseases 376

sertraline, older people trials 56sexual intercourse

coitus interruptus 322periodic abstinence 321

sexually transmitted diseases 321Shepherd, Michael 239Short Form Health Survey (SF-36)

26, 27, 345, 373Sickness Impact Profile 373sigmoidoscopy, flexible 124simvastatin 178

skin disorders 211–15accessory care 217blinding 216–17clinical trial methods 215–18,

219, 220–2disease-modifying strategies

216disease severity 217–18dropouts from studies 228entry criteria to studies 226management 216measurement error 223outcome measures 218, 219,

220–2, 227–8patient motivation/preference

225–6patient satisfaction 225pharmacists 213Phase I and II studies 222–5Phase III trials 225–8placebo use 226–7prevalence 212QoL 225, 228randomisation 215–16randomised controlled trials

218, 228summarising results of

several trials 228–9score systems 228severity

criteria 218spectrum 212

study settings 217–18time frame for evaluation

227–8treatment

modality standardisation217

penetration 222topical 213variation 213

skin failure 211Social Phobia Inventory (SPIN)

265social phobics 267SoCRATES Trial 275spermicides 321spirometry 365–6split-mouth study design, dentistry

196–7, 198sputum production 361ST 1435 (progestin) 319standard codes 40–1standard error (SE) 35standardised effect size 30Stanford Heart Transplant

Program 13

statins 178statistical analysis 24

computer use 39guidelines 32software 39

statistical significance 29stem cell transplantation,

allogeneic 143sterilisation 322–3, 328Sternback, Leo 239stochastic curtailment 7stopping rules 31–2

breast cancer 95–6infertility/gynaecology trials

354streptomycin, use in pulmonary

tuberculosis 5, 6, 239stringency 8stroke 176Student t-test 34subgroup analysis 37–8

contraception 330gastrointestinal cancers

129–30infertility/gynaecology trials

349, 353respiratory diseases 379–80

sugar alcohols 202sugar consumption 202superiority trials 343surrogates 13survival

analysis 38disease-free 145, 146, 152–3event-free 145, 146overall 146, 152–3, 154, 156relapse-free 145, 152–3, 154,

156survival time methods 25, 26symptoms

identification 69measurement in cognitive

behavioural therapy291–2

pathological causes 66suppression 66

syphilis, Tuskegee study 6systematic review of infertility

343Systolic Hypertension in the

Elderly program 180

tacrine 249tamoxifen, breast cancer 15, 89,

90, 91taxanes 164

Page 409: Textbook of Clinical Trialsjonspharmacy.weebly.com/uploads/2/1/9/2/21923694/...This Textbook of Clinical Trials is not a textbook of clinical trials in the traditional sense. Rather,

416 INDEX

teethextraction for orthodontics

205fissure sealant 195–6, 198,

204fluoride application 204implants 193, 198–9, 204malalignment/malocclusion

193, 200replacement 204–5

tertiary prevention, Alzheimer’sdisease 251

testosterone, injectablecontraceptives 322

tetrahydroaminoacridine (Cognex)244

theophylline 362–3therapeutic advantage, small

15therapeutic equivalence 20,

392–4marketing 393–4

therapeutic index, relative394–5, 396

therapeutic ratio 394–5thoracotomy 182time-to-event studies 26TNM staging system 117

lung cancer 160toddlers 50toothbrushing 205toothpaste trials 199

antimicrobial agents 205fluoridation 203–4

topical drugspharmacodynamic parameters

223skin penetration 222

topoisomerase I inhibitorsnon-small-cell lung cancer

164

small-cell lung cancer165

topotecannon-small-cell lung cancer

164small-cell lung cancer 165

tradition, strong 70traditional healing 69–70trandolapril 179Transitional Dyspnoea Index 370translational research 103, 105

ancillary studies 111neuroblastoma 106

trastuzumab 89treatment

efficacy interim report 32random allocation 14

Treatment of DepressionCollaborative ResearchProgram (TDCRP) 301

trial size 29–31tuberculosis, pulmonary and

streptomycin use 5, 6, 239tumour markers, gastrointestinal

cancers 130–1Tuskegee (Alabama, US), syphilis

study 6Type I error 7, 8Type I error rate 30

melanoma trials 155Type II error rate 30

ulceration, leg 215, 219ultraviolet B phototherapy 214,

216urinary incontinence 345

study population 343–4ursodeoxycholic acid 123urticaria 227

vaginal bleeding patterns 331vaginal rings 319validity

external 298internal 297–8

vasectomy 322–3vasoconstrictors 364venous thromboembolism

317–18ventricular arrhythmias 180ventricular ejection fraction, left

179, 183vinorelbine 164vitamin A 162vitamin E in Alzheimer’s disease

249–50, 252

Western lifestyle 177Wilms’ tumour, childhood 105within-patient studies 223–5

comparisons 19Withington, Charles Francis 7Women’s Health Initiative (WHI)

250

xanthines 362–3xylitol 202

Yale–Brown ObsessiveCompulsive Scale 265

Yin and Yang, balancemaintenance 66

Yuzpe regimen 319, 320, 332

ZD1839 (EGFr inhibitor) 164Zelen consent design 308

double 23, 226single 15, 22–4