31st july talk (20021)

Preview:

Citation preview

Clinical Trial Writing IISample Size Calculation and

Randomization

Liying XU (Tel: 22528716)CCTERCUHK 31st July 2002

1Sample Size Planning

1.1 Introduction

Fundamental Points Clinical trials should have sufficient

statistical power to detect difference between groups considered to be of clinical interest. Therefore calculation of sample size with provision for adequate levels of significance and power is a essential part of planning.

Five Key Questions Regarding the Sample Size

What is the main purpose of the trial? What is the principal measure of patients

outcome? How will the data be analyzed to detect a treatment

difference? (The test statistic: t-test , X2 or CI.) What type of results does one anticipate with

standard treatment? Ho and HA, How small a treatment difference is it

important to detect and with what degree of certainty? ( , and .)

How to deal with treatment withdraws and protocol violations. (Data set used.)

SSC: Only an Estimate

Parameters used in calculation are estimates with uncertainty and often base on very small prior studies

Population may be different Publication bias--overly optimistic Different inclusion and exclusion

criteria Mathematical models approximation

What should be in the protocol?

Sample size justification Methods of calculation Quantities used in calculation:

• Variances• mean values• response rates • difference to be detected

Realistic and Conservative

Overestimated size: unfeasible early termination

Underestimated size justify an increase extension in follow-up incorrect conclusion (WORSE)

What is (Type I error)? The probability of erroneously

rejecting the null hypothesis (Put an useless medicine into the

market!)

What is (Type II error)? The probability of erroneously failing

to reject the null hypothesis. (keep a good medicine away from

patients!)

What is Power ?

Power quantifies the ability of the study to find true differences of various values of .

Power = 1- =P (accept H1|H1 is true) ----the chance of correctly identify H1

(correctly identify a better medicine)

What is ?

is the minimum difference between groups that is judged to be clinically important Minimal effect which has clinical relevance in

the management of patients The anticipated effect of the new treatment

(larger)

The Choice of and depend on:

the medical and practical consequences of the two kinds of errors

prior plausibility of the hypothesis the desired impact of the results

The Choice of and

=0.10 and =0.2 for preliminary trials that are likely to be replicated.

=0.01 and =0.05 for the trial that are unlikely replicated.

= if both test and control treatments are new, about equal in cost, and there are good reasons to consider them both relatively safe.

The Choice of and

> if there is no established control treatment and test treatment is relatively inexpensive, easy to apply and is not known to have any serious side effects.

< (the most common approach 0.05 and 0,2)if the control treatment is already widely used and is known to be reasonably safe and effective, whereas the test treatment is new,costly, and produces serious side effects.

1.2 SSC for Continuous Outcome Variables

H0: =C-I=0 HA: =C-I0 If the variance in known If

If H0 will be rejected at the level of significance.

IC

Ic

NN

xxz

11

ZZ

A total sample 2N would be needed to detect a true difference between I and C with power (1-) and significant level by formula:

2

224

2

ZZN

Example 1

An investigator wish to estimate the sample size necessary to detect a 10 mg/dl difference in cholesterol level in a diet intervention group compared to the control group. The variance from other data is estimated to be (50 mg/dl). For a two sided 5% significance level, Z=1.96, and for 90% power, Z=1.282.

2N=4(1.96+1.282)2(50)2/102=1050

Example1a Baseline Adjustment

An investigator interested in the mean levels of change might want to test whether diet intervention lowers serum cholesterol from baseline levels when compare with a control.

H0: =0

HA: 0 =20mg/dl, =10mg/dl 2N=4(1.96+1.282)2(20)2/102=170

Ic

Ic

A Professional Statement A sample size of 85 in each group will

have 90% power to detect a difference in means of 10.0 assuming that the common standard deviation is 20.0 using a two group t-test with a 0.05 two-sided significant level.

Values of f(,) to be used in formula for sample size calculation

( T y p e I I e r r o r )

0 . 0 5 0 . 1 0 . 2 0 . 5( T y p e I

e r r o r )0 . 1

0 . 0 50 . 0 20 . 0 1

1 0 . 81 3 . 01 5 . 81 7 . 8

8 . 61 0 . 51 3 . 01 4 . 9

6 . 27 . 9

1 0 . 01 1 . 7

2 . 73 . 85 . 46 . 6

),(2

fZZ

1.3 SSC for a Binary Outcome Two independent samples

)/(

111/

CICI

ICIC

NNrrp

NNppppZ

2/)( IC ppp

22 /1)(42 IC ppppZZN

Example 2

Suppose the annual event rate in the control group is anticipated to be 20%. The investigator hopes that the intervention will reduce the annual rate to 15%. The study is planned so that each participant will be followed for 2 years. Therefore, if the assumption are accurate, approximately 40% of the participants in the control group and 30% of the participants in the intervention group will develop an event.

960956

3.04.0/)65.0)(35.0(282.196.142 22

N

A Professional Statement A two group x2 test with a 0.05 two-

sided significant level will have 90% power to detect the difference between a Group 1 proportion, P1,of 0.40 and a Group 2 proportion P2 of 0.30 (odds ratio of 0.643) when the sample size in each group is 480.

Table 1.3 Approximate total sample size for comparing various proportions in two groups with significance level () of 0.05 and power(1-) of 0.8 and 0.9

True proportions =0.05(one-sided) =0.05(two-sided)

pC pI 1- 1- 1- 1-

Control group Interventiongroup

0.90 0.80 0.90 0.80

0.6

0.50

0.40

0.30

0.20

0.10

0.500.400.300.200.400.300.250.200.300.250.200.200.150.100.150.100.050.05

8502109050850210130907803301806402701401980440170950

610160704061015090605602401304701901001430320120690

104026012060104025016011096041022079033017024305402001170

7802009050780190120807203101705902501301810400150870

From Table 1.3 You can see:

N The power 1- N The N

Paired Binary Outcome

McNemar’s test

d=difference in the proportion of successes (d=pI-pC)

f=the portion of participants whose response is discordant (the pair of outcome are not the same)

2

2

d

fZZN p

Example 3

Consider an eye study where one eye is treated for loss in visual acuity by a new laser procedure and the other eye is treated by standard therapy. The failure rate on the control, pC, is estimated to be 0.4, and the new procedure is projected to reduce the failure rate to 0.20. The discordant rate f is assumed to be 0.50.

=0.05 The power 1- =0.90 f=0.5 PC=0.4 PI=0.2

1325.02622.04.0

5.0282.196.12

2

Np

1.4 Adjusting for Non-adherence

Ro =drop out rate

RI=drop in rate N=N

If RO=0.20, RI=0.05 N =1.78N

21/ IO RR

1.5 Adjusting the Multiple Comparison

’= /k

k= the number of multiple comparison variables

Table 1.4 Adjusting for Randomization Ratio

Randomization Ratio Increase in total N1:1 01:2 +12.5%1:3 +33%1:4 +56%1:5 +80%1:6 +100%

1.6 Adjusting for loss of follow up

If p is the proportion of subjects lost to follow-up, the number of subjects must be increased by a factor of 1/(1-p).

1.7 Other Factors: the rate of attrition of subjects

during a trial intermediate analyses

Sample size re-estimation Events rates are lower than

anticipate Variability of larger than expected

Without unbinding data and Making treatment comparisons

1.8 Power Calculation(assuming we compare two medicines)

Power Depends on 4 Elements: The real difference between the two

medicines, • Big big power

The variation among individuals, • Small big power

The sample size, n• Large nbig power

Type I error,• Large big power

Sensitivity of the sample size estimate

to a variety of deviations from these assumptions

a power table

Table 1 Statistical Power of the Tanzania Vitamin and HIV Infection Trial (N=960)

Effect of B

0% 15% 30%

Effect of A Loss to follow up

0% 20% 33%

Loss to follow up

0% 20% 33%

Loss to follow up

0% 20% 33%

30% 89% 82% 74% 85% 76% 68% 79% 69% 61%

25% 75% 65% 58% 69% 59% 52% 62% 52% 45%

Example 4 Regret for Low Power Due to Small Sample?

I have a set of data that the mean change between the 2 groups is significantly different (p<0.05).  But when I put calculate the power it gives only 50%.  How should I interpret this? Also, can someone kindly advise as whether it is meaningful (or pointless) to calculate the power when the result is statistically significant?

Books and Software Sample size tables for clinical

studies (second edition) By David Machin, Michael Campbell Peter Fayers

and Alain Pinol Blackwell Science 1997

PASS 2000 available in CCTER

nQuery 4.0 available in CCTER

2. Randomization

Randomization

Definition: randomization is a process by which each

participant has the same chance of being assigned to either intervention or control.

Fundamental Point

Randomization trends to produce study groups comparable with respect to known and unknown risk factors, removes investigator bias in the allocation of participants, and guarantees that statistical tests will have valid significance levels.

Two Types of Bias in Randomization

Selection bias occurs if the allocation process is predictable. If any

bias exists as to what treatment particular types of participants should receive, then a selection bias might occur.

Accidental bias can arise if the randomization procedure does not

achieve balance on risk factors or prognostic covariates especially in small studies.

Fixed Allocation Randomization Fixed allocation randomization procedures

assign the intervention to participants with a pre-specified probability, usually equal, and that allocation probability is not altered as the study processes

• Simple randomization• Blocked randomization• Stratified randomization

Randomization Types

Simple randomization

Simple Randomization Option 1: to toss an unbiased coin for a randomized

trial with two treatment (call them A and B) Option 2: to use a random digit table. A randomization

list may be generated by using the digits, one per treatment assignment, starting with the top row and working downwards:

Option 3: to use a random number-producing algorithm, available on most digital computer systems.

Advantages

Each treatment assignment is completely unpredictable, and probability theory guarantees that in the long run the numbers of patients on each treatment will not be radically different and easy to implement

Disadvantages

Unequal groups one treatment is assigned more often than

another Time imbalance or chronological bias

One treatment is given with greater frequency at the beginning of a trial and another with greater frequency at the end of the trial.

Simple randomization is not often used, even for large studies.

Randomization Types

Blocked randomization

Blocked Randomization (permuted block randomization) Blocked randomization is to ensure exactly equal

treatment numbers at certain equally spaced point in the sequence of patients assignments

A table of random permutations is used containing, in random order, all possible combinations (permutations) of a small series of figures.

Block size: 6,8,10,16,20.

Advantages

The balance between the number of participants in each group is guaranteed during the course of randomization. The number in each group will never differ by more than b/2 when b is the length of the block.

Disadvantages

Analysis may be more complicated (in theory)Correct analysis could have bigger power

Changing block size can avoid the randomization to be predictable

Mid-block inequality might occur if the interim analysis is intended.

Randomization Types Stratified randomization

lym ph sk in breast

Ye s

lym ph sk in breast

N o

U .S .

lym ph sk in breast

Ye s

lym ph sk in breast

N o

Europe

previous exposure

geographic location

site

Stratified Randomization

Stratified randomization process involves measuring the level of the selected factors for participants, determining to which stratum each belongs, and performing the randomization within the stratum. Within each stratum, the randomization process itself could be simple randomization, but in practice most clinical trials use some blocked randomization strategy.

Table 3. Stratification Factors and Levels (323=18 Strata)

Age Sex Smoking history

1. 40-49 yr 1.Male 1. Current smoker

2. 50-59 yr 2 Female 2. Ex-smoker

3. 60-69 yr 3. Never smoked

Table 4 Stratified Randomization with Block Size of FourStrat

a Age Sex Smoking Group assignment

1 2 3 4 5 6 7 8 9 10 11 12

40-49 40-49 40-49 40-49 40-49 40-59 50-59 50-59 50-59 50-59 50-59 50-59 etc.

M M M F F F M M M F F F

Current Ex

Never Current

Ex Never

Current Ex

Never Current

Ex Never

ABBA BABA.. BABA BBAA..

Etc.

Advantages

To make two study groups appear comparable with regard to specified factors, the power of the study can be increased by taking the stratification into account in the analysis.

Disadvantages

The prognostic factor used in stratified randomization may be unimportant and other factors may be identified later are of more importance

MechanismTrial Type

Mechanism

No central registration office Randomization list sealed envelops

Double blind drug trial Pharmacist will be involved

Multi-centre trial Central registration office

Single-centre trial Independent person responsible for patients registration and randomization

An Example of Stratified Randomization

Patients will be stratified according to the following criteria:

1) Treatment center (Hospital A vs Hospital B vs Hospital C)

2) N-stage(N2 vs N3) 3) T-stage (T1-2 vs T3-4)

What should be in the protocol? A dynamic allocation scheme will be used to

randomize patients in equal proportions within each of 12 strata. The scheme first creates time-ordered blocks of size divisible by three and then uses simple randomization to divide the patients in each block into three treatment arms, in equal proportion. The block sizes will be chosen randomly so that each block contains either 6 or 9 patients.

Cont…

This procedure helps to ensure both randomness and investigator blinding (the block sizes are known only to the statistician), as recommended by Freedman et al. Randomization will be generated by the consulting statistician in sealed envelopes, labeled by stratum, which will be unsealed after patient registration.

Adaptive Randomization

Number adaptiveBiased coin method

Baseline adaptive (MINIMIZATION) Outcome adaptive

Biased Coin Method

Advantages Investigators can not determine the next

assignment by discovery the blocking factor.

DisadvantagesComplexity in useStatistical analysis cumbersome

Minimization

Minimization is an well -accepted statistical method to limit imbalance in relative small randomized clinical trials in conditions with known important prognostic baseline characteristics.

It called minimization because imbalance in the distribution of prognostic factors are minimized

Table 1 Some baseline characteristics of patients in a controlled trial of mustine versus talc in the control of pleural effusions in patients with breast cancer (Frientiman et al, 1983)

Treatment Mustine (n=23) Talc(n=23)

Mean age (SE) 50.3(1.5) 55.3(2.2)

Stage of disease: 1 or 2 3 or 4

52% 48%

74% 26%

Mean interval in month between BC diag. and effusion diag. (SE)

33.1(6.2) 60.4(13.1)

Postmenopausal 43% 74%

Minimization Factors

Age ( years) <=50 Or >50

Stage of disease 1 or 2 Or 3 or 4

Time between diagnosis of cancer and diagnosis of effusions(months)

<=30 Or >30

Menopausal Pre Or Post

Table 2 Characteristics of the first 29 patients in a clinical trial using minimization to allocate treatment

Mustine Talc

Age <=50 >50

7 8

6 8

Stage 1 or 2 3 or 4

11 4

11 3

Time Interval

<=30m >30m

6 9

4 10

Menopausal Pre Post

7 8

5 9

Table 3 Calculation of imbalance in patient characteristics for allocating treatment to the thirtieth patient

Mustine (n=15)

Talc (n=14)

Age >50 8 8

Stage 3 or 4 4 3

Time interval <=30m 6 4

Postmenopausal 8 9

Total 26 24

Advantages

It can reduce the imbalance into the minimum level especially in small trial

Computer Program available (called Mini) and also not difficult to perform ‘by hand’

Minimization and stratification on the same prognostic factors produce similar levels of power, but minimization may add slightly more power if stratification does not include all of the covariance

Disadvantages

It is a bit complicated process compare to the simple randomization

Practical Considerations

Study type Randomization

Large studies Blocked

Large, Multicentre studies Stratified by centre

Small studies Blocked and Stratified by centre

Large number of Prognostic factors

Minimization

Large studies Stratified analysis without stratified randomization

Recommended