Clinical Trial Writing II: Sample Size Calculation and Randomization
Liying XU (Tel: 22528716), CCTER, CUHK, 31st July 2002
1. Sample Size Planning
1.1 Introduction
Fundamental Points
Clinical trials should have sufficient statistical power to detect differences between groups that are considered to be of clinical interest. Calculating the sample size, with provision for adequate levels of significance and power, is therefore an essential part of planning.
Five Key Questions Regarding the Sample Size
• What is the main purpose of the trial?
• What is the principal measure of patient outcome?
• How will the data be analyzed to detect a treatment difference? (The test statistic: t-test, χ² or CI.)
• What type of results does one anticipate with standard treatment?
• H0 and HA: how small a treatment difference is it important to detect, and with what degree of certainty? (δ, α and β.)
• How to deal with treatment withdrawals and protocol violations. (Data set used.)
SSC: Only an Estimate
• Parameters used in the calculation are estimates with uncertainty and are often based on very small prior studies
• Population may be different
• Publication bias: overly optimistic
• Different inclusion and exclusion criteria
• Mathematical models are approximations
What should be in the protocol?
Sample size justification
Methods of calculation
Quantities used in calculation:
• Variances
• Mean values
• Response rates
• Difference to be detected
Realistic and Conservative
• Overestimated size: the trial may be unfeasible or may be terminated early.
• Underestimated size: an increase in sample size has to be justified, follow-up has to be extended, or, worse, an incorrect conclusion is drawn.
What is α (Type I error)?
The probability of erroneously rejecting the null hypothesis. (Putting a useless medicine on the market!)
What is β (Type II error)?
The probability of erroneously failing to reject the null hypothesis. (Keeping a good medicine away from patients!)
What is Power?
Power quantifies the ability of the study to find true differences of various values of δ.
Power = 1−β = P(accept H1 | H1 is true): the chance of correctly identifying H1 (correctly identifying a better medicine)
What is δ?
• δ is the minimum difference between groups that is judged to be clinically important
• The minimal effect which has clinical relevance in the management of patients
• The anticipated effect of the new treatment (larger)
The Choice of α and β depends on:
• the medical and practical consequences of the two kinds of errors
• prior plausibility of the hypothesis
• the desired impact of the results
The Choice of α and β
• α=0.10 and β=0.2 for preliminary trials that are likely to be replicated.
• α=0.01 and β=0.05 for trials that are unlikely to be replicated.
• α=β if both test and control treatments are new, about equal in cost, and there are good reasons to consider them both relatively safe.
• α>β if there is no established control treatment and the test treatment is relatively inexpensive, easy to apply and not known to have any serious side effects.
• α<β (the most common approach: α=0.05 and β=0.2) if the control treatment is already widely used and is known to be reasonably safe and effective, whereas the test treatment is new, costly, and produces serious side effects.
1.2 SSC for Continuous Outcome Variables
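H0: δ = μC − μI = 0
HA: δ = μC − μI ≠ 0
If the variance σ² is known, the test statistic is
$$Z = \frac{\bar{x}_C - \bar{x}_I}{\sigma\sqrt{1/N_C + 1/N_I}}$$
If |Z| > Zα, H0 will be rejected at the α level of significance.
A total sample of 2N would be needed to detect a true difference δ between μI and μC with power (1−β) and significance level α, by the formula:
$$2N = \frac{4(Z_\alpha + Z_\beta)^2\sigma^2}{\delta^2}$$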
Example 1
An investigator wishes to estimate the sample size necessary to detect a 10 mg/dl difference in cholesterol level in a diet intervention group compared to the control group. The variance from other data is estimated to be (50 mg/dl)². For a two-sided 5% significance level, Zα=1.96, and for 90% power, Zβ=1.282.
2N = 4(1.96+1.282)²(50)²/10² ≈ 1050
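As a rough check of Example 1, the formula 2N = 4(Zα+Zβ)²σ²/δ² can be evaluated in a few lines of Python. This is a minimal sketch and not part of the original slides: the function name `total_sample_size_continuous` is hypothetical, and the Z values come from the standard normal distribution in the standard library rather than from rounded table values, so the result may differ slightly from the slide's rounded figure.

```python
from statistics import NormalDist


def total_sample_size_continuous(delta, sigma, alpha=0.05, power=0.90):
    """Total sample size 2N for comparing two means (two-sided alpha).

    delta: difference to be detected; sigma: common standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.282 for 90% power
    return 4 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


# Example 1: detect 10 mg/dl with a standard deviation of 50 mg/dl.
two_n = total_sample_size_continuous(delta=10, sigma=50)
print(f"Total sample size 2N is roughly {two_n:.0f}")
```

The value returned is the total across both groups; halve it for the size of each group, and round up in practice.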
Example1a Baseline Adjustment
An investigator interested in the mean levels of change might want to test whether diet intervention lowers serum cholesterol from baseline levels when compared with a control.
H0: δ = ΔC − ΔI = 0
HA: δ = ΔC − ΔI ≠ 0
σΔ = 20 mg/dl, δ = 10 mg/dl
2N = 4(1.96+1.282)²(20)²/10² ≈ 170
A Professional Statement
A sample size of 85 in each group will have 90% power to detect a difference in means of 10.0, assuming that the common standard deviation is 20.0, using a two-group t-test with a 0.05 two-sided significance level.
Values of f(α,β) to be used in formula for sample size calculation
                      β (Type II error)
α (Type I error)      0.05    0.1     0.2     0.5
0.1                   10.8    8.6     6.2     2.7
0.05                  13.0    10.5    7.9     3.8
0.02                  15.8    13.0    10.0    5.4
0.01                  17.8    14.9    11.7    6.6
$$f(\alpha,\beta) = (Z_\alpha + Z_\beta)^2$$
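The f(α,β) values can be recomputed from the definition f(α,β) = (Zα + Zβ)², with Zα taken as the two-sided critical value. The sketch below is an illustration only; the function name `f_alpha_beta` is hypothetical, and an entry may differ from the printed table by one unit in the last decimal because of rounding.

```python
from statistics import NormalDist


def f_alpha_beta(alpha, beta):
    """(Z_alpha + Z_beta)^2 with a two-sided alpha and one-sided beta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return (z_alpha + z_beta) ** 2


# Print one row per alpha, one column per beta.
betas = (0.05, 0.1, 0.2, 0.5)
for alpha in (0.1, 0.05, 0.02, 0.01):
    row = [round(f_alpha_beta(alpha, beta), 1) for beta in betas]
    print(alpha, row)
```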
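1.3 SSC for a Binary Outcome
Two independent samples
$$Z = \frac{\hat{p}_C - \hat{p}_I}{\sqrt{\bar{p}(1-\bar{p})(1/N_C + 1/N_I)}}, \qquad \bar{p} = \frac{r_C + r_I}{N_C + N_I}$$
where rC and rI are the numbers of events in the control and intervention groups. For planning, take
$$\bar{p} = (p_C + p_I)/2$$
$$2N = \frac{4(Z_\alpha + Z_\beta)^2\,\bar{p}(1-\bar{p})}{(p_C - p_I)^2}$$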
Example 2
Suppose the annual event rate in the control group is anticipated to be 20%. The investigator hopes that the intervention will reduce the annual rate to 15%. The study is planned so that each participant will be followed for 2 years. Therefore, if the assumptions are accurate, approximately 40% of the participants in the control group and 30% of the participants in the intervention group will develop an event.
$$2N = \frac{4(1.96 + 1.282)^2 (0.35)(0.65)}{(0.4 - 0.3)^2} = 956 \approx 960$$
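The binary-outcome calculation of Example 2 can be reproduced with the formula 2N = 4(Zα+Zβ)² p̄(1−p̄)/(pC−pI)², where p̄ is the average of the two proportions. The following is a minimal sketch under that formula; the function name `total_sample_size_binary` is hypothetical and exact normal quantiles are used, so results differ slightly from values computed with rounded Z.

```python
from statistics import NormalDist


def total_sample_size_binary(p_control, p_intervention, alpha=0.05, power=0.90):
    """Total sample size 2N for comparing two proportions (two-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_intervention) / 2  # average event proportion
    difference = p_control - p_intervention
    return 4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / difference ** 2


# Example 2: 40% of controls and 30% of the intervention group have an event.
two_n = total_sample_size_binary(0.40, 0.30)
print(f"Total sample size 2N is roughly {two_n:.0f}")
```

Changing `power` to 0.80, or the two proportions, gives values close to the two-sided columns of a table such as Table 1.3, which are rounded.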
A Professional Statement
A two-group χ² test with a 0.05 two-sided significance level will have 90% power to detect the difference between a Group 1 proportion, P1, of 0.40 and a Group 2 proportion, P2, of 0.30 (odds ratio of 0.643) when the sample size in each group is 480.
Table 1.3 Approximate total sample size for comparing various proportions in two groups with significance level (α) of 0.05 and power (1−β) of 0.8 and 0.9
True proportions                  α=0.05 (one-sided)        α=0.05 (two-sided)
pC (control)  pI (intervention)   1−β=0.90    1−β=0.80      1−β=0.90    1−β=0.80
0.60          0.50                850         610           1040        780
0.60          0.40                210         160           260         200
0.60          0.30                90          70            120         90
0.60          0.20                50          40            60          50
0.50          0.40                850         610           1040        780
0.50          0.30                210         150           250         190
0.50          0.25                130         90            160         120
0.50          0.20                90          60            110         80
0.40          0.30                780         560           960         720
0.40          0.25                330         240           410         310
0.40          0.20                180         130           220         170
0.30          0.20                640         470           790         590
0.30          0.15                270         190           330         250
0.30          0.10                140         100           170         130
0.20          0.15                1980        1430          2430        1810
0.20          0.10                440         320           540         400
0.20          0.05                170         120           200         150
0.10          0.05                950         690           1170        870
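From Table 1.3 you can see:
• A smaller α (two-sided rather than one-sided) requires a larger N
• A higher power 1−β requires a larger N
• A smaller difference δ between pC and pI requires a larger N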
Paired Binary Outcome
McNemar’s test
d = difference in the proportion of successes (d = pI − pC)
f = the proportion of participants whose response is discordant (the two outcomes in a pair are not the same)
$$N_p = \frac{(Z_\alpha + Z_\beta)^2 f}{d^2}$$
where Np is the number of pairs.
Example 3
Consider an eye study where one eye is treated for loss in visual acuity by a new laser procedure and the other eye is treated by standard therapy. The failure rate on the control, pC, is estimated to be 0.4, and the new procedure is projected to reduce the failure rate to 0.20. The discordant rate f is assumed to be 0.50.
α=0.05, power 1−β=0.90, f=0.5, pC=0.4, pI=0.2
$$N_p = \frac{(1.96 + 1.282)^2 (0.5)}{(0.4 - 0.2)^2} \approx 132 \text{ pairs}$$
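For the paired design of Example 3, the number of pairs follows from Np = (Zα+Zβ)² f / d², with f the discordant proportion and d the difference in proportions. The sketch below assumes that formula; the function name `pairs_needed` is hypothetical.

```python
import math
from statistics import NormalDist


def pairs_needed(f, d, alpha=0.05, power=0.90):
    """Number of pairs for a paired binary outcome (McNemar-type comparison)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) ** 2 * f / d ** 2


# Example 3: discordant rate 0.5, failure rates 0.4 (control) and 0.2 (new).
n_pairs = pairs_needed(f=0.5, d=0.4 - 0.2)
print("Pairs needed, rounded up:", math.ceil(n_pairs))
```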
1.4 Adjusting for Non-adherence
RO = drop-out rate
RI = drop-in rate
$$N^* = \frac{N}{(1 - R_O - R_I)^2}$$
If RO=0.20 and RI=0.05, N* = 1.78N
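The non-adherence adjustment divides the unadjusted sample size by (1 − RO − RI)², where RO is the drop-out rate and RI the drop-in rate. A small sketch of that inflation factor follows; the function name `nonadherence_factor` is hypothetical.

```python
def nonadherence_factor(drop_out, drop_in):
    """Factor by which N is multiplied to allow for drop-outs and drop-ins."""
    retained = 1 - drop_out - drop_in
    if retained <= 0:
        raise ValueError("drop-out plus drop-in rate must be below 1")
    return 1 / retained ** 2


# Slide example: 20% drop-out and 5% drop-in.
print(round(nonadherence_factor(0.20, 0.05), 2))
```

Because the factor grows with the square of the retained fraction, modest non-adherence already inflates the required size considerably.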
1.5 Adjusting the Multiple Comparison
α’ = α/k
k = the number of multiple comparison variables
Table 1.4 Adjusting for Randomization Ratio
Randomization ratio    Increase in total N
1:1                    0
1:2                    +12.5%
1:3                    +33%
1:4                    +56%
1:5                    +80%
1:6                    +100%
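The increases in Table 1.4 are consistent with a standard result, used here as an assumption rather than taken from the slides: for a 1:k allocation with equal variances, the total sample size needed for the same power is (1+k)²/(4k) times that of a 1:1 design. The function name `allocation_increase` is hypothetical.

```python
def allocation_increase(k):
    """Fractional increase in total N for 1:k allocation relative to 1:1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return (1 + k) ** 2 / (4 * k) - 1


# Compare with the table; the 1:6 entry in the table is rounded.
for k in range(1, 7):
    print(f"1:{k}  +{100 * allocation_increase(k):.1f}%")
```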
1.6 Adjusting for Loss to Follow-up
If p is the proportion of subjects lost to follow-up, the number of subjects must be increased by a factor of 1/(1-p).
1.7 Other Factors
• The rate of attrition of subjects during a trial
• Intermediate analyses
Sample size re-estimation
• Event rates are lower than anticipated
• Variability is larger than expected
• Done without unblinding data and without making treatment comparisons
1.8 Power Calculation (assuming we compare two medicines)
Power depends on 4 elements:
The real difference between the two medicines, δ
• Big δ → big power
The variation among individuals, σ
• Small σ → big power
The sample size, n
• Large n → big power
Type I error, α
• Large α → big power
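The four dependencies can be seen numerically by solving the continuous-outcome sample size formula for power. This is a sketch under a normal approximation for two equal groups; the function name `power_two_means` is hypothetical, and the small probability of rejecting in the wrong direction is ignored.

```python
from statistics import NormalDist


def power_two_means(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Standardized true difference for two groups of n_per_group each.
    z_effect = delta / (sigma * (2 / n_per_group) ** 0.5)
    return NormalDist().cdf(z_effect - z_alpha)


base = power_two_means(delta=10, sigma=50, n_per_group=525)
print(f"baseline power:  {base:.2f}")
print(f"bigger delta:    {power_two_means(15, 50, 525):.2f}")
print(f"smaller sigma:   {power_two_means(10, 40, 525):.2f}")
print(f"larger n:        {power_two_means(10, 50, 700):.2f}")
print(f"larger alpha:    {power_two_means(10, 50, 525, alpha=0.10):.2f}")
```

The baseline uses the cholesterol example (difference 10 mg/dl, standard deviation 50 mg/dl, about 525 per group), which should give a power close to 90%.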
Sensitivity of the sample size estimate to a variety of deviations from these assumptions: a power table
Table 1 Statistical Power of the Tanzania Vitamin and HIV Infection Trial (N=960)
                     Effect of B: 0%       Effect of B: 15%      Effect of B: 30%
Loss to follow-up    0%    20%   33%       0%    20%   33%       0%    20%   33%
Effect of A: 30%     89%   82%   74%       85%   76%   68%       79%   69%   61%
Effect of A: 25%     75%   65%   58%       69%   59%   52%       62%   52%   45%
Example 4 Regret for Low Power Due to Small Sample?
I have a set of data in which the mean change differs significantly between the 2 groups (p<0.05). But when I calculate the power it gives only 50%. How should I interpret this? Also, can someone kindly advise whether it is meaningful (or pointless) to calculate the power when the result is statistically significant?
Books and Software
• Sample Size Tables for Clinical Studies (second edition), by David Machin, Michael Campbell, Peter Fayers and Alain Pinol, Blackwell Science, 1997
• PASS 2000, available in CCTER
• nQuery 4.0, available in CCTER
2. Randomization
Randomization
Definition: randomization is a process by which each participant has the same chance of being assigned to either intervention or control.
Fundamental Point
Randomization tends to produce study groups comparable with respect to known and unknown risk factors, removes investigator bias in the allocation of participants, and guarantees that statistical tests will have valid significance levels.
Two Types of Bias in Randomization
Selection bias occurs if the allocation process is predictable. If any bias exists as to what treatment particular types of participants should receive, then a selection bias might occur.
Accidental bias can arise if the randomization procedure does not achieve balance on risk factors or prognostic covariates, especially in small studies.
Fixed Allocation Randomization
Fixed allocation randomization procedures assign the intervention to participants with a pre-specified probability, usually equal, and that allocation probability is not altered as the study progresses.
• Simple randomization
• Blocked randomization
• Stratified randomization
Randomization Types
Simple randomization
Simple Randomization
Option 1: toss an unbiased coin, for a randomized trial with two treatments (call them A and B).
Option 2: use a random digit table. A randomization list may be generated by using the digits, one per treatment assignment, starting with the top row and working downwards.
Option 3: use a random number-producing algorithm, available on most digital computer systems.
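Option 3 might look like the following in Python. This is only a sketch using the standard library pseudo-random generator; the function name `simple_randomization` and the seed value are hypothetical, and a real trial would use a validated, documented procedure.

```python
import random


def simple_randomization(n_participants, treatments=("A", "B"), seed=None):
    """Assign each participant independently, with equal probability."""
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    return [rng.choice(treatments) for _ in range(n_participants)]


allocation = simple_randomization(20, seed=2002)
print("".join(allocation))
print("A:", allocation.count("A"), "B:", allocation.count("B"))
```

Running this with different seeds shows how unequal the two group sizes can be in a short list.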
Advantages
Each treatment assignment is completely unpredictable, and probability theory guarantees that in the long run the numbers of patients on each treatment will not be radically different. It is also easy to implement.
Disadvantages
• Unequal groups: one treatment is assigned more often than another
• Time imbalance or chronological bias: one treatment is given with greater frequency at the beginning of a trial and another with greater frequency at the end of the trial
Simple randomization is not often used, even for large studies.
Randomization Types
Blocked randomization
Blocked Randomization (permuted block randomization)
Blocked randomization ensures exactly equal treatment numbers at certain equally spaced points in the sequence of patient assignments.
A table of random permutations is used, containing, in random order, all possible combinations (permutations) of a small series of figures.
Block size: 6, 8, 10, 16, 20.
Advantages
The balance between the number of participants in each group is guaranteed during the course of randomization. The number in each group will never differ by more than b/2 when b is the length of the block.
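A permuted-block list can also be generated by shuffling each block in software instead of reading a table of random permutations. The sketch below is one such approach, not the procedure from the slides; the function name `blocked_randomization` and the seed are hypothetical.

```python
import random


def blocked_randomization(n_blocks, block_size=6, treatments=("A", "B"), seed=None):
    """Concatenate randomly permuted blocks with equal numbers per treatment."""
    if block_size % len(treatments) != 0:
        raise ValueError("block size must be a multiple of the number of treatments")
    rng = random.Random(seed)
    per_treatment = block_size // len(treatments)
    sequence = []
    for _ in range(n_blocks):
        block = list(treatments) * per_treatment  # e.g. A B A B A B
        rng.shuffle(block)                        # random order within the block
        sequence.extend(block)
    return sequence


sequence = blocked_randomization(n_blocks=4, block_size=6, seed=2002)
print("".join(sequence))
```

At the end of every block the two groups are exactly equal, and within a block the difference cannot exceed half the block size.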
Disadvantages
• Analysis may be more complicated (in theory); a correct analysis could have greater power
• Varying the block size keeps the randomization from becoming predictable
• Mid-block inequality might occur if an interim analysis is intended
Randomization Types
Stratified randomization
[Figure: stratification tree. Geographic location (U.S., Europe), then previous exposure (Yes, No), then site (lymph, skin, breast).]
Stratified Randomization
Stratified randomization process involves measuring the level of the selected factors for participants, determining to which stratum each belongs, and performing the randomization within the stratum. Within each stratum, the randomization process itself could be simple randomization, but in practice most clinical trials use some blocked randomization strategy.
Table 3. Stratification Factors and Levels (3×2×3 = 18 Strata)
Age            Sex          Smoking history
1. 40-49 yr    1. Male      1. Current smoker
2. 50-59 yr    2. Female    2. Ex-smoker
3. 60-69 yr                 3. Never smoked
Table 4 Stratified Randomization with Block Size of Four
Strata   Age     Sex   Smoking   Group assignment
1        40-49   M     Current   ABBA BABA ...
2        40-49   M     Ex        BABA BBAA ...
3        40-49   M     Never     etc.
4        40-49   F     Current
5        40-49   F     Ex
6        40-49   F     Never
7        50-59   M     Current
8        50-59   M     Ex
9        50-59   M     Never
10       50-59   F     Current
11       50-59   F     Ex
12       50-59   F     Never
etc.
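A separate blocked list for each of the 18 strata can be produced by combining the factor levels and shuffling blocks of four within each stratum. The sketch below is illustrative only; `permuted_block`, `stratified_lists` and the seed are hypothetical names and values.

```python
import itertools
import random

AGES = ("40-49", "50-59", "60-69")
SEXES = ("M", "F")
SMOKING = ("Current", "Ex", "Never")


def permuted_block(rng, block_size=4, treatments=("A", "B")):
    """One randomly ordered block with equal numbers of each treatment."""
    block = list(treatments) * (block_size // len(treatments))
    rng.shuffle(block)
    return "".join(block)


def stratified_lists(n_blocks=2, seed=None):
    """Map every (age, sex, smoking) stratum to its own list of blocks."""
    rng = random.Random(seed)
    return {
        stratum: [permuted_block(rng) for _ in range(n_blocks)]
        for stratum in itertools.product(AGES, SEXES, SMOKING)
    }


lists = stratified_lists(seed=2002)
for number, (stratum, blocks) in enumerate(lists.items(), start=1):
    print(number, *stratum, " ".join(blocks))
```

A new participant is assigned the next unused letter in the list for his or her stratum.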
Advantages
Stratification makes the two study groups comparable with regard to the specified factors, and the power of the study can be increased by taking the stratification into account in the analysis.
Disadvantages
The prognostic factors used in stratified randomization may be unimportant, and other factors identified later may be of more importance.
Mechanism
Trial type                        Mechanism
No central registration office    Randomization list, sealed envelopes
Double-blind drug trial           Pharmacist will be involved
Multi-centre trial                Central registration office
Single-centre trial               Independent person responsible for patient registration and randomization
An Example of Stratified Randomization
Patients will be stratified according to the following criteria:
1) Treatment center (Hospital A vs Hospital B vs Hospital C)
2) N-stage (N2 vs N3)
3) T-stage (T1-2 vs T3-4)
What should be in the protocol?
A dynamic allocation scheme will be used to randomize patients in equal proportions within each of 12 strata. The scheme first creates time-ordered blocks of size divisible by three and then uses simple randomization to divide the patients in each block into three treatment arms, in equal proportion. The block sizes will be chosen randomly so that each block contains either 6 or 9 patients.
This procedure helps to ensure both randomness and investigator blinding (the block sizes are known only to the statistician), as recommended by Freedman et al. Randomization will be generated by the consulting statistician in sealed envelopes, labeled by stratum, which will be unsealed after patient registration.
Adaptive Randomization
• Number adaptive: biased coin method
• Baseline adaptive (MINIMIZATION)
• Outcome adaptive
Biased Coin Method
Advantages
Investigators cannot determine the next assignment by discovering the blocking factor.
Disadvantages
• Complexity in use
• Statistical analysis cumbersome
Minimization
Minimization is a well-accepted statistical method to limit imbalance in relatively small randomized clinical trials in conditions with known important prognostic baseline characteristics.
It is called minimization because imbalances in the distribution of prognostic factors are minimized.
Table 1 Some baseline characteristics of patients in a controlled trial of mustine versus talc in the control of pleural effusions in patients with breast cancer (Fentiman et al, 1983)
Treatment                                                            Mustine (n=23)   Talc (n=23)
Mean age (SE)                                                        50.3 (1.5)       55.3 (2.2)
Stage of disease: 1 or 2                                             52%              74%
Stage of disease: 3 or 4                                             48%              26%
Mean interval in months between BC diag. and effusion diag. (SE)     33.1 (6.2)       60.4 (13.1)
Postmenopausal                                                       43%              74%
Minimization Factors
• Age (years): ≤50 or >50
• Stage of disease: 1 or 2, or 3 or 4
• Time between diagnosis of cancer and diagnosis of effusions (months): ≤30 or >30
• Menopausal status: pre or post
Table 2 Characteristics of the first 29 patients in a clinical trial using minimization to allocate treatment
                           Mustine   Talc
Age             ≤50        7         6
                >50        8         8
Stage           1 or 2     11        11
                3 or 4     4         3
Time interval   ≤30 m      6         4
                >30 m      9         10
Menopausal      Pre        7         5
                Post       8         9
Table 3 Calculation of imbalance in patient characteristics for allocating treatment to the thirtieth patient
                         Mustine (n=15)   Talc (n=14)
Age >50                  8                8
Stage 3 or 4             4                3
Time interval ≤30 m      6                4
Postmenopausal           8                9
Total                    26               24
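The calculation in Table 3 sums, for each arm, the existing patients who share each characteristic of the new patient. The sketch below repeats that sum using the counts from Table 2 and then applies one common rule, assumed here and not stated on the slides: allocate the new patient to the arm with the smaller total, breaking ties at random. The names `COUNTS`, `imbalance_totals` and `minimization_choice` are hypothetical.

```python
import random

# Counts of the first 29 patients, per factor level: (Mustine, Talc).
COUNTS = {
    ("age", "<=50"): (7, 6), ("age", ">50"): (8, 8),
    ("stage", "1-2"): (11, 11), ("stage", "3-4"): (4, 3),
    ("interval", "<=30m"): (6, 4), ("interval", ">30m"): (9, 10),
    ("menopausal", "pre"): (7, 5), ("menopausal", "post"): (8, 9),
}


def imbalance_totals(patient):
    """Sum, per arm, the counts matching the new patient's characteristics."""
    mustine = sum(COUNTS[(factor, level)][0] for factor, level in patient.items())
    talc = sum(COUNTS[(factor, level)][1] for factor, level in patient.items())
    return {"Mustine": mustine, "Talc": talc}


def minimization_choice(patient, rng=random):
    """Pick the arm with the smaller total (assumed rule); ties go at random."""
    totals = imbalance_totals(patient)
    smallest = min(totals.values())
    return rng.choice([arm for arm, total in totals.items() if total == smallest])


# The thirtieth patient described by Table 3.
patient_30 = {"age": ">50", "stage": "3-4", "interval": "<=30m", "menopausal": "post"}
print(imbalance_totals(patient_30))
print(minimization_choice(patient_30))
```

After the allocation, the counts for the chosen arm would be increased by one in each of the patient's categories before the next patient arrives.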
Advantages
It can reduce imbalance to a minimum, especially in small trials
A computer program is available (called Mini), and it is also not difficult to perform ‘by hand’
Minimization and stratification on the same prognostic factors produce similar levels of power, but minimization may add slightly more power if stratification does not include all of the covariates
Disadvantages
It is a somewhat more complicated process than simple randomization
Practical Considerations
Study type                            Randomization
Large studies                         Blocked
Large, multicentre studies            Stratified by centre
Small studies                         Blocked and stratified by centre
Large number of prognostic factors    Minimization
Large studies                         Stratified analysis without stratified randomization