Getting an estimate of % of GM in a sample
2. Qualitative laboratory methods
May 8-10, 2006
Iowa State University, Ames – USA
Jean-Louis Laffont
Kirk Remund
ISTA Statistics Committee 2
Overview
• Impurity estimators and confidence intervals
• Quantitative information from a qualitative assay
• Limitations to quantification with a qualitative assay

Impurity Estimate
Our best guess of what the true lot impurity/purity is, based on the sample
µ = lot impurity/purity (sometimes called p)
Truth: µ = 1%
μ̂ = impurity/purity estimate
Estimate: μ̂ = 2%

Estimate based on sample
[Figure: a lot of 28 seeds with 4 positives (µ = 4/28 ≈ 14%) and a sample of 4 seeds drawn from it with 1 positive (μ̂ = 1/4 = 25%)]

Confidence intervals are like nets…
Then what are we trying to catch?
Answer: the true level of impurity (µ) in the lot
[Figure: nets cast for Lot 1 (µ1), Lot 2 (µ2) and Lot 3 (µ3); one net misses its µ]
Oops, looks like one got away!!

Confidence level
• Net “interval” size is a function of sampling variability, assay errors and confidence level
• If we fix the sampling and assay variability, then:
– lower confidence level: small net (narrow interval)
– higher confidence level: large net (wide interval)

What does 95% confidence mean?
• Statement: “We are 95% confident that the true lot impurity is contained within the interval (net)”
• Overall we expect that 95% of the time the interval will catch the true lot impurity (µ)
• Expect that 5% of the time µ will fall out of the net
[Figure: repeated nets (intervals), most of which contain µ]

Estimator of GM purity/impurity (Individual Seed Testing)
• Individual seed testing used to test purity of GM material for proficiency test
• Used to test purity of GM variety seed
• Implemented in Seedcalc
• Estimator:
$$\hat{\mu} = \frac{d}{n} = \frac{\text{\# of deviant seeds}}{\text{\# total sampled seeds}}$$
• UCL:
$$\hat{\mu}_{UL} = \frac{(d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}{(n-d) + (d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}$$
where F is the 1-α quantile from an F-distribution with 2d+2 and 2n-2d degrees of freedom
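
As a minimal sketch of the two quantities above, the code below computes d/n and a one-sided upper limit by solving P(X ≤ d | p) = α numerically. This is the exact binomial (Clopper-Pearson) limit, which is algebraically the same as the F-quantile form; it is not Seedcalc's code, and the function names are illustrative. The example reuses the sample of 4 seeds with 1 deviant from the earlier slide.

```python
from math import comb

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p); fine for the small d and moderate n used here."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))

def estimate(d, n):
    """Point estimate: deviant seeds divided by total sampled seeds."""
    return d / n

def upper_limit(d, n, alpha=0.05):
    """One-sided upper confidence limit: the p that solves P(X <= d | p) = alpha."""
    if d >= n:
        return 1.0                    # every seed deviant: the limit is 100%
    lo, hi = d / n, 1.0
    for _ in range(100):              # bisection; the CDF decreases as p grows
        mid = (lo + hi) / 2
        if binom_cdf(d, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical use: 1 deviant seed in a sample of 4 seeds.
print(f"estimate = {estimate(1, 4):.0%}, 95% UCL = {upper_limit(1, 4):.1%}")
```

With so few seeds the upper limit sits far above the estimate, which is why the UCL should not be read as an estimate.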
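
Estimator of GM impurity (Seed Pool Testing)
• Used to estimate AP (adventitious presence) levels of GM in conventional seed
• Used to estimate level of GM impurity in conventional seed for proficiency test
• Implemented in Seedcalc
• Estimator:
$$\hat{\mu} = 1 - \left(1 - \frac{d}{n}\right)^{1/m}$$
where m is the number of seeds per pool, n is the number of seed pools and d is the number of deviant seed pools
• UCL:
$$\hat{\mu}_{UL} = 1 - \left(1 - \frac{(d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}{(n-d) + (d+1)\,F_{1-\alpha,\,2d+2,\,2n-2d}}\right)^{1/m}$$
where F is the 1-α quantile from an F-distribution with 2d+2 and 2n-2d degrees of freedom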
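
A short worked special case may help show how the pool-level limit is carried down to the seed level. It is a sketch rather than something stated on the slide: when no pool is positive (d = 0), the pool-level upper limit p_UL has a closed form, because the probability of seeing no positive pool is (1 - p)^n.

```latex
(1 - p_{UL})^{n} = \alpha
\;\Longrightarrow\;
p_{UL} = 1 - \alpha^{1/n},
\qquad
\hat{\mu}_{UL} = 1 - (1 - p_{UL})^{1/m} = 1 - \alpha^{1/(nm)}
```

For example, with n = 4 pools of m = 300 seeds and α = 0.05, this gives 1 - 0.05^{1/1200} ≈ 0.25%.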

1- and 2-sided confidence limits
• Upper confidence limit (UCL)
– “95% confident that true impurity is below upper confidence limit”
– Caution: do not use as estimate
• Two-sided confidence limit
– “95% confident that the true impurity is between the lower and upper limit”
– Similar in form to the formulas on the earlier slides, and is implemented in Seedcalc

Two-sided confidence interval (put ½ of alpha in each tail)
1-α confidence that the interval contains the true purity of the lot
[Figure: distribution with α/2 in each tail and the 1-α confidence interval between them]

One-sided confidence interval (alpha in one tail)
1-α confidence that the interval contains the true purity of the lot
[Figure: distribution with α in one tail and the 1-α confidence interval covering the rest]

The following slides illustrate that a simple presence/absence answer per pool of seeds allows estimation of the % of GM seeds present
The statistical computation takes into account the fact that, for a given level of GM presence, some sub-samples will “by chance” contain 0 GM seeds, others 1 GM seed, others 2 GM seeds, etc.
The formula is:
$$\%\,\text{GM estimate} = 1 - (1 - d/n)^{1/m}$$
where d is the number of deviant sub-samples, n is the number of sub-samples, and m is the number of seeds per sub-sample
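
The formula can be motivated by a short argument. The sketch below assumes that each seed is GM independently with probability µ and that the assay gives no false positives or false negatives; neither assumption is stated on the slide.

```latex
P(\text{sub-sample negative}) = (1 - \mu)^{m}
\quad\Longrightarrow\quad
P(\text{sub-sample positive}) = 1 - (1 - \mu)^{m}

\frac{d}{n} \approx 1 - (1 - \mu)^{m}
\quad\Longrightarrow\quad
\hat{\mu} = 1 - \left(1 - \frac{d}{n}\right)^{1/m}
```

The observed fraction of positive sub-samples d/n stands in for the probability that a sub-sample is positive, and solving for µ gives the estimator.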

From 1500 seeds, 10 pools of 150 seeds have been made
[Figure: the 10 pools, numbered 1 to 10]

Each sub-sample is tested for presence/absence of GM seeds
[Figure: assay results for the 10 sub-samples, with a positive control and a negative control]
4 sub-samples are positive: estimate = 0.34%

Estimated % of GM seeds by number of positive pools
nb of   seeds     number of positive pools
pools   per pool  0   1      2      3      4      5      6      7      8      9      10
1       1500      0   ####
2       750       0   0.09%  ####
3       500       0   0.08%  0.22%  ####
4       375       0   0.08%  0.18%  0.37%  ####
5       300       0   0.07%  0.17%  0.30%  0.54%  ####
6       250       0   0.07%  0.16%  0.28%  0.44%  0.71%  ####
7       214       0   0.07%  0.16%  0.26%  0.40%  0.58%  0.91%  ####
8       187       0   0.07%  0.15%  0.25%  0.37%  0.52%  0.74%  1.11%  ####
9       166       0   0.07%  0.15%  0.24%  0.35%  0.49%  0.66%  0.90%  1.31%  ####
10      150       0   0.07%  0.15%  0.24%  0.34%  0.46%  0.61%  0.80%  1.07%  1.52%  ####
#### = all pools positive (the formula gives 100%)
% estimate can be obtained in Seedcalc or in ISTA documents
4 positive pools from 10 pools of 150 seeds => 0.34% of GM seeds
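
The table values can be reproduced with a few lines of code. This is a minimal sketch, not Seedcalc itself: the function name is illustrative, and the pool sizes are assumed to be 1500 divided by the number of pools and rounded down, which matches the 214, 187 and 166 shown in the table.

```python
def pool_estimate(d, n, m):
    """Seed-level impurity estimate from d positive pools out of n pools of m seeds."""
    return 1 - (1 - d / n) ** (1 / m)

TOTAL_SEEDS = 1500

for n in range(1, 11):
    m = TOTAL_SEEDS // n                       # seeds per pool, rounded down
    # d runs from 0 to n-1; d = n (all pools positive) would give 100%
    cells = [f"{100 * pool_estimate(d, n, m):.2f}%" for d in range(n)]
    print(f"{n:>2} {m:>4}  " + "  ".join(cells))

# The worked example from the slides: 4 positive pools out of 10 pools of 150 seeds
print(f"{100 * pool_estimate(4, 10, 150):.2f}%")   # 0.34%
```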

[Figure: four examples of 5 GM seeds distributed among the 10 pools; in 3 of them 5 pools are positive, in 1 of them only 4 pools are positive]
The statistical computation takes into account that some sub-samples may contain more than one GM seed
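
A small simulation can illustrate why the number of positive pools tends to undercount the number of GM seeds. The sketch below assumes 5 GM seeds placed at random among 1500 seeds that are split into 10 pools of 150; the function name, the seed value and the number of trials are illustrative choices, not taken from the slides.

```python
import random
from collections import Counter

def positive_pools(n_gm=5, n_pools=10, pool_size=150, rng=random):
    """Place n_gm GM seeds at random among all seeds and count the pools that contain any."""
    total = n_pools * pool_size
    positions = rng.sample(range(total), n_gm)        # distinct seed positions
    return len({pos // pool_size for pos in positions})

TRIALS = 20000
rng = random.Random(2006)                             # fixed seed for repeatability
counts = Counter(positive_pools(rng=rng) for _ in range(TRIALS))

for k in sorted(counts):
    print(f"{k} positive pools: {counts[k] / TRIALS:.1%} of trials")
```

Under these assumptions the 5 GM seeds land in 5 different pools in only about 30% of trials, so fewer than 5 positive pools is the more common outcome. The estimator accounts for this.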

Qualitative Test/Quantitative Information
Example of seed pool testing strategy (4 pools of 300 seeds):
- - - -    <0.25%
+ - - -    <0.46%
+ + - -    <0.77%
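
The three bounds in this example are consistent with one-sided 95% upper confidence limits for 0, 1 and 2 positive pools. The self-contained sketch below checks this under that assumption, using an exact binomial limit for the proportion of positive pools and the pool-to-seed transformation from the earlier slide. The function names are illustrative and this is not Seedcalc's implementation.

```python
from math import comb

def pool_level_ucl(d, n, alpha=0.05):
    """One-sided upper limit for the proportion of positive pools: solve P(X <= d | p) = alpha."""
    if d >= n:
        return 1.0
    def cdf(p):
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(d + 1))
    lo, hi = 0.0, 1.0
    for _ in range(100):                  # bisection; cdf decreases as p grows
        mid = (lo + hi) / 2
        if cdf(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def seed_level_ucl(d, n, m, alpha=0.05):
    """Carry the pool-level limit down to the seed level for pools of m seeds."""
    return 1 - (1 - pool_level_ucl(d, n, alpha)) ** (1 / m)

limits = [seed_level_ucl(d, n=4, m=300) for d in (0, 1, 2)]
for d, ucl in zip((0, 1, 2), limits):
    print(f"{d} positive pools of 4: impurity < {100 * ucl:.2f}%")
# prints 0.25%, 0.46% and 0.77%
```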

Distribution of attribute within pooled samples: how many positive seeds in 2 positive pools?
Example: 4 pools of 500 seeds, + + - -
Best estimate = 0.14%; upper confidence limit: <0.46%
Negative pools: all seeds negative (1000 seeds)
# positive seeds   Probability*
2-3                ~65.4%
4-5                ~24.9%
6-7                ~7.8%
8-9                ~1.4%
>9                 ~0.4%
* Probability of the stated number of positive seeds given that two pools are negative
How confident are we that the qualitative data is appropriate to describe a quantitative result?

[Figure: screenshot showing Inputs and Outputs]

Threshold Testing vs UCL
UCL yields more information than threshold testing
[Figure: estimates μ̂ and upper limits μ̂_UL for four cases on a 0.0% to 2.0% axis, with the LQL marked]

Estimation limitations for small # of pools
[Figure: % impurity (axis from 0 to 8) by pooling design, each totalling 3000 seeds: 60 pools of 50, 30 of 100, 20 of 150, 15 of 200, 10 of 300, 5 of 600, 3 of 1000, 2 of 1500]
Real-time PCR assays also have problems estimating higher AP impurity levels, because the cycle count reaches an asymptote at higher impurity

Limited information if all pools are positive
• Test 10 pools of 300 seeds and all are positive
– Impurity estimate = 100% BUT
– 95% confident that impurity in lot is between 0.45% and 100%!!!
• Test 3 pools of 1000 and all positive
– 95% confident that impurity in lot is between 0.05% and 100%!!!
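
The lower bounds quoted here can be reproduced in closed form if one assumes they are one-sided 95% lower limits (the slide does not say which convention was used). When all n pools are positive, the probability of that outcome is p^n for a pool-level positive rate p, so the lower limit solves p^n = α. The sketch below uses an illustrative function name.

```python
def lower_limit_all_positive(n, m, alpha=0.05):
    """One-sided lower limit on seed-level impurity when all n pools of m seeds are positive."""
    p_lower = alpha ** (1 / n)                # pool-level lower limit: p**n = alpha
    return 1 - (1 - p_lower) ** (1 / m)       # carry down to the seed level

print(f"10 pools of 300:  {100 * lower_limit_all_positive(10, 300):.2f}%")   # 0.45%
print(f" 3 pools of 1000: {100 * lower_limit_all_positive(3, 1000):.2f}%")   # 0.05%
```

In both designs the data only show that impurity is above a small value, which is why an all-positive result carries so little quantitative information.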

Demonstration and Exercises in Seedcalc