INTERNATIONAL SEED TESTING ASSOCIATION (ISTA) www.seedtest.org
Statistics Committee
2008 - 2009 report
June 17, 2008
June 14, 2009
• STA Committee membership:
Chair: Jean-Louis Laffont, France
Vice: Kirk Remund, USA
Members: Julianna Bànyai, Hungary
Julia Barabas, Hungary
Olfat H. El Bagoury, Egypt
Winfried Jackisch, Germany
Zivan Karaman, France
Michael Kruse, Germany
• Opportunity for new members following the 8th ISTA Seminar on Statistics in Seed Testing
Outline
• Overview of the activities from June 2008 to June 2009
• 8th ISTA Seminar on Statistics in Seed Testing
• Analysis of validation studies data
• Future work
Overview of the activities from June 2008 to June 2009
• Support to GMO Task Force
– Rating of the GMO PT10 and PT11
– Stack assessment: application for 2-way stack and 3-way stack assessment will be available soon
– Biotechnology Trait Detection Workshop, May 11-16, 2009, Osijek, Croatia
Organized by: Institute for Seed and Seedlings, Osijek, Croatia
Conducted by: ISTA
Sponsored by: FAO
Overview of the activities from June 2008 to June 2009
• Report reviews: 8 test plan reviews, 8 validation report reviews
• Statistical analyses
Examples:
– Validation study: Revised method for the germination test of Brassica spp. and Sinapis alba
– Validation study: Germination procedure for Brachiaria brizantha
For this analysis, we successfully used a modern and powerful approach that will be discussed later
• 8th ISTA Seminar on Statistics in Seed Testing
8th ISTA Seminar on Statistics in Seed Testing
• April 5 – 7, 2009 – Aussonne, France. Hosted by Pioneer Génétique
• 23 participants from 6 different countries from industry and government laboratories
• Thanks to Charlotte Philip and Valérie Ancelin for making the arrangements for the seminar and to Nadine Ettel for her support in the seminar planning and preparation
8th ISTA Seminar on Statistics in Seed Testing
• Lectures given by Bonnie Hong, Zivan Karaman, Mustapha El Yakhlifi, Jean-Louis Laffont and Kirk Remund
• Seminar content:
– Statistical distributions/tests
– Data checking
– Linear models
– Seed testing plans
– ISO laboratory uncertainty applications
– Sampling
– Statistical software: SAS, R, Excel statistical tools
8th ISTA Seminar on Statistics in Seed Testing
• Engaged discussion between the statisticians and the lab practitioners led to the identification of a modern and unified approach for the analysis of Validation Studies.
• The basis for this approach is the utilization of Generalized Linear Mixed-Effect Models (GLMM):
– Normal underlying distributions and fixed effects only: Linear Model (LM), i.e. classical ANOVA
– Normal underlying distributions and fixed and random effects: Linear Mixed-Effect Model (LMM)
– Poisson or binomial underlying distributions and fixed effects only: Generalized Linear Model (GLM) – P. McCullagh and J.A. Nelder (1983)
– Poisson or binomial underlying distributions and fixed and random effects: Generalized Linear Mixed-Effect Model (GLMM)
8th ISTA Seminar on Statistics in Seed Testing
Data exploration and data checking:
• Classical methods: Tolerance tables, Sum of % = 100%
• Modern methods: boxplots, Q-Q plots, Hampel's method
[Figures: boxplots of % Normal Seedlings by treatment (15-35-H2SO4-KNO3, 15-35-KNO3, 15-35-Preheat-KNO3, 20-35-H2SO4-KNO3, 20-35-KNO3, 20-35-Preheat-KNO3); Q-Q plot of Other species (%) against Quantiles of Standard Normal]
Cleaned data
Fitting one class of Generalized Linear Mixed-effect Model:
• mean of the new method significantly ≠ mean of the reference method?
• Lot x Method interaction meaningful?
• repeatability variance of the new method = repeatability variance of the reference method?
• If enough labs: reproducibility variance of the new method = reproducibility variance of the reference method?
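As an illustration of the data-checking step, here is a minimal Python sketch of an outlier screen in the spirit of Hampel's method. It assumes the common formulation (flag values further than k scaled median absolute deviations from the median); the function name, the cut-off k = 3 and the example percentages are hypothetical and not taken from the seminar material.

```python
from statistics import median

def hampel_outliers(values, k=3.0):
    """Return indices of values flagged by a Hampel-type rule.

    A value is flagged when |value - median| > k * 1.4826 * MAD,
    where MAD is the median absolute deviation from the median.
    The constant 1.4826 makes the MAD comparable to a standard
    deviation for normally distributed data.
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        # No spread around the median: nothing can be flagged reliably.
        return []
    return [i for i, v in enumerate(values) if abs(v - med) > k * scale]

# Hypothetical germination results (% normal seedlings) from 7 replicates.
replicates = [88, 90, 91, 89, 92, 90, 45]
flagged = hampel_outliers(replicates)
print("Flagged replicate indices:", flagged)
```

Flagged values are candidates for checking with the laboratory, not automatic deletions.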
8th ISTA Seminar on Statistics in Seed Testing
• Data exploration and visualization tools and GLMM can be implemented in R, which is freely available software. We have plans to deliver free packages for that purpose.
An example of using this data analysis process: New trier validation data
Protocol:
• 3 triers: Cargo sampler, Spiral spear, Sampling stick with compartments (ISTA method)
• 2 labs
• 6 lots/lab: 3 chaffy seeds, 3 non-chaffy seeds
• Each sampling with each trier was repeated 5 times
• On each submitted sample, a purity analysis, an other seed count and a germination test are performed according to the ISTA Rules chapters 3, 4, 5
An example of using this data analysis process:
New trier validation data - Other seeds by number analysis
• Data exploration
[Figures: Other seeds by number per lot (Lot1 to Lot12) for each trier (Cargo Sampler, ISTA Stick, Spiral Spear); boxplots of Other seeds by number for each lot-trier combination (chaffy lots C1 to C6, non-chaffy lots N1 to N6); normal Q-Q plot against Unit Normal Quantile]
Non-normality
Heteroscedasticity
Small differences across triers for a given lot
An example of using this data analysis process: New trier validation data - Other seeds by number analysis
• Modeling: comparing trier means, i.e. trueness of new triers
Results are counts, hence a Poisson GLM:
other seeds by number ~ Poisson($\lambda_{ijkl}$)
$\log(\lambda_{ijkl}) = \mu + \text{Trier} + \text{Lab} + \text{Lot(Lab)} + \text{Lab} \times \text{Trier} + \text{Trier} \times \text{Lot(Lab)} + \text{Residuals}$
After fitting this model, there is indication of overdispersion (variability of the counts greatly exceeds the mean under the Poisson assumption), which is very common for count data:
Dispersion parameter: $\hat{\phi} = 2.73$
This overdispersion is taken into account for the computation of the standard errors of the parameter estimates and related statistics
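To make the dispersion parameter concrete, the following Python sketch computes a Pearson-type estimate, assuming that is the kind of estimate used (the slides do not say how the 2.73 was obtained). Because the model contains the Trier x Lot(Lab) term, its fitted mean for each trier-lot cell equals the mean of the replicates in that cell, so no iterative fitting is needed here. The function names and the counts are hypothetical.

```python
from math import sqrt

def pearson_dispersion(cells):
    """Pearson dispersion estimate for a Poisson model saturated per cell.

    cells maps a cell label (e.g. a trier-lot combination) to the list of
    replicate counts in that cell. The fitted mean of each cell is its
    replicate mean; the estimate is the Pearson chi-square divided by
    (number of observations - number of cells).
    """
    chi_square = 0.0
    n_obs = 0
    for counts in cells.values():
        mean = sum(counts) / len(counts)
        if mean > 0:
            chi_square += sum((y - mean) ** 2 / mean for y in counts)
        n_obs += len(counts)
    return chi_square / (n_obs - len(cells))

def scaled_standard_error(poisson_se, dispersion):
    """Inflate a plain Poisson standard error by sqrt(dispersion)."""
    return poisson_se * sqrt(dispersion)

# Hypothetical replicate counts of other seeds for two trier-lot cells.
example_cells = {
    "C1-Cargo Sampler": [8, 12],
    "C1-ISTA Stick": [18, 22],
}
phi_hat = pearson_dispersion(example_cells)
print("Estimated dispersion:", phi_hat)
print("Scaled standard error:", scaled_standard_error(0.05, 4.0))
```

A value near 1 is what the Poisson assumption predicts; values well above 1, such as the 2.73 reported here, indicate overdispersion.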
An example of using this data analysis process: New trier validation data - Other seeds by number analysis
• Tests of fixed effects:
– No overall significant difference between triers
– Lot x Trier significant interaction
An example of using this data analysis process: New trier validation data - Other seeds by number analysis
• Inspecting Lot x Trier interactions:
[Figure: Interaction plot of Other seeds by number against trier (Cargo Sampler, ISTA Stick, Spiral Spear), one line per lot (N2, C5, C6, N6, C4, N3, N4, C3, C1, N5, C2, N1)]
Interaction is limited: no excessive overlap of the interaction plot lines, only 2 lots out of 12 exhibiting significant differences at the 1% level.
Trueness of the new triers accepted
An example of using this data analysis process: New trier validation data - Other seeds by number analysis
• Repeatability:
For each trier, the following Poisson GLM model is fitted:
other seeds by number ~ Poisson($\lambda_{ijk}$)
$\log(\lambda_{ijk}) = \mu + \text{Lab} + \text{Lot(Lab)} + \text{Residuals}$
Repeatability standard deviations are then defined as:
$S_r = \sqrt{\lambda \hat{\phi}}$
where $\hat{\phi}$ is the estimate of the dispersion parameter and $\lambda$ a nominal count (i.e. a "gold standard").
The repeatability standard deviations of the three triers are identical
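The repeatability definition can be sketched in Python as follows, reading it as the square root of the nominal count times the estimated dispersion (the standard deviation of an overdispersed Poisson count). The function name and the numbers are hypothetical and chosen only to show the arithmetic.

```python
from math import sqrt

def repeatability_sd(nominal_count, dispersion):
    """Repeatability standard deviation sqrt(lambda * phi_hat).

    nominal_count is the reference ("gold standard") count lambda and
    dispersion is the dispersion estimate phi_hat from the per-trier fit.
    """
    if nominal_count < 0 or dispersion < 0:
        raise ValueError("nominal_count and dispersion must be non-negative")
    return sqrt(nominal_count * dispersion)

# Hypothetical values: nominal count of 100 other seeds, dispersion of 4.
s_r = repeatability_sd(100, 4.0)
print("Repeatability standard deviation:", s_r)
```

With a dispersion of 1 the formula reduces to the plain Poisson standard deviation, the square root of the nominal count.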
Future work
• Adding free data analysis tools on the ISTA website:
– Data exploration:
– Modeling:
[Figure: example multi-panel plot of Yield, one panel per group (BH93, EA93, HW93, IN93, KE93, NN93, OA93, RN93, WP93)]
"Excel is not a statistical data analysis package. In all fairness, it was never intended to be one. The Data Analysis ToolPak is an add-in -- an extra feature that was added to enable you to do a few quick calculations. So it should not be surprising that that is just what it is good for - a few quick calculations."
- Eva Goldwater, U. Mass Data Analysis Group
http://www-unix.oit.umass.edu/~evagold/excel.html
Future work (cont)
• Seedcalc9
– 2-way and 3-way stack assessment (multinomial pool testing)
– Hypergeometric pool testing
• Statistical support to other Technical Committees, in particular to the DNA based methods Working Group from the Variety Committee, which has really specific data…
Many thanks to the ECOM and to the ISTA Secretariat for their support, and also to the TCOMs members for enriching discussions
Thank you for your attention!