
INTERNATIONAL SEED TESTING ASSOCIATION (ISTA) www.seedtest.org

Statistics Committee

2008 - 2009 report

[Cover graphics: plot of observed gene level (%) against nominal gene level (0%, 0.05%, 0.1%, 0.2%, 1.5%, 2%, 4%) with the number of data points per level; density plot titled "Distribution of the test statistic under H0 - df=9" (axes t and P(t)); assorted formula fragments]


June 17, 2008

June 14, 2009


• STA Committee membership:
  Chair: Jean-Louis Laffont, France
  Vice: Kirk Remund, USA
  Members:
    Julianna Bànyai, Hungary
    Julia Barabas, Hungary
    Olfat H. El Bagoury, Egypt
    Winfried Jackisch, Germany
    Zivan Karaman, France
    Michael Kruse, Germany

• Opportunity for new members following the 8th ISTA Seminar on Statistics in Seed Testing


Outline

• Overview of the activities from June 2008 to June 2009

• 8th ISTA Seminar on Statistics in Seed Testing

• Analysis of validation studies data

• Future work


Overview of the activities from June 2008 to June 2009

• Support to GMO Task Force
  – Rating of the GMO PT10 and PT11
  – Stack assessment: application for 2-way stack and 3-way stack assessment will be available soon
  – Biotechnology Trait Detection Workshop, May 11-16, 2009, Osijek, Croatia
    Organized by: Institute for Seed and Seedlings, Osijek, Croatia
    Conducted by: ISTA
    Sponsored by: FAO


Overview of the activities from June 2008 to June 2009

• Report reviews: 8 test plan reviews, 8 validation report reviews

• Statistical analyses
  Examples:
  – Validation study: Revised method for the germination test of Brassica spp. and Sinapis alba
  – Validation study: Germination procedure for Brachiaria brizantha
    For this analysis, we successfully used a modern and powerful approach that will be discussed later

• 8th ISTA Seminar on Statistics in Seed Testing


8th ISTA Seminar on Statistics in Seed Testing

• April 5 – 7, 2009 – Aussonne, France
  Hosted by Pioneer Génétique

• 23 participants from 6 different countries from industry and government laboratories

• Thanks to Charlotte Philip and Valérie Ancelin for making the arrangements for the seminar and to Nadine Ettel for her support in the seminar planning and preparation


8th ISTA Seminar on Statistics in Seed Testing

• Lectures given by Bonnie Hong, Zivan Karaman, Mustapha El Yakhlifi, Jean-Louis Laffont and Kirk Remund

• Seminar content:
  – Statistical distributions/tests
  – Data checking
  – Linear models
  – Seed testing plans
  – ISO laboratory uncertainty applications
  – Sampling
  – Statistical software: SAS, R, Excel statistical tools


8th ISTA Seminar on Statistics in Seed Testing

• Engaged discussion between the statisticians and the lab practitioners led to the identification of a modern and unified approach for the analysis of Validation Studies.

• The basis for this approach is the utilization of Generalized Linear Mixed-Effect Models (GLMM):
  – Normal underlying distributions and fixed effects only:
    Linear Model (LM), i.e. classical ANOVA
  – Normal underlying distributions and fixed and random effects:
    Linear Mixed-Effect Model (LMM)
  – Poisson or binomial underlying distributions and fixed effects only:
    Generalized Linear Model (GLM) – P. McCullagh and J.A. Nelder (1983)
  – Poisson or binomial underlying distributions and fixed and random effects:
    Generalized Linear Mixed-Effect Model (GLMM)
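
For reference, the four model classes listed above can be read as special cases of one formula. This is a sketch in standard textbook notation; the symbols g, X, Z, β, u and G are assumptions introduced here and do not appear on the slides.

```latex
% General GLMM: link function g, fixed effects \beta, random effects u
g\bigl(\mathrm{E}[\,y \mid u\,]\bigr) = X\beta + Zu, \qquad u \sim N(0, G)

% LM:   y normal, g = identity, no random effects (the term Zu is dropped)
% LMM:  y | u normal, g = identity, random effects kept
% GLM:  y Poisson or binomial, g = log or logit, no random effects
% GLMM: y | u Poisson or binomial, g = log or logit, random effects kept
```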


8th ISTA Seminar on Statistics in Seed Testing

Data exploration and data checking:
• Classical methods: Tolerance tables, Sum of % = 100%
• Modern methods: boxplots, Q-Q plots, Hampel’s method

[Figures: boxplots of % Normal Seedlings for the treatments 15-35-H2SO4-KNO3, 15-35-KNO3, 15-35-Preheat-KNO3, 20-35-H2SO4-KNO3, 20-35-KNO3 and 20-35-Preheat-KNO3; Q-Q plot of Other species (%) against Quantiles of Standard Normal]

Cleaned data

Fitting one class of Generalized Linear Mixed-effect Model:
• mean of the new method significantly ≠ mean of the reference method?
• Lot x Method interaction meaningful?
• repeatability variance of the new method = repeatability variance of the reference method?
• If enough labs: reproducibility variance of the new method = reproducibility variance of the reference method?
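
The slides name Hampel’s method among the modern data-checking tools without describing it. The sketch below assumes it refers to the median/MAD outlier rule commonly called the Hampel identifier. The function name, the threshold of 3 and the sample values are illustrative assumptions, not taken from the presentation.

```python
from statistics import median


def hampel_outliers(values, threshold=3.0):
    """Return indices of values flagged by the Hampel identifier.

    A value is flagged when |x - median| > threshold * 1.4826 * MAD,
    where MAD is the median absolute deviation from the median.
    """
    med = median(values)
    mad = median(abs(x - med) for x in values)
    scale = 1.4826 * mad  # makes the MAD comparable to a normal standard deviation
    if scale == 0:
        # At least half of the values equal the median: flag anything that differs
        return [i for i, x in enumerate(values) if x != med]
    return [i for i, x in enumerate(values) if abs(x - med) > threshold * scale]


# Hypothetical % normal seedlings from eight replicates (illustrative only)
germination = [88, 90, 87, 91, 89, 62, 90, 88]
print(hampel_outliers(germination))
```

Because the rule relies on the median and the MAD rather than the mean and the standard deviation, a single aberrant replicate does not inflate the scale used to judge it.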


8th ISTA Seminar on Statistics in Seed Testing

• Data exploration and visualization tools and GLMM can be implemented in R, which is freely available software. We have plans to deliver free packages for that purpose.


An example of using this data analysis process: New trier validation data

Protocol:
• 3 triers: Cargo sampler, Spiral spear, Sampling stick with compartments (ISTA method)
• 2 labs
• 6 lots/lab: 3 chaffy seeds, 3 non-chaffy seeds
• Each sampling with each trier was repeated 5 times
• On each submitted sample, a purity analysis, an other seed count and a germination test are performed according to the ISTA Rules chapters 3, 4, 5
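
As a quick check of the size of this design, the protocol can be enumerated as below. This is a minimal sketch: the trier names and lot labels follow the plot labels used later in the slides, while the lab names and the split of lots between the two labs are hypothetical.

```python
from itertools import product

TRIERS = ["Cargo Sampler", "ISTA Stick", "Spiral Spear"]
# Lot labels as they appear on the plots (C assumed chaffy, N assumed non-chaffy).
# Which lots went to which lab is NOT stated in the slides; this split is hypothetical.
LOTS_BY_LAB = {
    "Lab1": ["C1", "C2", "C3", "N1", "N2", "N3"],
    "Lab2": ["C4", "C5", "C6", "N4", "N5", "N6"],
}
REPEATS = 5


def build_design():
    """List every submitted sample as a dict with lab, lot, trier and repeat."""
    rows = []
    for lab, lots in LOTS_BY_LAB.items():
        for lot, trier, rep in product(lots, TRIERS, range(1, REPEATS + 1)):
            rows.append({"lab": lab, "lot": lot, "trier": trier, "rep": rep})
    return rows


design = build_design()
# 2 labs x 6 lots per lab x 3 triers x 5 repeats
print(len(design))
```

Each of the 12 lots therefore contributes 15 samples, 5 per trier, which is what makes a lot-by-trier comparison possible.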


An example of using this data analysis process:

New trier validation data - Other seeds by number analysis

• Data exploration

[Figures: boxplots of other seeds by number for each lot and trier combination (lots C1 to C6 and N1 to N6; triers Cargo Sampler, ISTA Stick, Spiral Spear); per-lot panels (Lot1 to Lot12) of other seeds by number by trier; Q-Q plot of other seeds by number against Unit Normal Quantile]

• Non-normality
• Heteroscedasticity
• Small differences across triers for a given lot


An example of using this data analysis process: New trier validation data - Other seeds by number analysis

• Modeling: comparing trier means to assess the trueness of the new triers

Results are counts, so a Poisson GLM is used:
other seeds by number ~ Poisson(λijkl)

log(λijkl) = mu + Trier + Lab + Lot(Lab) + Lab x Trier + Trier x Lot(Lab) + Residuals

After fitting this model, there is indication of overdispersion (variability of the counts greatly exceeds the mean under the Poisson assumption), which is very common for count data:
Dispersion parameter: $\hat{\phi} = 2.73$

This overdispersion is taken into account for the computation of the standard errors of the parameter estimates and related statistics
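
The slides do not say how the dispersion parameter was estimated. One common choice, assumed here, is the Pearson chi-square statistic divided by the residual degrees of freedom, with Poisson standard errors then multiplied by the square root of that estimate. The sketch below uses a simplified model with one parameter per cell, so that fitted means are plain cell averages. The function name and the counts are hypothetical.

```python
from math import sqrt


def pearson_dispersion(groups):
    """Estimate the dispersion of a Poisson model with one parameter per cell.

    `groups` maps a cell label to its list of counts. For such a model the
    fitted mean of each cell is the cell average, so no iterative fit is needed.
    """
    chi2 = 0.0
    n_obs = 0
    for counts in groups.values():
        mean = sum(counts) / len(counts)
        n_obs += len(counts)
        if mean > 0:
            # Pearson residual squared: (observed - fitted)^2 / fitted
            chi2 += sum((y - mean) ** 2 / mean for y in counts)
    df = n_obs - len(groups)  # residual degrees of freedom
    return chi2 / df


# Hypothetical counts of other seeds for two lot-by-trier cells (illustrative only)
cells = {
    "lot A, trier 1": [10, 14, 6, 12, 8],
    "lot B, trier 1": [20, 30, 25, 15, 10],
}
phi_hat = pearson_dispersion(cells)
se_scale = sqrt(phi_hat)  # factor applied to the Poisson standard errors
print(phi_hat, se_scale)
```

A value near 1 is what the Poisson assumption predicts; values well above 1, such as the 2.73 reported on the slide, indicate overdispersion.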


An example of using this data analysis process: New trier validation data - Other seeds by number analysis

• Tests of fixed effects:

  – No overall significant difference between triers
  – Significant Lot x Trier interaction


An example of using this data analysis process: New trier validation data - Other seeds by number analysis

• Inspecting Lot x Trier interactions:

[Figure: interaction plot of other seeds by number against trier (Cargo Sampler, ISTA Stick, Spiral Spear), one line per lot (C1 to C6, N1 to N6)]

Interaction is limited: no excessive overlap of the interaction plot lines, only 2 lots out of 12 exhibiting significant differences at the 1% level.

Trueness of the new triers accepted


An example of using this data analysis process: New trier validation data - Other seeds by number analysis

• Repeatability:
For each trier, the following Poisson GLM model is fitted:
other seeds by number ~ Poisson(λijk)

log(λijk) = mu + Lab + Lot(Lab) + Residuals

Repeatability standard deviations are then defined as:

$S_r = \sqrt{\hat{\phi}\,\lambda}$

where $\hat{\phi}$ is the estimate of the dispersion parameter and λ a nominal count (i.e. a “gold standard”).

The repeatability standard deviations of the three triers are identical
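
The repeatability formula above amounts to the standard deviation of an overdispersed Poisson count, whose variance is the dispersion times the mean. A minimal sketch follows. The function name is hypothetical, the nominal count of 100 is invented, and the dispersion of 2.73 is borrowed from the combined-trier model purely for illustration, since the per-trier estimates are not given in the slides.

```python
from math import sqrt


def repeatability_sd(phi_hat, nominal_count):
    """Repeatability standard deviation S_r = sqrt(phi_hat * lambda)."""
    if phi_hat < 0 or nominal_count < 0:
        raise ValueError("phi_hat and nominal_count must be non-negative")
    return sqrt(phi_hat * nominal_count)


# Hypothetical inputs (see the note above): dispersion 2.73, nominal count 100
print(repeatability_sd(2.73, 100))
```

With a dispersion of 1 the expression reduces to the usual Poisson standard deviation, the square root of the nominal count.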


Future work

• Adding free data analysis tools on the ISTA website:
  – Data exploration:
  – Modeling:

[Figure: multi-panel dot plot of Yield, with panels BH93, EA93, HW93, IN93, KE93, NN93, OA93, RN93 and WP93 and one row per entry (ann to zav)]

"Excel is not a statistical data analysis package. In all fairness, it was never intended to be one. The Data Analysis ToolPak is an add-in -- an extra feature that was added to enable you to do a few quick calculations. So it should not be surprising that that is just what it is good for - a few quick calculations."

- Eva Goldwater, U. Mass Data Analysis Group
http://www-unix.oit.umass.edu/~evagold/excel.html


Future work (cont)

• Seedcalc9
  – 2-way and 3-way stack assessment (multinomial pool testing)
  – Hypergeometric pool testing

• Statistical support to other Technical Committees, in particular to the DNA based methods Working Group from the Variety Committee, which has really specific data…


Many thanks to the ECOM and to the ISTA Secretariat for their support, and also to the TCOMs members for enriching discussions

Thank you for your attention!