Barbara Mascialino, IEEE-NSS, October 21st, 2004
Application of statistical methods for the comparison of data distributions
Susanna Guatelli, Barbara Mascialino, Andreas Pfeiffer, Maria Grazia Pia, Alberto Ribon, Paolo Viarengo
Outline
• The comparison of two data distributions is fundamental in experimental practice:
– Detector monitoring (current versus reference data)
– Simulation validation (experiment versus simulation)
– Reconstruction versus expectation
– Regression testing (two versions of the same software)
– Physics analysis (measurement versus theory, experiment A versus experiment B)
• Many algorithms are available for the comparison of two data distributions (the two-sample problem):
– Parametric statistics
– Non-parametric statistics (goodness-of-fit testing)
Aim of this study: compare the algorithms available in the statistics literature and select the most appropriate one for each specific case
The two-sample problem
EXAMPLE 1: binned data
EXAMPLE 2: unbinned data
[Figures: X-ray fluorescence spectrum; dosimetric distribution from a medical LINAC]
Which is the most suitable goodness-of-fit test?
Chi-squared test
• Applies to binned distributions
• It can also be useful for unbinned distributions, but the data must first be grouped into classes
• Cannot be applied if the theoretical (expected) frequency in any class is < 5
– When this condition is not met, one can merge contiguous classes until the minimum theoretical frequency is reached
– Otherwise one can use Yates’ formula
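The merging of contiguous classes mentioned on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the GoF Toolkit implementation: the function names, the left-to-right merging order and the example counts are all hypothetical, and leftover classes at the right edge are simply folded into the last merged class.

```python
def merge_small_classes(observed, expected, min_expected=5.0):
    """Merge contiguous classes until each expected frequency is >= min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # Classes left over at the end are folded into the last merged class.
    if acc_exp > 0 or acc_obs > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi_squared(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the outer classes have expected frequencies below 5.
observed = [1, 3, 8, 12, 9, 4, 2, 1]
expected = [2.0, 4.0, 7.0, 11.0, 9.0, 4.0, 2.0, 1.0]

obs_m, exp_m = merge_small_classes(observed, expected)
chi2 = chi_squared(obs_m, exp_m)
dof = len(obs_m) - 1  # degrees of freedom when no parameter is fitted
print(obs_m, exp_m, round(chi2, 4), dof)
```

Merging reduces the number of classes, and therefore the degrees of freedom, so the reference chi-squared distribution has to be chosen after the merge.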
Tests based on the supremum statistics (unbinned distributions)
[Figure: original distributions and the corresponding empirical distribution functions, with the supremum statistic D_mn marked]
• Kolmogorov-Smirnov test: $D_{mn} = \sup_x |F_n(x) - G_m(x)|$
• Goodman approximation of KS test: $\chi^2 = 4 D_{mn}^2 \frac{mn}{m+n}$
• Kuiper test: $D^*_{mn} = \max_x [F_n(x) - G_m(x)] + \max_x [G_m(x) - F_n(x)]$
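The three supremum statistics listed on this slide can be sketched in a few lines of Python. This is only an illustration, not the GoF Toolkit code: the function names are hypothetical, and the expressions used are the standard textbook forms, D = sup|F - G| for Kolmogorov-Smirnov, D* = max(F - G) + max(G - F) for Kuiper, and chi-squared = 4 D^2 mn/(m+n) for the Goodman approximation.

```python
def ecdf_differences(sample_f, sample_g):
    """Return F_n(x) - G_m(x) evaluated at every distinct pooled observation."""
    f_sorted, g_sorted = sorted(sample_f), sorted(sample_g)
    n, m = len(f_sorted), len(g_sorted)
    diffs = []
    i = j = 0
    for x in sorted(set(f_sorted) | set(g_sorted)):
        # Advance the counters so that i/n and j/m are the ECDF values at x.
        while i < n and f_sorted[i] <= x:
            i += 1
        while j < m and g_sorted[j] <= x:
            j += 1
        diffs.append(i / n - j / m)
    return diffs


def ks_statistic(sample_f, sample_g):
    """Largest absolute distance between the two empirical distribution functions."""
    return max(abs(d) for d in ecdf_differences(sample_f, sample_g))


def kuiper_statistic(sample_f, sample_g):
    """Sum of the largest positive and the largest negative deviation."""
    diffs = ecdf_differences(sample_f, sample_g)
    # Both ECDFs are 0 far to the left, so each maximum is at least 0.
    return max(max(diffs), 0.0) + max(max(-d for d in diffs), 0.0)


def goodman_chi2(sample_f, sample_g):
    """Goodman's chi-squared approximation built from the KS distance."""
    n, m = len(sample_f), len(sample_g)
    d = ks_statistic(sample_f, sample_g)
    return 4.0 * d * d * m * n / (m + n)


a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
print(ks_statistic(a, b), kuiper_statistic(a, b), goodman_chi2(a, b))
```

Because both empirical distribution functions are step functions, evaluating the difference at the pooled observations is enough to find the supremum.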
Tests containing a weighting function (binned/unbinned distributions)
[Figure: original distributions and the corresponding empirical distribution functions; quadratic statistics + weighting function: sum/integral of all the distances]
• Fisz-Cramér-von Mises test: $t = \frac{n_1 n_2}{(n_1 + n_2)^2} \sum_i [F_1(x_i) - F_2(x_i)]^2$
• k-sample Anderson-Darling test: $A^2_{kn} = \frac{n-1}{n^2 (k-1)} \sum_{i=1}^{k} \frac{1}{n_i} \sum_j h_j \frac{(n F_{ij} - n_i H_j)^2}{H_j (n - H_j) - n h_j / 4}$
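In contrast to the supremum statistics, a quadratic statistic accumulates the distance between the two empirical distribution functions over all observations. A minimal sketch of the two-sample Fisz-Cramér-von Mises statistic follows, assuming the standard form t = n1 n2 / (n1 + n2)^2 times the sum of squared differences over the pooled observations. The function name is hypothetical and no p-value is computed here.

```python
from bisect import bisect_right


def cramer_von_mises_statistic(sample_1, sample_2):
    """Two-sample Cramér-von Mises statistic summed over the pooled observations."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    n1, n2 = len(s1), len(s2)
    total = 0.0
    for x in s1 + s2:
        # bisect_right counts the observations <= x, i.e. n * ECDF(x).
        f1 = bisect_right(s1, x) / n1
        f2 = bisect_right(s2, x) / n2
        total += (f1 - f2) ** 2
    return n1 * n2 / (n1 + n2) ** 2 * total


a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
t = cramer_von_mises_statistic(a, b)
print(t)
```

The Anderson-Darling statistic has the same structure, with each squared distance divided by a weight that emphasises the tails.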
G.A.P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo, “A Goodness-of-Fit Statistical Toolkit”, IEEE Transactions on Nuclear Science 51 (5), October 2004.
http://www.ge.infn.it/geant4/analysis/HEPstatistics/
Power evaluation
The power of a test is the probability of rejecting the null hypothesis correctly.
Pseudoexperiment: a random drawing of two samples from two parent distributions
[Diagram: Parent distribution 1 → Sample 1 (size n); Parent distribution 2 → Sample 2 (size m); both samples enter the GoF test]
• N = 1000 Monte Carlo replications
• Significance level = 0.05 (confidence level CL = 0.95)
$\text{Power} = \frac{\#\ \text{pseudoexperiments with p-value} < (1 - \text{CL})}{\#\ \text{pseudoexperiments}}$
For each test, the p-value computed by the GoF Toolkit derives from the analytical calculation of the asymptotic distribution, often depending on the sample sizes.
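The power estimate described on this slide amounts to counting rejections over repeated pseudoexperiments. The sketch below illustrates the procedure for the Kolmogorov-Smirnov test only. It is not the GoF Toolkit code: the function names, the fixed seed, the reduced number of replications and the use of the Kolmogorov series for the asymptotic p-value are assumptions made for the illustration.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Supremum distance between the two empirical distribution functions."""
    sa, sb = sorted(a), sorted(b)
    return max(abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
               for x in sa + sb)


def ks_asymptotic_p_value(d, n, m):
    """Asymptotic p-value from the Kolmogorov series (an assumed choice here)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly there and the p-value is ~1
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def estimate_power(draw_1, draw_2, n, m, replications=1000, alpha=0.05, seed=0):
    """Fraction of pseudoexperiments whose p-value falls below alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        sample_1 = [draw_1(rng) for _ in range(n)]
        sample_2 = [draw_2(rng) for _ in range(m)]
        p = ks_asymptotic_p_value(ks_statistic(sample_1, sample_2), n, m)
        if p < alpha:
            rejections += 1
    return rejections / replications


def standard_normal(rng):
    return rng.gauss(0.0, 1.0)


def shifted_normal(rng):
    return rng.gauss(2.0, 1.0)


# Hypothetical settings: 200 replications instead of 1000 to keep the run short.
power_shift = estimate_power(standard_normal, shifted_normal, 50, 50, replications=200)
rate_null = estimate_power(standard_normal, standard_normal, 50, 50, replications=200)
print(power_shift, rate_null)
```

When both samples come from the same parent, the same counting procedure estimates the false rejection rate rather than the power, which gives a quick sanity check of the p-value calculation.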
Parent distributions
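$f_1(x) = 1$ (Uniform)
$f_2(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ (Gaussian)
$f_3(x) = \frac{1}{2} e^{-|x|}$ (Double exponential)
$f_4(x) = \frac{1}{\pi} \frac{1}{1 + x^2}$ (Cauchy)
$f_5(x) = e^{-x}$ (Exponential)
$f_6(x) = 0.9\,N(0,1) + 0.1\,N(0,9)$ (Contaminated Normal Distribution 1)
$f_7(x) = 0.5\,N(1,4) + 0.5\,N(-1,1)$ (Contaminated Normal Distribution 2)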
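Drawing pseudoexperiment samples from these seven parents can be sketched with the Python standard library. The sketch rests on several assumptions that the slide does not state: the uniform is taken on [0, 1), the second argument of N(., .) is read as a variance, the second component of the Contaminated Normal 2 is taken to be centred at -1, and the function and key names are hypothetical.

```python
import math
import random


def make_samplers(rng):
    """Return one zero-argument sampler per parent distribution."""

    def double_exponential():
        # Symmetric Laplace: exponential magnitude with a random sign.
        magnitude = rng.expovariate(1.0)
        return magnitude if rng.random() < 0.5 else -magnitude

    def cauchy():
        # Inverse-CDF method for the standard Cauchy distribution.
        return math.tan(math.pi * (rng.random() - 0.5))

    def contaminated_normal_1():
        # 0.9 N(0,1) + 0.1 N(0,9): variance 9 means standard deviation 3.
        return rng.gauss(0.0, 1.0) if rng.random() < 0.9 else rng.gauss(0.0, 3.0)

    def contaminated_normal_2():
        # 0.5 N(1,4) + 0.5 N(-1,1): variance 4 means standard deviation 2.
        return rng.gauss(1.0, 2.0) if rng.random() < 0.5 else rng.gauss(-1.0, 1.0)

    return {
        "uniform": rng.random,
        "gaussian": lambda: rng.gauss(0.0, 1.0),
        "double_exponential": double_exponential,
        "cauchy": cauchy,
        "exponential": lambda: rng.expovariate(1.0),
        "contaminated_normal_1": contaminated_normal_1,
        "contaminated_normal_2": contaminated_normal_2,
    }


rng = random.Random(1)
samplers = make_samplers(rng)
draws = {name: [draw() for _ in range(5)] for name, draw in samplers.items()}
for name, values in draws.items():
    print(name, [round(v, 3) for v in values])
```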
Skewness and tailweight
Skewness: $S = \frac{x_{0.975} - x_{0.5}}{x_{0.5} - x_{0.025}}$
Tailweight: $T = \frac{x_{0.975} - x_{0.025}}{x_{0.875} - x_{0.125}}$
where $x_p$ denotes the p-quantile of the distribution.
Parent                           S (skewness)   T (tailweight)
f1(x) Uniform                    1              1.267
f2(x) Gaussian                   1              1.704
f3(x) Double exponential         1              2.161
f4(x) Cauchy                     1              5.263
f5(x) Exponential                4.486          1.883
f6(x) Contaminated normal 1      1              1.991
f7(x) Contaminated normal 2      1.769          1.693
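The skewness and tailweight indicators can be checked numerically for the parents that have a closed-form quantile function. The sketch below assumes that S and T are the quantile ratios S = (x_0.975 - x_0.5) / (x_0.5 - x_0.025) and T = (x_0.975 - x_0.025) / (x_0.875 - x_0.125); the function names are hypothetical. Under that assumption the tabulated values are reproduced to three decimals.

```python
import math


def skewness_and_tailweight(quantile):
    """Compute (S, T) from a quantile function mapping p in (0, 1) to x_p."""
    q = quantile
    s = (q(0.975) - q(0.5)) / (q(0.5) - q(0.025))
    t = (q(0.975) - q(0.025)) / (q(0.875) - q(0.125))
    return s, t


def uniform_quantile(p):
    return p


def exponential_quantile(p):
    return -math.log(1.0 - p)


def cauchy_quantile(p):
    return math.tan(math.pi * (p - 0.5))


def double_exponential_quantile(p):
    return math.log(2.0 * p) if p < 0.5 else -math.log(2.0 * (1.0 - p))


results = {
    "uniform": skewness_and_tailweight(uniform_quantile),
    "double_exponential": skewness_and_tailweight(double_exponential_quantile),
    "cauchy": skewness_and_tailweight(cauchy_quantile),
    "exponential": skewness_and_tailweight(exponential_quantile),
}
for name, (s, t) in results.items():
    print(f"{name}: S = {s:.3f}, T = {t:.3f}")
```

For measured data the same two ratios can be evaluated on sample quantiles, which gives the estimates of S and T used to classify a distribution.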
The “location-scale problem”
Case Parent 1 = Parent 2
• Power increases as a function of the sample size (analytical calculation of the asymptotic distribution)
[Figure: power of the Kolmogorov-Smirnov test versus sample size N, significance level 0.05, for small and moderate sized samples; curves for Uniform, Normal, Exponential, Double Exponential, Contaminated Normal 1, Contaminated Normal 2 and Cauchy parents]
The “general shape problem”
Case Parent 1 ≠ Parent 2
A) Symmetric versus symmetric (S1 = S2 = 1)
[Figure: power versus the tailweight T2 of Distribution 2, significance level 0.05, for the Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling tests; Distribution 1: Double exponential (T1 = 2.161)]
• For short-medium tailed distributions: KS ~ CVM ~ AD
• For very long tailed distributions: KS ~ CVM < AD
B) Skewed versus symmetric
Distribution 1 – Distribution 2     KS          CVM         AD
CN2 – Normal                        55.6±1.8    15.2±1.1    86.1±1.1
CN2 – CN1                           24.9±1.4    25.2±1.1    44.8±1.6
CN2 – Double exponential            37.6±1.5    40.2±1.6    51.6±1.6
Comparative evaluation of tests
Skewness \ Tailweight    Short (T < 1.5)    Medium (1.5 < T < 2)    Long (T > 2)
S ~ 1                    KS                 KS – CVM                CVM – AD
S > 1.5                  KS – AD            AD                      CVM – AD
$\chi^2$ < Supremum statistics tests < Tests containing a weight function
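The comparative table can be turned into a small lookup, sketched below. The slide does not say how to treat skewness values between 1 and 1.5 or tailweights exactly equal to 1.5 or 2, so the boundary handling here is an assumption, as are the function name and the labels.

```python
def recommend_tests(skewness, tailweight):
    """Return the tests suggested by the comparative table for given S and T."""
    if tailweight < 1.5:
        tail = "short"
    elif tailweight < 2.0:
        tail = "medium"
    else:
        tail = "long"
    # Assumption: the table only lists S ~ 1 and S > 1.5, so every
    # skewness up to 1.5 is treated here as belonging to the S ~ 1 row.
    skew = "skewed" if skewness > 1.5 else "symmetric"
    table = {
        ("symmetric", "short"): ["KS"],
        ("symmetric", "medium"): ["KS", "CVM"],
        ("symmetric", "long"): ["CVM", "AD"],
        ("skewed", "short"): ["KS", "AD"],
        ("skewed", "medium"): ["AD"],
        ("skewed", "long"): ["CVM", "AD"],
    }
    return table[(skew, tail)]


# S and T values taken from the skewness and tailweight table of the parents.
print(recommend_tests(1.0, 1.267))    # uniform
print(recommend_tests(4.486, 1.883))  # exponential
print(recommend_tests(1.0, 5.263))    # Cauchy
```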
Results for the data examples
EXAMPLE 1: binned data
X-variable: Ŝ = 4, T̂ = 1.43
Y-variable: Ŝ = 4, T̂ = 1.50
Extremely skewed, medium tail: ANDERSON-DARLING TEST, A² = 0.085, p > 0.05
EXAMPLE 2: unbinned data
X-variable: Ŝ = 1.53, T̂ = 1.36
Y-variable: Ŝ = 1.27, T̂ = 1.34
Moderately skewed, medium tail: KOLMOGOROV-SMIRNOV TEST, D = 0.27, p > 0.05
Conclusions
• Studied several goodness-of-fit tests for location-scale alternatives and general alternatives
• There is no clear winner for all the considered distributions in general
• To select one test in practice:
1. first classify the type of the distributions in terms of skewness S and tailweight T
2. choose the most appropriate test for the classified type of distribution
Topic still subject to research activity in the domain of statistics