POOLED DATA DISTRIBUTIONS
GRAPHICAL AND STATISTICAL TOOLS FOR EXAMINING COMPARISON REFERENCE VALUES
Alan Steele, Ken Hill, and Rob Douglas
National Research Council of Canada
E-mail: [email protected]
Measurement comparison data sets are generally summarized using a simple statistical reference value calculated from the pool of the participants’ results. Consideration of the comparison data sets, particularly with regard to the consequences and implications of such data pooling, can allow informed decisions regarding the appropriateness of choosing a simple statistical reference value. Graphs of the relevant distributions provide insight into this problem.
Steele, Hill, and Douglas: Pooled Data Distributions 2
Introduction
• Comparison data collection and analysis continue to grow in importance among the tasks of international metrology
• Sample distributions and populations are routinely considered when preparing the summary of the comparison
• Reference values (KCRVs) are often calculated from the measurement data supplied by the participants
• We believe that graphical techniques are an aid to understanding and communication in this field
The Normal Approach
• Generally, the initial implicit assumption is that all of the participants’ data, as x_i/u_i, represent individual samples from a single (normal) population
• A coherent picture of the population mean and standard deviation can be built from the comparison data set that is fully consistent with the reported values and uncertainties
• Most outlier-test protocols rely on this assumption to identify when and if a given laboratory result should be excluded, since its inclusion would violate this internal consistency
Pooled Data Distributions
• Creating pooled data distributions tackles this problem from the opposite direction
• The independent distributions reported by each participant (through their value and uncertainty) are summed directly
• Result is taken as representative of the underlying population as revealed in the comparison measurements
• Monte Carlo methods are useful when calculations involve Student distributions or medians rather than means
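The direct sum described above can be sketched in a few lines. This is a minimal illustration, not the authors' toolkit: it assumes each participant is represented by a Student distribution located at the reported value and scaled by the reported standard uncertainty, and that the sum is divided by the number of labs so the pooled curve has unit area. The two labs in `HYPOTHETICAL_LABS` are invented purely for illustration.

```python
import math

def student_pdf(x, loc, scale, nu):
    """PDF of a Student distribution shifted to `loc` and scaled by `scale`."""
    t = (x - loc) / scale
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu)) / scale

def pooled_pdf(x, labs):
    """Equal-weight sum of the labs' distributions, normalized to unit area."""
    return sum(student_pdf(x, *lab) for lab in labs) / len(labs)

def integrate(f, lo, hi, steps):
    """Trapezoid rule, used here only to check the normalization."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

# Hypothetical (value, standard uncertainty, degrees of freedom) triples.
HYPOTHETICAL_LABS = [(0.0, 1.0, 5.0), (4.0, 1.0, 5.0)]

area = integrate(lambda x: pooled_pdf(x, HYPOTHETICAL_LABS), -60.0, 64.0, 12400)
print(f"area under pooled PDF: {area:.4f}")
print(f"pooled PDF at 0, 2, 4: "
      f"{pooled_pdf(0.0, HYPOTHETICAL_LABS):.3f}, "
      f"{pooled_pdf(2.0, HYPOTHETICAL_LABS):.3f}, "
      f"{pooled_pdf(4.0, HYPOTHETICAL_LABS):.3f}")
```

With two well-separated labs the pooled curve has a dip between the reported values, which is the kind of multi-modal structure a single normal population cannot reproduce.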
Monte Carlo Calculations
• High quality linear congruential uniform random number generators are easy to find
• Transformation from uniform to any distribution is done via the inverse of the cumulative distribution function
• Example shows Student distribution transform
• Our Excel Toolkit includes an external DLL for doing fast Monte Carlo simulations with multiple large arrays
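The uniform-to-target transformation can be sketched as follows. This is an illustrative sketch, not the Excel Toolkit or its DLL: the function names are hypothetical, the CDF is inverted by simple bisection, and the demonstration uses the Student distribution with 2 degrees of freedom only because its CDF has a closed form. Python's `random` module uses the Mersenne Twister rather than a linear congruential generator.

```python
import math
import random
import statistics

def student2_cdf(t):
    """Closed-form CDF of the standard Student distribution with 2 degrees of freedom."""
    return 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))

def inverse_cdf(cdf, p, lo=-1e6, hi=1e6, iterations=80):
    """Invert a monotonic CDF by bisection; results are clipped to [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample(cdf, rng):
    """Draw one value: uniform number in [0, 1) mapped through the inverse CDF."""
    return inverse_cdf(cdf, rng.random())

rng = random.Random(12345)          # fixed seed so the run is repeatable
draws = [sample(student2_cdf, rng) for _ in range(5000)]

k95 = inverse_cdf(student2_cdf, 0.975)
print(f"97.5% quantile for 2 degrees of freedom: {k95:.4f}")
print(f"sample median of {len(draws)} draws: {statistics.median(draws):.3f}")
```

Any other distribution can be sampled the same way by passing a different `cdf`; a shifted and scaled Student draw is obtained as location plus scale times the standard draw.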
[Figure: Student PDF and Student CDF versus x (−10 to 10) for μ = −1, σ = 2, ν = 4, alongside a Student histogram of 10^5 Monte Carlo events with the same CDF]
Dealing with Student Distributions
[Figure: CDF(x, μ = 0, σ = 1, ν) from 0% to 100% versus x (−5 to 5), one curve for each ν = 2 to 10]
• Student Cumulative Distribution Functions for different Degrees of Freedom (ν = 2…10)
• Note that the line at 97.5% cumulative probability crosses each curve at the coverage factor, k, appropriate for a 95% confidence interval
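The crossing points described above can be reproduced numerically. The sketch below is one possible way to do it with the standard library only: the Student CDF is obtained by Simpson integration of the PDF, and the coverage factor k is found by bisection on the 97.5% level. The function names and the step counts are illustrative choices, not part of the original presentation.

```python
import math

def student_pdf(t, nu):
    """PDF of the standard Student distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))

def student_cdf(x, nu, steps=400):
    """CDF by Simpson's rule on [0, |x|], using the symmetry about zero."""
    if x == 0:
        return 0.5
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = student_pdf(0.0, nu) + student_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def coverage_factor(nu, p=0.975, lo=0.0, hi=20.0, iterations=40):
    """Value k with CDF(k) = p, so that +/- k covers 2p - 1 of the distribution."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if student_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

K = {nu: coverage_factor(nu) for nu in range(2, 11)}
for nu, k in K.items():
    print(f"nu = {nu:2d}: k = {k:.3f}")
```

The factor falls steadily toward the normal-distribution value of about 1.96 as the degrees of freedom increase, which is why finite degrees of freedom matter most for the labs that report small ν.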
[Figure: detail of the same CDF curves from 95% to 100% versus x (1.5 to 5.0), ν = 2 to 10; the crossing of each curve with the 97.5% line marks the coverage factor k]
Example Data From KCDB
• Recent results for CCAUV.U-K1
• Low power, 1.9 MHz: 5 Labs
• Finite degrees of freedom specified for all participants
• Data failed consistency check using weighted mean
• Median chosen as KCRV
Lab      P_Ref (mW)   u (mW)   ν
PTB      97.4         0.84     8.3
NIST     99           0.64     6.3
NPL      97.6         1.01     11
CSIRO    114.5        6.75     6.7
NIM      94           1.16     12.5
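A weighted-mean consistency check on the tabulated values can be sketched as below. This is an assumption-laden illustration rather than the procedure used in the comparison report: it treats the labs as independent, ignores the finite degrees of freedom, and compares the observed chi-squared about the weighted mean with the 95th percentile of a chi-squared distribution with N − 1 = 4 degrees of freedom (9.488).

```python
import statistics

# (value in mW, standard uncertainty in mW) from the table above.
DATA = {
    "PTB": (97.4, 0.84),
    "NIST": (99.0, 0.64),
    "NPL": (97.6, 1.01),
    "CSIRO": (114.5, 6.75),
    "NIM": (94.0, 1.16),
}

def weighted_mean(values, uncertainties):
    """Inverse-variance weighted mean."""
    weights = [1.0 / u ** 2 for u in uncertainties]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

def chi_squared_about(values, uncertainties, reference):
    """Sum of squared normalized deviations from a reference value."""
    return sum(((x - reference) / u) ** 2 for x, u in zip(values, uncertainties))

values = [x for x, _ in DATA.values()]
uncertainties = [u for _, u in DATA.values()]

wm = weighted_mean(values, uncertainties)
chi2_obs = chi_squared_about(values, uncertainties, wm)
CHI2_CRIT_95_DOF4 = 9.488        # 95th percentile, chi-squared with 4 dof
consistent = chi2_obs <= CHI2_CRIT_95_DOF4

print(f"simple mean   = {statistics.fmean(values):.2f} mW")
print(f"weighted mean = {wm:.2f} mW")
print(f"median        = {statistics.median(values):.2f} mW")
print(f"chi-squared   = {chi2_obs:.2f} (limit {CHI2_CRIT_95_DOF4}), consistent: {consistent}")
```

Under these assumptions the observed chi-squared is well above the limit, in line with the failed consistency check noted above.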
[Figure: P_Ref (mW) reported by PTB, NIST, NPL, CSIRO, and NIM, plotted on a scale from 90 to 140 mW]
Statistical Distributions
• Results of Monte Carlo simulation:
– lab distributions used to resample comparison
– pooled data histogram incremented once for each lab per event
– mean, weighted mean, and median calculated for each event
• Population revealed by measurement is multi-modal and evidently not normal
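The resampling loop described above might look like the following sketch. Several choices here are assumptions rather than the authors' implementation: each lab is drawn from a Student distribution located at its value and scaled by its stated uncertainty; the draw uses the ratio of a normal and a chi-squared variate instead of the inverse-CDF transform; the weights of the weighted mean are fixed by the stated uncertainties; and the histogram range of 75 to 125 mW is taken from the plot axis.

```python
import math
import random
import statistics

# (value in mW, standard uncertainty in mW, degrees of freedom) for CCAUV.U-K1.
LABS = {
    "PTB": (97.4, 0.84, 8.3),
    "NIST": (99.0, 0.64, 6.3),
    "NPL": (97.6, 1.01, 11.0),
    "CSIRO": (114.5, 6.75, 6.7),
    "NIM": (94.0, 1.16, 12.5),
}

def draw_student(rng, value, u, nu):
    """One draw from a shifted, scaled Student distribution (normal / chi ratio)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)   # chi-squared with nu degrees of freedom
    return value + u * z / math.sqrt(chi2 / nu)

def simulate(n_events, seed=1, lo=75.0, hi=125.0, n_bins=100):
    rng = random.Random(seed)
    weights = [1.0 / u ** 2 for _, u, _ in LABS.values()]
    bin_width = (hi - lo) / n_bins
    pooled = [0] * n_bins            # pooled-data histogram
    out_of_range = 0
    means, wmeans, medians = [], [], []
    for _ in range(n_events):
        event = [draw_student(rng, *lab) for lab in LABS.values()]
        for x in event:              # one histogram increment per lab per event
            if lo <= x < hi:
                pooled[min(int((x - lo) / bin_width), n_bins - 1)] += 1
            else:
                out_of_range += 1
        means.append(sum(event) / len(event))
        wmeans.append(sum(w * x for w, x in zip(weights, event)) / sum(weights))
        medians.append(statistics.median(event))
    return pooled, out_of_range, means, wmeans, medians

pooled, out_of_range, means, wmeans, medians = simulate(5000)
print("typical simple mean  :", round(statistics.median(means), 2))
print("typical weighted mean:", round(statistics.median(wmeans), 2))
print("typical median       :", round(statistics.median(medians), 2))
```

Histogramming `means`, `wmeans`, and `medians` on the same axis as `pooled` gives the kind of overlay shown in the figures.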
[Figure: individual distributions for NIM, CSIRO, NPL, NIST, and PTB versus P_Ref (mW), 75 to 125 mW, above the pooled data PDF on the same axis]
[Figure: the same plot with the distributions of the weighted mean, median, and simple mean overlaid on the pooled data PDF]
Advantages of Monte Carlo
• Technique is simple to implement
• Allows calculation of confidence intervals for statistics
• Covariances can be accommodated in straightforward manner
• Possible to include outlier rejection schemes
• Easy to track quantities of interest, such as probability of a given participant being median laboratory
• Can consider other candidate reference values
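Two of the items above, a confidence interval for a statistic and the probability of a given participant being the median laboratory, can be tracked in the same loop. The sketch below uses the CCAUV.U-K1 values quoted earlier in the presentation and makes the same assumptions as a plain resampling run: independent labs, shifted and scaled Student distributions, and draws generated as a normal variate divided by the root of a scaled chi-squared variate. It also assumes an odd number of labs, so that the median is always a single lab's value.

```python
import math
import random

# (value in mW, standard uncertainty in mW, degrees of freedom) for CCAUV.U-K1.
LABS = {
    "PTB": (97.4, 0.84, 8.3),
    "NIST": (99.0, 0.64, 6.3),
    "NPL": (97.6, 1.01, 11.0),
    "CSIRO": (114.5, 6.75, 6.7),
    "NIM": (94.0, 1.16, 12.5),
}

def draw_student(rng, value, u, nu):
    """One draw from a shifted, scaled Student distribution."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return value + u * z / math.sqrt(chi2 / nu)

def median_lab_statistics(n_events, seed=1):
    rng = random.Random(seed)
    counts = dict.fromkeys(LABS, 0)
    medians = []
    for _ in range(n_events):
        event = sorted((draw_student(rng, *LABS[name]), name) for name in LABS)
        value, name = event[len(event) // 2]    # middle entry of an odd-sized set
        counts[name] += 1
        medians.append(value)
    medians.sort()
    lower = medians[int(0.025 * n_events)]
    upper = medians[int(0.975 * n_events) - 1]
    probabilities = {name: c / n_events for name, c in counts.items()}
    return probabilities, (lower, upper)

probabilities, interval = median_lab_statistics(20000)
for name, p in probabilities.items():
    print(f"P({name} is the median lab) = {p:.3f}")
print(f"95% interval for the median: {interval[0]:.2f} to {interval[1]:.2f} mW")
```

Outlier rejection or a different candidate reference value would slot into the same loop as an extra step per event.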
Example: CCT-K3 Argon Point
• Another example from KCDB
• CCT-K3 Argon Triple Point
• Large variation in reported values
• Large variation in stated uncertainties
• No KCRV was assigned, based on data pooling analysis
[Figure: T_Lab − T_Pilot (mK) by laboratory, on a scale from −4 to 2 mK]
Algorithmic Reference Values
• Linear combinations of simple estimators can be used as robust estimators of location
• For CCT-K3, proposal to use simple average of mean, weighted mean, and median
• Evaluation of any such algorithmic estimator is easy to do with Monte Carlo
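An algorithmic estimator of this kind is a short function. Because the CCT-K3 values are not listed in this presentation, the sketch below applies the proposed average of mean, weighted mean, and median to the CCAUV.U-K1 values from the earlier example, purely as an illustration; the function names are hypothetical.

```python
import statistics

def weighted_mean(values, uncertainties):
    """Inverse-variance weighted mean."""
    weights = [1.0 / u ** 2 for u in uncertainties]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

def algorithmic_reference_value(values, uncertainties):
    """Simple average of the mean, the weighted mean, and the median."""
    simple = statistics.fmean(values)
    weighted = weighted_mean(values, uncertainties)
    median = statistics.median(values)
    return (simple + weighted + median) / 3.0

# CCAUV.U-K1 values and standard uncertainties (mW), used only as sample input.
values = [97.4, 99.0, 97.6, 114.5, 94.0]
uncertainties = [0.84, 0.64, 1.01, 6.75, 1.16]

arv = algorithmic_reference_value(values, uncertainties)
print(f"ARV = {arv:.2f} mW")
```

In a Monte Carlo run the same function would be called once per resampled event, and the resulting values histogrammed to obtain the distribution of the ARV.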
[Figure: PDFs of the ARV, weighted mean, median, and simple mean versus T_LAB − T_ARV (mK), from −0.50 to 0.50 mK]
Quantifying the Comparison
• Calculating a reference value, typically the variance-weighted mean or the median, is a routine part of reporting comparisons
• The suitability of these statistics for representing the data set can be checked using chi-squared testing
• It is also possible to perform such tests without invoking a reference value by considering the data in pairwise fashion
• Advantages of pair-statistics:
– Always works, even before choosing a reference value
– More rigorous, since it can handle correlations exactly
– Explicit, following metrological chains of inference
Pair-Difference Distributions
• Similar to exclusive statistics
• Consider difference between one lab and “rest of world”
• Sum of per-lab differences is the all-pairs-difference (APD) distribution; this is symmetric
• Width of APD is a measure of “global” quality assurance for independent calibration of an artifact by two different labs chosen at random
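The construction of the per-lab and all-pairs difference distributions can be sketched analytically. The version below is a simplification under stated assumptions: labs are independent, each is approximated by a normal distribution (the finite degrees of freedom are ignored), and the sums are divided by the number of terms so that each curve has unit area. The values are those of the CCAUV.U-K1 example.

```python
import math

# (value in mW, standard uncertainty in mW) for CCAUV.U-K1.
LABS = {
    "PTB": (97.4, 0.84),
    "NIST": (99.0, 0.64),
    "NPL": (97.6, 1.01),
    "CSIRO": (114.5, 6.75),
    "NIM": (94.0, 1.16),
}

def normal_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def lab_difference_pdf(d, ref, labs):
    """Distribution of X_j - X_ref over the 'rest of world' labs j."""
    x_ref, u_ref = labs[ref]
    others = [(x, u) for name, (x, u) in labs.items() if name != ref]
    return sum(normal_pdf(d, x - x_ref, math.hypot(u, u_ref))
               for x, u in others) / len(others)

def apd_pdf(d, labs):
    """All-pairs-difference distribution: average of the per-lab curves."""
    return sum(lab_difference_pdf(d, ref, labs) for ref in labs) / len(labs)

def integrate(f, lo, hi, steps):
    """Trapezoid rule, used to check the normalization."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

area = integrate(lambda d: apd_pdf(d, LABS), -80.0, 80.0, 3200)
print(f"area under APD: {area:.4f}")
print(f"APD at +7.5 mW: {apd_pdf(7.5, LABS):.5f}, at -7.5 mW: {apd_pdf(-7.5, LABS):.5f}")
```

The symmetry follows from every pair being counted twice, once as X_j − X_i and once as X_i − X_j.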
[Figure: per-laboratory pair-difference distributions (X_j − PTB, X_j − NIST, X_j − NPL, X_j − CSIRO, X_j − NIM) versus measurement difference (mW), −40 to 40 mW, and their sum, the APD distribution, with median ± MAD marked]
Reduced Chi-Squared Testing
• Normalizing the pair differences by the pair uncertainties allows us to build tests of the measurement capability claims
• This is still independent of any chosen reference value
• This All Pairs Difference reduced χ² has N−1 degrees of freedom
• If a data set fails the APD χ² test, it will fail for every possible KCRV

$$\chi_j^2 = (N-1)^{-1} \sum_{i=1,\, i \neq j}^{N} (x_i - x_j)^2 \left(u_i^2 + u_j^2 - 2 r_{ij} u_i u_j\right)^{-1}$$

$$\chi_r^2 = N^{-1} \sum_{j=1}^{N} \chi_j^2$$

Here r_ij is the correlation coefficient between the results of labs i and j.

                  PTB          NIST         NPL          CSIRO        NIM          APD
χ²                3.57         5.78         3.25         6.65         8.57         5.57
Pr(χ² > χ²obs)    <5.8×10^-2   <1.6×10^-2   <7.1×10^-2   <9.9×10^-3   <3.4×10^-3   =1.8×10^-4
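The pair statistics can be evaluated directly from the CCAUV.U-K1 values quoted earlier in the presentation. The sketch below assumes uncorrelated labs (r_ij = 0) and ignores the finite degrees of freedom. The tail probability for the APD statistic uses the closed-form chi-squared survival function for an even number of degrees of freedom (here N − 1 = 4). The per-laboratory bound computed with one degree of freedom is an inference about how such an upper bound might be obtained, not something stated in the presentation.

```python
import math

# (value in mW, standard uncertainty in mW) for CCAUV.U-K1.
LABS = {
    "PTB": (97.4, 0.84),
    "NIST": (99.0, 0.64),
    "NPL": (97.6, 1.01),
    "CSIRO": (114.5, 6.75),
    "NIM": (94.0, 1.16),
}

def pair_chi2(j, labs, r=0.0):
    """Reduced chi-squared of lab j against every other lab (common correlation r)."""
    xj, uj = labs[j]
    terms = [(xi - xj) ** 2 / (ui ** 2 + uj ** 2 - 2.0 * r * ui * uj)
             for name, (xi, ui) in labs.items() if name != j]
    return sum(terms) / (len(labs) - 1)

def apd_chi2(labs):
    """All-pairs-difference reduced chi-squared: average of the per-lab values."""
    return sum(pair_chi2(j, labs) for j in labs) / len(labs)

def chi2_sf_even_dof(x, dof):
    """Pr(chi-squared with `dof` degrees of freedom > x), valid for even dof only."""
    if dof % 2:
        raise ValueError("closed form used here requires an even dof")
    half = x / 2.0
    term = total = 1.0
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

per_lab = {j: pair_chi2(j, LABS) for j in LABS}
apd = apd_chi2(LABS)
dof = len(LABS) - 1
p_apd = chi2_sf_even_dof(dof * apd, dof)

for name, value in per_lab.items():
    # One-degree-of-freedom tail, shown only as a possible per-lab upper bound.
    bound = math.erfc(math.sqrt(value / 2.0))
    print(f"{name:5s} chi2 = {value:.2f}   1-dof tail = {bound:.1e}")
print(f"APD   chi2 = {apd:.2f}   Pr(chi2 > obs) = {p_apd:.1e}")
```

Under these assumptions the per-laboratory and APD values agree with the table above to within rounding.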
Conclusions
• Monte Carlo technique is fast and simple to implement
• Graphs provide a powerful tool for visual consideration of:
– Pooled data (sum distribution)
– Simple Estimators (mean, weighted mean, median)
– Other Estimators (any algorithm can be used)
• All-pairs reduced chi-squared statistic is egalitarian over participants, and independent of choice of KCRV
• No single choice of KCRV can adequately represent a comparison that fails the all-pairs-difference chi-squared test