Looking at Analytical Data


Brian LISTER

British Geological Survey, Geochemistry Directorate, London WC1X 8NG, United Kingdom.

Using a statistical package assembled primarily for the evaluation of ore standards produced by BGS, the opportunity has been taken to re-examine not only data on earlier BGS reference materials but also data published by other organizations. It is concluded that, often due to the difficulties of processing large numbers of results, data are not always looked at as closely as they should be. This can cause incorrect evaluations to be made. Other problems encountered in the assessment of inter-laboratory data are discussed. It is suggested that one of the most effective ways of examining replicate data is by plotting them sequentially as S-distribution curves.

The writer has recently completed the evaluation of four base metal concentrates as reference materials. This task was made easier by using a data processing program assembled for the NERC Honeywell 66/DPS-300 computer. This orders the data and then calculates various statistical parameters including some robust estimates not usually found in statistical packages. Data may be eliminated using a choice of criteria and parameters recalculated. At any stage, the data may be plotted sequentially as S-distribution curves. It is undoubtedly a great advance from our first survey of ore analysis in 1968 (1), where an electro-mechanical calculator was used.

The statistical package enabled the 5000 data from 27 laboratories to be examined and re-examined in detail. The plots of the data, in particular, were invaluable because any anomalous distribution shapes were visible immediately. Outliers could thus be identified and the validity of their rejection confirmed mathematically. To adapt a well-known saying, "a picture is worth a thousand numbers" and it is the great value of the distribution curves that has occasioned the title of this paper.

The comparative ease with which data could be examined by this method led to a retrospective look at data sets from previous work initiated by this laboratory and then to look at data published by other workers in the field. The conclusion is that we do not always make the best use of our data, sometimes because inappropriate methods are used, sometimes due to the difficulties of dealing with large numbers of data. The latter often results in the processing of data mechanically without looking at them. A computer enables the tedium to be removed from data processing but it then becomes even more important that the data should be examined carefully. Two comments made in previous papers bear repetition: "In interpreting data..., a rigidly mathematical approach on the one hand or an empirical approach on the other..., will not yield the best results. Methods need to vary depending on the number and quality of the data" (2). "Finally, in the assessment of analytical data, there is one technique that should never be overlooked - an open mind, a fresh look at the data and commonsense" (3).

It must be emphasized that the purpose of this paper is not to criticise our own previous work or that of others but to point some of the ways to a more searching appraisal of data. Methods are continually evolving and, although those that were used 20 years ago may still be satisfactory when applied to good data, they may prove very fallible when used with data of lesser quality. There is no definitive way of evaluating analytical data but the appearance of the data, when plotted, may well suggest the best approach. That is the main theme of this paper.

DATA EVALUATION

We can now look at some data and see how their evaluation might be improved, starting with some generated by this Institute and continuing with data from other laboratories.

Geostandards Newsletter, Vol. 9, No. 2, October 1985, pp. 263-273

Note that in the figures below, the abscissa represents concentration of element and the ordinate, number of data. The three vertical lines show the mean and one standard deviation either side of it. Different symbols have been used for some methods of analysis, but as these are not always consistent, no key is given here. In the tables, n, x̄ and s are the number of results, mean and standard deviation; √b1 and b2 are the skewness and kurtosis which, in a normal distribution, should be 0 and 3, respectively; M is the median and GM, the Gastwirth median (4), 12A is Hampel's M-estimate (5), 25% is the 25% trimmed mean (5) and DCM is the dominant cluster mode (6).
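The construction of such a plot can be sketched as follows. This is a minimal illustration only, not the program used by the writer: the function names and the replicate values are hypothetical, and the sample standard deviation is assumed for the two outer vertical lines.

```python
from statistics import mean, stdev


def s_curve(values):
    """Return (concentration, count) points for a sequential plot.

    The data are only ordered, never binned: the abscissa is the
    concentration and the ordinate is the running number of data.
    """
    ordered = sorted(values)
    return [(x, rank) for rank, x in enumerate(ordered, start=1)]


def reference_lines(values):
    """Abscissa positions of the three vertical lines: mean - s, mean, mean + s."""
    m, s = mean(values), stdev(values)  # sample standard deviation (assumption)
    return (m - s, m, m + s)


# Hypothetical replicate results in %, not taken from any table in this paper.
data = [4.41, 4.38, 4.35, 4.40, 3.10, 4.39, 4.42, 4.37]
points = s_curve(data)
lines = reference_lines(data)
```

Passing `points` to any plotting routine gives the S-shaped curve; a single low result such as the 3.10 above appears as an isolated point at the foot of the curve.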

Fluorine in IGS 39

In the original data processing (7), as the skewness and kurtosis were well within the limits for a normal distribution, no data were eliminated. However, as Figure 1 shows clearly, there are outliers visible at both top and bottom, fourteen in all. If these are eliminated, the distribution is reasonably symmetrical if somewhat platykurtic. Robust estimates, using the original 52 data, suggest a higher value than the accepted mean of 46.69%. After elimination (see Table 1), this becomes even more clear-cut and a value of 46.85% would now be suggested.
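The skewness and kurtosis referred to here can be computed from the central moments. The sketch below assumes the usual moment definitions, √b1 = m3 / m2^1.5 and b2 = m4 / m2^2; the paper does not state which formula the package used, and the data shown are hypothetical.

```python
def shape_statistics(values):
    """Return (skewness, kurtosis) as sqrt(b1) and b2 from central moments."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n
    m3 = sum((x - centre) ** 3 for x in values) / n
    m4 = sum((x - centre) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical data: a symmetrical set, then the same set with one high result.
symmetrical = [1.0, 2.0, 3.0, 4.0, 5.0]
with_high_result = symmetrical + [15.0]

skew_sym, kurt_sym = shape_statistics(symmetrical)
skew_high, kurt_high = shape_statistics(with_high_result)
```

For a normal distribution the two values approach 0 and 3. As the fluorine example shows, values within the accepted limits do not guarantee the absence of outliers when these occur at both ends of the distribution.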

Table 1. IGS 39, F%

n       52      38
x̄       46.69   46.82
s       0.64    0.25
√b1     0.19    -0.34
b2      3.28    2.34
M       46.83   46.86
GM      46.77   46.86
12A     46.77   46.85
25%     46.77   46.85
DCM     46.69   46.84

Figure 1. IGS 39, Fluorine %

Tantalum in IGS 33

There were originally 53 results by XRF and gravimetry (7) (Figure 2). The distribution became approximately normal by removing the three lowest data but it still remained negatively skewed. The mean increased from 4.19 to 4.29%, the median remaining the same at 4.37% (see Table 2). This was the one occasion in the evaluation of the 20 reference materials IGS 20-39 where the mean and median differed so much that the median was preferred as the best estimate of a true value. However, it now appears that the DCM at 4.40% would have been an even better choice. If the most obvious discrepant results are removed (5 at the top and thirteen at the bottom), the distribution is very nearly normal and the median becomes 4.39, the DCM remaining at 4.40%. The robustness of Hampel's estimate is well illustrated since, in reducing the data from 53 to 35, it only changes 0.02. The use of some form of mode for skewed data is illustrated by the DCM which gives a good estimate of a true value without removing any data. The almost vertical portion of the distribution curve suggests that the true value could be slightly higher than 4.40%.
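The behaviour of the simpler robust estimators on negatively skewed data can be illustrated with a short sketch. The data are hypothetical and are not the IGS 33 results. The Gastwirth median is taken here as 0.3 x lower tertile + 0.4 x median + 0.3 x upper tertile, with tertiles from the Python standard library; the interpolation rule is an assumption. Hampel's 12A estimate and the DCM are not attempted.

```python
from statistics import mean, median, quantiles


def trimmed_mean(values, proportion=0.25):
    """Mean after dropping the given proportion of data from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of results dropped per end
    return mean(ordered[k:len(ordered) - k])


def gastwirth_median(values):
    """Weighted combination of the tertiles and the median (weights 0.3, 0.4, 0.3)."""
    lower, upper = quantiles(values, n=3)  # the two tertile cut points
    return 0.3 * lower + 0.4 * median(values) + 0.3 * upper


# Hypothetical results in %: a tight main group with two low stragglers.
data = [2.90, 3.40, 4.36, 4.37, 4.38, 4.39, 4.40, 4.40, 4.41, 4.42, 4.43, 4.44]

estimates = {
    "mean": mean(data),
    "median": median(data),
    "gastwirth": gastwirth_median(data),
    "trimmed_25": trimmed_mean(data),
}
```

In this contrived set the mean is pulled well below the main group by the two low results, while the median, Gastwirth median and trimmed mean all stay within it, without any data having been removed.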

Figure 2. IGS 33, Tantalum %

Table 2. IGS 33, Ta%

n       53      50      35
x̄       4.19    4.29    4.37
s       0.58    0.42    0.11
√b1     -1.22   -0.33   0.11
b2      5.05    4.02    2.95
M       4.37    4.37    4.39
GM      4.34    4.34    4.38
12A     4.37    4.37    4.39
25%     4.33    4.35    4.38
DCM     4.40    4.40    4.40

Barium in IGS 38

There were originally 48 data with a mean of 52.00 and a median of 51.35%. Four high outliers are obvious (Figure 3) and were eliminated at the time. However, it would be preferable to remove the four low data, also. Agreement between different estimators is still not entirely satisfactory but 51.54% is probably the best choice (see Table 3).

Figure 3. IGS 38, Barium %

Table 3. IGS 38, Ba%

n       48      44      40
x̄       52.00   51.46   51.65
s       2.04    0.96    0.78
√b1     1.96    -0.20   0.36
b2      6.67    2.55    1.96
M       51.35   51.26   51.35
GM      51.57   51.44   51.52
12A     51.48   51.47   51.55
25%     51.58   51.44   51.55
DCM     51.20   51.20   51.20

Copper in MP-1a

These data (8) show very well that, with a good series of analyses, evaluation poses few problems. Because outliers occur more or less symmetrically (see Figure 4), the different estimators of a true value are in good agreement. The high skewness and kurtosis, however, show that the distribution is not normal (Table 4). Four low and eight high outliers can be identified and eliminated from the original 135 data. Estimates change but little and agree with that made by the originators of the material.

Figure 4. MP-1a, Copper %

Table 4. MP-1a, Cu%

n       135             122
Range   1.35 - 1.65     1.37 - 1.49
x̄       1.41            1.44
s       0.047           0.019
√b1     2.57            0.17
b2      13.30           2.74
M       1.44            1.44
GM      1.44            1.44
12A     1.44            1.44
25%     1.44            1.44
DCM     1.44            1.44

That different methods of copper analysis can yield different results for the same sample has been noted more than once (2, 9). Here, the bulk of the data are by AAS and give an estimate of 1.44%, agreeing with the overall value. The titrimetric data are slightly lower at about 1.43. There are only ten results by electrogravimetry from two laboratories but the lowest of these is 1.44%, so that this method gives a high value. As the data by titrimetry and electrolysis are comparatively few and the differences slight, not too much significance should be attached to these findings, but the trend is there.

Tungsten in MP-1a

There are 60 results for tungsten in the base metal ore, MP-1a, ranging from 0.029 to 0.053%. The originators of this material have given only a provisional value for this element because of the lack of consensus. Figure 5 shows this very clearly. The cause could be inhomogeneity for this element or poor quality analysis but even a provisional value is optimistic.

Figure 5. MP-1a, Tungsten %

Total iron as Fe2O3 in FeR-1

This iron formation sample has been evaluated at 75.06% by its originators (10), using the "select laboratories" method of Abbey (11). This assesses the number of good, fair and poor results for each participating laboratory and, from these, calculates a rating for each laboratory. Only results from laboratories whose ratings are above a specified level are used in the evaluation. This method appears to work very well in practice and has the great merit that the data are studied in detail rather than blindly submitted to some rigid mathematical process. There are two main criticisms. First, because of the nature of the rejection procedures, it is difficult to construct confidence intervals. Secondly, if a laboratory has returned a mixture of good and bad data, depending on the element, method or analyst, it seems drastic to reject all the data. It is often found that very capable laboratories fail occasionally when they determine an unfamiliar element or use an unusual technique.

Figure 6 shows one very definite outlier in the 43 results, an additional three low and six high outliers and possibly a further two high ones. The effects of rejecting these are shown in Table 5. They also show that assessment is not always simple even with comparatively high quality data. The writer would remove the four low and eight high data and suggest an estimate of 75.74%. Note again the consistency of Hampel's 12A estimate compared with all but the DCM.

Total iron as Fe2O3 in IF-G

There is a total of 79 results for which the compiler (12) has given a recommended value of 55.85%. This should probably be higher - even with the original data, the median is 55.94 and 12A, 55.96%. Figure 7 shows that there are three very clear low outliers and one high, but the elimination of a further six low results is justified, giving an almost perfect normal distribution. An estimate of 56.00% is suggested (see Table 6).

Figure 6. FeR-1, Total iron as Fe2O3 %

Table 5. FeR-1, total Fe2O3%

n       43      42      33      31
x̄       75.88   76.24   75.82   75.67
s       3.09    2.03    0.87    0.65
√b1     -2.43   0.80    0.75    -0.39
b2      14.56   3.45    4.49    3.40
M       75.80   75.83   75.74   75.72
GM      75.77   75.82   75.74   75.71
12A     75.73   75.73   75.74   75.74
25%     75.83   75.85   75.76   75.73
DCM     75.74   75.74   75.74   75.74

Figure 7. IF-G, Total iron as Fe2O3 %

Table 6. IF-G, total Fe2O3%

n       79      75      69
x̄       55.70   55.??   56.02
s       0.99    0.77    0.57
√b1     -1.22   -0.61   -0.05
b2      5.89    3.19    2.87
M       55.94   55.95   55.98
GM      55.93   55.95   55.99
12A     55.96   55.96   56.01
25%     55.92   55.94   56.00
DCM     55.94   55.94   55.94

Analytical methods may be assessed separately. The 20 XRF data (Figure 8) have one clear outlier. The 20 AAS results (Figure 9) have one obvious low outlier or, alternatively, four low and one high. The 16 titrimetric data have certainly one and probably two low outliers (Figure 10). As other methods have been used by only a few laboratories, they have not been examined but the data for the three main methods of analysis are summarized in Table 7. XRF apparently gives somewhat higher values and AAS somewhat lower, but there seems little doubt that 56%, or perhaps a little higher, is a reasonable estimate.

Manganese in SOIL-5

Figure 11 shows 55 data for manganese which have been given a provisional value of 852 ppm after eliminating one low and four high outliers (13). The writer has always advocated plotting data as distribution curves in preference to histograms or any other method of visual representation because results are not processed in any way, merely ordered. In the construction of histograms, for example, arbitrary intervals must be chosen and shapes can vary considerably by altering those intervals. However, Figure 11 shows that even distribution curves have their dangers. There is one gross outlier so that the variation between the remaining data is compressed, making them appear much better than they actually are. Figure 12 eliminates the single high result and, immediately, variations between the remaining 54 data become much clearer. Figure 13 shows the 50 data used by the original compiler. Since the skewness is outside the 5% limit for 50 data and since the lowest result seems an obvious outlier, it has been removed. This would give an estimate of a true value of the order of 880 ppm (Table 8). Looking at Figure 13 again, there are strong arguments for thinking that all the data below 850 ppm are suspect and that the continuous distribution starting about 890 ppm represents the most reliable results. The DCM, at 922 ppm, supports this view.
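Both points made above, the dependence of a histogram on the chosen intervals and the compression caused by a gross outlier, can be demonstrated with a small sketch. The values are hypothetical and are not the SOIL-5 results; the helper name `histogram` is an illustrative choice.

```python
def histogram(values, start, width):
    """Count data per class interval; keys are the lower class boundaries."""
    counts = {}
    for x in values:
        lower = start + width * int((x - start) // width)
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical results in ppm with one gross high outlier.
bulk = [812, 845, 860, 878, 885, 892, 905, 915, 922, 930, 948]
gross_outlier = 5400

# Same data, same interval width, different starting point: different shapes.
shape_a = histogram(bulk, start=800, width=50)
shape_b = histogram(bulk, start=825, width=50)

# Share of the plotted axis occupied by the bulk once the outlier is included.
bulk_fraction = (max(bulk) - min(bulk)) / (gross_outlier - min(bulk))
```

Shifting the class boundaries by 25 ppm changes a three-class rising histogram into a four-class peaked one, whereas the ordered data themselves are unaffected. With the outlier retained, the eleven remaining results occupy only about 3% of the axis, which is why they appear to agree so well.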

Figure 8. IF-G, Total iron %, XRF results

Figure 9. IF-G, Total iron %, AAS results

Figure 10. IF-G, Total iron %, Titrimetric results

Table 7. IF-G, main methods of analysis

Method  XRF     XRF     AAS     AAS     AAS     Vol     Vol     Vol
n       20      19      20      19      15      15      15      14
x̄       55.98   56.18   55.85   55.03   56.10   55.79   55.91   55.99
s       1.03    0.50    1.23    0.95    0.63    0.59    0.54    0.43
√b1     -2.72   -0.23   -0.82   0.23    0.38    -1.13   -0.87   -0.62
b2      11.1    3.27    4.59    3.15    2.45    3.56    3.09    2.87
M       56.14   56.15   55.86   55.88   55.97   56.02   56.04   56.05
GM      56.15   56.17   55.90   55.96   56.00   55.96   56.01   55.06
12A     56.19   56.19   55.97   55.99   56.02   55.99   56.01   55.04
25%     55.15   56.19   55.92   55.99   56.02   55.97   55.99   56.03
DCM     56.07   56.07   55.83   55.83   55.78   56.02   56.02   56.02

Figure 11. SOIL-5, Manganese ppm, all results

Figure 12. SOIL-5, Manganese ppm, minus one high result

Figure 13. SOIL-5, Manganese ppm, minus four high results, one low

Table 8. SOIL-5, Mn ppm

Calcium in SOIL-5

Table 9 shows the results and methods used by 12 laboratories for calcium. The range can be seen best in Figure 14. The compilers of the data give a "for information only" mean of 2.20% with a standard error of ± 0.28. However, there are results by five laboratories using three different methods within the range 2.34-2.52%. The probability of the true value being within this range is extremely high and the median of 2.4975% would have been a much better choice as an estimator. This figure lies outside the standard error of the mean. A value of 2.50% could be safely recommended.
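The figures quoted for calcium can be checked directly from the twelve results in Table 9. The sketch below uses only the Python standard library; the variable names are illustrative, and the standard error is assumed to be the sample standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import mean, median, stdev

# The twelve SOIL-5 calcium results (%) as listed in Table 9.
calcium = [0.248, 0.983, 1.303, 1.630, 2.340, 2.495,
           2.500, 2.500, 2.520, 2.975, 3.300, 3.600]

ca_mean = mean(calcium)
ca_standard_error = stdev(calcium) / sqrt(len(calcium))
ca_median = median(calcium)

# Results falling in the range picked out in the text.
main_group = [x for x in calcium if 2.34 <= x <= 2.52]

# Does the median lie further from the mean than one standard error?
median_outside = abs(ca_median - ca_mean) > ca_standard_error
```

Under these assumptions the mean and standard error reproduce the compilers' 2.20 and 0.28, the median is 2.4975, five results fall in the 2.34-2.52% group, and the median does lie outside the mean plus or minus one standard error.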

Neodymium in SOIL-5

The lowest and highest values in Table 10 have been eliminated and a mean of 29.9 ppm has been recommended with qualified confidence only. However, with six laboratories in excellent agreement, the data should be very reliable. Since all six laboratories used neutron activation analysis, there is the possibility of a method bias. Otherwise, the value is unlikely to be wrong.

Table 9. SOIL-5, Ca%

Method  Result
NAA     0.248
XRF     0.983
XRF     1.303
XRF     1.630
NAA     2.340
XRF     2.495
NAA     2.500
NAA     2.500
AAS     2.520
NAA     2.975
XRF     3.300
AAS     3.600

Figure 14. SOIL-5, Calcium %

Table 10. SOIL-5, Nd ppm

Mercury in SOIL-5

The data from eleven laboratories are given in Table 11. The original compilers have eliminated the lowest and highest to give a mean of 0.79 ppm. In Figure 15, since the highest result is a gross outlier, it has been eliminated so as not to distort the appearance of the rest - as described above - the lowest result being retained. There is, apparently, one group of data around 0.55 ppm and another above 0.9 ppm and any attempt to evaluate them in terms of a true value would be misleading. There is no ready explanation for the two groups of data but it is important to note their existence and not to give a mean that falls between them.

Figure 15. SOIL-5, Mercury ppm

Nickel in SY-3

The 31 data in Table 12 are taken from a valuable compilation by Abbey (11), who also employs them as an example in the use of different methods of evaluation. Inspection of these data suggests that the value of 125 ppm is a gross outlier but both Figure 16, before its elimination, and Figure 17, afterwards, show a distribution which does not appear normal. It is positively skewed and the shape is tending towards that of the lognormal distribution in which the shape is normal if logarithms of the data are plotted. This can be characteristic of some trace-element analysis due to inhomogeneity - one sub-sample may have one mineral grain containing the element, another, three, and so on, giving a wide spread with little agreement. If the logarithms of the data are plotted, with or without the high value of 125 ppm, as in Figures 18 and 19, there is some improvement to the shape, particularly in the latter case, though it is still some way from the characteristic elongated S-shape of a normal distribution. Skewness and kurtosis are much improved, however.
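The effect of taking logarithms on skewness and kurtosis can be sketched as follows. The fourteen values are hypothetical, chosen only to be positively skewed with one very high result, and are not the SY-3 data of Table 12. The moment definitions of skewness and kurtosis are assumed.

```python
from math import log10


def shape_statistics(values):
    """Return (skewness, kurtosis) computed from central moments."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n
    m3 = sum((x - centre) ** 3 for x in values) / n
    m4 = sum((x - centre) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical trace-element results in ppm, strongly positively skewed.
raw = [5, 6, 7, 7, 8, 9, 10, 11, 12, 14, 18, 25, 40, 125]
logged = [log10(x) for x in raw]

raw_skew, raw_kurt = shape_statistics(raw)
log_skew, log_kurt = shape_statistics(logged)
```

For data of this kind both statistics move towards the normal values of 0 and 3 after transformation, although, as with the nickel results, they need not reach them.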

Table 12. SY-3, Ni ppm

Figure 16. SY-3, Nickel ppm, all results

Whether the 125 ppm result is spurious or a valid analysis of that particular sub-sample, it is difficult to say. In a normal distribution it would be anomalous; in a lognormal distribution it would be nothing out of the ordinary. The arithmetic mean is 16 but most other estimates give a value of 11 ppm or less, the DCM at 7.9 being the lowest. An unbiased efficient estimate of the lognormal mean (4) is about 14.5 or, eliminating the single high result, 12.2 ppm. The situation is indeed confusing and there is probably no ready explanation. If these were geochemical exploration sample analyses, then the average nickel content of the rock would probably be about 14 ppm but there is too much variation for the material to be suitable as a trace metal standard.

Figure 17. SY-3, Nickel ppm, one result omitted

Figure 18. SY-3, Nickel ppm, logarithms of results

DISCUSSION

The above examples show that we do not always examine our data as carefully as we should. Sometimes, re-appraisal confirms the original interpretation, sometimes it merely adds confusion to an already ambiguous situation. More than once, however, re-appraisal has substituted a different value or rejected any evaluation as inappropriate.

Figure 19. SY-3, Nickel, logarithms of results, one omitted

The writer makes no claim that the methods that have been used are necessarily the best but rather that the data have been studied more closely than may sometimes happen. It is understandable that workers sometimes process large numbers of data mechanically but this is a disservice to the analysts who have provided them. Two results of 5 and 10% may be processed to give a mean of 7.5 and a standard deviation of 3.5 but this has very little meaning within the context of a reference material.

Non-normal distributions

As a general rule, data obtained in an inter-laboratory certification programme as well as replicate analyses within one laboratory will follow an approximately normal distribution. Trace elements, however, may tend towards the lognormal, as is the case with nickel in SY-3. This is likely if the element in question is present in sporadic discrete grains - as in alluvial gold. Under these circumstances, agreement between determinations is unlikely unless large sub-samples are taken for analysis. If a trace element is diffused in the crystal lattice of a mineral, this problem is likely to be less pronounced.

Another distribution that may be met in the evaluation of data is the negative lognormal, which is negatively skewed. According to Koch and Link (10), this may occur when observations are near to an upper limit set by some natural process, for example, iron in a hematite or lead in a lead concentrate. Presumably, under these circumstances, where concentration is near an upper limit, there are greater numbers of low than high data, causing the negative skewness. Although the writer has not encountered this distribution, it is as well to be aware that it can occur. Iron in IF-G (Figure 7), which is negatively skewed, is certainly tending this way.

Figure 20 illustrates the shape of a typical lognormal distribution. It represents nine hypothetical data varying between 1 and 54 ppm. If the figure is turned through 180°, it then represents a negative lognormal distribution of nine data from 55.54 to 60.90%.

Figure 20. Lognormal and negative lognormal distributions

Rounding of data

Figure 4 is a good illustration of another difficulty in the evaluation of data. There are groups of determinations at 1.42, 1.43, 1.44% and so on, caused by the rounding up or down of AAS readings. It can only be assumed that there is an equal probability of either occurring and that, for instance, if there are ten results of 0.14 and ten results of 0.15%, then it is probable that the true result lies nearer to 0.145% than to either 0.14 or 0.15. Rounding can also conceal true intra-laboratory variance - two results of 0.143 and 0.136, when rounded, show no variance, whereas 0.143 and 0.146 show increased variance.
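The two rounding effects described above can be reproduced with the figures given in the text. This is a minimal sketch; the helper name `spread_before_and_after` is illustrative, and the sample standard deviation is used as the measure of variation.

```python
from statistics import mean, stdev


def spread_before_and_after(results, places=2):
    """Standard deviation of the results as read and after rounding."""
    rounded = [round(x, places) for x in results]
    return stdev(results), stdev(rounded)


# Both readings round to 0.14, so the rounded pair shows no variation at all.
concealed = spread_before_and_after([0.143, 0.136])

# The readings round to 0.14 and 0.15, so rounding exaggerates the variation.
inflated = spread_before_and_after([0.143, 0.146])

# Ten results of 0.14 and ten of 0.15 average to 0.145.
split_mean = mean([0.14] * 10 + [0.15] * 10)
```

The first pair, which differs by 0.007 before rounding, is reported as identical, while the second pair, which differs by only 0.003, is reported as differing by 0.01.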

Numbers of determinations

The re-evaluations above have been carried out on data of different types. This was not commented upon lest it made the central issue of data appraisal unnecessarily complicated. In some cases, the data consisted of the original analyses as returned by participating laboratories, whereas in others, the results from each laboratory have been averaged to give a laboratory mean. Both methods have their advantages and disadvantages.

It is clear that equal weight will be given to data if numbers of determinations from individual laboratories are the same. In reference material programmes, it is usual for the originators to request participating laboratories to return a specified number of results - BGS, for example, normally requests four. This request is rarely complied with by all laboratories. A commercial laboratory, for instance, will often furnish one averaged result in accordance with the usual practice to clients. Some laboratories, on the other hand, will return a long string of data which may cause problems. If the laboratory is of high repute in the analysis of a particular element, no harm may be done but if the laboratory is less expert or is experimenting with a new technique or an unfamiliar element, it can cause difficulties.

Table 13 shows some hypothetical data from five fictitious laboratories in order to illustrate a number of points. The first four are providing what is apparently the best estimate of a true value but their data are overwhelmed by the ten determinations from Laboratory No 5. This may be corrected by averaging the results from each laboratory or, alternatively, by retaining all data from the first four and partially averaging the data from the fifth so that they total four. In this contrived example, the results from Laboratory No 5 would be rejected by most criteria for the identification of outliers, so that they do not present a serious problem but, if they averaged, say, 10.5%, they would probably not be rejected by any criteria and, unless dealt with in some way, would weight the data towards too high an estimate. This shows again the importance of examining the data and the S-curves in the light of all available information.
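The weighting effect of the long series from Laboratory No 5 can be seen by comparing the mean of all individual results with the mean of the five laboratory means. The sketch below uses the Table 13 values; the variable names are illustrative, and only the first of the two corrections mentioned (averaging each laboratory) is shown.

```python
from statistics import mean

# Hypothetical Cu% results from Table 13, grouped by laboratory.
table_13 = {
    1: [9.80, 10.10, 10.10, 10.00],
    2: [9.90, 9.95, 10.02, 10.13],
    3: [10.00, 10.00, 9.95, 10.05],
    4: [9.98, 9.99, 9.97, 10.06],
    5: [12.0, 12.1, 12.0, 12.2, 12.1, 12.0, 12.0, 12.1, 11.9, 12.2],
}

# Every determination counted once: Laboratory 5 supplies 10 of the 26 values.
all_results = [x for results in table_13.values() for x in results]
mean_of_all_results = mean(all_results)

# Each laboratory counted once, whatever the number of determinations returned.
mean_of_lab_means = mean([mean(results) for results in table_13.values()])
```

Processing the 26 individual results gives a mean of about 10.79%, against about 10.41% when each laboratory carries equal weight. Neither figure is a sensible estimate here, which is the point of examining the data before processing.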

If we now look at the effects of averaging the data from Laboratories Nos 1 to 4, the mean, in all cases, is 10.00%, so that by doing this, we have eliminated all signs of intra-laboratory variation, producing a standard deviation of zero. A simple way of overcoming this problem is to average the individual laboratory standard deviations, which gives 0.0807 - not very different from the deviation of the 16 individual results, 0.0816.
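These figures can be reproduced from the Table 13 values for Laboratories Nos 1 to 4. The sketch assumes the sample (n - 1) standard deviation throughout; the variable names are illustrative.

```python
from statistics import mean, stdev

# Hypothetical Cu% results from Table 13, Laboratories Nos 1 to 4 only.
labs = {
    1: [9.80, 10.10, 10.10, 10.00],
    2: [9.90, 9.95, 10.02, 10.13],
    3: [10.00, 10.00, 9.95, 10.05],
    4: [9.98, 9.99, 9.97, 10.06],
}

# Averaging first: every laboratory mean is 10.00, so their spread vanishes.
lab_means = [mean(results) for results in labs.values()]
sd_of_lab_means = stdev(lab_means)

# Average of the four within-laboratory standard deviations.
mean_within_lab_sd = mean([stdev(results) for results in labs.values()])

# Standard deviation of the individual results taken together.
individual = [x for results in labs.values() for x in results]
sd_of_individual = stdev(individual)
```

With these assumptions the averaged within-laboratory deviation rounds to 0.0807 and the deviation of the individual results to 0.0816, while the deviation of the four laboratory means is zero to within floating-point error.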

Different organisations tackle this problem in different ways. Some average results from each laboratory and take a standard deviation of the averaged data, while others process each individual result. Another method that may be used (8, 15) is an analysis of variance technique to separate between- and within-laboratory variation. The writer makes use of each individual result in the processing of data, believing that to average data before processing is, to some extent, to suppress information. Nevertheless, should any laboratory provide long series of data, there will be a check to ensure that they do not result in any wrong conclusions. It is unlikely that any evaluation will encounter data as anomalous as those in Table 13 but it is always wise to check. It seems inevitable that, if the originator of a reference material receives, say, six results for ten elements from each of 80 laboratories, some simplification of the vast amounts of data will be necessary.
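As an illustration of the analysis of variance approach, the sketch below applies a textbook one-way analysis of variance, with a method-of-moments estimate of the between-laboratory variance, to the Table 13 data. It is not the specific procedure of references (8) or (15); the function name and the clamping of negative estimates to zero are assumptions made for the example.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (MS within, MS between, between-laboratory variance estimate).

    Requires at least two laboratories and more results than laboratories.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total
    means = [mean(g) for g in groups]

    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

    ms_within = ss_within / (total - k)
    ms_between = ss_between / (k - 1)

    # Effective number of replicates when laboratories report unequal numbers.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)

    # A negative estimate is treated as zero between-laboratory variance.
    var_between = max(0.0, (ms_between - ms_within) / n0)
    return ms_within, ms_between, var_between


# Table 13 (Cu %), all five laboratories.
table_13 = [
    [9.80, 10.10, 10.10, 10.00],
    [9.9, 9.95, 10.02, 10.13],
    [10.0, 10.0, 9.95, 10.05],
    [9.98, 9.99, 9.97, 10.06],
    [12.0, 12.1, 12.0, 12.2, 12.1, 12.0, 12.0, 12.1, 11.9, 12.2],
]

ms_w, ms_b, var_b = one_way_anova(table_13)
ms_w4, ms_b4, var_b4 = one_way_anova(table_13[:4])
print(f"five labs: MS within {ms_w:.4f}, between-lab variance {var_b:.2f}")
print(f"labs 1-4:  MS within {ms_w4:.4f}, between-lab variance {var_b4:.2f}")
```

For all five laboratories the between-laboratory variance is about 1.3, entirely because of Laboratory No. 5; for Laboratories Nos 1 to 4 alone it is zero, since their means coincide. The within-laboratory mean square is retained in both cases, which is the information lost when only laboratory averages are processed.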

Table 13. Cu %

Lab. No. 1   Lab. No. 2   Lab. No. 3   Lab. No. 4   Lab. No. 5

 9.80         9.9         10.0          9.98        12.0   12.0

10.10         9.95        10.0          9.99        12.1   12.0

10.10        10.02         9.95         9.97        12.0   12.1

10.00        10.13        10.05        10.06        12.2   11.9

                                                    12.1   12.2

FINAL COMMENTS

S-distribution curves, first brought to the writer's notice in the assessment of the rock standards G-1 and W-1 (16), are a simple and effective means of looking at data before and during processing. Their value lies in the fact that data are depicted in an entirely unaffected manner - no cumulative percentages, no class intervals to be chosen - merely the data, plotted sequentially from the lowest to the highest. They are valuable whatever method of data assessment is finally chosen but especially so when there are large quantities of multi-element data and it becomes inevitable that they must be submitted to some "number crunching" program in which important anomalies might be lost to view.
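The construction of such a curve needs nothing more than a sort. The sketch below is an illustrative Python version, not the program used by the writer: it orders the Table 13 data and draws a crude character plot, one row per datum, so that no plotting library is assumed. The function names and the plot width are hypothetical.

```python
def s_curve_points(values):
    """Return (sequence number, value) pairs, lowest value first."""
    return list(enumerate(sorted(values), start=1))


def render_s_curve(values, width=50):
    """Draw one row per datum; the position of '*' is proportional to the value."""
    points = s_curve_points(values)
    if not points:
        return ""
    lo, hi = points[0][1], points[-1][1]
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    rows = []
    for seq, value in points:
        col = round((value - lo) / span * (width - 1))
        rows.append(f"{seq:3d} {value:7.2f} |" + " " * col + "*")
    return "\n".join(rows)


# Table 13 (Cu %), all 26 results from the five laboratories.
table_13 = [
    9.80, 10.10, 10.10, 10.00,
    9.9, 9.95, 10.02, 10.13,
    10.0, 10.0, 9.95, 10.05,
    9.98, 9.99, 9.97, 10.06,
    12.0, 12.1, 12.0, 12.2, 12.1, 12.0, 12.0, 12.1, 11.9, 12.2,
]

print(render_s_curve(table_13))
```

Plotted in this way, the sixteen results near 10% and the ten results near 12% appear as two separate groups with a wide gap between them, which is the kind of anomalous shape that is easily lost in a summary statistic.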

ACKNOWLEDGEMENTS

It is important that inter-laboratory data obtained in reference material programmes are easily accessible to others and the writer is grateful to journals such as "Geostandards Newsletter" that this is so. This paper is published with the approval of the Director, British Geological Survey (NERC).

RESUME

The reference ore samples prepared by the British Geological Survey have been re-evaluated with the aid of a newly available set of statistical programs. On this occasion, a number of other reference samples prepared by other organisations were also re-examined. The following conclusion is reached: the difficulty of handling a large number of data prevents a thorough examination of the data, and this can lead to an incorrect evaluation of the data. Other problems encountered in the evaluation of inter-laboratory results are discussed. It is suggested that one of the effective methods of examining compiled data is to construct an S-distribution curve by arranging them sequentially.


REFERENCES

(1) B. Lister and M.J. Gallagher (1970) An inter-laboratory survey of the accuracy of ore analysis, Trans. Inst. Min. Metall., Sect. B, 79: B213-237.

(2) B. Lister (1978) The preparation of twenty ore standards, IGS 20-39. Preliminary work and assessment of analytical data, Geostandards Newsletter, 2: 157-186.

(3) B. Lister (1982) Evaluation of analytical data: a practical guide for geoanalysts, Geostandards Newsletter, 6: 175-205.

(4) J. Gastwirth (1966) On robust procedures, J. Am. Stat. Assoc., 61: 929-948.

(5) D.F. Andrews et al. (1972) Robust estimates of location: survey and advances, Princeton University Press, 373p.

(6) P.J. Ellis, I. Copelowitz and T.W. Steele (1977) Estimation of the mode by the dominant cluster method, Geostandards Newsletter, 1: 123-130.

(7) B. Lister (1977) Second inter-laboratory survey of the accuracy of ore analysis, Trans. Inst. Min. Metall., Sect. B, 86: B133-148.

(8) H.F. Steger and W.S. Bowman (1982) MP-1a: A certified reference ore, Canmet Report 82-14E, 33p.

(9) B. Lister and J. van der Linden (In press) The preparation of Bougainville copper concentrate as reference material. IGS 45.

(10) S. Abbey, C.R. McLeod and Wan Liang-Guo (1983) FeR-1, FeR-2, FeR-3 and FeR-4: Four Canadian iron-formation samples prepared for use as reference materials, Geol. Surv. of Canada Paper 83-19, 51p.

(11) S. Abbey (1983) Studies in "standard samples" of silicate rocks and minerals 1969-1982, Geol. Surv. of Canada Paper 83-15, 114p.

(12) K. Govindaraju (1984) Report (1984) on two GIT-IWG geochemical reference samples: albite from Italy, AL-I and iron formation sample from Greenland, IF-G, Geostandards Newsletter, 8: 63-113.

(13) R. Dybczyński, A. Tugsavul and O. Suschny (1979) Soil-5, a new IAEA certified reference material for trace element determinations, Geostandards Newsletter, 3: 61-87.

(14) G.S. Koch and R.F. Link (1970) Statistical analysis of geological data, Vol. I, Wiley, 375p.

(15) J. Mandel and R.C. Paule (1970) Interlaboratory evaluation of a material with unequal numbers of replicates, Analytical Chemistry, 42: 1194-1197.

(16) R.E. Stevens et al. (1960) Second report on a cooperative investigation of the composition of two silicate rocks, Bull. U.S. Geol. Surv., no. 1113, 126p.
