Nuclear Instruments and Methods in Physics Research A 598 (2009) 450–453
Nuclide identification algorithm based on K–L transform and neural networks
Liang Chen, Yi-Xiang Wei
Key Laboratory of Particle & Radiation Imaging (Tsinghua University), Ministry of Education, Department of Engineering Physics, Tsinghua University, China
Article info
Article history:
Received 24 June 2008
Received in revised form
11 September 2008
Accepted 20 September 2008
Available online 14 October 2008
Keywords:
K–L transform
Neural network
Nuclide identification
Linear associative memory
ADALINE
Abstract
Traditional spectrum analysis algorithms based on peak searching have difficulty with complex overlapped peaks, especially under poor resolution and high background conditions. This paper describes a new nuclide identification method based on the Karhunen–Loeve transform (K–L transform) and artificial neural networks. Through the K–L transform and feature extraction, the nuclide gamma spectrum is compacted, and the K–L transform coefficients are used as the neural network's input. The linear associative memory and ADALINE networks are discussed. Extensive experiments and tests showed that the method is reliable and practical, and especially suitable for fast nuclide identification.
© 2008 Elsevier B.V. All rights reserved.
1. Introduction
Artificial neural networks are widely used in pattern recognition. Using artificial neural networks in gamma spectrum analysis was proposed in the early 1990s [1]. Unlike the classical methods, using artificial neural networks does not require searching, fitting and dealing with complex overlapped peaks. The spectrum is considered as a whole and its global shape is compared with stored patterns. Expert knowledge is not required, and human participation is not even necessary after the network is trained properly.
Several papers related to this subject have been published in the years since artificial neural networks were introduced. Some of them used a multilayer linear perceptron network and the back-propagation algorithm [2,3]. Because of its poor fault tolerance, lack of robustness, convergence to local minima and large computational cost, this algorithm is rarely used. Others have used optimal linear associative memory (OLAM) networks [1,2], which are still subject to computation and stability problems because the original data, or part of it, is used as input. Current gamma spectrum analysis programs [4,5] still use the classical method based on peak searching and matching.
This research focuses on a nuclide identification algorithm used in portable radionuclide identification devices (RIDs) that meet the IAEA's requirements [6]. RIDs require the rapid and accurate identification of 27 kinds of radionuclides. HPGe detectors cannot be used in portable devices, so we chose a NaI detector.
The peaks are usually overlapped because of the poor resolution of the NaI detector spectrum, and peak searching is difficult under high background and large statistical fluctuations. We used the Karhunen–Loeve transform (K–L transform) to extract features from the gamma spectrum, followed by a neural network. A large amount of training and test data was obtained using a NaI detector (Hamamatsu Photonics K.K. CH201-03) and a multichannel analyzer (Canberra, DAS-1000). The method has the advantages of speed, accuracy, wide tolerance and good robustness.
2. Feature extraction
Feature extraction is a key step in pattern recognition, and the classical peak search algorithm is a specific kind of feature extraction. The features are used as the neural network's input, so the features of different nuclides should be as different as possible, and the number of features should be as few as possible. Here we used the K–L transform to extract the features.
The K–L transform is a special orthogonal transform, mainly used for compacting 1D and 2D signals. The gamma spectrum can be treated as a wide-sense stationary random vector. The vectors in this study are all column vectors. The covariance matrix is defined as
C_x = E\{(x - \mu_x)(x - \mu_x)^T\} =
\begin{bmatrix}
c_{0,0} & c_{0,1} & \cdots & c_{0,N-1} \\
c_{1,0} & c_{1,1} & \cdots & c_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
c_{N-1,0} & c_{N-1,1} & \cdots & c_{N-1,N-1}
\end{bmatrix}   (1)
where E\{\cdot\} is the expectation operator, \mu_x = E\{x\} is the average vector of the signal x, and the elements of C_x are given by

c_{i,j} = E\{(x(i) - \mu_x(i))(x(j) - \mu_x(j))\} = c_{j,i}   (2)
The eigenvalues and corresponding eigenvectors of C_x are \lambda_0, \lambda_1, \ldots, \lambda_{N-1} and A_0, A_1, \ldots, A_{N-1}. By normalizing the eigenvectors we get a normalized orthogonal matrix A = [A_0, A_1, \ldots, A_{N-1}]. Finally, the K–L transform of the signal x can be expressed as y = A^T x, where y is the K–L transform coefficient vector. If the eigenvalues are sorted in descending order and only the m largest eigenvalues and their corresponding eigenvectors are reserved, we obtain y', the compact version of y. By recovering the original signal from y', we obtain the approximate version of x, x' = A y'. It has been proved that y' preserves the maximum energy of the original signal [7], and the mean square error between x and x', e = E\{\|x - x'\|^2\}, is minimized to the sum of the rejected eigenvalues. The K–L transform removes the correlation of the original signals, preserves the maximum energy and minimizes the mean square error, so it is also called the optimal transform. In our nuclide identification algorithm, the m-dimensional vector y' can be seen as the m features of the gamma spectrum, and used as input for the neural network.
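To make this step concrete, the following is a minimal NumPy sketch of the feature extraction described above; the function name kl_features and the array layout are our own illustrative choices, not from the paper.

```python
import numpy as np

def kl_features(spectra, m):
    """K-L transform feature extraction (illustrative sketch).

    spectra: (N, Q) array, one normalized gamma spectrum per column.
    Returns the feature basis A_m (N x m) and the coefficient
    vectors y' = A_m^T x (m x Q) for the training spectra.
    """
    # Covariance matrix C_x of Eq. (1); np.cov subtracts the mean internally.
    C = np.cov(spectra)
    # C_x is symmetric, so eigh returns real eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]      # sort eigenvalues descending
    A_m = eigvecs[:, order[:m]]            # keep the m largest -> basis A_m
    retained = eigvals[order[:m]].sum() / eigvals.sum()
    print(f"energy retained by {m} coefficients: {retained:.1%}")
    return A_m, A_m.T @ spectra            # y' = A_m^T x
```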
3. Neural network model
3.1. Linear associative memory network
A typical neural network is composed of an input layer, an output layer and sometimes one or more hidden layers. Each layer includes neurons that are connected to all the neurons of the successive layer. There is a weight w_{ij} for each connection between two neurons i and j. The weights are established by a training procedure. The output is calculated by summing all the inputs weighted by the corresponding elements of the weight matrix W, and is then processed with a transfer function such as a linear or sigmoid function.
Fig. 1 shows the linear associative memory, a typical neural network with a linear transfer function and without a hidden layer, which is also the structure used in this paper.
The output of the network is given by
a_i = \mathrm{purelin}\Big(\sum_{j=1}^{N} w_{ij} p_j\Big) = \sum_{j=1}^{N} w_{ij} p_j   (3)
and the matrix form is given by
a = W p   (4)
Fig. 1. Structure of the linear associative memory network.
3.2. Hebb’s rule and its variations
The weight matrix W can be established by various kinds of training procedures, and the Hebb rule is a basic one for a linear associative memory network. According to the supervised Hebb rule [8], the weight matrix W is given as
W = t_1 p_1^T + t_2 p_2^T + \cdots + t_Q p_Q^T = T P^T   (5)
where p_1, p_2, \ldots, p_Q are the input vectors used for training and t_1, t_2, \ldots, t_Q are the corresponding target vectors. As shown in Eqs. (4) and (5), when p_k is input into the network, the output can be computed as

a = W p_k = \Big(\sum_{q=1}^{Q} t_q p_q^T\Big) p_k = \sum_{q=1}^{Q} t_q (p_q^T p_k)   (6)
If the input vectors are orthogonal and normalized, Eq. (6) can be rewritten as a = W p_k = t_k: the output of the network is equal to the target.
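A tiny numeric check of this property, under the assumption of orthonormal input columns (the toy vectors here are ours, not the paper's data):

```python
import numpy as np

# Supervised Hebb rule, Eq. (5): with orthonormal input vectors,
# W = T P^T recalls each target exactly (Eq. (6)).
P = np.eye(4)[:, :3]             # three orthonormal 4-d input vectors (columns)
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])  # one-hot targets (columns t_q)
W = T @ P.T                      # Hebb rule: W = sum_q t_q p_q^T
print(np.allclose(W @ P, T))     # a = W p_k = t_k -> prints True
```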
If the input vectors are not orthogonal, the Hebb rule produces some errors. There are several procedures that can be used to reduce these errors. The core idea of the variations of the Hebb rule is to minimize the difference between the network output and the target vector. The mean square error can be expressed as
F(W) = \sum_{q=1}^{Q} \|t_q - W p_q\|^2   (7)
Usually the row number of P (the number of features) is greater than the column number (the number of training samples), so the solution of the mean square error problem is
W = T P^{+} = T (P^T P)^{-1} P^T   (8)
where P^{+} is the pseudoinverse of the matrix P. Because of this, Eq. (8) is also called the pseudoinverse rule.
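A one-line sketch of the pseudoinverse rule; using np.linalg.pinv in place of the explicit (P^T P)^{-1} P^T is our substitution for numerical stability, not the paper's prescription:

```python
import numpy as np

def pseudoinverse_rule(P, T):
    """Pseudoinverse rule, Eq. (8): W = T P^+ = T (P^T P)^-1 P^T.

    P: (n_features, Q) matrix of training inputs (columns p_q).
    T: (n_outputs, Q) matrix of target vectors (columns t_q).
    """
    # pinv computes the pseudoinverse via SVD, equivalent to
    # (P^T P)^-1 P^T when P has full column rank.
    return T @ np.linalg.pinv(P)
```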
Another solution to the problem of non-orthogonal input vectors is to use the ADALINE network and the least mean square (LMS) algorithm [8]. The ADALINE network is very similar to the linear associative memory, except that it has a bias vector. The output of ADALINE is given as
a = \mathrm{purelin}(W p + b) = W p + b   (9)
Including the bias vector b as a column of the weight matrix W, and the bias input "1" as a component of the input vector, Eq. (9) can be rewritten as a = x z, where x = [W\ b] and z = [p\ 1]^T. The mean square error problem is
F(x) = E[(t - a)^2] = E[(t - x z)^2] = E[t^2] - 2 x E[t z] + x E[z z^T] x^T = c - 2 x h + x R x^T   (10)
The LMS algorithm uses an iterative method to search for the solution x that minimizes F(x). From the steepest descent algorithm with a constant learning rate, the iteration formula is
x_{k+1} = x_k + 2 \alpha e(k) z(k)^T   (11)
where k is the iteration step, \alpha is the learning rate constant and e(k) = t(k) - a(k) is the difference between the desired and actual outputs of the network. This is also referred to as the delta rule or the Widrow–Hoff learning algorithm. The learning rate must satisfy \alpha < 2/\lambda_{max}, where \lambda_{max} is the largest eigenvalue of the matrix R in Eq. (10).
If the input vectors are statistically independent, the expected solution converges to x^* = R^{-1} h. If we set the bias vector b and the bias input "1" to zero, the convergence solution x^* is equivalent to Eq. (8), so the pseudoinverse and delta rules are essentially the same. The advantage of the delta rule is that it can update the weights after each new input pattern is presented,
whereas the pseudoinverse rule computes the weights in one step after all of the input/target pairs are known. The sequential updating allows the ADALINE to adapt to a changing environment.
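A minimal sketch of LMS training for such an ADALINE, assuming the patterns are presented sequentially; the function name and loop structure are illustrative, not from the paper:

```python
import numpy as np

def lms_train(P, T, lr, epochs):
    """Train an ADALINE with the LMS (delta) rule, Eq. (11).

    P: (n_features, Q) training inputs; T: (n_outputs, Q) targets.
    lr must satisfy lr < 2 / lambda_max, where lambda_max is the
    largest eigenvalue of R = E{z z^T} in Eq. (10).
    """
    n_out, n_in = T.shape[0], P.shape[0]
    x = np.zeros((n_out, n_in + 1))        # x = [W b], bias folded in
    for _ in range(epochs):
        for q in range(P.shape[1]):        # sequential (online) updates
            z = np.append(P[:, q], 1.0)    # z = [p; 1]
            e = T[:, q] - x @ z            # e(k) = t(k) - a(k)
            x += 2 * lr * np.outer(e, z)   # x_{k+1} = x_k + 2*lr*e(k)*z(k)^T
    return x[:, :-1], x[:, -1]             # W, b
```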
4. Nuclide identification
4.1. Training the networks
The selection of the training samples and the number of features has a great influence on the network performance. There were eight radioactive sources in our lab: 241Am, 133Ba, 60Co, 137Cs, 152Eu, 226Ra, 232Th and natural uranium (NU). The activities range from 0.3 to 10 mCi. We measured twelve spectra (1024 channels) for each source and divided them into three groups: group 1 included two spectra measured to a fluctuation error of less than 1%, group 2 included five measured for 1 min and group 3 included five measured for 2 min. We chose one from group 1, two from group 2 and two from group 3, used them for the K–L transform and reserved the 512 largest coefficients. About 90% of the energy was contained in the first 50 coefficients, as shown in Fig. 2.
Fig. 2. Typical K–L transform coefficients (left: 60Co; right: 226Ra).

So we chose the first 64 features as the network input. The practical training steps were as follows:

(a) Choose the sample spectra as mentioned above, subtract the natural background, then smooth and normalize the spectra. We used a seven-point Gaussian window to smooth the spectra.
(b) Calculate the covariance matrix of the sample spectra, and the eigenvalues and eigenvectors of the matrix.
(c) Reserve the 64 largest eigenvalues and their corresponding eigenvectors, and generate the feature matrix A (1024 x 64).

(d) Apply the K–L transform to the sample spectra, i.e. calculate the network training input p (64 x 1) = A^T x, where x (1024 x 1) is the training spectrum.

(e) Generate the input patterns {p_i, t_i}. The patterns are defined as 241Am, 133Ba, 60Co, 137Cs, 152Eu, 226Ra, 232Th and NU. For example, if p_i is the K–L coefficient vector of a 60Co spectrum, the third element of t_i would be "1" and the other elements "0".

(f) Calculate the weight matrix W (8 x 64) and bias vector b using the supervised pseudoinverse rule or delta rule (a code sketch of steps (b)–(f) follows).
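As a rough end-to-end illustration of steps (b)–(f) under the paper's dimensions (1024 channels, 64 features, 8 patterns); the function name train_identifier and the one-hot label encoding are our own assumptions:

```python
import numpy as np

def train_identifier(spectra, labels, m=64, n_nuclides=8):
    """Sketch of training steps (b)-(f).

    spectra: (1024, Q) background-subtracted, smoothed, normalized spectra.
    labels: length-Q integer array of nuclide indices (0..n_nuclides-1).
    """
    C = np.cov(spectra)                               # step (b): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)              # step (b): eigen-decomposition
    A = eigvecs[:, np.argsort(eigvals)[::-1][:m]]     # step (c): feature matrix A (1024 x m)
    P = A.T @ spectra                                 # step (d): training inputs p = A^T x
    T = np.eye(n_nuclides)[:, labels]                 # step (e): one-hot target patterns
    W = T @ np.linalg.pinv(P)                         # step (f): pseudoinverse rule
    return A, W
```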
4.2. Testing the network performance

We used all the experimental spectra to test the network performance. The test steps were as follows:
(a) Subtract the natural background, then smooth and normalize the test spectra.

(b) Calculate the K–L transform of the test spectra, i.e. the network input p (64 x 1) = A^T x, where x (1024 x 1) is the test spectrum.
Fig. 3. Mixed nuclide spectra used for testing: NU+Cs137 (the 662 keV peak of 137Cs is marked) and Ra226+Th232; counts versus channel.
(c) Calculate the identification result, namely the output of the network: R (8 x 1) = W p = W A^T x, or R (8 x 1) = W p + b = W A^T x + b. The elements of the column vector R can be seen as the confidence of the existence of the corresponding nuclide.

Note that W A^T and b can be stored after the training process, so the main computation is just the product of the matrix W A^T (8 x 1024) and the vector x (1024 x 1); this is simple enough for a portable device, as sketched below.
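A minimal sketch of this identification step; the helper name identify and the threshold value are illustrative assumptions, since the paper does not give a specific threshold here:

```python
import numpy as np

def identify(x, A, W, b=None, threshold=0.5):
    """Identify nuclides in one test spectrum x (1024,).

    W @ A.T (8 x 1024) can be precomputed and stored after training,
    so each identification costs a single matrix-vector product.
    The threshold value is illustrative only.
    """
    M = W @ A.T                            # store this after training
    R = M @ x if b is None else M @ x + b  # confidence vector R
    return np.flatnonzero(R > threshold)   # indices of identified nuclides
```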
5. Test results
We used both the linear associative memory and ADALINE networks to test the nuclide identification performance. In addition to single nuclides, we also tested some mixed nuclide spectra. To keep this article short, Table 1 shows only part of the test results.

Table 1
Part of the nuclide identification results using the linear associative memory and pseudoinverse rule.

Data          241Am      133Ba      60Co       137Cs      152Eu      226Ra      232Th      NU
Am241_101     1          1.16E-09   -2.97E-11  1.46E-11   -1.30E-10  2.12E-10   7.55E-11   -1.88E-10
Am241_203     1.0003     -0.0005    -0.0001    9.13E-05   0.0001     0.0004     -5.96E-05  -5.19E-05
Am241_303     0.9996     0.0004     6.06E-05   4.28E-05   -0.0001    -0.0002    0.0002     -0.0002
Ba133_203     0.0061     1.0008     0.0004     0.0013     -0.0015    0.0069     -0.0114    -0.0014
Ba133_303     -0.0029    0.9955     -0.0003    0.0002     0.0125     -0.0044    -0.0002    -0.005
Co60_203      -0.0144    0.0394     0.9972     -0.0082    -0.0157    -0.0118    -0.0194    0.0523
Co60_303      0.0206     0.0294     1.0055     0.0039     -0.0258    -0.0119    -0.0136    0.0145
Cs137_203     -0.016     0.0325     0.0031     1.003      0.0128     -0.0416    0.0095     0.0057
Cs137_303     0.0094     0.0215     -0.0024    0.9939     -0.0192    -0.0014    0.0138     -0.0026
Eu152_203     -0.007     0.0129     -0.0016    0.0026     0.9915     -0.0249    0.02028    -0.0012
Eu152_303     -0.0098    0.0081     0.0025     0.0039     0.9951     -0.0112    0.0061     -0.0005
Ra226_203     0.0111     0.0119     -0.0042    -0.02      -0.0033    1.0127     0.0121     -0.0111
Ra226_303     0.001      0.0105     8.97E-05   -0.0017    -0.0095    0.9755     0.0298     -0.0056
Th232_203     -0.0051    0.03101    0.0013     -0.0035    -0.0151    0.0217     1.0088     0.0226
Th232_303     0.0187     0.0342     -0.0016    0.013      -0.0306    -0.0152    0.9956     0.0023
NU_203        0.0066     0.0093     0.0006     -0.0017    -0.0018    -0.01      0.011      0.9921
NU_303        0.0004     0.0097     -0.0004    0.0002     -0.004     -0.0124    0.0123     0.9967
NU+Cs137      -0.0043    -0.0214    -0.0022    0.0962     -0.0077    0.0149     0.0234     0.9365
Ra226+Th232   -0.0234    -0.0123    -0.0002    -0.0054    0.044      0.3744     0.7276     -0.0145
The first column of Table 1 gives the file name of the experimental data; each file name is composed of the nuclide name and a number whose first digit is the group number. The first row of Table 1 is the data of 241Am, which was also used as training data. It shows that the confidence of 241Am is exactly "1", and that of the other nuclides is almost "0". The other outputs of the training data give the same result. The outputs of the non-training data show that, for single nuclide data, the confidences of the matched patterns range from 0.97 to 1.02, while those of the mismatched patterns range from -0.05 to 0.06.
The last two rows are mixed nuclide spectra, and the confidences of the two matched patterns are again much greater than those of the mismatched patterns. The confidence of 137Cs in "NU+Cs137" is smaller because the activity of 137Cs is low compared to the NU activity, so the features of 137Cs are partly covered up. The mixed nuclide spectra are shown in Fig. 3.
6. Conclusion
Using the K–L transform coefficients of the original spectrum to train and test the neural networks, the linear associative memory and ADALINE networks give the same result. There is a clear distinction between the confidences of the matched and mismatched nuclides. A mixed nuclide spectrum is a linear superposition of its component spectra, and both networks have a linear structure, so they can address mixed nuclide problems.
We used different numbers of features and training samples to test the network performance. In extreme cases, we reserved only 32 features and used two spectra (from groups 1 and 2) of each nuclide for training, and there was little difference in the results. Another aspect of the network performance is the output for a never-trained pattern. We removed the 60Co and 152Eu spectra from the training sample and then used them as inputs. The output vectors were very different from those of the trained spectra: the smallest element was below -0.3 and usually below -1.0, whereas that of the trained spectra was about -0.05. From the test results, it is easy to define a confidence threshold for untrained, single nuclide and mixed nuclide spectra. The single nuclide case is the usual condition in our application.
We can see that the neural networks have wide tolerance and good robustness. This method is more stable than the traditional method. With feature extraction based on the K–L transform, the computational load is kept very low, so the method is especially suitable for fast nuclide identification in portable devices.
References
[1] P. Olmos, J.C. Diaz, J.M. Perez, P. Gomez, V. Rodellar, P. Aguayo, A. Bru, G. Garcia-Belmonte, J.L. de Pablos, IEEE Trans. Nucl. Sci. 38 (4) (1991).
[2] P.E. Keller, L.J. Kangas, G.L. Troyer, S. Hashem, R.T. Kouzes, IEEE Trans. Nucl. Sci. 42 (4) (1995).
[3] E. Yoshida, K. Shizuma, S. Endo, T. Oka, Nucl. Instr. and Meth. A 484 (2002) 557.
[4] Genie-2000 Spectroscopy System Customization Tools, pp. 206–336.
[5] SAMPO, Advanced Gamma Spectrum Analysis Software, Version 3.62, User's Guide, Version 1.1, pp. 51–61.
[6] IAEA Nuclear Security Series No. 1, Technical and Functional Specifications for Border Monitoring Equipment, Technical Guidance, ISBN 92-0-100206-8, ISSN 1816-9317, pp. 39–76.
[7] G.-S. Hu, Digital Signal Processing, second ed., Tsinghua University Press, 2003, pp. 368–371.
[8] M.T. Hagan, H.B. Demuth, M. Beale, Neural Network Design, PWS Publishing Company, 1996, pp. 7/1–7/14.