

Chemometrics and Intelligent Laboratory Systems 107 (2011) 65–68

Contents lists available at ScienceDirect

Chemometrics and Intelligent Laboratory Systems

journal homepage: www.elsevier.com/locate/chemolab

Wavelet based Raman spectra comparison

Gordon Cooper a, Maria Kubik b,⁎, Kurt Kubik c

a University of Witwatersrand, School of Geosciences, Private Bag 3, 2050 Wits, South Africa

b The Art Gallery of Western Australia, Perth Cultural Centre, Perth WA 6000, Australia

c The University of Queensland, School of IT and Electrical Eng., St. Lucia, QLD 4072, Australia

⁎ Corresponding author at: 3/110 Lorikeet Drive, Peregian Beach, QLD 4573, Australia. Tel.: +61 754713594.

E-mail addresses: [email protected] (G. Cooper), [email protected] (M. Kubik), [email protected] (K. Kubik).

0169-7439/$ – see front matter. Crown Copyright © 2011 Published by Elsevier B.V. All rights reserved. doi:10.1016/j.chemolab.2011.01.010

Article info

Article history:
Received 7 December 2010
Received in revised form 17 January 2011
Accepted 26 January 2011
Available online 3 February 2011

Keywords: Raman spectra; Matching algorithms; Wavelet comparison; Pigment identification; Art conservation

Abstract

In this paper a novel blind peak detector for Raman spectra based on the continuous wavelet transform is proposed. The peak detector is shown to correctly identify peaks of different widths and intensities, and is well suited for subsequent peak-to-peak matching for the process of identifying unknown Raman spectra with the help of a library of reference spectra. No background reduction or initial noise filtering is necessary. The described methods were verified by successfully identifying 73 paint pigments by a spectral search through a Raman library for 638 pigments. In all except 7 cases, the Schmincke–Kremer pairings from the truth reference were found to be amongst the two top pairings with the largest matching cost function. These 7 cases were again inspected by the paintings conservator and doubts were raised regarding the truthfulness of the initial pairing. The proposed methods are equally applicable to other spectral methods.

Crown Copyright © 2011 Published by Elsevier B.V. All rights reserved.

1. Introduction

Raman spectroscopy has found wide application in the identification of compounds in artworks such as pigments and binding media [1–4]. As artworks such as paintings are composed of many layers of often mixed materials, proper characterisation and identification of the component materials are crucial for the development of responsible and careful conservation interventions, understanding artist technique and degradation processes, and assisting with possible forgery detection. The unique and irreplaceable value of artwork guides the conservation scientist to select analytical techniques which do not compromise the integrity of the original material. As a non-destructive and more importantly non-sampling technique, micro-Raman spectroscopy can be used for the selective identification of small components such as pigments in a paint layer (typically less than 1 μg), or natural binders and varnishes which are bound with other materials. The Raman spectra collected from such samples are a fingerprint of the molecular structure of these materials, which can then be used for their identification.

The identification of pigments by Raman spectroscopy is achieved by comparing the spectrum of an unknown sample against a database containing spectra of known compounds. When designing such identification methods, the following aspects of the Raman spectra have to be observed:

• Raman band intensities and spectral peak values vary between spectra; these variations are due to a number of factors, amongst others incident laser power, frequency of the incident radiation, efficiency of the grating and detector, absorptivity of the materials, molar scattering power of the molar mode, and the concentration of the sample [5].

• There may be small variations in the spectral peak location for the corresponding chemical element, due to differences in calibration procedures used in different laboratories, and due to variations in molecular properties.

• The spectral line is systematically offset from the zero count. This baseline magnitude depends on factors such as pigment particle size and the use of fillers: although fillers do not interfere with pigment peaks or discrimination, they cause higher backgrounds than desirable. This is unavoidable as different manufacturers add different fillers and a pure pigment is not easily available.

• Insignificant small fluctuations in the composition of the compound and “measuring noise” cause small local differences in the spectra.

• Raman pigment identification cannot be successfully completed for ~30% of commonly used pigments, where either the pigment is not Raman active (no spectral peaks), or where the spectrum is overwhelmed by fluorescence [6]. Therefore visual control of the automated matching remains necessary.

Automated methods of comparison can be categorised into correlation methods and peak detection methods. Cross-correlation methods (for example [7]) calculate a statistical correlation measure between pairs of spectra, while peak finding methods (for example [8]) use the detected spectral peaks to compare spectrum pairs. Both these approaches have important drawbacks. Correlation methods necessitate pre-processing by applying baseline correction [9], normalizing and smoothing to all spectra. Peak detection methods require a priori assumptions regarding the extent/width of the peak and its relative magnitude compared to the noise background, and often also require a baseline correction. These assumptions are difficult to make because the width and size of significant spectral features may vary over a large range.

The “blind” method presented here combines the best features of the two algorithm classes: it uses a blind peak detection algorithm and then correlates the spectral peaks using a semblance criterion [10], which does not have the drawbacks of statistical correlation. The objective of so-called ‘blind’ identification [11] and matching is to identify spectral peaks and match the spectra by utilizing only the spectral observations. The spectra are characterised by wavelets so as to utilise wavelet tools in the processing.

For illustration purposes, selected Raman spectra from the pigment database of Kubik [6] were used. Kubik measured the Raman spectra of 638 pigments from the German manufacturer Kremer® [12] and 73 pigments from another artists' paint manufacturer, Schmincke® [13]. These brands were selected for the wide range available and the quality of their materials. The Schmincke pigment Burnt Sienna Powder 18–679 and Kremer pigment 40450 English Burnt Sienna were chosen as a candidate pair for matching due to their clearly resolved spectral peaks. Fig. 1 shows the spectra of the Schmincke pigment and the selected Kremer pigment, measured with a Renishaw 2000 Micro-Raman Spectrometer with a 782 nm laser (Kubik [6]). The Schmincke pigment has six major intensity maxima at wave numbers 225, 291, 408, 492, 610 and 1310 cm−1. Note the noise fluctuations in the Kremer spectrum, which partially bury the sixth peak (Fig. 1). A human operator would tend to regard this spectrum pair as a proper match, considering both the type of the pigments and their spectra. The automated matching method should at least match the performance of the human comparison.

Fig. 1. Raman spectra of Schmincke pigment 18–679 Burnt Sienna Powder and Kremer pigment 40450 English Burnt Sienna (dotted line). Axes: Raman shift (cm−1) against counts.

2. Theory

The proposed blind peak detection and matching method uses the continuous wavelet transform to analyse the characteristics of the Raman spectra. Wavelet analysis has been applied to many areas of signal processing. Wavelet analysis localizes spectral information (unlike Fourier analysis), which makes it especially suitable for the analysis of nonstationary features such as spectral peaks. Only a very brief introduction to wavelet analysis is given here; see [14] for further details. The continuous wavelet transform (CWT) Wf of a dataset f(t) is given by

$$W_f(u,s) = \int_{-\infty}^{\infty} f(t)\,\frac{1}{s^{p}}\,\Psi^{*}\!\left(\frac{t-u}{s}\right)\mathrm{d}t \qquad (1)$$

where Ψ is the mother wavelet, s is a scaling factor, u moves the wavelet through the data, p is a constant (usually 0.5), and * means complex conjugate. Different wavelets have different mathematical properties, and can be chosen depending on the nature of the problem under investigation. Because the Raman peaks have a Gaussian form, a wavelet based on the Gaussian function (actually on its second derivative) was used, defined as

$$\Psi(x) = C_2\,e^{-ix}\,e^{-x^{2}} \qquad (2)$$

where $C_2$ is a normalisation factor such that $\|\Psi^{(2)}\|_2 = 1$, and $\Psi^{(2)}$ is the second derivative of Ψ [15]. The output of the wavelet transform is an image with time (Raman frequency in this case) on the x axis and scale (related to wavelength) on the y axis.
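As a rough illustration of Eqs. (1) and (2), the following Python sketch evaluates the CWT by direct summation with NumPy. It is a minimal sketch, not the authors' MATLAB implementation: the function names `cgau2` and `cwt_direct`, the unit sample spacing, the numerical normalisation grid, the margin width and the test signal are all assumptions made for illustration.

```python
import numpy as np

def cgau2(x):
    """Second derivative of exp(-ix) * exp(-x**2), scaled to unit L2 norm.

    Differentiating Eq. (2) twice gives (4x^2 + 4ix - 3) * exp(-ix - x^2);
    the normalisation constant is found numerically on an assumed grid.
    """
    x = np.asarray(x, dtype=float)
    psi = (4 * x**2 + 4j * x - 3) * np.exp(-1j * x - x**2)
    grid = np.linspace(-8.0, 8.0, 4001)
    ref = (4 * grid**2 + 4j * grid - 3) * np.exp(-1j * grid - grid**2)
    norm = np.sqrt(np.sum(np.abs(ref) ** 2) * (grid[1] - grid[0]))
    return psi / norm

def cwt_direct(f, scales, p=0.5, dt=1.0):
    """Evaluate Eq. (1) by direct summation; returns shape (n_scales, n)."""
    f = np.asarray(f, dtype=float)
    t = np.arange(f.size) * dt
    shift = t[None, :] - t[:, None]      # (t - u): rows index u, columns index t
    out = np.empty((len(scales), f.size), dtype=complex)
    for k, s in enumerate(scales):
        kernel = np.conj(cgau2(shift / s)) / s**p
        out[k] = kernel @ f * dt         # sum over t for every position u
    return out

# Hypothetical test signal: one Gaussian peak sitting on a constant offset.
shift_axis = np.arange(301)
spectrum = 100.0 + 50.0 * np.exp(-((shift_axis - 150) / 5.0) ** 2)
coeffs = cwt_direct(spectrum, scales=[2.0, 4.0, 8.0])

# Skip the margins: near the ends the truncated wavelet no longer has zero
# mean, so the constant offset leaks into the coefficients there.
margin = 40
interior = np.abs(coeffs[1, margin:-margin])
peak_index = margin + int(np.argmax(interior))
```

Away from the ends of the record the constant offset contributes almost nothing, because the second-derivative wavelet integrates to zero; this is consistent with the statement that no background reduction is needed. Direct summation costs O(n²) per scale and is chosen here only for readability.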

Peaks in the Raman spectrum result in high values for the CWT over a range of scales (Fig. 2), and this can be used to identify peak locations. Note that the isolated spectral peak at 1310 cm−1 is discernible, while the closely spaced peaks at smaller scales require further processing for clear identification. Because the wavelet used was complex, the CWT is also complex and gives both amplitude and phase information about the data, both of which are used in the method of analysis proposed here. Once the CWT has been computed, the first step is to locate peaks in its absolute value. There are many different methods by which this could be done [16], but a very simple blind method was used because it is computationally undemanding, which is important if a large database is to be searched. A window is moved through the data, the object being to assign to the data point at the centre of the window a value which gives the likelihood that this point is a local maximum of the data. The assigned value is just the fraction of the data point values within the window that are smaller than the centre point, and clearly this can vary between 0.0 and 1.0. These peak fraction values can then be displayed as an image, as shown in Fig. 3.
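The windowed peak measure described above can be sketched in a few lines of Python. The text does not state the window size or shape, so this sketch assumes a one-dimensional window along the Raman shift axis (to be applied to each scale row of the CWT magnitude), a hypothetical `half_width` parameter, a window that is truncated at the ends of the record, and a denominator that excludes the centre point so that a strict local maximum scores exactly 1.0.

```python
import numpy as np

def peak_fraction(values, half_width=5):
    """Fraction of window neighbours that are smaller than the centre point."""
    values = np.asarray(values, dtype=float)
    n = values.size
    out = np.zeros(n)
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        # Neighbours are all window samples except the centre itself.
        neighbours = np.delete(values[lo:hi], i - lo)
        if neighbours.size:
            out[i] = np.mean(neighbours < values[i])
    return out

# Hypothetical magnitudes for one scale row, with maxima at indices 2 and 6.
row = [0, 1, 3, 1, 0, 2, 5, 2, 0]
pf = peak_fraction(row, half_width=2)
```

Because the measure only ranks values inside the window, it does not depend on the absolute intensity of a peak or on a constant offset, so no noise threshold or peak width has to be supplied in advance.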

Fig. 2. CWT analysis of the Schmincke data in Fig. 1. Red depicts comparatively large CWT coefficients, while dark blue depicts relatively low coefficients.

Fig. 3. Peak fraction values of the Schmincke data in Fig. 1. Red depicts identified peaks extending over a range of wavelengths.

Fig. 4. Common peak locations for the Schmincke pigment 18–679 Burnt Sienna Powder and Kremer pigment 40450 English Burnt Sienna. Axes: Raman shift (cm−1) against counts.

The peak fraction values of two spectra can be compared to yield a measure of similarity, and while some success was achieved in this manner, the results were considerably improved by the addition of a second measure of correlation. Semblance is a measure of the cross-correlation of two datasets, and when computed from the wavelet phase [10] yields a correlation coefficient at each time (frequency in this case) and scale. The cross wavelet transform between two datasets [17] is defined as

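$$\mathrm{CWT}_{1,2} = \mathrm{CWT}_1 \cdot \mathrm{CWT}_2^{*} \qquad (3)$$

where CWT1 and CWT2 are the continuous wavelet transforms of datasets 1 and 2 respectively. The cross wavelet transform is a complex quantity having an amplitude given by

$$A = \left|\mathrm{CWT}_{1,2}\right| \qquad (4)$$

and local phase θ

$$\theta = \tan^{-1}\!\left(\frac{\Im\left(\mathrm{CWT}_{1,2}\right)}{\Re\left(\mathrm{CWT}_{1,2}\right)}\right). \qquad (5)$$

Semblance is then defined as [10]

$$S = \cos(\theta). \qquad (6)$$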

S has values that range from −1 (inversely correlated) through zero (uncorrelated) to +1 (correlated). However, a comparison of the relative phases of two spectra cannot be used on its own because the spectra might behave differently between their peaks (see above) and this would be reflected in their wavelet phases. Hence the method that was found to be the most successful for comparing spectra was to use only the semblance values from the portions of the CWT which had a peak fraction above a threshold value (a threshold of 0.9 seemed to work well in practice), see Fig. 4. The sum of these semblance counts above the threshold is used as the match metric.
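Eqs. (3), (5) and (6) and the thresholded sum can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the names `semblance` and `match_metric` are hypothetical, the mask is assumed to require a peak fraction above the threshold in both spectra (the text does not say whether one or both must qualify), and the phase is taken with the four-quadrant arctangent (`np.angle`) so that S can reach −1 as stated above.

```python
import numpy as np

def semblance(cwt_a, cwt_b):
    """Cosine of the cross-wavelet phase, following Eqs. (3), (5) and (6)."""
    cross = np.asarray(cwt_a) * np.conj(np.asarray(cwt_b))
    theta = np.angle(cross)   # four-quadrant arctangent of Im/Re
    return np.cos(theta)

def match_metric(cwt_a, cwt_b, pf_a, pf_b, threshold=0.9):
    """Sum the semblance where both peak fractions exceed the threshold."""
    mask = (np.asarray(pf_a) > threshold) & (np.asarray(pf_b) > threshold)
    return float(np.sum(semblance(cwt_a, cwt_b)[mask]))

# Hypothetical 2 x 3 coefficient arrays (scales x Raman shift samples).
w = np.array([[1 + 1j, 2 - 1j, -1 + 0.5j], [0.5j, -2 + 0j, 3 + 3j]])
pf = np.array([[1.0, 0.2, 0.95], [0.5, 1.0, 0.91]])

same = match_metric(w, w, pf, pf)        # identical phases at the 4 masked cells
opposite = match_metric(w, -w, pf, pf)   # phases differ by pi at those cells
```

Since S depends only on phase, the metric is insensitive to differences in overall intensity between the two spectra; the peak fraction mask restricts the comparison to locations where a peak was detected.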

3. Experimental results

A library database of Raman spectra had been created for 638 Kremer pigments [12] on loan from the National Archives of Australia (Canberra) for the purpose of art conservation [6,18]. The matching methods documented in this paper were tested with 73 Schmincke powdered pigments [13], especially synthetic and organic compounds, to determine whether they allowed direct comparison and identification. The appropriate matching Kremer pigments had been identified by a paintings conservator, and were used as truth reference. From visual inspection of the data it was found that 14 of the 73 Schmincke pigments were not Raman active. Their spectra consisted mainly of background and had to be excluded from further tests. For the remaining Schmincke spectra, matching pigment candidates were found from the database of 638 Kremer pigments by selecting the Kremer pigments with the largest match metrics. For each Schmincke spectrum, a number of Kremer candidates had match values close to the maximum match metric, differing by only a few percent or per mille. This is due to the fact that many Kremer pigments have (nearly) identical chemical compositions. For example, the spectra of Kremer pigments ultramarine light, ultramarine dark and ultramarine very dark are practically identical and thus give identical spectral peaks, the shade of the pigments being due to the particle size. In all except 7 cases, the Schmincke–Kremer pairings from the truth reference were found to be amongst the two top pairings with the largest matching cost function. These 7 cases were again inspected by the paintings conservator and doubts were raised regarding the truthfulness of the initial pairing.
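The library search and the "amongst the two top pairings" check can be sketched generically. The helper names `rank_candidates` and `found_in_top`, the toy library values and the stand-in metric below are assumptions for illustration only; in practice the metric would be the thresholded semblance sum described in Section 2.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def rank_candidates(
    unknown: Sequence[float],
    library: Dict[str, Sequence[float]],
    metric: Callable[[Sequence[float], Sequence[float]], float],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top_k library entries with the largest match metric."""
    scored = [(name, metric(unknown, ref)) for name, ref in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

def found_in_top(truth: str, ranked: List[Tuple[str, float]]) -> bool:
    """True if the reference pairing is among the ranked candidates."""
    return any(name == truth for name, _ in ranked)

# Toy data: two near-identical entries and one clearly different entry.
library = {
    "ultramarine light": [0.0, 5.0, 1.0],
    "ultramarine dark": [0.0, 5.0, 1.1],
    "burnt sienna": [4.0, 0.0, 2.0],
}
unknown = [0.0, 5.0, 1.0]

def toy_metric(a, b):
    # Stand-in metric (negative absolute difference), not the semblance sum.
    return -sum(abs(x - y) for x, y in zip(a, b))

ranked = rank_candidates(unknown, library, toy_metric, top_k=2)
```

Reporting the best few candidates rather than a single winner reflects the observation above that chemically near-identical pigments produce almost equal match values.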

Regarding the computational efficiency of the method, the processing of one match of spectra with ~2000 data points (with MATLAB 7.5.0; Dell Inspiron 1525; 2.0 GHz Intel Core 2 processor) required 2.5 s of CPU time. The alternative cross-correlation method with a baseline correction and smoothing of the spectra required approximately the same CPU time. In the latter method, the baseline correction consisted of a one-sided fit, typically of a 5th degree polynomial, iteratively calculated over ~15 iterations according to the method described in [9]. This baseline pre-processing often required human interaction in order to select a proper polynomial degree.

4. Conclusions

The blind method presented here achieved superior results to conventional correlation methods and selected commercial products. Because neither the matching methods in commercial software systems nor the methods described in this paper produce unique results, they should only be used to identify a number of candidate match pairs, and pigment identification should be confirmed by visual inspection of the spectra by the analyst, and be complemented by other analytical techniques. The visual control of the results also allows elimination of match results where pigments were not Raman-active. Different particle sizes caused differences in pigment shade while having the same spectrum. Fluorescent background in particular was a severe problem and swamped a number of the Raman peaks. In consequence, the blind matching method of this paper serves as a useful standard and reference method, but needs to be complemented by other methods.

References

[1] I. Bell, R. Clark, P. Gibbs, Raman spectroscopic library of natural and synthetic pigments (pre-1850 AD), Spectrochimica Acta Part A 53 (12) (1997) 2159–2179.

[2] L. Burgio, K. Melessanaki, M. Dougleridis, R. Clark, D. Anglos, Pigment identification in paintings employing laser induced breakdown spectroscopy and Raman microscopy, Spectrochimica Acta Part B (2001) 905–913.

[3] P. Vandenabeele, A. Hardy, L. Moens, Evaluation of a principal components-based searching algorithm for Raman spectroscopic identification of organic pigments in 20th century artwork, Applied Spectroscopy 55 (2001) 525–533.

[4] P. Vandenabeele, B. Wehling, L. Moens, H. Edwards, M. De Reu, G. Van Hooydonk, Analysis with micro-Raman spectroscopy of natural organic binding media and varnishes used in art, Analytica Chimica Acta (2000) 261–274.

[5] J. Loader, Basic Laser Raman Spectroscopy, Heyden & Son, London, 1970.

[6] M. Kubik, The use of non-sampling spectroscopic techniques in conservation: improved methodologies for pigment identification, PhD Thesis, Australian National University, Canberra, 2006.

[7] Thermo Electric Corporation, Grams/AI user's guide, Thermo, Woburn, MA, 2003.

[8] PerkinElmer, Search Plus, http://las.perkinelmer.com/Catalog/ProductInfoPage.htm?ProductID=SPECSCHP8 (accessed Nov 2010).

[9] V. Mazet, C. Carteret, D. Brie, J. Idier, B. Humbert, Background removal from spectra by designing and minimising a non-quadratic cost function, Chemometrics and Intelligent Laboratory Systems 76 (2) (April 2005) 121–133.

[10] G.R.J. Cooper, D.R. Cowan, Comparing time series using wavelet-based semblance analysis, Computers and Geosciences 34 (2) (2008) 95–102.

[11] Z. Wang, J. D. Z., J.D. Chen, Blind separation of slow waves and spikes from gastrointestinal myoelectrical recordings, IEEE Transactions on Information Technology in Biomedicine 5 (2) (June 2001) 133.

[12] Kremer, Pigment samples in jars, http://www.kremer-pigmente.com/shopint/index.php?lang=ENG&list=15018 (accessed online 10 July 2010).

[13] Schmincke, Premium-artists'-pigments, Series 18, http://www.schmincke.de/produkte/pigments-and-bronzes/pigments.html?L=18 (accessed online 10 July 2010).

[14] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, 19998 637 pp.

[15] Matlab Help, The Mathworks, 2010.

[16] P. Du, W. Kibbe, S. Lin, Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching, Oxford Journals Life Sciences Bioinformatics 22 (11) (2006) 2059–2065.

[17] C. Torrence, G.P. Compo, A practical guide to wavelet analysis, Bulletin of the American Meteorological Society 79 (1998) 61–78.

[18] M. Kubik, A. Lee, Kremer pigments, in: D. Creagh, I. Batterham (Eds.), Raman Spectra, vol. 1, Cultural Heritage Research Centre, Uni. Canberra Press, Canberra, 2005.