  • 978-1-4244-5934-6/10/$26.00 ©2010 IEEE

    2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2010)

    A Quantification Method of Glucose in Aqueous Solution by FTIR/ATR Spectroscopy

    Jiemei Chen1, Lingling Wu1,2, Tao Pan2,*, Jun Xie2, Huazhou Chen2,3

    1 Department of Biological Engineering, Jinan University, Guangzhou 510632, P.R. China

    2 Key Laboratory of Optoelectronic Information and Sensing Technologies of Guangdong Higher Educational Institutes (Jinan University), Guangzhou 510632, P.R. China

    3 Department of Mathematics, Shanghai University, Shanghai 200444, P.R. China

    *[email protected]

    Abstract: A rapid quantitative analysis method for glucose in aqueous solution was established using FTIR/ATR spectroscopy, partial least squares (PLS) regression and Savitzky-Golay (SG) smoothing. The calibration set and the prediction set were divided according to the prediction performance of the optimal single-wavenumber model. By extending the number of smoothing points and the degree of the polynomial, 483 smoothing modes were calculated. PLS models were established for all combinations of the 483 SG smoothing modes and PLS factors from 1 to 40. The optimal smoothing parameters were first-order derivative smoothing, a polynomial of degree 5 or 6 and 63 smoothing points. The corresponding optimal PLS factor, root mean squared error of prediction (RMSEP), correlation coefficient of prediction (RP) and relative root mean squared error of prediction (RRMSEP) were 3, 0.3729 mmol/L, 0.9995 and 2.48%, respectively. These results were clearly superior to those of the direct PLS model without SG smoothing and of the optimal SG smoothing model within 25 smoothing points (the original smoothing method). This demonstrates that extending the SG smoothing modes and optimizing the SG smoothing parameters and the PLS factor simultaneously on a large scale were both necessary, and that the approach can be applied effectively to model optimization in FTIR/ATR spectroscopic analysis.

    Keywords: glucose solution; FTIR/ATR spectroscopy analysis; partial least squares; Savitzky-Golay smoothing

    I. INTRODUCTION

    Glucose is an important metabolite, and much significant information about a living system can be obtained by measuring its glucose concentration. Routine methods for measuring glucose concentration require chemical reagents, which may damage the living system, so they are poorly suited to this purpose.

    Fourier transform infrared (FTIR) spectroscopy and attenuated total reflection (ATR) technology are effective methods for determining the structure and content of components[1]. They allow nondestructive, online, real-time, in situ quantitative analysis without chemical reagents. The FTIR/ATR technique has been applied extensively to the analysis of agricultural products and food[2], fermentation process monitoring[3], enzyme activity analysis[4], cell metabolism measurement[5-6] and many other fields.

    To establish a quantitative analysis method for glucose in living systems, this paper first develops a quantification method for glucose in aqueous solution by FTIR/ATR spectroscopy. The glucose concentrations of the aqueous solution samples were prepared to match the glucose concentration range in human or animal blood.

    Partial least squares (PLS) is a classical and effective chemometric method that is widely used in spectroscopic analysis[7-11]. The PLS factor is an important parameter. If the PLS factor is too small, the spectral information of the samples cannot be fully used and the model accuracy decreases. If the PLS factor is too large, noise is introduced into the model and the prediction ability also decreases. It is therefore important to select a reasonable PLS factor.

    Savitzky-Golay (SG) smoothing is a widely used spectral pretreatment method that can eliminate noise[12-15]. The SG smoothing parameters include the order of the derivative, the degree of the polynomial and the number of smoothing points. The number of smoothing points is particularly important. If it is too small, new errors are introduced into the model. If it is too large, informative spectral features are smoothed away and lost. Both situations decrease the model accuracy. The appropriate SG smoothing parameters differ between analytes and between measurement modes, so a large-scale simultaneous optimization of the SG smoothing parameters and the PLS factor according to prediction performance is necessary. Because there are many smoothing modes with different formulas, however, the workload is very large, and this has seldom been done in previous studies. In addition, more smoothing points may be necessary in some actual measuring systems. To widen the application scope, the table of smoothing parameters of the original method[12] needs to be expanded.

    In this paper, 82 glucose aqueous solution samples were designed to simulate the glucose concentration in human or animal blood. A rapid determination method and an analysis model for glucose solution were established using FTIR/ATR spectroscopy, PLS regression and SG smoothing. In particular, the simultaneous optimization of the SG smoothing parameters and the PLS factor was applied to the FTIR/ATR spectroscopic analysis of glucose solution.

    This work was supported by the National Natural Science Foundation of China (10771087), the Natural Science Foundation of Guangdong Province (7005948), the Science and Technology Project of Guangdong Province (2007A020905001, 2009B030801239). *Corresponding author: Tao Pan ([email protected])

    II. EXPERIMENT AND METHODS

    A. Experimental Materials, Instrument and Measurement Method

    A total of 82 glucose aqueous solution samples were designed. The glucose concentration ranged from 0.416 to 39.036 mmol/L, with a mean of 18.449 mmol/L and a standard deviation of 10.810 mmol/L.

    A VERTEX 70 FTIR spectrometer (Bruker) equipped with a KBr beamsplitter and a deuterated triglycine sulfate (DTGS) KBr detector was used to collect the spectra. The MIR spectra were recorded from 4500 to 600 cm⁻¹ using a horizontal ATR sampling accessory with a diamond internal reflection element on a ZnSe crystal (Specac; 45° angle of incidence, 3 reflections). For each spectrum, 32 scans of symmetrical interferograms at 4 cm⁻¹ resolution were co-added. The spectra were measured at a temperature of 25 ± 1 °C and a relative humidity of 46%.

    For each measurement, 0.075 mL of the glucose aqueous solution sample was used. Each sample was measured 3 times, and the average spectrum was calculated.

    B. Dividing Method for Calibration Set and Prediction Set

    The calibration set and the prediction set were divided according to the prediction performance of the optimal single-wavenumber model built on all samples. The glucose chemical values and spectral data of the calibration samples were combined to establish the models. Each established model was then applied to the spectral data of the prediction samples to calculate their glucose values, and the prediction performance of the model was evaluated by comparing the calculated values with the actual values.

    By Beer's law, the single-wavenumber linear model relating the absorbance of a glucose aqueous solution sample to its glucose chemical value is as follows:

    $$A(v) = k(v)\,C + \varepsilon, \qquad (1)$$

    where, for each wavenumber v, A(v) is the absorbance of the sample and k(v) is the absorption coefficient per unit concentration of glucose in aqueous solution; C is the glucose chemical value and ε represents other unknown interference. k(v) was calculated by regression from the absorbances and chemical values of all samples, and the predicted value C'i(v) of sample i was then calculated from k(v) and the sample absorbance, i = 1, 2, ..., N, where N is the number of samples. The root mean square error (RMSE) between the predicted values and the chemical values was also calculated. Let Ci be the chemical value of sample i; then

    $$\mathrm{RMSE}(v) = \sqrt{\frac{\sum_{i=1}^{N} \left( C'_i(v) - C_i \right)^2}{N-1}}. \qquad (2)$$

    According to the minimum RMSE, the optimal single-wavenumber model and the corresponding wavenumber vOptimal were selected. Based on this optimal model, the absolute bias between the predicted value and the chemical value of each sample was calculated and named the single wavenumber prediction bias (SWPB):

    $$\mathrm{SWPB}(i) = \left| C'_i(v_{\mathrm{Optimal}}) - C_i \right|, \quad i = 1, 2, \ldots, N. \qquad (3)$$

    The calibration set and the prediction set were divided according to SWPB. A computer procedure was used to give the two sets similar SWPB distributions (similar means and standard deviations, with a relative error of less than 1%). Because SWPB combines the chemical values with the spectral data, this gives the calibration set and the prediction set consistent distributions, so the division is representative for modeling. To ensure that the concentration range of the calibration set covered that of the prediction set, the samples with the maximum and minimum chemical values were assigned to the calibration set, while the samples with the second maximum and second minimum chemical values were assigned to the prediction set.
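The single-wavenumber screening and the SWPB calculation can be sketched in Python with NumPy as follows. This is only an illustration under stated assumptions: the paper says that k(v) was obtained by regression but not how, so the sketch assumes a least-squares line through the origin and a predicted value of A(v)/k(v). The function names and the array layout (samples in rows, wavenumbers in columns) are hypothetical.

```python
import numpy as np


def single_wavenumber_models(A, C):
    """A: (N, n_wavenumbers) absorbances; C: (N,) chemical values."""
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    # Assumption: least-squares slope through the origin, A(v) ~ k(v) * C.
    k = (A.T @ C) / (C @ C)
    # Assumption: predicted concentration of each sample is A(v) / k(v).
    C_pred = A / k
    # RMSE per wavenumber with N - 1 in the denominator, as in Eq. (2).
    rmse = np.sqrt(np.sum((C_pred - C[:, None]) ** 2, axis=0) / (C.size - 1))
    return k, C_pred, rmse


def swpb(A, C):
    """Return the index of the optimal wavenumber and the SWPB of Eq. (3)."""
    _, C_pred, rmse = single_wavenumber_models(A, C)
    best = int(np.argmin(rmse))
    return best, np.abs(C_pred[:, best] - np.asarray(C, dtype=float))
```

The SWPB vector returned here is the quantity whose mean and standard deviation are balanced between the two sets; the balancing procedure itself is not specified in the paper and is not sketched.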

    C. SG Smoothing Method

    The SG smoothing parameters include the order of the derivative (smoothing of the original spectra is recorded as zero-order derivative smoothing), the degree of the polynomial and the number of smoothing points. Some actual measurement systems (for example, those with a small spectral wavenumber gap) may require more smoothing points. In this paper, therefore, the number of smoothing points was expanded from 5, 7, ..., 25 (odd)[12] to 5, 7, ..., 81 (odd), and the degree of the polynomial was expanded to n = 2, 3, 4, 5, 6 (originally n = 2, 3, 4, 5). Following the original method[12], 14 smoothing coefficient tables, which cover the original smoothing coefficients, and 483 smoothing modes (originally 117 modes) were calculated by a computer program. This gives an SG smoothing preprocessing group with a wider application scope.
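The three SG parameters map directly onto the arguments of SciPy's `savgol_filter`, which computes the coefficients on the fly rather than reading them from precomputed tables. The sketch below is not the authors' program. The wrapper name `sg_preprocess` and the unit point spacing are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_preprocess(spectra, n_points, degree, deriv, spacing=1.0):
    """Apply one SG smoothing mode along the wavenumber axis.

    spectra  : (n_samples, n_wavenumbers) array, or a single 1-D spectrum
    n_points : number of smoothing points (odd, larger than degree)
    degree   : degree of the fitted polynomial
    deriv    : order of the derivative (0 = smoothing of the original spectra)
    spacing  : distance between adjacent points (assumed 1.0 here)
    """
    return savgol_filter(
        np.asarray(spectra, dtype=float),
        window_length=n_points,
        polyorder=degree,
        deriv=deriv,
        delta=spacing,
        axis=-1,
    )
```

For example, `sg_preprocess(X, 63, 5, 1)` would apply a 63-point, degree-5, first-order derivative mode to every row of a spectra matrix `X`.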

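    D. Model Evaluation Indicators

    The model evaluation indicators mainly include the root mean squared error of prediction (RMSEP), the correlation coefficient of prediction (RP) and the relative root mean squared error of prediction (RRMSEP), defined as follows:

    $$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{M} \left( C'_{ip} - C_{ip} \right)^2}{M-1}}, \qquad (4)$$

    $$R_P = \frac{\sum_{i=1}^{M} \left( C_{ip} - C_{mp} \right)\left( C'_{ip} - C'_{mp} \right)}{\sqrt{\sum_{i=1}^{M} \left( C_{ip} - C_{mp} \right)^2 \sum_{i=1}^{M} \left( C'_{ip} - C'_{mp} \right)^2}}, \qquad (5)$$

    $$\mathrm{RRMSEP} = \frac{\mathrm{RMSEP}}{C_{mp}} \times 100\ (\%), \qquad (6)$$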


    where C'ip and Cip are the predicted value and the chemical value of sample i in the prediction set, C'mp and Cmp are the mean predicted value and the mean chemical value of all samples in the prediction set, and M is the number of samples in the prediction set. RMSEP was used as the objective for model optimization and parameter selection.
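The three indicators translate almost line by line into NumPy. The sketch below follows Eqs. (4) to (6), including the M - 1 denominator. The function names are hypothetical.

```python
import numpy as np


def rmsep(c_pred, c_ref):
    """Root mean squared error of prediction, Eq. (4), with M - 1 in the denominator."""
    c_pred = np.asarray(c_pred, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    return np.sqrt(np.sum((c_pred - c_ref) ** 2) / (c_ref.size - 1))


def r_p(c_pred, c_ref):
    """Correlation coefficient of prediction, Eq. (5)."""
    c_pred = np.asarray(c_pred, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    d_pred = c_pred - c_pred.mean()
    d_ref = c_ref - c_ref.mean()
    return np.sum(d_ref * d_pred) / np.sqrt(np.sum(d_ref ** 2) * np.sum(d_pred ** 2))


def rrmsep(c_pred, c_ref):
    """Relative RMSEP in percent, Eq. (6): RMSEP divided by the mean chemical value."""
    return 100.0 * rmsep(c_pred, c_ref) / np.mean(np.asarray(c_ref, dtype=float))
```

As a consistency check on the reported figures, an RMSEP of 0.3729 mmol/L divided by the mean chemical value of the prediction set in TABLE I (15.021 mmol/L) gives 2.48%, and 0.6604 mmol/L gives 4.40%, matching the reported RRMSEP values.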

    III. RESULTS AND DISCUSSION

    The FTIR/ATR spectra of the 82 samples are shown in Figure 1. All single-wavenumber models were established by the method of Section II-B, and the optimal wavenumber vOptimal, selected by the minimum RMSE, was 1034 cm⁻¹. The SWPB of each sample was calculated from the 1034 cm⁻¹ model. The distributions of the SWPBs and chemical values of all samples are shown in Figure 2. All 82 samples were used for modeling and were split by the method of Section II-B into a calibration set of 55 samples and a prediction set of 27 samples. TABLE I shows the means and standard deviations of the SWPBs and chemical values. TABLE I and Figure 2 indicate that the SWPB distributions of the calibration set and the prediction set were very consistent.

    For comparison, direct PLS models without SG smoothing were also established on the whole spectral region (4500-600 cm⁻¹) and on the fingerprint region (1600-900 cm⁻¹). The optimal RMSEPs for the whole spectral region and the fingerprint region were 1.109 and 0.6604 mmol/L, respectively, so the prediction accuracy of the latter was clearly better. The fingerprint region (1600-900 cm⁻¹) was therefore selected as the spectral band for PLS modeling; the corresponding optimal PLS factor was 6, the optimal RMSEP was 0.6604 mmol/L, RP was 0.9981 and RRMSEP was 4.40%.

    TABLE I. THE MEANS AND THE STANDARD DEVIATIONS OF SWPBS AND CHEMICAL VALUES IN CALIBRATION SET AND PREDICTION SET

                       Chemical value (mmol/L)         SWPB (mmol/L)
                       Mean     Standard deviation     Mean    Standard deviation
    Calibration set    20.131   10.851                 8.615   5.829
    Prediction set     15.021   10.064                 8.664   5.810

    PLS models with the various SG smoothing modes were then built. Using a computer algorithm platform developed by the authors, PLS models were established for all combinations of the 483 smoothing modes and PLS factors from 1 to 40. The SG smoothing parameters and the PLS factor were optimized simultaneously according to the prediction performance. The RMSEP values of the optimal models for the different derivative orders and numbers of smoothing points are shown in Figure 3. The degree of the polynomial, number of smoothing points, PLS factor and RMSEP of the optimal model for each derivative order are shown in TABLE II. The result of the direct PLS model without SG smoothing is also listed in TABLE II.
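The simultaneous optimization amounts to an exhaustive grid search over smoothing modes and PLS factors, keeping the combination with the smallest RMSEP. The sketch below is a stand-in for the authors' platform, which the paper does not describe. It assumes SciPy's `savgol_filter` for the smoothing step and a plain NIPALS PLS1 implementation, and all function names are hypothetical. Each mode is given as a tuple (derivative order, polynomial degree, number of smoothing points).

```python
import numpy as np
from scipy.signal import savgol_filter


def pls1_fit(X, y, n_factors):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, regression coefficients).

    Assumes n_factors does not exceed the rank of the centered data.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector
        t = Xk @ w                    # scores
        p = Xk.T @ t / (t @ t)        # X loadings
        c = yk @ t / (t @ t)          # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def grid_search(X_cal, y_cal, X_pred, y_pred, modes, max_factors):
    """Return (rmsep, deriv, degree, n_points, n_factors) with the smallest RMSEP."""
    best = None
    for deriv, degree, n_points in modes:
        Xc = savgol_filter(X_cal, n_points, degree, deriv=deriv, axis=1)
        Xp = savgol_filter(X_pred, n_points, degree, deriv=deriv, axis=1)
        for n_factors in range(1, max_factors + 1):
            x_mean, y_mean, coef = pls1_fit(Xc, y_cal, n_factors)
            y_hat = (Xp - x_mean) @ coef + y_mean
            rmsep = np.sqrt(np.sum((y_hat - y_pred) ** 2) / (y_pred.size - 1))
            if best is None or rmsep < best[0]:
                best = (rmsep, deriv, degree, n_points, n_factors)
    return best
```

With 483 modes and 40 factors this loop fits 19,320 models, which is why the search is described as large scale. Refitting from scratch for each factor keeps the sketch short; a practical implementation would reuse the first factors of each fit.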

    The global optimal SG smoothing parameters were first-order derivative smoothing, a polynomial of degree 5 or 6 and 63 smoothing points. The corresponding optimal PLS factor, RMSEP, RP and RRMSEP were 3, 0.3729 mmol/L, 0.9995 and 2.48%, respectively. The prediction performance was clearly better than that obtained without SG smoothing. TABLE II and Figure 3 show that the optimal number of smoothing points and the optimal PLS factor differed between derivative orders. With fixed smoothing parameters taken from previous studies, and without a large-scale selection, it is difficult to find the optimal SG smoothing parameters and PLS factor. TABLE II and Figure 3 also show that the optimal numbers of smoothing points were all greater than 25, so the optimal prediction performance could not be obtained with 25 or fewer smoothing points (within 25 smoothing points, the best RMSEP was 0.6604 mmol/L and RP was 0.9981). Figure 4 compares the predicted value and the chemical value of each sample for the optimal SG smoothing model; the correlation between the predicted and chemical values of all samples was very good. These results indicate that the expansion of the SG smoothing modes was necessary.

    Figure 1 The FTIR/ATR spectra of 82 glucose aqueous solution samples.

    Figure 2 The distributions of SWPBs and chemical values.

    Figure 3 The optimal RMSEP corresponding to the number of smoothing points for each derivative order: (0) original spectra smoothing; (1) 1st order derivative; (2) 2nd order derivative; (3) 3rd order derivative; (4) 4th order derivative; (5) 5th order derivative.

    TABLE II. PREDICTION EFFECT OF THE OPTIMAL MODEL CORRESPONDING TO EACH DERIVATIVE ORDER

                                  Polynomial   Number of          PLS      RMSEP      RRMSEP   RP
                                  degree       smoothing points   factor   (mmol/L)
    No SG smoothing               -            -                  6        0.6604     4.40%    0.9981
    Original spectra smoothing    4 or 5       41                 6        0.6418     4.27%    0.9983
    1st order derivative          5 or 6       63                 3        0.3729     2.48%    0.9995
    2nd order derivative          4 or 5       67                 3        0.4762     3.17%    0.9989
    3rd order derivative          3 or 4       57                 5        0.5279     3.51%    0.9988
    4th order derivative          4 or 5       65                 6        0.5483     3.65%    0.9987
    5th order derivative          5 or 6       75                 9        0.5459     3.63%    0.9985

    The global optimal SG smoothing parameters were first-order derivative smoothing, a polynomial of degree 5 or 6 and 63 smoothing points. The corresponding smoothing formula is as follows:

    $$\tilde{a}_0 = \sum_{i=-31}^{31} \alpha_i a_i, \qquad (7)$$

    where a_i are the original spectral data, ã_0 is the spectral datum after SG smoothing and α_i are the smoothing coefficients, listed here from i = -31 to i = 31 (all values ×10⁻³): -4.886, -1.796, 0.459, 1.997, 2.928, 3.353, 3.364, 3.045, 2.472, 1.713, 0.830, -0.125, -1.104, -2.067, -2.978, -3.808, -4.536, -5.141, -5.611, -5.937, -6.113, -6.138, -6.016, -5.752, -5.354, -4.833, -4.204, -3.480, -2.679, -1.820, -0.920, 0, 0.920, 1.820, 2.679, 3.480, 4.204, 4.834, 5.354, 5.752, 6.016, 6.139, 6.113, 5.937, 5.611, 5.141, 4.536, 3.808, 2.978, 2.067, 1.104, 0.125, -0.830, -1.713, -2.472, -3.045, -3.364, -3.353, -2.928, -1.997, -0.459, 1.796, 4.887.
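A coefficient table of this kind can be cross-checked with SciPy's `savgol_coeffs`, which derives the weights of the least-squares polynomial fit directly. The sketch below is not the authors' program. It assumes unit point spacing and uses `use="dot"` so that the weights come out ordered from i = -31 to i = 31. The function name is hypothetical.

```python
import numpy as np
from scipy.signal import savgol_coeffs


def first_derivative_coeffs(n_points=63, degree=5):
    """SG first-order derivative weights, ordered from i = -m to i = +m."""
    return savgol_coeffs(n_points, degree, deriv=1, delta=1.0, use="dot")


coeffs = first_derivative_coeffs()
i = np.arange(-31, 32)
# A first-derivative filter returns slope 1 for the line a_i = i
# and the weights are antisymmetric about the central point.
slope_of_unit_line = float(np.sum(i * coeffs))
```

Degrees 5 and 6 yield the same first-derivative weights, which is why the optimum is reported as "5 or 6". The listed values also satisfy the unit-slope condition: the sum of i times the coefficient, after applying the 10⁻³ scale, is 1 to within rounding.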

    IV. CONCLUSION

    The simultaneous optimization of the SG smoothing parameters and the PLS factor was applied to the FTIR/ATR analysis of glucose solution. The calibration set and the prediction set were divided on the basis of the optimal single-wavenumber model. For the direct PLS model without SG smoothing, the optimal PLS factor, RMSEP, RP and RRMSEP were 6, 0.6604 mmol/L, 0.9981 and 4.40%, respectively. By extending the number of smoothing points and the degree of the polynomial, 483 smoothing modes were calculated. PLS models were established for all combinations of the 483 SG smoothing modes and PLS factors from 1 to 40. The optimal smoothing parameters were first-order derivative smoothing, a polynomial of degree 5 or 6 and 63 smoothing points. The corresponding optimal PLS factor, RMSEP, RP and RRMSEP were 3, 0.3729 mmol/L, 0.9995 and 2.48%, respectively, which was clearly superior to the direct PLS model without SG smoothing and to the optimal SG smoothing model within 25 smoothing points (the original smoothing method). This demonstrates that extending the SG smoothing modes and optimizing the SG smoothing parameters and the PLS factor simultaneously on a large scale were both necessary, and that the approach can be applied effectively to model optimization in FTIR/ATR analysis.

    Figure 4 Comparison of predictive value and chemical value of each sample for the optimal SG smoothing model

    REFERENCES

    [1] F. M. Mirabella and N. J. Harrick, Internal reflection spectroscopy: review and supplement, New York: Harrick Scientific Corporation, 1985.

    [2] R. H. Wilson, Fourier Transform mid-infrared spectroscopy for food analysis, Trends in Analytical Chemistry, vol. 9, pp. 127-131, 1990.

    [3] P. H. Axelsen, W. D. Braddock, H. L. Brockman, et al., Use of internal reflectance infrared spectroscopy for in situ study of supported lipid monolayers, Applied Spectroscopy, vol. 49, no. 5, pp. 526-531, 1995.

    [4] T. Fujii, Y. Miyahara, Infrared ATR spectroscopy of substrates in aqueous solution using cryoenrichment and its application in enzyme-activity assays, Applied Spectroscopy, vol. 52, no. 2, pp. 128-133, 1998.

    [5] K. Nakanishi, A. Hashimoto, T. Pan, et al., Mid-infrared spectroscopic measurement of ionic dissociative materials in metabolic pathway, Applied Spectroscopy, vol. 57, no. 12, pp. 1510-1516, 2003.

    [6] T. Pan, A. Hashimoto, M. Kanou, et al., Development of a quantification system of ionic dissociative metabolites using an FT-IR/ATR method, Bioprocess and Biosystems Engineering, vol. 26, no. 2, pp. 133-139, 2003.

    [7] D. A. Burns, E. W. Ciurczak, Handbook of near-infrared analysis, 2nd ed., New York: Marcel Dekker Inc., 2001, pp. 633-647.

    [8] X. L. Chu, H. F. Yuan, W. Z. Lu, Progress and Application of Spectral Data Pretreatment and Wavelength Selection Methods in NIR Analytical Technique, Progress in Chemistry, vol. 16, no. 4, pp. 528-542, 2004.

    [9] X. L. Chu, Y. Xu, W. Z. Lu, Research and Application Progress of Chemometrics Methods in Near Infrared Spectroscopic Analysis, Chinese Journal of Analytical Chemistry, vol. 36, no. 5, pp. 702-709, 2008.

    [10] P. Cao, T. Pan, X. D. Chen, Choice of wave band in design of minitype near-infrared corn protein content analyzer, Optics and Precision Engineering, vol. 15, no. 12, pp. 1952-1958, 2007.

    [11] J. H. Jiang, R. J. Berry, H. W. Siesler, Y. Ozaki, Wavelength interval selection in multicomponent spectral analysis by moving window partial least-squares regression with applications to mid-infrared and near-infrared spectroscopic data, Analytical Chemistry, vol. 74, pp. 3555-3565, 2002.

    [12] A. Savitzky, M. J. E. Golay, Smoothing and differentiation of data by simplified least squares procedures, Analytical Chemistry, vol. 36, no. 8, pp. 1627-1639, 1964.

    [13] J. M. Chen, T. Pan, X. D. Chen, Application of second derivative spectrum prepares in quantification measuring glucose-6-phosphate and fructose-6-phosphate using a FTIR/ATR method, Optics and Precision Engineering, vol. 14, no. 1, pp. 1-7, 2006.

    [14] H. Yin, T. Pan, D. F. Liu, et al., A simultaneous quantification method of thalassemia screening multiple indicators using FTIR/ATR spectroscopy, Photonics and Optoelectronics, 2009, SOPO 2009, Symposium on, 14-16 Aug. 2009, pp. 1-4.

    [15] H. Yin, T. Pan, P. L. Tian, et al., Rapid Quantitative Analysis for the Human Blood Hemoglobin Applied through FTIR/ATR Spectrum, Chinese Journal of Spectroscopy Laboratory, vol. 26, no. 2, pp. 432-436, 2009.