Applied Spectroscopy, Volume 43, Number 1, 1989, p. 123. 0003-7028/89/4301-0123$2.00/0. © 1989 Society for Applied Spectroscopy.

Component Spectrum Reconstruction from Partially Characterized Mixtures*

MARC R. NYDEN† and KRISHNAN CHITTUR
Center for Fire Research, National Bureau of Standards, Gaithersburg, Maryland 20899 (M.R.N.); and National Center for Biomedical Infrared Spectroscopy, Battelle Laboratories, Columbus, Ohio (K.C.)

A mathematical analysis of some existing approaches to component spectrum reconstruction is presented. This analysis leads to the derivation of a generalization of the cross-correlation technique. The effectiveness of these methods is assessed from the quality of the reconstructions obtained with the use of synthetic mixture spectra. Reconstructions of the spectra of the components of aqueous mixtures of immunoglobulin G and albumin are compared to the corresponding spectral reconstructions of the pure proteins in buffer.

Index Headings: Computer applications; Infrared.

INTRODUCTION

The prospect of reconstructing the spectra of components directly from the spectra of mixtures has been a focal point in the field of computer-assisted spectroscopic analysis. Spectral reconstructions could be used to replace physical separations in situations where efficiency of time is more valued than are definitive component identifications. Possible applications are environmental monitoring and on-line quality control. Moreover, in many applications, physical separation of the sample either is not possible or is possible only with significant alteration of the physical state of the components. In such cases, the reconstructed spectra could be used to identify the components and to obtain information about structural changes which result from the interactions between these components. The study of protein structural changes which occur both in solution and on polymer surfaces, for example, would be particularly well-suited to the application of these techniques.

It is possible to conceive of methods by which the spectrum of each component is reconstructed without introducing values for the concentrations of these components. Indeed, a number of papers in the literature 1-5 have described strategies for accomplishing this task, to varying degrees of accuracy. However, it must be emphasized that the problem as it has been presented above does not have an exact solution in the general case. Without a complete specification of the concentrations of each component, a set of mixture spectra can be equally well represented by an arbitrary linear combination of the component spectra. In other words, the mixture spectra by themselves do not constitute sufficient information from which the spectra of the components can be determined. This is an important practical consideration since it is not always possible to measure the concentrations of every component in a series of complex mixtures.

As a consequence, the accuracy of all methods of component spectral reconstruction depends on the validity of assumptions or observations that have the effect of restricting the component concentrations to a subset of permissible values. The effectiveness of imposing some constraints, such as the existence of unique absorbance bands 3 and non-negative component concentrations, 4 has been demonstrated on simple mixtures. Unfortunately, exact reconstructions are only possible in the limiting case when these constraints are sufficient to uniquely specify the values of all of the component concentrations. Otherwise, an approximation is being made, and the best that can be expected is that the effectiveness of the approximation is not unexpectedly lost.

Received 13 May 1988; revision received 26 June 1988.
* This paper is a contribution from the National Bureau of Standards and is not subject to copyright.
† Author to whom correspondence should be sent.

Despite hopes to the contrary, the accuracy of these approximate spectral reconstructions has a tendency to degrade as the complexity of the mixtures, as measured by the number of components, increases. The reason for this problem is that simple constraints, such as non-negativity, become less definitive as the number of degrees of freedom grows out of proportion with the number of constraining equations. Prospects for discovering a set of conditions, short of specifying the component concentrations themselves, which will ensure accurate reconstructions from the spectra of complex mixtures do not seem promising. We have, therefore, directed our research efforts to solving a related, but more tractable, problem.

In this paper we will report progress in the development of a strategy for the spectral reconstruction of components in "partially characterized" mixtures. This phrase is used to distinguish our applications, which presume knowledge of the concentrations of the components of interest, from the case of fully uncharacterized mixtures where no concentration information is available. A mathematical analysis of the conditions which govern the accuracy of the approximation will be provided in conjunction with a set of examples from which the effectiveness of the method can be readily assessed. Finally, the procedure will be used to monitor spectral changes in the components of aqueous mixtures of plasma proteins.

THEORY

We will begin our mathematical analysis by deriving the cross-correlation method proposed by Honigs et al. 5 The derivation proceeds from consideration of a set of spectra measured from $m$ mixtures containing a total of $n$ components ($m \geq n$). The deviations from Beer's law arising from chemical interactions between mixture components are at least partially absorbed in the definition of the terms which make up the model. Thus, we can write

$$A_{\ell i} = \sum_{j=1}^{n} C_{\ell j} K_{ji} \qquad (i = 1, 2, \ldots, N) \tag{1}$$

where $A_{\ell i}$ is the absorbance of any one of the mixtures at the frequency $\nu_i$; $C_{\ell j}$ is the concentration of component $j$ in this mixture; and $K_{ji}$ is the averaged absorptivity at $\nu_i$ of component $j$ in the mixture. Clearly, this system of equations can be solved for the component spectra after specification of the mixture spectra and the concentrations of each of the components in every mixture. The basis of our proposal, that spectral reconstructions can be used to detect structural changes arising from molecular interactions, is that the spectra ($K_j$) obtained in this way will be sensitive to the environment of the mixture. We further contend that, in certain cases, it is possible to compute accurate reconstructions of the component spectra even if the concentrations and identities of the other components are unknown.
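As an illustration of the fully characterized case, the following minimal sketch solves Eq. 1 in matrix form, $A = CK$, by least squares. The function name, array shapes, and random test data are assumptions made for this example; they are not taken from the RECON program mentioned later in the paper.

```python
import numpy as np

def solve_component_spectra(A, C):
    """Least-squares solution K of A = C @ K (Eq. 1 in matrix form).

    A: (m, N) mixture absorbances; C: (m, n) concentrations, with m >= n.
    Returns K with shape (n, N): one reconstructed spectrum per row.
    """
    K, *_ = np.linalg.lstsq(C, A, rcond=None)
    return K

# Hypothetical data: 3 components, 8 mixtures, 50 frequencies.
rng = np.random.default_rng(0)
K_true = rng.random((3, 50))
C = rng.random((8, 3))
A = C @ K_true                      # noise-free mixtures obeying Eq. 1
K_rec = solve_component_spectra(A, C)
print(np.allclose(K_rec, K_true))
```

With noise-free data and a complete, full-rank concentration matrix the reconstruction is exact to rounding error, which is the limiting case described in the text.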

The statistical properties we are interested in are apparent from consideration of the following expression:

$$A_{\ell i} - \bar{A}_i = \sum_{j} (C_{\ell j} - \bar{C}_j) K_{ji} \qquad (i = 1, 2, \ldots, N). \tag{2}$$

This equation was obtained by subtracting the average absorbance spectrum of the $m$ mixtures from both sides of Eq. 1. The average absorbance at $\nu_i$ and the average component concentrations are defined as

$$\bar{A}_i = \sum_{\ell} A_{\ell i}/m \qquad (i = 1, 2, \ldots, N) \tag{3}$$

and

$$\bar{C}_j = \sum_{\ell} C_{\ell j}/m \qquad (j = 1, 2, \ldots, n). \tag{4}$$

After multiplying both sides of Eq. 2 by $(C_{\ell t} - \bar{C}_t)$, summing over $\ell$, and dividing by $m$, one arrives at the following expression involving the covariances of the component concentrations:

$$\sum_{\ell} [(A_{\ell i} - \bar{A}_i)(C_{\ell t} - \bar{C}_t)]/m = \sum_{j} \mathrm{cov}(C_t, C_j) K_{ji} \qquad (i = 1, 2, \ldots, N). \tag{5}$$

The covariance between the concentrations of components $t$ and $j$ is defined as

$$\mathrm{cov}(C_t, C_j) = \sum_{\ell} (C_{\ell t} C_{\ell j})/m - \bar{C}_j \bar{C}_t. \tag{6}$$

The conventional nomenclature is that the off-diagonal elements of this matrix are called covariances and the diagonal elements are called variances. The statistical definition of stochastic independence (see, for example, the discussion which begins on page 151 of Ref. 6) implies that, if the components are randomly distributed in the mixtures, then the covariances of their concentrations will converge to zero in the limit of infinite $m$.
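The convergence of the covariances can be checked numerically. The sketch below evaluates Eq. 6 for random concentrations and prints the largest off-diagonal element for increasing $m$; the uniform distribution, the seed, and the function name are assumptions chosen only for illustration.

```python
import numpy as np

def concentration_covariance(C):
    """Covariance matrix of Eq. 6: (C^T C)/m minus the outer product of the column means."""
    m = C.shape[0]
    mean = C.mean(axis=0)
    return C.T @ C / m - np.outer(mean, mean)

rng = np.random.default_rng(1)
for m in (25, 150, 100_000):
    cov = concentration_covariance(rng.random((m, 4)))
    off_diag = cov[~np.eye(4, dtype=bool)]      # covariances between different components
    print(m, np.abs(off_diag).max())
```

The definition normalizes by $m$ rather than $m - 1$, so it corresponds to `np.cov(C, rowvar=False, bias=True)`.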

On the basis of this reasoning, we can write the following formula which relates the spectrum of a specified component to the spectra of a set of mixtures with random concentrations:

$$K_{ti} = \lim_{m \to \infty} \sum_{\ell} [(A_{\ell i} - \bar{A}_i)(C_{\ell t} - \bar{C}_t)]/(m \sigma_t^2) \qquad (i = 1, 2, \ldots, N). \tag{7}$$

The variance of the target component concentration, which shows up in the denominator of Eq. 7, is given by

$$\sigma_t^2 = \mathrm{cov}(C_t, C_t). \tag{8}$$

The right-hand side of Eq. 7 is identical to Eq. 4 in Ref. 5. It is related to the statistical definition of the correlation between random variables, $A_i$ and $C_t$, the difference being that the product of the standard deviations of $C_t$ and $A_i$, which is present in the definition of the correlation coefficient, is replaced by the variance of $C_t$.

Equation 7 is the basis of the cross-correlation method. It is a prescription for determining the spectrum of a target component without knowledge of the concentrations or identities of the remaining components. Accurate reconstructions are obtainable from this formula only when the mean-centered concentration vectors, $(C_j - \bar{C}_j)$, are orthogonal to the mean-centered concentration vector of the target compound or, stated in the parlance of statistics, when there is no correlation between the concentrations of the target and the remaining components.
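A finite-$m$ version of Eq. 7 can be sketched as follows. The function name and the one-component test mixture (a target spectrum plus a constant background) are assumptions introduced for this example; in that special case nothing correlates with the target, so the estimate is exact.

```python
import numpy as np

def cross_correlation_spectrum(A, c_t):
    """Finite-m estimate of Eq. 7 for a single target component.

    A: (m, N) mixture spectra; c_t: (m,) target concentrations.
    """
    dA = A - A.mean(axis=0)          # deviations from the average spectrum (Eq. 3)
    dc = c_t - c_t.mean()            # mean-centered target concentrations (Eq. 4)
    m = len(c_t)
    var_t = dc @ dc / m              # sigma_t^2 of Eq. 8
    return (dc @ dA) / (m * var_t)

rng = np.random.default_rng(2)
k_t = rng.random(40)                 # hypothetical target spectrum
c_t = rng.random(30)
A = np.outer(c_t, k_t) + 0.5         # target plus a background that is the same in every mixture
print(np.allclose(cross_correlation_spectrum(A, c_t), k_t))
```

The constant background drops out when the average spectrum is subtracted, which is why the mean-centering step matters.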

We suspect that, for many applications, the deviations from Beer's law will not be a significant source of error. The basis of our reasoning is that some of the deviations are absorbed in the definition of the $K_{ji}$, whereas the residual will tend to have little correlation with the component concentration vectors. Thus, we anticipate that, in general, the largest source of error will be the correlation between the concentrations of major components which exists when these concentrations are not random. The only way to eliminate this error is to introduce the measured concentrations of the correlated components into the reconstruction.

The most straightforward way to do this is to solve the system of equations

$$A - \bar{A} = (C - \bar{C}) K^a \tag{9}$$

for $K^a$ in the subspace spanned by the concentration vectors, $C_j$ ($j = 1, 2, \ldots, n'$), corresponding to the ($n' < n$) correlated components. In this equation, $A - \bar{A}$ is the ($m \times N$) matrix consisting of the deviations from the average absorbance at each frequency; $(C - \bar{C})$ is the ($m \times n'$) matrix comprised of the deviations from the average component concentrations in each mixture; and $K^a$ is the ($n' \times N$) matrix of component spectra. The superscript, $a$, is used here to emphasize that component spectra obtained by solving Eq. 9 with an incomplete concentration matrix will be contaminated by features of the spectra of the remaining components. The extent of this contamination depends on statistical considerations which will be discussed presently. Equation 9 is a generalization of the cross-correlation method. A mathematical analysis of the properties of this approximation follows.
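Equation 9 amounts to a least-squares fit of the mean-centered spectra to the mean-centered known concentrations. The sketch below is one possible implementation under that reading; the function name, the use of `numpy.linalg.lstsq`, and the random stand-in data are assumptions, not the authors' code.

```python
import numpy as np

def generalized_cross_correlation(A, C_known):
    """Least-squares solution K^a of Eq. 9.

    A: (m, N) mixture spectra; C_known: (m, n') concentrations of the
    characterized components only. Returns K^a with shape (n', N).
    """
    dA = A - A.mean(axis=0)
    dC = C_known - C_known.mean(axis=0)
    K_a, *_ = np.linalg.lstsq(dC, dA, rcond=None)
    return K_a

rng = np.random.default_rng(3)
K_true = rng.random((10, 60))                        # hypothetical component spectra
C = rng.random((150, 10))
A = C @ K_true
K_a = generalized_cross_correlation(A, C[:, :5])     # only 5 of the 10 components known
K_full = generalized_cross_correlation(A, C)         # complete concentration matrix
print(K_a.shape, np.allclose(K_full, K_true))
```

When only one concentration vector is supplied, the same function reduces to the cross-correlation estimate of Eq. 7.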

We begin this process by inverting Eq. 9. This is accomplished by left-multiplying both sides by the pseudo-inverse of $(C - \bar{C})$. The result is

$$K^a = (C - \bar{C})^{\dagger} (A - \bar{A}). \tag{10}$$

The pseudo-inverse of the mean-centered concentration matrix is defined by

$$(C - \bar{C})^{\dagger} (C - \bar{C}) = I, \tag{11}$$

with $I$ being the identity matrix. 7

It is clear from this defining equation that for any choice of the $n'$ components the dot products

$$(C - \bar{C})^{\dagger}_{t} \cdot (C - \bar{C})_{j}/m = \mathrm{cov}((C - \bar{C})^{\dagger}_{t}, C_j) = 0 \qquad (t \neq j = 1, 2, \ldots, n'). \tag{12}$$

FIG. 1. Measured spectrum of carbon dioxide. [Plot titled "Actual Spectrum of Carbon Dioxide": absorbance versus wavenumber, with a wavelength scale in microns.]

The idea is to force the covariance of the correlated components to zero by algebra, that is, by satisfying Eq. 10. This relationship is a consequence of the properties of the solutions of a system of linear equations; it is independent of the number of mixtures, provided that $m > n'$. Note that this is an example of the fact that zero covariance does not imply stochastic independence, even though stochastic independence does imply zero covariance. 2 After multiplying Eq. 2 by the vector $(C - \bar{C})^{\dagger}_{t}$, dividing by $m$, and substituting Eq. 10, one arrives at the following expression:

$$\sum_{\ell} [(A_{\ell i} - \bar{A}_i)(C - \bar{C})^{\dagger}_{t\ell}]/m = K^a_{ti}/m = K_{ti}/m + \sum_{j \neq t} \mathrm{cov}((C - \bar{C})^{\dagger}_{t}, C_j) K_{ji} \qquad (i = 1, 2, \ldots, N). \tag{13}$$

Confirmation of the fact that this constitutes a generalization of Eq. 5 is attained with the realization that this equation, albeit with a different constant multiplying $K_{ti}$, can be derived by substituting the term $(C - \bar{C})^{\dagger}_{t\ell}$ for every occurrence of $(C_{\ell t} - \bar{C}_t)$ in Eq. 5.
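The properties of the pseudo-inverse used in Eqs. 11 and 12 can be verified numerically. The following sketch uses `numpy.linalg.pinv` on a hypothetical mean-centered concentration matrix; the sizes and the seed are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n_known = 25, 3
C = rng.random((m, n_known))                 # hypothetical concentrations, m > n'
dC = C - C.mean(axis=0)                      # mean-centered matrix (C - C_bar)
P = np.linalg.pinv(dC)                       # pseudo-inverse, shape (n', m)

# Eq. 11: the pseudo-inverse is a left inverse of the mean-centered matrix.
print(np.allclose(P @ dC, np.eye(n_known)))

# Each row of P sums to zero, so the dot products divided by m are covariances.
print(np.allclose(P.sum(axis=1), 0.0))

# Eq. 12: the covariances between row t of P and column j of C vanish for t != j.
cov = P @ dC / m
off_diag = cov[~np.eye(n_known, dtype=bool)]
print(np.allclose(off_diag, 0.0))
```

The diagonal elements of `cov` equal $1/m$, which is the constant that multiplies $K_{ti}$ in Eq. 13.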

FIG. 2. Measured spectrum of acetic acid. [Plot titled "Actual Spectrum of Acetic Acid": absorbance versus wavenumber, with a wavelength scale in microns.]

FIG. 3. Spectrum of carbon dioxide reconstructed from 150 mixture spectra using Eq. 15 and the concentration vectors for 5 of the 10 mixture components. [Plot titled "Reconstructed Spectrum of Carbon Dioxide": absorbance versus wavenumber, with a wavelength scale in microns.]

It is clear from Eq. 13 that spectral features from the components $j > n'$ will contaminate the reconstruction of the spectrum of component $t$ only to the extent that their concentrations correlate with the vector $(C - \bar{C})^{\dagger}_{t}$. Thus, using the same reasoning that was used to justify Eq. 7, we can write

$$K_{ti} = \lim_{m \to \infty} K^a_{ti}. \tag{14}$$

On the basis of this analysis we propose the following strategy for the reconstruction of component spectra: (1) Measure the concentrations of the components of interest by chromatography, or by quantitative spectroscopy, if suitable calibration spectra are available, 8 and (2) solve the system of equations represented by Eq. 9 in the subspace of the $n'$ concentration vectors for the matrix, $K^a$, which contains the reconstructed component spectra.

The accuracy of the reconstructions computed from an incomplete specification of the concentration matrix will improve as the number of mixture spectra used is increased. Note, however, that this strategy does constitute a workable generalization of the cross-correlation method proposed in Ref. 5. Indeed, the methods become identical when a single concentration vector is used. On the other hand, the generalized cross-correlation technique becomes exact for any number of mixtures ($m > n$) when all of the component concentration vectors are specified. In practice, the constraints of the analysis will usually dictate a compromise between these asymptotes.

TABLE I. Sums of the squares of the deviations a of the measured and reconstructed spectra for each strategy.

Strategy:               Eq. 15            Eq. 7             Eqs. 9 or 17
Number of mixtures:     25      150       25      150       25      150
Component b
Water                   11.7    1.04      26.1    1.31      3.20    1.08
Carbon dioxide          12.8    2.93      30.8    1.09      1.57    0.60
Carbon monoxide         1.15    2.52      66.7    1.38      10.3    0.69
Acetaldehyde            4.32    2.47      11.5    0.96      3.53    0.70
Acetic acid             17.5    2.72      160.0   1.60      2.46    0.66

a Computed as $\sum_{i=1}^{N} (K_{ji}^{\mathrm{act}} - K_{ji}^{\mathrm{rec}})^2$, where the superscripts denote the actual and reconstructed spectra.
b The remaining 5 components were: acrolein, acrylonitrile, ammonia, aniline, and benzene.

FIG. 4. Spectrum of carbon dioxide reconstructed from 150 mixture spectra using Eq. 9 and the concentration vectors for 5 of the 10 mixture components. [Plot titled "Reconstructed Spectrum of Carbon Dioxide (A)": absorbance versus wavenumber, with a wavelength scale in microns.]

FIG. 6. Spectrum of carbon dioxide reconstructed from 150 mixture spectra using Eq. 7. [Plot titled "Reconstructed Spectrum of Carbon Dioxide (C)": absorbance versus wavenumber, with a wavelength scale in microns.]

Some variations of this strategy are also of interest. The fact that the solutions of the system of equations represented by Eq. 1 are identical to the solutions of Eq. 9 when the complete concentration matrix is specified prompted us to explore the consequences of solving the following equation using an incomplete concentration matrix:

$$A = C K^b. \tag{15}$$

Again, a superscript is used to distinguish the various approximations.

As it turns out, the approach summarized by Eq. 15 is not intrinsically useful; however, it does form the starting point for another strategy which is effective when the component concentrations are correlated in the sense that they sum to a constant value which is independent of the mixture.

FIG. 5. Spectrum of acetic acid reconstructed from 150 mixture spectra using Eq. 9 and the concentration vectors for 5 of the 10 mixture components. [Plot titled "Reconstructed Spectrum of Acetic Acid (A)": absorbance versus wavenumber, with a wavelength scale in microns.]

The basis of this strategy is the following approximation:

$$A_{\ell i} \approx C_{\ell t} K_{ti} + \sum_{j \neq t} \bar{C}_j K_{ji} \qquad (\ell = 1, 2, \ldots, m), \tag{16}$$

which is consistent with the assumption that $C_t$ and $C_j$ are independent random variables. Although average values for the unspecified component concentrations may be available in some instances, our purpose is to develop a methodology with widespread applicability. This objective is satisfied by incorporating the approximation that the average concentration is the same for all of the unspecified components. In this way we obtain the following model:

$$A = C K^c, \tag{17}$$

where the truncated concentration matrix of Eq. 15 has been augmented by a column vector containing the same value repeated $m$ times, and $K^c$ contains $n' + 1$ component spectra.

An advantage of this method is that it corrects for the correlation between components satisfying the condition, $C_{\ell t} + C_{\ell j} = 1$. As a result, the solutions to Eq. 17 are preferred if the component concentrations sum to the same constant in all of the mixtures. If, on the other hand, the sums of the concentrations change with the mixture, then the solutions of Eq. 17 are identical to those of Eq. 9.
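The equivalence of Eqs. 9 and 17 for unconstrained concentration sums can be illustrated with a short sketch. The helper names `eq9` and `eq17`, the choice of 1.0 for the repeated constant, and the random data are assumptions made for this example.

```python
import numpy as np

def eq9(A, C_known):
    """Solve the mean-centered system of Eq. 9 by least squares."""
    dA = A - A.mean(axis=0)
    dC = C_known - C_known.mean(axis=0)
    return np.linalg.lstsq(dC, dA, rcond=None)[0]

def eq17(A, C_known, constant=1.0):
    """Solve Eq. 17: the truncated matrix augmented by a constant column."""
    m = C_known.shape[0]
    C_aug = np.hstack([C_known, np.full((m, 1), constant)])
    return np.linalg.lstsq(C_aug, A, rcond=None)[0]   # last row belongs to the constant column

rng = np.random.default_rng(6)
K_true = rng.random((6, 40))         # hypothetical component spectra
C = rng.random((30, 6))              # concentration sums are not constrained
A = C @ K_true
K_a = eq9(A, C[:, :3])               # 3 of the 6 components characterized
K_c = eq17(A, C[:, :3])
print(np.allclose(K_c[:3], K_a))
```

Fitting a constant column is the same as fitting an intercept at every frequency, and a least-squares fit with an intercept gives the same coefficients as a fit to mean-centered data. That is why the first $n'$ rows of $K^c$ coincide with $K^a$ here.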

FIG. 7. Spectrum of carbon dioxide reconstructed from 25 mixture spectra using Eq. 7. [Plot titled "Reconstructed Spectrum of Carbon Dioxide (C)": absorbance versus wavenumber, with a wavelength scale in microns.]

FIG. 8. Spectrum of carbon dioxide reconstructed from 25 mixture spectra using Eq. 9 and the concentration vectors for 5 of the 10 mixture components. [Plot titled "Reconstructed Spectrum of Carbon Dioxide (A)": absorbance versus wavenumber, with a wavelength scale in microns.]

EXPERIMENTAL

The effectiveness of the strategies summarized by Eqs. 7, 9, and 15 was examined with the use of the component spectra reconstructed from synthetic mixture spectra. The choice was made to use synthetic mixture spectra because we wanted to be able to validate the methods without the need to interpret the effects of molecular interactions. A set of 150 synthetic mixtures was prepared by taking linear combinations of the vapor-phase spectra of ten common compounds. The coefficients of the component spectra, which play the role of component concentrations, were obtained with the use of a random number generator with no constraint on the sums of the component concentrations.
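A comparable synthetic data set can be generated in a few lines. In the sketch below the Gaussian-band spectra are hypothetical stand-ins for the vapor-phase spectra, which are not reproduced here, and the uniform random generator, wavenumber grid, and function names are likewise assumptions.

```python
import numpy as np

def gaussian_band_spectra(n_components, wavenumbers, rng):
    """Hypothetical stand-in spectra: one Gaussian band per component."""
    centers = rng.uniform(wavenumbers.min(), wavenumbers.max(), n_components)
    widths = rng.uniform(20.0, 80.0, n_components)
    return np.exp(-0.5 * ((wavenumbers - centers[:, None]) / widths[:, None]) ** 2)

def synthetic_mixtures(K, m, rng):
    """Random coefficients (no constraint on their sum) and the resulting mixture spectra."""
    C = rng.random((m, K.shape[0]))
    return C, C @ K

rng = np.random.default_rng(5)
nu = np.linspace(4000.0, 1000.0, 300)        # assumed wavenumber grid
K = gaussian_band_spectra(10, nu, rng)       # 10 components
C, A = synthetic_mixtures(K, 150, rng)       # 150 mixtures
print(K.shape, C.shape, A.shape)
```

Because each coefficient is drawn independently, the concentration vectors are uncorrelated apart from sampling fluctuations, which matches the condition under which Eq. 7 is expected to work.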

The preparation of the protein solutions and the measurement of the corresponding spectra were performed as described in Ref. 8. The spectra of pure immunoglobulin G (IgG) in buffer and pure albumin (A1) in buffer, and the spectra of IgG and A1 in the presence of buffer and each other, were reconstructed from a total of 49 standardized solution spectra with total protein concentrations ranging from 10 to 60 mg/mL. Fourier self-deconvolution of the reconstructed spectra was performed with the use of the method described in Ref. 9. The procedure requires the input of two parameters and the specification of a line shape function. In this study, we used a pure Lorentzian line shape with half-width at half-height, $\sigma$ = 18, and a resolution enhancement factor, $k$ = 1.8.

All linear equations were solved with a program (RECON) which is part of a software package developed at the Center for Fire Research, for the purposes of identifying and quantitating components in fire atmospheres.

RESULTS AND DISCUSSION

Synthetic Spectra. The actual spectra of carbon dioxide and acetic acid, which were two of the ten components (see Table I) of the synthetic mixtures, are displayed in Figs. 1 and 2, respectively. The reconstructed spectrum of carbon dioxide, obtained by solving Eq. 15 with the use of a truncated concentration matrix consisting of the 150 component concentration vectors from five of the ten components, one of which was carbon dioxide, is displayed in Fig. 3. Although we did not expect to get accurate reconstructions from Eq. 15, we thought that it was important to include these figures and, in so doing, to demonstrate the full range of possible results.

FIG. 9. Deconvolved spectrum of IgG reconstructed from the spectra of aqueous mixtures of IgG and A1 compared to the deconvolved spectrum of IgG reconstructed from pure IgG solutions. [Plot: absorbance versus wavenumber, 1600 to 1480, with a wavelength scale in microns.]

There is a striking contrast between the accuracy of the reconstructed spectrum in Fig. 3 and that of the reconstructed spectra of carbon dioxide (Fig. 4) and acetic acid (Fig. 5) which were obtained by solving Eq. 9 with the use of exactly the same concentration information. The examples illustrated are typical of the results obtained. This fact is substantiated by the results reported in Table I. The listed values are the sums of the squares of the deviations (i.e., the difference between the reconstructed and actual spectrum) for each of the five reconstructions which we computed for each strategy.

Not all of the strategies described in this paper give different results in all applications. In fact, in this study the value of the sum of the component concentrations was not constrained to a constant value. Consequently, the solutions of Eq. 17 are identical to the solutions of Eq. 9. On the other hand, the solutions to Eqs. 7 and 9 are noticeably different. Indeed, the quality of the reconstructions which were obtained from the cross-correlation method (Eq. 7), an example of which is displayed in Fig. 6, was not quite as good as that computed from Eq. 9 (or equivalently from Eq. 17). The explanation for this difference is that the cross-correlation method can only make use of the information contained in a single concentration vector. Despite this fact, it should be stressed that each of the five components could be readily identified from the reconstructed spectra obtained by cross-correlating all 150 mixture spectra with the appropriate concentration vectors.

On the basis of the reasoning which is to follow, we do not expect this favorable result to be the norm. The component concentrations of the synthetic mixtures used in this investigation are necessarily uncorrelated because of the way they were generated. Real mixtures do not necessarily conform to any simple model. Depending on their source, they may contain both correlated and uncorrelated components. The cross-correlation technique is bound to fail except in those instances when there is little correlation between the target and remaining components, whereas Eq. 9 and Eq. 17 continue to be useful as long as the components which are correlated can be quantitated.

FIG. 10. Deconvolved spectrum of A1 reconstructed from the spectra of aqueous mixtures of IgG and A1 compared to the deconvolved spectrum of A1 reconstructed from pure A1 solutions. [Plot: absorbance versus wavenumber, 1600 to 1480, with a wavelength scale in microns; one curve is labeled "in Mixture".]

The virtues and the failings of the various strategies become most evident when the number of available mixtures is small. The reconstructions obtained by cross-correlating the spectra of 25 mixtures with the appropriate concentration vectors are not identifiable. For example, the reconstructed spectrum of carbon dioxide (Fig. 7), which was calculated from Eq. 7 with the use of 25 mixture spectra, is extensively contaminated by the spectra of the other components. On the other hand, the reconstructed spectrum in Fig. 8, which was obtained by solving Eq. 9 using the same mixtures, is immediately recognizable as carbon dioxide. In general, we have found that the solutions of Eqs. 9 and 17 converge to the component spectra much more rapidly than do the reconstructions obtained from Eq. 7.
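The dependence on the number of mixtures can be explored with a small numerical experiment of the same kind. The sketch below computes sums of squared deviations (the quantity reported in Table I) for Eq. 7 and Eq. 9 at several values of $m$. The random stand-in spectra, the seed, and the helper names are assumptions, so the printed numbers are not expected to reproduce Table I.

```python
import numpy as np

def ssd(K_rec, K_true):
    """Sum of squared deviations between reconstructed and actual spectra."""
    return float(np.sum((K_rec - K_true) ** 2))

def eq7(A, c_t):
    """Cross-correlation estimate (Eq. 7) for one target component."""
    dA, dc = A - A.mean(axis=0), c_t - c_t.mean()
    return dc @ dA / (dc @ dc)

def eq9(A, C_known):
    """Generalized cross-correlation estimate (Eq. 9)."""
    dA, dC = A - A.mean(axis=0), C_known - C_known.mean(axis=0)
    return np.linalg.lstsq(dC, dA, rcond=None)[0]

rng = np.random.default_rng(7)
n, n_known, N = 10, 5, 200
K_true = rng.random((n, N))                  # hypothetical stand-in spectra

def errors(m):
    """Total SSD over the known components for Eq. 7 and Eq. 9 with m random mixtures."""
    C = rng.random((m, n))
    A = C @ K_true
    e7 = sum(ssd(eq7(A, C[:, t]), K_true[t]) for t in range(n_known))
    e9 = ssd(eq9(A, C[:, :n_known]), K_true[:n_known])
    return e7, e9

for m in (25, 150, 10_000):
    print(m, errors(m))
```

In this setting the contamination terms of Eq. 13 shrink roughly in proportion to $1/m$, so both errors should fall sharply between 25 and 10,000 mixtures.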

Protein Spectra. Spectral reconstruction techniques can play an important part in revealing structural changes which result from the interactions between proteins. Although a detailed study of these effects is outside the scope of this paper, we can offer a modest demonstration.

The spectra of IgG and A1 were reconstructed from the spectra of 32 standardized mixtures consisting only of IgG and A1 dissolved in buffer, by solving Eq. 15 with a complete concentration matrix. Here our interest was to investigate the possibility of detecting structural changes in the proteins, and we did not want to introduce further errors by using an incomplete concentration matrix. In this context, it should be recalled that the solutions of Eqs. 9, 15, and 17 are equivalent when all of the component concentrations are specified.

The reconstructed spectra were deconvolved in the region of the amide II band. These spectra, which contain information about the IgG-A1 interactions, are compared to the deconvolved spectra of the corresponding pure proteins in Figs. 9 and 10. The so-called pure protein spectra were obtained in independent reconstructions from 8 standardized solutions containing only IgG in buffer, and from 9 standardized solutions containing only A1 in buffer. An effort was made to ensure that the concentrations of IgG and A1 in the single-component mixtures spanned the range of values represented in the binary mixtures. The idea was to isolate the effects of the IgG-A1 interactions from the IgG-IgG and A1-A1 interactions.

A comparison of the spectra in Fig. 9 indicates that there is an increase in the intensity of the underlying bands centered at 1520 and 1535 cm⁻¹ in the reconstructed spectrum of IgG in the mixture. There is also a change in the shape of the 1520-cm⁻¹ band, which is evident in the reconstructed spectrum of A1 in the mixture (Fig. 10).

It is tempting to attribute the spectral differences cited above to the interactions between the IgG and A1 in solution. Unfortunately, it is not clear whether these differences are physically meaningful, because the results of additional reconstructions, using subsets of the original mixture basis sets, were inconclusive. It is reasonable, however, to assume that the process of overdetermining the reconstructions has a favorable effect on minimizing random fluctuations. Nevertheless, we maintain that these results are, at the very least, illustrative of the potential which exists to use spectral reconstruction to interpret interaction-induced structural changes in proteins. Further investigations are in progress.
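
The assumption that overdetermination suppresses random fluctuations can be checked on idealized data. The following is a minimal sketch, not the analysis performed here: the two Gaussian bands, the noise level, the concentration range, and the function name `mean_rms_error` are all hypothetical, the noise is taken to be independent and Gaussian, and an ordinary least-squares fit through the origin with a complete concentration matrix stands in for the reconstruction.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1480.0, 1600.0, 121)
# Two hypothetical overlapping bands standing in for two component spectra
S = np.vstack([np.exp(-0.5 * ((x - 1520.0) / 12.0) ** 2),
               np.exp(-0.5 * ((x - 1545.0) / 12.0) ** 2)])

def mean_rms_error(n_mixtures, noise=0.01, trials=200):
    """Average RMS reconstruction error over many noisy two-component mixture sets."""
    errs = []
    for _ in range(trials):
        C = rng.uniform(0.1, 1.0, size=(n_mixtures, 2))   # fully specified concentrations
        A = C @ S + rng.normal(0.0, noise, size=(n_mixtures, x.size))
        S_hat, *_ = np.linalg.lstsq(C, A, rcond=None)     # least-squares component spectra
        errs.append(np.sqrt(np.mean((S_hat - S) ** 2)))
    return float(np.mean(errs))

for n in (4, 8, 32):
    print(f"{n:3d} mixtures: mean RMS error = {mean_rms_error(n):.5f}")
```

Under these assumptions the average error falls as mixtures are added. The sketch says nothing about systematic effects, such as baseline drift or genuine interaction-induced band changes, which averaging over more mixtures would not remove.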

CONCLUSIONS

The validation study performed with synthetic mixture spectra demonstrates that useful component spectrum reconstructions can be obtained from partially characterized mixtures. A mathematical analysis of the problem, however, indicates that accurate reconstructions are possible only if the unspecified components are uncorrelated. The generalized cross-correlation method, which was derived in this paper, facilitates this by allowing for input of the measured concentrations of the correlated components. A related method (Eq. 17) can be used in those cases where there is additional correlation between components as a result of the condition that the component concentrations sum to a constant value which is independent of the mixture.

The usefulness of component spectrum reconstructions for the study of protein interactions was also examined. Our results are only suggestive, but it is clear that the potential exists to use these techniques to detect the structural changes which result from the interactions between proteins in solution.

ACKNOWLEDGMENTS

One of us (M.R.N.) would like to thank Glenn Forney of the Center for Fire Research for helpful discussions on matters relevant to this paper. This work was partially supported by Grants RR-01367 and HL-38936 (K.C.).

1. W. H. Lawton and E. A. Sylvestre, Technometrics 13, 617 (1971).
2. J. L. Koenig and D. Kormos, Appl. Spectrosc. 33, 349 (1979).
3. P. C. Gillette, J. B. Lando, and J. L. Koenig, Anal. Chem. 55, 630 (1983).
4. J. Liu and J. L. Koenig, Appl. Spectrosc. 59, 2609 (1987).
5. D. E. Honigs, G. M. Hieftje, and T. Hirschfeld, Appl. Spectrosc. 38 (1984).
6. A. E. Mood, F. A. Graybill, and D. C. Boes, Introduction to the Theory of Statistics (McGraw-Hill, New York, 1974).
7. G. E. Forsythe, M. A. Malcolm, and C. B. Moler, Computer Methods for Mathematical Computations (Prentice-Hall, Englewood Cliffs, New Jersey, 1977).
8. M. R. Nyden, G. F. Forney, and K. Chittur, Appl. Spectrosc. 42, 588 (1988).
9. J. K. Kauppinen, D. J. Moffatt, H. H. Mantsch, and D. G. Cameron, Appl. Spectrosc. 35 (1981).