
Review

Received: 3 October 2008 Accepted: 6 October 2008 Published online in Wiley Interscience: 28 November 2008

(www.interscience.wiley.com) DOI 10.1002/sia.2989

Review of TOF-SIMS bioanalysis using mutual information

Satoka Aoyagi∗

The evaluation of proteins on biodevices, such as the distribution or orientation of immobilized proteins, is one of the most important issues for the development of sophisticated biodevices. In this review, a description is provided of the application of one of the most sensitive surface analysis methods, time-of-flight secondary ion mass spectrometry (TOF-SIMS), to protein evaluation and TOF-SIMS spectra analysis using mutual information. TOF-SIMS is useful for the evaluation of biodevice surfaces, because TOF-SIMS provides submicron-scale chemical mapping and information on the chemical structures of the upper surface part of a protein. In addition, TOF-SIMS requires no pretreatment of samples, such as labeling with a fluorescent probe or coating with metallic thin films. Data analysis methods are, however, required to interpret the protein sample TOF-SIMS data, because the fragment ions from proteins are so complicated that it is difficult to predict them. In order to identify hidden important peaks related to protein samples out of the hundreds of peaks in TOF-SIMS spectra, mutual information has been used. Protein distribution on biodevices and the orientation of an immobilized protein obtained with this method are described in this review. Copyright © 2008 John Wiley & Sons, Ltd.

Keywords: TOF-SIMS; mutual information; protein; distribution; imaging; orientation

Introduction

This article reviews the application of information theory to biodevice analysis by means of time-of-flight secondary ion mass spectrometry (TOF-SIMS). The evaluation of proteins on biodevice surfaces is crucial for the development of high-performance biodevices such as artificial organs and biosensors. TOF-SIMS is one of the most useful techniques for such evaluation of biodevice surfaces, because TOF-SIMS provides chemical information on the uppermost surface of a protein monolayer and the distribution of proteins on materials at a submicron scale. Although TOF-SIMS is one of the most powerful techniques for analyzing biodevices, it has not come into widespread use in the field of biology because of its complicated spectrum interpretation. Many analytic techniques have been used for the characterization of TOF-SIMS spectra[1–5] with fragment ions from organic molecules or biomolecules, such as polymers and proteins, because TOF-SIMS is not suitable for the ionization of intact macromolecules. Moreover, it is difficult to discriminate among proteins based on a simple comparison of the fragment ions, because every protein consists of amino acids from the same basic set of 20 amino acids. Certain types of polymers also have this same problem. Therefore, TOF-SIMS spectra of protein samples contain a number of unpredictable and complicated peaks. Multivariate data such as TOF-SIMS spectra can be analyzed in a variety of ways. In this article, one of the new approaches to analyzing the TOF-SIMS spectra of protein samples is described.

Mutual information,[6–8] defined by information theory, has been applied to analyzing TOF-SIMS spectra to characterize the specificity of every peak in the TOF-SIMS spectra of a sample by comparing it with another sample, such as a reference sample. Mutual information has been applied to chromatography for selecting significant peaks.[8] Because TOF-SIMS spectra, especially of protein samples, contain complicated peaks, they require a critical peak selection method to obtain images of particular proteins and also to evaluate their orientation. Secondary ion images of a particular protein[9–12] can be obtained and the orientation of immobilized proteins[13–16] can be evaluated using the specific peaks selected with the mutual information. In this article, the concept of mutual information and the spectra analysis method using mutual information[8,9] are explained, and certain results using this method are presented.

TOF-SIMS Spectra Analysis

TOF-SIMS spectra analysis with mutual information

Mutual information is obtained by subtracting the a posteriori entropy (uncertainty) from the a priori entropy (uncertainty). In this formulation, the a posteriori entropy is the information entropy that remains after an event has been observed.

The calculation steps are as follows. Assume that the number of TOF-SIMS spectra is N and that they are classified into two categories, the sample and the reference sample. The number of spectra belonging to the sample is n(a1) and that belonging to the reference sample is n(a2). In terms of sample categories, the information entropy S(A) is defined by Eqn (1):

$$S(A) = -\sum_{i=1}^{2} p(a_i) \log_2 p(a_i) \qquad (1)$$

where the probability p(ai) = n(ai)/N (i = 1, 2). S(A) is the amount of information needed to determine the a priori category of a spectrum. With a certain peak threshold V, the set of spectra is split into two subsets, B1 and B2. Spectra in which the peak intensity is greater than V are classified as B1, and their number is n(b1); spectra in which it is less than V are classified as B2, and their number is n(b2). The information entropy of the split induced by V, S(B), is therefore defined by Eqn (2):

$$S(B) = -\sum_{j=1}^{2} p(b_j) \log_2 p(b_j) \qquad (2)$$

where the probability p(bj) = n(bj)/N (j = 1, 2). Mutual information I(A; B) is defined by Eqn (3):

$$I(A; B) = S(A) - S(A \mid B) \qquad (3)$$

$$S(A \mid B) = -\sum_{j=1}^{2} \sum_{i=1}^{2} p(b_j)\, p(a_i \mid b_j) \log_2 p(a_i \mid b_j) \qquad (4)$$

where the probability p(ai|bj) = n(ai|bj)/n(bj), S(A) is the a priori uncertainty and S(A|B) is the a posteriori uncertainty. The term n(ai|bj) is the number of spectra belonging to sample category i among the spectra in subset Bj (for j = 1, those whose peak intensity is greater than V). The best value of V is the one that provides the largest I(A; B).[7,8] When I(A; B) = S(A), every spectrum can be assigned to the correct category from its peak intensity.

∗ Correspondence to: Satoka Aoyagi, Faculty of Life and Environmental Science, Shimane University, 1060 Nishikawatsu-cho, Matsue-shi, Shimane 690-8504, Japan. E-mail: [email protected]

Faculty of Life and Environmental Science, Shimane University, 1060 Nishikawatsu-cho, Matsue-shi, Shimane 690-8504, Japan

Surf. Interface Anal. 2009, 41, 136–142 Copyright © 2008 John Wiley & Sons, Ltd.

For example, suppose there are TOF-SIMS spectra of two samples, 1 and 2, and we compare the intensities of a certain peak, as shown in Fig. 1. When the spectra are completely classified by the threshold V, for example, when the intensities of the peak in all of the TOF-SIMS spectra of sample 1 are higher than those of sample 2, the a posteriori entropy is zero and the mutual information equals the a priori entropy S(A), which is 1 when both samples contribute the same number of spectra. In other words, when the mutual information equals the a priori entropy, this peak is the most important peak for classifying the samples. On the other hand, when the intensities of the peak cannot be classified with any threshold V at all, for example, when the intensities of the peak in most of the TOF-SIMS spectra of sample 1 are almost as high as those of sample 2, the a posteriori entropy equals the a priori entropy. Here the a posteriori entropy is the information entropy after the evaluation of the peak intensity with threshold V. In this case, the mutual information is zero, and nothing is clarified by the evaluation of the peak intensity with V.
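The definitions in Eqns (1)–(4) can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function names are hypothetical, and scanning the midpoints between neighboring intensities is only one assumed way of carrying out the search for the best threshold V.

```python
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a list of category counts (0 * log2(0) is taken as 0)."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def mutual_information(intensities, labels, threshold):
    """I(A;B) = S(A) - S(A|B) for the split 'peak intensity > threshold'."""
    categories = sorted(set(labels))
    n_total = len(labels)
    s_a = entropy([labels.count(c) for c in categories])
    s_a_given_b = 0.0
    for in_b1 in (True, False):  # B1: above the threshold, B2: the rest
        subset = [lab for x, lab in zip(intensities, labels) if (x > threshold) == in_b1]
        if subset:
            weight = len(subset) / n_total  # p(b_j)
            s_a_given_b += weight * entropy([subset.count(c) for c in categories])
    return s_a - s_a_given_b

def best_threshold(intensities, labels):
    """Return (V, I) with the largest I; needs at least two distinct intensities."""
    values = sorted(set(intensities))
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    return max(((v, mutual_information(intensities, labels, v)) for v in candidates),
               key=lambda pair: pair[1])

# Hypothetical intensities: three spectra of sample 1 followed by three of sample 2
print(best_threshold([10, 12, 11, 3, 4, 2], [1, 1, 1, 2, 2, 2]))  # (7.0, 1.0)
```

With equal numbers of spectra and complete separation, the sketch returns a mutual information of 1, and for two samples with identical intensity distributions it returns 0, matching the two limiting cases described above.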

When a peak of a secondary ion is specific to sample 1, its intensity in sample 1 is higher than that in sample 2. Considering the intensity fluctuation in the measurements and background, the intensity in sample 1 may not always be higher than in sample 2, even though it is indeed often higher. Therefore, identification is not easy by way of a simple comparison. Furthermore, it is not easy to check every peak in all spectra. The mutual information values indicate the relative importance of a peak. Peaks having low mutual information values can be omitted because they are considered not to be specific to a particular sample. Therefore, only peaks having high values of mutual information should be checked, based on their peak properties such as chemical structures, noise and distribution, to see if they are appropriate and useful for further analysis.

Table 1. Intensities of peaks

| Spectrum | m/z 46 | m/z 129 | m/z 130 | m/z 131 | m/z 132 | m/z 133 | m/z 134 | m/z 135 |
|---|---|---|---|---|---|---|---|---|
| Sample 1 | 24 747 | 3621 | 4073 | 4752 | 2329 | 10 053 | 2424 | 3940 |
| Sample 1 | 14 056 | 3309 | 5616 | 5417 | 2512 | 11 961 | 2825 | 4018 |
| Sample 1 | 21 176 | 3431 | 2488 | 4664 | 2038 | 10 761 | 2456 | 3693 |
| Sample 2 | 12 349 | 2516 | 5998 | 3963 | 2028 | 5826 | 1917 | 2421 |
| Sample 2 | 25 371 | 3039 | 6283 | 4151 | 2125 | 5688 | 2124 | 2814 |
| Sample 2 | 18 076 | 2944 | 7338 | 4259 | 2144 | 4874 | 2014 | 2243 |
| Sample 2 | 20 468 | 2837 | 6715 | 3962 | 2015 | 4488 | 1935 | 2300 |
| Sample 2 | 22 006 | 2444 | 5467 | 3449 | 1630 | 3921 | 1602 | 2264 |

As an example, the mutual information is calculated for sample 1. Assume there are three spectra of sample 1 and five spectra of sample 2. Table 1 shows the intensities of several peaks in the TOF-SIMS spectra. The mutual information of the peak at m/z 133 is calculated. First, S(A) is calculated with these sample numbers as follows:

The number of sample 1 data: n(a1) = 3

The number of sample 2 data: n(a2) = 5

The number of all data: N = 8

Therefore,

$$p(a_1) = n(a_1)/N = 3/8, \qquad p(a_2) = n(a_2)/N = 5/8 \qquad (5)$$

Therefore,

$$S(A) = -\sum_{i} p(a_i) \log_2 p(a_i) = -\{p(a_1) \log_2 p(a_1) + p(a_2) \log_2 p(a_2)\} = -\{(3/8) \log_2 (3/8) + (5/8) \log_2 (5/8)\} = 0.95 \qquad (6)$$

Next, the most appropriate threshold V was determined by trial and error to obtain the best mutual information. For example, the best value of the mutual information, 0.95, is obtained when V is 6000. S(A|B) is calculated:

The number of data with a peak intensity at m/z 133 higher than 6000: n(b1) = 3

The number of data with a peak intensity at m/z 133 lower than 6000: n(b2) = 5

[Figure 1: two panels, labeled "Mutual information → 1" and "Mutual information → 0", each showing the peak intensities of sample 1 and sample 2 relative to a threshold V.]

Figure 1. Classification concept based on mutual information.


The number of sample 1 data with a peak intensity at m/z 133 higher than 6000: n(a1|b1) = 3

The number of sample 1 data with a peak intensity at m/z 133 lower than 6000: n(a1|b2) = 0

The number of sample 2 data with a peak intensity at m/z 133 higher than 6000: n(a2|b1) = 0

The number of sample 2 data with a peak intensity at m/z 133 lower than 6000: n(a2|b2) = 5

Therefore,

$$p(a_1 \mid b_1) = n(a_1 \mid b_1)/n(b_1) = 3/3 = 1$$

$$p(a_1 \mid b_2) = n(a_1 \mid b_2)/n(b_2) = 0/5 = 0$$

$$p(a_2 \mid b_1) = n(a_2 \mid b_1)/n(b_1) = 0/3 = 0$$

$$p(a_2 \mid b_2) = n(a_2 \mid b_2)/n(b_2) = 5/5 = 1 \qquad (7)$$

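Therefore, with p(b1) = n(b1)/N = 3/8, p(b2) = n(b2)/N = 5/8 and the convention that 0 log2 0 = 0,

$$\begin{aligned} S(A \mid B) &= -\sum_{j} \sum_{i} p(b_j)\, p(a_i \mid b_j) \log_2 p(a_i \mid b_j) \\ &= -\{p(b_1)\, p(a_1 \mid b_1) \log_2 p(a_1 \mid b_1) + p(b_2)\, p(a_1 \mid b_2) \log_2 p(a_1 \mid b_2) \\ &\qquad + p(b_1)\, p(a_2 \mid b_1) \log_2 p(a_2 \mid b_1) + p(b_2)\, p(a_2 \mid b_2) \log_2 p(a_2 \mid b_2)\} = 0 \end{aligned} \qquad (8)$$

Finally, mutual information I(A; B) is

$$I(A; B) = S(A) - S(A \mid B) = 0.95 \qquad (9)$$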
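The worked example for the peak at m/z 133 can be reproduced with a short script. The intensities are those of Table 1 and V = 6000 is the threshold quoted above; the variable and helper names are illustrative choices, not part of the original method description.

```python
from math import log2

# Intensities of the peak at m/z 133 taken from Table 1
sample_1 = [10053, 11961, 10761]
sample_2 = [5826, 5688, 4874, 4488, 3921]
V = 6000  # threshold quoted in the text

def plogp(p):
    """p * log2(p), with the usual convention that the term vanishes at p = 0."""
    return p * log2(p) if p > 0 else 0.0

N = len(sample_1) + len(sample_2)

# n[j][i]: number of spectra of sample i+1 that fall into subset B(j+1)
n = [
    [sum(x > V for x in sample_1), sum(x > V for x in sample_2)],    # B1: above V
    [sum(x <= V for x in sample_1), sum(x <= V for x in sample_2)],  # B2: not above V
]

# A priori entropy, Eqn (1)
s_a = -sum(plogp(len(s) / N) for s in (sample_1, sample_2))

# A posteriori entropy, Eqn (4)
s_a_given_b = 0.0
for row in n:
    n_b = sum(row)
    if n_b:
        s_a_given_b -= (n_b / N) * sum(plogp(k / n_b) for k in row)

mi = s_a - s_a_given_b  # Eqn (3)
print(f"S(A) = {s_a:.3f}, S(A|B) = {s_a_given_b:.3f}, I(A;B) = {mi:.3f}")
# S(A) = 0.954, S(A|B) = 0.000, I(A;B) = 0.954
```

The count table `n` comes out as [[3, 0], [0, 5]], which corresponds to the conditional probabilities of Eqn (7), and the result agrees with the value of 0.95 in Eqn (9).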

S(A) is 1 when the numbers of spectra of both samples are equal.

When the mutual information of the peaks at m/z 46 and 134 is calculated in the same way, focusing on sample 1, the best values of the mutual information are approximately 0.5 and 0.95, respectively. Although a low value of mutual information means a particular peak is not specific to a particular sample, a high value of mutual information does not always ensure the specificity of a particular peak to a particular sample. For example, the peak at m/z 134 cannot be selected as a specific peak in sample 1 because the difference in the intensities of sample 1 and sample 2 is not large enough to allow classification, even though the intensities of the sample 1 data are higher than those of the sample 2 data. Therefore, every peak selected as a specific peak using mutual information should be checked to determine whether it is really appropriate. On the other hand, peaks having low mutual information values can be omitted from further consideration. This method reduces the number of target peaks in TOF-SIMS spectra. Usually several to several dozen peaks are retained.
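The follow-up check on peaks with high mutual information can be illustrated with the m/z 133 and 134 columns of Table 1. The criterion used here, the ratio of the lowest sample 1 intensity to the highest sample 2 intensity, and the cutoff of 1.5 are assumptions made for illustration only; the article does not specify a numerical criterion.

```python
# Peak intensities from Table 1 (three spectra of sample 1, five of sample 2)
peaks = {
    133: ([10053, 11961, 10761], [5826, 5688, 4874, 4488, 3921]),
    134: ([2424, 2825, 2456], [1917, 2124, 2014, 1935, 1602]),
}

MIN_RATIO = 1.5  # illustrative cutoff, not a value from the article

def separation_ratio(sample_1, sample_2):
    """Lowest sample 1 intensity divided by the highest sample 2 intensity."""
    return min(sample_1) / max(sample_2)

for mz, (s1, s2) in peaks.items():
    ratio = separation_ratio(s1, s2)
    verdict = "keep" if ratio >= MIN_RATIO else "reject"
    print(f"m/z {mz}: ratio = {ratio:.2f} -> {verdict}")
# m/z 133: ratio = 1.73 -> keep
# m/z 134: ratio = 1.14 -> reject
```

Both peaks separate the two samples completely and therefore share the same mutual information, but the margin at m/z 134 is small compared with that at m/z 133, which is the kind of difference the manual check is meant to catch.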

Peak identification and protein matching

Fragment ion peaks from ensembles of amino acids are identified by searching every combination of amino acids based on the following assumptions: (i) recombination is not considered, (ii) double bonds are not cut completely, (iii) hydrogen addition and hydrogen desorption are considered flexibly (i.e. the numbers of carbon, oxygen, nitrogen and sulfur atoms are considered) and (iv) sulfur–sulfur bonding is considered present when there are more than two sulfur atoms.

Possible fragments from every part of a protein molecule were considered using the structures of the 20 amino acid residues composing a protein. Table 2 shows fragments from the side chains of the residues, and Table 3 shows fragments from the principal chain or principal chains. If it is assumed that a fragment ion is generated from a part containing two neighboring residues, its formula can be one of the combinations of the side-chain fragments of the two residues in Table 2 and the fragments of the two principal chains in Table 3. In addition, when the part generating the fragment ion contains a proline residue, the possible fragments in Table 4 should be considered.
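The search over fragment combinations for two neighboring residues can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the article: nominal integer masses are used, exactly one principal-chain fragment is combined with at most one side-chain fragment per residue, and the hydrogen count is bounded by a crude, hypothetical limit (`max_h_per_heavy`). Only two residues from Table 2 are included for brevity.

```python
import re
from itertools import product

NOMINAL_MASS = {"C": 12, "N": 14, "O": 16, "S": 32}

# Heavy-atom side-chain fragments copied from Table 2 (two residues only, for brevity)
SIDE_CHAIN = {"Ala": ["C"], "Ser": ["CO", "C"]}
# Heavy-atom fragments of a two-residue principal chain, copied from Table 3
MAIN_CHAIN_2 = ["C5N3O3", "C5N2O3", "C4N3O2", "C4N2O2", "C4NO2", "C3N2O", "C3NO"]

def parse(formula):
    """Turn 'C4N2O2' into {'C': 4, 'N': 2, 'O': 2}; an empty string gives {}."""
    return {el: int(num or 1) for el, num in re.findall(r"([CNOS])(\d*)", formula)}

def heavy_mass(atoms):
    """Nominal mass of the non-hydrogen atoms."""
    return sum(NOMINAL_MASS[el] * count for el, count in atoms.items())

def candidates(res1, res2, target_mz, max_h_per_heavy=3):
    """List (heavy-atom formula, H count) pairs whose nominal mass equals target_mz."""
    hits = set()
    # "" means that the residue contributes no side-chain atoms to the fragment
    for side1, side2, main in product(SIDE_CHAIN[res1] + [""],
                                      SIDE_CHAIN[res2] + [""],
                                      MAIN_CHAIN_2):
        atoms = {}
        for part in (side1, side2, main):
            for el, count in parse(part).items():
                atoms[el] = atoms.get(el, 0) + count
        n_h = target_mz - heavy_mass(atoms)  # hydrogens needed to reach the target
        if 0 <= n_h <= max_h_per_heavy * sum(atoms.values()):
            formula = "".join(f"{el}{atoms[el] if atoms[el] > 1 else ''}"
                              for el in "CNOS" if el in atoms)
            hits.add((formula, n_h))
    return sorted(hits)

print(candidates("Ala", "Ser", 70))  # [('C3NO', 4)]
```

Every candidate returned this way would still have to be judged for chemical plausibility by hand; the hydrogen bound in particular is deliberately loose and admits formulas that cannot exist.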

Evaluation of Proteins on Biodevices

Imaging of protein distribution

TOF-SIMS is one of the most useful techniques for evaluating the distribution and immobilization processes of proteins on biodevice surfaces[12,17–19] because it provides chemical mapping at high spatial resolution. However, it is difficult to obtain the map of one specific protein out of a collection of other proteins, especially on complicated materials containing nitrogen, because fragment ions from protein samples are very complicated and not predictable at present. In order to obtain chemical images of proteins, the specific peaks related to each protein must be selected out of all the peaks in the TOF-SIMS spectra. Because there are so many possible combinations of amino acid residues and so many possible structures of fragment ions from amino acid combinations, it is very difficult to find a specific secondary ion from a particular protein. In addition, some of the fragment ions from substrate materials have the appearance of complicated, protein-like fragment ions, which makes it more difficult to select appropriate peaks of the fragment ions related to the protein for further analysis of the TOF-SIMS data.

Table 2. Possible fragments generated from amino acid residue side chains

| Amino acid | Code | Formula of side chain | Possible fragments (except for hydrogen) |
|---|---|---|---|
| Gly | G | H | – |
| Ala | A | CH3 | C |
| Val | V | C3H7 | C3, C2, C |
| Leu | L | C4H9 | C4, C3, C2, C |
| Ile | I | C4H9 | C4, C3, C2, C |
| Met | M | C3H7S | C3S, C2S, C2, C |
| Pro | P | C3H6 | C3, C2, C |
| Phe | F | C7H8 | C7, C |
| Trp | W | C9H7N | C9N, C |
| Ser | S | CH3O | CO, C |
| Thr | T | C2H5O | C2O, C2, CO, C |
| Asn | N | C2H4NO | C2NO, C2O, C |
| Gln | Q | C3H6NO | C3NO, C3O, C2, C |
| Tyr | Y | C7H8O | C7O, C7, C |
| Cys | C | CH3S | CS, C |
| Lys | K | C4H11N | C4N, C4, C3, C2, C |
| Arg | R | C4H11N3 | C4N3, C4N2, C3N, C3, C2, C |
| His | H | C4H5N2 | C4N2, C |
| Asp | D | C2H2O2 | C2O2, C2O, C |
| Glu | E | C3H4O2 | C3O2, C3O, C2, C |

Table 3. Possible fragments generated from principal chains of amino acid residues

| Number of residues | Formula of principal chain | Possible fragments from principal chain (except for hydrogen) |
|---|---|---|
| 1 | C2H2NO | C3N2O2, C3NO2, C2N2O, C2NO, C2O, CN, C |
| 2 | C4H4N2O2 | C5N3O3, C5N2O3, C4N3O2, C4N2O2, C4NO2, C3N2O, C3NO |
| 3 | C6H6N3O3 | C7N4O4, C7N3O4, C6N4O3, C6N3O3, C6N2O3, C5N3O2, C5N2O2 |
| 4 | C8H8N4O4 | C9N5O5, C9N4O5, C8N5O4, C8N4O4, C8N3O4, C7N4O3, C7N3O3 |
| 5 | C10H10N5O5 | C11N6O6, C11N5O6, C10N6O5, C10N5O5, C10N4O5, C9N5O4, C9N4O4 |

Table 4. Possible fragments generated from the principal chains containing proline

| Number of residues | Possible fragments from principal chain (except for hydrogen) |
|---|---|
| 1 | N |
| 2 | C2N2O, C2NO |
| 3 | C4N3O2, C4N2O2, C5N3O3 |
| 4 | C6N4O3, C6N3O3, C7N4O4 |
| 5 | C8N5O4, C8N4O4, C9N5O5 |

When TOF-SIMS spectra analysis using mutual information is used to study protein samples, it indicates several important peaks that can be used to obtain secondary ion images of the particular protein. For example, secondary ion images related to the protein are useful for evaluating the protein adsorption that takes place on hollow fiber dialysis membranes with nanosize pores, which are used for the treatment of patients with renal failure. The protein adsorption onto the dialysis membranes has been evaluated by TOF-SIMS imaging[9,10] in an effort to develop higher performance dialysis techniques. Solutions of a model protein, bovine serum albumin (BSA), were fed into the hollow of three commercially available dialysis membranes made of polysulfone (APS), polyether polymer alloy (FLX) and poly(acrylonitrile) (PAN), respectively, for 7 h, and the membranes were then rinsed with distilled water and dried in a desiccator before TOF-SIMS measurement. In addition, dried native membranes were prepared as reference samples. The values of mutual information of SIMS spectra of each sample were calculated to select peaks specific to BSA-adsorbed samples. The peaks with higher values of mutual information were checked to determine whether they were appropriate for chemical mapping, for example, whether their intensities were sufficient for secondary ion imaging. Because the dialysis membranes contain complicated additives such as polyvinylpyrrolidone (PVP) to obtain high biocompatibility, they produce complicated, protein-like fragment ions. In order to select peaks appropriate for protein imaging, peaks specific only to the target protein must be found from among all the peaks in the spectra. The spectra analysis with mutual information is helpful for such a purpose. For example, according to the high mutual information values, the peaks of CHS, C2H6NO, C4H10N and C3H6NO2 at m/z = 45, 60, 72 and 88, respectively, were selected as peaks specific to BSA adsorbed on the APS membrane, and the C4H2, C5H8N, C5H10O and C6H10NO peaks at m/z = 50, 82, 86 and 112, respectively, were selected as peaks specific to the APS membrane.[9,10] Protein images on the membranes were obtained by integrating these specific secondary ion images.

Figure 2. TOF-SIMS image of the protein adsorbed onto hollow fiber membranes. Panels: CCD image of the hollow-fiber membrane; native membranes (APS, FLX, PAN); BSA-treated membranes (APS (BSA), FLX (BSA), PAN (BSA)).

Figure 3. Basic concept of orientation evaluation with TOF-SIMS. Labels: substrate (biosensor, particle, electrode, etc.); surface part (2 nm); secondary ion generation; mass spectrum (m/z); specific fragment ion; amino acid residues; amino acid sequence.

Figure 2 shows typical secondary ion images of BSA adsorption on membranes. Because these samples are influenced by topography, which may alter the images of BSA distribution, all images were divided by their total ion images to reduce the topographical effect. The peaks related to BSA were selected by comparing the TOF-SIMS spectra of BSA-adsorbed membranes and those of native membranes using mutual information. According to the native membrane images (upper row) and the BSA-treated membrane images (lower row), BSA is adsorbed on the inside and outside surfaces of the APS membrane, on the inside and outside surfaces as well as on the cross-section of the FLX membrane, and on the inside surface of the PAN membrane.
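The image processing described for Figure 2, integrating the secondary ion images of the selected peaks and dividing by the total ion image, can be sketched as below. The function name and the tiny 2 × 2 count images are hypothetical, and setting pixels without counts to zero is an assumption made here to avoid division by zero.

```python
def integrate_and_normalize(ion_images, total_image):
    """Sum the selected ion images pixel by pixel and divide by the total ion image."""
    rows, cols = len(total_image), len(total_image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            summed = sum(image[r][c] for image in ion_images)
            total = total_image[r][c]
            # Pixels without any counts carry no information; leave them at zero
            result[r][c] = summed / total if total > 0 else 0.0
    return result

# Hypothetical 2 x 2 count images for two protein-specific ions and the total ion image
image_a = [[2, 0], [4, 1]]
image_b = [[2, 0], [4, 1]]
total = [[40, 0], [80, 10]]
print(integrate_and_normalize([image_a, image_b], total))
# [[0.1, 0.0], [0.1, 0.2]]
```

In this toy example the lower-left pixel has twice the raw counts of the upper-left pixel, but after normalization both have the same value, which is the intended suppression of purely topographic intensity differences.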

Distributions of two or three proteins were also obtained using TOF-SIMS with the mutual information analysis.[11–13] For example, the distribution patterns of two proteins on an optical biosensor were observed by TOF-SIMS imaging.[11,12] Protein A was immobilized on the sensor surface to detect the sample protein, immunoglobulin G (IgG). In order to confirm immobilization of protein A and also to evaluate the reaction between the immobilized protein A and IgG in a sample solution, both protein A immobilized on the glass plate and IgG bound to the immobilized protein A were measured with TOF-SIMS, and then their spectra were analyzed by means of principal component analysis (PCA) and mutual information. Principal component (PC) score plots indicated that the fragment ions from protein A and those from IgG could be distinguished with the TOF-SIMS spectra.[12] According to the PC scores and loadings, although peaks related to protein A were not found by searching through the loadings of the PCs, peaks related to the IgG sample were found at m/z = 43, 71, 82, 84, 102, 136, 159 and 170. In addition, some of these peaks, at m/z = 84, 102, 159 and 170, have high values of mutual information. Unfortunately, they were not appropriate for imaging because their intensities in the IgG sample and the protein A sample did not differ sufficiently.

Peaks thought to be specific to each protein were selected using the mutual information. Candidate peaks specific to the IgG sample were at m/z = 59, 60, 70, 72, 74, 86, 87, 100, 110, 112, 130, 136, 159, 166, 170 and 184, and those specific to the protein A sample were at m/z = 40, 44, 45, 46, 47, 48, 127 and 182. The mutual information values of these peaks are more than 0.8. After checking whether they are really specific to the proteins, by comparing their intensities in the protein A sample to those in the IgG sample and by evaluating the secondary ion images related to each protein, peaks specific to protein A and IgG were determined to be at m/z = 182 and 184, respectively.[12] Finally, both protein distributions on the biosensor chip were obtained with the peaks selected with this analysis method of the TOF-SIMS spectra, which uses mutual information to reduce the number of unimportant peaks.

In protein TOF-SIMS spectra, the relationship among peaks is usually hidden, because several peaks of fragment ions generated from different parts of a protein often have very similar mass values, or the same formula, which makes attribution of a particular peak uncertain. Because complicated fragmentation means that a peak in TOF-SIMS spectra from protein samples can be generated by several different parts of a protein, the results obtained with PCA or other commonly used multivariate techniques may be affected. The peaks selected using PCA may not always include all the important peaks specific to a particular sample. Therefore, sophisticated analysis methods are sometimes not successful for interpreting the TOF-SIMS spectra from proteins.

On the other hand, information theory, which has been applied to reduce the influence of noise in information technology, can help simplify the TOF-SIMS spectra from protein samples by singling out the unimportant peaks. Peaks that behave almost the same in different samples can be found based on the mutual information. In other words, peaks that behave differently in the different samples can be identified using mutual information. When TOF-SIMS devices having much higher mass resolution or MS/MS TOF-SIMS are released in the near future, TOF-SIMS spectra from proteins will be analyzed with sophisticated statistical techniques. Currently, comparatively simple comparison methods, such as the mutual information method, are useful to identify the important hidden peaks. Moreover, the mutual information analysis method can serve as the first step of TOF-SIMS data analysis by reducing the amount of less important information. The data treated with the mutual information analysis method can then be passed on to other multivariate analysis techniques.


Investigation of conformation and orientation of proteins on substrates

The orientation and three-dimensional structure of immobilized proteins on biodevices are very important to assure their high performance. TOF-SIMS has already been used in the steric analysis of proteins, such as the evaluation of orientation and protein conformation.[14–16,20–25] Because the sampling depth of TOF-SIMS in the static mode is less than 2 nm,[26] a thickness much less than that of most proteins, TOF-SIMS is able to provide the chemical structure of the surface side of an immobilized protein molecule, which indicates the orientation of the immobilized protein. In other words, the orientation of immobilized proteins can be evaluated based on the determination of a partial structure, representing ensembles of amino acids, on the surface side. When peaks generated from these amino acid ensembles are identified, the surface side can be determined. The important peaks indicating the surface side are specific to a specific target sample when reference samples are appropriately prepared. Using the mutual information analysis, the important peaks specific to the target protein can be selected by a comparison with TOF-SIMS spectra of the reference samples.

Lysozyme,[16,20] cytochrome b5[15] and BSA[14] have been investigated with regard to their orientation by means of TOF-SIMS with mutual information analysis. Specific peaks related to an oriented immobilized protein were obtained by a comparison with reference samples such as nonoriented immobilized proteins, differently oriented proteins and different protein immobilized samples. Because proteins in general produce similar fragment ions, specific fragment ions are not always found in strong peaks of the spectra. Every peak in the spectra should be checked in comparison with the reference samples to find the relevant specific peaks.

For example, in the case of egg white lysozyme,[16] the following peaks were selected using mutual information analysis: the main peaks specific to N-lysozyme, immobilized at epsilon amine groups, are at m/z = 133 and 281, and the peaks specific to C-lysozyme, immobilized at carboxyl groups, are at m/z = 175, 234 and 282.[16] When the data of the lysozyme samples and the reference samples (enkephalin-immobilized samples and BSA-immobilized samples) were analyzed with PCA using a peak set of reported fragment ions related to the amino acids composing proteins,[27–29] the results suggested that candidate peaks related to C-lysozyme are at m/z = 70, 81, 82, 83, 84, 86, 88, 110 and 159, and those related to N-lysozyme are at m/z = 43, 44, 60, 68, 69, 72, 73 and 74. Some of these peaks, at m/z = 70, 84, 86, 88, 110 and 159 for C-lysozyme and at m/z = 43, 73 and 74 for N-lysozyme, have high mutual information values, though they were omitted in the checking process after the calculation of mutual information. The peaks selected with PCA are not helpful for determining the surface or near-surface sides of the immobilized protein because they are mainly generated from one residue existing on almost every side of the protein. Moreover, the intensities of these peaks do not differ much between the two lysozyme sample spectra.

After the specific peaks are found, their possible chemical formulas are searched by considering every possible fragment from the amino acid residues composing a protein, or from an ensemble of the residues. Then the amino acid sequence of the protein is searched for residues able to generate fragment ions having the indicated chemical formulas. Because the side containing most of these residues is considered to be the surface side of the protein, every side of the three-dimensional protein structure is checked. Finally, the side containing most of these residues is found, which indicates the orientation of the protein. Figure 3 shows the basic concept of these steps.
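The last step, checking every side of the three-dimensional structure for the residues that can generate the identified fragments, can be illustrated with a deliberately crude sketch. Everything in it is an assumption made for illustration: residues are reduced to single hypothetical coordinates in nm, only six axis-aligned viewing directions are tested, and a residue counts as lying on the surface side when it is within the roughly 2 nm sampling depth of the outermost residue in that direction.

```python
DIRECTIONS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def count_on_surface(residues, candidates, direction, depth_nm=2.0):
    """Count candidate residues within depth_nm of the outermost residue along direction."""
    heights = {idx: sum(c * d for c, d in zip(xyz, direction))
               for idx, xyz in residues.items()}
    top = max(heights.values())
    return sum(1 for idx in candidates if heights[idx] >= top - depth_nm)

def most_likely_surface(residues, candidates, depth_nm=2.0):
    """Return the direction exposing the most candidate residues, and all counts."""
    counts = {name: count_on_surface(residues, candidates, vec, depth_nm)
              for name, vec in DIRECTIONS.items()}
    return max(counts, key=counts.get), counts

# Hypothetical residue positions (nm); residues 1 and 2 are the ones able to
# generate the identified fragment ions
residues = {
    1: (0.3, -0.2, 2.4), 2: (-0.4, 0.3, 2.1),
    3: (2.5, 0.0, 0.0), 4: (-2.5, 0.0, 0.0),
    5: (0.0, 2.5, 0.0), 6: (0.0, -2.5, 0.0),
    7: (0.0, 0.0, -2.5),
}
best, counts = most_likely_surface(residues, candidates=[1, 2])
print(best, counts)
```

For these made-up coordinates only the +z side exposes both candidate residues. A real evaluation would work on the full atomic structure and inspect many more orientations than the six used here.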

In order to evaluate the results obtained with the mutual information method, the orientational states of the protein samples were controlled, and the TOF-SIMS spectra of the samples were analyzed. For example, egg white lysozyme was immobilized on an indium tin oxide (ITO)-coated glass slide activated by glutaraldehyde at an epsilon amine group of lysine.[16] Because glutaraldehyde mainly reacts with lysine, which has an epsilon amine group,[30] the underside of the immobilized protein should have many lysine residues. The egg white lysozyme contains six lysine residues in one molecule. As a result, Fig. 4 shows the indicated surface side of this lysozyme[16] and Fig. 5 shows the backside of the indicated surface side. Because the backside includes four lysine residues (13, 96, 97 and 116) out of six lysine residues, this lysozyme is mainly immobilized at this backside. Therefore, the orientation indicated with the TOF-SIMS analysis results is rational and in line with the expected orientation based on the 3D structure obtained with X-ray analysis.

Figure 4. The indicated surface side of the egg white lysozyme immobilized at lysine residues.

Figure 5. The backside of the indicated surface side of the egg white lysozyme.

The actual 3D structures of protein samples may be different from their reported structures obtained with X-ray analysis or nuclear magnetic resonance (NMR), because the structures change depending on the conditions. However, their main structures are preserved, and the structural changes are minor and partial, and can be detected with this TOF-SIMS analysis method. In the near future, this TOF-SIMS analysis method is expected to be able to indicate not only the orientation but also both the real protein structures and the structural changes that depend on sample conditions.

Summary

SIMS spectra analysis techniques are necessary to interpret complicated and nonpredictable fragment ions from macromolecules, even though SIMS with cluster primary ion sources enhances the intensity of high-mass secondary ions. Spectra analysis with mutual information is one of the most useful techniques for identifying important, sometimes hidden, fragment ion peaks related to a particular protein sample to obtain either secondary ion images of the protein or chemical structures on the surface side of the protein. At this point, further improvement is required in the production of larger and greater numbers of fragment ions, and in higher mass resolution or MS/MS application to SIMS for identifying high-mass fragment ions. Both data analysis and ion analysis techniques, including the enhancement of ion production, are expected to be considerably improved in the near term, and hence will enable a more precise measurement of various proteins on biodevices. In the near future, conformation changes in proteins after reactions or environmental changes such as pH, and the orientation of the binding sites of immobilized proteins, will be measurable with TOF-SIMS.

References

[1] M. S. Wagner, D. G. Castner, Langmuir 2001, 17, 4649.

[2] E. S. Berman, K. S. Kulp, M. G. Knize, L. Wu, E. J. Nelson, D. O. Nelson, K. J. Wu, Anal. Chem. 2006, 78, 6497.

[3] J. L. S. Lee, I. S. Gilmore, M. P. Seah, Surf. Interface Anal. 2007, 40, 1.

[4] F. M. Green, E. J. Dell, I. S. Gilmore, M. P. Seah, Int. J. Mass Spectrom. 2008, 272, 38.

[5] S. Aoyagi, M. Kudo, in Proteins at Solid-liquid Interfaces (Ed.: P. Dejardin), Springer-Verlag: Berlin, Heidelberg, 2006, pp. 151.

[6] C. E. Shannon, W. Weaver, The Mathematical Theory of Information, University of Illinois Press: Urbana, 1947.

[7] K. Eckschlager, V. Stepanek, K. Danzer, J. Chemometr. 1990, 4, 195.

[8] T. Fujikura, K. Sakamoto, J. T. Shimozawa, Anal. Chim. Acta 1997, 351, 387.

[9] S. Aoyagi, M. Hayama, U. Hasegawa, K. Sakai, M. Tozu, T. Hoshi, M. Kudo, e-J. Surf. Sci. Nanotechnol. 2003, 1, 67.

[10] S. Aoyagi, M. Hayama, U. Hasegawa, K. Sakai, M. Tozu, T. Hoshi, M. Kudo, Appl. Surf. Sci. 2004, 411, 231.

[11] S. Aoyagi, Y. Kawashima, M. Kudo, Nucl. Instrum. Methods Phys. Res., Sect. B 2005, 232, 146.

[12] S. Aoyagi, M. Kudo, Biosens. Bioelectron. 2005, 20, 8 1626.

[13] S. Aoyagi, A. Takesawa, A. C. Yamashita, M. Kudo, Appl. Surf. Sci. 2006, 252, 6697.

[14] S. Aoyagi, M. Dohi, N. Kato, M. Kudo, S. Iida, M. Tozu, N. Sanada, e-J. Surf. Sci. Nanotechnol. 2006, 4, 614.

[15] S. Aoyagi, A. Rouleau, W. Boireau, Appl. Surf. Sci. (in press).

[16] K. Okada, S. Aoyagi, M. Dohi, N. Kato, M. Kudo, M. Tozu, T. Miyayama, N. Sanada, Appl. Surf. Sci. (in press).

[17] B. Hagenhoff, Biosens. Bioelectron. 1995, 10, 885.

[18] R. Chatterjee, Appl. Surf. Sci. 2004, 437, 231.

[19] K. G. Lloyd, D. P. O. Keefe, Appl. Surf. Sci. 2004, 207, 231.

[20] S. Aoyagi, K. Okada, A. Shigyo, N. Man, A. Karen, Appl. Surf. Sci. (in press).

[21] P. Bertrand, Appl. Surf. Sci. 2006, 252, 6986.

[22] R. Michel, S. Pasche, M. Textor, D. G. Castner, Langmuir 2005, 21, 12327.

[23] K. Leufgen, M. Mutter, H. Vogel, W. Szymczak, J. Am. Chem. Soc. 2003, 125, 8911.

[24] N. Xia, C. J. May, S. L. McArthur, D. G. Castner, Langmuir 2002, 18, 4090.

[25] N. Xia, D. G. Castner, J. Biomed. Mater. Res. 2003, 67A, 179.

[26] R. Michel, D. G. Castner, Surf. Interface Anal. 2006, 38, 1386.

[27] D. S. Mantus, B. D. Ratner, B. A. Carlson, J. F. Moulder, Anal. Chem. 1993, 65, 1431.

[28] J.-B. Lhoest, E. Detrait, P. van den B. de Aguilar, P. Bertrand, J. Biomed. Mater. Res. 1998, 41, 95.

[29] J.-B. Lhoest, M. S. Wagner, C. D. Tidwell, D. G. Castner, J. Biomed. Mater. Res. 2001, 57, 432.

[30] D. Hopwood, Histochem. J. 1972, 4, 267.