
Describing some characters of serine proteinase using fractal analysis


Chaos, Solitons & Fractals 45 (2012) 1017–1023

Contents lists available at SciVerse ScienceDirect

Chaos, Solitons & Fractals: Nonlinear Science, and Nonequilibrium and Complex Phenomena

journal homepage: www.elsevier.com/locate/chaos


Xin Peng, Wei Qi ⇑, Rongxin Su, Zhimin He

State Key Laboratory of Chemical Engineering, Chemical Engineering Research Center, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, PR China

Article info

Article history: Received 25 July 2011; Accepted 14 April 2012; Available online 16 May 2012

0960-0779/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.chaos.2012.04.002

⇑ Corresponding author. Tel.: +86 22 27407799; fax: +86 22 27407599. E-mail address: [email protected] (W. Qi).

Abstract

In this paper we calculated the fractal dimensions of four proteins, chymotrypsin, elastase, trypsin and subtilisin, which are made up of about 220–275 amino acids and belong to the family of serine proteinases, by using three definitions of fractal dimension, i.e. the chain fractal dimension ($D_L$), the mass fractal dimension ($D_m$) and the correlation fractal dimension ($D_c$). We also analyzed the relationship between fractal dimension and the spatial structure or secondary structure contents of proteins. The results showed that the values of the fractal dimensions are almost the same for the global mammalian enzymes (chymotrypsin, elastase and trypsin), but different for global subtilisin. This demonstrated that the more similar the structures, the closer the fractal dimensions, and that if the fractal dimensions of proteins differ from each other, their three dimensional structures should not be similar. On the other hand, the detailed structures and fractal dimensions of the active sites of the four enzymes are extraordinarily similar. Therefore, the fractal method can be applied to the elucidation of protein evolution.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Proteins are made up of 20 different amino acids linked together by peptide bonds [1]. On a microscopic scale, a protein is determined by its curved amino acid sequence, called the primary structure, via the process of protein folding. Since the irregularity of a protein's curved shape does not change, to a certain extent, with the scale of observation, its backbone conformation obeys statistical laws [2]. On a macroscopic scale, a protein is a three dimensional object, described by its tertiary or quaternary structure. On the surface of proteins there exist a great number of irregular “caves” and “gaps”, and closer observation reveals further micro-rugged, very irregular structures. That is, proteins show a strong similarity between local structure and overall structure, which is an obvious characteristic of fractal geometry. The fractal method can therefore be used to describe the complicated spatial and dynamical structures of proteins [3]. Moreover, fractal theory has been successfully applied to many branches of science, especially those involving complex systems in physics, mathematics, biology and engineering.

Evidence that protein molecules may be characterized by a fractal-like geometry has appeared in a variety of measurements. For instance, based on the electron-spin relaxation measurements on low-spin iron in several proteins by Stapleton and co-workers [4–6] and on the theoretical underpinning given by Alexander and Orbach [7] for the variation of the vibrational density of a protein with vibrational frequency, it was found that the fractal nature of proteins could be elucidated by the fractal dimension. The data reported by Lewis and Rees [8] demonstrated that protein surfaces show similar fractal properties and that the fractal surface dimension, whose average value was approximately 2.4, can be determined by using the ball-rolling algorithm. In another example, Isogai and Itoh [9] made a detailed calculation of the chain fractal dimension of a set of 43 proteins from different classes. They also analyzed the relationship between the fractal dimension and tertiary structure. It was found that the fractal region of protein chains could be divided into two regions, the small scale region and the large scale region, which reflected the different properties of protein folding. In recent years, the mass fractal dimension of biomacromolecules such as proteins and the ribosome has also been investigated [10–13]. However, in most papers fractal dimension analysis was used only to elucidate the intrinsic structural characteristics of proteins; few studies were concerned with illuminating the structural differences among proteins in conjunction with their biological function.

Fig. 1. The length of the protein chain at different scales (panels m = 1, m = 2, m = 3 and m = N − 1) for a protein of N residues. The dots represent the Cα-atoms of the amino acid residues; the dotted lines show the actual length of the protein chain; and the length of the thick line refers to L(m) in Eq. (1).

In this paper we introduced three methodologies to calculate the fractal dimensions of some important enzymes. It is well known that the serine proteases are a group of enzymes which hydrolyze peptide bonds in proteins. These serine proteases have four essential structural features: the catalytic triad, the oxyanion hole, the specificity pocket, and the unspecific substrate binding [14,15]. We selected four proteins from this family and studied the relationship between fractal dimension and the spatial structure of proteins, which can be used to explain the evolutionary relationship between proteins. Moreover, we also analyzed the differences between the fractal dimension values of the global proteins and those of the active sites of the enzymes. Fractal analysis is therefore a useful tool to build a bridge between the structure and function of proteins.

2. Method

2.1. Definition of the chain fractal dimension

Based on fractal theory and previous studies [9,16], the length of a protein backbone is defined as a function of the fineness by drawing a zigzag line that connects the Cα-atoms step by step, from the N-terminal to the C-terminal, according to the scale. The calculation of the backbone length is repeated while changing the scale. If the remaining residues at the C-terminal side cannot be connected by the zigzag line, the length corresponding to these unconnected residues is included by adding a suitable connection term for each scale. In this paper, the length of the protein backbone, L(m), for different sequence intervals, m, is expressed as follows:

$$L(m) = l(m) + \frac{n\,l(m)}{mK} \qquad (1)$$

where l(m) is the length of the through-space zigzag of order K, starting from the first Cα-atom, traveling through Cα-atoms m + 1, 2m + 1, ..., and ending at Cα-atom Km + 1 (Fig. 1; the length of the solid line corresponds to the magnitude of l(m)); n (n < m) is the number of residues left unconnected and K is the total number of segments. The second term on the right-hand side of Eq. (1) is the connection term, which takes into account the contribution from the residues left unconnected. After calculating L(m) for different scales m, the fractal dimension of the protein chain ($D_L$) is defined by:

$$L(m) \propto m^{1-D_L} \qquad (2)$$
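The definitions in Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the names `fit_slope`, `chain_length` and `chain_fractal_dimension` are hypothetical, the input is assumed to be a list of Cα coordinates already extracted from a structure file, and the slope is taken from an ordinary least-squares fit.

```python
import math

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (helper assumed here)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def chain_length(ca, m):
    """L(m) of Eq. (1); ca is a list of (x, y, z) C-alpha positions."""
    k = (len(ca) - 1) // m                 # K: number of zigzag segments
    if k < 1:
        raise ValueError("interval m is too large for this chain")
    nodes = ca[: k * m + 1 : m]            # residues 1, m+1, 2m+1, ..., Km+1
    l_m = sum(math.dist(a, b) for a, b in zip(nodes, nodes[1:]))
    n_left = (len(ca) - 1) - k * m         # n: residues left unconnected (n < m)
    return l_m + n_left * l_m / (m * k)    # connection term of Eq. (1)

def chain_fractal_dimension(ca, m_values):
    """D_L from the slope of log L(m) versus log m, using Eq. (2)."""
    log_m = [math.log10(m) for m in m_values]
    log_l = [math.log10(chain_length(ca, m)) for m in m_values]
    return 1.0 - fit_slope(log_m, log_l)   # slope = 1 - D_L
```

For a perfectly straight chain L(m) does not depend on m, so the sketch returns a dimension of 1; a more convoluted chain loses length as m grows and gives a larger value.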

2.2. Definition of the mass fractal dimension

The classical box-counting method [17] is used in image analysis to calculate the fractal dimension. In this paper, the fractal dimension of the protein entity in three-dimensional Euclidean space is calculated by an approach that is based on the theory of box counting but is entirely dissimilar to the classical box-counting method; more precisely, the approach is the sandbox method. The corresponding dimension [10–13], which can be used as a measure of compactness, is referred to as the mass fractal dimension ($D_m$) and is defined by

$$M(\varepsilon) \propto \varepsilon^{D_m} \qquad (3)$$

where M(ε) is the number of all atoms of the protein chain in the “box” and ε is the scale of the “box”. A fractal diagram can be drawn by plotting log M(ε) on the ordinate against log ε on the abscissa, and $D_m$ is then obtained from the slope of this diagram. In theory, the origin of the coordinate system should be the mass center of all atoms of the protein, which is nearly identical to the geometric center for all molecules. In practice it is therefore simple and convenient to set the geometric center as the origin of the coordinate system, because the geometric center is much easier to find than the mass center. Moreover, in order to test the sensitivity of the results to the selected “box” shape, it is necessary to calculate the $D_m$ of a protein with two different kinds of “boxes”, i.e. “sphere” and “cube”.
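A minimal sketch of the sandbox count behind Eq. (3) is given below. It is an illustration rather than the procedure actually used for Table 1: the function names are hypothetical, the geometric center is used as the origin as described above, and for the “cube” the scale ε is assumed to be the half-edge length (a constant factor in ε would not change the slope).

```python
import math

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (helper assumed here)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def sandbox_counts(coords, scales, shape="sphere"):
    """M(eps): number of atoms inside a box of scale eps around the geometric center."""
    n = len(coords)
    centre = [sum(c[i] for c in coords) / n for i in range(3)]
    if shape == "sphere":
        # Euclidean distance from the center: atom is inside if distance <= eps
        reach = [math.dist(c, centre) for c in coords]
    elif shape == "cube":
        # largest coordinate offset: atom is inside a cube of half-edge eps
        reach = [max(abs(c[i] - centre[i]) for i in range(3)) for c in coords]
    else:
        raise ValueError("shape must be 'sphere' or 'cube'")
    return [sum(r <= eps for r in reach) for eps in scales]

def mass_fractal_dimension(coords, scales, shape="sphere"):
    """D_m as the slope of log M(eps) versus log eps; every scale must enclose an atom."""
    counts = sandbox_counts(coords, scales, shape)
    return fit_slope([math.log10(e) for e in scales],
                     [math.log10(c) for c in counts])
```

Calling `mass_fractal_dimension` once with `shape="sphere"` and once with `shape="cube"` and averaging the two slopes would mirror the way the $D_m$ column of Table 1 is described.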

2.3. Definition of the correlation dimension

Besides $D_L$ and $D_m$ mentioned above, a fractal entity that is given as an image or a set of temporal points, but whose mass is not known [13,18], is characterized in general by the correlation dimension ($D_c$). $D_c$ is closely related to $D_m$. The correlation integral C(ε) of the distance ε is calculated as follows:


$$C(\varepsilon) = \frac{1}{N(N-1)} \sum_{i \neq j}^{N} H\left(\varepsilon - \lVert r_{ij} \rVert\right) \qquad (4)$$

where N corresponds to the number of points; $r_{ij}$ is the distance between a pair of monomers i and j; and H(x) is the Heaviside function. In our case N is equal to the number of all atoms of the protein chain. To determine $D_c$, the pairs of points that lie within a distance ε of each other are counted, which gives C(ε). This quantity scales as:

$$C(\varepsilon) \propto \varepsilon^{D_c} \qquad (5)$$

where $D_c$ is determined from the slope of a log–log plot of C(ε) versus ε. This definition has found popular application in determining the fractal dimension of the “strange attractors” that characterize a chaotic system [19]. Moreover, $D_c$ has the advantage of being straightforward and quick to calculate, and is often in agreement with other calculations of dimension.
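Eqs. (4) and (5) translate almost directly into code. The sketch below is a plain-Python illustration with hypothetical function names; it takes H(0) = 1, which is an assumption about the Heaviside convention, and it loops over all pairs, so the cost grows as N² with the number of atoms.

```python
import math
from itertools import combinations

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (helper assumed here)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def correlation_integral(coords, eps_values):
    """C(eps) of Eq. (4): fraction of ordered pairs (i != j) closer than eps."""
    n = len(coords)
    dists = [math.dist(a, b) for a, b in combinations(coords, 2)]
    # each unordered pair stands for the two ordered pairs (i, j) and (j, i)
    return [2 * sum(d <= eps for d in dists) / (n * (n - 1)) for eps in eps_values]

def correlation_dimension(coords, eps_values):
    """D_c as the slope of log C(eps) versus log eps, following Eq. (5)."""
    c = correlation_integral(coords, eps_values)
    return fit_slope([math.log10(e) for e in eps_values],
                     [math.log10(v) for v in c])
```

For a protein with a few thousand atoms the pair list holds several million distances, so a vectorised or binned implementation would be preferable in practice.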

2.4. Data set and calculations

In this paper we calculated the fractal dimension values of chymotrypsin, elastase, trypsin and subtilisin, which belong to the family of serine proteases, by using three different measures ($D_L$, $D_m$, $D_c$), and we also investigated the relationship between fractal dimension and the spatial structure of proteins in order to examine whether the three dimensional structure of a protein determines the fractal dimension of the protein backbone. The proteins were selected from the Protein Data Bank [20]; their structures had been determined by X-ray diffraction at a resolution of less than 3.0 Å. The class was assigned according to the SCOP database [21] and the secondary structure assignment was performed using the DSSP software [22]. The fractal dimension of the active site in the selected enzymes was calculated based on the amino acid residues in three spheres, each with a radius of 7 Å and centered on one residue of the catalytic triad of the proteinase (His, Asp, Ser). The specific procedure for calculating the fractal dimension of the active site is available in the Supplementary Material (Table S1).
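The active-site selection can be pictured with the following sketch. The exact procedure is given only in the Supplementary Material, so several details here are assumptions: the function name `active_site_atoms` is hypothetical, each sphere center is taken to be a single point supplied by the caller (for example a chosen atom of His, Asp or Ser), and a residue is kept in full as soon as any of its atoms falls inside one of the spheres.

```python
import math

def active_site_atoms(atoms, centres, radius=7.0):
    """Select atoms of every residue that reaches into one of the spheres.

    atoms   : list of (residue_id, (x, y, z)) tuples
    centres : list of (x, y, z) sphere centers, one per catalytic-triad residue
    radius  : sphere radius in angstroms (7 in the text)
    """
    # residues with at least one atom within `radius` of any center
    hit = {res for res, xyz in atoms
           if any(math.dist(xyz, c) <= radius for c in centres)}
    # keep all atoms of those residues, in their original order
    return [(res, xyz) for res, xyz in atoms if res in hit]
```

The coordinates of the selected atoms could then be passed to the same $D_m$ and $D_c$ calculations that are applied to the global enzyme.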

The calculation range for the fractal dimension is a question that needs careful consideration in practice. In this work we computed the fractal dimensions of the proteins on the following basis. To avoid finite-size effects as much as possible when computing the fractal dimensions, we must consider the interval of length scales used in the computation: owing to the finite size of a protein, there are both upper and lower size limits beyond which it is no longer fractal. The correlation coefficients of the best-fit lines were greater than 0.995 for the mass fractal dimension ($D_m$) and the correlation dimension ($D_c$) (especially for the global enzymes), whereas for the fractal dimension of the protein chain ($D_L$) the correlation coefficient is not worth investigating because the calculation range is clear. Overall, the chosen range strikes a balance between having enough points to compute the fractal dimension meaningfully and keeping those points within a range where the fractal dimension does not change much with small changes in the number of points used in the calculation. The selection of the computation range is thus a difficult but necessary part of the analysis [10,11].
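The role of the correlation coefficient in choosing the range can be illustrated with a small sketch. The function name `log_log_fit` and the synthetic data are assumptions made for illustration only: the data follow a power law with exponent 2.5 that saturates at x = 10, which imitates the finite-size plateau reached once the “box” encloses the whole molecule.

```python
import math

def log_log_fit(xs, ys, lo, hi):
    """Slope and correlation coefficient of log10(y) vs log10(x) for lo <= x <= hi."""
    pts = [(math.log10(x), math.log10(y)) for x, y in zip(xs, ys) if lo <= x <= hi]
    n = len(pts)
    mean_x = sum(px for px, _ in pts) / n
    mean_y = sum(py for _, py in pts) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in pts)
    syy = sum((py - mean_y) ** 2 for _, py in pts)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pts)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Synthetic data (not from the paper): exponent 2.5, saturating at x = 10
xs = list(range(1, 41))
ys = [min(x, 10) ** 2.5 for x in xs]

slope, r = log_log_fit(xs, ys, 1, 10)             # fit restricted to the scaling region
slope_wide, r_wide = log_log_fit(xs, ys, 1, 40)   # fit that runs into the plateau
```

Restricting the fit to the scaling region recovers the exponent with a correlation coefficient close to 1, whereas extending the range into the plateau lowers both the slope and the correlation coefficient, which is the kind of signal a threshold such as 0.995 is meant to catch.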

3. Results and discussions

3.1. The relationship between fractal dimension and structure

To explore the relationship between fractal dimension and the tertiary structure of proteins, we selected four proteins from the family of serine proteases. The plots of log L(m) versus log m in Fig. 2 are bilinear for chymotrypsin, subtilisin, elastase and trypsin, as demonstrated by the crossover indicated by the arrows. This is a characteristic of a great many proteins [9,16]. As shown in Table 1, the fractal nature of the proteins can be divided into two parts, within the ranges m ≤ 10 and m > 10 (m is the measurement scale), because the plots within each range can be regarded as linear. When m ≤ 10, the $D_L$ values of chymotrypsin, elastase and trypsin are less than that of subtilisin. This is because the $D_L$ (m ≤ 10) value, which depends on the secondary structure type [23,24], reflects the local conformation (folding) of the protein chain [9,13,16]. In a globular protein, the fractal dimension values for the α helix, parallel β sheet, anti-parallel β sheet, twisted β sheet and reverse turn are 1.44, 1.09, 1.06, 1.07 and 1.59, respectively [23]. As shown in Table 2, the first three enzymes all belong to the β class, but subtilisin belongs to the α/β class. Moreover, the secondary structure percentages of subtilisin are completely different from those of the other enzymes. For instance, the α-% value of subtilisin is evidently larger than those of the first three enzymes. However, the trend in the fractal dimension of the protein chains in the range of larger m is contrary to that within the range m ≤ 10. In other words, the difference between $D_L$ (m ≤ 10) and $D_L$ (m > 10) reflects the fact that the local and global conformations (folding) of a protein molecule are governed by different rules [9,16]. This phenomenon is in accordance with the data calculated by Wang et al. [16] and Isogai and Itoh [9]. Furthermore, the $D_L$ values of these proteins in the short m range are less than that of a polymer that can be represented by the self-avoiding walk (SAW). Considering that the SAW is a typical model for a polymer with massless bonds [25], this suggests that the local conformations of these proteins are more extended or swollen than both the SAW and the global proteins.

Fig. 2. The plots of log L(m) versus log m for determining the fractal dimensions of four serine proteinase chains (panels: subtilisin, chymotrypsin, elastase and trypsin). Each plot shows two linear regions, which cross at log m = 1.0 (m = 10), indicated by the arrow.

Table 1. The fractal dimensions of the serine proteinases calculated by three methods. $D_m$ is the average of the values obtained with two different kinds of “boxes”: “sphere” and “cube”. The standard deviation of each estimate is given in parentheses.

| Protein | PDB ID | Number of residues | Global enzyme $D_L$ (m ≤ 10) | Global enzyme $D_L$ (m > 10) | Global enzyme $D_m$ | Global enzyme $D_c$ | Active site $D_m$ | Active site $D_c$ |
|---|---|---|---|---|---|---|---|---|
| Chymotrypsin | 1k2i | 245 | 1.32 (0.02) | 1.98 (0.03) | 2.75 (0.05) | 2.47 (0.03) | 2.81 (0.16) | 2.32 (0.04) |
| Elastase | 2g4t | 240 | 1.30 (0.01) | 2.14 (0.04) | 2.77 (0.04) | 2.47 (0.03) | 2.79 (0.17) | 2.32 (0.04) |
| Trypsin | 2g8t | 223 | 1.31 (0.02) | 2.14 (0.04) | 2.77 (0.06) | 2.48 (0.03) | 2.81 (0.11) | 2.34 (0.04) |
| Subtilisin | 1af4 | 274 | 1.35 (0.01) | 1.80 (0.02) | 2.81 (0.03) | 2.52 (0.03) | 2.81 (0.10) | 2.34 (0.05) |

Table 2. The class types and the secondary structure contents of the proteins. The protein types are classified according to the SCOP 1.73 notation. The secondary structure contents are calculated with the DSSP software.

| Protein | Class | α [%] | β [%] | Turn [%] |
|---|---|---|---|---|
| Chymotrypsin | β | 7.35 | 34.29 | 14.29 |
| Elastase | β | 5.42 | 38.33 | 17.50 |
| Trypsin | β | 7.17 | 36.77 | 15.70 |
| Subtilisin | α/β | 29.93 | 20.07 | 16.79 |

The plots of log M(ε) versus log ε and of log C(ε) versus log ε for the four global enzymes are presented in Figs. 3 and 4, respectively. As shown in Table 1, the $D_m$ and $D_c$ values of the first three global proteins are almost equal, but less than those of subtilisin. The reason is that the three dimensional structures and the secondary structure contents (Table 2) of the first three proteins are very similar, while subtilisin has a quite different structure from the mammalian serine proteinases. From this point of view, we can deduce that if the structures of proteins are very similar, their fractal dimensions must be nearly equal. In addition, the $D_m$ values of the four proteins are smaller than that of a completely crowded, three dimensional collapsed polymer ($D_m$ = 3), which indicates that the protein structure contains “empty” or “void” space and that the protein configuration is not completely filled by atoms [10]. Furthermore, $D_m$ represents the degree of crowding of the molecular space, i.e. the larger $D_m$ is, the more atoms there are in the space; and $D_c$, in general, characterizes the degree of compactness of the geometric configuration [13], i.e. the larger $D_c$ is, the more compact (or dense) the spatial structure of the protein. We can therefore infer that the spatial structure of subtilisin is more crowded and more compact than those of the three other enzymes.

It must be noted here that a fractal dimension is a statistical property, in the sense that the detailed structure content is averaged out. Therefore, fractal theory alone may not be enough to describe the detailed structure of proteins. However, from this study we can draw the following conclusion: if two or more types of fractal dimension, taken from different perspectives, are basically the same for two proteins, some properties of their three dimensional structures (here, local folding, global folding, degree of crowding and degree of compactness) tend to be conserved. Conversely, if the values of the fractal dimensions of two proteins are completely different, we can conclude that their spatial structures are not the same. From Table 1 we can see that the fractal dimensions of subtilisin ($D_L$ (m ≤ 10), $D_L$ (m > 10), $D_m$ and $D_c$ for the global enzyme) are entirely dissimilar to those of the other three enzymes, so the global three dimensional structure of subtilisin will be different from that of the other proteins. As a matter of fact, the class type and the secondary structure content of subtilisin are indeed completely different from those of the first three enzymes, as shown in Table 2 (α/β class for subtilisin, but β class for the other enzymes).

As stated above, the fractal dimension can be used to discuss some biological problems related to the classification of enzymes and the prediction of protein structures [26–28].

Fig. 3. The log–log plots of the number of atoms M(ε) versus the scale ε for determining the mass fractal dimensions (panels: chymotrypsin, elastase, trypsin and subtilisin). The slopes are the $D_m$ values of the four global serine proteinases obtained by using the “sphere box”.

Fig. 4. The typical log–log diagrams of the correlation integral C(ε) versus the distance ε for determining the correlation fractal dimensions (panels: chymotrypsin, elastase, trypsin and subtilisin). The slopes are the $D_c$ values of the four global serine proteinases obtained by using the “sphere box”.


If the secondary structure contents and spatial structures of proteins are very similar, the fractal dimensions of the protein molecules must be very close. We can also deduce that if the fractal dimensions of proteins are different from each other, their three dimensional structures will not be similar.

3.2. The relationship between fractal dimension of global enzyme and that of active site

As shown in Table 1, the $D_m$ values of the active sites of the four enzymes are almost equal. The same holds for the $D_c$ values. These features suggest that some properties of the three dimensional structures of the active sites of the four proteins are very similar. Namely, the compactness of the spatial structure and the degree of compactness of the geometric configuration of the active sites of the serine enzymes presented in this work are very close. This is in agreement with the structural data on serine proteinases [14,15], which are known to present a catalytic triad consisting of side chains from Asp, His and Ser. These amino acids are close to each other in the active site, although they are far apart in the amino acid sequence of the polypeptide chain, as shown in Fig. 5.
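The comparison between global and active-site values can be made concrete with a little arithmetic on Table 1. The sketch below simply transcribes the mean values from the table (standard deviations are omitted, so it says nothing about statistical significance) and computes how far subtilisin lies from the average of the three mammalian enzymes; the dictionary keys and the function name are labels chosen here.

```python
# Mean values transcribed from Table 1 (standard deviations omitted)
TABLE_1 = {
    "chymotrypsin": {"DL_short": 1.32, "DL_long": 1.98, "Dm": 2.75, "Dc": 2.47,
                     "Dm_site": 2.81, "Dc_site": 2.32},
    "elastase":     {"DL_short": 1.30, "DL_long": 2.14, "Dm": 2.77, "Dc": 2.47,
                     "Dm_site": 2.79, "Dc_site": 2.32},
    "trypsin":      {"DL_short": 1.31, "DL_long": 2.14, "Dm": 2.77, "Dc": 2.48,
                     "Dm_site": 2.81, "Dc_site": 2.34},
    "subtilisin":   {"DL_short": 1.35, "DL_long": 1.80, "Dm": 2.81, "Dc": 2.52,
                     "Dm_site": 2.81, "Dc_site": 2.34},
}
MAMMALIAN = ("chymotrypsin", "elastase", "trypsin")

def subtilisin_offset(key):
    """Subtilisin value minus the mean of the three mammalian enzymes."""
    mean = sum(TABLE_1[p][key] for p in MAMMALIAN) / len(MAMMALIAN)
    return TABLE_1["subtilisin"][key] - mean

for key in TABLE_1["subtilisin"]:
    print(f"{key:9s} {subtilisin_offset(key):+.3f}")
```

With these numbers the offsets of the active-site $D_m$ and $D_c$ come out smaller in magnitude than the corresponding global offsets, and the largest offset is found for $D_L$ (m > 10).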

It is very interesting that the whole amino acid sequence and the global 3-D structure of subtilisin, a bacterial serine proteinase, are quite different from those of the mammalian serine proteinases (chymotrypsin, elastase and trypsin), but the amino acid residues in subtilisin that participate in the catalytic triad, in the oxyanion hole and in substrate binding are in almost identical positions to those in chymotrypsin, elastase and trypsin. Starting from unrelated ancestral proteins, convergent evolution has resulted in the same structural solution to achieve a particular catalytic mechanism. In other words, the serine proteinases provide a spectacular example of convergent evolution at the molecular level. With regard to the relationship between fractal analysis and evolution, previous studies [29,30] have demonstrated that multifractal analysis and fractal theory can be employed to construct a phylogenetic tree using protein sequences from complete genomes. Moreover, from the values of the various fractal dimensions in this study we can deduce that the global three-dimensional structure of subtilisin is different from the others, whereas the active sites of the serine enzymes tend to be conserved in structure and function, which is in accord with the character of convergent evolution. Hence the fractal dimension can be used to characterize this biological phenomenon quantitatively in a certain respect.

As shown in Table 1, the fractal dimensions calculated by using different methods are quite different. The $D_m$ values of the global enzymes are less than those of the active sites, whereas the opposite holds for the $D_c$ values. Previous studies reported two distinct results. Farin and Avnir reckoned that the fractal dimension of the active site of trypsin should be larger than the global value, so that the capture of substrate could be enhanced on a more corrugated surface [31]. On the other hand, Lewis and Rees obtained different results for lysozyme and superoxide dismutase, in which the fractal dimensions of the active sites were found to be lower than the global values. They suggested that the active site should not bind the ligands strongly, otherwise the final stage of the enzymatic reaction (the release of products) would be slow. Hence, in order to maintain catalytic efficiency, both “the trapping” and “the desorption” should be optimized [8]. As a matter of fact, the surfaces of protein molecules are rough and irregular, so different fractal dimensions can be obtained from different perspectives. To reconcile the contradiction between the differently calculated fractal dimensions of active sites and global enzymes, researchers suggest that multifractal theory, in which a spectrum of fractal dimensions can be obtained by using probability distributions [32], may be a useful tool to investigate the detailed structure of proteins and to characterize the spatial heterogeneity of fractal patterns, but this needs further theoretical and experimental research.

Fig. 5. Schematic diagrams of the three dimensional structure of chymotrypsin (left) and subtilisin (right). The active sites are blue-tint and the amino acid residues in the catalytic triad are shown in red. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

4. Conclusions

In this paper, we applied several methods to calculate the fractal dimension of protein molecules, and elucidated the relationship between protein structure and fractal dimension as well as the relationship between the fractal dimension of the active site and that of the global enzyme.

The results reveal that the fractal dimension of the protein chain ($D_L$) can be divided into two linear regions: one is the short m range, which reflects the local conformation (folding) of the protein chain; the other is the long m range, which reflects the global conformation (folding). $D_m$ values represent the degree of crowding of the molecular space and $D_c$ values characterize the degree of compactness of the geometric configuration of molecules. They depend on both the fold type and the secondary structure contents of proteins.

The values of the fractal dimensions calculated by the several methods are almost the same for the global mammalian enzymes (chymotrypsin, elastase and trypsin). However, the fractal dimensions of global subtilisin are different from those of the other serine proteinases. This fact demonstrates that the more similar the structures, the closer the fractal dimensions, and that if the fractal dimensions of proteins differ from each other, their three dimensional structures should not be similar.

Moreover, the detailed structures and fractal dimensions of the active sites of the four enzymes are extraordinarily similar, but the global structure and fractal dimensions of subtilisin are completely different from those of the other serine proteinases. Therefore, fractal theory can be used to study the evolutionary relationship of proteins.

In conclusion, proteins are self-similar and highly heterogeneous objects, so that the concept of the fractal may serve as a useful tool for describing the intrinsic characteristics of protein molecules.

Acknowledgments

This work was supported by the Program for New Century Excellent Talents in Chinese University (NCET-08–0386), the Key Project of Chinese Ministry of Education (108031), the 863 Program of China (2008AA10Z318), the Natural Science Foundation of China (20976125, 31071509) and Tianjin (10JCYBJC05100), and the Program of Introducing Talents of Discipline to Universities of China (No. B06006).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.chaos.2012.04.002.

References

[1] Petsko G, Ringe D. Protein structure and function. London: Wiley Blackwell Press; 2004.

[2] Li HQ, Zhao HM. Fractal studies on protein molecular chains. J Bioact Compat Pol 1994;9:318–26.

[3] Mandelbrot BB. The fractal geometry of nature. New York: Freeman Press; 1982.

[4] Stapleton HJ, Allen JP, Flynn CP, Stinson DG, Kurtz SR. Fractal form of proteins. Phys Rev Lett 1980;45:1456–9.

[5] Allen JP, Colvin JT, Stinson DG, Flynn CP, Stapleton HJ. Protein conformation from electron spin relaxation data. Biophys J 1982;38:299–310.

[6] Colvin JT, Stapleton HJ. Fractal and spectral dimensions of biopolymer chains: solvent studies of electron spin relaxation rates in myoglobin azide. J Chem Phys 1985;82:4699–706.

[7] Alexander S, Orbach R. Density of states on fractals: fractons. J Phys (Paris) Lett 1982;43:L625–31.

[8] Lewis M, Rees DC. Fractal surfaces of proteins. Science 1985;230:1163–5.

[9] Isogai Y, Itoh T. Fractal analysis of tertiary structure of protein molecule. J Phys Soc Jpn 1984;53:2162–71.

[10] Enright MB, Leitner DM. Mass fractal dimension and the compactness of proteins. Phys Rev E 2005;71:011912.

[11] Enright MB, Yu X, Leitner DM. Hydration dependence of the mass fractal dimension and anomalous diffusion of vibrational energy in proteins. Phys Rev E 2006;73:051905.

[12] Lee CY. Mass fractal dimension of the ribosome and implication of its dynamic characteristics. Phys Rev E 2006;73:042901.

[13] Lee CY. Self-similarity of biopolymer backbones in the ribosome. Physica A 2008;387:4871–80.

[14] Kraut J. Serine proteases: structure and mechanism of catalysis. Ann Rev Biochem 1977;46:331–58.

[15] Hedstrom L. Serine protease mechanism and specificity. Chem Rev 2002;102:4501–23.

[16] Wang CX, Shi YY, Huang FH. Fractal study of tertiary structure of proteins. Phys Rev A 1990;41:7043–8.

[17] Falconer K. Fractal geometry: mathematical foundations and applications. 2nd ed. New York: Wiley Blackwell Press; 2003.

[18] Morita H, Takano M. Residue network in protein native structure belongs to the universality class of a three-dimensional critical percolation cluster. Phys Rev E 2009;79:020901.

[19] Grassberger P, Procaccia I. Characterization of strange attractors. Phys Rev Lett 1983;50:346–9.

[20] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The protein data bank. Nucleic Acids Res 2000;28:235–42.

[21] Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995;247:536–40.

[22] Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983;22:2577–637.

[23] Xiao Y. Comment on “Fractal study of tertiary structure of proteins”. Phys Rev E 1994;49:5903–5.

[24] Tejera E, Machado A, Rebelo I, Nieto-Villar J. Fractal protein structure revisited: topological, kinetic and thermodynamic relationships. Physica A 2009;388:4600–8.

[25] Havlin S, Ben-Avraham D. Theoretical and numerical study of fractal dimensionality in self-avoiding walks. Phys Rev A 1982;26:1728–34.

[26] Isvoran A, Pitulice L, Craescu CT, Chiriac A. Fractal aspects of calcium binding protein structures. Chaos Solitons Fract 2008;35:960–6.

[27] Isvoran A. Describing some properties of adenylate kinase using fractal concepts. Chaos Solitons Fract 2004;19:141–5.

[28] Isvoran A, Licz A, Unipan L, Morariu VV. Determination of the fractal dimension of the lysozyme backbone of three different organisms. Chaos Solitons Fract 2001;12:757–60.

[29] Yu ZG, Anh V, Lau KS. Multifractal and correlation analysis of protein sequences from complete genome. Phys Rev E 2003;68:021913.

[30] Yu ZG, Anh V, Lau KS. Chaos game representation of protein sequences based on the detailed HP model and their multifractal and correlation analyses. J Theor Biol 2004;226:341–8.

[31] Farin D, Avnir D. In: Unger KK, Behrens D, Kral H, editors. Characterization of porous solids. Amsterdam: Elsevier; 1988.

[32] Li HQ. Fractal analysis for protein and enzyme surfaces. Chin J Chem Phys 1992;5:183–9.