20
proteins STRUCTURE FUNCTION BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke * Mathematisch-Naturwissenschaftliche Fakulta ¨t, Institut fu ¨r Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universita ¨t Du ¨sseldorf, Germany INTRODUCTION In the recent years, there has been considerable effort to under- stand the structural determinants of stability and activity of proteins from thermophilic and hyperthermophilic organisms (henceforth referred to together as thermophilic proteins) with optimal growth temperatures of more than 608C. 1,2 These proteins are required to retain their native structures even under extreme environmental conditions, when homologues from mesophilic organisms already denature. 1,2 Consequently, thermophilic proteins often have a substantially higher intrinsic thermostability than their mesophilic counterparts, while retaining the basic fold characteristics, and in the case of enzymes, the catalytic site properties of the particular protein family. 1,3 Because of their adaptation to high temperature, thermophilic enzymes are interesting for the biotechnological industry as bio- catalysts. 4–6 However, the identification of suitable thermophilic enzymes with a specific temperature-dependent activity profile by screening is tedious. 6,7 As a valuable alternative, mesophilic enzymes can be re-engineered to become more thermostable. 8,9 Thus, there is great interest in the biotechnology industry for having tools in hand that aid in developing thermostable enzymes by protein engineering. In addition to stability, optimizing enzyme activity for elevated target temperatures is mandatory. This requires both a thorough understanding of the structural determinants of thermostability and temperature-dependent activity. 10 Numerous studies have been performed to identify principles of thermostability and, subsequently, to apply those in rational or data-driven protein engineering. 8,11 These include a large and growing number of comparative studies using mesophilic and ther- mophilic protein homologues. 12 The influence of various sequence and structure determinants of thermophilic adaptation has been studied. 13–21 Often, thermophilic adaptation comes along with a better packing of hydrophobic interactions and an increased density of salt-bridges or charge-assisted hydrogen bonds. 19,20,22 In many cases, a combination of different sequence or structure features was found to be responsible for an increased thermostability. 10,21,23 Sebastian Radestock’s current address is Computational Structural Biology Group, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438 Frankfurt am Main, Germany. Additional Supporting Information may be found in the online version of this article. *Correspondence to: Holger Gohlke, Universita ¨tsstr. 1, 40225 Du ¨sseldorf, Germany. E-mail: [email protected]. Received 5 August 2010; Revised 28 September 2010; Accepted 7 November 2010 Published online 19 November 2010 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/prot.22946 ABSTRACT We probe the hypothesis of corresponding states, according to which homologues from mesophilic and thermophilic organisms are in corresponding states of similar rigidity and flexibility at their respective optimal temperatures. For this, the local distribution of flexible and rigid regions in 19 pairs of homologous proteins from meso- and thermophilic organisms is analyzed and related to activity characteristics of the enzymes by con- straint network analysis (CNA). Two pairs of enzymes are considered in more detail: 3-isopro- pylmalate dehydrogenase and thermolysin-like protease. By comparing microscopic stability fea- tures of homologues with the help of stability maps, introduced for the first time, we show that adaptive mutations in enzymes from thermo- philic organisms maintain the balance between overall rigidity, important for thermostability, and local flexibility, important for activity, at the appropriate working temperature. Thermophilic adaptation in general leads to an increase of structural rigidity but conserves the distribution of functionally important flexible regions between homologues. This finding provides direct evidence for the hypothesis of corresponding states. CNA thereby implicitly captures and uni- fies many different mechanisms that contribute to increased thermostability and to activity at high temperatures. This allows to qualitatively relate changes in the flexibility of active site regions, induced either by a temperature change or by the introduction of mutations, to experi- mentally observed losses of the enzyme function. As for applications, the results demonstrate that exploiting the principle of corresponding states not only allows for successful thermostability optimization but also for guiding experiments in order to improve enzyme activity in protein engineering. Proteins 2011; 79:1089–1108. V V C 2010 Wiley-Liss, Inc. Key words: rigidity theory; thermostability; flexibil- ity; protein engineering; network; enzyme activity. V V C 2010 WILEY-LISS, INC. PROTEINS 1089

proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

proteinsSTRUCTURE O FUNCTION O BIOINFORMATICS

Protein rigidity and thermophilic adaptationSebastian Radestock and Holger Gohlke*

Mathematisch-Naturwissenschaftliche Fakultat, Institut fur Pharmazeutische und Medizinische Chemie,

Heinrich-Heine-Universitat Dusseldorf, Germany

INTRODUCTION

In the recent years, there has been considerable effort to under-

stand the structural determinants of stability and activity of proteins

from thermophilic and hyperthermophilic organisms (henceforth

referred to together as thermophilic proteins) with optimal growth

temperatures of more than 608C.1,2 These proteins are required to

retain their native structures even under extreme environmental

conditions, when homologues from mesophilic organisms already

denature.1,2 Consequently, thermophilic proteins often have a

substantially higher intrinsic thermostability than their mesophilic

counterparts, while retaining the basic fold characteristics, and in

the case of enzymes, the catalytic site properties of the particular

protein family.1,3

Because of their adaptation to high temperature, thermophilic

enzymes are interesting for the biotechnological industry as bio-

catalysts.4–6 However, the identification of suitable thermophilic

enzymes with a specific temperature-dependent activity profile by

screening is tedious.6,7 As a valuable alternative, mesophilic

enzymes can be re-engineered to become more thermostable.8,9

Thus, there is great interest in the biotechnology industry for having

tools in hand that aid in developing thermostable enzymes by

protein engineering. In addition to stability, optimizing enzyme

activity for elevated target temperatures is mandatory. This requires

both a thorough understanding of the structural determinants of

thermostability and temperature-dependent activity.10

Numerous studies have been performed to identify principles of

thermostability and, subsequently, to apply those in rational or

data-driven protein engineering.8,11 These include a large and

growing number of comparative studies using mesophilic and ther-

mophilic protein homologues.12 The influence of various sequence

and structure determinants of thermophilic adaptation has been

studied.13–21 Often, thermophilic adaptation comes along with a

better packing of hydrophobic interactions and an increased density

of salt-bridges or charge-assisted hydrogen bonds.19,20,22 In many

cases, a combination of different sequence or structure features was

found to be responsible for an increased thermostability.10,21,23

Sebastian Radestock’s current address is Computational Structural Biology Group, Max Planck

Institute of Biophysics, Max-von-Laue-Str. 3, 60438 Frankfurt am Main, Germany.

Additional Supporting Information may be found in the online version of this article.

*Correspondence to: Holger Gohlke, Universitatsstr. 1, 40225 Dusseldorf, Germany.

E-mail: [email protected].

Received 5 August 2010; Revised 28 September 2010; Accepted 7 November 2010

Published online 19 November 2010 in Wiley Online Library (wileyonlinelibrary.com).

DOI: 10.1002/prot.22946

ABSTRACT

We probe the hypothesis of corresponding states,

according to which homologues from mesophilic

and thermophilic organisms are in corresponding

states of similar rigidity and flexibility at their

respective optimal temperatures. For this, the

local distribution of flexible and rigid regions in

19 pairs of homologous proteins from meso- and

thermophilic organisms is analyzed and related

to activity characteristics of the enzymes by con-

straint network analysis (CNA). Two pairs of

enzymes are considered in more detail: 3-isopro-

pylmalate dehydrogenase and thermolysin-like

protease. By comparing microscopic stability fea-

tures of homologues with the help of stability

maps, introduced for the first time, we show that

adaptive mutations in enzymes from thermo-

philic organisms maintain the balance between

overall rigidity, important for thermostability,

and local flexibility, important for activity, at the

appropriate working temperature. Thermophilic

adaptation in general leads to an increase of

structural rigidity but conserves the distribution

of functionally important flexible regions

between homologues. This finding provides direct

evidence for the hypothesis of corresponding

states. CNA thereby implicitly captures and uni-

fies many different mechanisms that contribute

to increased thermostability and to activity at

high temperatures. This allows to qualitatively

relate changes in the flexibility of active site

regions, induced either by a temperature change

or by the introduction of mutations, to experi-

mentally observed losses of the enzyme function.

As for applications, the results demonstrate that

exploiting the principle of corresponding states

not only allows for successful thermostability

optimization but also for guiding experiments

in order to improve enzyme activity in protein

engineering.

Proteins 2011; 79:1089–1108.VVC 2010 Wiley-Liss, Inc.

Key words: rigidity theory; thermostability; flexibil-

ity; protein engineering; network; enzyme activity.

VVC 2010 WILEY-LISS, INC. PROTEINS 1089

Page 2: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

These structural changes contribute to the improvement

of the underlying network of noncovalent interactions

within the structure, presumably leading to an increase

in mechanical rigidity of the structure and a decrease

in flexibility.1

Motional properties of proteins with different thermal

stabilities have been compared using a variety of different

experimental techniques, including hydrogen/deuterium

(H/D) exchange, neutron scattering, and nuclear mag-

netic resonance (NMR) spectroscopy.24 As an intriguing

conclusion, the local distribution of flexible or rigid

regions in a thermophilic protein is similar to that of a

mesophilic homologue, provided that the proteins are

compared at their respective optimal temperatures. This

has led to the hypothesis that homologous mesophilic

and thermophilic proteins are in corresponding states of

similar rigidity and flexibility at their respective optimal

temperature.1,25 This should be especially true for

enzymes, where stability characteristics of catalytic and

binding site residues are an important factor in deter-

mining the activity.3 In fact, many thermophilic enzymes

have a maximal activity at elevated temperature, but are

far less active at ambient temperatures, where the struc-

tures are more rigid.26–29 Thus, rigidity is an important

factor for the integrity of the native folded structure,

whereas a certain degree of flexibility is required for

activity, although the general validity of this relation is

being discussed.30

The experimental characterization and quantification of

protein flexibility and rigidity are challenging, especially

in regard to the diverse types and timescales of motions

that are of interest.24,31 Moreover, changes in flexibility

due to sequence variations can be limited to a small part

of the structure only, which makes them difficult to

detect.32 Computational approaches are interesting alter-

natives. As an example, molecular dynamics (MD) simu-

lations provide information on the motional properties of

atoms in a protein structure. By performing simulations

at different temperatures or by running nonequilibrium

simulations, motional properties of the structure during

thermal unfolding can be analyzed.33–36 By comparing

equilibrium and nonequilibrium MD simulations, Creveld

et al.37 studied the unfolding of cutinase and distin-

guished two types of flexible regions: those that promote

unfolding and those that are important for enzyme activ-

ity. Thereby, unfolding regions were predicted where

mutations modulate thermostability and do not interfere

with enzyme activity. Despite these successes, such simula-

tions are computationally expensive, and it is not clear

whether force fields and solvent models are adequate for

use at high temperatures. Without doing expensive MD

simulations, moving protein parts can also be predicted

from a single protein conformation using methods related

to normal mode analysis (NMA).38 Using such an

approach, Su et al.39 studied changes in the fluctuation of

residues upon protein unfolding.

Both MD and NMA studies report on the mobility of

atoms by analyzing atomic fluctuations or the correlation

of movements, rather than flexibility and rigidity charac-

teristics of structural regions. Here, flexibility denotes the

possibility of motion within the region, that is, the region

can be deformed. In contrast, no relative motion is

allowed within rigid (structurally stable) regions, and

such regions can only move as a rigid body with six

degrees of freedom. Thus, flexibility and its opposite, ri-

gidity, are static properties that describe the possibility of

motion.40,41 The distinction between mobility and flexibil-

ity/rigidity becomes obvious in the case of a mobile rigid

body, such as a moving helix or domain.

In a previous study, we showed for a dataset of 20

pairs of mesophilic and thermophilic protein homologues

that protein thermostability can be predicted by directly

characterizing the mechanical rigidity of a protein struc-

ture during a thermal-unfolding simulation.42 The

approach is referred to as constraint network analysis

(CNA) and based on a graph-theoretical method that

determines rigidity and flexibility within a protein struc-

ture in atomic resolution.43 We showed that the

approach is helpful for understanding and exploiting the

relationship between microscopic structure and macro-

scopic stability and, thus, can be used in data-driven

protein engineering to increase protein thermostability by

introducing mutations at regions that are crucial for

macroscopic stability. These regions are referred to as

unfolding regions or weak spots.

Here, we go beyond the previous study, which related

protein thermal stability and rigidity by analyzing the

local distribution of flexible or rigid regions and relating

it to activity characteristics of enzyme structures. We

show that CNA implicitly captures and unifies many

different mechanisms that contribute to increased ther-

mostability and to activity at high temperatures. Two

pairs of enzymes from our data set are considered in

more detail: 3-isopropylmalate dehydrogenase (IPMDH)

and thermolysin-like protease (TLP). By comparing mi-

croscopic stability features of the meso- and thermophilic

protein homologues, we show that adaptive mutations of

thermophilic enzymes maintain the balance between

overall rigidity, important for thermostability, and local

flexibility, important for activity, at the respective tem-

perature at which the protein functions. Thus, direct

evidence is offered that supports the hypothesis of corre-

sponding states. To the best of our knowledge, the pre-

sent study is the first one probing this hypothesis for a

large data set by computational means. Moreover, unlike

other computational studies that probed this hypo-

thesis,35,36 protein rigidity and flexibility are characte-

rized directly. Our results demonstrate that exploiting the

principle of corresponding states not only allows for

successfully optimizing thermostability but also for guid-

ing experiments in order to improve enzyme activity in

protein engineering.

S. Radestock and H. Gohlke

1090 PROTEINS

Page 3: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

THEORY

In the following, we will briefly introduce the theoreti-

cal background of CNA. With this method, the mecha-

nical rigidity of a protein structure is characterized by

simulating the thermal unfolding of a protein.

Unfolding simulations on constraint networkrepresentations of proteins

Protein stability is determined by the large number of

interactions that contribute to the native-folded state.44

Thus, to theoretically study protein stability features, pro-

teins can be described as molecular networks.45,46 In the

present study, we go beyond describing proteins as topo-

logical networks. Instead, protein structures are repre-

sented as constraint networks, in particular molecular

frameworks.47 These networks are a subtype of 3D-net-

works in which vertices represent atoms and edges repre-

sent covalent and noncovalent bond and angle constraints.

As noncovalent bond constraints, hydrogen bonds, inclu-

ding salt bridges, and hydrophobic interactions are used.

The rigidity and flexibility within a constraint network

can be fully characterized using constraint counting.48 As

a result, the protein-structure network is decomposed into

rigid clusters and flexible regions. A rigid cluster is a set of

atoms that move together as a rigid body, that is, no inter-

nal movement is allowed. Otherwise, an atom is part of a

flexible region. The rigid cluster decomposition can be

calculated using the FIRST (Floppy Inclusion and Rigid

Substructure Topography) software.43

Thermal unfolding of the protein structure can now be

simulated by gradually removing noncovalent bond

constraints from the constraint network.42,49,50 Thus,

starting from a rigid network with a large number of

constraints, a flexible network is obtained with only a

few constraints left. Here, hydrogen bonds (including salt

bridges) are removed from the network in the order of

increasing strength, that is, only hydrogen bonds with an

energy Ehb � Ecut are retained for a certain network state.

This follows the idea that stronger hydrogen bonds will

break at higher temperatures than weaker ones.51 The

number of hydrophobic contacts is kept constant during

the thermal unfolding, because the strength of hydro-

phobic interactions remains constant or even increases

with increasing temperature.52 A new network is gene-

rated for each hydrogen bond that is removed, and the

rigid cluster decomposition is performed on this network.

Identifying the folded–unfolded transition

During the thermal-unfolding simulation, a phase

transition from a largely rigid to a largely flexible net-

work is observed. At the transition, the percolating rigid

cluster (the giant cluster) suddenly breaks and stops

dominating the system. The transition defines the rigidity

percolation threshold.53 Both proteins and glasses are

similar in that such a transition can be observed.50

However, the percolation behavior of protein networks is

usually more complex, and several transitions can be

observed. This is related to the fact that protein struc-

tures are modular, as they are assembled from secondary

structure elements, subdomains, and domains, which

often break away from the giant cluster as a whole. As

every protein fold has its unique number and pattern

of spatially arranged modules, a percolation behavior

characteristic for each fold is observed.

To describe the general percolation behavior of a

network, the microstructure of the network, that is, prop-

erties of the set of clusters generated by the bond-dilution

process are analyzed.54 Here, we choose the fraction of

the network belonging to the percolating rigid cluster as

an order parameter for characterizing the rigidity transi-

tion (the rigidity-order parameter, P1). The rigidity-order

parameter denotes the probability that an atom belongs to

the giant cluster and is zero in the floppy phase. Monitor-

ing the decay of the giant cluster by the rigidity-order

parameter provides a global and intuitive description of

the rigidity within the protein structure during thermal

unfolding. Although the percolation behavior of protein

networks is complex, the curves of the rigidity-order

parameter are similar to curves observed for network

models of glasses and amorphous solids.54,55

The first transition observed describes the break down

of the completely rigid network into a number of rigid

clusters. The last transition describes the loss of the

remaining piece of rigidity and the onset of a network

that is completely flexible, that is, rigid clusters cease to

exist. The first transition is referred to as rigidity percola-

tion transition, because the network is unable to transmit

stress afterward. However, the last transition is biologi-

cally most relevant, because it corresponds to the folded–

unfolded transition in experimental protein unfolding.42

To identify the last steps of the folded–unfolded transi-

tion that is relevant from a biological point of view,42 a

parameter for analyzing macroscopic properties of the net-

work associated with the rigid cluster-size distribution is

used.56 In this context, Andraud et al.57 introduced the

cluster configuration entropy (H) as a morphological

descriptor for heterogeneous materials. H has been

adapted from Shannon’s information theory and, thus, is

a measure of the degree of disorder in the realization of

a given state: as long as the giant cluster dominates the

system, H is low because of the limited number of possi-

ble ways to configure a system with a very large cluster.

Originally, H was defined as a function of the probability

that an atom is part of a cluster of size s (s-cluster).57

This s-cluster configuration entropy definition has been

shown to be particularly useful for identifying early steps

of the transition of protein networks between largely

rigid states and partially flexible states.

However, we are interested in identifying the late steps

of the transition between partially rigid states, where a

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1091

Page 4: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

rigid core in the structure network is still present, and

largely flexible states, where the core is broken. This is in

accordance with the idea that the core then represents

the folding nucleus that dominates the transition state in

experimental protein (un)folding. To identify these steps

of the transition, we used a configuration entropy

H ¼ �X

s

ws lnws; ð1Þ

where the probability ws that an arbitrarily occupied site

belongs to a s2-cluster is given by

ws ¼ s2nsPs

s2nsð2Þ

and the normalized cluster number ns is

ns ¼ number of clusters of size s

N: ð3Þ

N equals the total number of atoms

Our definition of the configuration entropy is useful

for identifying the temperature at which the giant cluster

finally stops to dominate the system, that is, where a

transition takes place that corresponds to the folded–

unfolded transition in protein unfolding. The tempera-

ture at which this transition occurs is referred to as the

phase-transition temperature (Tp), which can be related

to the experimental melting temperature (Tm). See

Methods section for a description as to how the phase-

transition temperature is determined.

RESULTS AND DISCUSSION

Data set of mesophilic and thermophilichomologues

The hypotheses underlying the present study are that

thermostability comes along with an increased mechani-

cal rigidity of the structure and that mesophilic and ther-

mophilic proteins are in corresponding states of similar

rigidity and flexibility at their respective optimal tempe-

rature. To probe these hypotheses, a data set containing

homologous protein structures from mesophilic and ther-

mophilic organisms was constructed (Table S1).42 The

structures in the data set fulfill stringent quality criteria

based on previous results that indicate that a good struc-

tural quality is necessary for successful rigidity analysis.43

The data set consists of 19 pairs of protein structures

from 19 different protein families (Table S1). The corre-

sponding mesophilic and thermophilic proteins within

each family are highly similar, with sequence identities

varying from 37 to 88% and backbone root mean square

deviation (rmsd) values between 0.6 and 2.5 A. Higher

rmsd values were only obtained for proteins of families 5

(elongation factor TU) and 11 (phosphoglycerate kinase)

due to differences in the configuration of domains.

If considered separately, the domains share a high-struc-

tural similarity (rmsd < 2.5 A).

The high sequential and structural similarities point to

close relationships with respect to the three-dimensional

topology of the molecules. The presence of conserved

catalytic residues suggests that the homologues are closely

related, if not identical, from a mechanistic point of

view, too. Sequence alignments of 3-IPMDH and TLP are

shown in the Supporting Information (Figs. S1 and S2),

where catalytic and other functionally important residues

are highlighted.

Ideally, the melting temperature (Tm) of a protein

should be used as a descriptor of thermostability. Unfortu-

nately, only a few proteins in our data set have melting

temperatures currently available, because the Tm of many

proteins cannot be determined experimentally. The

limited number of proteins with known Tm, thus, forced

us to use the optimal growth temperature of the corre-

sponding organism (Tog) as a descriptor of thermostability.

Optimal growth temperatures have previously been used in

studies comparing meso- and thermophilic proteins.16

Macroscopic stability is characterizedby the final decay of a rigid core

All protein structures were subjected to thermal-

unfolding simulations using CNA. CNA is based on a

rigidity analysis of protein constraint networks, as imple-

mented in the FIRST software.43 This method is remark-

ably fast, and unfolding trajectories can be produced

within a few minutes. Using CNA, global (macroscopic)

stability is characterized by monitoring the size of the

largest rigid cluster (the giant cluster) and the rigid

cluster-size distribution within the network as the

thermal-unfolding simulation proceeds. These properties

are described by the rigidity-order parameter (P1) and

the s2-cluster configuration entropy (H), respectively.

From these quantities, the temperature of a biologically

relevant phase transition (Tp) is obtained that relates to

the protein’s melting temperature (Tm).42 Here, only

differences in the phase-transition temperatures of

homologous proteins are considered.42

The rigidity-order parameter plots of two representa-

tive pairs of proteins from our data set, IPMDH and

TLP, are shown in Figure 1(a,b). As described in the

Theory section, the curves characterize the general perco-

lation behavior of the constraint networks during unfold-

ing. In respect to the pattern of transitions, the rigidity-

order parameter curves of thermophilic proteins are

almost identical to their mesophilic counterparts. Never-

theless, the rigidity-order parameter curves of thermo-

philic proteins are shifted to higher temperatures (except

in low-temperature regions). From a global point of

view, these observations provide initial support to the

S. Radestock and H. Gohlke

1092 PROTEINS

Page 5: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

hypothesis of corresponding states, according to which

mesophilic and thermophilic enzymes are in correspon-

ding states of similar rigidity and flexibility at their

respective optimal temperature.

The s2-cluster configuration entropy plots of the two

representative pairs of proteins from our data set are

shown in Figure 1(c,d). The cluster configuration entropy

is used to detect the rigid-to-flexible phase transition

that is biologically relevant, that is, where the last rigid

region integrating multiple secondary structure elements

breaks, and the structure is no longer dominated by a

rigid cluster. This melting is associated with small

changes in the topology of the constraint network, which

have a large effect on the rigid cluster-size distribution.

Accordingly, the curve shows a steep increase at the

phase-transition temperature, where the relevant transi-

tion takes place. As can be seen from Figure 1(c,d), the

thermophilic protein has a higher phase-transition

temperature compared to its mesophilic homologue.

Results for all pairs of protein structures from the data

set are given in Table I. Notably, for approximately two

thirds of the 19 different protein families, the Tp of the

thermophilic protein is predicted to be higher than that

of the mesophilic counterpart, as exemplarily demon-

strated for IPMDH and TLP. This shows that the rigid

Figure 1Rigidity-order parameter (a, b) and cluster configuration entropy (c, d) versus temperature for the mesophilic and the thermophilic form of 3-

isopropylmalate dehydrogenase (IPMDH) (a, c), and thermolysin-like protease (TLP) (b, d). The solid line denotes the parameter for the

mesophilic protein, and the dashed line denotes the parameter for the thermophilic protein.

Table IDifferences in experimental Tog values versus differences in computed

phase-transition temperatures (Tp) between a mesophilic protein and its

thermophilic homologue.

Protein family DToga DTp

b

No. Name

1 Pyrrolidone carboxyl peptidase 53.0 18.02 Aspartate aminotransferase-like 35.5 27.23 b-Glucanase 20.0 211.34 Xylose isomerase 34.5 11.55 Elongation factor TU 35.5 9.76 Signal recognition particle (SRP) 34.0 15.37 Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) 15.5 20.48 Methylguanin-DNA methyltransferase 58.0 10.99 Carboxypeptidase 18.0 7.210 3-Isopropylmalate dehydrogenase (IPMDH) 38.0 15.311 Phosphoglycerate kinase 43.0 1.112 Cyclodextrin glycosyltransferase 25.0 29.813 Carbonic anhydrase 13.0 4.914 Cellulase 20.0 216.815 Thermolysin-like protease (TLP) 25.0 23.616 Subtilase 34.0 2.817 Endocellulase 20.0 225.918 Histidine carrier protein 18.0 16.219 Adenylate kinase, zinc finger 18.0 24.7

aOptimal growth temperature difference of the corresponding species, in K.bIn K. DTp is marked in bold if the phase-transition temperature of the thermo-

philic protein was predicted to be higher than that of the mesophilic counterpart.

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1093

Page 6: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

protein core of a thermophilic protein is more resistant

to high temperatures than that of a mesophilic protein,

which lends immediate support to the hypothesis that

thermostability is linked to an enhanced structural rigi-

dity of the folded native state. Values in Table I differ

slightly from previously published ones,42 because, in

the present study, a modified, fully automated method

for identifying the phase-transition temperature was

used (see Methods section). This method yields phase-

transition temperatures that are determined in a more

consistent manner across the whole data set.

In approximately one third of the investigated pairs of

homologues, a higher phase-transition temperature for

the thermophilic protein could not be found. The follow-

ing reasons may account for this. First, in a few cases,

mesophilic proteins have been reported to be as thermo-

stable (if not more thermostable) than their thermophilic

counterpart.58 Second, extrinsic factors such as glycosyla-

tion, salt, pressure effects, or solvent viscosity may

further influence thermostability.59 As only for a few

proteins in our data set melting temperatures are avail-

able, let alone data on the dependency of Tm on the sol-

vent viscosity, we are currently unable to investigate this

solvent influence any further. Finally, rigidity analysis is

sensitive with respect to deficiencies in the structures43;

in particular, as only one static representation per struc-

ture is analyzed. Analyzing multiple representations

instead, for example, generated by MD simulations, and

averaging over the results is expected to alleviate deficien-

cies in the constraint model and, hence, led to more

robust results.60

Identifying sites that are importantfor macroscopic stability

To understand protein structure adaptations that lead

to higher thermostability, we next consider microscopic

properties of the protein structures in the context of the

macroscopic percolation behavior of the constraint

networks. Here, the unfolding simulations of IPMDH

and TLP are considered in detail, and local structural

regions are identified at which the unfolding starts. The

identified unfolding regions are compared to experimen-

tal data.

3-Isopropylmalate dehydrogenase

3-Isopropylmalate dehydrogenase (IPMDH) is an

enzyme in the leucine biosynthesis pathway that catalyzes

the dehydrogenation and concomitant decarboxylation of

3-isopropylmalate (IPM) substrate, using nicotinamide

adenine dinucleotide (NAD1) as a cofactor. In the pres-

ent study, IPMDH from E. coli and T. thermophilus is

compared. IPMDH is a functional dimer, composed of

two identical subunits (SU), each with �350 amino acid

residues.61 The polypeptide chain of one subunit is

folded into two domains with similar topologies based

on parallel a/b motifs: domain 1 contains the N- and

the C-terminus, and domain 2 contains the subunit

interface. The active site is located in a cleft between the

two domains and is made up of residues from both sub-

units.62

In Figure 1(c), cluster configuration entropy plots are

shown for mesophilic and thermophilic IPMDH. Meso-

philic and thermophilic IPMDH show a corresponding

percolation behavior. The biologically relevant transitions,

corresponding to the folded–unfolded transitions in pro-

tein unfolding, can be identified at 340 and 355 K for

the E. coli and the T. thermophilus IPMDH, respectively.

The difference in experimental optimal growth tempera-

tures of the corresponding species is 38 K (Table I). The

rigid-cluster decomposition of the networks before and

after the phase transition is shown in Figure 2. The

events that take place at the unfolding transitions in

E. coli and T. thermophilus IPMDH are quite similar:

before the transition, the giant cluster extends over the

subunit-interface and the domain-interfaces, although

only a few residues of domain 1 are incorporated by the

giant cluster [Fig. 2(a,c)].

After the transition, the giant cluster does not cover

the subunit-interface anymore [Fig. 2(b,d)]. Likewise, the

rigid connection between domains 1 and 2 is lost. The

close correspondence of the rigid cluster distribution in

the constraint networks of the homologues before and

after the phase transition is a striking result of our analy-

sis. Particularly, the location of the giant cluster is similar

within the pair of homologues. As for the size of the

giant cluster, the one in the thermophilic structure just

before the phase transition is larger than the one in the

mesophilic protein. This is in agreement with the experi-

ment, where it has been shown that the folding inter-

mediate of the E. coli structure only retains 50% of

the original secondary structure, whereas that of the

T. thermophilus IPMDH retains 90%.64,65

By comparing the structure networks before and after

the phase transition, we can identify microscopic struc-

tural features, where the unfolding starts, so-called

unfolding regions or weak spots. The decay of the giant

cluster in IPMDH initially causes residues at the domain

interface, in domain 1, and at the subunit interface

(domain 2) to become flexible. In Figure 2(b), these

regions are indicated by arrows. The unfolding regions

identified by the thermal-unfolding simulation of E. coli

IPMDH have been compared to experiment (Table II).

Encouragingly, sites within the predicted regions have

also been successfully targeted by experiment to increase

thermostability of various mesophilic66–71 and chimeric

IPMDH.72–77 The experimental examination of thermo-

philic IPMDH has further revealed that most of the sites

that are important for thermostability are at the subunit-

interface,70,71,78 the domain-interface,73,79–81 and in

domain 1.82,83 These regions are also detected

as unfolding regions in our simulation for both the

S. Radestock and H. Gohlke

1094 PROTEINS

Page 7: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

mesophilic and the thermophilic protein (Table II).

Overall, the agreement between the sites predicted by

CNA and the experimentally identified unfolding regions

is good. However, weak agreement is found for the

domain 1 region, of which only a few residues are part

of the giant cluster, although mutations in this domain

have been shown to influence protein stability.82,83

Thermolysin-like protease

Thermolysin-like proteases (TLP) are members of the

peptidase family M4 of which thermolysin is the proto-

type.84 Thermolysin originates from B. thermoproteo-

lyticus. TLP are proteases and require a Zn21 ion for

their activity. They contain multiple Ca21 ions, which

Table IIComparison of predicted with experimentally verified unfolding sites for mesophilic

3-ispropylmalate-dehydrogenase (IPMDH)

Structure regionResidues in predictedunfolding regionsa

Experimentally verifiedthermostabilizing mutationsa,b

I Subunit interface 151–163, 165–167, 173, 185–200,215, 219–224, 241–248, 255, 256

191, 194, 200, 204, 256, 259

II Domain interface 105–112, 133–143, 174–175, 301–305c 98, 112, 114–116, 118, 120,131, 140, 182, 271, 295, 311

III Domain 1 — 20, 24, 42, 57, 60–62, 84, 86,92, 350, 356–360

aNumbering in the alignment, as shown in Figure S1.bSee text for details. According to ref. 66–74,77,79,81–83.cPreviously,42 these residues have been classified as belonging to the domain 1 region.

Figure 2Snapshots from the thermal unfolding simulation of meso- (a, b) and thermophilic (c, d) 3-isopropylmalatdehydrogenase (IPMDH) just before

(a, c) and after the phase transitions (b, d) at 340 and 355 K, respectively. The rigid cluster decomposition of the network is shown. The giant

cluster is shown in blue. Other clusters are shaded black. Arrows in (b) indicate potential unfolding nuclei. Roman numbers refer to the numbering

of the unfolding nuclei in Table II. The subunits and domains are marked. The locations of the active sites are indicated by asterisks. Figures were

generated using PyMOL.63 [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1095

Page 8: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

have been reported to be important for stability.85 TLP

consists of an N-terminal (mainly b-sheet) and a C-termi-

nal (mainly a-helical) domain. Both domains are con-

nected by a central a-helix (residues 139–154), which is

located at the bottom of the active site cleft and contains

several of the catalytically important residues. In the pre-

sent study, mesophilic TLP from B. cereus and thermo-

philic thermolysin is compared. The structures share a

high sequence and structure similarity (Table S1). The two

enzymes differ in their thermostability and in the level of

enzyme activity, when measured at the same temperature.

The simulated thermal-unfolding behavior of B. cereus

TLP and thermolysin is similar, as expressed by a corre-

sponding percolation behavior. The analysis of the cluster

configuration entropy plot revealed that the biologically

relevant transitions that correspond to the folded–

unfolded transitions can be identified at 350 and 373 K,

respectively [Fig. 1(d)]. The difference in experimental

optimal growth temperatures of the corresponding species

is 25 K (Table I). In Figure 3, the rigid cluster decomposi-

tion of the networks of B. cereus TLP and thermolysin,

respectively, is shown directly before and after the phase

transition. Before the transition, the giant cluster domi-

nates the system in both cases. Moreover, the giant cluster

is located in the same region of the proteins: it extends

over the N-terminal domain and comprises the b-sheet

region and an a-helix in the N-terminal domain (residues

67–89). The size of the giant cluster in thermolysin differs

from that reported previously.42 This is because, in the

present study, a modified method to determine the phase-

transition temperature was applied (see Methods section).

In addition to these common features, some differences

also exist. The networks also contain several smaller rigid

clusters, mainly located in the C-terminal domain, but

also in the N-terminal one. The size and number of these

clusters vary, with more and larger rigid clusters existing

in B. cereus TLP compared to thermolysin.

After the phase transition, the giant cluster decays into

smaller rigid clusters and regions that are flexible.

Interestingly, the decay of the giant cluster is in the N-termi-

nal domain, where the main site of proteolytic degradation

is assumed. Thus, the thermal-unfolding simulation predicts

that the global unfolding starts at similar sites as observed

for local-unfolding events that lead to kinetic degradation.

Comparing the constraint networks before and after

the phase transition allows identifying the microscopic

structural origins of the thermostability of thermolysin.

For both mesophilic TLP and thermolysin, similar

unfolding regions can be identified. The first region

comprises almost all residues of the b-sheet in the N-ter-

minal domain, the second one a Ca21-binding site (resi-

dues 56, 58, 60, and 68), the hydrophobic region around

Figure 3Snapshots from the thermal unfolding simulation of meso- (a, b) and thermophilic (c, d) thermolysin-like protease (TLP) just before (a, c) and

after the phase transitions (b, d) at 350 and 373 K, respectively. The rigid cluster decomposition of the network is shown. The giant cluster is

shown in blue. Other clusters are shaded black. Arrows in (b) indicate potential unfolding nuclei. Roman numbers refer to the numbering of the

unfolding nuclei in Table III. The N- and C-terminal domains are marked. The location of the active site is indicated by an asterisk. [Color figure

can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

S. Radestock and H. Gohlke

1096 PROTEINS

Page 9: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

Phe 63, and the N-terminal residues of the a-helix in the

N-terminal domain (residues 67–89). The predicted

unfolding regions are in good agreement with sites where

stabilizing mutations have successfully been introduced

experimentally into TLP to increase thermostability

(Table III).86–94 This holds true in particular for

the unfolding region comprised residues 56–70 in the N-

terminal domain (region II). Little correspondence, how-

ever, was found for the residues of the b-sheet region in

the N-terminal domain (region I). No unfolding regions

were found in the C-terminal domain. This is in agree-

ment with the experimental studies showing that muta-

tions at various sites in the C-terminal domain only have

a marginal effect on thermal stability.93,95–99

For both IPMDH and TLP, we found a strong agree-

ment between the events that take place at the unfolding

transition in the mesophilic enzyme and its thermophilic

homologue. In particular, the location and the size of the

giant cluster were comparable. In both the mesophilic

and the thermophilic enzyme, corresponding unfolding

regions were identified. The only difference in the global-

unfolding behavior between the mesophilic and the ther-

mophilic protein was the higher thermoresistance of the

unfolding regions. Apparently, thermostability is deter-

mined by the temperature dependence of local changes

in flexibility and rigidity of the systems.

Microscopic stability is determinedby the emergence of flexible regions

By analyzing the decay of the giant cluster, we identi-

fied the above unfolding regions that are important for

macroscopic stability, because their breaking determined

the global folding–unfolding transition. This demon-

strates that, within a rigid cluster, a hierarchy of regions

of varying stability can exist.50,100 Revealing such hier-

archies in detail provides information about microscopic

stability features, which can be associated with local

unfolding events. To characterize microscopic stability

features from the thermal-unfolding simulation, ‘‘stability

maps’’ are introduced here. For this, ‘‘rigid contacts’’

between two residues, represented by their Ca atoms, are

identified. A ‘‘rigid contact’’ exists if two residues belong

to the same rigid cluster. During a thermal-unfolding

simulation, ‘‘stability maps’’ are then constructed in that,

for each residue pair, Ecut is identified at which a ‘‘rigid

contact’’ between two residues is lost. [Note that, by Eq.

(4), Ecut also relates to a temperature at which the ‘‘rigid

contact’’ is lost.] That way, a contact’s stability relates to

the microscopic stability in the network, and, taken

together, the microscopic stabilities of all residue–residue

contacts result in a stability map. Stability maps are

comparable to cooperativity correlation plots used in

related studies.101,102 However, stability maps provide

direct information on the distribution of rigidity and flexi-

bility within the system, and they also identify regions that

are flexibly or rigidly correlated across the structure. In

turn, cooperativity correlation plots are calculated from an

ensemble of realizations of a given state of the protein, and

they primarily identify regions that are correlated across

the entire ensemble of these realizations.101,102

In Figures 4 and 5, the stability maps for IPMDH and

TLP are shown, respectively. The upper and lower tri-

angles of the matrices display the stability maps for the

mesophilic and the thermophilic homologues, with corre-

sponding rows and columns determined by a structural

alignment. Gaps in the structural alignment are colored in

gray. Relative microscopic stability values are color coded

in that contacts that are stable at temperatures above the

phase-transition temperature are shown in red, whereas

contacts that have already been broken when this tempera-

ture is reached are shown in white. For both IPMDH and

TLP, the size and the localization of the giant cluster at

temperatures slightly below the respective phase-transition

temperature are similar for the mesophilic and the ther-

mophilic proteins. Moreover, the size and the localization

of many other rigid clusters are corresponding.

Histograms of the differences between microscopic

stability values of rigid contacts in thermophilic IPMDH

and TLP and microscopic stability values of correspon-

ding rigid contacts in the mesophilic proteins are shown

in Figure S3(a,b), respectively. A negative stability diffe-

rence indicates that a rigid contact in the thermophilic

protein is stronger than that in the mesophilic protein.

For both IPMDH and TLP, stability differences mostly

range from 0 to 22.0 kcal mol21, with �55 and 75%,

Table IIIComparison of predicted with experimentally verified unfolding sites for mesophilic thermolysin-like

protease (TLP)

Structure regionResidues in predictedunfolding regionsa

Experimentally verifiedthermostabilizing mutationsa,b

I b-Sheet in the N-terminal domain 21–24, 29, 31–34, 39–42, 44,101–107, 114–115, 122–123

19, 38, 104, 120, 134, 142, 183

II Ca21-binding site, hydrophobic regionaround Phe 63, and N-terminus of thea-helix in the N-terminal domain

54, 56–62, 68–70 4, 8, 57, 59, 61, 64, 66–70

aNumbering in the alignment, as shown in Figure S2.bSee text for details. According to ref. 86–94.

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1097

Page 10: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

respectively, of the contacts in the range of 0 and

21.0 kcal mol21. Thus, it becomes apparent that, for the

thermophilic protein, the majority of rigid contacts are

stronger when compared with the mesophilic homologue,

but that the increase in strength per contact is moderate.

To finally quantify the relation of microstability features

of those pairs of meso- and thermophilic homologous

proteins in our data set for which the phase-transition

temperature of the thermophilic protein was predicted to

be higher than that of the mesophilic counterpart, corre-

lation coefficients were calculated for the respective stabi-

lity maps (Table IV). These values range from 0.25 to

0.90, with an average value of 0.53 � 0.19, showing weak

to strong correlations in all cases. These correlations are

significant with P-value � 0.05. For example, scatter plots

of the microscopic stability values of rigid contacts in

mesophilic versus thermophilic IPMDH and TLP, respec-

tively, are shown in Figure S4(a,b).

Overall, these results demonstrate that microscopic

stability features are conserved between mesophilic and

thermophilic homologues. In agreement with what has

been found when analyzing the general percolation beha-

vior based on global parameters, the results demonstrate

that mesophilic and thermophilic proteins are in corre-

sponding states of similar rigidity and flexibility, even at

the microscopic level. Such a conservation of microscopic

stability features strongly suggests that the features are

important for enzyme function and activity.1,24,103

Relating microscopic stability andenzyme activity

Experimentally, the interplay between flexibility, stabi-

lity, and activity has been studied by measuring kinetic

parameters that describe enzyme function and catalysis

(Km and kcat).3 However, these parameters do not

provide insights at an atomic level into the importance

of flexibility and rigidity for enzyme activity. A

more direct view on protein flexibility and microstability

is obtained by biophysical techniques that analyze

Figure 4Combined stability map for E. coli and T. thermophilus IPMDH subunits (SU) 1 and 2. In the upper (lower) half of the matrix, stability

information for the mesophilic (thermophilic) protein is shown. Relative microscopic stability values are color coded: contacts that are stable at

temperatures above the phase-transition temperature are shown in red, and instable contacts are shown in white. Gaps in the alignment of both

structures or residues that do not form any rigid contacts are shown in gray. Catalytic residues are highlighted by black arrows, and other

functionally important residues are highlighted by gray arrows. Contacts between the residues in the giant cluster (at a temperature just below Tp)

are surrounded by a solid line. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

S. Radestock and H. Gohlke

1098 PROTEINS

Page 11: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

dynamical properties of the structure, such as H/D

exchange, neutron scattering, and NMR spectroscopy.24

However, even then, it remains difficult to quantify the

influence of flexibility and rigidity due to the different

types of motions and timescales detected. Such difficulties

may explain why, in some cases, thermophilic enzymes

were unexpectedly found by H/D exchange to display a

similar,104,105 or even increased, flexibility when com-

pared with their mesophilic homologues.58,106–108

Using CNA, no assumption needs to be made with

respect to timescales and types of motions. This method

allows studying rigidity and flexibility in atomic resolu-

tion and thus characterizing microscopic stability fea-

tures. Here, we analyze how adaptive mutations of ther-

mophilic enzymes maintain the balance between rigidity,

important for macroscopic stability, and local flexibility,

important for activity, regardless of the temperature at

which the protein functions. The role of local flexibility

is most apparent for regions involved in substrate bind-

ing and product release.

Figure 5Combined stability map for B. cereus thermolysin-like protease (TLP) and thermolysin. In the upper (lower) half of the matrix, stability

information for the mesophilic (thermophilic) protein is shown. Relative microscopic stability values are color coded: contacts that are stable at

temperatures above the phase-transition temperature are shown in red, and instable contacts are shown in white. Gaps in the alignment of both

structures or residues that do not form any rigid contacts are shown in gray. Catalytic residues are highlighted by black arrows, and other

functionally important residues are highlighted by gray arrows. Contacts between the residues in the giant cluster (at a temperature just below

Tp) are surrounded by a solid line. Contacts between the residues belonging to the a-helix in the N-terminus are surrounded by a dashed line.

The latter region is shown enlarged. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Table IVCorrelation between the stability maps from mesophilic

and thermophilic homologues

Protein familya rb

1 0.464 0.305 0.486 0.558 0.699 0.3510 0.6511 0.6612 0.3313 0.5315 0.7416 0.2518 0.90Mean 0.53 � 0.19

aNumbers refer to family numbers in Table I. Only those families were considered

for which the phase-transition temperature of the thermophilic protein was pre-

dicted to be higher than that of the mesophilic counterpart.bPearson’s correlation coefficient.

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1099

Page 12: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

3-Isopropylmalate dehydrogenase

The active site of 3-isopropylmalate dehydrogenase

(IPMDH) is located in a cleft between the two subunits

and is made up of residues from both subunits.62 The

catalytic residues Lys 195 and Asp 227 belong to one

subunit, whereas Tyr 145 belongs to the other (the num-

bering refers to the alignment shown in the Supporting

Information). The catalytic residues are indicated by

black arrows in Figure 4. The catalytic residues are

located in domain 2 of the enzyme, which has an impor-

tant role for macroscopic stability (see above). As illus-

trated in Figure 6(a,d) for meso- and thermophilic

IPMDH, respectively, the active site is in the immediate

vicinity of the giant cluster at the working temperature

(see Methods section) of the corresponding enzyme (337

and 352 K, respectively), indicating that the catalytic resi-

dues are in a relatively stable region of the protein.

Examining the stability map in Figure 4 reveals that

contacts between the catalytic residues and their neigh-

boring residues are relatively unstable when compared

with contacts between residues in the giant cluster. (In

Fig. 4, contacts between residues of the giant cluster are

surrounded by a solid line.) Thus, the catalytic residues

are not part of the giant cluster, but located in a locally

flexible region, allowing a certain degree of flexibility for

them, which is anticipated to be necessary for catalysis.

Still, because of the vicinity to the giant cluster, a certain

degree of global stability is retained.

As shown for the thermophilic enzyme, this view is

corroborated in that the microstability values of the

active site show that these residues become part of the

giant cluster as early as at temperatures 7–10 K below

the respective phase-transition temperature [Figs. 4 and

6(c)]. Thus, at the working temperature of the meso-

philic enzyme (337 K), the active site of the thermophilic

homologue is rigidified. It is highly likely that this is the

Figure 6Active site of meso- (a, b) and thermophilic (c, d) 3-isopropylmalate dehydrogenase (IPMDH) at the working temperature of the mesophilic

enzyme (337 K) (a, c) and the working temperature of the thermophilic enzyme (352 K) (b, d). The rigid cluster decomposition of the network is

shown. The giant cluster is shown in blue. Other clusters are shaded black. Catalytic residues are shown in red, and other functionally important

residues are shown in green.

S. Radestock and H. Gohlke

1100 PROTEINS

Page 13: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

reason why the T. thermophilus IPMDH has been found

to be only marginally active then.29 In turn, at the work-

ing temperature of the thermophilic enzyme (352 K), the

active site of the mesophilic homologue is identified to

be completely flexible and, hence, has lost all structural

stability [Fig. 6(b)].

Aside from the microscopic stability of the catalytic

residues, the rigidity and flexibility characteristics of resi-

dues involved in substrate and cofactor binding are

expected to have a prominent role for the activity of

IPMDH, even more so, because, upon binding of IPM

and NAD1, IPMDH shows an induced fit of the binding

site. Especially, the cofactor binding triggers conforma-

tional changes in the immediate vicinity of both the

adenosine- and the nicotinamide-binding sites: the Ca

atoms of residues 14–18, 284–299, and 334–347 in

domain 1 and 261–266 at the domain interface move by

up to 2.5 A to bind the adenine moiety (the numbering

refers to the alignment shown in Fig. S1).62,109 In turn,

residues 77–91 in domain 1 move at the nicotinamide-

binding site.109 These residues are indicated by gray

arrows in Figure 4. All these residues are located in loops,

with the exception of residues 16–18 and 90–91, which

are located at the amino terminal ends of a-helices. Thelongest of the moving segments comprises residues 284–

299. This loop closes over an analog of NAD1, ADP-

ribose, when bound to IPMDH, allowing Asp 298 and

Leu 290 to bind the adenine moiety.62

As a prerequisite for these movements, contacts

between residues in the moving regions and neighboring

regions need to be unstable. This is shown in the stability

map (see Fig. 4), where it can be seen that the microscopic

stabilities of these residues are in fact low: residues 14–18

and 77–91 become rigidified (i.e., they are integrated into

the giant cluster) only at temperatures 10–20 K below the

phase-transition temperature [Fig. 6(c)]. At this stage, the

regions cannot move toward the binding site anymore,

and the closed conformation of IPMDH cannot be

adopted, which may be another reason why the thermo-

philic enzyme is only marginally active at the temperature

of maximal activity of the mesophilic enzyme [Fig. 6(a)].

Notably, cold-active forms of T. thermophilus IPMDH

have been engineered by introducing mutations into

exactly these fragments in order to improve binding by

favoring the closed conformation.110–113

Thermolysin-like protease

The active site of TLP is located in a cleft between the

N- and the C-terminal domain. TLP binds a single-cata-

lytic Zn21 ion. Two mechanisms of action have been

proposed for TLP: one in which Glu 144 acts as a proton

acceptor in catalysis,114,115 and a second in which His

232 acts as a general base instead of the glutamate (the

numbering refers to the alignment shown in Fig.

S2).116,117 In either mechanism, residues Glu 144, His

232, Tyr 158, and a Zn21 bound water play important

roles during catalysis.84 The Zn21 is coordinated by two

histidine residues (His 143 and His 147), a glutamate

(Glu 167), and a water molecule. Most of the catalytic

and Zn21-coordinating residues belong to the central a-helix (residues 139–154) that represents the bottom of

the binding site.84 The catalytic and Zn21 coordinating

residues are indicated by black arrows in Figure 5.

For TLPs, a hinge-bending motion has been postulated

to be important for substrate binding and catalysis. In

this context, the central a-helix (residues 139–154) has

been identified as a flexible joint.118 Particularly, residues

Gly 136 and Gly 137 at the N-terminus of this a-helixare important.119 In addition, Gly 79 in the center of an

a-helix in the N-terminal domain (residues 67–89) has

been found to be important for hinge bending.119–121

These residues are indicated by gray arrows in Figure 5.

Mutations of residue 79 influenced substrate binding by

altering mobility in the binding cleft. Mutations of

residues 136 and 137 affected motions directly involved

in catalysis and, hence, influenced kcat.

The microscopic stabilities of the functionally impor-

tant residues 79, 136, and 137 can be determined from the

unfolding simulation. To this aim, the stability map of

meso- and thermophilic TLP in Figure 5 is examined. It is

expected that contacts between the functionally important

residues and their neighboring residues are unstable, thus

allowing for a local flexibility. For Gly 79, this can be

observed primarily in the case of mesophilic TLP: contacts

between residue 79 and other residues of the long a-helixin the N-terminal domain (residues 67–89) are relatively

unstable compared to mutual contacts of other residues

within the a-helix. (In Fig. 5, contacts between these resi-

dues are surrounded by a black-dashed line, and this

region is shown enlarged.) The result shows that the influ-

ence of residue 79 is limited to a local region, most likely

because Gly 79 acts as a a-helix breaker.The importance of residues 136 and 137 as hinge resi-

dues can also be understood from examining the stability

map. Contacts between the residues of the giant cluster

and contacts between residues of a large cluster in the C-

terminal domain 138-208 are relatively stable, whereas

contacts between residues 125–137 are relatively unstable.

At temperatures just below the respective phase-transition

temperature, residues 125–137 form a flexible region that

assures that the giant cluster does not extend into the C-

terminal domain and does not fuse with residues 138–

208 forming the other cluster. That way, residues 125–

137 can act as a hinge. As shown in Figure 7(a,d) for

meso- and thermophilic TLP, respectively, some of the

catalytic and Zn21 coordinating residues belong to the

large cluster in the C-terminal domain at the working

temperature (see Methods section) of the corresponding

enzyme (342 and 364 K, respectively). Thus, these resi-

dues are in a locally rigid region, which can nevertheless

move globally due to the hinge. The global mobility and

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1101

Page 14: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

local rigidity characteristics have been confirmed

by recent H/D exchange experiments that were per-

formed to study the conformational dynamics of free

thermolysin.122

As shown for thermolysin, at temperatures 20–30 K

below the phase-transition temperature, the global mobi-

lity is lost, because the hinge region rigidifies [Fig. 7(c)].

This happens because contacts are then established

between residues of the hinge (125-136) and residues of

the giant cluster in the C-terminal domain (137-208).

This result explains experimental findings that show that

thermolysin is only marginally active at a temperature

that corresponds to the Tog of the mesophilic enzyme, 25

K below its own Tog.123 In turn, at a temperature where

the thermophilic enzyme is predicted to be fully active

(364 K), the active site of the mesophilic homologue is

completely flexible [Fig. 7(b)].

The importance of the flexibility of the hinge region

has also been confirmed by mutational studies.90,124

Mutating Leu 145 by Ser increased kcat and, thus,

improved the enzyme’s activity but did not influence the

thermostability of the protein.124 On the other hand,

mutating Gly 120 to Glu improved the activity, too, but

had a negative effect on macroscopic stability.90 These

observations can be understood with the help of the

CNA results: whereas the mutation of residue 145 was

performed in a region that is not important for macro-

scopic stability, the mutation of residue 120 targets an

identified unfolding region. Apparently, activity could be

modulated independently from stability when a mutation

was introduced at a site that does not correspond to an

unfolding region. It is encouraging to note that CNA

allows understanding this complex relationship between

activity and thermostability.

Figure 7Active site of meso- (a, b) and thermophilic (c, d) thermolysin-like protease (TLP) at the working temperature of the mesophilic enzyme (342 K)

(a, c) and the working temperature of the thermophilic enzyme (364 K) (b, d). The rigid cluster decomposition of the network is shown. The giant

cluster is shown in blue. Other clusters are shaded black. Catalytic residues are shown in red, and other functionally important residues are shown

in green.

S. Radestock and H. Gohlke

1102 PROTEINS

Page 15: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

CONCLUSIONS

On the basis of the constraint network representations

of proteins, and by applying rigidity theory, we analyzed

the local distribution of flexible and rigid regions in 19

pairs of homologous proteins from meso- and thermo-

philic organisms and related it to activity characteristics

of the enzyme structures. The present study thus extends

a previous one by us, which related protein structural ri-

gidity and thermostability.42

From a biological point of view, the most intriguing

result of our study is that adaptive mutations of thermo-

philic enzymes maintain the balance between overall ri-

gidity, important for thermostability, and local flexibility,

important for activity, at the respective temperature at

which the protein functions: our results demonstrate that

thermophilic adaptation leads to an increase of structural

rigidity in general, but that the distribution of function-

ally important flexible regions is conserved between

meso- und thermophilic homologues. This finding pro-

vides direct evidence for the hypothesis of corresponding

states. To the best of our knowledge, this study is the

first one probing this hypothesis for a large data set by

computational means and by directly characterizing pro-

tein rigidity and flexibility. Our results demonstrate that

exploiting the principle of corresponding states not only

allows for successful thermostability optimization but

also for guiding experiments in order to improve enzyme

activity in protein engineering.

In more detail, our results show that specific regions

are crucial for macroscopic stability of protein structures.

When these unfolding nuclei become flexible, the struc-

tural integrity of the native folded state is lost, and the

protein unfolds. Notably, unfolding nuclei identified for

IPMDH and TLP are in good agreement with sites that

are known to be important for thermostability and where

thermostabilizing mutations have been introduced. Thus,

we suggest that the unfolding regions represent sites

where mutations should be introduced that might modu-

late a protein’s thermostability. Detailed analysis of the

flexibility in the IPMDH and TLP substrate-binding

regions with the help of ‘‘stability maps,’’ introduced here

for the first time, showed that enzyme function requires

certain regions of the protein to be flexible at the work-

ing temperature of the respective enzyme. We were able

to qualitatively relate changes in the flexibility of those

regions, induced either by a temperature change or by

the introduction of mutations, to experimentally

observed losses of the enzymes’ functions.

From a methodological point of view, it is important

to note that protein thermostability can be predicted by

characterizing the mechanical rigidity of a protein struc-

ture during a thermal-unfolding simulation. CNA is

based on a graph-theoretical method that determines ri-

gidity and flexibility within a protein structure in atomic

resolution. The approach is extremely fast and computa-

tionally inexpensive. Moreover, no assumption on the

type and the timescale of motion is made. Remarkably,

CNA implicitly captures and unifies many different

mechanisms that contribute to increased thermostability

and to activity at high temperatures. Thus, CNA provides

an appealing alternative to MD simulations or experi-

ments. Finally, the approach is entirely based on the

native topology of a protein. This can be rationalized in

that we are identifying when the giant rigid cluster

breaks, which is controlled by native contacts.

Network representations of proteins have already been

used to predict flexibility and stability characteristics of

protein structures. Vendruscolo et al.125,126 showed that

protein structure networks are ‘‘small world’’ networks.

In such networks, atoms or residues can be identified

that determine the topology of the network.45 Heringa

and Argos127 used protein structure networks to identify

clusters of tightly packed side chains. Later on, they

showed that these clusters were important for protein

thermostability.128 Brinda et al.129,130 used a similar

approach to identify differences between meso- and ther-

mophilic proteins. Robinson-Rechavi et al.19 used such

an approach to identify structural properties of thermo-

philic proteins, too. In the present study, we go beyond

analyzing topological properties of protein structure net-

works. Instead, protein structures are represented as con-

straint networks (molecular frameworks), where rigid

and flexible regions can be identified.42

A shortcoming of our method is the fact that a struc-

ture of high quality is needed for analysis. Moreover, ri-

gidity analysis is sensitive with respect to deficiencies in

the structures, and extrinsic factors such as glycosylation,

salt, pressure effects, or solvent viscosity may further

influence thermostability. Thus, future work is needed to

make the approach viable also in those cases where only

low-resolution structures or homology models are avail-

able and/or to include such extrinsic factors. An

approach to reduce sensitivity toward a single structure

has been developed by Jacobs et al.101,102 Using the dis-

tance constraint model (DCM), a large number of com-

binations of constraints is generated as an input to rigid-

ity analysis, and the results are then averaged. However,

the DCM requires a fit to experimental data, which limits

the general applicability of the method.

Compared to other computational methods for protein

engineering,131–133 CNA provides an interesting alterna-

tive. Most importantly, we expect CNA to be a valuable

tool for predicting ‘‘weak spots’’ in a protein structure,

where mutations could improve microscopic stability and

consequently macroscopic stability, ultimately leading to

an increased thermostability. In addition, CNA provides

detailed information about regions that should not be

stabilized in order to not interfere with enzyme activity.

Such information will be very valuable, because even in

an elaborate study describing the successful design of a

thermostable enzyme,131 only coarse assumptions have

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1103

Page 16: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

been made so as to not influence the active site’s stability

during the design process.

METHODS

Data set

A detailed description of the data set selection is given

in Ref. 42. Here, we summarize the procedure. Crystallo-

graphic models of homologous pairs of mesophilic and

thermophilic protein structures were collected from the

Protein Data Bank (PDB)134 using data sets from the

recent literature as a starting point.15,130 Only homolo-

gous pairs were considered that fulfilled the following

criteria: (i) the resolution is �2.6 A; (ii) structures were

excluded in cases where the crystallographic model did

not correspond to the biological unit; (iii) the quality of

the structures was checked using the PDBREPORT data-

base,135 and structures with the qualification ‘‘bad’’ were

excluded. Excluded structures were replaced by other

available structures of the same protein if possible.

For enzymes, information about catalytic residues was

taken from the Catalytic Site Atlas.136 To ensure struc-

tural homology, only sequences sharing �35% residue

identity were considered, as were only pairs of structures

that shared the same oligomeric and ligand-binding state

and had a similar resolution. In some cases, it was not

possible to find structures that met all of the above crite-

ria. In the case of the thermophilic protein of family 10

(IPMDH), a mutant was taken, because the wild-type

enzyme was only available as a monomer from the PDB.

The biological unit, however, is a homodimer. Never-

theless, the mutant shows only slight differences in ther-

mostability and no difference in enzyme activity when

compared with the wild-type protein.80 For further

details on other protein families, see ref. 42.

The final data set contained 19 homologous pairs of

mesophilic and thermophilic protein structures (Table

S1). The data set differs from a previously published

one42 in that every protein family is now being repre-

sented by only one pair of structures. Structural similar-

ity was determined using the DaliLite server.137 Where

necessary, the crystallographic model was modified by

flipping Asn, Gln, or His side chains using REDUCE.138

The optimal growth temperature (Tog) was assigned to

each protein by considering the living environment

temperature or the average of a range of habitat tempera-

tures of the corresponding species.21,139 Melting

temperatures were taken from the literature and the Pro-

Therm database.140

Constraint network analysis

Network construction

To construct the constraint network, only the protein

structure was used. Water, solvent molecules, buffer ions,

substrates, and cofactor molecules were removed from

the crystallographic models. Metal ions were retained

when they were part of the structure. Bonds between

ions and protein atoms were treated as covalent bonds

and inserted manually. The constraint network of the

covalent and noncovalent bonds present within the pro-

tein was constructed using the FIRST software (version

6.2).43 Hydrogen bonds (including salt bridges) were

modeled by a bond between the hydrogen and the

acceptor atom and two additional angular constraints

associated with these atoms.43 Hydrogen-bond energies

Ehb were calculated from an empirical potential based on

donor-hydrogen-acceptor atom geometries as given in

the crystallographic model.51 The position of the hydro-

gen atoms was used as predicted by the REDUCE soft-

ware.138 Hydrophobic interactions between carbon or

sulfur atoms were taken into account if the distance

between these atoms was less than the sum of their van

der Waals radii (C: 1.7 A, S: 1.8 A) plus 0.25 A.50

Rigid cluster decomposition

Rigidity within the constraint network can be fully char-

acterized using constraint counting.48 To this aim, in

FIRST, the ‘‘pebble game’’ algorithm is implemented.43

This algorithm determines for every bond whether it is

part of a rigid cluster or a flexible region.48,141,142 A

rigid cluster is defined as a set of atoms that move together

as a rigid body in any floppy motion. Otherwise, an atom

is part of a flexible region. The size of a rigid cluster was

determined as the number of atoms that form the cluster.

Cluster size was further normalized by the total number of

atoms within the network. By definition, the giant cluster

is the largest cluster present in a given network.

Thermal-unfolding simulation

By gradually removing noncovalent bond constraints

from the network, one can go from a rigid network with a

large number of bonds to a flexible network with only a

few bonds. Knowing the energies Ehb of all potential

hydrogen bonds (including salt bridges) in the network,

the energy values were used for excluding hydrogen bonds

from the network with Ehb > Ecut. A new network was gen-

erated for every single hydrogen bond that was removed.

The networks were then decomposed into rigid clusters, as

described earlier. As the energy of a hydrogen bond relates

to a temperature T at which the bond breaks43,51 (see also

below), thermal unfolding can be simulated that way.50

The number of hydrophobic contacts was kept fixed

during the thermal-unfolding simulation.

Network parameter evaluation

Global changes in the rigidity characteristics of the

network during the thermal-unfolding simulation were

characterized using the rigidity-order parameter (P1)

S. Radestock and H. Gohlke

1104 PROTEINS

Page 17: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

and the s2-cluster configuration entropy (H) as defined

in the Theory section. A phase transition was identified

as the most prominent change in a H versus Ecut plot. To

do so, a spline was fitted to the entropy values. Ecut,p is

the hydrogen bond energy cutoff where the first-order

derivative of the spline reaches its maximum.

To determine a phase-transition temperature Tp from

Ecut,p, a relation T 5 f(Ecut) needs to be established. For

this, we calibrated Ecut 5 Ecut,p values with respect to ex-

perimental melting temperatures Tm, considering only

those pairs of meso- and thermophilic proteins for which

Ecut,p of the mesophilic protein was correctly predicted to

be larger than Ecut,p of the thermophilic protein. For

those proteins, for which only optimal growth tempera-

tures of the corresponding species (Tog) were available,

Tm values were estimated by assuming that they are 25 K

above the Tog value.143 Optimal growth temperatures

have previously been used in studies comparing meso-

and thermophilic proteins.16 This resulted in the follow-

ing relation:

T ¼ �20K=ðkcalmol�1Þ�Ecut þ 300 K ð4Þ

Considering that, at Ecut, only hydrogen bonds with

Ehb � Ecut are retained in the network, Eq. (4) also

provides a mapping of the procedure for removing

hydrogen bonds onto a scale of temperature. In contrast

to previous studies43,60 where the hydrogen bond

energy was related to thermal fluctuations in terms of

kT, Eq. (4) provides a less steep relation of hydrogen

bond energy with temperature. We note that Eq. (4)

works well for determining differences in Tp for pairs of

homologous proteins and to relate these to differences

in experimentally determined melting temperatures (Tm)

or optimal growth temperatures of the corresponding

species (Tog). However, the relation does not allow to

compare Tp values between nonhomologous proteins.

This is because the size of a constraint network and

the topology of the underlying protein influence the

absolute Tp value.

Local stability characteristics were identified based on

a stability map. At a given temperature, rigidity analysis

labels portions of the protein as rigid or flexible. A stabi-

lity map shows how pairs of residues at a given tempera-

ture are related to each other in terms of their rigidity

characteristics. On the map, a color is assigned to a pair

of residues according to the stability of their ‘‘contact.’’

Here, two residues form a ‘‘contact’’ if both are part of

the same rigid cluster. The contact stability is then

defined as the Ecut value at which the contact breaks

during the thermal-unfolding simulation. Stability maps

of the same size can be compared by measuring Pearson’s

correlation of the two matrices. For pairs of homologous

proteins, corresponding residue positions in the sequence

were identified based on structural alignments obtained

from the DaliLite server.137

For comparing a pair of homologous proteins in their

respective functional states, working temperatures for the

enzymes were determined, such that (i) these tempera-

tures are lower by the same amount than the respective

Tp values and (ii) structural elements that need to move

for the enzymes to be functional come out as flexible in

the rigidity analysis. This leads to working temperatures

3 K below the respective Tp values in the case of IPMDH

and 8 K in the case of TLP.

ACKNOWLEDGMENTS

We are grateful to Daniel Cashman, Doris Klein,

Prakash Rathi (Heinrich Heine-University, Dusseldorf),

and Benjamin Radestock (Johannes Gutenberg-Univer-

sitat, Mainz) for critically reading the manuscript.

REFERENCES

1. Jaenicke R, Bohm G. The stability of proteins in extreme environ-

ments. Curr Opin Struct Biol 1998;8:738–748.

2. Sterner R, Liebl W. Thermophilic adaptation of proteins. Crit Rev

Biochem Mol Biol 2001;36:39–106.

3. Fields PA. Review: protein function at thermal extremes: balancing

stability and flexibility. Comp Biochem Physiol A: Comp Physiol

2001;129:417–431.

4. Demirjian DC, Moris-Varas F, Cassidy CS. Enzymes from extremo-

philes. Curr Opin Chem Biol 2001;5:144–151.

5. Atomi H. Recent progress towards the application of hyperthermo-

philes and their enzymes. Curr Opin Chem Biol 2005;9:166–173.

6. Turner P, Mamo G, Karlsson EN. Potential and utilization of ther-

mophiles and thermostable enzymes in biorefining. Microb Cell

Fact 2007;6:1–23.

7. Bommarius AS, Broering JM, Chaparro-Riggers JF, Polizzi KM.

High-throughput screening for enhanced protein stability. Curr

Opin Biotechnol 2006;17:606–610.

8. Eijsink VGH, Bjork A, Gaseidnes S, Sirevag R, Synstad B, van den

Burg B, Vriend G. Rational engineering of enzyme stability. J Bio-

technol 2004;113:105–120.

9. Eijsink VGH, Gaseidnes S, Borchert TV, van den Burg B. Directed

evolution of enzyme stability. Biomol Eng 2005;22:21–30.

10. Russell RJM, Taylor GL. Engineering thermostability—lessons from

thermophilic proteins. Curr Opin Biotechnol 1995;6:370–374.

11. Chaparro-Riggers JF, Polizzi KM, Bommarius AS. Better library

design: data-driven protein engineering. Biotechnol J 2007;2:180–

191.

12. Razvi A, Scholtz JM. Lessons in stability from thermophilic pro-

teins. Protein Sci 2006;15:1569–1578.

13. Berezovsky IN, Shakhnovich EI. Physics and evolution of thermo-

philic adaptation. Proc Natl Acad Sci USA 2005;102:12742–12747.

14. Zeldovich KB, Berezovsky IN, Shakhnovich EI. Protein and DNA

sequence determinants of thermophilic adaptation. PloS Comput

Biol 2007;3:62–72.

15. Sadeghi M, Naderi-Manesh H, Zarrabi M, Ranjbar B. Effective

factors in thermostability of thermophilic proteins. Biophys Chem

2006;119:256–270.

16. Gromiha MM, Oobatake M, Sarai A. Important amino acid pro-

perties for enhanced thermostability from mesophilic to thermo-

philic proteins. Biophys Chem 1999;82:51–67.

17. Kumar S, Tsai CJ, Nussinov R. Factors enhancing protein thermo-

stability. Protein Eng 2000;13:179–191.

18. Gianese G, Bossa F, Pascarella S. Comparative structural analysis of

psychrophilic and meso- and thermophilic enzymes. Proteins

2002;47:236–249.

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1105

Page 18: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

19. Robinson-Rechavi M, Alibes A, Godzik A. Contribution of electro-

static interactions, compactness and quaternary structure to

protein thermostability: lessons from structural genomics of Ther-

motoga maritima. J Mol Biol 2006;356:547–557.

20. Szilagyi A, Zavodszky P. Structural differences between mesophilic,

moderately thermophilic and extremely thermophilic protein sub-

units: results of a comprehensive survey. Structure 2000;8:493–504.

21. Querol E, Perez-Pons JA, Mozo-Villarias A. Analysis of protein

conformational characteristics related to thermostability. Prot Eng

1996;9:265–271.

22. Vogt G, Argos P. Protein thermal stability: hydrogen bonds or

internal packing? Fold Des 1997;2:S40–S46.

23. Vieille C, Zeikus JG. Thermozymes: identifying molecular determi-

nants of protein structural and functional stability. Trends Biotech-

nol 1996;14:183–190.

24. Sterner R, Brunner E. The relationship between catalytic activity,

structural flexibility and conformational stability as deduced from

the analysis of mesophilic-thermophilic enzyme pairs and protein

engineering studies. In: Robb F, Antranikian G, Grogan D,

Driessen A, editors. Thermophiles: biology and technology at high

temperatures. London: CRC Press; 2008. pp 25–38.

25. Jaenicke R. Protein stability and molecular adaptation to extreme

conditions. Eur J Biochem 1991;202:715–728.

26. Tang KES, Dill KA. Native protein fluctuations: the confor-

mational-motion temperature and the inverse correlation of pro-

tein flexibility with protein stability. J Biomol Struct Dyn

1998;16:397–411.

27. Vihinen M. Relationship of protein flexibility to thermostability.

Prot Eng 1987;1:477–480.

28. Wrba A, Schweiger A, Schultes V, Jaenicke R, Zavodszky P.

Extremely thermostable D-glyceraldehyde-3-phosphate dehydrogen-

ase from the eubacterium Thermotoga maritima. Biochemistry

1990;29:7584–7592.

29. Zavodszky P, Kardos J, Svingor A, Petsko GA. Adjustment of

conformational flexibility is a key event in the thermal adaptation

of proteins. Proc Natl Acad Sci USA 1998;95:7406–7411.

30. Jaenicke R. Do ultrastable proteins from hyperthermophiles have

high or low conformational rigidity? Proc Natl Acad Sci USA

2000;97:2962–2964.

31. Kamerzell TJ, Middaugh CR. The complex inter-relationships

between protein flexibility and stability. J Pharm Sci 2008;97:3494–

3517.

32. Georlette D, Blaise V, Collins T, D’Amico S, Gratia E, Hoyoux A,

Marx JC, Sonan G, Feller G, Gerday C. Some like it cold: biocatal-

ysis at low temperatures. FEMS Microbiol Rev 2004;28:25–42.

33. van Gunsteren WF, Hunenberger PH, Kovacs H, Mark AE, Schiffer

CA. Investigation of protein unfolding and stability by computer

simulation. In: Dobson CM, Fersht AR, editors. Protein folding.

Cambridge: Cambridge University Press; 1995. pp 49–60.

34. Purmonen M, Valjakka J, Takkinen K, Laitinen T, Rouvinen J. Mo-

lecular dynamics studies on the thermostability of family 11 xyla-

nases. Protein Eng Des Sel 2007;20:551–559.

35. Sham YY, Ma B, Tsai CJ, Nussinov R. Thermal unfolding molecu-

lar dynamics simulation of Escherichia coli dihydrofolate reductase:

thermal stability of protein domains and unfolding pathway.

Proteins 2002;46:308–320.

36. Pang JY, Allemann RK. Molecular dynamics simulation of thermal

unfolding of Thermatoga maritima DHFR. Phys Chem Chem Phys

2007;9:711–718.

37. Creveld LD, Amadei A, van Schaik RC, Pepermans HAM, de Vlieg

J, Berendsen HJC. Identification of functional and unfolding

motions of cutinase as obtained from molecular dynamics com-

puter simulations. Proteins 1998;33:253–264.

38. Case DA. Normal mode analysis of protein dynamics. Curr Opin

Struct Biol 1994;4:285–290.

39. Su JG, Li CH, Hao R, Chen WZ, Wang CX. Protein unfolding beha-

vior studied by elastic network model. Biophys J 2008;94:4586–4596.

40. Ahmed A, Gohlke H. Multiscale modeling of macromolecular

conformational changes combining concepts from rigidity and

elastic network theory. Proteins 2006;63:1038–1051.

41. Gohlke H, Thorpe MF. A natural coarse graining for simulating

large biomolecular motion. Biophys J 2006;91:2115–2120.

42. Radestock S, Gohlke H. Exploiting the link between protein rigi-

dity and thermostability for data-driven protein engineering. Eng

Life Sci 2008;8:507–522.

43. Jacobs DJ, Rader AJ, Kuhn LA, Thorpe MF. Protein flexibility pre-

dictions using graph theory. Proteins 2001;44:150–165.

44. Dill KA. Dominant forces in protein folding. Biochemistry

1990;29:7133–7155.

45. Bode C, Kovacs IA, Szalay MS, Palotai R, Korcsmaros T, Csermely P.

Network analysis of protein dynamics. FEBS Lett 2007;581:2776–2782.

46. Greene LH, Higman VA. Uncovering network systems within

protein structures. J Mol Biol 2003;334:781–791.

47. Whiteley W. Counting out to the flexibility of molecules. Phys Biol

2005;2:S116–S126.

48. Jacobs DJ. Generic rigidity in three-dimensional bond-bending

networks. J Phys A: Math Gen 1998;31:6653–6668.

49. Hespenheide BM, Rader AJ, Thorpe MF, Kuhn LA. Identifying

protein folding cores from the evolution of flexible regions during

unfolding. J Mol Graph Mod 2002;21:195–207.

50. Rader AJ, Hespenheide BM, Kuhn LA, Thorpe MF. Protein unfol-

ding: rigidity lost. Proc Natl Acad Sci USA 2002;99:3540–3545.

51. Dahiyat BI, Gordon DB, Mayo SL. Automated design of the

surface positions of protein helices. Protein Sci 1997;6:1333–1337.

52. Makhatadze GI, Privalov PL. Energetics of protein structure. Adv

Protein Chem 1995;47:307–425.

53. Thorpe MF. Continuous deformations in random networks.

J Non-Cryst Solids 1983;57:355–370.

54. Stauffer D, Aharony A. Introduction to percolation theory.

London: Taylor and Francis; 1994.

55. Albert R, Barabasi AL. Statistical mechanics of complex networks.

Rev Mod Phys 2002;74:47–97.

56. Tsang IR, Tsang IJ. Cluster size diversity, percolation, and complex

systems. Phys Rev E 1999;60:2684–2698.

57. Andraud C, Beghdadi A, Lafait J. Entropic analysis of random

morphologies. Phys A 1994;207:208–212.

58. Fitter J, Herrmann R, Dencher NA, Blume A, Hauss T. Activity

and stability of a thermostable a-amylase compared to its meso-

philic homologue: mechanisms of thermal adaptation. Biochemi-

stry 2001;40:10723–10731.

59. Vieille C, Zeikus GJ. Hyperthermophilic enzymes: sources, uses,

and molecular mechanisms for thermostability. Microbiol Mol Biol

Rev 2001;65:1–43.

60. Gohlke H, Kuhn LA, Case DA. Change in protein flexibility upon

complex formation: analysis of Ras-Raf using molecular dynamics

and a molecular framework approach. Proteins 2004;56:322–337.

61. Wallon G, Kryger G, Lovett ST, Oshima T, Ringe D, Petsko GA.

Crystal structures of Escherichia coli and Salmonella typhimurium

3-isopropylmalate dehydrogenase and comparison with their ther-

mophilic counterpart from Thermus thermophilus. J Mol Biol

1997;266:1016–1031.

62. Kadono S, Sakurai M, Moriyama H, Sato M, Hayashi Y, Oshima

T, Tanaka N. Ligand-induced changes in the conformation of 3-

isopropylmalate dehydrogenase from Thermus thermophilus. J

Biochem 1995;118:745–752.

63. DeLano WL. The PyMOL molecular graphics system. Palo Alto,

CA: DeLano Scientific; 2002.

64. Motono C, Oshima T, Yamagishi A. High thermal stability of 3-

isopropylmalate dehydrogenase from Thermus thermophilus result-

ing from low delta Cp of unfolding. Protein Eng 2001;14:961–966.

65. Motono C, Yamagishi A, Oshima T. Urea-induced unfolding and

conformational stability of 3-isopropylmalate dehydrogenase from

the thermophile Thermus thermophilus and its mesophilic counter-

part from Escherichia coli. Biochemistry 1999;38:1332–1337.

S. Radestock and H. Gohlke

1106 PROTEINS

Page 19: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

66. Akanuma S, Yamagishi A, Tanaka N, Oshima T. Serial increase in

the thermal stability of 3-isopropylmalate dehydrogenase from

Bacillus subtilis by experimental evolution. Prot Sci 1998;7:698–705.

67. Tamakoshi M, Nakano Y, Kakizawa S, Yamagishi A, Oshima T.

Selection of stabilized 3-isopropylmalate dehydrogenase of Saccha-

romyces cerevisiae using the host-vector system of an extreme ther-

mophile, Thermus thermophilus. Extremophiles 2001;5:17–22.

68. Ohkuri T, Yamagishi A. Increased thermal stability against irrevers-

ible inactivation of 3-isopropylmalate dehydrogenase induced by

decreased van der Waals volume at the subunit interface. Prot Eng

2003;16:615–621.

69. Aoshima M, Oshima T. Stabilization of Escherichia coli isopropyl-

malate dehydrogenase by single amino acid substitution. Prot Eng

1997;10:249–254.

70. Kirino H, Aoki M, Aoshima M, Hayashi Y, Ohba M, Yamagishi A,

Wakagi T, Oshima T. Hydrophobic interaction at the subunit inter-

face contributes to the thermostability of 3-isopropylmalate de-

hydrogenase from an extreme thermophile, Thermus thermophilus.

Eur J Biochem 1994;220:275–281.

71. Nemeth A, Svingor A, Pocsik M, Dobo J, Magyar C, Szilagyi A,

Gal P, Zavodszky P. Mirror image mutations reveal the significance

of an intersubunit ion cluster in the stability of 3-isopropylmalate

dehydrogenase. FEBS Lett 2000;468:48–52.

72. Hori T, Moriyama H, Kawaguchi J, Hayashi-Iwasaki Y, Oshima T,

Tanaka N. The initial step of the thermal unfolding of 3-isopropyl-

malate dehydrogenase detected by the temperature-jump Laue

method. Prot Eng 2000;13:527–533.

73. Kotsuka T, Akanuma S, Tomuro M, Yamagishi A, Oshima T.

Further stabilization of 3-isopropylmalate dehydrogenase of an

extreme thermophile, Thermus thermophilus, by a suppressor

mutation method. J Bacteriol 1996;178:723–727.

74. Numata K, Hayashi-Iwasaki Y, Kawaguchi J, Sakurai M, Moriyama

H, Tanaka N, Oshima T. Thermostabilization of a chimeric enzyme

by residue substitutions: four amino acid residues in loop regions are

responsible for the thermostability of Thermus thermophilus isopro-

pylmalate dehydrogenase. Biochim Biophys Acta 2001;1545:174–183.

75. Numata K, Hayashiiwasaki Y, Yutani K, Oshima T. Studies on

interdomain interaction of 3-isopropylmalate dehydrogenase from

an extreme thermophile, Thermus thermophilus, by constructing

chimeric enzymes. Extremophiles 1999;3:259–262.

76. Numata K, Muro M, Akutsu N, Nosoh Y, Yamagishi A, Oshima T.

Thermal stability of chimeric isopropylmalate dehydrogenase genes

constructed from a thermophile and a mesophile. Prot Eng

1995;8:39–43.

77. Sakurai M, Moriyama H, Onodera K, Kadono S, Numata K,

Hayashi Y, Kawaguchi J, Yamagishi A, Oshima T, Tanaka N. The

crystal-structure of thermostable mutants of chimeric 3-isopropyl-

malate dehydrogenase, 2T2M6T. Prot Eng 1995;8:763–767.

78. Imada K, Sato M, Tanaka N, Katsube Y, Matsuura Y, Oshima T. 3-

Dimensional structure of a highly thermostable enzyme, 3-isopro-

pylmalate dehydrogenase of Thermus thermophilus at 2.2 A resolu-

tion. J Mol Biol 1991;222:725–738.

79. Akanuma S, Qu CX, Yamagishi A, Tanaka N, Oshima T. Effect of

polar side chains at position 172 on thermal stability of 3-isopro-

pylmalate dehydrogenase from Thermus thermophilus. FEBS Lett

1997;410:141–144.

80. Qu CX, Akanuma S, Moriyama H, Tanaka N, Oshima T. A muta-

tion at the interface between domains causes rearrangement

of domains in 3-isopropylmalate dehydrogenase. Prot Eng

1997;10:45–52.

81. Tamakoshi M, Yamagishi A, Oshima T. Screening of stable proteins

in an extreme thermophile, Thermus thermophilus. Mol Microbiol

1995;16:1031–1036.

82. Akanuma S, Yamagishi A, Tanaka N, Oshima T. Spontaneous

tandem sequence duplications reverse the thermal stability of

carboxyl-terminal modified 3-isopropylmalate dehydrogenase.

J Bacteriol 1996;178:6300–6304.

83. Nurachman Z, Akanuma S, Sato T, Oshima T, Tanaka N. Crystal

structures of 3-isopropylmalate dehydrogenases with mutations at

the C-terminus: crystallographic analyses of structure-stability rela-

tionships. Prot Eng 2000;13:253–258.

84. Adekoya OA, Sylte I. The thermolysin family (M4) of enzymes:

therapeutic and biotechnological potential. Chem Biol Drug Des

2009;73:7–16.

85. Corbett RJ, Roche RS. The unfolding mechanism of thermolysin.

Biopolymers 1983;22:101–105.

86. Imanaka T, Shibazaki M, Takagi M. A new way of enhancing the

thermostability of proteases. Nature 1986;324:695–697.

87. Mansfeld J, Vriend G, Dijkstra BW, Veltman OR, VandenBurg B,

Venema G, UlbrichHofmann R, Eijsink VGH. Extreme stabilization

of a thermolysin-like protease by an engineered disulfide bond.

J Biol Chem 1997;272:11152–11156.

88. van den Burg B, Vriend G, Veltman OR, Venema G, Eunsink

VGH. Engineering an enzyme to resist boiling. Proc Natl Acad Sci

USA 1998;95:2056–2060.

89. Hardy F, Vriend G, Veltman OR, Vandervinne B, Venema G,

Eijsink VGH. Stabilization of Bacillus stearothermophilus neutral

protease by introduction of prolines. FEBS Lett 1993;317:89–92.

90. Kidokoro S, Miki Y, Endo K, Wada A, Nagao H, Miyake T,

Aoyama A, Yoneya T, Kai K, Ooe S. Remarkable activity enhance-

ment of thermolysin mutants. FEBS Lett 1995;367:73–76.

91. van den Burg B, Enequist HG, Vanderhaar ME, Eijsink VGH, Stulp

BK, Venema G. A highly thermostable neutral protease from

Bacillus caldolyticus—cloning and expression of the gene in Bacillus

subtilis and characterization of the gene product. J Bacteriol

1991;173:4107–4115.

92. Veltman OR, Vriend G, Hardy F, Mansfeld J, van den Burg B,

Venema G, Eijsink VGH. Mutational analysis of a surface area that

is critical for the thermal stability of thermolysin-like proteases.

Eur J Biochem 1997;248:433–440.

93. Veltman OR, Vriend G, Middelhoven PJ, van den Burg B, Venema

G, Eijsink VGH. Analysis of structural determinants of the stability

of thermolysin-like proteases by molecular modelling and site-

directed mutagenesis. Prot Eng 1996;9:1181–1189.

94. van den Burg B, Dijkstra BW, Vriend G, Vandervinne B, Venema

G, Eijsink VGH. Protein stabilization by hydrophobic interactions

at the surface. Eur J Biochem 1994;220:981–985.

95. Eijsink VGH, Dijkstra BW, Vriend G, Vanderzee JR, Veltman OR,

Vandervinne B, Vandenburg B, Kempe S, Venema G. The effect of

cavity-filling mutations on the thermostability of Bacillus stearo-

thermophilus neutral protease. Prot Eng 1992;5:421–426.

96. Eijsink VGH, Vriend G, Vandenburg B, Vanderzee JR, Veltman OR,

Stulp BK, Venema G. Introduction of a stabilizing 10-residue b-hair-pin in Bacillus subtilis neutral protease. Prot Eng 1992;5:157–163.

97. Eijsink VGH, Vriend G, Vandenburg B, Vanderzee JR, Venema G.

Increasing the thermostability of a neutral protease by replacing

positively charged amino-acids in the N-terminal turn of alpha-

helices. Prot Eng 1992;5:165–170.

98. Eijsink VGH, Vriend G, Vandervinne B, Hazes B, Vandenburg B,

Venema G. Effects of changing the interaction between subdomains

on the thermostability of Bacillus neutral proteases. Proteins

1992;14:224–236.

99. Margarit I, Campagnoli S, Frigerio F, Grandi G, Defilippis V, Fontana

A. Cumulative stabilizing effects of glycine to alanine substitutions in

Bacillus subtilis neutral protease. Prot Eng 1992;5:543–550.

100. Sitharam M, Agbandje-McKenna M. Modeling virus self-assembly

pathways: avoiding dynamics using geometric constraint decompo-

sition. J Comput Biol 2006;13:1232–1265.

101. Livesay DR, Jacobs DJ. Conserved quantitative stability/flexibility

relationships (QSFR) in an orthologous RNase H pair. Proteins

2006;62:130–143.

102. Livesay DR, Huynh DH, Dallakyan S, Jacobs DJ. Hydrogen bond

networks determine emergent mechanical and thermodynamic

properties across a protein family. Chem Cent J 2008;2:1–20.

Protein Rigidity and Thermophilic Adaptation

PROTEINS 1107

Page 20: proteins - uni-duesseldorf.de · proteins STRUCTURE O FUNCTION O BIOINFORMATICS Protein rigidity and thermophilic adaptation Sebastian Radestock and Holger Gohlke* Mathematisch-Naturwissenschaftliche

103. Somero GN. Temperature adaptation of enzymes—biological opti-

mization through structure-function compromises. Annu Rev Ecol

Syst 1978;9:1–29.

104. Hollien J, Marqusee S. Structural distribution of stability in a ther-

mophilic enzyme. Proc Natl Acad Sci USA 1999;96:13674–13678.

105. Hollien J, Marqusee S. Comparison of the folding processes of

T. thermophilus and E. coli ribonucleases H. J Mol Biol

2002;316:327–340.

106. Fitter J, Heberle J. Structural equilibrium fluctuations in meso-

philic and thermophilic alpha-amylase. Biophys J 2000;79:1629–

1636.

107. Hernandez G, Jenney FE, Adams MWW, LeMaster DM. Milli-

second time scale conformational flexibility in a hyperthermophile

protein at ambient temperature. Proc Natl Acad Sci USA

2000;97:3166–3170.

108. Hernandez G, LeMaster DM. Reduced temperature dependence of

collective conformational opening in a hyperthermophile rub-

redoxin. Biochemistry 2001;40:14384–14391.

109. Hurley JH, Dean AM. Structure of 3-isopropylmalate dehydro-

genase in complex with NAD1—ligand-induced loop closing and

mechanism for cofactor specificity. Structure 1994;2:1007–1016.

110. Hirose R, Suzuki T, Moriyama H, Sato T, Yamagishi A, Oshima T,

Tanaka N. Crystal structures of mutants of Thermus thermophilus

IPMDH adapted to low temperatures. Prot Eng 2001;14:81–84.

111. Suzuki T, Yasugi M, Arisaka F, Oshima T, Yamagishi A. Cold-adap-

tation mechanism of mutant enzymes of 3-isopropylmalate dehy-

drogenase from Thermus thermophilus. Prot Eng 2002;15:471–476.

112. Yasugi M, Amino M, Suzuki T, Oshima T, Yamagishi A. Cold

adaptation of the thermophilic enzyme 3-isopropylmalate dehydro-

genase. J Biochem 2001;129:477–484.

113. Yasugi M, Suzuki T, Yamagishi A, Oshima T. Analysis of the effect

of accumulation of amino acid replacements on activity of 3-iso-

propylmalate dehydrogenase from Thermus thermophilus. Prot Eng

2001;14:601–607.

114. Hangauer DG, Monzingo AF, Matthews BW. An interactive

computer-graphics study of thermolysin-catalyzed peptide cleavage

and inhibition by N-carboxymethyl dipeptides. Biochemistry

1984;23:5730–5741.

115. Matthews BW. Structural basis of the action of thermolysin and

related zinc peptidases. Acc Chem Res 1988;21:333–340.

116. Mock WL, Aksamawati M. Binding to thermolysin of phenolate-

containing inhibitors necessitates a revised mechanism of catalysis.

Biochem J 1994;302:57–68.

117. Mock WL, Stanford DJ. Arazoformyl dipeptide substrates for ther-

molysin. Confirmation of a reverse protonation catalytic mecha-

nism. Biochemistry 1996;35:7369–7377.

118. van Aalten DMF, Amadei A, Linssen ABM, Eijsink VGH, Vriend

G, Berendsen HJC. The essential dynamics of thermolysin—confir-

mation of the hinge-bending motion and comparison of simula-

tions in vacuum and water. Proteins 1995;22:45–54.

119. Veltman OR, Eijsink VGH, Vriend G, de Kreij A, Venema G, van

den Burg B. Probing catalytic hinge bending motions in thermoly-

sin-like proteases by glycine to alanine mutations. Biochemistry

1998;37:5305–5311.

120. Holland DR, Hausrath AC, Juers D, Matthews BW. Structural-ana-

lysis of zinc substitutions in the active-site of thermolysin. Prot Sci

1995;4:1955–1965.

121. Holland DR, Tronrud DE, Pley HW, Flaherty KM, Stark W, Janso-

nius JN, Mckay DB, Matthews BW. Structural comparison suggests

that thermolysin and related neutral proteases undergo hinge-

bending motion during catalysis. Biochemistry 1992;31:11310–

11316.

122. Liu YH, Konermann L. Conformational dynamics of free and cata-

lytically active thermolysin are indistinguishable by hydrogen/deu-

terium exchange mass spectrometry. Biochemistry 2008;47:6342–

6351.

123. de Kreij A, van den Burg B, Venema G, Vriend G, Eijsink VG,

Nielsen JE. The effects of modifying the surface charge on the cata-

lytic activity of a thermolysin-like protease. J Biol Chem

2002;277:15432–15438.

124. Yasukawa K, Inouye K. Improving the activity and stability of ther-

molysin by site-directed mutagenesis. Biochim Biophys Acta

2007;1774:1281–1288.

125. Vendruscolo M, Dokholyan NV, Paci E, Karplus M. Small-world

view of the amino acids that play a key role in protein folding.

Phys Rev E 2002;65:1–4.

126. Dokholyan NV, Li L, Ding F, Shakhnovich EI. Topological determi-

nants of protein folding. Proc Natl Acad Sci USA 2002;99:8637–

8641.

127. Heringa J, Argos P. Side-chain clusters in protein structures and

their role in protein folding. J Mol Biol 1991;220:151–171.

128. Heringa J, Argos P, Egmond MR, Devlieg J. Increasing thermal sta-

bility of subtilisin from mutations suggested by strongly interacting

side-chain clusters. Prot Eng 1995;8:21–30.

129. Kannan N, Vishveshwara S. Identification of side-chain clusters in

protein structures by a graph spectral method. J Mol Biol

1999;292:441–464.

130. Brinda KV, Vishveshwara S. A network representation of protein

structures: implications for protein stability. Biophys J

2005;89:4159–4170.

131. Korkegian A, Black ME, Baker D, Stoddard BL. Computational

thermostabilization of an enzyme. Science 2005;308:857–860.

132. Kuhlman B, Baker D. Native protein sequences are close to optimal

for their structures. Proc Natl Acad Sci USA 2000;97:10383–10388.

133. Malakauskas SM, Mayo SL. Design, structure and stability of a

hyperthermophilic protein variant. Nat Struct Biol 1998;5:470–475.

134. Bourne PE, Addess KJ, Bluhm WF, Chen L, Deshpande N, Feng

ZK, Fleri W, Green R, Merino-Ott JC, Townsend-Merino W,

Weissig H, Westbrook J, Berman HM. The distribution and query

systems of the RCSB protein data bank. Nucleic Acids Res

2004;32:D223–D225.

135. Hooft RWW, Vriend G, Sander C, Abola EE. Errors in protein

structures. Nature 1996;381:272.

136. Porter CT, Bartlett GJ, Thornton JM. The Catalytic Site Atlas: a

resource of catalytic sites and residues identified in enzymes using

structural data. Nucleic Acids Res 2004;32:D129–D133.

137. Holm L, Park J. DaliLite workbench for protein structure compari-

son. Bioinformatics 2000;16:566–567.

138. Word JM, Lovell SC, Richardson JS, Richardson DC. Asparagine

and glutamine: using hydrogen atom contacts in the choice of

side-chain amide orientation. J Mol Biol 1999;285:1735–1747.

139. Vogt G, Woell S, Argos P. Protein thermal stability, hydrogen

bonds, and ion pairs. J Mol Biol 1997;269:631–643.

140. Bava KA, Gromiha MM, Uedaira H, Kitajima K, Sarai A.

ProTherm, version 4.0: thermodynamic database for proteins and

mutants. Nucleic Acids Res 2004;32:D120–D121.

141. Jacobs DJ, Thorpe MF. Generic rigidity percolation—the pebble

game. Phys Rev Lett 1995;75:4051–4054.

142. Jacobs DJ, Hendrickson B. An algorithm for two-dimensional

rigidity percolation: the pebble game. J Comput Phys

1997;137:346–365.

143. Dehouck Y, Folch B, Rooman M. Revisiting the correlation

between proteins’ thermoresistance and organisms’ thermophilicity.

Prot Eng Des Sel 2008;21:275–278.

S. Radestock and H. Gohlke

1108 PROTEINS