Protein folding, the Levinthal paradox and rapidly mixing Markov chains

Peter Clote∗

February 9, 2014

∗Research supported by [email protected], Institut für Informatik, Universität München, Oettingenstraße 67, D-80538 München, Germany


Abstract

In [20, 21], A. Sali, E. Shakhnovich and M. Karplus modeled protein folding using a 27-bead heteropolymer on a cubic lattice with normally distributed contact energies. Using a Monte-Carlo folding algorithm with a local move set between conformations, Sali et al. attempted to answer the Levinthal paradox [13] of how a protein can fold rapidly, i.e. within milliseconds to seconds, despite the magnitude of the conformation space (e.g. approximately $5^{26} \approx 10^{18}$ for the 27-mer). Letting $t_0(P)$ denote the folding time (i.e. first passage time) and $\Delta(P)$ denote the energy gap between the lowest energy $E_{i_0}$ (native state) and second lowest energy $E_{i_1}$ of protein $P$ with normally distributed contact energy, Sali, Shakhnovich and Karplus observed that $\Delta(P)$ is large exactly when $t_0(P)$ is small.

Using Sinclair's notion of rapid mixing [17] and his modification of the Diaconis-Stroock [6] bound on relative pointwise distance in terms of the subdominant eigenvalue, we provide the first theoretical basis for the principal observation of Sali, Shakhnovich and Karplus. Specifically, we show that the mean first passage time is bounded above by $c_1 \pi_{i_0} \pi_{i_1} + c_2$, where $\pi_{i_0}$ [resp. $\pi_{i_1}$] is the Boltzmann probability of the system being in the native minimum energy state [resp. second minimum]. It follows that this upper bound decreases iff the energy gap $E_{i_1} - E_{i_0}$ increases. Our result is actually proved for pivot moves (rotations) with multiple occupancy, rather than local moves, but it seems clear that our technique can be extended to cover a variant of the model of [20, 21].

Introduction

Though experiments on small proteins [1, 10] suggest that the native state of a protein corresponds to a free energy minimum, this is not yet proven. Nevertheless, this hypothesis is widely accepted, and forms the basis for computational predictions of a protein's conformation from its amino acid sequence. Despite substantial progress using techniques such as threading, molecular dynamics, etc., it is generally recognized that the calculation of a protein's native state from its residue sequence is still inadequate, for two reasons. The energy function is only approximately known (see, for instance, studies by M. Teeter [18, 19]), and the computational problem (e.g. exhaustive search of the conformation landscape) is too large to be undertaken using current technology.

This has led to various approaches, such as approximating the hydrophobic force (an approximation of the tendency for polar residues to form hydrogen bonds with the solvent and hydrophobic residues to form an inner core) and working with lattice models, which constrain the combinatorial search problem. Even in the simple HP-model of H.S. Chan and K.A. Dill [3], the problem of computing the native state which maximizes the number of non-contiguous unit-distance H-H contacts is NP-complete (Berger-Leighton [2] and Crescenzi et al. [5]), despite the existence of folding algorithms which provably yield conformations within a factor of the optimal (Hart-Istrail [8]). For more background on current work on protein folding, see the excellent recent survey by E. Shakhnovich [16].

In [20, 21], Sali, Shakhnovich and Karplus (hereafter SSK) simulated protein folding by applying a Monte-Carlo algorithm at fixed temperature (i.e. not simulated annealing) to a 27-bead heteropolymer in a cubic lattice, where contact energies were normally distributed. Their article's principal conclusion was that protein folding time $t_0$ appeared to be small if and only if the energy gap between the lowest energy and the second lowest energy of conformations on the compact 3 × 3 × 3 cube was large. It would follow that thermodynamic considerations alone (rather than specific, encoded folding pathways) suffice to drive the protein (27-mer in a cubic lattice) to its lowest energy conformation, thus answering the Levinthal paradox.

The importance of this contribution of Sali-Shakhnovich-Karplus should not be underestimated, as it has stimulated widespread interest in theoretical chemistry [4, 11] and computational biology [9]. Moreover, the 27-bead heteropolymer model has been used for various simulation purposes by numerous authors (Shakhnovich, Gutin, Govindarajan, Goldstein, etc.), including an interesting application by Shakhnovich to the simulation of prebiotic selection forces (proteins are now selected for functionality within a cell, while in prebiotic times proteins may have been selected for rapid folding).

In this paper, we present the first mathematical justification for the SSK observation, by relating protein folding on lattice models to the brilliant work of Diaconis-Stroock [6] and of A. Sinclair [17] on rapidly mixing Markov chains. Using the Diaconis-Stroock upper bound $\lambda_1 \le 1 - \frac{1}{b\ell}$ on the subdominant (or second) eigenvalue $\lambda_1$ of the transition matrix of the Markov chain underlying the Monte-Carlo algorithm, we relate the mean first passage time to the energy gap $\Delta = E_1 - E_0$. Our conclusion is that the SSK observation necessarily follows from their hypothesis that protein folding is modeled by a Markov chain. While this hypothesis seems perfectly reasonable from a physical standpoint, our result suggests that one might investigate protein modeling using non-Markovian models (such as the sum of several Markov processes, or the grouping of Markov processes [7]).

While our motivation stems from an analysis of [20, 21], we emphasize that (due to lack of time) our proofs are carried out in this paper only for the closely related model using pivot moves with multiple occupancy (rather than the SSK local move set of end, corner and crankshaft moves explained later). Details of a similar analysis for the SSK model will be carried out in the full version of this paper.

The plan of this paper is as follows. In §1, we describe the lattice model and simulation program of Sali et al. [20, 21], including their (non-ergodic) local move set. In §2, we give background on Markov chains and an overview of Sinclair's work relating conductance with relative pointwise distance. In §3, we present our analysis mathematically justifying the SSK observation for a related model. Finally, in the Appendix, we give an example for the injective mapping technique and simulation results supporting our conclusions.

Before beginning, here is a brief sketch of our approach. Suppose that $f$ is a function having a unique global minimum at $i_0$ [resp. second minimum at $i_1$]; i.e. $f(i_0) < f(i_1)$ and for all $x \in \mathrm{dom}(f) - \{i_0, i_1\}$ it is the case that $f(i_1) < f(x)$. Apply the Monte-Carlo algorithm to determine the minimum $i_0$ of $f$, where the energy gap $\Delta = f(i_1) - f(i_0)$. Use Sinclair's injective mapping technique to estimate, using the Diaconis-Stroock inequality, an upper bound for the subdominant eigenvalue of the transition probability matrix of the related Markov chain. This allows us to derive a time $t_0$ for which the relative pointwise distance between $p^{(t)}_{i,j}$ and the stationary probability $p^*_j$ is small, for all $i, j \in \mathrm{dom}(f)$ and $t \ge t_0$, where the $p^*_i$ are the stationary (Boltzmann) probabilities of the Markov chain underlying the Monte-Carlo algorithm.$^1$ The time $t_0$ is shown to be bounded above by $c_1 p^*_{i_0} p^*_{i_1} + c_2$. For a particular range of values, we show that this upper bound decreases (hence $t_0$ decreases) iff the energy gap $\Delta = f(i_1) - f(i_0)$ increases. We finally estimate the mean first passage time to native state $i_0$ in terms of $t_0$. Putting everything together, we have that energy gap $\Delta$ is large if and only if the mean first passage time to native state $i_0$ is small. Arguing that the SSK protein folding model is essentially given by a Markov chain for which the above analysis holds, we have a mathematical justification for the SSK observation.$^2$

$^1$ Stationary probability $p^*_i$ equals $e^{-f(i)/T}/Z$, where the partition function $Z$ is $\sum_{j \in S} e^{-f(j)/T}$.

$^2$ Our result is somewhat weak, since we prove only that the upper bound for $t_0$ decreases. We have not yet shown that this upper bound is tight, from which it would follow that $t_0$ itself decreases. However our work, substantiated by simulation results in the Appendix, should be understood as a first mathematical justification for the SSK and related observations.


1 Heteropolymer protein model

In [21, 20], A. Sali, E. Shakhnovich and M. Karplus consider a 27-bead heteropolymer on a cubic lattice, such that the residue-residue contact potential $E$ satisfies

$$E = \sum_{1 \le i < j \le 27} B_{i,j}\,\delta(r_{i,j})$$

where $B_{i,j}$ is normally distributed with mean $-2$ and standard deviation 1, $r_{i,j}$ is the Euclidean distance between residues $i, j$, and $\delta(r_{i,j}) = 1$ if $r_{i,j} = 1$ and $i, j$ are not immediate neighbors in the polypeptide chain (i.e. $|i - j| > 1$), else 0. With high probability, the native state lies in the compact 3 × 3 × 3 cube. Removing symmetric duplicates, there are exactly 103,346 conformations in the compact cube; sorting the energies for these conformations, one has an energy spectrum, where $E_{i_0}$ [resp. $E_{i_1}$] denotes the energy minimum [resp. second lowest energy].
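The contact potential above can be evaluated directly from lattice coordinates. The following is a minimal sketch, not the SSK program: the function names, the seed, and the 4-bead test chain are assumptions made for illustration, and only the formula for $E$ is taken from the text.

```python
import random


def random_contact_matrix(n, mean=-2.0, stdev=1.0, seed=0):
    """Symmetric matrix of contact energies B[i][j], drawn from a normal distribution."""
    rng = random.Random(seed)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            B[i][j] = B[j][i] = rng.gauss(mean, stdev)
    return B


def contact_energy(coords, B):
    """Sum B[i][j] over residue pairs at unit distance that are not chain neighbours."""
    n = len(coords)
    energy = 0.0
    for i in range(n):
        for j in range(i + 2, n):  # j >= i + 2 enforces |i - j| > 1
            dist_sq = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if dist_sq == 1:  # r_ij = 1 on the cubic lattice
                energy += B[i][j]
    return energy


# Hypothetical 4-bead chain bent into a unit square: only residues 0 and 3 touch.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
B = random_contact_matrix(4)
print(contact_energy(square, B), contact_energy(straight, B))
```

For the 27-mer one would pass 27 coordinates and a 27 × 27 matrix; the double loop is quadratic in chain length, which is negligible at this size.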

SSK define a protein to be strongly folding [resp. weakly folding] if in at least 4 out of 10 [resp. between 1 and 3 out of 10] simulations, the first passage time to the native state is at most $50 \times 10^6$ Monte-Carlo steps. This measure is quantified by defining the foldicity of a protein to be the fraction of simulations in which the minimum energy conformation is attained within $50 \times 10^6$ Monte-Carlo steps; i.e. the first passage time is $\le 50 \times 10^6$. Thus a strongly folding protein has foldicity at least 0.4, and a weakly folding protein has foldicity in the interval [0.1, 0.4).

The SSK protein simulation program uses Monte-Carlo at a specific temperature (not simulated annealing), where the choice of temperature is optimized so that (roughly) the Boltzmann probability

$$\frac{e^{-E_{i_0}/T}}{Z} \approx 0.4$$

where the partition function $Z = \sum_{c \in CC} e^{-E_c/T}$, and $CC$ denotes the collection of 103,346 conformations on the compact cube.

Temperature selection is explained at length in [21]; in essence, the temperature is chosen to be sufficiently low for the energy of the native state to be a pronounced global minimum (thermodynamic equilibrium) yet sufficiently high for the simulation to avoid local energy minima. Such considerations seem indeed relevant to the physical case of protein folding. SSK then argue that, at least for the simplified 27-bead heteropolymer with normally distributed contact potentials, protein folding on a cubic lattice does not require any specific protein folding pathways, since they observe that a random coil 27-mer folds within $50 \times 10^6$ steps if and only if the energy gap $E_{i_1} - E_{i_0}$ between the lowest and second lowest energy is large.

1.1 Local move set

If one models protein folding by a Markov process, then one must define a set of valid moves between conformations. SSK allow a valid move from conformation $i$ to $j$ only if one can perform an end, corner, or crankshaft move, while preserving the condition of being a self-avoiding walk (called the excluded volume condition).

end: For $k = 1$ or $k = 27$, one moves the end segment determined by $k$ [i.e. (1, 2) or (26, 27)] in any of 5 possible directions.

corner: For $2 \le k \le 26$, if $(k-1, k)$ and $(k, k+1)$ are not collinear, then move the corner $(k-1, k, k+1)$ 180° within the same plane.

crankshaft: For $2 \le k \le 25$, if segments $(k-1, k)$, $(k, k+1)$ and $(k+1, k+2)$ form a U-shape, then rotate the U-structure either 90° or −90°.

End moves and corner moves are designated as 1-monomer moves, while the crankshaft move is designated as a 2-monomer move. With probability 0.2, one decides to perform a 1-monomer move; if $2 \le k \le 26$, then this is a corner move, otherwise an end move. With remaining probability 0.8, a 2-monomer move is executed, if possible.

Taking into account the local move set, excluded volume constraint, and Metropolis criterion, there is a well-defined probability $p_{i,j}$ of transition from state $i$ to $j$.

1.2 Pivot moves

The previously defined moves of SSK are local, in the sense that only constantly many sites (here at most 2) on the self-avoiding walk are changed. A theorem of Madras-Sokal [14] states that local move sets explore only an exponentially small subset of the collection of all $n$-step self-avoiding walks. In contrast, a pivot move from a conformation is defined by choosing at random a site of the self-avoiding walk, thus dividing it into two segments. Taking this site as the origin, perform a rotation or reflection of one of the segments to obtain the next conformation, provided that the excluded volume condition is respected. Pivot moves are ergodic, in the sense that Monte-Carlo pivot simulations of linear polymers explore the entire conformation space. We define pivot phantom moves as above, but where the excluded volume condition is dropped; i.e. intermediates along the folding pathway are allowed to be non-self-avoiding walks. In Appendix II of [21], such phantom conformations obtained by the SSK local move set are considered, where multiple occupancy of sites is allowed but penalized (so-called quasi-self-avoiding walks). According to p. 1634 of [21], “No significant differences were observed between the results from the two sets of simulations.”
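A pivot move and its phantom variant can be sketched on the 2D square lattice as follows. This is an illustrative sketch under stated assumptions: it uses rotations only (no reflections), always rotates the segment after the chosen site, and all names (`pivot`, `random_pivot_move`, `ROTATIONS_2D`) are hypothetical rather than taken from [21].

```python
import random

# The four lattice rotations about the origin: 0, 90, 180 and 270 degrees.
ROTATIONS_2D = [
    lambda x, y: (x, y),
    lambda x, y: (-y, x),
    lambda x, y: (-x, -y),
    lambda x, y: (y, -x),
]


def pivot(walk, site, rotation):
    """Rotate the part of the walk after `site` about walk[site]; earlier sites stay fixed."""
    px, py = walk[site]
    result = list(walk[: site + 1])
    for x, y in walk[site + 1:]:
        dx, dy = rotation(x - px, y - py)
        result.append((px + dx, py + dy))
    return result


def is_self_avoiding(walk):
    """Excluded volume condition: no lattice site is occupied twice."""
    return len(set(walk)) == len(walk)


def random_pivot_move(walk, rng, phantom=True):
    """Propose one pivot move; a phantom move skips the excluded volume check."""
    site = rng.randrange(len(walk) - 1)
    rotation = rng.choice(ROTATIONS_2D[1:])  # skip the identity
    candidate = pivot(walk, site, rotation)
    if phantom or is_self_avoiding(candidate):
        return candidate
    return list(walk)  # move rejected: excluded volume violated


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(pivot(straight, 1, ROTATIONS_2D[1]))
print(random_pivot_move(straight, random.Random(0)))
```

Rotating the tail by 180° at an interior site folds the walk back onto itself, which is exactly the kind of multiply occupied intermediate that the phantom variant admits and the self-avoiding variant rejects.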

Our analysis in the next section is performed for pivot moves with multiple occupancy. In the full version, we intend to analyze the SSK local move set with multiple occupancy (multiple occupancy seems unavoidable in our use of Sinclair's injective mapping technique).


2 Markov chains and the second eigenvalue

In this section, we begin with some definitions and well-known results, and state some profound results of Sinclair, relating relative pointwise distance to the second eigenvalue. For undefined concepts, consult Feller [7].

2.1 Background

Let $P$ be the transition probability matrix of a (first-order) Markov chain $M$. $M$ is irreducible (or strongly connected) if for any two states $i, j$ belonging to the state space $S$ of $M$, there exists $n$ such that $p^{(n)}_{i,j} > 0$. $M$ is aperiodic if $d(i) = 1$ for all states $i \in S$, where the period $d(i)$ of state $i$ is the gcd of all $k \ge 1$ such that $p^{(k)}_{i,i} > 0$. If $M$ is irreducible and aperiodic, then there exists $n_0$ such that $P^{n_0}$ is positive, i.e. for all $i, j \in S$, $p^{(n_0)}_{i,j} > 0$. It is well-known that in this case, there exist stationary probabilities $p^*_j = \lim_{n \to \infty} p^{(n)}_{i,j}$. In fact, letting

$$d = \max_{\alpha,\beta} \sum_k \max\left(p^{(n_0)}_{\alpha,k} - p^{(n_0)}_{\beta,k},\, 0\right) < 1$$

it can be shown (see [15]) that for all $i$,

$$|p^{(n)}_{i,k} - p^*_k| \le d^{\,n/n_0 - 1}.$$

Alternatively, following [12], define $N = P^{n_0}$ as above, and set $\varepsilon = \min N$. If $C$ is a column vector of $N$ with maximum $M_0$ and minimum $m_0$, then letting $M_1$ [resp. $m_1$] designate the maximum [resp. minimum] of $N \cdot C$, it can be shown that

$$M_1 - m_1 \le (1 - 2\varepsilon)(M_0 - m_0).$$

Iterating this, we have $M_n - m_n \le (1 - 2\varepsilon)^n$, which converges to 0.

The convergence of an appropriate irreducible, aperiodic Markov chain to stationary probabilities is the basis for the convergence of Monte-Carlo simulations. Namely, define a neighborhood system satisfying $i \notin N_i$, $i \in N_j \Leftrightarrow j \in N_i$, $|N_i| = |N_j|$ for all $i, j \in S$. Let $f : S \to \mathbb{R}$ be a function, whose minimum is sought by a Monte-Carlo simulation at temperature $T$. Define matrix $P$ by setting

$$p_{i,j} = \begin{cases}
\dfrac{\alpha\, e^{-(f(j)-f(i))/T}}{|N_i|} & \text{if } j \in N_i \text{ and } f(j) > f(i) \\[2mm]
\dfrac{\alpha}{|N_i|} & \text{if } j \in N_i \text{ and } f(j) \le f(i) \\[2mm]
0 & \text{if } i \ne j \text{ and } j \notin N_i \\[1mm]
1 - \sum_{j \ne i} p_{i,j} & \text{if } i = j.
\end{cases}$$

The classic result of Metropolis et al. states that the Markov chain with transition probability matrix $P$ is irreducible and aperiodic, and has the Boltzmann distribution as its stationary probability distribution:

$$p^*_i = \frac{e^{-f(i)/T}}{Z}$$

where the partition function $Z$ satisfies $Z = \sum_{j \in S} e^{-f(j)/T}$. In the sequel, we sometimes denote $p^*_i$ by $\pi_i$.
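The transition matrix and its stationary distribution can be checked numerically on a small state space. The sketch below is an illustration only: it assumes the hypercube neighborhood system (states are bit strings, neighbors differ in one bit), exposes the constant $\alpha$ as a parameter with an assumed default of 1, and uses hypothetical energies and function names.

```python
import math
import random


def hypercube_neighbors(bits):
    """Neighbors of state i are the states at Hamming distance 1 from i."""
    return [[i ^ (1 << b) for b in range(bits)] for i in range(2 ** bits)]


def metropolis_matrix(f, neighbors, T, alpha=1.0):
    """Transition matrix p_{i,j} following the case definition in the text."""
    n = len(f)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            accept = math.exp(-(f[j] - f[i]) / T) if f[j] > f[i] else 1.0
            P[i][j] = alpha * accept / len(neighbors[i])
        P[i][i] = 1.0 - sum(P[i])  # diagonal is still 0 here, so this sums j != i
    return P


def boltzmann(f, T):
    """Stationary probabilities e^{-f(i)/T} / Z."""
    weights = [math.exp(-x / T) for x in f]
    Z = sum(weights)
    return [w / Z for w in weights]


rng = random.Random(1)
f = [rng.gauss(-2.0, 1.0) for _ in range(8)]  # hypothetical energies on 3 bits
P = metropolis_matrix(f, hypercube_neighbors(3), T=2.0)
pi = boltzmann(f, T=2.0)
# Largest violation of stationarity, i.e. of (pi P)_j = pi_j.
print(max(abs(sum(pi[i] * P[i][j] for i in range(8)) - pi[j]) for j in range(8)))
```

The same construction satisfies $\pi_i p_{i,j} = \pi_j p_{j,i}$, the reversibility condition used in the next section, because the neighborhood system is symmetric with equal neighborhood sizes.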

Suppose that $N = P^{n_0}$ is positive, $N^* = \lim_{n \to \infty} N^n$, where $P$ is the transition probability matrix for irreducible, aperiodic Markov chain $M$. Following [12], the fundamental matrix $F$ is defined by

$$F = (I - (N - N^*))^{-1} = I + \sum_{n=1}^{\infty} (N^n - N^*) = I + \sum_{n=1}^{\infty} (N - N^*)^n.$$

The matrix $M = (\mu_{i,j})$ of mean first passage times is defined by taking $\mu_{i,j}$ to be the expected number of steps to go from state $i$ to state $j$. In [12], it is proved that $M = (I - F + E F_{dg}) D$, where $I$ is the identity matrix, $E$ is the matrix all of whose entries are 1, $F_{dg}$ is obtained from $F$ by setting off-diagonal entries of $F$ to 0, and $D$ is the diagonal matrix with entries $d_{i,i} = 1/p^*_i$.
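For a small chain, mean first passage times to a fixed target can be computed directly. The sketch below does not use the fundamental-matrix formula of [12]; as a swapped-in technique it solves the standard first-step equations $\mu_{i,j} = 1 + \sum_{k \ne j} p_{i,k}\,\mu_{k,j}$ for $i \ne j$ by Gaussian elimination. Function names and the example chains are assumptions for illustration.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system A x = b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def mean_first_passage_times(P, target):
    """Expected number of steps to first reach `target` from every other state."""
    others = [i for i in range(len(P)) if i != target]
    # (I - Q) m = 1, where Q is P restricted to the non-target states.
    A = [[(1.0 if i == k else 0.0) - P[i][k] for k in others] for i in others]
    m = solve(A, [1.0] * len(others))
    return dict(zip(others, m))


# Hypothetical 3-state chain: a walk on a path 0 - 1 - 2 that bounces off both ends.
path_chain = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
print(mean_first_passage_times(path_chain, target=0))
```

The linear system has one unknown per non-target state, so this direct approach is only practical for toy state spaces; it is meant as a way to cross-check bounds such as those derived in §3 on small examples.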

Suppose that $f : S \to \mathbb{R}$ is an energy function, which we wish to minimize using a Monte-Carlo algorithm for a certain neighborhood system and temperature (e.g. to determine the native state of a heteropolymer). Assume that $f$ has unique global minimum at $i_0$, and define $g : S \to \mathbb{R}$ by setting $g(i) = f(i)$ for $i \ne i_0$, and $g(i_0) < f(i_0)$. Consider the Markov chain transition probability matrix $P_f$ [resp. $P_g$] corresponding to $f$ [resp. $g$].

In principle, one should be able to show that $\mu_{i,i_0}(P_f) < \mu_{i,i_0}(P_g)$, where $\mu_{i,j}(P_f)$ is the mean first passage time from state $i$ to state $j$ in the Markov chain associated with $f$, and similarly for $g$. However, initial calculations are quite involved, and we use an alternate approach described in the next sections.

2.2 Relative pointwise distance

For our analysis, we are interested in the expected time for convergence of a Monte-Carlo simulation of protein folding.

Let $M$ be a Markov chain with state space $S$, and let $U \subseteq S$. Following [17], the relative pointwise distance over $U$ is defined as

$$\Delta_U(t) = \max_{i,j \in U} \frac{|p^{(t)}_{i,j} - p^*_j|}{p^*_j}.$$

We write $\Delta(t) = \Delta_S(t)$.

Fix a subset $X \subseteq S$ of the state space of Markov chain $M$. Define the capacity to be $C_X = \sum_{i \in X} p^*_i$; define the ergodic flow out of $X$ to be $F_X = \sum_{i \in X, j \notin X} p_{i,j}\, p^*_i$. Since $0 < F_X \le C_X < 1$, the quotient $\Phi_X = F_X / C_X$ may be considered to be the conditional flow out of $X$, provided the system is in $X$. The conductance is $\Phi = \min_{C_X \le 1/2} \Phi_X$, where the minimum is taken over all $X \subseteq S$.
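On a toy chain the conductance can be found by brute force over all subsets. This is a minimal sketch of the definitions of capacity, ergodic flow and conductance given above; it enumerates $2^{|S|}$ subsets, so it is only usable for very small state spaces, and the function name and example chains are hypothetical.

```python
from itertools import combinations


def conductance(P, pi):
    """Minimum of F_X / C_X over nonempty proper subsets X with capacity C_X <= 1/2."""
    n = len(pi)
    best = None
    for size in range(1, n):
        for X in combinations(range(n), size):
            capacity = sum(pi[i] for i in X)
            if capacity > 0.5:
                continue
            members = set(X)
            # Ergodic flow out of X: sum of pi_i * p_{i,j} with i in X and j outside X.
            flow = sum(pi[i] * P[i][j] for i in X for j in range(n) if j not in members)
            ratio = flow / capacity
            if best is None or ratio < best:
                best = ratio
    return best


# Hypothetical two-state lazy chain that switches state with probability 0.1.
lazy = [[0.9, 0.1], [0.1, 0.9]]
print(conductance(lazy, [0.5, 0.5]))
```

A small conductance signals a bottleneck: here the chain leaves either state only one step in ten, and the conductance reflects exactly that escape probability.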

An irreducible, aperiodic Markov chain $M$ with transition probability matrix $P$ is reversible if $p^*_i \cdot p_{i,j} = p^*_j \cdot p_{j,i}$ for all $i, j \in S$.

Theorem 1 (A. Sinclair [17]) Let $M$ be a reversible, irreducible, aperiodic Markov chain, all of whose eigenvalues are non-negative. Then

$$\Delta_U(t) \le \frac{\lambda_1^t}{\min_{i \in U} p^*_i}$$

where $\lambda_1 < 1$ is the second largest eigenvalue.$^3$ Moreover,

$$\Delta_U(t) \le \frac{(1 - \Phi^2/2)^t}{\min_{i \in U} \pi_i}$$

and if $\Phi \le 1/2$ then $\Delta(t) \ge (1 - 2\Phi)^t$.

With pivot moves (even allowing multiple occupancies), it is clear that the associated Markov chain is reversible. That this is not the case with the chain associated with the local moves of SSK, even allowing multiple occupancies, was pointed out by R. Backofen and S. Will (personal communication). It is interesting to note that a modification of approximate bin packing yields a polynomial approximation scheme of the conductance $\Phi_{CC}$ when considering subsets of conformations on the compact cube. R. Backofen and S. Will have an even simpler argument for the same result. These points will be covered in the full version of this paper.

$^3$ The largest is 1, since $(p^*_1, \dots, p^*_N) \cdot P = (p^*_1, \dots, p^*_N)$, where $P$ is the transition probability matrix for $M$, and $M$ has $N$ states.


3 Relating mean first passage time to energy gap

In this section, we give an application of Sinclair's technique, using his modified form of the Diaconis-Stroock inequality, to provide an upper bound for the subdominant eigenvalue of the transition probability matrix corresponding to a Markov chain for protein folding, using pivot moves with multiple occupancy.

Let $S$ denote the set of all conformations $s = (s_0, \dots, s_n)$ with multiple occupancy of the $(n+1)$-bead heteropolymer (in either the 2D or 3D cubic lattice), where $s_i \ne s_{i+2}$.$^4$ It follows that $|S| = 3^n$ in 2D and $5^n$ in 3D. Assume that $u = (u(0), \dots, u(n))$, $v = (v(0), \dots, v(n)) \in S$. Define the canonical path $p = (p_1, \dots, p_m)$ between $u, v$, where all $p_i \in S$, $p_1 = u$, $p_m = v$, $m \le n$ and for each $i < m$, $p_{i+1}$ is obtained from $p_i$ by performing a rotation at the first site $j$ such that $p_i(j) \ne v(j)$. (See example in Appendix.) Let $P$ denote the set of all canonical paths between ordered pairs $(u, v)$ of distinct conformations. Clearly the length of the longest canonical path is at most $n$. Given $t = (u, v) \in S \times S$, where $v$ is obtained from $u$ by one move, define $P_t \subseteq P$ to be the set of canonical paths containing edge $t$. The following claim illustrates the injective mapping technique, introduced by Sinclair [17].
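The canonical path construction can be sketched in 2D as follows. The conventions here are assumptions for illustration: conformations are lists of lattice points sharing site 0, a step locates the first site $j$ where the current conformation differs from $v$, and the tail is rotated rigidly about site $j - 1$ so that site $j$ agrees with $v$. All function names are hypothetical.

```python
# The four lattice rotations about the origin: 0, 90, 180 and 270 degrees.
ROTATIONS = [
    lambda x, y: (x, y),
    lambda x, y: (-y, x),
    lambda x, y: (-x, -y),
    lambda x, y: (y, -x),
]


def rotate_tail(walk, pivot_site, rot):
    """Rotate all sites after `pivot_site` rigidly about walk[pivot_site]."""
    px, py = walk[pivot_site]
    out = list(walk[: pivot_site + 1])
    for x, y in walk[pivot_site + 1:]:
        dx, dy = rot(x - px, y - py)
        out.append((px + dx, py + dy))
    return out


def canonical_path(u, v):
    """Conformations from u to v; each step repairs the first site that differs from v."""
    assert len(u) == len(v) and u[0] == v[0]
    target = list(v)
    current = list(u)
    path = [current]
    while current != target:
        j = next(i for i in range(len(target)) if current[i] != target[i])
        px, py = current[j - 1]  # sites before j already agree with v
        have = (current[j][0] - px, current[j][1] - py)
        want = (target[j][0] - px, target[j][1] - py)
        rot = next(r for r in ROTATIONS if r(*have) == want)
        current = rotate_tail(current, j - 1, rot)
        path.append(current)
    return path


u = [(0, 0), (1, 0), (2, 0), (3, 0)]
v = [(0, 0), (0, 1), (1, 1), (1, 2)]
for conformation in canonical_path(u, v):
    print(conformation)
```

Each step fixes at least one further site for good, so a path between two $n$-step conformations uses at most $n$ moves, in line with the bound on canonical path length stated above.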

Lemma 2 With the previous notation, for all $t = (w, w') \in S \times S$, there are at most $|S|$ many paths in $P_t$.

Proof. Suppose that $w$ and $w'$ are identical on sites $0, \dots, k$ but differ at site $k + 1$. Identifying each canonical path in $P_t$ with its pair $(u, v)$ of endpoints, define $\sigma_k : P_t \to S$ by $\sigma_k(u, v) = s$, where

$$s_i = \begin{cases} u_i & 0 \le i \le k \\ u_k + (v_i - v_k) & k + 1 \le i \le n. \end{cases}$$

Then $\sigma_k$ is injective, since from $s, w, w'$ we can define $u, v$ as follows:

$$u_i = \begin{cases} s_i & 0 \le i \le k \\ s_k + (w_i - w_k) & k + 1 \le i \le n \end{cases}$$

$$v_i = \begin{cases} s_k + (w'_i - w'_k) & 0 \le i \le k \\ s_i & k + 1 \le i \le n. \end{cases}$$

See example in Appendix. Q.E.D.

$^4$ The walk may intersect itself, but may not have consecutive overlapping steps.

Following p. 131 of [17], define

$$b = \max_t \sum_{p \in P_t} \frac{\pi_{p(I)} \cdot \pi_{p(F)}}{\pi_i\, p_{i,j}}$$

where $t = (i, j)$ and $p(I)$ [resp. $p(F)$] denotes the initial [resp. final] conformation in path $p$ containing edge $t$. Letting $i_0$ [resp. $i_1$] denote the conformation with minimum energy $f(i_0)$ [resp. second lowest energy $f(i_1)$], it is clear that $\pi_{p(I)} \cdot \pi_{p(F)} \le \pi_{i_0} \cdot \pi_{i_1}$. This, together with the previous lemma, implies that

$$b \le \frac{|S|\, \pi_{i_0} \cdot \pi_{i_1}}{c} \qquad (1)$$

where $c = \min \pi_i\, p_{i,j}$, the minimum taken over all pairs $(i, j)$ of conformations, where $j$ is obtained from $i$ by a pivot move. Q.E.D.

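Modifying work of Diaconis-Stroock [6], Sinclair (cited on p. 131-132 of [17]) proved that

$$\lambda_1 \le 1 - \frac{1}{b\ell}$$

where $\ell$ is the maximum length of a canonical path. From Theorem 1,

$$\Delta_U(t) \le \frac{\lambda_1^t}{\min_{i \in U} \pi_i} \le \frac{(1 - 1/b\ell)^t}{\min_{i \in U} \pi_i} \le \frac{\left(1 - c/(n|S|\pi_{i_0}\pi_{i_1})\right)^t}{\min_{i \in U} \pi_i}.$$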

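Requiring the last expression to be bounded above by $0 < \varepsilon < 1$, and taking logarithms, we find

$$t \ln\left(1 - \frac{c}{n|S|\pi_{i_0}\pi_{i_1}}\right) \le \ln \varepsilon + \ln(\min_{i \in U} \pi_i)$$

and so, dividing by the logarithm on the left, which is negative, it suffices to take

$$t \ge \frac{\ln \varepsilon + \ln(\min_{i \in U} \pi_i)}{\ln\left(1 - \frac{c}{n|S|\pi_{i_0}\pi_{i_1}}\right)}.$$

Now

$$\frac{1}{\ln(1 - \delta)} = -\frac{1}{\delta} + \frac{1}{2} + \frac{\delta}{12} + \frac{\delta^2}{24} + \frac{19\delta^3}{720} + O(\delta^4).$$

Letting $\delta = \frac{c}{n|S|\pi_{i_0}\pi_{i_1}}$ and dropping higher order terms, the least such $t$ satisfies

$$t \le \left(-\ln \varepsilon - \ln(\min_{i \in U} \pi_i)\right) \cdot \left(\frac{n|S|\pi_{i_0}\pi_{i_1}}{c} + 1\right). \qquad (2)$$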
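The relation between the exact threshold and the bound (2) can be checked numerically. The sketch below computes the smallest real $t$ for which $(1 - \delta)^t / \min_i \pi_i \le \varepsilon$, with $\delta = c/(n|S|\pi_{i_0}\pi_{i_1})$, and compares it with the right-hand side of (2). The function name and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def mixing_threshold(eps, pi_min, n, size, pi0, pi1, c):
    """Return (exact threshold, right-hand side of (2)) for the given parameters."""
    delta = c / (n * size * pi0 * pi1)
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    log_term = -math.log(eps) - math.log(pi_min)
    # Smallest real t with (1 - delta)^t / pi_min <= eps; log1p keeps precision for tiny delta.
    exact = log_term / -math.log1p(-delta)
    bound = log_term * (1.0 / delta + 1.0)
    return exact, bound


# Hypothetical values for a 10-step walk in 2D, where |S| = 3^10.
exact, bound = mixing_threshold(
    eps=1e-3, pi_min=1e-8, n=10, size=3 ** 10, pi0=0.01, pi1=0.005, c=1e-9
)
print(exact, bound)
```

Since $-1/\ln(1 - \delta) < 1/\delta$ for $0 < \delta < 1$, the exact threshold never exceeds the bound, and for small $\delta$ the two are close.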

Suppose that $g : S \to \mathbb{R}$ is defined by $g(i) = f(i)$ for $i \ne i_1$ and $g(i_1) > f(i_1)$; i.e. $g$ is identical to $f$, but the energy gap $\Delta_g = g(i_1) - g(i_0)$ is larger than the energy gap $\Delta_f = f(i_1) - f(i_0)$. Let

$$Z = \sum_{i \in S} e^{-f(i)/T}, \qquad Z' = \sum_{i \in S} e^{-g(i)/T}$$

$$\pi_i = \frac{e^{-f(i)/T}}{Z}, \qquad \pi'_i = \frac{e^{-g(i)/T}}{Z'}$$

$$p_{i,j} = \frac{e^{-(f(j)-f(i))/T}}{N}, \qquad p'_{i,j} = \frac{e^{-(g(j)-g(i))/T}}{N}$$

$$c = \min_{t=(i,j)} \pi_i\, p_{i,j}, \qquad c' = \min_{t=(i,j)} \pi'_i\, p'_{i,j}$$

where $N$ is the neighborhood size using pivot moves (for rotation alone, $N$ is 3 in 2D and 5 in 3D).

Lemma 3 For all $i \ne i_1$, it is the case that $\pi'_i > \pi_i$.

Proof.

$$\pi'_i > \pi_i \Leftrightarrow \frac{e^{-g(i)/T}}{Z'} > \frac{e^{-f(i)/T}}{Z} \Leftrightarrow Z > Z'$$

since $f(i) = g(i)$. Now $Z' = Z - e^{-f(i_1)/T} + e^{-g(i_1)/T}$ and $g(i_1) > f(i_1)$, so $Z' < Z$. Q.E.D.

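Claim. $\pi_{i_1} > \pi'_{i_1}$.

Proof of Claim.

$$\pi_{i_1} > \pi'_{i_1} \Leftrightarrow \frac{e^{-f(i_1)/T}}{Z} > \frac{e^{-g(i_1)/T}}{Z'} \Leftrightarrow e^{-(f(i_1)-g(i_1))/T} > \frac{Z}{Z'}.$$

Let $a = f(i_1)/T$ and $b = g(i_1)/T$. Then

$$\begin{aligned}
\pi_{i_1} > \pi'_{i_1} &\Leftrightarrow Z' \cdot e^{-a+b} > Z \\
&\Leftrightarrow (Z - e^{-a} + e^{-b}) \cdot e^{-a+b} > Z \\
&\Leftrightarrow Z(e^{-a+b} - 1) - e^{-2a+b} + e^{-a} > 0 \\
&\Leftrightarrow Z(e^{-a+b} - 1) - e^{-a}(e^{-a+b} - 1) > 0 \\
&\Leftrightarrow (Z - e^{-a})(e^{-a+b} - 1) > 0.
\end{aligned}$$

Now $b > a$, so $(e^{-a+b} - 1) > 0$, and since $|S| > 1$, we have $(Z - e^{-a}) > 0$. Q.E.D.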
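Lemma 3 and the claim above can be illustrated on a toy energy function. The sketch below raises the second lowest energy while leaving all other energies fixed, then compares the two Boltzmann distributions; the energies, the temperature and the helper name are hypothetical.

```python
import math


def boltzmann(energies, T):
    """Return the Boltzmann probabilities and the partition function Z."""
    weights = [math.exp(-e / T) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights], Z


T = 1.0
f = [-3.0, -2.5, -1.0, -0.5]  # hypothetical energies; index 0 is i_0, index 1 is i_1
g = list(f)
g[1] = -2.0                   # larger gap: g(i_1) > f(i_1), still the second lowest

pi_f, Z_f = boltzmann(f, T)
pi_g, Z_g = boltzmann(g, T)

print("Z' < Z:", Z_g < Z_f)
print("pi'_i > pi_i for i != i_1:", all(pi_g[i] > pi_f[i] for i in (0, 2, 3)))
print("pi_{i_1} > pi'_{i_1}:", pi_f[1] > pi_g[1])
```

Raising one energy removes Boltzmann weight from that state only, so the partition function shrinks and every other state gains probability, which is the mechanism both proofs rely on.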

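Claim. $\pi_i\, p_{i,j} < \pi_x\, p_{x,y} \Leftrightarrow \pi'_i\, p'_{i,j} < \pi'_x\, p'_{x,y}$.

Proof of Claim. Consider the first case, where $f(i) < f(j)$ and $f(x) < f(y)$. Then

$$\pi_i\, p_{i,j} = \frac{e^{-f(i)/T} \cdot e^{-(f(j)-f(i))/T}}{Z \cdot N} = \frac{e^{-f(j)/T}}{Z \cdot N}$$

and similarly $\pi_x\, p_{x,y} = \frac{e^{-f(y)/T}}{Z \cdot N}$. Then $\pi_i\, p_{i,j} < \pi_x\, p_{x,y}$ iff $e^{-f(j)/T} < e^{-f(y)/T}$. Since $g(i_0) < g(i_1) < g(i_2)$, where $i_2$ denotes the conformation of third lowest energy, the ordering relation of $g$ is unchanged from that of $f$; i.e. $f(j) < f(y)$ iff $g(j) < g(y)$. It follows that

$$\begin{aligned}
\pi_i\, p_{i,j} < \pi_x\, p_{x,y} &\Leftrightarrow e^{-f(j)/T} < e^{-f(y)/T} \\
&\Leftrightarrow e^{-g(j)/T} < e^{-g(y)/T} \\
&\Leftrightarrow \pi'_i\, p'_{i,j} < \pi'_x\, p'_{x,y}.
\end{aligned}$$

The other 3 cases are similarly handled. Q.E.D.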


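Claim. $\frac{c}{c'} \le 1$.

Proof of Claim. From the previous claim, it follows that if $c = \pi_i\, p_{i,j}$ then $c' = \pi'_i\, p'_{i,j}$ (i.e. for the same values $i, j$). Clearly the minimum occurs when $f(i) < f(j)$, so

$$\begin{aligned}
c \le c' &\Leftrightarrow \pi_i\, p_{i,j} \le \pi'_i\, p'_{i,j} \\
&\Leftrightarrow \frac{\pi_i}{\pi'_i} \le \frac{p'_{i,j}}{p_{i,j}} \\
&\Leftrightarrow e^{-(f(i)-g(i))/T} \cdot Z'/Z \le \frac{e^{-(g(j)-g(i))/T}}{e^{-(f(j)-f(i))/T}} = e^{-(f(i)-g(i))/T} \\
&\Leftrightarrow Z' \le Z.
\end{aligned}$$

Q.E.D.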

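Lemma 4 With the previous notation, if $f(i) = g(i)$ for all $i \in S - \{i_1\}$ and

$$0 < g(i_1) - f(i_1) < 2 \ln\left(\frac{1 - \delta}{\delta}\right)$$

where $\delta = \frac{e^{-g(i_1)}}{Z'}$, then

$$\frac{\pi'_{i_0}\pi'_{i_1}}{c'} < \frac{\pi_{i_0}\pi_{i_1}}{c}. \qquad (3)$$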

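Proof.

$$\frac{\pi'_{i_0}\pi'_{i_1}}{c'} < \frac{\pi_{i_0}\pi_{i_1}}{c} \Leftrightarrow \frac{\pi'_{i_0}}{\pi_{i_0}} \cdot \frac{c}{c'} < \frac{\pi_{i_1}}{\pi'_{i_1}} \Leftrightarrow A \cdot B < C$$

where $A = e^{-(g(i_0)-f(i_0))/T} \cdot \frac{Z}{Z'} = \frac{Z}{Z'}$, since $f(i_0) = g(i_0)$, $B = \frac{c}{c'} \le 1$, and $C = e^{-(f(i_1)-g(i_1))/T} \cdot \frac{Z'}{Z}$. Letting $a = f(i_1)/T$ and $b = g(i_1)/T$, and using $B \le 1$, it suffices to show $A < C$, where

$$A < C \Leftrightarrow \frac{Z}{Z'} < e^{-a+b} \cdot \frac{Z'}{Z} \Leftrightarrow (Z/Z')^2 < e^{-a+b}.$$

Now $Z = Z' - e^{-b} + e^{-a}$, so

$$(Z/Z')^2 = \left(1 + \frac{e^{-a} - e^{-b}}{Z'}\right)^2 = \left(1 + \frac{e^{-b}(e^{b-a} - 1)}{Z'}\right)^2.$$

Let $x = e^{b-a}$, which is greater than 1 since $b = g(i_1)/T > a = f(i_1)/T$. Letting $\delta = e^{-b}/Z'$, by the quadratic formula it follows that for $1 < x < \left(\frac{1-\delta}{\delta}\right)^2$, we have $(1 + \delta(x - 1))^2 < x$, and so in this domain $(Z/Z')^2 < e^{b-a}$.

Q.E.D.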
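The appeal to the quadratic formula in the last step can be unpacked by factoring. The following derivation is a sketch of that step, starting from $(Z/Z')^2 = (1 + \delta(x-1))^2$ as in the display above and using only the standing assumptions $x > 1$ and $0 < \delta < 1$.

```latex
\begin{align*}
(1 + \delta(x - 1))^2 < x
  &\iff 1 + \delta(x - 1) < \sqrt{x}
     && \text{(both sides are positive)} \\
  &\iff \delta(\sqrt{x} - 1)(\sqrt{x} + 1) < \sqrt{x} - 1 \\
  &\iff \delta(\sqrt{x} + 1) < 1
     && \text{(divide by } \sqrt{x} - 1 > 0\text{)} \\
  &\iff \sqrt{x} < \frac{1 - \delta}{\delta}
   \iff x < \left(\frac{1 - \delta}{\delta}\right)^{2}.
\end{align*}
```

Taking logarithms of $x = e^{b-a}$ turns the last condition into $b - a < 2\ln\left(\frac{1-\delta}{\delta}\right)$, which is the shape of the hypothesis of Lemma 4.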

It thus follows that for larger gap $g(i_1) - g(i_0)$ our upper bound for $\lambda_1$ and hence for the relative pointwise distance decreases. We now relate relative pointwise distance to mean first passage time. Recall from the background section on Markov chains, $M = (I - F + E F_{dg}) \cdot D$. If

$$\Delta(t) = \max_{i,j \in S} \frac{|p^{(t)}_{i,j} - \pi_j|}{\pi_j} < \varepsilon$$

then certainly $|P^t - P^*| < \varepsilon \cdot E$, and $P^t$ is positive. Recall that the fundamental matrix

$$F = (I - (P^t - P^*))^{-1} = \sum_{i=0}^{\infty} (P^t - P^*)^i$$

where we use the convention that the 0-th power of a matrix is the identity. Thus $|F| \le \sum_{i=0}^{\infty} (\varepsilon \cdot E)^i$. Suppose that $|S| = N$ so that $E$ is an $N \times N$ matrix, and take $\varepsilon < 1/N^2$, so $\varepsilon^{1/2} < 1/N$.

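Claim. For $i \ge 1$, $(\varepsilon \cdot E)^i \le \varepsilon^{\frac{i+1}{2}} \cdot E$.

Proof of Claim. By induction on $i \ge 1$. Suppose the claim holds for $k$ and consider $k + 1$.

$$\begin{aligned}
(\varepsilon \cdot E)^{k+1} &= (\varepsilon \cdot E)(\varepsilon \cdot E)^k \\
&\le (\varepsilon \cdot E)(\varepsilon^{\frac{k+1}{2}} \cdot E) \\
&= \varepsilon^{\frac{k+3}{2}} \cdot E^2 \\
&= \varepsilon^{\frac{k+3}{2}} \cdot N \cdot E \\
&\le \varepsilon^{\frac{k+2}{2}} \cdot E.
\end{aligned}$$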

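Letting $\delta = \varepsilon^{1/2}$, it follows that

$$|F| \le I + \sum_{i=1}^{\infty} \varepsilon^{\frac{i+1}{2}} \cdot E = I + \sum_{i=1}^{\infty} \delta^{i+1} \cdot E = I + E\left(\frac{1}{1 - \delta} - 1 - \delta\right)$$

and $|I - F| \le E\left(\frac{1}{1-\delta} - 1 - \delta\right)$. Recall that $F_{dg}$ is the diagonal matrix obtained from $F$ by setting off-diagonal values to 0. It follows that $|E F_{dg}| \le E\left(\frac{1}{1-\delta} - \delta\right)$ and so

$$\begin{aligned}
|I - F + E F_{dg}| &\le |I - F| + |E F_{dg}| \\
&\le E\left(\frac{1}{1 - \delta} - \delta - 1\right) + E\left(\frac{1}{1 - \delta} - \delta\right) \\
&\le E\left(\frac{2}{1 - \delta} - 2\delta - 1\right).
\end{aligned}$$

Recall that the diagonal matrix $D$ has entries $d_{i,i} = 1/p^*_i$ and 0 off the diagonal. A calculation shows that

$$|(I - F + E F_{dg}) \cdot D| \le |I - F + E F_{dg}| \cdot |D| \le \left(\frac{2}{1 - \delta} - 2\delta - 1\right) \cdot Q$$

where each row of $Q$ is $(1/p^*_1, 1/p^*_2, \dots, 1/p^*_N)$. Thus the mean first passage time after the $t$-th step in the Monte-Carlo simulation is bounded by

$$\left(\frac{2}{1 - \delta} - 2\delta - 1\right) \cdot 1/p^*_j$$

which for $0 < \delta < 1/2$ is at most $2/p^*_j$. Putting things together, we have the following.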

Theorem 5 Let $M$ be a reversible, irreducible, aperiodic Markov chain, all of whose eigenvalues are non-negative and which corresponds to a Monte-Carlo simulation of $(n+1)$-bead heteropolymer folding using pivot moves with multiple occupancy. Let $N$ be the size of the state space $S$ of $M$, $i_0 \in S$ be the native state conformation, and let $c = \min \pi_i\, p_{i,j}$ where the minimum is taken over all conformations $i, j$ such that $j$ is obtained by a pivot move from $i$ with transition probability $p_{i,j}$. Then the mean first passage time $\mu_{i,i_0}$ from random coil conformation $i$ to native state $i_0$ is bounded above by $c_1 \pi_{i_0} \pi_{i_1} + c_2$, where $c_1 = (nN/c)(2 \ln N - \ln \min_{i \in S} \pi_i)$ and $c_2 = (2 \ln N - \ln \min_{i \in S} \pi_i) + 2/\pi_{i_0}$.$^5$

Proof. Given $\varepsilon = 1/N^2$, using (2) compute $t_0$ such that

$$t_0 \le \left(-\ln \varepsilon - \ln(\min_{i \in U} \pi_i)\right) \cdot \left(\frac{n|S|\pi_{i_0}\pi_{i_1}}{c} + 1\right)$$

and for all larger $t$, $\Delta_U(t) \le \varepsilon$. Since $\delta = \varepsilon^{1/2} < 1/2$, the expected number of Monte-Carlo steps to visit the native state $i_0$ from any conformation $i$ at time $t_0$ is at most $2/\pi_{i_0}$. Finally, $-\ln \varepsilon = 2 \ln N$. This yields the upper bound with constants $c_1, c_2$. Finally, by Lemmas 3 and 4, it follows that the energy gap $E_{i_1} - E_{i_0}$ increases iff our upper bound for mean first passage times decreases. This concludes our justification of the SSK observation that proteins which fold have a large energy gap between minimum and second minimum energy. Q.E.D.

$^5$ Note that $c_2 \le (2 \ln N - \pi_{i_0}) + 2/\pi_{i_0}$.


Conclusion and Acknowledgements

I would like to thank R. Backofen and S. Will, with whom I have discussed over time various aspects of a research program to analyze the runtime for Monte-Carlo simulations of protein folding on lattice models. Though the results of this article were obtained without their collaboration, there are a number of items which we have jointly investigated. Presumably the journal version of this article will incorporate this joint work. I would like to thank E. Bornberg-Bauer and E. Shakhnovich for discussions on various aspects of the HP-model, for copies of their papers and pointers to the literature.


References

[1] C.B. Anfinsen. Principles that govern the folding of protein chains. Science, 181:223–230, 1973.

[2] B. Berger and T. Leighton. Journal of Computational Biology, 1998.

[3] H.S. Chan and K.A. Dill. Compact polymers. Macromolecules, 22:4559, 1989.

[4] Hue Sun Chan. Kinetics of protein folding. Nature, 373:664–665, 23 February 1995. Scientific Correspondence: Criticism to [20].

[5] P. Crescenzi, D. Goldman, C. Papadimitriou, A. Piccolboni, and M. Yannakakis. On the complexity of protein folding. Journal of Computational Biology, 5(3):523–466, 1998.

[6] P. Diaconis and D. Stroock. Geometric bounds for eigenvalues of Markov chains. Annals of Applied Probability, 1:35–61, 1991.

[7] W. Feller. An introduction to probability theory and its applications. J. Wiley and Sons, Inc, 1968. Volume 1, Third Edition.

[8] W. Hart and S. Istrail. Fast protein folding in the hydrophobic-hydrophobic model within three-eighths of optimal. In Proceedings of the 27th Annual ACM Symposium on Theory of Computing, Las Vegas, 1995. 157–168.

[9] M. Karplus. Santa Fe, Jan 20–23, 1997.

[10] M. Karplus and E. Shakhnovich. Protein folding: theoretical studies of thermodynamics and dynamics. In T.E. Creighton, editor, Protein Folding, pages 237–196. W.H. Freeman and Company, New York, 1992.

[11] M. Karplus, A. Sali, and E. Shakhnovich. Kinetics of protein folding. Nature, 373:665, 23 February 1995. Scientific Correspondence: Reply to [4].

[12] J.G. Kemeny and J.L. Snell. Finite Markov Chains. Van Nostrand Company, 1960. 210 pages.

[13] C. Levinthal. Are there pathways for protein folding? J. Chim. Phys., 65:44–45, 1968.

[14] N. Madras and A.D. Sokal. Nonergodicity of local, length-conserving Monte-Carlo algorithms for the self-avoiding walk. J. Stat. Phys., 47:573–595, 1987.

[15] Y.A. Rozanov. Probability Theory: A Concise Course. Dover Publications, Inc., 1977.

[16] E. Shakhnovich. Theoretical studies of protein-folding thermodynamics and kinetics. Current Opinion in Structural Biology, 7:29–40, 1997.

[17] Alistair Sinclair. Algorithms for random generation and counting: A Markov chain approach. Birkhäuser, 1993. 146 pages.

[18] M. Teeter. An empirical examination of potential energy minimization using the well-determined structure of the protein crambin. Journal of the American Chemical Society, 108:7163–7172, 1986.

[19] M. Teeter. Water-protein interactions: Theory and experiment. Annu. Rev. Biophys. Biophys. Chem., 20:577–600, 1991.

[20] A. Sali, E. Shakhnovich, and M. Karplus. How does a protein fold? Nature, 369:248–251, 19 May 1994. Letters to Nature.

[21] A. Sali, E. Shakhnovich, and M. Karplus. Kinetics of protein folding: A lattice model study of the requirements for folding to the native state. J. Molec. Biol., 235:1614–1636, 1994.


Appendix

In this section, we present an example in 2D for the injective mapping technique, and display some simulation results.

An example

The following graphic example may be helpful to understand the mapping $\sigma_k(u, v)$. Consider two heteropolymers $u, v$ with $n + 1 = 11$ beads (sites $0, \dots, 10$), i.e. 10-step self-avoiding walks, where $u$ has the form

[Figure: conformation u]

and $v$ has the form

[Figure: conformation v]

The following conformations are obtained only by applying rotations, which leads to multiple occupancy conformations. Note that by applying reflections as well (i.e. reflection to 5), we could have obtained a self-avoiding walk. We indicate the site 0 on the figures below.

Rotate by $\pi$ at site 0 to obtain the following.

[Figure: conformation after the rotation at site 0]

Rotate by $-\pi/2$ at site 1 to obtain the following.

[Figure: conformation after the rotation at site 1, followed by the remaining intermediate conformations]

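Let $w$ be the conformation 4) and $w'$ be the conformation 5) pictured below.

[Figure: conformations w and w']

As defined in the text, if sites are numbered from $0, \dots, n = 10$, and $k = 5$, then

$$s_i = \begin{cases} u_i & \text{for } 0 \le i \le k = 5 \\ u_k + (v_i - v_k) & \text{for } k + 1 = 6 \le i \le n = 10 \end{cases}$$

with figure

[Figure: conformation s]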


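One can obtain $u, v$ from $s, w, w'$ by $\sigma_k(u, v) = s$, where

$$u_i = \begin{cases} s_i & \text{for } 0 \le i \le k = 5 \\ s_k + (w_i - w_k) & \text{for } k + 1 = 6 \le i \le n = 10 \end{cases}$$

and

$$v_i = \begin{cases} s_k + (w'_i - w'_k) & \text{for } 0 \le i \le k = 5 \\ s_i & \text{for } k + 1 = 6 \le i \le n = 10. \end{cases}$$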

Simulation results

Let $f : \{0, \dots, 2^n - 1\} \to \mathbb{R}$ have normally distributed random values with mean −2 and variance 0.1. Below is a sample run for the hypercube neighborhood system, where $j \in N_i$ iff the Hamming distance between $i, j$ is 1. Runs for the full neighborhood system ($j \in N_i$ iff $j \ne i$) are similar.$^6$ aveMC is the average over 5000 runs of the number of Monte-Carlo moves required to reach $f(i_0)$. bUP is the upper bound for $b$ computed by our method (we omit here the factor $1/c$); lambda1UP $= 1 - 1/b\ell$ is the upper bound for $\lambda_1$; and DELTA is the energy gap. Each row of the data is obtained by defining a new energy function $g$, where $g$ is identical to $f$ on values different than $i_1$ and $f(i_0) < g(i_1) < f(i_1)$, hence with smaller energy gap. For the data below, the correlation between lambda1UP and aveMC is 0.996313, between lambda1UP and DELTA is −0.99912 and between DELTA and aveMC is −0.99595.

Temperature 2
Gaussian distribution, mean: -2.000000 stdev: 0.316228

Delta   aveMC     PI(i_0)   PI(i_1)   bUP     lambda1
0.024    16.514   0.040     0.039     5.316   0.96864648
0.023    37.217   0.040     0.039     5.318   0.96865862
0.023    54.217   0.040     0.039     5.320   0.96867076
0.022    73.032   0.040     0.039     5.322   0.96868289
0.021    93.187   0.040     0.039     5.324   0.96869501
0.020   112.578   0.040     0.039     5.326   0.96870713
0.019   128.578   0.040     0.039     5.328   0.96871925
0.019   139.947   0.040     0.039     5.330   0.96873136
0.018   165.779   0.040     0.040     5.332   0.96874347
0.017   205.176   0.040     0.040     5.334   0.96875557
0.016   243.304   0.040     0.040     5.336   0.96876766
0.015   296.403   0.040     0.040     5.338   0.96877975
0.015   350.793   0.040     0.040     5.340   0.96879184
0.014   350.793   0.040     0.040     5.343   0.96880392
0.013   350.793   0.040     0.040     5.345   0.96881600
0.012   421.801   0.040     0.040     5.347   0.96882807
0.011   439.278   0.040     0.040     5.349   0.96884014
0.010   446.278   0.040     0.040     5.351   0.96885220
0.010   485.167   0.040     0.040     5.353   0.96886426
0.009   540.114   0.040     0.040     5.355   0.96887631
0.008   549.114   0.040     0.040     5.357   0.96888835
0.007   577.843   0.040     0.040     5.359   0.96890040
0.006   589.155   0.040     0.040     5.361   0.96891244
0.006   612.873   0.040     0.040     5.363   0.96892447
0.005   669.410   0.040     0.040     5.365   0.96893650
0.004   708.300   0.040     0.040     5.367   0.96894852
0.003   725.642   0.040     0.040     5.370   0.96896054
0.002   730.642   0.040     0.040     5.372   0.96897255
0.002   746.429   0.040     0.040     5.374   0.96898456
0.001   763.348   0.040     0.040     5.376   0.96899656

$^6$ Simulations of heteropolymer folding with pivot moves will be presented in the full paper.
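A run of the kind tabulated above can be sketched as follows. This is not the program that produced the data: the number of bits, the uniformly random start state, the seeds and the function names are all assumptions, and only the hypercube neighborhood, the Metropolis acceptance rule and the averaging of first passage times are taken from the text.

```python
import math
import random


def first_passage_steps(f, bits, T, target, rng, max_steps=10 ** 6):
    """Number of Monte-Carlo moves until the chain first visits `target`."""
    state = rng.randrange(len(f))  # assumed: start from a uniformly random state
    steps = 0
    while state != target and steps < max_steps:
        candidate = state ^ (1 << rng.randrange(bits))  # flip one random bit
        dE = f[candidate] - f[state]
        if dE <= 0 or rng.random() < math.exp(-dE / T):  # Metropolis criterion
            state = candidate
        steps += 1
    return steps


def average_first_passage(f, bits, T, runs, seed=0):
    """Average first passage time to the minimum of f over `runs` independent runs."""
    rng = random.Random(seed)
    target = min(range(len(f)), key=f.__getitem__)
    total = sum(first_passage_steps(f, bits, T, target, rng) for _ in range(runs))
    return total / runs


bits = 4  # hypothetical size; the text does not state the value of n used
energy_rng = random.Random(42)
f = [energy_rng.gauss(-2.0, math.sqrt(0.1)) for _ in range(2 ** bits)]
print(average_first_passage(f, bits, T=2.0, runs=500))
```

To mimic the rows of the table one would repeat the average for a family of functions $g$ that agree with $f$ except at $i_1$, and record the energy gap alongside each average.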

In comparison, the following is a plot of data from S. Will's reprogramming of the SSK Monte-Carlo simulation using an 8-mer. The x-axis is DELTA, and the y-axis is an approximation to conductance $\Phi$.

[Figure: scatter plot of the data series "phi"; x-axis DELTA, from 0 to 5; y-axis approximate conductance Φ, from 0 to 0.012.]
