Lecture 10' - Structural Transitions of Polypeptides


  • 8/14/2019 Lecture 10' - Structural Transitions of Polypeptides

    1/30


    Coil-Helix Transitions

    The transition between a random coil and a helical structure, also called the coil-helix transition,
    is an important component of protein folding pathways.

    The term "random coil" refers to a set of equivalent coil-like structures: each is unfolded relative to a typical helical structure
    (α-helix, 3₁₀ helix, β-strand, etc.).

    Focus: thermodynamic properties of these transitions.

    For simplicity, we begin with a homopolymer (a polypeptide of identical amino acids)
    and focus on one transition: random coil to α-helix.

    We investigate this coil-helix transition
    using a statistical thermodynamic treatment: the Zipper Model.


    The Nucleation of the α-helix

    The nucleation step in α-helix formation involves formation of an H-bond between:
      the keto oxygen of residue j, and
      the amide hydrogen of residue j+4.

    This requires the torsion angles to assume mean values
      φ = -57°, ψ = -47°,
      which is entropically unfavorable.

    This H-bond helps to stabilize the helical structure. Its energetic favorability, however,
      relies on isolation of the H-bond from competition with water,
      as is the case in a folded protein's interior or in a non-polar solvent.


    Model Polypeptide System

    As a result, many shorter oligopeptides form an α-helix only in organic solvent (e.g., octanol).

    Certain long polypeptides, however,
      can be induced to form an α-helix in aqueous solution.

    Classic example (Zimm and Bragg, 1959): poly-[γ-benzyl-L-glutamate]
      in 80% dichloroacetic acid / 20% ethylene dichloride
      undergoes a coil-to-helix transition when heated:
      the opposite of protein denaturation in aqueous solution, due to dichloroacetic acid's ability to form strong H-bonds
      with the amide nitrogens in the coil.

    Here, α-helix formation is thus endothermic:
      since ΔH° > 0, this process is entropy driven (ΔS° > 0),
      consistent with release of bound solvent as the helix forms.


    Estimating s

    To apply the Zipper model to a transition, s must be related to experimentally measured quantities:
      the midpoint of the melting transition: the melting temperature, Tm;
      ΔH° for adding 1 helical unit onto a pre-existing helix: the enthalpy of helix growth, ΔH°g.

    Method: experimental determination of s.
      s is modeled as a micro-equilibrium constant of helix formation.
      In practice, ΔH°g is determined by comparing the Tm values of polymers of 2 different lengths.
      This allows s to be modeled within the Zipper model,
      while σ is adjusted to yield the best fit.
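    The equation linking s to Tm and the growth enthalpy is not reproduced in this text. A minimal sketch, assuming the integrated van 't Hoff form ln s = -(ΔH°g/R)(1/T - 1/Tm), with s = 1 at Tm and a temperature-independent ΔH°g. The function name and Tm = 300 K are illustrative assumptions; 0.89 kcal/mol is the growth enthalpy quoted in the comparison with experiment.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol K)

def propagation_s(temp_k, tm_k, dh_growth_kcal):
    """Estimate s(T) from the van 't Hoff relation, taking s = 1 at T = Tm.

    Assumes the growth enthalpy does not change with temperature.
    """
    return math.exp(-(dh_growth_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k))

# Illustrative values only: Tm = 300 K is hypothetical.
for t in (290.0, 300.0, 310.0):
    print(t, round(propagation_s(t, 300.0, 0.89), 3))
```

    With a positive growth enthalpy, s rises above 1 on heating past Tm, which is the endothermic behavior described for this polymer.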


    Comparison with Experiment

    Points are experimental data (2 sets, Doty et al.), determined by optical rotation
      for different chain lengths: N = 26 and N = 1500 residues.

    Curves are predicted values of the fractional number of helical residues, h,
      estimated by the Zipper model.

    s values were computed using
      ΔH°g = 0.89 kcal/mol > 0: helix formation is endothermic;
      thus, s increases with T.

    2 fitted σ values are shown:
      σ = 1 × 10⁻⁴ (dashed);
      σ = 2 × 10⁻⁴ (solid).


    General Transition Characteristics

    The transition shows a large N-dependence, even though s and σ are length-independent.

    Cooperativity of the transition increases with N,
      as measured by the narrowness of the transition.

    The relative contributions of the parameters are also strongly N-dependent.

    In short polymers, nucleation (σ) is dominant:
      initial formation is unfavorable;
      propagation is strongly inhibited.

    At large N, propagation (s) dominates:
      the nucleation penalty is distributed over more residues;
      s quickly dominates as T increases.
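    The N-dependence can be explored numerically. A minimal sketch, assuming the standard single-helix zipper partition function Q = 1 + σ Σ (N - k + 1) s^k, which is not written out in this text; the function name is illustrative.

```python
def zipper_fraction_helix(s, sigma, n):
    """Mean fraction of helical residues in the single-helix zipper model.

    Assumed partition function: Q = 1 + sigma * sum_{k=1..n} (n - k + 1) * s**k,
    where (n - k + 1) counts the placements of one helix of k residues.
    """
    q = 1.0           # weight of the all-coil state
    k_weighted = 0.0  # running sum of k * weight(k)
    for k in range(1, n + 1):
        w = sigma * (n - k + 1) * s ** k
        q += w
        k_weighted += k * w
    return k_weighted / (n * q)

# Compare a short and a long chain at the same sigma.
for n in (26, 1500):
    print(n, [round(zipper_fraction_helix(s, 2e-4, n), 3) for s in (0.95, 1.05, 1.2)])
```

    For the same s and σ, the long chain switches from mostly coil to mostly helix over a narrow range of s around 1, while the short chain stays largely coil.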


    Validating the Size of σ

    Fitted application of a Zipper model predicts the coil to α-helix transition to be highly cooperative:
      fitted value, σ = 10⁻⁴.

    This prediction can be separately validated as follows:
      statistical weight of nucleation = σs.
      This accounts for formation of 1 H-bond.
      s accounts for the balance between ΔH° and ΔS° for only 1 residue.
      σ accounts for the cooperativity of nucleation: nucleation restricts the angles of 4 residues
      to values typical of an α-helix (φ = -57°, ψ = -47°),
      but the ΔS° for only 1 of these 4 is included in s;
      thus, σ = exp[3ΔS°res/R].

    Net entropy change per residue:
      ΔS°res = R ln Whelix - R ln Wcoil ≈ -R ln 9 = -18 J/mol K.

    Substitution yields the estimate σ = 1.5 × 10⁻³.
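    The substitution can be checked directly. A minimal sketch of the arithmetic; the function name is illustrative, and the factor of 9 coil conformations per residue is taken from the R ln 9 term above.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def nucleation_sigma(ds_res, extra_residues=3):
    """sigma = exp[extra_residues * dS_res / R], following the slide's argument."""
    return math.exp(extra_residues * ds_res / R)

ds_res = -R * math.log(9)         # about -18.3 J/(mol K)
print(nucleation_sigma(ds_res))   # unrounded entropy: equals 9**-3 up to rounding
print(nucleation_sigma(-18.0))    # rounded entropy, as used on the slide
```

    The rounded entropy reproduces the quoted estimate; the unrounded value is slightly smaller.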


    The Coil to 3₁₀-Helix Transition

    s and σ always depend on the nature of the transition: other transitions exhibit different s and σ values.

    Example: sequences of type (AAAAK)nA,
      where A, K = alanine, lysine, respectively,
      convert from 3₁₀ helices to α-helices
      when n is increased from 3 to 4, i.e., with increasing total polymer length.

    This suggests that the 3₁₀ helix
      is easier to initiate (from a coil) than an α-helix:
        σ(3₁₀) > σ(α-helix);
      but, once initiated, the α-helix is more easily propagated:
        s(α-helix) > s(3₁₀).

    The difference in σ is due to conformational entropies:
      nucleation of a 3₁₀ helix fixes the torsion angles of 1 less residue:
        α-helix: H-bond between residues j and j+4.
        3₁₀-helix: H-bond between residues j and j+3.


    Sequence Dependence of σ, s

    For regular α-helix formation (or melting), both σ and s are sequence dependent:
      each type of residue is characterized by a different s value;
      the cooperativity, σ, of helix formation will also vary, but with the mean residue content.

    Ex: In lysine-containing polymers, σ ≈ 10⁻³:
      nucleation is 10x more favorable compared to poly-[γ-benzyl-L-glutamate].

    The impact of residue differences on α-helix stability
      is studied using host-guest peptides:
      energetic variations due to a single, internal switched residue are measured;
      essentially no variation in σ;
      emphasis: determination of variation in s.


    Host-Guest Parameters

    Begin with a host α-helix (yyyyyyy):
      y = some residue stable as an α-helix;
      transition free energy per residue: ΔG°(y).

    Replace 1 y with a guest residue (X):
      yields the sequence yyy-X-yyy.

    Measure ΔG° (kJ/mol) for all X values:
      then, ΔG°(X) = ΔG°(y) + ΔΔG°(X).
      Values yield α-helix propensities.

    Here, shown normalized relative to the ΔG° of Gly,
      since Gly lacks a Cβ (R = H).
      All but Pro are more favorable than Gly: Pro is a strong α-helix breaker.

    Conversion from ΔG°(X) to s:
      assume σ ~ independent of X;
      then, s = exp[-ΔG°(X)/RT].
      Here, s = 1.0 means neutral favorability.
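    The conversion to s is a one-line calculation. A minimal sketch; the function name, the default temperature of 298.15 K, and the example free energies are illustrative assumptions, not measured host-guest values.

```python
import math

R_KJ = 8.314e-3  # gas constant, kJ/(mol K)

def s_from_dg(dg_kj_per_mol, temp_k=298.15):
    """Convert a per-residue transition free energy (kJ/mol) to s = exp(-dG/RT)."""
    return math.exp(-dg_kj_per_mol / (R_KJ * temp_k))

# Hypothetical free energies: favorable, neutral, unfavorable.
for dg in (-1.0, 0.0, 2.0):
    print(dg, round(s_from_dg(dg), 3))
```

    A negative free energy gives s > 1 (helix favored), zero gives s = 1 (neutral), and a positive value gives s < 1.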


    Modeling Melting Initiation

    Consider an N-residue polypeptide
      in the fully helical conformation: hhhhhh
      weight: ω_N = σ s^N.

    Melting of the helix
      can occur by 2 fundamentally different processes:

      melting of an end residue:
        2 conformations: chhhhh and hhhhhc;
        total weight: ω_(N-1) = 2σ s^(N-1).

      melting of a middle residue:
        N-2 conformations, of the form hhhchhh; each has 2 helix-islands;
        total weight: ω_(N-1) = (N-2) σ² s^(N-1).

    More generally, a conformation with j helix-islands
      will contain j factors of σ.
      This motivates our Zipper model.
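    The island-counting rule can be verified by brute-force enumeration for a small chain. A minimal sketch, assuming each conformation is weighted by one factor of σ per helix-island and one factor of s per helical residue; the function names are illustrative.

```python
import re
from itertools import product

def conformation_weight(conf, s, sigma):
    """Weight = sigma**(number of helix islands) * s**(number of helical residues)."""
    islands = len(re.findall(r"h+", conf))
    return sigma ** islands * s ** conf.count("h")

def total_weight_with_one_coil(n, s, sigma):
    """Sum the weights of all n-residue conformations with exactly one coil residue."""
    total = 0.0
    for states in product("hc", repeat=n):
        conf = "".join(states)
        if conf.count("c") == 1:
            total += conformation_weight(conf, s, sigma)
    return total

n, s, sigma = 6, 1.1, 2e-4
print(total_weight_with_one_coil(n, s, sigma))
print(2 * sigma * s ** (n - 1) + (n - 2) * sigma ** 2 * s ** (n - 1))
```

    The two printed numbers agree: the enumeration recovers the end-melting term plus the middle-melting term.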


    End vs. Middle Melting

    The relative probabilities of initiating melting at the end vs. the middle
      are estimated by a ratio of statistical weights:
      Pe/Pm = 2σ s^(N-1) / [(N-2) σ² s^(N-1)] ≈ 2/(Nσ).

    We have 2 opposing factors:
      N = number of central melting points;
      σ = penalty for initiating melting at a given middle point.

    Assuming the typical experimental value (σ = 2 × 10⁻⁴),
      Pe/Pm = 1 when N ≈ 10⁴.

    For short helices (N < 10⁴ residues), σ dominates:
      the transition initiates at the ends.

    For long helices (N > 10⁴ residues), N overcomes σ:
      denaturation may then proceed from the middle.

    These trends are observed experimentally in globular proteins.
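    The crossover length follows from the ratio of weights. A minimal sketch of that ratio, in which the s factors cancel; the function name is illustrative.

```python
def end_vs_middle_ratio(n, sigma):
    """Pe/Pm = 2*sigma*s**(n-1) / [(n-2)*sigma**2*s**(n-1)]; the s factors cancel."""
    return 2.0 / ((n - 2) * sigma)

sigma = 2e-4
for n in (26, 1500, 10002, 100000):
    print(n, end_vs_middle_ratio(n, sigma))
```

    The ratio is far above 1 for the chain lengths studied experimentally (N = 26, 1500), reaches 1 near N = 2/σ, and falls below 1 for longer chains.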


    Predicting Protein Structure

    Given sets of (s, σ) values for all 20 amino acids, for formation of all types of 2° structure
      (coil to α-helix, β-strand, 3₁₀ helix, etc.),
    we should be able to apply a Zipper model
      to predict the probability of adopting each type of structure during folding and melting of globular proteins.

    Added complication:
      s = exp(-ΔG°/RT) also depends on external factors;
      e.g., ΔG° varies with residue environment.

    Globular proteins offer 2 distinct environments:
      hydrophobic: buried residues in the protein interior;
      hydrophilic: solvent-accessible residues at the protein surface.

    Meaningful assignment of an s value to each residue j
      demands knowledge of whether j is buried, in each context.


    Chou and Fasman

    Statistical thermodynamics is not routinely used to model protein folding.
      However, many statistical methods for predicting 2° structure have been developed.
      These incorporate many of the essential features of the Zipper model.

    The method of Chou and Fasman (1974)
      begins with an empirical set of residue parameters,
      defined not by measured transition energies (ΔG°),
      but by the statistical tendency of each residue to form each type of structure,
      as determined from the mole fractions present in actual protein crystals.
      The first parameters used data from 64 different proteins.


    Chou and Fasman Parameters

    For each type of amino acid, three types of parameters are computed:
      Pα = propensity to form an α-helix;
      Pβ = propensity to form a β-sheet;
      Pt = propensity to turn (adopt a coil).

    Example: determination of Pα values.
      First, a mole fraction is computed for each residue type, i:
        fα(i) = occurrence of i in an α-helix / occurrence of i in the data set.
      Secondly, an average α-helical amino acid is defined,
      with an average value of fα(i):
        <fα> = Σi fα(i) / 20.
      The parameter Pα(i) is then defined as a relative tendency:
        Pα(i) = fα(i) / <fα>;
      really just a weighted average.

    Repeated for each type of 2° structure.
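    The propensity calculation can be sketched in a few lines. This is a minimal illustration of the definition above, not the published parameter derivation: the function name is illustrative, the counts are invented toy numbers, and the mean is taken over however many residue types are supplied (20 for a full data set).

```python
def chou_fasman_propensities(in_state_counts, total_counts):
    """Return P(i) = f(i) / <f>, where f(i) = count of i in the state / count of i overall."""
    f = {aa: in_state_counts[aa] / total_counts[aa] for aa in total_counts}
    f_mean = sum(f.values()) / len(f)
    return {aa: f[aa] / f_mean for aa in f}

# Toy counts for three residue types (hypothetical, for illustration only).
helix_counts = {"A": 60, "G": 20, "P": 10}
all_counts = {"A": 100, "G": 100, "P": 100}
print(chou_fasman_propensities(helix_counts, all_counts))
```

    By construction the propensities average to 1, so values above 1 mark residues found in the structure more often than the average residue.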


    The Favorability of Propagation

    Parameters Pα and Pβ correspond to the mean propagation terms, s,
      in their respective Zipper models, averaged over solvent conditions.
      Ex.: Pα corresponds conceptually to s
      in the Zipper model of the coil to α-helix transition.

    A qualitative propensity is also assigned to each residue,
      relative to each type of structure.
      e.g., relative to α-helix formation, residues are categorized as:
        Strong Helix Formers (H)
        Average Helix Formers (h)
        Weak Helix Formers (I)
        Indifferent (i)
        Weak Helix Breakers (b)
        Strong Helix Breakers (B)

    Again, this is repeated for each type of 2° structure.


    Chou-Fasman Parameter Set

    Comparison with host-guest parameters:
      relative favorabilities: general agreement;
      differences in the ordering.
      P values play the role of propagation terms, s.

    Proline:
      low Pα and Pβ,
      due to restricted torsion angles;
      α-helix, β-sheet breaker.

    Glycine:
      low Pα, but high Pt:
      great conformational freedom;
      3rd residue of a Type II turn.


    The Cooperativity of Nucleation

    The cooperativity of 2° structure formation (i.e., the statistical unlikelihood of nucleation)
      is also included in the Chou-Fasman model,
      but implicitly, in the rules of region assignment: e.g., whether a sub-sequence is helix, sheet, or coil.

    Regions of 2° structure are assigned by inspection,
      where any 2° structure requires a string of residues of similar propensity.

    Example: for an α-helix,
      initiation of a helix requires a contiguous set of helix formers:
        H, h, or I, with I given ½-weight;
        clearly modeling the cooperativity of helix nucleation.
      Nucleated helices propagate through residues H, h, I, and i,
      and terminate when two or more helix breakers are encountered;
        again, modeling the cooperativity of the process.
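    These assignment rules can be sketched as code. This is a simplified illustration, not the full Chou-Fasman procedure: the 6-residue window and the threshold of 4 formers are assumptions (the slide gives no numbers), a single isolated breaker is assumed not to stop propagation, and the function names are illustrative.

```python
FORMER_WEIGHT = {"H": 1.0, "h": 1.0, "I": 0.5}  # weak formers (I) count half
BREAKERS = {"b", "B"}

def find_helix_nuclei(classes, window=6, min_formers=4.0):
    """Return start indices of windows whose weighted former count reaches the threshold."""
    starts = []
    for i in range(len(classes) - window + 1):
        score = sum(FORMER_WEIGHT.get(c, 0.0) for c in classes[i:i + window])
        if score >= min_formers:
            starts.append(i)
    return starts

def extend_right(classes, end):
    """Advance the exclusive helix end until two adjacent breakers (or a final breaker) are met."""
    n = len(classes)
    while end < n:
        next_is_breaker = classes[end] in BREAKERS
        following_is_breaker = end + 1 >= n or classes[end + 1] in BREAKERS
        if next_is_breaker and following_is_breaker:
            break
        end += 1
    return end

classes = "HHhHIibBii"  # hypothetical per-residue helix classes
print(find_helix_nuclei(classes), extend_right(classes, 6))
```

    Requiring several formers in one window before any helix is called plays the role of the nucleation penalty; once nucleated, extension is cheap until breakers cluster.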


    Example: Chou-Fasman Method

    Applied to the first 24 residues of adenylate kinase, the method predicts 2 structures:
      an N-terminal string with α-helix forming tendency;
        mean weight: <Pα> = 1.39.
      a 2nd string with both α-helix and β-sheet forming tendency;
        mean β-tendency higher: <Pβ> = 1.56.

    Experimentally, the strings correspond to an α-helix and a β-sheet, respectively.
      A β-turn (specific coil) is also observed,
      predicted by a hydropathy-based modification by Rose (1978).


    Example (cont.)

    Applied to the remainder of adenylate kinase,
      and also compared with a 2nd method (Nagano):
      best results are provided by a joint method,
      which here obtains ~70% accuracy.


    Evaluating Accuracy

    The most widely used measure: the overall, per-residue, 3-state accuracy (Q3):
      Q3 = [(PH + PE + PC)/N] × 100%
      N = total number of residues.
      PX = number of correctly predicted residues in state X.
      X = α-Helix, β-shEet, or Coil.

    Although other measures exist,
      Q3 is the most conceptually simple.

    Pioneering method of Chou-Fasman:
      overall accuracy of only about Q3 = 50%,
      as assessed by a database of 267 known structures;
      initially very popular, due to conceptual simplicity.
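    Q3 is simple to compute from two aligned state strings. A minimal sketch; the function name and the example strings are illustrative.

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state (H, E or C) matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("state strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Hypothetical 8-residue example with one mismatch at the sixth position.
print(q3("HHHEECCC", "HHHEEECC"))  # 87.5
```

    Summing matches over all positions is the same as adding PH, PE and PC, since every correct residue belongs to exactly one state.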


    Improvements on Chou-Fasman

    Many improvements have appeared:
      they differ in parameter definition and application;
      an in-depth consideration is beyond the scope of this course;
      however, success is correlated with the addition of relevant statistical information.

    [1] Information regarding residue context,
      i.e., the propensity of a residue to adopt a given state
      is determined by its n neighboring residues,
      as compared with observations in a database.
      We examine the GOR method (Garnier, 1987).

    [2] Information regarding homologous proteins:
      the protein is first subjected to multiple alignment,
      to identify homologous proteins;
      prediction is then based on consensus propensities.
      We examine the PHD method (Rost and Sander, 1993).


    The GOR Method

    The propensity of a residue to adopt state S is defined not only by its own identity,
      as in Chou-Fasman,
      but also by the identities of neighboring residues.

    GOR uses a 17-residue window:
      a central, predicted residue + 8 flanking residues on each side;
      e.g., residues 4-20 are used to predict the state of the 12th residue (F) of adenylate kinase.


    The GOR Method (cont.)

    Using sequences in the database, 3 scoring matrices, MS, were first constructed:
      one for each of the 3 basic helical states, S = {H, E, C}.

    Each is a 20x17 matrix, with elements mxy:
      row, x = amino acid type (e.g., Ala);
      column, y = residue position within the window;
      mxy = the probability that the residue at position y is of type x, given that the central residue is in state S.
      So, the sum of the mxy values in each column is 1.

    Again, each matrix is constructed in advance, from observed frequencies in the database
      (e.g., from all known protein structures).

    Any candidate sequence is evaluated at each position, k,
      by applying each scoring matrix, MS:
      residue k is taken as the central residue of the window,
      and elements (mxy) are summed over all 1 ≤ y ≤ 17, such that the residue at each window position, y, is of type x.
      The highest of the 3 sums (H, E, and C) yields the prediction, S, for that residue's state.
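    The window-scoring step can be sketched as follows. This follows the slide's description literally (summing matrix elements), uses a toy 2-letter alphabet and a 3-residue window for brevity, and skips window positions that fall off the chain ends, which is an assumption. All names and matrix values are illustrative; published GOR versions score with information values rather than raw probabilities.

```python
def predict_state(seq, k, matrices, half_window=8):
    """Score position k under each state's matrix and return (best_state, scores)."""
    scores = {}
    for state, m in matrices.items():
        total = 0.0
        for offset in range(-half_window, half_window + 1):
            pos = k + offset
            if 0 <= pos < len(seq):  # assumption: positions beyond the chain ends are skipped
                total += m[seq[pos]][offset + half_window]
        scores[state] = total
    return max(scores, key=scores.get), scores

# Toy matrices: rows are residue types, columns are window positions; each column sums to 1.
toy = {
    "H": {"A": [0.7, 0.8, 0.7], "G": [0.3, 0.2, 0.3]},
    "C": {"A": [0.4, 0.3, 0.4], "G": [0.6, 0.7, 0.6]},
}
print(predict_state("AAG", 1, toy, half_window=1)[0])
print(predict_state("GGG", 1, toy, half_window=1)[0])
```

    With real data each matrix would have 20 rows and 17 columns, and `half_window` would stay at its default of 8.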


    Example: The GOR Method

    An application of GOR IV is shown below,
      run at the Network Protein Sequence @nalysis site:
      http://npsa-pbil.ibcp.fr/cgi-bin/secpred_gor4.pl
      on the 1st 24 residues of adenylate kinase:
      this method correctly predicts the β-sheet and β-turn;
      the α-helix at residues (1-8) is predicted as coil,
      although its structural propensity to form a helix is noted (blue line).

    Overall, GOR IV has an accuracy of Q3 = 64.4%.


    Use of Multiple Alignments

    The method of multiple alignments was first used to aid in protein 2° structure prediction
      by Zvelebil (1987), in combination with the GOR I method;
      accuracy improved by 9%.

    Basic idea:
      given a sequence to be evaluated,
      identify a set of homologous (i.e., similar) sequences,
      each with > 25% sequence identity;
      2° structure prediction is then based on consensus propensities.

    One popular multiple-alignment-based method:
      the PHD Method:
      Profile network from HeiDelberg;
      combines sequence homology information with a neural network.


    Profile network from HeiDelberg

    The PHD Method (Rost and Sander, 1993) combines sequence homology information
      with the optimization strength of a 2-layer neural network.

    1st Layer: Raw Predictions.
      Input:
        fractions of the 20 types of residue at each multiple-alignment position
        in a 13-residue window around the evaluated residue, k;
        total of 20x13 = 260 input nodes.
      Output: probability for each state (PH, PE, PC).

    2nd Layer: Elimination of Infeasible Structures.
      Input: output of the first layer,
        after application to each residue in the chain.
      Output: refined probabilities;
        refines the raw predictions of the 1st layer,
        e.g., HHHEEHH becomes HHHHHHH.
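    The 260-node input of the 1st layer can be sketched as a profile window. This only illustrates the encoding described above, not the PHD network itself: the function names are illustrative, gap characters are assumed to be ignored when computing fractions, and window positions beyond the chain ends are assumed to be zero-filled.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_fractions(column):
    """Fractions of the 20 residue types in one alignment column (gaps ignored)."""
    residues = [c for c in column if c in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]

def window_input(alignment, k, half_window=6):
    """Flatten the profile of a (2*half_window + 1)-column window centered on k."""
    length = len(alignment[0])
    vec = []
    for pos in range(k - half_window, k + half_window + 1):
        if 0 <= pos < length:
            vec.extend(column_fractions([seq[pos] for seq in alignment]))
        else:
            vec.extend([0.0] * len(AMINO_ACIDS))  # assumption: zero padding past chain ends
    return vec

# Toy alignment of three equal-length sequences (hypothetical).
aln = ["ACDEFGHIKLMNPQ", "ACDEFGHIKLMNPQ", "GCDEFGHIKLMNPQ"]
print(len(window_input(aln, 0)))  # 260
```

    Feeding residue fractions rather than a single sequence is how the homology information enters the network.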


    Example

    An application of PHD is shown below,
      run at the PredictProtein server (Columbia):
      http://dodo.cpmc.columbia.edu/predictprotein/submit_def.html
      initial homology search: PSI-BLAST;
      applied to the 1st 24 residues of adenylate kinase.

    This method correctly:
      predicts the β-sheet (r10-r14):
        E region (blue);
      predicts the reverse-turn (r16-r22):
        L region (green).

    However, the α-helix is predicted to be coil, as in the GOR method,
      but a 30% α-helical probability is assigned to the region (red).

    Overall PHD accuracy: Q3 = 70.8%.


    Conclusion

    In this lecture, the helix-coil transition was used to discuss:
      the coil to α-helix transition of a model polypeptide system,
        poly-[γ-benzyl-L-glutamate];
      the sequence-dependence of s and σ;
      the tendency of short polypeptides to melt at helix ends;
      the lower cooperativity of 3₁₀ helix formation.

    Limitations of the Zipper model were then discussed:
      the dependence of s on the (unknown) residue environment,
      so that purely statistical methods of prediction are more usual;
      the conceptual relationship between the Zipper model
      and statistical methods of predicting protein 2° structure.