Automating Steps in Protein Structure Determination by NMR CS 296.4 April 13, 2009


Outline

Background
1. Steps in NMR protein structure determination
2. The ACE cycle (Assign-Calculate-Evaluate)
3. The assignment problem

Algorithms for automated NOE assignment
1. Semi-automated methods
2. More-automated methods

Conclusions

The Steps in Protein Structure Determination by NMR

1. Sample preparation
   (a) protein selection
   (b) gene engineering
   (c) protein expression
   (d) protein purification
   (e) buffer optimization
   (f) isotope labeling
2. Data collection
   (a) HSQC
   (b) amide H/D exchange
   (c) triple-resonance
3. Data evaluation
   (a) spectrum calculation
   (b) peak picking
4. Structure calculation
5. Structure refinement
6. Structure deposition (and maybe write a paper and graduate)

Automatable Steps in Protein Structure Determination by NMR

1. Sample preparation
2. Data collection
3. Data evaluation
4. Structure calculation
5. Structure refinement
6. Structure deposition

Fig. 2 (2003) Progress in NMR Spectroscopy, 43, 105, Güntert.
The Assign-Calculate-Evaluate cycle in automated NOE assignment and structure calculation.

Automating NOE Assignments and THE Assignment Problem

There are MANY assignment tasks

1. Resonance Assignment (interpreting data)
2. NOE Assignment (interpreting data)

and one major assignment problem: ambiguous assignments

Due to the data collection problems of
1. Completeness (missing data points)
2. Uniqueness (unresolvable data points)

from Fig. 3 (2003) Progress in NMR Spectroscopy, 43, 105, Güntert.

Unambiguously assigning a NOESY cross peak

Automated NMR protein structure calculation
Peter Güntert (2003) Progress in NMR Spectroscopy, 43, 105-125

Algorithms for automated NOESY assignment

Semi-automated methods
1. ASsign NOEs (ASNO, 1993)
2. Structure Assisted NOE Evaluation (SANE, 2001)

More-automated methods
1. NOAH (1995)
2. Ambiguous Restraints for Iterative Assignment (ARIA, 1997)
3. AutoStructure (1999)
4. KNOWledge-based NOE assignments (KNOWNOE, 2002)
5. CANDID (2002)

ASNO (1993) Güntert, Berndt, & Wüthrich

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)
4. Set of estimated structures

User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. dmax = max interproton distance causing NOE
3. nmin = min # structures with d < dmax

Algorithm steps
1. For each cross peak: find all possible assignments (¹Hj, ¹Hk)
2. For each (¹Hj, ¹Hk): n = # of structures with d < dmax
3. Prune all (¹Hj, ¹Hk) with n < nmin

User intervention
1. Manually check and refine NOE assignments (¹Hj, ¹Hk)
2. Refine set of structures and rerun algorithm
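The three algorithm steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the data layout (a dict of proton shifts, one distance lookup per structure), the proton labels, and the function names `candidate_assignments` and `prune_by_structures` are all hypothetical, and `tol`, `d_max`, and `n_min` stand in for the three user-specified parameters.

```python
from itertools import product

def candidate_assignments(peak, shifts, tol):
    """Step 1: all proton pairs whose shifts match the peak within tol (ppm)."""
    w1, w2 = peak
    first = [h for h, s in shifts.items() if abs(s - w1) <= tol]
    second = [h for h, s in shifts.items() if abs(s - w2) <= tol]
    return [(hj, hk) for hj, hk in product(first, second) if hj != hk]

def prune_by_structures(candidates, structures, d_max, n_min):
    """Steps 2-3: keep pairs closer than d_max in at least n_min structures."""
    kept = []
    for hj, hk in candidates:
        # n = number of structures in which this proton pair is within d_max
        n = sum(1 for dist in structures if dist[frozenset((hj, hk))] < d_max)
        if n >= n_min:
            kept.append((hj, hk))
    return kept

# Hypothetical toy data: two beta protons overlap in chemical shift.
shifts = {"HA_5": 4.50, "HB_12": 1.20, "HB_30": 1.21}
structures = [
    {frozenset(("HA_5", "HB_12")): 3.1, frozenset(("HA_5", "HB_30")): 9.8},
    {frozenset(("HA_5", "HB_12")): 3.4, frozenset(("HA_5", "HB_30")): 8.7},
]
peak = (4.50, 1.20)
candidates = candidate_assignments(peak, shifts, tol=0.02)
assigned = prune_by_structures(candidates, structures, d_max=5.0, n_min=2)
```

In this toy case the chemical shifts alone leave two candidates, and the structure check removes the pair that is far apart in both structures.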

Fig. 1 (1993) J Biomol NMR, 3, 601, Güntert, Berndt, & Wüthrich.
demo: Dendrotoxin K, 7 kDa, 57 AA, bbRMSD = 0.32 Å

SANE (2001) Duggan, Legge, Dyson, & Wright

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)

User specifies filters
1. Distance (set of estimated structures)
2. Chemical shift (tolerance = max allowed error)
3. Secondary structure (unlikely NOE assignments)
4. Assignment (expected NOE assignments)
5. NOE contribution (same as in ARIA method)

Algorithm steps
1. For each cross peak: find all possible assignments (¹Hj, ¹Hk)
2. Apply five filters to prune list of (¹Hj, ¹Hk)
3. Write unique or ambiguous distance restraints, or violations

User intervention
1. Violation analysis
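The filter-then-write step can be pictured as a chain of predicates followed by a three-way classification. The sketch below is illustrative only: `classify_peak`, the toy distances, and the excluded set are hypothetical, and treating a peak with no surviving candidate as a "violation" is an assumption about what the slide means by that term.

```python
def classify_peak(candidates, filters):
    """Apply each filter in turn, then label whatever survives."""
    for keep in filters:
        candidates = [c for c in candidates if keep(c)]
    if not candidates:
        return "violation", []
    if len(candidates) == 1:
        return "unique", candidates
    return "ambiguous", candidates

# Hypothetical candidate pairs for one cross peak, with model distances in Angstrom.
distances = {
    ("HA_5", "HB_12"): 3.2,
    ("HA_5", "HB_30"): 9.1,
    ("HA_5", "HG_41"): 4.4,
}
# Hypothetical pairs ruled out by a secondary-structure filter.
excluded = {("HA_5", "HG_41")}

filters = [
    lambda pair: distances[pair] < 5.0,   # distance filter
    lambda pair: pair not in excluded,    # secondary-structure filter
]
label, kept = classify_peak(list(distances), filters)
```

Keeping the filters as a plain list makes it easy to switch individual filters on or off between runs.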

Fig. 1 (2001) J Biomol NMR, 19, 321, Duggan, et al.
demo: LFA-1 I-domain, 21.3 kDa, 183 AA, bbRMSD = 0.29 Å

NOAH (1995) Mumenthaler & Braun

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)
4. Scalar coupling constants (³JNH)

Algorithm calculates
1. Distance constraints from NOE assignments
2. Angle constraints from scalar couplings

Algorithm uses
1. Structure-based filter (recognizes correct constraints)
2. Chemical shift limit (tolerance = max allowed error)
3. Error-tolerant target function in DIAMOD (1994) (minimizes effect of incorrect distance constraints from incorrect NOE assignments)

Fig. 1 (1995) J Mol Biol, 254, 465, Mumenthaler & Braun
demo: 3 proteins ranging from 57 to 74 residues

(1995) J Mol Biol, 254, 465, Mumenthaler & Braun
DEN = 57, TEN = 74, REP = 69 residues

ARIA (1997) Nilges, et al.

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)
4. Assignment cutoff, p, decreases for each cycle
5. (opt) preliminary structures, manual assignments
6. (opt) RDCs, scalar couplings, d-angles, S-S or H-bonds

Algorithm calculates in each cycle
1. Unique and partial NOE assignments
2. Unique and ambiguous distance restraints
3. Merges distance restraints with other input data
4. Bundle of refined structures (typically 20)

ARIA (1997) Nilges, et al.

Ambiguous restraints

An NOE cross peak with more than one possible assignment is treated as a weighted composite of all of them. Ambiguous distance restraints are introduced to incorporate the distance dk of each ambiguous NOE assignment.

To reduce the number of assignment possibilities, each relative contribution Ck is calculated from dk, the average distance for that assignment over the n lowest-energy of the 20 conformers from the previous cycle. The largest Ck that add up to the cutoff value, p, for that cycle are kept; the rest are discarded.
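The pruning rule above can be sketched in Python. The slide does not give the formulas, so the r^-6 forms below are an assumption (they are the ones commonly used for ambiguous distance restraints): an effective distance (sum of dk^-6)^(-1/6) and a relative contribution Ck = dk^-6 / (sum of dj^-6). The function names and toy distances are hypothetical.

```python
def effective_distance(distances):
    """r^-6 summed distance for one ambiguous restraint (assumed form)."""
    return sum(d ** -6 for d in distances) ** (-1 / 6)

def prune_contributions(distances, p):
    """Keep the largest contributions C_k until their sum reaches the cutoff p."""
    weights = {k: d ** -6 for k, d in distances.items()}
    total = sum(weights.values())
    kept, running = [], 0.0
    # Walk through assignments from largest to smallest contribution.
    for k, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(k)
        running += w / total
        if running >= p:
            break
    return kept

# Hypothetical average distances (Angstrom) for three possible assignments.
d_avg = {"a": 3.0, "b": 4.0, "c": 6.0}
early_cycle = prune_contributions(d_avg, p=0.95)  # looser cutoff keeps more
late_cycle = prune_contributions(d_avg, p=0.80)   # tighter cutoff keeps fewer
d_eff = effective_distance(d_avg.values())
```

Because of the r^-6 weighting, the shortest distance dominates: the effective distance is always slightly below the smallest dk, and lowering p from cycle to cycle steadily strips away the distant, unlikely assignments.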

Fig. 1 (1997) J Mol Biol, 269, 408, Nilges, et al.
demo: β-spectrin PH domain, 106 residues

Table 1 (1997) J Mol Biol, 269, 408, Nilges, et al.
β-spectrin PH domain, 106 residues

MAN data derived from manual assignments
80 ms and 30 ms data differ only in mixing times

AutoStructure (1999) Moseley & Montelione

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)
4. Scalar couplings
5. Slow amide H/D exchange data
6. Preliminary structure
7. Preliminary H-bonded pairs

Algorithm calculates
1. Distance restraints
2. Dihedral angle restraints
3. H-bonding pairs
4. Refined structures

Fig. 1 (1999) Curr. Opin. Struct. Biol., 9, 635, Moseley & Montelione. (& Y.J. Huang PhD thesis)

basic fibroblast growth factor (127 residues)

(a) 10 NMR-derived structures
(b) manual and AutoStructure-derived structures, bbRMSD = 0.7 Å between them

KNOWNOE (2002) Gronwald, et al.

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)
4. NOESY cross peak volume probability distribution
5. Preliminary structure

User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. Initial value of dmax = max interproton distance
3. Number, N, of current best structures

Algorithm, working together with CNS, iteratively will
1. build A-list of uniquely assigned NOE cross peaks
2. calculate P(Ak, a | Vo) for all other peaks
3. add to A-list all peaks with P(Ak, a | Vo) > cutoff (0.8-0.9)
4. use current A-list to calculate N structures

KNOWNOE (2002) Gronwald, et al.

The problem of ambiguous assignments is addressed with a Bayesian algorithm based on NOE cross peak volume probability distributions derived from 326 spectra.

P(Ak, a | Vo) = probability that more than fraction a of cross peak volume Vo is due to assignment k

If P(Ak, a | Vo) > cutoff value (typically 0.8 to 0.9), then consider that peak assigned to k for the next cycle.

These authors state that their algorithm is “Based on the observation that cross peak volume and correct cross peak assignment are not independent of each other”.
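The acceptance rule can be sketched as one update of the A-list. This sketch covers only the thresholding step: the probabilities are taken as given inputs (the Bayesian calculation from volume distributions is not reproduced here), and `extend_a_list`, the peak labels, and the numbers are hypothetical.

```python
def extend_a_list(a_list, probabilities, cutoff=0.9):
    """Add peaks whose best candidate exceeds the cutoff; return a new A-list."""
    updated = dict(a_list)
    for peak, per_candidate in probabilities.items():
        if peak in updated:
            continue  # already assigned in an earlier cycle
        best, p_best = max(per_candidate.items(), key=lambda kv: kv[1])
        if p_best > cutoff:
            updated[peak] = best
    return updated

# Hypothetical state after one cycle: peak1 is already uniquely assigned.
a_list = {"peak1": ("HA_5", "HN_6")}
# Placeholder values standing in for P(Ak, a | Vo) of each candidate k.
probabilities = {
    "peak2": {("HA_5", "HB_12"): 0.95, ("HA_5", "HB_30"): 0.03},
    "peak3": {("HN_8", "HB_12"): 0.55, ("HN_8", "HB_30"): 0.40},
}
new_a_list = extend_a_list(a_list, probabilities, cutoff=0.9)
```

Here peak2 joins the A-list and peak3 stays ambiguous until a later cycle, when better structures may sharpen its probabilities.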

Figures 3 & 4 (2002) J. Biomol. NMR, 23, 271, Gronwald, et al.
Probability distributions of distance (left) and volume (right)

CANDID (2002) Herrmann, Güntert & Wüthrich

Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (a pair of chemical shifts for each peak j)
4. Previously assigned NOE distance constraints
5. (opt) other conformational constraints

User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. Cycle-dependent parameters (thresholds, cutoffs, etc.)

Algorithm uses
1. Structure-based filters (like NOAH)
2. Ambiguous distance constraints (like ARIA)
3. Network anchoring (new)
4. Constraint combination (new)

from (2002) J. Mol. Biol., 319, 209, Herrmann, Güntert, & Wüthrich.

Fig. 1 (2002) J. Mol. Biol., 319, 209, Herrmann, Güntert, & Wüthrich.

CANDID (2002) Herrmann, Güntert & Wüthrich

Ways to handle problems caused by having no preliminary structure in the first cycle

1. Network anchoring
   “… evaluates the self-consistency of NOE assignments independent of knowledge of the 3D protein structure.”
   “… a sensitive approach for detecting erroneous ‘lonely’ constraints …”

2. Constraint combination
   “… an extension of the concept of ambiguous NOE assignments.”
   “… reduces the impact of unidentified artifact constraints in the input for the first structure calculation.”

Result: “The correct fold is obtained in cycle 1 of a de novo structure calculation.”
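Why constraint combination softens artifacts can be seen in a small numerical sketch. It assumes the r^-6 summed effective distance for ambiguous constraints and takes the larger of the two upper bounds for the combined constraint; both are assumptions of this example, as are all names and numbers, and it is not the CANDID implementation.

```python
def effective_distance(distances):
    """r^-6 summed distance over all assignments of one constraint (assumed form)."""
    return sum(d ** -6 for d in distances) ** (-1 / 6)

def violation(constraint, true_distance):
    """Amount by which the effective distance exceeds the upper bound."""
    d_eff = effective_distance([true_distance[p] for p in constraint["pairs"]])
    return max(0.0, d_eff - constraint["upper"])

def combine(constraint_a, constraint_b):
    """Merge two constraints: union of assignments, looser upper bound."""
    return {
        "pairs": constraint_a["pairs"] + constraint_b["pairs"],
        "upper": max(constraint_a["upper"], constraint_b["upper"]),
    }

# Hypothetical distances (Angstrom) in the true structure.
true_distance = {("HA_5", "HB_12"): 4.0, ("HN_8", "HG_41"): 15.0}
correct = {"pairs": [("HA_5", "HB_12")], "upper": 5.0}
artifact = {"pairs": [("HN_8", "HG_41")], "upper": 5.0}  # erroneous constraint

alone = violation(artifact, true_distance)
combined = violation(combine(correct, artifact), true_distance)
```

On its own, the artifact demands that two protons 15 Å apart be within 5 Å and would distort the structure. Once merged with a correct constraint, the combined constraint is already satisfied by the correct pair, so the artifact no longer pulls on the fold.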

from (2002) J. Mol. Biol., 319, 209, Herrmann, Güntert, & Wüthrich.

Questions?

Conclusions