Automating Steps in Protein Structure Determination
by NMR
CS 296.4 April 13, 2009
Outline
Background
  Steps in NMR protein structure determination
  The ACE cycle (Assign-Calculate-Evaluate)
  The assignment problem
Algorithms for automated NOE assignment
  Semi-automated methods
  More-automated methods
Conclusions
The Steps in Protein Structure Determination by NMR
1. Sample preparation
   (a) protein selection
   (b) gene engineering
   (c) protein expression
   (d) protein purification
   (e) buffer optimization
   (f) isotope labeling
2. Data collection
   (a) HSQC
   (b) amide H/D exchange
   (c) triple-resonance
3. Data evaluation
   (a) spectrum calculation
   (b) peak picking
4. Structure calculation
5. Structure refinement
6. Structure deposition (and maybe write a paper and graduate)
Automatable Steps in Protein Structure Determination by NMR
1. Sample preparation
2. Data collection
3. Data evaluation
4. Structure calculation
5. Structure refinement
6. Structure deposition
Fig. 2 (2003) Progress in NMR Spectroscopy, 43, 105, Guntert.
The Assign-Calculate-Evaluate cycle in automated NOE assignment and structure calculation.
Automating NOE Assignments and THE Assignment Problem
There are MANY assignment tasks
1. Resonance Assignment (interpreting data)
2. NOE Assignment (interpreting data)
and one major assignment problem: ambiguous assignments
Due to the data collection problems of
1. Completeness (missing data points)
2. Uniqueness (unresolvable data points)
from Fig. 3 (2003) Progress in NMR Spectroscopy, 43, 105, Guntert.
Unambiguously assigning a NOESY cross peak
Automated NMR Protein structure calculation
Peter Guntert (2003) Progress in NMR Spectroscopy, 43, 105-125
Algorithms for automated NOESY assignment
Semi-automated methods
1. ASsign NOEs (1993)
2. Structure Assisted NOE Evaluation (2001)
More-automated methods
1. NOAH (1995)
2. Ambiguous Restraints Iterative Assignments (1997)
3. AutoStructure (1999)
4. KNOWledge-based NOE assignments (2002)
5. CANDID (2002)
ASNO (1993) Guntert, Berndt, & Wuthrich
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
4. Set of estimated structures
User specifies
1. Tolerance = max allowed chemical shift error
2. dmax = max interproton distance causing NOE
3. nmin = min # structures with d < dmax
Algorithm steps
1. For each cross peak: find all possible assignments (1Hj, 1Hk)
2. For each (1Hj, 1Hk): n = # of structures with d < dmax
3. Prune all (1Hj, 1Hk) with n < nmin
User intervention
1. Manually check and refine NOE assignments (1Hj, 1Hk)
2. Refine set of structures and rerun algorithm
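The ASNO pruning steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the proton labels, the dictionary layout for shifts and coordinates, and the toy numbers are all hypothetical, and `tol`, `d_max`, and `n_min` stand in for the three user-specified parameters.

```python
from itertools import product
from math import dist

def candidate_assignments(peak, shifts, tol):
    """All proton pairs whose shifts match the peak's two coordinates within tol (ppm)."""
    w1, w2 = peak
    first = [name for name, s in shifts.items() if abs(s - w1) <= tol]
    second = [name for name, s in shifts.items() if abs(s - w2) <= tol]
    return [(a, b) for a, b in product(first, second) if a != b]

def asno_filter(peaks, shifts, structures, tol, d_max, n_min):
    """For each peak, keep the candidate pairs that are closer than d_max
    in at least n_min of the estimated structures."""
    kept = {}
    for peak in peaks:
        survivors = []
        for a, b in candidate_assignments(peak, shifts, tol):
            # n = number of structures in which the two protons are within d_max
            n = sum(1 for coords in structures if dist(coords[a], coords[b]) < d_max)
            if n >= n_min:
                survivors.append((a, b))
        kept[peak] = survivors
    return kept

# Toy input (hypothetical values): three protons, two estimated structures, one peak.
shifts = {"HA_5": 4.20, "HN_9": 8.10, "HN_30": 8.11}
structures = [
    {"HA_5": (0, 0, 0), "HN_9": (3, 0, 0), "HN_30": (12, 0, 0)},
    {"HA_5": (0, 0, 0), "HN_9": (4, 0, 0), "HN_30": (11, 0, 0)},
]
peaks = [(4.21, 8.10)]
result = asno_filter(peaks, shifts, structures, tol=0.02, d_max=5.0, n_min=2)
print(result)
```

In the toy input the peak matches two proton pairs by chemical shift alone, and the distance check removes the pair that is far apart in both structures. Peaks left with more than one surviving pair would still need the manual check listed under user intervention.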
Fig. 1 (1993) J Biomol NMR, 3, 601, Guntert, Berndt, & Wuthrich.
Demo: Dendrotoxin K, 7 kDa, 57 AA, bbRMSD = 0.32 Å
SANE (2001) Duggan, Legge, Dyson, & Wright
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
User specifies filters
1. Distance (set of estimated structures)
2. Chemical shift (max allowed error)
3. Secondary structure (unlikely NOE assignments)
4. Assignment (expected NOE assignments)
5. NOE contribution (same as in ARIA method)
Algorithm steps
1. For each cross peak: find all possible assignments (1Hj, 1Hk)
2. Apply five filters to prune list of (1Hj, 1Hk)
3. Write unique or ambiguous distance restraints, or violations
User intervention
1. Violation analysis
Fig. 1 (2001) J Biomol NMR, 19, 321, Duggan, et al.
Demo: LFA-1 I-domain, 21.3 kDa, 183 AA, bbRMSD = 0.29 Å
NOAH (1995) Mumenthaler & Braun
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
4. Scalar coupling constants (3JNH)
Algorithm calculates
1. Distance constraints from NOE assignments
2. Angle constraints from scalar couplings
Algorithm uses
1. Structure-based filter (recognizes correct constraints)
2. Chemical shift limit (max allowed error)
3. Error-tolerant target function in DIAMOD (1994) (minimizes effect of incorrect distance constraints from incorrect NOE assignments)
Fig. 1 (1995) J Mol Biol, 254, 465, Mumenthaler & Braun.
Demo: 3 proteins ranging from 57 to 74 residues (DEN = 57, TEN = 74, REP = 69 residues)
ARIA (1997) Nilges, et al.
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
4. Assignment cutoff, p, decreases for each cycle
5. (opt) preliminary structures, manual assignments
6. (opt) RDCs, scalar couplings, dihedral angles, S-S bonds or H-bonds
Algorithm calculates in each cycle
1. Unique and partial NOE assignments
2. Unique and ambiguous distance restraints
3. Merges distance restraints with other input data
4. Bundle of refined structures (typically 20)
Ambiguous restraints
An NOE cross peak with more than one possible assignment is treated as a weighted composite of all of them. Ambiguous distance restraints are introduced to incorporate the distance dk of each ambiguous NOE assignment.
To reduce the number of assignment possibilities, each relative contribution Ck is calculated from dk and the average distance for all possible assignments, taken over the lowest n of the 20 conformers from the previous cycle. The largest Ck that add up to the cutoff value, p, for that cycle are kept; the rest are discarded.
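The slide does not spell out the formulas, so the sketch below assumes the r^-6 summation usually described for ARIA: the effective distance of an ambiguous restraint is (sum of dk^-6)^(-1/6), and the relative contribution is Ck = dk^-6 / (sum of di^-6). The function names and example distances are hypothetical, and the distances passed in are assumed to be already averaged over the selected conformers from the previous cycle.

```python
def ambiguous_distance(distances):
    """Effective distance of an ambiguous restraint: (sum of d_k^-6)^(-1/6)."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)

def contributions(distances):
    """Relative contribution C_k = d_k^-6 / sum_i d_i^-6 of each possible assignment."""
    total = sum(d ** -6 for d in distances)
    return [d ** -6 / total for d in distances]

def partial_assignment(distances, p):
    """Indices of the largest contributions, added in decreasing order
    until their running sum reaches the cutoff p; the rest are discarded."""
    c = contributions(distances)
    order = sorted(range(len(c)), key=lambda k: c[k], reverse=True)
    kept, running = [], 0.0
    for k in order:
        kept.append(k)
        running += c[k]
        if running >= p:
            break
    return kept

# Hypothetical averaged distances (in Angstrom) for three candidate assignments of one peak.
d = [3.0, 4.0, 8.0]
print(ambiguous_distance(d))        # shorter than the shortest individual distance
print(contributions(d))             # dominated by the 3.0 Angstrom candidate
print(partial_assignment(d, 0.95))  # keeps the two closest candidates
print(partial_assignment(d, 0.80))  # a lower cutoff keeps only the closest one
```

Because of the inverse sixth power, the closest candidate dominates the sum, which is why lowering p from cycle to cycle prunes the assignment list quickly.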
Fig. 1 (1997) J Mol Biol, 269, 408, Nilges, et al.
Demo: β-spectrin PH domain, 106 residues
Table 1 (1997) J Mol Biol, 269, 408, Nilges, et al.
MAN data are derived from manual assignments. The 80 ms and 30 ms data differ only in mixing times.
AutoStructure (1999) Moseley & Montelione
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
4. Scalar couplings
5. Slow amide H/D exchange data
6. Preliminary structure
7. Preliminary H-bonded pairs
Algorithm calculates
1. Distance restraints
2. Dihedral angle restraints
3. H-bonding pairs
4. Refined structures
Fig. 1 (1999) Curr. Opin. Struct. Biol., 9, 635, Moseley & Montelione. (& Y.J. Huang PhD thesis)
basic fibroblast growth factor (127 residues)
(a) 10 NMR-derived structures
(b) Manual and AutoStructure-derived structures (bbRMSD = 0.7 Å between them)
KNOWNOE (2002) Gronwald, et al.
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
4. NOESY cross peak volume probability distribution
5. Preliminary structure
User specifies
1. Max allowed chemical shift error
2. Initial value of dmax = max interproton distance
3. Number, N, of current best structures
Algorithm, working together with CNS, iteratively
1. builds an A-list of uniquely assigned NOE cross peaks
2. calculates P(Ak, a | Vo) for all other peaks
3. adds to the A-list all peaks with P(Ak, a | Vo) > cutoff (0.8-0.9)
4. uses the current A-list to calculate N structures
The problem of ambiguous assignments is addressed with a Bayesian algorithm based on NOE cross peak volume probability distributions derived from 326 spectra.
P(Ak, a | Vo) = probability that more than fraction a of cross peak volume Vo is due to assignment k
If P(Ak, a | Vo) > cutoff value (typically 0.8 to 0.9), then that peak is considered assigned to k for the next cycle.
The authors state that their algorithm is “Based on the observation that cross peak volume and correct cross peak assignment are not independent of each other”.
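The cutoff rule can be illustrated with a short Python sketch. It covers only the decision step: the probabilities P(Ak, a | Vo) are taken as precomputed inputs, because computing them requires the volume distributions from the paper. The function name, peak labels, proton labels, and probability values are all hypothetical.

```python
def extend_a_list(a_list, probabilities, cutoff=0.9):
    """Return a copy of the A-list extended with every unassigned peak whose
    most probable assignment k has P(Ak, a | Vo) above the cutoff."""
    updated = dict(a_list)
    for peak, candidates in probabilities.items():
        if peak in updated or not candidates:
            continue
        best, p_best = max(candidates.items(), key=lambda item: item[1])
        if p_best > cutoff:
            updated[peak] = best
    return updated

# Hypothetical example: one peak is already uniquely assigned, two are ambiguous.
a_list = {"peak1": ("HA_5", "HN_9")}
probabilities = {
    "peak2": {("HB_3", "HN_7"): 0.93, ("HB_3", "HN_40"): 0.05},
    "peak3": {("HG_12", "HN_20"): 0.60, ("HG_12", "HN_41"): 0.40},
}
new_a_list = extend_a_list(a_list, probabilities, cutoff=0.9)
print(new_a_list)
```

In this example peak2 joins the A-list and peak3 stays ambiguous until a later cycle, when structures calculated from the extended A-list may sharpen its probabilities.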
Figures 3 & 4 (2002) J. Biomol. NMR, 23, 271, Gronwald, et al. Probability distributions of distance (left) and volume (right)
CANDID (2002) Herrmann, Guntert & Wuthrich
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (of chemical shift pairs)
4. Previously assigned NOE distance constraints
5. (opt) other conformational constraints
User specifies
1. Max allowed chemical shift error
2. Cycle-dependent parameters (thresholds, cutoffs, etc.)
Algorithm uses
1. Structure-based filters (like NOAH)
2. Ambiguous distance constraints (like ARIA)
3. Network anchoring (new)
4. Constraint combination (new)
from (2002) J. Mol. Biol., 319, 209, Herrmann, Guntert, & Wuthrich.
Fig. 1 (2002) J. Mol. Biol., 319, 209, Herrmann, Guntert, & Wuthrich.
Ways to handle the problems caused by having no preliminary structure in the first cycle
1. Network anchoring “… evaluates the self-consistency of NOE assignments independent of knowledge of the 3D protein structure.”
“… a sensitive approach for detecting erroneous ‘lonely’ constraints …”
2. Constraint combination “… an extension of the concept of ambiguous NOE assignments.”
“… reduces the impact of unidentified artifact constraints in the input for the first structure calculation.”
Result: “The correct fold is obtained in cycle 1 of a de novo structure calculation.”
from (2002) J. Mol. Biol., 319, 209, Herrmann, Guntert, & Wuthrich.
Questions?
Conclusions