Computing Protein Structures from Electron Density Maps: The
Missing Loop Problem
I. Lotan, H. van den Bedem, A. Deacon and J.C. Latombe
Protein Structure: Experimental Techniques
Nuclear Magnetic Resonance (NMR) spectroscopy – limited to short sequences.
X-ray crystallography
X-ray Crystallography
Crystallize protein samples
Collect X-ray diffraction images
Calculate the electron density – a 3-D Electron Density Map (EDM)
Electron Density Map
3-D “image” of atomic structure
– High value (electron density) at atom centers
– Density falls off exponentially away from center
– Limited resolution, sampled on 3D grid
The End Goal: Build Protein Model from EDM
Completeness of automatically generated models varies with experimental data quality:
High resolution: 90% completeness.
Low resolution: 2/3 completeness.
Completing the missing fragments manually is time consuming.
Experimental Data Quality Varies
Recovering the phase of the diffracted beam introduces error.
Resolution at which the data were collected (high-resolution images cannot be obtained for all proteins).
Not all replicas of the protein in the crystal are identical:
– Mobility of molecule fragments
– Temperature-dependent atomic vibration
Existing Techniques
Existing software relies on:
– Pattern recognition techniques
– Unambiguous density
– Elementary stereochemical constraints
Model Refinement
Standard Maximum Likelihood (ML) algorithms exploit experimental and model phase information to build new refined models.
Iterating model building and refinement steps improves completeness and quality of models.
The problem: missing fragments (usually loops).
The solution: filling the gaps at an early stage.
Goal: Propose Candidates to Missing Fragments
Input:
– EDM
– Known structure
– Anchor residues
– The amino acid sequence
Output: a proposed structure that falls within the radius of convergence of existing refinement tools (1-1.5Å)
Model
Standard Phi-Psi model. Compute the backbone; ignore side chains except the Cβ and O atoms.
Loop closure: mobile anchor vs. stationary anchor. Closure is measured as the RMSD of the mobile anchor atoms from the stationary anchor atoms.
[Figure: loop with the stationary anchor and the mobile anchor labeled]
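The closure measure above can be sketched in a few lines. This is only an illustration: the function names, the coordinate layout (one (x, y, z) tuple per anchor atom, in matching order) and the tolerance argument d_close are assumptions, not taken from the slides.

```python
import math

def closure_rmsd(mobile_anchor, stationary_anchor):
    """RMSD (same units as the input, e.g. Å) between matching anchor atoms."""
    if not mobile_anchor or len(mobile_anchor) != len(stationary_anchor):
        raise ValueError("anchors must list the same atoms in the same order")
    total = 0.0
    for mobile_atom, stationary_atom in zip(mobile_anchor, stationary_anchor):
        # Squared distance between one mobile atom and its stationary partner.
        total += sum((m - s) ** 2 for m, s in zip(mobile_atom, stationary_atom))
    return math.sqrt(total / len(mobile_anchor))

def is_closed(mobile_anchor, stationary_anchor, d_close):
    """True when the loop is closed to within the tolerance d_close."""
    return closure_rmsd(mobile_anchor, stationary_anchor) <= d_close
```

A conformation would count as closing once is_closed returns True for the chosen tolerance.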
IK + EDM Loop Structure
Two-stage algorithm:
1. Guided by the EDM, sample closing conformations.
2. Refine top-ranking conformations, using local optimization, while maintaining loop closure.
Conformation ranking – density fit and conformational likelihood.
Stage 1: Generating Loop Candidates
Employ the cyclic coordinate descent (CCD) method to obtain closing conformations, up to a tolerance distance d_close.
Starting conformations are obtained by a random procedure, biased by PDB-derived distributions.
Best-scoring conformations (95th percentile) are submitted to stage 2.
Cyclic Coordinate Descent (CCD)
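A minimal sketch of the single-angle update at the core of CCD, assuming the usual formulation: one dihedral is changed at a time, and the rotation angle that minimizes the summed squared distance between the mobile anchor atoms and their stationary targets has a closed form. The function names and argument layout are illustrative and not taken from the slides; a full CCD loop would cycle this update over the loop's dihedrals until closure is within tolerance.

```python
import numpy as np

def rotate_about_axis(points, pivot, axis, angle):
    """Rotate points by `angle` (radians) about the line through `pivot` along `axis`."""
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    pivot = np.asarray(pivot, dtype=float)
    rel = np.asarray(points, dtype=float) - pivot
    # Rodrigues' rotation formula applied to every row of `rel`.
    rotated = (rel * np.cos(angle)
               + np.cross(u, rel) * np.sin(angle)
               + np.outer(rel @ u, u) * (1.0 - np.cos(angle)))
    return rotated + pivot

def ccd_optimal_angle(moving, target, pivot, axis):
    """Angle about the bond axis that brings `moving` closest to `target`."""
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    pivot = np.asarray(pivot, dtype=float)
    moving = np.asarray(moving, dtype=float)
    target = np.asarray(target, dtype=float)
    # Foot of the perpendicular from each moving atom onto the rotation axis.
    foot = pivot + np.outer((moving - pivot) @ u, u)
    r = moving - foot   # radial vector of each moving atom
    f = target - foot   # vector from the same foot point to the target atom
    # Squared distance is minimized when a*cos(t) + b*sin(t) is maximal.
    a = np.sum(f * r)
    b = np.sum(f * np.cross(u, r))
    return float(np.arctan2(b, a))
```

Applying rotate_about_axis with the returned angle gives the updated mobile anchor positions for that dihedral.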
Adding the Electron Density Constraints
We would like to guide the loop closing to fit the EDM.
For residue i, CCD proposes distance-minimizing dihedral angles (Φ,Ψ)_i^p.
Find a pair (Φ,Ψ)_i in a square neighborhood of (Φ,Ψ)_i^p that maximizes the local fit to the EDM. The neighborhood’s size is reduced linearly with CCD iterations to allow closure.
[Figure labels: atoms that are changed by angle pair i and not i+1; center of atom A_j]
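The density-guided step can be sketched as a small grid search around the pair that CCD proposes. Everything here is an assumption made for illustration: the grid resolution, the choice to shrink the neighborhood all the way to zero by the last iteration, and the density_fit callback, which stands in for whatever local fit score is computed from the EDM.

```python
def neighborhood_half_width(iteration, max_iterations, initial_half_width):
    """Half-width of the square neighborhood, shrinking linearly with iterations."""
    remaining = max(0.0, 1.0 - iteration / max_iterations)
    return initial_half_width * remaining

def best_pair_in_neighborhood(phi_p, psi_p, half_width, density_fit, steps=5):
    """Grid-search the square around (phi_p, psi_p) for the best local fit.

    density_fit(phi, psi) is a hypothetical callback returning a score
    where larger means a better fit to the density map.
    """
    best_pair = (phi_p, psi_p)
    best_score = density_fit(phi_p, psi_p)
    if half_width <= 0.0 or steps < 2:
        return best_pair  # neighborhood has collapsed: keep the CCD proposal
    for i in range(steps):
        phi = phi_p - half_width + 2.0 * half_width * i / (steps - 1)
        for j in range(steps):
            psi = psi_p - half_width + 2.0 * half_width * j / (steps - 1)
            score = density_fit(phi, psi)
            if score > best_score:
                best_pair, best_score = (phi, psi), score
    return best_pair
```

Early iterations can move far from the CCD proposal to follow the density, while late iterations stay close to it so that the loop still closes.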
Stage 2: Refining Loop Candidates
Improve the model's fit to the experimental data (this time for the model as a whole, as opposed to the local fit in stage 1).
Maintain loop-closure constraint during optimization process.
Target Function
For conformation q, the target function T(q) is the sum of the squared differences between the observed density and the calculated density at each grid point in some volume V around the loop:
T(q) = \sum_{g_i \in V} [ S \rho^c(g_i) + k - \rho^o(g_i) ]^2
– S, k: scaling factors
– \rho^c(g_i): calculated density (sum of contributions of atoms within a cutoff distance from g_i)
– \rho^o(g_i): observed density
– g_i \in V: grid points in the volume
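A sketch of how such a target function might be evaluated. The isotropic Gaussian atom shape, the width and cutoff values, and the scale and offset arguments (stand-ins for the slide's scaling factors) are assumptions chosen for illustration; the slides do not specify them.

```python
import numpy as np

def calculated_density(grid_points, atom_centers, cutoff=3.0, width=0.7):
    """Model density at each grid point: one Gaussian per atom within `cutoff`."""
    grid = np.asarray(grid_points, dtype=float)
    atoms = np.asarray(atom_centers, dtype=float)
    # Pairwise grid-to-atom distances, shape (n_grid, n_atoms).
    dist = np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=2)
    contrib = np.exp(-0.5 * (dist / width) ** 2)
    contrib[dist > cutoff] = 0.0  # ignore atoms beyond the cutoff distance
    return contrib.sum(axis=1)

def target_function(observed, calculated, scale=1.0, offset=0.0):
    """Sum over grid points of squared (scaled calculated - observed) density."""
    residual = scale * np.asarray(calculated, dtype=float) + offset \
        - np.asarray(observed, dtype=float)
    return float(np.sum(residual ** 2))
```

In use, the grid points would be those of the volume V around the loop, and the atom centers would come from the conformation q being scored.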
Optimization with Closure Constraints
Generic approach: optimize the objective function T(q) while performing a given task (loop closure) by taking advantage of manipulator redundancy (DoFs).
f(q) : forward kinematics equation
J(q) : 6-by-n Jacobian
dx : the change to the end of the chain
J+(q) : pseudo-inverse of J(q), an approximation of J^{-1}(q)
N(q) : orthonormal basis for the null space (n-6 dimensions)
y = ∂T(q)/∂q : gradient vector of objective function T(q)
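The null-space idea can be illustrated numerically: projecting the gradient of T onto the null space of the Jacobian gives a change in the dihedral angles that, to first order, leaves the end of the chain where it is. The sketch below assumes the Jacobian is already available as a 6-by-n array; computing it from the chain geometry is not shown, and the function names are hypothetical.

```python
import numpy as np

def null_space_basis(jacobian, rtol=1e-10):
    """Orthonormal basis (columns) for the null space of the 6-by-n Jacobian."""
    J = np.asarray(jacobian, dtype=float)
    _, singular_values, vt = np.linalg.svd(J, full_matrices=True)
    tol = rtol * singular_values.max() if singular_values.size else 0.0
    rank = int(np.sum(singular_values > tol))
    # Rows of vt beyond the rank span the directions that J maps to zero.
    return vt[rank:].T

def closure_preserving_step(jacobian, gradient, step_size):
    """Descent step on T(q) restricted to the null space of the Jacobian."""
    N = null_space_basis(jacobian)
    y = np.asarray(gradient, dtype=float)
    # N @ N.T projects y onto the null space; the minus sign moves downhill.
    return -step_size * (N @ (N.T @ y))
```

Because the relation is only first-order, a practical loop would keep steps small and re-check closure after each move.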
Minimization Procedure: Monte Carlo and Simulated Annealing
Choose a random sub-chain with at least 8 DoFs.
Propose a random move with magnitude proportional to the current temperature:
– High temperature: use exact IK solver (Dill)
– Low temperature: pick a random direction in the null space
Minimize the resulting conformation (gradient descent).
Accept using the Metropolis criterion: P(accept q_new) = min(1, e^[( T(q_prev) - T(q_new) ) / temp])
Use simulated annealing – at each step decrease the pseudo-temperature.
At each step verify that the closure constraint is satisfied within tolerance.
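The accept-and-cool skeleton of this procedure can be sketched as follows. The propose callback is a hypothetical placeholder for the sub-chain move, local minimization and closure check described above, and the geometric cooling schedule and its parameters are assumptions, not values from the slides.

```python
import math
import random

def metropolis_accept(t_prev, t_new, temperature, rng=random):
    """Accept with probability min(1, exp((t_prev - t_new) / temperature))."""
    if t_new <= t_prev:
        return True  # improvements are always accepted
    return rng.random() < math.exp((t_prev - t_new) / temperature)

def anneal(initial, target, propose, start_temp=1.0, cooling=0.95,
           steps=200, seed=0):
    """Simulated annealing over conformations; returns the best one seen.

    target(q) returns T(q); propose(q, temperature, rng) returns a candidate
    conformation (hypothetical callback supplied by the caller).
    """
    rng = random.Random(seed)
    current, t_current = initial, target(initial)
    best, t_best = current, t_current
    temperature = start_temp
    for _ in range(steps):
        candidate = propose(current, temperature, rng)
        t_candidate = target(candidate)
        if metropolis_accept(t_current, t_candidate, temperature, rng):
            current, t_current = candidate, t_candidate
            if t_current < t_best:
                best, t_best = current, t_current
        temperature *= cooling  # decrease the pseudo-temperature each step
    return best, t_best
```

As a toy usage, a scalar q with target (q - 3)**2 and a proposal that perturbs q by a random amount scaled by the temperature exercises the same loop.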
Results – High Resolution Data
Applying RESOLVE to the data (high resolution) yielded an 88% complete initial model.
Applying the algorithm to a gap of 12 residues:
Magenta – the structure from the PDB
Cyan – best-scoring structure, RMSD = 0.25Å
The lowest RMSD for a 7-residue gap at the end of stage 1 is 0.35Å.
Results – Low Resolution Data
Applying RESOLVE yielded a model with 61% completeness.
Applying the algorithm to a gap of 12 residues:
Magenta – the highest-scoring structure, RMSD = 0.6Å
Yellow – starting conformation (end of stage 1), RMSD = 2.1Å (the lowest)