19 Feb 2008 Biology 555: Crystallographic Phasing II
[Diagram labels: Protein, Crystal, Data, Phases, Structure]
Overview of the Phase Problem
John Rose, ACA Summer School 2006
Reorganized by Andy Howard, Biology 555, Spring 2008
Part 2 of 2
Remember
• We can measure reflection intensities
• We can calculate structure factor amplitudes from the intensities
• We can calculate the structure factors from atomic positions
• We need phase information to generate the image
Finding the Heavy Atoms or Anomalous Scatterers
The Patterson function is a Fourier transform that uses |F|² as coefficients with every phase set to φ = 0:

$$P(u,v,w) = \frac{1}{V}\sum_{hkl} |F_{hkl}|^{2} \cos 2\pi(hu + kv + lw)$$

• It is a vector map (u,v,w instead of x,y,z)
• It maps all inter-atomic vectors
• We get N² vectors, where N = number of atoms (N of them coincide at the origin)
From Glusker, Lewis and Rossi
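As a toy illustration of the Patterson function, the sketch below builds a one-dimensional "crystal" of three point atoms, computes |F|² and transforms it with all phases set to zero. The atom positions, weights, grid size and the function name `patterson_1d` are assumptions made for this example; they are not part of the original lecture.

```python
import numpy as np

N_GRID = 100  # grid points along the 1D unit cell (arbitrary choice)

def patterson_1d(positions, weights, n_grid=N_GRID):
    """Patterson function of point atoms in a 1D cell, via |F|^2 with zero phases."""
    rho = np.zeros(n_grid)
    for x, z in zip(positions, weights):
        rho[int(round(x * n_grid)) % n_grid] += z   # place each atom on the grid
    F = np.fft.fft(rho)                             # structure factors F_h
    return np.fft.ifft(np.abs(F) ** 2).real        # transform of |F|^2, phases = 0

positions = [0.10, 0.35, 0.70]   # hypothetical fractional coordinates
weights = [8.0, 6.0, 16.0]       # hypothetical atomic numbers Z

P = patterson_1d(positions, weights)
peak_u = np.nonzero(P > 1e-6)[0] / N_GRID
for u in peak_u:
    print(f"u = {u:.2f}  height = {P[int(round(u * N_GRID))]:.0f}")
```

Three atoms give 3² = 9 vectors: three coincide at the origin, and the six remaining peaks come in ±u pairs whose heights equal the products Z_i Z_j of the atoms they connect.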
The Difference Patterson Map
SIR: |ΔF|² = (|F_nat| - |F_der|)²
SAS: |ΔF|² = (|F_hkl| - |F_-h-k-l|)²
• The Patterson map is centrosymmetric: peaks appear at both (u,v,w) and (-u,-v,-w)
• Peak height is proportional to Z_i Z_j
• The (u,v,w) positions of the peaks give the heavy-atom (x,y,z) positions through Harker analysis
• The origin peak at (0,0,0) is the vector from each atom to itself
Harker analysis
• Certain relationships apply in Patterson maps that enable us to determine some of the coordinates of our heavy atoms
• They depend on looking at differences between symmetry-related atomic positions
• These relationships were worked out by Lindo Patterson and David Harker
• Patterson space is centrosymmetric but otherwise keeps the rotational symmetry of the original space group; Patterson symmetry has no translational symmetry elements (screw axes become rotation axes, glide planes become mirrors)
[Photo: David Harker]
Example: space group P2₁
• P2₁ has equivalent positions at R1 = (x, y, z) and R2 = (-x, y+1/2, -z)
• Therefore we'll get Patterson (difference) peaks at R1-R1, R1-R2, R2-R1, R2-R2:
• (0,0,0), (2x,-1/2,2z), (-2x,1/2,-2z), (0,0,0)
• So if we look at the section of the map at v = 1/2 (where -1/2 is equivalent to 1/2 by lattice translation), we can find peaks at (2x,1/2,2z) and (-2x,1/2,-2z) and thereby discern what the x and z coordinates of a real atom are
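The P2₁ bookkeeping above can be checked with exact fractions. This is a minimal sketch: the heavy-atom site and the helper names (`p21_positions`, `wrap`, `patterson_vectors`) are hypothetical choices made for illustration.

```python
from fractions import Fraction as Fr

HALF = Fr(1, 2)

def p21_positions(x, y, z):
    """The two equivalent positions of P2_1 (unique axis b): (x,y,z) and (-x,y+1/2,-z)."""
    return [(x, y, z), (-x, y + HALF, -z)]

def wrap(vec):
    """Reduce each component into the range [0, 1)."""
    return tuple(c % 1 for c in vec)

def patterson_vectors(sites):
    """All difference vectors R_i - R_j between the given sites."""
    return [wrap(tuple(a - b for a, b in zip(ri, rj))) for ri in sites for rj in sites]

site = (Fr(1, 10), Fr(3, 10), Fr(1, 5))          # hypothetical heavy-atom position
vectors = patterson_vectors(p21_positions(*site))
harker = [v for v in vectors if v[1] == HALF]    # peaks on the Harker section v = 1/2

u, _, w = harker[0]                              # this peak is (2x, 1/2, 2z)
x_found, z_found = u / 2, w / 2
print("Harker peaks:", [tuple(str(c) for c in v) for v in harker])
print("x =", x_found, " z =", z_found)
```

The Harker section fixes x and z only up to a shift of 1/2 (since 2x and 2z are known modulo 1), and it says nothing about y, which is consistent with the origin along the unique axis of P2₁ being arbitrary.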
How do we actually use this?
• Compute a difference Patterson map, i.e. a map with coefficients derived from |F_PH(hkl)| - |F_P(hkl)| or |F(hkl)| - |F(-h-k-l)|
• Examine Harker sections
• Peaks in Harker sections tell us where the heavy atoms or anomalous scatterers are
• Automated programs like BNP, SOLVE and SHELX can do the heavy lifting for us
A Note About Handedness
• We identify each reflection by an index, hkl.
• The hkl also tells us the relative location of that reflection in a reciprocal space coordinate system.
• The indexed reflection has correct handedness if a data processing program assigns it correctly.
• The handedness we deduce for the molecule in the crystal follows from the handedness assigned to the data, which may be right or wrong!
• Note: not all data processing programs assign handedness correctly!
• Be careful with your data processing.
The Phase Triangle Relationship
[Figure: phase-triangle construction with points O, L, M, N and Q, where ΔOLM = ΔOLN and the angles ∠QLM and ∠LON are marked. From Glusker, Lewis and Rossi]
F_PH = F_P + F_H
Need value of F_H
• F_P, F_PH, F_H and -F_H are vectors (have direction)
• F_P: obtained from native data
• F_PH: obtained from derivative or anomalous data
• F_H: obtained from Patterson analysis
The Phase Triangle Relationship
• In simplest terms, isomorphous replacement finds the orientation of the phase triangle from the orientation of one of its sides. It turns out, however, that there are two possible ways to orient the triangle if we fix the orientation of one of its sides.
[Figure: phase triangle with points O, L, M, N and Q. From Glusker, Lewis and Rossi]
Single Isomorphous Replacement
X1 = φ_true or φ_false; X2 = φ_true or φ_false
From Glusker, Lewis and Rossi
Note: F_P = protein, F_H = heavy atom, F_P1 = heavy atom derivative
The center of the F_P1 circle is placed at the end of the vector -F_H1.
• The situation of two possible SIR phases is called the "phase ambiguity" problem, since we obtain both a true and a false phase for each reflection. Both phase solutions are equally probable, i.e. the phase probability distribution is bimodal.
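The two SIR solutions follow from the law of cosines applied to the phase triangle F_PH = F_P + F_H. The sketch below is an idealized, error-free illustration; the function name `sir_phase_candidates` and all numerical values are assumptions, and real programs work with phase probability distributions rather than two sharp angles.

```python
import cmath
import math

def sir_phase_candidates(fp_amp, fph_amp, fh):
    """Return the two protein phases (radians) allowed by |F_PH| = |F_P + F_H|."""
    fh_amp, phi_h = abs(fh), cmath.phase(fh)
    # Law of cosines on the phase triangle F_PH = F_P + F_H
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    cos_delta = max(-1.0, min(1.0, cos_delta))   # guard against measurement error
    delta = math.acos(cos_delta)
    return phi_h + delta, phi_h - delta

# Simulated reflection (hypothetical numbers)
fp_true = cmath.rect(100.0, math.radians(40.0))   # true protein structure factor
fh = cmath.rect(20.0, math.radians(110.0))        # heavy-atom contribution from the Patterson solution
fph_amp = abs(fp_true + fh)                       # "measured" derivative amplitude

cands_deg = [math.degrees(p) % 360.0
             for p in sir_phase_candidates(abs(fp_true), fph_amp, fh)]
print("candidate phases (deg):", [round(c, 1) for c in cands_deg])
```

One candidate reproduces the true phase used in the simulation; the other is its mirror image about the direction of F_H, and nothing in the SIR data alone distinguishes them.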
Resolving the Phase Ambiguity
Add more information:
(1) Add another derivative (Multiple Isomorphous Replacement)
(2) Use a density modification technique (solvent flattening)
(3) Add anomalous data (SIR with anomalous scattering)
Multiple Isomorphous Replacement
X1 = φ_true; X2 = φ_false; X3 = φ_false
Exact overlap at X1 depends on the accuracy of the data and of the heavy-atom (HA) parameters; the residual mismatch is called lack of closure.
From Glusker, Lewis and Rossi
Note: F_P = protein, F_H1 = heavy atom #1, F_H2 = heavy atom #2, F_P1 = heavy atom derivative #1, F_P2 = heavy atom derivative #2
The centers of the F_P1 and F_P2 circles are placed at the ends of the vectors -F_H1 and -F_H2, respectively.
• We still get two solutions, one true and one false, for each reflection from the second derivative. The true solutions should be consistent between the two derivatives, while the false solutions should show a random variation.
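The consistency argument can be sketched for a single error-free reflection: each derivative yields two candidate phases, and the pair that agrees across derivatives is taken as the answer. The function names and the simulated values are hypothetical, and this sharp-angle selection is a simplification of what real MIR programs do with phase probabilities.

```python
import cmath
import math
from itertools import product

def phase_candidates(fp_amp, fph_amp, fh):
    """Two protein phases (radians) consistent with |F_PH| = |F_P + F_H|."""
    fh_amp, phi_h = abs(fh), cmath.phase(fh)
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))
    return phi_h + delta, phi_h - delta

def mir_phase(fp_amp, derivatives):
    """Pick the pair of candidates (one per derivative) that agree best.

    derivatives: two (fph_amp, fh) tuples, fh being the complex heavy-atom vector.
    """
    cands = [phase_candidates(fp_amp, fph_amp, fh) for fph_amp, fh in derivatives]
    # Disagreement between two phases, measured as a chord on the unit circle
    best = min(product(*cands),
               key=lambda pair: abs(cmath.exp(1j * pair[0]) - cmath.exp(1j * pair[1])))
    # Return the circular mean of the two agreeing candidates
    return cmath.phase(cmath.exp(1j * best[0]) + cmath.exp(1j * best[1]))

# Simulated reflection with two derivatives (hypothetical numbers)
fp_true = cmath.rect(100.0, math.radians(40.0))
fh1 = cmath.rect(20.0, math.radians(110.0))
fh2 = cmath.rect(15.0, math.radians(-30.0))
phi = mir_phase(abs(fp_true), [(abs(fp_true + fh1), fh1), (abs(fp_true + fh2), fh2)])
print("MIR phase (deg):", round(math.degrees(phi), 1))
```

With noisy data the two "true" candidates no longer coincide exactly, which is the lack of closure mentioned on the slide.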
Solvent Flattening
B.C. Wang, 1985
• Similar to noise filtering
• Resolves the SIR or SAS phase ambiguity
• Electron density can't be negative
• Uses an iterative process to enhance the true phase!
From Glusker, Lewis and Rossi
How does solvent flattening resolve the phase ambiguity?
• Solvent flattening can locate and enhance the protein image: whatever is not solvent must be protein!
• From the protein image, the phases of the structure factors of the protein can be calculated
• These calculated phases are then used to select the true phases from sets of true and false phases
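The three steps above can be sketched on a one-dimensional toy map. This is a heavily simplified illustration and not the ISIR/ISAS implementation: the windowed local sum used to find the envelope, the fixed solvent fraction, the function names and the synthetic density are all assumptions, and the input phases are assumed to obey Friedel symmetry so that the map is real.

```python
import numpy as np

def solvent_flatten_cycle(amplitudes, phases, solvent_fraction=0.5, radius=5):
    """One simplified flattening cycle; returns (new_phases, solvent_mask, rho_mod)."""
    n = len(amplitudes)
    rho = np.fft.ifft(amplitudes * np.exp(1j * phases)).real    # current map
    positive = np.clip(rho, 0.0, None)                          # density can't be negative
    # Local sum of density in a circular window of +/- radius grid points
    kernel = np.zeros(n)
    kernel[:radius + 1] = 1.0
    kernel[n - radius:] = 1.0
    local = np.fft.ifft(np.fft.fft(positive) * np.fft.fft(kernel)).real
    # Grid points with the lowest local density are assigned to solvent
    solvent = local < np.quantile(local, solvent_fraction)
    rho_mod = positive.copy()
    rho_mod[solvent] = rho_mod[solvent].mean()                  # flatten the solvent
    new_phases = np.angle(np.fft.fft(rho_mod))                  # phases of the modified map
    return new_phases, solvent, rho_mod

def pick_closest(candidates, phi_calc):
    """For each reflection, choose the candidate phase nearest to phi_calc."""
    candidates = np.asarray(candidates)                         # shape (n_reflections, 2)
    diff = np.abs(np.angle(np.exp(1j * (candidates - phi_calc[:, None]))))
    return candidates[np.arange(len(candidates)), diff.argmin(axis=1)]

# Toy map: noisy low-level "solvent" plus a high-density "protein" region
rng = np.random.default_rng(0)
rho_true = 0.1 * rng.random(64)
rho_true[10:30] += 1.0 + rng.random(20)
F = np.fft.fft(rho_true)
new_phases, solvent, rho_mod = solvent_flatten_cycle(np.abs(F), np.angle(F))
print("solvent grid points:", int(solvent.sum()), "of", solvent.size)
```

In an actual SIR or SAS calculation the cycle would be repeated, and `pick_closest` (or a probabilistic equivalent) would use the phases from the flattened map to choose between the true and false candidate for each reflection.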
Using the structure to solve the phase ambiguity
• Thus, in essence, the phase ambiguity is resolved by the protein image itself!
• This solvent-flattening process was made practical by the introduction of the ISIR/ISAS program suite (Wang, 1985); other phasing programs such as DM and PHASES are based on this approach.
Handedness from solvent flattening
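• The ISAS process is performed twice, once with the heavy atom sites at their refined locations and once at their inverted locations

| Data | FOM¹ | Handedness | FOM² | R factor | Corr. Coeff. |
|---|---|---|---|---|---|
| RHE | 0.54 | Correct | 0.82 | 0.26 | 0.958 |
| RHE | 0.54 | Wrong | 0.80 | 0.30 | 0.940 |
| NP + I³ | 0.54 | Correct | 0.80 | 0.27 | 0.955 |
| NP + I³ | 0.54 | Wrong | 0.76 | 0.36 | 0.919 |
| NP + I + S⁴ | 0.56 | Correct | 0.82 | 0.24 | 0.964 |
| NP + I + S⁴ | 0.56 | Wrong | 0.78 | 0.35 | 0.926 |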
Notes on the handedness table
• 1: Figure of merit before solvent flattening
• 2: Figure of merit after one filter and four cycles of solvent flattening
• 3: Four iodine atoms were used for phasing
• 4: Four iodine and 56 sulfur atoms were used for phasing
• Source: Heavy Atom Handedness and Protein Structure Determination using Single-wavelength Anomalous Scattering Data, ACA Annual Meeting, Montreal, July 25, 1995.
Does the correct hand make a difference?
• Yes!
• The wrong hand will give the mirror image!
Anomalous Dispersion Methods
• All elements display an anomalous dispersion (AD) effect in X-ray diffraction
• For light elements (H, C, N, O), anomalous dispersion effects are negligible; they're small even for S and P at typical X-ray energies
• For heavier elements, especially when the X-ray wavelength approaches an atomic absorption edge of the element, these AD effects can be very large.
Scattering power when anomalous scattering exists
The scattering power of an atom exhibiting AD effects is:

$$f_{AD} = f_n + \Delta f' + i\,\Delta f''$$

where:
• f_n is the normal scattering power of the atom in the absence of AD effects
• Δf' arises from the AD effect and is a real term (of either sign) added to f_n
• Δf" is an imaginary term which also arises from the AD effect
• Δf" is always positive and 90° ahead of (f_n + Δf') in phase angle
Δf’ and Δf”
• The values of Δf' and Δf" are highly dependent on the wavelength of the X-radiation.
• In the absence of AD effects, I_hkl = I_-h-k-l (Friedel's Law).
• With AD effects, I_hkl ≠ I_-h-k-l (Friedel's Law breaks down).
• Accurate measurement of Friedel pair differences can be used to extract starting phases if the AD effect is large enough.
Breakdown of Friedel's Law
[Figure 6. Illustration of the effect of anomalous dispersion, which produces different vector lengths for F_hkl and F_-h-k-l. Left panel: F_hkl; right panel: F_-h-k-l; the vectors F_n, F_-n, f', Δf", F+++ and F--- are drawn relative to the real axis.]
(F_hkl, left) F_n represents the total scattering by "normal" atoms without AD effects; f' represents the sum of the normal and real AD scattering values (f_n + Δf'); Δf" is the imaginary AD component and appears 90° (at a right angle) ahead of the f' vector; the total scattering is the vector F+++.
(F_-h-k-l, right) F_-n is the inverse of F_n (at -Φ_hkl) and f' is likewise inverted; the Δf" vector is once again 90° ahead of f'. The resultant vector, F--- in this case, is shorter than the F+++ vector.
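The same effect can be reproduced numerically. The sketch below computes F(h) and F(-h) for a three-atom, one-dimensional toy cell, first with purely real scattering factors and then with one atom given a complex factor f_n + Δf' + iΔf". The coordinates and scattering values are invented for illustration.

```python
import numpy as np

def structure_factor(h, xs, fs):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for a 1D toy unit cell."""
    xs = np.asarray(xs, dtype=float)
    fs = np.asarray(fs, dtype=complex)
    return np.sum(fs * np.exp(2j * np.pi * h * xs))

xs = [0.10, 0.37, 0.62]                    # hypothetical fractional coordinates
f_normal = [6.0, 7.0, 26.0]                # f_n only: no AD effects
f_anom = [6.0, 7.0, 26.0 - 1.0 + 3.0j]     # third atom: f_n + Δf' + iΔf" (made-up values)

for label, fs in (("no AD", f_normal), ("with AD", f_anom)):
    f_plus = abs(structure_factor(+1, xs, fs))
    f_minus = abs(structure_factor(-1, xs, fs))
    print(f"{label:8s} |F(h)| = {f_plus:.3f}  |F(-h)| = {f_minus:.3f}")
```

Without the imaginary term the two amplitudes agree to rounding error; with it they differ, and that difference is the anomalous signal.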
Collecting Anomalous Scattering Data
• Anomalous scatterers, such as selenium, are generally incorporated into the protein during expression of the protein or are soaked into the crystals in a manner similar to preparing a heavy atom derivative.
• Bromine, iodine, xenon and traditional heavy atom compounds are also good anomalous scatterers.
How strong is the signal?
• The anomalous signal, the difference between |F+++| and |F---|, is generally about one order of magnitude smaller than that between |F_PH(hkl)| and |F_P(hkl)|.
• Thus, the signal-to-noise (S/n) level in the data plays a critical role in the success of anomalous scattering experiments, i.e. the higher the S/n in the data, the greater the probability of producing an interpretable electron density map.
Why does it work at all?
The lack-of-isomorphism problem is much milder for anomalous data than for isomorphous replacement:
• One sample, not two or more
• Unit cell is by definition (?) identical
• Molecule is in the same place within that unit cell
• That partly compensates for the low S/N
Why is selenium a good choice?
• Methionine is a relatively rare amino acid: 2.4% (vs. average of 5%)
• So there aren't a huge number of Mets in a typical protein, but there generally are a few
• It's possible to make E. coli auxotrophic for methionine and then feed it selenomethionine in its place
• This incorporates SeMet stoichiometrically and covalently, which is definitely good!
Anomalous data collection
• The anomalous signal can be optimized by data collection at or near the absorption edge of the anomalous scatterer. This requires a tunable X-ray source such as a synchrotron.
• The S/n of the data can also be increased by collecting redundant data.
• The two common anomalous scattering experiments are Multiwavelength Anomalous Dispersion (MAD) and single wavelength anomalous scattering/diffraction (SAS or SAD)
• The SAS technique is becoming more popular since it does not require a tunable X-ray source.
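Why redundancy helps can be seen in a small simulation: if the errors on repeated observations of one reflection are independent and random, the scatter of their mean falls roughly as 1/√n. The intensity, error level and function name below are hypothetical, and systematic errors (absorption, radiation damage) would not average away like this.

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_I, SIGMA = 100.0, 10.0      # hypothetical intensity and per-observation error

def merged_sigma(n_obs, n_trials=20000):
    """Scatter of the mean of n_obs repeated measurements of one reflection."""
    obs = rng.normal(TRUE_I, SIGMA, size=(n_trials, n_obs))
    return obs.mean(axis=1).std()

for n in (1, 4, 16):
    s = merged_sigma(n)
    print(f"redundancy {n:2d}: sigma = {s:.2f}, S/N = {TRUE_I / s:.1f}")
```

Under these assumptions, going from single measurements to 4-fold redundancy roughly doubles the S/N, and 16-fold redundancy roughly quadruples it.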
Increasing Number of SAS Structures
[Chart with two series: MAD and SAD]
Increasing S/n with Redundancy
Multiwavelength Anomalous Dispersion
From Glusker, Lewis and Rossi
Note: F_P = protein; F_H1 = heavy atom; F⁺_PH = F+++; F⁻_PH = F---; F⁺_H" = Δf"+++; F⁻_H" = Δf"---
The centers of the F⁺_PH and F⁻_PH circles are placed at the ends of the vectors -F⁺_H" and -F⁻_H", respectively.
• In the MAD experiment a strong anomalous scatterer is introduced into the crystal and data are recorded at several wavelengths (peak, inflection and remote) near the X-ray absorption edge of the anomalous scatterer. The phase ambiguity is resolved in a manner similar to the use of multiple derivatives in the MIR technique.
Single Wavelength Anomalous Scattering
• The SAS method, which combines the use of SAS data and solvent flattening to resolve phase ambiguity, was first introduced in the ISAS program (Wang, 1985). The technique is very similar to resolving the phase ambiguity in SIR data.
• The SAS method does not require a tunable source, and successful structure determination can be carried out using a home X-ray source on crystals containing anomalous scatterers with sufficiently large Δf", such as iron, copper, iodine, xenon and many heavy atom salts.
Sulfur S-SAS: experimental realities
• The ultimate goal of the SAS method is the use of S-SAS to phase protein data, since most proteins contain sulfur. However, sulfur has a very weak anomalous scattering signal, with Δf" = 0.56 e- for Cu X-rays. The S-SAS method requires careful data collection and crystals that diffract to 2 Å resolution.
• A high symmetry space group (more internal symmetry equivalents) increases the chance of success.
• The use of soft X-rays such as Cr Kα (λ = 2.2909 Å) doubles the sulfur signal (Δf" = 1.14 e-).
• There are over 20 S-SAS structures in the Protein Data Bank.
What is the Limit of the SAS Method?
• Electron density maps of Rhe by Sulfur-ISAS
• Calculated using simulated data in 1983
• Δf" = 0.56 e- using Cu Kα X-rays
Wang (1985), Methods Enzymol. 115: 90-112
Molecular Replacement
• Molecular replacement has proven effective for solving macromolecular crystal structures based upon the knowledge of homologous structures.
• The method is straightforward and reduces the time and effort required for structure determination because there is no need to prepare heavy atom derivatives and collect their data.
• Model building is also simplified, since little or no chain tracing is required.
Molecular Replacement: Practical Considerations
• The 3-dimensional structure of the search model must be very close (< 1.7 Å r.m.s.d.) to that of the unknown structure for the technique to work.
• Sequence homology between the model and unknown protein is helpful but not strictly required. Success has been observed using search models having as low as 17% sequence similarity.
• Several computer programs such as AMoRe, X-PLOR/CNS and PHASER are available for MR calculations.
How Molecular Replacement works
• Use a model of the protein to estimate phases
• Must be a structural homologue (RMSD < 1.7 Å)
• Two-step process: rotation and translation
• Find orientation of model (red → black)
• Find location of oriented model (black → blue)
[Figure source: px.cryst.bbk.ac.uk/03/sample/molrep.htm]
Using a protein model to estimate phases: the rotation function
• We need to determine the model's orientation in X1's unit cell
• We use a Patterson search approach in (α,β,γ), which are Euler angles associated with the rotational space
Euler angles for rotation function
The coordinate system is rotated by:
• an angle α around the original z axis;
• then by an angle β around the new y axis;
• and then by an angle γ around the final z axis.
zyz convention
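A zyz Euler rotation can be written as a product of three elementary rotation matrices. The sketch below assumes the common reading in which successive rotations about the moving axes compose as R = Rz(α)·Ry(β)·Rz(γ); individual programs differ in sign and active/passive conventions, so this is an illustration rather than the convention of any particular package.

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` (radians) about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(angle):
    """Rotation by `angle` (radians) about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def euler_zyz(alpha, beta, gamma):
    """Rotation about z by alpha, then the new y by beta, then the final z by gamma."""
    return rot_z(alpha) @ rot_y(beta) @ rot_z(gamma)

R = euler_zyz(np.radians(30.0), np.radians(45.0), np.radians(60.0))
print(np.round(R, 3))
```

A rotation-function search evaluates the overlap of the model and observed Pattersons on a grid of (α, β, γ) values, applying a matrix like `R` to the model at each grid point.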
Using a protein model to estimate phases: translation function
• We need to determine the oriented model's location in X1's unit cell
• We do this with an R-factor search, where

$$R = \frac{\sum_{hkl}\big|\,|F_{obs}| - |F_{calc}|\,\big|}{\sum_{hkl}|F_{obs}|}$$
Translation functions
• Oriented model is stepped through the X1 unit cell using small increments in x, y, and z (e.g. x → x + step)
• The point where R is lowest represents the correct location
• There exists an alternative method that uses maximum likelihood to find the translation peak; this notion is embodied in the software package PHASER by Randy Read
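The stepping procedure can be sketched in one dimension. The toy cell below has an inversion centre at the origin, so the position t of the oriented model relative to that centre changes the calculated amplitudes. The model offsets, weights, step size and the standard R-factor definition used here are assumptions for illustration; this is not the algorithm of any named program.

```python
import numpy as np

H = np.arange(1, 21)                       # reflection indices h = 1..20
OFFSETS = np.array([0.00, 0.11, 0.27])     # oriented model: atom offsets (hypothetical)
WEIGHTS = np.array([6.0, 7.0, 8.0])        # scattering weights (hypothetical)

def calc_amplitudes(t):
    """|F(h)| for the model placed at t plus its inversion-related copy."""
    x = t + OFFSETS
    # Atoms at x and -x give F(h) = sum_j 2 f_j cos(2*pi*h*x_j)
    F = (2.0 * WEIGHTS * np.cos(2.0 * np.pi * np.outer(H, x))).sum(axis=1)
    return np.abs(F)

def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs|."""
    return np.abs(f_obs - f_calc).sum() / f_obs.sum()

f_obs = calc_amplitudes(0.20)              # "observed" data simulated at the true position
steps = np.arange(50) * 0.01               # search t in [0, 0.5) in steps of 0.01
r_values = np.array([r_factor(f_obs, calc_amplitudes(t)) for t in steps])
best_t = steps[r_values.argmin()]
print(f"lowest R = {r_values.min():.3f} at t = {best_t:.2f}")
```

The search is restricted to half the cell because shifting the model by 1/2 only moves it to the other inversion centre and leaves every amplitude unchanged, a one-dimensional analogue of the alternative origins encountered in real space groups.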