Protein Crystallography
Crystals X-rays Atoms
FT FT-1
Solve phase problem
From protein to protein structure
Purify protein
Grow crystals
Collect X-ray data
Refine the model
Done
From protein to phasing
Grow crystals
Collect X-ray data
collect 1000’s of images
index and integrate spot intensities
scale intensity data
determine space group and cell dimensions
mount and freeze single crystals
Solve phase problem
Protein crystal growth
blue line = saturation of protein
red line = supersaturation limit
Crystal growth occurs between these two limits. Above the supersaturation limit, proteins form only disordered precipitate.
protein concentration
precipitant concentration
Arrows indicate different diffusion experiments.
A, B, D, F, G. Vapor diffusion.
E. Bulk
C. Microdialysis
L = liquid, S = solid, m = metastable state (supersaturated)
vapor diffusion setup
Volatiles (i.e. water) evaporate from one surface and condense on the other.
Drop has higher water concentration than reservoir, so drop slowly evaporates.
a Linbro plate
Crystallization theory
Nucleation takes higher concentration than crystal growth.
[Figure: molecules (R) joining a growing crystal; the steps are labeled "slow, slow, fast" and "fast, not so slow".]
After nucleation, the large size of a face makes the weak bond more likely.
Crystallization theory
Bonds A,B are stronger than P,Q.
More on Periodic Bond Chain theory: http://www.che.utoledo.edu/nadarajah/webpages/PBC.htm
Crystallization robot
High-throughput crystallography labs use pipetting robots to explore thousands of “conditions”. Each condition is a formulation of the crystal drop and the reservoir solution.
Conditions can have different:
•protein concentration
•pH
•precipitant, precipitant concentration
•detergents
•organic co-solvents
•metal ions
•ligands
•concentration gradient
protein crystals
cellulase, subtilisin
The color you see is “birefringence”, the wavelength-dependent rotation of polarized light.
~1mm
Crystal mounting
If not freezing
Xtal is mounted in a thin-walled glass capillary tube.
Crystal must be kept at proper humidity and temperature!! Very fragile!
If freezing (preferred)
Xtal is mounted on a thin film of water in a wire loop. The loop is fixed to a metal or glass rod.
Must freeze immediately or film will dry out!
Mounted xtal is attached to a goniometer head for precise adjustment.
eucentric goniometer head (made by Nonius)
Low-melting hard wax is used to ‘glue’ the rod or capillary here.
Small wrenches fit here, here, here and here.
Centering the crystal in the beam
“Machine center” is the intersection of the beam and the two goniostat rotation axes. Must be set by manufacturer!
[Figure: X-rays passing through the machine center.]
To place crystal at machine center, rotate ω and κ and watch the crystal. If it moves from side to side, it is off center.
If it is off-center, we adjust the screws on the goniometer head.
whoops it’s off center. Fix it!
X-ray diffractometer with area detector
The detector (or film) sits on a “Two-theta arm” that can swing out, away from the beam to collect high-resolution data.
Axis of two-theta arm
Synchrotrons provide tunable, monochromatic X-rays
Crystals must be flash frozen
Water must be frozen to < –70°C very fast to prevent the formation of hexagonal ice crystals. Water glass forms.
How? Crystals, mounted on loops, are flash frozen by dipping in liquid propane or freon at –70°C, or by instant exposure to N2 gas at –70°C.
hexagonal ice
Precession photograph
Determine Space Group
• symmetry
• systematic absences
• spacing of spots
• angle between axes
Data collection
Measure the intensity (amplitude squared) of each reflection.
Output of data collection, thousands of reflections, each with 5 parameters:
h k l F sigma
The forward and reverse Fourier transform
light
The general equation for a wave
E(t) = A cos(ωt + α)
E(t): the instantaneous electric field at time t
A: amplitude
ω = 2πc/λ: oscillation rate in radians/second (2π × cycles/second), where λ is the wavelength
α: phase
Photons are oscillating electric fields*.
*also an oscillating magnetic field of the same frequency, perpendicular to the electric field.
An electric field accelerates charged particles
[Figure: electrons (e-) between a positively charged plate and a negatively charged plate.]
e- oscillates in an electric field...
•e- oscillation is the same frequency as the X-rays
•e- oscillation is much faster than the orbiting motion of e- around the nucleus (no significant Doppler effect).
•The amplitude of the e- oscillation is large because the mass of an e- is small. [Atomic nuclei don’t oscillate much!]
[Figure: position of an e- over time t as the field E oscillates.]
oscillating e- create photons…in all directions ⊥ to the oscillation of incoming.
e-
That’s scattering.
X-ray sources may be partially polarized.
oscillation
emission
Scattering by 2 e-
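Difference in path length = r•s - r•s0
Relative phase: α = 2π(r•s - r•s0)/λ
[Figure: two e- separated by r, between the X-ray source (incoming direction s0) and the X-ray detector (scattered direction s); the projections r•s0 and r•s are marked.]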
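The relative phase formula above can be tried numerically. This is a minimal sketch, assuming unit vectors for s0 and s and positions in Å; the function names and the example geometry are illustrative and not taken from the slides.

```python
import math

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def relative_phase_deg(r, s0, s, wavelength):
    """Phase (degrees) of the wave scattered at r relative to the one scattered at the origin."""
    path_difference = dot(r, s) - dot(r, s0)
    return 360.0 * path_difference / wavelength

# Illustrative values only: beam along x, scattered direction 60 degrees from it.
s0 = (1.0, 0.0, 0.0)
s = (0.5, math.sqrt(3.0) / 2.0, 0.0)
r = (0.0, 0.5, 0.0)  # second electron 0.5 Å from the first, along y
phase = relative_phase_deg(r, s0, s, wavelength=1.0)
print(f"relative phase = {phase:.1f} degrees")
```

Only the component of r along s - s0 changes the phase, which is why planes normal to s - s0 scatter in phase.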
Wave addition
[Figure: two waves and their sum.]
Sum the electric fields at each point in time.
“Constructive interference”: amplitude increases.
Rule #1: The sum of two waves with wavelength λ always produces a wave of wavelength λ.
Adding two waves by parts
wave    amplitude   phase    cosine part   sine part
1       5.0         -60°      2.5           4.33
2       4.0         +120°    -2.0          -3.46
sum     1.0         -60°      0.5           0.87
Add amplitudes of cosine and sine parts, then recombine them.
Cosine parts and sine parts can be summed independently (like “orthogonal coordinates”).
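The addition by parts can be checked in a few lines. This sketch assumes the convention A cos(ωt + α) = A cosα cosωt - A sinα sinωt, so the sine part carries the coefficient -A sinα; the helper names are made up for illustration.

```python
import math

def parts(amplitude, phase_deg):
    """Coefficients of cos(wt) and sin(wt) in A*cos(wt + alpha)."""
    alpha = math.radians(phase_deg)
    return amplitude * math.cos(alpha), -amplitude * math.sin(alpha)

def recombine(cos_part, sin_part):
    """Amplitude and phase (degrees) of the wave with the given parts."""
    amplitude = math.hypot(cos_part, sin_part)
    phase_deg = math.degrees(math.atan2(-sin_part, cos_part))
    return amplitude, phase_deg

c1, s1 = parts(5.0, -60.0)   # first wave
c2, s2 = parts(4.0, 120.0)   # second wave
amplitude, phase = recombine(c1 + c2, s1 + s2)
print(f"sum: amplitude {amplitude:.2f}, phase {phase:.1f} degrees")
```

The two waves are exactly 180° apart, so the sum keeps the phase of the larger wave and has the difference of the two amplitudes.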
The sum of angles rule
cos(α + β) = cos α cos β – sin α sin β
Applying the sum of angles rule to the wave equation decomposes it into sine and cosine parts
Using the sum of angles rule:
E(t) = A cos(ωt + α) = A cosα cosωt - A sinα sinωt
A cosα: amplitude of cosine part (reference wave cosωt)
-A sinα: amplitude of sine part (reference wave sinωt)
[Figure: vector of length A at angle α, with its cosine part and sine part as components.]
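A wave can be represented as a complex exponential
Euler’s Theorem:
$e^{i\alpha} = \cos\alpha + i\sin\alpha$
Proof: write the expansions and sum them
$e^{\alpha} = 1 + \alpha + \frac{\alpha^2}{2!} + \frac{\alpha^3}{3!} + \frac{\alpha^4}{4!} + \frac{\alpha^5}{5!} + \cdots$
$e^{i\alpha} = 1 + i\alpha - \frac{\alpha^2}{2!} - \frac{i\alpha^3}{3!} + \frac{\alpha^4}{4!} + \frac{i\alpha^5}{5!} - \cdots$
$\cos\alpha = 1 - \frac{\alpha^2}{2!} + \frac{\alpha^4}{4!} - \frac{\alpha^6}{6!} + \cdots$
$i\sin\alpha = i\alpha - \frac{i\alpha^3}{3!} + \frac{i\alpha^5}{5!} - \frac{i\alpha^7}{7!} + \cdots$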
A wave as a complexexponential
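E(t) = A cos(ωt + α)
= A cosα cosωt - A sinα sinωt
= $A e^{i\alpha}$
[Figure: the wave drawn as a vector of length A at angle α in the complex plane; the axes are labeled r and -i.]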
Adding waves using vector addition
A1 = 2.0, α1 = +90°
A2 = 4.0, α2 = -60°
Sum: B ≈ 2.5, β ≈ -36°
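The same vector addition can be done with complex numbers, which is how structure factors are usually summed in practice. A minimal sketch using Python's standard `cmath` module; the helper name `wave` is illustrative.

```python
import cmath
import math

def wave(amplitude, phase_deg):
    """Represent a wave as the complex number A*exp(i*alpha)."""
    return cmath.rect(amplitude, math.radians(phase_deg))

total = wave(2.0, 90.0) + wave(4.0, -60.0)
B = abs(total)                          # amplitude of the summed wave
beta = math.degrees(cmath.phase(total)) # phase of the summed wave
print(f"B = {B:.2f}, beta = {beta:.1f} degrees")
```

The real and imaginary parts of each complex number play the role of the cosine and sine parts, so adding complex numbers is the same bookkeeping as adding by parts.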
Phase depends on the distance traveled
Phase (in cycles) = D/λ – nearest integer(D/λ)
Same for scattered path
Use path difference to get phase
[Figure: two e- separated by r, with incoming direction s0 and scattered direction s.]
Measure this... and this... Get the difference, divide by the wavelength. Multiply by 2π. That’s the phase difference.
Add the two waves using vectors.
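Definition: the structure factor (F) is a single wave F, having an amplitude, phase and direction relative to the (crystallographic) reference frame. It is the wave sum of all scattering factors having the same incoming s0 and scattered s directions.
$F = \sum_j f_j \, e^{2\pi i \mathbf{S} \cdot \mathbf{r}_j}$
[Figure: scattering factors f1, f2, f3, f4 for incoming direction s0 and scattered direction s, added as vectors to give F.]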
Worksheet for calculating F for 4 atoms
λ = 1.00 Å
s0 = (1, 0, 0)
s = (0.5, 0.8660, 0)
r1 = (0.10, 0.22, 0.0)
r2 = (0.15, 0.10, 0.0)
r3 = (0.25, 0.33, 0.0)
r4 = (0.00, -0.10, 0.0)
phases: α = 2πr•(s-s0)/λ
r1•(s-s0) = (0.10, 0.22, 0.0)•(-0.5, 0.8660, 0) = -0.05 + 0.19 + 0 = 0.14
α1 = 360° × 0.14 = 50.4°, α2 = 4.2°, α3 = 57.9°, α4 = -31.2°
$F = 3.2\,e^{i21°}$ (taking f = 1 for every atom)
amplitude = 3.2, phase = 21°
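The worksheet can be reproduced with a short script. This sketch assumes every atom has scattering factor f = 1 (the worksheet does not list f values) and uses the positive-exponent convention exp(2πi S•r) that matches the positive phases listed above. Under those assumptions the sum comes out with an amplitude near 3.2 and a phase near 21°.

```python
import cmath
import math

wavelength = 1.00
s0 = (1.0, 0.0, 0.0)
s = (0.5, 0.8660, 0.0)
atoms = [
    (0.10, 0.22, 0.0),
    (0.15, 0.10, 0.0),
    (0.25, 0.33, 0.0),
    (0.00, -0.10, 0.0),
]

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Scattering vector S = (s - s0) / wavelength
S = tuple((si - s0i) / wavelength for si, s0i in zip(s, s0))

# Phase of each atom in degrees, then the wave sum with f = 1 (an assumption)
phases_deg = [360.0 * dot(r, S) for r in atoms]
F = sum(cmath.exp(2j * math.pi * dot(r, S)) for r in atoms)

amplitude = abs(F)
phase_deg = math.degrees(cmath.phase(F))
print([round(p, 1) for p in phases_deg])
print(f"|F| = {amplitude:.2f}, phase = {phase_deg:.1f} degrees")
```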
The Unit Cell
the repeating unit in a crystal
Origin is set by convention to the centers of symmetry (if there is no crystallographic symmetry, the origin is arbitrary).
Can we now sum the scattered waves from every point in the unit cell?
Summing waves from all points in the unit cell
Let’s calculate the scattering factor from each volume unit in the unit cell. Each volume unit has a phase, defined by its location relative to the origin (which always has phase zero).
[Figure: the unit cell divided into tiny volumes containing e-.]
Fourier Transform
$F(\mathbf{S}) = \int \rho(\mathbf{r}) \, e^{i 2\pi \mathbf{S} \cdot \mathbf{r}} \, d\mathbf{r}$
This type of integral is called the Fourier Transform. A conversion of one space into another space with reciprocal units.
Inverse Fourier Transform
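For every Fourier Transform, there exists an inverse Fourier Transform which converts the reciprocal space back to real space.
forward transform: $F(\mathbf{S}) = \int \rho(\mathbf{r}) \, e^{i 2\pi \mathbf{S} \cdot \mathbf{r}} \, d\mathbf{r}$
reverse transform: $\rho(\mathbf{r}) = \int F(\mathbf{S}) \, e^{-i 2\pi \mathbf{S} \cdot \mathbf{r}} \, d\mathbf{S}$
note the minus sign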
Reflection plane
All points on the reflection plane scatter in phase.
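same angle, θ, with beam and scattered X-ray
[Figure: s0 and s each at angle θ to the reflection plane, S normal to the plane; the angle between s0 and s is 2θ.]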
Bragg’s law
nλ = 2d sinθ
d = λ/(2 sinθ) (for n = 1)
Reflection planes separated by d scatter in phase. If θ is larger, d is smaller.
The path length for this plane is shorter by λ.
Sir Lawrence Bragg, winner of 1915 Nobel Prize
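Bragg's law is easy to evaluate directly. A minimal sketch; the function name and the example angles are illustrative, with λ = 1.0 Å chosen only for convenience.

```python
import math

def bragg_d(wavelength, theta_deg, n=1):
    """Plane spacing d from Bragg's law, n*lambda = 2*d*sin(theta)."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

d_low = bragg_d(1.0, 5.0)    # small scattering angle, large spacing
d_high = bragg_d(1.0, 15.0)  # larger scattering angle, smaller spacing
print(f"theta = 5 degrees:  d = {d_low:.2f} Å")
print(f"theta = 15 degrees: d = {d_high:.2f} Å")
```

This is why the detector has to swing out to larger 2θ to record high-resolution (small d) reflections.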
Scattering by Bragg planes
[Figure: parallel Bragg planes at angle θ to the beam; planes extend throughout the crystal.]
A is proportional to the total number of e- on all of these planes.
Integrating planes separated by d
A’ is proportional to the total number of e- on these planes.
Planes shifted by d/6 are phase shifted by 2π/6.
[Figure: A and A’ drawn as vectors in the complex plane (axes R, i).]
Integrating planes separated by d
The total F is the wave sum over all Bragg planes.
[Figure: the vectors from all sets of planes added head to tail to give F (axes R, i).]
Resolution
[Figure: image of 2 beta strands, its FT, and the back transforms at low, medium and high resolution.]
Bragg planes are always perpendicular to S
[Figure: s0 and s at angle θ to the reflection plane, with -s0 added to s to give S.]
Since s0 and s are the same length and have the same angle to the reflection plane, S = (s-s0)/λ is normal to the plane.
The length of S is 1/d
The length of S is 2sinθ times the lengths of s and s0, which is 1/λ. So |S| = 2sinθ/λ = 1/d
[Figure: the same construction, with each half of S labeled sinθ/λ.]
Crystal = 3D lattice
A vector expression that defines a crystal lattice:
ρ(r) = ρ(r + ta + ub + vc)
where a, b, and c are the unit cell axes and t, u, and v are integers.
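Adding unit cells
[Figure: a row of unit cells with axes a, b, c, offset from each other by r = (ta,ub,vc).]
unit cell at (0,0,0): Phase = α
unit cell at (ta,ub,vc): Phase = α + S•r
unit cell at (2ta,2ub,2vc): Phase = α + S•r + S•r
unit cell at (3ta,3ub,3vc): etc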
Periodic delta functions in three directions
[Figure: periodic intensity (I) along a, b and c, combined.]
…results in points of diffraction, in S space.
Laue conditions
Diffraction is discrete scattering. Points of near-perfect constructive interference separated by zones of near-perfect destructive interference.
Certain points S in reciprocal space satisfy the conditions that the phase shift equals zero.
S •a = h
S •b = k
S •c = l
h, k, and l are integers. These are the “Laue conditions.”
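A one-dimensional sum over unit cells shows why only integer values of S•a survive. This is a sketch under simple assumptions: identical cells along one axis, each contributing a unit wave, with the number of cells chosen arbitrarily.

```python
import cmath
import math

def lattice_sum(S_dot_a, n_cells):
    """Amplitude of the sum of unit waves from n_cells cells spaced by a.

    Cell number t is shifted in phase by 2*pi*t*(S.a) relative to cell 0.
    """
    total = sum(cmath.exp(2j * math.pi * t * S_dot_a) for t in range(n_cells))
    return abs(total)

n_cells = 50
on_peak = lattice_sum(2.0, n_cells)    # S.a = 2, an integer: Laue condition met
off_peak = lattice_sum(2.25, n_cells)  # S.a not an integer
print(f"S.a = 2.00 -> amplitude {on_peak:.1f}")
print(f"S.a = 2.25 -> amplitude {off_peak:.1f}")
```

With an integer S•a every cell adds in phase and the amplitude grows with the number of cells; otherwise the waves largely cancel. A real crystal has far more cells, so the peaks are correspondingly sharper.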
Definition of the Reciprocal Lattice
Let’s define the reciprocal lattice as the subset of points in S for which the Laue conditions hold: S = ha* + kb* + lc*, then...
a*• a = 1
b*• b = 1
c*• c = 1
reciprocal lattice axes
See Drenth Ch 4, p 86, Table 4.1 for how to calculate the reciprocal lattice vectors a*, b*, c*
Reciprocal cell axes are perpendicular to real cell axes
The a* axis is defined as the direction S when the Laue numbers are h=1, k=0, and l=0. What is the direction of S?
S •a = 1
S •b = 0
S •c = 0
If the dot-product of two vectors is zero, then the two vectors are orthogonal (90° apart). So a* is orthogonal to b and c.
a* is not necessarily parallel to a, but it is always perpendicular to b and c.
Complete conditions for the reciprocal lattice
a*• a = 1   b*• a = 0   c*• a = 0
a*• b = 0   b*• b = 1   c*• b = 0
a*• c = 0   b*• c = 0   c*• c = 1
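One standard construction that satisfies all nine conditions is a* = (b × c)/V, b* = (c × a)/V, c* = (a × b)/V, with V = a•(b × c) the cell volume. The sketch below checks this for a made-up cell with a non-orthogonal c axis; the cell dimensions are illustrative only.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def reciprocal_axes(a, b, c):
    """Reciprocal axes a*, b*, c* of the real cell a, b, c."""
    volume = dot(a, cross(b, c))
    a_star = tuple(x / volume for x in cross(b, c))
    b_star = tuple(x / volume for x in cross(c, a))
    c_star = tuple(x / volume for x in cross(a, b))
    return a_star, b_star, c_star

# Hypothetical cell in Å; c is tilted toward -a, so the cell is not orthogonal.
a = (10.0, 0.0, 0.0)
b = (0.0, 20.0, 0.0)
c = (-5.0, 0.0, 30.0)
a_star, b_star, c_star = reciprocal_axes(a, b, c)
print("a* =", a_star)
print("a*.a =", dot(a_star, a), " a*.b =", dot(a_star, b), " a*.c =", dot(a_star, c))
```

In this example a* has a z component, so it is not parallel to a, yet it is still perpendicular to b and c.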
Real cell relationship to reciprocal cell
[Figure: real axes a, b and reciprocal axes a*, b*.]
2D for simplicity. If a < b, then a* > b*
a* ⊥ b a* ⊥ c
b* ⊥ a b* ⊥ c
c* ⊥ a c* ⊥ b
See Drenth Ch 4, p 86 for more.
Using Miller indices: S = ha*+kb*+lc*
• Bragg planes must align with the unit cell origins.
• All unit cells in the crystal (ta+ub+vc) scatter with the same phase.
[Figure: Bragg planes for the 1 7 0 and 2 1 0 reflections.]
2 1 0
Bragg planes intersect the axes @ 1/h,1/k,1/l
• The n=1 Bragg plane (normal to S at distance d) cuts the unit cell axes at 1/h, 1/k, 1/l
1/h = -1/2
1/k = +1/1
If indices hkl are doubled, Bragg distance d is halved.
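The halving of d follows from d = 1/|S| with S = ha* + kb* + lc*. The sketch below assumes a cell with three mutually perpendicular axes, where |S|² reduces to (h/a)² + (k/b)² + (l/c)²; the cell lengths are hypothetical.

```python
import math

def d_spacing(h, k, l, a, b, c):
    """Bragg distance d = 1/|S| for a cell with orthogonal axes of length a, b, c."""
    s_length = math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)
    return 1.0 / s_length

cell = (40.0, 60.0, 80.0)  # hypothetical cell lengths in Å
d_210 = d_spacing(2, 1, 0, *cell)
d_420 = d_spacing(4, 2, 0, *cell)
print(f"d(2 1 0) = {d_210:.2f} Å, d(4 2 0) = {d_420:.2f} Å")
```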
• All unit cell origins have phase zero. But not all phase-zero Bragg planes must go through a unit cell origin. For example, the n=odd Bragg planes for the 0 2 0 reflection do not touch a single unit cell origin.
[Figure: Bragg planes for the 2 1 0, 4 2 0, 0 1 0 and 0 2 0 reflections.]
3D Bragg planes
[Figure: (2 3 3) Bragg planes and (4 6 6) Bragg planes.]
Phase-zero planes intersect the cell axes at fractional coordinates (1/h,0,0), (0,1/k,0), (0,0,1/l)
Ewald sphere intersecting the reciprocal lattice
For a given orientation of the crystal with respect to the X-rays, only those values of S that are on the Ewald sphere are visible to the detector.
[Figure: Ewald sphere and reciprocal lattice (centered on crystal).]
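Whether a reciprocal lattice point is on the Ewald sphere can be tested from S = (s - s0)/λ: since s is a unit vector, S + s0/λ must have length 1/λ. The sketch below assumes a hypothetical cubic cell of 10 Å with its axes along x, y, z, the beam along x, and an arbitrary tolerance.

```python
import math

def on_ewald_sphere(S, s0, wavelength, tolerance=1e-3):
    """True if reciprocal-space point S can be reached by some unit vector s."""
    center_to_point = [Si + s0i / wavelength for Si, s0i in zip(S, s0)]
    length = math.sqrt(sum(x * x for x in center_to_point))
    return abs(length - 1.0 / wavelength) < tolerance

wavelength = 1.0
s0 = (1.0, 0.0, 0.0)
cell_edge = 10.0  # hypothetical cubic cell, so S = (h, k, l) / cell_edge

for hkl in [(-2, 6, 0), (1, 0, 0), (-2, 5, 0)]:
    S = tuple(index / cell_edge for index in hkl)
    print(hkl, on_ewald_sphere(S, s0, wavelength))
```

Rotating the crystal rotates the reciprocal lattice through the sphere, which is why thousands of images at different orientations are collected.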
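In class exercise: index these spots
[Figure: diffraction image with the h and k axes and the beam position marked.]
answer
[Figure: the same image with the l=0, l=1 and l=2 layers marked.]
hkl = 3 4 0
0 12 1
7 12 2
7 0 1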
Crystal packing
Protein crystal packing interactions are salt-bridges and H-bonds mostly. These are much weaker than the hydrophobic interactions that hold proteins together. This means that (1) protein crystals are fragile, and (2) proteins in crystals are probably not significantly distorted from their native conformations.
Oh no. We can’t measure phases!
X-ray detectors (film, photomultiplier tubes, CCDs, etc) can measure only the intensity of the X-rays (which is the amplitude squared), but we need the full wave equations $A e^{i\alpha}$ for each reflection to do the reverse Fourier transform.
And because it is called the phase “problem”, the process of getting the phases is called a “solution”. That’s why we say we “solved” the crystal structure, instead of “measured” or “determined” it.
The Phase Problem
Phase is more important than amplitude
color = phase angle
darkness = amplitude
real space Patterson space
What do we get if we set all phases to zero?
[Figure: the same pair of atoms with phase = α (real space) and with phase = 0 (“Patterson space”); r is the vector between the atoms.]
Moving every pair of atoms to the origin would not change the amplitudes, but would set all phases = 0.
Vice-versa: Not changing amplitudes, but setting all phases to 0 creates a centro-symmetric image of the molecule.
Patterson map represents allinter-atomic vectors
To generate a fake Patterson map in 2D, draw all inter-atomic vectors, then move the tails to the origin. The heads are where peaks would be.
For example, take glycine, 5 atoms (not counting H’s)
Move each vector to the origin
Patterson map for Gly in P1
Can you reassemble glycine from this?
translational symmetry peaks
unit cell vector peak
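The "draw all inter-atomic vectors and move the tails to the origin" recipe is a double loop over atoms. The sketch below uses three made-up 2D positions rather than real glycine coordinates, and ignores peak heights and lattice translations.

```python
from itertools import permutations

def patterson_peaks(atoms):
    """All interatomic vectors (head minus tail) between distinct atoms."""
    return [
        tuple(round(h - t, 6) for h, t in zip(head, tail))
        for tail, head in permutations(atoms, 2)
    ]

atoms = [(0.0, 0.0), (1.5, 0.0), (2.0, 1.2)]  # hypothetical 2D positions in Å
peaks = patterson_peaks(atoms)
for peak in peaks:
    print(peak)
```

N atoms give N(N-1) non-origin peaks, so 5 atoms give 20. Every vector appears together with its negative, which is the centro-symmetry of the Patterson map.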
For small molecules, the vector/geometry problem can be solved
...if you know the geometry (bond lengths, angles) of the molecule
http://www.cryst.bbk.ac.uk/xtal/mir/patt2.htm
Multiple isomorphous replacement
[Figure: heavy atom derivative = protein + heavy atoms.]
The Fourier transform (i.e. diffraction pattern) of a heavy atom derivative is the vector sum of the transforms of the protein and the heavy atoms.
NOTE: protein and protein-heavy atom crystals must be isomorphous.
Turning proteins into small molecules by soaking in heavy atoms
Subtracting Fourier transforms
[Figure: heavy atoms = heavy atom derivative – protein.]
The Fourier transform of the heavy atoms is the vector difference of FPH and FP.
Subtracting Fourier transforms
FPH – FP = FH is true, but we still don’t know the phases, so we can’t subtract vectors.
Assume, for the moment, that |FH| << |FP|, |FPH|. Then the phases of FP and FPH are approximately the same. So,
|FPH–FP| ≈ |FPH| – |FP|
[Figure: vector triangle of FP, FH and FPH.]
In that case, the Patterson map (no phases) based on |FPH|–|FP| will be approximately the same as the Patterson map based on |FH|.
We can represent a structure factor of unknown phase as a circle
[Figure: a circle centered on the origin of the complex plane (axes R, i).]
Radius of the circle is the amplitude. The true F lies somewhere on the circle.
Summing structure factors geometrically
|FP + FH| = |FPH|
FH: we know amplitude and phase.
FP, FPH: we know only amplitude.
There are two ways to make the vector sums add up.
[Figure: circles for FP and FPH offset by FH, crossing at two points.]
Two heavy atom derivatives are enough, but more is better
|FP + FH1| = |FPH1|
|FP + FH2| = |FPH2|
Or
|FP| = |FPH1–FH1| = |FPH2–FH2|
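The two-circle construction can be written algebraically. From |FPH|² = |FP|² + |FH|² + 2|FP||FH| cos(αP - αH), each derivative gives two candidate phases αP = αH ± Δ, and the candidate shared by both derivatives is the answer. The sketch below uses synthetic, error-free structure factors invented for illustration; with real data the amplitudes carry errors and the agreement is only approximate.

```python
import cmath
import math

def candidate_phases(fp_amp, fph_amp, fh):
    """Two possible phases (radians) of FP, given |FP|, |FPH| and the complex FH."""
    fh_amp, fh_phase = abs(fh), cmath.phase(fh)
    cos_diff = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    cos_diff = max(-1.0, min(1.0, cos_diff))  # guard against rounding
    diff = math.acos(cos_diff)
    return fh_phase + diff, fh_phase - diff

def angle_gap(a, b):
    """Smallest absolute difference between two angles (radians)."""
    return abs(cmath.phase(cmath.rect(1.0, a - b)))

# Synthetic example: a "true" FP that the calculation is not allowed to see directly.
true_fp = cmath.rect(10.0, math.radians(40.0))
fh1 = cmath.rect(3.0, math.radians(100.0))
fh2 = cmath.rect(2.0, math.radians(-20.0))

cands1 = candidate_phases(abs(true_fp), abs(true_fp + fh1), fh1)
cands2 = candidate_phases(abs(true_fp), abs(true_fp + fh2), fh2)

# The pair of candidates (one per derivative) that agree best
best = min(((a, b) for a in cands1 for b in cands2), key=lambda p: angle_gap(*p))
phase_deg = math.degrees(best[0])
print(f"recovered phase of FP: {phase_deg:.1f} degrees")
```

One derivative alone leaves the two-fold ambiguity in place; the second derivative breaks it.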
Initial phases
Phases are not measured exactly because amplitudes are not measured exactly.
Error bars on FP and FPH create a distribution of possible phase values α.
width of circle is 1σ deviation, derived from data collection statistics.
From data to model
Collect native data: Fp
Collect heavy atom data: Fph
Estimate phases
Calculate ρ
Is the map traceable? If no, consider density modification.
If yes: Trace the map
Refine
From data to phases
native data: Fp heavy atom data: Fph
Calculate difference Patterson
Find heavy atom peaks on Harker sections
Solve for heavy atom positions using symmetry
Calculate heavy atom vectors
Estimate phases
Anomalous dispersion
[Figure: a heavy atom with its free electrons and bound electrons.]
Inner electrons scatter with a time delay. This is a phase shift that is always counter-clockwise relative to the phase of the free electrons.
Anomalous dispersion
Low-resolution
At 4-6Å resolution, alpha helices look like sausages.
Medium resolution
~3Å data is good enough to see the backbone with space in between.
Errors in the phases make some connections ambiguous.
Holes in rings are a good thing
Seeing a hole in a tyrosine or phenylalanine ring is universally accepted as proof of good phases. You need at least 2Å data.
Great map: holes in rings
Superior map: Atomicity
Rarely is the data this good. 2 holes in Trp. All atoms separated.
Only small molecule structures are this good
Atoms are separated down to several contours. Proteins are never this well-ordered. But this is what the density really looks like.
Refinement
•The gradient* of the R-factor with respect to each atomic position may be calculated.
•Each atom is moved down-hill along the gradient.
•“Restraints” may be imposed.
* $\partial R_{factor} / \partial \vec{r}_i$
What is a restraint?
A restraint is a function of the coordinates that is lowest when the coordinates are “ideal”, and which increases as the coordinates become less ideal.
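A simple function with this property is a harmonic penalty on a bond length. The sketch below is one common choice and is not necessarily the form used by any particular refinement program; the ideal length and weight are illustrative values.

```python
import math

def bond_restraint(atom_a, atom_b, ideal_length, weight=1.0):
    """Harmonic penalty: zero at the ideal bond length, growing as the bond deviates."""
    length = math.dist(atom_a, atom_b)
    return weight * (length - ideal_length) ** 2

ideal = 1.33  # illustrative ideal length in Å
at_ideal = bond_restraint((0.0, 0.0, 0.0), (1.33, 0.0, 0.0), ideal)
stretched = bond_restraint((0.0, 0.0, 0.0), (1.50, 0.0, 0.0), ideal)
very_stretched = bond_restraint((0.0, 0.0, 0.0), (1.80, 0.0, 0.0), ideal)
print(at_ideal, stretched, very_stretched)
```

During refinement a term like this is added to the data residual, so a move that improves the fit but distorts the geometry is penalized.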
Stereochemical restraints
bond lengths, bond angles, torsion angles
also... planar groups, B’s
molecular replacement
If the structure of the molecule is known approximately, then the phases can be calculated.
BUT. We need to know how the molecule is oriented.
[Figure: two crystals built from the same molecule (R) in different orientations.]
The diffraction patterns of these two crystals are not the same.
6-dimensional search space
Every possible rigid-body transformation of a molecule can be described using 6 parameters: 3 angles of rotation (defining a matrix of 9 coefficients), and a vector of translation (3 values). i.e.
x' = c11x + c21y + c31z + vx
y' = c12x + c22y + c32z + vy
z' = c13x + c23y + c33z + vz
Therefore, the position of our molecule in the crystal unit cell must be a 6D transformation of its current position. Molecular replacement is the method for finding the angles and vector that define the transformation.
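The six parameters can be turned into a transformation in a few lines. This sketch assumes one particular angle convention (rotation about x, then y, then z); crystallographic programs use several different conventions, so the function names and the convention here are illustrative only.

```python
import math

def rotation_matrix(alpha, beta, gamma):
    """Rz(alpha) * Ry(beta) * Rx(gamma), angles in radians (one of several conventions)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb, cb * sg, cb * cg],
    ]

def transform(point, rotation, translation):
    """Apply x' = R x + v to one atom position."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# 3 angles + 3 translation components = 6 parameters
R = rotation_matrix(math.radians(90.0), 0.0, 0.0)
v = (0.0, 0.0, 5.0)
moved = transform((1.0, 0.0, 0.0), R, v)
print(moved)
```

Because the same R and v are applied to every atom, distances within the molecule do not change; only its orientation and position do.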
Rotated Patterson map for Gly
[Figure: rotated Patterson map with the intermolecular vectors marked.]
intramolecular vectors rotate around the origin
intermolecular vectors are transformed differently
Refinement
Steps after initial modeling:
(1) Rigid body refinement.
(2) Density modification.
(3) Difference maps.
(4) Least squares, protein coordinates + overall B-factor.
(5) Add waters, ions. More least squares.
(6) Least squares, protein coordinates + atomic B-factors.
(7) Least squares, multiple occupancy and anisotropic B-factors.
(8) Validation. Publication.
Initial model to final model
Rigid body refinement
(1) Rigid body refinement.
After molecular replacement only, to get the precise orientation of the molecule relative to the crystal axes. Whole molecule treated as a rigid group. Model may be cut into domains. If so, then each domain is rigid-body refined.
Density modification
(2) Density modification.
Coordinate-free refinement. The map is modified directly, then new phases are calculated. This step may be skipped for good starting models.
Density modification:
[Figure: cycle starting from initial phases: Fo’s and (new) phases, Map, Modified map, Fc’s and new phases, then back to Fo’s and (new) phases.]
Solvent Flattening: Make the water part of the map flat.
(1) Draw envelope around protein part
(2) Set solvent ρ to <ρ> and back transform.
(1) Calculate map.
(2) Skeletonize the map
(3) Make the skeleton “ protein-like”
(4) Back transform the skeleton.
Protein-like means: (a) no cycles, (b) no islands
Difference maps
ρ(Fo-Fc) = Difference map. Fc is calculated from the coordinates. This map shows missing or wrongly placed atoms.
ρ(2Fo-Fc) = This is a “native” map (Fo) plus a difference map (Fo-Fc). This map should look like the corrected model.
NOTE: “ρ(F)” means “map calculated using amplitudes F”
Omit map = Difference map or 2Fo-Fc map after removing suspicious coordinates. Removes “phase bias” density that results from least-squares refinement using wrong coordinates.
(3) Difference maps are used throughout the refinement process after a model has been built.
FÉTHIÈRE et al, Protein Science (1996), 5: 1174- 1183.
Two inhibitor peptides in two different crystals of the protease thrombin.
The inhibitor coordinates were omitted from the model before calculating Fc.
Then maps were made using Fo-Fc amplitudes and Fc phases.
(stereo images)
Least-squares refinement
•The partial derivative of the R-factor with respect to each atomic position can be calculated, because we know the change in amplitudes with change in coordinates.
•A 3D derivative is a “gradient”. Each atom is moved down-hill along the gradient.
•“Restraints” may be imposed to maintain good stereochemistry.
$\partial R_{factor} / \partial \vec{r}_i$
(4) Least squares, protein coordinates + overall B-factor.
Restraint types:
bond lengths
bond angles
torsion angles
planar groups
van der Waals
Calculated phases, observed amplitudes = hybrid F's
•Fc’s are calculated from the atomic coordinates
•A new electron density map calculated from the Fc's would only reproduce the model. (of course!)
•Instead we use the observed amplitudes |Fo|, and the model phases, αc.
Hybrid back transform:
$\rho(\vec{r}) = \sum_{\vec{h}} |F_{obs}(\vec{h})| \, e^{-i 2\pi \vec{h} \cdot \vec{r} + i \alpha_{calc}(\vec{h})}$
Hybrid maps show places where the current model is wrong and needs to be changed.
The “free R-factor”: cross-validation
The free R-factor is the test set residual, calculated the same as the R-factor, but on the “test set”.
Free R-factor asks “how well does your model predict the data it hasn’t seen?”
$R_{free} = \frac{\sum_{h \in T} \big|\, |F_{obs}(h)| - k\,|F_{calc}(h)| \,\big|}{\sum_{h \in T} |F_{obs}(h)|}$
where T is the test set.
Note: the only difference is which hkl are used to calculate.
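The R-factor and the free R-factor can share one function, since only the set of reflections differs. The amplitudes, the scale factor k = 1 and the choice of test reflections below are invented for illustration; in practice the test set is picked at random before refinement starts and is never used to adjust the model.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Sum of | |Fobs| - k|Fcalc| | divided by the sum of |Fobs|."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Hypothetical amplitudes for five reflections
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [95.0, 84.0, 57.0, 46.0, 20.0]
test_set = {3}  # indices of reflections held out of refinement

work = [i for i in range(len(f_obs)) if i not in test_set]
held_out = sorted(test_set)

r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in held_out], [f_calc[i] for i in held_out])
print(f"R = {r_work:.3f}, Rfree = {r_free:.3f}")
```

A free R-factor much higher than the working R-factor suggests the model has been fitted to noise in the working data.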