Crystals X-rays Atoms

Protein Crystallography

Crystals X-rays Atoms

FT FT-1

Solve phase problem

From protein to protein structure

Purify protein

Grow crystals

Collect X-ray data

Refine the model

Done

From protein to phasing

Grow crystals

Collect X-ray data

collect 1000’s of images

index and integrate spot intensities

scale intensity data

determine space group and cell dimensions

mount and freeze single crystals

Solve phase problem

Protein crystal growth

blue line = saturation of protein

red line = supersaturation limit

Crystal growth occurs between these two limits. Above the supersaturation limit, proteins form only disordered precipitate.

protein concentration

precipitant concentration

Arrows indicate different diffusion experiments.

A, B, D, F, G. Vapor diffusion.

E. Bulk

C. Microdialysis

L = liquid, S = solid, m = metastable state (supersaturated)

vapor diffusion setup

Volatiles (i.e. water) evaporate from one surface and condense on the other.

Drop has higher water concentration than reservoir, so drop slowly evaporates.

a Linbro plate

Crystallization theory

Nucleation takes higher concentration than crystal growth.

[Figure: molecules (R) assembling into a nucleus, with steps labeled slow, slow, fast; then rows of R adding to a crystal face, with steps labeled fast and not so slow]

After nucleation, the large size of a face makes the weak bond more likely.

Crystallization theory

Bonds A,B are stronger than P,Q.

More on Periodic Bond Chain theory: http://www.che.utoledo.edu/nadarajah/webpages/PBC.htm

Crystallization robot

High-throughput crystallography labs use pipetting robots to explore thousands of “conditions”. Each condition is a formulation of the crystal drop and the reservoir solution.

Conditions can have different:

•protein concentration

•pH

•precipitant, precipitant concentration

•detergents

•organic co-solvents

•metal ions

•ligands

•concentration gradient

protein crystals

cellulase, subtilisin

The color you see is “birefringence”, the wavelength-dependent rotation of polarized light.

~1mm

Crystal mounting

Xtal is mounted in a thin-walled glass capillary tube

Xtal is mounted on a thin film of water in a wire loop. The loop is fixed to a metal or glass rod.

If freezing (preferred)

eucentric goniometer head(made by Nonius)

If not freezing

Must freeze immediately or film will dry out!

Mounted xtal is attached to a goniometer head for precise adjustment.

wax

Crystal must be kept at proper humidity and temperature!! Very fragile!

Low-melting hard wax is used to ‘glue’ the rod or capillary here.

Small wrenches fit here, here, here and here.

Centering the crystal in the beam

“Machine center” is the intersection of the beam and the two goniostat rotation axes. Must be set by manufacturer!

xrays

To place crystal at machine center, rotate ω and κ and watch the crystal. If it moves from side to side, it is off center.

If it is off-center, we adjust the screws on the goniometer head.

whoops it’s off center. Fix it!

X-ray diffractometer with area detector

The detector (or film) sits on a “Two-theta arm” that can swing out, away from the beam to collect high-resolution data.

Axis of two-theta arm

Synchrotrons provide tunable, monochromatic X-rays

Crystals must be flash frozen

Water must be frozen to < –70°C very fast to prevent the formation of hexagonal ice crystals. Water glass forms.

How? Crystals, mounted on loops, are flash frozen by dipping in liquid propane or freon at –70°, or by instant exposure to N2 gas at –70°C.

hexagonal ice

Precession photograph

Determine Space Group

• symmetry

• systematic absences

• spacing of spots

• angle between axes.

Data collection

Measure the intensity (amplitude squared) of each reflection.

Output of data collection: thousands of reflections, each with 5 parameters:

h k l F sigma

The forward and reverse Fourier transform

light

The general equation for a wave

E(t) = A cos(ωt + α)

E(t) = the instantaneous electric field at time t

A = amplitude

ω = 2πc/λ = angular oscillation rate (radians/second), where λ is the wavelength

α = phase

Photons are oscillating electric fields*.

*also an oscillating magnetic field of the same frequency, oriented at 90 degrees to the electric field.

An electric field accelerates charged particles

[Figure: electrons (e-) between a positively charged plate (+) and a negatively charged plate (-)]

e- oscillates in an electric field...

•e- oscillation is the same frequency as the X-rays

•e- oscillation is much faster than the orbiting motion of e- around the nucleus (no significant Doppler effect).

•The amplitude of the e- oscillation is large because the mass of an e- is small. [Atomic nuclei don’t oscillate much!]

[Figure: an e- oscillating over time t in the field E, between + and -]

Oscillating e- create photons… in all directions ⊥ to the oscillation of the incoming wave.

e-

That’s scattering.

X-ray sources may be partially polarized.

oscillation

emission

Scattering by 2 e-

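Difference in pathlength = r•s - r•s0

Relative phase: α = 2π(r•s - r•s0)/λ

[Figure: two e- separated by the vector r; s0 points from the X-ray source, s points toward the X-ray detector; the projections r•s0 and r•s are marked]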

Wave addition

+

=

Sum the electric fields at each point in time.

“Constructive interference”: amplitude increases.

Rule #1: The sum of two waves with wavelength λ always produces a wave of wavelength λ.

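Adding two waves by parts

Wave 1: amplitude 5.0, phase -60°; cosine part 2.5, sine part 4.33

Wave 2: amplitude 4.0, phase +120°; cosine part -2.0, sine part -3.46

Sum: cosine part 0.5, sine part 0.87; amplitude 1.0, phase -60°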

Add amplitudes of cosine and sine parts, then recombine them.

Cosine parts and sine parts can be summed independently (like “orthogonal coordinates”).
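The by-parts addition can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the sign convention (the sine part is the coefficient of sin ωt in A cos(ωt + α) = A cosα cosωt - A sinα sinωt) is an assumption chosen to match the numbers in the example above.

```python
import math

def to_parts(amplitude, phase_deg):
    """Split A cos(wt + alpha) into (cosine part, sine part).

    Assumed convention: A cos(wt + a) = (A cos a) cos(wt) + (-A sin a) sin(wt),
    so the sine part is -A sin a.
    """
    a = math.radians(phase_deg)
    return amplitude * math.cos(a), -amplitude * math.sin(a)

def from_parts(cos_part, sin_part):
    """Recombine cosine and sine parts into (amplitude, phase in degrees)."""
    amplitude = math.hypot(cos_part, sin_part)
    phase_deg = math.degrees(math.atan2(-sin_part, cos_part))
    return amplitude, phase_deg

def add_waves(waves):
    """Add waves given as (amplitude, phase_deg) pairs, part by part."""
    parts = [to_parts(a, p) for a, p in waves]
    cos_sum = sum(c for c, _ in parts)
    sin_sum = sum(s for _, s in parts)
    return from_parts(cos_sum, sin_sum)

# The two waves from the example: 5.0 at -60 degrees and 4.0 at +120 degrees.
amplitude, phase = add_waves([(5.0, -60.0), (4.0, 120.0)])
print(round(amplitude, 2), round(phase, 1))
```

The two waves are exactly out of phase, so the sum keeps the phase of the larger wave and has the difference of the two amplitudes.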

The sum of angles rule

cos(α + β) = cos α cos β – sin α sin β

Applying the sum of angles rule to the wave equation decomposes it into sine and cosine parts

E(t) = A cos(ωt + α)

Using the sum of angles rule:

A cos(ωt + α) = A cosα cosωt - A sinα sinωt

A cosα = amplitude of cosine part (reference wave cosωt)

A sinα = amplitude of sine part (reference wave sinωt)

[Figure: a vector of length A at angle α, with axes labeled cosine part and sine part]

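A wave can be represented as a complex exponential

Euler’s Theorem:

$$e^{i\alpha} = \cos\alpha + i\sin\alpha$$

Proof: write the expansions and sum them

$$e^{\alpha} = 1 + \alpha + \frac{\alpha^2}{2!} + \frac{\alpha^3}{3!} + \frac{\alpha^4}{4!} + \frac{\alpha^5}{5!} + \cdots$$

$$e^{i\alpha} = 1 + i\alpha - \frac{\alpha^2}{2!} - \frac{i\alpha^3}{3!} + \frac{\alpha^4}{4!} + \frac{i\alpha^5}{5!} - \cdots$$

$$\cos\alpha = 1 - \frac{\alpha^2}{2!} + \frac{\alpha^4}{4!} - \frac{\alpha^6}{6!} + \cdots$$

$$i\sin\alpha = i\alpha - \frac{i\alpha^3}{3!} + \frac{i\alpha^5}{5!} - \frac{i\alpha^7}{7!} + \cdots$$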

A wave as a complex exponential

E(t) = A cos(ωt + α)

= A cosα cosωt - A sinα sinωt

= A e^(iα)

[Figure: the wave drawn as a vector of length A at angle α in the complex plane]

Adding waves using vector addition

t

A1=2.0 α1= +90°

A2=4.0 α2= -60°

-60°

+90°

B≈2.5 β≈-36°
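The same vector addition can be done with complex numbers, treating each wave as A e^(iα). A minimal sketch using Python's standard `cmath` module; the helper name `wave` is made up for this illustration.

```python
import cmath
import math

def wave(amplitude, phase_deg):
    """Represent a wave as the complex number A * exp(i * alpha)."""
    return cmath.rect(amplitude, math.radians(phase_deg))

# Wave 1: A1 = 2.0, alpha1 = +90 degrees; wave 2: A2 = 4.0, alpha2 = -60 degrees.
total = wave(2.0, 90.0) + wave(4.0, -60.0)

B = abs(total)                          # amplitude of the sum
beta = math.degrees(cmath.phase(total)) # phase of the sum, in degrees
print(round(B, 2), round(beta, 1))
```

Adding the complex numbers adds the real (cosine) and imaginary (sine) components separately, which is the same bookkeeping as adding by parts.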

Phase depends on the distance traveled

Phase = D/λ – nearest integer(D/λ)

Same for scattered path

Use path difference to get phase

[Figure: two e- separated by r, with incoming direction s0 and scattered direction s]

Measure this... and this... Get the difference, divide by the wavelength.

Multiply by 2π. That’s the phase difference.

Add the two waves using vectors.

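definition: Structure factor (F)

...is a single wave F, having an amplitude, phase and direction relative to the (crystallographic) reference frame. It is the wave sum of all scattering factors having the same incoming s0 and scattered s directions.

$$F = \sum_j f_j \, e^{2\pi i \, \mathbf{S} \cdot \mathbf{r}_j}$$

[Figure: scattering factors f1, f2, f3, f4 added head to tail to give F, for incoming direction s0 and scattered direction s]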

Worksheet for calculating F for 4 atoms

λ = 1.00Å

s0 = (1,0,0)

s = (0.5, 0.8660, 0.)

r1 = (0.10, 0.22, 0.0)

r2 = (0.15, 0.10, 0.0)

r3 = (0.25, 0.33, 0.0)

r4 = (0.00, -0.10, 0.0)

phases: 2πr•(s-s0)

r1•(s-s0) = (0.10, 0.22, 0.0)•(-0.5, 0.8660, 0.) = -0.05 + 0.19 + 0 = 0.14

α1 = 360*0.14 = 50.4°, α2 = 4.2°, α3 = 57.9°, α4 = -31.2°

F = 3.2 e^(i28°)

amplitude = 3.2, phase = 28°
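The worksheet can be checked numerically. The sketch below uses the wavelength, directions and atom positions given above; the scattering factors are not stated in the worksheet, so setting every f to 1.0 is an assumption made purely for illustration.

```python
import cmath
import math

wavelength = 1.00                      # Angstrom
s0 = (1.0, 0.0, 0.0)                   # incoming direction
s = (0.5, 0.8660, 0.0)                 # scattered direction
atoms = [
    (0.10, 0.22, 0.0),
    (0.15, 0.10, 0.0),
    (0.25, 0.33, 0.0),
    (0.00, -0.10, 0.0),
]
f = [1.0, 1.0, 1.0, 1.0]               # ASSUMPTION: unit scattering factors

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Scattering vector S = (s - s0) / wavelength
S = tuple((a - b) / wavelength for a, b in zip(s, s0))

# Phase of each atom in degrees: 360 * r . S
phases_deg = [360.0 * dot(r, S) for r in atoms]

# Structure factor: wave sum of f_j * exp(2 pi i S . r_j)
F = sum(fj * cmath.exp(2j * math.pi * dot(r, S)) for fj, r in zip(f, atoms))
amplitude = abs(F)
phase_deg = math.degrees(cmath.phase(F))

print([round(p, 1) for p in phases_deg], round(amplitude, 2), round(phase_deg, 1))
```

Under the unit-scattering-factor assumption the amplitude comes out close to the worksheet's 3.2. The phase of the sum is sensitive to the f values assumed, so it need not reproduce the worksheet's 28° exactly.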

The Unit Cell

e-

Origin is set by convention to the centers of symmetry (if there is no crystallographic symmetry, the origin is arbitrary).

the repeating unit in a crystal

Can we now sum the scattered waves from every point in the unit cell?

Summing waves from all points in the unit cell

Let’s calculate the scattering factor from each volume unit in the unit cell. Each volume unit has a phase, defined by its location relative to the origin (which always has phase zero).

[Figure: unit cell divided into tiny volumes, each containing e-]

Fourier Transform

$$F(\mathbf{S}) = \int \rho(\mathbf{r}) \, e^{i 2\pi \mathbf{S} \cdot \mathbf{r}} \, d\mathbf{r}$$

This type of integral is called the Fourier Transform: a conversion of one space into another space with reciprocal units.

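Inverse Fourier Transform

For every Fourier Transform, there exists an inverse Fourier Transform which converts the reciprocal space back to real space.

forward transform:

$$F(\mathbf{S}) = \int \rho(\mathbf{r}) \, e^{i 2\pi \mathbf{S} \cdot \mathbf{r}} \, d\mathbf{r}$$

reverse transform (note the minus sign):

$$\rho(\mathbf{r}) = \int F(\mathbf{S}) \, e^{-i 2\pi \mathbf{S} \cdot \mathbf{r}} \, d\mathbf{S}$$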
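A discrete, one-dimensional analogue makes the forward/reverse pair concrete. This is a sketch, not a crystallographic program: the density values are invented, the grid has only 8 points, and the 1/N normalization on the reverse transform is the usual discrete convention rather than something stated above.

```python
import cmath
import math

def forward_transform(rho):
    """F(h) = sum over x of rho(x) * exp(+2 pi i h x / N)."""
    n = len(rho)
    return [
        sum(rho[x] * cmath.exp(2j * math.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]

def reverse_transform(F):
    """rho(x) = (1/N) * sum over h of F(h) * exp(-2 pi i h x / N)."""
    n = len(F)
    return [
        sum(F[h] * cmath.exp(-2j * math.pi * h * x / n) for h in range(n)) / n
        for x in range(n)
    ]

# Hypothetical 1D "electron density" sampled on 8 grid points.
rho = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 0.0]

F = forward_transform(rho)
rho_back = [value.real for value in reverse_transform(F)]
print([round(v, 6) for v in rho_back])
```

The round trip recovers the density only because both the amplitude and the phase of every F(h) are kept; F(0) is simply the total density in the cell.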

Reflection plane

All points on the reflection plane scatter in phase.

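Same angle, θ, with beam and scattered X-ray

[Figure: s0 and s each make angle θ with the reflection plane; S is normal to the plane; the angle between s0 and s is 2θ]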

Bragg’s law

nλ = 2d sinθ

Reflection planes separated by d scatter in phase. If θ is larger, d is smaller.

d = λ/(2 sinθ)

[Figure: two planes separated by d, with the beam at angle θ; the path length for this plane is shorter by λ]

Sir Lawrence Bragg, winner of 1915 Nobel Prize

Scattering by Bragg planes

A is proportional to the total number of e- on all of these planes.

[Figure: Bragg planes separated by d at angle θ; the planes extend throughout the crystal. The scattered wave is drawn as a vector of length A in the complex plane (axes R, i)]

Integrating planes separated by d

A’ is proportional to the total number of e- on these planes.

Planes shifted by d/6 are phase shifted by 2π/6.

The total F is the wave sum over all Bragg planes.

[Figure: vectors from each set of shifted planes added head to tail in the complex plane (axes R, i) to give F]

Resolution

[Figure: an image of 2 beta strands, its Fourier transform (FT), and back transforms at low, medium and high resolution]

Bragg planes are always perpendicular to S

[Figure: s0 and s at angle θ to the reflection plane; S is drawn as the sum of s and -s0]

Since s0 and s are the same length and have the same angle to the reflection plane, S = (s-s0)/λ is normal to the plane.

The length of S is 1/d

The length of S is 2sinθ times the length of s/λ (or s0/λ), which is 1/λ. So |S| = 2sinθ/λ = 1/d

[Figure: each half of S has length sinθ/λ]
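The relation |S| = 2sinθ/λ = 1/d can be checked with a few lines of arithmetic. The sketch below reuses the s0, s and λ values from the structure factor worksheet; treating s0 and s as unit vectors is an assumption consistent with S = (s-s0)/λ.

```python
import math

wavelength = 1.00                 # Angstrom
s0 = (1.0, 0.0, 0.0)              # incoming direction (unit vector)
s = (0.5, 0.8660, 0.0)            # scattered direction (unit vector)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Scattering vector and its length
S = tuple((a - b) / wavelength for a, b in zip(s, s0))
S_length = math.sqrt(dot(S, S))

# The angle between s0 and s is 2*theta
two_theta = math.acos(dot(s, s0))
theta = two_theta / 2.0

d_from_S = 1.0 / S_length                              # d = 1/|S|
d_from_bragg = wavelength / (2.0 * math.sin(theta))    # Bragg's law, n = 1

print(round(math.degrees(theta), 1), round(d_from_S, 3), round(d_from_bragg, 3))
```

Both routes give the same spacing, which is the point of the construction: the direction of S picks out the set of planes, and its length picks out their separation.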

Crystal = 3D lattice

A vector expression that defines a crystal lattice:

ρ(r) = ρ(r + ta + ub + vc)

where a, b, and c are the unit cell axes and t, u, and v are integers.

Adding unit cells

[Figure: unit cells with axes a, b, c stacked at (0,0,0), (ta,ub,vc), (2ta,2ub,2vc), (3ta,3ub,3vc)]

r = (ta,ub,vc)

Phase = α

Phase = α + S•r

Phase = α + S•r + S•r

etc

Periodic delta functions in three directions

[Figure: periodic delta functions along a, b and c, and the resulting intensity I]

…results in points of diffraction, in S space.

Laue conditions

Diffraction is discrete scattering: points of near-perfect constructive interference separated by zones of near-perfect destructive interference. Certain points S in reciprocal space satisfy the conditions that the phase shift equals zero.

S •a = h

S •b = k

S •c = l

h, k, and l are integers. These are the “Laue conditions.”

Definition of the Reciprocal Lattice

Let’s define the reciprocal lattice as the subset of points in S for which the Laue conditions hold: S = ha* + kb* + lc*, then...

a*• a = 1

b*• b = 1

c*• c = 1

reciprocal lattice axes

See Drenth Ch 4, p 86, Table 4.1 for how to calculate the reciprocal lattice vectors a*, b*, c*

Reciprocal cell axes are perpendicular to real cell axes

S •a = 1

S •b = 0

S •c = 0

The a* axis is defined as the direction S when the Laue numbers are h=1, k=0, and l=0. What is the direction of S?

If the dot-product of two vectors is zero, then the two vectors are orthogonal (90° apart). So a* is orthogonal to b and c.

a* is not necessarily parallel to a, but it is always perpendicular to b and c.

Complete conditions for the reciprocal lattice

a*• a = 1, b*• a = 0, c*• a = 0

a*• b = 0, b*• b = 1, c*• b = 0

a*• c = 0, b*• c = 0, c*• c = 1
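One standard way to satisfy all nine conditions is to build each reciprocal axis from a cross product of the other two real axes, divided by the cell volume. The sketch below does that for a made-up monoclinic-like cell (the cell vectors are hypothetical, chosen only so that a* is visibly not parallel to a); the helper `d_spacing` uses the earlier result that the length of S is 1/d.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def reciprocal_axes(a, b, c):
    """a* = (b x c)/V, b* = (c x a)/V, c* = (a x b)/V, with V = a . (b x c)."""
    volume = dot(a, cross(b, c))
    a_star = tuple(x / volume for x in cross(b, c))
    b_star = tuple(x / volume for x in cross(c, a))
    c_star = tuple(x / volume for x in cross(a, b))
    return a_star, b_star, c_star

def d_spacing(h, k, l, axes):
    """d = 1/|S| with S = h a* + k b* + l c*."""
    a_star, b_star, c_star = axes
    S = tuple(h * x + k * y + l * z for x, y, z in zip(a_star, b_star, c_star))
    return 1.0 / math.sqrt(dot(S, S))

# Hypothetical unit cell, in Angstrom (c is tilted away from the z axis).
a = (5.0, 0.0, 0.0)
b = (0.0, 8.0, 0.0)
c = (-2.0, 0.0, 10.0)

axes = reciprocal_axes(a, b, c)
a_star, b_star, c_star = axes
print(a_star, dot(a_star, a), dot(a_star, b), dot(a_star, c))
```

In this example a* picks up a component along z, so it is perpendicular to b and c without being parallel to a.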

Real cell relationship to reciprocal cell

[Figure: real axes a, b and reciprocal axes a*, b*]

2D for simplicity. If a < b, then a* > b*

a* ⊥ b a* ⊥ c

b* ⊥ a b* ⊥ c

c* ⊥ a c* ⊥ b

See Drenth Ch 4, p 86 for more.

1 7 0

2 1 0

• Bragg planes must align with the unit cell origins.

• All unit cells in the crystal (ta+ub+vc) scatter with the same phase.

Using Miller indices: S = ha*+kb*+lc*

2 1 0

Bragg planes intersect the axes @ 1/h,1/k,1/l

• The n=1 Bragg plane (normal to S at distance d) cuts the unit cell axes at 1/h, 1/k, 1/l

1/h = -1/2

1/k = +1/1

If indices hkl are doubled, Bragg distance d is halved.

2 1 0

• All unit cell origins have phase zero. But not all phase-zero Bragg planes must go through a unit cell origin. For example, the n=odd Bragg planes for the 0 2 0 reflection do not touch a single unit cell origin.

4 2 0, 0 1 0, 0 2 0

(2 3 3) Bragg planes (4 6 6) Bragg planes

3D Bragg planes

Phase-zero planes intersect the cell axes at fractional coordinates (1/h,0,0), (0,1/k,0), (0,0,1/l)

Ewald sphere intersecting the reciprocal lattice

For a given orientation of the crystal with respect to the X-rays, only those values of S that are on the Ewald sphere are visible to the detector.

Ewald sphere reciprocal lattice (centered on crystal)

In class exercise: index these spots

[Figure: diffraction spots with axes h and k and the beam direction marked]

answer

[Figure: the same spots, on layers l=0, l=1 and l=2]

hkl = 3 4 0, 0 12 1, 7 12 2, 7 0 1

Crystal packing

Protein crystal packing interactions are mostly salt-bridges and H-bonds. These are much weaker than the hydrophobic interactions that hold proteins together. This means that (1) protein crystals are fragile, and (2) proteins in crystals are probably not significantly distorted from their native conformations.

Oh no. We can’t measure phases!

X-ray detectors (film, photomultiplier tubes, CCDs, etc) can measure only the intensity of the X-rays (which is the amplitude squared), but we need the full wave equations A e^(iα) for each reflection to do the reverse Fourier transform.

And because it is called the phase “problem”, the process of getting the phases is called a “solution”. That’s why we say we “solved” the crystal structure, instead of “measured” or “determined” it.

The Phase Problem

Phase is more important than amplitude

color = phase angle

darkness = amplitude

real space Patterson space

What do we get if we set all phases to zero?

phase = α, phase = 0

Moving every pair of atoms to the origin would not change the amplitudes, but would set all phases = 0.

“Patterson space”

[Figure: interatomic vector r]

Vice-versa: Not changing amplitudes, but setting all phases to 0 creates a centro-symmetric image of the molecule.

Patterson map represents all inter-atomic vectors

To generate a fake Patterson map in 2D, draw all inter-atomic vectors, then move the tails to the origin. The heads are where peaks would be.
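The fake-Patterson recipe is easy to sketch in code. The five 2D coordinates below are hypothetical (they are not real glycine geometry), and `patterson_peaks` is a name invented for this illustration.

```python
from itertools import permutations

def patterson_peaks(atoms):
    """Return every inter-atomic vector (head minus tail), tails moved to the origin.

    Each ordered pair of distinct atoms contributes one vector, so N atoms
    give N*(N-1) peaks (the origin peak from each atom with itself is omitted).
    """
    return [
        (xj - xi, yj - yi)
        for (xi, yi), (xj, yj) in permutations(atoms, 2)
    ]

# Hypothetical 2D positions of 5 atoms (arbitrary units).
atoms = [(0.0, 0.0), (1.2, 0.7), (2.4, 0.0), (3.0, 1.1), (3.0, -1.1)]

peaks = patterson_peaks(atoms)
print(len(peaks))
```

Every vector from atom i to atom j is matched by the opposite vector from j to i, which is why the resulting map is centro-symmetric.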

For example, take glycine, 5 atoms (not counting H’s)

Move each vector to the origin

Patterson map for Gly in P1

Can you reassemble glycine from this?

translational symmetry peaks

unit cell vector peak

For small molecules, vector/geometry problem can be solved

...if you know the geometry (bond lengths, angles) of the molecule

http://www.cryst.bbk.ac.uk/xtal/mir/patt2.htm

Multiple isomorphous replacement

= +

The Fourier transform (i.e. diffraction pattern) of a heavy atom derivative is the vector sum of the transforms of the protein and the heavy atoms.

NOTE: protein and protein-heavy atom crystals must be isomorphous.

Turning proteins into small molecules by soaking in heavy atoms

Subtracting Fourier transforms

= –

The Fourier transform of the heavy atoms is the vector difference of FPH and FP.

Subtracting Fourier transforms

FPH – FP = FH is true, but we still don’t know the phases, so we can’t subtract vectors.

Assume, for the moment, that |FH| << |FP|, |FPH|. Then the phases of FP and FPH are approximately the same. So,

|FPH–FP| ≈ |FPH| – |FP|

[Figure: vector triangle of FP, FH and FPH]

In that case, the Patterson map (no phases) based on |FPH|–|FP| will be approximately the same as the Patterson map based on |FH|.

We can represent a structure factor of unknown phase as a circle

[Figure: circle in the complex plane (axes R, i)]

Radius of the circle is the amplitude. The true F lies somewhere on the circle.

Summing structure factors geometrically

|FP + FH| = |FPH|

FH: we know amplitude and phase

FP, FPH: we know only amplitude

There are two ways to make the vector sums add up.

|FP + FH1| = |FPH1|

|FP + FH2| = |FPH2|

Or

|FP| = |FPH1–FH1| = |FPH2–FH2|

Two heavy atom derivatives are enough, but more is better

FH

FPH

FP
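The two-fold ambiguity, and how a second derivative resolves it, can be sketched numerically. From |FP + FH| = |FPH| the law of cosines gives cos(αP - αH) = (|FPH|² - |FP|² - |FH|²)/(2|FP||FH|), so each derivative allows two values of αP. All numbers below are synthetic and error-free, and the function names are invented for this illustration.

```python
import cmath
import math

def candidate_phases(fp_amp, fph_amp, fh):
    """Two possible phases (radians) of FP that satisfy |FP + FH| = |FPH|.

    fh is the complex heavy-atom structure factor (amplitude and phase known).
    """
    fh_amp, fh_phase = abs(fh), cmath.phase(fh)
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding error
    delta = math.acos(cos_delta)
    return fh_phase + delta, fh_phase - delta

def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

def resolve_phase(fp_amp, derivative1, derivative2):
    """Pick the FP phase on which both derivatives agree best."""
    first = candidate_phases(fp_amp, *derivative1)
    second = candidate_phases(fp_amp, *derivative2)
    _, best = min((angular_distance(p, q), p) for p in first for q in second)
    return best

# Synthetic "truth", used only to generate consistent amplitudes.
FP = cmath.rect(10.0, math.radians(40.0))
FH1 = cmath.rect(2.0, math.radians(100.0))
FH2 = cmath.rect(3.0, math.radians(-20.0))

alpha_p = resolve_phase(
    abs(FP),
    (abs(FP + FH1), FH1),   # (|FPH1|, FH1)
    (abs(FP + FH2), FH2),   # (|FPH2|, FH2)
)
recovered_deg = math.degrees(alpha_p)
print(round(recovered_deg, 1))
```

With real data the amplitudes carry errors, so the candidates from different derivatives never coincide exactly; that is one reason more derivatives help.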

Initial phases

Phases are not measured exactly because amplitudes are not measured exactly.

Error bars on FP and FPH create a distribution of possible phase values α.

Width of circle is 1σ deviation, derived from data collection statistics.

From data to model

Collect native data: Fp

Collect heavy atom data: Fph

Estimate phases

Calculate ρ

Trace the map

Refine

Is the map traceable?

density modification?

yes

no

From data to phases

native data: Fp heavy atom data: Fph

Calculate difference Patterson

Find heavy atom peaks on Harker sections

Solve for heavy atom positions using symmetry

Calculate heavy atom vectors

Estimate phases

Anomalous dispersion

Heavy atom free electrons

bound electrons

Inner electrons scatter with a time delay. This is a phase shift that is always counter-clockwise relative to the phase of the free electrons.

Anomalous dispersion

Low-resolution

At 4-6Å resolution, alpha helices look like sausages.

Medium resolution

~3Å data is good enough to see the backbone with space in between.

Errors in the phases make some connections ambiguous.

Holes in rings are a good thing

Seeing a hole in a tyrosine or phenylalanine ring is universally accepted as proof of good phases. You need at least 2Å data.

Great map: holes in rings

Superior map: Atomicity

Rarely is the data this good. 2 holes in Trp. All atoms separated.

Only small molecule structures are this good

Atoms are separated down to several contours. Proteins are never this well-ordered. But this is what the density really looks like.

Refinement

•The gradient* of the R-factor with respect to each atomic position may be calculated.

•Each atom is moved down-hill along the gradient.

•“Restraints” may be imposed.

$$^{*}\ \frac{d R_{\text{factor}}}{d \mathbf{r}_i}$$

What is a restraint?

A restraint is a function of the coordinates that is lowest when the coordinates are “ideal”, and which increases as the coordinates become less ideal.
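As an illustration of that definition, a bond-length restraint can be written as a simple squared deviation from an ideal length. The harmonic form, the weight, the coordinates and the 1.53 Å ideal value are assumptions for this sketch, not something specified above.

```python
import math

def bond_length_restraint(atom1, atom2, ideal_length, weight=1.0):
    """Penalty that is zero at the ideal bond length and grows as the bond deviates.

    ASSUMPTION: a harmonic (squared-deviation) form with an arbitrary weight.
    """
    distance = math.dist(atom1, atom2)
    return weight * (distance - ideal_length) ** 2

ideal = 1.53  # Angstrom; illustrative value for a carbon-carbon single bond

at_ideal = bond_length_restraint((0.0, 0.0, 0.0), (1.53, 0.0, 0.0), ideal)
stretched = bond_length_restraint((0.0, 0.0, 0.0), (1.63, 0.0, 0.0), ideal)
more_stretched = bond_length_restraint((0.0, 0.0, 0.0), (1.83, 0.0, 0.0), ideal)

print(at_ideal, stretched, more_stretched)
```

During refinement a term like this would be added to the crystallographic residual, so that moving an atom to improve the fit to the data costs something if it distorts the geometry.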

Stereochemical restraints

bond lengths bond angles torsion angles

also...

planar groups

B’s

Molecular replacement

If the structure of the molecule is known approximately, then the phases can be calculated.

BUT. We need to know how the molecule is oriented.

[Figure: two crystals built from the same molecule (R) in different orientations]

The diffraction patterns of these two crystals are not the same.

6-dimensional search space

Every possible rigid-body transformation of a molecule can be described using 6 parameters: 3 angles of rotation (defining a matrix of 9 coefficients), and a vector of translation (3 values). i.e.

x' = c11x + c21y + c31z + vx

y' = c12x + c22y + c32z + vy

z' = c13x + c23y + c33z + vz

Therefore, the position of our molecule in the crystal unit cell must be a 6D transformation of its current position. Molecular replacement is the method for finding the angles and vector that define the transformation.
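A minimal sketch of the 6-parameter transformation: three angles build the 3×3 matrix, and a translation vector is added. The z-y-z order of the rotations is an assumption made for this example (other angle conventions are equally valid), and the function names are invented here.

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rigid_transform(point, angles, translation):
    """Apply x' = R x + v, with R built from three angles (assumed z-y-z order)."""
    alpha, beta, gamma = angles
    R = matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))
    return tuple(
        sum(R[i][k] * point[k] for k in range(3)) + translation[i]
        for i in range(3)
    )

# Two atoms of a hypothetical molecule, moved by the same 6 parameters.
angles = (math.radians(30.0), math.radians(45.0), math.radians(-60.0))
translation = (10.0, -5.0, 2.5)
p, q = (1.0, 2.0, 3.0), (4.0, 0.0, -1.0)

p_new = rigid_transform(p, angles, translation)
q_new = rigid_transform(q, angles, translation)
print(math.dist(p, q), math.dist(p_new, q_new))
```

Because the whole molecule moves as one body, every inter-atomic distance is unchanged; only the orientation and position differ, which is why 6 numbers are enough to search over.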

Rotated Patterson map for Gly

intermolecular vectors

intramolecular vectors rotate around the origin

intermolecular vectors are transformed differently

Refinement

Steps after initial modeling:

(1) Rigid body refinement.

(2) Density modification.

(3) Difference maps.

(4) Least squares, protein coordinates + overall B-factor.

(5) Add waters, ions. More least squares.

(6) Least squares, protein coordinates + atomic B-factors.

(7) Least squares, multiple occupancy and anisotropic B-factors.

(8) Validation. Publication.

Initial model to final model

Rigid body refinement

(1) Rigid body refinement.

After molecular replacement only, to get the precise orientation of the molecule relative to the crystal axes. Whole molecule treated as a rigid group. Model may be cut into domains. If so, then each domain is rigid-body refined.

Density modification

(2) Density modification.

Coordinate-free refinement. The map is modified directly, then new phases are calculated. This step may be skipped for good starting models.

Density modification :

Fo’s and (new) phases

Map Modified map

Fc’s and new phases

initialphases

Solvent Flattening: Make the water part of the map flat.

(1) Draw envelope around protein part

(2) Set solvent ρ to <ρ> and back transform.
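The flattening step itself is simple once the envelope is known. The sketch below works on a one-dimensional list of density values with a hand-made protein mask; both are hypothetical, and drawing the envelope and the back transform are not shown.

```python
def solvent_flatten(density, protein_mask):
    """Replace every solvent grid point with the mean solvent density.

    density: list of map values; protein_mask: list of booleans,
    True where the grid point lies inside the protein envelope.
    """
    solvent = [d for d, inside in zip(density, protein_mask) if not inside]
    mean_solvent = sum(solvent) / len(solvent)
    return [d if inside else mean_solvent
            for d, inside in zip(density, protein_mask)]

# Hypothetical noisy map: three protein points followed by four solvent points.
density = [0.9, 1.4, 1.1, 0.2, -0.1, 0.3, 0.0]
protein_mask = [True, True, True, False, False, False, False]

flattened = solvent_flatten(density, protein_mask)
print(flattened)
```

The protein region is left untouched; the information gain comes from back transforming the modified map, which yields new phases to combine with the observed amplitudes.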

(1) Calculate map.

(2) Skeletonize the map

(3) Make the skeleton “protein-like”

(4) Back transform the skeleton.

Protein-like means: (a) no cycles, (b) no islands

Difference maps

ρ(Fo-Fc) = Difference map. Fc is calculated from the coordinates. This map shows missing or wrongly placed atoms.

ρ(2Fo-Fc) = This is a “native” map (Fo) plus a difference map (Fo-Fc). This map should look like the corrected model.

NOTE: “ρ(F)” means “map calculated using amplitudes F”

Omit map = Difference map or 2Fo-Fc map after removing suspicious coordinates. Removes “phase bias” density that results from least-squares refinement using wrong coordinates.

(3) Difference maps are used throughout the refinement process after amodel has been built.

FÉTHIÈRE et al, Protein Science (1996), 5: 1174- 1183.

Two inhibitor peptides in two different crystals of the protease thrombin.

The inhibitor coordinates were omitted from the model before calculating Fc.

Then maps were made using Fo-Fc amplitudes and Fc phases.

(stereo images)

Least-squares refinement

•The partial derivative of the R-factor with respect to each atomic position can be calculated, because we know the change in amplitudes with change in coordinates.

•A 3D derivative is a “gradient”. Each atom is moved down-hill along the gradient.

•“Restraints” may be imposed to maintain good stereochemistry.

$$\frac{d R_{\text{factor}}}{d \mathbf{r}_i}$$

(4) Least squares, protein coordinates + overall B-factor.

Restraint types: bond lengths, bond angles, torsion angles, planar groups, van der Waals

Calculated phases, observed amplitudes = hybrid F's

•Fc’s are calculated from the atomic coordinates

•A new electron density map calculated from the Fc's would only reproduce the model. (of course!)

•Instead we use the observed amplitudes |Fo|, and the model phases, αc.

Hybrid back transform:

$$\rho(\mathbf{r}) = \sum_{\mathbf{h}} \left| F_{obs}(\mathbf{h}) \right| \, e^{i \alpha_{calc}(\mathbf{h})} \, e^{-i 2\pi \, \mathbf{h} \cdot \mathbf{r}}$$

Hybrid maps show places where the current model is wrong and needs to be changed.

The “free R-factor”: cross-validation

The free R-factor is the test set residual, calculated the same as the R-factor, but on the “test set”.

Free R-factor asks “how well does your model predict the data it hasn’t seen?”

$$R_{free} = \frac{\sum_{h \in T} \left| \, |F_{obs}(h)| - k \, |F_{calc}(h)| \, \right|}{\sum_{h \in T} |F_{obs}(h)|}$$

Note: the only difference is which hkl are used to calculate.
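A small sketch shows that the R-factor and the free R-factor are the same calculation applied to different reflections. The amplitudes, the hkl indices and the choice of test set below are invented for illustration, and the scale factor k is simply fixed at 1.

```python
def r_factor(reflections, hkl_subset, k=1.0):
    """Sum of | |Fobs| - k|Fcalc| | over the subset, divided by the sum of |Fobs|.

    reflections: dict mapping (h, k, l) -> (|Fobs|, |Fcalc|).
    """
    numerator = sum(abs(reflections[hkl][0] - k * reflections[hkl][1])
                    for hkl in hkl_subset)
    denominator = sum(reflections[hkl][0] for hkl in hkl_subset)
    return numerator / denominator

# Hypothetical amplitudes: (|Fobs|, |Fcalc|) for five reflections.
reflections = {
    (1, 0, 0): (100.0, 90.0),
    (0, 1, 0): (50.0, 55.0),
    (1, 1, 0): (80.0, 80.0),
    (2, 1, 0): (60.0, 48.0),
    (0, 2, 1): (40.0, 46.0),
}
test_set = {(2, 1, 0), (0, 2, 1)}          # T: never used in refinement
work_set = set(reflections) - test_set     # everything else

r_work = r_factor(reflections, work_set)
r_free = r_factor(reflections, test_set)
print(round(r_work, 3), round(r_free, 3))
```

In this made-up case the test-set residual is much larger than the working-set residual, which is the kind of gap that signals a model fitted more closely to the refined data than it deserves.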
