The Shelx Suite: Applications to Macromolecular Crystallography
Tim Grüne
Dept. of Structural Chemistry, University of Göttingen
June 2011
http://shelx.uni-ac.gwdg.de
SBGrid Symposium 2011: The Shelx Suite 1/35
Package Content
The “Shelx Suite” [1] consists of
SHELXS: (small molecule) structure solution by Patterson and direct methods
SHELXD (mx): (small molecule) structure solution and (macromolecular) heavy atom location by direct methods
SHELXE (mx): (macromolecular) phasing and density modification
SHELXL (mx): small molecule and macromolecular structure refinement
CIFTAB: creation of tables from CIF-files for publication
SHELXC (mx): data preparation for macromolecular phasing with shelxd and shelxe
SHELXPRO (mx): collection of conversion utilities (e.g. PDB → .ins and .res → PDB)
SHELXWAT: automated solvent molecule search for macromolecules (rather obsolete with coot)
The download page also contains the programs mtz2sca and mtz2hkl [3] for conversion of mtz-format files to sca- and hkl-format respectively.
SHELXmx
Programs labelled with mx are (also) used in macromolecular crystallography.
SHELXD: originally written for structure solution of small molecules by direct methods, now also used for substructure solution in experimental phasing (this was possible without modification of the program).
SHELXE: density modification and (β-version) auto-tracing of protein structures
SHELX C/D/E: the “triad” shelx c/d/e is best known for experimental phasing
SHELXL: high resolution refinement and refinement against neutron diffraction data
SHELXPRO: preparation of .ins files from PDB files, including standard restraints for peptides and nucleic acid structures; creation of maps for O and XtalView (useful in the “pre-coot era”)
Impact on Crystallography
The SHELX programs shelxd/s (structure solution with direct methods) and especially shelxl (refinement) have long been the de facto standard for small molecule crystallography, best illustrated by the rise of the impact factor of Acta Crystallographica A in 2009/2010 (http://www.iucr.org/index.html/leading-article/2010/2010-07-12).
In macromolecular crystallography, shelxd is one of the major phasing programs and is also used by many pipelines such as autoSharp, crank, autorickshaw, ...
shelxl is less popular when it comes to macromolecules (majors: refmac5 [7] & phenix.refine [5]).
Garib Murshudov’s favourite quote on new features in refmac5:
“[. . . feature xyz] has been available in SHELXL since the beginning of time.”
Macromolecular Crystallography
In macromolecular crystallography, the shelx programs have some alternatives:
SHELXD
• Sharp www.globalphasing.com
• solve https://solve.lanl.gov
• bp3 http://www.bfsc.leidenuniv.nl/software/bp3
• SnB http://www.hwi.buffalo.edu/snb, ...
SHELXE
• DM
• parrot http://www.ysbl.york.ac.uk/~cowtan/parrot/parrot.html
• buccaneer http://www.ysbl.york.ac.uk/~cowtan/buccaneer/buccaneer.html
• resolve https://solve.lanl.gov, ...
SHELXL
• Refmac5 http://www.ysbl.york.ac.uk/~garib/refmac/
• Phenix http://www.phenix-online.org
• Buster www.globalphasing.com
• TNT http://www.uoxray.uoregon.edu/tnt/, ...
(these lists are definitely far from complete)
Additional Programs
The following programs are also authored by George Sheldrick but distributed by Bruker AXS:
xprep: space group determination, data analysis and preparation of phasing
sadabs: scaling of integrated data (normal and modulated crystals)
twinabs: scaling of integrated data (non-merohedrally twinned crystals)
sadabs and twinabs are fine-tuned to work with the data processing program saint (Bruker AXS) and produce excellent results even with twinned macromolecular data.
Scaling with sadabs
For the transition from HKL2000 data (.x-files) to scaling with sadabs (instead of scalepack), the Shelx homepage provides the program x2sad.
For the transition from XDS (XDS_ASCII.HKL) to scaling with sadabs (instead of scala or xscale), the Shelx homepage provides the program xds2sad.
Program Philosophy
All shelx programs are command line programs, i.e. they are started from a terminal window (Linux/UNIX) or the command prompt (Windows) respectively.
• The main programs shelxd, shelxs, shelxl require an .ins file with instructions.
• shelxe takes all its options from the command line.
• shelxpro, xprep and sadabs/twinabs are interactive programs offering (text-based) menus.
Structure Refinement with Shelxl
Advantages of Shelxl
• Enormous flexibility, even refinement of Laue data and neutron data
• Refinement of (multiple domain) twin data
• Refinement against intensities ⇒ inclusion of negative intensities
• Proper treatment of anisotropy and symmetry
• Calculation of standard uncertainties (esds) for small structures
• Availability of a parallelised version (shelxl mp)
Reasons for Shelxl Refinement
• Anisotropic refinement (at d ≲ 1.5 Å)
• Twin refinement
• Occupancy refinement (ligand studies)
• Complicated disorder (free variables)
• Laue data
• Neutron diffraction data (time-of-flight Laue)
Input to Shelxl
Shelxl requires two input files with the same basename:
1. myname.hkl: shelxl reads the data from plain text files, one reflection per line. Several different formats are allowed. Most commonly used:
• HKLF 4: h k l Iobs σ(I), i.e. Miller indices followed by intensity data
• HKLF 5: h k l Iobs σ(I) m, where m is the twin domain
• HKLF 2: h k l I σ(I) BN λ, (X-ray or neutron) Laue data
2. myname.ins: instruction file containing a header with instructions, restraints, and constraints, followed by the list of atoms.
For historical reasons (punch card era) line widths are restricted to 80 characters (columns). A line can be continued by ending it with an '=' sign (before column 80) and leaving the first four characters of the following line empty.
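Reading an HKLF 4 file can be sketched in a few lines of Python. The sketch assumes the fixed-width record layout described in the SHELX documentation (three 4-character integers followed by two 8-character reals) and treats a record with h = k = l = 0 as the end of the data; neither detail is stated on the slide, and the function name parse_hklf4 is made up for illustration.

```python
def parse_hklf4(lines):
    """Parse HKLF 4 records into (h, k, l, intensity, sigma) tuples.

    Assumes fixed-width columns: 3 x 4-character integers, then
    2 x 8-character reals (an assumption, see the text above).
    """
    reflections = []
    for line in lines:
        record = line.rstrip("\n")
        if not record.strip():
            continue  # skip blank lines
        h, k, l = (int(record[i:i + 4]) for i in (0, 4, 8))
        intensity = float(record[12:20])
        sigma = float(record[20:28])
        if h == 0 and k == 0 and l == 0:
            break  # assumed end-of-data marker
        reflections.append((h, k, l, intensity, sigma))
    return reflections


# Illustrative records (made-up numbers), written with the same column widths.
example = [
    "%4d%4d%4d%8.2f%8.2f" % (1, 0, 0, 123.45, 6.78),
    "%4d%4d%4d%8.2f%8.2f" % (-2, 3, 14, -1.20, 2.50),
    "%4d%4d%4d%8.2f%8.2f" % (0, 0, 0, 0.0, 0.0),
]
parsed = parse_hklf4(example)
```

Note that the second record keeps its negative intensity, which matters because shelxl refines against intensities.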
Input File Generation
With macromolecular data it is most convenient to start model building and refinement with phenix [5] or refmac5 [7]. Switch to shelxl once the model is relatively complete.
Creating the
hkl-file: apply mtz2hkl [3] to the mtz-file used as input to refmac5/phenix.
ins-file: use shelxpro to create the ins-file from the PDB-file. This automatically includes restraints for standard amino acids and nucleic acids and code for riding hydrogen positions (HFIX, initially commented out).
Shelxl Interaction with Coot
Model building with coot [6] now works very well with shelxl refinement (reading of name.res and name.fcf, writing of updated name new-round.ins).
Coot automatically generates the (σa-weighted) map and the difference map from the fcf-file.
Save updated coordinates to the .ins-file, but check that the occupancies of newly placed atoms (solvent, ions) are 11.0 and not 1.0 (this is a bug in coot version 0.6.1).
Coot: Displaying Hydrogen
Riding hydrogens are not moved upon refinement in coot. It is sufficient not to display them in coot (Edit -> Bond Parameters).
Hydrogens are handled by shelxl with the AFIX command which ignores the coordinates of the calculated atompositions in the .ins-file.
Fractional Coordinates
Unlike PDB-files shelxl stores atom coordinates in the ins-file as fractional coordinates.
Isotropic atom:
atom  type  x       y        z        occ    Uiso
O     4     0.4541  -0.0399  0.2690   11.00  0.18181

Anisotropic atom ('=' is the line continuation):
atom  type  x       y        z        occ
N     3     0.2722  -0.1317  -0.1280  11.00 =
    0.24975  0.13001  0.18948  -0.03210  -0.05152  0.00098
    U11      U22      U33      U23       U13       U12
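The anisotropic, symmetric ADP matrix (Uij) is used to calculate the structure factor

\[
F(hkl) = \sum_{\text{atoms } j \text{ in unit cell}} f_j(\theta_{hkl})\,
\exp\!\left[-2\pi^2
\begin{pmatrix} h a^* & k b^* & l c^* \end{pmatrix}
\begin{pmatrix} U_{11} & U_{12} & U_{13} \\ U_{12} & U_{22} & U_{23} \\ U_{13} & U_{23} & U_{33} \end{pmatrix}
\begin{pmatrix} h a^* \\ k b^* \\ l c^* \end{pmatrix}
\right]
e^{2\pi i (h x_j + k y_j + l z_j)}
\]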
• Atom names are arbitrary (up to 4 characters, digits and letters, except keywords)
• The scattering factor is derived from the atom type and the SFAC keyword:
SFAC  C  H  N  O  S
type  1  2  3  4  5
Elements named as in the periodic table have their scattering properties predefined in shelxl.
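The link between the atom type number and the SFAC card can be illustrated with a short Python sketch. It only mirrors the lookup described above (type n refers to the n-th element on the SFAC line); the function name is hypothetical and the sketch ignores everything else that may appear in a real .ins file.

```python
def element_for_atom(sfac_line, atom_line):
    """Return the element that an atom record refers to via its type number."""
    elements = sfac_line.split()[1:]        # drop the SFAC keyword itself
    type_number = int(atom_line.split()[1])  # second field: 1-based atom type
    return elements[type_number - 1]


sfac = "SFAC C H N O S"
oxygen = element_for_atom(sfac, "O 4 0.4541 -0.0399 0.2690 11.00 0.18181")
nitrogen = element_for_atom(sfac, "N 3 0.2722 -0.1317 -0.1280 11.00 =")
```

Because atom names are arbitrary, the name in the first field is never used for the lookup; only the type number counts.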
FVAR: Free variables
The use of fractional coordinates and U- instead of B-values allows for one of the major strengths of shelxl:
The use of free variables as restraints and constraints.
FVAR: The Concept
Free variables are enumerated by the FVAR card:
FVAR    0.07531  0.54646  0.56437  0.60583
number    "1"       2        3        4
(The first free variable is used as scaling factor between calculated and observed data.)
Numbers in atom descriptions (and in SUMP, CHIV, and DFIX) are interpreted as 10m + p, where m is an integer and −5 < p < 5.
m = 0: p is freely refined, e.g. coordinates (x, y, z).
m = 1: p is fixed and not refined at all. Usually used for the occupancy, occ = 11.00.
m > 1: p is refined as the mth number of the FVAR card (p = fvar(m)). This way groups of atoms can be refined together, e.g. the occupancy of ligand molecules.
m < −1: p is constrained to the value 1 − fvar(−m). E.g. the occupancy of a two-fold disorder is thus handled by a single parameter.
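The 10m + p encoding can be made concrete with a small Python sketch. It follows the rules listed above; the scaling by p for |m| > 1 (value = p × fvar(m), respectively p × (fvar(−m) − 1)) is taken from the SHELXL manual convention and reduces to the slide's fvar(m) and 1 − fvar(−m) for the usual codes such as 21.0 and −21.0. Function names are made up for illustration.

```python
import math


def split_code(value):
    """Split a number 10m + p into integer m and remainder p with -5 < p < 5."""
    m = math.floor(value / 10 + 0.5)
    p = value - 10 * m
    return m, p


def effective_value(value, fvar):
    """Value a coded parameter takes, given the list of free variables.

    fvar[0] is free variable number 1 (the overall scale factor).
    """
    m, p = split_code(value)
    if m == 0:
        return p  # refined freely, p is the starting value
    if m == 1:
        return p  # fixed at p
    if m > 1:
        return p * fvar[m - 1]  # tied to free variable m
    if m < -1:
        return p * (fvar[-m - 1] - 1)  # tied to 1 - free variable |m| for p = -1
    raise ValueError("m = -1 is not covered by the rules above")


fvar = [0.14412, 0.55387]
fixed_occ = effective_value(11.00, fvar)     # fixed full occupancy
ligand_occ = effective_value(21.0, fvar)     # occupancy tied to FVAR #2
complement = effective_value(-21.0, fvar)    # the other part of a two-fold disorder
```

With these numbers the two parts of a two-fold disorder always add up to 1, although only one parameter is refined.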
FVAR-Example: occupancy refinement
Simplest application: occupancy refinement of partially occupied ligand.
Partially occupied glycerol in a protein structure. Setting the occupancies of all atoms to 0.5 and refining them independently results in chemical nonsense:
O3 occ.: 0.68 C3 occ.: 0.31 C2 occ.: 0.42 ...
Solution: use FVAR #2.
Before refinement:
FVAR 0.14497 0.5
O3 occ.: 21.0 C3 occ.: 21.0 C2 occ.: 21.0 ...
After refinement:
FVAR 0.14412 0.55387
O3 occ.: 21.0 C3 occ.: 21.0 C2 occ.: 21.0 ...
All glycerol atoms now share the single refined occupancy 0.55387.
Alternative Conformations: PART + FVAR
Partially occupied ligands often lead to alternative conformations of side-chains: interaction with the ligand in those unit cells where it is present may result in a different orientation than in those unit cells where the ligand is missing.
Modeling disorder (3) (figure: Thomas R. Schneider)
RESI 233 TYR
..
PART 1 31.0
CB ..
PART 2 -31.0
CB ..
PART 0
..
RESI 123 THR
..
PART 1 31.0
CB ..
PART 2 -31.0
CB ..
PART 0
..
One parameter describes the occupancies of 22 atoms!
shelxl provides PARTs for mutually exclusive interactions: all atoms in PART n can make bonds to each other and to the atoms in PART 0, but not to other PARTs.
E.g. the occupancy of a two-fold disorder can be modelled by using FVAR N for one part and FVAR -N for the other one. The occupancy coded with -N is refined to 1 − (value of N).
SUMP: Modelling more than two-fold Disorder
• PARTs with three-fold or higher disorder: assign one free variable each.
• Restrain the sum of all fvar's with SUMP:
SUMP 1.0 0.01 1 19 1 20 1 21
1 × fvar(19) + 1 × fvar(20) + 1 × fvar(21) = 1.0 ± 0.01
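How such a SUMP card reads can be checked with a minimal Python sketch. It assumes the layout shown above (target value, standard deviation, then pairs of coefficient and free variable number) and simply evaluates the weighted sum; the function name and the occupancy values are made up for illustration, and this is not how shelxl applies the restraint internally.

```python
def sump_residual(sump_line, fvar):
    """Return (weighted sum - target, sigma) for a SUMP card.

    fvar maps free variable numbers (1-based, as on the FVAR card) to values.
    """
    fields = sump_line.split()
    target, sigma = float(fields[1]), float(fields[2])
    pairs = fields[3:]
    total = 0.0
    for i in range(0, len(pairs), 2):
        coefficient = float(pairs[i])
        number = int(pairs[i + 1])
        total += coefficient * fvar[number]
    return total - target, sigma


# Hypothetical occupancies of a three-fold disorder.
occupancies = {19: 0.5, 20: 0.3, 21: 0.2}
residual, sigma = sump_residual("SUMP 1.0 0.01 1 19 1 20 1 21", occupancies)
```

A residual well within ± sigma means the three occupancies satisfy the restraint.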
Anisotropic Refinement
• Can be carried out if the data:parameter ratio is sufficiently high (roughly at 1.5 Å resolution or better)
• Increases the number of parameters per atom from 4 (x, y, z, Uiso) to 9 (x, y, z and six Uij)
⇒ Should only be started at the end of refinement when the model is fairly complete
Transition from isotropic to anisotropic refinement: enclose the corresponding region in the .ins-file with
ANIS
.
.
.
ANIS 0
shelxl automatically and correctly sets up symmetry-related constraints of ADPs.
Anisotropic Restraints
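The chemical environment has an effect on the ADPs of bonded or neighbouring atoms.
Restraints on ADPs (figure: Thomas R. Schneider):
• DELU: rigid-bond restraint
• SIMU: similar ADPs for spatially adjacent atoms
• ISOR: approximately isotropic behaviour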
shelxle — A GUI for shelxl
shelxl now has a GUI, shelxle, developed by Christian Hübschle ([email protected]) in direct collaboration with George Sheldrick. The program has now reached a stable status and can be downloaded as β-version upon email request to Christian Hübschle.
A video of the program in action is available at http://ewald.ac.chemie.uni-goettingen.de/lehre/pm.html.
Macromolecular Phasing with shelx c/d/e
Possible Phasing Techniques with shelx c/d/e
The triad shelx c/d/e can be used in experimental phasing for
• S/MAD: single-/multi-wavelength anomalous dispersion
• SIR: single isomorphous replacement
• RIP: radiation damage induced phasing
• SIRAS: combination of SIR and SAD
• RIPAS: combination of RIP and SAD
Latest Improvements in shelxd
The latest β-version of shelxd, shelxd mp, is available upon email request to [email protected]
shelxd mp is a parallelised version using OpenMP.
• Parallel version shelxd mp: approximately 29 times faster on a 32 CPU machine than the single CPU version.
• shelxd mp runs faster even on 1 CPU compared to the previous version (due to an improved calculation of the Patterson Minimum Function PMSF)
• Criterion of “best” solution: CFOM = CC + CCweak
Automated Model Building in shelxe
shelxd often finds a correct solution to the substructure, even at low (anomalous) resolution (5-8 Å).
Still, the data are sometimes not good enough for density modification programs to produce an interpretable map.
The option -a lets the beta-version of shelxe try to build a poly-ALA model into the electron density, iterating between model building and density modification.
By this method the performance of shelxe has been pushed to produce interpretable electron density maps in cases which were hopeless before.
Molecular Replacement Boosts
The flexibility of the shelx programs has often allowed them to be “abused” for applications they were initially not intended for, e.g. macromolecular phasing with the direct methods program shelxd.
Density modification can work with very poor phase information. This recently led to using shelxe as arbiter for extremely low quality molecular replacement solutions, e.g. in cases where only low homology starting models are available.
It is as easy as
shelxe mymodel.pda -a30 -s0.6 -q -y2.0
This creates a poly-Alanine trace (mymodel.pdb) for the data stored in mymodel.hkl (e.g. extracted from the phaser input mtz-file with mtz2hkl [3]).
Because of the way shelxe works, this procedure also removes model bias from the resulting electron density map.
This works best with data to 2.0 Å resolution but should also work at 2-3 (’ish) Å.
Phaser & Shelxe
Data courtesy A. Thorn.
• Small search fragments used with phaser for a 1.7 Å data set.
• Phaser TFZ-score and shelxe-CC do not correlate.
• TFZ > 8 is not a reliable criterion with this method.
• Correct solutions are marked by CC > 25 %.
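The selection rule from this slide can be written down as a short Python sketch. The solution names and scores below are made-up illustrative values, not data from the talk, and the function name is hypothetical; only the threshold CC > 25 % comes from the slide.

```python
CC_THRESHOLD = 25.0  # percent, as quoted on the slide


def likely_correct(solutions, threshold=CC_THRESHOLD):
    """Keep the names of solutions whose shelxe CC exceeds the threshold.

    Each solution is a (name, phaser_tfz, shelxe_cc) tuple; the TFZ score
    is deliberately ignored because it does not correlate with the CC here.
    """
    return [name for name, _tfz, cc in solutions if cc > threshold]


# Hypothetical molecular replacement solutions: (name, TFZ, CC in %).
solutions = [
    ("frag_a", 9.1, 12.4),   # high TFZ, but low CC
    ("frag_b", 5.3, 31.8),   # low TFZ, but CC above threshold
    ("frag_c", 8.4, 27.0),
]
kept = likely_correct(solutions)
```

In this made-up example a filter on TFZ > 8 would have kept frag_a and rejected frag_b, which is the failure mode the slide warns about.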
Arcimboldo
The computer program Arcimboldo [8] pushes this to extremes:
1. Phaser: search with a small helical motif
2. Write out many solutions
3. Let shelxe expand the solutions
4. Keep those with CC > 25 %
Since this works without an input model and without experimental phase information, this method can be regarded as an ab initio method for macromolecules at 2 Å resolution!
shelxe — Getting the best out of your data
Left figure: electron density from the initial helix fragment (7 out of 145 residues) after 20 cycles of density modification with shelxe.
Right figure: the same region after density modification combined with poly-ALA model building.
Availability
The Shelx programs are available free of charge for academic users and for 2499 USD for for-profit users.
The application form is available from shelx.uni-ac.gwdg.de/SHELX.
The latest version is SHELX-97 which was released in 1997. The next major release is scheduled for 2012.
β-test versions of some of the programs (currently notably shelxe with auto-tracing of proteins and the multi-processor version of shelxd) are available to Shelx users upon email request to
George Sheldrick ([email protected])
Tim Grüne ([email protected])
References
1. G. M. Sheldrick, A short history of SHELX, Acta Crystallogr. (2008), A64
2. G. M. Sheldrick, Experimental phasing with SHELXC/D/E: combining chain tracing with density modification, Acta Crystallogr. (2010), D66
3. T. Grüne, mtz2sca and mtz2hkl: facilitated transition from CCP4 to the SHELX program suite, J. Appl. Cryst. (2008), 41(1)
4. C. Hübschle, University of Göttingen
5. P. D. Adams et al., PHENIX: a comprehensive Python-based system for macromolecular structure solution, Acta Crystallogr. (2010), D66, 213–221
6. P. Emsley et al., Features and Development of Coot, Acta Crystallogr. (2010), D66
7. The CCP4 Suite: Programs for Protein Crystallography, Acta Crystallogr. D50, 760–763
8. D. D. Rodríguez et al., Crystallographic ab initio protein structure solution below atomic resolution, Nature Methods (2009), 6(9); http://chango.ibmb.csic.es/ARCIMBOLDO