8
Using Sculptor and Situs for simultaneous assembly of atomic components into low-resolution shapes Stefan Birmanns , Mirabela Rusu, Willy Wriggers 1 University of Texas School of Biomedical Informatics at Houston, 7000 Fannin St. UCT 600, Houston, TX 77030, USA article info Article history: Available online 13 November 2010 Keywords: Multi-resolution modeling Multi-component registration Macromolecular assembly Cryo-electron microscopy Immersive visualization Haptic rendering abstract We describe an integrated software system called Sculptor that combines visualization capabilities with molecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe- cial purpose visualization techniques that are based on modern GPU programming and are capable of representing complex molecular assemblies in real-time. The integration of graphics and modeling offers several advantages. The user interface not only eases the usually steep learning curve of pure algorithmic techniques, but it also permits instant analysis and post-processing of results, as well as the integration of results from external software. Here, we implemented an interactive peak-selection strategy that enables the user to explore a preliminary score landscape generated by the colores tool of Situs. The interactive placement of components, one at a time, is advantageous for low-resolution or ambiguously shaped maps, which are sometimes difficult to interpret by the fully automatic peak selection of colores. For the subsequent refinement of the preliminary models resulting from both interactive and automatic peak selection, we have implemented a novel simultaneous multi-body docking in Sculptor and Situs that softly enforces shape complementarities between components using the normalization of the cross-correlation coefficient. The proposed techniques are freely available in Situs version 2.6 and Sculptor version 2.0. Published by Elsevier Inc. 1. Introduction Modeling macromolecular complexes at atomic-level of detail is an important step toward understanding their functional proper- ties. Although X-ray crystallography and NMR can provide such high-resolution structures, alternative biophysical techniques also provide a wealth of intermediate- to low-resolution data. The different experimental and environmental conditions of a cryo- electron microscopy (EM) single particle study (Frank, 2002), or of an electron tomography reconstruction (Medalia et al., 2002), often allow the collection of complementary information. In addi- tion, these low-resolution techniques are often the only way to visualize large molecular complexes in their assembled state. Over the last few years, a variety of algorithms has been devel- oped for the automatic interpretation of volumetric data, which can be categorized by the scoring function that they aim to opti- mize. One set of techniques employs the cross-correlation coeffi- cient (CC; Volkmann and Hanein, 1999; Roseman, 2000; Chacón and Wriggers, 2002) as the scoring function. These methods can determine the position of a probe molecule inside a volumetric target map in a fully automatic way, by either determining the placement of a single unit at a time (Rath et al., 2003; Garzón et al., 2007), or multiple models simultaneously (Kawabata, 2008; Lasker et al., 2009; Rusu and Birmanns, 2010). The docking accuracy of these techniques depends on the shape of the fitted components and on the resolution of the map (Chacón and Wriggers, 2002). Spurious placements can be encountered for low-resolution cryo-EM reconstructions where interior density features such as secondary structure elements remain obscured. A Laplacian filter (Chacón and Wriggers, 2002) could improve the performance of contour-based matches, but the resolution re- mained a limiting factor. Other limitations that may complicate the docking include resolution anisotropy from preferred particle orientations, stain artifacts in negative stain EM, and conforma- tional heterogeneity eroding features of the reconstruction. In addition, one might have to rely on distant homologues for the ba- sis of the atomic models instead of crystal or NMR structures. The aim of the present report is to extend the performance of quantitative multi-resolution docking methods for challenging systems where the fitting is ambiguous. We propose an integrated strategy that combines fully automatic algorithms with an interac- tive peak selection method. Instead of depending exclusively on automated optimization, our method enables visual exploration of a preliminary scoring landscape on a discrete lattice, one compo- nent at a time. The user selects local maxima of the docking score based on her expert knowledge of the spatial organization of the 1047-8477/$ - see front matter Published by Elsevier Inc. doi:10.1016/j.jsb.2010.11.002 Corresponding author. Fax: +1 713 500 3907. E-mail address: [email protected] (S. Birmanns). 1 Present address: D.E. Shaw Research, 120 W. 45th St., New York, NY 10036, USA. Journal of Structural Biology 173 (2011) 428–435 Contents lists available at ScienceDirect Journal of Structural Biology journal homepage: www.elsevier.com/locate/yjsbi

Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

Journal of Structural Biology 173 (2011) 428–435

Contents lists available at ScienceDirect

Journal of Structural Biology

journal homepage: www.elsevier .com/ locate/y jsbi

Using Sculptor and Situs for simultaneous assembly of atomic componentsinto low-resolution shapes

Stefan Birmanns ⇑, Mirabela Rusu, Willy Wriggers 1

University of Texas School of Biomedical Informatics at Houston, 7000 Fannin St. UCT 600, Houston, TX 77030, USA

a r t i c l e i n f o

Article history:Available online 13 November 2010

Keywords:Multi-resolution modelingMulti-component registrationMacromolecular assemblyCryo-electron microscopyImmersive visualizationHaptic rendering

1047-8477/$ - see front matter Published by Elsevierdoi:10.1016/j.jsb.2010.11.002

⇑ Corresponding author. Fax: +1 713 500 3907.E-mail address: [email protected] (S.

1 Present address: D.E. Shaw Research, 120 W. 45th

a b s t r a c t

We describe an integrated software system called Sculptor that combines visualization capabilities withmolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are based on modern GPU programming and are capable ofrepresenting complex molecular assemblies in real-time. The integration of graphics and modeling offersseveral advantages. The user interface not only eases the usually steep learning curve of pure algorithmictechniques, but it also permits instant analysis and post-processing of results, as well as the integration ofresults from external software. Here, we implemented an interactive peak-selection strategy that enablesthe user to explore a preliminary score landscape generated by the colores tool of Situs. The interactiveplacement of components, one at a time, is advantageous for low-resolution or ambiguously shapedmaps, which are sometimes difficult to interpret by the fully automatic peak selection of colores. Forthe subsequent refinement of the preliminary models resulting from both interactive and automatic peakselection, we have implemented a novel simultaneous multi-body docking in Sculptor and Situs that softlyenforces shape complementarities between components using the normalization of the cross-correlationcoefficient. The proposed techniques are freely available in Situs version 2.6 and Sculptor version 2.0.

Published by Elsevier Inc.

1. Introduction

Modeling macromolecular complexes at atomic-level of detail isan important step toward understanding their functional proper-ties. Although X-ray crystallography and NMR can provide suchhigh-resolution structures, alternative biophysical techniques alsoprovide a wealth of intermediate- to low-resolution data. Thedifferent experimental and environmental conditions of a cryo-electron microscopy (EM) single particle study (Frank, 2002), orof an electron tomography reconstruction (Medalia et al., 2002),often allow the collection of complementary information. In addi-tion, these low-resolution techniques are often the only way tovisualize large molecular complexes in their assembled state.

Over the last few years, a variety of algorithms has been devel-oped for the automatic interpretation of volumetric data, whichcan be categorized by the scoring function that they aim to opti-mize. One set of techniques employs the cross-correlation coeffi-cient (CC; Volkmann and Hanein, 1999; Roseman, 2000; Chacónand Wriggers, 2002) as the scoring function. These methods candetermine the position of a probe molecule inside a volumetrictarget map in a fully automatic way, by either determining the

Inc.

Birmanns).St., New York, NY 10036, USA.

placement of a single unit at a time (Rath et al., 2003; Garzónet al., 2007), or multiple models simultaneously (Kawabata,2008; Lasker et al., 2009; Rusu and Birmanns, 2010).

The docking accuracy of these techniques depends on the shapeof the fitted components and on the resolution of the map (Chacónand Wriggers, 2002). Spurious placements can be encountered forlow-resolution cryo-EM reconstructions where interior densityfeatures such as secondary structure elements remain obscured.A Laplacian filter (Chacón and Wriggers, 2002) could improve theperformance of contour-based matches, but the resolution re-mained a limiting factor. Other limitations that may complicatethe docking include resolution anisotropy from preferred particleorientations, stain artifacts in negative stain EM, and conforma-tional heterogeneity eroding features of the reconstruction. Inaddition, one might have to rely on distant homologues for the ba-sis of the atomic models instead of crystal or NMR structures.

The aim of the present report is to extend the performance ofquantitative multi-resolution docking methods for challengingsystems where the fitting is ambiguous. We propose an integratedstrategy that combines fully automatic algorithms with an interac-tive peak selection method. Instead of depending exclusively onautomated optimization, our method enables visual explorationof a preliminary scoring landscape on a discrete lattice, one compo-nent at a time. The user selects local maxima of the docking scorebased on her expert knowledge of the spatial organization of the

Page 2: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435 429

atomic models. The preliminary results of the interactive (orautomatic) peak search are refined further with a novel off-latticerefinement in which the placement of the component structuresoptimized simultaneously. Fitting a group of independent compo-nents at the same time introduces additional steric constraints,which can increase the docking accuracy, especially in the case ofambiguous shapes or low-resolution data. Fig. 1 shows the differ-ent stages of the workflow supported by the software system.

In addition to the above-mentioned algorithmic techniques, themodeling workflow typically requires an exploration and visualiza-tion of the data sets. Therefore, in Section 2, we will first present abrief overview of the Sculptor software and its main features.Subsequently, we will demonstrate the proposed methods usingsimulated and experimental maps for which atomic structuresexist, such as the 14 Å resolution cryo-EM map of the actomyosincomplex (Holmes et al., 2003) and a GroEL reconstruction at11.5 Å resolution (Ludtke et al., 2001). In Section 3 we will describethe interactive peak selection and construction of preliminary startmodels for refinement. The results of the peak search are thenoptimized by the simultaneous multi-body refinement step, asreported in Section 4. We will also compare the results with tradi-tional single-body refinement.

2. Interactive modeling with Sculptor

Sculptor is a graphical user interface (GUI)-based multi-scalemodeling program, which combines various visualization tech-niques with pattern matching and feature extraction algorithms.The concept originated from the observation that several algo-rithms of the Situs modeling package (Wriggers, 2010) were timeefficient enough to be executed in a GUI-based environment,where the outcome can be explored and parameters can be chan-ged immediately. This concept was initially implemented in theform of a software SenSitus (Birmanns and Wriggers, 2003), whichfocused on interactive docking via haptic rendering in a virtualreality environment. Over time, the software transformed into ageneral purpose visualization and multi-scale modeling tool calledSculptor. In this new role, the program was used increasingly forpre- and post-processing of data sets and to analyze the outcomeof non-interactive Situs algorithms. In addition, Sculptor served asa test bed for very efficient docking algorithms, which would ben-efit directly from the interactivity possible in the GUI environment.

Fig. 1. Overview of the proposed ‘‘Interactive Global Docking’’ workflow. The multi-reso2010): the program colores exports the preliminary score landscape (translation functiodetermined by an exhaustive search of three Euler angles with user-selected rotational gboxes). In Sculptor, the interactive peak search based on the preliminary scores yields a pnew multi-body refinement procedure.

Sculptor is organized around a graphical user interface (Fig. 2).The software aims to reduce visual ‘‘clutter’’ by using a single-window approach, where the main user interface elements arearranged around a central 3D graphics area, using non-overlappingframes. Internally, the program forms only a thin layer on top of ageneral-purpose C++ class library, which implements the algo-rithms and visualization techniques (Fig. 3). The modular approachshown in Fig. 3 enables the testing of algorithms isolated from themain application program and also permits us to encapsulate theoperating system-dependent routines in a single area of the code.Over time, this strategy has also proven to be very effective forintroducing new lab members to the project, and it helps to keepthe source maintainable.

Visualization of atomic models is essential for the interpretationof the 3D shape of molecular systems (Humphrey et al., 1996;Delano, 2002; Goddard and Ferrin, 2007). The use of Sculptor formulti-resolution modeling results in unique challenges for thedepiction of biomolecular assemblies. In multi-resolution modeling,one routinely combines several atomic models thereby constructinglarge molecular assemblies. The size of such large systems requiresadditional efforts to keep the visualization updated in real computetime. We have implemented several innovative rendering tech-niques that off-load the creation of the geometry to programmablegraphics card processors (GPUs). Using GPU acceleration, Sculptor isable to display molecular systems with hundreds of thousands ofatoms in various rendering styles.

Besides the visualization of atomic models, an efficient depic-tion of volumetric maps is important for exploring the shape ofand the visual agreement between the multi-resolution data sets.Therefore, Sculptor offers a variety of volume rendering strategies.The maps can be viewed using the traditional iso-surface tech-nique, which relies on the marching cube algorithm (Lorensenand Cline, 1987) to convert the volume into a triangular mesh. Thiswell-established technique is very efficient and allows an interac-tive rendering of small to medium-sized data sets, but it cannotdirectly display different intensity values simultaneously.

As an alternative, Sculptor also supports a direct volume render-ing technique, which enables the user to assign material propertiesfreely to the intensities. This approach is based on the pre-integratedvolume rendering idea (Engel et al., 2001), but was implemented inSculptor using the features of modern programmable graphicscards. Direct volume rendering is very popular in other scientificdisciplines, for example, medical visualization, and offers more

lution data is first analyzed using an exhaustive search approach in Situs (Wriggers,n) as a 3D map of maximum CC values. The rotations corresponding to each voxel,ranularity, are exported as a 3D map of indices into a rotation table (green and redreliminary model of the entire complex, which is then further optimized using the

Page 3: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

Fig. 2. Screenshot of the graphical user interface of Sculptor. The software is organized around the main 3D rendering panel, and minor panels are grouped into non-overlapping frames. The upper section of the left panel shows a list of the loaded data files along with their type, and their visibility and mobility status (the lower half of thepanel changes its content based on the selected document). In the screenshot, the volumetric map is selected and the direct volume rendering tab is shown. The user canfreely define the transfer function (mapping of material properties to intensities) based on the histogram of the map. The bottom right section shows the log panel, where theprogram communicates the status and outcome of commands and algorithms.

430 S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435

flexibility than surface-based representations. Every intensity level ofthe volumetric data is rendered using specific (and user-controllable)material settings, such as varying colors and transparency levels.This permits a simultaneous rendering of different intensities thatare distinguishable by the user. Thus smaller, almost disconnectedregions can be shown as part of the main body of the molecule(Fig. 4). The technique can also help to visualize map intensitiesthat differ between parts of a map (see article by Wriggers et al.,in this issue) without requiring a matching of densities.

Finally, Sculptor also offers a two-dimensional visualization ofmap cross-sections that can be added to the main scene, and theGUI-based ‘‘Map-Explorer’’ permits direct access to and queryingof individual voxels. Although not relevant to the scope of thispaper, the software is able to render deformable models in thecontext of flexible fitting of molecular models (Rusu et al., 2008).Sculptor reads and writes volumetric files in the Situs formatand can import and export feature vectors for the interactivemanipulation of a feature vector ‘‘skeleton’’ used in flexible fitting(Wriggers et al., 2004). The feature vectors can also be used for acoarse-grained modeling of structural data sets and for an interac-tive ‘‘point cloud matching’’ (Birmanns and Wriggers, 2007).

As described above, Sculptor was initially designed as a GUI-based extension of the Situs modeling package, and it is thereforeconnected in several ways with the Situs command-line tools. Inthe following section, we present a specifically useful link between

the two suites, where intermediate results of the colores programare explored in Sculptor, and the peaks of a scoring landscape areidentified for further automatic refinement.

3. Interactive peak search, one component at a time

The CC is an established scoring function for multi-resolutionmodeling and has been used for a decade in various softwaretools (Volkmann and Hanein, 1999; Roseman, 2000; Chacón andWriggers, 2002). The maximization of the CC in rigid-body dockingcan be separated into three translational and three rotationaldegrees of freedom. The translational search can then be greatlyaccelerated by exploiting the Fourier convolution theorem. Themain idea is to scan all rotations explicitly with a user-definablerotational granularity while rapidly computing the 3D translationfunction on the discrete lattice of the EM map using the FastFourier Transform (Wriggers and Chacón, 2001). As part of theoverall docking, the colores program (Chacón and Wriggers,2002) generates an intermediate 3D ‘‘map’’ of CC scores and cor-responding angles by saving the maximum CC and correspondingangles for each voxel. This preliminary score ‘‘landscape’’ is suffi-ciently well sampled to enable the identification of peaks.

This component-wise peak-search process would be straight-forward if it could be based on the score values alone. But in

Page 4: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

Fig. 3. Software architecture of Sculptor. The main application program comprises only a thin software layer on top of a C++ class library, which is subdivided into modules.Low-level modules like ‘‘System’’, which connects the library with the different operating systems, form the basis for higher-level modules. The ‘‘Library for Input devices inVirtual Environments’’ (LIVE) provides routines to dynamically load device drivers for special input devices like the SensAble 6DOF or Novint Falcon force feedbackcontrollers. ‘‘Core’’ implements a general purpose 3D scenegraph architecture, which is extended by high-level modules that provide nodes to visualize specific data types.Sculptor uses the visualization nodes for atomic models and volumetric maps, although additional high-level modules exist for other applications.

Fig. 4. Direct volume rendering permits the visualization of soft boundaries involumetric maps. Shown here is an interactive visualization of the DegP protease–chaperone in Sculptor (EMDB 1504).

S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435 431

practice other criteria have to be optimized simultaneously, forexample, the proximity of other docking solutions. (In the auto-mated peak detection of colores, a relatively small user-definednumber of candidate peaks are explored, for which both score val-ues and a simple proximity criterion – based on map resolution –must be satisfied). The following sections describe an interactivepeak search technique implemented in Sculptor, and provide exam-ples of its application in practical situations.

3.1. Technical implementation of the interactive peak search

The purpose of the interactive peak search in Sculptor is to em-power the user to freely explore an unrestricted number of possi-ble peak positions in translational space (as an alternative to theautomated peak filter used by colores). Components are fitted oneat a time. The exploration of the peaks (by mapping rigid-body de-grees of freedom onto fitted structures) is efficient for on-line use,unlike the exhaustive search that requires a slower, off-line calcu-lation with colores. The interactive peak search comprises one stepin the overall ‘‘Interactive Global Docking’’ workflow (Fig. 1) thatuses exhaustive search data in an interactive fashion, combiningit with steric information generated on-the-fly while the userexamines the solution space.

As mentioned above, a cross-correlation based exhaustivesearch cannot be carried out in real-time because it takes severalminutes to hours to complete, depending on the search granularityand map size. Therefore, the scoring data is generated in advanceby the colores program of the software package Situs. To enablethe interactive exploration of peaks a new option ‘‘-sculptor’’ wasintroduced in colores in Situs version 2.5 that writes intermediatescoring data to Sculptor-readable files. Specifically, the programexports (i) the target map in compatible indexing, (ii) a preliminaryscore ‘‘master file’’ as a 3D map of maximum CC values, and (iii) therotations corresponding to each voxel as a 3D map of indices into arotation table. Sculptor then reads the relevant files and allows theuser to interactively explore fits, with the help of both visual andhaptic (force) feedback from the software. Once a good candidatesolution is identified for one of the subunits, steric interaction data

Page 5: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

Fig. 5. Different stages of interactive peak search shown as direct screen captures from Sculptor: docking of subunits into the 14 Å cryo-EM map of the actomyosin complex(Holmes et al., 2003). (A) and (B) The user translates the G-actin monomer as desired inside the map, while the optimal rotation at each position is updated in real time. Thecolor of the central sphere and the numeric value reflect the (normalized) CC value. (C)–(E) Two G-actin monomers are docked (yellow ribbons) and fixed, while a thirdG-actin monomer is translated inside the map. (E) and (F) Contacts are shown between the mobile G-actin monomer and the fixed monomers (favorable contacts aredisplayed as green spheres and steric clashes in red). (F) The myosin S1 fragment docked inside the map of the complex.

432 S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435

is generated and displayed on the fly, warning the user of potentialclashes and highlighting good shape complementarity (Heyd andBirmanns, 2008). If a haptic device is available, the CC values andsteric interactions can be interpreted as a potential function fromwhich to compute forces that guide the user (Heyd and Birmanns,2009).

However, the docking solutions generated by the peak searchare only approximate. The preliminary solutions are biased bythe particular order of the placement. They are also subject tothe translational and rotational search granularity. They will be re-fined further via automated local optimization (Section 4).

3.2. Practical applications of the interactive peak search

We demonstrate the interactive peak search procedure using aspecific experimental example, the actomyosin complex (Holmeset al., 2003). The 14 Å resolution map comprises (filamentous)F-actin decorated with myosin fragment S1. To generate a modelof the filament, we used two atomic structures: a G-actin monomer(Holmes et al., 2003) and a flexed model of myosin S1, as describedby Wriggers (2010). The docking was compared with two referencestructures: the F-actin model (Holmes et al., 2003) alone or in com-plex with the flexed myosin S1. Our test was realistic in the sensethat the subunits did not have perfectly fitting interfaces: G-actinand myosin S1 subunits used for the docking were all determinedin isolation and not in the complex visualized by the cryo-EM map.Although the complex has never been solved at atomic resolution,the reference structures were deemed sufficiently reliable for this

test because the F-actin model was independently refined againstX-ray fiber diffraction data (Holmes et al., 2003), and the myosinS1 was fitted to an individual S1 fragment after subtracting theF-actin (Wriggers, 2010).

Within Sculptor, the user can move the probe structure usingthe mouse or a supported haptic (force feedback) device. As theprobe structure is moved around, it automatically rotates intothe most favorable orientation (Fig. 5A and 5B show the G-actinmonomer as the scoring landscape is explored). In addition, the(globally normalized) score at the present position is displayedgraphically through color changes of the central sphere, as wellas numerically. A suitable candidate position for the first monomercan thus be located, on the basis of both the global docking scoreand the user’s knowledge of the system. If a haptic device is avail-able, forces will guide the user to promising docking locations. Thechosen location is saved as a solution (Fig. 5C–E shows two suchsolutions in yellow ribbon representations), and the probe ismoved to search for the next candidate position. At this stage,the additional steric information generated by Sculptor is dis-played. Both shape complementarity (Fig. 5D) and steric clashes(Fig. 5E) can be visualized.

The colores tool was employed for the preliminary exhaustivesearch fitting of the two atomic components to the 14 Å-resolutionmap. The rotational granularity was 9� and the translational gran-ularity (voxel spacing) was 2.75 Å. The preliminary model gener-ated by the interactive peak search shown in Fig. 5 exhibited aroot mean square deviation (RMSD) of 5.1 Å from F-actin aloneand a RMSD of 3.8 Å from the complete actomyosin complex

Page 6: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

Table 1Comparison of various docking approaches for the actomyosin complex and thechaperonin GroEL (subunits include 12 G-actin monomers/12 myosin S1 for theactomyosin complex, and 14 monomers for GroEL). Root mean square deviation(RMSD) from the references and cross-correlation coefficients (CC) are shown. The CCvalues of actomyosin correspond to the map shown in Fig. 6 and are systematicallylower than those of the GroEL due to filament end effects. The interactive peak searchmodel is equivalent to colores before Powell optimization (Chacón and Wriggers,2002). In the single-body refinement each fragment is fitted independently (equiv-alent to colores after Powell optimization), while in the multi-body refinement allfragments were simultaneously optimized.

Reference model Actomyosin complex GroEL (emd-1080F-actin/actomyosin PDB 1XCK)

RMSD (Å) CC RMSD (Å) CC

Reference model 0.576/0.703 0.946Interactive peak search 5.1/3.8 0.537/0.672 4.3 0.881Single-body refinement 2.6/2.1 0.569/0.698 1.7 0.945Multi-body refinement 1.8/1.4 0.583/0.709 1.3 0.950

S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435 433

(Table 1). This preliminary model (still subject to search granular-ity) is sufficiently close to the reference structure to allow therefinement procedures of Section 4 to converge.

Similar results were obtained for a 11.5 Å-resolution recon-struction of GroEL (EMDB-1080, Ranson et al. (2001), Tagari et al.(2002)), where multiple copies of one monomer, extracted from

Fig. 6. The actomyosin complex: Multi-body refinement of 12 G-a

the crystal structure of the wild type GroEL (PDB 1XCK, Bermanet al. (2002)), were docked into the map of the 14-mer. The preli-minary model obtained after interactive peak search exhibited anRMSD of 4.3 Å from the reference (Table 1).

4. Multi-body refinement

After exhaustive search and interactive or automatic peaksearch, an off-lattice refinement is required to optimize the posi-tion and the orientation of the preliminary scoring peaks. This istypically achieved with a local optimization that explores the off-lattice scoring landscape in the vicinity of an on-lattice peak. Forexample, the ‘‘Fit In Map’’ tool in Chimera (Pettersen et al., 2004)employs a steepest ascent local optimization. The programs previ-ously developed by us, such as colores of Situs (Wriggers, 2010) andthe ‘‘Refinement’’ tool in Sculptor (Birmanns and Wriggers, 2007),employ conjugate gradient ascent optimization (Powell, 1964).Similar to the gradient ascent, the Powell method uses the gradientinformation but employs a directional search to accelerate theconvergence.

The traditional local refinement approaches have one factor incommon: they optimize only the six rigid-body degrees of freedomof individual fragments, one at a time. However, a typical dockingscenario often involves a larger number N > 1 components that

ctin monomers/12 myosin S1 inside the 14 Å-resolution map.

Page 7: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

434 S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435

sterically interact with each other. Here, we introduce a novel mul-ti-body refinement procedure that applies the Powell conjugategradient optimization simultaneously to all components, to maxi-mize the CC as a function of 6N rigid-body degrees of freedom. Thetechnique was implemented in Sculptor and in the collage tool of Si-tus. Starting from a good preliminary configuration, such as thatprovided by the above exhaustive search and interactive peaksearch, it is reasonable to expect that the algorithm quickly con-verges to the nearest maximum, despite the higher dimensionalityof the search space compared to the traditional single-bodyrefinement.

To demonstrate the performance of multi-body refinement, wecompare it against the performance of single-body refinement forthe actomyosin and GroEL test cases in Table 1. The traditional sin-gle-body optimization, in which each fragment is individually re-fined, improved the RMSD of 3.8 Å in the actomyosin model to2.1 Å in 41 min.2 Likewise, the GroEL model was improved from4.3 Å to 1.7 Å RMSD in 6 min 43s2. Both actomyosin and GroELmodels benefited from the new multi-body refinement as theirRMSD was reduced further to 1.4 Å or less. The 24-units of theactomyosin complex were refined in 2 h 20 min while the 14 sub-units of GroEL only required 16 min 17 s for the optimization2. Thefinal actomyosin model is shown in Fig. 6. The CC values of the fitsin Table 1 generally mirror the RMSD results. In other words, lowerRMSD corresponds to higher CC scores, with the interesting excep-tion of the reference structures themselves. (The small remainingRMSD after multi-body refinement thus appears to be caused byover fitting of the CC – we do not necessarily expect the referencestructures to exhibit the highest CC because of their differentbiophysical origins from those of the cryo-EM maps).

Our validations on experimental maps show that the additionalspatial constraints from multi-body refinement improve the fit ofsubunits in the map without requiring any special filters or masks.We expected that such a refinement would be particularly benefi-cial for low-resolution maps where all densities can be accountedfor, such as in symmetric oligomers. To test this hypothesis, we cre-ated simulated low-resolution maps (Belnap et al., 1999) from var-ious known oligomeric structures (PDBs: 1NIC – trimer (Admanet al., 1995), 1QQW – pentamer (Ko et al., 2000), 1XMV – hexamer(Xing and Bell, 2004), 1TYQ – heptamer (Nolen et al., 2004)). Thedeviation of monomers from the oligomer crystal structure wascomputed as a function of resolution of the simulated maps (datanot shown). For single-body refinement with standard CC, the mod-els maintained the correct architecture down to 9–15 Å resolution,below which the monomers drifted toward the center of the mapwhere the higher densities generate a high correlation. This break-down of the traditional CC criterion due to blurring of interior mapdetail is well known (Chacón and Wriggers, 2002) and hasprompted the development of filters and masks for single-bodyrefinement in the past (Wriggers and Chacón, 2001). In contrast,in multi-body refinement, the models maintained the correct archi-tecture (sub-Ångstrom accuracy placement in the simulated maps),even at resolutions as low as 50 Å, because the fragments hadnowhere to go. Although this test used idealized data (perfectlysimulated EM maps, and perfectly complementary subunits), theresults do validate the multi-body refinement approach further.

5. Discussion and conclusions

We have described a novel, combined software system for theinteractive exploration, interpretation, and analysis of multi-reso-lution data. Building on the classic Situs program package, Sculptorcombines intermediate results of the exhaustive, quantitative

2 Runtime measured on a Intel 2 Duo processor E8400 @3.0 GHz.

docking tool colores with a direct visual interpretation of the re-sults. The interactive peak search and the new simultaneous mul-ti-body optimization of the results guarantee a high level offlexibility and enable expert participation in the modeling, whilestill ensuring that the output is based on reproducible quantitativescoring functions.

The main difference between the proposed workflow (Fig. 1)and traditional docking with colores is the replacement of the auto-mated peak search with an interactive one and the added possibil-ity of performing multi-body refinement.

The experimental examples, actomyosin and GroEL, demon-strate that it is straightforward to select local optima of the CCscore with the interactive peak search method in Sculptor. Theinteractive approach is beneficial in situations where informationabout the approximate position of a subunit is available from bio-physics, i.e., where the correct candidate solution is obvious to anexpert user. We expect that the interactive approach will workwell in challenging situations where many degenerate solutionsexist (such as in symmetric structures or multi-body docking atlow resolution), or when correct fits are obscured by spurious max-ima of the scoring function.

We proposed a novel multi-body refinement technique thataims to prevent steric clashes and guards against individual frag-ments wandering off into high densities that are already occupiedby a different fragment. The spatial constraints, implicitly intro-duced by the normalization of the CC during simultaneous refine-ment of all components, help to identify better quality models. Ourvalidations on both simulated and experimental test cases showedthat multi-body refinement is more beneficial than single-bodyoptimization. Based on the tests with simulated maps, we expectthis benefit to become prominent especially at very low resolution.The test results suggest that a standard CC, without filters andmasks, can be used as a scoring function for multi-body refinementat low-resolution if densities are well accounted for by the fittedatomic structures.

Although the runtimes are significantly higher in the case of themulti-body refinement, they are still in an acceptable range fromminutes up to a few hours for large systems with a high numberof subunits. Complementing the standalone collage tool of Situs,the progress can be monitored visually in Sculptor if desired. Theinspection afforded by Sculptor allows the user to stop the processanytime, to fine tune parameters or to test if a sufficient level ofconvergence has been reached.

While the test results have been very promising, they do notrule out possible limitations of the proposed methods. First, it ispossible that the 6N-dimensional search exhibits a smaller zoneof convergence than the N separate 6D searches (e.g. due to thehigher dimensionality of the search space and dependencies be-tween fragments). The size of the zone of convergence is not crit-ical for off-lattice peak refinement (which starts close to themaximum) but it might pose a problem for sub-optimal modelsthat are farther from the peak maxima. Second, some of the pre-sented tests may have been overly idealized. In the GroEL and sim-ulated map tests the subunits (extracted from the crystal structureof the complex) exhibit perfect shape complementarity. It is likelythat in realistic docking problems, the atomic structure of the com-plex is unknown and that fragments assemble by induced fit. Flex-ible loops or secondary structure elements at the fragmentinterface might lower the score of the correct solution due to inad-vertent steric clashes, requiring a flexible protein–protein docking(Noy and Goldblum, 2010). These scenarios and possible limita-tions depend heavily on the experimental system and will be ex-plored in future work.

A possible extension of the current approach, which we planto investigate in the future, is the integration of symmetry in themulti-body refinement procedure. It is expected that refining

Page 8: Journal of Structural Biologymolecular modeling algorithms for the analysis of multi-scale data sets. Sculptor features extensive spe-cial purpose visualization techniques that are

S. Birmanns et al. / Journal of Structural Biology 173 (2011) 428–435 435

symmetrical copies simultaneously is beneficial as it reduces thedegrees of freedom of the search space.

We look forward to receiving feedback on the software perfor-mance from the user community. Sculptor is freely available athttp://sculptor.biomachina.org for Windows, Macintoshand Linux operating systems. We have prepared convenient distri-bution packages for the different operating systems and have up-loaded various tutorials and manual texts online. The featuresdescribed in this manuscript are included in Sculptor versions 2.0and later. The programs colores and collage are part of the Situs pro-gram package, available at http://situs.biomachina.org. Theexport of the Sculptor-compatible peak search files is implementedin Situs version 2.5 and later, and the multi-body refinement in col-lage is implemented in Situs version 2.6 and later.

Acknowledgments

We thank Jochen Heyd, Manuel Wahle, Maik Boltes, and HerwigZilken for contributions to earlier versions of Sculptor. We alsothank Teresa Ruiz, Michael Radermacher, Tanvir Shaikh, AlexanderFreiberg, Gkhan Tolun, and Tracy Nixon for their extensive testingof the software and their suggestions for fine-tuning the graphicalinterface. This work was supported in part by grants from NationalInstitutes of Health (R01GM62968), the Human Frontier ScienceProgram (RGP0026/2003), the Gillson-Longenbaugh Foundation,and by startup funds from the University of Texas Health ScienceCenter at Houston.

References

Adman, E.T., Godden, J.W., Turley, S., 1995. The structure of copper-nitrite reductasefrom Achromobacter cycloclastes at five pH values, with NO2-bound and withtype II copper depleted. J. Biol. Chem. 270, 27458–27474.

Belnap, D.M., Kumar, A., Folk, J.T., Smith, T.J., Baker, T.S., 1999. Low-resolutiondensity maps from atomic models: How stepping ‘‘back’’ can be a step‘‘forward’’. J. Struct. Biol. 125, 166–175.

Berman, H.M., Battistuz, T., Bhat, T.N., Bluhm, W.F., Bourne, P.E., Burkhardt, K., Feng,Z., Gilliland, G.L., Iype, L., Jain, S., Fagan, P., Marvin, J., Padilla, D., Ravichandran,V., Schneider, B., Thanki, N., Weissig, H., Westbrook, J.D., Zardecki, C., 2002. Theprotein data bank. Acta Cryst. D 58, 899–907.

Birmanns, S., Wriggers, W., 2003. Interactive fitting augmented by force-feedbackand virtual reality. J. Struct. Biol. 144, 123–131.

Birmanns, S., Wriggers, W., 2007. Multi-resolution anchor-point registration ofbiomolecular assemblies and their components. J Struct. Biol. 157, 271–280.

Chacón, P., Wriggers, W., 2002. Multi-resolution contour-based fitting ofmacromolecular structures. J. Mol. Biol. 317, 375–384.

Delano, W.L., 2002. The PyMOL molecular graphics system. DeLano Scientific, PaloAlto, CA, USA. http://www.pymol.org.

Engel, K., Kraus, M., Ertl, T., 2001. High-quality pre-integrated volume renderingusing hardware-accelerated pixel shading. In: HWWS ’01: Proceedings of theACM SIGGRAPH/EUROGRAPHICS Workshop on Graphics Hardware, pp. 9–16.

Frank, J., 2002. Single-particle imaging of macromolecules by cryo-electronmicroscopy. Ann. Rev. Biophys. Biomol. Struct. 31, 303–319.

Garzón, J.I., Kovacs, J., Abagyan, R., Chacón, P., 2007. ADP_EM: fast exhaustive multi-resolution docking for high-throughput coverage. Bioinformatics 23 (4), 427–433.

Goddard, T.D., Ferrin, T.E., 2007. Visualization software for molecular assemblies.Curr. Opin. Struct. Biol. 17, 587–595.

Heyd, J., Birmanns, S., 2008. Beyond the black box: interactive global docking ofprotein complexes. Microsc. Today 16 (4).

Heyd, J., Birmanns, S., 2009. Immersive structural biology: a new approach to hybridmodeling of macromolecular assemblies. Virtual Reality 13, 245–255.

Holmes, K.C., Angert, I., Kull, F.J., Jahn, W., Schröder, R.R., 2003. Electron cryo-microscopy shows how strong binding of myosin to actin releases nucleotide.Nature 425, 423–427.

Humphrey, W., Dalke, A., Schulten, K., 1996. VMD: visual molecular dynamics. J.Mol. Graphics 14, 33–38.

Kawabata, T., 2008. Multiple subunit fitting into a low-resolution density map of amacromolecular complex using a gaussian mixture model. Biophys. J. 95 (10),4643–4658.

Ko, T.P., Safo, M.K., Musayev, F.N., Di Salvo, M.L., Wang, C., Wu, S.H., Abraham, D.J.,2000. Structure of human erythrocyte catalase. Acta Cryst. D 56 (Pt 2), 241–245.

Lasker, K., Dror, O., Shatsky, M., Nussinov, R., Wolfson, HJ., 2009. EMatch: discoveryof high resolution structural homologues of protein domains in intermediateresolution cryo-EM maps. IEEE/ACM Trans. Comput. Biol. Bioinform. 4 (1), 28–39.

Lorensen, W.E., Cline, H.E., 1987. Marching cubes: a high resolution 3D surfaceconstruction algorithm. In: SIGGRAPH ’87: Proceedings of the 14th AnnualConference on Computer Graphics and Interactive Techniques, vol. 21, pp. 163–169.

Ludtke, S.J., Jakana, J., Song, J.L., Chuang, D.T., Chiu, W., 2001. A 11.5 Å single particlereconstruction of GroEL using EMAN. J. Mol. Biol. 314, 253–262.

Medalia, O., Weber, I., Frangakis, A.S., Nicastro, D., Gerisch, G., Baumeister, W., 2002.Macromolecular architecture in eukaryotic cells visualized by cryoelectrontomography. Science 298, 1209–1213.

Nolen, B.J., Littlefield, R.S., Pollard, T.D., 2004. Crystal structures of actin-relatedprotein 2/3 complex with bound ATP or ADP. Proc. Natl. Acad. Sci. USA 101,15627–15632.

Noy, E., Goldblum, A., 2010. Flexible protein–protein docking based on Best-Firstsearch algorithm. J. Comp. Chem. 31, 1929–1943.

Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C.,Ferrin, T.E., 2004. UCSF Chimera – a visualization system for exploratoryresearch and analysis. J. Comp. Chem. 25 (13), 1605–1612.

Powell, M.J.D., 1964. An efficient method for finding the minimum of a function ofseveral variables without calculating derivatives. Comput. J. 7 (2), 155–162.

Ranson, N.A., Farr, G.W., Roseman, A.M., Gowen, B., Fenton, W.A., Horwich, A.L.,Saibil, H.R., 2001. ATP-bound states of GroEL captured by cryo-electronmicroscopy. Cell 107, 869–879.

Rath, B.K., Hegerl, R., Leith, A., Shaikh, T.R., Wagenknecht, T., Frank, J., 2003. Fast 3Dmotif search of EM density maps using a locally normalized cross-correlationfunction. J. Struct. Biol. 144, 95–103.

Roseman, A.M., 2000. Docking structures of domains into maps from cryo-electronmicroscopy using local correlation. Acta Cryst. D 56, 1332–1340.

Rusu, M., Birmanns, S., 2010. Evolutionary tabu search strategies for thesimultaneous registration of multiple atomic structures in cryo-EMreconstructions. J. Struct. Biol. 170, 164–171.

Rusu, M., Birmanns, S., Wriggers, W., 2008. Biomolecular pleiomorphism probed byspatial interpolation of coarse models. Bioinformatics 24, 2460–2466.

Tagari, M., Newman, R., Chagoyen, M., Carazo, J.M., Henrick, K., 2002. New electronmicroscopy database and deposition system. Trends Biochem. Sci. 27, 589.

Volkmann, N., Hanein, D., 1999. Quantitative fitting of atomic models into observeddensities derived by electron microscopy. J. Struct. Biol. 125, 176–184.

Wriggers, W., 2010. Using Situs for the integration of multi-resolution structures.Biophys. Rev. 2, 21–27.

Wriggers, W., Alamo, L., Padrón, R., 2011. Matching structural densities fromdifferent biophysical origins with gain and bias. J. Struct. Biol. 173, 445–450.

Wriggers, W., Chacón, P., 2001. Modeling tricks and fitting techniques formultiresolution structures. Structure 9, 779–788.

Wriggers, W., Chacón, P., Kovacs, J.A., Tama, F., Birmanns, S., 2004. Topologyrepresenting neural networks reconcile biomolecular shape, structure, anddynamics. Neurocomputing 56, 365–379.

Xing, X., Bell, C.E., 2004. Crystal structures of Escherichia coli RecA in complex withMgADP and MnAMP-PNP. Biochemistry 43, 16142–16152.