
DeCipher Release 2000

March 2000

Accelrys
9685 Scranton Road

San Diego, CA 92121-3752

619/458-9990 Fax: 619/458-0136

Copyright*

This document is copyright © 2000, Accelrys Inc., a subsidiary of Pharmacopeia, Inc. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means or stored in a database retrieval system without the prior written permission of Accelrys Inc.

The software described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license.

Restricted Rights Legend

Use, duplication, or disclosure by the Government is subject to restrictions as in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFAR 252.227-7013 or subparagraphs (c)(1) and (2) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, as applicable, and any successor rules and regulations.

Trademark Acknowledgments

Catalyst, Cerius2, Discover, Insight II, and QUANTA are registered trademarks of Accelrys Inc. Biograf, Biosym, Cerius, CHARMm, Open Force Field, NMRgraf, Polygraf, QMW, Quantum Mechanics Workbench, WebLab, and the Biosym, MSI, and Accelrys marks are trademarks of Accelrys Inc.

IRIS, IRIX, and Silicon Graphics are trademarks of Silicon Graphics, Inc. AIX, Risc System/6000, and IBM are registered trademarks of International Business Machines, Inc. UNIX is a registered trademark, licensed exclusively by X/Open Company, Ltd. PostScript is a trademark of Adobe Systems, Inc. The X-Window system is a trademark of the Massachusetts Institute of Technology. NFS is a trademark of Sun Microsystems, Inc. FLEXlm is a trademark of Highland Software, Inc.

Permission to Reprint, Acknowledgments, and References

Accelrys usually grants permission to republish or reprint material copyrighted by Accelrys, provided that requests are first received in writing and that the required copyright credit line is used. For information published in documentation, the format is “Reprinted with permission from Document-name, Month Year, Accelrys Inc., San Diego.” For example:

Reprinted with permission from Cerius2 User Guide, March 2000, Accelrys Inc., San Diego.

Requests should be submitted to Accelrys Scientific Support, either through electronic mail to [email protected] or in writing to:

*U.S. version of Copyright Page

Accelrys Scientific Support and Customer Service
9685 Scranton Road
San Diego, CA 92121-3752

To print photographs or files of computational results (figures and/or data) obtained using Accelrys software, acknowledge the source in the format:

Computational results obtained using software programs from Accelrys Inc.: dynamics calculations were done with the Discover® program, using the CFF91 forcefield, ab initio calculations were done with the DMol program, and graphical displays were printed out from the Cerius2 molecular modeling system.

To reference an Accelrys publication in another publication, no author should be specified and Accelrys Inc. should be considered the publisher. For example:

Cerius2 Modeling Environment, March 2000. San Diego: Accelrys Inc., 2000.

DeCipher/March 2000 i

Contents

1. Introduction
    DeCipher...
    Role of analysis software in molecular modeling
    Software overview
        Substructure selection
        Data collection
        Property definition and evaluation
            Structure analysis
            Energetics analysis
            Dynamics analysis
        Property presentation
            Molecular structure and property spreadsheet
    Dynamic molecular information system
        User interface and user-software interaction

2. Theory
    Molecular properties
        Nomenclature
    Basic molecular properties
        Basic property types - scalar, vector, tensor, and reference axes
            Atoms and atom sets
            Pseudoatoms (centroids)
            Internals
            Nonbonds
        Geometric variables
            Point
            Vector
            Plane
        Geometric functions
            Position
            Distance
            Angle
            Dihedral
        Structural functions
            Moment of inertia
            Radius of gyration
            Displacement
            Isotropic and anisotropic mean square displacement (MSD)
            B-factor
            Root mean square displacement (RMSD)
        Kinetic functions
            Linear momentum
            Center of mass velocity
            Angular momentum
            Angular velocity
            Kinetic energy
                Total kinetic energy
                Translational kinetic energy
                Rotational kinetic energy
                Vibrational kinetic energy
        Electrostatic functions: multipole moments
            Monopole
            Dipole
            Multipolar interaction energy
            Multipolar interaction forces
            Multipole calculations of DeCipher
        Potential functions
            Potential energy and forces
            Consistent valence forcefield (CVFF)
            AMBER forcefield
            The CFF forcefield (Class II forcefield)
            Energy and derivative expressions
                CFF91/CFF
                CVFF
                AMBER
            Force calculations
                Bond
                Angle
                Torsion
                Wilson out of plane
        Restraint function
        Function optimization
    High-level molecular properties
        Distribution functions
        Radial distribution functions (RadialDistFunc)
            Pair distribution functions
            Spherical and cylindrical distribution functions
            Coordination numbers
        Coordination number distribution function (CN_Distrib)
        Orientational radial distribution function (Orient_RDF)
            Spherical harmonic expansion
            Orientational pair distribution functions
            Spherical orientational distribution function
            Cylindrical orientational distribution function
        Angular distribution function (AngleDistFunc)
        Mean square displacement (MeanSqDisp) and diffusion coefficient
        Time correlation functions (TimeCorrFunc)
            Direct approach
                Normalized discrete time correlation functions
            Legendre polynomials P_l(cos θ)
            Spectral density
            Diffusion coefficient
        Residency time (ResidencyTime)

3. Languages
    Introduction
    Molecular subsets
        Topological subsets
        Geometric subsets
            Three dimensional atom attributes
            Three dimensional neighborhood
    Functions and mathematical expressions in DeCipher
        Molecular properties representations
            Native functions
            Spectra
            Numerical functions
            System functions
            Geometrics
            Implicit functions
        Mathematical expressions (the “Function Spec”)
            Built-in operators
                Precedence rules
                Array, vector, Ref_Axes, and tensor subscripts
                Built-in scalar functions
                Built-in constant
                Built-in vector functions
                Built-in tensor functions
                Specific built-in array functions
            Examples (for Function Spec)

4. Command Summary
    Configurations
        Configurations/Get
        Configurations/Put
        Configurations/Record
        Configurations/Filter
        Configurations/Align
        Configurations/Color
        Configurations/Animate
        Configurations/Select
        Configurations/Tabulate
        Configurations/Delete
    Functions
        Functions/Construct_Graph
        Functions/Construct_Table
        Functions/Angles
        Functions/Dihedrals
        Functions/Dipole
        Functions/Distances
        Functions/Moment_of_Inertia
        Functions/Positions
        Functions/Radius_of_Gyration
        Functions/Numeric
        Functions/B_Factor
        Functions/Displacement
        Functions/RMSD
        Functions/Momentum
        Functions/Velocity
        Functions/Temperature
        Functions/Kinetic_Energy
        Functions/Multipolar_Energy
        Functions/Potential_Energy
        Functions/Restraint_Energy
        Functions/Non_Bond_Setup
        Functions/Cross_Term_Setup
        Functions/Optimize
        Functions/Get
        Functions/Put
        Functions/Rename
        Functions/Delete
        Functions/Info
    Spectra
        Spectra/RadialDistFunc
        Spectra/OrientRDF
        Spectra/AngleDistFunc
        Spectra/CN_Distrib
        Spectra/MeanSqDisp
        Spectra/TimeCorrFunc
        Spectra/ResidencyTime
    Geometrics
        Geometrics/Point
        Geometrics/Vector
        Geometrics/Plane
        Geometrics/Style
        Geometrics/Color
        Geometrics/Info
    SubStructure
        SubStructure/Get
        SubStructure/Put
        SubStructure/Internals
        SubStructure/NonBonds
        SubStructure/Rename
        Substructure/Delete
        Substructure/Info
        SubStructure/Plot
    Cluster
        Cluster/Construct_Graph
        Cluster/Repartition
        Cluster/Family

5. Online Tutorials
    Pilot online tutorials

A. References

B. Glossary

C. File Formats
    Multiple-file trajectory definition file

D. Units
    Electrostatic conversion factor
    Frequency conversion

E. Atom_Template Expressions
    Bond order characters
    Examples of Atom_Templates


1 Introduction

DeCipher...

DeCipher’s philosophy revolves around mathematical and geometric modeling of molecular properties. It is a software program that allows users to abstract molecular properties dynamically from molecular structures and simulations.

DeCipher is a “dynamic molecular information system”. Molecular properties are defined interactively, evaluated dynamically, and visualized interactively through dynamically linked spreadsheets, 2D and 3D graphs, and 3D molecular graphics representations.

The software is extremely general and can be applied to a wide variety of molecular systems in gas and condensed phases. Structural, energetic, and dynamic properties of molecular systems are covered; they are represented as a library of mathematical functions. A mathematical language allows users to define additional molecular properties as linear or non-linear combinations of library functions. Ensemble averages or time correlation functions can be obtained through a number of operators.

Role of analysis software in molecular modeling

Most molecular modeling and structure determination procedures involve three steps:

1. Molecular model building.

2. Molecular simulation (or structure refinement).


3. Analysis of the resulting models and their simulation.

While Insight II can be used for the first step, and Discover or CHARMm for the second (in their most common form as molecular dynamics simulation tools), the DeCipher module tackles the third step.

The analysis step is crucial: to understand the structural basis of a particular biological function, you must analyze the details of each relevant structure, relate and compare conformations of active substructures from different molecules, and relate all of these structural analyses to biological activity to construct a model for the physical origins of a targeted function. A pharmaceutical chemist interested in proposing a new drug or a new modification to improve the potency of an old drug will typically analyze hundreds of candidates before a single candidate proves successful. For modeling software to be of use, it must provide extensive capabilities to compare these candidates in terms of conformational similarities, property similarities, functional similarities, and, if the three-dimensional structure of the targeted receptor is known, quality of fit and energetics of binding.

Without analysis software, the output of a simulation is a movie. Although the visual animation of a molecular dynamics simulation can provide insightful qualitative information, the successful use of simulation to design a new drug or material demands sufficient analysis to quantitatively relate and compare proposed modifications. Current simulation methods may not provide absolute, accurate, numerical answers to all questions. However, the current success of simulation in providing qualitative information is based on the application of consistent computational methodologies which can numerically rank alternatives. The most natural questions from users of molecular simulation programs are: What do I do now that I have run a simulation? What does it teach me? What can I learn from it, and how can I extract useful information to apply to my design problem? Answering these questions is the job of DeCipher.

Simulation and knowledge-based modeling are effective only if they are coupled with extensive capabilities to analyze their results. In a very real sense, all molecular simulation software is aimed towards the goal of analyzing molecular systems and using the results of this analysis to predict the behavior of novel systems and, hence, to design them. The effectiveness of simulation software is therefore integrally linked to the effectiveness of its analysis software. No other software tools are as central and pervasive to all areas of computer-assisted molecular design, or as crucial to the design and implementation of effective computer-assisted molecular design, as the analysis software.

Software overview

DeCipher is designed to interactively analyze structural, energetic, and mechanical properties of molecules or molecular systems in condensed phases. Dynamic properties can be obtained from post-processing of molecular simulations.

Simulations can be analyzed many times, using the stored output from simulations as input to DeCipher. The data is not immutable: if analysis indicates inadequacies in the model or in the length of the simulation performed, additional simulation may be indicated.

DeCipher performs four logical steps as illustrated in Figure 1.

1. Substructure selection: specification of which molecular structure or substructure is relevant for a particular analysis.

2. Data collection: specification of the data resulting from the simulation of interest to be collected and stored.

3. Property definition and evaluation: specification of a molecular property to be calculated as a mathematical function of this data.

4. Property presentation: specification of the mode of presentation of the properties as tables, graphs or other molecular representations.

Substructure selection

The first user interaction in any molecular analysis consists of defining the structure or substructure to be investigated. The flexibility of DeCipher is determined to a large extent through the flexibility of this central set selection capability. The definition of static and dynamic sets of atoms based on atomic, monomeric, and molecular properties gives users the ability to analyze molecular systems at any level of detail. A static set is one which is defined once, and membership in that set is permanent; a dynamic set is defined by a rule that is stored for evaluation by the system as needed. Logical (Boolean) operations on sets, such as union, intersection, and difference, allow definition of new sets of atoms given existing ones. Following are two examples.

♦ A permanent set of atoms might be created by selecting all residues of a particular type or charge within a given distance of a particular atom; these might then be displayed, colored, or stored for additional queries as desired.

♦ A dynamic set might consist of the first shell of hydration of an ion in solution during a molecular dynamics simulation. Membership in this set might be determined at each time step, according to how close any water is to the ion, its interaction energy with the ion, the nature of that interaction (e.g., a hydrogen bond is formed), etc.
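The difference between a static and a dynamic set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame layout, the `within` helper, and the 3.0 Å cutoff are hypothetical and do not reproduce DeCipher's own selection language.

```python
import math

# Hypothetical frames: each atom is (name, x, y, z). Index 0 is the ion.
frame0 = [("NA", 0.0, 0.0, 0.0), ("OW1", 2.4, 0.0, 0.0), ("OW2", 6.0, 0.0, 0.0)]
frame1 = [("NA", 0.0, 0.0, 0.0), ("OW1", 5.0, 0.0, 0.0), ("OW2", 2.5, 0.0, 0.0)]

def within(frame, center, cutoff):
    """Return the indices of atoms (excluding the center) within cutoff of it."""
    _, cx, cy, cz = frame[center]
    return {
        i for i, (_, x, y, z) in enumerate(frame)
        if i != center and math.dist((x, y, z), (cx, cy, cz)) <= cutoff
    }

# Static set: evaluated once; membership never changes afterwards.
static_shell = frozenset(within(frame0, center=0, cutoff=3.0))

# Dynamic set: a stored rule, re-evaluated against whichever frame is current.
def dynamic_shell(frame):
    return within(frame, center=0, cutoff=3.0)

# Boolean operations build new sets from existing ones.
left_shell = static_shell - dynamic_shell(frame1)      # difference
ever_in_shell = static_shell | dynamic_shell(frame1)   # union
still_in_shell = static_shell & dynamic_shell(frame1)  # intersection
```

Here the static set keeps the water that was near the ion in the first frame, while the dynamic rule picks up whichever water is near the ion in the frame it is given.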

Figure 1 DeCipher Overview

[Figure: the four steps of a DeCipher analysis. Inputs: molecular structure; molecular simulation data. 1 Substructure Selection: specify the system or subsystem of interest. 2 Data Collection: specify and collect simulation data needed. 3 Property Definition and Evaluation: define and evaluate molecular properties. 4 Property Presentation: tabulate and visualize molecular properties.]



The set concept is not limited to atoms: sets of internal coordinates (bonds, valence angles, dihedral angles, out of plane or improper angles) or nonbond distances (including hydrogen bonds, or geometric constraints such as NOEs) are also definable based on the attributes of their constituent atoms. As an example of this latter capability, one could study the distribution of hydrogen bond lengths and angles in a set of organic crystals or in a protein structure, selecting only those hydrogen bonds formed to ionic species.

Data collection

The analysis of molecular structures and dynamics simulations is, technically speaking, data analysis. The input to DeCipher consists of the elementary data saved from the simulation, or stored in structural databases. A data collection mechanism gives selective access to atomic variables (atom positions, velocities, and topological attributes) and system variables (unit cell lengths, pressures and/or temperatures, and time). Any structural, energetic and dynamic property that can be formally represented by a mathematical function of these variables can be defined and evaluated.

Property definition and evaluation

Structure analysis

Given a molecular structure or set of molecular dynamics configurations, you can identify and classify the substructures of molecular systems which exhibit a particular structural motif. You may use standard and user-defined energetic, structural and dynamic criteria to direct the analysis.

That is, one can define different sets of configurations based on selected properties and then calculate average structures and properties for these. High-level structural properties such as radial distribution functions for atoms, residues and molecules, as well as elementary structural properties such as bond lengths, valence angles, dihedral angles, nonbond distances, etc., are accessible for periodic as well as non-periodic systems. Geometric constructs (centroids, vectors, planes) allow for more elaborate, problem-dependent, structural analyses based on the determination of geometric relationships such as an angle between arbitrary vectors, or a distance from an atom, normal to an arbitrary plane. Calculated structural properties are then available for selecting particular configurations (e.g., conformations of a molecule, or frames of a trajectory). An example of such a plane-based criterion might be the selection of all water molecules within 2-4 Å above an interfacial plane such as that of a lipid-water interface or of a catalytic surface.
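The plane-based selection described above can be sketched as follows. This is a minimal illustration, not DeCipher's implementation: the function names, the sample coordinates, and the choice of the plane z = 0 as the interface are all assumptions made for the example.

```python
import math

def signed_distance(point, plane_point, normal):
    """Signed distance from point to the plane through plane_point.

    The sign is positive on the side the normal points to. The normal
    does not need to be a unit vector; it is normalized here.
    """
    length = math.sqrt(sum(c * c for c in normal))
    return sum((p - q) * c for p, q, c in zip(point, plane_point, normal)) / length

def select_above(points, plane_point, normal, lo=2.0, hi=4.0):
    """Indices of points whose height above the plane lies in [lo, hi]."""
    return [
        i for i, p in enumerate(points)
        if lo <= signed_distance(p, plane_point, normal) <= hi
    ]

# Hypothetical water oxygen positions (Å) near an interface taken as z = 0.
waters = [(0.0, 0.0, 1.0), (1.0, 2.0, 3.0), (0.5, 0.5, -3.0), (2.0, 1.0, 4.5)]
selected = select_above(waters, plane_point=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 2.0))
```

Using a signed distance rather than an absolute one is what distinguishes "above" the plane from "below" it.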

Energetics analysis

Without a detailed energetics analysis there cannot be a full understanding of the underlying interactions determining molecular structure and function.

In molecular dynamics, one solves Newton's equations of motion numerically, monitoring in particular the total energy of a molecular system. Usually the only energetics output of a molecular dynamics program, such as Discover or CHARMm, consists of the total kinetic and potential energies of the molecular system with, at most, the breakdown of the potential energy. In the study of biological systems, for example, one needs to know where the main energy interactions occur (which fragments in the molecules interact noticeably), and why (what “type” of interaction is involved, i.e., the energy contributions associated with the different terms in the potential energy function). In molecular mechanics, one minimizes the potential energy of a molecule or a set of interacting molecules. At best, the energy breakdown in terms of energy types is given for the whole system minimized. Again, no details are usually given at a substructural level.

It is the role of DeCipher to determine where the main interactions occur, and how to best partition molecular systems into interacting substructural fragments based on a meaningful potential energy distribution (how the energy, and what type of energy, is distributed among substructural groups). Such knowledge, extracted from a simulation, helps tremendously in designing new compounds, since one can isolate individual atoms, functional groups, residues, and fragments with well-defined internal energetics and energetics of interaction. Their role in binding, for example, can be rationalized in terms of energy, and that knowledge can then be applied in rational drug design.
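As one concrete instance of an interaction energy between two substructures, the sketch below sums a plain Coulomb term over all atom pairs drawn from two fragments. It is an illustration under assumptions, not the forcefield expressions DeCipher evaluates: only the electrostatic term is shown, the fragments are hypothetical, and the conversion factor is an approximate textbook value rather than one taken from this guide.

```python
import math

# Approximate conversion factor giving kcal/mol for charges in e and
# distances in Å. Assumed for illustration; not taken from this document.
COULOMB_KCAL = 332.06

def coulomb_interaction(group_a, group_b, k=COULOMB_KCAL):
    """Sum k * qi * qj / rij over all pairs with i in group_a, j in group_b.

    Each atom is (charge, (x, y, z)). The two groups must not share atoms,
    otherwise a zero distance would occur.
    """
    energy = 0.0
    for qi, ri in group_a:
        for qj, rj in group_b:
            energy += k * qi * qj / math.dist(ri, rj)
    return energy

# Hypothetical fragments: a cation and a two-site dipolar group.
ion = [(1.0, (0.0, 0.0, 0.0))]
dipolar_group = [(-0.5, (2.0, 0.0, 0.0)), (0.5, (4.0, 0.0, 0.0))]

# With k=1.0 the result is in reduced units, which keeps the arithmetic easy to check.
e_pair = coulomb_interaction(ion, dipolar_group, k=1.0)
```

Evaluating such a sum for every pair of fragments gives a matrix of interaction energies, which is one way to see where the main interactions in a system occur.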

DeCipher provides the means to calculate potential energies for molecules and their substructures, either in isolation or interacting with other species and substructures. The analysis focuses primarily on the potential energy as modelled by molecular forcefields. The program also allows the calculation of kinetic energies associated with any substructure, from velocities generated during molecular dynamics. When analyzing energies at the molecular system level, the program makes full use of the energy output of the dynamics run, as stored in the trajectory data files, to avoid performing redundant calculations.
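The kinetic energy of a substructure follows directly from the stored velocities as one half of the sum of mass times squared speed over the selected atoms. The sketch below assumes hypothetical masses and velocities in arbitrary consistent units; the function name and data layout are illustrative only.

```python
def kinetic_energy(masses, velocities, subset):
    """Return 0.5 * sum(m_i * |v_i|^2) over the atom indices in subset."""
    return 0.5 * sum(
        masses[i] * sum(c * c for c in velocities[i]) for i in subset
    )

# Hypothetical three-atom system in arbitrary consistent units.
masses = [2.0, 1.0, 4.0]
velocities = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 0.5)]

ke_fragment = kinetic_energy(masses, velocities, subset=[0, 1])
ke_total = kinetic_energy(masses, velocities, subset=range(len(masses)))
```

Passing a different index list is all that is needed to obtain the kinetic energy of another substructure from the same frame.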

Dynamics analysis

In the analysis of molecular dynamics simulations the data consists of atomic positions, velocities, forces, and all relevant information about the simulation conditions necessary for the analysis. The latter might consist of the system’s temperature, pressure, structural constraints, or the variable lengths of the periodic cell boundaries during constant pressure simulations with periodic boundary conditions. The simulation history file produced by molecular dynamics programs such as Discover or CHARMm (which contains mainly position, velocity, and system state variables) logically comprises a database for the molecular dynamics analysis module. With the flexible data analysis mechanism outlined above, one can define and calculate dynamic properties of molecules. These include simple time-series of structural or energetic variables, correlation functions (in particular order parameters), as well as their Fourier transform-related power spectra, transport properties (such as self-diffusion coefficients), and experimentally accessible spectral properties (such as those probed by NMR or IR).
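A time correlation function of a scalar time series can be estimated by direct summation over all available time origins. The following sketch is an assumption-laden illustration rather than DeCipher's algorithm: it computes the normalized autocorrelation C(t) = <A(0)A(t)> / <A(0)A(0)> without subtracting the mean, and the function name and sample series are hypothetical.

```python
def autocorrelation(series, max_lag):
    """Normalized autocorrelation of a scalar series by direct summation.

    For each lag, the product series[i] * series[i + lag] is averaged over
    every time origin i that fits in the series, then divided by the
    lag-zero value. The mean of the series is not subtracted.
    """
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(series)")
    raw = []
    for lag in range(max_lag + 1):
        products = [series[i] * series[i + lag] for i in range(n - lag)]
        raw.append(sum(products) / len(products))
    if raw[0] == 0:
        raise ValueError("series is identically zero; cannot normalize")
    return [value / raw[0] for value in raw]

# Hypothetical alternating signal: perfectly anticorrelated at odd lags.
acf = autocorrelation([1.0, -1.0, 1.0, -1.0], max_lag=2)
```

The direct sum costs on the order of the series length times the number of lags; longer lags are averaged over fewer time origins and are therefore statistically noisier.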

Property presentation

Molecular structure and property spreadsheet

Molecular properties can be visualized through the simultaneous use of tables, graphs, and molecular structure and surface representations. These capabilities define a molecular structure and property spreadsheet. It allows intuitive and detailed access to all system and user-defined properties, permitting patterns and differences among molecular systems to emerge. Each column of the spreadsheet is associated with a property, and each line is associated with a structure, conformer, or configuration (frame of a trajectory). Vector properties such as electric dipoles, and tensor properties such as moment of inertia can be tabulated. Any property (column of the table) can be graphed against any other or processed through a mathematical equation (spreadsheet operation).

Consider the problem of monitoring the angle between two helix axes in a protein, as well as the electric dipole magnitude and direction. These vector properties can be viewed simultaneously in a table, a graph, and an animated molecular structure with dynamic color-coded structural vector monitors.
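Both quantities in this example reduce to short vector formulas: the angle follows from the normalized dot product of the two axis vectors, and the dipole is the charge-weighted sum of atom positions. The sketch below is illustrative only; the function names and inputs are hypothetical, and for a group with a net charge the dipole computed this way depends on the choice of origin.

```python
import math

def angle_deg(a, b):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def dipole(charges, positions):
    """Dipole vector sum(q_i * r_i) for point charges at the given positions."""
    return tuple(
        sum(q * r[k] for q, r in zip(charges, positions)) for k in range(3)
    )

def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

# Hypothetical helix axis vectors and a neutral two-charge group.
helix_angle = angle_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
mu = dipole([1.0, -1.0], [(0.0, 0.0, 1.0), (0.0, 0.0, 0.0)])
mu_size = magnitude(mu)
```

Evaluating these functions frame by frame yields the time series that would populate the table and graph described above.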

Another representative application of such a spreadsheet might be its use in comparing experimental and theoretical observables. For example, by importing experimentally observed values for internal coordinates for a series of molecules from structural databases such as the Cambridge Database, comparison with and refinement of predictions from simulations are facilitated.

Dynamic molecular information system

DeCipher is an engine to interactively and selectively extract elements of information from molecular structures and simulations. As a user defines new properties and visualizes them simultaneously as numbers in tables, graphs, and molecular graphics representations, new questions arise that can be formulated as a new property definition on the system or a related one. DeCipher’s ability to endlessly define new properties, and refine the analytical and graphical model of complex molecular properties makes this a dynamic molecular information system. The data is processed dynamically, and each new user-defined property adds a piece of information in the analysis of molecular structures and simulations (see Figure 2).



User interface and user-software interaction

A consistent user interface, integrated within Insight II, allows users to define, name, plot, display and further process all of the atomic and molecular variables and property functions. Key flexibility is provided by the ability to exactly and formally express the property to be analyzed.

After having defined a property for a subsystem of interest, one can analyze it. First one can define sets of atoms, either static or dynamic, based on their attributes, and name them. One can also define and name geometric objects, such as arbitrary points, vectors, and planes, which can be specified explicitly or as best-fit to sets of atom positions. For centroids a general weighting scheme allows, in particular, the definition of a center of mass, a center of geometry, or a center of charge. One can define, name, and graphically monitor molecular functions, with dynamically linked tables, graphs and three dimensional molecular structures and associated geometric objects. These may be purely geometric functions such as position vectors; point to point and point to plane distance vectors; three point, vector-vector, vector-plane, plane-plane, and dihedral angles; electrostatic functions, such as electric dipole moments for any group of atoms; or kinetic functions such as group velocities or momenta.

Figure 2. DeCipher Software Architecture. [Diagram labels: Data (dynamics trajectories, Monte Carlo conformer ensembles); Data Collection; Subset Selection (molecule, residue, atoms, internals, nonbonds, geometric constructs); Property Definition; Property Evaluation; Property Presentation (tables, graphs, 3D molecular graphics display); Links; Molecular Structure and Property Spreadsheet; Dynamic Molecular Information System.]

In addition to user-defined functions, DeCipher contains system-defined variables and functions such as time, temperature, pressure, volume, potential, kinetic and total energy for molecular dynamics analysis. Once molecular functions have been defined formally, one can analyze their time-series, or simply graph them for selected sets of configurations, even if not time-correlated. These functions can be used as attributes of configurations of the dynamic molecular system, and can also be used to selectively load sets of configurations (i.e., by filtering configurations on the basis of their properties, defined as functions). System functions and user-defined functions representing properties of the molecular system or any user-defined subsystem (subset) can be tabulated in a very flexible way. Spreadsheets can be defined and modified dynamically. Vector, scalar, and tensor functions such as positions, distances, angles, velocities, momenta, kinetic energies, electric dipoles, moments of inertia, B-factors, and potential energies (including individual component breakdown) can be tabulated. Any column of tabulated data can be graphed directly from the table, as in any spreadsheet. One can therefore follow a simulation, watching an animation of molecules and geometric objects, and one or more table(s) and graph(s) of selected properties.

To study dynamic properties of molecular systems, direct time correlation functions for (auto-correlation) and among (cross-correlation) any of these, either vector or scalar functions, can be calculated and graphed interactively, or in the background. Orientation correlation functions can be determined for any vector functions (first- and second-order Legendre polynomials). To study structural properties of condensed phases, one can obtain coordination numbers in addition to calculating the pair/radial distribution function between any set of atoms. To study transport properties, one can calculate the mean square displacement correlation function and diffusion coefficients, and average distance travelled by sets of atoms over a period of time. The flexibility for the determination of these high level molecular functions is enhanced by a dual implementation for interactive and background calculations. The interactive mode, available for the radial distribution function, mean square displacement, and time correlation functions and spectral densities calculations, is an exploration tool of impressive power. It allows one to graph high-level molecular properties as easily as time series of any basic geometric, energetic, and kinetic molecular functions.




2 Theory

Molecular properties

DeCipher’s philosophy revolves around mathematical and geometrical modeling of molecular properties. In DeCipher, there are two broad categories of molecular properties.

Basic molecular properties are represented as simple mathematical functions of atomic variables directly accessible from molecular structure databases or molecular simulations trajectory files: atom coordinates, velocities, charges, masses, etc. These basic properties are referred to as “Functions” in the program. DeCipher’s Functions cover structural properties such as arbitrary geometrical distances and angles, energetic properties for arbitrary sets of atoms such as kinetic energies and center of mass velocities, potential energies and associated forces, permanent electric dipoles and multipoles, as well as multipolar interaction energies and forces.

Once they are defined, basic properties can be visualized, measured, monitored, and correlated over time. Functions can be scalars, vectors, tensors or principal axes. They can be multi-dimensional, so that one can refer directly to sets of properties, for example on a per residue or per molecule basis. When defined, a basic property becomes a molecular variable like any other, and can be used as such for further analysis. A complex property can also be expressed as a linear or non-linear combination of basic functions, through a mathematical expression. For example, a user could define a dipole-dipole interaction energy from electric dipole vectors, the distance between them, and the angles describing their relative orientations.

High level properties are also functions, in the mathematical sense, of atomic variables and time (i.e., configurations). Their calculation, however, requires ensemble averages over configuration space. High level molecular properties in DeCipher cover pair and radial distribution functions, time correlation functions, orientation correlation functions, and their related Fourier-transformed spectral densities. High level properties are best represented as graphs and are referred to as “spectra” in the program, by analogy to experimental properties obtained as spectra. For example, the power spectrum of the dipole autocorrelation function can be directly related to an experimental low frequency IR spectrum.

For more information on atomic attributes, refer to the Molecular subsets heading in Chapter 3, Languages.

Nomenclature

In the following, we will use the word “particle” to refer to either an atom, a residue, a molecule, or an arbitrary set of atoms. A particle containing more than one atom can be considered as a point, mass, or charge distribution, and/or represented by a centroid such as a center of geometry, mass, or charge. In this case, we will use “charge distribution”, “particle” and “atom set” interchangeably.

Basic molecular properties

Basic property types - scalar, vector, tensor, and reference axes:

In DeCipher, a basic molecular property is expressed as a function of atomic variables, and can be a scalar, a real space vector, tensor, or frame of reference (reference axes). To a vector function, A, four scalar functions can be directly associated: each of its Cartesian components Aα (α = x, y, z), and its magnitude (the norm of the vector):

$$|\mathbf{A}| = \sqrt{\sum_{\alpha} A_{\alpha}^{2}} \qquad \text{(Eq. 1)}$$

Position, distance, velocity, momentum, force, and dipole moment functions are defined as vectors (triplets x, y, z). They may be graphically represented in real (affine) space, at arbitrary origins. Default origins are automatically defined by the program, at appropriate positions relative to the molecular structure.

With a tensor function, corresponding to a three dimensional symmetric matrix, a number of scalar functions are associated: six scalar components (xx, yy, zz, xy, xz, yz), the trace, i.e., the sum of the diagonal elements, and the determinant. In addition, a reference-axes function is a triplet of vectors sharing an origin in 3D space; it corresponds in particular to the diagonal form of a tensor (principal moments), together with the associated frame of reference definition, the principal axes system.

All the scalar and vector functions associated with native vector, tensor, and reference-axes functions are accessed through subscripting or mathematical operators. (See Functions and mathematical expressions in DeCipher in Chapter 3, Languages.)

Atoms and atom sets

In DeCipher, atoms and atomic positions in real space are the basic structural variables. Atom coordinates are expressed in Insight II’s “world” frame of reference. Atoms have a number of static and dynamic native attributes, such as mass, charge, potential types, etc., that are used in conjunction with coordinates to calculate most properties. You may load additional dynamic attributes such as velocities from dynamics simulations. Sets of atoms can be defined on the basis of atomic, monomeric (residue) and molecular attributes.

Pseudoatoms (centroids)

The positions of geometric centroids such as the center of mass $\mathbf{r}_{i,\mathrm{COM}}$, the center of geometry $\mathbf{r}_{i,\mathrm{COG}}$, and the center of charge $\mathbf{r}_{i,\mathrm{COC}}^{\varepsilon}$ of a particle i are defined as:

$$\mathbf{r}_{i,\mathrm{COM}} = \frac{\sum_{a=1}^{N} m_{ia}\,\mathbf{r}_{ia}}{M_i} \qquad \text{(Eq. 2)}$$

$$\mathbf{r}_{i,\mathrm{COG}} = \frac{\sum_{a=1}^{N} \mathbf{r}_{ia}}{N} \qquad \text{(Eq. 3)}$$

$$\mathbf{r}_{i,\mathrm{COC}}^{\varepsilon} = \frac{\sum_{a=1}^{N^{\varepsilon}} e_{ia}^{\varepsilon}\,\mathbf{r}_{ia}}{Q_i^{\varepsilon}} \qquad \text{(Eq. 4)}$$

where $m_{ia}$ and $\mathbf{r}_{ia}$ are the masses and position vectors of individual atoms a, $e_{ia}^{\varepsilon}$ are the atomic charges of sign ε, and $M_i$, $N$, and $Q_i^{\varepsilon}$ are the total mass, number of atoms, and charge of the particle i. Sets of pseudoatoms can be defined at once for sets of residues and molecules.

Note that centers of positive (ε = +) and negative charge (ε = -) should be defined separately. The subset of positively charged atoms should be defined first, then the subset of negatively charged atoms, and finally, their respective centroids. By convention, if a subset contains both positive and negative charges and a null total charge, a center is created halfway between the centers of positive and negative charge. If the total charge of such a subset is nonzero, only the center of either positive or negative charge is created (whichever has the maximum absolute charge) and represents the position of the resulting monopole (net charge) for that subset.

Internals

Sets of internal coordinates, or Internals (bonds, valence angles, dihedral angles, impropers or out-of-plane angles) are defined using any atomic attribute. For example, a set of internals can be defined by specifying atom names, atom types, charges, masses, or element types, etc.

Nonbonds

Sets of non-bonded interaction distances, or Nonbonds, are defined as all non-bonded atom pairs between two sets of atoms, within a specified distance range.
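The centroid definitions of Eqs. 2-4 share one weighting scheme, which the following minimal Python sketch illustrates. It is not DeCipher code: the function names `centroid` and `centers_of_charge` and the water-like coordinates, masses, and charges are hypothetical, chosen only for illustration.

```python
def centroid(positions, weights=None):
    """Weighted centroid sum(w_a * r_a) / sum(w_a); unit weights give the COG."""
    if weights is None:
        weights = [1.0] * len(positions)
    total = sum(weights)
    return tuple(
        sum(w * r[k] for w, r in zip(weights, positions)) / total
        for k in range(3)
    )


def centers_of_charge(positions, charges):
    """Centers of positive and negative charge (Eq. 4); None if a sign is absent."""
    centers = []
    for sign in (+1, -1):
        subset = [(r, q) for r, q in zip(positions, charges) if sign * q > 0]
        if subset:
            centers.append(centroid([r for r, _ in subset], [q for _, q in subset]))
        else:
            centers.append(None)
    return tuple(centers)


# Hypothetical three-atom particle (coordinates in angstroms).
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
masses = [15.999, 1.008, 1.008]
charges = [-0.82, 0.41, 0.41]

com = centroid(coords, masses)   # Eq. 2: mass weights
cog = centroid(coords)           # Eq. 3: unit weights
coc_plus, coc_minus = centers_of_charge(coords, charges)
```

Passing masses, unit weights, or same-sign charges to the same routine mirrors the general weighting scheme described for centroids.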



Geometric variables

Point

A geometric point can be defined directly by specifying arbitrary coordinates (in Insight II’s screen reference frame) or coordinates relative to atom positions, as a centroid of atoms (see Atoms and atom sets above).

For more information see the Geometrics/Point heading in Chapter 4, Command Summary.

Vector

A geometric vector can be defined directly by specifying an origin and end point in affine space (Insight II “world”), an origin and direction (Cartesian components), or as the best fit (oriented) axis to a set of atom positions.

Vector objects in DeCipher can also represent vector functions in affine space. In other words they can be seen as geometrical monitors of vector functions. Vector functions are defined in vector space to allow vector calculus, while they can be displayed in the affine space (Insight II “world”). By allowing this dual representation, you can map vector functions anywhere in affine space. For example a set of dipole moments associated with monomers can be mapped from one molecule to another, or can all be mapped to a single point in space to allow the visualization of their orientational distribution. Also tensor functions can be visualized as a triad of vectors corresponding to their eigenvectors, mapped onto a spatial origin, usually a centroid, but also an arbitrary point.

For more information see the Geometrics/Vector heading in Chapter 4, Command Summary.

Plane

A geometric plane can be defined directly by specifying three points, or as the best fit plane to a set of atom positions.

For more information see the Geometrics/Plane heading in Chapter 4, Command Summary.



Geometric functions

Position

Position vectors can be defined for atoms, pseudoatoms, or geometric points, relative to any origin.

Distance

Distance is a special case vector which is defined by two points, a point and a plane, a point and a line, or two lines.

1. Point to point distance vector: vector defined by two points.

2. Point to plane distance vector: normal to a plane, at a point in space.

3. Point to line distance vector: normal to a line/vector, to a point in space.

4. Line to line distance vector: common normal to two lines defined along two vectors.

Angle

An angle can be defined by either three points, two vectors, a vector and a plane, or two planes.

1. Three Points angle: angle defined by three points.

2. Vector - Vector angle: angle defined by two vectors.

3. Plane - Plane angle: angle defined by two planes.

4. Vector - Plane angle: angle defined by a vector and a plane.

[Figures: schematic constructions for the distance and angle definitions, with points labeled A and B, and “origin” O, “center” C, and E.]



Dihedral

A dihedral angle is defined by four points.
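As an illustration of the three-point and four-point definitions, here is a minimal Python sketch. The helper names and the atan2-based formula are assumptions made for this example (a common convention, not necessarily the one DeCipher uses), and the sign convention of the dihedral is likewise an assumption.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle(p1, p2, p3):
    """Three-point angle at the central point p2, in degrees."""
    u, v = sub(p1, p2), sub(p3, p2)
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle defined by four points, in degrees (-180 to 180)."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals to the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical points: a planar trans arrangement and a planar cis arrangement.
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
right = angle((1, 0, 0), (0, 0, 0), (0, 1, 0))
```

Using atan2 rather than acos keeps the sign of the dihedral and avoids loss of precision near 0 and 180 degrees.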

Structural functions

Moment of inertia

The moment of inertia tensor I of a particle is defined by:

$$\mathbf{I} = \begin{pmatrix} I_{xx} & -I_{xy} & -I_{xz} \\ -I_{yx} & I_{yy} & -I_{yz} \\ -I_{zx} & -I_{zy} & I_{zz} \end{pmatrix} \qquad \text{(Eq. 5)}$$

where

$$I_{xx} = \sum_{a=1}^{N} m_a \left(r_a^{2} - x_a^{2}\right), \qquad I_{xy} = \sum_{a=1}^{N} m_a x_a y_a \qquad \text{(Eq. 6)}$$

and $m_a$, $\mathbf{r}_a$ are the mass and position vector of an atom a relative to the center of mass, N is the number of atoms of the particle, and the remaining components of the tensor I can be obtained by circular permutation of x, y, and z. In DeCipher, a particle can be a residue, a molecule or an arbitrary set of atoms. Optionally, the masses can be set to one for all atoms, giving rise to a purely geometric quantity.



The principal moments of inertia, Ia, Ib and Ic, are obtained by diagonalization of the inertia tensor. They correspond to the eigenvalues, while the principal axes correspond to the eigenvectors. In the principal axes system, the inertia tensor is diagonal:

$$\mathbf{I} = \begin{pmatrix} I_a & 0 & 0 \\ 0 & I_b & 0 \\ 0 & 0 & I_c \end{pmatrix} \qquad \text{(Eq. 7)}$$

Radius of gyration

The radius of gyration of a set of atoms is defined by:

$$R_G = \left( \frac{1}{M} \sum_{a=1}^{N} m_a r_a^{2} \right)^{1/2} \qquad \text{(Eq. 8)}$$

where M is the total mass of the particle.

RG represents a mass weighted root-mean-square average distance of all atoms in the particle from their center of mass. In DeCipher, a particle can be a residue, a molecule or an arbitrary set of atoms. Optionally, the masses can be set to one for all atoms, giving rise to a purely geometric quantity.

Displacement

The displacement vector of an atom or a particle centroid (usually its center of geometry or center of mass), a, is given by the difference between its position in a given configuration and a reference configuration:

$$\Delta\mathbf{r}_a = \mathbf{r}_a - \mathbf{r}_{a,\mathrm{ref}} \qquad \text{(Eq. 9)}$$

In DeCipher, $\mathbf{r}_{\mathrm{ref}}$ can either be the average particle position over a set of configurations (usually a molecular dynamics trajectory), an arbitrary configuration of the system, or a reference structure (obtained for example by X-ray crystallography or nuclear magnetic resonance spectroscopy). Of particular interest are the initial configuration of a set, or just the previous configuration in a molecular dynamics trajectory. All of those are readily available in DeCipher.
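The structural functions above (Eqs. 5, 6, and 8) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not DeCipher's implementation: the function names are hypothetical, positions are taken relative to the center of mass as in the definitions, and diagonalization (Eq. 7) is left out to keep the example free of external libraries.

```python
import math


def center_of_mass(coords, masses):
    m_tot = sum(masses)
    return tuple(sum(m * r[k] for m, r in zip(masses, coords)) / m_tot
                 for k in range(3))


def inertia_tensor(coords, masses):
    """3x3 inertia tensor about the center of mass (Eqs. 5 and 6)."""
    com = center_of_mass(coords, masses)
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, coords):
        d = [r[k] - com[k] for k in range(3)]
        r2 = sum(c * c for c in d)
        for i in range(3):
            for j in range(3):
                if i == j:
                    tensor[i][j] += m * (r2 - d[i] * d[i])   # e.g. Ixx
                else:
                    tensor[i][j] -= m * d[i] * d[j]          # e.g. -Ixy
    return tensor


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance from the center of mass (Eq. 8)."""
    com = center_of_mass(coords, masses)
    m_tot = sum(masses)
    s = sum(m * sum((r[k] - com[k]) ** 2 for k in range(3))
            for m, r in zip(masses, coords))
    return math.sqrt(s / m_tot)


# Hypothetical two-atom particle with unit masses (purely geometric case).
coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
masses = [1.0, 1.0]
tensor = inertia_tensor(coords, masses)
rg = radius_of_gyration(coords, masses)
```

A quick consistency check is that the trace of the tensor equals twice the total mass times the squared radius of gyration.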

Isotropic and anisotropic mean square displacement (MSD)

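The mean square displacement of a particle (monomer, molecule, or atom set), i, corresponds to a sum over its constituent atoms. It is defined by:

$$\mathrm{MSD} = \left\langle \sum_{a} \left(\mathbf{r}_a - \mathbf{r}_{a,\mathrm{ref}}\right)^{2} \right\rangle = \left\langle \sum_{a} \left(\Delta\mathbf{r}_a\right)^{2} \right\rangle \qquad \text{(Eq. 10)}$$

where angular brackets indicate the average (mean) over a configuration ensemble, and subscript a is the atom index.

The isotropic mean-square displacement of an atom, a, corresponds to:

$$\left\langle (\Delta\mathbf{r}_a)^{2} \right\rangle = \langle \Delta x_a \Delta x_a \rangle + \langle \Delta y_a \Delta y_a \rangle + \langle \Delta z_a \Delta z_a \rangle = \langle \Delta\mathbf{r}_a \cdot \Delta\mathbf{r}_a \rangle \qquad \text{(Eq. 11)}$$

where ∆xa, ∆ya, ∆za correspond to the Cartesian components of the displacement vector ∆ra.

The “anisotropic mean-square displacement”, or mean-square displacement tensor of an atom, a, is represented by the following 3x3 matrix:

$$\left\langle (\Delta\mathbf{r}_a)^{2} \right\rangle = \begin{pmatrix} \langle \Delta x_a \Delta x_a \rangle & \langle \Delta x_a \Delta y_a \rangle & \langle \Delta x_a \Delta z_a \rangle \\ \langle \Delta y_a \Delta x_a \rangle & \langle \Delta y_a \Delta y_a \rangle & \langle \Delta y_a \Delta z_a \rangle \\ \langle \Delta z_a \Delta x_a \rangle & \langle \Delta z_a \Delta y_a \rangle & \langle \Delta z_a \Delta z_a \rangle \end{pmatrix} = \langle \Delta\mathbf{r}_a \otimes \Delta\mathbf{r}_a \rangle \qquad \text{(Eq. 12)}$$

where ∆xa, ∆ya, ∆za correspond to the Cartesian components of the displacement vector ∆ra. With this definition, the isotropic mean-square displacement corresponds to the trace of the anisotropic mean-square displacement tensor, and the “principal” mean square displacements are obtained by diagonalization of the tensor. They correspond to the maximum mean-square displacement, i.e., the eigenvalues, in three orthogonal directions, given by the eigenvectors.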

In DeCipher, these functions can be defined with the Functions/RMSD command (see the Root mean square displacement (RMSD)heading below).
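To make the relation between the isotropic and anisotropic mean-square displacements concrete, the following Python sketch accumulates the tensor of Eq. 12 for a single atom over a list of frames and takes its trace (Eq. 11). The function names and the four sample positions are hypothetical, and the mean position is assumed as the default reference.

```python
def msd_tensor(frames, reference=None):
    """Anisotropic MSD tensor <dr (x) dr> of one atom over a list of positions."""
    n = len(frames)
    if reference is None:
        # Assumed default: the mean position over the frames.
        reference = tuple(sum(f[k] for f in frames) / n for k in range(3))
    tensor = [[0.0] * 3 for _ in range(3)]
    for f in frames:
        d = [f[k] - reference[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                tensor[i][j] += d[i] * d[j] / n
    return tensor


def isotropic_msd(frames, reference=None):
    """Isotropic MSD, i.e. the trace of the anisotropic tensor."""
    t = msd_tensor(frames, reference)
    return t[0][0] + t[1][1] + t[2][2]


# Hypothetical positions of one atom in four configurations (angstroms).
frames = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
tensor = msd_tensor(frames)
iso = isotropic_msd(frames)
```

Here the atom moves twice as far along y as along x, so the yy component is four times the xx component, which the isotropic value alone would not reveal.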

B-factor

The Debye-Waller factor, Ba, of an atom a is defined by:

$$B_a = \frac{8}{3}\pi^{2} \left\langle (\Delta\mathbf{r}_a)^{2} \right\rangle \qquad \text{(Eq. 13)}$$

The isotropic B-factor of an atom a is obtained with ∆ra taken as the atomic displacement from its mean position during a molecular dynamics trajectory, and < (∆ra)2 > as the isotropic mean square displacement, as defined previously. The anisotropic B-factors in DeCipher are defined by replacing < (∆ra)2 > in the previous equation by the anisotropic mean-square displacement tensor of the atom a.

In addition to an atom based B-factor, a monomer (residue) based B-factor is defined as the average value over the constituent atoms of that residue (or a subset of them, for example the backbone atoms):

$$B_i = \frac{1}{N_i} \sum_{a} B_a \qquad \text{(Eq. 14)}$$

where $N_i$ is the number of atoms included in the average.

Root mean square displacement (RMSD)

The (isotropic) root mean square displacement of a particle i (monomers, molecules or arbitrary sets of atoms) is defined by:


$$\mathrm{RMSD}_i = \sqrt{\left\langle \sum_{a} \left(\mathbf{r}_a - \mathbf{r}_{a,\mathrm{ref}}\right)^{2} \right\rangle} = \sqrt{\left\langle \sum_{a} \left(\Delta\mathbf{r}_a\right)^{2} \right\rangle} \qquad \text{(Eq. 15)}$$

where ∆ra = ra - ra,ref represents atomic displacement vectors, and the sum runs over all atoms in the particle i. The reference configuration can either be (as in displacement vectors definitions) an average over a trajectory, an arbitrary configuration of the system such as the first or previous configuration in a trajectory, or a reference structure such as an X-ray or nuclear magnetic resonance structure.

Kinetic functions

Linear momentum

The linear momentum of a particle, p, is determined by:

$$\mathbf{p} = \sum_{a=1}^{N} m_a \mathbf{v}_a \qquad \text{(Eq. 16)}$$

where ma and va are the mass and velocity vector of atom a, and N is the number of atoms in the particle. The previous equation can also be written as:

$$\mathbf{p} = M\mathbf{v} \qquad \text{(Eq. 17)}$$

where v is the center of mass velocity, and M is the total mass of the particle:

$$M = \sum_{a=1}^{N} m_a \qquad \text{(Eq. 18)}$$



Center of mass velocity

The velocity vector of the center of mass of a particle, v, is determined by:

$$\mathbf{v} = \frac{\sum_{a=1}^{N} m_a \mathbf{v}_a}{M} \qquad \text{(Eq. 19)}$$

Angular momentum

The angular momentum of a molecule, considered as a rigid body, about the center of mass is defined as:

$$\mathbf{L} = \sum_{a=1}^{N} m_a \left(\mathbf{r}_a \times \mathbf{v}_a\right) \qquad \text{(Eq. 20)}$$

where ma is the mass, and ra and va are the position and velocity vectors of atom a relative to the center of mass; and N is the number of atoms in the molecule. The previous equation can also be written as:

$$\mathbf{L} = \mathbf{I}\,\boldsymbol{\omega} \qquad \text{(Eq. 21)}$$

where I is the moment of inertia tensor, and ω the angular velocity vector.

Angular velocity

The angular velocity of a molecule, considered as a rigid body, is defined by inverting Eq. 21:

$$\boldsymbol{\omega} = \mathbf{I}^{-1}\mathbf{L} \qquad \text{(Eq. 22)}$$

Kinetic energy

Total kinetic energy. The total kinetic energy of a particle, $E_k^{\mathrm{total}}$, is defined by:


$$E_k^{\mathrm{total}} = \frac{1}{2} \sum_{a=1}^{N} m_a v_a^{2} \qquad \text{(Eq. 23)}$$

Translational kinetic energy. The translational kinetic energy of a particle is calculated from the center of mass, as:

$$E_k^{\mathrm{trans}} = \frac{1}{2} M v^{2} \qquad \text{(Eq. 24)}$$

where v is the center of mass velocity and M the total mass of the particle.

Rotational kinetic energy. The rotational kinetic energy of a molecule is defined, in matrix form, as:

$$E_k^{\mathrm{rot}} = \frac{1}{2}\, \boldsymbol{\omega}^{t}\, \mathbf{I}\, \boldsymbol{\omega} \qquad \text{(Eq. 25)}$$

where ω is the angular velocity vector and I is the moment of inertia tensor.

Let n be a unit vector in the direction of ω. An alternate form for the rotational kinetic energy is:

$$E_k^{\mathrm{rot}} = \frac{1}{2} I \omega^{2} = \frac{1}{2} \left(\mathbf{n}^{t}\, \mathbf{I}\, \mathbf{n}\right) \omega^{2} \qquad \text{(Eq. 26)}$$

where $I = \mathbf{n}^{t}\mathbf{I}\mathbf{n}$ is a scalar, the moment of inertia about the axis of rotation.

It is convenient to express the inertia tensor in its diagonal form (Eq. 7), in which case the rotational kinetic energy can be written as:

$$E_k^{\mathrm{rot}} = \frac{1}{2} I_a \omega_a^{2} + \frac{1}{2} I_b \omega_b^{2} + \frac{1}{2} I_c \omega_c^{2} \qquad \text{(Eq. 27)}$$

where Ia, Ib and Ic are the principal moments of inertia, and ωa, ωb and ωc are the angular velocity components along the principal axes.
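The kinetic functions of Eqs. 19-25 can be combined into a short Python sketch that splits the kinetic energy of a particle into total, translational, and rotational parts. Everything here is an illustrative assumption rather than DeCipher code: the function names are hypothetical, the angular velocity is obtained by solving I ω = L with Cramer's rule (which requires a non-singular inertia tensor, so not a linear molecule), and units are left arbitrary.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 system m x = b by Cramer's rule."""
    d = det3(m)
    out = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = b[i]
        out.append(det3(mk) / d)
    return out


def kinetic_energies(coords, velocities, masses):
    """Return (E_total, E_trans, E_rot, omega) for one particle."""
    m_tot = sum(masses)
    com = [sum(m * r[k] for m, r in zip(masses, coords)) / m_tot for k in range(3)]
    v_com = [sum(m * v[k] for m, v in zip(masses, velocities)) / m_tot
             for k in range(3)]                                        # Eq. 19
    e_total = 0.5 * sum(m * dot(v, v) for m, v in zip(masses, velocities))  # Eq. 23
    e_trans = 0.5 * m_tot * dot(v_com, v_com)                          # Eq. 24
    ang = [0.0, 0.0, 0.0]
    inertia = [[0.0] * 3 for _ in range(3)]
    for m, r, v in zip(masses, coords, velocities):
        d = [r[k] - com[k] for k in range(3)]
        u = [v[k] - v_com[k] for k in range(3)]
        c = cross(d, u)
        r2 = dot(d, d)
        for i in range(3):
            ang[i] += m * c[i]                                         # Eq. 20
            for j in range(3):
                inertia[i][j] += m * ((r2 if i == j else 0.0) - d[i] * d[j])
    omega = solve3(inertia, ang)                                       # Eq. 22
    e_rot = 0.5 * dot(omega, ang)                                      # Eq. 25, using L = I omega
    return e_total, e_trans, e_rot, omega


# Hypothetical rigid body: six unit masses rotating about z (omega = 2)
# while translating along x with unit velocity.
coords = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
velocities = [(1, 2, 0), (1, -2, 0), (-1, 0, 0), (3, 0, 0), (1, 0, 0), (1, 0, 0)]
masses = [1.0] * 6
e_total, e_trans, e_rot, omega = kinetic_energies(coords, velocities, masses)
```

For a purely rigid motion such as this one, the total kinetic energy is exactly the sum of the translational and rotational parts; any remainder in a real trajectory reflects internal motion.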

Vibrational kinetic energy. The vibrational kinetic energy is simply obtained from the total, translational, and rotational kinetic energy:

$$E_k^{\mathrm{vib}} = E_k^{\mathrm{total}} - E_k^{\mathrm{trans}} - E_k^{\mathrm{rot}} \qquad \text{(Eq. 28)}$$

Electrostatic functions: multipole moments

Monopole

The electric monopole moment, or total charge, Q, of a particle considered as a charge distribution, is simply given by the sum of the partial charges qa of its constituting atoms a:

$$Q = \sum_{a=1}^{N} q_a \qquad \text{(Eq. 29)}$$

Obviously, the monopole is zero for an uncharged particle. For a charged particle, we locate the monopole at the center of positive (resp. negative) charge if the particle is globally positive (resp. negative). In DeCipher, the monopole is available at the residue and molecule level, as “monomer_charge”, and “mol_charge” attributes.

Dipole

The electric dipole moment vector, µ, of a particle considered as a charge distribution, is determined by:

$$\boldsymbol{\mu} = \sum_{a=1}^{N} q_a \mathbf{r}_a \qquad \text{(Eq. 30)}$$

[Figure: the vector R drawn between the center of negative charge (Q-, r-COC) and the center of positive charge (Q+, r+COC).]


where qa is the partial charge and ra is the position vector of atom a in the particle, relative to an arbitrary origin. It can be rewritten in the following form:

$$\boldsymbol{\mu} = 4.802\, Q^{\varepsilon}\, \mathbf{R} \qquad \text{(Eq. 31)}$$

where:

♦ $Q^{\varepsilon} = Q^{+} = |Q^{-}|$ is the sum of positive charges for an uncharged particle, or min (Q+, |Q-|) for a charged particle;

♦ $\mathbf{R} = \mathbf{r}^{+}_{\mathrm{COC}} - \mathbf{r}^{-}_{\mathrm{COC}}$ is the vector between the center of negative charge and the center of positive charge; and

♦ the constant 4.802 represents the conversion factor to obtain results in Debyes.

If the net charge of the particle is zero, the electric dipole moment vector is independent of the origin of the coordinate system. For a charged particle, it is origin dependent, and also has a resulting monopole at the center of charge, either positive or negative, whichever has the larger charge in absolute value, and with a net charge equal to Q+ + Q-. By convention, the position of the origin of a dipole vector of a particle is halfway between the center of positive and negative charges.

Multipolar interaction energy

The total electrostatic interaction energy between two charge distributions (Figure 1) is given by Coulomb’s law:

$$U = \sum_{a} \sum_{b \neq a} 332.054\, \frac{q_a q_b}{r_{ab}} \qquad \text{(Eq. 32)}$$

where U is in kcal/mol, a and b run over the charges of the distribution 1 and 2, respectively, and rab is the distance, in angstroms, between charges qa and qb. The calculation of the electrostatic conversion factor 332.054 is given in Appendix B, Glossary.

If two charge distributions do not overlap, we can decompose Eq. 32 into multipole-multipole constituents Ul1, l2 in which the multipole of order l1 in the distribution 1 interacts with the multipole of order l2 in the distribution 2. The conventional multipole expansion (Gray and Gubbins 1984) of the Coulomb potential is a simple model representing electrostatic interactions at long distances. This model is commonly used to evaluate point-charge models for liquid simulations (Allen and Tildesley 1989), and can be of practical use in interpreting electrostatic interactions between molecular substructural elements. This model has also been generalized to simulate long range interactions of very large systems in the fast multipole method (Greengard and Rokhlin 1989).
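As a concrete illustration of Eqs. 30 and 32, the Python sketch below evaluates a dipole moment in Debyes and a Coulomb interaction energy in kcal/mol from point charges. The two conversion factors are the ones quoted in the text; the function names and the sample charges and coordinates are hypothetical, with charges assumed to be in electron units and distances in angstroms.

```python
import math

DEBYE_PER_E_ANGSTROM = 4.802   # conversion factor quoted with Eq. 31
COULOMB_KCAL = 332.054         # conversion factor quoted with Eq. 32


def dipole_moment(coords, charges, origin=(0.0, 0.0, 0.0)):
    """Dipole vector sum(q_a * r_a) relative to origin, in Debyes (Eq. 30)."""
    return tuple(
        DEBYE_PER_E_ANGSTROM
        * sum(q * (r[k] - origin[k]) for r, q in zip(coords, charges))
        for k in range(3)
    )


def coulomb_energy(coords1, charges1, coords2, charges2):
    """Coulomb energy between two charge distributions, in kcal/mol (Eq. 32)."""
    return sum(
        COULOMB_KCAL * qa * qb / math.dist(ra, rb)
        for ra, qa in zip(coords1, charges1)
        for rb, qb in zip(coords2, charges2)
    )


# Hypothetical neutral pair: +0.5 e and -0.5 e, 1 angstrom apart along x.
pair = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pair_q = [0.5, -0.5]
mu = dipole_moment(pair, pair_q)
mu_shifted = dipole_moment(pair, pair_q, origin=(5.0, 5.0, 5.0))

# Hypothetical unit charges of opposite sign, 2 angstroms apart.
u = coulomb_energy([(0.0, 0.0, 0.0)], [1.0], [(2.0, 0.0, 0.0)], [-1.0])
```

Because the pair is neutral, moving the origin leaves the dipole unchanged, which is the origin independence noted above for uncharged particles.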

In DeCipher, multipole-multipole interaction terms Ul1, l2 are derived by expanding the quantity 1/rab = 1/|rab| = 1/|r + rb - ra| in terms of products of spherical harmonics of the orientations ωa, ωb and ω of ra, rb, and r respectively, in a space-fixed frame of reference:

$$\frac{1}{r_{ab}} = \sum_{l_a l_b l} \; \sum_{m_a m_b m} A_{l_a l_b l}^{m_a m_b m}(r_a, r_b, r)\; Y_{l_a m_a}(\omega_a)\, Y_{l_b m_b}(\omega_b)\, Y_{l m}(\omega) \qquad \text{(Eq. 33)}$$

where A is the expansion coefficient, and Ylm is a spherical harmonic (Rose 1957). The non-overlapping condition is |r| > ra,max + rb,max.

Figure 1. Interaction geometry for charges qa and qb. [Diagram: charges qa and qb at positions ra and rb relative to the origins O1 and O2, which are separated by r; rab joins the two charges.]

For example, the dipole-dipole interaction term U11 (orders l1 = 1 and l2 = 1) is:


$$U_{11} = -\left(\frac{\mu_1 \mu_2}{r^{3}}\right) \left(3 c_1 c_2 - c_{12}\right) \qquad \text{(Eq. 34)}$$

where

♦ r is the distance between the centers of mass of the two interacting charge distributions, O1 and O2 (see Figure 1), and

♦ ci (i = 1 or 2) are the cosines of the angles between dipole moments µi and the radial vector r:

$$c_i = \mathbf{n}_i \cdot \mathbf{n} = \cos\theta_i \cos\theta + \sin\theta_i \sin\theta \cos(\varphi_i - \varphi) \qquad \text{(Eq. 35)}$$

♦ c12 is the cosine of the angle between the dipole moment vectors µ1 and µ2,

$$c_{12} = \mathbf{n}_1 \cdot \mathbf{n}_2 = \cos\theta_1 \cos\theta_2 + \sin\theta_1 \sin\theta_2 \cos(\varphi_1 - \varphi_2) \qquad \text{(Eq. 36)}$$

♦ n1, n2, and n are unit vectors along µ1, µ2, and r. Their orientations are expressed in polar coordinates ω1 = (θ1,φ1), ω2 = (θ2,φ2), and ω = (θ,φ), where

♦ θi and θ are the angles between ni and n, and n and the Z axis, respectively, as defined in Figure 2, and

♦ φi and φ are the rotational angles of ni about n and of n about the Z axis, respectively.

Figure 2. [Diagram: unit vectors n1, n2, and n with the angles θ1, θ2, and θ, drawn in the X, Y, Z frame.]

In general, the inverse distance spherical harmonic expansion (Eq. 33) converges when the non-overlapping condition is satisfied. To investigate how many terms are needed to represent the electrostatic energy to a desired accuracy, the multipole expansion for the known point-charge distribution may be directly compared with the exact electrostatic (Coulomb) potential (Eq. 32). Note that only the lowest non-vanishing multipole is independent of the coordinate system. For example, a set of atoms whose total net charge (monopole) is non-zero will have a dipole moment depending on the origin of the frame of reference.
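The comparison suggested above, multipole term against exact Coulomb sum, can be sketched for the dipole-dipole case. This Python example assumes the standard dipole-dipole form U11 = (µ1 µ2 / r³)(c12 - 3 c1 c2), with dipoles in e·Å and the factor 332.054 of Eq. 32 applied to obtain kcal/mol; the function names and the two neutral charge pairs are hypothetical.

```python
import math

COULOMB_KCAL = 332.054  # conversion factor quoted with Eq. 32


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dipole(coords, charges, origin):
    """Dipole vector sum(q_a * r_a) in e*angstrom, relative to origin."""
    return tuple(
        sum(q * (r[k] - origin[k]) for r, q in zip(coords, charges))
        for k in range(3)
    )


def dipole_dipole_energy(mu1, mu2, r_vec):
    """Dipole-dipole term of the multipole expansion, in kcal/mol."""
    r = math.sqrt(dot(r_vec, r_vec))
    m1 = math.sqrt(dot(mu1, mu1))
    m2 = math.sqrt(dot(mu2, mu2))
    c1 = dot(mu1, r_vec) / (m1 * r)    # cosine between mu1 and r
    c2 = dot(mu2, r_vec) / (m2 * r)    # cosine between mu2 and r
    c12 = dot(mu1, mu2) / (m1 * m2)    # cosine between mu1 and mu2
    return -COULOMB_KCAL * (m1 * m2 / r ** 3) * (3.0 * c1 * c2 - c12)


def coulomb_energy(coords1, charges1, coords2, charges2):
    """Exact point-charge Coulomb energy between two sets, in kcal/mol."""
    return sum(
        COULOMB_KCAL * qa * qb / math.dist(ra, rb)
        for ra, qa in zip(coords1, charges1)
        for rb, qb in zip(coords2, charges2)
    )


# Hypothetical neutral charge pairs, head to tail, 20 angstroms apart along z.
set1 = [(0.0, 0.0, 0.1), (0.0, 0.0, -0.1)]
set2 = [(0.0, 0.0, 20.1), (0.0, 0.0, 19.9)]
q = [0.5, -0.5]
o1, o2 = (0.0, 0.0, 0.0), (0.0, 0.0, 20.0)

u11 = dipole_dipole_energy(dipole(set1, q, o1), dipole(set2, q, o2),
                           tuple(b - a for a, b in zip(o1, o2)))
exact = coulomb_energy(set1, q, set2, q)
```

At this separation the two values agree closely; moving the pairs closer together makes the neglected higher-order terms visible as a growing difference.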

Multipolar interaction forces

The force on an atom a, as a result of interactions with all atoms b, can be derived from the energy expressions:

$$\mathbf{f}_a = \sum_{b} \mathbf{f}_{ab} = -\sum_{b} \nabla_{\mathbf{r}_{ab}} U \qquad \text{(Eq. 37)}$$

Multipole calculations of DeCipher

Multipole interaction energies and forces in DeCipher include the following features:

♦ Determination of the order of the lowest non-vanishing terms for both atom sets,

♦ Calculation of the current distance r, and the non-overlap condition of |r|,

♦ Ability to set the value of r for calculating energies and forces,

♦ Multipole expansion to any order for calculating the interaction energy between two atom sets, and corresponding forces of each atom in the sets,

♦ Ability to compare interaction energies and forces computed from the multipole expansion with the “exact” Coulomb’s potential energy (see Eq. 32 above).

In DeCipher, this functionality is not meant to replace Coulombic interaction energy calculations, but rather to analyze:

♦ how the electrostatic interactions between molecules or molecular substructures can be modeled using multipoles, i.e., to what order to truncate multipolar expansions to model molecular systems, or

♦ for a given order in the multipolar interaction energy expansion, how to best partition a molecular system in substructural elements (sets of atoms representing a charge distribution). This should be particularly important in molecules with strong local dipoles or higher multipoles, or large molecular systems with macro-multipoles.
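The force definition of Eq. 37 can be checked numerically for the exact Coulomb potential of Eq. 32. The following Python sketch computes the analytic force on one atom and compares it with a central finite difference of the energy; the function names, charges, and coordinates are hypothetical, with charges assumed in electron units and distances in angstroms.

```python
import math

COULOMB_KCAL = 332.054  # conversion factor quoted with Eq. 32


def energy_on_atom(ra, qa, coords_b, charges_b):
    """Coulomb energy of atom a with all atoms b, in kcal/mol."""
    return sum(COULOMB_KCAL * qa * qb / math.dist(ra, rb)
               for rb, qb in zip(coords_b, charges_b))


def force_on_atom(ra, qa, coords_b, charges_b):
    """Force on atom a as the negative energy gradient, in kcal/(mol*angstrom)."""
    f = [0.0, 0.0, 0.0]
    for rb, qb in zip(coords_b, charges_b):
        d = [ra[k] - rb[k] for k in range(3)]
        r = math.sqrt(sum(c * c for c in d))
        for k in range(3):
            f[k] += COULOMB_KCAL * qa * qb * d[k] / r ** 3
    return tuple(f)


def numerical_force(ra, qa, coords_b, charges_b, h=1e-5):
    """Central finite-difference estimate of -dU/dr_a."""
    f = []
    for k in range(3):
        plus = list(ra)
        minus = list(ra)
        plus[k] += h
        minus[k] -= h
        f.append(-(energy_on_atom(plus, qa, coords_b, charges_b)
                   - energy_on_atom(minus, qa, coords_b, charges_b)) / (2.0 * h))
    return tuple(f)


# Hypothetical atom a interacting with two atoms b.
ra, qa = (0.3, -0.2, 0.5), 0.4
coords_b = [(2.0, 0.0, 0.0), (-1.0, 1.5, 0.7)]
charges_b = [-0.3, 0.6]

analytic = force_on_atom(ra, qa, coords_b, charges_b)
numeric = numerical_force(ra, qa, coords_b, charges_b)
```

The same finite-difference check applies to forces derived from a truncated multipole expansion, which is one way to judge the order needed for a given accuracy.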

Potential functions

Potential energy and forces

DeCipher provides the means to calculate potential energies and forces for molecules and their substructures, either in isolation or interacting with other species and substructures. The analysis focuses primarily on the potential energy as modelled by molecular forcefields. When analyzing energies at the system level, the program simply uses the simulation’s energy output, read in as trajectory files.

In DeCipher, the potential energy is a multi-dimensional scalar function, whose functional form, and therefore “dimension”, depends on the forcefield. Each forcefield term defines a potential energy “component”. Forcefield functional forms are described hereafter. Potential energy derivatives are also available; their “components” correspond to internals and nonbonds. Forces are represented as vector functions (as negative potential energy gradients) at the atom, residue or molecule level. You can define a DeCipher function for any independent term or arbitrary sum of terms available in a given functional form, or access these components independently through function language subscripts. (See Functions and mathematical expressions in DeCipher in Chapter 3, Languages.)

Following are functional forms for the CVFF, AMBER, and CFF91/CFF forcefields, respectively, and details of the energy terms, derivatives and forces for all the forcefield functional forms.

Consistent valence forcefield (CVFF)

The CVFF functional form of the potential energy is given by:

$$\begin{aligned}
E_{pot} = {} & \sum_{b} D_b \left[1 - e^{-\alpha (b - b_0)}\right]^{2} + \sum_{\theta} H_{\theta} (\theta - \theta_0)^{2} + \sum_{\varphi} H_{\varphi} \left[1 + s \cos(n\varphi)\right] && (1)\ (2)\ (3) \\
& + \sum_{\chi} H_{\chi} \left[1 + \cos(n\chi)\right] + \sum_{b} \sum_{b'} F_{bb'} (b - b_0)(b' - b'_0) + \sum_{\theta} \sum_{\theta'} F_{\theta\theta'} (\theta - \theta_0)(\theta' - \theta'_0) && (4)\ (5)\ (6) \\
& + \sum_{b} \sum_{\theta} F_{b\theta} (b - b_0)(\theta - \theta_0) + \sum_{\varphi} F_{\varphi\theta\theta'} \cos\varphi\, (\theta - \theta_0)(\theta' - \theta'_0) + \sum_{\chi} \sum_{\chi'} F_{\chi\chi'}\, \chi \chi' && (7)\ (8)\ (9) \\
& + \sum \varepsilon \left[ (r^{*}/r)^{12} - 2 (r^{*}/r)^{6} \right] + \sum \frac{q_i q_j}{\varepsilon\, r_{ij}} && (10)\ (11)
\end{aligned} \qquad \text{(Eq. 38)}$$

where:

♦ Terms 1-4 are the diagonal energy terms: bond, valence angles, torsion angles, and out-of-plane (represented by improper dihedral angles) deformations. An alternative for the first term is a quadratic form for bond stretching.

♦ Terms 5-9 are the off-diagonal (or cross) energy terms including couplings between bonds, valence angles, bonds and valence angles, valence angles and torsions, and between out-of-planes.

♦ Terms 10-11 represent the nonbond interactions: the van der Waals interactions with a (6-12) Lennard-Jones potential, and the Coulombic potential.

In CVFF, hydrogen bonds are a natural consequence of the standard van der Waals and electrostatic parameters.
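Two of the CVFF terms lend themselves to a quick numerical illustration: the Morse-type bond term (term 1) and the (6-12) Lennard-Jones term in its r* form (term 10). The Python sketch below assumes the usual squared Morse form; the function names and all parameter values are hypothetical and are not taken from any forcefield file.

```python
import math


def morse_bond(b, d_b, alpha, b0):
    """Morse bond energy D_b * (1 - exp(-alpha * (b - b0)))**2 (term 1)."""
    return d_b * (1.0 - math.exp(-alpha * (b - b0))) ** 2


def lennard_jones_12_6(r, eps, r_star):
    """Van der Waals energy eps * ((r*/r)**12 - 2 * (r*/r)**6) (term 10)."""
    x = (r_star / r) ** 6
    return eps * (x * x - 2.0 * x)


# Hypothetical parameters (kcal/mol and angstroms).
d_b, alpha, b0 = 100.0, 2.0, 1.5
eps, r_star = 0.15, 3.8

e_bond_min = morse_bond(b0, d_b, alpha, b0)          # zero at the reference length
e_bond_far = morse_bond(50.0, d_b, alpha, b0)        # approaches D_b at dissociation
e_vdw_min = lennard_jones_12_6(r_star, eps, r_star)  # well depth -eps at r = r*
```

In this form of the Lennard-Jones term, r* is the position of the minimum and ε the well depth, which makes the parameters easy to read off a plot of the term.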



AMBER forcefield

The functional form of the potential energy, as modelled in the AMBER forcefield, is given in Eq. 39. Its terms can easily be compared to the CVFF terms in Eq. 38.

$$\begin{aligned}
E_{pot} = {} & \sum_{b} K_2 (b - b_0)^{2} + \sum_{\theta} H_{\theta} (\theta - \theta_0)^{2} + \sum_{\varphi} \frac{V_n}{2} \left[1 + \cos(n\varphi - \varphi_0)\right] && (1)\ (2)\ (3) \\
& + \sum_{\chi} H_{\chi} \left[1 + \cos(n\chi)\right] + \sum \varepsilon \left[ (r^{*}/r)^{12} - 2 (r^{*}/r)^{6} \right] + \sum \frac{q_i q_j}{\varepsilon_{ij}\, r_{ij}} + \sum \left[ \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right] && (4)\ (5)\ (6)\ (7)
\end{aligned} \qquad \text{(Eq. 39)}$$

where:

♦ Terms 1-4 represent the internal potential energy associated with bonds, valence angles, and dihedrals (including torsions around bonds and improper angles describing out of plane motions). The AMBER forcefield is a diagonal forcefield, and does not include off-diagonal (cross) terms.

♦ Terms 5 and 6 are the (6-12) van der Waals and electrostatic potential.

♦ Term 7 is a (10-12) van der Waals potential representing hydrogen bonds. This term is not present in either CVFF or CFF91/CFF, where hydrogen bonds result from the nonbond terms (terms 5-6 here), mainly the electrostatic interactions.

The CFF forcefield (Class II forcefield)

The CFF functional form of the potential energy is given in Eq. 40 and Figure 3.

♦ Terms 1-4 represent the diagonal energy terms associated with bond, valence angle, torsion, and out-of-plane deformations.

♦ Terms 5-11 represent various cross terms.



♦ Term 12 is the Coulombic electrostatic potential

♦ Term 13 represents a (6-9) van der Waals potential.

Eq. 40

Epot K2 b b0–( )2 K3 b b0–( )3 K4 b b0–( )4+ +[ ]

b

�=

H2 θ θ0–( )2 H3 θ θ0–( )3 H4 θ θ0–( )4+ +

θ�+

V1 1 φ φ10

–( )cos–[ ] V2 1 2φ φ20

–( )cos–[ ] V3 1 3φ φ30

–( )cos–[ ]+ +[ ]

φ�+

Kχχ2

χ� Fbb ′ b b0–( ) b ′ b ′0–( )

b ′�

b

� Fθθ′ θ θ0–( ) θ′ θ′0–( )

θ′�

θ�+ + +

Fbθ b b0–( ) θ θ0–( )

θ�

b

� b b0–( ) V1 φcos V2 2φcos V3 3φcos+ +[ ]

φ�

b

�+ +

b ′ b ′0–( ) V1 φcos V2 2φcos V3 3φcos+ +[ ]

φ�

b ′�+

θ θ0–( ) V1 φcos V2 2φcos V3 3φcos+ +[ ]

φ�

θ�+

Kφθθ′ φcos θ θ0–( ) θ′ θ′0–( )

θ′�

θ�

φ�

qiqj

εrij---------

i j>�

Aij

rij9

------Bij

rij6

------–

i j>�+ + +

(1)

(2)

(3)

(4) (5) (6)

(7) (8)

(9)

(10)

(11) (12) (13)


Figure 3 Graphic Illustration of Terms in CFF

(Figure not reproduced. Its panels are labeled (1) to (13), matching the numbered terms of Eq. 40.)
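The nonbond terms of the three forcefields differ only in their radial form. The following minimal Python sketch evaluates the (6-12), (10-12) and (6-9) expressions and the Coulombic term exactly as written in Eq. 38 to Eq. 40. The function names and the numerical values are illustrative assumptions, not DeCipher commands or forcefield parameters, and no unit conversion factor is applied to the Coulombic term.

```python
def lennard_jones_12_6(r, eps, r_star):
    """(6-12) van der Waals term: eps * [(r*/r)**12 - 2 * (r*/r)**6]."""
    x = r_star / r
    return eps * (x ** 12 - 2.0 * x ** 6)


def coulomb(r, q_i, q_j, dielectric=1.0):
    """Coulombic term q_i * q_j / (dielectric * r), used literally (no unit factor)."""
    return q_i * q_j / (dielectric * r)


def hydrogen_bond_10_12(r, c_ij, d_ij):
    """AMBER (10-12) hydrogen-bond term: C_ij / r**12 - D_ij / r**10."""
    return c_ij / r ** 12 - d_ij / r ** 10


def van_der_waals_9_6(r, a_ij, b_ij):
    """CFF (6-9) van der Waals term: A_ij / r**9 - B_ij / r**6."""
    return a_ij / r ** 9 - b_ij / r ** 6


if __name__ == "__main__":
    eps, r_star = 0.1, 3.5  # hypothetical well depth and minimum-energy distance
    for r in (3.0, 3.5, 4.0, 5.0):
        # Tabulate the (6-12) term next to a Coulombic term for opposite charges.
        print(r, lennard_jones_12_6(r, eps, r_star), coulomb(r, 0.4, -0.4))
```

In this form the (6-12) term has its minimum at r = r*, where its value is -ε, which is why r* and ε can be read directly as the position and depth of the van der Waals well.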



Energy and derivative expressions

CFF91/CFF

♦ bond

Eq. 41

$$E = f_1 (r_1 - r_{10})^2 + f_2 (r_1 - r_{10})^3 + f_3 (r_1 - r_{10})^4$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10}) + 3 f_2 (r_1 - r_{10})^2 + 4 f_3 (r_1 - r_{10})^3$$

where f1, f2 and f3 are force constants, r1 the bond distance and r10 the bond reference value.

♦ angle

Eq. 42

$$E = f_1 (r_1 - r_{10})^2 + f_2 (r_1 - r_{10})^3 + f_3 (r_1 - r_{10})^4$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10}) + 3 f_2 (r_1 - r_{10})^2 + 4 f_3 (r_1 - r_{10})^3$$

where f1, f2 and f3 are force constants, r1 the angle and r10 the angle reference value.

♦ torsion

Eq. 43

$$E = f_1 \left( 1 - \cos(r_1 - r_{10}) \right) + f_2 \left( 1 - \cos(2 (r_1 - r_{20})) \right) + f_3 \left( 1 - \cos(3 (r_1 - r_{30})) \right)$$

$$\frac{dE}{dr_1} = f_1 \sin(r_1 - r_{10}) + 2 f_2 \sin(2 (r_1 - r_{20})) + 3 f_3 \sin(3 (r_1 - r_{30}))$$

where f1, f2 and f3 are force constants, r1 the torsion and r10, r20 and r30 reference torsions (set to zero in CFF).

♦ out of plane

The Wilson out of plane angle (radians) is used, which is given by the average of three angles. Each angle is between a vector and a plane, where the vector is from the out of plane atom to an atom bonded to it, and the plane is formed by the out of plane atom and the other two atoms bonded to the out of plane atom.

Eq. 44

$$E = f_1 (r_1 - r_{10})^2$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 the Wilson out of plane angle, and r10 the reference angle (set to zero in CFF).

♦ nonbond

Eq. 45

$$E = \frac{2 (\varepsilon_i \varepsilon_j)^{1/2} \, r_i^3 r_j^3}{r_i^6 + r_j^6} \left\{ 2 \left[ \frac{\left( \frac{r_i^6 + r_j^6}{2} \right)^{1/6}}{r} \right]^{9} - 3 \left[ \frac{\left( \frac{r_i^6 + r_j^6}{2} \right)^{1/6}}{r} \right]^{6} \right\}$$

$$\frac{dE}{dr} = \frac{2 (\varepsilon_i \varepsilon_j)^{1/2} \, r_i^3 r_j^3}{r_i^6 + r_j^6} \left\{ -\frac{18}{r} \left[ \frac{\left( \frac{r_i^6 + r_j^6}{2} \right)^{1/6}}{r} \right]^{9} + \frac{18}{r} \left[ \frac{\left( \frac{r_i^6 + r_j^6}{2} \right)^{1/6}}{r} \right]^{6} \right\}$$

where εi and εj are the ε values for atom i and atom j, ri and rj the radii for atom i and atom j, and r the distance between atoms i and j.

♦ bond-bond

Eq. 46

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the bond distances and r10 and r20 the reference values for the bonds.

♦ bond-angle

Eq. 47

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the bond distance and angle, and r10 and r20 the reference values for the bond and angle.

♦ angle-angle

Eq. 48

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the angles and r10 and r20 the reference values for the angles.
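The CFF nonbond expression of Eq. 45 combines a sixth-power mean of the atomic radii with a matching well depth for the pair. The following Python sketch evaluates the energy and its analytical derivative and compares the derivative with a central finite difference. It is a minimal illustration under stated assumptions: the function names and the parameter values are hypothetical, not taken from a forcefield file.

```python
import math


def cff_pair_parameters(eps_i, r_i, eps_j, r_j):
    """Pair well depth and pair radius implied by Eq. 45."""
    r_ij = ((r_i ** 6 + r_j ** 6) / 2.0) ** (1.0 / 6.0)
    eps_ij = 2.0 * math.sqrt(eps_i * eps_j) * r_i ** 3 * r_j ** 3 / (r_i ** 6 + r_j ** 6)
    return eps_ij, r_ij


def cff_nonbond(r, eps_i, r_i, eps_j, r_j):
    """Return (E, dE/dr) of the (6-9) nonbond term at separation r."""
    eps_ij, r_ij = cff_pair_parameters(eps_i, r_i, eps_j, r_j)
    x = r_ij / r
    energy = eps_ij * (2.0 * x ** 9 - 3.0 * x ** 6)
    derivative = eps_ij * (-18.0 / r * x ** 9 + 18.0 / r * x ** 6)
    return energy, derivative


if __name__ == "__main__":
    params = (0.04, 3.8, 0.09, 4.4)  # hypothetical eps_i, r_i, eps_j, r_j
    r, h = 4.5, 1.0e-6
    energy, analytic = cff_nonbond(r, *params)
    # Central finite difference as an independent check of the derivative.
    numeric = (cff_nonbond(r + h, *params)[0] - cff_nonbond(r - h, *params)[0]) / (2.0 * h)
    print(energy, analytic, numeric)
```

The same finite-difference comparison can be applied to any of the energy and derivative pairs in this section, which is a convenient way to check a hand-coded term.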


♦ bond-bond_13

Eq. 49

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the bond distances, and r10 and r20 the reference values for the bonds.

♦ bond-torsion

Eq. 50

$$E = (r_1 - r_{10}) \left( f_1 \cos(r_2) + f_2 \cos(2 r_2) + f_3 \cos(3 r_2) \right)$$

$$\frac{dE}{dr_1} = f_1 \cos(r_2) + f_2 \cos(2 r_2) + f_3 \cos(3 r_2)$$

$$\frac{dE}{dr_2} = -(r_1 - r_{10}) \left( f_1 \sin(r_2) + 2 f_2 \sin(2 r_2) + 3 f_3 \sin(3 r_2) \right)$$

where f1, f2 and f3 are force constants, r1 the bond distance, r10 the reference value for the bond, and r2 the torsion.

♦ bond-torsion_end

Eq. 51

$$E = (r_1 - r_{10}) \left( f_1 \cos(r_2) + f_2 \cos(2 r_2) + f_3 \cos(3 r_2) \right)$$

$$\frac{dE}{dr_1} = f_1 \cos(r_2) + f_2 \cos(2 r_2) + f_3 \cos(3 r_2)$$

$$\frac{dE}{dr_2} = -(r_1 - r_{10}) \left( f_1 \sin(r_2) + 2 f_2 \sin(2 r_2) + 3 f_3 \sin(3 r_2) \right)$$

where f1, f2 and f3 are force constants, r1 the bond distance, r10 the reference value for the bond, and r2 the torsion.

♦ angle-torsion

Eq. 52

$$E = (r_1 - r_{10}) \left( f_1 \cos(r_2) + f_2 \cos(2 r_2) + f_3 \cos(3 r_2) \right)$$

$$\frac{dE}{dr_1} = f_1 \cos(r_2) + f_2 \cos(2 r_2) + f_3 \cos(3 r_2)$$

$$\frac{dE}{dr_2} = -(r_1 - r_{10}) \left( f_1 \sin(r_2) + 2 f_2 \sin(2 r_2) + 3 f_3 \sin(3 r_2) \right)$$

where f1, f2 and f3 are force constants, r1 the angle, r10 the reference value for the angle, and r2 the torsion.

♦ angle-angle-torsion

Eq. 53

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20}) \cos(r_3)$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20}) \cos(r_3)$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10}) \cos(r_3)$$

$$\frac{dE}{dr_3} = -f_1 (r_1 - r_{10})(r_2 - r_{20}) \sin(r_3)$$

where f1 is the force constant, r1 and r2 the angles, r10 and r20 the angle reference values, and r3 the torsion.

CVFF

♦ bond

Eq. 54

$$E = f_1 (r_1 - r_{10})^2$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10})$$

where f1 is a force constant, r1 the bond distance, and r10 the bond reference value.

♦ angle

Eq. 55

$$E = f_1 (r_1 - r_{10})^2$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10})$$

where f1 is a force constant, r1 the angle, and r10 the angle reference value.

♦ torsion

Eq. 56

$$E = f_1 \left( 1 - \cos(n_r r_1 - r_{10}) \right)$$

$$\frac{dE}{dr_1} = n_r f_1 \sin(n_r r_1 - r_{10})$$

where f1 is a force constant, r1 the torsion, r10 the reference torsion and nr an integer.

♦ out of plane

The improper torsion definition is used, which depends on atom ordering and so is not as well defined as the Wilson out of plane.

Eq. 57

$$E = f_1 \left( 1 - \cos(n_r r_1 - r_{10}) \right)$$

$$\frac{dE}{dr_1} = n_r f_1 \sin(n_r r_1 - r_{10})$$

where f1 is a force constant, r1 the torsion, r10 the reference torsion and nr an integer.

♦ nonbond



Eq. 58

$$E = \frac{(a_i a_j)^{1/2}}{r^{12}} - \frac{(b_i b_j)^{1/2}}{r^{6}}$$

$$\frac{dE}{dr} = -\frac{12 (a_i a_j)^{1/2}}{r^{13}} + \frac{6 (b_i b_j)^{1/2}}{r^{7}}$$

where ai, bi and aj, bj are coefficients for atom i and atom j, and r the distance between atoms i and j.

♦ bond-bond

Eq. 59

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the bond distances, and r10 and r20 the reference values for the bonds.

♦ bond-angle

Eq. 60

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the bond distance and angle, and r10 and r20 the reference values for the bond and angle.

♦ angle-angle

Eq. 61

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the angles, and r10 and r20 the reference values for the angles.

♦ out of plane-out of plane

Eq. 62

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20})$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20})$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10})$$

where f1 is the force constant, r1 and r2 the out of plane angles, and r10 and r20 the reference values for these angles.

♦ angle-angle-torsion



Eq. 63

$$E = f_1 (r_1 - r_{10})(r_2 - r_{20}) \cos(r_3)$$

$$\frac{dE}{dr_1} = f_1 (r_2 - r_{20}) \cos(r_3)$$

$$\frac{dE}{dr_2} = f_1 (r_1 - r_{10}) \cos(r_3)$$

$$\frac{dE}{dr_3} = -f_1 (r_1 - r_{10})(r_2 - r_{20}) \sin(r_3)$$

where f1 is the force constant, r1 and r2 the angles, r10 and r20 the angle reference values, and r3 the torsion.

AMBER

♦ bond

Eq. 64

$$E = f_1 (r_1 - r_{10})^2$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10})$$

where f1 is a force constant, r1 the bond distance, and r10 the bond reference value.

♦ angle

Eq. 65

$$E = f_1 (r_1 - r_{10})^2$$

$$\frac{dE}{dr_1} = 2 f_1 (r_1 - r_{10})$$

where f1 is a force constant, r1 the angle, and r10 the angle reference value.

♦ torsion


Eq. 66

$$E = f_1 \left( 1 - \cos(r_1 - r_{10}) \right) + f_2 \left( 1 - \cos(2 (r_1 - r_{20})) \right) + f_3 \left( 1 - \cos(3 (r_1 - r_{30})) \right)$$

$$\frac{dE}{dr_1} = f_1 \sin(r_1 - r_{10}) + 2 f_2 \sin(2 (r_1 - r_{20})) + 3 f_3 \sin(3 (r_1 - r_{30}))$$

where f1, f2 and f3 are force constants, r1 the torsion, and r10, r20 and r30 reference torsions.

♦ out of plane

The improper torsion definition is used, which depends on atom ordering and so is not as well defined as the Wilson out of plane.

Eq. 67

$$E = f_1 \left( 1 - \cos(n_r r_1 - r_{10}) \right)$$

$$\frac{dE}{dr_1} = n_r f_1 \sin(n_r r_1 - r_{10})$$

where f1 is a force constant, r1 the torsion, r10 the reference torsion, and nr an integer.

♦ nonbond

Eq. 68

$$E = (\varepsilon_i \varepsilon_j)^{1/2} \left[ \left( \frac{r_i + r_j}{2r} \right)^{12} - 2 \left( \frac{r_i + r_j}{2r} \right)^{6} \right]$$

$$\frac{dE}{dr} = (\varepsilon_i \varepsilon_j)^{1/2} \left[ -\frac{12}{r} \left( \frac{r_i + r_j}{2r} \right)^{12} + \frac{12}{r} \left( \frac{r_i + r_j}{2r} \right)^{6} \right]$$

where εi, ri and εj, rj are the coefficients for atom i and atom j, and r the distance between atoms i and j.

♦ hydrogen bond



Eq. 69

$$E = \frac{a}{r^{12}} - \frac{b}{r^{6}}$$

$$\frac{dE}{dr} = -\frac{12 a}{r^{13}} + \frac{6 b}{r^{7}}$$

where a and b are coefficients for an atom pair, and r the distance between the two atoms.

Force calculations

Bond  The bond distance in angstroms (Å) between two points is given by:

Eq. 70

$$\mathrm{bond} = \left[ (X_1(x) - X_2(x))^2 + (X_1(y) - X_2(y))^2 + (X_1(z) - X_2(z))^2 \right]^{1/2}$$

where X1 and X2 are the Cartesian coordinates of point 1 and point 2.

The derivatives of bond with respect to the Cartesian coordinates of point 1 are given by:

Eq. 71

$$\frac{\partial\,\mathrm{bond}}{\partial x} = \frac{X_1(x) - X_2(x)}{\mathrm{bond}}$$

$$\frac{\partial\,\mathrm{bond}}{\partial y} = \frac{X_1(y) - X_2(y)}{\mathrm{bond}}$$

$$\frac{\partial\,\mathrm{bond}}{\partial z} = \frac{X_1(z) - X_2(z)}{\mathrm{bond}}$$

The derivatives with respect to the coordinates of point 2 have the opposite sign.

Angle  The angle (radians) formed by three points is given by:


Eq. 72

$$\mathrm{angle} = \arccos\left( \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} \right)$$

where vector a is from the central atom of an angle to an end atom, and vector b is from the central atom to the other end atom.

The derivatives of angle with respect to Cartesian coordinates are given by:

Eq. 73

$$\frac{\partial\,\mathrm{angle}}{\partial x} = -\frac{1}{\sin(\mathrm{angle})} \left[ \frac{b(x)}{|\mathbf{a}|\,|\mathbf{b}|} - \frac{\cos(\mathrm{angle})\, a(x)}{|\mathbf{a}|^2} \right]$$

for the atom at X1,

Eq. 74

$$\frac{\partial\,\mathrm{angle}}{\partial x} = -\frac{1}{\sin(\mathrm{angle})} \left[ \frac{a(x)}{|\mathbf{a}|\,|\mathbf{b}|} - \frac{\cos(\mathrm{angle})\, b(x)}{|\mathbf{b}|^2} \right]$$

for the atom at X3, and for the central atom of the angle, X2, it is the negative of the sum of the above two expressions, because both a and b are measured from X2.

Derivatives with respect to y and z differ only in the components of the vectors selected.

Torsion  The torsion (radians) formed by four points is given by:

Eq. 75

$$\mathrm{torsion} = \arccos\left( \frac{(\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{b})}{|\mathbf{a} \times \mathbf{b}|\,|\mathbf{c} \times \mathbf{b}|} \right)$$

where vector a is from the second atom of a torsion to an end atom, vector b is from the second atom to the third, and vector c is from the other end atom to the third atom.

The derivatives of torsion with respect to Cartesian coordinates are given by:



Eq. 76

$$\frac{\partial\,\mathrm{torsion}}{\partial x} = \frac{\mathbf{a} \times \mathbf{b}}{|\mathbf{a} \times \mathbf{b}|\,|\mathbf{a}| \sin(\mathrm{angle}_{12})}$$

for the atom at X1, where angle12 is the angle between a and b,

Eq. 77

$$\frac{\partial\,\mathrm{torsion}}{\partial x} = -\left[ \frac{|\mathbf{b}| - |\mathbf{a}| \cos(\mathrm{angle}_{12})}{|\mathbf{a}|\,|\mathbf{b}| \sin(\mathrm{angle}_{12})} \right] \frac{\mathbf{a} \times \mathbf{b}}{|\mathbf{a} \times \mathbf{b}|} - \cos(\mathrm{angle}_{23}) \left[ \frac{\mathbf{c} \times \mathbf{b}}{|\mathbf{c} \times \mathbf{b}|\,|\mathbf{c}| \sin(\mathrm{angle}_{23})} \right]$$

for the atom at X2, where angle23 is the angle between b and c,

Eq. 78

$$\frac{\partial\,\mathrm{torsion}}{\partial x} = -\left[ \frac{|\mathbf{b}| - |\mathbf{c}| \cos(\mathrm{angle}_{23})}{|\mathbf{c}|\,|\mathbf{b}| \sin(\mathrm{angle}_{23})} \right] \frac{\mathbf{c} \times \mathbf{b}}{|\mathbf{c} \times \mathbf{b}|} - \cos(\mathrm{angle}_{12}) \left[ \frac{\mathbf{a} \times \mathbf{b}}{|\mathbf{a} \times \mathbf{b}|\,|\mathbf{b}| \sin(\mathrm{angle}_{12})} \right]$$

for the atom at X3,

Eq. 79

$$\frac{\partial\,\mathrm{torsion}}{\partial x} = \frac{\mathbf{c} \times \mathbf{b}}{|\mathbf{c} \times \mathbf{b}|\,|\mathbf{c}| \sin(\mathrm{angle}_{23})}$$

for the atom at X4.

Derivatives with respect to y and z differ only in the components of the vectors selected.

Wilson out of plane

Eq. 80

$$\mathrm{angle} = \frac{1}{3} \left[ \frac{(\mathbf{c} \times \mathbf{b}) \cdot \mathbf{a}}{|\mathbf{c} \times \mathbf{b}|\,|\mathbf{a}|} + \frac{(\mathbf{a} \times \mathbf{c}) \cdot \mathbf{b}}{|\mathbf{a} \times \mathbf{c}|\,|\mathbf{b}|} + \frac{(\mathbf{b} \times \mathbf{a}) \cdot \mathbf{c}}{|\mathbf{b} \times \mathbf{a}|\,|\mathbf{c}|} \right]$$

Restraint function

A restraint or restraint energy function is a harmonic function of user specified distances, angles and dihedral angles formed by any two, three and four atoms respectively. These do not need to be internals; they can be absolutely arbitrary. For example, they could be associated with a set of C-alpha pseudo bonds, angles and dihedrals in a protein backbone chain. This function represents a pseudo-energy, and its functional form is harmonic:

Eq. 81

$$E = \sum_{i=1}^{N} K_i (R_i - R_{0,i})^2$$

The “parameters”, i.e., the “force constants” K and “reference values” R0 associated with N interatomic distances, angles or dihedral angles, are set by you and read into the program through a “restraint file”.

A restraint energy function may be used exactly as any potential energy function, or in addition to it. This is particularly useful for restrained energy minimizations.

Function optimization

Molecular structures or substructures may be optimized through the minimization of:

♦ a potential energy function,

♦ a restraint function,

♦ or a combination of these.

When used in combination, one of the functions may be scaled: let E1 be a potential energy function, and E2 be a restraint function. One can minimize the following function:

E = E1 + sE2 Eq. 82

where s is a scale factor, equal to 1.0 by default, and E1, E2 are any combination of potential or restraint functions.

Three basic methods are provided for function minimization:

♦ steepest descent: this method should be used in particular when trying to minimize a very “bad” starting structure,

♦ Quasi Newton-Raphson: a method of choice to minimize small molecules,

♦ conjugate gradient: a method of choice to minimize larger molecular systems,

as well as a combination of the first two, named:

♦ Automatic: when the gradient g becomes smaller than a given RMS gradient tolerance, the Newton-Raphson method is automatically invoked.

High-level molecular properties

Several structural and dynamic properties can be represented by high-level mathematical functions of molecular variables, and obtained from molecular dynamics simulations. These properties are reviewed hereafter, and can be calculated using commands in the Spectra pulldown in DeCipher. Historically, most of these properties have been derived for condensed phase systems, in particular molecular liquids (Allen and Tildesley 1989). They can be applied to study macromolecules either in isolation or in solution (Brooks, Karplus and Pettitt 1988). Macromolecules such as proteins, for example, can be considered as a micro-condensed system, and their structural and dynamic properties can be studied with the mathematical functions defined hereafter.

Distribution functions

A generalized expression for radial distribution functions gij = g(rij, ωi, ωj) depends on:

♦ radial distances, rij, between two particles (atoms or centroids) i and j, and

♦ orientations ωi and ωj of associated property vectors Ai and Aj.

In DeCipher, we classify radial distribution functions into pair distribution functions, where one is interested in averages over a large number of pairs of particles, and spherical and cylindrical distribution functions. With the latter, one can study the structure (positions and orientations) of particles around a particular unique center or around a particular cylindrical axis.

The function gij is proportional to the probability of finding two quantities at a distance rij apart and having orientations ωi and ωj. A number of different approaches which give “projections” of gij have been developed (Gray and Gubbins 1984), and may be related to observable properties. Among these projections are the purely radial distributions g(r). For example, a series of neutron diffraction experiments on isotopically substituted molecules can be expressed as an atom-atom pair distribution function g(r) (Powles 1973; Soper and Egelstaff 1980). Various other equilibrium thermodynamic and structural properties are related to a spherical harmonics expansion of g(rij, ωi, ωj) (Gray and Henderson 1987, 1979; Allen and Tildesley 1989).

In the next paragraphs, we will describe:

♦ radial distribution functions, g(r), of particle-particle pairs, central point-particles, and axis-particles, i.e., pair, spherical and cylindrical radial distribution functions;

♦ orientational radial distribution functions, g(rij, ωi, ωj), for analyzing orientational distributions of ensembles of particle-centered vectors, a spherical radial vector to particle-centered vectors, and a cylindrical axis to particle-centered vectors, respectively, i.e., pair, spherical and cylindrical orientational radial distribution functions.

Radial distribution functions (RadialDistFunc)

Pair distribution functions

Pair distribution functions, also called pair correlation functions, are of fundamental importance to the theory of equilibrium properties for molecular fluids (Egelstaff, Gray, and Gubbins 1975). A pair distribution function (PDF), g(rij), describes the probability of finding a particle of type j at a distance between r and r + dr from particles of type i as a function of the i-j separation distance r. Pair distribution function calculations have been applied to investigate modifications of the solvent structure induced by the presence of a solute, and to study structural modifications of dense systems during molecular dynamics simulations. PDFs can serve broader purposes: by using weight factors for residues, such as hydrophobicities, hydrophobicity contrast functions have been used to search for metal binding sites in proteins (Yamashita et al. 1990). In DeCipher, the distribution of nonbond interaction energy over the separation distance of atom pairs can be studied by using the van der Waals, Coulombic, or total nonbond interaction energy between atom pairs as the weight factors.

The pair distribution function may be expressed as:

Eq. 83

$$g_{ij}(r) = \frac{d \langle N_{ij}(r) \rangle}{\rho_{ij}\, dV(r)}$$

where:

♦ d⟨Nij(r)⟩ is the average number of i (or j) particles at a distance between r - δr and r from a particle i (or j),

♦ ρij is the bulk density of type i and j particles, and

♦ d⟨Nij(r)⟩/dV(r) is the average local density of type i and j particles in the shell volume, dV(r), between r - δr and r from a type i or j particle.

The divisor normalizes the distribution so that gij = 1 when d⟨Nij(r)⟩/dV(r) is the same as the bulk density. Thus, the pair distribution function represents the ratio of local density and bulk density for type j particles around type i particles.

In DeCipher, the bulk density is taken as ρij = 1 by convention for non periodic molecular systems, but for a periodic system’s cell, it can be calculated as:

Eq. 84

$$\rho_{ij} = \frac{N_i + N_j}{V_{cell}}$$

In this calculation, g(r) can also be partitioned into inter- and intramolecular contributions. To explore the structure of a system during a dynamics simulation, options are provided for calculating g(r) for an individual configuration, or for a set of configurations (trajectory frames), either individually or as an average. When calculated individually for each of the configurations, they can be plotted as a 3D graph, (r, g(r), frame number).
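The ratio of local density to bulk density in Eq. 83 can be estimated by histogramming pair distances. The following Python sketch does this for a single configuration of identical particles in a cubic periodic cell. It is a minimal illustration under stated assumptions, not the DeCipher implementation: the function name is hypothetical, a single particle type is assumed, the bulk density is taken as N / Vcell, and the minimum-image convention is used for the periodic cell.

```python
import math
import random


def pair_distribution(coords, box, r_max, n_bins):
    """Histogram estimate of g(r) for identical particles in a cubic periodic cell."""
    if r_max > box / 2.0:
        raise ValueError("r_max must not exceed half the cell length")
    n = len(coords)
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(coords[i], coords[j]):
                delta = a - b
                delta -= box * round(delta / box)  # minimum-image convention
                d2 += delta * delta
            r = math.sqrt(d2)
            if r < r_max:
                k = min(int(r / dr), n_bins - 1)
                counts[k] += 2  # each pair is seen from both of its particles
    rho = n / box ** 3  # bulk density (assumed single-species convention)
    centers, g = [], []
    for k in range(n_bins):
        r_lo, r_hi = k * dr, (k + 1) * dr
        shell = 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)  # shell volume dV(r)
        centers.append(0.5 * (r_lo + r_hi))
        g.append(counts[k] / (n * rho * shell))
    return centers, g


if __name__ == "__main__":
    random.seed(0)
    box = 10.0
    coords = [tuple(random.uniform(0.0, box) for _ in range(3)) for _ in range(200)]
    for r, value in zip(*pair_distribution(coords, box, 5.0, 20)):
        print(round(r, 3), round(value, 3))
```

For uncorrelated random positions, as in the usage example, the estimate fluctuates around 1 at all distances, which is the normalization property described above. Averaging the histogram counts over trajectory frames before normalizing gives the averaged g(r).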

Spherical and cylindrical distribution functions

A spherical radial distribution function gives the probability of finding particles within a spherical shell around a given center. This is a special case of pair distribution functions, where we look at all pairwise distances to the same center, be it an atom, a centroid, or an arbitrary geometrical point in space. A cylindrical radial distribution function gives the probability of finding particles within a cylindrical shell around a given axis.

Coordination numbers

The coordination number, Nij(r), represents the number of type i particles around a type j particle plus the number of type j particles around a type i particle within a distance r. The average coordination number is expressed as

Eq. 85

$$\langle N_{ij}(r) \rangle = \rho_{ij} \int_0^r g(r)\, 4\pi r^2\, dr$$

In practice, the average coordination number, ⟨N(R)⟩, at a given R is obtained by counting

⟨Nij(R)⟩ = (Ni (r ≤ R) + Nj (r ≤ R)) Eq. 86

where Ni (r ≤ R) and Nj (r ≤ R) are the number of type i particles (i.e., the coordination number of a type j particle) and of type j particles (i.e., the coordination number of a type i particle), individually, that constitute the i-j pairs whose separations are less than or equal to R.

For spherical and cylindrical radial distribution functions, however, the coordination number at a given R is simply the number of particles within the radius R from the center of the sphere or the cylinder axis

N(R) = N (r ≤ R) Eq. 87

(See Spectra/RadialDistFunc in Chapter 4, Command Summary)



Coordination number distribution function (CN_Distrib)

Running coordination numbers N(r) are average quantities over all pairs of atoms or particle centroids, and it is useful to look at their distributions for a given radial distance r.

In DeCipher, the coordination number distribution is represented as a 2D plot (n(r) vs. the number of particles; in essence, a histogram).

For example, the average coordination number N(r) is five for intermolecular oxygen-oxygen atom pairs in a water solvation shell of 3 Å. However, the coordination number distribution shows that the most probable solvation (coordination) number is four, while some oxygen atoms have higher coordination numbers such as 5, 6, or 7, and a few have lower ones. (For an application of this type see Corongiu and Clementi, 1993).

(See Spectra/CN_Distrib function in Chapter 4, Command Summary)
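The difference between an average coordination number and its distribution can be illustrated by counting neighbors directly. The following Python sketch is a minimal illustration for a non periodic set of points; the function names are hypothetical and this is not the DeCipher implementation.

```python
import math
from collections import Counter


def coordination_numbers(centers, neighbors, cutoff):
    """Number of neighbor points within `cutoff` of each center point."""
    numbers = []
    for center in centers:
        count = 0
        for point in neighbors:
            # Skip the center itself when both lists hold the same objects.
            if point is not center and math.dist(center, point) <= cutoff:
                count += 1
        numbers.append(count)
    return numbers


def coordination_distribution(numbers):
    """Histogram: coordination number -> how many particles have it."""
    return dict(sorted(Counter(numbers).items()))


if __name__ == "__main__":
    # Four points on a line, 1 unit apart (an illustrative toy system).
    points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    numbers = coordination_numbers(points, points, cutoff=1.5)
    print(numbers)
    print(sum(numbers) / len(numbers))
    print(coordination_distribution(numbers))
```

In the toy system the end points have one neighbor and the inner points have two, so the average of 1.5 is a value that no individual particle actually takes. This is the kind of information that the histogram adds to the running coordination number.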

Orientational radial distribution function (Orient_RDF)

Spherical harmonic expansion

The spherical harmonic expansion for a pair of linear molecules or vectors is traditionally written as:

Eq. 88

$$g(r_{ij}, \omega_i, \omega_j) = 4\pi \sum_{l_i l_j m} g_{l_i l_j m}(r_{ij})\, Y_{l_i m}(\omega_i)\, Y_{l_j \bar{m}}(\omega_j)$$

where the Ylm(ω) are spherical harmonics (Rose 1957) and $\bar{m} = -m$. The orientations are measured relative to the vector rij. The coefficients $g_{l_i l_j m}$ are evaluated by averaging a product of spherical harmonics over a spherical shell around a particle. Some observable properties are related to spherical harmonics coefficients, as described in the following.

Orientational pair distribution functions

The angular correlation parameter of rank l, gl, may be expressed as:

Eq. 89

$$g_l = \left\langle \sum_{i} \sum_{j \ne i} P_l(\cos\theta_{ij}) \right\rangle = \langle P_l(\cos\theta_{ij}) \rangle$$

where

♦ Pl(cosθij) is the Legendre polynomial of order l,

♦ θij is the angle between the vectors i and j.

The first order correlation coefficient g1 may be related to dielectric properties of polar molecules. The rotational order parameter (Vieillard-Baron 1972) may be investigated by g1, while the second order coefficient g2 can be related to depolarized light scattering.

The orientational radial distribution function is then in the form:

gl(r) = ⟨Pl(cosθ)⟩r,r + δr Eq. 90

where

♦ r is the length of the radial vector, rij, between a pair of vectors i and j, and

♦ θ may be θij, the angle between vectors i and j, θi, the angle between rij and the vector i, or θj, the angle between rij and vector j.

The brackets in Eq. 90 indicate an average over all pairs of vectors whose radial distance is within r - δr and r. These distribution functions give insight into the distance dependence of orientational order, and are of importance for studying systems such as liquid crystals.

Spherical orientational distribution function

The spherical orientational radial distribution function represents the average orientation of particle-centered vectors within a spherical shell of radius r from a given center. The orientation is defined by the Legendre polynomial of the scalar product of property unit vectors and the radial unit vector from the center of the sphere to the particle. These distribution functions may be useful to study molecular systems or subsystems with spherical symmetry, such as solute-solvent interactions, for example to study the polarization of water molecules in a spherical shell around an ion or a polar side chain in a protein or peptide. It could also be of use in the study of lipid arrangements in liposomes.

Cylindrical orientational distribution function

The cylindrical orientational radial distribution function represents the average orientation of particle-centered vectors within a cylindrical shell of radius r from a given axis. The orientation is defined by the Legendre polynomial of the scalar product of the unit vectors of the property vectors and the unit vector of the cylinder axis. These distribution functions may be useful to study molecules of cylindrical shape, such as DNA, or molecular substructures such as helices in proteins and lipids in bilayers.

(See Spectra/OrientRDF in Chapter 4, Command Summary)
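The pair form of the orientational radial distribution function, gl(r) = ⟨Pl(cosθij)⟩ averaged over the pairs found in each distance shell, can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, only l = 1 and l = 2 are supported, the system is non periodic, and empty shells are reported as None.

```python
import math


def legendre(order, x):
    """First- or second-order Legendre polynomial P_l(x)."""
    if order == 1:
        return x
    if order == 2:
        return 0.5 * (3.0 * x * x - 1.0)
    raise ValueError("only orders 1 and 2 are supported in this sketch")


def orientational_rdf(positions, vectors, order, r_max, n_bins):
    """Average P_l(cos theta_ij) over all pairs in each distance shell."""
    dr = r_max / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(positions) - 1):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            if r >= r_max:
                continue
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            norm = math.hypot(*vectors[i]) * math.hypot(*vectors[j])
            k = min(int(r / dr), n_bins - 1)
            sums[k] += legendre(order, dot / norm)
            counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


if __name__ == "__main__":
    positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    vectors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # perpendicular property vectors
    print(orientational_rdf(positions, vectors, 2, r_max=2.0, n_bins=2))
```

For perpendicular vectors P2 is -0.5, for parallel or antiparallel vectors it is 1, and for an isotropic distribution of orientations the shell average tends to 0. The spherical and cylindrical variants would replace one of the two property vectors by the radial unit vector or the axis unit vector.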

Angular distribution function (AngleDistFunc)

The angular distribution function is measured by

Eq. 91

$$g(\theta) = \frac{N(\cos\theta)}{N}$$

where N(cosθ) is the number of pairs of vectors whose angle is within the interval θ - δθ and θ, and N is the total number of vector pairs.

(See Spectra/AngleDistFunc in Chapter 4, Command Summary)

Mean square displacement (MeanSqDisp) and diffusion coefficient

The self-diffusion coefficient, D, of a particle can be obtained directly from the slope of the mean square displacement (MSD) correlation function. The MSD gives the mean square displacement of particle positions vs. time. The self-diffusion coefficient is expressed as:


Eq. 92

$$D = \frac{1}{6N} \lim_{t \to \infty} \frac{d}{dt} \sum_{i=1}^{N} \left\langle [r_i(t) - r_i(0)]^2 \right\rangle$$

where ri(t) is the position vector of particle i, N is the number of particles in the specified subset, and ⟨[ri(t) - ri(0)]²⟩ is the average of the mean square displacement of particle i over all choices of time origin within a dynamics trajectory (see Figure 4 for details).

Figure 4. Time Intervals

(Figure not reproduced. It shows a time axis marked t0, t1, t2, t3, t4, t5, …, tn.)

For n discrete frames in a trajectory, the MSD, ⟨[ri(t) - ri(0)]²⟩, of frame τ + j from frame j (the time interval between frames τ + j and j is t) is actually calculated by:

Eq. 93

$$\left\langle [r_i(\tau) - r_i(0)]^2 \right\rangle = \frac{1}{\tau_{max}} \sum_{j=1}^{\tau_{max}} [r_i(\tau + j) - r_i(j)]^2$$

for τ = 1, 2, ..., n - 1, where τmax = n - τ.

From the formula, you can see that the value of the mean square displacement (MSD) for small τ is determined with greater statistical precision since the number of terms in the average, τmax, is larger.

Besides the mean square displacement, a time series of the average traveled distance over N particles at time t can also be calculated. It is expressed as:

Eq. 94

$$d(\tau) = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{\tau} \left| r_i(j + 1) - r_i(j) \right|$$

for τ = 1, ..., n.

The calculations can be performed for molecular systems simulated with periodic boundary conditions. A correction is made for translational switching between images if the explicit image model is used.

(See Spectra/MeanSqDisp in Chapter 4, Command Summary, and LineFit Graph in Insight II to obtain the slope, i.e., a diffusion coefficient).
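The averaging over time origins in Eq. 93 and the slope-based estimate of D in Eq. 92 can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the DeCipher implementation: the function names are hypothetical, the trajectory is a list of frames with unwrapped (x, y, z) coordinates per particle, frames are indexed from zero, the MSD is also averaged over the particles, and the slope is obtained from an ordinary least-squares line.

```python
def mean_square_displacement(trajectory, tau):
    """MSD at lag `tau` (in frames), averaged over time origins and particles."""
    n_frames = len(trajectory)
    tau_max = n_frames - tau  # number of available time origins
    n_particles = len(trajectory[0])
    total = 0.0
    for j in range(tau_max):
        for i in range(n_particles):
            later, earlier = trajectory[j + tau][i], trajectory[j][i]
            total += sum((a - b) ** 2 for a, b in zip(later, earlier))
    return total / (tau_max * n_particles)


def diffusion_coefficient(msd_values, dt):
    """Least-squares slope of MSD vs. time, divided by 6.

    msd_values[k] is the MSD at lag (k + 1) frames; dt is the time per frame.
    """
    times = [(k + 1) * dt for k in range(len(msd_values))]
    t_mean = sum(times) / len(times)
    m_mean = sum(msd_values) / len(msd_values)
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd_values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den / 6.0


if __name__ == "__main__":
    # One particle moving at constant velocity along x (a toy trajectory).
    trajectory = [[(0.5 * frame, 0.0, 0.0)] for frame in range(10)]
    print([mean_square_displacement(trajectory, tau) for tau in (1, 2, 3)])
```

The toy trajectory is ballistic, so its MSD grows quadratically with the lag. A linear fit is only meaningful in the diffusive regime, where the MSD grows linearly with time, and the fit is best restricted to small and intermediate lags because the number of time origins shrinks as τ grows.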

Time correlation functions (TimeCorrFunc)

Time correlation functions (TCF) are of great interest for the analysis of the dynamic correlations in a simulated molecular system, in particular, in condensed phases. For example, the time integral of the velocity auto-correlation function is directly related to transport properties of molecular liquids (e.g., diffusion coefficients). Another example is the calculation of the IR spectrum through a Fourier transform of the dipole auto-correlation function.

The data extracted from a molecular dynamics simulation form discrete time series of atomic and molecular variables. To calculate time correlation functions (TCF) of discrete time series with n equally spaced points, the TCF is approximated by a finite sum over the direct multiplication of one property with itself (auto-correlation) or two distinct properties (cross-correlation). This method of calculation is referred to as the direct approach. In addition, orientation time correlation functions are calculated as the first or second order Legendre polynomials Pl(cosθ), i.e., P1 and P2.

In the following sections, the general concept of time correlation functions is described. The actual algorithms of TCF calculations are shown in Normalized discrete time correlation functions at the end of this section.

Direct approach

A time correlation function, C(t), is obtained whenever any time-dependent quantity, A(t), is multiplied by itself or another time-dependent quantity, B(t'), evaluated at some different time, t', and the product is averaged over some equilibrium ensemble. A convenient notation for the time correlation function, C(t), of two time series, A(t) and B(t') is:

CAB(t’) = ⟨A(t) B(t’)⟩ Eq. 95

where the brackets denote an average over a selected trajectory. A and B can be either scalar or vector variables, but have to be the same type in a correlation calculation. If vectors are used, the right hand side of Eq. 95 indicates a scalar product between the elements of the time series. In the limit of classical mechanics, the probability distribution of the system at equilibrium is not affected by a shift of the origin of time. Therefore:

⟨A(t) B(t’)⟩ = ⟨A(0) B(t - t’)⟩ Eq. 96

The auto-correlation function describes the correlation in time of a single property:

CAA(t) = ⟨A(0) A(t)⟩ Eq. 97

where A(0) and A(t) are the time series elements separated by time t.

The cross-correlation function describes the correlation of two properties:

CAB(t) = ⟨A(0) B(t)⟩ Eq. 98

The normalized correlation function is expressed by:

Eq. 99

$$C(t') = \frac{\langle A(t)\, B(t') \rangle}{\langle A(t')\, B(t') \rangle}$$

Given an auto-correlation function, C(t), the correlation time is defined by:

Eq. 100

$$t_A = \int_0^{\infty} C_{AA}(t)\, dt$$

For a correlation function that can be represented approximately as an exponentially decaying function of time, tA can be obtained from a linear fit to log [CAA(t)] vs. t: for a normalized function decaying as exp(-t/tA), the slope of the fit is -1/tA.



Normalized discrete time correlation functions  The auto-correlation function of a property A is given by:

Eq. 101

$$C_{AA}(\tau) = \frac{1}{\tau_{max} N^2} \, \frac{\displaystyle \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} A_i(j \times m)\, A_i(\tau + j \times m)}{\displaystyle \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} [A_i(j \times m)]^2}$$

for τ = 1, 2, ..., l, where:

♦ the number in the parentheses is the index of successive frames in the trajectory,

♦ N is the dimension (number of values) of the property in the trajectory,

♦ i is the dimension index of the property,

♦ j is the time origin index,

♦ l is the number of correlated frames in the calculation (i.e., the t_Length parameter in the Spectra/TimeCorrFunc command). It must never exceed the total number of frames in the trajectory, n, and can be determined by the correlation time if it is known,

♦ m is the number of frames skipped between the time origins (i.e., t0_Step in the Spectra/TimeCorrFunc command). It can be greater than one if it is unnecessary to use each successive frame as a time origin,

♦ n is the total number of frames in the trajectory, and

♦ τmax is the total number of time origins: τmax = (n - τ)/m if the modulus of (n - τ)/m is zero, otherwise τmax = (n - τ)/m + 1.

The cross-correlation function of two properties A and B having the same dimension is given by:

Eq. 102

$$C_{AB}(\tau) = \frac{1}{2 \tau_{max} N^2} \, \frac{\displaystyle \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} A_i(j \times m)\, B_i(\tau + j \times m) + \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} B_i(j \times m)\, A_i(\tau + j \times m)}{\displaystyle \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} [A_i(j \times m)]^2 + \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} [B_i(j \times m)]^2}$$


The cross-correlation function of a multidimensional property A and a mono-dimensional property B is given by:

Eq. 103

$$C_{AB}(\tau) = \frac{1}{\tau_{max} \times N} \; \frac{\displaystyle \sum_{j=0}^{\tau_{max}} \sum_{i=1}^{N} B(j \times m)\, A_i(\tau + j \times m)}{\displaystyle \sum_{j=0}^{\tau_{max}} \left[ B(j \times m) \right]^{2}}$$

This equation is comparable to Eq. 101 for the calculation of an auto-correlation function, but here the property A(τ) is correlated with a single reference property rather than with itself at the time origin, A(0).

Legendre polynomials Pl(cos θ)

The orientational time correlation function is expressed as:

Eq. 104

$$C_l(t) = \left\langle P_l[\cos\theta(t)] \right\rangle$$

where θ is the angle between a vector at time t and the same vector (auto-correlation) or a different one (cross-correlation) at time zero. Thus:

Eq. 105

$$\cos\theta = \frac{\mathbf{A}(t) \cdot \mathbf{B}(t)}{|\mathbf{A}(t)|\,|\mathbf{B}(t)|}$$

where Pl is the lth-order Legendre polynomial. For l = 1, P1[cos θ(t)] = cos θ(t), which results in a correlation function between two unit vectors (scalar product). For l = 2:

$$P_2[\cos\theta(t)] = \frac{1}{2}\left[ 3\cos^{2}\theta(t) - 1 \right]$$

which results in the calculation of a correlation function of a second-order Legendre polynomial for unit vectors. Cl(t) measures the degree of correlation between the orientation of the vector at time t and the same vector at time zero. It is particularly useful for analyzing orientation properties of molecules in condensed phases. It shows how well a vector “remembers” where it was pointing at an earlier time t.
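As an illustration of Eq. 104 and Eq. 105, the following Python sketch averages P1 or P2 of the angle between a vector and itself a fixed number of frames later. The function names are hypothetical and every frame is used as a time origin; it is not the program's own code.

```python
import math


def cos_angle(a, b):
    """Cosine of the angle between vectors a and b (Eq. 105)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))


def p1(c):
    """First-order Legendre polynomial."""
    return c


def p2(c):
    """Second-order Legendre polynomial."""
    return 0.5 * (3.0 * c * c - 1.0)


def orientational_correlation(vectors, lag, order=2):
    """Average of P_l[cos theta] between v(t0) and v(t0 + lag) over all t0."""
    legendre = p1 if order == 1 else p2
    values = [legendre(cos_angle(u, v))
              for u, v in zip(vectors, vectors[lag:])]
    return sum(values) / len(values)
```

A vector that keeps its orientation gives 1 for both orders, while a vector that turns by 90 degrees gives 0 for P1 and -0.5 for P2.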

Spectral density

In principle, time correlation functions can be related to the Fourier transforms of the infrared, Raman, and inelastic neutron scattering spectra of molecular liquids (Zwanzig 1965; Berne and Pecora 1976). In practice, power spectral densities are determined by the Fourier transform of time correlation functions:

Eq. 106

$$C(f) = \int_{-\infty}^{\infty} C(t)\, e^{-ift}\, dt$$

Given a real discrete time correlation function, C(t), that contains n frames at equal intervals and is even in time and stationary, its discrete Fourier transform is expressed as:

Eq. 107

$$C(f_k) = \sum_{\tau=0}^{n-1} C(\tau)\, e^{2\pi i \tau k / n}$$

for k = 0, ..., n - 1.

The estimate of the one-sided (i.e., only the zero and positive frequencies) power spectral density per unit time (Press et al. 1990; Priestley 1992), P(fk), is defined at n/2 + 1 frequencies from C(fk) as:

Eq. 108

$$P(f_0) = \frac{1}{n^{2}} \left| C(f_0) \right|^{2}$$

$$P(f_k) = \frac{1}{n^{2}} \left[ \left| C(f_k) \right|^{2} + \left| C(f_{n-k}) \right|^{2} \right] \quad \text{for } k = 1, 2, \ldots, n/2 - 1$$

$$P(f_{n/2}) = \frac{1}{n^{2}} \left| C(f_{n/2}) \right|^{2}$$

The frequencies, fk, are defined as:

Eq. 109

$$f_k = \frac{k}{n \Delta t} \quad \text{for } k = 0, 1, \ldots, \frac{n}{2}$$

The total power, however, can be computed either in the time domain or in the frequency domain according to Parseval’s theorem:

Eq. 110

$$\sum_{\tau=0}^{n-1} \left| C(\tau) \right|^{2} = \frac{1}{n} \sum_{k=0}^{n-1} \left| C(f_k) \right|^{2}$$

Optionally, a power spectrum may be “smoothed” by using a window function w(τ):

Eq. 111

$$D(f_k) = \sum_{\tau=0}^{n-1} C(\tau)\, w(\tau)\, e^{2\pi i \tau k / n}$$

for τ = 0, ..., 2M - 1.

For example, the Parzen window function is expressed as:

Eq. 112

$$w(\tau) = 1 - \left| \frac{\tau - \frac{1}{2}(2M - 1)}{\frac{1}{2}(2M + 1)} \right|$$

In this case, the equations for the periodogram estimator become:



Eq. 113

$$P(f_0) = \frac{1}{W_{ss}} \left| C(f_0) \right|^{2}$$

$$P(f_k) = \frac{1}{W_{ss}} \left( \left| C(f_k) \right|^{2} + \left| C(f_{n-k}) \right|^{2} \right) \quad \text{for } k = 1, 2, \ldots, M - 1$$

$$P(f_M) = \frac{1}{W_{ss}} \left| C(f_M) \right|^{2}$$

where:

Eq. 114

$$W_{ss} = 2M \sum_{\tau=0}^{2M} w^{2}(\tau)$$

(See Spectra/TimeCorrFunc in Chapter 4, Command Summary.)

Diffusion coefficient

The self-diffusion coefficient, D, can be directly obtained from the velocity auto-correlation function for the molecular center-of-mass motion at equilibrium:

Eq. 115

$$D = \frac{1}{3} \int_{0}^{\infty} \left\langle \mathbf{v}_i(0) \cdot \mathbf{v}_i(t) \right\rangle dt$$

The connection between Eq. 115 and Eq. 92 (the “Einstein” relation) in the mean square displacement can be established by integration by parts. Note that Eq. 92 is meaningful only at large t compared with the correlation time.

Alternatively, D could be obtained from the zero-frequency component of the Fourier transform of C(t) using Eq. 107, Eq. 115, and the equipartition principle:

Eq. 116

$$D = \frac{C(f_0)\, kT}{m}$$

In Eq. 116, k is Boltzmann’s constant, T is the temperature, and m is the mass.
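The periodogram estimate of the spectral density (Eq. 107 to Eq. 109, with the optional Parzen window of Eq. 112) can be sketched in plain Python as follows. The function names are hypothetical, the direct O(n²) transform is used only for clarity, and the window sum is taken over the n = 2M sampled points, which is an assumption of this sketch.

```python
import cmath
import math


def dft(c):
    """Discrete Fourier transform with the sign convention of Eq. 107."""
    n = len(c)
    return [sum(c[t] * cmath.exp(2j * math.pi * t * k / n) for t in range(n))
            for k in range(n)]


def parzen(n):
    """Parzen window (Eq. 112) sampled at n = 2M points."""
    return [1.0 - abs((t - 0.5 * (n - 1)) / (0.5 * (n + 1))) for t in range(n)]


def one_sided_psd(c, window=None):
    """One-sided power spectral density at n/2 + 1 frequencies (n even)."""
    n = len(c)
    if window is None:
        window = [1.0] * n                      # flat window: norm is n**2
    norm = n * sum(w * w for w in window)       # plays the role of W_ss
    power = [abs(s) ** 2 for s in dft([x * w for x, w in zip(c, window)])]
    psd = [power[0] / norm]
    psd += [(power[k] + power[n - k]) / norm for k in range(1, n // 2)]
    psd.append(power[n // 2] / norm)
    return psd


def frequencies(n, dt):
    """Frequencies f_k = k / (n * dt) for k = 0, ..., n/2 (Eq. 109)."""
    return [k / (n * dt) for k in range(n // 2 + 1)]
```

For an unwindowed signal the one-sided densities sum to the mean squared amplitude, which is a convenient check on the normalization.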

Residency time (ResidencyTime)

The residency time, as implemented in DeCipher, allows one to determine either:

♦ For particle interactions, how long during a simulation a particle (atom/residue/molecule) resides in (is within) a given spatial neighborhood. For example: the time a given water molecule is within a particular shell (around a particular ion within a given distance range), or the average time water molecules spend within a given distance from each other. Results are presented as histograms of residency time as a function of distance range.

♦ For arbitrary properties (DeCipher Functions), how long during a simulation a given property remains within a given range (around a given value). For example, for the intermolecular O-H-O angle in a water system, the residency time of the angle might be 5 ps in the 9.9 ± 0.1 degree interval, 3 ps in the 10.0 ± 0.1 degree interval, and so on.

The average residency time over N pairs of particles, or function elements, is defined as:

Eq. 117

$$\left\langle T_r(r) \right\rangle = \frac{\displaystyle \sum_{i}^{N} T_{r,i}(r - \Delta r)}{N}$$

Tr,i(r-∆r) is the residency time of the ith atom pair or function element in the distance or function value interval between r-∆r and r.

(See the Spectra/ResidencyTime function in Chapter 4, Command Summary.)
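A minimal Python sketch of Eq. 117 follows. It assumes that the residency time in an interval is the total time a pair (or function element) spends in it, accumulated frame by frame; the function names and the half-open binning are choices of this sketch, not documented DeCipher behavior.

```python
def residency_histogram(series, dt, dr, n_bins):
    """Time spent by one pair in each interval [k*dr, (k+1)*dr).

    series: the distance (or function value) at each frame.
    dt: time between frames.
    """
    times = [0.0] * n_bins
    for value in series:
        k = int(value // dr)
        if 0 <= k < n_bins:
            times[k] += dt
    return times


def average_residency(all_series, dt, dr, n_bins):
    """Average of the per-pair histograms over the N pairs (Eq. 117)."""
    histograms = [residency_histogram(s, dt, dr, n_bins) for s in all_series]
    n_pairs = len(histograms)
    return [sum(h[k] for h in histograms) / n_pairs for k in range(n_bins)]
```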


3 Languages

Introduction

We review in this chapter the languages used in DeCipher to specify molecular properties. In this context, languages are essentially alphanumeric expressions.

These specifications are done through a Function Spec.

These “specs” are essential to the use of DeCipher.

Finally we will review the BCL macro-language (the Biosym Command Language), common to all Insight II integrated modules, but of particular significance in encapsulating “DeCiphering strategies”. By writing a BCL macro composed of Insight II and DeCipher commands, and possibly calls to external programs or other modules, one can build very powerful programs in a very short time that integrate seamlessly with Insight II and DeCipher.

Molecular subsets

Topological subsets

Topological sets of atoms, or molecular subsets, can be defined using any topological attribute at the atom, monomer, or molecule level using the Subset/Define command. The atom attributes can be tabulated using the Molecule/Tabulate command. This set selection mechanism works as if a molecule or an assembly of molecules were a relational table of atoms, and one would query those atoms which meet a given criterion, i.e., having a property (an attribute) within a given value range. The value range can be a character string such as an atom name, a number or a range of numbers, or logical variables (true/false), with series of such values separated by commas. To obtain sets of atoms with multiple properties, one can apply the selection successively with different attributes (Filter option in the Subset/Define command), or use Boolean operations on subsets (Subset/Combine command). For example, one can simply select, within a molecule, those atoms with:

Property: Atom_Name value range: C*

Property: Partial_Charge value range: > 0, < 0.5

Property: Chirality value range: s, ps
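The relational-table view of subset selection can be illustrated with a short Python sketch. The atom records, the define helper, and the attribute keys below are hypothetical stand-ins for Subset/Define and its Filter option; they only mimic the three successive selections listed above.

```python
from fnmatch import fnmatchcase

# A toy "relational table" of atoms (values invented for illustration).
atoms = [
    {"name": "CA", "charge": 0.10, "chirality": "S"},
    {"name": "CB", "charge": -0.20, "chirality": None},
    {"name": "N", "charge": -0.50, "chirality": None},
    {"name": "CG", "charge": 0.30, "chirality": "R"},
]


def define(subset, attribute, test):
    """Keep the atoms whose attribute value passes the test."""
    return [atom for atom in subset if test(atom[attribute])]


# Successive filtering, each step narrowing the previous subset.
carbons = define(atoms, "name", lambda v: fnmatchcase(v, "C*"))
positive = define(carbons, "charge", lambda v: 0 < v < 0.5)
s_centers = define(positive, "chirality", lambda v: v in ("S", "PS"))
```

Applying the filters in sequence is equivalent to a Boolean intersection of the three individual subsets.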

Table 1 Topological Atom Attributes

Attribute            Description
Name                 Name of the atom (e.g., CA, HA1, OE2, NZ)
Atom_Template        Topological atom neighborhood specifier
Bond_Count           Number of bonds connecting from/to the atom
Partial_Charge       Charge of the atom
Chirality            Chirality (carbons) and Pseudo-Chirality (hydrogens) [R, S, PR, PS]
Color                Graphical attribute of the atom
Connected_To         List of atoms bonded to the atom
Element_Type         Name of the chemical element (e.g., C, H, O, N)
Group                Forcefield attribute: “charge group” used for electrostatic energy calculations
Hetero               Heteroatoms, as defined in a PDB file with the HETERO keyword
Hybridization        Normal hybridization state of the atom (sp, sp2, sp3)
Mass                 Mass of the atom
Max_Valence          Theoretical number of valence bonds for the atom (can be modified with the Modify/Element command)
Mol_Spec             Insight II’s Name specifier (Syntax: Molecule:Residue:Atom)
Potential            Atom type used for forcefield calculations
Seq_in_Monomer       Sequential number of the atom within a residue (monomer)
Seq_Number           Sequential number of the atom within the molecule
Switching            Forcefield attribute: “charge group switching atom”
Temperature_Factor   Temperature factor (B-factor) read in from a PDB file
VDW_Radius           van der Waals radius of the atom
Valence              Actual valence (number of bonds) of the atom

Table 2 Residue/Monomer Attributes

Attribute            Description
Atom_Count           Number of atoms within the residue
Charge               Charge of the residue (sum of constituent atom charges)
Mass                 Mass of the residue (sum of the constituent atom masses)
Seq_number           Sequential residue number in a molecule (protein sequence number for example)
Type                 Residue type (Insight II/Discover “types”; e.g., PRO, LYS+, GLYN)
Mol_Spec             Insight II’s Name specifier (Syntax: Molecule:Residue)

Table 3 Molecular Attributes

Attribute            Description
Mol_atom_count       Number of atoms within the molecule
Charge               Charge of the molecule (sum of constituent atom charges)
Mass                 Mass of the molecule (sum of the constituent atom masses)
Monomer_Count        Number of residues in the molecule
Mol_Spec             Insight II’s Name specifier

Geometric subsets

Three dimensional atom attributes  Geometric subsets can be defined as “boxes” of atoms by specifying sequentially one, two, or three Cartesian coordinate ranges, relative to the video screen or to the molecular object frame of reference. These coordinates are atomic attributes, and can be used like any topological attribute. For example, one can simply specify:

Value Range: X_Screen > 0, i.e., the right half of the screen

and (Filter) Y_Screen > 0, i.e., the top right quarter of the screen (infinite box along Z),

or more specifically:

Value Range: -5 < X_Object < 5;

followed by (Filter) -5 < Y_Object < 5;

followed by (Filter) -5 < Z_Object < 5;

thus selecting all atoms within a cubic box of side 10 Å centered on the origin of the molecule frame of reference.

Table 4 Three Dimensional Atom Attributes

Attribute   Description
X_Object    First Cartesian coordinate relative to the object frame of reference
X_Screen    First Cartesian coordinate relative to the screen frame of reference
Y_Object    Second Cartesian coordinate relative to the object frame of reference
Y_Screen    Second Cartesian coordinate relative to the screen frame of reference
Z_Object    Third Cartesian coordinate relative to the object frame of reference
Z_Screen    Third Cartesian coordinate relative to the screen frame of reference

Three dimensional neighborhood  Geometric subsets can also be defined using the interatomic distance criterion. Two commands, Zone and Interface Subset, are used to this effect, where one specifies two molecules or subsets, one as the center and another as the surrounding, and an interatomic distance. The program will select all atoms within the external subset that lie within the given maximum distance of any atom within the central subset.

One can therefore select, for example, all atoms of a given type within a sphere of a given radius around a central atom, but also a Boolean union of spheres of a given radius around a set of central atoms.
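The distance criterion described above amounts to a union of spheres around the central atoms. A minimal Python sketch, with a hypothetical zone function operating on plain coordinate tuples (requires Python 3.8 or later for math.dist), could look like this:

```python
import math


def zone(center, surrounding, max_distance):
    """Atoms of `surrounding` within max_distance of any atom of `center`.

    Both arguments are lists of (x, y, z) coordinate tuples.
    """
    return [atom for atom in surrounding
            if any(math.dist(atom, c) <= max_distance for c in center)]
```

This brute-force form tests every pair of atoms; it is meant to show the selection rule, not an efficient neighbor search.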

Functions and mathematical expressions in DeCipher

Molecular properties representations

DeCipher's philosophy revolves around mathematical and geometric modeling of molecular properties. There are two broad categories of mathematical functions representing molecular properties, called Functions and Spectra in the program.

Native functions

Functions represent basic molecular properties. They are simple mathematical functions of atomic variables directly accessible from molecular structure databases or molecular dynamics trajectory files: atom coordinates, velocities, charges, masses, etc. DeCipher's Functions pulldown covers structural properties such as:

♦ arbitrary geometric distances and angles

♦ energetic properties for arbitrary sets of atoms such as kinetic energies and center of mass velocities

♦ potential energies and associated forces

♦ permanent electric dipoles and multipoles, and

♦ multipolar interaction energies and forces.

Once it is defined, a Function can be visualized, measured, monitored, and correlated over time, as appropriate. Functions can be scalars, vectors, or tensors. They can be multi-dimensional, so that one can refer directly to sets of properties, for example on a per residue or per molecule basis. When defined, a Function becomes a molecular variable like any other, and can be used as such for further analysis. Functions are evaluated dynamically by the program.

Spectra

Spectra represent high level molecular properties. They are also functions, in the mathematical sense, of atomic variables and time (i.e., configurations); however, their calculation requires ensemble averages over configuration space. The Spectra pulldown covers pair and radial distribution functions, time correlation functions, orientation correlation functions, and their related Fourier transformed spectral densities. They are represented as graphs, which explains the name “Spectra” in the program, by analogy to experimental properties obtained as spectra. For example, the power spectrum of the dipole autocorrelation function can be directly related to an experimental low frequency IR spectrum.

Numerical functions

Generally speaking, any property calculated internally or externally can be imported into DeCipher as a numerical Function, when stored in a table. Spectra can be explicitly referred to as numerical Functions for further use, when tabulated. Tables can be created from external files (and from graphs by writing a graph file to be read back in as a table) or by typing directly in the Insight II spreadsheet.

System functions

In addition, the program automatically defines system Functions such as the potential, kinetic and total energy, temperature, pressure, and cell volume for the whole molecular system, as well as dynamics variables such as time or frame numbers. These are treated as intrinsic numerical functions by DeCipher, and are obtained directly from data files produced by molecular dynamics (time or frame numbers, atomic coordinates, velocities, pressure tensor, cell components, and system energies).



Geometrics

Geometric objects, called Geometrics in DeCipher, include points, planes and vectors. They all can be used in various Function definitions. Vectors can be defined graphically, between two arbitrary points, a point and a plane, or as a best fit oriented line to a set of points (geometric distribution of points in space). They can also be defined as graphical monitors of vector or tensor Functions, and as such, they can be scaled, and anchored to any point in real (affine) space.

Implicit functions

Vector objects can be used anywhere in DeCipher as Implicit vector Functions, regardless of how they have been defined. DeCipher will also treat any geometric point, or set of atoms expressed as a molecule spec, as an implicit position Function (a vector function) with each atom representing a separate vector originating at (0,0,0), the so-called world origin in Insight II. Obviously, point and atom coordinates are identical to the corresponding position vectors. Purely geometric objects (vectors and points) may be used as part of any mathematical expression, while a molecule spec can only be used by itself. It is however useful to write out atomic coordinates into a table. Note that in a table, a spreadsheet mathematical expression can then be applied. Finally, DeCipher will treat any set of internals (bonds, valence angles, dihedrals, out-of-plane angles) or non-bonds as an implicit multi-dimensional function, either scalar (valence angles, out-of-plane angles, dihedrals) or distance vectors (bonds and non-bonds).

Mathematical expressions (the “Function Spec”)

When defined, a Function, either native, system, numerical, or implicit, becomes a molecular variable like any other, and can be used as such for further analysis. A complex property can then be expressed as a linear or non-linear combination of functions, through a mathematical expression. For example, a user could define a dipole-dipole interaction energy from two electric dipole vectors, the distance between them, and the angles describing their relative orientations. A user-defined molecular variable such as this is called a Function Spec and is evaluated dynamically.

DeCipher treats all Functions as a function of time or configuration (frame) number. A static molecule in Insight II is treated as a single configuration.

Built-in operators

Arithmetic and vector operators allow users to compose mathematical expressions. These operators can be used with scalar, vector and tensor operands, single-valued or arrays:

♦ + : scalar and vector addition

♦ - : scalar and vector subtraction, or negation

♦ / : division by a scalar

♦ * : scalar multiplication and vector dot product

♦ ^ : power and vector cross product

Precedence rules In decreasing order of evaluation:

♦ ( )

♦ * , /

♦ + , -

♦ ^

Parentheses can be used to logically group various expression operations, but should also be used to override precedence. They are also useful to guarantee a particular order of evaluation when one is not sure of the precedence.

When mixing types of operands (scalars and vectors) with the division, floating point modulus, and power operators, the scalar value must come second. For example, one may either add two equal dimension arrays, or add a single value to all the elements of an array. Whenever the mapping is one to many, as in the latter example, the single value should be second (especially for /, %, and ^).
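To see why explicit parentheses are recommended, the following Python sketch evaluates scalar expressions under the precedence order exactly as listed above, reading it literally so that ^ is applied last. It is a hypothetical toy evaluator written for this illustration (scalars only, left-associative), not DeCipher's parser.

```python
import re

# Operator levels from lowest to highest precedence, as listed above.
LEVELS = [("^",), ("+", "-"), ("*", "/")]
APPLY = {
    "^": lambda a, b: a ** b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def _parse(tokens, level=0):
    if level == len(LEVELS):
        return _atom(tokens)
    left = _parse(tokens, level + 1)
    while tokens and tokens[0] in LEVELS[level]:
        op = tokens.pop(0)
        left = APPLY[op](left, _parse(tokens, level + 1))
    return left


def _atom(tokens):
    token = tokens.pop(0)
    if token == "(":
        value = _parse(tokens)
        tokens.pop(0)              # discard the closing parenthesis
        return value
    if token == "-":               # negation
        return -_atom(tokens)
    return float(token)


def evaluate(expression):
    """Evaluate a scalar expression with the precedence order above."""
    return _parse(re.findall(r"\d+\.?\d*|[-+*/^()]", expression))
```

Under this reading, 2*3^2 is evaluated as (2*3)^2, whereas 2*(3^2) gives the conventional result, so grouping the power explicitly removes any ambiguity.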

Array, vector, Ref_Axes, and tensor subscripts

♦ Array subscript: [range] addresses sub-ranges of an array (the range syntax is identical to that used within Frame_Spec, such as 1-5,>0@3)

♦ Vector components subscripts: .x, .y, .z address the three components of a vector (x, y, or z components respectively) with respect to the “world frame of reference”.

♦ Ref_Axes subscripts: .i, .j, .k address the three individual axes of a reference axes system. Axes components with respect to the “world frame of reference”, as for any vector, can be obtained using component subscripts .x, .y, .z. (The x component of the i axis is therefore referenced using the double subscript .i.x)

♦ Tensor components subscripts: .xx, .yy, .zz, .xy, .xz, .yz address individual tensor components.

♦ Origin coordinates subscripts: .ox, .oy, .oz address individual components of the origin of vectors, tensors and reference axes in 3D space.

♦ Energy components subscripts listed below address energy components, or their sums, in forcefield space:

.total: all energy terms

.cross: all defined cross terms

.bond: bond terms

.angle: valence-angle terms

.torsion: torsion terms

.oop: out of plane terms

.vdw: van der Waals non-bond terms

.coulomb: Coulombic non-bond terms

.hbond: hydrogen bond terms

.bb: bond-bond cross terms

.bb_13: bond-bond 1-3 terms

.ba: bond-angle terms

.aa: angle-angle terms

.bt: bond-torsion terms

.bt_end: end bond-torsion terms

.at: angle-torsion terms



.aat: angle-angle-torsion terms

.oop_oop: out of plane - out of plane terms

Built-in scalar functions  The following functions operate on single valued or multidimensional scalar functions (arrays). The argument can be any mathematical expression whose result is a scalar array. Angles are expressed in degrees.

♦ sin

♦ cos

♦ tan

♦ asin

♦ acos

♦ atan

♦ sinh

♦ cosh

♦ tanh

♦ exp: exponential

♦ log: natural logarithm

♦ log10: logarithm in base 10

♦ sqrt: square root

♦ abs: absolute value

Built-in constants

♦ PI = 3.14159265358979323846

♦ DEGTORAD = PI / 180.0

♦ RADTODEG = 180.0 / PI

Built-in vector functions  In the following, A and B each represent one vector or a set (array) of them; R represents a ref_axes or a set of them.

♦ dot(A,B): dot product of vectors A and B; same as A*B



♦ cross(A,B): cross product of vectors A and B; same as A^B

♦ unit(A): unit vector along a vector A

♦ norm(A): norm (magnitude) of vector A

♦ project(A, R): projections (components) of a vector A onto the three axes of a ref_axes R
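For readers who want to check a Function Spec by hand, the vector functions listed above can be mirrored in a few lines of Python. These are plain re-implementations under the usual mathematical definitions, operating on single 3-tuples; treating a ref_axes as a tuple of three axis vectors is an assumption of this sketch.

```python
import math


def dot(a, b):
    """Dot product of vectors a and b."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of 3D vectors a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Magnitude of vector a."""
    return math.sqrt(dot(a, a))


def unit(a):
    """Unit vector along a."""
    length = norm(a)
    return tuple(x / length for x in a)


def project(a, axes):
    """Components of a along each of the three axes of a reference system."""
    return tuple(dot(a, unit(axis)) for axis in axes)
```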

Built-in tensor functions  In the following, T represents one tensor or a set (array) of them.

♦ trace(T): trace of a tensor T

♦ avg_trace(T): average (1/3) of the trace of a tensor T

♦ det(T): determinant of a tensor T

♦ principal_axes(T) [alias eigenvectors(T)]: principal axes (eigenvectors) of a tensor T

♦ principal_moments(T) [alias eigenvalues(T)]: principal moments (eigenvalues) of a tensor T

Specific built-in array functions  The following functions operate only on arrays. The argument can be any mathematical expression whose result is an array.

♦ sum: scalar sum

♦ vsum: vector sum

♦ tsum: tensor sum

♦ avg: average value of scalar array

♦ vavg: average value of vector array

♦ tavg: average value of tensor array

♦ dim: dimension (total number of elements within a scalar array); alias count

♦ rank: rank of a scalar array (number of non-zero elements within a scalar array)

♦ vdim: vector array dimension (total number of elements, i.e., individual vectors, within a vector array); alias vcount

♦ vrank: rank of a vector array, or number of non-zero elements within a vector array (a zero vector has all components equal to zero)

♦ tdim: tensor array dimension (total number of elements, i.e., individual tensors, within a tensor array); alias tcount

♦ trank: rank of a tensor array, or number of non-zero elements within a tensor array (a zero tensor has all components equal to zero)

♦ rdim: ref_axes array dimension (total number of elements, i.e., reference axes systems, within a ref_axes array)
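The array functions above use "rank" in an unusual sense (a count of non-zero elements), so a small Python sketch may help. The functions below follow the definitions in the list for scalar and vector arrays only; they are illustrative re-implementations on Python lists, not the program's code.

```python
def dim(values):
    """Total number of elements in a scalar array."""
    return len(values)


def rank(values):
    """Number of non-zero elements in a scalar array."""
    return sum(1 for v in values if v != 0)


def avg(values):
    """Average value of a scalar array."""
    return sum(values) / len(values)


def vsum(vectors):
    """Component-wise sum of a vector array."""
    return tuple(sum(components) for components in zip(*vectors))


def vdim(vectors):
    """Total number of vectors in a vector array."""
    return len(vectors)


def vrank(vectors):
    """Number of vectors with at least one non-zero component."""
    return sum(1 for v in vectors if any(c != 0 for c in v))


def vavg(vectors):
    """Average vector, i.e. vsum divided by vdim."""
    n = vdim(vectors)
    return tuple(c / n for c in vsum(vectors))
```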

Examples (for Function Spec)

dipole1

The vector function dipole1 defined by Functions/Dipole.

dipole1[5-15].y

The y component of the 5th through 15th elementary vectors in the dipole1 vector function.

14.4*((mu1*mu2)-3*(mu1*unit(d12))*(mu2*unit(d12))) / (norm(d12)^3)

Equation representing the dipole-dipole interaction energy. It is expressed as a mathematical expression in DeCipher in terms of the dipole vectors, mu1 and mu2, and the distance vector, d12. This mathematical expression can be directly tabulated by entering it into Function Spec in Functions/Construct_Table.

acos(unit(V1^V2)*unit(V3))

Gives the angle between the normal to the plane formed by V1 and V2, and V3.

vsum(velocity1) / vdim(velocity1)

Gives the average vector of all elementary vectors in the vector function array velocity1 defined by the Functions/Velocity command, using any of the array options: Per_atom, Per_monomer or Per_molecule.

project(displacement1[7-10],principal_axes(moment_of_inertia[7-10]).i)



Gives the component of the displacement vectors 7 to 10, for example for water molecules 7-10, on their corresponding largest principal axis.
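The dipole-dipole expression in the examples above can be checked numerically with a short Python sketch. The helper functions are plain re-implementations of dot, norm and unit, the function name is hypothetical, and the 14.4 prefactor is taken from the example as-is, without any assumption about the units it implies.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def unit(a):
    length = norm(a)
    return tuple(x / length for x in a)


def dipole_dipole_energy(mu1, mu2, d12, prefactor=14.4):
    """Evaluate prefactor*((mu1.mu2) - 3(mu1.u)(mu2.u)) / |d12|^3, u = unit(d12)."""
    u = unit(d12)
    angular = dot(mu1, mu2) - 3.0 * dot(mu1, u) * dot(mu2, u)
    return prefactor * angular / norm(d12) ** 3
```

Two parallel dipoles placed side by side give a positive (repulsive) value, while the same dipoles placed head to tail give a negative (attractive) one.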




4 Command Summary

Configurations

A set of configurations corresponds to either a molecular dynamics trajectory or any conformation/configuration ensemble obtained, for example, through Monte Carlo or systematic conformational search. For a single molecule system, configurations represent conformations of the molecule. For a molecular dynamics simulation, the configurations are time correlated and represent a molecular trajectory.

The Configurations pulldown includes the following commands.

Configurations/Get

Loads specified configurations from a configurations data set.

Configurations can be specified using a Frame Spec, or by using a Frame Spec criterion. In the latter case, one can selectively load configurations with desired properties, as specified by DeCipher Functions. Periodic systems can be loaded with the option to Reimage the periodic images.

The file formats supported include history and archive files written by Discover, as well as the output files from CHARMm, Amber and OFF.

Configurations/Put

Writes out an archive file of specified configurations among the previously loaded set. The specification can be a Frame Spec, or a spreadsheet cell selection corresponding to a selected set of configurations with given properties (i.e., tabulated DeCipher Functions).

Configurations/Record

Selectively writes out to an archive file the configuration displayed on the screen (in world coordinates). This command is particularly useful to record an interactive docking session, or a conformational change process. Selected configurations can be appended to the file one by one.

Configurations/Filter

Filters out a specified vibrational frequency range in a molecular dynamics trajectory.

Configurations/Align

Aligns all configuration frames presently loaded to a specified reference frame by aligning the moment of inertia axis of each frame.

This command is especially useful when computing information, such as RMSD, that may be affected by rotational motion during dynamics. The combination of Configurations/Align and Functions/RMSD gives results closer to those of the Insight Transform/Superimpose command, which first finds the closest alignment of the atom sets used before reporting the RMSD value.

Configurations/Color

Colors molecular configurations statically or dynamically according to atomic, monomeric or molecular properties (Functions or attributes), or subset membership. Any user-defined color Spectrum can be used for color mapping properties on top of molecular structure.

Configurations/Animate

Displays the specified configurations as frames of a movie. These frames have been previously loaded, using the Configurations/Get command, or are loaded directly. Only the set of atoms specified is loaded, so, for example, only the protein from a solvated dynamics simulation can be animated.

Configurations/Select

Selects and displays a specific configuration or an average one within an animated configuration set.

Configurations/Tabulate

Tabulates specified system attributes for selected configurations, or dynamic attributes for a specified set of atoms, originally stored in a configuration data file.

Configurations/Delete

Deletes the previously loaded configuration set and frees memory.

Functions

DeCipher Functions represent atomic, monomeric, molecular and super-molecular properties. They are for the most part user-defined, although there are system functions that are automatically associated with a molecular system when the configurations data set is loaded. Once defined, they can be used for tabulation, visualization, and further mathematical and statistical analysis.

All vector and tensor function components are expressed in the “world” frame of reference. These functions can be visualized as vectors in 3D space mapped on molecules or on any arbitrary point in space, using the Geometrics/Vector command. Tensor functions are always visualized as a set of principal axes. Scalar functions are visualized by color mapping functions on top of their corresponding molecular substructures (atoms, residues, molecules), using the Configurations/Color command. All functions can be tabulated at any level of detail using the Functions/Construct_Table command, and can be graphed using the Functions/Construct_Graph command.

The Functions pulldown includes all the commands necessary to define geometric, kinetic, and energetic functions.

Functions/Construct_Graph

Constructs a two or three dimensional graph of selected functions for a specified set of configurations. You may use this command to add a new plot to an existing graph.

All functions are plotted as single scalar values, so that vectors are plotted as their norms and tensors are plotted as the trace average. If the function is multi-dimensional then the average value is plotted.
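The reduction to a single plotted value described above can be sketched in Python as follows. The function names and the data layout (numbers for scalars, 3-tuples for vectors, 3x3 nested tuples for tensors) are assumptions made for this illustration.

```python
import math


def scalar_value(value):
    """Scalar representation: the value itself, a vector norm, or a trace average."""
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value[0], (int, float)):               # a vector
        return math.sqrt(sum(c * c for c in value))
    return sum(value[k][k] for k in range(3)) / 3.0      # a 3x3 tensor


def plotted_value(elements):
    """Average scalar representation over a multi-dimensional function."""
    return sum(scalar_value(e) for e in elements) / len(elements)
```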

Functions/Construct_Table

Creates a spreadsheet of scalar, vector or tensor functions. By default all functions are tabulated as scalar representations (for vectors their vector norm is used; for tensors their trace average is used). Optionally, vector components, tensor components, principal moments, principal axes components, as well as their spatial origins can be tabulated.

Multi-dimensional functions, representing arrays of properties, are tabulated in their entirety or as an average over the array. Columns represent functions and rows represent configurations (or vice versa when the spreadsheet is transposed).

The spreadsheet has two modes: definition and filling (appending).

In definition mode, functions are added, inserted, deleted or replaced. A DeCipher spreadsheet is therefore itself a definition (it is an organized set of functions which represent molecular property definitions).

In filling (appending) mode the functions are evaluated for the specified configuration set, for the current molecular configuration or for the specified assembly.

When the spreadsheet is in interactive mode, functions in the spreadsheet are evaluated dynamically from the displayed molecular configuration. Spreadsheets can be switched between interactive and non-interactive modes. Filling can be interrupted using the <Esc> key.

Mathematical expressions of DeCipher functions can be tabulated directly, or further properties can be calculated in the spreadsheet using spreadsheet operations. DeCipher functions (definitions) can be copied from spreadsheet to spreadsheet using regular Copy and Paste commands.

Functions/Angles

Defines an angle between three points or atoms, between two vectors, a vector and a plane, or two planes.

Functions/Dihedrals

Defines a dihedral angle between four atoms, pseudoatoms or points.

Functions/Dipole

Defines the electric dipole vectors for a specified set of atoms (optionally on a Per_Residue or Per_Molecule basis).

The dipole vector direction is oriented along the axis defined from the center of negative charge to the center of positive charge of the corresponding set of atoms, and its magnitude is in Debyes. For the graphical representation (see the Geometrics/Vector command) the origin of the dipole vector is placed halfway between the center of positive charge and the center of negative charge.
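A minimal Python sketch of such a dipole calculation follows, assuming point charges in units of the elementary charge and coordinates in Å (1 e·Å is about 4.8032 Debye). The function names are hypothetical and the set of atoms is assumed to contain both positive and negative charges.

```python
E_ANGSTROM_TO_DEBYE = 4.8032  # 1 e*Angstrom expressed in Debye


def dipole(charges, coords):
    """Dipole vector sum(q_i * r_i), converted to Debye."""
    return tuple(E_ANGSTROM_TO_DEBYE *
                 sum(q * r[k] for q, r in zip(charges, coords))
                 for k in range(3))


def charge_center(charges, coords, sign):
    """Center of the positive (sign=+1) or negative (sign=-1) charges."""
    selected = [(abs(q), r) for q, r in zip(charges, coords) if q * sign > 0]
    total = sum(w for w, _ in selected)
    return tuple(sum(w * r[k] for w, r in selected) / total for k in range(3))


def display_origin(charges, coords):
    """Point halfway between the centers of positive and negative charge."""
    positive = charge_center(charges, coords, +1)
    negative = charge_center(charges, coords, -1)
    return tuple((p + n) / 2.0 for p, n in zip(positive, negative))
```

For a neutral set of atoms the vector sum(q_i * r_i) points from the center of negative charge to the center of positive charge, as described above.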


Functions/Distances

Defines sets of distance vectors between atoms, pseudoatoms, residues, molecules, and points; between points and planes; between points and lines; and among lines (lines are represented as vectors). It can also be the out-of-plane distance between a single atom and the plane formed by the three atoms connected to that central atom (sp2 atoms).

Functions/Moment_of_Inertia

Defines the moment of inertia tensor or simply any moment relative to any axis previously defined (a vector function).

Functions/Positions

Defines position vectors for a specified set of atoms, pseudoatoms, residues or molecules from a given origin. The default origin is the “world” origin.

Functions/Radius_of_Gyration

Defines the radius of gyration for a specified set of atoms. A tensor extension called anisotropic radius of gyration gives rise to three principal moments representing the axes of an ellipsoid, rather than the isotropic radius of a sphere.

Mass-weighting is optional for this function. A non-mass-weighted radius of gyration corresponds to a best fit sphere or best fit ellipsoid.
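The isotropic case can be sketched in Python using the usual definition, the weighted root-mean-square distance of the atoms from their weighted centroid. The function name and argument layout are assumptions of this sketch; omitting the masses gives the non-mass-weighted variant.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Isotropic radius of gyration; unit weights if masses is None."""
    weights = masses if masses is not None else [1.0] * len(coords)
    total = sum(weights)
    center = tuple(sum(w * r[k] for w, r in zip(weights, coords)) / total
                   for k in range(3))
    squared = sum(w * sum((r[k] - center[k]) ** 2 for k in range(3))
                  for w, r in zip(weights, coords))
    return math.sqrt(squared / total)
```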

Functions/Numeric

Defines multi-dimensional numeric functions, either scalar, vector, tensor, or reference axes systems, through a tabular input, i.e., a spreadsheet.

These functions can be “mapped” to the atoms, residues or molecules as appropriate. Numeric functions can be used like any other in display or with mathematical expressions. For example, you can define hydrophobic moments using residue based hydrophobicities read in as numerical functions (a spreadsheet can be created from a table file in the Homology module).

Functions/B_Factor

Defines atomic or monomeric B-factors, either isotropic or anisotropic.

Functions/Displacement

Defines displacement vectors for atoms, pseudoatoms, and residue and molecule centroids, with respect to a reference structure, an average structure over a set of configurations, or a previous configuration.

Functions/RMSD

Defines the Root-Mean-Square Deviation (RMSD) in Cartesian coordinates between a set of atoms in a dynamic structure and a corresponding one in a reference structure, either a different molecule or a different configuration. For the latter case, a choice of the previous or the average configuration among a set is available. A tensor extension called anisotropic RMSD is optionally available.
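A minimal Python sketch of a Cartesian RMSD is given below. It uses the standard definition without any superposition step, so rotational or translational motion between the two structures contributes to the value; the function name is hypothetical.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same number of atoms")
    total = sum((a - b) ** 2
                for p, q in zip(coords, reference)
                for a, b in zip(p, q))
    return math.sqrt(total / len(coords))
```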

Functions/Momentum

Defines linear and angular momenta for a specified set of atoms,or optionally on a per atom, per residue or per molecule basis, asappropriate.

Functions/Velocity

Defines translational and angular velocity vectors for a specifiedset of atoms, optionally on a per atom, per residue or per moleculebasis, as appropriate.



Functions/Temperature

Defines the temperature of a set of atoms. This requires that velocity information be available for the molecule.

Functions/Kinetic_Energy

Defines the kinetic energies for a specified set of atoms, residues, or molecules. Translational, rotational, and vibrational kinetic energies can be obtained as appropriate.
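The link between velocities, kinetic energy, and temperature can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard equipartition relation T = 2 E_k / (N_dof k_B) with SI units, takes 3N degrees of freedom by default, and does not separate translational, rotational, and vibrational contributions. The function names are hypothetical, and this section does not state which formula DeCipher itself applies.

```python
BOLTZMANN = 1.380649e-23  # J/K

def kinetic_energy(masses, velocities):
    """Total kinetic energy, 0.5 * sum(m * |v|^2), in joules.

    masses in kg, velocities as (vx, vy, vz) tuples in m/s.
    """
    return 0.5 * sum(m * sum(component ** 2 for component in v)
                     for m, v in zip(masses, velocities))

def temperature(masses, velocities, n_dof=None):
    """Instantaneous temperature from the equipartition relation.

    n_dof defaults to 3 per atom (an assumption: no constraints and no
    removal of center-of-mass motion).
    """
    if n_dof is None:
        n_dof = 3 * len(masses)
    return 2.0 * kinetic_energy(masses, velocities) / (n_dof * BOLTZMANN)
```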

Functions/Multipolar_Energy

Defines permanent multipolar interaction energies at any level of a multipolar expansion between two specified non-overlapping sets of atoms, as a function of their centroid distances.

Overlap between the two sets is not checked when the function is evaluated (for example, when it is tabulated in a spreadsheet), so you must examine the results carefully.

Functions/Potential_Energy

Defines internal and interaction potential energy functions within a set of atoms or between two sets of atoms, with a breakdown at the energy component level (default) and optionally at the atom, residue, or molecule (pair) level.

You may also define energy derivatives with respect to internals and non bond distances, and the forces or accelerations on atoms, residues, and molecules.

Potential energy functions can make use of the various forcefield functional forms supported by CVFF, CFF, and Amber (CHARMm and ESFF are not supported at this time). The system automatically switches functional form, so you do not have to know the details of the energy expression. For example, you can select a “Bond” energy term and the system will know the functional form to be used for the present forcefield. Switching forcefields does not require a redefinition of the symbolic energy function.

Functions/Restraint_Energy

Defines distance, dihedral, and angle restraints, reading their definitions from a restraint file (.rstrnt).

A harmonic pseudo-energy function is defined using the force constants and boundaries stored in the file.
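One common way to build a harmonic pseudo energy from force constants and boundaries is a flat-bottomed well: zero between the lower and upper bounds and quadratic outside them. The Python sketch below shows that form only as an assumption; the exact expression and prefactor used by DeCipher are not given in this section, and the function and parameter names are hypothetical.

```python
def restraint_energy(value, lower, upper, k_lower, k_upper):
    """Flat-bottomed harmonic restraint (assumed form, E = k * d^2).

    value: current distance, angle, or dihedral.
    lower, upper: restraint boundaries.
    k_lower, k_upper: force constants applied below and above the bounds.
    """
    if value < lower:
        return k_lower * (lower - value) ** 2
    if value > upper:
        return k_upper * (value - upper) ** 2
    return 0.0  # inside the allowed range there is no penalty
```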

Functions/Non_Bond_Setup

Sets up non bond parameters for energy calculations. This includes the choice of cutoff vs. no cutoff; atom-, residue-, or group-based cutoffs; the cutoff distance; and the non bond list update frequency during calculations.

For non bonded interaction energy calculations, the frequency for regenerating the non bond list is applied across a configuration set. For example, a frequency of 20 means that the list is updated every 20 frames in the specified Frame_Spec.

Functions/Cross_Term_Setup

Selects the forcefield cross terms for energy calculations. Only those specified cross terms that are also present in the forcefield being used are calculated.

Functions/Optimize

Energy and restrained-energy minimizer, with a choice of steepest descent, conjugate gradient, or Newton-Raphson methods. This command can be interrupted with the <Esc> key.



Functions/Get

Reads the definition of a function from a file. These files are created with the Functions/Put command.

Functions/Put

Saves a function definition to a file. The definition can later be restored using the Functions/Get command.

Functions/Rename

Renames previously defined functions.

Functions/Delete

Deletes user-defined functions. Vector functions associated with geometric objects (defined through the Geometrics/Vector command) cannot be removed until the vector objects are deleted.

Functions/Info

Gives the definitions of the specified functions.

Spectra

Spectra are high-level math functions. They represent average properties over sets of particles or vectors, averages over time as a function of radial distance, and averages over space (radial distance) as a function of time. Calculations can be run interactively or in the background. Results are presented as graphs, spreadsheets, or table files.



Spectra/RadialDistFunc

Computes pair, spherical, or cylindrical radial distribution functions and running coordination numbers (CN) between two sets of atoms, a set of atoms and a reference point, or a set of atoms and a reference axis, respectively. Results can be obtained for the displayed configuration, or for a set of specified configurations, individually or as an average.
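The running coordination number between two sets of atoms can be illustrated with a short Python sketch for a single configuration. It is a simplified stand-in, not the DeCipher algorithm: the function name is hypothetical, the two sets are assumed to be disjoint, and periodic boundary conditions are ignored.

```python
import math

def running_coordination_number(set_a, set_b, radii):
    """Average number of set_b atoms within each radius of a set_a atom.

    set_a, set_b: lists of (x, y, z) tuples for one configuration.
    radii: increasing list of radial distances at which to report CN.
    """
    # All A-B pair distances for this configuration.
    distances = [math.dist(a, b) for a in set_a for b in set_b]
    # Cumulative count of pairs inside each radius, per set_a atom.
    return [sum(d <= r for d in distances) / len(set_a) for r in radii]
```

Averaging the returned lists over several configurations gives the averaged result mentioned above; a histogram of the same distances, normalized by shell volume and bulk density, would give the pair distribution function itself.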

Spectra/OrientRDF

Computes the orientational pair, spherical, or cylindrical radial distribution function between two sets of vectors, a set of vectors and a reference point, or a set of vectors and a reference axis, respectively. Results can be obtained for the instantaneous configuration, or for a set of specified configurations, individually or as an average.

Spectra/AngleDistFunc

Computes the angular (orientational) distribution function over a set of vectors.

Spectra/CN_Distrib

This command examines the distribution of coordination numbers for sets of atoms at a given radial distance. It can be used only after the coordination numbers have been computed by the Spectra/RadialDistFunc command using the CN_Distrib option.

Spectra/MeanSqDisp

Computes the mean square displacement and/or the distance traveled by a specified set of atoms as a function of time.

The self-diffusion coefficient for a set of atoms can then be obtained directly from the slope of the MSD vs. time curve (D = slope/6). The slope can be obtained through the Analysis Graph/Line_Fit command.
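The relation D = slope/6 can be sketched in Python as follows. This is a minimal illustration, not DeCipher code: the function names are hypothetical, the MSD uses only the first frame as its time origin, coordinates are assumed to be unwrapped, and the slope comes from an ordinary least-squares line fit standing in for the Line_Fit command.

```python
def mean_square_displacement(frames):
    """MSD of every frame relative to the first frame.

    frames[t][i] is the (x, y, z) position of atom i at frame t.
    """
    origin = frames[0]
    n_atoms = len(origin)
    return [sum((p[k] - o[k]) ** 2
                for p, o in zip(frame, origin)
                for k in range(3)) / n_atoms
            for frame in frames]

def fit_slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx

def self_diffusion_coefficient(times, msd):
    """D = slope / 6 for three-dimensional diffusion."""
    return fit_slope(times, msd) / 6.0
```

In practice the fit would be restricted to the linear part of the curve, which this sketch leaves to the caller.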

Spectra/TimeCorrFunc

Computes auto and cross time correlation functions for scalar and vector functions, and their Fourier transforms to obtain the spectral density. Orientation correlation functions are obtained as first- and second-order Legendre polynomials. Self-diffusion coefficients can be obtained from the zero-frequency component of the velocity autocorrelation functions.
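A direct (non-Fourier) evaluation of a vector autocorrelation function can be sketched in Python as below. The function names are hypothetical and the sketch handles a single vector series; averaging over atoms is left out. The last helper uses the standard Green-Kubo relation D = (1/3) ∫ <v(0)·v(t)> dt, which is equivalent to taking the zero-frequency value, with a simple trapezoid rule as the integrator.

```python
def autocorrelation(series, max_lag):
    """Time autocorrelation <A(0).A(t)>, averaged over all time origins.

    series[t] is a vector (tuple of components) at frame t.
    max_lag must be smaller than len(series).
    """
    n = len(series)
    result = []
    for lag in range(max_lag + 1):
        dots = [sum(a * b for a, b in zip(series[t], series[t + lag]))
                for t in range(n - lag)]
        result.append(sum(dots) / len(dots))
    return result

def normalize(acf):
    """Scale a correlation function so that its zero-lag value is 1."""
    return [value / acf[0] for value in acf]

def diffusion_from_vacf(vacf, dt):
    """Green-Kubo estimate: one third of the time integral of the VACF."""
    integral = sum(0.5 * (vacf[i] + vacf[i + 1]) * dt
                   for i in range(len(vacf) - 1))
    return integral / 3.0
```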

Spectra/ResidencyTime

Computes the time spent by particles in their relative spatial neighborhood, represented as a function of the radial distance between particles in two sets of atoms. This command also computes the time a given function remains within a given range.

Geometrics

Geometrics are 0D, 1D, and 2D geometrical objects (points, vectors, and planes). The vectors can be used as monitors for vector functions and tensor functions (by using their principal axes representation). This pulldown contains commands to define and modify the display characteristics of specified geometrical objects.

Geometrics/Point

Defines arbitrary geometric points either directly or relative to atom positions. Points defined directly can be moved freely, while those associated with sets of atoms cannot.



Geometrics/Vector

Defines vectors geometrically from specified points or atom positions, as the best fit to a set of atoms, or as graphical monitors of vector functions.

Geometrics/Plane

Defines a plane from a set of arbitrary points, or as a best fit to a set of atoms.

An arbitrary point may be an atom, a pseudoatom (see the Builder Pseudo_Atom/Define command), or a point object (see Geometrics/Point). If a specified point represents a set of points, the geometric center of those points is used as the specified point.

Geometrics/Style

Changes the display style of geometrical objects. The two rendering styles available are wire-mesh and solid.

Geometrics/Color

Colors specified geometrical objects with individual colors, or uses a Color_Spectrum to color vectors according to their norm. If unit or scaled vectors are displayed, this command colors them by the norm of the original vector function, not by the displayed length.

Geometrics/Info

Gives the definition of a specified geometric object.



SubStructure

A SubStructure is a collection of ordered sets of atoms, or “n-tuple sets”. Presently they correspond to sets of substructural elements: bonds, valence angles, out-of-planes, impropers, dihedrals, and non bonds (excluding 1-3 interactions). These sets are named and can be heterogeneous, i.e., contain any combination of internals and non bonds.

This pulldown contains commands to create, manipulate, and output these substructure sets.

SubStructure/Get

The SubStructure/Get command is used to create a new substructure (a subset of internals or non bonds) by reading its definition from a subset file (.sub), a restraint file (.rstrnt), or a molecular data file (.mdf). The first two file types can be created by the SubStructure/Put command. Molecular data files are created by the Viewer Molecule/Put command.

SubStructure/Put

Saves sets of internals and/or non bonds to disk, in either restraint file format or subset file format.

SubStructure/Internals

Defines sets of internals contained in any molecular substructure (atom subset). They can be defined using any atomic attribute, including atom names and potential types, charges, masses, hybridization, etc. Internals can be tabulated using the SubStructure/Info command and the Functions/Construct_Table command.



SubStructure/NonBonds

Defines sets of non bonded atom pairs between any two sets of atoms within a given distance range. Non bonds can be tabulated using the SubStructure/Info command and the Functions/Construct_Table command.

SubStructure/Rename

Renames specified sets of internals or non bonds.

SubStructure/Delete

Deletes specified sets of internals or non bonds.

SubStructure/Info

Gives the definition of specified sets of internals or non bonds and a tabular output of their values for the displayed configuration.

SubStructure/Plot

Creates Ramachandran-style plots for the angles and molecules specified. The plot can cover either a single configuration or a set of configurations across the previously loaded data set.

Cluster

Commands under the Cluster pulldown are used to perform cluster analysis on molecular configurations (conformations). The analysis begins by creating a cluster graph that displays the results of a structural comparison throughout the defined configuration data set. Every frame within the set of configurations is compared to every other frame, and the RMS resulting from the superimposition of each frame pair is displayed within the cluster graph.



Cluster/Construct_Graph

Creates a 3D graph of Configuration/Configuration/RMS, called a “cluster graph”. The configurations must first be loaded through the Configurations/Get command.

Cluster/Repartition

Repartitions a “cluster graph” by redefining the RMS bins, both in the number of bins and the RMS value range, and the associated colors.

Cluster/Family

Creates or modifies a configurations subset that can be saved using the Configurations/Put command. The configurations subset, or “family”, is grouped as an assembly.


5 Online Tutorials

Pilot online tutorials

Most tutorials are now available online for use with the Pilot interface. To access the online tutorials for DeCipher, click the mortarboard icon in the Insight II interface.

Then, from the Open Tutorial window, select DeCipher tutorials, and choose from the list of available lessons:

♦ Lesson 1: Getting the Data of a Box of Water

♦ Lesson 2: Compute the Velocity Auto-correlation Function

♦ Lesson 3: Compute the Mean Square Displacement

♦ Lesson 4: Compute the Pair Distribution Function for a Box of Water

♦ Lesson 5: Structure Dynamics

♦ Lesson 6: Property Geometrics

You can access the Open Tutorial window at any time by clicking the Open File button in the lower left corner of the Pilot window.

For a more complete description of Pilot and its use, click the on-screen help button in the Pilot interface or refer to the Introduction to Insight II chapter in the Insight II manual.




A References

Allen, M.P.; Tildesley, D.J. Computer Simulation of Liquids, Clarendon Press: Oxford (1989).

Berne, B.J.; Pecora, R. Dynamic Light Scattering, John Wiley & Sons: New York (1976).

Brooks, C.L.; Karplus, M.; Pettitt, B.M. “Proteins: A theoretical perspective of dynamics, structure, and thermodynamics”, Advances in Chemical Physics, Vol. LXXI, John Wiley & Sons: New York (1988).

Corongiu, G.; Clementi, E. “Solvated water molecules and hydrogen-bridged networks in liquid water”, J. Chem. Phys., 98, 2241-2249 (1993).

Egelstaff, P.A.; Gray, C.G.; Gubbins, K.E. “Molecular structure and properties”, MTP International Review of Science, Physical Chemistry, Series 2, Vol. 2, Butterworths: London (1975).

Gray, C.G.; Gubbins, K.E. The Theory of Molecular Fluids. I. Fundamentals, Clarendon Press: Oxford (1984).

Gray, C.G.; Henderson, R.L. Can. J. Phys., 56, 571 (1987); 57, 1605 (1979).

Greengard, L.; Rokhlin, V. “On the evaluation of electrostatic interactions in molecular modeling”, Chemica Scripta, 29A, 139-144 (1989).

Jorgensen, W.L.; Chandrasekhar, J.; Madura, J.D.; Impey, R.W.; Klein, M.L. “Comparison of simple potential functions for simulating liquid water”, J. Chem. Phys., 79, 926-935 (1983).

Powles, J.G. Advan. Phys., 22, 1 (1973).

Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; Vetterling, W.T. Numerical Recipes in C, Cambridge University Press (1990).

Priestley, M.B. Spectral Analysis and Time Series, Birnbaum, Z.W.; Lukacs, E., Eds., Academic Press: London (1992).

Rahman, A.; Stillinger, F.H. “Molecular dynamics study of liquid water”, J. Chem. Phys., 55, 3336-3359 (1971).

Rose, M.E. Elementary Theory of Angular Momentum, Wiley: New York (1957).

Soper, A.K.; Egelstaff, P.A. Mol. Phys., 39, 1202 (1980).

Stillinger, F.H.; Rahman, A. “Improved simulation of liquid water by molecular dynamics”, J. Chem. Phys., 60, 1545-1557 (1974).

Stone, A.J. Chem. Phys. Lett., 83, 233 (1982).

Stone, A.J.; Alderton, M. “Distributed multipole analysis: methods and applications”, Molecular Physics, 56, 1047-1064 (1985).

Vieillard-Baron, J. E. “Phase transition of the classical hard ellipse system”, J. Chem. Phys., 56, 4729-44 (1972).

Yamashita, M.M.; Wesson, L.; Eisenman, G.; Eisenberg, D. Proc. Natl. Acad. Sci. USA, 87, 5648-5652 (1990).

Zwanzig, R. “Time correlation functions and transport coefficients in statistical mechanics”, Ann. Rev. Phys. Chem., 16, 67-102 (1965).


B Glossary

Table 5

Symbol    Meaning

A         general property variable
i, j      particle or vector indices
a, b      atom indices
m         atom mass
M         mass of a particle
N         the number of atoms in a particle
p         momentum
q         atom partial charge
Q         total charge
r         position vector
rCOC      position of the center of charge (COC) of a particle
rCOG      position of the center of geometry (COG) of a particle
rCOM      position of the center of mass (COM) of a particle
Epot      potential energy
Ek        kinetic energy
U         electrostatic interaction energy
v         velocity
V         volume of a particle
µ         electric dipole moment vector

An arrowhead variable (A with an arrow above it) and a bold variable, A, denote vectors.
A subscript α denotes the α component (x, y, z) of a vector.
A superscript ε denotes the sign (+, –) of the charge.




C File Formats

Multiple-file trajectory definition file

The trajectory definition file is a list of files that together describe a single trajectory or configuration set.

The format for this file is a list of files, one per line, with an optional file type descriptor on the same line. If there is no file type descriptor, the type is inferred from the file extension. The mapping is as follows, where the Type column is the file type descriptor that corresponds to each file format and extension:

Table 6

Extension               Type      Source/Format

his                     History   Discover history file
arc/car/cor             Archive   Discover archive file
dcd/DCD                 Charmm    CHARMm output file
tor/xdr_tor             Torsion   torsion file produced in Search Compare
crd                     Amber     Amber coordinates file
csr                     CSR
trj/atrj/qtrj/amf_trj   OFF       OFF file format

The file type is inferred from the extension (.his) in the following sample trajectory definition file:

    !BIOSYM traj_list
    dynamics_run1.his
    dynamics_run2.his
    dynamics_run3.his

In this sample the file type is specified after the filename:

    !BIOSYM traj_list
    discover_rotors.arc Archive
    csr_run.csr CSR
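The inference rule described above can be sketched as a small Python reader. This is an illustration only, not the DeCipher parser. The function name `parse_trajectory_list` is hypothetical, and the sketch assumes that fields are separated by whitespace, that filenames contain no spaces, and that any line starting with "!" (such as the "!BIOSYM traj_list" header in the samples) can be skipped. The extension mapping is copied from Table 6.

```python
import os

# Extension-to-type mapping taken from Table 6.
EXTENSION_TYPES = {
    "his": "History",
    "arc": "Archive", "car": "Archive", "cor": "Archive",
    "dcd": "Charmm", "DCD": "Charmm",
    "tor": "Torsion", "xdr_tor": "Torsion",
    "crd": "Amber",
    "csr": "CSR",
    "trj": "OFF", "atrj": "OFF", "qtrj": "OFF", "amf_trj": "OFF",
}

def parse_trajectory_list(text):
    """Return a list of (filename, file_type) pairs.

    An explicit type descriptor on the line wins; otherwise the type is
    inferred from the file extension.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # assumption: "!" marks the header line
        parts = line.split()
        filename = parts[0]
        if len(parts) > 1:
            file_type = parts[1]
        else:
            extension = os.path.splitext(filename)[1].lstrip(".")
            if extension not in EXTENSION_TYPES:
                raise ValueError("cannot infer file type for " + filename)
            file_type = EXTENSION_TYPES[extension]
        entries.append((filename, file_type))
    return entries
```

Applied to the first sample, the sketch reports three History files; applied to the second, it returns the Archive and CSR types exactly as written on each line.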


D Units

Electrostatic conversion factor

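$e = 1.60219 \times 10^{-19}\ \mathrm{C}$

$4\pi\varepsilon_0 = 1.11265 \times 10^{-10}\ \mathrm{J^{-1}\,C^2\,m^{-1}}$

$1\ \text{Å} = 10^{-10}\ \mathrm{m}$

Avogadro’s constant $= 6.02205 \times 10^{23}\ \mathrm{mol^{-1}}$

$1\ \mathrm{cal} = 4.18413\ \mathrm{J}$

For a single pair of unit charges ($q_i = q_j = e$) separated by $r_{ij} = 1$ Å:

\[
\frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}
= \frac{(1.60219 \times 10^{-19})^2\ \mathrm{C^2}}{1.11265 \times 10^{-10}\ \mathrm{J^{-1}\,C^2\,m^{-1}} \times 10^{-10}\ \mathrm{m}} \times \frac{1\ \mathrm{cal}}{4.18413\ \mathrm{J}}
= 5.51397 \times 10^{-19}\ \mathrm{cal}
\]

Eq. 118

For one mole of substance:

\[
5.51397 \times 10^{-19} \times 6.02205 \times 10^{23} = 332054\ \mathrm{cal/mol} = 332.054\ \mathrm{kcal/mol}
\]

Eq. 119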
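The arithmetic of Eq. 118 and Eq. 119 can be checked with a few lines of Python. The constants are the values quoted in this appendix; the function and variable names are hypothetical.

```python
E_CHARGE = 1.60219e-19        # C
FOUR_PI_EPS0 = 1.11265e-10    # J^-1 C^2 m^-1
ANGSTROM = 1.0e-10            # m
AVOGADRO = 6.02205e23         # mol^-1
JOULES_PER_CAL = 4.18413      # J per cal, as quoted in this appendix

def coulomb_factor_kcal_per_mol():
    """Energy of two unit charges 1 Å apart, in kcal/mol."""
    joules = E_CHARGE ** 2 / (FOUR_PI_EPS0 * ANGSTROM)   # one pair, in J
    calories = joules / JOULES_PER_CAL                   # Eq. 118
    return calories * AVOGADRO / 1000.0                  # Eq. 119

def coulomb_energy(q_i, q_j, r_ij):
    """Electrostatic energy in kcal/mol for charges in units of e, r in Å."""
    return coulomb_factor_kcal_per_mol() * q_i * q_j / r_ij
```

The factor evaluates to approximately 332.054, so two opposite unit charges 2 Å apart give about -166 kcal/mol.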



Frequency conversion

$c$ = speed of light $= 3 \times 10^{10}\ \mathrm{cm\,s^{-1}}$

1 cycle per picosecond ($\mathrm{ps^{-1}}$) $= 10^{12}$ Hz $= (1/c) \times 10^{12}\ \mathrm{cm^{-1}} = 100/3\ \mathrm{cm^{-1}}$
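The same conversion expressed as a short Python helper, using the rounded speed of light given above. The function name is hypothetical.

```python
SPEED_OF_LIGHT_CM_PER_S = 3.0e10  # rounded value used in this appendix

def per_ps_to_wavenumber(frequency_per_ps):
    """Convert a frequency in cycles per picosecond to wavenumbers (cm^-1)."""
    hertz = frequency_per_ps * 1.0e12
    return hertz / SPEED_OF_LIGHT_CM_PER_S
```

One cycle per picosecond therefore corresponds to 100/3, or about 33.3, cm^-1.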


E Atom_Template Expressions

Among the topological atom attributes, the Atom_Template is unique in that it is evaluated on the fly rather than being stored. It is a very powerful means of identifying atomic environments. It relies on a SMILES-like string and is, as such, a topology specification language. It is used in particular in potential templates, which assign atom types in Insight II. The expression specifies the bonded atoms and bond orders that must be encountered in the search outward from the atom in question.

Bond order characters

In the Atom_Template context, the string specifies the topological environment of an atom, which is marked by the preceding special character >. All bonded substructures are enclosed in parentheses or brackets, and begin with a bond order character from the following set:

♦ - = single bond

♦ : = partial double bond

♦ = = double bond

♦ # = triple bond

♦ ~ = wildcard bond (matches any bond order)

Atoms are indicated by their chemical symbol, and the special symbol * is used to indicate a match with any element. Any substructure can in turn have nested substructures. This can continue as deep as necessary to uniquely identify the atom’s environment. Spaces can be used in the template to increase readability.



Examples of Atom_Templates

The Atom_Template: (>C (=O) (-O(-H)) (-*)) describes the carbon in the following group:

        anything
       /
    O=C
       \
        O-H

The closing parenthesis for an atom does not indicate that there may not be any more bonds out of the atom, only that no further specific bonds are required for a match. Thus the template (>N) matches any nitrogen, no matter what its bonded neighbors are.

Brackets around an atom are used when you wish to indicate that the connectivity must match the template exactly. Thus the template [>Ca] can be used for ionic calcium, and will not match a calcium with atoms bonded to it. When an atom with substructures is bracketed, it is only that atom, not all the substructures, which is limited in its connectivity.

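The Atom_Template: (>C[:N(-*)]) matches a carbon joined by a partial double bond to a nitrogen whose only other bond is a single bond to one further atom. Because the nitrogen is bracketed, a nitrogen that carries additional substituents does not match.

[Structure diagrams of the matching and non-matching groups are not recoverable from the source.]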



The Atom_Template: (>H) matches any hydrogen.

The Atom_Template: (>H(-C)) matches a hydrogen single bonded to a carbon. There is no restriction on what may be bonded to the carbon.

The Atom_Template: (>H[-O(-H)]) matches a hydrogen single bonded to an oxygen which in turn has a hydrogen bonded to it. The brackets around the OH prevent matches in the unlikely situation of a charged oxygen which happens to have two hydrogens. With the brackets present, the oxygen must have only the bonds specified.

The Atom_Template: (>C(-H)(-H)(-*)(-*)) matches a carbon with four single bonds, two of which are to hydrogens. Note that the hydrogen substructures appear before the wildcard substructures. This prevents a hydrogen from being used for the wildcard match and hence not being available for the more specific substructure match.
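The nesting rules described in this appendix can be made concrete with a small recursive parser that turns a template string into a tree. The Python sketch below is not the Insight II implementation. The grammar is inferred from the examples above (an opening parenthesis or bracket, an optional bond order character, an optional > marker, an element symbol or *, nested substructures, and the matching closer), and all function names and dictionary keys are hypothetical. It only parses a template; it does not match it against a molecule.

```python
BOND_ORDERS = {"-": "single", ":": "partial double", "=": "double",
               "#": "triple", "~": "any"}
CLOSERS = {"(": ")", "[": "]"}

def parse_atom_template(text):
    """Parse an Atom_Template string into a nested dictionary."""
    chars = text.replace(" ", "")  # spaces only aid readability
    node, pos = _parse_group(chars, 0)
    if pos != len(chars):
        raise ValueError("unexpected trailing characters")
    return node

def _parse_group(chars, pos):
    opener = chars[pos]
    if opener not in CLOSERS:
        raise ValueError("expected '(' or '[' at position %d" % pos)
    pos += 1
    bond = None
    if pos < len(chars) and chars[pos] in BOND_ORDERS:
        bond = BOND_ORDERS[chars[pos]]
        pos += 1
    target = pos < len(chars) and chars[pos] == ">"
    if target:
        pos += 1
    start = pos
    if pos < len(chars) and chars[pos] == "*":
        pos += 1  # wildcard element
    else:
        while pos < len(chars) and chars[pos].isalpha():
            pos += 1
    element = chars[start:pos]
    if not element:
        raise ValueError("missing element symbol at position %d" % start)
    children = []
    while pos < len(chars) and chars[pos] in CLOSERS:
        child, pos = _parse_group(chars, pos)
        children.append(child)
    if pos >= len(chars) or chars[pos] != CLOSERS[opener]:
        raise ValueError("missing closing '%s'" % CLOSERS[opener])
    node = {"element": element, "bond": bond, "target": target,
            "exact": opener == "[", "children": children}
    return node, pos + 1
```

In the resulting tree, "exact" records whether an atom was bracketed, so a matcher built on top of it would know that the atom may have no bonds beyond those listed.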




