Computer-Chemie-Centrum Universität Erlangen-Nürnberg
3D-QSAR
Tim Clark
Computer-Chemie-Centrum
Universität Erlangen-Nürnberg
Nägelsbachstraße 25
91052 Erlangen

Structure-Activity Relationships
[Figure: chemical structure is linked to biological activity by QSAR and to physical properties by QSPR]

[Figure: overview of what can be modeled, on a scale from easy to difficult to impossible]
Systems: molecules, gases, perfect crystals, liquids, polymers, crystal defects, amorphous solids
Categories: small, medium, large; organic, inorganic, hybrid; equilibrium, fast (τ < ns), intermediate
Properties: size, structure, energy, enthalpy, dipole moments, polarizability, binding energy, IR spectra, transition states, activation energy, NMR spectra, elastic modulus, UV spectra, free energy

QSPR Methods for Polymers
• The Van Krevelen Method
o D. W. Van Krevelen, Properties of Polymers, 3rd ed. (Amsterdam, Elsevier, 1990).
• The Askadskii Method
o Andrey A. Askadskii, Physical Properties of Polymers: Prediction and Control (Amsterdam, Gordon and Breach Publishers, 1996).
• Connectivity Indices
o Jozef Bicerano, Prediction of Polymer Properties (New York, Marcel Dekker, Inc., 1993).

Van Krevelen
• The Van Krevelen method is a group-additive method
• Each group in the monomer is assigned an additive increment
• The target property is obtained by simply summing the increments due to each fragment in the monomer

Van Krevelen
Polystyrene: –[CH(C6H5)–CH2]n–
[Figure: the repeat unit is split into Group 1, CH(C6H5), and Group 2, CH2]

Van Krevelen (1)
Density, ρ
Polystyrene: CH(C6H5) + CH2
CH(C6H5): M = 90.12, V = 82.15
CH2: M = 14.03, V = 15.85
$$\rho = \frac{M}{V} = \frac{\sum_{i=1}^{N(groups)} M_i}{\sum_{i=1}^{N(groups)} V_i}$$
$$\rho = \frac{14.03 + 90.12}{15.85 + 82.15} = 1.06\ \mathrm{g\,cm^{-3}}$$
Exp. = 1.05 g cm-3

Van Krevelen (2)
Glass-transition temperature, Tg
Polystyrene: CH(C6H5) + CH2
CH(C6H5): M = 90.12, Yg = 35000
CH2: M = 14.03, Yg = 2700
$$T_g = \frac{Y_g}{M} = \frac{\sum_{i=1}^{N(groups)} Y_{g,i}}{\sum_{i=1}^{N(groups)} M_i}$$
$$T_g = \frac{2700 + 35000}{14.03 + 90.12} = 362\ \mathrm{K}$$
Exp. = 373±2 K
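
The two polystyrene calculations can be reproduced with a few lines of Python. This is a minimal sketch, not Van Krevelen's own tabulation: the dictionary layout and the function name `van_krevelen` are illustrative, and Yg = 35000 for the phenyl-bearing group is assumed because it is the value that reproduces the 362 K quoted on the slide.

```python
# Group increments for the polystyrene example.
# M: molar mass (g/mol), V: molar volume (cm^3/mol), Yg: glass-transition function (K g/mol).
# Yg = 35000 for CH(C6H5) is an assumption chosen to match the quoted Tg of 362 K.
GROUPS = {
    "CH(C6H5)": {"M": 90.12, "V": 82.15, "Yg": 35000.0},
    "CH2": {"M": 14.03, "V": 15.85, "Yg": 2700.0},
}

def van_krevelen(repeat_unit):
    """Return (density in g/cm^3, Tg in K) for a repeat unit given as {group: count}."""
    total = {"M": 0.0, "V": 0.0, "Yg": 0.0}
    for group, count in repeat_unit.items():
        for key in total:
            total[key] += count * GROUPS[group][key]
    # Density is summed mass over summed volume; Tg is summed Yg over summed mass.
    return total["M"] / total["V"], total["Yg"] / total["M"]

density, tg = van_krevelen({"CH(C6H5)": 1, "CH2": 1})
print(f"density = {density:.2f} g/cm^3, Tg = {tg:.0f} K")
```

Adding a new polymer only requires new entries in `GROUPS`, which is also where the method's main weakness shows: a group without tabulated increments cannot be treated.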

Van Krevelen (3)
• Advantages
o Fast, easy
o Usually accurate
• Disadvantages
o Missing parameters for new groups
o Not applicable for random polymers or copolymers

Askadskii
• The Askadskii method treats each monomer as a series of harmonic oscillators
• The thermal movement related to each harmonic oscillator is in turn related to the glass-transition temperature
• After some manipulation, this concept leads to a simple additive model

Askadskii
Glass-transition temperature, Tg
$$T_g = \frac{\sum_{i=1}^{N(atoms/groups)} \Delta V_i}{\sum_{i=1}^{N(atoms/groups)} a_i\,\Delta V_i + \sum_{i=1}^{N(atoms/groups)} b_i}$$
ΔV_i = van der Waals volume of atom or group i
a_i, b_i = semiempirical coefficients

Typical errors
Property                            Van Krevelen   Askadskii
Density                             ±1.58%         ±3.42%
Refractive index                    ±0.66%         ±1.02%
Heat capacity (liquid)              ±5.62%         ±5.12%
Heat capacity (solid)               ±7.21%         ±4.32%
Tg                                  ±3.71%         ±5.82%
Dielectric constant                 ±5.22%         ±6.09%
Thermal decomposition temperature   -              ±2.99%
Surface tension                     -              ±13.23%

Bicerano
• The Bicerano method is based on “electrotopological indices”, which were introduced by Kier and Hall:
o Molecular Structure Description, L. B. Kier and L. H. Hall, Academic Press, San Diego, 1999.
• Topological indices are derived from molecular bonding graphs

Descriptors: 2D
• Topological descriptors
o e.g. Kier and Hall:
o χn: n different types of descriptor that describe mainly the branching in the molecule
o κn: “shape” descriptors
o E-States: “electronic” descriptors that describe the acceptor properties of the atoms.

Topological indices
• Graph-theoretical invariants
• W, the Wiener index
o Oldest topological index
o Correlates with the surface area of the molecule
o D_ij is the bond distance between atoms i and j, i.e. the number of bonds on the shortest path
$$W = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} D_{ij}$$
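
A short sketch of the Wiener index for a hydrogen-suppressed molecular graph. The adjacency-dictionary representation and the function name `wiener_index` are illustrative choices; the bond distances are obtained by breadth-first search.

```python
from collections import deque

def wiener_index(adjacency):
    """Wiener index: half the sum of all shortest-path bond counts D_ij."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives the bond distance from `start` to every atom.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    # Every pair (i, j) was counted twice, once from each end.
    return total // 2

# Carbon skeletons only (hydrogens suppressed).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
print(wiener_index(n_butane), wiener_index(isobutane))  # prints "10 9"
```

The branched isomer has the smaller index, which is the sense in which W tracks molecular compactness.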

Topological indices
• χ molecular connectivity index (Kier, Hall)
o Capacity of molecules for bimolecular interaction
o σi number of sigma electrons, hi number of connected hydrogens
… and many more
• Used frequently in published models but often of limited use in practical application due to difficult interpretation of descriptors
o The inverse QSAR problem: going from model to compound

Kier and Hall Topological Indices
• Molecular connectivity chi and kappa indices (1995)
o L. H. Hall and L. B. Kier, The Molecular Connectivity Chi and Kappa Shape Indexes in Structure-Property Modeling, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd (eds), VCH, New York, 1999.
o Connectivity indices intended primarily to describe the molecular shape.

χ (connectivity) Indices: Definitions
δ_i = number of skeletal (non-hydrogen) neighbor atoms to atom i
Zeroth order chi index:
$${}^0\chi = \sum_{i=1}^{N(atoms)} \frac{1}{\sqrt{\delta_i}}$$

χ (connectivity) Indices: Definitions
δ_i = number of skeletal (non-hydrogen) neighbor atoms to atom i
i and j are the two atoms involved in bond ij
First order chi index:
$${}^1\chi = \sum_{ij=1}^{N(bonds)} \frac{1}{\sqrt{\delta_i\,\delta_j}}$$
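
The zeroth and first order chi indices follow directly from the bond list of the hydrogen-suppressed graph. The sketch below assumes the usual Kier-Hall form with an inverse square root of δ; the function name `chi_indices` and the bond-list input format are illustrative.

```python
import math

def chi_indices(bonds):
    """Return (0chi, 1chi) for a hydrogen-suppressed graph given as a list of bonds (i, j)."""
    # delta_i = number of non-hydrogen neighbours of atom i
    delta = {}
    for i, j in bonds:
        delta[i] = delta.get(i, 0) + 1
        delta[j] = delta.get(j, 0) + 1
    chi0 = sum(1.0 / math.sqrt(d) for d in delta.values())              # sum over atoms
    chi1 = sum(1.0 / math.sqrt(delta[i] * delta[j]) for i, j in bonds)  # sum over bonds
    return chi0, chi1

n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (1, 2), (1, 3)]
for name, bonds in (("n-butane", n_butane), ("isobutane", isobutane)):
    chi0, chi1 = chi_indices(bonds)
    print(f"{name}: 0chi = {chi0:.4f}, 1chi = {chi1:.4f}")
```

Branching lowers the first order index (1.9142 for n-butane against 1.7321 for isobutane), which is why the chi indices are read mainly as branching descriptors.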

χ Indices: Heat of Atomization for Alkanes
$$\Delta H_{atom} = 286.38\,N_C - 12.46\,{}^1\chi + 1.515\,{}^4\chi_P + 1.142\,{}^4\chi_{PC} - 2.474\,{}^5\chi_C - 2.026\,{}^5\chi_{PC} + 114.38$$
Higher order indices depend on paths (P) or clusters (C) in the molecular graph.
Standard deviation to experiment = 0.46 kcal mol-1

κ (shape) Indices: Paths
Path counts for eight six-carbon molecules (the structures are drawings in the original figure):
Molecule   N_C   1P   2P   3P
1          6     5    4    3
2          6     5    5    4
3          6     5    5    3
4          6     5    6    4
5          6     5    7    3
6          6     6    6    6
7          6     7    11   13
8          6     7    10   14

κ (shape) Indices: Definitions
mP_max, mP_min = maximum and minimum possible path counts of order m for a given N_C
1P_i is the first order path number for molecule i
First order kappa index:
$${}^1\kappa = \frac{2\,{}^1P_{max}\,{}^1P_{min}}{({}^1P_i)^2} = \frac{N_C\,(N_C - 1)^2}{({}^1P_i)^2}$$

κ (shape) Indices: Definitions
mP_max, mP_min = maximum and minimum possible path counts of order m for a given N_C
2P_i is the second order path number for molecule i
Second order kappa index:
$${}^2\kappa = \frac{2\,{}^2P_{max}\,{}^2P_{min}}{({}^2P_i)^2} = \frac{(N_C - 1)(N_C - 2)^2}{({}^2P_i)^2}$$

κ (shape) Indices: Definitions
Third order kappa index:
$${}^3\kappa = \frac{4\,{}^3P_{max}\,{}^3P_{min}}{({}^3P_i)^2}$$
$${}^3\kappa = \frac{(N_C - 1)(N_C - 3)^2}{({}^3P_i)^2} \quad \text{for } N_C \text{ odd}$$
$${}^3\kappa = \frac{(N_C - 3)(N_C - 2)^2}{({}^3P_i)^2} \quad \text{for } N_C \text{ even}$$
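
The three kappa definitions reduce to simple arithmetic once the path counts are known. The sketch below assumes the closed forms in N_C given above; the function name `kappa_indices` is illustrative, and the two sets of path counts are rows of the path table (reading them as the unbranched chain and the six-membered ring is an assumption, since the structure drawings are not part of the text).

```python
def kappa_indices(n_atoms, p1, p2, p3):
    """Return (1kappa, 2kappa, 3kappa) from the atom count and the path counts 1P, 2P, 3P."""
    k1 = n_atoms * (n_atoms - 1) ** 2 / p1 ** 2
    k2 = (n_atoms - 1) * (n_atoms - 2) ** 2 / p2 ** 2
    if n_atoms % 2 == 1:
        k3 = (n_atoms - 1) * (n_atoms - 3) ** 2 / p3 ** 2   # odd number of atoms
    else:
        k3 = (n_atoms - 3) * (n_atoms - 2) ** 2 / p3 ** 2   # even number of atoms
    return k1, k2, k3

chain = kappa_indices(6, 5, 4, 3)   # path counts 5, 4, 3
ring = kappa_indices(6, 6, 6, 6)    # path counts 6, 6, 6
print("chain:", [round(k, 3) for k in chain])
print("ring: ", [round(k, 3) for k in ring])
```

All three indices are larger for the chain than for the ring, so kappa values fall as the shape becomes more compact.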

Electrotopological Indices; the E-State
h_i = number of hydrogens bonded to atom i
$$\delta_i^v = Z_i^v - h_i$$
where Z_i^v is the number of valence electrons for atom i
[Figure: molecule containing two oxygen atoms, each heavy atom labeled with (δ, δv): (1, 1), (1, 6), (3, 4), (2, 6), (2, 2), (1, 1)]

δ and δv
Atom   Hybridization   δv   δ
C      sp3             4    4
C      sp2             4    3
C      sp              4    2
N      sp3             5    3
N      sp2             5    2
N      sp              5    1
O      sp3             6    2
O      sp2             6    1
F      sp3             7    1

Intrinsic (I) States
$$I = \frac{(2/N)^2\,\delta^v + 1}{\delta}$$
$$\Delta I_{ij} = \frac{I_i - I_j}{r_{ij}^2}$$
N = principal quantum number
r_ij = number of atoms on the shortest path between atoms i and j (the topological distance in bonds plus one)

The Electrotopological (E-)State
$$S_i = I_i + \sum_{j=1}^{N(atoms)} \Delta I_{ij}$$
S_i = the E-State for atom i

E-States
[Figure: the example molecule with its two oxygen atoms, annotated with atomic E-state values: 1.78, 0.48, 1.38, 4.41, -0.20, 9.82]
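
The E-state values can be checked numerically. The sketch below assumes the example molecule is ethyl acetate, CH3-C(=O)-O-CH2-CH3, which matches the (δ, δv) labels of the earlier figure, and it assumes the Kier-Hall convention in which the perturbation is divided by r² with r counted in atoms on the shortest path (bonds + 1). Under those assumptions it reproduces the six values to within about 0.01. All function names are illustrative.

```python
from collections import deque

# Heavy atoms of ethyl acetate (assumed example molecule).
# Each entry: (label, delta, delta_v, principal quantum number N).
ATOMS = [
    ("CH3", 1, 1, 2),
    ("C", 3, 4, 2),
    ("=O", 1, 6, 2),
    ("-O-", 2, 6, 2),
    ("CH2", 2, 2, 2),
    ("CH3", 1, 1, 2),
]
BONDS = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5)]

def intrinsic_state(delta, delta_v, n):
    """I = ((2/N)^2 * delta_v + 1) / delta."""
    return ((2.0 / n) ** 2 * delta_v + 1.0) / delta

def bond_distances(n_atoms, bonds, start):
    """Breadth-first search: number of bonds from `start` to every atom."""
    neighbours = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        a = queue.popleft()
        for b in neighbours[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist

def e_states(atoms, bonds):
    """S_i = I_i + sum_j (I_i - I_j) / r_ij^2, with r_ij = bonds + 1 (assumed convention)."""
    intrinsic = [intrinsic_state(d, dv, n) for _, d, dv, n in atoms]
    states = []
    for i in range(len(atoms)):
        dist = bond_distances(len(atoms), bonds, i)
        perturbation = sum(
            (intrinsic[i] - intrinsic[j]) / (dist[j] + 1) ** 2
            for j in range(len(atoms)) if j != i
        )
        states.append(intrinsic[i] + perturbation)
    return states

STATES = e_states(ATOMS, BONDS)
for (label, *_), s in zip(ATOMS, STATES):
    print(f"{label:>4s}  {s:6.2f}")
```

The carbonyl oxygen comes out with by far the largest E-state (about 9.82) and the carbonyl carbon with a slightly negative one, in line with the reading of E-states as electronic descriptors.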

Bicerano
• The Bicerano method uses chi and kappa indices, I-states and E-states to calculate the properties of polymers.
• E.g. molar volume:
$$V = 3.64277\,{}^0\chi + 9.798697\,{}^0\chi^v - 8.852829\,{}^1\chi + 21.693912\,{}^1\chi^v + 0.978655\,N_{MV}$$
where
$$N_{MV} = 24N_{Si} - 18N_{(-S-)} - 5N_{sulfone} - 7N_{Cl} - 16N_{Br} + 2N_{(backbone\ ester)} + 3N_{ether} + 5N_{carbonate} + 5N_{(C=C)} - 11N_{cyc} - 7(N_{fused} - 1)$$

Typical errors
Property                            Van Krevelen   Bicerano   Askadskii
Density                             ±1.58%         ±2.10%     ±3.42%
Refractive index                    ±0.66%         ±0.63%     ±1.02%
Heat capacity (liquid)              ±5.62%         ±5.55%     ±5.12%
Heat capacity (solid)               ±7.21%         ±5.57%     ±4.32%
Tg                                  ±3.71%         ±5.57%     ±5.82%
Dielectric constant                 ±5.22%         ±1.77%     ±6.09%
Thermal decomposition temperature   -              -          ±2.99%
Surface tension                     -              -          ±13.23%

Internet Sources
• Properties available
o http://www.dtwassociates.com/?phb_list_of_properties
• Interactive demo for Askadskii calculations
o http://mzchem.com/index.wm?opt=8&subopt=0&page=main_1_7.htm
• Handbook with calculation details
o http://www.chemcad.fr/produits/documentation/dtwassociates-ppphand/ppphb_dug.pdf

BCUT Descriptors
Adjacency matrix:
• Diagonal elements:
o Atomic number
o Atomic charge
o Atomic polarizability
o H-bond properties
• Off-diagonal elements:
o (Lewis bond orders)/10 for bonded atoms
o 0.001 for all other elements

BCUT Descriptors
[Figure: example molecule containing N, O and OH, with its upper-triangular BCUT matrix (atomic numbers on the diagonal)]
8   .001  .1    .001  .001  .001  .001  .001  .001  .001
    8     .2    .001  .001  .001  .001  .001  .001  .001
          6     .001  .001  .001  .1    .001  .001  .001
                6     .001  .001  .001  .001  .1    .001
                      7     .15   .001  .001  .001  .15
                            6     .15   .001  .001  .001
                                  6     .15   .001  .001
                                        6     .15   .001
                                              6     .15
                                                    6

BCUT Descriptors
• Eigenvalues of the adjacency matrix
• The highest and the lowest eigenvalues are useful ADME descriptors
• BCUTs may be either 2D, as described above, or 3D
• BCUTs are also known as Burden eigenvalue descriptors
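
As an illustration of how the highest and lowest eigenvalues are obtained, the sketch below diagonalizes the example matrix with a plain cyclic Jacobi rotation scheme. The choice of Jacobi rotations is mine (any symmetric eigenvalue solver would do, for example a linear-algebra library), and the helper names are illustrative; the matrix entries are taken from the example matrix of the previous slide.

```python
import math

# Upper triangle of the example BCUT matrix, row by row (diagonal element first).
UPPER = [
    [8, .001, .1, .001, .001, .001, .001, .001, .001, .001],
    [8, .2, .001, .001, .001, .001, .001, .001, .001],
    [6, .001, .001, .001, .1, .001, .001, .001],
    [6, .001, .001, .001, .001, .1, .001],
    [7, .15, .001, .001, .001, .15],
    [6, .15, .001, .001, .001],
    [6, .15, .001, .001],
    [6, .15, .001],
    [6, .15],
    [6],
]

def symmetric_from_upper(upper):
    """Build the full symmetric matrix from its upper triangle."""
    n = len(upper)
    a = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for k, value in enumerate(row):
            a[i][i + k] = a[i + k][i] = float(value)
    return a

def jacobi_eigenvalues(matrix, tol=1e-24, max_sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, sorted ascending."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes a[p][q].
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

EIGENVALUES = jacobi_eigenvalues(symmetric_from_upper(UPPER))
print(f"lowest = {EIGENVALUES[0]:.4f}, highest = {EIGENVALUES[-1]:.4f}")
```

Only the two extreme eigenvalues are kept as descriptors, so every molecule yields the same number of values regardless of its atom count.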

3D-QSAR and QSPR

What is Chemical Structure?
• 2D structure
o Atoms, bonds (“connection tables”)
• 3D structure
o Atoms
o Coordinates
• Molecular surfaces

Molecular Structure
SMILES: N[C@@H](C)C(=O)O (L-Alanine)
[Figure: 2D structure drawings of L-alanine]

3D-QSAR
• 3D must be better than 2D (?)
o We know the “real” structure of the molecule
o Therefore, we also know exactly its binding properties
o ….. but do we ??

Multiple Minima

The Target Molecule
[Figure: zirconium complex with N- and O-donor ligands and two benzyl (CH2Ph) groups]

The Target Molecule
[Figure: plot against heat of formation (kcal mol-1), horizontal axis from -142 to -160, vertical axis from 0 to 2; marked value -161.9]

An Easy Molecule
[Figure: structure of a small molecule bearing Cl, Br, O and four CH3 groups]

Tetracycline: a not-so-easy Molecule
[Figure: two drawings of tetracycline with rings labeled D, C, B and A and ring atoms numbered C1 to C12a]

Two Conformations (just for the rings)
“Extended”
• Favored by solvation
• More stable in solution
“Twisted”
• More stable in vacuo
• Consistently 2.5-3.0 kcal mol-1 less stable than extended in water

Six Tautomers
[Figure: six tautomers of tetracycline, labeled N, Za, Zb, Zc, Zd and Ze, with relative energies (kcal mol-1) of 0.0, 1.7, 2.6, ~6 and 6.4]

Bombykol
• Sex pheromone of the silkworm Bombyx mori
o 11 rotatable bonds
o Roughly 8,000 conformations
[Figure: structure of bombykol]

Biological Conformation

3D Similarity Techniques
• Search for similarity with the target pharmacophore
• Pure shape similarity (www.eyesopen.com)
• Electrostatic similarity and similarity of the electron density (Sanz, Carbo, Richards)
Carbo index:
$$R_{AB} = \frac{\int P_A P_B\,d\tau}{\sqrt{\int P_A^2\,d\tau \cdot \int P_B^2\,d\tau}}$$
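
A minimal numerical sketch of the Carbo index, assuming the usual normalized form (overlap integral divided by the square root of the product of the two self-overlaps) and densities sampled on a common uniform grid, so that the volume element cancels. The one-dimensional Gaussian "densities" are purely illustrative stand-ins for real electron densities.

```python
import math

def carbo_index(p_a, p_b):
    """Carbo index for two densities sampled on the same uniform grid."""
    overlap = sum(a * b for a, b in zip(p_a, p_b))
    norm_a = sum(a * a for a in p_a)
    norm_b = sum(b * b for b in p_b)
    return overlap / math.sqrt(norm_a * norm_b)

def gaussian(center, width, grid):
    """Toy one-dimensional density, used only for illustration."""
    return [math.exp(-((x - center) / width) ** 2) for x in grid]

grid = [-10.0 + 0.05 * i for i in range(401)]
p_ref = gaussian(0.0, 1.0, grid)
same = carbo_index(p_ref, p_ref)
scaled = carbo_index(p_ref, [3.0 * v for v in p_ref])
shifted = carbo_index(p_ref, gaussian(1.0, 1.0, grid))
print(same, scaled, shifted)
```

The index is 1 for identical densities and is unchanged when one density is scaled, so it measures the similarity of shape rather than of magnitude.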

How Much Difference Does Conformation Make?
[Figure: AM1 heat of formation (kcal mol-1, axis from -17 to -9) plotted against predicted boiling point (axis from 350 to 550) for the conformations of a molecule with H2N, NH and NH2 groups]
Boltzmann-averaged predicted boiling point = 444±36°
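
A Boltzmann average of a conformation-dependent prediction can be sketched as follows. The conformer energies and boiling points below are hypothetical placeholders, not values read from the plot, and interpreting the ± figure as a Boltzmann-weighted standard deviation is an assumption.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal mol^-1 K^-1

def boltzmann_average(energies, values, temperature=298.15):
    """Boltzmann-weighted mean and standard deviation of `values` (energies in kcal/mol)."""
    e_min = min(energies)  # shift energies to avoid overflow in the exponentials
    weights = [math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies]
    z = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / z
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / z
    return mean, math.sqrt(variance)

# Hypothetical conformer data (heat of formation, predicted boiling point).
energies = [-16.5, -15.9, -14.8, -12.0]
boiling_points = [420.0, 455.0, 470.0, 510.0]
mean_bp, spread_bp = boltzmann_average(energies, boiling_points)
print(f"Boltzmann-averaged boiling point = {mean_bp:.0f} ± {spread_bp:.0f}")
```

Because the weights fall off exponentially with energy, a few low-energy conformations dominate the average even when many conformations exist.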

Where do we get 3D-Structures?
• X-ray crystal structures
• 2D-3D conversion
o CORINA
o CONCORD
• Geometry optimization
o Force field
o QM
   o Semiempirical
   o Density functional
   o Ab initio

Classical Mechanics (Force Fields)
• “Atoms and springs” mechanical model of molecules

Potential Functions: bond stretch and angle bend
• Bond stretch
$$V_{stretch} = k_{stretch}\,(r - r^0)^2$$
• Angle bend
$$V_{bend} = k_{bend}\,(\theta - \theta^0)^2$$

Potential Functions: Torsions
$$V_{tors} = k_{tors}\,[1 + \cos(N\phi)]$$
N = periodicity of the barrier (e.g. ethane = 3)
One torsional contribution per ABCD combination

Potential Functions: van der Waals/repulsion
$$V_{vdW} = \sqrt{D_A D_B}\left[\left(\frac{R_A + R_B}{r_{AB}}\right)^{12} - 2\left(\frac{R_A + R_B}{r_{AB}}\right)^{6}\right]$$
D = van der Waals well depth
R = van der Waals radius

Potential Functions: van der Waals/repulsion
[Figure: potential energy plotted against distance]

Potential Functions: Coulomb Interactions
• Charge-charge
$$V_{Coulomb} = \frac{q_i\,q_j}{\varepsilon\,r_{ij}}$$
• Dipole-dipole
[Figure: bond dipole]
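
The individual potential functions translate almost one-to-one into code. This is a sketch under stated assumptions, not a real force field: the function names are illustrative, the geometric-mean combination sqrt(D_A·D_B) for the well depth is assumed, and the Coulomb term is written in units where the electrostatic conversion factor is 1.

```python
import math

def v_stretch(r, r0, k):
    """Harmonic bond stretch."""
    return k * (r - r0) ** 2

def v_bend(theta, theta0, k):
    """Harmonic angle bend (angles in radians)."""
    return k * (theta - theta0) ** 2

def v_torsion(phi, k, periodicity):
    """Torsional term k * [1 + cos(N * phi)]."""
    return k * (1.0 + math.cos(periodicity * phi))

def v_vdw(r, d_a, d_b, r_a, r_b):
    """12-6 van der Waals term with its minimum at r = r_a + r_b."""
    x = (r_a + r_b) / r
    return math.sqrt(d_a * d_b) * (x ** 12 - 2.0 * x ** 6)

def v_coulomb(q_i, q_j, r, eps=1.0):
    """Charge-charge interaction (conversion factor taken as 1)."""
    return q_i * q_j / (eps * r)

# Hypothetical parameters, for illustration only.
r_min = 1.9 + 1.7
print(v_vdw(r_min, 0.1, 0.2, 1.9, 1.7))       # minimum: -sqrt(0.1 * 0.2)
print(v_torsion(0.0, 1.5, 3))                 # eclipsed ethane-like torsion: 2k
print(v_torsion(math.pi / 3.0, 1.5, 3))       # staggered: approximately 0
```

The total force-field energy is the sum of these terms over all bonds, angles, torsions and non-bonded pairs, which is why evaluation is cheap enough for conformational searching and dynamics.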

Force-Field Methods
• Molecular mechanics:
o Structures and energies can be more exact than experiment.
o Very well suited for conformational problems, relative stability of isomers etc.
o Cannot extrapolate; only applicable to classes of compounds that are experimentally well characterized.
o Usually not suitable for reactions.

Force-Field Methods
• Conjugated π-systems:
o Each π-system requires its own force field
o The force field for a π-bond depends on the bond order
o A simple MO technique is used to calculate bond orders

Force-Field Methods
• Molecular dynamics:
o Long simulations are necessary in order to obtain good statistical sampling
o Systems (e.g. enzyme + water) are often very large (> 10,000 atoms)

Conformational Searching
• A molecule with N three-fold rotatable bonds has 3^N possible conformations that must be searched
• If we need 1 µs for each conformation, we need about one hour for a molecule with 20 rotatable bonds and about 2×10^10 years for one with 50
• A molecule with 50 rotatable bonds corresponds to (Gly)25
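
The combinatorial growth is easy to check. The sketch below only multiplies 3^N by the 1 µs per conformation quoted above; the function name and the year length of 365.25 days are illustrative choices. With these inputs the arithmetic gives roughly one hour for 20 bonds and on the order of 10^10 years for 50 bonds.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def search_time_seconds(n_bonds, states_per_bond=3, seconds_per_conformation=1e-6):
    """Time for an exhaustive search over states_per_bond ** n_bonds conformations."""
    return states_per_bond ** n_bonds * seconds_per_conformation

for n in (20, 50):
    t = search_time_seconds(n)
    print(f"{n} bonds: {3 ** n:.3e} conformations, "
          f"{t / 3600:.3g} h, {t / SECONDS_PER_YEAR:.3g} years")
```

Each additional rotatable bond triples the cost, so faster hardware moves the practical limit by only a few bonds.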

Conformational Searching
• Simulated annealing to find the most stable (“global”) minimum
• “Dead-end” search algorithms to eliminate high-energy conformations early
• Stochastic search algorithms such as GAs
• Still only possible with force fields

Force-Field Methods
• Molecular dynamics:
o Movements of the atoms (or molecules) are calculated from the forces and velocities
o Integration over long simulation times gives thermodynamic quantities
o “Global” minima can be found by simulated annealing
o Reliable thermodynamic quantities can be obtained from Free Energy Perturbation (FEP) calculations

Molecular Dynamics
• Solve Newton’s equations of motion by numerical integration for the classical mechanical molecular model
• Need to include solvent molecules for biological systems
• Often use periodic boundary conditions to avoid edge effects
• “Long” simulations are of the order of 10 nanoseconds
• “Interesting” protein movements are of the order of microseconds to milliseconds
• The bottleneck is the long-range Coulomb interactions (use Particle-Mesh Ewald, PME)

Force-Field Methods
• Monte Carlo (MC):
o Random movements are tried and selected according to a thermodynamic test (Boltzmann distribution).
o Simulations usually reach equilibrium faster than MD.
o No kinetic information is available.
o Can be used for simulated annealing or Free Energy Perturbation calculations.

Monte Carlo Calculations: an example
• The “dartboard method” for calculating π
[Figure: quarter circle (green) inside the unit square with axes x and y, covered with random points]
• green area = πr²/4
• integrate over the green area for r = 1 → π/4
• ∴
$$\pi = 4\int_{x=0}^{1}\int_{y=0}^{1} f(x,y)\,dx\,dy$$
• where
$$f(x,y) = \begin{cases} 1 & \text{if } x^2 + y^2 \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Monte Carlo Calculations: an example
• Calculate π
$$\pi = 4\int_{x=0}^{1}\int_{y=0}^{1} f(x,y)\,dx\,dy \approx \frac{4}{N}\sum_{i=1}^{N} f(x_i, y_i)$$
• where
$$f(x,y) = \begin{cases} 1 & \text{if } x^2 + y^2 \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Monte Carlo Calculations: an example
3.141592654
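
The dartboard estimate can be written in a few lines. This sketch uses Python's standard `random` module; the function name `estimate_pi` and the fixed seed are illustrative choices made so that runs are repeatable.

```python
import random

def estimate_pi(n_darts, seed=0):
    """Estimate pi as 4 * (fraction of random points in the unit square with x^2 + y^2 <= 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_darts):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:   # f(x, y) = 1 inside the quarter circle
            hits += 1
    return 4.0 * hits / n_darts

for n in (1_000, 100_000):
    print(n, estimate_pi(n))
```

The statistical error falls only as 1/√N, so each additional decimal digit costs about a hundred times more darts.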

Errors
[Figure: log10(error), axis from -15 to -7, plotted against the number of cycles/10^6, axis from 1 to 2000]

Calculational Techniques: Property Prediction
1. Quantitative Structure-Activity and Structure-Property Relationships (QSAR and QSPR)
2. Free-Energy Perturbation calculations (MD or MC)
3. Kinetic or mesoscopic modeling

Molecular Orbital Methods
• Linear Combination of Atomic Orbitals (LCAO)
o Molecular orbitals (MOs) are calculated as linear combinations of atomic orbitals (AOs).
o AOs are usually known as the basis set.
o This approximation was introduced by Erich Hückel.

Hückel Theory
• “π-only” theory (each atom is represented by a single p-orbital, hydrogens are ignored).
• The interaction (β) between bonded atoms is constant, otherwise zero.
• Hückel theory is a one-electron theory.

Hückel Theory: Ethylene

Hückel Matrix
[Figure: butadiene with the carbon atoms numbered C1 to C4]
      C1   C2   C3   C4
C1    α    β    0    0
C2    β    α    β    0
C3    0    β    α    β
C4    0    0    β    α

Hückel Matrix
Diagonalization of the butadiene Hückel matrix gives the orbital energies and coefficients:
      ψ1         ψ2         ψ3         ψ4
E     α+1.618β   α+0.618β   α-0.618β   α-1.618β
ϕ1    0.3717     0.6015     0.6015     0.3717
ϕ2    0.6015     0.3717     -0.3717    -0.6015
ϕ3    0.6015     -0.3717    -0.3717    0.6015
ϕ4    0.3717     -0.6015    0.6015     -0.3717
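
For a linear polyene the Hückel matrix does not need a numerical eigensolver: its eigenvalues and eigenvectors are known in closed form, E_k = α + 2β cos(kπ/(n+1)) with coefficients √(2/(n+1)) sin(jkπ/(n+1)). The sketch below uses that closed form (a substitute for explicit diagonalization, valid only for unbranched chains) and checks each solution against the matrix; the function names are illustrative.

```python
import math

def huckel_linear_polyene(n):
    """Closed-form Hückel solution for a linear chain of n p-orbitals.

    Returns a list of (x_k, coefficients), where E_k = alpha + x_k * beta.
    """
    orbitals = []
    for k in range(1, n + 1):
        x = 2.0 * math.cos(k * math.pi / (n + 1))
        coeffs = [math.sqrt(2.0 / (n + 1)) * math.sin(j * k * math.pi / (n + 1))
                  for j in range(1, n + 1)]
        orbitals.append((x, coeffs))
    return orbitals

def residual(x, c):
    """Largest element of (A c - x c), A being the chain adjacency matrix (H = alpha*1 + beta*A)."""
    n = len(c)
    worst = 0.0
    for j in range(n):
        left = c[j - 1] if j > 0 else 0.0
        right = c[j + 1] if j < n - 1 else 0.0
        worst = max(worst, abs(left + right - x * c[j]))
    return worst

BUTADIENE = huckel_linear_polyene(4)
for k, (x, c) in enumerate(BUTADIENE, start=1):
    print(f"psi{k}: E = alpha {x:+.3f} beta, c = {[round(v, 4) for v in c]}, "
          f"residual = {residual(x, c):.1e}")
```

For n = 4 this returns the energies α ± 1.618β and α ± 0.618β with coefficients 0.3717 and 0.6015, and setting n = 2 gives the ethylene result α ± β.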

Butadiene MOs

Molecular Orbital Methods
• Self Consistent Field (SCF)
o Each electron “feels” the mean field of all the others (also known as the mean-field approximation).
o The SCF-problem ca.
o Electron-electron repulsion is overestimated by the SCF method.

Molecular Orbital Methods
• Self Consistent Field (SCF)
[Figure: electrons (e-) illustrating the mean-field picture]

MO Methods
• Pople-Pariser-Parr (PPP)
o SCF
o π-only
o For planar molecules
o Used mainly for absorption spectra (still used extensively in industry!)
o Very strongly parameterized

MO Methods
• Complete Neglect of Differential Overlap (CNDO)
o J. A. Pople, R. Segal, J. Chem. Phys. 1965, 43, S136-S149.
o 3-dimensional theory (σ- and π-systems)
o LCAO-SCF
o Only the repulsion integrals (µµ|λλ) are considered and are all equal for a given element
o p-Orbitals are treated as if they were s-orbitals for the two-electron integrals

CNDO Integrals
• Of all the possible integrals (µν|λσ), only (µµ|λλ) are used
$$(\mu\mu|\lambda\lambda) = \gamma_{AB}$$
$$\gamma_{AA} = IP_A - EA_A$$
$$\gamma_{AB} = \frac{\gamma_{AA} + \gamma_{BB}}{2 + r_{AB}\,(\gamma_{AA} + \gamma_{BB})}$$

MO Methods
• Intermediate Neglect of Differential Overlap (INDO)
o J. A. Pople, D. L. Beveridge and P. A. Dobosh, J. Chem. Phys. 1967, 47, 2026-2033.
o 3-dimensional theory (σ- and π-systems)
o LCAO-SCF
o Only the repulsion integrals (µµ|λλ) are considered and are all equal for a given element
o One-center integrals are parameterized according to their type

INDO Integrals
• Of all the possible integrals (µν|λσ), only (µµ|λλ) are used
• 5 types:
o Gss
o Gsp
o Gpp
o Gp2

MO Methods
• Neglect of Diatomic Differential Overlap (NDDO)
o J. A. Pople
o 3-dimensional theory (σ- and π-systems)
o LCAO-SCF
o Of all the repulsion integrals, only (µν|λσ) in which µ and ν are on the same atom and λ and σ are also on one atom are used

NDDO Integrals
• Of all the possible integrals (µν|λσ), only those in which µ and ν are on the same atom and λ and σ are also centered on one atom are considered.
• The same 5 types (for an sp-basis set) as for INDO
• Integrals are calculated as a multipole-multipole interaction (up to quadrupole)
• Also available for d-orbitals

Semiempirical MO Methods

NDDO Methods (s,p)
MNDO, MNDO/H§, AM1§ ≡ PM3§,¥ ≅ PM5§,¶
§ Gaussian functions added to the core-core repulsion
¥ Classical torsional potential used for amide bonds (C-N) to correct the rotation barrier
¶ Classical two-center dispersion potential

MNDO
• M. J. S. Dewar, W. Thiel, J. Am. Chem. Soc., 99, 4899 (1977).
o NDDO-based method
o Element-specific parameterization
o Multipole approximation for the two-electron integrals
o s-, p-basis set
o “Frozen core” approximation

MNDO: Improvements over MINDO/3
o Geometries, especially bond angles, are reproduced better than in MINDO/3.
o Heats of formation are generally more accurate.
o MINDO/3’s strong tendency to make non-classical bridged structures is corrected.

MNDO Weaknesses
• Rotation barriers are too low
• π-Systems are often calculated to be non-planar
• Repulsion between lone pairs is too weak
• Hydrogen bonds do not exist in MNDO
• Rings are generally too flat with inversion barriers that are too low. Cyclobutane is predicted to be planar.

MNDO “Periodic Table”
H, He
Li, Be, B, C, N, O, F
Na, Mg, Al, Si, P, S, Cl
K, Ca, Zn, Ge, Br
Cd, Sn, I
Hg, Pb

AM1
• (Austin Model 1) M. J. S. Dewar et al., J. Am. Chem. Soc., 107, 3902 (1985).
o Quantum mechanically almost identical to MNDO
o Core-core repulsion modified by additional Gaussian functions as introduced in MNDO/H by Burstein and Isaev

AM1: Improvements Over MNDO
o Rotation barriers are higher than in MNDO, but still too low.
o π-Systems are reproduced better than in MNDO, but are often still not completely planar.
o Hydrogen bonds give roughly the right energies; however, the geometry is wrong.

AM1 Weaknesses
• Geometries of hydrogen bonds
• Very poor geometries for P- and S-compounds
• Very poor energies for hypervalent compounds including sulfones
• The energies of nitro-compounds are reproduced poorly.
• Alkyl amines are too flat.

AM1 “Periodic Table”
H
B, C, N, O, F
Na, Mg, Al, Si, P, S, Cl
Zn, Ge, Br
Sn, I
Hg

PM3
• (Parameterized Method 3) J. J. P. Stewart, J. Comp. Chem., 10, 209 (1989); 12, 320 (1991).
o Quantum mechanically identical to AM1
o Automatic parameterization with more degrees of freedom than for AM1
o Parameterized with special attention paid to hypervalent compounds, hydrogen bonds and nitro-compounds

PM3: Improvements Over AM1
o Geometries for P- and S-compounds are better
o Geometries for hydrogen bonds are improved over AM1
o Results optimized for nitro-compounds

PM3 Weaknesses
• Amide C-N rotation barriers are extremely small (force-field correction).
• Amide nitrogens are calculated to be pyramidal.
• Rotation barriers are far too low.
• π-Systems are often calculated to be non-planar.
• Rings are too flat.

PM3 “Periodic Table”
H, He
Li, Be, B, C, N, O, F
Na, Mg, Al, Si, P, S, Cl
Ca, Zn, Ga, Ge, As, Se, Br
Cd, In, Sn, Sb, Te, I
Hg, Tl, Pb, Bi

NDDO Methods (s,p,d)
• MNDO, extended to MNDO/d: Al, Si, P, S, Cl, Br, I
• AM1, extended to AM1(d) (Voityuk and Rösch): Mo only
• AM1(d) (Fujitsu): V, Fe, Cu, Mo, Pd, Ag, Pt
• PM3, extended to PM3-tm (Wavefunction): first-row transition metals (only parameterized for geometries)
• AM1* (Erlangen): H-F, Al-Cl, Ti, Zr, Mo

Ab initio MO Methods

Hartree-Fock Limit
[Figure: energy diagram in which the SCF energies converge to the Hartree-Fock limit; the gap between the Hartree-Fock limit and experiment is the correlation energy]

Correlation
• Dynamic correlation
o Results directly from the overestimation of electron-electron repulsion in SCF theory.
• Non-dynamic (static) correlation
o Only significant in systems with near-degenerate partially occupied orbitals (e.g. biradicals).

Dynamic Correlation
• Semiempirical MO methods
o Is included by scaling the one- and two-center integrals.
• Semiempirical CI calculations
o Therefore only treat static correlation
o … and are therefore easily interpreted

ab initio MO Theory
• Approximate solution to the time-independent electronic Schrödinger equation
o Linear Combination of Atomic Orbitals (LCAO)
o Usually single Hartree-Fock reference configuration based on a single Slater determinant
o Correlation included either perturbationally (MPn) or using Coupled-Cluster theory (e.g. CCSD(T))

ab initio MO Theory: Approximations
• Linear Combination of Atomic Orbitals (LCAO)
[Figure: atomic orbitals added together (+) to form a molecular orbital]

ab initio MO Theory: Approximations
• Slater determinants and Self-Consistent-Field theory
o The multi-electron wavefunction is approximated as a series of one-electron wavefunctions (orbitals)
o Each electron interacts with the mean field of all other electrons (Hartree-Fock theory)

ab initio Computational Levels
[Figure: computational levels arranged by basis set (one axis) and correlation treatment (other axis)]

ab initio MO Theory
• The method can be improved systematically so that convergence of the results can be recognized
• Therefore, extrapolation schemes give very high accuracy
• Scaling of methods with correlation is typically worse than O(N^4)
• Linear scaling (often local) methods are now available for many techniques
• The limit for problems that need extensive geometry optimizations or second derivatives lies at about 200 atoms

Density Functional Theory (DFT)
• The properties of a molecule can be derived from its ground-state electron density (1st Hohenberg-Kohn theorem)
o Correlation is treated implicitly as a correction to the energy of a uniform electron gas
o Usually necessary to integrate the density numerically
o The energy is given by a functional of the electron density
o This functional is unknown
o DFT is usually performed analogously to Hartree-Fock theory using Kohn-Sham orbitals
• Moderately parallel because of the numerical integrations (4-8 processors)
• Roughly 10^2 times faster than comparable ab initio for large systems

Semiempirical MO Theory
• Usually based on the NDDO approximation
o Current methods introduced in the 1970s
o Up to 10^4 times faster than DFT
o Scales with N^3 but most implementations are closer to N^2
o Applications with 1,000 atoms are not unusual, 500 standard
o Linear scaling can be attained either by divide-and-conquer or by localized MO techniques
o Correlation is treated implicitly by scaling the two-electron integrals
o Heavily parameterized to fit experimental data
   o Heats of formation
   o Ionization potentials
   o Dipole moments
   o Molecular structures
Semiempirical Geometry Optimization
• 177 atoms
• No symmetry
• Initial geometry from a GUI-builder
• Elapsed time (single 2 GHz Xeon under Windows): 60 minutes
Weaknesses of Semiempirical MO-Theory
• Parameterized: extrapolation can lead to wild and unpredictable errors
• Weak interactions (dispersion) not reproduced at all
o but not in DFT either
• Hydrogen bonds either not reproduced (MNDO), wrong geometry (AM1) or wrong energy (PM3)
• Bond rotation barriers are too low
• Nitrogen pyramidalization etc. is a problem
How do we use 3D-Information?
• QSAR usually requires that we describe each molecule with a fixed number of descriptors
• … but molecules have different numbers of atoms
• Three possible strategies:
o Specific descriptors
o Global descriptors
o Grids
Specific Descriptors
• Require a knowledge of what is important, e.g.
o “Bite” angles for diphosphine ligands
o HOMO energies (or coefficients) for reactions with electrophiles
o Spin densities for radical reactions
o Atomic charges for important atoms
o Double-bond order
Ziegler-Natta
[Figure: zirconium (Zr–R) catalyst with one angle highlighted]
• Activity depends on this angle (linear QSAR)
• Local model, only works for zirconium!
Global Descriptors
• Describe a fundamental property of the molecule that hopefully is related to the target property or activity
o Molecular weight, volume, surface area, polarizability, dipole moment, refractive index …
o Descriptors constructed (invented) to describe molecular properties
o Similarity
3D Similarity Techniques
• Search for similarity with the target pharmacophore
• Pure shape similarity (www.eyesopen.com)
• Electrostatic similarity and similarity of the electron density (Sanz, Carbo, Richards)
Carbo Index:
$$R_{AB} = \frac{\int P_A P_B \, d\tau}{\sqrt{\int P_A^2 \, d\tau \cdot \int P_B^2 \, d\tau}}$$
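To make the Carbo index concrete, here is a minimal sketch that evaluates it for two densities sampled on the same uniform grid. The one-dimensional Gaussian "densities" and the helper names `carbo_index` and `gaussian` are assumptions for illustration only; real applications use three-dimensional electron densities or electrostatic potentials.

```python
import math

def carbo_index(p_a, p_b):
    """Carbo similarity of two densities sampled on the same uniform grid.

    The grid volume element cancels between numerator and denominator,
    so plain sums can stand in for the integrals.
    """
    overlap = sum(a * b for a, b in zip(p_a, p_b))
    norm_a = sum(a * a for a in p_a)
    norm_b = sum(b * b for b in p_b)
    return overlap / math.sqrt(norm_a * norm_b)

def gaussian(x, center, width=1.0):
    """Toy one-dimensional 'density' (hypothetical, for illustration)."""
    return math.exp(-((x - center) / width) ** 2)

grid = [-10.0 + 0.05 * i for i in range(401)]
dens_a = [gaussian(x, 0.0) for x in grid]
dens_b = [gaussian(x, 1.0) for x in grid]  # same shape, shifted by 1
dens_c = [3.0 * d for d in dens_a]         # same shape, scaled by 3

print(carbo_index(dens_a, dens_c))  # scaling alone leaves the index at 1 (up to rounding)
print(carbo_index(dens_a, dens_b))  # partial overlap gives a value below 1
```

Because of the normalization in the denominator, the index responds to the shape of the distributions and not to their magnitude.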
Descriptors: 3D
• Atomic coordinates
o Autocorrelation
o MORSE-Codes
• Molecular surfaces
o Polar surface area
o Statistical descriptors of the electrostatic potential at the surface (Politzer, Murray)
o Surface autocorrelations (Gasteiger)
Surface descriptors
• Fast Polar Surface Area Calculation (Ertl)
• Calculate a local property (usually the molecular electrostatic potential, MEP) at the triangulated surface of the molecule
• “Murray-Politzer” descriptors
o Use the statistical properties of the distribution of the values of the local property as descriptors
• Autocorrelation (Gasteiger)
o Use the distance between triangulation points to create a “spectrum”
Murray/Politzer-Descriptors
Molecule                              σ²tot    ν
methane                               5.4      0.144
trimethylamine                        446.6    0.009
bis-trifluoromethylphosphinic acid    651.0    0.246
Total variance = σ²tot; balance parameter = ν
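The slides do not spell out how the two descriptors are computed. The sketch below assumes the definitions commonly used for Murray-Politzer descriptors: σ²tot is the sum of the variances of the positive and of the negative surface potentials, and ν = σ²₊σ²₋/(σ²tot)², which reaches its maximum of 0.25 when both variances are equal. The function name and the surface values are hypothetical.

```python
def murray_politzer(mep_values):
    """Return (sigma2_tot, nu) from MEP values sampled on a molecular surface."""
    positive = [v for v in mep_values if v > 0]
    negative = [v for v in mep_values if v < 0]

    def variance(values):
        # Population variance around the mean of this subset
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    sigma2_plus = variance(positive)
    sigma2_minus = variance(negative)
    sigma2_tot = sigma2_plus + sigma2_minus
    nu = sigma2_plus * sigma2_minus / sigma2_tot ** 2 if sigma2_tot > 0 else 0.0
    return sigma2_tot, nu

# Hypothetical surface potentials, not taken from the slides
balanced = [-30.0, -20.0, -10.0, 10.0, 20.0, 30.0]
one_sided = [-1.0, -0.5, 5.0, 25.0, 45.0]

print(murray_politzer(balanced))   # equal spread on both sides: nu at its maximum of 0.25
print(murray_politzer(one_sided))  # spread dominated by positive values: nu close to 0
```

Read this way, the table suggests that bis-trifluoromethylphosphinic acid (ν = 0.246) has nearly balanced positive and negative surface regions, while trimethylamine (ν = 0.009) is dominated by one sign.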
Autocorrelation
• Usually used for time-series:
$$\rho(i) = \sum_{j=1}^{n} a_j \, a_{j+i}$$
• Can be used with distances r:
$$\rho(r) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i \, a_j \, f(r - r_{ij})$$
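The distance form is what turns molecules with different numbers of atoms (or surface points) into descriptor vectors of fixed length. The following sketch assumes that f is a simple box function that assigns each pair distance to one distance interval; other choices of f are possible, and the names, coordinates and property values here are purely illustrative.

```python
import math

def autocorrelation_vector(coords, props, r_max=10.0, n_bins=10):
    """Distance-binned autocorrelation: rho(r_k) = sum_i sum_j a_i a_j f(r_k - r_ij).

    f is assumed to be a box function: 1 if r_ij falls into bin k, else 0.
    Pairs with i == j (r_ij = 0) contribute to the first bin.
    """
    width = r_max / n_bins
    rho = [0.0] * n_bins
    for ci, ai in zip(coords, props):
        for cj, aj in zip(coords, props):
            k = int(math.dist(ci, cj) / width)
            if k < n_bins:  # pairs further apart than r_max are ignored
                rho[k] += ai * aj
    return rho

# Two hypothetical "molecules" with different numbers of points
small_xyz = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
small_a = [0.2, -0.2]
large_xyz = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.3, 0.0), (0.0, 0.0, 2.7)]
large_a = [0.1, -0.3, 0.25, -0.05]

print(autocorrelation_vector(small_xyz, small_a))
print(autocorrelation_vector(large_xyz, large_a))
```

Both vectors have the same length regardless of molecule size, and because only pair distances enter, the result does not change when the molecule is translated or rotated.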
Surface Autocorrelations
[Figure: panels comparing surface autocorrelations for different orientations, side-chain conformations, point densities, distance intervals, atomic radii and surfaces]
PCA of Surface Autocorrelation
[Figure: PCA plot; the legend distinguishes high, intermediate (*) and low (+) activity]
Molecular Surfaces
[Figure: Van der Waals surface and Connolly (SES) surface]
De Novo Ligand Design
Pharmacophores
• Pharmacophores are two- or three-dimensional arrays of binding features that are associated with the desired biological activity
• The following example shows the use of a pharmacophore search for 17β-hydroxysteroid dehydrogenase
• The natural substrate is estradiol
[Figure: Estradiol docked in 17β-hydroxysteroid dehydrogenase]
[Figure: Estradiol pharmacophore]
[Figure: Flavone hit]
[Figure: Multiple hits]
[Figure: Pharmacophore in the binding pocket]
Comparative Molecular Field Analysis (CoMFA)
[Figure: ligand structure annotated with H-bond acceptors and a hydrophobic group]
[Figure: CoMFA grid (www.kubinyi.de)]
[Figure: CoMFA analysis (www.kubinyi.de)]
[Figure: CoMFA results (www.kubinyi.de)]
CoMFA
• Steric
o Green: +
o Yellow: −
• Electrostatic (positive)
o Blue: +
o Red: −
H. Lanig, W. Utz, P. Gmeiner, J. Med. Chem. 2001, 44, 1151-1157.
High-Throughput Docking
• Dock rigid or flexible ligands into the receptor (usually rigid)
• Precalculate grid of properties for the receptor to speed up searching
• Evaluate the results using a scoring function
• What is a scoring function?
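The idea of precalculating a receptor grid can be sketched as follows. The example assumes a bare Coulomb potential in arbitrary units, a nearest-grid-point lookup and made-up coordinates and charges; real docking programs typically store several property grids and interpolate between grid points.

```python
import math

def build_potential_grid(receptor, origin, spacing, shape):
    """Precompute the receptor's Coulomb potential on a regular grid (done once)."""
    grid = {}
    for ix in range(shape[0]):
        for iy in range(shape[1]):
            for iz in range(shape[2]):
                point = (origin[0] + ix * spacing,
                         origin[1] + iy * spacing,
                         origin[2] + iz * spacing)
                # Distances are clamped at 0.5 to avoid the singularity at r = 0
                grid[(ix, iy, iz)] = sum(q / max(math.dist(point, pos), 0.5)
                                         for pos, q in receptor)
    return grid

def grid_energy(ligand, grid, origin, spacing):
    """Score a ligand pose with one grid lookup per atom (pose must lie inside the grid)."""
    energy = 0.0
    for pos, q in ligand:
        index = tuple(round((p - o) / spacing) for p, o in zip(pos, origin))
        energy += q * grid[index]
    return energy

def direct_energy(ligand, receptor):
    """Reference: explicit sum over all ligand-receptor atom pairs."""
    return sum(ql * qr / max(math.dist(pl, pr), 0.5)
               for pl, ql in ligand for pr, qr in receptor)

# Hypothetical receptor and ligand: (position, partial charge)
receptor = [((0.0, 0.0, 0.0), -0.5), ((3.0, 0.0, 0.0), 0.4), ((0.0, 3.0, 0.0), 0.3)]
origin, spacing = (0.0, 0.0, 3.0), 0.5
grid = build_potential_grid(receptor, origin, spacing, (9, 9, 9))
ligand = [((1.0, 1.0, 4.0), 0.3), ((2.5, 1.5, 4.5), -0.2)]

print(grid_energy(ligand, grid, origin, spacing))
print(direct_energy(ligand, receptor))
```

The ligand atoms here sit exactly on grid points, so both routes agree. The benefit is that scoring a pose costs one lookup per ligand atom instead of a loop over all receptor atoms, which matters when very many poses have to be evaluated.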
Scoring Functions
[Scheme: receptor (in H2O) + ligand (in H2O) → receptor:ligand complex (in H2O)]
LUDI Scoring Function:
$$\Delta G = \Delta G_0 + \Delta G_{HB} \sum f(\Delta r)\, f(\Delta\alpha) + \Delta G_{Ion} \sum f(\Delta r)\, f(\Delta\alpha) + \Delta G_{lipo} A_{lipo} + \Delta G_{rot} N_{rot} + \Delta G_{aro/aro} N_{aro/aro}$$
Contribution       Parameter      Value
Constant           ΔG_0           −0.69 kcal mol⁻¹
H-bonds            ΔG_HB          −0.76 kcal mol⁻¹
Salt bridges       ΔG_Ion         −1.45 kcal mol⁻¹
Hydrophobic        ΔG_lipo        −0.03 kcal mol⁻¹
π-Stacking         ΔG_aro/aro      0.00 kcal mol⁻¹
Rotatable bonds    ΔG_rot         −0.22 kcal mol⁻¹
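An empirical scoring function of this type is just a weighted sum over counted interactions. The sketch below uses the coefficients as they appear on the slides (signs included, which are worth checking against the original LUDI publication). The piecewise-linear forms of f(Δr) and f(Δα), their thresholds, and all function and argument names are assumptions for illustration.

```python
def distance_penalty(delta_r):
    """f(dr): 1 near the ideal distance, falling linearly to 0.
    Thresholds of 0.2 and 0.6 Angstrom are assumptions for this sketch."""
    if delta_r <= 0.2:
        return 1.0
    if delta_r <= 0.6:
        return 1.0 - (delta_r - 0.2) / 0.4
    return 0.0

def angle_penalty(delta_alpha):
    """f(da): 1 near the ideal angle, falling linearly to 0.
    Thresholds of 30 and 80 degrees are assumptions for this sketch."""
    if delta_alpha <= 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0
    return 0.0

# Coefficients in kcal/mol as transcribed from the slides
COEFF = {"0": -0.69, "hb": -0.76, "ion": -1.45,
         "lipo": -0.03, "rot": -0.22, "aro": 0.00}

def ludi_like_score(hbonds, salt_bridges, lipo_area, n_rot, n_aro, c=COEFF):
    """Sum the terms of the scoring function.

    hbonds, salt_bridges: lists of (delta_r, delta_alpha) deviations from ideal geometry
    lipo_area: lipophilic contact area; n_rot: rotatable bonds; n_aro: aromatic contacts
    """
    def geometry_sum(contacts):
        return sum(distance_penalty(dr) * angle_penalty(da) for dr, da in contacts)

    return (c["0"]
            + c["hb"] * geometry_sum(hbonds)
            + c["ion"] * geometry_sum(salt_bridges)
            + c["lipo"] * lipo_area
            + c["rot"] * n_rot
            + c["aro"] * n_aro)

# Hypothetical complex: one ideal and one distorted H-bond, one ideal salt bridge
score = ludi_like_score(hbonds=[(0.1, 10.0), (0.4, 55.0)],
                        salt_bridges=[(0.0, 0.0)],
                        lipo_area=100.0, n_rot=3, n_aro=1)
print(f"Estimated binding free energy: {score:.2f} kcal/mol")
```

The distorted hydrogen bond contributes only a quarter of the full ΔG_HB in this example, because both of its geometry factors have dropped to 0.5.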
Free-Energy Perturbation (FEP)
[Figure: thermodynamic cycle linking A(gas), B(gas), A(bound) and B(bound); labels: experimentally known, target, calculate, calculate?]
• Mutate as slowly as possible
• Mutate in both directions (the results must be the same)
• Can only do very small changes in structure
• If it’s good, it’s very, very good
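The requirement that both mutation directions give the same answer can be checked on a toy problem. The sketch below applies the standard FEP (Zwanzig) estimator, ΔF = −kT ln⟨exp(−ΔU/kT)⟩, to two one-dimensional harmonic wells in reduced units (kT = 1), for which the exact free-energy difference is known analytically. The system, the force constants and all names are assumptions for illustration and have nothing to do with a real ligand mutation.

```python
import math
import random

def fep_estimate(samples, delta_u, kT=1.0):
    """Zwanzig estimator: dF = -kT ln < exp(-dU/kT) >, averaged over the start state."""
    weights = [math.exp(-delta_u(x) / kT) for x in samples]
    return -kT * math.log(sum(weights) / len(weights))

K_A, K_B, KT, N = 1.0, 1.5, 1.0, 100_000
rng = random.Random(0)  # fixed seed for reproducibility

def u_a(x):
    return 0.5 * K_A * x * x  # state A: softer harmonic well

def u_b(x):
    return 0.5 * K_B * x * x  # state B: stiffer harmonic well

# Boltzmann samples of a harmonic well are Gaussian with variance kT/k
samples_a = [rng.gauss(0.0, math.sqrt(KT / K_A)) for _ in range(N)]
samples_b = [rng.gauss(0.0, math.sqrt(KT / K_B)) for _ in range(N)]

forward = fep_estimate(samples_a, lambda x: u_b(x) - u_a(x), KT)   # A -> B
backward = fep_estimate(samples_b, lambda x: u_a(x) - u_b(x), KT)  # B -> A
exact = 0.5 * KT * math.log(K_B / K_A)                             # analytic result

print(f"forward  A->B: {forward:.4f}")
print(f"backward B->A: {backward:.4f}")
print(f"exact    A->B: {exact:.4f}")
```

The forward and backward estimates should agree apart from their sign. The two wells were deliberately chosen to be similar: when the end states differ strongly, their sampled configurations barely overlap and the estimates stop agreeing, which is why the mutation has to proceed slowly and only small structural changes are feasible.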
Summary
• 2D-Descriptors may be able to describe geometry changes
• 3D-Descriptors can be very sensitive to conformation for QSAR but are often less so for QSPR
• 3D-Methods often require alignment
• … and the problem of multiple conformations and/or tautomers remains unsolved