Density Functional Tight Binding (DFTB):
Application to organic and biological molecules
Michael Gaus,† Qiang Cui,† and Marcus Elstner∗,‡
Department of Chemistry and Theoretical Chemistry Institute, University of Wisconsin, Madison,
1101 University Avenue, Madison, Wisconsin 53706, USA, and Karlsruhe Institute of Technology,
Physical Chemistry, Kaiserstrasse 12, D-76131 Karlsruhe, Germany
E-mail: [email protected]
∗To whom correspondence should be addressed
†University of Wisconsin
‡Karlsruher Institut für Technologie
Abstract
In this work, we review recent extensions of the Density Functional Tight Binding (DFTB)
methodology and its application to organic and biological molecules. DFTB denotes a class of
computational models derived from Density Functional Theory (DFT) using a Taylor expan-
sion around a reference density. The first and second order models, DFTB1 and DFTB2, have
been reviewed recently (WIREs Comput Mol Sci 2012, 2: 456-465). Here, we discuss the
extension to third-order, DFTB3, which in combination with a modification of the Coulomb
interactions in the second order formalism and a new parametrization scheme leads to a significant
improvement of the overall performance. The performance of DFTB2 and DFTB3 for
organic and biological molecules is discussed in detail, as well as problems and limitations
of the underlying approximations.
Introduction
Density Functional Tight Binding (DFTB) is a generic name for a set of computational models
derived from DFT. The starting point of the derivation is the reference density ρ0 of the molecular
system, which is constructed as a superposition of the neutral densities ρa0 of the atoms (a) that
constitute the system,
$$\rho_0 = \sum_a \rho_a^0. \tag{1}$$
The different DFTB models are derived by expanding the DFT total energy functional around
this density ρ0 in first, second and third orders, respectively. The first order terms constitute the
standard DFTB1 model, which originally was called simply DFTB,1,2 while the model based on
the second order expansion, DFTB2, was originally called SCC-DFTB.3 In recent years we have
derived third-order terms, leading to the DFTB3 model.4–8
If the ground state density ρ is written in terms of the reference density ρ0 and the density
fluctuation δρ,
$$\rho = \rho_0 + \delta\rho, \tag{2}$$
the DFTB total energy can be expanded in the respective orders as:
$$E[\rho] = E^0[\rho_0] + E^1[\rho_0,\delta\rho] + E^2[\rho_0,(\delta\rho)^2] + E^3[\rho_0,(\delta\rho)^3]. \tag{3}$$
E0 and E1 constitute the DFTB(1) model, including E2 defines DFTB2, and the inclusion of E3
yields the DFTB3 model.
The different DFTB models have reasonably clear areas of application. DFTB1 is suitable for
systems in which the charge transfer between atoms is small, such as homonuclear systems or
systems with atoms of similar electronegativity. Therefore, DFTB1 is well suited for the description
of hydrocarbons, for which the higher order terms are small. On the other hand, DFTB1 can
also treat systems where a complete charge transfer between the atoms occurs, as, for example, in
NaCl.5 DFTB1 is 5-10 times faster than DFTB2/DFTB3 since it does not require a self-consistent
determination of the charge distribution; i.e., it requires a solution of the generalized eigenvalue
problem only once instead of 5-10 times on average for DFTB2/DFTB3. By contrast, the increase
in CPU time from DFTB2 to DFTB3 is negligible. The second order terms are crucial for polar
molecules, where only partial charge transfer occurs,5 and the third-order expansion becomes in-
dispensable for charged molecular species,6,7 as will be discussed in more detail below. Therefore,
for applications to biological molecules, DFTB2 or DFTB3 is required. Since DFTB3 does not
imply any major increase in computational cost, we recently devised a new DFTB3 model by deriving
a parameter set called ‘3OB’, which we recommend for standard applications of DFTB
to biological molecules.8
Three approximations follow the expansion of the total energy in Eq.3: (i) The energy contri-
bution E0[ρ0] is approximated by a sum of pair potentials, which are fitted for a set of molecules
to appropriate reference data. (ii) The Kohn-Sham orbitals Ψi appearing in the term E1[ρ0,δρ]
are expanded in a minimal atomic orbital basis φµ and a two-center approximation is applied to
the evaluation of the resulting integrals. (iii) The electron density fluctuations δρ appearing in the
second- and third-order terms are expanded with a multipole expansion. In the existing models,
this expansion is truncated after the monopole term, thus electron-electron interaction (Hartree
and exchange-correlation terms) is effectively approximated by the interaction of atomic partial
charges. This interaction is described by a Coulomb-term, which is damped at short interatomic
distances. A major improvement for non-bonded interactions has been achieved by identifying a
shortcoming of the original interaction term3 and proposing a simple modification.5 This modifi-
cation is now used as a default in the DFTB3 model.7,8
Since DFTB1 and DFTB2 have been reviewed in great detail in Ref. 9, this article focuses on
the extension to DFTB3 as outlined in Refs. 5–8 and on the performance of DFTB2 and DFTB3
for organic and biological molecules (see footnote 1).
Theoretical Approach
Theory of the third-order SCC-DFTB: DFTB3
The extension of the DFTB approach to include third-order terms (DFTB3) has been introduced recently5,7
and will be briefly summarized in the following. The starting point to derive the DFTB3
total energy is the energy expression of the Kohn-Sham density functional theory.11 Instead of
finding the electron density ρ(r) that minimizes the total energy, a reference density ρ0 is assumed
and perturbed by some density fluctuation, ρ(r) = ρ0(r) + δρ(r). The exchange-correlation energy
functional is then expanded in a Taylor series up to third order and the total energy can be
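written as,
$$\begin{aligned}
E^{\mathrm{DFTB3}}[\rho_0+\delta\rho] ={}& -\frac{1}{2}\iint \frac{\rho_0(\mathbf{r})\,\rho_0(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' - \int V^{xc}[\rho_0]\,\rho_0(\mathbf{r})\,d\mathbf{r} + E^{xc}[\rho_0] + E^{nn} \\
& + \sum_i n_i \left\langle \psi_i \middle| H[\rho_0] \middle| \psi_i \right\rangle \\
& + \frac{1}{2}\iint \left( \frac{1}{|\mathbf{r}-\mathbf{r}'|} + \left.\frac{\delta^2 E^{xc}[\rho]}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')}\right|_{\rho_0} \right) \delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,d\mathbf{r}\,d\mathbf{r}' \\
& + \frac{1}{6}\iiint \left.\frac{\delta^3 E^{xc}[\rho]}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r}'')}\right|_{\rho_0} \delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r}'')\,d\mathbf{r}\,d\mathbf{r}'\,d\mathbf{r}'' \\
={}& E^0[\rho_0] + E^1[\rho_0,\delta\rho] + E^2[\rho_0,(\delta\rho)^2] + E^3[\rho_0,(\delta\rho)^3].
\end{aligned} \tag{4}$$
1 A very nice introduction into DFTB2 has also been given in Ref. 10.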
Central to the performance of the DFTB models are several approximations following this
expansion, which are:
(i) E0[ρ0] consists of the DFT ‘double counting’ contributions and the nuclear-nuclear repul-
sion in the first line of Eq.4 and depends only on the reference density ρ0, which is given by the
superposition of neutral atomic densities (Eq.1). In other words, this term is not dependent on
the specific chemical environment; it can be determined for an appropriate ‘reference system’ and
then applied to other molecules. This is the key to the transferability of the derived parameters. In
DFTB, this term is approximated by a sum of pair interactions referred to as the repulsive potential,
$$E^{\mathrm{rep}} = \frac{1}{2}\sum_{ab} V^{\mathrm{rep}}_{ab}, \tag{5}$$
(see Ref. 9), which is either determined by comparison to DFT calculations1 or fitted to empirical
data.12 This approach neglects three-body contributions, which may become important in certain
cases, such as in condensed phase systems.13
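As a minimal sketch of Eq. 5, the pairwise sum can be written as a loop over unique atom pairs. The exponential pair potential, its parameters `a`, `b` and the cutoff `r_cut` are hypothetical placeholders chosen only for illustration; the actual DFTB repulsive potentials are fitted per element pair as described in the text.

```python
import math
from itertools import combinations

def pair_repulsion(r, a=10.0, b=2.0, r_cut=4.0):
    """Hypothetical short-ranged pair potential (arbitrary units), zero beyond r_cut."""
    if r >= r_cut:
        return 0.0
    return a * math.exp(-b * r)

def repulsive_energy(coords, pair_potential=pair_repulsion):
    """E_rep = 1/2 * sum over a != b of V_ab(R_ab), evaluated over unique pairs."""
    total = 0.0
    for ra, rb in combinations(coords, 2):
        total += pair_potential(math.dist(ra, rb))
    return total

# Three atoms on a right triangle with unit legs (illustrative geometry).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
e_rep = repulsive_energy(coords)
```

Because the sum runs over pairs only, three-body contributions are absent by construction, which is the limitation noted above.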
(ii) E1 consists of the Hamilton matrix elements ⟨ψi|H0|ψi⟩ in the second line of Eq. 4:
H0 is the Kohn-Sham Hamiltonian of the molecular system but with the reference density. ψi
are the Kohn-Sham orbitals, which are represented in a minimal basis of pseudoatomic orbitals,
ψi = ∑µ cµiφµ. This is a central part of the computational efficiency since it reduces the size of the
eigenvalue problem significantly.
$$E^1[\rho_0,\delta\rho] = \sum_{iab}\sum_{\mu\in a}\sum_{\nu\in b} n_i\, c_{\mu i}\, c_{\nu i}\, H^0_{\mu\nu} \tag{6}$$
where H0µν are the Hamilton matrix elements in the atomic orbital (AO) representation. The diago-
nal elements of H0µν are chosen to be atomic DFT eigenvalues evaluated with the PBE14 exchange-
correlation functional, the off-diagonal elements are calculated in a two-center approximation.9,15
This minimal basis approximation is at the core of the problems when it comes to the calculation of
response properties.16,17 A single-zeta basis can be tuned quite well to reproduce the bonding prop-
erties of molecules. However, treatment of the more diffuse part of the density, which is relevant
to non-covalent interactions, is more challenging and requires more extended basis sets. Further,
polarization functions may become important for some systems, such as for nitrogen as discussed
below. After introducing the minimal basis, the resulting interaction integrals are approximated.
In particular, this concerns the neglect of 2-center integrals for the diagonal terms and three-center
integrals for the off-diagonal contributions; for a detailed discussion, see Ref. 9.
(iii) E2, the energy term in the third line of Eq.4, is approximated using only the monopole
term in an expansion of δρ in spherical harmonics.3 The charge density fluctuations δρ are then
written as a superposition of atomic contributions, δρ = ∑a δρa, in which the spherical atomic
contributions are approximated by a simple Slater function with ∆qa = qa−q0a (qa is the Mulliken
charge of atom a and q0a the number of valence electrons of the neutral atom a) centered on the
nucleus at Ra:
$$\delta\rho_a \approx \Delta q_a \frac{\tau_a^3}{8\pi}\, e^{-\tau_a|\mathbf{r}-\mathbf{R}_a|} \tag{7}$$
With this approximation, the Coulomb interaction of the second order term with respect to δρ can
be expressed analytically and is abbreviated as γab in the following. The exponent τa is chosen
such that the on-site value of the γ-function properly describes the atomic chemical hardness (or
alternatively the Hubbard parameter as calculated from DFT) and, therefore, implicitly takes into
account the exchange-correlation contribution to the second order term. To improve this interpolation
between long-range Coulomb interaction and the on-site term, further refinements on the
γ-function have been applied.7
As discussed in Refs. 4,5, the derived function γab assumes a specific inverse relationship of
the chemical hardness U of an element with its atomic size. Although this relation holds well
within one row of the periodic table, it does not for elements of different rows. In particular,
hydrogen turned out to deserve special attention and we therefore proposed a modified γab function
referred to as γhab.5–7 This modified function can be applied within DFTB2, but it is not the ‘default’
option. Within DFTB3, it has become default7,8 and therefore is a key ingredient of the DFTB3
methodology.
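The interpolation between the on-site Hubbard value and the 1/R Coulomb tail can be illustrated with a short sketch. The closed form below is the homonuclear expression (both atoms sharing one Hubbard parameter, with τ = 16U/5) as we recall it from the SCC-DFTB literature, and the damping factor exp(−U^ζ R²) with ζ = 4.0 is likewise an assumed form of the γh modification; neither formula is spelled out in this article, so both should be checked against Refs. 3 and 5–7 before use. All quantities are in atomic units and the value U = 0.4 is illustrative.

```python
import math

def short_range(u, r):
    """Short-range term S(R) for two identical atoms with Hubbard parameter u (assumed form)."""
    tau = 16.0 * u / 5.0
    x = tau * r
    return math.exp(-x) * (48.0 + 33.0 * x + 9.0 * x**2 + x**3) / (48.0 * r)

def gamma(u, r):
    """gamma(R) = 1/R - S(R); equals u at R = 0 and approaches 1/R at large R."""
    if r == 0.0:
        return u
    return 1.0 / r - short_range(u, r)

def gamma_h(u, r, zeta=4.0):
    """Assumed gamma^h: the short-range term is damped by exp(-u**zeta * R**2)."""
    if r == 0.0:
        return u
    return 1.0 / r - short_range(u, r) * math.exp(-(u ** zeta) * r * r)

u = 0.4  # illustrative Hubbard parameter (Hartree)
for r in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"R={r:4.1f}  gamma={gamma(u, r):.5f}  gamma_h={gamma_h(u, r):.5f}  1/R={1.0 / r:.5f}")
```

In this sketch the damped variant lies between the original function and the bare 1/R interaction at intermediate distances, which is the qualitative effect the modification is meant to have.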
(iv) The E3 term consists of diagonal and off-diagonal contributions. Originally, only the
diagonal terms were included.4–6 As in DFTB2, a monopole approximation is applied and this
term describes the change of the chemical hardness of an atom with its charge state.5 Specifically,
in third order a new parameter is introduced, the charge derivative of the chemical hardness, Ud .
This parameter can be computed from DFT or optimized in order to improve the performance of the
model. In DFTB3, Ud contributes at the third-order through a Γ-function, which is the derivative
of γab with respect to atomic charge. It is interesting to note that Giese et al.18 showed within
the framework of a rigorous density-functional expansion method that the third-order contribution
does not add significantly to accuracy, in contrast to our finding with calculations based on diverse
sets of molecules.6–8 Therefore, the third-order terms in DFTB3 can be seen as a systematic way to
introduce the charge dependence to compensate for deficiencies of intrinsic approximations within
the second order formalism, namely, the small size of the pseudo-atomic orbital basis, the fixed
shape of the initial atomic densities ρa0 as well as the simplified density fluctuation scheme.
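One way to read the statement that Γ is the charge derivative of γab is as a chain rule. The expression below is a sketch under the assumption that γab depends on the charge of atom a only through its Hubbard parameter Ua, with the derivative evaluated at the neutral-atom charge:

```latex
\Gamma_{ab}
  = \left.\frac{\partial \gamma_{ab}}{\partial q_a}\right|_{q_a^0}
  = \left.\frac{\partial \gamma_{ab}}{\partial U_a}\,
          \frac{\partial U_a}{\partial q_a}\right|_{q_a^0}
  = \frac{\partial \gamma_{ab}}{\partial U_a}\, U^d_a ,
\qquad
\Gamma_{ab} \neq \Gamma_{ba} \ \text{in general.}
```

Under this reading, the only new atomic parameter at third order is the charge derivative of the Hubbard parameter, in line with the text above.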
With all these approximations the DFTB3 total energy is given by
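$$E^{\mathrm{DFTB3}} = \frac{1}{2}\sum_{ab} V^{\mathrm{rep}}_{ab} + \sum_{iab}\sum_{\mu\in a}\sum_{\nu\in b} n_i\, c_{\mu i}\, c_{\nu i}\, H^0_{\mu\nu} + \frac{1}{2}\sum_{ab} \Delta q_a \Delta q_b\, \gamma^h_{ab} + \frac{1}{3}\sum_{ab} (\Delta q_a)^2 \Delta q_b\, \Gamma_{ab}. \tag{8}$$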
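The two charge-dependent terms of Eq. 8 are plain double sums once the charge fluctuations and the γ and Γ matrices are available. The following sketch evaluates them for a two-atom example; the numerical values of `dq`, `gamma_mat` and `big_gamma` are made up for illustration and are not DFTB parameters.

```python
def second_order_energy(dq, gamma_mat):
    """E2 = 1/2 * sum_ab dq_a * dq_b * gamma_ab."""
    n = len(dq)
    return 0.5 * sum(dq[a] * dq[b] * gamma_mat[a][b]
                     for a in range(n) for b in range(n))

def third_order_energy(dq, big_gamma):
    """E3 = 1/3 * sum_ab dq_a**2 * dq_b * Gamma_ab (Gamma need not be symmetric)."""
    n = len(dq)
    return sum(dq[a] ** 2 * dq[b] * big_gamma[a][b]
               for a in range(n) for b in range(n)) / 3.0

# Illustrative two-atom input (atomic units, hypothetical values).
dq = [0.2, -0.2]
gamma_mat = [[0.40, 0.25], [0.25, 0.40]]
big_gamma = [[-0.08, -0.02], [-0.03, -0.08]]

e2 = second_order_energy(dq, gamma_mat)
e3 = third_order_energy(dq, big_gamma)
```

For charge fluctuations of this size the third-order term is much smaller than the second-order one; it grows in relative importance for larger, more localized charges, as in the charged species discussed in the text.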
The derivative of this expression with respect to the molecular orbital coefficients, cµi, leads to the
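corresponding Kohn-Sham equations
$$\sum_{\nu} c_{\nu i}\left(H_{\mu\nu} - \varepsilon_i S_{\mu\nu}\right) = 0 \quad \text{with } \nu\in b \text{ and } \forall a,\ \mu\in a, \tag{9}$$
$$H_{\mu\nu} = H^0_{\mu\nu} + S_{\mu\nu}\sum_{c} \Delta q_c \left( \frac{1}{2}(\gamma_{ac}+\gamma_{bc}) + \frac{1}{3}(\Delta q_a\Gamma_{ac} + \Delta q_b\Gamma_{bc}) + \frac{\Delta q_c}{6}(\Gamma_{ca}+\Gamma_{cb}) \right), \tag{10}$$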
where Sµν is the overlap matrix. The Hamilton matrix elements depend on the Mulliken charges,
which in turn depend on the molecular orbital coefficients. Thus, these equations have to be solved
self-consistently.
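The self-consistency cycle can be sketched for a toy system of two atoms with one orbital each and two electrons. This is a minimal illustration under several assumptions: only the second-order (γ) part of Eq. 10 is included, the 2×2 generalized eigenvalue problem is solved in closed form, simple linear mixing is used, and all matrix elements (`h0`, `s`, `gamma_mat`) are hypothetical numbers in atomic units. Here Δq denotes the excess Mulliken electron population of an atom.

```python
import math

def lowest_orbital(h11, h12, h22, s):
    """Lowest solution of the 2x2 secular problem sum_nu c_nu (H - eps*S) = 0."""
    a = 1.0 - s * s
    b = -(h11 + h22 - 2.0 * s * h12)
    c = h11 * h22 - h12 * h12
    eps = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    c1, c2 = -(h12 - eps * s), h11 - eps          # from the first row of the secular equation
    norm = math.sqrt(c1 * c1 + c2 * c2 + 2.0 * s * c1 * c2)  # enforce c^T S c = 1
    return eps, c1 / norm, c2 / norm

def mulliken_dq(h11, h12, h22, s, n_elec=2.0, q0=(1.0, 1.0)):
    """Mulliken populations of the doubly occupied orbital minus neutral-atom values."""
    _, c1, c2 = lowest_orbital(h11, h12, h22, s)
    q1 = n_elec * (c1 * c1 + s * c1 * c2)
    q2 = n_elec * (c2 * c2 + s * c1 * c2)
    return (q1 - q0[0], q2 - q0[1])

def run_scc(h0, s, gamma_mat, mix=0.3, tol=1e-10, max_iter=200):
    """Iterate charges -> Hamiltonian -> charges until input and output charges agree."""
    h11, h12, h22 = h0
    dq = (0.0, 0.0)
    for iteration in range(1, max_iter + 1):
        shift = [gamma_mat[a][0] * dq[0] + gamma_mat[a][1] * dq[1] for a in (0, 1)]
        new = mulliken_dq(h11 + shift[0],
                          h12 + 0.5 * s * (shift[0] + shift[1]),
                          h22 + shift[1], s)
        if max(abs(new[i] - dq[i]) for i in (0, 1)) < tol:
            return new, iteration
        dq = tuple(dq[i] + mix * (new[i] - dq[i]) for i in (0, 1))
    raise RuntimeError("SCC cycle did not converge")

h0 = (-0.5, -0.2, -0.3)   # hypothetical H0_11, H0_12, H0_22
s = 0.3                   # hypothetical overlap S_12
gamma_mat = [[0.40, 0.25], [0.25, 0.40]]

dq_nonscc = mulliken_dq(*h0, s)            # single diagonalization, DFTB1-like
dq_scc, n_iter = run_scc(h0, s, gamma_mat)
```

The non-self-consistent charges come from one diagonalization, whereas the converged charges require several, which is the origin of the cost difference between DFTB1 and DFTB2/DFTB3 mentioned in the Introduction. In this toy model the self-consistent charge transfer is smaller than the non-self-consistent one, because accumulating electrons on an atom raises its on-site energy.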
Dispersion correction
Dispersion interactions play an important role in processes dominated by non-covalent interactions,
such as conformational transitions of biomolecules. In the DFTB framework, the first attempt19 to
include dispersion was to augment the DFTB2 energy with an empirical dispersion term, following
the similar strategy applied to Hartree-Fock energies; the results were promising19 and stimulated
similar developments for pure DFT methods.20–23 However, due to the use of a minimal basis set of
atomic orbitals, which are slightly compressed with respect to atomic orbitals, the electron density
in DFTB2 is not well described for large distances, especially for the overlap of weakly interacting
densities which are essential to the description of van der Waals (vdW) interactions. The early
parametrization of DFTB2 for organic and biological molecules (see footnote 2) led to an underestimation of distances
in hydrogen and vdW bonded complexes. Therefore, an empirical dispersion correction has
been proposed which also contains a repulsive contribution in order to correct for this artifact.24
The new DFTB3 parametrization 3OB corrects for this problem by using a slightly more extended
atomic orbital basis set, leading to a good description of non-covalently bonded complexes using
the original dispersion correction from Ref. 19. Recently, Grimme has parametrized his D3 correction (Ref. 23)
for DFTB3, leading to an excellent performance of DFTB3-D3 for a large set of hydrogen
and vdW bonded molecules.25
2 Referred to as the ‘mio’ set, see www.dftb.org
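A generic empirical dispersion term has the form of a damped pairwise −C6/R^6 sum added to the electronic energy. The sketch below uses a Fermi-type damping function, a geometric-mean combination rule for C6 and a sum of atomic radii for the damping length; these are common textbook choices assumed here for illustration and are not necessarily the functional forms of Ref. 19 or of the D3 correction. The parameter values are hypothetical.

```python
import math
from itertools import combinations

def damping(r, r0, d=20.0):
    """Fermi-type damping: close to 0 for r << r0 and close to 1 for r >> r0."""
    return 1.0 / (1.0 + math.exp(-d * (r / r0 - 1.0)))

def dispersion_energy(coords, c6, radii):
    """E_disp = - sum over pairs of f(R_ab) * C6_ab / R_ab**6 (distinct positions assumed)."""
    energy = 0.0
    for (i, ri), (j, rj) in combinations(enumerate(coords), 2):
        r = math.dist(ri, rj)
        c6_ij = math.sqrt(c6[i] * c6[j])   # assumed geometric-mean combination rule
        r0_ij = radii[i] + radii[j]        # assumed damping length
        energy -= damping(r, r0_ij) * c6_ij / r**6
    return energy

c6 = [10.0, 10.0]      # hypothetical C6 coefficients
radii = [1.5, 1.5]     # hypothetical atomic radii
e_far = dispersion_energy([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], c6, radii)
e_near = dispersion_energy([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], c6, radii)
```

At large separation the term reduces to the undamped −C6/R^6 attraction, while at short range the damping switches it off so that it does not interfere with the bonded region already described by the electronic and repulsive terms.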
Treatment of Electron Spin
Standard DFTB is a closed-shell method and therefore exhibits large errors for open-shell systems.
Köhler et al. have formulated an open-shell DFTB variant that includes spin-polarization effects
either in a collinear26,27 or a noncollinear fashion.28 Besides doubling the orbital set for spin up
and spin down electrons, an additional term is added to the total energy that takes into account the
Mulliken spin-population and atomic spin-polarization constants. The latter are calculated from
DFT as numerical difference of partially spin-polarized states in proximity of the spin-unpolarized
state of an atom. The collinear spin treatment improves the description of radicals of organic
molecules. However, for some systems the direction of spin-quantization varies significantly in
space (e.g., antiferromagnetism), for which the noncollinear spin-polarization treatment is neces-
sary. Note that for the collinear case the amount of computation time doubles with respect to the
nonpolarized calculation, while for the noncollinear one the cost quadruples.
Inherited DFT problems
DFTB is derived from DFT and usually GGA functionals (PBE) are applied to compute the terms
in E1, E2 and E3. Since E0 only affects bond energies but not the electronic spectrum in total, ap-
plying higher level methods or experimental data for the determination of E0 does not compensate
for most of the problems inherent to DFT-GGA, except for overbinding, which could be almost
entirely removed in DFTB3/3OB. The deficiency of DFT-GGA in describing vdW interactions can
be compensated by using empirical dispersion corrections, as described above. All other phenomena
related to the self-interaction error (SIE) in DFT are retained in the DFTB model. This
is reflected, for example, in the performance of DFTB for the subsets of problematic cases in the
GMTKN2429 test set (SIE11, DARC, DC9), as discussed in Ref. 8. The self-interaction problem
shows up in many properties and is contained in the second- and third-order terms in DFTB. A
detailed analysis has been published recently.30,31 The description of the balance between charge
delocalization and polarization, for example in charge transfer complexes, is also a challenge to
DFT. Rapacioli et al.32 adapted recently a configuration interaction method, based on constrained
DFT calculations, into the DFTB approach. This allows one to investigate charge resonances in
molecular complexes and describe the proper dissociation behavior.
QM/MM coupling
DFTB has been combined with empirical force field methods in a QM/MM framework as described
in Ref. 33. This scheme has further been extended to include also a continuum electrostatics
environment in the DFTB/MM-GSBP scheme,34 which is useful to the study of chemical reactions
in large macromolecular systems.35,36
For the interaction between QM and MM atoms, it is common to include both electrostatic
and van der Waals contributions;37–39 bonded-terms are also included when the partitioning is
across covalent bonds. In most biomolecular applications, electrostatics tend to dominate and
therefore it is essential that electrostatic interactions between QM and MM atoms are properly
described. For DFTB, the QM-MM electrostatic interaction is approximately calculated in the
original implementation33 as the Coulombic interaction between the QM Mulliken charges (qa)
and MM point charges (QA). The error due to this approximation can be significant when QM
and MM atoms approach each other, where charge penetration effects become important. As a
result, reactions that involve highly charged solutes/substrates are difficult to study with the original
DFTB/MM Hamiltonian.40 The problem can be partially solved by enlarging the QM region, but
this introduces not only additional cost but also technical complications for cases that involve
highly mobile solvents, such as the need of changing QM/MM partitioning on the fly.41,42
In our recent work,43 motivated by the Klopman-Ohno (KO) expression for the two-center two-
electron integrals in semi-empirical QM methods,44 which also inspired the development of the
γab kernel in the original DFTB, we have implemented a different Hamiltonian for the DFTB/MM
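electrostatics. It takes the form,
$$H^{\mathrm{QM/MM}}_{\mathrm{elec,KO}} = \sum_{a\in \mathrm{QM}}\sum_{A\in \mathrm{MM}} \frac{\Delta q_a\, Q_A}{\sqrt{R_{aA}^2 + a_a\left(\dfrac{1}{U_a(q_a)} + \dfrac{1}{U_A}\right)^{2} e^{-b_a R_{aA}}}} = \sum_{a\in \mathrm{QM}}\sum_{A\in \mathrm{MM}} \gamma^{\mathrm{KO}}\, \Delta q_a\, Q_A \tag{11}$$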
in which aa and ba are element type dependent parameters. Together with the van der Waals
parameters in the QM/MM Hamiltonian, there are 4 QM/MM parameters for each element type,
and they can be determined based on microsolvation clusters.43 To be consistent with the third-
order formulation of SCC-DFTB,7 the Hubbard parameter in the KO functional is dependent on
the QM charge. As a result, the effective size of the QM charge distribution naturally adjusts
as the QM region undergoes chemical transformations, making the KO based QM/MM scheme
particularly attractive for describing chemical reactions in the condensed phase.
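The behavior of the KO kernel in Eq. 11 can be explored with a short sketch. The kernel itself follows the equation; the linear charge dependence of the Hubbard parameter, U(q) = U + Ud·Δq, and all numerical parameter values are assumptions made for illustration only (atomic units).

```python
import math

def hubbard_of_charge(u0, u_deriv, dq):
    """Assumed linear charge dependence of the QM Hubbard parameter."""
    return u0 + u_deriv * dq

def gamma_ko(r, u_qm, u_mm, a, b):
    """Klopman-Ohno type kernel of Eq. 11 for one QM/MM atom pair."""
    size = (1.0 / u_qm + 1.0 / u_mm) ** 2
    return 1.0 / math.sqrt(r * r + a * size * math.exp(-b * r))

def qmmm_electrostatics(dq, qm_coords, u_qm, mm_charges, mm_coords, u_mm, a, b):
    """Sum of gamma_KO * dq_a * Q_A over all QM/MM pairs (one parameter set for brevity)."""
    energy = 0.0
    for dq_a, r_a, u_a in zip(dq, qm_coords, u_qm):
        for q_mm, r_mm in zip(mm_charges, mm_coords):
            energy += gamma_ko(math.dist(r_a, r_mm), u_a, u_mm, a, b) * dq_a * q_mm
    return energy

# Hypothetical parameters: one QM atom near one MM point charge.
u_a = hubbard_of_charge(0.40, -0.10, dq=0.3)
k_contact = gamma_ko(0.0, 0.4, 0.5, a=1.0, b=1.0)
k_far = gamma_ko(100.0, 0.4, 0.5, a=1.0, b=1.0)
e_pair = qmmm_electrostatics([0.3], [(0.0, 0.0, 0.0)], [u_a],
                             [-0.8], [(3.0, 0.0, 0.0)], 0.5, a=1.0, b=1.0)
```

Unlike the bare point-charge interaction, the kernel stays finite as the QM and MM atoms approach each other and recovers 1/R at large separation, which is how the scheme accounts for charge penetration at short range.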
Our studies of charged solutes and chemical reactions clearly indicate that the KO scheme
is robust and transferable. For the fitting set clusters, both the point-charge and KO schemes
have comparable errors (relative to full QM results) in solute-solvent interactions, with mean
unsigned errors (MUE) of 3.3 and 4.8 kcal/mol, respectively (note that the errors are for total
solute-solvent interactions, which are often >100 kcal/mol, thus the error is typically less than
5%!). However, for 16 stable structures and 24 transition states in the QCRNA database, the MUE
is 4.3 kcal/mol for the KO scheme but 16.2 kcal/mol for the point-charge based QM/MM model.
As another example, for the hydrolysis of phosphate monoesters in solution,43 the hydrolysis
barrier is grossly overestimated (∼11 kcal/mol) with SCC-DFTBPR/MM simulations using the
point-charge based QM-MM Hamiltonian (see footnote 3). With the KO scheme, the computed barrier is in close
agreement (within 2 kcal/mol) with available experimental data.
3 SCC-DFTBPR is a DFTB variant including only diagonal 3rd order terms and a specific modification and parametrization for phosphate hydrolysis. See Ref. 40.
Parametrization
The parametrization of the DFTB models involves three steps:
(i) The determination of the parameters for E1:
This is usually the first step in the parametrization. Here, one has to compute
$$H^0_{\mu\nu} = \langle \phi_\mu | H[\rho_0] | \phi_\nu \rangle \tag{12}$$
and Sµν for setting up the Hamilton matrix elements in Eq. 10. In a first step, one has to determine
the atomic orbital basis set φµ and the neutral atom densities ρa0 by solving the atomic KS equations,
where an additional potential leads to a confinement of the orbitals.9 For the basis set, the
confinement parameter is usually set to roughly twice the covalent radius of the element, while the
choice of the confinement radius of the initial atomic densities is slightly more empirical.7,12 The
choice of these two parameters in a reasonable range does not alter molecular properties on a large
scale; however, they can be used for fine-tuning of the method. This has been discussed recently
for the derivation of the new DFTB3 parameters 3OB.8 Compared to the older DFTB2 parameters
‘mio’, more diffuse basis functions φµ lead to an increase in Pauli repulsion which is relevant for
weak interactions, while a slightly larger compression of the initial densities ρ0a leads to a decrease
in the overbinding and therefore better performance for heats of formation and reaction energies.
(ii) The determination of the parameters for E2 and E3:
For the atomic partial charges qa in Eq.8 a Mulliken partitioning scheme is usually applied, al-
though other schemes are possible as well. Using CM3 charges has been shown to improve the
electrostatic potential of molecules;17,45 however, additional parameters would enter the parametriza-
tion procedure, which we have tried to keep as simple and straightforward as possible. The function
γ(Ua,Rab) in Eq.8 has been determined by an analytical derivation3 and the chemical hardness pa-
rameter (or Hubbard parameter) Ua is usually computed from DFT. However, as described above,
this choice of γ(Ua,Rab) presupposes a particular inverse relationship between the chemical hardness
and the size of an atom, which holds well within one row of the periodic table but by no means
for elements of different rows4,5,7 (see footnote 4). Therefore, the functional form of γ(Ua,Rab) should depend
on the row of the periodic table. For the first row, we use the original form, but for hydrogen and
its interaction with other elements a modified function γh(Ua,Rab) is applied.4,5,7 Other choices
of functions for the 2nd, 3rd, etc. rows are the subject of ongoing work. For DFTB3, the derivative of γ(Ua,Rab),
Γ(Ua,Uda,Rab), is needed,7 where Uda is the charge derivative of Ua. In the earlier implementations,6,40
only the diagonal part of Γ(Ua,Uda,Rab) was implemented. This works well for first row
molecules, except for the deprotonation of NH3, where the off-diagonal terms seem to be important.7
The diagonal version, however, was not able to describe phosphorus-containing molecules,
in particular their (de-)protonation energies, and an ad hoc modification has been necessary, involving
a special parametrization.40 This problem could be remedied at full third order,7 however,
by treating Uda as adjustable parameters.
4 See in particular Fig. 2 from Ref. 7.
(iii) The determination of the parameters for E0:
The determination of Erep has been greatly simplified by introducing automated parametrization
procedures.12,46 These schemes not only reduce the effort but also allow one to vary the optimization
targets. In principle, data from any theoretical level and experiment can enter the parametriza-
tion. Since it depends only on ρ0, Erep could be determined in principle only once and would
be valid for all DFTB models. Indeed, the Erep parameters originally derived for DFTB2, called
‘mio’, worked rather well with DFTB3.6,7 However, a fine tuning can be achieved when Erep is
specifically optimized for the respective model. Therefore, we have reoptimized the parameters
for DFTB3, now called 3OB (referring to ‘DFTB3’ and the main field of application: organic and
biological molecules). Thus, there are currently two sets of parameters available: the ‘mio’
set, which has been derived for DFTB2 (Ref. 3), and the ‘3OB’ set, which has been derived for DFTB3 (Ref. 8; see footnote 5).
Note that the 3OB set also differs in the electronic parameters, as described above (see footnote 6).
5 These parameter sets can be downloaded from the website www.dftb.org.
6 In earlier work applying the diagonal DFTB3 method in combination with the ‘mio’ set, fitted Ud values compensated for the overbinding of the method. This is no longer needed using ‘3OB’ since this parametrization removes the overbinding by changing the density compression radii.8 Further, the special parametrization and modification for phosphorus compounds40 is no longer required due to the introduction of the 3rd order off-diagonal terms.
In summary, one first has to determine four parameters per atom type: the confinement radius for
the atomic orbitals φµ, called r0; the confinement radius for the atomic densities ρa0, called rd0;
the atomic Hubbard parameter Ua; and its charge derivative Uda. The determination
of r0 and rd0 is an empirical procedure and can be quite involved,8,12 while Ua and Uda can be
easily computed in principle. For the modified function γh one additional parameter appears,
which is fitted to reproduce the water dimer binding energy. The repulsive potentials are two-body
contributions and are therefore much more involved, although largely automated procedures have
been recently developed.12
While for many applications relative energies are the important quantity, sometimes the cal-
culation of atomization energies and heats of formation is desired, as for example in the case of
fitting the DFTB repulsive potentials. However, the calculation of atomization energies requires
some additional care.8 It is given by the total energy of a system Etot and the atomic energies Eatom
$$E^{\mathrm{At}} = -E^{\mathrm{tot}} + \sum_a E^{\mathrm{atom}}_a \tag{13}$$
With a closed-shell treatment DFTB gives Eatoma of rather poor quality. One may use the spin-
polarization formalism, which improves the results. In practice, however, the atomization energies
are usually calculated using spin-polarization energies Espin that are pre-calculated from DFT for
each atom; i.e., Espin is the difference of the atomic energy calculated at the spin-unpolarized state
and the spin-polarized state (see footnote 7). With that, Eatom is calculated as the total energy of an atom plus the
spin-polarization energy Espin.
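The bookkeeping of Eq. 13 with pre-calculated spin-polarization energies can be sketched as follows. The sign convention (Espin is added to the spin-unpolarized atomic energy and is typically negative) follows the wording above; all energies in the example are made-up placeholders, not DFTB or DFT results.

```python
def atomization_energy(e_tot, composition, e_atom_unpolarized, e_spin):
    """E_At = -E_tot + sum_a (E_atom_a + E_spin_a), with atoms counted per element."""
    e_atoms = sum(count * (e_atom_unpolarized[element] + e_spin[element])
                  for element, count in composition.items())
    return -e_tot + e_atoms

# Hypothetical inputs in Hartree for a molecule with two H atoms and one O atom.
composition = {"H": 2, "O": 1}
e_atom_unpolarized = {"H": -0.24, "O": -3.10}   # closed-shell atomic energies (placeholders)
e_spin = {"H": -0.04, "O": -0.03}               # spin-polarization energies (placeholders)
e_tot = -4.05                                   # molecular total energy (placeholder)

e_at = atomization_energy(e_tot, composition, e_atom_unpolarized, e_spin)
```

Because the spin-polarization energies lower the atomic reference energies, including them reduces the computed atomization energy relative to a purely closed-shell treatment of the atoms.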
Note that using Espin gives slightly more accurate results than using atomic energies as calcu-
lated from spin-polarized DFTB because the spin-polarization from the atom (as calculated from
DFT) is added rather than a correction of the atomic energy. The latter uses spin-polarization constants
calculated as derivatives of the atomic eigenvalues in the proximity of the spin-unpolarized
atom.
7 In the case of hydrogen, the spin-unpolarized state would refer to a hypothetical one where 0.5 electrons are spin-up and 0.5 electrons are spin-down, while the spin-polarized state is the ground state of the atom with 1.0 spin-up and 0.0 spin-down electrons.
Performance
Energetics, structure and vibrational frequencies of small molecules
DFTB2 has been tested over the years for a variety of molecular properties. A first thorough test
has been performed by Krüger et al.,47 who benchmarked the accuracy of DFTB2 against G2 and
BLYP for 22 molecules, evaluating 28 reaction energies, geometries and vibrational frequencies.
Reaction energies show a mean error of 4.3 kcal/mol with respect to G2 and geometries are in
excellent agreement with those obtained at the DFT level. Vibrational frequencies, however, show
larger deviations; in particular, the stretch frequencies of several specific modes are significantly
overestimated. Therefore, Małolepsza et al.48 suggested to apply a specific parametrization of
Erep for vibrational frequencies. With this special parameter set, DFTB2 shows a very good per-
formance and vibrational frequencies approach the quality of those from full DFT. We investigated
this point in more detail in later publications for DFTB2 (Ref. 12) and DFTB3 (Ref. 8). These studies demonstrate
the limited flexibility of the current DFTB approach; i.e., it is not possible to achieve an
accuracy comparable to DFT-GGA for both reaction energies and vibrational frequencies with a
single parameterization. There is an optimization conflict where one property deteriorates when
the other is improved. The pragmatic solution to this problem is to supply two sets of parame-
ters,8 one optimized for energies and geometries (3OB), the other for geometries and vibrational
frequencies (3OB-f).
Two other publications have benchmarked DFTB2 for even larger molecular test sets. Sattelmeyer
et al.49 benchmarked DFTB2 for 622 closed shell molecules containing O, N, C and H in
comparison with Hartree-Fock based semi-empirical methods like AM1, PM3 and PDDG/PM3.
The good performance of DFTB for geometries was confirmed; however, the performance of
DFTB2 for heats of formation, with a mean absolute error of 5.8 kcal/mol, was worse than that
of PM3 and PDDG/PM3, the latter with a mean absolute error of 3.2 kcal/mol that even outperforms
B3LYP/6-31G(d) (see footnote 8). Otte et al.50 confirmed these findings, showing that DFTB2 performs
slightly worse for heats of formation than AM1 and PM3, and in particular worse than the OMx
suite of methods. However, geometries are very well described and DFTB2 is clearly superior for
vibrational frequencies. Further, DFTB2 performs very well for structures and relative energies of
peptide conformations (see footnote 9), as well as for hydrogen bonded systems.
8 This study also indicated errors in the treatment of N-O and S-O bonds, which should be ameliorated with the new 3OB parametrization.
The ‘3OB’ parametrization for DFTB3 has been developed with improving two particular lim-
itations of DFTB2 in mind: the overbinding of about 5-10 kcal/mol per covalent bond (for O,
N, C, H containing molecules) and the underestimation of binding energies in weakly bonded
complexes.8 The third-order terms improve the description of localized charges (see footnote 10) and the modified
Coulomb interaction γh(Ua,Rab) improves hydrogen bonding interactions.7 As a result, the
description of energies is greatly improved: DFTB3/3OB approaches the accuracy of DFT-GGA
methods like PBE for heats of formations and atomization energies as well as the accuracy of the
best semi-empirical methods like PDDG-PM3. DFTB3/3OB is even better than DFT-GGA
when only a small, double zeta type basis set is applied,8 as typically done with DFT or DFT/MM
based molecular dynamics simulations. In particular hydrogen bonding energies, proton affini-
ties and proton transfer barriers, which are relevant in many biochemical problems, are very well
described.
Recently, Goerigk and Grimme have compiled a general database (GMTKN24) for main group
thermochemistry, kinetics and non-covalent interactions.29 This set benchmarks a variety of molecular
properties: reaction and atomization energies, reaction barriers, electron affinities (EA’s), ionization
potentials (IP’s) and proton affinities, hydrogen bonding and vdW interactions, conformational
energies of peptides, hydrocarbons and carbohydrates, isomerization reactions and some
other properties. For this set, the accuracy of DFTB3/3OB is comparable to the newest variant of
the OMx models,8 OM3, which has been shown recently to approach the accuracy of DFT-GGA
methods for this data set.53
9See also Refs. 51,52 for more detail.10As they appear in small charged molecules, where the charge is located on few atoms. Large ionic molecules,
where the charge is distributed over a large number of atoms is unproblematic in DFTB2.
Properties: IP’s, EA’s, dipole moments and molecular polarizability
IP’s and EA’s for small molecules are difficult to compute with a minimal basis set method like
DFTB since these properties do not enter the parametrization procedure, in contrast to NDDO
type semi-empirical methods. The adjustment of Erep only affects bond lengths (not angles!),
bond energies and stretch frequencies. Therefore, properties like IP and EA are usually less accurately
described8,49,50,53 and deserve careful testing for the specific problem at hand. This holds
as well for dipole moments, which are simply computed from the Mulliken population analysis.
The description of electrostatic properties such as dipole moments can be easily improved using a
parametrized charge scheme like CM3, as has been shown in Ref. 17. However, IR intensities are
not changed since these depend on the derivative of the dipole moment with respect to the normal
coordinates, which is not improved. Unfortunately, this holds similarly for molecular polarizabilities
and Raman intensities. The polarizabilities can be adequately improved using methods
like chemical potential equalization16 or a variational approach (VAR),17 but Raman intensities
suffer from the same problem as IR intensities; i.e., although the properties are improved, their
derivatives with respect to normal coordinates are not.
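As a minimal illustration of the point-charge picture behind a dipole moment computed from a population analysis, the following Python sketch evaluates mu = sum_a q_a R_a for a set of atomic partial charges. The geometry and the charges are illustrative placeholders (not DFTB, Mulliken or CM3 output), and the function names are hypothetical.

```python
# Point-charge (monopole) dipole moment from atomic partial charges.
# Geometry and charges below are made-up placeholders, not DFTB results.

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*Angstrom expressed in Debye


def point_charge_dipole(charges, coords):
    """Return the dipole vector (e*Angstrom) of point charges q_a at positions R_a."""
    mu = [0.0, 0.0, 0.0]
    for q, r in zip(charges, coords):
        for k in range(3):
            mu[k] += q * r[k]
    return mu


def norm(v):
    """Euclidean length of a 3-vector."""
    return sum(x * x for x in v) ** 0.5


# Hypothetical water-like geometry (Angstrom) with illustrative charges (e).
coords = [
    (0.000, 0.000, 0.000),   # O
    (0.757, 0.586, 0.000),   # H
    (-0.757, 0.586, 0.000),  # H
]
charges = [-0.60, 0.30, 0.30]

mu = point_charge_dipole(charges, coords)
mu_debye = norm(mu) * E_ANGSTROM_TO_DEBYE
print(f"dipole vector (e*A): {mu}, magnitude: {mu_debye:.3f} D")
```

Because this expression is linear in the charges, replacing the charges by remapped ones changes the dipole at a fixed geometry, whereas IR intensities depend on how the dipole varies along the normal coordinates.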
Conformations of complex molecules
Most of the tests described so far benchmark the DFTB performance for covalent bond lengths
and bond angles. The performance for dihedral angles, which is important to the description of
conformations of complex molecules like peptides, proteins, DNA and carbohydrates, remains
less systematically tested. DFTB2 has been extensively benchmarked for the structures and relative energies
of polyalanine conformations.51,52 Relative energies and structures were found to be in good
agreement with DFT and ab initio predictions, and vibrational spectroscopic features were also re-
produced satisfactorily;54 for a short review, see Ref. 55. However, low frequency modes seem
to be underestimated,56 which indicates that rotational barriers are too low in DFTB. QM/MM
simulations of di-alanine in water indicated that the free energy minima at the α-helical and β-sheet
regions were more extended than in standard force field methods,57 a finding confirmed later
using a different QM/MM implementation.58 A deeper analysis indeed showed that the rotational
barriers around the dihedral angles are very low. Furthermore, DFTB/MM populates the α basin
more than the β basin, in contrast to experimental findings. The energy differences, however, are
small and on the order of 0.5 kcal/mol. Therefore, small changes in the Hamiltonian can lead to
a significant change in the populations, and it is possible that DFTB3 with an improved QM/MM
coupling (KO-scheme, see above) leads to an improvement.
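The sensitivity of basin populations to sub-kcal/mol energy differences can be made concrete with a simple two-state Boltzmann estimate. The sketch below is only an illustration under stated assumptions: the α and β basins are treated as two states separated by a free energy difference, the temperature is set to 300 K, and the function name is hypothetical.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def two_state_populations(delta_g, temperature=300.0):
    """Populations (p_low, p_high) of two states separated by delta_g >= 0 (kcal/mol)."""
    w = math.exp(-delta_g / (K_B_KCAL * temperature))  # Boltzmann weight of the higher state
    return 1.0 / (1.0 + w), w / (1.0 + w)


# Scan a few assumed free energy differences between the two basins.
for dg in (0.0, 0.5, 1.0):
    p_low, p_high = two_state_populations(dg)
    print(f"dG = {dg:.1f} kcal/mol -> populations {p_low:.2f} : {p_high:.2f}")
```

In this model a difference of 0.5 kcal/mol at 300 K corresponds to a population ratio of roughly 70:30, so a shift of a few tenths of a kcal/mol in the Hamiltonian is enough to reverse the preferred basin.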
DFTB2 has also been tested for carbohydrates, where the properties of interest are again the dihedral
angles, in particular the ring puckering modes. It has been shown that DFTB2 produces free
energy surfaces for conformational transitions similar to those of ab initio methods, in contrast to
various NDDO methods,59 motivating the application of DFTB2 to carbohydrate reaction dynam-
ics.60,61 The agreement with high level methods, however, is far from perfect and leaves ample
room for future improvement. For example, potential energy scans for certain dihedral angles
clearly showed that DFTB2 is in qualitative agreement with full DFT but with too low torsional
barriers, while NDDO type methods seem to fail even qualitatively.62 Nevertheless, there seems
to be no compelling reason to use DFTB2 for the description of structure and dynamics of carbohydrates
at the moment, since empirical force fields currently represent the potential energy surface much
more accurately.
Water
Another issue worth mentioning is the description of water by DFTB. Given its fundamental
importance in chemistry and biology, it is desirable to be able to adequately describe water in
both gas and condensed phases, including water in different protonation states (e.g., a solvated
proton/hydroxide). With the standard DFTB2, the hydrogen bonding interaction between water
molecules is too weak; as discussed above and in detail elsewhere,4,5,7 this motivated the de-
velopment of the modified γh function for atom pairs involving H. With DFTB37 and the latest
parameterization,8 for example, the water dimer binding energy is well described and low-energy
conformers of small water clusters are also captured. The relative energies of these low-energy conformers,
however, are not yet ideal, suggesting the need to further improve hydrogen-bonding
interactions by, for example, going beyond the monopole approximation for charge-charge inter-
actions in eq. 8. The imperfection of water-water interaction is also manifested in bulk water
simulations, which indicated that both DFTB2 and DFTB3 tend to over-predict the height of the
first solvation shell peak in the O-O radial distribution function while underestimating the second
solvation shell.63–65
For NVT simulations at ambient conditions, one simple but ad hoc approach to improve the
description of bulk water is to adjust the pair-wise repulsive potentials based on a reverse Monte
Carlo protocol such that experimental radial distribution functions are reproduced. This is found66
to be somewhat successful in that the resulting repulsive potentials also improved the description
of small protonated water clusters and the structure of a solvated proton. For the 13 low-energy
isomers of H+(H2O)22, for example, the RMSE is only 0.9 kcal/mol relative to MP2 results,67 as
compared to the value of 3.8 kcal/mol for the original DFTB3. For a solvated proton in the bulk,
the integrated coordination number for the first solvation shell is 3.2, which is close to the value
of 3.0 for CPMD (using the HCTH functional); by comparison, the standard DFTB3 gives a value
close to 5.0.65 Nevertheless, the enthalpy of evaporation remains too low by about 1 kcal/mol,
and preliminary NPT simulations indicate that DFTB2/3 tends to substantially overestimate the
density of bulk water at ambient conditions, a situation also observed in some ab initio DFT
simulations.68 Therefore, improving the description of water remains an important topic for further
DFTB developments.
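The integrated coordination number quoted above for the first solvation shell is conventionally obtained from a radial distribution function as n(r_c) = 4 pi rho * integral of g(r) r^2 dr up to a cutoff r_c. The following Python sketch shows this standard definition with a trapezoidal rule; it is not the procedure of the cited simulations, the function name is hypothetical, and the uniform g(r) = 1 used here is only a sanity check, not water data.

```python
import math


def coordination_number(r, g, rho, r_cut):
    """Integrate 4*pi*rho*g(r)*r^2 from r[0] up to r_cut with the trapezoidal rule.

    r     : increasing radii (Angstrom)
    g     : radial distribution function sampled on r
    rho   : number density of the coordinating species (1/Angstrom^3)
    r_cut : upper limit, e.g. the first minimum of g(r)
    """
    total = 0.0
    for i in range(1, len(r)):
        if r[i] > r_cut:
            break
        f_lo = g[i - 1] * r[i - 1] ** 2
        f_hi = g[i] * r[i] ** 2
        total += 0.5 * (f_lo + f_hi) * (r[i] - r[i - 1])
    return 4.0 * math.pi * rho * total


# Sanity check with a structureless fluid, g(r) = 1 (NOT a water RDF):
# the result must equal rho times the volume of a sphere of radius r_cut.
r = [i / 100.0 for i in range(501)]   # 0.00 ... 5.00 Angstrom
g_uniform = [1.0] * len(r)
rho_water = 0.0334                    # approximate number density of liquid water, 1/A^3
r_cut = 3.5

n_uniform = coordination_number(r, g_uniform, rho_water, r_cut)
n_exact = rho_water * 4.0 / 3.0 * math.pi * r_cut ** 3
print(f"numerical: {n_uniform:.4f}, analytical: {n_exact:.4f}")
```

With a simulated or experimental g(r) in place of the uniform one, the same function gives the shell occupancy; the result depends on the chosen cutoff, so the cutoff should be reported together with the coordination number.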
Conclusions
The extension of DFTB to the third-order, DFTB3,7,8 in combination with a new parametrization
procedure12 has improved the performance significantly for reaction energies, geometries and
hydrogen bonded complexes. DFTB3 even outperforms DFT-GGA with a double zeta (DZ) basis
in special cases, while being 2-3 orders of magnitude faster.69 However, the computational
efficiency comes at the price of reduced transferability; i.e., not all molecular properties can be
computed at the same accuracy within one parameter set. Such an optimization conflict has been
found in the case of reaction energies and vibrational frequencies; therefore, we have proposed to use
two different parametrizations, the 3OB for energies and geometries and 3OB-f for geometries and
vibrational frequencies;8 geometries are described with similar accuracy in both parameter sets.
A key to better non-bonded interactions is the use of the third-order term in combination with
the modified Coulomb interaction term, γh. The augmentation of the DFTB3 total energy with
the empirical dispersion extension can be recommended as a default, because in most cases it only improves the
results.
Despite all these improvements, there are still several limitations of DFTB:
(i) The DFT-GGA framework used for the expansion of the total energy. The DFTB models
inherit the well-known DFT-GGA problems, especially the self-interaction error.
(ii) The use of a minimal basis set. This leads to a reduced molecular polarizability and limits
the application of DFTB for computing IR and Raman spectra. Further, the missing polarization
functions may cause problems in the description of sp3 nitrogen.8 This shows up in large errors
for proton affinities with acidic nitrogen, for which no satisfactory solution has been proposed up
to now; an ad hoc fix is used by applying a special parameter set (NHmod) for these special cases.
(iii) The limited flexibility of the scheme (fixed initial density, monopole approximation) leaves
further problems. This shows up in the description of atomization energies of ionic species.8 Another
complication is the need for two different parameter sets for hydrogen: a special parametrization
is needed when the bond breaking of molecular hydrogen is computed.8
(iv) DFTB describes the general conformational properties of biomolecules quite well: pep-
tides, DNA bases and sugars can be computed with often good accuracy. However, DFTB under-
estimates torsional barriers, which currently limits its applicability in the description of conforma-
tional dynamics of these complex molecules.
Another important direction for DFTB development concerns the treatment of metal ions, such
as Mg2+, Zn2+ and Cu+/2+, which play important structural and catalytic roles in biomolecules.
DFTB2 has been parameterized for several first-row transition metals (e.g., Fe, Ni, Co, Cu and Zn),70–72
and it has been shown that DFTB2 generally gives reliable structural properties for metal sites,
including fairly complex bi-metallo zinc sites,73–78 and DFTB/MM has been successfully applied
to a number of metalloenzymes by us and other research groups.73–79 Pushing forward the DFTB
framework is significant for metalloenzyme applications because for transition metal ions, despite
progress,80,81 a robust semi-empirical method (even just for structures!) is not yet available. This
is particularly true for open-shell cases: although parameterizations for several open-shell metal
ions (e.g., Ni, Cu and Fe) have been reported in the literature,71,72,82 their application has largely
been limited to geometry optimization of organometallic compounds and only a few metalloenzymes;83
systematic development of the methodology to improve energetics remains an important
frontier.
References
(1) Porezag, D.; Frauenheim, T.; Köhler, T.; Seifert, G.; Kaschner, R. Phys. Rev. B 1995, 51,
12947–12957.
(2) Seifert, G.; Porezag, D.; Frauenheim, T. Int. J. Quantum Chem. 1996, 58, 185–192.
(3) Elstner, M.; Porezag, D.; Jungnickel, G.; Elsner, J.; Haugk, M.; Frauenheim, T.; Suhai, S.;
Seifert, G. Phys. Rev. B 1998, 58, 7260–7268.
(4) Elstner, M. Theor. Chem. Acc. 2006, 116, 316–325.
(5) Elstner, M. J. Phys. Chem. A 2007, 111, 5614–5621.
(6) Yang, Y.; Yu, H.; York, D.; Cui, Q.; Elstner, M. J. Phys. Chem. A 2007, 111, 10861–10873.
(7) Gaus, M.; Cui, Q.; Elstner, M. J. Chem. Theory Comput. 2011, 7, 931–948.
(8) Gaus, M.; Goez, A.; Elstner, M. J. Chem. Theory Comput. 2012, 9, 338.
(9) Seifert, G.; Joswig, J.-O. WIREs Comput Mol Sci 2012, 2, 456–465.
(10) Koskinen, P.; Makinen, V. Comp. Mat. Sci. 2009, 47, 237.
(11) Kohn, W.; Sham, L. J. Phys. Rev. 1965, 140, A1133–A1138.
(12) Gaus, M.; Chou, C.-P.; Witek, H.; Elstner, M. J. Phys. Chem. A 2009, 113, 11866–11881.
(13) Goldman, N.; Fried, L. E. J. Phys. Chem. C 2011, 116, 2198–2204.
(14) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1996, 77, 3865–3868.
(15) Seifert, G. J. Phys. Chem. A 2007, 111, 5609–5613.
(16) Kaminski, S.; Giese, T. J.; Gaus, M.; York, D. M.; Elstner, M. J. Phys. Chem. A 2012, 116,
9131–9141.
(17) Kaminski, S.; Gaus, M.; Elstner, M. J. Phys. Chem. A 2012, 116, 11927–11937.
(18) Giese, T. J.; York, D. M. Theor. Chem. Acc. 2012, 131, 1145.
(19) Elstner, M.; Hobza, P.; Frauenheim, T.; Suhai, S.; Kaxiras, E. J. Chem. Phys. 2001, 114,
5149–5155.
(20) Wu, Q.; Yang, W. J. Chem. Phys. 2002, 116, 515–524.
(21) Grimme, S. J. Comput. Chem. 2004, 25, 1463–1473.
(22) Grimme, S. J. Comput. Chem. 2006, 27, 1787–1799.
(23) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132, 154104.
(24) Zhechkov, L.; Heine, T.; Patchkovski, S.; Seifert, G.; Duarte, H. J. Chem. Theory Comput.
2005, 1, 841–847.
(25) Risthaus, T.; Grimme, S. J. Chem. Theory Comput., in press.
(26) Köhler, C.; Seifert, G.; Gerstmann, U.; Elstner, M.; Overhof, H.; Frauenheim, T. Phys. Chem.
Chem. Phys. 2001, 3, 5109–5114.
(27) Köhler, C.; Seifert, G.; Frauenheim, T. Chem. Phys. 2005, 309, 23–31.
(28) Köhler, C.; Frauenheim, T.; Hourahine, B.; Seifert, G.; Sternberg, M. J. Phys. Chem. A 2007,
111, 5622–5629.
(29) Goerigk, L.; Grimme, S. J. Chem. Theory Comput. 2010, 6, 107–126.
(30) Hourahine, B.; Sanna, S.; Aradi, B.; Köhler, C.; Niehaus, T.; Frauenheim, T. J. Phys. Chem.
A 2007, 111, 5671–5677.
(31) Lundberg, M.; Nishimoto, Y.; Irle, S. Int. J. Quantum Chem. 2012, 112, 1701–1711.
(32) Rapacioli, M.; Spiegelman, F.; Scemama, A.; Mirtschink, A. J. Chem. Theory Comput. 2011,
7, 44–55.
(33) Cui, Q.; Elstner, M.; Kaxiras, E.; Frauenheim, T.; Karplus, M. J. Phys. Chem. B 2001, 105,
569–585.
(34) Riccardi, D.; Schaefer, P.; Yang, Y.; Yu, H.; Ghosh, N.; Prat-Resina, X.; König, P.; Li, G.;
Xu, D.; Guo, H.; Elstner, M.; Cui, Q. J. Phys. Chem. B 2006, 110, 6458–6469.
(35) Yang, Y.; Yu, H.; Cui, Q. J. Mol. Biol. 2008, 381, 1407–1420.
(36) Ghosh, N.; Xavier, P.-R.; Gunner, M. R.; Cui, Q. Biochemistry 2009, 48, 2468–2485.
(37) Freindorf, M.; Gao, J. L. J. Comput. Chem. 1996, 17, 386–395.
(38) Riccardi, D.; Li, G.; Cui, Q. J. Phys. Chem. B 2004, 108, 6467–6478.
(39) Giese, T. J.; York, D. M. J. Chem. Phys. 2007, 127, 194101.
(40) Yang, Y.; Yu, H.; York, D.; Elstner, M.; Cui, Q. J. Chem. Theory Comput. 2008, 4, 2067–
2084.
(41) Nielsen, S. O.; Bulo, R. E.; Moore, P. B.; Ensing, B. Phys. Chem. Chem. Phys. 2010, 12,
12401–12414.
(42) Park, K.; Götz, A. W.; Walker, R. C.; Paesani, F. J. Chem. Theory Comput. 2012, 8, 2868–2877.
(43) Hou, G.; Zhu, X.; Elstner, M.; Cui, Q. J. Chem. Theory Comput. 2012, 8, 4293–4304.
(44) Pople, J. A.; Beveridge, D. L. Approximate Molecular Orbital Theory; McGraw-Hill Com-
panies, 1970.
(45) Kalinowski, J. A.; Lesyng, B.; Thompson, J. D.; Cramer, C. J.; Truhlar, D. G. J. Phys. Chem.
A 2004, 108, 2545–2549.
(46) Bodrog, Z.; Aradi, B.; Frauenheim, T. J. Chem. Theory Comput. 2011, 7, 2654–2664.
(47) Krüger, T.; Elstner, M.; Schiffels, P.; Frauenheim, T. J. Chem. Phys. 2005, 122, 114110.
(48) Małolepsza, E.; Witek, H. A.; Morokuma, K. Chem. Phys. Lett. 2005, 412, 237–243.
(49) Sattelmeyer, K. W.; Tirado-Rives, J.; Jorgensen, W. L. J. Phys. Chem. A 2006, 110, 13551–
13559.
(50) Otte, N.; Scholten, M.; Thiel, W. J. Phys. Chem. A 2007, 111, 5751–5755.
(51) Elstner, M.; Jalkanen, K. J.; Knapp-Mohammady, M.; Frauenheim, T.; Suhai, S. Chem. Phys.
2001, 263, 203–219.
(52) Elstner, M.; Jalkanen, K. J.; Knapp-Mohammady, M.; Frauenheim, T.; Suhai, S. Chem. Phys.
2000, 256, 15–27.
(53) Korth, M.; Thiel, W. J. Chem. Theory Comput. 2011, 7, 2929–2936.
(54) Bohr, H. G.; Jalkanen, K. J.; Elstner, M.; Frimand, K.; Suhai, S. Chem. Phys. 1999, 246,
13–36.
(55) Elstner, M.; Frauenheim, T.; Suhai, S. J. Mol. Struct.: THEOCHEM 2003, 632, 29–41.
(56) Elstner, M.; Frauenheim, T.; Kaxiras, E.; Seifert, G.; Suhai, S. Phys. Stat. Sol. B 2000, 217,
357–376.
(57) Hu, H.; Elstner, M.; Hermans, J. Proteins: Struct., Funct., Genet. 2003, 50, 451–463.
(58) Seabra, G. D. M.; Walker, R. C.; Elstner, M.; Case, D. A.; Roitberg, A. E. J. Phys. Chem. A
2007, 111, 5655–5664.
(59) Barnett, C.; Naidoo, K. J. Phys. Chem. B 2010, 114, 17142–17154.
(60) Barnett, C.; Wilkinson, K.; Naidoo, K. J. Am. Chem. Soc. 2010, 132, 12800–12803.
(61) Barnett, C.; Wilkinson, K.; Naidoo, K. J. Am. Chem. Soc. 2011, 133, 19474–19482.
(62) Islam, S.; Roy, P.-N. J. Chem. Theory Comput. 2012, 8, 2412–2423.
(63) Hu, H.; Lu, Z.; Elstner, M.; Hermans, J.; Yang, W. J. Phys. Chem. A 2007, 111, 5685–5691.
(64) Maupin, C.; Aradi, B.; Voth, G. J. Phys. Chem. B 2010, 114, 6922–6931.
(65) Goyal, P.; Elstner, M.; Cui, Q. J. Phys. Chem. B 2011, 115, 6790–6805.
(66) Goyal, P.; Hu, J.; Elstner, M.; Irle, S.; Cui, Q. Manuscript in preparation.
(67) Choi, T.; Jordan, K. J. Phys. Chem. B 2010, 114, 6932–6936.
(68) Wang, J.; Roman-Perez, G.; Soler, J. M.; Artacho, E.; Fernandez-Serra, M. V. J. Chem. Phys.
2011, 134, 024516.
(69) Elstner, M.; Gaus, M. In Computational Methods for Large systems: Electronic Structure
Approaches for Biotechnology and Nanotechnology; Reimers, J. R., Ed.; John Wiley and
Sons: Hoboken, New Jersey, 2011; pp 287–308.
(70) Elstner, M.; Cui, Q.; Munih, P.; Kaxiras, E.; Frauenheim, T.; Karplus, M. J. Comput. Chem.
2003, 24, 565–581.
(71) Zheng, G. S.; Witek, H. A.; Bobadova-Parvanova, P.; Irle, S.; Musaev, D. G.; Prabhakar, R.;
Morokuma, K. J. Chem. Theory Comput. 2007, 3, 1349–1367.
(72) Bruschi, M.; Bertini, L.; Bonacic-Koutecky, V.; De Gioia, L.; Mitric, R.; Zampella, G.; Fan-
tucci, P. J. Phys. Chem. B 2012, 116, 6250–6260.
(73) Hou, G. H.; Cui, Q. J. Am. Chem. Soc. 2012, 134, 229–246.
(74) Riccardi, D.; Yang, S.; Cui, Q. Biochim. Biophys. Acta 2010, 1804, 342–351.
(75) Xu, D.; Guo, H.; Cui, G. J. Am. Chem. Soc. 2007, 129, 10814–10822.
(76) Xu, D. G.; Guo, H. J. Am. Chem. Soc. 2009, 131, 9780–9788.
(77) Xu, D. G.; Xie, D. Q.; Guo, H. J. Biol. Chem. 2006, 281, 8740–8747.
(78) Chakravorty, D. K.; Wang, B.; Lee, C. W.; Giedroc, D. P.; Merz, K. M., Jr. J. Am. Chem. Soc.
2012, 134, 3367–3376.
(79) Yang, Y.; Miao, Y. P.; Wang, B.; Cui, G. L.; Merz, K. M., Jr. Biochemistry 2012, 51, 2606–2618.
(80) Thiel, W. Adv. Chem. Phys. 1996, 93, 703–757.
(81) Thiel, W.; Voityuk, A. A. J. Phys. Chem. 1996, 100, 616–626.
(82) Köhler, C.; Seifert, G.; Frauenheim, T. Chem. Phys. 2005, 309, 23–31.
(83) Lundberg, M.; Sasakura, Y.; Zheng, G. S.; Morokuma, K. J. Chem. Theory Comput. 2010, 6,
1413–1427.