Wires 2013

Density Functional Tight Binding (DFTB):

Application to organic and biological molecules

Michael Gaus,† Qiang Cui,† and Marcus Elstner∗,‡

Department of Chemistry and Theoretical Chemistry Institute, University of Wisconsin, Madison,

1101 University Avenue, Madison, Wisconsin 53706, USA, and Karlsruhe Institute of Technology,

Physical Chemistry, Kaiserstrasse 12, D-76131 Karlsruhe, Germany

E-mail: [email protected]

∗To whom correspondence should be addressed
†University of Wisconsin
‡Karlsruher Institut für Technologie

Abstract

In this work, we review recent extensions of the Density Functional Tight Binding (DFTB)

methodology and its application to organic and biological molecules. DFTB denotes a class of

computational models derived from Density Functional Theory (DFT) using a Taylor expan-

sion around a reference density. The first and second order models, DFTB1 and DFTB2, have

been reviewed recently (WIREs Comput Mol Sci 2012, 2: 456-465). Here, we discuss the

extension to third-order, DFTB3, which in combination with a modification of the Coulomb

interactions in the second order formalism and a new parametrization scheme leads to a sig-

nificant improvement of the overall performance. The performance of DFTB2 and DFTB3 for

organic and biological molecules is discussed in detail, as well as problems and limitations

of the underlying approximations.

Introduction

Density Functional Tight Binding (DFTB) is a generic name for a set of computational models

derived from DFT. The starting point of the derivation is the reference density ρ0 of the molecular

system, which is constructed as a superposition of the neutral densities ρa0 of the atoms (a) that

constitute the system,

\[ \rho_0 = \sum_a \rho_a^0 . \tag{1} \]

The different DFTB models are derived by expanding the DFT total energy functional around

this density ρ0 in first, second and third orders, respectively. The first order terms constitute the

standard DFTB1 model, which originally was called simply DFTB,1,2 while the model based on

the second order expansion, DFTB2, was originally called SCC-DFTB.3 In the last years we have

derived third-order terms, leading to the DFTB34–8 model.

If the ground state density ρ is written in terms of the reference density ρ0 and the density

fluctuation δρ ,

ρ = ρ0 +δρ, (2)

the DFTB total energy can be expanded in the respective orders as:

E[ρ] = E0[ρ0]+E1[ρ0,δρ]+E2[ρ0,(δρ)2]+E3[ρ0,(δρ)3]. (3)

E0 and E1 constitute the DFTB1 model, adding E2 defines DFTB2, and the inclusion of E3

yields the DFTB3 model.

The different DFTB models have reasonably clear areas of application. DFTB1 is suitable for

systems in which the charge transfer between atoms is small, such as homonuclear systems or
systems with atoms of similar electronegativity. Therefore, DFTB1 is well suited for the description
of hydrocarbons, for which the higher order terms are small. On the other hand, DFTB1 can

also treat systems where a complete charge transfer between the atoms occurs, as, for example, in

NaCl.5 DFTB1 is 5-10 times faster than DFTB2/DFTB3 since it does not require a self-consistent

determination of the charge distribution; i.e., it requires a solution of the generalized eigenvalue

problem only once instead of 5-10 times on average for DFTB2/DFTB3. By contrast, the increase

in CPU time from DFTB2 to DFTB3 is negligible. The second order terms are crucial for polar

molecules, where only partial charge transfer occurs,5 and the third-order expansion becomes in-

dispensable for charged molecular species,6,7 as will be discussed in more detail below. Therefore,

for an application to biological molecules DFTB2 or DFTB3 is required. Since DFTB3 does not

imply any major increase in computational cost, we recently devised a new DFTB3 model by de-

riving a parameter set called ‘3OB’, which we recommend to use in standard applications of DFTB

to biological molecules.8

Three approximations follow the expansion of the total energy in Eq.3: (i) The energy contri-

bution E0[ρ0] is approximated by a sum of pair potentials, which are fitted for a set of molecules

to appropriate reference data. (ii) The Kohn-Sham orbitals Ψi appearing in the term E1[ρ0,δρ]

are expanded in a minimal atomic orbital basis φµ and a two-center approximation is applied to

the evaluation of the resulting integrals. (iii) The electron density fluctuations δρ appearing in the

second- and third-order terms are expanded with a multipole expansion. In the existing models,

this expansion is truncated after the monopole term, thus electron-electron interaction (Hartree

and exchange-correlation terms) is effectively approximated by the interaction of atomic partial

charges. This interaction is described by a Coulomb-term, which is damped at short interatomic

distances. A major improvement for non-bonded interactions has been achieved by identifying a

shortcoming of the original interaction term3 and proposing a simple modification.5 This modifi-

cation is now used as a default in the DFTB3 model.7,8

Since DFTB1 and DFTB2 have been reviewed in great detail in Ref. 9, this article focuses on
the extension to DFTB3 as outlined in Refs. 5–8 and on the performance of DFTB2 and DFTB3
for organic and biological molecules (see footnote 1).

Footnote 1: A very nice introduction to DFTB2 has also been given in Ref. 10.

Theoretical Approach

Theory of the third-order SCC-DFTB: DFTB3

The extension of the DFTB approach to include third-order terms (DFTB3) has been introduced recently5,7
and will be briefly summarized in the following. The starting point to derive the DFTB3
total energy is the energy expression of Kohn-Sham density functional theory.11 Instead of
finding the electron density ρ(r) that minimizes the total energy, a reference density ρ0 is assumed
and perturbed by some density fluctuation, ρ(r) = ρ0(r) + δρ(r). The exchange-correlation energy
functional is then expanded in a Taylor series up to third order and the total energy can be
written as,

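\[
\begin{aligned}
E^{\mathrm{dftb3}}[\rho_0+\delta\rho] ={}& -\frac{1}{2}\iint \frac{\rho_0(\mathbf{r})\,\rho_0(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' - \int V^{xc}[\rho_0]\,\rho_0(\mathbf{r})\,d\mathbf{r} + E^{xc}[\rho_0] + E_{nn} \\
&+ \sum_i n_i \langle \psi_i | H[\rho_0] | \psi_i \rangle \\
&+ \frac{1}{2}\iint \left( \frac{1}{|\mathbf{r}-\mathbf{r}'|} + \left.\frac{\delta^2 E^{xc}[\rho]}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')}\right|_{\rho_0} \right) \delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,d\mathbf{r}\,d\mathbf{r}' \\
&+ \frac{1}{6}\iiint \left.\frac{\delta^3 E^{xc}[\rho]}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r}'')}\right|_{\rho_0} \delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r}'')\,d\mathbf{r}\,d\mathbf{r}'\,d\mathbf{r}'' \\
={}& E^0[\rho_0] + E^1[\rho_0,\delta\rho] + E^2[\rho_0,(\delta\rho)^2] + E^3[\rho_0,(\delta\rho)^3].
\end{aligned}
\tag{4}
\]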

Central to the performance of the DFTB models are several approximations following this

expansion, which are:

(i) E0[ρ0] consists of the DFT ‘double counting’ contributions and the nuclear-nuclear repul-

sion in the first line of Eq.4 and depends only on the reference density ρ0, which is given by the

superposition of neutral atomic densities (Eq.1). In other words, this term is not dependent on

the specific chemical environment; it can be determined for an appropriate ‘reference system’ and

then applied to other molecules. This is the key to the transferability of the derived parameters. In

DFTB, this term is approximated by a sum of pair interactions referred to as the repulsive potential,

\[ E^{\mathrm{rep}} = \frac{1}{2} \sum_{ab} V^{\mathrm{rep}}_{ab}, \tag{5} \]

(see Ref. 9), which is either determined by comparison to DFT calculations1 or fitted to empirical

data.12 This approach neglects three-body contributions, which may become important in certain

cases, such as in condensed phase systems.13

(ii) E1 consists of the Hamilton matrix elements ⟨ψi|H0|ψi⟩ in the second line of Eq.4:
H0 is the Kohn-Sham Hamiltonian of the molecular system, evaluated at the reference density. ψi
are the Kohn-Sham orbitals, which are represented in a minimal basis of pseudoatomic orbitals,
ψi = ∑µ cµiφµ. This is central to the computational efficiency since it reduces the size of the
eigenvalue problem significantly.

\[ E^1[\rho_0,\delta\rho] = \sum_{iab} \sum_{\mu\in a} \sum_{\nu\in b} n_i\, c_{\mu i}\, c_{\nu i}\, H^0_{\mu\nu} \tag{6} \]

where H0µν are the Hamilton matrix elements in the atomic orbital (AO) representation. The diagonal
elements of H0µν are chosen to be atomic DFT eigenvalues evaluated with the PBE14 exchange-correlation
functional, while the off-diagonal elements are calculated in a two-center approximation.9,15

This minimal basis approximation is at the core of the problems when it comes to the calculation of

response properties.16,17 A single-zeta basis can be tuned quite well to reproduce the bonding prop-

erties of molecules. However, treatment of the more diffuse part of the density, which is relevant

to non-covalent interactions, is more challenging and requires more extended basis sets. Further,

polarization functions may become important for some systems, such as for nitrogen as discussed

below. After introducing the minimal basis, the resulting interaction integrals are approximated.

In particular, this concerns the neglect of 2-center integrals for the diagonal terms and three-center

integrals for the off-diagonal contributions; for a detailed discussion, see Ref. 9.

(iii) E2, the energy term in the third line of Eq.4, is approximated using only the monopole

term in an expansion of δρ in spherical harmonics.3 The charge density fluctuations δρ are then

written as a superposition of atomic contributions, δρ = ∑a δρa, in which the spherical atomic

contributions are approximated by a simple Slater function with ∆qa = qa−q0a (qa is the Mulliken

charge of atom a and q0a the number of valence electrons of the neutral atom a) centered on the

nucleus at Ra:

\[ \delta\rho_a \approx \Delta q_a \frac{\tau_a^3}{8\pi}\, e^{-\tau_a |\mathbf{r}-\mathbf{R}_a|} \tag{7} \]

With this approximation, the Coulomb interaction of the second order term with respect to δρ can

be expressed analytically and is abbreviated as γab in the following. The exponent τa is chosen

such that the on-site value of the γ-function properly describes the atomic chemical hardness (or

alternatively the Hubbard parameter as calculated from DFT) and, therefore, implicitly takes into

account the exchange-correlation contribution to the second order term. To improve this interpolation
between long-range Coulomb interaction and the on-site term, further refinements on the

γ-function have been applied.7

As discussed in Refs. 4,5, the derived function γab assumes a specific inverse relationship of

the chemical hardness U of an element with its atomic size. Although this relation holds well

within one row of the periodic table, it does not for elements of different rows. In particular,

hydrogen turned out to deserve special attention and we therefore proposed a modified γab function

referred to as γhab.5–7 This modified function can be applied within DFTB2, but it is not the ‘default’

option. Within DFTB3, it has become default7,8 and therefore is a key ingredient of the DFTB3

methodology.

(iv) The E3 term consists of diagonal and off-diagonal contributions. Originally, only the

diagonal terms were included.4–6 As in DFTB2, a monopole approximation is applied and this

term describes the change of the chemical hardness of an atom with its charge state.5 Specifically,

in third order a new parameter is introduced, the charge derivative of the chemical hardness, Ud .

This parameter can be computed from DFT or optimized in order to improve the performance of the

model. In DFTB3, Ud contributes at the third-order through a Γ-function, which is the derivative

of γab with respect to atomic charge. It is interesting to note that Giese et al.18 showed within

the framework of a rigorous density-functional expansion method that the third-order contribution

does not add significantly to accuracy, in contrast to our finding with calculations based on diverse

sets of molecules.6–8 Therefore, the third-order terms in DFTB3 can be seen as a systematic way to

introduce the charge dependence to compensate for deficiencies of intrinsic approximations within

the second order formalism, namely, the small size of the pseudo-atomic orbital basis, the fixed

shape of the initial atomic densities ρa0 as well as the simplified density fluctuation scheme.
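As a consistency check on the Slater-type charge fluctuation of Eq. 7 used in approximation (iii), one can verify that the prefactor τa³/(8π) normalizes the exponential, so that δρa integrates to ∆qa. The short derivation below is a sketch that places the origin at Ra (a choice made here for convenience) and uses only the standard integral ∫ r² e^{−τr} dr = 2/τ³ over r from 0 to ∞.

```latex
\begin{aligned}
\int \delta\rho_a(\mathbf{r})\, d\mathbf{r}
  &= \Delta q_a \frac{\tau_a^3}{8\pi} \int_0^\infty 4\pi r^2\, e^{-\tau_a r}\, dr \\
  &= \Delta q_a \frac{\tau_a^3}{2} \cdot \frac{2}{\tau_a^3}
   = \Delta q_a .
\end{aligned}
```

The monopole approximation therefore carries exactly the Mulliken charge fluctuation of each atom, and τa only controls how compact that charge distribution is.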

With all these approximations the DFTB3 total energy is given by

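\[
E^{\mathrm{dftb3}} = \frac{1}{2} \sum_{ab} V^{\mathrm{rep}}_{ab}
 + \sum_{iab} \sum_{\mu\in a} \sum_{\nu\in b} n_i\, c_{\mu i}\, c_{\nu i}\, H^0_{\mu\nu}
 + \frac{1}{2} \sum_{ab} \Delta q_a \Delta q_b\, \gamma^h_{ab}
 + \frac{1}{3} \sum_{ab} (\Delta q_a)^2 \Delta q_b\, \Gamma_{ab}. \tag{8}
\]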

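The derivative of this expression with respect to the molecular orbital coefficients, cµi, leads to the
corresponding Kohn-Sham equations

\[ \sum_\nu c_{\nu i} \left( H_{\mu\nu} - \varepsilon_i S_{\mu\nu} \right) = 0 \quad \text{with } \nu \in b \text{ and } \forall a,\ \mu \in a, \tag{9} \]

\[ H_{\mu\nu} = H^0_{\mu\nu} + S_{\mu\nu} \sum_c \Delta q_c \left( \frac{1}{2}(\gamma_{ac} + \gamma_{bc}) + \frac{1}{3}(\Delta q_a \Gamma_{ac} + \Delta q_b \Gamma_{bc}) + \frac{\Delta q_c}{6}(\Gamma_{ca} + \Gamma_{cb}) \right), \tag{10} \]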

where Sµν is the overlap matrix. The Hamilton matrix elements depend on the Mulliken charges,

which in turn depend on the molecular orbital coefficients. Thus, these equations have to be solved

self-consistently.
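The self-consistent cycle implied by Eqs. 9 and 10 can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in any DFTB code: it keeps only the second-order (γ) part of Eq. 10, uses simple linear charge mixing, and all matrices (`H0`, `S`, `gamma`) are made-up numbers for a hypothetical two-atom, two-orbital model. The function names are likewise assumptions introduced here.

```python
import numpy as np


def solve_generalized(H, S):
    """Solve H C = S C eps via Loewdin orthogonalization (S positive definite)."""
    s, U = np.linalg.eigh(S)
    X = U @ np.diag(s ** -0.5) @ U.T           # S^(-1/2)
    eps, C_orth = np.linalg.eigh(X @ H @ X)
    return eps, X @ C_orth                      # columns satisfy C^T S C = 1


def mulliken_populations(C_occ, occ, S, orb_atom, n_atoms):
    """Mulliken electron populations q_a from the occupied MO coefficients."""
    P = (C_occ * occ) @ C_occ.T                 # density matrix sum_i n_i c_i c_i^T
    per_orbital = np.sum(P * S, axis=1)         # diagonal of P S (S is symmetric)
    q = np.zeros(n_atoms)
    np.add.at(q, orb_atom, per_orbital)         # collect orbitals on their atoms
    return q


def scc_charges(H0, S, gamma, q0, orb_atom, n_electrons,
                mix=0.3, tol=1e-10, max_iter=500):
    """Iterate H[dq] -> eigenvectors -> Mulliken dq until dq is self-consistent.

    Only the second-order term of Eq. 10 is included (assumption for brevity):
    H_mu,nu = H0_mu,nu + 1/2 S_mu,nu sum_c dq_c (gamma_ac + gamma_bc).
    """
    orb_atom = np.asarray(orb_atom)
    n_atoms = len(q0)
    n_occ = n_electrons // 2                    # closed shell, doubly occupied
    occ = np.full(n_occ, 2.0)
    dq = np.zeros(n_atoms)                      # start from neutral atoms
    for iteration in range(1, max_iter + 1):
        shift = gamma @ dq                      # sum_c gamma_ac dq_c for each atom a
        s_orb = shift[orb_atom]                 # atom shift mapped onto orbitals
        H = H0 + 0.5 * S * (s_orb[:, None] + s_orb[None, :])
        eps, C = solve_generalized(H, S)
        q = mulliken_populations(C[:, :n_occ], occ, S, orb_atom, n_atoms)
        dq_out = q - q0
        if np.max(np.abs(dq_out - dq)) < tol:
            return dq_out, eps, iteration
        dq = (1.0 - mix) * dq + mix * dq_out    # simple linear mixing
    raise RuntimeError("SCC cycle did not converge")


# Hypothetical heteronuclear two-level model (all numbers are illustrative).
H0 = np.array([[-0.5, -0.3], [-0.3, -0.2]])
S = np.array([[1.0, 0.2], [0.2, 1.0]])
gamma = np.array([[0.5, 0.3], [0.3, 0.4]])
dq, eps, n_iter = scc_charges(H0, S, gamma, q0=np.array([1.0, 1.0]),
                              orb_atom=[0, 1], n_electrons=2)
```

Each pass through the loop costs one generalized eigenvalue solution, which is why a self-consistent DFTB2/DFTB3 calculation is several times more expensive than DFTB1, where a single solution suffices. Adding the Γ terms of Eq. 10 would change only the line that builds `H`.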

Dispersion correction

Dispersion interactions play an important role in processes dominated by non-covalent interactions,

such as conformational transitions of biomolecules. In the DFTB framework, the first attempt19 to

include dispersion was to augment the DFTB2 energy with an empirical dispersion term, following

the similar strategy applied to Hartree-Fock energies; the results were promising19 and stimulated

similar developments for pure DFT methods.20–23 However, due to the use of a minimal basis set of

atomic orbitals, which are slightly compressed with respect to atomic orbitals, the electron density

in DFTB2 is not well described for large distances, especially for the overlap of weakly interacting

densities which are essential to the description of van der Waals (vdW) interactions. The early

parametrization of DFTB2 for organic and biological molecules2 led to an underestimation of dis-

tances in hydrogen and vdW bonded complexes. Therefore, an empirical dispersion correction has

been proposed which also contains a repulsive contribution in order to correct for this artifact.24

The new DFTB3 parametrization 3OB corrects for this problem by using a slightly more extended

atomic orbital basis set, leading to a good description of non-covalently bonded complexes using

the original dispersion correction from Ref. 19. Recently, Grimme has parametrized his D3 correction23
for DFTB3, leading to an excellent performance of DFTB3-D3 for a large set of hydrogen
and vdW bonded molecules.25

Footnote 2: Referred to as the ‘mio’ set, see www.dftb.org
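To make the idea of an empirical dispersion term concrete, the sketch below adds a damped pairwise −C6/R⁶ contribution on top of an electronic energy. The damping form, its default exponents and the parameter tables `c6` and `r0` are illustrative assumptions made here; the actual functional forms and parameters are those defined in Refs. 19 and 23 and are not reproduced in this text.

```python
import numpy as np


def damped_dispersion_energy(coords, c6, r0, d=3.0, n=7, m=4):
    """Pairwise dispersion energy  -sum_{a<b} f(R_ab) * C6_ab / R_ab^6.

    f(R) = (1 - exp(-d (R/R0)^n))^m is an assumed damping function that
    switches the attraction off at short range, where the electronic
    Hamiltonian already describes the interaction.
    """
    coords = np.asarray(coords, dtype=float)
    energy = 0.0
    for a in range(len(coords)):
        for b in range(a + 1, len(coords)):
            r = np.linalg.norm(coords[a] - coords[b])
            # -expm1(-x) = 1 - exp(-x), evaluated accurately for small x
            damping = (-np.expm1(-d * (r / r0[a][b]) ** n)) ** m
            energy -= damping * c6[a][b] / r ** 6
    return energy


# Hypothetical two-atom example: C6 = 10, R0 = 3 (arbitrary consistent units).
c6 = [[0.0, 10.0], [10.0, 0.0]]
r0 = [[3.0, 3.0], [3.0, 3.0]]
e_far = damped_dispersion_energy([[0, 0, 0], [0, 0, 50.0]], c6, r0)
e_near = damped_dispersion_energy([[0, 0, 0], [0, 0, 0.3]], c6, r0)
```

At large separation the term reduces to the undamped −C6/R⁶ attraction, while at short range the damping removes it almost completely. The total energy would then be the DFTB energy plus this correction.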

Treatment of Electron Spin

Standard DFTB is a closed-shell method and therefore exhibits large errors for open-shell systems.

Köhler et al. have formulated an open-shell DFTB variant that includes spin-polarization effects

either in a collinear26,27 or a noncollinear fashion.28 Besides doubling the orbital set for spin up

and spin down electrons, an additional term is added to the total energy that takes into account the

Mulliken spin-population and atomic spin-polarization constants. The latter are calculated from

DFT as numerical difference of partially spin-polarized states in proximity of the spin-unpolarized

state of an atom. The collinear spin treatment improves the description of radicals of organic

molecules. However, for some systems the direction of spin-quantization varies significantly in

space (e.g., antiferromagnetism), for which the noncollinear spin-polarization treatment is neces-

sary. Note that for the collinear case the amount of computation time doubles with respect to the

nonpolarized calculation, while for the noncollinear one the cost quadruples.

Inherited DFT problems

DFTB is derived from DFT and usually GGA functionals (PBE) are applied to compute the terms

in E1, E2 and E3. Since E0 only affects bond energies but not the electronic spectrum in total, ap-

plying higher level methods or experimental data for the determination of E0 does not compensate

for most of the problems inherent to DFT-GGA, except for overbinding, which could be almost

entirely removed in DFTB3/3OB. The deficiency of DFT-GGA in describing vdW interactions can

be compensated by using empirical dispersion corrections, as described above. All other phenomena
related to the self-interaction problem (SIC) in DFT are retained in the DFTB model. This

is reflected, for example, in the performance of DFTB for the subsets of problematic cases in the

GMTKN2429 test set (SIE11, DARC, DC9), as discussed in Ref. 8. The self-interaction problem

shows up in many properties and is contained in the second- and third-order terms in DFTB. A

detailed analysis has been published recently.30,31 The description of the balance between charge

delocalization and polarization, for example in charge transfer complexes, is also a challenge to

DFT. Rapacioli et al.32 adapted recently a configuration interaction method, based on constrained

DFT calculations, into the DFTB approach. This allows one to investigate charge resonances in

molecular complexes and describe the proper dissociation behavior.

QM/MM coupling

DFTB has been combined with empirical force field methods in a QM/MM framework as described
in Ref. 33. This scheme has further been extended to include also a continuum electrostatics

environment in the DFTB/MM-GSBP scheme,34 which is useful to the study of chemical reactions

in large macromolecular systems.35,36

For the interaction between QM and MM atoms, it is common to include both electrostatic

and van der Waals contributions;37–39 bonded-terms are also included when the partitioning is

across covalent bonds. In most biomolecular applications, electrostatics tend to dominate and

therefore it is essential that electrostatic interactions between QM and MM atoms are properly

described. For DFTB, the QM-MM electrostatic interaction is approximately calculated in the

original implementation33 as the Coulombic interaction between the QM Mulliken charges (qa)

and MM point charges (QI). The error due to this approximation can be significant when QM

and MM atoms approach each other where charge penetration effect becomes important. As a

result, reactions that involve highly charged solutes/substrates are difficult to study with the original

DFTB/MM Hamiltonian.40 The problem can be partially solved by enlarging the QM region, but

this introduces not only additional cost but also technical complications for cases that involve

highly mobile solvents, such as the need of changing QM/MM partitioning on the fly.41,42

In our recent work,43 motivated by the Klopman-Ohno (KO) expression for the two-center two-

electron integrals in semi-empirical QM methods,44 which also inspired the development of the

γab kernel in the original DFTB, we have implemented a different Hamiltonian for the DFTB/MM

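electrostatics. It takes the form,

\[
H^{\mathrm{QM/MM}}_{\mathrm{elec,KO}}
 = \sum_{a\in\mathrm{QM}} \sum_{A\in\mathrm{MM}} \frac{\Delta q_a Q_A}{\sqrt{R_{aA}^2 + a_a \left( \dfrac{1}{U_a(q_a)} + \dfrac{1}{U_A} \right)^2 e^{-b_a R_{aA}}}}
 = \sum_{a\in\mathrm{QM}} \sum_{A\in\mathrm{MM}} \gamma^{\mathrm{KO}}\, \Delta q_a Q_A \tag{11}
\]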

in which aa and ba are element type dependent parameters. Together with the van der Waals

parameters in the QM/MM Hamiltonian, there are 4 QM/MM parameters for each element type,

and they can be determined based on microsolvation clusters.43 To be consistent with the third-

order formulation of SCC-DFTB,7 the Hubbard parameter in the KO functional is dependent on

the QM charge. As a result, the effective size of the QM charge distribution naturally adjusts

as the QM region undergoes chemical transformations, making the KO based QM/MM scheme

particularly attractive for describing chemical reactions in the condensed phase.
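The behaviour of the Klopman-Ohno kernel in Eq. 11 is easiest to see numerically. The following sketch evaluates γKO and the resulting QM/MM electrostatic energy for given charges; atomic units are assumed, the function names are introduced here for illustration, and the parameter values in the example are made up rather than fitted values. A charge-dependent Hubbard parameter Ua(qa) would simply be passed in as `u_qm`.

```python
import math


def gamma_ko(r, u_qm, u_mm, a, b):
    """Klopman-Ohno damped Coulomb kernel of Eq. 11.

    r     : QM-MM distance R_aA
    u_qm  : Hubbard parameter U_a(q_a) of the QM atom
    u_mm  : Hubbard-like parameter U_A of the MM atom
    a, b  : element-type dependent parameters a_a and b_a
    """
    size = (1.0 / u_qm + 1.0 / u_mm) ** 2
    return 1.0 / math.sqrt(r * r + a * size * math.exp(-b * r))


def qmmm_elec_ko(dq_qm, xyz_qm, u_qm, a_qm, b_qm, q_mm, xyz_mm, u_mm):
    """Sum gamma_KO * dq_a * Q_A over all QM/MM atom pairs (Eq. 11)."""
    energy = 0.0
    for i, dq in enumerate(dq_qm):
        for j, big_q in enumerate(q_mm):
            r = math.dist(xyz_qm[i], xyz_mm[j])
            energy += gamma_ko(r, u_qm[i], u_mm[j], a_qm[i], b_qm[i]) * dq * big_q
    return energy


# Illustrative limits with made-up parameters U_a = U_A = 0.5, a_a = b_a = 1.
g_contact = gamma_ko(0.0, 0.5, 0.5, 1.0, 1.0)    # finite at R = 0
g_far = gamma_ko(100.0, 0.5, 0.5, 1.0, 1.0)      # approaches 1/R
e_pair = qmmm_elec_ko([0.5], [(0.0, 0.0, 0.0)], [0.5], [1.0], [1.0],
                      [-1.0], [(0.0, 0.0, 100.0)], [0.5])
```

The kernel stays finite when a QM and an MM atom approach each other, which is how the scheme mimics charge penetration, and it recovers the point-charge Coulomb interaction 1/R at long range.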

Our studies of charged solutes and chemical reactions clearly indicate that the KO scheme

is robust and transferable. For the fitting set clusters, both the point-charge and KO schemes

have comparable errors (relative to full QM results) in solute-solvent interactions, with Mean
Unsigned Errors (MUE) of 3.3 and 4.8 kcal/mol, respectively (note that the errors are for total
solute-solvent interactions, which are often >100 kcal/mol, so the error is typically less than
5%). However, for 16 stable structures and 24 transition states in the QCRNA database, the MUE

is 4.3 kcal/mol for the KO scheme but 16.2 kcal/mol for the point-charge based QM/MM model.

As another example, for the hydrolysis of phosphate mono esters in solution43 the hydrolysis

barrier is grossly overestimated (∼ 11 kcal/mol) with SCC-DFTBPR/MM simulations using the

point-charge based QM-MM Hamiltonian (see footnote 3). With the KO scheme, the computed barrier is in close
agreement (within 2 kcal/mol) with available experimental data.

Footnote 3: SCC-DFTBPR is a DFTB variant including only diagonal 3rd order terms and a specific modification and
parametrization for phosphate hydrolysis. See Ref. 40.

Parametrization

The parametrization of the DFTB models involves three steps:

(i) The determination of the parameters for E1:

This is usually the first step in the parametrization. Here, one has to compute

\[ H^0_{\mu\nu} = \langle \phi_\mu | H[\rho_0] | \phi_\nu \rangle \tag{12} \]

and Sµν for setting up the Hamilton Matrix elements in Eq.10. In a first step, one has to determine

the atomic orbital basis set φµ and the neutral atom densities ρ0a by solving the atomic KS equa-

tions where an additional potential leads to a confinement of the orbitals.9 For the basis set, the

confinement parameter is usually set to roughly twice the covalent radius of the element, while the

choice of the confinement radius of the initial atomic densities is slightly more empirical.7,12 The

choice of these two parameters in a reasonable range does not alter molecular properties on a large
scale; however, they can be used for a fine-tuning of the method. This has been discussed recently

for the derivation of the new DFTB3 parameters 3OB.8 Compared to the older DFTB2 parameters

‘mio’, more diffuse basis functions φµ lead to an increase in Pauli repulsion which is relevant for

weak interactions, while a slightly larger compression of the initial densities ρ0a leads to a decrease

in the overbinding and therefore better performance for heats of formation and reaction energies.

(ii) The determination of the parameters for E2 and E3:

For the atomic partial charges qa in Eq.8 a Mulliken partitioning scheme is usually applied, al-

though other schemes are possible as well. Using CM3 charges has been shown to improve the

electrostatic potential of molecules;17,45 however, additional parameters would enter the parametriza-

tion procedure, which we have tried to keep as simple and straightforward as possible. The function

γ(Ua,Rab) in Eq.8 has been determined by an analytical derivation3 and the chemical hardness pa-

rameter (or Hubbard parameter) Ua is usually computed from DFT. However, as described above,

this choice of γ(Ua,Rab) presupposes a particular inverse relationship between the chemical hard-

ness and the size of an atom, which holds well within one row of the periodic table but by no means

for elements of different rows4,5,7 (see footnote 4). Therefore, the functional form of γ(Ua,Rab) should depend
on the row of the periodic table. For the first row, we use the original form but for hydrogen and
its interaction with other elements a modified function γh(Ua,Rab) is applied.4,5,7 Other choices
of functions for the 2nd, 3rd etc. rows are ongoing work. For DFTB3, the derivative of γ(Ua,Rab),
Γ(Ua,Uda,Rab), is needed,7 where Uda is the charge derivative of Ua. In the earlier implementations,6,40
only the diagonal part of Γ(Ua,Uda,Rab) was implemented. This works well for first row
molecules, except for the deprotonation of NH3, where the off-diagonal terms seem to be important.7
The diagonal version, however, was not able to describe phosphorus-containing molecules,
in particular their (de-)protonation energies, and an ad hoc modification has been necessary involving
a special parametrization.40 This problem could be remedied at full third-order,7 however,
by treating Uda as adjustable parameters.

Footnote 4: See in particular Fig. 2 from Ref. 7

(iii) The determination of the parameters for E0:

The determination of Erep has been greatly simplified by introducing automated parametrization

procedures.12,46 These schemes not only reduce the effort but also allow to vary the optimization

targets. In principle, data from any theoretical level and experiment can enter the parametriza-

tion. Since it depends only on ρ0, Erep could be determined in principle only once and would

be valid for all DFTB models. Indeed, the Erep parameters originally derived for DFTB2, called

‘mio’, worked rather well with DFTB3.6,7 However, a fine tuning can be achieved when Erep is

specifically optimized for the respective model. Therefore, we have reoptimized the parameters

for DFTB3, now called 3OB (referring to ‘DFTB3’ and the main field of application: organic and

biological molecules). There are thus currently two sets of parameters available: the ‘mio’
set, which has been derived for DFTB2 (Ref. 3), and the ‘3OB’ set, which has been derived for DFTB3 (Ref. 8; see footnote 5).
Note that the 3OB set also differs in the electronic parameters, as described above (see footnote 6).

Footnote 5: These parameter sets can be downloaded from the website www.dftb.org.

Footnote 6: In earlier work applying the diagonal DFTB3 method in combination with the ‘mio’ set, fitted Ud values compensated
for the overbinding of the method. This is no longer needed using ‘3OB’ since this parametrization removes
the overbinding by changing the density compression radii.8 Further, the special parametrization and modification for
phosphorus compounds40 is no longer required due to the introduction of the 3rd order off-diagonal terms.

In summary, one first has to determine 4 parameters per atom type: the confinement radius for
the atomic orbitals φµ, called r0; the confinement radius for the atomic densities ρa0, called rd0;
the atomic Hubbard parameter Ua; and its charge derivative Uda. The determination
of r0 and rd0 is an empirical procedure and can be quite involved,8,12 while Ua and Uda can be
easily computed in principle. For the modified function γh one additional parameter appears,
which is fitted to reproduce the water dimer binding energy. The repulsive potentials are two-body
contributions; therefore, their determination is much more involved, although largely automated procedures have
been recently developed.12
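Once the pair potentials are available, evaluating the repulsive energy of Eq. 5 is a plain double loop over atom pairs. The sketch below illustrates this; the exponential `toy_pair_potential`, its parameters and its cutoff are hypothetical placeholders chosen here, whereas real DFTB repulsive potentials are fitted, tabulated functions for each element pair.

```python
import numpy as np


def repulsive_energy(coords, species, pair_potential):
    """E_rep = 1/2 * sum over ordered pairs a != b of V_rep_ab(R_ab)  (Eq. 5)."""
    coords = np.asarray(coords, dtype=float)
    n_atoms = len(coords)
    energy = 0.0
    for a in range(n_atoms):
        for b in range(n_atoms):
            if a == b:
                continue                          # no on-site contribution
            r = np.linalg.norm(coords[a] - coords[b])
            energy += 0.5 * pair_potential(species[a], species[b], r)
    return energy


def toy_pair_potential(sp_a, sp_b, r, prefactor=10.0, decay=2.0, cutoff=4.0):
    """Hypothetical short-ranged repulsion; the species labels are ignored here."""
    return prefactor * np.exp(-decay * r) if r < cutoff else 0.0


# Usage: a dimer and a linear trimer with made-up geometries.
e_dimer = repulsive_energy([[0, 0, 0], [0, 0, 1.5]], ["C", "H"], toy_pair_potential)
e_trimer = repulsive_energy([[0, 0, 0], [0, 0, 1.5], [0, 0, 3.0]],
                            ["H", "C", "H"], toy_pair_potential)
```

Because the sum contains only two-body terms that depend on the element pair and the distance, a fitted potential can be transferred between molecules, and three-body effects are absent by construction.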

While for many applications relative energies are the important quantity, sometimes the cal-

culation of atomization energies and heats of formation is desired, as for example in the case of

fitting the DFTB repulsive potentials. However, the calculation of atomization energies requires

some additional care.8 It is given by the total energy of a system Etot and the atomic energies Eatom

\[ E^{\mathrm{At}} = -E^{\mathrm{tot}} + \sum_a E^{\mathrm{atom}}_a \tag{13} \]

With a closed-shell treatment, DFTB gives atomic energies Eatom of rather poor quality. One may use the spin-polarization
formalism, which improves the results. In practice, however, the atomization energies

are usually calculated using spin-polarization energies Espin that are pre-calculated from DFT for

each atom; i.e., Espin is the difference of the atomic energy calculated at the spin-unpolarized state

and the spin-polarized state. 7 With that, Eatom is calculated as the total energy of an atom plus the

spin-polarization energy Espin.
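The bookkeeping of Eq. 13 with pre-calculated spin-polarization energies can be written out as follows. This is a sketch under a stated sign assumption: Espin is taken here as the (negative) energy lowering on going from the spin-unpolarized to the spin-polarized atom, so that adding it to the unpolarized atomic energy gives Eatom. All numerical values are invented for illustration and are not DFTB or DFT results.

```python
def atomic_energy(e_unpolarized, e_spin):
    """E_atom = spin-unpolarized atomic energy plus the spin-polarization energy.

    Assumption: e_spin <= 0 (energy gained by spin polarization).
    """
    return e_unpolarized + e_spin


def atomization_energy(e_tot, atoms, e_unpolarized, e_spin):
    """Eq. 13: E_At = -E_tot + sum_a E_atom_a for the atoms of the molecule."""
    atom_sum = sum(atomic_energy(e_unpolarized[el], e_spin[el]) for el in atoms)
    return -e_tot + atom_sum


# Hypothetical numbers (arbitrary energy units) for a water-like molecule.
e_unpol = {"O": -3.00, "H": -0.25}
e_spin = {"O": -0.05, "H": -0.04}
e_at = atomization_energy(-4.00, ["O", "H", "H"], e_unpol, e_spin)
e_at_no_spin = atomization_energy(-4.00, ["O", "H", "H"], e_unpol,
                                  {"O": 0.0, "H": 0.0})
```

Under this sign convention, neglecting the spin-polarization energies leaves the atomic energies too high and therefore inflates the atomization energy, which is the reason the closed-shell atomic energies alone are of poor quality for this purpose.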

Note that using Espin gives slightly more accurate results than using atomic energies as calcu-

lated from spin-polarized DFTB because the spin-polarization from the atom (as calculated from

DFT) is added rather than a correction of the atomic energy. The latter uses spin-polarization constants
calculated as derivatives of the atomic eigenvalues in the proximity of the spin-unpolarized
atom.

Footnote 7: In the case of hydrogen, the spin-unpolarized state would refer to a hypothetical one where 0.5 electrons are spin-up
and 0.5 electrons are spin-down, while the spin-polarized state is the ground state of the atom with 1.0 spin-up and
0.0 spin-down electrons.

Performance

Energetics, structure and vibrational frequencies of small molecules

DFTB2 has been tested over the years for a variety of molecular properties. A first thorough test

has been performed by Krüger et al,47 who benchmarked the accuracy of DFTB2 against G2 and

BLYP for 22 molecules, evaluating 28 reaction energies, geometries and vibrational frequencies.

Reaction energies show a mean error of 4.3 kcal/mol with respect to G2 and geometries are in

excellent agreement with those obtained at the DFT level. Vibrational frequencies, however, show

larger deviations; in particular, the stretch frequencies of several specific modes are significantly

overestimated. Therefore, Małolepsza et al.48 suggested applying a specific parametrization of

Erep for vibrational frequencies. With this special parameter set, DFTB2 shows a very good per-

formance and vibrational frequencies approach the quality of those from full DFT. We investigated

this point in more detail in later publications for DFTB2 (Ref. 12) and DFTB3.8 These studies demonstrate
the limited flexibility of the current DFTB approach; i.e., it is not possible to achieve an

accuracy comparable to DFT-GGA for both reaction energies and vibrational frequencies with a

single parameterization. There is an optimization conflict where one property deteriorates when

the other is improved. The pragmatic solution to this problem is to supply two sets of parame-

ters,8 one optimized for energies and geometries (3OB), the other for geometries and vibrational

frequencies (3OB-f).

Two other publications have benchmarked DFTB2 for even larger molecular test sets. Sat-

telmeyer et al.49 benchmarked DFTB2 for 622 closed shell molecules containing O, N, C and H in

comparison with Hartree-Fock based semi-empirical methods like AM1, PM3 and PDDG/PM3.

The good performance of DFTB for geometries was confirmed; however, the performance of
DFTB2 for heats of formation, with a mean absolute error of 5.8 kcal/mol, was worse than that
of PM3 and PDDG/PM3, the latter with a mean absolute error of 3.2 kcal/mol that even outperforms
B3LYP/6-31G(d) (see footnote 8). Otte et al.50 confirmed these findings, showing that DFTB2 performs
slightly worse for heats of formation than AM1 and PM3, and in particular worse than the OMx
suite of methods. However, geometries are very well described and DFTB2 is clearly superior for
vibrational frequencies. Further, DFTB2 performs very well for structures and relative energies of
peptide conformations,9 as well as for hydrogen bonded systems.

Footnote 8: This study also indicated errors in the treatment of N-O and S-O bonds, which should be ameliorated with the
new 3OB parametrization.

The ‘3OB’ parametrization for DFTB3 has been developed with improving two particular lim-

itations of DFTB2 in mind: the overbinding of about 5-10 kcal/mol per covalent bond (for O,

N, C, H containing molecules) and the underestimation of binding energies in weakly bonded

complexes.8 The third-order terms improve the description of localized charges10 and the modi-

fied Coulomb interaction γh(Ua,Rab) improves hydrogen bonding interactions.7 As a result, the

description of energies is greatly improved: DFTB3/3OB approaches the accuracy of DFT-GGA

methods like PBE for heats of formation and atomization energies as well as the accuracy of the
best semi-empirical methods like PDDG/PM3. DFTB3/3OB is even better than DFT-GGA

when only a small, double zeta type basis set is applied,8 as typically done with DFT or DFT/MM

based molecular dynamics simulations. In particular hydrogen bonding energies, proton affini-

ties and proton transfer barriers, which are relevant in many biochemical problems, are very well

described.

Recently, Goerigk and Grimme have compiled a general database (GMTKN24) for main group

thermochemistry, kinetics and non-covalent interactions.29 This set benchmarks a variety of molecular
properties: reaction and atomization energies, reaction barriers, electron affinities (EA’s), ionization
potentials (IP’s) and proton affinities, hydrogen bonding and vdW interactions, conformational
energies of peptides, hydrocarbons and carbohydrates, isomerization reactions and some

other properties. For this set, the accuracy of DFTB3/3OB is comparable to the newest variant of

the OMx models,8 OM3, which has been shown recently to approach the accuracy of DFT-GGA

methods for this data set.53

9 See also Refs. 51,52 for more detail.

10 As they appear in small charged molecules, where the charge is located on few atoms. Large ionic molecules,

where the charge is distributed over a large number of atoms, are unproblematic in DFTB2.


Properties: IP’s, EA’s, dipole moments and molecular polarizability

IP’s and EA’s for small molecules are difficult to compute with a minimal basis set method like

DFTB since these properties do not enter the parametrization procedure, in contrast to NDDO

type semi-empirical methods. The adjustment of Erep only affects bond lengths (not angles!),

bond energies and stretch frequencies. Therefore, properties like IP and EA are usually less accu-

rately described8,49,50,53 and deserve careful testing for the specific problem in hand. This holds

as well for dipole moments, which are simply computed from the Mulliken population analysis.

The description of electrostatic properties such as dipole moments can be easily improved using a

parametrized charge scheme like CM3, as has been shown in Ref. 17. However, IR intensities are

not changed since these depend on the derivative of the dipole moment with respect to the normal

coordinates, which is not improved. Unfortunately, this holds similarly for molecular polarizabilities

and Raman intensities. The polarizabilities can be adequately improved using methods

like chemical potential equalization16 or a variational approach (VAR),17 but Raman intensities

suffer from the same problem as IR intensities; i.e., although the properties are improved, their

derivatives with respect to normal coordinates are not.
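
As a minimal sketch of the point-charge dipole mentioned above, the following Python snippet evaluates mu = sum_a q_a R_a from a set of atomic charges. The geometry, the charge values and the function name mulliken_dipole are illustrative assumptions and not DFTB output; a parametrized charge model would simply replace the charges passed in.

```python
import numpy as np

E_BOHR_TO_DEBYE = 2.541746  # 1 e*bohr expressed in Debye


def mulliken_dipole(charges, coords_bohr):
    """Point-charge dipole mu = sum_a q_a * R_a, in e*bohr."""
    q = np.asarray(charges, dtype=float)      # shape (n_atoms,)
    r = np.asarray(coords_bohr, dtype=float)  # shape (n_atoms, 3)
    return q @ r


# Hypothetical water-like geometry (bohr) with illustrative charges (e).
coords = np.array([[0.00, 0.00, 0.0],    # O
                   [1.43, 1.11, 0.0],    # H
                   [-1.43, 1.11, 0.0]])  # H
charges = np.array([-0.6, 0.3, 0.3])

mu = mulliken_dipole(charges, coords)
mu_debye = np.linalg.norm(mu) * E_BOHR_TO_DEBYE
print(mu, mu_debye)
```

In this picture the dipole depends only on the charges and positions supplied, so an improved charge scheme changes mu at a given geometry without by itself fixing how mu varies along a normal coordinate.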

Conformations of complex molecules

Most of the tests described so far benchmark the DFTB performance for covalent bond lengths

and bond angles. The performance for dihedral angles, which is important to the description of

conformations of complex molecules like peptides, proteins, DNA and carbohydrates, remains

to be systematically tested. DFTB2 has been extensively benchmarked for the structures and relative energies

of polyalanine conformations.51,52 Relative energies and structures were found to be in good

agreement with DFT and ab initio predictions, and vibrational spectroscopic features were also re-

produced satisfactorily;54 for a short review, see Ref. 55. However, low frequency modes seem

to be underestimated,56 which indicates that rotational barriers are too low in DFTB. QM/MM

simulations of di-alanine in water indicated that the free energy minima at the α-helical and β-sheet

region were more extended than in standard force field methods,57 a finding confirmed later


using a different QM/MM implementation.58 A deeper analysis indeed showed that the rotational

barriers around the dihedral angles are very low. Furthermore, DFTB/MM populates the α basin

more than the β basin, in contrast to experimental findings. The energy differences, however, are

small and on the order of 0.5 kcal/mol. Therefore, small changes in the Hamiltonian can lead to

a significant change in the populations, and it is possible that DFTB3 with an improved QM/MM

coupling (KO-scheme, see above) leads to an improvement.
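
The sensitivity of basin populations to such small energy differences can be illustrated with a two-state Boltzmann estimate. This is only a sketch under stated assumptions: the two basins are treated as two states separated by a free energy difference, the temperature of 300 K is chosen for illustration, and the function name two_state_populations is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def two_state_populations(delta_g, temperature=300.0):
    """Populations (p_a, p_b) of two states with G_b - G_a = delta_g (kcal/mol)."""
    w = math.exp(-delta_g / (R_KCAL * temperature))  # Boltzmann weight of b relative to a
    return 1.0 / (1.0 + w), w / (1.0 + w)


# A shift of 0.5 kcal/mol in either direction swaps which basin dominates.
for dg in (0.5, 0.0, -0.5):
    p_a, p_b = two_state_populations(dg)
    print(f"dG = {dg:+.1f} kcal/mol -> p_a = {p_a:.2f}, p_b = {p_b:.2f}")
```

At 300 K a difference of 0.5 kcal/mol corresponds to populations of roughly 70:30, so a change of this size in the Hamiltonian is enough to reverse the preferred basin.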

DFTB2 has also been tested for carbohydrates, where the properties of interest are again the dihedral

angles, in particular the ring puckering modes. It has been shown that DFTB2 produces free

energy surfaces for conformational transitions similar to those of ab initio methods, in contrast to

various NDDO methods,59 motivating the application of DFTB2 to carbohydrate reaction dynam-

ics.60,61 The agreement with high level methods, however, is far from perfect and leaves ample

room for future improvement. For example, potential energy scans for certain dihedral angles

clearly showed that DFTB2 is in qualitative agreement with full DFT but with too low torsional

barriers, while NDDO type methods seem to fail even qualitatively.62 Nevertheless, there seems

to be no compelling reason to use DFTB2 for the description of structure and dynamics of carbohydrates

at the moment, since empirical force fields currently represent the potential energy surface much

more accurately.

Water

Another issue worth mentioning is the description of water by DFTB. Given its fundamental

importance in chemistry and biology, it is desirable to be able to adequately describe water in

both gas and condensed phases, including water in different protonation states (e.g., a solvated

proton/hydroxide). With the standard DFTB2, the hydrogen bonding interaction between water

molecules is too weak; as discussed above and in detail elsewhere,4,5,7 this motivated the development

of the modified γh function for atom pairs involving H. With DFTB3 (Ref. 7) and the latest

parameterization (Ref. 8), for example, the water dimer binding energy is well described and low-energy

conformers of small water clusters are also captured. The relative energies of these low-energy conformers,

however, are not yet ideal, suggesting the need for further improvement of hydrogen-bonding

interactions by, for example, going beyond the monopole approximation for charge-charge inter-

actions in eq. 8. The imperfection of water-water interaction is also manifested in bulk water

simulations, which indicated that both DFTB2 and DFTB3 tend to over-predict the height of the

first solvation shell peak in the O-O radial distribution function while underestimating the second

solvation shell.63–65

For NVT simulations at ambient conditions, one simple but ad hoc approach to improve the

description of bulk water is to adjust the pair-wise repulsive potentials based on a reverse Monte

Carlo protocol such that experimental radial distribution functions are reproduced. This is found66

to be somewhat successful in that the resulting repulsive potentials also improved the description

of small protonated water clusters and the structure of a solvated proton. For the 13 low-energy

isomers of H+(H2O)22, for example, the RMSE is only 0.9 kcal/mol relative to MP2 results,67 as

compared to the value of 3.8 kcal/mol for the original DFTB3. For a solvated proton in the bulk,

the integrated coordination number for the first solvation shell is 3.2, which is close to the value

of 3.0 for CPMD (using the HCTH functional); by comparison, the standard DFTB3 gives a value

close to 5.0.65 Nevertheless, the enthalpy of evaporation remains too low by about 1 kcal/mol,

and preliminary NPT simulations indicate that DFTB2/3 tends to substantially overestimate the

density of bulk water at ambient conditions, a situation also observed in some ab initio DFT

simulations.68 Therefore, improving the description of water remains an important topic for further

DFTB developments.
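
The integrated coordination number quoted above is obtained from a radial distribution function g(r) as n(r_cut) = 4 pi rho times the integral of g(r) r^2 from 0 to r_cut. The snippet below is a generic sketch of that integration and not the analysis code used in the cited studies; the function name coordination_number, the uniform test profile, the cutoff of 3.5 Angstrom and the approximate bulk water number density of 0.0334 per cubic Angstrom are assumptions chosen for illustration.

```python
import numpy as np


def coordination_number(r, g, number_density, r_cut):
    """Integrate n = 4*pi*rho * int_0^r_cut g(r) r^2 dr with the trapezoidal rule."""
    r = np.asarray(r, dtype=float)
    g = np.asarray(g, dtype=float)
    mask = r <= r_cut                     # keep grid points up to the cutoff
    rr = r[mask]
    integrand = g[mask] * rr ** 2
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rr))
    return 4.0 * np.pi * number_density * integral


# Sanity check with a structureless fluid, g(r) = 1, where the result
# should approach rho * (4/3) * pi * r_cut**3.
r = np.linspace(0.0, 6.0, 6001)           # Angstrom
g_uniform = np.ones_like(r)
rho = 0.0334                              # molecules per cubic Angstrom (approximate)
n_uniform = coordination_number(r, g_uniform, rho, 3.5)
print(n_uniform)
```

For a simulated g(r), r_cut would typically be placed at the first minimum after the first peak, so an overestimated first peak translates directly into an inflated first-shell coordination number.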

Conclusions

The extension of DFTB to third order, DFTB3,7,8 in combination with a new parametrization

procedure12 has improved the performance significantly for reaction energies, geometries and

hydrogen bonded complexes. DFTB3 even outperforms DFT-GGA with a double zeta (DZ) basis

in special cases, while being 2-3 orders of magnitude faster.69 However, the computational


efficiency comes at the price of reduced transferability; i.e., not all molecular properties can be

computed at the same accuracy within one parameter set. Such an optimization conflict has been

found in the case of reaction energies and vibrational frequencies; we have therefore proposed to use

two different parametrizations, the 3OB for energies and geometries and 3OB-f for geometries and

vibrational frequencies;8 geometries are described with similar accuracy in both parameter sets.

A key to better non-bonded interactions is the use of the third-order term in combination with

the modified Coulomb interaction term, γh. The augmentation of the DFTB3 total energy with

the empirical dispersion extension is recommended as a default, because it usually only improves

results.

Despite all these improvements, there are still several limitations of DFTB:

(i) The DFT-GGA framework used for the expansion of the total energy. The DFTB models

inherit the well-known DFT-GGA problems, especially the self-interaction error.

(ii) The use of a minimal basis set. This leads to a reduced molecular polarizability and limits

the application of DFTB for computing IR and Raman spectra. Further, the missing polarization

functions may cause problems in the description of sp3 nitrogen.8 This shows up in large errors

for proton affinities with acidic nitrogen, for which no satisfactory solution has been proposed up

to now; an ad hoc fix is used by applying a special parameter set (NHmod) for these special cases.

(iii) The limited flexibility of the scheme (fixed initial density, monopole approximation) leaves

further problems. This shows up in the description of atomization energies of ionic species.8 Another

complication is the need for two different parameter sets for hydrogen: a special parametrization

is needed when the bond breaking of molecular hydrogen is computed.8

(iv) DFTB describes the general conformational properties of biomolecules quite well: pep-

tides, DNA bases and sugars can be computed with often good accuracy. However, DFTB under-

estimates torsional barriers, which currently limits its applicability in the description of conforma-

tional dynamics of these complex molecules.

Another important direction for DFTB development concerns the treatment of metal ions, such

as Mg2+, Zn2+ and Cu+/2+, which play important structural and catalytic roles in biomolecules.


DFTB2 has been parameterized for several first-row transition metals (e.g., Fe, Ni, Co, Cu and Zn),70–72

and it has been shown that DFTB2 generally gives reliable structural properties for metal sites,

including fairly complex bi-metallo zinc sites,73–78 and DFTB/MM has been successfully applied

to a number of metalloenzymes by us and other research groups.73–79 Pushing forward the DFTB

framework is significant for metalloenzyme applications because for transition metal ions, despite

progress,80,81 a robust semi-empirical method (even just for structures!) is not yet available. This

is particularly true for open-shell cases: although parameterizations for several open-shell metal

ions (e.g., Ni, Cu and Fe) have been reported in the literature,71,72,82 their application has largely

been limited to geometry optimization of organometallic compounds and only a few metalloenzymes;83

systematic development of the methodology to improve energetics remains an important

frontier.

References

(1) Porezag, D.; Frauenheim, T.; Köhler, T.; Seifert, G.; Kaschner, R. Phys. Rev. B 1995, 51,

12947–12957.

(2) Seifert, G.; Porezag, D.; Frauenheim, T. Int. J. Quantum Chem. 1996, 58, 185–192.

(3) Elstner, M.; Porezag, D.; Jungnickel, G.; Elsner, J.; Haugk, M.; Frauenheim, T.; Suhai, S.;

Seifert, G. Phys. Rev. B 1998, 58, 7260–7268.

(4) Elstner, M. Theor. Chem. Acc. 2006, 116, 316–325.

(5) Elstner, M. J. Phys. Chem. A 2007, 111, 5614–5621.

(6) Yang, Y.; Yu, H.; York, D.; Cui, Q.; Elstner, M. J. Phys. Chem. A 2007, 111, 10861–10873.

(7) Gaus, M.; Cui, Q.; Elstner, M. J. Chem. Theory Comput. 2011, 7, 931–948.

(8) Gaus, M.; Goez, A.; Elstner, M. J. Chem. Theory Comput. 2012, 9, 338.

(9) Seifert, G.; Joswig, J.-O. WIREs Comput Mol Sci 2012, 2, 456–465.


(10) Koskinen, P.; Makinen, V. Comp. Mat. Sci. 2009, 47, 237.

(11) Kohn, W.; Sham, L. J. Phys. Rev. 1965, 140, A1133–A1138.

(12) Gaus, M.; Chou, C.-P.; Witek, H.; Elstner, M. J. Phys. Chem. A 2009, 113, 11866–11881.

(13) Goldman, N.; Fried, L. E. J. Phys. Chem. C 2012, 116, 2198–2204.

(14) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1996, 77, 3865–3868.

(15) Seifert, G. J. Phys. Chem. A 2007, 111, 5609–5613.

(16) Kaminski, S.; Giese, T. J.; Gaus, M.; York, D. M.; Elstner, M. J. Phys. Chem. A 2012, 116,

9131–9141.

(17) Kaminski, S.; Gaus, M.; Elstner, M. J. Phys. Chem. A 2012, 116, 11927–11937.

(18) Giese, T. J.; York, D. M. Theor. Chem. Acc. 2012, 131, 1145.

(19) Elstner, M.; Hobza, P.; Frauenheim, T.; Suhai, S.; Kaxiras, E. J. Chem. Phys. 2001, 114,

5149–5155.

(20) Wu, Q.; Yang, W. J. Chem. Phys. 2002, 116, 515–524.

(21) Grimme, S. J. Comput. Chem. 2004, 25, 1463–1473.

(22) Grimme, S. J. Comput. Chem. 2006, 27, 1787–1799.

(23) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132, 154104.

(24) Zhechkov, L.; Heine, T.; Patchkovski, S.; Seifert, G.; Duarte, H. J. Chem. Theory Comput.

2005, 1, 841–847.

(25) Risthaus, T.; Grimme, S. J. Chem. Theory Comput., in press.

(26) Köhler, C.; Seifert, G.; Gerstmann, U.; Elstner, M.; Overhof, H.; Frauenheim, T. Phys. Chem.

Chem. Phys. 2001, 3, 5109–5114.


(27) Köhler, C.; Seifert, G.; Frauenheim, T. Chem. Phys. 2005, 309, 23–31.

(28) Köhler, C.; Frauenheim, T.; Hourahine, B.; Seifert, G.; Sternberg, M. J. Phys. Chem. A 2007,

111, 5622–5629.

(29) Goerigk, L.; Grimme, S. J. Chem. Theory Comput. 2010, 6, 107–126.

(30) Hourahine, B.; Sanna, S.; Aradi, B.; Köhler, C.; Niehaus, T.; Frauenheim, T. J. Phys. Chem.

A 2007, 111, 5671–5677.

(31) Lundberg, M.; Nishimoto, Y.; Irle, S. Int. J. Quantum Chem. 2012, 112, 1701–1711.

(32) Rapacioli, M.; Spiegelman, F.; Scemama, A.; Mirtschink, A. J. Chem. Theory Comput. 2011,

7, 44–55.

(33) Cui, Q.; Elstner, M.; Kaxiras, E.; Frauenheim, T.; Karplus, M. J. Phys. Chem. B 2001, 105,

569–585.

(34) Riccardi, D.; Schaefer, P.; Yang, Y.; Yu, H.; Ghosh, N.; Prat-Resina, X.; König, P.; Li, G.;

Xu, D.; Guo, H.; Elstner, M.; Cui, Q. J. Phys. Chem. B 2006, 110, 6458–6469.

(35) Yang, Y.; Yu, H.; Cui, Q. J. Mol. Biol. 2008, 381, 1407–1420.

(36) Ghosh, N.; Xavier, P.-R.; Gunner, M. R.; Cui, Q. Biochemistry 2009, 48, 2468–2485.

(37) Freindorf, M.; Gao, J. L. J. Comput. Chem. 1996, 17, 386–395.

(38) Riccardi, D.; Li, G.; Cui, Q. J. Phys. Chem. B 2004, 108, 6467–6478.

(39) Giese, T. J.; York, D. M. J. Chem. Phys. 2007, 127, 194101.

(40) Yang, Y.; Yu, H.; York, D.; Elstner, M.; Cui, Q. J. Chem. Theory Comput. 2008, 4, 2067–

2084.

(41) Nielsen, S. O.; Bulo, R. E.; Moore, P. B.; Ensing, B. Phys. Chem. Chem. Phys. 2010, 12,

12401–12414.


(42) Park, K.; Götz, A. W.; Walker, R. C.; Paesani, F. J. Chem. Theory Comput. 2012, 8, 2868–2877.

(43) Hou, G.; Zhu, X.; Elstner, M.; Cui, Q. J. Chem. Theory Comput. 2012, 8, 4293–4304.

(44) Pople, J. A.; Beveridge, D. L. Approximate Molecular Orbital Theory; McGraw-Hill Com-

panies, 1970.

(45) Kalinowski, J. A.; Lesyng, B.; Thompson, J. D.; Cramer, C. J.; Truhlar, D. G. J. Phys. Chem.

A 2004, 108, 2545–2549.

(46) Bodrog, Z.; Aradi, B.; Frauenheim, T. J. Chem. Theory Comput. 2011, 7, 2654–2664.

(47) Krüger, T.; Elstner, M.; Schiffels, P.; Frauenheim, T. J. Chem. Phys. 2005, 122, 114110.

(48) Małolepsza, E.; Witek, H. A.; Morokuma, K. Chem. Phys. Lett. 2005, 412, 237–243.

(49) Sattelmeyer, K. W.; Tirado-Rives, J.; Jorgensen, W. L. J. Phys. Chem. A 2006, 110, 13551–

13559.

(50) Otte, N.; Scholten, M.; Thiel, W. J. Phys. Chem. A 2007, 111, 5751–5755.

(51) Elstner, M.; Jalkanen, K. J.; Knapp-Mohammady, M.; Frauenheim, T.; Suhai, S. Chem. Phys.

2001, 263, 203–219.

(52) Elstner, M.; Jalkanen, K. J.; Knapp-Mohammady, M.; Frauenheim, T.; Suhai, S. Chem. Phys.

2000, 256, 15–27.

(53) Korth, M.; Thiel, W. J. Chem. Theory Comput. 2011, 7, 2929–2936.

(54) Bohr, H. G.; Jalkanen, K. J.; Elstner, M.; Frimand, K.; Suhai, S. Chem. Phys. 1999, 246,

13–36.

(55) Elstner, M.; Frauenheim, T.; Suhai, S. J. Mol. Struct.: THEOCHEM 2003, 632, 29–41.

(56) Elstner, M.; Frauenheim, T.; Kaxiras, E.; Seifert, G.; Suhai, S. Phys. Stat. Sol. B 2000, 217,

357–376.


(57) Hu, H.; Elstner, M.; Hermans, J. Proteins: Struct., Funct., Genet. 2003, 50, 451–463.

(58) Seabra, G. D. M.; Walker, R. C.; Elstner, M.; Case, D. A.; Roitberg, A. E. J. Phys. Chem. A

2007, 111, 5655–5664.

(59) Barnett, C.; Naidoo, K. J. Phys. Chem. B 2010, 114, 17142–17154.

(60) Barnett, C.; Wilkinson, K.; Naidoo, K. J. Am. Chem. Soc. 2010, 132, 12800–12803.

(61) Barnett, C.; Wilkinson, K.; Naidoo, K. J. Am. Chem. Soc. 2011, 133, 19474–19482.

(62) Islam, S.; Roy, P.-N. J. Chem. Theory Comput. 2012, 8, 2412–2423.

(63) Hu, H.; Lu, Z.; Elstner, M.; Hermans, J.; Yang, W. J. Phys. Chem. A 2007, 111, 5685–5691.

(64) Maupin, C.; Aradi, B.; Voth, G. J. Phys. Chem. B 2010, 114, 6922–6931.

(65) Goyal, P.; Elstner, M.; Cui, Q. J. Phys. Chem. B 2011, 115, 6790–6805.

(66) Goyal, P.; Hu, J.; Elstner, M.; Irle, S.; Cui, Q. Manuscript in preparation.

(67) Choi, T.; Jordan, K. J. Phys. Chem. B 2010, 114, 6932–6936.

(68) Wang, J.; Roman-Perez, G.; Soler, J. M.; Artacho, E.; Fernandez-Serra, M. V. J. Chem. Phys.

2011, 134, 024516.

(69) Elstner, M.; Gaus, M. In Computational Methods for Large systems: Electronic Structure

Approaches for Biotechnology and Nanotechnology; Reimers, J. R., Ed.; John Wiley and

Sons: Hoboken, New Jersey, 2011; pp 287–308.

(70) Elstner, M.; Cui, Q.; Munih, P.; Kaxiras, E.; Frauenheim, T.; Karplus, M. J. Comput. Chem.

2003, 24, 565–581.

(71) Zheng, G. S.; Witek, H. A.; Bobadova-Parvanova, P.; Irle, S.; Musaev, D. G.; Prabhakar, R.;

Morokuma, K. J. Chem. Theory Comput. 2007, 3, 1349–1367.


(72) Bruschi, M.; Bertini, L.; Bonacic-Koutecky, V.; De Gioia, L.; Mitric, R.; Zampella, G.; Fan-

tucci, P. J. Phys. Chem. B 2012, 116, 6250–6260.

(73) Hou, G. H.; Cui, Q. J. Am. Chem. Soc. 2012, 134, 229–246.

(74) Riccardi, D.; Yang, S.; Cui, Q. Biochim. Biophys. Acta 2010, 1804, 342–351.

(75) Xu, D.; Guo, H.; Cui, G. J. Am. Chem. Soc. 2007, 129, 10814–10822.

(76) Xu, D. G.; Guo, H. J. Am. Chem. Soc. 2009, 131, 9780–9788.

(77) Xu, D. G.; Xie, D. Q.; Guo, H. J. Biol. Chem. 2006, 281, 8740–8747.

(78) Chakravorty, D. K.; Wang, B.; Lee, C. W.; Giedroc, D. P.; Merz, K. M., Jr. J. Am. Chem. Soc.

2012, 134, 3367–3376.

(79) Yang, Y.; Miao, Y. P.; Wang, B.; Cui, G. L.; Merz, K. M., Jr. Biochemistry 2012, 51, 2606–2618.

(80) Thiel, W. Adv. Chem. Phys. 1996, 93, 703–757.

(81) Thiel, W.; Voityuk, A. A. J. Phys. Chem. 1996, 100, 616–626.

(82) Köhler, C.; Seifert, G.; Frauenheim, T. Chem. Phys. 2005, 309, 23–31.

(83) Lundberg, M.; Sasakura, Y.; Zheng, G. S.; Morokuma, K. J. Chem. Theory Comput. 2010, 6,

1413–1427.
