arXiv:1412.6482v1 [cs.IT] 19 Dec 2014


Parametric Sensitivity Analysis for Stochastic Molecular Systems using Information Theoretic Metrics

Anastasios Tsourtis,1,a) Yannis Pantazis,2,b) Markos A. Katsoulakis,2,c) and Vagelis Harmandaris3,d)

1) Department of Mathematics and Applied Mathematics, University of Crete, Greece
2) Department of Mathematics and Statistics, University of Massachusetts, Amherst, USA
3) Department of Mathematics and Applied Mathematics, University of Crete, Institute of Applied and Computational Mathematics (IACM), Foundation for Research and Technology Hellas (FORTH), GR-70013, Heraklion, Crete, Greece

(Dated: November 5, 2018)

In this paper we extend the parametric sensitivity analysis (SA) methodology proposed in Ref. [Y. Pantazis and M. A. Katsoulakis, J. Chem. Phys. 138, 054115 (2013)] to continuous time and continuous space Markov processes represented by stochastic differential equations and, particularly, stochastic molecular dynamics as described by the Langevin equation. The utilized SA method is based on the computation of the information-theoretic (and thermodynamic) quantity of relative entropy rate (RER) and the associated Fisher information matrix (FIM) between path distributions. A major advantage of the pathwise SA method is that both RER and pathwise FIM depend only on averages of the force field; therefore, they are tractable and computable as ergodic averages from a single run of the molecular dynamics simulation, both in equilibrium and in non-equilibrium steady state regimes. We validate the performance of the extended SA method on two different molecular stochastic systems, a standard Lennard-Jones fluid and an all-atom methane liquid, and compare the obtained parameter sensitivities with parameter sensitivities on three popular and well-studied observable functions, namely, the radial distribution function, the mean squared displacement and the pressure. Results show that the RER-based sensitivities are highly correlated with the observable-based sensitivities.

Keywords: Relative Entropy, Sensitivity Analysis, Fisher Information Matrix, Langevin dynamics, Methane Molecular Dynamics

I. INTRODUCTION

Molecular simulation is the bridge between theoretically developed models and experimental approaches for the study of molecular systems at the atomistic level.1 Nowadays, molecular simulation methodologies are used extensively to predict structure-property relations of complex systems. The importance of numerical simulations in material sciences, biology and chemistry has been recently acknowledged by the 2013 Nobel Prize in Chemistry “for the development of multiscale models in complex chemical systems”. The computational modeling of realistic complex molecular systems at the molecular level requires long molecular simulations for an enormous distribution of length and time scales.2–5 The properties of the model systems depend on a large number of parameters, which are usually obtained utilizing optimization techniques matching specific data taken either from more detailed (e.g. ab-initio) simulations or from experiments. Furthermore, stochastic modeling is especially important for describing the inherent randomness of molecular dynamics in various scales.

a) E-mail: [email protected]
b) E-mail: [email protected]
c) E-mail: [email protected]
d) E-mail: [email protected]

All the above complexities imply the need for rigorous mathematical tools for the analysis of both deterministic and stochastic molecular systems. Uncertainty quantification (UQ) in computational chemistry is of paramount importance, especially in multiscale modeling, where properties evaluated at the atomic-molecular scale are transferred to the mesoscopic scale.4,6 Sources of epistemic uncertainty can stem from (i) numerical uncertainty, (ii) model uncertainty and (iii) parametric uncertainty. Numerical uncertainties are related to the finite time of the dynamic simulation, the number of particles, as well as the values of the parameters of the numerical method used (e.g. time-step), to name some. Model uncertainty comes from the specific force field representation and its calibration to experimental properties, and from the usage of specific boundary conditions. Parametric uncertainties stem from errors in parameter values due to noisy or insufficient measurements. Of all the above, the uncertainty associated with the parameters of the potential is the least understood.6,7 Furthermore, the intrinsic stochasticity of the system is added on top of epistemic uncertainty. This type of uncertainty is also called aleatoric.

There exists a diverse range of UQ approaches proposed in the literature. Variance-based methods such as analysis of variance (ANOVA),8 Bayesian statistical analysis7,9 (applications to deal with large uncertainties) and polynomial chaos expansions10 have been widely used. The first two methods are based on multiple and usually expensive Monte Carlo runs, resulting in huge computational cost, whereas the latter becomes intractable when the parameter space is large. An in-depth study of this last method in MD has recently been presented by Rizzi et al.10

Sensitivity analysis (SA) is a powerful tool that gives insight into how small variations (uncertainty) in system parameters (input) can substantially affect the output of the system. Such perturbations arise from computational errors, uncertainty and errors resulting from experimental parameter estimation11 (such as parameter fitting through ensemble averages of macroscopic thermodynamic quantities). Thus, parametric SA can provide critical insights in uncertainty quantification. Especially in the stochastic setting (e.g., Langevin dynamics in molecular systems), SA is performed by analysis of the system’s mean behavior, i.e., several simulations starting from a configuration in the stationary (or steady-state) regime. The stationary regime is crucial for complex molecular systems since it captures not only static quantities such as the radial distribution function but also dynamical quantities, which include transitions between metastable states in complex, high-dimensional energy landscapes and intermittency.12 Depending on the magnitude of the perturbations, SA can be classified into local (infinitesimal, one-at-a-time parameter perturbation) and global (finite, multiple parameter perturbation).

Furthermore, the role of SA is not restricted to UQ but it is of pivotal significance in several other applications. First, the robustness of a system, meaning the stability of its behavior under simultaneous changes in model parameters or under variations of orders of magnitude in insensitive parameters that barely affect the dynamics, can be addressed utilizing parametric SA approaches. Second, sensitivity analysis on the experimental conditions under which information loss is minimized establishes optimal experimental design.13 Furthermore, identifiability analysis employs SA to determine a priori whether certain parameters can be estimated from experimental data of a given type. The work in Ref. 14 contains a general framework of SA in MD (proteins) using the observable helicity, while cross-validation with experimental data is also displayed. Overall, SA plays a fundamental role in multiscale design, as has been highlighted by Braatz et al.15

Typically in a stochastic setting, the most common local parametric SA method is based on partial derivatives of ensemble averages of quantities of interest around a nominal parameter value.16 Large derivatives indicate strong sensitivity of the observable to the particular parameter, while the opposite holds when the derivative values are small. There has been an increasing number of methods to compute the partial derivatives, especially in discrete-event systems whose applications range from biochemical reaction networks to operations research and queuing theory. Finite-difference approaches based on common random numbers,16 on the common reaction path,17 which exploits positive correlations among coupled perturbed/unperturbed reaction paths, as well as coupling methods18,19 have been proposed. There are certain issues associated with these finite-difference approaches: the estimator of the partial derivative is biased, while the variance of the gradient estimator increases with the dimension of the parameter space. Instead of using the finite-difference approaches, one can utilize the Girsanov measure transformation to directly compute the infinitesimal sensitivity.20–22 In MD simulations, where both time and states are typically continuous, Iordanov et al.23 performed SA on three potentials of different functional forms (and numbers of parameters) in order to compare their influence on thermodynamic quantities. Instead of perturbing the potential parameters, they scaled the potentials one-at-a-time (local SA), aiming to minimize the discrepancy from the experimental values of each observable separately.

Another class of sensitivity methods, which is not focused on specific observable functions but on the overall properties of the stochastic process, is based on information theory concepts. Application of information-theoretic SA methods to the analysis of stochastic models uses quantities such as entropy, relative entropy (or Kullback-Leibler divergence), the corresponding Fisher information matrix as well as mutual information. Relative entropy (RE) measures the inefficiency of assuming a perturbed (or “wrong”) distribution instead of assuming the unperturbed (or “true”) one. RE has been used for the SA study of climate models,24 where the equilibrium probability density function (PDF) has been obtained through an entropy maximization subject to constraints induced by the measurements, while the Fisher information matrix (FIM) is an indispensable tool for optimal experimental design.13 In a typical SA approach based on RE, explicit knowledge of the equilibrium PDF is assumed. However, in systems with non-equilibrium steady states (NESS) (i.e., systems in which a steady state is reached but the detailed balance condition is violated), there are no explicit formulas for the stationary distribution, and even when a Gibbs measure is available, it is usually computationally inefficient to sample from. Such non-equilibrium systems are commonplace in molecular systems with multiple mechanisms such as reaction-diffusion systems or driven molecular systems.25

In Ref. 12, the RE between path distributions (i.e., distributions of the particle trajectories) for discrete time or discrete event systems was utilized as a measure of sensitivity. When the system is in stationarity, the relative entropy of the two path distributions decomposes into two parts: (i) the relative entropy rate (RER) term, which scales linearly with time, and (ii) a constant term related to the relative entropy of the initial distribution of the system. In long times, the first term dominates, providing major insights on the sensitivity of the system with respect to parameter perturbations. In this work, we extend the SA method proposed in Ref. 12 to the case of stochastic differential equations (i.e., continuous-time and continuous state space) and particularly the Langevin equation. Furthermore, when perturbations are small, a Taylor expansion of the RER is performed, revealing that the leading-order term of this expansion is determined by the pathwise FIM associated with the RER. Practically, RER is an observable of the stochastic process which can be computed numerically as an ergodic average in a straightforward manner, as it only requires local dynamics (in our case the forces). Similarly, FIM computations are feasible in the same fashion, with the advantage of being more informative since any perturbation direction can be explored. Both of these observables can be sampled on the fly from a single MD run, since only the reference process needs to be simulated. Finally, spectral methods for the calculation of RER were introduced for the over-damped Langevin case in Ref. 26.

The studied pathwise SA method has major advantages, which can be listed as follows. First, it is a gradient-free method which does not require knowledge of the equilibrium PDF. By gradient-free we mean that the pathwise FIM does not depend on the extent of the perturbation, so when the computation for different perturbations is necessary (especially in high-dimensional problems) the extra cost is minimal in comparison to the straightforward RER calculation. Second, it is rigorously valid for long-time, stationary dynamics in path-space, including metastable dynamics in a complex landscape. Third, it is suitable for non-equilibrium systems from a statistical mechanics perspective; for example, in NESS processes such as dissipative systems, where the structure of the equilibrium PDF is unknown. Fourth, it is fast since it requires samples only from the unperturbed process, which can also be obtained in a trivially parallel manner.

The work presented here is a part of a more general hierarchical simulation scheme that involves multiple simulation levels and a broad range of length and time scales.27–29 In this work, we apply the above methodology to stochastic molecular systems, as the RER and FIM methods are based on this setting, i.e., non-deterministic dynamics with random noise. As test cases we examine: (a) a benchmark Lennard-Jones (LJ) fluid model,1,30 and (b) a detailed all-atom methane (CH4) model.31 The LJ fluid is the most widely used model of a simple fluid in molecular simulations, whereas the second system is more complex due to the intramolecular bond and angle potentials in addition to the LJ intermolecular potential. Methane has also been extensively studied over the years because it is abundant in nature, has environmental impacts, and can be used as fuel, being the main component of natural gas.

The proposed pathwise SA method is validated through proper observable quantities upon perturbation of the potential parameters, which include structural, dynamical and thermodynamic properties of both the LJ and methane model systems. We stress here that the utilized SA method based on RER and the pathwise FIM is independent of the observable quantities, which is not the case for derivative-based SA methods, which rely on smoothness assumptions on the observable functionals. The partial derivative of an observable is related to the RE through the Pinsker inequality (see ineq. (16)). The Pinsker inequality asserts that small RER (or FIM) values result in small changes in observable expectation values under perturbation; thus RER and FIM can serve as a screening tool for specific observables. The present work provides a detailed quantitative study concerning the relation between the pathwise SA method (RER / FIM tools) and specific observables of molecular systems. Specifically, it can be inferred that the parameter that controls the position of the well of the LJ potential (i.e., $\sigma_{LJ}$) is two orders of magnitude more sensitive than the strength of the LJ potential (i.e., $\epsilon_{LJ}$) in terms of RER, for the LJ fluid model as well as for the methane model. This result is also confirmed by the observables. Moreover, we show that a 5% perturbation of $\sigma_{LJ}$ is more crucial in terms of RER (as also supported by the observable functions) than changing the potential cutoff radius from $4\sigma_{LJ}$ to $1.6\sigma_{LJ}$. Interestingly, the use of RER as an information criterion to assign values to simulation parameters such as the cutoff radius may result in accelerated MD simulations, with the induced error being controlled by the RER. Finally, the intramolecular parameters (i.e., the parameters of the bonds) in the methane model are more sensitive in terms of RER than the intermolecular parameters, which is explained by the fact that the bond and angle forces are stiffer than the between-atoms LJ forces.

The organization of the paper is as follows. The following Section describes the pathwise sensitivity analysis method for Langevin dynamics in detail. In Section III, the LJ fluid model, the methane model as well as various observable functions are presented, followed by Section IV where the validation of the proposed pathwise SA method is demonstrated. Finally, we conclude the paper in Section V.

II. PATHWISE SENSITIVITY ANALYSIS FOR LANGEVIN DYNAMICS

This Section describes and motivates the information-theoretic approach for sensitivity analysis of stochastic molecular dynamics. Particularly, the RER and the corresponding pathwise FIM are derived for the Langevin equation.

A. Stochastic equation of motion

Langevin dynamics models a Hamiltonian system which is coupled with a thermostat.32 The thermostat serves as a reservoir of energy. In Langevin dynamics, the motion of particles is governed through a probabilistic framework by a system of stochastic differential equations given by

$$
\begin{cases}
dq_t = M^{-1} p_t \, dt \\
dp_t = F^\theta(q_t) \, dt - \gamma M^{-1} p_t \, dt + \sigma \, dW_t ,
\end{cases}
\tag{1}
$$

where $q_t \in \mathbb{R}^{dN}$ is the position vector of the $N$ particles in $d$ dimensions, $p_t \in \mathbb{R}^{dN}$ is the momentum vector of the particles, $M$ is the (diagonal) mass matrix, $F^\theta(\cdot) : \mathbb{R}^{dN} \to \mathbb{R}^{dN}$ is the driving (conservative) force which depends on a parameter vector $\theta \in \mathbb{R}^K$ (e.g. parameters of the specific atomistic force field), $\gamma$ is the friction matrix, $\sigma$ is the diffusion matrix and $W_t$ is a $dN$-dimensional Brownian motion. In the equilibrium regime, the forces are of gradient form, i.e., $F^\theta(q_t) = -\nabla V^\theta(q_t)$ where $V^\theta(\cdot)$ is the potential energy. Moreover, the fluctuation-dissipation theorem asserts that the friction and diffusion terms are related to the inverse temperature $\beta \in \mathbb{R}$ of the system by

$$
\sigma \sigma^T = 2 \beta^{-1} \gamma .
$$

Under gradient-type forces and the fluctuation-dissipation theorem, the Langevin system has a Gibbs equilibrium (or invariant) distribution, $\mu^\theta(\cdot, \cdot)$, given by

$$
\mu^\theta(dq, dp) = \frac{1}{Z} \, e^{-\beta \left( V^\theta(q) + \frac{1}{2} p^T M^{-1} p \right)} \, dq \, dp . \tag{2}
$$

In non-equilibrium steady states, however, the stationary distribution, $\mu^\theta(\cdot, \cdot)$, is generally not known, which restricts the sensitivity analysis methods that rely on explicit knowledge of the steady states. However, as we show below, the proposed pathwise sensitivity methodology is not limited to equilibrium systems and it works equally well in the non-equilibrium steady states regime, since it only necessitates explicit knowledge of the driving forces (i.e., the local dynamics).
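
As an illustration of Eq. (1), the following minimal sketch advances the Langevin system with a plain Euler-Maruyama step. It is not taken from this work: the integrator choice, the scalar mass, friction and diffusion coefficients, and the names `diffusion_from_fdt`, `langevin_step` and `simulate` are assumptions made for brevity, and the harmonic force in the usage lines is a hypothetical test potential.

```python
import math
import random


def diffusion_from_fdt(gamma, beta):
    """Scalar fluctuation-dissipation relation: sigma^2 = 2 * gamma / beta."""
    return math.sqrt(2.0 * gamma / beta)


def langevin_step(q, p, force, m, gamma, sigma, dt, xi):
    """One Euler-Maruyama step of Eq. (1) (assumed scheme, scalar m, gamma, sigma).

    q, p  : lists of positions and momenta
    force : callable returning the list F^theta(q)
    xi    : list of standard-normal draws, one per coordinate
    """
    f = force(q)
    q_new = [qi + pi / m * dt for qi, pi in zip(q, p)]
    p_new = [pi + (fi - gamma * pi / m) * dt + sigma * math.sqrt(dt) * x
             for pi, fi, x in zip(p, f, xi)]
    return q_new, p_new


def simulate(q0, p0, force, m, gamma, beta, dt, n_steps, seed=0):
    """Generate one trajectory {(q_t, p_t)} of the unperturbed process."""
    rng = random.Random(seed)
    sigma = diffusion_from_fdt(gamma, beta)
    q, p = list(q0), list(p0)
    traj = [(q, p)]
    for _ in range(n_steps):
        xi = [rng.gauss(0.0, 1.0) for _ in q]
        q, p = langevin_step(q, p, force, m, gamma, sigma, dt, xi)
        traj.append((q, p))
    return traj


# Hypothetical usage: one particle in a harmonic well V(q) = 0.5 * k * q^2.
k = 1.0
traj = simulate([1.0], [0.0], lambda q: [-k * qi for qi in q],
                m=1.0, gamma=1.0, beta=1.0, dt=0.01, n_steps=1000)
```

Passing the noise `xi` explicitly keeps the single step deterministic and easy to check; a production MD code would normally use a more accurate Langevin integrator than this first-order scheme.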

B. Relative Entropy Rate and Fisher Information Matrix for Langevin Processes

Let the path space $\mathcal{X}$ be the set of all trajectories $\{(q_t, p_t)\}_{t=0}^{T}$ generated by the Langevin equation in the time interval $[0, T]$. Let $Q^\theta_{[0,T]}$ denote the path space distribution, i.e., the probability to see a particular element of path space, $\mathcal{X}$, for a specific set of parameters $\theta$. Consider also a perturbation vector, $\epsilon_0 \in \mathbb{R}^K$, and denote by $Q^{\theta+\epsilon_0}_{[0,T]}$ the path space distribution of the perturbed process, $(\bar{q}_t, \bar{p}_t)$. The proposed sensitivity analysis approach is based on the quantification of the difference between the two path space probability distributions by computing the relative entropy (RE) between them. Thus, the pathwise RE of the unperturbed distribution, $Q^\theta_{[0,T]}$, with respect to (w.r.t.) the perturbed distribution, $Q^{\theta+\epsilon_0}_{[0,T]}$, assuming that they are absolutely continuous w.r.t. each other, is defined as

$$
R\left(Q^\theta_{[0,T]} \,|\, Q^{\theta+\epsilon_0}_{[0,T]}\right) := \int \log \left( \frac{dQ^\theta_{[0,T]}}{dQ^{\theta+\epsilon_0}_{[0,T]}} \right) dQ^\theta_{[0,T]} , \tag{3}
$$

where $\frac{dQ^\theta_{[0,T]}}{dQ^{\theta+\epsilon_0}_{[0,T]}}$ is the Radon-Nikodym derivative, which is well-defined due to the absolute continuity assumption. A key property of RE is that $R(Q^\theta_{[0,T]} \,|\, Q^{\theta+\epsilon_0}_{[0,T]}) \geq 0$ with equality if and only if $Q^\theta_{[0,T]} = Q^{\theta+\epsilon_0}_{[0,T]}$, which allows us to view relative entropy as a “distance” (more precisely, a divergence, since it is neither symmetric nor satisfies the triangle inequality) between two probability measures capturing the relative importance of parameter vector changes.8 Moreover, from an information theory perspective, the relative entropy measures the loss/change of information when $Q^{\theta+\epsilon_0}_{[0,T]}$ is considered instead of $Q^\theta_{[0,T]}$.33

The necessary and sufficient conditions for the two path distributions (perturbed and unperturbed) to be absolutely continuous are provided next.

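Assumption II.1. Assume that

(a) the diffusion matrix, $\sigma$, is invertible, and,

(b) $E_{Q^\theta_{[0,T]}}\left[ \exp\left\{ \int_0^T |u(q_t, p_t)|^2 \, dt \right\} \right] < \infty$, where the function $u(\cdot, \cdot) : \mathbb{R}^{2dN} \to \mathbb{R}^{2dN}$ is defined such that for all pairs $(q, p)$ it should hold that

$$
\begin{bmatrix} 0 & 0 \\ 0 & \sigma \end{bmatrix} u(q, p) =
\begin{bmatrix} M^{-1} p - M^{-1} p \\ F^\theta(q) - \gamma M^{-1} p - \left( F^{\theta+\epsilon_0}(q) - \gamma M^{-1} p \right) \end{bmatrix} ,
$$

or, equivalently,

$$
\begin{bmatrix} 0 \\ \sigma \end{bmatrix} u(q, p) =
\begin{bmatrix} 0 \\ F^\theta(q) - F^{\theta+\epsilon_0}(q) \end{bmatrix} .
$$

Notice that such a function, $u(\cdot, \cdot)$, exists due to (a). Furthermore, (a) implies that the noise is non-degenerate for the momenta. Then, the RE of the path distributions defined in (3) is finite and an explicit formula can be obtained, as the following proposition asserts.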

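Proposition II.1. Let Assumption II.1 hold. Assume also that the initial state of the unperturbed process satisfies $(q_0, p_0) \sim \nu^\theta$ and that of the perturbed process satisfies $(\bar{q}_0, \bar{p}_0) \sim \nu^{\theta+\epsilon_0}$, where $\nu^\theta(\cdot, \cdot)$ and $\nu^{\theta+\epsilon_0}(\cdot, \cdot)$ are two initial distributions which should be absolutely continuous w.r.t. each other. Then,

$$
R\left(Q^\theta_{[0,T]} \,|\, Q^{\theta+\epsilon_0}_{[0,T]}\right) = R\left(\nu^\theta \,|\, \nu^{\theta+\epsilon_0}\right) + \frac{1}{2} E_{Q^\theta_{[0,T]}}\left[ \int_0^T |u(q_t, p_t)|^2 \, dt \right] . \tag{4}
$$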

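Proof. Under Assumption II.1, the Girsanov theorem applies, providing an explicit formula for the Radon-Nikodym derivative,34 which is given by

$$
\frac{dQ^\theta_{[0,T]}}{dQ^{\theta+\epsilon_0}_{[0,T]}}\left( \{(q_t, p_t)\}_{t=0}^{T} \right) = \frac{d\nu^\theta}{d\nu^{\theta+\epsilon_0}}(q_0, p_0) \times \exp\left\{ -\int_0^T u(q_t, p_t)^T \, dW_t - \frac{1}{2} \int_0^T |u(q_t, p_t)|^2 \, dt \right\} .
$$

Moreover, $\widetilde{W}_t := \int_0^t u(q_s, p_s) \, ds + W_t$ is a Brownian motion with respect to the path distribution $Q^\theta_{[0,T]}$, meaning that, for any measurable function $f(\cdot, \cdot)$, it holds that $E_{Q^\theta_{[0,T]}}\left[ \int_0^T f(q_t, p_t)^T \, d\widetilde{W}_t \right] = 0$. Then, substituting $dW_t = d\widetilde{W}_t - u(q_t, p_t) \, dt$ in the exponent,

$$
\begin{aligned}
R\left(Q^\theta_{[0,T]} \,|\, Q^{\theta+\epsilon_0}_{[0,T]}\right)
&= \int \left( \log \frac{d\nu^\theta}{d\nu^{\theta+\epsilon_0}}(q_0, p_0) - \int_0^T u(q_t, p_t)^T \, dW_t - \frac{1}{2} \int_0^T |u(q_t, p_t)|^2 \, dt \right) dQ^\theta_{[0,T]} \\
&= \int \log \frac{d\nu^\theta}{d\nu^{\theta+\epsilon_0}}(q_0, p_0) \, dQ^\theta_{[0,T]} - \int \int_0^T u(q_t, p_t)^T \, d\widetilde{W}_t \, dQ^\theta_{[0,T]} + \frac{1}{2} \int \int_0^T |u(q_t, p_t)|^2 \, dt \, dQ^\theta_{[0,T]} \\
&= R\left(\nu^\theta \,|\, \nu^{\theta+\epsilon_0}\right) + \frac{1}{2} \int \int_0^T |u(q_t, p_t)|^2 \, dt \, dQ^\theta_{[0,T]} .
\end{aligned}
$$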

We remark that this proposition is a result on the transient regime, since the initial distributions can be arbitrary as long as they are absolutely continuous w.r.t. each other. In the stationary regime, a significant simplification of the pathwise RE occurs. As the following proposition asserts, the pathwise RE is decomposed into a linear-in-time term plus a constant, where the slope of the linear term is the relative entropy rate (RER).

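Proposition II.2. Let Assumption II.1 hold. Assume also that the initial state of the unperturbed process satisfies $(q_0, p_0) \sim \mu^\theta$ and that of the perturbed process satisfies $(\bar{q}_0, \bar{p}_0) \sim \mu^{\theta+\epsilon_0}$, where $\mu^\theta(\cdot, \cdot)$ and $\mu^{\theta+\epsilon_0}(\cdot, \cdot)$ are the stationary distributions of the unperturbed and the perturbed process, respectively, which should be absolutely continuous w.r.t. each other. Then, the pathwise RE equals

$$
R\left(Q^\theta_{[0,T]} \,|\, Q^{\theta+\epsilon_0}_{[0,T]}\right) = T \, H\left(Q^\theta \,|\, Q^{\theta+\epsilon_0}\right) + R\left(\mu^\theta \,|\, \mu^{\theta+\epsilon_0}\right) , \tag{5}
$$

where

$$
H\left(Q^\theta \,|\, Q^{\theta+\epsilon_0}\right) := \frac{1}{2} E_{\mu^\theta}\left[ \left( F^{\theta+\epsilon_0}(q) - F^\theta(q) \right)^T (\sigma \sigma^T)^{-1} \left( F^{\theta+\epsilon_0}(q) - F^\theta(q) \right) \right] \tag{6}
$$

is the Relative Entropy Rate.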

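Proof. First notice that we drop the $[0, T]$ subscript from the definition of RER because RER is time-independent. Then, it is straightforward to show from the previous proposition, exchanging the order of integration and using stationarity, that

$$
\begin{aligned}
R\left(Q^\theta_{[0,T]} \,|\, Q^{\theta+\epsilon_0}_{[0,T]}\right)
&= R\left(\mu^\theta \,|\, \mu^{\theta+\epsilon_0}\right) + \frac{1}{2} \int \int_0^T |u(q_t, p_t)|^2 \, dt \, dQ^\theta_{[0,T]} \\
&= R\left(\mu^\theta \,|\, \mu^{\theta+\epsilon_0}\right) + \frac{1}{2} \int_0^T \int |u(q_t, p_t)|^2 \, dQ^\theta_{[0,T]} \, dt \\
&= R\left(\mu^\theta \,|\, \mu^{\theta+\epsilon_0}\right) + \frac{1}{2} \int_0^T \int |u(q, p)|^2 \, \mu^\theta(dq, dp) \, dt \\
&= R\left(\mu^\theta \,|\, \mu^{\theta+\epsilon_0}\right) + \frac{T}{2} E_{\mu^\theta}\left[ |u(q, p)|^2 \right] ,
\end{aligned}
$$

which is (5) with the RER given by (6), since by Assumption II.1 $|u(q, p)|^2 = \left( F^{\theta+\epsilon_0}(q) - F^\theta(q) \right)^T (\sigma \sigma^T)^{-1} \left( F^{\theta+\epsilon_0}(q) - F^\theta(q) \right)$.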

RER inherits all the properties of relative entropy (non-negativity, convexity, etc.) and it measures the change of information in path space per unit time. For large times, the term that involves RER is the significant term, since it scales linearly with time, while the constant one becomes less and less important. Moreover, the estimation of RER necessitates only the knowledge of the driving forces (i.e., the local dynamics), which is available since the driving forces are computed in any numerical scheme of the Langevin equation.
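
The ergodic-average estimation of Eq. (6) can be sketched as follows for a one-dimensional toy model. Everything specific here is an assumption made for illustration and not part of this work: the harmonic force $F^\theta(q) = -kq$ with $\theta = k$, the Euler-Maruyama discretization, scalar coefficients, and the names `rer_integrand` and `estimate_rer`. For this toy model the Gibbs measure gives $E_{\mu^\theta}[q^2] = 1/(\beta k)$, so Eq. (6) reduces to $\epsilon^2 / (4 \gamma k)$, which the time average should approach.

```python
import math
import random


def rer_integrand(f_ref, f_pert, sigma):
    """Integrand of Eq. (6) for one coordinate and scalar diffusion sigma."""
    return 0.5 * (f_pert - f_ref) ** 2 / sigma ** 2


def estimate_rer(force_ref, force_pert, m, gamma, beta, dt, n_steps, n_burn, seed=0):
    """Time-average the RER integrand along ONE unperturbed trajectory."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma / beta)      # fluctuation-dissipation relation
    noise_scale = sigma * math.sqrt(dt)
    q, p, total = 0.0, 0.0, 0.0
    for step in range(n_burn + n_steps):
        f = force_ref(q)                       # only the reference force drives the dynamics
        q, p = (q + p / m * dt,
                p + (f - gamma * p / m) * dt + noise_scale * rng.gauss(0.0, 1.0))
        if step >= n_burn:                     # discard the transient part
            total += rer_integrand(force_ref(q), force_pert(q), sigma)
    return total / n_steps


# Hypothetical toy parameters: stiffness k perturbed to k + eps.
k, eps, gamma = 1.0, 0.1, 1.0
rer_hat = estimate_rer(lambda q: -k * q, lambda q: -(k + eps) * q,
                       m=1.0, gamma=gamma, beta=1.0, dt=0.01,
                       n_steps=200_000, n_burn=10_000)
rer_exact = eps ** 2 / (4.0 * gamma * k)       # closed form for this toy model only
```

The perturbed process is never simulated: the perturbed force is only evaluated at the configurations visited by the reference trajectory, which is why a single run suffices.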

Pathwise Fisher information matrix: Generally, RE is locally a quadratic functional in a neighborhood of the parameter vector, $\theta$. Under a smoothness assumption in the parameter vector, the curvature of the RE around $\theta$, defined by its Hessian, is the FIM. Analogously, we define the Hessian of the RER to be the pathwise FIM, denoted by $F_H(Q^\theta)$. The relation between the RER and the pathwise FIM is

$$
H\left(Q^\theta \,|\, Q^{\theta+\epsilon_0}\right) = \frac{1}{2} \epsilon_0^T F_H(Q^\theta) \epsilon_0 + O(|\epsilon_0|^3) . \tag{7}
$$

Under a smoothness assumption on the force vector, $F^\theta(\cdot)$, w.r.t. the parameter vector, $\theta$, an explicit formula for the pathwise FIM of the Langevin process is straightforwardly obtained from (6) and is given by

$$
F_H(Q^\theta) = E_{\mu^\theta}\left[ \nabla_\theta F^\theta(q)^T (\sigma \sigma^T)^{-1} \nabla_\theta F^\theta(q) \right] , \tag{8}
$$

where $\nabla_\theta F^\theta(\cdot)$ is a $dN \times K$ matrix containing all the first-order partial derivatives of the force vector (i.e., the Jacobian matrix). Observe that the pathwise FIM does not depend on the perturbation vector, $\epsilon_0$, making the pathwise FIM an attractive “gradient-free” quantity for sensitivity analysis. Indeed, the RER for any perturbation can be recovered, up to a third-order error, utilizing only the pathwise FIM and (7). Moreover, the spectral analysis of $F_H(Q^\theta)$ would allow one to identify which parameter directions are most/least sensitive to perturbations.
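
To make Eqs. (7) and (8) concrete, the sketch below builds a $2 \times 2$ pathwise FIM for the radial Lennard-Jones pair force with $\theta = (\epsilon_{LJ}, \sigma_{LJ})$ and compares the quadratic form of Eq. (7) with the directly evaluated integrand of Eq. (6). This is a deliberately reduced setting: the list of pair distances `r_samples` is hypothetical (it does not come from an MD run), the noise is a scalar, a single radial force replaces the full $dN$-dimensional force vector, and all function names are illustrative.

```python
def lj_force(r, eps_lj, sig_lj):
    """Radial LJ force -dV/dr for V(r) = 4*eps*((sig/r)**12 - (sig/r)**6)."""
    return 24.0 * eps_lj * (2.0 * sig_lj ** 12 / r ** 13 - sig_lj ** 6 / r ** 7)


def lj_force_jacobian(r, eps_lj, sig_lj):
    """Analytic partial derivatives of the radial force w.r.t. (eps_lj, sig_lj)."""
    d_eps = 24.0 * (2.0 * sig_lj ** 12 / r ** 13 - sig_lj ** 6 / r ** 7)
    d_sig = 24.0 * eps_lj * (24.0 * sig_lj ** 11 / r ** 13 - 6.0 * sig_lj ** 5 / r ** 7)
    return [d_eps, d_sig]


def pathwise_fim(samples, eps_lj, sig_lj, noise_sigma):
    """Sample average of J^T (sigma sigma^T)^{-1} J, cf. Eq. (8), scalar noise."""
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for r in samples:
        jac = lj_force_jacobian(r, eps_lj, sig_lj)
        for i in range(2):
            for j in range(2):
                fim[i][j] += jac[i] * jac[j] / noise_sigma ** 2
    return [[v / len(samples) for v in row] for row in fim]


def rer_direct(samples, eps_lj, sig_lj, pert, noise_sigma):
    """Sample average of the integrand of Eq. (6) for a finite perturbation."""
    total = 0.0
    for r in samples:
        df = lj_force(r, eps_lj + pert[0], sig_lj + pert[1]) - lj_force(r, eps_lj, sig_lj)
        total += 0.5 * df ** 2 / noise_sigma ** 2
    return total / len(samples)


def rer_quadratic(fim, pert):
    """Leading term of Eq. (7): 0.5 * pert^T FIM pert."""
    return 0.5 * sum(pert[i] * fim[i][j] * pert[j] for i in range(2) for j in range(2))


# Hypothetical pair distances in reduced LJ units (NOT simulation output).
r_samples = [0.95, 1.0, 1.05, 1.1, 1.2, 1.5, 2.0]
fim = pathwise_fim(r_samples, 1.0, 1.0, 1.0)
pert = [1e-3, 1e-3]
direct = rer_direct(r_samples, 1.0, 1.0, pert, 1.0)
quadratic = rer_quadratic(fim, pert)
```

Once `fim` is stored, `rer_quadratic` can be evaluated for any other small perturbation direction without revisiting the samples, which is the practical meaning of “gradient-free” above.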

Example 1: Unknown stationary distribution:In many molecular systems the steady state is not aGibbs distribution and typically it is not known explic-itly. This is commonplace in non-equilibrium molecu-lar systems such as models with multiple mechanisms,e.g. reaction-diffusion systems, or driven molecularsystems.25,35 Here we consider such a mathematicallysimple example, where we assume that the force fieldconsists of two components; one conservative term givenas minus the gradient of the potential energy and an-other term that is not the gradient of a potential func-tion. Mathematically, the force field is given by

F θ(q) = −∇V θ(q) +G(q)

where we further assume for simplicity that only the con-servative term depends on the parameter vector, θ. Since,the resulting Langevin process is at the non-equilibriumregime, the steady states do not admit an explicit form.

5

Page 6: (Dated: November 5, 2018) - arXivParametric Sensitivity Analysis for Stochastic Molecular Systems using Information Theoretic Metrics Anastasios Tsourtis,1, a) Yannis Pantazis, 2,b)

However, denoting by µθ the unknown stationary dis-tribution of the Langevin process driven by the aboveforces, the RER is given by

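$$
H(Q^\theta \,|\, Q^{\theta+\epsilon_0}) = \frac{1}{2} E_{\mu^\theta}\left[\left(\nabla V^{\theta+\epsilon_0}(q) - \nabla V^\theta(q)\right)^T (\sigma\sigma^T)^{-1} \left(\nabla V^{\theta+\epsilon_0}(q) - \nabla V^\theta(q)\right)\right]. \tag{9}
$$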

Notice that the expression inside the expectation does not depend on the non-conservative forces and is the same expression as in the equilibrium regime. However, the dependence on the non-conservative forces enters through the (unknown) stationary distribution, $\mu^\theta$.

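Example 2: Inverse temperature perturbation: Using the fluctuation-dissipation relation, we can substitute the friction parameter $\gamma$ with the inverse temperature $\beta$ and compute the RER and the pathwise FIM for perturbations of $\beta$. Indeed, substituting the relation $\gamma = \frac{1}{2}\beta\sigma\sigma^T$ in eq. (6), we look for $u(\cdot,\cdot)$ such that

$$
\begin{bmatrix} 0 & 0 \\ 0 & \sigma \end{bmatrix} u(q,p) = \begin{bmatrix} 0 \\ -\frac{1}{2}\beta\sigma\sigma^T M^{-1} p + \frac{1}{2}(\beta+\epsilon_\beta)\sigma\sigma^T M^{-1} p \end{bmatrix},
$$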

where $\epsilon_\beta$ is the perturbation of the inverse temperature. Notice that the forces cancel out in this expression for $u$ because the parameters of the forces are not perturbed. In the stationary regime, the RER is then given by

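$$
H(Q^\beta \,|\, Q^{\beta+\epsilon_\beta}) = \frac{\epsilon_\beta^2}{8} E_{\mu^\beta}\left[p^T M^{-1} \sigma\sigma^T M^{-1} p\right], \tag{10}
$$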

where $\mu^\beta(\cdot)$ is the stationary distribution of the process. It is evident that the RER is a quadratic function of the perturbation of the inverse temperature and, interestingly enough, it depends only on the momenta, $p$. The above formula is valid for any force field and implies that the sensitivity of the (inverse) temperature, as quantified by the relative entropy between path distributions, is independent of the underlying system as it is defined by the forces or by the potential function, $V^\theta(\cdot)$.

Furthermore, in the equilibrium regime, where the stationary distribution is given by the Gibbs measure (eq. (2)), (10) can be further simplified because of the Gaussian nature of the momenta, $p$. Indeed, assuming for simplicity that $M = m I_{dN}$ and $\sigma = \sigma I_{dN}$ with $m, \sigma \in \mathbb{R}$, (10) is rewritten as

$$
H(Q^\beta \,|\, Q^{\beta+\epsilon_\beta}) = \frac{\epsilon_\beta^2 \sigma^2}{8\beta m}\, dN. \tag{11}
$$

Consequently, the pathwise FIM in the logarithmic scale (see eq. (13) below) is given by

$$
F_H(Q^{\log\beta}) = \frac{\gamma}{2m}\, dN. \tag{12}
$$
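The step from (10) to (11) and (12) can be spelled out as follows. This is only a sketch: it assumes that the equilibrium momenta are Gaussian with covariance $M/\beta$ (the momentum marginal of the Gibbs measure), that $M = m I_{dN}$ and $\sigma$ is scalar as above, that the scalar fluctuation-dissipation relation reads $\gamma = \frac{1}{2}\beta\sigma^2$, and that (7) is the quadratic expansion $H \approx \frac{1}{2}\epsilon^T F_H \epsilon$.

```latex
\begin{align*}
E_{\mu^\beta}\left[p^T p\right] &= dN\,\frac{m}{\beta}
  && \text{(each momentum component has variance } m/\beta\text{)} \\
H(Q^\beta \,|\, Q^{\beta+\epsilon_\beta})
  &= \frac{\epsilon_\beta^2}{8}\,\frac{\sigma^2}{m^2}\,E_{\mu^\beta}\left[p^T p\right]
   = \frac{\epsilon_\beta^2\,\sigma^2}{8\beta m}\,dN \\
F_H(Q^\beta) &= \frac{\sigma^2}{4\beta m}\,dN
  && \text{(matching } H = \tfrac{1}{2}\epsilon_\beta^2 F_H\text{)} \\
F_H(Q^{\log\beta}) &= \beta^2 F_H(Q^\beta)
   = \frac{\beta\sigma^2}{4m}\,dN
   = \frac{\gamma}{2m}\,dN
  && \text{(using } \gamma = \tfrac{1}{2}\beta\sigma^2\text{)}
\end{align*}
```

Under these assumptions the RER is exactly quadratic in $\epsilon_\beta$, so the FIM-based approximation carries no higher-order remainder in this example.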

SA in the logarithmic scale: In many molecular systems, the model parameters may differ by orders of magnitude; thus, it is more appropriate to perform relative perturbations, i.e., the $i$-th element of the perturbation vector is $\theta_i \epsilon_{0,i}$. After straightforward algebra, the elements of the logarithmic-scale Fisher information matrix are given by

$$
\left(F_H(Q^{\log\theta})\right)_{i,j} = \theta_i \theta_j \left(F_H(Q^\theta)\right)_{i,j}, \quad i, j = 1, \ldots, K. \tag{13}
$$

We refer to Ref. 12 for more details.
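The rescaling in (13) and the spectral analysis of the FIM mentioned earlier can be sketched in a few lines of NumPy. The function names and the numbers below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def log_scale_fim(fim, theta):
    """Rescale a pathwise FIM to relative (logarithmic) perturbations, eq. (13)."""
    fim = np.asarray(fim, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Element (i, j) is multiplied by theta_i * theta_j.
    return fim * np.outer(theta, theta)

def sensitivity_directions(fim):
    """Eigen-decomposition of a symmetric FIM, most sensitive direction first."""
    eigvals, eigvecs = np.linalg.eigh(fim)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Illustrative numbers only (hypothetical two-parameter model).
fim = np.array([[4.0, 1.0], [1.0, 9.0]])
theta = np.array([1.0, 0.1])
fim_log = log_scale_fim(fim, theta)
eigvals, eigvecs = sensitivity_directions(fim_log)
```

In this toy case the second parameter looks more sensitive in absolute terms, but after rescaling the first parameter dominates, which is the kind of reordering the logarithmic scale is meant to expose.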

Statistical estimators: Even though the Langevin equation is degenerate, since the noise applies only to the momenta, the process is hypo-elliptic and ergodic under mild conditions on the potential energy, $V(\cdot)$. Therefore, the RER and the corresponding pathwise FIM can be computed as ergodic averages. Note, though, that obtaining samples from the Langevin process requires a numerical scheme, which introduces discretization errors. Several numerical integrators exist for the Langevin equation, such as BBK and BAOAB.32,36 The BBK integrator is briefly reviewed in Appendix A. The introduced bias is of order $O(\Delta t)$, where $\Delta t$ is the time-step, as has been shown for the Langevin equation under a compactness condition37–39 (e.g., a bounded domain). The statistical estimator for the RER is then given by

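$$
H(Q^\theta \,|\, Q^{\theta+\epsilon_0}) = \frac{1}{2n} \sum_{i=1}^{n} \left(F^{\theta+\epsilon_0}(q^{(i)}) - F^\theta(q^{(i)})\right)^T (\sigma\sigma^T)^{-1} \left(F^{\theta+\epsilon_0}(q^{(i)}) - F^\theta(q^{(i)})\right), \tag{14}
$$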

where $n$ is the number of samples. Similarly, the statistical estimator for the pathwise FIM is given by

$$
F_H(Q^\theta) = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta F^\theta(q^{(i)})^T (\sigma\sigma^T)^{-1} \nabla_\theta F^\theta(q^{(i)}). \tag{15}
$$
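The two estimators (14) and (15) translate almost literally into code. The following is a minimal sketch, assuming the samples $q^{(i)}$ are already available from a stationary trajectory; the function names, the callback signatures `force(q, theta)` and `force_jacobian(q, theta)`, and the harmonic toy force are hypothetical and serve only as an illustration.

```python
import numpy as np

def rer_estimator(force, theta, eps0, samples, sigma):
    """Sample average of eq. (14): RER between the models at theta and theta + eps0.

    force(q, theta) returns the force vector (length dN), samples holds the
    positions q^(i), and sigma is the dN x dN diffusion matrix.
    """
    theta = np.asarray(theta, dtype=float)
    eps0 = np.asarray(eps0, dtype=float)
    cov_inv = np.linalg.inv(sigma @ sigma.T)
    total = 0.0
    for q in samples:
        diff = force(q, theta + eps0) - force(q, theta)
        total += diff @ cov_inv @ diff
    return total / (2.0 * len(samples))

def fim_estimator(force_jacobian, theta, samples, sigma):
    """Sample average of eq. (15); force_jacobian(q, theta) has shape (dN, K)."""
    theta = np.asarray(theta, dtype=float)
    cov_inv = np.linalg.inv(sigma @ sigma.T)
    fim = np.zeros((theta.size, theta.size))
    for q in samples:
        jac = force_jacobian(q, theta)
        fim += jac.T @ cov_inv @ jac
    return fim / len(samples)

# Toy check with a hypothetical harmonic force F(q) = -theta * q (K = 1, dN = 3).
force = lambda q, theta: -theta[0] * q
force_jacobian = lambda q, theta: -q.reshape(-1, 1)
samples = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 1.0]])
sigma = 2.0 * np.eye(3)
theta, eps0 = np.array([1.0]), np.array([0.05])

rer = rer_estimator(force, theta, eps0, samples, sigma)
fim = fim_estimator(force_jacobian, theta, samples, sigma)
```

Because the toy force is linear in the parameter, the RER equals $\frac{1}{2}\epsilon_0^2 F_H$ here; for a nonlinear force field the two would agree only up to higher-order terms in $\epsilon_0$.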

Sensitivity Bound: Relative entropy provides a mathematically elegant and computationally tractable methodology for the parameter sensitivity analysis of Langevin systems. Such an approach focuses on the sensitivity of the entire probability distribution, either at equilibrium or at the path-space level, i.e., for the entire stationary time-series, quantifying among others the transferability of the molecular models. However, in many situations in molecular simulations, the interest is focused on observables such as the radial distribution function, pressure, mean square displacement, etc. Therefore, it is desirable to connect the parameter sensitivities of observables to the relative entropy methods proposed here. Indeed, relative entropy can provide an upper bound for a large family of observable functions, $g$, through the Pinsker (or Csiszar-Kullback-Pinsker) inequality,33

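$$
\left| E_{Q^{\theta+\epsilon_0}_{[0,T]}}[g] - E_{Q^{\theta}_{[0,T]}}[g] \right| \le \|g\|_\infty \sqrt{2 R\left(Q^{\theta}_{[0,T]} \,\middle|\, Q^{\theta+\epsilon_0}_{[0,T]}\right)} \tag{16}
$$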

where $\|\cdot\|_\infty$ denotes the supremum (here, maximum) of $g$. In the context of sensitivity analysis, inequality (16) states that if the relative entropy is small, i.e., insensitive in a particular parameter direction, then any bounded observable $g$ is also expected to be insensitive in the same direction. In this sense, ineq. (16) can be viewed as a screening tool for parametric "insensitivity analysis" of observables. Sensitivity bounds sharper than ineq. (16) are also being developed.40 Note that the converse is not justified; i.e., if the relative entropy is large for a specific parameter direction, then an observable $g$ might, or might not, exhibit sensitivity with respect to the same parameter direction.
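The screening role of inequality (16) can be illustrated numerically. The sketch below replaces the path measures by two distributions on a three-point state space, which is an assumption made purely for illustration; the distributions, the observable and the function names are hypothetical.

```python
import numpy as np

def relative_entropy(p, q):
    """Relative entropy R(p | q) of two discrete distributions with q > 0 where p > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def pinsker_bound(g, p, q):
    """Right-hand side of ineq. (16): sup|g| * sqrt(2 R(p | q))."""
    return float(np.max(np.abs(g)) * np.sqrt(2.0 * relative_entropy(p, q)))

# Stand-ins for the unperturbed (p) and perturbed (q) distributions.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
g = np.array([1.0, -2.0, 0.5])  # a bounded observable

gap = abs(np.dot(q, g) - np.dot(p, g))  # left-hand side of ineq. (16)
bound = pinsker_bound(g, p, q)          # right-hand side of ineq. (16)
```

The gap in the expected value of $g$ stays below the bound, but the bound says nothing about how close to it a particular observable gets, which is why a large relative entropy does not by itself imply a sensitive observable.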

III. MODELS AND OBSERVABLES

This section describes the two molecular models discussed here and several observable functions on which the proposed sensitivity analysis method is validated. A prototypical Lennard-Jones fluid model with two force field parameters and a methane model with ten parameters are presented. Observables such as the radial distribution function, the mean square displacement and the pressure, spanning a wide range of model properties, are also provided. All simulations are performed under constant number of atoms, volume and temperature (NVT ensemble).

A. LJ fluid model

In order to investigate the sensitivity analysis for a realistic system we examine the LJ fluid model. In this model, the atoms are identical and interact through the Lennard-Jones potential with reduced non-dimensional parameters $\epsilon_{LJ} = 1$, $\sigma_{LJ} = 1$. One of the advantages of the LJ fluid is that there exists a phase diagram of the reduced density $\rho^*$ versus the reduced temperature $T^*$.30 The popularity of this model stems from the wide range of molecular liquids it can describe as well as from its computational efficiency. We restrict the force field interactions to within a cutoff radius $r_{cut}$. Thus, the (truncated) LJ pair potential is given by

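$$
V_{LJ}(r_{ij}) = \begin{cases} 4\epsilon_{LJ}\left[\left(\dfrac{\sigma_{LJ}}{r_{ij}}\right)^{12} - \left(\dfrac{\sigma_{LJ}}{r_{ij}}\right)^{6}\right], & \text{if } r_{ij} < r_{cut} \\ 0, & \text{otherwise,} \end{cases} \tag{17}
$$

while the total potential energy of the system is

$$
V_{LJ}(q) = \sum_{\substack{1 \le i,j \le N \\ i<j}} V_{LJ}(r_{ij}),
$$

with

$$
r_{ij} = |q_i - q_j| = \sqrt{(q_i^x - q_j^x)^2 + (q_i^y - q_j^y)^2 + (q_i^z - q_j^z)^2},
$$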

being the Euclidean distance between the atoms.

Sensitivity analysis is performed on the LJ potential parameters $\epsilon_{LJ}$ and $\sigma_{LJ}$ and, as we show later (see section IV), the most sensitive parameter is the latter. We consider a system of $N = 2048$ atoms in a cubic simulation box of side length $L = 14.3\sigma_{LJ}$ with periodic boundary conditions (PBC). The reduced temperature of the run is $T^* = 0.85\tau$, which means that the system is in the liquid phase (number density $\rho^* = 0.7$). For the numerical scheme, the time-step is $\Delta t = 10^{-3}$ while the length of the run is $10^5$ time-steps. An equilibration period of $10^4$ steps is sufficient for the fcc lattice to melt, and standard reduced units are used throughout the simulations.

Figure 1. Visualization of the CH4 interactions. The site-to-site non-bonded LJ interactions (intermolecular) are marked in red whereas the intramolecular potential interactions are marked in green.
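The truncated pair potential (17) and the total energy sum can be sketched as follows. This is an illustration, not the authors' simulation code: the function names are hypothetical, the minimum-image treatment of the periodic box is an assumed implementation choice (valid for $r_{cut} \le L/2$), and the default $r_{cut} = 4\sigma_{LJ}$ is the reference value quoted later in the text.

```python
import numpy as np

def lj_pair_potential(r, eps_lj=1.0, sigma_lj=1.0, r_cut=4.0):
    """Truncated LJ pair potential of eq. (17); zero at and beyond r_cut."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma_lj / r) ** 6
    return np.where(r < r_cut, 4.0 * eps_lj * (sr6**2 - sr6), 0.0)

def lj_total_energy(q, box, eps_lj=1.0, sigma_lj=1.0, r_cut=4.0):
    """Sum of pair potentials over all pairs i < j in a cubic periodic box."""
    q = np.asarray(q, dtype=float)
    i, j = np.triu_indices(len(q), k=1)   # every unordered pair once
    dq = q[i] - q[j]
    dq -= box * np.round(dq / box)        # minimum-image convention
    r = np.linalg.norm(dq, axis=1)
    return float(np.sum(lj_pair_potential(r, eps_lj, sigma_lj, r_cut)))

# Two atoms separated by 2^(1/6) sigma across the periodic boundary.
r_min = 2.0 ** (1.0 / 6.0)
q = np.array([[0.2, 0.0, 0.0], [10.2 - r_min, 0.0, 0.0]])
energy = lj_total_energy(q, box=10.0)
```

The pair sits at the potential minimum, so the energy equals $-\epsilon_{LJ}$; this also shows why $\epsilon_{LJ}$ sets the well depth while $\sigma_{LJ}$ sets the length scale.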

B. CH4 model

Methane is a more complicated molecule composed of two different types of atoms: carbon (C) and hydrogen (H). CH4 is the subject of active research because of its environmental impact and energy utilization.41 We extend and validate our sensitivity study on this more complex molecular model, which involves different potentials between the pairs of atoms (bonded and non-bonded) as well as additional parameters imposed by the geometry of the molecule (bonds and angles). We denote by $V(q)$ the total potential and by $N$ the total number of atoms (both C and H):

$$
V(q) = V_{bond}(q) + V_{angle}(q) + V_{LJ}(q), \tag{18}
$$

where $V_{bond}(q)$, $V_{angle}(q)$ are quadratic intramolecular potential functions of the bonds and angles respectively. $V_{LJ}(q)$ is the non-bonded potential as defined in the previous subsection. For more details concerning the model see Appendix C.

The parameter vector, $\theta$, consists of six LJ parameters (three different LJ potentials depending on the atom type, see also Figure 1), two bond parameters and two angle parameters. The parameter values of CH4 are summarized in Table I whereas the values of the simulation parameters are presented in Table II.

| pair | $\epsilon_{LJ}$ [Kcal/mol] | $\sigma_{LJ}$ [Å] | $r_{cut}$ [Å] |
|---|---|---|---|
| C−C | 0.0951 | 3.473 | 15.0 |
| C−H | 0.0380 | 3.159 | 15.0 |
| H−H | 0.0152 | 2.846 | 15.0 |

| $K_b$ [Kcal/(mol Å²)] | $r_0$ [Å] | $K_\theta$ [Kcal/(mol·deg²)] | $\theta_0$ [rad] |
|---|---|---|---|
| 700 | 1.1 | 100 | 1.909 |

Table I. Non-bonded LJ coefficients as well as bond and angle coefficients for methane.31

| N (molecules) | T [K] | L [Å] | $\rho$ [moles/Å³] | $\gamma$ |
|---|---|---|---|---|
| 512 | 100 | 32.9 | 0.0143 | 0.5 |

Table II. Simulation parameters for CH4

C. Observables

To validate the proposed pathwise SA approach we have calculated various observables that are related to thermodynamical, structural and dynamical properties of the molecular stochastic models. These quantities are experimentally tractable and relate to both the microscopic and the macroscopic level. In more detail, we focus on the radial distribution function (RDF), the mean square displacement (MSD) and the pressure. Other studies23 in the literature computed observables such as the Helmholtz free energy, density and enthalpy, to name a few. While the RDF and the pressure are equilibrium quantities, the MSD is related to the dynamics (time-series averaging), which makes the proposed pathwise method suitable for such long-time quantities. Note also that there are no closed-form analytic expressions for any of the above observables in terms of the force field (model) parameters.

1. Radial distribution function

The structure of liquids is characterized by the pair radial distribution function, $g(r)$ ($g^{(2)}(r)$ to be more precise), and it is the most important observable of molecular simulations because the ensemble average of any pair function may be expressed through it.1,42 Furthermore, $g(r)$ can be measured experimentally by X-ray diffraction.42 The RDF is the pair distribution function that indicates the normalized distribution of a pair of identical atoms (or molecules) at a given distance. For long intermolecular distances $r$ in liquids, $g(r)$ fluctuates around unity. This static observable is based on the equilibrium structure of the system and it is constructed from histogram averages. For $N$ identical atoms let the two-atom distribution function be

$$
P^{(2)}_N(q_1, q_2) := \frac{1}{(N-2)!} \int e^{-\beta V(q)}\, dq_3 \ldots dq_N, \tag{19}
$$

where $q_1$, $q_2$ are the positions of the first and second atoms, kept fixed irrespective of the configuration of the rest of the particles. For a (homogeneous) liquid, it holds that

$$
P^{(2)}_N = \rho^2 g^{(2)}(|r_{1,2}|), \qquad \rho = \frac{N}{Vol} \tag{20}
$$

where $\rho$ is the number density while $Vol$ is the volume of the simulation box. If the atoms were independent of each other, $P^{(2)}$ would equal $\rho^2$, so in practice $g(r)$ corrects for the spatial (density) correlation between atoms. For the CH4 model, we consider the molecular $g(r)$, which is based on the center of mass of each individual molecule.
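The histogram construction of $g(r)$ can be sketched as below for a single configuration in a cubic periodic box. The function name, the bin count and the use of the minimum-image convention are assumptions for illustration; in practice the histogram would be averaged over many stored configurations.

```python
import numpy as np

def radial_distribution(positions, box, n_bins=50, r_max=None):
    """Histogram estimate of g(r) for one configuration in a cubic periodic box."""
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    if r_max is None:
        r_max = box / 2.0  # minimum-image distances are unambiguous only up to L/2
    i, j = np.triu_indices(n, k=1)          # every unordered pair once
    dq = positions[i] - positions[j]
    dq -= box * np.round(dq / box)          # minimum-image convention
    r = np.linalg.norm(dq, axis=1)
    counts, edges = np.histogram(r, bins=n_bins, range=(0.0, r_max))
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rho = n / box**3
    # Expected pair count per shell for independent (ideal-gas) atoms; n - 1 is
    # used instead of n so that uncorrelated positions give g = 1 on average.
    ideal = 0.5 * (n - 1) * rho * shell_vol
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, counts / ideal

# Uncorrelated positions: g(r) should fluctuate around one.
rng = np.random.default_rng(0)
box = 10.0
centers, g = radial_distribution(rng.uniform(0.0, box, size=(500, 3)), box, n_bins=10)
```

For uncorrelated positions the estimate scatters around unity, which is the large-distance behavior described above; structure in a real liquid shows up as deviations from this baseline at short range.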

2. Mean square displacement

The mean square displacement links the diffusion coefficient, $D$, to the atom (or center of mass, for molecules) coordinates and is a measure of the spatial extent of the random motion of the Langevin dynamics. It is defined as

$$
MSD = \langle (q_t - q_{t_0})^2 \rangle = E_{Q_{[t_0,t]}}\left[(q_t - q_{t_0})^2\right] \tag{21}
$$

where $q_t$, $q_{t_0}$ are vectors of particle positions at time $t$ and at the reference time instant $t_0$, while the brackets, $\langle\cdot\rangle$, denote ensemble averaging over all configurations of all the atoms (or molecules). This quantity provides information about the dynamical properties of the system. The MSD and the diffusion coefficient, $D$, are related by Einstein's equation

$$
2D = \frac{1}{d} \lim_{t\to\infty} \frac{\partial \langle (q_t - q_{t_0})^2 \rangle}{\partial t} \tag{22}
$$

where $d$ is the dimension of the system (here $d = 3$).
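Equations (21) and (22) can be turned into a short post-processing sketch. It assumes an unwrapped trajectory array of shape (frames, atoms, d), a single time origin $t_0$ at the first frame, and a linear fit over the second half of the time window as a stand-in for the $t \to \infty$ limit; these choices and the function names are assumptions, not the paper's procedure.

```python
import numpy as np

def mean_square_displacement(traj):
    """MSD of eq. (21) relative to the first frame, averaged over atoms.

    traj has shape (n_frames, n_atoms, d) and must not be wrapped into the box.
    """
    traj = np.asarray(traj, dtype=float)
    disp = traj - traj[0]
    return np.mean(np.sum(disp**2, axis=2), axis=1)

def diffusion_coefficient(times, msd, d=3, fit_from=0.5):
    """Estimate D from the late-time slope of the MSD, using 2 D = slope / d (eq. 22)."""
    times = np.asarray(times, dtype=float)
    msd = np.asarray(msd, dtype=float)
    start = int(fit_from * len(times))
    slope = np.polyfit(times[start:], msd[start:], 1)[0]
    return slope / (2.0 * d)

# Synthetic Brownian trajectory with a known diffusion coefficient (illustration only).
rng = np.random.default_rng(0)
dt, d_true = 0.01, 0.5
steps = rng.normal(scale=np.sqrt(2.0 * d_true * dt), size=(1000, 200, 3))
traj = np.cumsum(steps, axis=0)
times = dt * np.arange(len(traj))
d_est = diffusion_coefficient(times, mean_square_displacement(traj))
```

With a single time origin the estimate is noisy; averaging over several origins $t_0$ along the stationary trajectory would reduce the scatter at the cost of a slightly longer analysis.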

3. Pressure

Temperature and pressure are macroscopic thermodynamic parameters defined in an experimental setup, but they can also be defined microscopically. Pressure is given by the expression1

$$
P = \frac{\rho}{\beta} + \frac{vir}{Vol},
$$

where the first term is the kinetic energy contribution while $vir$ is the atomic (or molecular) virial given by

$$
vir = \frac{1}{3} \sum_{1 \le i \le N} \sum_{j > i} F_{ij}\, r_{ij}.
$$

Note that $F_{ij}$ is the total force (both non-bonded and bonded in the CH4 case) between atoms (or molecules) $i$ and $j$.
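For the LJ fluid, where the pair force is central, the pressure expression can be sketched as follows. The code assumes that $F_{ij} r_{ij}$ is the product of the radial pair force $-dV_{LJ}/dr$ with the pair distance, uses reduced units, and applies the minimum-image convention; the function names are hypothetical.

```python
import numpy as np

def lj_pair_force(r, eps_lj=1.0, sigma_lj=1.0, r_cut=4.0):
    """Radial LJ force -dV/dr for the truncated potential (positive = repulsive)."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma_lj / r) ** 6
    return np.where(r < r_cut, 24.0 * eps_lj * (2.0 * sr6**2 - sr6) / r, 0.0)

def virial_pressure(q, box, beta, eps_lj=1.0, sigma_lj=1.0, r_cut=4.0):
    """P = rho / beta + vir / Vol with vir = (1/3) * sum_{i<j} F_ij * r_ij."""
    q = np.asarray(q, dtype=float)
    n = len(q)
    i, j = np.triu_indices(n, k=1)        # every unordered pair once
    dq = q[i] - q[j]
    dq -= box * np.round(dq / box)        # minimum-image convention
    r = np.linalg.norm(dq, axis=1)
    vir = np.sum(lj_pair_force(r, eps_lj, sigma_lj, r_cut) * r) / 3.0
    vol = box**3
    return float(n / vol / beta + vir / vol)

# Two atoms at distance sigma: the pair force is purely repulsive and equals 24.
q = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pressure = virial_pressure(q, box=10.0, beta=2.0)
```

A single configuration gives an instantaneous value only; the reported pressure would be the average of this quantity over the stationary trajectory.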

IV. RESULTS

Every model at hand has a domain of applicability; i.e., the force field representation allows the calculation of (usually thermodynamic) properties of interest in accordance with experimental values within a margin of error. This means that a force field might represent one property well, such as density, but may not be valid for others, or might represent all of them less accurately. In the following we perform simulations where the RER and FIM for each perturbed variable are computed. A discussion of the results follows, together with a validation against the observable quantities defined in section III C.

A. LJ fluid

RER and FIM calculations for the LJ fluid are summarized in Figure 2. We compute the RER values using the continuous time statistical estimators, Eqs. 14, 15. The middle bar corresponds to the FIM-based RER, whereas the left and right bars are the values of estimator 14 for a negative and a positive perturbation by $\epsilon_0 = 5\%$, respectively. All the plots are normalized by the number of particles. As the figure suggests, $\sigma_{LJ}$ is the most sensitive parameter. System size effects have been thoroughly examined by performing test simulations of bigger systems under the same parameters, which produce results similar to those presented here. It has been shown for a similar model that the uncertainty in thermodynamic and transport properties due to the potential parameters is larger than the statistical simulation uncertainty.9

The corresponding results for the discrete time case using the BBK integrator are shown in the Appendix. There is a minor discrepancy of order $O(\Delta t)$, as previously mentioned in section II, due to the discretization error bias. We note here that the continuous time computations are faster since the RER and FIM formulas are less complex.

[Figure 2: bar chart titled "RER and FIM for all the parameters (logscale)". Vertical axis: relative entropy rate (log scale); groups: $\epsilon_{LJ}$, $\sigma_{LJ}$; bars: RER ($-\epsilon_0$), FIM-based RER, RER ($+\epsilon_0$).]

Figure 2. RER and FIM of continuous time estimators (14), (15). Comparing with the discrete time case (supplementary material), the values are almost identical. $\sigma_{LJ}$ is the most sensitive parameter.

[Figure 3: two panels of $g(r)$ versus $r$ [$\sigma_{LJ}$], each with an inset zooming on the first peak. Upper panel curves: unperturbed, $-5\%\,\epsilon_{LJ}$, $+5\%\,\epsilon_{LJ}$. Lower panel curves: unperturbed, $-5\%\,\sigma_{LJ}$, $+5\%\,\sigma_{LJ}$.]

Figure 3. Effect of perturbation of the $\epsilon_{LJ}$ parameter by ±5% (upper panel) and of the $\sigma_{LJ}$ parameter by ±5% (lower panel) on the RDF. The first peak is shifted vertically for fluctuations around $\epsilon_{LJ}$, whereas it is shifted to the right or left when the fluctuations concern the $\sigma_{LJ}$ parameter. It is clear that $\sigma_{LJ}$ (lower panel) is more sensitive, as the plots differ substantially, which is in agreement with the RE method.

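| perturbation | $\lVert g_\theta(r) - g_{\theta+\epsilon_0}(r)\rVert_{L^2}$ | $(\lvert g_\theta\rvert - \lvert g_{\theta+\epsilon_0}\rvert)/\lvert g_\theta\rvert$ | RER |
|---|---|---|---|
| $+5\%\,\epsilon_{LJ}$ | 0.049 | 0.8% | 0.79 |
| $-5\%\,\epsilon_{LJ}$ | 0.066 | -1.17% | 0.79 |
| $+5\%\,\sigma_{LJ}$ | 0.47 | -3.83% | 409 |
| $-5\%\,\sigma_{LJ}$ | 0.59 | 7.4% | 115 |
| $r_{cut} = 1.6\sigma_{LJ}$ | 0.189 | -3.44% | 0.71 |
| $r_{cut} = 7\sigma_{LJ}$ | 0.01 | 0.19% | $1.6 \times 10^{-4}$ |

Table III. $L^2$ norm of the difference of the unperturbed minus the perturbed $g(r)$ and normalized area difference.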

The stronger sensitivity to $\sigma_{LJ}$ compared to $\epsilon_{LJ}$ is confirmed by the RDF ($g(r)$) plots shown in Figure 3. Note that the gradient of the potential, i.e. the interatomic force, depends linearly on $\epsilon_{LJ}$. An increase in this parameter leads to a deeper potential well and stronger attraction between the atoms at the same distance. Thus, as expected, a positive perturbation in $\epsilon_{LJ}$ leads to an increase of the first peak in the RDF graph. In addition, only the first peak of $g(r)$ is affected; the rest of the curve remains the same. Positive (negative) perturbations of the $\sigma_{LJ}$ parameter shift the whole RDF graph because the atoms sense greater (weaker) repulsion forces. Hence, the distribution maximum is moved to a longer (shorter) distance. We also notice that the peak of the curve has increased at the new maximum, which can be explained by the finite volume of the simulation box (NVT ensemble), which is the same as for the unperturbed system.

In order to get a more detailed insight into the RDFs shown in Figure 3 we have also computed the $L^2$ norm, shown in Table III. The $L^2$ norm is suitable for comparing the unperturbed and perturbed plots, $g_\theta(r)$ and $g_{\theta+\epsilon_0}(r)$ respectively. While the RER/FIM computations have shown a relative entropy difference of about 2 orders of magnitude between the two potential parameters (Figure 2), we now observe a consistent difference, of about 5 times in $L^2$, for the RDF observable. Table III and Figure 3 suggest that the positively and negatively perturbed RDF plots behave more symmetrically for the $\epsilon_{LJ}$ parameter than for $\sigma_{LJ}$. Moreover, $-5\%\,\sigma_{LJ}$ changes the packing of the LJ fluid completely; all the density distribution peaks are moved to a shorter distance. This result is consistent with Pinsker's inequality (16), as the more sensitive direction allows for greater differences in the expected values of the observables.
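The two RDF comparison metrics reported in Table III can be sketched as follows. The text does not state how the integrals were discretized, so the trapezoidal rule on the histogram grid is an assumption here, as is reading $|g|$ as the area under the curve (following the table caption); the function names and the flat test curves are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def rdf_l2_distance(r, g_ref, g_pert):
    """L2 norm of the difference between unperturbed and perturbed g(r)."""
    diff = np.asarray(g_ref, dtype=float) - np.asarray(g_pert, dtype=float)
    return float(np.sqrt(trapezoid(diff**2, r)))

def normalized_area_difference(r, g_ref, g_pert):
    """(|g_ref| - |g_pert|) / |g_ref|, with |g| taken as the area under the curve."""
    area_ref = trapezoid(np.abs(g_ref), r)
    area_pert = trapezoid(np.abs(g_pert), r)
    return (area_ref - area_pert) / area_ref

# Flat stand-in curves (illustration only, not simulation data).
r = np.linspace(0.0, 1.0, 101)
g_ref = np.ones_like(r)
g_pert = g_ref + 0.1
l2 = rdf_l2_distance(r, g_ref, g_pert)
area = normalized_area_difference(r, g_ref, g_pert)
```

The $L^2$ distance is insensitive to the sign of the change, while the area difference keeps the sign, which is why the two columns of Table III carry complementary information.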

Opposite perturbation directions yield different RER values, whereas this is not the case for the FIM-based RER, which is a second-order (quadratic) approximation. In one of the realizations in our example, the RER for $+(-)5\%\,\sigma_{LJ}$ is 360.7 (101.1) and the FIM-based value is 196.6, meaning that $H(Q^\theta|Q^{\theta\pm\epsilon_0})$ is not symmetric about the FIM-based value, with the negative direction being more sensitive.

There is no analytic formula that relates the MSD to the potential parameters, but we expect that perturbing a more sensitive parameter results in a larger deviation. As we can see in Figure 4, the line for the insensitive perturbed parameter $\epsilon_{LJ}$ differs only slightly from the black one, both for positive and negative $\epsilon_0$. On the contrary, the line that corresponds to the increased $\sigma_{LJ}$ is further away and below the unperturbed one. Based on the aforementioned discussion on $g(r)$ this is reasonable, as an increase in $\sigma_{LJ}$ leads to stronger repulsive forces at the same distance, hence more atom collisions and consequently a larger friction coefficient, i.e. lower mobility of the LJ atoms. A decrease in $\sigma_{LJ}$ lowers the interatomic repulsive forces and has no significant effect at this density because the random forcing dominates the dynamics. This result is consistent with Pinsker's inequality, which provides an upper bound only: although this parameter is indicated as more sensitive (bigger RER value on the r.h.s.), the expected value of this observable changes only slightly upon perturbation. Additional runs (realizations of the Markov chain starting from different configurations) for the same negative $\sigma_{LJ}$ direction have shown that the errorbars are within 2.5%. The linear dependence of the interatomic forces on $\epsilon_{LJ}$ accounts for increased (decreased) interatomic interaction strength when this parameter is changed upwards (downwards).

[Figure 4: MSD [$\sigma^2$] versus $t$ ($\tau$). Curves: unperturbed, $+5\%\,\epsilon_{LJ}$, $+5\%\,\sigma_{LJ}$, $-5\%\,\epsilon_{LJ}$, $-5\%\,\sigma_{LJ}$. Inset: $D$ versus $1/t$ on logarithmic axes.]

Figure 4. MSD for different perturbed directions by ±5%. The $\epsilon_{LJ}$ parameter has a small impact in comparison with the more sensitive $\sigma_{LJ}$. The MSD plot for the positive perturbation of $\sigma_{LJ}$ stands out as the increased collisions dominate the random forcing. Errorbars indicate the standard deviation for $-5\%\,\sigma_{LJ}$ and the deviation propagates with time. The inset illustrates the diffusion coefficient difference in logscale.

Table IV contains the diffusion coefficient $D$ (from eq. 22) related to Figure 4 and gives a quantitative picture. The perturbation direction of $+5\%\,\sigma_{LJ}$ is dominant and clearly slows down the diffusion of the LJ fluid particles.

As mentioned above, standard non-dimensional (reduced) units are used for the simulation of the LJ fluid. The reduced pressure is denoted by $P^*$ and Table IV contains the simulation results. Once more we observe a greater influence of perturbations of the parameter $\sigma_{LJ}$, especially for a positive increase. This is consistent with the fact that the volume remains unchanged and the repulsive forces increase, as discussed earlier, giving a relative pressure rise of about fourteen for a +5% perturbation. We observe the opposite for a reduction in $\sigma_{LJ}$. The $\epsilon_{LJ}$ parameter has a more symmetric influence; this is explained by the linear increase in the forces (derivative of the potential formula) between the atoms, which is reflected in the pressure through the virial. The third column of Table IV gives the relative pressure change with respect to the unperturbed run, and the pressure standard deviation is in the column that follows. Parameter $\sigma_{LJ}$ alters the pressure by an order of magnitude, a result which is consistent with the RER/FIM calculations in the previous subsection.

1. Discontinuous model parameter: cutoff radius

The cutoff radius $r_{cut}$ is a parameter of the model, but the potential is not differentiable with respect to it. Hence we can compute the relative entropy rate, but we cannot estimate the Fisher information matrix because the computation of the FIM involves products of partial derivatives (see also Eq. 15). Figure 5 summarizes the quantity $R(r_{cut}^{ref}\,|\,r_{cut})$ per particle, where this notation means that the RER is taken with respect to the path-space measure corresponding to the model's reference $r_{cut}$. The potential is close to zero at distance $r_{cut} = 2.5\sigma_{LJ}$, a typical value also used in the literature, so the information lost upon trimming the potential tail is small in comparison to that lost when $r_{cut}$ is shifted to the left. We expected the RER to be higher for a negative perturbation of $r_{cut}$, as validated in Figure 5, and the asymmetry (exponential form for negative perturbations) comes from the shape of the potential; information regarding the attractive part is lost rapidly with each $-10\%$ reduction step from the reference $r_{cut} = 4\sigma_{LJ}$. Indeed, the $r_{cut}$ convention has been used in molecular simulations to reduce the computational cost at the expense of minimal information loss, so our results using this pseudo-metric indicate that our choice of $r_{cut}$ is suitable. Additional runs for an increase in $r_{cut}$ suggest a trivial gain of information based on the $H(Q^{r_{cut}^{ref}}|Q^{r_{cut}})$ value as well as on the RDF (see next).

The RDF plot changes with a change in $r_{cut}$, as shown in Figure 6. When the potential tail is restricted up to $r_{cut}$, the long-range attractive part is zero beyond that distance. This results in weaker long-range attractive forces (loss of cohesive energy), hence the first peak in the RDF graph is lower and the mass is distributed to the right. We have included the plot of a 60% decrease to illustrate the stronger dependence on a “premature” truncation and a plot of a 75% increase for comparison. The empirical value of $2.5\sigma_{LJ}$ is adequate for simulations, but a further reduction to $1.6\sigma_{LJ}$ results in a huge loss of information, especially for the attractive part. On the contrary, if we almost double the reference value of $4\sigma_{LJ}$ to $7\sigma_{LJ}$, the gain is minimal, and this can be seen in Figures 5 and 6.

| perturbation | $P^*$ | $(P^*_{\theta+\epsilon_0} - P^*_\theta)/\lvert P^*_\theta\rvert$ | $\sigma_{STD}$ | $D$ [$\sigma_{LJ}^2/\tau$] |
|---|---|---|---|---|
| unperturbed | 0.11 | - | 0.25 | $3.2 \times 10^{-4}$ |
| $+5\%\,\epsilon_{LJ}$ | -0.10 | -1.92 | 0.26 | $2.9 \times 10^{-4}$ |
| $-5\%\,\epsilon_{LJ}$ | 0.28 | 1.56 | 0.23 | $3.4 \times 10^{-4}$ |
| $+5\%\,\sigma_{LJ}$ | 1.71 | 14.34 | 0.6 | $1.8 \times 10^{-4}$ |
| $-5\%\,\sigma_{LJ}$ | -0.47 | -5.24 | 0.12 | $3.04 \times 10^{-4}$ |

Table IV. (left) Pressure change with respect to different perturbation directions of the LJ fluid parameters. $\sigma_{LJ}$ is the most sensitive direction. (right) Diffusion coefficient of the MSD plots. The errorbars are within ±2.5%.

[Figure 5: RER for $r_{cut}$ (log scale) versus perturbation %, from −60% to +90%.]

Figure 5. RER per particle for different $r_{cut}$ values in logscale. A perturbation of −10% corresponds to 90% of the reference $r_{cut} = 4\sigma_{LJ}$. There is significant loss of information when we restrict the potential tail ($r_{cut}$) to less than one half, as that value is near the minimum of the potential well and a fraction of the attractive forces is lost.

We have seen here that the influence of this parameter is minimal in comparison with the potential parameters in Figure 2 for this reference value in terms of RER. The RER per particle for a $-5\%\,\epsilon_{LJ}$ perturbation is similar to that for a $-60\%$ reduction in $r_{cut}$. The $L^2$ norm of the $g(r)$ difference for different $r_{cut}$ values (Table III and Figure 6) illustrates the same behavior. At this point we should stress that the sensitivity of the observables to $r_{cut}$ changes if we choose another reference value; however, in practice $r_{cut}$ is usually not one of the parameters tuned during force field development/optimization.

Non-equilibrium regime LJ fluid

Finally, we have also studied a non-reversible LJ fluid. In more detail, we have checked the effect of an additional non-gradient term in the force in the y-direction, i.e. $F^\theta(q) = -\nabla V^\theta(q) - G(q)$ with $G(q) = [0, \alpha, 0, 0, \alpha, 0, \ldots, 0, \alpha, 0]^T$. We set $\alpha = 1$ for the irreversible case, and the term $G(q)$ is divergence-free. The eigenvalues and dominant eigenvectors are summarized in Table V and the corresponding RDF plot is given for comparison in Figure 7. Although this process has a different stationary measure, close to that of the reversible one, we expected the extra non-gradient term to cancel out in eq. (6). Hence our results are, as expected, similar, but we have demonstrated that the method is general and can be used for a process equipped with a steady state measure. We aim to study more complex non-equilibrium systems35 in future work.

[Figure 6: $g(r)$ versus $r$ [$\sigma_{LJ}$], with an inset zooming on the first peak. Curves: $r_{cut} = 4\sigma_{LJ}$, $r_{cut} = 1.6\sigma_{LJ}$, $r_{cut} = 7\sigma_{LJ}$.]

Figure 6. LJ fluid $g(r)$ for different $r_{cut}$ values. A bigger $r_{cut}$ results in longer-range attractive forces binding the atoms closer (higher peak). As the $L^2$ norm quantifies, almost doubling the $r_{cut}$ value ($4\sigma_{LJ}$ is considered as reference) only slightly affects the plot, in line with Pinsker's inequality. On the other hand, a decrease of this parameter leads to loss of information and the corresponding $g(r)$ describes a completely different model. Note that for this plot we increased the system size, as the simulation box dimensions restrict the maximum value of $r_{cut}$.

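| eigenvalues ($\alpha = 0$) | eigenvector ($\alpha = 0$) | eigenvalues ($\alpha = 1$) | eigenvector ($\alpha = 1$) |
|---|---|---|---|
| $9.434 \times 10^{4}$ | 0.062 | $1.012 \times 10^{5}$ | 0.0621 |
| $4.33 \times 10^{11}$ | 0.998 | $4.48 \times 10^{11}$ | 0.998 |

Table V. Eigenvalues and eigenvectors for the non-reversible case.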

B. CH4

In the following we discuss calculations of RER and FIM as well as various observables for the all-atom methane liquid. FIM and RER calculations are summarized in Figures 8 and 9. In more detail, Figure 8 shows that the RER values vary by orders of magnitude across the various parameter perturbations, hence we grouped them in four panels a)-d). In Figure 9 the FIM-based RER data are plotted in logscale for comparison. We note here that, due to the uneven number of the different pairs C−C, C−H and H−H, we have divided by 8 and 16 the quantities corresponding to the second and third type of pairs in order to obtain comparable plots. All RER values are normalized by the number of corresponding interactions. Furthermore, bigger systems consisting of 4000 molecules yield identical results.

As in the LJ paradigm, we see a greater sensitivity to the $\sigma_{LJ}$ parameters than to the corresponding $\epsilon_{LJ}$ ones. The errorbars indicate that the variance of the estimators was small and that a positive perturbation increased the value of the RER with respect to the FIM-based RER estimate. Clearly the most sensitive parameter is the C−H bond length $r_0$, followed by the bending angle $\theta_0$.

The fact that $r_0$ and $\theta_0$ are more sensitive is not surprising if we consider that all harmonic potentials are very steep. The $K_b$, $K_\theta$ constants are of the order $O(10^{2} - 10^{-3})$ as obtained from more detailed (ab initio) calculations or from fits to experimental data (see Table I). These constants are part of $\nabla V_{bond}$, $\nabla V_{angle}$, which are contained in the estimators. The asymmetry in the $\sigma_{LJ}$ RER values in comparison to the FIM values (panel b in Fig. 8) is explained by the third-order term contribution in the expansion of the RER. A rigorous calculation in Appendix B shows that this term includes the Hessian of the gradient of the potential w.r.t. the parameters and is non-zero for $\sigma_{LJ}$.

[Figure 7: $g(r)$ versus $r$ [$\sigma_{LJ}$], with an inset zooming on the first peak. Curves: unperturbed, $+5\%\,\epsilon_{LJ}$, $+5\%\,\sigma_{LJ}$.]

Figure 7. RDF plot for the irreversible case. The norm values are: $\lVert g_{\theta+5\%\epsilon_{LJ}}(r) - g_\theta(r)\rVert_{L^2} = 0.058$, $\lVert g_{\theta+5\%\sigma_{LJ}}(r) - g_\theta(r)\rVert_{L^2} = 0.473$.

1. Observables

We perform the same observable computations as with the LJ fluid model in order to validate the predicted sensitivity of the parameters provided by the RER and pathwise FIM methods. Although we have performed simulations for various values of the parameters, we chose a 5% perturbation as a suitable value for a clear presentation of our results. Note that in principle parameter sensitivities change as we change the phase space point; at higher temperatures or low densities each observable is affected differently, and our proposed RE method incorporates this behavior through the force differences (eq. 6). Here we have performed simulations in the temperature range from 80 to 180 K and qualitatively similar results were observed. A more detailed study of SA over various temperatures of more complex (macromolecular) systems will be the subject of a future work.

| perturbed parameter | $\lVert g_\theta(r) - g_{\theta+\epsilon_0}(r)\rVert_{L^2}$ | $\lVert g_\theta(r) - g_{\theta-\epsilon_0}(r)\rVert_{L^2}$ |
|---|---|---|
| $\epsilon_{C-C}$ | $1.0 \times 10^{-2}$ | $1.2 \times 10^{-2}$ |
| $\sigma_{C-C}$ | $1.1 \times 10^{-1}$ | $5.7 \times 10^{-2}$ |
| $\epsilon_{C-H}$ | $1.7 \times 10^{-2}$ | $9.6 \times 10^{-3}$ |
| $\sigma_{C-H}$ | $2.8 \times 10^{-1}$ | $1.7 \times 10^{-1}$ |
| $\epsilon_{H-H}$ | $1.4 \times 10^{-2}$ | $1.15 \times 10^{-2}$ |
| $\sigma_{H-H}$ | $2.05 \times 10^{-1}$ | $1.6 \times 10^{-1}$ |
| $K_b$ | $1.1 \times 10^{-2}$ | $9.7 \times 10^{-3}$ |
| $r_0$ | $1.01 \times 10^{-1}$ | $1.1 \times 10^{-1}$ |
| $K_\theta$ | $8.7 \times 10^{-3}$ | $8.2 \times 10^{-3}$ |
| $\theta_0$ | $9 \times 10^{-3}$ | $9 \times 10^{-3}$ |

Table VI. $L^2$ norm of the difference of the unperturbed minus the perturbed $g(r)$ for ±5% perturbation.

As in the case of the RDF of the LJ fluid, an increase in the $\sigma_{LJ}$ parameters shifts the graphs to the right (Figure 10) due to the repulsive forces. All the differences with respect to the $L^2$ norm are summarized in Table VI for clarity.

In addition, the RDF data presented in Figure 10 show that an increase in $\sigma^{C-H}_{LJ}$ results in larger deviations. As we keep the volume fixed, the contribution of the C−H interactions to the packing is larger than that of the C−C pairs because of the larger number of C−H pairs. The next largest deviation is the one produced by the increase in $\sigma^{H-H}_{LJ}$, whose numerical value is even smaller than those of the other $\sigma_{LJ}$ parameters. Here the smaller mass of the hydrogens is the reason, although the number of H−H pairs (hence interactions) is the largest.

The MSD plots indicate $\sigma^{C-H}_{LJ}$ and $\sigma^{H-H}_{LJ}$ as the most sensitive parameters. An increase in $\sigma_{LJ}$ results in more collisions and a smaller diffusion coefficient (smaller MSD), as can be seen in Figure 11. As in the LJ case, positive $\sigma_{LJ}$ perturbations (for all three types) result in greater repulsive forces, hence reduced diffusivity. Variations in $\epsilon_{LJ}$ affect the MSD only slightly compared with the other parameters, and the same holds for $K_b$ and $K_\theta$ (we have omitted the plots for brevity). Under this dynamic observable the intramolecular interactions are less relevant than the intermolecular ones for the specific state point (temperature and density) studied here.
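For readers who wish to reproduce this kind of comparison, the MSD and a diffusion coefficient can be estimated from stored trajectories roughly as follows. This is a sketch under stated assumptions rather than the estimator used for Figure 11: it assumes unwrapped coordinates, a single time origin, three spatial dimensions and the Einstein relation MSD(t) ≈ 6Dt at long times; the function names are hypothetical.

```python
import numpy as np

def mean_squared_displacement(traj):
    """MSD per frame for traj of shape (n_frames, n_particles, 3), unwrapped."""
    disp = traj - traj[0]                     # displacement from the first frame
    return np.mean(np.sum(disp ** 2, axis=-1), axis=1)

def diffusion_coefficient(msd, times, fit_from=0.5):
    """Slope of MSD versus time over the late part of the curve, divided by 6."""
    start = int(fit_from * len(times))
    slope = np.polyfit(times[start:], msd[start:], 1)[0]
    return slope / 6.0
```

Comparing the value returned for a perturbed run with that of the unperturbed run gives a single-number summary of the differences between the MSD curves.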

Pressure calculations for different perturbation directions are summarized in Table VII. According to this observable, $\sigma^{H-H}_{LJ}$ and $r_0$ are the most sensitive parameters, which is also indicated by the RE methods. As in the case of the LJ fluid, a change in $\epsilon_{LJ}$ (in all pair types) affects the pressure less than a change in $\sigma_{LJ}$. The pressure rises with an increase in $\sigma_{LJ}$ due to more atom collisions. Additionally, stronger forces account for a higher pressure virial. The presented results are in accordance with earlier work⁴³ on an LJ model of water, in which sensitivity analysis based on partial derivatives of observables with respect to the parameters was used. That study also demonstrated that pressure is greatly affected

[Figure 8: four bar-chart panels with the relative entropy rate on the vertical axis. Panel a): $\epsilon_{C-C}$, $\epsilon_{C-H}$, $\epsilon_{H-H}$. Panel b): $\sigma_{C-C}$, $\sigma_{C-H}$, $\sigma_{H-H}$. Panel c): $K_r$, $K_\theta$. Panel d): $r_0$, $\theta_0$ (log scale). Legend entries as extracted: RER estimator ($-\epsilon_0$), FIM-based RER estimator ($+\epsilon_0$).]

Figure 8. CH4 per-molecule RER-FIM comparison with error bars using the two different estimators for ±5% perturbations in all the parameters. Non-bonded (a and b) and bonded (c and d) potential parameters are shown. The parameters are grouped according to their order of magnitude. The most sensitive one is $r_0$ followed by $\theta_0$, and there has been a minor scaling according to the number of atom-atom pairs.


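[Figure 9: logarithmic vertical axis from $10^{-2}$ to $10^{7}$; the horizontal axis lists the parameters $\epsilon_{C-C}$, $\sigma_{C-C}$, $\epsilon_{C-H}$, $\sigma_{C-H}$, $\epsilon_{H-H}$, $\sigma_{H-H}$, $K_r$, $r_0$, $K_\theta$, $\theta_0$.]

Figure 9. CH4 FIM-based RER comparison for ±5% perturbations in log scale.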

[Figure 10: g(r) versus distance [Å]; curves: unperturbed, +5% $\sigma_{C-C}$, +5% $\sigma_{C-H}$, +5% $\sigma_{H-H}$; the inset zooms on distances between 4 and 4.6 Å.]

Figure 10. CH4 molecular g(r) for +5% perturbations of $\sigma_{LJ}$. The tail of the plot varies only slightly, whereas the zoomed region differs more. As in the LJ fluid case, $\sigma_{LJ}$ determines the horizontal shift of the curve.

by variations in $\sigma_{LJ}$ and classified the bond length, $\sigma_{LJ}$ and the bond constant as the most sensitive parameters.

A change in the bending angle $\theta_0$ does not affect the pressure,⁴⁴ and the impact of the constants $K_b$, $K_\theta$ on the pressure is minimal. We note that the unperturbed system pressure is higher than 1 atm because the model we chose (force field and integrator) does not reproduce the whole CH4 phase diagram precisely. Such small deviations from the equations of state and experiments are expected. We refer to the supplementary materials for more figures and results, which were omitted here for brevity.

[Figure 11: MSD [Å²] versus t [ps]; curves: unperturbed, +5% $\sigma^{C-C}_{LJ}$, +5% $\sigma^{C-H}_{LJ}$, +5% $\sigma^{H-H}_{LJ}$, +5% $r_0$; the inset has axis labels 1/t and D.]

Figure 11. CH4 MSD for 5% perturbations. We have summarized the most important directions. With respect to this observable, the most sensitive parameter is $\sigma_{C-H}$ followed by $\sigma_{H-H}$. This is in accordance with the RER in Figure 9. The inset illustrates the diffusion coefficient differences in log scale.

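| perturbation | $P^{\theta+\epsilon_0}$ [atm] | $(P^{\theta+\epsilon_0}-P^{\theta})/\lvert P^{\theta}\rvert$ | $\sigma_{STD}$ | $P^{\theta-\epsilon_0}$ [atm] | $(P^{\theta-\epsilon_0}-P^{\theta})/\lvert P^{\theta}\rvert$ |
|---|---|---|---|---|---|
| unperturbed | 19.7 | - | 58.4 | - | - |
| $\epsilon_{C-C}$ | -3.9 | -1.2 | 51.7 | 33.1 | +0.7 |
| $\sigma_{C-C}$ | -2.3 | -1.1 | 53.1 | 87.4 | +3.4 |
| $\epsilon_{C-H}$ | -31.3 | -2.6 | 52.2 | 63.6 | +2.2 |
| $\sigma_{C-H}$ | 177.2 | +8 | 56.7 | 44.3 | +1.2 |
| $\epsilon_{H-H}$ | 8.27 | -0.6 | 49.1 | 23.7 | +0.2 |
| $\sigma_{H-H}$ | 437 | +21.2 | 56.3 | 195 | +10.9 |
| $K_b$ | 15.3 | -0.2 | 49.6 | 14.3 | -0.3 |
| $r_0$ | 281 | +13.3 | 56.5 | 217 | +12 |
| $K_\theta$ | 18.6 | -0.05 | 52.9 | 14.9 | -0.2 |
| $\theta_0$ | 13.7 | -0.3 | 52.45 | 13.3 | -0.3 |

Table VII. Pressure for ±5% perturbations in different directions and the corresponding standard deviation. The most sensitive parameters, $r_0$ and $\sigma^{H-H}_{LJ}$, increase the pressure substantially.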
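The relative-change columns of Table VII follow from the pressures by a one-line formula. As a small illustration, the sketch below recomputes that column for a few of the +5% entries; the dictionary keys are hypothetical labels, while the numbers are copied from the table.

```python
def relative_pressure_change(p_perturbed, p_reference):
    """(P_perturbed - P_reference) / |P_reference|, as in the Table VII header."""
    return (p_perturbed - p_reference) / abs(p_reference)

p_reference = 19.7  # unperturbed pressure [atm] from Table VII
plus_five_percent = {"eps_CC": -3.9, "sigma_CH": 177.2, "sigma_HH": 437.0, "K_b": 15.3}

changes = {name: round(relative_pressure_change(p, p_reference), 1)
           for name, p in plus_five_percent.items()}
# Rank the parameters by the magnitude of the relative change.
ranking = sorted(changes, key=lambda name: abs(changes[name]), reverse=True)
```

Ranking by the magnitude of the relative change places $\sigma_{H-H}$ first among these four entries, consistent with the discussion of the pressure sensitivities.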

V. CONCLUSION

In this work, we extended the parametric SA approach of Ref. 12 to stochastic molecular dynamics. The focus was set particularly on the Langevin equation; however, the approach is applicable to any molecular system that can be described by a system of stochastic differential equations. The presented SA approach is based on the relative entropy per unit time of the path distribution at a reference parameter point with respect to the path distribution at a perturbed parameter point. Major advantages of this method are that: i) it is capable of handling non-equilibrium steady state systems and ii) it is computationally tractable through the expansion of the RER, which results in the pathwise FIM. The pathwise FIM provides a fast “gradient-free” method for parametric SA


since it provides an estimate (up to third-order accuracy) of the RER for different perturbation directions through a simple matrix multiplication.

We examined two systems: the well-known prototypical LJ fluid and a more complex one, methane (CH4). SA on the LJ fluid system was based on the potential parameters $\epsilon_{LJ}$, $\sigma_{LJ}$, with the latter being the more sensitive to perturbations, whereas CH4 involved 6 intermolecular and 4 intramolecular potential parameters, with the intramolecular parameters being the most sensitive in terms of RER. Static and dynamic observable quantities such as the radial distribution function, the mean square displacement and the pressure validated the proposed SA approach. Theoretical justification of the proposed SA approach is also provided through the Pinsker inequality. We also investigated the effect of the potential cutoff radius, $r_{cut}$, by numerically computing the RER, showing first that the RER can be used as an information criterion for assigning appropriate values to parameters of the system and second that a 5% perturbation of $\sigma_{LJ}$ produces a greater impact than changing $r_{cut}$ from $4\sigma_{LJ}$ to $1.6\sigma_{LJ}$.

Finally, RE for high-dimensional systems has been used as a measure of loss of information in coarse-graining.¹³,⁴⁵,⁴⁶ Coarse-graining (CG) methods for stochastic systems allow for constructing optimal parametrized Markovian coarse-grained dynamics within a parametric family by minimizing the information loss (i.e., the relative entropy) on the path space. The application of RE to the error analysis of coarse-graining of stochastic particle systems was pioneered in Refs. 47–49. Recent ongoing work on the application of the RE framework for CG in the non-equilibrium regime, where there is no Gibbs structure, can be found in Ref. 50. We aim to utilize the current SA method to tackle more complex hybrid macromolecular materials or biomolecular systems in and out of equilibrium conditions.²⁸,²⁹,⁵¹ Another goal is to adapt the RE method to quantify and indicate the most efficient CG mapping of mesoscale simulations.⁵²

ACKNOWLEDGMENTS

This research has been co-financed by the European Union (European Social Fund – ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) – Research Funding Programs: THALES and ARISTEIA II. The research of MAK was supported in part by the Office of Advanced Scientific Computing Research, U.S. Department of Energy under Contract No. DE-FG02-13ER26161/DE-SC0010723.

Appendix A: Pathwise SA at the discrete-time level

In Section II, we perform SA by first deriving the RER and the corresponding pathwise FIM for the continuous-time stochastic Langevin process and then discretizing the process to get numerical estimates for these quantities. We can reverse the order and first discretize the Langevin process and then derive the RER and the pathwise FIM. Here, the latter approach is presented using the BBK algorithm as a numerical integrator of the Langevin process, which defines a discrete-time Markov chain. A preliminary example of this approach can be found in Ref. 12. In the BBK integrator, the Hamiltonian part of the Langevin equation (1) is integrated with the Verlet propagator, whereas the thermostat is an Ornstein-Uhlenbeck process and the explicit/implicit propagator is used.

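The BBK algorithm³² reads

$$
\begin{aligned}
p_{i+\frac{1}{2}} &= p_i - \nabla V(q_i)\,\frac{\Delta t}{2} - \gamma M^{-1} p_i\,\frac{\Delta t}{2} + \sigma \Delta W_i \\
q_{i+1} &= q_i + \Delta t\, M^{-1} p_{i+\frac{1}{2}} \\
p_{i+1} &= p_{i+\frac{1}{2}} - \nabla V(q_{i+1})\,\frac{\Delta t}{2} - \gamma M^{-1} p_{i+1}\,\frac{\Delta t}{2} + \sigma \Delta W_{i+\frac{1}{2}}
\end{aligned}
\tag{A1}
$$

Here $\Delta W_i$, $\Delta W_{i+\frac{1}{2}}$ are iid Gaussian random vectors with zero mean and covariance matrix $\frac{\Delta t}{2} I_{dN}$, while $\Delta t$ is the time step of the numerical scheme. Notice that other choices of numerical integrators can be utilized, such as the ones proposed by Leimkuhler et al.,³⁶,⁵³ which introduce a relatively weak perturbative effect on the physical dynamics.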
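A single step of the scheme (A1) can be sketched as follows. This is an illustrative implementation, not the code used for the simulations in this paper; it assumes a diagonal mass matrix supplied as a vector of inverse masses and scalar $\gamma$ and $\sigma$. The last line of (A1) is implicit in $p_{i+1}$, so it is solved for $p_{i+1}$ in closed form. The name `bbk_step` and its signature are hypothetical.

```python
import numpy as np

def bbk_step(q, p, grad_V, dt, gamma, sigma, inv_mass, rng):
    """One BBK step (A1). inv_mass holds the diagonal of M^{-1} (an assumption)."""
    half = 0.5 * dt
    # Two independent Gaussian increments, each with covariance (dt/2) * I.
    dW_first = rng.normal(0.0, np.sqrt(half), size=p.shape)
    dW_second = rng.normal(0.0, np.sqrt(half), size=p.shape)

    # Explicit half step for the momenta.
    p_half = p - grad_V(q) * half - gamma * inv_mass * p * half + sigma * dW_first
    # Full step for the positions.
    q_new = q + dt * inv_mass * p_half
    # Implicit half step: (1 + gamma * M^{-1} * dt/2) p_new = rhs.
    rhs = p_half - grad_V(q_new) * half + sigma * dW_second
    p_new = rhs / (1.0 + gamma * inv_mass * half)
    return q_new, p_new
```

With $\gamma = \sigma = 0$ the step reduces to the velocity Verlet update of the Hamiltonian part, which is a convenient sanity check.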

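We define the state of the discrete-time system at time step $i$ as $z_i = (q_i, p_i) \in \mathbb{R}^{2dN}$. The process $\{z_i\}_{i=0}^{M}$ for the BBK integrator is a Markov chain with transition probability $P^{\theta}(z_i, z_{i+1})$, where $\theta \in \mathbb{R}^{K}$ is the vector of the system's parameters. Notice that the length of the discrete-time process is related to the time window of the continuous-time process through $T = M\Delta t$. The path space probability density, $Q^{\theta}_{0:M}(\cdot)$, is defined as

$$
Q^{\theta}_{0:M}\big(\{z_i\}_{i=0}^{M}\big) = \mu^{\theta}(z_0) \prod_{i=0}^{M-1} P^{\theta}(z_i, z_{i+1}) , \tag{A2}
$$

where $\mu^{\theta}(\cdot)$ denotes the stationary distribution of the discrete-time Markov chain. As in the continuous-time case, we perturb the parameter vector $\theta$ by adding a perturbation vector $\epsilon_0 \in \mathbb{R}^{K}$. In the stationary regime, the pathwise relative entropy of $Q^{\theta}_{0:M}$ with respect to $Q^{\theta+\epsilon_0}_{0:M}$ also admits a decomposition into a term that is linear in time plus a constant.¹² Indeed, it holds that

$$
R\big(Q^{\theta}_{0:M} \,|\, Q^{\theta+\epsilon_0}_{0:M}\big) = M\, H\big(Q^{\theta} \,|\, Q^{\theta+\epsilon_0}\big) + R\big(\mu^{\theta} \,|\, \mu^{\theta+\epsilon_0}\big) , \tag{A3}
$$

where $R(\mu^{\theta}|\mu^{\theta+\epsilon_0})$ is the relative entropy between the stationary distributions, while $H(Q^{\theta}|Q^{\theta+\epsilon_0})$ is the RER of the discrete-time Markov chain given by

$$
H\big(Q^{\theta} \,|\, Q^{\theta+\epsilon_0}\big) = E_{\mu^{\theta}}\left[ \int_{\mathbb{R}^{2dN}} P^{\theta}(z, z') \log \frac{P^{\theta}(z, z')}{P^{\theta+\epsilon_0}(z, z')} \, dz' \right] . \tag{A4}
$$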
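To make (A4) concrete, consider the simplified situation, assumed here purely for illustration, in which the transition kernel is Gaussian, $P^{\theta}(z,\cdot) = N(m^{\theta}(z), \Sigma)$, with a covariance that does not depend on $\theta$. The inner integral in (A4) is then the Kullback-Leibler divergence between two Gaussians with equal covariance, $\frac{1}{2}\Delta m^{T}\Sigma^{-1}\Delta m$ with $\Delta m = m^{\theta+\epsilon_0}(z) - m^{\theta}(z)$, and the RER becomes an average over visited states. The kernel, the toy mean function `ou_mean` and the hand-picked states below are hypothetical and are not the BBK kernel.

```python
import numpy as np

def gaussian_kernel_rer(states, mean_fn, theta, eps0, cov):
    """Average of (A4) over states for kernels N(mean_fn(z, theta), cov)."""
    cov_inv = np.linalg.inv(cov)
    total = 0.0
    for z in states:
        dm = mean_fn(z, theta + eps0) - mean_fn(z, theta)
        total += 0.5 * dm @ cov_inv @ dm   # KL between equal-covariance Gaussians
    return total / len(states)

# Hypothetical toy kernel: Euler-type step of a linear drift -theta * z.
dt, noise = 0.01, 1.0

def ou_mean(z, theta):
    return z - theta[0] * z * dt

cov = np.array([[noise ** 2 * dt]])
states = np.array([[1.0], [-2.0], [0.5]])   # stand-in for stationary samples
rer_per_step = gaussian_kernel_rer(states, ou_mean, np.array([1.0]), np.array([0.05]), cov)
rer_per_unit_time = rer_per_step / dt
```

Dividing the per-step value by $\Delta t$ gives a quantity per unit time, which is the scaling that relates the discrete-time and continuous-time RER.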


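The discrete-time RER is related to the continuous-time RER through⁵²

$$
\mathcal{H}\big(Q^{\theta} \,|\, Q^{\theta+\epsilon_0}\big) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, H\big(Q^{\theta} \,|\, Q^{\theta+\epsilon_0}\big) , \tag{A5}
$$

where the left-hand side is the continuous-time RER and $H$ on the right-hand side is the discrete-time RER of (A4). As expected, the discrete-time RER is locally a quadratic functional in a neighborhood of $\theta$; hence its curvature around $\theta$, defined by the Hessian, is the pathwise FIM, which is given by¹²

$$
F_{H}\big(Q^{\theta}\big) = E_{\mu^{\theta}}\left[ \int_{\mathbb{R}^{2dN}} P^{\theta}(z, z')\, \nabla_{\theta} \log P^{\theta}(z, z')\, \nabla_{\theta} \log P^{\theta}(z, z')^{T} \, dz' \right] . \tag{A6}
$$

We refer to (12) for statistical estimators of the discrete-time RER and the corresponding pathwise FIM, while in the Supplementary Materials we provide detailed formulas for the numerical calculation of (A4) and (A6) for the BBK integrator.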

Appendix B: Expansion of the continuous-time RER

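We now expand the RER in eq. (6) through a Taylor series expansion around the point $\theta$. We start by expanding the $m$-th component of the force, $F^{\theta+\epsilon_0}(q)$, around $\theta$:

$$
F^{\theta+\epsilon_0}_{m}(q) = F^{\theta}_{m}(q) + \nabla_{\theta} F^{\theta}_{m}(q)\,\epsilon_0 + \frac{1}{2}\,\epsilon_0^{T}\, \nabla^{2}_{\theta} F^{\theta}_{m}(q)\,\epsilon_0 + O(|\epsilon_0|^{3}) \tag{B1}
$$

where $\nabla$ denotes the $1 \times K$ gradient vector while $\nabla^{2}$ denotes the $K \times K$ Hessian matrix. Then, the RER is written as

$$
\begin{aligned}
\mathcal{H}\big(Q^{\theta} \,|\, Q^{\theta+\epsilon_0}\big)
&= \frac{1}{2} E_{\mu^{\theta}}\Big[ \big(F^{\theta+\epsilon_0}(q) - F^{\theta}(q)\big)^{T} (\sigma\sigma^{T})^{-1} \big(F^{\theta+\epsilon_0}(q) - F^{\theta}(q)\big) \Big] \\
&= \frac{1}{2} \sum_{m,n=1}^{dN} E_{\mu^{\theta}}\Big[ \big(F^{\theta+\epsilon_0}_{m}(q) - F^{\theta}_{m}(q)\big) \big((\sigma\sigma^{T})^{-1}\big)_{m,n} \big(F^{\theta+\epsilon_0}_{n}(q) - F^{\theta}_{n}(q)\big) \Big] \\
&= \frac{1}{2} \sum_{m,n=1}^{dN} \big((\sigma\sigma^{T})^{-1}\big)_{m,n}\, E_{\mu^{\theta}}\Big[ \nabla_{\theta} F^{\theta}_{m}(q)\,\epsilon_0 \; \nabla_{\theta} F^{\theta}_{n}(q)\,\epsilon_0 \Big] \\
&\quad + \frac{1}{2} \sum_{m,n=1}^{dN} \big((\sigma\sigma^{T})^{-1}\big)_{m,n}\, E_{\mu^{\theta}}\Big[ \nabla_{\theta} F^{\theta}_{m}(q)\,\epsilon_0 \; \epsilon_0^{T}\, \nabla^{2}_{\theta} F^{\theta}_{n}(q)\,\epsilon_0 \Big] + O(|\epsilon_0|^{4}) .
\end{aligned}
$$

The pathwise FIM comes from the second-order term, while the third-order term defines a third-order tensor.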
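The second-order term above equals $\frac{1}{2}\epsilon_0^{T} F \epsilon_0$ with $F = E_{\mu^{\theta}}[J^{T}(\sigma\sigma^{T})^{-1}J]$, where $J$ is the $dN \times K$ matrix whose rows are $\nabla_{\theta}F^{\theta}_{m}(q)$. The sketch below estimates $F$ from samples and compares the FIM-based estimate with the directly computed RER. The toy force, the sample distribution and all function names are hypothetical; because the toy force is linear in $\theta$, the higher-order terms vanish and the two values coincide.

```python
import numpy as np

def pathwise_fim(force_jacobians, noise_cov_inv):
    """Sample average of J^T (sigma sigma^T)^{-1} J; each J has shape (dN, K)."""
    return np.mean([J.T @ noise_cov_inv @ J for J in force_jacobians], axis=0)

def fim_based_rer(fim, eps0):
    """Second-order estimate of the RER: 0.5 * eps0^T F eps0."""
    return 0.5 * eps0 @ fim @ eps0

def direct_rer(force, samples, theta, eps0, noise_cov_inv):
    """RER from force differences, averaged over the samples."""
    diffs = [force(q, theta + eps0) - force(q, theta) for q in samples]
    return 0.5 * np.mean([d @ noise_cov_inv @ d for d in diffs])

# Hypothetical toy force with dN = 1, K = 2: F(q) = -theta_1 q - theta_2 q^3.
def toy_force(q, theta):
    return np.array([-theta[0] * q - theta[1] * q ** 3])

def toy_jacobian(q):
    return np.array([[-q, -q ** 3]])

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)          # stand-in for stationary samples of q
noise_cov_inv = np.array([[2.0]])
theta = np.array([1.0, 0.5])
eps0 = 0.05 * theta                      # +5% perturbation in both parameters
fim = pathwise_fim([toy_jacobian(q) for q in samples], noise_cov_inv)
rer_from_fim = fim_based_rer(fim, eps0)
rer_direct = direct_rer(toy_force, samples, theta, eps0, noise_cov_inv)
```

Once `fim` is available, evaluating another perturbation direction only requires another quadratic form, whereas the direct estimate needs the perturbed force for every direction.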

For the LJ non-bonded potential, the leading term of the second-order term (i.e., the pathwise FIM) in the RER expansion when $\sigma_{LJ}$ is perturbed is of order $O\big((\sigma_{LJ}/r)^{10}\big)$, while the leading term of the third-order term of the RER is of order $O\big((\sigma_{LJ}/r)^{9}\big)$ with (typically) $\sigma_{LJ} < r$. Because the leading term of the third-order term has a smaller exponent than that of the second-order term, the contribution of the third-order term to the value of the RER is significant on average. Therefore, the asymmetry between $\mathcal{H}(Q^{\theta}|Q^{\theta+\epsilon_0})$ and $\mathcal{H}(Q^{\theta}|Q^{\theta-\epsilon_0})$ observed both in the LJ fluid (Figure 2) and in methane (Figure 8) stems exactly from the significance of the third-order term. Notice that asymmetries between positive and negative perturbations are not rare and have been observed in biological reaction models; one method that is employed for assessing parameter identifiability in non-linear models is the profile likelihood method.⁵⁴

Appendix C: Potential energy terms of CH4

In this section, the details of the total potential $V(q) = V_{bond}(q) + V_{angle}(q) + V_{LJ}(q)$ for the methane model are presented. The total bond potential equals

$$
V_{bond}(q) = \sum_{\mathcal{A}} V_{bond}(|q_j - q_i|) \tag{C1}
$$

where

$$
\mathcal{A} = \{ q_i = \mathrm{C},\ q_j = \mathrm{H};\ q_i, q_j \in \text{same CH}_4,\ 4 \text{ bonds per CH}_4 \}
$$

while the local bond potential is

$$
V_{bond}(|q_j - q_i|) = V_{bond}(r_{ij}) = \frac{1}{2} K_b (r_0 - r_{ij})^{2} . \tag{C2}
$$

The two constants $r_0$ and $K_b$ determine the equilibrium distance and the strength of the bond between the two atoms, respectively.

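The angle defined for each triplet H−C−H on the same CH4 molecule is denoted by $\theta_{jik}$. Then, the total angular potential is

$$
V_{angle}(q) = \sum_{\mathcal{B}} V_{angle}(\angle q_j q_i q_k) \tag{C3}
$$

where

$$
\mathcal{B} = \{ q_i = \mathrm{C},\ q_j, q_k = \mathrm{H};\ q_i, q_j, q_k \in \text{same CH}_4,\ 6 \text{ angles per CH}_4 \} ,
$$

while the local angular potential is given by

$$
V_{angle}(\angle q_j q_i q_k) = V_{angle}(\theta_{jik}) = \frac{1}{2} K_{\theta} (\theta_0 - \theta_{jik})^{2} . \tag{C4}
$$

The two constants $\theta_0$ and $K_{\theta}$ determine the equilibrium value and the strength of the angle, respectively.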
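The bonded terms (C1)-(C4) translate directly into code. The following sketch evaluates them for a single methane molecule; the parameter values are placeholders chosen only for illustration (they are not the values of Table II), and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def bond_energy(q_i, q_j, k_b, r_0):
    """Harmonic bond term (C2) for the pair (q_i, q_j)."""
    r_ij = np.linalg.norm(q_j - q_i)
    return 0.5 * k_b * (r_0 - r_ij) ** 2

def angle_energy(q_j, q_i, q_k, k_theta, theta_0):
    """Harmonic angle term (C4); q_i is the central (carbon) atom."""
    u, v = q_j - q_i, q_k - q_i
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
    return 0.5 * k_theta * (theta_0 - theta) ** 2

# Placeholder parameters and an ideal tetrahedral geometry (illustrative only).
k_b, r_0 = 700.0, 1.09
k_theta, theta_0 = 100.0, np.arccos(-1.0 / 3.0)
carbon = np.zeros(3)
directions = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3.0)
hydrogens = r_0 * directions

v_bond = sum(bond_energy(carbon, h, k_b, r_0) for h in hydrogens)           # 4 bonds
v_angle = sum(angle_energy(h_j, carbon, h_k, k_theta, theta_0)
              for h_j, h_k in combinations(hydrogens, 2))                   # 6 angles
```

For the ideal tetrahedral geometry with bond length $r_0$ and angle $\theta_0$, both sums vanish up to round-off, and any distortion gives a positive energy.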

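Moreover, the non-bonded term of the potential energy, $V_{LJ}(q)$, is given by

$$
V_{LJ}(q) = \sum_{\mathcal{C}} V_{LJ}(|q_j - q_i|) \tag{C5}
$$

where

$$
\mathcal{C} = \{ q_i, q_j = \mathrm{H} \text{ or } \mathrm{C};\ q_i, q_j \in \text{different CH}_4 \}
$$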


while the functional form of the LJ potential, $V_{LJ}(r_{ij})$, is given by (17).

Since the LJ potential is the non-bonded term, the sum in (C5) is over all the atoms of the other methane molecules. It is furthermore convenient to divide this sum into three sums, each one corresponding to a different class of interactions: C−C, C−H and H−H. Thus, we can rewrite

$$
V_{LJ}(q) = \sum_{\mathcal{C}_1} V^{C-C}_{LJ}(r_{ij}) + \sum_{\mathcal{C}_2} V^{C-H}_{LJ}(r_{ij}) + \sum_{\mathcal{C}_3} V^{H-H}_{LJ}(r_{ij}) , \tag{C6}
$$

where

$$
\begin{aligned}
\mathcal{C}_1 &= \{ q_i, q_j = \mathrm{C} \} \\
\mathcal{C}_2 &= \{ q_i = \mathrm{C},\ q_j = \mathrm{H};\ q_i, q_j \in \text{different CH}_4 \} \\
\mathcal{C}_3 &= \{ q_i, q_j = \mathrm{H};\ q_i, q_j \in \text{different CH}_4 \} .
\end{aligned}
$$

Each LJ potential has its own parameter values.
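The decomposition (C6) amounts to looking up a separate ($\epsilon_{LJ}$, $\sigma_{LJ}$) pair for each interaction class. The sketch below assumes the standard 12-6 form $4\epsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$ for the pair potential, since eq. (17) is not reproduced in this appendix; the parameter values are placeholders and not those of the methane model, no cutoff is applied, and the function names are hypothetical.

```python
import numpy as np

def lj_pair(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy (assumed form of the pair potential)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def lj_energy(molecule_a, molecule_b, params):
    """Sum over atom pairs of two different molecules, as in (C6).

    Each molecule is a list of (element, position) with element "C" or "H";
    params maps "C-C", "C-H", "H-H" to (epsilon, sigma).
    """
    total = 0.0
    for el_i, q_i in molecule_a:
        for el_j, q_j in molecule_b:
            key = "-".join(sorted((el_i, el_j)))   # "H","C" -> "C-H"
            epsilon, sigma = params[key]
            total += lj_pair(np.linalg.norm(q_j - q_i), epsilon, sigma)
    return total

# Placeholder parameters (illustrative only).
params = {"C-C": (0.10, 3.4), "C-H": (0.05, 3.0), "H-H": (0.02, 2.6)}
mol_a = [("C", np.zeros(3)), ("H", np.array([1.0, 0.0, 0.0]))]
mol_b = [("C", np.array([4.0, 0.0, 0.0])), ("H", np.array([5.0, 0.0, 0.0]))]
energy = lj_energy(mol_a, mol_b, params)
```

Perturbing a single entry of `params`, for example the C−H value of $\sigma_{LJ}$, changes only the corresponding sum in (C6), which is what makes per-class sensitivities well defined.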

REFERENCES

¹ M. P. Allen and D. J. Tildesley. Computer simulation of liquids. Clarendon Press, New York, NY, USA, 1987. Daan Frenkel and B. Smit. Understanding Molecular Simulation, Second Edition: From Algorithms to Applications (Computational Science). Academic Press, 2001.

² M. J. Kotelyanskii and D. N. Theodorou. Simulation Methods for Polymers, volume Chapter "Molecular Dynamics Simulations of Polymers". Marcel Dekker, New York, 2004.

³ D. E. Shaw, P. Maragakis, K. Lindorff-Larsen, S. Piana, R. Dror, M. P. Eastwood, J. A. Bank, J. M. Jumper, J. Salmon, Y. Shan, and W. Wriggers. Atomic-level characterization of the structural dynamics of proteins. Science, 330(6002):341–346, 2010.

⁴ D. Fritz, V. A. Harmandaris, K. Kremer, and N. Van der Vegt. Coarse-grained polymer melts based on isolated atomistic chains: simulation of polystyrene of different tacticities. Macromolecules, 42(19):7579–7588, 2009.

⁵ V. Johnston and V. Harmandaris. Hierarchical simulations of hybrid polymer/solid materials. Soft Matter, 9:6696–6710, 2013.

⁶ A. Chernatynskiy, S. Phillpot, and R. LeSar. Uncertainty quantification in multiscale simulation of materials: A prospective. Annual Review of Materials Research, 43:157–182, July 2013.

⁷ P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos. Bayesian uncertainty quantification and propagation in molecular dynamics simulations: A high performance computing framework. J. Chem. Phys., 137(14), 2012.

⁸ H. Liu, A. Sudjianto, and W. Chen. Relative entropy based method for probabilistic sensitivity analysis in engineering design. Journal of Mechanical Design, 128(2):326–336, 2005.

⁹ F. Cailliez and P. Pernot. Statistical approaches to forcefield calibration and prediction uncertainty in molecular simulation. J. Chem. Phys., 134(5), 2011.

¹⁰ F. Rizzi, H. N. Najm, B. J. Debusschere, K. Sargsyan, M. Salloum, H. Adalsteinsson, and O. M. Knio. Uncertainty quantification in MD simulations. Part I: forward propagation. SIAM Multiscale Model. Simul., 10(4):1428–1459, December 2012.

¹¹ S. K. Rao, R. Imam, K. Ramanathan, and S. Pushpavanam. Sensitivity analysis and kinetic parameter estimation in a three way catalytic converter. Industrial & Engineering Chemistry Research, 48:3779–3790, 2009.

¹² Y. Pantazis and M. Katsoulakis. A relative entropy rate method for path space sensitivity analysis of stationary complex stochastic dynamics. J. Chem. Phys., 138:054115, 2013.

¹³ A. F. Emery and A. V. Nenarokomov. Optimal experiment design. Measurement Science and Technology, 9(6):864, 1998.

¹⁴ B. Cooke and S. C. Schmidler. Statistical prediction and molecular dynamics simulation. Biophysical Journal, 95:4497–4511, November 2008.

¹⁵ R. D. Braatz, R. C. Alkire, E. Seebauer, E. Rusli, R. Gunawan, T. O. Drews, X. Li, and Y. He. Perspectives on the design and control of multiscale systems. J. Proc. Control, 16:193–204, 2006.

¹⁶ R. Gunawan, Y. Cao, L. Petzold, and F. J. III Doyle. Sensitivity analysis of discrete stochastic systems. Biophysical Journal, 88:2530–2540, 2005.

¹⁷ M. Rathinam, P. W. Sheppard, and M. Khammash. Efficient computation of parameter sensitivities of discrete stochastic chemical reaction networks. J. Chem. Phys., 132:034103–(1–13), 2010.

¹⁸ D. F. Anderson. An efficient finite difference method for parameter sensitivities of continuous-time Markov chains. SIAM J. Numerical Analysis, 50(5):2237–2258, 2012.

¹⁹ G. Arampatzis and M. A. Katsoulakis. Goal-oriented sensitivity analysis for lattice kinetic Monte Carlo simulations. J. Chem. Phys., 12(140):124108, 2014.

²⁰ P. W. Glynn. Likelihood ratio gradient estimation for stochastic systems. Communications of the ACM, 33(10):75–84, 1990.

²¹ S. Plyasunov and A. P. Arkin. Efficient stochastic sensitivity analysis of discrete event systems. J. Comput. Phys., 221(2):724–738, February 2007.

²² P. B. Warren and R. J. Allen. Steady-state parameter sensitivity in stochastic modeling via trajectory reweighting. J. Chem. Phys., 136(10), 2012.

²³ T. Iordanov, G. Schenter, and B. Garret. Sensitivity analysis of thermodynamic properties of liquid water: a general approach to improve empirical potentials. J. Phys. Chem. A, 110:762–771, 2006.

²⁴ A. Majda and B. Gershgorin. Quantifying uncertainty in climate change science through empirical information theory. PNAS, 107(34):14958–14963, 2010.

²⁵ C. Baig and V. Harmandaris. Quantitative analysis on the validity of a coarse-grained model for nonequilibrium polymeric liquids under flow. Macromolecules, 43:3156–3160, 2010.

²⁶ K. R. Haas, H. Yang, and J.-W. Chu. Fisher information metric for the Langevin equation and least informative models of continuous stochastic dynamics. J. Chem. Phys., 139(12), September 2013.

²⁷ V. Johnston and V. Harmandaris. Hierarchical multiscale modeling of polymer−solid interfaces: Atomistic to coarse-grained description and structural and conformational properties of polystyrene−gold systems. Macromolecules, 46:5741–5750, 2013.

²⁸ A. Rissanou and V. Harmandaris. Dynamics of various polymer/graphene interfacial systems through atomistic molecular dynamics simulations. Soft Matter, 42(10):2876–2888, 2014.

²⁹ V. Harmandaris. Quantitative study of equilibrium and non-equilibrium polymer dynamics through systematic hierarchical coarse-graining simulations. Korea-Aust. Rheol. J., 26:15–28, 2014.

³⁰ B. Smit. Phase diagrams of Lennard-Jones fluids. J. Chem. Phys., 96(11), June 1992.

³¹ S. Mayo, B. Olafson, and W. Goddard. Dreiding: a generic force field for molecular simulations. J. Phys. Chem., 94(26):8897–8909, 1990.

³² T. Lelievre, M. Rousset, and G. Stoltz. Free energy calculations. Imperial College Press, 2010.

³³ T. Cover and J. Thomas. Elements of Information Theory. John Wiley and Sons, 1991.

³⁴ B. Oksendal. Stochastic Differential Equations: An introduction with applications. Springer-Verlag, 5th edition, 2000.

³⁵ V. Harmandaris, V. Mavrantzas, and D. Theodorou. Atomistic molecular dynamics simulation of stress relaxation upon cessation of steady-state uniaxial elongational flow. Macromolecules, 33(21):8062–8076, 2000.

³⁶ B. Leimkuhler and C. Matthews. Robust and efficient configurational molecular sampling via Langevin dynamics. J. Chem. Phys., 138(17), 2013.

³⁷ V. Bally and D. Talay. The law of the Euler scheme for stochastic differential equations. I. Convergence rate of the distribution function. Probab. Theory Related Fields, 104:43–60, 1996.

³⁸ V. Bally and D. Talay. The law of the Euler scheme for stochastic differential equations. II. Convergence rate of the density. Monte Carlo Methods Appl., 2:93–128, 1996.

³⁹ J. C. Mattingly, A. M. Stuart, and M. V. Tretyakov. Convergence of numerical time-averaging and stationary measures via Poisson equations. SIAM J. Numer. Anal., 48:552–577, 2010.

⁴⁰ P. Dupuis, M. A. Katsoulakis, Y. Pantazis, and P. Plechac. Sensitivity bounds and error estimates for stochastic models. (in preparation).

⁴¹ D. A. Hensher and K. J. Button. Handbook of Transport and the Environment. Elsevier, 2003.

⁴² D. McQuarrie. Statistical Mechanics. Harper Collins Publishers, 1976.

⁴³ S. B. Zhu and C. F. Wong. Sensitivity analysis of a polarizable water model. J. Chem. Phys., 98:4695–4701, 1994.

⁴⁴ H. Heinz, W. Paul, and K. Binder. Calculation of local pressure tensors in systems with many-body interactions. Phys. Rev. E, 72(6):066704–066714, 2005.

⁴⁵ M. S. Shell. Systematic coarse-graining of potential energy landscapes and dynamics in liquids. J. Chem. Phys., 137(8), 2012.

⁴⁶ M. A. Katsoulakis, P. Plechac, L. Rey-Bellet, and D. K. Tsagkarogiannis. Coarse-graining schemes and a posteriori error estimates for stochastic lattice systems. ESAIM: Mathematical Modelling and Numerical Analysis, 41:627–660, May 2007.

⁴⁷ M. A. Katsoulakis, A. J. Majda, and D. G. Vlachos. Coarse-grained stochastic processes for microscopic lattice systems. Proc. Natl. Acad. Sci. USA, 100(3):782–787, 2003.

⁴⁸ M. A. Katsoulakis, A. J. Majda, and D. G. Vlachos. Coarse-grained stochastic processes and Monte Carlo simulations in lattice systems. J. Comp. Phys., 112:250–278, 2003.

⁴⁹ M. A. Katsoulakis, L. Rey-Bellet, P. Plechac, and D. K. Tsagkarogiannis. Mathematical strategies in the coarse-graining of extensive systems: error quantification and adaptivity. J. Non Newt. Fluid Mech., 2008.

⁵⁰ M. A. Katsoulakis and P. Plechac. Information-theoretic tools for parametrized coarse-graining of non-equilibrium extended systems. J. Chem. Phys., 139(arXiv:1304.7700), April 2013.

⁵¹ A. Rissanou, E. Georgilis, M. Kasotaskis, A. Mitraki, and V. Harmandaris. Effect of solvent on the self-assembly of dialanine and diphenylalanine peptides. J. Phys. Chem. B, 117:3962–3975, 2013.

⁵² E. Kalligiannaki, V. Harmandaris, M. Katsoulakis, and P. Plechac. Optimizing coarse-grained models for non-equilibrium molecular systems: Dynamical force matching across scales. To be submitted.

⁵³ B. Leimkuhler, E. Noorizadeh, and F. Theil. A gentle stochastic thermostat for molecular dynamics. J. Stat. Phys., 135:261–277, 2009.

⁵⁴ A. Raue, C. Kreutz, T. Maiwald, J. Bachmann, M. Schilling, U. Klingmuller, and J. Timmer. Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics, 2009.
