Intro Workflow Summary References

UQTk
A C++/Python Toolkit for Uncertainty Quantification

Bert Debusschere, Khachik Sargsyan, Cosmin Safta, Kenny Chowdhary
[email protected]
Sandia National Laboratories, Livermore, CA, USA

Wed April 6, 2016 – SIAM UQ 16

Debusschere – SNL UQTk

Acknowledgements

• Cosmin Safta, Khachik Sargsyan, Kenny Chowdhary, Habib Najm: Sandia National Laboratories, Livermore, CA, USA
• Omar Knio: Duke University, Raleigh, NC, USA
• Roger Ghanem: University of Southern California, Los Angeles, CA, USA
• Olivier Le Maître: LIMSI-CNRS, Orsay, France
• and many others ...

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR), Scientific Discovery through Advanced Computing (SciDAC) and Applied Mathematics Research (AMR) programs.

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

Outline

1 General Characteristics

2 An Example Workflow

3 Summary

4 References

UQ is about enabling predictive simulations

[Figure: diagram linking Predictive Simulation, Theory, and Experiments]

UQ methods extract information from all sources to enable predictive simulation

Formulation

$$\frac{du}{dt} = f(u; \lambda)$$

$$P(\lambda \mid D, I) = \frac{P(D \mid \lambda, I)\, P(\lambda, I)}{P(D)}$$

[Figure: diagram relating λ and u and their distributions P(λ) and P(u), with the labels "Forward Propagation" and "Inference"]

• UQ not just about propagating uncertainties
• The term UQ covers a wide range of methods
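As a minimal sketch of the forward-propagation half of this formulation, the snippet below pushes samples of λ through a model by plain Monte Carlo. The decay model f(u; λ) = −λu, the initial condition u(0) = 1, the final time t = 1, and the Gaussian P(λ) are all illustrative assumptions that do not come from the slides.

```python
import numpy as np

# Hypothetical setup (not from the slides): du/dt = -lam * u, u(0) = 1,
# with uncertain rate lam ~ N(1.0, 0.1^2).
rng = np.random.default_rng(0)
lam = rng.normal(1.0, 0.1, size=100_000)   # samples of P(lambda)

# For this linear model the solution at t = 1 is known in closed form,
# so each lambda sample maps directly to a sample of u(1).
u_final = np.exp(-lam * 1.0)               # samples of P(u)

mean_u = u_final.mean()
std_u = u_final.std()
print(f"mean of u(1) = {mean_u:.4f}, std of u(1) = {std_u:.4f}")
```

For a Gaussian λ the exact mean of u(1) is exp(−μ + σ²/2), which gives a quick sanity check on the sampled mean.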

UQTk provides tools to build a general UQ workflow

• Tools for
  • Representation of random variables and stochastic processes
  • Forward uncertainty propagation
  • Inverse problems
  • Sensitivity analysis
  • Bayesian Compressive Sensing
  • Gaussian Processes
• Tools can be used stand-alone or combined into a general workflow

We want UQTk to be straightforward to download, install and use

• Target usage:
  • Rapid prototyping of UQ workflows
  • Algorithmic research in UQ
  • Tutorials / educational
• Released under the GNU Lesser General Public License
  • http://www.sandia.gov/UQToolkit/
• Current version 2.1
  • Version 3.0 coming (very) soon
• No massive third-party libraries to download, install, and configure

UQTk is used in a variety of applications

• Direct collaborations
  • US DOE SciDAC QUEST UQ institute: http://www.quest-scidac.org
  • Variety of US DOE SciDAC partnership projects
  • Part of US DOE BER ACME climate model analysis tools
  • Always welcome new applications / collaborations
• Downloads from http://www.sandia.gov/UQToolkit
  • ≈ 600 total downloads
  • ≈ 425 downloads of version 2.x
  • Mostly academic and laboratory research groups
• Mailing lists
  • [email protected]
  • [email protected]
  • Join at http://www.sandia.gov/UQToolkit

We rely on Polynomial Chaos expansions (PCEs) to represent uncertainty

• Standard PC basis types supported:
  • Gauss – Hermite
  • Uniform – Legendre
  • Gamma – Laguerre
  • Beta – Jacobi
• Also support for custom orthogonal polynomials
  • Defined by user-provided three-term recurrence formula
• Both intrusive and non-intrusive PC tools provided
  • Primarily Galerkin projection methods
  • Some regression approaches offered through Bayesian Compressed Sensing module
• See also Debusschere, et al. 2004; Sargsyan, et al. 2014
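To illustrate how a three-term recurrence defines an orthogonal polynomial basis, here is a minimal NumPy sketch for the Uniform – Legendre pair. The function name `legendre_recurrence` is hypothetical; this is not the UQTk API, only a stand-alone illustration of the idea.

```python
import numpy as np

def legendre_recurrence(order, x):
    """Evaluate P_0..P_order at x using the three-term recurrence
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    x = np.asarray(x, dtype=float)
    values = np.empty((order + 1,) + x.shape)
    values[0] = 1.0
    if order >= 1:
        values[1] = x
    for n in range(1, order):
        values[n + 1] = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
    return values

# Check orthogonality under the uniform density on [-1, 1]
# with an 8-point Gauss-Legendre rule (exact for these low-degree products).
nodes, weights = np.polynomial.legendre.leggauss(8)
weights = weights / 2.0            # uniform density is 1/2 on [-1, 1]
psi = legendre_recurrence(3, nodes)
gram = (psi * weights) @ psi.T     # gram[j, k] = E[P_j P_k]
print(np.round(gram, 12))
```

The Gram matrix comes out diagonal with entries 1/(2k+1), which are the basis norms that a Galerkin projection divides by.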

UQTk uses a combination of C++ and Python

• Main libraries in C++
  • PCBasis and PCSet classes: PC tools (intrusive and non-intrusive)
  • Quad class: quadrature rules (full tensor and sparse tensor product rules)
  • MCMC, Gproc, ...
• Functionality available via
  • Direct linking of C++ code
  • Standalone apps
  • Python interface based on Swig (UQTk version 3.0)
• Download as tar file and configure with CMake
• Examples of common workflows provided

An example UQTk workflow (UQTk version 3.0)

[Figure: UQTk workflow diagram with blocks: Forward UQ; Inverse UQ; Model f(λ); Surrogate f_c(λ); Dim. Red.; Data D = {y_i}; Likelihood; Prior p(λ); Posterior p(λ|D); Any model g(λ); Prediction p(g(λ)|D)]

• Consider uncertain parameter vector λ
• Propagate uncertainty in λ through model g(λ)
• First calibrate λ using data on function f(λ)
• Consider the following calibration functions f(λ):
  • Gaussian: $f^G(\lambda) = \exp\left(-\sum_{i=1}^{d} a_i^2 \lambda_i^2\right)$
  • Exponential: $f^E(\lambda) = \exp\left(\sum_{i=1}^{d} a_i \lambda_i\right)$
  • 5-dimensional λ: a = (0.4, 0.3, 0.2, 0.1, 0.05)
• Forward models g(λ):
  • Gaussian: $g_1(\lambda) = f^G(\lambda)$
  • Exponential: $g_2(\lambda) = f^E(\lambda)$
  • Summation: $g_3(\lambda) = \sum_{i=1}^{d} \lambda_i$
• For more details, see "Handbook of Uncertainty Quantification", Springer, 2016
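The test functions above are simple enough to write down directly. The following NumPy sketch defines them with the coefficient vector a from the slide; the Python function names are illustrative choices, not identifiers from UQTk.

```python
import numpy as np

# Coefficients from the slide: a = (0.4, 0.3, 0.2, 0.1, 0.05)
a = np.array([0.4, 0.3, 0.2, 0.1, 0.05])

def f_gauss(lam):
    """Gaussian function: exp(-sum_i a_i^2 lam_i^2)."""
    return np.exp(-np.sum(a**2 * np.asarray(lam)**2, axis=-1))

def f_exp(lam):
    """Exponential function: exp(sum_i a_i lam_i)."""
    return np.exp(np.sum(a * np.asarray(lam), axis=-1))

def g_sum(lam):
    """Summation model: sum_i lam_i."""
    return np.sum(np.asarray(lam), axis=-1)

# Each function accepts a single point of shape (5,) or a batch of shape (n, 5).
lam = np.ones(5)
print(f_gauss(lam), f_exp(lam), g_sum(lam))
```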

Surrogate models provide computationally cheap approximations for full forward model

• Used instead of full forward model in computationally demanding operations such as optimization and calibration
• Use PCE surrogate model
  • Same approach as forward UQ
• Legendre-Uniform PCEs
  • Use uniform distributions over range of input parameters
  • Galerkin projection with Gauss quadrature
  • 3rd order PCE using $4^5 = 1024$ quadrature points
  • 111 random validation samples to assess surrogate accuracy
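The surrogate construction described above can be sketched in plain NumPy. This is a minimal illustration rather than UQTk's implementation: it assumes the inputs are scaled to [−1, 1]^5 (the slide does not give the parameter range), uses the Exponential test function as the "full" model, and the helper names `model` and `basis` are hypothetical.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

# Assumptions: inputs scaled to [-1, 1]^5; Exponential test function as the model.
a = np.array([0.4, 0.3, 0.2, 0.1, 0.05])
dim, order, npts_1d = 5, 3, 4

def model(x):
    return np.exp(np.sum(a * x, axis=-1))

# Total-order multi-indices: all (k_1..k_5) with k_1 + ... + k_5 <= order.
mindex = np.array([m for m in itertools.product(range(order + 1), repeat=dim)
                   if sum(m) <= order])

# Full tensor-product Gauss-Legendre rule with 4^5 = 1024 points.
x1, w1 = legendre.leggauss(npts_1d)
w1 = w1 / 2.0  # weights for the uniform density on [-1, 1]
quad_pts = np.array(list(itertools.product(x1, repeat=dim)))
quad_wts = np.prod(np.array(list(itertools.product(w1, repeat=dim))), axis=1)

def basis(x):
    """Multivariate Legendre basis at points x of shape (n, dim)."""
    # p1d[k][:, i] holds P_k evaluated at coordinate i of every point.
    p1d = [legendre.legval(x, np.eye(order + 1)[k]) for k in range(order + 1)]
    psi = np.ones((x.shape[0], len(mindex)))
    for j, m in enumerate(mindex):
        for i, k in enumerate(m):
            psi[:, j] *= p1d[k][:, i]
    return psi

# Galerkin projection: c_j = E[f Psi_j] / E[Psi_j^2], with E[P_k^2] = 1/(2k+1).
norms = np.prod(1.0 / (2.0 * mindex + 1.0), axis=1)
coeffs = basis(quad_pts).T @ (quad_wts * model(quad_pts)) / norms

# Validation on 111 random samples, as on the slide.
rng = np.random.default_rng(0)
x_val = rng.uniform(-1.0, 1.0, size=(111, dim))
rel_err = (np.linalg.norm(basis(x_val) @ coeffs - model(x_val))
           / np.linalg.norm(model(x_val)))
print(len(quad_pts), len(mindex), rel_err)
```

A 3rd-order total-degree expansion in 5 dimensions has 56 terms, so the 1024 model evaluations determine 56 coefficients. The first coefficient is the mean of the model output.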

Sample UQTk commands (using stand-alone apps)

• Generate quadrature points:
    generate_quad -d 5 -g LU -x full -p 4
• Generate random samples for validation:
    pce_rv -w PCvar -d 5 -n 111 -p 5 -x LU
• Evaluate model at quadrature and validation points
• Perform Galerkin projection:
    pce_resp -d 5 -x LEG -o 3 -e
• Evaluate output PCE at validation points to compute error:
    pce_eval -x PC -s LU -o 3 -f <INPC>

Surrogate Construction

[Figure: two parity plots, Gaussian and Exponential, of Polynomial Surrogate versus Model, showing training points, validation points, and the line y=x]

• 3rd order PC surrogate is accurate to within 0.1% relative error for both the Gaussian and Exponential functions

Sensitivity analysis enables dimensionality reduction

• UQTk computes main, joint, and total sensitivity coefficients
  • $S_i$: Fraction of variance due to $\lambda_i$ only
  • $S_{ij}$: Fraction of variance due to both $\lambda_i$ and $\lambda_j$
  • $S_i^T$: Fraction of variance due to $\lambda_i$ by itself and in combination with any other $\lambda_j$
• Can be computed analytically from PC response surface
• pce_sens -m mindex.dat \
      -f PCcoeff_quad.dat -x LU
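Because the PC basis is orthogonal, each term contributes its squared coefficient times its basis norm to the output variance, and the sensitivity coefficients follow by grouping terms according to which inputs they involve. The sketch below shows this for main and total indices. The function name `pce_sensitivities` and the small 2D example are hypothetical and are not the `pce_sens` implementation.

```python
import numpy as np

def pce_sensitivities(mindex, coeffs, norms):
    """Main and total sensitivity indices from PCE coefficients.

    mindex: (n_terms, dim) multi-indices; coeffs: PCE coefficients;
    norms: E[Psi_j^2] for each basis term (orthogonal basis assumed).
    """
    mindex = np.asarray(mindex)
    contrib = np.asarray(coeffs, dtype=float) ** 2 * np.asarray(norms, dtype=float)
    variance = contrib[mindex.sum(axis=1) > 0].sum()   # exclude the mean term
    dim = mindex.shape[1]
    main, total = np.zeros(dim), np.zeros(dim)
    for i in range(dim):
        involves_i = mindex[:, i] > 0
        others_zero = np.delete(mindex, i, axis=1).sum(axis=1) == 0
        main[i] = contrib[involves_i & others_zero].sum() / variance
        total[i] = contrib[involves_i].sum() / variance
    return main, total

# Hypothetical 2D Legendre PCE: y = 1 + 2 P1(x1) + 1 P1(x2) + 3 P1(x1) P1(x2)
mindex = [[0, 0], [1, 0], [0, 1], [1, 1]]
coeffs = [1.0, 2.0, 1.0, 3.0]
norms = [1.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 9.0]
main, total = pce_sensitivities(mindex, coeffs, norms)
print(main, total)
```

In this example the variance is 4/3 + 1/3 + 1 = 8/3, so the main indices are 0.5 and 0.125 and the total indices are 0.875 and 0.5; the gap between main and total is the joint contribution of the mixed term.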

Components that explain most of the variance are retained

[Figure: bar chart of sensitivity (0 to 1) for the Gaussian and Exponential functions, with bars for λ1 through λ5]

• More than 80% of variance attributed to first two components
• Include only $\lambda_1$ and $\lambda_2$ in calibration

Bayesian calibration using surrogate models

Bayesian Parameter Inference

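$$\overbrace{p(\lambda \mid D)}^{\text{Posterior}} \propto \overbrace{p(D \mid \lambda)}^{\text{Likelihood}} \; \overbrace{p(\lambda)}^{\text{Prior}}$$

$$L_D(\lambda) = p(D \mid \lambda) \propto \prod_{j=1}^{R} \exp\left(-\frac{\left(y_j^G - f^G(\lambda)\right)^2}{2\sigma^2}\right) \exp\left(-\frac{\left(y_j^E - f^E(\lambda)\right)^2}{2\sigma^2}\right)$$

• Generate R random noisy realizations of $f^G(\lambda)$ and $f^E(\lambda)$ as data D
• Assume Normally distributed priors N(0, 0.3) on $\lambda_1$ and $\lambda_2$
• Infer $\lambda_1$ and $\lambda_2$ against data on both $f^G(\lambda)$ and $f^E(\lambda)$
• model_inf -x xfile.dat -y yfile.dat -f pc \
      -l classical -m 100000 -e 0.01 -i normal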
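A minimal sketch of this inference, using a plain random-walk Metropolis sampler in place of `model_inf`, is given below. Several choices are assumptions made for illustration only: the "true" parameter values, R = 20, the noise level σ = 0.02, reading the 0.3 in N(0, 0.3) as a standard deviation, restricting the functions to λ1 and λ2, and the proposal step size.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.array([0.4, 0.3])            # coefficients for lambda_1, lambda_2 only

def f_gauss(lam):
    return np.exp(-np.sum(a**2 * lam**2))

def f_exp(lam):
    return np.exp(np.sum(a * lam))

# Synthetic data (all values hypothetical): R noisy realizations of each function.
lam_true = np.array([-0.4, -0.3])
sigma, R = 0.02, 20
y_g = f_gauss(lam_true) + sigma * rng.standard_normal(R)
y_e = f_exp(lam_true) + sigma * rng.standard_normal(R)

def log_posterior(lam):
    # Prior N(0, 0.3) on each parameter, with 0.3 read as a standard deviation.
    log_prior = -0.5 * np.sum(lam**2) / 0.3**2
    # Gaussian likelihood on both data sets, matching the product form above.
    misfit = np.sum((y_g - f_gauss(lam))**2) + np.sum((y_e - f_exp(lam))**2)
    return log_prior - misfit / (2.0 * sigma**2)

def metropolis(log_post, start, nsteps, step):
    """Random-walk Metropolis with an isotropic Gaussian proposal."""
    chain = np.empty((nsteps, len(start)))
    current = np.array(start, dtype=float)
    current_lp = log_post(current)
    accepted = 0
    for k in range(nsteps):
        proposal = current + step * rng.standard_normal(len(current))
        proposal_lp = log_post(proposal)
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[k] = current
    return chain, accepted / nsteps

chain, acc_rate = metropolis(log_posterior, np.zeros(2), 20000, 0.05)
print("acceptance rate:", acc_rate)
print("posterior mean after burn-in:", chain[5000:].mean(axis=0))
```

With these two functions the data pin down the combination a1 λ1 + a2 λ2 much more tightly than the individual parameters, so the joint posterior is elongated rather than a compact blob.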

Markov Chain Monte Carlo generates a set of posterior samples

[Figure: MCMC trace plots of Parameter λ1 and Parameter λ2 versus MCMC Sample (0 to 100000)]

• The chains for both parameters show good mixing

Marginal and Joint Posterior Densities

[Figure: marginal PDFs of λ1 and λ2, and the joint posterior density of λ1 and λ2]

Forward propagation with calibrated parameters

The Rosenblatt transformation maps the posterior samples to standard Gaussian random variables

• Posterior distributions represent the parameter uncertainties
• Set of MCMC samples characterizes these posteriors
• Need to project these posteriors onto Gauss-Hermite PC basis to get PCE for λ
• Galerkin projection requires map between posterior samples and ξ
• Rosenblatt transformation provides this mapping
• pce_quad provides this map to project samples onto PCEs

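Rosenblatt mapping of quadrature points enables Galerkin projection onto Gauss-Hermite basis

[Figure: scatter plot of Parameter λ2 versus Parameter λ1 showing MCMC samples and quadrature samples]

$$\lambda_1 = \sum_{k=0}^{P} \lambda_{1k} \Psi_k(\xi), \qquad \lambda_{1k} \propto \int \underbrace{R^{-1}_{\lambda_1}(\xi)}_{\lambda_1} \Psi_k(\xi)\, w(\xi)\, d\xi$$

$$\lambda_2 = \sum_{k=0}^{P} \lambda_{2k} \Psi_k(\xi), \qquad \lambda_{2k} \propto \int \underbrace{R^{-1}_{\lambda_2}(\xi)}_{\lambda_2} \Psi_k(\xi)\, w(\xi)\, d\xi$$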
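For a single parameter the inverse Rosenblatt map reduces to the inverse CDF, which makes the projection integral easy to sketch. The snippet below is a hedged 1D illustration, not the `pce_quad` implementation: it uses the empirical quantile function of the samples as the inverse map, probabilists' Hermite polynomials as Ψ_k, and synthetic Gaussian "posterior" samples so the result can be checked. The function name `project_samples_1d` is hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e

# Standard normal CDF built from math.erf to avoid extra dependencies.
std_normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def project_samples_1d(samples, order, nquad=8):
    """Project samples of one parameter onto a Gauss-Hermite PC basis."""
    xi, w = hermite_e.hermegauss(nquad)     # nodes/weights for weight exp(-xi^2 / 2)
    w = w / math.sqrt(2.0 * math.pi)        # normalize to the standard normal density
    # Inverse map R^{-1}(xi): push xi through the normal CDF, then through
    # the empirical quantile function of the samples.
    lam_at_xi = np.quantile(samples, std_normal_cdf(xi))
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        psi_k = hermite_e.hermeval(xi, np.eye(order + 1)[k])   # He_k(xi)
        coeffs[k] = np.sum(w * lam_at_xi * psi_k) / math.factorial(k)  # E[He_k^2] = k!
    return coeffs

# Synthetic stand-in for MCMC samples: Gaussian with mean -0.4 and std 0.1.
rng = np.random.default_rng(2)
samples = rng.normal(-0.4, 0.1, size=50000)
coeffs = project_samples_1d(samples, order=3)
print(coeffs)
```

For Gaussian samples the expansion should recover the mean in the zeroth coefficient and the standard deviation in the first, with higher-order coefficients near zero; non-Gaussian posteriors would populate the higher orders.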

Forward propagation uses similar commands as surrogate construction

• Use calibrated uncertainty on $\lambda_1$, $\lambda_2$
• Keep prior uncertainty N(0, 0.3) on $\lambda_3$, $\lambda_4$, $\lambda_5$
• Forward models g(λ):
  • Gaussian: $g_1(\lambda) = f^G(\lambda)$
  • Exponential: $g_2(\lambda) = f^E(\lambda)$
  • Summation: $g_3(\lambda) = \sum_{i=1}^{d} \lambda_i$
• Quadrature Galerkin projection
  • generate_quad -d 5 -g HG -x full -p 4
  • Evaluate model at quadrature points
  • pce_resp -d 5 -x HG -o 3

Input calibration reduces output uncertainty

[Figure: prior and posterior PDFs of Model Output for the Gaussian and Exponential models]

• After calibration: output distributions narrow and shift

Input calibration reduces output uncertainty

[Figure: prior and posterior PDFs of Model Output for the Summation model]

• After calibration: output distributions narrow and shift

Attribution relates output uncertainties to specific inputs

[Figure: bar chart of sensitivity (0 to 1) for the Gaussian, Exponential, and Summation models, with bars for (λ1, λ2), λ3, λ4, and λ5]

• Attribution uses sensitivity analysis tools: pce_sens
• For Gaussian model: more data needed to further reduce input uncertainty in $\lambda_1$, $\lambda_2$
• For other two models, need to calibrate other inputs

Summary

• UQTk provides a powerful set of tools for building general UQ workflows
• Multiple ways to access functionality
  • Direct linking of C++ code
  • Standalone apps
  • Python interface based on Swig (UQTk version 3.0)
• Version 3.0 soon at http://www.sandia.gov/UQToolkit
• Do not hesitate to contact [email protected]

References

• B. Debusschere, H. Najm, P. Pébay, O. Knio, R. Ghanem and O. Le Maître, "Numerical Challenges in the Use of Polynomial Chaos Representations for Stochastic Processes", SIAM J. Sci. Comp., 26:2, 2004.
• K. Sargsyan, et al., "Dimensionality reduction for complex models via bayesian compressive sensing", Int. J. of Uncertainty Quantification, 4, 1:63-93, 2014.
• "Handbook of Uncertainty Quantification", R. Ghanem, D. Higdon, H. Owhadi (Eds.), Springer, 2016, http://www.springer.com/us/book/9783319123844
• UQ Tutorials: http://www.quest-scidac.org/outreach/tutorials/

Rosenblatt Transformation for Multi-D RVs

• Assume samples of multi-D RVs are available (e.g. from MCMC sampling of posterior parameter distribution)
• Rosenblatt transformation maps any (not necessarily independent) set of random variables $(\lambda_1, \ldots, \lambda_d)$ to uniform i.i.d.'s $\{\eta_i\}_{i=1}^{d}$ (Rosenblatt, 1952).

$$\eta_1 = F_1(\lambda_1)$$
$$\eta_2 = F_{2|1}(\lambda_2 \mid \lambda_1)$$
$$\vdots$$
$$\eta_d = F_{d|d-1,\ldots,1}(\lambda_d \mid \lambda_{d-1}, \ldots, \lambda_1)$$

• Rosenblatt transformation is a multi-D generalization of 1D CDF mapping.
• Conditional CDFs are harder to evaluate in high dimensions
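The chain of conditional CDFs can be illustrated on a case where they are known in closed form. The sketch below applies the transformation to a correlated bivariate Gaussian, chosen purely as an assumption so that no density estimation is needed; in practice the conditional CDFs would have to be estimated from the samples. The function name `rosenblatt_gaussian_2d` and the correlation value are hypothetical.

```python
import math
import numpy as np

# Standard normal CDF built from math.erf to avoid extra dependencies.
std_normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def rosenblatt_gaussian_2d(lam, rho):
    """Rosenblatt map for a standard bivariate Gaussian with correlation rho.

    eta_1 = F_1(lam_1); eta_2 = F_{2|1}(lam_2 | lam_1), where
    lam_2 | lam_1 ~ N(rho * lam_1, 1 - rho^2).
    """
    eta1 = std_normal_cdf(lam[:, 0])
    cond_mean = rho * lam[:, 0]
    cond_std = math.sqrt(1.0 - rho**2)
    eta2 = std_normal_cdf((lam[:, 1] - cond_mean) / cond_std)
    return np.column_stack([eta1, eta2])

# Correlated samples standing in for a set of posterior samples.
rng = np.random.default_rng(1)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
lam = rng.multivariate_normal(np.zeros(2), cov, size=20000)

eta = rosenblatt_gaussian_2d(lam, rho)
print("input correlation: ", np.corrcoef(lam.T)[0, 1])
print("output correlation:", np.corrcoef(eta.T)[0, 1])
```

The inputs are strongly correlated while the outputs are close to uncorrelated and uniform on [0, 1], which is the property the transformation is used for.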
