Parallel Adaptive and Robust Algorithms for the Bayesian Analysis of Mathematical Models Under Uncertainty
Ernesto Esteves Prudencio¹ and Sai Hung Cheung²

1 - Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
2 - School of Civil and Environmental Engineering, Nanyang Technological University, Singapore
SIAM PP12, Savannah, GA, February 17, 2012, 3:30 PM
Prudencio and Cheung Parallel Adaptive Multilevel Sampling SIAM PP12, Savannah, Feb. 17 1 / 34
Acknowledgement: Research Sponsors
NNSA-DOE, Predictive Science Academic Alliance Programs (PSAAP)
KAUST, Academic Excellence Alliance (AEA) Program
Outline
1 Motivation
2 Computational Tasks
3 ML Algorithm
4 Final Remarks
Motivation
1. Motivation
Treatment of Mathematical Models under Uncertainty
We need to calibrate, predict and validate under uncertainty
Uncertainties:
• Boundary and initial conditions, geometry
• Values of physical parameters
• Structure of equations (model inadequacy)
• Experimental data
PECOS Center: Atmospheric Entry Vehicles
Decision maker: what is the probability of failure?
A quantity of interest: TPS recession rate at peak heating
Model: fluid dynamics, thermochemistry, radiation, turbulence, ablation
Bayesian Model Analysis
Bayes Theorem:
π(θ|D) = f(D|θ) π(θ) / π(D) = f(D|θ) π(θ) / ∫ f(D|θ) π(θ) dθ

(posterior = likelihood × prior / evidence)
Each instance of θ yields one (deterministic or stochastic) model
Example form of likelihood:
ln f(D|θ) ∝ −(1/2) [y(θ) − d]ᵀ C⁻¹ [y(θ) − d]

C = σ² I ⇒ ln f(D|θ) ∝ −‖y(θ) − d‖² / (2σ²)
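As a concrete illustration of the C = σ²I misfit form above, the log-likelihood can be sketched as follows. The linear model y(θ) = θx and the data values are invented for this example, not from the talk:

```python
import numpy as np

def log_likelihood(y_theta, d, sigma2):
    """Gaussian misfit log-likelihood (up to an additive constant):
    ln f(D|theta) ∝ -||y(theta) - d||^2 / (2 sigma^2),
    i.e. the C = sigma^2 * I special case of the slide's formula."""
    r = np.asarray(y_theta) - np.asarray(d)
    return -0.5 * r.dot(r) / sigma2

# Hypothetical model y(theta) = theta * x at three observation points,
# with synthetic data d and noise variance sigma^2 = 0.01.
x = np.array([0.0, 1.0, 2.0])
d = np.array([0.1, 1.1, 1.9])
print(log_likelihood(2.0 * x, d, sigma2=0.01))
```

Evaluating this at many θ values is exactly the expensive step that the sampling algorithms later in the talk try to parallelize.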
Case 1: Just One Candidate Model is Available
Calibrate Predict
Motivation for samples
Case 2: Many Candidate Models are Available
Motivation for samples and for model ranking
The Concepts of “Model Class” and “Model Evidence”
Model class M1 = set of all models corresponding to all possible θ
• = mathematical equations + all assumptions supporting them;
• = a hypothesis, a collection of statements that allows the definition of π(θ) and f(D|θ).
π(θ1|D,M1) = f(D|θ1,M1) π(θ1|M1) / π(D|M1) = f(D|θ1,M1) π(θ1|M1) / ∫ f(D|θ1,M1) π(θ1|M1) dθ1
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Model evidence = probability of obtaining D given some hypothesis M1
π(D|M1) = ∫ f(D|θ1,M1) π(θ1|M1) dθ1

(evidence = ∫ likelihood × prior dθ1)
Plausibility of a Model Class in a Set of Candidates
Different assumptions, equations, parameters ⇒ different model class
M = {M1,M2, . . . ,Mm}
Bayes theorem at model class level, with the discrete setM of candidates:
p(Mj|D,M) = π(D|Mj) p(Mj|M) / π(D|M) = π(D|Mj) p(Mj|M) / Σ_{j=1}^m π(D|Mj) p(Mj|M)

(posterior plausibility = evidence × prior plausibility / normalization)

Property: Σ_{j=1}^m p(Mj|D,M) = 1.
Comparing Bayesian Inference Formulas

Intra Model Class:
π(θj|D,Mj) = f(D|θj,Mj) π(θj|Mj) / π(D|Mj) = f(D|θj,Mj) π(θj|Mj) / ∫ f(D|θj,Mj) π(θj|Mj) dθj

(posterior prob. = likelihood × prior probability / evidence)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Inter Model Classes:
p(Mj|D,M) = π(D|Mj) p(Mj|M) / π(D|M) = π(D|Mj) p(Mj|M) / Σ_{j=1}^m π(D|Mj) p(Mj|M)

(posterior plausibility = evidence × prior plausibility / normalization)
Example of Model Evidence Calculations
j    π(D|Mj)       p(Mj|M)    p(Mj|D,M)
1    1.6 × 10⁻³    ≈ 33%      ≈ 7%
2    6.4 × 10⁻³    ≈ 33%      ≈ 26%
3    1.6 × 10⁻²    ≈ 33%      ≈ 67%
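The plausibilities in this example follow directly from the inter-model-class formula; a minimal sketch in plain Python (the function name is illustrative):

```python
def plausibilities(evidences, priors):
    """Bayes theorem at the model-class level:
    p(Mj|D,M) = pi(D|Mj) p(Mj|M) / sum_j pi(D|Mj) p(Mj|M)."""
    weighted = [e * p for e, p in zip(evidences, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Numbers from the table: three model classes, uniform prior plausibility.
post = plausibilities([1.6e-3, 6.4e-3, 1.6e-2], [1/3, 1/3, 1/3])
print([f"{p:.0%}" for p in post])
```

Exactly, the values are 1/15 ≈ 6.7%, 4/15 ≈ 26.7%, and 2/3 ≈ 66.7%, matching the rounded percentages shown.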
Computational Tasks
2. Computational Tasks
Two Computational Tasks
• Generate samples of the posterior π(θ|D) in order to forward-propagate uncertainty and compute QoI random variables
• Compute model evidence π(D|M) =∫f(D|θ,M) π(θ|M) dθ
Possible Algorithms
• Metropolis-Hastings (MCMC): generates samples from f(D|θ,M) π(θ|M)

• Monte Carlo:

∫ f(D|θ,M) π(θ|M) dθ ≈ (1/N) Σ_{i=1}^N f(D|θ(i),M),

with the θ(i) sampled from the prior π(θ|M)
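The plain Monte Carlo evidence estimate above can be sketched in a few lines; the toy prior and likelihood below are illustrative choices, not from the talk:

```python
import random, math

def mc_evidence(log_likelihood, sample_prior, N=100_000):
    """Plain Monte Carlo estimate of the evidence:
    pi(D|M) ≈ (1/N) * sum_i f(D|theta_i, M), theta_i drawn from the prior."""
    return sum(math.exp(log_likelihood(sample_prior())) for _ in range(N)) / N

# Toy 1D check: prior = Uniform(0, 1), likelihood f(D|theta) = 2*theta,
# so the exact evidence is the integral of 2*theta over [0, 1], which is 1.
random.seed(0)
est = mc_evidence(lambda th: math.log(2.0 * th) if th > 0 else -math.inf,
                  random.random, N=200_000)
print(est)
```

This estimator is unbiased but degrades badly when the likelihood is concentrated relative to the prior, which is precisely what motivates the multilevel scheme of the next section.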
Unimodal Distributions: “Easy”
Multimodal Distributions: Not Necessarily Complicated
Multimodal Distributions: Possibly Complicated
ML Algorithm
3. ML Algorithm
Main Idea
For l = 0, 1, . . . , L (with L > 1), sample

π_target^(l)(θ) = f^{τl}(D|θ) × π_prior(θ),

with 0 = τ0 < τ1 < . . . < τ_{L−1} < τL = 1.
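In log space the level-l target is simply the log-prior plus a fraction τl of the log-likelihood; a minimal sketch with hypothetical Gaussian ingredients:

```python
def log_tempered_target(theta, tau, log_likelihood, log_prior):
    """Level-l intermediate target of the multilevel scheme:
    pi_target^(l)(theta) = f^{tau_l}(D|theta) * pi_prior(theta),
    evaluated in log space for numerical robustness."""
    return tau * log_likelihood(theta) + log_prior(theta)

# tau = 0 gives the prior; tau = 1 gives the (unnormalized) posterior.
loglik = lambda th: -0.5 * (th - 3.0) ** 2   # hypothetical Gaussian misfit
logpri = lambda th: -0.5 * th ** 2           # hypothetical N(0,1) prior (unnormalized)
print(log_tempered_target(1.0, 0.0, loglik, logpri))   # equals logpri(1.0) = -0.5
```

Raising τ gradually deforms the easy-to-sample prior into the possibly multimodal posterior, which is why intermediate levels help with the complicated distributions shown earlier.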
Example of Last Level
Illustration on Different Levels (Exponents)
Main Idea in More Detail
∫ f(θ) π(θ) dθ = ∫ f π dθ

= ∫ f^{(1−τ_{L−1})} f^{(τ_{L−1}−τ_{L−2})} . . . f^{(τ2−τ1)} f^{τ1} π dθ

= c1 ∫ f^{(1−τ_{L−1})} f^{(τ_{L−1}−τ_{L−2})} . . . f^{(τ2−τ1)} (f^{τ1} π / c1) dθ

= c2 c1 ∫ f^{(1−τ_{L−1})} f^{(τ_{L−1}−τ_{L−2})} . . . (f^{(τ2−τ1)} f^{τ1} π / (c2 c1)) dθ

= . . .

= cL c_{L−1} . . . c2 c1
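The telescoping identity can be checked numerically on a 1D grid. Writing Z(τ) = ∫ f^τ π dθ, each level constant is cl = Z(τl)/Z(τ_{l−1}), so their product telescopes to Z(1)/Z(0) = ∫ f π dθ, the evidence, since the prior integrates to 1. The prior, likelihood, and τ schedule below are illustrative choices, not from the talk:

```python
import numpy as np

# 1D grid check of the telescoping product of level constants.
theta = np.linspace(-10.0, 10.0, 20001)
dtheta = theta[1] - theta[0]
prior = np.exp(-0.5 * theta**2) / np.sqrt(2.0 * np.pi)  # N(0,1) prior
f = np.exp(-0.5 * (theta - 2.0)**2)                     # unnormalized likelihood

def Z(tau):
    """Z(tau) = integral of f^tau * prior, via a Riemann sum."""
    return (f**tau * prior).sum() * dtheta

taus = [0.0, 0.1, 0.3, 0.6, 1.0]                        # example schedule
cs = [Z(taus[l]) / Z(taus[l - 1]) for l in range(1, len(taus))]
print(np.prod(cs), Z(1.0))   # the two values agree: product of c_l = evidence
```

For this Gaussian pair the exact evidence is e⁻¹/√2 ≈ 0.260, independent of the schedule, exactly as the derivation above requires.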
ML Algorithm Overview
• Set l = 0, τl = 0
• Sample prior distribution
• While τl < 1 do {
    • Begin next level: set l ← l + 1
    • Compute τl
    • Select, from previous level, initial positions for Markov chains
    • Compute sizes of chains
    • Generate chains
    • Compute cl
• }
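The loop above can be sketched in serial Python. This is a structural sketch only: the τ schedule is fixed here while the real algorithm chooses each τl adaptively, the chain-size rule and MCMC kernel are plain placeholders, and all names are illustrative:

```python
import math, random

def mh_chain(theta0, log_target, n, step=0.5):
    """Random-walk Metropolis chain of length n targeting exp(log_target)."""
    chain, theta, lt = [], theta0, log_target(theta0)
    for _ in range(n):
        prop = theta + random.gauss(0.0, step)
        lp = log_target(prop)
        if math.log(random.random()) < lp - lt:
            theta, lt = prop, lp
        chain.append(theta)
    return chain

def multilevel(log_likelihood, log_prior, sample_prior,
               taus=(0.25, 0.5, 0.75, 1.0), n=500, n_chains=10):
    """Sketch of the ML loop with a fixed tau schedule."""
    samples = [sample_prior() for _ in range(n)]      # level 0: prior samples
    tau_prev, evidence = 0.0, 1.0
    for tau in taus:                                  # begin next level
        # weight previous-level samples by the likelihood increment;
        # the mean weight is also the estimate of c_l
        w = [math.exp((tau - tau_prev) * log_likelihood(th)) for th in samples]
        evidence *= sum(w) / len(w)
        # select chain initial positions from the previous level
        starts = random.choices(samples, weights=w, k=n_chains)
        target = lambda th, t=tau: t * log_likelihood(th) + log_prior(th)
        samples = [s for th0 in starts
                   for s in mh_chain(th0, target, n // n_chains)]
        tau_prev = tau
    return samples, evidence

# Hypothetical 1D case: prior N(0,1), likelihood exp(-(theta-2)^2/2);
# the exact posterior is N(1, 1/2) and the exact evidence is ~0.26.
random.seed(42)
post, ev = multilevel(lambda th: -0.5 * (th - 2.0) ** 2,
                      lambda th: -0.5 * th ** 2,
                      lambda: random.gauss(0.0, 1.0))
print(sum(post) / len(post), ev)
```

In the parallel version discussed next, the chains within one level run concurrently on MPI nodes, which is where load balancing becomes an issue.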
Chances for Load Unbalancing
The “good” samples from a level serve as initial positions for the next level.
“Luckier” MPI nodes, with more “good” samples, will generate more samples in the next level.
Cumulative effect is clear (e.g. a case of “unbalancing ratio” = 29).
ML Algorithm with Load Balancing
• Set l = 0, τl = 0
• Sample prior distribution
• While τl < 1 do {
    • Begin next level: set l ← l + 1
    • Compute τl
    • Select, from previous level, initial positions for Markov chains
    • Compute sizes of chains
    • Redistribute chain initial positions among MPI nodes
    • Generate chains
    • Compute cl
• }
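The extra redistribution step can be sketched as a greedy scheduling problem: assign each planned chain (its size standing in for its cost) to the currently least-loaded node, largest chains first. This is only an illustration of the idea; a real implementation would also move the corresponding initial positions between MPI nodes:

```python
def redistribute(chain_sizes, n_nodes):
    """Greedy longest-processing-time assignment of chains to nodes.
    Returns, per node, the indices of its chains, plus the node loads."""
    loads = [0] * n_nodes
    assignment = [[] for _ in range(n_nodes)]
    for i, size in sorted(enumerate(chain_sizes), key=lambda t: -t[1]):
        j = loads.index(min(loads))   # least-loaded node so far
        assignment[j].append(i)
        loads[j] += size
    return assignment, loads

# Eight chains of uneven sizes over four nodes balance exactly here:
assignment, loads = redistribute([8, 7, 6, 5, 4, 3, 2, 1], 4)
print(loads)   # [9, 9, 9, 9]
```

Without redistribution, a node keeping only its own “good” samples could end up with many more chain steps than its neighbors, which is the unbalancing effect quantified by the ratio b on the next slide.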
(Schematic) Potential Work Balancing Issues
b = (maximum total computational work) / (minimum total computational work), among all processors
Results with 1D Problem
8 processors 64 processors
Results with 10D Problem
8 processors 64 processors
Final Remarks
4. Final Remarks
Many UQ Research Challenges Beyond Load Balancing
• Statistical robustness
• Fault tolerance (Karl Schulz)
• Computational cost
• Convergence
• Various models: turbulence, thermochemistry, peridynamics, earthquakes, tumor growth
Thank you!