
Concepts in Global Sensitivity Analysis
IMA UQ Short Course, June 23, 2015

A good reference is Global Sensitivity Analysis: The Primer, Saltelli et al. (2008).

WARNING: These slides are meant to complement the oral presentation in the short course. Use out of context at your own risk.

Paul Constantine Colorado School of Mines inside.mines.edu/~pconstan activesubspaces.org @DrPaulynomial

http://www.sfu.ca/~ssurjano/index.html

Von Neumann, John, and Herman H. Goldstine. "Numerical inverting of matrices of high order." Bulletin of the American Mathematical Society 53.11 (1947): 1021-1099.

•  What kinds of science/engineering models do you care about?
•  Do you have a simulation that you trust? What are the inputs and outputs?
•  How would you characterize the uncertainty in the inputs? In other words, what do you know about the unknown inputs?
•  What question are you trying to answer with your model?

f(x)

x

The input x:
•  Finite-dimensional vector
•  Independent components
•  Centered and scaled to remove units (see the normalization sketch below)
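For example, a minimal normalization sketch (my addition; the bounds are made up) that maps each physical input to [-1, 1]:

```python
import numpy as np

# Hypothetical bounds for two physical inputs (assumed for illustration).
lb = np.array([300.0, 1.0e-3])   # lower bounds
ub = np.array([400.0, 5.0e-3])   # upper bounds

def to_unit(x):
    """Center and scale physical inputs to the cube [-1, 1]^m."""
    return 2.0 * (x - lb) / (ub - lb) - 1.0

def from_unit(u):
    """Map normalized inputs back to physical units."""
    return lb + (ub - lb) * (u + 1.0) / 2.0
```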

[Figure: two time-series plots of Response (-1 to 1) versus Time (0 to 2), comparing "Perturbation 1" and "Perturbation 2" against a common "Baseline."]

Metric                      Perturbation 1   Perturbation 2
2-norm difference           20.5             31.6
infinity-norm difference    2.0              1.8
difference at final time    0.0              0.0

[Figure: the baseline Response versus Time again.]

Which perturbation shows the largest change?

The output f is:
•  Scalar-valued
•  “Smooth”
•  No “noise!”

Sensitivity analysis seeks to identify the most important parameters.

•  What are the most important parameters in your model?

•  What are the least important parameters?

•  What does it mean for a parameter to be important?

$$
\frac{\partial f}{\partial x_i}(x)
$$

Derivatives measure local sensitivity. But we want something global.

Some Global Sensitivity Metrics
1.  Morris’ elementary effects
2.  Sobol sensitivity indices
3.  Mean (squared) derivatives
4.  Active subspaces

Morris’ Elementary Effects (Like bad approximations to average derivatives)

Elementary effect:
$$
EE_{ij}(h) = \frac{f(x_j + h\, e_i) - f(x_j)}{h}
$$

Step size on a p-level grid:
$$
h \in \left\{ \frac{2n}{p-1} \,:\, n = 1, \dots, p-1 \right\}
$$

[Figure: a p-level grid in the (x_1, x_2) plane.]

Averaged over N base points:
$$
\mu_i(h) = \frac{1}{N} \sum_{j=1}^{N} EE_{ij}(h), \qquad
\mu_i^*(h) = \frac{1}{N} \sum_{j=1}^{N} \left| EE_{ij}(h) \right|
$$
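A minimal sketch of these estimators (my addition; base points are drawn uniformly at random rather than from Morris’ full trajectory design on the p-level grid):

```python
import numpy as np

def elementary_effects(f, m, N, h, rng=np.random.default_rng(0)):
    """Estimate Morris' mu_i(h) and mu*_i(h) for f on [-1, 1]^m.

    Simplified sketch: random base points instead of a trajectory design.
    """
    mu = np.zeros(m)
    mu_star = np.zeros(m)
    for _ in range(N):
        x = rng.uniform(-1.0, 1.0 - h, size=m)  # keep x + h*e_i inside the domain
        fx = f(x)
        for i in range(m):
            xp = x.copy()
            xp[i] += h
            ee = (f(xp) - fx) / h         # elementary effect EE_ij(h)
            mu[i] += ee / N               # signed mean
            mu_star[i] += abs(ee) / N     # mean of absolute values
    return mu, mu_star

# Example: the second input matters more than the first.
mu, mu_star = elementary_effects(
    lambda x: np.exp(0.3 * x[0] + 0.7 * x[1]), m=2, N=100, h=0.1)
print(mu, mu_star)
```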

Sensitivity indices

Variance-based decompositions:
$$
f(x) = f_0
+ \sum_{i=1}^{m} f_i(x_i)
+ \sum_{i=1}^{m} \sum_{j>i}^{m} f_{i,j}(x_i, x_j)
+ \cdots
+ f_{1,\dots,m}(x_1, \dots, x_m)
$$
(a constant, plus functions of one variable, plus functions of two variables, and so on with functions of 3, 4, … variables, up to a function of all m variables)

The terms are defined via conditional expectations,
$$
f_0 = \mathbb{E}[f], \qquad
f_i = \mathbb{E}[f \,|\, x_i] - f_0, \qquad
f_{i,j} = \mathbb{E}[f \,|\, x_i, x_j] - f_i - f_j - f_0, \qquad
\dots, \qquad
f_{1,\dots,m} = f(x) - \text{“everything else”}
$$
and these functions are orthogonal, which gives a decomposition of variance:
$$
\mathrm{Var}[f] = \sum_i \mathrm{Var}[f_i] + \sum_{i,j} \mathrm{Var}[f_{i,j}] + \cdots + \mathrm{Var}[f_{1,\dots,m}]
$$
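A small worked example (my addition, assuming each x_i independent and uniform on [-1, 1]): for f(x_1, x_2) = x_1 + x_2 + x_1 x_2,
$$
f_0 = 0, \qquad f_1 = x_1, \qquad f_2 = x_2, \qquad f_{1,2} = x_1 x_2,
$$
$$
\mathrm{Var}[f]
= \underbrace{\tfrac{1}{3}}_{\mathrm{Var}[f_1]}
+ \underbrace{\tfrac{1}{3}}_{\mathrm{Var}[f_2]}
+ \underbrace{\tfrac{1}{9}}_{\mathrm{Var}[f_{1,2}]}
= \tfrac{7}{9}.
$$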

Sobol indices:
$$
S_i = \frac{\mathrm{Var}[f_i]}{\mathrm{Var}[f]} \ \ \text{(first-order sensitivity index)}, \qquad
S_{i_1,\dots,i_k} = \frac{\mathrm{Var}[f_{i_1,\dots,i_k}]}{\mathrm{Var}[f]} \ \ \text{(interaction effects)}
$$
Total effects sum everything containing a given index (e.g., everything with a “1”):
$$
S_{T_1} = S_1 + S_{1,2} + S_{1,3} + S_{1,2,3}
$$
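A minimal Monte Carlo sketch (my addition; these are the standard Saltelli/Jansen pick-and-freeze estimators, not anything specific to these slides, assuming inputs uniform on [-1, 1]^m):

```python
import numpy as np

def sobol_indices(f, m, N, rng=np.random.default_rng(0)):
    """Pick-and-freeze Monte Carlo estimates of first-order (S_i)
    and total (S_Ti) Sobol indices for f on [-1, 1]^m."""
    A = rng.uniform(-1.0, 1.0, size=(N, m))
    B = rng.uniform(-1.0, 1.0, size=(N, m))
    fA = np.apply_along_axis(f, 1, A)
    fB = np.apply_along_axis(f, 1, B)
    var = np.var(np.concatenate([fA, fB]))
    S, ST = np.zeros(m), np.zeros(m)
    for i in range(m):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                 # replace only the i-th column
        fABi = np.apply_along_axis(f, 1, ABi)
        S[i] = np.mean(fB * (fABi - fA)) / var          # Saltelli (2010)
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var   # Jansen (1999)
    return S, ST

S, ST = sobol_indices(lambda x: x[0] + x[1] + x[0] * x[1], m=2, N=100_000)
print(S, ST)
```

On the worked example above, this should return roughly S = [3/7, 3/7] and S_T = [4/7, 4/7], consistent with the hand calculation.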

(PAUL: Mention the relationship to polynomial chaos.)

Mean (squared) derivatives:
$$
\mathbb{E}\!\left[ \frac{\partial f}{\partial x_i} \right],
\qquad
\mathbb{E}\!\left[ \left( \frac{\partial f}{\partial x_i} \right)^{\!2} \right]
$$

Kucherenko et al., DGSM, RESS (2008)
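A minimal sketch of the mean-squared-derivative (DGSM) metric (my addition; central finite differences stand in for the gradient when none is available):

```python
import numpy as np

def dgsm(f, m, N, h=1e-5, rng=np.random.default_rng(0)):
    """Monte Carlo estimate of nu_i = E[(df/dx_i)^2] on [-1, 1]^m,
    using central finite differences for the partial derivatives."""
    nu = np.zeros(m)
    for _ in range(N):
        x = rng.uniform(-1.0 + h, 1.0 - h, size=m)
        for i in range(m):
            xp, xm = x.copy(), x.copy()
            xp[i] += h
            xm[i] -= h
            nu[i] += ((f(xp) - f(xm)) / (2 * h)) ** 2 / N
    return nu

print(dgsm(lambda x: np.exp(0.3 * x[0] + 0.7 * x[1]), m=2, N=2000))
```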

Let’s play!

Think of an interesting bivariate function.

Estimating with Monte Carlo is noisy.

[Figure: log-log plot of Monte Carlo error (10^-5 to 10^0) versus number of samples (10^2 to 10^6).]

What is it good for?

•  Sensitivity metrics can be hard to interpret if not zero.
•  May provide or confirm understanding.
•  Lots of ideas for using them as weights for anisotropic approximation schemes.
•  Would like to use them to reduce the dimension.

AUDIENCE POLL

How many dimensions is “high” dimensions?

APPROXIMATION: $f(x) \approx \tilde{f}(x)$

OPTIMIZATION: $\underset{x}{\mathrm{minimize}}\ f(x)$

INTEGRATION: $\int f(x)\, \rho \, dx$

The curse of dimensionality, at 10 points per dimension and 1 second per evaluation:

Dimension   Points       Time
1           10           10 s
2           100          ~1.6 min
3           1,000        ~16 min
4           10,000       ~2.7 hours
5           100,000      ~1.1 days
6           1,000,000    ~1.6 weeks
…           …            …
20          1e20         3 trillion years (240x age of the universe)

The same table motivates three responses: “Reduced order models,” “Better designs,” and “Dimension reduction.”

$$
f(x_1, x_2) = \exp(0.7\, x_1 + 0.3\, x_2)
$$

[Figure: contours of f; the “direction of change” is along (0.7, 0.3), and the orthogonal direction is flat.]

bookstore.siam.org/sl02/

$27

Coupon code: BKSL15

DEFINE the active subspace.

Consider a function and its gradient vector,
$$
f = f(x), \qquad x \in \mathbb{R}^m, \qquad \nabla_x f(x) \in \mathbb{R}^m, \qquad \rho : \mathbb{R}^m \to \mathbb{R}_+
$$

The average outer product of the gradient and its eigendecomposition,
$$
C = \int (\nabla_x f)(\nabla_x f)^T \rho \, dx = W \Lambda W^T
$$

Partition the eigendecomposition,
$$
\Lambda = \begin{bmatrix} \Lambda_1 & \\ & \Lambda_2 \end{bmatrix}, \qquad
W = \begin{bmatrix} W_1 & W_2 \end{bmatrix}, \qquad
W_1 \in \mathbb{R}^{m \times n}
$$

Rotate and separate the coordinates,
$$
x = W W^T x = W_1 W_1^T x + W_2 W_2^T x = W_1 y + W_2 z
$$
(y: active variables, z: inactive variables)
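For the bivariate example from a few slides back, C can be written down by hand (my addition):
$$
f(x) = \exp(0.7\, x_1 + 0.3\, x_2)
\ \Rightarrow\
\nabla_x f = \begin{bmatrix} 0.7 \\ 0.3 \end{bmatrix} f(x)
\ \Rightarrow\
C = \begin{bmatrix} 0.7 \\ 0.3 \end{bmatrix} \begin{bmatrix} 0.7 & 0.3 \end{bmatrix} \int f(x)^2 \rho \, dx.
$$
So C has rank one: the lone eigenvector $w_1 \propto (0.7, 0.3)^T$ is the direction of change, and $\lambda_2 = 0$ in the orthogonal, flat direction.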

$$
\int x\, x^T \rho \, dx
\qquad \text{VS.} \qquad
\int \nabla_x f \, \nabla_x f^T \rho \, dx
$$

The eigenvectors indicate perturbations that change the function more, on average.

LEMMA 1:
$$
\lambda_i = \int \left( (\nabla_x f)^T w_i \right)^2 \rho \, dx, \qquad i = 1, \dots, m
$$

LEMMA 2:
$$
\int (\nabla_y f)^T (\nabla_y f)\, \rho \, dx = \lambda_1 + \cdots + \lambda_n, \qquad
\int (\nabla_z f)^T (\nabla_z f)\, \rho \, dx = \lambda_{n+1} + \cdots + \lambda_m
$$

DISCOVER the active subspace with random sampling.

Draw samples: $x_j \sim \rho$

Compute: $f_j = f(x_j)$ and $\nabla_x f_j = \nabla_x f(x_j)$

Approximate with Monte Carlo:
$$
C \approx \frac{1}{N} \sum_{j=1}^{N} \nabla_x f_j \, \nabla_x f_j^T = \hat{W} \hat{\Lambda} \hat{W}^T
$$

Equivalent to the SVD of the samples of the gradient:
$$
\frac{1}{\sqrt{N}} \begin{bmatrix} \nabla_x f_1 & \cdots & \nabla_x f_N \end{bmatrix} = \hat{W} \sqrt{\hat{\Lambda}}\, V^T
$$

Low-rank approximation of the collection of gradients:
$$
\frac{1}{\sqrt{N}} \begin{bmatrix} \nabla_x f_1 & \cdots & \nabla_x f_N \end{bmatrix} \approx \hat{W}_1 \sqrt{\hat{\Lambda}_1}\, V_1^T
$$

Called an active subspace method in T. Russi’s 2010 Ph.D. thesis, Uncertainty Quantification with Experimental Data in Complex System Models.
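A minimal sketch of the discovery step (my addition; `grad_f` is a stand-in for whatever gradient routine, adjoint, or finite-difference approximation you have, and ρ is taken uniform on the cube):

```python
import numpy as np

def discover_active_subspace(grad_f, m, N, rng=np.random.default_rng(0)):
    """Estimate the eigenpairs of C from N sampled gradients via the SVD."""
    X = rng.uniform(-1.0, 1.0, size=(N, m))   # x_j ~ rho (uniform here)
    G = np.stack([grad_f(x) for x in X])      # row j is grad f(x_j)^T
    # SVD of (1/sqrt(N)) [grad f_1 ... grad f_N]: squared singular values
    # estimate the eigenvalues; left singular vectors the eigenvectors.
    W, s, _ = np.linalg.svd(G.T / np.sqrt(N), full_matrices=False)
    return s**2, W                            # (lambda_hat, W_hat)

# Example: f(x) = exp(0.7 x1 + 0.3 x2) has a one-dimensional active subspace.
grad_f = lambda x: np.array([0.7, 0.3]) * np.exp(0.7 * x[0] + 0.3 * x[1])
lam, W = discover_active_subspace(grad_f, m=2, N=500)
print(lam)       # second eigenvalue is zero to machine precision
print(W[:, 0])   # proportional to [0.7, 0.3] (up to sign)
```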

Let’s be abundantly clear about the problem we are trying to solve.

✖  Low-rank approximation of the collection of gradients
✖  Low-dimensional linear approximation of the gradient: $\nabla f(x) \approx W_1\, a(x)$
✔  Approximate a function of many variables by a function of a few linear combinations of the variables:
$$
f(x) \approx g\!\left( W_1^T x \right)
$$

How do you construct g?

What is the approximation error?

What is the effect of the approximate eigenvectors?

[ Show them the animation! ]

EXPLOIT active subspaces for response surfaces with conditional averaging.

Define the conditional expectation:
$$
g(y) = \int f(W_1 y + W_2 z)\, \rho(z \,|\, y)\, dz, \qquad f(x) \approx g\!\left(W_1^T x\right)
$$

THEOREM:
$$
\left( \int \left( f(x) - g(W_1^T x) \right)^2 \rho \, dx \right)^{\frac{1}{2}}
\le C_P \left( \lambda_{n+1} + \cdots + \lambda_m \right)^{\frac{1}{2}}
$$

Define the Monte Carlo approximation:
$$
\hat{g}(y) = \frac{1}{N} \sum_{i=1}^{N} f(W_1 y + W_2 z_i), \qquad z_i \sim \rho(z \,|\, y)
$$

THEOREM:
$$
\left( \int \left( f(x) - \hat{g}(W_1^T x) \right)^2 \rho \, dx \right)^{\frac{1}{2}}
\le C_P \left( 1 + N^{-\frac{1}{2}} \right) \left( \lambda_{n+1} + \cdots + \lambda_m \right)^{\frac{1}{2}}
$$

Define the subspace error:
$$
\varepsilon = \mathrm{dist}\!\left( W_1, \hat{W}_1 \right)
$$

THEOREM:
$$
\left( \int \left( f(x) - \hat{g}(\hat{W}_1^T x) \right)^2 \rho \, dx \right)^{\frac{1}{2}}
\le C_P \left( \varepsilon \left( \lambda_1 + \cdots + \lambda_n \right)^{\frac{1}{2}} + \left( \lambda_{n+1} + \cdots + \lambda_m \right)^{\frac{1}{2}} \right)
$$

(ε: subspace error; λ₁, …, λₙ: eigenvalues for the active variables; λₙ₊₁, …, λₘ: eigenvalues for the inactive variables)

THE BIG IDEA

1.  Choose points in the domain of g.

2.  Estimate conditional averages at each point.

3.  Construct the approximation in n < m dimensions.
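A minimal end-to-end sketch of these three steps (my addition; the rejection sampler and the cubic fit are illustrative choices, assuming a uniform ρ on [-1, 1]^m and the running bivariate example):

```python
import numpy as np

def conditional_average(f, W1, W2, y, N, rng=np.random.default_rng(0)):
    """Estimate g(y) = E[f(W1 y + W2 z) | y] by rejection: draw z until
    x = W1 y + W2 z lands inside the hypercube [-1, 1]^m, N times."""
    m = W1.shape[0]
    vals = []
    while len(vals) < N:
        z = rng.uniform(-np.sqrt(m), np.sqrt(m), size=W2.shape[1])
        x = W1 @ y + W2 @ z
        if np.all(np.abs(x) <= 1.0):
            vals.append(f(x))
    return np.mean(vals)

# THE BIG IDEA on the running example, with the known active direction:
f = lambda x: np.exp(0.7 * x[0] + 0.3 * x[1])
w1 = np.array([0.7, 0.3]) / np.hypot(0.7, 0.3)
W1 = w1.reshape(2, 1)
W2 = np.array([[-w1[1]], [w1[0]]])               # orthogonal complement

ys = np.linspace(-0.5, 0.5, 9)                   # 1. points in the domain of g
g_vals = [conditional_average(f, W1, W2, np.array([y]), N=200) for y in ys]
coeffs = np.polyfit(ys, g_vals, deg=3)           # 3. fit a 1-d response surface
```

Step 2 is the expensive part: each conditional average costs N runs of the simulation, but the fit in step 3 lives in n = 1 dimension instead of m.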

There’s an active subspace in this parameterized PDE.

Two-d Poisson with 100-term Karhunen-Loeve coefficients:
$$
-\nabla \cdot (a \nabla u) = 1, \ x \in D, \qquad
u = 0, \ x \in \Gamma_1, \qquad
n \cdot a \nabla u = 0, \ x \in \Gamma_2
$$

DIMENSION REDUCTION: 100 to 1

[Figure: estimated eigenvalues (10^-13 to 10^-6) versus index (1 to 6), with legend entries “Est” and “BI,” and subspace distance (10^-2 to 10^0) versus subspace dimension (1 to 6).]

Two-d Poisson with 100-term Karhunen-Loeve coefficients:
$$
-\nabla \cdot (a \nabla u) = 1, \ x \in D, \qquad
u = 0, \ x \in \Gamma_1, \qquad
n \cdot a \nabla u = 0, \ x \in \Gamma_2
$$

[Figure: quantity of interest (0 to 3e-3) versus the active variable (-3 to 3).]

There’s an active subspace in this parameterized PDE. DIMENSION REDUCTION: 100 to 1.

Active subspaces can be sensitivity metrics.

[Figure: components of the first eigenvector (-0.4 to 1) versus index (0 to 100), for β = 0.01 (short correlation length) and β = 1 (long correlation length).]

Questions?

•  How do the active subspaces relate to the coordinate-based sensitivity metrics?

•  How does this relate to PCA/POD?

•  How many gradient samples do I need?

•  How new is all this?

Paul Constantine Colorado School of Mines

activesubspaces.org @DrPaulynomial
