accurate ABC Oliver Ratmann

Centre for Outbreak Analysis and Modelling

Statistical modeling of summary values leads to accurate Approximate Bayesian Computations

Oliver Ratmann (Imperial College London, UK)
Anton Camacho (London School of Hygiene & Tropical Medicine, UK)
Adam Meijer (National Institute of the Environment & Public Health, NL)
Gé Donker (Netherlands Institute for Health Services Research, NL)

Thursday, 30 May 13

Noisy ABC

[Figure: n-ABC estimate of $\pi_\tau(\sigma^2|x)$ against $\sigma^2$ for n=60 with naive tolerances $\tau^-=0.35$, $\tau^+=1.65$, shown together with $\pi(\sigma^2|x)$ and $\arg\max_{\sigma^2}\pi(\sigma^2|x)$.]

[Figure: number of n-ABC repetitions against the estimated mean of $\sigma^2$, using $S^2(y)-S^2(x)$ with $[c^-,c^+]=[-0.5,0.5]$, $[-0.3,0.3]$ and $[-0.1,0.1]$.]

Accurate ABC

[Figure: n-ABC estimate of $\pi_\tau(\sigma^2|x)$ against $\sigma^2$ for n=60 with calibrated tolerances $\tau^-=0.572$, $\tau^+=1.808$ and m=97, shown together with $\pi(\sigma^2|x)$ and $\arg\max_{\sigma^2}\pi(\sigma^2|x)$.]

Can we construct ABC such that inference is accurate
• with respect to the posterior mean / MAP / $\theta_0$
• with respect to the posterior variance

If yes, under which conditions?

How general are these?

Accurate ABC - overview

Three elements for accurate ABC

1. m simulated and n observed data points on the summary level, the “summary values” ➣ can model their distribution, e.g.
   $s_{1:n}(x) \sim N(\mu_x, \sigma_x^2)$, $s_{1:m}(y) \sim N(\mu(\theta), \sigma^2(\theta))$

2. classification on auxiliary space ➣ given $s_{1:n}(x)$, $s_{1:m}(y)$, is the underlying $\rho = \mu(\theta) - \mu_x$ small?

3. indirect inference ➣ link auxiliary space back to original space

Summary values

m simulated and n observed data points on the summary level ➣ can model their distribution

[Figure panels: data; summary values; modeled distribution]

e.g. Normal, Exponential, Gamma, Chi-Square; or after a data transformation, e.g. Log-Normal

Sufficient statistics available on auxiliary space

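Constructing $\rho$-space

modeling summary values defines an auxiliary probability space

obs: $s_{1:n}(x) \sim N(\mu_x, \sigma_x^2)$
sim: $s_{1:m}(y) \sim N(\mu(\theta), \sigma^2(\theta))$
population error: $\rho = \mu(\theta) - \mu_x$

Link function, from the D original parameters $\theta = (\theta_1, \dots, \theta_D)$ to the K error parameters $\rho = (\rho_1, \dots, \rho_K)$:

$L : \Theta \subset \mathbb{R}^D \to \mathbb{R}^K$, $\theta \mapsto (\rho_1, \dots, \rho_K)$

where each $\rho_k$ is a function of $(\nu_{xk}, \nu_k(\theta))$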

Indirect inference on $\rho$-space

transform sufficiency problem into change of variable problem

true posterior:
$\pi(\theta|x) \propto \ell(x|\rho)\,\pi(\rho)\,|\partial L(\theta)| \propto \ell(S(x)|\rho)\,\pi(\rho)\,|\partial L(\theta)|$

ABC approximation on $\rho$-space is
$\pi_{abc}(\theta|x) \propto P_x(\text{ABC accept}|\rho)\,\pi(\rho)\,|\partial L(\theta)|$

match through calibration of ABC tolerances and m

[Figure: density against $\sigma^2$.]

Discussion wrt indirect inference (Gouriéroux 1993)
• difficulty in indirect inference: which auxiliary space to choose; here it is constructed empirically from the distribution of summary values
• MLE invariant under parameter transformation, so only need bijective $L$; for the posterior distribution, $|\partial L(\theta)|$ enters
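
As a worked instance of the change of variables, consider a one-parameter scale link of the kind used in the variance example, $L(\sigma^2) = \sigma^2/\sigma_x^2$ with $\sigma_x^2$ held fixed. This particular link is chosen here for illustration only and is not a general result.

```latex
\rho = L(\sigma^2) = \frac{\sigma^2}{\sigma_x^2},
\qquad
|\partial L(\sigma^2)| = \frac{1}{\sigma_x^2},
\qquad
\pi_{abc}(\sigma^2 \mid x)
  \;\propto\; P_x(\text{ABC accept} \mid \rho)\,\pi(\rho)\,\frac{1}{\sigma_x^2}
  \;\propto\; P_x(\text{ABC accept} \mid \rho)\,\pi(\rho).
```

In this special case the Jacobian is constant in $\sigma^2$, so a match achieved on $\rho$-space carries over to $\sigma^2$ unchanged.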

Accept/reject on $\rho$-space

interpret accept/reject as hypothesis testing procedure

$R = \{\, c^- \le T(s_{1:n}(x), s_{1:m}(y)) \le c^+ \,\}$

T-test
• objective: declare $\mu(\theta)$, $\mu_x$ unequal
• H0: $\mu(\theta)$, $\mu_x$ equal
• H1: $\mu(\theta)$, $\mu_x$ unequal
• rejection region: $(-\infty, c^-] \cup [c^+, \infty)$

ABC
• objective: declare $\mu(\theta)$, $\mu_x$ equal
• H0: $\mu(\theta)$, $\mu_x$ unequal
• H1: $\mu(\theta)$, $\mu_x$ equal
• rejection region: $[c^-, c^+]$

set $c^-$, $c^+$ such that $P(R \mid H_0) \le \alpha$

critical region depends on summary values

$\rho \mapsto P(R \mid \rho)$ is power function

power known, so we know ABC accept probability

data fixed, so one-sample two-sided test
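
The statement that a known power function gives the ABC accept probability can be checked numerically. The sketch below is illustrative only: it assumes normal summary values with a known standard deviation `sigma` (a z-type test rather than the T-test of the slide), treats the observed mean `x_mean` as fixed, and uses hypothetical function names that do not come from the authors' code.

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def accept_probability(rho, c_lo, c_hi, sigma, m):
    """Power P(R | rho) for T = mean(y) - x_mean, observed data held fixed.

    Assumption: y_i ~ N(x_mean + rho, sigma^2) with sigma known,
    so that T ~ N(rho, sigma^2 / m).
    """
    se = sigma / math.sqrt(m)
    return normal_cdf((c_hi - rho) / se) - normal_cdf((c_lo - rho) / se)


def abc_accepts(x_mean, y, c_lo, c_hi):
    """ABC accept step: the test statistic falls inside [c_lo, c_hi]."""
    t = sum(y) / len(y) - x_mean
    return c_lo <= t <= c_hi


def simulated_accept_rate(rho, c_lo, c_hi, sigma, m,
                          x_mean=0.0, reps=2000, seed=1):
    """Monte Carlo frequency of ABC acceptance at a given rho."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = [rng.gauss(x_mean + rho, sigma) for _ in range(m)]
        hits += abc_accepts(x_mean, y, c_lo, c_hi)
    return hits / reps


if __name__ == "__main__":
    args = dict(rho=0.1, c_lo=-0.3, c_hi=0.3, sigma=1.0, m=50)
    print("analytic :", accept_probability(**args))
    print("simulated:", simulated_accept_rate(**args))
```

The two printed numbers should agree up to Monte Carlo error, which is the sense in which the accept probability is available in closed form once the distribution of T is known.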

Example: test variance

for simplicity, summary values equal data

suppose
$x_{1:n} \sim N(0, \sigma_x^2)$, $y_{1:m} \sim N(0, \sigma^2)$

$\rho = \sigma^2/\sigma_x^2$, with point of equality $\rho^\star = 1$

tolerances on population level:
$H_0 : \rho \notin [\tau^-, \tau^+]$
$H_1 : \rho \in [\tau^-, \tau^+]$

then
$T = S^2(y_{1:m})/S^2(x_{1:n}) = \rho \, \frac{1}{n-1} \sum_{i=1}^{m} \frac{(y_i - \bar{y})^2}{\sigma^2} \sim \frac{\rho}{n-1} \chi^2_{m-1}$

know distribution of T, can work out $c^-$, $c^+$ and power function

[Figure: power against $\rho$, annotated with: increase, increase, tighten, move mode.]

[Figure, calibrated tolerances: n-ABC estimate of $\pi_\tau(\sigma^2|x)$ against $\sigma^2$ for n=60, calibrated tolerances $\tau^-=0.477$, $\tau^+=2.2$ versus naive tolerances $\tau^-=0.35$, $\tau^+=1.65$, shown together with $\pi(\sigma^2|x)$, $\arg\max_{\sigma^2}\pi(\sigma^2|x)$ and the likelihood on $\rho$-space.]

[Figure, calibrated m=97: n-ABC estimate of $\pi_\tau(\sigma^2|x)$ against $\sigma^2$ for n=60, calibrated tolerances $\tau^-=0.572$, $\tau^+=1.808$ with m=97 versus calibrated tolerances $\tau^-=0.726$, $\tau^+=1.392$ with m=300, shown together with $\pi(\sigma^2|x)$, $\arg\max_{\sigma^2}\pi(\sigma^2|x)$ and the likelihood on $\rho$-space.]
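
Since $T \sim \frac{\rho}{n-1}\chi^2_{m-1}$, the critical values and the power function can be worked out numerically. The following is a minimal sketch using only the Python standard library. It assumes that calibrating $c^-$, $c^+$ means requiring the power to equal $\alpha$ at both tolerance boundaries $\tau^-$ and $\tau^+$, which is the usual construction for an equivalence test and may differ in detail from the authors' procedure. All function names and the value of `alpha` are hypothetical.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x)."""
    if x <= 0.0:
        return 0.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion, converges quickly for x < a + 1.
        term = total = 1.0 / a
        k = a
        while abs(term) > abs(total) * 1e-16:
            k += 1.0
            term *= x / k
            total += term
        return total * math.exp(log_pref)
    # Continued fraction (modified Lentz) for the upper tail Q(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 100000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return 1.0 - h * math.exp(log_pref)


def chi2_cdf(x, df):
    """Distribution function of the chi-square law with df degrees of freedom."""
    return reg_lower_gamma(0.5 * df, 0.5 * x)


def bisect(f, lo, hi, iters=100):
    """Root of a monotone function f that changes sign on [lo, hi]."""
    f_lo_positive = f(lo) > 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0.0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi2_quantile(p, df):
    """Inverse of chi2_cdf by bracketing and bisection."""
    hi = float(df)
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    return bisect(lambda q: chi2_cdf(q, df) - p, 0.0, hi)


def power(rho, c_lo, c_hi, n, m):
    """P(c_lo <= T <= c_hi | rho) for T ~ rho / (n - 1) * chi2_{m-1}."""
    scale = (n - 1) / rho
    return chi2_cdf(scale * c_hi, m - 1) - chi2_cdf(scale * c_lo, m - 1)


def calibrate(tau_lo, tau_hi, n, m, alpha=0.01):
    """Find c_lo, c_hi with power(tau_lo) = power(tau_hi) = alpha (assumed rule)."""
    df = m - 1

    def c_hi_given(c_lo):
        # Choose c_hi so that the power at tau_hi equals alpha.
        p = alpha + chi2_cdf((n - 1) * c_lo / tau_hi, df)
        return tau_hi * chi2_quantile(min(p, 1.0 - 1e-12), df) / (n - 1)

    def excess_at_tau_lo(c_lo):
        return power(tau_lo, c_lo, c_hi_given(c_lo), n, m) - alpha

    c_max = tau_hi * chi2_quantile(1.0 - alpha, df) / (n - 1)
    c_lo = bisect(excess_at_tau_lo, 0.0, c_max)
    return c_lo, c_hi_given(c_lo)
```

For example, `c_lo, c_hi = calibrate(0.5, 2.0, n=60, m=60, alpha=0.05)` followed by `power(1.0, c_lo, c_hi, 60, 60)` evaluates the resulting power at the point of equality. Evaluating `power` over a grid of $\rho$ traces a curve of the kind shown in the power figure.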

Calibration Lemmas

when is it possible and easy to calibrate?

depends on distribution family

$T = S^2(y_{1:m})/S^2(x_{1:n}) = \rho \, \frac{1}{n-1} \sum_{i=1}^{m} \frac{(y_i - \bar{y})^2}{\sigma^2} \sim \frac{\rho}{n-1} \chi^2_{m-1}$

main condition:

• if family continuous in $\rho$ and strictly totally positive of order 3, then power function is unimodal

Discussion

many tests satisfying these criteria available, see

Combining test statistics

equivalent to combining summary statistics

very briefly:

• Mahalanobis approach possible, corresponds to KT location tests for normal summary values

• Intersection approach possible, can combine KT tests arbitrarily
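
One way to read the intersection approach is that a simulation is accepted only when every individual test accepts. The sketch below illustrates that reading only; the function name and the representation of critical regions as `(c_lo, c_hi)` pairs are assumptions, not the authors' interface.

```python
from typing import Sequence, Tuple


def intersection_accept(
    test_stats: Sequence[float],
    critical_regions: Sequence[Tuple[float, float]],
) -> bool:
    """Accept a simulation only if each of the K tests accepts.

    test_stats[k] is the k-th test statistic and critical_regions[k]
    its calibrated interval (c_lo, c_hi).
    """
    if len(test_stats) != len(critical_regions):
        raise ValueError("need one (c_lo, c_hi) interval per test statistic")
    return all(lo <= t <= hi for t, (lo, hi) in zip(test_stats, critical_regions))
```

Each interval can be calibrated separately for its own test statistic, so tests of different kinds can be mixed in one accept step.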

Back to indirect inference

transform sufficiency problem into change of variable problem

true posterior:
$\pi(\theta|x) \propto \ell(x|\rho)\,\pi(\rho)\,|\partial L(\theta)| \propto \ell(S(x)|\rho)\,\pi(\rho)\,|\partial L(\theta)|$

ABC approximation:
$\pi_{abc}(\theta|x) \propto P_x(\text{ABC accept}|\rho)\,\pi(\rho)\,|\partial L(\theta)|$

now calibrated to match closely

we are left with the change of variables

Conditions on link function

if tolerances & m calibrated, main condition is that $L : \theta \mapsto (\rho_1, \dots, \rho_K)$ must be bijective

can be tested from ABC output

moving average example, two model parameters

[Figure, only one test: surface of log(rho[1]) over (a, sigma^2), and contour plot over a and $\sigma^2$.]

[Figure, adding one more test: surfaces of log(rho[1]) and rho[2] over (a, sigma^2), and contour plots over a and $\sigma^2$.]

• bijectivity easy to check
1. record estimate of $\rho$, e.g. $\bar{s}_{1:m}(y) - \bar{s}_{1:n}(x)$
2. reconstruct link function with regression
3. link bijective if and only if … is a single point

Discussion

Conclusions

possible to set up ABC such that the ABC mean or MAP are exactly those of the true posterior (calibrate $\tau^-$, $\tau^+$)

possible to set up ABC such that the KL divergence of the ABC approximation to the true posterior is very small (calibrate m)

To achieve this, need to

1. identify summary values
2. use a suitable test statistic for calibrations
3. calibrate
4. test if link function meets conditions

Resources

code on github
manuscript on arxiv

Thank you!


Time series application

first patch, strong seasonality

second patch, weak seasonality. Re-seeds first patch.

parameters to estimate + reporting rate

3 model parameters, 3 tests

100 replicate runs

Technology
