Robust Bayesian Regression

Readings: Hoff Chapter 9; West, JRSSB 1984; Fúquene, Pérez & Pericchi 2015

STA 721, Duke University
November 21, 2017



Multiple Outliers

• Hoeting, Madigan and Raftery (in various permutations) consider the problem of simultaneous variable selection and outlier identification.

• This is implemented in the BMA package in the function MC3.REG, which has the advantage that more than two points may be considered as outliers at the same time.

• The function uses a Markov chain to identify both important variables and potential outliers; it is coded in Fortran, so it should run reasonably quickly.

• Can also use BAS or other variable selection programs.



Using BAS

library(MASS)
data(stackloss)
n = nrow(stackloss)
# augment the predictors with an identity matrix: one indicator column
# per case, so including column i gives case i a mean shift
stack.out = cbind(stackloss, diag(n))

library(BAS)
BAS.stack = bas.lm(stack.loss ~ ., data = stack.out,
                   prior = "hyper-g-n", a = 3,
                   modelprior = tr.beta.binomial(1, 1, 15),
                   method = "MCMC", MCMC.it = 200000)


Output

           P(B != 0 | Y)  model 1  model 2  model 3  model 4  model 5
Intercept       1.00        1.00     1.00     1.00     1.00     1.00
Air.Flow        1.00        1.00     1.00     1.00     1.00     1.00
Water.Temp      0.23        0.00     0.00     0.00     1.00     1.00
Acid.Conc.      0.04        0.00     0.00     0.00     0.00     0.00
`1`             0.22        0.00     0.00     0.00     0.00     1.00
`2`             0.07        0.00     0.00     0.00     0.00     0.00
`3`             0.24        0.00     0.00     0.00     0.00     1.00
`4`             0.75        1.00     0.00     1.00     1.00     1.00
`5`             0.03        0.00     0.00     0.00     0.00     0.00
`6`             0.04        0.00     0.00     0.00     0.00     0.00
`7`             0.03        0.00     0.00     0.00     0.00     0.00
`8`             0.03        0.00     0.00     0.00     0.00     0.00
`9`             0.03        0.00     0.00     0.00     0.00     0.00
`10`            0.03        0.00     0.00     0.00     0.00     0.00
`11`            0.03        0.00     0.00     0.00     0.00     0.00
`12`            0.04        0.00     0.00     0.00     0.00     0.00
`13`            0.16        0.00     0.00     1.00     0.00     0.00
`14`            0.08        0.00     0.00     0.00     0.00     0.00
`15`            0.03        0.00     0.00     0.00     0.00     0.00
`16`            0.03        0.00     0.00     0.00     0.00     0.00
`17`            0.03        0.00     0.00     0.00     0.00     0.00
`18`            0.02        0.00     0.00     0.00     0.00     0.00
`19`            0.04        0.00     0.00     0.00     0.00     0.00
`20`            0.06        0.00     0.00     0.00     0.00     0.00
`21`            0.94        1.00     1.00     1.00     1.00     1.00
BF                          0.13     0.01     0.08     0.07     1.00
PostProbs                   0.24     0.11     0.03     0.02     0.02
R2                          0.96     0.93     0.97     0.97     0.99
dim                         4.00     3.00     5.00     5.00     7.00
logmarg                    22.17    19.43    21.68    21.57    24.18


BAS

[Figure: diagnostic plots for BAS.stack. Panels: "Residuals vs Fitted" (residuals vs predictions under BMA, with cases 13, 3 and 14 labeled); "Model Probabilities" (cumulative probability vs model search order); "Model Complexity" (log(marginal) vs model dimension); "Inclusion Probabilities" (marginal inclusion probability for Intercept, Air.Flow, Water.Temp, Acid.Conc. and the case indicators `1`–`21`).]


BAS

[Figure: image(BAS.stack). The top 20 models, ordered by model rank and log posterior odds, plotted against the included variables: Intercept, Air.Flow, Water.Temp, Acid.Conc. and the case indicators `1`–`21`.]


Body Fat Data: Intervals w/ All Data

Response: % Body Fat. Predictor: Waist (Abdomen) Circumference.

[Figure: two panels titled "95% confidence and prediction intervals for bodyfat.lm" and "95% confidence and prediction intervals for bodyfat.lm2", each plotting Bodyfat against Abdomen (80–140) with observed points, the fitted line, confidence and prediction bands, and x̄ marked; bodyfat.lm uses all the data, while bodyfat.lm2 evidently refits without case 39.]

Which analysis do we use? With case 39, without it, or something different altogether?


Cook’s Distance

[Figure: "Residuals vs Leverage" plot for lm(Bodyfat ~ I(Abdomen - 2.54 * 34)): standardized residuals against leverage, with Cook's distance contours at 0.5 and 1; cases 39, 216 and 41 are flagged.]


Options for Handling Influential Cases

• Are there scientific grounds for eliminating the case?

• Test whether the case has a different mean than the population.

• Report results with and without the case.

• Model averaging to account for model uncertainty?

• Full model: Y = Xβ + I_n δ + ε

• 2^n submodels: γ_i = 0 ⇔ δ_i = 0

• If γ_i = 1, then case i has a different mean: a "mean shift" outlier.



Mean Shift = Variance Inflation

• Model: Y = Xβ + I_n δ + ε

• Prior: δ_i | γ_i ∼ N(0, V σ² γ_i), with γ_i ∼ Ber(π)

Then ε_i given σ² is independent of δ_i, and

ε*_i ≡ ε_i + δ_i | σ² ∼ N(0, σ²) with probability (1 − π), and ∼ N(0, σ²(1 + V)) with probability π.

So the model is Y = Xβ + ε*: "variance inflation". V + 1 = K = 7 in the paper by Hoeting et al. (package BMA).
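
A quick simulation sketch of this equivalence (variable names are illustrative): marginally over δ and γ, ε* has variance (1 − π)σ² + πσ²(1 + V).

set.seed(42)
m <- 1e5; sigma <- 1; V <- 6; p <- 0.1       # K = V + 1 = 7, as in Hoeting et al.
gam <- rbinom(m, 1, p)                       # gamma_i ~ Ber(pi)
delta <- gam * rnorm(m, 0, sigma*sqrt(V))    # delta_i | gamma_i ~ N(0, V sigma^2 gamma_i)
eps.star <- rnorm(m, 0, sigma) + delta       # eps*_i = eps_i + delta_i
c(empirical = var(eps.star),
  theoretical = (1 - p)*sigma^2 + p*sigma^2*(1 + V))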


Simultaneous Outlier and Variable Selection

MC3.REG(all.y = bodyfat$Bodyfat, all.x = as.matrix(bodyfat$Abdomen),

num.its = 10000, outliers = TRUE)

Model parameters: PI=0.02 K=7 nu=2.58 lambda=0.28 phi=2.85

15 models were selected

Best 5 models (cumulative posterior probability = 0.9939):

prob model 1 model 2 model 3 model 4 model 5

variables

all.x 1 x x x x x

outliers

39 0.94932 x x . x .

204 0.04117 . . . x .

207 0.10427 . x . . x

post prob 0.815 0.095 0.044 0.035 0.004


Change Error Assumptions

Y_i ind∼ t(ν, α + βx_i, 1/φ)

L(α, β, φ) ∝ ∏_{i=1}^n φ^{1/2} (1 + φ(y_i − α − βx_i)²/ν)^{−(ν+1)/2}

Use the prior p(α, β, φ) ∝ 1/φ.

Posterior distribution:

p(α, β, φ | Y) ∝ φ^{n/2−1} ∏_{i=1}^n (1 + φ(y_i − α − βx_i)²/ν)^{−(ν+1)/2}
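
No conjugate updating is available here, but the posterior is easy to evaluate numerically. A minimal sketch of the log posterior, up to an additive constant (the function name is illustrative):

logpost <- function(alpha, beta, phi, y, x, nu = 9) {
  # log p(alpha, beta, phi | Y) + const, matching the expression above
  r <- y - alpha - beta*x
  (length(y)/2 - 1)*log(phi) - (nu + 1)/2 * sum(log1p(phi*r^2/nu))
}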



Bounded Influence - West 1984 (and references within)

Treat σ² as given. The influence of individual observations on the posterior distribution of β, in the model where E[Y_i] = x_i^T β, is investigated through the score function:

d/dβ log p(β | Y) = d/dβ log p(β) + ∑_{i=1}^n x_i g(y_i − x_i^T β)

where

g(ε) = −d/dε log p(ε)

is the influence function of the error distribution (unimodal, continuous, differentiable, symmetric).

An outlying observation y_j is accommodated if the posterior distribution p(β | Y) converges to the case-deleted posterior p(β | Y_(j)) for all β as |y_j| → ∞. This requires error models whose influence functions go to zero, such as the Student t (O'Hagan, 1979).


Choice of df

• Score function for t with α degrees of freedom has turning points at ±√α

[Figure: the influence function g(ε) for t_9 errors over ε ∈ (−100, 100): g rises to its turning points near ±3 and decays toward zero in the tails; a sketch reproducing it follows this slide.]

• g′(ε) is negative when ε² > α (standardized errors)

• The contribution of such an observation to the information matrix is negative, and the observation is doubtful

• Suggest taking α = 8 or α = 9 to reject errors larger than √8 or 3 sd

Problem: no closed-form solution for the posterior distribution.
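
The plotted score is easy to reproduce: for t_α errors, g(ε) = −d/dε log p(ε) = (α + 1)ε/(α + ε²), which peaks at ±√α and decays to zero. A sketch with α = 9, as on the slide:

g <- function(eps, alpha = 9) (alpha + 1)*eps / (alpha + eps^2)  # score for t_alpha errors
curve(g(x), from = -100, to = 100, xlab = "eps", ylab = "g(eps, 9)")
abline(v = c(-3, 3), lty = 2)   # turning points at +/- sqrt(9) = 3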


Scale-Mixtures of Normal Representation

Z_i iid∼ t(ν, 0, σ²)  ⇔

Z_i | λ_i ind∼ N(0, σ²/λ_i)
λ_i iid∼ G(ν/2, ν/2)

Integrate out the "latent" λ's to obtain the marginal distribution.
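
A quick simulation check of the representation (variable names are illustrative; note that rgamma's rate parameterization matches G(ν/2, ν/2)):

set.seed(1)
nu <- 9; sigma <- 2; m <- 1e5
lambda <- rgamma(m, nu/2, rate = nu/2)   # latent precisions
z <- rnorm(m, 0, sigma/sqrt(lambda))     # Z | lambda ~ N(0, sigma^2/lambda)
qqplot(z, sigma * rt(m, df = nu))        # marginally, Z ~ t_nu(0, sigma^2)
abline(0, 1)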


Latent Variable Model

Y_i | α, β, φ, λ ind∼ N(α + βx_i, 1/(φλ_i))
λ_i iid∼ G(ν/2, ν/2)
p(α, β, φ) ∝ 1/φ

Joint posterior distribution (collecting the λ_i^{1/2} factors from the normal densities with the prior terms):

p(α, β, φ, λ_1, …, λ_n | Y) ∝ φ^{n/2−1} exp{−(φ/2) ∑_i λ_i (y_i − α − βx_i)²} ∏_{i=1}^n λ_i^{(ν+1)/2−1} exp(−λ_i ν/2)


JAGS

Just Another Gibbs Sampler (and more)

• Model

• Data

• Initial values (optional)

May do this through ordinary text files, or use the functions in R2jags to specify the model, data, and initial values and then call jags.


Model Specification via R2jags

rr.model = function() {
  for (i in 1:n) {
    mu[i] <- alpha0 + alpha1*(X[i] - Xbar)   # centered regression mean
    lambda[i] ~ dgamma(9/2, 9/2)             # scale-mixture weight: t errors, nu = 9
    prec[i] <- phi*lambda[i]                 # case-specific precision
    Y[i] ~ dnorm(mu[i], prec[i])
  }
  phi ~ dgamma(1.0E-6, 1.0E-6)               # diffuse gamma on the precision
  alpha0 ~ dnorm(0, 1.0E-6)                  # diffuse normal priors (precision 1e-6)
  alpha1 ~ dnorm(0, 1.0E-6)
}


Notes on Models

• Distributions of stochastic "nodes" are specified using ∼

• Assignment to deterministic "nodes" uses <- (NOT =)

• JAGS allows expressions as arguments in distributions

• Normal distributions are parameterized by precisions, so dnorm(0, 1.0E-6) is a N(0, 10⁶)

• Uses a for-loop structure as in R for model description, but is coded in C++ so it is fast!


Data

A list or rectangular data structure holding all data and summaries of data used in the model:

bf.data = list(Y = bodyfat$Bodyfat,
               X = bodyfat$Abdomen)
bf.data$n = length(bf.data$Y)      # sample size used in the model's for loop
bf.data$Xbar = mean(bf.data$X)     # centering constant


Specifying which Parameters to Save

The parameters to be monitored and returned to R are specified with the variable parameters:

parameters = c("beta0", "beta1", "sigma",

"mu34", "y34", "lambda[39]")

• All of the above (except lambda) are calculated from the other parameters. (See the R code for definitions of these parameters; a hypothetical sketch follows below.)

• lambda[39] saves only the 39th element of λ.

• To save a whole vector (for example, all lambdas), just give the vector name.
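
The course R code is not reproduced on the slides. One plausible sketch of how such derived nodes could be defined inside rr.model (these definitions are an assumption, chosen to match the centering at 2.54 * 34 cm of abdomen used earlier):

# hypothetical derived nodes; assumed definitions, see the course R code
beta1 <- alpha1                            # slope on the original scale
beta0 <- alpha0 - alpha1*Xbar              # uncentered intercept
sigma <- 1/sqrt(phi)                       # error scale
mu34 <- alpha0 + alpha1*(2.54*34 - Xbar)   # mean bodyfat at a 34 in. waist
lambda34 ~ dgamma(9/2, 9/2)                # weight for a new case
y34 ~ dnorm(mu34, phi*lambda34)            # posterior predictive bodyfat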


Running jags from R

bf.sim = jags(bf.data, inits = NULL, par = parameters,
              model = rr.model,
              n.chains = 2, n.iter = 10000)
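
The returned object can then be inspected with the usual R2jags tools, for example:

print(bf.sim)   # posterior summaries and convergence diagnostics for monitored nodes
lambda39 = bf.sim$BUGSoutput$sims.matrix[, "lambda[39]"]   # raw MCMC draws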


Output

             mean    sd   2.5%    50%  97.5%
beta0      -41.70  2.75 -46.91 -41.67 -36.40
beta1        0.66  0.03   0.60   0.66   0.71
sigma        4.48  0.23   4.05   4.46   4.96
mu34        15.10  0.35  14.43  15.10  15.82
y34         14.94  5.15   4.37  15.21  24.65
lambda[39]   0.33  0.16   0.11   0.30   0.72

95% HPD interval for expected bodyfat: (14.5, 15.8)

95% HPD interval for bodyfat: (5.1, 25.3)


Comparison

• 95% probability interval for β is (0.60, 0.71) with t_9 errors

• 95% confidence interval for β is (0.58, 0.69) (all data, normal model)

• 95% confidence interval for β is (0.61, 0.73) (normal model without case 39)

The t-error results are intermediate without having to remove any observations; case 39 is simply down-weighted by λ_39.


Full Conditional for λ_j

p(λ_j | rest, Y) ∝ p(α, β, φ, λ_1, …, λ_n | Y)

∝ φ^{n/2−1} exp{−(φ/2) ∑_i λ_i (y_i − α − βx_i)²} ∏_{i=1}^n λ_i^{(ν+1)/2−1} exp(−λ_i ν/2)

Ignore all terms except those that involve λ_j:

λ_j | rest, Y ∼ G( (ν + 1)/2 , (φ(y_j − α − βx_j)² + ν)/2 )
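
Together with the normal-theory updates for α, β (weighted least squares given the λ's) and for φ, this full conditional yields a complete Gibbs sampler. A minimal sketch for the simple regression case (function and variable names are illustrative; ν = 9 assumed):

gibbs.t <- function(y, x, nu = 9, n.iter = 5000) {
  n <- length(y); xc <- x - mean(x)                 # center x as in rr.model
  alpha <- mean(y); beta <- 0; phi <- 1
  out <- matrix(NA, n.iter, 3, dimnames = list(NULL, c("alpha", "beta", "phi")))
  for (s in 1:n.iter) {
    r <- y - alpha - beta*xc
    lambda <- rgamma(n, (nu + 1)/2, rate = (phi*r^2 + nu)/2)  # full conditional above
    alpha <- rnorm(1, sum(lambda*(y - beta*xc))/sum(lambda),  # precision phi*sum(lambda)
                   1/sqrt(phi*sum(lambda)))
    beta  <- rnorm(1, sum(lambda*xc*(y - alpha))/sum(lambda*xc^2),
                   1/sqrt(phi*sum(lambda*xc^2)))
    r <- y - alpha - beta*xc
    phi <- rgamma(1, n/2, rate = sum(lambda*r^2)/2)           # G(n/2, sum lambda r^2 / 2)
    out[s, ] <- c(alpha, beta, phi)
  }
  out
}
# e.g. draws = gibbs.t(bodyfat$Bodyfat, bodyfat$Abdomen)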


Weights

Under the prior, E[λ_i] = 1.

Under the posterior, large residuals are down-weighted (approximately those bigger than √ν).

[Figure: posterior distribution of λ_39; the density is concentrated on (0, 1), well below the prior mean of 1.]


Prior Distributions on Parameters

As a general recommendation, the prior distribution should have "heavier" tails than the likelihood:

• with t_9 errors, use a t_α prior with α < 9

• this can also be represented via a scale mixture of normals (a sketch follows below)

• Horseshoe, Double Pareto, and Cauchy priors all have heavier tails

• see the Stack-loss code
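
For instance, a Cauchy (t_1) prior on a coefficient can be written as a scale mixture of normals in the same style as rr.model. A sketch (prec.b and lam.b are illustrative names; this is not the Stack-loss code itself):

# Cauchy(0, scale) prior on a coefficient via a scale mixture of normals
lam.b ~ dgamma(1/2, 1/2)         # t with 1 df = Cauchy mixing weight
beta1 ~ dnorm(0, prec.b*lam.b)   # prec.b = 1/scale^2 fixes the prior scale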
