BIOST 572 Final Talk - University of Washingtonfaculty.washington.edu/heagerty/Courses/b572/public/... · 2012. 6. 1. · BIOST 572 Final Talk David Benkeser University of Washington

BIOST 572 Final Talk

David Benkeser

University of Washington Department of Biostatistics

May 22, 2012


When I was a boy there were only three kinds of sandwiches in common use - the ham, the chicken and the Swiss cheese. Others, to be sure, existed, but it was only as oddities

HL Mencken (1880-1956), American essayist, journalist


Motivation

Non-linear, non-normal data, but we are still motivated to estimate a "linear trend"

Define Bayesian analogue of frequentist methods used in this situation: estimating equations and sandwich-based standard errors

Distinguish between fixed and random sampling

Informs about what the frequentist sandwich is actually measuring


Methods: Defining β

Suppose we have Y and X sampled from a distribution with density function λ and

$$E_\lambda(Y \mid X = x) = \phi(x)$$

Define our quantity of interest as

$$\beta = \arg\min_\alpha E_\lambda\left[(\phi(x) - x\alpha)^2\right]$$

i.e. the set of coefficients minimizing average squared error in approximating the mean value of Y by a linear function of X
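As a minimal sketch of this definition, assume the covariates take finitely many values, so the expectation is a weighted sum and β solves a weighted least-squares problem. The function name `best_linear_approx` and the toy mean function are hypothetical, chosen only for illustration.

```python
import numpy as np

def best_linear_approx(xi, lam, phi):
    """Coefficients minimizing sum_k lam[k] * (phi[k] - xi[k] @ alpha)**2."""
    xi = np.asarray(xi, dtype=float)
    lam = np.asarray(lam, dtype=float)
    phi = np.asarray(phi, dtype=float)
    # Normal equations of the weighted problem:
    # (sum_k lam_k xi_k^T xi_k) alpha = sum_k lam_k xi_k^T phi_k
    gram = xi.T @ (lam[:, None] * xi)
    rhs = xi.T @ (lam * phi)
    return np.linalg.solve(gram, rhs)

# Hypothetical example: intercept plus one covariate, quadratic mean function.
x = np.array([-1.0, 0.0, 1.0, 2.0])
xi = np.column_stack([np.ones_like(x), x])
lam = np.full(4, 0.25)      # covariate density on the four support points
phi = x ** 2                # true (non-linear) mean of Y given X
beta = best_linear_approx(xi, lam, phi)
```

Note that β is well defined even though φ is not linear: it is the "linear trend" that best approximates φ under the covariate density λ.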


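Methods: Bayesian Model Specification

Likelihood:

$$Y \mid X = x, \phi(\cdot), \sigma^2(\cdot) \sim N(\phi(x), \sigma^2(x))$$

Priors specified such that:

$$p(\lambda(\cdot), \phi(\cdot), \sigma^2(\cdot)) = p_\lambda(\lambda(\cdot))\, p_{\phi,\sigma^2}(\phi(\cdot), \sigma^2(\cdot))$$

This gives a posterior for φ(·) and λ(·):

$$\pi(\lambda(\cdot), \phi(\cdot) \mid X, Y)$$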


Methods: Bayesian Model Specification

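π(λ(·), φ(·) | X, Y) induces a posterior for β

$$\pi(\beta) = \pi\left(\arg\min_\alpha \int (\phi(x) - x\alpha)^2 \lambda(x)\,dx \;\Big|\; X, Y\right)$$

Define point estimate by posterior mean

$$\hat\beta = E_\pi(\beta \mid X, Y)$$

and measure of uncertainty by posterior standard deviation

$$\hat\sigma_\beta = \mathrm{diag}\left(\mathrm{Cov}_\pi(\beta \mid X, Y)\right)^{1/2}$$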
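The induced posterior can be sketched computationally: each joint posterior draw of (λ, φ) is mapped to one draw of β by solving the weighted least-squares problem. The sketch below assumes a finite covariate support and that paired draws are already available from some sampler; the function name and argument layout are hypothetical.

```python
import numpy as np

def beta_posterior_summary(xi, lam_draws, phi_draws):
    """Posterior mean and sd of beta from paired draws of lambda and phi.

    xi:        (K, p) support points of the covariates
    lam_draws: (S, K) posterior draws of the covariate density
    phi_draws: (S, K) posterior draws of the mean function at xi
    """
    xi = np.asarray(xi, dtype=float)
    lam_draws = np.asarray(lam_draws, dtype=float)
    phi_draws = np.asarray(phi_draws, dtype=float)
    betas = np.empty((lam_draws.shape[0], xi.shape[1]))
    for s in range(lam_draws.shape[0]):
        lam, phi = lam_draws[s], phi_draws[s]
        gram = xi.T @ (lam[:, None] * xi)
        # argmin over alpha of sum_k lam_k (phi_k - xi_k alpha)^2
        betas[s] = np.linalg.solve(gram, xi.T @ (lam * phi))
    beta_hat = betas.mean(axis=0)            # posterior mean
    # Square root of the diagonal of the sample covariance of the draws.
    sigma_hat = betas.std(axis=0, ddof=1)
    return beta_hat, sigma_hat, betas
```

The returned `betas` array holds the individual draws, so credible intervals can be read off from its quantiles.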


Methods: Discrete Covariates

Let ξ = (ξ_1, ..., ξ_K) be the K values the covariates X can assume, and let n_k be the number of times X_i = ξ_k

Define density λ with support ξ of the form

$$\Pr(x = \xi_k; \lambda(\cdot)) = \lambda_k, \qquad \sum_{k=1}^{K} \lambda_k = 1$$

Define improper Dirichlet prior for λ(·)

$$p_\lambda(\lambda(\cdot)) \propto \prod_{k=1}^{K} \lambda_k^{-1}$$



Which gives posterior that is also Dirichlet

$$p_{\lambda \mid X}(\lambda(\cdot)) \propto \prod_{k=1}^{K} \lambda_k^{-1+n_k}$$

This posterior is the Bayesian Bootstrap (Rubin, 1981)

Operationally similar to regular bootstrap

Instead of resampling, reweights observations

Why not just Bayesian Bootstrap the whole model?
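To make the contrast between reweighting and resampling concrete, here is a minimal sketch. A density proportional to the product of λ_k^(n_k − 1) is a Dirichlet(n_1, ..., n_K) distribution, so Bayesian bootstrap draws can be taken with NumPy's Dirichlet sampler; the ordinary bootstrap is shown next to it as multinomial resampling. Function names are hypothetical.

```python
import numpy as np

def bayesian_bootstrap_weights(counts, n_draws, rng):
    """Draw lambda from the Dirichlet posterior with parameters n_1, ..., n_K."""
    counts = np.asarray(counts, dtype=float)
    return rng.dirichlet(counts, size=n_draws)      # shape (n_draws, K)

def bootstrap_weights(counts, n_draws, rng):
    """Ordinary bootstrap: resample n observations, return cell proportions."""
    counts = np.asarray(counts, dtype=float)
    n = int(counts.sum())
    return rng.multinomial(n, counts / n, size=n_draws) / n

rng = np.random.default_rng(0)
bb = bayesian_bootstrap_weights([5, 10, 25], 1000, rng)   # continuous weights
bs = bootstrap_weights([5, 10, 25], 1000, rng)            # multiples of 1/n
```

Both schemes are centered at the empirical proportions n_k / n; the Bayesian bootstrap weights vary continuously, while the resampling weights are restricted to multiples of 1/n.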



Let φ(·) = (φ_1, ..., φ_K) and σ²(·) = (σ²_1, ..., σ²_K)

Assign independent noninformative priors for (φ_k, σ²_k)

If n_k ≥ 4, φ_k has posterior t-distribution with

$$E_\pi(\phi_k \mid X, Y) = \bar{y}_k$$

$$\mathrm{Var}_\pi(\phi_k \mid X, Y) = \frac{1}{n_k (n_k - 3)} \sum_{i: X_i = \xi_k} (Y_i - \bar{y}_k)^2$$
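The two posterior moments can be computed cell by cell. This is a minimal sketch; the function name and the convention of passing an integer cell label per observation are assumptions made for illustration.

```python
import numpy as np

def cell_posterior_moments(x_cell, y):
    """Posterior mean and variance of phi_k for each covariate cell.

    x_cell: integer cell label k for each observation (X_i = xi_k)
    y:      responses Y_i
    """
    x_cell = np.asarray(x_cell)
    y = np.asarray(y, dtype=float)
    labels = np.unique(x_cell)
    means, variances, counts = [], [], []
    for k in labels:
        yk = y[x_cell == k]
        n_k = yk.size
        if n_k < 4:
            # The posterior variance is only finite for n_k >= 4.
            raise ValueError(f"cell {k} has n_k = {n_k} < 4")
        ss = np.sum((yk - yk.mean()) ** 2)
        means.append(yk.mean())                    # E(phi_k | X, Y)
        variances.append(ss / (n_k * (n_k - 3)))   # Var(phi_k | X, Y)
        counts.append(n_k)
    return labels, np.array(means), np.array(variances), np.array(counts)
```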



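It turns out that

$$\hat\beta \to_{a.s.} (X^T X)^{-1} X^T Y$$

$$\hat\sigma_\beta - \mathrm{diag}\left[(X^T X)^{-1} X^T \Sigma X (X^T X)^{-1}\right]^{1/2} = o(n^{-1})$$

where

$$\Sigma_{ij} = \begin{cases} \left(Y_i - X_i (X^T X)^{-1} X^T Y\right)^2 & \text{if } i = j \\ 0 & \text{else} \end{cases}$$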
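For reference, the frequentist quantities in this limit are the OLS estimator and the sandwich standard errors with squared residuals on the diagonal of the meat matrix. A minimal NumPy sketch follows; the function name is hypothetical and no small-sample correction is applied.

```python
import numpy as np

def ols_sandwich(X, y):
    """OLS coefficients and sandwich standard errors (no small-sample correction)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # X^T Sigma X with Sigma = diag(resid^2), without forming the n x n matrix.
    meat = X.T @ (resid[:, None] ** 2 * X)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))
```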



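Posterior of φ_k can be split into (uncorrelated) deterministic and random components

$$\phi_k = \bar{y}_k + \varepsilon$$

When calculating posterior variance of β we can split into two covariance terms

$$\mathrm{Cov}_\pi(\beta) = \mathrm{Cov}_\pi\left(E_{x;\lambda}[x^T x]^{-1} E_{x;\lambda}[x^T \bar{y}(x)] \mid X, Y\right) + \mathrm{Cov}_\pi\left(E_{x;\lambda}[x^T x]^{-1} E_{x;\lambda}[x^T \varepsilon(x)] \mid X, Y\right)$$

$$\to_{a.s.} \mathrm{diag}\left[(X^T X)^{-1} X^T (\Sigma' + \tilde\Sigma) X (X^T X)^{-1}\right]$$



Three "meat" matrices, all diagonal, with ξ_k = X_i in each entry:

$$\Sigma'_{ii} = \frac{1}{n_k - 3} \sum_{j: X_j = \xi_k} (Y_j - \bar{y}_k)^2$$

$$\tilde\Sigma_{ii} = \left(\bar{y}_k - X_i (X^T X)^{-1} X^T Y\right)^2$$

$$\Sigma_{ii} = \left(Y_i - X_i (X^T X)^{-1} X^T Y\right)^2$$

Summed over the observations in each cell, Σ_ii matches Σ′_ii + Σ̃_ii up to the factor n_k/(n_k − 3) on the first term, so $X^T \Sigma X \approx X^T (\Sigma' + \tilde\Sigma) X$ for large n_k

Classic sandwich (Σ) accounts for residual errors (Σ′) as well as the errors due to non-linearity of φ (Σ̃)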
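The split of the classic sandwich meat into a residual part and a non-linearity part can be checked numerically. The sketch below assumes the non-linearity term compares the cell mean of Y with the OLS fitted value, which is one reading of the slide. The function name and the `df_adjust` argument are hypothetical: `df_adjust=3` gives the 1/(n_k − 3) factor, while `df_adjust=0` makes the decomposition exact.

```python
import numpy as np

def meat_matrices(X, y, cell, df_adjust=3):
    """Return X^T S X for the residual, non-linearity and classic meat matrices.

    cell: integer label per row; rows sharing a label must share the same X row.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cell = np.asarray(cell)
    fitted = X @ np.linalg.solve(X.T @ X, X.T @ y)   # OLS fitted values
    ybar = np.empty_like(y)
    resid_var = np.empty_like(y)
    for k in np.unique(cell):
        idx = cell == k
        ybar[idx] = y[idx].mean()
        resid_var[idx] = np.sum((y[idx] - ybar[idx]) ** 2) / (idx.sum() - df_adjust)

    def meat(d):
        # X^T diag(d) X without forming the n x n diagonal matrix.
        return X.T @ (d[:, None] * X)

    meat_resid = meat(resid_var)                 # variation of Y around cell means
    meat_nonlin = meat((ybar - fitted) ** 2)     # cell means vs. linear fit
    meat_classic = meat((y - fitted) ** 2)       # usual squared OLS residuals
    return meat_resid, meat_nonlin, meat_classic
```

With `df_adjust=0` the first two matrices sum exactly to the third, because within each cell the squared OLS residuals add up to the within-cell sum of squares plus n_k times the squared gap between the cell mean and the fitted value.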


Methods: Fixed Design Matrix

Replace random density λ with deterministic density

$$\lambda_{\mathrm{fixed};k} = \frac{n_k}{n}$$

and proceed as before, defining our quantity of interest as

$$\beta_{\mathrm{fixed}} = \arg\min_\alpha \int (\phi(x) - x\alpha)^2 \lambda_{\mathrm{fixed}}(x)\,dx$$

Posterior mean is our point estimate and posterior standard deviation is measure of uncertainty


Methods: Fixed Design Matrix

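It turns out that the posterior mean of β_fixed is exactly the OLS estimator and

$$\hat\sigma_{\beta,\mathrm{fixed}} = \mathrm{diag}\left[(X^T X)^{-1} X^T \Sigma' X (X^T X)^{-1}\right]^{1/2}$$

where Σ′ is the diagonal matrix with entries, for X_i = ξ_k,

$$\Sigma'_{ij} = \begin{cases} \dfrac{1}{n_k - 3} \displaystyle\sum_{l: X_l = \xi_k} (Y_l - \bar{y}_k)^2 & \text{if } i = j \\ 0 & \text{else} \end{cases}$$

Accounts for variation of Y | X around its mean only and not error due to non-linearity in φ (which does not change between samples)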
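The fixed-design posterior can also be simulated directly, which gives a numerical check on the closed form. The sketch below assumes the noninformative prior is p(φ_k, σ²_k) ∝ 1/σ²_k, under which φ_k has a t posterior with n_k − 1 degrees of freedom and the variance stated earlier in the talk; the talk itself does not spell out the prior. The function name and inputs are hypothetical.

```python
import numpy as np

def draw_beta_fixed(xi, counts, ybar, ss, n_draws, rng):
    """Posterior draws of beta_fixed for discrete covariates.

    xi:     (K, p) distinct covariate rows
    counts: (K,) cell sizes n_k, each at least 4
    ybar:   (K,) cell means of Y
    ss:     (K,) within-cell sums of squares, sum_i (Y_i - ybar_k)^2
    """
    xi = np.asarray(xi, dtype=float)
    counts = np.asarray(counts, dtype=float)
    ybar = np.asarray(ybar, dtype=float)
    ss = np.asarray(ss, dtype=float)
    lam_fixed = counts / counts.sum()           # deterministic density n_k / n
    # Assumed prior 1 / sigma_k^2 gives phi_k | data = ybar_k + scale_k * t_{n_k - 1}.
    scale = np.sqrt(ss / (counts * (counts - 1)))
    t = rng.standard_t(counts - 1, size=(n_draws, counts.size))
    phi = ybar + scale * t                      # (n_draws, K) draws of phi
    gram = xi.T @ (lam_fixed[:, None] * xi)     # identical for every draw
    rhs = (phi * lam_fixed) @ xi                # (n_draws, p)
    return np.linalg.solve(gram, rhs.T).T       # (n_draws, p) draws of beta_fixed
```

Because λ is held fixed, all variation in the draws comes from φ, so their standard deviation should approach the closed form built from Σ′ as the number of draws grows.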


Methods: Continuous Covariates

Need some constraints on φ(·) and σ2(·) for identifiability

In applied setting, these functions can be approximated by semi-parametric smoothing methods

Use penalized O'Sullivan splines (Wand and Ormerod, 2008) to model φ(·) and σ²(·)

$$\phi(x_i; a) = \alpha_0 + \alpha_1 x_i + \sum_{q=1}^{Q} a_q Z_{iq}$$

$$\log \sigma(x_i; b) = \gamma_0 + \gamma_1 x_i + \sum_{q=1}^{Q} b_q Z_{iq}$$
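The two model equations share one structure: a linear term plus a combination of basis columns Z. The sketch below evaluates both for given coefficients. It uses a simple truncated-line basis as a stand-in for Z, which is not the O'Sullivan basis of Wand and Ormerod; function names, knots and coefficient values are hypothetical.

```python
import numpy as np

def truncated_line_basis(x, knots):
    """Stand-in Z matrix with Z[i, q] = max(x_i - knot_q, 0).

    This is NOT the O'Sullivan spline basis; it only has the same shape (n, Q).
    """
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    return np.maximum(x[:, None] - knots[None, :], 0.0)

def spline_mean_and_sd(x, Z, alpha, a, gamma, b):
    """Evaluate phi(x; a) and sigma(x; b) for one set of coefficients."""
    x = np.asarray(x, dtype=float)
    mean = alpha[0] + alpha[1] * x + Z @ a            # phi(x_i; a)
    sd = np.exp(gamma[0] + gamma[1] * x + Z @ b)      # log-linear model for sigma
    return mean, sd

x = np.linspace(0.0, 60.0, 7)
Z = truncated_line_basis(x, knots=[20.0, 40.0])
mean, sd = spline_mean_and_sd(x, Z, (1.0, 0.5), np.array([0.2, -0.3]),
                              (0.0, 0.01), np.zeros(2))
```

Modelling log σ rather than σ keeps the standard deviation positive for any coefficient values.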


Methods: Continuous Covariates

Define diffuse priors on a_q and b_q and use OpenBUGS to simulate from posterior

For prior on λ use limiting case of Dirichlet process, which gives BB (Bayesian bootstrap) distribution as posterior

Expect that similar results will hold as in the discrete case. Simulations give supporting evidence.


Results: Simulations (n=400)

[Figure: scatterplots of simulated data, x-axis from -10 to 10, in four panels: Linear and Homoscedastic; Linear and Heteroscedastic; Non-linear and Homoscedastic; Non-linear and Heteroscedastic]


Results: Simulations (n=400)

[Figure: nominal 95% coverage (y-axis, 85 to 100) under Random and Fixed sampling (x-axis) for Model, Sandwich and Bayes standard errors, in four panels: Linear and Homoscedastic; Linear and Heteroscedastic; Non-linear and Homoscedastic; Non-linear and Heteroscedastic]


Results: Health Care Cost Data

[Figure, three panels, each plotted against Age (0 to 60). Outpatient Health Costs: annual cost (dollars). Smoothed Outpatient Health Costs: annual cost (dollars), O'Sullivan splines (posterior mean and posterior sample). Smoothed Standard Deviation: standard deviation, O'Sullivan splines (posterior mean and posterior sample)]


Results: Health Care Cost Data

[Figure: Linear regression of average annual outpatient health care cost on age. β (95% CI), y-axis from 12 to 20, for Model, Sandwich, Bayes (random) and Bayes (fixed), with the Bayes intervals shown for both Continuous and Discrete covariates]


Conclusions

Developed model-robust Bayesian framework for linear regression

Method distinguishes between random and fixed covariates and is asymptotically equivalent to sandwich in random case

Better frequentist coverage in the fixed design case than sandwich-based estimates

Contrasting fixed and random case shows that sandwich implicitly accounts for random sampling


Questions?

