1 Bayesian Essentials Slides by Peter Rossi and David Madigan

1

Bayesian Essentials

Slides by Peter Rossi and David Madigan

2

Distribution Theory 101

Marginal and Conditional Distributions:

[Figure: joint density of (X, Y), labeled “uniform”; X and Y axes each marked at 1]
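One set of densities consistent with the sampling code on the next slide is sketched here. It assumes the figure showed (X, Y) uniform on the triangle 0 < y < x < 1, which is inferred from that code rather than stated in the text.

```latex
\begin{aligned}
p(x, y) &= 2, && 0 < y < x < 1 \\
p(x) &= \int_0^x 2 \, dy = 2x, && 0 < x < 1 \\
p(y \mid x) &= \frac{p(x, y)}{p(x)} = \frac{1}{x}, && 0 < y < x
\end{aligned}
```

Under this reading, the marginal of X is the triangle density 2x and Y|X = x is uniform on (0, x).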

3

Simulating from Joint

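To draw from the joint:

i. Draw from marginal on X
ii. Condition on this draw, and draw from conditional of Y|X

library(triangle)                    # provides rtriangle()
NumDraws <- 1000                     # number of draws (value chosen here for illustration)
x <- rtriangle(NumDraws, 0, 1, 1)    # marginal of X: triangle on (0, 1) with mode at 1
y <- runif(NumDraws, 0, x)           # conditional of Y|X = x: uniform on (0, x)
plot(x, y)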

4

Triangular Distribution

If U~ unif(0,1), then:

sqrt(U) has the standard triangle distribution

If U1, U2 ~ unif(0,1), then:

Y=max{U1,U2} has the standard triangle distribution
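Both constructions can be checked by simulation in base R. The sketch below assumes "standard triangle distribution" means the density 2x on (0, 1), whose CDF is x^2; the sample size and the check points in `grid` are arbitrary choices.

```r
set.seed(1)
n <- 100000

# Method 1: square root of a single uniform draw
x_sqrt <- sqrt(runif(n))

# Method 2: maximum of two independent uniform draws
x_max <- pmax(runif(n), runif(n))

# Compare empirical CDFs with the exact CDF x^2 at a few points
grid <- c(0.25, 0.5, 0.75)
cdf_sqrt <- sapply(grid, function(q) mean(x_sqrt <= q))
cdf_max  <- sapply(grid, function(q) mean(x_max <= q))
print(rbind(exact = grid^2, sqrt_u = cdf_sqrt, max_u = cdf_max))
```

Either method avoids the dependency on the triangle package when only this special case is needed.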

5

Sampling Importance Resampling

[Figure: target density f and proposal density g]

draw a big sample from g

sub-sample from that sample with probability proportional to f/g
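The two steps can be sketched in R as follows. The target f(x) = 2x on (0, 1) and the uniform proposal g are assumptions chosen for illustration, as are the sample sizes; the figure's actual f and g are not recoverable from the text.

```r
set.seed(1)
f <- function(x) 2 * x                  # assumed target density on (0, 1)
g <- function(x) dunif(x, 0, 1)         # assumed proposal density

big <- runif(50000, 0, 1)               # step 1: big sample from g
w <- f(big) / g(big)                    # importance weights f/g

# step 2: sub-sample with probability proportional to the weights
sir_draws <- sample(big, size = 5000, replace = TRUE, prob = w)

mean(sir_draws)                         # should be near E[X] = 2/3 under f
```

The sub-sample is kept much smaller than the big sample so that heavily weighted points are not duplicated too often.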

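6

Metropolis

[Figure: target density f and proposal density g]

start with current = 0.5

to get the next value: draw a “proposal” from g

keep with probability min{1, f(proposal)/f(current)} (valid as written when g is symmetric, e.g. uniform)

else keep current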
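A minimal sketch of this recipe in R, assuming the target f(x) = 2x on (0, 1) and a uniform(0, 1) proposal g (both are illustrative assumptions; only the starting value 0.5 comes from the slide):

```r
set.seed(1)
f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)   # assumed target density

n_iter <- 20000
chain <- numeric(n_iter)
current <- 0.5                                      # starting value from the slide
for (i in seq_len(n_iter)) {
  proposal <- runif(1, 0, 1)                        # assumed g: uniform on (0, 1)
  # comparing a uniform draw with the ratio accepts with probability min(1, ratio)
  if (runif(1) < f(proposal) / f(current)) {
    current <- proposal
  }
  chain[i] <- current                               # a rejected proposal repeats the current value
}
mean(chain)                                         # should be near 2/3
```

Unlike the resampling approach, the draws are serially dependent, so more iterations are needed for the same precision.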

7

The Goal of Inference

Make inferences about unknown quantities using available information.

Inference -- make probability statements

unknowns --

parameters, functions of parameters, states or latent variables, “future” outcomes, outcomes conditional on an action

Information –

data-based

non data-based

theories of behavior; subjective views; mechanism

parameters are finite or in some range

8

Bayes theorem

p(θ|D) ∝ p(D|θ) p(θ)

Posterior ∝ “Likelihood” × Prior

Modern Bayesian computing: simulation methods for generating draws from the posterior distribution p(θ|D).

9

Summarizing the posterior

Output from Bayesian Inference: a possibly high dimensional distribution, p(θ|D)

Summarize this object via simulation:
- marginal distributions of θ, h(θ)
- don’t just compute E[θ|D], Var(θ|D)

Contrast with Sampling Theory:
- point est/standard error (summary of irrelevant dist)
- bad summary (normal)
- Limitations of asymptotics

10

Metropolis

Start somewhere with θ_current

To get the next value, generate a proposal θ_proposal

Accept with “probability”: min{1, p(θ_proposal|D) / p(θ_current|D)}

else keep current

11

Example

Believe these measurements (D) come from N(μ,1):

0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497

Prior for μ?

p(μ) = 2μ, for 0 < μ < 1

12

Example continued

p(D|μ)? 0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497

y1,…,y10

switch to R…

other priors? unif(0,1), norm(0,1), norm(0,100)

generating good candidates?
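The R session itself is not part of the text, so the following is only a sketch of how the example might be run. It assumes the prior p(μ) = 2μ has support 0 < μ < 1, and uses a random-walk normal candidate with standard deviation 0.3 and a burn-in of 2000 draws, all of which are illustrative choices.

```r
set.seed(1)
y <- c(0.9072867, -0.4490744, -0.1463117, 0.2525023, 0.9723840,
       -0.8946437, -0.2529104, 0.5101836, 1.2289795, 0.5685497)

# Log of the unnormalised posterior: N(mu, 1) likelihood times prior 2 * mu,
# with the prior support assumed to be 0 < mu < 1
log_post <- function(mu) {
  if (mu <= 0 || mu >= 1) return(-Inf)
  sum(dnorm(y, mean = mu, sd = 1, log = TRUE)) + log(2 * mu)
}

n_iter <- 20000
chain <- numeric(n_iter)
current <- 0.5
for (i in seq_len(n_iter)) {
  proposal <- current + rnorm(1, 0, 0.3)       # assumed random-walk candidate
  # accept with probability min(1, posterior ratio), computed on the log scale
  if (log(runif(1)) < log_post(proposal) - log_post(current)) {
    current <- proposal
  }
  chain[i] <- current
}
post_draws <- chain[-(1:2000)]                 # drop an assumed burn-in
c(mean = mean(post_draws), sd = sd(post_draws))
```

Trying another prior only requires changing the last term of `log_post`; the candidate standard deviation is the knob to adjust when too many or too few proposals are accepted.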

13

Prediction

See D, compute: p(y_f|D) = ∫ p(y_f|θ) p(θ|D) dθ, the “Predictive Distribution”

y_f: future observable (taken to be independent of D given θ)
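The predictive distribution can be simulated by drawing a parameter from the posterior and then a future observation given that draw. The sketch below reuses the measurements from the earlier example and assumes the norm(0,1) prior on μ listed there among the other priors, because that case has a closed-form posterior to check against.

```r
set.seed(1)
y <- c(0.9072867, -0.4490744, -0.1463117, 0.2525023, 0.9723840,
       -0.8946437, -0.2529104, 0.5101836, 1.2289795, 0.5685497)
n <- length(y)

# With y_i ~ N(mu, 1) and an assumed N(0, 1) prior, the posterior is
# mu | D ~ N(n * ybar / (n + 1), 1 / (n + 1))
post_mean <- n * mean(y) / (n + 1)
post_var  <- 1 / (n + 1)

mu_draws <- rnorm(50000, post_mean, sqrt(post_var))   # draws from p(mu | D)
y_future <- rnorm(50000, mu_draws, 1)                 # one future y per posterior draw

# Exact predictive mean and variance are post_mean and 1 + post_var
c(mean = mean(y_future), var = var(y_future))
```

The predictive variance exceeds the sampling variance of 1 because it also carries the remaining uncertainty about μ.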

14

Bayes/Classical Estimators

Prior washes out – locally uniform!!! Bayes is consistent unless you have dogmatic prior.
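A small simulation can illustrate the prior washing out. It assumes the normal-mean model with known variance 1 from the earlier example, reads norm(0,1) and norm(0,100) as N(mean, variance), and uses simulated data with a hypothetical true mean of 0.5.

```r
set.seed(1)
true_mu <- 0.5                                   # hypothetical value used to simulate data

# Posterior mean of mu for y_i ~ N(mu, 1) under a N(0, prior_var) prior
post_mean <- function(y, prior_var) {
  n <- length(y)
  n * mean(y) / (n + 1 / prior_var)
}

y_all <- rnorm(10000, true_mu, 1)
for (n in c(10, 100, 10000)) {
  y <- y_all[1:n]
  cat(sprintf("n = %5d  MLE = %.4f  N(0,1) prior = %.4f  N(0,100) prior = %.4f\n",
              n, mean(y), post_mean(y, 1), post_mean(y, 100)))
}
```

As n grows, both posterior means approach the sample mean, whichever of the two priors is used.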

15

Bayesian Computations

Before simulation methods, Bayesians used posterior expectations of various functions as summary of posterior: E[h(θ)|D] = ∫ h(θ) p(θ|D) dθ.

If p(θ|D) is in a convenient form (e.g. normal), then I might be able to compute this for some h.
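As an illustration, suppose the posterior is normal and h(θ) = exp(θ), a case where the expectation has a closed form. The posterior mean and variance below are hypothetical values; the point of the sketch is that the simulation estimate works the same way for any h.

```r
set.seed(1)
m <- 0.25; v <- 0.09                 # hypothetical normal posterior: theta | D ~ N(m, v)
h <- function(theta) exp(theta)      # example function of the parameter

exact <- exp(m + v / 2)              # closed form: mean of a lognormal
draws <- rnorm(100000, m, sqrt(v))   # simulation alternative: draw from the posterior
mc <- mean(h(draws))                 # Monte Carlo estimate of E[h(theta) | D]
c(exact = exact, monte_carlo = mc)
```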

16

Conjugate Families

Models with convenient analytic properties almost invariably come from conjugate families.

Why do I care now?
- conjugate models are used as building blocks
- build intuition re functions of Bayesian inference

Definition: A prior is conjugate to a likelihood if the posterior is in the same class of distributions as prior.

Basically, conjugate priors are like the posterior from some imaginary dataset with a diffuse prior.

17

Beta-Binomial model

Need a prior!

18

Beta distribution
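For reference, a standard statement of the binomial likelihood and Beta prior follows. The notation (y successes in n trials, prior parameters a and b) is assumed here, since the slides' own symbols are not shown in the text.

```latex
p(y \mid \theta) = \binom{n}{y}\, \theta^{y} (1-\theta)^{n-y},
\qquad
p(\theta) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, \theta^{a-1} (1-\theta)^{b-1},
\quad 0 < \theta < 1,
\qquad
E[\theta] = \frac{a}{a+b}
```

The prior has the same functional form in θ as a likelihood with a − 1 successes and b − 1 failures, which is the "imaginary dataset" reading of a conjugate prior.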

19

Posterior
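With a Beta(a, b) prior and y successes in n binomial trials, the standard conjugate result is a Beta(a + y, b + n − y) posterior. The sketch below checks that update against a brute-force likelihood-times-prior calculation on a grid; the prior parameters and data are hypothetical.

```r
a <- 2; b <- 2            # hypothetical Beta prior parameters
n <- 20; y <- 14          # hypothetical data: y successes in n trials

# Conjugate update: Beta(a, b) prior + binomial likelihood -> Beta(a + y, b + n - y)
a_post <- a + y
b_post <- b + n - y

# Brute-force check: normalise likelihood * prior on a grid of midpoints
theta <- seq(0.0005, 0.9995, by = 0.001)
unnorm <- dbinom(y, n, theta) * dbeta(theta, a, b)
grid_post <- unnorm / sum(unnorm * 0.001)
max(abs(grid_post - dbeta(theta, a_post, b_post)))   # should be very small
```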

20

Prediction
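Prediction in this model can be sketched by simulation: draw θ from the Beta posterior, then draw future data given θ. The posterior parameters and the number of future trials m below are hypothetical, and the exact beta-binomial probabilities are included only as a check.

```r
set.seed(1)
a_post <- 16; b_post <- 8     # hypothetical Beta posterior for theta
m <- 5                        # hypothetical number of future trials

theta_draws <- rbeta(100000, a_post, b_post)               # draw theta from the posterior
y_future <- rbinom(100000, size = m, prob = theta_draws)   # then draw future data given theta

# Simulated predictive probabilities versus the exact beta-binomial formula
k <- 0:m
exact <- choose(m, k) * beta(a_post + k, b_post + m - k) / beta(a_post, b_post)
sim <- as.numeric(table(factor(y_future, levels = k))) / length(y_future)
round(rbind(exact, sim), 4)
```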

21

Regression model

22

Bayesian Regression

Prior:

Inverted Chi-Square:

Interpretation as from another dataset.

Draw from prior?
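One way to draw from the prior is sketched below. It assumes the usual natural conjugate form, β | σ² ~ N(betabar, σ² A⁻¹) and σ² ~ ν0 s0² / χ²(ν0); the prior formulas themselves are not shown in the text, so the names `betabar`, `A`, `nu0`, `s0sq` and all their values are assumptions.

```r
set.seed(1)
k <- 2
betabar <- c(0, 0)            # hypothetical prior mean of beta
A <- diag(0.1, k)             # hypothetical prior precision matrix (scaled by 1 / sigma^2)
nu0 <- 10; s0sq <- 1          # hypothetical inverted chi-square parameters

n_draws <- 5000
# sigma^2 ~ nu0 * s0sq / chi-square(nu0)
sigmasq <- nu0 * s0sq / rchisq(n_draws, df = nu0)

# beta | sigma^2 ~ N(betabar, sigma^2 * A^{-1}), via the Cholesky factor of A^{-1}
U <- chol(solve(A))                                   # t(U) %*% U = A^{-1}
z <- matrix(rnorm(n_draws * k), n_draws, k)
beta <- sweep((z %*% U) * sqrt(sigmasq), 2, betabar, "+")   # row i scaled by sqrt(sigmasq[i])

colMeans(beta)
mean(sigmasq)                 # E[sigma^2] = nu0 * s0sq / (nu0 - 2)
```

Plotting these draws is a quick check on whether the chosen prior settings imply plausible coefficient and variance values.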

23

Posterior

24

Combining quadratic forms
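The identity usually meant by this step is written out below. It assumes a prior for β centered at \bar{\beta} with precision matrix A (scaled by 1/σ²); this notation is an assumption, since the slide's formula is not shown in the text.

```latex
\begin{aligned}
(y - X\beta)'(y - X\beta) + (\beta - \bar{\beta})' A (\beta - \bar{\beta})
  &= (\beta - \tilde{\beta})'(X'X + A)(\beta - \tilde{\beta}) \\
  &\quad + (y - X\tilde{\beta})'(y - X\tilde{\beta})
         + (\tilde{\beta} - \bar{\beta})' A (\tilde{\beta} - \bar{\beta}), \\
\tilde{\beta} &= (X'X + A)^{-1}(X'y + A\bar{\beta})
\end{aligned}
```

The first term on the right is the only one involving β, which is why the conditional posterior of β is normal with mean \tilde{\beta}.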

25

Posterior

26

IID Simulations

Scheme: [y|X, β, σ²] [β|σ²] [σ²] → [β, σ²|y,X] = [σ²|y,X] [β|σ²,y,X]

1) Draw [σ² | y, X]

2) Draw [β | σ², y, X]

3) Repeat
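The scheme can be sketched in R as follows. It assumes the natural conjugate prior β | σ² ~ N(betabar, σ² A⁻¹), σ² ~ ν0 s0² / χ²(ν0); the data are simulated for illustration, and all prior settings and variable names are hypothetical.

```r
set.seed(1)
# Simulated data for illustration only
n <- 200
X <- cbind(1, rnorm(n))
y <- as.numeric(X %*% c(1, 2) + rnorm(n, sd = 0.5))

k <- ncol(X)
betabar <- rep(0, k); A <- diag(0.01, k)   # hypothetical prior for beta | sigma^2
nu0 <- 3; s0sq <- 1                        # hypothetical prior for sigma^2

# Posterior quantities under the assumed natural conjugate prior
XtXA <- crossprod(X) + A
btilde <- solve(XtXA, crossprod(X, y) + A %*% betabar)
res <- y - X %*% btilde
nu1 <- nu0 + n
s1sq <- as.numeric(nu0 * s0sq + crossprod(res) +
                   t(btilde - betabar) %*% A %*% (btilde - betabar)) / nu1

R <- 2000
IR <- backsolve(chol(XtXA), diag(k))       # IR %*% t(IR) = (X'X + A)^{-1}
beta_draws <- matrix(0, R, k)
sigmasq_draws <- numeric(R)
for (r in seq_len(R)) {
  sigmasq <- nu1 * s1sq / rchisq(1, df = nu1)          # 1) draw [sigma^2 | y, X]
  beta <- btilde + sqrt(sigmasq) * IR %*% rnorm(k)     # 2) draw [beta | sigma^2, y, X]
  sigmasq_draws[r] <- sigmasq                          # 3) store and repeat
  beta_draws[r, ] <- beta
}
colMeans(beta_draws)
mean(sigmasq_draws)
```

Because each pass draws σ² from its marginal posterior and then β from its conditional, the draws are independent across iterations and no burn-in is needed.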

27

IID Simulator, cont.
