1
Bayesian Essentials
Slides by Peter Rossi and David Madigan
2
Distribution Theory 101
Marginal and Conditional Distributions:
[Figure: joint density of (X, Y), uniform over a triangular region; X and Y axes each marked at 1]
3
Simulating from Joint
To draw from the joint:
i. Draw from the marginal on X
ii. Condition on this draw, and draw from the conditional of Y|X
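library(triangle)
NumDraws <- 1000                    # number of draws; value assumed, not given on the slide
x <- rtriangle(NumDraws, 0, 1, 1)   # marginal of X: triangle on (0,1) with mode at 1
y <- runif(NumDraws, 0, x)          # Y | X = x is uniform on (0, x)
plot(x, y)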
4
Triangular Distribution
If U~ unif(0,1), then:
sqrt(U) has the standard triangle distribution
If U1, U2 ~ unif(0,1), then:
Y=max{U1,U2} has the standard triangle distribution
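Both constructions can be checked by simulation. The sketch below assumes that "standard triangle distribution" means the density 2x on (0,1), which has mean 2/3 and CDF q^2; the sample size and seed are arbitrary choices.

```r
set.seed(1)
n  <- 100000                 # number of draws (arbitrary choice)
u1 <- runif(n)
u2 <- runif(n)

x_sqrt <- sqrt(u1)           # first construction: sqrt(U)
x_max  <- pmax(u1, u2)       # second construction: max{U1, U2}

# Under density 2x on (0,1): mean = 2/3 and P(X <= 0.5) = 0.25.
c(mean(x_sqrt), mean(x_max))
c(mean(x_sqrt <= 0.5), mean(x_max <= 0.5))
```

Both columns of each result should agree with the theoretical values up to simulation noise.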
5
Sampling Importance Resampling
[Figure: target density f and sampling density g]
draw a big sample from g
sub-sample from that sample with probability proportional to f/g
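A minimal sketch of these two steps in R. The target f (density 2x on (0,1)) and the sampling density g (uniform on (0,1)) are assumptions chosen for illustration, as are the sample sizes.

```r
set.seed(2)
f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)   # assumed target density
g <- function(x) dunif(x, 0, 1)                    # assumed sampling density

big <- runif(50000, 0, 1)        # step 1: a big sample from g
w   <- f(big) / g(big)           # importance weights f/g

# Step 2: resample with probability proportional to the weights.
# sample() rescales `prob` itself, so the weights need not sum to one.
draws <- sample(big, size = 5000, replace = TRUE, prob = w)

mean(draws)                      # compare with the target mean 2/3
```

Resampling with replacement keeps the code simple; the sub-sample is only approximately distributed as f, and the approximation improves as the first-stage sample grows.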
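6
Metropolis
start with current = 0.5
to get the next value: draw a “proposal” from g
keep the proposal with probability min(1, f(proposal)/f(current)) (this ratio is valid when g is symmetric or flat)
else keep current
[Figure: target density f and proposal density g]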
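The loop can be sketched as follows. The target f (density 2x on (0,1)) and a flat proposal g = unif(0,1) are assumptions for illustration; with a flat g the acceptance ratio reduces to f(proposal)/f(current).

```r
set.seed(3)
f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)   # assumed target density

n_iter  <- 20000
chain   <- numeric(n_iter)
current <- 0.5                                     # starting value from the slide
for (i in seq_len(n_iter)) {
  proposal <- runif(1, 0, 1)                       # proposal from g = unif(0,1) (assumed)
  # Accept with probability min(1, f(proposal) / f(current)).
  if (runif(1) < f(proposal) / f(current)) {
    current <- proposal
  }
  chain[i] <- current                              # a rejected proposal repeats the current value
}
mean(chain)                                        # compare with the target mean 2/3
```

Unlike the resampling scheme, the draws here are dependent: each value is either a fresh proposal or a repeat of the previous one.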
7
The Goal of Inference
Make inferences about unknown quantities using available information.
Inference: make probability statements
Unknowns: parameters, functions of parameters, states or latent variables, “future” outcomes, outcomes conditional on an action
Information:
- data-based
- non data-based: theories of behavior; subjective views; mechanism; parameters are finite or in some range
8
Bayes theorem
p(θ|D) ∝ p(D|θ) p(θ)
Posterior ∝ “Likelihood” × Prior
Modern Bayesian computing: simulation methods for generating draws from the posterior distribution p(θ|D).
9
Summarizing the posterior
Output from Bayesian inference: a possibly high-dimensional distribution
Summarize this object via simulation: marginal distributions of θ and of functions of θ; don’t just compute a posterior mean and variance
Contrast with sampling theory:
- point est/standard error
- summary of an irrelevant distribution
- bad summary (normal)
- limitations of asymptotics
10
Metropolis
Start somewhere with θ_current
To get the next value, generate a proposal θ_proposal
Accept with “probability”: min(1, p(θ_proposal|D) / p(θ_current|D)), for a symmetric proposal
else keep current
11
Example
Believe these measurements (D) come from N(μ,1):
0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497
Prior for μ?
p(μ) = 2μ for 0 ≤ μ ≤ 1
12
Example continued
p(D|μ)?
0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497
y1,…,y10
switch to R…
other priors? unif(0,1), norm(0,1), norm(0,100)
generating good candidates?
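One possible version of the R session for this example is sketched below. It assumes the prior p(μ) = 2μ is supported on (0,1) and uses a random-walk normal proposal; the step size, chain length, and burn-in are illustrative choices, not values from the slides.

```r
set.seed(4)
y <- c(0.9072867, -0.4490744, -0.1463117, 0.2525023, 0.9723840,
       -0.8946437, -0.2529104, 0.5101836, 1.2289795, 0.5685497)

# Log posterior up to a constant: log prior (2*mu on (0,1)) + N(mu, 1) log likelihood.
log_post <- function(mu) {
  if (mu <= 0 || mu >= 1) return(-Inf)
  log(2 * mu) + sum(dnorm(y, mean = mu, sd = 1, log = TRUE))
}

n_iter  <- 20000
step    <- 0.3                       # proposal standard deviation (assumed)
chain   <- numeric(n_iter)
current <- 0.5
for (i in seq_len(n_iter)) {
  proposal <- current + rnorm(1, 0, step)          # symmetric random-walk proposal
  # Accept with probability min(1, posterior ratio), computed on the log scale.
  if (log(runif(1)) < log_post(proposal) - log_post(current)) current <- proposal
  chain[i] <- current
}
draws <- chain[-(1:1000)]            # drop an initial burn-in segment
c(mean = mean(draws), quantile(draws, c(0.025, 0.975)))
```

Swapping in another prior only changes the first term of `log_post`; the proposal scale `step` is the knob to tune when candidates are accepted too rarely or too often.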
13
Prediction
See D, compute the “Predictive Distribution” of a future observable.
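In the usual notation the predictive distribution averages the sampling density of the future observable over the posterior. The symbol \tilde{y} for the future observable is an assumption, as is the condition that \tilde{y} is independent of D given θ.

```latex
p(\tilde{y} \mid D) = \int p(\tilde{y} \mid \theta)\, p(\theta \mid D)\, d\theta
```

With posterior draws in hand, this integral is approximated by drawing one \tilde{y} from p(\tilde{y} | θ) for each draw of θ.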
14
Bayes/Classical Estimators
With enough data the prior washes out: it is locally uniform relative to the likelihood. Bayes is consistent unless you have a dogmatic prior.
15
Bayesian Computations
Before simulation methods, Bayesians used posterior expectations of various functions, E[h(θ)|D] = ∫ h(θ) p(θ|D) dθ, as a summary of the posterior.
If p(θ|D) is in a convenient form (e.g. normal), then I might be able to compute this for some h.
16
Conjugate Families
Models with convenient analytic properties almost invariably come from conjugate families.
Why do I care now?
- conjugate models are used as building blocks
- build intuition re functions of Bayesian inference
Definition: A prior is conjugate to a likelihood if the posterior is in the same class of distributions as the prior.
Basically, conjugate priors are like the posterior from some imaginary dataset with a diffuse prior.
17
Beta-Binomial model
Need a prior!
18
Beta distribution
19
Posterior
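As a minimal sketch of the standard Beta-Binomial update: with a Beta(a, b) prior and y successes in n trials, the posterior is Beta(a + y, b + n - y). The hyperparameters and data below are hypothetical, and the grid comparison is only a numerical sanity check.

```r
a <- 2; b <- 2        # hypothetical prior hyperparameters
n <- 20; y <- 14      # hypothetical data: y successes in n trials

a_post <- a + y       # conjugate update
b_post <- b + n - y

# Brute-force check: normalize likelihood * prior on a grid (midpoint rule).
theta  <- seq(0.0005, 0.9995, by = 0.001)
unnorm <- dbinom(y, n, theta) * dbeta(theta, a, b)
grid   <- unnorm / sum(unnorm * 0.001)

max(abs(grid - dbeta(theta, a_post, b_post)))   # should be close to zero
c(post_mean = a_post / (a_post + b_post))
```

The update reads like adding y successes and n - y failures to an imaginary earlier dataset with a successes and b failures.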
20
Prediction
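Prediction in this model can be sketched by simulation: draw θ from the Beta posterior, then draw a future outcome given θ. The posterior parameters below are hypothetical; for one future Bernoulli trial the predictive success probability equals the posterior mean.

```r
set.seed(5)
a_post <- 16; b_post <- 8                   # hypothetical Beta posterior parameters
n_draws <- 50000

theta <- rbeta(n_draws, a_post, b_post)             # draws from the posterior
y_new <- rbinom(n_draws, size = 1, prob = theta)    # one future trial per posterior draw

mean(y_new)                        # simulated predictive probability of success
a_post / (a_post + b_post)         # exact value for comparison
```

Changing `size` gives the predictive distribution for the number of successes in several future trials, which is more spread out than a binomial with θ fixed at its posterior mean.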
21
Regression model
22
Bayesian Regression
Prior:
Inverted Chi-Square:
Interpretation as from another dataset.
Draw from prior?
23
Posterior
24
Combining quadratic forms
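A standard form of the identity is written out below, under assumed notation: a normal prior on β centered at \bar{\beta} with precision matrix A (up to the factor σ²), and \tilde{\beta} for the resulting posterior mean. These symbols are not taken from the slides.

```latex
(y - X\beta)'(y - X\beta) + (\beta - \bar{\beta})' A (\beta - \bar{\beta})
  = (\beta - \tilde{\beta})'(X'X + A)(\beta - \tilde{\beta})
  + (y - X\tilde{\beta})'(y - X\tilde{\beta})
  + (\tilde{\beta} - \bar{\beta})' A (\tilde{\beta} - \bar{\beta}),
\qquad
\tilde{\beta} = (X'X + A)^{-1}(X'y + A\bar{\beta})
```

Only the first term on the right depends on β, which is why the conditional posterior of β is normal with mean \tilde{\beta} and covariance σ²(X'X + A)^{-1}.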
25
Posterior
26
IID Simulations
Scheme: [y | X, β, σ²] [β | σ²] [σ²]
[β, σ² | y, X] = [σ² | y, X] [β | σ², y, X]
1) Draw [σ² | y, X]
2) Draw [β | σ², y, X]
3) Repeat
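The scheme can be sketched in R under an assumed natural conjugate prior: β | σ² ~ N(betabar, σ² A⁻¹) and σ² ~ nu0 · s0sq / χ²(nu0). The simulated data, the prior settings, and every variable name below are hypothetical.

```r
set.seed(6)
# Simulated data (hypothetical): intercept plus one regressor.
n <- 200; k <- 2
X <- cbind(1, rnorm(n))
y <- as.vector(X %*% c(1, 2) + rnorm(n, sd = 0.5))

# Assumed prior settings.
betabar <- rep(0, k); A <- 0.01 * diag(k); nu0 <- 3; s0sq <- 1

# Posterior quantities.
XtXA   <- crossprod(X) + A
btilde <- as.vector(solve(XtXA, crossprod(X, y) + A %*% betabar))
resid  <- y - as.vector(X %*% btilde)
nu1    <- nu0 + n
s1sq   <- (nu0 * s0sq + sum(resid^2) +
           as.numeric(t(btilde - betabar) %*% A %*% (btilde - betabar))) / nu1
Rinv   <- solve(chol(XtXA))          # Rinv %*% t(Rinv) equals (X'X + A)^{-1}

n_draws      <- 5000
sigma2_draws <- numeric(n_draws)
beta_draws   <- matrix(0, n_draws, k)
for (r in seq_len(n_draws)) {
  sigma2 <- nu1 * s1sq / rchisq(1, nu1)                            # 1) draw sigma2 | y, X
  beta   <- btilde + sqrt(sigma2) * as.vector(Rinv %*% rnorm(k))   # 2) draw beta | sigma2, y, X
  sigma2_draws[r] <- sigma2                                        # 3) store and repeat
  beta_draws[r, ] <- beta
}
colMeans(beta_draws)
mean(sigma2_draws)
```

Because each pass draws directly from the joint posterior, the draws are independent and no burn-in is needed.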
27
IID Simulator, cont.