Weakly informative priors · Interactions in before-after studies · Interactions in regressions · Conclusions
Creating structured and flexible models: some open problems

Andrew Gelman, Department of Statistics and Department of Political Science
Columbia University
8 June 2009
Andrew Gelman Creating structured and flexible models: some open problems
Themes

- Goal: models that are structured enough to be able to learn from data but not so strong as to overwhelm the data
- Collaborators: Joe Bafumi, Valerie Chan, Samantha Cook, Jeronimo Cortina, Zaiying Huang, Aleks Jakulin, Jouni Kerman, Gary King, Iain Pardoe, David Park, Maria Grazia Pittau, Boris Shor, Matt Stevens, Yu-Sung Su, Francis Tuerlinckx, Masanao Yajima, Shouhao Zhou
Separation in logistic regression · Bayesian solution · Prior information · Evaluation using a corpus of datasets
Logistic regression
[Figure: the curve y = logit⁻¹(x) over x from −6 to 6, rising from 0 to 1; slope = 1/4 at the center]
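The divide-by-4 shorthand can be checked numerically. This is a minimal sketch; the `inv_logit` helper is just for illustration:

```python
import math

def inv_logit(x):
    """Inverse logit: maps the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Slope of the curve at its center, via a central finite difference.
h = 1e-6
slope_at_zero = (inv_logit(h) - inv_logit(-h)) / (2 * h)
print(round(slope_at_zero, 4))  # 0.25, i.e. 1/4
```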
A clean example
[Figure: data and fitted curve, estimated Pr(y=1) = logit⁻¹(−1.40 + 0.33 x), over x from −10 to 20; slope = 0.33/4 at the midpoint]
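Applying the divide-by-4 rule to this fit: the curve is steepest where Pr(y=1) = 0.5, and its slope there is 0.33/4. A quick check, with the slide's coefficients hard-coded:

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

a, b = -1.40, 0.33          # fitted intercept and slope from the slide
x0 = -a / b                 # the point where Pr(y=1) = 0.5
h = 1e-6
max_slope = (inv_logit(a + b * (x0 + h)) - inv_logit(a + b * (x0 - h))) / (2 * h)
print(round(max_slope, 4))  # ≈ 0.0825, i.e. 0.33/4
```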
The problem of separation
[Figure: binary data y vs. x from −6 to 6 showing complete separation; slope = infinity?]
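A toy illustration of why no finite slope maximizes the likelihood under separation (the data here are made up for the sketch):

```python
import math

# Perfectly separated data: y = 0 whenever x < 0, y = 1 whenever x > 0.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]

def log_lik(beta):
    """Log-likelihood of a logistic regression with slope beta (no intercept)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-beta * x))
        total += math.log(p if y == 1 else 1.0 - p)
    return total

# The log-likelihood keeps increasing as the slope grows: no finite maximum.
print([round(log_lik(b), 3) for b in (1, 5, 10, 50)])
```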
Separation is no joke!
glm (vote ~ female + black + income, family=binomial(link="logit"))
                 1960                              1968
            coef.est  coef.se                 coef.est  coef.se
(Intercept)   -0.14     0.23     (Intercept)    0.47      0.24
female         0.24     0.14     female        -0.01      0.15
black         -1.03     0.36     black         -3.64      0.59
income         0.03     0.06     income        -0.03      0.07

                 1964                              1972
            coef.est  coef.se                 coef.est  coef.se
(Intercept)   -1.15     0.22     (Intercept)    0.67      0.18
female        -0.09     0.14     female        -0.25      0.12
black        -16.83   420.40     black         -2.63      0.27
income         0.19     0.06     income         0.09      0.05
bayesglm()
- Bayesian logistic regression
- In the arm (Applied Regression and Multilevel modeling) package in R
- Replaces glm(); estimates are more numerically and computationally stable
- Student-t prior distributions for regression coefs
- Uses an EM-like algorithm
- We went inside glm.fit to augment the iteratively weighted least squares step
- Default choices for tuning parameters (we'll get back to this!)
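bayesglm itself augments the IWLS step inside glm.fit; as a rough stand-in, this sketch finds the posterior mode of a single slope under a Cauchy(0, 2.5) prior by brute-force grid search. The data, the grid, and the one-parameter setup are illustrative assumptions, not the package's algorithm:

```python
import math

# Separated data, for which the maximum-likelihood slope is infinite.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 1, 1, 1]

def log_post(beta, scale=2.5):
    """Log-posterior: logistic log-likelihood plus an independent
    Cauchy(0, scale) log-prior on the slope (2.5 is the default scale)."""
    lp = -math.log(math.pi * scale * (1.0 + (beta / scale) ** 2))
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-beta * x))
        lp += math.log(p if y == 1 else 1.0 - p)
    return lp

# Coarse 1-D maximization; unlike the MLE, the posterior mode is finite.
grid = [i / 100.0 for i in range(1, 2000)]
beta_map = max(grid, key=log_post)
print(round(beta_map, 2))  # a finite estimate, around 1.9 for these data
```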
Regularization in action!
What else is out there?
- glm (maximum likelihood): fails under separation, gives noisy answers for sparse data
- Augment with prior “successes” and “failures”: doesn’t work well for multiple predictors
- brlr (Jeffreys-like prior distribution): computationally unstable
- brglm (improvement on brlr): doesn’t do enough smoothing
- BBR (Laplace prior distribution): OK, not quite as good as bayesglm
- Non-Bayesian machine learning algorithms: understate uncertainty in predictions
Information in prior distributions
- Informative prior dist
  - A full generative model for the data
- Noninformative prior dist
  - Let the data speak
  - Goal: valid inference for any θ
- Weakly informative prior dist
  - Purposely include less information than we actually have
  - Goal: regularization, stabilization
Weakly informative priors for logistic regression coefficients
- Separation in logistic regression
- Some prior info: logistic regression coefs are almost always between −5 and 5:
  - 5 on the logit scale takes you from 0.01 to 0.50, or from 0.50 to 0.99
  - Smoking and lung cancer
- Independent Cauchy prior dists with center 0 and scale 2.5
- Rescale each predictor to have mean 0 and sd 1/2
- Fast implementation using EM; easy adaptation of glm
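A sketch of the rescaling convention (using the population sd and a simple binary check as simplifying assumptions; the arm package's own rules are more detailed):

```python
import statistics

def rescale(x):
    """Shift a predictor to mean 0 and scale it to sd 0.5; binary
    inputs are only centered, not scaled (a simplified convention)."""
    m = statistics.mean(x)
    if set(x) <= {0, 1}:                # binary: center only
        return [xi - m for xi in x]
    s = statistics.pstdev(x)            # population sd, a simplifying choice
    return [(xi - m) / (2 * s) for xi in x]

income = [1, 2, 3, 4, 5]                # hypothetical predictor values
z = rescale(income)
print(round(statistics.pstdev(z), 2))   # 0.5
```

Putting every coefficient on this common scale is what makes a single default prior scale of 2.5 reasonable across predictors.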
Prior distributions
[Figure: prior densities for a coefficient θ, plotted over −10 to 10]
Another example
 Dose   #deaths/#animals
−0.86        0/5
−0.30        1/5
−0.05        3/5
 0.73        5/5
- Slope of a logistic regression of Pr(death) on dose:
  - Maximum likelihood est is 7.8 ± 4.9
  - With weakly informative prior: Bayes est is 4.4 ± 1.9
- Which is truly conservative?
- The sociology of shrinkage
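The maximum-likelihood slope can be reproduced from the table above. This sketch uses plain Newton-Raphson rather than the actual glm machinery:

```python
import math

# Bioassay data from the slide: dose, deaths out of 5 animals each.
doses  = [-0.86, -0.30, -0.05, 0.73]
deaths = [0, 1, 3, 5]
n = 5

def fit_mle(iters=25):
    """Maximum-likelihood logistic regression (intercept a, slope b)
    via Newton-Raphson on the binomial log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(doses, deaths):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            g0 += y - n * p                  # score for the intercept
            g1 += (y - n * p) * x            # score for the slope
            w = n * p * (1.0 - p)            # IWLS-style weight
            h00 += w; h01 += w * x; h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det     # Newton step
        b += (h00 * g1 - h01 * g0) / det
    return a, b

a, b = fit_mle()
print(round(b, 1))  # close to the 7.8 reported on the slide
```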
Maximum likelihood and Bayesian estimates
[Figure: probability of death vs. dose, with fitted curves from glm and bayesglm]
Conservatism of Bayesian inference
- Problems with maximum likelihood when data show separation:
  - Coefficient estimate of −∞
  - Estimated predictive probability of 0 for new cases
- Is this conservative?
- Not if evaluated by log score or predictive log-likelihood
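The log-score point can be made concrete: a predicted probability of 0 is ruinous the moment the event occurs even once. The clipping constant eps is an illustrative assumption:

```python
import math

def log_score(p, y, eps=1e-12):
    """Log predictive score for a single binary outcome; probabilities
    are clipped away from 0 and 1 so the log is finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p if y == 1 else 1.0 - p)

# The "conservative-looking" extreme prediction Pr(y=1) = 0 scores far
# worse than a shrunk prediction when the event happens.
print(round(log_score(0.0, 1), 1))   # -27.6 (at eps = 1e-12)
print(round(log_score(0.05, 1), 2))  # -3.0: shrinkage bounds the loss
```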
Which one is conservative?

[Figure: the same plot of probability of death vs. dose (0 to 20), comparing the glm and bayesglm fits]
Prior as population distribution

- Consider many possible datasets
- The "true prior" is the distribution of β's across these datasets
- Fit one dataset at a time
- A "weakly informative prior" has less information (wider variance) than the true prior
- Open question: how to formalize the tradeoffs from using different priors?
Evaluation using a corpus of datasets

- Compare classical glm to Bayesian estimates using various prior distributions
- Evaluate using 5-fold cross-validation and average predictive error
- The optimal prior distribution for the β's is approximately Cauchy(0, 1)
- Our Cauchy(0, 2.5) prior distribution is weakly informative!
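The evaluation scheme can be sketched on a toy problem. This is an illustrative sketch, not the corpus or code from the actual study (the data generator, fitting routine, and candidate scales are all invented): each prior scale is scored by 5-fold cross-validated average −log test likelihood, using a one-coefficient logistic model fit by gradient ascent.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_map(pairs, prior_scale, steps=1000, lr=0.02):
    # MAP estimate of a one-coefficient logistic model, Cauchy(0, prior_scale) prior
    b = 0.0
    for _ in range(steps):
        g = sum(xi * (yi - sigmoid(b * xi)) for xi, yi in pairs)
        g += -2 * b / (b * b + prior_scale ** 2)  # gradient of the log-prior
        b += lr * g
    return b

def neg_log_lik(pairs, b):
    # average -log predictive likelihood on held-out pairs (lower is better)
    return -sum(math.log(sigmoid(b * xi if yi else -b * xi))
                for xi, yi in pairs) / len(pairs)

# toy data generated with true coefficient 1.0
data = [(xi, int(random.random() < sigmoid(xi)))
        for xi in (random.gauss(0, 1) for _ in range(100))]

folds = [data[i::5] for i in range(5)]  # 5-fold split
results = {}
for scale in (0.5, 2.5, 10.0):
    losses = []
    for k in range(5):
        train = [p for j, f in enumerate(folds) if j != k for p in f]
        losses.append(neg_log_lik(folds[k], fit_map(train, scale)))
    results[scale] = sum(losses) / len(losses)
    print(f"prior scale {scale}: mean -log test likelihood = {results[scale]:.3f}")
```

The same loop, run over a real corpus of regressions rather than one toy dataset, is what produces a curve of predictive loss as a function of the prior scale.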
Expected predictive loss, averaged over a corpus of datasets

[Figure: −log test likelihood (0.29 to 0.33) vs. scale of prior (0 to 5), with curves for t priors with df = 0.5, 1.0, 2.0, 4.0, 8.0, plus BBR(l) and BBR(g) comparisons; the GLM baseline sits off the chart at 1.79]
Other examples of weakly informative priors

- Variance parameters
- Covariance matrices
- Population variation in a physiological model
- Mixture models
- Intentional underpooling in hierarchical models
End of part 1

- "Noninformative priors" are actually weakly informative
- "Weakly informative" is a more general and useful concept
  - Regularization:
    - Better inferences
    - Stability of computation (bayesglm)
- Why use weakly informative priors rather than informative priors?
  - Conformity with statistical culture ("conservatism")
  - Labor-saving device
  - Robustness
No-interaction model

- Before-after data with treatment and control groups
- Default model: constant treatment effects
- Fisher's classical null hypothesis: effect is zero for all cases
- Regression model: y_i = T_i θ + X_i β + ε_i

[Figure: "after" measurement y vs. "before" measurement x, with fitted lines for the control and treatment groups]
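The constant-effects regression can be sketched on simulated data. This is an illustrative sketch with invented numbers, not data from the talk: generate before-after observations with a constant treatment effect θ = 2, then recover θ and β by ordinary least squares, using the fact that OLS with an intercept equals OLS on centered variables.

```python
import random

random.seed(2)

n = 1000
T, x, y = [], [], []
for i in range(n):
    Ti = i % 2                     # alternate treatment assignment
    xi = random.gauss(0, 1)        # "before" measurement
    # constant treatment effect theta = 2, slope beta = 0.8
    yi = 2.0 * Ti + 0.8 * xi + random.gauss(0, 0.5)
    T.append(Ti); x.append(xi); y.append(yi)

def center(v):
    m = sum(v) / len(v)
    return [u - m for u in v]

# Centering absorbs the intercept, leaving a 2x2 normal-equations solve.
Tc, xc, yc = center(T), center(x), center(y)
Stt = sum(t * t for t in Tc)
Sxx = sum(u * u for u in xc)
Stx = sum(t * u for t, u in zip(Tc, xc))
Sty = sum(t * v for t, v in zip(Tc, yc))
Sxy = sum(u * v for u, v in zip(xc, yc))
det = Stt * Sxx - Stx * Stx
theta = (Sty * Sxx - Stx * Sxy) / det  # estimated treatment effect
beta = (Stt * Sxy - Stx * Sty) / det   # estimated "before" slope
print(theta, beta)
```

Because the model forces a single θ for every unit, the fitted treatment and control lines are parallel; the next slides are about what happens when real data violate that assumption.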
Actual data show interactions

- Treatment interacts with the "before" measurement
- Before-after correlation is higher for controls than for treated units
- Examples:
  - An observational study of legislative redistricting
  - An experiment with pre-test, post-test data
  - Congressional elections with incumbents and open seats
Observational study of legislative redistricting: before-after data

[Figure: estimated partisan bias, adjusted for state (−0.05 to 0.05), vs. estimated partisan bias in the previous election, with separate symbols for no redistricting, bipartisan redistricting, Democratic redistricting, and Republican redistricting; annotations mark the directions favoring Democrats and favoring Republicans]
Educational experiment: correlation between pre-test and post-test data for controls and for treated units

[Figure: pre-test/post-test correlation (0.8 to 1.0) vs. grade (1 to 4), plotted separately for controls and treated units]
Correlation between two successive Congressional elections for incumbents running (controls) and open seats (treated)

[Figure: correlation (0.0 to 0.8) vs. year (1900 to 2000), plotted separately for incumbents and open seats]
Interactions as variance components

- y_i = T_i θ + X_i β + ε_i + η_i
- Unit-level "error term" η_i
- For control units, η_i persists from time 1 to time 2
- For treatment units, η_i changes:
  - Subtractive treatment error (η_i only at time 1)
  - Additive treatment error (η_i only at time 2)
  - Replacement treatment error
- Under all these models, the before-after correlation is higher for controls than for treated units
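The claim in the last bullet can be checked by simulation. A minimal sketch with invented variances, using the replacement-error variant: controls keep their unit-level term η from time 1 to time 2, while for treated units η is redrawn at time 2, which drives the before-after correlation toward zero for the treated group. (For simplicity the same "before" draws are reused for both scenarios.)

```python
import random

random.seed(1)

def corr(a, b):
    # plain Pearson correlation
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

before, after_control, after_treated = [], [], []
for _ in range(5000):
    eta = random.gauss(0, 1)  # persistent unit-level term
    before.append(eta + random.gauss(0, 0.5))
    # control: the same eta persists to time 2
    after_control.append(eta + random.gauss(0, 0.5))
    # treated (replacement error): eta is redrawn at time 2
    after_treated.append(random.gauss(0, 1) + random.gauss(0, 0.5))

r_control = corr(before, after_control)  # about 1 / 1.25 = 0.8 in expectation
r_treated = corr(before, after_treated)  # about 0 in expectation
print(r_control, r_treated)
```

The subtractive and additive variants give a smaller but still positive gap; under all three, controls show the higher before-after correlation, matching the patterns in the redistricting, educational, and incumbency examples.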
End of part 2
▸ Treatment and control can be modeled asymmetrically
▸ Additional variance component
Federal spending · Vote preferences · Income and voting · Incentives in sample surveys
Examples of interactions in regression
▸ Federal spending by state, year, category (50 × 19 × 10)
▸ Vote preference given state and demographic variables (50 × 2 × 2 × 4 × 4)
▸ Rich state, poor state, red state, blue state (50 × 2 for each election)
▸ Meta-analysis of incentives in sample surveys (26)
General concerns
▸ Lots of potential interactions
▸ Setting high-level interactions to zero? Too extreme, especially when interactions are of substantive interest
▸ A simple hierarchical model for interactions is too crude
▸ Model: large main effects can have large interactions. In a hierarchical setting, the model should come "naturally"
Federal spending by state
▸ Federal spending by state, year, category (50 × 19 × 10)
▸ For each spending category, a 50 × 19 data structure
▸ y_jt = α_j + β_t + γ_jt
▸ Possible model: γ_jt ∼ N(0, A + B|α_j β_t|)
▸ Some example data
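The possible model can be turned into a small generative sketch. A and B here are invented hyperparameters, not estimates from the spending data; the point is just that cells with large main-effect products |α_j β_t| get noisier interactions γ_jt:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions from the talk; A, B and the main-effect scales are assumptions.
J, T = 50, 19
A, B = 0.001, 2.0
alpha = rng.normal(0, 0.05, J)        # state main effects
beta = rng.normal(0, 0.05, T)         # year main effects
prod = np.abs(np.outer(alpha, beta))  # |alpha_j * beta_t| for every cell

# Interaction variance grows with the main-effect product: A + B*|alpha_j*beta_t|
gamma = rng.normal(0, np.sqrt(A + B * prod))

# Cells with above-median |alpha_j*beta_t| should show larger |gamma_jt|
hi = np.abs(gamma[prod > np.median(prod)]).mean()
lo = np.abs(gamma[prod <= np.median(prod)]).mean()
print(hi > lo)
```

This is the pattern the next figure looks for in the actual data: |γ_jt| increasing with |α_j β_t|.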
Interactions |γ_jt| plotted vs. main effects |α_j β_t|
[Figure: |γ_jt| (0–0.20) plotted against |α_j β_t| (0–0.08) for the "Disretire" category of federal spending, inflation-adjusted]
Logistic regression for pre-election polls
▸ Logistic regression predicting vote from demographic and geographic factors: sex, ethnicity, age, education, state
▸ Hierarchical model for 4 age levels, 4 education levels, 16 age × education, 50 states
▸ Also consider interactions such as ethnicity × state
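Treating a batch like the 16 age × education terms hierarchically amounts to partial pooling toward zero. A minimal sketch of that shrinkage, with made-up cell estimates, standard errors, and hierarchical sd τ (not the actual poll fit):

```python
import numpy as np

# Hypothetical raw age x education interaction estimates (logit scale)
# and their standard errors; tau is an assumed hierarchical sd.
raw = np.array([0.40, -0.25, 0.10, -0.60])  # a few of the 16 cells
se = np.array([0.30, 0.15, 0.50, 0.20])
tau = 0.25

# Classical partial-pooling formula: shrink each estimate toward the
# group mean (here 0) by a factor depending on se relative to tau.
shrinkage = tau**2 / (tau**2 + se**2)
pooled = shrinkage * raw

print(np.round(pooled, 3))  # noisier cells (larger se) are pulled harder toward 0
```

In the full model τ is itself estimated from the 16 cells, so the data decide how much the interaction batch is shrunk.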
Prediction error as function of # of predictors
[Figure: two panels, "MSE: training sample" and "MSE: test sample", each plotting MSE (0.22–0.25) against number of parameters (0–150)]
Red state, blue state, rich state, poor state
▸ Richer voters favor the Republicans, but
▸ Richer states favor the Democrats
▸ Hierarchical logistic regression: predict your vote given your income and your state ("varying-intercept model")
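In formula terms the varying-intercept model is Pr(Republican vote) = logit⁻¹(a_state + b · income). A toy version with invented coefficients (the real a_state values come from the hierarchical fit to the poll data):

```python
import numpy as np

def invlogit(x):
    return 1 / (1 + np.exp(-x))

# Assumed coefficients for illustration only: a common income slope b,
# plus a state-specific intercept a_state (higher = more Republican).
b = 0.3
a_state = {"Connecticut": -0.6, "Ohio": 0.0, "Mississippi": 0.5}

income = np.linspace(-2, 2, 5)  # standardized income
for state, a in a_state.items():
    p = invlogit(a + b * income)
    print(state, np.round(p, 2))
```

Within each state the probability rises with income; across states the whole curve shifts up or down, which is exactly the pattern in the figure below.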
Varying-intercept model
[Figure: probability of voting Republican (0.25–0.75) vs. income (−2 to 2), one fitted curve each for Connecticut, Ohio, and Mississippi]
Varying-intercept, varying-slope model
[Figure: probability of voting Republican (0.4–0.8) vs. income (−2 to 2) for Connecticut, Ohio, and Mississippi, with state-specific intercepts and slopes]
Interactions!
[Figure: "Avg Income 2000 vs. Var Slope 2000": estimated income slope (0.0–0.5) for each state plotted against average state income in $10k (2.0–3.5), points labeled by state abbreviation]
3-way interactions!
[Figure: six panels, one per election year from 1980 to 2000: probability of voting Republican vs. income, with separate curves for Connecticut, Ohio, and Mississippi in each panel]
Meta-analysis of effects of incentives on survey response rates

▸ 6 factors:
  ▸ Incentive or not
  ▸ Value of incentive
  ▸ Form (gift or cash)
  ▸ Timing (before or after)
  ▸ Mode (telephone or face-to-face)
  ▸ Burden (short or long survey)
▸ Models:
  ▸ No interactions: estimates don't make sense
  ▸ Interactions: estimates are out of control
Model without interactions
▸ Estimated effects on response rate (in percentage points):

                         Beta   (s.e.)
    Intercept             1.4   (1.6)
    Value of incentive    0.34  (0.17)
    Prepayment            2.8   (1.8)
    Gift                 −6.9   (1.5)
    Burden                3.3   (1.3)

▸ Will a low-value postpaid gift really reduce response rates by 7 percentage points??
Models with interactions
                                 Model I       Model II      Model III     Model IV
    Constant                     60.7 (2.2)    60.8 (2.5)    61.0 (2.5)    60.1 (2.5)
    Incentive                     5.4 (0.7)     3.7 (0.8)     2.8 (1.0)     6.1 (1.2)
    Mode                         15.2 (4.7)    16.1 (5.1)    16.0 (4.9)    18.0 (4.6)
    Burden                       −7.2 (4.3)    −8.9 (5.0)    −8.7 (5.0)    −9.9 (5.0)
    Mode × Burden                              −7.6 (9.8)    −7.8 (9.4)    −4.9 (9.1)
    Incentive × Value                           0.14 (0.03)   0.33 (0.09)   0.26 (0.09)
    Incentive × Timing                          4.4 (1.3)     1.7 (1.7)    −0.2 (2.1)
    Incentive × Form                            1.4 (1.3)     1.1 (1.2)    −1.2 (2.0)
    Incentive × Mode                           −2.3 (1.6)    −2.0 (1.7)     7.8 (2.9)
    Incentive × Burden                          4.8 (1.5)     5.4 (1.8)    −5.2 (2.7)
    Incentive × Value × Timing                                0.40 (0.17)   0.58 (0.18)
    Incentive × Value × Burden                               −0.06 (0.06)   1.10 (0.24)
    Incentive × Timing × Burden                                            11.1 (3.9)
    Incentive × Value × Form                                                0.30 (0.20)
    Incentive × Value × Mode                                               −1.20 (0.24)
    Incentive × Timing × Form                                               9.9 (2.7)
    Incentive × Timing × Mode                                             −17.4 (4.1)
    Incentive × Form × Mode                                                −0.3 (2.5)
    Incentive × Form × Burden                                               5.9 (3.2)
    Incentive × Mode × Burden                                              −5.8 (3.0)
    Within-study sd, σ            4.2 (0.3)     3.6 (0.3)     3.6 (0.3)     2.8 (0.3)
    Between-study sd, τ          18 (2)        19 (2)        18 (2)        18 (2)
Structured hierarchical models
▸ Need to go beyond exchangeability to shrink batches of parameters in a reasonable way
▸ For example, parameter matrices α_jk don't look like exchangeable vectors
▸ Similar problems arise in shrinking higher-order terms in neural nets, wavelets, tree models, image models, . . .
▸ Recall the "blessing of dimensionality": as the number of factors, and the number of levels per factor, increases, more information is available to estimate the hyperparameters of the big model
▸ In the background: advances in Bayesian computation, including parameter expansion (Meng, Liu, Liu, Rubin, van Dyk), adaptive Metropolis algorithms (Pasarica), and structured computations (Kerman)
Andrew Gelman Creating structured and flexible models: some open problems
Weakly informative priorsInteractions in before-after studies
Interactions in regressionsConclusions
Federal spendingVote preferencesIncome and votingIncentives in sample surveys
Structured hierarchical models
I Need to go beyond exchangeability to shrink batches ofparameters in a reasonable way
I For example, parameter matrices αjk don’t look likeexchangeable vectors
I Similar problems arise in shrinking higher-order terms inneural nets, wavelets, tree models, image models, . . .
I Recall the “blessing of dimensionality”: as the number offactors, and the number of levels per factor, increases, moreinformation is available to estimate the hyperparameters ofthe big model
I In the background: advances in Bayesian computationincluding parameter expansion (Meng, Liu, Liu, Rubin, vanDyk), adaptive Metropolis algorithms (Pasarica), structuredcomputations (Kerman)
Andrew Gelman Creating structured and flexible models: some open problems
Weakly informative priorsInteractions in before-after studies
Interactions in regressionsConclusions
Federal spendingVote preferencesIncome and votingIncentives in sample surveys
Structured hierarchical models
I Need to go beyond exchangeability to shrink batches ofparameters in a reasonable way
I For example, parameter matrices αjk don’t look likeexchangeable vectors
I Similar problems arise in shrinking higher-order terms inneural nets, wavelets, tree models, image models, . . .
I Recall the “blessing of dimensionality”: as the number offactors, and the number of levels per factor, increases, moreinformation is available to estimate the hyperparameters ofthe big model
I In the background: advances in Bayesian computationincluding parameter expansion (Meng, Liu, Liu, Rubin, vanDyk), adaptive Metropolis algorithms (Pasarica), structuredcomputations (Kerman)
Andrew Gelman Creating structured and flexible models: some open problems
Weakly informative priorsInteractions in before-after studies
Interactions in regressionsConclusions
Federal spendingVote preferencesIncome and votingIncentives in sample surveys
Structured hierarchical models
I Need to go beyond exchangeability to shrink batches ofparameters in a reasonable way
I For example, parameter matrices αjk don’t look likeexchangeable vectors
I Similar problems arise in shrinking higher-order terms inneural nets, wavelets, tree models, image models, . . .
I Recall the “blessing of dimensionality”: as the number offactors, and the number of levels per factor, increases, moreinformation is available to estimate the hyperparameters ofthe big model
I In the background: advances in Bayesian computationincluding parameter expansion (Meng, Liu, Liu, Rubin, vanDyk), adaptive Metropolis algorithms (Pasarica), structuredcomputations (Kerman)
Andrew Gelman Creating structured and flexible models: some open problems
Weakly informative priorsInteractions in before-after studies
Interactions in regressionsConclusions
Federal spendingVote preferencesIncome and votingIncentives in sample surveys
End of part 3
I Interactions are important
I Technical challenges in modeling and computation
I Blessing of dimensionality
Another example: who supports school vouchers?
What have we learned?
I Models need structure but not too much structure
I Interactions are important
I Treatment interactions in before-after studies
I 2-way, 3-way, . . . interactions in regression models
I Conservatism in statistics
I Weak prior information is key
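As a concrete sketch of that last point (the data and step sizes below are invented for illustration): a weakly informative Cauchy(0, 2.5) prior on a logistic-regression coefficient, in the spirit of the Gelman, Jakulin, Pittau, and Su proposal, keeps the estimate finite even under complete separation, where the maximum likelihood estimate diverges.

```python
import numpy as np

# Hypothetical illustration: a weakly informative Cauchy(0, 2.5) prior on a
# single logistic-regression slope.  The toy data are completely separated,
# so the MLE is infinite; the MAP estimate under the weak prior stays finite.

x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])   # complete separation
scale = 2.5                          # Cauchy prior scale

def grad(beta):
    """Gradient of the log posterior: logistic log-likelihood plus
    log Cauchy(0, scale) prior density."""
    p = 1 / (1 + np.exp(-beta * x))
    return np.sum((y - p) * x) - 2 * beta / (scale**2 + beta**2)

beta = 0.0
for _ in range(5000):                # simple gradient ascent to the MAP
    beta += 0.05 * grad(beta)

print(beta)   # finite; without the prior term, beta would grow without bound
```

The prior is weak enough to let the data dominate when they are informative, but strong enough to rule out absurdly large coefficients — "structured enough to learn from data but not so strong as to overwhelm the data."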