Calibrated Bayes: an attractive framework for official statistics in the
21st century
Roderick J. Little
Overview
• Design-based versus model-based survey inference
• Current orthodoxy: design-model compromise
– Strengths and drawbacks
• An alternative: Calibrated Bayes
• Two US Census Bureau applications
– Disclaimer: views are mine, not US Census Bureau
NTTS 2015: Calibrated Bayes 2
Survey estimation
• Design-based inference: population values are fixed, inference is based on probability distribution of sample selection. Obviously this assumes that we have a probability sample (or “quasi-randomization”, where we pretend that we have one)
• Model-based inference: survey variables are assumed to come from a statistical model: probability sampling is not the basis for inference, but useful for making the sample selection ignorable. (see e.g. Gelman et al., 2003; Little 2004)
Design vs model-based survey inference
• Two main variants of model-based inference:
– Superpopulation models: Frequentist inference based on repeated samples from a “superpopulation” model
– Bayes: add prior distribution for parameters; inference about finite population quantities or parameters based on posterior distribution
• A fascinating part of the more general debate about frequentist versus Bayesian inference in statistics at large:
– Design-based inference is inherently frequentist
– Purest form of model-based inference is Bayes
Design-based inference
$Y = (Y_1, \ldots, Y_N)$ = population values (fixed); $Z$ = design variables
$Q = Q(Y, Z)$ = finite population quantity
$I = (I_1, \ldots, I_N)$ = sample inclusion indicators (random)
$I_i = 1$ if unit $i$ is included in the sample, $I_i = 0$ otherwise
$Y_{\text{inc}}$ = part of $Y$ included in the survey
$\hat{q} = \hat{q}(Y_{\text{inc}}, I, Z)$ = sample estimate of $Q$
$\hat{V} = \hat{V}(Y_{\text{inc}}, I, Z)$ = sample estimate of $V$, the variance of $\hat{q}$
$\left(\hat{q} - 1.96\sqrt{\hat{V}},\ \hat{q} + 1.96\sqrt{\hat{V}}\right)$ = 95% confidence interval for $Q$
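A minimal sketch of this recipe for the simplest case, a simple random sample and the population mean as the target quantity. The population values, sizes and variable names are made up for illustration, and the variance formula is the textbook one for simple random sampling without replacement, not something taken from the slides.

```python
import math
import random

random.seed(1)

# Hypothetical finite population of N fixed values (illustrative only).
N = 1000
Y = [random.gauss(50, 10) for _ in range(N)]
Q = sum(Y) / N  # finite population quantity: here the population mean

# Simple random sample without replacement: I_i = 1 for the sampled units.
n = 100
sampled = random.sample(range(N), n)
y_inc = [Y[i] for i in sampled]

# Sample estimate of Q and of the variance of that estimate.
q_hat = sum(y_inc) / n
s2 = sum((y - q_hat) ** 2 for y in y_inc) / (n - 1)
V_hat = (1 - n / N) * s2 / n  # includes the finite population correction

half_width = 1.96 * math.sqrt(V_hat)
ci = (q_hat - half_width, q_hat + half_width)
print(f"Q = {Q:.2f}, q_hat = {q_hat:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The only randomness that the interval accounts for is the choice of `sampled`; the values in `Y` are treated as fixed.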
Choice of $\hat{q}$
Seek good design-based properties:
• Design unbiasedness: $E(\hat{q} \mid Y) = Q$ (too strong)
• Or weaker: design consistency: $\hat{q} \to Q$ as sample size gets large
There are many choices of design-consistent estimates ...
It is natural to seek an estimate that is design-efficient. However, this kind of optimality is not possible without a model (Horvitz and Thompson 1952, Godambe 1955)
Many survey estimates are motivated by implicit models:
• Regression model → regression estimator
• Ratio model → ratio estimator, etc.
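Design unbiasedness can be checked exactly on a toy population by enumerating every possible sample. The sketch below assumes simple random sampling of n = 2 units from N = 4 made-up values and uses the sample mean as the estimate; the expectation is taken over samples, with the population values held fixed.

```python
from itertools import combinations

Y = [3.0, 7.0, 8.0, 10.0]   # hypothetical fixed population values
N, n = len(Y), 2
Q = sum(Y) / N               # population mean

# Under simple random sampling every subset of size n is equally likely.
samples = list(combinations(range(N), n))
estimates = [sum(Y[i] for i in s) / n for s in samples]

# Design expectation: average of q_hat over the sampling distribution.
expected_q_hat = sum(estimates) / len(estimates)
print(Q, expected_q_hat)
```

No model for the values in `Y` is needed for this property to hold, which is the appeal of the design-based approach.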
Limitations of design-based approach
• Inference is based on probability sampling, but true probability samples are harder and harder to come by:
– Noncontact, nonresponse is increasing
– Face-to-face interviews increasingly expensive
– High proportion of available information is now not based on probability samples (e.g. internet, administrative data)
• Theory is basically asymptotic -- limited tools for small samples, e.g. small area estimation
[Figure: cartoon map. Labels: “Asymptotia Highlands”, “Murky sub-asymptotial forests”. Speech bubble: “How many more to reach the promised land of asymptotia?” Caption: Design-based methods live in the land of asymptotia]
Model-based approaches
• In model-based, or model-dependent, approaches, models are the basis for the entire inference: estimator, standard error, interval estimation
• Two variants:
– Superpopulation modeling
– Bayesian (full probability) modeling
• Common theme is to “infer” or “predict” about non-sampled portion of the population, conditional on the sample and model
• Superpopulation is super, but Bayes is better … for small samples
Bayes inference for surveys
Model: $p(Y \mid Z)$ = prior distribution for $Y$
Data: $Y_{\text{inc}}$ = sampled values of $Y$; $Z$ = design variables
Inference about $Q = Q(Y, Z)$ is based on the posterior predictive distribution $p(Q(Y, Z) \mid Y_{\text{inc}}, Z)$
In particular:
• One estimate is the posterior mean: $\hat{q} = E(Q \mid Y_{\text{inc}}, Z)$
• Standard error is the posterior sd: $\sqrt{\mathrm{Var}(Q \mid Y_{\text{inc}}, Z)}$
• 95% posterior probability interval plays the role of confidence interval (with a simpler interpretation)
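A minimal sketch of posterior predictive inference for a finite population proportion. It assumes a deliberately simple model that is not from the slides: binary outcomes that are independent Bernoulli given a rate theta, a uniform prior on theta, and no design variables. All numbers are hypothetical.

```python
import random
import statistics

random.seed(2)

N, n, x = 200, 50, 6   # hypothetical population size, sample size, sampled successes
a0, b0 = 1.0, 1.0      # uniform Beta(1, 1) prior on theta (an assumption)

def draw_Q():
    """One draw of Q (the population proportion) from its posterior predictive distribution."""
    theta = random.betavariate(a0 + x, b0 + n - x)                # theta | Y_inc
    unseen = sum(random.random() < theta for _ in range(N - n))   # non-sampled Y | theta
    return (x + unseen) / N

draws = sorted(draw_Q() for _ in range(4000))
post_mean = statistics.fmean(draws)      # plays the role of q_hat
post_sd = statistics.stdev(draws)        # plays the role of the standard error
interval = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws)) - 1])

# Analytic posterior mean for comparison: sampled part plus predicted non-sampled part.
exact_mean = (x + (N - n) * (a0 + x) / (a0 + b0 + n)) / N
print(post_mean, exact_mean, post_sd, interval)
```

The sampled units contribute their observed values; only the non-sampled part of the population is predicted.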
Parametric models
Usually prior distribution is specified via parametric models:
$p(Y \mid Z) = \int p(Y \mid Z, \theta)\, p(\theta \mid Z)\, d\theta$
$p(Y \mid Z, \theta)$ = parametric model, as in superpopulation approach
$p(\theta \mid Z)$ = prior distribution for $\theta$
Inference about $\theta$ is then obtained from its posterior distribution, computed via Bayes’ Theorem:
$p(\theta \mid Y_{\text{inc}}, Z) \propto p(\theta \mid Z)\, L(\theta \mid Y_{\text{inc}}, Z)$
$L(\theta \mid Y_{\text{inc}}, Z)$ = Likelihood function
That is: Posterior = Prior x Likelihood
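“Posterior = Prior x Likelihood” can be made concrete with a grid approximation. The sketch assumes a binomial likelihood and a Beta prior with made-up parameters, and checks the grid result against the known conjugate answer.

```python
import math

n, x = 20, 3              # hypothetical data: x successes in n trials
a, b = 2.0, 8.0           # hypothetical Beta(a, b) prior for theta

grid = [(i + 0.5) / 2000 for i in range(2000)]   # midpoints on (0, 1)

def prior(t):
    return t ** (a - 1) * (1 - t) ** (b - 1)     # unnormalized Beta density

def likelihood(t):
    return math.comb(n, x) * t ** x * (1 - t) ** (n - x)

unnorm = [prior(t) * likelihood(t) for t in grid]   # prior times likelihood
total = sum(unnorm)
posterior = [u / total for u in unnorm]             # normalized grid weights

grid_mean = sum(t * w for t, w in zip(grid, posterior))
exact_mean = (a + x) / (a + b + n)                  # conjugate result: Beta(a + x, b + n - x)
print(grid_mean, exact_mean)
```

Normalizing constants of the prior and the likelihood cancel in the division by `total`, which is why the proportionality sign is enough.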
Example. Spline model on weights
[Figure: $Z$ is available for both the sample and the rest of the population; $Y$ is observed for the sample only]
$\bar{y}_{\text{HT}} = \frac{1}{N} \sum_{i=1}^{n} y_i / \pi_i$; $\pi_i$ = selection prob
A modeling alternative to the HT estimator is to create predictions from a more robust model relating $Y$ to $Z$:
$\bar{y}_{\text{mod}} = \frac{1}{N} \left( \sum_{i=1}^{n} y_i + \sum_{i=n+1}^{N} \hat{y}_i \right)$, $\hat{y}_i$ = predictions from model, e.g.:
$y_i \sim \text{Nor}(\beta \pi_i, \sigma^2 \pi_i^2)$; leads to $\bar{y}_{\text{HT}}$
$y_i \sim \text{Nor}(S(\pi_i), \sigma^2 \pi_i^k)$; $S(\pi_i)$ = penalized spline of $Y$ on $Z$
Simulations in Zheng and Little (2005) suggest better RMSE, confidence coverage for spline model compared with design-based approaches
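One way to see the link between the HT estimator and the model with mean proportional to the selection probability is the following sketch. It assumes selection probabilities that sum to the sample size, uses made-up data, and fits the slope by weighted least squares with weights 1/pi_i^2 (the natural choice under that variance structure). The penalized spline itself is not implemented here.

```python
# Hypothetical population of N = 6 units with known selection probabilities
# that sum to the sample size n = 3 (as in a fixed-size unequal-probability design).
pi = [0.2, 0.3, 0.4, 0.5, 0.7, 0.9]
N = len(pi)
sample = {1: 3.1, 3: 5.2, 5: 8.8}   # unit index -> observed y (made-up values)
n = len(sample)

# Horvitz-Thompson estimate of the population mean.
ht_mean = sum(y / pi[i] for i, y in sample.items()) / N

# Model y_i ~ Nor(beta * pi_i, sigma^2 * pi_i^2): weighted least squares with
# weights 1 / pi_i^2 gives beta_hat = mean of y_i / pi_i over the sample.
beta_hat = sum(y / pi[i] for i, y in sample.items()) / n
y_hat = [beta_hat * p for p in pi]

# Projective form: average the model predictions over all N units.
projective_mean = sum(y_hat) / N

# Predictive form: keep the observed y, predict only the non-sampled units.
predictive_mean = (sum(sample.values())
                   + sum(y_hat[i] for i in range(N) if i not in sample)) / N

print(ht_mean, projective_mean, predictive_mean)
```

In this toy setup the projective form reproduces the HT estimate exactly, while the predictive form differs slightly because it keeps the observed values for sampled units. Replacing the linear mean function with a flexible curve in the selection probability is the step the spline model takes.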
The model-based perspective: pros
• Flexible, unified approach for all survey problems
– Models for nonresponse, response and matching errors, small area models, combining data sources
• Bayesian approach is not asymptotic, provides better small-sample inferences
• Probability sampling is justified as making sampling mechanism ignorable, improving robustness
Models bring survey inference closer to the statistical mainstream
[Figure: cartoon of the “B/F Gorilla” saying “Follow my (frequentist) statistical standards”; reply: “Why? I am an economist, I build models!”]
The model-based perspective: cons
• Explicit dependence on the choice of model, which has subjective elements (but assumptions are explicit, not buried in a formula)
• Bad models provide bad answers – justifiable concerns about the effect of model misspecification
• Models are needed for all survey variables – need to understand the data, and potential for more complex computations
The current “status quo” -- design-model compromise
• Design-based for large samples, descriptive statistics
– But may be model assisted, e.g. regression calibration:
$\hat{T}_{\text{GREG}} = \sum_{i=1}^{N} \hat{y}_i + \sum_{i=1}^{N} I_i (y_i - \hat{y}_i) / \pi_i$, $\hat{y}_i$ = model prediction
– model estimates adjusted to protect against misspecification (e.g. Särndal, Swensson and Wretman 1992).
• Model-based for small area estimation, nonresponse, time series, …
• Attempts to capitalize on best features of both paradigms … but … at the expense of “inferential schizophrenia” (Little 2012)?
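The GREG total is the sum of model predictions over the whole population plus an inverse-probability-weighted sum of the sample residuals. A minimal sketch, with made-up predictions and data; the function name and argument layout are hypothetical.

```python
def greg_total(y_hat_all, sample_y, sample_pi, sample_idx):
    """GREG total: sum of predictions for all N units plus the
    inverse-probability-weighted sum of sample residuals."""
    prediction_part = sum(y_hat_all)
    residual_part = sum((y - y_hat_all[i]) / p
                        for i, y, p in zip(sample_idx, sample_y, sample_pi))
    return prediction_part + residual_part

# Hypothetical inputs: predictions for N = 5 units, of which units 0, 2, 4 are sampled.
y_hat_all = [10.0, 12.0, 9.0, 15.0, 11.0]
sample_idx = [0, 2, 4]
sample_pi = [0.6, 0.6, 0.6]
sample_y = [11.0, 8.0, 12.0]

t_greg = greg_total(y_hat_all, sample_y, sample_pi, sample_idx)
t_ht = sum(y / p for y, p in zip(sample_y, sample_pi))  # HT total, for comparison
print(t_greg, t_ht)
```

Two limiting cases show the compromise: with predictions that are all zero the estimator reduces to the HT total, and with predictions that match the sampled values exactly it reduces to the pure model total.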
Example: when is an area “small”?
[Figure: “n-ometer” scale of sample size n, with design-based inference above the cutoff n0 and model-based inference below it; n0 = “Point of inferential schizophrenia”]
How do I choose n0?
If n0 = 35, should my entire statistical philosophy and inference be different when n=34 and n=36?
n=36, CI: [ ] (wider since based on direct estimate)
n=34, CI: [ ] (narrower since based on model)
Multilevel (hierarchical Bayes) models
Bayesian multilevel model estimates borrow strength increasingly from model as n decreases
$\tilde{\mu}_a = w_a \bar{y}_a + (1 - w_a) \hat{\mu}_a$
[Figure: “n-ometer” showing the weight $w_a$ against sample size n, running from 0 (model estimate) to 1 (direct estimate)]
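The slide does not give a formula for the weight. The sketch below assumes the standard normal-normal multilevel model, in which the weight on the direct estimate is tau2 / (tau2 + sigma2 / n), with sigma2 the within-area variance and tau2 the between-area variance. All numbers are made up.

```python
def shrinkage_weight(n, sigma2, tau2):
    """Weight on the direct estimate in a normal-normal model:
    sigma2 = within-area variance, tau2 = between-area variance."""
    return tau2 / (tau2 + sigma2 / n)

def composite_estimate(ybar_a, n, model_est, sigma2, tau2):
    """Weighted average of the direct estimate and the model estimate."""
    w = shrinkage_weight(n, sigma2, tau2)
    return w * ybar_a + (1 - w) * model_est

# Hypothetical values: direct estimate 0.30, model estimate 0.20.
for n in (5, 34, 36, 500):
    w = shrinkage_weight(n, sigma2=1.0, tau2=0.05)
    est = composite_estimate(0.30, n, 0.20, sigma2=1.0, tau2=0.05)
    print(f"n={n:4d}  w={w:.3f}  estimate={est:.3f}")
```

Under this assumption the weight changes smoothly with n, so the estimates for n = 34 and n = 36 are nearly the same and no cutoff n0 is needed.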
An alternative paradigm: Calibrated Bayes
• Frequentists should be Bayesian
– Bayes is optimal under a correctly specified model
• Bayesians should be frequentist
– We never know the model (and all models are wrong)
– Inferences should be robust to misspecification, have good repeated sampling characteristics
• Calibrated Bayes (Box 1980, Rubin 1984, Little 2006, 2012, 2013)
– Inference based on a Bayesian model
– Model chosen to yield inferences that are well-calibrated in a frequentist sense
– Aim for posterior probability intervals that have (approximately) nominal frequentist coverage
Bayes/frequentist compromises
“I believe that … sampling theory is needed for exploration and ultimate criticism of the entertained model in the light of the current data, while Bayes’ theory is needed for estimation of parameters conditional on adequacy of the model.”
George Box (1980)
Calibrated Bayes
“The applied statistician should be Bayesian in principle and calibrated to the real world in practice – appropriate frequency calculations help to define such a tie.”
“… frequency calculations are useful for making Bayesian statements scientific, … in the sense of capable of being shown wrong by empirical test; here the technique is the calibration of Bayesian probabilities to the frequencies of actual events.”
Rubin (1984)
Calibrated Bayes models for surveys should incorporate sample design features
• The “Calibrated” part of Calibrated Bayes requires robust models with good repeated sampling properties:
• Generally weak priors that are dominated by the likelihood (“objective Bayes”)
• Models that incorporate sampling design features:
– Capture design weights and stratifying variables as covariates in the prediction model (e.g. Gelman 2007)
– Clustering via hierarchical random effects models
Applications
• Voting Rights Act special tabulation
• The American Community Survey (ACS) and the “standard error error”
Voting Rights Act Special Tabulation
• Section 203 Language Provisions of the Voting Rights Act
• Determines counties and townships required to provide language assistance at the polls
• Determinations are based in part on the following “more than 5%” provision:
… More than 5 percent of voting age citizens of political district are members of a single language minority and are Limited English Proficient (LEP).
Voting Rights Act Tabulations
• Previously used direct estimates from Long Form Decennial Census Data
• Used ACS 2005-2009 and 2010 Census data to produce estimates by fall 2011
• Direct estimates for some districts are based on small ACS sample and hence have unacceptably high variance
• E.g. let P be proportion of voting age citizens in political district who are members of a single language minority and are Limited English Proficient
• Suppose ACS was a simple random sample; a direct estimate of P is the sample proportion m/n
– District A with n=105, m=5, m/n < 0.05
– District B with n=105, m=6, m/n > 0.05
– Direct ACS estimation is more complex, but same idea applies
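The arithmetic behind the District A and District B example can be checked directly. The standard error below is the binomial one under the simple random sampling assumption made on the slide; it is added here for scale and is not part of the original example.

```python
import math

n = 105
threshold = 0.05
for district, m in (("A", 5), ("B", 6)):
    p_hat = m / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # binomial standard error under SRS
    print(f"District {district}: m/n = {p_hat:.4f}, se = {se:.4f}, "
          f"exceeds 5%: {p_hat > threshold}")
```

A single additional sampled person moves the direct estimate across the 5% line, while the standard error (about 0.02) is roughly twice the gap between the two estimates.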
Voting Rights Tabulations
• Overview of approach to the “more than 5%” provision:
• Build a district level regression model to predict P based on variables in the ACS
• Classify districts into classes with similar predicted P based on the model [predictive mean stratification]
• Within classes, apply a Beta-Binomial model that pulls the direct ACS estimate of P towards the average P for districts in that class
• Compare Beta-Binomial model estimate with 5% for this aspect of the determination
• Rationale: increased precision of Beta-Binomial estimates in small samples increases the probability of getting the determination right, particularly in small districts
• See Joyce et al. (2014)
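A minimal sketch of how a Beta-Binomial model pulls a direct estimate toward a class average. The class mean (3%) and the prior weight (equivalent to 50 observations) are invented for illustration; in the methodology described by Joyce et al. (2014) the prior is estimated from the data rather than fixed like this.

```python
def beta_binomial_posterior_mean(m, n, class_mean, prior_size):
    """Posterior mean of P when P ~ Beta(class_mean * prior_size,
    (1 - class_mean) * prior_size) and m | P ~ Binomial(n, P)."""
    alpha = class_mean * prior_size
    beta = (1 - class_mean) * prior_size
    return (m + alpha) / (n + alpha + beta)

# Hypothetical class: districts with predicted P near 3%, prior worth 50 observations.
for district, m in (("A", 5), ("B", 6)):
    direct = m / 105
    shrunk = beta_binomial_posterior_mean(m, 105, class_mean=0.03, prior_size=50)
    print(f"District {district}: direct = {direct:.4f}, Beta-Binomial = {shrunk:.4f}")
```

The model estimate always lies between the class mean and the direct estimate, and it moves closer to the direct estimate as the district sample size grows.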
Bayes forces a loss function
• Small p and n, posterior distribution is skewed to right
[Figure: right-skewed posterior density with the mode, median and mean marked]
• What’s the right point estimate: median, mode, mean? Bayes forces a choice …
• Design-based, superpopulation model approaches fail to address the issue
– Maximum likelihood is equivalent to mode with flat prior, which does not correspond to a sensible loss function
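The three candidate point estimates can be compared for a right-skewed Beta posterior. The parameters below are hypothetical; the mode and mean use the closed forms for a Beta distribution, and the median is approximated by Monte Carlo to keep the sketch within the standard library.

```python
import random
import statistics

random.seed(3)

# Hypothetical right-skewed posterior for a small proportion: Beta(a, b).
a, b = 2.0, 30.0

post_mode = (a - 1) / (a + b - 2)   # closed form, valid for a, b > 1
post_mean = a / (a + b)
draws = [random.betavariate(a, b) for _ in range(20000)]
post_median = statistics.median(draws)   # Monte Carlo approximation

print(f"mode = {post_mode:.4f}, median ~ {post_median:.4f}, mean = {post_mean:.4f}")
```

Each choice corresponds to a loss function: the mean to squared error loss, the median to absolute error loss. When a threshold such as 5% sits between these values, the choice of loss function affects the decision.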
American Community Survey
• US Census Bureau is making available thousands of ACS tables, with millions of cells
• A high fraction of these estimates are based on very little data, and hence are very noisy
– Many people want information, not data, so ACS should produce information products, as well as data products
– When noise swamps the signal, the information content is buried
– Data products are highly constrained by confidentiality requirements, leading to incompleteness
The Statistical Problem
• The ACS philosophy is essentially to produce “direct” (“design-based”) estimates, together with margins of error
• This works fine with large samples, but most of the ACS estimates are based on small samples
– The estimates are often too noisy to be useful
– The confidence intervals derived from the estimates and margins of error are known to be of poor quality, violating statistical standards
• Intervals include proportions outside the range (0,1)
• Intervals do not have nominal coverage
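A small sketch of how an estimate plus or minus a margin of error can leave the range (0,1). It uses the simple binomial standard error as a stand-in; actual ACS variance estimation is more complex, and the cell counts here are hypothetical.

```python
import math

def wald_interval_90(x, n):
    """Wald interval: estimate +/- 1.645 standard errors (90% nominal)."""
    p_hat = x / n
    moe = 1.645 * math.sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat - moe, p_hat + moe

lower, upper = wald_interval_90(x=1, n=20)   # hypothetical small cell
print(f"90% Wald interval: ({lower:.3f}, {upper:.3f})")
```

With one success in twenty the lower limit is negative, and with zero successes the interval collapses to a single point, which cannot have 90% coverage.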
The “standard error” error
• ACS reports estimates and margins of error that yield asymptotic 90% confidence intervals
• But in small samples, the implied confidence intervals do not have the stated coverage; so
• Seek to replace estimates and margins of error by posterior means and 5% to 95% credibility intervals that have approximately the nominal coverage
• A non-Bayesian can interpret the posterior means as estimates, and the 90% credibility intervals as 90% confidence intervals.
Binary outcome: Schmertmann example
[Figure: example estimates, annotated “Margins of error exceed the estimates”]
Data for example
$Y$ = outcome (e.g. poverty)
$x$ = covariates (e.g. categorized age=a, gender=g, stratum=h)
In county $c$:
$n_{aghc}$ = sample count with age=a, gender=g, stratum=h
$x_{aghc}$ = sample count in poverty with age=a, gender=g, stratum=h
$\hat{p}_{aghc} = x_{aghc} / n_{aghc}$ = sample proportion
Fully Bayesian model
$x_{aghc} \mid p_{aghc} \sim \text{Bin}(n_{aghc}, p_{aghc})$
$p_{aghc} \sim \text{Beta}(\alpha_{agh}, \beta_{agh}) \equiv \text{Beta}^*(\mu_{agh}, \tau)$
[Assumption: …]
$p_{aghc} \mid x_{aghc} \sim \text{Beta}\left(x_{aghc} + \tau \mu_{agh},\ n_{aghc} - x_{aghc} + \tau (1 - \mu_{agh})\right)$
Key is how to determine prior parameters $\alpha_{agh}, \beta_{agh}$ (or $\mu_{agh}, \tau$)
(a) Empirical Bayes: estimate prior parameters, then treat as if known
Simple beta intervals, but understates uncertainty
(b) Full Bayes: Incorporate uncertainty of prior parameter estimates
More work, but better reflects uncertainty; consider approximations, since full Bayes seems computationally complex
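As a worked step, the posterior mean under a Beta prior can be written as a weighted average of the sample proportion and the prior mean. The symbols $\mu_{agh}$ (prior mean) and $\tau$ (prior precision, the sum of the two Beta parameters) are assumed names for the mean/precision form of the prior:

```latex
E(p_{aghc} \mid x_{aghc})
  = \frac{x_{aghc} + \tau \mu_{agh}}{n_{aghc} + \tau}
  = w_{aghc}\, \hat{p}_{aghc} + (1 - w_{aghc})\, \mu_{agh},
\qquad
w_{aghc} = \frac{n_{aghc}}{n_{aghc} + \tau}.
```

The weight on the sample proportion grows with the cell count $n_{aghc}$, so cells with little data lean on the prior mean and cells with a lot of data are left almost unchanged.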
Pragmatic “pseudo-Bayes” approach
Tom Louis suggested this simple “Bayes-like” approach:
A. Compute design-based estimate of proportion and standard error using existing methods
B. Pretend data are binomial with number of successes x* and sample size n* that lead to the estimates in A.
C. Compute Beta posterior distribution with noninformative prior (e.g. uniform or Jeffreys)
D. Compute 90% posterior credibility interval based on this Beta posterior (reflects asymmetry, always between 0 and 1)
Simple to implement and easily beats standard Wald-type confidence intervals in simulations (Franco, Little, Louis and Slud 2015, in preparation)
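Steps B to D can be sketched as follows. The matching rule in step B is an assumption (the slide does not give one): the effective sample size is chosen so that the binomial variance p(1-p)/n* equals the squared design-based standard error. Quantiles are approximated by Monte Carlo to stay within the standard library, and the inputs are hypothetical.

```python
import random

def pseudo_bayes_interval(p_hat, se, prior=0.5, draws=20000, seed=4):
    """Steps B-D: match a binomial to (p_hat, se), then summarize the Beta posterior.
    prior=0.5 gives the Jeffreys prior Beta(0.5, 0.5); prior=1.0 gives the uniform prior.
    Requires 0 < p_hat < 1 and se > 0."""
    n_star = p_hat * (1 - p_hat) / se ** 2      # effective sample size (step B)
    x_star = n_star * p_hat                     # effective number of successes
    a = x_star + prior                          # Beta posterior parameters (step C)
    b = n_star - x_star + prior
    rng = random.Random(seed)
    sims = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = sims[int(0.05 * draws)]             # Monte Carlo 5% and 95% quantiles (step D)
    upper = sims[int(0.95 * draws) - 1]
    return a / (a + b), (lower, upper)

# Step A is assumed done elsewhere; these design-based inputs are hypothetical.
post_mean, (lo, hi) = pseudo_bayes_interval(p_hat=0.05, se=0.04)
wald_lower = 0.05 - 1.645 * 0.04                # Wald lower limit for comparison
print(post_mean, (lo, hi), wald_lower)
```

For these inputs the Wald lower limit is negative, while the Beta interval stays inside (0,1) and is asymmetric around the estimate.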
Barriers to Calibrated Bayes
• It’s a major paradigm shift
• It’s too much work/computation
– but this concern is alleviated by gains in computing power and advances in Bayesian computational methods
• More explicit dependence on the choice of model -- concerns with model misspecification
– “Design-based is model-free and hence robust…model-based requires models, which are inherently subjective”
• But models are essential for today’s data, and
• a judicious Calibrated Bayes model is robust and incorporates key design features – and would bring official statistics back in the statistical mainstream
References 1
Box, G.E.P. (1980), Sampling and Bayes inference in scientific modeling and robustness (with discussion), JRSSA, 143, 383-430.
Joyce, P.M., Malec, D., Little, R.J., Gilary, A., Navarro, A. and Asiala, M.E. (2014). Statistical Modeling Methodology for the Voting Rights Act Section 203 Language Assistance Determinations. JASA, 109, 36-47.
Gelman, A. (2007). Struggles with survey weighting and regression modeling. Statist. Sci., 22, 2, 153-164 (with discussion and rejoinder).
Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (2003), Bayesian Data Analysis, 2nd. edition. New York: CRC Press.
Godambe, V.P. (1955). A unified theory of sampling from finite populations. JRSSB, 17, 269-278.
Horvitz, D.G. & Thompson, D.J. (1952). A generalization of sampling without replacement from a finite universe. JASA, 47, 663-685.
Little, R.J.A. (2004). To Model or Not to Model? Competing Modes of Inference for Finite Population Sampling. JASA, 99, 546-556.
References 2
Little, R.J.A. (2006). Calibrated Bayes: A Bayes/frequentist roadmap. Am. Statist., 60, 3, 213-223
_____ (2012). Calibrated Bayes: an alternative inferential paradigm for official statistics (with discussion and rejoinder). JOS, 28, 3, 309-372.
_____ (2013). Survey Sampling: Past Controversies, Current Orthodoxies, and Future Paradigms. In Past, Present and Future of Statistical Science, COPSS 50th Anniversary Volume, X. Lin, D. L. Banks, C. Genest, G. Molenberghs, D.W. Scott, and J.-L. Wang, eds. CRC Press.
Rubin, DB (1984), Bayesianly justifiable and relevant frequency calculations for the applied statistician, Annals Statist. 12, 1151-1172.
Särndal, C.-E., Swensson, B. & Wretman, J.H. (1992), Model Assisted Survey Sampling, Springer Verlag: New York.
Zheng, H. & Little, R.J. (2005). Inference for the population total from probability-proportional-to-size samples based on predictions from a penalized spline nonparametric model. JOS, 21, 1-20.