7/30/2019 Variance Estimation in Complex Surveys (1).ppt
Variance Estimation in Complex Surveys
Drew Hardin
Kinfemichael Gedif
So far...
Variance for the estimated mean and total under
  SRS, stratified, cluster (single-stage, multistage), etc.
Variance for estimating a ratio of two means under
  SRS (we used the linearization method)
What about other cases?
Variance for estimators that are not linear combinations of means and totals
  Ratios
Variance for estimating other statistics from complex surveys
  Median, quantiles, functions of the EMF, etc.
Other approaches are necessary
Outline
Variance Estimation Methods
  Linearization
  Random Group Methods
  Balanced Repeated Replication (BRR)
  Resampling techniques
    Jackknife, Bootstrap
    Adapting to complex surveys
Hot research areas
References
Linearization (Taylor Series Methods)
We have seen this before (ratio estimator and other courses).
Suppose our statistic is nonlinear. It can often be approximated using Taylor's theorem.
We know how to calculate variances of linear functions of means and totals.
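Linearization (Taylor Series Methods)
Linearize
$$h(\hat t_1, \hat t_2, \ldots, \hat t_k) \approx h(t_1, t_2, \ldots, t_k) + \sum_{j=1}^{k} \left. \frac{\partial h(c_1, c_2, \ldots, c_k)}{\partial c_j} \right|_{t_1, t_2, \ldots, t_k} (\hat t_j - t_j)$$
Calculate Variance
$$V\left[h(\hat t_1, \ldots, \hat t_k)\right] \approx \sum_{j=1}^{k} \left(\frac{\partial h}{\partial t_j}\right)^2 V(\hat t_j) + 2 \sum_{i<j} \frac{\partial h}{\partial t_i} \frac{\partial h}{\partial t_j} \operatorname{Cov}(\hat t_i, \hat t_j)$$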
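As a rough illustration of the linearization variance formula, the sketch below applies it to the ratio h(t1, t2) = t1 / t2 of two estimated totals. The function name and the input numbers are hypothetical; the variances and covariance of the totals are assumed to be already available from the design.

```python
def ratio_linearization_variance(t1, t2, v1, v2, cov12):
    """First-order Taylor (linearization) variance of h(t1, t2) = t1 / t2.

    t1, t2  : estimated totals (hypothetical inputs)
    v1, v2  : estimated variances of the two totals
    cov12   : estimated covariance between the two totals
    """
    # Partial derivatives of h, evaluated at the estimated totals.
    d1 = 1.0 / t2
    d2 = -t1 / t2 ** 2
    # Squared-derivative terms plus twice the cross (covariance) term.
    return d1 ** 2 * v1 + d2 ** 2 * v2 + 2.0 * d1 * d2 * cov12


# Made-up numbers, for illustration only.
v_ratio = ratio_linearization_variance(t1=50.0, t2=100.0, v1=4.0, v2=9.0, cov12=3.0)
print(v_ratio)
```

A different statistic needs its own derivatives, which is the main practical cost of this approach.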
Linearization (Taylor Series) Methods
Pro:
  Can be applied in general sampling designs
  Theory is well developed
  Software is available
Con:
  Finding partial derivatives may be difficult
  A different derivation is needed for each statistic
  The function of interest may not be expressible as a smooth function of population totals or means
  Accuracy depends on how good the linearization approximation is
Random Group Methods
Based on the concept of replicating the survey design
It is not usually possible to simply go out and replicate the survey
However, the survey can often be divided into R groups so that each group forms a miniature version of the survey
Random Group Methods
Stratum 1: 1 2 3 4 5 6 7 8
Stratum 2: 1 2 3 4 5 6 7 8
Stratum 3: 1 2 3 4 5 6 7 8
Stratum 4: 1 2 3 4 5 6 7 8
Stratum 5: 1 2 3 4 5 6 7 8
Treat each random group as a miniature sample
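Unbiased Estimator (Average of Samples)
$$\hat V_1(\tilde\theta) = \frac{1}{R} \cdot \frac{1}{R-1} \sum_{r=1}^{R} (\hat\theta_r - \tilde\theta)^2$$
Slightly Biased Estimator (All Data)
$$\hat V_2(\hat\theta) = \frac{1}{R} \cdot \frac{1}{R-1} \sum_{r=1}^{R} (\hat\theta_r - \hat\theta)^2$$
Here $\hat\theta_r$ is the estimate from random group r, $\tilde\theta = \frac{1}{R}\sum_{r=1}^{R} \hat\theta_r$ is the average of the group estimates, and $\hat\theta$ is the estimate computed from all the data.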
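A minimal sketch of the two random-group variance estimators: one centered at the average of the R group estimates, the other centered at the full-sample estimate, both divided by R(R-1). The function name and the example numbers are made up for illustration.

```python
def random_group_variances(group_estimates, full_sample_estimate):
    """Return (theta_tilde, v1, v2) for R random-group estimates.

    v1 centers the squared deviations at the average of the group
    estimates; v2 centers them at the estimate from all the data.
    """
    R = len(group_estimates)
    theta_tilde = sum(group_estimates) / R
    v1 = sum((t - theta_tilde) ** 2 for t in group_estimates) / (R * (R - 1))
    v2 = sum((t - full_sample_estimate) ** 2 for t in group_estimates) / (R * (R - 1))
    return theta_tilde, v1, v2


# Hypothetical estimates from R = 4 random groups and from the full sample.
theta_tilde, v1, v2 = random_group_variances([10.0, 12.0, 11.0, 13.0], 11.4)
print(theta_tilde, v1, v2)
```

Because the group average minimizes the sum of squared deviations, v2 is never smaller than v1.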
Random Group Methods
Pro:
  Easy to calculate
  General method (can also be used for non-smooth functions)
Con:
  Assumption of independent groups (a problem when N is small)
  Small number of groups (particularly if one stratum is sampled only a few times)
  Survey design must be replicated in each random group (the strata and cluster structure must remain the same)
Resampling and Replication Methods
Balanced Repeated Replication (BRR)
  Special case when $n_h = 2$
Jackknife (Quenouille (1949), Tukey (1958))
Bootstrap (Efron (1979), Shao and Tu (1995))
These methods
  Extend the idea of the random group method
  Allow replicate groups to overlap
  Are all-purpose methods
  Asymptotic properties?
Balanced Repeated Replication
Suppose we had sampled 2 units per stratum
There are $2^H$ ways to pick 1 from each of the H strata.
Each combination could be treated as a sample.
Pick R samples.
Balanced Repeated Replication
Which samples should we include?
  Assign each value either 1 or -1 within the stratum
  Select samples that are orthogonal to one another to create balance
  You can use the design matrix for a fractional factorial
  Specify a vector $\alpha_r$ of 1, -1 values, one for each stratum
Estimator
$$\hat V_{BRR}(\hat\theta) = \frac{1}{R} \sum_{r=1}^{R} \left(\hat\theta(\alpha_r) - \hat\theta\right)^2$$
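The BRR estimator can be sketched for a small, made-up design with H = 3 strata and 2 PSUs per stratum. The balanced set of R = 4 half-samples below is taken from a 4x4 Hadamard matrix with its all-ones column dropped; the data, weights, and function name are hypothetical, and the statistic defaults to the weighted mean.

```python
# Rows are replicates r, columns are strata h. The columns are mutually
# orthogonal and each sums to zero, which is what "balanced" means here.
ALPHA = [
    [+1, +1, +1],
    [-1, +1, -1],
    [+1, -1, -1],
    [-1, -1, +1],
]


def brr_variance(psu_pairs, weights, alpha, g=lambda x: x):
    """BRR variance of theta = g(weighted mean) with 2 PSUs per stratum."""
    def half_sample_mean(row):
        # +1 selects the first PSU of the stratum, -1 selects the second.
        return sum(w * (pair[0] if a == 1 else pair[1])
                   for w, pair, a in zip(weights, psu_pairs, row))

    full = g(sum(w * (p[0] + p[1]) / 2.0 for w, p in zip(weights, psu_pairs)))
    reps = [g(half_sample_mean(row)) for row in alpha]
    return sum((t - full) ** 2 for t in reps) / len(reps)


pairs = [(10.0, 14.0), (20.0, 26.0), (30.0, 32.0)]  # hypothetical PSU values
W = [0.5, 0.3, 0.2]                                  # hypothetical stratum weights
v_brr = brr_variance(pairs, W, ALPHA)
print(v_brr)
```

For a linear statistic such as this weighted mean, a fully balanced set reproduces the usual stratified variance estimate, the sum over strata of W_h^2 (y_h1 - y_h2)^2 / 4, which is 1.85 for these numbers.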
Balanced Repeated Replication
Pro:
  Relatively few computations
  Asymptotically equivalent to linearization methods for smooth functions of population totals and quantiles
  Can be extended to use weights
Con:
  Requires 2 PSUs per stratum
    Can be extended with more complex schemes
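The Jackknife: SRS with replacement
Quenouille (1949); Tukey (1958); Shao and Tu (1995)
Let $\hat\theta_{(i)}$ be the estimator of $\theta$ after omitting the ith observation
Jackknife estimate
$$\hat\theta_J = \sum_{i=1}^{n} \tilde\theta_i / n, \quad \text{where } \tilde\theta_i = n\hat\theta - (n-1)\hat\theta_{(i)}$$
Jackknife estimator of $V(\hat\theta)$
$$\hat V_J(\hat\theta) = \frac{n-1}{n} \sum_{i=1}^{n} \left(\hat\theta_{(i)} - \hat\theta_{(\cdot)}\right)^2, \quad \text{where } \hat\theta_{(\cdot)} = \sum_{i=1}^{n} \hat\theta_{(i)} / n$$
$$\hat V_J(\hat\theta) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left(\tilde\theta_i - \hat\theta_J\right)^2$$
For stratified SRS without replacement, see Jones (1974)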
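A minimal delete-one jackknife sketch for an i.i.d. (SRS with replacement) sample. The function names and data are illustrative; `estimator` stands for any function of the sample.

```python
def jackknife(data, estimator):
    """Delete-one jackknife estimate and variance for estimator(data)."""
    n = len(data)
    theta_hat = estimator(data)
    # theta_(i): the estimator recomputed with observation i left out.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    # Pseudovalues: n * theta_hat - (n - 1) * theta_(i).
    pseudo = [n * theta_hat - (n - 1) * t for t in leave_one_out]
    theta_j = sum(pseudo) / n
    theta_dot = sum(leave_one_out) / n
    v_j = (n - 1) / n * sum((t - theta_dot) ** 2 for t in leave_one_out)
    return theta_j, v_j


def sample_mean(xs):
    return sum(xs) / len(xs)


theta_j, v_j = jackknife([2.0, 4.0, 6.0, 8.0], sample_mean)
print(theta_j, v_j)
```

For the sample mean the pseudovalues are the observations themselves, so the jackknife variance reduces to s^2 / n.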
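The Jackknife: stratified multistage design
In stratum h, delete one PSU at a time
Let $\hat\theta_{(hi)}$ be the estimator of the same form as $\hat\theta$ when PSU i of stratum h is omitted:
$$\hat\theta_{(hi)} = g(\bar y_{(hi)}), \quad \text{where } \bar y_{(hi)} = \sum_{h' \neq h} W_{h'} \bar y_{h'} + W_h \frac{n_h \bar y_h - y_{hi}}{n_h - 1}$$
Jackknife estimate, using pseudovalues:
$$\hat\theta_J^{I} = \sum_{h=1}^{L} \sum_{i=1}^{n_h} \tilde\theta_{(hi)} / n; \qquad \hat\theta_J^{II} = \sum_{h=1}^{L} \frac{1}{L} \sum_{i=1}^{n_h} \frac{1}{n_h} \tilde\theta_{(hi)}$$
$$\tilde\theta_{(hi)} = n_h \hat\theta - (n_h - 1)\hat\theta_{(hi)}, \qquad n = \sum_{h=1}^{L} n_h$$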
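The Jackknife: stratified multistage design
Different formulae for $V(\hat\theta)$:
$$\hat V_J(\hat\theta) = \sum_{h=1}^{L} \frac{n_h - 1}{n_h} \sum_{i=1}^{n_h} \left(\hat\theta_{(hi)} - \hat\theta_{method}\right)^2$$
where $\hat\theta_{method}$ can be $\hat\theta_{(h)} = \sum_{i=1}^{n_h} \hat\theta_{(hi)} / n_h$, $\hat\theta$, $\sum_{h=1}^{L} \sum_{i=1}^{n_h} \hat\theta_{(hi)} / n$, or $\sum_{h=1}^{L} \hat\theta_{(h)} / L$
Using the pseudovalues:
$$\hat V_J^{(j)}(\hat\theta) = \sum_{h=1}^{L} \frac{1}{n_h (n_h - 1)} \sum_{i=1}^{n_h} \left(\tilde\theta_{(hi)} - \hat\theta_J^{j}\right)^2, \quad j = I, II$$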
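A sketch of the stratified delete-one-PSU jackknife, assuming the statistic has the form g(weighted mean of stratum means) and centering each replicate at its own stratum average (one of the centering choices listed above). The data layout (one list of PSU values per stratum), the weights, and the function name are assumptions made for illustration.

```python
def stratified_jackknife_variance(strata, weights, g=lambda x: x):
    """Delete-one-PSU jackknife variance for theta = g(sum_h W_h * ybar_h).

    strata  : list of lists, PSU-level values y_hi for each stratum h
    weights : stratum weights W_h
    """
    means = [sum(s) / len(s) for s in strata]
    total = 0.0
    for h, s in enumerate(strata):
        n_h = len(s)
        reps = []
        for i in range(n_h):
            # Stratum-h mean with PSU i removed; other strata are unchanged.
            mean_h = (n_h * means[h] - s[i]) / (n_h - 1)
            y_bar = sum(w * (mean_h if k == h else m)
                        for k, (w, m) in enumerate(zip(weights, means)))
            reps.append(g(y_bar))
        center = sum(reps) / n_h  # average of the replicates in stratum h
        total += (n_h - 1) / n_h * sum((t - center) ** 2 for t in reps)
    return total


strata = [[2.0, 4.0, 6.0, 8.0], [10.0, 20.0, 30.0]]  # hypothetical PSU values
W = [0.6, 0.4]                                        # hypothetical weights
v_strat_jk = stratified_jackknife_variance(strata, W)
print(v_strat_jk)
```

With the identity for g, the result equals the textbook stratified estimate, the sum over strata of W_h^2 s_h^2 / n_h.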
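The Bootstrap: Naive bootstrap
Efron (1979); Rao and Wu (1988); Shao and Tu (1995)
Resample $\{y^*_{hi}\}_{i=1}^{n_h}$ with replacement in stratum h
Estimate:
$$\bar y_h^{*(b)} = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi}^{*(b)}, \quad \bar y^{*(b)} = \sum_{h} W_h \bar y_h^{*(b)}, \quad \text{and } \hat\theta^{*(b)} = g(\bar y^{*(b)}), \quad b = 1, 2, \ldots, B$$
Variance:
$$\hat V_{NBS}(\hat\theta) = E_*\left(\hat\theta^* - E_*(\hat\theta^*)\right)^2$$
Or approximate by
$$\hat V_{NBS}(\hat\theta) \approx \frac{1}{B-1} \sum_{b=1}^{B} \left(\hat\theta^{*(b)} - \hat\theta^{*(\cdot)}\right)^2, \quad \hat\theta^{*(\cdot)} = \frac{1}{B} \sum_{b=1}^{B} \hat\theta^{*(b)}$$
This estimator is not a consistent estimator of the variance of a general nonlinear statistic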
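The Monte Carlo approximation of the naive bootstrap can be sketched as below. The data layout, weights, seed, and function name are hypothetical choices for illustration; each stratum is resampled with replacement at its original size n_h.

```python
import random


def naive_bootstrap_variance(strata, weights, g=lambda x: x, B=200, seed=0):
    """Monte Carlo naive bootstrap variance of theta = g(sum_h W_h * ybar_h)."""
    rng = random.Random(seed)
    reps = []
    for _ in range(B):
        y_star = 0.0
        for w, s in zip(weights, strata):
            # Draw n_h values with replacement from stratum h.
            resample = [rng.choice(s) for _ in s]
            y_star += w * sum(resample) / len(resample)
        reps.append(g(y_star))
    center = sum(reps) / B
    return sum((t - center) ** 2 for t in reps) / (B - 1)


strata = [[2.0, 4.0, 6.0, 8.0], [10.0, 20.0, 30.0]]  # hypothetical PSU values
W = [0.6, 0.4]                                        # hypothetical weights
v_nbs = naive_bootstrap_variance(strata, W, B=20000)
print(v_nbs)
```

For this linear statistic the result should sit near the sum of W_h^2 ((n_h - 1) / n_h) s_h^2 / n_h (about 4.0 here) rather than the usual unbiased value (about 5.9), which illustrates the inconsistency for small n_h.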
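The Bootstrap: Naive bootstrap
For $\hat\theta^* = \bar y^* = \sum_h W_h \bar y_h^*$:
$$\operatorname{Var}_*(\bar y^*) = \sum_h \frac{W_h^2}{n_h} \cdot \frac{n_h - 1}{n_h} s_h^2$$
Comparing with
$$\operatorname{Var}(\bar y) = \sum_h \frac{W_h^2}{n_h} s_h^2$$
The ratio $\operatorname{Var}_*(\bar y^*) / \operatorname{Var}(\bar y)$ does not converge to 1 for bounded $n_h$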
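The size of the discrepancy can be checked directly from the two closed-form expressions. In this sketch (hypothetical data and function name), every stratum has n_h = 2, so each term is scaled by (n_h - 1) / n_h = 1/2 and the ratio is exactly one half.

```python
def naive_to_usual_ratio(strata, weights):
    """Ratio of the naive bootstrap variance of the weighted mean to the
    usual stratified variance estimate, both in closed form."""
    naive = 0.0
    usual = 0.0
    for w, s in zip(weights, strata):
        n_h = len(s)
        mean_h = sum(s) / n_h
        s2_h = sum((y - mean_h) ** 2 for y in s) / (n_h - 1)
        usual += w ** 2 * s2_h / n_h
        naive += w ** 2 * (n_h - 1) / n_h * s2_h / n_h
    return naive / usual


# Hypothetical design with two PSUs in every stratum.
ratio = naive_to_usual_ratio([[10.0, 14.0], [20.0, 26.0], [30.0, 32.0]],
                             [0.5, 0.3, 0.2])
print(ratio)
```

Adding more strata of the same size leaves the ratio unchanged, which is why the problem does not vanish when only the number of strata grows.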
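The Bootstrap: Modified bootstrap
Resample $\{y^*_{hi}\}_{i=1}^{m_h}$ with replacement in stratum h, with $m_h < n_h$
Calculate:
$$\tilde y_{hi} = \bar y_h + \frac{m_h^{1/2}}{(n_h - 1)^{1/2}} (y^*_{hi} - \bar y_h)$$
$$\tilde y_h = \sum_{i=1}^{m_h} \tilde y_{hi} / m_h, \quad \tilde y = \sum_{h=1}^{L} W_h \tilde y_h, \quad \tilde\theta = g(\tilde y)$$
Variance:
$$\hat V_{MBS}(\hat\theta) = E_*\left(\tilde\theta - E_*(\tilde\theta)\right)^2$$
Can be approximated with Monte Carlo
For the linear case, it reduces to the customary unbiased variance estimator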
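A Monte Carlo sketch of the modified (rescaled) bootstrap in the with-replacement case. The resample sizes `m`, the data, the seed, and the function name are assumptions chosen for illustration; the rescaling factor is sqrt(m_h / (n_h - 1)) applied to deviations from the stratum mean.

```python
import math
import random


def rescaled_bootstrap_variance(strata, weights, m, g=lambda x: x, B=200, seed=0):
    """Monte Carlo modified bootstrap variance of theta = g(sum_h W_h * ybar_h).

    m : resample size m_h for each stratum (hypothetical choice by the user)
    """
    rng = random.Random(seed)
    reps = []
    for _ in range(B):
        y_tilde = 0.0
        for w, s, m_h in zip(weights, strata, m):
            n_h = len(s)
            mean_h = sum(s) / n_h
            scale = math.sqrt(m_h / (n_h - 1))
            # Draw m_h values with replacement, then rescale around the mean.
            resample = [rng.choice(s) for _ in range(m_h)]
            rescaled = [mean_h + scale * (y - mean_h) for y in resample]
            y_tilde += w * sum(rescaled) / m_h
        reps.append(g(y_tilde))
    center = sum(reps) / B
    return sum((t - center) ** 2 for t in reps) / (B - 1)


strata = [[2.0, 4.0, 6.0, 8.0], [10.0, 20.0, 30.0]]  # hypothetical PSU values
W = [0.6, 0.4]                                        # hypothetical weights
v_mbs = rescaled_bootstrap_variance(strata, W, m=[3, 2], B=20000)
print(v_mbs)
```

For the linear statistic used here, the result should be close to the customary estimate, the sum of W_h^2 s_h^2 / n_h (about 5.93 for these numbers), up to Monte Carlo error.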
More on bootstrap
The method can be extended to stratified SRS without replacement by simply changing $\tilde y_{hi}$ to
$$\tilde y_{hi} = \bar y_h + \frac{m_h^{1/2}}{(n_h - 1)^{1/2}} (1 - f_h)^{1/2} (y^*_{hi} - \bar y_h)$$
For $m_h = n_h - 1$, this method reduces to the naive bootstrap
For $n_h = 2$, $m_h = 1$, the method reduces to the random half-sample replication method
For $n_h > 3$, see Rao and Wu (1988) for the choice of $m_h$
Simulation: Rao and Wu (1988)
Jackknife and linearization intervals showed substantial bias for nonlinear statistics in one-sided intervals
The bootstrap performs best for one-sided intervals (especially when $m_h = n_h - 1$)
For two-sided intervals, the three methods have similar coverage probabilities
The jackknife and linearization methods are more stable than the bootstrap
B = 200 is sufficient
Hot topics
Jackknife with non-smooth functions (Rao and Sitter 1996)
Two-phase variance estimation (Graubard and Korn 2002; Rubin-Bleuer and Schiopu-Kratina 2005)
Estimating Function (EF) bootstrap method (Rao and Tausi 2004)
Software
OSIRIS: BRR, Jackknife
SAS: Linearization
Stata: Linearization
SUDAAN: Linearization, Bootstrap, Jackknife
WesVar: BRR, Jackknife, Bootstrap
References:
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1-26.
Graubard, B. I., and Korn, E. L. (2002). Inference for superpopulation parameters using sample surveys. Statistical Science, 17, 73-96.
Krewski, D., and Rao, J. N. K. (1981). Inference from stratified samples: Properties of linearization, jackknife, and balanced replication methods. Annals of Statistics, 9, 1010-1019.
Quenouille, M. H. (1949). Problems in plane sampling. Annals of Mathematical Statistics, 20, 355-375.
Rao, J. N. K., and Wu, C. F. J. (1988). Resampling inference with complex survey data. JASA, 83, 231-241.
Rao, J. N. K., and Tausi, M. (2004). Estimating function variance estimation under stratified multistage sampling. Communications in Statistics, 33, 2087-2095.
Rao, J. N. K., and Sitter, R. R. (1996). Discussion of Shao's paper. Statistics, 27, pp. 246-247.
Rubin-Bleuer, S., and Schiopu-Kratina, I. (2005). On the two-phase framework for joint model and design-based inference. Annals of Statistics (to appear).
Shao, J., and Tu, D. (1995). The Jackknife and Bootstrap. New York: Springer-Verlag.
Tukey, J. W. (1958). Bias and confidence in not-quite large samples. Annals of Mathematical Statistics, 29, 614.
Not referred to in the presentation:
Wolter, K. M. (1985). Introduction to Variance Estimation. New York: Springer-Verlag.
Shao, J. (1996). Resampling methods in sample surveys. Invited paper, Statistics, 27, pp. 203-237, with discussion, 237-254.