Variance Estimation in Complex Surveys Third International Conference on Establishment Surveys Montreal, Quebec June 18-21, 2007 Presented by: Kirk Wolter, NORC and the University of Chicago


Variance Estimation in Complex Surveys

Third International Conference on Establishment Surveys

Montreal, Quebec

June 18-21, 2007

Presented by:

Kirk Wolter, NORC and the University of Chicago


Outline of Lecture

- Introduction (Chapter 1)
- Textbook Methods (Chapter 1)
- Replication-Based Methods
  - Random Group (Chapter 2)
  - Balanced Half-Samples (Chapter 3)
  - Jackknife (Chapter 4)
  - Bootstrap (Chapter 5)
- Taylor Series (Chapter 6)
- Generalized Variance Functions (Chapter 7)


Chapter 1: Introduction

Notation and Basic Definitions

1. Finite population, $U = \{1, \ldots, N\}$
   - Residents of Canada
   - Restaurants in Montreal
   - Farms in Quebec
   - Schools in Ottawa

2. Sample, $s$
   - Simple random sampling, without replacement
   - Systematic sampling
   - Stratification
   - Clustering
   - Double sampling


Chapter 1: Introduction

5. Probability sampling design, $\{P(s)\}$
   - $P(s) \ge 0$
   - $\sum_{s} P(s) = 1$

8. Characteristic of interest, $Y_i$
   - $Y_i$ = yield in tons of the $i$-th farm
   - $Y_i = 1$ if the $i$-th resident is employed, $Y_i = 0$ if not employed


Chapter 1: Introduction

12. Parameter, $\theta$
   - Proportion of residents who are employed
   - Total production of farms
   - Trend in price index for restaurants
   - Regression of sales on area for pharmacies

13. Estimator, $\hat\theta$
   - $\hat\theta = \hat\theta(s, Y)$


Chapter 1: Introduction

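14. Expectation and variance
   - $E\{\hat\theta\} = \sum_{s} P(s)\, \hat\theta(s, Y)$
   - $\text{Var}\{\hat\theta\} = E\{(\hat\theta - E\{\hat\theta\})^2\} = \sum_{s} P(s)\, [\hat\theta(s, Y) - E\{\hat\theta\}]^2$

16. Estimator of variance, $v(\hat\theta)$
   - Unbiased if $E\{v(\hat\theta)\} = \text{Var}\{\hat\theta\}$
   - Consistent if $P\{\,|v(\hat\theta)/\text{Var}\{\hat\theta\} - 1| > \varepsilon\,\} \to 0$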


Textbook Methods

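1. Design: srs wor of size $n$

Estimator:

$$\hat Y = f^{-1} \sum_{i=1}^{n} y_i, \qquad f = n/N$$

Variance Estimator:

$$v(\hat Y) = N^2 (1 - f)\, s^2 / n$$

$$s^2 = \sum_{i=1}^{n} (y_i - \bar y)^2 / (n - 1), \qquad \bar y = \sum_{i=1}^{n} y_i / n$$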
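The srs wor formulas on this slide, $\hat Y = f^{-1}\sum y_i$ and $v(\hat Y) = N^2(1-f)s^2/n$, translate almost line for line into code. The following is a minimal sketch; the function name `srs_wor_total` and the sample values are hypothetical and serve only as illustration.

```python
from statistics import variance

def srs_wor_total(y, N):
    """Return (Y_hat, v(Y_hat)) for an srs wor sample y drawn from N units."""
    n = len(y)
    f = n / N                      # sampling fraction f = n/N
    y_hat = sum(y) / f             # Y_hat = f^{-1} * sum of y_i
    s2 = variance(y)               # s^2, divisor n - 1
    v = N**2 * (1 - f) * s2 / n    # v(Y_hat) = N^2 (1 - f) s^2 / n
    return y_hat, v

# Hypothetical sample of n = 5 yields from a population of N = 100 farms
y_hat, v = srs_wor_total([12.0, 15.0, 9.0, 20.0, 14.0], N=100)
```

The factor $(1 - f)$ is the finite population correction; it vanishes as $n$ approaches $N$.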


Textbook Methods

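2. Design: srs wor at both the first and second stages of sampling

Estimator:

$$\hat Y = f_1^{-1} \sum_{i=1}^{n} f_{2i}^{-1} \sum_{j=1}^{m_i} y_{ij}, \qquad f_1 = n/N, \qquad f_{2i} = m_i/M_i$$

Variance Estimator:

$$v(\hat Y) = N^2 (1 - f_1)\, \frac{1}{n} \sum_{i=1}^{n} \frac{(M_i \bar y_{i.} - \hat Y/N)^2}{n - 1} + \frac{N}{n} \sum_{i=1}^{n} M_i^2 (1 - f_{2i})\, \frac{s_{2i}^2}{m_i}$$

$$s_{2i}^2 = \sum_{j=1}^{m_i} (y_{ij} - \bar y_{i.})^2 / (m_i - 1), \qquad \bar y_{i.} = \sum_{j=1}^{m_i} y_{ij} / m_i$$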
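The two-stage variance estimator has a between-PSU term, built from the estimated PSU totals $M_i \bar y_{i.}$, and a within-PSU term, built from the second-stage variances $s_{2i}^2$. The sketch below assumes each sampled PSU is passed as a pair `(M_i, observations)` with at least two observations; the function name and the data layout are hypothetical.

```python
from statistics import mean, variance

def two_stage_total(clusters, N):
    """clusters: list of (M_i, [y_i1, ..., y_im_i]) for the n sampled PSUs.

    Assumes srs wor at both stages and m_i >= 2 in every PSU.
    """
    n = len(clusters)
    f1 = n / N
    psu_totals = [M * mean(ys) for M, ys in clusters]   # M_i * ybar_i.
    y_hat = sum(psu_totals) / f1                        # equals (N/n) * sum M_i ybar_i.
    # Between-PSU term; the mean of the PSU totals is Y_hat / N
    between = N**2 * (1 - f1) / n * variance(psu_totals)
    # Within-PSU term: (N/n) * sum of M_i^2 (1 - f_2i) s_2i^2 / m_i
    within = 0.0
    for M, ys in clusters:
        m = len(ys)
        f2 = m / M
        within += M**2 * (1 - f2) * variance(ys) / m
    return y_hat, between + (N / n) * within

# Hypothetical example: every sampled PSU is fully enumerated (m_i = M_i),
# so the within-PSU term is zero and only the between-PSU term remains.
y_hat, v = two_stage_total(
    [(2, [1.0, 3.0]), (3, [2.0, 4.0, 6.0]), (2, [5.0, 5.0])], N=10
)
```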


Replication-Based Methods

$$v(\hat\theta) = C \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta)^2$$


Chapter 2: The Method of Random Groups

- Interpenetrating samples
- Replicated samples
- Ultimate cluster
- Resampling
- Random groups


Chapter 2: The Method of Random Groups

The Case of Independent Random Groups

(i) Draw a sample, $s_1$
    No restrictions on the sampling methodology

(ii) Replace the first sample
     Draw a second sample, $s_2$
     Use the same sampling methodology

(iii) Repeat until $k \ge 2$ samples are obtained, $s_1, s_2, \ldots, s_k$


Chapter 2: The Method of Random Groups

Common estimation procedure:

- Editing procedures
- Adjustments for nonresponse
- Outlier procedures
- Estimator of parameter


Chapter 2: The Method of Random Groups

Common measurement process:

- Field work
- Callbacks
- Clerical screening and coding
- Conversion to machine-readable form


Chapter 2: The Method of Random Groups

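Estimators of the Parameter of Interest, $\theta$

Random group estimators: $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k$

Overall estimators:

$$\hat{\bar\theta} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat\theta_\alpha$$

and $\hat\theta$, computed from the combined sample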


Chapter 2: The Method of Random Groups

Two Examples:

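Population total

$$Y = \sum_{i=1}^{N} Y_i$$

$$\hat Y_\alpha = \sum_{i \in s_\alpha} W_{\alpha i} Y_i$$

$$\hat{\bar Y} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat Y_\alpha$$

$$\hat Y = \sum_{i \in s} W_i Y_i$$

Ratio

$$\theta = \frac{Y}{X}$$

$$\hat\theta_\alpha = \frac{\hat Y_\alpha}{\hat X_\alpha}$$

$$\hat{\bar\theta} = \frac{1}{k} \sum_{\alpha=1}^{k} \frac{\hat Y_\alpha}{\hat X_\alpha}$$

$$\hat\theta = \frac{\hat Y}{\hat X}$$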


Chapter 2: The Method of Random Groups

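Estimators of $\text{Var}\{\hat{\bar\theta}\}$ or $\text{Var}\{\hat\theta\}$:

$$v(\hat{\bar\theta}) = \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2 / [k(k-1)]$$

$$v_1(\hat\theta) = v(\hat{\bar\theta})$$

$$v_2(\hat\theta) = \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta)^2 / [k(k-1)]$$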
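Both random group variance estimators use the divisor $k(k-1)$ and differ only in the centering value: $v(\hat{\bar\theta})$ centers on the mean of the random group estimates, $v_2(\hat\theta)$ on the full-sample estimate. A minimal sketch follows; the function name and the numbers are hypothetical.

```python
def random_group_variances(theta_alpha, theta_full):
    """Return (theta_bar, v1, v2) from k random group estimates.

    theta_alpha: the k random group estimates theta_hat_alpha
    theta_full:  the estimate theta_hat computed from the combined sample
    """
    k = len(theta_alpha)
    theta_bar = sum(theta_alpha) / k
    denom = k * (k - 1)
    v1 = sum((t - theta_bar) ** 2 for t in theta_alpha) / denom   # centered on theta_bar
    v2 = sum((t - theta_full) ** 2 for t in theta_alpha) / denom  # centered on theta_hat
    return theta_bar, v1, v2

# Hypothetical estimates from k = 4 random groups
theta_bar, v1, v2 = random_group_variances([10.0, 12.0, 11.0, 13.0], theta_full=11.4)
```

Because the sum of squares is smallest when centered on the mean, `v1` can never exceed `v2`.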


Chapter 2: The Method of Random Groups

Properties:

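$$E\{v(\hat{\bar\theta})\} = \text{Var}\{\hat{\bar\theta}\}$$

$$\text{CV}\{v(\hat{\bar\theta})\} = \frac{[\text{Var}\{v(\hat{\bar\theta})\}]^{1/2}}{\text{Var}\{\hat{\bar\theta}\}} = \left\{ \frac{\beta_4(\hat\theta_1) - (k-3)/(k-1)}{k} \right\}^{1/2}$$

$$v_1(\hat\theta) \le v_2(\hat\theta)$$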


Chapter 2: The Method of Random Groups

Confidence Intervals:

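$$\left( \hat{\bar\theta} - c \sqrt{v(\hat{\bar\theta})},\;\; \hat{\bar\theta} + c \sqrt{v(\hat{\bar\theta})} \right)$$

$$c = z_{\alpha/2} \quad \text{or} \quad t_{k-1,\,\alpha/2}$$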
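The interval is the point estimate plus or minus $c$ standard errors. The sketch below uses the normal critical value $z_{\alpha/2}$ from the Python standard library; the $t_{k-1,\alpha/2}$ alternative, which is wider when $k$ is small, is not implemented here. The function name and the inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def random_group_ci(theta_alpha, level=0.95):
    """Normal-theory confidence interval based on k random group estimates."""
    k = len(theta_alpha)
    theta_bar = sum(theta_alpha) / k
    v = sum((t - theta_bar) ** 2 for t in theta_alpha) / (k * (k - 1))
    c = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}, about 1.96 at 95%
    half_width = c * sqrt(v)
    return theta_bar - half_width, theta_bar + half_width

# Hypothetical estimates from k = 4 random groups
lo, hi = random_group_ci([10.0, 12.0, 11.0, 13.0])
```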


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Description of Basic Techniques

- $L$ strata
- $N_h$ units per stratum
- $N$ = size of entire population
- $n_h = 2$ units selected per stratum
- srs wr

Example: restaurants in Montreal


Chapter 3: Variance Estimation Based on Balanced Half-Samples

$\bar Y$ = average number of customers served by Montreal restaurants on a Monday night

$$\bar y_{st} = \sum_{h=1}^{L} W_h \bar y_h$$

$$W_h = N_h / N$$

$$\bar y_h = (y_{h1} + y_{h2}) / 2$$


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Textbook Estimator of Variance

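$$v(\bar y_{st}) = \sum_{h=1}^{L} W_h^2 s_h^2 / 2 = \sum_{h=1}^{L} W_h^2 d_h^2 / 4$$

$$s_h^2 = \sum_{i=1}^{2} (y_{hi} - \bar y_h)^2 / (2 - 1)$$

$$d_h = y_{h1} - y_{h2}$$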


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Random Group Estimator of Variance

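$k = 2$ independent random groups are available:

$$(y_{11}, y_{21}, \ldots, y_{L1})$$

$$(y_{12}, y_{22}, \ldots, y_{L2})$$

$$v_{RG}(\bar y_{st}) = \frac{1}{2(2-1)} \sum_{\alpha=1}^{2} (\bar y_{st,\alpha} - \bar y_{st})^2 = (\bar y_{st,1} - \bar y_{st,2})^2 / 4$$

$$\bar y_{st,\alpha} = \sum_{h=1}^{L} W_h y_{h\alpha}$$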


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Half-Sample Methodology

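$2^L$ possible half-samples

$$\bar y_{st,\alpha} = \sum_{h=1}^{L} W_h (\delta_{h1\alpha}\, y_{h1} + \delta_{h2\alpha}\, y_{h2})$$

$$\delta_{h1\alpha} = \begin{cases} 1, & \text{if unit } (h,1) \text{ is selected for the } \alpha\text{-th half-sample} \\ 0, & \text{otherwise} \end{cases}$$

$$\delta_{h2\alpha} = 1 - \delta_{h1\alpha}$$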


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Choosing a Manageable Number, k, of Half-Samples

$$v_k(\bar y_{st}) = \sum_{\alpha=1}^{k} (\bar y_{st,\alpha} - \bar y_{st})^2 / k$$

Selection of the $k$ half-samples:

- random
- balanced


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Table 3.2.1. Definition of Balanced Half-Sample Replicates for 5, 6, 7, or 8 Strata

Replicate            Stratum (h)
                      1    2    3    4    5    6    7    8
$\delta_h^{(1)}$     +1   -1   -1   +1   -1   +1   +1   -1
$\delta_h^{(2)}$     +1   +1   -1   -1   +1   -1   +1   -1
$\delta_h^{(3)}$     +1   +1   +1   -1   -1   +1   -1   -1
$\delta_h^{(4)}$     -1   +1   +1   +1   -1   -1   +1   -1
$\delta_h^{(5)}$     +1   -1   +1   +1   +1   -1   -1   -1
$\delta_h^{(6)}$     -1   +1   -1   +1   +1   +1   -1   -1
$\delta_h^{(7)}$     -1   -1   +1   -1   +1   +1   +1   -1
$\delta_h^{(8)}$     -1   -1   -1   -1   -1   -1   -1   -1


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Properties of the Balanced Half-Sample Methods

$$v_k(\bar y_{st}) = v(\bar y_{st})$$

$$k^{-1} \sum_{\alpha=1}^{k} \bar y_{st,\alpha} = \bar y_{st}, \quad \text{provided } k > L$$
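Both properties can be checked numerically with the replicate pattern of Table 3.2.1. The sketch below hard-codes that 8 x 8 pattern and uses its first $L$ columns. It assumes the convention that +1 selects unit $(h,1)$ and -1 selects unit $(h,2)$; the balanced variance is the same under the opposite convention. The function name and the data for $L = 5$ strata are hypothetical.

```python
import numpy as np

# Rows are replicates, columns are strata (pattern of Table 3.2.1).
# Assumed convention: +1 selects unit (h,1), -1 selects unit (h,2).
H = np.array([
    [+1, -1, -1, +1, -1, +1, +1, -1],
    [+1, +1, -1, -1, +1, -1, +1, -1],
    [+1, +1, +1, -1, -1, +1, -1, -1],
    [-1, +1, +1, +1, -1, -1, +1, -1],
    [+1, -1, +1, +1, +1, -1, -1, -1],
    [-1, +1, -1, +1, +1, +1, -1, -1],
    [-1, -1, +1, -1, +1, +1, +1, -1],
    [-1, -1, -1, -1, -1, -1, -1, -1],
])

def bhs_variance(y, W):
    """y: (L, 2) array of the two sampled units per stratum; W: stratum weights."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    L = y.shape[0]
    signs = H[:, :L]                                  # use the first L columns
    y_st = W @ y.mean(axis=1)                         # full-sample estimator
    picked = np.where(signs > 0, y[:, 0], y[:, 1])    # (k, L): selected unit per stratum
    y_st_alpha = picked @ W                           # the k half-sample estimators
    v_k = np.mean((y_st_alpha - y_st) ** 2)           # balanced half-sample variance
    v_textbook = np.sum(W**2 * (y[:, 0] - y[:, 1]) ** 2) / 4
    return y_st, y_st_alpha, v_k, v_textbook

# Hypothetical data for L = 5 strata with equal weights
y_st, y_st_alpha, v_k, v_textbook = bhs_variance(
    [[10, 14], [20, 16], [5, 9], [30, 28], [12, 18]], W=[0.2] * 5
)
```

The cross-stratum terms cancel because any two columns of the pattern are orthogonal. With $L = 8$ the all-minus column would be used, and the mean of the half-sample estimators would no longer equal $\bar y_{st}$, which is why the second property needs $k > L$.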


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Usage with Multistage Designs

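$Y$ = total number of employed persons in Canada

$$\hat Y = \sum_{h=1}^{L} \left( \hat Y_{h1} / 2 p_{h1} + \hat Y_{h2} / 2 p_{h2} \right)$$

$\hat Y_{h1}$ = estimator of total number of employed persons in the $(h,1)$-th PSU

$p_{h1} \propto$ housing units

$$v(\hat Y) = \sum_{h=1}^{L} \left( \hat Y_{h1} / p_{h1} - \hat Y_{h2} / p_{h2} \right)^2 / 4$$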


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Balanced Half-Sample Methodology

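$$\hat Y_\alpha = \sum_{h=1}^{L} \left( \delta_{h1\alpha}\, \hat Y_{h1} / p_{h1} + \delta_{h2\alpha}\, \hat Y_{h2} / p_{h2} \right)$$

$$v_k(\hat Y) = \sum_{\alpha=1}^{k} (\hat Y_\alpha - \hat Y)^2 / k$$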


Chapter 3: Variance Estimation Based on Balanced Half-Samples

Alternative Half-Sample Estimators of Variance

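$$v_k(\hat\theta) = \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta)^2 / k$$

$$v_k^{c}(\hat\theta) = \frac{1}{k} \sum_{\alpha=1}^{k} (\hat\theta_\alpha^{c} - \hat\theta)^2$$

$$\bar v_k(\hat\theta) = \frac{1}{2} \left[ v_k(\hat\theta) + v_k^{c}(\hat\theta) \right]$$

$$v_k^{\dagger}(\hat\theta) = \frac{1}{4k} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta_\alpha^{c})^2$$

Here $\hat\theta_\alpha^{c}$ is the estimator computed from the complement of the $\alpha$-th half-sample.

Estimators are not necessarily equal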


Chapter 4: The Jackknife Method

Quenouille (1949) – bias reduction

Tukey (1958) – variance estimation, testing, interval estimation

Resampling


Chapter 4: The Jackknife Method

Basic Methodology

Random sample: $y_1, y_2, \ldots, y_n$

Random groups: $n = km$

Parameter: $\theta$ (example: yield per acre of farms in Quebec)

Estimator: $\hat\theta$


Chapter 4: The Jackknife Method

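Drop out $m$: $\hat\theta_{(\alpha)}$, $\alpha = 1, \ldots, k$

Pseudovalue:

$$\hat\theta_\alpha = k \hat\theta - (k - 1)\, \hat\theta_{(\alpha)}$$

Quenouille’s estimator:

$$\hat{\bar\theta} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat\theta_\alpha$$

Variance estimator:

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2$$

$$v_2(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta)^2$$

Special case: $k = n$, $m = 1$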


Chapter 4: The Jackknife Method

Full-sample estimator: $\hat\theta$

Variance estimator:

$$v(\hat\theta) = \frac{k - 1}{k} \sum_{i=1}^{k} (\hat\theta_{(i)} - \hat\theta)^2$$
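The jackknife steps (drop a group, recompute the estimator, form pseudovalues, take their variance) can be sketched generically for the special case $k = n$, $m = 1$. The function name `jackknife` and its signature are hypothetical; `estimator` is any function that maps a list of observations to a number.

```python
from statistics import mean

def jackknife(y, estimator):
    """Delete-one jackknife (k = n, m = 1) for a generic estimator.

    y must be a list. Returns (theta_hat, quenouille, v1, v2).
    """
    k = len(y)
    theta_hat = estimator(y)
    # theta_(alpha): the estimator recomputed with observation alpha dropped
    theta_drop = [estimator(y[:a] + y[a + 1:]) for a in range(k)]
    # Pseudovalues: k * theta_hat - (k - 1) * theta_(alpha)
    pseudo = [k * theta_hat - (k - 1) * t for t in theta_drop]
    quenouille = mean(pseudo)
    v1 = sum((p - quenouille) ** 2 for p in pseudo) / (k * (k - 1))
    v2 = sum((p - theta_hat) ** 2 for p in pseudo) / (k * (k - 1))
    return theta_hat, quenouille, v1, v2

# Hypothetical check with the sample mean, where each pseudovalue equals y_i
theta_hat, quenouille, v1, v2 = jackknife([12.0, 15.0, 9.0, 20.0, 14.0], mean)
```

For the sample mean the pseudovalues reproduce the observations, so `v1` and `v2` both reduce to the textbook $s^2/n$. Since $\hat\theta_\alpha - \hat\theta = -(k-1)(\hat\theta_{(\alpha)} - \hat\theta)$, `v2` is also equal to $\frac{k-1}{k}\sum(\hat\theta_{(\alpha)} - \hat\theta)^2$.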


Chapter 4: The Jackknife Method

Example: ratio

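$$\theta = \bar Y / \bar X$$

$$\hat\theta = \bar y / \bar x$$

$$\hat\theta_{(\alpha)} = \bar y_{(\alpha)} / \bar x_{(\alpha)}$$

$$\hat\theta_\alpha = k\, \bar y / \bar x - (k - 1)\, \bar y_{(\alpha)} / \bar x_{(\alpha)}$$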
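For the ratio, each drop-out estimate is a ratio of means with one group removed, so it can be computed from the full-sample sums. This is a minimal sketch for the delete-one case ($k = n$, $m = 1$); the function name and the data are hypothetical.

```python
def jackknife_ratio(y, x):
    """Delete-one jackknife for the ratio ybar / xbar.

    Returns (theta_hat, theta_bar, v), where theta_bar is the mean of the
    pseudovalues and v is the pseudovalue variance with divisor k(k - 1).
    """
    k = len(y)
    sy, sx = sum(y), sum(x)
    theta_hat = sy / sx                                            # ybar / xbar
    theta_drop = [(sy - yi) / (sx - xi) for yi, xi in zip(y, x)]   # ybar_(a) / xbar_(a)
    pseudo = [k * theta_hat - (k - 1) * t for t in theta_drop]
    theta_bar = sum(pseudo) / k
    v = sum((p - theta_bar) ** 2 for p in pseudo) / (k * (k - 1))
    return theta_hat, theta_bar, v

# Hypothetical data in which y is exactly twice x: every drop-out ratio is 2
theta_hat, theta_bar, v = jackknife_ratio([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```

When the ratio is constant across observations every pseudovalue coincides with $\hat\theta$ and the variance estimate is zero.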


Chapter 4: The Jackknife Method

Usage in Stratified Sampling

Drop out observation(s) from individual strata: $\hat\theta_{(hi)}$

$$v(\hat\theta) = \sum_{h=1}^{L} \frac{n_h - 1}{n_h} \sum_{i=1}^{n_h} (\hat\theta_{(hi)} - \hat\theta)^2$$
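In the stratified jackknife one observation is dropped from one stratum at a time while the other strata stay intact, and the squared deviations are scaled by $(n_h - 1)/n_h$ within each stratum. The sketch below applies this to the stratified mean and assumes at least two observations per stratum; the function name and the data are hypothetical.

```python
from statistics import mean

def stratified_jackknife_mean(strata, W):
    """strata: list of lists of observations; W: stratum weights W_h = N_h / N.

    Assumes n_h >= 2 in every stratum. Returns (theta_hat, v).
    """
    def y_st(samples):
        return sum(w * mean(ys) for w, ys in zip(W, samples))

    theta_hat = y_st(strata)
    v = 0.0
    for h, ys in enumerate(strata):
        n_h = len(ys)
        ss = 0.0
        for i in range(n_h):
            # Drop observation i from stratum h only
            dropped = strata[:h] + [ys[:i] + ys[i + 1:]] + strata[h + 1:]
            ss += (y_st(dropped) - theta_hat) ** 2
        v += (n_h - 1) / n_h * ss
    return theta_hat, v

# Hypothetical sample with three strata
theta_hat, v = stratified_jackknife_mean(
    [[10.0, 14.0, 12.0], [20.0, 16.0], [5.0, 9.0, 7.0, 11.0]], W=[0.5, 0.3, 0.2]
)
```

For this linear estimator the result agrees with the with-replacement textbook form $\sum_h W_h^2 s_h^2 / n_h$, which is 0.76 for the data shown.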


Chapter 4: The Jackknife Method

Application to Cluster Sampling

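Example

$Y$ = total employed persons

$$\hat Y = \frac{1}{n} \sum_{i=1}^{n} \hat Y_i / p_i = \sum_{i} \sum_{j} W_{ij} Y_{ij}$$

Drop out ultimate clusters

$$\hat Y_{(\alpha)} = \frac{1}{m(k-1)} \sum_{i=1}^{m(k-1)} \hat Y_i / p_i = \sum_{i} \sum_{j} W_{(\alpha)ij} Y_{ij}$$

$$W_{(\alpha)ij} = \begin{cases} W_{ij}\, \dfrac{mk}{m(k-1)}, & \text{if PSU } i \text{ is not dropped out} \\ 0, & \text{if PSU } i \text{ is dropped out} \end{cases}$$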


Chapter 5: The Bootstrap Method

Works with replicates of potentially any size, $n^*$

Original Application –

$Y_1, \ldots, Y_n$ are iid random variables (scalar or vector) from a distribution function $F$

$\theta$ is to be estimated


Chapter 5: The Bootstrap Method

A bootstrap sample (or bootstrap replicate) is a simple random sample with replacement (srs wr) of size $n^*$ selected from the original sample:

$$Y_1^*, \ldots, Y_{n^*}^*$$

$\hat\theta^*$ denotes the estimator of the same functional form as $\hat\theta$


Chapter 5: The Bootstrap Method

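Ideal Bootstrap Estimator of $\text{Var}\{\hat\theta\}$

$$v_1(\hat\theta) = \text{Var}_*\{\hat\theta^*\},$$

where $\text{Var}_*$ signifies the conditional variance, given the original sample

Monte Carlo Bootstrap Estimator of $\text{Var}\{\hat\theta\}$

i. Draw a large number, $A$, of independent bootstrap replicates from the main sample and label the corresponding observations as $Y_{\alpha 1}^*, \ldots, Y_{\alpha n^*}^*$, for $\alpha = 1, \ldots, A$;

ii. For each bootstrap replicate, compute the corresponding estimator $\hat\theta_\alpha^*$ of the parameter of interest; and

iii. Calculate the variance between the $\hat\theta_\alpha^*$ values

$$v_2(\hat\theta) = \frac{1}{A - 1} \sum_{\alpha=1}^{A} (\hat\theta_\alpha^* - \hat{\bar\theta}^*)^2, \qquad \hat{\bar\theta}^* = \frac{1}{A} \sum_{\alpha=1}^{A} \hat\theta_\alpha^*.$$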
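Steps i to iii of the Monte Carlo bootstrap can be sketched with the standard library alone. The function name, the default number of replicates `A=2000`, and the fixed seed are hypothetical choices made for reproducibility, not recommendations from the lecture.

```python
import random
from statistics import mean, variance

def bootstrap_variance(y, estimator, n_star, A=2000, seed=0):
    """Monte Carlo bootstrap estimate of Var{theta_hat} for iid data y."""
    rng = random.Random(seed)
    # Steps i and ii: draw A srs wr replicates of size n* and evaluate the estimator
    theta_star = [estimator(rng.choices(y, k=n_star)) for _ in range(A)]
    # Step iii: variance between the replicate estimates, divisor A - 1
    return variance(theta_star)

# Hypothetical sample; n* = n - 1 = 4 replicate size
y = [12.0, 15.0, 9.0, 20.0, 14.0]
v2 = bootstrap_variance(y, mean, n_star=4)
```

As `A` grows, `v2` approaches the ideal bootstrap estimator, which cannot in general be computed in closed form.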


Chapter 5: The Bootstrap Method

Application to the Finite Population –

Simple Random Sampling with Replacement (srs wr)

Data

$$y_1, \ldots, y_n$$

Parameter of Interest

$$\bar Y$$

Standard Estimator

$$\bar y = \frac{1}{n} \sum_{i=1}^{n} y_i$$


Chapter 5: The Bootstrap Method

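Bootstrap Sample

$$y_1^*, \ldots, y_{n^*}^*$$

Estimator

$$\bar y^* = \frac{1}{n^*} \sum_{i=1}^{n^*} y_i^*$$

Bootstrap Moments

$$E_*\{y_1^*\} = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar y$$

$$\text{Var}_*\{y_1^*\} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar y)^2 = \frac{n - 1}{n}\, s^2$$

Ideal Bootstrap Estimator of Variance

$$v_1(\bar y) = \text{Var}_*\{\bar y^*\} = \frac{\text{Var}_*\{y_1^*\}}{n^*} = \frac{n - 1}{n} \cdot \frac{s^2}{n^*}$$

Unbiased Choice

$$n^* = n - 1$$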
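Substituting the replicate size into the ideal bootstrap estimator shows why $n^* = n - 1$ is the unbiased choice. The second expression, for $n^* = n$, is added here only for contrast.

```latex
v_1(\bar y)\Big|_{n^* = n-1} = \frac{n-1}{n} \cdot \frac{s^2}{n-1} = \frac{s^2}{n},
\qquad
v_1(\bar y)\Big|_{n^* = n} = \frac{n-1}{n} \cdot \frac{s^2}{n}.
```

The first result is the textbook unbiased estimator of $\text{Var}\{\bar y\}$ under srs wr; the second is smaller by the factor $(n-1)/n$.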


Chapter 5: The Bootstrap Method

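Multistage Sampling with pps wr Sampling at the First Stage

Observed Data

$y_{ij}$, where $i$ indexes the selected PSU and $j$ indexes the completed interview within the PSU

Parameter of Interest

$$Y$$

Estimator

$$\hat Y = \sum_{i=1}^{n} \sum_{j} w_{ij}\, y_{ij} = \frac{1}{n} \sum_{i=1}^{n} \frac{\hat Y_i}{p_i} = \frac{1}{n} \sum_{i=1}^{n} z_i$$

$$z_i = \hat Y_i / p_i$$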


Chapter 5: The Bootstrap Method

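Bootstrap Sample

$$z_1^*, z_2^*, \ldots, z_{n^*}^*$$

Bootstrap Moments

$$E_*\{z_1^*\} = \frac{1}{n} \sum_{i=1}^{n} z_i = \hat Y$$

$$\text{Var}_*\{z_1^*\} = \frac{1}{n} \sum_{i=1}^{n} (z_i - \hat Y)^2$$

Ideal Bootstrap Estimator of Variance

$$v_1(\hat Y) = \text{Var}_*\{\hat Y^*\} = \frac{\text{Var}_*\{z_1^*\}}{n^*} = \frac{1}{n^*} \cdot \frac{1}{n} \sum_{i=1}^{n} (z_i - \hat Y)^2.$$

Unbiased Choice

$$n^* = n - 1$$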


Chapter 6: Taylor Series Methods

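Assume a complex survey design

$\mathbf{Y} = (Y_1, \ldots, Y_p)$, vector of population totals

$\hat{\mathbf{Y}} = (\hat Y_1, \ldots, \hat Y_p)$

$\theta = g(\mathbf{Y})$, parameter of interest, such as the ratio $Y_1 / Y_2$

$\hat\theta = g(\hat{\mathbf{Y}})$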


Chapter 6: Taylor Series Methods

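First-order Taylor series approximation

$$\hat\theta - \theta = \sum_{j=1}^{p} \frac{\partial g(\mathbf{Y})}{\partial y_j} (\hat Y_j - Y_j) + R$$

MSE

$$\text{MSE}\{\hat\theta\} \approx \text{Var}\left\{ \sum_{j=1}^{p} \frac{\partial g(\mathbf{Y})}{\partial y_j} (\hat Y_j - Y_j) \right\} = \mathbf{d}\, \boldsymbol{\Sigma}\, \mathbf{d}'$$

$$d_j = \frac{\partial g(\mathbf{Y})}{\partial y_j}$$

$$\boldsymbol{\Sigma} = E\{ (\hat{\mathbf{Y}} - \mathbf{Y})' (\hat{\mathbf{Y}} - \mathbf{Y}) \}$$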


Chapter 6: Taylor Series Methods

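$$v(\hat\theta) = \hat{\mathbf{d}}\, \hat{\boldsymbol{\Sigma}}\, \hat{\mathbf{d}}'$$

$$\hat d_j = \frac{\partial g(\hat{\mathbf{Y}})}{\partial y_j}$$

$\hat{\boldsymbol{\Sigma}}$ by textbook or replication-based method applied to the y-data

Alternative algorithm

$$\hat Y_j = \sum_{i \in s} W_i Y_{ij}$$

$$\hat U = \sum_{i \in s} W_i U_i$$

$$U_i = \sum_{j=1}^{p} \frac{\partial g(\mathbf{Y})}{\partial y_j}\, Y_{ij}$$

$$\text{MSE}\{\hat\theta\} \approx \text{Var}\{\hat U\}$$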
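The alternative algorithm replaces the $p$-variate problem by a single linearized variable $U_i$ whose total is then fed to any variance estimator for totals. The sketch below does this for the ratio $Y_1/Y_2$, with derivatives $d_1 = 1/Y_2$ and $d_2 = -Y_1/Y_2^2$ evaluated at the estimates. It assumes srs wr with equal weights $W_i = N/n$ purely for illustration; the function name and the data are hypothetical.

```python
from statistics import variance

def linearized_ratio_variance(y1, y2, N):
    """Taylor series variance of theta_hat = Y1_hat / Y2_hat.

    Illustrative design assumption: srs wr of size n with weights W_i = N/n.
    """
    n = len(y1)
    w = N / n
    Y1_hat = w * sum(y1)
    Y2_hat = w * sum(y2)
    theta_hat = Y1_hat / Y2_hat
    # U_i = d_1 * Y_i1 + d_2 * Y_i2 = (Y_i1 - theta_hat * Y_i2) / Y2_hat
    u = [(a - theta_hat * b) / Y2_hat for a, b in zip(y1, y2)]
    # Textbook srs wr variance of U_hat = sum of W_i U_i: N^2 s_u^2 / n
    v = N**2 * variance(u) / n
    return theta_hat, v

# Hypothetical sample of n = 3 units from N = 30
theta_hat, v = linearized_ratio_variance([2.0, 4.0, 9.0], [1.0, 2.0, 3.0], N=30)
```

Under a different design only the last step changes: the same $U_i$ values are passed to the variance estimator appropriate to that design, whether textbook or replication-based.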


Chapter 7: Generalized Variance Functions

1. Population total, $X$

2. Estimator of the total, $\hat X$

3. Relative variance, $V^2 = \text{Var}\{\hat X\} / X^2$

4. $V^2 = \alpha + \beta / X$
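Once the model $V^2 = \alpha + \beta/X$ is adopted, its two parameters are fitted to pairs of estimated totals and estimated relative variances, and the fitted curve then supplies a variance for any other total. The sketch below fits the model by ordinary least squares with NumPy. The fitting method is an assumption made for illustration (weighted or iterative fits are also possible), and the function names and data are hypothetical.

```python
import numpy as np

def fit_gvf(x_hat, relvar):
    """Least-squares fit of V^2 = alpha + beta / X (ordinary least squares assumed)."""
    x_hat = np.asarray(x_hat, dtype=float)
    relvar = np.asarray(relvar, dtype=float)
    design = np.column_stack([np.ones_like(x_hat), 1.0 / x_hat])
    (alpha, beta), *_ = np.linalg.lstsq(design, relvar, rcond=None)
    return alpha, beta

def gvf_variance(x, alpha, beta):
    """Variance implied by the fitted model: Var = V^2 * X^2."""
    return (alpha + beta / x) * x**2

# Hypothetical totals whose relative variances follow the model exactly
x = np.array([1000.0, 5000.0, 20000.0, 100000.0])
alpha, beta = fit_gvf(x, 0.001 + 50.0 / x)
var_10000 = gvf_variance(10000.0, alpha, beta)
```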