
Ratio estimation under SRS
Assume
  Absence of nonsampling error
  SRS of size n from a population of size N
Ratio estimation is an alternative to $\bar{y}$, $\hat{t}$, and $\hat{p}$ under SRS; it uses “auxiliary” information (X)
Sample data: observe $y_i$ and $x_i$
Population information
  Have $y_i$ and $x_i$ on all individual units, or
  Have summary statistics from the population distribution of X, such as the population mean or total of X
Ratio estimation is also used to estimate a population parameter called a ratio (B)

Uses
Estimate a ratio
  Tree volume or bushels per acre
  Per capita income
  Liability to asset ratio
More precise estimator of population parameters
  If X and Y are correlated, can improve upon $\bar{y}$, $\hat{t}$, and $\hat{p}$
Estimating totals when population size N is unknown
  Avoids need to know N in formula for $\hat{t}$
Domain estimation
  Obtaining estimates for subsamples
Incorporate known information into estimates
  Poststratification
Adjust for nonresponse

Estimating a ratio, B
Population parameter for the ratio: B
$$B = \frac{t_y}{t_x} = \frac{\bar{y}_U}{\bar{x}_U}$$
Examples
  Number of bushels harvested (y) per acre (x)
  Number of children (y) per single-parent household (x)
  Total usable weight (y) relative to total shipment weight (x) for chickens

Estimating a ratio
SRS of n observation units (OUs)
Collect data on y and x for each OU
Natural estimator for B?
$$B = \frac{t_y}{t_x} = \frac{\bar{y}_U}{\bar{x}_U}$$

Estimating a ratio – 2
Estimator for B
$$\hat{B} = \frac{\hat{t}_y}{\hat{t}_x} = \frac{\bar{y}}{\bar{x}} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}$$
$\hat{B}$ is a biased estimator for B: $E[\hat{B}] \ne B$
$\hat{B}$ is a ratio of random variables
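The estimator on this slide is a single division, so it can be sketched in a few lines. The function name `ratio_estimate` and the toy data are hypothetical illustrations, not taken from the slides.

```python
def ratio_estimate(y, x):
    """Return B_hat = sum(y) / sum(x) for paired observations from an SRS."""
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    total_x = sum(x)
    if total_x == 0:
        raise ValueError("sum of x must be nonzero")
    return sum(y) / total_x


# Hypothetical toy data (not from the slides): bushels (y) and acres (x).
y = [120.0, 90.0, 200.0, 150.0]
x = [10.0, 8.0, 16.0, 12.0]
b_hat = ratio_estimate(y, x)
print(f"B_hat = {b_hat:.4f} bushels per acre")
```

Because the same n divides both sample means, the ratio of sums and the ratio of means give the same value.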

Bias of $\hat{B}$
$$E[\hat{B} - B] \approx \left(1 - \frac{n}{N}\right) \frac{1}{n \bar{x}_U^2} \left(B S_x^2 - R S_x S_y\right) \ne 0$$
where
$$R = \mathrm{Corr}(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x}_U)(y_i - \bar{y}_U)}{(N-1) S_x S_y}$$
  $S_x$ = population standard deviation of x
  $S_y$ = population standard deviation of y
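A small sketch of the approximate bias formula, to show how it shrinks as n grows and vanishes when $R = B S_x / S_y$. The function name and all numeric inputs are hypothetical values chosen for illustration only.

```python
def approx_bias_b_hat(n, N, B, x_bar_U, S_x, S_y, R):
    """Approximate bias of B_hat:
    (1 - n/N) * (B*S_x^2 - R*S_x*S_y) / (n * x_bar_U^2)."""
    fpc = 1.0 - n / N
    return fpc * (B * S_x**2 - R * S_x * S_y) / (n * x_bar_U**2)


# Hypothetical population values (assumptions, not from the slides).
common = dict(N=1000, B=2.0, x_bar_U=50.0, S_x=10.0, S_y=25.0)
bias_small_n = approx_bias_b_hat(n=10, R=0.6, **common)
bias_large_n = approx_bias_b_hat(n=100, R=0.6, **common)
bias_zero = approx_bias_b_hat(n=10, R=0.8, **common)  # R = B*S_x/S_y
print(bias_small_n, bias_large_n, bias_zero)
```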

Bias of $\hat{B}$ – 2
Bias is small if
  Sample size n is large
  Sample fraction n/N is large
  $\bar{x}_U$ is large
  $S_x$ is small (population standard deviation for x)
  High positive correlation between X and Y
(see Lohr p. 67)

Estimated variance of estimator for B
Estimator for $V[\hat{B}]$
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2}$$
where
$$s_e^2 = \frac{1}{n-1} \sum_{i=1}^{n} e_i^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \hat{B} x_i)^2, \qquad e_i = y_i - \hat{B} x_i$$
If $\bar{x}_U$ is unknown?

Variance of $\hat{B}$
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2}$$
Variance is small if
  Sample size n is large
  Sample fraction n/N is large
  Deviations about the line, $e = y - Bx$, are small
  Correlation between X and Y is close to 1
  $\bar{x}_U$ is large

Ag example – 1
Frame: 1987 Agricultural Census
Take SRS of 300 counties from 3078 counties to estimate conditions in 1992
Collect data on y, have data on x for sample
  $y_i$ = total acreage of farms in county i in 1992
  $x_i$ = total acreage of farms in county i in 1987
Existing knowledge about the population
  $t_x$ = total acreage of farms in the US in 1987 = 964,470,625 acres
  $\bar{x}_U$ = average acreage of farms per county in the US in 1987 = 313,343.283 acres / county

Ag example – 2
Estimate B = (acres of farms in 1992) / (acres of farms in 1987)
  $\sum_{i=1}^{300} y_i$ = 89,369,114 acres
  $\sum_{i=1}^{300} x_i$ = 90,586,117 acres
$$\hat{B} = \frac{\bar{y}}{\bar{x}} = \frac{\sum_{i=1}^{300} y_i}{\sum_{i=1}^{300} x_i} = \frac{89{,}369{,}114}{90{,}586{,}117} = 0.9866$$
0.9866 farm acres in 1992 relative to 1987 farm acres

Ag example – 3
Need to calculate variance of $e_i$'s
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2}$$

Ag example – 4
For each county i, calculate $e_i = y_i - \hat{B} x_i$
Coffee Co, AL example
$$e_1 = 175{,}209 - (0.9866)(179{,}311) \approx -1693.00$$
Sum of squares for $e_i$
$$\sum_{i=1}^{n} e_i^2 = e_1^2 + e_2^2 + \dots + e_{300}^2 = 2.9965166 \times 10^{11}$$
$$s_e^2 = \frac{1}{n-1} \sum_{i=1}^{n} e_i^2 = \frac{1}{299} \sum_{i=1}^{300} e_i^2 = 1{,}002{,}179{,}462$$

Ag example – 5
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2} = \left(1 - \frac{300}{3078}\right) \frac{1{,}002{,}179{,}462}{300 \, (313{,}343.283)^2} = 0.000030707$$
$\hat{B}$ = 0.9866 farm acres in 1992 per 1987 farm acre
$SE(\hat{B})$ = 0.0055
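As a check on the arithmetic, the variance and standard error of $\hat{B}$ can be recomputed from the summary numbers quoted in the Ag example. This is only a sketch; the variable names are illustrative.

```python
import math

# Summary values quoted in the Ag example slides.
n, N = 300, 3078
x_bar_U = 313_343.283      # population mean of x (acres / county, 1987)
s_e2 = 1_002_179_462       # sample variance of residuals e_i = y_i - B_hat * x_i

fpc = 1 - n / N                          # finite population correction
v_hat_B = fpc * s_e2 / (n * x_bar_U**2)  # estimated variance of B_hat
se_B = math.sqrt(v_hat_B)

print(f"V_hat[B_hat] = {v_hat_B:.9f}, SE = {se_B:.4f}")
```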

Estimating proportions
If denominator variable is random, use ratio estimator to estimate the proportion p
Example (p. 72)
  10 plots under protected oak trees used to assess effect of feral pigs on native vegetation on Santa Cruz Island, CA
  Count live seedlings y and total number of seedlings x per plot
  Y and X correlated due to common environmental factors
  Estimate proportion of live seedlings to total number of seedlings, B
$$\hat{B} = \frac{\bar{y}}{\bar{x}} = \frac{6.1}{20.6} \approx 0.30 \quad \text{with } SE(\hat{B}) = 0.032$$

Estimating population mean
Estimator for $\bar{y}_U$
$$\hat{\bar{y}}_r = \hat{B} \bar{x}_U = \frac{\bar{y}}{\bar{x}} \bar{x}_U = \bar{y} \, \frac{\bar{x}_U}{\bar{x}}$$
“Adjustment factor” for sample mean: $\bar{x}_U / \bar{x}$
  A measure of discrepancy between sample and population information, $\bar{x}$ and $\bar{x}_U$
  Improves precision if X and Y are positively correlated

Underlying model with B > 0
$$y_i = B x_i$$
B is a slope
B > 0 indicates X and Y are positively correlated
Absence of intercept implies line must go through origin (0, 0)
[Figure: line $y_i = B x_i$ through the origin, y plotted against x]

Using population mean of X to adjust sample mean
$$\hat{\bar{y}}_r = \hat{B} \bar{x}_U = \bar{y} \, \frac{\bar{x}_U}{\bar{x}}$$
Discrepancy between sample & population information for X is viewed as evidence that the same relative discrepancy exists between $\bar{y}$ and $\bar{y}_U$
  $\bar{x} < \bar{x}_U \Rightarrow \bar{x}_U / \bar{x} > 1 \Rightarrow$ adjust $\bar{y}$ up
  $\bar{x} > \bar{x}_U \Rightarrow \bar{x}_U / \bar{x} < 1 \Rightarrow$ adjust $\bar{y}$ down
  to get a better estimate of $\bar{y}_U$

Bias of ratio estimator for the population mean
$\hat{\bar{y}}_r$ is biased
$$E[\hat{\bar{y}}_r] - \bar{y}_U = E[\hat{B} \bar{x}_U] - B \bar{x}_U = \bar{x}_U \{E[\hat{B}] - B\} \ne 0$$
Rules of thumb for bias of $\hat{B}$ apply

Estimator for variance of $\hat{\bar{y}}_r$
$$\hat{V}[\hat{\bar{y}}_r] = \bar{x}_U^2 \, \hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n}$$

Ag example – 6
$$\hat{\bar{y}}_r = \hat{B} \bar{x}_U = (0.9866)(313{,}343.283) = 309{,}133.6 \text{ farm acres per county in 1992}$$

Ag example – 8
$\hat{\bar{y}}_r \approx$ 309,100 farm acres / county in 1992
$$\hat{V}[\hat{\bar{y}}_r] = \bar{x}_U^2 \, \hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n} = (0.9025) \frac{1{,}002{,}179{,}462}{300} = 3{,}014{,}890.67$$
$$SE[\hat{\bar{y}}_r] = 1736 \approx 1700$$

Ag example – 9
Expect a linear relationship between X and Y (Figure 3.1)
Note that the sample mean is not equal to the population mean for X
  $\bar{x}_U$ = average acreage of farms per county in the US in 1987 = 313,343.283 acres / county
  $\bar{x}$ = mean acreage of farms per county in the US in 1987 for the sample = 301,953.723 acres / county

MSE under ratio estimation
Recall …
  MSE = Variance + Bias$^2$
  SRS estimators are unbiased, so MSE = Variance
  Ratio estimators are biased, so MSE > Variance
Use MSE to compare design/estimation strategies
  EX: compare sample mean under SRS with ratio estimator for population mean under SRS

Sample mean vs. ratio estimator of mean
$MSE[\hat{\bar{y}}_r]$ is smaller than $MSE[\bar{y}]$ if and only if
$$R > \frac{1}{2} \, \frac{CV(x)}{CV(y)}$$
For example, if $CV(x) \approx CV(y)$ and $R = \mathrm{Corr}(x, y) > 1/2$, ratio estimation will be better than SRS

Estimating the MSE
Estimate MSE with sample estimates of bias and variance of estimator
This tends to underestimate MSE
  $\mathrm{Bias}[\hat{B}]$ and $V[\hat{B}]$ are approximations
Estimated MSE is less biased if
  $\mathrm{Bias}[\hat{B}]$ is small (see earlier slide)
    Large sample size or sampling fraction
    High positive correlation for X and Y
  $\bar{x}$ is a precise estimate (small CV for $\bar{x}$)
  We have a reasonably large sample size (n > 30)

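Ag example – 10
Sample mean under SRS
  $\bar{y}$ = 297,897 farm acres per county in 1992
  $se[\bar{y}] = \left[\left(1 - \frac{300}{3078}\right) \frac{344{,}552^2}{300}\right]^{1/2}$ = 18,898 acres
  $\widehat{MSE}[\bar{y}] = \hat{V}[\bar{y}]$ = 357,151,042
Ratio estimator
  $\hat{\bar{y}}_r$ = 309,134 farm acres per county in 1992
  $se[\hat{\bar{y}}_r]$ = 1,736 acres
  $\widehat{Bias}[\hat{\bar{y}}_r] = \left(1 - \frac{300}{3078}\right) \frac{1}{300 \, (301{,}954)} \left[0.9866 \, (344{,}830)^2 - 0.9958 \, (344{,}552)(344{,}830)\right] \approx -10$
  $\widehat{MSE}[\hat{\bar{y}}_r] = \hat{V}[\hat{\bar{y}}_r] + \widehat{Bias}[\hat{\bar{y}}_r]^2 = 3{,}013{,}696 + 100 = 3{,}013{,}796$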
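The comparison on this slide can be reproduced from the quoted summary statistics. The sketch below also evaluates the rule $R > \frac{1}{2} CV(x)/CV(y)$; it assumes, as the numbers on the slide suggest, that sample quantities stand in for the population ones. Variable names are illustrative.

```python
# Summary statistics quoted on the slide.
n, N = 300, 3078
fpc = 1 - n / N
y_bar, x_bar = 297_897.0, 301_954.0     # sample means (1992, 1987)
s_y, s_x, r = 344_552.0, 344_830.0, 0.9958
b_hat = 0.9866
se_ratio = 1_736.0                      # SE of the ratio estimator of the mean

# Sample mean under SRS is unbiased, so its MSE is its variance.
mse_srs = fpc * s_y**2 / n

# Estimated bias of the ratio estimator of the mean (x_bar replaces x_bar_U).
bias_ratio = fpc * (b_hat * s_x**2 - r * s_x * s_y) / (n * x_bar)
mse_ratio = se_ratio**2 + bias_ratio**2

# Rule of thumb: ratio estimation wins when R > CV(x) / (2 * CV(y)).
cv_x, cv_y = s_x / x_bar, s_y / y_bar
ratio_wins = r > 0.5 * cv_x / cv_y

print(f"MSE(SRS) = {mse_srs:,.0f}, bias = {bias_ratio:.1f}, "
      f"MSE(ratio) = {mse_ratio:,.0f}, ratio wins: {ratio_wins}")
```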

Estimating population total t
Estimator for t
$$\hat{t}_{yr} = \hat{B} t_x = \frac{\hat{t}_y}{\hat{t}_x} t_x = N \hat{\bar{y}}_r$$
Is $\hat{t}_{yr}$ biased?
Estimator for $V[\hat{t}_{yr}]$: $\hat{V}[\hat{t}_{yr}]$

Ag example – 11
$$\hat{t}_{yr} = \hat{B} t_x = (0.9866)(964{,}470{,}625) = 951{,}513{,}191 \text{ farm acres in the US in 1992}$$
$$\hat{V}[\hat{t}_{yr}] = t_x^2 \, \hat{V}[\hat{B}] = N^2 \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n} = N^2 \, \hat{V}[\hat{\bar{y}}_r]$$
$$= 3078^2 \, (0.9025) \, \frac{1{,}002{,}179{,}462}{300} \approx 2.856 \times 10^{13}$$
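The ratio estimates of the mean and total, with their estimated variances, follow from the same handful of summary numbers. A minimal sketch, using the unrounded $\hat{B}$ (the slides print it as 0.9866); variable names are illustrative.

```python
import math

# Summary values quoted in the Ag example slides.
n, N = 300, 3078
sum_y, sum_x = 89_369_114, 90_586_117   # sample totals of y and x
t_x = 964_470_625                       # known population total of x
x_bar_U = 313_343.283                   # known population mean of x
s_e2 = 1_002_179_462                    # sample variance of residuals

b_hat = sum_y / sum_x
fpc = 1 - n / N

# Ratio estimator of the population mean and its estimated variance.
y_bar_r = b_hat * x_bar_U
v_y_bar_r = fpc * s_e2 / n

# Ratio estimator of the population total and its estimated variance.
t_hat_yr = b_hat * t_x
v_t_hat_yr = N**2 * v_y_bar_r

print(f"mean:  {y_bar_r:,.1f} (SE {math.sqrt(v_y_bar_r):,.0f})")
print(f"total: {t_hat_yr:,.0f} (SE {math.sqrt(v_t_hat_yr):,.0f})")
```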

Summary of ratio estimation
$$B = \frac{t_y}{t_x} = \frac{\bar{y}_U}{\bar{x}_U}$$
$$\hat{B} = \frac{\hat{t}_y}{\hat{t}_x} = \frac{\bar{y}}{\bar{x}} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}$$
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2} \quad (\text{est. } \bar{x}_U \text{ w/ } \bar{x})$$
where
$$s_e^2 = \frac{1}{n-1} \sum_{i=1}^{n} e_i^2, \qquad e_i = y_i - \hat{B} x_i$$

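Summary of ratio estimation – 2
$$\hat{\bar{y}}_r = \hat{B} \bar{x}_U = \bar{y} \, \frac{\bar{x}_U}{\bar{x}}$$
$$\hat{V}[\hat{\bar{y}}_r] = \bar{x}_U^2 \, \hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n}$$
$$\hat{t}_{yr} = \hat{B} t_x = \hat{t}_y \, \frac{t_x}{\hat{t}_x}$$
$$\hat{V}[\hat{t}_{yr}] = t_x^2 \, \hat{V}[\hat{B}] = N^2 \, \hat{V}[\hat{\bar{y}}_r] = N^2 \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n}$$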

Regression estimation
What if the relationship between y and x is linear, but does NOT pass through the origin?
Better model in this case is
$$y = B_0 + B_1 x$$
[Figure: line with intercept $B_0$ and slope $B_1$, y plotted against x]

Regression estimation – 2
New estimator is a regression estimator
$$\hat{\bar{y}}_{reg} = \hat{B}_0 + \hat{B}_1 \bar{x}_U = \bar{y} + \hat{B}_1 (\bar{x}_U - \bar{x})$$
To estimate $\bar{y}_U$, $\hat{\bar{y}}_{reg}$ is the predicted value from the regression of y on x at $x = \bar{x}_U$
Adjustment factor for sample mean is linear, rather than multiplicative

Estimating population mean $\bar{y}_U$
Regression estimator
$$\hat{\bar{y}}_{reg} = \hat{B}_0 + \hat{B}_1 \bar{x}_U = \bar{y} + \hat{B}_1 (\bar{x}_U - \bar{x})$$
Estimating regression parameters
$$\hat{B}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{s_{xy}}{s_x^2} = \frac{r s_y}{s_x}$$
$$\hat{B}_0 = \bar{y} - \hat{B}_1 \bar{x}$$
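The regression estimator can be sketched directly from these formulas. The function name `regression_estimate` and the toy data are hypothetical; the data lie exactly on y = 2 + 3x so the result is easy to verify by hand.

```python
def regression_estimate(y, x, x_bar_U):
    """Return (B0_hat, B1_hat, y_bar_reg) for paired SRS data and a known
    population mean of x."""
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must not be constant")
    b1 = s_xy / s_xx                 # slope: s_xy / s_x^2 (the n-1 cancels)
    b0 = y_bar - b1 * x_bar          # intercept
    return b0, b1, y_bar + b1 * (x_bar_U - x_bar)


# Hypothetical toy data lying exactly on y = 2 + 3x.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]
b0, b1, y_reg = regression_estimate(y, x, x_bar_U=5.0)
print(b0, b1, y_reg)
```

The sample mean of y is 9.5 at a sample mean of x of 2.5; because the population mean of x is larger (5.0), the estimate is adjusted upward along the fitted line.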

Estimating population mean – 2
Sample variances, correlation, covariance
$$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$$
$$r = \frac{s_{xy}}{s_x s_y}$$
$$s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Bias in regression estimator
$$E[\hat{\bar{y}}_{reg} - \bar{y}_U] = -\mathrm{Cov}[\hat{B}_1, \bar{x}] \ne 0$$

Estimating variance
$$\hat{V}[\hat{\bar{y}}_{reg}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n}$$
where
$$s_e^2 = \frac{1}{n-1} \sum_{i=1}^{n} e_i^2, \qquad e_i = y_i - (\hat{B}_0 + \hat{B}_1 x_i) = y_i - \hat{y}_i$$
Note: This is a different residual than in ratio estimation (predicted values differ)

Estimating the MSE
Plugging sample estimates into Lohr, equation 3.13:
$$\widehat{MSE}[\hat{\bar{y}}_{reg}] = \left(1 - \frac{n}{N}\right) \frac{s_y^2}{n} (1 - r^2)$$

Estimating population total t
$$\hat{t}_{yreg} = N \hat{\bar{y}}_{reg}$$
$$\hat{V}[\hat{t}_{yreg}] = N^2 \, \hat{V}[\hat{\bar{y}}_{reg}]$$
Is the regression estimator for t unbiased?

Tree example
Goal: obtain a precise estimate of the number of dead trees in an area
Sample
  Select n = 25 out of N = 100 plots
  Make field determination of number of dead trees per plot, $y_i$
Population
  For all N = 100 plots, have photo determination of number of dead trees per plot, $x_i$
  Calculate $\bar{x}_U$ = 11.3 dead trees per plot

Tree example – 2
Lohr, p. 77-78
  Data
  Plot of y vs. x
  Output from PROC REG
  Components for calculating estimators and estimating the variance of the estimators
We will use PROC SURVEYREG, which will give you the correct output for regression estimators

Tree example – 3
Estimated mean number of dead trees/plot
$$\hat{\bar{y}}_{reg} = 5.059292 + 0.613274 \, (11.3) = 11.99 \text{ dead trees/plot}$$
$$se[\hat{\bar{y}}_{reg}] = \sqrt{\left(1 - \frac{25}{100}\right) \frac{5.54834}{25}} = 0.4080 \approx 0.41$$
Estimated total number of dead trees
$$\hat{t}_{yreg} = 100 \, (11.99) = 1199 \text{ dead trees in area}$$
$$se[\hat{t}_{yreg}] = 100 \, (0.4080) = 40.80 \approx 41$$

Tree example – 4
Due to small sample size, Lohr uses t-distribution w/ n − 2 degrees of freedom
  $\alpha = 0.05$, df = n − 2 = 23, so $t_{\alpha/2,\,df} = t_{0.025,\,23} = 2.07$
Half-width for 95% CI
$$t_{\alpha/2,\,n-2} \; se[\hat{t}_{yreg}] = 2.07 \, (40.80) = 84.45$$
Approx 95% CI for $t_y$ is (1115, 1283) dead trees
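The tree example can be replayed numerically from the values printed on the slides. A sketch only: the regression coefficients, residual variance 5.54834, and critical value 2.07 are taken as given, and the variable names are illustrative.

```python
import math

# Values quoted in the tree example slides.
n, N = 25, 100
b0_hat, b1_hat = 5.059292, 0.613274   # fitted intercept and slope
x_bar_U = 11.3                        # photo-based population mean (dead trees / plot)
s_e2 = 5.54834                        # residual variance used on the slide
t_crit = 2.07                         # t_{0.025, 23} as quoted

y_reg = b0_hat + b1_hat * x_bar_U                 # estimated mean per plot
se_y_reg = math.sqrt((1 - n / N) * s_e2 / n)      # its standard error

t_hat = N * y_reg                                 # estimated total
se_t_hat = N * se_y_reg
half_width = t_crit * se_t_hat
ci = (t_hat - half_width, t_hat + half_width)

print(f"mean {y_reg:.2f} (SE {se_y_reg:.4f}); total {t_hat:.0f} "
      f"(SE {se_t_hat:.2f}); 95% CI ({ci[0]:.1f}, {ci[1]:.1f})")
```

Working with unrounded intermediate values moves the endpoints by a fraction of a tree relative to the slide, which starts from the rounded total of 1199.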

Related estimators
Model: $y = B_0 + B_1 x$
Ratio estimator
  $B_0 = 0 \Rightarrow$ ratio model
  Ratio estimator = regression estimator with no intercept
Difference estimation
  $B_1 = 1 \Rightarrow$ slope is assumed to be 1
[Figure: line with intercept $B_0$ and slope $B_1$, y plotted against x]

Domain estimation under SRS
Usually interested in estimates and inferences for subpopulations, called domains
If we have not used stratification to set the sample size for each domain, then we should use domain estimation
  We will assume SRS for this discussion
If we use stratified sampling with strata = domains, then use stratum estimators (Ch 4)
  To use stratification, need to know domain assignment for each unit in the sampling frame prior to sampling

Stratification vs. domain estimation
In stratified random sampling
  Define sample size in each stratum before collecting data
  Sample size in stratum h is fixed, or known
  In other words, the sample size $n_h$ is the same for each sample selected under the specified design
In domain estimation
  $n_d$ = sample size in domain d is random
  Don’t know $n_d$ until after the data have been collected
  The value of $n_d$ changes from sample to sample

Population partitioned into domains
Recall U = index set for population = {1, 2, …, N}
Domain index set for domain d = 1, 2, …, D
  $U_d$ = {1, 2, …, $N_d$}, where $N_d$ = number of OUs in domain d in the population
In sample of size n
  $n_d$ = number of sample units from domain d
  $S_d$ = index set for sample units belonging to domain d
[Figure: population partitioned into domains d = 1, d = 2, …, d = D (Domain #1 through Domain D)]

Boat owner example
Population
  N = 400,000 boat owners (currently licensed)
Sample
  n = 1,500 owners selected using SRS
Divide universe (population) into 2 domains
  d = 1: own open motor boat > 16 ft. (large boat)
  d = 2: do not own this type of boat
Of the n = 1500 sample owners:
  $n_1$ = 472 owners of open motor boat > 16 ft.
  $n_2$ = 1028 owners do not own this kind of boat

New population parameters
Domain mean
$$\bar{y}_{U_d} = \frac{1}{N_d} \sum_{i \in U_d} y_i$$
Domain total
$$t_d = \sum_{i \in U_d} y_i$$
Notation
  $i \in U_d$: "Unit i belongs to domain d"
  $i \notin U_d$: "Unit i does NOT belong to domain d"

Boat owner example – 2
Estimate population domain mean
  Estimate the average number of children for boat owners from domain 1
  Estimate proportion of boat owners from domain 1 who have children
Estimate population domain total
  Estimate the total number of children for large boat owners (domain 1)

New population parameter – 2
Ratio form of population mean
$$\bar{y}_{U_d} = \frac{\sum_{i \in U_d} y_i}{N_d} = \frac{\sum_{i=1}^{N} u_i}{\sum_{i=1}^{N} x_i} = \frac{\sum_{i=1}^{N} u_i / N}{\sum_{i=1}^{N} x_i / N} = \frac{\bar{u}_U}{\bar{x}_U} = B$$
Numerator variable
$$u_i = \begin{cases} y_i & \text{if } i \in U_d \\ 0 & \text{if } i \notin U_d \end{cases}$$
Denominator variable
$$x_i = \begin{cases} 1 & \text{if } i \in U_d \\ 0 & \text{if } i \notin U_d \end{cases}$$

Boat owner example – 3
Estimate mean number of children for owners from domain 1
  $y_i$ = number of children for owner i
$$u_i = \begin{cases} y_i & \text{if owner } i \in U_1 \text{ (domain 1)} \\ 0 & \text{otherwise (not in domain 1)} \end{cases}$$
$$x_i = \begin{cases} 1 & \text{if owner } i \in U_1 \\ 0 & \text{otherwise} \end{cases}$$
Zero values for OUs that are not in domain 1
Applies to whole population

Boat example – 4
Columns: Owner (i), Domain ($d_i$), # Kids ($y_i$), Num. ($u_i$), Den. ($x_i$); the last two columns are left blank on the slide

Owner (i)   Domain (d_i)   # Kids (y_i)
1           1              3
2           1              2
3           2              5
4           1              0
5           2              0
6           2              1
7           1              1
8           2              2
…
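The two derived columns follow mechanically from the definitions of $u_i$ and $x_i$. A sketch using the eight owners listed on the slide; the tuple layout and variable names are illustrative assumptions.

```python
# (owner i, domain d_i, number of kids y_i) for the eight rows on the slide.
owners = [
    (1, 1, 3), (2, 1, 2), (3, 2, 5), (4, 1, 0),
    (5, 2, 0), (6, 2, 1), (7, 1, 1), (8, 2, 2),
]
target_domain = 1

# Numerator variable: y_i inside the domain, 0 outside.
u = [kids if d == target_domain else 0 for _, d, kids in owners]
# Denominator variable: indicator of domain membership.
x = [1 if d == target_domain else 0 for _, d, _ in owners]

# Ratio of totals over ALL rows equals the mean of y within domain 1.
domain_mean = sum(u) / sum(x)
print(u, x, domain_mean)
```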

Estimator for population domain mean $\bar{y}_{U_d}$
$$\bar{y}_d = \hat{B} = \frac{\bar{u}}{\bar{x}} = \frac{\frac{1}{n} \sum_{i=1}^{n} u_i}{\frac{1}{n} \sum_{i=1}^{n} x_i} = \frac{\sum_{i=1}^{n} u_i}{\sum_{i=1}^{n} x_i} = \frac{\sum_{i \in S_d} y_i}{n_d}$$
= sample mean of observations in domain d

Boat example – 5
Domain 1 data

Number of Children   Number of Respondents
0                    76
1                    139
2                    166
3                    63
4                    19
5                    5
6                    3
8                    1
Total                472

Boat example – 6
Domain 1 and domain 2 data combined

u_i     Number of Respondents
0       1104
1       139
2       166
3       63
4       19
5       5
6       3
8       1
Total   1500

1104 zeros = 76 zeros from domain 1 + 1028 zeros from domain 2

Boat example – 7
Two ways of estimating mean
Whole data set
$$\bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i = \frac{1}{1500} \left[1028(0) + 76(0) + 139(1) + 166(2) + \dots + 1(8)\right] = \frac{787}{1500} = 0.524667$$
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{n_1}{n} = \frac{472}{1500} = 0.314667$$
$$\hat{B} = \frac{\bar{u}}{\bar{x}} = \frac{0.524667}{0.314667} = 1.67 \text{ children per large boat owner}$$
Domain 1 data only
$$\bar{y}_1 = \frac{1}{n_1} \sum_{i \in S_1} y_i = \frac{787}{472} = 1.67 \text{ children per large boat owner}$$
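Both routes to the domain mean can be checked against the frequency table for domain 1. A sketch; the dictionary encoding of the table and the variable names are illustrative.

```python
# Frequency table for domain 1: number of children -> number of respondents.
freq_domain1 = {0: 76, 1: 139, 2: 166, 3: 63, 4: 19, 5: 5, 6: 3, 8: 1}
n = 1500                                   # full SRS sample size

n1 = sum(freq_domain1.values())            # respondents in domain 1
sum_u = sum(kids * count for kids, count in freq_domain1.items())

# Way 1: whole data set, ratio of the means of u and x.
u_bar = sum_u / n                          # domain-2 owners contribute u_i = 0
x_bar = n1 / n                             # x_i = 1 only inside domain 1
b_hat = u_bar / x_bar

# Way 2: domain 1 data only, the ordinary sample mean.
y_bar_1 = sum_u / n1

print(n1, sum_u, round(u_bar, 6), round(x_bar, 6), round(b_hat, 6), round(y_bar_1, 6))
```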

Estimator for variance of $\bar{y}_d$
$$\bar{y}_d = \hat{B} = \frac{\bar{u}}{\bar{x}}$$
$$\hat{V}[\bar{y}_d] = \left(1 - \frac{n}{N}\right) \frac{1}{n} \left(\frac{N}{N_d}\right)^2 \frac{n_d - 1}{n - 1} \, s_{yd}^2$$
where
$$s_{yd}^2 = \frac{1}{n_d - 1} \sum_{i \in S_d} (y_i - \hat{B} x_i)^2$$

Boat example – 8
$$\hat{V}[\bar{y}_d] = \left(1 - \frac{n}{N}\right) \frac{1}{n} \left(\frac{N}{N_d}\right)^2 \frac{n_d - 1}{n - 1} \, s_{yd}^2$$
  $1 - \frac{n}{N} = 1 - \frac{1{,}500}{400{,}000} \approx 1$, so can ignore FPC
  $\frac{N}{N_1} = \frac{400{,}000}{?}$ -- estimate with $\frac{n}{n_1} = \frac{1500}{472} = 3.177966$
  $\frac{n_1 - 1}{n - 1} = \frac{472 - 1}{1500 - 1} \approx \frac{n_1}{n}$
  $s_{y1}^2 = \frac{1}{n_1 - 1} \sum_{i \in S_1} (y_i - \hat{B} x_i)^2 = 94.111078$

Boat example – 9
$$\hat{V}[\bar{y}_1] = \left(1 - \frac{n}{N}\right) \frac{1}{n} \left(\frac{N}{N_1}\right)^2 \frac{n_1 - 1}{n - 1} \, s_{y1}^2 \approx \frac{1}{n} \left(\frac{n}{n_1}\right)^2 \frac{n_1}{n} \, s_{y1}^2 = \frac{s_{y1}^2}{n_1} = \frac{94.111078}{472} = 0.199388$$
$\bar{y}_1 = 1.667373 \approx 1.67$ children per large boat owner
$SE[\bar{y}_1] = 0.4465287 \approx 0.45$

Approximation for estimator of variance of $\bar{y}_d$
$$\hat{V}[\bar{y}_d] \approx \left(1 - \frac{n}{N}\right) \frac{s_{yd}^2}{n_d}$$
assuming $\frac{N}{N_d} \approx \frac{n}{n_d}$ and $\frac{n_d - 1}{n - 1} \approx \frac{n_d}{n}$, where
$$s_{yd}^2 = \frac{1}{n_d - 1} \sum_{i \in S_d} (y_i - \hat{B} x_i)^2$$
Uses domain d data only

Estimated variance of $\hat{B}$
Estimator for $V[\hat{B}]$
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2}$$
where
$$s_e^2 = \frac{1}{n-1} \sum_{i=1}^{n} e_i^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \hat{B} x_i)^2, \qquad e_i = y_i - \hat{B} x_i$$
Domain variance estimator is directly related

Relationship to estimating a ratio
With $\bar{y}_{U_d} = B = \frac{\bar{u}_U}{\bar{x}_U}$ and $\bar{y}_d = \hat{B} = \frac{\bar{u}}{\bar{x}}$
Population mean of X
$$\bar{x}_U = \frac{N_d}{N}$$
Residual
$$e_i = u_i - \hat{B} x_i = \begin{cases} y_i - \hat{B} x_i & \text{if } i \in S_d \\ 0 - \hat{B}(0) = 0 & \text{if } i \notin S_d \end{cases}$$

Relationship to estimating a ratio – 2
With $\bar{y}_{U_d} = B = \frac{\bar{u}_U}{\bar{x}_U}$ and $\bar{y}_d = \hat{B} = \frac{\bar{u}}{\bar{x}}$
Residual variance
$$s_e^2 = \frac{1}{n-1} \sum_{i \in S} (u_i - \hat{B} x_i)^2 = \frac{1}{n-1} \left[\sum_{i \in S_d} (y_i - \hat{B} x_i)^2 + \sum_{i \notin S_d} 0^2\right]$$
$$= \frac{n_d - 1}{n - 1} \cdot \frac{1}{n_d - 1} \sum_{i \in S_d} (y_i - \hat{B} x_i)^2 = \frac{n_d - 1}{n - 1} \, s_{yd}^2$$

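Estimator for variance of $\bar{y}_d$
Starting from
$$\hat{V}[\hat{B}] = \left(1 - \frac{n}{N}\right) \frac{1}{n \bar{x}_U^2} \, s_e^2$$
with $\bar{x}_U = N_d / N$ and $s_e^2 = \frac{n_d - 1}{n - 1} s_{yd}^2$:
$$\hat{V}[\bar{y}_d] = \left(1 - \frac{n}{N}\right) \frac{1}{n} \left(\frac{N}{N_d}\right)^2 \frac{n_d - 1}{n - 1} \, s_{yd}^2$$
where
$$s_{yd}^2 = \frac{1}{n_d - 1} \sum_{i \in S_d} (y_i - \hat{B} x_i)^2$$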

Estimating a population domain total
If we know the domain sizes, $N_d$
$$t_d = N_d \, \bar{y}_{U_d}$$
$$\hat{t}_{yd} = N_d \, \bar{y}_d \quad (\text{if } N_d \text{ known})$$
$$\hat{V}[\hat{t}_{yd}] = N_d^2 \, \hat{V}[\bar{y}_d]$$

Estimating a population domain total – 2
If we do NOT know the domain sizes
$$t_d = N_d \, \bar{y}_{U_d}$$
$$\hat{t}_{yd} = N \bar{u} \quad (\text{if } N_d \text{ unknown})$$
$$\hat{V}[\hat{t}_{yd}] = N^2 \, \hat{V}[\bar{u}] = N^2 \left(1 - \frac{n}{N}\right) \frac{s_u^2}{n}$$
where
$$s_u^2 = \frac{1}{n-1} \sum_{i=1}^{n} (u_i - \bar{u})^2$$
Standard SRS estimator using u as the variable

Boat example – 10
Do not know the domain size, $N_1$
$$\hat{t}_{y1} = N \bar{u} = 400{,}000 \, (0.524667) = 209{,}867 \approx 210{,}000 \text{ children}$$
$$\hat{V}[\hat{t}_{y1}] = N^2 \, \hat{V}[\bar{u}] = N^2 \left(1 - \frac{n}{N}\right) \frac{s_u^2}{n} \approx 400{,}000^2 \, (1) \, \frac{1.0394178}{1500} = 110{,}871{,}232$$
$$SE(\hat{t}_{y1}) = \sqrt{\hat{V}[\hat{t}_{y1}]} = 10{,}530 \approx 10{,}000$$
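The domain total when $N_1$ is unknown is the ordinary SRS total of the variable u, so it can be recomputed from the combined frequency table. A sketch that, like the slide, treats the FPC as 1; variable names are illustrative.

```python
import math

# Nonzero u_i values come from domain 1; all 1028 domain-2 owners have u_i = 0.
freq_domain1 = {0: 76, 1: 139, 2: 166, 3: 63, 4: 19, 5: 5, 6: 3, 8: 1}
n, N = 1500, 400_000

sum_u = sum(k * c for k, c in freq_domain1.items())
sum_u2 = sum(k * k * c for k, c in freq_domain1.items())

u_bar = sum_u / n
s_u2 = (sum_u2 - n * u_bar**2) / (n - 1)   # sample variance of u over all n units

t_hat = N * u_bar                          # estimated total children, domain 1
v_t_hat = N**2 * s_u2 / n                  # FPC (1 - n/N) taken as 1, as on the slide
se_t_hat = math.sqrt(v_t_hat)

print(f"s_u^2 = {s_u2:.7f}, t_hat = {t_hat:,.0f}, SE = {se_t_hat:,.0f}")
```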

Comparing 2 domain means
Suppose we want to test the hypothesis that two domain means are equal
$$H_0: \bar{y}_{U_1} = \bar{y}_{U_2}$$
$$H_1: \bar{y}_{U_1} \ne \bar{y}_{U_2}$$
Construct a z-test with Type 1 error rate $\alpha$ (for falsely rejecting the null hypothesis)
Test statistic:
$$z = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\hat{V}(\bar{y}_1) + \hat{V}(\bar{y}_2)}}$$
Critical value: $z_{\alpha/2}$
Reject $H_0$ if $|z| > z_{\alpha/2}$

Boat example – 10
Large boat owners (d = 1)
  $\bar{y}_1$ = 1.667373 children per large boat owner
  $SE[\bar{y}_1]$ = 0.4465287
Other boat owners (d = 2)
  $\bar{y}_2$ = 2.501059 children per other boat owner
  $SE[\bar{y}_2]$ = 0.669793

Boat example – 11
Test whether domain means are equal at $\alpha$ = 0.05
Calculate z-statistic
$$z = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\hat{V}(\bar{y}_1) + \hat{V}(\bar{y}_2)}} = \frac{1.667373 - 2.501059}{\sqrt{0.446529^2 + 0.669793^2}} = \frac{-0.833686}{0.804991} = -1.04$$
Critical value $z_{\alpha/2} = z_{0.025} = 1.96$
Apply rejection rule
  $|z| = |-1.04| = 1.04 < 1.96 = z_{0.025}$
  Fail to reject $H_0$
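The z-test above takes only a few lines to reproduce. A sketch using the domain means and standard errors quoted on the slides; `statistics.NormalDist` from the Python standard library supplies the critical value, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Domain means and standard errors quoted on the slides.
y1, se1 = 1.667373, 0.446529   # large boat owners (d = 1)
y2, se2 = 2.501059, 0.669793   # other boat owners (d = 2)
alpha = 0.05

# Test statistic: difference over the root of the summed variance estimates.
z = (y1 - y2) / math.sqrt(se1**2 + se2**2)

# Two-sided critical value z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```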

Overview
Population parameters
  Mean
  Total
  Proportion (w/ fixed denominator)
  Ratio
    Includes proportion w/ random denominator
  Domain mean
  Domain total

Overview – 2
Estimation strategies
  No auxiliary information
  Auxiliary information X, no intercept
    Y and X positively correlated
    Linear relationship passes through origin
  Auxiliary information X, intercept
    Y and X positively correlated
    Linear relationship does not pass through origin

Overview – 3
Make a table of population parameters (rows) by estimation strategy (columns)
In each cell, write down
  Estimator for population parameter
  Estimator for variance of estimated parameter
  Residual $e_i$
Notes
  Some cells will be blank
  Look for relationship between mean and total, and mean and proportion
  Look at how the variance formulas $\hat{V}(\hat{\theta})$ for many of the estimators are essentially the same form