Eurostat Weighting and Estimation

Eurostat Weighting and Estimation. Presented by Loredana Di Consiglio Istituto Nazionale di Statistica, ISTAT


Eurostat

Weighting and Estimation


Presented by

• Loredana Di Consiglio

• Istituto Nazionale di Statistica, ISTAT


Outline

• Weighting and estimation in the Handbook

– Weighting, use of auxiliary variables and calibration estimators

– Small area estimation

– Preliminary estimation

• Choice of estimation method


Weighting

• Principle of weighting: each sample unit represents a number of population units.

• Basic weights: the design weights

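$$\pi_k = \Pr(k \in s) = \sum_{s \ni k} p(s) = E_p\big[\mathbb{1}(k \in s)\big]$$

• Horvitz-Thompson estimator

$$\hat{Y}_{HT} = \sum_{k \in s} d_k y_k = \sum_{k \in s} \frac{y_k}{\pi_k} = \sum_{k \in U} \frac{\mathbb{1}(k \in s)}{\pi_k}\, y_k$$

• Non-linear estimation: plug-in (or substitution) principle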
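The Horvitz-Thompson estimator weights each sampled value by the inverse of its inclusion probability. A minimal Python sketch, assuming the inclusion probabilities of the sampled units are known; the function name and the three-unit sample are hypothetical illustrations, not material from the presentation.

```python
def horvitz_thompson_total(y, pi):
    """Estimate a population total from sample values and inclusion probabilities."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(p <= 0 or p > 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each unit is weighted by its design weight d_k = 1 / pi_k.
    return sum(yk / pk for yk, pk in zip(y, pi))


# Hypothetical sample: observed values and inclusion probabilities.
y = [10.0, 20.0, 30.0]
pi = [0.5, 0.25, 0.5]
design_weights = [1.0 / p for p in pi]   # [2.0, 4.0, 2.0]
ht_total = horvitz_thompson_total(y, pi)
print(ht_total)  # 160.0
```

Each design weight can be read as the number of population units that the sampled unit represents.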


Weighting

• The principle of weighting is also applied to account for unit non-response.

• Design weights can also be adjusted for non-response, in order to reduce the possible bias of the resulting estimates.

• For example, the sample can be partitioned into sub-groups of units where the response rates are assumed to be constant, and where it can be assumed that non-respondents behave similarly to respondents.

• The underlying assumption is that non-response depends on auxiliary variables defining a partition of the population but, conditionally on these variables, is independent of the target variable.
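The sub-group adjustment described above can be sketched as follows. This is only an illustration under stated assumptions: the response rate in each group is estimated as a design-weighted rate (one common choice, not necessarily the one intended in the presentation), and the function name, group labels and data are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjusted_weights(design_weights, groups, responded):
    """Inflate respondents' weights by the inverse weighted response rate of their group."""
    sampled = defaultdict(float)     # sum of design weights of all sampled units per group
    respondents = defaultdict(float)  # sum of design weights of respondents per group
    for d, g, r in zip(design_weights, groups, responded):
        sampled[g] += d
        if r:
            respondents[g] += d
    for g in sampled:
        if respondents[g] == 0.0:
            raise ValueError(f"group {g!r} has no respondents; merge it with another group")
    # Non-respondents get weight 0; respondents carry the weight of their whole group.
    return [d * sampled[g] / respondents[g] if r else 0.0
            for d, g, r in zip(design_weights, groups, responded)]


# Hypothetical sample of four units in two response-homogeneity groups.
d = [10.0, 10.0, 10.0, 10.0]
groups = ["a", "a", "b", "b"]
responded = [True, False, True, True]
adjusted = nonresponse_adjusted_weights(d, groups, responded)
print(adjusted)  # [20.0, 0.0, 10.0, 10.0]
```

The adjusted weights of the respondents add up to the total of the original design weights within each group.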


Use of Auxiliary information

• When auxiliary variables are available, they can be used to reduce bias and to reduce variance (however, sometimes subject to external bounds)

• Ratio estimator, auxiliary information: the total of one numerical variable

$$\hat{Y}_{rat} = X \frac{\hat{Y}}{\hat{X}} = \sum_{k \in s} w_k y_k \quad \text{where} \quad w_k = d_k \frac{X}{\hat{X}}$$

• If applied to the X variable, one gets a perfect estimate

$$\hat{X}_{rat} = X$$
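The ratio estimator rescales every design weight by the same factor, the known total of X divided by its Horvitz-Thompson estimate. A small Python sketch with hypothetical data and function names illustrates this, and checks that applying the weights to x itself reproduces the known total.

```python
def ratio_weights(design_weights, x, x_total):
    """Rescale design weights so that the weighted sample total of x equals x_total."""
    x_hat = sum(dk * xk for dk, xk in zip(design_weights, x))  # HT estimate of X
    if x_hat == 0:
        raise ValueError("the estimated total of x is zero; the ratio is undefined")
    return [dk * x_total / x_hat for dk in design_weights]


# Hypothetical sample: design weights, auxiliary variable, target variable.
d = [2.0, 2.0, 2.0]
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
X_known = 15.0  # assumed known population total of x

w = ratio_weights(d, x, X_known)
y_rat = sum(wk * yk for wk, yk in zip(w, y))
x_rat = sum(wk * xk for wk, xk in zip(w, x))
print(y_rat, x_rat)  # 32.5 15.0
```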


Use of Auxiliary information

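• Poststratification, auxiliary information: the totals of a vector of post-stratum indicators

$$X = (N_1, \dots, N_h, \dots, N_H)', \qquad N_h = \sum_{k \in U} \mathbb{1}(k \in U_h)$$

• The estimator is

$$\hat{Y}_{post} = \sum_{h} N_h \hat{\bar{Y}}_h \quad \text{where} \quad \hat{\bar{Y}}_h = \frac{\sum_{k \in s} \mathbb{1}(k \in U_h)\, d_k y_k}{\sum_{k \in s} \mathbb{1}(k \in U_h)\, d_k} = \frac{\hat{Y}_h}{\hat{N}_h}$$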
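Post-stratification multiplies each known post-stratum size by the design-weighted mean of the sampled units falling in that post-stratum. A minimal Python sketch, with hypothetical post-stratum labels, sizes and data:

```python
from collections import defaultdict


def poststratified_total(y, design_weights, strata, stratum_sizes):
    """Sum over post-strata of N_h times the weighted sample mean in post-stratum h."""
    y_hat = defaultdict(float)  # estimated total of y per post-stratum
    n_hat = defaultdict(float)  # estimated size per post-stratum
    for yk, dk, h in zip(y, design_weights, strata):
        y_hat[h] += dk * yk
        n_hat[h] += dk
    total = 0.0
    for h, size in stratum_sizes.items():
        if n_hat[h] == 0.0:
            raise ValueError(f"post-stratum {h!r} has no sampled units")
        total += size * y_hat[h] / n_hat[h]
    return total


# Hypothetical sample with two post-strata of known sizes 12 and 8.
y = [1.0, 3.0, 10.0, 20.0]
d = [5.0, 5.0, 5.0, 5.0]
strata = ["m", "m", "f", "f"]
N = {"m": 12.0, "f": 8.0}
y_post = poststratified_total(y, d, strata, N)
print(y_post)  # 144.0
```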


Use of Auxiliary information

• Raking Ratio

– Auxiliary information: known totals of different auxiliary variables (not cross-classified)

$$N_{i\cdot}, \; i = 1 \text{ to } I; \qquad N_{\cdot j}, \; j = 1 \text{ to } J$$

The Raking-Ratio method consists in performing post-stratification on each variable in turn and iterating until the weights converge.
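The iteration can be sketched in Python for two categorical variables. This is an illustrative implementation under assumptions (two margins only, consistent marginal totals, every category observed in the sample); the function names, tolerance and data are hypothetical.

```python
from collections import defaultdict


def margin_totals(weights, labels):
    """Weighted sample total for each category of one variable."""
    totals = defaultdict(float)
    for w, lab in zip(weights, labels):
        totals[lab] += w
    return totals


def rake(design_weights, rows, cols, row_totals, col_totals, tol=1e-10, max_iter=200):
    """Alternate post-stratification on rows and columns until the margins match."""
    w = list(design_weights)
    for _ in range(max_iter):
        for labels, known in ((rows, row_totals), (cols, col_totals)):
            est = margin_totals(w, labels)
            w = [wk * known[lab] / est[lab] for wk, lab in zip(w, labels)]
        # Columns match after the last step, so only the row margins need checking.
        est_rows = margin_totals(w, rows)
        if max(abs(est_rows[r] - row_totals[r]) for r in row_totals) < tol:
            break
    return w


# Hypothetical sample: one unit per cell of a 2 x 2 classification.
rows = ["a", "a", "b", "b"]
cols = ["x", "y", "x", "y"]
d = [1.0, 2.0, 3.0, 4.0]
row_totals = {"a": 60.0, "b": 40.0}
col_totals = {"x": 50.0, "y": 50.0}
raked = rake(d, rows, cols, row_totals, col_totals)
```

Only the marginal totals are reproduced; the cross-classified cell counts are not required to be known.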


Use of Auxiliary information

• GREG

$$\hat{Y}_{GREG} = \sum_{i \in s} d_i y_i + \Big(X - \sum_{i \in s} d_i x_i\Big)' \hat{\beta}$$

• GREG is «assisted» by a linear relationship between X and Y.
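A small Python sketch of the GREG estimator, assuming a working model with an intercept and one auxiliary variable, so that the known totals are the population size N and the total X. The regression coefficients are fitted by design-weighted least squares; the function name and data are hypothetical.

```python
def greg_total(d, x, y, pop_size, x_total):
    """GREG estimate of the total of y with auxiliary vector (1, x)."""
    s = sum(d)
    sx = sum(dk * xk for dk, xk in zip(d, x))
    sxx = sum(dk * xk * xk for dk, xk in zip(d, x))
    sy = sum(dk * yk for dk, yk in zip(d, y))
    sxy = sum(dk * xk * yk for dk, xk, yk in zip(d, x, y))
    det = s * sxx - sx * sx
    if det == 0:
        raise ValueError("x has no variation in the sample; slope is not identified")
    slope = (s * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / s
    # HT estimate of Y plus the regression correction for the auxiliary totals.
    return sy + (pop_size - s) * intercept + (x_total - sx) * slope


# Hypothetical sample and known totals N = 50, X = 130.
d = [10.0, 10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
y = [4.0, 9.0, 10.0, 15.0]
y_greg = greg_total(d, x, y, pop_size=50.0, x_total=130.0)
print(round(y_greg, 6))  # 492.0
```

When y is an exact linear function of x in the sample, the estimate reproduces the corresponding linear function of the known totals, which is why the method is said to be assisted by the linear relationship.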


Calibration

• The estimate of the total Y is obtained by means of a procedure which

– corrects bias due to non-response

– takes into account the knowledge of auxiliary variables, requiring that their estimated totals equal their known totals

$$\hat{Y}_{CAL} = \sum_{k \in s} d_k g_k y_k = \sum_{k \in s} w_k y_k$$


Calibration

• The weights $w_k$ are calculated as follows:

$$w_k = d_k g_k$$

$d_k$ is the initial weight, equal to the inverse of the inclusion probability $\pi_k$

$g_k$ is the final correction factor, which makes the sample estimates equal to their known totals; it is calculated by means of the following equations


Calibration

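• Final weights are chosen to solve

$$\min_{w} \; E_p\Big[\sum_{k \in s} G(w_k, d_k)\Big] \quad \text{subject to} \quad \sum_{k \in s} w_k x_k = t_x$$

• where G is an appropriate distance function and the constraint forces the weighted sample totals of the auxiliary variables to equal their known totals $t_x$

• Possibly subject to bounds for w/d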


Calibration

• Distance function G:

– Linear: $\sum_{k \in s} \frac{(w_k - d_k)^2}{2 d_k}$

– Raking ratio: $(w/d) \log(w/d) - w/d + 1$

– Truncated linear: $\frac{1}{2}(w/d - 1)^2$, with $w/d \in [L, U]$


Calibration

• Calibration estimator equals GREG when choosing the linear (Euclidean) distance function

$$\hat{Y}_{CAL} = \hat{Y}_{GREG} = \hat{Y} + (X - \hat{X})' \hat{\beta}$$
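With the linear distance the calibrated weights have a closed form, $w_k = d_k (1 + x_k' \lambda)$, where $\lambda$ solves the calibration constraints. The Python sketch below assumes an auxiliary vector made of a constant and one variable, so the constraints are the population size and the total of x; function names and data are hypothetical. It only illustrates the equality with GREG and does not reflect the API of any of the calibration software packages.

```python
def linear_calibration_weights(d, x, pop_size, x_total):
    """Calibrated weights under the linear distance for auxiliary vector (1, x)."""
    s = sum(d)
    sx = sum(dk * xk for dk, xk in zip(d, x))
    sxx = sum(dk * xk * xk for dk, xk in zip(d, x))
    det = s * sxx - sx * sx
    if det == 0:
        raise ValueError("calibration system is singular")
    r0 = pop_size - s   # gap on the population size constraint
    r1 = x_total - sx   # gap on the total of x constraint
    # Solve the 2 x 2 system [[s, sx], [sx, sxx]] * lambda = (r0, r1).
    lam0 = (sxx * r0 - sx * r1) / det
    lam1 = (s * r1 - sx * r0) / det
    return [dk * (1.0 + lam0 + lam1 * xk) for dk, xk in zip(d, x)]


# Hypothetical sample and known totals N = 50, X = 130.
d = [10.0, 10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
y = [4.0, 9.0, 10.0, 15.0]
w = linear_calibration_weights(d, x, pop_size=50.0, x_total=130.0)

n_cal = sum(w)                                   # reproduces N
x_cal = sum(wk * xk for wk, xk in zip(w, x))     # reproduces X
y_cal = sum(wk * yk for wk, yk in zip(w, y))     # equals the GREG estimate
```

For these data the GREG estimate computed from the weighted least squares fit (intercept 1, slope 3.4) is 380 + 10 × 1 + 30 × 3.4 = 492, which is the value of `y_cal` up to rounding. The linear distance does not prevent negative weights, which is one reason for the bounded alternatives.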


Calibration

• All calibration estimators are asymptotically equal to GREG

• They are approximately unbiased and consistent

• Their sampling variance converges to GREG variance


Calibration

• Software

– CLAN (Statistics Sweden)

– BASCULA (The Netherlands)

– GES (StatCan)

• ReGenesees (ISTAT), R package

– A second R package, called ReGenesees.GUI, implements the presentation layer of the system: less experienced R users can take advantage of the user-friendly graphical interface.

• Downloadable from Joinup: https://joinup.ec.europa.eu/software/regenesees/release/all


Weighting, use of auxiliary variables and calibration

• Planned modules in HB

– Main theme module

– Calibration estimators

– Already available: GREG http://www.cros-portal.eu/content/generalised-regression-estimator


Small area estimation

• Most national surveys are planned to produce accurate estimates at national level.

• Analyses at a finer level of partition may not have the desired precision because of small, or even zero, sample sizes.

• A small area is a domain where the sample size is not sufficient to reach a pre-specified level of precision.


Small area estimation

• Indirect estimators make use of what has been observed in the other domains (or at other times)

– Traditional estimators: synthetic estimators, composite estimators

– Model based estimators: area level models, unit level models

• With this class of estimators extra-information is gained in the estimation process by making use of observations outside the domain of interest by means of implicit (synthetic estimators) or explicit (model based estimators) use of models.


Small area estimation

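• Use information at local level with a common beta: modified direct estimator

$$\hat{\bar{Y}}_d^{GRE} = \frac{1}{\hat{N}_d} \sum_{i \in s_d} w_{id} y_{id} + \Big(\bar{X}_d - \frac{1}{\hat{N}_d} \sum_{i \in s_d} w_{id} x_{id}\Big)^T \hat{\beta}$$

$$\hat{\bar{Y}}_d^{GRE} = \frac{1}{\hat{N}_d} \sum_{i \in s_d} w_{id} \big(y_{id} - x_{id}^T \hat{\beta}\big) + \bar{X}_d^T \hat{\beta}$$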


Small area estimation

• Synthetic estimators: in the simple case it is assumed that small areas have the same mean as larger domains (at least within classes)

$$\hat{\bar{Y}}_d^{SIN} = \bar{X}_d^T \hat{\beta}$$

• Synthetic estimators can be based on different models (relationships between the variable of interest and the auxiliary variables): linear model; linear mixed model at unit level; linear mixed model at area level.


Small area estimation

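• Model based estimators

– Based on area level model:

$$\hat{\bar{Y}}_d^{EBLUP1} = \hat{\gamma}_d\, \hat{\bar{Y}}_d^{HT} + (1 - \hat{\gamma}_d)\, \bar{X}_d^T \hat{\beta}, \qquad \hat{\gamma}_d = \frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + \hat{\psi}_d}$$

– Based on unit level model:

$$\hat{\bar{Y}}_d^{EBLUP2} = \hat{\gamma}_d \big[\bar{y}_d + (\bar{X}_d - \bar{x}_d)^T \hat{\beta}\big] + (1 - \hat{\gamma}_d)\, \bar{X}_d^T \hat{\beta}, \qquad \hat{\gamma}_d = \frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + \hat{\sigma}_e^2 / n_d}$$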
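In the area level case the model based estimator is a weighted average of the direct estimate and the regression-synthetic estimate, with a shrinkage weight that grows with the between-area variance and falls with the sampling variance of the direct estimate. A minimal Python sketch, assuming the variance components and the synthetic estimate have already been estimated elsewhere; names and numbers are hypothetical.

```python
def area_level_composite(direct, synthetic, sigma2_u, psi_d):
    """Shrink a direct domain estimate towards the synthetic one.

    sigma2_u: estimated between-area (model) variance, assumed given.
    psi_d:    estimated sampling variance of the direct estimate, assumed given.
    """
    if sigma2_u < 0 or psi_d <= 0:
        raise ValueError("sigma2_u must be >= 0 and psi_d must be > 0")
    gamma = sigma2_u / (sigma2_u + psi_d)
    return gamma * direct + (1.0 - gamma) * synthetic, gamma


# Hypothetical domain: noisy direct estimate, stable synthetic estimate.
estimate, gamma = area_level_composite(direct=100.0, synthetic=80.0,
                                       sigma2_u=4.0, psi_d=12.0)
print(estimate, gamma)  # 85.0 0.25
```

A domain with a large sample (small sampling variance) gets a weight close to 1 and stays near its direct estimate; a domain with a tiny sample is pulled towards the synthetic value.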


Preliminary estimation

• The methods used to treat unit non-response may also be applied to preliminary estimation.

• In this case late responses are treated as non-response; however, to avoid biased estimates, the self-selection mechanism of quick respondents should not be assumed to be random.


Preliminary estimation

• Rao et al. (1989) proposed composite estimators that may represent an improvement of the standard estimator.

• The basic composite estimator is obtained as a weighted average of the preliminary estimate at time t and the final estimate at time t-1 adjusted for the difference between the preliminary estimates at times t and t-1.

$$\tilde{Y}_t = \phi\, Y_t^{p} + (1 - \phi)\big(Y_{t-1} + Y_t^{p} - Y_{t-1}^{p}\big), \qquad \phi \in [0, 1]$$

• The weight $\phi$ is chosen on the basis of variances and covariances
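The basic composite estimator described in words above can be sketched in Python. The weight is taken here as a given input rather than derived from variances and covariances; the function name, argument names and figures are hypothetical.

```python
def composite_preliminary(prelim_t, final_prev, prelim_prev, weight):
    """Weighted average of the preliminary estimate and the adjusted previous final estimate."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # Previous final estimate moved forward by the change in the preliminary estimates.
    adjusted_prev = final_prev + (prelim_t - prelim_prev)
    return weight * prelim_t + (1.0 - weight) * adjusted_prev


# Hypothetical figures: the previous preliminary estimate was revised upwards by 2.
composite = composite_preliminary(prelim_t=105.0, final_prev=102.0,
                                  prelim_prev=100.0, weight=0.6)
print(round(composite, 6))  # 105.8
```

With weight 1 the estimator reduces to the preliminary estimate; with weight 0 it carries the whole previous revision forward.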


Preliminary estimation

• In order to reduce the revision error of the preliminary estimates, model based estimators can be considered. Rao, Srinath and Quenneville (1989) adopt a time series approach to preliminary estimation.

• Let $Y_t^P$, $Y_t$ and $Y_t^*$ be, respectively, the preliminary estimate at time t, the final estimate and the measurement error in the preliminary estimate at time t.


Preliminary estimation

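• Furthermore, suppose:

$$Y_t = \rho\, Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$$

$$Y_t^* = \rho_0\, Y_{t-1}^* + \eta_t, \quad \eta_t \sim N(0, \sigma_0^2)$$

• The estimator results:

$$\hat{Y}_t = (1 - \lambda)\, \rho\, Y_{t-1} + \lambda \big(Y_t^P - \rho_0\, Y_{t-1}^*\big), \qquad \lambda = \frac{\sigma^2}{\sigma^2 + \sigma_0^2}$$

• Or, when auxiliary variables are available:

$$Y_t = \rho\, Y_{t-1} + \sum_{k} \beta_k X_{kt} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$$

• Or, taking seasonality into account:

$$Y_t = \rho_1 Y_{t-1} + \rho_2 Y_{t-12} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)$$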


Choice of estimation methods

• Quality indicators:

– Accuracy: degree of closeness of estimates to the true values (bias, precision)

– Timeliness: the length of time between the event or phenomenon the statistics describe and their availability

– Revision errors

– Coherence and comparability: coherence with other statistics

Ref. ESS Handbook for Quality Reports, Methodologies and Working Papers, 2009


Choice of estimation methods

• Close relationship with sampling design (e.g. weights)

– Choice of sampling strategy

• Non-probabilistic sample design? E.g. cut-off sampling: model based estimators

– The model simply assumes that the units cut off behave similarly to those in the sampled portion.

– Model assumptions should be analysed as far as possible.


Thank you for your attention