ESTIMATION IN THE PRESENCE OF TAX DATA IN BUSINESS SURVEYS David Haziza, Gordon Kuromi and Joana Bérubé Université de Montréal & Statistics Canada ICESIII June 20, 2007


ESTIMATION IN THE PRESENCE OF TAX DATA

IN BUSINESS SURVEYS

David Haziza, Gordon Kuromi and Joana Bérubé

Université de Montréal & Statistics Canada

ICESIII

June 20, 2007


OUTLINE

• Introduction

• Current sampling design

• Current point estimators

• Alternative sampling design

• Alternative estimators

• Domain estimation

• When the tax variable is missing


TAX DATA PROGRAM

• Goal: To increase the use of tax data in business surveys in order to
  - reduce respondent burden
  - reduce costs
  - potentially improve the quality of point estimators


TYPES OF VARIABLES

• We distinguish between 3 types of variables:
  - Financial survey variables (total revenue, total expenditure, etc.)
  - Financial tax variables (total revenue, total expenditure, etc.)
  - Non-financial variables

• There is a direct link between the financial survey variables and the financial tax variables

• No direct link between non-financial variables and tax variables


TAX DATA

• 3 types of tax data:
  - T1 data: unincorporated businesses (Unified Enterprise Survey)
  - T2 data: incorporated businesses (Unified Enterprise Survey)
  - GST data: both incorporated and unincorporated businesses (monthly surveys)


CURRENT SAMPLING DESIGN

• Stratification by Province, NAICS and Size

• 3 types of strata:
  - Take-all strata (typically complex units)
  - Take-some strata (simple and complex units)
  - Take-none strata (simple units)

• Use of tax data is limited to take-some strata (for simple units only) and take-none strata


CURRENT SAMPLING DESIGN

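[Figure: current design within a stratum (STRATUM = PROVINCE x NAICS). A sample $s_h$ (size $n_h$) is drawn from the stratum population $U_h$ (size $N_h$) and split into eligible units, $s_{eh}$ (size $n_{eh}$), and noneligible units, $s_{neh}$ (size $n_{neh}$). The diagram also carries a 50% label.]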


CURRENT SAMPLING DESIGN

• Advantages:
  - The current design fits the imputation and estimation systems

• Disadvantages:
  - It is a two-phase sampling design
  - The sample sizes for collection in both the eligible and noneligible strata are random variables, which may increase the variance of the estimators and add uncertainty to the collection costs
  - The use of tax data is limited to the first-phase sample


FINANCIAL VARIABLES

• Survey variables: y, available only for the units in $s_h$

• Tax variables: x, available for all the units in $U_h$

• For many financial variables, there is a corresponding tax variable

• We assume that both types of variables are known without error (no measurement error, no nonresponse)

• These two assumptions are not satisfied in practice!


CURRENT TAX REPLACEMENT METHODS

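• Model describing the relationship between x and y:

  $$y_i = f(x_i) + \varepsilon_i, \qquad E(\varepsilon_i) = 0, \quad V(\varepsilon_i) = \sigma^2 c_i$$

• Special cases:
  - Direct tax replacement: $f(x_i) = x_i$, $c_i = 1$, so that $y_i = x_i + \varepsilon_i$
  - Ratio type replacement: $f(x_i) = \beta x_i$, $c_i = x_i$, so that $y_i = \beta x_i + \varepsilon_i$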


PREDICTED VALUES

• $\hat{y}_i$: predicted value for $y_i$
  - Direct tax replacement: $\hat{y}_i = x_i$ (used in UES)
  - Ratio type replacement: $\hat{y}_i = \hat{B} x_i$ (used in monthly surveys)

• The estimate $\hat{B}$ of the ratio-model slope is obtained from the units in s.
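The two replacement rules can be illustrated with a minimal Python sketch. The slide does not say how the ratio-model slope is estimated, so the weighted ratio of sample totals used here is an assumption, as are the function name `predicted_values` and the toy data.

```python
import numpy as np

def predicted_values(x_new, x_s, y_s, w_s, method="ratio"):
    """Predict y from the tax value x for units without a survey value.

    method="direct": y_hat_i = x_i
    method="ratio":  y_hat_i = B_hat * x_i, with B_hat computed from the units in s
    """
    x_new = np.asarray(x_new, dtype=float)
    if method == "direct":
        return x_new.copy()
    if method == "ratio":
        x_s, y_s, w_s = (np.asarray(a, dtype=float) for a in (x_s, y_s, w_s))
        # Assumed form of B_hat: weighted ratio of the sample totals of y and x.
        b_hat = np.sum(w_s * y_s) / np.sum(w_s * x_s)
        return b_hat * x_new
    raise ValueError("method must be 'direct' or 'ratio'")

# Hypothetical data: three units in s, two units needing a predicted value.
x_s = [100.0, 200.0, 300.0]
y_s = [110.0, 190.0, 330.0]
w_s = [2.0, 2.0, 2.0]
x_new = [150.0, 250.0]

direct = predicted_values(x_new, x_s, y_s, w_s, method="direct")
ratio = predicted_values(x_new, x_s, y_s, w_s, method="ratio")
```

With these numbers the estimated slope is 1260 / 1200 = 1.05, so the ratio predictions are 5% above the tax values while the direct predictions equal them.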


NOT ONLY DIRECT TYPE REPLACEMENT?

• Considerable efforts have been made to standardize the concepts and definitions between the tax variables and the survey variables (Chart of Account compliance for T1 and T2)

• As a result, we expect that the model $y = x$ should be valid. Sometimes it is not, because of:
  - Differences in the reporting of data and other issues (Jocelyn, Mach and Pelletier, 2006)
  - Differences in the reference period (GST data)


CURRENT POINT ESTIMATORS: PREDICTION TYPE

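• In the noneligible portion: Horvitz-Thompson estimator

• In the eligible portion: prediction (or imputed) type estimator
  - $y_i$ is observed for all $i$ in $s$
  - $\hat{y}_i$ is used for the remaining eligible units (those not in $s$)
  - We have

    $$\hat{Y}_{PRED} = \sum_{i \in s} w_i y_i + \sum_{i \notin s} w_i \hat{y}_i,$$

    where the second sum runs over the eligible units outside $s$.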
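As an illustration, the prediction type estimator adds a weighted sum of observed values over s to a weighted sum of predicted values over the remaining eligible units. The sketch below assumes direct tax replacement for the predicted values; the function name and the data are hypothetical.

```python
import numpy as np

def prediction_estimator(w_s, y_s, w_rest, y_hat_rest):
    """Weighted total of observed y over s plus weighted total of predicted y elsewhere."""
    w_s, y_s = np.asarray(w_s, dtype=float), np.asarray(y_s, dtype=float)
    w_rest = np.asarray(w_rest, dtype=float)
    y_hat_rest = np.asarray(y_hat_rest, dtype=float)
    return float(np.sum(w_s * y_s) + np.sum(w_rest * y_hat_rest))

# Hypothetical eligible portion: three units with observed y, two with tax data only.
w_s, y_s = [2.0, 2.0, 2.0], [110.0, 190.0, 330.0]
w_rest, x_rest = [2.0, 2.0], [150.0, 250.0]

# Direct tax replacement (an assumption for this example): y_hat_i = x_i.
y_pred = prediction_estimator(w_s, y_s, w_rest, x_rest)
```

Here the observed part contributes 1260 and the predicted part 800, giving 2060.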


CURRENT POINT ESTIMATORS: PREDICTION TYPE

• Advantages:
  - Similar to imputed estimators in the context of imputation
  - They are simple and fit the current imputation and estimation systems
  - They fit the so-called micro approach for displaying the data

• Disadvantages:
  - They are generally p-biased
  - May be pm-biased if the tax replacement model is incorrectly specified


DISPLAYING THE DATA: MICRO VS. MACRO APPROACH

• We distinguish between two approaches for displaying the data:

(i) Micro approach: consists of reporting the observed y-values as well as the predicted values (similar to an imputed file in the context of item nonresponse)

(ii) Macro approach: consists of reporting the observed y-values along with a calibration weight

• Currently, the micro approach is used


PREDICTION TYPE ESTIMATORS

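• Micro approach

[Table: micro-approach data file. Columns: Unit, Domain, $\tilde{y}$ and weight $w_i$. Units $1, \dots, n$ in $s$ carry the observed values $y_1, \dots, y_n$ and weights $w_1, \dots, w_n$; the remaining units, from $n+1$ onward, carry the predicted values $\hat{y}_i$ and their weights.]

Domain estimators potentially p-biased and pm-biased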


ALTERNATIVE SAMPLING DESIGN

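[Figure: alternative design within a stratum (STRATUM = PROVINCE x NAICS). The stratum population $U_h$ (size $N_h$) is split into eligible units, $U_{eh}$ (size $N_{eh}$), and noneligible units, $U_{neh}$ (size $N_{neh}$), with samples $s_{eh}$ (size $n_{eh}$) and $s_{neh}$ (size $n_{neh}$) respectively.]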


ALTERNATIVE SAMPLING DESIGN

• Advantages:
  - It is a single-phase sampling design, which simplifies the estimation procedures, particularly variance estimation
  - The sample sizes are known prior to sampling
  - Full use of available tax data is now made

• Disadvantages:
  - The estimation systems need to be modified to fit the new procedure


POINT ESTIMATION

• For financial variables, we have 2 options:
  - Tax/survey based framework: we simply use $X_E = \sum_{i \in U_E} x_i$ for the eligible part and a design-consistent estimator for the noneligible part
  - Survey based framework: we want to estimate $Y = \sum_{i \in U} y_i$. Use design-consistent estimators (calibration estimators such as the GREG or the optimal estimator) that make use of all the available tax data (monthly surveys)


GREG TYPE ESTIMATORS

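• The GREG estimator is usually written as

  $$\hat{Y}_G = \sum_{i \in U} \hat{y}_i + \sum_{i \in s} w_i \left( y_i - \hat{y}_i \right)$$

• The GREG estimator fits the macro approach, since it can be written with calibration weights as

  $$\hat{Y}_G = \sum_{i \in s} w_i^{CAL} y_i,$$

  but it can also fit the micro approach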
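A small numerical check can make the two ways of writing the GREG estimator concrete. The sketch below assumes the ratio model with the slope estimated by the weighted ratio of sample totals, in which case the calibration weights are the design weights scaled so that the weighted sample total of x matches the known total of x. The function name and data are hypothetical.

```python
import numpy as np

def greg_ratio(x_U, x_s, y_s, w_s):
    """GREG under an assumed ratio model, in prediction form and in calibration form."""
    x_U, x_s, y_s, w_s = (np.asarray(a, dtype=float) for a in (x_U, x_s, y_s, w_s))
    b_hat = np.sum(w_s * y_s) / np.sum(w_s * x_s)

    # Prediction form: predicted total over U plus weighted sample residuals.
    pred_form = np.sum(b_hat * x_U) + np.sum(w_s * (y_s - b_hat * x_s))

    # Calibration form: weights rescaled to reproduce the known total of x.
    w_cal = w_s * np.sum(x_U) / np.sum(w_s * x_s)
    cal_form = np.sum(w_cal * y_s)
    return float(pred_form), float(cal_form), w_cal

# Hypothetical population of six units; the first three are sampled with weight 2.
x_U = [100.0, 200.0, 300.0, 150.0, 250.0, 400.0]
x_s, y_s, w_s = x_U[:3], [110.0, 190.0, 330.0], [2.0, 2.0, 2.0]

pred_form, cal_form, w_cal = greg_ratio(x_U, x_s, y_s, w_s)
```

Both forms give 1.05 x 1400 = 1470 here, and the calibration weights reproduce the tax total of 1400 when applied to the sampled x-values.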


GREG TYPE ESTIMATORS

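Micro approach

[Table: micro-approach data file for the GREG estimator. Every unit $1, \dots, N$ in $U$ carries a predicted value $\hat{y}_i$. The sampled units $1, \dots, n$ in $s$ also carry a weight $w_i$ and a residual $e_i = y_i - \hat{y}_i$; zeros fill the remaining entries for the non-sampled units. A domain indicator is attached to each unit.]

Domain estimators asymptotically p-unbiased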


DOMAIN ESTIMATION

• Three situations are encountered in practice:

(i) The domain is identical to the model group

(ii) The domain is contained in the model group

(iii) The domain intersects more than one model group


DOMAIN ESTIMATION

• Even if the prediction type estimators are pm-unbiased at the model group level, they could be significantly biased if the model prevailing at the domain level differs from the model prevailing at the model group level

• The GREG type estimators are always asymptotically p-unbiased at the domain level. However, they could be inefficient if the model prevailing at the domain level differs from the model prevailing at the model group level


DOMAIN ESTIMATION: MICRO vs. MACRO

• Macro and micro approaches lead to identical estimators of parameters at the model group level

• At the domain level, both approaches lead to different estimators

• No definite comparison is possible, but we expect that $\hat{Y}_d^{micro}$ will perform better than $\hat{Y}_d^{macro}$ if the domain size is small
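The agreement at the model group level and the divergence at the domain level can be seen in a toy example. This sketch rests on several assumptions: a ratio model fitted at the group level, a micro domain estimator formed as the sum of predicted values over the domain plus the weighted residuals of the sampled domain units, and a macro domain estimator formed as the calibration-weighted sum of y over the sampled domain units. Names and data are hypothetical.

```python
import numpy as np

def domain_estimates(x_U, in_s, y_s, w_s, in_d):
    """Return (micro, macro) GREG-type domain estimates under an assumed ratio model."""
    x_U = np.asarray(x_U, dtype=float)
    in_s = np.asarray(in_s, dtype=bool)   # sample membership, one flag per unit of U
    in_d = np.asarray(in_d, dtype=bool)   # domain membership, one flag per unit of U
    y_s, w_s = np.asarray(y_s, dtype=float), np.asarray(w_s, dtype=float)

    x_s = x_U[in_s]
    d_s = in_d[in_s]                      # domain flags restricted to sampled units
    b_hat = np.sum(w_s * y_s) / np.sum(w_s * x_s)

    # Micro: predicted values over the domain plus weighted residuals in the domain.
    e_s = y_s - b_hat * x_s
    micro = np.sum(b_hat * x_U[in_d]) + np.sum(w_s[d_s] * e_s[d_s])

    # Macro: calibration weights (calibrated at the group level) applied to y.
    w_cal = w_s * np.sum(x_U) / np.sum(w_s * x_s)
    macro = np.sum(w_cal[d_s] * y_s[d_s])
    return float(micro), float(macro)

# Hypothetical model group of six units; the first three are sampled with weight 2.
x_U = [100.0, 200.0, 300.0, 150.0, 250.0, 400.0]
in_s = [True, True, True, False, False, False]
y_s, w_s = [110.0, 190.0, 330.0], [2.0, 2.0, 2.0]

whole_group = [True] * 6
small_domain = [True, False, False, True, False, False]

micro_group, macro_group = domain_estimates(x_U, in_s, y_s, w_s, whole_group)
micro_dom, macro_dom = domain_estimates(x_U, in_s, y_s, w_s, small_domain)
```

At the group level both estimates equal 1470. For the two-unit domain, the micro estimate is 272.5 and the macro estimate is about 256.67, because the macro version relies only on the single sampled unit in the domain.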


WHEN THE TAX VARIABLE IS MISSING

• In practice, the tax variable is subject to nonresponse and it is imputed

• Let z be a new variable defined as: $z = x$ if the tax variable is observed and $z = x^*$ if the tax variable is missing, where $x^*$ is the imputed value

• Inference can be made conditional on z
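The construction of z can be sketched as follows, assuming that a missing tax value is coded as NaN and that an imputed value is available for every unit where it is needed. The function name `build_z` and the data are hypothetical.

```python
import numpy as np

def build_z(x, x_imputed):
    """Return z (x where observed, imputed value where missing) and the missing flags."""
    x = np.asarray(x, dtype=float)
    x_imputed = np.asarray(x_imputed, dtype=float)
    missing = np.isnan(x)                 # assumption: missing tax values are NaN
    z = np.where(missing, x_imputed, x)
    return z, missing

# Hypothetical tax values with one missing entry and its imputed replacement.
x = [100.0, np.nan, 300.0]
x_imputed = [0.0, 180.0, 0.0]            # only the entry for the missing unit is used

z, missing = build_z(x, x_imputed)
```

Keeping the missing flags alongside z makes it possible to tell observed from imputed tax values later, for example when assessing the effect of imputation on the estimators.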


FUTURE WORK

• Find a compromise calibration weight if the macro approach is used

• For non-financial variables, find the best set of auxiliary variables and use it to calibrate


For more information, please contact:

[email protected]

(613) 951-5221