Sébastien CHAMI
sebastien.chami@insee.fr

5 May, 2010

Reengineering French structural business statistics

An extended use of administrative data


An extended use of administrative data, Sébastien Chami, Helsinki-Q2010, 5 May, 2010

Outline

› 1- Presentation of administrative tax data
› 2- Links between the statistical register and tax data
› 3- Tax data micro-editing
› 4- Selective data editing and manual review of tax data


The administrative tax data

› Businesses are taxed on their profits
› To determine this tax, they send the administration an annual return based on their accounts
› The tax administration passes the collected tax data on to Insee, and structural business statistics are derived from it


The diversity of tax forms

› There are several types of tax forms depending on the sector of activity and the size of the company

› The sector of activity determines the type of profits subject to taxation

–Agricultural profits (out of the scope of SBS)
–Industrial and business profits
–Non-commercial profits

› The size determines the system of taxation
–Normal system for big units
–Simplified system for small units
–Extremely simplified system for very small units


A detailed set of data

› Tax returns are very detailed forms: more than 1000 characteristics overall

› Some are common to every form, some are specific
› 250 characteristics of interest have been chosen among these 1000 to meet our statistical purpose

› Note: the return for micro-businesses consists of a simple declaration of turnover, so there is no tax form and no data for them


A highly prescriptive accounting standard

› As explained before, tax returns are drawn from the companies' accounts

› The French accounting standard is very prescriptive
› Consequence: the information provided by the tax form is, with few exceptions, very consistent:
–One characteristic represents the same concept in every return
–The value of this characteristic has been determined with the same accounting method for everyone


Flexible rules for accounting periods

› A major problem for the homogeneity of our statistics is the accounting period on which the tax form is based

› Two constraints:
–At least one accounting period ends each calendar year
–Continuity of accounting periods: neither overlap nor gap
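The two constraints can be checked mechanically. The sketch below is only an illustration: the function name `check_periods` and the representation of a period as a pair of inclusive dates are assumptions, not part of the Insee process.

```python
from datetime import date, timedelta

def check_periods(periods):
    """Return the rule violations found in one company's accounting periods.

    `periods` is a list of (start, end) date pairs, both bounds inclusive
    (hypothetical layout chosen for this example).
    """
    problems = []
    ordered = sorted(periods)
    # Continuity: each period must start the day after the previous one ends.
    for (_, prev_end), (start, _) in zip(ordered, ordered[1:]):
        if start <= prev_end:
            problems.append(f"overlap before {start}")
        elif start != prev_end + timedelta(days=1):
            problems.append(f"gap before {start}")
    # At least one period must end in each calendar year between the
    # first and the last observed period end.
    if ordered:
        end_years = {end.year for _, end in ordered}
        for year in range(ordered[0][1].year, ordered[-1][1].year + 1):
            if year not in end_years:
                problems.append(f"no period ends in {year}")
    return problems
```

A company closing its accounts every 30 June passes both checks, whereas an 18-month period running from January of one year to June of the next leaves a calendar year without any period end.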


From the accounting period to the reference period

› Our statistics are based on the calendar year, so we need a restatement to obtain a homogeneous period for the tax returns

–Choose the accounting period that has the most months in common with the calendar year

–Re-estimate on a 12-month basis the tax forms whose accounting period is not 12 months long (births and deaths excluded)
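A minimal sketch of this restatement follows. The slide does not say how the 12-month estimate is computed, so the pro rata rescaling used here is an assumption, as are all function names. It would only make sense for flow variables such as turnover, and births and deaths would be left unchanged.

```python
from datetime import date

def months_between(start, end):
    """Number of calendar months touched by the inclusive period [start, end]."""
    return (end.year - start.year) * 12 + end.month - start.month + 1

def months_in_year(start, end, year):
    """Number of months of `year` covered by the period [start, end]."""
    lo = max(start, date(year, 1, 1))
    hi = min(end, date(year, 12, 31))
    return months_between(lo, hi) if lo <= hi else 0

def pick_period(periods, year):
    """Keep the accounting period sharing the most months with `year`."""
    return max(periods, key=lambda p: months_in_year(p[0], p[1], year))

def to_twelve_months(value, start, end):
    """Rescale a flow variable to 12 months (assumed pro rata method)."""
    return value * 12 / months_between(start, end)

periods = [
    (date(2008, 4, 1), date(2009, 3, 31)),   # 3 months in 2009
    (date(2009, 4, 1), date(2010, 6, 30)),   # 9 months in 2009, 15 months long
]
chosen = pick_period(periods, 2009)
turnover_12m = to_twelve_months(1500.0, *chosen)
```

Here the second period is retained for reference year 2009, and a turnover of 1500 over 15 months is restated to 1200 over 12 months.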


From the accounting period to the reference period

[Figure: timeline over years N-1, N and N+1 showing which reference year each accounting period is assigned to. Case labels: Year N-1; Year N; Year N+1; Year N with adjustment to 12 months; birth, no change; death, no change.]


The statistical register (Ocsane)

› The scope covered by our statistics is defined by a statistical register (Ocsane)

› Before editing tax data, it is essential to match the tax forms sent to us with the units of our register, so that the fundamental rule holds at the end of our process:

1 unit of the register = 1 return (either collected, derived from collected returns, or imputed)

› This is done in 3 steps:

–Matching the administrative returns with our register

–Dealing with multiple returns

–Imputing a return for units that do not have one
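The fundamental rule lends itself to a simple final check. The sketch below is illustrative only: the `siren` field name and the function name are assumptions made for the example.

```python
from collections import Counter

def check_one_return_per_unit(register_units, returns):
    """List the register units that break the "1 unit = 1 return" rule.

    `register_units` is a collection of unit identifiers and `returns` a list
    of dicts carrying a "siren" key (hypothetical layout).
    """
    counts = Counter(r["siren"] for r in returns)
    missing = sorted(u for u in register_units if counts[u] == 0)
    multiple = sorted(u for u in register_units if counts[u] > 1)
    return missing, multiple
```

Units in `missing` would need an imputed return and units in `multiple` would need their returns combined into one.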


The identification of returns

› The tax administration uses a different identifier (IFRP) from ours (the “Siren” number), but it stores the “Siren” number in its own database as an ordinary characteristic

› For several years it has made a strong effort to improve the quality of this “Siren” number: about 98% of tax forms have a correct Siren number that can be found in our register
› For the remaining 2%:

–If turnover >= 50M €: manual review to find a matching unit in the register (1000 matches per year are made this way)

–If turnover < 50M €: the return is discarded
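The routing rule for a single return can be sketched as follows. Only the 50M € threshold comes from the slide; the field names `siren` and `turnover` and the three status labels are assumptions made for the example.

```python
MANUAL_REVIEW_THRESHOLD = 50_000_000  # euros, threshold quoted on the slide

def route_return(tax_return, register_sirens):
    """Decide what happens to one tax return during identification.

    `tax_return` is a dict with "siren" and "turnover" keys and
    `register_sirens` is the set of Siren numbers in the register
    (both layouts are hypothetical).
    """
    if tax_return.get("siren") in register_sirens:
        return "matched"
    # Unmatched returns: only the large ones are worth a manual search.
    if tax_return["turnover"] >= MANUAL_REVIEW_THRESHOLD:
        return "manual_review"
    return "discarded"
```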


Multiple tax returns

[Figure: cases in which a unit has two returns for year N, with the treatment labelled for each case]
› Consecutive periods (1st return, 2nd return): “paste” merger
› Correction (1st return, 2nd return)
› Multiple types of profit (1st return: I&B profits; 2nd return: non-commercial profits): consolidation


Imputed data

› 540 000 units of our register have no tax return:
–210 000 units because the administration did not send us their forms (tax audits, for example) or sent them too late
–330 000 micro-businesses

› They represent about 20% of the units of the register but only 3% of the total turnover


Imputed data

Three methods are used to impute data:
› If a non-imputed return is available from the previous year, the new return is the previous one inflated by the median change in turnover of the company's sector

› Otherwise, the return for the current year is imputed as the average return of its sector and size class

› Micro-businesses are imputed in a similar way to the second method, but with a specific structure of accounts
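The first two methods can be sketched as below. This is an illustration under stated assumptions: the median change is taken as the median ratio of current to previous turnover over the units of the sector observed in both years, every selected variable is inflated by the same factor, and all field and function names are hypothetical.

```python
from statistics import mean, median

def sector_growth(prev, curr, sector):
    """Median current/previous turnover ratio over units of `sector`
    that have a return in both years (assumed definition)."""
    ratios = [
        curr[u]["turnover"] / prev[u]["turnover"]
        for u in curr
        if u in prev and curr[u]["sector"] == sector and prev[u]["turnover"] > 0
    ]
    return median(ratios)

def impute_from_previous(prev_return, growth, variables):
    """Method 1: last year's non-imputed return inflated by the sector growth."""
    imputed = dict(prev_return)
    for v in variables:
        imputed[v] = prev_return[v] * growth
    return imputed

def impute_from_class(curr, sector, size_class, variables):
    """Method 2: average return of the unit's sector and size class."""
    donors = [r for r in curr.values()
              if r["sector"] == sector and r["size_class"] == size_class]
    return {v: mean([r[v] for r in donors]) for v in variables}
```

The third method, for micro-businesses, would follow the same pattern as `impute_from_class` with its own structure of accounts, which the slide does not detail.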


Data are strongly constrained by mathematical relationships

› Accounting data is very redundant
› The 250 characteristics of interest are linked by nearly 100 relations of the types:

X = X1 + X2 + … + Xn

or

X = Y - Z
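Both types of relation are easy to test on a single return. The sketch below uses placeholder variable names and an arbitrary rounding tolerance, both of which are assumptions; the actual list of relations is not given in the slides.

```python
def check_relations(record, sums, differences, tol=0.5):
    """Return the left-hand variables of the relations that do not hold.

    `sums` is a list of (total, [components]) pairs for X = X1 + ... + Xn and
    `differences` a list of (result, minuend, subtrahend) triples for X = Y - Z.
    `tol` is an illustrative tolerance for rounding differences.
    """
    failed = []
    for total, parts in sums:
        if abs(record[total] - sum(record[p] for p in parts)) > tol:
            failed.append(total)
    for result, minuend, subtrahend in differences:
        if abs(record[result] - (record[minuend] - record[subtrahend])) > tol:
            failed.append(result)
    return failed

record = {"X": 100, "X1": 60, "X2": 40, "W": 30, "Y": 50, "Z": 10}
failed = check_relations(record, [("X", ["X1", "X2"])], [("W", "Y", "Z")])
```

In this example the sum relation holds, while W = Y - Z fails because Y - Z is 40 and W is 30.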


Micro-editing methods

› Micro-editing is mainly based on the constraints described earlier

› When a relation such as X = X1 + X2 is not satisfied (error), there are two methods:
–Micro-editing 1: shaping the breakdown (X1 and X2)
–Micro-editing 2: recalculation of the total (X)
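The two methods can be illustrated for a relation X = X1 + X2 that does not hold. This is a sketch under assumptions: the slide does not say how the breakdown is reshaped, so proportional rescaling is used here as one plausible choice, and the function names are hypothetical.

```python
def shape_breakdown(total, parts):
    """Micro-editing 1: trust the total and rescale the components so that
    they sum to it (proportional rescaling is an assumption)."""
    current = sum(parts)
    if current == 0:
        raise ValueError("cannot rescale a breakdown that sums to zero")
    return [p * total / current for p in parts]

def recalculate_total(parts):
    """Micro-editing 2: trust the components and recompute the total."""
    return sum(parts)

# Unsatisfied relation: X = 100 but X1 + X2 = 30 + 50 = 80.
new_parts = shape_breakdown(100, [30, 50])   # X kept, X1 and X2 adjusted
new_total = recalculate_total([30, 50])      # X1 and X2 kept, X adjusted
```

Which method applies to a given relation depends on whether the total or its components are considered more reliable, a choice the slides do not spell out.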


Selective editing process

› A selective editing process is then implemented to determine the most influential companies for our statistical aggregates

› The main principle is to calculate an aggregate ratio (i.e. the ratio of 2 aggregates) with and without each company: the most influential data are those giving the largest difference between the two ratios

› The influences on the different characteristics are synthesized in one score. By setting thresholds for this score, one defines which companies have to be manually reviewed
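The principle can be sketched as a leave-one-out computation. The slides do not specify how the influences are synthesized into one score, so taking the maximum over the ratios is an assumption, as are the variable names and the threshold value.

```python
def ratio_influence(units, num, den):
    """Absolute change in sum(num) / sum(den) when each unit is left out.

    Assumes at least two units and non-zero denominators.
    """
    total_num = sum(u[num] for u in units.values())
    total_den = sum(u[den] for u in units.values())
    full = total_num / total_den
    return {
        uid: abs(full - (total_num - u[num]) / (total_den - u[den]))
        for uid, u in units.items()
    }

def select_for_review(units, ratios, threshold):
    """Score each unit by its largest influence over all ratios (assumed
    synthesis) and return the units whose score exceeds `threshold`."""
    scores = {uid: 0.0 for uid in units}
    for num, den in ratios:
        for uid, value in ratio_influence(units, num, den).items():
            scores[uid] = max(scores[uid], value)
    return sorted(uid for uid, s in scores.items() if s > threshold)

units = {
    "a": {"value_added": 10, "turnover": 100},
    "b": {"value_added": 20, "turnover": 100},
    "c": {"value_added": 90, "turnover": 100},
}
to_review = select_for_review(units, [("value_added", "turnover")], threshold=0.2)
```

With all three units the ratio is 0.4; leaving out unit "c" moves it to 0.15, the largest shift, so only "c" is sent to manual review at this threshold.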


Human reviewing of tax data

› The most influential tax returns selected are then submitted to a team of clerks for review

› Dedicated software has been developed to support this review

› For each return to be reviewed, this software shows the clerks:

–The list of the characteristics that need to be reviewed (i.e. the characteristics that this return influences the most)

–The values before editing and after editing of these characteristics

–The values of year N, N-1, N-2 of these characteristics

–The errors before micro-editing involving these characteristics

› With this information, the clerks then call the company back to validate or modify the values of the selected characteristics