QQUUAALLIITTYY RR - gouvernement...4D 0 0 0 0 0 4F 4G 4H 8A 12.5 8B 0 8E 8F 0 9 I.1.4. Application of 1%-rules foreseen in the SBS regulation Annex II Variables 21110, 21120 and 21140

QQUUAALLIITTYY RREEPPOORRTT

SSTTRRUUCCTTUURRAALL BBUUSSIINNEESSSS

SSTTAATTIISSTTIICCSS RREEGGUULLAATTIIOONN

AANNNNEEXXEESS II TTOO IIVV && VVIIIIII

MMEEMMBBEERR SSTTAATTEE:: LLUUXXEEMMBBOOUURRGG

RREEFFEERREENNCCEE YYEEAARR:: 22001111

RREEPPOORRTT IISSSSUUEEDD:: SSBBSS -- SSEERRVVIICCEESS,, IINNDDUUSSTTRRIIEESS,, TTRRAADDEE

AANNDD CCOONNSSTTRRUUCCTTIIOONNSS

COMPILING INSTITUTION: STATEC, LUXEMBOURG

CONTACT DETAILS:

NAME: Georges Zangerlé

E-MAIL ADDRESS: [email protected]

TELEPHONE NUMBER: +352 247-84242

POSTAL ADDRESS: 13, rue Erasme

L-1468 Luxembourg

Please answer in the grey-shaded cells or, when necessary, update the information given.

Please check the pre-filled cells marked in green.

There is no limit for the replies to open questions; the row height will be automatically adjusted to your text.

Please note that, if you consider that any of your answer should be treated as confidential, the answer

needs being labelled “CONFIDENTIAL” explicitly.

mailto:[email protected]

2

Contact persons

Annex I:

RESPONSIBLE INSTITUTION: STATEC, LUXEMBOURG

CONTACT DETAILS:




Annex II:


CONTACT DETAILS:




Annex III:


CONTACT DETAILS:




Annex IV:


CONTACT DETAILS:




3

Annex VIII:


CONTACT DETAILS:




4

Contents:

CONTACT PERSONS .................................................................................................................................. 2

I. RELEVANCE .......................................................................................................................................... 6

I.1. COMPLETENESS ........................................................................................................................................ 6 I.1.1. DATA AVAILABILITY ............................................................................................................................... 6 THE AVAILABILITY RATE IS AS FOLLOWS (PRE-FILLED BY EUROSTAT): .......................................................... 6 I.1.2. AVAILABILITY OF CHARACTERISTICS AND/OR BREAKDOWNS REQUIRED BY THE SBS-REGULATION ... 7 I.1.3. USE OF THE QUALITY FLAG 'CONTRIBUTION TO EUROPEAN TOTALS ONLY' (CETO-FLAG) FORESEEN IN

THE SBS REGULATION (FOR MORE EXPLANATION SEE ANNEX) ...................................................................... 8 I.1.4. APPLICATION OF 1%-RULES FORESEEN IN THE SBS REGULATION ......................................................... 9 I.1.5. DEROGATIONS FROM THE PROVISIONS OF THE SBS REGULATION ......................................................... 9 I.2. CONFIDENTIALITY ................................................................................................................................... 9 I.3. MONITORING USER INTEREST ............................................................................................................... 12 I.3.1. DATA DISSEMINATION REPORT ............................................................................................................. 12 I.3.2. CONSULTATION OF YOUR MAIN USERS (TARGET GROUP: NARROW SCOPE, E.G. NATIONAL ACCOUNTS,

CENTRAL BANKS, ECONOMY DEPARTMENT ETC.) ......................................................................................... 13 I.3.3. USERS' SATISFACTION (BROADER SCOPE) ............................................................................................. 14

II. ACCURACY AND RELIABILITY ...................................................................................................... 15

II.1. CONCEPTS AND SOURCES ..................................................................................................................... 15 II.2. NON-SAMPLING ERRORS....................................................................................................................... 24 II.2.1. UNIT NON-RESPONSE ........................................................................................................................... 24 II.2.2. BIAS ..................................................................................................................................................... 26 II.2.3 IMPUTATION ......................................................................................................................................... 27 II.2.4 COVERAGE ERRORS .............................................................................................................................. 28 II.2.5. PROCESSING ERRORS ........................................................................................................................... 29 II.3. INFERENCE (GROSSING UP) .................................................................................................................. 29 II.4. ASSESSMENT OF REVISIONS ................................................................................................................. 30 II.4.1. PRELIMINARY DATA VERSUS FINAL DATA ........................................................................................... 30 II.4.2. AVERAGE SIZE OF REVISION ................................................................................................................ 31 II.4.3. REVISION POLICY ................................................................................................................................. 34

III. COHERENCE AND COMPARABILITY ....................................................................................... 35

III.1. COHERENCE ........................................................................................................................................ 35 III.2. COMPARABILITY ................................................................................................................................. 36 III.2.1. COMPARABILITY OVER TIME .............................................................................................................. 36 III.2.2. GEOGRAPHICAL COMPARABILITY: COVERAGE OF TARGET POPULATION .......................................... 36 III.2.3. GEOGRAPHICAL COMPARABILITY: OTHER ISSUES ............................................................................. 36

IV. TIMELINESS AND PUNCTUALITY ................................................................................................ 38

IV.1. TIMELINESS ......................................................................................................................................... 38 IV.2. PUNCTUALITY ...................................................................................................................................... 38

5

V. ACCESSIBILITY AND CLARITY ....................................................................................................... 40

V.1. ACCESSIBILITY ...................................................................................................................................... 40 V.2. CLARITY ............................................................................................................................................. 41

VI. FURTHER COMMENTS ..................................................................................................................... 42

ANNEX.......................................................................................................................................................... 42

6

I. Relevance

Definition

Relevance is the degree to which statistical outputs meet current and potential user needs. It

depends on whether all the statistics that are needed are produced and the extent to which

concepts used (definitions, classifications etc.,) reflect user needs.

I.1. Completeness

The completeness is the extent to which data are available compared with the requirements in

terms of characteristic, geographical and activity breakdown, as specified in the SBS Regulation1.

I.1.1. Data availability

The availability rate is as follows (pre-filled by Eurostat):

100applicable fields ofnumber

provided cells ofnumber

2011 LU

1A 92

1B 96

1C 100

1E 93

2A 95

2B 100

2C 100

2D 100

2F 100

2H

2I

3A 93

3B 100

3C 100

3D 0

4A 93

4B 100

4C 100

4D 100

4F

4G

4H

Total 95

1 SBS Regulation: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2008:097:0013:0059:EN:PDF:

http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2008:097:0013:0059:EN:PDF

7

8A 100

8B 100

8E

8F 100

Total 100

SPECIFICATION OF MISSING DETAIL: (TABLE TO BE FILLED IN BY EUROSTAT)

a) characteristics 11AA,,22AA,, 33AA,,11EE,,44AA::

VV1166114400

22AA:: VV1188111100

nothing missing

b) activities missing in series HH550022 nothing missing

c) breakdown in series 11BB:: HH550011,,HH550022,,

HH550033,, HH550044 nothing missing

d) missing series 33DD nothing missing

I.1.2. Availability of characteristics and/or breakdowns required by the SBS-Regulation

Please comment on the rates of available statistics calculated by Eurostat and explain the reasons

why any characteristics or breakdowns required by the SBS Regulation are not available (e.g.

derogations) and describe your plans for improvement in the future.

ad a) Missing characteristics

- 16 14 0: the “Number of employees in full time equivalent units” is not available in survey or

administrative data ;

- 18 11 0: turnover by 3-digit NACE activity breakdown is not yet available due to the still work-

in-progress implementation of the CPA version 2008.

ad b) Breakdown in series

The activities missing in series 1A and 1B are related to subdivisions of water transport, more

precisely sea and coastal water transport. The figures are based on an economic branch level

estimation obtained yearly from our national accounts department. At the moment of the

transmission of SBS data for the reference year 2009, the source was still only available for NACE

Rev.1.1. Given that NACE Rev.2 is more detailed for sea and coastal water transport activities, the

available NACE Rev.1.1 estimate could not be clearly allocated to either passenger or freight.

Nonetheless, the data are available for the aggregate of passenger and freight.

ad c) Missing series

Luxembourg is completely exempted from providing data series 3D (see 1%-rule derogation).

Areas for improvement

- Work-in-progress: Water transport activities estimates in NACE Rev.2 are not yet available.

Water transport is a complex methodological issue in Luxembourg, due to the paradox of the

country not having a direct access to the sea and the existence of many incorporated entities in

this activity branch. During 2013, a solution was being worked out jointly with national accounts

and business register. Update 2014: However, due to a shortage of resources, this area has been

put on hold due to other priorities.

8

- Work-in-progress: The implementation of CPA version 2008 (which impacts variable 18 11 0) is

still on-going because it requires close consultation with national accounts to avoid any major

inconsistencies in the future. A solution is currently being worked out jointly with national

accounts and should be achieved for reference year 2013 ;

- Work-in-progress: Data collection for variable 16 14 0 is not foreseen due to administrative

burden and reliability considerations. As a result of a national task force, we have started

exploring the possibility to extract this variable from an administrative source (i.e. “number of

hours worked” combined with “number of employees”. We seek to implement this variable for

reference year 2012.

I.1.3. Use of the quality flag 'Contribution to European totals only' (CETO-flag) foreseen in the

SBS regulation (For more explanation see annex)

100applicable fields ofnumber

cells flagged-CETO ofnumber

2011 LU LU LU LU LU

TOTAL Section Division Group Class

1A 2.38 0 3.83 2.98 1.8

1B 0.41 0 0.53 0.41

1C 2.94 0 3.57

1D 0

1E 0

2A 0 0 0 0 0

2B 0 0 0 0

2C 0 0 0

2D 0 0 0 0 0

2F 0 0 0 0 0

2H

2I

3A 0 0 0 0 0

3B 0 0 0 0

3C 0 0 0 0

3D

4A 0 0 0 0 0

4B 0 0 0 0

4C 0 0 0

4D 0 0 0 0 0

4F

4G

4H

8A 12.5

8B 0

8E

8F 0

9

I.1.4. Application of 1%-rules foreseen in the SBS regulation

Annex II Variables 21110, 21120 and 21140 Annex III All variables that are not included in Annex I Annex IV All variables that are not included in Annex I

Please indicate whether you intend to provide in the next reference year any of the data for

which you have applied the 1%-rule in the past.

We have decided to provide Eurostat with data available even if Luxembourg had a 1%-rule

derogation, under the condition that such a data delivery would not negatively impact the

completeness score for Luxembourg.

Please indicate whether you intend to discontinue the provision of the data for which the 1%-

rule could be applied (Eurostat will check the validity of this request and will confirm or not).

See here above

I.1.5. Derogations from the provisions of the SBS Regulation

Please indicate whether you intend to provide data for which you have granted derogation

earlier than the timing foreseen in the SBS Regulation.

Currently, there are no resources to collect those data.

I.2. Confidentiality

I.2.1. The rate of confidential cells is as follows (pre-filled by Eurostat):

100provided cells ofNumber

cells alconfidenti ofNumber

2011

LU LU LU LU LU

TOTAL Section Division Group Class

1A 33 0 17 31 40

1B 31 16 32 31

1C 10 0 12

1E 39

2A 42 0 42 45 42

10

2B 31 53 35 29

2C 26 0 29

2D 38 0 37 39 38

2F 44 0 44 47 44

2H

2I

3A 26 0 0 35 24

3B 31 21 21 33

3C 21 0 0 25

3D

4A 16 0 0 0 25

4B 21 0 0 30

4C 0 0 0

4D 17 0 0 0 27

4F

4G

4H

TOTAL 34 21 31 33 38

8A 61

8B 70

8E

8F 56

Total 62

N.B. The number of confidential cells is calculated at the detailed level of the activity breakdown

required by the Regulation (EC) 251/2009.

I.2.2. Please describe your confidentiality rules (primary and secondary). Note: The described

confidentiality rules will not be published by ESTAT.

General remarks

The team involved in the SDC procedures has been trained via the 3-day course on SDC offered

in the Eurostat training programme (ESTP) or in the CENEX-SDC framework, as well as via the

2-day course on SDC in the area of SBS. One member of the staff has represented Luxembourg

in the ‘ESSNet on SDC 2008-2009’ project as a tau-Argus software tester and SDC handbook

reviewer.

Technical considerations

The basis for any suppression pattern (addressing both primary and secondary confidentiality) is

the software package tau-Argus. However, the process also involves manual procedures, i.e.

checking the tau-Argus output, comparing the historical data series and addressing linked table

disclosure risks (see secondary confidentiality for further details).

The statistical disclosure control procedures are not performed for every variable individually

but only for a single shadow variable, i.e. "Turnover", more precisely the SBS variable 12 11 0

but including royalties. If a given cell is confidential for that variable (no matter if primary or

secondary), the same cell will be suppressed for all the other available variables (e.g. number of

persons employed, investments, personnel costs, etc.).

11

This approach has lead us to check the confidentiality for the following reference series:

- preliminary enterprise data broken down by economic activity ;

- final enterprise data broken down by economic activity ;

- enterprise data broken down by economic activity and by employment size class ;

- enterprise data broken down by special aggregates ;

- kind-of-activity (KAU) data broken down by economic activity ;

- (if available) local unit data broken down by economic activity ;

- enterprise data broken down by economic activity and by the client’s country of residence

(annex VIII).

This strongly contributes to minimising both SBS transmission delays and otherwise unsolvable

linked table or variable-specific disclosure risks.

Primary confidentiality rules

As a general principle, the NACE section level data, if not broken down by any other spanning

variable (e.g. size class), are not considered as confidential, except if the section has a trivial

breakdown (e.g. section L of NACE Rev.2). There can be other case-by-case exceptions to this

principle.

a) Sensitivity rule

We apply the (n,k)-dominance rule, i.e. a cell is suppressed if n units separately or jointly

dominate the total value of a cell by at least k% .

b) Minimum frequency rule

For any cells that are left after applying the sensitivity rule, a minimum frequency is applied. A

cell is suppressed if there are less than n units in a given cell.

Secondary confidentiality rules

The secondary suppression is calculated by tau-Argus using the ‘Modular’ algorithm. Manual

suppressions or cost adjustments are performed using the tau-Argus ‘a priori’ file facility.

a) Secondary suppression within a table

- A cell is suppressed for secondary confidentiality if n units dominate jointly or separately the

confidential total value by at least k% ;

- special attention is paid to the impact of singletons, a risk which is in most cases directly

addressed by the tau-Argus Modular algorithm ;

- tau-Argus is set to minimise the cost when determining the secondary suppressed cells.

However, we also want to provide the user with useful data, whether it is in terms of

interpretation and/or availability of time series. Consequently, the cost minimisation can be

overridden for economic and/or historical reasons.

b) Secondary suppression due to linked tables disclosure risks

- historical disclosure: in conformity with the SDC handbook, we ensure that no historical cell

is compromised by disclosing the same cell for the current reference year. As long as there is

any significant link with prior year data, a cell may not be disclosed for the current reference

year.

- links to any other series based on different unit concepts but sharing the same breakdown:

data are first checked on an enterprise level. Then the data for other concepts (KAU, local

units) are checked - the cells for which there is any significant link with the enterprise data

series inherit the flags from the latter. Sometimes, a flag of a KAU or local unit series has an

12

impact on the enterprise series. Therefore, we cannot send the enterprise data series before

the other series.

I.2.3. Have you taken any measures to reduce the number of confidential cell?

[ X ] Yes

[ ] No

If yes, please describe them briefly: We streamlined the way we treat periodic disclosure risk.

Instead of more or less copying and pasting the confidentiality pattern, we now only leave those

cells closed which are necessary to protect a primarily confidential cell in the previous year.

The same principle is applied between the series using different statistical units.

I.2.4. How do you evaluate the impact of the applied measured to reduce the number of

confidential cells?

very good good satisfactory poor very poor

I.3. Monitoring user interest

If the SBS data of Annexes I to IV are not disseminated at national level by your NSI, please

answer only to the questions 1.3.2.5.

I.3.1. Data dissemination report

1.3.1.1. Does the dissemination/publishing unit in your NSI keep track of the number of hard-

copies of the SBS publications sent or sold?2

[ X ] Yes

If yes, how many publications were sent/sold from the most recent hard-copy publications (or in

the last 12 months…): 824 statistical year books

[ ] No

[ ] No (hardcopies are not printed)

1.3.1.2. Does the dissemination/publishing unit in your NSI keep track of the number of

downloaded on-line SBS publications?

[ X ] Yes

If yes, how many downloaded publications/data were sent/sold for the most recent SBS

publications/data (or in the last 12 months…): ….

[ ] No

[ ] No (publications are not available on-line)

1.3.1.3. Does the dissemination/publishing unit in your institute keep track of the number of

downloaded data from the on-line databases?

[ X ] Yes

2 The ESTAT dissemination unit surveys the language policy in national publications: use of native language, use of

English, use of any other language. This information is not specifically targeting SBS.

13

If yes, how many downloads of SBS data were made last year: 3982 clicks (12 months period)

[ ] No

[ ] No (data are not available in on-line databases)

1.3.1.4. Do you or the dissemination/publishing unit in your NSI provide any information which is

not available in the published publications and in the published on-line databases?

[ ] Yes, to the public authorities

[ ] Yes, to the public authorities and a limited set of registered main users

[ ] Yes, to everyone with a specific request

[X] No

I.3.2. Consultation of your main users (target group: narrow scope, e.g. National Accounts,

Central Banks, Economy department etc.)

I.3.2.1. Has your unit regular consultations with some of your main users?

[ X ] Yes

[ ] No

I.3.2.2. If yes, could you give a brief description of your main users and their needs (by main

groups of users) Internal or external users?

External users: public and private research institutions, marketing companies, national central

bank, lobby and policy groups, private users, university students, academic users and others.

External users are often interested in the variables “number of [units]”, “turnover”, “value-added”

and “employment” by economic activity as well as by employment size class. Sometimes,

concentrations ratios (e.g. market share of top 5 companies, etc.) are requested but due to

disclosure issues, these are not provided.

Internal users: national annual accounts, CIS-R&D statistics, ICT statistics, short-term statistics,

foreign direct investment, internal research department.

Internal users need micro level data. SBS micro level data are mainly used as a direct data source

for the production of other statistics (e.g. national annual accounts including the financial account,

ICT, CIS-R&D) or to calculate index weights (e.g. short-term statistics). Other uses are quality

checks, use of the data for sampling purposes, etc.

Internal users and annex VIII

The geographical dimension in the national SBS survey is also a major national source for national

accounts, not only for services companies but for production of services in general. The same

source is used to compile the annex VIII series.

I.3.2.3. If SBS data of Annexes I to IV are published at national level, could you please give a

brief description of the general opinion of the main users about the quality of the data?

External users

While the external users are generally satisfied with the quality of the data, there are a few causes

for potential disappointment:

- cells unavailable due to confidentiality in case of very detailed / specific data needs ;

14

- data not available due to unrealistic timing expectations, given that national SBS data for the

latest reference year are available only after t+21 or t+22 months – a few users seem to expect

results at a t+12 months frequency ;

- (rare) confusions in case of annual revisions of historical SBS data, which are necessary to

remain in line with national accounts data.

Internal users

Not applicable because internal users use SBS micro data.

I.3.2.4. If SBS data of Annexes I to IV are published at national level, does the consultation with

the key users reveal any specific unfulfilled need in addition to the requirements of the

Annexes I to IV of the SBS Regulation? If yes, please specify.

Internal users become more and more interested in balance sheet data. External users occasionally

request concentration indicators (e.g. market share of the top 5 companies, etc.), but those requests

remain unfulfilled.

(Not to be filled in if questions 1.3.2.1 to 1.3.2.4 have been answered)

I.3.2.5. If the SBS data as required for the Annexes I to IV of the SBS Regulation are not

published as such at national level, would you please indicate why the data published by your

NSI are more suitable for the users' needs?

Annexes I-IV

NACE Rev.2 series are currently being prepared for dissemination at national level. The series

will start at reference year 2005 until 2011. Thereupon, Luxembourg will send revised data series

to Eurostat for the said reference years.

Annex VIII

We do not currently publish data available in Annex VIII, given that the detailed breakdowns of

data cause many cells to be confidential.

I.3.3. Users' satisfaction (broader scope)

I.3.3.1. Have you organised a punctual or a regular survey related to the users' satisfaction

regarding the availability of your data for Annexes I to IV of the SBS Regulation in your country?

[ ] Yes

[ X ] No

I.3.3.2. If you have organised a survey as such, to what extent the users' needs were fulfilled by the

available data and if they are relevant for all of them?

not applicable

15

II. Accuracy and reliability

Definition

Accuracy of statistical outputs in the general statistical sense is the degree of closeness of

estimates to the true values.

If a different methodology is applied for different activity groups (or by Annex of the SBS

Regulation) this part of the quality report may be copied for each of the different

methodologies. Please indicate clearly which activities or Annex of the SBS Regulation the

information is referring to.

II.1. Concepts and sources

Member states may use different data collection systems to comply with the SBS regulation.

It might be:

the use of sample surveys3 (with administrative information present in the sampling frame

and only used for stratification purpose),

the use of a sample survey combined with use of administrative information (whereby this

information is also used in the editing and imputation process and possibly in the inference

process through calibration),

the use of administrative data as the only source of information and all characteristics are

calculated from the administrative records using or not a model based approach (in this case

no survey information is used for a significant part of the strata covered).

It is necessary only to fill in the concerned column.

If several surveys are used in order to compile different SBS characteristics, the table should be

copied as many times as there are survey.

3 Exhaustive interrogation of all enterprises is considered a type of survey.

Methodology 1: Sample survey Methodology 2: Sample survey combined

with administrative information

Methodology 3: Administrative

information

II.1.1. Description of source

a) Survey

1. Please describe the sample design in particular:

(If several surveys are organised to collect data, please

describe separately, for each of them, the sample

design).

- type of sample design

stratified

multistage

clustered

stratified

multistage

clustered

- stratification criteria

Activity

Employment size class

Region

Other (please specify )

Activity

Employment size class

Region

Other (please specify turnover )

- selection schemes (sampling rates)

Using number of employees and turnover

thresholds, the target population, obtained

through the business register, is divided into

two parts:

- any legal units employing either more

than 45 employees or having declared a

turnover excluding VAT of more than

7 million EUR per annum are selected

every year. Any legal units linked to

the aforementioned units (e.g. group

and enterprise links) are also selected

every year ;

- the other legal units are selected using a

stratified random probability sampling

design. Units in strata consisting of

only one unit are always selected.

17

Furthermore, we apply a rotation

principle to ensure that a unit can only

be surveyed at maximum once every

three years – this procedure helps to

reduce the administrative burden for

the smaller entities. The following

years, these units can be satisfactorily

imputed using cold-deck ratio

imputation.

- any possible threshold values

please refer to “selection schemes (sampling

rates)”

- the effective sample size

- To cover the needs of national accounts

and SBS: between 3000 and 3500 legal

units ;

- To cover the needs of SBS only:

between 2000 and 2500 legal units.

- If the reference period differs from the calendar

year is there a correction to bring it in accordance

with the reference period? For SBS (calendar

year).

No.

For each company with a different financial

year than the calendar year, we take a

decision regarding the SBS reference

period. Normally, this decision is taken in

such a way that at least 6 months of the

financial year for a given company are

included in the corresponding SBS

reference period. Once that decision is

taken, it is maintained every subsequent

year for comparability reasons.

- Additional or specific information should be added

in the column 'Methodology2 ' if the suggested

options are not feasible with the used concepts in

your country of compiling SBS data.

The stratification used in the sample design

does not take into account the size classes

defined in the SBS regulation. However,

mass imputation procedures exist on a

micro data level, so that data can be broken

down according to any spanning variable

defined a posteriori.

While the sample usually covers only

around 8% of the target population in terms

of number of units, it covers around 80% of

the target population in terms of turnover.

18

b) Administrative source

Please describe the administrative sources:

The used administrative sources:

- Business Register

- Social Security

- Value-added tax (VAT)

- Please enumerate the characteristics directly

available or with a good proxy in the administrative

source.

- NACE code

- National accounts institutional sector

code

- Employment

- Wages

- Turnover

- Please enumerate the characteristics for which a

model based estimate is used.

- Employment

- Turnover

- Is your unit able to check the plausibility of those

data, namely by contacting directly the units?

No.

However, we check the plausibility of those

data for every unit in the sample and on a

case-by-case basis with annual business

accounts (if available at the Register of

Commerce).

The extent to which the administrative source are

used:

[ X ] : data source, basic data for some

characteristics

[ X ] : data source for imputation in case of

non-response.

[ X ] : data source for imputation, for strata

not covered by the survey

[ X ] : data source for 'mass imputation':

(imputation of units not selected in the

sample)

[ ] : data source for calibration (calage sur

marge) to optimize inference from survey

results

- What kind of administrative data do you access?

[ X ] : micro data

[ ] : aggregated data

[ ] : for the entire population

(global data)

[ ] : micro data

[ ] : aggregated data

[ ] : for the entire population

(global data)

19

[ ] : by each (or some) economic

breakdown class

[ ] : for a part of the population

(sole proprietor, companies, etc.)

[ ] : other (please specify):

___________________________

[ ] : by each (or some)

economic breakdown class

[ ] : for a part of the population

(sole proprietor, companies, etc.)

[ ] : other (please specify):

___________________________

How do you assess the frequency to which the used

administrative data sources are updated?

VG * G S P VP

VG G S P VP

Are the administrative data subject to several

revisions with (increasing) degree of completeness?

[ X ] yes

[ ] no

[ ] yes

[ ] no

II.1.2.Relation between reporting units and the

legal units / enterprise (statistical unit):

Which is the relation between the reporting unit for

the survey/administrative data and the enterprise?

In the SBS survey, the reporting unit is the

legal unit. However, for a very few legal

units, the data are collected separately for

the economic subdivisions of the legal unit.

Moreover, local unit data are collected at

legal unit level.

In the administrative sources, the reporting

unit is the legal unit.

Finally, the statistical unit “enterprise” can

be a legal unit or a group of legal units.

Enterprises which are formed by a group of

legal units are not the general case in

Luxembourg but their occurrence is

significant, in particular due to many legal

units specialized in ancillary activities for

other legal units belonging to the same

enterprise group.

II.1.3. Definitions and concepts used in the

survey/administrative sources :

20

How would you assess the proximity of the

definitions and concepts (including statistical units)

used in the survey/administrative source with those

required for statistical purposes?

VG G S P VP

VG G S P VP

VG G S P VP

Please list the main differences between the

survey/administrative source and the statistical

definitions and concepts.

The differences can be categorized as

follows:

- timing differences ;

- permanent differences.

The turnover recognition moment between

the survey and the administrative source are

often similar. However, for some economic

activities (e.g. real estate, construction, ICT

services, etc.), there can be a significant

timing difference between the moment of

invoicing (VAT) and the moment of

revenue recognition in the profit&loss

account.

Furthermore, the concept of turnover in the

survey includes secondary and ancillary

activities, in conformity with the SBS

regulation definition. However, these

activities are not necessarily captured by the

variable “Turnover” available in the

administrative source (VAT), thus creating

a permanent difference. This situation can

also be reversed, i.e. an element of the

variable “Turnover” in the administrative

source is not necessarily considered as

turnover in the SBS survey (e.g. proceeds

from the sale of fixed assets, interests

received on intercompany loans in some

specific cases).

Are there any of the SBS characteristics not

included in those data sources?

yes

If yes, please indicate the method used to

compile those characteristics (estimation).

yes



Characteristic 16 14 0 Number of persons

yes



21

______________________________.

no

employed in full-time equivalent is not

available in the data sources. We are

currently looking into possibilities to

estimate the variable, as an inclusion in the

SBS survey would increase the

administrative burden quite significantly,

in particular for bigger companies and thus

also negatively impact the non response

rate. A few characteristics for which

Luxembourg has obtained a permanent

derogation (e.g. environmental breakdown

of variables) are also not available in the

data sources.

no

______________________________.

no

22

II.1.4. Sampling error: coefficients of variation

Coefficients of Variation

valueestimated

variancesampling theof estimateCV

For the main series, by type of economic activity

(1A, 2A, 3A and 4A – Regulation No 295/2008), a

coefficient of variation for turnover (characteristics

12 11 0) is required on the NACE Rev2 – 3-digit

(group) level.

For the same series, coefficients of variation are

expected for variables: 11 11 0, 12 11 0, 12 15 0,

13 31 0, 15 11 0 and 16 13 0 on NACE Rev.2 –

Section level for the sections B, C, D, E, F, G, H, I,

J, L, M and N and also on NACE Rev 2 – 2-digit

(division) level for Div. 45, 46 and 47 covering the

trade section and division 95.

For the series broken down by employment size

class (1B, 2B, 3B and 4B – Regulation No

295/2008) a coefficient of variation is required on

NACE Rev 2 – 1-digit (section) level only for

variables: 11 11 0, 12 11 0, 12 15 0 and 16 11 0.

The size classes for which CV's are requested are

those limited to the following set: 0–9, 10–19, 20–

49, 50–249 and 250+ persons employed for series

2B and 4B and 0-1, 2-9, 10-19, 20-49, 50-249, 250+

for series 1B and 3B..

Please provide in a separate file the coefficients of

variation, according to the specifications from

above, and describe the methods used and the

aspects taken into account for computation of the

CV (including software).

The ratio estimator is used for grossing up.

Please refer to section II.3 for further

details on grossing up procedures.

The formula of the CV for the ratio

estimator has been programmed in STATA

using the formulae described in the book

“Techniques de sondage” by Pascal Ardilly

(ISBN 10: 2-7108-0847-1).

Given the ratio R, the sampling variance

has been estimated using the residuals (u)

between the variable of interest (y) and the

ancillary variable (x) :

n

sfYV u

R

2

)1()ˆ

(

with

N

nf 1)1( ,

n

i

iu un

s

1

2

1

1

and iii xRyu

The CV has been calculated using the

following formula:

y

YVCV

R )ˆ

(

Given that the sample design includes a

rotation principle for small units, data

imputed using the procedure described in

section II.2.1.1. have also been taken into

account for the calculation of the CVs.

23

The following variables of interest have

been grossed up using the ancillary

variable “turnover”:

- 12 11 0 ;

- 12 15 0.

The following variables of interest have

been grossed up using the ancillary

variable “number of persons employed”:

- 15 11 0 ;

- 16 11 0.

Variable 13 31 0 has been grossed up using

the ancillary variable “personnel costs”,

available through administrative sources.

The CVs have been calculated for a given

variable of interest with the corresponding

ancillary variable.

For the variables 11 11 0 and 16 13 0, the

CV is zero, as they are available for every

unit, either in the business register or in the

survey, and thus do not need to be

estimated.

The CV has been calculated for every

single cell, i.e. by activity as well as by

activity broken down by employment size

class. Please note that the CV is missing

for the following cells:

- cells with no unit or only one unit in

the target population ;

- cells with no unit or only one unit in

the sample.

Legend:

VG= Very good; G=Good; Satisfactory= S; Poor=P; Very poor=VP

II.2. Non-sampling errors

II.2.1. Unit non-response

Unit non-response occurs when not all the reporting units in the sample participate in the survey or

the needed information is not available in the administrative data sources. In principle non-responses

can occur using both data sources: survey and administrative data.

(If this is not applicable it should be explained why e.g. 'Administrative data have gone through an

imputation procedure eliminating raw data flaws and imputing records not available.')

II.2.1.1. Description of estimation methods for taking into account the unit non-responses

a) Please describe the methods used for taking into account the unit non-response.

General approach

For taking unit non-response into account, we use

- the latest available survey data for the last 10 years ( iy~ ) ;

- administrative data (turnover, number of persons employed, personnel costs) from the reference

period ( ix ) and the latest available survey period ( ix~ ) – the availability of administrative data is

very high.

The estimate is obtained as follows on a micro data level:

i

iii

x

xyy ~~ˆ

This method is known as cold-deck ratio imputation.

In some cases, mainly for bigger companies, the ratio i

i

x

x~ is manually adjusted using annual

business accounts available for the reference period. This provides an even more precise estimate.

If the ratio is abnormally high (compared to other similar units in the stratum) or if there is a weak

correlation between the administrative variable ix~ and the survey variable iy~ , this type of

imputation is not performed.

If no historical data are available or if those data are deemed outdated, the units are grossed up

using the same procedures as for units not covered by the sample or by imputed data.

Investment variables

This form of imputation is not used for the family of variables related to 15 11 0. However, an

administrative source is available to impute this variable.

Number of hours worked by employees

The average number of hours worked by firm is available in administrative sources. Using this

indicator and the number of employees as well as applying necessary adjustments to obtain the

effective number of hours worked, we impute the number of hours worked for every business in

the target population.

b) Please describe, when applicable, the measures taken or envisaged for minimising the unit non-

response.

25

Non response concerns units for which either of the following characteristics applies:

- units which have not responded. Every two months a written reminder is addressed to the units

who have not responded. The third reminder is done by registered mail ;

- the survey form data are not exploitable due to either inconsistent data or lack of information -

the questionnaire design is reviewed every year to ensure that item non response due to form

design is minimal and does not transform into unit non response ;

- units which have ceased their activity or which have been subject to restructuring (mergers,

splits, etc.) during the reference year – for these units, data cannot be easily collected but have

to be estimated or imputed in most cases ;

- units which cannot be contacted. The addresses for these units are checked manually using the

phone book and the national Register of Natural and Legal Persons. Unit non response for this

reason is usually low.

In rare situations, we enforce statistical obligation on a case-by-case basis by filing a legal

complaint against big units which have repeatedly failed to respond with usable data or at all.

II.2.1.2. Weighted unit non-response rate

The weighted unit non-response rate shows how well data collection has worked for the population of

interest. The non-response rate could be weighted by the weights that take into account the most

relevant characteristics. Weights are set at the sampling stage, based on a size criterion available in

the business register.

We prefer the 'number of persons employed' as a weighting factor for the calculation of the non-

responses, as it is always positive. Turnover can also be used if it is the main quantitative

characteristics available for all enterprises.

a) Weighted unit non-response rate

100 sample in the units of totalweighted

estimationin used units respondent-non weighted

The weighted unit non-response rate should be complied for the data collected through survey or

extracted from administrative sources. Please provide in a separate file the weighted unit non-

response rates at the NACE Rev 2 3-digit (group) level.

Remarks:

1. Only if a different survey is used for either one of the variables 11110, 12110, 12150,

13310, 15110 and 16110, a separate unit non-response file ought to be included, mentioning the

variable number in the field concerned.

2. If a variable is directly calculated from the reference frame (drawn from the Business

Register), non-response is zero by definition.

Please describe below which characteristics was used for weighting non-response and comment on

the remarks listed above.

The weighted unit non-response rate was weighted using the variable 12 11 0. We draw your

26

attention to the fact that the non response rate is equal to 0 for the variables 11 11 0 and 16 13 0, as

they are available in the administrative sources.

For the possible reasons for non response, please refer to the section II.2.1.1.b).

Overall, the weighted unit non-response rate is very low, the rate being 6% (2010: 1%). However,

for some economic activities, the rate is high. The non-response rate can even be equal to “NA”,

i.e. if there was no unit in the sample for a given cell, even though there are units for that cell in the

total population. It is important to know that :

- it is often not possible to use the same strata for estimation as those foreseen in the data

transmission format. Given the fact that estimated values are imputed on a micro data level

(imputation is done based on the relative importance of turnover for each unit), it is

nonetheless possible to present the results according to the data transmission format ;

- for groups G501 to G504, we do not use any micro data figures, please refer to section

I.1.2. for further details.

b) How do you evaluate the recorded unit non-response rate in the overall context?

Very high High On the average Low Very low

II.2.2. Bias

Please comment on the possible bias resulting from non-response or from the estimation method.

Bias resulting from non response

For the units to which the imputation method described in section II.2.1.1. has been applied, the

bias is significantly reduced when that ratio is adjusted with observations from the annual business

accounts (if available).

For most units, such an adjustment is not performed due to limited resources and the limited

availability of detailed annual business accounts for small businesses in particular. Nonetheless, the

use of adjusted past survey data certainly reduces the bias resulting from non-response.

If no historical survey data are available for a given unit, the latter is handled in the same way as

any unit which is not included in the sample. The bias would then result from the estimation

method. In most cases, this concerns only small businesses, which have a minor impact on the

aggregate value of the variables of interest.

Given the above, the bias can be evaluated as being small.

Bias resulting from the estimation method

According to the book reference mentioned in section II.1.4., the ratio estimator is biased, the

formula of the bias being complex but with a dimension of 1/n. Consequently, the higher n, the

smaller the bias. However, if n is small or if there is a weak linear correlation, the bias of the ratio

estimator could be significant.

While the bias resulting from the estimation method remains unknown, the estimation method

27

mainly concerns small units. Therefore, we believe that it has a limited effect in the overall

accuracy of the estimate.

Please select which of the options describes the bias of the estimate:

unbiased

Small bias (with a limited effect in the

overall accuracy of the estimate.

Substantial bias

II.2.3 Imputation

Imputation is the process used to resolve problems of missing, invalid or inconsistent responses identified

during editing.

II.2.3.1. Description of methods used

What imputation procedure was applied to cover up the uncovered non-response? Imputation

procedures will normally rely on the available administrative data, but imputation may also be

applied in other situations. (see annex on II.2.3)

Please briefly describe the imputation methods used for dealing with unit and item non- response

(e.g. corrector factor in the weighting procedure, imputation based on available information from

previous reference period etc.)

Unit non response

Please refer to the method described in II.2.1.1 for further details.

Units for which historical data are available

Units which have been in a sample prior to the reference period are imputed using the same method

and under the same conditions as for unit non-response.

Item non response

Given that SBS data are above all quantitative and that they also serve the needs of national

accounts, it is very difficult to identify item non-response. This is in particular true for units who

are included only once in the sample.

For the bigger companies, we are able to follow their structure over a larger period and thus to

identify item non-response in an easier way. Item non-response is therefore dealt with on a case-

by-case basis during editing procedures.

Missing information in the administrative source

For some economic activities, VAT data are not available. No imputation is done on administrative

data. If employment data are significant for a given unit, it is very likely included in the survey

data. Consequently, the occurrence of missing data in the administrative source is rare and of low

impact.

28

II.2.3.2 Evaluation of the impact of imputation

a) How do you evaluate the impact of imputation on CVs?

Very important Important Not important

The reduction of CVs can be assessed by comparing the CVs just using the reweighting procedures

with the CVS after imputation procedures.

The impact of imputation on CVs is important in particular for the small size classes, as most of

those units are selected using a rotation mechanism.

II.2.4 Coverage errors

II.2.4.1. Frame

Please describe the frame used for the SBS:

- What is the variable used for identifying principal and secondary activities?

Value-added is used where available on an activity or even a product level – this is often only

possible for units covered by the SBS survey or for which similarly detailed information would be

disclosed in annual business accounts. Turnover and employment are used when value-added is not

available.

- What is the method used for identifying activities?

top-down

bottom-up

other:

- Please comment on the frequency of updating the unit's principal activity (stability rules).

If the principal activity remains changed for at least two consecutive reference years, the NACE

code is updated.

- Please comment on the frequency of updating the business register in your NSI.

The update frequency is high. Updates of data managed by the NSI itself are performed instantly.

Monthly data related to employment are updated monthly with a t+4 months delay. Monthly VAT

data are updated monthly with a t+3 months delay.

- How do you assess the impact of imperfection of the relevant business register on the quality of

the key statistics?

almost none low medium high very high don't know

II.2.4.2. Out-of-scope units

Please provide information on the methods used to detect the out-of scope units and provide

Eurostat with an estimated number of those units.

Definition

29

The most frequently observed type of out-of scope units are legal units which are incorporated or

registered in Luxembourg but which at the same time do not have any non-financial economic

activity (SBS scope for annexes I-IV, VIII) in the Luxembourg economic territory (e.g. real-estate

rental activities with buildings in foreign territories only, companies where the activity is done in

foreign branches only, some activities related to shipping, etc.). The other significant type of out-

of-scope units relates to non-market activities.

Identification

Such units are identified and dealt with on a case-by-case basis by the staff involved in national

accounts, SBS, business register and balance of payments. This kind of information is also more

and more centralised through the business register.

Estimated number of undetected units

There is a list of companies which have been excluded from the SBS population for the above

reasons. However, we have no estimate of the number of undetected out-of-scope units still

lingering in the SBS population.

II.2.5. Processing errors

Processing errors can be of different types: at data entry (keying errors or optical scanning errors);

wrong handling of raw data errors. A good editing system may detect most of those errors.

a) Please tick the appropriate checks used in order to compile the SBS data and give a short

description of your editing system.

completeness checks validity checks plausibility checks

(data integrity rules) (internal consistency)

Processing errors in the survey

For the survey, we use an objective-driven approach to ensure that the survey data comply with the

pre-defined quality standard.

The following documentation (in French) describes the framework used to cover non sampling

errors for the survey reference year 2005, i.e. in the context of the old regulatory framework. There

is no updated version of the report yet as it is was published as a working paper and thus is not an

official methodological document by STATEC. However, it provides an idea of the type of control

environment that we have in place in the area of SBS survey data.

http://www.statistiques.public.lu/catalogue-publications/economie-statistiques/2009/29-2009.pdf

Errors in the administrative source

Error checking activities in the administrative source consist in the following:

- analysis of the correlation between survey and administrative variables ;

- analysis of births and deaths of significant units ;

- detection of out-of-scope units ;

- identification of merger and acquisitions to minimise the risk of double counting.

II.3. Inference (grossing up)

http://www.statistiques.public.lu/catalogue-publications/economie-statistiques/2009/29-2009.pdf

30

Have you applied any methods for grossing-up the figures covered by the SBS Regulation - to

cover the entire population of enterprises (there is no cut-off threshold used)?

yes

no

b) If you have applied any methods for grossing-up, please provide information about the

methods.

When grossing up survey data, we always take into account ancillary data available in the

administrative sources, i.e. the ancillary variables “turnover” and “number of employees”, and not

only the sample design. The aforementioned ancillary variables most often have a linear

correlation with the variables of interest in SBS. Consequently, the ratio estimator is used for

grossing up.

With the variable of interest being available only for the sample (y) and with the ancillary variable

being available both for the sample (x) and the SBS target population (X), the ratio estimator of

the variable of interest RY , expressed as an average, can be written:

RXYR ˆ

The ratio R has been formulated in the following way:

n

i

i

n

i

i

xn

yn

R

1

1

1

1

If for a given stratum no unit is available in the sample, i.e. neither for the historical periods nor

for the reference period, the units in this stratum are grouped together with another stratum for

which sample data are available and are then grossed up.

Variables 11 11 0, 13 31 0, 13 32 0, 13 33 0, 16 13 0 and 16 15 0 are available directly in the

administrative source and therefore do not need to be grossed up (as from reference year 2009). If

prior year survey data are available, a cold-deck ratio imputation is performed – this significantly

reduces the bias in case there are permanent differences between the admin source and the survey

data for a given unit. For the other units, administrative data are simply pasted into the data set.

II.4. Assessment of revisions

II.4.1. Preliminary data versus final data

The quality of the preliminary data is measured by comparing them with the final value. For annexes

1 to 4, an error measure defined as ratio between the sum of the absolute revision and the sum of the

absolute later estimates, the relative mean absolute revisions is calculated:

The following formula is applied

RMAR – Relative Mean Absolute Revisions

31

T

t Lt

T

t PtLt

T

t Lt

T

t t

ˆ

ˆˆ

ˆ

RRMAR

1

1

1

1

Lt = Latest estimate

Pt = Preliminary estimate

Characteristics 2009 2010 2011

a) Turnover (12110) 00..00553377 00,,00222255 00..00118855

b) Number of persons employed (16110) 00..00229977 00,,00113355 00..00228822

Please give a description of the methodology used for compiling preliminary data.

Preliminary data series are based on estimates which consist, for the vast majority of cases, of

adjusted survey data from the reference year t-1.

The adjustment happens with the administrative data sources, using the cold-deck ratio

imputation as described in section II.2.1.1. Units for which there is no t-1 survey data are grossed

up using the estimation method described in section II.3.

We are aware of the consistent underestimation of preliminary data compared to final data, in

particular as regards the number of enterprises. This is due to the fact that the VAT variables

(turnover, etc.) from the administrative source, which are used to calibrate most of the SBS

variables, are not completely available for the reference year t at the preliminary data compilation

moment (t+9 months). The final administrative deadline for companies to file their annual VAT

declaration is set at t+10 months, yet it still takes a few months until those data are then available

to STATEC. Consequently, at t+9 or t+10 months, the administrative source is still incomplete,

which is in particular noticeable for the variable 12 11 0.

As from reference year 2011, a new method minimises this effect. Moreover, one can see that the

precision has been gradually improved during the recent years.

II.4.2. Average size of revision

By revision we mean replacing the data that has been published before on the Eurostat website by

new data and not when corrections are sent before the data is finally released on the website.

The revision size is measured as follows:

RMAR – Relative Mean Absolute Revisions

T

t Lt

T

t PtLt

T

t Lt

T

t t

ˆ

ˆˆ

ˆ

RRMAR

1

1

1

1

Lt = Latest estimate

Pt = Preliminary estimate

32

NB: Due to the fact that it is difficult to calculate the RMAR as presented in the definition, the

calculation of the revision size below was calculated by considering the final version of data

(published on the website) as latest and the first version of data provided as preliminary. All

NACE codes up to 4 digit level were taken into account for the calculation.

The following list of characteristics to be taken on board for the calculation of the revision size, is

suggested by annex for the national series of the four annexes:

Annex 1: Series 1A: 11 11 0, 12 11 0, 12 12 0, 12 15 0, 12 17 0, 13 11 0, 13 31 0, 15 11 0, 16 11

0, 16 13 0

ANNEX

I

Country Variables RMAR

LU V11110 0.0000

LU V12110 0.0000

LU V12120 0.0000

LU V12150 0.0000

LU V12170 0.0000

LU V13110 0.0000

LU V13310 0.0000

LU V15110 0.0000

LU V16110 0.0000

LU V16130 0.0000

Annex 2, Series 2A 11 11 0, 12 11 0, 12 12 0, 12 15 0, 12 17 0, 13 11 0, 13 13 1, 13 31 0, 13

32 0, 15 11 0, 15 15 0, 16 11 0, 16 13 0, 16 14 0, 16 15 0, 18 11 0, 20 11 0

ANNEX

II


LU V11110 0.0000

LU V12110 0.0000

LU V12120 0.0000

LU V12150 0.0000

LU V12170 0.0000

LU V13110 0.0000

LU V13131 0.0000

LU V13310 0.0000

LU V13320 0.0000

33

LU V15110 0.0000

LU V15150 0.0000

LU V16110 0.0000

LU V16130 0.0000

LU V16140 #DIV/0!

LU V16150 0.0000

LU V18110 #DIV/0!

LU V20110 0.0000

Annex 3, Series 3A: 11 11 0, 12 11 0, 12 12 0, 12 13 0, 12 15 0, 13 11 0, 13 12 0, 13 31 0,

13 32 0, 15 11 0, 15 15 0, 16 11 0, 16 13 0

ANNEX

III


LU V11110 0.000

LU V12110 0.000

LU V12120 0.000

LU V12130 0.000

LU V12150 0.000

LU V12170 0.000

LU V13110 0.000

LU V13120 0.000

LU V13310 0.000

LU V13320 0.000

LU V15110 0.000

LU V15150 0.000

LU V16110 0.000

LU V16130 0.000

Annex 4, Series 4A: 11 11 0, 12 11 0, 12 12 0, 12 15 0, 12 17 0, 13 11 0, 13 31 0, 13 32 0,

15 11 0, 15 15 0, 16 11 0, 16 13 0, 16 14 0, 16 15 0, 18 110

ANNEX

IV


LU V11110 0.000

LU V12110 0.000

LU V12120 0.000

LU V12150 0.000

LU V12170 0.000

34

LU V13110 0.000

LU V13310 0.000

LU V13320 0.000

LU V15110 0.000

LU V15150 0.000

LU V16110 0.000

LU V16130 0.000

LU V16140 #DIV/0!

LU V16150 0.000

LU V18110 #DIV/0!

Not applied.

If revisions have been sent, please provide the reasons for making the revision.

Not applicable.

II.4.3. Revision policy

Please describe for each annex your revision policy, including information such as the average

number of revisions (planned or not), the main reasons for revisions and the impact of the

revisions.

Annex I

In the past, we had no revision policy for SBS data sent to Eurostat due to resources constraints.

However, due to recent economic evolutions, a revision of SBS data series has been decided for

the reference years 2005 to 2010 included. The revision will take into account the changes due to

NACE Rev.2, so that users will have a longer series for this classification. Moreover, the profiling

of a major enterprise group in Luxembourg has led to a new enterprise definition for the units of

this group – this has a major impact on the overall figures of industrial and trade activities.

The revision is currently in the works and planned for the end of the first half of 2014 – initial

planning was 2013 but due other priorities, the revision had to be put on hold.

Annex II

see remark here above

Annex III


Annex IV


Annex VIII


35

III. Coherence and comparability

III.1. Coherence

Definition

Coherence of statistics is their adequacy to be reliably combined in different ways and for

various uses. It focuses on the joint use of statistics that are produced for different primary

purposes to show cases of incoherence rather than to prove coherence.

Please report on the inconsistencies that can already be documented for the following statistics as

well as work undertaken in this respect:

- Number of enterprises in the business register

- Production value of Prodcom

- Value added of national accounts

- Evolution of turnover and persons employed from short term statistics

- Other (please specify)

Consistency with business demography

These differences have been documented in the SBS quality report on Annex IX (cf. section III.1).

Consistency with national accounts

In Luxembourg, our policy is to remain as consistent as possible with the data published in the

framework of national accounts. Therefore, the SBS data on the STATEC website were based the

statistical unit KAU and not the statistical unit enterprise, which is used for data published on the

Eurostat website. This is subject to change in the near future to allow the publication of inward

FATS and size class data. Nonetheless, the KAU series will be kept for the purposes of compiling

national accounts.

Given that the revised national accounts data for a given reference year are available after t+21

months, whereas the publication of LU SBS data is currently performed with data sets available at

t+19 months, there are punctual differences, in particular relating to the last minute identification

of (partially) out-of-scope units as well as due to NACE reclassification of units. Furthermore,

national accounts include estimates for illegal or black market activities. Finally, national accounts

balancing procedures cause inconsistencies with SBS.

Consistency with short-term statistics

No efforts have been invested so far in analysing the potential or actual data inconsistencies

between SBS and STS. However, the latter use SBS data (enterprise concept) for weighting

purposes.

Consistency with Labour Cost Survey (LCS) for wages and salaries per employee

If we consider only the population of enterprises employing at least 10 persons and only those

economic activities which are covered by both statistics at the same time, the figures are very

similar.

Only to be filled in if this quality report includes information on the Annex VIII

Please report on the inconsistencies between the total turnover reported for Annex VIII and the

36

corresponding data to be reported under the provisions of Annex I.

[Comment: as we do not know which countries will report on Annex VIII using this report, we

cannot include an overview with remaining inconsistencies in the data disseminated by Eurostat]

Not applicable.

III.2. Comparability

Definition

Comparability aims at measuring the impact of differences in applied statistical concepts and

measurement tools and procedures, when statistics are compared between geographical areas, non-

geographical domains, or over time.

III.2.1. Comparability over time

Length of time series is the period when the statistics have been compiled for the first time until the

latest reference year.

Length of comparable time series starts with the last break in time series and longs until the last

reference year. Please split into individual annexes if time series are different.

Indicator Period (yyyy – yyyy)

III.2.1.1. Length of time series 1995-2011

III.2.1.2. Length of comparable time series 2003-2009

III.2.1.3. In case II.2.1.1. is not equal to II.2.1.2., please, indicate the reasons or differences, in

concepts and methods of measurements for breaks in time series.

As from 2003, Luxembourg has for the first time delivered data series based on the enterprise

concept and on the KAU concept. Before 2003, only the KAU concept was available for Eurostat

data.

III.2.2. Geographical comparability: Coverage of target population

III.2.2.1. Please report on the inclusion of branches of foreign enterprises and the exclusion of the

results relating the activities abroad of enterprises of the reporting country.

Branches of foreign enterprises are identified as such in the business register. Consequently, they are

included if they are considered as being a resident unit. For the units in the survey, we try to obtain

as many data as needed to exclude foreign branches included in Luxembourgian legal units.

Please also refer to section II.2.4.2.

III.2.2.2. Are the statistical units used for compiling the SBS series in conformity with the

definitions in Regulation No 696/93?

Yes.

III.2.3. Geographical comparability: Other issues

III.2.3.1. Please report on any other issues (with the exception of differences in concepts and

definitions in the source data mentioned in paragraph II.I.3) that have an influence on the

37

comparability of the data of your country with that of the other EEA countries.

Not applicable.

38

IV. Timeliness and Punctuality

IV.1. Timeliness

Definition

Timeliness of information reflects the length of time between its availability and the event or

phenomenon it describes.

Please provide the key dates for the following actions

Action Deadline(s) MM/YY

a) data-collection, if any

If several data collections are organised, please indicate the

deadline of the last

February 2013

b) post-collection phase August 2013

c) dissemination in your country, if applicable n.a. (yet) for reference year 2011

IV.2. Punctuality

Definition

Punctuality refers to the time lag between the release date of data and the target date when it

should have been delivered, for instance, with reference to dates announced in some official

release calendar, laid down by Regulations or previously agreed among partners.

The punctuality of delivery date is calculated as actual date of data delivery – scheduled date of

transmission to Eurostat. It shows the delay (positive value) or advance (negative value) in calendar

days after the legal deadline (18 months after the end of the reference year fore final data; 10

months for preliminary data).

Country LU

R1A 39

R1B 39

R1C 39

R1E 39

R2A 39

R2B 39

R2C 39

R2D 39

R2F 39

R2H

R2I

R3A 39

R3B 39

R3C 39

R3D

39

R4A 39

R4B 39

R4C 39

R4D 39

R4F 39

R4G

R4H 39

R8A 39

R8B 39

R8E

R8F 39

IV.2.1. Please comment the delays (if any) of transmission to Eurostat as calculated by Eurostat,

e.g., the reasons for the late delivery and present the action plan for improving timeliness and

respecting transmission delays (if needed, you can split this question for the different annexes).

Comment on the delays

In absolute terms, STATEC does still not meet the deadlines set by the SBS regulatory framework.

Compared to the reference year 2010, the delays have been reduced and all data sets have been

sent at the same date. As already mentioned previously, the number of persons currently allocated

to the SBS regulation is insufficient to meet the official time table.

Action plan

- As of reference year 2013, we hope that the mandatory Luxembourg Balance Sheet Register

will available for massive date extraction. This would allow us to obtain electronic data as fast

as possible and to increase the overall SBS data quality, in particular for small businesses.

- We constantly keep our eyes on newly available administrative sources to minimise the

transmission delays as well as to maximise the SBS data quality.

40

V. Accessibility and Clarity

V.1. Accessibility

Definition

Accessibility refers to the physical conditions in which users can obtain data: where to go,

how to order, delivery time, clear pricing policy, convenient marketing conditions, availability of

micro and macro data, various formats (paper, files, CD_ROM, Internet etc.).

If the SBS data are not disseminated at national level by your Institute please proceed to

VI. Further comments.

If needed, this following question can be split for the different annexes.

V.1.1. Is the SBS data published at national level different from the data sent to Eurostat?

yes

no

V.1.2. If yes, please gives a brief description of the reasons for these differences.

Annexes I-IV

The differences between the SBS series published by Eurostat and by STATEC are as follows:

1) As already stated in section I.3.2.5., SBS data published on the STATEC website are based

on the KAU concept and not on the enterprise concept. For several economic activities

there is consequently a big difference. Given this choice of concept, any additional

breakdowns (e.g. size classes, client’s country of residence, etc.) available for the

enterprise concept are not published on our website because the SDC procedures have not

been performed for such breakdowns on a KAU level.

2) The degree of detail and the number of variables on the STATEC website is based on a

quality choice and does not correspond to the maximum NACE detail and number of

variables available.

3) The series on the Luxembourg website are regularly revised to keep them in line with the

national accounts series, which are also revised every year. However, the revised series are

only based on the KAU concept. No revised data are sent to Eurostat because that would

significantly increase the burden due to additional SDC procedures.

4) 2008 data series have so far only been published in reference to NACE Rev.1.1. The

reasons for this have been mentioned in section I.3.2.5.

All of the above is subject to change in the short run. The goal is to publish the data sent to

Eurostat on the national website, with all of the available dimensions and variables.

Annex VIII

41

The SBS Annex VIII data are not disseminated at national level because they only concern a

subset of enterprises (employing at least 20 persons) and a subset of services activities. More

particularly, confidentiality is an issue, given the small target populations.

V.1.3. Do you publish additional indicators together with the SBS at the national level?

yes

no

V.1.4. If yes, please give a brief description of the additional indicators published at the national

level.

Not applicable.

V.1.5. How do you disseminate SBS data?

Member

State

Paper/pdf Publications Electronic Publications

News

release

Statistical

yearbook

Thematic

publications

Internet-

Data base

CD/DVD-

Rom

Other (fax, e-

mail, etc.)

2011

data yes yes yes yes yes yes

Action

plan

2012

data

For 2011 data, please, mark with a cross where applicable.

For 2012 data, please, report any scheduled action plan specifying the implementation date.

V.1.6. Availability of paper publications in any foreign languages

[ X ] in English

[ X ] in the following other language(s) German, French

Please indicate links to your electronic publications on SBS

(click on the links here below)

LU SBS dynamic on-line tables

Annuaire du STATEC

Luxembourg in numbers

V.2. Clarity

Definition

Clarity refers to the data's information environment whether data are accompanied with

appropriate metadata, illustrations such as graphs and maps, and whether information on their

quality is also available.

V.2.1. Are statistical metadata available?

[ X ] available for paper publications

[ X ] available on the Website (electronic version)

[ ] no methodological explanations on data are disseminated

Please indicate links to your electronic publications of metadata

http://www.statistiques.public.lu/fr/methodologie/definitions/index.html

http://www.statistiques.public.lu/fr/methodologie/methodes/entreprises/Stat-struct/sse/index.html

http://www.statistiques.public.lu/stat/ReportFolders/ReportFolder.aspx?IF_Language=fra&MainTheme=4&FldrName=1&RFPath=42

http://www.statistiques.public.lu/stat/ReportFolders/ReportFolder.aspx?IF_Language=fra&MainTheme=4&FldrName=1&RFPath=42

http://www.statistiques.public.lu/fr/publications/series/annuaire-stat-lux/index.html

http://www.statistiques.public.lu/fr/publications/series/lux-chiffres-fr/index.html

http://www.statistiques.public.lu/fr/methodologie/definitions/index.html

http://www.statistiques.public.lu/fr/methodologie/methodes/entreprises/Stat-struct/sse/index.html

42

VI. Further Comments

Please provide further comments regarding SBS data quality which are not included in the parts

above (e.g. foreseeable changes in the methodology; etc.).

None.

Annex

Note: numbering in the annex is not contiguous: paragraphs are numbered corresponding to the

entry concerned in the quality report.

I. RELEVANCE Relevance is the degree to which statistics meet current and potential users' needs. It refers to whether

all statistics that are needed are produced and the extent to which concepts used (definitions,

classifications etc.) reflect user needs.

These questions need to be answered for the statistical concepts (like definitions of statistical units or

classifications used) and on the availability of required statistics (completeness). The questions as regards

the deviations from the statistical concept are covered under coherence and comparability, as they are

also needed in that frame.

Conformity of characteristics (variables) with the regulation can be considered either as a relevance

issue, an accuracy problem or a comparability item. We treat it in the accuracy chapter under II.1.3.2

where data production using administrative sources are dealt with.

I.1 COMPLETENESS

Completeness is the extent to which data are available – compared to what is requested according to the

SBS-Regulation.

In this quality report completeness is measured via availability ratios, namely as the delivered cells for a

certain level of detail or the total to the amount that was expected to be obtained.

I.1.2. Availability of characteristics and/or breakdowns required by the SBS-Regulation

In this question Member States shall describe the reasons why in some cases details are missing and to

include the measures that the Member States took or are planning to take in order to change the situation.

If none of the details are missing, the answer shall be left in blank or if the Member State foresees no

change in this situation.

I.1.3. Use of the quality flag 'Contribution to European totals only' (CETO-flag) foreseen in the SBS regulation

In order to minimise the burden on businesses and the costs to the national statistical authorities, the

Member States may mark data for use as a contribution to European totals only (CETO). Eurostat shall not

publish those data, nor shall Member States mark nationally published data with a CETO flag. The use of

the CETO flag shall be dependent on the individual Member State's share of the EU total of value added in

the business economy as follows:

43

(a) Germany, France, Italy, and United Kingdom: CETO-flagged data may be sent for NACE Rev. 2

class level and for the size class breakdown at NACE Rev. 2 group level. No more than 15 % of the cells

may be marked.

(b) Belgium, Denmark, Ireland, Greece, Spain, the Netherlands, Austria, Poland, Portugal, Finland and

Sweden: CETO-flagged data may be sent for NACE Rev. 2 class level and for the size class breakdown

at NACE Rev. 2 group level. No more than 25 % of the cells may be marked. In addition, if, in any of

these Member States, the share of a NACE Rev. 2 class or of a size class of NACE Rev. 2 group is less

than 0,1 % of the business economy of the Member State concerned, those data may additionally be sent

as CETO-flagged.

(c) Bulgaria, Czech Republic, Estonia, Cyprus, Latvia, Lithuania, Luxembourg, Hungary, Malta,

Romania, Slovenia and Slovakia: CETO-flagged data may be sent for NACE Rev. 2 group and class

level and for the size class breakdown at NACE Rev. 2 group level. No more than 25 % of the cells at

group level may be marked.

The measures designed to amend non-essential elements of this Regulation, inter alia, by supplementing

it, relating to reviewing the rules for the CETO flag and grouping the Member States, shall be adopted in

accordance with the regulatory procedure with scrutiny referred to in Article 12(3) by 29 April 2013 and

every five years thereafter.

As Eurostat will present a comment on how he rules of SBS are respected, Member States can update and

check if they agree with Eurostat’s comment.

I.1.4. Application of 1%-rules foreseen in the SBS regulation

This question aims to identify the situations where Member States despite of the 1%-rule are foreseen to

provide data (or because are able to obtain the data concerned, e.g. using administrative sources), and

situations where the Member States expects that this rule will apply in a new branch.

I.1.5. Derogations from the provisions of the SBS Regulation

Some Member States were granted derogations. Sometimes however, the cause for requesting these

derogations applies no longer and the Member States is able to provide the “missing data”. In this

question, information on this is expected.

This question can be let in blank if you do not have any derogation or if you do not foresee any change in

the present situation.

I.3. Monitoring user interest

The user interest is an important indication about relevance. The aim of the group of questions is to

quantify this interest. It is split in three groups: Data dissemination, user consultation and user

satisfaction.

II Accuracy

44

Definition

Accuracy in the general statistical sense means the closeness of computations or estimates to the

exact or true values. The difference between the two values is the error.

As any statement on accuracy requires more information on the methods used for compiling the

structural business statistics, most of the methodology description is filed under this chapter.

II.1. Concepts and sources

Member states may use different data collection systems to comply with the SBS regulation. We

distinguish:

the use of sample surveys4 (with administrative information present in the sampling frame and

only used for stratification purpose),

the use of a sample survey combined with use of administrative information (whereby this

information is also used in the editing and imputation process and possibly in the inference process

through calibration),

the use of administrative data as the only source of information and all characteristics are

calculated from the administrative records using or not a model based approach (in this case no

survey information is used for a significant part of the strata covered).

As such it is necessary only to fill in the column concerned:

Column 1 if your data collection is made by the way of survey(s);

the column 2 if your data collection is a combination of surveys and the use of

administrative data (even if the administrative data is used for calibration or imputation);

the column 3 is expected to be filled by those Member States that only use

administrative data in the data collection for SBS.

II.I.3. Definitions and concepts used in the survey / administrative source

In this question the Member States self-assessment is expected. If you identify any difference between

the definitions please describe it in general – if you consider it as an important difference (in case your

assessment is lower than good, please detail it as much as possible). Please split your answer between

the differences for survey and the differences for administrative source.

Sampling error and non-sampling errors

When describing accuracy it is customary to distinguish between sampling and non-sampling errors.

Sampling errors can be described by an estimate of the uncertainty due to sampling, quantified in the

coefficient of variation (COV). Calculation of this COV is dealt with in II.1.4.

Non-sampling errors are more varied in nature and harder to quantify. We distinguish:

Frame errors (under-coverage / over-coverage)

Reporting or measurement errors including non-response

Processing errors

Model assumption errors

Among those, reporting and measurement errors are focussed on non-response under heading II.2.1.

Coverage and processing errors are included in this report under the headings II.2.4 and II.2.5.

4 Exhaustive interrogation of all enterprises is considered a type of survey.

45

II.1.4. Sampling error: coefficient of variation

valueestimated

variancesampling theof estimateCV

Largely adopted as an estimate for the sampling variance is the Horvitz-Thompson variance,

in which wh is the weight of the elements in stratum h. Since strata are independent from one

another, the variance over the strata concerned is the sum of the variances. Remark that there is no

variance contribution in an exhaustively surveyed stratum (of large enterprises), provided that

there is no non-response.

If there is no sampling (e.g. a model based approach is used for mass imputation using

administrative records), sampling variance is zero by definition.

A 'Jacknife' method can be applied to estimate a variance in case an imputation procedure was

used for part or all of the elements in the stratum:

Suppose ni elements were imputed in a stratum of nt enterprises. We can then calculate the

variance among ni+1 alternate elements consisting of the total estimate based on all elements and

the ni estimates based on nt-1 enterprises, whereby one imputed enterprise was dropped and

consequently a weight factor nt/ (nt-1) was applied.

V = Var(T, T1,…Ti,…Tni) with i=1…ni and Ti = nt/ (nt-1) yj (ji)

In case the data collection is exclusively based on an administrative source, the 'Jacknife' method

may still be applied. In this case, all nt enterprises are considered imputation results (t=i). In this

case, the objective value of this calculation is questionable: it reflects the intra-strata variability

and quietly assumes that errors due to the model based approach are proportional to this.

II.2.1 Non-response

The weighted unit non-response rate shows how well the data collection worked for the population

of interest. The non-response rate could be weighted by the weights that take into account the most

relevant characteristic. Weights are set at the sampling stage, based on a size criterion available in

the business register. As weighting factor for the non-response calculation, we prefer the 'number

of persons employed' as it is always positive. Turnover can also be used if this is the main

quantitative characteristics available for all enterprises.

II.2.1.1. Description of estimation method for taking unit non-response into account

a) Description the methods used for taking unit non-response into account

This question deals with how MS deal with non-response: Is imputation used for non-respondent

enterprises? If so is there a size class threshold: i.e. is imputation used for all strata? In which case

reweighting is used? Is there any calibration procedure?

2

1

)(.)1( h

n

i

ih

h

h

h

h

h

ywwVV

46

b) Measures taken or envisaged for minimising non-response.

In this question Member States shall inform about all kind of measures that have been taken, like

fines, personal contacts with non-response units in order to motivate them to answer. On the other

hand some other kind of measures can be considered like the use of partial data obtain by the way

other surveys, for example.

II.2.2. Bias

A measurement bias can be due to an inappropriate selection scheme or to a selective non-

response. Whereas the former could be considered a sampling error (but probably due to some

erroneous entry in the sampling frame), the latter occurs in the production process and is not

related to the sampling stage. In a purely administrative survey, a missing or incomplete

administrative record could also be considered a type of non-response. It remains to be seen

whether it is of a complete random nature or not.

We will treat 'bias' as an outcome of non-response, leaving it out of the section which is split

according to the data collection method. It is treated in section II.2.2

II.2.3. Imputation

Imputation as a procedure in data production does not fit on the axis sampling/non sampling

errors. In a survey, imputation can be applied to 'recover' cases of non-response where good

approximations can be made, namely based on administrative records. For medium sized and

larger enterprises, accounting information quite often contains enough data fields that are normally

used for filling out a questionnaire and which yield very good estimates upon imputation.

Imputation is also an alternative for calibration in case individual administrative data are available

for all of the elements in the stratum (both sampled and non-sampled). For example, the

information available in the Business Register that was used for stratification is the minimal set

that may successfully be used for imputation. Mass imputation on the non sampled enterprises

then replaces inference.

Finally in any data collection from administrative sources only, the imputation procedure is based

on a model, translating administrative data into statistical characteristics.

In the first case, the imputation normally leads to a substantially reduced coefficient of variation:

without imputation the variance can be calculated using the weights corrected for non-response.

After imputation of the medium and large enterprises, the 'jacknife' variance estimate can be used.

It is then possible to assess the reduction of the coefficient of variation due to the imputation

procedure.

II.2.3.2 Evaluation of the impact of imputation

This question only refers to the imputation procedure to recover non-response, not to the

imputation of corrected data after the editing has reveal (severe) raw data errors.

Please assess the reduction of the coefficient of variation obtained through imputation as compared

to the case where only reweighting is applied. (Please use the Jacknife method described in this

annex under II.2.3) Only one single comparison needs being made by annex, preferably on the

characteristics turnover (12 11 0) and for the largest section it includes: e.g. for Nace Rev 2 C

manufacturing industry.

47

II.4. Assessment of revisions

Revisions and differences between preliminary and final data make use of the same formula. It has

the advantage that (for positive definite characteristics) the denominator cannot be zero, hence

there are no singularities possible (even with a 'true zero' revised to a non-zero entry). Moreover

contrary to percentage changes, it is symmetrical. The factor 2 makes the differences

asymptotically (meaning for very small differences) equal to percentage changes.

The formula then mentally translates as 'a weighted sum of squares of percentage changes'.

II.4.1. Preliminary data versus final data

Member States shall present, based on its self-assessment, what they consider as the main reasons

for the differences between the preliminary data and the final data, for example Member States can

consider that the answering rate in the moment of preliminary data is too low or some of the

administrative data used is not yet available.

If, based on your assessment you do consider the differences insignificant, you do not need to fill

his answer.

II.4.2. Average size of revision

In this question, it is very important to get information on what were the reasons why the data has

been revised – for example: an error was detected by a user; after getting a confirmation from a

unit, NSI was informed that the data was wrong; when editing data for year n it was detected an

error on year n-1 data, etc.

III.2. Comparability

II.2.1. Comparability over time

It is very important in the context of a quality report to obtain information on the time length of the

series and to identify the reasons why in some series the time length is shorter.

Documents

QQUUAALLIITTYY RR - gouvernement...4D 0 0 0 0 0 4F 4G 4H 8A 12.5 8B 0 8E 8F 0 9 I.1.4. Application of 1%-rules foreseen in the SBS regulation Annex II Variables 21110, 21120 and 21140