Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
QQUUAALLIITTYY RREEPPOORRTT
SSTTRRUUCCTTUURRAALL BBUUSSIINNEESSSS
SSTTAATTIISSTTIICCSS RREEGGUULLAATTIIOONN
AANNNNEEXXEESS II TTOO IIVV && VVIIIIII
MMEEMMBBEERR SSTTAATTEE:: LLUUXXEEMMBBOOUURRGG
RREEFFEERREENNCCEE YYEEAARR:: 22001111
RREEPPOORRTT IISSSSUUEEDD:: SSBBSS -- SSEERRVVIICCEESS,, IINNDDUUSSTTRRIIEESS,, TTRRAADDEE
AANNDD CCOONNSSTTRRUUCCTTIIOONNSS
COMPILING INSTITUTION: STATEC, LUXEMBOURG
CONTACT DETAILS:
NAME: Georges Zangerlé
E-MAIL ADDRESS: [email protected]
TELEPHONE NUMBER: +352 247-84242
POSTAL ADDRESS: 13, rue Erasme
L-1468 Luxembourg
Please answer in the grey-shaded cells or, when necessary, update the information given.
Please check the pre-filled cells marked in green.
There is no limit for the replies to open questions; the row height will be automatically adjusted to your text.
Please note that, if you consider that any of your answer should be treated as confidential, the answer
needs being labelled “CONFIDENTIAL” explicitly.
2
Contact persons
Annex I:
RESPONSIBLE INSTITUTION: STATEC, LUXEMBOURG
CONTACT DETAILS:
NAME: Georges Zangerlé
E-MAIL ADDRESS: [email protected]
TELEPHONE NUMBER: +352 247-84242
Annex II:
RESPONSIBLE INSTITUTION: STATEC, LUXEMBOURG
CONTACT DETAILS:
NAME: Georges Zangerlé
E-MAIL ADDRESS: [email protected]
TELEPHONE NUMBER: +352 247-84242
Annex III:
RESPONSIBLE INSTITUTION: STATEC, LUXEMBOURG
CONTACT DETAILS:
NAME: Georges Zangerlé
E-MAIL ADDRESS: [email protected]
TELEPHONE NUMBER: +352 247-84242
Annex IV:
RESPONSIBLE INSTITUTION: STATEC, LUXEMBOURG
CONTACT DETAILS:
NAME: Georges Zangerlé
E-MAIL ADDRESS: [email protected]
TELEPHONE NUMBER: +352 247-84242
3
Annex VIII:
RESPONSIBLE INSTITUTION: STATEC, LUXEMBOURG
CONTACT DETAILS:
NAME: Georges Zangerlé
E-MAIL ADDRESS: [email protected]
TELEPHONE NUMBER: +352 247-84242
4
Contents:
CONTACT PERSONS .................................................................................................................................. 2
I. RELEVANCE .......................................................................................................................................... 6
I.1. COMPLETENESS ........................................................................................................................................ 6 I.1.1. DATA AVAILABILITY ............................................................................................................................... 6 THE AVAILABILITY RATE IS AS FOLLOWS (PRE-FILLED BY EUROSTAT): .......................................................... 6 I.1.2. AVAILABILITY OF CHARACTERISTICS AND/OR BREAKDOWNS REQUIRED BY THE SBS-REGULATION ... 7 I.1.3. USE OF THE QUALITY FLAG 'CONTRIBUTION TO EUROPEAN TOTALS ONLY' (CETO-FLAG) FORESEEN IN
THE SBS REGULATION (FOR MORE EXPLANATION SEE ANNEX) ...................................................................... 8 I.1.4. APPLICATION OF 1%-RULES FORESEEN IN THE SBS REGULATION ......................................................... 9 I.1.5. DEROGATIONS FROM THE PROVISIONS OF THE SBS REGULATION ......................................................... 9 I.2. CONFIDENTIALITY ................................................................................................................................... 9 I.3. MONITORING USER INTEREST ............................................................................................................... 12 I.3.1. DATA DISSEMINATION REPORT ............................................................................................................. 12 I.3.2. CONSULTATION OF YOUR MAIN USERS (TARGET GROUP: NARROW SCOPE, E.G. NATIONAL ACCOUNTS,
CENTRAL BANKS, ECONOMY DEPARTMENT ETC.) ......................................................................................... 13 I.3.3. USERS' SATISFACTION (BROADER SCOPE) ............................................................................................. 14
II. ACCURACY AND RELIABILITY ...................................................................................................... 15
II.1. CONCEPTS AND SOURCES ..................................................................................................................... 15 II.2. NON-SAMPLING ERRORS....................................................................................................................... 24 II.2.1. UNIT NON-RESPONSE ........................................................................................................................... 24 II.2.2. BIAS ..................................................................................................................................................... 26 II.2.3 IMPUTATION ......................................................................................................................................... 27 II.2.4 COVERAGE ERRORS .............................................................................................................................. 28 II.2.5. PROCESSING ERRORS ........................................................................................................................... 29 II.3. INFERENCE (GROSSING UP) .................................................................................................................. 29 II.4. ASSESSMENT OF REVISIONS ................................................................................................................. 30 II.4.1. PRELIMINARY DATA VERSUS FINAL DATA ........................................................................................... 30 II.4.2. AVERAGE SIZE OF REVISION ................................................................................................................ 31 II.4.3. REVISION POLICY ................................................................................................................................. 34
III. COHERENCE AND COMPARABILITY ....................................................................................... 35
III.1. COHERENCE ........................................................................................................................................ 35 III.2. COMPARABILITY ................................................................................................................................. 36 III.2.1. COMPARABILITY OVER TIME .............................................................................................................. 36 III.2.2. GEOGRAPHICAL COMPARABILITY: COVERAGE OF TARGET POPULATION .......................................... 36 III.2.3. GEOGRAPHICAL COMPARABILITY: OTHER ISSUES ............................................................................. 36
IV. TIMELINESS AND PUNCTUALITY ................................................................................................ 38
IV.1. TIMELINESS ......................................................................................................................................... 38 IV.2. PUNCTUALITY ...................................................................................................................................... 38
5
V. ACCESSIBILITY AND CLARITY ....................................................................................................... 40
V.1. ACCESSIBILITY ...................................................................................................................................... 40 V.2. CLARITY ............................................................................................................................................. 41
VI. FURTHER COMMENTS ..................................................................................................................... 42
ANNEX.......................................................................................................................................................... 42
6
I. Relevance
Definition
Relevance is the degree to which statistical outputs meet current and potential user needs. It
depends on whether all the statistics that are needed are produced and the extent to which
concepts used (definitions, classifications etc.,) reflect user needs.
I.1. Completeness
The completeness is the extent to which data are available compared with the requirements in
terms of characteristic, geographical and activity breakdown, as specified in the SBS Regulation1.
I.1.1. Data availability
The availability rate is as follows (pre-filled by Eurostat):
100applicable fields ofnumber
provided cells ofnumber
2011 LU
1A 92
1B 96
1C 100
1E 93
2A 95
2B 100
2C 100
2D 100
2F 100
2H
2I
3A 93
3B 100
3C 100
3D 0
4A 93
4B 100
4C 100
4D 100
4F
4G
4H
Total 95
1 SBS Regulation: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2008:097:0013:0059:EN:PDF:
7
8A 100
8B 100
8E
8F 100
Total 100
SPECIFICATION OF MISSING DETAIL: (TABLE TO BE FILLED IN BY EUROSTAT)
a) characteristics 11AA,,22AA,, 33AA,,11EE,,44AA::
VV1166114400
22AA:: VV1188111100
nothing missing
b) activities missing in series HH550022 nothing missing
c) breakdown in series 11BB:: HH550011,,HH550022,,
HH550033,, HH550044 nothing missing
d) missing series 33DD nothing missing
I.1.2. Availability of characteristics and/or breakdowns required by the SBS-Regulation
Please comment on the rates of available statistics calculated by Eurostat and explain the reasons
why any characteristics or breakdowns required by the SBS Regulation are not available (e.g.
derogations) and describe your plans for improvement in the future.
ad a) Missing characteristics
- 16 14 0: the “Number of employees in full time equivalent units” is not available in survey or
administrative data ;
- 18 11 0: turnover by 3-digit NACE activity breakdown is not yet available due to the still work-
in-progress implementation of the CPA version 2008.
ad b) Breakdown in series
The activities missing in series 1A and 1B are related to subdivisions of water transport, more
precisely sea and coastal water transport. The figures are based on an economic branch level
estimation obtained yearly from our national accounts department. At the moment of the
transmission of SBS data for the reference year 2009, the source was still only available for NACE
Rev.1.1. Given that NACE Rev.2 is more detailed for sea and coastal water transport activities, the
available NACE Rev.1.1 estimate could not be clearly allocated to either passenger or freight.
Nonetheless, the data are available for the aggregate of passenger and freight.
ad c) Missing series
Luxembourg is completely exempted from providing data series 3D (see 1%-rule derogation).
Areas for improvement
- Work-in-progress: Water transport activities estimates in NACE Rev.2 are not yet available.
Water transport is a complex methodological issue in Luxembourg, due to the paradox of the
country not having a direct access to the sea and the existence of many incorporated entities in
this activity branch. During 2013, a solution was being worked out jointly with national accounts
and business register. Update 2014: However, due to a shortage of resources, this area has been
put on hold due to other priorities.
8
- Work-in-progress: The implementation of CPA version 2008 (which impacts variable 18 11 0) is
still on-going because it requires close consultation with national accounts to avoid any major
inconsistencies in the future. A solution is currently being worked out jointly with national
accounts and should be achieved for reference year 2013 ;
- Work-in-progress: Data collection for variable 16 14 0 is not foreseen due to administrative
burden and reliability considerations. As a result of a national task force, we have started
exploring the possibility to extract this variable from an administrative source (i.e. “number of
hours worked” combined with “number of employees”. We seek to implement this variable for
reference year 2012.
I.1.3. Use of the quality flag 'Contribution to European totals only' (CETO-flag) foreseen in the
SBS regulation (For more explanation see annex)
100applicable fields ofnumber
cells flagged-CETO ofnumber
2011 LU LU LU LU LU
TOTAL Section Division Group Class
1A 2.38 0 3.83 2.98 1.8
1B 0.41 0 0.53 0.41
1C 2.94 0 3.57
1D 0
1E 0
2A 0 0 0 0 0
2B 0 0 0 0
2C 0 0 0
2D 0 0 0 0 0
2F 0 0 0 0 0
2H
2I
3A 0 0 0 0 0
3B 0 0 0 0
3C 0 0 0 0
3D
4A 0 0 0 0 0
4B 0 0 0 0
4C 0 0 0
4D 0 0 0 0 0
4F
4G
4H
8A 12.5
8B 0
8E
8F 0
9
I.1.4. Application of 1%-rules foreseen in the SBS regulation
Annex II Variables 21110, 21120 and 21140 Annex III All variables that are not included in Annex I Annex IV All variables that are not included in Annex I
Please indicate whether you intend to provide in the next reference year any of the data for
which you have applied the 1%-rule in the past.
We have decided to provide Eurostat with data available even if Luxembourg had a 1%-rule
derogation, under the condition that such a data delivery would not negatively impact the
completeness score for Luxembourg.
Please indicate whether you intend to discontinue the provision of the data for which the 1%-
rule could be applied (Eurostat will check the validity of this request and will confirm or not).
See here above
I.1.5. Derogations from the provisions of the SBS Regulation
Please indicate whether you intend to provide data for which you have granted derogation
earlier than the timing foreseen in the SBS Regulation.
Currently, there are no resources to collect those data.
I.2. Confidentiality
I.2.1. The rate of confidential cells is as follows (pre-filled by Eurostat):
100provided cells ofNumber
cells alconfidenti ofNumber
2011
LU LU LU LU LU
TOTAL Section Division Group Class
1A 33 0 17 31 40
1B 31 16 32 31
1C 10 0 12
1E 39
2A 42 0 42 45 42
10
2B 31 53 35 29
2C 26 0 29
2D 38 0 37 39 38
2F 44 0 44 47 44
2H
2I
3A 26 0 0 35 24
3B 31 21 21 33
3C 21 0 0 25
3D
4A 16 0 0 0 25
4B 21 0 0 30
4C 0 0 0
4D 17 0 0 0 27
4F
4G
4H
TOTAL 34 21 31 33 38
8A 61
8B 70
8E
8F 56
Total 62
N.B. The number of confidential cells is calculated at the detailed level of the activity breakdown
required by the Regulation (EC) 251/2009.
I.2.2. Please describe your confidentiality rules (primary and secondary). Note: The described
confidentiality rules will not be published by ESTAT.
General remarks
The team involved in the SDC procedures has been trained via the 3-day course on SDC offered
in the Eurostat training programme (ESTP) or in the CENEX-SDC framework, as well as via the
2-day course on SDC in the area of SBS. One member of the staff has represented Luxembourg
in the ‘ESSNet on SDC 2008-2009’ project as a tau-Argus software tester and SDC handbook
reviewer.
Technical considerations
The basis for any suppression pattern (addressing both primary and secondary confidentiality) is
the software package tau-Argus. However, the process also involves manual procedures, i.e.
checking the tau-Argus output, comparing the historical data series and addressing linked table
disclosure risks (see secondary confidentiality for further details).
The statistical disclosure control procedures are not performed for every variable individually
but only for a single shadow variable, i.e. "Turnover", more precisely the SBS variable 12 11 0
but including royalties. If a given cell is confidential for that variable (no matter if primary or
secondary), the same cell will be suppressed for all the other available variables (e.g. number of
persons employed, investments, personnel costs, etc.).
11
This approach has lead us to check the confidentiality for the following reference series:
- preliminary enterprise data broken down by economic activity ;
- final enterprise data broken down by economic activity ;
- enterprise data broken down by economic activity and by employment size class ;
- enterprise data broken down by special aggregates ;
- kind-of-activity (KAU) data broken down by economic activity ;
- (if available) local unit data broken down by economic activity ;
- enterprise data broken down by economic activity and by the client’s country of residence
(annex VIII).
This strongly contributes to minimising both SBS transmission delays and otherwise unsolvable
linked table or variable-specific disclosure risks.
Primary confidentiality rules
As a general principle, the NACE section level data, if not broken down by any other spanning
variable (e.g. size class), are not considered as confidential, except if the section has a trivial
breakdown (e.g. section L of NACE Rev.2). There can be other case-by-case exceptions to this
principle.
a) Sensitivity rule
We apply the (n,k)-dominance rule, i.e. a cell is suppressed if n units separately or jointly
dominate the total value of a cell by at least k% .
b) Minimum frequency rule
For any cells that are left after applying the sensitivity rule, a minimum frequency is applied. A
cell is suppressed if there are less than n units in a given cell.
Secondary confidentiality rules
The secondary suppression is calculated by tau-Argus using the ‘Modular’ algorithm. Manual
suppressions or cost adjustments are performed using the tau-Argus ‘a priori’ file facility.
a) Secondary suppression within a table
- A cell is suppressed for secondary confidentiality if n units dominate jointly or separately the
confidential total value by at least k% ;
- special attention is paid to the impact of singletons, a risk which is in most cases directly
addressed by the tau-Argus Modular algorithm ;
- tau-Argus is set to minimise the cost when determining the secondary suppressed cells.
However, we also want to provide the user with useful data, whether it is in terms of
interpretation and/or availability of time series. Consequently, the cost minimisation can be
overridden for economic and/or historical reasons.
b) Secondary suppression due to linked tables disclosure risks
- historical disclosure: in conformity with the SDC handbook, we ensure that no historical cell
is compromised by disclosing the same cell for the current reference year. As long as there is
any significant link with prior year data, a cell may not be disclosed for the current reference
year.
- links to any other series based on different unit concepts but sharing the same breakdown:
data are first checked on an enterprise level. Then the data for other concepts (KAU, local
units) are checked - the cells for which there is any significant link with the enterprise data
series inherit the flags from the latter. Sometimes, a flag of a KAU or local unit series has an
12
impact on the enterprise series. Therefore, we cannot send the enterprise data series before
the other series.
I.2.3. Have you taken any measures to reduce the number of confidential cell?
[ X ] Yes
[ ] No
If yes, please describe them briefly: We streamlined the way we treat periodic disclosure risk.
Instead of more or less copying and pasting the confidentiality pattern, we now only leave those
cells closed which are necessary to protect a primarily confidential cell in the previous year.
The same principle is applied between the series using different statistical units.
I.2.4. How do you evaluate the impact of the applied measured to reduce the number of
confidential cells?
very good good satisfactory poor very poor
I.3. Monitoring user interest
If the SBS data of Annexes I to IV are not disseminated at national level by your NSI, please
answer only to the questions 1.3.2.5.
I.3.1. Data dissemination report
1.3.1.1. Does the dissemination/publishing unit in your NSI keep track of the number of hard-
copies of the SBS publications sent or sold?2
[ X ] Yes
If yes, how many publications were sent/sold from the most recent hard-copy publications (or in
the last 12 months…): 824 statistical year books
[ ] No
[ ] No (hardcopies are not printed)
1.3.1.2. Does the dissemination/publishing unit in your NSI keep track of the number of
downloaded on-line SBS publications?
[ X ] Yes
If yes, how many downloaded publications/data were sent/sold for the most recent SBS
publications/data (or in the last 12 months…): ….
[ ] No
[ ] No (publications are not available on-line)
1.3.1.3. Does the dissemination/publishing unit in your institute keep track of the number of
downloaded data from the on-line databases?
[ X ] Yes
2 The ESTAT dissemination unit surveys the language policy in national publications: use of native language, use of
English, use of any other language. This information is not specifically targeting SBS.
13
If yes, how many downloads of SBS data were made last year: 3982 clicks (12 months period)
[ ] No
[ ] No (data are not available in on-line databases)
1.3.1.4. Do you or the dissemination/publishing unit in your NSI provide any information which is
not available in the published publications and in the published on-line databases?
[ ] Yes, to the public authorities
[ ] Yes, to the public authorities and a limited set of registered main users
[ ] Yes, to everyone with a specific request
[X] No
I.3.2. Consultation of your main users (target group: narrow scope, e.g. National Accounts,
Central Banks, Economy department etc.)
I.3.2.1. Has your unit regular consultations with some of your main users?
[ X ] Yes
[ ] No
I.3.2.2. If yes, could you give a brief description of your main users and their needs (by main
groups of users) Internal or external users?
External users: public and private research institutions, marketing companies, national central
bank, lobby and policy groups, private users, university students, academic users and others.
External users are often interested in the variables “number of [units]”, “turnover”, “value-added”
and “employment” by economic activity as well as by employment size class. Sometimes,
concentrations ratios (e.g. market share of top 5 companies, etc.) are requested but due to
disclosure issues, these are not provided.
Internal users: national annual accounts, CIS-R&D statistics, ICT statistics, short-term statistics,
foreign direct investment, internal research department.
Internal users need micro level data. SBS micro level data are mainly used as a direct data source
for the production of other statistics (e.g. national annual accounts including the financial account,
ICT, CIS-R&D) or to calculate index weights (e.g. short-term statistics). Other uses are quality
checks, use of the data for sampling purposes, etc.
Internal users and annex VIII
The geographical dimension in the national SBS survey is also a major national source for national
accounts, not only for services companies but for production of services in general. The same
source is used to compile the annex VIII series.
I.3.2.3. If SBS data of Annexes I to IV are published at national level, could you please give a
brief description of the general opinion of the main users about the quality of the data?
External users
While the external users are generally satisfied with the quality of the data, there are a few causes
for potential disappointment:
- cells unavailable due to confidentiality in case of very detailed / specific data needs ;
14
- data not available due to unrealistic timing expectations, given that national SBS data for the
latest reference year are available only after t+21 or t+22 months – a few users seem to expect
results at a t+12 months frequency ;
- (rare) confusions in case of annual revisions of historical SBS data, which are necessary to
remain in line with national accounts data.
Internal users
Not applicable because internal users use SBS micro data.
I.3.2.4. If SBS data of Annexes I to IV are published at national level, does the consultation with
the key users reveal any specific unfulfilled need in addition to the requirements of the
Annexes I to IV of the SBS Regulation? If yes, please specify.
Internal users become more and more interested in balance sheet data. External users occasionally
request concentration indicators (e.g. market share of the top 5 companies, etc.), but those requests
remain unfulfilled.
(Not to be filled in if questions 1.3.2.1 to 1.3.2.4 have been answered)
I.3.2.5. If the SBS data as required for the Annexes I to IV of the SBS Regulation are not
published as such at national level, would you please indicate why the data published by your
NSI are more suitable for the users' needs?
Annexes I-IV
NACE Rev.2 series are currently being prepared for dissemination at national level. The series
will start at reference year 2005 until 2011. Thereupon, Luxembourg will send revised data series
to Eurostat for the said reference years.
Annex VIII
We do not currently publish data available in Annex VIII, given that the detailed breakdowns of
data cause many cells to be confidential.
I.3.3. Users' satisfaction (broader scope)
I.3.3.1. Have you organised a punctual or a regular survey related to the users' satisfaction
regarding the availability of your data for Annexes I to IV of the SBS Regulation in your country?
[ ] Yes
[ X ] No
I.3.3.2. If you have organised a survey as such, to what extent the users' needs were fulfilled by the
available data and if they are relevant for all of them?
not applicable
15
II. Accuracy and reliability
Definition
Accuracy of statistical outputs in the general statistical sense is the degree of closeness of
estimates to the true values.
If a different methodology is applied for different activity groups (or by Annex of the SBS
Regulation) this part of the quality report may be copied for each of the different
methodologies. Please indicate clearly which activities or Annex of the SBS Regulation the
information is referring to.
II.1. Concepts and sources
Member states may use different data collection systems to comply with the SBS regulation.
It might be:
the use of sample surveys3 (with administrative information present in the sampling frame
and only used for stratification purpose),
the use of a sample survey combined with use of administrative information (whereby this
information is also used in the editing and imputation process and possibly in the inference
process through calibration),
the use of administrative data as the only source of information and all characteristics are
calculated from the administrative records using or not a model based approach (in this case
no survey information is used for a significant part of the strata covered).
It is necessary only to fill in the concerned column.
If several surveys are used in order to compile different SBS characteristics, the table should be
copied as many times as there are survey.
3 Exhaustive interrogation of all enterprises is considered a type of survey.
Methodology 1: Sample survey Methodology 2: Sample survey combined
with administrative information
Methodology 3: Administrative
information
II.1.1. Description of source
a) Survey
1. Please describe the sample design in particular:
(If several surveys are organised to collect data, please
describe separately, for each of them, the sample
design).
- type of sample design
stratified
multistage
clustered
stratified
multistage
clustered
- stratification criteria
Activity
Employment size class
Region
Other (please specify )
Activity
Employment size class
Region
Other (please specify turnover )
- selection schemes (sampling rates)
Using number of employees and turnover
thresholds, the target population, obtained
through the business register, is divided into
two parts:
- any legal units employing either more
than 45 employees or having declared a
turnover excluding VAT of more than
7 million EUR per annum are selected
every year. Any legal units linked to
the aforementioned units (e.g. group
and enterprise links) are also selected
every year ;
- the other legal units are selected using a
stratified random probability sampling
design. Units in strata consisting of
only one unit are always selected.
17
Furthermore, we apply a rotation
principle to ensure that a unit can only
be surveyed at maximum once every
three years – this procedure helps to
reduce the administrative burden for
the smaller entities. The following
years, these units can be satisfactorily
imputed using cold-deck ratio
imputation.
- any possible threshold values
please refer to “selection schemes (sampling
rates)”
- the effective sample size
- To cover the needs of national accounts
and SBS: between 3000 and 3500 legal
units ;
- To cover the needs of SBS only:
between 2000 and 2500 legal units.
- If the reference period differs from the calendar
year is there a correction to bring it in accordance
with the reference period? For SBS (calendar
year).
No.
For each company with a different financial
year than the calendar year, we take a
decision regarding the SBS reference
period. Normally, this decision is taken in
such a way that at least 6 months of the
financial year for a given company are
included in the corresponding SBS
reference period. Once that decision is
taken, it is maintained every subsequent
year for comparability reasons.
- Additional or specific information should be added
in the column 'Methodology2 ' if the suggested
options are not feasible with the used concepts in
your country of compiling SBS data.
The stratification used in the sample design
does not take into account the size classes
defined in the SBS regulation. However,
mass imputation procedures exist on a
micro data level, so that data can be broken
down according to any spanning variable
defined a posteriori.
While the sample usually covers only
around 8% of the target population in terms
of number of units, it covers around 80% of
the target population in terms of turnover.
18
b) Administrative source
Please describe the administrative sources:
The used administrative sources:
- Business Register
- Social Security
- Value-added tax (VAT)
- Please enumerate the characteristics directly
available or with a good proxy in the administrative
source.
- NACE code
- National accounts institutional sector
code
- Employment
- Wages
- Turnover
- Please enumerate the characteristics for which a
model based estimate is used.
- Employment
- Turnover
- Is your unit able to check the plausibility of those
data, namely by contacting directly the units?
No.
However, we check the plausibility of those
data for every unit in the sample and on a
case-by-case basis with annual business
accounts (if available at the Register of
Commerce).
The extent to which the administrative source are
used:
[ X ] : data source, basic data for some
characteristics
[ X ] : data source for imputation in case of
non-response.
[ X ] : data source for imputation, for strata
not covered by the survey
[ X ] : data source for 'mass imputation':
(imputation of units not selected in the
sample)
[ ] : data source for calibration (calage sur
marge) to optimize inference from survey
results
- What kind of administrative data do you access?
[ X ] : micro data
[ ] : aggregated data
[ ] : for the entire population
(global data)
[ ] : micro data
[ ] : aggregated data
[ ] : for the entire population
(global data)
19
[ ] : by each (or some) economic
breakdown class
[ ] : for a part of the population
(sole proprietor, companies, etc.)
[ ] : other (please specify):
___________________________
[ ] : by each (or some)
economic breakdown class
[ ] : for a part of the population
(sole proprietor, companies, etc.)
[ ] : other (please specify):
___________________________
How do you assess the frequency to which the used
administrative data sources are updated?
VG * G S P VP
VG G S P VP
Are the administrative data subject to several
revisions with (increasing) degree of completeness?
[ X ] yes
[ ] no
[ ] yes
[ ] no
II.1.2.Relation between reporting units and the
legal units / enterprise (statistical unit):
Which is the relation between the reporting unit for
the survey/administrative data and the enterprise?
In the SBS survey, the reporting unit is the
legal unit. However, for a very few legal
units, the data are collected separately for
the economic subdivisions of the legal unit.
Moreover, local unit data are collected at
legal unit level.
In the administrative sources, the reporting
unit is the legal unit.
Finally, the statistical unit “enterprise” can
be a legal unit or a group of legal units.
Enterprises which are formed by a group of
legal units are not the general case in
Luxembourg but their occurrence is
significant, in particular due to many legal
units specialized in ancillary activities for
other legal units belonging to the same
enterprise group.
II.1.3. Definitions and concepts used in the
survey/administrative sources :
20
How would you assess the proximity of the
definitions and concepts (including statistical units)
used in the survey/administrative source with those
required for statistical purposes?
VG G S P VP
VG G S P VP
VG G S P VP
Please list the main differences between the
survey/administrative source and the statistical
definitions and concepts.
The differences can be categorized as
follows:
- timing differences ;
- permanent differences.
The turnover recognition moment between
the survey and the administrative source are
often similar. However, for some economic
activities (e.g. real estate, construction, ICT
services, etc.), there can be a significant
timing difference between the moment of
invoicing (VAT) and the moment of
revenue recognition in the profit&loss
account.
Furthermore, the concept of turnover in the
survey includes secondary and ancillary
activities, in conformity with the SBS
regulation definition. However, these
activities are not necessarily captured by the
variable “Turnover” available in the
administrative source (VAT), thus creating
a permanent difference. This situation can
also be reversed, i.e. an element of the
variable “Turnover” in the administrative
source is not necessarily considered as
turnover in the SBS survey (e.g. proceeds
from the sale of fixed assets, interests
received on intercompany loans in some
specific cases).
Are there any of the SBS characteristics not
included in those data sources?
yes
If yes, please indicate the method used to
compile those characteristics (estimation).
yes
If yes, please indicate the method used to
compile those characteristics (estimation).
Characteristic 16 14 0 Number of persons
yes
If yes, please indicate the method used to
compile those characteristics (estimation).
21
______________________________.
no
employed in full-time equivalent is not
available in the data sources. We are
currently looking into possibilities to
estimate the variable, as an inclusion in the
SBS survey would increase the
administrative burden quite significantly,
in particular for bigger companies and thus
also negatively impact the non response
rate. A few characteristics for which
Luxembourg has obtained a permanent
derogation (e.g. environmental breakdown
of variables) are also not available in the
data sources.
no
______________________________.
no
22
II.1.4. Sampling error: coefficients of variation
Coefficients of Variation
valueestimated
variancesampling theof estimateCV
For the main series, by type of economic activity
(1A, 2A, 3A and 4A – Regulation No 295/2008), a
coefficient of variation for turnover (characteristics
12 11 0) is required on the NACE Rev2 – 3-digit
(group) level.
For the same series, coefficients of variation are
expected for variables: 11 11 0, 12 11 0, 12 15 0,
13 31 0, 15 11 0 and 16 13 0 on NACE Rev.2 –
Section level for the sections B, C, D, E, F, G, H, I,
J, L, M and N and also on NACE Rev 2 – 2-digit
(division) level for Div. 45, 46 and 47 covering the
trade section and division 95.
For the series broken down by employment size
class (1B, 2B, 3B and 4B – Regulation No
295/2008) a coefficient of variation is required on
NACE Rev 2 – 1-digit (section) level only for
variables: 11 11 0, 12 11 0, 12 15 0 and 16 11 0.
The size classes for which CV's are requested are
those limited to the following set: 0–9, 10–19, 20–
49, 50–249 and 250+ persons employed for series
2B and 4B and 0-1, 2-9, 10-19, 20-49, 50-249, 250+
for series 1B and 3B..
Please provide in a separate file the coefficients of
variation, according to the specifications from
above, and describe the methods used and the
aspects taken into account for computation of the
CV (including software).
The ratio estimator is used for grossing up.
Please refer to section II.3 for further
details on grossing up procedures.
The formula of the CV for the ratio
estimator has been programmed in STATA
using the formulae described in the book
“Techniques de sondage” by Pascal Ardilly
(ISBN 10: 2-7108-0847-1).
Given the ratio R, the sampling variance
has been estimated using the residuals (u)
between the variable of interest (y) and the
ancillary variable (x) :
n
sfYV u
R
2
)1()ˆ
(
with
N
nf 1)1( ,
n
i
iu un
s
1
2
1
1
and iii xRyu
The CV has been calculated using the
following formula:
y
YVCV
R )ˆ
(
Given that the sample design includes a
rotation principle for small units, data
imputed using the procedure described in
section II.2.1.1. have also been taken into
account for the calculation of the CVs.
23
The following variables of interest have
been grossed up using the ancillary
variable “turnover”:
- 12 11 0 ;
- 12 15 0.
The following variables of interest have
been grossed up using the ancillary
variable “number of persons employed”:
- 15 11 0 ;
- 16 11 0.
Variable 13 31 0 has been grossed up using
the ancillary variable “personnel costs”,
available through administrative sources.
The CVs have been calculated for a given
variable of interest with the corresponding
ancillary variable.
For the variables 11 11 0 and 16 13 0, the
CV is zero, as they are available for every
unit, either in the business register or in the
survey, and thus do not need to be
estimated.
The CV has been calculated for every
single cell, i.e. by activity as well as by
activity broken down by employment size
class. Please note that the CV is missing
for the following cells:
- cells with no unit or only one unit in
the target population ;
- cells with no unit or only one unit in
the sample.
Legend:
VG= Very good; G=Good; Satisfactory= S; Poor=P; Very poor=VP
II.2. Non-sampling errors
II.2.1. Unit non-response
Unit non-response occurs when not all the reporting units in the sample participate in the survey or
the needed information is not available in the administrative data sources. In principle non-responses
can occur using both data sources: survey and administrative data.
(If this is not applicable it should be explained why e.g. 'Administrative data have gone through an
imputation procedure eliminating raw data flaws and imputing records not available.')
II.2.1.1. Description of estimation methods for taking into account the unit non-responses
a) Please describe the methods used for taking into account the unit non-response.
General approach
For taking unit non-response into account, we use
- the latest available survey data for the last 10 years ( iy~ ) ;
- administrative data (turnover, number of persons employed, personnel costs) from the reference
period ( ix ) and the latest available survey period ( ix~ ) – the availability of administrative data is
very high.
The estimate is obtained as follows on a micro data level:
i
iii
x
xyy ~~ˆ
This method is known as cold-deck ratio imputation.
In some cases, mainly for bigger companies, the ratio i
i
x
x~ is manually adjusted using annual
business accounts available for the reference period. This provides an even more precise estimate.
If the ratio is abnormally high (compared to other similar units in the stratum) or if there is a weak
correlation between the administrative variable ix~ and the survey variable iy~ , this type of
imputation is not performed.
If no historical data are available or if those data are deemed outdated, the units are grossed up
using the same procedures as for units not covered by the sample or by imputed data.
Investment variables
This form of imputation is not used for the family of variables related to 15 11 0. However, an
administrative source is available to impute this variable.
Number of hours worked by employees
The average number of hours worked by firm is available in administrative sources. Using this
indicator and the number of employees as well as applying necessary adjustments to obtain the
effective number of hours worked, we impute the number of hours worked for every business in
the target population.
b) Please describe, when applicable, the measures taken or envisaged for minimising the unit non-
response.
25
Non response concerns units for which either of the following characteristics applies:
- units which have not responded. Every two months a written reminder is addressed to the units
who have not responded. The third reminder is done by registered mail ;
- the survey form data are not exploitable due to either inconsistent data or lack of information -
the questionnaire design is reviewed every year to ensure that item non response due to form
design is minimal and does not transform into unit non response ;
- units which have ceased their activity or which have been subject to restructuring (mergers,
splits, etc.) during the reference year – for these units, data cannot be easily collected but have
to be estimated or imputed in most cases ;
- units which cannot be contacted. The addresses for these units are checked manually using the
phone book and the national Register of Natural and Legal Persons. Unit non response for this
reason is usually low.
In rare situations, we enforce statistical obligation on a case-by-case basis by filing a legal
complaint against big units which have repeatedly failed to respond with usable data or at all.
II.2.1.2. Weighted unit non-response rate
The weighted unit non-response rate shows how well data collection has worked for the population of
interest. The non-response rate could be weighted by the weights that take into account the most
relevant characteristics. Weights are set at the sampling stage, based on a size criterion available in
the business register.
We prefer the 'number of persons employed' as a weighting factor for the calculation of the non-
responses, as it is always positive. Turnover can also be used if it is the main quantitative
characteristics available for all enterprises.
a) Weighted unit non-response rate
100 sample in the units of totalweighted
estimationin used units respondent-non weighted
The weighted unit non-response rate should be complied for the data collected through survey or
extracted from administrative sources. Please provide in a separate file the weighted unit non-
response rates at the NACE Rev 2 3-digit (group) level.
Remarks:
1. Only if a different survey is used for either one of the variables 11110, 12110, 12150,
13310, 15110 and 16110, a separate unit non-response file ought to be included, mentioning the
variable number in the field concerned.
2. If a variable is directly calculated from the reference frame (drawn from the Business
Register), non-response is zero by definition.
Please describe below which characteristics was used for weighting non-response and comment on
the remarks listed above.
The weighted unit non-response rate was weighted using the variable 12 11 0. We draw your
26
attention to the fact that the non response rate is equal to 0 for the variables 11 11 0 and 16 13 0, as
they are available in the administrative sources.
For the possible reasons for non response, please refer to the section II.2.1.1.b).
Overall, the weighted unit non-response rate is very low, the rate being 6% (2010: 1%). However,
for some economic activities, the rate is high. The non-response rate can even be equal to “NA”,
i.e. if there was no unit in the sample for a given cell, even though there are units for that cell in the
total population. It is important to know that :
- it is often not possible to use the same strata for estimation as those foreseen in the data
transmission format. Given the fact that estimated values are imputed on a micro data level
(imputation is done based on the relative importance of turnover for each unit), it is
nonetheless possible to present the results according to the data transmission format ;
- for groups G501 to G504, we do not use any micro data figures, please refer to section
I.1.2. for further details.
b) How do you evaluate the recorded unit non-response rate in the overall context?
Very high High On the average Low Very low
II.2.2. Bias
Please comment on the possible bias resulting from non-response or from the estimation method.
Bias resulting from non response
For the units to which the imputation method described in section II.2.1.1. has been applied, the
bias is significantly reduced when that ratio is adjusted with observations from the annual business
accounts (if available).
For most units, such an adjustment is not performed due to limited resources and the limited
availability of detailed annual business accounts for small businesses in particular. Nonetheless, the
use of adjusted past survey data certainly reduces the bias resulting from non-response.
If no historical survey data are available for a given unit, the latter is handled in the same way as
any unit which is not included in the sample. The bias would then result from the estimation
method. In most cases, this concerns only small businesses, which have a minor impact on the
aggregate value of the variables of interest.
Given the above, the bias can be evaluated as being small.
Bias resulting from the estimation method
According to the book reference mentioned in section II.1.4., the ratio estimator is biased, the
formula of the bias being complex but with a dimension of 1/n. Consequently, the higher n, the
smaller the bias. However, if n is small or if there is a weak linear correlation, the bias of the ratio
estimator could be significant.
While the bias resulting from the estimation method remains unknown, the estimation method
27
mainly concerns small units. Therefore, we believe that it has a limited effect in the overall
accuracy of the estimate.
Please select which of the options describes the bias of the estimate:
unbiased
Small bias (with a limited effect in the
overall accuracy of the estimate.
Substantial bias
II.2.3 Imputation
Imputation is the process used to resolve problems of missing, invalid or inconsistent responses identified
during editing.
II.2.3.1. Description of methods used
What imputation procedure was applied to cover up the uncovered non-response? Imputation
procedures will normally rely on the available administrative data, but imputation may also be
applied in other situations. (see annex on II.2.3)
Please briefly describe the imputation methods used for dealing with unit and item non- response
(e.g. corrector factor in the weighting procedure, imputation based on available information from
previous reference period etc.)
Unit non response
Please refer to the method described in II.2.1.1 for further details.
Units for which historical data are available
Units which have been in a sample prior to the reference period are imputed using the same method
and under the same conditions as for unit non-response.
Item non response
Given that SBS data are above all quantitative and that they also serve the needs of national
accounts, it is very difficult to identify item non-response. This is in particular true for units who
are included only once in the sample.
For the bigger companies, we are able to follow their structure over a larger period and thus to
identify item non-response in an easier way. Item non-response is therefore dealt with on a case-
by-case basis during editing procedures.
Missing information in the administrative source
For some economic activities, VAT data are not available. No imputation is done on administrative
data. If employment data are significant for a given unit, it is very likely included in the survey
data. Consequently, the occurrence of missing data in the administrative source is rare and of low
impact.
28
II.2.3.2 Evaluation of the impact of imputation
a) How do you evaluate the impact of imputation on CVs?
Very important Important Not important
The reduction of CVs can be assessed by comparing the CVs just using the reweighting procedures
with the CVS after imputation procedures.
The impact of imputation on CVs is important in particular for the small size classes, as most of
those units are selected using a rotation mechanism.
II.2.4 Coverage errors
II.2.4.1. Frame
Please describe the frame used for the SBS:
- What is the variable used for identifying principal and secondary activities?
Value-added is used where available on an activity or even a product level – this is often only
possible for units covered by the SBS survey or for which similarly detailed information would be
disclosed in annual business accounts. Turnover and employment are used when value-added is not
available.
- What is the method used for identifying activities?
top-down
bottom-up
other:
- Please comment on the frequency of updating the unit's principal activity (stability rules).
If the principal activity remains changed for at least two consecutive reference years, the NACE
code is updated.
- Please comment on the frequency of updating the business register in your NSI.
The update frequency is high. Updates of data managed by the NSI itself are performed instantly.
Monthly data related to employment are updated monthly with a t+4 months delay. Monthly VAT
data are updated monthly with a t+3 months delay.
- How do you assess the impact of imperfection of the relevant business register on the quality of
the key statistics?
almost none low medium high very high don't know
II.2.4.2. Out-of-scope units
Please provide information on the methods used to detect the out-of scope units and provide
Eurostat with an estimated number of those units.
Definition
29
The most frequently observed type of out-of scope units are legal units which are incorporated or
registered in Luxembourg but which at the same time do not have any non-financial economic
activity (SBS scope for annexes I-IV, VIII) in the Luxembourg economic territory (e.g. real-estate
rental activities with buildings in foreign territories only, companies where the activity is done in
foreign branches only, some activities related to shipping, etc.). The other significant type of out-
of-scope units relates to non-market activities.
Identification
Such units are identified and dealt with on a case-by-case basis by the staff involved in national
accounts, SBS, business register and balance of payments. This kind of information is also more
and more centralised through the business register.
Estimated number of undetected units
There is a list of companies which have been excluded from the SBS population for the above
reasons. However, we have no estimate of the number of undetected out-of-scope units still
lingering in the SBS population.
II.2.5. Processing errors
Processing errors can be of different types: at data entry (keying errors or optical scanning errors);
wrong handling of raw data errors. A good editing system may detect most of those errors.
a) Please tick the appropriate checks used in order to compile the SBS data and give a short
description of your editing system.
completeness checks validity checks plausibility checks
(data integrity rules) (internal consistency)
Processing errors in the survey
For the survey, we use an objective-driven approach to ensure that the survey data comply with the
pre-defined quality standard.
The following documentation (in French) describes the framework used to cover non sampling
errors for the survey reference year 2005, i.e. in the context of the old regulatory framework. There
is no updated version of the report yet as it is was published as a working paper and thus is not an
official methodological document by STATEC. However, it provides an idea of the type of control
environment that we have in place in the area of SBS survey data.
http://www.statistiques.public.lu/catalogue-publications/economie-statistiques/2009/29-2009.pdf
Errors in the administrative source
Error checking activities in the administrative source consist in the following:
- analysis of the correlation between survey and administrative variables ;
- analysis of births and deaths of significant units ;
- detection of out-of-scope units ;
- identification of merger and acquisitions to minimise the risk of double counting.
II.3. Inference (grossing up)
30
Have you applied any methods for grossing-up the figures covered by the SBS Regulation - to
cover the entire population of enterprises (there is no cut-off threshold used)?
yes
no
b) If you have applied any methods for grossing-up, please provide information about the
methods.
When grossing up survey data, we always take into account ancillary data available in the
administrative sources, i.e. the ancillary variables “turnover” and “number of employees”, and not
only the sample design. The aforementioned ancillary variables most often have a linear
correlation with the variables of interest in SBS. Consequently, the ratio estimator is used for
grossing up.
With the variable of interest being available only for the sample (y) and with the ancillary variable
being available both for the sample (x) and the SBS target population (X), the ratio estimator of
the variable of interest RY , expressed as an average, can be written:
RXYR ˆ
The ratio R has been formulated in the following way:
n
i
i
n
i
i
xn
yn
R
1
1
1
1
If for a given stratum no unit is available in the sample, i.e. neither for the historical periods nor
for the reference period, the units in this stratum are grouped together with another stratum for
which sample data are available and are then grossed up.
Variables 11 11 0, 13 31 0, 13 32 0, 13 33 0, 16 13 0 and 16 15 0 are available directly in the
administrative source and therefore do not need to be grossed up (as from reference year 2009). If
prior year survey data are available, a cold-deck ratio imputation is performed – this significantly
reduces the bias in case there are permanent differences between the admin source and the survey
data for a given unit. For the other units, administrative data are simply pasted into the data set.
II.4. Assessment of revisions
II.4.1. Preliminary data versus final data
The quality of the preliminary data is measured by comparing them with the final value. For annexes
1 to 4, an error measure defined as ratio between the sum of the absolute revision and the sum of the
absolute later estimates, the relative mean absolute revisions is calculated:
The following formula is applied
RMAR – Relative Mean Absolute Revisions
31
T
t Lt
T
t PtLt
T
t Lt
T
t t
ˆ
ˆˆ
ˆ
RRMAR
1
1
1
1
Lt = Latest estimate
Pt = Preliminary estimate
Characteristics 2009 2010 2011
a) Turnover (12110) 00..00553377 00,,00222255 00..00118855
b) Number of persons employed (16110) 00..00229977 00,,00113355 00..00228822
Please give a description of the methodology used for compiling preliminary data.
Preliminary data series are based on estimates which consist, for the vast majority of cases, of
adjusted survey data from the reference year t-1.
The adjustment happens with the administrative data sources, using the cold-deck ratio
imputation as described in section II.2.1.1. Units for which there is no t-1 survey data are grossed
up using the estimation method described in section II.3.
We are aware of the consistent underestimation of preliminary data compared to final data, in
particular as regards the number of enterprises. This is due to the fact that the VAT variables
(turnover, etc.) from the administrative source, which are used to calibrate most of the SBS
variables, are not completely available for the reference year t at the preliminary data compilation
moment (t+9 months). The final administrative deadline for companies to file their annual VAT
declaration is set at t+10 months, yet it still takes a few months until those data are then available
to STATEC. Consequently, at t+9 or t+10 months, the administrative source is still incomplete,
which is in particular noticeable for the variable 12 11 0.
As from reference year 2011, a new method minimises this effect. Moreover, one can see that the
precision has been gradually improved during the recent years.
II.4.2. Average size of revision
By revision we mean replacing the data that has been published before on the Eurostat website by
new data and not when corrections are sent before the data is finally released on the website.
The revision size is measured as follows:
RMAR – Relative Mean Absolute Revisions
T
t Lt
T
t PtLt
T
t Lt
T
t t
ˆ
ˆˆ
ˆ
RRMAR
1
1
1
1
Lt = Latest estimate
Pt = Preliminary estimate
32
NB: Due to the fact that it is difficult to calculate the RMAR as presented in the definition, the
calculation of the revision size below was calculated by considering the final version of data
(published on the website) as latest and the first version of data provided as preliminary. All
NACE codes up to 4 digit level were taken into account for the calculation.
The following list of characteristics to be taken on board for the calculation of the revision size, is
suggested by annex for the national series of the four annexes:
Annex 1: Series 1A: 11 11 0, 12 11 0, 12 12 0, 12 15 0, 12 17 0, 13 11 0, 13 31 0, 15 11 0, 16 11
0, 16 13 0
ANNEX
I
Country Variables RMAR
LU V11110 0.0000
LU V12110 0.0000
LU V12120 0.0000
LU V12150 0.0000
LU V12170 0.0000
LU V13110 0.0000
LU V13310 0.0000
LU V15110 0.0000
LU V16110 0.0000
LU V16130 0.0000
Annex 2, Series 2A 11 11 0, 12 11 0, 12 12 0, 12 15 0, 12 17 0, 13 11 0, 13 13 1, 13 31 0, 13
32 0, 15 11 0, 15 15 0, 16 11 0, 16 13 0, 16 14 0, 16 15 0, 18 11 0, 20 11 0
ANNEX
II
Country Variables RMAR
LU V11110 0.0000
LU V12110 0.0000
LU V12120 0.0000
LU V12150 0.0000
LU V12170 0.0000
LU V13110 0.0000
LU V13131 0.0000
LU V13310 0.0000
LU V13320 0.0000
33
LU V15110 0.0000
LU V15150 0.0000
LU V16110 0.0000
LU V16130 0.0000
LU V16140 #DIV/0!
LU V16150 0.0000
LU V18110 #DIV/0!
LU V20110 0.0000
Annex 3, Series 3A: 11 11 0, 12 11 0, 12 12 0, 12 13 0, 12 15 0, 13 11 0, 13 12 0, 13 31 0,
13 32 0, 15 11 0, 15 15 0, 16 11 0, 16 13 0
ANNEX
III
Country Variables RMAR
LU V11110 0.000
LU V12110 0.000
LU V12120 0.000
LU V12130 0.000
LU V12150 0.000
LU V12170 0.000
LU V13110 0.000
LU V13120 0.000
LU V13310 0.000
LU V13320 0.000
LU V15110 0.000
LU V15150 0.000
LU V16110 0.000
LU V16130 0.000
Annex 4, Series 4A: 11 11 0, 12 11 0, 12 12 0, 12 15 0, 12 17 0, 13 11 0, 13 31 0, 13 32 0,
15 11 0, 15 15 0, 16 11 0, 16 13 0, 16 14 0, 16 15 0, 18 110
ANNEX
IV
Country Variables RMAR
LU V11110 0.000
LU V12110 0.000
LU V12120 0.000
LU V12150 0.000
LU V12170 0.000
34
LU V13110 0.000
LU V13310 0.000
LU V13320 0.000
LU V15110 0.000
LU V15150 0.000
LU V16110 0.000
LU V16130 0.000
LU V16140 #DIV/0!
LU V16150 0.000
LU V18110 #DIV/0!
Not applied.
If revisions have been sent, please provide the reasons for making the revision.
Not applicable.
II.4.3. Revision policy
Please describe for each annex your revision policy, including information such as the average
number of revisions (planned or not), the main reasons for revisions and the impact of the
revisions.
Annex I
In the past, we had no revision policy for SBS data sent to Eurostat due to resources constraints.
However, due to recent economic evolutions, a revision of SBS data series has been decided for
the reference years 2005 to 2010 included. The revision will take into account the changes due to
NACE Rev.2, so that users will have a longer series for this classification. Moreover, the profiling
of a major enterprise group in Luxembourg has led to a new enterprise definition for the units of
this group – this has a major impact on the overall figures of industrial and trade activities.
The revision is currently in the works and planned for the end of the first half of 2014 – initial
planning was 2013 but due other priorities, the revision had to be put on hold.
Annex II
see remark here above
Annex III
see remark here above
Annex IV
see remark here above
Annex VIII
see remark here above
35
III. Coherence and comparability
III.1. Coherence
Definition
Coherence of statistics is their adequacy to be reliably combined in different ways and for
various uses. It focuses on the joint use of statistics that are produced for different primary
purposes to show cases of incoherence rather than to prove coherence.
Please report on the inconsistencies that can already be documented for the following statistics as
well as work undertaken in this respect:
- Number of enterprises in the business register
- Production value of Prodcom
- Value added of national accounts
- Evolution of turnover and persons employed from short term statistics
- Other (please specify)
Consistency with business demography
These differences have been documented in the SBS quality report on Annex IX (cf. section III.1).
Consistency with national accounts
In Luxembourg, our policy is to remain as consistent as possible with the data published in the
framework of national accounts. Therefore, the SBS data on the STATEC website were based the
statistical unit KAU and not the statistical unit enterprise, which is used for data published on the
Eurostat website. This is subject to change in the near future to allow the publication of inward
FATS and size class data. Nonetheless, the KAU series will be kept for the purposes of compiling
national accounts.
Given that the revised national accounts data for a given reference year are available after t+21
months, whereas the publication of LU SBS data is currently performed with data sets available at
t+19 months, there are punctual differences, in particular relating to the last minute identification
of (partially) out-of-scope units as well as due to NACE reclassification of units. Furthermore,
national accounts include estimates for illegal or black market activities. Finally, national accounts
balancing procedures cause inconsistencies with SBS.
Consistency with short-term statistics
No efforts have been invested so far in analysing the potential or actual data inconsistencies
between SBS and STS. However, the latter use SBS data (enterprise concept) for weighting
purposes.
Consistency with Labour Cost Survey (LCS) for wages and salaries per employee
If we consider only the population of enterprises employing at least 10 persons and only those
economic activities which are covered by both statistics at the same time, the figures are very
similar.
Only to be filled in if this quality report includes information on the Annex VIII
Please report on the inconsistencies between the total turnover reported for Annex VIII and the
36
corresponding data to be reported under the provisions of Annex I.
[Comment: as we do not know which countries will report on Annex VIII using this report, we
cannot include an overview with remaining inconsistencies in the data disseminated by Eurostat]
Not applicable.
III.2. Comparability
Definition
Comparability aims at measuring the impact of differences in applied statistical concepts and
measurement tools and procedures, when statistics are compared between geographical areas, non-
geographical domains, or over time.
III.2.1. Comparability over time
Length of time series is the period when the statistics have been compiled for the first time until the
latest reference year.
Length of comparable time series starts with the last break in time series and longs until the last
reference year. Please split into individual annexes if time series are different.
Indicator Period (yyyy – yyyy)
III.2.1.1. Length of time series 1995-2011
III.2.1.2. Length of comparable time series 2003-2009
III.2.1.3. In case II.2.1.1. is not equal to II.2.1.2., please, indicate the reasons or differences, in
concepts and methods of measurements for breaks in time series.
As from 2003, Luxembourg has for the first time delivered data series based on the enterprise
concept and on the KAU concept. Before 2003, only the KAU concept was available for Eurostat
data.
III.2.2. Geographical comparability: Coverage of target population
III.2.2.1. Please report on the inclusion of branches of foreign enterprises and the exclusion of the
results relating the activities abroad of enterprises of the reporting country.
Branches of foreign enterprises are identified as such in the business register. Consequently, they are
included if they are considered as being a resident unit. For the units in the survey, we try to obtain
as many data as needed to exclude foreign branches included in Luxembourgian legal units.
Please also refer to section II.2.4.2.
III.2.2.2. Are the statistical units used for compiling the SBS series in conformity with the
definitions in Regulation No 696/93?
Yes.
III.2.3. Geographical comparability: Other issues
III.2.3.1. Please report on any other issues (with the exception of differences in concepts and
definitions in the source data mentioned in paragraph II.I.3) that have an influence on the
37
comparability of the data of your country with that of the other EEA countries.
Not applicable.
38
IV. Timeliness and Punctuality
IV.1. Timeliness
Definition
Timeliness of information reflects the length of time between its availability and the event or
phenomenon it describes.
Please provide the key dates for the following actions
Action Deadline(s) MM/YY
a) data-collection, if any
If several data collections are organised, please indicate the
deadline of the last
February 2013
b) post-collection phase August 2013
c) dissemination in your country, if applicable n.a. (yet) for reference year 2011
IV.2. Punctuality
Definition
Punctuality refers to the time lag between the release date of data and the target date when it
should have been delivered, for instance, with reference to dates announced in some official
release calendar, laid down by Regulations or previously agreed among partners.
The punctuality of delivery date is calculated as actual date of data delivery – scheduled date of
transmission to Eurostat. It shows the delay (positive value) or advance (negative value) in calendar
days after the legal deadline (18 months after the end of the reference year fore final data; 10
months for preliminary data).
Country LU
R1A 39
R1B 39
R1C 39
R1E 39
R2A 39
R2B 39
R2C 39
R2D 39
R2F 39
R2H
R2I
R3A 39
R3B 39
R3C 39
R3D
39
R4A 39
R4B 39
R4C 39
R4D 39
R4F 39
R4G
R4H 39
R8A 39
R8B 39
R8E
R8F 39
IV.2.1. Please comment the delays (if any) of transmission to Eurostat as calculated by Eurostat,
e.g., the reasons for the late delivery and present the action plan for improving timeliness and
respecting transmission delays (if needed, you can split this question for the different annexes).
Comment on the delays
In absolute terms, STATEC does still not meet the deadlines set by the SBS regulatory framework.
Compared to the reference year 2010, the delays have been reduced and all data sets have been
sent at the same date. As already mentioned previously, the number of persons currently allocated
to the SBS regulation is insufficient to meet the official time table.
Action plan
- As of reference year 2013, we hope that the mandatory Luxembourg Balance Sheet Register
will available for massive date extraction. This would allow us to obtain electronic data as fast
as possible and to increase the overall SBS data quality, in particular for small businesses.
- We constantly keep our eyes on newly available administrative sources to minimise the
transmission delays as well as to maximise the SBS data quality.
40
V. Accessibility and Clarity
V.1. Accessibility
Definition
Accessibility refers to the physical conditions in which users can obtain data: where to go,
how to order, delivery time, clear pricing policy, convenient marketing conditions, availability of
micro and macro data, various formats (paper, files, CD_ROM, Internet etc.).
If the SBS data are not disseminated at national level by your Institute please proceed to
VI. Further comments.
If needed, this following question can be split for the different annexes.
V.1.1. Is the SBS data published at national level different from the data sent to Eurostat?
yes
no
V.1.2. If yes, please gives a brief description of the reasons for these differences.
Annexes I-IV
The differences between the SBS series published by Eurostat and by STATEC are as follows:
1) As already stated in section I.3.2.5., SBS data published on the STATEC website are based
on the KAU concept and not on the enterprise concept. For several economic activities
there is consequently a big difference. Given this choice of concept, any additional
breakdowns (e.g. size classes, client’s country of residence, etc.) available for the
enterprise concept are not published on our website because the SDC procedures have not
been performed for such breakdowns on a KAU level.
2) The degree of detail and the number of variables on the STATEC website is based on a
quality choice and does not correspond to the maximum NACE detail and number of
variables available.
3) The series on the Luxembourg website are regularly revised to keep them in line with the
national accounts series, which are also revised every year. However, the revised series are
only based on the KAU concept. No revised data are sent to Eurostat because that would
significantly increase the burden due to additional SDC procedures.
4) 2008 data series have so far only been published in reference to NACE Rev.1.1. The
reasons for this have been mentioned in section I.3.2.5.
All of the above is subject to change in the short run. The goal is to publish the data sent to
Eurostat on the national website, with all of the available dimensions and variables.
Annex VIII
41
The SBS Annex VIII data are not disseminated at national level because they only concern a
subset of enterprises (employing at least 20 persons) and a subset of services activities. More
particularly, confidentiality is an issue, given the small target populations.
V.1.3. Do you publish additional indicators together with the SBS at the national level?
yes
no
V.1.4. If yes, please give a brief description of the additional indicators published at the national
level.
Not applicable.
V.1.5. How do you disseminate SBS data?
Member
State
Paper/pdf Publications Electronic Publications
News
release
Statistical
yearbook
Thematic
publications
Internet-
Data base
CD/DVD-
Rom
Other (fax, e-
mail, etc.)
2011
data yes yes yes yes yes yes
Action
plan
2012
data
For 2011 data, please, mark with a cross where applicable.
For 2012 data, please, report any scheduled action plan specifying the implementation date.
V.1.6. Availability of paper publications in any foreign languages
[ X ] in English
[ X ] in the following other language(s) German, French
Please indicate links to your electronic publications on SBS
(click on the links here below)
LU SBS dynamic on-line tables
Annuaire du STATEC
Luxembourg in numbers
V.2. Clarity
Definition
Clarity refers to the data's information environment whether data are accompanied with
appropriate metadata, illustrations such as graphs and maps, and whether information on their
quality is also available.
V.2.1. Are statistical metadata available?
[ X ] available for paper publications
[ X ] available on the Website (electronic version)
[ ] no methodological explanations on data are disseminated
Please indicate links to your electronic publications of metadata
http://www.statistiques.public.lu/fr/methodologie/definitions/index.html
http://www.statistiques.public.lu/fr/methodologie/methodes/entreprises/Stat-struct/sse/index.html
42
VI. Further Comments
Please provide further comments regarding SBS data quality which are not included in the parts
above (e.g. foreseeable changes in the methodology; etc.).
None.
Annex
Note: numbering in the annex is not contiguous: paragraphs are numbered corresponding to the
entry concerned in the quality report.
I. RELEVANCE Relevance is the degree to which statistics meet current and potential users' needs. It refers to whether
all statistics that are needed are produced and the extent to which concepts used (definitions,
classifications etc.) reflect user needs.
These questions need to be answered for the statistical concepts (like definitions of statistical units or
classifications used) and on the availability of required statistics (completeness). The questions as regards
the deviations from the statistical concept are covered under coherence and comparability, as they are
also needed in that frame.
Conformity of characteristics (variables) with the regulation can be considered either as a relevance
issue, an accuracy problem or a comparability item. We treat it in the accuracy chapter under II.1.3.2
where data production using administrative sources are dealt with.
I.1 COMPLETENESS
Completeness is the extent to which data are available – compared to what is requested according to the
SBS-Regulation.
In this quality report completeness is measured via availability ratios, namely as the delivered cells for a
certain level of detail or the total to the amount that was expected to be obtained.
I.1.2. Availability of characteristics and/or breakdowns required by the SBS-Regulation
In this question Member States shall describe the reasons why in some cases details are missing and to
include the measures that the Member States took or are planning to take in order to change the situation.
If none of the details are missing, the answer shall be left in blank or if the Member State foresees no
change in this situation.
I.1.3. Use of the quality flag 'Contribution to European totals only' (CETO-flag) foreseen in the SBS regulation
In order to minimise the burden on businesses and the costs to the national statistical authorities, the
Member States may mark data for use as a contribution to European totals only (CETO). Eurostat shall not
publish those data, nor shall Member States mark nationally published data with a CETO flag. The use of
the CETO flag shall be dependent on the individual Member State's share of the EU total of value added in
the business economy as follows:
43
(a) Germany, France, Italy, and United Kingdom: CETO-flagged data may be sent for NACE Rev. 2
class level and for the size class breakdown at NACE Rev. 2 group level. No more than 15 % of the cells
may be marked.
(b) Belgium, Denmark, Ireland, Greece, Spain, the Netherlands, Austria, Poland, Portugal, Finland and
Sweden: CETO-flagged data may be sent for NACE Rev. 2 class level and for the size class breakdown
at NACE Rev. 2 group level. No more than 25 % of the cells may be marked. In addition, if, in any of
these Member States, the share of a NACE Rev. 2 class or of a size class of NACE Rev. 2 group is less
than 0,1 % of the business economy of the Member State concerned, those data may additionally be sent
as CETO-flagged.
(c) Bulgaria, Czech Republic, Estonia, Cyprus, Latvia, Lithuania, Luxembourg, Hungary, Malta,
Romania, Slovenia and Slovakia: CETO-flagged data may be sent for NACE Rev. 2 group and class
level and for the size class breakdown at NACE Rev. 2 group level. No more than 25 % of the cells at
group level may be marked.
The measures designed to amend non-essential elements of this Regulation, inter alia, by supplementing
it, relating to reviewing the rules for the CETO flag and grouping the Member States, shall be adopted in
accordance with the regulatory procedure with scrutiny referred to in Article 12(3) by 29 April 2013 and
every five years thereafter.
As Eurostat will present a comment on how he rules of SBS are respected, Member States can update and
check if they agree with Eurostat’s comment.
I.1.4. Application of 1%-rules foreseen in the SBS regulation
This question aims to identify the situations where Member States despite of the 1%-rule are foreseen to
provide data (or because are able to obtain the data concerned, e.g. using administrative sources), and
situations where the Member States expects that this rule will apply in a new branch.
I.1.5. Derogations from the provisions of the SBS Regulation
Some Member States were granted derogations. Sometimes however, the cause for requesting these
derogations applies no longer and the Member States is able to provide the “missing data”. In this
question, information on this is expected.
This question can be let in blank if you do not have any derogation or if you do not foresee any change in
the present situation.
I.3. Monitoring user interest
The user interest is an important indication about relevance. The aim of the group of questions is to
quantify this interest. It is split in three groups: Data dissemination, user consultation and user
satisfaction.
II Accuracy
44
Definition
Accuracy in the general statistical sense means the closeness of computations or estimates to the
exact or true values. The difference between the two values is the error.
As any statement on accuracy requires more information on the methods used for compiling the
structural business statistics, most of the methodology description is filed under this chapter.
II.1. Concepts and sources
Member states may use different data collection systems to comply with the SBS regulation. We
distinguish:
the use of sample surveys4 (with administrative information present in the sampling frame and
only used for stratification purpose),
the use of a sample survey combined with use of administrative information (whereby this
information is also used in the editing and imputation process and possibly in the inference process
through calibration),
the use of administrative data as the only source of information and all characteristics are
calculated from the administrative records using or not a model based approach (in this case no
survey information is used for a significant part of the strata covered).
As such it is necessary only to fill in the column concerned:
Column 1 if your data collection is made by the way of survey(s);
the column 2 if your data collection is a combination of surveys and the use of
administrative data (even if the administrative data is used for calibration or imputation);
the column 3 is expected to be filled by those Member States that only use
administrative data in the data collection for SBS.
II.I.3. Definitions and concepts used in the survey / administrative source
In this question the Member States self-assessment is expected. If you identify any difference between
the definitions please describe it in general – if you consider it as an important difference (in case your
assessment is lower than good, please detail it as much as possible). Please split your answer between
the differences for survey and the differences for administrative source.
Sampling error and non-sampling errors
When describing accuracy it is customary to distinguish between sampling and non-sampling errors.
Sampling errors can be described by an estimate of the uncertainty due to sampling, quantified in the
coefficient of variation (COV). Calculation of this COV is dealt with in II.1.4.
Non-sampling errors are more varied in nature and harder to quantify. We distinguish:
Frame errors (under-coverage / over-coverage)
Reporting or measurement errors including non-response
Processing errors
Model assumption errors
Among those, reporting and measurement errors are focussed on non-response under heading II.2.1.
Coverage and processing errors are included in this report under the headings II.2.4 and II.2.5.
4 Exhaustive interrogation of all enterprises is considered a type of survey.
45
II.1.4. Sampling error: coefficient of variation
valueestimated
variancesampling theof estimateCV
Largely adopted as an estimate for the sampling variance is the Horvitz-Thompson variance,
in which wh is the weight of the elements in stratum h. Since strata are independent from one
another, the variance over the strata concerned is the sum of the variances. Remark that there is no
variance contribution in an exhaustively surveyed stratum (of large enterprises), provided that
there is no non-response.
If there is no sampling (e.g. a model based approach is used for mass imputation using
administrative records), sampling variance is zero by definition.
A 'Jacknife' method can be applied to estimate a variance in case an imputation procedure was
used for part or all of the elements in the stratum:
Suppose ni elements were imputed in a stratum of nt enterprises. We can then calculate the
variance among ni+1 alternate elements consisting of the total estimate based on all elements and
the ni estimates based on nt-1 enterprises, whereby one imputed enterprise was dropped and
consequently a weight factor nt/ (nt-1) was applied.
V = Var(T, T1,…Ti,…Tni) with i=1…ni and Ti = nt/ (nt-1) yj (ji)
In case the data collection is exclusively based on an administrative source, the 'Jacknife' method
may still be applied. In this case, all nt enterprises are considered imputation results (t=i). In this
case, the objective value of this calculation is questionable: it reflects the intra-strata variability
and quietly assumes that errors due to the model based approach are proportional to this.
II.2.1 Non-response
The weighted unit non-response rate shows how well the data collection worked for the population
of interest. The non-response rate could be weighted by the weights that take into account the most
relevant characteristic. Weights are set at the sampling stage, based on a size criterion available in
the business register. As weighting factor for the non-response calculation, we prefer the 'number
of persons employed' as it is always positive. Turnover can also be used if this is the main
quantitative characteristics available for all enterprises.
II.2.1.1. Description of estimation method for taking unit non-response into account
a) Description the methods used for taking unit non-response into account
This question deals with how MS deal with non-response: Is imputation used for non-respondent
enterprises? If so is there a size class threshold: i.e. is imputation used for all strata? In which case
reweighting is used? Is there any calibration procedure?
2
1
)(.)1( h
n
i
ih
h
h
h
h
h
ywwVV
46
b) Measures taken or envisaged for minimising non-response.
In this question Member States shall inform about all kind of measures that have been taken, like
fines, personal contacts with non-response units in order to motivate them to answer. On the other
hand some other kind of measures can be considered like the use of partial data obtain by the way
other surveys, for example.
II.2.2. Bias
A measurement bias can be due to an inappropriate selection scheme or to a selective non-
response. Whereas the former could be considered a sampling error (but probably due to some
erroneous entry in the sampling frame), the latter occurs in the production process and is not
related to the sampling stage. In a purely administrative survey, a missing or incomplete
administrative record could also be considered a type of non-response. It remains to be seen
whether it is of a complete random nature or not.
We will treat 'bias' as an outcome of non-response, leaving it out of the section which is split
according to the data collection method. It is treated in section II.2.2
II.2.3. Imputation
Imputation as a procedure in data production does not fit on the axis sampling/non sampling
errors. In a survey, imputation can be applied to 'recover' cases of non-response where good
approximations can be made, namely based on administrative records. For medium sized and
larger enterprises, accounting information quite often contains enough data fields that are normally
used for filling out a questionnaire and which yield very good estimates upon imputation.
Imputation is also an alternative for calibration in case individual administrative data are available
for all of the elements in the stratum (both sampled and non-sampled). For example, the
information available in the Business Register that was used for stratification is the minimal set
that may successfully be used for imputation. Mass imputation on the non sampled enterprises
then replaces inference.
Finally in any data collection from administrative sources only, the imputation procedure is based
on a model, translating administrative data into statistical characteristics.
In the first case, the imputation normally leads to a substantially reduced coefficient of variation:
without imputation the variance can be calculated using the weights corrected for non-response.
After imputation of the medium and large enterprises, the 'jacknife' variance estimate can be used.
It is then possible to assess the reduction of the coefficient of variation due to the imputation
procedure.
II.2.3.2 Evaluation of the impact of imputation
This question only refers to the imputation procedure to recover non-response, not to the
imputation of corrected data after the editing has reveal (severe) raw data errors.
Please assess the reduction of the coefficient of variation obtained through imputation as compared
to the case where only reweighting is applied. (Please use the Jacknife method described in this
annex under II.2.3) Only one single comparison needs being made by annex, preferably on the
characteristics turnover (12 11 0) and for the largest section it includes: e.g. for Nace Rev 2 C
manufacturing industry.
47
II.4. Assessment of revisions
Revisions and differences between preliminary and final data make use of the same formula. It has
the advantage that (for positive definite characteristics) the denominator cannot be zero, hence
there are no singularities possible (even with a 'true zero' revised to a non-zero entry). Moreover
contrary to percentage changes, it is symmetrical. The factor 2 makes the differences
asymptotically (meaning for very small differences) equal to percentage changes.
The formula then mentally translates as 'a weighted sum of squares of percentage changes'.
II.4.1. Preliminary data versus final data
Member States shall present, based on its self-assessment, what they consider as the main reasons
for the differences between the preliminary data and the final data, for example Member States can
consider that the answering rate in the moment of preliminary data is too low or some of the
administrative data used is not yet available.
If, based on your assessment you do consider the differences insignificant, you do not need to fill
his answer.
II.4.2. Average size of revision
In this question, it is very important to get information on what were the reasons why the data has
been revised – for example: an error was detected by a user; after getting a confirmation from a
unit, NSI was informed that the data was wrong; when editing data for year n it was detected an
error on year n-1 data, etc.
III.2. Comparability
II.2.1. Comparability over time
It is very important in the context of a quality report to obtain information on the time length of the
series and to identify the reasons why in some series the time length is shorter.