www.statistik.at We provide information. A quality monitoring system for statistics based on administrative data. UNECE Seminar on New Frontiers for Statistical Data Collection, Geneva. Manuela Lenk, Statistics Austria, Registers, Classifications and Methods Division. 31st Oct. – 2nd Nov. 2012

A quality monitoring system for statistics based on administrative data




A quality monitoring system for statistics based on administrative data
UNECE Seminar on New Frontiers for Statistical Data Collection, Geneva

Manuela Lenk
Statistics Austria

Registers, Classifications and Methods Division

31st Oct.– 2nd Nov. 2012


www.statistik.at slide 2 | 31.10. - 2.11.2012

Register-based census in Austria

First register-based census in Austria: 2011
• Full census, no sampling

Census topics
• Population census
• Housing census
• Census of enterprises and their local units of employment

Data availability
• At municipality level
• Geo-codes
• Statistical databases
• Interactive maps


Quality assessment of the census

Application of a quality framework
• The framework is independent of data processing, so it can be applied to other statistical projects
• Data processes can be evaluated without influencing them

Three stages of quality evaluation
• Raw data
  – Registers provided by the data holders
• Central Database (CDB)
  – Combined information from the registers
  – Data is merged by a unique key
• Final Data Pool (FDP)
  – Final data including imputations


Quality framework - Overview


Quality assessment on register level I

Calculation of quality indicators
• Each attribute in each register gets a quality between 0 and 1
• Quality calculation is based on 3 so-called hyperdimensions

HD Documentation
• Focuses on factors which possibly predetermine data quality
• Realized by a questionnaire which is filled out in accordance with the data authority
• Questions are weighted by their impact on data quality
• Quality indicator: $\frac{\text{obtained score}}{\text{maximum obtainable score}}$
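
The documentation indicator (obtained score divided by maximum obtainable score, with questions weighted by their impact) could be computed roughly as in the following sketch. The question names, weights and the convention that each answer is scored between 0 and 1 are assumptions made for illustration; the slides do not specify the questionnaire.

```python
def documentation_indicator(answers, weights):
    """Return obtained score / maximum obtainable score.

    answers: dict question -> score in [0, 1] (1 = best possible answer)
    weights: dict question -> non-negative weight (impact on data quality)
    Unanswered questions count as 0 (an assumption of this sketch).
    """
    max_score = sum(weights.values())
    if max_score <= 0:
        raise ValueError("weights must sum to a positive value")
    obtained = sum(weights[q] * answers.get(q, 0.0) for q in weights)
    return obtained / max_score


# Hypothetical questionnaire: names and numbers are made up.
weights = {"legal_basis": 3.0, "timeliness": 2.0, "plausibility_checks": 1.0}
answers = {"legal_basis": 1.0, "timeliness": 0.5, "plausibility_checks": 0.0}

hd_documentation = documentation_indicator(answers, weights)
print(f"HD Documentation: {hd_documentation:.3f}")
```

Because the obtained score can never exceed the maximum, the result always lies between 0 and 1, matching the range stated for all quality indicators.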


Quality assessment on register level II

HD Pre-processing
• Detection of formal errors, like missing primary keys, out-of-range values and item non-response
• Usable records are calculated by the subtraction of erroneous records from total records
• Quality indicator: $\frac{\text{usable records}}{\text{total number of records}}$

HD External Source
• The accuracy of the data is checked
• Comparison with existing representative surveys
• Quality indicator: $\frac{\text{number of consistent values}}{\text{total number of linked records}}$
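
Both ratios on this slide are simple shares, which the following sketch makes explicit. The function names and the example counts are hypothetical; only the two formulas (usable records over total records, consistent values over linked records) come from the slide.

```python
def preprocessing_indicator(total_records, erroneous_records):
    """HD Pre-processing: usable records / total number of records."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    if not 0 <= erroneous_records <= total_records:
        raise ValueError("erroneous_records must lie in [0, total_records]")
    usable_records = total_records - erroneous_records
    return usable_records / total_records


def external_source_indicator(linked_records, consistent_values):
    """HD External Source: consistent values / total number of linked records."""
    if linked_records <= 0:
        raise ValueError("linked_records must be positive")
    if not 0 <= consistent_values <= linked_records:
        raise ValueError("consistent_values must lie in [0, linked_records]")
    return consistent_values / linked_records


# Made-up counts for one attribute of one register.
hd_preprocessing = preprocessing_indicator(total_records=1000, erroneous_records=50)
hd_external = external_source_indicator(linked_records=800, consistent_values=760)
print(hd_preprocessing, hd_external)
```

Note that the external-source ratio only covers records that could be linked to the comparison survey, so its denominator is usually much smaller than the register size.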


Quality framework - Overview


Quality assessment of the CDB and FDP

Unique Attributes
• Attribute exists in only one register, directly transferred to the CDB (e.g. highest level of education)

Multiple Attributes
• Attribute exists in more than one register, combined in the CDB using certain decision rules (e.g. demographic attributes)

Derived Attributes
• Attribute is created based on other attributes (e.g. type of commuter)



Quality assessment of unique attributes

The highest level of education (EDU) is delivered by one single register. The quality indicator is derived from the three hyperdimensions.

There are still missing values (with quality = 0) that decrease the quality indicator in the CDB.

After imputation of missing values, we assess the quality indicator of the attribute EDU in the Final Data Pool.
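
The effect of missing values on a unique attribute can be illustrated with a small sketch. Two things here are assumptions, not taken from the slides: the three hyperdimensions are combined by a weighted average, and imputed values receive a separate, externally supplied quality `q_imputed`. Only the rule that missing values count with quality 0 in the CDB comes from the slide.

```python
def register_quality(hd_doc, hd_pre, hd_ext, weights=(1.0, 1.0, 1.0)):
    """Combine the three hyperdimensions (assumed here: weighted average)."""
    w_doc, w_pre, w_ext = weights
    total = w_doc + w_pre + w_ext
    return (w_doc * hd_doc + w_pre * hd_pre + w_ext * hd_ext) / total


def cdb_quality(q_register, n_records, n_missing):
    """CDB level: observed values keep q_register, missing values count as 0."""
    observed = n_records - n_missing
    return (observed * q_register + n_missing * 0.0) / n_records


def fdp_quality(q_register, n_records, n_missing, q_imputed):
    """FDP level: missing values are imputed and scored with q_imputed
    (a hypothetical input, e.g. from evaluating the imputation model)."""
    observed = n_records - n_missing
    return (observed * q_register + n_missing * q_imputed) / n_records


# Made-up numbers for the attribute EDU.
q_edu = register_quality(0.90, 0.95, 0.85)
q_edu_cdb = cdb_quality(q_edu, n_records=1000, n_missing=100)
q_edu_fdp = fdp_quality(q_edu, n_records=1000, n_missing=100, q_imputed=0.70)
print(q_edu, q_edu_cdb, q_edu_fdp)
```

With these numbers the indicator drops from the register level to the CDB because of the missing values and partly recovers in the Final Data Pool once the gaps are imputed.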


Quality assessment of multiple attributes

SEX is available in two registers. The attribute is evaluated in both data sources with the three hyperdimensions.

Does the information differ between the two data sources? Which register should we believe in? Dempster-Shafer theory takes uncertainty, consistency and conflict into account.
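
To show what combining two sources with Dempster-Shafer theory looks like, the sketch below implements Dempster's standard rule of combination for two registers reporting SEX. The mapping from a register's quality indicator to a belief mass (quality on the reported value, the remainder on "unknown") is an assumption for illustration; the slides do not state how the masses are constructed.

```python
from itertools import product


def register_mass(value, quality, frame):
    """Assumed mapping: mass `quality` on the reported value,
    the remaining mass on the whole frame (uncertainty)."""
    return {frozenset([value]): quality, frame: 1.0 - quality}


def combine(m1, m2):
    """Dempster's rule of combination.

    Mass functions map frozenset hypotheses to masses that sum to 1.
    Returns the normalized combined masses and the conflict mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # contradictory statements
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    normalized = {h: m / (1.0 - conflict) for h, m in combined.items()}
    return normalized, conflict


frame = frozenset({"female", "male"})
# Made-up case: the two registers disagree for one person.
m_a = register_mass("female", 0.9, frame)
m_b = register_mass("male", 0.6, frame)

combined, conflict = combine(m_a, m_b)
for hypothesis, mass in combined.items():
    print(sorted(hypothesis), round(mass, 4))
print("conflict:", round(conflict, 4))
```

When both registers agree, the conflict is zero and the combined mass on the shared value exceeds either single quality; when they disagree, as here, the higher-quality register dominates but the result is less certain.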


Quality assessment of derived attributes

There is no information on current activity status (CAS) or commuters (COM) in the raw data. We derive the information for CAS from two other attributes in two data sources.

We obtain the required information for COM from the already derived attribute CAS. Thus, the quality indicator of both attributes is equal.

Imputations are applied to CAS. The imputed values are transferred to the COM attribute by the same derivation process already used in the CDB.


Usability of the results

Raw data
• Which register delivers a certain attribute with the highest quality indicator?
• Is there a register with a below-average quality for all delivered attributes?
• Is the quality indicator of a certain attribute worse than in the last delivery?

Census Database
• Is there any advancement of data quality by the use of multiple data sources?
• Comparison with prior censuses (plausibility checks)

Final Data Pool
• Comparison of attributes for further advancement
• Comparison of census generations over time
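
The three raw-data questions above translate directly into simple queries on a table of quality indicators. The sketch below assumes such a table is available as a mapping from (register, attribute) to quality; the register names and all values are made up.

```python
from statistics import mean


def best_register(quality, attribute):
    """Register delivering `attribute` with the highest quality indicator."""
    candidates = {reg: q for (reg, attr), q in quality.items() if attr == attribute}
    return max(candidates, key=candidates.get)


def below_average_registers(quality):
    """Registers whose quality is below the attribute average
    for every attribute they deliver."""
    attr_mean = {
        attr: mean(q for (_, a), q in quality.items() if a == attr)
        for attr in {a for _, a in quality}
    }
    registers = {reg for reg, _ in quality}
    return sorted(
        reg for reg in registers
        if all(q < attr_mean[a] for (r, a), q in quality.items() if r == reg)
    )


def worsened(current, previous):
    """(register, attribute) pairs whose quality dropped since the last delivery."""
    return sorted(k for k, q in current.items() if k in previous and q < previous[k])


# Hypothetical indicators for the current and the previous delivery.
current = {
    ("A", "SEX"): 0.97, ("B", "SEX"): 0.93, ("C", "SEX"): 0.85,
    ("A", "AGE"): 0.95, ("C", "AGE"): 0.80,
}
previous = {
    ("A", "SEX"): 0.96, ("B", "SEX"): 0.95, ("C", "SEX"): 0.85,
    ("A", "AGE"): 0.95, ("C", "AGE"): 0.82,
}

print(best_register(current, "SEX"))
print(below_average_registers(current))
print(worsened(current, previous))
```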


Further Information

Austrian Journal of Statistics, Volume 39 (2010), Number 4
• http://www.stat.tugraz.at/AJS/ausg104/104Berka.pdf

Statistica Neerlandica, Volume 66 (2012), Issue 1
• http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9574.2011.00506.x/pdf

ESSnet on Data Integration 2011, Madrid
• http://www.ine.es/e/essnetdi_ws2011/ppts/Lenk.pdf

ISI World Statistics Congress STS50 - Methods and quality of administrative data used in a census 2011, Dublin
• http://isi2011.congressplanner.eu/pdfs/650199.pdf

NTTS Conference 2011, Brussels
• http://www.cros-portal.eu/sites/default/files/S13P1.pdf

UNECE/Eurostat Expert Group Meeting on Register-Based Censuses 2010, The Hague
• http://live.unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.41/2010/wp.4.e.pdf

European Conference on Quality in Official Statistics 2010, Helsinki
• http://q2010.stat.fi/media//presentations/session-26/fiedler_quality-in-official-statistics_statisticsaustria_paper.pdf

European Conference on Quality in Official Statistics, June 2012
• http://www.q2012.gr/articlefiles/sessions/21.2_Manuela%20Lenk%20_A%20quality%20monitoring%20system.pdf


Please address queries to:
Manuela Lenk
Register based census

Contact information:
Guglgasse 13, 1110 Vienna
phone: +43 (1) 71128-8283
fax: +43 (1) [email protected]
