Basque Statistics Office Confidentiality Project: Final stages Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality Tarragona, Spain, 26-28 October 2011


Page 1: Basque Statistics Office Confidentiality Project:  Final stages

Basque Statistics Office Confidentiality Project:

Final stages

Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Tarragona, Spain, 26-28 October 2011

Page 2: Basque Statistics Office Confidentiality Project:  Final stages

Outline

1. Introduction

2. Microdata for standard distribution

2.1. Background

2.2. Methodology for the production of safe microdata

2.3. Surveys analysed

3. In situ access for researchers

4. Future tasks

5. Conclusions

Page 3: Basque Statistics Office Confidentiality Project:  Final stages

1. Introduction

Period | Action | Output
1988-1999 | Research fellowship on data protection techniques and statistical confidentiality. | Technical notebook on “Statistical Data Protection Techniques” edited by EUSTAT.
April 2000 | International Seminar on “Confidentiality and statistical data protection techniques” organized by EUSTAT. Lecturer: L.H. Cox | Publication: “Confidentiality and statistical data protection techniques”, L.H. Cox, edited by EUSTAT.
September 2000 | Security analysis of Census tables | Internal report about sensitive crosses and dissemination proposal
2001 | Participation in the Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality (Skopje, Macedonia, 14-16 March) | Article: “A comparative test for several threshold values in frequency tables: A Tau-Argus performance example.”

Page 4: Basque Statistics Office Confidentiality Project:  Final stages

1. Introduction

Period | Action | Output
2002 | Tabular data protection of preliminary results of the Census 2001, using Tau-Argus (optimal method). | Publication of suppression patterns for frequency tables with fine geographical levels.
2003-2004 | Follow-up of the CASC project. | Testing of Argus software.
June 2004 | Attendance at the PSD (Privacy in Statistical Databases) Conference (Barcelona, Spain, 6-9 June) |
2005 | Staff training on disclosure control and protection software. | Internal workshop on SDC techniques and ARGUS.

Page 5: Basque Statistics Office Confidentiality Project:  Final stages

1. Introduction

Period | Action | Output
2006 | Work on standard safety criteria | Internal report about analysis of sources and internal situation.
December 2006 | Attendance at the PSD Conference (Rome, December) | Feedback and contacts.
2007 | Constitution of the Confidentiality Council | Group of experts to assess and deal with issues of confidentiality in terms of distribution
2007 | Rules for website tables and microdata distribution | Creation of “Rules of Confidentiality in statistical distribution”
2008- | Microdata generation and in situ access for researchers | Public use microfiles; microfiles for researchers

Page 6: Basque Statistics Office Confidentiality Project:  Final stages

2. Microdata for standard distribution

Page 7: Basque Statistics Office Confidentiality Project:  Final stages

2.1. Background

• Until 2008…
  - Microdata distribution only on request
  - Few requests
  - Users: universities, researchers,…

• From 2008…
  - First microdata files for standard distribution
  - Social and demographic surveys (households and individuals)
  - Increase in requests
  - Users: general public

• Today…
  - Standard microfiles on the EUSTAT website
  - Request form (user identification; the objective of the request)
  - Business surveys: in situ access for researchers

Page 8: Basque Statistics Office Confidentiality Project:  Final stages

2.2. Methodology for the creation of microdata

• Study the structure of the files: statistical unit, hierarchies,…

• Selection and filtering of variables
  – Geographical level
  – Identifying variables (sex, age, place of residence, civil status, profession,...)
  – Sensitive topics (ideology, union membership, religion, beliefs, health)

• Risk analysis (Mu-Argus)

• Microdata protection techniques

• “Safe” microdata and metadata distribution

Page 9: Basque Statistics Office Confidentiality Project:  Final stages

2.3. Surveys analysed

• Survey on living conditions (ECV 2008)

• Survey on demographics and validation (EDV 2009)

• Survey on social capital (ECS 2010)

• Survey on environment - families (EMAF 2010)

• Survey on the information society - families (ESIF 2011)

Page 10: Basque Statistics Office Confidentiality Project:  Final stages

Survey on living conditions: creation of microdata

Periodicity: five-year survey
Type of survey: sample survey
Statistical units: families and individuals
Sample size: 4,909 families (and one individual per family)

Objectives and information collected:

To learn about the living conditions (health, education, work, free time, environment,…) of Basque families and the Basque population.

Page 11: Basque Statistics Office Confidentiality Project:  Final stages

Survey on living conditions: creation of microdata

Risk analysis

• Structure of the microfile:
  – one file of families
  – one file of individuals
  – a key for the join is included

• Selection and filtering of variables
  – Geographical level: quality and confidentiality criteria
  – Identifying variables (number of categories in parentheses):

Individuals file | Families file
Province (3) | Province (3)
Municipality (104) | Municipality (104)
Zone (9) | Zone (9)
Age (100) | Age (100)
Profession (9) | Profession (9)
Sex (2) | Number of spaces (24)
Civil status (5) | Family size (9)
Place of birth (5) | Place of birth (5)
Level of education (4) | Place of birth (5)
Relation to activity (3) | Professional situation (7)
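The two-file structure with a join key described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the key name `family_id`, the `fam_` prefix and the function name are hypothetical and do not come from the EUSTAT files.

```python
def join_individuals_to_families(individuals, families, key="family_id"):
    """Attach the variables of each family to the individual sampled from it.

    `individuals` and `families` are lists of dictionaries; `key` is the
    (hypothetical) name of the join key present in both files.
    """
    # Index the families file by the join key for direct lookup.
    families_by_key = {fam[key]: fam for fam in families}

    joined = []
    for ind in individuals:
        fam = families_by_key[ind[key]]  # raises KeyError if no family matches
        row = dict(ind)
        # Prefix family variables so they cannot overwrite individual variables
        # that share a name (for example age or place of birth).
        row.update({f"fam_{name}": value for name, value in fam.items() if name != key})
        joined.append(row)
    return joined
```

Keeping the files separate and joining only when needed means each file can be risk-assessed on its own set of identifying variables.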

Page 12: Basque Statistics Office Confidentiality Project:  Final stages

Survey on living conditions: creation of microdata

Risk analysis: results

Families file

KEY | Number of unique records (traditional method) | % of unique records (traditional method) | Expected number of re-identifications (probabilistic method) | Expected % of re-identifications (probabilistic method)
MUN x AGEF x CPROF1 | 2980 | 59.79 | 114.98 | 2.31
MUN x AGEF x SPAC | 2945 | 59.09 | 112.09 | 2.25
MUN x AGEF x FAMSIZE | 2522 | 50.60 | 97.14 | 1.95
MUN x AGEF x SPROF | 2333 | 46.81 | 92.20 | 1.85
MUN x AGEF x NIVI1 | 2229 | 44.72 | 86.42 | 1.73
MUN x AGEF x ECIV | 2065 | 41.43 | 79.94 | 1.60
MUN x AGEF x SEXF | 1833 | 36.78 | 72.17 | 1.45
MUN x AGEF x SEXF x CPROF1 | 3241 | 65.03 | 124.70 | 2.50

Individuals file

KEY | Number of unique records (traditional method) | % of unique records (traditional method) | Expected number of re-identifications (probabilistic method) | Expected % of re-identifications (probabilistic method)
MUN x AGEI x CPROF2 | 3064 | 62.42 | 60.87 | 1.24
MUN x AGEI x NIVI1I | 2473 | 50.38 | 48.60 | 0.99
MUN x AGEI x LNACI | 2295 | 46.75 | 46.64 | 0.95
MUN x AGEI x ECIVI | 2205 | 44.92 | 46.14 | 0.94
MUN x AGEI x SEXI | 2183 | 44.47 | 42.71 | 0.87
MUN x AGEI x RELA | 2098 | 42.74 | 40.74 | 0.83
MUN x AGEI x SEXI x CPROF2 | 3442 | 70.12 | 68.73 | 1.40
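A minimal sketch of the two measures reported in the tables above, assuming each record is a dictionary holding the key variables and a sampling weight. The traditional method counts records whose key combination is unique in the sample. The probabilistic part is a simplified stand-in, taking the risk of a record as 1 over the summed sampling weights of its key combination; it is not the estimator implemented in Mu-Argus. The function name, the field names and the `weight` variable are hypothetical. Percentages are taken over the number of records in the file, which is consistent with the tables (3064 of 4,909 records is 62.42%).

```python
from collections import Counter, defaultdict


def risk_summary(records, key_vars, weight_var="weight"):
    """Uniqueness and expected re-identification figures for one key."""
    if not records:
        raise ValueError("records must not be empty")

    # Combination of key values for each record, e.g. (MUN, AGE, SEX).
    keys = [tuple(rec[v] for v in key_vars) for rec in records]

    # Traditional method: records whose combination appears once in the sample.
    sample_freq = Counter(keys)
    n_unique = sum(1 for k in keys if sample_freq[k] == 1)

    # Simplified probabilistic method (assumption): estimate the population
    # frequency of each combination by the sum of its sampling weights and
    # use 1 / estimated frequency as the re-identification risk of a record.
    pop_freq = defaultdict(float)
    for k, rec in zip(keys, records):
        pop_freq[k] += rec[weight_var]
    expected_reid = sum(1.0 / pop_freq[k] for k in keys)

    n = len(records)
    return {
        "unique_records": n_unique,
        "pct_unique": 100.0 * n_unique / n,
        "expected_reidentifications": expected_reid,
        "pct_expected": 100.0 * expected_reid / n,
    }
```

Calling `risk_summary(records, ["MUN", "AGE", "SEX"])` for each candidate key gives one row of a table like the ones above, so that keys can be ranked by risk before choosing protection techniques.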

Page 13: Basque Statistics Office Confidentiality Project:  Final stages

Survey on living conditions: creation of microdata

Re-identification ratios

[Charts: re-identification ratios for the individuals file and the families file]

Page 14: Basque Statistics Office Confidentiality Project:  Final stages

Survey on living conditions: creation of microdata

Microdata protection techniques

Global recoding
• Geographical level: zone (groups of municipalities)
• Age: five-year groups

Top and bottom coding
• Family size: top-coded (10 or more)
• Number of rooms: top- and bottom-coded (1-3 rooms, …, 7 or more)

Filtering of sensitive variables related to:
• Health, family income, economic constraints, crime in the neighbourhood, participation in politics, interest in games of chance,…
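The recodings listed on this slide can be sketched as small functions. This is an illustration under stated assumptions, not EUSTAT's production code: the field names (`age`, `family_size`, `rooms`) and the contents of `SENSITIVE` are hypothetical, and the recoding of municipality to zone is left out because it needs a lookup table that the slides do not give.

```python
def five_year_group(age):
    """Global recoding: map a single year of age to its five-year band."""
    low = (age // 5) * 5
    return f"{low}-{low + 4}"


def top_code(value, top):
    """Top coding: collapse every value at or above `top` into one category."""
    return f"{top}+" if value >= top else str(value)


def top_bottom_code(value, bottom, top, minimum=1):
    """Top and bottom coding: collapse both tails, keep the values in between."""
    if value <= bottom:
        return f"{minimum}-{bottom}"
    if value >= top:
        return f"{top}+"
    return str(value)


# Hypothetical names for variables filtered out as sensitive.
SENSITIVE = {"health", "family_income"}


def protect_family_record(rec):
    """Return a protected copy of one record from the families file."""
    out = {name: value for name, value in rec.items() if name not in SENSITIVE}
    out["age"] = five_year_group(rec["age"])                # five-year groups
    out["family_size"] = top_code(rec["family_size"], 10)   # 10 or more
    out["rooms"] = top_bottom_code(rec["rooms"], 3, 7)       # 1-3, ..., 7 or more
    return out
```

After recoding, the risk analysis would normally be rerun on the protected file to check that the share of unique records has fallen to an acceptable level.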

Page 15: Basque Statistics Office Confidentiality Project:  Final stages

3. In situ access for researchers

The aim: To provide a better service to researchers by permitting access to microdata in EUSTAT facilities under a rigorous security protocol.

Protocol stages

• Request including information on the applicant, the objectives of the research project, the people involved, a detailed description of the request and a work plan.

• Request authorisation/rejection based on the fulfilment of the requirements and purpose of the request (“scientific purpose”).

• Signing of a contract (confidentiality obligation, conditions of access,…)

• Access at EUSTAT centres: software and hardware restrictions

• Checking of the outputs

Page 16: Basque Statistics Office Confidentiality Project:  Final stages

4. Future tasks

• Continue creating new microdata files during 2011 and 2012 (Labour Force Survey, Natural Population Movement, Survey on Family Conciliation...)

• Offer in situ access from the provincial offices in Bilbao and Donostia-San Sebastián and consider the possibility of remote access.

Page 17: Basque Statistics Office Confidentiality Project:  Final stages

5. Conclusions

• The creation of microdata requires teamwork among methodologists, experts on protection techniques, and producers of statistics. Training is needed to spread this know-how.

• A driving force is indispensable for cross-cutting projects of this kind; in our case that role is played by the Confidentiality Council, which brings together the different departments of EUSTAT.

Page 18: Basque Statistics Office Confidentiality Project:  Final stages

References

• Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data.

• Basque Statistics Office - EUSTAT (1999). Statistical Data Protection Techniques. Technical notebook.

• Basque Statistics Office - EUSTAT (2007). Treatment of Confidentiality in EUSTAT statistical operations. Confidentiality protocol.

• Garín, A., Urrutia, J. (2000). Statistical Secret protection: basic elements of a data protection system. OFISTAT Seminar.

• National Institute of Statistics - INE (1994). Population and Households Census 1991: Methodology. ISBN: 84-260-2889-6. Madrid.

• Law 4/1986 of 23 April - Basque Statistical Law.

• Law 15/1999 of 13 December - Organic Law on Personal Data Protection.

• Statistical Programme Committee (2005). European Statistics Code of Practice and Commission Recommendations. Brussels.

Page 19: Basque Statistics Office Confidentiality Project:  Final stages

Thanks for your attention!