WP. 46 Providing access to data and making microdata safe: experiences of the ONS
Jane Longhurst, Paul Jackson
ONS
The Statistical Disclosure Control Problem
[Figure: diagram relating Disclosure Risk to Data Utility. Labels: Original Data, Accessed Data, No data, Maximum Tolerable Risk.]
Protecting and Providing Access to Microdata
Legal Issues
Policy Issues
The Data
Risk Assessment
Risk Management
Output
Test and Evaluate
Legal Issues
Legal Context
• No general statistics act
• No comprehensive business register
• No population register
• Registrations of Births, Marriages and Deaths are public, including cause of death
• A system of common law
• An Information Commissioner: a privacy and an access to information champion with court powers
• Data Protection / Human Rights
• Freedom of Information
Legal Issues
Legal Context, continued:
• Business surveys have statutory protection
– But ONS has the lawful authority to disclose identified business survey data to any central government department for any purpose, and to any local authority for its planning purposes.
Legal Issues
Legal Context, continued:
• Census records have statutory protection
– But ONS has lawful authority to disclose personal census information to any person for statistical purposes.
Legal Issues
Legal Context, continued:
• Household survey records are protected by the civil “common law duty of confidence.”
– But ONS has lawful authority to disclose identifying household survey data to any person where there is informed consent.
– And ONS survey pledges obtain consent for disclosures of ‘detailed but anonymised data’ to any genuine researcher.
Legal Issues
Legal Context
This extraordinary authority to disclose identifying microdata to certain persons, departments and authorities only delays the real issue:
– The access needs management (MRP)
– When it is not ONS applying the SDC standards for outputs, someone else has to.
– Therefore usable standards and guidance are essential
Legal Issues
Legal Context
• So when ONS has so many options, how does it decide:
– i) who should have controlled access, and under what conditions, and
– ii) what ONS or other users’ outputs should look like?
So we need Policy
Policy Issues
So we need Policy
• National Statistics Code of Practice for the GSS
• Protocol for data access and confidentiality
– A Confidentiality Guarantee
– National Statistics are guaranteed not likely to identify an individual, assuming an intruder is prepared to use a proportionate amount of time, effort and expertise.
• Departmental policy
– Variations according to considerations of: data source type; risk analysis and management; methodology; access / release options
Risk Assessment
• An element of disclosure risk comes from records that are unique in the sample and in a known population
• Several approaches to assessing the disclosure risk in microdata:
– Disclosure risk scenarios
– Variable checklist
– Quantitative risk measures
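As an illustration of the first bullet above, the following minimal sketch counts records that are unique on a set of key variables in the sample and checks whether they are also unique in the population. The data, the variable names and the helper functions (`key_counts`, `sample_uniques`, `risky_records`) are hypothetical, and in practice the population counts are usually not available and must be estimated.

```python
from collections import Counter

def key_counts(records, key_vars):
    """Count how many records share each combination of key-variable values."""
    return Counter(tuple(r[v] for v in key_vars) for r in records)

def sample_uniques(sample, key_vars):
    """Records whose key-variable combination occurs exactly once in the sample."""
    counts = key_counts(sample, key_vars)
    return [r for r in sample if counts[tuple(r[v] for v in key_vars)] == 1]

def risky_records(sample, population, key_vars):
    """Sample uniques that are also unique in the (assumed known) population."""
    pop_counts = key_counts(population, key_vars)
    return [r for r in sample_uniques(sample, key_vars)
            if pop_counts[tuple(r[v] for v in key_vars)] == 1]

# Toy, made-up data: five people in the population, three of them sampled.
population = [
    {"id": 1, "age_band": "30-39", "sex": "F", "occupation": "nurse"},
    {"id": 2, "age_band": "30-39", "sex": "F", "occupation": "nurse"},
    {"id": 3, "age_band": "40-49", "sex": "M", "occupation": "farmer"},
    {"id": 4, "age_band": "40-49", "sex": "M", "occupation": "teacher"},
    {"id": 5, "age_band": "40-49", "sex": "M", "occupation": "teacher"},
]
sample = [population[0], population[2], population[3]]
KEY_VARS = ["age_band", "sex", "occupation"]

unique_ids = [r["id"] for r in sample_uniques(sample, KEY_VARS)]
risky_ids = [r["id"] for r in risky_records(sample, population, KEY_VARS)]
print(unique_ids, risky_ids)
```

In this toy case all three sampled records are unique in the sample, but only the farmer is also unique in the population.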
Disclosure Risk Scenarios
• Identify possible situations where disclosure risk could occur
• Assumptions concerning prior knowledge of intruder and information available to him, e.g. private database, journalist, nosy neighbour
• Identify key variables (indirectly identifying variables)
• Use this process to decide what needs to be protected against
– can be complex
– requires discussion and judgement
SDC Checklist for Microdata Release
• Level of geography
• Ethnic classification
• Detail of occupation
• Visible variables
• Traceable variables
• Survey design
• Dissemination
Quantitative Risk Assessment
• Recognised need for quantitative risk measures
• Research project initiated
• Need for individual and global risk measures
• Problem for sample microdata is that population is an unknown parameter
• Different methods for estimating the disclosure risk measures
– Heuristics
– Probabilistic models
Probabilistic Modelling
• Estimate the disclosure risk based on natural assumptions about the distribution of the population
• Provides linked estimates of individual and global risk measures
• Research focused on
– Model selection techniques
– Robustness of estimates
– Goodness of fit criteria
• Tested on ONS social surveys
Heuristics
• DIS/SUDA method consists of two elements
– DIS: file level assessment of risk
– SUDA: grades and orders records within a file according to level of risk
• Provide variable and variable value contribution to the risk
•Implemented by ONS for 2001 Census SAR
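One way to picture how a heuristic can grade records is to look for the smallest sets of key variables on which a record is unique in the file. The sketch below is a simplified brute-force illustration of that idea only. It is not the DIS/SUDA implementation, and the function names, the `max_size` parameter and the `toy_score` weighting are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def minimal_sample_uniques(records, key_vars, max_size=3):
    """For each record index, list the minimal variable subsets on which
    that record is unique in the file (no smaller unique subset inside)."""
    msus = {i: [] for i in range(len(records))}
    for size in range(1, max_size + 1):          # smallest subsets first
        for subset in combinations(key_vars, size):
            counts = Counter(tuple(r[v] for v in subset) for r in records)
            for i, r in enumerate(records):
                if counts[tuple(r[v] for v in subset)] != 1:
                    continue
                # Skip if an already-found unique subset is contained in this one.
                if not any(set(m) < set(subset) for m in msus[i]):
                    msus[i].append(subset)
    return msus

def toy_score(subsets):
    """Hypothetical grading: uniqueness on fewer variables counts for more."""
    return sum(1 / len(s) for s in subsets)

# Toy, made-up file of four records.
records = [
    {"age_band": "30-39", "sex": "F", "region": "North"},
    {"age_band": "30-39", "sex": "M", "region": "North"},
    {"age_band": "40-49", "sex": "F", "region": "South"},
    {"age_band": "40-49", "sex": "F", "region": "North"},
]
msus = minimal_sample_uniques(records, ["age_band", "sex", "region"])
ranking = sorted(msus, key=lambda i: toy_score(msus[i]), reverse=True)
print(msus, ranking)
```

Listing the variable subsets behind each score also shows which variables contribute to a record's risk, in the spirit of the "variable contribution" bullet above.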
Evaluation of Quantitative Risk Measures
• Simulate sample surveys from Census data
• Compare risk measures with true risk
• Practical considerations
• How to set thresholds
• Incorporate risk measures into MRP decision process
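The evaluation idea above can be sketched as a small simulation: draw repeated samples from a fully known population and record how often a sample unique is also a population unique. This is a minimal sketch under stated assumptions. The choice of "share of sample uniques that are population uniques" as the true risk, the function name `simulate_true_risk` and all parameters are illustrative, and the population here is synthetic rather than Census data.

```python
import random
from collections import Counter

def simulate_true_risk(population_keys, sampling_fraction, n_reps=200, seed=0):
    """Average, over simulated samples, of the share of sample uniques
    that are also unique in the population."""
    rng = random.Random(seed)
    pop_counts = Counter(population_keys)
    n = max(1, round(sampling_fraction * len(population_keys)))
    shares = []
    for _ in range(n_reps):
        sample = rng.sample(population_keys, n)   # simple random sample
        sample_counts = Counter(sample)
        uniques = [k for k, c in sample_counts.items() if c == 1]
        if uniques:                               # skip samples with no uniques
            true_uniques = sum(1 for k in uniques if pop_counts[k] == 1)
            shares.append(true_uniques / len(uniques))
    return sum(shares) / len(shares) if shares else 0.0

# Synthetic population of key-variable combinations (made up for illustration):
# two common combinations and two that occur only once.
population_keys = [("A",)] * 50 + [("B",)] * 5 + [("C",)] + [("D",)]
risk = simulate_true_risk(population_keys, sampling_fraction=0.1)
print(risk)
```

An estimated risk measure computed from each sample alone could then be compared against this simulated value.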
Protecting and Providing Access to Microdata
Legal Issues
Policy Issues
The Data
Risk Assessment
Risk Management
Output
Test and
Evaluate
SDC for Microdata
• Perturbative methods
– Record swapping
– Adding noise
• Non-perturbative methods
– Recoding
– Suppression
– Sub-sampling
• Mixed strategies
• ONS mainly implements recoding
• PRAM implemented for 2001 Census SAR
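Two of the methods listed above can be sketched briefly. Recoding replaces a detailed value with a coarser category, and PRAM (taken here to mean the post-randomisation method) changes each value of a categorical variable at random according to a matrix of transition probabilities. The sketch below is illustrative only: the band width, the transition probabilities and the function names are assumptions, not the settings used by ONS.

```python
import random

def recode_age(age, width=10):
    """Recoding: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def pram(values, transition, seed=0):
    """Post-randomisation sketch: redraw each value from the row of
    transition probabilities that belongs to its current category."""
    rng = random.Random(seed)
    out = []
    for v in values:
        row = transition[v]
        categories = list(row)
        weights = [row[c] for c in categories]
        out.append(rng.choices(categories, weights=weights, k=1)[0])
    return out

ages = [34, 40, 67]
bands = [recode_age(a) for a in ages]

# Hypothetical transition matrix: each value is kept with probability 0.9.
transition = {
    "urban": {"urban": 0.9, "rural": 0.1},
    "rural": {"urban": 0.1, "rural": 0.9},
}
original = ["urban", "urban", "rural", "urban", "rural"]
perturbed = pram(original, transition)
print(bands, perturbed)
```

Recoding loses detail but keeps every value true, whereas PRAM keeps the detail but makes individual values uncertain.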
Access Options - SPECIALISTS
Data Laboratory
• Only government can use identifying business micro-data
• Identified census data is high risk
• Hence the on-site lab and the employment contracts
• Only safe data can leave the laboratory.
• Approx. 150 users/yr
Access Options - GOVERNMENT
Access Agreements in central and local government.
• UK is a devolved statistical system
• ONS discloses identifying survey micro-data to other government departments for statistics and research purposes
– Users are professionals like us, subject to the same Code of Practice, and the same laws.
– We don’t screen for research validity
– We don’t check outputs
• Approximately 300 disclosures of confidential micro-data every year
• No known breaches of confidentiality.
Access Options - RESEARCH
For academic researchers, the UK Data Archive
• If it didn’t exist, we’d have to invent it.
• All ONS household survey datasets are deposited with UKDA
– Year of birth, regional geography, all other variables (limited coding)
– Some large households removed
• Academic researchers and government departments can download the dataset upon signing a user license. Takes about an hour.
– This year, 16,600 downloads have taken place. Each can have up to 10 users in the institution…
– ONS does not screen the license applications
– ONS does not vet the research proposals
– ONS does not check outputs
– In place for 30 years now
– No known instance of wrongful identification.
Access Options
The UK Data Archive, cont’d
But this is not enough.
• So ONS has now created the ‘Special License’
– Month of birth
– Local authority geography
– All households
– Still access by downloading the data.
• ONS does check each Special License application
– But not for valid research, only data needs,
– And we still don’t check any outputs
Access Options - PUBLIC
For the Public, Freedom of Information
– ONS can only withhold microdata where its disclosure to an applicant would be likely to result in:
• A breach of any law it was collected under
• An actionable breach of confidence
• A breach of a data protection principle
– The Scottish Information Commissioner has instructed the Scottish Health Service to disclose to an applicant the counts of leukaemia in under-14-year-olds by ward (average ward population approx. 4,000)
• The table was all 1s and zeros: effectively microdata, and ‘safe’.
Access Options
Are ONS access options and practices reasonable?
• They follow the constructs used by the Courts and Information Commissioners, in that policies are written in plain English
• Licensed academic users are, in 30 years of experience, not intruders. They are trusted colleagues – and like us they can make mistakes sometimes.
• Other civil service professionals are not intruders – they are as reliable and trustworthy as we are. They too have professional codes of conduct, ethics, and moral principles
• All statisticians and researchers need clear rules, and should be trusted to follow them.