Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, 20-23 September 2011
Overview of Archiving of Microdata
Session 4
United Nations Statistics Division, Demographic Statistics Section
Overview of Presentation
What are microdata?
Why disseminate microdata?
Data files for archiving
Preparing the data sets
Data security
Tools for archiving of microdata
Risks of disseminating microdata
What are microdata?
Microdata:
are electronic data files containing information about each unit of enumeration, such as a person, household or housing unit
are organized data files in which each line (or record) contains information about one unit of observation
contain information in the form of coded values
contain different types of variables (numeric, alphanumeric, discrete or continuous) obtained from direct responses or derived by imputation or calculation
Why disseminate microdata?
The main reason is to support research by offering flexibility:
to define variables and modify categories in a way that meets the needs of researchers
to generate more interest, which facilitates wider use of census data
A closer relationship between data providers and users can improve the reliability and relevance of data
Versions of data files for archiving
Data producers often create multiple versions of microdata files. These files:
are created during different stages of the census operation
differ in quality, content and number of records
range from raw microdata files to cleaned and edited files for public use
What is sensitive in microdata?
To ensure data confidentiality, census data usually do not contain variables that are direct identifiers
Census data sets include variables that are indirect identifiers, such as detailed geographic information and detailed information on professional status
Some variables in microdata sets can be sensitive because of the nature of the information they contain, such as information on income, ethnicity and religion
Preparing the data set
Acquisition
Microdata can be generated from various data sources: censuses, surveys and administrative registers
A clear acquisition policy that describes the scope, source and mandate for the acquisition of microdata sets is necessary
The NSO can play an important role by expanding the scope of the data archive to official sources such as line ministries
Preparing the data set
Data file
Hierarchical/relational files are easier to analyze and more efficient for data storage
The identification variables in all data files should provide a unique identifier
Unique identifiers used to merge data files should be composed of numeric variables for more efficient sorting and filtering of records
A unique household identifier should not be a compilation of geographic codes, since these codes are highly identifying
All unnecessary or temporary variables should be removed from the data files
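The advice on identifiers can be illustrated with a minimal Python sketch. It is not part of the original presentation: the field names (hh_id, region, ea, hh_no) and both function names are hypothetical, and records are assumed to be held as plain dictionaries. The sketch replaces a geography-based household key with a shuffled numeric ID and carries that ID over to the person records so the two files can still be merged.

```python
import random


def assign_household_ids(households, key_fields, seed=None):
    """Give each household a numeric ID that carries no geographic meaning.

    households: list of dicts, one per household record.
    key_fields: names of the original (geography-based) key variables.
    Returns the re-keyed household records and the old-key -> new-ID mapping.
    """
    old_keys = [tuple(hh[f] for f in key_fields) for hh in households]
    if len(set(old_keys)) != len(old_keys):
        raise ValueError("original household keys are not unique")

    # Shuffle the IDs so their order does not mirror the geographic sort order.
    new_ids = list(range(1, len(households) + 1))
    random.Random(seed).shuffle(new_ids)

    key_to_id = dict(zip(old_keys, new_ids))
    rekeyed = [{"hh_id": new_id, **hh} for hh, new_id in zip(households, new_ids)]
    return rekeyed, key_to_id


def relink_persons(persons, key_fields, key_to_id):
    """Attach the new numeric household ID to every person record."""
    return [
        {"hh_id": key_to_id[tuple(p[f] for f in key_fields)], **p}
        for p in persons
    ]
```

The sketch only adds the new identifier; dropping or coarsening the original geographic codes in the public file, and keeping the mapping out of that file, would be separate steps.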
Preparing the data set
Variables and codes
All variables should be labeled (variable labels), and the codes for all categorical variables should be labeled (value labels)
“Missing” codes should be standardized for all variables
The “Not applicable” code should be distinct from other missing codes
If “errors” or “missing data” are imputed, this should be indicated in the data set
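The coding rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a procedure from the presentation: the standard codes -9 and -8, the "_imputed" flag suffix and the function names are all hypothetical choices.

```python
MISSING = -9          # hypothetical standard code for "missing"
NOT_APPLICABLE = -8   # hypothetical, kept distinct from MISSING


def standardise_missing(record, missing_codes, na_codes):
    """Map variable-specific source codes onto the two standard codes.

    missing_codes / na_codes: dict of variable name -> set of source codes.
    """
    out = dict(record)
    for var, value in record.items():
        if value in na_codes.get(var, ()):
            out[var] = NOT_APPLICABLE
        elif value is None or value in missing_codes.get(var, ()):
            out[var] = MISSING
    return out


def impute_with_flag(record, var, imputed_value):
    """Fill a missing value and record that it was imputed."""
    out = dict(record)
    if out[var] == MISSING:
        out[var] = imputed_value
        out[var + "_imputed"] = 1   # value was derived, not reported
    else:
        out[var + "_imputed"] = 0   # value came from the respondent
    return out
```

Keeping the imputation flag as a separate variable lets users of the archived file choose whether to include imputed values in their analysis.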
Preparing the data set
Verification operation
If a dataset is hierarchical, all records in the individual-level files should have a corresponding household in the household-level file
The number of records in each file should be verified
Data from all sections of the questionnaire should be included in the dataset
===> Set up verification rules to check data sets
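A set of verification rules like those listed above might look like the following Python sketch. It assumes, hypothetically, that records are dictionaries linked by an hh_id variable and that the expected record counts and required variable names are supplied by the data producer; none of these names come from the presentation.

```python
def verify_datasets(households, persons, expected_counts=None,
                    required_person_vars=()):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Rule 1: household identifiers must be unique.
    hh_ids = [hh["hh_id"] for hh in households]
    known = set(hh_ids)
    if len(known) != len(hh_ids):
        problems.append("duplicate household identifiers")

    # Rule 2: every person record needs a matching household record.
    orphans = sorted({p["hh_id"] for p in persons} - known)
    if orphans:
        problems.append(f"person records without a household: {orphans}")

    # Rule 3: record counts must match the expected totals, if given.
    actual = {"households": len(households), "persons": len(persons)}
    for name, expected in (expected_counts or {}).items():
        if actual.get(name) != expected:
            problems.append(
                f"{name}: expected {expected} records, found {actual.get(name)}"
            )

    # Rule 4: variables from every questionnaire section must be present.
    for var in required_person_vars:
        absent = sum(1 for p in persons if var not in p)
        if absent:
            problems.append(f"variable {var!r} absent from {absent} person records")

    return problems
```

Returning a list of problems rather than stopping at the first failure makes it possible to log every issue in one pass before a file is archived.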
Data security
Physical security
Controlling access to rooms where data are held
Logging the removal of, and access to, media or hard copy material in store rooms
Network security
Not storing confidential data on servers or computers connected to an external network
Firewall protection and security-related upgrades to avoid viruses and malicious code
Data security
Security of computer systems and files
Locking computer systems with a password and installing a firewall system
Implementing password protection of, and controlled access to, data files
Protecting servers with power surge protection through line-interactive uninterruptible power supply (UPS) systems
Imposing non-disclosure agreements on managers or users of confidential data
Data security
Security of personal data
Anonymising or aggregating data
Separating data content according to security needs
Removing personal information from data files and storing it separately
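Removing personal information and storing it separately can be sketched as below. This is a simplified illustration, not the presenters' procedure: the function name and the example field names (name, address) are hypothetical, and a real archive would also need to secure the confidential file itself.

```python
def split_identifiers(records, id_field, direct_identifiers):
    """Split records into an analysis file and a confidential file.

    The analysis file keeps everything except the direct identifiers.
    The confidential file keeps only the record ID and the direct identifiers,
    so the two can be re-linked by authorised staff when necessary.
    """
    analysis, confidential = [], []
    for rec in records:
        analysis.append(
            {k: v for k, v in rec.items() if k not in direct_identifiers}
        )
        confidential.append(
            {id_field: rec[id_field],
             **{k: rec[k] for k in direct_identifiers if k in rec}}
        )
    return analysis, confidential
```

For example, calling split_identifiers(records, "person_id", {"name", "address"}) would leave names and addresses only in the second list, which can then be stored under stricter access controls.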
Tools for archiving microdata
International Household Survey Network (IHSN)
A network of international agencies coordinated by the World Bank/PARIS21
Develops tools, guidelines and training materials
Advocates compliance with good practices and international standards
Tools for archiving microdata
Redatam-based IMIS
Originally developed at CELADE to promote access to census microdata
It is a database management tool that manages large volumes of census data
Aims to promote access to and analysis of census and other data for informed decision making for sectoral and local development policies and programmes
Risks of disseminating microdata
Maintaining respondents’ trust: confidentiality protection is the key element of trust
Potential misuse and misunderstanding of data by users: there should be procedures to prevent misuse of microdata, and good documentation and technical support to prevent misunderstanding of microdata
Exposure to criticism and contradiction: data quality may not be good enough for further dissemination; there may be inconsistency between research results based on microdata and published aggregated data
Risks of disseminating microdata
Legal issues: it is crucial for data producers to ensure there is a sound legal and ethical base (as well as the technical and methodological tools) for protecting confidentiality
Costs: these include not only the costs of creating and documenting microdata files, but also the costs of creating access tools and safeguards, of supporting and authorizing enquiries made by the research community, and of training and supporting new users of microdata files
Technical capacity: the files need to be well documented and preserved, and reviewed to identify the risk of disclosure of individual information so that the risk can be reduced using various techniques
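The presentation does not name a specific technique for reviewing disclosure risk. One common approach, shown here only as an assumed example, is to count how many records share each combination of indirect identifiers and to flag combinations held by fewer than k records. The function name, the threshold k=3 and the variable names in the usage note are hypothetical.

```python
from collections import Counter


def risky_records(records, indirect_identifiers, k=3):
    """Return records whose combination of indirect identifiers is rare.

    A combination shared by fewer than k records is treated as a
    disclosure risk that may call for recoding or suppression.
    """
    def combo(rec):
        return tuple(rec[v] for v in indirect_identifiers)

    counts = Counter(combo(rec) for rec in records)
    return [rec for rec in records if counts[combo(rec)] < k]
```

For example, risky_records(persons, ("district", "occupation"), k=3) would list every person whose district and occupation pairing occurs fewer than three times in the file; such records could then be handled by coarsening the geography or grouping occupation categories.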
Microdata are archived:
“to allow future users to retrieve, access, decipher, view, interpret, understand and experience documents, data and records in meaningful and valid ways” (Jeff Rothenberg)
“to create institutional memory for long-term research”
THANK YOU …..