Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, 20-23 September 2011 Overview of Archiving of Microdata Session 4 United Nations Statistics Division Demographic Statistics Section

Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, 20-23 September 2011

Overview of Archiving of Microdata

Session 4

United Nations Statistics Division

Demographic Statistics Section

Overview of Presentation

What are microdata?

Why disseminate microdata?

Data files for archiving

Preparing the data sets

Data security

Tools for archiving of microdata

Risks of disseminating microdata

What are microdata?

Microdata:

are electronic data files containing information about each unit of enumeration, such as a person, household or housing unit

are organized data files in which each line (or record) contains information about one unit of observation

contain information in the form of coded values

contain different types of variables (numeric, alphanumeric, discrete or continuous) obtained from direct responses or derived by imputation or calculation
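As a minimal illustration of the structure described above, the sketch below holds a few person records as coded values, one record per unit of observation, and decodes them with value labels. The variable names, codes and labels are invented for illustration and are not taken from any census.

```python
# Hypothetical person-level microdata: one record per person, coded values.
persons = [
    {"hh_id": 1, "person_no": 1, "sex": 1, "age": 34, "marital": 2},
    {"hh_id": 1, "person_no": 2, "sex": 2, "age": 31, "marital": 2},
    {"hh_id": 2, "person_no": 1, "sex": 2, "age": 67, "marital": 4},
]

# Value labels for the categorical variables (assumed codes).
value_labels = {
    "sex": {1: "Male", 2: "Female"},
    "marital": {1: "Never married", 2: "Married", 3: "Divorced", 4: "Widowed"},
}


def decode(record):
    """Return a copy of the record with categorical codes replaced by labels.

    Variables without value labels (for example age) are left unchanged.
    """
    return {
        name: value_labels.get(name, {}).get(value, value)
        for name, value in record.items()
    }


decoded = [decode(p) for p in persons]
```

Keeping the codes in the data file and the labels in a separate mapping mirrors how statistical packages store coded variables alongside their value labels.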

Why disseminate microdata?

The main reason is to support research by offering the flexibility to define variables and modify categories in ways that meet the needs of researchers

Dissemination also generates more interest, which facilitates wider use of census data

A closer relationship between data providers and users can improve the reliability and relevance of data

Versions of data files for archiving

Data producers often create multiple versions of microdata files. These files:

are created during different stages of the census operation

differ in quality, content and number of records

range from raw microdata files to cleaned and edited files for public use

What is sensitive in microdata?

To ensure data confidentiality, census data sets usually do not contain variables that are direct identifiers

Census data sets do include variables that are indirect identifiers:

detailed geographic information

detailed information on professional status

Some variables in microdata sets can be sensitive because of the nature of the information they contain, for example information on income, ethnicity or religion

Preparing the data set

Acquisition

Microdata can be generated from various data sources: censuses, surveys and administrative registers

A clear acquisition policy that describes the scope, source and mandate for the acquisition of microdata sets is necessary

The NSO can play an important role by expanding the scope of the data archive to official sources such as line ministries

Preparing the data set

Data file

Hierarchical/relational files are easier to analyze and more efficient for data storage

The identification variables in all data files should provide a unique identifier

Unique identifiers used to merge data files should be composed of numeric variables for more efficient sorting and filtering of records

A unique household identifier should not be a compilation of geographic codes, since these codes are highly identifying

All unnecessary or temporary variables should be removed from the data files
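The recommendations on identifiers can be sketched as follows. This is only an illustration under stated assumptions: the `geo_key` format, the variable names and the sample records are hypothetical. The sketch replaces a household key built from geographic codes with a sequential numeric identifier, applies it to both files, and merges person records with household variables.

```python
# Hypothetical household file keyed by concatenated geographic codes
# (region, district, enumeration area, household number).
households = [
    {"geo_key": "03-12-045-007", "rooms": 3},
    {"geo_key": "03-12-045-008", "rooms": 2},
    {"geo_key": "11-02-101-001", "rooms": 5},
]
persons = [
    {"geo_key": "03-12-045-007", "person_no": 1, "age": 40},
    {"geo_key": "03-12-045-007", "person_no": 2, "age": 12},
    {"geo_key": "11-02-101-001", "person_no": 1, "age": 58},
]

# Crosswalk from the identifying key to a sequential numeric identifier.
# It would be stored separately from the files prepared for dissemination.
crosswalk = {h["geo_key"]: i for i, h in enumerate(households, start=1)}


def rekey(records):
    """Return new records with geo_key replaced by the numeric hh_id."""
    out = []
    for rec in records:
        new = {"hh_id": crosswalk[rec["geo_key"]]}
        new.update((k, v) for k, v in rec.items() if k != "geo_key")
        out.append(new)
    return out


households_out = rekey(households)
persons_out = rekey(persons)

# Merge: attach a household-level variable to each person through hh_id.
hh_by_id = {h["hh_id"]: h for h in households_out}
merged = [{**p, "rooms": hh_by_id[p["hh_id"]]["rooms"]} for p in persons_out]
```

Sequential numbers assigned in file order can still follow the geographic sort order of the file, so a real workflow might shuffle the households before numbering them; that step is left out here.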

Preparing the data set

Variables and codes

All variables should be labeled (variable labels), and the codes for all categorical variables should be labeled (value labels)

“Missing” codes should be standardized for all variables

The “Not applicable” code should be distinct from other missing codes

If errors or missing data have been imputed, this should be indicated in the data set
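A minimal sketch of these coding rules follows. The code values (-9 and -8), the raw codes, the variable names and the imputed value are all assumptions chosen for illustration; the presentation does not prescribe any of them.

```python
# Assumed standard codes (hypothetical, not prescribed by the source).
MISSING = -9          # value missing or not stated
NOT_APPLICABLE = -8   # question did not apply to this respondent

# Raw records use a different "missing" code for each variable.
raw = [
    {"age": 25, "children_born": 99, "income": 1200},
    {"age": 7, "children_born": None, "income": 9999},
    {"age": 999, "children_born": 2, "income": 800},
]
# Per-variable raw codes that mean "missing" in this hypothetical file.
raw_missing = {"age": {999}, "children_born": {99}, "income": {9999}}


def standardize(rec):
    """Map raw missing codes to MISSING and blanks to NOT_APPLICABLE."""
    out = {}
    for name, value in rec.items():
        out[name] = MISSING if value in raw_missing[name] else value
    # In this hypothetical file a blank (None) means the question was
    # skipped because it did not apply to the respondent.
    if out["children_born"] is None:
        out["children_born"] = NOT_APPLICABLE
    return out


clean = [standardize(r) for r in raw]

# Imputation is recorded in a flag variable rather than applied silently.
for rec in clean:
    rec["age_imputed"] = 0
    if rec["age"] == MISSING:
        rec["age"] = 30          # placeholder imputed value, illustration only
        rec["age_imputed"] = 1
```

The flag variable `age_imputed` lets a later user separate reported values from imputed ones, which is the point of indicating imputation in the data set.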

Preparing the data set

Verification operation

If a dataset is hierarchical, all records in the individual-level files should have a corresponding household in the household-level file

The number of records in each file should be verified

Data from all sections of the questionnaire should be included in the dataset

This requires setting up verification rules to check the data sets
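Verification rules of this kind can be sketched as a small function. The record layout, the `hh_id` key and the expected counts are hypothetical; a real archive would take the expected counts from fieldwork control totals.

```python
def verify(households, persons, expected_households, expected_persons):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Rule 1: household identifiers must be unique.
    hh_ids = [h["hh_id"] for h in households]
    if len(hh_ids) != len(set(hh_ids)):
        problems.append("duplicate household identifiers")

    # Rule 2: every person record must belong to a household on file.
    known = set(hh_ids)
    orphans = [p for p in persons if p["hh_id"] not in known]
    if orphans:
        problems.append(f"{len(orphans)} person record(s) without a household")

    # Rule 3: record counts must match the expected totals.
    if len(households) != expected_households:
        problems.append(
            f"household count {len(households)} != {expected_households}"
        )
    if len(persons) != expected_persons:
        problems.append(f"person count {len(persons)} != {expected_persons}")

    return problems


households = [{"hh_id": 1}, {"hh_id": 2}]
persons = [{"hh_id": 1}, {"hh_id": 1}, {"hh_id": 3}]
problems = verify(households, persons, expected_households=2, expected_persons=3)
```

In this sample the third person refers to household 3, which is not in the household file, so the function reports one orphaned person record.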

Data security

Physical security

Controlling access to rooms where data are held

Logging the removal of, and access to, media or hard-copy material in store rooms

Network security

Not storing confidential data on servers or computers connected to an external network

Firewall protection and security-related upgrades to avoid viruses and malicious code

Data security

Security of computer systems and files

Locking computer systems with passwords and installing a firewall system

Implementing password protection of, and controlled access to, data files

Protecting servers against power surges with line-interactive uninterruptible power supply (UPS) systems

Imposing non-disclosure agreements on managers or users of confidential data

Data security

Security of personal data

Anonymising or aggregating data

Separating data content according to security needs

Removing personal information from data files and storing it separately
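Removing personal information and storing it separately can be sketched as below. The list of direct identifiers, the field names and the sample records are assumptions made for illustration only.

```python
# Assumed list of directly identifying fields (hypothetical names).
DIRECT_IDENTIFIERS = ("name", "address")


def split_record(record, link_key="hh_id"):
    """Split one record into (analysis part, identifier part).

    Both parts keep the link key so that authorized staff can rejoin them.
    """
    analysis = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    identifiers = {link_key: record[link_key]}
    identifiers.update(
        {k: v for k, v in record.items() if k in DIRECT_IDENTIFIERS}
    )
    return analysis, identifiers


records = [
    {"hh_id": 1, "name": "A. Example", "address": "Plot 12", "rooms": 3},
    {"hh_id": 2, "name": "B. Example", "address": "Plot 40", "rooms": 2},
]

pairs = [split_record(r) for r in records]
analysis_file = [a for a, _ in pairs]      # kept with the data for analysis
identifier_file = [i for _, i in pairs]    # stored separately, stricter access
```

Removing direct identifiers does not by itself make a file anonymous, because indirect identifiers such as detailed geography remain in the analysis part.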

Tools for archiving microdata

International Household Survey Network (IHSN)

A network of international agencies coordinated by the World Bank/PARIS21

Develops tools, guidelines and training materials

Advocates compliance with good practices and international standards

Tools for archiving microdata

Redatam-based IMIS

Originally developed at CELADE to promote access to census microdata

It is a database management tool that manages large volumes of census data

Aims to promote access to and analysis of census and other data for informed decision making for sectoral and local development policies and programmes

Risks of disseminating microdata

Maintaining respondents’ trust: confidentiality protection is the key element of trust

Potential misuse and misunderstanding of data by users: there should be procedures to prevent misuse of microdata; good documentation and technical support to prevent misunderstanding of microdata

Exposure to criticism and contradiction: data quality may not be good enough for further dissemination; there may be inconsistency between research results based on microdata and published aggregated data

Risks of disseminating microdata

Legal issues: it is crucial for data producers to ensure there is a sound legal and ethical base (as well as the technical and methodological tools) for protecting confidentiality

Costs: these include not only the costs of creating and documenting microdata files, but also the costs of creating access tools and safeguards, of supporting and authorizing enquiries made by the research community, and of training and supporting new users of microdata files

Technical capacity: the files need to be well documented and preserved, and reviewed to identify the risk of disclosure of individual information, with that risk reduced using various techniques
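One simple way to review a file for disclosure risk, offered here as an assumption and not as the technique the presentation had in mind, is to count how many records share each combination of indirect identifiers and to flag the combinations that are rare. The variable names, the threshold and the sample records below are hypothetical.

```python
from collections import Counter

# Assumed set of indirect identifiers (hypothetical variable names).
QUASI_IDENTIFIERS = ("district", "occupation", "age_group")

records = [
    {"district": "A", "occupation": "teacher", "age_group": "30-39"},
    {"district": "A", "occupation": "teacher", "age_group": "30-39"},
    {"district": "A", "occupation": "judge", "age_group": "50-59"},
    {"district": "B", "occupation": "farmer", "age_group": "20-29"},
    {"district": "B", "occupation": "farmer", "age_group": "20-29"},
]


def risky_records(records, keys, threshold=2):
    """Return records whose key combination occurs fewer than `threshold` times."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return [r for r in records if counts[tuple(r[k] for k in keys)] < threshold]


flagged = risky_records(records, QUASI_IDENTIFIERS)
```

Here only the judge in district A is unique on the chosen variables. Flagged records are candidates for risk reduction, for instance by coarsening the geography or grouping occupations into broader categories.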

Microdata are archived:

“to allow future users to retrieve, access, decipher, view, interpret, understand and experience documents, data and records in meaningful and valid ways” (Jeff Rothenberg)

“to create institutional memory for long-term research”

THANK YOU …..