Alive and kicking! Keeping data re-usable in the European Values Study

Evelyn Brislinger, Astrid Recker, GESIS - Leibniz Institute for the Social Sciences

Repeated cross-national surveys generate huge amounts of cross-linked data and metadata. To enable replication and to make this data re-usable in new research contexts, thorough and standardized documentation of data and project workflow is indispensable. However, in the social sciences, data and documentation often undergo a continuous process of correction, refinement, and further development. These processes need to be documented too, especially to allow data providers to build on these results and experiences in preparing the next wave.

In this paper, we use the European Values Study (EVS) 1981-2008 to illustrate the challenges to be met in the active curation of extensive amounts of data and documentation created, altered, and re-used across the survey life-cycle. Outlining how these challenges are met by the EVS, we will particularly discuss the following questions: Looking beyond the “standard” documentation of data and survey methods, what supporting contextual information should accompany data to ensure their effective “migration” and use across waves? Especially relevant in a project composed of 125 national surveys covering 49 countries and spanning almost 30 years is the question of which preservation metadata is needed to achieve this objective and thus support the long-term accessibility of data and contextual information.

Page 1: Alive and kicking! Keeping data re-usable in the European Values Study

IASSIST Cologne, May 2013

GESIS, Data Archive for the Social Sciences

Page 2: Alive and kicking! Keeping data re-usable in the European Values Study

Overview

Data and information flow in the EVS project

Principles and workflows for managing data and documentation in survey projects

Page 3: Alive and kicking! Keeping data re-usable in the European Values Study

GESIS Data Archive

Basis

Interplay between Principal Investigators (PI) and Data Archive

Agreement on submission of data and information packages

Goals

Ease access to data for a broad user community

Provide metadata for discovery, understanding, and good use of data

Preserve data and metadata for re-use and replications

Holdings

Studies, study series, and complex survey programs such as ISSP, Eurobarometer, ALLBUS, European Values Study (EVS), or election studies

Page 4: Alive and kicking! Keeping data re-usable in the European Values Study

Data and information created in a survey project

Total stock of data and documentation created

Data and documentation submitted to an archive

Further information necessary for the project(?)

Selection processes

Management solutions for structuring data and information

Page 5: Alive and kicking! Keeping data re-usable in the European Values Study

Example: European Values Study (EVS)

Cross-national, longitudinal research program

4 waves at 9-year intervals: 1981/1990/1999/2008

49 countries, 125 national surveys

Harmonization and integration process: National surveys, Waves 1981/1990/1999/2008, Longitudinal Data File 1981-2008 (LdF), Integrated Values Surveys EVS/WVS (IVS)

Number of files

Size of files

Atlas of European Values

www.europeanvaluesstudy.eu/evs/evsatlas.html

Page 6: Alive and kicking! Keeping data re-usable in the European Values Study

Collaboration of actors involved (EVS 2008)

National team: data created, processed, documented

Central team: data standardized, harmonized, integrated

Data Archive: data checked, documented, preserved, released

Secondary users, Principal Investigators: data re-used, analyses replicated, results reported

Page 7: Alive and kicking! Keeping data re-usable in the European Values Study

Users: analyze and evaluate outcomes ... and detect peculiarities in questions or problems in data

Questions

– Check trend questions and original questions

– ZACAT-Online Study Catalogue

Data

– Analyze data, report errors, monitor error reporting

– GESIS Data Catalogue

Publications

– Replicate analysis of other projects

– EVS Repository

Page 8: Alive and kicking! Keeping data re-usable in the European Values Study

Peculiarities in question text spotted?

Check question and translation

– Master/field questionnaire, methodological questionnaire, report ‘Translation History’

Check source of question

– Trend question from EVS and WVS, questions borrowed from other surveys

Identify consequences for

– Countries sharing/adopting affected language, languages belonging to a family, further languages used in a country

EVS 2008 Data lifecycle: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing

Page 9: Alive and kicking! Keeping data re-usable in the European Values Study

Data error detected?

Standardization and harmonization process: check comparability of surveys, questions, variables; cumulate data and document each step

Retrace data processing steps across surveys: check data, syntax files, and documentation; update data and highlight problems for next wave

[Diagram: Original data file, National data, Waves (..., 1999, 2008), Longitudinal Data File 1981-2008, Integrated Values Surveys EVS/WVS; marker "Error detected"]
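
The retracing step on this slide can be pictured as walking a small derivation graph: once an error is found in one national file, every file built from it needs checking. The following is a minimal sketch in Python, assuming a hypothetical "derived from" mapping and invented file names; it is not the EVS project's actual tooling.

```python
from collections import deque

# Hypothetical derivation graph: each file maps to the files it was built from.
# File names are illustrative only, not actual EVS file names.
DERIVED_FROM = {
    "wave_1999_integrated": ["national_1999_DE", "national_1999_FR"],
    "wave_2008_integrated": ["national_2008_DE", "national_2008_FR"],
    "longitudinal_1981_2008": ["wave_1999_integrated", "wave_2008_integrated"],
    "ivs_evs_wvs": ["longitudinal_1981_2008"],
}


def affected_files(source, derived_from=DERIVED_FROM):
    """Return all files derived, directly or indirectly, from `source`."""
    # Invert the mapping so we can walk from a source file to its derivatives.
    derivatives = {}
    for target, sources in derived_from.items():
        for src in sources:
            derivatives.setdefault(src, []).append(target)

    seen, order = set(), []
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in derivatives.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# An error in one national file propagates to every cumulated file above it.
print(affected_files("national_2008_DE"))
```

Keeping such a mapping next to the syntax files would let a team list the files to re-check without relying on memory of earlier waves.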

Page 10: Alive and kicking! Keeping data re-usable in the European Values Study

Data and information created

Designated communities

Principal Investigator/Project

Secondary user

Experiences from EVS project

Data and information packages

Project package

Archive package

Selection processes

Within project

Between project and archive

Project

Archive

Total stock

Page 11: Alive and kicking! Keeping data re-usable in the European Values Study

Communicating with the future: Activity on two levels

Macro level

Defining workflows, file and information paths on which necessary information is passed on

Micro level

Organizing information so that it is re-usable (RDM, metadata, systematic file structures)

Page 12: Alive and kicking! Keeping data re-usable in the European Values Study

Begin by identifying principles for structuring and documenting files in the project (Research Data Management)

Select which information is relevant to whom?

A tidy house, a tidy mind!

Reference, don’t duplicate files whenever possible

Identify and capture “kinship relations”

Capture process knowledge

classes

itineraries

Make changes traceable

versioning

document revisions & annotations

minutes

protocols

Page 13: Alive and kicking! Keeping data re-usable in the European Values Study

The magic wand (and not the one used by the sorcerer’s apprentice)

Follow principles of good research data management (RDM)

Use metadata to document process and content information

Use standards wherever possible (e.g. DDI, Dublin Core, ISO codes, file naming conventions, etc.)
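
As an illustration of the "file naming conventions" and "ISO codes" points, the sketch below parses and rebuilds file names that follow one possible pattern. The pattern itself (study, wave year, two-letter ISO 3166-1 country code, content type, version) and the example file name are assumptions made for this example, not the convention used by the EVS.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: <study>_<wave year>_<ISO 3166-1 alpha-2 country>_<content type>_v<version>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<study>[A-Z]+)_(?P<wave>\d{4})_(?P<country>[A-Z]{2})_"
    r"(?P<kind>[a-z]+)_v(?P<version>\d+-\d+)\.(?P<ext>[a-z0-9]+)$"
)


@dataclass(frozen=True)
class FileName:
    study: str
    wave: int
    country: str
    kind: str
    version: str
    ext: str

    def render(self) -> str:
        """Rebuild the file name from its parts."""
        return (
            f"{self.study}_{self.wave}_{self.country}_"
            f"{self.kind}_v{self.version}.{self.ext}"
        )


def parse_file_name(name: str) -> FileName:
    """Split a file name into its parts, or fail loudly if it breaks the pattern."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {name!r}")
    parts = match.groupdict()
    return FileName(
        study=parts["study"],
        wave=int(parts["wave"]),
        country=parts["country"],
        kind=parts["kind"],
        version=parts["version"],
        ext=parts["ext"],
    )


parsed = parse_file_name("EVS_2008_DE_fieldquestionnaire_v1-0.pdf")
print(parsed.country, parsed.wave)
```

A strict pattern like this makes ad hoc names such as "questionnaire final (2).pdf" fail at the point of deposit rather than years later.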

Page 14: Alive and kicking! Keeping data re-usable in the European Values Study

[Diagram: metadata model for a project document. Entities: Resource, Document, Collection, Actor, Name, Date created, Date modified, Language (English), Version, Format, Rights. Relations: isA, isPartOf, hasIdentifier, hasVersion, hasFormat, hasLanguage, hasDate, hasCreator, hasModifier, hasRole, hasAccessRights, creates, modifies. Dublin Core terms shown: dc:creator, dc:created, dc:modified, dc:identifier, dc:format, dc:provenance, dc:description, dc:language, dc:accessRights, dc:collection]
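
A record built from the terms named on this slide might look like the following. This is a minimal sketch: the keys reuse the slide's labels as written, while the function name, the consistency check, and all values are invented for illustration.

```python
import json
from datetime import date


def describe_document(identifier, creator, created, modified, file_format,
                      language, access_rights, collection, description=""):
    """Build a flat metadata record keyed by terms named on the slide."""
    # Simple consistency check (an assumption, not part of the slide).
    if modified < created:
        raise ValueError("dc:modified must not be earlier than dc:created")
    return {
        "dc:identifier": identifier,
        "dc:creator": creator,
        "dc:created": created.isoformat(),
        "dc:modified": modified.isoformat(),
        "dc:format": file_format,
        "dc:language": language,
        "dc:accessRights": access_rights,
        "dc:collection": collection,
        "dc:description": description,
    }


# All values below are invented for illustration.
record = describe_document(
    identifier="doc-0001",
    creator="Central team",
    created=date(2008, 3, 1),
    modified=date(2009, 6, 15),
    file_format="application/pdf",
    language="en",
    access_rights="project-internal",
    collection="EVS 2008 questionnaires",
    description="Master questionnaire, English version",
)
print(json.dumps(record, indent=2))
```

Storing dates in ISO 8601 form and the language as an ISO 639-1 code follows the slide deck's advice to use standards wherever possible.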

Page 15: Alive and kicking! Keeping data re-usable in the European Values Study

Managing information flows in a collaborative, long-term project

Which paths does information (data, documentation, other contextual material) take from producers to users?

Two models helped us clarify processes and paths, as well as identify helpful terminology and concepts

– Project life cycle

– Open Archival Information System (OAIS) reference model (CCSDS 2012)

CCSDS (2012). Reference Model for an Open Archival Information System (OAIS). Recommended Practice. http://public.ccsds.org/publications/archive/650x0m2.pdf

Page 16: Alive and kicking! Keeping data re-usable in the European Values Study

Project life cycle: Data flow during creation of a survey

Life-cycle stages shown: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing, Data Dissemination

Project Repository: Ingest, Data processing and enhancement, Data Management, Temporary Storage, Access (project-internal use, PIs)

Guidelines

Page 17: Alive and kicking! Keeping data re-usable in the European Values Study

Project and Data Archive as distributed system

Life-cycle stages shown: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Processing, Data Documentation, Data Dissemination

Project Repository (content provider): Ingest, Data processing and enhancement, Data Management, Temporary Storage, Access (project-internal use, PIs)

Data Archive (preservation service provider): Ingest, Data Management, Archival Storage (long-term), Preservation Planning, Administration, Access

Actors: Principal Investigators, Secondary Users (future)

Information packages shown: PIP (repeated across the diagram), SIP, AIP, DIP

PIP = Project Information Package, SIP = Submission Information Package, AIP = Archival Information Package, DIP = Dissemination Information Package
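
The package types in the legend can be read as stages on the path from project to user. The sketch below models that path in Python. The SIP to AIP and AIP to DIP steps follow the OAIS reference model; the PIP to SIP step is an assumption based on this slide's pairing of project repository and archive, and the names `Package`, `NEXT_STEP`, and `path_to_users` are invented for the example.

```python
from enum import Enum


class Package(Enum):
    PIP = "Project Information Package"
    SIP = "Submission Information Package"
    AIP = "Archival Information Package"
    DIP = "Dissemination Information Package"


# Assumed hand-over steps between package types.
NEXT_STEP = {
    Package.PIP: Package.SIP,  # project selects material for submission (assumption)
    Package.SIP: Package.AIP,  # archive ingest turns a submission into an archival package
    Package.AIP: Package.DIP,  # access derives a dissemination package for users
}


def path_to_users(start):
    """Follow the hand-over steps from `start` until no further step exists."""
    path = [start]
    while path[-1] in NEXT_STEP:
        path.append(NEXT_STEP[path[-1]])
    return path


print(" -> ".join(p.name for p in path_to_users(Package.PIP)))
```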

Page 18: Alive and kicking! Keeping data re-usable in the European Values Study

Staying Alive! Where we are going from here

Developing a guideline for projects

– structuring and annotating of information on the micro level

– issues to discuss with an Archive (preservation service provider)

Testing our model

– implementing our ideas in smaller projects with the aim of making the results available to other projects

Page 19: Alive and kicking! Keeping data re-usable in the European Values Study

Thank you for your attention!

Evelyn Brislinger | Astrid Recker

GESIS – Leibniz Institute for the Social Sciences, Data Archive

www.gesis.org