Going Local with a World-Class Data Infrastructure: Enabling SDMX for Research Support

Rob Grim
Head Research Support / Research Data Specialist
Executive Manager, Open Data Foundation (ODaF)
Library & IT Services, Tilburg University (Netherlands)
IASSIST 2012, June 8, Washington


Page 2

Research Data Support? With SDMX?

• Why should we support researchers anyway?

• Why should a university use a complex set of standards such as SDMX to support research?

• CARDS World Taxation Indicators project

• Collaborative research

• Workflow support

• Infrastructure development

• Metadata management

• What does it take to get SDMX up and running?

Diagram labels: Capture, SDMX, Curation

Page 3

Research Data Support (Tilburg University)

1. Archive research data and supplementary materials

2. Register data sources used and provenance information

3. Assist with dataset description to improve accessibility of datasets

4. Integrated library and data catalogue

5. Subject portals e.g. ‘European Values Study’

6. Financial Data Support

Dataset available!

DDI and RDF in metadata record (hidden)

Page 4

Research (Data) Support

1. “Research Support” is often used as a synonym for IT support

2. Current research data services focus on data archiving, DMPs, curation

3. Simple approaches to data sharing

4. Portfolio of research data tools needed to support academic practices

5. Potential of metadata management undervalued

Diagram (Landscape tools): SDMX, Dataverse Network (DVN), Questasy, Survey documentation, Metadata Repository, Archiving + Access Management, SDMX Data Repository

Aim for “Need to have” instead of “Nice to Have”

Page 5

Metadata Management

Source: OECD.Stat
Source: Eurostat

Diagram labels: Capture, Dissemination, Capture time-series data, Metadata Registry, Tools

Tools needed!

Page 6

Why SDMX?

1. SDMX allows us to capture and manage ‘data intelligence’ in a formalized and structured way

2. SDMX information model useful to describe time-series data from different disciplines

3. SDMX offers means to prevent unnecessary replication of data

4. SDMX offers means to deal with confidential data and IPR

5. The standard is widely used; training materials and tutorials are available

6. SDMX IT tools are available for different platforms: Java, .NET

7. FAO OpenSDMX initiative (D4Science)

8. Researchers want ‘something’ like OECD.Stat

Diagram labels: OECD.Stat, Capture, SDMX, Curation, FAO

Page 7

Workflow

Structure        ECB  FAOStat  GIST  IAEG  ISO  SDMX
Agency Scheme      0        0     0     0    0     1
Categorisations    0        0     0     0    0     0
Category schemes   1        0     0     0    0     1
Codelists          2        8     2    10    0     9
Concept schemes    1        1     0     2    1     1
DSD                3        1     0     1    0     0

Rows with incomplete column data: Dataflows 1, 1; Data provider scheme 1, 1; Provision agreement 2

Curate

Capture

Display

Verbs: … Extract from PDF, CSV … Convert to SDMX-ML … Code 4 Registry
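The "extract, then convert to SDMX-ML" step can be illustrated with a minimal Python sketch. It is not the project's converter: the element names only loosely mirror the series/observation layout of the SDMX-ML Generic format, and namespaces, the message header and schema validation are all omitted, so the output is not valid SDMX-ML. The column names, codes and values are hypothetical placeholders.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical extracted CSV: one observation per row, placeholder values only.
SAMPLE_CSV = """COUNTRY,INDICATOR,YEAR,VALUE
XX,CIT_RATE,2001,10.0
XX,CIT_RATE,2002,11.0
YY,CIT_RATE,2001,20.0
"""


def csv_to_series_xml(text, dimensions=("COUNTRY", "INDICATOR"),
                      time_col="YEAR", value_col="VALUE"):
    """Group CSV rows into time series and emit a simplified XML tree."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        # The series key is the combination of all dimension values.
        key = tuple(row[d] for d in dimensions)
        series[key].append((row[time_col], row[value_col]))

    root = ET.Element("DataSet")
    for key, observations in series.items():
        series_el = ET.SubElement(root, "Series")
        key_el = ET.SubElement(series_el, "SeriesKey")
        for dim, code in zip(dimensions, key):
            ET.SubElement(key_el, "Value", concept=dim, value=code)
        # Observations are written in time order within each series.
        for period, value in sorted(observations):
            obs_el = ET.SubElement(series_el, "Obs")
            ET.SubElement(obs_el, "Time").text = period
            ET.SubElement(obs_el, "ObsValue", value=value)
    return root


dataset = csv_to_series_xml(SAMPLE_CSV)
print(ET.tostring(dataset, encoding="unicode"))
```

Grouping rows by their dimension values first is what turns a flat extract into time series; a real conversion would take the dimension list from the DSD rather than from a hard-coded tuple.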

Time series metadata: concepts, dimensions, attributes

Existing tools

WTI URL:

Table: Overview of registry structures

Page 8

Where are we now?

• Production workflow for SDMX

• Populating the metadata registry

• Enter (hierarchical) codelists

• Concept IDs

• Concept Schemes

• DSDs

• Dataflows

• SDMX-ML Generic format

• WTI Fusion Registry

• SDMX data repository

• Keep data in the original formats (csv, txt, Stata)

• Convert data from a database to SDMX

• Specific purpose database for SDMX compliant system

• Other: Collaborate with FAO, OpenSDMX?

Source: SDMX Information Model
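How codelists, concepts and a DSD fit together can be sketched in a few lines of Python. This is a simplified illustration and not the SDMX Information Model itself: the class names, the identifiers (`CL_AREA`, `CL_TAX_TYPE`, `DSD_WTI`) and the concept IDs are hypothetical, and attributes, measures and versioning are left out.

```python
from dataclasses import dataclass


@dataclass
class Codelist:
    id: str
    codes: dict  # code -> human-readable label


@dataclass
class Dimension:
    concept_id: str     # ID of the concept this dimension represents
    codelist: Codelist  # allowed values for the dimension


@dataclass
class DataStructureDefinition:
    id: str
    dimensions: list

    def validate_key(self, key):
        """Return a list of problems with a series key (empty if valid)."""
        errors = []
        for dim in self.dimensions:
            if dim.concept_id not in key:
                errors.append(f"missing dimension {dim.concept_id}")
            elif key[dim.concept_id] not in dim.codelist.codes:
                errors.append(
                    f"{key[dim.concept_id]!r} not in codelist {dim.codelist.id}"
                )
        known = {dim.concept_id for dim in self.dimensions}
        errors.extend(f"unknown dimension {k}" for k in key if k not in known)
        return errors


# Hypothetical registry content for a tax-indicator data structure.
areas = Codelist("CL_AREA", {"NL": "Netherlands", "US": "United States"})
tax_types = Codelist("CL_TAX_TYPE", {
    "PIT": "Personal income tax",
    "CIT": "Corporate income tax",
    "VAT": "Value added tax",
})
dsd = DataStructureDefinition("DSD_WTI", [
    Dimension("REF_AREA", areas),
    Dimension("TAX_TYPE", tax_types),
])

print(dsd.validate_key({"REF_AREA": "NL", "TAX_TYPE": "CIT"}))
print(dsd.validate_key({"REF_AREA": "NL", "TAX_TYPE": "GST"}))
```

The point of the sketch is the dependency order: codelists and concepts have to exist in the registry before a DSD can reference them, and dataflows in turn reference the DSD.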

Page 9

Codelists

Code Values

Metadata registry: Fusion Registry
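A hierarchical codelist can be thought of as codes that each point to a parent code. The following Python sketch shows that idea only; it does not use the Fusion Registry or any SDMX library, and the codes and groupings are hypothetical.

```python
# Hypothetical hierarchical codelist: each code maps to its parent
# (None marks a top-level code).
PARENT = {
    "WORLD": None,
    "EUROPE": "WORLD",
    "NL": "EUROPE",
    "DE": "EUROPE",
    "AMERICAS": "WORLD",
    "US": "AMERICAS",
}


def ancestors(code, parent=PARENT):
    """Walk upwards from a code to the top of the hierarchy."""
    chain = []
    current = parent[code]
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


def descendants(code, parent=PARENT):
    """Collect all codes below a code, depth first."""
    result = []
    for child, child_parent in parent.items():
        if child_parent == code:
            result.append(child)
            result.extend(descendants(child, parent))
    return result


print(ancestors("NL"))
print(descendants("WORLD"))
```

Storing only the parent link keeps the codelist flat and easy to enter, while still allowing aggregates (for example all codes under a region) to be derived when needed.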

Page 10

Concept Scheme

Page 11

Category Scheme

Page 12

CARDS-project World Taxation Indicators

1. Georgia State University, International Center for Public Policy, World Tax Indicators Portal

2. Tilburg University, prof. Jenny Ligthart

3. Lack of data on personal income tax (PIT), corporate income tax (CIT), Value Added Tax (VAT) and other tax indicators

4. Incomplete series, missing countries, tax data difficult to access (addendums), difficult to compare

5. Work of the WTI group: statutory tax rates. Tilburg: effective tax rates, corporate income tax.

6. The ‘raw’ data stem from the IMF/GFS and the OECD/Revenue Statistics.
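The problem of incomplete series and missing countries lends itself to a simple automated check during curation. The Python sketch below is one possible way to do this, not the procedure used in the project; the function name, country codes and years are hypothetical.

```python
def find_gaps(observations, countries, start, end):
    """Report missing years per country for the period start..end.

    observations: iterable of (country, year) pairs that have a value.
    countries: every country expected in the dataset, so that countries
               with no observations at all are reported as well.
    """
    expected = set(range(start, end + 1))
    seen = {country: set() for country in countries}
    for country, year in observations:
        seen.setdefault(country, set()).add(year)
    return {
        country: sorted(expected - years)
        for country, years in seen.items()
        if expected - years
    }


# Hypothetical coverage: AA has a gap in 2002, CC has no data at all.
observed = [("AA", 2001), ("AA", 2003),
            ("BB", 2001), ("BB", 2002), ("BB", 2003)]
print(find_gaps(observed, ["AA", "BB", "CC"], 2001, 2003))
```

Passing the expected country list explicitly matters here: a country that is absent from the source files would otherwise never show up in the report.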

Page 13

Lessons learnt so far

• Support of senior management is needed to get beyond the project/pilot stage

• SDMX standards are complex: steep learning curve

• Capacity building is a must (Tip: Eurostat SDMX tutorials)

• SDMX data repository: collaborate with other organizations

• Focus on DSDs, full target and partial identifiers, hierarchical code lists

• Fusion Registry upgrade

• Additional (academic) partners welcome to leverage the macroeconomic time series registry and repository

Page 14

Acknowledgements

• CARDS was funded by SURF. The CARDS project was undertaken in 2011 in the framework of the SURFshare programme – Access to Research Data

• WTI group and prof. Jenny Ligthart


Final Thought

Don’t forget! Before you ask “what you can do for your country”, ask yourself “what metadata management can do for you”.