Anne Balz, Klaus Pforr, Florian Thirolf · 2019-06-10 · Stata export for metadata documentation, Munich, 26.05.2019


Stata export for metadata documentation

Munich, 26.05.2019
Anne Balz, Klaus Pforr, Florian Thirolf


Motivation

• The German Microdata Lab (GML) offers metadata for various official microdata online
• Goal: extract metadata from these datasets automatically and import them into our database
• German Microcensus
• European Labour Force Survey
• EU-SILC (European Union Statistics on Income and Living Conditions)


Microdata Information System MISSY

• Online platform ("MISSY-web")
• Documentation of official microdata (European and national)
• Documentation on different levels:
  • study
  • question
  • variable


core functionality

[Diagram: *.dta → output.*]

[Diagram: *.dta → dta2meta.ado → meta.dta → meta2*.ado → output.*]

ado dta2md

[Diagram: *.dta → dta2meta.ado → meta.dta → meta2*.ado → output.*]


the meta-file

All necessary (meta-)information in a table format:

• Variable level
  • Variable name and label
  • Summary statistics (min, max, mean, std)
• Value level
  • Value and label
  • Frequencies and percentages
    • Overall
    • For groups (e.g. countries)
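To make the table layout concrete, here is a minimal sketch of how such a meta-file could be assembled for a single variable. It uses the auto data shipped with Stata and the variable foreign. The column names (varname, varlabel, vallabel, freq, pct, first) are illustrative assumptions, not necessarily the columns dta2md writes, and group-wise frequencies are left out.

```stata
* Sketch only: one row per category of a single variable.
* Column names are assumptions, not the layout produced by dta2md.
sysuse auto, clear
local v foreign
local vlab : variable label `v'

* variable level: summary statistics
quietly summarize `v'
local vmin  = r(min)
local vmax  = r(max)
local vmean = r(mean)
local vsd   = r(sd)
local total = r(N)

tempname post
tempfile meta
postfile `post' str32 varname str80 varlabel double(vmin vmax vmean vsd) ///
    double value str80 vallabel long freq double pct byte first ///
    using `"`meta'"'

* value level: value, label, frequency, percentage
quietly levelsof `v', local(levels)
local first 1
foreach l of local levels {
    local lab : label (`v') `l'
    quietly count if `v' == `l'
    post `post' ("`v'") (`"`vlab'"') (`vmin') (`vmax') (`vmean') (`vsd') ///
        (`l') (`"`lab'"') (r(N)) (100 * r(N) / `total') (`first')
    local first 0
}
postclose `post'

use `"`meta'"', clear
list varname value vallabel freq pct first, noobs
```

Variable-level information is repeated on every row here, so the table stays rectangular; the first column marks the row from which variable-level output would be taken.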


ado dta2md

[Figure: the meta-file, with its columns annotated as: Variable Level; Value Level; User Input (Variable): Group-Variable & Computed; Technical: First Value within Variable]


ado dta2md

Syntax:

    dta2md input(filename) output(filename) ///
        freqvarlist(varlist) ///
        [group(varname) ///
        missingdef(string) smissingdef(string) ///
        replace]

Example:

    dta2md input($path/micro_file.dta) output($path/meta_file.dta) ///
        freqvarlist(var1 var2 var3) ///
        group(country) ///
        missingdef("X<0") ///
        smissingdef(`"X="invalid answer"| X="did not understand""') ///
        replace

ado dta2md

Loop over all vars
  If group specified: loop over all groups (within levels of vars)
  If computed: loop over all levels (within all vars)
    If group specified: loop over all groups
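The loop structure above can be sketched in plain Stata as follows. This is an illustration under assumptions, not the code of the ado: the auto data stands in for a microdata file, foreign plays the role of group(), and "computed" is read as "the variable is listed in freqvarlist()".

```stata
* Sketch of the nested loops; the real ado may organise them differently.
sysuse auto, clear
local vars     price rep78
local freqvars rep78      // assumption: stands in for freqvarlist()
local group    foreign    // assumption: stands in for group()

foreach v of local vars {
    * variable level: summary statistics, overall ...
    quietly summarize `v'
    display as text "`v': mean = " as result %9.2f r(mean)

    * ... and per group, if a group variable was specified
    if "`group'" != "" {
        quietly levelsof `group', local(groups)
        foreach g of local groups {
            quietly summarize `v' if `group' == `g'
            display as text "  `group' = `g': mean = " as result %9.2f r(mean)
        }
    }

    * value level: only for variables whose frequencies are requested
    if `: list v in freqvars' {
        quietly levelsof `v', local(levels)
        foreach l of local levels {
            quietly count if `v' == `l'
            display as text "  `v' = `l': n = " as result r(N)
            if "`group'" != "" {
                foreach g of local groups {
                    quietly count if `v' == `l' & `group' == `g'
                    display as text "    `group' = `g': n = " as result r(N)
                }
            }
        }
    }
}
```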

ado meta2DDI

[Diagram: *.dta → dta2Meta.ado → meta.dta → meta2DDI.ado → DDI2.5.xml]

• Uses the 'file' command
• 'forvalues' to run through all categories
• Variables of the meta-file are used to form hierarchical output
• Example:
  • 'first' (0/1) tags the first category of a variable
  • Used to generate output on variable level
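A minimal sketch of this technique, assuming a toy meta-file with one row per category: 'forvalues' walks through the rows, 'file write' emits the markup, and 'first' decides when a new variable element is opened. All column names except 'first' are assumptions, and the output is a simplified DDI-style fragment that has not been validated against the DDI 2.5 schema.

```stata
* Sketch only: toy meta-file written as a simplified DDI-style fragment.
clear
input str10 varname str20 varlabel byte value str10 vallabel int freq byte first
"foreign" "Car origin" 0 "Domestic" 52 1
"foreign" "Car origin" 1 "Foreign"  22 0
end

tempname fh
file open `fh' using ddi_sketch.xml, write text replace
file write `fh' "<dataDscr>" _n

forvalues i = 1/`=_N' {
    * first == 1 marks a new variable: close the previous <var>, open a new one
    if first[`i'] == 1 {
        if `i' > 1 file write `fh' "  </var>" _n
        * char(34) is the double quote around attribute values
        file write `fh' ("  <var name=" + char(34) + varname[`i'] + char(34) + ">") _n
        file write `fh' "    <labl>" (varlabel[`i']) "</labl>" _n
    }
    * every row is one category of the current variable
    file write `fh' "    <catgry>" _n
    file write `fh' "      <catValu>" (value[`i']) "</catValu>" _n
    file write `fh' "      <labl>" (vallabel[`i']) "</labl>" _n
    file write `fh' ("      <catStat type=" + char(34) + "freq" + char(34) + ">") ///
        (freq[`i']) "</catStat>" _n
    file write `fh' "    </catgry>" _n
}

if _N > 0 file write `fh' "  </var>" _n
file write `fh' "</dataDscr>" _n
file close `fh'

type ddi_sketch.xml
```

Labels are written verbatim here; labels containing characters such as & or < would need XML escaping first.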


Use case MISSY

[Diagram: *.dta → dta2Meta.ado → meta.dta → meta2sql.ado → output.sql; further labels in the diagram: getUUIDs, generateUUIDs, mapRelations, Database]


meta2sql.ado

• 'file' command is used
• different frame
• 'forvalues' for each database table
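As a rough illustration of one 'forvalues' pass per database table, the sketch below writes INSERT statements for two tables from a toy meta-file. The table names (variable, category), their columns, and the integer keys are hypothetical; the UUID handling mentioned for the MISSY use case is not reproduced here.

```stata
* Sketch only: hypothetical tables and integer keys instead of UUIDs.
clear
input str10 varname str20 varlabel byte value str10 vallabel byte first
"foreign" "Car origin" 0 "Domestic" 1
"foreign" "Car origin" 1 "Foreign"  0
end

* running key per variable, so category rows can reference their variable
generate long var_id = sum(first)
scalar sq = char(39)    // single quote for SQL string literals

tempname fh
file open `fh' using output_sketch.sql, write text replace

* pass 1: table "variable", one row per variable (first == 1)
forvalues i = 1/`=_N' {
    if first[`i'] == 1 {
        file write `fh' ("INSERT INTO variable (id, name, label) VALUES (" ///
            + string(var_id[`i']) + ", " + sq + varname[`i'] + sq + ", " ///
            + sq + varlabel[`i'] + sq + ");") _n
    }
}

* pass 2: table "category", one row per value, linked through var_id
forvalues i = 1/`=_N' {
    file write `fh' ("INSERT INTO category (variable_id, value, label) VALUES (" ///
        + string(var_id[`i']) + ", " + string(value[`i']) + ", " ///
        + sq + vallabel[`i'] + sq + ");") _n
}

file close `fh'
type output_sketch.sql
```

Labels that contain a single quote would have to be escaped before being written into a statement.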