DDI AND THE DATA PRODUCER Prepared for Expert Seminar Finnish Social Science Data Archive Tampere, Finland September 1-2, 2000

Page 1: DDI AND THE DATA PRODUCER

DDI AND THE DATA PRODUCER

Prepared for Expert Seminar

Finnish Social Science Data Archive

Tampere, Finland

September 1-2, 2000

Page 2: DDI AND THE DATA PRODUCER

NATIONAL SURVEY OF FAMILY GROWTH - NSFG

• Purpose: collect data on factors affecting pregnancy and women’s health in the United States

• Survey of males added for the first time in current cycle

• Previous surveys conducted in 1973, 1976, 1982, 1988, and 1995

• New survey planned for 2001-2002, with a pretest in January 2001

Page 3: DDI AND THE DATA PRODUCER

NSFG Research Topics

• the number of children women have had and the number they expect in the future

• intended and unintended births

• sexual intercourse

• marriage and cohabitation

• contraceptive use

• infertility, impaired fecundity, and sterilization

Page 4: DDI AND THE DATA PRODUCER

NSFG Research Topics - 2

• breastfeeding, maternity leave, and child care

• adoption, stepchildren, and foster children

• health insurance coverage

• family planning and other medical services

• smoking by women 15-44 years of age

• HIV testing

• pelvic inflammatory disease and douching

• sex education

Page 5: DDI AND THE DATA PRODUCER

COMPUTER-ASSISTED INTERVIEWING AND XML

• TADEQ (Tool for the Analysis and Documentation of Electronic Questionnaires)

  – Fourth Framework Research Project of the EU

  – Partners:

    • Statistics Netherlands

    • Technical University of Vienna

    • Office for National Statistics (UK)

    • Statistics Finland

    • Instituto Nacional de Estatistica (Portugal)

Page 6: DDI AND THE DATA PRODUCER

COMPUTER-ASSISTED INTERVIEWING AND XML

• Goal of TADEQ project: create an ‘open’ tool that uses a ‘neutral’ format to describe how CAI questionnaires (in Blaise) are conducted, and that produces human-readable textual documentation. The neutral format is XML.

Page 7: DDI AND THE DATA PRODUCER

COMPUTER-ASSISTED INTERVIEWING AND XML

• Goal of National Survey of Family Growth (NSFG) project: Work with Blaise programmers at Survey Research Center to output DDI tags from CAI instrument

• Eliminate as much ‘hand-editing’ as possible to create this variable-level markup

• How this might work for the user…

Page 8: DDI AND THE DATA PRODUCER

(4965) 1995 NATIONAL SURVEY OF FAMILY GROWTH
File Documentation
Respondent File (Section H Questionnaire Items)

VARIABLE NAME: HLPPRG
COLUMN LOCATION: 9408-9408

QXTEXT: (During any of your relationships,) have you (or your husband/or your husband or partner) ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?

Inapplicable: R has never had sex (RHADSEX coded 2, 7, 8, or 9); R has had sex but never since her first menstrual period (CI-22 SEXAFMEN coded 2); answer was not ascertained, R refused to report, or R did not know when she had sex after her first period (CI-23 WNSEXAFM coded 9997, 9998, or 9999); or R already reported receiving medical help to get pregnant in pregnancy history (HLPGETPG coded 1).

HISP    NON-HISPANIC WHITE    NON-HISPANIC BLACK    TOTAL    CODE CATEGORIES
 213           952                   216             1457    Blank = Blank, inapplicable
  52           275                    84              425    1 = YES
1287          5255                  2146             8963    2 = NO
   0             0                     0                0    7 = Not Ascertained
   1             1                     0                2    8 = Refused
   0             0                     0                0    9 = Don’t Know
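As a rough illustration of what variable-level markup for this codebook entry might look like, the sketch below builds a DDI-style `<var>` element with Python's standard library. The element and attribute names (`var`, `location`, `qstn`, `qstnLit`, `catgry`, `catValu`, `catStat`) follow the DDI codebook vocabulary as best understood here. The helper `build_var` and the choice to record only the TOTAL column are assumptions made for the example, not the presenters' implementation.

```python
import xml.etree.ElementTree as ET

# (code value, label, total frequency) copied from the codebook excerpt above.
# A single space stands in for the blank code; that choice is an assumption.
HLPPRG_CATEGORIES = [
    (" ", "Blank, inapplicable", 1457),
    ("1", "YES", 425),
    ("2", "NO", 8963),
    ("7", "Not Ascertained", 0),
    ("8", "Refused", 2),
    ("9", "Don't Know", 0),
]


def build_var(name, start, end, question, categories):
    """Build a DDI-style <var> element (hypothetical helper)."""
    var = ET.Element("var", {"name": name})
    ET.SubElement(var, "location", {"StartPos": str(start), "EndPos": str(end)})
    qstn = ET.SubElement(var, "qstn")
    ET.SubElement(qstn, "qstnLit").text = question
    for value, label, freq in categories:
        cat = ET.SubElement(var, "catgry")
        ET.SubElement(cat, "catValu").text = value
        ET.SubElement(cat, "labl").text = label
        ET.SubElement(cat, "catStat", {"type": "freq"}).text = str(freq)
    return var


hlpprg = build_var(
    "HLPPRG", 9408, 9408,
    "Have you ever been to a doctor or other medical care provider "
    "to talk about ways to help you become pregnant?",
    HLPPRG_CATEGORIES,
)
total_cases = sum(freq for _, _, freq in HLPPRG_CATEGORIES)
print(ET.tostring(hlpprg, encoding="unicode"))
```

Summing the TOTAL column gives a quick consistency check on the transcribed frequencies before they are written into the markup.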

Page 9: DDI AND THE DATA PRODUCER

Brief Example of DDI Markup

<varGrp type="section subject" var="hlpprg howmanyr seekwwho…">

<labl>Section H:</labl>

<txt>Infertility Services and Reproductive Health</txt>

</varGrp>
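One way to avoid hand-editing markup like this is to generate it from a list of variable names. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function `build_var_group` is hypothetical, and the list contains only the three variable names visible on the slide (the slide's list is truncated).

```python
import xml.etree.ElementTree as ET


def build_var_group(group_type, var_names, label, text):
    """Return a <varGrp> element for one questionnaire section (hypothetical helper)."""
    grp = ET.Element("varGrp", {"type": group_type, "var": " ".join(var_names)})
    ET.SubElement(grp, "labl").text = label
    ET.SubElement(grp, "txt").text = text
    return grp


section_h = build_var_group(
    "section subject",
    ["hlpprg", "howmanyr", "seekwwho"],  # only the names shown on the slide
    "Section H:",
    "Infertility Services and Reproductive Health",
)
xml_text = ET.tostring(section_h, encoding="unicode")
print(xml_text)
```

Because the serializer handles quoting and escaping, the output is well-formed XML even when labels contain characters such as `&` or `<`.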

Page 10: DDI AND THE DATA PRODUCER

Page 11: DDI AND THE DATA PRODUCER

Female Respondent File

Section H: Infertility Services and Reproductive Health

Female Respondent File Contents

Pregnancy File

Male Respondent File

Combined Male-Female Respondent File

Trend File

NSFG, Cycle 6

• Ever Received Help to Get Pregnant Series

• Ever Received Help to Prevent Miscarriage Series

• Douching Series

• Health Problems Related to Childbearing Series

• Census Bureau’s Disability Series

• HIV Testing and AIDS Series

Page 12: DDI AND THE DATA PRODUCER

Female Respondent File, Section H

Ever Received Help to Get Pregnant Series

Female Respondent File Contents

Section H Contents

Pregnancy File

Male Respondent File

Combined Male-Female Respondent File

Trend File

NSFG, Cycle 6

• HLPPRG--HA1 Received Medical Help To Get Pregnant?

• HOWMANYR--HA2 # of H/P with whom R Sought Medical Help

• SEEKWWHO--HA3 Which H/P Did R Seek Medical Help With

• TYPALLP0--HA5 Infertility Services Received-1st

• TYPALLP1--HA5 Infertility Services Received-2nd

• TYPALLP2--HA5 Infertility Services Received-3rd

• TYPALLP3--HA5 Infertility Services Received-4th

• TYPALLP4--HA5 Infertility Services Received-5th

• TYPALLP5--HA5 Infertility Services Received-6th

• WHOTEST--HA5a Who had infertility testing?

• WHARTIN--HA5b Inseminated with whose sperm?

• OTMEDHE0--HA5c Other Infertility Services-1st

• OTMEDHE1--HA5c Other Infertility Services-2nd

• OTMEDHE2--HA5c Other Infertility Services-3rd

• OTMEDHE3--HA5c Other Infertility Services-4th

• [more…]

Page 13: DDI AND THE DATA PRODUCER

Page 14: DDI AND THE DATA PRODUCER

HLPPRG--HA1 Received Medical Help to Get Pregnant

Inapplicable Respondents

Full Question Text

Cycle 5 Frequencies

Female Respondent File Contents

Section H Contents

Back to Variable HLPPRG

Back to Variables List

NSFG, Cycle 6

• R has never had sex (RHADSEX coded 2, 7, 8, or 9)

• R has had sex but never since her first menstrual period (CI-22 SEXAFMEN coded 2)

• Answer was not ascertained, R refused to report, or R did not know when she had sex after her first period (CI-23 WNSEXAFM coded 9997, 9998, or 9999)

• R already reported receiving medical help to get pregnant in pregnancy history (HLPGETPG coded 1)
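The four conditions above amount to a universe check that could be stored with the variable and evaluated mechanically. A minimal sketch follows, assuming each source variable is available as an integer code and that `None` means the item was not asked. The function name and signature are hypothetical.

```python
def hlpprg_inapplicable(rhadsex, sexafmen=None, wnsexafm=None, hlpgetpg=None):
    """Return True when HLPPRG should be blank (inapplicable) for a respondent."""
    if rhadsex in (2, 7, 8, 9):         # R has never had sex
        return True
    if sexafmen == 2:                   # no sex since first menstrual period
        return True
    if wnsexafm in (9997, 9998, 9999):  # date not ascertained / refused / don't know
        return True
    if hlpgetpg == 1:                   # help already reported in pregnancy history
        return True
    return False


# Illustrative calls; the code 1 for RHADSEX is assumed to mean "has had sex".
print(hlpprg_inapplicable(rhadsex=2))
print(hlpprg_inapplicable(rhadsex=1, sexafmen=1, wnsexafm=1150, hlpgetpg=2))
```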

Page 15: DDI AND THE DATA PRODUCER

HLPPRG--HA1 Received Medical Help to Get Pregnant

Full Question Text

Inapplicable Respondents

Cycle 5 Frequencies

Female Respondent File Contents

Section H Contents

Back to Variable HLPPRG

Back to Variables List

NSFG, Cycle 6

HA-1.

IF TIMESMAR = 1, MARSTAT = MARRIED OR SEPARATED, AND LIFEPRTS = 1, ASK: Have you or your husband ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?

IF TIMESMAR = 1, MARSTAT = WIDOWED OR DIVORCED, AND LIFEPRTS = 1, ASK: Did you or your husband ever go to a doctor or other medical care provider to talk about ways to help you become pregnant?

IF TIMESMAR ≥ 1 AND LIFEPRTS > 1, ASK: During any of your relationships, have you or your husband or partner at the time ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?

IF TIMESMAR = 0 AND LIFEPRTS = 0, ASK: Have you ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?

IF TIMESMAR = 0 AND LIFEPRTS ≥ 1, ASK: During any of your relationships, have you or your partner at the time ever been to a doctor or other medical care provider to talk about ways to help you become pregnant?

YES = 1
NO = 2 (HLPMC)
REFUSED = 8 (HLPMC)
DON'T KNOW = 9 (HLPMC)

NOTE: DO NOT COUNT IF MAIN PURPOSE OF VISIT WAS FOR SOMETHING OTHER THAN SEEKING HELP TO BECOME PREGNANT.

FLOW CHECK H-2: IF LIFEPRTS > 1, ASK HOWMANYR. ELSE, GO TO TYPALLPG.
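The routing for HA-1 and Flow Check H-2 can be read as a small decision procedure, which is the kind of logic a CAI instrument encodes and a documentation tool would need to export. The sketch below is one reading of the slide. The comparison in two of the conditions is read here as "greater than or equal to", which is an assumption, and the function names are hypothetical.

```python
TAIL = ("a doctor or other medical care provider to talk about "
        "ways to help you become pregnant?")


def ha1_question(timesmar, marstat, lifeprts):
    """Pick the HA-1 wording for a respondent, or None if no rule matches."""
    if timesmar == 1 and lifeprts == 1:
        if marstat in ("MARRIED", "SEPARATED"):
            return "Have you or your husband ever been to " + TAIL
        if marstat in ("WIDOWED", "DIVORCED"):
            return "Did you or your husband ever go to " + TAIL
    if timesmar >= 1 and lifeprts > 1:   # ">=" is an assumption
        return ("During any of your relationships, have you or your husband "
                "or partner at the time ever been to " + TAIL)
    if timesmar == 0 and lifeprts == 0:
        return "Have you ever been to " + TAIL
    if timesmar == 0 and lifeprts >= 1:  # ">=" is an assumption
        return ("During any of your relationships, have you or your partner "
                "at the time ever been to " + TAIL)
    return None  # combination not covered by the rules on the slide


def flow_check_h2(lifeprts):
    """Flow Check H-2: name of the next item to ask."""
    return "HOWMANYR" if lifeprts > 1 else "TYPALLPG"


print(ha1_question(1, "MARRIED", 1))
print(flow_check_h2(3))
```

Returning `None` for uncovered combinations makes gaps in the routing visible instead of silently defaulting to one wording.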

Page 16: DDI AND THE DATA PRODUCER

HLPPRG--HA1 Received Medical Help to Get Pregnant

Cycle 5 Frequencies

Inapplicable Respondents

Full Question Text

Female Respondent File Contents

Section H Contents

Back to Variable HLPPRG

Back to Variables List

NSFG, Cycle 6

Hispanic    Non-Hispanic White    Non-Hispanic Black    Total    Code Values and Definitions
   213             952                   216             1457    Blank=Blank, inap.
    52             275                    84              425    1=Yes
  1287            5255                  2146             8963    2=No
     0               0                     0                0    7=Not Ascertained
     1               1                     0                2    8=Refused
     0               0                     0                0    9=Don’t Know

Page 17: DDI AND THE DATA PRODUCER

Page 18: DDI AND THE DATA PRODUCER

INTCTFAM: Intact status of childhood family

Recode rules

Inapplicable Respondents

Imputation

Cycle 5 Frequencies

Distribution by Sex

Back to Variable INTCTFAM

Back to Variables List

If R lived with both biological parents at birth (VAR128.01 FAMTYP01=1) and that living situation did not change, or is current, (VAR129.01 CMCHFM01=0) then INTCTFAM=1

If R lived with both adoptive parents at birth (VAR128.01 FAMTYP01=2) and that living situation did not change, or is current, (VAR129.01 CMCHFAM1=0) then INTCTFAM=2

Else, if R's parental living situation was anything else, or ever changed, INTCTFAM=3

Code categories:

1=two biological parents from birth

2=two adoptive parents from birth

3=anything other than two biological or two adoptive parents from birth

NSFG, Cycle 6
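The recode rules above translate directly into a small function. This is a sketch only: the parameter names `famtyp01` and `cmchfm01` stand for the source variables named on the slide, and the function name is hypothetical.

```python
def recode_intctfam(famtyp01, cmchfm01):
    """Derive INTCTFAM from the first childhood living situation.

    famtyp01: family type at birth (1 = both biological parents,
              2 = both adoptive parents, anything else = other)
    cmchfm01: 0 when that living situation never changed or is current
    """
    if famtyp01 == 1 and cmchfm01 == 0:
        return 1  # two biological parents from birth
    if famtyp01 == 2 and cmchfm01 == 0:
        return 2  # two adoptive parents from birth
    return 3      # anything else, or the situation ever changed


print(recode_intctfam(1, 0))     # intact biological family
print(recode_intctfam(1, 1020))  # situation changed at some point
```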

Page 19: DDI AND THE DATA PRODUCER

INTCTFAM: Intact status of childhood family

Inapplicable respondents

Recode Rules

Imputation

Cycle 5 Frequencies

Distribution by Sex

Back to Variable INTCTFAM

Back to Variables List

Non-blank for all Rs.

NSFG, Cycle 6

Page 20: DDI AND THE DATA PRODUCER

INTCTFAM: Intact status of childhood family

Imputation

Recode Rules

Inapplicable Respondents

Cycle 5 Frequencies

Distribution by Sex

Back to Variable INTCTFAM

Back to Variables List

Method 2: Hot deck imputation (the most frequently used method of imputation)

Imputation using the hot deck procedure requires the identification of a pool of donors (cases with complete data) with characteristics similar to those of the receptor (the case with a missing value). A donor is then selected from the pool randomly, either with equal probability (unweighted hot deck) or with probability proportional to the fully adjusted sampling weight of the donor (weighted hot deck).

The cases that could donate a value to an observation without data are called donor pools, or imputation classes. An imputation class needed to be sufficiently large so that the number of times a donor provided a value was minimized, but also sufficiently small so that the donors and receptors were adequately comparable. By creating a group of respondents with similar characteristics for variables believed to be correlated with the missing recode, imputed values are generally more consistent with the life-history information.

Unweighted hot deck was used in the vast majority of hot deck imputations. Weighted hot deck imputation was used for a few variables with missing data on roughly 2-8 percent of cases.
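The procedure described above can be sketched in a few lines of Python. This is an illustration of the general hot deck technique under simple assumptions, not the NSFG production code. Records are plain dictionaries, a missing value is `None`, imputation classes are defined by exact matches on the listed class variables, and the field names in the example data are invented.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, class_vars, weight=None, rng=None):
    """Fill missing `target` values from a random donor in the same class.

    weight=None gives an unweighted hot deck; passing the name of a weight
    field selects donors with probability proportional to that weight.
    """
    rng = rng or random.Random()
    pools = defaultdict(list)  # imputation class -> donors with complete data
    for rec in records:
        if rec[target] is not None:
            pools[tuple(rec[v] for v in class_vars)].append(rec)

    result = []
    for rec in records:
        out = dict(rec)
        if rec[target] is None:
            donors = pools.get(tuple(rec[v] for v in class_vars))
            if donors:  # leave the value missing if the class has no donors
                weights = [d[weight] for d in donors] if weight else None
                donor = rng.choices(donors, weights=weights, k=1)[0]
                out[target] = donor[target]
        result.append(out)
    return result


# Invented example data: two imputation classes defined by age group.
sample = [
    {"age_group": "15-24", "intctfam": 1, "wt": 2.0},
    {"age_group": "15-24", "intctfam": 1, "wt": 1.0},
    {"age_group": "15-24", "intctfam": None, "wt": 1.5},
    {"age_group": "25-44", "intctfam": 3, "wt": 1.0},
    {"age_group": "25-44", "intctfam": None, "wt": 1.0},
]
completed = hot_deck_impute(sample, "intctfam", ["age_group"], rng=random.Random(0))
print([r["intctfam"] for r in completed])
```

Coarser class variables give larger donor pools, while finer ones give donors that resemble the receptor more closely, which is the trade-off the slide describes.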

NSFG, Cycle 6

Page 21: DDI AND THE DATA PRODUCER

INTCTFAM: Intact status of childhood family

Cycle 5 frequencies

Recode Rules

Inapplicable Respondents

Imputation

Distribution by Sex

Back to Variable INTCTFAM

Back to Variables List

NSFG, Cycle 6

Hispanic    Non-Hispanic White    Non-Hispanic Black    Total    Code Values and Definitions
   927            4240                  1002             6399    1=Two biological parents from birth
     8              58                     6               75    2=Two adoptive parents from birth
   618            2185                  1438             4373    3=Anything other than two biological or two adoptive parents from birth

Page 22: DDI AND THE DATA PRODUCER

INTCTFAM: Intact status of childhood family

Distribution by Sex

Recode Rules

Inapplicable Respondents

Imputation

Cycle 5 Frequencies

Back to Variable INTCTFAM

Back to Variables List

NSFG, Cycle 6

Female Respondents    Male Respondents    Code Values and Definitions
       xxxx                 xxxx          1=Two biological parents from birth
         xx                   xx          2=Two adoptive parents from birth
       xxxx                 xxxx          3=Anything other than two biological or two adoptive parents from birth

Page 23: DDI AND THE DATA PRODUCER

IMPORTANCE OF DDI TO INSTITUTE FOR SOCIAL RESEARCH

• Committee established with goal that Survey Research Center adopts a common data description standard based on XML and DDI for its codebooks

• Get data producers at Institute for Social Research to produce original documentation using DDI standards

• Educate staff in XML and DDI