Small area estimation approach to estimating the association between traffic-generated air pollution and early childhood respiratory problems

Mine Çetinkaya
UCLA Department of Statistics
August 1, 2010

SAE for air pollution and respiratory problems

Outline: Background and Motivation, Data, Modeling, Results, Future Research, References


What is small area estimation?

Small area estimation is a statistical technique used for estimating parameters for small sub-populations, when the sub-population of interest is included in a larger survey.

An area is regarded as “small” if the sample from the area is not sufficient to produce direct estimates of adequate precision.


How does small area estimation work?

Small area estimation “borrows strength” from related areas, usually neighbors or observations in the same area recorded at different times.

This requires the use of auxiliary information related to the variable of interest. Model-based estimators can be used to share information between different areas.

The modeling approach is quite powerful; however, the results it yields are highly dependent on the validity of the model (Longford, 2005).

On the other hand, model-based estimators provide the advantage of being able to use spatial variability terms (Rao, 2003).
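One common way to picture “borrowing strength” is a composite estimator that shrinks a noisy direct estimate toward a model-based value. The sketch below is only an illustration of that general idea and is not the model used in this talk; the function name, the variance inputs, and all numbers are hypothetical.

```python
def composite_estimate(direct, sampling_var, synthetic, between_area_var):
    """Shrink an area's direct estimate toward a model-based (synthetic) value.

    All arguments are hypothetical inputs for illustration:
    direct           -- estimate computed from the area's own sample
    sampling_var     -- sampling variance of that direct estimate
    synthetic        -- value predicted from auxiliary information / other areas
    between_area_var -- assumed variance of true values across areas
    """
    # Weight on the direct estimate: close to 1 when the area's own sample
    # is precise, close to 0 when its sampling variance dominates.
    gamma = between_area_var / (between_area_var + sampling_var)
    estimate = gamma * direct + (1.0 - gamma) * synthetic
    return estimate, gamma


# Same direct and synthetic values, but very different sampling variances.
est_large, w_large = composite_estimate(0.30, 0.001, 0.20, 0.01)  # well-sampled area
est_small, w_small = composite_estimate(0.30, 0.09, 0.20, 0.01)   # "small" area

print(f"well-sampled area: weight={w_large:.2f}, estimate={est_large:.3f}")
print(f"small area:        weight={w_small:.2f}, estimate={est_small:.3f}")
```

In this toy setup the small area's estimate is pulled most of the way toward the synthetic value, which is also why the validity of the model matters so much.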


Motivation of this talk

My interest is in the area of public health and applying small area estimation techniques to real-world data sets and problems that may then help inform policy decisions.

This talk is based on work that has been done in collaboration with researchers from the School of Public Health at UCLA.

My goal for this talk is to present a case study that illustrates the value of SAE techniques.


Background

There is an existing body of literature on the connection between prenatal air pollution exposure and adverse birth outcomes such as preterm birth and low birth weight.

Public health researchers at UCLA are interested in investigating this relationship in Los Angeles County using a large data set collected from birth certificates and phone / mail surveys.

The researchers also became interested in longer-term effects of prenatal air pollution exposure and conducted a follow-up survey on various respiratory health measures such as asthma and wheezing.


Population data

The population of interest is singleton births in Los Angeles County in 2003 (data from 58,316 birth certificates).

Birth certificates in California contain information on both the baby and the mother.

Though birth certificate data include address information, this may not be a good tool for accurately identifying prenatal air pollution exposure.


Sample data - EPOS

In order to get a better picture of where mothers spent their time during pregnancy, a detailed survey was conducted.

Environment and Pregnancy Outcomes Study (EPOS): a case-control study nested within a birth cohort (2003), conducted four to six months post-delivery, which collected detailed risk factor information to assess prenatal air pollution exposure.

6,374 women were sampled; however, only 2,543 responded to the survey (Ritz et al., 2007).


Sample data - ECHOS

Three years later, the 2,470 of the 2,543 mothers who had agreed during the first interview to a follow-up survey were re-contacted.

Early Childhood Outcomes Study (ECHOS): follow-up telephone or mail surveys on the child’s respiratory health, residential history since birth, and other potentially important covariates (Ritz & Turner, 2009).

Only 1,215 women responded (49% follow-up).
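The attrition across the two surveys can be checked with a few lines of arithmetic. The counts below are taken from the EPOS and ECHOS slides; the variable names are only illustrative.

```python
# Counts reported on the EPOS and ECHOS slides.
sampled = 6374            # women sampled for EPOS
epos_respondents = 2543   # responded to EPOS
recontacted = 2470        # agreed to follow-up and were re-contacted
echos_respondents = 1215  # responded to ECHOS

epos_rate = epos_respondents / sampled          # EPOS response rate
echos_rate = echos_respondents / recontacted    # ECHOS follow-up rate
overall_rate = echos_respondents / sampled      # share of original sample in ECHOS

print(f"EPOS response:   {epos_rate:.1%}")
print(f"ECHOS follow-up: {echos_rate:.1%}")
print(f"Overall:         {overall_rate:.1%}")
```

Spread over the zip code areas of Los Angeles County, 1,215 respondents leave many areas with very few observations, which is the setting small area estimation is meant for.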


Sample data - Hierarchical Structure

[Figure: hierarchical structure of the data]


Previous work

From EPOS: evidence for a connection between low birth weight or preterm birth and certain pollutants (CO and PM10).

From ECHOS: evidence for a connection between respiratory health problems and pollutants (PM10, PM2.5, NO, NO2, O3, and CO).

These results are based on logistic regression models on the complete data set; they fail to take into account spatial variability / dependency.

This is an area in which small area estimation techniques can prove valuable.


Exposure

[Figure: average ambient NO2 exposure over entire pregnancy, per 10 ppb; scale from 18 to 36]


Exposure (cont.)

[Figure: average PM10 exposure over entire pregnancy, per 10 µg/m3; scale from 26 to 42]


Exposure (cont.)

[Figure: average PM2.5 exposure over entire pregnancy, per 10 µg/m3; scale from 16 to 24]


Besag, York, Mollié (1991)

Besag et al. (1991) proposed a model that includes spatial and non-spatial random effects:

$$O_i \sim \mathrm{Po}(\mu_i)$$

$$\log(\mu_i) = \log(E_i) + \alpha + \beta X_i + u_i + v_i,$$

where

$$u_i \sim N(0, \sigma_u^2) \quad \text{and} \quad v_i \mid v_{-i} \sim N\!\left(\frac{\sum_{j \sim i} v_j}{n_i},\; \frac{\sigma_v^2}{n_i}\right),$$

and where $i$ represents each small area, i.e. zip code areas.
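To make the spatial term concrete, the sketch below evaluates the conditional mean and variance of the spatial effect for one area given its neighbours, reading n_i as the number of neighbours of area i and j∼i as "j is a neighbour of i" (the usual interpretation of this model). The four-area layout and all values are hypothetical, and this is a plain-Python illustration rather than the WinBUGS code used for the talk.

```python
def icar_conditional(i, v, neighbors, sigma2_v):
    """Conditional mean and variance of v[i] given the other spatial effects.

    v         -- current values of the spatial effects, one per area
    neighbors -- dict mapping each area index to a list of neighbouring indices
    sigma2_v  -- variance parameter of the spatial effect
    """
    nbrs = neighbors[i]
    n_i = len(nbrs)
    if n_i == 0:
        raise ValueError("area has no neighbours; the conditional is undefined")
    mean = sum(v[j] for j in nbrs) / n_i  # average of the neighbours' effects
    var = sigma2_v / n_i                  # more neighbours -> tighter prior
    return mean, var


# Hypothetical layout: area 1 borders areas 0, 2 and 3.
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
v = [0.2, -0.1, 0.4, 0.0]

mean1, var1 = icar_conditional(1, v, neighbors, sigma2_v=0.6)
mean3, var3 = icar_conditional(3, v, neighbors, sigma2_v=0.6)
print(mean1, var1)  # area with three neighbours
print(mean3, var3)  # area with a single neighbour
```

An area with a single neighbour keeps the full variance, so its spatial effect is smoothed less strongly than that of an area surrounded by many neighbours.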


Basic spatial dependency

Zip code areas are considered to be spatially dependent if they are neighboring.

An adjacency matrix is created based on these relationships with all links equally weighted.
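A minimal sketch of such an equally weighted adjacency matrix, assuming the neighbour relationships are already available as lists. The area labels are placeholders, not real zip codes, and how neighbours were actually derived from the zip code boundaries is not stated in the slides.

```python
def adjacency_matrix(neighbors):
    """Build a symmetric 0/1 adjacency matrix from neighbour lists.

    neighbors -- dict mapping an area label to the labels of adjacent areas
    Returns the sorted area labels and the matrix as a list of rows.
    """
    # Collect every label, including areas that only appear as a neighbour.
    areas = sorted(set(neighbors) | {b for nbrs in neighbors.values() for b in nbrs})
    index = {area: k for k, area in enumerate(areas)}
    W = [[0] * len(areas) for _ in areas]
    for a, nbrs in neighbors.items():
        for b in nbrs:
            # Every link gets the same weight (1), and adjacency is symmetric.
            W[index[a]][index[b]] = 1
            W[index[b]][index[a]] = 1
    return areas, W


# Placeholder labels standing in for zip code areas.
areas, W = adjacency_matrix({"A": ["B", "C"], "B": ["C", "D"]})
n_neighbors = [sum(row) for row in W]  # row sums give the neighbour counts

for area, row in zip(areas, W):
    print(area, row)
print(n_neighbors)
```

The row sums are the neighbour counts n_i that appear in the conditional distribution of the spatial effect.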


Explanatory variables considered

Pollutants: PM10, PM2.5, ambient and seasonalized NO and NO2, CO, O3.

Maternal race/ethnicity

Maternal socioeconomic variables: education, payment source for prenatal care, income


Wheeze in past 12 months

[Figure: four panels (Observed/Expected; Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.00, 0.39, 0.57, 0.85, 1.25, 1.85]
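The Observed/Expected panel shows a raw ratio for each area. The slides do not say how the expected counts were obtained, so the sketch below assumes the simplest choice: the expected count is the overall rate across all areas multiplied by the number of respondents in the area. The labels and counts are made up for illustration.

```python
def observed_expected(cases, respondents):
    """Raw observed/expected ratio per area.

    Assumption (not from the slides): expected = overall rate * respondents.
    Areas with no respondents get None, since no ratio can be computed.
    """
    overall_rate = sum(cases.values()) / sum(respondents.values())
    ratios = {}
    for area, n in respondents.items():
        if n == 0:
            ratios[area] = None
            continue
        expected = overall_rate * n
        ratios[area] = cases[area] / expected
    return ratios


# Made-up counts: children with wheeze and respondents per area.
cases = {"A": 0, "B": 3, "C": 9, "D": 0}
respondents = {"A": 4, "B": 20, "C": 36, "D": 0}

ratios = observed_expected(cases, respondents)
for area, ratio in ratios.items():
    print(area, ratio)
```

With only a handful of respondents, one case more or less moves the raw ratio a great deal, which is why the modeled panels look smoother than the Observed/Expected panel.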


Sneezing, or a runny or blocked nose apart from cold in past 12 months

[Figure: four panels (Observed/Expected; Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.00, 0.31, 0.68, 1.47, 3.20, 6.94]


Medication use for wheezing or asthma in the past 12 months

[Figure: four panels (Observed/Expected; Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.00, 0.42, 0.83, 1.66, 3.29, 6.54]


Doctor-diagnosed ear infections

[Figure: four panels (Observed/Expected; Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.00, 0.47, 0.67, 0.96, 1.37, 1.97]


Summary

The distribution of the outcome variable is smoother when the spatial dependency of zip code areas and other covariates are taken into account.

Distributions of air pollution and the socioeconomic variables are expected to be relatively smooth; large differences between neighboring zip code areas are unrealistic.

However, some of these models are likely to be overfitting the data.


Future Research

Build more complicated models that “borrow strength” based on other criteria.

Address areas with missing data:

[Figure: sneezing, or a runny or blocked nose apart from cold in past 12 months; scale from 0.0 to 1.0]

Predict outcomes for the entire population, which will help identify significant risk factors and at-risk groups.


Thanks

Thanks to Jan de Leeuw, Beate Ritz and Michelle Wilhelm.


Bibliography

Besag, York, & Mollié (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43, 1–59.

Longford (2005). Missing Data and Small Area Estimation. New York: Springer.

Rao (2003). Small Area Estimation. New Jersey: Wiley.

Ritz & Turner, W. (2009). Prenatal air pollution exposure and early childhood respiratory disease in the UCLA Environment and Pregnancy Outcomes Study (EPOS) cohort. In preparation.

Ritz, Wilhelm, Hoggatt, & Ghosh (2007). Ambient air pollution and preterm birth in the Environment and Pregnancy Outcomes Study at the University of California, Los Angeles. American Journal of Epidemiology, 166(9), 1045–1052.


Additional graphs - Distribution of exposure variables

[Figure: average CO exposure over entire pregnancy, per 1 ppm; scale from 0.0 to 2.0]


Additional graphs - Distribution of exposure variables

[Figure: average ambient NO exposure over entire pregnancy, per 20 ppb; scale from 15 to 60]


Additional graphs - Distribution of exposure variables

[Figure: average O3 exposure over entire pregnancy, per 10 ppb; scale from 25 to 55]


Additional graphs - Distribution of exposure variables

[Figure: seasonal LUR NO exposure over entire pregnancy, per 20 ppb; scale from 20 to 120]


Additional graphs - Standard errors

[Figure: wheeze in past 12 months (s.e. of θ); three panels (Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.10, 0.27, 0.33, 0.41, 0.54, 1.10]


Additional graphs - Standard errors

[Figure: sneezing/runny or blocked nose apart from cold in past 12 months (s.e. of θ); three panels (Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.15, 0.28, 0.38, 0.49, 0.64, 1.48]


Additional graphs - Standard errors

[Figure: medication use for wheezing or asthma in the past 12 months (s.e. of θ); three panels (Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.15, 0.29, 0.35, 0.44, 0.55, 1.35]


Additional graphs - Standard errors

[Figure: doctor-diagnosed ear infections (s.e. of θ); three panels (Model 1: Race only; Model 2: M1 + edu + prenatal pay; Model 3: M2 + income); scale breaks at 0.12, 0.17, 0.20, 0.24, 0.29, 0.64]


Additional graphs - WinBugs code
