Comparing methods for addressing limits of detection in environmental epidemiology Roni Kobrosly, PhD, MPH Department of Preventive Medicine Icahn School of Medicine at Mount Sinai

Page 1

Comparing methods for addressing limits of detection in environmental epidemiology

Roni Kobrosly, PhD, MPH

Department of Preventive Medicine

Icahn School of Medicine at Mount Sinai

Page 2

A familiar diagram…

Environmental Exposure → Internal Dose → Biologically Effective Dose → Altered Structure/Function → Clinical Disease

Biomarker of Exposure

DeCaprio, 1997

Page 3

Biomarkers and Limits of Detection (LOD)

Page 4

[Figure: concentration scale with the LOD marked. Below the LOD, it is difficult to quantify the concentration because it is so low; concentration increases above the LOD.]

Page 5

Handling LODs in analysis

• Easiest approach: simply delete these observations

• Problems with this:

o Values < LOD are informative: the analyte's concentration lies somewhere between 0 and the LOD

o Studies are expensive and you lose covariate data!

o Excluding observations from analyses *may* substantially bias results

Chen et al. 2011

Page 6

Handling LODs in analysis

• Hornung & Reed describe approach that involves substituting a single value for each observation <LOD

• Three suggested substitutions: LOD/2, LOD/√2, or just LOD

• Problem: Replacing a sizable portion of the data with a single value increases the likelihood of bias and reduces power!

Helsel, 2005; Hughes, 2000; Hornung & Reed, 1990

Page 7

Citations in Google Scholar

Hornung & Reed, 1990

[Figure: number of publications citing Hornung & Reed (1990) by year, 1990 to 2012; y-axis from 0 to 100]

Page 8

Comparing LOD methods

• While there are many studies testing individual methods, relatively little work comparing performance of several methods

• Even fewer studies have compared methods in context of multivariable data

• Comparative studies that do exist provide contradictory recommendations. No consensus!

Page 9

Simulation Study Objectives

• Compare performance of LOD methods when independent variable is subject to limit of detection in multiple regression

• Compare performance across a range of “experimental” conditions

• Create flowchart to aid researchers in their analysis decision making

Page 10

Statistical Bias

National Library of Medicine definition: “Any deviation of results or inferences from the truth”

Unbiased Biased
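For regression coefficients, the general definition above is usually made precise as the gap between an estimator's expected value and the true parameter, which a simulation approximates by averaging over many datasets. A standard way to write this is sketched below; the symbols (β for the true coefficient, β̂_k for the estimate from the k-th of K simulated datasets) are notation introduced here for illustration and do not appear on the slides.

```latex
\mathrm{Bias}(\hat{\beta}) = \mathbb{E}[\hat{\beta}] - \beta,
\qquad
\widehat{\mathrm{Bias}} = \frac{1}{K}\sum_{k=1}^{K}\hat{\beta}_k \; - \; \beta
```

An estimator is unbiased when the first quantity is zero; the second quantity is its Monte Carlo estimate.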

Page 11

Variable Definitions

• Four continuous variables:

• Y: Dependent variable (outcome)

• X: Independent variable (exposure, subject to LOD)

• C1, C2: Independent variables (covariates)

Page 12

6 “Experimental Conditions”

1) Dataset sample size: n = {100, 500}

Page 13

2) % of exposure variable with values in LOD region:

LOD% = {0.05, 0.25}

Page 14

3) Distribution of Exposure Variable:

Normal versus Skewed

Page 15

4) R2 of full model:

R2 = {0.10, 0.20}

Page 16

5) Strength & direction of exposure-outcome association:

Beta = {-10, 0, 10}

Page 17

6) Direction of confounding:

Strong Positive, versus Strong Negative, versus None
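The six conditions can be laid out as a grid of scenarios. The sketch below assumes the conditions are fully crossed (the slides do not say so explicitly), and the dictionary keys are illustrative names, not taken from the original study code.

```python
from itertools import product

# Levels of the six experimental conditions listed on the slides.
# Key names are hypothetical labels chosen for this sketch.
conditions = {
    "n": [100, 500],
    "lod_fraction": [0.05, 0.25],
    "x_distribution": ["normal", "skewed"],
    "r_squared": [0.10, 0.20],
    "beta": [-10, 0, 10],
    "confounding": ["strong positive", "strong negative", "none"],
}

# Assumed full factorial design: one scenario per combination of levels.
scenarios = [dict(zip(conditions, combo)) for combo in product(*conditions.values())]

print(len(scenarios))  # 2 * 2 * 2 * 2 * 3 * 3 = 144
print(scenarios[0])
```

If the design is fully crossed, each of the six LOD methods would be evaluated in every one of these scenarios.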


Page 18

LOD methods considered

1. Deletion of subjects with LOD values

2. Substitution with LOD/√(2)

3. Substitution with LOD/2

4. Substitution with just LOD value

5. Multiple imputation (King’s Amelia II)

6. MLE-imputation method (Helsel & Krishnamoorthy)

Page 19

Method 1: Deletion

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

After deletion:

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

-273.0 9.5 11.8 11.1

Page 20

Method 2: Sub with LOD/√(2)

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

LODX = 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 6.4 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 6.4 12.6 9.0

9.0/√2 = 6.4

Page 21

Method 3: Sub with LOD/2

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

LODX = 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 4.5 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 4.5 12.6 9.0

9.0/2 = 4.5

Page 22

Method 4: Sub with just LOD

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

LODX = 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 9.0 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 9.0 12.6 9.0

9.0
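Methods 1 to 4 are simple enough to express in a few lines. The following is a minimal sketch using the five example rows and the LOD of 9.0 from the slides; representing a "<LOD" value as `None` and the helper names `delete_below_lod` and `substitute` are choices made here for illustration.

```python
import math

LOD_X = 9.0  # limit of detection for X, as on the slides

# Example rows from the slides; None marks an X value reported as "<LOD".
rows = [
    {"Y": 167.7, "X": 25.8, "C1": 13.5, "C2": 12.9},
    {"Y": -66.3, "X": 15.9, "C1": 11.7, "C2": 12.6},
    {"Y": 50.6, "X": None, "C1": 10.4, "C2": 10.8},
    {"Y": -273.0, "X": 9.5, "C1": 11.8, "C2": 11.1},
    {"Y": 156.9, "X": None, "C1": 12.6, "C2": 9.0},
]

def delete_below_lod(data):
    """Method 1: drop every row whose X is below the LOD."""
    return [dict(r) for r in data if r["X"] is not None]

def substitute(data, value):
    """Methods 2-4: replace every below-LOD X with one fixed value."""
    return [{**r, "X": value if r["X"] is None else r["X"]} for r in data]

deleted = delete_below_lod(rows)                      # Method 1
sub_sqrt2 = substitute(rows, LOD_X / math.sqrt(2))    # Method 2
sub_half = substitute(rows, LOD_X / 2)                # Method 3
sub_lod = substitute(rows, LOD_X)                     # Method 4

print(len(deleted))                                   # 3 rows remain
print(round(sub_sqrt2[2]["X"], 1), sub_half[2]["X"], sub_lod[2]["X"])  # 6.4 4.5 9.0
```

Note that every substitution method assigns the same value to all below-LOD observations, which is the source of the bias and power concerns raised earlier in the deck.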

Page 23

Method 5: Multiple Imputation

• “Amelia II” by Dr. Gary King

• Assumes pattern of observations below LOD only depends on observed data (not unobserved data)

• Lets you constrain imputed values (very helpful when working with LODs!)

Page 24

Method 5: Multiple Imputation

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 3.0 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 6.2 12.6 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 2.5 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 6.8 12.6 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 3.3 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 6.3 12.6 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 3.5 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 6.0 12.6 9.0

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 2.8 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 7.2 12.6 9.0

M = 5

Page 25

Method 5: Multiple Imputation

Run the regression separately in each of the M = 5 imputed datasets shown on the previous slide, then pool the exposure coefficients:

β1 = 10.1, β2 = 9.5, β3 = 8.3, β4 = 12.1, β5 = 10.4

Pooled estimate (mean of the five coefficients): β̄ = 10.08
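Pooling across imputed datasets is conventionally done with Rubin's rules, where the point estimate is the simple average of the per-dataset coefficients. The sketch below applies that to the five coefficients listed on this slide. It only covers the point estimate and the between-imputation variance; a full standard error would also need the within-imputation variances, which the slides do not show.

```python
from statistics import mean, variance

# Exposure coefficients from the five imputed-data regressions on the slide.
betas = [10.1, 9.5, 8.3, 12.1, 10.4]

pooled_beta = mean(betas)       # Rubin's rules point estimate: average over imputations
between_var = variance(betas)   # between-imputation variance B (sample variance)

print(round(pooled_beta, 2))
print(round(between_var, 3))
```

Under Rubin's rules the total variance is the average within-imputation variance plus (1 + 1/M) times `between_var`, with M = 5 here.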

Page 26

Method 6: MLE-Imputation

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

Page 27

Method 6: MLE-Imputation

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 <LOD 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 <LOD 12.6 9.0

Assume normal distribution, estimate x̄ and Sx

Page 28

Method 6: MLE-Imputation

Y X C1 C2

167.7 25.8 13.5 12.9

-66.3 15.9 11.7 12.6

50.6 3.2 10.4 10.8

-273.0 9.5 11.8 11.1

156.9 5.8 12.6 9.0

Use the LOD value and the estimated x̄ and Sx to randomly generate observations below the LOD
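The two steps of MLE-imputation can be sketched with the standard library alone. This is an illustration of the general idea, not the Helsel & Krishnamoorthy code: it assumes a normal exposure, uses an EM iteration as one way to reach the censored-data maximum likelihood estimates, and then draws the below-LOD values from the fitted normal truncated at the LOD. The function names, the simulated data (mean 10, SD 3, LOD 8) and the seed are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for pdf and cdf

def censored_normal_mle(observed, n_censored, lod, iters=200):
    """EM estimate of (mean, sd) for normal data left-censored at lod."""
    n = len(observed) + n_censored
    s1 = sum(observed)
    s2 = sum(x * x for x in observed)
    mu = s1 / len(observed)                        # start from detected values only
    sd = math.sqrt(s2 / len(observed) - mu * mu)
    for _ in range(iters):
        a = (lod - mu) / sd
        lam = STD.pdf(a) / STD.cdf(a)
        e1 = mu - sd * lam                               # E[X | X < lod]
        e2 = mu * mu + sd * sd - sd * lam * (lod + mu)   # E[X^2 | X < lod]
        mu_new = (s1 + n_censored * e1) / n
        sd = math.sqrt((s2 + n_censored * e2) / n - mu_new * mu_new)
        mu = mu_new
    return mu, sd

def draw_below_lod(mu, sd, lod, k, rng):
    """Draw k values from Normal(mu, sd) truncated to values below lod."""
    dist = NormalDist(mu, sd)
    p_lod = dist.cdf(lod)
    # Inverse-CDF sampling restricted to the probability mass below the LOD.
    return [min(dist.inv_cdf(rng.uniform(1e-12, p_lod)), lod) for _ in range(k)]

# Hypothetical demonstration data: true mean 10, true SD 3, LOD at 8.
rng = random.Random(0)
lod = 8.0
full = [rng.gauss(10.0, 3.0) for _ in range(2000)]
detected = [x for x in full if x >= lod]
n_cens = len(full) - len(detected)

mu_hat, sd_hat = censored_normal_mle(detected, n_cens, lod)
imputed = draw_below_lod(mu_hat, sd_hat, lod, n_cens, rng)
print(round(mu_hat, 2), round(sd_hat, 2), n_cens)
```

The draws are only bounded above by the LOD. For a real concentration, which cannot be negative, a lower bound at zero or a log-normal model would be a natural refinement, and a skewed exposure would violate the normality assumption this sketch relies on.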

Page 29

Two-step Data Generation Process

• 1st Step: Select “true” regression parameters for the following two models:

o $X = \beta_{0\_X} + \beta_{C1\_X}(C1) + \beta_{C2\_X}(C2)$

o $Y = \beta_{0\_Y} + \beta_{X\_Y}(X) + \beta_{C1\_Y}(C1) + \beta_{C2\_Y}(C2)$

• 2nd Step: Use “true” parameters to guide the drawing of random numbers

Page 30

“TRUTH”

Y = 2.8 + 2(X) + 4.5(C1) + 6(C2)

X = 1.3 - 6(C1) + 1.5(C2)

SIMULATED DATASETS: Dataset1.1, Dataset1.2, Dataset1.3

Obs # Y X C1 C2

1 24.67 5.44 -0.28 1.77

2 30.73 9.47 -1.55 -0.81

3 19.39 -0.98 0.96 0.92

4 -9.47 -8.20 1.72 0.49

i yi xi c1i c2i
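A minimal sketch of the two-step generation process, using the “true” coefficients shown on this slide. The slides do not state how the covariates or the random noise were drawn, so the standard normal covariates and the residual standard deviations `sd_x` and `sd_y` below are assumptions made purely for illustration.

```python
import random

# "True" parameters from the slide.
TRUE_X = {"intercept": 1.3, "C1": -6.0, "C2": 1.5}
TRUE_Y = {"intercept": 2.8, "X": 2.0, "C1": 4.5, "C2": 6.0}

def simulate_dataset(n, rng, sd_x=1.0, sd_y=2.0):
    """Draw n rows of (Y, X, C1, C2) from the two "true" models.

    Covariate distribution and noise SDs are assumed, not from the slides.
    """
    rows = []
    for _ in range(n):
        c1 = rng.gauss(0.0, 1.0)
        c2 = rng.gauss(0.0, 1.0)
        x = (TRUE_X["intercept"] + TRUE_X["C1"] * c1 + TRUE_X["C2"] * c2
             + rng.gauss(0.0, sd_x))
        y = (TRUE_Y["intercept"] + TRUE_Y["X"] * x + TRUE_Y["C1"] * c1
             + TRUE_Y["C2"] * c2 + rng.gauss(0.0, sd_y))
        rows.append((y, x, c1, c2))
    return rows

rng = random.Random(1)
dataset = simulate_dataset(100, rng)
for y, x, c1, c2 in dataset[:4]:
    print(f"{y:8.2f} {x:8.2f} {c1:6.2f} {c2:6.2f}")
```

Generating X from the covariates first, and Y from X and the covariates second, is what lets C1 and C2 act as confounders of the X-Y association.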

Page 31

1. Create a set of “true” parameters: Y = 2.8 + 2(X) + 4.5(C1) + 6(C2)

2. Create 1000 simulated datasets (Dataset1.1, Dataset1.2, Dataset1.3, …, Dataset1.1000) for this set of “true” parameters, using a specific set of experimental conditions

3. Apply a LOD correction method and run the regression for each dataset, e.g. ŷ = 2.72 + 2.2(X) + 4.2(C1) + 5.98(C2)

4. Take the difference between the estimated coefficient and the “true” parameter: Bias = 2.2 - 2 = 0.2. This produces 1000 bias estimates, summarized with 95% CIs
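The whole loop (simulate, apply a LOD method, regress, record the bias) can be sketched as follows. This is not the study's code: the noise standard deviations, the standard normal covariates, the use of an empirical percentile as the LOD, and the restriction to two methods (deletion, and substitution with the LOD itself) are all assumptions made to keep the example short and dependency-free. The function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev

def ols(rows):
    """OLS of Y on (1, X, C1, C2) via the normal equations; returns 4 coefficients."""
    design = [(1.0, x, c1, c2) for _, x, c1, c2 in rows]
    k = 4
    # Augmented matrix [X'X | X'y].
    a = [[sum(d[i] * d[j] for d in design) for j in range(k)]
         + [sum(d[i] * r[0] for d, r in zip(design, rows))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def simulate(n, rng):
    """One dataset from the slide's "true" models (noise SDs 3 and 10 are assumed)."""
    rows = []
    for _ in range(n):
        c1, c2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = 1.3 - 6 * c1 + 1.5 * c2 + rng.gauss(0, 3)
        y = 2.8 + 2 * x + 4.5 * c1 + 6 * c2 + rng.gauss(0, 10)
        rows.append((y, x, c1, c2))
    return rows

def apply_lod(rows, lod_fraction, method):
    """Treat the lowest lod_fraction of X as below the LOD, then delete or substitute."""
    xs = sorted(r[1] for r in rows)
    lod = xs[int(lod_fraction * len(rows))]  # empirical LOD for this dataset
    if method == "deletion":
        return [r for r in rows if r[1] >= lod]
    return [(y, max(x, lod), c1, c2) for y, x, c1, c2 in rows]  # substitute LOD

def mean_bias(method, n=100, lod_fraction=0.25, reps=1000, seed=0, true_beta=2.0):
    """Mean bias of the X coefficient with a normal-approximation 95% CI."""
    rng = random.Random(seed)
    biases = [ols(apply_lod(simulate(n, rng), lod_fraction, method))[1] - true_beta
              for _ in range(reps)]
    m = mean(biases)
    half_width = 1.96 * stdev(biases) / math.sqrt(reps)
    return m, (m - half_width, m + half_width)

results = {method: mean_bias(method) for method in ("deletion", "lod")}
for method, (bias, ci) in results.items():
    print(f"{method:9s} mean bias = {bias:+.3f}  95% CI = ({ci[0]:+.3f}, {ci[1]:+.3f})")
```

Because the assumed model here is linear and rows are dropped based on X alone, deletion should show a mean bias close to zero in this setup. The size and direction of the substitution bias depend on the assumed settings and should not be read as reproducing the study's results.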

Page 32

Help from Minerva

Minerva runtime ~ 5 minutes

Page 33

n = 100, 25% LOD, Skewed Dist, R2 = 0.20, Negative X-Y Association, Negative confounding

[Figure: mean bias (with 95% CI) by method: Deletion, LOD/sqrt(2), LOD/2, LOD, Multi Impu, MLE Impu; axis from -2.0 to 8.0]

Page 34

n = 100, 25% LOD, Skewed Dist, R2 = 0.20, Positive X-Y Association, Negative confounding

[Figure: mean bias (with 95% CI) by method: Deletion, LOD/sqrt(2), LOD/2, LOD, Multi Impu, MLE Impu; axis from -8.0 to 2.0]

Page 35

n = 100, 25% LOD, Skewed Dist, R2 = 0.20, Negative X-Y Association, No confounding

[Figure: mean bias (with 95% CI) by method: Deletion, LOD/sqrt(2), LOD/2, LOD, Multi Impu, MLE Impu; axis from -1.0 to 1.0]

Page 36

n = 100, 25% LOD, Skewed Dist, R2 = 0.20, Positive X-Y Association, No confounding

[Figure: mean bias (with 95% CI) by method: Deletion, LOD/sqrt(2), LOD/2, LOD, Multi Impu, MLE Impu; axis from -1.0 to 1.0]

Page 37

n = 100, 25% LOD, Skewed Dist, R2 = 0.20, Negative X-Y Association, Positive confounding

[Figure: mean bias (with 95% CI) by method: Deletion, LOD/sqrt(2), LOD/2, LOD, Multi Impu, MLE Impu; axis from -2.0 to 8.0]

Page 38

n = 100, 25% LOD, Skewed Dist, R2 = 0.20, Positive X-Y Association, Positive confounding

[Figure: mean bias (with 95% CI) by method: Deletion, LOD/sqrt(2), LOD/2, LOD, Multi Impu, MLE Impu; axis from -8.0 to 2.0]

Page 39

An overview of results

• Relative bias of methods is highly dependent on experimental conditions (i.e. no simple answers)

• Covariates and confounding matter! Simulations that only consider bivariate X-Y relationships with LODs are limited

Page 40

Deletion method results

• Surprisingly… provides unbiased estimates across all conditions!

• If sample size is large and LOD% is small, this may be a good option. As LOD% becomes larger, deletion is more costly

• Important caveat: deletion method works well if true associations are linear

Page 41

Deletion method with linear effects

Bottom 8% of X variable deleted

Page 42

Substitution method results

• Not surprisingly… these methods are generally terrible!

• Substituting the LOD itself is the worst of the three

• In most scenarios, these will bias associations towards the null

• … but they work reasonably well when the distribution is highly skewed, there is no confounding, and LOD% is low

Page 43

Multiple Imputation results

• Amelia II performs relatively well! Particularly when R2 is higher

• Does well even when LOD% is high

• Problematic when there is no confounding (reason: this indicates there are no/weak associations between variables)

Page 44

MLE Imputation results

• Associated with severe bias in most cases

• Highly reliant on parametric assumptions and the code is daunting: recommend avoiding this method

• However, performed reasonably well when exposure is normally distributed, no confounding, and LOD% is low

Page 45

A Case Study…

Page 46

Sarah’s SFF Analysis

• Study for Future Families (SFF): a multicenter pregnancy cohort study that recruited mothers from 1999-2005

• Sarah Evans’ analysis: prenatal exposure to Bisphenol A (BPA) and neurobehavioral scores in 153 children at ages 6-10

• 28 (18%) children have BPA levels below the LOD

Page 47

Sarah’s SFF Analysis

• Maternal urinary BPA collected during late pregnancy

• Neurobehavioral scores obtained through School-age Child Behavior Checklist (CBCL).

• Used multiple regression adjusting for child age at CBCL assessment, mother’s education level, family stress, urinary creatinine

Page 48

[Figure: results for each CBCL scale (Anxiety/Dep, Withdrawn/Dep, Somatic, Social, Thought, Attention, Rule-Break, Aggressive, Internalizing, Externalizing, Total Problems), comparing LOD/sqrt(2) substitution with Deletion; axis from -0.6 to 1.0]