Stratified subsampling for effective removal of batch effects in metabolomics Application to endocrine disruptors screening Julien Boccard School of Pharmaceutical Sciences University of Geneva, University of Lausanne

Stratified subsampling for effective removal of batch effects in … · 2019. 3. 6. · Boccard et al., Analytica Chimica Acta (2016), 920, 18-28. Unbalanced Designs What to do with


Page 1

Stratified subsampling for effective removal of batch effects in metabolomics

Application to endocrine disruptors screening

Julien Boccard

School of Pharmaceutical Sciences, University of Geneva, University of Lausanne

Page 2

Endocrine Disruption

Endocrine disruption is related to many pathologies, such as infertility, diabetes, obesity and cancer (breast, prostate, endometrial, ovarian, cervical, testicular, bladder, renal, thyroid, osteosarcoma, etc.)

Need for efficient monitoring to provide an opportunity to diagnose exposure/disease at early stages

The screening of potential Endocrine Disrupting Chemicals is a major concern for regulatory agencies

The U.S. Environmental Protection Agency (EPA) and the Organisation for Economic Co-operation and Development (OECD) fund R&D programs

Page 3

Steroid Profiling in Adrenal Cell Model

Development of an efficient and robust analytical protocol for monitoring steroid metabolites in H295R cell culture supernatant

Sample preparation: protein precipitation, solid-phase extraction

UHPLC analysis coupled to QTOF high-resolution mass spectrometry

H295R Steroidogenesis Assay, OECD 2011

H295R cell line (from human adrenocortical carcinoma cells)

OECD model to study steroidogenesis perturbations

H295R cells express genes encoding most of the key enzymes of steroidogenesis

BUT the test is designed to assess variations of testosterone and estradiol due to chemical exposure

Page 4

Untargeted MS Acquisition

[Figure: compound classes covered by the untargeted acquisition: organic acids, lipids, acylcarnitines, nucleosides & derivatives]

UHPLC-QTOF/MSE → full m/z range acquisition (100-1’000)

About 10’000 detected features …

Page 5

Focus on Steroid Metabolites

Steroids

Database ID - Pathways

Literature Scientific Knowledge

Web resources

About 250 reference steroids …

Page 6

Experimental Dataset

Exposure to 7 different conditions (6 toxicants, 1 control): acetyltributylcitrate (ACT), forskolin (FOR), linuron (LIN), octocrylene (OCT), octylmethoxycinnamate (OMC), torcetrapib (TOR), dimethylsulfoxide (DMSO)

Non-cytotoxic concentrations

Two or three replicates

Three biological batches to estimate repeatability

49 samples with >100 annotated steroid metabolites

High variability between batches: very strong batch effect

Metabolic alterations due to exposure are masked

[Figure: Principal Component Analysis score plot, t1 (42.8%) vs. t2 (27.1%); samples group by batch (Batch 1, Batch 2, Batch 3)]
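The masking of exposure effects by the batch effect can be illustrated on synthetic data. The sketch below is only an illustration, assuming NumPy is available; the function `pca_scores`, the simulated batch offsets and the matrix sizes are hypothetical and unrelated to the steroid dataset.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Scores and explained-variance fractions from an SVD of the centred data."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return U[:, :n_components] * s[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
batch = np.repeat([1, 2, 3], 14)                     # 3 batches of 14 samples
batch_shift = rng.normal(scale=5.0, size=(3, 20))    # hypothetical large batch offsets
X = rng.normal(size=(42, 20)) + batch_shift[batch - 1]

T, explained = pca_scores(X)
for b in (1, 2, 3):
    # Batch centroids on t1/t2 are far apart, so batches dominate the score plot
    print(b, T[batch == b].mean(axis=0).round(2))
print(explained.round(3))
```

With offsets of this size, the first two components are driven almost entirely by the batch, which is the situation the score plot on this slide describes.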

Page 7

Batch Effects Removal

[Figure: total variation split into between-group variation and within-group variation]

How to account for the study design in a multivariate context?

Associate ANOVA decomposition with projection methods: ASCA, ANOVA-PCA, ANOVA-PLS, ANOVA-TP, AComDim, AMOPLS

Explicitly consider the batch as an experimental factor

Quality control (QC) samples for batch correction

$X = X_\mu + X_\alpha + X_\beta + X_{\alpha\beta} + X_{Res}$
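A minimal numerical sketch of this decomposition is given below, assuming a balanced two-factor design and NumPy. The function name `anova_decompose` and the toy data are hypothetical; the effect matrices are built from level and cell means, which is one common way to obtain them.

```python
import numpy as np

def anova_decompose(X, a, b):
    """Split X (n x k) into mean, main-effect, interaction and residual matrices."""
    X = np.asarray(X, dtype=float)
    a, b = np.asarray(a), np.asarray(b)
    X_mu = np.tile(X.mean(axis=0), (X.shape[0], 1))
    Xc = X - X_mu
    X_a, X_b, X_ab = np.zeros_like(X), np.zeros_like(X), np.zeros_like(X)
    for level in np.unique(a):                 # main effect of factor alpha
        rows = a == level
        X_a[rows] = Xc[rows].mean(axis=0)
    for level in np.unique(b):                 # main effect of factor beta
        rows = b == level
        X_b[rows] = Xc[rows].mean(axis=0)
    rest = Xc - X_a - X_b
    for la in np.unique(a):                    # interaction: remaining cell means
        for lb in np.unique(b):
            rows = (a == la) & (b == lb)
            if rows.any():
                X_ab[rows] = rest[rows].mean(axis=0)
    X_res = rest - X_ab
    return X_mu, X_a, X_b, X_ab, X_res

# Toy balanced design: 7 exposure levels x 3 batches x 2 replicates (42 rows)
rng = np.random.default_rng(0)
exposure = np.repeat(np.arange(7), 6)
batch = np.tile(np.repeat(np.arange(3), 2), 7)
X = rng.normal(size=(42, 5)) + 2.0 * batch[:, None] + 0.5 * exposure[:, None]

parts = anova_decompose(X, exposure, batch)
print(np.allclose(sum(parts), X))              # the five matrices add up to X
```

In a balanced design such as this toy example the main-effect matrices are mutually orthogonal, which is what allows the total variation to be partitioned between the factors.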

Page 8

ANOVA Multiblock OPLS workflow

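Experimental matrix X (n × k)

ANOVA decomposition into submatrices (each n × k):

$X = X_A + X_B + X_{AB} + X_{Res}$

Joint analysis of the submatrices by multiblock OPLS, each effect submatrix being combined with the residuals:

$X_A + X_{Res}$, $X_B + X_{Res}$, $X_{AB} + X_{Res}$, $X_{Res}$

Y: prediction of level barycentres based on experimental submatrices

$X = T_{p\alpha} P_{p\alpha}^T + T_{p\beta} P_{p\beta}^T + T_{p\alpha\beta} P_{p\alpha\beta}^T + T_o P_o^T + E$

$Y = T_{p\alpha} Q_{p\alpha}^T + T_{p\beta} Q_{p\beta}^T + T_{p\alpha\beta} Q_{p\alpha\beta}^T + F$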

Boccard et al., Analytica Chimica Acta (2016), 920, 18-28.

Page 9

Unbalanced Designs

What to do with groups (factor levels) of unequal sizes?

General linear model approach offers an unbiased decomposition (Thiel et al., 2017), BUT submatrices are still non-orthogonal

Resampling using the smallest size of exchangeable units (lowest number of observations associated with a level or combination of levels)

Balanced groups are mandatory for variance decomposition:

Percentage of explained variation (sum of squares)

Orthogonal (uncorrelated additive) submatrices

$X = X_{Mean} + X_{Exposure} + X_{Batch} + X_{Interaction} + X_{Res}$
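Why balance matters can be checked numerically. The following sketch, assuming NumPy, compares the overlap between two main-effect matrices in a balanced toy design and in the same design with one sample removed. The helper names `level_means` and `main_effect_overlap` are hypothetical, and the effect matrices are built from simple level means rather than from a general linear model.

```python
import numpy as np

def level_means(X, labels):
    """Replace each row of X by the mean of its factor level."""
    out = np.zeros_like(X, dtype=float)
    for level in np.unique(labels):
        rows = labels == level
        out[rows] = X[rows].mean(axis=0)
    return out

def main_effect_overlap(X, a, b):
    """Largest absolute entry of X_a^T X_b (zero when the effects are orthogonal)."""
    Xc = X - X.mean(axis=0)
    return np.abs(level_means(Xc, a).T @ level_means(Xc, b)).max()

rng = np.random.default_rng(1)
a = np.repeat(np.arange(3), 4)             # 3 exposure levels x 4 samples
b = np.tile(np.arange(2), 6)               # 2 batches, 2 samples per cell
X = rng.normal(size=(12, 4)) + 3.0 * b[:, None] + a[:, None]

balanced = main_effect_overlap(X, a, b)
keep = np.arange(12) != 0                  # drop one sample: unequal cell sizes
unbalanced = main_effect_overlap(X[keep], a[keep], b[keep])
print(f"balanced: {balanced:.2e}, unbalanced: {unbalanced:.2e}")
```

The balanced overlap is zero up to rounding error, whereas removing a single sample makes the two effect matrices correlated, so their sums of squares no longer add up cleanly.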

Page 10

Stratified Subsampling

Original design

Exposure Factor (7 levels): OCT n=8, OMC n=7, ATC n=7, LIN n=6, TOR n=7, FOR n=7, DMSO n=7

Batch Factor (3 levels): Batch 1 n=18, Batch 2 n=16, Batch 3 n=15

Stratified subsampling (103 subsets)

Balanced design

Exposure Factor (7 levels): OCT n=6, OMC n=6, ATC n=6, LIN n=6, TOR n=6, FOR n=6, DMSO n=6

Batch Factor (3 levels): Batch 1 n=14, Batch 2 n=14, Batch 3 n=14
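The subsampling step can be sketched as follows, assuming NumPy. The function `stratified_subsample` is a hypothetical helper, and the replicate counts per exposure × batch cell are an assumed allocation chosen only to reproduce the marginal counts of the slide (49 samples reduced to 42); the real allocation is not given in the presentation. The number of subsets drawn here (100) is arbitrary.

```python
import numpy as np

def stratified_subsample(exposure, batch, rng):
    """Return sorted row indices with the same number of samples in every
    exposure x batch cell (the size of the smallest cell)."""
    cells = {}
    for i, key in enumerate(zip(exposure, batch)):
        cells.setdefault(key, []).append(i)
    n_min = min(len(rows) for rows in cells.values())
    picked = [rng.choice(rows, size=n_min, replace=False) for rows in cells.values()]
    return np.sort(np.concatenate(picked))

# Assumed replicate counts per (exposure, batch) cell; hypothetical allocation
replicates = {
    "OCT": (3, 3, 2), "OMC": (3, 2, 2), "ATC": (3, 2, 2), "LIN": (2, 2, 2),
    "TOR": (3, 2, 2), "FOR": (2, 3, 2), "DMSO": (2, 2, 3),
}
exposure, batch = [], []
for level, counts in replicates.items():
    for b, n in enumerate(counts, start=1):
        exposure += [level] * n
        batch += [b] * n

rng = np.random.default_rng(42)
subsets = [stratified_subsample(exposure, batch, rng) for _ in range(100)]
print(len(exposure), len(subsets[0]))   # 49 42
```

Each subset keeps two samples per cell, so every exposure level has 6 samples and every batch has 14. A model can then be fitted on each balanced subset and the results summarised over the whole population of models.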

[Figure: histograms (counts vs. classes) of the model population for scores and loadings]

Quantitative evaluation of the experimental factors:

Exposure 16.7%

Batch 61%

Interaction 15.8%

Residuals 6.5%
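Percentages of this kind can be obtained as ratios of sums of squares. The sketch below, assuming NumPy and a balanced design, shows one way to compute them on toy data; `group_means` and `effect_percentages` are hypothetical helpers, and the simulated effect sizes are arbitrary, so the printed values are not those of the study.

```python
import numpy as np

def group_means(M, labels):
    """Replace each row of M by the mean of the rows sharing its label."""
    out = np.zeros_like(M)
    for level in set(labels):
        rows = np.array([l == level for l in labels])
        out[rows] = M[rows].mean(axis=0)
    return out

def effect_percentages(X, a, b):
    """Percentage of the total sum of squares carried by each effect matrix."""
    Xc = X - X.mean(axis=0)
    X_a = group_means(Xc, a)
    X_b = group_means(Xc, b)
    rest = Xc - X_a - X_b
    X_ab = group_means(rest, list(zip(a, b)))
    X_res = rest - X_ab
    total = (Xc ** 2).sum()
    parts = {"Exposure": X_a, "Batch": X_b, "Interaction": X_ab, "Residuals": X_res}
    return {name: 100.0 * (M ** 2).sum() / total for name, M in parts.items()}

# Toy balanced design: 7 exposure levels x 3 batches x 2 replicates
rng = np.random.default_rng(0)
a = [e for e in range(7) for _ in range(6)]
b = [bt for _ in range(7) for bt in (0, 0, 1, 1, 2, 2)]
X = rng.normal(size=(42, 10))
X += 3.0 * np.array(b)[:, None] + 1.0 * np.array(a)[:, None]

pct = effect_percentages(X, a, b)
print({name: round(value, 1) for name, value in pct.items()})
```

Because the design is balanced, the four contributions sum to 100%. With stratified subsampling, such values could be averaged over the balanced subsets.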

Page 11

AMOPLS Effects Interpretation - Batch

Batch Main Effect

[Figure: Batch scores tp1 vs. tp2 (Batch 1, Batch 2, Batch 3) and Batch loadings pp1 vs. pp2]

Batch × Exposure Interaction

[Figure: Interaction scores tp4 vs. tp6 (FOR, TOR, OCT, ACT, DMSO, LIN, OMC) and loadings pp4 vs. pp6]

SCORES: Clear groupings according to batches; major source of variation

LOADINGS: Massive overall differences; culture medium variability

Page 12

AMOPLS Effects Interpretation - Exposure

Exposure Main Effect

[Figure: Toxicants scores tp3 vs. tp5 and tp7 vs. tp10 (FOR, DMSO, TOR, ACT, OCT, OMC, LIN); loadings pp3 vs. pp5 and pp7 vs. pp10]

SCORES: Clusters of samples related to chemical exposure; homogeneous groups

LOADINGS: Higher or lower abundances according to exposure; specific patterns

Page 13

Xenobiotics Mapping

Hierarchical Cluster Analysis

• Euclidean distances based on AMOPLS scores

• Ward aggregation method
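A clustering of this type can be sketched with SciPy, assuming `scipy.cluster.hierarchy` is available. The score matrix below is simulated (three hypothetical groups of five samples on four components) and only stands in for the AMOPLS scores; the choice of three clusters is likewise arbitrary.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Hypothetical score matrix: 3 groups of 5 samples on 4 predictive components
centers = np.array([
    [0.2, 0.0, 0.0, 0.1],
    [-0.2, 0.1, 0.0, 0.0],
    [0.0, -0.2, 0.2, 0.0],
])
scores = np.vstack([c + rng.normal(scale=0.01, size=(5, 4)) for c in centers])

# Ward aggregation on Euclidean distances between score vectors
Z = linkage(scores, method="ward", metric="euclidean")
labels = fcluster(Z, t=3, criterion="maxclust")   # cut the tree into 3 clusters
print(labels)
```

The linkage matrix `Z` can also be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree.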

[Figure: dendrogram of the samples with heat map of the AMOPLS scores (tp3, tp5, tp7, tp10), colour scale from negative score (-0.25) to positive score (0.25). Leaf order: ATC, OCT, OMC, LIN, OMC, LIN, DMSO, TOR, FOR. Sample labels carry the batch number as a suffix (e.g. Forskolin_1)]

Clusters: ATC, OCT, LIN & OMC, DMSO, TOR, FOR

Annotations: Similar anti-androgenic signatures; Known steroidogenesis inducer; Control; Induction of aldosterone and cortisol; Anti-androgenic and anti-estrogenic; Increased corticosteroid production

Page 14

AMOPLS Effect-specific VIP value

$VIP_j = \sqrt{\dfrac{p \sum_{a=1}^{A} SS_a \left( w_{aj} / \lVert w_a \rVert \right)^2}{\sum_{a=1}^{A} SS_a}}$
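The VIP definition on this slide can be sketched in a few lines, assuming NumPy and the usual square-root form of the VIP. Here `W` holds the weights $w_{aj}$ (one column per component), `ss` the explained sums of squares $SS_a$, and `p` is the number of variables. The function name `vip` and the random inputs are hypothetical; restricting `W` and `ss` to the components of a single effect to obtain an effect-specific value is an assumption about usage.

```python
import numpy as np

def vip(W, ss):
    """W: (p x A) weights, one column per component; ss: (A,) explained sums of squares."""
    W = np.asarray(W, dtype=float)
    ss = np.asarray(ss, dtype=float)
    p = W.shape[0]
    W_unit = W / np.linalg.norm(W, axis=0)        # w_a / ||w_a||
    return np.sqrt(p * ((W_unit ** 2) @ ss) / ss.sum())

rng = np.random.default_rng(0)
W = rng.normal(size=(100, 2))                     # e.g. 100 variables, 2 components
ss = np.array([3.0, 1.0])
v = vip(W, ss)
print(np.isclose((v ** 2).sum(), W.shape[0]))     # squared VIPs sum to p
```

Since the squared VIP values sum to p, their mean is 1, which is why 1 is often used as a reference value when ranking variables.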

Selection of the most relevant steroids

Highlight altered enzymes for mechanistic interpretation

Focus further analytical developments for absolute quantification

[Figure: Exposure Main Effect VIP]

Page 15

Conclusions

Applicable to any ANOVA-based strategy (ASCA, ANOVA-PCA, ...)

Quantitative evaluation of the experimental factors

Mapping of xenobiotics according to their steroidomic signatures

Removing batch effect using specific components

Focus on the most relevant steroid metabolites and enzymes

Stratified subsampling allowed proper variance decomposition and modeling of the different sources of variation using AMOPLS

Page 16

Acknowledgements

Prof. Serge Rudaz, Dr. Fabienne Jeanneret, Dr. David Tonoli

Prof. Alex Odermatt, Dr. Petra Strajhar