Attention-based Deep Multiple Instance Learning
Maximilian Ilse, Jakub Tomczak, Max Welling
AMLAB, University of Amsterdam
ICML 2018

Motivation

Typical size of benchmark natural images: up to 256x256

Typical size of medical images: ~10,000x10,000

How to process it?

Goal: Find (local) objects (abnormal changes in tissue) in an image.

Data: billions of pixels, 10^1 to 10^2 scans, weak labels (for regions or a scan).

Solution: Use local information in the image and look for Regions of Interest.

Ricci-Vitiani, L., et al. "Identification and expansion of human colon-cancer-initiating cells." Nature 445.7123 (2007): 111.

Supervised Learning vs. Multiple Instance Learning

Supervised learning: one image, one label.

Multiple instance learning: many images, one label.

Individual labels are unknown.

Assumptions about the label Y:

Instances with y_k = 1 are key instances.
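
The formula for Y on this slide did not survive extraction. A minimal sketch, assuming the standard MIL assumption (a bag is positive if and only if at least one of its instances is positive, i.e. Y = max_k y_k); `bag_label` is a hypothetical helper name used only for illustration:

```python
def bag_label(instance_labels):
    """Return the bag label Y under the standard MIL assumption.

    Assumption (not recoverable from the slide text itself): Y = 1 if at
    least one instance label y_k equals 1, otherwise Y = 0.
    """
    return int(any(y == 1 for y in instance_labels))


print(bag_label([0, 0, 1, 0]))  # one key instance is enough for a positive bag
print(bag_label([0, 0, 0]))     # no key instance, so the bag is negative
```

In practice the y_k are hidden at training time; only Y is observed, which is why the model has to work on the whole bag.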

Multiple Instance Learning

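A MIL classifier as a probabilistic model: the bag label Y follows a Bernoulli distribution with parameter θ(X) ∈ [0, 1], the probability that Y = 1 given the bag X.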

Must be permutation-invariant!

How?

Theorem (Zaheer et al., 2017)

A scoring function for a set of instances X, S(X) ∈ ℝ, is a symmetric function (i.e., permutation invariant to the elements in X) if and only if it can be decomposed in the following form:

S(X) = g(∑_{x∊X} f(x))

where f and g are suitable transformations.
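
To make the "if" direction of the theorem concrete, here is a small sketch that builds a score of the form g(∑ f(x)) and checks that reordering the bag does not change it. The particular `f` and `g` below are arbitrary hand-written functions chosen for illustration, not anything from the talk:

```python
import itertools
import math


def f(x):
    """Hypothetical instance transformation: map a scalar to a 2-d feature."""
    return (x, x * x)


def g(pooled):
    """Hypothetical bag-level transformation: squash the pooled feature to (0, 1)."""
    s, sq = pooled
    return 1.0 / (1.0 + math.exp(-(0.5 * s - 0.1 * sq)))


def score(bag):
    """S(X) = g(sum over x in X of f(x)) for a non-empty bag of scalars."""
    feats = [f(x) for x in bag]
    pooled = tuple(sum(col) for col in zip(*feats))
    return g(pooled)


bag = [0.5, -1.0, 2.0]
scores = [score(list(p)) for p in itertools.permutations(bag)]
# Every ordering of the bag gives the same score (up to floating-point rounding).
print(max(scores) - min(scores) < 1e-12)
```

The invariance comes entirely from the sum; `f` and `g` can be anything, which is what allows them to be replaced by neural networks.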

Theorem (Qi et al., 2017)

For any ε > 0, a Hausdorff continuous symmetric function S(X) ∈ ℝ can be approximated arbitrarily closely by a function of the form g(max_{x∊X} f(x)), where max is the element-wise vector maximum operator and f and g are continuous functions, that is:

|S(X) - g(max_{x∊X} f(x))| < ε.

The theorems say that we can model a permutation-invariant θ(X) by composing:

● a transformation f of individual instances,

● a permutation-invariant function σ, e.g., sum, mean or max (MIL pooling),

● a transformation of combined instances using a function g:

θ(X) = g(σ(f(x_1), …, f(x_K)))
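
The three-part composition can be sketched with the pooling function σ passed in as an argument. This is an illustrative toy under stated assumptions: `f`, `g`, `mean_pool`, `max_pool` and `theta` are hypothetical names, and `f` and `g` are fixed hand-written functions standing in for the transformations:

```python
import math


def f(x):
    """Hypothetical instance embedding: two hand-picked features of a scalar."""
    return [x, abs(x)]


def mean_pool(embeddings):
    """Element-wise mean over the instances in a bag."""
    k = len(embeddings)
    return [sum(col) / k for col in zip(*embeddings)]


def max_pool(embeddings):
    """Element-wise maximum over the instances in a bag."""
    return [max(col) for col in zip(*embeddings)]


def g(z):
    """Hypothetical bag classifier: logistic function of the summed features."""
    return 1.0 / (1.0 + math.exp(-sum(z)))


def theta(bag, pool):
    """theta(X) = g(pool(f(x_1), ..., f(x_K)))."""
    return g(pool([f(x) for x in bag]))


bag = [0.2, -1.5, 3.0]
shuffled = [3.0, 0.2, -1.5]
for pool in (mean_pool, max_pool):
    # Reordering the bag leaves theta(X) unchanged (up to floating-point rounding).
    print(pool.__name__, theta(bag, pool), theta(shuffled, pool))
```

Swapping `pool` changes what the bag representation emphasizes (the average instance versus the most extreme one) without touching `f` or `g`.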

Multiple Instance Learning: Components

We model both transformations f and g using neural networks.

Two approaches:

- embedded-based

- instance-based

MIL pooling:

- mean,

- max,

- other (e.g., Noisy-Or).
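
As a rough illustration of the pooling options listed above, the sketch below applies mean, max and Noisy-Or to instance-level probabilities, as one would in an instance-based setup. Noisy-Or is written in its usual form 1 - ∏(1 - p_k); the function names and the example probabilities are made up for this example:

```python
def mean_pooling(probs):
    """Average of the instance probabilities."""
    return sum(probs) / len(probs)


def max_pooling(probs):
    """Probability of the most suspicious instance."""
    return max(probs)


def noisy_or_pooling(probs):
    """Noisy-Or: the bag is negative only if every instance is negative."""
    all_negative = 1.0
    for p in probs:
        all_negative *= 1.0 - p
    return 1.0 - all_negative


# Hypothetical instance-level probabilities for one bag.
probs = [0.1, 0.2, 0.9]
for pooling in (mean_pooling, max_pooling, noisy_or_pooling):
    print(pooling.__name__, pooling(probs))
```

None of these three operators has parameters of its own, so nothing about the pooling step itself can be adapted to the data.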

Issues:

- Embedded-based approach lacks interpretability.

- Instance-based approach propagates error.

- max and mean are non-learnable.

Multiple Instance Learning: Attention-based approach

We propose to use the attention mechanism as MIL pooling:

where:

attention with gating mechanism
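
The equations on this slide were lost in extraction. The block below is a reconstruction based on the accompanying ICML 2018 paper rather than a copy of the slide, so the notation should be treated as an assumption: h_k is the embedding of instance k (a row vector of size M), K is the number of instances in the bag, and w (L×1), V (L×M) and U (L×M) are learnable parameters.

```latex
% Attention-based MIL pooling: a weighted average of instance embeddings
\mathbf{z} = \sum_{k=1}^{K} a_k \mathbf{h}_k,
\qquad
a_k = \frac{\exp\{\mathbf{w}^\top \tanh(\mathbf{V}\mathbf{h}_k^\top)\}}
           {\sum_{j=1}^{K} \exp\{\mathbf{w}^\top \tanh(\mathbf{V}\mathbf{h}_j^\top)\}}

% Attention with gating mechanism
a_k = \frac{\exp\{\mathbf{w}^\top (\tanh(\mathbf{V}\mathbf{h}_k^\top) \odot \operatorname{sigm}(\mathbf{U}\mathbf{h}_k^\top))\}}
           {\sum_{j=1}^{K} \exp\{\mathbf{w}^\top (\tanh(\mathbf{V}\mathbf{h}_j^\top) \odot \operatorname{sigm}(\mathbf{U}\mathbf{h}_j^\top))\}}
```

Here ⊙ is element-wise multiplication and sigm is the logistic sigmoid. The weights a_k are non-negative and sum to 1, so z is a convex combination of the instance embeddings.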

The attention mechanism as MIL pooling:

- MIL operator is trainable;

- attention weights could be interpreted (key instances).

With attention, the embedded-based approach is interpretable and fully trainable.
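
A dependency-free sketch of attention pooling may help show where the interpretability comes from. It assumes the form used in the accompanying paper (a softmax over instances of w^T tanh(V h_k), optionally gated by a sigmoid branch with matrix U). It is not the authors' implementation; `attention_pool` and all parameter values are invented for illustration, whereas in a real model V, w and U would be learned:

```python
import math


def attention_pool(H, V, w, U=None):
    """Attention-based MIL pooling over a bag of embeddings.

    H: list of K embeddings, each a list of M floats.
    V: L x M matrix, w: length-L vector (both would be learned in practice).
    U: optional L x M matrix; if given, the gated variant is used.
    Returns (z, a): the pooled embedding and the attention weights.
    """
    def matvec(A, h):
        return [sum(a_ij * h_j for a_ij, h_j in zip(row, h)) for row in A]

    def sigmoid(t):
        return 1.0 / (1.0 + math.exp(-t))

    logits = []
    for h in H:
        hidden = [math.tanh(t) for t in matvec(V, h)]
        if U is not None:
            gate = [sigmoid(t) for t in matvec(U, h)]
            hidden = [t * s for t, s in zip(hidden, gate)]
        logits.append(sum(w_i * t for w_i, t in zip(w, hidden)))

    # Softmax over instances (shifted by the max for numerical stability).
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    a = [e / total for e in exps]

    # Weighted average of the embeddings, one output entry per feature.
    z = [sum(a_k * h[d] for a_k, h in zip(a, H)) for d in range(len(H[0]))]
    return z, a


# Toy bag with K = 3 instances and M = 2 features; parameters are arbitrary.
H = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
V = [[1.0, 0.5], [-0.5, 1.0]]
w = [2.0, 1.0]
z, a = attention_pool(H, V, w)
print("attention weights:", a)
print("pooled embedding:", z)
```

The returned weights `a` are what one would inspect to find candidate key instances: each is tied to one instance, they sum to 1, and permuting the bag permutes the weights while leaving `z` unchanged.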

Experiments: MNIST-based problem

[Figure: example bags with labels Y = 1 and Y = 0]

Experiments: Breast Cancer

Experiments: Colon Cancer

Conclusion

Deep MIL: a flexible approach to cope with large images.

Attention mechanism: interpretable and learnable MIL pooling.

Next step: application to whole-slide classification.

Next step: taking into account spatial dependencies (non i.i.d. instances).

Code on GitHub: https://github.com/AMLab-Amsterdam/AttentionDeepMIL

Contact:[email protected]@gmail.com

The research conducted by Jakub M. Tomczak was funded by the European Commission within the Marie Skłodowska-Curie Individual Fellowship (Grant No. 702666, “Deep learning and Bayesian inference for medical imaging”).

The research conducted by Maximilian Ilse was funded by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Grant DLMedIa: Deep Learning for Medical Image Analysis).