Mathieu Hatt, PhD, HDR – CR INSERM

Laboratory of Medical Information Processing

LaTIM, UMR INSERM-UBO 1101, Brest

Varenna, June 29th 2019

Machine (deep) learning for radiomics

Introduction. Radiomics: is this really new?

The term radiomics has become popular since 2012 (publication by P. Lambin)

Numerous publications before 2012 that were denoted « quantification studies » could be considered « radiomics studies »

Even complex metrics such as textural features have existed since the 1970s and have been used in medical imaging since the 1990s in MRI and CT and since 2009 in PET [1-3]

1. Schad, et al. MR tissue characterization of intracranial tumors by means of texture analysis. Magn Reson Imaging 1993

2. Mir, et al. Texture analysis of CT-images for early detection of liver malignancy. Biomed Sci Instrum. 1995

3. El Naqa, et al. Exploring feature-based approaches in PET images for predicting cancer treatment outcomes. Pattern Recognit. 2009

Some new aspects of radiomics:

High-throughput: hundreds (thousands?) of features

Link / combination with other –omics (biology) leads to even more variables to handle!

Hence the need for robust machine learning methods

Radiomics: Workflow complexity and calculation choices

Large number of features, each with different:

Image pre-processing (filtering/denoising...)

ROI determination (segmentation, 2D/3D…)

Image interpolation (nearest-neighbors, b-splines…)

Specifically for textures:

Grey-levels discretization

Method (relative, absolute, equalization…) and parameters

Texture matrices design

Number of directions, distances, normalisation, 2D/3D…

Potentially: thousands of variables to handle
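To make the grey-level discretization choice concrete, here is a minimal sketch of two common schemes. It assumes that "relative" means a fixed number of bins rescaled between the ROI minimum and maximum, and that "absolute" means a fixed bin width starting from a chosen lower bound. The function names, the default of 64 bins and the 0.5 bin width are illustrative choices, not values from the slides.

```python
import numpy as np

def discretize_fixed_bin_number(roi, n_bins=64):
    """Relative scheme: rescale between the ROI min and max into n_bins levels (1..n_bins)."""
    roi = np.asarray(roi, dtype=float)
    lo, hi = roi.min(), roi.max()
    if hi == lo:
        # Flat ROI: every voxel falls into the first level.
        return np.ones(roi.shape, dtype=int)
    levels = np.floor(n_bins * (roi - lo) / (hi - lo)).astype(int) + 1
    # The maximum intensity would land in level n_bins + 1, so clip it back.
    return np.minimum(levels, n_bins)

def discretize_fixed_bin_width(roi, bin_width=0.5, lower_bound=0.0):
    """Absolute scheme: constant bin width counted from a fixed lower bound."""
    roi = np.asarray(roi, dtype=float)
    return np.floor((roi - lower_bound) / bin_width).astype(int) + 1

# Hypothetical intensities (e.g. SUV values) inside a region of interest.
intensities = np.array([0.0, 1.0, 2.5, 5.0, 10.0])
relative_levels = discretize_fixed_bin_number(intensities, n_bins=4)
absolute_levels = discretize_fixed_bin_width(intensities, bin_width=0.5)
```

The same voxels end up with different grey levels under the two schemes, so every texture matrix built on top of them differs as well. This is one reason why each feature multiplies into many variants.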

Image Biomarker Standardisation Initiative. Multicentre initiative for standardization of image biomarkers. https://arxiv.org/abs/1612.07003

Radiomics: Statistical validation

Inappropriate statistical analysis

Chalkidou, et al. False Discovery Rates in PET and CT Studies with Texture Features: A Systematic Review. PLoS One. 2015
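The false discovery issue can be illustrated with a small simulation. The sketch below assumes 1000 features with no true association, so their p-values are uniform, and compares a naive p < 0.05 threshold with the Benjamini-Hochberg procedure. Benjamini-Hochberg is used here as one common correction, not necessarily the one recommended in the cited review. The feature count and seed are arbitrary.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at false discovery rate alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The k-th smallest p-value is compared with alpha * k / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank that meets its threshold
        rejected[order[:k + 1]] = True
    return rejected

# Simulated p-values for 1000 features unrelated to the outcome.
rng = np.random.default_rng(0)
null_p = rng.uniform(size=1000)
n_naive = int((null_p < 0.05).sum())            # "significant" features without correction
n_corrected = int(benjamini_hochberg(null_p).sum())
print(n_naive, n_corrected)
```

Without correction, about 5% of purely random features are expected to pass the threshold, which is the kind of spurious finding the review warns about.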

Radiomics: Feature selection / elimination

How to select/reduce?

Eliminate features beforehand based on robustness (test-retest, segmentation, etc.) and redundancy

Dimensionality reduction techniques (PCA, SVD...)

Through validation of different models in external data

Models built by relying on less robust features will have poor performance whereas those combining the most robust ones will be more generalizable and achieve higher performance

Feature selection techniques in the machine learning step for building models
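As an illustration of eliminating redundant features beforehand, here is a minimal sketch of a greedy correlation filter. It assumes Pearson correlation and a 0.9 threshold, both arbitrary choices, and that no feature is constant. The feature matrix is synthetic.

```python
import numpy as np

def drop_redundant_features(X, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold with every feature already kept."""
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature correlation matrix
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic example: feature 1 is almost a rescaled copy of feature 0.
rng = np.random.default_rng(0)
volume = rng.normal(size=100)
near_copy = 2.0 * volume + 0.01 * rng.normal(size=100)
independent = rng.normal(size=100)
X = np.column_stack([volume, near_copy, independent])
kept = drop_redundant_features(X)
print(kept)
```

The result depends on column order, since earlier features win. In practice the features could be ranked by robustness first, so that the most reproducible one of each correlated group is retained.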

Kumar, et al. Feature selection. SmartCR. 2014
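For the dimensionality reduction route (PCA, SVD) listed among the options, a minimal sketch of PCA computed through the SVD follows. It assumes features are z-scored first because radiomic features live on very different scales, and that no feature is constant. The data are random and only serve to show the shapes.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project z-scored features onto the first principal components using the SVD."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each feature
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T            # patient coordinates in component space
    explained = (S ** 2) / np.sum(S ** 2)       # fraction of variance per component
    return scores, explained[:n_components]

# Hypothetical matrix: 50 patients, 10 radiomic features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
scores, explained = pca_reduce(X, n_components=3)
```

The components are uncorrelated by construction, which removes redundancy, but each one mixes all original features and is therefore harder to interpret than a selected subset.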

Radiomics: How to use machine learning?

Machine learning: guidelines/tips

Arrange dataset before!

Check data, remove outliers, rely on expert knowledge

Datasets as large as possible (rule of thumb: at least 10 instances per variable)

Split into training, validation, testing

Training to build model, validation to tune parameters, testing to evaluate

E.g. 50%, 30% and 20%, random or stratified sampling, ensure consistency

Choose an appropriate algorithm category

Supervised vs. unsupervised

Classification or regression

Start with the simplest algorithms first

Investigate complex ones only if needed/justified

Take care of the imbalanced data problem

Class weighting, Bayes rule, sampling (SMOTE)

Rely on appropriate metrics (Matthews correlation)

Chicco, et al. Ten quick tips for machine learning in computational biology. BioData Mining. 2017
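The 50% / 30% / 20% split with stratified sampling can be sketched as follows. This is a minimal NumPy version written for illustration. The function name, the seed and the 80/20 class ratio are assumptions.

```python
import numpy as np

def stratified_split(y, fractions=(0.5, 0.3, 0.2), seed=0):
    """Split indices into parts that each preserve the class proportions of y."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    parts = [[] for _ in fractions]
    for label in np.unique(y):
        idx = rng.permutation(np.nonzero(y == label)[0])
        # Cut points inside this class, e.g. at 50% and 80% of its members.
        cuts = np.round(np.cumsum(fractions)[:-1] * idx.size).astype(int)
        for part, chunk in zip(parts, np.split(idx, cuts)):
            part.extend(chunk.tolist())
    return [np.array(sorted(p)) for p in parts]

# Hypothetical imbalanced outcome: 80 non-events, 20 events.
y = np.array([0] * 80 + [1] * 20)
train_idx, val_idx, test_idx = stratified_split(y)
print(len(train_idx), len(val_idx), len(test_idx))
# prints: 50 30 20
```

With a purely random split, a small test set could by chance contain very few events. Stratifying keeps the event rate the same (here 20%) in all three parts.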

Machine learning: guidelines/tips

Optimize hyper-parameters

Use validation dataset to choose the best parameters

Minimize overfitting

Use cross-validation and regularization

Evaluate performance using appropriate metrics

Matthews correlation or precision-recall curve

Accuracy or F1 score can be misleading on imbalanced data (example below)

Use/develop open source code/software

Get help/feedback from computer science online community

Chicco, et al. Ten quick tips for machine learning in computational biology. BioData Mining. 2017
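As an illustration of combining cross-validation with regularization to tune a hyper-parameter, here is a minimal sketch that selects the penalty of a ridge regression by 5-fold cross-validation. Ridge regression, the candidate penalties and the synthetic data are assumptions chosen to keep the example short. The data are treated as centered, so no intercept is fitted.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression weights for penalty alpha."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def cross_validated_mse(X, y, alpha, n_folds=5, seed=0):
    """Mean squared error averaged over held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for i, held_out in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w = ridge_fit(X[train], y[train], alpha)
        errors.append(np.mean((X[held_out] @ w - y[held_out]) ** 2))
    return float(np.mean(errors))

def select_alpha(X, y, alphas=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Return the penalty with the lowest cross-validated error, plus all scores."""
    scores = {a: cross_validated_mse(X, y, a) for a in alphas}
    return min(scores, key=scores.get), scores

# Synthetic data: 60 patients, 30 features, only the first one is informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 30))
y = X[:, 0] + 0.5 * rng.normal(size=60)
best_alpha, cv_scores = select_alpha(X, y)
```

The penalty is chosen without ever touching the test set, which stays reserved for the final evaluation.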

            Predicted 0    Predicted 1
True 0           1              5
True 1           4             90

Accuracy = 91%

F1 score = 95.24%

Matthews = 0.14
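The three values can be recomputed from the confusion matrix with the standard definitions. The sketch below reads the table as TN = 1, FP = 5, FN = 4, TP = 90. The function name is illustrative.

```python
import math

def binary_metrics(tn, fp, fn, tp):
    """Accuracy, F1 score and Matthews correlation coefficient from a 2x2 confusion matrix."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f1, mcc

accuracy, f1, mcc = binary_metrics(tn=1, fp=5, fn=4, tp=90)
print(f"accuracy={accuracy:.2%}  F1={f1:.2%}  MCC={mcc:.2f}")
# prints: accuracy=91.00%  F1=95.24%  MCC=0.14
```

Accuracy and F1 look excellent because class 1 dominates, yet only 1 of the 6 class-0 cases is recognized. The Matthews coefficient uses all four cells and stays close to 0.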

Machine learning: choosing algorithms

Parmar, et al. Machine Learning methods for Quantitative Radiomic Biomarkers. Sci Rep. 2015

Leger, et al. A comparative study of machine learning methods for time-to-event survival data for radiomics risk modelling. Sci Rep 2017

Radiomics: How to use machine learning?

How to select a machine learning algorithm?

Deist, et al. Machine learning algorithms for outcome prediction in (chemo)radiotherapy: an empirical comparison of classifiers. Med Phys 2018

12 datasets, 3496 patients (NSCLC, H&N, meningioma)

Random forest was best in 6/12 datasets, elastic net logistic regression in 4/12

But no single best classifier across all datasets


How to select a machine learning algorithm? Solution: fusion/ensemble

Improving final output with ensemble? (e.g. majority voting)

Example in NSCLC: 145 stage 2-3 patients with PET + CT radiomics

Split: 97 training / 48 testing

[Figure: an initial set of variables (PET and CT radiomic features + clinical variables) feeds three pipelines with « embedded » feature selection: random forest, support vector machine, logistic regression / LASSO. Values shown for the individual pipelines: 95 / 67%, 90 / 68%, 74 / 65%. Majority voting: 100 / 75%.]

Sepehri, et al. Consensus of machine learning pipelines for outcome prediction relying on clinical and radiomics features from 18F-FDG PET/CT images in non-small cell lung cancer. EANM 2019
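Majority voting over the three pipelines can be sketched as below. This is a minimal illustration for binary labels, not the implementation used in the cited study. The prediction vectors are invented.

```python
import numpy as np

def majority_vote(predictions):
    """Combine binary predictions of shape (n_classifiers, n_patients) by strict majority."""
    predictions = np.asarray(predictions)
    votes_for_1 = predictions.sum(axis=0)
    # A patient is labelled 1 only if more than half of the classifiers say so.
    return (2 * votes_for_1 > predictions.shape[0]).astype(int)

# Hypothetical outputs of three pipelines for five patients.
random_forest = [1, 0, 1, 1, 0]
svm = [1, 1, 0, 1, 0]
logistic = [0, 0, 1, 1, 1]
consensus = majority_vote([random_forest, svm, logistic])
print(consensus)
# prints: [1 0 1 1 0]
```

With an odd number of classifiers there are no ties. With an even number, this version resolves ties towards class 0, which would need an explicit decision in a real pipeline.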

Radiomics: Potential of deep learning?

Deep learning: end-to-end solution

(Usual radiomics)

Courtesy of I. El Naqa

Deep learning

Convolutional Neural Networks (CNN)

Limitations (a priori)

Need (very) large datasets for efficient training

Black boxes that do not generate knowledge

Deep learning limitations?

Need for large datasets

Data augmentation

Transfer learning / fine-tuning

Black boxes / knowledge generation

Networks visualization

Back propagation to exploit networks
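As an example of data augmentation for small datasets, here is a minimal sketch that generates the eight flip and rotation variants of a square 2D patch. Only these simple geometric transforms are shown. Whether they are anatomically acceptable depends on the application, and the patch here is a toy array.

```python
import numpy as np

def augment_patch(patch):
    """Return the 8 variants of a square 2D patch: 4 rotations, each with and without a left-right flip."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k)        # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # mirrored version of the same rotation
    return variants

# Toy 3x3 "image patch" with distinct values so every variant is different.
patch = np.arange(9).reshape(3, 3)
augmented = augment_patch(patch)
print(len(augmented))
# prints: 8
```

Each variant keeps the same voxel values and the same label, so one annotated patch yields eight training samples.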


Quellec, et al. Deep image mining for diabetic retinopathy screening. Med Image Anal. 2017

Deep learning / CNN + radiomics


Samek, et al. Evaluating the Visualization of What a Deep Neural Network Has Learned. IEEE Trans Neural Networks and Learning Systems. 2017
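One simple way to visualize what a network relies on is to backpropagate the output down to the input pixels and display the gradient magnitude (a sensitivity map). The sketch below does this by hand for a tiny one-hidden-layer network with random weights. It is a toy stand-in for a CNN, not a method taken from the cited papers. All sizes and names are illustrative.

```python
import numpy as np

def forward(x, W1, b1, w2, b2):
    """Tiny network: tanh hidden layer followed by a sigmoid output."""
    h = np.tanh(W1 @ x + b1)
    logit = w2 @ h + b2
    return 1.0 / (1.0 + np.exp(-logit)), h

def input_gradient(x, W1, b1, w2, b2):
    """Backpropagate the output probability to the input vector."""
    p, h = forward(x, W1, b1, w2, b2)
    d_logit = p * (1.0 - p)        # derivative of the sigmoid
    d_h = d_logit * w2             # back through the output layer
    d_pre = d_h * (1.0 - h ** 2)   # back through tanh
    return W1.T @ d_pre            # back to the input pixels

# Random weights and a random 4x4 "image" flattened to 16 values.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 16)), rng.normal(size=8)
w2, b2 = rng.normal(size=8), 0.1
x = rng.normal(size=16)
grad = input_gradient(x, W1, b1, w2, b2)
saliency = np.abs(grad).reshape(4, 4)   # high values: pixels the output is most sensitive to
```

In a real CNN the same quantity is obtained from the framework's automatic differentiation. The map indicates sensitivity, which is not always the same as relevance.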

Deep learning limitations?

[Figure from Quellec, et al. Deep image mining for diabetic retinopathy screening. Med Image Anal. 2017, labelled « backpropagation »]
Ypsilantis, et al. Predicting Response to Neoadjuvant Chemotherapy with PET Imaging Using Convolutional Neural Networks. PLoS One. 2015

96 patients for training, 11 for testing

Triplets (3S-CNN) or single slice (1S-CNN)

Data augmentation → 5316 triplets for both responders and nonresponders
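The slide does not define how triplets are built. Assuming a triplet is a set of three spatially adjacent slices, the sketch below shows how a stack of slices could be turned into 3-slice inputs. The array layout (slices, height, width) and the function name are assumptions.

```python
import numpy as np

def adjacent_slice_triplets(volume):
    """Return every run of 3 consecutive slices from a (slices, height, width) volume."""
    volume = np.asarray(volume)
    n_slices = volume.shape[0]
    if n_slices < 3:
        # Not enough slices to form a single triplet.
        return np.empty((0, 3) + volume.shape[1:], dtype=volume.dtype)
    return np.stack([volume[i:i + 3] for i in range(n_slices - 2)])

# Toy tumour volume: 5 slices of 4x4 voxels.
volume = np.arange(5 * 4 * 4, dtype=float).reshape(5, 4, 4)
triplets = adjacent_slice_triplets(volume)
print(triplets.shape)
# prints: (3, 3, 4, 4)
```

Each triplet can be fed to a network as a 3-channel image, which gives the model some through-plane context while still multiplying the number of training samples per patient.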

Antropova, et al. A deep feature fusion methodology for breast cancer diagnosis demonstrated on three imaging modality datasets. Med Phys 2017

Standard radiomics

Full field digital mammography (FFDM): N=245

Ultrasound (US): N=1125

DCE-MRI: N=690

Diamant, et al. Deep learning in head & neck cancer outcome prediction. Sci Rep. 2019

Head and neck cancer patients with CT images: 194 training (2 institutions) and 106 validation (2 institutions)

Hosny, et al. Deep learning for lung cancer prognostication: A retrospective multi-cohort radiomics study. PLoS Med. 2018
« Endpoint-guided » segmentation

Potential for radiotherapy planning optimisation


backpropagation

Radiomics: Conclusions

Machine learning & radiomics

Very dynamic field of research

Numerous pitfalls

Redundancy, statistical validation

Methodological choices

Potential solutions:

Larger, prospective, multicentric datasets

Rely on machine learning experts

Combination with (replacement by?) deep learning

Thank you for your attention
