© 2014 Microsoft Corporation. All rights reserved.
Anusua Trivedi, Data Scientist, Algorithm Data Science (ADS)
Transfer Learning and Fine-tuning Deep Neural Networks
1. Traditional Machine Learning (ML)
2. ML vs. Deep Learning
3. Why Deep Learning for Image Analysis
4. Deep Convolutional Neural Network (DCNN)
5. Transfer Learning DCNN
6. Fine-tuning DCNN
7. Recurrent Neural Network (RNN)
8. Case Studies
Talk Outline
Vision Analytics
Recommendation engines
Advertising analysis
Weather forecasting for business planning
Social network analysis
Legal discovery and document archiving
Pricing analysis
Fraud detection
Churn analysis
Equipment monitoring
Location-based tracking and services
Personalized Insurance
Machine learning & predictive analytics are core capabilities that support business decisions
What is ML?
Traditional ML vs. Deep Learning
Deep learning can automatically learn features in data
Deep learning is largely a "black box" technique, updating learned weights at each layer
Traditional ML requires manual feature extraction/engineering
Feature extraction for unstructured data is very difficult
1. Image data requires subject-matter expertise to extract key features
2. Deep learning extracts features automatically from domain-specific images, without any manual feature engineering
3. This step makes the image analysis process much easier
Why use Deep Learning for Image Analysis?
Early Work
1. Fukushima (1980) – Neocognitron
2. LeCun (1989) – Convolutional Neural Networks (CNN)
3. With the advent of GPUs, DCNN popularity grew
4. Most popular – AlexNet (trained on ImageNet images)
1. Train networks with many layers
2. Multiple layers work to build an improved feature space
3. First layer learns 1st order features (e.g. edges)
4. 2nd layer learns higher order features
5. Lastly, final layer features are fed into supervised layer(s)
Deep Neural Network (DNN)
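The layered picture above can be sketched as a forward pass in plain NumPy. This is a minimal illustration with random weights and made-up layer sizes (784 → 128 → 64 → 10), not a trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Each hidden layer builds on the previous one's feature space;
# the final layer feeds into a supervised (softmax) output.
W1, b1 = rng.normal(scale=0.01, size=(128, 784)), np.zeros(128)
W2, b2 = rng.normal(scale=0.01, size=(64, 128)), np.zeros(64)
W3, b3 = rng.normal(scale=0.01, size=(10, 64)), np.zeros(10)

x = rng.normal(size=784)        # a flattened input image
h1 = relu(W1 @ x + b1)          # 1st-order features (e.g. edges)
h2 = relu(W2 @ h1 + b2)         # higher-order features
probs = softmax(W3 @ h2 + b3)   # supervised output layer
```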
Deep Convolutional Neural Network (DCNN)
C layers are convolutions, S layers pool/sample
Essential components of DCNN
Convolution
• Conv layers consist of a rectangular grid of neurons.
• The weights are the same for each neuron in the conv layer.
• The conv layer weights specify the convolution filter.
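The sliding, weight-sharing filter can be sketched in a few lines of NumPy. This is a toy "valid" convolution with stride 1; the 5×5 image and the vertical-edge filter are illustrative only:

```python
import numpy as np

def conv2d(image, kernel):
    # Slide one shared filter over the image; every output neuron
    # uses the same weights (the convolution filter).
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_filter = np.array([[-1., 0., 1.],
                        [-1., 0., 1.],
                        [-1., 0., 1.]])   # simple vertical-edge detector
feature_map = conv2d(image, edge_filter)  # 3x3 map of filter responses
```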
Pooling
The pooling layer takes small rectangular blocks from the convolutional layer and subsamples each block to produce a single output.
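A minimal NumPy sketch of 2×2 max pooling with stride 2 (max is one common subsampling choice; the feature-map values here are made up):

```python
import numpy as np

def max_pool(fmap, size=2):
    # Reduce each non-overlapping size x size block to its maximum.
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            block = fmap[i*size:(i+1)*size, j*size:(j+1)*size]
            out[i, j] = block.max()
    return out

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 5.],
                 [6., 0., 7., 2.],
                 [1., 2., 3., 4.]])
pooled = max_pool(fmap)   # each 2x2 block -> one output value
```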
DCNN Sample - LeNet
Transfer Learning & Fine-tuning DCNN
1. Non-symbolic frameworks
• The main drawback of imperative frameworks (like Torch, Caffe, etc.) is manual optimization.
• Most imperative frameworks are not easily modified.
2. Symbolic frameworks
• Symbolic frameworks (like Theano, TensorFlow, CNTK, MXNet, etc.) can infer optimizations automatically from the dependency graph.
• A symbolic framework can exploit much more memory reuse.
Deep Learning Frameworks
1. Easy to implement new networks
2. Easy to modify existing networks using Lasagne/Keras
3. Very mature python interface
4. Easy to customize with domain-specific data.
5. Transfer learning and fine-tuning in Lasagne/Keras is very easy
Theano
1. Here we use labeled fluorescein angiography images of eyes to improve Diabetic Retinopathy (DR) prediction.
2. We use a DCNN to improve DR prediction.
Case Study: Diabetic Retinopathy Prediction
GoogleNet
ImageNet
1. We use an ImageNet pre-trained DCNN
2. We fine-tune that DCNN to transfer generic learned features to DR prediction.
3. Lower layers of the pre-trained DCNN contain generic features that can be used for the DR prediction task.
Transfer Learning & Fine-tuning our DCNN model
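Conceptually, the frozen-lower-layers idea looks like this in NumPy. This is a toy stand-in, not the actual GoogleNet pipeline: one random "pre-trained" layer acts as a fixed generic feature extractor, and only a new logistic-regression head is trained on the target-task labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pre-trained" layer: a random stand-in for the generic
# lower-layer features learned on ImageNet. Never updated below.
W_pretrained = rng.normal(scale=0.1, size=(64, 256))

def extract_features(x):
    # Fixed feature extractor: frozen weights + ReLU.
    return np.maximum(0.0, x @ W_pretrained.T)

# Toy target-domain data: 100 "images" with binary labels.
X = rng.normal(size=(100, 256))
y = (X[:, 0] > 0).astype(float)
F = extract_features(X)

# New task-specific head (logistic regression) trained by gradient
# descent; only these parameters change during fine-tuning.
w, b, lr = np.zeros(64), 0.0, 0.005
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    w -= lr * (F.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)
```

In the real case study the frozen layers come from an ImageNet-trained DCNN and the upper layers are retrained (or continued with a small learning rate) on the DR images.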
Diabetic Retinopathy
Image Augmentation
Transfer Learning DCNN
Fine-tuning GoogleNet
Diabetic Retinopathy Prediction
Prediction Results Comparison
Our DCNN improves DR prediction accuracy compared to state-of-the-art Support Vector Machine approaches
Image classification      Mean accuracy
Our fine-tuned DCNN       0.96
Feature-based SVM         0.82
Other Uses of this DCNN Model
Re-usability of this DCNN Model
1. We fine-tune ImageNet-trained DCNN for medical image analysis
2. We can fine-tune the same ImageNet-trained DCNN model in a completely different domain, and for a completely different task.
1. We use the ImageNet-trained DCNN and learn Apparel Classification with Style (ACS) image features through transfer learning and fine-tuning.
2. Then we use a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) on the learned image features for image caption generation.
Case Study: Fashion Image Caption Generation
Image Augmentation
Transfer Learning DCNN
Recurrent Neural Network (RNN-LSTM)
• Recurrent neural networks (RNN) are networks with loops in them, allowing information to persist.
• Long Short-Term Memory (LSTM) networks are a special kind of RNN, capable of learning long-term dependencies.
• Well suited to the step-by-step caption generation task.
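A single LSTM step can be sketched in NumPy. Sizes and weights here are illustrative (random), but the gate structure follows the standard formulation: gates decide what to forget, what to admit, and what to expose at each caption-generation step:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 8, 16   # illustrative sizes

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One weight matrix per gate: input (i), forget (f), output (o),
# and candidate values (g); each sees [input, previous hidden].
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in + n_hid)) for k in "ifog"}
b = {k: np.zeros(n_hid) for k in "ifog"}

def lstm_step(x, h, c):
    z = np.concatenate([x, h])
    i = sigmoid(W["i"] @ z + b["i"])   # how much new input to admit
    f = sigmoid(W["f"] @ z + b["f"])   # how much old state to keep
    o = sigmoid(W["o"] @ z + b["o"])   # how much state to expose
    g = np.tanh(W["g"] @ z + b["g"])   # candidate values
    c_new = f * c + i * g              # long-term cell state persists
    h_new = o * np.tanh(c_new)         # hidden state / step output
    return h_new, c_new

h = c = np.zeros(n_hid)
for t in range(5):                     # unroll over a 5-step sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c)
```

In the caption model, the DCNN image features seed this recurrence and each step emits the next caption word.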
Deep CNN-RNN Model
ACS Images Caption Generation
Caption generation using fine-tuned CNN-RNN model
Microsoft Cognitive APIs and BOTs
THANKS!
Questions?