Transfer Learning 31-Jul-2018
Adrian Spataru, Data Scientist at Know-Center
[email protected]
https://www.fb.me/adrian.spataru.5
Outline
- Quick Neural Network Refresher
- What is Transfer Learning?
- Bottlenecking & Fine-Tuning
- Multitask Learning
- Domain-adversarial Training
- Zero-Shot Learning
- Resources for Pretrained models
Perceptron
Multilayer Perceptron
Convolutional Neural Network
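As a refresher on the perceptron, the classic learning rule fits in a few lines of NumPy. This is an illustrative sketch, not code from the talk; the AND-gate data and hyperparameters are invented:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Train a single perceptron (step activation) with the classic update rule."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            err = target - pred          # update only on mistakes
            w += lr * err * xi
            b += lr * err
    return w, b

# Linearly separable toy data: the AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
preds = [1 if xi @ w + b > 0 else 0 for xi in X]
print(preds)  # -> [0, 0, 0, 1]
```

A multilayer perceptron stacks such units with nonlinearities between them; a CNN replaces the dense layers with convolutions.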
Traditional ML
Transfer Learning
Transfer Learning Landscape
Transfer Learning Tutorial (Hung-yi Lee)
Fine-Tuning
- Scenario: lots of labeled source data, limited labeled target data
- Idea: train a model on the source data, then fine-tune it with the target data.
- Why?
  - Training on the target data alone will likely overfit.
  - Pretrained models may also reduce training time.
Fine-Tuning Example
- Kaggle iMaterialist Challenge
- Labeling of household items (glass, chair, etc.)
- Multiclass classification problem: 128 classes
- Training set: 190k images
Typical Image Classifier
Feature Extractor Label Classifier
Fine-Tuning
Feature Extractor Label Classifier
Output: 128 classes
Results
- Using a ResNet50 pretrained on ImageNet and fine-tuning: 84% accuracy (top 20%)
- Avg ensemble of 11 different pretrained models, fine-tuned: 89% accuracy (first place)
- ResNet50 takes 4 days to train on a GTX 1060
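The talk's demo fine-tunes a pretrained ResNet50 in Keras. As a framework-free illustration of the warm-start idea behind fine-tuning, here is a hedged NumPy sketch: a logistic model trained on plentiful source data is then fine-tuned with a few low-learning-rate steps on a tiny target set. All data and hyperparameters are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(w, X, y, lr=0.5, steps=200):
    """Full-batch gradient descent on the logistic loss."""
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

# Source task: plenty of labels.
Xs = rng.normal(size=(500, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(float)
w_src = train(np.zeros(2), Xs, ys)

# Target task: only 20 labels, and a slightly rotated decision boundary.
Xt = rng.normal(size=(20, 2))
yt = (1.2 * Xt[:, 0] + 0.8 * Xt[:, 1] > 0).astype(float)

# Fine-tune: start from the SOURCE weights and take a few low-lr steps,
# instead of training on the tiny target set from scratch.
w_ft = train(w_src.copy(), Xt, yt, lr=0.1, steps=50)
acc = ((sigmoid(Xt @ w_ft) > 0.5) == yt).mean()
```

The same recipe with a deep network is what the competition entries do: load pretrained weights, replace the output layer, and continue training at a low learning rate.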
Bottlenecking: the poor man's fine-tuning
- Idea: train a model on the source data -> use it to extract features for the target data -> train a new model on the extracted features
[Diagram: input -> frozen feature extractor -> FC layer -> output (128 classes)]
Results
- Initial feature extraction takes 6 hours on a GTX 1060
- Training a model takes 20 min
- Reduces the data from 30 GB to 2-3 GB
- 82% accuracy with an avg ensemble of 5 bottleneck models
- Top 38% place (159/436)
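The bottlenecking workflow can be sketched in NumPy. This is an illustrative stand-in, not the talk's demo: the "frozen backbone" here is just a fixed random projection with a ReLU (in practice it would be e.g. ResNet50 minus its top layer), and all data and labels are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Stand-in for a FROZEN pretrained backbone: a fixed random projection + ReLU.
W_frozen = rng.normal(scale=0.3, size=(64, 16))

def extract_bottleneck(X):
    """Run the data through the frozen extractor ONCE and cache the result."""
    return np.maximum(X @ W_frozen, 0.0)

X = rng.normal(size=(200, 64))
y = (X @ W_frozen[:, 0] > 0).astype(float)  # toy labels the features can express

feats = extract_bottleneck(X)   # in a real pipeline this is written to disk

# Training the small head on cached features is the cheap part (minutes, not days).
w = np.zeros(16)
for _ in range(300):
    w -= 0.2 * feats.T @ (sigmoid(feats @ w) - y) / len(y)
acc = ((sigmoid(feats @ w) > 0.5) == y).mean()
```

Because the expensive extraction runs only once, many small heads (or an ensemble of them) can be trained cheaply on the cached features afterwards.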
CODE DEMO
Multitask Learning
Multitask Learning (2)
- Why consider it?
  - Attention focusing
  - Eavesdropping
  - Regularization
- Many architectures:
  - Cross-stitch Networks
  - Fully-Adaptive Feature Sharing
  - etc.
Multiple Language Translation
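The most common multitask setup, hard parameter sharing, can be sketched as a shared trunk with one head per task, where both task losses backpropagate into the same trunk. A minimal NumPy sketch with invented data and hyperparameters (not from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hard parameter sharing: one shared trunk, one small head per task.
X = rng.normal(size=(64, 8))
y = {"task_a": (X[:, 0] > 0).astype(float),
     "task_b": (X[:, 1] > 0).astype(float)}
W_shared = rng.normal(scale=0.1, size=(8, 4))
heads = {t: rng.normal(scale=0.1, size=4) for t in y}

def total_loss():
    H = np.tanh(X @ W_shared)
    return sum(-np.mean(y[t] * np.log(sigmoid(H @ w) + 1e-9)
                        + (1 - y[t]) * np.log(1 - sigmoid(H @ w) + 1e-9))
               for t, w in heads.items())

loss_before = total_loss()
lr = 0.3
for _ in range(300):
    H = np.tanh(X @ W_shared)                  # shared representation
    grad_shared = np.zeros_like(W_shared)
    for t, w in heads.items():
        err = (sigmoid(H @ w) - y[t]) / len(X)
        heads[t] = w - lr * H.T @ err          # per-task head update
        # BOTH task losses backpropagate into the SAME trunk
        grad_shared += X.T @ (np.outer(err, w) * (1 - H**2))
    W_shared -= lr * grad_shared
loss_after = total_loss()
```

The shared trunk is where eavesdropping and regularization happen: each task's gradient shapes features the other task also uses.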
Domain-Adversarial Training
- Scenario: lots of labeled source data, lots of unlabeled target data
- Goal: train a model which performs well on the unlabeled target data.
Domain-Adversarial Training (2)
- Scenario: lots of labeled source data, lots of unlabeled target data
- Goal: train a model which performs well on the unlabeled target data.
- Goal 2.0: the distributions of features extracted from source and target data should be similar.
[Figure: source vs. target feature distributions]
Domain Adversarial Training
Transfer Learning (Hung-yi Lee)
Domain Adversarial Training Example
No adaptation: 87% accuracy. With adaptation: 91% accuracy.
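The core trick in domain-adversarial training (Ganin et al.) is the gradient reversal layer: a domain classifier is trained normally, but its gradient is negated before reaching the feature extractor, so the extractor learns features that hide the domain. A minimal linear NumPy sketch, with an invented covariate-shift toy problem (not the talk's demo):

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Labeled source domain; unlabeled target domain shifted along x1 (covariate shift).
Xs = rng.normal(size=(200, 2))
ys = (Xs[:, 0] > 0).astype(float)
Xt = rng.normal(size=(200, 2)) + np.array([0.0, 2.0])    # labels never used

F = rng.normal(scale=0.1, size=(2, 2))   # feature extractor
c = rng.normal(scale=0.1, size=2)        # label classifier (source features only)
d = rng.normal(scale=0.1, size=2)        # domain classifier
X_all = np.vstack([Xs, Xt])
dom = np.concatenate([np.zeros(200), np.ones(200)])      # 0 = source, 1 = target
lam, lr = 0.5, 0.2

for _ in range(300):
    Hs, H_all = Xs @ F, X_all @ F
    err_y = (sigmoid(Hs @ c) - ys) / len(Xs)             # label-loss signal
    err_d = (sigmoid(H_all @ d) - dom) / len(X_all)      # domain-loss signal
    # Gradient REVERSAL: the extractor descends the label loss but ASCENDS
    # the domain loss, so its features stop revealing the domain.
    grad_F = Xs.T @ np.outer(err_y, c) - lam * X_all.T @ np.outer(err_d, d)
    c -= lr * Hs.T @ err_y
    d -= lr * H_all.T @ err_d                            # domain head trained normally
    F -= lr * grad_F

yt_true = (Xt[:, 0] > 0).astype(float)                   # held-out check only
acc_t = ((sigmoid(Xt @ F @ c) > 0.5) == yt_true).mean()
```

In a deep network the same effect is implemented as a gradient reversal layer between the feature extractor and the domain classifier (see the tf-dann repository in the resources).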
Zero-Shot Learning
- Scenario: lots of labeled source data, unlabeled target data
- Goal: train a model which performs well on the target data.
- How? Inference through attributes, metadata, etc.
Zero Shot Learning Attributes
Multimodal Embeddings
Zero-Shot Learning with Attributes
- Can classify unseen classes, if a 1:1 mapping can be ensured.
- Ex: Weimaraner dog
  - gray, has tail, is on land, small -> [0, 1, 1, 1]
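Attribute-based zero-shot classification can be sketched in a few lines: a model trained on seen classes predicts attribute scores, and an image is assigned to the class whose attribute signature is nearest, even if that class was never seen in training. The Weimaraner vector follows the slide's example; the other classes and all scores are invented for illustration:

```python
import numpy as np

# Attribute signatures per class: [gray, has tail, is on land, small].
signatures = {
    "zebra":      np.array([0, 1, 1, 0]),
    "whale":      np.array([1, 1, 0, 0]),
    "weimaraner": np.array([0, 1, 1, 1]),   # never seen during training
}

def zero_shot_predict(attr_pred, signatures):
    """Assign the class whose attribute signature is nearest to the prediction."""
    return min(signatures, key=lambda c: np.sum((signatures[c] - attr_pred) ** 2))

# A hypothetical attribute predictor (trained on seen classes only) outputs
# soft attribute scores for an image of an unseen Weimaraner:
attr_pred = np.array([0.1, 0.9, 0.8, 0.7])
print(zero_shot_predict(attr_pred, signatures))   # -> weimaraner
```

Replacing the hand-made signatures with Word2Vec/GloVe embeddings of Wikipedia articles gives the richer descriptions mentioned on the next slide.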
Zero-Shot Learning with Wikipedia Attributes
- Use Wikipedia articles and Word2Vec/GloVe embeddings as object descriptions
Image example of zero shot learning
Google Translate - Zero-Shot Learning
Pretrained Models
https://modeldepot.io/
http://pretrained.ml/
https://keras.io/
https://github.com/Cadene/pretrained-models.pytorch
https://nlp.stanford.edu/projects/glove/
https://github.com/pumpikano/tf-dann
References
Weiss, Karl, Taghi M. Khoshgoftaar, and DingDing Wang. "A survey of transfer learning." Journal of Big Data 3.1 (2016): 9.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
Dong, Daxiang, et al. "Multi-task learning for multiple language translation." Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Vol. 1. 2015.
Ganin, Yaroslav, et al. "Domain-adversarial training of neural networks." The Journal of Machine Learning Research 17.1 (2016): 2096-2030.
Johnson, Melvin, et al. "Google's multilingual neural machine translation system: enabling zero-shot translation." arXiv preprint arXiv:1611.04558 (2016).
Zhang, Ziming, and Venkatesh Saligrama. "Zero-shot learning via semantic similarity embedding." Proceedings of the IEEE international conference on computer vision. 2015.
Thank you for your Attention!