Transfer Learning 31-Jul-2018
Adrian Spataru, Data Scientist at Know-Center
[email protected]
https://www.fb.me/adrian.spataru.5
Outline
- Quick Neural Network Refresher
- What is Transfer Learning?
- Bottlenecking & Fine-Tuning
- Multitask Learning
- Domain-adversarial Training
- Zero-Shot Learning
- Resources for Pretrained models
Perceptron
Multilayer Perceptron
Convolutional Neural Network
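As a refresher on the perceptron, the classic learning rule fits in a few lines of NumPy. This is an illustrative sketch, not code from the talk; the AND-gate data and hyperparameters are invented:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Train a single perceptron (step activation) with the classic update rule."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            err = target - pred          # update only on mistakes
            w += lr * err * xi
            b += lr * err
    return w, b

# Linearly separable toy data: the AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
preds = [1 if xi @ w + b > 0 else 0 for xi in X]
print(preds)  # -> [0, 0, 0, 1]
```

A multilayer perceptron stacks such units with nonlinearities between them; a CNN replaces the dense layers with convolutions.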
Traditional ML
Transfer Learning
Transfer Learning Landscape
Transfer Learning Tutorial (Hung-yi Lee)
Fine-Tuning
- Scenario: lots of labeled source data, limited labeled target data
- Idea: train a model on the source data, then fine-tune it with the target data.
- Why?
  - Training on the target data alone will likely overfit.
  - Pretrained models may also reduce training time.
Fine-Tuning Example
- Kaggle iMaterialist Challenge
- Labeling of household items (glass, chair, etc.)
- Multiclass classification problem: 128 classes
- Training set: 190k images
Typical Image Classifier
Feature Extractor Label Classifier
Fine-Tuning
Feature Extractor Label Classifier
Output: 128 classes
Results
- Using a ResNet50 pretrained on ImageNet and fine-tuning: 84% accuracy (top 20%)
- Avg ensemble of 11 different pretrained models, fine-tuned: 89% accuracy (first place)
- ResNet50 takes 4 days to train on a GTX 1060
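The talk's demo fine-tunes a pretrained ResNet50 in Keras. As a framework-free illustration of the warm-start idea behind fine-tuning, here is a hedged NumPy sketch: a logistic model trained on plentiful source data is then fine-tuned with a few low-learning-rate steps on a tiny target set. All data and hyperparameters are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(w, X, y, lr=0.5, steps=200):
    """Full-batch gradient descent on the logistic loss."""
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

# Source task: plenty of labels.
Xs = rng.normal(size=(500, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(float)
w_src = train(np.zeros(2), Xs, ys)

# Target task: only 20 labels, and a slightly rotated decision boundary.
Xt = rng.normal(size=(20, 2))
yt = (1.2 * Xt[:, 0] + 0.8 * Xt[:, 1] > 0).astype(float)

# Fine-tune: start from the SOURCE weights and take a few low-lr steps,
# instead of training on the tiny target set from scratch.
w_ft = train(w_src.copy(), Xt, yt, lr=0.1, steps=50)
acc = ((sigmoid(Xt @ w_ft) > 0.5) == yt).mean()
```

The same recipe with a deep network is what the competition entries do: load pretrained weights, replace the output layer, and continue training at a low learning rate.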
Bottlenecking: the poor man's fine-tuning
- Idea: train a model on the source data -> use it to extract features for the target data -> train a new model on the extracted features
[Diagram: input -> frozen feature extractor -> FC layer -> output (128 classes)]
Results
- Initial feature extraction takes 6 hours on a GTX 1060
- Training a model takes 20 min
- Reduces the data from 30 GB to 2-3 GB
- 82% accuracy with an avg ensemble of 5 bottleneck models
- Top 38% place (159/436)
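The bottlenecking workflow can be sketched in NumPy. This is an illustrative stand-in, not the talk's demo: the "frozen backbone" here is just a fixed random projection with a ReLU (in practice it would be e.g. ResNet50 minus its top layer), and all data and labels are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Stand-in for a FROZEN pretrained backbone: a fixed random projection + ReLU.
W_frozen = rng.normal(scale=0.3, size=(64, 16))

def extract_bottleneck(X):
    """Run the data through the frozen extractor ONCE and cache the result."""
    return np.maximum(X @ W_frozen, 0.0)

X = rng.normal(size=(200, 64))
y = (X @ W_frozen[:, 0] > 0).astype(float)  # toy labels the features can express

feats = extract_bottleneck(X)   # in a real pipeline this is written to disk

# Training the small head on cached features is the cheap part (minutes, not days).
w = np.zeros(16)
for _ in range(300):
    w -= 0.2 * feats.T @ (sigmoid(feats @ w) - y) / len(y)
acc = ((sigmoid(feats @ w) > 0.5) == y).mean()
```

Because the expensive extraction runs only once, many small heads (or an ensemble of them) can be trained cheaply on the cached features afterwards.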
CODE DEMO
Multitask Learning
Multitask Learning (2)
- Why consider it?
  - Attention focusing
  - Eavesdropping
  - Regularization
- Many architectures:
  - Cross-stitch Networks
  - Fully-Adaptive Feature Sharing
  - etc.
Multiple Language Translation
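The most common multitask setup, hard parameter sharing, can be sketched as a shared trunk with one head per task, where both task losses backpropagate into the same trunk. A minimal NumPy sketch with invented data and hyperparameters (not from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hard parameter sharing: one shared trunk, one small head per task.
X = rng.normal(size=(64, 8))
y = {"task_a": (X[:, 0] > 0).astype(float),
     "task_b": (X[:, 1] > 0).astype(float)}
W_shared = rng.normal(scale=0.1, size=(8, 4))
heads = {t: rng.normal(scale=0.1, size=4) for t in y}

def total_loss():
    H = np.tanh(X @ W_shared)
    return sum(-np.mean(y[t] * np.log(sigmoid(H @ w) + 1e-9)
                        + (1 - y[t]) * np.log(1 - sigmoid(H @ w) + 1e-9))
               for t, w in heads.items())

loss_before = total_loss()
lr = 0.3
for _ in range(300):
    H = np.tanh(X @ W_shared)                  # shared representation
    grad_shared = np.zeros_like(W_shared)
    for t, w in heads.items():
        err = (sigmoid(H @ w) - y[t]) / len(X)
        heads[t] = w - lr * H.T @ err          # per-task head update
        # BOTH task losses backpropagate into the SAME trunk
        grad_shared += X.T @ (np.outer(err, w) * (1 - H**2))
    W_shared -= lr * grad_shared
loss_after = total_loss()
```

The shared trunk is where eavesdropping and regularization happen: each task's gradient shapes features the other task also uses.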
Domain-Adversarial Training
- Scenario: lots of labeled source data, lots of unlabeled target data
- Goal: train a model which performs well on the unlabeled target data.
Domain-Adversarial Training (2)
- Scenario: lots of labeled source data, lots of unlabeled target data
- Goal: train a model which performs well on the unlabeled target data.
- Goal 2.0: the distributions of features extracted from source and target data should be similar.
[Figure: source vs. target feature distributions]
Domain Adversarial Training
Transfer Learning (Hung-yi Lee)
Domain Adversarial Training Example
No adaptation: 87% accuracy. With adaptation: 91% accuracy.
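The core trick in domain-adversarial training (Ganin et al.) is the gradient reversal layer: a domain classifier is trained normally, but its gradient is negated before reaching the feature extractor, so the extractor learns features that hide the domain. A minimal linear NumPy sketch, with an invented covariate-shift toy problem (not the talk's demo):

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Labeled source domain; unlabeled target domain shifted along x1 (covariate shift).
Xs = rng.normal(size=(200, 2))
ys = (Xs[:, 0] > 0).astype(float)
Xt = rng.normal(size=(200, 2)) + np.array([0.0, 2.0])    # labels never used

F = rng.normal(scale=0.1, size=(2, 2))   # feature extractor
c = rng.normal(scale=0.1, size=2)        # label classifier (source features only)
d = rng.normal(scale=0.1, size=2)        # domain classifier
X_all = np.vstack([Xs, Xt])
dom = np.concatenate([np.zeros(200), np.ones(200)])      # 0 = source, 1 = target
lam, lr = 0.5, 0.2

for _ in range(300):
    Hs, H_all = Xs @ F, X_all @ F
    err_y = (sigmoid(Hs @ c) - ys) / len(Xs)             # label-loss signal
    err_d = (sigmoid(H_all @ d) - dom) / len(X_all)      # domain-loss signal
    # Gradient REVERSAL: the extractor descends the label loss but ASCENDS
    # the domain loss, so its features stop revealing the domain.
    grad_F = Xs.T @ np.outer(err_y, c) - lam * X_all.T @ np.outer(err_d, d)
    c -= lr * Hs.T @ err_y
    d -= lr * H_all.T @ err_d                            # domain head trained normally
    F -= lr * grad_F

yt_true = (Xt[:, 0] > 0).astype(float)                   # held-out check only
acc_t = ((sigmoid(Xt @ F @ c) > 0.5) == yt_true).mean()
```

In a deep network the same effect is implemented as a gradient reversal layer between the feature extractor and the domain classifier (see the tf-dann repository in the resources).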
Zero-Shot Learning
- Scenario: lots of labeled source data, unlabeled target data
- Goal: train a model which performs well on the target data.
- How? Inference through attributes, metadata, etc.
Zero Shot Learning Attributes
Multimodal Embeddings
Zero-Shot Learning with Attributes
- Can classify unseen classes, if a 1:1 mapping can be ensured.
- Ex: Weimaraner dog
  - gray, has tail, is on land, small -> [0, 1, 1, 1]
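Attribute-based zero-shot classification can be sketched in a few lines: a model trained on seen classes predicts attribute scores, and an image is assigned to the class whose attribute signature is nearest, even if that class was never seen in training. The Weimaraner vector follows the slide's example; the other classes and all scores are invented for illustration:

```python
import numpy as np

# Attribute signatures per class: [gray, has tail, is on land, small].
signatures = {
    "zebra":      np.array([0, 1, 1, 0]),
    "whale":      np.array([1, 1, 0, 0]),
    "weimaraner": np.array([0, 1, 1, 1]),   # never seen during training
}

def zero_shot_predict(attr_pred, signatures):
    """Assign the class whose attribute signature is nearest to the prediction."""
    return min(signatures, key=lambda c: np.sum((signatures[c] - attr_pred) ** 2))

# A hypothetical attribute predictor (trained on seen classes only) outputs
# soft attribute scores for an image of an unseen Weimaraner:
attr_pred = np.array([0.1, 0.9, 0.8, 0.7])
print(zero_shot_predict(attr_pred, signatures))   # -> weimaraner
```

Replacing the hand-made signatures with Word2Vec/GloVe embeddings of Wikipedia articles gives the richer descriptions mentioned on the next slide.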
Zero-Shot Learning with Wikipedia Attributes
- Use Wikipedia articles and Word2Vec/GloVe embeddings as object descriptions
Image example of zero shot learning
Google Translate - Zero-Shot Learning
Pretrained Models
https://modeldepot.io/
http://pretrained.ml/
https://keras.io/
https://github.com/Cadene/pretrained-models.pytorch
https://nlp.stanford.edu/projects/glove/
https://github.com/pumpikano/tf-dann
References
Weiss, Karl, Taghi M. Khoshgoftaar, and DingDing Wang. "A survey of transfer learning." Journal of Big Data 3.1 (2016): 9.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
Dong, Daxiang, et al. "Multi-task learning for multiple language translation." Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Vol. 1. 2015.
Ganin, Yaroslav, et al. "Domain-adversarial training of neural networks." The Journal of Machine Learning Research 17.1 (2016): 2096-2030.
Johnson, Melvin, et al. "Google's multilingual neural machine translation system: enabling zero-shot translation." arXiv preprint arXiv:1611.04558 (2016).
Zhang, Ziming, and Venkatesh Saligrama. "Zero-shot learning via semantic similarity embedding." Proceedings of the IEEE international conference on computer vision. 2015.
Thank you for your Attention!