Upload
julien-simon
View
356
Download
1
Embed Size (px)
Citation preview
Deep Learning for Developers16/12/2017
Julien Simon, AI Evangelist, EMEA
@julsimon
What to expect
• AI ?
• An introduction to Deep Learning
• Common neural network architectures and use cases
• An introduction to Apache MXNet
• Demos using Jupyter notebooks on Amazon SageMaker
• Resources
• Artificial Intelligence: design software applications which exhibit human-like behavior, e.g. speech, natural language processing, reasoning or intuition
• Machine Learning: teach machines to learn without being explicitly programmed
• Deep Learning: using neural networks, teach machines to learn from complex data where features cannot be explicitly expressed
Myth: AI is dark magicaka « You’re not smart enough »
Fact: AI is math, code and chipsA bit of Science, a lot of Engineering
Amazon AI is based on Deep Learning
An introduction to Deep Learning
Activation functions
The neuron
i=1
l
xi ∗ wi = u
”Multiply and Accumulate”Source: Wikipedia
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
x =
x11, x12, …. x1I
x21, x22, …. x2I
… … …
xm1, xm2, …. xmI
I features
m samples
y =
y1
y2
…
ym
m labels,
N2 outputs
0,0,1,0,0,…,0
1,0,0,0,0,…,0
…
0,0,0,0,1,…,0
One-hot encoding
Neural networks
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
x =
x11, x12, …. x1I
x21, x22, …. x2I
… … …
xm1, xm2, …. xmI
I features
m samples
y =
y1
y2
…
ym
m labels,
N2 categories
Total number of predictionsAccuracy =
Number of correct predictions
0,0,1,0,0,…,0
1,0,0,0,0,…,0
…
0,0,0,0,1,…,0
One-hot encoding
Neural networks
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Neural networksInitially, the network will not predict correctlyf(X1) = Y’1
A loss function measures the difference between
the real label Y1 and the predicted label Y’1error = loss(Y1, Y’1)
For a batch of samples:
𝑖=1
𝑏𝑎𝑡𝑐ℎ 𝑠𝑖𝑧𝑒
loss(Yi, Y’i) = batch error
The purpose of the training process is to
minimize error by gradually adjusting weights
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Training
Training data set Training
Trained
neural network
Batch size
Learning rate
Number of epochs
Hyper parameters
Backpropagation
Stochastic Gradient Descent (SGD)
Imagine you stand on top of a mountain with skis
strapped to your feet. You want to get down to
the valley as quickly as possible, but there is fog
and you can only see your immediate
surroundings. How can you get down the
mountain as quickly as possible? You look
around and identify the steepest path down, go
down that path for a bit, again look around and
find the new steepest path, go down that path,
and repeat—this is exactly what gradient descent
does.Tim Dettmers
University of Lugano
2015
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/
The « step size » is called
the learning rate
z=f(x,y)
Alternatives to SGD:
rmsprod, adagrad, adadelta,
adam, etc.
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Val idat ion
Validation data set Trained
neural network
Validation
accuracy
Prediction at
the end of
each epoch
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Early stopping
Training accuracy
Loss function
Accuracy
100%
Epochs
Validation accuracy
Loss
OV
ERFITTIN
G
Save the model at the end of each epoch
Common network architecturesand use cases
Convolutional Neural Networks (CNN)
Le Cun, 1998: handwritten digit recognition, 32x32 pixels
Convolution and pooling reduce dimensionality
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
https://news.developer.nvidia.com/expedia-ranking-hotel-images-with-deep-learning/
• Expedia have over 10 million images from
300,000 hotels
• Using great images boosts conversion
• Using Keras and EC2 GPU instances,
they fine-tuned a pre-trained Convolutional Neural
Network using 100,000 images
• Hotel descriptions now automatically feature the best
available images
CNN: Object Classification
CNN: Object Detection
https://github.com/precedenceguo/mx-rcnn https://github.com/zhreshold/mxnet-yolo
• 17,000 images from Instagram
• 7 brands
• Deep Learning model pre-trained on ImageNet
• Fine-tuning with TensorFlow and EC2 GPU instances
• Additional work on color extraction
https://technology.condenast.com/story/handbag-brand-and-color-detection
CNN: Object Detection
CNN: Object Segmentation
https://github.com/TuSimple/mx-maskrcnn
https://www.oreilly.com/ideas/self-driving-trucks-enter-the-fast-lane-using-deep-learning
Last June, tuSimple drove an autonomous
truck
for 200 miles from Yuma, AZ to San Diego,
CA
CNN: Face Detection
https://github.com/tornadomeet/mxnet-face
Real-Time Pose Estimation
https://github.com/dragonfly90/mxnet_Realtime_Multi-Person_Pose_Estimation
Long Short Term Memory Networks (LSTM)
• A LSTM neuron computes the output based on the input and a previous state
• LSTM networks have memory
• They’re great at predictingsequences, e.g. machine translation
Generative Adversarial Networks (GAN)
• Goodfellow, 2014
• Dual-network architecture
• Detector network• Sees real samples
• Learn how to detect real samples from fake ones created by the Generator network
• Generator network• Doesn’t see real samples
• Creates images from random data
• Applies the same weight updates as the Generator
• Learns gradually how to generate better fake samples
GAN: Welcome to the (un)real world, Neo
Generating new ”celebrity” faceshttps://github.com/tkarras/progressive_growing_of_gans
From semantic map to 2048x1024 picture https://tcwang0509.github.io/pix2pixHD/
Apache MXNet
Apache MXNet: Open Source library for Deep Learning
Programmable Portable High Performance
Near linear scaling
across hundreds of
GPUs
Highly efficient
models for
mobile
and IoT
Simple syntax,
multiple
languages
Most Open Best On AWSOptimized for
Deep Learning on AWS
Accepted into the
Apache Incubator
MXNet 1.0 released on December 4th
Input Output
1 1 1
1 0 1
0 0 03
mx. sym. Convol ut i on( dat a, ker nel =( 5, 5) , num_f i l t er =20)
mx. sym. Pool i ng( dat a, pool _t ype=" max" , ker nel =( 2, 2) ,
st r i de=( 2, 2)
l st m. l st m_unr ol l ( num_l st m_l ayer , seq_l en, l en, num_hi dden, num_embed)
4 2
2 0 4=Max
1
3
...
4
0.2
-0.1
...
0.7
mx. sym. Ful l yConnect ed( dat a, num_hi dden=128)
2
mx. symbol . Embeddi ng( dat a, i nput _di m, out put _di m = k)
0.2
-0.1
...
0.7
Queen
4 2
2 0 2=Avg
Input Weights
cos(w, queen ) = cos(w, k i ng) - cos(w, m an ) + cos(w, w om an)
mx. sym. Act i vat i on( dat a, act _t ype=" xxxx" )
" r el u"
" t anh"
" si gmoi d"
" sof t r el u"
Neural Art
Face Search
Image Segmentation
Image Caption
“People Riding
Bikes”
Bicycle, People,
Road, Sport
Image Labels
Image
Video
Speech
Text
“People Riding
Bikes”
Machine Translation
“Οι άνθρωποι
ιππασίας ποδήλατα”
Events
mx. model . FeedFor war d model . f i t
mx. sym. Sof t maxOut put
Anat omy of a Deep Lear ning Model
https://github.com/awslabs/mxnet-model-server/
https://aws.amazon.com/blogs/ai/announcing-onnx-support-for-apache-mxnet/
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
The Apache MXNet API
• Storing and accessing data in multi-dimensional arraysNDArray API
• Building models (layers, weights, activation functions) Symbol API
• Serving data during training and validation Iterators
• Training and using models Module API
Demos
https://github.com/juliensimon/dlnotebooks
1) Hello World: learn a synthetic data set
2) Classify images with pre-trained models
3) Learn and predict MNIST with a Multi-Layer Perceptron and the LeNet CNN
4) Train, host and deploy a model with Amazon SageMaker
5) Generate MNIST samples with a GAN
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
One-click training for ML, DL, and
custom algorithms
Easier training with hyperparameter
optimization
Highly-optimized machine learning
algorithms
Deployment without
engineering effort
Fully-managed hosting at scale
BuildPre-built notebook
instances
DeployTrain
Amazon SageMaker
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Resources
https://aws.amazon.com/machine-learning
https://aws.amazon.com/sagemaker
https://aws.amazon.com/blogs/ai
https://mxnet.incubator.apache.org
https://github.com/apache/incubator-mxnet
https://github.com/gluon-api
An overview of Amazon SageMaker https://www.youtube.com/watch?v=ym7NEYEx9x4
https://medium.com/@julsimon
Thank you!
Julien Simon, AI Evangelist, EMEA
@julsimon