Building high-level features using large-scale unsupervised learning
Anh Nguyen, Bay-yuan Hsu
CS290D – Data Mining (Spring 2014), University of California, Santa Barbara
Slide adapted from Andrew Ng (Stanford), Nando de Freitas (UBC)
Agenda
1. Motivation
2. Approach
   1. Sparse Deep Auto-encoder
   2. Local Receptive Field
   3. L2 Pooling
   4. Local Contrast Normalization
   5. Overall Model
3. Parallelism
4. Evaluation
5. Discussion
1. MOTIVATION
Motivation
• Feature learning
  • Supervised learning: needs a large number of labeled examples
  • Unsupervised learning
    • Example: build a face detector without having labeled face images
• Goal: building high-level features using unlabeled data
Motivation
• Previous work: auto-encoders, sparse coding
  • Result: only learns low-level features
  • Reason: computational constraints
• This approach scales up the dataset, the model, and the computational resources
2. APPROACH
Sparse Deep Auto-encoder
• Auto-encoder
  • A neural network trained without labels to reconstruct its input
  • Trained by back-propagation
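The bullets above can be sketched as a minimal one-hidden-layer auto-encoder trained by back-propagation. The layer sizes, learning rate, and random data here are invented for illustration; this is not the paper's model.

```python
import numpy as np

# Minimal auto-encoder: encode with a sigmoid layer, decode linearly,
# and back-propagate the squared reconstruction error.
rng = np.random.default_rng(0)
n_in, n_hid, lr = 64, 16, 0.1
W1 = rng.normal(0.0, 0.1, (n_hid, n_in))   # encoder weights
W2 = rng.normal(0.0, 0.1, (n_in, n_hid))   # decoder weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = rng.random((200, n_in))                # stand-in for unlabeled images

def reconstruction_loss():
    return np.mean((sigmoid(X @ W1.T) @ W2.T - X) ** 2)

loss_before = reconstruction_loss()
for _ in range(100):
    H = sigmoid(X @ W1.T)                  # encode
    err = H @ W2.T - X                     # reconstruction error (decode)
    W2 -= lr * (err.T @ H) / len(X)        # gradient step on decoder
    dH = (err @ W2) * H * (1.0 - H)        # back-propagate through sigmoid
    W1 -= lr * (dH.T @ X) / len(X)         # gradient step on encoder
loss_after = reconstruction_loss()
```

No labels appear anywhere: the training signal is the input itself, which is what makes the method unsupervised.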
Sparse Deep Auto-encoder (cont'd)
• Sparse coding
  • Input: images x(1), x(2), ..., x(m)
  • Learn: bases (features) f1, f2, ..., fk so that each input x can be approximately decomposed as x ≈ Σj aj fj, where the aj are mostly zero ("sparse")
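The decomposition above is usually found by minimizing reconstruction error under an L1 sparsity penalty. A standard formulation (not necessarily the exact one behind these slides) is:

```latex
\min_{\{f_j\},\{a^{(i)}\}} \;
\sum_{i=1}^{m} \Big\| x^{(i)} - \sum_{j=1}^{k} a^{(i)}_j f_j \Big\|_2^2
\; + \; \lambda \sum_{i=1}^{m} \sum_{j=1}^{k} \big| a^{(i)}_j \big|
\qquad \text{s.t.} \quad \|f_j\|_2 \le 1
```

The L1 term drives most coefficients a_j to exactly zero, and the norm constraint on the bases prevents the penalty from being evaded by rescaling.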
Sparse Deep Auto-encoder (cont'd)
• Sparse coding
  • Regularizer encourages most coefficients to be zero
Sparse Deep Auto-encoder (cont'd)
• Sparse deep auto-encoder: multiple hidden layers, each used to achieve a particular characteristic in the learned features
Local Receptive Field
• Definition: each feature in the auto-encoder connects only to a small region of the lower layer
• Goals:
  • Learn features efficiently
  • Enable parallelism
• Training operates on small image patches
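A minimal sketch of local receptive fields, with all sizes invented for illustration: each feature's weight vector covers only one small patch of the image rather than every pixel.

```python
import numpy as np

# Local receptive fields: tile the image with small patches and give
# each feature weights only for its own patch.
rng = np.random.default_rng(0)
img_size, patch, stride = 16, 4, 4
image = rng.random((img_size, img_size))

features = []
for r in range(0, img_size - patch + 1, stride):
    for c in range(0, img_size - patch + 1, stride):
        field = image[r:r + patch, c:c + patch].ravel()
        w = rng.normal(0.0, 0.1, field.size)  # weights local to this field
        features.append(float(w @ field))     # one feature per field

features = np.array(features)
# A 4x4 grid of fields yields 16 features, each with only 16 weights,
# versus the 256 weights a fully connected feature would need.
```

Because each field's weights and computation are independent, the fields can be trained on different machines, which is the parallelism the slide refers to.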
L2 Pooling
• Goal: robustness to local distortion
• Approach: group similar features together to achieve invariance
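Concretely, an L2-pooled unit outputs the square root of the sum of squares of the simple units in its group, so the response depends on the group's total energy, not on which unit within the group fired. The group size here is illustrative.

```python
import numpy as np

# L2 pooling over fixed-size groups of simple-unit activations.
def l2_pool(h, group_size):
    groups = np.asarray(h, dtype=float).reshape(-1, group_size)
    return np.sqrt((groups ** 2).sum(axis=1))

a = l2_pool([3.0, 4.0, 0.0, 0.0], 2)   # -> [5.0, 0.0]
b = l2_pool([4.0, 3.0, 0.0, 0.0], 2)   # activation moved within the group
```

Swapping activations inside a group leaves the pooled output unchanged, which is the invariance to small local distortions that the slide describes.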
Local Contrast Normalization
• Goal: robustness to variation in light intensity
• Approach: normalize the contrast of each local region
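A simplified single-window sketch of the idea: subtract the local mean and divide by the local standard deviation, so an affine change in lighting leaves the output unchanged. (Real LCN applies this over sliding local windows.)

```python
import numpy as np

# Contrast normalization of one local window: zero-mean, unit-variance.
def contrast_normalize(patch, eps=1e-8):
    patch = np.asarray(patch, dtype=float)
    centered = patch - patch.mean()
    return centered / max(np.sqrt((centered ** 2).mean()), eps)

dark = contrast_normalize([0.1, 0.2, 0.1, 0.2])
bright = contrast_normalize([0.5, 1.0, 0.5, 1.0])  # same pattern, brighter
```

The dark and bright patches differ only by brightness and contrast scaling, so they normalize to identical outputs.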
Overall Model
• 3 layers, each with three sublayers:
  • Simple (local receptive fields): 18x18 px, 8 neurons per patch
  • Complex (L2 pooling): 5x5 px
  • LCN: 5x5 px
Overall Model
• Training: reconstruct the input of each layer
• Optimization function
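For reference, the underlying paper (Le et al.) trains each layer to reconstruct its input while keeping the pooled responses sparse; as best reconstructed from that paper (not verbatim from these slides), the per-layer objective is roughly:

```latex
\min_{W_1, W_2} \;
\sum_{i=1}^{m} \left(
\big\| W_2 W_1^{\top} x^{(i)} - x^{(i)} \big\|_2^2
\; + \; \lambda \sum_{j=1}^{k}
\sqrt{\varepsilon + H_j \big( W_1^{\top} x^{(i)} \big)^2}
\right)
```

Here W_1 are the encoding weights, W_2 the decoding weights, H_j selects the j-th pooling group, and the square-root term is exactly the L2-pooled response, so sparsity is imposed on pooled units rather than individual features.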
Overall Model
• Complex model?
3. PARALLELISM
Asynchronous SGD
• Two recent lines of research in speeding up large learning problems:
  • Parallel/distributed computing
  • Online (and mini-batch) learning algorithms: stochastic gradient descent, perceptron, MIRA, stepwise EM
• How can we bring together the benefits of parallel computing and online learning?
Asynchronous SGD
• SGD: stochastic gradient descent
  • Choose an initial parameter vector W and learning rate α
  • Repeat until an approximate minimum is obtained:
    • Randomly shuffle the examples in the training set
    • For each example, take a gradient step on W using that example alone
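The steps above become asynchronous when several workers apply them to one shared parameter vector without locking. This sketch uses a toy quadratic objective and invented sizes; it only illustrates the unsynchronized-update pattern, not the paper's actual system.

```python
import numpy as np
import threading

# Asynchronous SGD: four workers update shared parameters W with no locks.
target = np.array([3.0, -2.0])
W = np.zeros(2)                          # shared parameter vector
lr = 0.05

def worker(seed, n_steps):
    global W
    rng = np.random.default_rng(seed)    # per-worker data stream
    for _ in range(n_steps):
        x = rng.normal(size=2)
        # gradient of the per-example loss 0.5 * ((W - target) @ x)**2
        grad = ((W - target) @ x) * x
        W = W - lr * grad                # unsynchronized update

threads = [threading.Thread(target=worker, args=(s, 2000)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Occasional lost updates from racing workers only add noise to the trajectory; with a small enough learning rate the shared parameters still converge.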
Model Parallelism
• Weights are divided according to image locality and stored on different machines
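A toy sketch of that partitioning, with two simulated machines standing in for the real distributed setup (the names and sizes are invented for illustration):

```python
import numpy as np

# Model parallelism: split the weights by image locality so each
# "machine" (a dict entry here) holds only its region's weights and
# computes its features independently.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
regions = {"machine_0": image[:, :4],    # left half of the image
           "machine_1": image[:, 4:]}    # right half

local_features = {}
for name, region in regions.items():
    w = rng.normal(0.0, 0.1, region.size)     # weights stored on this machine
    local_features[name] = float(w @ region.ravel())

# Only the small feature values cross machine boundaries, not the weights.
features = np.array([local_features["machine_0"], local_features["machine_1"]])
```

Because local receptive fields keep each feature's weights confined to one region, this split sends almost no weight data between machines.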
4. EVALUATION
Evaluation
• 10M unlabeled YouTube frames of size 200x200
• 1B parameters
• 1,000 machines (16,000 cores)
Experiment on Faces
• Test set
  • 37,000 images
  • 13,026 face images
• Best neuron
Experiment on Faces (cont'd)
• Visualization
  • Top stimuli (images) for the face neuron
  • Optimal stimulus for the face neuron
Experiment on Faces (cont'd)
• Invariance properties
Experiment on Cat/Human Body
• Test set
  • Cat: 10,000 positive, 18,409 negative
  • Human body: 13,026 positive, 23,974 negative
• Accuracy
ImageNet Classification
• Task: recognizing images
• Dataset: 20,000 categories, 14M images
• Accuracy: 15.8% (previous state of the art: 9.3%)
5. DISCUSSION
Discussion
• Deep learning
  • Unsupervised feature learning
  • Learning multiple layers of representation
• Accuracy increased by invariance (pooling) and contrast normalization
• Scalability
6. REFERENCES
References
1. Quoc Le et al., "Building High-level Features using Large Scale Unsupervised Learning"
2. Nando de Freitas, "Deep Learning", URL: https://www.youtube.com/watch?v=g4ZmJJWR34Q
3. Andrew Ng, "Sparse autoencoder", URL: http://www.stanford.edu/class/archive/cs/cs294a/cs294a.1104/sparseAutoencoder.pdf
4. Andrew Ng, "Machine Learning and AI via Brain Simulations", URL: https://forum.stanford.edu/events/2011slides/plenary/2011plenaryNg.pdf
5. Andrew Ng, "Deep Learning", URL: http://www.ipam.ucla.edu/publications/gss2012/gss2012_10595.pdf