37
Intro to Neural Networks Prof. Kate Saenko, [email protected]

Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

  • Upload
    others

  • View
    21

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Intro to Neural Networks

Prof. Kate Saenko, [email protected]

Page 2: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Today

• Neural Networks

– aka Multilayer Perceptrons,

– aka Deep Belief Networks/Deep Learning

• Unsupervised Neural Networks

– Autoencoders

– Sparse Autoencoders

Page 3: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Intro to Neural Networks

Slides by Andrew Ng

Page 4: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neural Networks

Origins: Algorithms that try to mimic the brain. Was very widely used in 80s and early 90s; popularity diminished in late 90s. Recent resurgence: State-of-the-art technique for many applications

Page 5: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

[Roe et al., 1992]

Auditory cortex learns to see

Auditory Cortex

The “one learning algorithm” hypothesis

[Roe et al., 1992]

Page 6: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Somatosensory cortex learns to see

Somatosensory Cortex

The “one learning algorithm” hypothesis

[Metin & Frost, 1989]

Page 7: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Seeing with your tongue Human echolocation (sonar)

Haptic belt: Direction sense Implanting a 3rd eye

Sensor representations in the brain

[BrainPort; Welsh & Blasch, 1997; Nagel et al., 2005; Constantine-Paton & Law, 2009]

Page 8: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neuron in the brain

Page 9: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neurons in the brain

[Credit: US National Institutes of Health, National Institute on Aging]

Page 10: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neuron model: Logistic unit

Sigmoid (logistic) activation function.

Page 11: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Sigmoid function Logistic function

One Unit is a Logistic Regression Classifier

Want

1

0.5

0

1

0

Page 12: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Interpretation of Hypothesis Output

= estimated probability that y = 1 on input x

Tell patient that 70% chance of tumor being malignant

Example: If

“probability that y = 1, given x, parameterized by ”

Page 13: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neural Network

Layer 3 Layer 1 Layer 2

Page 14: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neural Network “activation” of unit in layer

matrix of weights controlling function mapping from layer to layer

If network has units in layer , units in layer , then will be of dimension .

Page 15: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Add .

Forward propagation: Vectorized implementation

Page 16: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Layer 3 Layer 1 Layer 2

Neural Network learning its own features

Page 17: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Layer 3 Layer 1 Layer 2

Other network architectures

Layer 4

Page 18: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Handwritten digit classification

[Courtesy of Yann LeCun]

Page 19: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Handwritten digit classification

[Courtesy of Yann LeCun]

Page 20: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Neural Network (Classification)

Binary classification 1 output unit

Layer 1 Layer 2 Layer 3 Layer 4

Multi-class classification (K classes)

K output units

total no. of layers in network

no. of units (not counting bias unit) in layer

pedestrian car motorcycle truck

E.g. , , ,

Page 21: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Cost function

Neural network:

regularization

training error

Page 22: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Training set:

How to choose parameters ?

m examples

Cost function: Training Error

Page 23: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Linear regression:

“non-convex” “convex”

Cost function: Training Error

Page 24: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

If y = 1

1 0

Cost function: Training Error

Page 25: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

If y = 0

1 0

Cost function: Training Error

Page 26: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Cost function: Training Error

Page 27: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Output

To fit parameters :

To make a prediction given new :

Cost function: Training Error

Page 28: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Andrew Ng

Cost function

Neural network:

regularization

training error

Page 29: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Gradient computation

Need code to compute: -

-

Use “Backpropagation algorithm” - Efficient way to compute

- Computes gradient

incrementally by “propagating” backwards through the network

Page 30: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Unsupervised Neural Networks

Page 31: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Unsupervised Learning

• So far, we talked about classifiation

• Can also apply algorithm to regression, or

• unsupervised learning, by setting outputs to inputs – Autoencoder

– Sparse Autoencoder

– Restricted Bolzmann Machine (RBM)

autoencoder

Page 32: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Autoencoder

• The identity function seems trivial, but…

• …by putting constraints on the hidden units, we can discover interesting structure – Use few units, enforce sparsity, etc.

• Example: – Learn autoencoder on 10x10 pixel patches

– Use only 50 hidden nodes

– Forces hidden nodes to “compress” data

• Similar to PCA, sparse coding, LLE, etc.

Page 33: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Sparse Autoencoder

• Think of a neuron as being "active" (or as "firing") if its output value is close to 1, or as being "inactive" if its output value is close to 0.

• We would like to constrain the neurons to be inactive most of the time

• Same idea as Sparse Coding! But different method

Page 34: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Sparse Autoencoder

• we will write to denote the activation of this hidden unit when the network is given a specific input x

• the average activation of hidden unit (averaged over the training set) is

• We would like to (approximately) enforce the constraint

where is a sparsity parameter, typically a small value close to zero (say ).

Page 35: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

RECALL: Gradient computation

Need code to compute: -

-

Use “Backpropagation algorithm” - Efficient way to compute

- Computes gradient

incrementally by “propagating” backwards through the network

+ Sparsity penalty

regularization

training error

+

Page 36: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

Sparse Autoencoder

Page 37: Intro to Neural Networksmorse.uml.edu/Activities.d/CDM/docs/neuralnets.pdf · Intro to Neural Networks Slides by Andrew Ng . Andrew Ng Neural Networks Origins: Algorithms that try

http://deeplearning.stanford.edu/wiki/index.php/Visualizing_a_Trained_Autoencoder

Sparse Autoencoder • Visualizing a sparse Autoencoder trained on 10x10

image patches