Neural Networks Introduction - GitHub Pages

Nakul Gopalan

Georgia Tech

Neural Networks

Introduction

Machine Learning CS 4641

These slides are from Vivek Srikumar, Mahdi Roozbahani and Chao Zhang.

Outline

• Perceptron

• Stacking Linear Threshold Units

• Neural Networks

• Expressivity of Neural Networks

• Predicting with Neural Networks

• Backpropagation

Linear Classifiers

These slides are from Vivek Srikumar

Linear Classifiers


Perceptron


Perceptron Algorithm
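A minimal sketch of the standard mistake-driven perceptron update, in plain Python. The labels are assumed to be -1 or +1, and the function names, the epoch limit, and the AND toy data are illustrative choices, not taken from the slides.

```python
def train_perceptron(X, y, epochs=100):
    """Mistake-driven perceptron. Labels in y are assumed to be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # wrong side of (or exactly on) the boundary
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:  # a full pass with no updates: stop early
            break
    return w, b


def predict(x, w, b):
    """Return +1 if the score is positive, otherwise -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Toy data (an assumption for illustration): logical AND with -1/+1 labels.
X_and = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_and = [-1, -1, -1, 1]
w_and, b_and = train_perceptron(X_and, y_and)
```

Because AND is linearly separable, the loop stops once a full pass produces no mistakes.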



Outline

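• Perceptron

• Stacking Linear Threshold Units

• Neural Networks

• Expressivity of Neural Networks

• Predicting with Neural Networks

• Backpropagation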

Linear Threshold Unit

Features for Linear Threshold Unit

Features from Classifiers
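One way to picture the idea of using classifier outputs as features is the sketch below. Two linear threshold units feed a third, and together they compute XOR, which no single threshold unit can represent. The weights are hand-picked for illustration and the names `ltu` and `xor_network` are hypothetical, not from the slides.

```python
def ltu(weights, bias, inputs):
    """Linear threshold unit: output 1 if w.x + b > 0, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


def xor_network(x1, x2):
    # First layer: two threshold units whose outputs serve as new features.
    h_or = ltu((1, 1), -0.5, (x1, x2))   # fires if at least one input is 1
    h_and = ltu((1, 1), -1.5, (x1, x2))  # fires only if both inputs are 1
    # Second layer: a threshold unit over those features (OR and not AND).
    return ltu((1, -1), -0.5, (h_or, h_and))
```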

Outline

• Perceptron

• Stacking Linear Threshold Units

• Neural Networks

• Expressivity of Neural Networks

• Predicting with Neural Networks

• Backpropagation

Neural Networks

Inspiration from Biological Neurons

Artificial Neurons

Activation Functions
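The text does not say which activation functions the slide lists. As an assumption, the sketch below shows four common choices for scalar inputs; the sigmoid is written in a numerically stable form.

```python
import math


def threshold(z):
    """Step activation used by a linear threshold unit."""
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    """Logistic function, branching on the sign of z to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def tanh(z):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)
```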

Neural Network

A Brief History of Neural Networks

Outline

• Perceptron

• Stacking Linear Threshold Units

• Neural Networks

• Expressivity of Neural Networks

• Predicting with Neural Networks

• Backpropagation

A Single Neuron with Threshold Activation

Two Layers with Threshold Activation

Figure from [Shai Shalev-Shwartz and Shai Ben-David, 2014]

Three Layers with Threshold Activation

Figure from [Shai Shalev-Shwartz and Shai Ben-David, 2014]

NNs are Universal Function Approximators

Outline

• Perceptron

• Stacking Linear Threshold Units

• Neural Networks

• Expressivity of Neural Networks

• Predicting with Neural Networks

• Backpropagation

Predicting with Neural Networks

Predicting with Neural Nets: The Forward Pass

The Forward Pass
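A minimal sketch of a forward pass, assuming a fully connected network stored as a list of (weights, biases, activation) layers. The 2-2-1 architecture, the weight values, and the names `dense` and `forward` are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: each row of `weights` holds the incoming weights of one unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """Apply the layers in order; the last layer's output is the prediction."""
    a = x
    for weights, biases, activation in layers:
        a = dense(a, weights, biases, activation)
    return a


# Hypothetical 2-2-1 network with made-up weights.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.5], sigmoid),  # hidden layer
    ([[1.0, 1.0]], [0.0], lambda z: z),                 # linear output unit
]
y_hat = forward([1.0, 1.0], layers)
```

Prediction only needs this left-to-right sweep: each layer consumes the previous layer's activations, so nothing beyond the current activation vector has to be kept.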

Outline

• Perceptron

• Stacking Linear Threshold Units

• Neural Networks

• Expressivity of Neural Networks

• Predicting with Neural Networks

• Backpropagation

Backpropagation

• Smarter chain rule for derivatives

• What is chain rule:

Univariate case:

Multivariate case (Partial derivatives):
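For reference, the two cases in their usual textbook form. The symbols $f$, $g$, $x(t)$ and $y(t)$ are generic placeholders chosen here, not notation confirmed by the slide.

```latex
% Univariate case
\frac{d}{dt} f\big(g(t)\big) = f'\big(g(t)\big)\, g'(t)

% Multivariate case (partial derivatives)
\frac{d}{dt} f\big(x(t), y(t)\big)
  = \frac{\partial f}{\partial x}\,\frac{dx}{dt}
  + \frac{\partial f}{\partial y}\,\frac{dy}{dt}
```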

These slides are from Roger Grosse

Backpropagation

These slides are from Matt Gormley

Backpropagation

Squared-error loss: $\frac{1}{2}(y - y^*)^2$

Its derivative with respect to $y$: $y - y^*$


Backpropagation

• Backprop is used to train the overwhelming majority of neural nets today.

Even optimization algorithms much fancier than gradient descent (e.g. second-order methods) use backprop to compute the gradients.

• Despite its practical success, backprop is believed to be neurally implausible. No evidence for biological signals analogous to error derivatives.

All the biologically plausible alternatives we know about learn much more slowly (on computers). So how on earth does the brain learn?

Slide from Roger Grosse

Take-Home Messages


• Neural Networks



• Backprop is chain rule with some book-keeping
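The last point can be sketched on a tiny scalar network, y = w2 * sigmoid(w1 * x + b1) + b2, with a squared-error loss. The network, the parameter values, and the function names are assumptions made for illustration. The forward pass stores intermediate values (the book-keeping), the backward pass reuses them through the chain rule, and a finite-difference check confirms the result.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(params, x, y_star):
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return 0.5 * (y - y_star) ** 2


def backprop(params, x, y_star):
    w1, b1, w2, b2 = params
    # Forward pass: keep every intermediate value for later reuse.
    z = w1 * x + b1
    h = sigmoid(z)
    y = w2 * h + b2
    # Backward pass: chain rule from the loss back to each parameter.
    dy = y - y_star          # dL/dy for L = 1/2 (y - y*)^2
    dw2 = dy * h
    db2 = dy
    dh = dy * w2
    dz = dh * h * (1.0 - h)  # sigmoid'(z) = h * (1 - h)
    dw1 = dz * x
    db1 = dz
    return [dw1, db1, dw2, db2]


def numeric_grad(params, x, y_star, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((loss(up, x, y_star) - loss(down, x, y_star)) / (2 * eps))
    return grads


params = [0.3, -0.1, 0.8, 0.2]  # made-up values
analytic = backprop(params, x=1.5, y_star=1.0)
numeric = numeric_grad(params, x=1.5, y_star=1.0)
```

Each backward quantity is computed once and reused: `dy` feeds `dw2`, `db2` and `dh`, which is what separates backprop from differentiating every parameter from scratch.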
