MLP: The Multi-layer Perceptron. Dr. Syed Imtiyaz Hassan, Assistant Professor, Department of CSE, Jamia Hamdard (Deemed to be University), New Delhi, India. https://Syedimtiyazhassan.org s.imtiyaz@jamiahamdard.ac.in http://www.jamiahamdard.edu

The Multi-layer MLP Perceptron · 09.03.2019  · Adaline X is voltages w is conductance of controllable resistors Madaline (Many Adaline) Adaline connected to AND logic Adaline &


The Multi-layer Perceptron (MLP)

Dr. Syed Imtiyaz Hassan
Assistant Professor, Department of CSE, Jamia Hamdard (Deemed to be University), New Delhi, India.

https://Syedimtiyazhassan.org
s.imtiyaz@jamiahamdard.ac.in
http://www.jamiahamdard.edu

XOR Revisit: Solution Using MLP
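The slide's own solution is not part of this transcript. As a minimal sketch of how a two-layer network can compute XOR, the example below assumes hand-picked weights and hard threshold units (both are illustrative choices, not taken from the slides): one hidden unit computes OR, the other computes AND, and the output unit fires when OR is on and AND is off.

```python
def step(z):
    """Hard threshold unit: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    # Hidden layer (weights and thresholds are hand-picked assumptions)
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)    # fires if at least one input is 1
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)   # fires only if both inputs are 1
    # Output layer: OR and not AND
    return step(1.0 * h_or - 1.0 * h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

No single threshold unit can separate these four points, which is why the hidden layer is needed.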

The Sigmoid Threshold Unit
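The slide's figure is not part of this transcript. In its usual textbook form (the symbols below are assumptions, not taken from the slide), a sigmoid unit computes a weighted sum of its inputs and passes it through the logistic function, whose derivative has a convenient closed form:

```latex
\[
o = \sigma(\vec{w} \cdot \vec{x}), \qquad
\sigma(y) = \frac{1}{1 + e^{-y}}, \qquad
\frac{d\sigma(y)}{dy} = \sigma(y)\,\bigl(1 - \sigma(y)\bigr)
\]
```

Because the output is a smooth function of the weights, gradient descent can be applied to it, which is not possible with a hard threshold.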

Adaline

• Adaptive Linear Element

• Proposed by Widrow & Hoff, 1960

Adaline

• The inputs x are voltages; the weights w are the conductances of controllable resistors.

• Madaline (Many Adaline): several Adalines connected to an AND logic unit.

• Adaline and Madaline each have a single layer of adaptive weights.

Adaline: Delta Rule

• Also known as the LMS or Widrow-Hoff rule

• Update formula
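The update formula itself is not part of this transcript. In its standard form the delta rule changes each weight by `eta * (t - o) * x_i`, where `o` is the linear unit's output and `t` the target. The sketch below applies that rule one example at a time; the function name, the learning rate `eta`, and the toy data are illustrative assumptions.

```python
import numpy as np

def train_adaline(X, t, eta=0.05, epochs=500):
    """Delta (LMS) rule for a linear unit o = w . x, with a bias input x0 = 1."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend the bias input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_d, t_d in zip(X, t):
            o_d = w @ x_d                      # linear output, no threshold
            w += eta * (t_d - o_d) * x_d       # delta rule update
    return w

# Toy data generated from t = 1 + 2x, so the rule should recover w close to [1, 2].
X = np.array([[0.0], [1.0], [2.0], [3.0]])
t = 1.0 + 2.0 * X[:, 0]
w = train_adaline(X, t)
print(w)
```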


MLP Architecture

The 3-3-2 Network

Gradient Descent: Basis for the Backpropagation Algorithm

• k = number of outputs

• d = a training example

• t_d = target output for training example d

• o_d = output of the unit for training example d

• D = set of training examples

• Error E = half the squared difference between target output and unit output, summed over the training examples

• E is written as a function of w because the linear unit's output o depends on this weight vector.
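The slide's formula is not part of this transcript. Using the symbols defined earlier (t_d, o_d, D), the error measure described here is usually written as follows; this is a reconstruction from the standard formulation, not a copy of the slide:

```latex
\[
E(\vec{w}) = \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2
\]
```

The factor 1/2 is a convenience: it cancels the 2 that appears when the square is differentiated.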

• Gradient of E with respect to w

• Training rule
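As a sketch of what these two items usually denote (the learning rate symbol \eta and the input count n are assumptions, since the slide's formulas are not part of this transcript), the gradient collects the partial derivatives of E, and the training rule steps against it:

```latex
\[
\nabla E(\vec{w}) = \left[ \frac{\partial E}{\partial w_0},
  \frac{\partial E}{\partial w_1}, \ldots, \frac{\partial E}{\partial w_n} \right]
\]
\[
\vec{w} \leftarrow \vec{w} + \Delta\vec{w}, \qquad
\Delta\vec{w} = -\eta \, \nabla E(\vec{w})
\]
```

The minus sign moves the weights in the direction of steepest decrease of E.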

• Training rule (in component form)
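Written per weight, with \eta again assumed to denote the learning rate, the component form would typically read:

```latex
\[
w_i \leftarrow w_i + \Delta w_i, \qquad
\Delta w_i = -\eta \, \frac{\partial E}{\partial w_i}
\]
```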

• Gradient
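The derivation shown on the slide is not part of this transcript. For a linear unit with output o_d = \vec{w} \cdot \vec{x}_d and the half-squared-error measure, the standard derivation runs as below; x_{id} is assumed to denote the i-th input component of training example d.

```latex
\[
\begin{aligned}
\frac{\partial E}{\partial w_i}
  &= \frac{\partial}{\partial w_i} \, \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2 \\
  &= \sum_{d \in D} (t_d - o_d) \, \frac{\partial}{\partial w_i} \bigl( t_d - \vec{w} \cdot \vec{x}_d \bigr) \\
  &= \sum_{d \in D} (t_d - o_d)(-x_{id})
\end{aligned}
\]
\[
\Delta w_i = -\eta \, \frac{\partial E}{\partial w_i} = \eta \sum_{d \in D} (t_d - o_d) \, x_{id}
\]
```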


Multi-layer Perceptron: Feedforward Backpropagation

• A differentiable threshold unit

• Networks with multiple output units rather than a single output unit
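With several output units, the error measure is usually extended by summing over the outputs as well as over the training examples. In this sketch the index k is assumed to range over the output units, and t_{kd} and o_{kd} denote the target and actual value of output k for example d:

```latex
\[
E(\vec{w}) = \frac{1}{2} \sum_{d \in D} \; \sum_{k \in \mathrm{outputs}} (t_{kd} - o_{kd})^2
\]
```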

Backpropagation Algorithm

The stochastic gradient descent version of the backpropagation algorithm for feedforward networks containing two layers of sigmoid units
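The algorithm listing on the slide is not part of this transcript. The following is a minimal sketch of stochastic-gradient backpropagation for one hidden and one output layer of sigmoid units, following the common textbook formulation; the function names, initial weight range, learning rate, and the OR training data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W_h, W_o, x):
    """Forward pass; a constant 1.0 is appended as the bias input of each layer."""
    x1 = np.append(x, 1.0)
    h1 = np.append(sigmoid(W_h @ x1), 1.0)
    return x1, h1, sigmoid(W_o @ h1)

def train_backprop(X, T, n_hidden=3, eta=0.5, epochs=2000, seed=0):
    rng = np.random.default_rng(seed)
    # Small random initial weights; the last column of each matrix is the bias.
    W_h = rng.uniform(-0.5, 0.5, (n_hidden, X.shape[1] + 1))
    W_o = rng.uniform(-0.5, 0.5, (T.shape[1], n_hidden + 1))
    for _ in range(epochs):
        for x, t in zip(X, T):                  # one weight update per example
            x1, h1, o = forward(W_h, W_o, x)
            h = h1[:-1]
            delta_o = o * (1 - o) * (t - o)                     # output error terms
            delta_h = h * (1 - h) * (W_o[:, :-1].T @ delta_o)   # hidden error terms
            W_o += eta * np.outer(delta_o, h1)
            W_h += eta * np.outer(delta_h, x1)
    return W_h, W_o

def sum_squared_error(W_h, W_o, X, T):
    return 0.5 * sum(np.sum((t - forward(W_h, W_o, x)[2]) ** 2) for x, t in zip(X, T))

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[0.0], [1.0], [1.0], [1.0]])      # logical OR as a toy target
W_h, W_o = train_backprop(X, T)
print(sum_squared_error(W_h, W_o, X, T))
```

Both error terms are computed before either weight matrix is changed, so the hidden-layer update uses the output weights from the forward pass.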


Mini-batches: Chance to Escape from Local Minima

• The batch algorithm typically converges to a local minimum in fewer weight updates than the sequential algorithm, because each update uses the gradient of the whole training set

Mini-batches

• Split the training set into random batches

• Estimate the gradient from one of these subsets of the training set

• Perform a weight update

• Use the next subset to estimate a new gradient, and use that for the next weight update

• Repeat until all of the training set has been used
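The mini-batch steps listed on this slide can be sketched as follows. This is an illustrative example on a linear unit, not the author's code; `minibatch_epoch`, `linear_unit_gradient`, the batch size, and the toy data are all assumptions.

```python
import numpy as np

def linear_unit_gradient(w, X, t):
    """Gradient of E = 1/2 * sum (t - Xw)^2, averaged over the batch."""
    return -(X.T @ (t - X @ w)) / len(X)

def minibatch_epoch(w, X, t, grad_fn, eta=0.1, batch_size=2, rng=None):
    """One pass over the training set, one weight update per random batch."""
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(len(X))                  # random split into batches
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]        # the next subset
        w = w - eta * grad_fn(w, X[idx], t[idx])     # gradient estimate, then update
    return w

# Toy data from t = 1 + 2x; the first column of X is the bias input.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
t = X @ np.array([1.0, 2.0])
rng = np.random.default_rng(0)
w = np.zeros(2)
for _ in range(500):
    w = minibatch_epoch(w, X, t, linear_unit_gradient, rng=rng)
print(w)
```

Setting `batch_size` to the size of the training set gives the batch algorithm, and setting it to 1 gives a sequential pass in random order.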

Stochastic Gradient Descent: For Large Training Sets

• The extreme version of the mini-batch idea

• Use just one piece of data to estimate the gradient at each iteration of the algorithm, picking that piece of data uniformly at random from the training set

• Often used when the training set is very large
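A minimal sketch of this idea for a linear unit is shown below, assuming a fixed learning rate and a fixed number of iterations (both illustrative choices). Each iteration draws one training example uniformly at random and updates the weights from that example alone.

```python
import numpy as np

def sgd_linear_unit(X, t, eta=0.05, n_iters=5000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        d = rng.integers(len(X))              # one example, chosen uniformly at random
        o_d = X[d] @ w
        w = w + eta * (t[d] - o_d) * X[d]     # gradient estimate from that example only
    return w

# Toy data from t = 1 + 2x; the first column of X is the bias input.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
t = X @ np.array([1.0, 2.0])
w = sgd_linear_unit(X, t)
print(w)
```

The cost of one iteration does not depend on the size of the training set, which is what makes the method attractive for very large data sets.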

Adding Momentum

• The weight update on the nth iteration depends partially on the update that occurred during the (n - 1)th iteration
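In the usual textbook notation (the symbols are assumptions, since the slide's formula is not part of this transcript), the momentum term adds a fraction \alpha of the previous update to the current one. Here \delta_j is the error term of unit j, x_{ji} is the i-th input to unit j, and \eta is the learning rate:

```latex
\[
\Delta w_{ji}(n) = \eta \, \delta_j \, x_{ji} + \alpha \, \Delta w_{ji}(n - 1),
\qquad 0 \le \alpha < 1
\]
```

With \alpha = 0 this reduces to the plain backpropagation update.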

RBFN (Radial Basis Function Network)

• An ANN that uses radial basis functions as activation functions.

• The output of the network is a linear combination of RBFs of the inputs and neuron parameters.

• An RBF is a real-valued function whose value depends only on the distance from the origin or, more generally, from a fixed center point.

• Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function, and a linear output layer.

• Euclidean

• Gaussian

• Multiquadric

• …
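The three-layer structure described for RBF networks can be sketched as below, assuming Gaussian basis functions of a common width and Euclidean distance to each hidden unit's center. Fitting the output weights by least squares is one common choice and is an assumption here, as are the function names and the toy data.

```python
import numpy as np

def rbf_hidden(X, centers, width=1.0):
    """Hidden layer: Gaussian RBF of the Euclidean distance to each center."""
    r = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(r ** 2) / (2.0 * width ** 2))

def rbfn_forward(X, centers, weights, width=1.0):
    """Output layer: linear combination of the hidden-unit activations."""
    return rbf_hidden(X, centers, width) @ weights

# Toy example: XOR targets, with one hidden unit centered on each input point.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
t = np.array([0.0, 1.0, 1.0, 0.0])
centers = X.copy()
weights, *_ = np.linalg.lstsq(rbf_hidden(X, centers), t, rcond=None)
outputs = rbfn_forward(X, centers, weights)
print(outputs)
```

Only the output layer is trained in this sketch, and because that layer is linear the fit reduces to a linear least-squares problem.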

ART

• Adaptive Resonance Theory

• Developed by Stephen Grossberg and Gail Carpenter in 1987.

• The basic ART system is an unsupervised learning model.

• Always open to new learning (adaptive) without losing the old patterns (resonance).

ART Operating Principle

• Recognition phase

  • The input vector is compared with the classification presented at every node in the output layer.

  • The output of a neuron becomes “1” if it best matches the classification applied; otherwise it becomes “0”.

• Comparison phase

  • The input vector is compared with the comparison-layer vector. A reset is triggered if the degree of similarity is less than the vigilance parameter.

• Search phase

  • The network checks for a reset as well as for the match found in the phases above.

  • If there is no reset and the match is good enough, the classification is complete.

  • Otherwise, the process is repeated and the other stored patterns are tried in turn to find the correct match.
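The recognition, comparison, and search phases can be illustrated with a heavily simplified ART 1 style clustering of binary vectors. This is a sketch under stated assumptions, not a full ART implementation: each category keeps a single binary prototype, learning is the "fast learning" intersection rule, and the function name, choice function constant, and vigilance value are illustrative.

```python
import numpy as np

def art1_cluster(patterns, vigilance=0.6):
    prototypes = []   # one binary prototype per category (output node)
    labels = []
    for x in patterns:
        x = np.asarray(x, dtype=bool)
        # Recognition phase: score how well each stored category matches x.
        scores = [np.sum(p & x) / (0.5 + np.sum(p)) for p in prototypes]
        chosen = None
        # Search phase: try the categories from best to worst match.
        for j in np.argsort(scores)[::-1]:
            # Comparison phase: similarity of x to the chosen prototype.
            similarity = np.sum(prototypes[j] & x) / max(np.sum(x), 1)
            if similarity >= vigilance:
                chosen = int(j)        # no reset: accept this category
                break
            # Similarity below vigilance: reset, and try the next category.
        if chosen is None:
            prototypes.append(x.copy())            # no match: open a new category
            chosen = len(prototypes) - 1
        else:
            prototypes[chosen] = prototypes[chosen] & x   # refine the prototype
        labels.append(chosen)
    return labels, prototypes

patterns = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
labels, prototypes = art1_cluster(patterns)
print(labels)
```

Raising the vigilance parameter makes resets more frequent, so more and narrower categories are created; lowering it produces fewer, broader ones.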

ART Types

• ART 1

• ART 2

• ARTMAP (Predictive ART)

• Fuzzy ART

• Fuzzy ARTMAP

• Gaussian ART

• Gaussian ARTMAP

Summary

• Adaline

• Delta Rule

• Gradient Descent

• Backpropagation

• RBFN

• ART


Thank You