
A Brief Overview of Neural Networks

By

Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch

Copyright Notice: © Copyright 2004 Electrical and Computer Engineering Department, University of Missouri-Rolla. All rights reserved. Permission is freely given to receive and store this material for personal educational use by educational institutions only. Not to be reproduced, linked, distributed, or sold in any form or media without express written permission of the authors.


Overview

• Relation to Biological Brain: Biological Neural Network
• The Artificial Neuron
• Types of Networks and Learning Techniques
• Supervised Learning & Backpropagation Training Algorithm
• Learning by Example
• Applications
• Questions


Biological Neuron


Artificial Neuron

[Diagram: an artificial neuron. Each input is multiplied by a weight (W), the weighted inputs are summed (Σ), and the sum is passed through an activation function f(n) to produce the neuron's output.]
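The computation above can be sketched in a few lines of Python (an illustration added here, not part of the original slides); the sigmoid used as the activation is one common choice of f(n).

import math

def sigmoid(n):
    # A common choice of activation function f(n)
    return 1.0 / (1.0 + math.exp(-n))

def neuron(inputs, weights, activation=sigmoid):
    # Weighted sum of the inputs gives the net input n; the activation maps n to the output
    n = sum(x * w for x, w in zip(inputs, weights))
    return activation(n)

# Example: a neuron with four inputs and four weights
print(neuron([1.0, 0.5, -0.2, 0.3], [0.1, 0.4, 0.3, -0.2]))  # a value between 0 and 1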


Transfer Functions

SIGMOID: f(n) = 1 / (1 + e^(-n))

LINEAR: f(n) = n

[Plot: the sigmoid transfer function; the output rises from 0 toward 1 as the input increases.]


Types of Networks

• Multiple inputs and a single layer

• Multiple inputs and multiple layers


Types of Networks – Contd.

• Feedback (recurrent) networks


Learning Techniques

• Supervised Learning:

[Diagram: inputs from the environment feed both the actual system and the neural network. The expected output (from the actual system) and the actual output (from the network) are compared at a summing junction (+ / -); the resulting error is fed back to train the network.]
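As a toy illustration of this loop (added here; the target system y = 2x, the single-weight "network", and the learning rate are arbitrary choices, not from the slides), the error between the expected and actual outputs drives the weight adjustment:

def actual_system(x):
    # Stand-in for the real system the network should imitate (assumed for this sketch)
    return 2.0 * x

weight = 0.0          # the "network": a single adjustable weight
learning_rate = 0.1

for step in range(50):
    x = 1.0                                # input from the environment
    expected = actual_system(x)            # expected output
    actual = weight * x                    # actual (network) output
    error = expected - actual              # summing junction: + expected, - actual
    weight += learning_rate * error * x    # training: adjust the weight to reduce the error

print(weight)  # approaches 2.0, so the network mimics the actual system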


Multilayer Perceptron

[Diagram: a multilayer perceptron with inputs, a first hidden layer, a second hidden layer, and an output layer.]


Signal Flow: Backpropagation of Errors

[Diagram: function signals flow forward through the layers; error signals propagate backward.]


Learning by Example

• Hidden layer transfer function: sigmoid, F(n) = 1/(1 + exp(-n)), where n is the net input to the neuron.

Derivative: F'(n) = (output of the neuron)(1 - output of the neuron); this is the slope of the transfer function.

• Output layer transfer function: linear, F(n) = n; output = input to the neuron.

Derivative: F'(n) = 1 (see the sketch below).
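A short Python sketch (added for illustration) of these two transfer functions and their derivatives; the finite-difference check at the end confirms that the sigmoid slope equals (output)(1 - output).

import math

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

def sigmoid_slope(n):
    o = sigmoid(n)               # output of the neuron
    return o * (1.0 - o)         # F'(n) = (output)(1 - output)

def linear(n):
    return n                     # output = input to the neuron

def linear_slope(n):
    return 1.0                   # F'(n) = 1

# Numerical check of the sigmoid derivative at n = 0.5
h = 1e-6
numeric = (sigmoid(0.5 + h) - sigmoid(0.5 - h)) / (2 * h)
print(round(sigmoid_slope(0.5), 6), round(numeric, 6))  # both approximately 0.235004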


Learning by Example

• Training Algorithm: backpropagation of errors using gradient descent training.

• Colors:
  – Red: current weights
  – Orange: updated weights
  – Black boxes: inputs and outputs to a neuron
  – Blue: sensitivities at each layer


First Pass

[Diagram: first forward pass. Input = 1 and all weights = 0.5. Each neuron in the first hidden layer outputs sigmoid(0.5) = 0.6225; each neuron in the second hidden layer receives net input 0.6225 and outputs sigmoid(0.6225) = 0.6508; the linear output neuron produces 0.6508.]

Error = 1 - 0.6508 = 0.3492

G3 = (1)(0.3492) = 0.3492

G2 = (0.6508)(1 - 0.6508)(0.3492)(0.5) = 0.0397

G1 = (0.6225)(1 - 0.6225)(0.0397)(0.5)(2) = 0.0093

Gradient of a hidden neuron = G = slope of the transfer function × [Σ{(weight of the neuron to the next neuron) × (gradient of the next neuron)}]

Gradient of the output neuron = slope of the transfer function × error


Weight Update 1

New Weight = Old Weight + {(learning rate)(gradient)(prior output)}

w3: 0.5 + (0.5)(0.3492)(0.6508) = 0.6136

w2: 0.5 + (0.5)(0.0397)(0.6225) = 0.5124

w1: 0.5 + (0.5)(0.0093)(1) = 0.5047

(A code sketch of this first pass and update follows.)
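The following Python sketch is an illustration only; the layer structure (one input, two sigmoid neurons in each hidden layer, one linear output neuron, with a single shared weight value per layer) is inferred from the numbers on the slides. It reproduces the first pass and Weight Update 1.

import math

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

lr = 0.5                      # learning rate
w1, w2, w3 = 0.5, 0.5, 0.5    # one shared weight value per layer (initially 0.5)
x, target = 1.0, 1.0

# Forward pass
o1 = sigmoid(x * w1)          # each first-hidden-layer neuron  -> 0.6225
o2 = sigmoid(2 * w2 * o1)     # each second-hidden-layer neuron -> 0.6508
out = 2 * w3 * o2             # linear output neuron            -> 0.6508

# Error and gradients (sensitivities)
error = target - out                 # 0.3492
g3 = 1.0 * error                     # output neuron: slope (= 1) x error
g2 = o2 * (1 - o2) * g3 * w3         # hidden neuron: slope x sum(gradient x weight) -> 0.0397
g1 = o1 * (1 - o1) * (g2 * w2) * 2   # two next-layer neurons contribute              -> 0.0093

# Weight updates: new = old + learning rate x gradient x prior output
w3 += lr * g3 * o2            # -> 0.6136
w2 += lr * g2 * o1            # -> 0.5124
w1 += lr * g1 * x             # -> 0.5047

print(round(out, 4), round(w1, 4), round(w2, 4), round(w3, 4))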


Second Pass

[Diagram: second forward pass with the updated weights w1 = 0.5047, w2 = 0.5124, w3 = 0.6136. Input = 1; each first-hidden-layer neuron outputs sigmoid(0.5047) = 0.6236; each second-hidden-layer neuron receives net input 0.6391 and outputs sigmoid(0.6391) = 0.6545; the linear output is 0.8033.]

Error = 1 - 0.8033 = 0.1967

G3 = (1)(0.1967) = 0.1967

G2 = (0.6545)(1 - 0.6545)(0.1967)(0.6136) = 0.0273

G1 = (0.6236)(1 - 0.6236)(0.5124)(0.0273)(2) = 0.0066


Weight Update 2

New Weight = Old Weight + {(learning rate)(gradient)(prior output)}

w3: 0.6136 + (0.5)(0.1967)(0.6545) = 0.6779

w2: 0.5124 + (0.5)(0.0273)(0.6236) = 0.5209

w1: 0.5047 + (0.5)(0.0066)(1) = 0.508


Third Pass

[Diagram: third forward pass with the updated weights w1 = 0.508, w2 = 0.5209, w3 = 0.6779. Input = 1; each first-hidden-layer neuron outputs sigmoid(0.508) = 0.6243; each second-hidden-layer neuron receives net input 0.6504 and outputs sigmoid(0.6504) = 0.6571; the linear output is 0.8909.]


Weight Update Summary

Weights              w1      w2      w3      Output   Expected Output   Error
Initial conditions   0.5     0.5     0.5     0.6508   1                 0.3492
Pass 1 Update        0.5047  0.5124  0.6136  0.8033   1                 0.1967
Pass 2 Update        0.508   0.5209  0.6779  0.8909   1                 0.1091

W1: weights from the input to the first hidden layer
W2: weights from the first hidden layer to the second hidden layer
W3: weights from the second hidden layer to the output layer


Training Algorithm

• The process of feedforward and backpropagation continues until the required mean squared error has been reached.

• Typical MSE goal: 1e-5

• Other, more sophisticated backpropagation-based training algorithms are also available (a simple loop sketch follows below).
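An illustrative loop (reusing the example network from the sketch above, not an exact reproduction of any particular tool) that repeats feedforward and backpropagation until the mean squared error drops below a goal of 1e-5:

import math

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

lr, goal = 0.5, 1e-5
w1, w2, w3 = 0.5, 0.5, 0.5
x, target = 1.0, 1.0

for epoch in range(1, 10001):
    # Feedforward through the example network
    o1 = sigmoid(x * w1)
    o2 = sigmoid(2 * w2 * o1)
    out = 2 * w3 * o2
    error = target - out
    mse = error ** 2
    if mse < goal:
        break                          # stop once the required MSE is reached
    # Backpropagation of the error and weight update
    g3 = error
    g2 = o2 * (1 - o2) * g3 * w3
    g1 = o1 * (1 - o1) * (g2 * w2) * 2
    w3 += lr * g3 * o2
    w2 += lr * g2 * o1
    w1 += lr * g1 * x

print(epoch, round(out, 5))  # the output approaches the target of 1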


Why Gradient?

[Diagram: a neuron with inputs O1 and O2, weights W1 and W2, and a sigmoid output O3.]

O = output of a neuron, W = weight, N = net input to the neuron

N = (O1 × W1) + (O2 × W2)

O3 = 1/[1 + exp(-N)]

Error = Actual Output - O3

• To reduce the error, the change in each weight depends on:
  o the learning rate
  o the rate of change of the error w.r.t. the rate of change of that weight, which factors into:
     the gradient: the rate of change of the error w.r.t. the rate of change of N
     the prior output (O1 and O2)
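To make the rate-of-change idea concrete, the following check (added for illustration, not from the slides) uses the squared-error cost E = ½(Actual Output - O3)², which is the cost that the update rule descends; the analytic factor gradient × prior output matches a finite-difference derivative of E with respect to W1. The values of O1, O2, W1, W2, and the target are arbitrary examples.

import math

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

O1, O2 = 0.8, 0.3        # prior outputs feeding the neuron (example values)
W1, W2 = 0.4, 0.7        # weights (example values)
target = 1.0             # the desired ("actual system") output

def cost(w1):
    N = O1 * w1 + O2 * W2          # net input to the neuron
    O3 = sigmoid(N)                # neuron output
    return 0.5 * (target - O3) ** 2

# Analytic: gradient of the neuron = slope x error; dE/dW1 = -(gradient x prior output)
N = O1 * W1 + O2 * W2
O3 = sigmoid(N)
gradient = O3 * (1 - O3) * (target - O3)
analytic = -gradient * O1

# Finite-difference derivative of the error w.r.t. W1
h = 1e-6
numeric = (cost(W1 + h) - cost(W1 - h)) / (2 * h)

print(round(analytic, 6), round(numeric, 6))  # the two agree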


Gradient in Detail

• Gradient: the rate of change of the error w.r.t. the rate of change of the net input to the neuron

  o For output neurons: slope of the transfer function × error

  o For hidden neurons it is a bit more complicated: the error is fed back in terms of the gradients of the succeeding neurons,

    slope of the transfer function × [Σ (gradient of next neuron × weight connecting the neuron to the next neuron)]

    Why the summation? The neurons share the responsibility for the error.

  o Therefore: the credit assignment problem.


An Example

[Diagram: two inputs, 1 and 0.4, give first-layer outputs sigmoid(1) = 0.731 and sigmoid(0.4) = 0.598. With all weights = 0.5, each of the two output neurons receives net input (0.5)(0.731) + (0.5)(0.598) = 0.6645 and outputs sigmoid(0.6645) = 0.66. The targets for the two output neurons are 1 and 0.]

Error = 1 - 0.66 = 0.34

Error = 0 - 0.66 = -0.66

G (target = 1) = 0.66 × (1 - 0.66) × (0.34) = 0.0763 → increase this output, but by less

G (target = 0) = 0.66 × (1 - 0.66) × (-0.66) = -0.148 → reduce this output by more
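A two-line check of these gradients (added for illustration):

output = 0.66
for target in (1.0, 0.0):
    error = target - output
    g = output * (1 - output) * error   # slope of the sigmoid x error
    print(target, round(g, 4))          # 1.0 -> 0.0763 (increase less); 0.0 -> -0.1481 (reduce more)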


Improving Performance

• Changing the number of layers and the number of neurons in each layer.

• Varying the transfer functions.

• Changing the learning rate.

• Training for longer times.

• Changing the type of pre-processing and post-processing.


Applications

• Used in complex function approximation, feature extraction & classification, and optimization & control problems.

• Applicable in all areas of science and technology.


Application

• Detection and Classification of Impact Induced Damage in Composite Plates using Neural Networks: ANNs as classifiers

• Intelligent Strain Sensing on a Smart Composite Wing using Extrinsic Fabry-Perot Interferometric Sensors and Neural Networks: ANNs as function approximators


App 1: Damage Classification

• An important part of health monitoring of a composite structure.

• The study concentrated on low-velocity impact behavior.

• Low-velocity impact events can induce localized delamination, which significantly reduces the compression strength of composite structures.

• Damage from low-velocity impact events often cannot be detected by visual inspection techniques.

• The experimental determination of impact-induced strain profiles can help predict the extent of damage in composite plates.

• Visual (surface) inspection techniques may not indicate the severity and extent of internal damage such as cracking and delamination.

• Artificial neural networks can be incorporated for real-time monitoring of composites for damage detection and classification.


Type of Input Data

Strain Graphs

[Plot: strain (STRAIN X, STRAIN Y, and STRAIN XY) versus time in microseconds (0 to 4000).]

• The strain signal is sampled every 4 µs and stored in a file for each experiment performed. Only strain 'X' and 'Y' are used as inputs; they are downsampled and normalized (an illustrative sketch follows).
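The exact preprocessing is not given in the slides; the numpy sketch below only illustrates what downsampling and normalization could look like for one strain record. The record length, the decimation factor (chosen so the result has 503 elements, matching the input-vector size mentioned later), and the scaling to [-1, 1] are all assumptions.

import numpy as np

# One strain record sampled every 4 microseconds (placeholder data for illustration)
strain_x = np.random.uniform(-5e-4, 2e-3, size=4024)

decimation = 8                       # assumed downsampling factor
down = strain_x[::decimation]        # keep every 8th sample -> 503 elements

# Assumed normalization: scale to the range [-1, 1]
norm = down / np.max(np.abs(down))

print(down.shape, norm.min(), norm.max())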


Type of Outputs

A total of seven (7) classifications were decided upon by visually inspecting the composite plates and the kinetic energy of the falling mass. The classifications are coded using a Gray code.

CLASSIFICATION                                                   KINETIC ENERGY RANGE (J) (0.5·m·v²)   CODE
NO DAMAGE                                                        K.E. <= 0.1                           [0 0 0]
MINUTE SCRATCHES                                                 0.1 < K.E. <= 0.3                     [0 0 1]
MINOR PARALLEL SURFACE CRACKS                                    0.3 < K.E. <= 4                       [0 1 1]
SURFACE DISCOLORATION & SMALL MATRIX CRACKS                      4 < K.E. <= 8                         [0 1 0]
DISCOLORATION & LONG MATRIX CRACKS                               8 < K.E. <= 10                        [1 1 0]
MODERATE DISCOLORATION, DELAMINATION & LONG MATRIX CRACKS        10 < K.E. <= 12.5                     [1 1 1]
SEVERE DISCOLORATION, SEVERE DELAMINATION & LONG MATRIX CRACKS   K.E. > 12.5                           [1 0 1]
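For illustration only, the Gray-coded targets from the table can be kept in a small lookup used when building the 3-bit training targets (the dictionary below simply restates the table; the name DAMAGE_CODES is hypothetical):

# Gray-coded target vectors for the seven damage classes (from the table above)
DAMAGE_CODES = {
    "NO DAMAGE":                                                      [0, 0, 0],
    "MINUTE SCRATCHES":                                               [0, 0, 1],
    "MINOR PARALLEL SURFACE CRACKS":                                  [0, 1, 1],
    "SURFACE DISCOLORATION & SMALL MATRIX CRACKS":                    [0, 1, 0],
    "DISCOLORATION & LONG MATRIX CRACKS":                             [1, 1, 0],
    "MODERATE DISCOLORATION, DELAMINATION & LONG MATRIX CRACKS":      [1, 1, 1],
    "SEVERE DISCOLORATION, SEVERE DELAMINATION & LONG MATRIX CRACKS": [1, 0, 1],
}

print(DAMAGE_CODES["MINUTE SCRATCHES"])  # target vector for the output layer: [0, 0, 1]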


ANN Details

• A multi-layered feed-forward network is used.

• 10 neurons in the hidden layer.

• Since the 7 classifications are implemented in Gray code, there are 3 neurons in the output layer.

• The transfer function of both the hidden and output layers is “LOGSIG” (log-sigmoid).


Training Algorithm

• Backpropagation with several training algorithms was used to train the network.

• Levenberg-Marquardt could not be used because of the huge memory requirement for the 503-element input vector.

• The “Conjugate Gradient Method” and the “One Step Secant Method”, used to train the network, give good results, with the former converging earlier.

• The conjugate gradient method is suited for large input vectors. The one-step secant method requires more storage space than the conjugate gradient method and therefore takes longer to converge.

• The network was trained for 4000 epochs to obtain the required mean squared error.


Results

[Plots: training curves. Performance is 1.15403e-005 against a goal of 1e-005 after 4000 epochs (training in blue, goal in black); mean squared error versus epochs.]

Number of errors for one of the locations:

TRAINING ALGORITHM     PA (503 ELEMENTS)
AD. GRADIENT DESCENT   1
RESILIENT BP           2
ONE STEP SECANT        1


App 2: Strain Prediction

• Aerodynamic parameter prediction
  – Strain at different points on the wing

• Varying conditions
  – Angle-of-attack and air speed

• Neural network modeling

• Stall prediction


Experimentation

• Key strain points measured

• Variation in pressure: 0 to 460 Pa

• Variation in angle-of-attack: -1.627° to 4.31°


Neural Network Modeling

Neural network trained on two types of data:
• Max and min strain
• Average strain

[Plot: strain for sensor S1 (micro-strain, roughly -35 to -29) versus time, with the maximum, minimum, and average levels marked.]


Results

Average errors in the test set:

                  Sensor
                  S1       S2       S3
Max Strain        4.05%    0.71%    2.08%
Min Strain        8.35%    1.92%    0.94%
Average Strain    3.70%    2.03%    1.05%