adaptive NN background & applications – Steve Rogers

Artificial Neural Network tutorial with a few applications.
Overview
• Adaptive Neural Networks have become more popular due to their ability to approximate a large array of dynamics. The ability to adapt is accomplished by means of a set of tuning rules.
• Adaptive Neural Networks are used for control & system identification/prediction, table look-ups, fault detection, and optimization due to their generalization ability. Tuning rules for adaptive neural networks have featured Lyapunov-based approaches in recent years. Although these have some desirable qualities, they have led to complex tuning procedures. Tuning rules should be simple and provide for rapid, reliable convergence.
• Adaptive Neural Networks possess learning, adaptation, and classification capabilities.
Neural Networks Decision Points
• Advantages
 - Capable of learning complex nonlinear systems
 - Code/algorithms available
 - Can be used for either fixed or adaptive applications, including control & system identification/prediction, table look-ups, and fault detection
 - May handle arbitrary inputs, unlike linear systems
 - Can treat the system to be identified as a ‘black box’, i.e., doesn’t require knowledge of 1st principles
 - Can be used in conjunction with other conventional methods
• Disadvantages
 - Requires specialized knowledge related to the algorithm
 - Difficult to validate, especially adaptive systems, because the weights are not deterministic
 - Solution may be available from other conventional methods
 - Convergence may be to a local minimum & not the global solution
Neural Network Components
• Neurons: also known as nodes, the basic computing elements of a network
• Connections: define the relationships of neurons within the network
• Weights: used to determine whether a neuron activates; can also be used as the activation of a neuron
• Activation Function: the function used to determine the output of a neuron
• Adaptive Algorithm: controls the learning process of the network
A key point is that arbitrary measurements & derived measurements may be inputs.
Neural Network Structure
Common Activation Functions

Binary Sigmoid: $f(x) = \frac{1}{1 + e^{-x}}$

Bipolar Sigmoid: $f(x) = \frac{2}{1 + e^{-x}} - 1$

Gaussian Radial: $f(x) = e^{-x^2}$
A neuron in the NN takes the weighted sum of its inputs and uses it as the input signal to the neuron's activation function, which then produces an output signal.
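As a minimal Matlab sketch of this computation (the weights, bias, and input values below are illustrative, not from the slides):

% single neuron: weighted sum of inputs passed through an activation function
x = [0.5; -1.2; 2.0];          % input vector (illustrative)
w = [0.1; 0.4; -0.3];          % connection weights (illustrative)
b = 0.2;                       % bias term
n = w'*x + b;                  % weighted sum (net input)
f = @(n) 1./(1 + exp(-n));     % binary sigmoid activation
out = f(n)                     % neuron output signal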
Adaptive Radial Basis Function Neural Networks (RBFN)
• RBFNs are two-layer networks whose output is a linear combination of the hidden layer functions
• Typical RBFN equations are:

$$ f(x) = w_0 + \sum_{k=1}^{h} w_k \exp\left( -\frac{\| x - \mu_k \|^2}{2\sigma_k^2} \right) $$

• where x is the input vector of the network, h indicates the total number of hidden neurons, and μk and σk refer to the center and width of the kth hidden neuron. ||…|| is the Euclidean norm. The function f(.) is the output of the RBFN, which represents the network approximation to the actual output. The coefficient wk is the connection weight of the kth hidden neuron to the output neurons and w0 is the bias term.
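A minimal Matlab sketch of this forward pass (the sizes, centers, widths, and weights below are illustrative):

% RBFN forward pass: f(x) = w0 + sum_k wk*exp(-||x - mu_k||^2/(2*sigma_k^2))
x = [0.3; -0.7];               % input vector
h = 4;                         % total number of hidden neurons
mu = randn(2, h);              % centers of the hidden neurons
sigma = ones(1, h);            % widths of the hidden neurons
w = randn(1, h); w0 = 0.1;     % connection weights & bias term
phi = zeros(h, 1);
for k = 1:h
  phi(k) = exp(-norm(x - mu(:,k))^2/(2*sigma(k)^2));
end
f = w0 + w*phi                 % network approximation to the actual output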
Typical RBFN Architecture
Adaptive Update Schemes
• Any good identification scheme that utilizes the RBFN should satisfy two criteria:
1) the parameters of the RBFN are tuned properly to satisfy stability and performance needs
2) the parameter adaptive law should be efficient enough to allow real-time operation
• The RAN (resource allocating network) was developed to tune all the RBF parameters and incorporated a growth feature; MRAN also includes a pruning feature. Other tuning rules only adjusted the connection weight vector and left the center and width vectors fixed.
• A Lyapunov-derived tuning rule is:

$$ \theta(n+1) = \theta(n) + \gamma\, \zeta(n)\, P\, e(n) $$

• where θ is the vector of parameters to be tuned, including the connection weights, centers, and widths; γ is the user-selected learning rate (a positive scalar); and ζ(n) is the gradient of the function with respect to the parameter vector evaluated at θ(n).
• P is the solution of the Lyapunov equation $A^T P + P A = -Q$. Q is a user-selected positive definite matrix; A is a user-selected Hurwitz (stable, i.e., all eigenvalues in the open left half plane) matrix. e(n) is the error vector.
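If the Control System Toolbox is available, P can be computed with Matlab's lyap; a small sketch with illustrative A and Q:

% solve A'*P + P*A = -Q for the tuning rule above
A = [-1 0.5; 0 -2];    % user-selected Hurwitz (stable) matrix (illustrative)
Q = eye(2);            % user-selected positive definite matrix
P = lyap(A', Q);       % lyap(M,Q) solves M*X + X*M' + Q = 0; M = A' gives A'*P + P*A = -Q
norm(A'*P + P*A + Q)   % check: should be near zero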
Update Schemes

• Another common approach for tuning rules is a gradient update of all the parameters (widths, centers, & connection weights):

$$ \sigma_k(n+1) = \sigma_k(n) + \gamma\, e(n)\, \zeta_{\sigma_k}(n) $$
$$ \mu_k(n+1) = \mu_k(n) + \gamma\, e(n)\, \zeta_{\mu_k}(n) $$
$$ w_k(n+1) = w_k(n) + \gamma\, e(n)\, \zeta_{w_k}(n) - \rho\, w_k(n) $$

where the hidden neuron outputs are $\exp(-\| x - \mu_k \|^2 / (2\sigma_k^2))$ and the error is $e_i = y_i - \hat{y}_i$.

• The 3rd term in the weight update moves the discrete pole away from the unit circle, i.e., away from being a pure integrator. Although this may slow down convergence, it improves stability, and should remove oscillations.
• Note that all parameters are tuned in the above gradient approach.
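A one-step Matlab sketch of the weight update with the leakage term (all values are illustrative):

% one gradient step with the stabilizing 3rd (leakage) term
w = [0.2 -0.4 0.7];            % current connection weights (illustrative)
phi = [0.9; 0.1; 0.5];         % hidden neuron (basis function) outputs
y = 1.0; yhat = w*phi;         % measured & estimated outputs
e = y - yhat;                  % error e = y - yhat
gamma = 0.05; rho = 0.01;      % learning rate & leakage gain (illustrative)
w = w + gamma*e*phi' - rho*w;  % -rho*w moves the discrete pole from 1 to 1 - rho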
Radial Basis Function Block Diagrams
• The bottom part of the figure shows how a control structure may be inserted into the linear combiner (LC) update. The simplest control structure is the standard learning rate γ. A proportional integral (PI) structure is the next simplest controller. It has the form:

$$ K_p \frac{s + a}{s} $$

which gives another integrator plus a zero. Note also that Kp may be combined with the learning rate γ.
• Any control structure may be used, including lead-lag, PID, servo-type PID, etc.

Control Circuit for LC update
RBF With Controller Update Mechanism
[Figure: RBF network with a controller-based update mechanism. The input x passes through the sigmoid hidden layer; the error e = y − ŷ is fed through the update controller (e.g., the PI block Kp(s+a)/s or a lead-lag Kl(s+a)/(s+b)) and an integrator 1/s to adjust the linear combiner weights.]
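A discrete-time Matlab sketch of driving the weight update through a PI structure rather than a plain learning rate (the gains and target weights are illustrative):

% weight update through a PI structure instead of a plain learning rate
Kp = 0.05; Ki = 0.005;           % proportional & integral gains (illustrative)
b = [0.5 0.2 -0.1];              % target combiner weights (illustrative)
w = zeros(1,3); integ = zeros(1,3);
for n = 1:500
  phi = rand(3,1);               % basis (hidden layer) outputs at step n
  e = b*phi - w*phi;             % linear combiner error
  g = e*phi';                    % error-driven gradient signal
  integ = integ + Ki*g;          % integral path of the PI controller
  w = w + Kp*g + integ;          % PI-shaped step into the weight integrator
end
disp(w)                          % w approaches b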
Optimization Gain Results
[Figure: optimization gain results for the controller structures K(s+a)/(s+b) and Kp + Ki/s.]
Data Plots
[Figure: data plots for the Kp + Ki/s and Kp(s+a)/(s+b) controller structures.]
Use of Neural Networks for Control Enhancements of Existing Systems
[Diagram: an NN add-on attached around an existing controller and gas turbine, tapping the existing set points & measurements.]
A neural network may be added to an existing system & make use of the current data stream to enhance it. Because the add-on is non-intrusive to the existing system, it takes advantage of the existing control system's capabilities while focusing on any deficiencies of the existing system. Most current research NN prototypes are handled in this fashion.
NN Control of Systems with Jumps: friction, deadzones, backlash, & hysteresis

• Add-on to existing continuous controller
• Modify the usual activation function by adding a jump function
Common Activation Functions (continuous)

Binary Sigmoid: $f(x) = \frac{1}{1 + e^{-x}}$

Bipolar Sigmoid: $f(x) = \frac{2}{1 + e^{-x}} - 1$

Gaussian Radial: $f(x) = e^{-x^2}$
Jump functions (one per activation function above, zero for x < 0):

$$ g_1(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{1+e^{-x}} & x \ge 0 \end{cases} \qquad g_2(x) = \begin{cases} 0 & x < 0 \\ \frac{2}{1+e^{-x}} - 1 & x \ge 0 \end{cases} \qquad g_3(x) = \begin{cases} 0 & x < 0 \\ e^{-x^2} & x \ge 0 \end{cases} $$
System Identification
[Block diagram: the input x[n] feeds both the unknown system, producing d[n], and the adaptive component, producing y[n]; the error e[n] = d[n] − y[n] drives the adaptation.]
The adaptive component successfully models the system when e[n] converges to a small value. If the model coefficients change drastically, an anomaly may be declared.
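A minimal Matlab sketch of this loop, with an LMS-adapted FIR filter standing in for the adaptive component (the "unknown" system and all sizes/gains are illustrative):

% adaptive system identification: e[n] = d[n] - y[n] drives adaptation
N = 1000; M = 4;               % samples & adaptive filter length
b_true = [0.5 -0.3 0.2 0.1];   % unknown system (FIR for illustration)
w = zeros(1, M); mu = 0.05;    % adaptive component weights & step size
x = randn(1, N); xbuf = zeros(1, M);
for n = 1:N
  xbuf = [x(n) xbuf(1:end-1)]; % tapped delay line of the input x[n]
  d = b_true*xbuf';            % unknown system output d[n]
  y = w*xbuf';                 % adaptive component output y[n]
  e = d - y;                   % error e[n]
  w = w + mu*e*xbuf;           % LMS update; w -> b_true as e[n] shrinks
end
disp(w)                        % converged model coefficients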
System ID with Adaptive Neural Networks

• Adaptive components are usually used in conjunction with conventional components because of instability concerns. They are used to ‘pick up the slop’ remaining from the conventional component.
• Multi-Layer Perceptrons (MLP) may be used for system identification or prediction.
• Numerous structures & update law options exist.
• Sigma-Pi structure: the Ci are input vectors, β is a Kronecker product, W is a set of weights, Ue is an error function (in this case a PI control output), and G/B are defined by the application.
• Single Hidden Layer structure: W/V are weights to be updated adaptively, & the other parameters are defined by the application system.
• The MLP used here is explained in the following sheets.
Sigma-Pi structure:

$$ \text{output} = W^T \beta, \qquad \beta = C_1 \otimes C_2 \otimes C_3 $$
$$ \dot{W} = G\, \beta\, U_e - B\, W, \qquad U_e = K_p e + K_i \int e\, dt $$

Single Hidden Layer structure:

$$ z_j = b_v + \sum_{i=1}^{n_1} \hat{v}_{ij} x_i, \qquad \text{output} = b_w + \sum_{k=1}^{n_2} \hat{w}_k \sigma(z_k) $$

where the hatted weights $\hat{W}$, $\hat{V}$ are the adaptively updated estimates.
System ID Example with MLP

MLP equations (two inputs, three hidden nodes, one output):

$$ c_{1j} = f(n_{1j}) = f(v_{j1} p_1 + v_{j2} p_2 + \theta_j), \quad j = 1, 2, 3 $$
$$ a = f(n_{21}) = f(w_1 c_{11} + w_2 c_{12} + w_3 c_{13} + \lambda) $$

MLP general equations:

$$ a = f\big( w^T f(V p + \theta) + \lambda \big) $$

MLP general update laws:

$$ F(k) = \text{error}(k) = t(k) - a(k) $$
$$ w_i(k+1) = w_i(k) + \alpha F(k)\, \frac{\partial a}{\partial w_i}, \qquad v_{ij}(k+1) = v_{ij}(k) + \alpha F(k)\, \frac{\partial a}{\partial v_{ij}} $$

with analogous updates for the biases θj and λ.

[Figure: MLP (Multi-Layer Perceptron) diagram. Inputs p1 and p2 feed three hidden nodes n11, n12, n13 through weights v11..v32 with biases θ1, θ2, θ3; the hidden outputs c11, c12, c13 feed the output node n21 through weights w1, w2, w3 with bias λ, producing the output a.]

The equations completely define the MLP. Note that α is a scalar learning rate. The Matlab implementation is shown in the following sheet.
Matlab Code

% mlp_example.m
%clear *
N = 500;
cycles = 4;
x = sin(cycles*2*pi*[0:N-1]/N);
lb = -0.7;
ub = 0.6;
gain = 2;
init = 1;
for i = 1:N
  if x(i) > ub, y(i) = gain*x(i)^5;
  elseif x(i) < lb, y(i) = gain*x(i)^5;
  else y(i) = sign(x(i))*x(i)^2;
  end
  yhat(i) = MLP_recurArray([init, x(i), y(i)]);
  init = 0;
end
figure(1)
subplot(211)
err = y(:) - yhat(:);
errnorm = norm(err);
plot([x(:), y(:), yhat(:)]), grid on
title(['MLP estimation of sinusoid, error = ', num2str(errnorm)])
subplot(212)
plot(err), grid on
ylabel('error')
function yout = MLP_recurArray(in)
%
% MLP backpropagation learning for a single hidden layer
% W is the output layer weight vector
% V holds the hidden layer weights
% With N interior nodes the MLP NN equations are:
%   O = W*tanh(V*I);
% and the two (gradient) update equations are:
%   W = W + mu*err*tanh(V*I)';
%   V = V + mu*err*(sech(V*I).^2).*W'*I';
% N is the number of interior nodes
% m is the number of inputs including the bias signal
persistent X
N = 10;
m = 5;
my = 5;
init = in(1);
u = in(2);
y = in(3);
% Initialize W & V
if init == 1 || isempty(X)
  X.W = zeros(1,N);
  X.dW = X.W;
  X.V = rand(N, m+my+N)/10000;
  X.dV = zeros(size(X.V));
  X.in = [1; u*ones(m-1,1); y*ones(my,1); zeros(N,1)];
  X.predslow = y;
end
mu = .09;
bet = .1;
G = tanh(X.V*X.in);                 % hidden layer outputs
out = X.W*G;                        % network output
err = y - out;
nextW = X.W + mu*err*G' + bet*X.dW; % output weight update with momentum
sec2h = sech(X.V*X.in);
sec2h = sec2h.*sec2h;               % tanh derivative
nextV = X.V + mu*err*sec2h.*X.W'*X.in' + bet*X.dV; % hidden weight update
% shift the input/output delay lines & feed the hidden outputs back
X.in = [1; u; X.in(2:m-1); y; X.in(2+m+1:2+m+my-1); G];
X.dW = nextW - X.W;
X.dV = nextV - X.V;
X.W = nextW;
X.V = nextV;
yout = out;
MLP function code
Results

[Figure: top plot - MLP estimation of sinusoid, error = 10.3689, showing x, y, & yhat over 500 samples; bottom plot - errornorm & wtnorm histories.]
x is the input sinusoid, y is the output signal, which is a nonlinear combination of sinusoids, & yhat is the MLP tracking signal. The bottom plot shows the stability & error performance. Fluctuation of the weights indicates that a better model structure is needed.
Predictive Filters
The block entitled adaptive filter may be replaced by an arbitrarily structured filter. The adaptive filter copy is updated each time step. This same concept can be applied to an adaptive neural network. Note that many adaptive components (unless otherwise guaranteed stable) are used in conjunction with conventional components to ensure the stability of the adaptive component.
[Block diagram: the signal passes through a delay Z^(−n) into the adaptive filter, whose output is the signal estimate; the error (signal − estimate) drives adaptation. The adaptive filter copy runs on the undelayed signal to produce the signal prediction. Z is the discrete delay operator; n is the number of delays.]
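A Matlab sketch of this arrangement with an LMS filter as the adaptive block (the delay, sizes, and test signal are illustrative):

% n-step-ahead prediction: adapt on the delayed input, predict with a copy
N = 2000; M = 8; nd = 5;       % samples, filter length, prediction delay
mu = 0.01; w = zeros(1, M);
s = sin(2*pi*0.01*(1:N)) + 0.05*randn(1, N);  % signal to predict
pred = zeros(1, N);
for k = M+nd:N
  xdel = s(k-nd:-1:k-nd-M+1);  % delayed samples into the adaptive filter
  e = s(k) - w*xdel';          % error vs. the current signal
  w = w + mu*e*xdel;           % update the adaptive filter
  xcur = s(k:-1:k-M+1);        % current samples into the filter copy
  pred(k) = w*xcur';           % prediction of s(k+nd)
end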
Fault Detection Concepts

• Actuator nonlinearities – deadband, backlash, & hysteresis. Conventional and adaptive neural networks can estimate the jump discontinuities.
• Instrument faults – excessive noise, dead sensor, drift, and bias. Simple statistics for the 1st two & system ID for the last two.
• Parameter estimation for process fault detection. Changes in coefficients may be used for fault detection.
• Hopfield neural networks may be used for principal component analyses (PCA), which is used in data driven fault detection.
Continuous Instrumentation Diagnostics for Accuracy/Precision & Life-Cycle Maintenance
• Sensor faults. Monitoring consists of data validation, or cross-checking, of sensor data. There are 4 types of anomalies from typical analog sensors: dead, excessive noise, drift, & offset.
• A dead sensor or excessive noise can be detected & isolated using the standard deviation of the individual sensor data stream. The standard deviation is compared to the statistics of common sensors throughout the plant.
• Drift or offset may also be caused by something in the process being measured; therefore, detection/isolation must be model based.
• Drift or offset fault detection model equations can be based on performance criteria, heat/mass balance equations, or other model structures. Fault detection parameters are derived from the equations. Any change indicates an anomaly which can then be investigated. Kalman filters are frequently used to estimate the fault parameters in stochastic systems, although other nonlinear systems including neural networks may be used as well. Typical equations and fault indicators derived from an electric pump system & heat exchanger follow.
Instrument Fault Types: Excessive Noise & Dead Sensor
[Figure: sensor value vs. time in seconds; bottom - excessive noise fault, top - dead sensor fault.]
Technical Approach for dead/noisy sensors: Sensor Fault Detection Filter Banks

[Block diagram: raw signal → low pass filter → smoothed signal; Abs(raw − smoothed) residual → low pass filter → 's'. A filter bank of such channels turns the raw data into fault indicators for the fault decision logic.]

[Figure: typical distribution of 's' for a group of sensors, with thresholds Sds (possible dead sensor) and Snf (possible noise failure).]

This algorithm will process the raw engineering-converted data that comes from each sensor. ‘s’ is the output signal that is sent to the decision logic. Sensors will be grouped by type, service, criticality, etc., as appropriate. The fault indicator thresholds Sds & Snf will be determined & refined by operational experience. Note that the low pass filter blocks may be of arbitrary structure and may be fixed or adaptive neural networks or linear networks.
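A Matlab sketch of one channel of the filter bank, using first-order low-pass filters (the smoothing constant and simulated data are illustrative; Sds & Snf come from operational experience):

% one sensor channel: smoothed signal, residual, and noise statistic 's'
alpha = 0.05;                   % low-pass filter constant (illustrative)
raw = 2 + 0.3*randn(1, 500);    % raw engineering-converted sensor data
smoothed = zeros(size(raw)); s = zeros(size(raw));
for k = 2:length(raw)
  smoothed(k) = smoothed(k-1) + alpha*(raw(k) - smoothed(k-1));
  resid = abs(raw(k) - smoothed(k));        % rectified residual
  s(k) = s(k-1) + alpha*(resid - s(k-1));   % low-pass filtered -> 's'
end
% decision logic: 's' below Sds suggests a dead sensor,
% 's' above Snf suggests a noise failure (thresholds set operationally)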
Instrument Fault Types: Drift & Offset
[Figure: sensor value vs. time in seconds; bottom - offset fault, top - drift fault.]
Proposed Observer Solution for Drift/Offset Sensor Fault
[Block diagram: the input u drives a process with q sensors y1..yq. A bank of observers, each fed u and a different subset of the outputs, produces estimates yij of all the measurements; per-sensor logic blocks compare the estimates and feed decision blocks.]
• The observers will process the raw engineering data ‘yi’ (output measurements) and ‘u’ (input measurements) that come from each sensor.
• An estimated value of all the output measurements is sent to a set of rules for decision making.
• If a residual is greater than a threshold, a fault is indicated.
• This is the basis for an approach using neural networks. Note that the observer blocks may have arbitrary structures. Each observer is made unique by varying the input vectors; therefore, the differences between them become fault indicators.
Possible States of Sensor Health: nominal health, suspect health, failed health.
Pump 1 Fluid Schematic with sensors & formulas
[Schematic: pump 1 fluid loop with inlet, filter (dpf), gas trap (dpg), accumulator, quantity sensor, pump (dpp), temperature sensor (T), check valve, flowmeter (fm), outlet, and absolute pressure sensor (psia). Sensor tags: LATI02SR0201P, LATI02SR0501Q, LATI02SR0401P, LATI02SR0001T, LATI02SR0101P, LATI02FM0001R, LATI02FM0002R, LATI02SR0301P.]
Pump Indicators:
1) Zf = dpf/pph^2 (filter resistance)
2) Zg = dpg/pph^2 (gas trap resistance)
3) Impeller specific speed = rpm*pph^0.5/(dpp^0.75)
4) Suction specific speed = rpm*pph^0.5/(psia^0.75)
   electric watts1 = amps*volts
   electric watts2 = amps*4.3825*krpm
   hydraulic watts = pph*psid/(60*8.34*2.298)
5) pump efficiency = hydraulic watts/electric watts
6) a1 = dpp - function(Impeller specific speed)*pph
   a1 should be close to zero except in a fault condition.
7) vc = Amps/krpm (pump ratio)
8) load = pph*dpp/(krpm*krpm) (pump load ratio)
where the left hand sides of the above 8 equations are indicator parameters.
Amps = LATI21FC0001C/10
volts = LATI21FC0001V
krpm = LATI21FC0003U/(255*20000)
Pump Dynamic Equations are used for estimation:
1) Ampsdot = -(R2/L2)*Amps - (psi/L2)*krpm
2) krpmdot = (psi/J)*Amps - (hth/J)*krpm
3) dppdot = hnn*pph^2 + hww*krpm^2
4) pphdot = -(hrr/ab)*pph^2 + dpp/ab
where R2, L2, psi, J, hth, hnn, hww, hrr, and ab are indicator parameters which can be determined.
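A Matlab sketch computing a few of the static indicator parameters from one record of sensor data (the sensor values are placeholders; the constants are from the formulas above):

% pump indicator parameters from one data record (placeholder values)
dpf = 1.2; dpg = 0.4; dpp = 30; psia = 14.7;  % pressures
pph = 500; rpm = 12000; amps = 3.2; volts = 115;
psid = dpp;                                   % pump differential pressure
Zf  = dpf/pph^2;                              % filter resistance
Zg  = dpg/pph^2;                              % gas trap resistance
iss = rpm*pph^0.5/dpp^0.75;                   % impeller specific speed
sss = rpm*pph^0.5/psia^0.75;                  % suction specific speed
ew  = amps*volts;                             % electric watts
hw  = pph*psid/(60*8.34*2.298);               % hydraulic watts
pe  = hw/ew;                                  % pump efficiency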
ISS MTL & LTL PPA Equations Table

Pump Indicators               | Algorithm                   | PPA Area of fault detection | Sensors
1) Zf                         | Adaptive or low pass filter | filter performance          | dpf, pph
2) Zg                         | Adaptive or low pass filter | gas trap performance        | dpg, pph
3) impeller spec. speed (iss) | Adaptive or low pass filter | pump performance            | krpm, pph, dpp
4) suction spec. speed (sss)  | Adaptive or low pass filter | pump performance            | krpm, pph, psia
5) pump efficiency (pe)       | Adaptive or low pass filter | pump performance            | Amps, volts, pph, dpp
6) a1                         | Adaptive or low pass filter | pump performance            | dpp, krpm, pph
7) vc                         | Adaptive or low pass filter | pump motor performance      | amps, krpm
8) load                       | Adaptive or low pass filter | pump motor performance      | pph, dpp, krpm

Pump Dynamic Indicators       | Algorithm                   | Area of fault detection     | Sensors
1) R2, L2, psi                | Adaptive filter             | motor performance           | amps, krpm
2) psi, J, hth                | Adaptive filter             | motor performance           | amps, krpm
3) hnn, hww                   | Adaptive filter             | pump performance            | pph, krpm
4) hrr, ab                    | Adaptive filter             | pump performance            | pph, dpp

[Table: Sensor Fault Matrix - PPA equations; asterisks mark which sensors (dpf, pph, dpg, krpm, dpp, psia, Amps, volts, T) each indicator parameter (Zf, Zg, iss, sss, pe, a1, vc, load, R2, L2, psi, J, hth, hnn, hww, hrr, ab) depends on.]
Note that the algorithms may be adaptive neural networks as well as linear adaptive filters.
On-Line Estimation of deadband, backlash, & hysteresis In Control Element
[Deadband schematic: v → deadband (break points bl, br; slopes ml, mr) → u → Control Element & Plant → y]

Deadband Equations:
u(t) = mr(v(t) - br)   if v(t) >= br
u(t) = 0               if bl < v(t) < br
u(t) = ml(v(t) - bl)   if v(t) <= bl
[Backlash schematic: v → backlash (break points cl, cr; slope m) → u → Control Element & Plant → y]

Backlash Equations:
u(t) = m(v(t) - cl)    if v(t) <= cl
u(t) = m(v(t) - cr)    if v(t) >= cr
u(t - 1)               if cl < v(t) < cr
The hysteresis schematic is more complicated than deadband or backlash & is not shown here. The general approach for parameter estimation is shown below. The types of nonlinearities are usually known by inspection.

[Block diagram: v → nonlinearity → u → Control Element & Plant → y; a parameter estimator (Kalman filter) processes v & y to estimate mr, ml, m, br, bl, cl, cr, etc.]

The estimated deadband parameters may be used for 2 purposes:
• on-line control loop audits
• plant control
Deadband Model Parameter Estimation
Backlash Model Parameter Estimation
Matlab code
% matlab deadband code
if udb(i) > 0
  [p1,Pdb,err(i)] = KalmanF(p1,Pdb,udb(i),[v(i) -1 0 0]');
end
if udb(i) < 0
  [p1,Pdb,err(i)] = KalmanF(p1,Pdb,udb(i),[0 0 v(i) -1]');
end

function [param,P,err] = KalmanF(param,P,y,x)
% recursive (Kalman filter) parameter estimation update
niter = 10;
Q = 0.05*eye(size(P));
for i = 1:niter
  err = y - x'*param;               % innovation
  k = P*x/(1 + x'*P*x);             % Kalman gain
  P = (eye(size(P)) - k*x')*P + Q;  % covariance update
  param = param + k*err;            % parameter update
end
Examples of applications for active control of noise and vibration
• Control of aircraft interior noise by use of lightweight vibration sources on the fuselage and acoustic sources inside the fuselage.
• Reduction of helicopter cabin noise by active vibration isolation of the rotor and gearbox from the cabin.
• Reduction of noise radiated by ships and submarines by active vibration isolation of interior mounted machinery (using active elements in parallel with passive elements) and active reduction of vibratory power transmission along the hull, using vibration actuators on the hull.
• Reduction of internal combustion engine exhaust noise by use of acoustic control sources at the exhaust outlet or by use of high intensity acoustic sources mounted on the exhaust pipe and radiating into the pipe at some distance from the exhaust outlet.
• Reduction of low frequency noise radiated by industrial noise sources such as vacuum pumps, forced air blowers, cooling towers and gas turbine exhausts, by use of acoustic control sources.
• Lightweight machinery enclosures with active control for low frequency noise reduction.
• Control of tonal noise radiated by turbo-machinery (including aircraft engines).
• Reduction of low frequency noise propagating in air conditioning systems by use of acoustic sources radiating into the duct airway.
• Reduction of electrical transformer noise, either by using a secondary, perforated lightweight skin surrounding the transformer and driven by vibration sources, or by attaching vibration sources directly to the transformer tank. Use of acoustic control sources for this purpose is also being investigated, but a large number of sources is required to obtain global control.
• Reduction of noise inside automobiles using acoustic sources inside the cabin and lightweight vibration actuators on the body panels.
• Active headsets and earmuffs.
Acoustic Concept 1

[Diagram: a noise source emits the primary noise; a reference microphone supplies x(n) to the ANC, which drives the canceling loudspeaker with y(n); the error microphone measures the residual e(n).]
ANC is active noise control, which includes an adaptive component. Main components are:
• an error microphone for each direction
• a reference microphone
• a canceling loudspeaker for each direction
y(n) is the loudspeaker signal that minimizes the e(n) signal.
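A Matlab sketch of the adaptive component as a plain LMS canceller (a fielded ANC would typically use filtered-x LMS with a secondary-path model; the paths and sizes here are illustrative):

% feedforward cancellation: x(n) from reference mic, e(n) from error mic
N = 4000; M = 16; mu = 0.005;
x = randn(1, N);                      % reference microphone signal x(n)
prim = filter([0 0 0.8 0.4], 1, x);   % primary noise path (illustrative)
w = zeros(1, M); xbuf = zeros(1, M);
for n = 1:N
  xbuf = [x(n) xbuf(1:end-1)];        % tapped delay line of x(n)
  y = w*xbuf';                        % canceling loudspeaker signal y(n)
  e = prim(n) - y;                    % error microphone signal e(n)
  w = w + mu*e*xbuf;                  % adapt to minimize e(n)
end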
Acoustic Concept 2

[Diagram: a noise source emits the primary noise; the ANC drives the canceling loudspeaker with y(n) using only the error microphone signal e(n); there is no reference microphone.]
ANC is active noise control. The ANC includes an adaptive algorithm that learns the system in order to create an ‘anti-noise’ signal for the canceling loudspeaker. Components are:
• an error microphone for each direction
• a canceling loudspeaker for each direction
y(n) is the loudspeaker signal that minimizes the e(n) signal.