Slide 1
Adaptive Neural Fuzzy Inference Systems (ANFIS): Analysis and Applications
©Copyright 2002 by Piero P. Bonissone
Slide 2
Outline
• Objective
• Fuzzy Control
  – Background, Technology & Typology
• ANFIS:
  – as a Type III Fuzzy Control
  – as a fuzzification of CART
  – Characteristics
  – Pros and Cons
  – Opportunities
  – Applications
  – References
Slide 3
ANFIS Objective
• To integrate the best features of Fuzzy Systems and Neural Networks:
  – From FS: representation of prior knowledge as a set of constraints (network topology) to reduce the optimization search space
  – From NN: adaptation of backpropagation to the structured network to automate FC parametric tuning
• ANFIS applications to synthesize:
  – controllers (automated FC tuning)
  – models (to explain past data and predict future behavior)
Slide 4
FC Technology & Typology
• Fuzzy Control
  – A high-level representation language with local semantics and an interpreter/compiler to synthesize non-linear (control) surfaces
  – A universal functional approximator
• FC Types
  – Type I: RHS is a monotonic function
  – Type II: RHS is a fuzzy set
  – Type III: RHS is a (linear) function of state
Slide 5
FC Technology (Background)
• Fuzzy KB representation
  – Scaling factors, termsets, rules
• Rule inference (generalized modus ponens)
• Development & Deployment
  – Interpreters, tuners, compilers, run-time
  – Synthesis of control surface
• FC Types I, II, III
Slide 6
FC of Type II, Type III, and ANFIS
• Type II Fuzzy Controllers must be tuned manually
• Type III Fuzzy Controllers (Takagi-Sugeno type) have automatic Right Hand Side (RHS) tuning
• ANFIS provides both:
  – RHS tuning, by implementing the TSK controller as a network
  – LHS tuning, by using back-propagation
Slide 7
ANFIS Network
[Figure: the ANFIS network. Inputs x1 and x2 (layer 0) feed the IF-part membership nodes (layer 1), which feed the rule nodes "&" (layer 2) and the normalization nodes "N" (layer 3); the THEN-part nodes (layer 4) weigh each rule's output by its normalized firing strength ω̄i, and a summation node Σ produces the output (layer 5). Layers: 0 1 2 3 4 5.]
Slide 8
ANFIS Neurons: Clarification Note
• Note that neurons in ANFIS have different structures:
  – Values [ membership function defined by parameterized soft trapezoids (Generalized Bell Functions) ]
  – Rules [ differentiable T-norm, usually product ]
  – Normalization [ sum and arithmetic division ]
  – Functions [ linear regressions and multiplication with ω̄, i.e., the normalized firing strengths ]
  – Output [ algebraic sum ]
Slide 9
ANFIS as a Generalization of CART
• Classification and Regression Tree (CART)
  – Algorithm defined by Breiman et al. in 1984
  – Creates a binary decision tree to classify the data into one of 2^n linear regression models, minimizing the Gini index for the current node c:

    Gini(c) = 1 − Σj pj²

  where:
  • pj is the probability of class j in node c
  • Gini(c) measures the amount of "impurity" (incorrect classification) in node c
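To make the impurity computation concrete, here is a minimal Python sketch; the function name and the example probabilities are illustrative, not from the slides:

```python
def gini(class_probs):
    """Gini impurity of a node c: Gini(c) = 1 - sum_j p_j**2."""
    return 1.0 - sum(p * p for p in class_probs)

# A pure node has zero impurity; an even two-class split is maximally impure.
assert gini([1.0, 0.0]) == 0.0
assert gini([0.5, 0.5]) == 0.5
```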
Slide 10
CART Problems
• Discontinuity
• Lack of locality (sign of coefficients)
Slide 11
CART: Binary Partition Tree and Rule Table Representation

Partition Tree: [Figure: the root node tests x1 against a1; each branch then tests x2 against a2, reaching the leaves f1(x1,x2), f2(x1,x2), f3(x1,x2), f4(x1,x2).]

Rule Table:
  x1        x2        y
  x1 ≤ a1   x2 ≤ a2   f1(x1,x2)
  x1 ≤ a1   x2 > a2   f2(x1,x2)
  x1 > a1   x2 ≤ a2   f3(x1,x2)
  x1 > a1   x2 > a2   f4(x1,x2)
Slide 12
Discontinuities Due to Small Input Perturbations
Let's assume two inputs, I1 = (x11, x12) and I2 = (x21, x22), such that:
  x11 = (a1 − ε),  x21 = (a1 + ε),  x12 = x22 < a2
Then I1 is assigned f1(x11, x12) while I2 is assigned f3(x21, x22): an arbitrarily small perturbation ε around the split point a1 switches the output from one regression model to another.
[Figure: the crisp membership μ(a1 ≤ x1) steps from 0 to 1 at a1, so x11 yields y1 = f1(x11, ·) while x21 yields y3 = f3(x21, ·); partition tree repeated from Slide 11.]
Slide 13
Takagi-Sugeno (TS) Model
• Combines fuzzy sets in the antecedents with a crisp function in the output:
  IF (x1 is A) AND (x2 is B) THEN y = f(x1, x2)
• Example rule set:
  IF X is small  THEN Y1 = 4
  IF X is medium THEN Y2 = −0.5X + 4
  IF X is large  THEN Y3 = X − 1
• The overall output is the firing-strength-weighted average of the rule outputs:

  Y = ( Σ_{j=1..n} wj·Yj ) / ( Σ_{j=1..n} wj )
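A minimal Python sketch of this weighted-average inference for the three example rules; the triangular termsets standing in for small, medium, and large are assumptions, since the slide does not give the membership functions:

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def ts_output(x):
    """Y = sum_j(w_j * Y_j) / sum_j(w_j) over the three example rules."""
    w = np.array([
        trimf(x, -2.0, 0.0, 5.0),    # X is small  (assumed termset)
        trimf(x,  0.0, 5.0, 10.0),   # X is medium (assumed termset)
        trimf(x,  5.0, 10.0, 12.0),  # X is large  (assumed termset)
    ])
    y = np.array([4.0, -0.5 * x + 4.0, x - 1.0])  # Y1, Y2, Y3 from the rule set
    return float(w @ y / w.sum())  # assumes at least one rule fires

print(ts_output(2.5))  # blends Y1 = 4 and Y2 = 2.75 with equal weights
```

Because the consequents are blended by membership weight rather than selected by a crisp threshold, the output varies smoothly with x, which is exactly the property CART's hard splits lack.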
Slide 14
ANFIS Characteristics
• Adaptive Neural Fuzzy Inference System (ANFIS)
  – Algorithm defined by J.-S. Roger Jang in 1992
  – Creates a fuzzy decision tree to classify the data into one of 2^n (or p^n) linear regression models, minimizing the sum of squared errors (SSE):

    SSE = Σj ej²

  where:
  • ej is the error between the desired and the actual output
  • p is the number of fuzzy partitions of each variable
  • n is the number of input variables
Slide 15
ANFIS as a Type III FC
• L0: State variables are nodes in the ANFIS inputs layer
• L1: Termsets of each state variable are nodes in the ANFIS values layer, computing the membership value
• L2: Each rule in the FC is a node in the ANFIS rules layer, using soft-min or product to compute the rule matching factor ωi
• L3: Each ωi is scaled into ω̄i in the normalization layer
• L4: Each ω̄i weighs the result of its linear regression fi in the function layer, generating the rule output
• L5: Each rule output is added in the output layer
Slide 16
ANFIS Architecture
Rule Set:
  IF (x1 is A1) AND (x2 is B1) THEN f1 = p1·x1 + q1·x2 + r1
  IF (x1 is A2) AND (x2 is B2) THEN f2 = p2·x1 + q2·x2 + r2
  ...
[Figure: the corresponding network. Layer 0: inputs x1, x2. Layer 1: membership nodes A1, A2, B1, B2. Layer 2: product nodes Π computing ω1, ω2. Layer 3: normalization nodes N computing ω̄1, ω̄2. Layer 4: nodes computing ω̄1·f1 and ω̄2·f2. Layer 5: summation node Σ producing y. Layers: 0 1 2 3 4 5.]
Slide 17
ANFIS Visualized (Example for n = 2)
• Fuzzy reasoning: inputs x and y are matched against the termsets A1, B1 (rule 1) and A2, B2 (rule 2), producing firing strengths w1 and w2 and rule outputs
  y1 = p1·x1 + q1·x2 + r1
  y2 = p2·x1 + q2·x2 + r2
  z = (w1·y1 + w2·y2) / (w1 + w2)
• ANFIS (Adaptive Neuro-Fuzzy Inference System): the same computation as a network, with membership nodes A1, A2, B1, B2, product nodes Π yielding w1 and w2, sum nodes Σ yielding Σwi·yi and Σwi, and a division node / producing Y,
  where A1: Medium; A2: Small-Medium; B1: Medium; B2: Small-Medium
Slide 18
Layer 1: Calculate Membership Value for Premise Parameters
• Output O1,i for node i = 1, 2:  O1,i = μAi(x1)
• Output O1,i for node i = 3, 4:  O1,i = μB(i−2)(x2)
• where A is a linguistic label (small, large, ...) whose membership is the generalized bell function

  μA(x) = 1 / ( 1 + |(x − ci)/ai|^(2bi) )
Node output: membership value of input
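A short sketch of the generalized bell membership function reconstructed above; it follows the same (a, b, c) parameterization as MATLAB's gbellmf:

```python
import numpy as np

def gbellmf(x, a, b, c):
    """Generalized bell: mu(x) = 1 / (1 + |(x - c) / a|**(2b)).
    a sets the width, b the steepness of the shoulders, c the center."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

# Membership of x = 6 in a termset centered at c = 5 with width a = 2:
print(gbellmf(6.0, a=2.0, b=2.0, c=5.0))  # ~0.94
```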
Slide 19
Layer 1 (cont.): Effect of Changing Parameters {a, b, c}

  μA(x) = 1 / ( 1 + |(x − ci)/ai|^(2b) )

[Figure: four plots of μA(x) over [−10, 10] showing the effect of (a) changing 'a', (b) changing 'b', (c) changing 'c', and (d) changing 'a' and 'b'.]
Slide 20
Layer 2: Firing Strength of Rule
• Use a T-norm (min, product, fuzzy AND, ...):

  O2,i = wi = μAi(x1)·μBi(x2)   (for i = 1, 2)

Node output: firing strength of rule
Slide 21
Layer 3: Normalize Firing Strength
• Ratio of the ith rule's firing strength to the sum of all rules' firing strengths:

  O3,i = ω̄i = wi / (w1 + w2)   (for i = 1, 2)

Node output: normalized firing strengths
Slide 22
Layer 4: Consequent Parameters
• Takagi-Sugeno type output with consequent parameters {pi, qi, ri}:

  O4,i = ω̄i·fi = ω̄i·(pi·x1 + qi·x2 + ri)

Node output: evaluation of Right Hand Side polynomials
Slide 23
Layer 5: Overall Output
• Note: the output is linear in the consequent parameters p, q, r:

  O5,1 = Σi ω̄i·fi = ( Σi wi·fi ) / ( Σi wi )
       = [ w1/(w1 + w2) ]·f1 + [ w2/(w1 + w2) ]·f2
       = ω̄1·(p1·x1 + q1·x2 + r1) + ω̄2·(p2·x1 + q2·x2 + r2)
       = (ω̄1·x1)·p1 + (ω̄1·x2)·q1 + ω̄1·r1 + (ω̄2·x1)·p2 + (ω̄2·x2)·q2 + ω̄2·r2
Node output: Weighted Evaluation of RHS Polynomials
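Putting layers 1 through 5 together, a minimal sketch of one forward pass through the two-rule ANFIS of the preceding slides; all parameter values below are illustrative assumptions:

```python
def gbell(x, a, b, c):
    """Layer 1: generalized bell membership value."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def anfis_forward(x1, x2, premise, consequent):
    """Layers 1-5 for rules: IF (x1 is Ai) AND (x2 is Bi) THEN fi = pi*x1 + qi*x2 + ri."""
    mu_A = [gbell(x1, *p) for p in premise["A"]]    # layer 1: membership values
    mu_B = [gbell(x2, *p) for p in premise["B"]]
    w = [a * b for a, b in zip(mu_A, mu_B)]         # layer 2: product T-norm
    wbar = [wi / sum(w) for wi in w]                # layer 3: normalization
    f = [p * x1 + q * x2 + r for p, q, r in consequent]
    wf = [wb * fi for wb, fi in zip(wbar, f)]       # layer 4: weighted regressions
    return sum(wf)                                  # layer 5: algebraic sum

premise = {"A": [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)],   # (a, b, c) for A1, A2 (assumed)
           "B": [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]}   # (a, b, c) for B1, B2 (assumed)
consequent = [(1.0, 2.0, 0.5), (-1.0, 0.5, 3.0)]      # (pi, qi, ri) (assumed)
print(anfis_forward(2.0, 3.0, premise, consequent))
```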
Slide 24
ANFIS Network
[Figure: the ANFIS network diagram, repeated from Slide 7. Layers: 0 1 2 3 4 5.]
Slide 25
ANFIS Computational Complexity

  Layer  L-Type       # Nodes  # Param
  L0     Inputs       n        0
  L1     Values       p·n      3·(p·n) = |S1|
  L2     Rules        p^n      0
  L3     Normalize    p^n      0
  L4     Lin. Funct.  p^n      (n+1)·p^n = |S2|
  L5     Sum          1        0
Slide 26
ANFIS Parametric Representation
• ANFIS uses two sets of parameters, S1 and S2:
  – S1 represents the fuzzy partitions used in the rules' LHS:
    S1 = { {a11, b11, c11}, {a12, b12, c12}, ..., {a1p, b1p, c1p}, ..., {anp, bnp, cnp} }
  – S2 represents the coefficients of the linear functions in the rules' RHS:
    S2 = { {c10, c11, ..., c1n}, ..., {c(p^n)0, c(p^n)1, ..., c(p^n)n} }
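A small sketch that reproduces the node and parameter counts of the last two slides:

```python
def anfis_sizes(n, p):
    """|S1|, |S2|, and rule count for n inputs with p partitions each
    (from the complexity table: |S1| = 3*p*n, |S2| = (n+1)*p**n)."""
    return 3 * p * n, (n + 1) * p ** n, p ** n

# n = 2 state variables, p = 3 termsets each: 18 premise parameters,
# 27 consequent parameters, 9 rules.
print(anfis_sizes(2, 3))  # (18, 27, 9)
```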
Slide 27
ANFIS Learning Algorithms
• ANFIS uses a two-pass learning cycle:
  – Forward pass: S1 is fixed and S2 is computed using a Least Squared Error (LSE) algorithm (off-line learning)
  – Backward pass: S2 is fixed and S1 is computed using a gradient descent algorithm (usually back-propagation)
Slide 28
Hybrid Training Method
[Figure: the ANFIS network annotated with its two parameter groups: nonlinear (membership-function) parameters in the values layer and linear (consequent) parameters in the function layer, feeding Σwi·yi, Σwi, and the division node / that produces Y.]

  Parameter group          Forward stroke   Backward stroke
  MF param. (nonlinear)    fixed            steepest descent
  Coef. param. (linear)    least-squares    fixed

Structure ID & Parameter ID
• Input space partitioning
[Figure: grid partitioning of the (x1, x2) input space induced by the termsets A1, A2 on x1 and B1, B2 on x2.]
Slide 29
ANFIS Least Squares (LSE) Batch Algorithm
• LSE used in the forward stroke:
  – Parameter set: S = S1 ∪ S2, with S1 ∩ S2 = ∅
  – Output = F(I, S), where I is the input vector
  – H(Output) = H∘F(I, S), where H is an operator such that H∘F is linear in S2
  – For given values of S1, using K training data points, we can transform the above equation into B = AX, where X contains the elements of S2
  – This is solved by X* = (AᵀA)⁻¹AᵀB, where (AᵀA)⁻¹Aᵀ is the pseudo-inverse of A (if AᵀA is nonsingular)
  – The LSE minimizes the error ‖AX − B‖² by approximating X with X*
Slide 30
ANFIS LSE Batch Algorithm (cont.)
• Rather than computing X* = (AᵀA)⁻¹AᵀB directly, we solve it iteratively (from numerical methods):

  S(i+1) = S(i) − [ S(i)·a(i+1)·a(i+1)ᵀ·S(i) ] / [ 1 + a(i+1)ᵀ·S(i)·a(i+1) ]
  X(i+1) = X(i) + S(i+1)·a(i+1)·( b(i+1)ᵀ − a(i+1)ᵀ·X(i) )
  for i = 0, 1, ..., K − 1

  where:
  X(0) = 0
  S(0) = γI (γ a large number)
  a(i)ᵀ = ith row of matrix A
  b(i)ᵀ = ith element of vector B
  X* = X(K)
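A minimal sketch of this sequential least-squares update; the function name, the default for γ, and the sanity check are assumptions added for illustration:

```python
import numpy as np

def recursive_lse(A, B, gamma=1e6):
    """Iterative solution of B = A X from the slide above: one row of A
    per step, starting from X0 = 0 and S0 = gamma * I."""
    K, m = A.shape
    X = np.zeros((m, 1))
    S = gamma * np.eye(m)
    for i in range(K):
        a = A[i].reshape(m, 1)                             # a_(i+1) as a column
        S = S - (S @ a @ a.T @ S) / (1.0 + a.T @ S @ a)    # covariance update
        X = X + S @ a * (B[i] - float(a.T @ X))            # estimate update
    return X  # approximates X* = (A^T A)^-1 A^T B

# Sanity check against a known linear model:
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))
B = A @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.01 * rng.normal(size=50)
print(recursive_lse(A, B).ravel())  # ~[1, -2, 0.5, 3]
```

The loop never forms or inverts AᵀA; starting from S(0) = γI with large γ makes the sequential estimate converge to the batch solution.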
Slide 31
ANFIS Back-propagation
• Error measure Ek (for the kth (1 ≤ k ≤ K) entry of the training data):

  Ek = Σ_{i=1..N(L)} ( di − xL,i )²

• Overall error measure E:

  E = Σ_{k=1..K} Ek

  where:
  N(L) = number of nodes in layer L
  di = ith component of the desired output vector
  xL,i = ith component of the actual output vector
Slide 32
ANFIS Back-propagation (cont.)
• For each parameter αi the update formula is:

  Δαi = −η · ∂⁺E/∂αi

  where:
  ∂⁺E/∂αi is the ordered derivative
  η = κ / sqrt( Σi (∂E/∂αi)² ) is the learning rate
  κ is the step size
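A sketch of this normalized gradient step, with ordinary gradients standing in for the ordered derivatives and a small epsilon guard added as an assumption:

```python
import numpy as np

def premise_update(alpha, grad_E, kappa=0.1):
    """delta_alpha = -eta * dE/dalpha with eta = kappa / sqrt(sum (dE/dalpha)^2),
    so every update moves a fixed Euclidean distance kappa in parameter space."""
    eta = kappa / (np.sqrt(np.sum(grad_E ** 2)) + 1e-12)  # guard against zero gradient
    return alpha - eta * grad_E
```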
Slide 33
ANFIS Pros and Cons
• ANFIS offers one of the best tradeoffs between neural and fuzzy systems, providing:
  – smoothness, due to FC interpolation
  – adaptability, due to NN back-propagation
• ANFIS, however, has strong computational complexity restrictions
Slide 34
ANFIS Pros (Characteristics → Advantages)
• Translates prior knowledge into network topology & initial fuzzy partitions → smaller-size training set; induced partial order is usually preserved
• Network's first three layers not fully connected (inputs-values-rules) → smaller fan-out for Backprop; faster convergence than typical feedforward NN
• Uses data to determine rules' RHS (TSK model) → network implementation of a Takagi-Sugeno-Kang FLC; model compactness (smaller # of rules than using labels)
Slide 35
ANFIS Cons (Characteristics → Disadvantages)
• Translates prior knowledge into network topology & initial fuzzy partitions → sensitivity to the initial number of partitions "P"; sensitivity to the number of input variables "n"; spatial exponential complexity: # Rules = P^n
• Uses data to determine rules' RHS (TSK model) → partial loss of rule "locality"; surface oscillations around points (caused by a high partition number); coefficient signs not always consistent with the underlying monotonic relations
Slide 36
ANFIS Pros (cont.) (Characteristics → Advantages)
• Uses LMS algorithm to compute the polynomials' coefficients → automatic FLC parametric tuning
• Uses Backprop to tune fuzzy partitions → error pressure to modify only the "values" layer; uses fuzzy partitions to discount outlier effects
• Uses FLC inference mechanism to interpolate among rules → smoothness guaranteed by interpolation
Slide 37
ANFIS Cons (cont.) (Characteristics → Disadvantages)
• Uses LMS algorithm to compute the polynomials' coefficients → batch process disregards previous state (or IC)
• Uses Backprop to tune fuzzy partitions → error gradient calculation requires derivability of the fuzzy partitions and T-norms used by the FLC; cannot use trapezoids nor "Min"
• Uses FLC inference mechanism to interpolate among rules, via the convex sum Σ λi·fi(X) / Σ λi → "awkward" interpolation between slopes of different sign; not possible to represent known monotonic relations
• Based on a quadratic error cost function → symmetric error treatment & great outlier influence
Slide 38
ANFIS Opportunities
• Changes to decrease ANFIS complexity:
  – Use "don't care" values in rules (no connection between a node of the values layer and the rules layer)
  – Use a reduced subset of the state vector in the partition tree, while evaluating the linear functions on the complete state: X = Xr ∪ X(n−r)
  – Use heterogeneous partition granularity (different partition counts pi for each state variable, instead of a single "p"), so that (see the sketch below):

    # RULES = Π_{i=1..n} pi
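A one-line sketch of the heterogeneous-granularity rule count; the example partition counts are illustrative:

```python
import math

def rule_count(partitions):
    """# RULES = product of the per-variable partition counts p_i."""
    return math.prod(partitions)

# Uniform p = 3 over n = 4 variables gives 81 rules; mixed granularity shrinks it:
print(rule_count([3, 3, 3, 3]), rule_count([3, 2, 2, 2]))  # 81 24
```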
Slide 39
ANFIS Opportunities (cont.)
• Changes to extend ANFIS applicability:
  – Use another cost function (rather than SSE) to represent the user's utility values of the error (error asymmetry, saturation effects of outliers, etc.)
  – Use another type of aggregation function (rather than the convex sum) to better handle slopes of different signs
Slide 40
ANFIS Applications at GE
• Margoil Oil Thickness Estimator
• Voltage Instability Predictor (Smart Relay)
• Collateral Evaluation for Mortgage Approval
• Prediction of Time-to-Break for Paper Web
Slide 41
ANFIS References
• "ANFIS: Adaptive-Network-Based Fuzzy Inference System", J.-S. R. Jang, IEEE Trans. Systems, Man, and Cybernetics, 23(5/6):665-685, 1993.
• "Neuro-Fuzzy Modeling and Control", J.-S. R. Jang and C.-T. Sun, Proceedings of the IEEE, 83(3):378-406, 1995.
• "Industrial Applications of Fuzzy Logic at General Electric", Bonissone, Badami, Chiang, Khedkar, Marcelle, and Schutten, Proceedings of the IEEE, 83(3):450-465, 1995.
• The Fuzzy Logic Toolbox for Use with MATLAB, J.-S. R. Jang and N. Gulley, Natick, MA: The MathWorks Inc., 1995.
• Machine Learning, Neural and Statistical Classification, Michie, Spiegelhalter & Taylor (Eds.), NY: Ellis Horwood, 1994.
• Classification and Regression Trees, Breiman, Friedman, Olshen & Stone, Monterey, CA: Wadsworth and Brooks, 1985.