Lecture CEIT 2014 - Egypt Chapterras-egypt.org/reading/Keynote Speech CEIT 2014.pdf · Hybrid neuro-fuzzy systems The learning algorithm results, like in neural networks, in a change

Neuro-fuzzy systems

Neuro-fuzzy systems: Trends and Applications

International Conference on Control, Engineering & Information Technology (CEIT’14), March 22-25, Tunisia

Dr. Ahmad Taher Azar

Assistant Professor, Faculty of Computers and Information, Benha University, Egypt

Educational Chair of IEEE RAS Egypt chapter

Website: http://www.bu.edu.eg/staff/ahmadazar14


Agenda

• Types of models and system modeling
• Different Modeling Paradigms
• Neuro-Fuzzy systems
• Convergence of Technologies
• Advantage Of Neuro-fuzzy Modeling
• Types of Neuro-Fuzzy Systems
• Modeling With Neuro-fuzzy Systems
• Interpretability Versus Accuracy Of Neuro-fuzzy Models
• Factors affecting the interpretability of NF Systems
• Multi-adaptive Neuro Fuzzy System Design


Types of models and system modeling approaches

Models
• Mathematical - other
• Parametric - nonparametric
• Continuous time - discrete time
• Input-output - state space
• Linear - nonlinear
• Dynamic - static
• Time invariant - time varying
• SISO - MIMO

Modeling - System identification
• Physical - experimental
• White box - grey box - black box
• Structure determination - parameter estimation
• Time domain - frequency domain


Different Modeling Paradigms

White Box. The model is completely constructed from a priori knowledge and physical insight. Here, empirical data are not used during model identification and are only used for validation. Complete a priori knowledge of this kind is very rare, because usually some aspects of the distribution of the data are unknown.

Gray Box. An incomplete model is constructed from a priori knowledge and physical insight, then available empirical data are used to adapt the model by finding several specific unknown parameters.

Black Box. No a priori knowledge is used to construct the model. The model is chosen as a flexible parameterized function, which is used to fit the data.


Different Modeling Paradigms

Modeling Approach       | Source Of Information      | Method Of Acquisition      | Example                    | Deficiency
Mechanistic (white-box) | Formal knowledge and data  | Mathematical               | Differential equations     | Cannot use "soft" knowledge
Black-box               | Data                       | Optimization (learning)    | Regression, neural network | Cannot use knowledge at all
Fuzzy (grey-box)        | Various knowledge and data | Knowledge-based + learning | Rule-based model           | "Curse" of dimensionality


Neuro-Fuzzy systems

Neuro-fuzzy systems are basically adaptive fuzzy systems developed by exploiting the similarities between fuzzy systems and certain forms of neural networks. Neural networks have their strengths, and fuzzy logic has its strengths:

                         | Neural Nets                                                        | Fuzzy Logic
Knowledge representation | Implicit; the system cannot be easily interpreted or modified (-) | Explicit; verification and optimization are easy and efficient (+++)
Trainability             | Trains itself by learning from data sets (+++)                     | None; you have to define everything explicitly (-)

Get the "best of both worlds": explicit knowledge representation from fuzzy logic with training algorithms from neural nets. This substantially reduces development time and cost while improving the accuracy of the resulting fuzzy model.


Convergence of Technologies

Figure: timeline from 1940 to 2000 (axis ticks every five years) with three parallel tracks that converge on soft computing.

Computing: Relay/Valve Based; Transistors; Small Scale Integration; Large Scale Integration; Artificial Intelligence

Neural Networks: Neuron Model (McCulloch/Pitts); Training Rules (Hebb); Delta Rule (Widrow/Hoff); Multilayer Perceptron, XOR; Hopfield Model (Hopfield/Tank); Backpropagation (Rumelhart); Bidir. Assoc. Mem. (Kosko)

Fuzzy Logic: Seminal Paper (Zadeh); Fuzzy Control (Mamdani); Broad Application in Japan; Broad Application in Europe; Broad Application in the U.S.

Convergence: Soft Computing, NeuroFuzzy


Advantage Of Neuro-fuzzy Modeling

The advantages of neuro-fuzzy modeling are:

• Model identification can be performed using both empirical data and qualitative knowledge.

• The resulting models are transparent, significantly aiding humanistic model validation and knowledge discovery.

Neuro-fuzzy systems are basically adaptive fuzzy systems developed by exploiting the similarities between fuzzy systems and certain forms of neural networks, which fall in the class of generalized local methods.

Hence, the behavior of a neuro-fuzzy system can be represented either by a set of humanly understandable rules or by a combination of localized basis functions associated with local models (i.e. a generalized local method), making it an ideal framework for nonlinear predictive modeling.


Types of Neuro-Fuzzy Systems

There are several ways to combine neural networks and fuzzy logic.

Efforts at merging these two technologies may be characterized by considering three main categories:

• Neural fuzzy systems
• Fuzzy neural networks
• Fuzzy-neural hybrid systems


Neural Fuzzy Systems

Neural fuzzy systems are characterized by the use of neural networks to provide fuzzy systems with a kind of automatic tuning method, but without altering their functionality.

One example of this approach is the use of neural networks for membership function elicitation and for the mapping between the fuzzy sets that are used as fuzzy rules, as shown. This kind of combination is mostly used in control applications.


In this example, the neural network simulates the processing of a fuzzy system in which the neurons of the first layer are responsible for the fuzzification process.

The neurons of the second layer represent the fuzzy words used in the fuzzy rules (third layer).

Finally, the neurons of the last layer are responsible for the defuzzification process.


In the training process, a neural network adjusts its weights in order to minimize the mean square error between the output of the network and the desired output.

In this particular example, the weights of the neural network represent the parameters of the fuzzification function, fuzzy word membership function, fuzzy rule confidences and defuzzification function respectively.

In this sense, the training of this neural network results in automatically adjusting the parameters of a fuzzy system and finding their optimal values.


Fuzzy neural systems

The main goal of this approach is to 'fuzzify' some of the elements of neural networks, using fuzzy logic. In this case, a crisp neuron can become fuzzy.

Since fuzzy neural networks are inherently neural networks, they are mostly used in pattern recognition applications.

In these fuzzy neurons, the inputs are non-fuzzy, but the weighting operations are replaced by membership functions. The result of each weighting operation is the membership value of the corresponding input in the fuzzy set.



Hybrid neuro-fuzzy systems

In this approach, both fuzzy and neural network techniques are used independently, forming, in this sense, a hybrid system. Each one does its own job in serving different functions in the system, incorporating and complementing each other in order to achieve a common goal.

This kind of merging is application-oriented and suitable for both control and pattern recognition applications.

The idea of a hybrid model is the interpretation of the fuzzy rule base in terms of a neural network. In this way the fuzzy sets can be interpreted as weights, and the rules, input variables, and output variables can be represented as neurons.



Hybrid neuro-fuzzy systems

The learning algorithm results, as in neural networks, in a change of the architecture, i.e. in an adaptation of the weights and/or in creating or deleting connections. These changes can be interpreted both in terms of a neural net and in terms of a fuzzy controller.

This last aspect is very important, as the black-box behavior of neural nets is avoided this way. This means a successful learning procedure results in an explicit increase of knowledge that can be represented in the form of a fuzzy controller's rule base.

Hybrid neuro-fuzzy controllers are realized by approaches like ARIC, GARIC, ANFIS or the NNDFR model. These approaches all consist, more or less, of special neural networks, and they are capable of learning fuzzy sets.



ANFIS (Jang 1993)

Fuzzy reasoning (figure): the inputs x and y are evaluated against the membership functions A1, B1 and A2, B2, giving the firing strengths w1 and w2 of the two rules. The rule outputs and the overall output are:

z1 = p1*x + q1*y + r1

z2 = p2*x + q2*y + r2

z = (w1*z1 + w2*z2) / (w1 + w2)

When Z is a first-order polynomial, the resulting fuzzy inference system is called a "first-order Sugeno fuzzy model".

When Z is constant, the resulting model is called a "zero-order Sugeno fuzzy model", which can be viewed as a special case of the Mamdani inference system in which each rule's consequent is specified by a fuzzy singleton.


zero-order Sugeno fuzzy model

• Rule base

If X is A1 and Y is B1 then Z = C1

If X is A2 and Y is B2 then Z = C2


First-Order Sugeno FIS

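• Rule base

If X is A1 and Y is B1 then Z = p1*x + q1*y + r1

If X is A2 and Y is B2 then Z = p2*x + q2*y + r2

• Fuzzy reasoning (figure): the inputs x = 3 and y = 2 are evaluated against the membership functions A1, B1 and A2, B2 to give the firing strengths w1 and w2. The rule outputs and the overall output are:

z1 = p1*x + q1*y + r1

z2 = p2*x + q2*y + r2

z = (w1*z1 + w2*z2) / (w1 + w2)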
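The reasoning on this slide can be traced numerically. The sketch below is only an illustration: the slides fix neither the membership function shape nor the AND operator, so Gaussian memberships and the product are assumed here, and every numeric parameter in `rules` is made up.

```python
import math

def gauss(v, center, sigma):
    """Gaussian membership grade (an assumed shape; the slides do not fix one)."""
    return math.exp(-0.5 * ((v - center) / sigma) ** 2)

def sugeno_first_order(x, y, rules):
    """Return z = sum(w_i * z_i) / sum(w_i) for first-order Sugeno rules."""
    num = den = 0.0
    for rule in rules:
        # Firing strength: product of the two membership grades (assumed AND operator)
        w = gauss(x, *rule["A"]) * gauss(y, *rule["B"])
        p, q, r = rule["pqr"]
        z_i = p * x + q * y + r  # linear rule consequent
        num += w * z_i
        den += w
    return num / den

# Hypothetical parameters: (center, sigma) for A_i and B_i, (p, q, r) for z_i
rules = [
    {"A": (2.0, 1.0), "B": (1.0, 1.0), "pqr": (1.0, 1.0, 0.0)},
    {"A": (4.0, 1.0), "B": (3.0, 1.0), "pqr": (2.0, -1.0, 3.0)},
]
z = sugeno_first_order(3.0, 2.0, rules)
```

With these made-up parameters both rules fire equally at x = 3, y = 2, so z is the midpoint of z1 = 5 and z2 = 7.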


ANFIS Architecture

Layer 1: fuzzification layer. Every node i in layer 1 is an adaptive node with a node function. The parameters in this layer are the premise (or antecedent) parameters.

Layer 2: rule layer. Every node is a fixed node labeled Π whose output is the product of all the incoming signals.


ANFIS Architecture

Layer 3: normalization layer
• A fixed node labeled N.
• Outputs of this layer are called normalized firing strengths: $\bar{w}_i = w_i / (w_1 + w_2)$.

Layer 4: defuzzification layer
• An adaptive node with the node function $O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$ for i = 1, 2, where $\bar{w}_i$ is a normalized firing strength from layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set of this node (the consequent parameters).

Layer 5: summation neuron
• A fixed node which computes the overall output as the summation of all incoming signals.

Overall output = $O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$
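The five layers can be written as one forward pass. This is a minimal sketch for the two-rule case, not a reference implementation: the generalized bell membership function is one common choice and is assumed here, and the names `bell`, `anfis_forward`, `premise` and `consequent` as well as all numbers are hypothetical.

```python
def bell(v, a, b, c):
    """Generalized bell membership function (an assumed choice of node function)."""
    return 1.0 / (1.0 + abs((v - c) / a) ** (2 * b))

def anfis_forward(x, y, premise, consequent):
    # Layer 1: fuzzification with adaptive premise parameters (a, b, c)
    mu_a = [bell(x, *prm) for prm in premise["A"]]
    mu_b = [bell(y, *prm) for prm in premise["B"]]
    # Layer 2: rule firing strengths, product of incoming signals
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequents w_bar_i * (p_i*x + q_i*y + r_i)
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output as the sum of all incoming signals
    return sum(o4), w_bar

# Hypothetical parameters for two rules
premise = {"A": [(1.0, 2, 2.0), (1.0, 2, 4.0)], "B": [(1.0, 2, 1.0), (1.0, 2, 3.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 3.0)]
output, w_bar = anfis_forward(3.0, 2.0, premise, consequent)
```

Only layers 1 and 4 carry trainable parameters in this sketch, which mirrors the adaptive/fixed distinction made in the layer descriptions.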


ANFIS Flow chart

Start

1. Load training/testing data.

2. Generate initial FIS model:
• Set initial input parameters and membership function
• Choose FIS model optimization method (hybrid method)
• Define training and testing parameters (number of training/testing epochs)

3. Input training data into ANFIS system. Training finished? If no, continue training; if yes, get results after training.

4. Input testing data into ANFIS system. Testing finished? If no, continue testing; if yes, proceed.

5. View FIS structure, output surface of FIS, generated rules and adjusted membership functions.

Stop


Parameter Selection for the System

• Remove noise/irrelevant inputs.

• Remove inputs that depend on other inputs.

• Make the underlying model more concise and transparent.

• Reduce the time for model construction.

• The selected parameters must affect the target problem, i.e., strong relationships must exist among the parameters and the target (or output) variables.

• The selected parameters must be well-populated, and the corresponding data must be as clean as possible.
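One simple way to act on the first two points is a correlation filter. The sketch below is an assumption, not a method from the slides: it uses the Pearson coefficient, which only detects linear relationships, and the thresholds `min_relevance` and `max_redundancy`, the function names, and the data are all hypothetical. Columns are assumed to be non-constant.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def select_inputs(columns, target, min_relevance=0.3, max_redundancy=0.95):
    """Keep inputs related to the target and not nearly duplicated by a kept input."""
    ranked = sorted(columns, key=lambda name: -abs(pearson(columns[name], target)))
    kept = []
    for name in ranked:
        if abs(pearson(columns[name], target)) < min_relevance:
            continue  # irrelevant or noisy input
        if any(abs(pearson(columns[name], columns[k])) > max_redundancy for k in kept):
            continue  # depends (linearly) on an input that is already kept
        kept.append(name)
    return kept

# Toy data: "b" is a rescaled copy of "a", "noise" is unrelated to the target
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = {
    "a": [1.0, 2.1, 2.9, 4.2, 5.0, 6.1],
    "b": [2.0, 4.2, 5.8, 8.4, 10.0, 12.2],
    "noise": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],
}
kept = select_inputs(columns, target)
```

On this toy data only one of the two linearly dependent inputs survives, and the noise column is dropped.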


Modeling With Neuro-fuzzy Systems

Whatever the adopted vision of the fuzzy model, two different phases must be carried out in fuzzy modeling: structural identification and parametric identification.

Structural identification consists of determining the structure of the rules, i.e. the number of rules and the number of fuzzy sets used to partition each variable in the input and output space so as to derive linguistic labels.

Once a satisfactory structure is available, parametric identification must follow for the fine adjustment of the position of all membership functions, together with their shape, as the main concern.


Parametric Identification

Two types of parameters characterize a fuzzy model: those determining the shape and distribution of the input fuzzy sets and those describing the output fuzzy sets (or linear models).

Many neuro-fuzzy systems use direct nonlinear optimization to identify all the parameters of a fuzzy system.

Different optimization techniques can be used to this aim. The most widely used is an extension of the well-known back-propagation algorithm implemented by gradient descent. A very large number of neuro-fuzzy systems are based on backpropagation.
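To make the gradient-descent idea concrete, the following sketch tunes both parameter types of a small fuzzy model by the chain rule. It is an illustration under stated assumptions, not the algorithm of any particular neuro-fuzzy system: a one-input, zero-order Sugeno model with Gaussian input sets of fixed width is assumed, and the data, learning rate, and names are hypothetical.

```python
import math

def forward(x, centers, consts, sigma=1.0):
    """Zero-order Sugeno output plus the intermediate firing strengths."""
    w = [math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in centers]
    total = sum(w)
    y = sum(wi * ci for wi, ci in zip(w, consts)) / total
    return y, w, total

def mse(data, centers, consts, sigma=1.0):
    return sum((forward(x, centers, consts, sigma)[0] - t) ** 2 for x, t in data) / len(data)

def train(data, centers, consts, sigma=1.0, lr=0.05, epochs=200):
    centers, consts = list(centers), list(consts)
    for _ in range(epochs):
        for x, t in data:
            y, w, total = forward(x, centers, consts, sigma)
            err = y - t  # dE/dy for E = 0.5 * (y - t)^2
            # dE/dc_i: output-side (consequent) parameters
            g_const = [err * wi / total for wi in w]
            # dE/dm_i: input-side (premise) centers, via dy/dw_i and dw_i/dm_i
            g_center = [err * (ci - y) / total * wi * (x - m) / sigma ** 2
                        for wi, ci, m in zip(w, consts, centers)]
            consts = [c - lr * g for c, g in zip(consts, g_const)]
            centers = [m - lr * g for m, g in zip(centers, g_center)]
    return centers, consts

data = [(0.5 * k, 0.125 * k) for k in range(9)]  # toy target t = x / 4 on [0, 4]
start_centers, start_consts = [1.0, 3.0], [0.0, 0.0]
centers, consts = train(data, start_centers, start_consts)
mse_before = mse(data, start_centers, start_consts)
mse_after = mse(data, centers, consts)
```

Here every parameter is treated as nonlinear and updated by the same rule, which is the "direct nonlinear optimization" setting described above.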



Hybrid training method

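Figure: ANFIS network with inputs x and y, membership nodes A1, A2, B1, B2 (nonlinear parameters), product nodes Π giving the firing strengths w1 to w4, weighted rule outputs w1*z1 to w4*z4 (linear parameters), two summation nodes computing Σ wi*zi and Σ wi, and a division node producing the output z.

Given the values of the premise parameters, the overall output can be expressed as a linear combination of the consequent parameters:

\[
f = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = \bar{w}_1 f_1 + \bar{w}_2 f_2
\]

\[
f = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2
\]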


Hybrid training method

More specifically, in the forward pass of the hybrid learning algorithm, functional signals go forward until layer 4 and the consequent parameters are identified by the least squares estimate.

In the backward pass, the error rates propagate backward and the premise parameters are updated by gradient descent.

                      | Forward pass  | Backward pass
MF param. (nonlinear) | fixed         | steepest descent
Coef. param. (linear) | least-squares | fixed

The consequent parameters thus identified are optimal under the condition that the premise parameters are fixed. Accordingly, the hybrid approach is much faster than strict gradient descent, and it is worthwhile to look for the possibility of decomposing the parameter set.
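The forward-pass half of the hybrid method can be sketched as an ordinary least-squares problem. The code below is a simplified illustration under stated assumptions: a single input x, two rules with Gaussian premises held fixed, and consequents f_i = p_i*x + r_i. The normal equations are solved with a small hand-written Gaussian elimination so that the example needs no external library; all names and data are hypothetical.

```python
import math

def normalized_strengths(x, centers, sigma):
    w = [math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in centers]
    total = sum(w)
    return [wi / total for wi in w]

def solve(a, b):
    """Solve the linear system a * theta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    theta = [0.0] * n
    for i in reversed(range(n)):
        theta[i] = (m[i][n] - sum(m[i][k] * theta[k] for k in range(i + 1, n))) / m[i][i]
    return theta

def lse_consequents(data, centers, sigma):
    """Least-squares estimate of (p_1, r_1, p_2, r_2, ...) with premise parameters fixed."""
    # Each row holds the regressors (w_bar_i * x, w_bar_i) of the linear-in-parameters form
    rows = [[v for wb in normalized_strengths(x, centers, sigma) for v in (wb * x, wb)]
            for x, _ in data]
    targets = [t for _, t in data]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve(ata, atb)

def predict(x, centers, sigma, theta):
    wb = normalized_strengths(x, centers, sigma)
    return sum(w * (theta[2 * i] * x + theta[2 * i + 1]) for i, w in enumerate(wb))

data = [(0.5 * k, 2.0 * (0.5 * k) + 1.0) for k in range(9)]  # toy target t = 2x + 1
centers, sigma = [1.0, 3.0], 1.0                             # fixed premise parameters
theta = lse_consequents(data, centers, sigma)
residuals = [abs(predict(x, centers, sigma, theta) - t) for x, t in data]
```

Because the output is linear in the consequent parameters, one solve gives the best consequents for the current premises; a backward pass would then adjust the premise parameters by gradient descent.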


Param. ID: Comparisons

Steepest descent (SD) treats all parameters as nonlinear

Hybrid learning (SD+LSE) distinguishes between linear and nonlinear

Gauss-Newton (GN) linearizes and treats all parameters as linear

Levenberg-Marquardt (LM) switches smoothly between SD and GN


To speed up the process of parameter identification, many neuro-fuzzy systems adopt a multi-stage learning procedure to find and optimize the parameters.

Typically, two stages are considered. In the first stage the input space is partitioned into regions by unsupervised learning, and from each region the premise (and possibly the consequent) parameters of a fuzzy rule are derived.

In the second stage the consequent parameters are estimated via a supervised learning technique.

In most cases, the second stage also performs a fine adjustment of the premise parameters obtained in the first stage using a nonlinear optimization technique. Most of the techniques used in the initialization stage fall into one of the following categories:



Grid partitioning

With this approach, the domains of the input variables are a priori partitioned into a specified number of fuzzy sets.

The rule base is then established to cover the input space by using all possible combinations of input fuzzy sets as multivariate fuzzy sets describing the rule antecedents.

The consequent parameters are estimated by the least squares method using available input-output data.

Advantage: very interpretable fuzzy sets can be generated.

Drawback: the number of multivariate fuzzy sets, and hence the number of rules, is an exponential function of the number of inputs.

This curse of dimensionality restricts the use of fuzzy models based on grid partitioning to low-dimensional problems.
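The exponential growth is easy to see by enumerating the rule antecedents. This short sketch assumes three linguistic labels per input; the labels and the function name `grid_rule_antecedents` are illustrative only.

```python
from itertools import product

def grid_rule_antecedents(fuzzy_sets_per_input):
    """All combinations of one fuzzy set per input, i.e. one antecedent per grid cell."""
    return list(product(*fuzzy_sets_per_input))

labels = ["low", "medium", "high"]  # assumed partition of every input domain
rules_2_inputs = grid_rule_antecedents([labels] * 2)
rules_6_inputs = grid_rule_antecedents([labels] * 6)
counts = (len(rules_2_inputs), len(rules_6_inputs))
```

With three sets per input, two inputs give 3^2 = 9 rules while six inputs already give 3^6 = 729.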


Cluster-oriented methods

Cluster-oriented methods try to group the training data into clusters and use them to define multivariate fuzzy sets describing the premise part of fuzzy rules.

Popular clustering methods adopted to find the centers of multivariate membership functions are k-means clustering, self-organizing feature maps (SOM), and fuzzy clustering.

While clustering-based methods produce a more flexible partitioning than grid-based approaches, the lattice partition of the input space is ignored, and this usually results in rule bases that cannot be interpreted very well.

As shown, the projection of clusters to obtain fuzzy rules typically results in overlapping, nonsensical fuzzy sets that are hard to interpret linguistically.


Cluster-oriented methods

In the case of fuzzy clustering, the projection causes a loss of information because the Cartesian product of the induced membership functions does not reproduce the fuzzy cluster exactly.

Another consequence of cluster projection is that for each variable the fuzzy sets are induced individually for each rule, and for each feature there will be as many different fuzzy sets as the number of clusters.

Some of these fuzzy sets may be similar, yet they are usually not identical. For a good interpretation it is necessary to have a fuzzy partition of few fuzzy sets where each clearly represents a linguistic concept.

To eliminate this redundancy, similarity measures can be used in order to assess the degree of overlap between adjacent fuzzy sets and merge fuzzy sets that are too similar.
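A possible form of such a merging step is sketched below. The slides do not name a specific similarity measure, so the Jaccard index (sum of pointwise minima over sum of pointwise maxima on a sampled domain) is assumed; Gaussian sets given as (center, sigma), merging by parameter averaging, and the 0.8 threshold are likewise assumptions.

```python
import math

def gauss(v, center, sigma):
    return math.exp(-0.5 * ((v - center) / sigma) ** 2)

def jaccard_similarity(set_a, set_b, lo, hi, steps=200):
    """Overlap of two fuzzy sets: sum of min grades over sum of max grades on [lo, hi]."""
    num = den = 0.0
    for k in range(steps + 1):
        v = lo + (hi - lo) * k / steps
        a, b = gauss(v, *set_a), gauss(v, *set_b)
        num += min(a, b)
        den += max(a, b)
    return num / den

def merge_similar(sets, lo, hi, threshold=0.8):
    """Merge adjacent fuzzy sets whose similarity exceeds the threshold."""
    merged = []
    for s in sorted(sets):  # sort by center so that neighbours are compared
        if merged and jaccard_similarity(merged[-1], s, lo, hi) > threshold:
            c, sg = merged[-1]
            merged[-1] = ((c + s[0]) / 2, (sg + s[1]) / 2)  # average the parameters
        else:
            merged.append(s)
    return merged

# Three projected sets for one variable; the first two are nearly identical
projected = [(1.0, 0.5), (1.05, 0.5), (3.0, 0.5)]
merged = merge_similar(projected, lo=0.0, hi=4.0)
```

The two near-duplicate sets collapse into one, leaving a partition with two distinguishable fuzzy sets.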


Structural Identification

Before fuzzy rule parameters can be optimized, the structure of the fuzzy rule base must be defined. This involves determining the number of rules and the granularity of the data space, i.e. the number of fuzzy sets used to partition each variable.

In fuzzy rule-based systems, as in any other modeling technique, there is a tradeoff between accuracy and complexity.

The more rules there are, the finer the approximation of the nonlinear mapping that the fuzzy system can obtain, but more parameters also have to be estimated, so the cost and complexity increase.

A possible approach to structure identification is to perform a stepwise search through the fuzzy model space. Once again, these search strategies fall into one of two general categories: forward selection and backward elimination.


Structural Identification

Forward selection. Starting from a very simple rule base, new fuzzy rules are dynamically added or the density of fuzzy sets is incrementally increased.

Backward elimination. An initial fuzzy rule base, constructed from a priori knowledge or by learning from data, is reduced until a minimum of the error function is found. The structure of the fuzzy rules can also be optimized by genetic algorithms (GAs) so that a compact fuzzy rule base can be obtained.

The learning algorithm is an example of structure adaptation in neuro-fuzzy systems. Rules are dynamically recruited or deleted according to their significance to system performance, so that a parsimonious structure with high performance is achieved.


Interpretability Versus Accuracy Of Neuro-fuzzy Models

The twofold face of fuzzy systems leads to a trade-off between readability and accuracy.

                          | Interpretability | Accuracy
No. of parameters         | Few parameters   | More parameters
No. of fuzzy rules        | Few rules        | More rules
Type of fuzzy logic model | Mamdani models   | TSK models

When the model is kept simple, the prediction is usually less accurate. In solving this trade-off, the interpretability (meaning also simplicity) of fuzzy systems must be considered the major advantage, and hence it should be pursued more than accuracy.

In fact, fuzzy systems are not better function approximators or classifiers than other approaches. If we are interested in a very precise prediction, then we are usually not so much interested in the interpretability of the solution.


Factors affecting the interpretability of NF Systems

Choice of fuzzy model type: Mamdani-type (or singleton) fuzzy systems should be preferred to TS fuzzy systems, because the rule consequents consist of interpretable fuzzy sets.

Number of fuzzy rules: a fuzzy system with a large rule base is less interpretable than a fuzzy system that needs only a few rules.

Number of input variables: each rule should use as few variables as possible to be more comprehensible.

Number of fuzzy sets per variable: only a moderate number of fuzzy sets should be used to partition a variable. A coarse granularity increases the readability of the fuzzy model, hence too many linguistic labels for each variable should preferably be avoided.


Factors affecting the interpretability of NF Systems

Characteristics of fuzzy rules: the fuzzy rules must be complete, i.e. for any input, the rule-based system can generate an answer. Also, the rule base must be consistent, i.e. there must be no contradictory rules that have identical antecedents but different consequents. Only partial inconsistency is acceptable.

Also, any form of redundancy should be avoided, e.g. there must be no rule whose antecedent is a subset of the antecedent of another rule, and no rule may appear more than once in the rule base.

Characteristics of fuzzy sets: fuzzy sets should be "interpretable" to the user of the fuzzy system. This means that membership functions should be normal and convex, and they should guarantee a complete coverage of the corresponding input domain (coverage), so that a user should be able to label each fuzzy set by a linguistic term. Also, too much overlap between the membership functions should be prevented, so as to have distinguishable fuzzy sets.
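The consistency and redundancy conditions on fuzzy rules lend themselves to a mechanical check. The sketch below assumes a hypothetical rule representation, a pair of (antecedent, consequent) where the antecedent is a frozenset of (variable, label) pairs; it only flags exact contradictions, duplicates, and subset antecedents, and does not test completeness.

```python
def check_rule_base(rules):
    """Return a list of (issue, i, j) tuples for pairs of rules that violate the criteria."""
    issues = []
    for i, (ant_i, con_i) in enumerate(rules):
        for j in range(i + 1, len(rules)):
            ant_j, con_j = rules[j]
            if ant_i == ant_j and con_i == con_j:
                issues.append(("duplicate", i, j))      # same rule appears twice
            elif ant_i == ant_j:
                issues.append(("contradiction", i, j))  # same antecedent, different consequent
            elif ant_i < ant_j or ant_j < ant_i:
                issues.append(("subset", i, j))         # one antecedent contained in the other
    return issues

# Hypothetical rule base over the variables X and Y
rules = [
    (frozenset({("X", "low"), ("Y", "low")}), "small"),
    (frozenset({("X", "low"), ("Y", "low")}), "large"),
    (frozenset({("X", "high")}), "large"),
    (frozenset({("X", "high"), ("Y", "low")}), "large"),
]
issues = check_rule_base(rules)
```

Rules 0 and 1 contradict each other, and the antecedent of rule 2 is a subset of that of rule 3, so both pairs are reported.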


Multi-adaptive Neuro Fuzzy System Design

Ensemble-Based Approach

Figure: the training set feeds NF Network 1, NF Network 2, ..., NF Network N, and a combination module merges their outputs.
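The figure does not say how the combination module merges the network outputs. As one assumed possibility, the sketch below uses a (weighted) average; the member networks are stand-in functions, and `ensemble_predict` is a hypothetical name.

```python
def ensemble_predict(members, x, weights=None):
    """Combine member outputs by a weighted average (an assumed combination rule)."""
    outputs = [member(x) for member in members]
    if weights is None:
        weights = [1.0] * len(outputs)  # plain average by default
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)

# Stand-ins for N = 3 trained neuro-fuzzy networks
members = [lambda x: 2 * x, lambda x: 2 * x + 1, lambda x: 2 * x - 1]
plain = ensemble_predict(members, 3.0)
weighted = ensemble_predict(members, 3.0, weights=[1.0, 1.0, 2.0])
```

Other combination rules, such as voting for classifiers, would fit the same interface.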



Modular-Based Approach

Figure: the training set is decomposed into N different groups (Subset 1, Subset 2, ..., Subset N) and the task is decomposed into N different sub-tasks (Sub-Task 1, Sub-Task 2, ..., Sub-Task N). Each sub-task is handled by its own network (NF Network 1, NF Network 2, ..., NF Network N), and a combination module merges the results.



Hybrid-Based neuro-fuzzy combination approach

Figure: block diagram with the components Training Set, Modular Module, Sub-Task 1, Sub-Task 2, ..., Sub-Task N, Ensemble Module, several NF networks, and Combination Module.


Conclusion

Neuro-fuzzy modeling approaches combine the benefits of two powerful paradigms into a single capsule and provide a powerful framework to extract fuzzy (linguistic) rules from numerical data.

The aim of using a neuro-fuzzy network is to find, through learning from data, a fuzzy model that represents the process underlying the data.

Contributing factors to successful applications of neuro-fuzzy and soft computing:

• Sensor technologies
• Cheap fast microprocessors
• Modern fast computers



References:

• J.-S. R. Jang, C.-T. Sun and E. Mizutani, "Neuro-Fuzzy and Soft Computing", Prentice Hall, 1996.
• J.-S. R. Jang and C.-T. Sun, "Neuro-Fuzzy Modeling and Control", Proceedings of the IEEE, March 1995.
• J.-S. R. Jang, "ANFIS: Adaptive-Network-based Fuzzy Inference Systems", IEEE Trans. on Systems, Man, and Cybernetics, May 1993.

Internet resources:

• This set of slides is available at http://www.cs.nthu.edu.tw/~jang/publication.htm
• WWW resources about neuro-fuzzy and soft computing: http://www.cs.nthu.edu.tw/~jang/nfsc.htm
