8/4/2019 Fuzzy Modeling
By: Saeed
Fuzzy modeling
Introduction
Classical approach:
- Low accuracy in complicated systems
- Not applicable to systems for which first-principles and theoretical methods are not fully developed
Solution:
1- Human parallel processing: neural networks
2- Human reasoning and inference system: fuzzy models
- Although neural networks have many advantages, they have three main problems:
1- The learned knowledge is stored in parameters that are not interpretable
2- Training is a nonlinear optimization problem
3- Capturing expert knowledge is impossible
Fuzzy models
- A mathematical model that in some way uses fuzzy sets is called a fuzzy model [1].
- A method for modeling complex, ill-defined, and less tractable systems.
Rule structure: if (premise) then (consequent)
- Premise: fuzzy sets; determines the validity of the rule
- Consequent: fuzzy sets (Mamdani) or functions (Takagi-Sugeno); gives the output of the rule
Example (Mamdani): If pressure is high, then volume is small
Example (TSK): If velocity is high, then force = k *
- Two different ideas are behind these modeling approaches: while the Mamdani model tries to imitate the human reasoning mechanism, the Takagi-Sugeno model tries to represent the system by several simple local models when it cannot be described accurately by a single model. For this reason, the Takagi-Sugeno model is sometimes called a local model.
- Input space partitioning
Partitioning of the input space:
1- Grid partitioning: ANFIS (Jang), FUREGA
2- Tree partitioning: LOLIMOT (Nelles)
3- Scatter partitioning: clustering (Babuška)
ANFIS (Adaptive-Network-Based Fuzzy Inference System)
Main problems of fuzzy modeling before ANFIS:
1) No standard methods exist for transforming human
knowledge or experience into the rule base and database
of a fuzzy inference system.
2) There is a need for effective methods for tuning the
membership functions (MFs) so as to minimize the
output error measure or maximize a performance index.
Neural networks
Neuron structure:
Output of neuron: $y_k = \varphi\left(\sum_{j=1}^{m} w_{kj} x_j + b_k\right)$
- Activation function ($\varphi$):
The logistic function: $\varphi(v) = \dfrac{1}{1 + e^{-v}}$
Hyperbolic tangent: $\varphi(v) = \tanh(v)$
Nonlinear behavior of neural networks!
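The neuron output and the two activation functions above can be sketched in a few lines of Python. This is only an illustration: the function names `logistic` and `neuron_output` and the numeric weights are assumptions made for the example and do not come from the slides.

```python
import math

def logistic(v):
    """Logistic activation: 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(weights, inputs, bias, activation=math.tanh):
    """y_k = phi(sum_j w_kj * x_j + b_k) for a single neuron."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(v)

# Hypothetical weights and inputs chosen so that the weighted sum is zero.
y_tanh = neuron_output([0.5, -1.0], [2.0, 1.0], 0.0)
y_logistic = neuron_output([0.5, -1.0], [2.0, 1.0], 0.0, logistic)
```

Both activations saturate for large inputs, which is the source of the nonlinear behavior mentioned on the slide.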
Multilayer perceptron (MLP):
An arbitrary number of hidden layers can be used!
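Training MLPs (backpropagation)
- Training data: $\{u(i), y(i)\}$, $i = 1, \dots, N$
($u(i)$: input to the MLP, $y(i)$: desired output, $\hat{y}(i)$: MLP output for $u(i)$)
- Cost function: $J = \frac{1}{2} \sum_{i=1}^{N} e(i)^2$, with $e(i) = y(i) - \hat{y}(i)$
- What should be optimized: the neuron weights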
- Optimization algorithm
Steepest descent: the search direction is the direction opposite to the gradient.
$w^{new} = w^{old} - \eta \, \dfrac{\partial J}{\partial w}$
$\partial J / \partial w$: the gradient of the output error with respect to the weights; $\eta$: step size
- The most important advantage of this algorithm is that the gradient for each weight can be calculated with the aid of the gradients of the neurons in the next layer.
- Training procedure:
It is a two-pass optimization method. In the forward pass the inputs go through the MLP, and the MLP output and the output error can be calculated. In the backward pass the error propagates from the output layer to the input layer and all of the MLP's weights are updated. This procedure is repeated over all data samples many times.
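The two-pass procedure can be sketched as follows for a very small network. This is a minimal illustration under stated assumptions, not the presenter's implementation: one scalar input, one hidden layer of tanh neurons, a linear output neuron, sample-by-sample steepest descent, and a hypothetical target function y = u^2. All names (`forward`, `cost`, `train`) and settings (step size, number of epochs, seed) are chosen for the example.

```python
import math
import random

def forward(u, W1, b1, W2, b2):
    """Forward pass: hidden tanh layer, then a linear output neuron."""
    h = [math.tanh(w * u + b) for w, b in zip(W1, b1)]
    y_hat = sum(w * hj for w, hj in zip(W2, h)) + b2
    return h, y_hat

def cost(data, W1, b1, W2, b2):
    """J = 1/2 * sum of squared output errors over the data set."""
    return 0.5 * sum((y - forward(u, W1, b1, W2, b2)[1]) ** 2 for u, y in data)

def train(data, hidden=3, eta=0.05, epochs=200, seed=0):
    rng = random.Random(seed)
    W1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    j_before = cost(data, W1, b1, W2, b2)
    for _ in range(epochs):
        for u, y in data:
            h, y_hat = forward(u, W1, b1, W2, b2)   # forward pass
            delta_out = -(y - y_hat)                 # dJ/dy_hat at the output
            for j in range(hidden):
                # Backward pass: the hidden-layer gradient is built from the
                # gradient of the next (output) layer.
                delta_h = delta_out * W2[j] * (1.0 - h[j] ** 2)
                W2[j] -= eta * delta_out * h[j]
                W1[j] -= eta * delta_h * u
                b1[j] -= eta * delta_h
            b2 -= eta * delta_out
    return j_before, cost(data, W1, b1, W2, b2)

# Hypothetical training data: 11 samples of y = u^2 on [-1, 1].
data = [(i / 5.0 - 1.0, (i / 5.0 - 1.0) ** 2) for i in range(11)]
j_start, j_end = train(data)
```

The returned pair lets the cost before and after training be compared; the weights are updated after every sample, which matches the "repeated over all data samples many times" description.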
Fuzzy Inference System (FIS)
Fuzzy Inference System (FIS):
1- Compare the input variables with the membership functions on the premise part to obtain the membership values (or compatibility measures) of each linguistic label. (This step is often called fuzzification.)
2- Combine (through a specific T-norm operator, usually multiplication or min) the membership values on the premise part to get the firing strength (weight) of each rule.
3- Generate the qualified consequent (either fuzzy or crisp) of each rule depending on the firing strength.
4- Aggregate the qualified consequents to produce a crisp output. (This step is called defuzzification.)
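The four steps above can be traced in a short sketch of a first-order TSK system. The assumptions are loud ones: two inputs, Gaussian membership functions, the product T-norm, weighted-average defuzzification, and a made-up two-rule base (`RULES`). None of these numbers come from the slides.

```python
import math

def gauss(x, center, width):
    """Gaussian membership function."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

# Hypothetical rule base: one (center, width) pair per input, and TSK
# consequent coefficients (p, q, r) for f = p*x1 + q*x2 + r.
RULES = [
    {"mf": [(0.0, 1.0), (0.0, 1.0)], "coef": (1.0, 1.0, 0.0)},
    {"mf": [(2.0, 1.0), (2.0, 1.0)], "coef": (0.0, 0.0, 5.0)},
]

def tsk_infer(x1, x2, rules=RULES):
    num = 0.0
    den = 0.0
    for rule in rules:
        # Step 1: fuzzification of both inputs.
        m1 = gauss(x1, *rule["mf"][0])
        m2 = gauss(x2, *rule["mf"][1])
        # Step 2: firing strength via the product T-norm.
        w = m1 * m2
        # Step 3: qualified (crisp) consequent of the rule.
        p, q, r = rule["coef"]
        f = p * x1 + q * x2 + r
        num += w * f
        den += w
    # Step 4: aggregation by weighted average gives a crisp output.
    return num / den

y_mid = tsk_infer(1.0, 1.0)   # both rules fire equally at this point
y_low = tsk_infer(0.0, 0.0)   # the first rule dominates here
```

At the midpoint between the two rule centers both firing strengths are equal, so the output is the plain average of the two rule outputs.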
- Example
[Figure: fuzzy if-then rules and reasoning mechanisms; panel labels: Mamdani, Type 1, Type 2, TSK]
Each of these if-then rules can be represented as an adaptive network:
- Nodes with adaptive parameters: centers and widths of the membership functions, and the consequent parameters
- Nodes with a fixed operation
Example of a FIS with two inputs and three membership functions for each of the inputs
Training procedure:
                       | Forward pass           | Backward pass
Premise parameters     | Fixed                  | Gradient descent
Consequent parameters  | Least-squares estimate | Fixed
Signals                | Node outputs           | Error rates
Two passes in the hybrid learning procedure for ANFIS
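Why the least-squares algorithm can be used for the consequent parameters (for example, for the TSK model on page 18):
$f = \bar{w}_1 f_1 + \bar{w}_2 f_2 = (\bar{w}_1 x)\,p_1 + (\bar{w}_1 y)\,q_1 + \bar{w}_1 r_1 + (\bar{w}_2 x)\,p_2 + (\bar{w}_2 y)\,q_2 + \bar{w}_2 r_2$
With the premise parameters fixed, the normalized firing strengths $\bar{w}_i$ are known, so the output is linear in the consequent parameters $p_i, q_i, r_i$.
Linear regression problem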
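To make the linear-regression view concrete, the sketch below fits the consequent parameters of a hypothetical one-input, two-rule TSK model by ordinary least squares while the premise part stays fixed. The Gaussian premise settings, the "true" parameter values used to generate data, and all function names are assumptions for illustration; the normal equations are solved with a small hand-written Gaussian elimination to keep the example free of external libraries.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def normalized_strengths(x, centers=(-1.0, 1.0), width=1.0):
    """Normalized firing strengths of two rules with fixed Gaussian premises."""
    w = [math.exp(-0.5 * ((x - c) / width) ** 2) for c in centers]
    total = sum(w)
    return [wi / total for wi in w]

def regressor(x):
    """Row [w1*x, w1, w2*x, w2]: the model output is linear in (p1, r1, p2, r2)."""
    w1, w2 = normalized_strengths(x)
    return [w1 * x, w1, w2 * x, w2]

def fit_consequents(xs, ys):
    """Least-squares estimate via the normal equations (A^T A) theta = A^T y."""
    rows = [regressor(x) for x in xs]
    n = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(AtA, Aty)

# Hypothetical "true" consequents used only to generate noise-free data.
true_theta = [2.0, 1.0, -1.0, 3.0]
xs = [i / 10.0 - 2.0 for i in range(41)]
ys = [sum(t * a for t, a in zip(true_theta, regressor(x))) for x in xs]
theta = fit_consequents(xs, ys)
```

Because the data are generated by the same model structure without noise, the estimate should recover the generating parameters up to rounding.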
- In the backward pass, the gradient descent algorithm is used to optimize the premise parameters while the error propagates backward through the network (like backpropagation in neural networks).
- Remark 1: since the consequent parameters are optimized in each iteration with the least-squares algorithm, the nonlinear optimization problem in the backward pass can be solved more efficiently, and problems such as being trapped in local minima or slow convergence are less severe.
- Remark 2: the TSK model is more popular in the ANFIS structure since it has more adjustable parameters in the consequents of the rules. This reduces the training time and effort, because the model output is linear in these parameters, so they can be estimated very efficiently with the least-squares algorithm.
- Remark 3: sometimes optimizing the premise parameters (input membership functions) will deteriorate the interpretability of the rule base.
Example: 0.6 sin 0.3 sin 3 0.1 sin 5 & [1,1]
[Figure] 3 membership functions for each input (9 rules)
[Figure] 4 membership functions for each input (16 rules)
[Figure] 5 membership functions for each input (25 rules)
Loss of interpretability
FUREGA
Fuzzy Rule Extraction using
Genetic Algorithm
FUREGA:
1- Start with a grid-based network using prior knowledge
2- Select rules with a genetic algorithm
3- Least squares for output parameter optimization
4- Constrained nonlinear optimization of the membership functions
Properties :
Good chance of finding the best solution (accuracy)
Time consuming training
Curse of dimensionality
Interpretability ?
Local Linear Model Tree
LOLIMOT
What are local models?
Example:
LOLIMOT algorithm:
- The algorithm has an outer loop (upper level) that determines the input partitions (structure) where the local linear models are valid, and an inner loop (lower level) that estimates the parameters of those local linear models with an efficient weighted least-squares algorithm.
Consequent parameter estimation:
$\hat{y} = \sum_{i=1}^{M} \left(w_{i0} + w_{i1} u_1 + \dots + w_{ip} u_p\right) \Phi_i(u, c_i, \sigma_i)$
$w_i$: local linear model parameters (M local models, p inputs)
$u$: input vector
$\Phi_i$: normalized Gaussian weighting function for the i-th model, with center coordinates $c_i$ and standard deviations $\sigma_i$
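$\Phi_i(u, c_i, \sigma_i) = \dfrac{\mu_i}{\sum_{j=1}^{M} \mu_j}$
Where:
$\mu_i = \exp\left(-\dfrac{1}{2}\left(\dfrac{(u_1 - c_{i1})^2}{\sigma_{i1}^2} + \dots + \dfrac{(u_p - c_{ip})^2}{\sigma_{ip}^2}\right)\right)$
- Assume the weighting functions have already been determined. Then the parameters of each linear model are estimated separately by a weighted least-squares technique.
With the data matrix $X$ (inputs of the model, known), the diagonal weighting matrix $Q_i$ (each entry is the weighting function value for the corresponding input data sample), and the desired outputs $y$, the optimal parameters of the i-th local model are:
$\hat{w}_i = (X^T Q_i X)^{-1} X^T Q_i \, y$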
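The weighted least-squares step can be sketched for the simplest case of a single input, where each local model is a line and the 2x2 normal equations have a closed-form solution. The data, centers, and standard deviations below are hypothetical, and the function names are chosen for this example only.

```python
import math

def validity(u, centers, sigmas):
    """Normalized Gaussian weighting functions Phi_i(u); they sum to 1."""
    mu = [math.exp(-0.5 * ((u - c) / s) ** 2) for c, s in zip(centers, sigmas)]
    total = sum(mu)
    return [m / total for m in mu]

def weighted_ls_line(us, ys, q):
    """Solve (X^T Q X) w = X^T Q y for rows X = [1, u] and Q = diag(q)."""
    s0 = sum(q)
    s1 = sum(qi * u for qi, u in zip(q, us))
    s2 = sum(qi * u * u for qi, u in zip(q, us))
    t0 = sum(qi * y for qi, y in zip(q, ys))
    t1 = sum(qi * u * y for qi, u, y in zip(q, us, ys))
    det = s0 * s2 - s1 * s1
    return ((s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det)

def fit_local_models(us, ys, centers, sigmas):
    """Estimate each local linear model separately, weighted by its Phi_i."""
    phis = [validity(u, centers, sigmas) for u in us]
    return [weighted_ls_line(us, ys, [p[i] for p in phis])
            for i in range(len(centers))]

def predict(u, models, centers, sigmas):
    """Model output: sum_i (w_i0 + w_i1 * u) * Phi_i(u)."""
    phi = validity(u, centers, sigmas)
    return sum(p * (w0 + w1 * u) for p, (w0, w1) in zip(phi, models))

# Hypothetical data from an exactly linear process y = 2u + 1.
us = [i / 10.0 for i in range(21)]
ys = [2.0 * u + 1.0 for u in us]
centers, sigmas = [0.5, 1.5], [0.3, 0.3]
models = fit_local_models(us, ys, centers, sigmas)
y_at_07 = predict(0.7, models, centers, sigmas)
```

For exactly linear data every local model should recover the same line, which is a convenient sanity check on the weighting and the solver.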
- Input space partitioning
1- Set the first hyper-rectangle in such a way that it contains all data points. Estimate a global linear model.
2- For all input dimensions j = 1, ..., n:
2a. Cut the hyper-rectangle into two halves along dimension j.
2b. Estimate local linear models for each half.
2c. Calculate the global approximation error (output error) for the model with this cut.
3- Determine which cut has led to the smallest approximation error.
4- Perform this cut. Place a weighting function at the center of each of the two new hyper-rectangles. Set the standard deviations of both weighting functions proportional to the extension of the hyper-rectangle in each dimension. Apply the corresponding estimated local linear models (from 2b).
5- Calculate the local error measures J on the basis of a parallel running model for each hyper-rectangle.
6- Choose the hyper-rectangle with the largest local error measure J.
7- If the global approximation error on a parallel model (output error) is too large, go to step 2.
8- Convergence. Stop.
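The partitioning loop in steps 1 to 8 can be sketched compactly. This is a simplified illustration rather than Nelles' reference implementation: it stops after a fixed number of local models instead of testing an error threshold, sets each standard deviation to one third of the rectangle extension (an assumed proportionality factor, `K_SIGMA`), defines the local error as the validity-weighted squared error, and uses a hypothetical two-input test function that depends only on the first input.

```python
import math

K_SIGMA = 1.0 / 3.0  # assumed ratio of standard deviation to rectangle extension

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def validity(u, rects):
    """Normalized Gaussians centered in each hyper-rectangle (lo, hi)."""
    mus = []
    for lo, hi in rects:
        z = sum(((x - (l + h) / 2) / (K_SIGMA * (h - l))) ** 2
                for x, l, h in zip(u, lo, hi))
        mus.append(math.exp(-0.5 * z))
    total = sum(mus)
    return [m / total for m in mus]

def fit(U, y, rects):
    """Weighted LS per local model; returns (global error, local errors)."""
    Phi = [validity(u, rects) for u in U]
    X = [[1.0] + list(u) for u in U]
    n = len(X[0])
    params = []
    for i in range(len(rects)):
        q = [p[i] for p in Phi]
        A = [[sum(qk * r[a] * r[c] for qk, r in zip(q, X)) for c in range(n)]
             for a in range(n)]
        rhs = [sum(qk * r[a] * yk for qk, r, yk in zip(q, X, y)) for a in range(n)]
        params.append(solve(A, rhs))
    err2 = [(yk - sum(p * sum(w * x for w, x in zip(ws, r))
                      for p, ws in zip(phi, params))) ** 2
            for phi, r, yk in zip(Phi, X, y)]
    local = [sum(p[i] * e for p, e in zip(Phi, err2)) for i in range(len(rects))]
    return sum(err2), local

def lolimot(U, y, max_models=3):
    dims = len(U[0])
    rects = [([min(u[d] for u in U) for d in range(dims)],
              [max(u[d] for u in U) for d in range(dims)])]      # step 1
    err, local = fit(U, y, rects)
    history = [(None, err)]
    while len(rects) < max_models:
        worst = max(range(len(rects)), key=lambda i: local[i])   # step 6
        lo, hi = rects[worst]
        best = None
        for d in range(dims):                                    # step 2
            mid = (lo[d] + hi[d]) / 2
            halves = [(lo, hi[:d] + [mid] + hi[d + 1:]),
                      (lo[:d] + [mid] + lo[d + 1:], hi)]
            trial = rects[:worst] + halves + rects[worst + 1:]
            e, loc = fit(U, y, trial)
            if best is None or e < best[1]:                      # step 3
                best = (d, e, trial, loc)
        d, err, rects, local = best                              # steps 4 and 5
        history.append((d, err))
    return rects, history

# Hypothetical data: y depends nonlinearly on the first input only.
U = [(i / 10.0, j / 10.0) for i in range(11) for j in range(11)]
y = [u[0] ** 2 for u in U]
rects, history = lolimot(U, y)
```

`history` records, for every performed cut, the chosen dimension and the resulting global error, so the effect of each split can be inspected.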
LOLIMOT
Example:
Properties:
High interpretability of rules
Automatic partitioning of the input space according to the system properties
Different objective functions for modeling error and structure optimization
Low sensitivity to user-selected parameters
No curse of dimensionality for high-dimensional problems
Implementing Hierarchical Fuzzy Clustering in Fuzzy Identification
Using Weighted Fuzzy C-means
Clustering
- Definition: to divide the data set in such a way that objects belonging to the same cluster are as similar as possible and objects belonging to different clusters are as dissimilar as possible
- Types
1- Crisp
2- Fuzzy
- Properties
1- Unsupervised learning task
2- Nonlinear optimization
3- Computational economy
4- Needs user-defined parameters
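Fuzzy C-means (FCM)
Cost function:
$J = \sum_{i=1}^{C} \sum_{k=1}^{N} \mu_{ik}^{m} \, \lVert x_k - v_i \rVert^2$, with $\sum_{i=1}^{C} \mu_{ik} = 1$ for every data point $x_k$
m → 1: clusters → crisp
m → ∞: clusters → fuzzy
Iterative training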
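The iterative training alternates between updating the memberships and updating the cluster centers. The sketch below uses the standard FCM update equations with Euclidean distance; the data set (two small, well-separated squares), the initial centers, and the fixed iteration count are assumptions made for the example.

```python
import math

def memberships(points, centers, m):
    """mu_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)); each row sums to 1."""
    expo = 2.0 / (m - 1.0)
    U = []
    for x in points:
        # Clip distances so a point sitting on a center does not divide by zero.
        d = [max(math.dist(x, v), 1e-12) for v in centers]
        U.append([1.0 / sum((di / dj) ** expo for dj in d) for di in d])
    return U

def fcm(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    dim = len(points[0])
    for _ in range(iters):
        U = memberships(points, centers, m)
        centers = [
            tuple(
                sum(u[i] ** m * x[k] for u, x in zip(U, points))
                / sum(u[i] ** m for u in U)
                for k in range(dim)
            )
            for i in range(len(centers))
        ]
    return centers, memberships(points, centers, m)

# Hypothetical data: two well-separated groups of four points each.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
points = square + [(x + 10.0, y + 10.0) for x, y in square]
centers, U = fcm(points, centers=[(0.0, 0.0), (11.0, 11.0)])
```

With m = 2 the memberships stay slightly fuzzy even for well-separated groups; pushing m toward 1 makes them approach a crisp assignment.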
Example of fuzzy C-means
Weighted fuzzy C-means (WFCM)
Some points are more important
Self-organizing map (SOM):
The most famous neural-network-based clustering method
K-means (crisp C-means) with sequential training
SOM algorithm:
1- Choose initial values for the C neuron vectors $c_j$, $j = 1, \dots, C$. This can be done by randomly picking C different data samples.
2- Choose one sample u from the data set. This can be done either randomly or by systematically going through the whole data set.
3- Calculate the distance of the selected data sample to all neuron vectors. Typically, the Euclidean distance measure is used. The neuron with the vector closest to the data sample is called the winner neuron.
4- Update the vector of the winner neuron in a way that moves it toward the selected data sample u:
$c_{win}^{new} = c_{win}^{old} + \eta \, (u - c_{win}^{old})$, where $\eta$ is the step size
5- If any neuron vector has been moved significantly in the previous step, go to step 2; otherwise stop.
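The five steps can be sketched as a sequential winner-only update. Several choices here are assumptions rather than part of the slide: the samples are visited systematically in order, the step size decays by a constant factor after each sweep so that the stopping test in step 5 is eventually met, and the initial neuron vectors are passed in explicitly (two chosen data samples) to keep the example deterministic.

```python
import math

def train_winner_take_all(data, neurons, eta=0.1, decay=0.95,
                          tol=1e-4, max_epochs=500):
    """Sequential clustering that moves only the winner neuron per sample."""
    neurons = [list(c) for c in neurons]                 # step 1: initial vectors
    for _ in range(max_epochs):
        largest_move = 0.0
        for u in data:                                   # step 2: next sample
            # Step 3: winner = neuron with the smallest Euclidean distance.
            win = min(range(len(neurons)),
                      key=lambda j: math.dist(u, neurons[j]))
            # Step 4: move the winner toward the sample.
            step = [eta * (uk - ck) for uk, ck in zip(u, neurons[win])]
            neurons[win] = [ck + sk for ck, sk in zip(neurons[win], step)]
            largest_move = max(largest_move, math.hypot(*step))
        if largest_move < tol:                           # step 5: stop test
            break
        eta *= decay                                     # assumed step-size decay
    return neurons

# Hypothetical data: two well-separated groups of four points each.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
data = square + [(x + 10.0, y + 10.0) for x, y in square]
neurons = train_winner_take_all(data, neurons=[data[0], data[-1]])
```

Each neuron ends up close to the mean of the samples it wins, which is why the slide describes the method as K-means with sequential training.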
Fuzzy clustering for fuzzy identification
It is an unsupervised learning task, so it does not need any additional data.
The input-space term-sets are a direct result of the clustering process
Computational economy
Application of clustering in fuzzy modeling
1- applying clustering algorithms to input data only
2- applying clustering algorithms to output data only
3- applying clustering algorithms to a vector composedof input and output data.
FCM for input space partitioning
FCM requires a priori knowledge of the number of clusters
- determining the number of clusters in an iterative manner
- using optimal fuzzy clustering methods
Dependence of FCM on the initialization
- hierarchical clustering
Interpretability of the final fuzzy model
- model simplification methods
Algorithm:
1- Apply the SOM algorithm to classify the N data samples into n crisp clusters.
2- Select the n cluster centers from the previous step and assign a weight to each of them according to its relative cardinality.
3- Apply WFCM to classify the n weighted cluster centers into C new clusters.
4- The centers of the Gaussian membership functions in the premise of the fuzzy rules are obtained by simply projecting the final cluster centers onto each axis. To calculate the respective standard deviations, utilize the fuzzy covariance matrix. [5]
5- Use weighted least squares to optimize the consequent parameters and steepest descent for the premise parameters. (Formulas: [5])
6- Merge similar membership functions for interpretability, using a similarity measure.
7- Optimize the consequent parameters again.
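Steps 2 and 3 of the algorithm above can be sketched as follows: each crisp cluster center receives a weight equal to its relative cardinality, and a weighted FCM then clusters these few weighted centers. The weighted center update used here (each point's contribution multiplied by its weight) is a common WFCM formulation and is an assumption about the method in [5]; the one-dimensional prototypes and their counts are made up for the example.

```python
def wfcm(points, weights, centers, m=2.0, iters=100):
    """Weighted fuzzy C-means on 1-D points; point k carries weight w_k."""
    expo = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership update: same form as in plain FCM.
        U = []
        for x in points:
            d = [max(abs(x - v), 1e-12) for v in centers]
            U.append([1.0 / sum((di / dj) ** expo for dj in d) for di in d])
        # Center update: heavier points pull the center more strongly.
        centers = [
            sum(w * u[i] ** m * x for w, u, x in zip(weights, U, points))
            / sum(w * u[i] ** m for w, u in zip(weights, U))
            for i in range(len(centers))
        ]
    return centers

# Hypothetical result of the first (crisp) stage: prototypes and cluster sizes.
prototypes = [0.0, 1.0, 10.0, 11.0]
counts = [1, 9, 5, 5]
total = sum(counts)
weights = [c / total for c in counts]          # relative cardinality
final_centers = wfcm(prototypes, weights, centers=[0.0, 11.0])

# With a single cluster, the center reduces to the weighted mean.
single_center = wfcm([0.0, 1.0], [0.25, 0.75], centers=[0.3])
```

Because the second stage works on n weighted prototypes instead of N raw samples, it is cheap, and the weights keep the result faithful to where most of the data actually lie.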
Example I:
Example:
[Figure: data in the (X1, X2) plane, clustered first by SOM and then by WFCM. Cluster weights: green w=2/51, red w=4/51, black w=10/51, dark blue w=8/51, blue w=3/51]
Example (cont.):
[Figure: initial, final, and simplified term-sets for x1 (simplified labels: small, medium, large) and for x2 (simplified labels: small, large)]
R1: if x1 is small and x2 is small then y = 17.3 - 2.6x1 + 1.4x2
R2: if x1 is medium and x2 is large then y = 7.5 - 2.9x1 - 0.02x2
R3: if x1 is large and x2 is small then y = 4.7 + 2.7x1 - 7.8x2
R4: if x1 is large and x2 is large then y = 2.8 - 0.2x1 - 0.2x2
J=0.1801
J=0.0018
J=0.0154
Example II:
[Figure: time series x versus t, for t from 0 to 500]
Inputs: x(t-18), x(t-12), x(t-6), x(t)
Output: x(t+6)
Example II (cont.):
[Figure: initial, final, and simplified term-sets for x(t-18) and x(t-12)]
Example II (cont.):
[Figure: initial, final, and simplified term-sets for x(t-6) and x(t)]
J=0.0166
J=0.0072
J=0.0128
Benefits compared with similar approaches:
It does not need any additional data
Low sensitivity to user-selected parameters and initial conditions
Computational economy
Curse of dimensionality
Interpretability
Sensitivity to data distribution
Universal approximator
Proof: [6]
References:
1- Babuška, R. and Verbruggen, H. (2003). Neuro-fuzzy methods for nonlinear system identification, Review. Annual Reviews in Control, 27, 73-85.
2- Haykin, S. (1998). Neural Networks: A Comprehensive Foundation. Prentice Hall.
4- Jang, J.-S. R. (1993). ANFIS: Adaptive-network-based fuzzy inference systems. IEEE Transactions on Systems, Man & Cybernetics, 23(3), 665-685.
3- Nelles, O. and Isermann, R. (1996). Basis function networks for interpolation of local linear models. In: IEEE Conference on Decision and Control (CDC), 470-475.
4- Nelles, O. (2002). Nonlinear System Identification. Springer Verlag, Berlin.
5- Oliveira, J. V. and Pedrycz, W. (2007). Advances in Fuzzy Clustering and its Applications. John Wiley & Sons, chapter 12.
6- Espinosa, J., Vandewalle, J., Wertz, V. (2004). Fuzzy Logic, Identification and Predictive Control. Springer Verlag, Berlin.
Questions and Discussion
Thanks for your attention