
Fuzzy Sets and Systems 130 (2002) 21–31
www.elsevier.com/locate/fss

Multiobjective fuzzy regression with central tendency and possibilistic properties

Liem Tran (a, ∗), Lucien Duckstein (b)

(a) Coll. of Earth and Mineral Sciences, Center for Integrated Regional Assessment, 2217 Earth & Engrg. Science Bldg., The Pennsylvania State University, University Park, PA 16802, USA

(b) Ecole Nationale du Genie Rural, des Eaux et des Forets, 75732 Paris 15, France

Received 31 March 1999; received in revised form 21 February 2001; accepted 8 June 2001

Abstract

A multiobjective fuzzy regression model (MOFR) is developed. This MOFR model combines central tendency and possibilistic properties of statistical and fuzzy regressions and overcomes several shortcomings of these two approaches. A new class of distance measure for two intervals that takes into account all the points in both intervals is introduced. The methodology is illustrated by numerical examples. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Fuzzy statistics and data analysis; Fuzzy mathematical programming; Multiple criteria evaluation; Fuzzy regression

1. Introduction

The purpose of this paper is to introduce a multiobjective fuzzy regression (MOFR) model that combines the central tendency and possibilistic properties of statistical and fuzzy regressions. The model overcomes several shortcomings of current approaches: (1) for fuzzy regression, the failure to take all data points into account and the sensitivity to data outliers; (2) for statistical regression, the difficulty of verifying distribution assumptions, insufficient and/or inaccurate input and/or output data, and vagueness of the relationship between input and output variables. Furthermore, this model keeps the good characteristics of the two approaches and provides a smooth transition from fuzzy to statistical regression when data are sufficient and the assumptions for statistical regression can be satisfactorily verified.

The paper is organized as follows. A review of current fuzzy regression models is presented in the next section. A new class of distance measure for intervals is introduced in Section 3. The MOFR is developed in Section 4, leading to the linear programming algorithms applied to two numerical examples in Section 5. Section 6 is devoted to discussion and conclusions.

∗ Corresponding author. Tel.: +1-814-865-1587; fax: +1-814-865-3191. E-mail address: [email protected] (L. Tran).

0165-0114/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0165-0114(01)00138-5


2. Fuzzy regression—a review

Generally, there are two main directions in fuzzy regression model development: fuzzy linear regression (FLR) and fuzzy least-squares regression (FLSR) [10,17]. The first direction includes the original FLR model proposed by Tanaka et al. [30] and its variations, such as the models in Tanaka [28], Sakawa and Yano [23,24], Peters [20], and Kim and Bishu [16]. FLSR was first introduced by Diamond [8] and Celmins [5,6]. Along the same line, there are several variations, such as those in Savic and Pedrycz [25], Bárdossy et al. [1], Chang and Lee [7], and Tanaka and Lee [29].

Of these two directions, FLR has been criticized significantly, especially in the original formulation of Tanaka et al. [30]. Savic and Pedrycz [25] noted that not all data points are allowed to influence the estimated parameters in FLR. Furthermore, the model is sensitive to data outliers, and prediction intervals become wider as more data are collected [21,22]. The original FLR, with the objective function of minimizing the sum of parameter widths, may yield some crisp parameter estimates [5,6], and it is scale dependent [14]. Redden and Woodall [21] pointed out that there might be multiple solutions in some cases because the modal values of fuzzy parameters do not appear in the objective function. On the other hand, FLSR has drawn very few criticisms because of its similarity to traditional least-squares regression. However, the introduction of the fuzzy parameters' modal values using conventional linear regression causes FLSR to be sensitive to outliers. Furthermore, FLSR should be used only when enough data are available, which forfeits one of the advantages of fuzzy regression: the ability to deal with data insufficiency.

Özelkan [17], Özelkan et al. [18], and Özelkan and Duckstein [19] developed a bi-objective fuzzy regression (BOFR) model that is capable of solving several of the FLR problems mentioned above, namely the sensitivity to data outliers and the failure of some data points to contribute to parameter estimation. This model can be rewritten as

$$
\begin{aligned}
&\text{Min-dominated } \{V, E_p\}\\
&\text{subject to}\\
&\quad P_{L,t}(\alpha) - O_{L,t}(\alpha) \le \varepsilon_{L,t}, \quad O_{R,t}(\alpha) - P_{R,t}(\alpha) \le \varepsilon_{R,t} \quad \forall t \in T,\\
&\quad \varepsilon_{L,t},\ \varepsilon_{R,t} \ge 0 \quad \forall t \in T,
\end{aligned}
\tag{1}
$$

where $O$ and $P$ are fuzzy numbers representing observations and predictions, respectively; $P = AX$, where $A$ is the vector of the model's (fuzzy) parameters and $X$ is the vector of (fuzzy) input variables; $O_{L,t}(\alpha)$, $O_{R,t}(\alpha)$ and $P_{L,t}(\alpha)$, $P_{R,t}(\alpha)$ are the lower and upper values of the $\alpha$-level intervals of the fuzzy numbers $O$ and $P$, respectively; $T$ is the set of observations; $V = \sum_t (P_{R,t}(\alpha) - P_{L,t}(\alpha))$ denotes the vagueness measure to be minimized, defined as the spread of the predictions; $\varepsilon_{L,t}$, $\varepsilon_{R,t}$ are relaxation variables; $E_p = \sum_t (\varepsilon_{L,t}^p + \varepsilon_{R,t}^p)$ is the deviation from outliers; and $1 \le p \le \infty$ shows the compensation level. The expression “min-dominated” indicates the process of finding a non-inferior solution. It reflects the fact that there is often no optimum solution in multiobjective decision-making (MODM) but only a compromise one, chosen from the efficient set (or Pareto set). Moreover, Özelkan [17] and Özelkan and Duckstein [19] proved that the FLR models of Tanaka et al. [30], Tanaka [28], and Peters [20], as well as the classical crisp (non-fuzzy) $L_p$ regression model, are specific cases of their model. Although it has several advantages over other FLR models, Özelkan's model still has some drawbacks: the modal values of fuzzy parameters do not appear in the objective function, and the central tendency property does not exist fully and explicitly in the model. These points will be illustrated in the first numerical example later.
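The notion of a non-inferior (Pareto) solution used above can be made concrete with a small sketch. The function names and the candidate (V, E) pairs below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from typing import List, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(candidates: List[Objectives]) -> List[Objectives]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical (V, E) pairs, for illustration only.
candidates = [(10.0, 4.0), (20.0, 3.0), (40.0, 1.0), (40.0, 2.0), (25.0, 3.5)]
front = pareto_set(candidates)
print(front)
```

In this toy set, (40, 2) is dropped because (40, 1) is at least as good in both objectives, and (25, 3.5) is dropped because (20, 3) is better in both. The remaining points are mutually non-dominated, so choosing among them requires the decision maker's preferences.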

3. Distance measures

A new distance measure for two intervals is presented in this section. It will be used later to construct one objective function in the MOFR: the distance between the modal values of the regression line (or hyperplane) and the observed data. Throughout this paper, the term “interval” refers to a closed interval denoted by its lower and upper limits, $A(a_1, a_2) = \{a \mid a_1 \le a \le a_2,\ a \in \mathbb{R}\}$.

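Let $F(\mathbb{R})$ be the set of intervals in $\mathbb{R}$, and let the distance between two intervals $A(a_1, a_2)$ and $B(b_1, b_2)$ be defined as

$$
D^2(A, B) = \int_{-1/2}^{1/2} \left\{ \left[ \tfrac{1}{2}(a_1 + a_2) + x(a_2 - a_1) \right] - \left[ \tfrac{1}{2}(b_1 + b_2) + x(b_2 - b_1) \right] \right\}^2 dx \tag{2}
$$

$$
\phantom{D^2(A, B)} = \left[ \tfrac{1}{2}(a_1 + a_2) - \tfrac{1}{2}(b_1 + b_2) \right]^2 + \tfrac{1}{3} \left[ \tfrac{1}{2}(a_2 - a_1) - \tfrac{1}{2}(b_2 - b_1) \right]^2. \tag{3}
$$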

It can be proved that $D(A, B) = \sqrt{D^2(A, B)}$ is a distance on $F(\mathbb{R})$. Firstly, $D(A, B) \ge 0$. Symmetry is transparent. If $D(A, B) = 0$ then $A = B$. The triangle inequality follows from the fact that the function to be integrated in (2) is the square of a Euclidean distance.
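As a check on the step from (2) to (3), the integral can be worked out directly. The shorthand $m$ and $w$ below is introduced only for this sketch and is not part of the paper's notation.

```latex
% Shorthand assumed for this sketch:
%   m = difference of the interval midpoints, w = difference of the interval widths.
\begin{align*}
m &= \tfrac{1}{2}(a_1 + a_2) - \tfrac{1}{2}(b_1 + b_2), \qquad
w = (a_2 - a_1) - (b_2 - b_1), \\
D^2(A, B) &= \int_{-1/2}^{1/2} (m + x\,w)^2 \, dx
  = m^2 \int_{-1/2}^{1/2} dx + 2mw \int_{-1/2}^{1/2} x \, dx + w^2 \int_{-1/2}^{1/2} x^2 \, dx \\
&= m^2 + 0 + \tfrac{1}{12}\, w^2
  = m^2 + \tfrac{1}{3} \left( \tfrac{1}{2} w \right)^2 .
\end{align*}
```

Since $\tfrac{1}{2} w$ is the difference of the half-widths, the last expression is exactly (3). The cross term vanishes because the integration range is symmetric about zero, which is why only the midpoints and half-widths survive in the operational form.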

Although only the lower and upper bound values appear in the operational form (3) derived from (2), the integral in (2) shows that this distance takes into account every point of both intervals. In this respect it differs from most existing distance measures for intervals, which often use only the lower and upper bound values (e.g., those used in Bárdossy et al. [1], Diamond [8], Diamond and Körner [9], and Diamond and Tanaka [10]). Bertoluzza et al. [2] proposed a distance measure for intervals that also considers every point of both intervals. Its general form, however, is too complicated, and the authors later restricted the measure to a particular case with a finite number of considered values for operational purposes.

The following example shows that the distance defined in this paper, referred to here as the Tran and Duckstein distance, has better properties than the one used in Diamond [8] and Bárdossy et al. [1], which is a combination of the two differences $|a_1 - b_1|$ and $|a_2 - b_2|$. Consider the crisp number $A(0)$ and the two intervals $B(-1, 3)$ and $C(1, 3)$. Using the measure applied in Diamond [8] and Bárdossy et al. [1], it is found that $d^2(A, B) = d^2(A, C) = 10$. With the Tran and Duckstein distance measure, $D^2(A, B) = 7/3 < D^2(A, C) = 13/3$. This result makes more sense, as $A$ is inside $B$ and outside $C$, leading to the expectation that $D(A, B) < D(A, C)$. In the case of crisp numbers, the new interval distance measure becomes the Euclidean distance. Hence it is a generalization of the usual Euclidean distance, a feature needed for calculations with mixed crisp numbers and intervals. It should be mentioned that this feature is not seen in the common Hausdorff distance or the “dissemblance index” distance of Kaufman and Gupta [15].
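The comparison above is easy to reproduce. The sketch below implements the operational form (3), a numerical approximation of the integral (2) as a cross-check, and an endpoint-only measure. The endpoint measure is assumed here to be $(a_1 - b_1)^2 + (a_2 - b_2)^2$, which is consistent with the value 10 quoted in the text; the function names are illustrative.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower, upper)


def d2_tran_duckstein(a: Interval, b: Interval) -> float:
    """Squared distance of Eq. (3): midpoint term plus one third of the half-width term."""
    mid = 0.5 * (a[0] + a[1]) - 0.5 * (b[0] + b[1])
    half_width = 0.5 * (a[1] - a[0]) - 0.5 * (b[1] - b[0])
    return mid ** 2 + half_width ** 2 / 3.0


def d2_endpoints(a: Interval, b: Interval) -> float:
    """Endpoint-only measure, assumed here to be (a1 - b1)^2 + (a2 - b2)^2."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def d2_numeric(a: Interval, b: Interval, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral in Eq. (2), as a cross-check."""
    total = 0.0
    for k in range(n):
        x = -0.5 + (k + 0.5) / n
        pa = 0.5 * (a[0] + a[1]) + x * (a[1] - a[0])
        pb = 0.5 * (b[0] + b[1]) + x * (b[1] - b[0])
        total += (pa - pb) ** 2
    return total / n


A, B, C = (0.0, 0.0), (-1.0, 3.0), (1.0, 3.0)
print(d2_endpoints(A, B), d2_endpoints(A, C))            # both 10.0
print(d2_tran_duckstein(A, B), d2_tran_duckstein(A, C))  # about 7/3 and 13/3
```

For two crisp numbers the half-width term is zero, so the function reduces to the squared Euclidean distance, for example `d2_tran_duckstein((2.0, 2.0), (5.0, 5.0))` gives 9.0.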

4. Formulation of the three-objective fuzzy regression

Using the idea of Özelkan [17], Özelkan et al. [18], and Özelkan and Duckstein [19] for dealing with outliers, and using the residuals from the center of the regression line (or hyperplane) to the observed data as an objective function, the following three-objective fuzzy regression (TOFR) model is formulated:

$$
\begin{aligned}
&\text{Min-dominated } \{R, V_p, E_p\}\\
&\text{subject to}\\
&\quad |P_t(\alpha) - O_t(\alpha)| \le r_t \quad \forall t \in T,\\
&\quad r_t \ge 0 \quad \forall t \in T,\\
&\quad P_{L,t}(\alpha) - O_{L,t}(\alpha) \le \varepsilon_{L,t}, \quad O_{R,t}(\alpha) - P_{R,t}(\alpha) \le \varepsilon_{R,t} \quad \forall t \in T,\\
&\quad \varepsilon_{L,t},\ \varepsilon_{R,t} \ge 0 \quad \forall t \in T,
\end{aligned}
\tag{4}
$$

where $O$ and $P$ are fuzzy numbers representing observations and predictions, respectively; $P = AX$, where $A$ is the vector of the model's (fuzzy) parameters and $X$ is the set of (fuzzy) input variables; $O_{L,t}(\alpha)$, $O_{R,t}(\alpha)$ and $P_{L,t}(\alpha)$, $P_{R,t}(\alpha)$ are the lower and upper values of the $\alpha$-level intervals of the fuzzy numbers $O$ and $P$, respectively; $T$ is the set of observations; $r_t$ is the residual from $O_t(\alpha)$ to (the corresponding interval on) the center line of the fuzzy prediction $P_t(\alpha)$ (Fig. 1); $R = \sum_t r_t$; $V_p = \sum_t (P_{R,t}(\alpha) - P_{L,t}(\alpha))^p$ denotes the vagueness measure to be minimized; $\varepsilon_{L,t}$, $\varepsilon_{R,t}$ are relaxation variables describing the deviation from outliers; $E_p = \sum_t (\varepsilon_{L,t}^p + \varepsilon_{R,t}^p)$; and $1 \le p \le \infty$ shows the compensation level. In fact, $R$ alone is least absolute value regression (often referred to as $L_1$ regression) extended to fuzzy observations using the distance measure defined in Section 3. In other words, the TOFR model finds the fuzzy parameters by considering the relationship between the fuzzy parameters and the observations at a selected membership level $\alpha$, called the level of credibility, based on three performance criteria: central tendency, spread of predictions, and treatment of outliers. An illustration of the TOFR model can be seen in Fig. 1.

Fig. 1. An illustration of the TOFR model.

The following lemma shows that the BOFR of Özelkan [17], Özelkan et al. [18], and Özelkan and Duckstein [19], the FLR of Tanaka et al. [30] and Tanaka [28], the FLR of Peters [20], the FLSR of Diamond [8], and the classical $L_p$ regression are specific cases of TOFR.

Lemma 1. (a) The BOFR of Özelkan [17], Özelkan et al. [18], and Özelkan and Duckstein [19]; (b) the FLR of Tanaka et al. [30] and Tanaka [28], the FLR of Peters [20], the FLSR of Diamond [8], and the classical $L_p$ regression are specific cases of TOFR.

Proof. (a) Let every $r_t$ be fixed at a common constant large enough to exceed any possible residual between observed and predicted data. The objective function $R$ is then a constant, the residual constraints in (4) are never binding, and the TOFR becomes exactly the BOFR of Özelkan [17], Özelkan et al. [18], and Özelkan and Duckstein [19].

(b) Özelkan [17] and Özelkan and Duckstein [19] proved that the FLR of Tanaka et al. [30] and Tanaka [28], the FLR of Peters [20], the FLSR of Diamond [8], and the classical $L_p$ regression are specific cases of their BOFR model. From this assertion and from (a), it follows that these models are also specific cases of the TOFR model.

5. Linear programming formulation for three-objective fuzzy regression

To be able to deal with curvilinear membership functions, the left–right (L–R) fuzzy numbers of Dubois and Prade [11] are used, as in Bárdossy et al. [1]. By definition, a fuzzy set $A$ on the set of real numbers is called an L–R fuzzy number if the membership of $x$ can be calculated as

$$
\mu(x) =
\begin{cases}
L_A\!\left( \dfrac{a - x}{\beta_A} \right) & \text{for } x < a \text{ and } \beta_A > 0,\\[2ex]
R_A\!\left( \dfrac{x - a}{\gamma_A} \right) & \text{for } x \ge a \text{ and } \gamma_A > 0,
\end{cases}
\tag{5}
$$

where $a$ is the modal value of $A$ and $\beta_A$, $\gamma_A$ are its left and right spreads. Here $L$ and $R$ are continuous, strictly decreasing functions on $[0, 1]$ with

$$
L(x) = R(x) = 1 \ \text{if } x \le 0, \qquad L(x) = R(x) = 0 \ \text{if } x \ge 1.
$$
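A minimal sketch of (5) for the simplest case may help. It assumes linear reference functions $L(u) = R(u) = 1 - u$ on $[0, 1]$, which give the triangular fuzzy numbers used in the numerical examples; the function and argument names are illustrative and not taken from the paper.

```python
from typing import Tuple


def shape_linear(u: float) -> float:
    """Linear reference function: 1 for u <= 0, 0 for u >= 1, strictly decreasing between."""
    return min(1.0, max(0.0, 1.0 - u))


def membership(x: float, a: float, left: float, right: float) -> float:
    """Membership of x in an L-R fuzzy number with mode a and positive spreads, as in Eq. (5)."""
    if x < a:
        return shape_linear((a - x) / left)
    return shape_linear((x - a) / right)


def alpha_cut(alpha: float, a: float, left: float, right: float) -> Tuple[float, float]:
    """alpha-level interval for the linear shape, whose inverse is 1 - alpha."""
    return (a - (1.0 - alpha) * left, a + (1.0 - alpha) * right)


# Hypothetical fuzzy number: mode 2, left spread 1, right spread 2.
print(membership(1.5, 2.0, 1.0, 2.0))  # halfway down the left side
print(alpha_cut(0.5, 2.0, 1.0, 2.0))   # interval at membership level 0.5
```

With other reference functions only `shape_linear` and the inverse used in `alpha_cut` would change; the structure of (5) stays the same.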

Let $(X_{1,t}, X_{2,t}, \ldots, X_{K,t}, O_t)$ for $t = 1, \ldots, T$ be the set of (fuzzy) observations, where $K$ is the number of (fuzzy) input variables. Let

$$
P = \sum_{i=1}^{K} A_i \otimes X_i + A_0 \tag{6}
$$

be the fuzzy linear function to be estimated with the TOFR, where $A_i$, $i = 0, \ldots, K$, are the (fuzzy) parameters. In general, the multiplication of two intervals involves minimization and maximization over all possible products of the lower and upper bounds of the two intervals, which complicates the linear programming formulation for estimating (6). However, in most applications the sign of each fuzzy parameter $A_i$ is known in advance, and one does not expect it to change from negative to positive at any $\alpha$-level interval. Within this context, it is reasonable to assume that every fuzzy parameter $A_i$ is either nonpositive or nonnegative for every $\alpha$-level interval. Furthermore, it is also assumed that the $\alpha$-level interval of the fuzzy number $X_i$ is nonnegative for any $\alpha \in [0, 1]$. Then the $\alpha$-level interval of $P$ in (6) can be expressed as follows:

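$$
\begin{aligned}
P_{L,t}(\alpha) ={}& \sum_{j \in J_1} \left( a_j - L_{A_j}^{-1}(\alpha)\,\beta_{A_j} \right) \left( x_{j,t} - L_{X_{j,t}}^{-1}(\alpha)\,\beta_{X_{j,t}} \right)
 + \sum_{j \in J_2} \left( a_j - L_{A_j}^{-1}(\alpha)\,\beta_{A_j} \right) \left( x_{j,t} + R_{X_{j,t}}^{-1}(\alpha)\,\gamma_{X_{j,t}} \right) \\
 &+ \left( a_0 - L_{A_0}^{-1}(\alpha)\,\beta_{A_0} \right), \\
P_{R,t}(\alpha) ={}& \sum_{j \in J_1} \left( a_j + R_{A_j}^{-1}(\alpha)\,\gamma_{A_j} \right) \left( x_{j,t} + R_{X_{j,t}}^{-1}(\alpha)\,\gamma_{X_{j,t}} \right)
 + \sum_{j \in J_2} \left( a_j + R_{A_j}^{-1}(\alpha)\,\gamma_{A_j} \right) \left( x_{j,t} - L_{X_{j,t}}^{-1}(\alpha)\,\beta_{X_{j,t}} \right) \\
 &+ \left( a_0 + R_{A_0}^{-1}(\alpha)\,\gamma_{A_0} \right),
\end{aligned}
\tag{7}
$$

where $J_1$ is the subset of all nonnegative fuzzy parameters, $J_2$ is the subset of all nonpositive fuzzy parameters, and $J_1 \cap J_2 = \emptyset$; $a_j$ and $x_{j,t}$ are the modal values of $A_j$ and $X_{j,t}$, and $\beta$ and $\gamma$ denote the corresponding left and right spreads.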
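The sign-based pairing of bounds in (7) can be sketched for triangular fuzzy numbers, for which the inverse reference functions are simply $1 - \alpha$. Each fuzzy number is stored here as its (lower, modal, upper) triple at $\alpha = 0$; this representation, the function names, and the sample values are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Triangular = Tuple[float, float, float]  # (lower, modal, upper) bounds at alpha = 0


def cut(tri: Triangular, alpha: float) -> Tuple[float, float]:
    """alpha-level interval of a triangular fuzzy number."""
    lo, mode, hi = tri
    return (mode - (1.0 - alpha) * (mode - lo), mode + (1.0 - alpha) * (hi - mode))


def prediction_interval(a0: Triangular, params: Sequence[Triangular],
                        inputs: Sequence[Triangular], alpha: float) -> Tuple[float, float]:
    """(P_L, P_R) of Eq. (7) for sign-definite parameters and nonnegative inputs."""
    p_lo, p_hi = cut(a0, alpha)
    for a, x in zip(params, inputs):
        a_lo, a_hi = cut(a, alpha)
        x_lo, x_hi = cut(x, alpha)
        if x_lo < 0.0:
            raise ValueError("inputs are assumed nonnegative")
        if a_lo >= 0.0:      # parameter in J1: lower times lower, upper times upper
            p_lo += a_lo * x_lo
            p_hi += a_hi * x_hi
        elif a_hi <= 0.0:    # parameter in J2: lower times upper, upper times lower
            p_lo += a_lo * x_hi
            p_hi += a_hi * x_lo
        else:
            raise ValueError("parameter interval changes sign")
    return p_lo, p_hi


# Hypothetical parameters and inputs, for illustration only.
a0 = (1.0, 2.0, 3.0)
params = [(0.5, 1.0, 1.5), (-2.0, -1.0, -0.5)]
inputs = [(1.0, 2.0, 3.0), (2.0, 3.0, 4.0)]
print(prediction_interval(a0, params, inputs, 0.0))
print(prediction_interval(a0, params, inputs, 1.0))
```

Because every product pairs one fixed bound of the parameter with one fixed bound of the input, the interval bounds stay linear in the parameter bounds for given data, which is what keeps the later formulations linear programs.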

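As defined in Section 4, objective $R$ is the sum of the residuals from $O_t(\alpha)$ to the center line of the fuzzy prediction $P_t(\alpha)$:

$$
R = \sum_t |P_t(\alpha) - O_t(\alpha)| = \sum_t \sqrt{D^2(P_t(\alpha), O_t(\alpha))}, \tag{8}
$$

where $P_t(\alpha)$ is an interval $(P_{t,1}(\alpha), P_{t,2}(\alpha))$ that can be computed as follows:

$$
P_{t,1}(\alpha) = \sum_{j \in J_1} a_j \left( x_{j,t} - L_{X_{j,t}}^{-1}(\alpha)\,\beta_{X_{j,t}} \right) + \sum_{j \in J_2} a_j \left( x_{j,t} + R_{X_{j,t}}^{-1}(\alpha)\,\gamma_{X_{j,t}} \right) + a_0, \tag{9}
$$

$$
P_{t,2}(\alpha) = \sum_{j \in J_1} a_j \left( x_{j,t} + R_{X_{j,t}}^{-1}(\alpha)\,\gamma_{X_{j,t}} \right) + \sum_{j \in J_2} a_j \left( x_{j,t} - L_{X_{j,t}}^{-1}(\alpha)\,\beta_{X_{j,t}} \right) + a_0. \tag{10}
$$

Here $\beta_{X_{j,t}}$ and $\gamma_{X_{j,t}}$ denote the left and right spreads of $X_{j,t}$.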


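From (3) and (8):

$$
R = \sum_t \sqrt{ \left[ \tfrac{1}{2}\big(P_{t,1}(\alpha) + P_{t,2}(\alpha)\big) - \tfrac{1}{2}\big(O_{L,t}(\alpha) + O_{R,t}(\alpha)\big) \right]^2 + \tfrac{1}{3} \left[ \tfrac{1}{2}\big(P_{t,2}(\alpha) - P_{t,1}(\alpha)\big) - \tfrac{1}{2}\big(O_{R,t}(\alpha) - O_{L,t}(\alpha)\big) \right]^2 }. \tag{11}
$$

As $a^2 + b^2 \le (|a| + |b|)^2$, (11) can be rewritten as follows:

$$
R \le \sum_t \sqrt{ \left\{ \left| \tfrac{1}{2}\big(P_{t,1}(\alpha) + P_{t,2}(\alpha)\big) - \tfrac{1}{2}\big(O_{L,t}(\alpha) + O_{R,t}(\alpha)\big) \right| + \tfrac{1}{\sqrt{3}} \left| \tfrac{1}{2}\big(P_{t,2}(\alpha) - P_{t,1}(\alpha)\big) - \tfrac{1}{2}\big(O_{R,t}(\alpha) - O_{L,t}(\alpha)\big) \right| \right\}^2 }. \tag{12}
$$

Then,

$$
R \le \sum_t \left\{ \left| \tfrac{1}{2}\big(P_{t,1}(\alpha) + P_{t,2}(\alpha)\big) - \tfrac{1}{2}\big(O_{L,t}(\alpha) + O_{R,t}(\alpha)\big) \right| + \tfrac{1}{\sqrt{3}} \left| \tfrac{1}{2}\big(P_{t,2}(\alpha) - P_{t,1}(\alpha)\big) - \tfrac{1}{2}\big(O_{R,t}(\alpha) - O_{L,t}(\alpha)\big) \right| \right\} \tag{13}
$$

or

$$
R \le \sum_t \left| \tfrac{1}{2}\big(P_{t,1}(\alpha) + P_{t,2}(\alpha)\big) - \tfrac{1}{2}\big(O_{L,t}(\alpha) + O_{R,t}(\alpha)\big) \right| + \tfrac{1}{\sqrt{3}} \sum_t \left| \tfrac{1}{2}\big(P_{t,2}(\alpha) - P_{t,1}(\alpha)\big) - \tfrac{1}{2}\big(O_{R,t}(\alpha) - O_{L,t}(\alpha)\big) \right|. \tag{14}
$$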

The right-hand side of (14) is used in (4) to minimize the objective $R$. Common algorithms for least-absolute-deviations regression, such as those in Bloomfield and Steiger [4], Späth [26], and Birkes and Dodge [3], can be used to find the minimum of the right-hand side of (14).
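For crisp data with a single input, both intervals in (14) have zero width, so the second sum vanishes and the bound reduces to the ordinary least-absolute-deviations criterion $\sum_t |a_0 + a_1 x_t - y_t|$. The sketch below is not one of the cited algorithms: it is a brute-force search that relies on the known property that an $L_1$-optimal line can always be found among the lines through two data points. It is applied to the data of Table 1; the function name is illustrative.

```python
from itertools import combinations
from typing import Sequence, Tuple

# Data of Example 1 (Table 1).
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [1.5, 2.3, 2.7, 4.4, 9.4, 6.3, 6.5, 7.8, 8.5, 10.5]


def lad_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Brute-force L1 fit: try every line through two data points, keep the best.

    Returns (sum of absolute residuals, intercept a0, slope a1).
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, not a function of x
        a1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a0 = ys[i] - a1 * xs[i]
        r = sum(abs(a0 + a1 * x - y) for x, y in zip(xs, ys))
        if best is None or r < best[0]:
            best = (r, a0, a1)
    return best


r_min, a0, a1 = lad_line(X, Y)
print(round(r_min, 2), round(a0, 2), round(a1, 2))  # 7.0 0.6 0.9
```

The result agrees with the values $R = 7.00$, $a_0 = 0.60$, $a_1 = 0.90$ reported for cases (a) to (d) in Table 2. The search costs on the order of $n^3$ operations, so for larger or multivariate problems a linear programming solver or one of the dedicated algorithms cited above would be the usual choice.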

In order to obtain linear programming formulations, the two objectives $V_p$ and $E_p$ are defined as aggregated expressions of individual vagueness and deviation from outliers, respectively, using an $L_1$ norm as follows:

$$
V = \sum_t \big( P_{R,t}(\alpha) - P_{L,t}(\alpha) \big), \tag{15}
$$

$$
E = \sum_t \big( \varepsilon_{L,t} + \varepsilon_{R,t} \big), \tag{16}
$$

where $P_{L,t}(\alpha) - O_{L,t}(\alpha) \le \varepsilon_{L,t}$, $O_{R,t}(\alpha) - P_{R,t}(\alpha) \le \varepsilon_{R,t}$, and $\varepsilon_{L,t}, \varepsilon_{R,t} \ge 0$ for every $t$. Substituting (7) for the appropriate components in (15) and (16) yields the expressions of $V$ and $E$.

The TOFR (4) can be solved using different MODM techniques. Details on MODM can be found in Goicoechea et al. [13], Szidarovszky et al. [27], Vincke [32], and Duckstein and Tecle [12]. It should be mentioned that each MODM technique has its own procedure to help the decision maker choose the best compromise solution in the Pareto set based on his/her implicit preferences. In this paper we apply two common MODM techniques, the ε-constraint method and compromise programming, for illustration purposes. With the ε-constraint technique, the decision maker specifies bounds on the objectives in a sequential manner. Here we suggest for the TOFR that $R$ be minimized first using Eq. (14) and then be set as a constraint while optimizing $V$ and $E$ (Eqs. (15) and (16), respectively). An iterative process should be carried out to consider the trade-off between $V$ and $E$ and reach a reasonable solution. If the physical system and the relative distribution of possible outliers are known, the objective $E$ can be split into two separate objectives, $E_L$ and $E_R$, with different constraints. This allows possible outliers on either side of the regression line to be included or excluded independently. With compromise programming, $L_1$ and $L_\infty$ yield linear programming formulations. A payoff table obtained from optimizing each objective would be an alternative for setting goals of the objectives. More discussion on multiobjective analyses for TOFR can be found in [31].

5.1. Numerical examples

For the first numerical example, the TOFR is applied to the same problem as given by Peters [20]. Data are given in Table 1. The solutions obtained at the level $\alpha = 0$ from different MODM techniques are presented in Table 2 and Fig. 2. The Pareto optimal solutions of cases (a)–(d) are obtained using the ε-constraint technique. Here objective $R$ is minimized first and its optimum value is set as a constraint ($R \le 7.0$). For case (a), objective $E$ is set to 0 and then objective $V$ is minimized, while in cases (b), (c), and (d) $V$ is constrained to 40, 20, and 10, respectively, before $E$ is minimized. The Pareto optimal solutions of cases (e) and (f) are obtained using $L_1$ and $L_\infty$ compromise programming techniques, respectively. Solutions obtained from Özelkan's BOFR are presented in cases (g) and (h) for comparison.

Table 1
Data of Example 1 (from [20])

| X | Y |
|---|---|
| 1 | 1.5 |
| 2 | 2.3 |
| 3 | 2.7 |
| 4 | 4.4 |
| 5 | 9.4 |
| 6 | 6.3 |
| 7 | 6.5 |
| 8 | 7.8 |
| 9 | 8.5 |
| 10 | 10.5 |

Table 2
Results of the TOFR from different MODM techniques for Example 1 (a)

| Case | MODM technique | R | E | V | $a_0^-$ | $a_0$ | $a_0^+$ | $a_1^-$ | $a_1$ | $a_1^+$ |
|---|---|---|---|---|---|---|---|---|---|---|
| a | ε-constraint | 7.00 | 0.00 | 49.00 | 0.00 | 0.60 | 4.90 | 0.90 | 0.90 | 0.90 |
| b | ε-constraint | 7.00 | 0.90 | 40.00 | 0.20 | 0.60 | 4.20 | 0.90 | 0.90 | 0.90 |
| c | ε-constraint | 7.00 | 2.90 | 20.00 | 0.20 | 0.60 | 2.20 | 0.90 | 0.90 | 0.90 |
| d | ε-constraint | 7.00 | 3.93 | 10.00 | 0.20 | 0.60 | 0.83 | 0.90 | 0.90 | 0.97 |
| e | compromise, $p = 1$ | 7.04 | 4.05 | 8.69 | 0.21 | 0.60 | 0.60 | 0.91 | 0.91 | 1.00 |
| f | compromise, $p = \infty$ | 7.25 | 2.84 | 19.65 | −0.15 | 0.60 | 1.81 | 0.95 | 0.95 | 0.95 |
| g | Özelkan's BOFR ($V \le 40$) (b) | 16.45 | 0.80 | 40.00 | −0.15 | 1.85 | 3.85 | 0.95 | 0.95 | 0.95 |
| h | Özelkan's BOFR ($V \le 10$) (b) | 7.40 | 3.82 | 10.00 | −0.15 | 0.26 | 0.67 | 0.95 | 0.97 | 0.97 |

(a) Linear regression function: $Y = A_0 + A_1 \otimes X$; $A_0(a_0^-, a_0, a_0^+)_{L–R}$ and $A_1(a_1^-, a_1, a_1^+)_{L–R}$ are L–R triangular fuzzy numbers.
(b) $R$ is not an objective function in this model.
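The ε-constraint step for case (a) can be sketched for the crisp data of Table 1 at $\alpha = 0$. The sketch makes several simplifying assumptions that are not spelled out in the paper: the center line is fixed at the $L_1$ optimum $a_0 = 0.6$, $a_1 = 0.9$ (the values reported in Table 2), $E = 0$ is imposed by forcing the lower and upper lines to enclose every observation, the slope bounds are kept nonnegative, and each bound must lie on its own side of the modal value. Under these assumptions the problem splits into two linear programs in two variables, which are solved here by plain vertex enumeration instead of an LP solver.

```python
from itertools import combinations

# Data of Example 1 (Table 1), crisp inputs and outputs.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [1.5, 2.3, 2.7, 4.4, 9.4, 6.3, 6.5, 7.8, 8.5, 10.5]


def solve_2d_lp(cost, constraints, tol=1e-9):
    """Minimize cost[0]*u + cost[1]*v subject to p*u + q*v <= r for each (p, q, r).

    Checks every intersection of two constraint boundaries, which is enough
    when the optimum is attained at a vertex of the feasible region.
    """
    best = None
    for (p1, q1, r1), (p2, q2, r2) in combinations(constraints, 2):
        det = p1 * q2 - p2 * q1
        if abs(det) < tol:
            continue  # parallel boundaries, no vertex
        u = (r1 * q2 - r2 * q1) / det
        v = (p1 * r2 - p2 * r1) / det
        if all(p * u + q * v <= r + tol for p, q, r in constraints):
            value = cost[0] * u + cost[1] * v
            if best is None or value < best[0]:
                best = (value, u, v)
    return best


def min_vagueness_no_outliers(a0, a1):
    """Minimize V with E = 0 around a fixed center line y = a0 + a1*x (alpha = 0)."""
    n, sx = len(X), sum(X)
    # Lower line: as high as possible while staying below every observation.
    lower = [(1.0, float(x), y) for x, y in zip(X, Y)]
    lower += [(1.0, 0.0, a0), (0.0, 1.0, a1), (0.0, -1.0, 0.0)]
    _, a0_lo, a1_lo = solve_2d_lp((-n, -sx), lower)
    # Upper line: as low as possible while staying above every observation.
    upper = [(-1.0, -float(x), -y) for x, y in zip(X, Y)]
    upper += [(-1.0, 0.0, -a0), (0.0, -1.0, -a1)]
    _, a0_hi, a1_hi = solve_2d_lp((n, sx), upper)
    v = n * (a0_hi - a0_lo) + sx * (a1_hi - a1_lo)
    return v, (a0_lo, a0_hi), (a1_lo, a1_hi)


v, a0_bounds, a1_bounds = min_vagueness_no_outliers(0.6, 0.9)
print(round(v, 2), a0_bounds, a1_bounds)
```

Up to floating-point rounding, the sketch gives $V = 49$, with $A_0$ ranging over $[0, 4.9]$ and a crisp slope of 0.9, which matches case (a) of Table 2. Cases (b) to (d) relax $E$ and couple all observations through the relaxation variables, so they are better handled with a general LP solver.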

First of all, one can see that both TOFR and Özelkan's BOFR deal with outliers satisfactorily. The center regression line in Özelkan's BOFR, however, does not seem to reveal the central tendency of the data. Its location is merely controlled by the two outer regression lines and the symmetry assumption on the fuzzy regression parameters. As a result, the central line's parameters might change substantially as the values of $E$ and $V$ change. For example, when $V$ changes from 40 in case (g) to 10 in case (h), $a_0$ varies from 1.85 to 0.26. As a consequence, the use of the center line for prediction is questionable. In contrast, this kind of erratic change is not seen in the cases where TOFR is applied, in which the center line's parameters are obtained through optimizing $R$. Thus the center line truly and explicitly reflects the central tendency of the data points, providing a good estimate for prediction purposes. Furthermore, as seen in cases (a)–(f), the $L_1$ regression of $R$ keeps the TOFR robust with respect to outliers. This feature would not be present if least-squares regression were used. In brief, this numerical example shows that TOFR possesses both the central tendency and possibilistic properties of statistical and fuzzy regressions satisfactorily, as well as other good characteristics (e.g., robustness regarding outliers, ability to deal with fuzzy input/output data). It should be noted that, although Fig. 2 is very compelling in showing different non-dominated solutions, such visualization is not necessary for most MODM techniques in choosing the best compromise solution in the Pareto set. The reader is referred to the MODM references [12,13,27,32] for more details on different MODM techniques.

Fig. 2. Pareto optimal solutions using TOFR and Özelkan's BOFR for the $\alpha = 0$ level set (see Table 2 for MODM techniques and regression parameters).

The second numerical example demonstrates the applicability of the TOFR to multivariate cases with fuzzy input/output data (Table 3). The data are created from the equation $y = 7 + x_1 + 2x_2$ and converted to L–R triangular fuzzy numbers after some randomness is introduced into $y$. The 9th point in the data set is intentionally made an outlier to show the TOFR's robustness in dealing with outliers. The ε-constraint technique is used to obtain the Pareto optimal solutions at the level $\alpha = 0$ (Table 4). Here objective $R$ is minimized first and its optimum value is set as a constraint ($R \le 12.41$). Then a different value of $E$ is used as a constraint for each run of optimizing $V$. With his/her relative preferences, the decision maker can use the trade-off among the three objective functions $R$, $V$, and $E$ provided in Table 4 to pick the most-preferred solution.

Table 3
Data of Example 2 (a)

| No | $X_1^-$ | $X_1$ | $X_1^+$ | $X_2^-$ | $X_2$ | $X_2^+$ | $Y^-$ | $Y$ | $Y^+$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.9 | 2.0 | 2.1 | 2.9 | 3.0 | 3.1 | 15.1 | 15.2 | 15.3 |
| 2 | 2.9 | 3.0 | 3.1 | 1.9 | 2.0 | 2.1 | 14.0 | 14.1 | 14.2 |
| 3 | 2.9 | 3.0 | 3.1 | 2.9 | 3.0 | 3.1 | 15.7 | 15.8 | 15.9 |
| 4 | 3.9 | 4.0 | 4.1 | 4.9 | 5.0 | 5.1 | 20.7 | 20.8 | 20.9 |
| 5 | 4.9 | 5.0 | 5.1 | 3.9 | 4.0 | 4.1 | 19.8 | 19.9 | 20.0 |
| 6 | 4.9 | 5.0 | 5.1 | 4.9 | 5.0 | 5.1 | 21.6 | 21.7 | 21.8 |
| 7 | 4.9 | 5.0 | 5.1 | 5.9 | 6.0 | 6.1 | 24.0 | 24.1 | 24.2 |
| 8 | 5.9 | 6.0 | 6.1 | 4.9 | 5.0 | 5.1 | 23.0 | 23.1 | 23.2 |
| 9 | 5.9 | 6.0 | 6.1 | 5.9 | 6.0 | 6.1 | 34.9 | 35.0 | 35.1 |
| 10 | 6.9 | 7.0 | 7.1 | 6.9 | 7.0 | 7.1 | 27.8 | 27.9 | 28.0 |

(a) $X_1$, $X_2$, and $Y$ are L–R triangular fuzzy numbers.

Table 4
Results of the TOFR from different MODM techniques for Example 2 (a)

| Run | R | V | E | $a_0^-$ | $a_0$ | $a_0^+$ | $a_1^-$ | $a_1$ | $a_1^+$ | $a_2^-$ | $a_2$ | $a_2^+$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 12.41 | 5.89 | 10.19 | 7.29 | 7.29 | 7.29 | 0.92 | 0.92 | 0.92 | 2.02 | 2.02 | 2.02 |
| 2 | 12.41 | 21.39 | 8.00 | 7.17 | 7.29 | 7.29 | 0.92 | 0.92 | 0.92 | 2.02 | 2.02 | 2.33 |
| 3 | 12.41 | 36.80 | 6.00 | 7.17 | 7.29 | 7.29 | 0.92 | 0.92 | 0.92 | 2.02 | 2.02 | 2.65 |
| 4 | 12.41 | 52.21 | 4.00 | 7.17 | 7.29 | 7.29 | 0.92 | 0.92 | 0.92 | 2.02 | 2.02 | 2.98 |
| 5 | 12.41 | 67.62 | 2.00 | 7.17 | 7.29 | 7.29 | 0.92 | 0.92 | 0.92 | 2.02 | 2.02 | 3.31 |
| 6 | 12.41 | 83.05 | 0.00 | 7.16 | 7.29 | 7.29 | 0.92 | 0.92 | 0.92 | 2.02 | 2.02 | 3.64 |

(a) Linear regression function: $Y = A_0 + A_1 \otimes X_1 + A_2 \otimes X_2$; $A_i(a_i^-, a_i, a_i^+)_{L–R}$, $i = 0, 1, 2$, are L–R triangular fuzzy numbers.


6. Discussion and conclusions

A TOFR model has been described and illustrated numerically. The model is applicable to multivariate problems with crisp/fuzzy input/output data. The TOFR model, which combines the central tendency and possibilistic properties of statistical and fuzzy regressions, overcomes the shortcomings of fuzzy and statistical regression approaches and provides a suitable transition from fuzzy to statistical regression when more data become available and other statistical assumptions are sufficiently verified. In addition, the model adequately addresses the data outlier problem in fuzzy regression through the trade-off between different objective functions within an MODM framework.

The introduction of the residual-related function $R$ into the TOFR model has several advantages. First of all, the central tendency properties of statistical regression are present explicitly in the model for consideration. Furthermore, $L_1$ regression is robust with respect to y-direction outliers (often referred to simply as outliers, as distinct from x-direction outliers, referred to as leverage points). Hence, the model can be used when only a few data are available, which is not the case for FLSR. As the $L_1$ regression estimator is not always unique, multiobjective programming is an appropriate framework for finding a compromise solution according to the decision maker's relative preferences over the three objectives $V$, $E$, and $R$.

Since the objective $R$ is robust regarding outliers and the relaxation variables in objective $E$ express the significance of a data point as an outlier, TOFR would be an effective tool for identifying extreme events in a hydroclimatological study.

TOFR can also be used as a tool to calibrate the parameters of a linear model if the data are uncertain or the distribution of the variables cannot be verified easily. Furthermore, it provides a suitable means of dealing with uncertainty in model structure, making the model more realistic and easy to update. For example, TOFR was applied within a fuzzy rule-based modeling framework to improve the performance of the revised universal soil loss equation (RUSLE) [31]. The fuzzy RUSLE gave superior results to the conventional RUSLE, which indicates that TOFR satisfactorily handles uncertainties in both data and model structure.

The main conclusions of this paper can be summarized as follows:

• The distance measure introduced in this paper provides an effective and meaningful means to measure the distance between two intervals as it takes into consideration all points on both intervals rather than just the intervals' lower and upper bounds.

• By possessing both the central tendency and possibilistic properties of statistical and fuzzy regressions, the TOFR model is capable of exhibiting the good characteristics of both approaches and overcoming several of their shortcomings.

• The methodology presented in this paper can be a valuable tool in a variety of problems due to its simplicity and effectiveness in dealing with uncertainty in data and model structure.

Acknowledgements

The support provided to the first author's Ph.D. program by the East–West Center is gratefully acknowledged. The authors are grateful to the reviewers for their thoughtful comments.

References

[1] A. Bárdossy, R. Hagaman, L. Duckstein, I. Bogardi, Fuzzy least squares regression: theory and application, in: J. Kacprzyk, M. Fedrizi (Eds.), Fuzzy Regression Analysis, Omnitech Press, Warsaw and Physica-Verlag, Heidelberg, 1992, pp. 181–193.

[2] C. Bertoluzza, N. Corral, A. Salas, On a new class of distances between fuzzy numbers, Mathware Soft Comput. 2 (1995) 71–84.

[3] D. Birkes, Y. Dodge, Alternative Methods of Regression, Wiley, New York, 1993.

[4] P. Bloomfield, W.L. Steiger, Least Absolute Deviations: Theory, Applications, and Algorithms, Birkhäuser, Boston, 1983.

[5] A. Celmins, Least-squares model fitting to fuzzy vector data, Fuzzy Sets and Systems 22 (1987) 245–269.

[6] A. Celmins, Multidimensional least-squares fitting of fuzzy models, Math. Modelling 9 (1987) 669–690.

[7] P.-T. Chang, E.S. Lee, A generalized fuzzy weighted least-squares regression, Fuzzy Sets and Systems 82 (1996) 289–298.

[8] P. Diamond, Fuzzy least squares, Info. Sci. 46 (1988) 141–157.

[9] P. Diamond, R. Körner, Extended fuzzy linear models and least squares estimates, Comput. Math. Appl. 33 (1997) 15–32.

[10] P. Diamond, H. Tanaka, Fuzzy regression analysis, in: R. Słowiński (Ed.), Fuzzy Sets in Decision Analysis, Operations Research and Statistics, Kluwer Academic Publishers, Boston, 1998.

[11] D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.

[12] L. Duckstein, Tecle, Multiobjective analysis in water resources, Part II: a new typology of MCDM techniques, in: J.B. Marco, R. Harboe, J.D. Salas (Eds.), Stochastic Hydrology and its Use in Water Resources Systems Simulation and Optimization, NATO ASI Series E: Applied Sciences, Vol. 237, Nijhoff, Dordrecht, 1993, pp. 333–343.

[13] A. Goicoechea, D.R. Hansen, L. Duckstein, Multiobjective Decision Analysis with Engineering and Business Applications, Wiley, New York, 1982.

[14] S. Jozsef, On the effect of linear data transformations in possibilistic fuzzy linear regression, Fuzzy Sets and Systems 45 (1992) 185–188.

[15] A. Kaufman, M. Gupta, Introduction to Fuzzy Arithmetic Theory and Application, Van Nostrand Reinhold, New York, 1985.

[16] B. Kim, R.R. Bishu, Evaluation of fuzzy linear regression models by comparing membership functions, Fuzzy Sets and Systems 100 (1998) 342–352.

[17] E.C. Özelkan, Multi-objective Fuzzy Regression Applied to the Calibration of Conceptual Rainfall-Runoff Models, Unpublished Ph.D. Dissertation, Department of Systems and Industrial Engineering, The University of Arizona, 1997.

[18] E.C. Özelkan, L. Duckstein, A. Galambosi, Analysis of tradeoff between data outliers and prediction vagueness in fuzzy regression using a biobjective framework, Proc. EUFIT'98, 6th European Congr. on Intelligent Techniques and Soft Computing, 1998, pp. 1048–1051.

[19] E.C. Özelkan, L. Duckstein, Multiobjective fuzzy regression: a general framework, Comput. Oper. Res. 27 (2000) 635–640.

[20] G. Peters, Fuzzy linear regression with fuzzy intervals, Fuzzy Sets and Systems 63 (1994) 45–55.

[21] D.T. Redden, W.H. Woodall, Properties of certain fuzzy linear regression models, Fuzzy Sets and Systems 64 (1994) 361–375.

[22] D.T. Redden, W.H. Woodall, Further examination of fuzzy linear regression, Fuzzy Sets and Systems 79 (1996) 203–211.

[23] M. Sakawa, H. Yano, Multiobjective fuzzy linear regression analysis and its application, Electron. Comm. Jpn. 73 (1990) 1–9.

[24] M. Sakawa, H. Yano, Multiobjective fuzzy linear regression analysis for fuzzy input–output data, Fuzzy Sets and Systems 47 (1992) 173–181.

[25] D.A. Savic, W. Pedrycz, Evaluation of fuzzy linear regression models, Fuzzy Sets and Systems 39 (1991) 51–63.

[26] H. Späth, Mathematical Algorithms for Linear Regression, Academic Press, Boston, 1987.

[27] F. Szidarovszky, M.E. Gershon, L. Duckstein, Techniques for Multiobjective Decision Making in Systems Management, Elsevier, Amsterdam, 1986.

[28] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets and Systems 24 (1987) 363–375.

[29] H. Tanaka, H. Lee, Fuzzy linear regression combining central tendency and possibilistic properties, Proc. FUZZ-IEEE'97, 1997, pp. 63–68.

[30] H. Tanaka, S. Uejima, K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Systems Man Cybernet. 12 (1982) 903–907.

[31] L. Tran, Multiobjective Fuzzy Regression and its Application to Erosion Prediction, Unpublished Ph.D. Dissertation, Department of Geography, The University of Hawaii, 1999.

[32] P. Vincke, Multicriteria Decision-Aid, Wiley, New York, 1992.