Information Sciences 176 (2006) 3610–3644
www.elsevier.com/locate/ins
Treating fuzziness in subjective evaluation data
Yoshiteru Nakamori a,*, Mina Ryoke b
a School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Ishikawa 923-1292, Japan
b Graduate School of Business Sciences, University of Tsukuba, Tsukuba, Japan
Received 23 January 2005; received in revised form 1 February 2006; accepted 3 February 2006
Abstract
This paper proposes a technique to deal with fuzziness in subjective evaluation data, and applies it to principal component analysis and correspondence analysis. In the existing method, or techniques developed directly from it, fuzzy sets are defined from some standpoint on a data space, and the fuzzy parameters of the statistical model are identified with linear programming or the method of least squares. In this paper, we try to map the variation in evaluation data into the parameter space while preserving information as much as possible, and thereby define fuzzy sets in the parameter space. Clearly, it is possible to use the obtained fuzzy model to derive things like the principal component scores from the extension principle. However, with a fuzzy model which uses the extension principle, the possibility distribution spreads out as the explanatory variable values increase. This does not necessarily make sense for subjective evaluations, such as a 5-level evaluation, for instance. Instead of doing so, we propose a method for explicitly expressing the vagueness of evaluation, using certain quantities related to the eigenvalues of a matrix which specifies the fuzzy parameter spread. As a numerical example, we present an analysis of subjective evaluation data on local environments.
© 2006 Elsevier Inc. All rights reserved.
0020-0255/$ - see front matter © 2006 Elsevier Inc. All rights reserved.
doi:10.1016/j.ins.2006.02.015
* Corresponding author. Tel.: +81 761 51 1755; fax: +81 761 51 1149.
E-mail address: [email protected] (Y. Nakamori).
Keywords: Subjective evaluation data; Fuzzy principal component analysis; Fuzzy correspondence analysis; Local environment evaluation
1. Introduction
Two approaches have been proposed for extending fuzzy logic to principal component analysis. The first method introduces the concept of a fuzzy group, assigns membership values of data vectors in the fuzzy group, and then performs the principal component analysis in which the membership values are used as the weights of data vectors [18]. For example, when given attributes such as operating profit margin, size and growth rate for many information industry companies, a principal component analysis can be formulated by taking the sales ratio of the information industry department of each company as the data weight which corresponds to the membership value in the fuzzy group called “information industry”. The second technique is a principal component analysis in which the data is given as fuzzy numbers [19]. For example, when given 5 years' worth of the above attribute data on information industry companies, the possibility distribution of the attribute values is expressed, from that data, as fuzzy numbers. To reflect that possibility in the principal component, a linear programming problem is formulated, which includes an ordinary eigenvalue problem in principal component analysis, and the fuzzy principal component scores are derived by solving that.
The data covered in this paper is 3-mode data, in which multiple objects were subjectively evaluated using multiple evaluation criteria. If the multiple evaluators are made to correspond to “5 years of data”, the multiple objects to “multiple information industry companies”, and the multiple evaluation criteria to “a number of attributes such as operating margin, scale and growth rate”, it is theoretically possible to apply the second technique above. Incidentally, traditional techniques for handling 3-mode data are roughly grouped into two types [1]. One is the PARAFAC model [4], and the other is the Tucker model [15]. Research on both of these models is ongoing, but in the most general terms, these methods are being employed to discover models shared across all modes. Therefore, they are not suitable for expressing the variation needed to understand each evaluation object or evaluation item, which is what this paper emphasizes.
The data handled here are values obtained by subjectively evaluating various aspects of the objects with five levels. While they are crisp values, they are also “volatile” values, depending on the values of the evaluator and the situation. For example, in response to the question “Can fish caught in the rivers and ponds near the region where you live be eaten?”, the data is comprised of five levels of answers, ranging from “No, they cannot be eaten at all” to “Yes, they can be eaten without any problems”. Because they are not being asked their preferences, evaluators attempt to reply as objectively as possible, but the result is a data set with large variance due to factors such as sensitivity of evaluators to the environment. Here, this sort of data is called sensibility data, or “kansei data” in Japanese. When we attempt to extract the features of a sensibility data set using the conventional multivariate analysis technique, it is easiest to use average data relating to evaluators, and this contains a certain degree of information. However, to more effectively use information inherent in the evaluation of objects using human sensibility, it is important to develop techniques for modeling individual differences in evaluation and in the spread of vagueness. It is of interest to note that subjective evaluation data represented by linguistic terms has been extensively used in linguistic decision analysis, e.g., [2,17,6].
Here, we attempt modeling through a concept that differs from the technique of Yabuuchi and Watada [18,19]. This is one attempt to numerically quantify the vagueness inherent in data, but it is difficult to compare the superiority/inferiority of models based on different concepts or ideas when there is no specific external standard to be predicted. Here, we are attempting to solve problems where weights cannot be assigned to the objects of analysis or the evaluators, and therefore the first technique of Yabuuchi and Watada cannot be applied. As mentioned earlier, their second technique can be applied; in this research, however, we focused primarily on discovering the principal components and weight parameters which seem to preserve the differences of opinion among evaluators in the data space. Although Yabuuchi and Watada’s second method allows us to adopt certain evaluation criteria and uniquely determine the fuzzy weights, it is based on linear programming, and thus the approach does not entail preserving differences of opinion. On the other hand, the main topic in this research is comparing the objects in principal component analysis, and we maintain that it should be sufficient to derive relative fuzziness.
The mathematical model of vague concepts was first introduced by Zadeh [20], using the notion of partial degrees of membership. Since then, the problem of efficiently constructing membership functions of fuzzy sets in a given particular application has been studied by many distinguished fuzzy scholars including Turksen [16], Kruse et al. [7], and Pedrycz [10] among others. The specific meaning of a vague concept in a proposition is usually evaluated in different ways for different assessments of an entity by different agents, contexts, etc. [11]. Huynh et al. [5] show that the context model [3] provides a practical framework for constructing membership functions of fuzzy concepts. This paper also tries to construct membership functions, but ones that express the fuzziness of principal component scores, in a particular situation where we have to treat a set of subjective evaluation data. Unlike regression analysis, principal component analysis does not assume the existence of external variables which can act as criteria to justify the membership functions. Therefore, in this paper, we try to construct membership functions that express the relative fuzziness between principal component scores.
In the next section, we describe the background of this research, and the data structures handled. Then, we consider two methods for finding the fuzzy principal component scores: fuzzifying the sensibility data, and fuzzifying the weight parameters of the principal component model. The former method is used in analyzing the objects of evaluation. The latter method, on the other hand, provides a model for performing overall evaluations when new evaluation data has been obtained. Based on this model, we can obtain the fuzzy principal component scores from the extension principle. However, a possibility model which uses the extension principle has the additional feature of a possibility distribution which spreads out as the value of the explanatory variable increases. Accordingly, in this paper, we propose a method for explicitly expressing the evaluation vagueness, using certain quantities related to the eigenvalues of a matrix which specifies the fuzzy parameter spread. As a numerical example, we carry out an analysis of data obtained by having residents conduct a sensory evaluation of their local environment. This paper is primarily concerned with fuzzy principal component analysis; fuzzy correspondence analysis is described briefly as its direct application.
2. Purpose of research and model structure
2.1. Purpose of research
We indicate the objects of evaluation with $m = 1, 2, \ldots, M$, the evaluation items with $n = 1, 2, \ldots, N$, and the evaluators with $k = 1, 2, \ldots, K$. For example, we handle 3-mode structure data as follows:
• Evaluation objects: (m = 1) Examination student 1, (m = 2) Examination student 2, ...
• Evaluation items: (n = 1) Scholastic record, (n = 2) Human qualities, (n = 3) Future potential, ...
• Evaluators: (k = 1) Examiner A, (k = 2) Examiner B, ...
Here, the evaluation value $z_{mnk}$ of examiner $k$ regarding evaluation object $m$ from the standpoint of evaluation item $n$ is often given as a 5-level value, but this is an extremely vague value, and in this paper is referred to as sensibility data. The data vector for evaluator $k$ is written as

$$z_{mk} = (z_{m1k}, z_{m2k}, \ldots, z_{mNk})^t, \quad z_{mnk} \in \{1, 2, 3, 4, 5\} \tag{1}$$

If we assume that the overall evaluation of examination student $m$ by examiner $k$ will be found by a linear weighted sum as follows:

$$x_{mk} = a_{1k} z_{m1k} + a_{2k} z_{m2k} + \cdots + a_{Nk} z_{mNk} \tag{2}$$

then the weight vector itself

$$a_k = (a_{1k}, a_{2k}, \ldots, a_{Nk})^t \tag{3}$$

is not absolute, and in many cases is “tacit”. In fact, in pass/fail determination, for instance, the order of the examined students is determined by averaging the “tacit” overall evaluation values $\{x_{mk}\}$ with respect to the examiners, without factoring the individual evaluation of each examiner.
Determination methods like that given above are often used in a variety of settings, but for applications such as university entrance examinations which carry a strong requirement for objectivity, the weight vector between examiners is often made uniform. One technique which can be used in this situation is principal component analysis. A weight vector can be determined from data averaged with respect to the examiners using principal component analysis, but the purpose here is to find fuzzy principal component scores which take into account the evaluation’s vagueness and fluctuation. Two possible methods of doing this are:
• Find the membership function for a fuzzy weight vector which “in some sense” preserves the differences in the dispersion of the evaluator’s evaluation.
• By calculating the fuzzy principal components of the object using the extension principle in fuzzy set theory, find the membership function for the principal component scores by all evaluators.
Based on this logic, fuzzy regression analysis has been studied [8,9], but because an external criterion exists there, the method of least squares is applicable, and a mapping from the data space to the parameter space can be found. This can be used to achieve the above objective “to some extent”. However, a different approach is needed because this research does not assume the existence of external criteria (overall evaluation values). Since a criterion for measuring the absolute amount of “vagueness” does not exist, our objective is to find principal component scores which involve relative values of fuzziness.
2.2. Model structure
There are three possible methods of finding fuzzy principal component scores:
(1) Fuzzifying sensibility data.
(2) Fuzzifying weight parameters.
(3) Fuzzifying both sensibility data and weight parameters.
The last method is mathematically difficult to handle, and, as will become clear later, the same information must be duplicated to use in this method, so we shall investigate only the first two.
• Average model: Both methods begin by building a principal component model using an evaluation data matrix with the evaluation objects and evaluation items as the indices of the rows and columns, which is obtained by averaging sensibility data with respect to the evaluators. The weight parameters of the average principal component model are identified by solving an eigenvalue problem for the variance–covariance matrix between evaluation items. This will be explained in Section 3.
• Analysis of evaluation objects: For the first method, the average model described above is the final model, and fuzzified sensibility data is input separately into the average model to find the fuzzy principal component. This is an extremely easy method and makes it possible to discover which evaluation for which object has dispersed to what extent. However, when fuzzifying the sensibility data, it is necessary to determine the perspective, such as whether to stress the possibilities which the data can assume or to stress the variance of data. This will be treated in Section 4.
• Overall evaluation model: In the second method, a fuzzy principal component model is built by fuzzifying the parameters of the average model. In the course of fuzzifying, the focus is on the parameters of the average model; identification of parameter vectors for individual evaluators is done so that differences of opinion are preserved as much as possible and the fuzzy parameters are found using these types of information. It is possible to calculate the fuzzy principal component (overall evaluation) if crisp evaluation data are given as inputs to this model. In this paper, the model based on this method is called the fuzzy principal component model. This will be considered in Section 5.
Here, when fuzzifying data or parameters, the fuzzy component spread differs depending on whether variance–covariance information is used or a possibility distribution is implemented. In this paper, fuzzification is done using a method like that of Tanaka and Ishibuchi [12,13] which emphasizes the possibilities of data. This is one idea, but in the environment evaluation which is shown as an example, differences of opinion may appear between the standpoints of the government and residents. Sometimes considering data possibilities is important, but in other cases that is actually not helpful, so it is necessary to decide the method on a case by case basis.
3. Data structure and average model
3.1. Data structure
Generally speaking, in many cases the sensibility evaluation data which can be obtained does not have the complete set of three modes as described above. For example, due to time constraints, systems in which all examiners examine all examination students are unusual. Therefore, we assume a data structure like that shown below as a realistic set-up.
Let $E = \{1, 2, \ldots, K\}$ be the set of evaluators, and $O = \{1, 2, \ldots, M\}$ be the set of objects of evaluation. When letting $E_m$ be the set of evaluators which evaluated object $m$, and letting $O_k$ be the set of objects evaluated by evaluator $k$, we get

$$E = \bigcup_{m=1}^{M} E_m, \quad E_m \neq \emptyset, \ \forall m \tag{4}$$

$$O = \bigcup_{k=1}^{K} O_k, \quad O_k \neq \emptyset, \ \forall k \tag{5}$$
This includes special cases like the following:

Case 1: The case where all objects are evaluated by all evaluators (complete 3-mode data):

$$E_m = E, \ \forall m; \quad |E_m| = |E| = K, \quad |O_k| = |O| = M \tag{6}$$

Case 2: The case where only one object is evaluated by each evaluator:

$$|O_k| = 1, \ \forall k; \quad E_m \cap E_{m'} = \emptyset, \ m \neq m'; \quad \sum_{m=1}^{M} |E_m| = K \tag{7}$$

Here, $|\cdot|$ indicates the number of elements in a set.
The data treated in this paper as a sample application corresponds to Case 2 above, so the theory corresponding to Case 2 is developed in this paper.
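The Case 2 conditions of Eq. (7) can be checked mechanically. The following is a minimal Python sketch, assuming the data are held as a mapping from each evaluator k to the set O_k of objects evaluated. The names `objects_by_evaluator`, `evaluators_of` and `is_case2` are illustrative and do not come from the paper; the group sizes are counted from Table 1.

```python
# Illustrative sketch (names are hypothetical, not from the paper).
# Group sizes |E_m| for m = 1..7, counted from the rows of Table 1.
group_sizes = {1: 16, 2: 8, 3: 14, 4: 4, 5: 10, 6: 12, 7: 9}

# Build O_k: evaluators are numbered consecutively, site by site.
objects_by_evaluator = {}
k = 0
for m, size in group_sizes.items():
    for _ in range(size):
        k += 1
        objects_by_evaluator[k] = {m}

def evaluators_of(m, o_by_k):
    """Return E_m, the set of evaluators who evaluated object m."""
    return {k for k, objs in o_by_k.items() if m in objs}

def is_case2(o_by_k, objects):
    """Check Eq. (7): |O_k| = 1 for all k, disjoint E_m, and sum of |E_m| equal to K."""
    one_each = all(len(objs) == 1 for objs in o_by_k.values())
    e_sets = [evaluators_of(m, o_by_k) for m in objects]
    total = sum(len(e) for e in e_sets)
    disjoint = len(set().union(*e_sets)) == total
    return one_each and disjoint and total == len(o_by_k)

case2 = is_case2(objects_by_evaluator, group_sizes.keys())
```

With these sizes the check should succeed, since the seven groups partition the 73 evaluators.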
3.2. Environment evaluation data
In this paper we analyze questionnaire data relating to waterside spaces adjacent to the residence locality. The questionnaire survey was conducted in November 2001 using the direct distribution method, and was targeted at residents of Komatsu City and the town of Tsurugi, both located in Ishikawa Prefecture, Japan. Here we use data excerpted from that survey.
The following is the list of the evaluation items used for numerical experiments in this paper:
(n = 1) Vegetation like reeds and water plants can be found.
(n = 2) Water recreation (swimming, boating, etc.) is possible.
(n = 3) Waterside barbecuing and camping are possible.
(n = 4) River embankments are established.
(n = 5) The water is clear.
The data used for analysis is indicated in Table 1. For reference, the table also gives sensory evaluation values relating to water quality and the pleasantness of the waterside:
(Water quality) Water quality is good.
(Pleasantness) Waterside space is pleasant.
The evaluation objects were watersides near the following geographical sites. Figures in parentheses after the place name are the biochemical oxygen demand (BOD; mg/l, average for 1999), an indicator of water quality. Larger BOD values indicate a greater degree of contamination.
(m = 1) Hakusan Gokuchi Dike (BOD = 0.5).
(m = 2) Mikawa Bridge (BOD = 0.6).
(m = 3) Tatsunokuchi Bridge (BOD = 0.7).
(m = 4) Nomi Bridge (BOD = 0.8).
(m = 5) Tsurugashima Bridge (BOD = 0.9).
(m = 6) Miyuki Bridge (BOD = 4.5).
(m = 7) Ukiyanagi New Bridge (BOD = 5.2).
Each evaluator evaluated only one geographical site, corresponding to Case 2.
3.3. Average principal component scores
Let the evaluator average vector for the object $m$ be

$$z_m = (z_{m1}, z_{m2}, \ldots, z_{mN})^t \tag{8}$$

where we define

$$z_{mn} = \frac{1}{|E_m|} \sum_{k \in E_m} z_{mnk}, \quad |E_m| > 0 \tag{9}$$

Note that the following holds:

$$\sum_{k \in E_m} (z_{mk} - z_m) = \sum_{k \in E_m} z_{mk} - |E_m| z_m = 0 \tag{10}$$
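Eqs. (9) and (10) can be illustrated with a short Python sketch. It uses the n = 1, ..., 5 ratings of object m = 4 (evaluators k = 39, ..., 42) from Table 1; the function names are assumptions made for illustration only.

```python
# Illustrative sketch; the ratings are the n = 1..5 columns for object m = 4
# (evaluators k = 39..42) in Table 1. Function names are hypothetical.
ratings_m4 = [
    [5, 5, 5, 5, 2],  # k = 39
    [3, 1, 1, 5, 3],  # k = 40
    [2, 5, 1, 4, 1],  # k = 41
    [5, 4, 1, 5, 3],  # k = 42
]

def evaluator_average(rows):
    """Eq. (9): average each evaluation item over the evaluators in E_m."""
    count = len(rows)
    return [sum(col) / count for col in zip(*rows)]

def deviation_sum(rows, mean):
    """Left-hand side of Eq. (10): sum over k of (z_mk - z_m), item by item."""
    return [sum(r[n] - mean[n] for r in rows) for n in range(len(mean))]

z_4 = evaluator_average(ratings_m4)
residual = deviation_sum(ratings_m4, z_4)  # every entry should vanish
```

The averages obtained this way should agree with the m = 4 row of Table 2.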
Table 1
Waterside subjective evaluation data

m   k   n=1  n=2  n=3  n=4  n=5  Water quality  Pleasantness
1   1   1    5    5    2    2    3              3
1   2   1    1    1    5    2    3              4
1   3   1    4    1    3    1    2              3
1   4   1    2    5    5    5    4              4
1   5   5    2    2    2    1    2              2
1   6   3    1    1    5    1    3              3
1   7   4    5    5    5    3    2              4
1   8   5    5    5    3    5    5              5
1   9   5    1    5    5    1    2              3
1   10  1    3    1    5    3    3              4
1   11  1    4    4    4    1    2              1
1   12  5    5    5    1    1    2              3
1   13  3    2    2    4    3    3              3
1   14  2    1    1    5    1    1              4
1   15  2    5    5    2    1    1              1
1   16  3    2    2    5    1    2              3
2   17  2    2    5    4    1    3              3
2   18  2    2    4    4    4    3              3
2   19  5    1    5    5    2    3              3
2   20  4    5    4    3    3    3              3
2   21  3    2    2    3    3    3              3
2   22  4    2    2    2    3    4              3
2   23  5    3    5    4    3    5              3
2   24  2    4    4    4    2    2              3
3   25  5    1    2    5    4    3              3
3   26  1    3    3    3    4    3              3
3   27  4    1    4    4    4    3              4
3   28  4    2    4    5    4    2              3
3   29  2    1    1    5    2    3              3
3   30  1    2    4    5    4    3              4
3   31  3    5    4    4    2    4              5
3   32  1    1    3    4    3    3              3
3   33  3    1    5    3    4    3              3
3   34  2    2    2    5    5    3              3
3   35  2    1    2    2    2    3              3
3   36  2    2    4    5    1    1              3
3   37  2    2    4    4    2    3              2
3   38  3    5    5    4    2    3              3
4   39  5    5    5    5    2    3              3
4   40  3    1    1    5    3    3              3
4   41  2    5    1    4    1    2              2
4   42  5    4    1    5    3    3              3
5   43  5    4    1    1    2    2              2
5   44  4    5    1    5    1    2              3
5   45  2    5    1    5    3    3              4
5   46  4    4    2    3    4    2              2
5   47  5    5    1    4    2    2              4
5   48  5    5    1    5    2    3              3
5   49  4    1    1    4    2    2              2
5   50  5    2    1    5    1    3              4
5   51  4    3    4    2    2    2              3
5   52  5    5    1    2    1    1              2
6   53  2    1    1    5    1    2              2
6   54  5    1    1    1    2    1              2
6   55  1    1    1    1    1    2              2
6   56  4    4    3    4    2    2              3
6   57  2    2    1    4    3    2              2
6   58  4    4    1    5    1    2              3
6   59  1    1    1    5    1    1              2
6   60  4    1    1    4    1    1              4
6   61  4    1    1    5    1    2              2
6   62  5    1    1    1    1    1              3
6   63  2    1    1    1    1    1              2
6   64  2    2    1    1    1    1              1
7   65  4    5    1    5    1    2              3
7   66  4    4    1    2    1    1              2
7   67  1    1    1    5    1    1              2
7   68  5    1    1    5    1    1              1
7   69  5    2    1    2    3    2              3
7   70  5    1    2    5    1    2              2
7   71  4    4    1    4    2    2              3
7   72  5    4    1    3    2    3              3
7   73  2    1    1    3    2    2              3
Let the variance–covariance matrix between evaluation items for the average data be

$$S = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1N} \\ s_{21} & s_{22} & \cdots & s_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ s_{N1} & s_{N2} & \cdots & s_{NN} \end{pmatrix} \tag{11}$$

where

$$s_{nn'} = \frac{1}{M} \sum_{m=1}^{M} (z_{mn} - z_n)(z_{mn'} - z_{n'}), \quad z_n = \frac{1}{M} \sum_{m=1}^{M} z_{mn} \tag{12}$$
Here, we set

$$z_0 = (z_1, z_2, \ldots, z_N)^t \tag{13}$$

and set $z_{mk} - z_0$ and $z_m - z_0$ anew to $z_{mk}$ and $z_m$. Therefore, the following holds in the rest of this paper:

$$\sum_{m=1}^{M} z_m = \sum_{m=1}^{M} \frac{1}{|E_m|} \sum_{k \in E_m} z_{mk} = 0 \tag{14}$$

We find the eigenvalues and eigenvectors of the variance–covariance matrix $S$ by

$$S a_p = \lambda_p a_p, \quad p = 1, 2, \ldots, N \tag{15}$$

where we let

$$a_p = (a_{p1}, a_{p2}, \ldots, a_{pN})^t, \quad a_p^t a_p = 1, \ \forall p, \quad \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_N \geq 0 \tag{16}$$

From the above, the $p$th principal component score $x_{pm}$ for the object $m$ due to the average data is given by

$$x_{pm} = a_p^t z_m \tag{17}$$

As described above, $z_m$ is actually found by subtracting the average vector $z_0$ for the evaluator and object, so $z_0$ is mapped to the origin of the space spanned by the principal component axes.
3.4. Average model for environment evaluation
When we work from the data given in Table 1:

$$\{z_{mnk};\ m = 1, 2, \ldots, 7,\ n = 1, 2, \ldots, 5,\ k = 1, 2, \ldots, 73\} \tag{18}$$

and find the average data for evaluators:

$$\{z_{mn};\ m = 1, 2, \ldots, 7,\ n = 1, 2, \ldots, 5\} \tag{19}$$

the results are as shown in Table 2.

Table 2
Average data for evaluators

       n=1    n=2    n=3    n=4    n=5    Water quality  Pleasantness
m = 1  2.688  3.000  3.125  3.813  2.000  2.500          3.125
m = 2  3.375  2.625  3.875  3.625  2.625  3.250          3.000
m = 3  2.500  2.071  3.357  4.143  3.071  2.860          3.210
m = 4  3.750  3.750  2.000  4.750  2.250  2.750          2.750
m = 5  4.300  3.900  1.400  3.600  2.000  2.200          2.900
m = 6  3.000  1.667  1.167  3.083  1.333  1.500          2.330
m = 7  3.889  2.556  1.111  3.778  1.556  1.800          2.440
Here, $z_0$ is calculated as

$$z_0 = (3.357, 2.796, 2.291, 3.827, 2.119)^t \tag{20}$$

then, subtracting this from the original average data shown in Table 2, we have

$$\begin{cases}
z_1 = (-0.669, 0.204, 0.834, -0.014, -0.119)^t \\
z_2 = (0.018, -0.171, 1.584, -0.202, 0.506)^t \\
z_3 = (-0.857, -0.725, 1.066, 0.316, 0.952)^t \\
z_4 = (0.393, 0.954, -0.291, 0.923, 0.131)^t \\
z_5 = (0.943, 1.104, -0.891, -0.227, -0.119)^t \\
z_6 = (-0.357, -1.129, -1.124, -0.744, -0.786)^t \\
z_7 = (0.532, -0.240, -1.180, -0.049, -0.563)^t
\end{cases} \tag{21}$$
Calculating the variance–covariance matrix using Eqs. (11) and (12), we obtain

$$S = \begin{pmatrix}
0.377 & 0.310 & -0.375 & 0.018 & -0.115 \\
0.310 & 0.580 & -0.083 & 0.184 & 0.031 \\
-0.375 & -0.083 & 1.125 & 0.119 & 0.476 \\
0.018 & 0.184 & 0.119 & 0.229 & 0.137 \\
-0.115 & 0.031 & 0.476 & 0.137 & 0.306
\end{pmatrix} \tag{22}$$
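As a cross-check, the entries of Eq. (22) can be recomputed from the centred averages of Eq. (21) with the divisor M = 7 of Eq. (12). The Python sketch below is only an illustration (the function name `covariance` is an assumption), and small differences in the third decimal place are to be expected because Eq. (21) is itself rounded.

```python
# Illustrative sketch: recompute S from the centred averages in Eq. (21).
Z = [
    [-0.669,  0.204,  0.834, -0.014, -0.119],
    [ 0.018, -0.171,  1.584, -0.202,  0.506],
    [-0.857, -0.725,  1.066,  0.316,  0.952],
    [ 0.393,  0.954, -0.291,  0.923,  0.131],
    [ 0.943,  1.104, -0.891, -0.227, -0.119],
    [-0.357, -1.129, -1.124, -0.744, -0.786],
    [ 0.532, -0.240, -1.180, -0.049, -0.563],
]

def covariance(rows):
    """Eq. (12) with divisor M; the rows are assumed to be centred already."""
    M = len(rows)
    N = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / M for j in range(N)]
            for i in range(N)]

S = covariance(Z)
```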
The eigenvalues and eigenvectors of $S$ are calculated as follows:

$$\begin{cases}
\lambda_1 = 1.521, & a_1 = (-0.355, -0.162, 0.839, 0.089, 0.369)^t \\
\lambda_2 = 0.801, & a_2 = (0.382, 0.800, 0.176, 0.360, 0.230)^t \\
\lambda_3 = 0.168, & a_3 = (0.385, 0.157, 0.380, -0.792, -0.236)^t \\
\lambda_4 = 0.094, & a_4 = (-0.637, 0.459, 0.088, 0.006, -0.614)^t \\
\lambda_5 = 0.033, & a_5 = (-0.418, 0.314, -0.335, -0.485, 0.616)^t
\end{cases} \tag{23}$$

The first and second principal component scores $(x_{1m}, x_{2m})$ for objects $m = 1, 2, \ldots, 7$ due to the average data are calculated as follows:

$$\left\{ \begin{array}{ll}
(x_{11}, x_{21}) = (0.859, 0.022), & (x_{12}, x_{22}) = (1.520, 0.193) \\
(x_{13}, x_{23}) = (1.695, -0.386), & (x_{14}, x_{24}) = (-0.408, 1.225) \\
(x_{15}, x_{25}) = (-1.324, 0.978), & (x_{16}, x_{26}) = (-0.990, -1.687) \\
(x_{17}, x_{27}) = (-1.352, -0.345) &
\end{array} \right. \tag{24}$$
Fig. 1 shows a plot of these scores on a two-dimensional plane. Looking at the $a_1$ and $a_2$ components, it appears that the first axis (horizontal axis) reflects evaluation of water quality, and the second axis (vertical axis) reflects the degree to which the natural conditions remain.
Fig. 1. Average principal component model.
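The leading eigenpair of Eq. (23) and the first score of Eq. (24) can be reproduced approximately from the rounded matrix of Eq. (22). The sketch below uses plain power iteration, which is a choice made here for self-containedness and is not stated to be the authors' numerical method; the helper names are hypothetical.

```python
import math

# Rounded S from Eq. (22) and the centred average z_1 from Eq. (21).
S = [
    [ 0.377,  0.310, -0.375, 0.018, -0.115],
    [ 0.310,  0.580, -0.083, 0.184,  0.031],
    [-0.375, -0.083,  1.125, 0.119,  0.476],
    [ 0.018,  0.184,  0.119, 0.229,  0.137],
    [-0.115,  0.031,  0.476, 0.137,  0.306],
]
z_1 = [-0.669, 0.204, 0.834, -0.014, -0.119]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def leading_eigenpair(A, iterations=500):
    """Power iteration: (eigenvalue, unit eigenvector) for the largest eigenvalue."""
    v = [1.0] * len(A)
    for _ in range(iterations):
        w = matvec(A, v)
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return dot(v, matvec(A, v)), v

lam_1, a_1 = leading_eigenpair(S)
if a_1[2] < 0:  # fix the sign so that the n = 3 weight is positive, as in Eq. (23)
    a_1 = [-x for x in a_1]
x_11 = dot(a_1, z_1)  # Eq. (17): first principal component score of object m = 1
```

Because the inputs are rounded to three decimals, the results should match the published values only to about two or three decimal places.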
4. Fuzzy data mapping
4.1. Fuzzifying sensibility data
Here we fuzzify the average data vector $z_m$, and introduce the fuzzy vector

$$Z_m = (Z_{m1}, Z_{m2}, \ldots, Z_{mN})^t \tag{25}$$

First we calculate the variance–covariance matrix

$$T_m = \frac{1}{|E_m|} \sum_{k \in E_m} (z_{mk} - z_m)(z_{mk} - z_m)^t \tag{26}$$

Here, we assume that $T_m$ is a positive definite matrix. Assuming $c_m$ to be a positive real number which will be determined later, we let

$$D_{Z_m} = c_m \cdot T_m \tag{27}$$

and define the fuzzy vector $Z_m$ with the following multi-dimensional membership function:

$$\mu_{Z_m}(z) = \exp\{-(z - z_m)^t D_{Z_m}^{-1} (z - z_m)\} \tag{28}$$

The parameter $c_m$ can be set as follows [14]. That is, taking a certain real number $h \in (0, 1)$, we find the minimum $c_m$ satisfying the following inequality:

$$\min_{k \in E_m} \{\mu_{Z_m}(z_{mk})\} \geq h \tag{29}$$
That value is

$$c_m = \frac{\max_{k \in E_m} \{(z_{mk} - z_m)^t T_m^{-1} (z_{mk} - z_m)\}}{-\log h} \tag{30}$$

This is a method which reflects the approach of an analyst who asks the question: to what degree should data possibilities be incorporated? Since an external standard (overall evaluation) does not exist, $h$ must be determined subjectively. However, as will be shown later, it is possible to find the relative fuzziness of principal components, and thereby compare the vagueness of evaluation with respect to objects.
Note 1: Here we explain the meaning of fuzzifying the above data for the case of one-dimensional data. As shown in Fig. 2, we assume that five data points (we have used only five to make the situation easier to visualize) are dispersed around the average, and that the leftmost data point is farthest from the average. This is a one-dimensional case, so $\mu_{Z_m}(z)$ given by Eq. (28) is a bell shape with left–right symmetry, and this is regarded as the possibility distribution of the data. In the left diagram the possibility of the leftmost data point is set to $h_1$, and in the right diagram it is set to $h_2$ ($> h_1$). Determination of the spread of the membership function depends on how we estimate the possibility of the data occurring most distant from the average. Our approach is to achieve this sort of spread by multiplying the variance–covariance matrix $T_m$ by $c_m$.
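The one-dimensional situation of Note 1 can be sketched in Python as follows. The data are the n = 1 ratings for object m = 4 in Table 1, and h = 0.5 is an arbitrary choice made for illustration; the variable and function names are assumptions.

```python
import math

# Illustrative one-dimensional sketch; the data are the n = 1 ratings for
# object m = 4 in Table 1, and h = 0.5 is an arbitrary choice.
data = [5, 3, 2, 5]
h = 0.5

mean = sum(data) / len(data)
T = sum((z - mean) ** 2 for z in data) / len(data)  # Eq. (26), scalar case

# Eq. (30): the smallest c_m for which every data point has membership >= h.
c = max((z - mean) ** 2 / T for z in data) / (-math.log(h))
D = c * T                                           # Eq. (27)

def membership(z):
    """Eq. (28) in one dimension."""
    return math.exp(-((z - mean) ** 2) / D)

# The lowest membership is attained at the point farthest from the mean.
lowest = min(membership(z) for z in data)
```

With this choice of c the farthest data point receives membership exactly h (up to rounding), which is the sense in which Eq. (30) gives the minimum c_m satisfying Eq. (29).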
4.2. Fuzzy principal components (1)
If we apply the extension principle [20], the membership function for the fuzzy $p$th principal component $X_{pm}$ of the object $m$ is given as follows:

$$\mu_{X_{pm}}(x) = \max_{z} \{\mu_{Z_m}(z) \mid x = a_p^t z\} \tag{31}$$

This can be found by solving the following optimization problem [14]:

$$\begin{cases}
\text{minimize } J(z) = (z - z_m)^t D_{Z_m}^{-1} (z - z_m) \\
\text{subject to } x = a_p^t z
\end{cases} \tag{32}$$

Fig. 2. Membership function as a possibility distribution.
That is, if we introduce the Lagrange multiplier $\lambda$ and set

$$L(z, \lambda) = (z - z_m)^t D_{Z_m}^{-1} (z - z_m) + \lambda (x - a_p^t z) \tag{33}$$

and solve

$$\frac{\partial L(z, \lambda)}{\partial z} = 0, \quad \frac{\partial L(z, \lambda)}{\partial \lambda} = 0 \tag{34}$$

then we obtain

$$z = z_m + \frac{x - a_p^t z_m}{a_p^t D_{Z_m} a_p} D_{Z_m} a_p \tag{35}$$

Substituting this for $J(z)$, we have

$$\mu_{X_{pm}}(x) = \exp\left\{ -\frac{(x - a_p^t z_m)^2}{a_p^t D_{Z_m} a_p} \right\} \tag{36}$$
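The closed form in Eqs. (35) and (36) can be verified numerically. The sketch below is a minimal check in two dimensions; the matrix, weight vector, centre and value of x are hypothetical numbers chosen only for illustration and are not taken from the paper's data.

```python
import math

# Hypothetical two-dimensional numbers chosen only for illustration.
D = [[2.0, 0.5], [0.5, 1.0]]  # spread matrix D_{Z_m}, positive definite
a = [0.6, 0.8]                # unit weight vector a_p
z_m = [1.0, -1.0]             # centre of the fuzzy vector Z_m
x = 0.5                       # principal component value being evaluated

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

Da = matvec(D, a)
spread = dot(a, Da)                                  # a_p^t D a_p
step = (x - dot(a, z_m)) / spread
z_star = [zc + step * d for zc, d in zip(z_m, Da)]   # Eq. (35)

diff = [p - q for p, q in zip(z_star, z_m)]
J_star = dot(diff, matvec(inverse_2x2(D), diff))     # objective of Eq. (32) at z_star
mu = math.exp(-J_star)                               # Eq. (36)
```

The point `z_star` satisfies the constraint of Eq. (32), and `J_star` equals the exponent of Eq. (36), which is what the substitution step in the text asserts.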
Note 2: The membership function for the fuzzy $p$th principal component found above can be interpreted as follows. That is, the orthogonal projection onto the $p$th principal component axis for the individual data vector is

$$x_{pmk} = a_p^t z_{mk} \tag{37}$$

When this is averaged over evaluators, it becomes

$$x_{pm} = a_p^t z_m \tag{38}$$

The variance is given by $a_p^t T_m a_p$. The value of this multiplied by $c_m$ is $a_p^t D_{Z_m} a_p$.
Note 3: If there is no change in data, or if the change is extremely small, the fuzzy vector $Z_m$ is defined by the following equation:

$$\mu_{Z_m}(z) = \begin{cases} 1, & z = z_m \\ 0, & \text{otherwise} \end{cases} \tag{39}$$

In this case, we have

$$\mu_{X_{pm}}(x) = \begin{cases} 1, & x = a_p^t z_m \\ 0, & \text{otherwise} \end{cases} \tag{40}$$
Note 4: In modeling with a Gaussian distribution, if

$$x_{pm} = a_{p1} z_{m1} + a_{p2} z_{m2} + \cdots + a_{pN} z_{mN} \tag{41}$$

and the $z_{mn}$’s are mutually independent in accordance with the Gaussian distribution $N(\mu_{mn}, \sigma_{mn}^2)$, then $x_{pm}$ follows the Gaussian distribution:

$$N\left( \sum_{n=1}^{N} a_{pn} \mu_{mn}, \ \sum_{n=1}^{N} a_{pn}^2 \sigma_{mn}^2 \right) \tag{42}$$

Thus we can calculate the distribution of principal component scores. However, a chi-square test confirms that the environment evaluation data used in this paper does not satisfy the independence hypothesis. We shall note that, even if we choose to ignore this fact and graph the distribution of $x_{pm}$, the results closely resemble the results shown later in Section 4.4.
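For reference, Eq. (42) amounts to the following computation. This is a minimal Python sketch under the independence assumption of Note 4; the item means and standard deviations are hypothetical, while the weights are the first eigenvector a_1 reported in Eq. (23).

```python
# Hypothetical means and standard deviations for N = 5 items; the weights are
# the first eigenvector a_1 reported in Eq. (23).
a_1 = [-0.355, -0.162, 0.839, 0.089, 0.369]
mu = [0.4, 1.0, -0.3, 0.9, 0.1]    # assumed item means mu_mn
sigma = [1.3, 1.6, 1.7, 0.4, 0.8]  # assumed item standard deviations sigma_mn

def linear_gaussian(weights, means, sigmas):
    """Mean and variance of sum_n a_pn z_mn for independent Gaussian z_mn (Eq. (42))."""
    mean = sum(w * m for w, m in zip(weights, means))
    variance = sum((w * s) ** 2 for w, s in zip(weights, sigmas))
    return mean, variance

score_mean, score_var = linear_gaussian(a_1, mu, sigma)
```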
4.3. Relative fuzziness
Using the above method, it is possible to express the relative fuzziness of the principal component scores for objects as shown in Fig. 3. Here, the membership function for the fuzzy principal component score in the plane of the first and second principal component axes is defined by the following equation:

$$\mu_{X_{1m} \times X_{2m}}(x_1, x_2) = \mu_{X_{1m}}(x_1) \times \mu_{X_{2m}}(x_2) = \exp\left\{ -\frac{(x_1 - a_1^t z_m)^2}{a_1^t D_{Z_m} a_1} - \frac{(x_2 - a_2^t z_m)^2}{a_2^t D_{Z_m} a_2} \right\} \tag{43}$$

Fig. 3 shows the α-level sets described by

$$\{(x_1, x_2) \mid \mu_{X_{1m} \times X_{2m}}(x_1, x_2) \geq \alpha\} \tag{44}$$

Fig. 3. Conceptual diagram of fuzzy principal component scores. [Axes: 1st principal component, 2nd principal component; regions labelled Student 1 to Student 6.]
Table 3
Values of parameter c_m

h    0.005  0.01  0.05  0.10  0.25  0.50
c_1  1.91   2.19  3.37  4.39  7.29  14.6
c_2  1.27   1.46  2.24  2.92  4.84  9.69
c_3  1.58   1.82  2.80  3.64  6.05  12.1
c_4  2.43   2.80  4.30  5.60  9.29  18.6
c_5  1.63   1.88  2.89  3.76  6.24  12.5
c_6  2.08   2.39  3.67  4.78  7.93  15.9
c_7  1.51   1.74  2.67  3.48  5.77  11.5
The ellipses in Fig. 3 express the fuzziness of the principal component scores, but their size is not necessarily absolute and they are the results of subjective judgment. However, the size of the ellipses can be regarded as indicative of relative size of fuzziness. This is the meaning of relative fuzziness in this paper.
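The size of these ellipses follows directly from Eq. (43): taking logarithms in Eq. (44) gives an axis-aligned ellipse whose squared semi-axes are the spreads multiplied by -log α. The Python sketch below illustrates this; the spread values are hypothetical and the function name is an assumption.

```python
import math

def alpha_cut_semi_axes(s1, s2, alpha):
    """Semi-axes of the ellipse {mu >= alpha} implied by Eq. (43).

    s1 and s2 stand for a_1^t D a_1 and a_2^t D a_2; the ellipse is centred on
    the average scores and aligned with the principal component axes.
    """
    radius_sq = -math.log(alpha)
    return math.sqrt(s1 * radius_sq), math.sqrt(s2 * radius_sq)

# Hypothetical spreads, used only to show how the ellipse grows as alpha falls.
axes_09 = alpha_cut_semi_axes(2.0, 0.5, 0.9)
axes_07 = alpha_cut_semi_axes(2.0, 0.5, 0.7)
growth = axes_07[0] / axes_09[0]  # independent of the spreads
```

Lowering α enlarges every ellipse by the same factor, so the ordering of ellipse sizes across objects does not depend on α, which is consistent with reading the sizes as relative fuzziness.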
4.4. Principal component analysis for environmental evaluation data by fuzzifying data
We calculate the variance–covariance matrices $T_1, T_2, \ldots, T_7$ defined by Eq. (26), and find the matrix $D_{Z_m}$ defined by Eq. (27). Here, the value of $c_m$ is calculated as in Table 3 for a number of $h$ values. Using $h = 0.005$ from Table 3, and graphing the sets given by Eq. (44) taking α = 0.9 and α = 0.7, we obtain Figs. 4 and 5.
Fig. 4. Analysis by fuzzifying data: α = 0.9.
Fig. 5. Analysis by fuzzifying data: α = 0.7.
In the above method, it is difficult to decide how to determine the $h$ taken into account in Eq. (29) and the α used when graphing. Interpretation of $h$ is particularly difficult. Nevertheless, it is thought to express the relative fuzziness of sensibility data, as described above.
5. Fuzzy principal component model
In this section, we construct the following fuzzy principal component model by fuzzifying the weight parameters:

$$X_p = A_{p1} z_1 + A_{p2} z_2 + \cdots + A_{pN} z_N \tag{45}$$

Here, $X_p$ is the fuzzy number indicating the $p$th principal component, $A_{pn}$ is the fuzzy number indicating the weight of evaluation item $n$, and $z_n$ indicates the crisp evaluation value for evaluation item $n$. Modeling in this way allows us to output as a fuzzy number the principal component score for a new evaluation vector not used in constructing the model.
5.1. Model identification
The membership function for the fuzzy vector

$$A_p = (A_{p1}, A_{p2}, \ldots, A_{pN})^t \tag{46}$$

is defined by

$$\mu_{A_p}(a) = \exp\{-(a - a_p)^t D_{A_p}^{-1} (a - a_p)\} \tag{47}$$

Here, the center of the membership function is given by the eigenvector of the variance–covariance matrix $S$ found with Eq. (15):

$$a_p = (a_{p1}, a_{p2}, \ldots, a_{pN})^t \tag{48}$$

On the other hand, $D_{A_p}$, which governs the spread, is established as follows. First, the weight vector specific to evaluator $k$

$$a_{pk} = (a_{p1k}, a_{p2k}, \ldots, a_{pNk})^t \tag{49}$$

is defined by the following equation:

$$a_{pk} = a_p + (z_{mk} - z_m), \quad k \in E_m, \ \forall p \tag{50}$$
This takes into account the dispersion of the evaluation of evaluator $k$, and the following equation holds due to Eq. (10):

$$\frac{1}{K} \sum_{k=1}^{K} a_{pk} = a_p + \frac{1}{K} \sum_{m=1}^{M} \sum_{k \in E_m} (z_{mk} - z_m) = a_p \tag{51}$$
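Here, the variance–covariance matrix of $\{a_{pk}\}$ is a common matrix for $p$, as indicated below:

$$\begin{aligned}
R &= \frac{1}{K} \sum_{k=1}^{K} (a_{pk} - a_p)(a_{pk} - a_p)^t = \frac{1}{K} \sum_{k=1}^{K} (z_{mk} - z_m)(z_{mk} - z_m)^t \\
  &= \frac{1}{K} \sum_{m=1}^{M} \sum_{k \in E_m} (z_{mk} - z_m)(z_{mk} - z_m)^t = \frac{\sum_{m=1}^{M} |E_m| \, T_m}{\sum_{m=1}^{M} |E_m|}
\end{aligned} \tag{52}$$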
Then we write $D_{A_p}$ as $D_A$ and define

$$D_A = R \tag{53}$$

$D_A$ is given by the weighted average of $T_m$ using $|E_m|$, so the variance is emphasized for geographical sites with many data points. For example, if $N = 2$ and $K = 7$, the positional relationship of evaluator weight parameters and the $\alpha$-level sets are shown in Fig. 6.
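The last equality in Eq. (52) can be checked numerically. The sketch below is only an illustration: it assumes that $T_m$ of Eq. (26), which is defined outside this section, is the covariance of the evaluations at site $m$ about the site average, normalized by $|E_m|$. The ratings and all helper names are hypothetical.

```python
def mean_vector(rows):
    """Component-wise average z_m of the evaluation vectors at one site."""
    count = len(rows)
    return [sum(col) / count for col in zip(*rows)]


def outer(u, v):
    """Outer product u v^t as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def add_scaled(acc, mat, scale):
    """Return acc + scale * mat for equally sized matrices."""
    return [[a + scale * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(acc, mat)]


def deviations(rows):
    """Deviations z_mk - z_m for every evaluator k at one site."""
    center = mean_vector(rows)
    return [[x - c for x, c in zip(row, center)] for row in rows]


def site_covariance(rows):
    """T_m, ASSUMED to be (1/|E_m|) * sum_k (z_mk - z_m)(z_mk - z_m)^t."""
    size = len(rows[0])
    t_m = [[0.0] * size for _ in range(size)]
    for d in deviations(rows):
        t_m = add_scaled(t_m, outer(d, d), 1.0 / len(rows))
    return t_m


def pooled_spread(sites):
    """R of Eq. (52), summing over all K evaluators directly."""
    total = sum(len(rows) for rows in sites)
    size = len(sites[0][0])
    r_mat = [[0.0] * size for _ in range(size)]
    for rows in sites:
        for d in deviations(rows):
            r_mat = add_scaled(r_mat, outer(d, d), 1.0 / total)
    return r_mat


def weighted_average_of_t(sites):
    """Right-hand side of Eq. (52): sum_m |E_m| T_m / sum_m |E_m|."""
    total = sum(len(rows) for rows in sites)
    size = len(sites[0][0])
    acc = [[0.0] * size for _ in range(size)]
    for rows in sites:
        acc = add_scaled(acc, site_covariance(rows), len(rows) / total)
    return acc


# Hypothetical 5-level ratings: two sites, two evaluation items.
SITES = [
    [[2, 4], [4, 2], [3, 3]],
    [[1, 5], [5, 5], [3, 2], [3, 4]],
]
D_A = pooled_spread(SITES)  # Eq. (53): D_A = R
```

Both routes give the same matrix up to rounding, which is why a site with more evaluators contributes more to $D_A$.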
Fig. 6. Conceptual diagram of weight membership functions.

Note 5: For Case 1 ($|O_k| = |O| = M$) or in the general case ($|O_k| > 1$, $\exists k$), if we define

$$a_{pk} = a_p + \sum_{m \in O_k} (z_{mk} - z_m), \quad \forall p \tag{54}$$

then the average of evaluators' parameters coincides with $a_p$, and the variance is given as

$$R = \frac{1}{K} \sum_{k=1}^{K} \sum_{m \in O_k} \sum_{m' \in O_k} (z_{mk} - z_m)(z_{m'k} - z_{m'})^t \tag{55}$$
5.2. Fuzzy principal components (2)
When a crisp evaluation vector $z$ is given, the membership function for the fuzzy principal component score can be found by applying the extension principle [20] as follows:

$$\mu_{X_p}(x) = \max_{a} \{\mu_{A_p}(a) \mid x = a^t z\} = \exp\left\{-\frac{(x - a_p^t z)^2}{z^t D_A z}\right\} \tag{56}$$

The equation above is obtained by solving the following optimization problem:

$$\begin{cases} \text{minimize} & (a - a_p)^t D_A^{-1} (a - a_p) \\ \text{subject to} & x = a^t z \end{cases} \tag{57}$$
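For readers who want the intermediate step, the following is one way to solve problem (57) with a Lagrange multiplier. It is a sketch under the assumption that $D_A$ is symmetric and positive definite; the symbols $u$, $c$ and $\nu$ are introduced here only for the derivation.

```latex
% Shifted variables: u = a - a_p,  c = x - a_p^t z, so the constraint is z^t u = c.
\begin{aligned}
L(u,\nu) &= u^t D_A^{-1} u - 2\nu\,(z^t u - c), \\
\nabla_u L = 0 &\;\Rightarrow\; D_A^{-1} u = \nu z \;\Rightarrow\; u = \nu D_A z, \\
z^t u = c &\;\Rightarrow\; \nu = \frac{c}{z^t D_A z}, \\
\min\, u^t D_A^{-1} u &= \nu^2\, z^t D_A z = \frac{(x - a_p^t z)^2}{z^t D_A z}.
\end{aligned}
```

Substituting this minimum into the exponent of Eq. (47) gives the right-hand side of Eq. (56).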
Even in this case, using $z = z_m$, it is possible to express the relative fuzziness of the principal component score for object $m$, as indicated in Fig. 7. Here, the membership function for the fuzzy principal component score in the first and second principal component plane is defined by the following equation:

$$\begin{aligned}
\mu_{X_{1m} \times X_{2m}}(x_1, x_2) &= \mu_{X_{1m}}(x_1) \times \mu_{X_{2m}}(x_2) \\
&= \exp\left\{-\frac{(x_1 - a_1^t z_m)^2 + (x_2 - a_2^t z_m)^2}{z_m^t D_A z_m}\right\}
\end{aligned} \tag{58}$$

Fig. 7. Conceptual diagram of fuzzy principal component scores. (Axes: 1st principal component, 2nd principal component; points labeled Student 1 to Student 6.)
Fig. 7 indicates the $\alpha$-level sets described by

$$\{(x_1, x_2) \mid \mu_{X_{1m} \times X_{2m}}(x_1, x_2) \geq \alpha\} \tag{59}$$

However, from Eq. (58) it is clear that there is a problem with this method in that fuzziness depends on the length of the evaluation vector. The following attempts to correct this point.
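The length dependence can be seen directly: setting the membership in Eq. (58) equal to $\alpha$ gives a disc whose squared radius is $z_m^t D_A z_m \,(-\log \alpha)$, so doubling the evaluation vector doubles the radius. The sketch below illustrates this with a hypothetical $2 \times 2$ spread matrix; the matrix, the vector and the function names are assumptions made for the example.

```python
import math

# Hypothetical 2x2 spread matrix D_A (symmetric, positive definite).
D_A = [[2.0, 0.5],
       [0.5, 1.0]]


def quad_form(z, d):
    """Quadratic form z^t D z."""
    n = len(z)
    return sum(z[i] * d[i][j] * z[j] for i in range(n) for j in range(n))


def alpha_radius(z, d, alpha):
    """Radius of the alpha-level disc implied by Eq. (58)."""
    return math.sqrt(quad_form(z, d) * (-math.log(alpha)))


z = [1.0, 2.0]
r_short = alpha_radius(z, D_A, 0.9)
r_long = alpha_radius([2.0 * v for v in z], D_A, 0.9)
print(r_long / r_short)  # the ratio equals the scaling factor, up to rounding
```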
5.3. Numerical quantification of vagueness
The $\alpha$-level set for the fuzzy principal component score in the $p$th–$q$th principal component plane is expressed by the following circle:

$$(x_p - a_p^t z)^2 + (x_q - a_q^t z)^2 = z^t D_A z \times (-\log \alpha), \quad 0 < \alpha < 1 \tag{60}$$

One feature of a possibility linear model based on the extension principle is that the possibility spreads out as the value of the explanatory variable increases. This is understandable in the case of a regression model, but in a 5-level evaluation, it is unnatural to assume that 5 has greater vagueness than 3. We can instead regard 3 as being the most vague. So we consider removing the effect of the length of the evaluation data vector as a factor. First, we can try to correct the size of the circle as follows:

$$(x_p - a_p^t z)^2 + (x_q - a_q^t z)^2 = \frac{z^t D_A z}{z^t z} \times (-\log \alpha) \tag{61}$$

However, since our intention is not to indicate the absolute value of fuzziness, we propose drawing a circle like the following.
First, letting $\lambda_{\max}(D_A)$ and $\lambda_{\min}(D_A)$ be respectively the maximum and minimum eigenvalues of the matrix $D_A$, note that

$$\lambda_{\min}(D_A) \leq \frac{z^t D_A z}{z^t z} \leq \lambda_{\max}(D_A) \tag{62}$$

We assume that the fuzzy principal component score of the evaluation vector $z$ in the $p$th–$q$th principal component plane can be expressed by the following circle:

$$(x_p - a_p^t z)^2 + (x_q - a_q^t z)^2 = r^2 \tag{63}$$

Here, we find the radius $r$ so it satisfies

$$\frac{r^2 - r_{\min}^2}{r_{\max}^2 - r_{\min}^2} = \frac{\lambda - \lambda_{\min}(D_A)}{\lambda_{\max}(D_A) - \lambda_{\min}(D_A)} \tag{64}$$

where

$$\lambda = \frac{z^t D_A z}{z^t z} \tag{65}$$

and $r_{\max}$ and $r_{\min}$ are design parameters indicating the maximum and minimum radius.

A larger value of $\lambda$ means that opinions are more likely to be dispersed in the direction of the evaluation vector $z$. This is the primary idea being proposed in this paper.
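Solved for the radius, Eq. (64) reads as follows. This is only a rearrangement, written under the assumption that the largest and smallest eigenvalues of $D_A$ differ.

```latex
r^2 = r_{\min}^2 + \left(r_{\max}^2 - r_{\min}^2\right)
      \frac{\lambda - \lambda_{\min}(D_A)}{\lambda_{\max}(D_A) - \lambda_{\min}(D_A)},
\qquad
\lambda = \frac{z^t D_A z}{z^t z}.
```

By Eq. (62) the fraction lies between 0 and 1, so $r_{\min} \leq r \leq r_{\max}$, and because $\lambda$ is unchanged when $z$ is multiplied by a nonzero constant, the radius no longer depends on the length of $z$.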
Note 6: We can now write the membership function of the $p$th principal component score as

$$\mu_{X_p}(x) = \exp\left\{-\frac{(x - a_p^t z)^2}{\lambda}\right\} \tag{66}$$

instead of the one given in Eq. (56), where

$$\lambda = \frac{z^t D_A z}{z^t z} \tag{67}$$

and $z$ is any evaluation vector. If we define the membership function like this, it is possible to draw the $\alpha$-level set directly, not using Eqs. (63) and (64).
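As a rough illustration of Note 6, the sketch below evaluates the membership function of Eq. (66) and the half-width of its $\alpha$-level interval, which follows from setting the membership equal to $\alpha$. The matrix, the weight vector, the evaluation vector and the function names are all hypothetical.

```python
import math


def rayleigh_quotient(z, d):
    """lambda = z^t D z / z^t z, as in Eq. (67)."""
    n = len(z)
    num = sum(z[i] * d[i][j] * z[j] for i in range(n) for j in range(n))
    return num / sum(v * v for v in z)


def membership(x, center, lam):
    """Eq. (66): exp(-(x - center)^2 / lambda), with center = a_p^t z."""
    return math.exp(-((x - center) ** 2) / lam)


def alpha_half_width(lam, alpha):
    """Half-width of the alpha-level interval: sqrt(lambda * (-log alpha))."""
    return math.sqrt(lam * (-math.log(alpha)))


# Hypothetical inputs, chosen only for illustration.
D_A = [[2.0, 0.5], [0.5, 1.0]]
a_p = [0.6, 0.8]
z = [1.0, -2.0]

lam = rayleigh_quotient(z, D_A)
center = sum(a * v for a, v in zip(a_p, z))
half_width = alpha_half_width(lam, 0.7)
```

At `center + half_width` the membership equals the chosen level 0.7, and rescaling `z` leaves `lam` unchanged.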
5.4. Evaluation model of local environment based on fuzzification of weights
Here we illustrate the proposed method using questionnaire data relating to waterside spaces adjoining the locality of residence. First, we use Eq. (50) to find the weight parameters peculiar to evaluator $k$:

$$\{a_{pk};\ k = 1, 2, \ldots, 73\} \tag{68}$$

Their variance–covariance matrix $D_A$, which does not depend on the principal component $p$, is calculated as follows:

$$D_A = \begin{pmatrix}
 1.761 &  0.154 &  0.348 & -0.152 & 0.061 \\
 0.154 &  1.991 &  0.600 & -0.308 & 0.008 \\
 0.348 &  0.600 &  1.463 & -0.151 & 0.124 \\
-0.152 & -0.308 & -0.151 &  1.666 & 0.024 \\
 0.061 &  0.008 &  0.124 &  0.024 & 1.020
\end{pmatrix} \tag{69}$$
The eigenvalues and eigenvectors of $D_A$ are calculated as follows:

$$\begin{cases}
\lambda_1 = 2.664, & a_1 = (-0.375, -0.695, -0.504, 0.347, -0.050)^t \\
\lambda_2 = 1.692, & a_2 = (0.869, -0.466, 0.070, 0.120, 0.091)^t \\
\lambda_3 = 1.540, & a_3 = (0.023, -0.257, -0.282, -0.917, -0.111)^t \\
\lambda_4 = 1.079, & a_4 = (0.285, 0.355, -0.525, 0.154, -0.703)^t \\
\lambda_5 = 0.927, & a_5 = (0.151, 0.329, -0.621, 0.018, 0.695)^t
\end{cases} \tag{70}$$

The maximum and minimum eigenvalues for $D_A$ are

$$\lambda_{\max}(D_A) = 2.664, \quad \lambda_{\min}(D_A) = 0.927 \tag{71}$$

and we set

$$r_{\max} = 0.5, \quad r_{\min} = 0.05 \tag{72}$$
The fuzzy principal component scores when the average data $z_m$ ($m = 1, 2, \ldots, 7$) are input to the model are given in Fig. 8. Fig. 8 differs from Figs. 4 and 5: the latter two were attempts to reflect the dispersion of the evaluations in the dispersion of the principal components, whereas Fig. 8 models the fuzziness of the evaluation vector for each geographical site. Fig. 8 thus rests on the idea of standardizing the evaluation vector when calculating the degree of vagueness.
Fig. 8. Fuzzy principal component scores for respective evaluated geographical sites.
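The radius computation behind figures of this kind can be sketched with the values reported in Eqs. (69) to (72). This is an illustrative reading of Eqs. (64) and (65), not the authors' code: the function names are hypothetical, and the clamping of the ratio is an addition made here because the published eigenvalues are rounded to three decimals.

```python
import math

# D_A as reported in Eq. (69).
D_A = [
    [1.761, 0.154, 0.348, -0.152, 0.061],
    [0.154, 1.991, 0.600, -0.308, 0.008],
    [0.348, 0.600, 1.463, -0.151, 0.124],
    [-0.152, -0.308, -0.151, 1.666, 0.024],
    [0.061, 0.008, 0.124, 0.024, 1.020],
]
LAMBDA_MAX, LAMBDA_MIN = 2.664, 0.927  # Eq. (71)
R_MAX, R_MIN = 0.5, 0.05               # Eq. (72)


def rayleigh_quotient(z):
    """lambda = z^t D_A z / z^t z, Eq. (65)."""
    n = len(z)
    num = sum(z[i] * D_A[i][j] * z[j] for i in range(n) for j in range(n))
    return num / sum(v * v for v in z)


def radius(z):
    """Solve Eq. (64) for r.

    The ratio is clamped to [0, 1] (an assumption of this sketch) so that
    rounding in the published eigenvalues cannot push r outside the range.
    """
    ratio = (rayleigh_quotient(z) - LAMBDA_MIN) / (LAMBDA_MAX - LAMBDA_MIN)
    ratio = min(1.0, max(0.0, ratio))
    return math.sqrt(R_MIN ** 2 + (R_MAX ** 2 - R_MIN ** 2) * ratio)


# Eigenvectors for the largest and smallest eigenvalues, from Eq. (70).
V_MAX = [-0.375, -0.695, -0.504, 0.347, -0.050]
V_MIN = [0.151, 0.329, -0.621, 0.018, 0.695]

print(radius(V_MAX), radius(V_MIN))  # close to R_MAX and R_MIN respectively
```

An input aligned with the first eigenvector receives a radius near $r_{\max}$, one aligned with the fifth a radius near $r_{\min}$, and multiplying the input by a constant does not change the result.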
The following are the evaluation vectors obtained from evaluation near the Mikawa Bridge ($m = 2$):

$$\begin{cases}
z_{21} = (2, 2, 5, 4, 1), \quad z_{22} = (2, 2, 4, 4, 4), \quad z_{23} = (5, 1, 5, 5, 2) \\
z_{24} = (4, 5, 4, 3, 3), \quad z_{25} = (3, 2, 2, 3, 3), \quad z_{26} = (4, 2, 2, 2, 3) \\
z_{27} = (5, 3, 5, 4, 3), \quad z_{28} = (2, 4, 4, 4, 2)
\end{cases} \tag{73}$$

When these vectors, after subtracting

$$z_0 = (3.357, 2.796, 2.291, 3.827, 2.119) \tag{74}$$

are input to the model, we obtain Fig. 9. The item indicated by 2 in Fig. 9 denotes the case where the average data for the 8 evaluators was used, and the items indicated by 21, 22, ..., 28 denote the cases where the respective evaluator data was input.
Fig. 9. Fuzzy principal component scores from 8 evaluators at geographical site 2.

Fig. 10. Fuzzy principal component scores (1).

Fig. 11. Fuzzy principal component scores (2).

Furthermore, Figs. 10 and 11 show the fuzzy principal component scores for a number of possible (extremal) input vectors. Looking at Fig. 10, we can see that the evaluation vectors from (1,1,1,1,1) to (5,5,5,5,5) lie on a single straight line. The small radii of the evaluation vectors (4,4,1,4,3) and (4,4,1,4,5) are attributable to the fact that they are nearly parallel with the eigenvector corresponding to the minimum eigenvalue of $D_A$. This suggests that there is a tendency for responses not to disperse in that direction (the direction obtained after the average vector $z_0$ is subtracted). Conversely, the large radius of the evaluation vector (2,1,2,5,2) is because it is almost parallel with the eigenvector which corresponds to the maximum eigenvalue of $D_A$. This suggests that responses are dispersed in that direction. In other words, many responses appear to vary in proportion to the vector obtained by subtracting the average vector from this vector. We can say that this signifies the fact that the evaluated geographical site is difficult to specify.
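The alignment claims can be examined with a few lines of arithmetic. The sketch below assumes, as stated for Eq. (74), that $z_0$ is subtracted from each input vector before it enters the model; it then measures the absolute cosine between the centered vector and the eigenvectors of Eq. (70). Function and variable names are hypothetical.

```python
import math

Z0 = [3.357, 2.796, 2.291, 3.827, 2.119]         # Eq. (74)
V_MAX = [-0.375, -0.695, -0.504, 0.347, -0.050]  # eigenvector for lambda_1, Eq. (70)
V_MIN = [0.151, 0.329, -0.621, 0.018, 0.695]     # eigenvector for lambda_5, Eq. (70)


def centered(z):
    """Evaluation vector with the average vector z_0 subtracted."""
    return [v - m for v, m in zip(z, Z0)]


def abs_cosine(u, v):
    """Absolute cosine of the angle between u and v (1 means parallel)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (norm_u * norm_v)


# Vector with a small radius in Fig. 10, compared with the min-eigenvalue direction.
small_case = abs_cosine(centered([4, 4, 1, 4, 3]), V_MIN)
# Vector with a large radius, compared with the max-eigenvalue direction.
large_case = abs_cosine(centered([2, 1, 2, 5, 2]), V_MAX)
print(small_case, large_case)
```

Both cosines come out close to 1, in line with the description of these vectors as nearly parallel to the respective eigenvectors.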
6. Correspondence analysis
Here we shall consider the handling of sensibility data in correspondence analysis, by directly applying the technique of the previous section. First we construct an average model, and then we introduce relative fuzziness with respect to evaluation objects, and relative fuzziness with respect to evaluation items.
6.1. Average model
We first normalize the evaluator average data $\{z_{mn}\}$ given by Eq. (9) as follows:

$$p_{mn} = \frac{z_{mn}}{z_0}, \quad z_0 = \sum_{m=1}^{M} \sum_{n=1}^{N} z_{mn} \tag{75}$$

and prepare the following correlation table:

$$P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1N} \\
p_{21} & p_{22} & \cdots & p_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
p_{M1} & p_{M2} & \cdots & p_{MN}
\end{pmatrix} \tag{76}$$
Here, if we let

$$p_{m\cdot} = \sum_{n=1}^{N} p_{mn}, \quad p_{\cdot n} = \sum_{m=1}^{M} p_{mn} \tag{77}$$

then the following equation holds:

$$\sum_{m=1}^{M} p_{m\cdot} = \sum_{n=1}^{N} p_{\cdot n} = 1 \tag{78}$$
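In correspondence analysis, a quantity $x_m$ is associated with evaluation object $m$, and a quantity $y_n$ is associated with evaluation item $n$, and we find the following vectors to maximize the correlation coefficient $\rho_{xy}$ defined below:

$$x = (x_1, x_2, \ldots, x_M)^t, \quad y = (y_1, y_2, \ldots, y_N)^t \tag{79}$$

Here, the correlation coefficient is defined by the following equation:

$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{80}$$

where

$$\sigma_x^2 = \sum_{m=1}^{M} p_{m\cdot} x_m^2 - \left(\sum_{m=1}^{M} p_{m\cdot} x_m\right)^2, \quad \sigma_y^2 = \sum_{n=1}^{N} p_{\cdot n} y_n^2 - \left(\sum_{n=1}^{N} p_{\cdot n} y_n\right)^2 \tag{81}$$

$$\sigma_{xy} = \sum_{m=1}^{M} \sum_{n=1}^{N} p_{mn} x_m y_n - \sum_{m=1}^{M} p_{m\cdot} x_m \sum_{n=1}^{N} p_{\cdot n} y_n \tag{82}$$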
This maximization problem reverts to an eigenvalue problem, and the solution is given by the eigenvectors (see Note 8):

$$\tilde{x}_i = (\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{iM})^t, \quad i = 1, 2, \ldots, M \tag{83}$$

$$\tilde{y}_i = (\tilde{y}_{i1}, \tilde{y}_{i2}, \ldots, \tilde{y}_{iN})^t, \quad i = 1, 2, \ldots, N \tag{84}$$

However, the first eigenvector is a meaningless solution, and in correspondence analysis, we check the proximity relationship between objects and evaluation items by plotting the following on a two-dimensional plane using the second and third eigenvectors:

$$(\tilde{x}_{2m}, \tilde{x}_{3m}), \quad m = 1, 2, \ldots, M \tag{85}$$

$$(\tilde{y}_{2n}, \tilde{y}_{3n}), \quad n = 1, 2, \ldots, N \tag{86}$$
Note 7: The correlation coefficient introduced above is defined as follows. First we calculate the weighted averages of $\{x_1, x_2, \ldots, x_M\}$ and $\{y_1, y_2, \ldots, y_N\}$:

$$\bar{x} = \sum_{m=1}^{M} p_{m\cdot} x_m, \quad \bar{y} = \sum_{n=1}^{N} p_{\cdot n} y_n \tag{87}$$

and define the variance and covariance as follows:

$$\sigma_x^2 = \sum_{m=1}^{M} p_{m\cdot} (x_m - \bar{x})^2, \quad \sigma_y^2 = \sum_{n=1}^{N} p_{\cdot n} (y_n - \bar{y})^2 \tag{88}$$

$$\sigma_{xy} = \sum_{m=1}^{M} \sum_{n=1}^{N} p_{mn} (x_m - \bar{x})(y_n - \bar{y}) \tag{89}$$

If the object $m$ strongly responds to $n$ then $p_{mn}$ becomes large. In this case we give similar values to $x_m$ and $y_n$, which makes $\sigma_{xy}$ large.
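The two expressions for the covariance, Eq. (82) and Eq. (89), are algebraically the same, and a small numerical check makes this concrete. The table, the trial scores and the function names below are hypothetical; the code only evaluates $\rho_{xy}$ for given $x$ and $y$ and does not perform the maximization.

```python
import math


def marginals(p):
    """Row sums p_m. and column sums p_.n of the table, Eq. (77)."""
    rows = [sum(row) for row in p]
    cols = [sum(col) for col in zip(*p)]
    return rows, cols


def weighted_mean(weights, values):
    """Weighted average as in Eq. (87)."""
    return sum(w * v for w, v in zip(weights, values))


def covariance_centered(p, x, y):
    """sigma_xy in the centered form of Eq. (89)."""
    rows, cols = marginals(p)
    x_bar, y_bar = weighted_mean(rows, x), weighted_mean(cols, y)
    return sum(p[m][n] * (x[m] - x_bar) * (y[n] - y_bar)
               for m in range(len(x)) for n in range(len(y)))


def covariance_raw(p, x, y):
    """sigma_xy in the raw-moment form of Eq. (82)."""
    rows, cols = marginals(p)
    cross = sum(p[m][n] * x[m] * y[n]
                for m in range(len(x)) for n in range(len(y)))
    return cross - weighted_mean(rows, x) * weighted_mean(cols, y)


def correlation(p, x, y):
    """rho_xy of Eq. (80), with the variances of Eq. (81)."""
    rows, cols = marginals(p)
    var_x = weighted_mean(rows, [v * v for v in x]) - weighted_mean(rows, x) ** 2
    var_y = weighted_mean(cols, [v * v for v in y]) - weighted_mean(cols, y) ** 2
    return covariance_raw(p, x, y) / math.sqrt(var_x * var_y)


# Hypothetical 3x3 table of average scores, normalized as in Eq. (75).
Z = [[4, 1, 1], [1, 4, 1], [1, 1, 4]]
TOTAL = sum(sum(row) for row in Z)
P = [[v / TOTAL for v in row] for row in Z]
X = [1.0, 2.0, 3.0]
Y = [1.0, 2.0, 3.0]
RHO = correlation(P, X, Y)
```

Because the large entries of this toy table sit on the diagonal, giving matching scores to object $m$ and item $n = m$ yields a positive correlation.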
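Note 8: Standardizing $\tilde{x}$ and $\tilde{y}$:

$$\tilde{x} = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_M)^t, \quad \tilde{x}_m = \frac{1}{\sigma_x}\left(x_m - \sum_{m=1}^{M} p_{m\cdot} x_m\right) \tag{90}$$

$$\tilde{y} = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_N)^t, \quad \tilde{y}_n = \frac{1}{\sigma_y}\left(y_n - \sum_{n=1}^{N} p_{\cdot n} y_n\right) \tag{91}$$

and introducing the following matrices $P_x$ and $P_y$:

$$P_x = \begin{pmatrix}
p_{1\cdot} & 0 & \cdots & 0 \\
0 & p_{2\cdot} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & p_{M\cdot}
\end{pmatrix}, \quad
P_y = \begin{pmatrix}
p_{\cdot 1} & 0 & \cdots & 0 \\
0 & p_{\cdot 2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & p_{\cdot N}
\end{pmatrix} \tag{92}$$

we obtain the following necessary conditions:

$$P \tilde{y} - \rho_{xy} P_x \tilde{x} = 0, \quad P^t \tilde{x} - \rho_{xy} P_y \tilde{y} = 0 \tag{93}$$

We can solve these equations by transforming them into an eigenvalue problem for a symmetric matrix.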
6.2. Relative fuzziness with respect to evaluation objects
We define a quantity, as follows, indicating the variation in the evaluation by evaluator $k$ of the evaluation object $m$:

$$b_{mk} = \frac{1}{N} \sum_{n=1}^{N} (z_{mnk} - z_{mn}), \quad k \in E_m \tag{94}$$
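Using this, the following vector is introduced:

$$b_k = b_{mk} (d_{1k}, d_{2k}, \ldots, d_{Mk})^t, \quad d_{mk} = \begin{cases} 1, & k \in E_m \\ 0, & k \notin E_m \end{cases} \tag{95}$$

The pseudo-eigenvector peculiar to the evaluator $k$

$$\tilde{x}_{ik} = (\tilde{x}_{i1k}, \tilde{x}_{i2k}, \ldots, \tilde{x}_{iMk})^t \tag{96}$$

is defined by the following equation:

$$\tilde{x}_{ik} = \tilde{x}_i + b_k \tag{97}$$

Here, the following equation holds due to Eq. (10):

$$\frac{1}{K} \sum_{k=1}^{K} \tilde{x}_{ik} = \tilde{x}_i \tag{98}$$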
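We introduce a fuzzy vector, whose components are fuzzy numbers

$$X_i = (X_{i1}, X_{i2}, \ldots, X_{iM})^t \tag{99}$$

and define the membership function with the following equation:

$$\mu_{X_i}(\mathbf{x}) = \exp\{-(\mathbf{x} - \tilde{x}_i)^t D_{X_i}^{-1} (\mathbf{x} - \tilde{x}_i)\} \tag{100}$$

The matrix $D_{X_i}$, which stipulates the spread of the membership function, is defined using the variance–covariance matrix:

$$\begin{aligned}
R &= \frac{1}{K} \sum_{k=1}^{K} (\tilde{x}_{ik} - \tilde{x}_i)(\tilde{x}_{ik} - \tilde{x}_i)^t \\
  &= \frac{1}{K} \begin{pmatrix}
\sum_{k=1}^{K} b_{1k}^2 d_{1k} & 0 & \cdots & 0 \\
0 & \sum_{k=1}^{K} b_{2k}^2 d_{2k} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sum_{k=1}^{K} b_{Mk}^2 d_{Mk}
\end{pmatrix}
\end{aligned} \tag{101}$$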
Here, $D_{X_i}$ does not depend on $i$, so we write $D_X$, and set

$$D_X = R \tag{102}$$

When we introduce the vector

$$a_m = (a_{m1}, a_{m2}, \ldots, a_{mM})^t, \quad a_{mm'} = \begin{cases} 1, & m = m' \\ 0, & m \neq m' \end{cases} \tag{103}$$

we obtain

$$X_{im} = a_m^t X_i \tag{104}$$
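Applying the extension principle, the membership function of $X_{im}$ is found as follows:

$$\mu_{X_{im}}(x) = \max_{\mathbf{x}} \{\mu_{X_i}(\mathbf{x}) \mid x = a_m^t \mathbf{x}\} = \exp\left\{-\frac{(x - a_m^t \tilde{x}_i)^2}{a_m^t D_X a_m}\right\} \tag{105}$$

We define the membership function for the fuzzy vector $(X_{2m}, X_{3m})$ by

$$\begin{aligned}
\mu_{X_{2m} \times X_{3m}}(x_2, x_3) &= \mu_{X_{2m}}(x_2) \times \mu_{X_{3m}}(x_3) \\
&= \exp\left\{-\frac{(x_2 - a_m^t \tilde{x}_2)^2 + (x_3 - a_m^t \tilde{x}_3)^2}{a_m^t D_X a_m}\right\}
\end{aligned} \tag{106}$$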
As in the previous section, we let the following be the circle indicating relative fuzziness:

$$(x_2 - a_m^t \tilde{x}_2)^2 + (x_3 - a_m^t \tilde{x}_3)^2 = r_m^2 \tag{107}$$

and establish the radius $r_m$ with the following equation:

$$\frac{r_m^2 - r_{\min}^2}{r_{\max}^2 - r_{\min}^2} = \frac{d_m - \min\{d_m\}}{\max\{d_m\} - \min\{d_m\}} \tag{108}$$

where

$$d_m = \frac{a_m^t D_X a_m}{a_m^t a_m} = \frac{1}{K} \sum_{k=1}^{K} b_{mk}^2 d_{mk} = \frac{1}{K} \sum_{k \in E_m} b_{mk}^2 \tag{109}$$
6.3. Relative fuzziness with respect to evaluation items
The vector $c_k$ indicating the response variation of an evaluator with respect to evaluation item $n$ is defined as follows:

$$c_k = (c_{1k}, c_{2k}, \ldots, c_{Nk})^t, \quad c_{nk} = z_{mnk} - z_{mn}, \quad k \in E_m \tag{110}$$

The pseudo-eigenvector peculiar to evaluator $k$

$$\tilde{y}_{ik} = (\tilde{y}_{i1k}, \tilde{y}_{i2k}, \ldots, \tilde{y}_{iNk})^t \tag{111}$$

is given by the following equation:

$$\tilde{y}_{ik} = \tilde{y}_i + c_k \tag{112}$$

Here too, the following equation holds due to Eq. (10):

$$\frac{1}{K} \sum_{k=1}^{K} \tilde{y}_{ik} = \tilde{y}_i \tag{113}$$
Just as we previously introduced the relative fuzziness with respect to the evaluation object, here too we introduce a fuzzy vector, whose components are fuzzy numbers

$$Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{iN})^t \tag{114}$$

The membership function is defined as follows:

$$\mu_{Y_i}(\mathbf{y}) = \exp\{-(\mathbf{y} - \tilde{y}_i)^t D_{Y_i}^{-1} (\mathbf{y} - \tilde{y}_i)\} \tag{115}$$
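Here too, $D_{Y_i}$ is defined by using the variance–covariance matrix

$$\begin{aligned}
T &= \frac{1}{K} \sum_{k=1}^{K} (\tilde{y}_{ik} - \tilde{y}_i)(\tilde{y}_{ik} - \tilde{y}_i)^t \\
  &= \frac{1}{K} \begin{pmatrix}
\sum_{k=1}^{K} c_{1k} c_{1k} & \sum_{k=1}^{K} c_{1k} c_{2k} & \cdots & \sum_{k=1}^{K} c_{1k} c_{Nk} \\
\sum_{k=1}^{K} c_{2k} c_{1k} & \sum_{k=1}^{K} c_{2k} c_{2k} & \cdots & \sum_{k=1}^{K} c_{2k} c_{Nk} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{k=1}^{K} c_{Nk} c_{1k} & \sum_{k=1}^{K} c_{Nk} c_{2k} & \cdots & \sum_{k=1}^{K} c_{Nk} c_{Nk}
\end{pmatrix}
\end{aligned} \tag{116}$$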
This is not dependent on $i$, so we write $D_{Y_i}$ as $D_Y$, and define as follows:

$$D_Y = T \tag{117}$$

Next, we introduce the vector

$$a_n = (a_{n1}, a_{n2}, \ldots, a_{nN})^t, \quad a_{nn'} = \begin{cases} 1, & n = n' \\ 0, & n \neq n' \end{cases} \tag{118}$$

and map the fuzzy vector $Y_i$ onto the fuzzy number $Y_{in}$ corresponding to the $n$th evaluation item:

$$Y_{in} = a_n^t Y_i \tag{119}$$
Using the extension principle, we obtain the following membership function:

$$\mu_{Y_{in}}(y) = \max_{\mathbf{y}} \{\mu_{Y_i}(\mathbf{y}) \mid y = a_n^t \mathbf{y}\} = \exp\left\{-\frac{(y - a_n^t \tilde{y}_i)^2}{a_n^t D_Y a_n}\right\} \tag{120}$$

The membership function of the fuzzy vector $(Y_{2n}, Y_{3n})$ is defined as follows:

$$\begin{aligned}
\mu_{Y_{2n} \times Y_{3n}}(y_2, y_3) &= \mu_{Y_{2n}}(y_2) \times \mu_{Y_{3n}}(y_3) \\
&= \exp\left\{-\frac{(y_2 - a_n^t \tilde{y}_2)^2 + (y_3 - a_n^t \tilde{y}_3)^2}{a_n^t D_Y a_n}\right\}
\end{aligned} \tag{121}$$
We define the relative fuzziness by the following equation:

$$(y_2 - a_n^t \tilde{y}_2)^2 + (y_3 - a_n^t \tilde{y}_3)^2 = s_n^2 \tag{122}$$

The radius $s_n$ is calculated using the following equation:

$$\frac{s_n^2 - s_{\min}^2}{s_{\max}^2 - s_{\min}^2} = \frac{d_n - \min\{d_n\}}{\max\{d_n\} - \min\{d_n\}} \tag{123}$$
where $d_n$ is given by the following equation:

$$d_n = \frac{a_n^t D_Y a_n}{a_n^t a_n} = \frac{1}{K} \sum_{k=1}^{K} c_{nk}^2 = \frac{1}{K} \sum_{m=1}^{M} \sum_{k \in E_m} (z_{mnk} - z_{mn})^2 \tag{124}$$
Thus, the square of the radius is determined in such a way that it varies in proportion to the variance. Note that $d_n$ is given by the weighted average of the variance of the original data, as indicated below:

$$d_n = \frac{1}{K} \sum_{m=1}^{M} \sum_{k \in E_m} (z_{mnk} - z_{mn})^2 = \frac{\sum_{m=1}^{M} |E_m| \left\{ \frac{1}{|E_m|} \sum_{k \in E_m} (z_{mnk} - z_{mn})^2 \right\}}{\sum_{m=1}^{M} |E_m|} \tag{125}$$
6.4. Application to subjective evaluation data on waterside environments
Here we conduct correspondence analysis, in which relative fuzziness has been expressed, by using sensibility evaluation data on waterside environments. The second and third eigenvectors were found as follows:

$$x_2 = \begin{pmatrix} 0.8296 \\ 1.0557 \\ 1.2917 \\ -0.5099 \\ -1.1373 \\ -0.7321 \\ -1.1812 \end{pmatrix}, \quad
x_3 = \begin{pmatrix} 1.3480 \\ -0.0947 \\ -0.9417 \\ 0.6092 \\ 1.0063 \\ -1.6744 \\ -0.8953 \end{pmatrix} \tag{126}$$

$$y_2 = \begin{pmatrix} -1.0326 \\ -0.6659 \\ 1.9128 \\ -0.1640 \\ 0.7161 \end{pmatrix}, \quad
y_3 = \begin{pmatrix} -0.8225 \\ 1.9050 \\ 0.2835 \\ -0.5305 \\ -0.5799 \end{pmatrix} \tag{127}$$
When we set

$$r_{\max} = s_{\max} = 0.5, \quad r_{\min} = s_{\min} = 0.05 \tag{128}$$

we can find the centers and radii of circles indicating the evaluation objects $m = 1, 2, \ldots, 7$ and evaluation items $n = 1, 2, \ldots, 5$, as indicated below:

$$\begin{cases}
m = 1: & (x_{21}, x_{31}) = (0.8296, 1.3480), & r_1 = 0.5000 \\
m = 2: & (x_{22}, x_{32}) = (1.0557, -0.0947), & r_2 = 0.1531 \\
m = 3: & (x_{23}, x_{33}) = (1.2917, -0.9417), & r_3 = 0.3195 \\
m = 4: & (x_{24}, x_{34}) = (-0.5099, 0.6092), & r_4 = 0.1757 \\
m = 5: & (x_{25}, x_{35}) = (-1.1373, 1.0063), & r_5 = 0.0500 \\
m = 6: & (x_{26}, x_{36}) = (-0.7321, -1.6744), & r_6 = 0.3529 \\
m = 7: & (x_{27}, x_{37}) = (-1.1812, -0.8953), & r_7 = 0.1532
\end{cases} \tag{129}$$

$$\begin{cases}
n = 1: & (y_{21}, y_{31}) = (-1.0326, -0.8225), & s_1 = 0.4375 \\
n = 2: & (y_{22}, y_{32}) = (-0.6659, 1.9050), & s_2 = 0.5000 \\
n = 3: & (y_{23}, y_{33}) = (1.9128, 0.2835), & s_3 = 0.3398 \\
n = 4: & (y_{24}, y_{34}) = (-0.1640, -0.5305), & s_4 = 0.4090 \\
n = 5: & (y_{25}, y_{35}) = (0.7161, -0.5799), & s_5 = 0.0500
\end{cases} \tag{130}$$
Fig. 12. Results using the proposed technique. (Plot labels: objects m=1 to m=7; items (n=1) Water plans, (n=2) Recreation, (n=3) Barbecuing Camping, (n=4) Embankments, (n=5) Clean.)
Fig. 12 was obtained from the above. The figure shows circles, with radii corresponding to the relative fuzziness obtained using the proposed technique.
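One possible way to read such a plot numerically is to compare center distances with the radii. The sketch below uses the centers and radii reported in Eqs. (129) and (130); the helper functions `gap` and `nearest_item` are hypothetical conveniences introduced here, not part of the proposed technique.

```python
import math

# (center x, center y, radius) for objects m, from Eq. (129).
OBJECTS = {
    1: (0.8296, 1.3480, 0.5000),
    2: (1.0557, -0.0947, 0.1531),
    3: (1.2917, -0.9417, 0.3195),
    4: (-0.5099, 0.6092, 0.1757),
    5: (-1.1373, 1.0063, 0.0500),
    6: (-0.7321, -1.6744, 0.3529),
    7: (-1.1812, -0.8953, 0.1532),
}
# (center x, center y, radius) for items n, from Eq. (130).
ITEMS = {
    1: (-1.0326, -0.8225, 0.4375),
    2: (-0.6659, 1.9050, 0.5000),
    3: (1.9128, 0.2835, 0.3398),
    4: (-0.1640, -0.5305, 0.4090),
    5: (0.7161, -0.5799, 0.0500),
}


def center_distance(a, b):
    """Euclidean distance between two circle centers."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def gap(a, b):
    """Center distance minus both radii; negative means the circles overlap."""
    return center_distance(a, b) - a[2] - b[2]


def nearest_item(m):
    """Index n of the evaluation item whose center is closest to object m."""
    return min(ITEMS, key=lambda n: center_distance(OBJECTS[m], ITEMS[n]))


for m in sorted(OBJECTS):
    n = nearest_item(m)
    print(m, n, round(gap(OBJECTS[m], ITEMS[n]), 4))
```

A negative gap indicates that the circle of an object and the circle of its nearest item overlap, so their positions cannot be separated at the chosen radii.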
7. Conclusion
In this paper we considered a new fuzzy principal component analysis technique for analyzing sensibility data. We investigated a method of fuzzifying data, and a method of fuzzifying weight parameters. In both cases, we attempted as much as possible to faithfully reflect, in the principal component score, the dispersion of evaluation values due to the evaluators in sensibility evaluation. However, the methods of achieving this reflection differed. Incidentally, one remaining problem is that "parameter setting is extremely ad hoc, and we would like to optimize this by introducing some kind of standard". However, it is impossible to consider an absolute value for "vagueness". Nor is there a solid basis for formulating the problem so as to keep the fuzzy eigenvalue spread to a minimum, as in the Yabuuchi and Watada method.
Consequently, in this paper, we have introduced the concept of relative fuzziness. The main purpose of principal component analysis is seeing the positional relationships of objects in a low-dimensional principal component space, but we introduced vagueness in position to this in a relative fashion. We then assumed that vagueness was a reflection of the manner of dispersion of evaluation data. This allowed us to see the distinctive features of the analysis method of evaluators. This could be used in considering combinations of test examiners to fairly conduct university or company entrance examinations. Finally, the proposed fuzzy principal component model was modified to eliminate effects of the evaluation vector length. In this way, the fuzziness of the principal component score is found using the relationship with the variance–covariance matrix which stipulates the fuzziness of the model parameters.
The radii of the circles in several figures in the paper express the relative fuzziness of the locations of objects for evaluation or words used in evaluation. We can understand, looking at those figures, how opinions are spread in terms of objects or words. Such information is useful in decision-making or the next stage analysis using subjective evaluation. We also proposed correspondence analysis incorporating relative fuzziness, as a direct application of the second of the abovementioned methods for principal component analysis. Additionally, in this paper we developed theory relating to evaluation of a single object by evaluators; an issue for future study will be expanding this to cases where general incomplete 3-mode data is available.
References
[1] P. Arabie, J.D. Carroll, W.S. DeSarbo, Three-way Scaling and Clustering, Sage Publications, 1987.
[2] J.L. García-Lapresta, A general class of simple majority decision rules based on linguistic opinions, Information Sciences 176 (2006) 352–365.
[3] J. Gebhardt, R. Kruse, The context model: an integrating view of vagueness and uncertainty, International Journal of Approximate Reasoning 9 (1993) 282–314.
[4] R.A. Harshman, M.E. Lundy, Data preprocessing and the extended PARAFAC model, in: H.G. Law, C.W. Snyder, J.A. Hattie, R.P. McDonald (Eds.), Research Methods for Multimode Data Analysis, Praeger, 1984, pp. 184–216.
[5] V.N. Huynh, Y. Nakamori, T.B. Ho, G. Resconi, A context model for fuzzy concept analysis based upon modal logic, Information Sciences 160 (2004) 111–129.
[6] V.N. Huynh, Y. Nakamori, A satisfactory-oriented approach to multiexpert decision-making with linguistic assessments, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 35 (2) (2005) 184–196.
[7] R. Kruse, J. Gebhardt, F. Klawonn, Numerical and logical approaches of fuzzy set theory by the context model, in: R. Lowen, M. Roubens (Eds.), Fuzzy Logic: State of the Art, Kluwer Academic Publishers, 1993, pp. 365–376.
[8] Y. Nakamori, M. Ryoke, Modeling of fuzziness in multivariate data analysis, in: Proceedings of the SMC'99 (1999 IEEE International Conference on Systems, Man and Cybernetics), Tokyo, Japan, October 12–15, 1999, pp. 302–307.
[9] Y. Nakamori, M. Ryoke, Fuzzy data analysis for three-way data, in: Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference: Fuzziness and Soft Computing in the New Millennium, Vancouver, Canada, June 25–28, 2001, pp. 2189–2194.
[10] W. Pedrycz, Fuzzy equalization in the construction of fuzzy sets, Fuzzy Sets and Systems 119 (2001) 329–335.
[11] G. Resconi, I.B. Turksen, Canonical forms of fuzzy truthoods by meta-theory based upon modal logic, Information Sciences 131 (2001) 157–194.
[12] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets and Systems 24 (1987) 363–375.
[13] H. Tanaka, H. Ishibuchi, Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters, Fuzzy Sets and Systems 41 (1991) 145–160.
[14] H. Tanaka, H. Ishibuchi, Soft Data Analysis, Asakura-Shoten, 1995.
[15] L.R. Tucker, The extension of factor analysis to three-dimensional matrices, in: H. Gulliksen, N. Frederiksen (Eds.), Contributions to Mathematical Psychology, Holt, Rinehart and Winston, 1964, pp. 110–127.
[16] I.B. Turksen, Measurement of membership functions and their acquisition, Fuzzy Sets and Systems 40 (1991) 5–38.
[17] Z. Xu, A method based on linguistic aggregation operators for group decision making with linguistic preference relations, Information Sciences 166 (2004) 19–30.
[18] Y. Yabuuchi, J. Watada, Fuzzy principal component analysis and its application, Journal of Biomedical Fuzzy Systems Association 3 (1997) 83–92.
[19] Y. Yabuuchi, J. Watada, Y. Nakamori, Fuzzy principal component analysis for fuzzy data, in: Proceedings of the 6th IEEE International Conference on Fuzzy Systems, Barcelona, Spain, July 1–5, 1997, pp. 1127–1130.
[20] L.A. Zadeh, The concept of linguistic variable and its application to approximate reasoning, Information Sciences 8 (1975) 199–249, II: 8 (1975), III: 9 43-80.