Safety Science 50 (2012) 1275–1283
Contents lists available at SciVerse ScienceDirect
Safety Science
journal homepage: www.elsevier.com/locate/ssci
Research on flood risk analysis and evaluation method based on variable fuzzy sets and information diffusion
Qiong Li a,b, Jianzhong Zhou a,⇑, Donghan Liu c, Xingwen Jiang a
a College of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
b School of Mathematics and Physics, Huangshi Institute of Technology, Huangshi 435003, China
c Huangshi Institute of Technology, Huangshi 435003, China

Article info

Article history:
Received 4 August 2011
Received in revised form 29 November 2011
Accepted 14 January 2012
Available online 16 February 2012

Keywords:
Variable fuzzy sets
Information diffusion
Analytical hierarchy process
Flood risk assessment
0925-7535/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ssci.2012.01.007
⇑ Corresponding author. Tel.: +86 27 87543127.
E-mail address: [email protected] (J. Zhou).
Abstract

Floods have become increasingly alarming worldwide. Flood risk management in terms of assessing disaster risk properly is a great challenge that society faces today. Natural disaster risk analysis is typically beset with issues such as imprecision, uncertainty, and partial truth. There are two basic forms of uncertainty related to natural disaster risk assessment, namely, randomness caused by inherent stochastic variability and fuzziness due to macroscopic grad and incomplete knowledge sample. However, the traditional probability statistical method ignores the fuzziness of risk assessment with incomplete data sets and requires a large sample size of data. The fuzzy set methodology is introduced in the area of disaster risk assessment to improve probability estimation. The purpose of the current study is to establish a fuzzy model to evaluate flood risk with incomplete data sets. The present paper puts forward a composite method based on variable fuzzy sets and information diffusion method for disaster risk assessment. The results indicate that the methodology is effective and practical; thus, it has the potential to forecast the flood risk in flood risk management. We hope that by conducting such risk analysis, the impact of flood disasters can be mitigated in the future.
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
Water, a special resource which sustains all forms of life, is an essential substance for the sustainable development of society. With the recent rapid development of the economy and the growth of population, flood, drought, water resource shortage, and water environment deterioration have become more acute. In particular, recent flooding disasters have shown the vulnerability of developed and developing countries to such events. In China, flood disasters occur frequently, and about two-thirds of its area faces the threat of different types and degrees of floods. These phenomena are the result of natural and unnatural factors, such as social and economic factors. Severe floods occur frequently, so flood risk management plays an important role in guiding the government in making timely and correct decisions for flood rescue and relief.
Risk assessment is the use of scientific data to define the probability of some harm coming to an individual or a population because of exposure to a substance or situation. However, assessing flood risk is difficult because of the lack of objective measures of acceptable risk, scarcity of data, and abundance of unknown probability distributions. In traditional flood risk assessment, the probability statistics method is usually used to estimate the exceedance probability of the hydrological variables. However, in practical problems, the feasibility and reliability of this approach suffer when fuzzy uncertainty is not considered. In small sample issues, results based on classical statistical methods are sometimes very unreliable, because collecting long-sequence flood data is rather difficult and the sample is always small.
In general, the uncertain feature of risk is relative to both randomness and fuzziness. For example, the occurrence of flood is a random event. However, the flood degree is a fuzzy concept. In the process of risk evaluation, randomness is due to a large amount of existing unknown factors, whereas fuzziness is concerned with the terms of macroscopic grad and incomplete knowledge sample (Huang, 1997). Using fuzzy sets theory (Zadeh, 1965), the data may be defined through vague, linguistic terms, such as low probability, serious impact, or high risk. These terms cannot be defined meaningfully with a precise single value, but fuzzy set theory provides the means by which these terms may be formally defined in mathematical logic. In practice, avoiding the gaps caused by the scarcity of data is impossible, which causes fuzziness (imprecision, vagueness, incompleteness, and so on). Therefore, we have to deal with the fuzziness of a risk system (Huang, 2002).
In many cases, the analysis of a risk system is too complex to understand from a random viewpoint. A natural way to improve risk analysis is to introduce a concept of fuzzy risk to overcome difficulties from fuzzy environments with incomplete knowledge. More importantly, a concept of fuzzy risk provides flexibility in the way risk information is conveyed.
There are many ways to fill up gaps caused by scarcity of data. The most popular is employing expert experience in empirical Bayesian methods. Many fuzzy methods have been developed to smooth out the gaps to some degree. The objectives of the current study are to find a more effective way of expressing the vagueness and imprecision of natural disaster risk assessment and to find an acceptable way of communicating these imprecisions. Independent of those methods, we use a new technique to transport information carried by the data sets to the gaps. This method is called information diffusion method (IDM), which helps extract as much useful underlying data as possible from the sample, thus improving the accuracy of system recognition (Huang and Shi, 2002; Palm, 2007). Information diffusion is a fuzzy mathematical set-value method for samples, which optimizes the use of fuzzy information of samples to offset the information deficiency.
Constructing a disaster control engineering system is a synthesis of multi-dimensional factors, so its risk evaluation shall proceed from single factors to multiple factors. This procedure implies that routine evaluation methods often omit important information and cannot obtain an integrated risk evaluation for engineering systems. Accordingly, under the global view of systems founded upon the characteristics of risk evaluation, the variable fuzzy set (VFS) is presented to evaluate the synthetic risk of a disaster control engineering system. The method can scientifically and reasonably determine membership degrees and relative membership functions of the objectives (or indicators) under evaluation at level intervals relating to the disaster. Furthermore, VFS can fully utilize one's experience and knowledge and integrate the qualitative and quantitative information of the indicator system with the analytical hierarchy process (AHP) to obtain the weights of the objectives (or indicators) for the comprehensive evaluation of floods (Wang et al., 2011; Zhang et al., 2011).
In the current study, we establish a synthetic disaster risk assessment model based on information diffusion and VFS with a small number of measured samples. The model is applied to the flood risk analysis in China successfully. In Section 2, we discuss incompleteness and analyze its relation to fuzziness to show why any incomplete knowledge set carries fuzzy information. In Section 3, we briefly describe some basic concepts and principles of the modeling framework. This is a new attempt at applying information diffusion theory and VFS in flood risk analysis. Computations based on this analytical flood risk model can yield a relatively accurate estimated flood damage value. An example is carried out, and the finding indicates that the model exhibits fairly stable analytical results, even when a small set of sample data is used. The results also indicate that the method is highly capable of extracting useful information, thereby improving system recognition accuracy. These results are shown in Section 4. Finally, discussions and conclusions are presented in Section 5.
2. Incompleteness and fuzziness
When we study a natural disaster risk system, we sometimes meet the small sample problem, where the data is too scanty to make a decision in any classical approach. In such cases the observed sample is insufficient, and avoiding this so-called incompleteness is difficult.
Chaitin proposed that incompleteness and fuzziness are subtly related (Chaitin, 1990) because simple mathematical questions do not always have clear answers and some questions can give answers that are completely random and look gray, rather than black or white. Therefore, when we study an incomplete sample, we cannot avoid its fuzziness. In the process of risk assessment, the randomness is due to a large amount of unknown factors. Fuzziness is concerned with the fuzzy information associated with an incomplete data set with respect to scarcity. This kind of fuzzy information is called mass-body fuzzy information (Huang and Shi, 2002). The main task of processing the mass-body fuzzy information is to unearth (or mine) fuzzy information, which is buried in an incomplete data set. This process is called information diffusion, wherein information is defined as the data organized to reveal patterns. The earliest and the most widely used model is the linear information distribution (Liu and Huang, 1990), which divides into two parts the information carried by an observation, whose total measure is 1. Another model is the normal information diffusion (Huang, 1997), in which an observation is changed into many parts according to a normal function.
When we study a risk system using a probabilistic method, it is usually difficult to ascertain if a hypothesis of probability distribution is suitable, and sometimes we meet the problem of small samples, wherein the data is too scanty to make a decision. The problem has shown that empirical Bayesian methods (Carlin and Louis, 1997) and kernel methods (Breiman et al., 1977; Chen, 1989; Devroye and Gyorfi, 1985; Hand, 1982; Parzen, 1962; Silverman, 1986; Wertz, 1978) need further development. Obtaining a precise relation between events and probabilities of occurrence is difficult. Going a step further, if we employ other methods to simplify system analysis, obtaining the precise relations we need is also difficult. In other words, the relations we obtain are usually imprecise. To preserve this imprecision, the best way is to employ fuzzy sets to represent the relations.
Fuzzy set theory, which deals with uncertainties and allows the incorporation of the opinions of decision makers, may provide an appropriate tool for establishing disaster risk management systems, such as fuzzy rule-based techniques and the combination of the fuzzy approach with other techniques. Risk is expressed in terms of fuzzy risk only when we study it by a fuzzy method. Some early related applications can be found in the literature (Brown, 1979; Clements, 1977; Dong et al., 1985; Esogbue et al., 1992; Hadipriono Timothy and Fabian, 1991; Hoffman et al., 1978). Fuzzy risk is an engineering concept, which can be defined as an approximate representation to show risk with fuzzy theory and techniques. In general, a fuzzy risk is a fuzzy relation between loss events and the factors concerned.
The concept of the fuzzy set was proposed by Zadeh (1965), who gave media and fuzziness a scientific description of great significance in the academic world. However, the fuzzy set is static if relativity and variability are not considered. Therefore, the theory is in conflict with the variability of the interim form. Some defects of traditional fuzzy sets are due to approaching the media, variable fuzzy phenomena, and variable fuzzy objects with the static concepts, theory, and method of fuzzy sets.
In light of the foregoing, the theory and method of VFS was proposed by Chen based on opposite fuzzy sets and the definitions of a relative difference function (Guo and Chen, 2006; Wu et al., 2006). The method of Chen is the innovation and extension of the static fuzzy set theory established by Zadeh (1965), which is very important in theory and applications. The comprehensive evaluation of VFS can effectively eliminate the effect of border on the assessment result and can monitor the error of estimation standard.
In the present study, VFS was combined with information diffusion as an integration of techniques. The method proposed in the current research first uses fuzzy multiple indicators of comprehensive evaluation of VFS and converts the multi-dimensional indicators of the samples into one-dimensional degree values. Then, the method turns the degree values of the observed sample into fuzzy sets by information diffusion method, finally obtaining the risk values. The method is then tested by a case showing that the method is superior to the traditional statistical model and improves the result of traditional estimation.
3. Basic concepts
3.1. Variable fuzzy sets
To define the concept, let us suppose that $U$ is a fuzzy concept, $A$ represents the characteristic of acceptability, and $A^c$ represents repellency. Hence, for any element $u$ ($u \in U$), $\mu_A(u)$ and $\mu_{A^c}(u)$ are relative membership degree (RMD) functions that express degrees of acceptability and repellency, respectively. We have $\mu_A(u) + \mu_{A^c}(u) = 1$, with $0 \le \mu_A(u) \le 1$ and $0 \le \mu_{A^c}(u) \le 1$.

Let $D(u) = \mu_A(u) - \mu_{A^c}(u)$, where $D(u)$ is defined as the relative difference degree of $u$ to $A$. The mapping $D: u \to D(u) \in [-1, 1]$ is defined as the relative difference function of $u$ to $A$. Therefore, we have $D(u) = 2\mu_A(u) - 1$ or $\mu_A(u) = [1 + D(u)]/2$.

Let

$$V_0 = \{(u, D) \mid u \in U,\ D(u) = \mu_A(u) - \mu_{A^c}(u),\ D \in [-1, 1]\}$$

$$A_+ = \{u \mid u \in U,\ 0 < D(u) < 1\}$$

$$A_- = \{u \mid u \in U,\ -1 < D(u) < 0\}$$

$$A_0 = \{u \mid u \in U,\ D(u) = 0\}$$

where $V_0$ is defined as VFS and $A_+$, $A_-$, and $A_0$ are defined as the attracting sets, repelling sets, and balance boundary of VFS $V_0$, respectively (Guo and Chen, 2006; Wu et al., 2006).
3.2. Methods of relative difference function
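We suppose that $X_0 = [a, b]$ is the attracting set of VFS $V$ on the real axis, i.e., the interval where $\mu_A(u) > \mu_{A^c}(u)$, and that $X = [c, d]$ is a certain interval containing $X_0$, i.e., $X_0 \subset X$ (Fig. 1).

According to the definition of VFS, the intervals $[c, a]$ and $[b, d]$ are repelling sets of VFS, i.e., the intervals where $\mu_A(u) < \mu_{A^c}(u)$. Suppose $M$ is the point in the attracting set $[a, b]$ at which $D(u) = 1$, and $x$ is an arbitrary value in the interval $X$. If $x$ is located on the left side of $M$, its difference function is as follows (Guo and Chen, 2006; Wu et al., 2006):

$$\begin{cases} D(x) = \left(\dfrac{x - a}{M - a}\right)^{\beta}, & x \in (a, M) \\[2mm] D(x) = -\left(\dfrac{x - a}{c - a}\right)^{\beta}, & x \in (c, a) \end{cases} \qquad (1)$$

or

$$\begin{cases} \mu(x) = 0.5\left[1 + \left(\dfrac{x - a}{M - a}\right)^{\beta}\right], & x \in (a, M) \\[2mm] \mu(x) = 0.5\left[1 - \left(\dfrac{x - a}{c - a}\right)^{\beta}\right], & x \in (c, a) \end{cases} \qquad (2)$$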
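If $x$ is located on the right side of $M$, its difference function is

$$\begin{cases} D(x) = \left(\dfrac{x - b}{M - b}\right)^{\beta}, & x \in (M, b) \\[2mm] D(x) = -\left(\dfrac{x - b}{d - b}\right)^{\beta}, & x \in (b, d) \end{cases} \qquad (3)$$

or

$$\begin{cases} \mu(x) = 0.5\left[1 + \left(\dfrac{x - b}{M - b}\right)^{\beta}\right], & x \in (M, b) \\[2mm] \mu(x) = 0.5\left[1 - \left(\dfrac{x - b}{d - b}\right)^{\beta}\right], & x \in (b, d) \end{cases} \qquad (4)$$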
where $\beta$ is an index greater than 0, usually taken as $\beta = 1$, in which case Eqs. (1) and (3) become linear functions and are equivalent to Eqs. (2) and (4). They satisfy the following: (i) at $x = a$ and $x = b$, $D(u) = 0$ or $\mu_A(u) = \mu_{A^c}(u) = 0.5$; (ii) at $x = M$, $D(u) = 1$ or $\mu_A(u) = 1$; (iii) at $x = c$ and $x = d$, $D(u) = -1$ or $\mu_A(u) = 0$. Then, according to Eqs. (1) and (3) (or Eqs. (2) and (4)) and $\mu_A(u) = [1 + D(u)]/2$, we can obtain the values of the relative membership degree $\mu_A(u)$ of the indicators under study.

Fig. 1. Relationship between the points x, M and the intervals [a,b], [c,d].
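The piecewise definitions in Eqs. (1)–(4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `relative_difference` and `relative_membership` are hypothetical, values outside [c, d] are mapped to D = −1 as an assumption of the sketch, and the degenerate case M = a (or M = b) is resolved in favour of condition (ii), D(M) = 1.

```python
def relative_difference(x, a, b, c, d, M, beta=1.0):
    """Relative difference D(x), following Eqs. (1) and (3).

    [a, b] is the attracting interval, [c, d] the enclosing interval
    (c <= a <= M <= b <= d) and M the point where D = 1.
    """
    if x < c or x > d:
        # Assumption of this sketch: fully repelled outside [c, d].
        return -1.0
    if x == M:
        # Condition (ii); also covers the degenerate cases M == a or M == b.
        return 1.0
    if x < M:
        # x lies on the left side of M: Eq. (1)
        if x >= a:
            return ((x - a) / (M - a)) ** beta
        return -(((x - a) / (c - a)) ** beta)
    # x lies on the right side of M: Eq. (3)
    if x <= b:
        return ((x - b) / (M - b)) ** beta
    return -(((x - b) / (d - b)) ** beta)


def relative_membership(x, a, b, c, d, M, beta=1.0):
    """Relative membership degree mu(x) = [1 + D(x)] / 2, i.e. Eqs. (2) and (4)."""
    return 0.5 * (1.0 + relative_difference(x, a, b, c, d, M, beta))


# Hypothetical level with [a, b] = [2, 6], [c, d] = [0, 10] and M = 4.
for x in (0, 2, 3, 4, 6, 8, 10):
    print(x, relative_membership(x, 2, 6, 0, 10, 4))
```

With β = 1 the membership rises linearly from 0 at c to 0.5 at a and 1 at M, then falls symmetrically through 0.5 at b to 0 at d, which matches conditions (i)–(iii).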
3.3. Analytic hierarchy process to decide the indicator weights
Various factors affect disasters, and the degrees by which they affect them are different. All the factors are interrelated and interact with each other to form a complex system. Synthetic analysis of the effect of the factors and evaluation of the disaster risk appropriately influence the final result directly. Therefore, determining the weight of the evaluation indicators is also a concern. We should compare each evaluation indicator according to its impact to humans and nature. In the current study, AHP is used to solve this problem.
The AHP developed by Saaty (1980) is based on the formulation of the decision problem in a hierarchical structure. We choose weights by comparing attributes two at a time, assessing the ratios for their importance. These ratios are used to compute the weights of individual attributes and to measure the consistency of the user's assessments (Hobbs and Meier, 2000). The method incorporates the researcher's subjective judgment aided by expert opinion, if need be, during the analysis by expressing the complex system in a hierarchical structure. Therefore, AHP ensures that the decision-making process is systematic, numerical, and computable. AHP is also a popular method in problem evaluation (Hobbs and Meier, 2000; Liang et al., 2006; Limmeechokchai and Chawana, 2007), recognized as a robust and flexible tool for dealing with complex decision-making problems (Liang et al., 2006). Its use has been largely explored in the literature (Greening and Bernow, 2004; Pohekar and Ramachandran, 2004).
AHP is especially suitable for problems in which multiple options and multiple criteria are taken into consideration, and then combined with VFS. Due to the vagueness and uncertainty existing in the sample data, conventional AHP seems to be insufficient and imprecise when applied to ambiguous problems. Therefore, the present study integrated AHP and the fuzzy method of VFS to develop a fuzzy analytical hierarchy process to decide the disaster synthetical degree value. The method can avoid the randomness due to the deviation of expert opinion, as well as the error due to the ambiguity and uncertainty in the fuzzy method. Therefore, the establishment of AHP and VFS method in the current paper makes the disaster risk evaluation more practical and feasible.
3.4. VFS-AHP process to evaluate the synthetical degree value
Suppose the sample set is $\{x_1, x_2, \ldots, x_n\}$ and every sample has $m$ indicators. The sample indicator matrix is

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} = (x_{ij}) \qquad (5)$$

where $x_{ij}$ is the $i$th indicator of sample $j$, $i = 1, 2, \ldots, m$, and $j = 1, 2, \ldots, n$.
Each indicator can be evaluated by $c$ levels, so the indicator criteria interval matrices of each level are

$$I_{ab} = \begin{pmatrix} [a_{11}, b_{11}] & [a_{12}, b_{12}] & \cdots & [a_{1c}, b_{1c}] \\ [a_{21}, b_{21}] & [a_{22}, b_{22}] & \cdots & [a_{2c}, b_{2c}] \\ \vdots & \vdots & \ddots & \vdots \\ [a_{m1}, b_{m1}] & [a_{m2}, b_{m2}] & \cdots & [a_{mc}, b_{mc}] \end{pmatrix} = ([a_{ih}, b_{ih}])$$

where $i = 1, 2, \ldots, m$; $h = 1, 2, \ldots, c$. Level 1 is the superior level and level $c$ is the inferior level. For every $[a_{ih}, b_{ih}]$, we can determine its range of interval $[c_{ih}, d_{ih}]$ according to the lower and upper limit of its adjacent intervals and its point $M$ as follows:

$$I_{cd} = \begin{pmatrix} [c_{11}, d_{11}] & [c_{12}, d_{12}] & \cdots & [c_{1c}, d_{1c}] \\ [c_{21}, d_{21}] & [c_{22}, d_{22}] & \cdots & [c_{2c}, d_{2c}] \\ \vdots & \vdots & \ddots & \vdots \\ [c_{m1}, d_{m1}] & [c_{m2}, d_{m2}] & \cdots & [c_{mc}, d_{mc}] \end{pmatrix} = ([c_{ih}, d_{ih}])$$

$$M = \begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1c} \\ M_{21} & M_{22} & \cdots & M_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ M_{m1} & M_{m2} & \cdots & M_{mc} \end{pmatrix} = (M_{ih})$$
Based on matrices $I_{ab}$, $I_{cd}$, and $M$, we judge whether the evaluating indicator $x$ is located on the left or right side of point $M$. Accordingly, we select Eq. (2) or (4) to calculate the relative membership degree $\mu_h(u_{ij})$ of the indicators to the standards, where $h$ is the grade number, $i$ is the indicator number, and $j$ is the sample number.

Thus, we obtain the relative membership degree matrix of the indicator values of the sample to each level according to Eqs. (2) and (4) as follows:

$${}_{j}U = (\mu(x_{ij})_h) \qquad (6)$$

According to AHP, the two-level hierarchy is constructed to obtain the weights of the evaluation indicators. We obtain the normalized weights of the evaluation indicators as $w$.
To obtain the synthetic degree value of each indicator, we use the variable fuzzy recognition model presented by Wu et al. (2006) as follows:

$$u'_h(x_j) = \left\{ 1 + \left[ \frac{\sum_{i=1}^{m} \left[ w_i \left( 1 - \mu(x_{ij})_h \right) \right]^p}{\sum_{i=1}^{m} \left[ w_i\, \mu(x_{ij})_h \right]^p} \right]^{\alpha/p} \right\}^{-1} \qquad (7)$$

$$H = (1, 2, 3, 4) \cdot u_h(x_j) \qquad (8)$$

where $h$ is the degree number, $h = 1, 2, 3, 4$, $x_j$ represents sample $j$, and $x_{ij}$ is the $i$th indicator value of sample $j$. Thus, $H$ is the synthetic degree value vector of every sample.
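A small Python sketch may help make Eqs. (7) and (8) concrete. Several points are assumptions of this sketch and are not stated in the text above: the parameters α and p default to 1, the vector $u_h$ in Eq. (8) is taken to be $u'_h$ normalised so that its components sum to 1, and a level whose weighted membership sum is zero is assigned $u'_h = 0$ (the limit of Eq. (7) as the ratio grows without bound). The function name `synthetic_degree` is hypothetical.

```python
def synthetic_degree(mu, w, alpha=1.0, p=1.0):
    """Return (u, H) for one sample.

    mu[h][i] is the relative membership degree of indicator i to level
    h + 1, and w[i] is the weight of indicator i.
    """
    raw = []
    for mu_h in mu:
        # Numerator and denominator of the ratio inside Eq. (7)
        repel = sum((wi * (1.0 - m)) ** p for wi, m in zip(w, mu_h))
        attract = sum((wi * m) ** p for wi, m in zip(w, mu_h))
        if attract == 0.0:
            raw.append(0.0)  # assumed limit when the ratio is unbounded
        else:
            raw.append(1.0 / (1.0 + (repel / attract) ** (alpha / p)))
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all levels have zero membership")
    u = [r / total for r in raw]  # assumed normalisation of u'_h
    # Eq. (8): level numbers 1, 2, ... weighted by the normalised degrees
    H = sum(level * u_h for level, u_h in enumerate(u, start=1))
    return u, H


# Hypothetical sample with two equally weighted indicators and four levels.
mu_example = [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [0.0, 0.0]]
u_example, H_example = synthetic_degree(mu_example, [0.5, 0.5])
print(u_example, H_example)
```

In this hypothetical example the sample belongs fully to level 1 and half to level 2, so the degree value falls between 1 and 2.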
3.5. Information diffusion
Information diffusion is a fuzzy mathematical set-value method for the samples, optimizing the use of fuzzy information of the samples to offset the information deficiency. The method can turn an observed sample into a fuzzy set, that is, turn a single point sample into a set-value sample. The simplest model of information diffusion is the normal diffusion model.

Information diffusion: Let $X$ be a set of samples and $V$ be a subset of the universe, and let $\mu: X \times V \to [0, 1]$ be a mapping from $X \times V$ to $[0, 1]$. For all $(x, v) \in X \times V$, $\mu(x, v)$ is called an information diffusion of $X$ on $V$ if it satisfies the following three conditions (Huang and Shi, 2002):

(1) It is decreasing: $\forall x \in X$, $\forall v', v'' \in V$, if $\|v' - x\| \le \|v'' - x\|$, then $\mu(x, v') \ge \mu(x, v'')$. $\mu$ is the diffusion function.
(2) $\forall x \in X$, let $v^*$ be the observed value of $x$; it satisfies $\mu(x, v^*) = \max_{v \in V} \mu(x, v)$.
(3) $\mu(x, v)$ is conservative, that is, $\forall x \in X$ its integral over the universe is 1, viz. $\int_U \mu(x, u)\,du = 1$.

In particular, if the domain of the random variable is discrete, say $U = \{u_1, u_2, \ldots, u_m\}$, the conservation condition is $\sum_{j=1}^{m} \mu(x, u_j) = 1$ ($\forall x \in X$).
Let $X = \{x_1, x_2, \ldots, x_n\}$ be a sample, and $U = \{u_1, u_2, \ldots, u_m\}$ be the discrete universe of $X$. $x_i$ and $u_j$ are called the sample point and the monitoring point, respectively. For all $x_i \in X$ and $u_j \in U$, we diffuse the information carried by $x_i$ to $u_j$ at gain $f_i(u_j)$ using the normal information diffusion shown in the following equation:

$$f_i(u_j) = \exp\left[ -\frac{(x_i - u_j)^2}{2h^2} \right], \quad u_j \in U \qquad (9)$$
where $h$ is the normal diffusion coefficient calculated by Eq. (10) (Huang and Moraga, 2005; Huang and Shi, 2002).

$$h = \begin{cases} 0.8146\,(b - a), & n = 5; \\ 0.5690\,(b - a), & n = 6; \\ 0.4560\,(b - a), & n = 7; \\ 0.3860\,(b - a), & n = 8; \\ 0.3362\,(b - a), & n = 9; \\ 0.2986\,(b - a), & n = 10; \\ 0.6851\,(b - a)/(n - 1), & n \ge 11 \end{cases} \qquad (10)$$

where $b = \max_{1 \le i \le n}\{x_i\}$ and $a = \min_{1 \le i \le n}\{x_i\}$. Let

$$C_i = \sum_{j=1}^{m} f_i(u_j) \qquad (11)$$
We obtain a normalized information distribution on $U$ determined by $x_i$, as shown in the following equation:

$$\mu_{x_i}(u_j) = \frac{f_i(u_j)}{C_i} \qquad (12)$$

For each monitoring point $u_j$, by adding all normalized information, we obtain the information gain at $u_j$, which comes from the given sample $X$. The information gain is shown in the following equation:

$$q(u_j) = \sum_{i=1}^{n} \mu_{x_i}(u_j) \qquad (13)$$
$q(u_j)$ means that, with the information diffusion technique, there are $q(u_j)$ sample points, in terms of statistical averaging, at the monitoring point $u_j$. $q(u_j)$ is generally not an integer, but it is never less than zero. Let

$$Q = \sum_{j=1}^{m} q(u_j), \qquad (14)$$

where $Q$ is the sum of the sample sizes $q(u_j)$ over all monitoring points. Theoretically $Q = n$, but due to numerical calculation error there is a slight difference between $Q$ and $n$. Therefore, we can use Eq. (15) to estimate the frequency value of a sample falling at $u_j$.

$$p(u_j) = \frac{q(u_j)}{Q} \qquad (15)$$
The frequency value can be taken as the estimation value of its probability. The probability value of transcending $u_j$ should be

$$P(u_j) = \sum_{k=j}^{m} p(u_k), \qquad (16)$$

where $P(u_j)$ is the required risk estimation value.
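Eqs. (9)–(16) chain together into a short procedure, sketched below in Python. The function names `diffusion_coefficient` and `exceedance_risk` and the monitoring grid are hypothetical; the sample consists of the first five degree values of Table 8, used purely for illustration. The coefficients are copied from Eq. (10) as printed. The n ≥ 11 branch makes h drop sharply between n = 10 and n = 11, so it may be worth checking that constant against Huang and Shi (2002) before relying on it.

```python
import math


def diffusion_coefficient(sample):
    """Normal diffusion coefficient h, with the constants printed in Eq. (10)."""
    n = len(sample)
    spread = max(sample) - min(sample)  # b - a
    table = {5: 0.8146, 6: 0.5690, 7: 0.4560, 8: 0.3860, 9: 0.3362, 10: 0.2986}
    if n in table:
        return table[n] * spread
    if n >= 11:
        return 0.6851 * spread / (n - 1)  # as printed in Eq. (10)
    raise ValueError("Eq. (10) is only given for n >= 5")


def exceedance_risk(sample, monitor, h=None):
    """Return (p, P): frequency p(u_j) and exceedance probability P(u_j)."""
    if h is None:
        h = diffusion_coefficient(sample)
    q = [0.0] * len(monitor)
    for x in sample:
        f = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in monitor]  # Eq. (9)
        C = sum(f)                                                        # Eq. (11)
        for j, fj in enumerate(f):
            q[j] += fj / C                                                # Eqs. (12)-(13)
    Q = sum(q)                                                            # Eq. (14)
    p = [qj / Q for qj in q]                                              # Eq. (15)
    P = [sum(p[j:]) for j in range(len(p))]                               # Eq. (16)
    return p, P


# First five degree values of Table 8, on a hypothetical monitoring grid.
sample = [1.2975, 2.4769, 1.5831, 1.5564, 3.2447]
monitor = [1.0 + 0.5 * k for k in range(7)]  # 1.0, 1.5, ..., 4.0
p, P = exceedance_risk(sample, monitor)
for u, risk in zip(monitor, P):
    print(u, round(risk, 4))
```

The sketch assumes the sample values are not all equal (otherwise h = 0) and that the monitoring grid covers the sample range, so that no $C_i$ underflows to zero.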
4. Evaluation
4.1. Methods compared
To evaluate our method, we compared it with other methods using some simulation experiments.

Table 1
Average divergence q and the relative error e for N(0,1).

n     10        12        14        16        18        20        22
q     0.059428  0.047242  0.044552  0.044344  0.043196  0.041732  0.040316
q′    0.063654  0.057944  0.057042  0.054808  0.053956  0.047588  0.045344
e     0.0664    0.1847    0.2190    0.1909    0.1994    0.1231    0.1109

Table 2
Average divergence q and the relative error e for E(15).

n     10        12        14        16        18        20        22
q     0.087628  0.088108  0.085376  0.082214  0.082084  0.072396  0.070786
q′    0.122476  0.109402  0.102194  0.091622  0.088534  0.083526  0.079282
e     0.2845    0.1946    0.1646    0.1027    0.0729    0.1333    0.1072
An experiment is conducted using the normal distribution N(0,1). We draw 10 numbers randomly from the standard normal distribution N(0,1) and estimate the distribution using Eqs. (10)–(15). The average divergence obtained after 50 simulation experiments is q = 0.059428. Then, we let n = 12, ..., 22 and run 50 simulation experiments for each n. We thus obtain Table 1, which shows the average divergence q of the normal information diffusion estimate compared with the average divergence q′ of the histogram estimate (HE). The relative error of HE and IDM is calculated as $e = (q' - q)/q'$. The results show that IDM is better than HE. Roughly speaking, for a small sample, the method of normal information diffusion can improve a histogram estimator to reduce the mean error by about 15.63%.
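The relative errors in Table 1 and the quoted 15.63% follow directly from the tabulated divergences, as the short Python check below illustrates. The variable names are hypothetical; the numbers are those of Table 1.

```python
# Average divergences from Table 1 for n = 10, 12, ..., 22
q_idm = [0.059428, 0.047242, 0.044552, 0.044344, 0.043196, 0.041732, 0.040316]
q_he = [0.063654, 0.057944, 0.057042, 0.054808, 0.053956, 0.047588, 0.045344]

# Relative error e = (q' - q) / q' for each sample size
errors = [(he - idm) / he for idm, he in zip(q_idm, q_he)]
mean_error = sum(errors) / len(errors)

print([round(e, 4) for e in errors])
print(round(100.0 * mean_error, 2))  # mean reduction in percent
```

Averaging the seven relative errors reproduces the figure of about 15.63% given in the text.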
For another experiment with the exponential distribution, we obtain Table 2 to show the average divergences q and q′ of IDM and HE, respectively. Table 2 shows that when n is small, IDM is better than HE with respect to the exponential distribution.

When n is small, the method of IDM is superior with respect to almost any distribution. However, for a given sample whose size is large and which is drawn from an exponential or lognormal distribution, the new method is not the best. Therefore, no diffusion function can express all diffusion phenomena.
4.2. Limitation of IDM
A study of the above simulation experiments reveals that the superiority of IDM is dependent on whether we are blind to the population and whether the size of a given sample is small. In the experiments, the given sample is considered fuzzy, so some benefits can be obtained by IDM. The work efficiency of IDM is about 35% higher than that of HE. That is, if no knowledge is available about the population from which the given sample is drawn, and if the sample size is small, we have to obtain more observations, adding about 35%, to guarantee that the estimation is as good as the one given by the fuzzy method.

However, if we have a lot of knowledge about the population to confirm an assumption, the statistical object with respect to a given sample is clearer. If the size of a given sample is large, there is an abundance of statistical information in the sample. In this case, it is unnecessary to replace the statistics with IDM as little benefit can be obtained using it.
5. Application of the method to flood risk assessment
The assessment of flood risk comprises the following steps:
1. Decide the flood indicator weights using AHP.
2. Convert the multi-dimensional indicators of the samples into the one-dimensional degree value using VFS-AHP.
3. Turn the degree value of the sample into a fuzzy set and then get the required risk estimation value by IDM.
4. Calculate the recurrence interval according to the risk estimation value.
According to the above theory, we can calculate the flood risk estimation of various degrees in China based on the historical data from 1950 to 2009 collected by the Ministry of Water Resources of the People's Republic of China (Table 3). We select the set of 60 records as the sample, and then 30 records are randomly chosen to form a small sample to test the stability of the results obtained using the method. Damage area, inundated area, dead population, and collapsed houses have been chosen as the disaster indicators in flood risk analysis. By frequency analysis, the floods are classified into four levels: small, medium, large, and extreme (Table 4).
5.1. VFS-AHP for comprehensive evaluation of the flood degree
The two-level hierarchy is constructed to obtain the weights of the evaluation indicators. The goal is to ascertain "the weights of the evaluation indicators." The evaluation indicators (attributes) are damage area, inundated area, dead population, and collapsed houses.

The pairwise comparison is conducted using a scale based on the proposal of Saaty (1980) detailed in Table 5. To illustrate the kind of results obtained, Table 6 presents a pairwise comparison matrix drawn from the information provided by the expert for the evaluation of the importance of the factors. Then, the consistency of the comparison matrix is tested and the relative weights of the elements are computed along with the consistency ratio (CR), as presented in Table 7. If the CR is below 10%, the judgments are considered consistent.
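The weight and consistency computation described above can be sketched in Python as follows. This is a hedged illustration, not the authors' procedure: it assumes the weights are the normalised principal eigenvector of the comparison matrix, approximated here by power iteration, and it assumes a random index of 0.90 for a 4 × 4 matrix (the value commonly tabulated for Saaty's method). The names `ahp_weights` and `consistency_ratio` are hypothetical; the matrix entries are those of Table 6.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector (weights) and lambda_max."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)               # w sums to 1, so sum(A w) estimates lambda_max
        w = [vi / lam for vi in v]  # renormalise so the weights sum to 1
    return w, lam


def consistency_ratio(lam, n, random_index):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    ci = (lam - n) / (n - 1)
    return ci / random_index


# Pairwise comparison matrix of Table 6 (rows and columns: damage area,
# inundated area, dead population, collapsed houses).
TABLE_6 = [
    [1, 1 / 2, 1 / 9, 1 / 3],
    [2, 1, 1 / 5, 1 / 2],
    [9, 5, 1, 3],
    [3, 2, 1 / 3, 1],
]

weights, lam_max = ahp_weights(TABLE_6)
cr = consistency_ratio(lam_max, 4, 0.90)  # 0.90 is an assumed random index
print([round(x, 4) for x in weights], round(cr, 4))
```

Under these assumptions, the sketch gives weights close to those listed in Table 7 and a consistency ratio well below the 10% threshold.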
According to AHP, we obtain the normalized weights of the evaluation indicators as

$$W = [0.0625 \quad 0.1875 \quad 0.4375 \quad 0.3125] = (w_i) \qquad (17)$$

According to Table 4 and Chen (1997), we set up the matrices of the parameters for calculating the difference function of VFS.

$$I_{[a,b]} = \begin{bmatrix} [0, 9045] & [9045, 14197] & [14197, 20388] & [20388, 80000] \\ [0, 4989] & [4989, 8216.7] & [8216.7, 13000] & [13000, 50000] \\ [0, 3446] & [3446, 5113] & [5113, 10676] & [10676, 100000] \\ [0, 112.1] & [112.1, 247.7] & [247.7, 754.3] & [754.3, 5000] \end{bmatrix}$$

$$I_{[c,d]} = \begin{bmatrix} [0, 14197] & [0, 20388] & [9045, 80000] & [14197, 80000] \\ [0, 8216.7] & [0, 13000] & [4989, 50000] & [8216.7, 50000] \\ [0, 5113] & [0, 10676] & [3446, 100000] & [5113, 100000] \\ [0, 247.7] & [0, 754.3] & [112.1, 5000] & [247.7, 5000] \end{bmatrix}$$
Table 3
Values of flood indicators during 60 years.

Year   Disaster area      Inundated area     Dead population   Collapsed houses
       (1000 hectares)    (1000 hectares)    (persons)         (10,000)
1950   6559.00            4710.00            1982              130.50
1951   4173.00            1476.00            7819              31.80
1952   2794.00            1547.00            4162              14.50
1953   7187.00            3285.00            3308              322.00
1954   16131.00           11305.00           42,447            900.90
1955   5247.00            3067.00            2718              49.20
1956   14377.00           10905.00           10,676            465.90
1957   8083.00            6032.00            4415              371.20
1958   4279.00            1441.00            3642              77.10
1959   4813.00            1817.00            4540              42.10
1960   10155.00           4975.00            6033              74.70
1961   8910.00            5356.00            5074              146.30
1962   9810.00            6318.00            4350              247.70
1963   14071.00           10479.00           10,441            1435.30
1964   14933.00           10038.00           4288              246.50
1965   5587.00            2813.00            1906              95.60
1966   2508.00            950.00             1901              26.80
1967   2599.00            1407.00            1095              10.80
1968   2670.00            1659.00            1159              63.00
1969   5443.00            3265.00            4667              164.60
1970   3129.00            1234.00            2444              25.20
1971   3989.00            1481.00            2323              30.20
1972   4083.00            1259.00            1910              22.80
1973   6235.00            2577.00            3413              72.30
1974   6431.00            2737.00            1849              120.00
1975   6817.00            3467.00            29,653            754.30
1976   4197.00            1329.00            1817              81.90
1977   9095.00            4989.00            3163              50.60
1978   2820.00            924.00             1796              28.00
1979   6775.00            2870.00            3446              48.80
1980   9146.00            5025.00            3705              138.30
1981   8625.00            3973.00            5832              155.10
1982   8361.00            4463.00            5323              341.50
1983   12162.00           5747.00            7238              218.90
1984   10632.00           5361.00            3941              112.10
1985   14197.00           8949.00            3578              142.00
1986   9155.00            5601.00            2761              150.90
1987   8686.00            4104.00            3749              92.10
1988   11949.00           6128.00            4094              91.00
1989   11328.00           5917.00            3270              100.10
1990   11804.00           5605.00            3589              96.60
1991   24596.00           14614.00           5113              497.90
1992   9423.30            4464.00            3012              98.95
1993   16387.30           8610.40            3499              148.91
1994   18858.90           11489.50           5340              349.37
1995   14366.70           8000.80            3852              245.58
1996   20388.10           11823.30           5840              547.70
1997   13134.80           6514.60            2799              101.06
1998   22291.80           13785.00           4150              685.03
1999   9605.20            5389.12            1896              160.50
2000   9045.01            5396.03            1942              112.61
2001   7137.78            4253.39            1605              63.49
2002   12384.21           7439.01            1819              146.23
2003   20365.70           12999.80           1551              245.42
2004   7781.90            4017.10            1282              93.31
2005   14967.48           8216.68            1660              153.29
2006   10521.86           5592.42            2276              105.82
2007   12548.92           5969.02            1230              102.97
2008   8867.82            4537.58            633               44.70
2009   8748.16            3795.79            538               55.59
Table 4
Flood disaster rating standard.

Disaster level   Damage area        Inundated area     Dead population   Collapsed houses   Recurrence interval   Grade number
                 (1000 hectares)    (1000 hectares)    (persons)         (10,000)           (years)
Small flood      0–9045             0–4989             0–3446            0–112.1            0–2                   1
Medium flood     9045–14,197        4989–8216.7        3446–5113         112.1–247.7        2–5                   2
Large flood      14,197–20,388      8216.7–13,000      5113–10,676       247.7–754.3        5–20                  3
Extreme flood    20,388–80,000      13,000–50,000      10,676–100,000    754.3–5000         >20                   4
Table 5
Scale preferences used in the pairwise comparison process.

Range      Category                  Score
Superior   Absolutely superior       9
           Very strongly superior    7
           Strongly superior         5
           Moderately superior       3
Equal      Equal                     1
Inferior   Absolutely inferior       1/9
           Very strongly inferior    1/7
           Strongly inferior         1/5
           Moderately inferior       1/3
Table 6
Pairwise comparison of the alternatives with respect to flood disasters.

                   Damage area   Inundated area   Dead population   Collapsed houses
Damage area        1             1/2              1/9               1/3
Inundated area     2             1                1/5               1/2
Dead population    9             5                1                 3
Collapsed houses   3             2                1/3               1
Table 7
Vector of weights of the alternatives with respect to flood disasters.

                   Flood impact
Damage area        0.0655
Inundated area     0.1189
Dead population    0.6043
Collapsed houses   0.2113
CR                 0.0030
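The weights in Table 7 can be approximately reproduced from the pairwise comparison matrix in Table 6. The following is a minimal sketch, not the authors' implementation: it assumes the weights are the normalized principal eigenvector of the matrix (estimated here by power iteration) and that the consistency ratio uses a random index of 0.90 for a 4 × 4 matrix. The names `ahp_weights` and `RANDOM_INDEX_N4` are hypothetical.

```python
# Pairwise comparison matrix from Table 6 (rows/columns: damage area,
# inundated area, dead population, collapsed houses).
A = [
    [1.0, 1 / 2, 1 / 9, 1 / 3],
    [2.0, 1.0, 1 / 5, 1 / 2],
    [9.0, 5.0, 1.0, 3.0],
    [3.0, 2.0, 1 / 3, 1.0],
]

# Assumption: random index commonly tabulated for n = 4; the paper does not
# state which value it used.
RANDOM_INDEX_N4 = 0.90


def ahp_weights(matrix, iterations=100):
    """Estimate the principal eigenvector and eigenvalue by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        aw = [sum(matrix[i][k] * w[k] for k in range(n)) for i in range(n)]
        lam = sum(aw)  # w sums to 1, so sum(A w) estimates lambda_max
        w = [v / lam for v in aw]
    return w, lam


weights, lambda_max = ahp_weights(A)
ci = (lambda_max - len(A)) / (len(A) - 1)  # consistency index
cr = ci / RANDOM_INDEX_N4                  # consistency ratio
print([round(v, 4) for v in weights], round(cr, 4))
```

Under these assumptions the printed weights should land close to the values in Table 7, and the consistency ratio should fall well below the usual 0.1 acceptance threshold.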
Table 8
The disaster degree values during the 60 years in China.

Sample   Degree value   Sample   Degree value
1        1.2975         31       1.7058
2        2.4769         32       2.2728
3        1.5831         33       2.4727
4        1.5564         34       2.6061
5        3.2447         35       1.777
6        1.2101         36       1.7948
7        3.2537         37       1.5432
8        2.131          38       1.628
9        1.4648         39       1.8295
10       1.7051         40       1.5528
11       2.2814         41       1.6377
12       2.1606         42       2.7769
13       2.0829         43       1.3983
14       3.449          44       1.7725
15       2.2234         45       2.7208
16       1.1692         46       2.0294
17       1.0678         47       2.8727
18       1.0227         48       1.4821
19       1.0524         49       2.3079
20       1.9909         50       1.4299
21       1.1141         51       1.2857
22       1.1142         52       1.1281
23       1.0727         53       1.4456
24       1.3453         54       1.3957
25       1.2092         55       1.1273
26       3.9117         56       1.4349
27       1.1177         57       1.3782
28       1.3402         58       1.2435
29       1.0624         59       1.0483
30       1.32           60       1.0439
Fig. 2. Comparisons of the risks by VFS-IDM model and the traditional statistical method.
\[
M = \begin{bmatrix}
0 & 10762 & 18324 & 80000 \\
0 & 6064 & 11406 & 50000 \\
0 & 4002 & 8822 & 100000 \\
0 & 157 & 585 & 5000
\end{bmatrix}
\]
Based on matrices I[a,b], I[c,d], and M, we judge whether the evaluating indicator x is located on the left or right side of point M. Accordingly, we select Eq. (1) or (2) to calculate the difference function μ_h(u_ij) of the indicators to the standards. Here h is the grade number, h = 1, 2, 3, 4; i is the indicator number, i = 1, 2, 3, 4; and j is the sample number, j = 1, 2, ..., 60.
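To get the synthetic RMD of each indicator, we use the variable fuzzy recognition model presented by Wu et al. (2006) integrated with the indicator weight w_i by AHP

\[
u'_h(x_j) = \left\{ 1 + \left[ \frac{\sum_{i=1}^{m} \left[ w_i \left( 1 - \mu(x_{ij})_h \right) \right]^{p}}{\sum_{i=1}^{m} \left[ w_i \, \mu(x_{ij})_h \right]^{p}} \right]^{\alpha / p} \right\}^{-1} \tag{18}
\]

\[
H = (1, 2, 3, 4) \cdot u_h(x_j) \tag{19}
\]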
where h is the degree number, h = 1, 2, 3, 4; x_j represents sample j; and x_ij is the ith indicator value of sample j. First, we may use the variable fuzzy recognition model in Eq. (18) to calculate the synthetic relative membership degree of Sample 1. With Eq. (18), we obtain the synthetic relative membership degree of each indicator for flood, u′_h(x_j). After normalizing them, we get the normalized synthetic relative membership degree of each indicator, u_h(x_j). w_i is the indicator weight; m is the number of indicators, m = 4; μ(x_ij)_h is the difference function of indicator i of sample j to degree h; α is a rule parameter of model optimization, where α = 1 is the least single method and α = 2 is the least square method; and p is the distance parameter, where p = 1 is the Hamming distance and p = 2 is the Euclidean distance.
When taking the rule parameter of model optimization α = 2 and the distance parameter p = 1, we obtain the disaster degree of sample 1 as H = 1.2975. In the same way, we can calculate the disaster degree values of all 60 samples as shown in Table 8.
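Eqs. (18) and (19) can be sketched in a few lines of code. The snippet below is illustrative only: the function names `synthetic_membership` and `disaster_degree` are hypothetical, and the membership values in the example are made up for demonstration rather than taken from the paper's data. Grades with a zero denominator in Eq. (18) are assigned a membership of 0, which is an assumption about how that edge case is handled.

```python
def synthetic_membership(weights, mu, alpha=2.0, p=1.0):
    """Normalized synthetic relative membership degrees u_h (Eq. 18).

    weights[i] is the AHP weight of indicator i; mu[i][h] is the relative
    membership of indicator i to grade h (0-based index for grades 1..4).
    """
    grades = len(mu[0])
    raw = []
    for h in range(grades):
        num = sum((w * (1.0 - row[h])) ** p for w, row in zip(weights, mu))
        den = sum((w * row[h]) ** p for w, row in zip(weights, mu))
        if den == 0.0:
            raw.append(0.0)  # assumption: no membership at all in this grade
        else:
            raw.append(1.0 / (1.0 + (num / den) ** (alpha / p)))
    total = sum(raw)
    return [r / total for r in raw]


def disaster_degree(u):
    """Disaster degree H = (1, 2, 3, 4) . u (Eq. 19)."""
    return sum((h + 1) * u_h for h, u_h in enumerate(u))


# Hypothetical memberships: every indicator sits halfway between grades 1 and 2.
weights = [0.0655, 0.1189, 0.6043, 0.2113]  # Table 7
mu = [[0.5, 0.5, 0.0, 0.0] for _ in weights]
u = synthetic_membership(weights, mu, alpha=2.0, p=1.0)
print(u, disaster_degree(u))
```

With α = 2 and p = 1, as chosen in the text, an indicator set split evenly between two adjacent grades yields a degree value midway between them, which matches the intuition behind Eq. (19).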
5.2. Flood risk evaluation based on information diffusion
Based on VFS, the disaster degree values of the 60 samples are calculated (Table 8), that is, the sample point set X = {x_1, x_2, ..., x_60}. The universe of discourse, namely, the monitoring point set, is taken as U = {u_1, u_2, ..., u_41} = {0, 0.1, 0.2, ..., 4.0}. The normalized information distribution of each x_i, that is, μ_xi(u_j), can be obtained according to Eqs. (9)–(12). Then, based on Eqs. (13)–(16), disaster probability risk estimation is calculated. The relationship between the recurrence interval N (years) and the probability p can be expressed as N = 1/p. The flood exceedance probability curve over the disaster degree value, compared with that obtained by the traditional statistical method, is shown in Fig. 2.

Table 9
Comparison of two methods.

Method       VFS-IDM model   Statistics
Mean error   0.0214          0.0544

Table 10
Flood disaster risk evaluation values.

Disaster level                Small flood   Medium flood   Large flood   Extreme flood
Exceedance probability risk   0.9656        0.5735         0.1565        0.0269
Recurrence interval (years)   1.0356        1.7436         6.3899        37.1629
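The diffusion step can be sketched as follows. This is a hedged illustration, not a reproduction of the paper's Eqs. (9)–(16): it assumes the normal diffusion function commonly used with this method, and the bandwidth rule in `default_bandwidth` (h = 2.6851 (b − a)/(n − 1) for n ≥ 11) is an assumption that should be checked against the paper's own definition. The function names are hypothetical, and only the first 12 degree values of Table 8 are used to keep the example short, so the output is not expected to match Table 10.

```python
import math


def default_bandwidth(samples):
    """Assumed rule of thumb for n >= 11: h = 2.6851 * (max - min) / (n - 1)."""
    n = len(samples)
    if n < 11:
        raise ValueError("this rule of thumb is only assumed for n >= 11")
    return 2.6851 * (max(samples) - min(samples)) / (n - 1)


def exceedance_probabilities(samples, points, h):
    """Diffuse each sample over the monitoring points, then accumulate tails."""
    q = [0.0] * len(points)
    for x in samples:
        # Normal diffusion; the 1 / (h * sqrt(2 * pi)) factor cancels below.
        f = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in points]
        c = sum(f)
        for j, fj in enumerate(f):
            q[j] += fj / c  # normalized information distribution of x
    total = sum(q)  # equals the number of samples
    p = [qj / total for qj in q]
    exceed, running = [], 0.0
    for pj in reversed(p):
        running += pj
        exceed.append(running)  # P(u >= u_j)
    exceed.reverse()
    return exceed


points = [k / 10 for k in range(41)]  # U = {0, 0.1, ..., 4.0}
samples = [1.2975, 2.4769, 1.5831, 1.5564, 3.2447, 1.2101,
           3.2537, 2.131, 1.4648, 1.7051, 2.2814, 2.1606]  # first 12 rows of Table 8
exceed = exceedance_probabilities(samples, points, default_bandwidth(samples))
recurrence = [1.0 / p if p > 0.0 else math.inf for p in exceed]  # N = 1 / p
print(round(exceed[25], 4), round(exceed[35], 4))  # at degree values 2.5 and 3.5
```

Because each sample's information is spread over neighbouring monitoring points, the resulting exceedance curve is smooth even for small samples, which is the property the comparison with frequency statistics relies on.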
6. Result and discussion
By the IDM method, we obtain the exceedance probabilities for the different disaster degree values shown in Fig. 2.
In Fig. 2, the results show that the flood risk estimated by the VFS-IDM model decreases smoothly as the degree value increases. The curve of the VFS-IDM model is smoother and more accurate than that of the traditional statistical method.
Thirty records are randomly chosen to form a small sample, and they are analyzed in the same way for comparison with the large sample. Their results are compared in Figs. 3 and 4.
Fig. 3. Comparison of the risks by VFS-IDM model with small sample and large sample.
Fig. 4. Comparison of the risks by traditional statistics with small sample and large sample.
Fig. 3 shows the difference between the two curves of the estimated risk with the small sample and the large sample by the VFS-IDM model. In Fig. 3, the two curves match well, which indicates that the result barely changes when the sample size changes, and that the method is stable and barely affected by the size of the sample. The analysis results for a very large sample can be used as the standard, so the VFS-IDM method is considered closer to the standard than the statistical method, as the experiments show.
In Fig. 4, we compare the two curves of the estimated risk with the small sample and the large sample by frequency statistics. The mean error between the large-sample and small-sample results by frequency statistics reaches 0.0544, which is much larger than the 0.0214 obtained by the VFS-IDM model (Table 9).
Figs. 3 and 4 indicate that the results of the small sample analyzed by the VFS-IDM model are satisfactory. The results reflect the fact that the risks of the floods decrease smoothly with the increase in degree value, and that the VFS-IDM model works better for practical problems. Compared with the results calculated by statistics, the information diffusion approach is much better because the result of the VFS-IDM model is closer to the standard. Table 9 presents a comparison of the mean errors between the results with the large sample and the small sample by the VFS-IDM model and by traditional statistics. The table also shows that the mean error given by the VFS-IDM model is much smaller than that given by the statistical method.
The results also illustrate the risk assessment values and the recurrence interval values on different disaster levels in China.
Due to the standard of four grades, we have (Chen, 2009) the following categories:
(a) If 1.0 ≤ H ≤ 1.5, then the flood degree is small (1st grade).
(b) If 1.5 < H ≤ 2.5, then it belongs to medium (2nd grade).
(c) If 2.5 < H ≤ 3.5, then it belongs to large (3rd grade).
(d) If 3.5 < H ≤ 4, then it belongs to extreme (4th grade).
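The four categories above translate directly into a small lookup. The sketch below is illustrative; the function name `flood_grade` is hypothetical, and the sample values are taken from Table 8.

```python
def flood_grade(h):
    """Map a disaster degree value H in [1, 4] to (grade number, label)."""
    if not 1.0 <= h <= 4.0:
        raise ValueError("H must lie in [1.0, 4.0]")
    if h <= 1.5:
        return 1, "small"
    if h <= 2.5:
        return 2, "medium"
    if h <= 3.5:
        return 3, "large"
    return 4, "extreme"


# Degree values of samples 1, 2, 5, and 26 from Table 8.
for h in (1.2975, 2.4769, 3.2447, 3.9117):
    print(h, flood_grade(h))
```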
The result in Fig. 3 illustrates the risk estimation, i.e., the exceedance probability of the disaster degree value. From this information, we know the risk estimation is 0.0269 when the disaster indicator is 3.5. In other words, floods exceeding a degree value of 3.5 (extreme floods) occur about once every 37.1629 years. Similarly, the probability of floods exceeding a degree value of 2.5 (large floods) is 0.1565, which means that floods exceeding that intensity occur about once every 6.3899 years. These findings indicate the serious situation of floods in China. The frequency and the recurrence interval of the floods of the four grades are shown in Table 10.
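As a quick check of the relation N = 1/p against the rounded probabilities quoted above, the arithmetic works out as follows. The small differences from the tabulated intervals are presumably due to the probabilities being rounded to four decimals before inversion; that explanation is an assumption, not something the paper states.

```latex
\[
N_{\text{extreme}} = \frac{1}{p(H \ge 3.5)} = \frac{1}{0.0269} \approx 37.17 \ \text{years},
\qquad
N_{\text{large}} = \frac{1}{p(H \ge 2.5)} = \frac{1}{0.1565} \approx 6.39 \ \text{years}.
\]
```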
7. Conclusion
Floods occur frequently in China and cause significant property losses and casualties. Flood risk assessment of an area is important for flood disaster managers so that they can implement a compensation and disaster-reduction plan. However, risk, as a natural or societal phenomenon, is neither precise nor certain. In the current paper, we analyze the concept of fuzzy risk with respect to environment and safety using a new model. We also analyze why we have to implement fuzzy risk estimation in many cases.
We put forward a fuzzy method of flood risk assessment based on VFS theory and the information diffusion technique to improve probability estimation. In the case calculation, the proposed estimate is better than the statistical estimate. The method has been tested and found to be reliable. The results are reasonable and stable. Moreover, the analysis has shown that the composite model has potential in identifying the risks of natural disasters in some areas. In view of the theoretical system of flood risk assessment developed thus far and the fact that observed series of disasters are quite short or even unavailable, the method based on IDM and VFS adopted in the paper is indisputably an effective and practical method. This is the first time that the model is applied to cases of flood disasters, and more work is needed to draw some final lessons from flood disasters.
Neither the classical models nor the information diffusion model governs the nature of the physical processes. They are introduced to compensate for the limitations in our understanding of the processes concerned.
As fuzzy risk analysis involves more imprecision, uncertainty, and partial truth in natural and societal phenomena, the works in fuzzy risks must promote the study of the foundations of fuzzy logic.
Further technological developments in flood control and many new effective methods of flood risk analysis can be used to improve prediction accuracy. We hope that by conducting such analysis, lessons can be learned so that the impact of natural disasters, such as the floods in China, can be prevented or mitigated in the future.
Acknowledgments
This work is supported by a Grant from the National Basic Research Program of China (Project No. 2007CB714107), a Grant from the Key Projects in the National Science and Technology Pillar Program (Project No. 2008BAB29B08), and a Grant from the Special Research Foundation for the Public Welfare Industry of the Ministry of Science and Technology and the Ministry of Water Resources (Project No. 201001080).
References
Breiman, L., Meisel, W., Purcell, E., 1977. Variable kernel estimates of multivariate densities. Technometrics 19, 135–144.
Brown, C.B., 1979. A fuzzy safety measure. Journal of the Engineering Mechanics Division 105, 855–872.
Carlin, B.P., Louis, T.A., 1997. Bayes and empirical Bayes methods for data analysis. Statistics and Computing 7, 153–154.
Chaitin, G.J., 1990. Information, Randomness & Incompleteness: Papers on Algorithmic Information Theory. World Scientific Pub. Co. Inc.
Chen, S., 1997. Relative membership function and new frame of fuzzy sets theory for pattern recognition. Journal of Fuzzy Mathematics 5, 401–412.
Chen, S., 2009. Theory and Model of Variable Fuzzy Sets and its Application, first ed. Dalian University of Technology Press, Dalian.
Chen, X.R., 1989. Non-Parametric Statistics. Shanghai Science and Technology Press, Shanghai.
Clements, D.P., 1977. Fuzzy ratings for computer security evaluation. PhD thesis, University of California, Berkeley.
Devroye, L., Gyorfi, L., 1985. Nonparametric Density Estimation. Wiley.
Dong, W., Shah, H., Wong, F., 1985. Fuzzy computations in risk and decision analysis. Civil Engineering Systems 2, 201–208.
Esogbue, A.O., Theologidu, M., Guo, K., 1992. On the application of fuzzy sets theory to the optimal flood control problem arising in water resources systems. Fuzzy Sets and Systems 48, 155–172.
Greening, L.A., Bernow, S., 2004. Design of coordinated energy and environmental policies: use of multi-criteria decision-making. Energy Policy 32, 721–735.
Guo, Y., Chen, S., 2006. Application of Variable Fuzzy Sets in Classified Prediction of Rockburst. ASCE.
Hadipriono Timothy, J., Fabian, C., 1991. A rule-based fuzzy logic deduction technique for damage assessment of protective structures. Fuzzy Sets and Systems 44, 459–468.
Hand, D.J., 1982. Kernel Discriminant Analysis. Research Studies Press, Chichester, UK.
Hobbs, B.F., Meier, P., 2000. Energy Decisions and the Environment: A Guide to the Use of Multicriteria Methods. Springer, Netherlands.
Hoffman, L.J., Michelman, E.H., Clements, D., 1978. SECURATE-security evaluation and analysis using fuzzy metrics. IEEE Computer Society, p. 531.
Huang, C., 1997. Principle of information diffusion. Fuzzy Sets and Systems 91, 69–90.
Huang, C., 2002. Information diffusion techniques and small-sample problem. International Journal of Information Technology and Decision Making 1, 229–250.
Huang, C., Moraga, C., 2005. Extracting fuzzy if–then rules by using the information matrix technique. Journal of Computer and System Sciences 70, 26–52.
Huang, C., Shi, Y., 2002. Towards Efficient Fuzzy Information Processing: Using the Principle of Information Diffusion. Physica Verlag.
Liang, Z., Yang, K., Sun, Y., Yuan, J., Zhang, H., Zhang, Z., 2006. Decision support for choice optimal power generation projects: fuzzy comprehensive evaluation model based on the electricity market. Energy Policy 34, 3359–3364.
Limmeechokchai, B., Chawana, S., 2007. Sustainable energy development strategies in the rural Thailand: the case of the improved cooking stove and the small biogas digester. Renewable and Sustainable Energy Reviews 11, 818–837.
Liu, Z.R., Huang, C.F., 1990. Information distribution method relevant in fuzzy information analysis. Fuzzy Sets and Systems 36, 67–76.
Palm, R., 2007. Multiple-step-ahead prediction in control systems with Gaussian process models and TS-fuzzy models. Engineering Applications of Artificial Intelligence 20, 1023–1035.
Parzen, E., 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065–1076.
Pohekar, S., Ramachandran, M., 2004. Application of multi-criteria decision making to sustainable energy planning – a review. Renewable and Sustainable Energy Reviews 8, 365–381.
Saaty, T., 1980. The Analytic Hierarchy Process. McGraw-Hill, New York.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC.
Wang, Y., Wang, D., Wu, J., 2011. A variable fuzzy set assessment model for water shortage risk: two case studies from China. Human and Ecological Risk Assessment 17, 631–645.
Wertz, W., 1978. Statistical Density Estimation: A Survey. Vandenhoeck & Ruprecht.
Wu, L., Guo, Y., Chen, S., Zhou, H., 2006. Use of variable fuzzy sets methods for desertification evaluation. Computational Intelligence, Theory and Applications, 721–731.
Zadeh, L.A., 1965. Fuzzy sets. Information and Control 8, 338–353.
Zhang, D., Wang, G., Zhou, H., 2011. Assessment on agricultural drought risk based on variable fuzzy sets model. Chinese Geographical Science 21, 167–175.