Research on flood risk analysis and evaluation method based on variable fuzzy sets and information diffusion

Safety Science 50 (2012) 1275–1283

journal homepage: www.elsevier.com/locate/ssci

Qiong Li a,b, Jianzhong Zhou a,⇑, Donghan Liu c, Xingwen Jiang a

a College of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
b School of Mathematics and Physics, Huangshi Institute of Technology, Huangshi 435003, China
c Huangshi Institute of Technology, Huangshi 435003, China

Article info

Article history:
Received 4 August 2011
Received in revised form 29 November 2011
Accepted 14 January 2012
Available online 16 February 2012

Keywords:
Variable fuzzy sets
Information diffusion
Analytical hierarchy process
Flood risk assessment

0925-7535/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ssci.2012.01.007

⇑ Corresponding author. Tel.: +86 27 87543127. E-mail address: [email protected] (J. Zhou)

Abstract

Floods have become increasingly alarming worldwide. Flood risk management in terms of assessing disaster risk properly is a great challenge that society faces today. Natural disaster risk analysis is typically beset with issues such as imprecision, uncertainty, and partial truth. There are two basic forms of uncertainty related to natural disaster risk assessment, namely, randomness caused by inherent stochastic variability and fuzziness due to macroscopic grade and incomplete knowledge samples. However, the traditional probability statistical method ignores the fuzziness of risk assessment with incomplete data sets and requires a large sample size of data. The fuzzy set methodology is introduced in the area of disaster risk assessment to improve probability estimation. The purpose of the current study is to establish a fuzzy model to evaluate flood risk with incomplete data sets. The present paper puts forward a composite method based on variable fuzzy sets and the information diffusion method for disaster risk assessment. The results indicate that the methodology is effective and practical; thus, it has the potential to forecast the flood risk in flood risk management. We hope that by conducting such risk analysis, the impact of flood disasters can be mitigated in the future.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Water, a special resource which sustains all forms of life, is an essential substance for the sustainable development of society. With the recent rapid development of the economy and the growth of population, flood, drought, water resource shortage, and water environment deterioration have become more acute. In particular, recent flooding disasters have shown the vulnerability of developed and developing countries to such events. In China, flood disasters occur frequently, and about two-thirds of its area faces the threat of different types and degrees of floods. These phenomena are the result of natural and unnatural factors, such as social and economic factors. Severe floods occur frequently, so flood risk management plays an important role in guiding the government in making timely and correct decisions for flood rescue and relief.

Risk assessment is the use of scientific data to define the probability of some harm coming to an individual or a population because of exposure to a substance or situation. However, assessing flood risk is difficult because of the lack of objective measures of acceptable risk, scarcity of data, and abundance of unknown probability distributions. In traditional flood risk assessment, the probability statistics method is usually used to estimate the exceedance probability of the hydrological variables. However, in practice, the feasibility and reliability of this approach are limited because it does not consider fuzzy uncertainty. For small samples, results based on classical statistical methods are sometimes very unreliable, because collecting long-sequence flood data is rather difficult and the sample is always small.

In general, the uncertainty of risk relates to both randomness and fuzziness. For example, the occurrence of a flood is a random event. However, the flood degree is a fuzzy concept. In the process of risk evaluation, randomness is due to a large number of unknown factors, whereas fuzziness is concerned with macroscopic grade and incomplete knowledge samples (Huang, 1997). Using fuzzy set theory (Zadeh, 1965), the data may be defined through vague, linguistic terms, such as low probability, serious impact, or high risk. These terms cannot be defined meaningfully with a precise single value, but fuzzy set theory provides the means by which these terms may be formally defined in mathematical logic. In practice, avoiding the gaps caused by the scarcity of data is impossible, which causes fuzziness (imprecision, vagueness, incompleteness, and so on). Therefore, we have to deal with the fuzziness of a risk system (Huang, 2002).

In many cases, a risk system is too complex to understand from a purely random viewpoint. A natural way to improve risk analysis is to introduce a concept of fuzzy risk to overcome difficulties from fuzzy environments with incomplete knowledge. More importantly, a concept of fuzzy risk provides flexibility in the way risk information is conveyed.

There are many ways to fill the gaps caused by scarcity of data. The most popular is employing expert experience in empirical Bayesian methods. Many fuzzy methods have been developed to smooth out the gaps to some degree. The objectives of the current study are to find a more effective way of expressing the vagueness and imprecision of natural disaster risk assessment and to find an acceptable way of communicating these imprecisions. Independent of those methods, we use a new technique to transport information carried by the data sets to the gaps. This method is called the information diffusion method (IDM), which helps extract as much useful underlying data as possible from the sample, thus improving the accuracy of system recognition (Huang and Shi, 2002; Palm, 2007). Information diffusion is a fuzzy mathematical set-value method for samples, which optimizes the use of fuzzy information of samples to offset the information deficiency.

Constructing a disaster control engineering system is a synthesis of multi-dimensional factors, so its risk evaluation should proceed from single factors to multiple factors. Routine evaluation methods often omit important information and cannot provide an integrated risk evaluation for engineering systems. Accordingly, taking a global view of the system and building on the characteristics of risk evaluation, the variable fuzzy set (VFS) is presented to evaluate the synthetic risk of a disaster control engineering system. The method can scientifically and reasonably determine membership degrees and relative membership functions of the objectives (or indicators) under study at level intervals relating to the disaster. Furthermore, VFS can fully utilize one's experience and knowledge and integrate the qualitative and quantitative information of the indicator system with the analytical hierarchy process (AHP) to obtain the weights of the objectives (or indicators) for the comprehensive evaluation of floods (Wang et al., 2011; Zhang et al., 2011).

In the current study, we establish a synthetic disaster risk assessment model based on information diffusion and VFS with a small number of measured samples. The model is applied successfully to flood risk analysis in China. In Section 2, we discuss incompleteness and analyze its relation to fuzziness to show why any incomplete knowledge set carries fuzzy information. In Section 3, we briefly describe some basic concepts and principles of the modeling framework. This is a new attempt at applying information diffusion theory and VFS in flood risk analysis. Computations based on this analytical flood risk model can yield a relatively accurate estimated flood damage value. An example is carried out, and the findings indicate that the model exhibits fairly stable analytical results, even when a small set of sample data is used. The results also indicate that the method is highly capable of extracting useful information, thereby improving system recognition accuracy. These results are shown in Section 4. Finally, discussions and conclusions are presented in Section 5.

2. Incompleteness and fuzziness

When we study a natural disaster risk system, we sometimes meet the small sample problem, where the data are too scanty to make a decision with any classical approach. In such cases, the size of the observed sample is insufficient. Avoiding this so-called incompleteness is difficult when we study a natural disaster risk system.

Chaitin proposed that incompleteness and fuzziness are subtly related (Chaitin, 1990) because simple mathematical questions do not always have clear answers and some questions can give answers that are completely random and look gray, rather than black or white. Therefore, when we study an incomplete sample, we cannot avoid its fuzziness. In the process of risk assessment, the randomness is due to a large number of unknown factors. Fuzziness is concerned with the fuzzy information associated with an incomplete data set with respect to scarcity. This kind of fuzzy information is called mass-body fuzzy information (Huang and Shi, 2002). The main task of processing the mass-body fuzzy information is to unearth (or mine) fuzzy information, which is buried in an incomplete data set. This process is called information diffusion, wherein information is defined as the data organized to reveal patterns. The earliest and the most widely used model is the linear information distribution (Liu and Huang, 1990), which divides an observation, carrying information with a measure value of 1, into two parts. Another model is the normal information diffusion (Huang, 1997), in which an observation is divided into many parts according to a normal function.

When we study a risk system using a probabilistic method, it is usually difficult to ascertain whether a hypothesized probability distribution is suitable, and sometimes we meet the problem of small samples, wherein the data are too scanty to make a decision. This problem has shown that empirical Bayesian methods (Carlin and Louis, 1997) and kernel methods (Breiman et al., 1977; Chen, 1989; Devroye and Gyorfi, 1985; Hand, 1982; Parzen, 1962; Silverman, 1986; Wertz, 1978) need further development. Obtaining a precise relation between events and probabilities of occurrence is difficult. Going a step further, if we employ other methods to simplify system analysis, obtaining the precise relations we need is also difficult. In other words, the relations we obtain are usually imprecise. To preserve this imprecision, the best way is to employ fuzzy sets to represent the relations.

Fuzzy set theory, which deals with uncertainties and allows the incorporation of the opinions of decision makers, may provide an appropriate tool for establishing disaster risk management systems, such as fuzzy rule-based techniques and the combination of the fuzzy approach with other techniques. Risk is expressed in terms of fuzzy risk only when we study it by a fuzzy method. Some early related applications can be found in the literature (Brown, 1979; Clements, 1977; Dong et al., 1985; Esogbue et al., 1992; Hadipriono Timothy and Fabian, 1991; Hoffman et al., 1978). Fuzzy risk is an engineering concept, which can be defined as an approximate representation to show risk with fuzzy theory and techniques. In general, a fuzzy risk is a fuzzy relation between loss events and the factors concerned.

The concept of the fuzzy set was proposed by Zadeh (1965), who gave media and fuzziness a scientific description, which was of great significance in the academic world. However, the fuzzy set is static if relativity and variability are not considered. Therefore, the theory is in conflict with the variability of the interim form. Some defects of traditional fuzzy sets are due to approaching the media, variable fuzzy phenomena, and variable fuzzy objects with the static concepts, theory, and methods of fuzzy sets.

In light of the foregoing, the theory and method of VFS was proposed by Chen based on opposite fuzzy sets and the definitions of a relative difference function (Guo and Chen, 2006; Wu et al., 2006). The method of Chen is the innovation and extension of the static fuzzy set theory established by Zadeh (1965), which is very important in theory and applications. The comprehensive evaluation of VFS can effectively eliminate the effect of the border on the assessment result and can monitor the error of the estimation standard.

In the present study, VFS was combined with information diffusion as an integration of techniques. The method proposed in the current research first uses fuzzy multiple indicators of comprehensive evaluation of VFS and converts the multi-dimensional indicators of the samples into one-dimensional degree values. Then, the method turns the degree values of the observed sample into fuzzy sets by the information diffusion method, finally obtaining the risk values. The method is then tested by a case showing that the method is superior to the traditional statistical model and improves the result of traditional estimation.

3. Basic concepts

3.1. Variable fuzzy sets

To define the concept, let us suppose that $U$ is a fuzzy concept, $A$ represents the characteristic of acceptability, and $A^c$ represents repellency. Hence, for any element $u$ ($u \in U$), $\mu_A(u)$ and $\mu_{A^c}(u)$ are relative membership degree (RMD) functions that express degrees of acceptability and repellency, respectively. We have $\mu_A(u) + \mu_{A^c}(u) = 1$, with $0 \le \mu_A(u) \le 1$ and $0 \le \mu_{A^c}(u) \le 1$.

Let $D(u) = \mu_A(u) - \mu_{A^c}(u)$, where $D(u)$ is defined as the relative difference degree of $u$ to $A$. The mapping $D: u \to D(u) \in [-1, 1]$ is defined as the relative difference function of $u$ to $A$. Therefore, we have $D(u) = 2\mu_A(u) - 1$ or $\mu_A(u) = [1 + D(u)]/2$.

Let

\[ V_0 = \{ (u, D) \mid u \in U,\ D(u) = \mu_A(u) - \mu_{A^c}(u),\ D \in [-1, 1] \} \]

\[ A_+ = \{ u \mid u \in U,\ 0 < D(u) < 1 \} \]

\[ A_- = \{ u \mid u \in U,\ -1 < D(u) < 0 \} \]

\[ A_0 = \{ u \mid u \in U,\ D(u) = 0 \} \]

where $V_0$ is defined as the VFS and $A_+$, $A_-$, and $A_0$ are defined as the attracting sets, repelling sets, and balance boundary of the VFS $V_0$, respectively (Guo and Chen, 2006; Wu et al., 2006).

3.2. Methods of relative difference function

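We suppose that $X_0 = [a, b]$ is the attracting set of VFS $V$ on the real axis, i.e., the interval where $\mu_A(u) > \mu_{A^c}(u)$, and that $X = [c, d]$ is a certain interval containing $X_0$, i.e., $X_0 \subset X$ (Fig. 1).

According to the definition of VFS, the intervals $[c, a]$ and $[b, d]$ are both repelling sets of the VFS, i.e., the intervals where $\mu_A(u) < \mu_{A^c}(u)$. Suppose $M$ is the point in the attracting set $[a, b]$ at which $D(u) = 1$, and $x$ is an arbitrary value in the interval $X$. If $x$ is located on the left side of $M$, its difference function is as follows (Guo and Chen, 2006; Wu et al., 2006):

\[
\begin{cases}
D(x) = \left( \dfrac{x - a}{M - a} \right)^{\beta}, & x \in (a, M) \\[2ex]
D(x) = -\left( \dfrac{x - a}{c - a} \right)^{\beta}, & x \in (c, a)
\end{cases}
\tag{1}
\]

or

\[
\begin{cases}
\mu(x) = 0.5 \left[ 1 + \left( \dfrac{x - a}{M - a} \right)^{\beta} \right], & x \in (a, M) \\[2ex]
\mu(x) = 0.5 \left[ 1 - \left( \dfrac{x - a}{c - a} \right)^{\beta} \right], & x \in (c, a)
\end{cases}
\tag{2}
\]

If $x$ is located on the right side of $M$, its difference function is

\[
\begin{cases}
D(x) = \left( \dfrac{x - b}{M - b} \right)^{\beta}, & x \in (M, b) \\[2ex]
D(x) = -\left( \dfrac{x - b}{d - b} \right)^{\beta}, & x \in (b, d)
\end{cases}
\tag{3}
\]

or

\[
\begin{cases}
\mu(x) = 0.5 \left[ 1 + \left( \dfrac{x - b}{M - b} \right)^{\beta} \right], & x \in (M, b) \\[2ex]
\mu(x) = 0.5 \left[ 1 - \left( \dfrac{x - b}{d - b} \right)^{\beta} \right], & x \in (b, d)
\end{cases}
\tag{4}
\]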

where $\beta$ is an exponent greater than 0, usually taken as $\beta = 1$, in which case Eqs. (1) and (3) become linear functions; Eqs. (2) and (4) are their equivalent forms in terms of $\mu$. They satisfy the following: (i) at $x = a$ and $x = b$, $D(u) = 0$ or $\mu_A(u) = \mu_{A^c}(u) = 0.5$; (ii) at $x = M$, $D(u) = 1$ or $\mu_A(u) = 1$; (iii) at $x = c$ and $x = d$, $D(u) = -1$ or $\mu_A(u) = 0$. Then, according to Eqs. (1) and (3) (or Eqs. (2) and (4)) and $\mu_A(u) = [1 + D(u)]/2$, we can obtain the relative membership degree $\mu_A(u)$ of the indicators under study.

Fig. 1. Relationship between points X, M and intervals [a,b], [c,d].
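The piecewise definitions in Eqs. (1)–(4) can be illustrated with a short Python sketch. The function name `relative_membership` is hypothetical. Two conventions in it are assumptions made for this illustration rather than details stated in the text: values outside [c, d] are treated as fully repelled (μ = 0), and x = M returns 1 even when M coincides with an interval endpoint.

```python
def relative_membership(x, a, b, c, d, M, beta=1.0):
    """Relative membership degree mu(x) following Eqs. (1)-(4).

    [a, b] is the attracting interval, [c, d] the enclosing interval,
    and M the point in [a, b] where D = 1.
    """
    if x == M:
        return 1.0                     # property (ii): D(M) = 1
    if x < c or x > d:
        return 0.0                     # assumption: outside [c, d] means D = -1
    if x < M:                          # left of M: Eq. (1)
        if x >= a:
            D = ((x - a) / (M - a)) ** beta
        else:
            D = -(((x - a) / (c - a)) ** beta)
    else:                              # right of M: Eq. (3)
        if x <= b:
            D = ((x - b) / (M - b)) ** beta
        else:
            D = -(((x - b) / (d - b)) ** beta)
    return 0.5 * (1.0 + D)             # mu = [1 + D] / 2


# Illustrative (made-up) intervals: [a, b] = [2, 6], [c, d] = [0, 10], M = 4.
for x in (0, 2, 3, 4, 6, 8, 10):
    print(x, relative_membership(x, a=2, b=6, c=0, d=10, M=4))
```

With β = 1 the sketch reproduces the three boundary properties listed above: μ = 0.5 at a and b, μ = 1 at M, and μ = 0 at c and d.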

3.3. Analytic hierarchy process to decide the indicator weights

Various factors affect disasters, and the degrees to which they affect them differ. All the factors are interrelated and interact with each other to form a complex system. Analyzing the combined effect of the factors and evaluating the disaster risk appropriately directly influence the final result. Therefore, determining the weight of the evaluation indicators is also a concern. We should compare each evaluation indicator according to its impact on humans and nature. In the current study, AHP is used to solve this problem.

The AHP developed by Saaty (1980) is based on the formulation of the decision problem in a hierarchical structure. We choose weights by comparing attributes two at a time, assessing the ratios of their importance. These ratios are used to compute the weights of individual attributes and to measure the consistency of the user's assessments (Hobbs and Meier, 2000). The method incorporates the researcher's subjective judgment, aided by expert opinion if need be, during the analysis by expressing the complex system in a hierarchical structure. Therefore, AHP ensures that the decision-making process is systematic, numerical, and computable. AHP is also a popular method in problem evaluation (Hobbs and Meier, 2000; Liang et al., 2006; Limmeechokchai and Chawana, 2007), recognized as a robust and flexible tool for dealing with complex decision-making problems (Liang et al., 2006). Its use has been largely explored in the literature (Greening and Bernow, 2004; Pohekar and Ramachandran, 2004).

AHP is especially suitable for problems in which multiple options and multiple criteria are taken into consideration, and then combined with VFS. Due to the vagueness and uncertainty existing in the sample data, conventional AHP seems to be insufficient and imprecise when applied to ambiguous problems. Therefore, the present study integrated AHP and the fuzzy method of VFS to develop a fuzzy analytical hierarchy process to decide the disaster synthetical degree value. The method can avoid the randomness due to the deviation of expert opinion, as well as the error due to the ambiguity and uncertainty in the fuzzy method. Therefore, the establishment of the AHP and VFS method in the current paper makes the disaster risk evaluation more practical and feasible.

3.4. VFS-AHP process to evaluate the synthetical degree value

Suppose the sample set is $\{x_1, x_2, \ldots, x_n\}$ and every sample has $m$ indicators. The sample indicator matrix is

\[
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{pmatrix} = (x_{ij})
\tag{5}
\]

where $x_{ij}$ is the $i$th indicator of sample $j$, $i = 1, 2, \ldots, m$, and $j = 1, 2, \ldots, n$.

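Each indicator can be evaluated by $c$ levels, so the indicator criteria interval matrices of each level are

\[
I_{ab} = \begin{pmatrix}
[a_{11}, b_{11}] & [a_{12}, b_{12}] & \cdots & [a_{1c}, b_{1c}] \\
[a_{21}, b_{21}] & [a_{22}, b_{22}] & \cdots & [a_{2c}, b_{2c}] \\
\vdots & \vdots & \ddots & \vdots \\
[a_{m1}, b_{m1}] & [a_{m2}, b_{m2}] & \cdots & [a_{mc}, b_{mc}]
\end{pmatrix} = ([a_{ih}, b_{ih}])
\]

where $i = 1, 2, \ldots, m$; $h = 1, 2, \ldots, c$. Level 1 is the superior level and level $c$ is the inferior level. For every $[a_{ih}, b_{ih}]$, we can determine its range interval $[c_{ih}, d_{ih}]$ according to the lower and upper limits of its adjacent intervals, and its point $M$, as follows:

\[
I_{cd} = \begin{pmatrix}
[c_{11}, d_{11}] & [c_{12}, d_{12}] & \cdots & [c_{1c}, d_{1c}] \\
[c_{21}, d_{21}] & [c_{22}, d_{22}] & \cdots & [c_{2c}, d_{2c}] \\
\vdots & \vdots & \ddots & \vdots \\
[c_{m1}, d_{m1}] & [c_{m2}, d_{m2}] & \cdots & [c_{mc}, d_{mc}]
\end{pmatrix} = ([c_{ih}, d_{ih}])
\]

\[
M = \begin{pmatrix}
M_{11} & M_{12} & \cdots & M_{1c} \\
M_{21} & M_{22} & \cdots & M_{2c} \\
\vdots & \vdots & \ddots & \vdots \\
M_{m1} & M_{m2} & \cdots & M_{mc}
\end{pmatrix} = (M_{ih})
\]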

Based on the matrices $I_{ab}$, $I_{cd}$, and $M$, we judge whether the evaluating indicator $x$ is located on the left or right side of point $M$. Accordingly, we select Eq. (2) or (4) to calculate the relative membership degree $\mu_h(u_{ij})$ of the indicators to the standards, where $h$ is the grade number, $i$ is the indicator number, and $j$ is the sample number.

Thus, we obtain the relative membership degree matrix of the indicator values of the sample to each level according to Eqs. (2) and (4) as follows:

\[ {}_{j}U = (\mu(x_{ij})_h) \tag{6} \]

According to AHP, the two-level hierarchy is constructed to obtain the weights of the evaluation indicators. We obtain the normalized weights of the evaluation indicators as $w$.

To obtain the synthetic degree value of each sample, we use the variable fuzzy recognition model presented by Wu et al. (2006) as follows:

\[
u'_h(x_j) = \left\{ 1 + \left[ \frac{\sum_{i=1}^{m} \left[ w_i \left( 1 - \mu(x_{ij})_h \right) \right]^{p}}{\sum_{i=1}^{m} \left[ w_i \, \mu(x_{ij})_h \right]^{p}} \right]^{\alpha / p} \right\}^{-1}
\tag{7}
\]

\[
H = (1, 2, 3, 4) \cdot u_h(x_j)
\tag{8}
\]

where $h$ is the degree number, $h = 1, 2, 3, 4$, $x_j$ represents sample $j$, and $x_{ij}$ is the $i$th indicator value of sample $j$. Thus, $H$ is the synthetic degree value vector of every sample.
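Eqs. (7) and (8) can be sketched in Python as follows. The function name `synthetic_degree` is hypothetical, and several points are assumptions made for illustration because the text does not state them: the defaults α = 1 and p = 1, the normalization of $u'_h$ across levels to obtain $u_h$ before Eq. (8), and the handling of a zero denominator as the limiting value $u'_h = 0$.

```python
def synthetic_degree(mu, w, alpha=1.0, p=1.0):
    """Synthetic degree value H of one sample, following Eqs. (7)-(8).

    mu[i][h] is the relative membership of indicator i to level h + 1,
    and w[i] is the weight of indicator i.
    """
    levels = len(mu[0])
    u_raw = []
    for h in range(levels):
        num = sum((wi * (1.0 - row[h])) ** p for wi, row in zip(w, mu))
        den = sum((wi * row[h]) ** p for wi, row in zip(w, mu))
        if den == 0.0:
            u_raw.append(0.0)          # limit of Eq. (7) when no membership at all
        else:
            u_raw.append(1.0 / (1.0 + (num / den) ** (alpha / p)))
    total = sum(u_raw)
    if total == 0.0:
        raise ValueError("sample has zero membership to every level")
    u = [v / total for v in u_raw]     # assumption: normalize u'_h to get u_h
    return sum((h + 1) * v for h, v in enumerate(u))   # Eq. (8)


# Two indicators split evenly between levels 1 and 2 (made-up memberships).
mu_example = [[0.5, 0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0]]
print(synthetic_degree(mu_example, w=[0.5, 0.5]))
```

In this made-up case the sample sits halfway between levels 1 and 2, so the sketch returns a degree value of 1.5.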

3.5. Information diffusion

Information diffusion is a fuzzy mathematical set-value method for the samples, optimizing the use of fuzzy information of the samples to offset the information deficiency. The method can turn an observed sample into a fuzzy set, that is, turn a single point sample into a set-value sample. The simplest model of information diffusion is the normal diffusion model.

Information diffusion: Let $X$ be a set of samples and $V$ be a subset of the universe. A mapping $\mu: X \times V \to [0, 1]$, $(x, v) \mapsto \mu(x, v)$, is called an information diffusion of $X$ on $V$ if it satisfies the following three conditions (Huang and Shi, 2002):

(1) $\mu$ is decreasing: $\forall x \in X$, $\forall v', v'' \in V$, if $\|v' - x\| \le \|v'' - x\|$, then $\mu(x, v') \ge \mu(x, v'')$. $\mu$ is the diffusion function.

(2) $\forall x \in X$, let $v^*$ be the observed value of $x$; then $\mu(x, v^*) = \max_{v \in V} \mu(x, v)$.

(3) $\mu(x, v)$ is conservative, i.e., $\forall x \in X$, its integral over the universe is 1: $\int_U \mu(x, u)\,du = 1$.

In particular, if the domain of the random variable is discrete, say $U = \{u_1, u_2, \ldots, u_m\}$, the conservation condition is $\sum_{j=1}^{m} \mu(x, u_j) = 1$ ($\forall x \in X$).

Let $X = \{x_1, x_2, \ldots, x_n\}$ be a sample, and $U = \{u_1, u_2, \ldots, u_m\}$ be the discrete universe of $X$. $x_i$ and $u_j$ are called the sample point and the monitoring point, respectively. For every $x_i \in X$ and $u_j \in U$, we diffuse the information carried by $x_i$ to $u_j$ at gain $f_i(u_j)$ using the normal information diffusion shown in the following equation:

\[
f_i(u_j) = \exp\left[ -\frac{(x_i - u_j)^2}{2h^2} \right], \quad u_j \in U
\tag{9}
\]

where $h$ is the normal diffusion coefficient calculated by Eq. (10) (Huang and Moraga, 2005; Huang and Shi, 2002).

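\[
h = \begin{cases}
0.8146\,(b - a), & n = 5; \\
0.5690\,(b - a), & n = 6; \\
0.4560\,(b - a), & n = 7; \\
0.3860\,(b - a), & n = 8; \\
0.3362\,(b - a), & n = 9; \\
0.2986\,(b - a), & n = 10; \\
0.6851\,(b - a)/(n - 1), & n \ge 11
\end{cases}
\tag{10}
\]

where $b = \max_{1 \le i \le n}\{x_i\}$ and $a = \min_{1 \le i \le n}\{x_i\}$.

Let

\[
C_i = \sum_{j=1}^{m} f_i(u_j)
\tag{11}
\]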

We obtain a normalized information distribution on $U$ determined by $x_i$, as shown in the following equation:

\[
\mu_{x_i}(u_j) = \frac{f_i(u_j)}{C_i}
\tag{12}
\]

For each monitoring point $u_j$, by adding all normalized information, we obtain the information gain at $u_j$, which comes from the given sample $X$. The information gain is shown in the following equation:

\[
q(u_j) = \sum_{i=1}^{n} \mu_{x_i}(u_j)
\tag{13}
\]

The value $q(u_j)$ means that, under the information diffusion technique, there are $q(u_j)$ sample points, in the sense of statistical averaging, at the monitoring point $u_j$. $q(u_j)$ is generally not an integer, but it is never less than zero. Let

\[
Q = \sum_{j=1}^{m} q(u_j),
\tag{14}
\]

where $Q$ is the sum of the sample sizes $q(u_j)$ over all monitoring points. Theoretically, $Q = n$, but due to numerical calculation error, there is a slight difference between $Q$ and $n$. Therefore, we can use Eq. (15) to estimate the frequency value of a sample falling at $u_j$.

\[
p(u_j) = \frac{q(u_j)}{Q}
\tag{15}
\]

The frequency value can be taken as the estimate of the probability. The probability of exceeding $u_j$ is then

\[
P(u_j) = \sum_{k=j}^{m} p(u_k),
\tag{16}
\]

where $P(u_j)$ is the required risk estimation value.
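The chain from Eq. (9) to Eq. (16) can be sketched in a few lines of Python. The function name `exceedance_probabilities` is hypothetical, and the diffusion coefficient `h` is passed in as an argument (it would be computed from Eq. (10) beforehand). The sketch also assumes that the monitoring points cover the range of the samples, so that each normalizing constant $C_i$ is nonzero.

```python
import math


def exceedance_probabilities(samples, points, h):
    """Normal information diffusion, Eqs. (9) and (11)-(16).

    samples: observed values x_i; points: monitoring points u_j in
    increasing order; h: diffusion coefficient from Eq. (10).
    Returns (p, P): the frequency estimates p(u_j) and the exceedance
    probabilities P(u_j).
    """
    q = [0.0] * len(points)
    for x in samples:
        f = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in points]  # Eq. (9)
        c = sum(f)                                                       # Eq. (11)
        for j, fj in enumerate(f):
            q[j] += fj / c                                               # Eqs. (12)-(13)
    big_q = sum(q)                                                       # Eq. (14)
    p = [qj / big_q for qj in q]                                         # Eq. (15)
    exceed = []
    running = 0.0
    for pj in reversed(p):                                               # Eq. (16)
        running += pj
        exceed.append(running)
    exceed.reverse()
    return p, exceed


# Made-up sample and monitoring points, for illustration only.
p, P = exceedance_probabilities([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], h=0.5)
print(p)
print(P)
```

Because each sample's diffused information is normalized to 1, the frequencies p sum to 1 and the exceedance probability at the smallest monitoring point is 1.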

4. Evaluation

4.1. Methods compared

To evaluate our method, we compared it with other methods using some simulation experiments.

Table 1. Average divergence q and the relative error e for N(0,1).

| n | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|---|---|---|---|---|
| q | 0.059428 | 0.047242 | 0.044552 | 0.044344 | 0.043196 | 0.041732 | 0.040316 |
| q′ | 0.063654 | 0.057944 | 0.057042 | 0.054808 | 0.053956 | 0.047588 | 0.045344 |
| e | 0.0664 | 0.1847 | 0.2190 | 0.1909 | 0.1994 | 0.1231 | 0.1109 |

Table 2. Average divergence q and the relative error e for E(15).

| n | 10 | 12 | 14 | 16 | 18 | 20 | 22 |
|---|---|---|---|---|---|---|---|
| q | 0.087628 | 0.088108 | 0.085376 | 0.082214 | 0.082084 | 0.072396 | 0.070786 |
| q′ | 0.122476 | 0.109402 | 0.102194 | 0.091622 | 0.088534 | 0.083526 | 0.079282 |
| e | 0.2845 | 0.1946 | 0.1646 | 0.1027 | 0.0729 | 0.1333 | 0.1072 |


An experiment is conducted using the normal distribution N(0,1). We draw 10 random numbers from the standard normal distribution N(0,1) and estimate the distribution using Eqs. (10)–(15). The average divergence is obtained as q = 0.059428 after 50 simulation experiments. Then, we let n = 12, ..., 22 and simulate 50 experiments for each n. We then obtain Table 1, which shows the average divergence q of the normal information diffusion estimate compared with the average divergence q′ of the histogram estimate (HE). The relative error of HE and IDM is calculated as $e = (q' - q)/q'$. The results show that IDM is better than HE. Roughly speaking, for a small sample, the method of normal information diffusion can improve a histogram estimator to reduce the mean error by about 15.63%.
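The relative errors in Table 1 and the quoted mean improvement follow directly from the tabulated divergences. A minimal Python check, using only the values printed in Table 1 (the variable names are illustrative):

```python
# Average divergences for N(0,1) from Table 1 (n = 10, 12, ..., 22).
q_idm = [0.059428, 0.047242, 0.044552, 0.044344, 0.043196, 0.041732, 0.040316]
q_he = [0.063654, 0.057944, 0.057042, 0.054808, 0.053956, 0.047588, 0.045344]

# Relative error e = (q' - q) / q' for each sample size.
rel_err = [(he - idm) / he for idm, he in zip(q_idm, q_he)]
mean_rel_err = sum(rel_err) / len(rel_err)

print([round(e, 4) for e in rel_err])
print(round(mean_rel_err, 4))
```

The mean of the seven relative errors comes out at roughly 0.156, which is the source of the "about 15.63%" figure.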

For another experiment with the exponential distribution, we obtain Table 2, which shows the average divergences q and q′ of IDM and HE, respectively. Table 2 shows that when n is small, IDM is better than HE with respect to the exponential distribution.

When n is small, IDM is superior with respect to almost any distribution. However, for a given sample whose size is large and which is drawn from an exponential or lognormal distribution, the new method is not the best. Therefore, no diffusion function can express all diffusion phenomena.

4.2. Limitation of IDM

A study of the above simulation experiments reveals that the superiority of IDM is dependent on whether we are blind to the population and whether the size of a given sample is small. In the experiments, the given sample is considered fuzzy, so some benefits can be obtained by IDM. The work efficiency of IDM is about 35% higher than that of HE. That is, if no knowledge is available about the population from which the given sample is drawn, and if the sample size is small, we have to obtain more observations, adding about 35%, to guarantee that the estimation is as good as the one given by the fuzzy method.

However, if we have a lot of knowledge about the population to confirm an assumption, the statistical object with respect to a given sample is clearer. If the size of a given sample is large, there is an abundance of statistical information in the sample. In this case, it is unnecessary to replace the statistics with IDM as little benefit can be obtained using it.

5. Application of the method to flood risk assessment

The assessment of flood risk comprises the following steps:

1. Decide the flood indicator weights using AHP.
2. Convert the multi-dimensional indicators of the samples into the one-dimensional degree value using VFS-AHP.
3. Turn the degree value of the sample into a fuzzy set and then get the required risk estimation value by IDM.
4. Calculate the recurrence interval according to the risk estimation value.
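Step 4 can be sketched as follows, assuming the conventional relation that the recurrence interval in years is the reciprocal of the annual exceedance probability; the text does not state the formula, so this relation is an assumption. The function names and the treatment of the boundaries between levels are also illustrative; the level limits are those of Table 4.

```python
def recurrence_interval(exceedance_probability):
    """Recurrence interval in years, assuming T = 1 / P (annual probability)."""
    if not 0.0 < exceedance_probability <= 1.0:
        raise ValueError("exceedance probability must lie in (0, 1]")
    return 1.0 / exceedance_probability


def flood_level(interval_years):
    """Map a recurrence interval to the flood levels of Table 4.

    Assumption: each boundary value belongs to the lower level.
    """
    if interval_years <= 2:
        return "small"
    if interval_years <= 5:
        return "medium"
    if interval_years <= 20:
        return "large"
    return "extreme"


for prob in (0.5, 0.25, 0.1, 0.04):
    years = recurrence_interval(prob)
    print(prob, years, flood_level(years))
```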

According to the above theory, we can calculate the flood risk estimation of various degrees in China based on the historical data from 1950 to 2009 collected by the Ministry of Water Resources of the People's Republic of China (Table 3). We select the set of 60 records as the sample, and then 30 records are randomly chosen to form a small sample to test the stability of the results obtained using the method. Damage area, inundated area, dead population, and collapsed houses have been chosen as the disaster indicators in flood risk analysis. By frequency analysis, the floods are classified into four levels: small, medium, large, and extreme (Table 4).

5.1. VFS-AHP for comprehensive evaluation of the flood degree

The two-level hierarchy is constructed to obtain the weights of the evaluation indicators. The goal is to ascertain "the weights of the evaluation indicators." The evaluation indicators (attributes) are damage area, inundated area, dead population, and collapsed houses.

The pairwise comparison is conducted using a scale based on the proposal of Saaty (1980) detailed in Table 5. To illustrate the kind of results obtained, Table 6 presents a pairwise comparison matrix drawn from the information provided by the expert for the evaluation of the importance of the factors. Then, the consistency of the comparison matrix is tested and the relative weights of the elements are computed along with the consistency ratio (CR), as presented in Table 7. If the CR is below 10%, the judgments are considered consistent.
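The weight and consistency calculation can be sketched in Python using the pairwise comparison matrix of Table 6. The text does not say how the priority vector is computed, so the sketch assumes the common approximation of averaging the column-normalized matrix, together with Saaty's random index values (0.90 for a 4 × 4 matrix). The function name `ahp_weights` is hypothetical.

```python
def ahp_weights(matrix):
    """Approximate AHP priority vector and consistency ratio (CR).

    Assumption: weights are the row means of the column-normalized
    matrix, and lambda_max is estimated from (A w)_i / w_i.
    """
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    weights = [sum(row[j] / col_sums[j] for j in range(n)) / n for row in matrix]
    aw = [sum(row[j] * weights[j] for j in range(n)) for row in matrix]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)            # consistency index
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]        # Saaty's random index
    return weights, ci / ri


# Pairwise comparisons from Table 6, in the order: damage area,
# inundated area, dead population, collapsed houses.
table6 = [
    [1, 1 / 2, 1 / 9, 1 / 3],
    [2, 1, 1 / 5, 1 / 2],
    [9, 5, 1, 3],
    [3, 2, 1 / 3, 1],
]
weights, cr = ahp_weights(table6)
print([round(w, 4) for w in weights], round(cr, 4))
```

Under these assumptions the result is close to the weights and CR listed in Table 7, and the CR is well below the 10% threshold.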

According to AHP, we obtain the normalized weights of the evaluation indicators as

\[
W = [0.0625 \quad 0.1875 \quad 0.4375 \quad 0.3125] = (w_i)
\tag{17}
\]

According to Table 4 and Chen (1997), we set up the matrices of the parameters for calculating the difference function of VFS.

\[
I_{[a,b]} = \begin{bmatrix}
[0, 9045] & [9045, 14197] & [14197, 20388] & [20388, 80000] \\
[0, 4989] & [4989, 8216.7] & [8216.7, 13000] & [13000, 50000] \\
[0, 3446] & [3446, 5113] & [5113, 10676] & [10676, 100000] \\
[0, 112.1] & [112.1, 247.7] & [247.7, 754.3] & [754.3, 5000]
\end{bmatrix}
\]

\[
I_{[c,d]} = \begin{bmatrix}
[0, 14197] & [0, 20388] & [9045, 80000] & [14197, 80000] \\
[0, 8216.7] & [0, 13000] & [4989, 50000] & [8216.7, 50000] \\
[0, 5113] & [0, 10676] & [3446, 100000] & [5113, 100000] \\
[0, 247.7] & [0, 754.3] & [112.1, 5000] & [247.7, 5000]
\end{bmatrix}
\]

Table 3. Values of flood indicators during 60 years.

| Year | Disaster area (1000 hectares) | Inundated area (1000 hectares) | Dead population (persons) | Collapsed houses (10,000) |
|---|---|---|---|---|
| 1950 | 6559.00 | 4710.00 | 1982 | 130.50 |
| 1951 | 4173.00 | 1476.00 | 7819 | 31.80 |
| 1952 | 2794.00 | 1547.00 | 4162 | 14.50 |
| 1953 | 7187.00 | 3285.00 | 3308 | 322.00 |
| 1954 | 16131.00 | 11305.00 | 42,447 | 900.90 |
| 1955 | 5247.00 | 3067.00 | 2718 | 49.20 |
| 1956 | 14377.00 | 10905.00 | 10,676 | 465.90 |
| 1957 | 8083.00 | 6032.00 | 4415 | 371.20 |
| 1958 | 4279.00 | 1441.00 | 3642 | 77.10 |
| 1959 | 4813.00 | 1817.00 | 4540 | 42.10 |
| 1960 | 10155.00 | 4975.00 | 6033 | 74.70 |
| 1961 | 8910.00 | 5356.00 | 5074 | 146.30 |
| 1962 | 9810.00 | 6318.00 | 4350 | 247.70 |
| 1963 | 14071.00 | 10479.00 | 10,441 | 1435.30 |
| 1964 | 14933.00 | 10038.00 | 4288 | 246.50 |
| 1965 | 5587.00 | 2813.00 | 1906 | 95.60 |
| 1966 | 2508.00 | 950.00 | 1901 | 26.80 |
| 1967 | 2599.00 | 1407.00 | 1095 | 10.80 |
| 1968 | 2670.00 | 1659.00 | 1159 | 63.00 |
| 1969 | 5443.00 | 3265.00 | 4667 | 164.60 |
| 1970 | 3129.00 | 1234.00 | 2444 | 25.20 |
| 1971 | 3989.00 | 1481.00 | 2323 | 30.20 |
| 1972 | 4083.00 | 1259.00 | 1910 | 22.80 |
| 1973 | 6235.00 | 2577.00 | 3413 | 72.30 |
| 1974 | 6431.00 | 2737.00 | 1849 | 120.00 |
| 1975 | 6817.00 | 3467.00 | 29,653 | 754.30 |
| 1976 | 4197.00 | 1329.00 | 1817 | 81.90 |
| 1977 | 9095.00 | 4989.00 | 3163 | 50.60 |
| 1978 | 2820.00 | 924.00 | 1796 | 28.00 |
| 1979 | 6775.00 | 2870.00 | 3446 | 48.80 |
| 1980 | 9146.00 | 5025.00 | 3705 | 138.30 |
| 1981 | 8625.00 | 3973.00 | 5832 | 155.10 |
| 1982 | 8361.00 | 4463.00 | 5323 | 341.50 |
| 1983 | 12162.00 | 5747.00 | 7238 | 218.90 |
| 1984 | 10632.00 | 5361.00 | 3941 | 112.10 |
| 1985 | 14197.00 | 8949.00 | 3578 | 142.00 |
| 1986 | 9155.00 | 5601.00 | 2761 | 150.90 |
| 1987 | 8686.00 | 4104.00 | 3749 | 92.10 |
| 1988 | 11949.00 | 6128.00 | 4094 | 91.00 |
| 1989 | 11328.00 | 5917.00 | 3270 | 100.10 |
| 1990 | 11804.00 | 5605.00 | 3589 | 96.60 |
| 1991 | 24596.00 | 14614.00 | 5113 | 497.90 |
| 1992 | 9423.30 | 4464.00 | 3012 | 98.95 |
| 1993 | 16387.30 | 8610.40 | 3499 | 148.91 |
| 1994 | 18858.90 | 11489.50 | 5340 | 349.37 |
| 1995 | 14366.70 | 8000.80 | 3852 | 245.58 |
| 1996 | 20388.10 | 11823.30 | 5840 | 547.70 |
| 1997 | 13134.80 | 6514.60 | 2799 | 101.06 |
| 1998 | 22291.80 | 13785.00 | 4150 | 685.03 |
| 1999 | 9605.20 | 5389.12 | 1896 | 160.50 |
| 2000 | 9045.01 | 5396.03 | 1942 | 112.61 |
| 2001 | 7137.78 | 4253.39 | 1605 | 63.49 |
| 2002 | 12384.21 | 7439.01 | 1819 | 146.23 |
| 2003 | 20365.70 | 12999.80 | 1551 | 245.42 |
| 2004 | 7781.90 | 4017.10 | 1282 | 93.31 |
| 2005 | 14967.48 | 8216.68 | 1660 | 153.29 |
| 2006 | 10521.86 | 5592.42 | 2276 | 105.82 |
| 2007 | 12548.92 | 5969.02 | 1230 | 102.97 |
| 2008 | 8867.82 | 4537.58 | 633 | 44.70 |
| 2009 | 8748.16 | 3795.79 | 538 | 55.59 |

Table 4. Flood disaster rating standard.

| Disaster level | Damage area (1000 hectares) | Inundated area (1000 hectares) | Dead population (persons) | Collapsed houses (10,000) | Recurrence interval (years) | Grade number |
|---|---|---|---|---|---|---|
| Small flood | 0–9045 | 0–4989 | 0–3446 | 0–112.1 | 0–2 | 1 |
| Medium flood | 9045–14,197 | 4989–8216.7 | 3446–5113 | 112.1–247.7 | 2–5 | 2 |
| Large flood | 14,197–20,388 | 8216.7–13,000 | 5113–10,676 | 247.7–754.3 | 5–20 | 3 |
| Extreme flood | 20,388–80,000 | 13,000–50,000 | 10,676–100,000 | 754.3–5000 | >20 | 4 |


Table 5
Scale preferences used in the pairwise comparison process.

Range     Category                Score

Superior  Absolutely superior     9
          Very strongly superior  7
          Strongly superior       5
          Moderately superior     3
Equal     Equal                   1
Inferior  Absolutely inferior     1/9
          Very strongly inferior  1/7
          Strongly inferior       1/5
          Moderately inferior     1/3

Table 6
Pairwise comparison of the alternatives with respect to flood disasters.

                  Damage area  Inundated area  Dead population  Collapsed houses

Damage area       1            1/2             1/9              1/3
Inundated area    2            1               1/5              1/2
Dead population   9            5               1                3
Collapsed houses  3            2               1/3              1

Table 7
Vector of weights of the alternatives with respect to flood disasters.

                  Flood impact

Damage area       0.0655
Inundated area    0.1189
Dead population   0.6043
Collapsed houses  0.2113
CR                0.0030
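The weights in Table 7 can be checked against the pairwise comparison matrix in Table 6. The following is a minimal sketch, assuming the weights are the normalized principal eigenvector of that matrix and that the consistency ratio uses a random index of 0.90 for a 4 x 4 matrix; the function names and the power-iteration approach are illustrative choices, not taken from the paper.

```python
# Pairwise comparison matrix from Table 6 (rows and columns in the order:
# damage area, inundated area, dead population, collapsed houses).
A = [
    [1.0, 1 / 2, 1 / 9, 1 / 3],
    [2.0, 1.0, 1 / 5, 1 / 2],
    [9.0, 5.0, 1.0, 3.0],
    [3.0, 2.0, 1 / 3, 1.0],
]

# Assumption: random index commonly tabulated for n = 4.
RANDOM_INDEX_N4 = 0.90


def ahp_weights(matrix, iterations=200):
    """Approximate the principal eigenvector and eigenvalue by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][k] * w[k] for k in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # keep the weights summing to 1
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][k] * w[k] for k in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(lambda_max, n, random_index):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    ci = (lambda_max - n) / (n - 1)
    return ci / random_index


weights, lambda_max = ahp_weights(A)
cr = consistency_ratio(lambda_max, len(A), RANDOM_INDEX_N4)
print([round(w, 4) for w in weights], round(cr, 4))
```

A CR well below 0.1 is the usual sign that the pairwise judgments are acceptably consistent.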

Table 8
The disaster degree values during the 60 years in China.

Sample  Degree value  Sample  Degree value

1       1.2975        31      1.7058
2       2.4769        32      2.2728
3       1.5831        33      2.4727
4       1.5564        34      2.6061
5       3.2447        35      1.777
6       1.2101        36      1.7948
7       3.2537        37      1.5432
8       2.131         38      1.628
9       1.4648        39      1.8295
10      1.7051        40      1.5528
11      2.2814        41      1.6377
12      2.1606        42      2.7769
13      2.0829        43      1.3983
14      3.449         44      1.7725
15      2.2234        45      2.7208
16      1.1692        46      2.0294
17      1.0678        47      2.8727
18      1.0227        48      1.4821
19      1.0524        49      2.3079
20      1.9909        50      1.4299
21      1.1141        51      1.2857
22      1.1142        52      1.1281
23      1.0727        53      1.4456
24      1.3453        54      1.3957
25      1.2092        55      1.1273
26      3.9117        56      1.4349
27      1.1177        57      1.3782
28      1.3402        58      1.2435
29      1.0624        59      1.0483
30      1.32          60      1.0439

Fig. 2. Comparisons of the risks by VFS-IDM model and the traditional statistical method.


\[
M = \begin{bmatrix}
0 & 10762 & 18324 & 80000 \\
0 & 6064 & 11406 & 50000 \\
0 & 4002 & 8822 & 100000 \\
0 & 157 & 585 & 5000
\end{bmatrix}
\]

Based on matrices I[a,b], I[c,d], and M, we judge whether the evaluating indicator x is located on the left or the right side of point M. Accordingly, we select Eq. (1) or Eq. (2) to calculate the difference function μ_h(u_ij) of the indicators with respect to the standards. Here h is the grade number, h = 1, 2, 3, 4; i is the indicator number, i = 1, 2, 3, 4; and j is the sample number, j = 1, 2, ..., 60.
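The left/right selection described above can be sketched in code. This is a minimal sketch, assuming the relative difference function takes the piecewise form commonly given in the variable fuzzy sets literature (Chen, 2009), with the relative membership degree equal to (1 + D)/2 and exponent β = 1. The attraction interval [a, b] = [9045, 14197] is the medium-flood range of damage area in Table 4 and M = 10762 is the corresponding entry of matrix M; the range interval [c, d] = [0, 20388] is a hypothetical choice made only for this illustration.

```python
def relative_difference(x, a, b, c, d, m, beta=1.0):
    """Relative difference D(x) for attraction interval [a, b],
    range interval [c, d] and the point m where D reaches 1."""
    if x < c or x > d:
        return -1.0
    if x <= m:  # x lies on the left side of M
        if x >= a:
            return 1.0 if m == a else ((x - a) / (m - a)) ** beta
        return -(((x - a) / (c - a)) ** beta)
    # x lies on the right side of M
    if x <= b:
        return ((x - b) / (m - b)) ** beta
    return -(((x - b) / (d - b)) ** beta)


def relative_membership(x, a, b, c, d, m, beta=1.0):
    """Relative membership degree derived from D(x)."""
    return (1.0 + relative_difference(x, a, b, c, d, m, beta)) / 2.0


# Damage area of the 1950 sample against the (assumed) medium-flood intervals.
A_LOW, B_HIGH, C_LOW, D_HIGH, M_PEAK = 9045.0, 14197.0, 0.0, 20388.0, 10762.0
mu_1950 = relative_membership(6559.0, A_LOW, B_HIGH, C_LOW, D_HIGH, M_PEAK)
print(round(mu_1950, 4))
```

Under these assumptions the membership equals 1 at M, 0.5 at the ends of the attraction interval, and 0 at or beyond the ends of the range interval.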

To get the synthetic RMD of each indicator, we use the variable fuzzy recognition model presented by Wu et al. (2006), integrated with the indicator weights w_i obtained by AHP:

\[
u'_h(x_j) = \left\{ 1 + \left[ \frac{\sum_{i=1}^{m} \left[ w_i \left( 1 - \mu(x_{ij})_h \right) \right]^{p}}{\sum_{i=1}^{m} \left[ w_i \, \mu(x_{ij})_h \right]^{p}} \right]^{\alpha/p} \right\}^{-1} \qquad (18)
\]

\[
H = (1, 2, 3, 4) \cdot u_h(x_j) \qquad (19)
\]

where h is the degree number, h = 1, 2, 3, 4; x_j represents sample j; and x_ij is the ith indicator value of sample j. First, we use the variable fuzzy recognition model in Eq. (18) to calculate the synthetic relative membership degree of Sample 1. With Eq. (18), we obtain the synthetic relative membership degree of the flood indicators to each degree, u'_h(x_j). After normalizing these values, we get the normalized synthetic relative membership degree u_h(x_j), which enters Eq. (19) as a vector over h. w_i is the indicator weight; m is the number of indicators, m = 4; μ(x_ij)_h is the difference function of indicator i of sample j to degree h; α is the rule parameter of model optimization, where α = 1 is the least single method and α = 2 is the least square method; and p is the distance parameter, where p = 1 is the Hamming distance and p = 2 is the Euclidean distance.

When taking the rule parameter of model optimization α = 2 and the distance parameter p = 1, we obtain the disaster degree of Sample 1 as H = 1.2975. In the same way, we calculate the disaster degree values of all 60 samples, as shown in Table 8.
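Eqs. (18) and (19) can be traced step by step in code. The following is a minimal sketch using the Table 7 weights with α = 2 and p = 1; the membership values in `MU` are hypothetical numbers chosen for illustration (they are not the values of any sample in the paper), and the function names are likewise illustrative.

```python
WEIGHTS = [0.0655, 0.1189, 0.6043, 0.2113]  # Table 7


def synthetic_membership(weights, mu_h, alpha=2.0, p=1.0):
    """Eq. (18) for one degree h; mu_h[i] is the membership of indicator i."""
    far = sum((w * (1.0 - m)) ** p for w, m in zip(weights, mu_h))
    near = sum((w * m) ** p for w, m in zip(weights, mu_h))
    if near == 0.0:
        return 0.0  # the ratio diverges, so the membership tends to 0
    return 1.0 / (1.0 + (far / near) ** (alpha / p))


def disaster_degree(weights, mu, alpha=2.0, p=1.0):
    """Normalize the Eq. (18) values over h and apply Eq. (19).

    mu[h][i] is the membership of indicator i to degree h + 1.
    """
    raw = [synthetic_membership(weights, mu_h, alpha, p) for mu_h in mu]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all synthetic memberships are zero")
    u = [r / total for r in raw]
    degree = sum((h + 1) * u_h for h, u_h in enumerate(u))
    return u, degree


# Hypothetical memberships: every indicator is 0.8 in degree 1, 0.2 in degree 2.
MU = [[0.8] * 4, [0.2] * 4, [0.0] * 4, [0.0] * 4]
u, H = disaster_degree(WEIGHTS, MU)
print([round(v, 4) for v in u], round(H, 4))
```

Because H is a membership-weighted average of the grade numbers, it always falls between 1 and 4, which is what makes the grade thresholds on H meaningful.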

5.2. Flood risk evaluation based on information diffusion

Based on VFS, the disaster degree values of the 60 samples are calculated (Table 8), giving the sample point set X = {x1, x2, ..., x60}. The universe of discourse, namely, the monitoring point set, is taken as U = {u1, u2, ..., u41} = {0, 0.1, 0.2, ..., 4.0}. The normalized information distribution of each x_i, that is, μ_{x_i}(u_j), can be obtained according to Eqs. (9)–(12). Then, based on Eqs. (13)–(16), the disaster probability risk estimate is calculated. The relationship between the recurrence interval N (years) and the probability p can be expressed as N = 1/p. Fig. 2 compares the flood exceedance probability curve over the disaster degree value with the curve obtained by the traditional statistical method.

Table 9
Comparison of two methods.

Method      VFS-IDM model  Statistics

Mean error  0.0214         0.0544

Table 10
Flood disaster risk evaluation values.

Disaster level               Small flood  Medium flood  Large flood  Extreme flood

Exceedance probability risk  0.9656       0.5735        0.1565       0.0269
Recurrence interval (years)  1.0356       1.7436        6.3899       37.1629
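The diffusion step can be sketched on the Table 8 values. This is a minimal sketch, assuming the normal information diffusion function and the empirical diffusion coefficient h = 2.6851 (b − a)/(n − 1) often used for n ≥ 11 samples (Huang, 1997); since Eqs. (9)–(16) are not reproduced here, these choices are assumptions and the output may not match Table 10 exactly.

```python
import math

# Disaster degree values of samples 1-60, copied from Table 8.
DEGREES = [
    1.2975, 2.4769, 1.5831, 1.5564, 3.2447, 1.2101, 3.2537, 2.131, 1.4648, 1.7051,
    2.2814, 2.1606, 2.0829, 3.449, 2.2234, 1.1692, 1.0678, 1.0227, 1.0524, 1.9909,
    1.1141, 1.1142, 1.0727, 1.3453, 1.2092, 3.9117, 1.1177, 1.3402, 1.0624, 1.32,
    1.7058, 2.2728, 2.4727, 2.6061, 1.777, 1.7948, 1.5432, 1.628, 1.8295, 1.5528,
    1.6377, 2.7769, 1.3983, 1.7725, 2.7208, 2.0294, 2.8727, 1.4821, 2.3079, 1.4299,
    1.2857, 1.1281, 1.4456, 1.3957, 1.1273, 1.4349, 1.3782, 1.2435, 1.0483, 1.0439,
]
GRID = [round(0.1 * k, 1) for k in range(41)]  # U = {0, 0.1, ..., 4.0}


def diffusion_coefficient(samples):
    """Assumed empirical bandwidth for samples of size n >= 11."""
    n = len(samples)
    if n < 11:
        raise ValueError("this sketch only covers n >= 11")
    return 2.6851 * (max(samples) - min(samples)) / (n - 1)


def exceedance_probabilities(samples, grid):
    """Return (p, exceed): point probabilities and P(degree >= u_j) on the grid."""
    h = diffusion_coefficient(samples)
    q = [0.0] * len(grid)
    for x in samples:
        # Normal diffusion; the constant 1 / (h * sqrt(2 * pi)) cancels on normalization.
        f = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in grid]
        c = sum(f)
        for j, value in enumerate(f):
            q[j] += value / c  # each sample spreads a total information of 1
    total = sum(q)
    p = [v / total for v in q]
    exceed, running = [], 0.0
    for value in reversed(p):
        running += value
        exceed.append(running)
    exceed.reverse()
    return p, exceed


p, exceed = exceedance_probabilities(DEGREES, GRID)
for level in (1.5, 2.5, 3.5):
    prob = exceed[GRID.index(level)]
    print(level, round(prob, 4), round(1.0 / prob, 2))  # N = 1 / p
```

Each sample contributes a total information of one to the grid, so the estimate stays a proper probability distribution however few samples are available.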

6. Result and discussion

Using the IDM, we obtain the exceedance probabilities for the different disaster degree values, as shown in Fig. 2.

In Fig. 2, the results show that the flood risk estimated by the VFS-IDM model decreases smoothly as the degree value increases. The curve of the VFS-IDM model is smoother and more accurate than that of the traditional statistical method.

Thirty records are randomly chosen to form a small sample, and they are analyzed in the same way for comparison with the large sample. The results are compared in Figs. 3 and 4.

Fig. 3. Comparison of the risks by VFS-IDM model with small sample and large sample.

Fig. 4. Comparison of the risks by traditional statistics with small sample and large sample.

Fig. 3 shows the two curves of the risk estimated by the VFS-IDM model with the small sample and with the large sample. The two curves match well, which indicates that the result barely changes with the sample size and that the method is stable. Taking the analysis results for the large sample as the standard, the VFS-IDM method stays closer to the standard than the statistical method does, as the experiments show.

In Fig. 4, we compare the two curves of the risk estimated by frequency statistics with the small sample and with the large sample. The mean error between the large-sample and small-sample results by frequency statistics reaches 0.0544, which is much larger than that of the VFS-IDM model (0.0214; Table 9).

Figs. 3 and 4 indicate that the results of the small sample analyzed by the VFS-IDM model are satisfactory. The results show that the flood risk decreases smoothly as the degree value increases, and that the VFS-IDM model works better for practical problems. Compared with the results calculated by statistics, the information diffusion approach performs much better because the result of the VFS-IDM model is closer to the standard. Table 9 compares the mean errors between the large-sample and small-sample results for the VFS-IDM model and for traditional statistics. The table also shows that the mean error given by the VFS-IDM model is much smaller than that given by the statistical method.

The results also illustrate the risk assessment values and the recurrence interval values for the different disaster levels in China.

According to the four-grade standard (Chen, 2009), we have the following categories:

(a) If 1.0 ≤ H ≤ 1.5, then the flood degree is small (1st grade).
(b) If 1.5 < H ≤ 2.5, then it belongs to medium (2nd grade).
(c) If 2.5 < H ≤ 3.5, then it belongs to large (3rd grade).
(d) If 3.5 < H ≤ 4, then it belongs to extreme (4th grade).
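The four categories amount to a simple threshold lookup on H. The following is a minimal sketch of that lookup; the function name `flood_grade` is an illustrative choice, and the two sample values are taken from Table 8.

```python
def flood_grade(h):
    """Map a disaster degree value H in [1, 4] to (grade number, label)."""
    if not 1.0 <= h <= 4.0:
        raise ValueError("H must lie in [1, 4]")
    if h <= 1.5:
        return 1, "small"
    if h <= 2.5:
        return 2, "medium"
    if h <= 3.5:
        return 3, "large"
    return 4, "extreme"


# Sample 1 (H = 1.2975) and sample 26 (H = 3.9117) from Table 8.
print(flood_grade(1.2975), flood_grade(3.9117))
```

The upper bound of each interval is inclusive, so a value exactly on a boundary such as H = 2.5 falls into the lower grade.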

The result in Fig. 3 illustrates the risk estimation, i.e., the exceedance probability of the disaster degree value. From this information, we know that the risk estimate is 0.0269 when the disaster degree value is 3.5. In other words, floods exceeding a degree value of 3.5 (extreme floods) occur about once every 37.1629 years. Similarly, the probability of floods exceeding a degree value of 2.5 (large floods) is 0.1565, which means that floods exceeding that intensity occur about once every 6.3899 years. These findings indicate the serious situation of floods in China. The frequency and the recurrence interval of the floods of the four grades are shown in Table 10.
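The recurrence intervals follow from the exceedance probabilities through N = 1/p. The following is a minimal sketch that recomputes them from the rounded probabilities in Table 10; the small gap for the extreme grade presumably comes from the paper using unrounded probabilities, which is an assumption here.

```python
# Exceedance probabilities and recurrence intervals as published in Table 10.
TABLE_10 = {
    "small": (0.9656, 1.0356),
    "medium": (0.5735, 1.7436),
    "large": (0.1565, 6.3899),
    "extreme": (0.0269, 37.1629),
}


def recurrence_interval(p):
    """Recurrence interval N (years) for an annual exceedance probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return 1.0 / p


for level, (prob, published) in TABLE_10.items():
    print(level, round(recurrence_interval(prob), 4), published)
```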

7. Conclusion

Floods occur frequently in China and cause significant property losses and casualties. Flood risk assessment of an area is important for flood disaster managers so that they can implement a compensation and disaster-reduction plan. However, risk, as a natural or societal phenomenon, is neither precise nor certain. In the current paper, we analyze the concept of fuzzy risk with respect to environment and safety using a new model. We also analyze why fuzzy risk estimation has to be implemented in many cases.

We put forward a fuzzy method of flood risk assessment based on VFS theory and the information diffusion technique to improve probability estimation. In the case calculation, the proposed estimate is better than the statistical estimate. The method has been tested and found to be reliable, and the results are reasonable and stable. Moreover, the analysis has shown that the composite model has potential in identifying the risks of natural disasters in some areas. In view of the theoretical system of flood risk assessment developed thus far and the fact that observed series of disasters are quite short or even unavailable, the method based on IDM and VFS adopted in the paper is indisputably an effective and practical method. This is the first time that the model has been applied to cases of flood disasters, and more work is needed to draw final lessons from flood disasters.

Neither the classical models nor the information diffusion model governs the nature of the physical processes. They are introduced to compensate for our own limitations in understanding the processes concerned.

As fuzzy risk analysis involves more imprecision, uncertainty, and partial truth in natural and societal phenomena, work on fuzzy risks must promote the study of the foundations of fuzzy logic.

Further technological developments in flood control and many new effective methods of flood risk analysis can be used to improve prediction accuracy. We hope that by conducting such analysis, lessons can be learned so that the impact of natural disasters, such as the floods in China, can be prevented or mitigated in the future.

Acknowledgments

This work is supported by a Grant from the National Basic Research Program of China (Project No. 2007CB714107), a Grant from the Key Projects in the National Science and Technology Pillar Program (Project No. 2008BAB29B08), and a Grant from the Special Research Foundation for the Public Welfare Industry of the Ministry of Science and Technology and the Ministry of Water Resources (Project No. 201001080).

References

Breiman, L., Meisel, W., Purcell, E., 1977. Variable kernel estimates of multivariate densities. Technometrics 19, 135–144.

Brown, C.B., 1979. A fuzzy safety measure. Journal of the Engineering Mechanics Division 105, 855–872.

Carlin, B.P., Louis, T.A., 1997. Bayes and empirical Bayes methods for data analysis. Statistics and Computing 7, 153–154.

Chaitin, G.J., 1990. Information, Randomness & Incompleteness: Papers on Algorithmic Information Theory. World Scientific Pub. Co. Inc.

Chen, S., 1997. Relative membership function and new frame of fuzzy sets theory for pattern recognition. Journal of Fuzzy Mathematics 5, 401–412.

Chen, S., 2009. Theory and Model of Variable Fuzzy Sets and its Application, first ed. Dalian University of Technology Press, Dalian.

Chen, X.R., 1989. Non-Parametric Statistics. Shanghai Science and Technology Press, Shanghai.

Clements, D.P., 1977. Fuzzy ratings for computer security evaluation. PhD thesis, University of California, Berkeley.

Devroye, L., Gyorfi, L., 1985. Nonparametric Density Estimation. Wiley.

Dong, W., Shah, H., Wong, F., 1985. Fuzzy computations in risk and decision analysis. Civil Engineering Systems 2, 201–208.

Esogbue, A.O., Theologidu, M., Guo, K., 1992. On the application of fuzzy sets theory to the optimal flood control problem arising in water resources systems. Fuzzy Sets and Systems 48, 155–172.

Greening, L.A., Bernow, S., 2004. Design of coordinated energy and environmental policies: use of multi-criteria decision-making. Energy Policy 32, 721–735.

Guo, Y., Chen, S., 2006. Application of Variable Fuzzy Sets in Classified Prediction of Rockburst. ASCE.

Hadipriono Timothy, J., Fabian, C., 1991. A rule-based fuzzy logic deduction technique for damage assessment of protective structures. Fuzzy Sets and Systems 44, 459–468.

Hand, D.J., 1982. Kernel Discriminant Analysis. Research Studies Press, Chichester, UK.

Hobbs, B.F., Meier, P., 2000. Energy Decisions and the Environment: A Guide to the Use of Multicriteria Methods. Springer, Netherlands.

Hoffman, L.J., Michelman, E.H., Clements, D., 1978. SECURATE-security evaluation and analysis using fuzzy metrics. IEEE Computer Society, p. 531.

Huang, C., 1997. Principle of information diffusion. Fuzzy Sets and Systems 91, 69–90.

Huang, C., 2002. Information diffusion techniques and small-sample problem. International Journal of Information Technology and Decision Making 1, 229–250.

Huang, C., Moraga, C., 2005. Extracting fuzzy if–then rules by using the information matrix technique. Journal of Computer and System Sciences 70, 26–52.

Huang, C., Shi, Y., 2002. Towards Efficient Fuzzy Information Processing: Using the Principle of Information Diffusion. Physica Verlag.

Liang, Z., Yang, K., Sun, Y., Yuan, J., Zhang, H., Zhang, Z., 2006. Decision support for choice optimal power generation projects: fuzzy comprehensive evaluation model based on the electricity market. Energy Policy 34, 3359–3364.

Limmeechokchai, B., Chawana, S., 2007. Sustainable energy development strategies in the rural Thailand: the case of the improved cooking stove and the small biogas digester. Renewable and Sustainable Energy Reviews 11, 818–837.

Liu, Z.R., Huang, C.F., 1990. Information distribution method relevant in fuzzy information analysis. Fuzzy Sets and Systems 36, 67–76.

Palm, R., 2007. Multiple-step-ahead prediction in control systems with Gaussian process models and TS-fuzzy models. Engineering Applications of Artificial Intelligence 20, 1023–1035.

Parzen, E., 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065–1076.

Pohekar, S., Ramachandran, M., 2004. Application of multi-criteria decision making to sustainable energy planning – a review. Renewable and Sustainable Energy Reviews 8, 365–381.

Satty, T., 1980. The Analytic Hierarchy Process. McGraw-Hill, New York.

Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC.

Wang, Y., Wang, D., Wu, J., 2011. A variable fuzzy set assessment model for water shortage risk: two case studies from China. Human and Ecological Risk Assessment 17, 631–645.

Wertz, W., 1978. Statistical Density Estimation: A Survey. Vandenhoeck & Ruprecht.

Wu, L., Guo, Y., Chen, S., Zhou, H., 2006. Use of variable fuzzy sets methods for desertification evaluation. Computational Intelligence, Theory and Applications, 721–731.

Zadeh, L.A., 1965. Fuzzy sets. Information and Control 8, 338–353.

Zhang, D., Wang, G., Zhou, H., 2011. Assessment on agricultural drought risk based on variable fuzzy sets model. Chinese Geographical Science 21, 167–175.
