


Information Sciences 201 (2012) 53–60


Cross-entropy measure of uncertain variables

Xiaowei Chen a, Samarjit Kar b,*, Dan A. Ralescu c

a Department of Risk Management and Insurance, Nankai University, Tianjin 30071, China
b Department of Mathematics, National Institute of Technology, Durgapur 713209, India
c Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221-0025, USA

Article info

Article history:
Received 7 July 2010
Received in revised form 22 February 2012
Accepted 26 February 2012
Available online 17 March 2012

Keywords:
Uncertain variable
Cross-entropy
Minimum cross-entropy principle


Abstract

Cross-entropy is a measure of the difference between two distribution functions. In order to deal with the divergence of uncertain variables via uncertainty distributions, this paper aims at introducing the concept of cross-entropy for uncertain variables based on uncertainty theory, as well as investigating some mathematical properties of this concept. Several practical examples are also provided to calculate uncertain cross-entropy. Furthermore, the minimum cross-entropy principle is proposed in this paper. Finally, a study of generalized cross-entropy for uncertain variables is carried out.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

Probability theory, fuzzy set theory, rough set theory, and credibility theory were all introduced to describe non-deterministic phenomena. However, some of the non-deterministic phenomena expressed in natural language, e.g. "about 100 km", "approximately 39 °C", "big size", are neither random nor fuzzy. Liu [16] founded uncertainty theory as a branch of mathematics based on normality, self-duality, countable subadditivity, and product measure axioms. An uncertain measure is used to indicate the degree of belief that an uncertain event may occur. An uncertain variable is a measurable function from an uncertainty space to the set of real numbers, and this concept is used to represent uncertain quantities. The uncertainty distribution is a description of an uncertain variable. Uncertainty theory has wide applications in programming, logic, risk management, and reliability theory. In many cases, the uncertainty is not static, but changes over time. In order to describe dynamic uncertain systems, uncertain processes were first introduced by Liu [17]. Uncertain statistics is a methodology for collecting and interpreting experimental data (provided by experts) in the framework of uncertainty theory.

Suppose that we know the states of a system take values in a specific set, although we do not know the exact form of the distribution function over this set. However, we may learn constraints on this distribution: expectations, variance, or bounds on these values. Suppose that we need to choose a distribution that is in some sense the best estimate given what we know. Usually there are infinitely many distributions satisfying the constraints. Which one should we choose? Before answering this question, we will first discuss the concepts of entropy and cross-entropy.

In 1949, Shannon [25] introduced entropy to measure the degree of uncertainty of random variables. Inspired by the Shannon entropy, fuzzy entropy was proposed by Zadeh [34] to quantify the amount of fuzziness, and the entropy of a fuzzy event is defined as a weighted Shannon entropy. Fuzzy entropy has been studied by many researchers, such as [7,12,14,15,20–22,33,36]. The principle of maximum entropy was proposed by Jaynes [11]: of all the distributions that satisfy the constraints, choose the one with the largest entropy. Besides this method, cross-entropy was introduced by Good [8].



It is a non-symmetric measure of the difference between two probability distributions. Other names for this concept include expected weight of evidence, directed divergence, and relative entropy. Based on De Luca and Termini's [7] fuzzy entropy, Bhandari and Pal [1] defined the cross-entropy for a fuzzy set via its membership function. The theory of fuzzy cross-entropy has been studied in [2,24]. The principle of minimum cross-entropy was proposed by Kullback [13]: from the distributions that satisfy the constraints, choose the one with the least cross-entropy. The principle of maximum entropy can be used to select a number of representative samples from a large database [28]. The principle of maximum entropy and the principle of minimum cross-entropy have been applied to machine learning and to decision trees; see [10,26,27,29–31,35] for details. Other applications include portfolio selection [23] and optimization models [9,32].

Uncertainty theory is used to model human uncertainty, and the uncertainty distribution plays a fundamental role in it. Unlike a probability distribution, which is estimated from a sample, an uncertainty distribution is often obtained by asking domain experts to evaluate their degree of belief that each event will occur, so empirical prior information becomes more important. In many real problems, the distribution function is unavailable except for partial information, for example, a prior distribution function, which may be based on intuition or experience with the problem. In order to better estimate the uncertainty distribution, Liu [19] introduced uncertain entropy to characterize uncertainty resulting from information deficiency. Chen and Dai [4] investigated the maximum entropy principle of uncertainty distribution for uncertain variables. To compute the entropy more conveniently, Dai and Chen [5] provided some formulas for the entropy of functions of uncertain variables with regular uncertainty distributions. In order to deal with the divergence of two given uncertainty distributions, this paper will introduce the concept of cross-entropy for uncertain variables. Several practical examples are also provided to calculate uncertain cross-entropy. In practice, we often need to estimate the uncertainty distribution of an uncertain variable from known (partial) information, for example, a prior uncertainty distribution, which may be based on intuition or experience with the particular problem. In this context, the principle of minimum cross-entropy in uncertainty theory will be studied. The rest of the paper is organized as follows: some preliminary concepts of uncertainty theory are briefly recalled in Section 2. The concept and basic properties of entropy of uncertain variables are introduced in Section 3. The concept of cross-entropy for uncertain variables is introduced in Section 4, where some mathematical properties are also studied. The minimum cross-entropy principle theorem for uncertain variables is proved in Section 5. A study of generalized cross-entropy for uncertain variables is carried out in Section 6. Finally, a brief summary is given in Section 7.

2. Preliminaries

Let Γ be a nonempty set, and let L be a σ-algebra over Γ. An uncertain measure M [16] is a set function defined on L satisfying the following four axioms:

Axiom 1. (Normality Axiom) M{Γ} = 1.

Axiom 2. (Duality Axiom) M{Λ} + M{Λᶜ} = 1 for any event Λ ∈ L.

Axiom 3. (Subadditivity Axiom) For every countable sequence of events {Λ_i}, we have

$$ \mathcal{M}\Bigl\{\bigcup_{i=1}^{\infty}\Lambda_i\Bigr\} \le \sum_{i=1}^{\infty}\mathcal{M}\{\Lambda_i\}. $$

Axiom 4. (Product Measure Axiom) Let Γ_k be nonempty sets on which M_k are uncertain measures, k = 1, 2, ..., n, respectively. Then the product uncertain measure M is an uncertain measure on the product σ-algebra L = L₁ × L₂ × ⋯ × Lₙ satisfying

$$ \mathcal{M}\Bigl\{\prod_{k=1}^{n}\Lambda_k\Bigr\} = \min_{1\le k\le n}\mathcal{M}_k\{\Lambda_k\}. $$

An uncertain variable is a measurable function from an uncertainty space (Γ, L, M) to the set of real numbers. The uncertainty distribution Φ: ℝ → [0, 1] of an uncertain variable ξ is defined as Φ(x) = M{ξ ≤ x}. The expected value operator of an uncertain variable was defined by Liu [16] as

$$ E[\xi] = \int_{0}^{+\infty}\mathcal{M}\{\xi \ge r\}\,dr - \int_{-\infty}^{0}\mathcal{M}\{\xi \le r\}\,dr $$

provided that at least one of the two integrals is finite. Furthermore, the variance is defined as E[(ξ − e)²], where e is the finite expected value of ξ.
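As a quick numerical illustration (added here for concreteness; it is not part of the original paper), the following Python sketch evaluates the expected-value integral for a linear uncertain variable L(a, b), whose distribution Φ(x) = (x − a)/(b − a) on [a, b] is recalled in Example 1 below; the result should be close to (a + b)/2. The function names and truncation bounds are our own choices.

# Illustrative sketch (not from the paper): approximate E[xi] for a linear
# uncertain variable L(a, b) using
#   E[xi] = int_0^{+inf} M{xi >= r} dr - int_{-inf}^0 M{xi <= r} dr,
# where M{xi <= x} = Phi(x) and M{xi >= x} = 1 - Phi(x) for a continuous Phi.

def phi_linear(x, a, b):
    """Uncertainty distribution of L(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def expected_value(phi, lo=-100.0, hi=100.0, n=100000):
    """Midpoint-rule approximation of the expected-value integral on [lo, hi]."""
    pos_h = hi / n
    pos = sum((1.0 - phi((i + 0.5) * pos_h)) * pos_h for i in range(n))
    neg_h = -lo / n
    neg = sum(phi(lo + (i + 0.5) * neg_h) * neg_h for i in range(n))
    return pos - neg

a, b = 1.0, 5.0
print(expected_value(lambda x: phi_linear(x, a, b)))   # approx (a + b) / 2 = 3.0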

3. Entropy of uncertain variables

Definition 1 (Liu [19]). Let ξ be an uncertain variable with uncertainty distribution Φ(x). Then its entropy is defined by

$$ H[\xi] = \int_{-\infty}^{+\infty} S(\Phi(x))\,dx $$


where S(t) = −t ln t − (1 − t) ln(1 − t).

Note that S(t) = −t ln t − (1 − t) ln(1 − t) is strictly concave on [0, 1] and symmetrical about t = 0.5. Hence H[ξ] ≥ 0 for all uncertain variables ξ. Liu [18] proved that 0 ≤ H[ξ] ≤ (b − a) ln 2 if ξ takes values in the interval [a, b], and that H[ξ] = (b − a) ln 2 if and only if ξ is an uncertain variable with the following distribution:

$$ \Phi(x) = \begin{cases} 0, & \text{if } x < a \\ 0.5, & \text{if } a \le x \le b \\ 1, & \text{if } x > b. \end{cases} $$

Next, we will calculate the entropy of some uncertain variables.

Example 1. A linear uncertain variable ξ has the uncertainty distribution

$$ \Phi(x) = \begin{cases} 0, & \text{if } x < a \\ (x-a)/(b-a), & \text{if } a \le x \le b \\ 1, & \text{if } x > b \end{cases} $$

where a and b are real numbers with a < b. The entropy of the linear uncertain variable L(a, b) is H[ξ] = (b − a)/2.

Example 2. A zigzag uncertain variable ξ has the uncertainty distribution

$$ \Phi(x) = \begin{cases} 0, & \text{if } x < a \\ (x-a)/(2(b-a)), & \text{if } a \le x < b \\ (x+c-2b)/(2(c-b)), & \text{if } b \le x \le c \\ 1, & \text{if } x > c \end{cases} $$

where a, b and c are real numbers with a < b < c. The entropy of the zigzag uncertain variable Z(a, b, c) is H[ξ] = (c − a)/2.

Example 3. A normal uncertain variable ξ is an uncertain variable with the normal uncertainty distribution

$$ \Phi(x) = \left(1 + \exp\left(\frac{\pi(e-x)}{\sqrt{3}\,\sigma}\right)\right)^{-1}, \qquad -\infty < x < +\infty,\ \sigma > 0. $$

Then the entropy of a normal uncertain variable is H[ξ] = πσ/√3.
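The entropy values in Examples 1–3 can be checked numerically. The short Python sketch below is our own illustration (not part of the paper); it integrates S(Φ(x)) for a linear and a normal uncertain variable and compares the result with (b − a)/2 and πσ/√3, respectively. The integration range and step count are arbitrary choices.

import math

def S(t):
    """S(t) = -t ln t - (1 - t) ln(1 - t), with S(0) = S(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log(t) - (1 - t) * math.log(1 - t)

def entropy(phi, lo, hi, n=100000):
    """H[xi] = int S(Phi(x)) dx, approximated by the midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(S(phi(lo + (i + 0.5) * h)) * h for i in range(n))

# Example 1: linear L(a, b)  ->  H = (b - a) / 2
a, b = 0.0, 4.0
phi_lin = lambda x: min(max((x - a) / (b - a), 0.0), 1.0)
print(entropy(phi_lin, a, b), (b - a) / 2)                       # both approx 2.0

# Example 3: normal N(e, sigma)  ->  H = pi * sigma / sqrt(3)
e, sigma = 0.0, 1.0
phi_nor = lambda x: 1.0 / (1.0 + math.exp(math.pi * (e - x) / (math.sqrt(3) * sigma)))
print(entropy(phi_nor, -50, 50), math.pi * sigma / math.sqrt(3))  # both approx 1.8138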

Theorem 1 (Dai and Chen [5]). Assume ξ is an uncertain variable with regular uncertainty distribution Φ. If the entropy H[ξ] exists, then

$$ H[\xi] = \int_{0}^{1} \Phi^{-1}(\alpha)\,\ln\frac{\alpha}{1-\alpha}\,d\alpha. $$

Theorem 2 (Dai and Chen [5]). Let ξ and η be independent uncertain variables. Then for any real numbers a and b, we have

$$ H[a\xi + b\eta] = |a|\,H[\xi] + |b|\,H[\eta]. $$

Furthermore, Chen and Dai [4] proved the following maximum entropy theorem for uncertain variables. Let ξ be an uncertain variable with finite expected value e and variance σ². Then H[ξ] ≤ πσ/√3, and the equality holds only if ξ is a normal uncertain variable with expected value e and variance σ², i.e., N(e, σ).

Definition 2 (Dai [6]). Suppose that ξ is an uncertain variable with uncertainty distribution Φ. Then its quadratic entropy is defined by

$$ Q[\xi] = \int_{-\infty}^{+\infty} \Phi(x)\,(1 - \Phi(x))\,dx. $$

Some mathematical properties of quadratic entropy were studied by Dai [6], in particular the principle of maximum quadratic entropy; several maximum quadratic entropy theorems with moment constraints are also discussed there. The quadratic entropy has also been applied to estimate uncertainty distributions in uncertain statistics by Dai [6].

4. Cross-entropy for uncertain variables

In this section, we will introduce the concept of cross-entropy for uncertain variables by using uncertain measures. First, we recall the information-theoretic distance known as cross-entropy, introduced by Kullback [13].


Let P = {p₁, p₂, ..., pₙ} and Q = {q₁, q₂, ..., qₙ} be two probability distributions with $\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$. The cross-entropy D(P; Q) is defined as follows:

$$ D(P;Q) = \sum_{i=1}^{n} p_i \ln\frac{p_i}{q_i}. \qquad (1) $$

Since Eq. (1) is asymmetric, Pal and Pal [21] used the symmetric version

$$ D(P;Q) = \sum_{i=1}^{n} \left( p_i \ln\frac{p_i}{q_i} + q_i \ln\frac{q_i}{p_i} \right). $$

Inspired by this, we will introduce the following function to define cross-entropy for uncertain variables:

$$ T(s,t) = s \ln\frac{s}{t} + (1-s)\ln\frac{1-s}{1-t}, \qquad 0 \le s \le 1,\ 0 \le t \le 1, $$

with the convention 0 · ln 0 = 0. It is obvious that T(s, t) = T(1 − s, 1 − t) for any 0 ≤ s ≤ 1 and 0 ≤ t ≤ 1. Note that

$$ \frac{\partial T}{\partial s} = \ln\frac{s}{t} - \ln\frac{1-s}{1-t}, \qquad \frac{\partial T}{\partial t} = \frac{t-s}{t(1-t)}, $$

$$ \frac{\partial^2 T}{\partial s^2} = \frac{1}{s(1-s)}, \qquad \frac{\partial^2 T}{\partial t\,\partial s} = -\frac{1}{t(1-t)}, \qquad \frac{\partial^2 T}{\partial t^2} = \frac{s}{t^2} + \frac{1-s}{(1-t)^2}. $$

Then T(s, t) is a strictly convex function with respect to (s, t) and reaches its minimum value 0 when s = t. In uncertainty theory, the best description of an uncertain variable is its uncertainty distribution. The inverse uncertainty distribution has many good properties, and the inverse uncertainty distribution for operations of uncertain variables can be obtained easily. So we will define cross-entropy by using uncertainty distributions.

Definition 3 (Chen [3]). Let ξ and η be two uncertain variables. Then the cross-entropy of ξ from η is defined as

$$ D[\xi;\eta] = \int_{-\infty}^{+\infty} T\bigl(\mathcal{M}\{\xi \le x\},\,\mathcal{M}\{\eta \le x\}\bigr)\,dx $$

where $T(s,t) = s\ln\frac{s}{t} + (1-s)\ln\frac{1-s}{1-t}$.

D[ξ; η] is symmetric in the sense that its value does not change if the outcomes are labeled differently. Let Φ_ξ and Φ_η be the uncertainty distributions of the uncertain variables ξ and η, respectively. The cross-entropy of ξ from η can be written as

$$ D[\xi;\eta] = \int_{-\infty}^{+\infty} \left[ \Phi_\xi(x)\ln\frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1-\Phi_\xi(x))\ln\frac{1-\Phi_\xi(x)}{1-\Phi_\eta(x)} \right] dx. $$

The cross-entropy depends only on the number of values and their uncertainties, and does not depend on the actual values that the uncertain variables ξ and η take.

Theorem 3. For any uncertain variables ξ and η, we have D[ξ; η] ≥ 0, and the equality holds if and only if ξ and η have the same uncertainty distribution.

Proof. Let Φ_ξ(x) and Φ_η(x) be the uncertainty distributions of ξ and η, respectively. Since T(s, t) is strictly convex on [0, 1] × [0, 1] and reaches its minimum value when s = t, we have

$$ T(\Phi_\xi(x), \Phi_\eta(x)) \ge 0 $$

for almost all points x ∈ ℝ. Then

$$ D[\xi;\eta] = \int_{-\infty}^{+\infty} T(\Phi_\xi(x), \Phi_\eta(x))\,dx \ge 0. $$

For each s ∈ [0, 1], there is a unique point t = s with T(s, t) = 0. Thus, D[ξ; η] = 0 if and only if T(Φ_ξ(x), Φ_η(x)) = 0 for almost all points x ∈ ℝ, that is, M{ξ ≤ x} = M{η ≤ x}. □

Example 4. Suppose that ξ and η are uncertain variables with uncertainty distributions Φ₁ and Φ₂, respectively. Assume that the uncertainty distributions Φ₁ and Φ₂ have the form

$$ \Phi_1(x) = \begin{cases} 0, & \text{if } x < a_1 \\ \alpha_i, & \text{if } a_i \le x < a_{i+1} \\ 1, & \text{if } x \ge a_n \end{cases} $$


and

$$ \Phi_2(x) = \begin{cases} 0, & \text{if } x < a_1 \\ \beta_i, & \text{if } a_i \le x < a_{i+1} \\ 1, & \text{if } x \ge a_n \end{cases} $$

respectively. Then the cross-entropy of ξ from η is

$$ D[\xi;\eta] = \sum_{i=1}^{n-1} \left[ \alpha_i \ln\frac{\alpha_i}{\beta_i} + (1-\alpha_i)\ln\frac{1-\alpha_i}{1-\beta_i} \right] (a_{i+1} - a_i). $$
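Because both distributions are constant on each interval [a_i, a_{i+1}), the integral in Definition 3 collapses to the finite sum above. A small Python sketch of this sum follows (our illustration only; the numbers below are hypothetical and merely need to form non-decreasing sequences in (0, 1)).

import math

def step_cross_entropy(a, alpha, beta):
    """Cross-entropy of two step uncertainty distributions that take the values
    alpha[i] and beta[i] on [a[i], a[i+1]), as in Example 4.
    Assumes 0 < alpha[i] < 1 and 0 < beta[i] < 1."""
    total = 0.0
    for i in range(len(a) - 1):
        s, t, width = alpha[i], beta[i], a[i + 1] - a[i]
        total += (s * math.log(s / t) + (1 - s) * math.log((1 - s) / (1 - t))) * width
    return total

# Hypothetical example values:
print(step_cross_entropy(a=[0, 1, 2, 3], alpha=[0.2, 0.5, 0.9], beta=[0.3, 0.5, 0.8]))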

Example 5. Suppose that ξ and η are linear uncertain variables with uncertainty distributions L(a, b) and L(c, d) (c ≤ a < b ≤ d), respectively. Then the cross-entropy of ξ from η is

$$ D[\xi;\eta] = \int_{-\infty}^{+\infty} \left[ \Phi_\xi(x)\ln\frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1-\Phi_\xi(x))\ln\frac{1-\Phi_\xi(x)}{1-\Phi_\eta(x)} \right] dx \qquad (2) $$

$$ = \int_{a}^{b} \left[ \frac{x-a}{b-a}\ln\frac{(x-a)(d-c)}{(b-a)(x-c)} + \frac{b-x}{b-a}\ln\frac{(b-x)(d-c)}{(b-a)(d-x)} \right] dx \qquad (3) $$

$$ \quad + \int_{c}^{a} \ln\frac{d-c}{d-x}\,dx + \int_{b}^{d} \ln\frac{d-c}{x-c}\,dx. \qquad (4) $$

In particular, if the uncertainty distributions of ξ and η are L(0, 1) and L(0, 2), respectively, then using the formula above we get D[ξ; η] = 0.5.
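The closing value can be verified numerically. The self-contained sketch below (ours, not from the paper) integrates the defining formula for ξ ~ L(0, 1) against η ~ L(0, 2) on [0, 2] and should print a number close to 0.5.

import math

phi_xi  = lambda x: min(max(x, 0.0), 1.0)        # L(0, 1)
phi_eta = lambda x: min(max(x / 2.0, 0.0), 1.0)  # L(0, 2)

n, lo, hi = 200000, 0.0, 2.0
h, D = (hi - lo) / n, 0.0
for i in range(n):
    x = lo + (i + 0.5) * h
    s, t = phi_xi(x), phi_eta(x)
    if s > 0.0:
        D += s * math.log(s / t) * h
    if s < 1.0:
        D += (1.0 - s) * math.log((1.0 - s) / (1.0 - t)) * h
print(D)   # approx 0.5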

Example 6. Suppose that ξ and η are zigzag uncertain variables with uncertainty distributions Z(a, b, c) and Z(d, b, e) (d ≤ a < b < c ≤ e), respectively. Then the cross-entropy of ξ from η is

$$ D[\xi;\eta] = \int_{-\infty}^{+\infty} \left[ \Phi_\xi(x)\ln\frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1-\Phi_\xi(x))\ln\frac{1-\Phi_\xi(x)}{1-\Phi_\eta(x)} \right] dx \qquad (5) $$

$$ = \int_{a}^{b} \left[ \frac{x-a}{2(b-a)}\ln\frac{(x-a)\,(2b-2d)}{(x-d)\,(2b-2a)} + \frac{2b-x-a}{2(b-a)}\ln\frac{(2b-x-a)\,(2b-2d)}{(2b-x-d)\,(2b-2a)} \right] dx \qquad (6) $$

$$ \quad + \int_{b}^{c} \left[ \frac{x+c-2b}{2(c-b)}\ln\frac{(x+c-2b)\,(2e-2b)}{(x+e-2b)\,(2c-2b)} + \frac{c-x}{2(c-b)}\ln\frac{(c-x)\,(2e-2b)}{(e-x)\,(2c-2b)} \right] dx \qquad (7) $$

$$ \quad + \int_{d}^{a} \ln\frac{2b-2d}{2b-d-x}\,dx + \int_{c}^{e} \ln\frac{2e-2b}{x+e-2b}\,dx. \qquad (8) $$

In particular, if the uncertainty distributions of ξ and η are Z(0, 1, 2) and Z(0, 1, 3), respectively, then using the formula above we get D[ξ; η] = 0.2.
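This value can also be checked numerically; the self-contained sketch below (our illustration, not from the paper) integrates the defining formula for ξ ~ Z(0, 1, 2) against η ~ Z(0, 1, 3) on [0, 3] and should print a number close to 0.2.

import math

def phi_zigzag(x, a, b, c):
    """Uncertainty distribution of a zigzag uncertain variable Z(a, b, c)."""
    if x < a:
        return 0.0
    if x < b:
        return (x - a) / (2.0 * (b - a))
    if x <= c:
        return (x + c - 2.0 * b) / (2.0 * (c - b))
    return 1.0

phi_xi  = lambda x: phi_zigzag(x, 0.0, 1.0, 2.0)
phi_eta = lambda x: phi_zigzag(x, 0.0, 1.0, 3.0)

n, lo, hi = 300000, 0.0, 3.0
h, D = (hi - lo) / n, 0.0
for i in range(n):
    x = lo + (i + 0.5) * h
    s, t = phi_xi(x), phi_eta(x)
    if s > 0.0:
        D += s * math.log(s / t) * h
    if s < 1.0:
        D += (1.0 - s) * math.log((1.0 - s) / (1.0 - t)) * h
print(D)   # approx 0.2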

Example 7. Suppose that ξ and η are normal uncertain variables with uncertainty distributions N(e₁, σ₁) and N(e₂, σ₂), respectively. Then the cross-entropy of ξ from η is

$$ D[\xi;\eta] = \int_{-\infty}^{+\infty} \left[ \Phi_\xi(x)\ln\frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1-\Phi_\xi(x))\ln\frac{1-\Phi_\xi(x)}{1-\Phi_\eta(x)} \right] dx \qquad (9) $$

$$ = \int_{-\infty}^{+\infty} \left(1+\exp\frac{\pi(e_1-x)}{\sqrt{3}\,\sigma_1}\right)^{-1} \ln\frac{1+\exp\frac{\pi(e_2-x)}{\sqrt{3}\,\sigma_2}}{1+\exp\frac{\pi(e_1-x)}{\sqrt{3}\,\sigma_1}}\,dx \qquad (10) $$

$$ \quad + \int_{-\infty}^{+\infty} \left(1+\exp\frac{\pi(x-e_1)}{\sqrt{3}\,\sigma_1}\right)^{-1} \ln\frac{1+\exp\frac{\pi(x-e_2)}{\sqrt{3}\,\sigma_2}}{1+\exp\frac{\pi(x-e_1)}{\sqrt{3}\,\sigma_1}}\,dx. \qquad (11) $$

In particular, if the uncertainty distributions of ξ and η are N(0, 2) and N(0, 1), respectively, then using the formula above we get D[ξ; η] = 0.9.
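A numerical check of this last value is again straightforward; the sketch below is our own illustration (not from the paper) and computes the complementary distributions through the logistic form to avoid loss of precision in the tails. It should print a number close to 0.9.

import math

k = math.pi / math.sqrt(3.0)
sig = lambda v: 1.0 / (1.0 + math.exp(-v))

n, lo, hi = 200000, -40.0, 40.0
h, D = (hi - lo) / n, 0.0
for i in range(n):
    x = lo + (i + 0.5) * h
    s, t = sig(k * x / 2.0), sig(k * x)        # Phi_xi for N(0, 2), Phi_eta for N(0, 1)
    cs, ct = sig(-k * x / 2.0), sig(-k * x)    # 1 - Phi_xi and 1 - Phi_eta, computed stably
    D += (s * math.log(s / t) + cs * math.log(cs / ct)) * h
print(D)   # approx 0.9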

5. Minimum cross-entropy principle

In real problems, the distribution function of an uncertain variable is unavailable except for partial information, for example, some prior distribution function, which may be based on intuition or experience with the problem. If the moment constraints and the prior distribution function are given, then, since the distribution function must be consistent with the given information and our experience, we will use the minimum cross-entropy principle to choose, out of all the distributions satisfying the given moment constraints, the one that is closest to the given prior distribution function.


Theorem 4. Let ξ be a continuous uncertain variable with finite second moment m². If the prior distribution function has the form

$$ \Psi(x) = (1 + \exp(ax))^{-1}, \qquad a < 0, $$

then the minimum cross-entropy distribution function is the normal uncertainty distribution with second moment m².

Proof. Let Φ(x) be the distribution function of ξ and write Φ*(x) = Φ(−x) for x ≥ 0. The second moment is

$$ E[\xi^2] = \int_{0}^{+\infty} \mathcal{M}\{\xi^2 \ge r\}\,dr = \int_{0}^{+\infty} \mathcal{M}\{(\xi \ge \sqrt{r}) \cup (\xi \le -\sqrt{r})\}\,dr = \int_{0}^{+\infty} \bigl(1 - \Phi(\sqrt{r}) + \Phi(-\sqrt{r})\bigr)\,dr $$

$$ = \int_{0}^{+\infty} 2r\bigl(1 - \Phi(r) + \Phi(-r)\bigr)\,dr = \int_{0}^{+\infty} 2r\bigl(1 - \Phi(r) + \Phi^*(r)\bigr)\,dr = m^2. $$

Thus there exists a real number κ such that

$$ \int_{0}^{+\infty} 2r\bigl(1 - \Phi(r)\bigr)\,dr = \kappa m^2, \qquad \int_{0}^{+\infty} 2r\,\Phi^*(r)\,dr = (1-\kappa) m^2. $$

The minimum cross-entropy distribution function Φ(r) should minimize the cross-entropy

$$ \int_{-\infty}^{+\infty} \left[ \Phi(r)\ln\frac{\Phi(r)}{\Psi(r)} + (1-\Phi(r))\ln\frac{1-\Phi(r)}{1-\Psi(r)} \right] dr = \int_{-\infty}^{0} \left[ \Phi(r)\ln\frac{\Phi(r)}{\Psi(r)} + (1-\Phi(r))\ln\frac{1-\Phi(r)}{1-\Psi(r)} \right] dr $$

$$ \quad + \int_{0}^{+\infty} \left[ \Phi(r)\ln\frac{\Phi(r)}{\Psi(r)} + (1-\Phi(r))\ln\frac{1-\Phi(r)}{1-\Psi(r)} \right] dr $$

$$ = \int_{0}^{+\infty} \left[ \Phi(r)\ln\frac{\Phi(r)}{\Psi(r)} + (1-\Phi(r))\ln\frac{1-\Phi(r)}{1-\Psi(r)} + \Phi^*(r)\ln\frac{\Phi^*(r)}{\Psi(-r)} + (1-\Phi^*(r))\ln\frac{1-\Phi^*(r)}{1-\Psi(-r)} \right] dr $$

subject to the moment constraints

$$ \int_{0}^{+\infty} 2r\bigl(1 - \Phi(r)\bigr)\,dr = \kappa m^2, \qquad \int_{0}^{+\infty} 2r\,\Phi^*(r)\,dr = (1-\kappa) m^2. $$

The Lagrangian is

$$ L = \int_{0}^{+\infty} \left[ \Phi(r)\ln\frac{\Phi(r)}{\Psi(r)} + (1-\Phi(r))\ln\frac{1-\Phi(r)}{1-\Psi(r)} + \Phi^*(r)\ln\frac{\Phi^*(r)}{\Psi(-r)} + (1-\Phi^*(r))\ln\frac{1-\Phi^*(r)}{1-\Psi(-r)} \right] dr $$

$$ \quad - \lambda_1\left( \int_{0}^{+\infty} 2r\bigl(1-\Phi(r)\bigr)\,dr - \kappa m^2 \right) - \lambda_2\left( \int_{0}^{+\infty} 2r\,\Phi^*(r)\,dr - (1-\kappa) m^2 \right). $$

The Euler–Lagrange equations tell us that the minimum cross-entropy distribution function satisfies

$$ \ln\frac{\Phi(r)}{\Psi(r)} - \ln\frac{1-\Phi(r)}{1-\Psi(r)} = -2r\lambda_1, \qquad \ln\frac{\Phi^*(r)}{\Psi(-r)} - \ln\frac{1-\Phi^*(r)}{1-\Psi(-r)} = 2r\lambda_2. $$

Thus Φ and Φ* have the form

$$ \Phi(r) = \bigl(1 + \exp(ar + 2\lambda_1 r)\bigr)^{-1}, \qquad \Phi^*(r) = \bigl(1 + \exp(ar - 2\lambda_2 r)\bigr)^{-1}. $$

Substituting these into the moment constraints, we get

$$ \Phi(r) = \left(1 + \exp\left(-\frac{\pi r}{\sqrt{6\kappa}\,m}\right)\right)^{-1}, \qquad \Phi^*(r) = \left(1 + \exp\left(\frac{\pi r}{\sqrt{6(1-\kappa)}\,m}\right)\right)^{-1}. $$

When κ = 1/2, the cross-entropy achieves its minimum. Thus the distribution function is just the normal uncertainty distribution N(0, m). □
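As a sanity check on the final step (our own numerical illustration, not part of the paper), one can verify that the distribution N(0, m) obtained for κ = 1/2 indeed has second moment m², using the identity E[ξ²] = ∫₀^∞ 2r(1 − Φ(r) + Φ(−r)) dr from the proof; the truncation bound and sample count below are arbitrary choices.

import math

def second_moment(phi, hi=100.0, n=200000):
    """E[xi^2] = int_0^inf 2 r (1 - Phi(r) + Phi(-r)) dr, truncated at hi."""
    h = hi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += 2.0 * r * (1.0 - phi(r) + phi(-r)) * h
    return total

m = 2.0
phi_normal = lambda x: 1.0 / (1.0 + math.exp(-math.pi * x / (math.sqrt(3.0) * m)))  # N(0, m)
print(second_moment(phi_normal), m * m)   # both approx 4.0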


6. Generalized cross-entropy

In this section, we will define a generalized cross-entropy for uncertain variables by means of a strictly convex function P(x) satisfying P(1) = 0. There are many functions satisfying this condition, such as P(x) = x² − x, P(x) = (x^α − x)/(α − 1) (α > 0, α ≠ 1), and P(x) = |x − 1/2| − 1/2. For convenience, we define

$$ T(s,t) = t\,P\!\left(\frac{s}{t}\right) + (1-t)\,P\!\left(\frac{1-s}{1-t}\right), \qquad (s,t) \in [0,1] \times [0,1]. $$

It is easy to prove that T(s, t) is a function from [0, 1] × [0, 1] to [0, +∞), with the conventions $T(s,0) = \lim_{t\to 0^+} T(s,t)$ and $T(s,1) = \lim_{t\to 1^-} T(s,t)$.

Definition 4. Let ξ and η be two uncertain variables. Then the generalized cross-entropy of ξ from η is defined as

$$ GD[\xi;\eta] = \int_{-\infty}^{+\infty} T\bigl(\mathcal{M}\{\xi \le x\},\,\mathcal{M}\{\eta \le x\}\bigr)\,dx $$

where $T(s,t) = t\,P(s/t) + (1-t)\,P\bigl((1-s)/(1-t)\bigr)$.

It is clear that GD[ξ; η] is not symmetric. Note that, by changing the form of P(x), we obtain different generalized cross-entropies.

(i) Let P(x) = (x^α − x)/(α − 1), α > 0, α ≠ 1. Then

$$ GD[\xi;\eta] = \frac{1}{\alpha-1}\int_{-\infty}^{+\infty} \Bigl[ \mathcal{M}\{\xi \le x\}^{\alpha}\,\mathcal{M}\{\eta \le x\}^{1-\alpha} + \bigl(1-\mathcal{M}\{\xi \le x\}\bigr)^{\alpha}\bigl(1-\mathcal{M}\{\eta \le x\}\bigr)^{1-\alpha} - 1 \Bigr]\,dx. $$

In particular, when α = 2 (i.e., P(x) = x² − x),

$$ GD[\xi;\eta] = \int_{-\infty}^{+\infty} \frac{\bigl(\mathcal{M}\{\xi \le x\} - \mathcal{M}\{\eta \le x\}\bigr)^2}{\mathcal{M}\{\eta \le x\}\,\bigl(1-\mathcal{M}\{\eta \le x\}\bigr)}\,dx. $$

(ii) Let P(x) = |x − 1/2| − 1/2. Then

$$ GD[\xi;\eta] = \int_{-\infty}^{+\infty} \Bigl( \bigl|\mathcal{M}\{\xi \le x\} - \mathcal{M}\{\eta \le x\}\bigr| - \tfrac{1}{2} \Bigr)\,dx. $$

Let Φ_ξ(x) and Φ_η(x) be the uncertainty distributions of ξ and η, respectively. Similarly, the generalized cross-entropy can be rewritten as

$$ GD[\xi;\eta] = \int_{-\infty}^{+\infty} \left[ \Phi_\eta(x)\,P\!\left(\frac{\Phi_\xi(x)}{\Phi_\eta(x)}\right) + \bigl(1-\Phi_\eta(x)\bigr)\,P\!\left(\frac{1-\Phi_\xi(x)}{1-\Phi_\eta(x)}\right) \right] dx. $$
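The sketch below (our illustration; the function names are not from the paper) evaluates this form of GD[ξ; η] numerically for a user-supplied P and two distribution functions, assuming Φ_η(x) stays strictly between 0 and 1 on the integration grid. With P(x) = x² − x it gives the chi-square-type divergence of case (i), and it returns (approximately) zero for identical distributions, in line with Theorem 5 below.

import math

def gen_cross_entropy(P, phi_xi, phi_eta, lo, hi, n=200000):
    """GD[xi; eta] = int [ Phi_eta P(Phi_xi/Phi_eta)
                           + (1 - Phi_eta) P((1 - Phi_xi)/(1 - Phi_eta)) ] dx,
    approximated by the midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        s, t = phi_xi(x), phi_eta(x)
        total += (t * P(s / t) + (1.0 - t) * P((1.0 - s) / (1.0 - t))) * h
    return total

P_chi2 = lambda x: x * x - x                       # P(x) = x^2 - x
phi_xi  = lambda x: min(max(x, 0.0), 1.0)          # L(0, 1)
phi_eta = lambda x: min(max(x / 2.0, 0.0), 1.0)    # L(0, 2)
print(gen_cross_entropy(P_chi2, phi_xi, phi_eta, 0.0, 2.0))   # positive value
print(gen_cross_entropy(P_chi2, phi_eta, phi_eta, 0.0, 2.0))  # approx 0.0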

Now, suppose that P(x) is a twice-differentiable strictly convex function. Then T(s, t) is twice differentiable with respect to both s and t, and

$$ \frac{\partial T}{\partial s} = P'\!\left(\frac{s}{t}\right) - P'\!\left(\frac{1-s}{1-t}\right), \qquad \frac{\partial T}{\partial t} = P\!\left(\frac{s}{t}\right) - \frac{s}{t}\,P'\!\left(\frac{s}{t}\right) - P\!\left(\frac{1-s}{1-t}\right) + \frac{1-s}{1-t}\,P'\!\left(\frac{1-s}{1-t}\right) \qquad (12) $$

$$ \frac{\partial^2 T}{\partial s\,\partial t} = \frac{\partial^2 T}{\partial t\,\partial s} = -\left( \frac{s}{t^2}\,P''\!\left(\frac{s}{t}\right) + \frac{1-s}{(1-t)^2}\,P''\!\left(\frac{1-s}{1-t}\right) \right) \qquad (13) $$

$$ \frac{\partial^2 T}{\partial s^2} = \frac{1}{t}\,P''\!\left(\frac{s}{t}\right) + \frac{1}{1-t}\,P''\!\left(\frac{1-s}{1-t}\right), \qquad \frac{\partial^2 T}{\partial t^2} = \frac{s^2}{t^3}\,P''\!\left(\frac{s}{t}\right) + \frac{(1-s)^2}{(1-t)^3}\,P''\!\left(\frac{1-s}{1-t}\right). \qquad (14) $$

The following properties of T(s, t) can easily be proved from Eqs. (12)–(14): (a) T(s, t) is a strictly convex function with respect to (s, t) and attains its minimum value zero on the line s = t; and (b) for any 0 ≤ s ≤ 1 and 0 ≤ t ≤ 1, we have T(s, t) = T(1 − s, 1 − t).

Theorem 5. For any uncertain variables ξ and η, the generalized cross-entropy satisfies

$$ GD[\xi;\eta] \ge 0, $$

and the equality holds if and only if ξ and η have the same distribution function.

Proof. Since T(s, t) is strictly convex with respect to (s, t) and attains its minimum value zero on the line s = t, this theorem can be proved in the same way as Theorem 3. □


7. Conclusion

In this paper, we first recalled the concept of entropy for uncertain variables and its mathematical properties. Then we introduced the concept of cross-entropy for uncertain variables to deal with the divergence of two uncertain variables. In addition, we investigated some mathematical properties of this type of cross-entropy and proposed the minimum cross-entropy principle. Furthermore, some examples were provided to calculate uncertain cross-entropy. Finally, we carried out a study of generalized cross-entropy for uncertain variables. In the future, we plan to continue this research to obtain more properties of the proposed cross-entropy measure, especially when the prior distribution function of an uncertain variable has other forms. In addition, we plan to apply our results to portfolio selection, uncertain optimization, and machine learning.

Acknowledgments

This work was supported by the National Natural Science Foundation of China, Grant Nos. 71073084, 71171119 and 91024032. Dan A. Ralescu's work was partly supported by a Taft Travel for Research Grant.

References

[1] D. Bhandari, N. Pal, Some new information measures for fuzzy sets, Information Sciences 67 (1993) 209–228.
[2] P. Boer, D. Kroese, S. Mannor, R. Rubinstein, A tutorial on the cross-entropy method, Annals of Operations Research 134 (1) (2005) 19–67.
[3] X. Chen, Cross-entropy of uncertain variables, in: Proceedings of the 9th International Conference on Electronic Business, Macau, November 30–December 4, 2009, pp. 1093–1095.
[4] X. Chen, W. Dai, Maximum entropy principle for uncertain variables, International Journal of Fuzzy Systems 13 (3) (2011) 232–236.
[5] W. Dai, X. Chen, Entropy of function of uncertain variables, Mathematical and Computer Modelling. http://dx.doi.org/10.1016/j.mcm.2011.08.052.
[6] W. Dai, Quadratic entropy of uncertain variables, Information – An International Interdisciplinary Journal, in press.
[7] A. De Luca, S. Termini, A definition of nonprobabilistic entropy in the setting of fuzzy sets theory, Information and Control 20 (1972) 301–312.
[8] I. Good, Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables, The Annals of Mathematical Statistics 34 (3) (1963) 911–934.
[9] R. Haber, R. Toro, O. Gajate, Optimal fuzzy control system using the cross-entropy method. A case study of a drilling process, Information Sciences 180 (14–15) (2010) 2777–2792.
[10] Q.H. Hu, W. Pan, S. An, P.J. Ma, J.M. Wei, An efficient gene selection technique for cancer recognition based on neighborhood mutual information, International Journal of Machine Learning and Cybernetics 1 (1–4) (2010) 63–74.
[11] E. Jaynes, Information theory and statistical mechanics, Physical Reviews 106 (4) (1957) 620–630.
[12] A. Kaufmann, Introduction to the Theory of Fuzzy Subsets, vol. 1, Academic Press, New York, 1975.
[13] S. Kullback, Information Theory and Statistics, Wiley, New York, 1959.
[14] B. Kosko, Fuzzy entropy and conditioning, Information Sciences 40 (1986) 165–174.
[15] P. Li, B. Liu, Entropy of credibility distributions for fuzzy variables, IEEE Transactions on Fuzzy Systems 16 (1) (2008) 123–129.
[16] B. Liu, Uncertainty Theory, second ed., Springer-Verlag, Berlin, 2007.
[17] B. Liu, Fuzzy process, hybrid process and uncertain process, Journal of Uncertain Systems 2 (1) (2008) 3–16.
[18] B. Liu, Uncertainty Theory: A Branch of Mathematics for Modeling Human Uncertainty, Springer-Verlag, Berlin, 2010.
[19] B. Liu, Some research problems in uncertainty theory, Journal of Uncertain Systems 3 (1) (2009) 3–10.
[20] D. Malyszko, J. Stepaniuk, Adaptive multilevel rough entropy evolutionary thresholding, Information Sciences 180 (7) (2010) 1138–1158.
[21] N. Pal, K. Pal, Object background segmentation using a new definition of entropy, IEE Proceedings – Computers and Digital Techniques 136 (1989) 284–295.
[22] N. Pal, J. Bezdek, Measuring fuzzy uncertainty, IEEE Transactions on Fuzzy Systems 2 (1994) 107–118.
[23] Z. Qin, X. Li, X. Ji, Portfolio selection based on fuzzy cross-entropy, Journal of Computational and Applied Mathematics 228 (1) (2009) 139–149.
[24] R. Rubinstein, D. Kroese, The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning, Springer, Berlin, 2004.
[25] C. Shannon, The Mathematical Theory of Communication, The University of Illinois Press, Urbana, 1949.
[26] V. Vagin, M. Fomina, Problem of knowledge discovery in noisy databases, International Journal of Machine Learning and Cybernetics 2 (3) (2011) 135–145.
[27] L.J. Wang, An improved multiple fuzzy NNC system based on mutual information and fuzzy integral, International Journal of Machine Learning and Cybernetics 2 (1) (2011) 25–36.
[28] X.Z. Wang, L.C. Dong, J.H. Yan, Maximum ambiguity based sample selection in fuzzy decision tree induction, IEEE Transactions on Knowledge and Data Engineering (2011). http://dx.doi.org/10.1109/TKDE.2011.67.
[29] X.Z. Wang, C.R. Dong, Improving generalization of fuzzy if–then rules by maximizing fuzzy entropy, IEEE Transactions on Fuzzy Systems 17 (3) (2009) 556–567.
[30] X.Z. Wang, J.H. Zhai, S.X. Lu, Induction of multiple fuzzy decision trees based on rough set technique, Information Sciences 178 (16) (2008) 3188–3202.
[31] W.G. Yi, M.G. Lu, Z. Liu, Multi-valued attribute and multi-labeled data decision tree algorithm, International Journal of Machine Learning and Cybernetics 2 (2) (2011) 67–74.
[32] H. Xie, R. Zheng, J. Guo, X. Chen, Cross-fuzzy entropy: a new method to test pattern synchrony of bivariate time series, Information Sciences 180 (9) (2010) 1715–1724.
[33] R. Yager, On measures of fuzziness and negation – Part I: Membership in the unit interval, International Journal of General Systems 5 (1979) 221–229.
[34] L. Zadeh, Probability measures of fuzzy events, Journal of Mathematical Analysis and Applications 23 (1968) 421–427.
[35] J.H. Zhai, Fuzzy decision tree based on fuzzy-rough technique, Soft Computing 15 (6) (2011) 1087–1096.
[36] Q. Zhang, S. Jiang, A note on information entropy measures for vague sets and its applications, Information Sciences 178 (21) (2008) 4184–4191.