6
(~ Applied Mathematics Letters PERGAMON Applied Mathematics Letters 13 (2000) 37-42 www.elsevier.nl/locate/;~ml Shannon's Entropy in Exponential Families: Statistical Applications M. L. MENENDEZ Departamento de Matem{~tica Aplicada ETSAM, Universidad Polit~cnica de Madrid 28040, Madrid, Spain (Received and accepted April 1999) Communicated by R. Ahlswede Abstract--In this paper, we study the Shannon's entropy and its applications in the regular exponential models. @ 1999 Elsevier Science Ltd. All rights reserved. Keywords--Shannon's entropy, Asymptotic distribution, Exponential model, Entropy measure, Testing hypotheses. 1. INTRODUCTION Let (2(, ~x, P0)0eo be a probability space, where O is an open convex subset of R M°. We shall assume that there exists a generalized probability density function fo(x) = exp ~T~(x)Oj - b(O) j=l for the distribution ~0 with respect to a a-finite measure #, being b(O) = in f -- exp Tj(x)Oj dp(x) R~tlo strictly convex and (oh(e) ab(e)~ -,-(e) -- k ~ ' " aeMo ] an invertible function. It is clear that the Fisher information matrix for this model is given by (o2b(o) = (or(o)) IF(O) \0o~oo~ ] ~,j=l ..... No \--g0T] i=1 ..... fo This work was supported by Grant DGES PB96-0635. 0893-9659/1999/$ - see front matter. (~) 1999 Elsevier Science Ltd. All rights reserved. Typeset by Afl,~-~ PII: S0893-9659(99)00142-1

Shannon's entropy in exponential families: Statistical applications

Embed Size (px)

Citation preview

( ~ Applied Mathematics Letters

P E R G A M O N Applied Mathematics Letters 13 (2000) 37-42 www.elsevier.nl/locate/;~ml

Shannon's Entropy in Exponential Families: Statistical Applications

M. L. M E N E N D E Z Departamento de Matem{~tica Aplicada

ETSAM, Universidad Polit~cnica de Madrid 28040, Madrid, Spain

(Received and accepted April 1999)

Communicated by R. Ahlswede

A b s t r a c t - - I n this paper, we study the Shannon's entropy and its applications in the regular exponential models. @ 1999 Elsevier Science Ltd. All rights reserved.

K e y w o r d s - - S h a n n o n ' s entropy, Asymptotic distribution, Exponential model, Entropy measure, Testing hypotheses.

1. I N T R O D U C T I O N

Let (2(, ~ x , P0 )0eo be a p robab i l i t y space, where O is an open convex subse t of R M°. We shal l

a s sume t h a t t he re exis ts a genera l ized p robab i l i t y dens i ty funct ion

fo(x) = e x p ~T~(x)Oj - b(O) j = l

for t he d i s t r i b u t i o n ~0 wi th respec t to a a - f in i te measure #, be ing

b(O) = in f - - exp Tj(x)Oj dp(x) R~tlo

s t r i c t l y convex and (oh(e) ab(e)~

-,-(e) -- k ~ ' " aeMo ] an inver t ib le funct ion.

I t is c lear t h a t the F i sher in fo rmat ion m a t r i x for th is mode l is given by

(o2b(o) = (or (o ) ) IF(O)

\0o~oo~ ] ~,j=l ..... No \ - - g 0 T ] i=1 ..... f o

This work was supported by Grant DGES PB96-0635.

0893-9659/1999/$ - see front matter. (~) 1999 Elsevier Science Ltd. All rights reserved. Typeset by Afl,~- ~ PII: S0893-9659(99)00142-1

38 M . L . MENENDEZ

and we shall suppose that it is positive definite in O. If X 1 , . . . , Xn is a random sample we have that

n _1 TMo(Xd On ( X l , . . . , X n ) = 7 " - 1 T l ( X i ) , . . . , n

i=1 i=1

is the a.s. only value of the maximum likelihood estimator. These exponential models satisfying the previous assumptions are called regular exponential models, see [1].

In Section 2, we present the expression of Shannon's entropy for this family as well as some examples. In Section 3, the asymptotic distribution for Shannon's entropy in this family is obtained and finally in Section 4 some statistical applications, as well as an example are presented.

2. S H A N N O N ' S E N T R O P Y

In the following theorem, we obtain an easy expression for Shannon's entropy in the regular exponential models.

THEOREM 2.1. We consider the regular exponential model

fo(x) = exp Tj(x)Oj - b(O) ,

then the expression of Shannon's entropy is given by Mo

g ( o ) = - + b(O).

j = l

P R O O F . W e k n o w t h a t

Mo P P H(O) = - / fo(x)logfo(x)dp(x) = - E Oj / fo(x)Tj(x)d#(x) + b(O).

XJ j = l X d

But

b(O)=ln f exp O~Tj(x) d•(x) x ( ~=1

and Oh(0) = f OOj fo(x)Tj (x) d#(x).

X

Then, we have the desired result.

REMARK 1. If we consider the functional

Kr(O) = / " fo(x)"Tj (x) d#(x) 2(

it is immediate for the regular exponential model that

Kr(0) = exp {-rb(O) + b(rO)}, if rO E O.

This expression is very interesting because it is possible to get the expression of some important entropy measures for the regular exponential model. So we have, for instance, the following.

Ent ropy Expression

H~(0), (see [2])

gr(0), (see [3])

rH(O), (see [4])

g;(0), (see [5])

1 - - lnKr(O) , r ~ O, 1

( 1 - r ) - l ( K r ( O ) - l ) , r ~ l , r > O

(1 -- r) -1 ( ( K 1 / r ( O ) ) r - 1), r ~: 1, r > 0

(l--r) -l((Kr(O)) s - 1 / r - l - l ~ , r # 1, s~: 1, r > 0 \ /

Exponential Families 39

I t is also possible to obta in the expression of Shannon ' s en t ropy as a limit case of the former

entropies

H(8) = l im H~(8) = l im H~(8) = lira ~H(8) = lira H~(8) . r--*l r -~ l r---* 1 r -~l

Now we present some examples .

EXAMPLE {2.2. BETA DISTRIBUTION. In this case, we have

fol,o2 (x) = exp {81 l n x + 02 ln(1 - x) - in B(81 + 1,02 + 1)}

wi th 83 = c ~ - 1, 02 = / 3 - 1. Then

5(81,02) = In B(01 + 1, 02 + 1) = In F(01 + 1) + In F(82 + 1) - In F(01 + 02 + 2)

and Shannon ' s en t ropy is given by

H(ct , ~) = - ( a , - 1) {qJ(c~) - ~(c~ +/3)} - (/3 - 1) {~'(~3) - q*(ct + /3)} + In B(c~,/:~),

Or(x) where ~b(x) - Ox

EXAMPLE 2 .3 . GAMMA DISTRIBUTION. In this case, we have

fol,o2 (x) = exp ( -81x + 82 In x + (82 + 1) In 81 - In F(82 + 1))

with 81 = a and 82 = p - 1. Then

b(81,82) = - (82 + 1)1n81 + ln(82 + 1)

~md Shannon ' s en t ropy is given by

H(a,p) = (1 - p)~b(p) + p + In F(p) a

EXAMPLE 2 .4 . NORMAL DISTRIBUTION. In this case, we have

fo~,o~ = e x p x 8 1 - x 2 0 2 - l n r e - ~ l n 0 2 + 4 0 2 j

with 01 = p/or 2 and 02 = 1/2a 2. T h e n

1 1 012 b( St , 82 ) = ~ I n r e - ~ i n 8 2 + - -

482

and Shannon ' s en t ropy is given by

H ( # , or) = In (2reea 2) 1/2

3 . A S Y M P T O T I C D I S T R I B U T I O N

In this section, we ob ta in the a sympto t i c d is t r ibut ion of Shannon ' s en t ropy for the exponent ia l model when the p a r a m e t e r 8 is replaced for its corresponding M a x i m u m Likelihood Es t ima to r . P rom the popula t ion , a r a n d o m sample of size n is drawn. Let 0 the M L E of the p a r a m e t e r 0 and consider the expression of Shannon ' s en t ropy for the exponent ia l model

Mo

H(O) = - E 8jrj(8) + b(O). j = l

40 M.L. MENENDEZ

A Taylor's expansion of H(0) around 0 yields

Mo H(O) = g(o) + ~ OH(O)

OH(O) ~ 02b(O)

=-j2-11% ~=l J ° ° ' j (°)

and 1 MO Mo 02H(O.)

with I10 - O* II < [I 0 - 01]. It is well known for regular exponential models that

V/-~(O__80 ) n~'--*ooL N(O, IF(8O)_I)

with 80 the true value of the parameter.

It is immediate to establish that v~R,~ P-~ 0. Then n -"-~O0

(n (0) -- n(o0) ) L g (O, TtIF(Oo)-XT),

/0H(0)~ provided TtIF(O)-IT > 0. Hence, we can establish where T t = ( t l , . . . ,tMo) and tk = ~ OOk )0=0o the following theorem.

THEOREM 3.1. For the regular exponential model the following relation holds:

(n (0) - n (8o)) n~£ S (O, TtIF (80) -1 T )

provided Tt IF(Oo)-IT > O.

4. S T A T I S T I C A L A P P L I C A T I O N S

The previous results giving the asymptotic distribution of Shannon's entropy for exponential model in random sampling can be used in various settings to construct confidence intervals and to test statistical hypotheses. Now we present some of them, for those exponential models in which H(8) is one to one.

To test Ho : O = 0o against H1 : 8 ¢ 00, is equivalent to testing H(O) = H(8o), against H(O) H(Oo) then we can use the statistic

- 0,., (0) + ( 0 ) - Z1 = v ~ ~=1 a(80)

which has approximately a standard normal distribution under Ho, for sufficiently large n, and 02(00) = TtIF(Oo)-IT, being T = (ta,... , tMo) and

tk = - Z °°J \-~N)0_-0o j = l

where

Exponential Families 41

We consider two popula t ions with parameters 0 (1) = ( S u , . . . , 81Mo) and 8 (2) = (0~1, . . . , 82Mo), respectively. To test Ho : 8 (1) = 8 (2) against H0 : 8 (1) ~ 8 (2) is equivalent to test ing H ( 0 (1)) =

H(8(2)), against H ( 0 (1)) # H(8 (2)) then we will use the statist ic

~ ~ {02j'rj(O(2))-Olj'rj(O(1))+(i/Mo)(b(O(2))-b(O(1)))} j = l

Z2 = IT t2 (7 (8(1) )2"~- ~tlO" (8(1)) 2

which converges in dis t r ibut ion to a s tandard normal distribution.

Finally, if we consider r populat ions with parameters 0 (i) = ( 8 i l , . . . , 0iMo) i = 1 . . . . . 'r, respec-

t ively and we want to test 0 (1) = 8 (2) . . . . . 8("), against 0 (i) ~ O(J) for some i ¢ j is equivalent

to tes t ing H ( 8 (1)) = H(8 (2)) . . . . H(8(")), against H(8 (i)) # H(8(J)) for some i ~ j then we will

use the statist ic

~ k i Mo Mo

i=l j = l i=l j = l

r where ki = (n.i/a(8(i)) 2) and k = Y~i=l ki. Z3 converges in dis tr ibut ion to a chi-square distribu-

t ion with r - 1 degrees of freedom.

Now we present an example as an application of testing.

EXAMPLE 4.1. The following da t a are a r andom sample of size 30 from a G a m m a distr ibution.

0.27670 0.04626 0.72408 0.73261 1.07326

0.52424 0.55992 0.12328 0.53696 1.31656

0.26827 1.11829 0.15717 0.79090 0.63257

0.37857 0.45168 0.74294 0.26924 0.50337

0.17408 1.11554 0.17436 0.41877 0.59942

0.31044 0.43031 0.57359 1.52558 0.75491

We want to test the null hypothesis t ha t the parameters of the G a m m a dis tr ibut ion from which

da t a come are p = 2, a = 3. Then we consider the hypothesis test ing problem

H 0 : p = 2 , a = 3 against H 0 : p ¢ 2 , a # 3 .

To solve this problem we can use the statist ic Z 1 given above. The MLE for the parameter can

be obta ined numerical ly as the solution of the equat ion

F(&---'--y - in ~ - In ~ = 0

after which one compu te the ML E for the parameter p, as lJ = X /& where X = 1/30 ~t=3° 1 Xi.

In our case, /} = (&,iS) = (3.9,2.42). Then we have H(0) = 1.1238, H(O0) = H ( 2 , 3 ) =- 0.4786,

and a(S0) = 2.35 with 80 = (2, 3). The expression of a(S0) is obtained from Theorem 1 and its

expression for this case is given by

o.2(8o ) = 5(p) + p[1 - (p - 1)5(p)] 2 + 211 - (p - 1)(5(p)] p(5(p) - 1

where

for p = 2. is 0.13.

~ ' ( p ) r ( ; ) - ( ~ (p ) )2 5(p) = r (p )

Then the value of the corresponding statist ic is 1.50, and the corresponding p-values

42 M . L . MENENDEZ

R E F E R E N C E S 1. L.D. Brown, Fundamental of statistical exponential families, In Lecture Notes, Volume 9, Inst. of Mathemat-

ical Statistics, Hayward, CA, (1986). 2. A. R~nyi, On measures of entropy and information, Proceeding of the Fourth Berkeley Symposium on Math-

ematical Statistics and Probability, Volume 1, pp. 547-561, Berkeley University Press, Berkeley, CA, (1961). 3. J. Havrda and F. Charvat, Concept of structural c~-entropy, Kybernetika 3, 30-35, (1967). 4. S. Arimoto, Information-theoretical considerations on estimation problems, Information and Control 19,

181-194, (1971). 5. B.D. Sharma and D.P. Mittal, New non-additive measures of relative information, Journal of Combinatory

and Systems Science 2, 122-133, (1975).