Measuring relative equality of concentration between different income/wealth distributions

Quentin L Burrell

Isle of Man International Business School

The Nunnery

Old Castletown Road

Douglas

Isle of Man IM2 1QB

via United Kingdom

Email: [email protected]

Submitted to the Organising Committee of the “International Conference to

commemorate Gini and Lorenz”, University of Siena, Italy, 23-26 May, 2005.

Abstract

In a recent paper Burrell (2005a) introduced two new measures - both based on the Gini

mean difference - for measuring the similarity of concentration of productivity between

different informetric distributions. The first was derived from Dagum’s (1987) notion of

relative economic affluence (REA); the second – in some ways analogous to the

correlation coefficient – is a new approach giving the so-called co-concentration

coefficient (C-CC). Models and methods adopted in the field of informetrics – very

roughly, the “metrics” aspects of “information systems” - are very often based upon, or

have direct analogues with, or at very least are inspired by, ones from econometrics and it

is the purpose of this paper to suggest ways in which these new measures of similarity

could be useful in, for instance, studies of income distributions (a) between different

countries and (b) over different periods of time. The measures are illustrated using

exponential, Pareto, Weibull and Singh-Maddala distributions.

1. Introduction

One of the most intuitively reasonable requirements of a measure of concentration of

income/wealth within a population is that it should be invariant under scale

transformations – the degree of inequality should be the same if incomes are measured in

€ or US$. The situation is rather different if we are comparing inequalities between

populations. For instance, suppose we have two populations whose unit income

distributions are identical, but with one measured in €, the other in US$. Then their

“within population” concentration measures will be the same and yet there is clearly a

difference “between populations” since if both were expressed in the same units, we

would have two populations with different degrees of affluence.

Dagum (1987) sought to address this problem by introducing a measure of relative

economic affluence (REA) based upon the Gini mean difference, defined as the average

absolute difference between incomes of (randomly chosen) members of the two

populations. It turns out that the REAs of two income distributions are the same if and

only if the means of the two distributions are the same. The first of the new measures

proposed by Burrell (2005a) is a simple adaptation of the REA; the second incorporates

the Gini mean difference and the Gini coefficients of the two populations separately to

give a normalised measure – somewhat analogous to the correlation coefficient – lying

between 0 and 1, with the upper value being achieved if and only if the two population

income distributions (measured in the same units) are identical. Although we will talk in

terms of income distributions, the notions clearly extend to other fields.

2. Basic definitions

We imagine a population of individuals and let X denote the income of a randomly

chosen individual. Suppose that the distribution of X in the population is given by the (absolutely) continuous probability density function (pdf) $f_X(x)$ defined on the non-negative real line. (Restricting attention to the absolutely continuous case is done purely for simplicity of presentation.)

Notation.

(i) Mean of X: $\mu_X = E[X] = \int_0^\infty x f_X(x)\,dx$

(ii) Tail distribution function of X: $\Phi_X(x) = P(X \ge x) = \int_x^\infty f_X(y)\,dy$

At this early stage, let us recall that for a continuous non-negative random variable X we have

$$E[X] = \int_0^\infty \Phi_X(x)\,dx = \mu_X \qquad (1)$$

(see, for instance, Stirzaker (1994, p238)). Without further comment we will always

assume that the mean is finite. There are many different approaches to the measurement

of concentration/inequality of (income) distributions and we refer the reader to Lambert

(2001) and Kleiber & Kotz (2003) for further discussion. See also Egghe & Rousseau

(1990). Notwithstanding the opinion that “… the overemphasis – bordering on obsession

– on the Gini coefficient as the measure of income inequality … is an unhealthy and

possibly misleading development” (Kleiber & Kotz, 2003, p30), the Gini coefficient is

the cornerstone of our analysis.

Definition 1. The Gini coefficient/index/ratio for X is defined formally as

$$\gamma_X = \frac{E[|X_1 - X_2|]}{2\mu_X},$$

where $X_1$ and $X_2$ are independent copies of X.

(See the previous references, among many others, as well as the original presentation by

Gini (1914).)

The idea behind the definition is that we look at each pair of individuals within the

population in turn, find the absolute difference between their incomes and then average

out over all possible pairs. For purposes of calculation, the above definition is not very

convenient. Of the many others available (see Yitzhaki, 1998), the one that best suits our

purposes is given in the following:

Proposition 1.

$$\gamma_X = 1 - \frac{\int_0^\infty \Phi_X(x)^2\,dx}{\mu_X} \qquad (2)$$

According to Kleiber & Kotz (2003), this is originally due to Arnold & Laguna (1977)

“at least in the non-Italian literature”. It was independently rediscovered by Dorfman

(1979) in economics and by Burrell (1991, 1992a) in informetrics.
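As a numerical illustration of (2), the following Python sketch evaluates the Gini coefficient of an exponential distribution by Simpson's rule. The helper names (`integrate`, `gini_from_tail`, `exponential_tail`) and the truncation of the integrals at a finite upper limit are assumptions made for this illustration and are not taken from the paper; the mean is obtained from the tail function via (1).

```python
import math

def integrate(f, upper, n=20000):
    """Composite Simpson rule for the integral of f over [0, upper] (n must be even)."""
    h = upper / n
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def gini_from_tail(tail, upper):
    """Gini coefficient via (2): 1 - int(Phi^2) / mu, with mu = int(Phi) from (1)."""
    mu = integrate(tail, upper)
    return 1.0 - integrate(lambda x: tail(x) ** 2, upper) / mu

def exponential_tail(mean):
    """Tail distribution function of an exponential income distribution."""
    return lambda x: math.exp(-x / mean)

# The upper limit is scaled with the mean so that the neglected tail is negligible.
for mean in (1.0, 5.0):
    g = gini_from_tail(exponential_tail(mean), upper=50 * mean)
    print(f"mean = {mean}: Gini = {g:.6f}")
```

Both means should give a value close to 1/2, which illustrates the scale invariance of the Gini coefficient within a population.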

The Gini coefficient is usually held to be one of the, if not the, best inequality measures

in that it obeys all seven of the “desirable” properties proposed by Dalton (1920) for such

a measure, see Dagum (1983). (But note the previously quoted counter opinion of Kleiber

& Kotz.) One of these properties is that it is invariant under scale, or is independent of

the unit of measurement. This is clearly almost a necessary property in measuring

inequality within a population. However, this property should not necessarily carry over

to comparative studies of inequality between populations if different units of income are

used. As an example, suppose that the income in two populations each follows an

exponential distribution, measured in the same units but with different means. Then

clearly the general level of income is greater in the population having the greater mean.

On the other hand, since the exponential is a scale-parameter family, the Gini coefficient

for each will be the same. (Indeed, the Gini coefficient for an exponential distribution is

½, see, e.g. Burrell (1992a).) Hence reliance on standard measurements of inequality such

as the Gini coefficient is inappropriate for measuring relative inequality between

populations. Instead, we follow Dagum (1987) to extend the Gini coefficient to become a

measure of the overall inequality of income between two populations.

Aside. Within the field of income/wealth distributions, the standard graphical

representation of inequality is via the Lorenz curve where one arranges individuals in

increasing order of income. In informetrics the focus is usually upon the “most

productive sources” (= “richest individuals”) in which case it is natural to arrange

individuals in decreasing order of “income” so that what is plotted is the tail distribution

function against the tail-moment distribution function. In the informetrics literature this

graphical representation is sometimes known as the Leimkuhler curve, see Burrell (1991,

1992a). The geometric relationship with the Lorenz curve is immediate, as is the

geometric interpretation of the Gini coefficient via the Leimkuhler curve, see Burrell

(1991).

Let us denote the income of a randomly chosen individual from each population by X, Y,

respectively. The mean and tail distribution function of X are defined as before and denoted $\mu_X$, $\Phi_X$ respectively, and similarly for the Y population. The idea behind the

construction of the Gini ratio between the two populations is exactly analogous to that of

the Gini coefficient for a single population, namely we look at pairs of individuals, but

now one from each population, find the absolute difference between their incomes,

measured in the same monetary units, and average this difference over all possible pairs.

Thus we have the following:

Definition 2. The Gini ratio between the two populations, denoted by $G(X,Y)$, is given by

$$G(X,Y) = \frac{E[|X - Y|]}{\mu_X + \mu_Y},$$

where X, Y are independent.

(The numerator of the above expression is what is referred to as the Gini mean difference,

Dagum (1987).)

The analogy with Definition 1 is clear. Indeed, the Gini coefficient becomes a special

case since if X and Y have the same distribution then $G(X,Y) = G(X,X) = \gamma_X$ so that the

(comparative) Gini ratio becomes the (single population) Gini coefficient. Note also that

$0 \le G(X,Y) < 1$, where we can get $G \to 1$ as a limiting case, see Dagum (1987).

Again, for purposes of calculation, the defining formula for the Gini ratio is not very

convenient so that we make use of the following:

Theorem 1. With the above notation:

$$G(X,Y) = 1 - \frac{2\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx}{\mu_X + \mu_Y} \qquad (3)$$

Proof. See Burrell (2005a), reproduced here in Appendix 1.
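The agreement between Definition 2 and formula (3) can be checked on a simple case. The sketch below assumes two exponential income distributions, for which the integral of the product of the tail functions is available in closed form, and compares (3) with a seeded simulation of E|X - Y|. The function names, the sample size and the seed are illustrative choices and do not come from the paper.

```python
import random

def gini_ratio_exponential(mean_x, mean_y):
    """G(X,Y) from (3) for exponential incomes.

    Phi_X(x) * Phi_Y(x) = exp(-x * (1/mean_x + 1/mean_y)), whose integral
    over [0, inf) is mean_x * mean_y / (mean_x + mean_y).
    """
    integral = mean_x * mean_y / (mean_x + mean_y)
    return 1.0 - 2.0 * integral / (mean_x + mean_y)

def gini_ratio_simulated(mean_x, mean_y, n=200_000, seed=12345):
    """G(X,Y) from Definition 2, with E|X - Y| estimated by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(1.0 / mean_x)  # expovariate takes the rate 1/mean
        y = rng.expovariate(1.0 / mean_y)
        total += abs(x - y)
    return (total / n) / (mean_x + mean_y)

exact = gini_ratio_exponential(1.0, 3.0)   # 1 - 2 * (3/4) / 4 = 0.625
approx = gini_ratio_simulated(1.0, 3.0)
print(f"formula (3): {exact:.4f}, simulated Definition 2: {approx:.4f}")
```

With equal means the closed form reduces to 1/2, the Gini coefficient of a single exponential population, in line with the remark that G(X,X) equals the Gini coefficient of X.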

3. Normalized measures.

I: The relative concentration coefficient

In the paper in which he introduced the Gini ratio, Dagum (1987) proposed the notion of

relative economic affluence. We briefly recap Dagum’s approach, but modifying some of

his notation and terminology.

As described earlier, the Gini ratio is derived from the average absolute difference in

income between the X-population and the Y-population. Dagum’s relative measure

results from splitting this difference into two components: the average excess income of

members of the X-population over less affluent Y-sources, which we denote by p(X,Y), or $p_1$ in Dagum’s notation; and the average excess income of Y-sources over less affluent X-sources, denoted p(Y,X), or Dagum’s $d_1$. (These two components are in fact those considered in the proof of Theorem 1 in the Appendix.)

In Burrell (2005a) we suggested a (relative) concentration coefficient between the X and

Y populations defined as D(X,Y) = p(X,Y)/p(Y,X) assuming, without loss of generality,

that E[X] ≤ E[Y]. This is just one minus the relative economic affluence defined by

Dagum (1987, Definition 7). An alternative representation of D(X, Y) is given in the

following:

Proposition 2. Assuming, without loss of generality, that E[Y] ≥ E[X],

$$D(X,Y) = \frac{\mu_X - \int \Phi_X\Phi_Y}{\mu_Y - \int \Phi_X\Phi_Y} = \frac{\int \Phi_X(1 - \Phi_Y)}{\int \Phi_Y(1 - \Phi_X)}$$

Proof. See Appendix 2.

Corollary.

$0 < D(X,Y) \le \mu_X/\mu_Y \le 1$, and D(X,Y) = 1 if and only if E[X] = E[Y].

The proof is immediate, but see Burrell (2005a) for details.

Thus D(X, Y) is normalized in that it lies between 0 and 1 and the upper bound is

achieved if and only if the two means are the same. However, if the means are not the

same then the upper bound is given by the ratio of the means and this leads us to make

the following

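Definition 3. The (modified) relative concentration coefficient is

$$D^*(X,Y) = \frac{\mu_Y}{\mu_X}\,D(X,Y) = \frac{1 - \left(\int \Phi_X\Phi_Y\right)/\mu_X}{1 - \left(\int \Phi_X\Phi_Y\right)/\mu_Y} \qquad (4)$$

where, without loss of generality, E[Y] > E[X].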

II: The coefficient of co-concentration

Note that although the Gini ratio already gives some sort of measure of the degree of

similarity/dissimilarity between two income distributions so far as their

concentration/inequality is concerned, it is not very informative on its own. One problem

is that the ratio is minimised when the two distributions are the same whereas we would

like a comparative measure to be maximised in this situation. This is easily resolved if we

make the following:

Definition 4.

$$\theta_X = 1 - \gamma_X = \frac{\int_0^\infty \Phi_X(x)^2\,dx}{\mu_X} \qquad (5)$$

= coefficient of equality within the distribution of X

and

$$H(X,Y) = 1 - G(X,Y) = \frac{2\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx}{\mu_X + \mu_Y} \qquad (6)$$

= equality ratio between the distributions of X and Y.

Note that both $\theta_X$ and H(X,Y) lie between 0 and 1 and that $\theta_X = 1$ corresponds to the case

where all individuals have the same income. If all individuals across both populations

have equal income, then H(X,Y) = 1. Also, zero values can only be achieved via a

limiting process so that in practice both may be taken as being strictly greater than zero.

We can now construct a new measure that focuses on the degree of equality rather than

inequality of concentration between the two populations.

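Definition 5. The coefficient of co-concentration or co-concentration coefficient (C-CC) is given by

$$Q(X,Y) = \frac{H(X,Y)}{\sqrt{\theta_X\theta_Y}} = \frac{1 - G(X,Y)}{\sqrt{(1-\gamma_X)(1-\gamma_Y)}} \qquad (7)$$

$$= \frac{2\sqrt{\mu_X\mu_Y}}{\mu_X+\mu_Y}\cdot\frac{\int \Phi_X\Phi_Y}{\sqrt{\left(\int\Phi_X^2\right)\left(\int\Phi_Y^2\right)}} \qquad (8)$$

where the representation (8) follows from (7) together with (5) and (6).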

The following shows that this is a standardised measure and is (joint) scale invariant:

Theorem 2.

(i) $0 < Q(X,Y) \le 1$.

(ii) Q(X,Y) = 1 if and only if the two distributions are the same.

(iii) Q(kX, kY) = Q(X, Y) for any constant k > 0.

Proof. See Appendix 3.

Note. The equality of distributions required for the co-concentration coefficient to

achieve its upper bound is a much stronger condition than the equality of means required

in the case of the relative concentration coefficient and, we would argue, a more natural

requirement.
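The three properties in Theorem 2 can be illustrated numerically. The sketch below evaluates representation (8) by Simpson's rule for Weibull tail functions with index 2. The function names, the choice of distributions and the finite upper integration limits are assumptions made for illustration only.

```python
import math

def integrate(f, upper, n=20000):
    """Composite Simpson rule for the integral of f over [0, upper] (n must be even)."""
    h = upper / n
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def co_concentration(tail_x, tail_y, upper):
    """Q(X,Y) via (8), with every integral truncated to [0, upper]."""
    mu_x = integrate(tail_x, upper)
    mu_y = integrate(tail_y, upper)
    cross = integrate(lambda x: tail_x(x) * tail_y(x), upper)
    sq_x = integrate(lambda x: tail_x(x) ** 2, upper)
    sq_y = integrate(lambda x: tail_y(x) ** 2, upper)
    scale_term = 2.0 * math.sqrt(mu_x * mu_y) / (mu_x + mu_y)
    return scale_term * cross / math.sqrt(sq_x * sq_y)

def weibull_tail(alpha, beta):
    """Tail distribution function exp[-(x/beta)^alpha]."""
    return lambda x: math.exp(-((x / beta) ** alpha))

x_tail, y_tail = weibull_tail(2.0, 1.0), weibull_tail(2.0, 3.0)
q_same = co_concentration(x_tail, x_tail, 40.0)   # identical distributions
q_diff = co_concentration(x_tail, y_tail, 40.0)   # same index, different scale
# Joint rescaling by k = 10; the integration limit is rescaled as well.
q_scaled = co_concentration(weibull_tail(2.0, 10.0), weibull_tail(2.0, 30.0), 400.0)
print(q_same, q_diff, q_scaled)
```

Under these assumptions `q_same` should equal 1 up to rounding, `q_diff` should lie strictly between 0 and 1, and `q_scaled` should agree with `q_diff`.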

4. Some theoretical examples

In Burrell (2005a), simple examples for Exponential and Pareto distributions were

considered. Here we look at two rather more substantial cases.

(a) Weibull distribution

Suppose that X ~ Wei(α, β), i.e. X has a Weibull distribution with index α and scale

parameter β, so that the tail distribution function of X is given by

$$\Phi_X(x) = P(X > x) = \exp[-(x/\beta)^{\alpha}]$$

The mean of the distribution is well known to be

$$\mu_X = E[X] = \beta\,\Gamma(1 + 1/\alpha).$$

This results from, or can be viewed as providing, the useful identity

$$\int_0^\infty \exp[-(x/\beta)^{\alpha}]\,dx = \beta\,\Gamma(1 + 1/\alpha) \qquad (9)$$

Noting that $\Phi_X(x)^2 = \exp[-2(x/\beta)^{\alpha}] = \exp[-(x/\lambda)^{\alpha}]$, where $\lambda = \beta/2^{1/\alpha}$, we can use the identity (9) to straight away write

$$\int_0^\infty \Phi_X(x)^2\,dx = \frac{\beta}{2^{1/\alpha}}\,\Gamma(1 + 1/\alpha).$$

It then follows that the Gini coefficient, using (2), is $\gamma_X = 1 - 2^{-1/\alpha}$.

This is, of course, a well-known result; see e.g. Kleiber & Kotz (2003, p177) for an

alternative derivation.

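Similarly, if X ~ Wei(α, β1) and Y ~ Wei(α, β2) then

$$\Phi_X(x)\Phi_Y(x) = \exp[-(x/\beta_1)^{\alpha} - (x/\beta_2)^{\alpha}] = \exp\left[-x^{\alpha}\left(\frac{1}{\beta_1^{\alpha}} + \frac{1}{\beta_2^{\alpha}}\right)\right] = \exp[-(x/\lambda)^{\alpha}],$$

where now $\lambda = \beta_1\beta_2/(\beta_1^{\alpha} + \beta_2^{\alpha})^{1/\alpha}$. It then follows from the identity (9) that

$$\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx = \frac{\beta_1\beta_2}{(\beta_1^{\alpha} + \beta_2^{\alpha})^{1/\alpha}}\,\Gamma(1 + 1/\alpha)$$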

Of course, the above derivation can be much simplified if we recall that the Weibull

parameter β is a scale parameter and that the measures we are considering are (joint)

scale invariant, see Theorem 2(iii). For instance, notice how λ in the above depends only

on the ratio of the two β-values. Hence there is no loss in assuming that, say, β1 = β and

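β2 = 1 throughout. With this assumption we find the equality ratio (Definition 4) as

$$H(X,Y) = \frac{2\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx}{\mu_X + \mu_Y} = \frac{2\,\dfrac{\beta}{(1+\beta^{\alpha})^{1/\alpha}}\,\Gamma(1 + 1/\alpha)}{(1+\beta)\,\Gamma(1 + 1/\alpha)} = \frac{2\beta}{(1+\beta)(1+\beta^{\alpha})^{1/\alpha}}$$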

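Also, the product of the coefficients of equality is $\theta_X\theta_Y = 2^{-2/\alpha}$ so that the co-concentration coefficient is

$$Q(X,Y) = \frac{H(X,Y)}{\sqrt{\theta_X\theta_Y}} = \frac{2\beta}{(1+\beta)(1+\beta^{\alpha})^{1/\alpha}}\cdot 2^{1/\alpha} = \frac{2^{1+1/\alpha}\,\beta}{(1+\beta)(1+\beta^{\alpha})^{1/\alpha}}$$

Note the particular case where α = 1 gives the C-CC for the exponential distribution as

$$Q(X,Y) = \frac{4\beta}{(1+\beta)^2},$$

see Burrell (2005a).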

Remark. In practice we have found that the graph of Q(X, Y) is fairly flat and so we recommend using its square for both illustrative and analytic purposes. (This is analogous to using the $R^2$ measure, or coefficient of determination, rather than the basic correlation coefficient in correlation studies.)

The graph of $Q^2$ as a function of the scaling ratio β is given in Figure 1 for various values of α. Note how in each case the peak, where $Q^2 = 1$, occurs when the scale ratio β = 1,

which corresponds to the two distributions coinciding.
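The behaviour described for Figure 1 can be reproduced in outline by tabulating the squared coefficient. The sketch below uses the Weibull closed form Q = 2^(1+1/α) β / [(1+β)(1+β^α)^(1/α)] derived above; the function name `weibull_q` and the grid of β values are illustrative assumptions.

```python
def weibull_q(alpha, beta):
    """C-CC for Wei(alpha, beta) against Wei(alpha, 1), from the closed form."""
    numerator = 2.0 ** (1.0 + 1.0 / alpha) * beta
    denominator = (1.0 + beta) * (1.0 + beta ** alpha) ** (1.0 / alpha)
    return numerator / denominator

betas = [0.25, 0.5, 1.0, 2.0, 4.0]
for alpha in (0.5, 2.0, 5.0):
    row = ", ".join(f"{weibull_q(alpha, b) ** 2:.3f}" for b in betas)
    print(f"alpha = {alpha}: Q^2 at beta = {betas}: {row}")
```

Each row should peak at β = 1 and take equal values at β and 1/β, since interchanging the two populations only rescales both jointly.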

******************* Insert Figure 1 about here ****************************

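For the modified relative concentration coefficient, using the formula (4) and the results above, routine algebra gives

$$D^*(X,Y) = \frac{(1+\beta^{\alpha})^{1/\alpha} - 1}{(1+\beta^{\alpha})^{1/\alpha} - \beta} \quad \text{if } \beta \le 1,$$

$$D^*(X,Y) = \frac{(1+\beta^{\alpha})^{1/\alpha} - \beta}{(1+\beta^{\alpha})^{1/\alpha} - 1} \quad \text{if } \beta > 1.$$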

This is illustrated in Figure 2 over the same range and for the same values of α. Note the

severely peaked nature of the graphs around β = 1 for α > 1. The differences between the

general forms of the graphs reflect the different emphases of the two measures in

assessing differences/similarities in concentrations.
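The "routine algebra" behind the Weibull form of D* can be checked by evaluating formula (4) directly from the three integrals involved. In the sketch below the function names are hypothetical, and the second function relabels the populations so that the smaller mean plays the role of the X-population, as (4) requires.

```python
import math

def weibull_d_star(alpha, beta):
    """Closed form of D* for Wei(alpha, beta) against Wei(alpha, 1)."""
    r = (1.0 + beta ** alpha) ** (1.0 / alpha)
    if beta <= 1.0:
        return (r - 1.0) / (r - beta)
    return (r - beta) / (r - 1.0)

def d_star_from_integrals(mu_x, mu_y, cross):
    """Formula (4), where cross is the integral of Phi_X * Phi_Y."""
    smaller, larger = min(mu_x, mu_y), max(mu_x, mu_y)
    return (1.0 - cross / smaller) / (1.0 - cross / larger)

alpha, beta = 2.0, 0.5
gamma_term = math.gamma(1.0 + 1.0 / alpha)
mu_x, mu_y = beta * gamma_term, gamma_term                        # Weibull means
cross = beta / (1.0 + beta ** alpha) ** (1.0 / alpha) * gamma_term  # via identity (9)
print(weibull_d_star(alpha, beta), d_star_from_integrals(mu_x, mu_y, cross))
```

The two printed values should agree. For α = 1 the closed form reduces to D* = β when β ≤ 1, which is simply the ratio of the two means.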

************************* Insert Figure 2 about here *********************

(b) Singh-Maddala distribution

The Singh-Maddala (1975, 1976) distribution is a very flexible three-parameter income

distribution model. (See Kleiber & Kotz (2003) for a concise treatment of its various

attributes.) For our purposes, an attractive feature is the simple form of its tail distribution

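function. Indeed, adopting the Kleiber & Kotz notation, if X ~ SM(a, b, q) then

$$\Phi_X(x) = \left[1 + (x/b)^{a}\right]^{-q}$$

and hence, as before,

$$\mu_X = E[X] = \int_0^\infty \Phi_X(x)\,dx = \int_0^\infty \left[1 + (x/b)^{a}\right]^{-q}dx = \frac{b\,\Gamma(1 + 1/a)\,\Gamma(q - 1/a)}{\Gamma(q)} \qquad (10)$$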

The final equality of (10) provides our “useful identity”. (Note that we have merely

quoted the expression for the mean of the distribution, see Kleiber & Kotz (2003, p201).)

Clearly, then

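$$\int_0^\infty \Phi_X(x)^2\,dx = \int_0^\infty \left[1 + (x/b)^{a}\right]^{-2q}dx = \frac{b\,\Gamma(1 + 1/a)\,\Gamma(2q - 1/a)}{\Gamma(2q)}$$

using (10), so that the coefficient of equality is

$$\theta_X = \frac{\Gamma(q)\,\Gamma(2q - 1/a)}{\Gamma(2q)\,\Gamma(q - 1/a)}$$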

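Also if X ~ SM(a, b, q) and Y ~ SM(a, b, p) then, using the same “identity” in (10),

$$\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx = \int_0^\infty \left[1 + (x/b)^{a}\right]^{-(q+p)}dx = \frac{b\,\Gamma(1 + 1/a)\,\Gamma(q + p - 1/a)}{\Gamma(q + p)}$$

and the equality ratio is then

$$H(X,Y) = \frac{2\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx}{\mu_X + \mu_Y} = \frac{2\,\Gamma(q + p - 1/a)}{\Gamma(q + p)}\left[\frac{\Gamma(q - 1/a)}{\Gamma(q)} + \frac{\Gamma(p - 1/a)}{\Gamma(p)}\right]^{-1}$$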

Note that both the coefficient of equality and the equality ratio do not involve the

parameter b, as should be expected since it is a scale parameter for the SM distribution.

From the above, it is clearly straightforward to derive a general expression for the C-CC

although it is rather cumbersome and not too enlightening. However, certain special cases

simplify matters greatly. For instance, if we take a = 1 then we find after a little algebra

that

$$Q(X,Y) = \frac{2\sqrt{(q-1)(2q-1)(p-1)(2p-1)}}{(q+p-1)(q+p-2)}$$

It is now straightforward to plot this as a function of p > 1 for any value of q > 1 (to

ensure finite means). This is illustrated in Figure 3, again using the squared form of the

function. Notice that here we get $Q^2 = 1$ when p = q, again corresponding to the two

distributions coinciding.
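The general Singh-Maddala expressions can be evaluated with the standard-library gamma function, which gives a direct check on the a = 1 special case. In the sketch below the function names are hypothetical, the scale parameter b is omitted because it cancels, and the a = 1 closed form is written with the factor 2 and the square root that follow from Q = H / sqrt(θ_X θ_Y).

```python
import math

def sm_equality_coefficient(a, q):
    """theta_X for SM(a, b, q); needs q > 1/a so that the mean is finite."""
    return (math.gamma(q) * math.gamma(2 * q - 1 / a)
            / (math.gamma(2 * q) * math.gamma(q - 1 / a)))

def sm_equality_ratio(a, q, p):
    """H(X,Y) for X ~ SM(a, b, q) and Y ~ SM(a, b, p)."""
    cross = math.gamma(q + p - 1 / a) / math.gamma(q + p)
    means = (math.gamma(q - 1 / a) / math.gamma(q)
             + math.gamma(p - 1 / a) / math.gamma(p))
    return 2.0 * cross / means

def sm_q(a, q, p):
    """C-CC as H / sqrt(theta_X * theta_Y)."""
    theta_product = sm_equality_coefficient(a, q) * sm_equality_coefficient(a, p)
    return sm_equality_ratio(a, q, p) / math.sqrt(theta_product)

def sm_q_a1(q, p):
    """Closed form of the C-CC in the special case a = 1 (q, p > 1)."""
    root = math.sqrt((q - 1) * (2 * q - 1) * (p - 1) * (2 * p - 1))
    return 2.0 * root / ((q + p - 1) * (q + p - 2))

for q, p in [(2.0, 3.0), (1.5, 4.0), (3.0, 3.0)]:
    print(q, p, sm_q(1.0, q, p), sm_q_a1(q, p))
```

The last two columns should coincide for every pair, and both should equal 1 when p = q.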

***************************** Insert Figure 3 about here *********************

For the modified relative concentration coefficient, rather than give the general form let

us just stay with the special case where a = 1 as considered above. Routine calculation

leads, for a given value of q, to

D* = p/q if p < q, D* = q/p if p > q.

Thus the graph of D* is linear in p for p < q and is proportional to 1/p for p > q. See

Figure 4 and again compare with the corresponding Figure 3 for the $Q^2$ measure. Our

view is that, once again, D* is rather harsh in distinguishing between the distributions,

which is not surprising given that it hinges on the mean rather than the overall

distribution.
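The piecewise form of D* for a = 1 can be recovered from formula (4) in a few lines. The sketch below takes b = 1 (the scale parameter cancels), uses the means 1/(q-1) and 1/(p-1) and the cross integral 1/(q+p-1) obtained from (10) with a = 1; the function name is hypothetical.

```python
def sm_d_star_a1(q, p):
    """D* for SM(1, b, q) against SM(1, b, p) via formula (4); needs q, p > 1."""
    mu_q, mu_p = 1.0 / (q - 1.0), 1.0 / (p - 1.0)   # means with b = 1
    cross = 1.0 / (q + p - 1.0)                      # integral of Phi_X * Phi_Y
    # Formula (4) assumes the X-population has the smaller mean, so relabel.
    smaller, larger = min(mu_q, mu_p), max(mu_q, mu_p)
    return (1.0 - cross / smaller) / (1.0 - cross / larger)

q = 2.0
for p in (1.5, 2.0, 3.0, 6.0):
    print(p, sm_d_star_a1(q, p), min(p / q, q / p))
```

The second and third columns should agree, showing the linear rise for p < q and the 1/p decay for p > q.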

****************** Insert Figure 4 about here *******************************

Remark. The reason that the Singh-Maddala example works so neatly in the calculation of the C-C coefficient above is that the tail distribution function is of the form $\Phi(x) = g(x)^{\alpha}$, where α is the sole parameter of interest. This then gives the identity $\int \Phi = \int g^{\alpha} = \mu(\alpha)$, say, and then $\int \Phi^2 = \int g^{2\alpha} = \mu(2\alpha)$. Also if X and Y belong to the same parametric family, with parameter values α, β respectively, then

$$\int \Phi_X\Phi_Y = \int g^{\alpha+\beta} = \mu(\alpha+\beta)$$

Substituting into (8) then gives the C-C coefficient as

$$Q(X,Y) = \frac{2\,\mu(\alpha+\beta)\sqrt{\mu(\alpha)\mu(\beta)}}{\left(\mu(\alpha)+\mu(\beta)\right)\sqrt{\mu(2\alpha)\mu(2\beta)}}$$

Use of this formula allows us to straight away write down the C-CC for such as the

exponential and Pareto distributions as well as the Singh-Maddala considered here.
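The formula in the Remark translates into a single generic function that takes the mean function μ(·) of the family. The sketch below applies it to three families; the function names, the parameterisation of the exponential by its rate, and the choice of unit scale for the Pareto and Singh-Maddala cases are assumptions made for illustration.

```python
import math

def q_from_mu(mu, alpha, beta):
    """C-CC when Phi(x) = g(x)**t and mu(t) is the integral of g(x)**t over [0, inf)."""
    numerator = 2.0 * mu(alpha + beta) * math.sqrt(mu(alpha) * mu(beta))
    denominator = (mu(alpha) + mu(beta)) * math.sqrt(mu(2 * alpha) * mu(2 * beta))
    return numerator / denominator

def exponential_mu(t):
    """Exponential with rate t: Phi(x) = exp(-t x), g(x) = exp(-x), mu(t) = 1/t."""
    return 1.0 / t

def pareto_mu(t):
    """Pareto on [1, inf): Phi(x) = min(1, 1/x)**t, mu(t) = t/(t - 1) for t > 1."""
    return t / (t - 1.0)

def singh_maddala_mu(t):
    """Singh-Maddala with a = b = 1: Phi(x) = (1 + x)**(-t), mu(t) = 1/(t - 1), t > 1."""
    return 1.0 / (t - 1.0)

print(q_from_mu(exponential_mu, 1.0, 2.0))      # rates 1 and 2
print(q_from_mu(pareto_mu, 2.0, 3.0))           # Pareto indices 2 and 3
print(q_from_mu(singh_maddala_mu, 2.0, 3.0))    # q = 2, p = 3
```

For the exponential case this reproduces 4β/(1+β)^2 with β the ratio of the two means, and for the Singh-Maddala case it reproduces the a = 1 expression given earlier.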

5. Concluding remarks.

In this paper we have merely defined and given some simple examples of the $Q^2$ and D*

measures and have not made any investigation of their statistical properties, although we

hope to have convinced the reader of the superiority of $Q^2$ as the more subtle measure.

Nor have we considered empirical applications, though there are several possibilities,

including:

• Within informetrics, examples of comparative studies over several data sets have

been given in Burrell (2005b), leading to a so-called co-concentration matrix. The

analogous treatments of income/wealth distributions for different countries or for

the same country during different years, maybe in “real” terms, are obvious

applications.

• It would seem that it could also be used in investigative studies to assess the

effects of (proposed) taxation changes or degrees of inflation.

• Again from informetrics, much use is made of time-dependent stochastic models

in which case the entire distributional shape changes as the length of the time

period increases. This means that concentration, as measured by the Gini index or

illustrated via the Lorenz curve, also changes, see Burrell (1992a, b). An

investigation of the behaviour of the $Q^2$ measure in such circumstances is

currently in hand (Burrell, 2005c). Are there similar models appropriate for

income/wealth distributions? After all, if we double the period of observation, we

(roughly) double the average income so how does the income distribution

change?

Any conclusions regarding the efficacy of the $Q^2$ measure must be tentative at this stage (anything definitive requires further experience of its application and interpretation), but

we are hopeful that it might be a useful additional tool in comparative studies.

References

Arnold, B. C. & Laguna, L. (1977). On generalized Pareto distributions with applications

to income data. International Studies in Economics No. 10, Department of Economics,

Iowa State University, Ames, Iowa.

Burrell, Q. L. (1991). The Bradford distribution and the Gini index. Scientometrics, 21,

181-194.

Burrell, Q. L. (1992a). The Gini index and the Leimkuhler curve for bibliometric

processes. Information Processing and Management, 28, 19-33.

Burrell, Q. L. (1992b). The dynamic nature of bibliometric processes: a case study. In I.

K. Ravichandra Rao (Ed.), Informetrics – 91: selected papers from the Third International

Conference on Informetrics (pp. 97-129), Bangalore: Ranganathan Endowment.

Burrell, Q. L. (2005a). Measuring similarity of concentration between different

informetric distributions: Two new approaches. Journal of the American Society for

Information Science and Technology. (To appear.)

Burrell, Q. L. (2005b). Some empirical studies of the measurement of similarity of

concentration between different informetric distributions. (Submitted for publication.)

Burrell, Q. L. (2005c). Time-dependent aspects of the co-concentration coefficient. (In

preparation.)

Dagum, C. (1980). Inequality measures between income distributions. Econometrica, 48,

1791-1803.

Dagum, C. (1983). Income inequality measures. In S. Kotz & N. S. Johnson (Eds.),

Encyclopaedia of Statistical Sciences, Volume 4 (pp. 34-40), New York: Wiley.

Dagum, C. (1987). Measuring the economic affluence between populations of income

receivers. Journal of Business and Economic Statistics, 5, 5-11.

Dalton, H. (1920). The measurement of inequality of incomes. Economic Journal, 30,

348-361.

Dorfman, R. (1979). A formula for the Gini coefficient. Review of Economics and

Statistics, 61, 146-149.

Egghe, L. & Rousseau, R. (1990). Elements of concentration theory. In L. Egghe & R.

Rousseau (Eds.), Informetrics 89/90: Selection of papers submitted for the Second

International Conference on Bibliometrics, Scientometrics and Informetrics (pp. 97-137),

Amsterdam: Elsevier.

Gini, C. (1914). Sulla misura della concentrazione e della variabilità dei caratteri. Atti del

Reale Istituto Veneto di Scienze, Lettere ed Arti, 73, 1203-1248.

Kleiber, C. & Kotz, S. (2003). Statistical size distributions in economics and actuarial

sciences. New Jersey: Wiley.

Lambert, P. J. (2001). The distribution and redistribution of income. 3rd edition.

Manchester: Manchester University Press.

Singh, M. & Maddala, G. S. (1975). A stochastic process for income distributions and

tests for income distribution functions. ASA Proceedings of the Business and Economic

Statistics Section, 551-553.

Singh, M. & Maddala, G. S. (1976). A Function for the size distribution of incomes.

Econometrica, 44, 963-970.

Stirzaker, D. (1994). Elementary Probability. Cambridge: Cambridge University Press.

Stuart, A., & Ord, J. K. (1987). Kendall’s Advanced Theory of Statistics. Volume 1:

Distribution Theory (5th edition). London: Griffin.

Yitzhaki, S. (1998). More than a dozen alternative ways of spelling Gini. Research on

Income Inequality, 8, 13-30.

Appendix

1. Proof of Theorem 1.

Although it is straightforward to give a general proof, either using Lebesgue-Stieltjes

integration or via the expectation operator (see Dagum, 1987), we prefer to use

elementary methods and restrict attention to the (absolutely) continuous case. Suppose

that X and Y are independent copies of the variables. Then

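$$E[|X-Y|] = \iint |x-y|\,f_X(x)f_Y(y)\,dx\,dy$$

Splitting the region of integration into {x>y} and {y>x}, the former yields, let us say

$$\begin{aligned}
p(X,Y) &= \iint_{x>y} (x-y)\,f_X(x)f_Y(y)\,dx\,dy = \int_0^\infty\left[\int_0^x (x-y)\,f_Y(y)\,dy\right] f_X(x)\,dx \\
&= \int_0^\infty \left[x\int_0^x f_Y(y)\,dy - \int_0^x y f_Y(y)\,dy\right] f_X(x)\,dx \\
&= \int_0^\infty \left[xF_Y(x) - \Big(yF_Y(y)\Big|_0^x - \int_0^x F_Y(y)\,dy\Big)\right] f_X(x)\,dx \\
&= \int_0^\infty \left[\int_0^x F_Y(y)\,dy\right] f_X(x)\,dx \\
&= \int_0^\infty\!\int_0^x f_X(x)F_Y(y)\,dy\,dx \\
&= \int_0^\infty\!\int_y^\infty f_X(x)F_Y(y)\,dx\,dy \\
&= \int_0^\infty \Phi_X(y)F_Y(y)\,dy = \int_0^\infty \Phi_X(y)\big(1-\Phi_Y(y)\big)\,dy
\end{aligned}$$

Interchanging the roles of x and y in the above leads straight to

$$p(Y,X) = \iint_{y>x} (y-x)\,f_X(x)f_Y(y)\,dx\,dy = \int_0^\infty \Phi_Y(y)F_X(y)\,dy = \int_0^\infty \Phi_Y(y)\big(1-\Phi_X(y)\big)\,dy$$

Adding these two expressions gives

$$\begin{aligned}
E[|X-Y|] &= \iint |x-y|\,f_X(x)f_Y(y)\,dx\,dy = p(X,Y) + p(Y,X) \\
&= \int_0^\infty \Phi_X(y)\big(1-\Phi_Y(y)\big)\,dy + \int_0^\infty \Phi_Y(y)\big(1-\Phi_X(y)\big)\,dy \\
&= \int_0^\infty \big(\Phi_X(y) + \Phi_Y(y) - 2\Phi_X(y)\Phi_Y(y)\big)\,dy \\
&= \int \Phi_X + \int \Phi_Y - 2\int \Phi_X\Phi_Y \\
&= \mu_X + \mu_Y - 2\int \Phi_X\Phi_Y \qquad \text{(A1)}
\end{aligned}$$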

and the result follows.
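As a worked check of the decomposition used in this proof, consider two exponential populations with means $m_1 \le m_2$ (symbols introduced here for illustration only). Every step below follows from the expressions for p(X,Y), p(Y,X) and (A1).

```latex
\begin{aligned}
\Phi_X(x) &= e^{-x/m_1}, \qquad \Phi_Y(x) = e^{-x/m_2}, \qquad
\int_0^\infty \Phi_X\Phi_Y\,dx = \frac{m_1 m_2}{m_1 + m_2}, \\
p(X,Y) &= \int_0^\infty \Phi_X(1-\Phi_Y)\,dx = m_1 - \frac{m_1 m_2}{m_1+m_2} = \frac{m_1^2}{m_1+m_2}, \\
p(Y,X) &= \int_0^\infty \Phi_Y(1-\Phi_X)\,dx = m_2 - \frac{m_1 m_2}{m_1+m_2} = \frac{m_2^2}{m_1+m_2}, \\
E[|X-Y|] &= p(X,Y) + p(Y,X) = \frac{m_1^2+m_2^2}{m_1+m_2}
          = m_1 + m_2 - \frac{2 m_1 m_2}{m_1+m_2}, \\
G(X,Y) &= \frac{E[|X-Y|]}{m_1+m_2} = \frac{m_1^2+m_2^2}{(m_1+m_2)^2}.
\end{aligned}
```

The fourth line agrees with (A1), and setting $m_1 = m_2$ gives G = 1/2, the Gini coefficient of a single exponential population. The same quantities give $D(X,Y) = p(X,Y)/p(Y,X) = (m_1/m_2)^2$ and hence $D^*(X,Y) = m_1/m_2$.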

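2. Proof of Proposition 2.

Eliminating p(X,Y) from the relations $E[|X-Y|] = p(X,Y) + p(Y,X)$ and $\mu_Y - \mu_X = p(Y,X) - p(X,Y)$, both of which follow from the proof of Theorem 1, and rearranging, gives

$$p(Y,X) = \big(E[|X-Y|] + (\mu_Y - \mu_X)\big)/2$$

and similarly

$$p(X,Y) = \big(E[|X-Y|] - (\mu_Y - \mu_X)\big)/2$$

Hence if $\mu_Y > \mu_X$,

$$D(X,Y) = \frac{E[|X-Y|] - (\mu_Y - \mu_X)}{E[|X-Y|] + (\mu_Y - \mu_X)}$$

Now substituting from (A1) and then dividing numerator and denominator by $\mu_X + \mu_Y$ gives the result.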

3. Proof of Theorem 2.

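(i)

$$H(X,Y) = 1 - G(X,Y) = \frac{2\int_0^\infty \Phi_X(x)\Phi_Y(x)\,dx}{\mu_X + \mu_Y}$$

so that

$$H(X,Y)^2 = \left(\frac{2\int \Phi_X\Phi_Y}{\mu_X+\mu_Y}\right)^2 = \frac{4\left(\int \Phi_X\Phi_Y\right)^2}{(\mu_X+\mu_Y)^2} \qquad \text{(A2)}$$

Now

$$\left(\int \Phi_X\Phi_Y\right)^2 \le \left(\int \Phi_X^2\right)\left(\int \Phi_Y^2\right) \qquad \text{(A3)}$$

by the Cauchy-Schwarz inequality, variants of which can be found in most introductory texts on analysis, see also Stuart & Ord (1987, p. 65). Also

$$(\mu_X+\mu_Y)^2 = (\mu_X-\mu_Y)^2 + 4\mu_X\mu_Y \ge 4\mu_X\mu_Y \qquad \text{(A4)}$$

Combining (A3) and (A4) then gives, from (A2)

$$H(X,Y)^2 = \frac{4\left(\int \Phi_X\Phi_Y\right)^2}{(\mu_X+\mu_Y)^2} \le \frac{\left(\int \Phi_X^2\right)\left(\int \Phi_Y^2\right)}{\mu_X\mu_Y} = \theta_X\theta_Y$$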

and the result follows.

(ii) For Q(X,Y) = 1, both of the above inequalities (A3) and (A4) must be equalities. For the second, trivially the equality holds if and only if $\mu_X = \mu_Y$.

The Cauchy-Schwarz inequality reduces to an equality if and only if there is a constant c such that, for all x, $\Phi_X(x) = c\,\Phi_Y(x)$. Then, using the identity (1), this leads to $\mu_X = \int \Phi_X = c\int \Phi_Y = c\,\mu_Y$. Having established the requirement that the two means must be the same, this implies that c = 1 and hence the two distributions are the same.

Figure 1: Q-squared for the Weibull distribution. [Plot of Q-squared (vertical axis) against the scale ratio, beta (horizontal axis, 0 to 5), with curves for alpha = 1/2, alpha = 2 and alpha = 5.]

Figure 2: D-star for the Weibull distribution. [Plot of D-star (vertical axis) against the scale ratio, beta (horizontal axis, 0 to 5), with curves for alpha = 1/2, alpha = 2 and alpha = 5.]

Figure 3: Q-squared for the Singh-Maddala distribution. [Plot of Q-squared (vertical axis) against the parameter value, p (horizontal axis, 1 to 9), with curves for q = 1.5, q = 2 and q = 3.]

Figure 4: D-star for the Singh-Maddala distribution. [Plot of D-star (vertical axis) against the parameter value, p (horizontal axis, 1 to 9), with curves for q = 1.5, q = 2 and q = 3.]