
Empirical Likelihood Inference Under Stratified Sampling

by

Bob(Chongxin) Zhong

A thesis submitted to

the Faculty of Graduate Studies and Research

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Mathematics and Statistics

Carleton University

Ottawa, Ontario, Canada

December 1997

© Copyright 1997

Bob(Chongxin) Zhong


The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Abstract

This dissertation consists of seven chapters. The first chapter is an introduction to empirical likelihood inference, the last chapter gives conclusions and further research areas, while the remaining five chapters deal with specific topics related to the empirical likelihood method. Stratified random sampling is discussed in chapters 2 to 5.

In chapter 2, empirical likelihood under stratified random sampling is discussed for finite population inference problems. Using population auxiliary information, such as an overall total, this method is shown to lead to more efficient estimators. Some limited simulation results are also given. Empirical likelihood ratio tests are also discussed.

In chapter 3, empirical likelihood and stratified sampling are combined with estimating equations to make non-parametric inferences. The results show that the empirical likelihood for parameters of interest has properties similar to those under a parametric likelihood. These results cover empirical likelihood ratio estimation and testing. Parameters subject to constraints are also studied. Illustrations of the methods are provided.

In chapter 4, a pseudo empirical likelihood method is developed in order to handle complex survey sampling designs. Asymptotic properties of these estimators are also discussed. Pseudo empirical likelihood ratio testing is also included. Illustrations of the methods are provided.

In chapter 5, we apply the empirical likelihood technique to propose a new class of M-estimators in the presence of auxiliary information under a nonparametric setting, using stratified random sampling. It is shown that the proposed M-estimators are consistent and asymptotically normally distributed, with smaller asymptotic variances than those of the usual M-estimators.

In chapter 6, we consider the following scenario: several different imperfect instruments and one perfect instrument are used independently to measure a characteristic of interest of a target population. We wish to combine the information from these independent samples to make statistical inference on parameters of interest, such as the population mean and population distribution function. We develop empirical likelihood estimators and empirical likelihood ratio tests and study their asymptotic properties.

Finally, some suggestions for further research are presented in chapter 7.

Acknowledgements

First of all, I would like to express my sincere and deeply grateful thanks to my thesis supervisor, Professor J. N. K. Rao, for his supervision and encouragement during the course of this work, as well as for his kindness and support during the past four years.

My thanks also go to Prof. Shisong Mao, my M.Sc. supervisor, and to Prof. J. Shao. Their support helped to make it possible for me to come to Carleton University.

My sincere thanks are also extended to Professor M. Csorgo for his helpful discussions and suggestions.

This research was partially supported by the Chand and Ratna Devi Marwah Memorial Scholarship and NSERC research grants of Professor J. N. K. Rao.

Finally, special thanks go to my wife Ci Jiang for her persistent support and to my parents for their understanding.

Contents

Acceptance Sheet
Abstract
Acknowledgements
Contents
List of Tables

1 Introduction
  1.1 Empirical likelihood and asymptotic setup
    1.1.1 Empirical likelihood
    1.1.2 Asymptotic setup
  1.2 Outline of Chapters 2 to 4
    1.2.1 Introduction to Chapter 2
    1.2.2 EL and estimating equations
    1.2.3 Pseudo EL and estimating equations
  1.3 EL and M-estimation
  1.4 EL in the Presence of Measurement Error

2 EL for Finite Populations
  2.1 Introduction
  2.2 Maximum Empirical Likelihood Estimator
    2.2.1 Numerical Evaluation for MELE
    2.2.2 Asymptotic Results
    2.2.3 Variance Estimation
  2.3 Estimation of Small Area Means
  2.4 Empirical Likelihood Ratio Test
  2.5 Simulation study
  2.6 Proofs

3 EL and GEE
  3.1 Introduction
  3.2 MELR Estimators and Their Properties
    3.2.1 MELR estimators
    3.2.2 Numerical method for given $\theta$
    3.2.3 Asymptotic Results
  3.3 MELR Estimators and Testing with Constraints
  3.4 Examples
  3.5 Proofs

4 Pseudo EL and GEE
  4.1 Introduction
  4.2 MPELR Estimators
  4.3 Asymptotic Results
  4.4 An example
  4.5 Proofs

5 M-estimation and EL
  5.1 Introduction
  5.2 Profile empirical likelihood
  5.3 M-estimators
  5.4 Proofs

6 EL in the presence of measurement error
  6.1 Introduction
  6.2 Empirical likelihood method in the presence of measurement error
  6.3 Asymptotic properties of MELRE
  6.4 Likelihood Ratio Tests
  6.5 Examples
  6.6 Proofs

7 Conclusions and Further Research
  7.1 Conclusions
  7.2 Further Research

References

List of Tables

2.1 Parameter settings for generated finite populations
2.2 Ratios of MSE's of $\hat{\bar Y}$, $\hat{\bar Y}_{GREG}$ and $\bar y_{st}$ to $\hat{\bar Y}_{OPT}$
2.3 Ratios of MSE's of $\hat F_y(y_p)$ to $\hat F_{OPT}(y_p)$ and $F_{st}(y_p)$
2.4 Empirical Error Rate (%) for MELRT
2.5 Empirical Error Rate (%) for ANT

Chapter 1

Introduction

Likelihood and estimating equations provide common approaches to parametric inference. Recently, these approaches have been shown to be useful in nonparametric contexts.

Likelihood in nonparametric contexts, called empirical likelihood (described in Section 1.1 below), was recently introduced by Owen (1988, 1990, 1991). Owen has shown that the empirical likelihood ratio statistics have limiting chi-square distributions in certain situations, and has shown how to obtain tests and confidence limits for parameters expressed as functionals $\theta(F)$ of an unknown distribution function $F$. Other asymptotic properties and the possibility of correcting likelihood ratio statistics or their signed roots have been studied by DiCiccio and Romano (1989), Hall (1990), DiCiccio, Hall and Romano (1989, 1991) and others.

Estimating equations are another important method in statistics. Qin and Lawless (1994, 1995) showed how to link empirical likelihood and estimating equations, and have shown that empirical likelihoods lead to asymptotic results similar to those under parametric likelihoods.

Actually, Hartley and Rao (1968) gave the original idea of empirical likelihood in the sample survey context as early as 1968. They called the method the "scale-load" approach.

Most of the above discussions deal with simple random sampling. Since simple random sampling is rarely used in practice, for both practical and theoretical reasons, we will consider the case of stratified random sampling. The sampling scheme in each stratum may be simple random sampling or a more complex sampling method.

In this chapter, we will first describe the empirical likelihood (EL) and an asymptotic setup in Section 1.1. Section 1.2 describes briefly the problems discussed in Chapters 2 to 4 and our main results. Section 1.3 covers the problems discussed in Chapter 5. Section 1.4 describes the problems studied in Chapter 6.

1.1 Empirical likelihood and asymptotic setup

1.1.1 Empirical likelihood

Suppose that a target population is divided into $H$ strata with known weights $W_h$ for all strata $h$, $\sum_h W_h = 1$. In stratum $h$ there are $N_h$ units with values $x_{hi}$ ($i = 1, \dots, N_h$; $h = 1, \dots, H$), where $x_{hi}$ is a $d$-dimensional variable. Denote the $h$th stratum population mean, median and distribution function by $\bar X_h$, $m_{xh}$ and

$$F_{hN_h}(x) = \frac{1}{N_h} \sum_{i=1}^{N_h} 1(x_{hi} \le x)$$

respectively, where $1(x_{hi} \le x) = (1(x_{hi,1} \le x_1), \dots, 1(x_{hi,d} \le x_d))^\tau$, $x_{hi,j}$ and $x_j$ are the $j$-th components of the vectors $x_{hi}$ and $x$ respectively, and $1(x_{hi,j} \le x_j)$ is the indicator of the set $(x_{hi,j} \le x_j)$. Also, let $\bar X$, $m_x$ and $F_x(x) = \sum_{h=1}^{H} W_h F_{hN_h}(x)$ be the mean, median and distribution function of the target population respectively, where $N = \sum_{h=1}^{H} N_h$ and $\bar X = \sum_{h=1}^{H} W_h \bar X_h$.

Suppose that $x_{h1}, \dots, x_{hn_h}$ is a simple random sample (SRS) without replacement from stratum $h$ with distribution function $F_{hN_h}$ for all $h$, and that the samples are selected independently across the strata. As argued by Chen and Qin (1993), the empirical likelihood for the above sampling scheme can be approximated by

$$L = \prod_{h=1}^{H} \prod_{i=1}^{n_h} p_{hi} \tag{1.1}$$

if $n_h \ll N_h$ and $N_h$ is large, where $p_{hi} = \Pr(x_h = x_{hi})$ and $x_h$ has the distribution function $F_{hN_h}$.

We will call (1.1) the empirical likelihood. Maximizing (1.1) subject to additional conditions that reflect the available auxiliary information will give us more efficient estimators of the parameters of interest, called maximum empirical likelihood estimators or MELE.
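
As a small numerical illustration of (1.1), the sketch below evaluates the log empirical likelihood $\sum_h \sum_i \log p_{hi}$ for a given set of within-stratum weights and checks that, when no auxiliary constraint is imposed, the uniform weights $p_{hi} = 1/n_h$ give a larger value than a tilted alternative. The function name `log_el` and the toy weights are assumptions made for illustration only; they do not come from the thesis.

```python
import math

def log_el(p):
    """Log of the empirical likelihood (1.1): sum over strata h and units i of log p_hi.

    `p` is a list of strata; each stratum is a list of positive weights summing to 1.
    """
    for weights in p:
        if any(w <= 0 for w in weights):
            raise ValueError("all p_hi must be positive")
        if abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights in each stratum must sum to 1")
    return sum(math.log(w) for weights in p for w in weights)

# Two strata with n_1 = 3 and n_2 = 2 sampled units (toy sizes).
uniform = [[1 / 3] * 3, [1 / 2] * 2]
tilted = [[0.5, 0.3, 0.2], [0.6, 0.4]]

log_el_uniform = log_el(uniform)
log_el_tilted = log_el(tilted)
```

Auxiliary information enters by restricting the admissible weights, so the constrained maximizer generally moves away from the uniform weights.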

1.1.2 Asymptotic setup

It is difficult to study the finite sample properties of MELE theoretically. Hence, we study their asymptotic properties. The finite sample properties of MELE will be studied through limited simulations. Let $\{P_\nu, \nu = 1, 2, \dots\}$ be a sequence of finite populations. Throughout the paper $\nu$ is used as the index of the finite population. The asymptotic setup here is that both the sample size $n_h$ and the stratum size $N_h$ for each $h$ go to infinity as $\nu \to \infty$, and $F_{hN_h}$ goes to a distribution function $F_h$, but we will suppress the index $\nu$ for convenience. We can regard the population of stratum $h$ as a simple random sample from a super-population $F_h$, hence $F_{hN_h} \to F_h$. We also suppose that $n/n_h \to k_h > 0$, $h = 1, \dots, H$, where $n = \sum_{h=1}^{H} n_h$.

1.2 Outline of Chapters 2 to 4

In this section, we present an outline of Chapters 2 to 4. Chapters 2 to 4 all deal with stratified sampling. However, in Chapters 2 and 3 we use simple random sampling in each stratum, while in Chapter 4 we permit more complex sampling in each stratum. Samples are all taken independently across strata.

1.2.1 Introduction to Chapter 2

Data set and auxiliary information

In order to be consistent with the convention of using $x$ as the auxiliary variable and $y$ as the response variable, we change the notation for the characteristics of the population to $z = (x^\tau, y^\tau)^\tau$, where $x$ is the auxiliary variable of dimension $d - p$ and $y$ is the response variable of dimension $p$.

We suppose that samples are taken in each stratum by simple random sampling without replacement. Samples are denoted by $z_{hi} = (x_{hi}^\tau, y_{hi}^\tau)^\tau$, $i = 1, \dots, n_h$; $h = 1, \dots, H$.

Suppose some auxiliary information about $x$ is available, such as the population mean $\bar X$, or the median $m_x$ when $x$ is a scalar. In the following we suppose that only the overall mean $\bar X$ is known from external sources such as demographic projections.

MEL estimators

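The information $\bar X$ can be used to adjust our estimators for $p_{11}, \dots, p_{Hn_H}$, that is, to force these estimators to satisfy

$$\sum_h W_h \sum_i p_{hi} x_{hi} = \bar X. \tag{2.1}$$

Maximizing (1.1) subject to conditions (2.1) and $\sum_i p_{hi} = 1$, $p_{hi} > 0$, to get estimators $\hat p_{hi}$, we estimate the population mean $\bar Y$ by the maximum empirical likelihood (MEL) estimator

$$\hat{\bar Y} = \sum_h W_h \sum_i \hat p_{hi} y_{hi},$$

and the MEL estimator of the distribution function $F_y(y)$ as

$$\hat F_y(y) = \sum_h W_h \sum_i \hat p_{hi} 1(y_{hi} \le y).$$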

Asymptotic properties of MELE

Under appropriate conditions, we can prove that the estimator $\hat{\bar Y}$ has the same asymptotic variance as the optimal regression estimator

$$\hat{\bar Y}_{OPT} = \bar y_{st} + \hat B_O^\tau (\bar X - \bar x_{st}),$$

where $\hat B_O$ is given by Rao and Liu (see Chapter 2, Section 2). Further, $\hat F_y(y)$ is asymptotically more efficient than $F_{st}(y) = \sum_h W_h F_{n_h}(y)$, the customary cumulative distribution function. It is possible to construct an optimal regression estimator of $F_y(y)$ by changing $y_{hi}$ to the indicator variable $1(y_{hi} \le y)$ in the formula for $\hat{\bar Y}_{OPT}$ (Rao and Liu, 1992). This estimator, called $\hat F_{OPT}(y)$, is asymptotically equivalent to $\hat F_y(y)$, but it may not be monotone, unlike the MEL estimator $\hat F_y(y)$.

Our simulation studies show that $\hat{\bar Y}$ outperforms the GREG estimator, especially for populations with widely varying intercept terms across strata. It is also considerably more efficient than the stratified mean $\bar y_{st}$, except in the case of a population with a weak relationship between $y$ and $x$. As expected, $\hat{\bar Y}$ and the optimal estimator $\hat{\bar Y}_{OPT}$ are equally efficient. However, the MEL estimator $\hat F_y(y)$ outperforms the empirical distribution function $F_{st}(y)$. Although $\hat F_y(y)$ and $\hat F_{OPT}(y)$ are asymptotically equivalent, our simulation results indicate that $\hat F_y(y)$ is often significantly more efficient than $\hat F_{OPT}(y)$.

Empirical likelihood ratio test

Since (1.1) is defined as the empirical likelihood, we can conduct an empirical likelihood ratio test (ELRT) on the parameter $\bar Y$. Let $\theta = (\theta_1^\tau, \theta_2^\tau)^\tau$, $\theta_1 = \bar X$ and $\theta_2 = \bar Y$; then $E(z) = \sum_h W_h E(z_h) = \theta$, and we define the profile empirical likelihood ratio function $R_E(\theta)$. Maximizing $R_E(\theta)$ will give an estimator $\tilde\theta_2$ of the parameter $\theta_2$, hence we may use the corresponding ratio as the empirical likelihood ratio statistic, where $\theta_0 = (\bar X^\tau, \bar Y_0^\tau)^\tau$ and $\bar Y_0$ is specified. Under appropriate conditions, we get the asymptotic distribution of this statistic under the hypothesis $H_0: \bar Y = \bar Y_0$.

1.2.2 EL and estimating equations

In Chapter 3, we combine EL, estimating equations and stratified random sampling together. The auxiliary information here is in the form of $r \ge p$ functionally independent unbiased estimation functions, that is, functions $g_j(x, \theta)$, $j = 1, \dots, r$, such that $E_F\{g_j(x, \theta)\} = 0$. In vector form, we have

$$g(x, \theta) = (g_1(x, \theta), \dots, g_r(x, \theta))^\tau,$$

where $g(x, \theta)$ satisfies

$$E_F\{g(x, \theta)\} = 0. \tag{2.3}$$

The basic idea is to maximize the empirical likelihood subject to the constraints provided by (2.3). Since we need to consider empirical likelihood ratio tests, we first define the empirical likelihood ratio function, then get the empirical likelihood ratio estimator for the parameter of interest, $\theta$. The estimator of $\theta$ obtained by this method is called the MELR estimator.

MELR estimators

Since the empirical likelihood function is

$$L(F) = \prod_h \prod_i p_{hi} \tag{2.4}$$

and (2.4) is maximized by the empirical distribution function $F_n(x) = \sum_h W_h F_{n_h}(x)$, where $n = \sum_h n_h$ and $F_{n_h}(x) = n_h^{-1} \sum_i 1(x_{hi} \le x)$, the empirical likelihood ratio is then defined as $R(F) = L(F)/L(F_n)$, which reduces to

$$R(F) = \prod_h \prod_i n_h p_{hi}.$$

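Since we are interested in estimating the parameter $\theta$, and since we know the estimating equation (2.3), we define the empirical likelihood ratio function

$$R_E(\theta) = \sup\Big\{ \prod_h \prod_i n_h p_{hi} : p_{hi} \ge 0,\ \sum_i p_{hi} = 1,\ \sum_h W_h \sum_i p_{hi} g_{hi}(\theta) = 0 \Big\},$$

where $g_{hi}(\theta) = g(x_{hi}, \theta)$ for all $h$ and $i$. We may maximize $R_E(\theta)$ to obtain the MELR estimate $\tilde\theta$ of the parameter $\theta$. In addition, this yields estimates $\tilde p_{hi}$ of the $p_{hi}$, and an estimate for the distribution $F$ as

$$\tilde F_n(x) = \sum_h W_h \sum_i \tilde p_{hi} 1(x_{hi} \le x).$$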

The asymptotic properties of $\tilde\theta$ and $\tilde F_n(x)$ have also been studied. The terms involved in the resulting asymptotic distributions are given in Chapter 3. From the asymptotic variance of $\sqrt{n}\,\tilde\theta$, we can see that it is of the usual sandwich form. The matrix $U$ is non-negative definite, so that $\tilde F_n(x)$ is at least as efficient as $F_n(x)$, the sample cumulative distribution function.

MELR testing with constraints

The empirical likelihood methods can be extended to deal with the case in which there are constraints on the parameters, and even to testing these constraints. Suppose that the $q$-dimensional constraints on $\theta$ are of the form

$$r(\theta) = 0,$$

where $r(\theta)$ is a $q \times 1$ vector ($q \le p$) and the $q \times p$ matrix $R(\theta) = \partial r(\theta)/\partial\theta^\tau$ is of full rank $q$. We first consider how to estimate $\theta$. This can be done by defining a Lagrangian function $G_2$ of $\theta$ and $\nu$, where $\nu$ is a $q \times 1$ vector of Lagrange multipliers, and then differentiating $G_2$ with respect to $\theta$ and $\nu$ to get the estimator of $\theta$, denoted by $\tilde\theta_r$.

The problem of testing $H_0: r(\theta) = 0$ can be handled by any of the three most popular methods based on likelihood: the likelihood-ratio test, the Lagrange-multiplier test and the Wald test. Of course, here they should be based on the empirical likelihood. We can prove that these methods are asymptotically equivalent. The conclusions in this chapter show that the parametric likelihood and the non-parametric likelihood have similar properties.

1.2.3 Pseudo EL and estimating equations

Sampling Design

Our sampling design in this chapter is stratified sampling, where samples are taken from each stratum according to some specific design, such as simple random sampling (SRS) or probability proportional to size (PPS) sampling, and samples are independent across strata.

Pseudo EL

Suppose $P_\nu$, with $d$-dimensional characteristic $x_\nu$, has the unknown distribution $F_\nu$ and a $p$-dimensional parameter $\theta$ associated with it, and the $h$th stratum of $P_\nu$ has $d$-dimensional characteristic $x_{\nu h}$, which has the unknown distribution $F_{\nu h}$. Suppose $x_{h1}, \dots, x_{hN_h}$, the values of the units in the $h$th stratum of $P_\nu$, is a random sample from a super-population, say $F_h$. If the entire population $P_\nu$ were available, the corresponding likelihood function would be

$$L(F) = \prod_{h=1}^{H} \prod_{i=1}^{N_h} p_{hi},$$

where $p_{hi} = F_h(x_{hi}) - F_h(x_{hi}-)$. The log-likelihood function then is

$$l(F) = \sum_{h=1}^{H} \sum_{i=1}^{N_h} \log p_{hi}, \tag{2.9}$$

a sum over all $N = \sum_{h=1}^{H} N_h$ units. But we only have a sample, say $s$, from the finite population, and if we view (2.9) as a finite population total, then we may have available a design unbiased estimator of $l(F)$, namely,

$$\hat l(F) = \sum_{h=1}^{H} \sum_{i \in s_h} d_{hi}(s) \log p_{hi}, \tag{2.10}$$

where $s_h$ is the part of $s$ in stratum $h$, $d_{hi}(s)$ are the design weights in stratum $h$, and $E$ refers to expectation with respect to the sampling design. Obviously we have $E(\hat l(F)) = l(F)$. We will call (2.10) the "pseudo empirical likelihood".

Using the pseudo empirical likelihood (2.10), we can obtain a maximum pseudo empirical likelihood ratio estimator for the parameter of interest. These estimators will be called MPELR estimators.
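
To make the role of the design weights concrete, here is a minimal sketch. It assumes that the pseudo empirical likelihood has the weighted form $\sum_h \sum_{i \in s_h} d_{hi}(s) \log p_{hi}$; the function names and the numerical weights are hypothetical. The sketch checks numerically that, within each stratum, the normalized weights $d_{hi}(s) / \sum_{i \in s_h} d_{hi}(s)$ give a larger value than uniform weights.

```python
import math

def pseudo_log_el(d, p):
    """Pseudo log empirical likelihood: sum_h sum_{i in s_h} d_hi * log p_hi."""
    return sum(d_hi * math.log(p_hi)
               for d_h, p_h in zip(d, p)
               for d_hi, p_hi in zip(d_h, p_h))

def normalized_weights(d):
    """Within-stratum normalized design weights d_hi / sum_i d_hi."""
    return [[d_hi / sum(d_h) for d_hi in d_h] for d_h in d]

# Hypothetical design weights for two strata (e.g. inverse inclusion probabilities).
d = [[10.0, 20.0, 30.0], [5.0, 15.0]]
p_max = normalized_weights(d)
p_uniform = [[1 / 3] * 3, [1 / 2] * 2]

value_at_normalized = pseudo_log_el(d, p_max)
value_at_uniform = pseudo_log_el(d, p_uniform)
```

When all design weights within a stratum are equal, as under SRS, the normalized weights reduce to $1/n_h$.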

MPELR estimators

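Since $\sum_i p_{hi} = 1$, we know (2.10) is maximized by $F_n(x) = \sum_h W_h F_{n_h}(x)$, where $n = \sum_h n_h$, $F_{n_h}(x) = \sum_{i \in s_h} \tilde d_{hi}(s) 1(x_{hi} \le x)$ and $\tilde d_{hi}(s) = d_{hi}(s) / \sum_{i \in s_h} d_{hi}(s)$. The log-empirical likelihood ratio is then defined as

$$\hat l(F) - \hat l(F_n) = \sum_h \sum_{i \in s_h} d_{hi}(s) \log\{p_{hi} / \tilde d_{hi}(s)\}.$$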

We also assume that information about $\theta$ and $F$ is available in the form of $r \ge p$ functionally independent unbiased estimation functions, that is, functions $g_j(x, \theta)$, $j = 1, \dots, r$, such that $E_F\{g_j(x, \theta)\} = 0$. In vector form, we have

$$g(x, \theta) = (g_1(x, \theta), \dots, g_r(x, \theta))^\tau,$$

where $g(x, \theta)$ satisfies

$$E_F\{g(x, \theta)\} = 0.$$

Now we define the (negative) log-pseudo empirical likelihood ratio function of $\theta$, denoted $l_E(\theta)$, in which $g_{hi}(\theta) = g(x_{hi}, \theta)$ for all $h$ and $i$.

We may minimize $l_E(\theta)$ to obtain an estimator $\tilde\theta$ of the parameter $\theta$ (called the MPELR estimator). In addition, this yields estimators $\tilde p_{hi}$, and therefore an estimator of the distribution function $F$ as

$$\tilde F_n(x) = \sum_h W_h \sum_{i \in s_h} \tilde p_{hi} 1(x_{hi} \le x). \tag{2.14}$$

Asymptotic Results

Some conditions are needed to study the existence of solutions and the asymptotic properties of MPELR estimators. We assume that $n \to \infty$, $n_h/n \to k_h > 0$ and

Max 1 ibIûzidh;(s) l<hCH. l s i<nh dhi(s) = O( ;) 7 ~ i ~ ~ d ~ ~ ( ~ ) = 0(1)

as $\nu \to \infty$. The assumption (2.15) means that no survey weight is disproportionately large. Due to some technical difficulty, only stratified SRS and PPS with replacement are discussed.

Since our strata weights Wh are known and samples from different strata are

independent, we should also construct the weights to satisfy

Under the conditions above and some other conditions, we establish the asymptotic normality of the estimators of both the parameter $\theta$ and the distribution function $F$.

The pseudo empirical likelihood ratio test (PELRT) is discussed. Here the asymptotic distribution of the PELRT statistic is that of a weighted sum of $\chi^2_1$ variables. The reason is that the pseudo-EL is used as the likelihood. Methods of approximating the asymptotic distribution are also discussed.
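
The approximation methods themselves are not described in this outline. One generic option, shown here purely as an illustration and not as the method used in the thesis, is to approximate quantiles of a weighted sum of independent $\chi^2_1$ variables by Monte Carlo simulation. The weights below are hypothetical; in practice they would have to be estimated from the data.

```python
import math
import random

def simulate_weighted_chisq(weights, draws=20000, seed=0):
    """Sorted Monte Carlo draws of sum_j w_j * Z_j**2 with Z_j independent N(0, 1)."""
    rng = random.Random(seed)
    return sorted(
        sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        for _ in range(draws)
    )

def empirical_quantile(sorted_samples, prob):
    """Order statistic at position ceil(prob * n), clipped to the valid range."""
    n = len(sorted_samples)
    index = min(n - 1, max(0, math.ceil(prob * n) - 1))
    return sorted_samples[index]

# Hypothetical weights for a two-term weighted chi-square limit.
samples = simulate_weighted_chisq([1.5, 0.5])
critical_value = empirical_quantile(samples, 0.95)
sample_mean = sum(samples) / len(samples)

# Sanity check: a single unit weight should reproduce the chi-square(1) quantile.
chisq1_check = empirical_quantile(simulate_weighted_chisq([1.0]), 0.95)
```

A test would then reject when the observed statistic exceeds `critical_value`. The simulated mean should be close to the sum of the weights, which gives a quick check on the simulation.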

1.3 EL and M-estimation

M-estimation plays an important role in robust parametric inference and in non-parametric inference. We consider a new class of M-estimators here. Suppose that some auxiliary information in the form of general estimating equations is available. We can use this information to improve our estimator of the distribution function, then replace $F$ in the M-functional $\theta(F)$ by this improved estimator to get an estimator of $\theta$. Since this estimator combines the auxiliary information with the M-function, it will be more efficient than the usual M-estimator.

Auxiliary information

We assume that auxiliary information is in the form of $r$ ($\ge 1$) functions $g_1(x), \dots, g_r(x)$ such that

$$E_F\, g(x) = \sum_h W_h E_h\, g(x) = 0, \tag{3.1}$$

where $g(x) = (g_1(x), \dots, g_r(x))^\tau$ and $E_h$ denotes the expectation with respect to the distribution function of stratum $h$. Here $g(x)$ does not contain any unknown parameters, hence the problem is different from those discussed in Chapters 3 and 4.

M-estimation

An M-functional $\theta(F)$ associated with the distribution function $F$ is defined as a root $\theta_0$ of the equation

$$\int \psi(x, \theta_0)\, dF(x) = 0,$$

where $x \sim F$ and $x, \theta_0, \psi(x, \theta_0) \in R$. For a stratified sample $x_{11}, \dots, x_{1n_1}; \dots; x_{H1}, \dots, x_{Hn_H}$ from $F(\cdot)$, the usual estimator of $\theta_0$ is given by $\hat\theta_n = \theta(F_n)$, where $F_n(\cdot)$ is the natural stratified empirical distribution function, i.e., $F_n(x) = \sum_h \frac{W_h}{n_h} \sum_i 1(x_{hi} \le x)$. $\theta(F_n)$ is called the M-estimator corresponding to $\psi$.
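
The definition above can be sketched in code. The example below is an illustration under stated assumptions, not the thesis's procedure: it takes $\psi(x, \theta)$ to be non-increasing in $\theta$ so that the weighted estimating equation can be solved by bisection, and the names `stratified_m_estimate` and `huber_psi`, the toy data and the bracketing interval are all hypothetical.

```python
def stratified_m_estimate(strata, weights, psi, lo, hi, tol=1e-10):
    """Root of sum_h (W_h / n_h) * sum_i psi(x_hi, theta) = 0 by bisection.

    Assumes psi(x, theta) is non-increasing in theta and the root lies in [lo, hi].
    """
    def score(theta):
        return sum(w / len(xs) * sum(psi(x, theta) for x in xs)
                   for xs, w in zip(strata, weights))

    if score(lo) < 0 or score(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid   # score still positive, so the root is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def huber_psi(x, theta, c=1.5):
    """Huber's psi function: the residual x - theta clipped to [-c, c]."""
    return max(-c, min(c, x - theta))

strata = [[1.0, 2.0, 3.0], [10.0, 12.0]]   # toy stratified sample
weights = [0.6, 0.4]                        # known stratum weights W_h

# psi(x, theta) = x - theta recovers the stratified sample mean.
mean_estimate = stratified_m_estimate(strata, weights, lambda x, t: x - t, 0.0, 20.0)
huber_estimate = stratified_m_estimate(strata, weights, huber_psi, 0.0, 20.0)
```

Replacing the weights $W_h / n_h$ by $W_h p_{hi}$ for some other set of within-stratum probabilities gives the corresponding M-estimator under a different estimate of $F$.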

EL and M-estimation

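The method to combine EL and M-estimation can be divided into two steps. First, we use the auxiliary information (3.1) to estimate $F(x)$. That is, we define the EL as before as

$$L = \prod_h \prod_i p_{hi},$$

where $p_{hi} = \Pr(x_h = x_{hi})$. Under condition (3.1), i.e.,

$$\sum_h W_h \sum_i p_{hi}\, g(x_{hi}) = 0$$

and $\sum_i p_{hi} = 1$, $p_{hi} \ge 0$, we maximize $L$ to get estimators $\hat p_{hi}$ of the $p_{hi}$, which involve a Lagrange multiplier $t$ obtained as the solution to an accompanying equation. Now let

$$F_n^*(x) = \sum_h W_h \sum_i \hat p_{hi} 1(x_{hi} \le x);$$

then $F_n^*(x)$ can be regarded as an alternative estimator of the distribution function $F(x)$. The new M-estimator corresponding to $\psi$ is then defined as $\theta_n^*$, which is a root $u_0$ of the equation

$$\int \psi(x, u)\, dF_n^*(x) = 0.$$

We will also call $\theta_n^*$ the ELM-estimator corresponding to $\psi$.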

Asymptotic properties

We prove that the ELM-estimator is asymptotically normal and that it has a smaller asymptotic variance than the M-estimator that does not use the auxiliary information.

1.4 EL in the Presence of Measurement Error

Sometimes we need to measure a characteristic of interest of a target population. We have several different "instruments", one of which is "accurate", or has a smaller measurement error than the others. In practice we treat this instrument as "perfect", that is, as having no measurement error. All other instruments have larger measurement errors and we treat them as "imperfect". The perfect instrument can be used only a very limited number of times due to cost, whereas the imperfect ones can be used more frequently. The question is how to combine all the data available to make inference on parameters of interest.

Data set

Suppose that there are $H$ different instruments used in measuring the characteristic of interest. The distribution function associated with instrument $h$ ($h = 1, \dots, H$) is $F_h(x) = P[x_h \le x]$, with unknown parameter $\theta$, where $\theta$ is a $p$-dimensional vector, and the $H$-th measuring instrument is taken as perfect. These $H$ different populations are related by the common parameter $\theta$. We also assume that information about $\theta$ and $F_h$ is available in the form of $r_h > p$ functionally independent unbiased estimation functions, that is, functions $g_h(x_h, \theta)$ such that

$$E_{F_h}\{g_h(x_h, \theta)\} = 0, \quad h = 1, \dots, H, \tag{4.1}$$

where $g_h(x, \theta)$ is an $r_h$-dimensional vector function.

We further suppose that $x_{h1}, \dots, x_{hn_h}$ is an i.i.d. sample from $F_h(x)$, and that samples measured by different instruments are independent.

EL in the presence of measurement error

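The empirical likelihood based on the independent samples $x_{h1}, \dots, x_{hn_h}$, $h = 1, \dots, H$, is given by

$$L = \prod_{h=1}^{H} \prod_{i=1}^{n_h} p_{hi},$$

where $p_{hi} = \Pr(x_h = x_{hi})$ and $\sum_{i=1}^{n_h} p_{hi} = 1$ for each $h$. The empirical likelihood ratio is then defined as

$$R = \prod_{h=1}^{H} \prod_{i=1}^{n_h} n_h p_{hi}.$$

Since we know the estimating equation (4.1), we define the empirical likelihood ratio function for $\theta$ as

$$R_E(\theta) = \sup\Big\{ \prod_h \prod_i n_h p_{hi} : p_{hi} \ge 0,\ \sum_i p_{hi} = 1,\ \sum_i p_{hi} g_{hi}(\theta) = 0,\ h = 1, \dots, H \Big\},$$

where $g_{hi}(\theta) = g_h(x_{hi}, \theta)$ for all $h$ and $i$.

We can maximize $R_E(\theta)$ to obtain an estimator $\tilde\theta$ of the parameter $\theta$, called the maximum empirical likelihood ratio estimator (MELRE). In addition, this yields estimators $\tilde p_{hi}$, and an estimator for the true distribution function $F_H$ as

$$\tilde F_H(x) = \sum_{i=1}^{n_H} \tilde p_{Hi} 1(x_{Hi} \le x).$$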

Asymptotic properties

cumulative distribution function of $x_{H1}, \dots, x_{Hn_H}$. The estimator $\tilde\theta$ usually is significantly more efficient than the estimator using only $x_{H1}, \dots, x_{Hn_H}$.

Chapter 2

Empirical Likelihood Inference for Finite Populations with Auxiliary Information Using Stratified Random Sampling

2.1 Introduction

Suppose that a target population is divided into $H$ strata with known weights $W_h$ for all strata $h$, $\sum_h W_h = 1$. In stratum $h$ there are $N_h$ units with values $z_{hi}$ ($i = 1, \dots, N_h$; $h = 1, \dots, H$), where $z_{hi} = (x_{hi}^\tau, y_{hi}^\tau)^\tau$ is a $d$-dimensional variable, $x_{hi}$ and $y_{hi}$ are of dimensions $d - p$ and $p$ respectively, and $\tau$ denotes the transpose. Denote the $h$th stratum population mean, median and distribution function by $(\bar X_h^\tau, \bar Y_h^\tau)^\tau$, $(m_{xh}^\tau, m_{yh}^\tau)^\tau$ and

$$F_{hN_h}(z) = \frac{1}{N_h} \sum_{i=1}^{N_h} 1(z_{hi} \le z)$$

respectively, where $z_{hi,j}$ and $z_j$ are the $j$-th components of the vectors $z_{hi}$ and $z$ respectively, and $1(z_{hi,j} \le z_j)$ is the indicator of the set $(z_{hi,j} \le z_j)$. Also, let $(\bar X^\tau, \bar Y^\tau)^\tau$, $(m_x^\tau, m_y^\tau)^\tau$ and $\sum_{h=1}^{H} W_h F_{hN_h}(z)$ be the mean, median and distribution function of the target population respectively, where $N = \sum_{h=1}^{H} N_h$, $\bar X = \sum_{h=1}^{H} W_h \bar X_h$ and $\bar Y = \sum_{h=1}^{H} W_h \bar Y_h$. The reason we separate $z_{hi}$ into two parts is that we are going to suppose that some auxiliary information such as $\bar X$ or $m_x$ is available when we want to make inference about target population parameters such as $\bar Y$ or $m_y$.

In many practical problems, some population values of the auxiliary variable $x$, such as the population mean or median, are known from external sources such as demographic projections. By using this knowledge, we would like to provide improved inferences on the population parameters associated with the characteristic of interest $y$, such as the population mean $\bar Y$, the population distribution function of $y$, $F_y(y)$, and the population median $m_y$. Empirical likelihood methods, recently introduced by Owen (1988, 1990, 1991) in the context of iid random variables, provide a systematic nonparametric approach to utilizing auxiliary information in making inference on the parameters of interest. Hartley and Rao (1968) gave the original idea of empirical likelihood in the sample survey context, using their "scale-load" approach. Under simple random sampling, they obtained the maximum empirical likelihood estimator of $\bar Y$ when only $\bar X$ is known and showed that it is approximately equal to the regression estimator as the sample size $n$ increases. Chen and Qin (1993) extended these results to cover the distribution function and median of $y$, i.e., $F_y(y)$ and $m_y$. Qin and Lawless (1994, 1995) used empirical likelihood and estimating equations in the iid case to deal with interval estimation and hypothesis testing. They obtained an empirical likelihood ratio test statistic (ELRS) and its asymptotic distribution under the null hypothesis. The main aim of this chapter is to deal with the case of stratified simple random sampling.

In section 2.2, we derive the maximum empirical likelihood estimator (MELE) for the parameters of interest. Some large sample properties of the MELE are also established. Application of the MELE to small area estimation will be given in section 2.3. In section 2.4, the empirical likelihood ratio test (ELRT) on $\bar Y$ is considered and its asymptotic distribution under the null hypothesis is obtained. In section 2.5, we present the results of a limited simulation study on the finite sample properties of the empirical likelihood method. Proofs of the theorems in sections 2.2 and 2.4 are given in section 2.6.

2.2 Maximum Empirical Likelihood Estimator

Suppose that $z_{h1}, \dots, z_{hn_h}$ is a simple random sample without replacement from stratum $h$ with distribution function $F_{hN_h}$ for all $h$, and that the samples are selected independently across the strata. As argued by Chen and Qin (1993) for simple random sampling, the empirical likelihood for the above sampling scheme can be approximated by

$$L = \prod_{h=1}^{H} \prod_{i=1}^{n_h} p_{hi}, \tag{2.1}$$

where $p_{hi} = \Pr(z_h = z_{hi})$ and $z_h = (x_h^\tau, y_h^\tau)^\tau$ has the distribution function $F_{hN_h}$.

We consider the case of a known vector of population means, $\bar X$, of the variable $x = (x_1, \dots, x_{d-p})^\tau$:

$$\sum_h W_h E_{F_{hN_h}}(x_h) = \bar X. \tag{2.2}$$

Our results can be easily generalized to other forms of auxiliary information, such as a known population median in the case of scalar $x$.

The maximum empirical likelihood estimator should be sought among distribution functions satisfying (2.2). Using the same argument as in Owen (1990), we need consider only estimators of $F_{hN_h}$ whose support is contained in the set of observations. The problem therefore reduces to maximizing

$$\sum_h \sum_i \log p_{hi} \tag{2.3}$$

subject to

$$\sum_i p_{hi} = 1, \quad p_{hi} \ge 0, \quad h = 1, \dots, H, \tag{2.4}$$

and

$$\sum_h W_h \sum_i p_{hi} x_{hi} = \bar X, \tag{2.5}$$

where $\sum_h = \sum_{h=1}^{H}$ and $\sum_i = \sum_{i=1}^{n_h}$. A solution to the above problem exists with probability tending to one as the sample sizes go to infinity in each stratum. The proof of this conclusion will be given in section 2.6.

Using the Lagrange multiplier method, let

$$\Omega = \sum_h \sum_i \log p_{hi} - \sum_h \lambda_h \Big( \sum_i p_{hi} - 1 \Big) - n\psi' \Big( \sum_h W_h \sum_i p_{hi} x_{hi} - \bar{X} \Big),$$

where $n = \sum_h n_h$ and $\psi$ is a $(d-p)$-dimensional vector. Then

$$\frac{\partial \Omega}{\partial p_{hi}} = \frac{1}{p_{hi}} - \lambda_h - nW_h \psi' x_{hi} = 0. \qquad (2.6)$$

From (2.6) we get

$$(\lambda_h + nW_h \psi' x_{hi})\, p_{hi} = 1,$$

and summing this expression over $i$ leads to

$$\lambda_h = n_h - nW_h \psi' \tilde{x}_h,$$

using (2.4). Hence

$$p_{hi} = \frac{1}{n_h [1 + m_h \psi'(x_{hi} - \tilde{x}_h)]},$$

where $\tilde{x}_h = \sum_i p_{hi} x_{hi}$ and $m_h = nW_h n_h^{-1}$. Therefore, the estimators of the $p_{hi}$'s are the
solutions to the following system of equations:

subject to $0 \le p_{hi} \le 1$. We denote the solutions of (2.7), (2.8) and (2.9) by $\hat{p}_{hi}$ and
$\hat{\psi}$. We call the above method the empirical likelihood method. The estimator
obtained by this method is called the maximum empirical likelihood estimator
(MELE). We now discuss how to solve (2.7), (2.8) and (2.9).

2.2.1 Numerical Evaluation for MELE

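It is difficult to solve (2.7), (2.8) and (2.9) directly, but they can be solved indirectly
as follows. Using the Lagrange multiplier method, we maximize (2.3) subject to (2.4)
and

$$\sum_i p_{hi}(x_{hi} - \tilde{x}_h) = 0, \quad h = 1, \ldots, H,$$

for each given $(\tilde{x}_1, \ldots, \tilde{x}_H)$ such that $\sum_h W_h \tilde{x}_h = \bar{X}$. We get the maximum as

$$-\sum_h \sum_i \log\{ n_h [1 + \xi_h'(x_{hi} - \tilde{x}_h)] \},$$

with the Lagrange multiplier $\xi_h$ satisfying

$$\sum_i \frac{x_{hi} - \tilde{x}_h}{1 + \xi_h'(x_{hi} - \tilde{x}_h)} = 0, \qquad (2.10)$$

where implicitly

$$p_{hi} = \frac{1}{n_h [1 + \xi_h'(x_{hi} - \tilde{x}_h)]}.$$

Hence, we can say that $\xi_h$ is a function of $\tilde{x}_h$, and we then maximize this expression
over $(\tilde{x}_1, \ldots, \tilde{x}_H)$ subject to $\sum_h W_h \tilde{x}_h = \bar{X}$. Using the Lagrange method again, with
multiplier $n\psi$ for this constraint, and solving $n_h \xi_h - nW_h \psi = 0$, we get $\xi_h = m_h \psi$. Therefore,

$$p_{hi} = \frac{1}{n_h [1 + m_h \psi'(x_{hi} - \tilde{x}_h)]}, \qquad (2.11)$$

or $\tilde{x}_h$ is a function of $\psi$, and $\psi$ should be chosen such that

$$\sum_h W_h \tilde{x}_h(\psi) = \bar{X}. \qquad (2.12)$$

It is evident that the solutions $\psi$, $\tilde{x}_h(\psi)$, $p_{hi}$ to (2.10), (2.11) and (2.12), denoted
by $\hat{\psi}$, $\tilde{x}_h(\hat{\psi})$, $\hat{p}_{hi}$, are the same as the solutions to (2.7), (2.8) and (2.9). Hence
$\tilde{x}_h(\hat{\psi}) = \tilde{x}_h$ satisfies (2.10), i.e.

$$\sum_i \frac{x_{hi} - \tilde{x}_h}{1 + m_h \hat{\psi}'(x_{hi} - \tilde{x}_h)} = 0.$$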

We now prove that (2.10) and (2.12) have a unique solution. Denote

then, noting (2.10),

where $I_{d-p}$ is the identity matrix of order $d - p$. For a given $\psi$, $\partial W / \partial \tilde{x}_h$ is nonsingular
since the matrix

is positive definite. If a solution $\tilde{x}_h$ to $W(\tilde{x}_h, \psi) = 0$ exists for each given $\psi$, then
$\tilde{x}_h$ is an implicit function of $\psi$, say $\tilde{x}_h(\psi)$, by the implicit function theorem. Since

we obtain, noting (2.4) and (2.11),

for all $h$. By applying the mean value theorem, we see that

cannot have two different roots. Hence, if a solution $\hat{\psi}$ exists, then it is unique and
we can find the $\psi$ satisfying (2.12).

We solve (2.10)-(2.12) by combining (2.10) and (2.12) as a nonlinear system of
equations and then using a NAG library function to get a solution for appropriately
selected initial values, e.g., $\psi = 0$ and $\tilde{x}_h = \bar{x}_h$. For this solution the condition
$0 \le \hat{p}_{hi} \le 1$ usually holds, as seen in our simulation study.

Once we have the estimators $\hat{p}_{hi}$, we can estimate $\bar{Y}$ by

$$\hat{\bar{Y}} = \sum_h W_h \sum_i \hat{p}_{hi} y_{hi}.$$
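The computation described in this subsection can be illustrated with a minimal sketch for scalar $x$. It does not reproduce the NAG routine used in the thesis. Instead it assumes the equivalent form $p_{hi} = 1/(\lambda_h + nW_h\psi x_{hi})$ obtained from (2.6), finds each $\lambda_h$ by bisection so that the weights in stratum $h$ sum to one, and then finds $\psi$ by bisection so that the benchmark constraint (2.5) holds. The function names `stratum_weights` and `mele_weights` and the toy data are hypothetical, and $\bar{X}$ is assumed to lie strictly inside the range of attainable weighted means.

```python
def stratum_weights(x, c, iters=200):
    """Weights p_i = 1 / (lam + c * x_i), with lam found by bisection so that sum(p) = 1."""
    n = len(x)
    shift = -min(c * xi for xi in x)   # lam > shift keeps every p_i positive
    lo, hi = shift + 1.0, shift + n    # sum(p) >= 1 at lo and sum(p) <= 1 at hi
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if sum(1.0 / (lam + c * xi) for xi in x) > 1.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return [1.0 / (lam + c * xi) for xi in x]


def mele_weights(strata_x, W, xbar_pop, iters=200):
    """Return (psi, weights) so that sum_h W_h sum_i p_hi x_hi equals xbar_pop."""
    n = sum(len(x) for x in strata_x)

    def solve(psi):
        p = [stratum_weights(x, n * w * psi) for x, w in zip(strata_x, W)]
        gap = sum(w * sum(pi * xi for pi, xi in zip(ph, x))
                  for ph, x, w in zip(p, strata_x, W)) - xbar_pop
        return gap, p

    gap0, p = solve(0.0)               # psi = 0 gives the equal weights 1 / n_h
    if gap0 == 0.0:
        return 0.0, p
    step = 1e-3 if gap0 > 0 else -1e-3
    for _ in range(200):               # expand the bracket until the gap changes sign
        if solve(step)[0] * gap0 <= 0:
            break
        step *= 2.0
    else:
        raise ValueError("xbar_pop is not attainable from these samples")
    lo, hi = 0.0, step
    for _ in range(iters):             # bisection on psi (the gap is monotone in psi)
        mid = 0.5 * (lo + hi)
        if solve(mid)[0] * gap0 > 0:
            lo = mid
        else:
            hi = mid
    psi = 0.5 * (lo + hi)
    return psi, solve(psi)[1]


# Toy data (hypothetical): two strata, known population mean of x equal to 4.8.
strata_x = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0, 10.0]]
strata_y = [[2.0 * xi + 1.0 for xi in x] for x in strata_x]
W = [0.4, 0.6]
psi_hat, p_hat = mele_weights(strata_x, W, 4.8)
ybar_mele = sum(w * sum(pi * yi for pi, yi in zip(ph, y))
                for ph, y, w in zip(p_hat, strata_y, W))
```

Nested bisection is slower than a Newton-type solver, but it needs no derivatives and keeps every weight positive by construction, which makes the positivity condition on $\hat{p}_{hi}$ easy to inspect.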


2.2.2 Asymptotic Results

In this section, we discuss the asymptotic behavior of the MELE. We use $\| \cdot \|$ to
denote the Euclidean norm, and $\sigma_2 > \sigma_1$ means that $\sigma_2 - \sigma_1$ is a positive definite matrix.
Also, we assume that both the sample size $n_h$ and the stratum size $N_h$ for each $h$ go
to infinity as the index $\nu$ attached to $n_h$ and $N_h$ goes to infinity, i.e., $n_{h\nu}$ and $N_{h\nu}$
go to infinity as $\nu \to \infty$. However, for convenience, we will suppress the index $\nu$ in
the following whenever possible. We will also denote the solution $\hat{\psi}$ by $\psi$, since we
are going to deal with large sample problems.

Theorem 1. Suppose that as $\nu \to \infty$, $N_h$, $n_h$ and $N_h - n_h$ go to infinity, $n/n_h \to k_h$,
$k_h > 0$, and both $\sum_h W_h N_h^{-1} \sum_i \| x_{hi} \|^3$ and $\sum_h W_h N_h^{-1} \sum_i \| y_{hi} \|^3$ have an upper
bound independent of $\nu$, and $\sigma_{h,xx} = \mathrm{Cov}(x_h, x_h) > \sigma_1 > 0$ for all $h$ and $\nu$, where
$\sigma_1$ is a fixed positive definite matrix. Then

as $\nu \to \infty$, where

$$\sigma_F = \sum_h W_h^2 (n_h^{-1} - N_h^{-1}) [F_{h,N_h}(y)(1 - F_{h,N_h}(y)) - 2G\,\mathrm{Cov}'(1(y_h < y), x_h) + G\,\mathrm{Var}(x_h)\,G^T].$$


From the results of Theorem 1, we find that the estimator $\hat{\bar{Y}}$ has the same
asymptotic variance as the optimal regression estimator

where

with

Further, $\hat{F}_Y(y)$ is asymptotically more efficient than the customary estimator $F_n(y)$
$= \sum_h W_h F_{h,n_h}(y)$. It is possible to construct an optimal regression estimator of
$F_Y(y)$ by changing $y_{hi}$ to the indicator variable $1(y_{hi} \le y)$ in the formula for $\bar{y}_{opt}$
(Rao and Liu, 1992). This estimator, $\hat{F}_{opt}(y)$, is asymptotically equivalent to $\hat{F}_Y(y)$,
but it may not be monotone, unlike the MEL estimator $\hat{F}_Y(y)$. Monotonicity of
$\hat{F}_Y(y)$ in the case of scalar $y$ ensures the calculation of quantiles, etc.

In the above discussion we considered auxiliary information of the form (2.2), but
we can easily generalize it to auxiliary information of the form

The choice $w(x_{hi}) = x_{hi} - \bar{X}$ gives (2.2). When the population median $m_x$ is known
and $x$ is a scalar, we let $w(x_{hi}) = I[x_{hi} < m_x] - 0.5$.


2.2.3 Variance Estimation

In this section, we consider the estimation of the variances of $\hat{\bar{Y}}$ and $\hat{F}_Y(y)$. Under the
conditions of Theorem 1 and from its proof, we can easily see that

is a consistent estimator of $\sigma$, the asymptotic variance of $\hat{\bar{Y}}$, in the sense that
$n(\hat{\sigma} - \sigma) \to_p 0$, where

Similarly,

is a consistent estimator of $\sigma_F$, the asymptotic variance of $\hat{F}_Y(y)$, where

We can use (2.15) and (2.16) to get normal theory confidence intervals for $\bar{Y}$ and
$F_Y(y)$, which are asymptotically correct.

Another consistent estimator of $\sigma$ is obtained by the jackknife method. We give
the result in the following theorem.

Theorem 2. Under the same conditions as Theorem 1, let $\hat{\bar{Y}}_{-lj}$ be the estimator
of $\bar{Y}$ when the $j$-th observation of the $l$-th stratum is deleted and let

be the jackknife variance estimator. Then

$$n(\hat{\sigma}_J - \sigma) \to_p 0.$$

A jackknife variance estimator for $\hat{F}_Y(y)$ can be obtained from (2.17) by simply
changing $y_{hi}$ to $1(y_{hi} \le y)$. The consistency of this variance estimator follows along
the lines of Theorem 2.
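The delete-one idea behind Theorem 2 can be sketched generically. The exact form of (2.17) is not legible in this copy, so the code below assumes the common stratified jackknife $\sum_l \frac{n_l - 1}{n_l} \sum_j (\hat{\theta}_{-lj} - \hat{\theta})^2$ without a finite population correction, which may differ from the thesis. The function names are hypothetical, and the stratified mean stands in for the MELE so that the result can be checked by hand.

```python
def stratified_jackknife_variance(strata, W, estimator):
    """Assumed form: sum_l (n_l - 1) / n_l * sum_j (theta_{-lj} - theta)^2."""
    theta = estimator(strata, W)
    total = 0.0
    for l, sample in enumerate(strata):
        n_l = len(sample)
        for j in range(n_l):
            # Drop observation j of stratum l and leave the other strata untouched.
            reduced = [s if h != l else sample[:j] + sample[j + 1:]
                       for h, s in enumerate(strata)]
            total += (n_l - 1) / n_l * (estimator(reduced, W) - theta) ** 2
    return total


def stratified_mean(strata, W):
    """Stand-in estimator: sum_h W_h * (sample mean of stratum h)."""
    return sum(w * sum(s) / len(s) for s, w in zip(strata, W))


strata = [[1.0, 2.0, 4.0], [3.0, 5.0, 7.0, 9.0]]
W = [0.5, 0.5]
v_jack = stratified_jackknife_variance(strata, W, stratified_mean)
```

For the stratified mean this reduces to the textbook variance estimator $\sum_h W_h^2 s_h^2 / n_h$, which is a convenient sanity check before passing in a more expensive estimator such as the MELE.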

2.3 Estimation of Small Area Means

In some applications the strata may be regarded as small areas of interest. The
term "small area" is used to denote a small geographical area, such as a county, a
municipality or a census division. Sample survey data certainly can be used to derive
reliable estimators of totals and means for large areas or domains. However, the
usual direct survey estimators for a small area, based on data only from the sample
units in the area, are likely to yield unacceptably large standard errors due to the
unduly small size of the sample in the area. Sample sizes for small areas are typically
small because the overall sample size in a survey is usually determined to provide
specific accuracy at a much higher level of aggregation than that of small areas.
Ghosh and Rao (1994) reviewed many methods of small area estimation, including
those based on models that link the small areas through auxiliary information. Here
we do not assume any such model.

If the strata can be regarded as small areas and we only know the mean of
the population, we can get more efficient estimators of the individual small area
(stratum) means $\bar{Y}_h$ by using auxiliary information in the form of the overall mean,
$\bar{X}$. Under our set-up, we can use the MEL method to get MEL estimators for the
strata means $\bar{Y}_h$ as

$$\hat{\bar{Y}}_h = \sum_i \hat{p}_{hi} y_{hi},$$

where the $\hat{p}_{hi}$'s are the same as those given in section 2.2.

We now study the efficiency of $\hat{\bar{Y}}_h$ relative to the sample mean $\bar{y}_h$ that uses only
the $y$-data in small area $h$. We have

where

Hence

Therefore, the average variance over strata is

We use this as a measure of precision of the MEL estimators $\hat{\bar{Y}}_h$. We consider two
special cases. First, if $x$ and $y$ are uncorrelated for all $h$, then

In this case, there is no gain in efficiency over $\bar{y}_h$. On the other hand, if $y_{hi} =
a_h + c_h x_{hi}$ for all $h$, where $x_{hi}$ is scalar, then

This suggests that in practice $\hat{\bar{Y}}_h$ will be more efficient than $\bar{y}_h$ if $y$ and $x$ are
approximately linearly related in each small area $h$.

2.4 Empirical Likelihood Ratio Test

Now we turn to the empirical likelihood ratio test (ELRT) on $\bar{Y}$. Here we denote
$E(z) = \sum_h W_h E(z_h) = \theta$, where $\theta = (\theta_k', \theta_u')'$, $\theta_k = \bar{X}$, $\theta_u = \bar{Y}$, and $k$, $u$ mean
"known" and "unknown" respectively. By (2.1) we know the empirical likelihood
function is

$$L(F) = \prod_{h=1}^{H} \prod_{i=1}^{n_h} dF_{h,N_h}(z_{hi}) = \prod_{h=1}^{H} \prod_{i=1}^{n_h} p_{hi}, \qquad (4.1)$$

where $p_{hi} = dF_{h,N_h}(z_{hi}) = \Pr(z_h = z_{hi})$ and $F$ denotes the population distribution.
Noting that $p_{hi} \ge 0$, $\sum_i p_{hi} = 1$ and $F(z) = \sum_h W_h F_{h,N_h}(z)$, we know (4.1) is
maximized by $F_n(z) = \sum_h W_h F_{h,n_h}(z)$, where $F_{h,n_h}(z) = n_h^{-1} \sum_i I(z_{hi} < z)$. The
empirical likelihood ratio is then defined as $R(F) = L(F)/L(F_n)$, which reduces to

$$R(F) = \prod_{h=1}^{H} \prod_{i=1}^{n_h} n_h p_{hi}. \qquad (4.2)$$


Since we are interested in the parameter $\theta_u = \bar{Y}$, and we have auxiliary information
(2.2), we define the profile empirical likelihood ratio function

The existence of a unique value for the right-hand side of (4.3) for a given $\theta_u$ is the
same as discussed in section 2.2. Using the Lagrange multiplier method, let

where $t = (t_1, t_2, \ldots, t_d)^T$ are Lagrange multipliers. Taking derivatives with respect

Hence,

where $\tilde{z}_h = \sum_i p_{hi} z_{hi}$, $m_h = nW_h n_h^{-1}$ and $t$ is the solution to

From the discussion of section 2.2 we know that $t$ can be determined in terms of $\theta$,
and $t = t(\theta)$ is actually a continuously differentiable function of $\theta$.

The empirical likelihood ratio function for $\theta$ now becomes

which leads to the empirical log-likelihood ratio

We minimize $l_E(\theta)$ to obtain an estimator $\tilde{\theta}_u$ of the parameter $\theta_u$, and $\tilde{\theta} = (\bar{X}', \tilde{\theta}_u')'$,
called the empirical likelihood ratio estimator. In addition, this yields estimators
$\tilde{p}_{hi}$ from (4.4), and an estimator for the distribution function $F$:

Now we turn to the calculation of $\tilde{\theta}$ by minimizing $l_E(\theta)$. From the context above
we know that it is equivalent to maximizing

subject to

where $\theta_u$ is unknown. Since $\theta_u$ is unknown, (4.10) does not add any information
to the estimation problem. Hence, $L$ will be maximized by dropping (4.10). We
can show this as follows. Let

where $t_k$, $t_u$ are multipliers with the same dimensions as $\theta_k$, $\theta_u$ respectively. Then

$$\frac{\partial \Omega}{\partial \theta_u'} = n t_u = 0 \;\Rightarrow\; t_u = 0.$$

This shows that we can drop (4.10). Therefore, the $\tilde{p}_{hi}$'s are the same as the $\hat{p}_{hi}$'s given
in section 2.2, and $\tilde{\theta}_u = \tilde{\bar{Y}} = \sum_h W_h \sum_i \tilde{p}_{hi} y_{hi}$. We can prove this by following the
steps in section 2.2.

We note here that under $H_0: \theta_u = \theta_{u0}$, the estimator for $p_{hi}$ will be changed to

$$p_{hi} = \frac{1}{n_h [1 + m_h t'(\theta_0)(z_{hi} - \tilde{z}_h)]},$$

where $\tilde{z}_h = \sum_i p_{hi} z_{hi}$ and $\theta_0 = (\bar{X}', \theta_{u0}')'$. The calculation of the $p_{hi}$'s is as discussed in
section 2.2, except for changing $x_{hi}$ to $z_{hi}$, $\tilde{x}_h$ to $\tilde{z}_h$, etc.

The empirical likelihood ratio statistic for testing $H_0: \theta_u = \theta_{u0}$ is given by

Theorem 3. Under the conditions of Theorem 1 and $\mathrm{Var}(z_h) \ge \sigma_1 > 0$ for all
$h$ and $\nu$, we have

$$W_E(\theta_0) \to_d \chi^2(p) \quad \text{as } \nu \to \infty$$

when $H_0$ is true.

Using Theorem 3, we reject $H_0$ at level $\alpha$ if $W_E(\theta_0)$ exceeds the upper $\alpha$-point,
$\chi^2_\alpha(p)$, of $\chi^2(p)$.


2.5 Simulation study

We conducted a limited simulation study on the finite sample properties of the
proposed MEL estimators of the population mean $\bar{Y}$ and the distribution function
$F_Y(y)$ in the scalar case ($p = 1$, $d = 2$). We also considered the likelihood ratio test
on $\bar{Y}$.

For the simulation study, we employed six stratified finite populations considered
by Chen and Sitter (1996). Each population consisted of $L = 4$ strata with stratum
sizes $N_h = 8000 - 300h$ and stratum sample sizes $n_h = 100 - 9h$ for $h = 1, 2, 3, 4$.
The scalar characteristic $y_{hi}$ for the $i$-th unit within the $h$-th stratum was generated
from the model

for specified values of $a_h$, $\beta_h$, $\gamma_h$, $a$ and $b_h$, where the $e_{hi}$ are iid random variables for
each $h$, generated from either a chi-squared distribution with $b_h$ degrees of freedom,
i.e., $\chi^2_{b_h}$, or a standard normal distribution, $N(0, 1)$. Further, the scalar $x_{hi}$ are iid
variables for each $h$, generated from

Table 2.1 gives the parameter settings
for generating the six populations. The strength of the relationship between $y$ and
$x$ is weak for population 1, and in populations 2-4, $y$ and $x$ are linearly related,
while in populations 5 and 6 a quadratic term is added to create departures from
linearity. The populations cover both symmetric and skewed errors $e_{hi}$.

For each of the parameter settings in Table 2.1, a stratified finite population was
created and $R = 1000$ independent stratified simple random samples were drawn.
The random numbers needed to generate the finite population were obtained using
the NAG Fortran library function. The empirical mean squared error of an estimator
$\hat{\theta}$ of a parameter $\theta$ was calculated as

$$MSE(\hat{\theta}) = R^{-1} \sum_{r=1}^{R} (\hat{\theta}^{(r)} - \theta)^2,$$

where $\hat{\theta}^{(r)}$ is the value of $\hat{\theta}$ for the $r$-th simulated sample.
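The Monte Carlo loop described above can be sketched as follows. This is an illustration only: it uses Python's standard `random` module rather than the NAG generator, a small made-up population rather than the Chen and Sitter (1996) populations, and the stratified mean rather than the MELE. The names `stratified_srs` and `empirical_mse` are hypothetical.

```python
import random


def stratified_srs(population, sizes, rng):
    """Draw a simple random sample without replacement from every stratum."""
    return [rng.sample(stratum, n_h) for stratum, n_h in zip(population, sizes)]


def empirical_mse(estimates, truth):
    """Average squared deviation of the simulated estimates from the true value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


rng = random.Random(1997)
# Hypothetical finite population: 4 strata of 500 units with different means.
population = [[rng.gauss(10.0 * h, 1.0) for _ in range(500)] for h in (1, 2, 3, 4)]
N = [len(stratum) for stratum in population]
W = [N_h / sum(N) for N_h in N]
truth = sum(sum(stratum) for stratum in population) / sum(N)

estimates = []
for _ in range(200):                      # R = 200 replicates keeps the sketch fast
    sample = stratified_srs(population, [20, 20, 20, 20], rng)
    estimates.append(sum(w * sum(s) / len(s) for w, s in zip(W, sample)))
mse_st = empirical_mse(estimates, truth)
```

Ratios such as those in Table 2.2 would then be obtained by running several estimators on the same replicated samples and dividing their empirical MSEs.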

We now study the performance of the MEL estimator $\hat{\bar{Y}}$ of the mean $\bar{Y}$ relative
to the stratified mean, $\bar{y}_{st} = \sum_h W_h \bar{y}_h$, a generalized regression estimator (GREG)
and the optimal regression estimator $\bar{y}_{opt}$. The GREG estimator is given by

where

This estimator is motivated by a linear regression model with common intercept and
slope coefficients and constant error variance (Sarndal et al., 1991, p. 229). Table 2.2
reports the ratios of $MSE(\hat{\bar{Y}})$ to $MSE(\bar{y}_{GREG})$, $MSE(\bar{y}_{st})$ and $MSE(\bar{y}_{opt})$ for all
six populations. It is clear from Table 2.2 that the MELE outperforms the GREG
estimator, especially for populations 1-4 with widely varying intercept terms across
strata. It is also considerably more efficient than the stratified mean except in
the case of population 1, with a weak relation between $y$ and $x$. As expected, the MELE
and the optimal estimator are equally efficient for all six populations.

The second part of our simulation study deals with the estimation of the population
distribution function $F_Y(y)$. We selected the values of $F_Y(y)$ corresponding
to the quartiles $y_p$ ($p = 1/4, 2/4, 3/4$), i.e., $F_Y(y_p) = p$. Table 2.3 reports the ratios
of $MSE(\hat{F}_Y(y))$ to $MSE(\hat{F}_{st}(y))$ and $MSE(\hat{F}_{opt}(y))$, where $\hat{F}_Y(y)$ is the MELE of
$F_Y(y)$, $\hat{F}_{st}(y) = \sum_h W_h F_{h,n_h}(y)$ is the empirical distribution function and $\hat{F}_{opt}(y)$ is
the optimal regression estimator of $F_Y(y)$. It is clear from Table 2.3 that the MEL
estimator $\hat{F}_Y(y)$ outperforms the empirical distribution function $\hat{F}_{st}(y)$ in all six
populations. Although $\hat{F}_Y(y)$ and $\hat{F}_{opt}(y)$ are asymptotically equivalent, our simulation
results indicate that $\hat{F}_Y(y)$ is often significantly more efficient than $\hat{F}_{opt}(y)$ in
finite samples. For example, the ratio equals 0.62 for populations 2 and 4 and 0.7 for
population 6 at some of the quartiles considered. The main advantage of $\hat{F}_Y(y)$, however, is that
it is monotone, unlike $\hat{F}_{opt}(y)$.

The final part of our simulation study investigates the performance of the empirical
likelihood ratio test and the asymptotic normality test obtained from the
result of Theorem 1 (called ANT). Since we know the means of the six populations,
say $\bar{Y}_0$, we can test the hypothesis $H_0: \bar{Y} = \bar{Y}_0$ from the simulated samples. We
tested $H_0$ using the empirical likelihood ratio test statistic $W_E(\bar{Y}_0)$ at nominal levels
$\alpha = 0.05$ and 0.1. Table 2.4 reports the empirical type I error rates for $\alpha = 0.10$ and
0.05, obtained by computing the proportion of samples for which $W_E(\bar{Y}_0) > \chi^2_\alpha(1)$,
the upper $\alpha$-point of a $\chi^2$ variable with 1 degree of freedom. Table 2.5 reports the
empirical error rates for $\alpha = 0.10$ and 0.05, obtained by computing the proportion of
samples for which $|\hat{\bar{Y}} - \bar{Y}_0| > u_\alpha \hat{\sigma}^{1/2}$, where $u_\alpha$ is the upper $\alpha$-point of a $N(0, 1)$
distribution. Table 2.4 shows that the empirical likelihood ratio test performs well over
the six populations, although it is somewhat conservative. It appears that higher-order
corrections to the empirical likelihood ratio test are needed to get more accurate
results, as noted by Qin and Lawless (1995) in the context of simple random sampling.
Table 2.5 shows that the standard normal approximation performs well over all six
populations.
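The empirical error rates reported in Tables 2.4 and 2.5 are simple proportions of rejections over the simulated samples. A minimal sketch, using only the Python standard library, is given below. The function names are hypothetical, and the normal tail probability is passed in explicitly because the convention for $u_\alpha$ is not fully legible here. The $\chi^2(1)$ critical value uses the fact that the upper $\alpha$-point of $\chi^2(1)$ is the square of the upper $\alpha/2$-point of $N(0,1)$.

```python
from statistics import NormalDist


def elrt_rejects(statistic, alpha):
    """Reject when the ELR statistic exceeds the upper alpha-point of chi-square(1)."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    return statistic > critical


def ant_rejects(estimate, variance_estimate, null_value, tail_prob):
    """Reject when |estimate - null| exceeds u * sqrt(variance estimate),
    where u is the upper tail_prob point of N(0, 1)."""
    u = NormalDist().inv_cdf(1.0 - tail_prob)
    return abs(estimate - null_value) > u * variance_estimate ** 0.5


def rejection_rate(flags):
    """Empirical error rate in percent over the simulated samples."""
    return 100.0 * sum(flags) / len(flags)


# Hypothetical ELR statistics from four simulated samples, tested at alpha = 0.05.
flags = [elrt_rejects(w, 0.05) for w in (0.3, 4.2, 1.1, 2.7)]
rate = rejection_rate(flags)
```

A rate below the nominal level under $H_0$ corresponds to the conservative behaviour noted for the empirical likelihood ratio test.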


Table 2.1: Parameter settings for generated finite populations

(Rows: populations I to VI; the tabulated entries include chi-squared and $N(0, 1)$ distributions.)


Table 2.2: Ratios of MSE's of $\hat{\bar{Y}}$ to $\bar{y}_{GREG}$, $\bar{y}_{st}$ and $\bar{y}_{opt}$

Population    to GREG    to st      to opt
1             0.0247     0.938      1.001
2             0.471      0.786      0.999
3             0.345      0.SSS      0.994
4             0.434      0.191      0.996
5             0.629      0.604      1.009
6             OS18       O Xi6      1.006


Table 2.3: Ratios of MSE's of $\hat{F}_Y(y_p)$ to $\hat{F}_{st}(y_p)$ and $\hat{F}_{opt}(y_p)$

Population | Ratio | p = 1/4 | p = 2/4 | p = 3/4

Table 2.4: Empirical Error Rate (%) for MELRT

Population

Table 2.5: Empirical Error Rate (%) for ANT

Population

2.6 Proofs

We first summarize the existence conclusion for the maximization problem at the beginning
of section 2.2 in a lemma and then prove it.

Lemma 1. Under the conditions of Theorem 1, there exists, with probability tending
to one as $\nu$ goes to infinity, a solution to the problem of maximizing $\sum_h \sum_i \log p_{hi}$
subject to

In order to simplify our discussion, we suppose that $x$ is a scalar here. The
discussion is similar when $x$ is multi-dimensional.

and

then obviously $D_1$ is a convex subset of $D$. As shown by Owen (1990), a unique
solution exists for maximizing $\sum_h \sum_i \log p^*_{hi}$ within $D$, provided that $\bar{X}$ is within the
convex hull of $x_{11}, \ldots, x_{Hn_H}$. Since $D_1 \subset D$, if we can show that $D_1$ is not
empty, we can get a maximum for $\sum_h \sum_i \log p^*_{hi}$ within $D_1$, which is equivalent to the
maximizing problem in the lemma since $p^*_{hi} = W_h p_{hi}$.


Under the conditions of Theorem 1, it is obvious that

where $x_{(1)} = \min_h \min_i x_{hi}$, $x_{(n)} = \max_h \max_i x_{hi}$. Since $H$ is finite, and samples are
independent across strata, we can also conclude that

If $\bar{X}_h \in (x_{h(1)}, x_{h(n_h)})$, $h = 1, \ldots, H$, then we have a solution $(p_{h1}, \ldots, p_{hn_h})$ ($\sum_i p_{hi} = 1$) such that $\sum_i p_{hi} x_{hi} =
\bar{X}_h$. Since $\sum_h W_h \bar{X}_h = \bar{X}$, it follows that

The conditions $\bar{X}_h \in (x_{h(1)}, x_{h(n_h)})$ ($h = 1, \ldots, H$) hold with probability tending to
1, hence $D_1$ is not empty with probability tending to 1. This completes the proof.

We need the following lemma to prove Theorem 1.

Lemma 2. Under the conditions of Theorem 1, we have

where


Proof of Lemma 2

By using the decomposition

and noting (2.13) we have

From (6.2) and (6.4) we get

where $\| \bar{x}_{st} - \bar{X} \|^2 \to_p 0$ follows from standard results in stratified random sampling
theory (e.g., Bickel and Freedman, 1984). Hence

Therefore,

From (6.4) we have

where $x^*_{n_h} = \max_i \| x_{hi} \|$. Similar to the proof of Owen's (1988) Lemma, we can
show that $x^*_{n_h} = o(n_h^{1/2})$ almost surely (a.s.). Hence

where $n_{(1)} = \min\{n_1, \ldots, n_H\}$. By (6.8) we get

where $\lambda = \sum_h k_h W_h^2 \sigma_{(1)} / 2 > 0$, and $\sigma_{(1)}$ is the smallest eigenvalue of $\sigma_1$. Hence,

we know $\| \bar{x}_{st} - \bar{X} \| = O_p(n^{-1/2})$ and $\| \bar{x}_{st} - \bar{X} \|\, o_p(n^{1/2}) = o_p(1)$. Therefore, (6.10)
establishes the last part of (6.1). From (6.4) we have

and

Therefore,

CHAPTER 2. EL FOR F I N T E POPULATIONS

Proof of Theorem 1

We have

where the last term, noting (6.9), is

Hence, from (6.1) and (6.1) we get

Since $E[\bar{y}_{st} - B(\bar{x}_{st} - \bar{X})] = \bar{Y}$ and

by appealing to Bickel and Freedman (1984), we get

$$\sigma^{-1/2}(\hat{\bar{Y}} - \bar{Y}) \to_d N_p(0, I_p).$$

Now we turn to the second result of Theorem 1. Denoting

we have

where by (6.4) the third term is

Therefore

where

Hence,

$$\sigma_F^{-1/2}(\hat{F}_Y(y) - F_Y(y)) \to_d N_p(0, I_p).$$

Proof of Theorem 2

for convenience, where $\psi_{-lj}$ is the Lagrange multiplier $\psi$ when the $j$-th observation of
the $l$-th stratum is deleted. In the following, $o_p(1)$ means that it holds uniformly over $h, i, l, j$.
From Lemma 2.1 it can be shown that the following hold uniformly over $h, i, l, j$:

Though the MELE for $p_{lj}$ is 0 when we delete $x_{lj}$ from the data, we redefine $\hat{p}_{lj,-lj}$ as
above for convenience. We have

where the third term in the sixth equality has the order

The fourth term of the same equation also has the same order.

Using the same method we can prove for $h \ne l$ that

Since

by (6.11) and (6.12) we have

Therefore

Following the steps of (6.11) we can obtain similar results on vh - W e have

Wl .- vt; (xij - i l ) + - 1

ni - 1 n1 - 1 (91j - Y i ) + o p ( ; )

Proof of Theorem 3

The result in Lemma 2 holds if we change $\bar{X}$ to $\theta_0$, $\bar{x}_{st}$ to $\bar{z}_{st}$, etc.

where $\sigma_{zz} = \sum_h W_h^2 k_h \sigma_{h,zz}$. Therefore,

Since

Now

Therefore,

noting that

Chapter 3

Empirical Likelihood and General

Estimating Equations under

Stratified Sampling

3.1 Introduction

Likelihood and estimating equations provide the most common approaches to parametric
inference. Recently, they have also been shown to be useful in non-parametric
contexts (Qin and Lawless (1994, 1995) and the references cited therein). Our purpose
in this chapter is to combine empirical likelihood, estimating equations and stratified
sampling techniques. In order to simplify the discussion, we suppose an i.i.d.
sample in each stratum, but the discussion can be carried over to simple random
sampling without replacement.

Suppose that a target population with $d$-dimensional characteristic $x$ has the

CHAPTER 3. EL AND GEE 56

unknown distribution function $F$ and a $p$-dimensional parameter $\theta$ associated with
$F$. We are interested in making inference on $\theta$. The sampling scheme is as follows.
Suppose that the target population is divided into $H$ strata with known weight
$W_h$ for all strata. We have an i.i.d. sample $x_{h1}, \ldots, x_{hn_h}$ on the characteristic $x_h$
from stratum $h$ with distribution function $F_h$, where the $x_{hi}$'s are $d$-dimensional ($h =
1, \ldots, H$; $i = 1, \ldots, n_h$). The $F_h$'s are unknown, and the samples across strata are
independent. From the context we know that

$$F(x) = \sum_h W_h F_h(x). \qquad (1.1)$$

We also assume that information about $\theta$ and $F$ is available in the form of $r \ge p$
functionally independent unbiased estimating functions, that is, functions $g_j(x, \theta)$, $j =
1, \ldots, r$, such that $E_F\{g_j(x, \theta)\} = 0$. In vector form,

$$g(x, \theta) = (g_1(x, \theta), \ldots, g_r(x, \theta))',$$

where $g(x, \theta)$ satisfies

$$E_F\{g(x, \theta)\} = 0. \qquad (1.2)$$

In the following, we will show how to use such information to estimate $\theta$ and $F$,
in conjunction with empirical likelihood. But first, for illustration, we give some
examples that we will return to later.

Example 1. Sometimes we have information relating to the first and second
moments of a variable. For example, let $x$ be univariate with mean $\theta$ and $E(x^2) =
m(\theta)$, where $m(\cdot)$ is a known function. The information about $F$ can be expressed
in the form (1.2) by taking $g(x, \theta) = (x - \theta, x^2 - m(\theta))'$ and the restriction becomes

Example 2. Let $x = (y, z)'$ be bivariate with $E(y) = E(z) = \theta$. In this case we
can take $g(x, \theta) = (y - \theta, z - \theta)'$ and the restriction becomes

A somewhat similar problem is when $E(y) = c$ is known and $E(z) = \theta$ is to be
estimated, in which case we would have $g(x, \theta) = (y - c, z - \theta)'$. This problem has
been discussed in the survey context in chapter 2, where $\bar{X}$ and $\bar{Y}$ played the roles
of $c$ and $\theta$ respectively.
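As a concrete illustration of the two examples, the sketch below codes the estimating functions and evaluates their stratified sample average $\sum_h W_h n_h^{-1} \sum_i g(x_{hi}, \theta)$, i.e. the value that the constraint would take under equal weights $p_{hi} = 1/n_h$. The function names, the toy data and the choice $m(\theta) = \theta^2 + 1$ (a unit-variance assumption) are hypothetical and serve only to make the example runnable.

```python
def g_example1(x, theta, m):
    """Example 1: first and second moment information, g = (x - theta, x^2 - m(theta))."""
    return (x - theta, x * x - m(theta))


def g_example2(x, theta):
    """Example 2: bivariate x = (y, z) with common mean, g = (y - theta, z - theta)."""
    y, z = x
    return (y - theta, z - theta)


def stratified_average(strata, W, g):
    """Componentwise sum_h W_h * n_h^{-1} * sum_i g(x_hi)."""
    r = len(g(strata[0][0]))
    out = [0.0] * r
    for sample, w in zip(strata, W):
        for x in sample:
            for j, value in enumerate(g(x)):
                out[j] += w * value / len(sample)
    return out


strata = [[1.0, 3.0], [2.0, 4.0, 6.0]]
W = [0.5, 0.5]
theta = 3.0                                   # equals the stratified sample mean here
avg1 = stratified_average(strata, W, lambda x: g_example1(x, theta, lambda t: t * t + 1.0))

pairs = [[(1.0, 2.0), (3.0, 2.0)], [(2.0, 5.0), (4.0, 3.0)]]
avg2 = stratified_average(pairs, W, lambda x: g_example2(x, 2.5))
```

With $r > p$ the equal-weight average generally cannot be zero in every component for any single $\theta$, which is why the weights $p_{hi}$ are adjusted by empirical likelihood.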

We will show later how to use the empirical likelihood method to solve the above
problems. The basic idea is to maximize an empirical likelihood subject to constraints
provided by (1.2) to get the maximum empirical likelihood ratio (MELR) estimators
for $\theta$ and $F$. In section 3.2, we give MELR estimators for $\theta$ and $F$, and
establish asymptotic normality. Section 3.3 shows how the MELR method can be used
in models where parameters are subject to constraints. Hypothesis testing problems
will also be considered. Finally, section 3.4 presents examples to demonstrate how
these methods can be used in practice. Proofs are relegated to section 3.5.

3.2 MELR Estimators and Their Properties

3.2.1 MELR estimators

The empirical likelihood function is given by

$$L(F) = \prod_{h=1}^{H} \prod_{i=1}^{n_h} p_{hi}, \qquad (2.1)$$

where $p_{hi} = \Pr(x_h = x_{hi})$. Only those distributions $F$ with $F_h$'s which have an atom
of probability on each $x_{hi}$ have nonzero likelihood. We know (2.1) is maximized by
the empirical distribution function $F_n(x) = \sum_h W_h F_{n_h}(x)$, by noting (1.1), where
$n = \sum_h n_h$, $F_{n_h}(x) = n_h^{-1} \sum_i 1(x_{hi} < x)$, $1(x_{hi} < x) = (1(x_{hi,1} < x_1), \ldots, 1(x_{hi,d} <
x_d))'$, $x_{hi,j}$ and $x_j$ are the $j$-th components of the vectors $x_{hi}$ and $x$ respectively, and $1(x_{hi,j} < x_j)$
is the indicator of the set $(x_{hi,j} < x_j)$. The empirical likelihood ratio is then defined as
$R(F) = L(F)/L(F_n)$, which reduces to

$$R(F) = \prod_{h=1}^{H} \prod_{i=1}^{n_h} n_h p_{hi}. \qquad (2.2)$$

We remark that formulas here and elsewhere in this chapter do not require that
the $x_{hi}$'s be distinct. Since we are interested in estimating the parameter $\theta$, and we
know the estimating equation (1.2), we define the empirical likelihood ratio function

where $g_{hi}(\theta) = g(x_{hi}, \theta)$ for all $h$ and $i$. As discussed in chapter 2, for any given $\theta$
in a neighbourhood of the true $\theta$, a unique value for the right side of (2.3) exists with
probability tending to 1 as all $n_h \to \infty$. The maximum can be found via the Lagrange
multiplier method. Let


where $t$ is an $r$-dimensional vector. Then

with the restriction from $\partial \Omega / \partial t = 0$, i.e.,

$$\sum_h W_h G_h(\theta) = 0, \qquad (2.6)$$

from which (see section 3.2.2) $t$ can be determined in terms of $\theta$. Note that $t = t(\theta)$
is actually a continuously differentiable vector-valued function of $\theta$. Therefore, the
empirical log-likelihood ratio statistic is given by

with $t$ being defined by (2.5). We minimize $l_E(\theta)$ to obtain an estimate $\tilde{\theta}$ of the
parameter $\theta$ (called the MELR estimate). In addition, this yields estimates $\tilde{p}_{hi}$ from (2.4), and
an estimate of the distribution $F$ as

Sometimes, in order to indicate that $l_E(\theta)$ and $R_E(\theta)$ also depend on $t(\theta)$, $l_E(\theta, t(\theta))$ and

3.2.2 Numerical method for given $\theta$

We first show how to solve (2.4), (2.5) and (2.6) to get the $p_{hi}$'s for given $\theta$, where
$0 \le p_{hi} \le 1$. Here implicitly the $p_{hi}$'s are functions of $\theta$. Suppose all $E_h g(x_h, \theta) = M_h(\theta)$
are also given, where $E_h$ indicates that the expectation is taken with respect to $F_h$ (we
note here that $M_h(\theta)$ may include other characteristics of $x_h$ in which we are not interested).
Here, of course, the $M_h(\theta)$'s must satisfy $\sum_h W_h M_h(\theta) = 0$. Then, using the Lagrange
multiplier method, we find that $\sum_h \sum_i \log p_{hi}$ attains the maximum

with $\psi_h$ satisfying

where implicitly

Hence, we can say that $\psi_h$ is a function of $M_h(\theta)$ and $\theta$, and we want to maximize

subject to $\sum_h W_h M_h(\theta) = 0$. Using the Lagrange method again, let

Letting $n_h \psi_h - nW_h t = 0$, we get $\psi_h = m_h t$, which implies that

$$p_{hi} = \frac{1}{n_h [1 + m_h t^T (g_{hi}(\theta) - M_h(\theta))]}, \quad 0 < p_{hi} < 1, \qquad (2.10)$$

or we can say that $M_h(\theta)$ is a function of $t$, say $M_h(t)$, and $t$ should be chosen such
that

$$\sum_h W_h M_h(t) = 0. \qquad (2.11)$$

In conclusion, for any given $\theta$, we can select a value of $t$, solve (2.9) for each $h =
1, \ldots, H$ under the conditions $p_{hi} \ge 0$ to get $M_h(t)$, and then check whether the $M_h(t)$'s
satisfy equation (2.11). Another way is to combine (2.9) and (2.11) as a system of
equations and use Newton's method to solve these equations for appropriately selected
initial values. Usually (2.10) will be satisfied.
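To make the Newton step concrete, the following sketch treats only the simplest special case, which is an assumption made here and not the general stratified system: a single stratum ($H = 1$, so $m_h = 1$ and $M_h = 0$) and a scalar estimating function. The weights then take the form $p_i = 1/(n[1 + t g_i])$ and $t$ solves $\sum_i g_i/(1 + t g_i) = 0$. The function names are hypothetical, and the sample values $g_i$ are assumed to include both positive and negative values so that a solution exists.

```python
def solve_t(g, iters=50):
    """Damped Newton iteration for sum_i g_i / (1 + t * g_i) = 0 (scalar g, one stratum)."""
    t = 0.0
    for _ in range(iters):
        f = sum(gi / (1.0 + t * gi) for gi in g)
        df = -sum(gi * gi / (1.0 + t * gi) ** 2 for gi in g)
        step = -f / df
        # Halve the step until every 1 + t * g_i stays positive (keeps p_i > 0).
        while any(1.0 + (t + step) * gi <= 0.0 for gi in g):
            step *= 0.5
        t += step
        if abs(step) < 1e-14:
            break
    return t


def el_weights(g):
    """Return t and the weights p_i = 1 / (n * (1 + t * g_i))."""
    t = solve_t(g)
    n = len(g)
    return t, [1.0 / (n * (1.0 + t * gi)) for gi in g]


x = [1.0, 2.0, 3.0, 4.0, 10.0]
theta = 3.5
g = [xi - theta for xi in x]          # g(x, theta) = x - theta
t_hat, p = el_weights(g)
log_ratio = sum(__import__("math").log(1.0 + t_hat * gi) for gi in g)
```

The step-halving guard plays the role of the side condition $0 < p_{hi} < 1$: it prevents the iteration from leaving the region where all weights are positive. In the stratified case the same idea would be applied to the combined system (2.9) and (2.11).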

Using the same method as in Chapter 2, we can prove the existence of solutions
to (2.9), (2.10) and (2.11). We also know that the $M_h(\theta)$'s are continuously differentiable
functions of $\theta$.

3.2.3 Asymptotic Results

The following lemmas will be used to prove that at some point $\tilde{\theta}$, $l_E(\theta)$ attains
its minimum and to establish asymptotic properties of $\tilde{\theta}$. We use $\| \cdot \|$ to denote
the Euclidean norm.

Lemma 1. Let $x_i$ be i.i.d. $d$-dimensional random variables and define $Z_n =$

Lemma 2. Suppose that as $n \to \infty$, $n/n_h \to k_h > 0$ for all $h$. Suppose also
that in a neighbourhood of the true value $\theta_0$, $E_h[(g(x_h, \theta_0) - E_h g(x_h, \theta_0))(g(x_h, \theta_0) -
E_h g(x_h, \theta_0))'] = \sigma_h(\theta_0) > \sigma_1 > 0$ for all $h$, where $\sigma_1$ is positive definite, and $\| g(x, \theta) \|^3$
is bounded by some integrable function $G(x)$ in this neighbourhood. Then for any
given $\theta$ in this neighbourhood,

Lemma 3 (Jennrich, 1969). Let $g$ be a function on $X \times \Theta$, where $X$ is a
Euclidean space and $\Theta$ is a compact subset of a Euclidean space. Let $g(x, \theta)$ be
a continuous function of $\theta$ for each $x$, and a measurable function of $x$ for each $\theta$.
Assume also that $|g(x, \theta)| \le h(x)$ for all $x$ and $\theta$, where $h$ is integrable with respect
to a probability distribution function $F$ on $X$. If $x_1, x_2, \ldots$ is a random sample
from $F$, then for almost every sequence $(x_n)$

$$n^{-1} \sum_{i=1}^{n} g(x_i, \theta) \to \int g(x, \theta)\, dF(x)$$

uniformly for all $\theta$ in $\Theta$.

Lemma 4. In addition to the conditions of Lemma 2, we further assume that
$\partial g(x, \theta)/\partial \theta$ is continuous in a neighbourhood of the true value $\theta_0$, $\| \partial g(x, \theta)/\partial \theta \|$ is
bounded by some integrable function $G(x)$ in this neighbourhood, and the rank of
$\sum_h W_h E_h[\partial g(x_h, \theta_0)/\partial \theta]$ is $p$. Then, as $n \to \infty$, with probability 1, $l_E(\theta)$ attains its
minimum value at some point $\tilde{\theta}$ in the interior of the ball $\| \theta - \theta_0 \| \le n^{-1/3}$, and $\tilde{\theta}$
and $\tilde{t} = t(\tilde{\theta})$ satisfy

$$Q_{1n}(\tilde{\theta}, \tilde{t}) = 0, \quad Q_{2n}(\tilde{\theta}, \tilde{t}) = 0, \qquad (2.13)$$

where

with

and

Theorem 1. In addition to the conditions of Lemma 4 above, we further assume
that $\partial^2 g(x, \theta)/\partial \theta \partial \theta'$ is continuous in $\theta$ in a neighbourhood of the true value $\theta_0$ and
$\| \partial^2 g(x, \theta)/\partial \theta \partial \theta' \|$ is bounded by some integrable function $G(x)$ in that neighbourhood. Then

where $\tilde{F}_n(x) = \sum_h W_h \sum_i \tilde{p}_{hi} 1(x_{hi} < x)$, $\tilde{p}_{hi} = \{ n_h [1 + m_h \tilde{t}'(g_{hi}(\tilde{\theta}) - M_h(\tilde{\theta}))] \}^{-1}$, and the
limiting covariance matrices are defined in the proofs in Section 3.5. Further, $\tilde{\theta}$ and $\tilde{t}$ are
asymptotically independent.

Theorem 2. In the semi-parametric model (1.2), for testing $H_0: \theta = \theta_0$ the
empirical likelihood ratio test statistic

is asymptotically $\chi^2_p$ under $H_0$, assuming the conditions of Theorem 1.

Corollary 1. Assume $r > p$. In order to test the model (1.2), i.e.,

we can use the empirical likelihood ratio statistic

Under the conditions of Theorem 2, $R_1$ is asymptotically $\chi^2_{r-p}$ if (1.2) is correct.

3.3 MELR Estimators and Testing with Constraints

In this section we extend the empirical likelihood methods to deal with the case in which
there are constraints on parameters. Suppose there is a $q$-dimensional constraint on
$\theta$, i.e.,

$$r(\theta) = 0, \qquad (3.1)$$

where $r(\theta)$ is a $q \times 1$ vector ($q \le p$) and the $q \times p$ matrix $R(\theta) = \partial r(\theta)/\partial \theta'$ is of full rank $q$. To
minimize $l_E(\theta)$ defined by (2.7) subject to $r(\theta) = 0$, we consider

where $\nu$ is a $q \times 1$ vector of Lagrange multipliers. Differentiating this expression with respect to
$\theta$ and $\nu$, we have

Consequently, to minimize $l_E(\theta)$ subject to $r(\theta) = 0$, we need the solution of

where

$$Q_{3n}(\theta, t, \nu) = r(\theta),$$

where $j_h(\theta)$ is defined by (2.13). We denote the solution of (3.3) as $(\tilde{\theta}_r, \tilde{t}_r, \tilde{\nu}_r)$. To
discuss the asymptotic behavior of these estimators and the test statistics based on
them, we suppose that $r(\theta_0) = 0$, where $\theta_0$ is the true value of $\theta$. First we give two
lemmas which show that the solution to (3.3) does exist with probability tending to 1
as $n \to \infty$.

We need the following lemma from Aitchison and Silvey (1958) in the proof of
Lemma 6:

Lemma 5. If $\psi(\lambda)$ is a continuous function mapping $R^s$ into itself with the
property that, for every $\lambda$ such that $\| \lambda \| = 1$, $\lambda' \psi(\lambda) < 0$, then there exists a point
$\hat{\lambda}$ such that $\| \hat{\lambda} \| < 1$ and $\psi(\hat{\lambda}) = 0$.

Lemma 6. Suppose that $g(x, \theta)$ is continuous and differentiable in the neighbourhood
of $\theta_0$ and $E \| g(x, \theta) \|^3 < \infty$. Furthermore, assume that $\sigma_h(\theta_0) > 0$ for all
$h$ and that $\sum_h W_h E_h\{ \partial g(x_h, \theta_0)/\partial \theta \}$ is of full rank. Then in the sphere $\{ \theta : \| \theta - \theta_0 \| \le d_n \}$,
the equation $Q_{1n}(\theta, t) = 0$ almost surely has roots $t = t(\theta) = O(d_n)$, and $t(\theta)$ is

' 5 continuous and differentiable when 0 belongs to this sphere, where d, = n-7- . + & > O . 6


Lemma 7 Assume that the conditions of Lemma 3.1 hold; that in a neighbourhood of $\theta_0$, $r(\theta)$ is a continuously differentiable function; that the $q \times p$ matrix $R(\theta)$ is of rank $q$; and that

$\partial^{2} r(\theta)/\partial\theta\,\partial\theta^{T}$, $\partial^{2} g(x, \theta)/\partial\theta\,\partial\theta^{T}$

exist and are bounded by some constant and some integrable function respectively. Then the equations (3.3) almost surely have solutions in $U_{d_n} = \{(\theta, t, \nu) : \|\theta - \theta_0\| + \|t\| + \|\nu\| \le d_n\}$ as $n$ goes to infinity, and any solution of (3.3) in $U_{d_n}$ minimizes $l_E(\theta)$ subject to the condition $r(\theta) = 0$.

Theorem 3 below establishes the asymptotic normality of $\tilde\theta_r$.

Theorem 3 Under the conditions of Lemma 3.2, we have

where $P$, $H$ are defined in Section 3.5.

Now we turn to the problem of how to test $H_0 : r(\theta) = 0$. There are three popular tests based on the likelihood: the likelihood-ratio test, the Lagrange-multiplier test and the Wald test. Here, of course, they should be based on the empirical likelihood. These statistics are defined respectively as

$ELR = 2 l_E(\tilde\theta_r)$, $LM = n\,\tilde\nu_r^{T} H^{-1} \tilde\nu_r$, $W = n\, r^{T}(\hat\theta)\, H\, r(\hat\theta)$, (3.5)

where $\hat\theta$ is a solution of the estimating equation $\sum_h W_h n_h^{-1} \sum_i g_{hi}(\theta) = 0$, and $H$, defined in (5.27), is the asymptotic covariance matrix of $\sqrt{n}\,\tilde\nu_r$.

The following theorem gives the asymptotic behavior of, and the relationship between, the test statistics defined in (3.5).


Theorem 4 Under the assumptions of Theorem 3.3, the three test statistics in (3.5) are asymptotically equivalent, and each of them is asymptotically distributed as $\chi^{2}_{q}$ under $H_0$.
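As a rough numerical illustration of the Wald-type statistic, the sketch below evaluates $n\, r(\hat\theta)^{T} H\, r(\hat\theta)$ with $H = (R V^{-1} R^{T})^{-1}$, the form given in (5.27), for a single scalar constraint ($q = 1$). The function names, the small linear solver and the input values are assumptions made for illustration only; in practice $V$ and $R$ would be replaced by consistent estimates.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wald_statistic(n, r_value, grad_r, v):
    """Wald-type statistic for one scalar constraint (q = 1).

    r_value : r(theta_hat), a scalar
    grad_r  : the 1 x p row R(theta_hat), as a list
    v       : the p x p matrix V, as a list of rows
    """
    v_inv_rt = solve(v, grad_r)  # V^{-1} R^T
    h = 1.0 / sum(g * w for g, w in zip(grad_r, v_inv_rt))  # H = (R V^{-1} R^T)^{-1}
    return n * r_value ** 2 * h


# Hypothetical inputs: constraint theta_1 - theta_2 = 0, so R = (1, -1).
w = wald_statistic(n=100, r_value=0.2, grad_r=[1.0, -1.0],
                   v=[[2.0, 0.0], [0.0, 2.0]])
print(w)
```

With one constraint, the value would be compared with a percentile of the $\chi^{2}_{1}$ distribution.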

3.4 Examples

We consider several illustrations of the estimation procedures. We primarily consider large-sample aspects, but computational issues are also discussed by giving the equations and the method to solve them.

Example 1 (continued) We apply the approach in Section 2 to solve the problem displayed in Section 1. Equations (2.9), (2.11) and (2.16) lead to

The third equation implies $t_1 = -m'(\theta) t_2$, and by substituting this into the first and the second we get $2H + 2$ equations in $t_2, \theta, u_1, \ldots, u_H, v_1, \ldots, v_H$ to solve. We use a Fortran NAG library function to solve them, with initial values

t l = O. 0 = Eh W h ~ h , uh = O, vh = O, h = 1, .. , H . Usually the solutions should

satisfy (2.10).

The results of Theorem 1 show that $\sqrt{n}(\hat\theta - \theta_0) \to N(0, V)$, where $V$ is given in Theorem 1, or approximately $\hat\theta - \theta_0 \sim N(0, V/n)$. Since

Also. since

Wh A = a22 - 2rn1(0)a12 + a l l (m1(8) )2 = Var mf(0)zst - &) > o.

Y

Thus $V/n \le \mathrm{Var}(\bar z_{st})$, which is the variance of $\bar z_{st} - \theta_0$. Therefore, $\hat\theta$ is asymptotically at least as efficient as $\bar z_{st}$.

Example 2 (continued) [Two-sample problem with common mean]. In this case observations $(x_{hi}, y_{hi})$, $i = 1, \ldots, n_h$, occur in independent pairs and $\sum_h W_h E x_h = \sum_h W_h E y_h = \theta$. To estimate $\theta$, we consider the estimating equations based on $g_1 = x - \theta$, $g_2 = y - \theta$ and we associate the empirical likelihood probability $p_{hi}$ with $(x_{hi}, y_{hi})$. After some simplification, from equations (2.9), (2.11) and (2.16), we have

The initial values may be set to

Here

Therefore


In particular, note that in the case where

which is the same as the variance of the optimal linear combination estimator

3.5 Proofs

Proof of Lemma 1

Following Owen (1988), we get $\max_{1 \le i \le n_h} \|g_{hi}\| = o(n^{1/3})$ (a.s.), hence we have $z_n =$

Proof of Lemma 2


and noting ( 2 . 5 ) . we have


where gh ( O ) = Ci ghi(0) /nh Since Whjh ( O ) = O. we have

0 = W h g h ( 6 ) - n bv:nh2 x (ghi(e) - gh(e))(~hi(e) - Gh(e))I h h ; 1 + mhtT(e)(ghi(0) - Gh(6)) W )


by the strong law of large numbers. Hence


Therefore, we have

Since $\sigma_h(\theta) > \sigma_0$, we have $S(\theta) > \sum_h k_h W_h^{2} \sigma_0 > 0$; therefore

From (5.3) we have

- where = maXi II ghi (B) 11. By the result of Lemma 1. we have uhSnh - 1

o ( n i ) ( a . ~ . ) . Hence

Therefore, from (5.7) and (5.8), we get

where X = Ch kh Wto( )/2 > O and q1 is the smallest eigenvalue of CI. Hence

From (5.3) we have

and

Therefore,

Proof of Lemma 4

We follow Qin and Lawless (1994) in proving this lemma. Denote $\theta = \theta_0 + u n^{-1/3}$ for $\theta \in \{\theta : \|\theta - \theta_0\| = n^{-1/3}\}$, where $\|u\| = 1$. First we give a lower bound for $l_E(\theta)$ on the surface of the ball. From Lemmas 1, 2 and 3 we know that when $\|\theta - \theta_0\| \le n^{-1/3}$,

By this and Taylor's expansion, we have (uniformly in $u$)

where

by the law of the iterated logarithm, $\varepsilon > 0$, $c - \varepsilon > 0$ and $c$ is the smallest eigenvalue of

Similarly,

Since $l_E(\theta)$ is a continuous function of $\theta$ belonging to the ball $\|\theta - \theta_0\| \le n^{-1/3}$, $l_E(\theta)$ has a minimum value in the interior of this ball, and $\hat\theta$ satisfies (noting (2.5) and (2.6))

Proof of Theorem 1

Taking derivatives with respect to $\theta^{T}$, $t^{T}$, we have:

where $\delta_n = \|\hat\theta - \theta_0\| + \|\hat t\|$. Noting (2.14), (2.15), (5.10), (5.11), we have

The system of equations (5.14) and (5.15) may be written in matrix form as

it follows that

where $M_{21} = M_{12}^{T}$ and $M_{22} = 0$. Therefore,

From this result and

we get $\delta_n = O_p(n^{-1/2})$.

Since

where $M_{22.1} = -M_{21} M_{11}^{-1} M_{12}$, we have

where

and

where

it follows that $\hat\theta$ and $\hat t$ are asymptotically independent.

Now we turn to the third result. Since

noting (5.20) we have

where

and $B_n \to B$. Hence

where

Proof of Theorem 2

Noting (5.20) and that $\hat t$, $\hat\theta$ are asymptotically independent, we have

Similarly, we get

Hence


1 1 1

Now noting that J n I C ~ ~ ' i j S t ( B o ) -+ N ( 0 , I ) and -iV11~'~~112iiM;2f112.f21~M1~T is sym-

metric and idempotent with trace equal to p, it follows that R converges to y;.

Proof of Corollary 1

From the proof of theorem 2.2, we have

For fixed $\theta$ such that $\|\theta - \theta_0\| < d_n$, consider the function

almost surely continuous function for $\|\lambda\| < 1$. When $\|\lambda\| = 1$, we have

where $c$ is the smallest eigenvalue of $\sum_h k_h W_h^{2} \sigma_h(\theta_0)$. By Lemma 5 there exists a point $\hat\lambda$ such that $\|\hat\lambda\| < 1$ and $\psi(\hat\lambda) = 0$. Using the implicit-function theorem, we can easily get the remaining results of the lemma.

When $\|\theta - \theta_0\| \le d_n$, from

we have

then


= ~ ( n - f log log n ) + ( D ' ( O ~ ) S - ' ( O ~ ) ~ ( 0 ~ ) + ~ ( n - i log log n)) ( O - B o )

+w 0 - 00 ID2 = D ' ( ~ ~ ) S - ' ( O ~ ) D ( B o ) ( O - B o ) + o ( I I 0 - Bo 1 1 ) . U . S .

That is, CC',(O) can be represented as

where $V(x, \theta)$ is an almost surely continuous function of $\theta$ and $V(x, \theta) = o(\|\theta - \theta_0\|)$ a.s. Similar to the arguments in Aitchison and Silvey (1958), we can show that the system of equations $Q_{2n}(\theta, t(\theta), \nu) = 0$ and $Q_{3n}(\theta) = 0$ has a solution $\tilde\theta, \tilde\nu$ such that $(\tilde\theta, \tilde\nu) \in \{(\theta, \nu) : \|\theta - \theta_0\| + \|\nu\| \le d_n\}$.

Now we prove the last part of Lemma 7. Suppose that $\tilde\theta$ is a solution in $U_{d_n}$, and that $\theta$ is a point in $U_{d_n}$ that satisfies $r(\theta) = 0$. Then we have

Note that

- LII n.

and Q ~ , ( o : t (B) , G) = O , so that t r ( ; )2 , ' (6 ) = - Cr R ( B ) frorn (3.2). Since the q x p

matrix R(Bo) is of rank q? ive have

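where $c$ is the smallest eigenvalue of $D(\theta_0) S^{-1}(\theta_0) D^{T}(\theta_0)$. Hence

Proof of Theorem 3

Taking derivatives with respect to $\theta^{T}$, $t^{T}$, $\nu^{T}$, we have

where $\delta_n = \|\tilde\theta_r - \theta_0\| + \|\tilde\nu_r\| + \|\tilde t_r\|$, $D_n(\theta_0) = \sum_h \frac{W_h}{n_h} \sum_i \frac{\partial g_{hi}(\theta_0)}{\partial\theta}$. The system of equations (5.21), (5.22) and (5.23) written in matrix form is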

where

we know that $\delta_n = O_p(n^{-1/2})$. Since

where R = R(Bo). D = D(B0) , V = D':CI,'D. we have

where

$H = (R V^{-1} R^{T})^{-1}$, $Q = H R V^{-1}$, $P = V^{-1}(I - R^{T} Q)$. (5.27)

Hence using the central limit theorem for ( O o ) and Slutsky's t heorem. we can


We have

Therefore, noting (5.30),

Since

we know, noting (5.28),

Hence

Chapter 4

Pseudo Empirical Likelihood and General Estimating Equations Using Stratified Sampling

4.1 Introduction

We applied empirical likelihood (EL) to stratified simple random sampling and established asymptotic properties of EL estimators and likelihood ratio tests in chapters 2 and 3. The purpose of this chapter is to use empirical likelihood and estimating equations in more complex sampling situations. Sample surveys often use a combination of several of the following sampling methods: stratified sampling, cluster sampling, unequal probability sampling and multi-stage sampling. In this chapter, we will consider stratified sampling again, but samples taken from each stratum can be complex.

CHAPTER 4. PSE UDO EL AND GEE 91

Section 4.2 contains the definition of pseudo empirical likelihood, the ratio statistic, estimating equations and the maximum pseudo empirical likelihood ratio (MPELR) estimator. The existence of these estimators and their asymptotic properties will be established in Section 4.3. An illustrative example will be given in Section 4.4. Proofs are relegated to Section 4.5.

4.2 MPELR Estimators

Let $\{P_\nu, \nu = 1, 2, \ldots\}$ be a sequence of finite populations. Throughout the paper $\nu$ is used as the index of the finite population, but it may be omitted frequently for simplicity. Each $P_\nu$ contains $H$ strata and the $h$th stratum contains $N_h$ units. Associated with the $i$th unit of the $h$th stratum is the characteristic $x_{hi}$, $h = 1, \ldots, H$, $i = 1, \ldots, N_h$, where $x_{hi}$ is $d$-dimensional. Note that $H$, $N_h$, $x_{hi}$ etc. depend on $\nu$ also but the subscript $\nu$ is omitted. Suppose $P_\nu$ with $d$-dimensional characteristic $x_\nu$ has the unknown distribution $F_\nu$ and a $p$-dimensional parameter $\theta$ associated with it, and the $h$th stratum of $P_\nu$ has $d$-dimensional characteristic $x_{\nu h}$ which has the unknown distribution $F_{\nu h}$. We will omit $\nu$ also from $F_\nu$, $F_{\nu h}$, $x_{\nu h}$ in the following discussion when appropriate. We assume that the weight of the $h$th stratum of $P_\nu$, $W_h$, is known for every $h$.

A sample, $s_h$, with values $x_{h1}, \ldots, x_{hn_h}$ from stratum $h$, is taken according to some sampling scheme and samples are independent across strata. From the context we know that

Suppose $x_{h1}, \ldots, x_{hN_h}$, the units in the $h$th stratum of $P_\nu$, is a random sample from a superpopulation, say $F_h^{*}$. If the entire population $P_\nu$ were available, the corresponding likelihood function would be

where $p_{hi} = F_h^{*}(x_{hi}) - F_h^{*}(x_{hi}-)$. The log-likelihood function then is

$l(F) = \sum_h \sum_{i=1}^{N_h} \log p_{hi}$.

The nonparametric maximum likelihood estimator of $F^{*}$ is

where $x_{hij}$ and $x_j$ are the $j$th components of the vectors $x_{hi}$ and $x$ respectively, $1(x_{hij} < x_j)$ is the indicator of the set $(x_{hij} < x_j)$, and $N = \sum_h N_h$. But we only have a sample, say $s$, from the finite population, and if we view (2.3) as a finite population total, then we may have available a design unbiased estimator of $l(F)$, namely,

all $h$, and $E$ refers to expectation with respect to the sampling design. Obviously we have $E(\hat l(F)) = l(F)$. We will call (2.4) the "pseudo empirical likelihood". Chen and Sitter (1996) defined pseudo empirical likelihood and we use a similar definition here. Some examples of how to construct the weights $d_{hi}(s)$ are given in Section 4.3.

Since $\sum_i p_{hi} = 1$, we know (2.4) is maximized by $F_n(x) = \sum_h W_h F_{nh}(x)$, where $n = \sum_h n_h$, $F_{nh}(x) = \sum_{i \in s_h} a_{hi}(s) 1(x_{hi} < x)$, $a_{hi}(s) = d_{hi}(s) / \sum_{i \in s_h} d_{hi}(s)$. The log-empirical likelihood ratio is then defined as

We also assume that information about $\theta$ and $F$ is available in the form of $r \ge p$ functionally independent unbiased estimating functions, that is, functions $g_j(x, \theta)$, $j = 1, \ldots, r$, such that $E_F\{g_j(x, \theta)\} = 0$. In vector form, that is

where $g(x, \theta)$ satisfies

Hence, we define the (negative) log-pseudo empirical likelihood ratio function for $\theta$

where $g_{hi}(\theta) = g(x_{hi}, \theta)$ for all $h$ and $i$. The maximum may be found via the Lagrange multiplier method for any given $\theta$. Let

where $t$ is an $r$-dimensional vector. Then


with the restriction from = O, Le.,

Therefore,

with $t$ being the solution to (2.10). We may minimize $l_E(\theta)$ to obtain an estimator $\hat\theta$ of the parameter $\theta$ (called the MPELR estimator). In addition, this yields estimators $\hat p_{hi}$ from (2.4), and therefore an estimator for the distribution function $F$ as

Note that if there is no auxiliary information, i.e., $g(x, \theta) = 0$, this approach yields $\hat p_{hi} = d_{hi}(s) / \sum_{i \in s_h} d_{hi}(s)$. Then the estimator of $\bar X_N = \sum_h W_h \bar X_{N_h}$ will be $\sum_h W_h \sum_i \hat p_{hi} x_{hi} = \sum_h W_h \left(\sum_i d_{hi}(s) x_{hi}\right) / \sum_i d_{hi}(s)$, a separate ratio estimator.
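The no-auxiliary-information case can be sketched in a few lines. The code below normalizes the design weights within each stratum and forms the separate ratio estimator described above; the function names and the toy data are hypothetical, chosen only to make the arithmetic easy to check.

```python
def normalized_weights(d):
    """Return p_hi = d_hi / sum_i d_hi for one stratum."""
    total = sum(d)
    return [w / total for w in d]


def separate_ratio_estimate(strata):
    """Estimate the population mean as sum_h W_h * sum_i p_hi * x_hi.

    strata: list of (W_h, design weights d_h, observed values x_h).
    """
    estimate = 0.0
    for w_h, d, x in strata:
        p = normalized_weights(d)
        estimate += w_h * sum(p_i * x_i for p_i, x_i in zip(p, x))
    return estimate


# Toy data: equal weights in stratum 1 (as under SRS), unequal in stratum 2.
strata = [
    (0.6, [1.0, 1.0], [2.0, 4.0]),
    (0.4, [1.0, 3.0], [10.0, 20.0]),
]
print(separate_ratio_estimate(strata))
```

When the design weights are equal within a stratum, the stratum term reduces to $W_h$ times the stratum sample mean.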

The existence of solutions to (2.8)-(2.10) will be discussed in the following section.

4.3 Asymptotic Results

Some conditions are needed to study the existence of solutions and asymptotic properties of MPELR estimators. In the following discussion, we always assume that $n \to \infty$, $n/n_h \to k_h > 0$ and

as $\nu \to \infty$. The assumption (3.1) means that no survey weight is disproportionately large. We will limit ourselves to stratified SRS or unequal probability sampling with replacement. It is of interest to see what (3.1) reduces to in some simple special cases. If the sampling design is stratified SRS, then $d_{hi}(s) = W_h/n_h = N_h/(N n_h)$, and (3.1) is the same as

$\frac{N_h}{N n_h} = O\left(\frac{1}{n}\right)$ for each $h$,

which will be satisfied since $n/n_h \to k_h > 0$. If the sampling design is a stratified unequal probability sampling design, we suppose that the sample is taken with replacement with selection probability $a_{hi}$, where $a_{hi}$ is a known measure of the $i$th unit in stratum $h$, and the characteristic of interest is positively correlated with $a_{hi}$. In this case, $d_{hi}(s) = W_h/(N_h n_h a_{hi})$, and (3.1) reduces to

$\max_i d_{hi}(s) = \frac{W_h}{n_h} \max_i \frac{1}{N_h a_{hi}} = O\left(\frac{1}{n}\right)$ for each $h$.

Hence in this case (3.1) can be reduced to

We will also suppose some other conditions about the survey weights. In section 4.2 we required

for all $h$ and $p_{hi}$, we will have

Under assumption (3.1), we have

hence, noting that $E\left(\sum_{i \in s_h} d_{hi}(s)\right) = W_h$,

Lemma 1 shows that the solutions to (2.7)-(2.9) exist for given $\theta$.

Lemma 1 Suppose that $g(x, \theta)$ is continuous and differentiable in the neighbourhood of $\theta_0$ and $\|g(x, \theta)\|^{3}$ is bounded by some integrable function $G(x)$ in this neighbourhood. Furthermore, assume that the variance $\sigma_h(\theta) > \sigma_0 > 0$ for all $h$ and $\nu$ in this neighbourhood. Then in the sphere $\{\theta : \|\theta - \theta_0\| \le d_n\}$, the equation $Q_{1n}(\theta, t) = 0$ almost surely has roots $t = t(\theta) = O(d_n)$, and $t(\theta)$ is continuous and differentiable when $\theta$ belongs to this sphere, where $d_n = n^{-1/3-\delta}$, $1/6 > \delta > 0$.

From Lemma 1 we know that, under some conditions, for any $\theta$ in the neighbourhood of $\theta_0$ we will have $t = t(\theta)$ which is the solution to (2.8)-(2.10), and this solution also guarantees $p_{hi} > 0$. We also know $t(\theta) = O(d_n)$ a.s.

We need some results from Shao (1994) for the following discussion. We summarize them as a lemma.

Lemma 2 Under the conditions of Lemma 3.1, for any given $\theta$ satisfying $\|\theta - \theta_0\| \le d_n$, we have

where $\bar g_{st,w}(\theta) = \sum_h \sum_{i \in s_h} d_{hi}(s) g_{hi}(\theta)$, $\sigma_\nu(\theta) = \mathrm{Var}(\bar g_{st,w}(\theta))$.

Some special cases of Lemma 2 were discussed by a few authors. P.K. Sen discussed multi-stage probability proportional to size (PPS) sampling and its asymptotic normality (Chapter 12, Handbook of Statistics 6, edited by Krishnaiah and Rao). The asymptotic results for two-stage and three-stage cluster sampling were given by Rao and Scott (1979, 1981).

Lemma 3 Under the conditions of Lemma 3.1, for any given $\theta$ satisfying $\|\theta - \theta_0\| \le d_n$, we have

where

$t(\theta)$ is the solution to (2.8)-(2.10).

Lemma 4 In addition to the conditions of Lemma 1, we further assume that $\partial g(x, \theta)/\partial\theta$ is continuous in a neighborhood of the true value $\theta_0$; that $\|\partial g(x, \theta)/\partial\theta\|$ is bounded by some integrable function $G(x)$ in this neighbourhood; and that the rank of $\sum_h W_h E_h[\partial g(x_h, \theta_0)/\partial\theta]$ is $p$. We also assume $S_\nu(\theta) \to S(\theta)$. Then, as $\nu \to \infty$, with probability 1, $l_E(\theta)$ attains its minimum value at some point $\hat\theta$ in the interior of the ball $\|\theta - \theta_0\| \le n^{-1/3}$, and $\hat\theta$ and $\hat t = t(\hat\theta)$ satisfy

where

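Theorem 1 In addition to the conditions of Lemma 4 above, we further assume that $\partial^{2} g(x, \theta)/\partial\theta\,\partial\theta^{T}$ is continuous in $\theta$ in a neighbourhood of the true value $\theta_0$, and that $\|\partial^{2} g(x, \theta)/\partial\theta\,\partial\theta^{T}\|$ is bounded by some integrable function $G(x)$ in the neighbourhood. Then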


Theorem 2 Suppose nu,(Oo) = nVar(g,,,,(B0)) -+ G' as u 4 iu, and the

conditions of Theorem 2.1, then in the semi-parametric model (2.6), for testing $H_0 : \theta = \theta_0$ the empirical likelihood ratio test statistic

is asymptotically distributed as $\sum_{i=1}^{p} \lambda_i Z_i^{2}$ under $H_0$, where $Z_1, \ldots, Z_p$ are independent $N(0, 1)$ random variables, and the $\lambda_i$'s are eigenvalues of


Since the asymptotic distribution of $R$ involves the unknown parameters $\lambda_i$, we must have their estimators before any practical testing. Suppose $\hat V$ is a consistent estimator of $V$ as discussed previously; then from the corresponding estimate of the matrix in Theorem 2 we can obtain consistent estimators for the $\lambda_i$'s, say $\hat\lambda_i$. We can then use either exact methods or simulation to calculate the required percentile points and hence do the testing of $H_0$. Another way is to use a modified $R$, denoted by $R_c$:

where

We treat $R_c$ as approximately distributed as $\chi^{2}_{p}$ under $H_0$. Rao and Scott (1994) considered such modifications in the context of analysis of categorical survey data.
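The simulation route mentioned above can be sketched as follows: given estimated eigenvalues, draw independent standard normal variables, form $\sum_i \hat\lambda_i Z_i^{2}$ repeatedly, and read off an empirical percentile. The function name, the number of draws and the example eigenvalues are assumptions for illustration, not values from the thesis.

```python
import random


def weighted_chi2_quantile(eigenvalues, prob, draws=20000, seed=0):
    """Monte Carlo estimate of a quantile of sum_i lambda_i * Z_i^2.

    eigenvalues : estimated lambda_i values
    prob        : probability level, e.g. 0.95
    """
    rng = random.Random(seed)
    sims = sorted(
        sum(lam * rng.gauss(0.0, 1.0) ** 2 for lam in eigenvalues)
        for _ in range(draws)
    )
    index = min(int(prob * draws), draws - 1)
    return sims[index]


# Hypothetical eigenvalue estimates for a two-parameter problem.
critical_value = weighted_chi2_quantile([1.4, 0.7], prob=0.95)
print(critical_value)
```

The observed value of $R$ would then be compared with this simulated critical value. When all eigenvalues equal 1 the limit is an ordinary chi-square distribution, which gives a simple sanity check on the simulation.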

Corollary 1 Assume $r > p$. In order to test the model (2.6), we can use the empirical likelihood ratio statistic

under the conditions of Theorem 2, $R_1$ is asymptotically distributed as

if (2.6) is correct, where $Z_1, \ldots, Z_{r-p}$ are independent $N(0, 1)$ random variables, and

Ai '.s are eigenvalues of ~f c'vS.

We can use the same method to modify $R_1$ as for $R$ in the above theorem.


We consider an illustration of the estimation procedures. We primarily consider large sample aspects, but computational issues are discussed by giving the equations and the method to solve them.

Suppose that our sampling scheme is stratified sampling, and that the sample in stratum $h$ is taken according to some method, such as unequal probability sampling with replacement. Samples are independent across strata. We denote $x_{h1}, \ldots, x_{hn_h}$ as the sample from stratum $h$, $h = 1, \ldots, H$. We also suppose that $x_{hi} = (z_{hi}, y_{hi})$, where $z$ is an auxiliary variable and $y$ is the characteristic variable of interest, that the population mean of the $z_{hi}$'s, $\bar Z$, is known, and that we want to estimate $\theta = \bar Y$, the population mean of the $y_{hi}$'s. If we denote

then the estimating equation (2.6) can be written as

$\partial g_{hi}(\theta)/\partial\theta = (0, -1)^{T}$, $t = (t_1, t_2)^{T}$,

from (3.8) we have $t_2 = 0$. Then (3.6) becomes


where

We have $\hat\theta = \sum_h W_h \sum_i \hat p_{hi} y_{hi}$. As discussed in chapter 2, solving (4.3) is equivalent to solving the following equations:

For appropriately selected initial values, e.g.,

we can solve (4.5) and (4.6) by using a NAG Fortran library function to get a solution.

Hence we can have $\hat\theta$ and $\hat F_n(y)$.

Since ~ 1 . 1 ~ 2 = (0. l)', = (0,1)? M2'(22.1 = -(O. l)iCI~l(O,l)'. and replacing :bhi

by its estirnator


For stratified simple random sampling, this approximate estimator will be

which is identical to the estimator obtained by Chen and Sitter (1996). Note that $\hat\theta$ is not equal to the approximate EL estimator in Chapter 2. The latter estimator is asymptotically equal to the optimal regression estimator.

We will use the above example to show that the asymptotic distribution of the pseudo empirical likelihood ratio test statistic is not generally a $\chi^{2}$-distribution here. Suppose the sampling scheme is stratified SRS, and $W_h/n_h = c/n$ for all $h$, that is, proportional allocation, then

Denote

then

hence

4.5 Proofs

Proof of Lemma 1.


For fixed $\theta$ such that $\|\theta - \theta_0\| < d_n$, consider the function

From $E\|g(x, \theta)\|^{3} < \infty$ we can get $\max_h \max_i \|g(x_{hi}, \theta)\| = o(n^{1/3})$ a.s. for both the stratified SRS and the PPS with replacement cases, and $\psi(\lambda)$ is an almost surely continuous function for $\|\lambda\| \le 1$. When $\|\lambda\| = 1$, we have

where $c$ is the smallest eigenvalue of $\sigma_0$. By the lemma above there exists a point $\hat\lambda$ such that $\|\hat\lambda\| < 1$ and $\psi(\hat\lambda) = 0$. Using the implicit-function theorem, we can easily get the remaining results of the lemma.

Proof of Lemma 3


and noting (2.9) we have


Hence

Therefore, we know

From $Q_{1n}(\theta, t(\theta)) = 0$, we have

Proof of Lemma 4

Denote $\theta = \theta_0 + u n^{-1/3}$ for $\theta \in \{\theta : \|\theta - \theta_0\| = n^{-1/3}\}$, where $\|u\| = 1$. First we give a lower bound for $l_E(\theta)$ on the surface of the ball. From Lemma 1, Lemma 3 and Lemma 3 of chapter 3 we know that when $\|\theta - \theta_0\| \le n^{-1/3}$, we have

uniformly in $\theta \in \{\theta : \|\theta - \theta_0\| \le d\}$.

By this and Taylor's expansion, we have (uniformly in $u$)


where $c - \varepsilon > 0$ and $c$ is the smallest eigenvalue of

Since $l_E(\theta)$ is a continuous function of $\theta$ as $\theta$ belongs to the ball $\|\theta - \theta_0\| \le n^{-1/3}$, $l_E(\theta)$ has a minimum value in the interior of this ball, and $\hat\theta$ satisfies (noting (2.8) and (2.10))

Proof of Theorem 1

Taking derivatives with respect to $\theta^{T}$, $t^{T}$, we have

where $\delta_n = \|\hat\theta - \theta_0\| + \|\hat t\|$. Hence we have

The system of equations (5.10), (5.11) written in matrix form is


where iCI, is the matrix of (5.13), hlnvl2 = - ai'~$eo), KV2, = M;,,,. Since

hence

M,z 3 lkl =

where $M_{21} = M_{12}^{T}$, $M_{22} = 0$. Therefore

From this and g,t,w(&) = Ch b f i Ci "thi(~)ghi(S0) = O&-+), we know that 6, =

we have


Hence

where

Now we turn to the second conclusion. Since

noting (5.17) we get

where

Hence,

We note here that, since $M_{n,11}$, $M_{n,12}$, $M_{n,21}$ are consistent estimators of $M_{11}$, $M_{12}$, $M_{21}$ respectively, if we have a consistent estimator for $\sigma_\nu(\theta_0) = \mathrm{Var}(\bar g_{st,w}(\theta_0))$, say $\hat\sigma(\theta_0)$, then

is a consistent estimator of $V_\nu(\theta)$, where $\hat M_{22.1} = -M_{n,21} M_{n,11}^{-1} M_{n,12}$. There are some methods in the literature on how to estimate $\sigma_\nu(\theta_0)$, such as the jackknife estimator. Since $B$ can be estimated by $B_n$ and $U$ by $U_n$, from an estimator of $\sigma_\nu(\theta_0)$ we can also get an estimator of $V_\nu(\hat F)$.

Proof of Theorem 2

Noting (5.17) we have


Similarly we can get

Hence

Noting that

and


is symmetric and nonnegative with rank p, hence we have the theorem's results.

Proof of Corollary

From the proof of Theorem 2.2, we have

and the result follows.

Chapter 5

M-estimation in the Presence of Auxiliary Information Using Stratified Random Sampling

5.1 Introduction

Huber (1964) introduced a flexible class of estimators, called 'M-estimators', which are generalizations of the usual maximum likelihood estimators and have become a very useful tool in statistical inference. M-estimation plays an important role in robust parameter inference and in non-parametric inference. In this paper, we consider a new class of M-estimators under the following model. Let the population distribution be $F(\cdot)$, and the population be stratified into $H$ strata with known weight $W_h$ of each stratum, and let $x_{h1}, \ldots, x_{hn_h}$ be an iid sample from stratum $h$, $h = 1, \ldots, H$. We assume that we have some auxiliary information about the distribution function

CHAPTER 5. M-ESTIMATION AND EL

$F$ in the sense that there exist $r$ ($\ge 1$) functions $g_1(x), \ldots, g_r(x)$ such that

where $g(x) = (g_1(x), \ldots, g_r(x))^{T}$, and $E_h$ denotes the expectation with respect to the distribution function of stratum $h$. Here $g(x)$ does not contain any unknown parameters, hence the problem is different from the questions discussed in chapter 3.

Our approach is first to apply empirical likelihood to provide an alternative estimator for the distribution function $F(\cdot)$ by utilizing the auxiliary information (1.1). Based on the estimated distribution function, which puts unequal weight at each $x_{hi}$, we propose a new class of M-estimators, and establish some asymptotic results. One advantage of the new M-estimators over those based on the usual empirical distribution function is that they have smaller asymptotic variances, which enables us to improve our statistical inference through M-estimation.

In section 5.2, we describe the profile empirical likelihood under (1.1) and some asymptotic results. In section 5.3, we propose a new class of M-estimators. It is shown that the proposed M-estimators are consistent and asymptotically normally distributed. The proofs are relegated to section 5.4.

5.2 Profile empirical likelihood

Let $x_{h1}, \ldots, x_{hn_h}$ be an iid sample from stratum $h$, and let the samples be independent across strata. The empirical likelihood, as before, is given by

where $p_{hi} = \Pr(x_h = x_{hi})$. Under the condition that $E_F g(x) = 0$, the profile empirical likelihood is defined to be the maximum of $L$ with respect to $p_{hi}$ subject to

$L$ can be maximized. The maximum is reached when

where

and $t$ is the solution to

The numerical method of calculating $\hat p_{hi}$ is given in chapter 3.
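To give a feel for how such weights are computed, the sketch below treats only the classical single-sample case with a scalar constraint function, where the weights take the well-known form $p_i = 1/\{n(1 + t g_i)\}$ and $t$ solves $\sum_i g_i/(1 + t g_i) = 0$. This is an assumption made for illustration: it is not the stratified system of this chapter, and the bisection solver stands in for the numerical method referred to above. The function name and data are hypothetical.

```python
def el_weights(g, iterations=200):
    """Empirical likelihood weights under the constraint sum_i p_i * g_i = 0.

    Single sample, scalar g. Requires min(g) < 0 < max(g) so that the
    constraint can be met with positive weights.
    """
    n = len(g)
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        raise ValueError("0 must lie strictly inside the range of g")

    # 1 + t * g_i > 0 for every i exactly when t is in (-1/g_max, -1/g_min).
    lo, hi = -1.0 / g_max, -1.0 / g_min
    margin = 1e-10 * (hi - lo)
    lo, hi = lo + margin, hi - margin

    def score(t):
        return sum(gi / (1.0 + t * gi) for gi in g)

    # score is strictly decreasing on the interval, so bisection applies.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return t, [1.0 / (n * (1.0 + t * gi)) for gi in g]


# Hypothetical use: the mean is known to equal 2, so g(x) = x - 2.
data = [1.0, 2.0, 3.0, 4.0, 6.0]
t, p = el_weights([x - 2.0 for x in data])
print(t, p)
```

The resulting weights are positive, sum to one, and reweight the sample so that its weighted mean matches the known value.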

Now let

where $x_{hij}$ and $x_j$ are the $j$th components of the vectors $x_{hi}$ and $x$ respectively, and $1(x_{hij} < x_j)$ is the indicator of the set $(x_{hij} < x_j)$; then $\hat F_n(x)$ can be regarded as an alternative estimator of the distribution function $F(x)$. Note that instead of the usual empirical distribution function giving weight $W_h/n_h$ at each observed point $x_{hi}$, $\hat F_n(x)$ gives the weight $W_h \hat p_{hi}$ at each $x_{hi}$. On the other hand, if there is no auxiliary information given in (1.1), $L$ attains its maximum at $p_{hi} = 1/n_h$, and hence

the usual weighted empirical distribution function.
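For the no-auxiliary-information case, the weighted empirical distribution function that puts mass $W_h/n_h$ on each observation can be sketched as below for scalar data. The function name and the toy strata are hypothetical, and the code uses a non-strict inequality, which is a convention assumed here; it differs from a strict inequality only at observed values.

```python
def stratified_ecdf(strata, x):
    """Weighted empirical CDF: sum_h (W_h / n_h) * #{i : x_hi <= x}.

    strata: list of (W_h, sample values from stratum h), scalar data.
    """
    total = 0.0
    for w_h, sample in strata:
        count = sum(1 for value in sample if value <= x)
        total += w_h * count / len(sample)
    return total


# Toy strata with weights 0.5 and 0.5.
strata = [
    (0.5, [1.0, 2.0, 3.0, 4.0]),
    (0.5, [10.0, 20.0]),
]
print(stratified_ecdf(strata, 3.0))
print(stratified_ecdf(strata, 15.0))
```

Replacing the factor $1/n_h$ by empirical likelihood weights $\hat p_{hi}$ would give the alternative estimator $\hat F_n$ discussed above.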

From chapter 3, we have the following results.

Lemma 1 Suppose that as $n \to \infty$, $n/n_h \to k_h > 0$ for all $h$ and $E_h[g(x_h) - E_h g(x_h)][g(x_h) - E_h g(x_h)]^{T} = \sigma_h > \sigma_0$ for all $h$, where $\sigma_0$ is positive definite. $E_h$ denotes the expectation with respect to the distribution of stratum $h$. Denote $\sigma = \sum_h k_h W_h^{2} \sigma_h$, then

$t = \sigma^{-1} \bar g_{st} + o_p(n^{-1/2})$.

Furttiermore, if E ( 1 g(t) (1" m7 then

where $\bar g_{st} = \sum_h W_h n_h^{-1} \sum_i g(x_{hi})$. Hence in both cases,

The variance of $\hat F_n(x)$ shows clearly that it is generally more efficient than the empirical c.d.f. $F_n(x)$.

An M-functional $\theta(F)$ associated with distribution function $F$ is defined as a root $\theta_0$ of the equation

where $x \sim F$ and $x, \theta_0, \psi(x, \theta_0) \in R$. For a stratified sample $x_{11}, \ldots, x_{1n_1}; \ldots; x_{H1}, \ldots, x_{Hn_H}$ from $F(\cdot)$, the usual estimator of $\theta_0$ is given by $\theta_n = \theta(F_n)$, where $F_n(\cdot)$ is the natural stratified empirical distribution function, i.e., $F_n(x) = \sum_h \frac{W_h}{n_h} \sum_i 1(x_{hi} < x)$. $\theta(F_n)$ is called the M-estimator corresponding to $\psi$. We propose a new class of M-estimators by taking the auxiliary information (1.1) into account. Specifically, we propose to estimate $\theta_0$ by $\hat\theta_n$, which is defined to be a root of the equation

We will also call $\hat\theta_n$ the M-estimator corresponding to $\psi$. Of course different choices of $\psi$ lead to different estimators. Note that if there is no auxiliary information (1.1), then $\hat F_n(\cdot) = F_n(\cdot)$, and hence $\hat\theta_n = \theta(F_n)$, the usual M-estimator. Though we could include $\psi(x, \theta) = 0$ as an estimating equation in the general estimating equations setup of Chapter 3, it may add some difficulty in getting an appropriate solution in practice since $\psi(x, \theta)$ includes the unknown parameter $\theta$. Instead, we first use the empirical likelihood method to get the estimators $\hat p_{hi}$ and then solve (3.2) to get an M-estimator for $\theta$. We call it the empirical likelihood moment (ELM) estimator.
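The second step of this two-step procedure can be sketched as a weighted root-finding problem: given point masses $w_{hi}$ (here standing for $W_h \hat p_{hi}$, assumed already computed), find $\theta$ with $\sum_h \sum_i w_{hi}\, \psi(x_{hi} - \theta) = 0$. The Huber $\psi$ function with tuning constant 1.345, the location form $\psi(x - \theta)$, the bisection solver and the data are all illustrative assumptions rather than choices made in the thesis.

```python
def huber_psi(u, c=1.345):
    """Huber's psi function: identity near zero, clipped at +/- c."""
    return max(-c, min(c, u))


def weighted_m_estimate(points, weights, psi=huber_psi, iterations=200):
    """Root of sum_i w_i * psi(x_i - theta) = 0 for a monotone psi.

    points  : all observations pooled over strata
    weights : the mass placed on each observation (e.g. W_h * p_hi)
    """
    def q(theta):
        return sum(w * psi(x - theta) for x, w in zip(points, weights))

    # q is non-increasing in theta, with q(min) >= 0 >= q(max).
    lo, hi = min(points), max(points)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if q(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Equal weights and one gross outlier: the estimate stays near the bulk.
points = [0.0, 1.0, 2.0, 100.0]
weights = [0.25, 0.25, 0.25, 0.25]
print(weighted_m_estimate(points, weights))
```

With equal weights this is the usual M-estimator; supplying empirical likelihood weights instead gives an estimator of the ELM type.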

Next we establish the asymptotic properties of the ELM-estimator $\hat\theta_n$. Denote

with $t$ being the solution to (2.3) and

We first establish the consistency of the ELM-estimator $\hat\theta_n$. Given that the equation $\lambda_F(\theta) = 0$ has a root $\theta_0$ and the equation $Q_n(\theta) = 0$ has a root $\hat\theta_n$, we wish to show that $\hat\theta_n \to \theta_0$ in probability under appropriate conditions. In this paper, the norm of a $p \times q$ matrix $A = (a_{ij})_{p \times q}$ is defined by $\|A\| = \left(\sum_i \sum_j a_{ij}^{2}\right)^{1/2}$, $p, q \ge 1$.

Theorem 1 establishes the asymptotic consistency of the ELM-estimator $\hat\theta_n$.

Theorem 1 Let $\theta_0$ be an isolated root of $\lambda_F(\theta) = 0$, i.e., $\theta_0$ is the only root in a neighbourhood of $\theta_0$. Suppose that $E g(x) g^{T}(x)$ is positive definite, and that in a neighbourhood of $\theta_0$, $E|\psi(x, \theta)|$ is finite. Furthermore, assume that $\psi(x, \theta)$ is continuous in $\theta$ and monotone in $\theta$, or $\psi(x, \theta)$ is continuous in $\theta$ and bounded, and $\lambda_F(\theta)$ changes sign uniquely in a neighbourhood of $\theta_0$. Then the equation $Q_n(\theta) = 0$ has a solution $\{\hat\theta_n\}$ which converges to $\theta_0$ in probability.

Next we establish the asymptotic normality of the ELM-estimator $\hat\theta_n$. We give the following two theorems.

Theorem 2 Let $\theta_0$ be an isolated root of $\lambda_F(\theta) = 0$, and let $\psi(x, \theta)$ be monotone in $\theta$. Suppose that $\lambda_F(\theta)$ is differentiable at $\theta = \theta_0$, with $\lambda_F'(\theta_0) \ne 0$. Furthermore, suppose that $E\psi^{2}(x, \theta)$ and $E[\|g(x)\|\,|\psi(x, \theta)|]$ are finite for $\theta$ in a neighborhood of $\theta_0$, and that $E\psi^{2}(x, \theta)$ and $E[\psi(x, \theta) g(x)]$ are continuous at $\theta = \theta_0$. Then if $\hat\theta_n$ is any solution sequence of the equation $Q_n(\theta) = 0$, we have

with

The next theorem does not require monotonicity of $\psi(x, \theta)$ in $\theta$. However, we need to have some smoothness conditions on $\psi(x, \theta)$ and higher moment requirements on $g(x)$.

Theorem 3 Let $\theta_0$ be an isolated root of $\lambda_F(\theta) = 0$. Let $\partial\psi(x, \theta)/\partial\theta$ be continuous at $\theta = \theta_0$ uniformly in $x$. Suppose that $E[g(x) g(x)^{T}]$ is positive definite,

is finite and nonzero, and $E[\|g(x)\|^{2} |\psi(x, \theta_0)|^{2}] < \infty$. Furthermore, suppose that in a neighborhood of $\theta_0$, $|\psi'(x, \theta)| < G(x)$, where $G(\cdot)$ satisfies $E[\|g(x)\| G(x)] < \infty$. Let $\{\hat\theta_n\}$ be a consistent solution sequence of the equation $Q_n(\theta) = 0$, then

where

5.4 Proofs

Proof of Theorem 1

Let $\varepsilon > 0$ be given. From Lemmas 1 and 2 of Chapter 3, we have

Using Lemma 1, (4.1) and the strong law of large numbers, the last term in (4.2) has absolute value bounded by

Therefore, applying the weak law of large numbers gives

Without loss of generality, assume that $\psi(x, \theta)$ is non-increasing in $\theta$; then $E\psi(x, \theta)$ is a non-increasing function of $\theta$. Since $\theta_0$ is an isolated root of $\lambda_F(\theta) = 0$, we have $E\psi(x, \theta_0 + \varepsilon) < E\psi(x, \theta_0) < E\psi(x, \theta_0 - \varepsilon)$ for sufficiently small $\varepsilon$. Thus (4.3) implies that

Since $Q_n(\theta)$ is a continuous function of $\theta$ for each $n$, it follows from (4.4) that there exists a sequence $\{\hat\theta_n\}$ such that $Q_n(\hat\theta_n) = 0$ and $\hat\theta_n \to \theta_0$ in probability.

For the case that $\psi(x, \theta)$ is continuous in $\theta$ and is bounded, it follows by the dominated convergence theorem that $\lambda_F(\theta)$ is continuous in $\theta$, and we can complete the proof as in part one of the theorem.

Proof of Theorem 2

The consistency of $\hat\theta_n$ for $\theta_0$ is guaranteed by Theorem 1. Now we assume that $\psi(x, \theta)$ is non-increasing in $\theta$, so that $Q_n(\theta)$ is non-increasing. Thus for every $y$,

where $y_n = \theta_0 + y\sigma/\sqrt{n}$. Thus, to prove the theorem, it suffices to show that for every $y$,

where $\Phi(\cdot)$ is the standard normal distribution function. Now using Lemma 1, (4.1) and the assumptions given in the theorem, we can show that

where

or equivalently,

where V, is t h e variance of rh Whnhl - Ezhi,*) . NOW

Hence

In order to prove (4.6), it therefore suffices to show that

or equivalently

where

Yhi,n = d'(xhi7 Y , ) - o L g ( x h i ) .

and $V_n$ is the variance of $\sum_h W_h n_h^{-1} \sum_i (y_{hi,n} - E y_{hi,n})$.

Since $y_{hi,n} - E y_{hi,n}$, $1 \le i \le n_h$, are independent and identically distributed with mean 0, by appealing to the central limit theorem, it suffices to verify the condition

for every $\varepsilon > 0$, or equivalently, for $\varepsilon$

Hence it suffices to prove for any $\varepsilon > 0$,

For every $a > 0$, we have for each $x$ and for $n$ sufficiently large that

where

The proof is complete.

Proof of Theorem 3

Since $Q_n(\hat\theta_n) = 0$ and $Q_n(\theta)$ is differentiable in $\theta$, we have

It can be shown that

hence,

Therefore, it follows from the central limit theorem and Slutsky's theorem that

where

Chapter 6

Empirical Likelihood Inference in

the Presence of Measurement

Error

6.1 Introduction

Suppose that we want to measure a characteristic of interest of a target population. We have several different "instruments". One of them is "accurate", or it has smaller measurement error compared to the other ones. In practice we treat this instrument as "perfect", or having no measurement error. All other instruments will have larger measurement error and we treat them as "imperfect". Here we use "instrument" to refer to the method used to measure the characteristic. It may include factors such as the real instrument used, the personnel involved, the working environment, etc. It may be due to the cost, or lack of highly trained personnel for using the perfect instrument.

CHAPTER 6. EL IN THE PRESENCE OF MEASUREMENT ERROR

We use imperfect instruments to measure some samples drawn from the population, and use the perfect instrument to measure other samples from the population. We combine all the data, both imperfect and perfect, to make inference on the parameters of the population. Of course, the inference should be more accurate than when using only the perfect measurement data.

Survey researchers have long been cognizant of the ill effects of measurement error on estimators of means and totals of finite populations. In general, compared with perfect measurements, imperfect but unbiased measurements increase the variance, but do not affect the bias, of estimators of means and totals. However, the sample cumulative distribution function (CDF) is no longer an unbiased or consistent estimator of the population CDF, even if the measurement error has mean zero. Several authors have discussed these problems. Luo, Stokes and Sager gave a good summary of the history and current developments in this area, though almost all the papers listed there deal only with an imperfect sample without a calibration sample, i.e., a perfect sample. The paper by Luo et al. may be the first to handle data collected with measurement error together with a calibration sample. In their paper, they used simulation to compare the performance of several estimators of the CDF. The estimators they considered are the difference, ratio, regression and weighted average of the empirical CDFs of the imperfect and perfect measurements, etc. Their proposed estimators are linear combinations of the empirical CDFs from the imperfect and perfect samples. The weights are chosen so that the resulting estimator has the smallest possible variance subject to an unbiasedness constraint. These weights are also estimated from the data. They did not give any theory about the proposed estimators, such as asymptotic variances. They also pointed out that the proposed


weighted linear estimators of the CDF may not be monotone.
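The claim that mean-zero measurement error leaves the sample mean unbiased but biases the sample CDF can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not in the text: the true values are N(0, 1), the error is independent N(0, 1), and the evaluation point is x = 1. The function names are illustrative only.

```python
import math
import random

def normal_cdf(x, sd=1.0):
    """CDF of N(0, sd^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))

def empirical_cdf(sample, x):
    """Proportion of observations strictly below x."""
    return sum(1 for v in sample if v < x) / len(sample)

random.seed(0)
n, x0 = 20000, 1.0
truth = [random.gauss(0.0, 1.0) for _ in range(n)]      # error-free values
noisy = [v + random.gauss(0.0, 1.0) for v in truth]     # mean-zero error added

mean_noisy = sum(noisy) / n                # still centred near the true mean 0
ecdf_noisy = empirical_cdf(noisy, x0)      # estimates the CDF of N(0, 2), not N(0, 1)

print(mean_noisy)
print(ecdf_noisy, normal_cdf(x0, math.sqrt(2.0)), normal_cdf(x0))
```

In this setting the noisy measurements follow N(0, 2), so their empirical CDF at x converges to the N(0, 2) CDF rather than to the target N(0, 1) CDF, however large the sample.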

In this chapter we propose to use the empirical likelihood method to make inference on parameters of interest by taking all the data into account, treating the imperfect measurements as auxiliary information. The empirical likelihood estimator of the CDF is monotone, asymptotically more efficient than the empirical CDF from the perfect sample only, and asymptotically normally distributed. Section 6.2 shows how to use the empirical likelihood to solve the problems above. Section 6.3 discusses the associated asymptotic results, while Section 6.4 discusses likelihood ratio testing. Several examples are given in Section 6.5. Proofs are deferred to Section 6.6.

6.2 Empirical likelihood method in the presence

of measurement error

Suppose that there are $H$ different instruments used in measuring a characteristic of interest. The distribution function associated with instrument $h$ ($h = 1, \ldots, H$) is $F_h(x) = P(x_h < x)$ with unknown parameter $\theta$, where $\theta$ is a $p$-vector, and the $H$-th measuring instrument is taken as perfect. Note that we have $H$ different distributions due to possibly different measurement errors in the instruments, although the same characteristic, $x_H$, is measured by all the instruments. These $H$ different distributions are assumed to be related by the common parameter $\theta$. We also assume that information about $\theta$ and $F_h$ is available in the form of $r_h > p$ functionally independent unbiased estimating functions, that is, functions $g_h(x_h, \theta)$ such that

$E_{F_h}\{g_h(x_h, \theta)\} = 0, \quad h = 1, \ldots, H,$ (2.1)


where $g_h(x, \theta)$ is an $r_h$-dimensional vector function. We further suppose that the measurements $x_{h1}, \ldots, x_{hn_h}$ obtained by instrument $h$ are i.i.d., and that measurements made by different instruments are independent. The sampling scheme considered here is different from the method of Luo et al. (1996, unpublished) and others. They used two-phase sampling with the imperfect sample as the first phase sample and the second phase sample taken from the first phase sample. This means that the second phase sample is dependent on the first phase sample, and usually the second phase sample is a very small portion of the first phase sample. We consider independent samples here for two main reasons. First, when different instruments are involved in measuring, the makers may have done a lot of testing and will give the accuracies of the different instruments. Alternatively, a great deal of testing can be done to learn the differences between instruments before practical measuring. The same can be said about personnel with different degrees of training. Our method can take this kind of information into account and make more accurate inference. Second, as pointed out by Luo et al. (1996, unpublished) and others, when the second phase sample is quite small compared to the first phase sample, the two samples can be treated as independent samples. Actually, their recommended estimator for the CDF is based on independent samples.

The empirical likelihood based on the independent samples

is given by

$L(F_1, \ldots, F_H) = \prod_{h=1}^{H} \prod_{i=1}^{n_h} p_{hi},$ (2.2)

where $p_{hi} = \Pr(x_h = x_{hi})$ and $\sum_{i=1}^{n_h} p_{hi} = 1$ for each $h$. Only those $F_h$ distributions


which have an atom of probability on each $x_{hi}$ have nonzero likelihood. (2.2) is maximized by the empirical distribution function $F_{h,n_h}(x) = n_h^{-1} \sum_{i=1}^{n_h} I(x_{hi} < x)$.

The empirical likelihood ratio is then defined as

which reduces to

We remark that formulas here and elsewhere in this chapter do not require that the $x_{hi}$'s be distinct. Since we are interested in estimating the parameter $\theta$, and we know the estimating equation (2.1), we define the empirical likelihood ratio function

and $i$. Here the function $g_{hi}(\theta)$ depends on $h$ also,

which is different from the one discussed in chapters 2 to 4.

As noted before, for any given $\theta$ and $h$, $\prod_{i=1}^{n_h} p_{hi}$ can be maximized, provided 0 is inside the convex hull of the points $g_{h1}(\theta), \ldots, g_{hn_h}(\theta)$. The maximum of $\prod_{i=1}^{n_h} p_{hi}$ subject to the constraints $p_{hi} \ge 0$, $\sum_i p_{hi} = 1$, and $\sum_i p_{hi} g_{hi}(\theta) = 0$ is attained when

where $t_h = t_h(\theta)$ is an $r_h \times 1$ vector given as the solution to

Hence the left side of (2.4) is


and the empirical (negative) log-likelihood ratio of $\theta$ is

with $t_h$ being the solution to (2.6). We can minimize $l_E(\theta)$ to obtain an estimator $\tilde\theta$ of the parameter $\theta$, called the maximum empirical likelihood ratio estimator (MELRE). In addition, this yields estimators $\tilde p_{hi}$ from (2.5), and an estimator for the true distribution function $F_H$ as

$\tilde\theta$ is obtained by solving

and (2.6) together.

In the following section we will prove the existence of $\tilde\theta$ and study the asymptotic properties of $\tilde\theta$ and $\tilde F_{H,n_H}(x)$.
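For orientation, the Lagrange-multiplier solution described in this section has the following standard form when the single-sample argument of Qin and Lawless (1994) is applied to each instrument $h$ separately. This is a sketch under that assumption, writing $g_{hi}(\theta) = g_h(x_{hi}, \theta)$; the chapter's own displays (2.5) to (2.8) may be normalised differently.

```latex
\begin{align*}
p_{hi} &= \frac{1}{n_h}\,\frac{1}{1 + t_h^{T} g_{hi}(\theta)}, \\
0 &= \frac{1}{n_h}\sum_{i=1}^{n_h}\frac{g_{hi}(\theta)}{1 + t_h^{T} g_{hi}(\theta)}, \\
l_E(\theta) &= \sum_{h=1}^{H}\sum_{i=1}^{n_h}\log\{1 + t_h^{T}(\theta)\, g_{hi}(\theta)\}, \\
\tilde F_{H,n_H}(x) &= \sum_{i=1}^{n_H}\tilde p_{Hi}\, I(x_{Hi} < x).
\end{align*}
```

The first line gives the weights for fixed $\theta$, the second determines $t_h(\theta)$, the third is the profile criterion minimized over $\theta$, and the last is the resulting estimator of $F_H$ evaluated at the minimizer.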

6.3 Asymptotic properties of MELRE

First, we give the conditions for the existence of $\tilde\theta$ which will minimize $l_E(\theta)$ defined by (1.8).

In the following, we will use $\|\cdot\|$ to denote the Euclidean norm and $n = \sum_h n_h$.

Lemma 1. Suppose that as $n \to \infty$, $n/n_h \to k_h > 0$ for all $h$. And suppose that, in a neighbourhood of the true value $\theta_0$, $\mathrm{Var}[g_h(x_h, \theta_0)] = \sigma_h(\theta_0) > 0$ for all $h$, that $\|\partial g_h(x_h, \theta)/\partial\theta\|$ and $\|g_h(x_h, \theta)\|^3$ are bounded by some integrable function $G(x)$ in this neighbourhood, and that the rank of $E[\partial g_h(x_h, \theta_0)/\partial\theta]$ is $p$. Then, as $n \to \infty$, with probability 1 $l_E(\theta)$ attains its minimum value at some point $\tilde\theta$ in the interior of the ball $\|\theta - \theta_0\| \le n^{-1/3}$, and $\tilde\theta$ and $\tilde t_h = t_h(\tilde\theta)$ satisfy

where

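Theorem 1. In addition to the conditions of Lemma 1, we further assume that $\partial^2 g_h(x_h, \theta)/\partial\theta\,\partial\theta^T$ is continuous in $\theta$ in a neighbourhood of the true value $\theta_0$. Then, if $\|\partial^2 g_h(x_h, \theta)/\partial\theta\,\partial\theta^T\|$ is bounded by some integrable function $G(x)$ in the neighbourhood for all $h$,

$\sqrt{n}(\tilde\theta - \theta_0) \to N(0, V), \quad \sqrt{n}(\tilde t_h - 0) \to N(0, U), \quad \sqrt{n}\{\tilde F_{H,n_H}(x) - F_H(x)\} \to N(0, W),$

where $\tilde F_{H,n_H}(x) = \sum_i \tilde p_{Hi} I(x_{Hi} < x)$, $\tilde p_{Hi} = \frac{1}{n_H}\,\frac{1}{1 + \tilde t_H^T g_{Hi}(\tilde\theta)}$, and $V$, $U$, $W$ are defined in the proof.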

Since usually $n_h \gg n_H$ for $h = 1, \ldots, H-1$, we can see that the main part of the variance of $\tilde F_{H,n_H}(x)$ will come from the first three terms of (6.4), which is smaller than the variance of the estimator $F_{H,n_H}(x)$, the empirical cumulative distribution function from the precise data $x_{H1}, \ldots, x_{Hn_H}$ only. Some further discussion will be given in Section 6.5.

The estimators for the variances of $\tilde\theta$ and $\tilde F_{H,n_H}(x)$ can be obtained by replacing each component of $V$ and $W$ by their obvious estimators. These variance estimators are asymptotically correct.

6.4 Likelihood Ratio Tests

Several testing issues arise naturally from the application point of view. We first consider how to test the hypothesis $H_0: \theta = \theta_0$.

Theorem 2. In the semi-parametric model (2.1), under the conditions of Theorem 1, for testing $H_0: \theta = \theta_0$, the empirical likelihood ratio statistic

is asymptotically distributed as $\chi^2_p$ if $n_1 = \cdots = n_H$.

Now we turn to testing the model (2.1), i.e.,

Corollary 2.1. In order to test the model (4.1), we can use the empirical likelihood ratio statistic

Under the conditions of Theorem 2 above, $R_1$ is asymptotically distributed as $\chi^2_{r-p}$ if (4.1) is correct and $n_1 = \cdots = n_H = n$, where $r = \sum_{h=1}^{H} r_h$.


6.5 Examples

We present several illustrations of the estimation procedures. Procedures for solving equations (3.1) through (3.3) are given, and large-sample aspects of the estimators are discussed.

Example 1. Common Mean Model

Suppose we have two different instruments, i.e., $H = 2$, and we only know that they are unbiased, i.e.,

$E x_1 = \theta, \quad E x_2 = \theta,$

and $x_2$ refers to the perfect measurement. Suppose that $x_{11}, \ldots, x_{1n_1}$ and $x_{21}, \ldots, x_{2n_2}$ are independent samples from $x_1$ and $x_2$ respectively. Let

then

and equation (3.3) becomes

Substituting (5.1) into (3.2), we get


Solving (5.2) and (5.3) by Newton's method, or by using a NAG Fortran library function, with initial value $(\theta, t) = (\bar x, 0)$ often works, where $\bar x = (n_1 \bar x_1 + n_2 \bar x_2)/n$.
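To make the computation concrete, the following is a minimal Python sketch for the common mean model with scalar $g_h(x, \theta) = x - \theta$. It assumes the pooled criterion has the standard form $l_E(\theta) = \sum_h \sum_i \log\{1 + t_h(x_{hi} - \theta)\}$, which may differ from the chapter's normalisation, and it swaps Newton's method and the NAG routine for plain bisection, which needs no starting value. All function names are illustrative only.

```python
def solve_multiplier(g, iters=200):
    """Bisection for t solving sum(g_i / (1 + t*g_i)) = 0 with 1 + t*g_i > 0.

    Requires min(g) < 0 < max(g), i.e. zero inside the convex hull of g.
    """
    lo, hi = -1.0 / max(g), -1.0 / min(g)
    pad = 1e-9 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The left-hand side is strictly decreasing in t on (lo, hi).
        if sum(gi / (1.0 + mid * gi) for gi in g) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def melre_common_mean(x1, x2, iters=100):
    """Minimise the assumed l_E(theta) over the overlap of the two sample ranges."""
    def multiplier(x, theta):
        return solve_multiplier([xi - theta for xi in x])

    lo = max(min(x1), min(x2))
    hi = min(max(x1), max(x2))
    pad = 1e-9 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        # Under the assumed form, d l_E / d theta = -(n1*t1 + n2*t2),
        # and n1*t1 + n2*t2 is decreasing in theta.
        if len(x1) * multiplier(x1, theta) + len(x2) * multiplier(x2, theta) > 0.0:
            lo = theta
        else:
            hi = theta
    theta = 0.5 * (lo + hi)
    return theta, multiplier(x1, theta), multiplier(x2, theta)

def el_weights(x, theta, t):
    """Weights p_i = 1 / (n * (1 + t * (x_i - theta)))."""
    n = len(x)
    return [1.0 / (n * (1.0 + t * (xi - theta))) for xi in x]

x1 = [1.0, 2.0, 4.0, 5.0, 6.5]   # imperfect instrument (made-up data)
x2 = [2.0, 3.0, 3.5]             # perfect instrument (made-up data)
theta, t1, t2 = melre_common_mean(x1, x2)
p2 = el_weights(x2, theta, t2)
print(theta, sum(p2), sum(p * x for p, x in zip(p2, x2)))
```

The weights in each sample sum to one and reproduce the fitted common mean, so the weighted empirical CDF built from `p2` is automatically monotone. The sketch requires the two sample ranges to overlap.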

Let $\sigma_1 = \mathrm{Var}(x_1)$, $\sigma_2 = \mathrm{Var}(x_2)$. Since

replacing $k_h$ by $n/n_h$ we get the asymptotic variance of $\tilde\theta$ as

which is the same as the variance of the optimal linear combination estimator

Note that this optimal linear combination is actually not an estimator, since we do not know the variances $\sigma_1$ and $\sigma_2$, but we use it for comparison here and in the following.

As for the asymptotic variance of $\tilde F_{H,n_H}(x)$, i.e., $W$ given by (6.4), after a little calculation we get

+ L ~ ~ ~ ; ' C ~ O ; ~ - 2o;l

(a;' + 0;' + OF' 1

From (5.4) we see that if $\sigma_2 \ll \sigma_1$, then combining the two samples will not improve the efficiency of the CDF estimator significantly compared to using only the perfect data. On the other hand, if $\sigma_2/\sigma_1$ is reasonably large, such as greater than 0.5, then for, say, $n_2/n_1 = k_1/k_2 < 5/8$, we will have a significantly more efficient estimator for the CDF.


Example 2. Additive Model

We consider the following model

i.e., the variance of the imperfect measurement is larger than the variance of the perfect measurement by $\sigma_0$, where $\sigma_0$ ($> 0$) is known. We note here that if $E x_1 = \theta_1 + \nu_0$, where $\nu_0$ is known, we can change the data $x_1$ to $x_1 - \nu_0$, and all the following discussion still applies.

Suppose two independent samples from $x_1$ and $x_2$ are $x_{11}, \ldots, x_{1n_1}$ and $x_{21}, \ldots, x_{2n_2}$ respectively. Denote $z_1 = (x_1, x_1^2)'$, $z_2 = (x_2, x_2^2)'$, $\theta = (\theta_1, \theta_2)'$, $g_1(z_1, \theta) = (x_1 - \theta_1,\ x_1^2 - \theta_1^2 - \theta_2 - \sigma_0)'$, $g_2(z_2, \theta) = (x_2 - \theta_1,\ x_2^2 - \theta_1^2 - \theta_2)'$; then $E g_1 = 0$, $E g_2 = 0$, and equation (3.3) becomes

$t_{21} = -\frac{n_2}{n_1}\, t_{11}, \quad t_{22} = -\frac{n_2}{n_1}\, t_{12}.$

Hence, (3.2) becomes


We can solve the above four equations by letting the initial values

Using Newton's method will usually give us a solution $\tilde\theta_1$, $\tilde\theta_2$, and hence $\tilde p_{hi}$ and $\tilde F_{2n_2}(x)$. In order to get an idea of the asymptotic properties of these estimators, we suppose $x_1$ and $x_2$ have normal distributions. Since


hence. by (6.4) we have V = (kl& + k2B,)62Q./(B2 + O.)? If ive replace kh by n/nht ly

then Var(@) = (n~'O2 + n;19.)626i/(82 + which is the sarne as t h e variance of

the optimal linear combination estimator

Example 3. Product Model

Suppose

i.e., the ratio of the variance of the imperfect measurement to that of the perfect measurement is a known constant $c > 0$. The two independent samples from $x_1$ and $x_2$ are $x_{11}, \ldots, x_{1n_1}$ and $x_{21}, \ldots, x_{2n_2}$ respectively. Denote

$g_1(z_1, \theta) = (x_1 - \theta_1,\ x_1^2 - \theta_1^2 - c\theta_2)', \quad g_2(z_2, \theta) = (x_2 - \theta_1,\ x_2^2 - \theta_1^2 - \theta_2)';$

then

$E g_1 = 0, \quad E g_2 = 0.$

Hence equations (3.1) through (3.3) become


Solving the above six equations by letting the initial values

$t_{1,0} = 0, \quad t_{2,0} = 0$

and using Newton's method gives us a solution $\tilde\theta_1$, $\tilde\theta_2$, and hence $\tilde p_{hi}$ and $\tilde F_{2n_2}(x)$.

Usually the solution $\tilde p_{hi}$ is non-negative.

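In the following discussion, we suppose $x_1$ and $x_2$ are normally distributed. Then

Therefore

$\mathrm{Var}(g_1(z_1, \theta)) = \sigma_1 = \begin{pmatrix} c\theta_2 & 2c\theta_1\theta_2 \\ 2c\theta_1\theta_2 & 2c^2\theta_2^2 + 4c\theta_1^2\theta_2 \end{pmatrix}$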
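The entries of this covariance matrix follow from the moments of a normal variable. As a worked check, assuming $x_1 \sim N(\mu, v)$ with $\mu = \theta_1$ and $v = c\theta_2$ (the symbols $\mu$ and $v$ are introduced here only for the derivation):

```latex
\begin{align*}
E x_1^2 &= \mu^2 + v, \qquad E x_1^3 = \mu^3 + 3\mu v, \qquad E x_1^4 = \mu^4 + 6\mu^2 v + 3v^2, \\
\mathrm{Var}(x_1) &= v = c\theta_2, \\
\mathrm{Cov}(x_1, x_1^2) &= E x_1^3 - \mu(\mu^2 + v) = 2\mu v = 2c\theta_1\theta_2, \\
\mathrm{Var}(x_1^2) &= E x_1^4 - (\mu^2 + v)^2 = 4\mu^2 v + 2v^2 = 4c\theta_1^2\theta_2 + 2c^2\theta_2^2.
\end{align*}
```

Since $g_1$ differs from $(x_1, x_1^2)'$ only by a constant shift, its covariance matrix is the covariance matrix of $(x_1, x_1^2)'$. Setting $c = 1$ gives the corresponding matrix for $g_2$.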


Hence, by (6.4) we have $\mathrm{Var}(\tilde\theta_1) \approx \left(\frac{c}{1+c}\right)^2 \left(\frac{1}{c\,n_1} + \frac{1}{n_2}\right)\theta_2$, which is the same as the variance of the optimal linear combination estimator

6.6 Proofs

Proof of Lemma 1

Denote $\theta = \theta_0 + u n^{-1/3}$ for $\theta \in \{\theta : \|\theta - \theta_0\| = n^{-1/3}\}$, where $\|u\| = 1$. First we give a lower bound for $l_E(\theta)$ on the surface of the ball. From Qin and Lawless (1994), when $\|\theta - \theta_0\| \le n^{-1/3}$, we have

uniformly about $\theta \in \{\theta : \|\theta - \theta_0\| \le n^{-1/3}\}$, where

By this and Taylor's expansion, we have (uniformly for $u$)

where $c - \varepsilon > 0$ and $c$ is the smallest eigenvalue of

Similarly

Since $l_E(\theta)$ is a continuous function of $\theta$ when $\theta$ belongs to the ball $\|\theta - \theta_0\| \le n^{-1/3}$, $l_E(\theta)$ has its minimum value in the interior of this ball, and $\tilde\theta$ satisfies (noting


Proof of Theorem 1

Taking derivatives with respect to $\theta$ and $t_h$, we have


equations (6.1) and (6.2) can be written in matrix form as

where $M_{22.1} = -M_{21} M_{11}^{-1} M_{12}$, we have

$\sqrt{n}(\tilde\theta - \theta_0) \to N(0, V),$

where


Further

where

Now we turn to the third conclusion. Since

we know

where


By the definition of U , we have

= M;'

and

cy

t =

Hence

where


Noting that $U M_{11} U = U$, we have

Proof of Theorem 2

Noting (6.3), we have


where $U$, $M_{11}$ are defined in section 4; $N = \mathrm{diag}\{n_1 I_{r_1}, \ldots, n_H I_{r_H}\}$. Similarly, we

we can see that only when $n_1 = \cdots = n_H = n$ do we have $UN = NU$, so that $P$ is symmetric; otherwise it is asymmetric, and the distribution of $R$ is of an unknown form. Since we suppose that $n_1 = \cdots = n_H$, therefore

By the definition of $U$, we have


Hence $P$ is idempotent with trace equal to $p$, and the theorem follows.

Proof of Corollary 2

From the proof of Theorem 2, we have

Chapter 7

Conclusions and Further Research

7.1 Conclusions

An empirical likelihood approach to the use of auxiliary information in stratified surveys is introduced. We first considered how to make inference on the population mean $\bar{Y}$ and the distribution function $F_y(y)$ of a characteristic $y$ of a population when the population mean of an auxiliary variable $x$ is known. Then we extended the form of auxiliary information to a more general format, namely, general estimating equations. In the above two cases, only stratified simple random sampling (with or without replacement) is considered. We proceeded to consider more complex sampling designs, i.e., designs in which other sampling methods, such as probability proportional to size sampling, are used within each stratum. Chapter 5 addresses a computational problem which may arise in Chapter 3: when we solve estimating equations involving unknown parameters, the solution $\hat p_{hi}$ may sometimes be negative. Chapter 6 is an immediate application of empirical likelihood, namely, using

CHAPTER 7. CONCLUSIONS AND FURTHER RESEARCH

empirical likelihood method in the presence of measurement error.

The empirical likelihood estimators of parameters such as $\bar{Y}$, the mean of the characteristic of interest $y$ of the population, and the distribution function of $y$, are shown in Chapter 2 to be asymptotically equivalent to the optimal estimators given by Rao and Liu (1992). However, the EL estimator of $F_y(y)$ is guaranteed to be monotone and non-negative, unlike the optimal estimator. The empirical likelihood ratio test is also considered, though our simulations show that its finite sample properties are not as good as expected.

The above results are generalized to deal with the case of auxiliary information summarised in the form of general estimating equations, hence enlarging the application areas of the empirical likelihood method. Methods to deal with cases where parameters are subject to constraints are also included. The empirical likelihood ratio test is also discussed. The results show that the empirical likelihood ratio test, the Lagrange multiplier test, and the Wald test lead to the same asymptotic distribution.

In practice, a complex survey design is often employed. How to incorporate such designs into empirical likelihood inference is presented in Chapter 4. Here the empirical likelihood is replaced by a pseudo-likelihood, and all the inferences are based on the pseudo-likelihood. The EL ratio statistic is no longer asymptotically $\chi^2$-distributed; its limiting distribution is that of a weighted sum of independent $\chi^2_1$ variables.

The method of combining EL, general estimating equations and M-estimation is presented in Chapter 5. The key point here is that the general estimating equations and EL are used to estimate the distribution function first, where the auxiliary information summarised through estimating equations does not include any unknown parameters; this estimated distribution function is then used in M-estimation. In some


complex cases, this will reduce the difficulty of calculation, but the auxiliary information should be summarised without unknown parameters.

In the presence of measurement error, EL can be used to combine information from different instruments to make more accurate inference. In other words, we can borrow strength from other populations to make inference on the population of interest. Some form of relation among these populations must be known before we can borrow strength from other populations.

7.2 Further Research

While it is hoped that the results of this research will be of use in making more accurate inference when auxiliary information is available, many areas remain in which further research might be profitably undertaken. Three areas are now discussed.

The simulations presented in Chapter 2 show that the empirical likelihood (ratio) estimators for parameters of interest perform very well. However, the empirical likelihood ratio tests are not as good as expected. Higher order corrections to the tests may improve the performance of EL ratio testing.

The EL estimators for the distribution function from Chapters 2 to 6 are non-negative and monotone, and hence can be used to estimate quantiles. Large sample properties of these quantile estimators need to be studied.

Throughout the paper, the weights of all strata are assumed known. This will limit the application of the EL method to more complex sampling such as multi-stage sampling. Typically, in multi-stage sampling, the weights of second stage sampling are not known. How to combine the auxiliary information in estimating weights


and the parameter of interest will be more challenging.

References

Aitchison, J. & Silvey, S.D. (1958). Maximum-likelihood estimation of parameters subject to restraints. Ann. Math. Statist., 29, 813-828.

Bickel, P.J. & Freedman, D.A. (1984). Asymptotic normality and the bootstrap in stratified sampling. Ann. Statist., 12, 470-482.

Chaudhary, M.A. & Sen, P.K. (1995). Asymptotic Distribution of Estimators from Unequal Probability Sampling. Proceedings of the American Statistical Association, Section on Survey Research Methods, 345-349.

Chen, J. & Qin, J. (1993). Empirical likelihood estimation for finite population and the effective usage of auxiliary information. Biometrika, 80, 107-116.

Chen, J. & Sitter, R.R. A Pseudo Empirical Likelihood Approach to the Effective Use of Auxiliary Information in Complex Surveys. Research Report No. 96-01.

Cochran, W.G. (1977). Sampling Techniques, 3rd ed. New York: Wiley.

DiCiccio, T.J., Hall, P. & Romano, J.P. (1989). Comparison of parametric and empirical likelihood functions. Biometrika, 76, 465-476.

DiCiccio, T.J. & Romano, J. (1990). Nonparametric confidence limits by resampling methods and least favorable families. Internat. Statist. Rev., 58, 59-76.

Fuller, W.A. (1995). Estimation in the presence of measurement error. Int. Statist. Rev., 63, 121-141.

Ghosh, M. & Rao, J.N.K. (1994). Small Area Estimation: An Appraisal. Statistical Science, 9, 55-93.

Hall, P. (1990). Pseudo-likelihood theory for empirical likelihood. Ann. Statist., 18, 121-140.

Hartley, H.O. & Rao, J.N.K. (1968). A new estimation theory for sample surveys. Biometrika, 55, 547-557.

Hochberg, Y. (1977). On the Use of Double Sampling Schemes in Analyzing Categorical Data with Misclassification Errors. JASA, 72, 914-921.

Huber, P. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73-101.

Jennrich, R.I. (1969). Asymptotic properties of non-linear least squares estimates. Ann. Math. Statist., 40, 633-643.

Luo, M., Stokes, L. & Sager, T. Estimation of the CDF of a Finite Population in the Presence of a Calibration Sample.

Ohlsson, E. (1986). Asymptotic Normality of the Rao-Hartley-Cochran Estimator: An Application of the Martingale CLT. Scand. J. Statist., 13, 17-23.

Overton, W.S. (1989). Effects of measurement and other extraneous errors on estimated distribution functions in the national surface water surveys. Technical report 129, Dept. of Statistics, Oregon State University.

Owen, A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237-249.

Owen, A.B. (1990). Empirical likelihood confidence regions. Ann. Statist., 18, 90-120.

Owen, A.B. (1991). Empirical likelihood for linear models. Ann. Statist., 19, 1725-1747.

Qin, J. & Lawless, J.F. (1994). Empirical likelihood and general estimating equations. Ann. Statist., 22, 300-325.

Qin, J. & Lawless, J.F. (1995). Estimating equations, empirical likelihood and constraints on parameters. The Canadian Journal of Statistics, 23, 145-159.

Rao, J.N.K. (1994). Estimating Totals and Distribution Functions Using Auxiliary Information at the Estimation Stage. Journal of Official Statistics, 10, 153-165.

Rao, J.N.K., Kovar, J.G. & Mantel, H.J. (1990). On estimating Distribution Functions and Quantiles from Survey Data using Auxiliary Information. Biometrika, 77, 365-375.

Rao, J.N.K. & Liu, J. (1992). On estimating distribution function from sample survey data using supplementary information at the estimation stage. Nonparametric Statistics and Related Topics (A.K.Md.E. Saleh, Ed.). New York: Elsevier, 399-407.

Rao, J.N.K. & Scott, A.J. (1979). Chi-Squared Tests for Analysis of Categorical Data From Complex Surveys. Proceedings of the American Statistical Association, Section on Survey Research Methods, 58-66.

Rao, J.N.K. & Scott, A.J. (1981). The Analysis of Categorical Data From Complex Sample Surveys: Chi-Squared Tests for Goodness of Fit and Independence in Two-Way Tables. JASA, 76, 221-230.

Rosen, B. (1972). Asymptotic Theory for Successive Sampling with Varying Probabilities, I and II. Annals of Mathematical Statistics, 43, 373-397; 748-776.

Särndal, C.E., Swensson, B. & Wretman, J. (1991). Model assisted survey sampling. Springer.

Sen, P.K. (1988). Asymptotics in Finite Population Sampling. Handbook of Statistics 6, P.R. Krishnaiah and C.R. Rao, Eds., Elsevier Science Publishers B.V., 291-331.

Shao, J. (1994). L-statistics in complex survey problems. Ann. Statist., 22, 946-967.

Tenenbein, A. (1970). A Double Sampling Scheme for Estimating from Binomial Data with Misclassifications. JASA, 65, 1330-1361.

Tenenbein, A. (1971). A Double Sampling Scheme for Estimating from Binomial Data with Misclassifications: Sample Size Determination. Biometrics, 27, 935-944.

Tenenbein, A. (1972). A Double Sampling Scheme for Estimating from Misclassified Multinomial Data with Applications to Sampling Inspection. Technometrics, 14, 187-202.

Zhang, B. (1995). M-estimation and Quantile Estimation in the Presence of Auxiliary Information. Journal of Statistical Planning and Inference, 44, 77-94.

Zhang, B. (1996). Estimating a Population Variance with Known Mean. International Statistical Review, 64, 215-229.

Zhong, C. & Rao, J.N.K. (1996). Empirical likelihood inference using stratified sampling using auxiliary information. The proceeding of JSM 96.
