
IMPROVEMENT IN ESTIMATING POPULATION MEAN USING TWO AUXILIARY VARIABLES IN TWO-PHASE SAMPLING


Italian Journal of Pure and Applied Mathematics, N. 28-2011 (135-142)


Rajesh Singh

Department of Statistics, Banaras Hindu University (U.P.), India, e-mail: [email protected]

Pankaj Chauhan

Nirmala Sawan

School of Statistics, DAVV, Indore (M.P.), India

Florentin Smarandache

Department of Mathematics, University of New Mexico, Gallup, USA, e-mail: [email protected]

Abstract. This study proposes an improved chain-ratio type estimator for estimating the population mean using some known values of population parameter(s) of the second auxiliary character. The proposed estimators are compared with the two-phase ratio estimator and some other chain type estimators. The performance of the proposed estimators is supported by a numerical illustration.

AMS Subject Classification: 62D05.

Keywords: auxiliary variables, chain ratio-type estimator, bias, mean squared error.

1. Introduction

The ratio method of estimation is generally used when the study variable $y$ is positively correlated with an auxiliary variable $x$ whose population mean is known in advance. When the population mean of the auxiliary character is not known, we resort to two-phase (double) sampling. Two-phase sampling is a powerful and cost-effective procedure for obtaining, from the first-phase sample, a reliable estimate of the unknown parameters of the auxiliary variable $x$, and hence it plays an important role in survey sampling; see, for instance, Hidiroglou and Sarndal (1998). Consider a finite population $U = (U_1, U_2, \ldots, U_N)$. Let $y$ and $x$ be the study and auxiliary variables, taking values $y_i$ and $x_i$, respectively, for the $i$th unit $U_i$. Assuming a simple random sampling without replacement (SRSWOR) design in each phase, the two-phase sampling scheme is as follows:

(i) the first-phase sample $s_{n'}$ ($s_{n'} \subset U$) of fixed size $n'$ is drawn to measure only $x$, in order to form a good estimate of the population mean $\bar{X}$;

(ii) given $s_{n'}$, the second-phase sample $s_n$ ($s_n \subset s_{n'}$) of fixed size $n$ is drawn to measure $y$ only.

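Let

$$\bar{x} = \frac{1}{n}\sum_{i \in s_n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i \in s_n} y_i, \qquad \bar{x}' = \frac{1}{n'}\sum_{i \in s_{n'}} x_i.$$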

The classical ratio estimator for $\bar{Y}$ is defined as

$$\bar{y}_r = \frac{\bar{y}}{\bar{x}}\,\bar{X}. \tag{1.1}$$

If $\bar{X}$ is not known, we estimate $\bar{Y}$ by the two-phase ratio estimator

$$\bar{y}_{rd} = \frac{\bar{y}}{\bar{x}}\,\bar{x}'. \tag{1.2}$$
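The sampling scheme (i)-(ii) and the estimator (1.2) can be sketched in a few lines of Python. This is only an illustration: the synthetic population, the sample sizes and the function name `two_phase_ratio_estimate` are assumptions made for the example and do not come from the paper.

```python
import random
from statistics import mean

def two_phase_ratio_estimate(x_pop, y_pop, n_first, n_second, rng):
    """Two-phase ratio estimate (1.2) with SRSWOR in both phases."""
    N = len(x_pop)
    first = rng.sample(range(N), n_first)         # s_{n'}: only x is measured
    second = rng.sample(first, n_second)          # s_n, a subsample of s_{n'}: y is measured
    x_bar_first = mean(x_pop[i] for i in first)   # first-phase mean of x
    x_bar = mean(x_pop[i] for i in second)        # second-phase mean of x
    y_bar = mean(y_pop[i] for i in second)        # second-phase mean of y
    return y_bar / x_bar * x_bar_first

# Synthetic population (an assumption for illustration, not data from the paper)
rng = random.Random(0)
x_pop = [rng.uniform(50, 150) for _ in range(200)]
y_pop = [2.0 * x + rng.gauss(0, 5) for x in x_pop]

estimate = two_phase_ratio_estimate(x_pop, y_pop, n_first=60, n_second=20, rng=rng)
print(f"estimate = {estimate:.2f}, true mean = {mean(y_pop):.2f}")
```

Note that the population mean of $x$ is never used: the first-phase mean $\bar{x}'$ stands in for it.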

Sometimes, even if $\bar{X}$ is not known, information on a cheaply ascertainable variable $z$, closely related to $x$ but, compared to $x$, remotely related to $y$, is available on all units of the population. For instance, while estimating the total yield of wheat in a village, the yield and area under the crop are likely to be unknown, but the total area of each farm may be known from village records or may be obtained at a low cost. Then $y$, $x$ and $z$ are, respectively, yield, area under wheat and area under cultivation; see Singh et al. (2004). Assuming that the population mean $\bar{Z}$ of the variable $z$ is known, Chand (1975) proposed a chain type ratio estimator as

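$$t_1 = \frac{\bar{y}}{\bar{x}}\left(\frac{\bar{x}'}{\bar{z}'}\right)\bar{Z}, \tag{1.3}$$

where $\bar{z}' = \frac{1}{n'}\sum_{i \in s_{n'}} z_i$ is the first-phase sample mean of $z$.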

Several authors have used prior values of certain population parameter(s) to find more precise estimates. Singh and Upadhyaya (1995) used the coefficient of variation of $z$ to define a modified chain type ratio estimator. In many situations the value of the auxiliary variable may be available for each unit in the population; see, for instance, Das and Tripathi (1981). In such situations, knowledge of $\bar{Z}$, $C_z$ (coefficient of variation), $\beta_1(z)$ (coefficient of skewness), $\beta_2(z)$ (coefficient of kurtosis) and possibly of some other parameters may be utilized. Regarding the availability of information on $C_z$, $\beta_1(z)$ and $\beta_2(z)$, the reader is referred to Searls (1964), Sen (1978), Singh et al. (1973), Searls and Intarapanich (1990) and Singh et al. (2007). Using the known coefficient of variation $C_z$ and the known coefficient of kurtosis $\beta_2(z)$ of the second auxiliary character $z$, Upadhyaya and Singh (2001) proposed some estimators for $\bar{Y}$.

If the population mean and coefficient of variation of the second auxiliary character are known, the standard deviation $\sigma_z$ is automatically known, and it is more meaningful to use $\sigma_z$ in addition to $C_z$; see Srivastava and Jhajj (1980). Further, since $C_z$, $\beta_1(z)$ and $\beta_2(z)$ are unit-free constants, their use in additive form is not well justified. Motivated by these considerations and utilizing the known values of $\sigma_z$, $\beta_1(z)$ and $\beta_2(z)$, Singh (2001) suggested some modified estimators for $\bar{Y}$.

In this paper, under simple random sampling without replacement (SRSWOR), we suggest an improved chain ratio type estimator for estimating the population mean using some known values of population parameter(s).

2. The suggested estimator

The work of the authors discussed in Section 1 can be summarized by the following estimator:

$$t = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{a\bar{Z} + b}{a\bar{z}' + b}\right), \tag{2.1}$$

where $a$ ($\neq 0$) and $b$ are either real numbers or functions of the known parameters of the second auxiliary variable $z$, such as the standard deviation ($\sigma_z$), coefficient of variation ($C_z$), skewness ($\beta_1(z)$) and kurtosis ($\beta_2(z)$).

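The following table presents some of the important known estimators of the population mean, which can be obtained by a suitable choice of the constants $a$ and $b$.

| Estimator | $a$ | $b$ | Source |
|---|---|---|---|
| $t_1 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{\bar{Z}}{\bar{z}'}\right)$ | $1$ | $0$ | Chand (1975) chain ratio type estimator |
| $t_2 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{\bar{Z} + C_z}{\bar{z}' + C_z}\right)$ | $1$ | $C_z$ | Singh and Upadhyaya (1995) estimator |
| $t_3 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{\beta_2(z)\bar{Z} + C_z}{\beta_2(z)\bar{z}' + C_z}\right)$ | $\beta_2(z)$ | $C_z$ | Upadhyaya and Singh (2001) estimator |
| $t_4 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{C_z\bar{Z} + \beta_2(z)}{C_z\bar{z}' + \beta_2(z)}\right)$ | $C_z$ | $\beta_2(z)$ | Upadhyaya and Singh (2001) estimator |
| $t_5 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{\bar{Z} + \sigma_z}{\bar{z}' + \sigma_z}\right)$ | $1$ | $\sigma_z$ | Singh (2001) estimator |
| $t_6 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{\beta_1(z)\bar{Z} + \sigma_z}{\beta_1(z)\bar{z}' + \sigma_z}\right)$ | $\beta_1(z)$ | $\sigma_z$ | Singh (2001) estimator |
| $t_7 = \bar{y}\left(\frac{\bar{x}'}{\bar{x}}\right)\left(\frac{\beta_2(z)\bar{Z} + \sigma_z}{\beta_2(z)\bar{z}' + \sigma_z}\right)$ | $\beta_2(z)$ | $\sigma_z$ | Singh (2001) estimator |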

In addition to these estimators, a large number of estimators can also be generated from the estimator $t$ at (2.1) by choosing suitable values of $a$ and $b$.
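As a minimal sketch of how the family (2.1) generates the individual estimators, the function below evaluates $t$ for any admissible pair $(a, b)$. The function name and the sample means passed to it are hypothetical; only $\bar{Z}$, $C_z$ and $\sigma_z$ are the values reported in Section 4.

```python
def chain_ratio_estimator(y_bar, x_bar, x_bar_first, z_bar_first, Z_bar, a=1.0, b=0.0):
    """General chain ratio type estimator t of (2.1); a must be non-zero."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return y_bar * (x_bar_first / x_bar) * (a * Z_bar + b) / (a * z_bar_first + b)

# Known parameters of z (Section 4); the sample means below are made up.
Z_bar, Cz, sigma_z = 151.12, 0.0488, 7.224
sample = dict(y_bar=184.2, x_bar=186.0, x_bar_first=185.5, z_bar_first=150.8, Z_bar=Z_bar)

t1 = chain_ratio_estimator(**sample)                    # a = 1, b = 0
t2 = chain_ratio_estimator(**sample, a=1.0, b=Cz)       # a = 1, b = C_z
t5 = chain_ratio_estimator(**sample, a=1.0, b=sigma_z)  # a = 1, b = sigma_z
print(t1, t2, t5)
```

When $\bar{z}' = \bar{Z}$ the last factor equals one for every choice of $a$ and $b$, so all members of the family reduce to the two-phase ratio estimator (1.2).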

Following Kadilar and Cingi (2006), we propose a modified estimator combining $t_1$ and $t_i$ $(i = 2, 3, \ldots, 7)$ as follows:

$$t_i^* = \alpha t_1 + (1 - \alpha)t_i, \qquad (i = 2, 3, \ldots, 7), \tag{2.2}$$

where $\alpha$ is a real constant to be determined such that the MSE of $t_i^*$ is minimum, and $t_i$ $(i = 2, 3, \ldots, 7)$ are the estimators listed above.

To obtain the bias and MSE of $t_i^*$, we write

$$\bar{y} = \bar{Y}(1 + e_0), \quad \bar{x} = \bar{X}(1 + e_1), \quad \bar{x}' = \bar{X}(1 + e_1'), \quad \bar{z}' = \bar{Z}(1 + e_2')$$

such that

$$E(e_0) = E(e_1) = E(e_1') = E(e_2') = 0$$

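and

$$\begin{aligned}
E(e_0^2) &= f_1C_y^2, & E(e_1^2) &= f_1C_x^2, & E(e_1'^2) &= f_2C_x^2, \\
E(e_2'^2) &= f_2C_z^2, & E(e_0e_1) &= f_1\rho_{xy}C_xC_y, & E(e_0e_1') &= f_2\rho_{xy}C_xC_y, \\
E(e_0e_2') &= f_2\rho_{yz}C_yC_z, & E(e_1e_1') &= f_2C_x^2, & E(e_1e_2') &= f_2\rho_{xz}C_xC_z, \\
E(e_1'e_2') &= f_2\rho_{xz}C_xC_z,
\end{aligned}$$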

where

$$f_1 = \left(\frac{1}{n} - \frac{1}{N}\right), \qquad f_2 = \left(\frac{1}{n'} - \frac{1}{N}\right),$$

$$C_y^2 = \frac{S_y^2}{\bar{Y}^2}, \qquad C_x^2 = \frac{S_x^2}{\bar{X}^2}, \qquad C_z^2 = \frac{S_z^2}{\bar{Z}^2},$$

$$\rho_{xy} = \frac{S_{xy}}{S_xS_y}, \qquad \rho_{xz} = \frac{S_{xz}}{S_xS_z}, \qquad \rho_{yz} = \frac{S_{yz}}{S_yS_z},$$

$$S_y^2 = \frac{1}{N-1}\sum_{i \in U}(y_i - \bar{Y})^2, \qquad S_x^2 = \frac{1}{N-1}\sum_{i \in U}(x_i - \bar{X})^2, \qquad S_z^2 = \frac{1}{N-1}\sum_{i \in U}(z_i - \bar{Z})^2,$$

$$S_{xy} = \frac{1}{N-1}\sum_{i \in U}(x_i - \bar{X})(y_i - \bar{Y}), \qquad S_{xz} = \frac{1}{N-1}\sum_{i \in U}(x_i - \bar{X})(z_i - \bar{Z}), \qquad S_{yz} = \frac{1}{N-1}\sum_{i \in U}(y_i - \bar{Y})(z_i - \bar{Z}).$$

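Expressing $t_i^*$ in terms of the $e$'s, we have

$$t_i^* = \bar{Y}(1 + e_0)\left[\alpha(1 + e_1')(1 + e_1)^{-1}(1 + e_2')^{-1} + (1 - \alpha)(1 + e_1')(1 + e_1)^{-1}(1 + \theta e_2')^{-1}\right], \tag{2.3}$$

where

$$\theta = \frac{a\bar{Z}}{a\bar{Z} + b}. \tag{2.4}$$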

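Expanding the right-hand side of (2.3) and retaining only the first-order terms in the $e$'s, which is sufficient for the MSE to the first order of approximation, we have

$$t_i^* \cong \bar{Y}\left[1 + e_0 - e_1 + e_1' - e_2'(\alpha + \theta - \alpha\theta)\right] \tag{2.5}$$

or

$$t_i^* - \bar{Y} \cong \bar{Y}\left[e_0 - e_1 + e_1' - e_2'(\alpha + \theta - \alpha\theta)\right]. \tag{2.6}$$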
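The step from (2.3) to (2.5) is compressed in the text. One way to fill it in, assuming $|e_1| < 1$ and $|\theta e_2'| < 1$ so that $(1 + e)^{-1} \approx 1 - e$ and products of two $e$'s can be dropped, is:

```latex
\begin{align*}
(1 + e_1')(1 + e_1)^{-1}(1 + \theta e_2')^{-1}
  &\approx (1 + e_1')(1 - e_1)(1 - \theta e_2')
   \approx 1 - e_1 + e_1' - \theta e_2', \\
\alpha(1 - e_1 + e_1' - e_2') + (1 - \alpha)(1 - e_1 + e_1' - \theta e_2')
  &= 1 - e_1 + e_1' - (\alpha + \theta - \alpha\theta)\,e_2', \\
\bar{Y}(1 + e_0)\left[1 - e_1 + e_1' - (\alpha + \theta - \alpha\theta)\,e_2'\right]
  &\approx \bar{Y}\left[1 + e_0 - e_1 + e_1' - (\alpha + \theta - \alpha\theta)\,e_2'\right].
\end{align*}
```

The first line with $\theta = 1$ gives the $t_1$ part of the bracket in (2.3); the second line is the $\alpha$-weighted combination of the two parts.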

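Squaring both sides of (2.6) and then taking expectations, we get the MSE of the estimator $t_i^*$, up to the first order of approximation, as

$$\mathrm{MSE}(t_i^*) = \bar{Y}^2\left[f_1C_y^2 + f_3(C_x^2 - 2\rho_{yx}C_yC_x) + f_2\left\{(\alpha + \theta - \alpha\theta)^2C_z^2 - 2(\alpha + \theta - \alpha\theta)\rho_{yz}C_yC_z\right\}\right], \tag{2.7}$$

where

$$f_3 = f_1 - f_2 = \left(\frac{1}{n} - \frac{1}{n'}\right).$$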

Minimization of (2.7) with respect to $\alpha$ yields its optimum value as

$$\alpha_{opt} = \frac{K_{yz} - \theta}{1 - \theta}, \tag{2.8}$$

where

$$K_{yz} = \rho_{yz}\frac{C_y}{C_z}.$$

Substitution of (2.8) in (2.7) yields the minimum value of $\mathrm{MSE}(t_i^*)$ as

$$\min \mathrm{MSE}(t_i^*) = M_0 = \bar{Y}^2\left[f_1C_y^2 + f_3(C_x^2 - 2\rho_{yx}C_yC_x) - f_2\rho_{yz}^2C_y^2\right]. \tag{2.9}$$
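The optimum (2.8) and the minimum (2.9) can be checked numerically. The sketch below evaluates $E[(e_0 - e_1 + e_1' - (\alpha + \theta - \alpha\theta)e_2')^2]$ directly from the second moments listed before (2.3), then compares the value at $\alpha_{opt}$ with the closed form. The parameter values are those of Section 4; using the $\theta$ of $t_4$, the function name and the variable names are choices made for this example.

```python
def rel_mse_combined(alpha, theta, p):
    """MSE(t*_i) / Ybar^2 from (2.6), as a quadratic form in (e0, e1, e1', e2')."""
    lam = alpha + theta - alpha * theta
    c = [1.0, -1.0, 1.0, -lam]                 # coefficients of e0, e1, e1', e2'
    f1, f2 = p["f1"], p["f2"]
    Cy, Cx, Cz = p["Cy"], p["Cx"], p["Cz"]
    yx = p["rho_yx"] * Cy * Cx
    yz = p["rho_yz"] * Cy * Cz
    xz = p["rho_xz"] * Cx * Cz
    # Second moments E(e_i e_j) of the relative errors
    m = [
        [f1 * Cy**2, f1 * yx,    f2 * yx,    f2 * yz],
        [f1 * yx,    f1 * Cx**2, f2 * Cx**2, f2 * xz],
        [f2 * yx,    f2 * Cx**2, f2 * Cx**2, f2 * xz],
        [f2 * yz,    f2 * xz,    f2 * xz,    f2 * Cz**2],
    ]
    return sum(c[i] * m[i][j] * c[j] for i in range(4) for j in range(4))

p = dict(f1=1 / 7 - 1 / 25, f2=1 / 10 - 1 / 25,
         Cy=0.0546, Cx=0.0526, Cz=0.0488,
         rho_yx=0.7108, rho_yz=0.6932, rho_xz=0.7346)
Z_bar, beta2 = 151.12, 2.6519

theta4 = p["Cz"] * Z_bar / (p["Cz"] * Z_bar + beta2)   # theta for a = C_z, b = beta_2(z)
K_yz = p["rho_yz"] * p["Cy"] / p["Cz"]
alpha_opt = (K_yz - theta4) / (1 - theta4)             # equation (2.8)

mse_at_opt = rel_mse_combined(alpha_opt, theta4, p)
f3 = p["f1"] - p["f2"]
mse_closed_form = (p["f1"] * p["Cy"]**2
                   + f3 * (p["Cx"]**2 - 2 * p["rho_yx"] * p["Cy"] * p["Cx"])
                   - p["f2"] * p["rho_yz"]**2 * p["Cy"]**2)   # (2.9) / Ybar^2
print(alpha_opt, mse_at_opt, mse_closed_form)
```

The terms involving $\rho_{xz}$ cancel in the quadratic form, which is why $\rho_{xz}$ does not appear in (2.9).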


3. Efficiency comparisons

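In this section, the conditions under which the proposed estimator is better than $t_i$ $(i = 1, 2, \ldots, 7)$ are obtained. The MSEs of these estimators, up to the order $o(n^{-1})$, are

$$\mathrm{MSE}(\bar{y}_{rd}) = \bar{Y}^2\left[f_1C_y^2 + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.1}$$

$$\mathrm{MSE}(t_1) = \bar{Y}^2\left[f_1C_y^2 + f_2(C_z^2 - 2\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.2}$$

$$\mathrm{MSE}(t_2) = \bar{Y}^2\left[f_1C_y^2 + f_2(\theta_2^2C_z^2 - 2\theta_2\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.3}$$

$$\mathrm{MSE}(t_3) = \bar{Y}^2\left[f_1C_y^2 + f_2(\theta_3^2C_z^2 - 2\theta_3\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.4}$$

$$\mathrm{MSE}(t_4) = \bar{Y}^2\left[f_1C_y^2 + f_2(\theta_4^2C_z^2 - 2\theta_4\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.5}$$

$$\mathrm{MSE}(t_5) = \bar{Y}^2\left[f_1C_y^2 + f_2(\theta_5^2C_z^2 - 2\theta_5\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.6}$$

$$\mathrm{MSE}(t_6) = \bar{Y}^2\left[f_1C_y^2 + f_2(\theta_6^2C_z^2 - 2\theta_6\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.7}$$

and

$$\mathrm{MSE}(t_7) = \bar{Y}^2\left[f_1C_y^2 + f_2(\theta_7^2C_z^2 - 2\theta_7\rho_{yz}C_yC_z) + f_3(C_x^2 - 2\rho_{yx}C_yC_x)\right] \tag{3.8}$$

where

$$\theta_2 = \frac{\bar{Z}}{\bar{Z} + C_z}, \qquad \theta_3 = \frac{\beta_2(z)\bar{Z}}{\beta_2(z)\bar{Z} + C_z}, \qquad \theta_4 = \frac{C_z\bar{Z}}{C_z\bar{Z} + \beta_2(z)},$$

$$\theta_5 = \frac{\bar{Z}}{\bar{Z} + \sigma_z}, \qquad \theta_6 = \frac{\beta_1(z)\bar{Z}}{\beta_1(z)\bar{Z} + \sigma_z}, \qquad \theta_7 = \frac{\beta_2(z)\bar{Z}}{\beta_2(z)\bar{Z} + \sigma_z}.$$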

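From (2.9) and (3.1), we have

$$\mathrm{MSE}(\bar{y}_{rd}) - M_0 = \bar{Y}^2f_2\rho_{yz}^2C_y^2 \geq 0. \tag{3.9}$$

Also, from (2.9) and (3.2)-(3.8), we have

$$\mathrm{MSE}(t_i) - M_0 = \bar{Y}^2f_2(\theta_iC_z - \rho_{yz}C_y)^2 \geq 0, \qquad (i = 1, 2, \ldots, 7), \tag{3.10}$$

where $\theta_1 = 1$.

Thus, it follows from (3.9) and (3.10) that the suggested estimator under the optimum condition is never less efficient than the two-phase ratio estimator $\bar{y}_{rd}$ or the estimators $t_i$ $(i = 1, 2, \ldots, 7)$.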
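The inequality (3.10) follows by completing the square. As a short worked step, writing $\theta_1 = 1$ so that $t_1$ is covered by the same expression, the $f_1$ and $f_3$ terms cancel in the difference and only the $f_2$ terms remain:

```latex
\begin{align*}
\frac{\mathrm{MSE}(t_i) - M_0}{\bar{Y}^2}
  &= f_2\left(\theta_i^2 C_z^2 - 2\theta_i\rho_{yz}C_yC_z\right) + f_2\rho_{yz}^2C_y^2 \\
  &= f_2\left(\theta_i C_z - \rho_{yz}C_y\right)^2 \;\geq\; 0 .
\end{align*}
```

Equality holds only when $\theta_i = \rho_{yz}C_y/C_z = K_{yz}$, that is, when the constants $a$ and $b$ already happen to give the optimum weight on $\bar{z}'$.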

4. Empirical study

To illustrate the performance of the various estimators of $\bar{Y}$, we consider the data used by Anderson (1958). The variates are

y: Head length of second son,

x: Head length of first son,

z: Head breadth of first son,


$N = 25$, $\bar{Y} = 183.84$, $\bar{X} = 185.72$, $\bar{Z} = 151.12$, $\sigma_z = 7.224$,

$C_y = 0.0546$, $C_x = 0.0526$, $C_z = 0.0488$,

$\rho_{yx} = 0.7108$, $\rho_{yz} = 0.6932$, $\rho_{xz} = 0.7346$, $\beta_1(z) = 0.002$, $\beta_2(z) = 2.6519$.

Consider $n' = 10$ and $n = 7$. We have computed the percent relative efficiency (PRE) of the different estimators of $\bar{Y}$ with respect to the usual estimator $\bar{y}$; the results are compiled in Table 4.1.

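Table 4.1. PRE of different estimators of $\bar{Y}$ with respect to $\bar{y}$

| Estimator | PRE |
|---|---|
| $\bar{y}$ | 100 |
| $\bar{y}_{rd}$ | 122.5393 |
| $t_1$ | 178.8189 |
| $t_2$ | 178.8405 |
| $t_3$ | 178.8277 |
| $t_4$ | 186.3912 |
| $t_5$ | 181.6025 |
| $t_6$ | 122.5473 |
| $t_7$ | 179.9636 |
| $t_i^*$ | 186.6515 |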
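The PRE values can be recomputed from the summary statistics with a few lines of Python. This sketch takes PRE as $100 \cdot \mathrm{Var}(\bar{y})/\mathrm{MSE}$ with $\mathrm{Var}(\bar{y}) = \bar{Y}^2f_1C_y^2$, uses the MSE expressions of Section 3, and covers only a subset of the rows; the variable and function names are assumptions made for the example.

```python
# Population values from Section 4 (Anderson, 1958)
N, n_first, n = 25, 10, 7
Z_bar, sigma_z, beta2 = 151.12, 7.224, 2.6519
Cy, Cx, Cz = 0.0546, 0.0526, 0.0488
rho_yx, rho_yz = 0.7108, 0.6932

f1 = 1 / n - 1 / N
f2 = 1 / n_first - 1 / N
f3 = 1 / n - 1 / n_first

var_mean = f1 * Cy**2                                      # Var(y-bar) / Ybar^2
mse_rd = var_mean + f3 * (Cx**2 - 2 * rho_yx * Cy * Cx)    # (3.1) / Ybar^2

def theta(a, b):
    """theta = a*Zbar / (a*Zbar + b), equation (2.4)."""
    return a * Z_bar / (a * Z_bar + b)

def rel_mse_chain(th):
    """MSE(t_i) / Ybar^2 for a given theta_i, following (3.2)-(3.8)."""
    return mse_rd + f2 * (th**2 * Cz**2 - 2 * th * rho_yz * Cy * Cz)

rel_mse = {
    "y_rd": mse_rd,
    "t1": rel_mse_chain(theta(1, 0)),
    "t4": rel_mse_chain(theta(Cz, beta2)),
    "t5": rel_mse_chain(theta(1, sigma_z)),
    "t*": mse_rd - f2 * rho_yz**2 * Cy**2,                 # (2.9) / Ybar^2
}
pre = {name: 100 * var_mean / m for name, m in rel_mse.items()}
for name, value in pre.items():
    print(f"{name:5s} {value:9.4f}")
```

Because every MSE carries the common factor $\bar{Y}^2$, it cancels in the ratio and $\bar{Y}$ itself is not needed.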

5. Conclusion

We have suggested modified estimators $t_i^*$ $(i = 2, 3, \ldots, 7)$. From Table 4.1, we conclude that the proposed estimators, with a PRE of 186.6515 under the optimum condition, are better than the usual two-phase ratio estimator $\bar{y}_{rd}$, the Chand (1975) chain type ratio estimator $t_1$, the estimator $t_2$ proposed by Singh and Upadhyaya (1995), the Upadhyaya and Singh (2001) estimators $t_i$ $(i = 3, 4)$ and the Singh (2001) estimators $t_i$ $(i = 5, 6, 7)$. For practical purposes, the choice of the estimator depends upon the availability of the population parameter(s).

References

[1] Anderson, T.W., An Introduction to Multivariate Statistical Analysis, John Wiley & Sons, Inc., New York, 1958.

[2] Chand, L., Some ratio type estimators based on two or more auxiliary variables, Unpublished Ph.D. thesis, Iowa State University, Ames, Iowa (USA), 1975.

[3] Das, A.K. and Tripathi, T.P., A class of sampling strategies for population mean using information on mean and variance of an auxiliary character, Proc. of the Indian Statistical Institute Golden Jubilee International Conference on Statistics. Applications and New Directions, Calcutta, 16-19 December 1981, 174-181.

[4] Hidiroglou, M.A. and Sarndal, C.E., Use of auxiliary information for two-phase sampling, Survey Methodology, 24 (1) (1998), 11-20.

[5] Searls, D.T., The utilization of known coefficient of variation in the estimation procedure, Journal of American Statistical Association, 59 (1964), 1125-1126.

[6] Searls, D.T. and Intarapanich, R., A note on an estimator for variance that utilizes the kurtosis, Amer. Stat., 44 (4) (1990), 295-296.

[7] Sen, A.R., Estimation of the population mean when the coefficient of variation is known, Comm. Stat.-Theory Methods, A7 (1978), 657-672.

[8] Singh, G.N., On the use of transformed auxiliary variable in the estimation of population mean in two-phase sampling, Statistics in Transition, 5 (3) (2001), 405-416.

[9] Singh, G.N. and Upadhyaya, L.N., A class of modified chain type estimators using two auxiliary variables in two-phase sampling, Metron, LIII (1995), 117-125.

[10] Singh, H.P., Upadhyaya, L.N. and Chandra, P., A general family of estimators for estimating population mean using two auxiliary variables in two-phase sampling, Statistics in Transition, 6 (7) (2004), 1055-1077.

[11] Singh, J., Pandey, B.N. and Hirano, K., On the utilization of a known coefficient of kurtosis in the estimation procedure of variance, Ann. Inst. Statist. Math., 25 (1973), 51-55.

[12] Singh, R., Chauhan, P., Sawan, N. and Smarandache, F., Auxiliary information and a priori values in construction of improved estimators, Renaissance High Press, USA, 2007.

[13] Srivastava, S.K. and Jhajj, H.S., A class of estimators using auxiliary information for estimating finite population variance, Sankhya, C, 42 (1980), 87-96.

[14] Upadhyaya, L.N. and Singh, G.N., Chain type estimators using transformed auxiliary variable in two-phase sampling, Advances in Modeling and Analysis, 38 (1-2) (2001), 1-10.

Accepted: 20.04.2009