Thesis MSC Aluwesi librarydigilib.library.usp.ac.fj/gsdl/collect/usplibr1/... · iv Acknowledgement First of all I would like to thank the Almighty God for his continual blessing


USE OF MATHEMATICAL PROGRAMMING IN

STRATIFICATION WITH COST CONSTRAINT

by

Aluwesi Volau Fonolahi

A thesis submitted in fulfilment of the requirements for the degree of

Masters of Science in Mathematics.

Copyright © 2015 by Aluwesi Fonolahi

School of Computing, Information and Mathematical Sciences

Faculty of Science, Technology and Environment

The University of the South Pacific

Suva, Fiji Islands.

September, 2015

Contents

Acknowledgement
Abstract
Preface

1. Introduction
   1.1 Survey
   1.2 Stratified Random Sampling
   1.3 Mathematical Programming Problem
   1.4 The Dynamic Programming Technique
   1.5 The Review of the Literature

2. Determination of the Optimum Strata Boundaries with Cost Constraint
   2.1 Introduction
   2.2 Formulation of the Problem of Determining the OSB as an MPP
   2.3 Solution Procedure Using Dynamic Programming Technique

3. Determination of the Optimum Strata Boundaries for a Population with Exponential Study Variable
   3.1 Exponential Distribution
   3.2 Formulation of the Problem of Determining the OSB for Exponential Variable
   3.3 Numerical Illustration of the Solution Procedure

4. Determination of the Optimum Strata Boundaries for a Population with Right-Triangular Study Variable
   4.1 Right-Triangular Distribution
   4.2 Formulation of the Problem of Determining the OSB for Right-Triangular Variable
   4.3 Numerical Illustration of the Solution Procedure

5. Determination of the Optimum Strata Boundaries for a Population with Cauchy Study Variable
   5.1 Cauchy Distribution
   5.2 Formulation of the Problem of Determining the OSB for Cauchy Variable
   5.3 Numerical Illustration of the Solution Procedure

6. Determination of the Optimum Strata Boundaries for a Population with Power Study Variable
   6.1 Power Distribution
   6.2 Formulation of the Problem of Determining the OSB for Power Variable
   6.3 Numerical Illustration of the Solution Procedure

7. Conclusion

8. Bibliography

9. Appendix
   A. The C++ Program Created to Determine the OSB with cost factor for Exponential Distribution
   B. The C++ Program Created to Determine the OSB with cost factor for Right-Triangular Distribution
   C. The C++ Program Created to Determine the OSB with cost factor for Standard Cauchy Distribution
   D. The C++ Program Created to Determine the OSB with cost factor for Power Distribution

List of Tables

3.1 OSW, OSB and Optimum Value of the Objective Function for Exponential Distribution
4.1 OSW, OSB and Optimum Value of the Objective Function for Right-Triangular Distribution
5.1 OSW, OSB and Optimum Value of the Objective Function for Cauchy Distribution
6.1 OSW, OSB and Optimum Value of the Objective Function for Power Distribution

Acknowledgement

First of all I would like to thank the Almighty God for his continual blessing to my life.

This thesis is dedicated to my husband Mr Jale Kotobalavu Fonolahi and my four children, Joana Agnes Volau Sorowale, Susana Elizabeth Takaiwai Fonolahi, John Rabici Sakai Fonolahi and Loata Talei Sorowale Fonolahi. Without their support, understanding and encouragement, I would not have completed this thesis.

There are a number of people whom I would also like to thank for helping me to complete this thesis.

• My parents Mr Viliame Sorowale and Mrs Loata Lutu Volau Sorowale for their guidance, encouragement and prayers.

• My supervisor, Dr. M.G.M Khan, who has guided me through the process of research and scholarly writing. I would like to express my sincere gratitude for his valuable suggestions, motivation and counselling in this meritorious task.

• Mr Karuna Reddy and Mr Shalvindra Prasad for answering my queries in programming.

• My sponsors, “The Itaukei Scholarship Unit”, for providing financial support towards my studies.

• My friends who have helped me in any way to complete this thesis.

Abstract

The aim of survey design is to obtain maximum precision at minimum cost. To achieve

this, stratified random sampling is one of the commonly used sampling techniques in

designing a survey. While using stratified sampling, the problem of stratification, that is,

determining optimum strata boundaries (OSB) is one of the main problems encountered

by survey designers. Many authors have proposed methods of determining the OSB under the assumption that the total sample size is fixed, ignoring the fact that the cost of measurement per unit may vary from stratum to stratum. This research is

an attempt to determine the OSB when the budget of the targeted survey is fixed in

advance and the measurement cost per unit varies across the strata. The problem is

formulated as a mathematical programming problem and solved to obtain the optimum

strata width, which is then used to calculate the optimum strata boundaries. The

formulated mathematical programming problem, being a multistage problem, is solved by developing a dynamic programming technique. Numerical examples using the

population in which the stratification variables follow the exponential distribution, right-

triangular distribution, Cauchy distribution and the power distribution are presented to

illustrate the procedure developed in this thesis.

Preface

This thesis, entitled “Use of Mathematical Programming in Stratification with Cost Constraint”, is submitted to The University of the South Pacific, Suva, Fiji, in fulfilment of the requirements for the degree of Master of Science in Mathematics.

When conducting a survey it is ideal to include data from the whole population, but this is usually impossible because it is too expensive and time consuming. For this reason, many sampling techniques have been developed. When applying these techniques, the researcher’s aim is to obtain maximum precision at minimum cost. This means that the results obtained from the sample should be as close as possible to those of the population.

One commonly used survey technique is stratified random sampling, which increases the precision of the estimates. When conducting a survey using stratified sampling, one of the main factors to consider is the determination of the optimum strata boundaries (OSB), known as optimum stratification. The strata boundaries chosen should ensure that the units inside a stratum are as homogeneous as possible. This thesis looks at constructing the OSB when the cost factor is taken into account, that is, when the total budget of the survey is fixed in advance and the measurement cost per unit varies across strata.

This thesis has seven chapters. Chapter 1 consists of the introduction which explains the

purpose of conducting a survey and sampling. It explains why the stratified random

sampling technique has been very popular when compared to other sampling techniques

and the factors that need to be considered when conducting stratified random sampling.

This chapter also describes a mathematical programming problem (MPP) as a technique

of finding the optimum values of an objective function given a set of constraints. Also in

this chapter the dynamic programming technique is explained as a method used to solve

complex optimization problems. The chapter ends with a literature review of the different

methods of determining the OSB.

In Chapter 2, the problem of determining the OSB with cost factor is considered. First, a brief introduction to the role of the cost factor in sampling is given. Next, the formulation of the problem as an MPP with cost factor is presented. The formulated MPP is

reconsidered as an equivalent MPP of determining optimum strata width, which is then

solved for obtaining OSB. Lastly, the solution procedure using a dynamic programming

technique is described.

In surveys the main stratification variables may follow different types of distributions. It is often assumed that the data follow a normal distribution, but in reality this is frequently not the case. In many surveys, for example in engineering, business and economics, stratification variables have a distribution different from the normal distribution. For this reason, it is important to consider different types of distributions when determining the OSB of a stratification variable.

Thus, in Chapter 3 the problem of determining the OSB with cost factor for a population

with exponentially distributed stratification variable is considered. The chapter begins

with an introduction of the exponential distribution. Then the MPP for determining the

optimum strata width is described. This is then solved by developing the solution

procedure using a dynamic programming technique. A computer program coded in C++

was created to execute the solution procedure giving both the optimum strata widths and

the OSB as the output. The work carried out in Chapter 3 was presented at the IEEE Asia Pacific World Congress on Computer Science and Engineering (APWC on CSE), held on 4 and 5 November 2014 at Plantation Island, Fiji (see Fonolahi and Khan, 2014).

Chapters 4, 5 and 6 look at determining the OSB with cost factor for the stratification

variables, respectively, with right-triangular distribution, Cauchy distribution and power

distribution.

Finally, Chapter 7 gives a brief conclusion of this research work, followed by a

comprehensive list of references in the bibliography and the code for the C++ computer

programs in the appendix.

Chapter 1

Introduction

______________________________

1.1 Survey

Survey is the process of collecting data that aids in decision making for development.

These data are analyzed and findings are interpreted from the results of the analyses. Later

recommendations and conclusions are drawn from the findings.

Survey is a very important component for the development of any country or organization.

It is from survey (such as estimating poverty, agricultural products, etc.) that we get

information on how to improve procedures for development. In government and other

non-profit organizations, before new policies are created, it is very important that a survey

is conducted first to determine whether there is a need to make new policies, what changes

are needed from the existing policies and what are the advantages of having new policies

compared to the old ones.

Manufacturers conduct many surveys to determine whether the public likes their products, to estimate the number of products that they are manufacturing and selling, and to find out how they can improve the quality of their products to suit the market and people’s needs. Big businesses also conduct surveys to find the best venue to set up their businesses, how many clients they are likely to have and the type of products that their clients need. The decisions resulting from information obtained in a survey are critical for developing production and marketing policies, so it is very important that the information obtained is as accurate as possible. Otherwise, conducting the survey will be a waste of time, money and resources.

The best way to conduct a survey is to reach out and question each individual in the

population. Yet, if the population is very large, this can be very time consuming and also

very expensive. Due to this, there has been a wide range of sampling techniques developed

to try and minimize the time spent and the cost of the survey.

Survey sampling simply means taking out a small group from the population and the

statistical analysis of this small group is assumed to represent the whole population. Some

common sampling techniques used are simple random sampling, systematic sampling,

quota sampling, stratified random sampling and cluster sampling. Choosing the sample is

a very critical process as we try to ensure that the sample taken gives a statistical result

that represents very closely the analysis given by the population. So after one has chosen

a sampling technique, the next step is to develop a way of using this technique that will

make the results obtained to be as precise as possible. The sampling technique that will be

looked at in this research project is stratified random sampling.

1.2 Stratified Random Sampling

In stratified random sampling, a population of size $N$ is divided into smaller non-overlapping groups of sizes $N_1, N_2, \ldots, N_L$ such that $N_1 + N_2 + \cdots + N_L = N$. These $L$ groups are called strata. After the strata have been formed, independent samples of sizes $n_1, n_2, \ldots, n_L$ are drawn from within each stratum by simple random sampling. This method is used when the population units are heterogeneous, meaning that the units vary a lot or have a very wide range. The main aim of stratification is to group similar types of units together, that is, the strata should be as homogeneous as possible.
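The procedure described above can be sketched in C++, the language used for the programs in the appendix. This is only an illustrative sketch and not one of the appendix programs: the function names `stratify`, `drawSample` and `stratifiedMean` are hypothetical, the cut points are assumed to be supplied in ascending order, and the estimator shown is the usual weighted mean with weights equal to the stratum size divided by the population size.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Split a population into L strata using L-1 ascending cut points (assumed given).
// A value x falls in stratum h (0-based) when boundaries[h-1] < x <= boundaries[h].
std::vector<std::vector<double>> stratify(const std::vector<double>& population,
                                          const std::vector<double>& boundaries) {
    std::vector<std::vector<double>> strata(boundaries.size() + 1);
    for (double x : population) {
        // The first cut point that is >= x identifies the stratum index.
        auto it = std::lower_bound(boundaries.begin(), boundaries.end(), x);
        strata[static_cast<std::size_t>(it - boundaries.begin())].push_back(x);
    }
    return strata;
}

// Simple random sample without replacement of (at most) n units from one stratum.
std::vector<double> drawSample(const std::vector<double>& stratum, std::size_t n,
                               std::mt19937& rng) {
    std::vector<double> out;
    std::sample(stratum.begin(), stratum.end(), std::back_inserter(out), n, rng);
    return out;
}

// Stratified estimate of the population mean: sum over strata of
// (stratum size / population size) * (sample mean of that stratum).
double stratifiedMean(const std::vector<std::vector<double>>& strata,
                      const std::vector<std::vector<double>>& samples) {
    std::size_t total = 0;
    for (const auto& s : strata) total += s.size();
    double estimate = 0.0;
    for (std::size_t h = 0; h < strata.size(); ++h) {
        if (samples[h].empty()) continue;  // nothing drawn from this stratum
        double sum = 0.0;
        for (double y : samples[h]) sum += y;
        double weight = static_cast<double>(strata[h].size()) / total;
        estimate += weight * (sum / samples[h].size());
    }
    return estimate;
}
```

The sketch needs a C++17 compiler for `std::sample`. Calling `stratify` with the chosen boundaries, then `drawSample` once per stratum, gives the independent samples from which `stratifiedMean` computes the estimate.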

There are many reasons why stratified random sampling is chosen over the other methods.

Some reasons are:

1. The researcher ensures the representation of all the different subgroups in the

population, especially, in cases where we have extreme ends. An example is, if

there is a survey to be conducted on the average income of people working in a

company, then strata could be divided according to different categories of

positions in the company, i.e. executive positions, senior positions, officers,

assistants and interns. In this way, the employees within a group are homogenous

with respect to salary and also all the groups of salaries are represented in the

sample, even the executive position with high salary, although there are very few

people holding these positions.

2. There is increased precision obtained from the result since the subgroups are more

homogeneous.

3. The researcher is able to obtain individual results for each stratum and compare

the results for each stratum, especially if they are interested in finding out more

information or highlights about a particular subgroup. This can also set a platform

for further research. For example, in an agricultural survey on the right time to pick oranges so that they produce sweet juice, the strata can be divided according to the different types of oranges.

4. Since the variability of data within each stratum is less than that of the entire population, stratified random sampling usually requires a smaller sample. Because each stratum contains very similar data, the results obtained are similar whether the sample is large or small. Thus, choosing a smaller sample saves a lot of money, time and energy.

5. Stratification has an administrative advantage because interviewers can be trained to interview a specific group. This helps the interviewer to study and manage a specific group, which is easier than focusing on a very wide range of the population. It saves the money and time needed to train each interviewer in the different skills required for the whole population, and it helps to ensure that rich information is obtained from the interviews. For example, if research has to be conducted among people who speak different languages, the strata can be divided according to these languages and each interviewer can be assigned to the stratum whose language they know, instead of training every interviewer in all the languages.

It is important to note that if stratification is not done properly then the results obtained

will be unreliable. Thus to obtain maximum precision in estimate of the study variable

when using stratification, one should consider the following factors:

1. The choice of the stratification variables.

2. The determination of the number of strata.

3. The determination of the optimum strata boundaries.

4. The determination of the optimum sample size to be selected from within each

stratum.

This research project will specifically look at the problem (3) above.

1.3 Mathematical Programming Problem

A Mathematical Programming Problem (MPP) can be stated as a technique of finding the

optimum solution of an objective function from all feasible values given by a set of

constraints. The general form of an MPP is given as:

Maximize (or minimize): $Z = f(x_1, x_2, x_3, \ldots, x_n)$

Subject to $g_i(x_1, x_2, x_3, \ldots, x_n) \; \{\le, =, \ge\} \; 0;\ i = 1, 2, 3, \ldots, m$ and $x_j \ge 0;\ j = 1, 2, 3, \ldots, n$.

All the functions in the MPP above are assumed to be continuously differentiable unless stated otherwise, and for each $i$ only one of the signs $\le$, $=$, $\ge$ holds true. An MPP may

consist of one or more objective functions and constraints of several decision variables.

The equation that expresses the system response as a function of decision variables is

referred to as the objective function. The objective function is the function that we wish

to maximize or minimize. In the general form above the objective function is given as:

$Z = f(x_1, x_2, x_3, \ldots, x_n)$. The decision variables ($x_j \ge 0;\ j = 1, 2, 3, \ldots, n$) are variables

which can be controlled and which influence the performance of the system. In most

situations some values of the decision variables are not possible. When there are

limitations on resources required to implement a system, they will be expressed as constraint equations. In the general form above the constraint functions are given as $g_i(x)$. The aim of solving the MPP is to find the values of the decision variables that give

the optimum solution for a particular objective function given a set of constraints. Thus

when solving the MPP one can locate value(s) of the decision variable(s) that will result

in the “best” (optimum) system in view of limited resources available.

If both the objective function and the constraints consist of all linear functions then the

MPP is a Linear Programming Problem (LPP). In more complex situation where nonlinear

functions are involved then the MPP becomes a Nonlinear Programming Problem (NLPP).
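To make the distinction concrete, the following small worked example contrasts an LPP and an NLPP that share the same constraint. The two problems are hypothetical and chosen only for illustration; they are written in the general form with $g_1(x_1, x_2) = x_1 + x_2 - 4 = 0$.

```latex
\begin{align*}
\text{LPP:}\quad & \text{Minimize } Z = x_1 + 2x_2
  \quad \text{subject to } x_1 + x_2 - 4 = 0,\; x_1, x_2 \ge 0. \\
 & \text{Substituting } x_2 = 4 - x_1 \text{ gives } Z = 8 - x_1,
   \text{ minimized at } (x_1, x_2) = (4, 0) \text{ with } Z = 4. \\[4pt]
\text{NLPP:}\quad & \text{Minimize } Z = x_1^2 + x_2^2
  \quad \text{subject to } x_1 + x_2 - 4 = 0,\; x_1, x_2 \ge 0. \\
 & \text{Substituting } x_2 = 4 - x_1 \text{ gives } Z = 2x_1^2 - 8x_1 + 16,
   \text{ minimized at } x_1 = x_2 = 2 \text{ with } Z = 8.
\end{align*}
```

In the linear case the optimum lies at a corner of the feasible region, whereas in the nonlinear case it lies in the interior of the feasible line segment, which is one reason why the two classes need different solution techniques.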

The MPP has received great attention from researchers in the fields of mathematics, economics and operations research. An advantage of the MPP is that one can easily manipulate the variables, parameters and constraints, or even change the objective function.

The MPP has grown into many branches depending on the nature of the objective function,

the constraints and the decision variables. Some branches of MPP are Integer

Programming Problem (IPP), Quadratic Programming Problem (QPP), Convex

Programming Problem (CPP), Separable Programming Problem (SPP), Multi Objective

Programming Problem (MOPP), Fractional Programming Problem (FPP) and Geometric

Programming Problem (GPP).

The problem of stratification usually involves nonlinear functions so the research carried

out in this thesis will be attempted using NLPP. Then, a Dynamic Programming technique

is used to solve the problem.

1.4 The Dynamic Programming Technique

Dynamic Programming is used to solve complex optimization problems. After the MPP

has been constructed, an appropriate optimization technique must be chosen. This will

depend on the form of the objective function and constraints, the number and nature of the

variables and the kind of computational facilities available. Due to the complexity of the

nature of the problem to be optimized there is a need to make some transformation. The

transformed model preserves the properties of the original model but it is now in a form

that can be easily optimized. This transformation is known as dynamic programming.

Dynamic programming takes a sequential or multistage decision process containing many

interdependent variables and converts it into a series of single-stage problems, each

containing only a single decision variable. This transformation is invariant in that the

number of feasible solutions and the value of the objective function as associated with

each feasible solution are preserved. The transformation is based on Bellman’s (1957)

principle of optimality that:

“An optimal set of decisions has the property that whatever the first decision is, the

remaining decisions must be optimal with respect to the outcome which results from the

first decision.”

Although the principle of optimality seems both obvious and simple, it can more

appropriately be described as powerful, subtle and elusive. We may say that a problem

with N decision variables can be transformed into N sub-problems, each containing only

one decision variable. As a rule of thumb, the computations increase exponentially with the number of variables, but only linearly with the number of sub-problems. Thus there can be great computational savings. Often these savings make the difference between an unsolvable problem and one requiring only a small amount of computer time.

An MPP that has the characteristics listed below can be solved using dynamic

programming techniques.

1. The given MPP may be described as a multistage decision problem, where at each

stage, the value(s) of one or more decision variables are to be determined.

2. The problem must be defined for any number of stages.

3. At each stage, there must be a specified set of parameters describing the state of

the system, that is, the parameters on which the values of the decision variables

and the objective functions depend.

4. The same set of state parameters must describe the state of the system irrespective of the number of stages.

5. The decision at any stage, that is, the determination of the decision variable(s) at

any stage must have no effect on the decisions of the remaining stages except in

changing the values of the parameter which describes the state of the system.
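A problem with these characteristics can be solved stage by stage, as the following C++ sketch illustrates. It is a minimal sketch under stated assumptions and not the program given in the appendix: the function name `solveByDp`, the `DpResult` structure and the stage cost passed in by the caller are all hypothetical. The problem solved is to minimise the sum of the stage costs subject to the decisions being non-negative and adding up to a fixed total, with each decision restricted to a grid of equally spaced values. The state parameter is the amount of the total still available to the first k stages.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

struct DpResult {
    double value;                // optimum value of the objective function
    std::vector<double> widths;  // decision chosen at each stage
};

// Minimise sum_{k=1..stages} cost(k, y_k) subject to sum y_k = total, y_k >= 0,
// with every y_k restricted to multiples of total / gridSteps (an assumption
// made here to keep the state space finite).
DpResult solveByDp(int stages, double total, int gridSteps,
                   const std::function<double(int, double)>& cost) {
    const double step = total / gridSteps;
    const double inf = std::numeric_limits<double>::infinity();
    // best[k][s]: optimum for the first k stages when they share s grid steps.
    std::vector<std::vector<double>> best(stages + 1,
                                          std::vector<double>(gridSteps + 1, inf));
    // choice[k][s]: grid steps given to stage k in that optimum.
    std::vector<std::vector<int>> choice(stages + 1,
                                         std::vector<int>(gridSteps + 1, 0));
    best[0][0] = 0.0;  // zero stages can only use up zero resource

    for (int k = 1; k <= stages; ++k) {
        for (int s = 0; s <= gridSteps; ++s) {
            for (int y = 0; y <= s; ++y) {  // the single decision of stage k
                if (best[k - 1][s - y] == inf) continue;  // infeasible remainder
                double candidate = cost(k, y * step) + best[k - 1][s - y];
                if (candidate < best[k][s]) {
                    best[k][s] = candidate;
                    choice[k][s] = y;
                }
            }
        }
    }

    // Backtrack from the last stage to recover the decisions.
    DpResult result{best[stages][gridSteps], std::vector<double>(stages, 0.0)};
    int s = gridSteps;
    for (int k = stages; k >= 1; --k) {
        result.widths[k - 1] = choice[k][s] * step;
        s -= choice[k][s];
    }
    return result;
}
```

Each pass of the outer loop solves a family of single-variable problems, one for every possible value of the state, so the work grows linearly with the number of stages. For instance, with the hypothetical cost `y * y` at every stage, three stages and a total of 3, the recursion splits the total equally among the stages.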

Certain problem areas, such as inventory theory, allocation, control theory, and chemical

engineering design, have been particularly fertile for dynamic programming applications.

Dynamic Programming was certainly practiced long before it was named. Wald’s (1947)

work on sequential decision theory contains the seed of dynamic programming approach.

The two papers by Dvoretzky et al. (1952), on inventory theory are certainly in the spirit

of dynamic programming.

Undoubtedly, however, Richard Bellman is the father of dynamic programming. His research at the Rand Corporation in the 1950s led to a large number of significant publications on dynamic programming (see Bellman, 1957). He invented the rather undescriptive but alluring name for the approach: dynamic programming. A more representative but less glamorous name would be recursive optimization.

1.5 The Review of the Literature

To obtain the maximum precision of the estimates of population parameters when using

stratified random sampling one of the important problems to consider is the determination

of the optimum strata boundaries (OSB). In practice, the OSB is determined by cutting

the range of the distribution of the study variable at suitable points.

The choice of OSB is important to ensure that the units in each stratum are homogeneous. This means that in order to achieve maximum precision, the stratum variance ($\sigma_h^2$) should be as small as possible for a given type of sample allocation. The problem of determining OSB was first studied by Dalenius (1950), who used the study variable as the stratification variable. He presented a set of minimal equations for finding the OSB. Unfortunately, the minimal equations were difficult to solve because of their implicit nature. Another classical method of obtaining the OSB was given by Dalenius and Gurney (1951), who suggested that the boundary points can be obtained by making $W_h \sigma_h$ constant, where $W_h$ is defined as the weight of the $h$th stratum. However, they found that an explicit solution could not be determined, but they managed to derive some relations which the OSB points must satisfy. So starting with a set of points they proceeded

towards the optimum set by iterative steps, but it was noted that the results can be unreliable for more than two strata. On the other hand, Mahalanobis (1952) and Hansen and Hurwitz (1953) suggested that the stratum boundaries can be obtained by making $W_h \mu_h$ constant, where $\mu_h$ is defined as the stratum mean. Their rule is to form strata with equal stratum totals, under the condition that the coefficients of variation within the strata are equal and remain the same if the strata sizes are adjusted. The advantage of this rule is its simplicity, and it has been claimed that it works well with a large number of real populations.

Some authors proposed approximation rules. Aoyama (1954) suggested that boundaries of equal width should be used, while Ekman (1959) gave the condition that $W_h (x_h - x_{h-1})$ should be constant, where $x_{h-1}$ and $x_h$ are the lower and upper boundaries of the $h$th stratum. In reality the frequency distribution of the study variable is usually not known, so authors such as Dalenius (1957), Taga (1967), Singh and Sukhatme (1969, 1972, 1973), Singh (1971), Singh and Prakash (1975), Mehta et al. (1996), Rizvi et al. (2002) and Gupta et al. (2005) used the frequency distribution of an auxiliary variable and proposed different approximation methods of determining the OSB.

Dalenius and Hodges (1959) first proposed constructing the OSB by dividing the cumulative square root of the frequency into equal intervals. This method was tested by Cochran (1977), who mentioned that it works well, especially when the regression of $y$ on $x$ is linear and the correlation coefficient $\rho$ is nearly perfect. It was noted that the disadvantages of this rule are that the break points of the intervals and the number of initial class intervals are arbitrary. Due to these disadvantages, authors such as Singh and Sukhatme (1969, 1972, 1973), Singh (1971), Singh and Prakash (1975), Mehta et al. (1996), Rizvi et al. (2002) and Serfling (1968) tried to modify the Dalenius and Hodges (1959) rule in some way. On the other hand, Cochran (1961), Hess et al. (1966) and Murthy (1967) compared some classical methods of obtaining the OSB and concluded that the Ekman method and the Dalenius and Hodges method worked consistently well.
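As an illustration of the Dalenius and Hodges rule, the following C++ sketch computes the cumulative square root of the class frequencies and places the boundaries at the class limits closest to equally spaced points on that scale. It is a sketch under stated assumptions: the function name `cumSqrtF` and its arguments are hypothetical, the initial class intervals are taken as given, and picking the nearest class limit is one common way of applying the rule to grouped data.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cumulative square-root-of-frequency rule on grouped data.
// classUpper[i] is the upper limit of class i and freq[i] its frequency.
// Returns the L-1 class limits chosen as stratum boundaries.
std::vector<double> cumSqrtF(const std::vector<double>& classUpper,
                             const std::vector<double>& freq, int L) {
    std::vector<double> boundaries;
    if (freq.empty() || classUpper.size() != freq.size()) return boundaries;

    // Running total of sqrt(f) over the classes.
    std::vector<double> cum(freq.size());
    double running = 0.0;
    for (std::size_t i = 0; i < freq.size(); ++i) {
        running += std::sqrt(freq[i]);
        cum[i] = running;
    }

    std::size_t i = 0;
    for (int h = 1; h < L; ++h) {
        // Equal intervals on the cumulative sqrt(f) scale.
        double target = running * h / L;
        // Move to the class whose cumulative value is closest to the target.
        while (i + 1 < cum.size() &&
               std::fabs(cum[i + 1] - target) <= std::fabs(cum[i] - target)) {
            ++i;
        }
        boundaries.push_back(classUpper[i]);
    }
    return boundaries;
}
```

With few or very uneven classes the same class limit can be selected more than once, which reflects the dependence of the rule on the initial class intervals noted above.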

Sethi (1963) came up with another method where he proposed that the boundaries can be calculated from the calculus equations:

$$\frac{\sigma_h^2 + (x_h - \mu_h)^2}{\sigma_h} = \frac{\sigma_{h+1}^2 + (x_h - \mu_{h+1})^2}{\sigma_{h+1}}.$$

Lately, more researchers have moved towards proposing algorithms to determine the OSB, such as Unnithan (1978), Lavallee and Hidiroglou (1988), Niemiro (1999), Nicolini (2001), Lednicki and Wieczorkowski (2003) and Kozak (2004). One such method was proposed by Bühler and Deutler (1975), where the determination of the OSB is formulated as an optimization problem and solved using a dynamic programming technique. This method was reviewed by Khan et al. (2008). Lavallee (1988, 1988) also used this approach, where the OSB divide the population domain of two stratification variables into distinct subsets such that the precision of the variable of interest is maximized.

Khan et al. (2002) proposed a technique to obtain the exact value of the OSB when the frequency distribution of the study variable is given and the number of strata is fixed in advance, by formulating the problem as an MPP and solving it using the dynamic programming approach of Bühler and Deutler. Later, Khan et al. (2002, 2003, 2005, 2008, 2008, 2014) and Nand and Khan (2005a, 2005b) applied these procedures to determine the OSB for other populations with different types of frequency distributions, such as uniform, right-triangular, exponential, triangular, standard normal, Cauchy, power and log-normal. Khan et al. (2015) also looked at determining the OSB for skewed populations using auxiliary information that follows the gamma distribution. They found the OSB by treating the problem as one of determining the optimum strata widths (OSW), which they formulated as an MPP and solved using the dynamic programming method.

This research is an attempt to determine the OSB for populations in which the study variable follows the exponential, right-triangular, Cauchy and power distributions, respectively. The OSB for these populations are obtained by extending the problems discussed in Khan et al. (2002, 2005) to take cost constraints into account, as discussed in the subsequent chapters.


Chapter 2

Determination of the Optimum

Strata Boundaries with Cost

Constraint

______________________________

2.1 Introduction

In the literature so far, optimum stratification has been discussed only in terms of a given total sample size. In practice, however, in many surveys the total budget is fixed in advance and the cost of measurement per unit varies across the strata. Thus, the OSB obtained for a given total sample size may not remain optimum for a given cost. It is therefore important to consider the problem of determining the OSB under a cost constraint.

2.2 Formulation of the Problem of Determining the

OSB as an MPP

In stratified random sampling, where the population is divided into $L$ strata, an unbiased estimate of the population mean $\bar{Y}$ is given by

$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h \tag{2.1}$$

with variance

$$V(\bar{y}_{st}) = \sum_{h=1}^{L} \left(\frac{1}{n_h} - \frac{1}{N_h}\right) W_h^2 S_h^2, \tag{2.2}$$


where $W_h = N_h/N$ is the proportion of the population contained in the $h$th stratum, $\bar{y}_h$ is the mean of a sample of size $n_h$ and $S_h^2$ is the variance of the $h$th stratum.

The total cost $C$ of a survey may be expressed as

$$C = c_0 + \sum_{h=1}^{L} c_h n_h, \tag{2.3}$$

where $c_0$ represents the overhead cost, which usually covers administration and the training of interviewers. The term $c_h$ gives the cost of collecting information per unit in the $h$th stratum. This usually comprises the cost of travelling to conduct the interview and the cost of the interview itself, and it usually differs from stratum to stratum. One reason is the different distances the interviewer needs to travel. For example, in one stratum the houses where the interviews are to be conducted may be very close together, so the cost of travelling is low, whereas in another stratum the houses may be far apart, so the cost of travelling is high.

Then, the problem of determining the optimum allocation of sample sizes $n_h$; $h = 1, 2, \dots, L$, for which $V(\bar{y}_{st})$ in (2.2) is minimum for a fixed total cost $C$, is given by

$$\begin{aligned} \text{Minimize} \quad & V(\bar{y}_{st}) = \sum_{h=1}^{L} \left(\frac{1}{n_h} - \frac{1}{N_h}\right) W_h^2 S_h^2, \\ \text{subject to} \quad & c_0 + \sum_{h=1}^{L} c_h n_h = C. \end{aligned} \tag{2.4}$$

Solving the problem stated in (2.4) using the Lagrange multiplier technique, the optimum allocation $n_h$ is obtained as

$$n_h = \frac{(C - c_0)\, W_h S_h / \sqrt{c_h}}{\sum_{h=1}^{L} W_h S_h \sqrt{c_h}}. \tag{2.5}$$
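As a quick numerical illustration of the allocation rule (2.5), the following Python sketch computes $n_h$ for a hypothetical three-stratum survey. The function name `optimum_allocation` and all input values are assumptions made for illustration only; they are not taken from the thesis.

```python
import math


def optimum_allocation(W, S, c, total_cost, overhead):
    """Sample sizes n_h from equation (2.5) for a fixed total cost C.

    W, S and c hold the stratum weights W_h, standard deviations S_h and
    per-unit costs c_h; total_cost is C and overhead is c_0.
    """
    denom = sum(w * s * math.sqrt(ch) for w, s, ch in zip(W, S, c))
    budget = total_cost - overhead
    return [budget * w * s / (math.sqrt(ch) * denom) for w, s, ch in zip(W, S, c)]


# Hypothetical inputs (illustrative values only).
W = [0.5, 0.3, 0.2]
S = [2.0, 4.0, 8.0]
c = [1.0, 4.0, 9.0]
n = optimum_allocation(W, S, c, total_cost=1000.0, overhead=100.0)

# By construction the allocation spends exactly the variable budget C - c_0.
spent = sum(ch * nh for ch, nh in zip(c, n))
print([round(nh, 2) for nh in n], round(spent, 6))
```

The values returned are real numbers; in an actual survey they would still have to be rounded to integer sample sizes.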

Substituting (2.5) in (2.2), the variance under this optimum allocation is given by

$$V(\bar{y}_{st}) = \frac{\left(\sum_{h=1}^{L} W_h S_h \sqrt{c_h}\right)^2}{C - c_0} - \sum_{h=1}^{L} \frac{W_h S_h^2}{N}. \tag{2.6}$$

If the finite population correction is ignored, then minimizing the expression on the right-hand side of (2.6) is equivalent to minimizing

$$\sum_{h=1}^{L} W_h S_h \sqrt{c_h}. \tag{2.7}$$

Assume that the stratification variable $x$ has a continuous frequency function $f(x)$, $a \le x \le b$, that the population is divided into $L$ strata, and that $x_0 = a$ and $x_L = b$ are the initial and final values of the distribution. Then the problem of determining the OSB is to cut the range of the distribution, $d$, that is,

$$x_L - x_0 = d, \tag{2.8}$$

at the intermediate points $x_1 \le x_2 \le \dots \le x_{L-1}$ such that the expression in (2.7) is minimum.

With a known frequency function $f(x)$ of the stratification variable $x$, the values of $W_h$ and $S_h$ in (2.7) are obtained by

$$W_h = \int_{x_{h-1}}^{x_h} f(x)\,dx, \tag{2.9}$$

$$S_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x^2 f(x)\,dx - \mu_h^2, \tag{2.10}$$

where

$$\mu_h = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x f(x)\,dx \tag{2.11}$$

and $(x_{h-1}, x_h)$ are the boundaries of the $h$th stratum.

So (2.7) can be expressed as a function of the boundary points $(x_{h-1}, x_h)$.
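When no closed form is available, the integrals (2.9) to (2.11) can be approximated numerically. The sketch below is one possible way to do this in Python using a composite Simpson rule; the helper names `simpson` and `stratum_moments` and the uniform test density are assumptions chosen for illustration, not part of the thesis.

```python
def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def stratum_moments(f, x_lo, x_hi):
    """Return (W_h, mu_h, S_h^2) of (2.9), (2.11), (2.10) for the stratum (x_lo, x_hi)."""
    W = simpson(f, x_lo, x_hi)
    mu = simpson(lambda x: x * f(x), x_lo, x_hi) / W
    S2 = simpson(lambda x: x * x * f(x), x_lo, x_hi) / W - mu * mu
    return W, mu, S2


# Illustrative check with a uniform density on [0, 1] and the stratum (0, 0.5):
# the weight is 0.5, the mean 0.25 and the variance 0.5**2 / 12.
W, mu, S2 = stratum_moments(lambda x: 1.0, 0.0, 0.5)
print(W, mu, S2)
```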


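Let

$$f_h(x_{h-1}, x_h) = W_h S_h \sqrt{c_h}. \tag{2.12}$$

Then the problem of obtaining the OSB can be expressed as:

Find $x_1, x_2, \dots, x_{L-1}$ that

$$\begin{aligned} \text{Minimize} \quad & \sum_{h=1}^{L} f_h(x_{h-1}, x_h) \\ \text{subject to} \quad & a = x_0 \le x_1 \le x_2 \le \dots \le x_{L-1} \le x_L = b. \end{aligned} \tag{2.13}$$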

Let

$$l_h = x_h - x_{h-1} \tag{2.14}$$

denote the width of the $h$th stratum $(h = 1, 2, \dots, L)$.

With the above definition of $l_h$, the range of the distribution in equation (2.8) is expressed as a function of the stratum widths as:

$$\sum_{h=1}^{L} l_h = \sum_{h=1}^{L} (x_h - x_{h-1}) = x_L - x_0 = d. \tag{2.15}$$

The $h$th stratification point $x_h$; $h = 1, 2, \dots, L-1$, is then expressed as

$$x_h = x_0 + l_1 + l_2 + \dots + l_h = x_{h-1} + l_h. \tag{2.16}$$

Adding (2.15) as a new constraint, the problem of determining the OSB can be treated as the problem of determining the optimum strata widths $l_1, l_2, \dots, l_L$ and can be expressed as the following Mathematical Programming Problem (MPP):

$$\begin{aligned} \text{Minimize} \quad & \sum_{h=1}^{L} f_h(l_h, x_{h-1}) \\ \text{subject to} \quad & \sum_{h=1}^{L} l_h = d \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, L. \end{aligned} \tag{2.17}$$

Initially, $x_0$ is known. Therefore, the first term $f_1(l_1, x_0)$ in the objective function of (2.17) is a function of $l_1$ alone. Once $l_1$ is known, the second term $f_2(l_2, x_1) = f_2(l_2, x_0 + l_1)$ becomes a function of $l_2$ alone, and so on. Due to this special nature, the objective function of the MPP (2.17) may be treated as a function of the $l_h$ alone and the problem can be expressed as:

$$\begin{aligned} \text{Minimize} \quad & \sum_{h=1}^{L} f_h(l_h) \\ \text{subject to} \quad & \sum_{h=1}^{L} l_h = d \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, L. \end{aligned} \tag{2.18}$$

2.3 Solution Procedure Using Dynamic Programming

Technique

The problem (2.18) is a multistage decision problem in which the objective function and the constraint are sums of separable functions of $l_h$; $h = 1, 2, \dots, L$. Due to this separable characteristic and the nature of the problem, the MPP (2.18) may be solved using a dynamic programming technique (see Khan et al., 2008). Dynamic programming determines the optimum solution of a multi-variable problem by decomposing it into stages, each stage comprising a single-variable sub-problem. A dynamic programming model is essentially a recursive solution procedure based on Bellman's principle of optimality (Bellman, 1957). The recursive equation links the different stages of the problem in a manner which guarantees that each stage's optimum feasible solution is also optimal and feasible for the entire problem (see Taha 1997, chapter 10).

Consider a sub-problem of (2.18) for the first $k$ $(< L)$ strata:

$$\begin{aligned} \text{Minimize} \quad & \sum_{h=1}^{k} f_h(l_h) \\ \text{subject to} \quad & \sum_{h=1}^{k} l_h = d_k \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, k, \end{aligned} \tag{2.19}$$

where $d_k < d$ is the total width available for division into $k$ strata.

Note that $d_k = d$ when $k = L$.

The transformation functions are given by

$$\begin{aligned} d_k &= l_1 + l_2 + \dots + l_k \\ d_{k-1} &= l_1 + l_2 + \dots + l_{k-1} = d_k - l_k \\ d_{k-2} &= l_1 + l_2 + \dots + l_{k-2} = d_{k-1} - l_{k-1} \\ &\ \ \vdots \\ d_2 &= l_1 + l_2 = d_3 - l_3 \\ d_1 &= l_1 = d_2 - l_2. \end{aligned}$$

Let $f(k, d_k)$ denote the minimum value of the objective function of (2.19), that is,

$$f(k, d_k) = \min\left[\sum_{h=1}^{k} f_h(l_h) \;\middle|\; \sum_{h=1}^{k} l_h = d_k, \text{ and } l_h \ge 0;\ h = 1, 2, \dots, k\right]. \tag{2.20}$$

With the above definition of $f(k, d_k)$, the MPP (2.18) is equivalent to finding $f(L, d)$ recursively by finding $f(k, d_k)$ for $k = 1, 2, \dots, L$ and $0 \le d_k \le d$.

We can write:

$$f(k, d_k) = \min\left[f_k(l_k) + \sum_{h=1}^{k-1} f_h(l_h) \;\middle|\; \sum_{h=1}^{k-1} l_h = d_k - l_k, \text{ and } l_h \ge 0;\ h = 1, 2, \dots, k\right].$$

For a fixed value of $l_k$; $0 \le l_k \le d_k$,

$$f(k, d_k) = f_k(l_k) + \min\left[\sum_{h=1}^{k-1} f_h(l_h) \;\middle|\; \sum_{h=1}^{k-1} l_h = d_k - l_k, \text{ and } l_h \ge 0;\ h = 1, 2, \dots, k-1\right].$$

Using Bellman's principle of optimality, we get the recurrence relation of the dynamic programming technique as follows.

For stages $k \ge 2$:

$$f(k, d_k) = \min_{0 \le l_k \le d_k} \left[f_k(l_k) + f(k-1, d_k - l_k)\right]. \tag{2.21}$$

For the first stage (i.e. $k = 1$):

$$f(1, d_1) = f_1(d_1) \implies l_1^* = d_1, \tag{2.22}$$

where $l_1^* = d_1$ is the optimum width of the first stratum. The relations (2.21) and (2.22) are solved recursively for $k = 1, 2, \dots, L$ and $0 \le d_k \le d$, and $f(L, d)$ is obtained. From $f(L, d)$ the optimum width of the $L$th stratum, $l_L^*$, is obtained. From $f(L-1, d - l_L^*)$ the optimum width of the $(L-1)$th stratum, $l_{L-1}^*$, is obtained, and so on until $l_1^*$ is obtained.

The algorithm of the above solution procedure for the MPP (2.18) to determine the OSB is summarized as follows:

Step 1: Start at $k = 1$. Set $f(0, d_0) = 0$.

Step 2: Calculate $f(1, d_1)$, the minimum value of the RHS of (2.22) for $l_1 = d_1$, $0 \le d_1 \le d$.

Step 3: Record $f(1, d_1)$ and $l_1$.

Step 4: For $k = 2$, express the state variable as $d_{k-1} = d_k - l_k$.

Step 5: Set $f(k, d_k) = 0$ if $l_k > d_k$, where $0 \le l_k \le d$.

Step 6: Calculate $f(k, d_k)$, the minimum value of the RHS of (2.21) for $l_k$; $0 \le l_k \le d_k$.

Step 7: Record $f(k, d_k)$ and $l_k$.

Step 8: For $k = 3, \dots, L$, go to Step 4.

Step 9: At $k = L$, $f(L, d)$ is obtained and hence the optimum value $l_L^*$ of $l_L$ is obtained.

Step 10: At $k = L - 1$, using the backward calculation for $d_{L-1} = d - l_L^*$, read the value of $f(L-1, d_{L-1})$ and hence the optimum value $l_{L-1}^*$ of $l_{L-1}$.

Step 11: Repeat Step 10 until the optimum value $l_1^*$ of $l_1$ is obtained from $f(1, d_1)$.
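The steps above can be sketched in code. The following Python function is a minimal illustration, not the thesis's own program: it assumes that the widths are restricted to a uniform grid of `steps` points on $[0, d]$, and the names `solve_osw` and `stage_cost` are hypothetical. The stage cost receives $x_{k-1} = x_0 + d_k - l_k$ so that it can depend on the lower boundary of the stratum. Because the inner loop only visits $0 \le l_k \le d_k$, no separate handling of infeasible widths is needed.

```python
def solve_osw(stage_cost, L, d, x0, steps):
    """Grid-based version of the recurrences (2.21) and (2.22).

    stage_cost(k, l, x_prev) returns f_k for stratum k of width l whose
    lower boundary is x_prev. Returns (objective value, [l_1, ..., l_L]).
    """
    h = d / steps
    # f[k][j] approximates f(k, d_k) for d_k = j*h; arg[k][j] is the index of l_k.
    f = [[0.0] * (steps + 1) for _ in range(L + 1)]
    arg = [[0] * (steps + 1) for _ in range(L + 1)]
    for j in range(steps + 1):              # first stage: l_1 = d_1, as in (2.22)
        f[1][j] = stage_cost(1, j * h, x0)
        arg[1][j] = j
    for k in range(2, L + 1):               # stages k >= 2, as in (2.21)
        for j in range(steps + 1):
            best, best_i = float("inf"), 0
            for i in range(j + 1):          # l_k = i*h with 0 <= l_k <= d_k
                val = stage_cost(k, i * h, x0 + (j - i) * h) + f[k - 1][j - i]
                if val < best:
                    best, best_i = val, i
            f[k][j], arg[k][j] = best, best_i
    widths, j = [0.0] * L, steps            # backward pass (Steps 9 to 11)
    for k in range(L, 0, -1):
        i = arg[k][j]
        widths[k - 1] = i * h
        j -= i
    return f[L][steps], widths


# Toy check with a hypothetical stage cost l**2: the minimum of the sum of
# squares subject to the widths adding up to d = 1 is attained at equal widths.
value, widths = solve_osw(lambda k, l, x_prev: l * l, L=4, d=1.0, x0=0.0, steps=100)
print(value, widths)
```

The accuracy of the result is limited by the grid step $d/\text{steps}$, and the running time grows with $L \cdot \text{steps}^2$, so the grid has to be chosen as a compromise between the two.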


Chapter 3

Determination of the Optimum

Strata Boundaries for a Population

with Exponential Study Variable

______________________________

3.1 Exponential Distribution

In probability theory, the exponential distribution is a single-parameter family of continuous distributions. It is a commonly used model for waiting times between occurrences of events; for example, the lifetimes of electrical or mechanical devices and the waiting time until failure are random variables that are frequently modelled with an exponential distribution.

If the stratification variable $x$ follows the exponential distribution, then its density function is given by:

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}; & x \ge 0 \\ 0; & \text{elsewhere} \end{cases} \tag{3.1}$$

where $\lambda$ is called the rate parameter.

The exponential distribution plays an important role in both queuing theory and reliability problems. It is very useful in modelling the time between arrivals at service facilities and the time to failure of component parts of electrical systems. There are many real-life examples where the exponential distribution is used. For example, individual income, energy consumption and many wealth variables follow an exponential distribution (Banerjee et al., 2006; Banerjee and Yakovenko, 2010).

3.2 Formulation of the Problem of Determining the

OSB for Exponential Variable

Let the stratification variable $x$ follow the exponential distribution with parameter $\lambda > 0$ as given by (3.1).

In practice, actual populations are finite, so taking the largest value of $x$ in the population as $D$, the frequency function given in (3.1) can be approximated as:

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}; & 0 \le x \le D \\ 0; & \text{elsewhere.} \end{cases} \tag{3.2}$$

Note that here $x_0 = 0$ and $x_L = D$. If $D$ is sufficiently large, (3.2) can be considered an approximate exponential density. Otherwise, the truncated exponential density is to be used in the expression.

If the stratification variable $x$ follows the exponential distribution with density function given in (3.2), then the stratum weight $(W_h)$, stratum mean $(\mu_h)$ and stratum variance $(S_h^2)$ can be obtained as functions of the boundary points $(x_{h-1}, x_h)$ by using (2.9), (2.11) and (2.10), respectively, as follows:

$$W_h = \int_{x_{h-1}}^{x_h} f(x)\,dx = \int_{x_{h-1}}^{x_h} \lambda e^{-\lambda x}\,dx.$$

Integrating, substituting the limits $x_h$ and $x_{h-1}$, and writing the result in terms of the stratum width $l_h$ gives

$$W_h = e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right], \tag{3.3}$$

where, from (2.14),

$$x_h = l_h + x_{h-1}. \tag{3.4}$$

The stratum mean $(\mu_h)$ is obtained as follows:

$$\mu_h = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x f(x)\,dx = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x \lambda e^{-\lambda x}\,dx.$$

Thus, substituting the value of $W_h$ from (3.3), we get

$$\mu_h = \frac{1}{e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right]} \int_{x_{h-1}}^{x_h} x \lambda e^{-\lambda x}\,dx.$$

Integrating gives

$$\mu_h = \frac{1}{e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right]} \left[-x e^{-\lambda x} - \frac{e^{-\lambda x}}{\lambda}\right]_{x_{h-1}}^{x_h}.$$

Substituting the value of $x_h$ from (3.4) gives

$$\mu_h = \frac{1}{e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right]} \left[-x e^{-\lambda x} - \frac{e^{-\lambda x}}{\lambda}\right]_{x_{h-1}}^{l_h + x_{h-1}},$$

which reduces to

$$\mu_h = \frac{1}{e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right]} \left[x_{h-1} e^{-\lambda x_{h-1}} + \frac{e^{-\lambda x_{h-1}}}{\lambda} - (l_h + x_{h-1})\, e^{-\lambda (l_h + x_{h-1})} - \frac{e^{-\lambda (l_h + x_{h-1})}}{\lambda}\right].$$

Simplifying the above gives

$$\mu_h = \frac{\left(x_{h-1} + \frac{1}{\lambda}\right)\left(1 - e^{-\lambda l_h}\right) - l_h e^{-\lambda l_h}}{1 - e^{-\lambda l_h}}. \tag{3.5}$$

Similarly, the stratum variance $(S_h^2)$ is found as follows:

$$S_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x^2 f(x)\,dx - \mu_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x^2 \lambda e^{-\lambda x}\,dx - \mu_h^2.$$

Thus, substituting the value of $W_h$ from (3.3) and $\mu_h$ from (3.5), we get

$$S_h^2 = \frac{1}{e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right]} \int_{x_{h-1}}^{x_h} x^2 \lambda e^{-\lambda x}\,dx - \left[\frac{\left(x_{h-1} + \frac{1}{\lambda}\right)\left(1 - e^{-\lambda l_h}\right) - l_h e^{-\lambda l_h}}{1 - e^{-\lambda l_h}}\right]^2.$$

Integrating and substituting $x_h = l_h + x_{h-1}$ gives

$$S_h^2 = \frac{1}{e^{-\lambda x_{h-1}}\left[1 - e^{-\lambda l_h}\right]} \left[-e^{-\lambda x}\left(x^2 + \frac{2x}{\lambda} + \frac{2}{\lambda^2}\right)\right]_{x_{h-1}}^{l_h + x_{h-1}} - \left[\frac{\left(x_{h-1} + \frac{1}{\lambda}\right)\left(1 - e^{-\lambda l_h}\right) - l_h e^{-\lambda l_h}}{1 - e^{-\lambda l_h}}\right]^2.$$

By simplifying further we get

$$S_h^2 = \frac{\left(1 - e^{-\lambda l_h}\right)^2 - \lambda^2 l_h^2 e^{-\lambda l_h}}{\lambda^2 \left(1 - e^{-\lambda l_h}\right)^2}. \tag{3.6}$$
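The closed forms (3.3), (3.5) and (3.6) can be checked against direct numerical integration of (2.9) to (2.11). The Python sketch below does this for one hypothetical stratum; the function names and the chosen values of $\lambda$, $x_{h-1}$ and $l_h$ are assumptions for illustration only.

```python
import math


def exp_stratum(lam, x_prev, l):
    """Closed forms (3.3), (3.5) and (3.6) for the stratum [x_prev, x_prev + l]."""
    q = math.exp(-lam * l)
    W = math.exp(-lam * x_prev) * (1.0 - q)
    mu = ((x_prev + 1.0 / lam) * (1.0 - q) - l * q) / (1.0 - q)
    S2 = ((1.0 - q) ** 2 - lam ** 2 * l ** 2 * q) / (lam ** 2 * (1.0 - q) ** 2)
    return W, mu, S2


def midpoint(g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


lam, x_prev, l = 1.0, 0.5, 1.2            # hypothetical stratum
f = lambda x: lam * math.exp(-lam * x)    # density (3.1)

# Numerical versions of (2.9), (2.11) and (2.10).
W_num = midpoint(f, x_prev, x_prev + l)
mu_num = midpoint(lambda x: x * f(x), x_prev, x_prev + l) / W_num
S2_num = midpoint(lambda x: x * x * f(x), x_prev, x_prev + l) / W_num - mu_num ** 2

W, mu, S2 = exp_stratum(lam, x_prev, l)
print(abs(W - W_num), abs(mu - mu_num), abs(S2 - S2_num))
```

The three printed differences should be very small if the closed forms are consistent with the defining integrals.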

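Thus, substituting (3.3) and (3.6) in the $h$th term of (2.7), we get

$$W_h S_h \sqrt{c_h} = \frac{e^{-\lambda x_{h-1}}}{\lambda} \sqrt{\left[\left(1 - e^{-\lambda l_h}\right)^2 - \lambda^2 l_h^2 e^{-\lambda l_h}\right] c_h}.$$

Then the MPP given in (2.18) to determine the optimum stratum widths, and hence the optimum stratum boundaries, can be expressed using (2.12), (3.3) and (3.6) as

$$\begin{aligned} \text{Minimise} \quad & \sum_{h=1}^{L} \frac{e^{-\lambda x_{h-1}}}{\lambda} \sqrt{\left[\left(1 - e^{-\lambda l_h}\right)^2 - \lambda^2 l_h^2 e^{-\lambda l_h}\right] c_h} \\ \text{subject to} \quad & \sum_{h=1}^{L} l_h = d \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, L, \end{aligned} \tag{3.7}$$

where $d = x_L - x_0$ is the range of the distribution.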

3.3 Numerical Illustration of the Solution Procedure

This section illustrates the computational details of the proposed solution procedure, which uses the dynamic programming technique discussed in Section 2.3 to determine the OSB with varying stratum costs. For the purpose of illustration we assume that $x$ follows the exponential distribution with $\lambda = 1$, $x_0 = 0$ and $x_L = 20$. This implies that $d = 20$. Then the MPP (3.7) reduces to

$$\begin{aligned} \text{Minimise} \quad & \sum_{h=1}^{L} e^{-x_{h-1}} \sqrt{\left[\left(1 - e^{-l_h}\right)^2 - l_h^2 e^{-l_h}\right] c_h} \\ \text{subject to} \quad & \sum_{h=1}^{L} l_h = 20 \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, L. \end{aligned} \tag{3.8}$$

Note that the $(h-1)$th stratification point is obtained by

$$\begin{aligned} x_{h-1} &= x_0 + l_1 + l_2 + \dots + l_{h-1} \\ &= 0 + l_1 + l_2 + \dots + l_{h-1} \\ &= d_{h-1} \\ &= d_h - l_h. \end{aligned}$$

Substituting this value of $x_{h-1}$, the recurrence relations (2.21) and (2.22) for solving the MPP (3.8) reduce to the following.

For the first stage, $k = 1$:

$$f(1, d_1) = \sqrt{\left[\left(1 - e^{-d_1}\right)^2 - d_1^2 e^{-d_1}\right] c_1} \quad \text{at } l_1^* = d_1. \tag{3.9}$$

For the stages $k \ge 2$:

$$f(k, d_k) = \min_{0 \le l_k \le d_k} \left[e^{-(d_k - l_k)} \sqrt{\left[\left(1 - e^{-l_k}\right)^2 - l_k^2 e^{-l_k}\right] c_k} + f(k-1, d_k - l_k)\right], \tag{3.10}$$

because $x_{k-1} = x_0 + l_1 + \dots + l_{k-1} = d_k - l_k$.
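The recurrences (3.9) and (3.10) can be tried out with a short script. The Python sketch below is only an illustration and is not the C++ program of Appendix A: it assumes a coarse uniform grid of step $0.05$ on $[0, 20]$, and the names `stage_cost` and `solve` are hypothetical. For $L = 2$ with $c_1 = 2$ and $c_2 = 3$ it should land close to the first row of Table 3.1 ($l_1^* = 1.467970$, objective $0.8368043$), up to the grid resolution.

```python
import math


def stage_cost(l, x_prev, c):
    """One term of (3.8): exp(-x_{h-1}) * sqrt([(1 - e^{-l})^2 - l^2 e^{-l}] * c_h)."""
    q = math.exp(-l)
    inner = max(0.0, (1.0 - q) ** 2 - l * l * q)   # guard against rounding below zero
    return math.exp(-x_prev) * math.sqrt(inner * c)


def solve(costs, d=20.0, steps=400):
    """Grid version of (3.9) and (3.10); returns (objective, [l_1, ..., l_L])."""
    L, h = len(costs), d / steps
    f_prev = [stage_cost(j * h, 0.0, costs[0]) for j in range(steps + 1)]  # (3.9)
    choice = [list(range(steps + 1))]              # stage 1: l_1 = d_1
    for k in range(1, L):                          # stages 2..L, recurrence (3.10)
        f_cur, arg = [], []
        for j in range(steps + 1):
            best, best_i = float("inf"), 0
            for i in range(j + 1):                 # l_k = i*h, x_{k-1} = d_k - l_k
                val = stage_cost(i * h, (j - i) * h, costs[k]) + f_prev[j - i]
                if val < best:
                    best, best_i = val, i
            f_cur.append(best)
            arg.append(best_i)
        f_prev = f_cur
        choice.append(arg)
    widths, j = [], steps
    for k in range(L - 1, -1, -1):                 # backward pass
        i = choice[k][j]
        widths.append(i * h)
        j -= i
    return f_prev[steps], widths[::-1]


value, widths = solve([2.0, 3.0])
print(round(value, 4), [round(w, 2) for w in widths])
```

A finer grid brings the widths closer to the tabulated values at the price of a running time that grows quadratically with the number of grid points.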


A C++ program (see Appendix A) was coded to solve the recurrence relations (3.9) and (3.10). Executing the program yields the optimum strata widths $l_h^*$ and hence the optimum strata boundaries $x_h^* = x_{h-1}^* + l_h^*$. The results for six different numbers of strata, $L = 2, 3, 4, 5, 6$ and $7$, with different strata measurement costs $c_h$, are presented in Table 3.1.

Table 3.1 OSW, OSB and Optimum Value of the Objective Function for Exponential Distribution

| No. of strata ($L$) | Strata measurement cost ($c_h$) | Optimum strata width ($l_h^*$) | Optimum strata boundaries ($x_h^* = x_{h-1}^* + l_h^*$) | Optimum value of the objective function |
|---|---|---|---|---|
| 2 | $c_1 = 2$; $c_2 = 3$ | $l_1 = 1.467970$; $l_2 = 18.53203$ | $x_1 = 1.467970$ | 0.8368043 |
| 3 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$ | $l_1 = 0.95892$; $l_2 = 1.40568$; $l_3 = 17.6354$ | $x_1 = 0.95892$; $x_2 = 2.36460$ | 0.6177939 |
| 4 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$ | $l_1 = 0.735210$; $l_2 = 0.90241$; $l_3 = 1.37240$; $l_4 = 16.98998$ | $x_1 = 0.735210$; $x_2 = 1.637620$; $x_3 = 3.01002$ | 0.5002674 |
| 5 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$; $c_5 = 6$ | $l_1 = 0.60669$; $l_2 = 0.68336$; $l_3 = 0.87155$; $l_4 = 1.35166$; $l_5 = 16.48674$ | $x_1 = 0.60669$; $x_2 = 1.29005$; $x_3 = 2.16160$; $x_4 = 3.51326$ | 0.4260679 |
| 6 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$; $c_5 = 6$; $c_6 = 7$ | $l_1 = 0.52232$; $l_2 = 0.55854$; $l_3 = 0.65463$; $l_4 = 0.85203$; $l_5 = 1.33748$; $l_6 = 16.07500$ | $x_1 = 0.52232$; $x_2 = 1.08086$; $x_3 = 1.73549$; $x_4 = 2.58752$; $x_5 = 3.92500$ | 0.3745282 |
| 7 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$; $c_5 = 6$; $c_6 = 7$; $c_7 = 8$ | $l_1 = 0.46223$; $l_2 = 0.47719$; $l_3 = 0.53160$; $l_4 = 0.63626$; $l_5 = 0.83853$; $l_6 = 1.32718$; $l_7 = 15.72701$ | $x_1 = 0.46223$; $x_2 = 0.93942$; $x_3 = 1.47102$; $x_4 = 2.10728$; $x_5 = 2.94581$; $x_6 = 4.27299$ | 0.3364002 |


Chapter 4

Determination of the Optimum

Strata Boundaries for a Population

with Right-Triangular Study

Variable

______________________________

4.1 Right-Triangular Distribution

A right-triangular distribution is a family of continuous probability distributions which models phenomena where the most likely outcome (the mode) falls at one end of the range and the least likely outcome falls at the other end. For example, lower incomes are earned by a large portion of families in a society, whereas very few families earn larger incomes. It is defined by two parameters $a$ and $b$, the minimum and maximum values of the variable. For the density considered here, the mode is at the minimum $a$ and the density decreases linearly to zero at the maximum $b$.

The probability density function of a right-triangular distribution is given by the general formula

$$f(x; a, b) = \begin{cases} \dfrac{2(b - x)}{(b - a)^2}; & a \le x \le b \\ 0; & \text{otherwise.} \end{cases} \tag{4.1}$$


4.2 Formulation of the Problem of Determining the

OSB for Right-Triangular Variable

If the stratification variable $x$ follows the right-triangular distribution with density function given in (4.1), then the stratum weight $(W_h)$, stratum mean $(\mu_h)$ and stratum variance $(S_h^2)$ can be obtained as functions of the boundary points $(x_{h-1}, x_h)$ by using (2.9), (2.11) and (2.10), respectively, as derived below.

$$\begin{aligned} W_h &= \int_{x_{h-1}}^{x_h} f(x)\,dx \\ &= \int_{x_{h-1}}^{x_h} \frac{2(b - x)}{(b - a)^2}\,dx \\ &= \frac{2}{(b - a)^2} \int_{x_{h-1}}^{x_h} (b - x)\,dx. \end{aligned}$$

Performing the integration gives

$$\begin{aligned} W_h &= \frac{2}{(b - a)^2} \left[bx - \frac{x^2}{2}\right]_{x_{h-1}}^{x_h} \\ &= \frac{2}{(b - a)^2} \left[\left(b x_h - \frac{x_h^2}{2}\right) - \left(b x_{h-1} - \frac{x_{h-1}^2}{2}\right)\right]. \end{aligned}$$

From (3.4), replacing $x_h$ with $l_h + x_{h-1}$ gives

$$W_h = \frac{2}{(b - a)^2} \left[\left(b (l_h + x_{h-1}) - \frac{(l_h + x_{h-1})^2}{2}\right) - \left(b x_{h-1} - \frac{x_{h-1}^2}{2}\right)\right].$$

Simplifying the above equation gives

$$\begin{aligned} W_h &= \frac{2}{(b - a)^2} \left[b l_h - \frac{l_h^2}{2} - l_h x_{h-1}\right] \\ &= \frac{l_h}{(b - a)^2} \left\{2b - l_h - 2x_{h-1}\right\}. \end{aligned}$$

Finally, substituting $a_h = b - x_{h-1}$ into the equation yields

$$W_h = \frac{l_h (2a_h - l_h)}{(b - a)^2}. \tag{4.2}$$

Using (2.11), the stratum mean $(\mu_h)$ of the right-triangular distribution is obtained as follows:

$$\mu_h = \frac{1}{W_h} \int_{x_{h-1}}^{x_h} x f(x)\,dx = \frac{1}{W_h} \int_{x_{h-1}}^{x_h} x\, \frac{2(b - x)}{(b - a)^2}\,dx.$$

Substituting $W_h$ from (4.2) gives

$$\mu_h = \frac{(b - a)^2}{l_h (2a_h - l_h)} \times \frac{2}{(b - a)^2} \int_{x_{h-1}}^{x_h} (bx - x^2)\,dx.$$

Performing the integration gives

$$\begin{aligned} \mu_h &= \frac{2}{l_h (2a_h - l_h)} \left[\frac{b x^2}{2} - \frac{x^3}{3}\right]_{x_{h-1}}^{x_h} \\ &= \frac{2}{l_h (2a_h - l_h)} \left[\frac{3b x_h^2 - 2x_h^3}{6} - \frac{3b x_{h-1}^2 - 2x_{h-1}^3}{6}\right]. \end{aligned}$$

From (3.4), substituting $x_h = l_h + x_{h-1}$ and expanding results in

$$\begin{aligned} \mu_h &= \frac{3b\left[(l_h + x_{h-1})^2 - x_{h-1}^2\right] - 2\left[(l_h + x_{h-1})^3 - x_{h-1}^3\right]}{3 l_h (2a_h - l_h)} \\ &= \frac{3b\left(l_h^2 + 2 l_h x_{h-1}\right) - 2\left(l_h^3 + 3 l_h^2 x_{h-1} + 3 l_h x_{h-1}^2\right)}{3 l_h (2a_h - l_h)}. \end{aligned}$$

Finally, simplifying the above expression gives

$$\mu_h = \frac{3b\,(l_h + 2x_{h-1}) - 2\left(l_h^2 + 3 l_h x_{h-1} + 3 x_{h-1}^2\right)}{3 (2a_h - l_h)}. \tag{4.3}$$

Similarly, using (2.10), the stratum variance $(S_h^2)$ of the right-triangular distribution is obtained as follows:

$$S_h^2 = \frac{1}{W_h} \int_{x_{h-1}}^{x_h} x^2 f(x)\,dx - \mu_h^2.$$

Substituting the mean $(\mu_h)$ from (4.3) gives

$$S_h^2 = \frac{1}{W_h} \int_{x_{h-1}}^{x_h} x^2\, \frac{2(b - x)}{(b - a)^2}\,dx - \left[\frac{3b\,(l_h + 2x_{h-1}) - 2\left(l_h^2 + 3 l_h x_{h-1} + 3 x_{h-1}^2\right)}{3 (2a_h - l_h)}\right]^2.$$

Substituting the weight $(W_h)$ from (4.2) gives

$$S_h^2 = \frac{2}{l_h (2a_h - l_h)} \int_{x_{h-1}}^{x_h} x^2 (b - x)\,dx - \left[\frac{3b\,(l_h + 2x_{h-1}) - 2\left(l_h^2 + 3 l_h x_{h-1} + 3 x_{h-1}^2\right)}{3 (2a_h - l_h)}\right]^2.$$

Performing the integration gives

$$S_h^2 = \frac{\left(4b x_h^3 - 3x_h^4\right) - \left(4b x_{h-1}^3 - 3x_{h-1}^4\right)}{6\, l_h (2a_h - l_h)} - \left[\frac{3b\,(l_h + 2x_{h-1}) - 2\left(l_h^2 + 3 l_h x_{h-1} + 3 x_{h-1}^2\right)}{3 (2a_h - l_h)}\right]^2.$$

Substituting $x_h = l_h + x_{h-1}$ yields

$$S_h^2 = \frac{4b\,(l_h + x_{h-1})^3 - 3 (l_h + x_{h-1})^4 - 4b x_{h-1}^3 + 3x_{h-1}^4}{6\, l_h (2a_h - l_h)} - \left[\frac{3b\,(l_h + 2x_{h-1}) - 2\left(l_h^2 + 3 l_h x_{h-1} + 3 x_{h-1}^2\right)}{3 (2a_h - l_h)}\right]^2.$$

After expanding and simplifying further, with $a_h = b - x_{h-1}$, we get

$$S_h^2 = \frac{l_h^2 \left(l_h^2 - 6 a_h l_h + 6 a_h^2\right)}{18\, (2a_h - l_h)^2}. \tag{4.4}$$

Then the MPP given in (2.18) to determine the optimum stratum widths, and hence the optimum stratum boundaries, can be expressed using (2.12), (4.2) and (4.4) as

$$\begin{aligned} \text{Minimize} \quad & \sum_{h=1}^{L} \frac{l_h (2a_h - l_h)}{(b - a)^2} \sqrt{\frac{l_h^2 \left(l_h^2 - 6 a_h l_h + 6 a_h^2\right) c_h}{18\, (2a_h - l_h)^2}} \\ \text{subject to} \quad & \sum_{h=1}^{L} l_h = d \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, L, \end{aligned} \tag{4.5}$$

where $d = x_L - x_0$ is the range of the distribution.
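To make the objective of (4.5) concrete, the Python sketch below evaluates it from (4.2) and (4.4) for a given set of widths. The function names `rt_stratum` and `objective` are hypothetical, and the defaults $a = 1$, $b = 2$ are the values used in the numerical illustration of Section 4.3. Evaluated at the widths reported in the $L = 2$ row of Table 4.1, the result should agree with the tabulated objective value $0.1915934$ to within rounding of the reported widths.

```python
import math


def rt_stratum(l, x_prev, a=1.0, b=2.0):
    """Stratum weight (4.2) and variance (4.4) for the stratum [x_prev, x_prev + l]."""
    a_h = b - x_prev
    W = l * (2.0 * a_h - l) / (b - a) ** 2
    S2 = l ** 2 * (l ** 2 - 6.0 * a_h * l + 6.0 * a_h ** 2) / (18.0 * (2.0 * a_h - l) ** 2)
    return W, S2


def objective(widths, costs, a=1.0, b=2.0):
    """Objective of (4.5): sum over the strata of W_h * S_h * sqrt(c_h)."""
    total, x_prev = 0.0, a
    for l, c in zip(widths, costs):
        W, S2 = rt_stratum(l, x_prev, a, b)
        total += W * math.sqrt(S2 * c)
        x_prev += l                      # next lower boundary, as in (2.16)
    return total


# Widths and costs taken from the L = 2 row of Table 4.1.
value = objective([0.39770, 0.60230], [2.0, 3.0])
W1, _ = rt_stratum(0.39770, 1.0)
W2, _ = rt_stratum(0.60230, 1.39770)
print(round(value, 6), round(W1 + W2, 6))
```

The second printed number is the sum of the two stratum weights, which should be 1 because the two strata cover the whole range $[a, b]$.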

4.3 Numerical Illustration of the Solution Procedure

This section illustrates the computational details of the proposed solution procedure, which uses the dynamic programming technique discussed in Section 2.3 to determine the OSB with varying strata measurement costs for the right-triangular distribution. We assume that $a = 1$ and $b = 2$, which gives $a_h = 2 - x_{h-1}$ and $d = 1$. So the MPP (4.5) reduces to

$$\begin{aligned} \text{Minimize} \quad & \sum_{h=1}^{L} \frac{\sqrt{\left(l_h^4 - 6 a_h l_h^3 + 6 a_h^2 l_h^2\right) c_h}}{3\sqrt{2}} \\ \text{subject to} \quad & \sum_{h=1}^{L} l_h = 1 \\ \text{and} \quad & l_h \ge 0;\ h = 1, 2, \dots, L. \end{aligned} \tag{4.6}$$

Note that the $(h-1)$th stratification point is obtained by

$$\begin{aligned} x_{h-1} &= x_0 + l_1 + l_2 + \dots + l_{h-1} \\ &= 1 + l_1 + l_2 + \dots + l_{h-1} \\ &= 1 + d_{h-1} \\ &= 1 + d_h - l_h. \end{aligned}$$

Substituting this value of $x_{h-1}$, so that $a_k = 2 - x_{k-1} = 1 - d_k + l_k$, the recurrence relations (2.21) and (2.22) for solving the MPP (4.6) reduce to the following.

For the first stage, $k = 1$:

$$f(1, d_1) = \frac{\sqrt{\left(d_1^4 - 6 d_1^3 + 6 d_1^2\right) c_1}}{3\sqrt{2}} \quad \text{at } l_1^* = d_1. \tag{4.7}$$

For the stages $k \ge 2$:

$$f(k, d_k) = \min_{0 \le l_k \le d_k} \left[\frac{\sqrt{\left[l_k^4 - 6 (1 - d_k + l_k)\, l_k^3 + 6 (1 - d_k + l_k)^2\, l_k^2\right] c_k}}{3\sqrt{2}} + f(k-1, d_k - l_k)\right]. \tag{4.8}$$

A C++ program (see Appendix B) was coded to solve the recurrence relations (4.7) and (4.8). Executing the program yields the optimum strata widths $l_h^*$ and hence the optimum strata boundaries $x_h^* = x_{h-1}^* + l_h^*$. The results for six different numbers of strata, $L = 2, 3, 4, 5, 6$ and $7$, with different strata measurement costs $c_h$, are presented in Table 4.1.


Table 4.1 OSW, OSB and Optimum Value of the Objective Function for Right-Triangular Distribution

| No. of strata ($L$) | Strata measurement cost ($c_h$) | Optimum strata width ($l_h^*$) | Optimum strata boundaries ($x_h^* = x_{h-1}^* + l_h^*$) | Optimum value of the objective function |
|---|---|---|---|---|
| 2 | $c_1 = 2$; $c_2 = 3$ | $l_1 = 0.39770$; $l_2 = 0.60230$ | $x_1 = 1.39770$ | 0.1915934 |
| 3 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$ | $l_1 = 0.27858$; $l_2 = 0.27769$; $l_3 = 0.44373$ | $x_1 = 1.27858$; $x_2 = 1.55627$ | 0.1399830 |
| 4 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$ | $l_1 = 0.22026$; $l_2 = 0.20646$; $l_3 = 0.21668$; $l_4 = 0.35660$ | $x_1 = 1.22026$; $x_2 = 1.42672$; $x_3 = 1.64340$ | 0.1127605 |
| 5 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$; $c_5 = 6$ | $l_1 = 0.18500$; $l_2 = 0.16838$; $l_3 = 0.16626$; $l_4 = 0.17946$; $l_5 = 0.30090$ | $x_1 = 1.18500$; $x_2 = 1.35338$; $x_3 = 1.51964$; $x_4 = 1.69910$ | 0.09573736 |
| 6 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$; $c_5 = 6$; $c_6 = 7$ | $l_1 = 0.16116$; $l_2 = 0.14409$; $l_3 = 0.13820$; $l_4 = 0.14038$; $l_5 = 0.15422$; $l_6 = 0.26195$ | $x_1 = 1.16116$; $x_2 = 1.30525$; $x_3 = 1.44345$; $x_4 = 1.58383$; $x_5 = 1.73805$ | 0.0839845 |
| 7 | $c_1 = 2$; $c_2 = 3$; $c_3 = 4$; $c_4 = 5$; $c_5 = 6$; $c_6 = 7$; $c_7 = 8$ | $l_1 = 0.14383$; $l_2 = 0.12706$; $l_3 = 0.11977$; $l_4 = 0.11820$; $l_5 = 0.12222$; $l_6 = 0.13590$; $l_7 = 0.23302$ | $x_1 = 1.14383$; $x_2 = 1.27089$; $x_3 = 1.39066$; $x_4 = 1.50886$; $x_5 = 1.63108$; $x_6 = 1.76698$ | 0.07532636 |


Chapter 5

Determination of the Optimum

Strata Boundaries for a Population

with Cauchy Study Variable

______________________________

5.1 Cauchy Distribution

The Cauchy distribution, named after Augustin Cauchy, is a continuous probability distribution. The history of its discovery goes back to the 17th century, but it was first published by the French mathematician Siméon Denis Poisson in 1824, and Cauchy became associated with it during an academic controversy in 1853.

A Cauchy distribution is considered as a possible model whenever one needs a density function with heavier tails than the normal distribution allows. It is a unimodal, symmetric distribution that is stable and infinitely divisible, and the family is closed under the formation of sums of independent variables. It is of particular interest because, although it is a simple family of distributions, it does not possess finite moments: the expected value does not exist. For this reason the Cauchy distribution is often used in statistics as a canonical example of a pathological distribution, since both its mean and its variance are undefined.

The probability density function of a Cauchy distribution is given by

$$f(x; x_0, \gamma) = \frac{1}{\pi\gamma\left[1 + \left(\dfrac{x - x_0}{\gamma}\right)^2\right]}, \tag{5.1}$$

where $x_0$ is the location parameter that specifies the location of the peak of the distribution and $\gamma$ is the scale parameter that specifies the half-width at half maximum.

The simplest Cauchy distribution, called the standard Cauchy distribution, is the special case where $x_0 = 0$ and $\gamma = 1$. It has the probability density function

$$f(x) = \frac{1}{\pi\left(1 + x^2\right)}; \quad -\infty < x < \infty. \tag{5.2}$$

In this chapter the problem of constructing the OSB is discussed when the study variable of the underlying population has a standard Cauchy distribution given in (5.2).

5.2 Formulation of the Problem of Determining the OSB for Cauchy Variable

If the stratification variable $x$ follows the standard Cauchy distribution with density function as given by (5.2), then the stratum weight $(W_h)$, stratum mean $(\mu_h)$, and stratum variance $(S_h^2)$ can be obtained as a function of the boundary points $(x_{h-1}, x_h)$ by using (2.9), (2.11), and (2.10) respectively as follows:

$$W_h = \int_{x_{h-1}}^{x_h} f(x)\,dx = \frac{1}{\pi}\int_{x_{h-1}}^{x_h} \frac{1}{1 + x^2}\,dx.$$

Integrating the function and substituting $x_h$ and $x_{h-1}$ gives

$$W_h = \frac{1}{\pi}\left[\tan^{-1} x_h - \tan^{-1} x_{h-1}\right].$$

Note that $l_h = x_h - x_{h-1}$ and thus substituting this into the above equation gives

$$W_h = \frac{1}{\pi}\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right]. \tag{5.3}$$

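Similarly, using (2.11) the stratum mean $(\mu_h)$ is obtained as follows:

$$\mu_h = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x f(x)\,dx = \frac{1}{\pi W_h}\int_{x_{h-1}}^{x_h} \frac{x}{1 + x^2}\,dx.$$

Let $u = 1 + x^2$. Then $x\,dx = \dfrac{du}{2}$. Also,

when $x \to x_h$: $u \to 1 + x_h^2$; when $x \to x_{h-1}$: $u \to 1 + x_{h-1}^2$.

Thus

$$\mu_h = \frac{1}{2\pi W_h}\int_{1 + x_{h-1}^2}^{1 + x_h^2} \frac{1}{u}\,du.$$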

Performing simple integration yields

$$\mu_h = \frac{1}{2\pi W_h}\Big[\ln u\Big]_{1 + x_{h-1}^2}^{1 + x_h^2} = \frac{1}{2\pi W_h}\left[\ln\left(1 + x_h^2\right) - \ln\left(1 + x_{h-1}^2\right)\right].$$

Substituting $l_h = x_h - x_{h-1}$ gives

$$\mu_h = \frac{1}{2\pi W_h}\left[\ln\left(1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2\right) - \ln\left(1 + x_{h-1}^2\right)\right].$$

Using the identity $\ln x - \ln y = \ln\left(\dfrac{x}{y}\right)$, the above equation reduces to

$$\mu_h = \frac{1}{2\pi W_h}\ln\left(\frac{1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2}{1 + x_{h-1}^2}\right).$$

Also substituting the value of $W_h$ from (5.3), we get

$$\mu_h = \frac{\ln\left(\dfrac{1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2}{1 + x_{h-1}^2}\right)}{2\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right]}. \tag{5.4}$$

The stratum variance $(S_h^2)$ for the standard Cauchy distribution is found using (2.10) as follows:

$$S_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x^2 f(x)\,dx - \mu_h^2 = \frac{1}{\pi W_h}\int_{x_{h-1}}^{x_h} \frac{x^2}{1 + x^2}\,dx - \mu_h^2.$$

Let $x = \tan\theta$. Differentiating this gives $dx = \sec^2\theta\,d\theta$. Also $\theta = \tan^{-1} x$.

When $x \to x_h$: $\theta \to \tan^{-1} x_h$; when $x \to x_{h-1}$: $\theta \to \tan^{-1} x_{h-1}$.

Thus the integral becomes

$$S_h^2 = \frac{1}{\pi W_h}\int_{\tan^{-1} x_{h-1}}^{\tan^{-1} x_h} \frac{\tan^2\theta\,\sec^2\theta}{1 + \tan^2\theta}\,d\theta - \mu_h^2$$

or

$$S_h^2 = \frac{1}{\pi W_h}\int_{\tan^{-1} x_{h-1}}^{\tan^{-1} x_h} \tan^2\theta\,d\theta - \mu_h^2. \tag{5.5}$$

Substituting $\tan^2\theta = \sec^2\theta - 1$ in equation (5.5) and integrating gives

$$S_h^2 = \frac{1}{\pi W_h}\Big[\tan\theta - \theta\Big]_{\tan^{-1} x_{h-1}}^{\tan^{-1} x_h} - \mu_h^2.$$

Thus

$$S_h^2 = \frac{1}{\pi W_h}\left[\tan\left(\tan^{-1} x_h\right) - \tan^{-1} x_h - \tan\left(\tan^{-1} x_{h-1}\right) + \tan^{-1} x_{h-1}\right] - \mu_h^2.$$

Substituting $l_h = x_h - x_{h-1}$ gives

$$S_h^2 = \frac{1}{\pi W_h}\left[l_h - \tan^{-1}(l_h + x_{h-1}) + \tan^{-1} x_{h-1}\right] - \mu_h^2.$$

Substituting the value of $W_h$ from (5.3) and $\mu_h$ from (5.4) gives

$$S_h^2 = \frac{l_h - \tan^{-1}(l_h + x_{h-1}) + \tan^{-1} x_{h-1}}{\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}} - \frac{\left[\ln\left(\dfrac{1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2}{1 + x_{h-1}^2}\right)\right]^2}{4\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right]^2}.$$

Simplifying the above equation gives

$$S_h^2 = \frac{\left[l_h - \tan^{-1}(l_h + x_{h-1}) + \tan^{-1} x_{h-1}\right]\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right] - \dfrac{1}{4}\left[\ln\left(\dfrac{1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2}{1 + x_{h-1}^2}\right)\right]^2}{\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right]^2}. \tag{5.6}$$
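The closed forms (5.3), (5.4) and (5.6) can be checked numerically. The following C++ sketch is not the program from Appendix C; the names `StratumStats`, `cauchyStratum`, `simpson` and `cauchyStratumNumeric` are illustrative assumptions. It evaluates the three closed forms for one stratum and compares them with a composite Simpson integration of the density (5.2).

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Illustrative container for the three stratum quantities (not from the thesis).
struct StratumStats {
    double weight;    // W_h
    double mean;      // mu_h
    double variance;  // S_h^2
};

const double kPi = std::acos(-1.0);

// Closed forms (5.3), (5.4) and (5.6) for the stratum [xPrev, xPrev + width].
StratumStats cauchyStratum(double xPrev, double width) {
    assert(width > 0.0);
    const double xh = xPrev + width;
    const double delta = std::atan(xh) - std::atan(xPrev);  // equals pi * W_h
    const double logRatio = std::log((1.0 + xh * xh) / (1.0 + xPrev * xPrev));
    StratumStats s;
    s.weight = delta / kPi;                                  // (5.3)
    s.mean = logRatio / (2.0 * delta);                       // (5.4)
    s.variance = ((width - delta) * delta - 0.25 * logRatio * logRatio)
                 / (delta * delta);                          // (5.6)
    return s;
}

// Composite Simpson rule on [a, b]; n must be even.
double simpson(const std::function<double(double)>& g, double a, double b, int n = 2000) {
    const double h = (b - a) / n;
    double sum = g(a) + g(b);
    for (int i = 1; i < n; ++i) sum += g(a + i * h) * (i % 2 ? 4.0 : 2.0);
    return sum * h / 3.0;
}

// The same quantities obtained by integrating the density (5.2) numerically.
StratumStats cauchyStratumNumeric(double xPrev, double width) {
    const double xh = xPrev + width;
    auto f = [](double x) { return 1.0 / (kPi * (1.0 + x * x)); };
    StratumStats s;
    s.weight = simpson(f, xPrev, xh);
    s.mean = simpson([&](double x) { return x * f(x); }, xPrev, xh) / s.weight;
    s.variance = simpson([&](double x) { return x * x * f(x); }, xPrev, xh) / s.weight
                 - s.mean * s.mean;
    return s;
}
```

Calling both functions with the same arguments, for instance `cauchyStratum(-1.0, 1.09957)` and `cauchyStratumNumeric(-1.0, 1.09957)`, should give values that agree to many decimal places if the closed forms are transcribed correctly.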

Then, the formulated MPP given in (2.18) to determine the optimum stratum widths and hence the optimum stratum boundaries could be expressed using (2.12), (5.3) and (5.6) as:

Minimise

$$\sum_{h=1}^{L} \frac{1}{\pi}\sqrt{\left\{\left[l_h - \tan^{-1}(l_h + x_{h-1}) + \tan^{-1} x_{h-1}\right]\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right] - \frac{1}{4}\left[\ln\left(\frac{1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2}{1 + x_{h-1}^2}\right)\right]^2\right\} c_h}$$

subject to $\sum_{h=1}^{L} l_h = d$

and $l_h \ge 0;\ h = 1, 2, \ldots, L$, (5.7)

where $d = x_L - x_0$ is the range of the distribution.
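To make the objective of (5.7) concrete, the following C++ sketch evaluates it for a given set of widths and costs. The function names `cauchyTerm` and `cauchyObjective` are illustrative assumptions and are not taken from Appendix C; each summand is computed as $W_h S_h \sqrt{c_h}$ using (5.3) and (5.6).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One summand of (5.7): W_h * S_h * sqrt(c_h) for the stratum [xPrev, xPrev + width].
double cauchyTerm(double xPrev, double width, double cost) {
    assert(width >= 0.0 && cost > 0.0);
    const double pi = std::acos(-1.0);
    const double xh = xPrev + width;
    const double delta = std::atan(xh) - std::atan(xPrev);
    const double logRatio = std::log((1.0 + xh * xh) / (1.0 + xPrev * xPrev));
    // Guard against tiny negative values caused by rounding for very narrow strata.
    const double inner =
        std::fmax(0.0, (width - delta) * delta - 0.25 * logRatio * logRatio);
    return std::sqrt(inner * cost) / pi;
}

// Objective of (5.7) for widths l_1..l_L and costs c_1..c_L, starting from x_0.
double cauchyObjective(double x0, const std::vector<double>& widths,
                       const std::vector<double>& costs) {
    assert(widths.size() == costs.size());
    double x = x0;
    double total = 0.0;
    for (std::size_t h = 0; h < widths.size(); ++h) {
        total += cauchyTerm(x, widths[h], costs[h]);
        x += widths[h];  // x_h = x_{h-1} + l_h
    }
    return total;
}
```

As a usage example, `cauchyObjective(-1.0, {1.09957, 0.90043}, {2.0, 3.0})` evaluates the objective at the two-stratum widths reported in Table 5.1 and should come out close to the tabulated optimum 0.2179531.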

5.3 Numerical Illustration of the Solution Procedure

This section illustrates the computational details of the proposed solution procedure, using the dynamic programming technique discussed in Section 2.3, to determine the OSB with varying strata measurement cost for the standard Cauchy distribution.

Let us assume that $x$ follows the standard Cauchy distribution in the interval $[-1, 1]$, that is, $x_0 = -1$, $x_L = 1$ and $d = 2$. Then, the MPP (5.7) becomes

Minimise

$$\sum_{h=1}^{L} \frac{1}{\pi}\sqrt{\left\{\left[l_h - \tan^{-1}(l_h + x_{h-1}) + \tan^{-1} x_{h-1}\right]\left[\tan^{-1}(l_h + x_{h-1}) - \tan^{-1} x_{h-1}\right] - \frac{1}{4}\left[\ln\left(\frac{1 + l_h^2 + 2 l_h x_{h-1} + x_{h-1}^2}{1 + x_{h-1}^2}\right)\right]^2\right\} c_h}$$

subject to $\sum_{h=1}^{L} l_h = 2$

and $l_h \ge 0;\ h = 1, 2, \ldots, L$. (5.8)

Note that the $(h-1)$th stratification point is obtained by

$$\begin{aligned} x_{h-1} &= x_0 + l_1 + l_2 + \cdots + l_{h-1} \\ &= -1 + l_1 + l_2 + \cdots + l_{h-1} \\ &= d_{h-1} - 1 \\ &= d_h - l_h - 1. \end{aligned}$$

Substituting the value of $x_{h-1}$, the recurrence relations (2.21) and (2.22) for solving the MPP reduce to:

For the first stage, $k = 1$:

$$f(1, d_1) = \frac{1}{\pi}\sqrt{\left\{\left[d_1 - \tan^{-1}(d_1 - 1) + \tan^{-1}(-1)\right]\left[\tan^{-1}(d_1 - 1) - \tan^{-1}(-1)\right] - \frac{1}{4}\left[\ln\left(\frac{d_1^2 - 2 d_1 + 2}{2}\right)\right]^2\right\} c_1}$$

at $l_1^* = d_1$. (5.9)

For the stage $k$, where $k \ge 2$:

$$f(k, d_k) = \min_{0 \le l_k \le d_k}\left\{\frac{1}{\pi}\sqrt{\left\{\left[l_k - \tan^{-1}(d_k - 1) + \tan^{-1}(d_k - l_k - 1)\right]\left[\tan^{-1}(d_k - 1) - \tan^{-1}(d_k - l_k - 1)\right] - \frac{1}{4}\left[\ln\left(\frac{1 + (d_k - 1)^2}{1 + (d_k - l_k - 1)^2}\right)\right]^2\right\} c_k} + f(k-1, d_k - l_k)\right\} \tag{5.10}$$

A C++ program (see Appendix C) was coded to solve the recurrence relations (5.9) and (5.10). Executing the program yields the optimum strata widths $l_h^*$ and hence the optimum strata boundaries $x_h^* = x_{h-1}^* + l_h^*$. The results for six different numbers of strata, $L = 2, 3, 4, 5, 6$ and $7$, and the different values of $c_h$ are presented in Table 5.1.

Table 5.1 OSW, OSB and Optimum Value of the Objective Function for Standard Cauchy Distribution

| No. of Strata ($L$) | Strata Measurement Cost ($c_h$) | Optimum Strata Width ($l_h$) | Optimum Strata Boundaries ($x_h = x_{h-1} + l_h$) | Optimum Value of the Objective Function |
|---|---|---|---|---|
| 2 | $c_1 = 2$, $c_2 = 3$ | $l_1 = 1.09957$, $l_2 = 0.90043$ | $x_1 = x_0 + l_1 = 0.09957$ | 0.2179531 |
| 3 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$ | $l_1 = 0.81943$, $l_2 = 0.58253$, $l_3 = 0.59804$ | $x_1 = x_0 + l_1 = -0.18057$, $x_2 = x_1 + l_2 = 0.40196$ | 0.1586139 |
| 4 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$ | $l_1 = 0.67321$, $l_2 = 0.46327$, $l_3 = 0.41898$, $l_4 = 0.444540$ | $x_1 = x_0 + l_1 = -0.32679$, $x_2 = x_1 + l_2 = 0.13648$, $x_3 = x_2 + l_3 = 0.55546$ | 0.1273067 |
| 5 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$, $c_5 = 6$ | $l_1 = 0.58034$, $l_2 = 0.39783$, $l_3 = 0.33898$, $l_4 = 0.33133$, $l_5 = 0.35152$ | $x_1 = x_0 + l_1 = -0.41966$, $x_2 = x_1 + l_2 = -0.02183$, $x_3 = x_2 + l_3 = 0.317150$, $x_4 = x_3 + l_4 = 0.64848$ | 0.1078071 |
| 6 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$, $c_5 = 6$, $c_6 = 7$ | $l_1 = 0.51509$, $l_2 = 0.35472$, $l_3 = 0.29332$, $l_4 = 0.27303$, $l_5 = 0.27442$, $l_6 = 0.28942$ | $x_1 = x_0 + l_1 = -0.48491$, $x_2 = x_1 + l_2 = -0.13019$, $x_3 = x_2 + l_3 = 0.16313$, $x_4 = x_3 + l_4 = 0.43616$, $x_5 = x_4 + l_5 = 0.71058$ | 0.0943949 |
| 7 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$, $c_5 = 6$, $c_6 = 7$, $c_7 = 8$ | $l_1 = 0.46630$, $l_2 = 0.32335$, $l_3 = 0.26317$, $l_4 = 0.23768$, $l_5 = 0.23032$, $l_6 = 0.23398$, $l_7 = 0.24520$ | $x_1 = x_0 + l_1 = -0.53370$, $x_2 = x_1 + l_2 = -0.21035$, $x_3 = x_2 + l_3 = 0.05282$, $x_4 = x_3 + l_4 = 0.29050$, $x_5 = x_4 + l_5 = 0.52082$, $x_6 = x_5 + l_6 = 0.75480$ | 0.0845448 |
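The recurrence relations (5.9) and (5.10) can be sketched as a dynamic program on a discretised grid. The code below is a minimal illustration and not the program from Appendix C: it assumes that $d_k$ and $l_k$ are restricted to multiples of a user-chosen `step`, and the names `cauchyTerm`, `DpResult` and `solveOsb` are hypothetical. On a coarse grid the result only approximates the values in Table 5.1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// W_h * S_h * sqrt(c_h) for the standard Cauchy stratum [xPrev, xPrev + width].
double cauchyTerm(double xPrev, double width, double cost) {
    assert(width >= 0.0 && cost > 0.0);
    const double pi = std::acos(-1.0);
    const double xh = xPrev + width;
    const double delta = std::atan(xh) - std::atan(xPrev);
    const double logRatio = std::log((1.0 + xh * xh) / (1.0 + xPrev * xPrev));
    const double inner =
        std::fmax(0.0, (width - delta) * delta - 0.25 * logRatio * logRatio);
    return std::sqrt(inner * cost) / pi;
}

struct DpResult {
    double value;                // approximate optimum of the objective
    std::vector<double> widths;  // l_1..l_L, each a multiple of the grid step
};

// Grid version of (5.9)-(5.10): f[k][i] approximates f(k+1, i*step).
DpResult solveOsb(double x0, double d, const std::vector<double>& costs, double step) {
    assert(!costs.empty() && step > 0.0);
    const int n = static_cast<int>(std::lround(d / step));  // grid cells across the range
    const std::size_t L = costs.size();
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> f(L, std::vector<double>(n + 1, inf));
    std::vector<std::vector<int>> arg(L, std::vector<int>(n + 1, 0));
    for (int i = 0; i <= n; ++i) {  // first stage: l_1 = d_1
        f[0][i] = cauchyTerm(x0, i * step, costs[0]);
        arg[0][i] = i;
    }
    for (std::size_t k = 1; k < L; ++k) {
        for (int i = 0; i <= n; ++i) {
            for (int j = 0; j <= i; ++j) {  // l_k = j*step, x_{k-1} = x0 + (i-j)*step
                const double v = cauchyTerm(x0 + (i - j) * step, j * step, costs[k])
                                 + f[k - 1][i - j];
                if (v < f[k][i]) {
                    f[k][i] = v;
                    arg[k][i] = j;
                }
            }
        }
    }
    DpResult r{f[L - 1][n], std::vector<double>(L)};
    int i = n;
    for (std::size_t k = L; k-- > 0;) {  // backtrack from d_L = d
        r.widths[k] = arg[k][i] * step;
        i -= arg[k][i];
    }
    return r;
}
```

For example, `solveOsb(-1.0, 2.0, {2.0, 3.0}, 0.01)` searches widths in steps of 0.01 on $[-1, 1]$; the returned value should lie close to the two-stratum optimum 0.2179531 in Table 5.1, with the first width near 1.10. A finer step tightens the approximation at a cost that grows quadratically in the number of grid points.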

Chapter 6

Determination of the Optimum Strata Boundaries for a Population with Power Study Variable

______________________________

6.1 Power Distribution

The power distribution is a continuous probability distribution with probability density function given by

$$f(x) = \begin{cases} \dfrac{\beta x^{\beta - 1}}{\theta^{\beta}}; & 0 \le x \le \theta \\ 0; & \text{otherwise} \end{cases} \tag{6.1}$$

where $\theta > 0$ and $\beta > 0$ are the scale and shape parameters. In this chapter the problem of determining the OSB is discussed when the study variable in the underlying population has a power distribution described in (6.1).

6.2 Formulation of the Problem of Determining the OSB for Power Variable

If the stratification variable $x$ follows the power distribution with density function given in (6.1), then the stratum weight $(W_h)$, stratum mean $(\mu_h)$ and stratum variance $(S_h^2)$ can be obtained as a function of the boundary points $(x_{h-1}, x_h)$ by using (2.9), (2.11), and (2.10) respectively as follows:

$$W_h = \int_{x_{h-1}}^{x_h} f(x)\,dx = \int_{x_{h-1}}^{x_h} \frac{\beta x^{\beta - 1}}{\theta^{\beta}}\,dx.$$

Integrating the function and substituting $x_h$ and $x_{h-1}$ gives

$$W_h = \frac{1}{\theta^{\beta}}\left[x_h^{\beta} - x_{h-1}^{\beta}\right]. \tag{6.2}$$

Note that $l_h = x_h - x_{h-1}$ and thus substituting this into (6.2) gives

$$W_h = \frac{1}{\theta^{\beta}}\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right]. \tag{6.3}$$

The stratum mean $(\mu_h)$ of the power distribution is obtained using (2.11) as follows:

$$\mu_h = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x f(x)\,dx = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x\,\frac{\beta x^{\beta - 1}}{\theta^{\beta}}\,dx.$$

Integrating the function and substituting $x_h = l_h + x_{h-1}$ gives

$$\mu_h = \frac{\beta}{(\beta + 1)\,\theta^{\beta}\,W_h}\left[(l_h + x_{h-1})^{\beta + 1} - x_{h-1}^{\beta + 1}\right].$$

Thus, substituting the value of $W_h$ from (6.3) and simplifying gives

$$\mu_h = \frac{\beta\left[(l_h + x_{h-1})^{\beta + 1} - x_{h-1}^{\beta + 1}\right]}{(\beta + 1)\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right]}. \tag{6.4}$$

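The stratum variance $(S_h^2)$ for the power distribution is found using (2.10) as follows:

$$S_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x^2 f(x)\,dx - \mu_h^2 = \frac{1}{W_h}\int_{x_{h-1}}^{x_h} x^2\,\frac{\beta x^{\beta - 1}}{\theta^{\beta}}\,dx - \mu_h^2.$$

By integrating the function above we get

$$S_h^2 = \frac{\beta}{(\beta + 2)\,\theta^{\beta}\,W_h}\left[x_h^{\beta + 2} - x_{h-1}^{\beta + 2}\right] - \mu_h^2.$$

Substituting $x_h = l_h + x_{h-1}$ gives

$$S_h^2 = \frac{\beta}{(\beta + 2)\,\theta^{\beta}\,W_h}\left[(l_h + x_{h-1})^{\beta + 2} - x_{h-1}^{\beta + 2}\right] - \mu_h^2.$$

Thus, substituting the value of $W_h$ from (6.3) and $\mu_h$ from (6.4) gives

$$S_h^2 = \frac{\beta\left[(l_h + x_{h-1})^{\beta + 2} - x_{h-1}^{\beta + 2}\right]}{(\beta + 2)\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right]} - \frac{\beta^2\left[(l_h + x_{h-1})^{\beta + 1} - x_{h-1}^{\beta + 1}\right]^2}{(\beta + 1)^2\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right]^2}.$$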

Expanding and finding the common denominator, the above function can be simplified as

$$S_h^2 = \frac{\beta(\beta + 1)^2\left[(l_h + x_{h-1})^{\beta + 2} - x_{h-1}^{\beta + 2}\right]\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right] - \beta^2(\beta + 2)\left[(l_h + x_{h-1})^{\beta + 1} - x_{h-1}^{\beta + 1}\right]^2}{(\beta + 1)^2(\beta + 2)\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right]^2}.$$

Further simplification gives

$$S_h^2 = \frac{\beta\left\{(l_h + x_{h-1})^{2\beta + 2} + x_{h-1}^{2\beta + 2} - (l_h + x_{h-1})^{\beta}\,x_{h-1}^{\beta}\left[(\beta + 1)^2 l_h^2 + 2 l_h x_{h-1} + 2 x_{h-1}^2\right]\right\}}{(\beta + 1)^2(\beta + 2)\left[(l_h + x_{h-1})^{\beta} - x_{h-1}^{\beta}\right]^2}. \tag{6.5}$$
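The simplified variance (6.5) can be cross-checked against the unsimplified expression built from the raw moments. The C++ sketch below does this; the function names `powerVarianceClosed` and `powerVarianceMoments` are illustrative assumptions. Note that $\theta$ cancels in both $\mu_h$ and $S_h^2$, so it does not appear as a parameter.

```cpp
#include <cassert>
#include <cmath>

// S_h^2 from the simplified closed form (6.5).
double powerVarianceClosed(double xPrev, double width, double beta) {
    assert(width > 0.0 && xPrev >= 0.0 && beta > 0.0);
    const double xh = xPrev + width;
    const double num =
        std::pow(xh, 2 * beta + 2) + std::pow(xPrev, 2 * beta + 2)
        - std::pow(xh, beta) * std::pow(xPrev, beta)
              * ((beta + 1) * (beta + 1) * width * width
                 + 2 * width * xPrev + 2 * xPrev * xPrev);
    const double diff = std::pow(xh, beta) - std::pow(xPrev, beta);
    return beta * num / ((beta + 1) * (beta + 1) * (beta + 2) * diff * diff);
}

// S_h^2 from the raw moments: E[x^2 | stratum] - mu_h^2, with mu_h from (6.4).
double powerVarianceMoments(double xPrev, double width, double beta) {
    assert(width > 0.0 && xPrev >= 0.0 && beta > 0.0);
    const double xh = xPrev + width;
    const double a0 = std::pow(xh, beta) - std::pow(xPrev, beta);
    const double a1 = std::pow(xh, beta + 1) - std::pow(xPrev, beta + 1);
    const double a2 = std::pow(xh, beta + 2) - std::pow(xPrev, beta + 2);
    const double mean = beta * a1 / ((beta + 1) * a0);  // (6.4)
    return beta * a2 / ((beta + 2) * a0) - mean * mean;
}
```

For the first stratum with $\beta = 3$ and $x_{h-1} = 0$, both functions reduce to $3 l_h^2 / 80$, which is a convenient spot check.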

Then, the formulated MPP given in (2.18) to determine the optimum stratum widths and hence the optimum stratum boundaries could be expressed using (2.12), (6.3) and (6.5) as:

Minimize

$$\sum_{h=1}^{L} \frac{1}{\theta^{\beta}}\sqrt{\frac{\beta\left\{(l_h + x_{h-1})^{2\beta + 2} + x_{h-1}^{2\beta + 2} - (l_h + x_{h-1})^{\beta}\,x_{h-1}^{\beta}\left[(\beta + 1)^2 l_h^2 + 2 l_h x_{h-1} + 2 x_{h-1}^2\right]\right\} c_h}{(\beta + 1)^2(\beta + 2)}}$$

subject to $\sum_{h=1}^{L} l_h = d$

and $l_h \ge 0;\ h = 1, 2, \ldots, L$. (6.6)

6.3 Numerical Illustration of the Solution Procedure

This section gives an illustration of the computational details of the proposed solution procedure using the dynamic programming technique to determine the OSB with varying stratum cost for the power distribution. Let $x$ follow the power distribution in the interval $[0, 1]$, that is, $x_0 = 0$, $x_L = 1$ and $d = 1$. We also assume that $\theta = 1$ and $\beta = 3$. Then, the MPP (6.6) reduces to:

Minimize

$$\sum_{h=1}^{L} \frac{l_h^2\sqrt{\left(3 l_h^4 + 24 l_h^3 x_{h-1} + 84 l_h^2 x_{h-1}^2 + 120 l_h x_{h-1}^3 + 60 x_{h-1}^4\right) c_h}}{4\sqrt{5}}$$

subject to $\sum_{h=1}^{L} l_h = 1$

and $l_h \ge 0;\ h = 1, 2, \ldots, L$. (6.7)
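The reduced objective (6.7) is simple enough to evaluate directly. The C++ sketch below does so for $\theta = 1$ and $\beta = 3$ only; the names `powerTerm` and `powerObjective` are illustrative assumptions rather than code from Appendix D.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One summand of (6.7); valid only for theta = 1 and beta = 3.
double powerTerm(double xPrev, double l, double cost) {
    assert(l >= 0.0 && xPrev >= 0.0 && cost > 0.0);
    const double poly = 3 * std::pow(l, 4) + 24 * std::pow(l, 3) * xPrev
                        + 84 * l * l * xPrev * xPrev
                        + 120 * l * std::pow(xPrev, 3)
                        + 60 * std::pow(xPrev, 4);
    return l * l * std::sqrt(poly * cost) / (4.0 * std::sqrt(5.0));
}

// Objective of (6.7) for widths l_1..l_L and costs c_1..c_L, with x_0 = 0.
double powerObjective(const std::vector<double>& widths,
                      const std::vector<double>& costs) {
    assert(widths.size() == costs.size());
    double x = 0.0;
    double total = 0.0;
    for (std::size_t h = 0; h < widths.size(); ++h) {
        total += powerTerm(x, widths[h], costs[h]);
        x += widths[h];  // x_h = x_{h-1} + l_h
    }
    return total;
}
```

As a usage example, `powerObjective({0.75883, 0.24117}, {2.0, 3.0})` evaluates (6.7) at the two-stratum widths reported in Table 6.1 and should come out close to the tabulated optimum 0.1580252.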

Note that the $(h-1)$th stratification point is obtained by

$$\begin{aligned} x_{h-1} &= x_0 + l_1 + l_2 + \cdots + l_{h-1} \\ &= l_1 + l_2 + \cdots + l_{h-1} \\ &= d_{h-1} \\ &= d_h - l_h. \end{aligned}$$

Substituting the value of $x_{h-1}$, the recurrence relations (2.21) and (2.22) for solving the MPP reduce to:

For the first stage, $k = 1$:

$$f(1, d_1) = \frac{\sqrt{6}\,d_1^4}{4\sqrt{5}} \quad \text{at } l_1^* = d_1. \tag{6.8}$$
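The first-stage value (6.8) follows from a single summand of (6.7). As a worked check, assuming the first-stage cost $c_1 = 2$ used in Table 6.1, set $x_0 = 0$ and $l_1 = d_1$, so that every term containing $x_0$ vanishes:

$$f(1, d_1) = \frac{d_1^2\sqrt{3 d_1^4\, c_1}}{4\sqrt{5}} = \frac{\sqrt{3 c_1}\, d_1^4}{4\sqrt{5}} = \frac{\sqrt{6}\, d_1^4}{4\sqrt{5}} \quad (c_1 = 2).$$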

For the stage $k$, where $k \ge 2$:

$$f(k, d_k) = \min_{0 \le l_k \le d_k}\left\{\frac{l_k^2\sqrt{\left(3 l_k^4 + 24 l_k^3 x_{k-1} + 84 l_k^2 x_{k-1}^2 + 120 l_k x_{k-1}^3 + 60 x_{k-1}^4\right) c_k}}{4\sqrt{5}} + f(k-1, d_k - l_k)\right\}$$

which simplifies to

$$f(k, d_k) = \min_{0 \le l_k \le d_k}\left\{\frac{l_k^2\sqrt{\left(3 l_k^4 + 24 l_k^3 (d_k - l_k) + 84 l_k^2 (d_k - l_k)^2 + 120 l_k (d_k - l_k)^3 + 60 (d_k - l_k)^4\right) c_k}}{4\sqrt{5}} + f(k-1, d_k - l_k)\right\} \tag{6.9}$$

because $x_{k-1} = x_0 + l_1 + \cdots + l_{k-1} = d_k - l_k$.

A C++ program (see Appendix D) was coded to solve the recurrence relations (6.8) and (6.9). Executing the program yields the optimum strata widths $l_h^*$ and hence the optimum strata boundaries $x_h^* = x_{h-1}^* + l_h^*$. The results for six different numbers of strata, $L = 2, 3, 4, 5, 6$ and $7$, and the different values of $c_h$ are presented in Table 6.1.

Table 6.1 OSW, OSB and Optimum Value of the Objective Function for Power Distribution

| No. of Strata ($L$) | Strata Measurement Cost ($c_h$) | Optimum Strata Width ($l_h$) | Optimum Strata Boundaries ($x_h = x_{h-1} + l_h$) | Optimum Value of the Objective Function |
|---|---|---|---|---|
| 2 | $c_1 = 2$, $c_2 = 3$ | $l_1 = 0.75883$, $l_2 = 0.24117$ | $x_1 = x_0 + l_1 = 0.75883$ | 0.1580252 |
| 3 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$ | $l_1 = 0.64929$, $l_2 = 0.20635$, $l_3 = 0.14436$ | $x_1 = x_0 + l_1 = 0.64929$, $x_2 = x_1 + l_2 = 0.85564$ | 0.1157368 |
| 4 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$ | $l_1 = 0.58323$, $l_2 = 0.18536$, $l_3 = 0.12967$, $l_4 = 0.10174$ | $x_1 = x_0 + l_1 = 0.58323$, $x_2 = x_1 + l_2 = 0.76859$, $x_3 = x_2 + l_3 = 0.89826$ | 0.0933962 |
| 5 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$, $c_5 = 6$ | $l_1 = 0.53776$, $l_2 = 0.17091$, $l_3 = 0.11957$, $l_4 = 0.09381$, $l_5 = 0.07795$ | $x_1 = x_0 + l_1 = 0.53776$, $x_2 = x_1 + l_2 = 0.70867$, $x_3 = x_2 + l_3 = 0.82824$, $x_4 = x_3 + l_4 = 0.92205$ | 0.0794072 |
| 6 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$, $c_5 = 6$, $c_6 = 7$ | $l_1 = 0.50396$, $l_2 = 0.16016$, $l_3 = 0.11205$, $l_4 = 0.08791$, $l_5 = 0.07305$, $l_6 = 0.06287$ | $x_1 = x_0 + l_1 = 0.50396$, $x_2 = x_1 + l_2 = 0.66412$, $x_3 = x_2 + l_3 = 0.77617$, $x_4 = x_3 + l_4 = 0.86408$, $x_5 = x_4 + l_5 = 0.93713$ | 0.0697378 |
| 7 | $c_1 = 2$, $c_2 = 3$, $c_3 = 4$, $c_4 = 5$, $c_5 = 6$, $c_6 = 7$, $c_7 = 8$ | $l_1 = 0.47750$, $l_2 = 0.15175$, $l_3 = 0.10616$, $l_4 = 0.08330$, $l_5 = 0.06921$, $l_6 = 0.05957$, $l_7 = 0.05251$ | $x_1 = x_0 + l_1 = 0.47750$, $x_2 = x_1 + l_2 = 0.62925$, $x_3 = x_2 + l_3 = 0.73541$, $x_4 = x_3 + l_4 = 0.81871$, $x_5 = x_4 + l_5 = 0.88792$, $x_6 = x_5 + l_6 = 0.94749$ | 0.0626070 |

Chapter 7

Conclusion

______________________________

Surveys are an important decision-making tool for development. Survey results aid decision-making in government, businesses and non-profit organizations. The best way to conduct a survey is to cover the entire population, but this is usually impossible due to budget constraints. To address this problem, many sampling techniques have been developed. One commonly used technique is stratified random sampling, where the population is divided into non-overlapping groups called strata and random samples are drawn from each stratum. To achieve the maximum precision of the estimates when using stratified sampling, the four basic problems to consider are: (1) the choice of the stratification variables, (2) the determination of the number of strata, (3) the determination of the optimum strata boundaries and (4) the determination of the optimum sample size to be selected from each stratum.

The optimum strata boundaries (OSB) are determined by cutting the range of the distribution of the study variable at suitable points. The choice of the OSB is important to ensure that the strata are homogeneous. This means that, in order to achieve maximum precision, the stratum variance $(S_h^2)$ should be as small as possible for a given type of sample allocation. The problem of determining the OSB was first studied by Dalenius (1950), and subsequently many authors proposed various techniques for determining the OSB.

In practice, the budget of a survey is fixed in advance. The purpose of survey design is then to maximize the amount of information gathered with optimum precision for a given cost. Moreover, when the cost of measurement per unit varies from stratum to stratum, this influences the determination of the OSB that maximizes the precision of the estimate. It is therefore important that, while determining the OSB, the cost of measuring a unit in each stratum is also considered. Unfortunately, the problem of determining the OSB under these cost constraints has received little attention in the literature. In this thesis, an attempt is made to determine the OSB for a given budget where the cost of measuring units varies from stratum to stratum. First, a mathematical programming problem (MPP) was formulated, which addresses the problem of determining the OSB. Then a solution procedure was proposed to solve the MPP using a dynamic programming technique. Numerical examples illustrated the determination of the OSB for four populations in which the study variable follows different distributions, namely the exponential, right-triangular, Cauchy and power distributions.

The findings from this project will be useful for those who want to conduct a survey using stratified random sampling under budget constraints. The advantage of the proposed method is that it gives the global optimum of the boundary points. However, the research carried out in this thesis covers only a single study variable. Future research can consider study variables with other frequency functions, such as normal, lognormal and gamma, which are more useful in industry. The research can also be extended to determining the OSB for multivariate populations.

Bibliography

1. Aoyama, H. (1954). A study of stratified random sampling. Ann. Inst. Stat. Math., 6, 1-36.

2. Arthanari, T.S., Dodge, Y. (1981). Mathematical Programming in Statistics.

Wiley and Sons, Inc. USA

3. Banerjee, A., Yakovenko, V.M. and Matteo, T.D. (2006). A study of the personal

income distribution in Australia. Physica A, 370, 54-59.

4. Banerjee, A., Yakovenko, V.M. (2010). Universal patterns of inequality. New

Journal of Physics, 12, DOI: 10.1088/1367-2630/12/7/075032.

5. Bellman, R.E. (1957). Dynamic Programming. Princeton University Press, New Jersey.

6. Brandt, S. (1999). Data Analysis. Statistical and Computational Methods for

Scientists and Engineers. Ed. 3. Springer Verlag, New York.

7. Buhler. W., Deutler. T., (1975). Optimal stratification and grouping by dynamic

programming. Metrika. 22(1), 161-175.

8. Cameron, N. (1985). Introduction to Linear and Convex Programming

Cambridge University Press, Cambridge.

9. Claycombe, W.W. and Sullivan, W.G. (1975). Foundations of Mathematical Programming. Reston Publishing Company, Inc., A Prentice-Hall Company, USA.

10. Cochran, W.G. (1977). Sampling Techniques. John Wiley & Sons, New York.

11. Cochran, W.G. (1961). Comparisons of methods for determining stratum

boundaries. Bull. Int. Stat. Inst, Vol 38. Part 2, pp. 345-358.

12. Dalenius, T. (1950). The problem of optimum stratification- II. Skand.

Aktuartidskr, 33, 203-213.

13. Dalenius, T., and Gurney, M. (1951). The problem of optimum stratification.

Skand. Akt.,34, 133-148.

14. Dalenius, T. (1957). Sampling in Sweden. Almquist and Wiksell. Stockholm.

15. Dalenius, T., and Hodges, J. L. (1959). Minimum variance stratification. J. Amer.

Statis. Assoc., 54, 88-101.

16. Detlefsen, R. E., and Veum, C. S. (1991). Design issues for the retail trade sample surveys of the US Bureau of the Census. Proceedings of the Survey Research Methods Section, ASA, pp. 214-219.

17. Dvoretzky, A. Kiefer, J. Wolfowitz, J. (1952). The Inventory Problem: I, Case of

Known Distribution of Demand. Econometrica. 20, 187-222.

18. Dvoretzky, A. Kiefer, J. Wolfowitz, J. (1952). The Inventory Problem: II, Case

of Unknown Distribution of Demand. Econometrica. 20, 450-466.

19. Ekman, G. (1959). Approximate expression for conditional mean and variance over small intervals of a continuous distribution. Ann. Inst. Stat. Math., 30, 1131-1134.

20. Evans, M., Hastings, N. and Peacock, B. (2000). Statistical Distributions. John Wiley and Sons Inc., Canada.

21. Findeisen, W., Szymanowski, J. and Wierzbicki, A. (1974). Metody obliczeniowe optymalizacji (Computing Methods of Optimization). Wydawnictwa Politechniki Warszawskiej, Warsaw, Poland.

22. Fonolahi, A.V., Khan, M.G.M. (2014). Determining the optimum strata

boundaries with constant cost factor. IEEE Proceedings of 2014 Asia-Pacific

World Congress on Computer Science and Engineering (APWC on CSE), 1-7,

DOI: 10.1109/APWC CSE.2014.7053850.

23. Glaisher, J.W. (1871). On a class of definite integrals. Philosophical Magazine,

XXXII, 294-301.

24. Govindarajulu, Z. (1999). Elements of sampling Theory and Methods. Prentice-

Hall Company.

25. Gupta, R. K., Singh, R., Mahajan, P.K. (2005). Approximate optimum strata

boundaries for ratio and regression estimators. Aligarh Journal of Statistics. 25,

49-55.

26. Hansen, M.H., and Hurwitz, W.N. (1953). On the theory of sampling from finite

population. Ann. Math. Statist., 14, 333-362.

27. Hansen, M.H., and Hurwitz, W.N., Madow, W.G. (1962). Sample Survey

Methods and Theory Methods and applications, Vol 1, John Wiley &Sons, Inc.

28. Hess, I., Sethi, V.K. and Balakrishnan, T.R. (1966). Stratification: A practical

investigation. J. Amer. Statist. Assoc., 61, 71-90.

29. Hidiroglou, M.A., and Srinath, K.P. (1993). Problems associated with designing

subannual business surveys. Journal of Business and Economic Statistics, 11,

397-405.

30. Johnson, D. (1997). The triangular distribution as a proxy for the beta distribution in risk analysis. Journal of the Royal Statistical Society: Series D (The Statistician), Vol 47, Issue 3, 387-398.

31. Khan, E.A., Khan, M.G.M., and Ahsan, M.J. (2002). Optimum stratification: A

mathematical programming approach. Calcutta Statistical Association Bulletin,

52 (special), 205-208.

32. Khan, M.G.M., Khan, N., and Ahsan, M.J. (2003). Optimum stratification for

exponential study variable under Neyman Allocation. Bulletin of the

International Statistical Institute, 54th Session, Vol LX, 606-607.

33. Khan, M.G.M., Najmussehar., Ahsan, M.J. (2005). Optimum stratification for

exponential study variable under Neyman allocation. J. Indi. Soc. Agri. Statist.,

59(2), 146-150.

34. Khan, M.G.M., Nand, N., Ahmad, N. (2008). Optimum stratification for cauchy

and power type study variables. J. Appl. Statist. Sci. 16(4), 64-74.

35. Khan, M.G.M., Nand, N., Ahmad, N. (2008). Determining the optimum strata

boundary points using dynamic programming. Survey methodology. 34(2), 205-

214.

36. Khan, M.G.M., Rao, D., Ansari, A. H., Ahsan, M. J., (2014). Determining

optimum strata boundaries and sample sizes for skewed population using log-

normal distribution. Communications in Statistics- Simulation and Computation.

DOI # 10.1080/03610918.2013.819917.

37. Khan, M.G.M., Reddy, K.G. and Rao, D.K. (2015). Designing stratified

sampling in economic and business surveys. Journal of Applied Statistics. DOI:

10.1080/02664763.2015.1018674.

38. Kozak, M. (2004). Optimal stratification using random search method in

agricultural surveys. Statistics in Transition, Vol. 6, No.5, 797-806.

39. Lavallée, P. (1988). Two-way optimal stratification using dynamic

programming. Proc. Sect. Surv. Resea. Meth., Amer. Statist. Assoc., Virginia,

646-651.

40. Lavallée, P., Hidiroglou, M. (1988). On the stratification of skewed populations. Survey Methodology, 14, 3-43.

41. Lednicki, B., Wieczorkowski, R. (2003). Optimal stratification and sample

surveys. Sankhya, 12, 1-7.

42. Mahalanobis, P. C. (1952). Some Aspects of the Design of Sample Survey.

Sankhya, 12, 1-7.

43. Mehta, S. K., Singh, R. Kishore, L. (1996). On optimum stratification for

allocation proportional to strata totals. J. Indi. Statist. Assoc. 34, 9-19.

44. Murthy, M.N. (1967). Sampling Theory and Methods. Statistical Publishing

Society, Calcutta.

45. Nand, N. and Khan, M.G.M. (2005a). Use of mathematical programming in

optimum stratification. Presented in the conference of Australia – New Zealand

Industrial and Applied Mathematics (ANZIAM 2005) held during January 30 to

February 3 at Napier, New Zealand.

46. Nand, N. and Khan, M.G.M. (2005b). Determining the optimum strata boundary

points using mathematical programming. Presented in the 55th Session of the

International Statistical Institute (ISI) held during April 5-12 at Sydney,

Australia.

47. Nelder, J. A., Mead, R. (1965). A simplex method for function minimization.

Computer Journal, 7, 308-313.

48. Nemhauser, G. L. (1980). Introduction to Dynamic Programming. John Wiley &

Sons, Inc, USA.

49. Nicolini, G. (2001). A method to define strata boundaries. Departmental

Working Papers 2001-01, Department of Economics University of Milan Italy.

(www.economia.unimi.it/pubb/wp83)

50. Niemiro, W. (1999). Konstrukcja optymalnej stratyfikacja metoda poszukiwan

losowych. (Optimal stratification using random search method). Wiadomosci

Statystyczne, 10, 1-9.

51. Poisson, D.D. (1832). Sur la probabilite’ des resultat moyens des observations,

Paris.

52. Rivest, L. P. (2002). A generalization of Lavallee and Hidiroglou algorithm for

stratification in business survey. Techniques d’enquete, 28, 207-214.

53. Rizvi, S.E.H., Gupta, J.P., Bhargava M. (2002). Optimum stratification based on

auxiliary variable for compromise allocation. Metron, 28(1), 201-215.

54. Scheaffer, R. L., Mendenhall, R., Ott, L. R. (2006). Elementary Survey

Sampling (6th Edn). Thomson Brooks/Cole. University of California, USA.

55. Serfling, R.J. (1968). Approximately optimum stratification. Journal of American

Statistical Association. 63, 1298-1309.

56. Sethi, V. K. (1963). A note on optimum stratification of population for

estimating the population mean. Aust. J. Statist. 5, 20-23.

57. Simpson, T. (1755). A letter to the Right Honourable George Earl of

Macclesfield, President of the Royal Society on the advantage of taking the mean

of a number of observations, in practical astronomy. Philosophical Transactions

of the Royal Society of London. 49, 82-93.

58. Singh, R., Prakash, D. (1975). Optimum Stratification for Equal Allocation.

Annals of the Institute of Statistical Mathematics, 27, 273-280.

59. Singh, R., Sukhatme, B.V. (1969). Optimum stratification. Ann. Inst. Stat. Math.

21, 515-528.

60. Singh, R., Sukhatme, B.V. (1972). Optimum stratification in sampling with

varying probabilities. Ann. Inst. Stat. Math. 24, 485-494.

61. Singh, R., Sukhatme, B.V. (1973). Optimum stratification with ratio and

regression methods of estimation. Ann. Inst. Stat. Math. 25, 627-633.

62. Singh, R. (1971). Approximately optimum stratification on auxiliary variable. J.

Amer. Statist. Assoc. 66, 829-833.

63. Singh, R., Prakash, D. (1975). Optimum stratification for equal allocation. Ann.

Inst. Stat. Math. 27, 273-280.

64. Sundaram, R. K. (1996). A First Course in Optimization Theory. Cambridge

University Press, USA.

65. Sweet, E.M., Sigman, R.S. (1995a). Evaluation of model-assisted procedures for

stratifying skewed populations using auxiliary data, U.S. Bureau of the Census

(available on the internet: www.census.gov/srd/papers/pdf/sm95-22.pdf).

66. Sweet, E.M., Sigman, R.S. (1995b). User guide for the generalized SAS univariate

stratification program, ESM Report Series, ESM-9504, U.S. Bureau of the

Census.

67. Taga, Y., (1967). On optimum stratification for the objective variable based on

concomitant variables using prior information. Ann. Inst. Stat. Math., 19, 101-

129.

68. Taha, H. A. (1997). Operations Research: An Introduction. Prentice Hall, Inc.,

Upper Saddle River, New Jersey.

69. Howard G. T. (1962). An introduction to probability and mathematical statistics.

Academic Press Inc. USA.

70. Unnithan, V.K.G. (1978). The minimum variance boundary points of

stratification. Sankhya, 40, C, 60-72.

71. Unnithan, V.K.G. and Nair, N.U. (1995). Minimum variance stratification.

Commun. Statist., 24(1), 275-284.

72. Walpole, R. E., Myers, R. H., Ye, K. (1981). Probability and Statistics for

engineers and scientists, 7th edn. Prentice Hall, Inc USA.

73. Winston, L. W., Venkataramanan, M. (2003). Introduction to Mathematical

Programming, Operations Research Volume One, 4th Edn. Brooks/Cole-Thomson Learning, USA.

74. Wald, A. (1947). Foundation of a General Theory of Sequential Decision

Functions. Econometrica, Vol. 15. 279-313.

Appendix

Appendix A:

The C++ Program Created to Determine the OSB

with cost factor for Exponential Distribution

/* This program gives the optimum strata width and optimum strata boundaries of

the main study variable following the exponential distribution */

#include <iostream>

#include <math.h>

#include <assert.h>

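#include <conio.h> //non-standard (DOS/Windows) header, needed for getch()
#include <cstdio> //printf
#include <cstdlib> //system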

#include <iomanip>

using namespace std;

typedef double Number;

/*********************************************************************/

//declare and initialize global constants

# define z 100 //refine to 5 decimal places

# define factor 4

# define inc 0.001 // amount of precision (10^-3)

# define inc2 0.00001 // amount of precision (10^-5)

# define prec (1/inc)

# define points 1000 //Keep this to be 1/inc

// function declarations

double RootVal(int k, double d, double y, double c);

/*calculates the value of objective function and the minimal elements*/

double Minimum(double val1,double val2); // returns minimum of 2 numbers

/*Recursive function receives the parameter k, dk, yk to calculate f.*/

double fun(int k,int n,double incf, int minYk, int maxYk, bool isFirstRun, double []);

//void WeightSamPop(int h, int N, int nh[], int Nh[], double w[], double d[], double y[], double f, double Vybarst, double Vh[]);

//void sample size(int nh[], int n, int h, double w[], double sig[], double prodwhsigh);

//declare global variables

int n, N; // number of strata (h), total sample size (n), pop size

double s; // s=x0, the initial value of 6.1

const int SIZE = 10;

const double lambda = 1;

//declare global constants and initialize their values

const double g = 20; // g is the distance

const int stages = 10;

int ylimits[10]; //stores the 3dp values for refining

const int e = (int)(g*points*z+1);

const int p = (int)(g*points+1);

double minkf2[stages][e];

double dk2[stages][e];

int h = 0;

int main() //program execution starts

{

double c[SIZE];

//take inputs of L, d and s as local variables

cout<<"Enter Number of Strata, L " << endl;

cin >> h;

cout<<"Enter Initial Value, Xo " << endl;

cin >> s;

s=0; //x0 is fixed at 0 here, so the value entered above is not used

for (int i=0; i<h; i++)

{

cout<<"Enter the Cost, C " << endl;

cin >> c[i];

printf("c[%i] = %f \n\n", i+1, c[i]);

}

//initialize minkf locally

cout<<"\nInitializing points ...."<<endl;

for (int i=0; i < stages; i++)

{

for(int j=0; j<e; j++)

{

minkf2[i][j]= -9999; //assign -9999 to every cell

}

}

for (int k=0; k < stages; k++)

{

for(int l=0; l<e; l++)

{

dk2[k][l]= -9999; //assigning same as above

}

}

cout<<"Initialization complete"<<endl<<endl<<"Calculating...."<<endl<<endl;

double f=fun(h,p,inc ,0,p ,true, c);

double d[SIZE], y[SIZE], x[SIZE], w[SIZE], Vh[SIZE]; // d, y and x are arrays of h float numbers

int nh[SIZE], Nh[SIZE]; //stratum sample sizes

int temp;

double Vybarst;

//backward calculation for the 3dp results

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i] = g;

y[i] = dk2[i][p];

//c[i] = c[i+1];

}

else if(i<h && i>1)

{

d[i] = d[i+1]-y[i+1];

temp = (int)(d[i]*points);

y[i] = dk2[i][temp];

//c[i] = c[i+1];

}

else if(i==1)

{

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

//c[i] = c[i+1];

}

}

//setup the limits for the 6dp calculations

for(int i=h; i>=1; i--)

{

temp = (int)(y[i]*points*z);

ylimits[i] = temp;

}

f=fun(h, e-1, inc2, ylimits[h]-factor*z, ylimits[h]+factor*z, false, c);// for k>=2

cout <<"Strata: L = " << h << setw(30) << "Distance: d = " << g << endl;

printf("\nf(h,g): %.10f \n" ,f);

cout << setw(20) << "\n\n Distance" << setw(25) << "Width" << setw(26) <<

"Boundary" << endl;

cout << setfill('-') << setw(73) << "-";

//Backward calculation for the 6 dp, compute d, y and x

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i]=g;

y[i] = dk2[i][(e-1)];

x[i]=s+g;

}

else if(i<h && i>1)

{

//cout << d[i+1] << "\t\t" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

temp = (int)(d[i]*points*z);

y[i]=dk2[i][temp];

x[i]=x[i+1]-y[i+1];

}

else if(i==1)

{

//cout << "d=" << d[i+1] << "\t\ty=" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

x[i]=y[i]+s;

}

printf("\nd[%i] = %f \t y[%i] = %f \t x[%i] = %f" , i, d[i], i, y[i], i, x[i]);

}

cout << endl << setfill('-') << setw(73) << "-" << endl;

getch();

system ("PAUSE");

return 0;

} //end main

double RootVal(int k, double d, double y, double c) //calculate the root value of the current distribution

{

double rtval;

double calc;

double A = exp(-1*lambda*(d-y+s));

double B = (1/pow(lambda,2))*pow((1-exp(-1*lambda*y)),2);

double C = pow(y,2)*exp(-1*lambda*y);

double Wh = exp(-1*lambda*(d-y+s))*(1-exp(-1*lambda*y));

double Sig2h = ((1/pow(lambda,2))*pow((1-exp(-1*lambda*y)),2)-

pow(y,2)*exp(-1*lambda*y))/pow((1-exp(-1*lambda*y)),2);

calc = pow(A,2)*(B-C)*c;

if (calc<0)
{
    cout<<"\nError: Negative root...\n";
    return -1; //invalid root: the caller tests for -1
}

rtval = sqrt(calc);

return rtval;

}

double Minimum(double val1,double val2) // returns minimum of 2 numbers

{

if (val1<=val2)

{

return val1;

}

else

{

return val2;

}

}

/*this function performs the same actions as "function". It differs only in the

iterations of the for loop.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double cost[])

{

assert (k>=1); //Abort if k is less than 1

double dblRetVal;

double d = n*incf; //d value for the function

double y;

//int c;

double min;

double val;

double miny = 0;

int col;

if(k==1) //base case

{

y = d;

dblRetVal = RootVal(k, d, y, cost[0]);

}

else

{

    bool haveMin = false; //becomes true once a valid candidate has been evaluated
    for(int i=minYk; i<=maxYk; i++) //iterate over the interval of widths allowed on the current grid
    {
        if(i<0 || i>n)
        {
            continue; //width outside [0, d]: n-i would not be a valid column
        }
        y = i*incf; //candidate width y on the current grid (step incf)
        double root;
        root = RootVal(k, d, y, cost[k-1]); //calculate the root.
        if(root == -1) //skip the candidate if the root is invalid
        {
            continue;
        }
        col = n-i; //column of the remaining distance d-y
        if(minkf2[k-1][col]==-9999) //check if the result has been previously calculated
        {
            if(isFirstRun)
            {
                val = root+ fun((k-1),col,incf,0,col,true,cost); //if not, calculate the result
            }
            else
            { //if not, calculate the result inside the refinement window
                val = root+ fun((k-1),col,incf,ylimits[k-1]-factor*z,ylimits[k-1]+factor*z,false,cost);
            }
        }
        else
        {
            val = root+ minkf2[k-1][col]; //if result exists, use it for calculations
        }
        if(!haveMin || val<=min) //keep the current minimum and the width that attains it
        {
            min = val;
            miny = y;
            haveMin = true;
        }
    }//end for

    dblRetVal = haveMin ? min : HUGE_VAL; //no admissible width: report an infinite cost

}//end else

//store the f and the d value of the minimum calculated.

col = n;

minkf2[k][col] = dblRetVal;

dk2[k][col]=miny;

return dblRetVal;

}//end function

Appendix B:

The C++ Program Created to Determine the OSB

with cost factor for Right-Triangular Distribution

/* This program gives the optimum strata width and optimum strata boundaries of

the main study variable following the right-triangular distribution */

#include <iostream>

#include <math.h>

#include <assert.h>

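#include <conio.h> //non-standard (DOS/Windows) header, needed for getch()
#include <cstdio> //printf
#include <cstdlib> //system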

#include <iomanip>

using namespace std;

typedef double Number;

/*********************************************************************/

//declare and initialise global constants

# define z 100 //refine to 5 decimal places

#define a 1

#define b 2

# define factor 4

# define inc 0.001 // amount of precision (10^-3)

# define inc2 0.00001 // amount of precision (10^-5)

# define prec (1/inc)

# define points 1000 //Keep this to be 1/inc

// function declarations

double RootVal(int k, double d, double y, double c); /*calculates the value of objective

function and the minimal elements*/

double Minimum(double val1,double val2); // returns minimum of 2 numbers

/*Recursive function receives the parameter k, dk, yk to calculate f.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double []);

//void WeightSamPop(int h, int N, int nh[], int Nh[], double w[], double d[], double y[], double f, double Vybarst, double Vh[]);

//void sampsize(int nh[], int n, int h, double w[], double sig[], double prodwhsigh);

//declare global variables

int n, N; // number of strata (h), total sample size (n), pop size

double s; // s=x0, the initial value of 6.1

const int SIZE = 10;

//declare global constants and initialize their values

const double g = 1; // g is the distance

const int stages = 10;

int ylimits[10]; //stores the 3dp values for refining

const int e = (int)(g*points*z+1);

const int p = (int)(g*points+1);

double minkf2[stages][e];

double dk2[stages][e];

int h = 0;

int main() //program execution starts

{

double c[SIZE];

//take inputs of L, d and s as local variables

cout<<"Enter Number of Strata, L " << endl;

cin >> h;

s=1;

for (int i=0; i<h; i++)

{

cout<<"Enter the Cost, C " << endl;

cin >> c[i];

printf("c[%i] = %f \n\n", i+1, c[i]);

}

//initialize minkf locally

cout<<"\nInitializing points ...."<<endl;

for (int i=0; i < stages; i++)

{

for(int j=0; j<e; j++)

{

minkf2[i][j]= -9999; //assign -9999 to every cell

}

}

for (int k=0; k < stages; k++)

{

for(int l=0; l<e; l++)

{

dk2[k][l]= -9999; //assigning same as above

}

}

cout<<"Initialization complete"<<endl<<endl<<"Calculating...."<<endl<<endl;

double f=fun(h,p,inc ,0,p ,true, c);

double d[SIZE], y[SIZE], x[SIZE], w[SIZE], Vh[SIZE]; // d, y and x are arrays of h float numbers

int nh[SIZE], Nh[SIZE]; //stratum sample sizes

int temp;

double Vybarst;

//backward calculation for the 3dp results

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i] = g;

y[i] = dk2[i][p];

//c[i] = c[i+1];

}

else if(i<h && i>1)

{

d[i] = d[i+1]-y[i+1];

temp = (int)(d[i]*points);

y[i] = dk2[i][temp];

//c[i] = c[i+1];

}

else if(i==1)

{

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

//c[i] = c[i+1];

}

}

//setup the limits for the 6dp calculations

for(int i=h; i>=1; i--)

{

temp = (int)(y[i]*points*z);

ylimits[i] = temp;

}

f=fun(h, e-1, inc2, ylimits[h]-factor*z, ylimits[h]+factor*z, false, c);// for k>=2

cout <<"Strata: L = " << h << setw(30) << "Distance: d = " << g << endl;

printf("\nf(h,g): %.10f \n" ,f);

cout << setw(20) << "\n\n Distance" << setw(25) << "Width" << setw(26) <<

"Boundary" << endl;

cout << setfill('-') << setw(73) << "-";

//Backward calculation for the 6 dp, compute d, y and x

for(int i=h; i>=1; i--)

{

if(i==h)

{

d[i]=g;

y[i] = dk2[i][(e-1)];

x[i]=s+g;

}

else if(i<h && i>1)

{

//cout << d[i+1] << "\t\t" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

temp = (int)(d[i]*points*z);

y[i]=dk2[i][temp];

x[i]=x[i+1]-y[i+1];

}

else if(i==1)

{

//cout << "d=" << d[i+1] << "\t\ty=" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

x[i]=y[i]+s;

}

printf("\nd[%i] = %f \t y[%i] = %f \t x[%i] = %f" , i, d[i], i, y[i], i, x[i]);

}

cout << endl << setfill('-') << setw(73) << "-" << endl;

getch();

system("PAUSE");

return 0;

} //end main

double RootVal (int k, double d, double y, double c) //calculate the root value of the current distribution

{

double rtval;

double calc;

double A = y;

double B = ((pow(y,2))*((pow(y,2))-(6*(b-(d-y+s))*y)+(6*(pow((b-(d-

y+s)),2)))))/(18);

double Wh = y*(2*(b-(d-y+s))-y);

double Sig2h = ((pow(y,2))*((pow(y,2))-(6*(b-(d-y+s))*y)+(6*(pow((b-(d-

y+s)),2)))))/(18*pow((2*(b-(d-y+s))-y),2));

calc=pow(A,2)*(B*c);

if (calc<0)
{
    // cout<<"\nError: Negative root...\n";
    return -1; //invalid root: the caller tests for -1
}

rtval = sqrt(calc);

return rtval;

}

double Minimum (double val1,double val2) // returns minimum of 2 numbers

{

if (val1<=val2)

{

return val1;

}

else

{

return val2;

}

}

/*this function performs the same actions as "function". It differs only in the

iterations of the for loop.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double cost[])

{

assert (k>=1); //Abort if k is less than 1

double dblRetVal;

double d = n*incf; //d value for the function

double y;

//int c;

double min;

double val;

double miny = 0;

int col;

if (k==1) //base case

{

y = d;

dblRetVal = RootVal(k, d, y, cost[0]);

}

else

{

    bool haveMin = false; //becomes true once a valid candidate has been evaluated
    for(int i=minYk; i<=maxYk; i++) //iterate over the interval of widths allowed on the current grid
    {
        if(i<0 || i>n)
        {
            continue; //width outside [0, d]: n-i would not be a valid column
        }
        y = i*incf; //candidate width y on the current grid (step incf)
        double root;
        root = RootVal(k, d, y, cost[k-1]); //calculate the root.
        if(root == -1) //skip the candidate if the root is invalid
        {
            continue;
        }
        col = n-i; //column of the remaining distance d-y
        if(minkf2[k-1][col]==-9999) //check if the result has been previously calculated
        {
            if(isFirstRun)
            {
                val = root+ fun((k-1),col,incf,0,col,true,cost); //if not, calculate the result
            }
            else
            { //if not, calculate the result inside the refinement window
                val = root+ fun((k-1),col,incf,ylimits[k-1]-factor*z,ylimits[k-1]+factor*z,false,cost);
            }
        }
        else
        {
            val = root+ minkf2[k-1][col]; //if result exists, use it for calculations
        }
        if(!haveMin || val<=min) //keep the current minimum and the width that attains it
        {
            min = val;
            miny = y;
            haveMin = true;
        }
    }//end for

    dblRetVal = haveMin ? min : HUGE_VAL; //no admissible width: report an infinite cost

}//end else

//store the f and the d value of the minimum calculated.

col = n;

minkf2[k][col] = dblRetVal;

dk2[k][col]=miny;

return dblRetVal;

}//end function

Appendix C:

The C++ Program Created to Determine the OSB

with cost factor for Standard Cauchy Distribution

/* This program gives the optimum strata width and optimum strata boundaries of

the main study variable following the standard Cauchy distribution */

#include <iostream>

#include <math.h>

#include <assert.h>

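#include <conio.h> //non-standard (DOS/Windows) header, needed for getch()
#include <cstdio> //printf
#include <cstdlib> //system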

#include <iomanip>

using namespace std;

typedef double Number;

/*********************************************************************/

//declare and initialise global constants

# define z 100 //refine to 5 decimal places

# define factor 4

# define inc 0.001 // amount of precision (10^-3)

# define inc2 0.00001 // amount of precision (10^-5)

# define prec (1/inc)

# define points 1000 //Keep this to be 1/inc

// function declarations

double RootVal(int k, double d, double y, double c); /*calculates the value of objective

function and the minimal elements*/

double Minimum(double val1,double val2); // returns minimum of 2 numbers

/*Recursive function receives the parameter k, dk, yk to calculate f.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double []);

//void WeightSamPop(int h, int N, int nh[], int Nh[], double w[], double d[], double y[], double f, double Vybarst, double Vh[]);

//void sampsize(int nh[], int n, int h, double w[], double sig[], double prodwhsigh);

//declare global variables

int n, N; // number of strata (h), total sample size (n), pop size

double s; // s=x0, the initial value of 6.1

const int SIZE = 10;

//declare global constants and initialise their values

const double g = 2; // g is the distance

const int stages = 10;

int ylimits[10]; //stores the 3dp values for refining

const int e = (int)(g*points*z+1);

const int p = (int)(g*points+1);

double minkf2[stages][e];

double dk2[stages][e];

int h = 0;

int main() //program execution starts

{

double c[SIZE];

//take inputs of L, d and s as local variables

cout<<"Enter Number of Strata, L " << endl;

cin >> h;

//cout<<"Enter Initial Value, Xo " << endl;

//cin >> s;

s=-1;

for (int i=0; i<h; i++)

{

cout<<"Enter the Cost, C " << endl;

cin >> c[i];

printf("c[%i] = %f \n\n", i+1, c[i]);

}

//initialize minkf locally

cout<<"\nInitializing points ...."<<endl;

for (int i=0; i < stages; i++)

{

for(int j=0; j<e; j++)

{

minkf2[i][j]= -9999; //assign -9999 to every cell

}

}

for (int k=0; k < stages; k++)

{

for(int l=0; l<e; l++)

{

dk2[k][l]= -9999; //assigning same as above

}

}

cout<<"Initialization complete"<<endl<<endl<<"Calculating...."<<endl<<endl;

double f=fun(h,p,inc ,0,p ,true, c);

double d[SIZE], y[SIZE], x[SIZE], w[SIZE], Vh[SIZE]; // d, y and x are arrays of h float numbers

int nh[SIZE], Nh[SIZE]; //stratum sample sizes

int temp;

double Vybarst;

//backward calculation for the 3dp results

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i] = g;

y[i] = dk2[i][p];

//c[i] = c[i+1];

}

else if(i<h && i>1)

{

d[i] = d[i+1]-y[i+1];

temp = (int)(d[i]*points);

y[i] = dk2[i][temp];

//c[i] = c[i+1];

}

else if(i==1)

{

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

//c[i] = c[i+1];

}

}

//setup the limits for the 6dp calculations

for(int i=h; i>=1; i--)

{

temp = (int)(y[i]*points*z);

ylimits[i] = temp;

}

f=fun(h, e-1, inc2, ylimits[h]-factor*z, ylimits[h]+factor*z, false, c);// for k>=2

cout <<"Strata: L = " << h << setw(30) << "Distance: d = " << g << endl;

printf("\nf(h,g): %.10f \n" ,f);

cout << setw(20) << "\n\n Distance" << setw(25) << "Width" << setw(26) <<

"Boundary" << endl;

cout << setfill('-') << setw(73) << "-";

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i]=g;

y[i] = dk2[i][(e-1)];

x[i]=s+g;

}

else if(i<h && i>1)

{

//cout << d[i+1] << "\t\t" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

temp = (int)(d[i]*points*z);

y[i]=dk2[i][temp];

x[i]=x[i+1]-y[i+1];

}

else if(i==1)

{

//cout << "d=" << d[i+1] << "\t\ty=" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

x[i]=y[i]+s;

}

printf("\nd[%i] = %f \t y[%i] = %f \t x[%i] = %f" , i, d[i], i, y[i], i, x[i]);

}

//WeightSamPop(h, N, nh, Nh, w, d, y, f, Vybarst, Vh);

cout << endl << setfill('-') << setw(73) << "-" << endl;

getch();

system("PAUSE");

return 0;

} //end main

double RootVal(int k, double d, double y, double c) //calculate the root value of the current distribution

{

double rtval;

double calc;

calc = (y-atan(d-1)+atan(d-y-1))*(atan(d-1)-atan(d-y-1))-

0.25*pow((log((1+y*y+(2*y*(d-y-1))+

(d-y-1)*(d-y-1))/(1+(d-y-1)*(d-y-1)))),2);

if(calc<0)
{
    // cout<<"\nError: Negative root...\n";
    return -1; //invalid root: the caller tests for -1
}

rtval = 0.318309886*sqrt(calc*c); //0.318309886 = 1/pi

return rtval;

}

double Minimum(double val1,double val2) // returns minimum of 2 numbers

{

if(val1<=val2)

{

return val1;

}

else

{

return val2;

}

}

/*this function performs the same actions as "function". It differs only in the

iterations of the for loop.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double cost[])

{

assert (k>=1); //Abort if k is less than 1

double dblRetVal;

double d = n*incf; //d value for the function

double y;

//int c;

double min;

double val;

double miny = 0;

int col;

if(k==1) //base case

{

y = d;

dblRetVal = RootVal(k, d, y, cost[0]);

}

else

{

    bool haveMin = false; //becomes true once a valid candidate has been evaluated
    for(int i=minYk; i<=maxYk; i++) //iterate over the interval of widths allowed on the current grid
    {
        if(i<0 || i>n)
        {
            continue; //width outside [0, d]: n-i would not be a valid column
        }
        y = i*incf; //candidate width y on the current grid (step incf)
        double root;
        root = RootVal(k, d, y, cost[k-1]); //calculate the root.
        if(root == -1) //skip the candidate if the root is invalid
        {
            continue;
        }
        col = n-i; //column of the remaining distance d-y
        if(minkf2[k-1][col]==-9999) //check if the result has been previously calculated
        {
            if(isFirstRun)
            {
                val = root+ fun((k-1),col,incf,0,col,true,cost); //if not, calculate the result
            }
            else
            { //if not, calculate the result inside the refinement window
                val = root+ fun((k-1),col,incf,ylimits[k-1]-factor*z,ylimits[k-1]+factor*z,false,cost);
            }
        }
        else
        {
            val = root+ minkf2[k-1][col]; //if result exists, use it for calculations
        }
        if(!haveMin || val<=min) //keep the current minimum and the width that attains it
        {
            min = val;
            miny = y;
            haveMin = true;
        }
    }//end for

    dblRetVal = haveMin ? min : HUGE_VAL; //no admissible width: report an infinite cost

}//end else

//store the f and the d value of the minimum calculated.

col = n;

minkf2[k][col] = dblRetVal;

dk2[k][col]=miny;

return dblRetVal;

}//end function

Appendix D:

The C++ Program Created to Determine the OSB

with cost factor for Power Distribution

/* This program gives the optimum strata width and optimum strata boundaries of

the main study variable following the power distribution */

#include <iostream>

#include <math.h>

#include <assert.h>

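#include <conio.h> //non-standard (DOS/Windows) header, needed for getch()
#include <cstdio> //printf
#include <cstdlib> //system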

#include <iomanip>

using namespace std;

typedef double Number;

/*********************************************************************/

//declare and initialise global constants

# define z 100 //refine to 5 decimal places

# define factor 4

# define inc 0.001 // amount of precision (10^-3)

# define inc2 0.00001 // amount of precision (10^-5)

# define prec (1/inc)

# define points 1000 //Keep this to be 1/inc

// function declarations

double RootVal(int k, double d, double y, double c); /*calculates the value of objective

function and

the minimal elements*/

double Minimum(double val1,double val2); // returns minimum of 2 numbers

/*Recursive function receives the parameter k, dk, yk to calculate f.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double []);

//void WeightSamPop(int h, int N, int nh[], int Nh[], double w[], double d[], double y[], double f, double Vybarst, double Vh[]);

//void sampsize(int nh[], int n, int h, double w[], double sig[], double prodwhsigh);

//declare global variables

int n, N; // number of strata (h), total sample size (n), pop size

double s; // s=x0, the initial value of 6.1

const int SIZE = 10;

//const double lambda = 1;

//declare global constants and initialise their values

const double g = 1; // g is the distance

const int stages = 10;

int ylimits[10]; //stores the 3dp values for refining

const int e = (int)(g*points*z+1);

const int p = (int)(g*points+1);

double minkf2[stages][e];

double dk2[stages][e];

int h = 0;

int main() //program execution starts

{

double c[SIZE];

//take inputs of L, d and s as local variables

cout<<"Enter Number of Strata, L " << endl;

cin >> h;

//cout<<"Enter Range of Data, d " << endl;

//cin >> g;

cout<<"Enter Initial Value, Xo " << endl;

cin >> s;

s=0; //x0 is fixed at 0 here, so the value entered above is not used

for (int i=0; i<h; i++)

{

cout<<"Enter the Cost, C " << endl;

cin >> c[i];

printf("c[%i] = %f \n\n", i+1, c[i]);

}

//initialize minkf locally

cout<<"\nInitializing points ...."<<endl;

for (int i=0; i < stages; i++)

{

for(int j=0; j<e; j++)

{

minkf2[i][j]= -9999; //assign -9999 to every cell

}

}

for (int k=0; k < stages; k++)

{

for(int l=0; l<e; l++)

{

dk2[k][l]= -9999; //assigning same as above

}

}

cout<<"Initialization complete"<<endl<<endl<<"Calculating...."<<endl<<endl;

double f=fun(h,p,inc ,0,p ,true, c);

double d[SIZE], y[SIZE], x[SIZE], w[SIZE], Vh[SIZE]; // d, y and x are arrays of h float numbers

int nh[SIZE], Nh[SIZE]; //stratum sample sizes

int temp;

double Vybarst;

//backward calculation for the 3dp results

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i] = g;

y[i] = dk2[i][p];

//c[i] = c[i+1];

}

else if(i<h && i>1)

{

d[i] = d[i+1]-y[i+1];

temp = (int)(d[i]*points);

y[i] = dk2[i][temp];

//c[i] = c[i+1];

}

else if(i==1)

{

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

//c[i] = c[i+1];

}

}

//setup the limits for the 6dp calculations

for(int i=h; i>=1; i--)

{

temp = (int)(y[i]*points*z);

ylimits[i] = temp;

}

f=fun(h, e-1, inc2, ylimits[h]-factor*z, ylimits[h]+factor*z, false, c);// for k>=2

cout <<"Strata: L = " << h << setw(30) << "Distance: d = " << g << endl;

// cout << "Sample Size: n = " << n << setw(30) << "Population Size: N = " << N << endl;

// printf("\n\nAccurate values derived after refining\n");

printf("\nf(h,g): %.10f \n" ,f);

cout << setw(20) << "\n\n Distance" << setw(25) << "Width" << setw(26) <<

"Boundary" << endl;

cout << setfill('-') << setw(73) << "-";

//Backward calculation for the 6 dp, compute d, y and x

for(int i=h; i>=1; i--)

{

//c[i] = i+1;

if(i==h)

{

d[i]=g;

y[i] = dk2[i][(e-1)];

x[i]=s+g;

}

else if(i<h && i>1)

{

//cout << d[i+1] << "\t\t" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

temp = (int)(d[i]*points*z);

y[i]=dk2[i][temp];

x[i]=x[i+1]-y[i+1];

}

else if(i==1)

{

//cout << "d=" << d[i+1] << "\t\ty=" << y[i+1] << endl;

d[i]=d[i+1]-y[i+1];

y[i]=d[i];

x[i]=y[i]+s;

}

printf("\nd[%i] = %f \t y[%i] = %f \t x[%i] = %f" , i, d[i], i, y[i], i, x[i]);

}

cout << endl << setfill('-') << setw(73) << "-" << endl;

getch();

system("PAUSE");

return 0;

} //end main

//calculate the root value of the current distribution

double RootVal(int k, double d, double y, double c)

{

double rtval;

double calc;

calc = 3*pow(y,4)+24*pow(y,3)*(d-y)+84*pow(y,2)*pow((d-y),2)

+120*y*pow((d-y),3)+60*pow((d-y),4);

if(calc<0)

{

cout<<"\nError: Negative root...\n";

rtval = -1; //flag an invalid root; fun() tests for this value

}

else

{

rtval = pow(y,2)*sqrt(calc *c)/(4*pow(5,0.5));

}

return rtval;

}

double Minimum(double val1,double val2) // returns minimum of 2 numbers

{

if(val1<=val2)

{

return val1;

}

else

{

return val2;

}

}

/*This function performs the same actions as "function". It differs only in the

iterations of the for loop.*/

double fun(int k,int n,double incf,int minYk,int maxYk,bool isFirstRun, double cost[])

{

assert (k>=1); //abort if k is less than 1

double dblRetVal;

double d = n*incf; //d value for the function

double y;

//int c;

double min;

double val;

double miny = 0;

int col;

if(k==1) //base case

{

y = d;

dblRetVal = RootVal(k, d, y, cost[0]);

}

else

{

for(int i=minYk; i<=maxYk; i++) //iterate over the interval allowed to calculate the 6dp results

{

y = i*incf; //this sets the precision of y to 6dp

double root;

root = RootVal(k, d, y, cost[k-1]); //calculate the root.

if(root != -1) //if root is valid

{

col = n-i; //get the current d value

if(minkf2[k-1][col]==-9999) //check if the result has been previously calculated

{

if(isFirstRun) //if not, calculate it over the full range

{

val = root+ fun((k-1),col,incf,0,col,true, cost);

}

else //if not, calculate it within the refined limits

{

val = root+ fun((k-1),col,incf,ylimits[k-1]-factor*z,ylimits[k-1]+factor*z,false, cost);

}

}

else

{

val = root+ minkf2[k-1][col]; //if result exists, use it for calculations

}

}

if (i==minYk)

{

min =val;//base case

}

else

{

min = Minimum(min,val); //get the minimum of the result and the current minimum

}

if(min == val)

{

miny=y;

}//get the position of the current minimum

}//end for

dblRetVal = min;

}//end else

//store the f value and the y value of the minimum calculated.

col = n;

minkf2[k][col] = dblRetVal;

dk2[k][col]=miny;

return dblRetVal;

}//end function
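
The recursion in fun() and the backward calculation in main() can be easier to follow on a reduced example. The sketch below is not the thesis program: it assumes an integer grid of cells instead of the 3dp and 6dp increments, and it replaces RootVal() with a hypothetical stage cost, the square of the stratum width, chosen only because its optimum (an equal split) is easy to check by hand. The names stageCost, Tables, solve and recoverWidths are illustrative. It keeps the same ideas as the listing: a table of stored minima with a -9999 sentinel, a table of the optimal widths, and a backward pass that recovers the widths.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stage cost, for illustration only: the square of the width y.
// The thesis listing evaluates RootVal() at this point instead.
static double stageCost(int y) { return static_cast<double>(y) * y; }

const double UNSET = -9999.0;  // "not yet computed" marker, as in the listing

// best[k][n]  : minimum cost of splitting n cells into k strata
// width[k][n] : width of stratum k that attains this minimum
struct Tables {
    std::vector<std::vector<double>> best;
    std::vector<std::vector<int>> width;
    Tables(int stages, int cells)
        : best(stages + 1, std::vector<double>(cells + 1, UNSET)),
          width(stages + 1, std::vector<int>(cells + 1, 0)) {}
};

// Minimum total cost of splitting n grid cells into k strata.
double solve(int k, int n, Tables& t) {
    assert(k >= 1 && n >= 0);
    if (t.best[k][n] != UNSET) return t.best[k][n];  // reuse a stored result

    double bestVal;
    int bestY;
    if (k == 1) {  // base case: a single stratum takes the whole width
        bestY = n;
        bestVal = stageCost(n);
    } else {
        // start from y = 0, then try every other width for stratum k
        bestY = 0;
        bestVal = stageCost(0) + solve(k - 1, n, t);
        for (int y = 1; y <= n; ++y) {
            double val = stageCost(y) + solve(k - 1, n - y, t);
            if (val < bestVal) {
                bestVal = val;
                bestY = y;
            }
        }
    }
    t.best[k][n] = bestVal;   // store the minimum
    t.width[k][n] = bestY;    // and the width that attains it
    return bestVal;
}

// Backward pass: read the optimal widths y[1..k] out of the width table.
// solve(k, n, t) must have been called first.
std::vector<int> recoverWidths(int k, int n, const Tables& t) {
    std::vector<int> y(k + 1, 0);
    for (int i = k; i >= 1; --i) {
        y[i] = t.width[i][n];
        n -= y[i];  // remaining distance for strata 1..i-1
    }
    return y;
}
```

For example, after constructing `Tables t(3, 9)`, a call to `solve(3, 9, t)` fills the tables for three strata over nine cells, and `recoverWidths(3, 9, t)` then returns the widths in positions 1 to 3. With the squared-width cost the three widths come out equal.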