
Sampling


Concept of sample and sampling

Sampling process and problems

Types of samples: Probability and non-probability sampling

Determination of sample size

Sampling and non-sampling errors

Why Sampling?

In any statistical investigation, complete enumeration of the population is rather impractical. If the population is infinite, complete enumeration is not possible.

Even if the population is finite, 100% enumeration is usually not undertaken because of administrative and financial implications, the time factor, the resources available, etc. So there is a need to take the help of sampling.

The foremost purpose of sampling is to gather maximum information about the population under consideration at minimum cost, time and human power.

Why and in what situations is sampling inevitable?

When the population is infinite

When the items or units are destroyed during investigation

When the results are required in a short time

When the resources for a survey are limited, particularly money and trained persons

When the area of the survey is wide

Population:

A population is the group of items, units or subjects under study. A population may consist of a finite or an infinite number of units. Populations can be classified into four categories.

Finite Population: If the population consists of a fixed and limited number of items or units, it is known as a finite population. Examples: the workers in a factory, the students in a college, etc.

Infinite Population: If the population consists of an infinite number of items or units, it is called an infinite population. Examples: the stars in the sky, the water in a pond, etc.

Real Population: A population consisting of items that are all physically present is termed a real population. Examples: the number of court cases in a day, the number of snake-bite patients admitted to a hospital, etc.

Hypothetical Population: A population consisting of the results of repeated trials is called a hypothetical population. Tossing a coin or rolling a die again and again are examples of hypothetical populations.

Sample

A sample is a part or fraction of a population selected on some basis. It consists of a few representative items of the population. Equivalently, it is a finite subset of the statistical individuals in a population, and the number of individuals in the sample is called the sample size (n).

For example, if A = {1, 2, 3, 4, 5, 6, ..., 100} and B = {3, 7, 9, 10, 45, 51, 55, 72, 91, 96}, then B ⊂ A: A is the population and B is a sample of size n = 10.

In principle, a sample should be a true representative of the population or universe.

Different sampling techniques are used to select sample units according to the nature of the population (homogeneous or heterogeneous, geographical variation, etc.).

[Figure: a sample shown as a subset of the population]

Related terminology used in sampling

Sampling frame: A list or map of the population that identifies each sampling unit by a number. It is essential for adopting any sampling procedure.

Parameters and statistics: The values that describe the characteristics of the population are called parameters, and the values that describe the characteristics of a sample are called statistics.

For example:

Population parameters:

Population size = $N$

Population mean = $\mu$

Population variance = $\sigma^2$

Sample statistics:

Sample size = $n$

Sample mean = $\bar{x}$

Sample variance = $s^2$
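
The distinction between parameters and statistics can be illustrated with a small sketch. The population below (the integers 1 to 100) and the seed are hypothetical choices made only for illustration, and the sample variance is computed with the divisor n - 1, which is one common convention rather than something the slides specify.

```python
import random
import statistics

# Hypothetical population: the integers 1..100 (illustrative data only).
population = list(range(1, 101))

# Parameters describe the whole population.
N = len(population)
mu = statistics.mean(population)               # population mean
sigma_sq = statistics.pvariance(population)    # population variance (divides by N)

# Statistics describe a sample drawn from that population.
rng = random.Random(0)                         # fixed seed so the run is repeatable
sample = rng.sample(population, 10)
n = len(sample)
x_bar = statistics.mean(sample)                # sample mean
s_sq = statistics.variance(sample)             # sample variance (divides by n - 1)

print(f"Parameters: N={N}, mu={mu}, sigma^2={sigma_sq}")
print(f"Statistics: n={n}, x_bar={x_bar}, s^2={s_sq:.2f}")
```

The parameters are fixed numbers, while the statistics change every time a different sample is drawn.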

Sampling Process

The sampling process is the set of steps followed to select the final sample units. The steps are carried out in the sequential order shown below (seven-step sampling design).

1. Define the population

2. Specify the sampling frame

3. Specify the sampling units

4. Select the sampling method

5. Determine the sample size

6. Specify the sampling plan

7. Select the sample

Types of sampling:

Sampling Method

Non-Probability Sampling: Judgment Sampling, Quota Sampling, Convenience Sampling, Snowball Sampling

Probability Sampling: Simple Random Sampling, Stratified Sampling, Systematic Sampling, Cluster Sampling

Non-Probability and Probability Sampling

In non-probability sampling, the items or units in the population do not all have an equal (or even a known) chance of being selected. The sample is selected mostly on the basis of the investigator's own views or ideas.

In probability sampling, each and every element in the population has a known, non-zero chance of being selected (an equal chance in simple random sampling). This is attained through some mathematical operation of randomization.

Judgment or purposive sampling

In this method of sampling, the choice of sample items depends exclusively on the judgment of the investigator. That is, the investigator exercises his or her judgment and includes in the sample those items considered most suitable.

For example, if a sample of 25 students is to be selected from a class of 90 students to analyze tobacco smoking habits, the investigator would select the 25 students who, in his or her opinion, are representative of the class.

Quota Sampling

Quota sampling is a type of judgment sampling and is perhaps the most commonly used technique in the non-probability category. In a quota sample, quotas are set up according to some specified characteristics.

For example, in a radio listening survey, the interviewers may be told to interview 500 people living in a certain area, and that out of every 100 persons interviewed 60 are to be housewives, 25 farmers and 15 children under the age of 18. Within these quotas the interviewer is free to select the people to be interviewed.
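
The quota logic in the radio survey example can be sketched as follows. The function name `quota_sample`, the stream of 300 people and the order in which they are met are all hypothetical; the point is only that people are accepted in whatever order the interviewer finds them until each quota is full, so no randomization is involved.

```python
def quota_sample(people, category_of, quotas):
    """Accept people in the order they are met until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in people:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            selected.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):   # all quotas filled
            break
    return selected

# Quotas per 100 interviews, as in the example above.
quotas = {"housewife": 60, "farmer": 25, "child": 15}

# Hypothetical stream of people the interviewer happens to meet.
kinds = ["housewife", "farmer", "child"]
people = [(f"person_{i}", kinds[i % 3]) for i in range(300)]

selected = quota_sample(people, lambda p: p[1], quotas)
counts = {k: sum(1 for p in selected if p[1] == k) for k in kinds}
print(len(selected), counts)
```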

Convenience Sampling:

A convenience sample is obtained by selecting 'convenient' population units.

For example, suppose a person has to submit a project report on marketing management in the textile industry. If he or she picks a textile firm close to home, interviews some people there and submits the report, then he or she is following the convenience sampling method.

Snowball Sampling

Snowball sampling is also known as network or chain referral sampling.

In this sampling technique, one or two persons in the population are contacted first and asked to identify further persons.

New persons are identified in this way until the sample is as large as is manageable. The selection process stops either when no new people are identified or when the same people start to be named again and again.
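
The chain-referral procedure can be sketched as a traversal of a referral network. The `referrals` dictionary and the names in it are hypothetical, and the function `snowball_sample` is only an illustration of the two stopping rules described above (a manageable maximum size, or no new names).

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referrals from the seed persons until no new names appear
    or the sample reaches max_size."""
    sample = []
    seen = set()
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person in seen:            # already interviewed: a repeated name
            continue
        seen.add(person)
        sample.append(person)
        for named in referrals.get(person, []):
            if named not in seen:
                queue.append(named)
    return sample

# Hypothetical referral network: who each person names when asked.
referrals = {"A": ["B", "C"], "B": ["C", "D"], "C": ["A"], "D": []}

print(snowball_sample(referrals, ["A"], max_size=10))  # stops: no new names
print(snowball_sample(referrals, ["A"], max_size=2))   # stops: size limit
```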

Simple Random Sampling:

Simple random sampling (SRS) is the sampling technique in which each and every unit of the population has an equal chance of being selected in the sample.

Personal bias of the investigator does not influence the selection.

For example, drawing blood for a laboratory test, the lottery method, and selecting sample units with a random number table are all instances of SRS.
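
A minimal sketch of the lottery idea, using Python's standard `random` module in place of a random number table. The class of 90 students and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(list(units), n)

# Hypothetical sampling frame: a class of 90 students.
students = [f"student_{i:02d}" for i in range(1, 91)]
chosen = simple_random_sample(students, 25, seed=42)
print(chosen[:5])
```

Because the draw is made by the random number generator, the investigator's personal preference plays no part in which students are chosen.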

Stratified Random Sampling: (restricted)

In this sampling method, the whole population is first divided into relatively homogeneous groups according to some criterion. These groups are termed strata.

A sample is then drawn randomly from each stratum independently. The estimate is calculated from the data obtained from all the strata.

Proper classification of the population into strata and a suitable sample size for each stratum are the two major points that need to be considered in stratified random sampling.
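
The two steps (divide into strata, then sample randomly within each stratum) can be sketched as below. Proportional allocation is used to fix the sample size per stratum; that is an assumption made for the sketch, since the text only asks for "a suitable sample size". The factory data and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Split units into strata, then draw a simple random sample from each.

    Sizes follow proportional allocation (an assumption of this sketch);
    because of rounding, the total may differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    total = len(units)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(n * len(members) / total))
        sample[name] = rng.sample(members, min(size, len(members)))
    return sample

# Hypothetical factory: 60 production, 30 sales and 10 admin workers.
workers = ([(f"P{i}", "production") for i in range(60)]
           + [(f"S{i}", "sales") for i in range(30)]
           + [(f"A{i}", "admin") for i in range(10)])

picked = stratified_sample(workers, lambda w: w[1], n=20, seed=1)
sizes = {name: len(members) for name, members in picked.items()}
print(sizes)
```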

Systematic Sampling:

In systematic sampling, sample units are selected from a population at a uniform interval, which may be measured in time, order or space. The sample is formed by selecting one unit at random and then selecting additional units at equal intervals.

That is, from a population listed as [1, 2, 3, ..., N] (for example, 1 to 1000), the selected units are [j, j+k, j+2k, j+3k, ..., j+(n-1)k], where j (1 ≤ j ≤ k) is the randomly chosen starting unit.

In systematic sampling, N = nk, so k = N/n, where k is the sampling interval, n is the sample size and N is the population size. Such a selection procedure is known as linear systematic sampling. This technique needs a well-defined sampling frame.

The procedure fails if the population size N is not a multiple of n (i.e. N ≠ nk). In that case circular systematic sampling is used, in which k is taken as N/n rounded to the nearest integer.
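
Both variants can be sketched in a few lines, assuming the units are numbered 1 to N. The function names are hypothetical. In the circular version the start is drawn from the whole list and the count wraps around the end; that wrap-around rule is the usual textbook description rather than something spelled out above.

```python
import random

def linear_systematic_sample(N, n, seed=None):
    """Select j, j+k, ..., j+(n-1)k with k = N/n; needs N to be a multiple of n."""
    if N % n != 0:
        raise ValueError("linear systematic sampling assumes N = n * k")
    k = N // n
    j = random.Random(seed).randint(1, k)      # random start in the first interval
    return [j + i * k for i in range(n)]

def circular_systematic_sample(N, n, seed=None):
    """Use k = N/n rounded to the nearest integer and wrap around the list.

    For very small N relative to n the wrap-around can revisit a unit;
    this sketch does not guard against that.
    """
    k = max(1, round(N / n))
    j = random.Random(seed).randint(1, N)      # random start anywhere in 1..N
    return [((j - 1 + i * k) % N) + 1 for i in range(n)]

print(linear_systematic_sample(1000, 10, seed=3))    # k = 100
print(circular_systematic_sample(1000, 30, seed=3))  # k = 33
```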

Cluster Sampling:

In many situations, the sampling frame for the elementary units of the population is not available, and it is not easy to prepare one. However, information is available for groups of elements. In such cases, cluster sampling can be applied to study the population characteristics.

In this sampling technique the population is divided into groups called clusters. Each cluster includes all types of characteristics found in the population. Therefore, the characteristics are heterogeneous within a cluster and homogeneous between clusters. The clusters may or may not be of equal size.

The sampling efficiency of cluster sampling is likely to decrease as the cluster size increases. This method is used extensively when the population characteristics are heterogeneous and geographically varied.

Every elementary unit of each selected cluster is studied in this technique.


For instance, a list of colleges may be available but not a list of the students studying there. Similarly, a list of individual farms may not be available, but a list of villages generally is.

In these situations the colleges or villages are the clusters, and the sample is selected from the list of colleges or villages. Every element of each selected cluster is then studied to estimate the population characteristics.
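
The college example can be sketched as follows: only the list of clusters is needed as a sampling frame, and every unit inside a chosen cluster is kept. The college names, the student lists and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Pick m whole clusters at random and keep every unit inside them.

    `clusters` maps a cluster name to the list of its elementary units.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)   # the frame is the list of clusters
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical frame: colleges are listed, students are known only per college.
colleges = {
    "College A": [f"A{i}" for i in range(40)],
    "College B": [f"B{i}" for i in range(25)],
    "College C": [f"C{i}" for i in range(60)],
    "College D": [f"D{i}" for i in range(35)],
    "College E": [f"E{i}" for i in range(50)],
}

selected = cluster_sample(colleges, 2, seed=7)
for name, students in selected.items():
    print(name, len(students))
```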

Determination of sample size

Experts have expressed different opinions on the choice of sample size (e.g. 5%, 10% or 25% of the universe), and no hard and fast rule can be laid down. However, according to the law of large numbers, the larger the sample size, the better the estimate, i.e. the closer the sample value is to the 'true' value of the population. At the same time, the sample should be neither too large nor too small. It should be 'optimum' (in terms of efficiency, representativeness, reliability and flexibility).

The following factors should be considered while deciding the sample size:

The size of the universe and the resources available

Desired degree of accuracy or precision

Homogeneity or heterogeneity of the universe, and the nature of the study

Method of sampling adopted, and the nature of the respondents

Formula for the determination of the sample size:

$$n = \left(\frac{z\sigma}{d}\right)^2$$

where,

n = Sample size

z = Value of the standard normal variate at the specified level of confidence (desired degree of precision)

$\sigma$ = Standard deviation of the population

d = Difference between the population mean and the sample mean, i.e. the desired error in the estimation of the population mean

Example: 1) Determine the sample size if $\sigma$ = 6, the population mean is 25, the sample mean is 23 and the desired degree of precision is 99% (z at the 1% level of significance is 2.58).

2) Suppose the population mean is 45, the s.d. is 8, N = 460, the desired error in the estimation of the mean is 10%, and the desired degree of precision is 95% (z at the 5% level of significance is 1.96). Determine the sample size n.
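
The two examples can be worked through with a short sketch of the formula n = (zσ/d)². Two things here are assumptions rather than part of the slides: the result is rounded up to the next whole unit, and in Example 2 the "10% error" is read as 10% of the population mean (d = 4.5). The population size N = 460 is not used by this formula.

```python
import math

def sample_size(z, sigma, d):
    """Return n = (z * sigma / d) ** 2, unrounded."""
    return (z * sigma / d) ** 2

# Example 1: sigma = 6, d = |25 - 23| = 2, z = 2.58 (99% confidence).
n1 = sample_size(2.58, 6, abs(25 - 23))

# Example 2: sigma = 8, z = 1.96 (95% confidence);
# d is taken as 10% of the mean 45, which is an interpretation.
n2 = sample_size(1.96, 8, 0.10 * 45)

# Rounding up is a conservative convention assumed for this sketch.
print(math.ceil(n1), math.ceil(n2))
```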

If the population size N is given, then n can be determined by

$$n = \frac{n_0}{1 + \dfrac{n_0}{N}}, \qquad n_0 = \left(\frac{z\sigma}{d}\right)^2$$

Example: A survey is to be conducted to investigate the characteristics of a factor in a population of size 500, having variance 85. Determine the sample size required for an error of 2 in the estimation of the population mean.

Level of significance   $z_{\alpha/2}$   Variance ($\sigma^2$)   Error (d)   $n_0$   n
1%                      2.58             85                      2           141     110
5%                      1.96             85                      2           82      70
10%                     1.64             85                      2           58      52
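
The tabulated values can be checked with a short sketch. It assumes the finite-population adjustment n = n0 / (1 + n0/N) with n0 = z²σ²/d², and rounds to the nearest whole number; both are assumptions chosen because they reproduce the 1% and 5% rows. The 10% row is reproduced only when z is taken as 1.645 rather than the rounded 1.64 shown in the table.

```python
def sample_size_finite(z, variance, d, N):
    """Return (n0, n): the infinite-population size and its finite-population
    adjustment n = n0 / (1 + n0 / N). Both are returned unrounded."""
    n0 = z ** 2 * variance / d ** 2
    n = n0 / (1 + n0 / N)
    return n0, n

# Population of size 500 with variance 85 and permitted error 2.
for level, z in [("1%", 2.58), ("5%", 1.96), ("10%", 1.645)]:
    n0, n = sample_size_finite(z, variance=85, d=2, N=500)
    print(level, z, round(n0), round(n))
```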

Sampling and Non-sampling Errors

The errors involved in the collection, processing and analysis of data may be broadly classified into two categories:

i) Sampling errors ii) Non-sampling errors

Sampling errors: Even if the utmost care has been taken in selecting a sample, the result derived from a sample study may not be exactly equal to the true value in the population.

The reason is that the estimate is based on a part and not on the whole. Hence sampling gives rise to certain errors known as sampling errors or sampling fluctuations. These errors would not be present in a complete enumeration.

Sampling errors arise primarily for the following reasons:

1. Faulty selection of the sample (e.g. a biased or defective sampling technique such as judgment or purposive, quota, or convenience sampling)

2. Substitution (if difficulties arise in enumerating a unit, the investigator usually substitutes a convenient member of the population)

3. Faulty demarcation of sampling units (prevalent in area sampling)

4. Constant error due to improper choice of the statistic for estimating the population parameter, for example using $\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$ instead of $\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$,

where the first is a biased and the second an unbiased estimate of the population variance.
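
The constant error in point 4 can be made concrete with a small enumeration. The five-value population is hypothetical, and samples are drawn with replacement so that every possible sample of size 2 can be listed; averaging each estimator over all of them shows which one centres on the true variance.

```python
import itertools
import statistics

population = [1, 2, 3, 4, 5]                  # hypothetical toy population
sigma_sq = statistics.pvariance(population)   # true population variance

n = 2
# Every ordered sample of size n drawn with replacement (25 samples).
samples = list(itertools.product(population, repeat=n))

# Divisor n (biased) versus divisor n - 1 (unbiased), averaged over all samples.
avg_biased = statistics.mean([float(statistics.pvariance(s)) for s in samples])
avg_unbiased = statistics.mean([float(statistics.variance(s)) for s in samples])

print("population variance:", sigma_sq)
print("average with divisor n:", avg_biased)
print("average with divisor n - 1:", avg_unbiased)
```

The divisor-n estimator falls short of the population variance on average, by the factor (n - 1)/n, whereas the divisor-(n - 1) estimator matches it.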

Non-sampling errors

Non-sampling errors mainly arise at the stages of observation and processing of the data, and are thus present in both the complete enumeration survey and the sample survey. They can occur at every stage of the planning or execution of a census or sample survey.

Non-sampling errors may arise for the following reasons:

Faulty planning or definitions (e.g. the definition of employment, literacy, labour, etc.)

Response errors (these may be accidental or due to prestige bias)

Self-interest

Bias due to the interviewer

Non-response bias

Errors in coverage (inclusion of items that should be excluded, or exclusion of items that should be included)

Compiling errors (cleaning, coding, data entry)

Publication errors (errors committed in presentation and printing)