36 MR Lecture 6

8/2/2019 36 MR Lecture 6

1/36

Sampling:Design and Procedures

8/2/2019 36 MR Lecture 6

2/36

The Sampling Design Process

Define the Population

Determine the Sampling Frame

Select Sampling Technique(s)

Determine the Sample Size

Execute the Sampling Process

8/2/2019 36 MR Lecture 6

3/36

Define the Target Population

The target population is the collection of elements or objectsthat possess the information sought by the researcher andabout which inferences are to be made. The targetpopulation should be defined in terms of elements, samplingunits, extent, and time.

An element is the object about which or from which theinformation is desired, e.g., the respondent.

A sampling unit is a unit containing the element, that is

available for selection at some stage of the samplingprocess. For e.g. individuals living in a household.

Extent refers to the geographical boundaries.

Time is the time period under consideration.

8/2/2019 36 MR Lecture 6

4/36

Define the Sample Size

Important qualitative factors in determining the sample size

the importance of the decision (more important decision larger thesample size)

the nature of the research (for exploratory research designs the

sample size is small) the number of variables (to cover large number of variables large

sample size is required)

the nature of the analysis (for multivariate analysis techniques largesample size is required)

sample sizes used in similar studies ( table in the next slide gives some

indication of the sample size) incidence rates ( eligible respondents)

completion rates (mailed questionnaire has less completion rate)

8/2/2019 36 MR Lecture 6

5/36

Determine the Sampling Size

Type of Study Minimum Size Typical Range

Problem identification research(e.g. market potential)

500 1,000-2,500

Problem-solving research (e.g.pricing) 200 300-500

Product tests 200 300-500

Test marketing studies 200 300-500

TV, radio, or print advertising (percommercial or ad tested)

150 200-300

Test-market audits 10 stores 10-20 stores

Focus groups 2 groups 4-12 groups

8/2/2019 36 MR Lecture 6

6/36

Determine the Sampling Frame

Sampling Frame: It is the representation of the

elements of the target population. It consists

of a list or set of directions for identifying the

target population.

8/2/2019 36 MR Lecture 6

7/36

Classification of Sampling Techniques

Sampling Techniques

Non-probabilitySampling Techniques

ProbabilitySampling Techniques

Convenience

Sampling

Judgmental

Sampling

Quota

Sampling

Snowball

Sampling

Systematic

Sampling

Stratified

Sampling

Cluster

Sampling

Other Sampling

Techniques

Simple Random

Sampling

8/2/2019 36 MR Lecture 6

8/36

Non-Probability Sampling Techniques

8/2/2019 36 MR Lecture 6

9/36

Convenience Sampling

Convenience sampling attempts to obtain a sample of

convenient elements. Often, respondents are selected

because they happen to be in the right place at the right time.

use of students, and members of social organizations

mall intercept interviews without qualifying the

respondents

department stores using ch

arge account lists people on the street interviews

8/2/2019 36 MR Lecture 6

10/36

Judgmental Sampling

Judgmental sampling is a form of convenience sampling in

which the population elements are selected based on the

judgment of the researcher.

test markets

purchase engineers selected in industrial marketing

research

expert witnesses used in court

8/2/2019 36 MR Lecture 6

11/36

Quota Sampling

Quotasampling may be viewed as two-stage restricted judgmental sampling.

The first stage consists of developing control categories, or quotas, ofpopulation elements.

In the second stage, sample elements are selected based on pro-rata,convenience or judgment.

8/2/2019 36 MR Lecture 6

12/36

Snowball Sampling

In snowball sampling, an initial group of respondents is

selected, usually at random.

After being interviewed, these respondents are asked toidentify others who belong to the target population of

interest.

Subsequent respondents are selected based on the

referrals.

8/2/2019 36 MR Lecture 6

13/36

Probability Sampling Techniques

8/2/2019 36 MR Lecture 6

14/36

Simple Random Sampling (SRS)

Each element in the population has a known and equal

probability of selection.

Each possible sample of a given size (n) has a known and

equal probability of being th

e sample actually selected. This implies that every element is selected independently of

every other element.

8/2/2019 36 MR Lecture 6

15/36

Systematic Sampling

The sample is chosen by selecting a random starting point and thenpicking every ith element in succession from the sampling frame.

The sampling interval, i, is determined by dividing the population size N bythe sample size n and rounding to the nearest integer.

When the ordering of the elements is related to the characteristic ofinterest, systematic sampling increases the representativeness of thesample.

If the ordering of the elements produces a cyclical pattern, systematicsampling may decrease the representativeness of the sample.

8/2/2019 36 MR Lecture 6

16/36

Stratified Sampling

A two-step process in which the population is partitioned into

subpopulations, or strata.

The strata should be mutually exclusive and collectively

exh

austive in th

at every population element sh

ould beassigned to one and only one stratum and no population

elements should be omitted.

Next, elements are selected from each stratum by a random

procedure, usually SRS.

A major objective of stratified sampling is to increase

precision without increasing cost.

8/2/2019 36 MR Lecture 6

17/36

Stratified Sampling

The elements within a stratum should be homogeneous vis--vis aparticular attribute, but the elements in different strata should be asheterogeneous as possible.

The stratification variables should also be closely related to thecharacteristic of interest.

In proportionate stratified sampling, the size of the sample drawn fromeach stratum is proportionate to the relative size of that stratum in thetotal population.

In disproportionate stratified sampling, the size of the sample from eachstratum is proportionate to the relative size of that stratum and to the

standard deviation of th

e distribution of th

e ch

aracteristic of interestamong all the elements in that stratum. Crucially, the sampling fraction isnot the same within all strata: some strata are over-sampled relative toothers.

8/2/2019 36 MR Lecture 6

18/36

Stratified Sampling

Disproportionate stratification is used for two

purposes:

To give larger than proportionate sample sizes in one

or more sub-groups so that separate analyses by sub-

group will be possible; and

To increase the precision of key survey estimates.

NOTE: Disproportionate stratification will only reduce standard

errors (relative to a proportionate stratified sample) if the

population standard deviation for the variable of interest is

higher than average within the over-sampled strata.

8/2/2019 36 MR Lecture 6

19/36

Cluster Sampling

The target population is first divided into mutually exclusive andcollectively exhaustive subpopulations, or clusters.

Then a random sample of clusters is selected, based on a probabilitysampling technique such as SRS.

For each selected cluster, either all the elements are included in thesample (one-stage) or a sample of elements is drawn probabilistically(two-stage).

Elements within a cluster should be as heterogeneous as possible, butclusters themselves should be as homogeneous as possible. Ideally, eachcluster should be a small-scale representation of the population.

In probability proportionate tosize sampling (PPS), the clusters aresampled with probability proportional to size. Thus in the first stage largeclusters are more likely to be included. In the second stage, theprobability of selecting a sampling unit in a selected cluster variesinversely with the size of the cluster.

8/2/2019 36 MR Lecture 6

20/36

Types ofCluster SamplingCluster Sampling

One-Stage

Sampling

Multistage

Sampling

Two-Stage

Sampling

Simple Cluster

SamplingProbability

Proportionate

to Size Sampling

8/2/2019 36 MR Lecture 6

21/36

When to Use Cluster Sampling

Cluster sampling should be used only when it is economically justified - when

reduced costs can be used to overcome losses in precision. This is most likely to

occur in the following situations:

Constructing a complete list of population elements is difficult, costly, or

impossible.

For example, it may not be possible to list all of the customers of a chain of

hardware stores. However, it would be possible to randomly select a subset of

stores (stage 1 of cluster sampling) and then interview a random sample of

customers who visit those stores (stage 2 of cluster sampling).

The population is concentrated in "natural" clusters (city blocks, schools,

hospitals, etc.).

For example, to conduct personal interviews of operating room nurses, it might

make sense to randomly select a sample ofhospitals (stage 1 of cluster

sampling) and then interview all of the operating room nurses at that hospital.

Using cluster sampling, the interviewer could conduct many interviews in a

single day at a single hospital. Simple random sampling, in contrast, might

require the interviewer to spend all day traveling to conduct a single interviewat a single hospital.

8/2/2019 36 MR Lecture 6

22/36

Technique Strengths Weaknesses

Nonprobability SamplingConvenience sampling

Least expensive, leasttime-consuming, mostconvenient

Selection bias, sample notrepresentative, not recommended fordescriptive or causal research

Judgmental sampling Low cost, convenient,not time-consuming

Does not allow generalization,subjective

Quota sampling Sample can be controlled

for certain characteristics

Selection bias, no assurance of

representativenessSnowball sampling Can estimate rare

characteristicsTime-consuming

Probability samplingSimple random sampling(SRS)

Easily understood,results projectable

Difficult to construct samplingframe, expensive, lower precision,no assurance of representativeness.

Systematic sampling Can increase

representativeness,easier to implement thanSRS, sampling frame notnecessary

Can decrease representativeness

Stratified sampling Include all importantsubpopulations,

precision

Difficult to select relevantstratification variables, not feasible tostratify on many variables, expensive

Cluster sampling Easy to implement, cost

effective

Imprecise, difficult to compute and

interpret results

Strengths and Weaknesses of

Basic Sampling Techniques

8/2/2019 36 MR Lecture 6

23/36

Choosing Nonprobability vs.

Probability Sampling

Conditions Favoring the Use of

Factors Nonprobabilitysampling

Probabilitysampling

Nature of research Exploratory Conclusive

Relative magnitude of samplingand nonsampling errors

Nonsamplingerrors arelarger

Samplingerrors arelarger

Variability in the population Homogeneous(low)

Heterogeneous(high)

Statistical considerations Unfavorable Favorable

Operational considerations Favorable Unfavorable

8/2/2019 36 MR Lecture 6

24/36

Univariate Analysis

UNIVARIATE ANALYSIS

Definition: Method for analyzing data on a single variable at a

time.

Sometimes, all a researcher wants to do is report the data onone variable. For example, you might want to report the

number ofhigh school dropouts in New Delhi since

2005. That is a type of univariate analysis.

8/2/2019 36 MR Lecture 6

25/36

Univariate Analysis

Univariate method

Frequencies: A frequency distribution is a listing of categories

of possible values for a variable, together with a tabulation of

th

e number of observations in each

category.

An example of a frequency is the following table of a

frequency distribution of the POLVIEWS that asks the

respondent to classify her/his political views.

8/2/2019 36 MR Lecture 6

26/36

Univariate Analysis

Value Frequency

Extremely

Liberal

1 12

Liberal 2 63

Slightly Liberal 3 62

Moderate 4 202

Slightly

Conservative

5 71

Conservative 6 59

Extremely

Conservative

7 9

Dont Know 8 21

NA 9 1

Total 500

Sample Table 1:

POLVIEWS THINK OF SELF AS LIBERAL

OR CONSERVATIVE.

The values column contains thenumbers arbitrarily assigned to this

ordinal variable. The frequency

column contains the number of

observations that correspond to each

category.

As you can see, most people consider

themselves to be moderate (202

people).

8/2/2019 36 MR Lecture 6

27/36

Value Frequency Percent Valid

Percent

Cumulative

Percent

Extremely

Liberal

1 12 2.4 2.5 2.5

Liberal 2 63 12.6 13.2 15.7

Slightly Liberal 3 62 12.4 13.0 28.7

Moderate 4 202 40.4 42.3 70.9

Slightly

Conservative

5 71 14.2 14.9 85.8

Cons

erva

tive 6 59 11.8 12.3 98.1Extremely

Conservative

7 9 1.8 1.9 100.0

Dont Know 8 21 4.2 Missing

NA 9 1 .2 Missing

Total 500 100.0 100.0

Sample Table 2

8/2/2019 36 MR Lecture 6

28/36

Univariate Analysis

In Sample Table 2, the percent column provides the relative frequencyfor each category. The valid percent column calculates the percentageswithout the missing data being included. The cumulative percent talliesthe cumulative percentage of the valid percentages.

The relative frequency is usually much more usefulthan the frequency

itself because it puts th

e raw number into perspective. Now, we can saythat 42% of the sample were moderate.

We can use frequencies on any type of variable.

Other methods (They are better for variables that have multiple responsecategories, such as interval variables):

Histograms

Pie Charts

Bar Diagrams

8/2/2019 36 MR Lecture 6

29/36

Multivariate Analysis

Multivariate testing

In statistics, multivariate testing or multi-variable testing is a technique

for testing hypotheses on complex multi-variable systems, especially used

in testing market perceptions.

In marketing, multivariate hypothesis testing is a process by which morethan one component of a product/service may be tested in a live

environment. It can be thought of in simple terms as numerous A/B tests

performed on one page at the same time.

A/B tests are usually performed to determine the better of two content

variations, multivariate testing can th

eoretically test th

e effectiveness oflimitless combinations. The only limits on the number of combinations

and the number of variables in a multivariate test are the amount of time

it will take to get a statistically valid sample of visitors and computational

power.

8/2/2019 36 MR Lecture 6

30/36


Multivariate hypothesis testing is usually employed in order to ascertain which

content or creative variation produces the best improvement in the pre-defined.

For e.g. in case of the User Experience Testing for a website, dramatic increase in

user convenience can be seen through testing different copy text, form layouts and

even landing page images and background colours.

However, not all elements produce the same increase in user convenience, and by

looking at the results from different tests, it is possible to identify those elements

that consistently tend to produce the greatest increase in user convenience.

8/2/2019 36 MR Lecture 6

31/36


Testing can be carried out on a dynamically generated website by setting

up the server to display the different variations of content in equal

proportions to incoming visitors. Statistics on how each visitor went on to

behave after seeing the content under test must then be gathered and

presented. Outsourced services can also be used to provide multivariate

testing on websites with minor changes to page coding. These services

insert their content to predefined areas of a site and monitor user

behavior.

In a nutshell, multivariate testing can be seen as allowing respondents to

vote for the preferred option and will stand the most chance of them

proceeding to a defined goal.

8/2/2019 36 MR Lecture 6

32/36

Measurements of Association

In certain situations, we are interested todetermine whether some association existsamong the groups of data and the intensity of

this association. We make use of certain statistical tools to

determine whether any association exists amonggroups of data.

These statistical tools are: Correlation Analysis

Regression Analysis

8/2/2019 36 MR Lecture 6

33/36


Correlation Analysis

It is a statistical tool with the help of which we determinethe intensity of relationship between two or more than twovariables.

Two variables are said to be correlated if the movements inone are accompanied by the movements in the other.

For example: With the increase of rainfall up to a certainextent the production of rice increases.

The Correlation Coefficient is a measure of correlation and

summarises the direction and degree of correlation. To determine correlation, we could use:

Karl Pearsons Coefficient ofCorrelation

Rank Correlation

8/2/2019 36 MR Lecture 6

34/36


Karl Pearsons Coefficient ofCorrelation: It is denoted by

r.

The formula used to find r is as follows:

The value of r will lie between -1 and +1. A positive sign and negative sign for r have significant

implications.

Where X and Y are two

variables under study

and X and Y are their

respective means.

8/2/2019 36 MR Lecture 6

35/36


Rank Correlation

It is also known as Spearmans Rank Correlation method.

This method holds good when quantitative measures of

certain factors such

as evaluation of salesmens leadersh

ipability cannot be ascertained.

Then for these items of individuals the Rank Correlation

Coefficient is determined.

The formula used is:

Where d = Differences of rank

between paired items in two series

n = No. of units

d = X-Y

8/2/2019 36 MR Lecture 6

36/36


Regression Analysis

It is a statistical tool which enables a researcher to estimateor predict the values of unknown variables from the knownvalues of another variable, i.e. we can say that with the

help of regression analysis, a marketing researcher is ableto determine the average probable change in one variablegiven a certain amount of change in another variable.

For example: If a marketer has got the values on advertisingexpenditure with regard to a companys product, he canpredict the amounts of sales corresponding to eachadvertising expenditure or vice versa.

The regression analysis enables us to find the cause andeffect relationship between variables.

Documents

36 MR Lecture 6