8/2/2019 36 MR Lecture 6
Sampling: Design and Procedures
The Sampling Design Process
Define the Population
Determine the Sampling Frame
Select Sampling Technique(s)
Determine the Sample Size
Execute the Sampling Process
Define the Target Population
The target population is the collection of elements or objects that possess the information sought by the researcher and about which inferences are to be made. The target population should be defined in terms of elements, sampling units, extent, and time.
An element is the object about which or from which the information is desired, e.g., the respondent.
A sampling unit is a unit containing the element that is available for selection at some stage of the sampling process, for example, a household containing the individuals of interest.
Extent refers to the geographical boundaries.
Time is the time period under consideration.
Determine the Sample Size
Important qualitative factors in determining the sample size:
the importance of the decision (more important decisions call for larger samples)
the nature of the research (exploratory research designs typically use small samples)
the number of variables (covering a large number of variables requires a large sample)
the nature of the analysis (multivariate analysis techniques require large samples)
sample sizes used in similar studies (the table on the next slide gives some indication)
incidence rates (the proportion of eligible respondents)
completion rates (mailed questionnaires have lower completion rates)
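Incidence and completion rates matter because they determine how many people must be contacted to reach the desired number of completed interviews. The sketch below uses a common adjustment that is an assumption here, not something stated on the slide: divide the target final sample size by the product of the two rates. The function name and the example figures are hypothetical.

```python
import math


def initial_sample_size(final_size, incidence_rate, completion_rate):
    """Estimate how many people to contact to end up with `final_size`
    completed interviews.

    Assumption (not from the slides): contacts needed =
    final_size / (incidence_rate * completion_rate), rounded up.
    Both rates are proportions in the interval (0, 1].
    """
    if not (0 < incidence_rate <= 1 and 0 < completion_rate <= 1):
        raise ValueError("rates must be proportions in (0, 1]")
    return math.ceil(final_size / (incidence_rate * completion_rate))


# Hypothetical figures: 300 completed interviews wanted, half of the
# contacted people are eligible, half of the eligible ones complete.
contacts_needed = initial_sample_size(300, 0.5, 0.5)
print(contacts_needed)
```

A lower completion rate, as with mailed questionnaires, pushes the number of required contacts up for the same final sample size.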
Determine the Sample Size
Type of Study | Minimum Size | Typical Range
Problem identification research (e.g., market potential) | 500 | 1,000-2,500
Problem-solving research (e.g., pricing) | 200 | 300-500
Product tests | 200 | 300-500
Test marketing studies | 200 | 300-500
TV, radio, or print advertising (per commercial or ad tested) | 150 | 200-300
Test-market audits | 10 stores | 10-20 stores
Focus groups | 2 groups | 4-12 groups
Determine the Sampling Frame
Sampling Frame: It is the representation of the
elements of the target population. It consists
of a list or set of directions for identifying the
target population.
Classification of Sampling Techniques
Sampling Techniques
Non-probability Sampling Techniques:
Convenience Sampling
Judgmental Sampling
Quota Sampling
Snowball Sampling
Probability Sampling Techniques:
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling
Other Sampling Techniques
Non-Probability Sampling Techniques
Convenience Sampling
Convenience sampling attempts to obtain a sample of
convenient elements. Often, respondents are selected
because they happen to be in the right place at the right time.
use of students and members of social organizations
mall intercept interviews without qualifying the respondents
department stores using charge account lists
"people on the street" interviews
Judgmental Sampling
Judgmental sampling is a form of convenience sampling in
which the population elements are selected based on the
judgment of the researcher.
test markets
purchase engineers selected in industrial marketing
research
expert witnesses used in court
Quota Sampling
Quota sampling may be viewed as two-stage restricted judgmental sampling.
The first stage consists of developing control categories, or quotas, of population elements.
In the second stage, sample elements are selected based on pro-rata, convenience or judgment.
Snowball Sampling
In snowball sampling, an initial group of respondents is
selected, usually at random.
After being interviewed, these respondents are asked to identify others who belong to the target population of interest.
Subsequent respondents are selected based on the
referrals.
Probability Sampling Techniques
Simple Random Sampling (SRS)
Each element in the population has a known and equal
probability of selection.
Each possible sample of a given size (n) has a known and equal probability of being the sample actually selected.
This implies that every element is selected independently of every other element.
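As a minimal sketch of drawing a simple random sample from a sampling frame, the standard library's `random.sample` picks n distinct elements so that every subset of size n is equally likely. The function name, the seed parameter and the customer IDs are hypothetical, chosen only for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame.

    random.Random.sample selects without replacement, so each
    possible sample of size n has the same chance of being drawn.
    `seed` is only there to make the illustration repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(list(frame), n)


# Hypothetical sampling frame: customer IDs 1..1000.
frame = range(1, 1001)
sample = simple_random_sample(frame, 10, seed=42)
print(len(sample), len(set(sample)))
```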
Systematic Sampling
The sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame.
The sampling interval, i, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.
When the ordering of the elements is related to the characteristic of interest, systematic sampling increases the representativeness of the sample.
If the ordering of the elements produces a cyclical pattern, systematic sampling may decrease the representativeness of the sample.
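The procedure can be sketched as follows: compute the interval i = N / n rounded to the nearest integer, pick a random start within the first interval, then take every i-th element. The function name and example frame are hypothetical, and the handling of edge cases (an interval of at least 1, trimming to n elements) is an assumption made for the illustration.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Systematic sample: random start, then every i-th element.

    The interval i is N / n rounded to the nearest integer (Python's
    round() rounds exact halves to the nearest even integer). Because
    of rounding, the result can hold slightly fewer than n elements
    when N is not a multiple of n.
    """
    frame = list(frame)
    interval = max(1, round(len(frame) / n))
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random position in the first interval
    return frame[start::interval][:n]


# Hypothetical frame of 100 numbered elements, sample of 10 (i = 10).
picked = systematic_sample(range(1, 101), 10, seed=3)
print(picked)
```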
Stratified Sampling
A two-step process in which the population is partitioned into
subpopulations, or strata.
The strata should be mutually exclusive and collectively exhaustive, in that every population element should be assigned to one and only one stratum and no population elements should be omitted.
Next, elements are selected from each stratum by a random
procedure, usually SRS.
A major objective of stratified sampling is to increase
precision without increasing cost.
Stratified Sampling
The elements within a stratum should be homogeneous vis-à-vis a particular attribute, but the elements in different strata should be as heterogeneous as possible.
The stratification variables should also be closely related to the characteristic of interest.
In proportionate stratified sampling, the size of the sample drawn from each stratum is proportionate to the relative size of that stratum in the total population.
In disproportionate stratified sampling, the size of the sample from each stratum is proportionate to the relative size of that stratum and to the standard deviation of the distribution of the characteristic of interest among all the elements in that stratum. Crucially, the sampling fraction is not the same within all strata: some strata are over-sampled relative to others.
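The two allocation rules can be compared with a short sketch. Proportionate allocation weights each stratum by its size N_h; the disproportionate rule described above weights it by N_h times the stratum standard deviation. The function names, strata and figures are hypothetical, and the allocations are left unrounded for clarity.

```python
def proportionate_allocation(n, sizes):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(sizes.values())
    return {h: n * size / total for h, size in sizes.items()}


def disproportionate_allocation(n, sizes, std_devs):
    """Split n in proportion to (stratum size x stratum standard deviation).

    Strata that are larger or more variable receive more of the sample.
    """
    weights = {h: sizes[h] * std_devs[h] for h in sizes}
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}


# Hypothetical strata: equal sizes, but stratum B is three times as variable.
sizes = {"A": 500, "B": 500}
std_devs = {"A": 1.0, "B": 3.0}
print(proportionate_allocation(100, sizes))
print(disproportionate_allocation(100, sizes, std_devs))
```

With equal stratum sizes, the proportionate rule splits the sample evenly, while the disproportionate rule over-samples the more variable stratum.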
Stratified Sampling
Disproportionate stratification is used for two
purposes:
To give larger than proportionate sample sizes in one
or more sub-groups so that separate analyses by sub-
group will be possible; and
To increase the precision of key survey estimates.
NOTE: Disproportionate stratification will only reduce standard
errors (relative to a proportionate stratified sample) if the
population standard deviation for the variable of interest is
higher than average within the over-sampled strata.
Cluster Sampling
The target population is first divided into mutually exclusive and collectively exhaustive subpopulations, or clusters.
Then a random sample of clusters is selected, based on a probability sampling technique such as SRS.
For each selected cluster, either all the elements are included in the sample (one-stage) or a sample of elements is drawn probabilistically (two-stage).
Elements within a cluster should be as heterogeneous as possible, but clusters themselves should be as homogeneous as possible. Ideally, each cluster should be a small-scale representation of the population.
In probability proportionate to size sampling (PPS), the clusters are sampled with probability proportional to size. Thus in the first stage large clusters are more likely to be included. In the second stage, the probability of selecting a sampling unit in a selected cluster varies inversely with the size of the cluster.
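A small calculation shows why the two stages of PPS balance out. The sketch assumes that cluster sizes are known exactly, that the same number of units is drawn from every selected cluster, and that no cluster is so large that its first-stage probability would exceed 1. The function name and the cluster sizes are hypothetical.

```python
def pps_inclusion_probabilities(cluster_sizes, clusters_to_draw, units_per_cluster):
    """Return (stage-1, stage-2, overall) selection probabilities per cluster.

    Stage 1: probability of picking the cluster, proportional to its size.
    Stage 2: probability of picking a given unit inside a selected
             cluster, which falls as the cluster gets larger.
    Overall: the product of the two stages.
    """
    total = sum(cluster_sizes.values())
    result = {}
    for name, size in cluster_sizes.items():
        p_cluster = clusters_to_draw * size / total
        p_unit = units_per_cluster / size
        result[name] = (p_cluster, p_unit, p_cluster * p_unit)
    return result


# Hypothetical clusters: three stores with different customer counts.
stores = {"small": 100, "medium": 200, "large": 300}
for name, probs in pps_inclusion_probabilities(stores, 1, 10).items():
    print(name, probs)
```

Under these assumptions the overall probability is the same for every unit, because the size of the cluster cancels between the two stages.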
Types of Cluster Sampling
Cluster Sampling:
One-Stage Sampling
Two-Stage Sampling
Multistage Sampling
Two-Stage Sampling:
Simple Cluster Sampling
Probability Proportionate to Size Sampling
When to Use Cluster Sampling
Cluster sampling should be used only when it is economically justified - when
reduced costs can be used to overcome losses in precision. This is most likely to
occur in the following situations:
Constructing a complete list of population elements is difficult, costly, or
impossible.
For example, it may not be possible to list all of the customers of a chain of
hardware stores. However, it would be possible to randomly select a subset of
stores (stage 1 of cluster sampling) and then interview a random sample of
customers who visit those stores (stage 2 of cluster sampling).
The population is concentrated in "natural" clusters (city blocks, schools,
hospitals, etc.).
For example, to conduct personal interviews of operating room nurses, it might
make sense to randomly select a sample of hospitals (stage 1 of cluster
sampling) and then interview all of the operating room nurses at each selected hospital.
Using cluster sampling, the interviewer could conduct many interviews in a
single day at a single hospital. Simple random sampling, in contrast, might
require the interviewer to spend all day traveling to conduct a single interview at a single hospital.
Strengths and Weaknesses of Basic Sampling Techniques
Technique | Strengths | Weaknesses
Nonprobability sampling:
Convenience sampling | Least expensive, least time-consuming, most convenient | Selection bias, sample not representative, not recommended for descriptive or causal research
Judgmental sampling | Low cost, convenient, not time-consuming | Does not allow generalization, subjective
Quota sampling | Sample can be controlled for certain characteristics | Selection bias, no assurance of representativeness
Snowball sampling | Can estimate rare characteristics | Time-consuming
Probability sampling:
Simple random sampling (SRS) | Easily understood, results projectable | Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness
Systematic sampling | Can increase representativeness, easier to implement than SRS, sampling frame not necessary | Can decrease representativeness
Stratified sampling | Includes all important subpopulations, precision | Difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive
Cluster sampling | Easy to implement, cost-effective | Imprecise, difficult to compute and interpret results
Choosing Nonprobability vs. Probability Sampling
Factors | Conditions favoring nonprobability sampling | Conditions favoring probability sampling
Nature of research | Exploratory | Conclusive
Relative magnitude of sampling and nonsampling errors | Nonsampling errors are larger | Sampling errors are larger
Variability in the population | Homogeneous (low) | Heterogeneous (high)
Statistical considerations | Unfavorable | Favorable
Operational considerations | Favorable | Unfavorable
Univariate Analysis
Definition: Method for analyzing data on a single variable at a
time.
Sometimes, all a researcher wants to do is report the data on one variable. For example, you might want to report the number of high school dropouts in New Delhi since 2005. That is a type of univariate analysis.
Univariate Analysis
Univariate method
Frequencies: A frequency distribution is a listing of categories
of possible values for a variable, together with a tabulation of
the number of observations in each category.
An example is the following frequency distribution of the variable
POLVIEWS, which asks the respondent to classify her/his political views.
Univariate Analysis
Sample Table 1: POLVIEWS: THINK OF SELF AS LIBERAL OR CONSERVATIVE
Category | Value | Frequency
Extremely Liberal | 1 | 12
Liberal | 2 | 63
Slightly Liberal | 3 | 62
Moderate | 4 | 202
Slightly Conservative | 5 | 71
Conservative | 6 | 59
Extremely Conservative | 7 | 9
Don't Know | 8 | 21
NA | 9 | 1
Total | | 500
The Value column contains the numbers arbitrarily assigned to the categories of this ordinal variable. The Frequency column contains the number of observations that correspond to each category.
As you can see, the largest group of respondents consider themselves to be moderate (202 people).
Sample Table 2
Category | Value | Frequency | Percent | Valid Percent | Cumulative Percent
Extremely Liberal | 1 | 12 | 2.4 | 2.5 | 2.5
Liberal | 2 | 63 | 12.6 | 13.2 | 15.7
Slightly Liberal | 3 | 62 | 12.4 | 13.0 | 28.7
Moderate | 4 | 202 | 40.4 | 42.3 | 70.9
Slightly Conservative | 5 | 71 | 14.2 | 14.9 | 85.8
Conservative | 6 | 59 | 11.8 | 12.3 | 98.1
Extremely Conservative | 7 | 9 | 1.8 | 1.9 | 100.0
Don't Know | 8 | 21 | 4.2 | Missing |
NA | 9 | 1 | 0.2 | Missing |
Total | | 500 | 100.0 | 100.0 |
Univariate Analysis
In Sample Table 2, the percent column provides the relative frequency for each category. The valid percent column calculates the percentages without the missing data being included. The cumulative percent tallies the cumulative percentage of the valid percentages.
The relative frequency is usually much more useful than the frequency itself because it puts the raw number into perspective. Now, we can say that about 42% of the respondents who gave a valid answer were moderate.
We can use frequencies on any type of variable.
Other methods (these are better suited to variables that have many response categories, such as interval variables):
Histograms
Pie Charts
Bar Diagrams
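The percent, valid percent and cumulative percent columns of Sample Table 2 can be reproduced from the raw frequencies. The sketch below is one way to do it in plain Python; the function name and the row layout are hypothetical, while the counts are the POLVIEWS frequencies from the table.

```python
def frequency_table(counts, missing=()):
    """Build rows of (category, count, percent, valid %, cumulative %).

    percent      : share of all observations, missing ones included
    valid %      : share of non-missing observations only
    cumulative % : running total of the valid percentages
    Missing categories get None in the last two columns.
    """
    total = sum(counts.values())
    valid_total = sum(c for k, c in counts.items() if k not in missing)
    rows = []
    cumulative = 0.0
    for category, count in counts.items():
        percent = 100 * count / total
        if category in missing:
            rows.append((category, count, percent, None, None))
        else:
            valid = 100 * count / valid_total
            cumulative += valid
            rows.append((category, count, percent, valid, cumulative))
    return rows


polviews = {
    "Extremely Liberal": 12, "Liberal": 63, "Slightly Liberal": 62,
    "Moderate": 202, "Slightly Conservative": 71, "Conservative": 59,
    "Extremely Conservative": 9, "Don't Know": 21, "NA": 1,
}
for row in frequency_table(polviews, missing=("Don't Know", "NA")):
    print(row)
```

Dictionaries keep insertion order in current Python versions, so the cumulative column follows the ordinal order in which the categories are listed.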
Multivariate Analysis
Multivariate testing
In statistics, multivariate testing or multi-variable testing is a technique
for testing hypotheses on complex multi-variable systems, especially used
in testing market perceptions.
In marketing, multivariate hypothesis testing is a process by which more than one component of a product/service may be tested in a live
environment. It can be thought of in simple terms as numerous A/B tests
performed on one page at the same time.
Whereas A/B tests are usually performed to determine the better of two content
variations, multivariate testing can theoretically test the effectiveness of limitless combinations. The only limits on the number of combinations
and the number of variables in a multivariate test are the amount of time
it will take to get a statistically valid sample of visitors and computational
power.
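To see how quickly the number of combinations grows, the variations in a full multivariate test can be enumerated as the Cartesian product of the options for each page element. This is a minimal sketch; the function name, page elements and options are hypothetical.

```python
from itertools import product


def variations(elements):
    """List every combination of options, one option per page element."""
    names = list(elements)
    option_lists = [elements[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*option_lists)]


# Hypothetical page elements under test.
page_elements = {
    "headline": ["A", "B"],
    "image": ["photo", "illustration", "none"],
    "button_colour": ["red", "green"],
}
all_variations = variations(page_elements)
print(len(all_variations))  # 2 * 3 * 2 combinations
print(all_variations[0])
```

Each added element multiplies the number of variations, which is why the time needed to collect a statistically valid sample of visitors becomes the practical limit.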
Multivariate Analysis
Multivariate hypothesis testing is usually employed to ascertain which
content or creative variation produces the best improvement in the pre-defined goal.
For example, in user experience testing for a website, a dramatic increase in
user convenience can be seen through testing different copy text, form layouts and
even landing page images and background colours.
However, not all elements produce the same increase in user convenience, and by
looking at the results from different tests, it is possible to identify those elements
that consistently tend to produce the greatest increase in user convenience.
Multivariate Analysis
Testing can be carried out on a dynamically generated website by setting
up the server to display the different variations of content in equal
proportions to incoming visitors. Statistics on how each visitor went on to
behave after seeing the content under test must then be gathered and
presented. Outsourced services can also be used to provide multivariate
testing on websites with minor changes to page coding. These services
insert their content to predefined areas of a site and monitor user
behavior.
In a nutshell, multivariate testing can be seen as allowing respondents to
vote for the option they prefer, which is also the option most likely to lead
them to proceed to a defined goal.
Measurements of Association
In certain situations, we are interested in determining whether some association exists among groups of data, and how strong that association is.
We make use of certain statistical tools to determine whether any association exists among groups of data.
These statistical tools are:
Correlation Analysis
Regression Analysis
Measurements of Association
Correlation Analysis
It is a statistical tool with the help of which we determine the intensity of the relationship between two or more variables.
Two variables are said to be correlated if movements in one are accompanied by movements in the other.
For example: as rainfall increases, up to a certain extent, the production of rice increases.
The Correlation Coefficient is a measure of correlation and summarises the direction and degree of correlation.
To determine correlation, we could use:
Karl Pearson's Coefficient of Correlation
Rank Correlation
Measurements of Association
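Karl Pearson's Coefficient of Correlation: It is denoted by r.
The formula used to find r is as follows:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$
where X and Y are the two variables under study and $\bar{X}$ and $\bar{Y}$ are their respective means.
The value of r will lie between -1 and +1.
The sign of r matters: a positive r means the two variables tend to move in the same direction, and a negative r means they tend to move in opposite directions.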
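The coefficient can be computed directly from deviations about the two means: the sum of the products of the deviations, divided by the square root of the product of the two sums of squared deviations. The sketch below assumes two equal-length lists of numbers with non-zero variance; the function name and the rainfall and rice figures are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means of xs and ys.

    Raises ZeroDivisionError if either variable is constant, because
    the correlation is undefined in that case.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: rainfall (cm) and rice output (tonnes) for five seasons.
rainfall = [60, 75, 80, 95, 110]
rice = [2.1, 2.6, 2.8, 3.1, 3.5]
print(round(pearson_r(rainfall, rice), 3))
```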
Measurements of Association
Rank Correlation
It is also known as Spearman's Rank Correlation method.
This method holds good when quantitative measures of certain factors, such as an evaluation of salesmen's leadership ability, cannot be ascertained.
For these items or individuals, the Rank Correlation Coefficient is determined instead.
The formula used is:
$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where d = difference of ranks between paired items in the two series (d = X - Y, with X and Y the ranks)
n = number of units
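With d as the difference between paired ranks and n as the number of units, the usual no-ties form of Spearman's coefficient is 1 minus 6 times the sum of squared rank differences, divided by n(n² - 1). The sketch below assumes that the inputs are already ranks and that there are no tied ranks; the function name and the two judges' rankings are hypothetical.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """Spearman's coefficient from two lists of ranks (no ties assumed).

    Computes 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between the paired ranks and n is the number of units.
    """
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two equal-length rank lists with at least 2 units")
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical example: two managers rank five salesmen on leadership ability.
manager_1 = [1, 2, 3, 4, 5]
manager_2 = [2, 1, 3, 5, 4]
print(spearman_rank_correlation(manager_1, manager_2))
```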
Measurements of Association
Regression Analysis
It is a statistical tool that enables a researcher to estimate or predict the values of an unknown variable from the known values of another variable. In other words, with the help of regression analysis, a marketing researcher is able to determine the average probable change in one variable given a certain amount of change in another variable.
For example: if a marketer has data on advertising expenditure for a company's product, they can predict the sales corresponding to each level of advertising expenditure, or vice versa.
Regression analysis describes how one variable changes with another; on its own, it does not prove a cause-and-effect relationship between the variables.
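The advertising example can be sketched as a simple least-squares line, sales = intercept + slope × advertising. This is one common form of regression, chosen here as an assumption since the slide does not name a specific method. The function names and the figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    intercept = mean_y - slope * mean_x
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x


# Hypothetical data: advertising spend and sales, both in thousands.
advertising = [1, 2, 3, 4]
sales = [3, 5, 7, 9]
intercept, slope = fit_line(advertising, sales)
print(intercept, slope, predict(intercept, slope, 5))
```

The slope is the average change in sales associated with one extra unit of advertising expenditure in the data, which is the "average probable change" described above.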