Chapter Twelve Quality Control and Initial Analysis of Data

Copyright © Houghton Mifflin Company. All rights reserved. 12 | 2

Chapter Objectives

• Define editing and distinguish between a field edit and an office edit

• Define coding and outline the steps it involves

• Compute measures of central tendency and dispersion of the data for each variable in a data set

• State the potential uses of frequency distribution or one-way tables

Data Analysis at Rockbridge Associates: Data Integrity

• Data integrity is the foundation for successful marketing research

• Rockbridge ensures integrity in the collection and processing of the data through a number of quality control checks for

– mail surveys

– telephone surveys

– web surveys

• Rockbridge ensures data integrity in how the results are interpreted and explained to management

Editing

• Editing is the process of examining completed data collection forms and taking whatever corrective action is needed to ensure the data are of high quality

– Preliminary or field edit

– Final or office edit

Field Edit

• A field edit, or preliminary edit, is a quick examination of completed data collection forms, usually on the same day they are filled out

• Objectives

– Ensure that proper procedures are being followed in selecting respondents, interviewing them, and recording their responses

– Fix fieldwork deficiencies before they turn into major problems

Office Edit

• A final edit, or office edit, verifies response consistency and accuracy

– Makes necessary corrections

– Determines whether some or all parts of a data collection form should be discarded

What Is Wrong With this Response…

• A respondent said he was 18 years old but indicated that he had a Ph.D. when asked for his highest level of education.

Editing Can Help Uncover

• Improper field procedures

• Incomplete interviews

• Improperly conducted interviews

• Technical problems with the questionnaire or interview

• Respondent rapport problems

• Consistency problems that can be isolated and reconciled

Improper Field Procedures

• Wrong questionnaire form used

• Interview inadvertently not taken

Incomplete Interviews

• Questions not asked

• Directions not followed (proper segments of the questionnaire were not administered)

Improperly Conducted Interviews

• The wrong respondent interviewed (e.g., son instead of father)

• Questions misinterpreted by interviewer or respondent

• Evidence of bias or influencing of answers

• Failure to probe for adequate answers or the use of poor probes

• Interviewer's illegible writing and/or style

• Interviewer recorded information which identified a respondent whose anonymity should have been protected

Improperly Conducted Interviews (Cont’d)

• Interviewer apparently does not understand what type of responses constitute an answer to the actual question asked

• Interviewer does not understand what the objective of the question is and thus accepts an improper frame of reference for the respondent's answer

• Other evidence of need for training or instructions to be given to interviewer

– Failure to write down probes, wrong abbreviations, failure to follow directions

Technical Problems With the Questionnaire or Interview

• Space was not provided for needed information

• The presence of unanticipated or unusually frequent extreme responses to questions, indicating a possible need for rewording of certain questions

• Inappropriate or unworkable interviewer instructions not detected in the pretest

• The order in which questions were asked introduces confusion, resentment, or bias into the respondent's answers

Respondent Rapport Problems

• Frequent refusal to answer certain questions

• Reports of abnormal termination of the interview (or presence of hostility) due to sensitive questions

• Evidence that respondent and interviewer are playing the "game" of "What answer do you want me to give?"

• Evidence that the presence of other people in the interview situation is causing problems

Consistency Problems That Can Be Isolated and Reconciled

• Contradictory answers

– Reports no savings in one section of the interview but reports interest from bank accounts in another section

• Misclassification

– Mortgage debt improperly reported as installment debt

• Impossible answers

– Reports paying $600 for a new Edsel in 1970 (the car should have been recorded as a "used" car), or weekly income reported on the income-per-month line

Consistency Problems That Can Be Isolated and Reconciled (Cont’d)

• Unreasonable (and probably erroneous) responses

– Respondent reports borrowing $2,000 for two years to buy a car, but the reported monthly payments multiplied by 24 months are less than $2,000

– Respondent reports that the house value is $90,000 while income is $2,000 per year and the respondent claims less than a high school education
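
Consistency checks of this kind can be partly automated during the office edit. The following is a minimal sketch, assuming each response is held as a dictionary; the field names (`age`, `education`, `loan_amount`, `monthly_payment`, `loan_months`) and the age threshold of 22 are hypothetical choices made for illustration, not rules from the chapter.

```python
def office_edit_checks(record):
    """Return a list of consistency problems found in one response record."""
    problems = []

    # Age versus education: an 18-year-old reporting a Ph.D. is implausible.
    # The cutoff of 22 is a hypothetical plausibility threshold.
    age = record.get("age")
    if age is not None and record.get("education") == "Ph.D." and age < 22:
        problems.append("age inconsistent with education")

    # Loan arithmetic: total payments should at least cover the amount borrowed.
    loan = record.get("loan_amount")
    payment = record.get("monthly_payment")
    months = record.get("loan_months")
    if None not in (loan, payment, months) and payment * months < loan:
        problems.append("loan payments total less than amount borrowed")

    return problems


# Hypothetical response that trips both checks.
flagged = office_edit_checks({
    "age": 18,
    "education": "Ph.D.",
    "loan_amount": 2000,
    "monthly_payment": 70,
    "loan_months": 24,
})
print(flagged)
```

A flagged record still needs an editor's judgment: the check only isolates the inconsistency, it does not say which of the two answers is wrong.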

Preventing Errors

• Careful planning before fieldwork begins

• Automating data entry

Coding

• Coding broadly refers to the set of all tasks associated with transforming edited responses into a form that is ready for analysis

• Steps

– Transforming responses to each question into a set of meaningful categories

– Assigning numerical codes to the categories

– Creating a data set suitable for computer analysis

Transforming Responses into Meaningful Categories

• A structured question is pre-categorized

• Responses to a nonstructured or open-ended question need to be grouped into a meaningful and manageable set of categories

The Best Way to Treat "Don't Know" Responses

• Infer an actual response – dubious validity

• Classify the "don't knows" as a separate response category for each question

Missing-Value Category

• A missing value can stem from

– A respondent's refusal to answer a question

– An interviewer's failure to ask a question or record an answer

– A "don't know" that does not seem legitimate

• Best way to treat missing value responses

– Sound questionnaire design

– Tight control over fieldwork

Assigning Numerical Codes

• Assign appropriate numerical codes to responses that are not already in quantified form

• When assigning numerical codes, the researcher should choose codes that facilitate computer manipulation and analysis of the responses
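
As an illustration of code assignment, here is a minimal sketch of a codebook lookup. The category names, the code values, and the separate codes for "don't know" (8) and missing (9) are hypothetical assumptions for this example; the chapter does not prescribe particular values.

```python
# Hypothetical codebook for one categorized question.
CODEBOOK = {"price": 1, "quality": 2, "convenience": 3, "don't know": 8}
MISSING = 9  # hypothetical code reserved for refusals or unrecorded answers


def assign_code(category):
    """Map an edited, categorized response to its numerical code."""
    if category is None:
        return MISSING
    return CODEBOOK[category.strip().lower()]


coded = [assign_code(r) for r in ["Price", "don't know", None, "quality"]]
print(coded)
```

Keeping "don't know" and missing values on their own codes lets them be tabulated separately or excluded later without losing track of how often they occurred.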

Coding Multiple Response

• Which of the following countries have you visited during the past 12 months?

________ Canada
________ England
________ France
________ Germany
________ Japan
________ Mexico

• Need six variables, each relating to a specific country and having two possible values. For example, 1 = “No” and 2 = “Yes”

• Six columns must be set aside in the data spreadsheet to record responses to this question
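
The six-variable layout can be sketched as follows, using the 1 = "No" and 2 = "Yes" scheme from the slide. The function name and the example respondent are assumptions made for illustration.

```python
COUNTRIES = ["Canada", "England", "France", "Germany", "Japan", "Mexico"]
NO, YES = 1, 2  # coding scheme from the slide: 1 = "No", 2 = "Yes"


def code_multiple_response(checked):
    """Return six coded values, one per country, for one respondent."""
    checked = set(checked)
    return [YES if country in checked else NO for country in COUNTRIES]


# Hypothetical respondent who checked France and Japan.
row = code_multiple_response(["France", "Japan"])
print(dict(zip(COUNTRIES, row)))
```

Each position in `row` corresponds to one of the six columns set aside for this question.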

Multiple Response Question – Rank Order Question

• Please rank the following fast-food restaurants by placing a 1 beside the restaurant you think is best overall, a 2 beside the restaurant you think is second best, and so on.

__________ Burger King
__________ McDonald's
__________ Wendy's
__________ Whataburger

• This question requires as many variables (and columns) as there are objects to be ranked

• 4 separate variables are needed
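
A rank-order question can be coded the same way, with one variable per restaurant holding its rank. The sketch below also rejects rankings that skip or repeat a rank; that validation step and the function name are assumptions added for illustration, not part of the slide.

```python
RESTAURANTS = ["Burger King", "McDonald's", "Wendy's", "Whataburger"]


def code_ranking(ranks):
    """Return one rank per restaurant, or raise if the ranking is invalid."""
    values = [ranks[name] for name in RESTAURANTS]
    # A valid ranking uses each rank from 1 to 4 exactly once.
    if sorted(values) != list(range(1, len(RESTAURANTS) + 1)):
        raise ValueError("each rank from 1 to 4 must be used exactly once")
    return values


# Hypothetical respondent.
row = code_ranking(
    {"Burger King": 3, "McDonald's": 1, "Wendy's": 2, "Whataburger": 4}
)
print(row)
```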

Creating a Data Set

• An organized collection of data records

• Each sample unit within the data set is called a case or observation

• Structure of a data set

– If the number of observations is n and the total number of variables embedded in the questionnaire is m, then

– Data set = n × m matrix of numbers
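
A minimal sketch of the n × m structure, using a list of rows where each row is one case. The three respondents and their coded values are hypothetical.

```python
# Three hypothetical respondents (n = 3), each with four coded variables (m = 4).
data_set = [
    [1, 2, 4, 1],
    [2, 1, 3, 2],
    [1, 1, 4, 2],
]

n = len(data_set)      # number of observations (rows)
m = len(data_set[0])   # number of variables (columns)

# Every case must supply a value for every variable.
assert all(len(case) == m for case in data_set)

# All observations on the third variable form one column of the matrix.
column = [case[2] for case in data_set]
print(n, m, column)
```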

Table 12.3 Structure of a Data Sheet

Preliminary Data Analysis: Basic Descriptive Statistics

• Preliminary data analysis examines the central tendency and the dispersion of the data on each variable in the data set

Table 12.4 Measures of Central Tendency and Dispersion for Different Types of Variables

Measurement Level of Data Pertaining to Variable – Nominal

• Measures of Central Tendency

– Mode: Most frequently occurring response

• Measures of Dispersion

– Strictly speaking, the concept of dispersion is not meaningful for nominal data

– An idea about the distribution of responses can be obtained by examining their relative frequencies of occurrence

Measurement Level of Data Pertaining to Variable – Ordinal

• Measures of Central Tendency

– Median: 50th percentile response

• Measures of Dispersion

– Range: Defined by the highest and lowest response values

– Interquartile range: Difference between the 75th and 25th percentile responses

Measurement Level of Data Pertaining to Variable – Interval

• Measures of Central Tendency

– Mean: Arithmetic average of response values

• Measures of Dispersion

– Standard deviation: As defined in Chapter 9

Measurement Level of Data Pertaining to Variable – Ratio

• Measures of Central Tendency

– Mean: Arithmetic average of response values

• Measures of Dispersion

– Standard deviation: As defined in Chapter 9
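
The pairing of measurement level with summary statistics can be sketched with Python's standard `statistics` module. The `describe` function and its return format are hypothetical; note also that `statistics.quantiles` (Python 3.8+) uses its default "exclusive" method here, and other quartile conventions can give a slightly different interquartile range.

```python
import statistics


def describe(values, level):
    """Return summary measures suited to the variable's measurement level."""
    if level == "nominal":
        # Only the mode is meaningful for nominal data.
        return {"mode": statistics.mode(values)}
    if level == "ordinal":
        q1, _, q3 = statistics.quantiles(values, n=4)
        return {
            "median": statistics.median(values),
            "range": (min(values), max(values)),
            "iqr": q3 - q1,
        }
    if level in ("interval", "ratio"):
        return {
            "mean": statistics.mean(values),
            "std_dev": statistics.stdev(values),  # sample standard deviation
        }
    raise ValueError(f"unknown measurement level: {level}")


# Hypothetical values for illustration.
print(describe([1, 2, 2, 3], "nominal"))
print(describe([1, 2, 3, 4, 5], "ordinal"))
print(describe([2, 4, 4, 6], "interval"))
```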

Mode

• The value that occurs most frequently

Table 12.5 How Long Have You Been Using the Services of National? – Computing Mode

Median

• The observation below which 50 percent of the observations fall

Table 12.6 Length of Time Service Used – Responses from 20 Customers

How long have you been using the services of National?

4 3 4 1 4 4 4 4 4 4 3 4 4 3 4 4 4 3 1 1

1 = Less than a year; 2 = 1 to less than 2 years; 3 = 2 to less than 5 years; 4 = 5 years or more

Arranging the 20 values in ascending order:

1 1 1 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4

Because the sample size is 20, there are two middle values: 4 and 4. The median is, therefore, the average of the two middle values = 4.
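
The median calculation for these 20 responses can be reproduced with a short sketch using Python's standard `statistics` module. The variable names are illustrative; the data are the 20 coded responses listed for Table 12.6.

```python
import statistics

# The 20 coded responses from Table 12.6.
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 1, 1]

ordered = sorted(responses)
half = len(ordered) // 2

# With an even sample size, the median averages the two middle values.
middle_pair = (ordered[half - 1], ordered[half])
median = statistics.median(responses)

# The most frequently occurring code in the same data.
mode = statistics.mode(responses)

print(middle_pair, median, mode)
```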

Table 12.7 Computing Median for Length of Time Service Used

Mean

$n$ = number of units in the sample

$x_i$ = data obtained from sample unit $i$

$\bar{x}$ = sample mean value, given by

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Table 12.8 Overall Quality of Services Provided by National – Computing Mean

Measures of Dispersion

• Range

• Variance

• Standard Deviation

Range

• Range is the difference between the largest and smallest value

• The simplest measure of dispersion

Variance

• Variance of a set of data is a measure of deviation of the data around the arithmetic mean

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Standard Deviation

• Standard deviation is the square root of the variance

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
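
A worked sketch of the three dispersion measures, computed step by step with the sample (n − 1) divisor. The five ratings are hypothetical and are not the data from Table 12.9.

```python
import math

ratings = [7, 8, 10, 6, 9]  # hypothetical ratings, not the data from Table 12.9

n = len(ratings)
mean = sum(ratings) / n

# Range: difference between the largest and smallest value.
value_range = max(ratings) - min(ratings)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in ratings) / (n - 1)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(mean, value_range, variance, round(std_dev, 3))
```

Here the squared deviations are 1, 0, 4, 4, and 1, which sum to 10, so the variance is 10 / 4 = 2.5.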

Table 12.9 Overall Quality of Services Provided by National: Computing Range, Variance, and Standard Deviation

Frequency Distribution: One-Way Tabulation

• One-way tabulation is a table showing the distribution of data pertaining to categories of a single variable
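
A one-way table can be sketched with `collections.Counter`. This example reuses the 20 length-of-service responses and category labels from Table 12.6; the column layout of the printed table is an assumption for illustration.

```python
from collections import Counter

LABELS = {
    1: "Less than a year",
    2: "1 to less than 2 years",
    3: "2 to less than 5 years",
    4: "5 years or more",
}
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 1, 1]

counts = Counter(responses)

# One row per category: label, frequency, and percentage of all responses.
# Categories nobody chose still appear, with a count of zero.
table = [
    (LABELS[code], counts[code], 100 * counts[code] / len(responses))
    for code in sorted(LABELS)
]

for label, count, percent in table:
    print(f"{label:<24}{count:>4}{percent:>8.1f}%")
```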

Table 12.10 Age and Length of Time Service Used

Table 12.10 Age and Length of Time Service Used (Cont’d)

Why Averages May be Misleading

• Researchers tested a new sauce product and found

– Mean rating in the taste test was close to the middle of the scale, which had "very mild" and "very hot" as its bipolar adjectives

• Researchers' conclusion

– Consumers need a sauce that is neither really hot nor really mild

Why Averages May be Misleading (Cont’d)

• Deeper examination revealed

– The existence of a large proportion of consumers who wanted the sauce to be mild and an equally large proportion who wanted it to be hot

• Moral of the story

– A clear understanding of the distribution of responses can help a researcher avoid erroneous inferences
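
The sauce example can be reproduced with a small sketch. The ten ratings and the 7-point scale (1 = "very mild", 7 = "very hot") are hypothetical; the slides do not give the actual data or scale length.

```python
import statistics
from collections import Counter

# Hypothetical ratings on a 1 ("very mild") to 7 ("very hot") scale.
ratings = [1, 1, 2, 1, 2, 7, 6, 7, 7, 6]

mean = statistics.mean(ratings)

# The frequency distribution shows where the responses actually sit.
distribution = Counter(ratings)
near_middle = sum(
    count for rating, count in distribution.items() if 3 <= rating <= 5
)

print("mean:", mean)
print("distribution:", sorted(distribution.items()))
print("responses near the middle of the scale:", near_middle)
```

In this made-up sample the mean lands exactly on the scale midpoint, yet no respondent chose a rating near the middle, which is the pattern a one-way table would expose.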