SERVQUAL Service Quality (July 2014 updated)

Regression and MANOVA analysis. Review of Parasuraman, A., Zeithaml, V. A., and L. L. Berry (1988), “SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality,” Journal of Retailing, Vol. 64, No. 1, 12-40.

July 2014 updated

QUANTITATIVE RESEARCH METHODS

SAMPLE OF REGRESSION & MANOVA PROCEDURES

Prepared by Michael Ling

Reference: Parasuraman, A., Zeithaml, V. A., and L. L. Berry (1988), “SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality,” Journal of Retailing, Vol. 64, No. 1, 12-40.

INTRODUCTION

Service quality is widely regarded as an important attribute of a business, yet it is hard to measure because of three distinctive features of services: intangibility, heterogeneity, and inseparability of production and consumption. In the absence of an objective measure of service quality, customers’ perception is taken as the standard of measurement.

This paper contributes to the marketing literature by developing the concept of service quality and deriving the SERVQUAL scale. The key research question is whether a universal service quality scale can be constructed that applies to all service categories.

Factor analysis is employed as the data reduction method in the development of SERVQUAL. The paper details how the scale is refined from an initial 97 items across 10 service dimensions into 22 items across 5 service dimensions: tangibles, reliability, responsiveness, assurance and empathy.

SUMMARY

Based on the service literature, the initial SERVQUAL scale consists of 97 items across ten service dimensions: tangibles, reliability, responsiveness, communication, credibility, security, competence, courtesy, understanding/knowing the customer, and access. Each item is represented by a pair of statements: an expectation statement (E) that measures customer expectations of the firms in a service category, and a perception statement (P) that measures customer perceptions of the performance of a particular firm in the same service category.

Data collection is conducted in two stages. In the first stage, 200 respondents from five service categories are selected and given self-administered questionnaires. All responses are pooled for analysis, regardless of service category. Based on the disconfirmation model in the customer satisfaction literature, a difference score Q = P – E is formed for each of the 97 items, and the coefficient alpha values (α) for the service dimensions range from 0.55 to 0.78. The α values are then improved through an iterative process of deleting items with low item-to-total correlations to achieve better reliability. The outcome is a reduced set of 54 items, with α values ranging from 0.72 to 0.83. Finally, the factor structure is reduced to 34 items across 7 dimensions, with α values ranging from 0.72 to 0.94.
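The purification step described above (compute coefficient alpha, then delete the item with the lowest item-to-total correlation) can be illustrated with a minimal sketch. This is not the authors' procedure in detail: the data are hypothetical, the function names are invented for illustration, and only a single deletion pass is shown, using the corrected item-to-total correlation (each item against the sum of the remaining items).

```python
from statistics import mean, pvariance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items table of scores."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items."""
    k = len(rows[0])
    return [
        pearson([r[j] for r in rows], [sum(r) - r[j] for r in rows])
        for j in range(k)
    ]


def drop_weakest_item(rows):
    """Remove the item with the lowest corrected item-to-total correlation."""
    r_it = corrected_item_total(rows)
    worst = min(range(len(r_it)), key=r_it.__getitem__)
    reduced = [[v for j, v in enumerate(r) if j != worst] for r in rows]
    return worst, reduced


# Hypothetical difference scores (Q = P - E): 6 respondents x 4 items.
# The fourth item is deliberately inconsistent with the other three.
scores = [
    [1, 2, 1, -1],
    [-1, -2, -1, 2],
    [2, 2, 3, 0],
    [0, -1, 0, 1],
    [-2, -2, -3, 0],
    [1, 1, 0, -2],
]

alpha_before = cronbach_alpha(scores)
worst, reduced = drop_weakest_item(scores)
alpha_after = cronbach_alpha(reduced)
print(f"alpha before: {alpha_before:.2f}")
print(f"dropped item index: {worst}")
print(f"alpha after: {alpha_after:.2f}")
```

With these made-up numbers, alpha rises from about 0.63 to about 0.96 once the inconsistent item is removed. In practice the pass would be repeated until no item with a low item-to-total correlation remains.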

In the second stage, four samples of 200 respondents each are selected, one for each of four service firms. Again, the respondents complete self-administered questionnaires, this time made up of 34 items. The data are sorted into the four corresponding groups and analysed. The outcome is a 22-item scale across five dimensions: tangibles, reliability, responsiveness, assurance and empathy.

CRITIQUE

The difference approach

The difference approach, Q = P – E, used in the evaluation of SERVQUAL is based on the disconfirmation model in the customer satisfaction literature. The authors argue that the “idea” of a difference score is not new and that this approach has been used in role conflict research.

Consider the equation Q = P – E: the same Q value can be obtained from various combinations of P and E. For example, a difference of 1 can arise from any of these scenarios: P = 2, E = 1; P = 3, E = 2; P = 4, E = 3; P = 5, E = 4; P = 6, E = 5; P = 7, E = 6. The difference score Q does not retain the individual P and E values, so valuable information could be lost. A major concern is whether customers’ perception of service quality is the same for a given Q regardless of the individual P and E values. The authors have neither discussed this point nor conducted trials to test this possibility.
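The loss of information can be made concrete by enumerating every (P, E) pair that collapses to the same difference score. The sketch below assumes a 7-point response scale, which is inferred from the example above (P runs up to 7) rather than stated in this critique; the variable names are illustrative only.

```python
from collections import defaultdict

# Assumed 7-point response scale (inferred from the example, where P reaches 7).
SCALE = range(1, 8)

# Group every possible (P, E) pair by its difference score Q = P - E.
pairs_by_q = defaultdict(list)
for p in SCALE:
    for e in SCALE:
        pairs_by_q[p - e].append((p, e))

# All pairs that yield Q = 1, matching the scenarios listed in the text.
print(pairs_by_q[1])

# Number of distinct (P, E) pairs hidden behind each Q value.
for q in sorted(pairs_by_q):
    print(q, len(pairs_by_q[q]))
```

Under this assumption, 49 possible (P, E) pairs map onto only 13 distinct Q values, and Q = 0 alone hides seven different pairs, from (1, 1) to (7, 7).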

Dimensions of SERVQUAL

The final refined SERVQUAL scale consists of five dimensions, which are “designed to be applicable across a broad spectrum of services”. A concern is whether these five service dimensions are sufficient to account for the variations of quality across all service categories. The sample data are drawn from a limited number of service categories (five) and a limited number of service firms (four). Is it possible that a more complex set of SERVQUAL dimensions (a larger number of dimensions) is required in some services, such as movie ticketing, but not in others, such as airline ticketing? The authors should address the external validity of SERVQUAL by cross-validating their results against a much broader range of service categories.

Each SERVQUAL dimension is measured by only four to five items. A concern is whether this number of items is sufficient. Is it possible that service quality is influenced by contextual factors that depend on the service category, so that some categories, because of their complex nature, need to be measured by a larger number of items than others?

Reliability

The inter-item reliability (coefficient alpha) of the final refined scale ranges from 0.52 to 0.84, where “the reliabilities are consistently high across all four samples” and “the total–scale reliability is close to 0.9”. This is a good outcome. However, given the limited samples, a concern is whether the reliability can be sustained across all service categories. Again, the external validity of SERVQUAL should be addressed by cross-validating the results against a much broader range of service categories.

Amongst the test items, nine pairs of P and E statements (items #10 to #13 and items #18 to #22) are negatively worded, all of which come from the Responsiveness and Empathy dimensions. It is well understood that negatively worded statements are designed to reduce systematic response bias. There are two concerns here. Firstly, the negatively worded items are not spread across the five dimensions, which would be a better way to reduce bias. Secondly, some of the negatively worded items are not straightforward to understand and interpret, for example, “It is not realistic for customers to expect prompt service from employees of these firms” (E11). There is a potential data quality problem here.

The SERVQUAL items are ordinal, which means that polychoric correlations might be needed to estimate the correlations between items if the underlying distributions are assumed to be continuous.

Questionnaire administration

The questionnaire is made up of a “97-statement expectations part followed by a 97-statement perceptions part”. There are two concerns here. Why does the expectations part come before the perceptions part and not the other way round? Why are the P and E statements for each item not presented together? Focus groups should be conducted prior to data collection to find out how the expectation and perception statements should be set up.

The 97 statement pairs make the questionnaire lengthy. A concern is that respondents might lose interest and attention before answering all the questions. Again, there is a potential data quality problem.

Convergent validity

Separate one-way ANOVA procedures are used to evaluate the association between SERVQUAL scores (dependent variables) and Overall Q (independent variable) for each of the five SERVQUAL dimensions. Some concerns are listed below.

i. In ANOVA/MANOVA procedures, the dependent variables are interval (or continuous) and the independent variables are categorical. Here, the dependent variables (the SERVQUAL scores) are ordinal, not interval. No discussion is provided to explain how this might affect the results.

ii. No consideration is given to the experiment-wise Type I error rate, given that multiple ANOVA procedures are used. In the second data collection stage, six one-way ANOVAs are conducted: one for each of the five SERVQUAL dimensions and one for the combined scale. With 6 F tests at .05 each, the experiment-wise probability of a Type I error could be as high as 30 percent (the additive upper bound; about 26 percent if the tests were independent). It is important to discuss whether the probability should be set at this level.

iii. Why is a MANOVA omnibus test not conducted prior to the ANOVAs? Apart from protecting against an inflated probability of Type I error, the MANOVA procedure also takes into account the intercorrelations among the SERVQUAL dimensions.

iv. The assumptions of ANOVA, such as independence, normality and homogeneity of variance across test groups, are not tested. No descriptive statistics (such as skewness and kurtosis) or Shapiro-Wilk statistics are provided. No Levene’s test of homogeneity of variances is reported.

v. No effect size, such as Cohen’s measure, is reported.
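The experiment-wise error concern in point ii can be checked with a short calculation. The sketch below is illustrative and not part of the original paper: it assumes a nominal level of .05 for each of the six F tests, and the function names are invented for this example. It shows the additive (Bonferroni) upper bound, the rate that would apply if the six tests were independent, and the per-test level that a Bonferroni correction would require to hold the overall rate at .05.

```python
def familywise_error_independent(alpha, m):
    """Probability of at least one Type I error across m independent tests."""
    return 1 - (1 - alpha) ** m


def familywise_error_bound(alpha, m):
    """Additive (Bonferroni) upper bound; holds without assuming independence."""
    return min(1.0, alpha * m)


def bonferroni_per_test_alpha(family_alpha, m):
    """Per-test level that keeps the overall rate at or below family_alpha."""
    return family_alpha / m


ALPHA = 0.05  # assumed nominal level for each F test
M = 6         # five dimension-level ANOVAs plus one for the combined scale

print(f"upper bound:          {familywise_error_bound(ALPHA, M):.3f}")
print(f"if independent:       {familywise_error_independent(ALPHA, M):.3f}")
print(f"Bonferroni per test:  {bonferroni_per_test_alpha(ALPHA, M):.4f}")
```

Because the SERVQUAL dimensions are intercorrelated, the six tests are not independent, so the true experiment-wise rate is not given exactly by either figure; the bound of 30 percent is the safe worst case.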

Overall Assessment

There is concern that the difference approach, Q = P – E, might be so simplified that it omits critical information. There is concern about whether the five dimensions are sufficient to cover all service categories, and whether the items in the dimensions are influenced by contextual factors. There is concern that the negatively worded items are not spread across the dimensions, that the questionnaire is lengthy, and about how the perception and expectation statements are presented. There is also concern over convergent validity. The strengths of the paper are the new conceptual framework of SERVQUAL and the high reliabilities achieved. The weaknesses of the paper are the concerns raised above and the uncertain applicability of SERVQUAL across all service categories.

CONCLUSION

The contribution of the paper is its development of the service quality scale, SERVQUAL, in the marketing discipline. The final refined scale consists of 22 items across five service dimensions and is the result of an iterative process of data reduction based on samples drawn from five service categories and four service firms.

Though the reliabilities of the measurement scale are consistently high (total-scale Cronbach’s alpha close to 0.9) across the samples, this critique raises concerns over the difference model, Q = P – E, and other areas such as item dimensions, validity, reliability, questionnaire administration and generalization.

The research could have been improved by addressing the concerns raised in this critique. In particular, the difference model should be examined more closely to ascertain whether customer perceptions can be summarized by difference scores, which is a key assumption upon which SERVQUAL is built. Other improvements include testing whether the reliability coefficients of the SERVQUAL dimensions hold across a broader range of service categories, and testing the convergent validity of SERVQUAL more thoroughly to increase the rigour of the research method.