8/10/2019 2014 It_s Not Me, It is You-Miscomprehention Linguistic in Survey
Article
It's Not Me, It's You:
Miscomprehension in Surveys
Ben Hardy1 and Lucy R. Ford2
Abstract
The ubiquity of surveys in organizational research means that their quality is of paramount
importance. Commonly this has been addressed through the use of sophisticated statistical approaches with scant attention paid to item comprehension. Linguistic theory suggests that while
everyone may understand an item, they may comprehend it in different ways. We explore this in two
studies in which we administered three published scales and asked respondents to indicate what
they believed the items meant, and a third study that replicated the results with an additional scale.
These demonstrate three forms of miscomprehension: instructional (where instructions are not
followed), sentential (where the syntax of a sentence is enriched or depleted as it is interpreted), and
lexical (where different meanings of words are deployed). These differences in comprehension are
not appreciable using conventional statistical analyses yet can produce significantly different results
and cause respondents to tap into different concepts. These results suggest that item interpretation
is a significant source of error, which has been hitherto neglected in the organizational literature. We suggest remedies and directions for future research.
Keywords
survey research, quantitative research, construct validation procedures
"How satisfied are you with the pay you receive for your job?" (Tsui, Egan, & O'Reilly, 1992). This
seems a straightforward question that anyone who is employed should be able to understand and
answer. But what does it actually mean? Is it asking whether you are happy with the pay you receive
for your job, or whether you think the amount you earn is fair for the work you do? Or something
else? No doubt you will understand both the question and its meaning. The crucial issue, however, is
whether others understand it in exactly the same way as you.
1 The Open University Business School, Milton Keynes, United Kingdom
2 Saint Joseph's University, Philadelphia, PA, USA
Corresponding Author:
Lucy R. Ford, Saint Joseph's University, 5600 City Avenue, Philadelphia, PA 19131, USA.
Email: [email protected]
Organizational Research Methods
1-25
The Author(s) 2014
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/1094428113520185
orm.sagepub.com
Downloaded from orm.sagepub.com at University of Huddersfield on June 5, 2014

Organizational researchers often solicit the opinion of others through surveys. This frequently
involves administering a stimulus, in the form of a question or statement,1 and allowing the
participant to choose from a limited menu of responses. Closed questions of this nature allow a verbal
(analog) signal to be converted to a numerical (digital) output, through the allocation of ordinal
numbers to Likert scale responses, and the consequent output to be subjected to statistical examination.
The advantage of this process of transduction is also its weakness, as the simple numerical output
masks infelicities in comprehension of the instruction, question, or response. Individuals agreeing
with a statement may not necessarily be agreeing with the same thing as other respondents, and
even sophisticated statistics may not detect these differing interpretations. The whole enterprise of
survey research rests on the assumption that there is an unbroken chain of comprehension from the
mind of the researcher, through the survey instrument, to the mind of the respondent, and back again.
Miscomprehension at any stage in this process introduces error.
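The conversion from verbal response to ordinal score can be illustrated with a minimal Python sketch. The five response labels, the two respondent records, and the function name `to_ordinal` are hypothetical and are not taken from the study. The sketch only shows that two respondents who read an item differently can still produce identical numerical output.

```python
# Hypothetical 5-point agreement scale; the labels are illustrative assumptions.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}


def to_ordinal(label: str) -> int:
    """Convert a verbal Likert response to its ordinal score."""
    return LIKERT[label.strip().lower()]


# Two invented respondents who read the same item in different ways.
respondents = [
    {"interpretation": "I enjoy my job", "response": "Agree"},
    {"interpretation": "My employer has met its obligations", "response": "Agree"},
]

scores = [to_ordinal(r["response"]) for r in respondents]
interpretations = {r["interpretation"] for r in respondents}

# The scores are identical even though the interpretations differ.
print(scores, len(interpretations))
```

Once the responses are reduced to `scores`, nothing in the numbers records which interpretation each respondent held.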
A survey that is used as the basis for strategy or policy and that is poorly constructed and ignores
different interpretations of questions could have profoundly negative effects. As a consequence, a
great deal of effort has been put into improving standards of measurement. This research has mainly
focused on using statistics to assay scale quality, with little attention paid to the stimulus questions
themselves and the way in which individuals comprehend them.
This article examines sources and types of linguistic miscomprehension in survey research, using published, multi-item scales. We begin with a brief review of scale development and some of the
principles of linguistics. We then present three studies that explore miscomprehension in survey
research. The first study shows that while participants understand survey questions, they understand
them in different ways. Using existing linguistic theory we code the results into three forms of mis-
comprehension. The second study tests this taxonomy by presenting respondents with a stimulus
question and asking them to select, from a list of possible interpretations, the interpretation of the
question that most closely matches their own. We find that participants commonly depart from the
strict syntax of the item in their interpretations. This threatens construct validity and can have
implications for item score on the scale itself and can impact on other scales, in this case turnover
intention. In the third study we replicate the findings of the first two studies using a different measure to establish that our findings are not particular to our scale selection. These three studies demonstrate
that respondents interpret items differently, that this threatens construct validity, and yet is not
apparent when standard statistical tests to assess factor structure and validity are used. We then
examine the import of these findings for organizational research, suggest remedies, and outline
directions for future research.
A Brief Review of Scale Development
The process of scale development has been discussed in a number of texts (e.g., DeVellis, 2003;
Hinkin, 1998). These generally aim to fulfill the American Psychological Association guidelines,
which center around content validity, criterion-related validity, construct validity, and internal con-
sistency reliability (Hinkin, 1998).
The first step is to define the concept of interest and its domain. Poorly specified concepts and
inadequate domain sampling will guarantee an inadequate scale. The next steps are elegantly sum-
marized by Hinkin (1998). They begin with developing items that either inductively or deductively
sample the conceptual domain (Hinkin, 1995). If the resulting items are poorly developed, then it is
unlikely that the subsequent stages of the developmental processes will remedy this. Unfortunately,
this critical step of item development is seldom accorded appropriate emphasis (Schriesheim,
Powers, Scandura, Gardiner, & Lankau, 1993), with DeVellis (2003) suggesting that researchers
often "throw together" or "dredge up" items and assume they constitute a suitable scale (p. 11).
Hinkin (1998) advocates a more rigorous process, where parsimonious, readily comprehensible questions are written and construct validity is examined using multiple samples and techniques such
as exploratory and confirmatory factor analysis. Despite the importance of the initial stages of item
development (Hinkin, 1995), greater emphasis is often placed on the statistical assessment of the
psychometric properties of the scale (Rossiter, 2002) and its relationship to other variables in the
nomological net (Borsboom, Mellenbergh, & van Heerden, 2004).
Researchers may choose instead to use published measures. Passage through the peer review pro-
cess has typically been perceived as evidence of scale quality. However, Ford and Scandura (2007),
in an examination of a compilation of organizational measures (Fields, 2002), found that the major-
ity of scales contained one or more threats to construct validity, suggesting that published measures
are not without flaws. Using a published measure also involves disembedding it from its original
context, potentially increasing the risk of error. Comprehension (u) is a function of syntax (s) and
context (c), $u = f(s, c)$, and so the same syntax may be comprehended differently in different contexts.
For example, saying "break a leg" means something different in a theater dressing room and
an operating theater. The importance of linguistics in item interpretation has not been widely dis-
cussed although there are some existing sources that do address the issue (Schwarz, 1999), and it
is to this topic we now turn.
A Brief Review of Linguistic Theory Surrounding Item Interpretation
Surveys hinge on comprehension. If the respondent does not understand the survey question in
exactly the same way as the researcher then the instrument is not measuring what the researcher
intended. This interface between researcher and respondent is of critical importance and is where
linguistic problems of interpretation manifest themselves. Communication depends on one person's
statements being understood by another. This, however, is not enough, as understanding what
another person is saying is one thing, while understanding exactly what they mean is another.
Researchers in the pragmatic tradition of semantics make a clear distinction between what is said
and the context (both social and linguistic) in which it is said, the one influencing the other. The
interplay between the contextual and literal was articulated by Grice (1975) and subsequently modified and extended by other authors (e.g., Jaszczolt, 2005; Sperber & Wilson, 1986). The principle
underpinning this field is that when we interpret a sentence, we flesh out the bare syntax of a sen-
tence, drawing on our experience, context, and environment. This process, however, is neither uni-
form nor predictable, varying across individuals and situations. These variations mean that survey
questions may be fleshed out by individuals to give meanings other than the one intended by the
question's author. Consequently, this article is concerned with cataloging these differences and
examining their impact on survey research.
Types of Error
Threats to comprehension, and hence validity, fall into two basic categories: instructional and
interpretive. Interpretive errors can then be broken down into two further categories commonly used
within the psycholinguistic literature (e.g., Hernandez, 2001). One concerns the comprehension
of the full sentence, or sentential comprehension, and the other the comprehension of individual
words, or lexical comprehension.
Instructional Miscomprehension. This is the most easily understood source of error, where the
respondent either does not read/follow the instructions for completing the survey or misunderstands the
instructions (Tourangeau, Rips, & Rasinski, 2000). This failure to follow or understand instructions
may not be evident in surveys with a numerical output, yet instructions are of pivotal importance.
For some surveys they provide direction as to what is actually being measured. In others, the
instructions might contain the experimental manipulation through a change in wording. In either case, the
results of the research are affected if the instructions are ignored. In short, instructions are important.
Interpretive Miscomprehension: Sentential. When hearing a question we attempt to understand what
it means. This involves making decisions, both conscious and unconscious, about what the
questioner is actually trying to ask. For example, the question "Have you had lunch?" at a 2 pm meeting
is likely to be interpreted as "Have you had lunch today?" rather than "Have you ever had lunch?"
Understanding a question, therefore, is not a banausic process of literally answering what has been
asked, but rather one of applying contextual information in order to answer appropriately. Crucially,
we must reach the same interpretation as the author of the question, otherwise we are likely to
answer a different question from the one the author intended. This is sentential miscomprehension.
Respondents might enrich or deplete the meaning of questions, or both. When an item is enriched,
the respondent adds additional information to the stimulus. For example, if the question
"Considering everything, how satisfied are you with your current job situation?" (Tsui et al., 1992) is
interpreted by the respondent as "Would you stay in your job if someone offered you something
else?",2 they have enriched the sentence to include elements of turnover intention that were not
intended. This may mean that the respondent is actually answering a question about turnover
intention as opposed to just job satisfaction.
Just as sentences can be enriched they can also be depleted. The question "How fair or unfair are
the procedures used to determine pay rates?" (Sweeney & McFarlin, 1993) shows depletion if
interpreted as "How fair is your pay?" The respondent has clearly understood the fairness element of the
question but not the procedural part and has, effectively, turned a procedural justice item into a
distributive justice one.
Interpretive Miscomprehension: Lexical. This form of miscomprehension concerns the meaning of
the words themselves. One person's definition of a word does not necessarily accurately map onto
that of another because they are drawing on a variety of educational, cultural, social, contextual, or
gender-specific definitions.
The word satisfaction has two historical meanings (Simpson & Weiner, 1989). One is "with
reference to desires or feelings" (Simpson & Weiner, 1989, p. 502) and is described as "The action of
gratifying (an appetite or desire) to the full" and "a sense of pleasurable gratification" (p. 502); the other, "with
reference to obligations" (p. 502), is a more transactional sensation of obligation having been fulfilled.
Depending on exposure and knowledge, individuals interpreting the word satisfaction may draw on
one or the other interpretation, or a blend of both. The issue for survey researchers is that it is very
difficult to know which definition the respondent is drawing on. For example, two different
respondents may both agree with the statement "Are you satisfied with this company" but one might
be agreeing that they like the company while the other feels that the company has met its obligations.
The possibility of lexical miscomprehension resulting from this polysemy, or multiple meanings,
is not restricted to the word satisfied. The Collins English Dictionary lists 43,636 different nouns and
14,190 different verbs. The average noun has 1.74 meanings and the average verb 2.11 (Fellbaum,
1990), suggesting that there is plenty of opportunity for respondents to draw on more than one
meaning.
Lexical miscomprehension can introduce primary error, where the item's miscomprehension
affects its score, and also secondary error, where miscomprehension causes collinearity between
scales. This could occur if, for example, a question about satisfaction were included in a model along
with an instrument for turnover intention. The responses of those using a transactional interpretation
of satisfaction should correlate with turnover intention but the responses of those taking a gratifica-
tional view may not.
These three forms of miscomprehension, instructional, sentential, and lexical, have the potential to introduce considerable error into the measurement process. We shall now turn to two studies that
provide evidence for the existence of these forms of miscomprehension and a third, which confirms
our preliminary findings using a different scale.
Study 1: Respondent Interpretation of Survey Question
This study aims to ascertain whether the forms of miscomprehension outlined previously occur with
questions used in organizational research.
Method
The scales used in this study were selected on the basis of three criteria. First, Ford and
Scandura (2007) did not identify any threats to construct validity in these scales in their analyses of
all the scales contained in Fields's (2002) book of organizational measures. Second, they
contained a mix of both questions and statements so that comparisons could be made between a
satisfaction scale using questions (Tsui et al., 1992) and one containing statements (Agho,
Price, & Mueller, 1992; drawn from a longer measure in Brayfield & Rothe, 1951). Finally
they were brief. This last point was of particular importance in order to minimize the potential
for survey fatigue. Two multiple item scales for job satisfaction (Agho et al., 1992, 6 items; Tsui et al., 1992, 6 items) and one for procedural justice (McFarlin & Sweeney, 1992, 4 items)
were used. The job satisfaction measures were also different in that one (Tsui et al., 1992) is a
general measure of job satisfaction that measures specific facets such as satisfaction with the
work itself, supervision, and coworkers, while the other (Agho et al., 1992) was intended as an
affective measure of job satisfaction. The papers in which these scales appeared have been
widely cited in organizational research and the scales themselves used frequently in subse-
quent research.
The survey was administered using Qualtrics (2007). The respondents were first asked to explain
what they thought the survey question meant, in a free-text, open-ended response format, imagining
that they were explaining the item to a non-native English speaker and attempting to convey the true meaning of the item. We then administered the same three scales in their usual format with
Likert-type responses. Finally, we captured standard demographic data such as gender, educational level,
and native language.
Sample. For this initial exploratory study, we used three convenience samples of participants. First,
the authors sent an invitation to participate to their personal contacts, with a request that partici-
pants forward the survey to others. We used this method as variance in pragmatic inference is uni-
versal (Sperber & Wilson, 1986), and so a nonrandom sampling method was appropriate. We also
wanted respondents who had the intellectual capacity to think through the meaning of items care-
fully. Therefore, sampling our own contacts made sense, as our contacts are typically well
educated.
Our final sample comprised a total of 115 respondents. Forty-one of these were native speakers
of British English (BrE) (average age 34.3, SD = 11.9; 42% female, 58% had a master's degree
or higher), 40 were native speakers of American English (AmE) (average age 37.1, SD = 11.6;
52% female, 60% had a master's degree or higher). We selected speakers of British English and
American English as we were concerned that there might be differences in interpretation between
these two forms that are strongly represented in organizational research and represent the two
forms of English that are taught internationally. We also asked members of the RMNet listserv
(a listserv restricted to members of the Research Methods Division of the Academy of
Management) to complete the survey (n = 34, average age 44.5, SD = 12.2; 48% female, 91% had a
master's degree or higher) as those with an interest in research methodology may assist colleagues in developing surveys. The BrE and AmE samples were broadly similar in terms of age, gender
profile, and educational attainment, while the RMNet sample was slightly older and more highly
educated, as would be expected.
Classification of Open-Ended Responses. Responses were classified manually using NVivo 8 (2008) by
two coders and interrater reliability statistics (Cohen's kappa, κ; Cohen, 1960, as implemented in
SPSS) were calculated.
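For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's marginal code frequencies. The following pure-Python sketch is only an illustration of that formula: the authors report using SPSS, and the function name `cohens_kappa` and the ten example codes are invented for this example.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's (1960) kappa for two raters assigning nominal codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of responses given the same code by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Invented codes for ten open-ended responses (not data from the study).
coder_1 = ["enriched", "enriched", "depleted", "neutral", "neutral",
           "enriched", "depleted", "neutral", "enriched", "neutral"]
coder_2 = ["enriched", "enriched", "depleted", "neutral", "enriched",
           "enriched", "depleted", "neutral", "enriched", "neutral"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

In this invented example the coders agree on nine of ten responses, but kappa is lower than 0.9 because some of that agreement would be expected by chance.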
Instructional miscomprehension was identified and coded when a respondent failed to follow the
instructions. The instructions were in a bordered box at the top of the first page and clearly stated in
bold, italicized, block capitals: "DO NOT ATTEMPT TO ANSWER THE QUESTION."
Respondents who answered the question anyway demonstrated instructional miscomprehension, as did
those who did not follow the ensuing instructions to explain the item.3
In order to identify sentential miscomprehension we examined the responses for deviation from
the syntax of the question, in the form of enrichment or depletion. Enrichment was defined as the
respondent venturing beyond a strict syntactic interpretation of the question by including other con-
ceptual elements. Depletion, on the other hand, was defined as the absence of an element of the
question in the answer provided. Depending, of course, on the nature of the item, it is possible to
simultaneously enrich and deplete an item. For instance, a respondent who interprets the question
"How fair or unfair are the procedures used to determine pay rates?" (Sweeney & McFarlin, 1993) as "How transparent is compensation?" simultaneously depletes the question by not asking
about the (un)fairness of pay rates (i.e., removing a conceptual element) and also enriches it by
expanding from pay to compensation.
Lexical miscomprehension is difficult to apprehend, as it is impossible to know what mental
schema the respondent is drawing on, so interpretation of the response can only be made by
inference from the rest of the sentence. The definition of satisfaction in the question "How satisfied are
you with the nature of the work you perform" (Tsui et al., 1992) was coded as pleasurable if
words indicating pleasure (e.g., happy) were included in the interpretation and transactional if the
interpretation included phrases indicating that it matched or met their expectations. Questions where
no classification could be made were coded as neutral.
Other words that proved tractable to classification for lexical miscomprehension were vague terms
such as often or most, for which some responses quantified what the word meant, for example
often being interpreted as "3/5 days." Lexical ambiguity could
also be seen in interpretations of such items as "I like my job better than the average worker" (Agho
et al., 1992), which raised the question of how the average worker is defined. The differing
referents for the average worker were readily classifiable in the responses.
Finally, analyses were conducted using SPSS to establish the reliability and dimensionality of the
measures used in the survey.
Results
Statistical Tests for Dimensionality. Coefficient alpha (Cronbach, 1951) for the measures ranged
between .79 and .91, which was consistent with or better than alphas previously reported for these
measures (Fields, 2002). In addition, we used exploratory factor analysis to establish unidimension-
ality of each measure (Conway & Huffcutt, 2003). We used maximum likelihood extraction and
direct oblimin rotation. All items had factor loadings exceeding .40, with almost all exceeding
.50, indicating unidimensionality of each measure.
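Coefficient alpha is computed from the number of items, the sum of the item variances, and the variance of the summed scale score. The sketch below is a minimal pure-Python illustration of that formula only. It does not reproduce the exploratory factor analysis, and the function name `cronbach_alpha` and the response matrix are invented rather than taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*rows))                      # one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of the scale totals
    return (k / (k - 1)) * (1 - item_var / total_var)


# Invented Likert responses (5 respondents x 3 items), not data from the study.
data = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]

alpha = cronbach_alpha(data)
print(round(alpha, 2))
```

A high alpha shows only that item scores covary. As the article argues, it cannot show that respondents interpreted the items in the same way.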
Results of Coding Open-Ended Responses
Instructional Miscomprehension. This was readily detected when respondents answered the
question rather than describing what the question meant. Eight respondents consistently answered the questions instead of describing them (BrE 3/41; AmE 2/40; RMNet 3/34). RMNet members in
particular demonstrated another form of instructional miscomprehension. Nine of 34 wrote responses
such as "job satisfaction/facet is pay," "procedural justice," or "Need a Likert scale response,"
which are clearly incompatible with the instructions. Overall, 17 of 115 respondents (15%) pro-
vided answers suggesting that they had not read the instructions properly.
Sentential Miscomprehension. Examination of the sample revealed evidence of both enrichment and depletion, with little difference in degree of miscomprehension between the three groups of
respondents. Accordingly they were combined. The coding was completed by two raters, with excellent
interrater reliability (κ = 0.85-0.90). Depletion was particularly evident with the items from the
procedural justice scale (Sweeney & McFarlin, 1993) where respondents ignored the procedural
element of the question, such that "How fair or unfair are the procedures used to determine pay
rates?" was interpreted as "How fair is your pay?" Effectively this turned a procedural justice item
into a distributive justice one. Overall 27% of respondents ignored the procedural element of the
question (BrE 30%, AmE 32%, RMNet 16%).
Five respondents said that they did not understand the item "How fair or unfair are the procedures
used to communicate performance feedback." Nevertheless, they still provided a numerical response in the multiple-choice section, even though they had the option not to do so. Their lack of
understanding is therefore undetectable in the statistical data. Up to 44% of respondents depleted any given
item, as can be seen in Table 1.
Table 1. Number of Respondents Coded as Depleting or Enriching Items.

| Scale | Item | Depletion (κ = 0.87), N (%) | Enrichment (κ = 0.85), N (%) |
| --- | --- | --- | --- |
| Tsui, Egan, and O'Reilly (1992) | 1. How satisfied are you with the opportunities which exist in this organization for advancement or promotion | 51 (44) | 46 (40) |
| | 2. How satisfied are you with the nature of the work you perform | 20 (17) | 40 (35) |
| | 3. How satisfied are you with the person who supervises you, your organizational superior | 5 (4) | 39 (34) |
| | 4. How satisfied are you with your relations with others in the organization with whom you work, your coworkers and peers | 11 (10) | 34 (30) |
| | 5. How satisfied are you with the pay you receive for your job | 35 (30) | 35 (30) |
| | 6. Considering everything, how satisfied are you with your current job situation | 20 (17) | 24 (21) |
| Sweeney and McFarlin (1993) | 1. How fair or unfair are the procedures used to communicate performance feedback | 28 (24) | 38 (33) |
| | 2. How fair or unfair are the procedures used to determine pay rates | 10 (9) | 30 (26) |
| | 3. How fair or unfair are the procedures used to evaluate performance | 19 (17) | 28 (24) |
| | 4. How fair or unfair are the procedures used to determine promotions | 26 (23) | 38 (33) |
| Agho, Price, and Mueller (1992) | 1. I am often bored with my job | 26 (23) | 63 (55) |
| | 2. I feel fairly well satisfied with my present job | 5 (4) | 26 (23) |
| | 3. I am satisfied with my job for the time being | 7 (6) | 60 (52) |
| | 4. Most days I am enthusiastic about my work | 12 (10) | 38 (33) |
| | 5. I like my job better than the average worker does | 14 (12) | 24 (21) |
| | 6. I find real enjoyment in my work | 0 (0) | 0 (0) |
| Mean | | 19.3 (17) | 37.5 (33) |
Enrichment was far more common, which is understandable as linguistic theory suggests that
individuals are more likely to augment basic sentence syntax to recover meaning. Table 1
demonstrates that items were enriched by 21% to 55% of the respondents. There was no clear pattern to
the enrichment, with the exception of the statement "I am satisfied with my job for the time
being" (Agho et al., 1992), which seemed to trigger an association with turnover intention. This
enrichment was not uniform, however, as one respondent interpreted it as "I am currently happy
with my job, but I may look for a new job in the future" and another as "I'll be out of here at the
first opportunity." Despite these very different interpretations, both these respondents agreed
with this item in their Likert response. It is clear then that the numerical output from multiple item
scales can mask considerable linguistic variance, as two opposing interpretations had the same
score (4/5).
No statistical difference was detectable between the depleted or enriched responses and the rest of
the sample on the basis of a Mann-Whitney U test (a nonparametric comparison test appropriate for non-normal small samples) applied to the Likert responses for each category. So individuals are
answering different questions, sometimes radically so, and yet this is undetectable statistically.
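To make the comparison concrete, the sketch below computes a Mann-Whitney U statistic for two groups of Likert scores, using midranks for ties and a tie-corrected normal approximation for the two-sided p value. It is an illustration under stated assumptions, not the authors' procedure: the text does not say which variant of the test was run, no continuity correction is applied here, and the function name `mann_whitney_u` and both score lists are invented. For very small samples an exact test may be preferable to the normal approximation.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using midranks and a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))
    # Give each distinct value the average (mid) rank of its block of ties.
    ranks, tie_term, i = {}, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2        # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(ranks[v] for v in x)                 # rank sum of the first group
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                                # all scores identical
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))          # two-sided normal tail area
    return u, p


# Invented Likert scores: respondents coded as enriching vs. all others.
enriched = [4, 4, 5, 3, 4]
others = [4, 3, 5, 4, 4, 2]

u_stat, p_value = mann_whitney_u(enriched, others)
print(u_stat, round(p_value, 3))
```

A large p value in such a comparison means only that the score distributions are similar. It says nothing about whether the two groups were answering the same question.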
Lexical Miscomprehension. Despite the intractability of detecting lexical miscomprehension, as
previously noted, there was clear evidence of differing interpretations of the word satisfaction, as can be
seen in Table 2.
Other examples of lexical miscomprehension can be seen with the use of vague terms (Tourangeau
et al., 2000) such as the word often in the item "I am often bored with my job" (Agho et al., 1992). For
example, often was interpreted as varying from "More than 33% of the time" to "99% of my work."
The word most in the statement "Most days I am enthusiastic about my work" was interpreted with responses as varied as "4/5 days," "at least 3 days/week," and "more often than not."
The final element of lexical miscomprehension examined was the comparators for the item
"I like my job better than the average worker does." Thirty-four percent compared themselves to
the general population, 18% to their peers, and 4% to people in similar jobs. Social comparison
theory suggests that the choice of referent is critical to attitude formation (Riordan & Shore,
1997). As with sentential miscomprehension there was no statistical difference between categories
on the basis of the Mann-Whitney test applied to the Likert responses for each category.

Table 2. Classification of Satisfaction by Item.

| Scale | Item | Pleasurable (κ = 0.88), N (%) | Transactional (κ = 0.79), N (%) | Neutral, N (%) |
| --- | --- | --- | --- | --- |
| Tsui, Egan, and O'Reilly (1992) | 1. How satisfied are you with the opportunities which exist in this organization for advancement or promotion | 22 (19) | 24 (21) | 69 (60) |
| | 2. How satisfied are you with the nature of the work you perform | 63 (55) | 5 (4) | 47 (41) |
| | 3. How satisfied are you with the person who supervises you, your organizational superior | 39 (34) | 30 (26) | 46 (40) |
| | 4. How satisfied are you with your relations with others in the organization with whom you work, your co-workers and peers | 47 (41) | 21 (19) | 47 (41) |
| | 5. How satisfied are you with the pay you receive for your job | 28 (24) | 41 (36) | 46 (40) |
| | 6. Considering everything, how satisfied are you with your current job situation | 53 (46) | 5 (4) | 57 (50) |
| Agho, Price, and Mueller (1992) | 2. I feel fairly well satisfied with my present job | 43 (43) | 10 (9) | 62 (54) |
| | 3. I am satisfied with my job for the time being | 44 (38) | 6 (5) | 65 (48) |
| Mean | | 42.4 (38) | 37.5 (18) | 17.8 (16) |
Study 1 Results Summary. Analysis of the qualitative results provides evidence for all three types of
miscomprehension and suggests that this miscomprehension may not be readily detectable
statistically. The linguistic ambiguity within these scales is therefore a potentially significant but
typically undetectable source of error.
Study 2, Phase 1: Respondent Self-Classification Into Types of Miscomprehension
In order to ensure that the miscomprehension observed in the first study was not an artifact of the
coding and classification process, we ran a second study to verify our results.
Method
Study 2 was survey based and had three parts. The first was the normal presentation of the scales
(i.e., with a modified Likert response scale), and the last section asked for demographic information.
The second section, however, was very different.
Participants were presented with the items and then invited to choose the interpretation that most
closely matched their understanding of the original item. We derived these alternative
interpretations from the open-ended responses gathered in Study 1. In order to assess sentential
miscomprehension, each stimulus item was presented with several response options, including a neutral
paraphrase of the original item (N), an enriched response (E), a depleted response (D), a response
that was both depleted and enriched (D&E), and an option for respondents who did not agree that any
of the choices were consistent with their understanding (I). A sample item is presented in Table 3.
Table 3. Sample Items for Study 2, Phase 1.

Sentential Miscomprehension
  Actual item   How satisfied are you with the nature of the work you perform?
  N             How satisfied are you by the type of work you do?
  E             How satisfied are you with your job and its responsibilities?
  D             How satisfied are you with your job?
  D&E           How satisfied are you that you are doing a good job?
  I             None of the interpretations offered match my interpretation of the question/statement

Lexical Miscomprehension
  Actual item   I feel fairly well satisfied with my present job
  N             My present job leaves me fairly well satisfied
  P             My present job makes me fairly happy
  T             My present job meets my expectations
  I             None of the interpretations offered match my interpretation of the question/statement

Note: Sentential miscomprehension: N = neutral paraphrase of the original item; E = an enriched
response; D = a depleted response; D&E = a response that was both depleted and enriched; I = an
option for respondents who did not agree that any of the choices were consistent with their
understanding. Lexical miscomprehension: N = neutral interpretation; P = pleasurable interpretation;
T = transactional interpretation; I = a response for not agreeing with any of the choices.
Each stimulus was presented twice with differing sets of possible interpretations (i.e., neutral,
enriched, etc.) on each occasion, in order to verify that the respondents' selections were not simply
an artifact of the specific choices presented. If a respondent is interpreting the item accurately, they
should select the neutral option both times.
Similarly, when assessing lexical miscomprehension we focused on the different meanings of the
word satisfaction. Respondents were offered neutral (N), pleasurable (P), and transactional (T)
interpretations and a response for not agreeing with any of the choices (I). A sample item is presented
in Table 3. As with sentential miscomprehension, each stimulus was presented twice with differing
response options. If linguistic ambiguity has no effect on survey research, then respondents should
either pick the neutral item, which was a paraphrasing of the initial question, or, at least, all choose
the same non-neutral option.
The second test for lexical miscomprehension included the use of three items with ambiguous
modifying terms: "I am often bored with my job," "Most days I am enthusiastic about my work,"
and "I like my job better than the average worker does" (Agho et al., 1992). Respondents were
presented with different options for each term indicating different frequencies, time periods, or
comparative groups, respectively. For example, for the item "I am often bored with my job,"
responses ranged from "More than 33% of the time I am bored with my job" to "More than 75%
of the time I am bored with my job."
The validity of the response choices was checked by sending them to a university linguistics
professor. She classified the responses into the sentential and lexical miscomprehension cate-
gories. She accurately classified 98% of the response choices into the same category as the
authors, suggesting that the choices accurately mirrored sentential and lexical miscomprehension
as outlined previously.
The sample for Study 2 came from two sources. First, we again sent the survey to some of our
contacts and asked them to forward it to other working adults and collected 165 valid responses from
this group. We then collected an additional 100 responses using a paid Qualtrics panel of working
adults. This allowed us to check whether the phenomena observed were an artifact of our sampling
method or whether they also occurred in a broader sample that is likely to be more representative of
the population of working adults.
Results
Dimensions of the Sample. Two hundred sixty-five valid responses were received. Most of the
respondents were natives of either the UK (37%) or US (52%), with the remaining 11% from a mix of
other countries. The average age was 42 (SD = 11.5), 57.7% were female, and 87.5% had a college
degree, with 52% having a master's degree or higher.
Statistical Tests for Scale Validity. As in Study 1, numerical responses to the first portion of the
survey, where the items were administered in their conventional format, were subjected to statistical
analysis. Coefficient alpha values were somewhat higher than in Study 1, varying between .87
and .91.
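For readers less familiar with the statistic, coefficient alpha can be computed from the item variances and the variance of the summed scale score. The following is a minimal sketch in Python; the function name and the toy responses are illustrative assumptions and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a scale.

    rows: one list of item scores per respondent (all the same length).
    """
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # transpose: one tuple per item
    sum_item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical Likert responses: 5 respondents x 3 items.
toy_rows = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
print(f"alpha = {cronbach_alpha(toy_rows):.2f}")
```

A high alpha of this kind indicates only that the items covary; it does not show that respondents interpreted the items in the same way.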
We established scale dimensionality in this sample by conducting confirmatory factor analysis
using LISREL (Joreskog & Sorbom, 2006) to examine factor structure. We looked at each of the
job satisfaction scales (Agho et al., 1992; Tsui et al., 1992) on its own in combination with the
measure of procedural justice (Sweeney & McFarlin, 1993), as we did not expect to find that a
three-factor model including both measures of job satisfaction would provide a satisfactory fit
to the data. In both cases, all items loaded significantly on the latent variable, and acceptable fit
statistics were obtained (Comparative Fit Index [CFI] of .95 and .98; standardized root mean
square residual [SRMR] of .043 and .059), although in both cases the chi-square test was
significant. Chi-square can be problematic (Joreskog, 1969) as it is very sensitive to both sample
size and violations of distributional assumptions. Garson (2009) advises that chi-square test sig-
nificance can be overlooked if other fit measures indicate good fit. Given that other fit statistics
were consistently within acceptable range (Hair, Black, Babin, Anderson, & Tatham, 2006), and
the reliability coefficients were strong, we concluded that the scales demonstrated sufficiently
good fit that if we were conducting a substantive analysis using these data, we would consider that
we had confirmed the unidimensionality of each scale.
Sentential Miscomprehension. Over 90% of the respondents found that one of the four interpretations
of the stimulus item offered (i.e., neutral, enriched, depleted, or enriched and depleted) coincided
with their interpretation of the stimulus item. If respondents had followed the strict syntax of the
stimulus item, then they should have selected the neutral response. While this was the modal category
and was, on average, selected 49% of the time (range 24%-68%), other categories, principally
enrichment, were also commonly chosen. Table 4 shows the number and percentage of respondents
demonstrating sentential miscomprehension.
These results confirm that respondents routinely go beyond the strict syntactic meaning of items.
Moreover, this interpretive process does not appear to be uniform, and some items are enriched or
depleted to differing degrees. These data accord with the results of Study 1.
The first section of the survey contained the scales in their conventional format (i.e., with Likert
responses). This allowed us to see whether different interpretations produced significantly different
scores. For 9 of 30 items the different interpretations produced significantly different scores on the
Kruskal-Wallis test, with these demonstrating mild to moderate effect sizes of between r = 0.15 and
0.29 (Cohen, 1977). A Bonferroni correction was applied to post hoc Mann-Whitney results, such
that differences are reported at a .01 level of significance. The categories that differed from the
neutral interpretation and the direction in which they differed are shown in Table 4.
We conducted character, syllable, and word counts for each item and calculated a number of
readability measures (after Jensen, 2009) to explore the impact of item length and complexity on
comprehension. These were then compared to the number of neutral interpretations of each item, as
fewer neutral interpretations suggest greater miscomprehension. No significant relationship was
observed, suggesting that item length and complexity did not correlate with miscomprehension.
Lexical Miscomprehension. Over 90% of respondents found that the options that they were
presented with matched their own interpretation of the item. Analysis of respondents' understanding
of the word satisfied, presented in Table 5, suggests that there was significant deviation
from the neutral response. Fifty-six percent of respondents selected the neutral option,
which contained the word satisfied itself rather than an interpretation of it. Twenty-four percent of
respondents selected a pleasurable interpretation of the word satisfied, and 14% chose the
transactional interpretation.
The number of respondents choosing pleasurable or transactional interpretations was lower than
in Study 1. Statistical analysis of the responses to the different forms of lexical miscomprehension
using the Kruskal-Wallis test demonstrated a significant difference for only 1 of 16 items, again
with a mild to moderate effect size (r = .18; see Table 5).
Differing lexical interpretations were also observed. The word often in the item "I am often bored
with my job" was interpreted as being 33% to 75% of the time. Most in "Most days I am enthusiastic
about my work" was either 3 of 5 or 4 of 5 days, with a significant number understanding most as
"more often than not." Finally, for the item "I like my job better than the average worker does," a
number of different options were given to define the average worker. Most respondents selected
either "the typical worker in this country" or "the average person," but 14% selected either
"coworkers" or "peers" as their comparator.
Although the difference among categories was significant when analyzed using the Kruskal-Wallis
test as an omnibus test, none of the categories reached significance in post hoc analysis using
Mann-Whitney tests with a Bonferroni correction, indicating that there is no significant difference in
scale score based on respondent interpretation of the item.
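The post hoc step described here, in which each non-neutral interpretation group is compared with the neutral group and the significance threshold is divided by the number of comparisons, can be sketched as follows. This is a simplified illustration rather than the analysis actually run: it uses the normal approximation to the Mann-Whitney U distribution without tie or continuity corrections, so values will differ slightly from those produced by a statistics package. The group labels, scores, and the familywise alpha of .05 are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1       # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(a, b):
    """Return (U for group a, z, two-sided p) via the normal approximation.

    Simplification: no tie or continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return u1, z, p


def posthoc_vs_neutral(groups, neutral="N", alpha=0.05):
    """Compare every non-neutral group with the neutral group (Bonferroni)."""
    others = [key for key in groups if key != neutral]
    threshold = alpha / len(others)       # Bonferroni-adjusted threshold
    results = {}
    for key in others:
        u, z, p = mann_whitney(groups[key], groups[neutral])
        results[key] = {"U": u, "z": z, "p": p, "significant": p < threshold}
    return results


# Hypothetical Likert scores grouped by the interpretation each respondent chose.
toy_groups = {"N": [3, 4, 4, 5], "E": [1, 2, 2, 3], "D": [3, 4, 5, 4]}
for label, res in posthoc_vs_neutral(toy_groups).items():
    print(label, res)
```

Because the threshold shrinks as the number of comparisons grows, an omnibus test can be significant while no single pairwise comparison is, which is the pattern reported above.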
These data present strong evidence that individuals interpret words differently. Some regard
being satisfied as a pleasurable experience whereas others regard it as a transactional one, even
when the original intent of the author was one or the other. Modifying words such as often are
similarly open to interpretation as individuals draw on different mental schemata.
Table 4. Number and Percentage (in Parentheses) of Respondents Demonstrating Various Forms of
Sentential Miscomprehension.

Scale/item   Pr.   N             E             D             D&E           I
Tsui, Egan, and O'Reilly (1992)
  1.         1     164 (61)      58 (21)       22 (8)        15 (5)        6 (2)
             2     145 (55)      45 (17)       46 (17)       19 (7)        8 (3)
  2.         1     180 (68)      47 (17)       13 (4)        20 (7)        4 (1)
             2     168 (63)      28 (10)       37 (13)       23 (8)        9 (3)
  3.         1     163 (61)      36 (13)*a     30 (11)       23 (8)        12 (4)*a
             2     130 (49)      66 (25)*a     16 (6)        37 (14)*a     14 (5)
  4.         1     175 (66)      31 (11)       35 (13)       16 (6)        6 (2)
             2     102 (38)      49 (18)       46 (17)       58 (22)       8 (3)
  5.         1     131 (49)      39 (14)       75 (28)       15 (5)        5 (1)
             2     89 (33)       94 (35)*a     40 (15)       20 (7)        20 (7)
  6.         1     152 (57)      8 (3)         79 (29)       24 (9)        2 (0)
             2     119 (45)      22 (8)        69 (26)       43 (16)       9 (3)
Sweeney and McFarlin (1993)
  1.         1     96 (36)       60 (22)       72 (27)       20 (7)        16 (6)
             2     97 (37)       89 (33)       30 (11)       17 (6)        29 (11)
  2.         1     123 (46)      63 (23)       29 (11)       41 (15)       7 (2)
             2     155 (58)      67 (25)       19 (7)        12 (4)        10 (3)
  3.         1     88 (33)       56 (21)       38 (14)       75 (28)       5 (1)
             2     152 (58)      41 (15)       41 (15)       14 (5)        14 (5)
  4.         1     64 (24)       78 (29)       41 (15)       52 (19)       27 (10)
             2     74 (28)       87 (33)       54 (20)       27 (10)       18 (6)
Agho, Price, and Mueller (1992)
  1.         1     74 (28)       81 (30)*a     71 (26)       18 (6)*a      20 (7)
             2     155 (58)      34 (12)*a     38 (14)       7 (2)         30 (11)
  2.         1     147 (56)      58 (22)       31 (11)*b     16 (6)        10 (3)
             2     150 (56)      27 (10)       66 (25)*a     12 (4)        9 (3)
  3.         1     158 (60)      69 (26)       19 (7)        5 (1)         12 (4)
             2     147 (55)      57 (21)       23 (8)        22 (8)        14 (5)
  4.         1     127 (48)      74 (28)       23 (8)        30 (11)       8 (3)
             2     100 (38)      86 (32)       26 (9)        28 (10)*b     23 (8)*b
  5.         1     147 (55)      42 (15)       17 (6)        43 (16)*b     14 (5)
             2     91 (34)       82 (31)       15 (5)        40 (15)       35 (13)
Grand mean         128.7 (48.9)  55.8 (21.2)   38.7 (14.7)   26.4 (10)     13.4 (5)

Note: Pr. = presentation (each item was presented twice with a set of interpretations for each
presentation); N = neutral; E = enriched; D = depleted; D&E = depleted and enriched;
I = inappropriate interpretation.
* Denotes significant differences in score compared to neutral category.
a Denotes mean below the neutral response mean.
b Denotes mean above the neutral response mean.
Study 2, Phase 2
Phase 1 of the study demonstrated primary effects, where differences in interpretation resulted in
significant differences in score. We have already raised the possibility of secondary effects, where
differences in interpretation affect other constructs. So to explore this possibility we repeated Phase
1 and incorporated a measure of turnover intention.
Method
The survey used was exactly the same as that used in Phase 1, but with the addition of two items from
the Camman, Fischman, Jenkins, and Klesh (1983) scale measuring turnover intention. The survey
was administered using Amazon's Mechanical Turk (mTurk) to obtain a sample of respondents who
were currently employed. mTurk has been used effectively in various fields, including both
linguistic and psychology studies (see Mason & Suri, 2012; Sprouse, 2011).
Results
Of the 250 respondents who completed the survey, 39 failed checks built in to test for careless
responding, leaving a final sample size of 211 valid responses. The checks included 3 items scattered
through the survey such as "If you are reading this, please select disagree." Participants were
discarded from the sample if they failed two or more of these checks and they completed the survey
very quickly. This is consistent with the methods described by Meade and Craig (2012) to detect
and eliminate cases in which the respondent is not attending to the content at all.
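The exclusion rule described above combines two conditions: failing at least two of the three embedded checks and completing the survey very quickly. A minimal sketch of such a screen is given below. The record layout, the field names, and in particular the 180-second cutoff are assumptions made for illustration; no numeric threshold for "very quickly" is reported here.

```python
from dataclasses import dataclass


@dataclass
class Response:
    respondent_id: str
    failed_checks: int          # number of embedded attention checks failed (0-3)
    seconds_to_complete: float  # total completion time


def is_careless(resp, min_failed=2, min_seconds=180.0):
    """Flag a respondent who failed >= min_failed checks AND finished too fast.

    min_seconds is a hypothetical cutoff, not a value taken from the study.
    """
    return resp.failed_checks >= min_failed and resp.seconds_to_complete < min_seconds


def screen(responses):
    """Split responses into (retained, discarded) lists."""
    retained = [r for r in responses if not is_careless(r)]
    discarded = [r for r in responses if is_careless(r)]
    return retained, discarded


# Hypothetical records.
toy_sample = [
    Response("r1", failed_checks=0, seconds_to_complete=640.0),
    Response("r2", failed_checks=2, seconds_to_complete=95.0),   # discarded
    Response("r3", failed_checks=3, seconds_to_complete=700.0),  # slow, so retained
    Response("r4", failed_checks=1, seconds_to_complete=60.0),   # one failure, retained
]
kept, dropped = screen(toy_sample)
print([r.respondent_id for r in kept], [r.respondent_id for r in dropped])
```

Requiring both conditions is more conservative than discarding on failed checks alone, so fewer respondents are removed.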
Of respondents, 51.2% were female, the average age was 35.2 (SD = 10.9), 95.3% of respondents
were American, and 78.7% had a college degree or higher, with 15.4% having a master's degree or
higher.
Table 5. Number and Percentage of Respondents Demonstrating Lexical Miscomprehension of the Word
Satisfied.

Scale/item   Pr.   N             P             T             I
Tsui, Egan, and O'Reilly (1992)
  1.         1     145 (55)      63 (23)       48 (18)       7 (2)
             2     139 (53)      78 (29)       36 (13)       9 (3)
  2.         1     141 (53)      87 (33)       26 (9)        9 (3)
             2     176 (67)      41 (15)       35 (13)       10 (3)
  3.         1     151 (57)      38 (14)       59 (22)       14 (5)
             2     156 (59)      54 (20)       45 (17)*a     8 (3)
  4.         1     118 (44)      103 (39)      24 (9)        18 (6)
             2     169 (65)      53 (20)       32 (12)       6 (2)
  5.         1     127 (48)      55 (20)       76 (28)       6 (2)
             2     79 (30)       95 (36)       78 (29)       10 (3)
  6.         1     151 (57)      89 (33)       18 (6)        5 (1)
             2     139 (52)      78 (29)       34 (12)       12 (4)
Agho, Price, and Mueller (1992)
  2.         1     163 (63)      44 (17)       37 (14)       14 (5)
             2     176 (67)      54 (20)       23 (8)        8 (3)
  3.         1     168 (63)      59 (22)       14 (5)        23 (8)
             2     183 (69)      40 (15)       23 (8)        16 (6)
Grand mean         148.8 (56.7)  64.4 (24.5)   38 (14.4)     10.9 (4.1)

Note: Pr. = presentation (each item was presented twice with a set of interpretations for each
presentation); N = neutral; P = pleasurable; T = transactional; I = inappropriate interpretation.
* Denotes significant differences in score compared to neutral interpretation.
a Denotes mean below the neutral response mean.
Sentential Effects on Turnover Intention. We noted in Study 1 that the item "I am satisfied with my
job for the time being" (Agho et al., 1992) seemed to trigger an association with turnover
intention in some respondents. In Study 2, one of the enriched interpretations reflects this.
Respondents therefore could choose between a neutral interpretation of the item and an
enriched version, "At the moment I am satisfied with my job and I am not looking for a new
one." There was no significant difference detectable in item score across the interpretations,
suggesting that the impact of the miscomprehension is not directly detectable. Accordingly,
we selected this item for analysis.
We compared the turnover intention scores of those who selected the enriched interpretation of
the item with those who selected the neutral interpretation. Using a Mann-Whitney test to compare
the scores for these two groups, we found a difference that approached significance, U = 1,867,
z = 1.88, p = .060, r = .13, whereby those enriching the item scored higher on turnover intention.
This suggests that sentential miscomprehension can have indirect effects, as such a result is unlikely
to have happened by chance (Nickerson, 2000).
Lexical Effects on Turnover Intention. We explored the possibility that lexical miscomprehension
might also have indirect effects by examining the relationship between differing interpretations
of the word satisfaction and turnover intention. We chose the satisfaction item that best reflected
the global concept of job satisfaction: "Considering everything, how satisfied are you with your
current job situation?" In Phase 1 there was no significant difference in item score among the
types of interpretation offered.
We found a significant difference in turnover intention across groups when comparing the
transactional interpretation to the pleasurable interpretation, U = 569, z = 2.95, p = .013, r = 0.20,
and also when comparing the transactional interpretation to the neutral one, U = 1,208, z = 2.38,
p = .017, r = .16.
These results show that differences that do not manifest themselves on item score may, as linguistic
theory suggests (Schwarz, 1999), reflect differing cognition and have indirect effects.
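As a check on the arithmetic, the effect sizes reported for the three turnover-intention comparisons can be recovered from the z statistics if r is taken to be z divided by the square root of N. The sketch below makes two assumptions that are not stated in the text: that this conversion was the one used, and that N is the full valid Phase 2 sample of 211. Only the magnitudes of z are used, because the sign convention is not recoverable from the text.

```python
import math


def effect_size_r(z, n):
    """Effect size r = |z| / sqrt(N) for a rank-based two-group test."""
    return abs(z) / math.sqrt(n)


N_VALID = 211  # assumption: full valid Phase 2 sample, not the two-group subtotal

# (comparison, reported z, reported r) as given in the text above.
reported = [
    ("enriched vs. neutral (sentential)", 1.88, 0.13),
    ("transactional vs. pleasurable (lexical)", 2.95, 0.20),
    ("transactional vs. neutral (lexical)", 2.38, 0.16),
]

for label, z, r_reported in reported:
    r = effect_size_r(z, N_VALID)
    print(f"{label}: r = {r:.2f} (reported {r_reported:.2f})")
```

Under these assumptions all three reported values are reproduced to two decimal places, which is consistent with, but does not prove, the use of the whole sample as N.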
Study 2 Results Summary. The findings of Phase 1 of Study 2 confirm the findings in Study 1,
suggesting that they are not the product of researcher confirmation bias (see Nickerson, 1998).
Furthermore, Phase 2 of Study 2 demonstrates that miscomprehension may also have indirect effects.
Study 3: Replication of Results With Spreitzer's Empowerment Scale
To allay concerns that the findings demonstrated were an artifact of the scales selected, we replicated
Studies 1 and 2 with Spreitzer's (1995) 12-item empowerment scale. This scale was developed using
best practices for scale development, and the author reported strong evidence of construct validity.
The scale includes four subdimensions of psychological empowerment: meaning, competence,
self-determination, and impact.
Method
The replication was carried out in two phases. Phase 1 involved administering the 12 items of the
scale to a pool of 29 participants, as in Study 1. The results of this first phase were then used to create
the item interpretations for the second phase, which replicated Study 2. In this phase we constructed
neutral, enriched, depleted, and depleted and enriched interpretations for each item in Spreitzer's
(1995) scale and asked respondents to select the interpretation that most closely matched their own
interpretation. We again collected demographic data and asked the respondents to answer the scale
in its original format. Data were collected from 100 employed workers through Amazon's
Mechanical Turk.
The reason why individuals did not read the instructions is unclear, but familiarity with the task
and the seeming obviousness of what needs to be done are probable causes. The highly educated
nature of the sample may mean that they are more regularly surveyed. They may also take mental
shortcuts as they believe they know what is required. This would seem particularly true for the
RMNet group, for whom surveys are likely to be common currency.
The fact that a large number of RMNet members (12/34) did not read the instructions is
understandable but nonetheless a cause for concern. When developing scales, researchers routinely ask
their colleagues' opinions on various matters and use them to help generate items. This result
suggests that Schriesheim and colleagues (1993) were correct when they asserted that academic
colleagues might not be the ideal assistants.
Sentential Miscomprehension
The results suggest significant sentential miscomprehension. This is most dramatically observed in
the procedural justice scale (Sweeney & McFarlin, 1993), where (depending on the item) 21% to
31% of respondents appeared to miss the process element and hence answer a question about
distributive justice. The figure was rather lower for the RMNet group, which we believe may be
because they are more likely to be sensitized to the procedural element of the question. The fact that
the group that ignored the procedural component was indistinguishable statistically from the
respondents interpreting the question correctly means that any results from this scale would contain
substantial and hidden error. However, as the literature is replete with studies in which procedural and
distributive justice are highly correlated (Colquitt, 2001), this study perhaps sheds some new light on
the source of some of this collinearity.
Study 2 showed that enrichment, depletion, or a combination of the two were demonstrated by
about half of respondents. This suggests that around half of respondents deviate from the strict
syntax of the item and alter it according to their own understanding.
The impact of sentential miscomprehension is difficult to ascertain. For 9 of the 30 items,
differing interpretations produced significantly different results on the traditional presentation of the
item. This effect, however, was inconsistent. It seems likely that the process of enrichment or depletion
meant that respondents tapped into other concepts when responding. For example, the item "I am
satisfied with my job for the time being" was interpreted 22% of the time as "At the moment I
am satisfied with my job and I am not looking for a new one," an interpretation that, as Phase 2
of Study 2 shows, also taps into turnover intention. The implications of this are both direct and
indirect. Directly, some respondents tap into a construct other than job satisfaction. Indirect effects
might potentially occur when job satisfaction and turnover intention are included in the same study.
The linguistic interaction for some (but not all) of the participants would potentially interfere with
the validity of the results obtained.
Sentential miscomprehension therefore has the potential to introduce substantial error. This may
not be detectable using conventional scale appraisal techniques, as the scales in this study had high
coefficient alphas and were unidimensional in factor analyses. Despite this, this error has consider-
able impact on construct validity and sporadic impact on scale score.
Lexical Miscomprehension
Although lexical miscomprehension is difficult to ascertain, the different understandings of the word
satisfaction and the variation in meaning of the word often in this study demonstrate that there is
linguistic variance among respondents.
What is the impact of this? It seems likely that respondents are answering somewhat different
questions as a result of lexical miscomprehension of the word satisfied. Again this raises the
possibility that some respondents may be tapping into different constructs when they interpret items.
This will have both primary effects on the scale itself and secondary effects on any model incorpor-
ating constructs similar to those being unconsciously tapped into by the respondent.
Lexical miscomprehension is most clearly observed in the different interpretations of often.
When responding to the statement "I am often bored with my job," if the threshold for often is
one-third of the time then you are likely to respond differently to this question than would someone
for whom often means 99% of the time.
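The point can be made concrete with a toy example: two respondents who are bored exactly the same share of the time give opposite answers because they hold different thresholds for often. The function name and the 50% figure below are hypothetical; the two thresholds are the extremes mentioned above.

```python
def agrees_with_often_item(bored_share, often_threshold):
    """Would a respondent endorse 'I am often bored with my job'?

    bored_share: proportion of working time the respondent is actually bored.
    often_threshold: the proportion this respondent takes 'often' to mean.
    """
    return bored_share >= often_threshold


bored_share = 0.50  # hypothetical: both respondents are bored half of the time

for label, threshold in [("one-third of the time", 1 / 3), ("99% of the time", 0.99)]:
    answer = "agrees" if agrees_with_often_item(bored_share, threshold) else "disagrees"
    print(f"'often' = {label}: respondent {answer}")
```

The underlying experience is identical in both cases; only the reading of the modifier differs, which is why the resulting variance is invisible in the item scores themselves.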
The results demonstrate that individuals have different interpretations of individual words, yet
this is not appreciable using conventional statistical techniques. The impact of this is that respon-
dents are drawing on quite different conceptual schema and referents, with attendant diminution
of construct validity.
Consequences and Remedies
The findings of this study make uncomfortable reading for those of us involved in survey research.
The degree of variance observed linguistically would be serious cause for concern if it were observed
numerically. All four measures used in this series of studies have been published in high-quality
journals and frequently cited. They have furthermore been subjected to significant previous
construct validation analyses. And yet they all contain linguistic threats to validity.
The reliability and validity of measurement instruments and surveys are of pivotal importance. If
the measures do not measure what they purport to, or do not do so accurately, then recommendations
based upon research that employs these measures are likely to be flawed. So what is to be done? We
propose strategies to minimize each form of miscomprehension for those developing new scales and
those using existing scales. We begin by providing principles to undergird item construction.
Instructional Miscomprehension. Putting borders around the instructions and using bold type or
capitals do not seem to completely eliminate instructional miscomprehension, as these were all used in
Study 1 to little effect. The literature on manipulation checks, however, offers a possible direction.
Oppenheimer, Meyvis, and Davidenko (2009) used an effective system to detect failure to follow
instructions. Participants were presented with a survey about sports participation. The instructions
indicated that participants should ignore the first question in the survey and instead click on the page
title. Those who had not read the instructions were able to proceed with the survey normally, allowing
researchers to compare the responses of those who read the instructions with those who did not. This
is similar to the "captcha" or "reverse Turing test" advocated by Mason and Suri (2012), where a
particular response is requested by the survey item to prove that the participant is paying attention
and motivated. Use of these approaches improves engagement, reducing the proportion of invalid
responses from 48.6% to 2.5% (Kittur, Chi, & Suh, 2008).
There is a danger that these seemingly counterintuitive instructions (e.g., to ignore a particular
item) might confuse respondents. Instructing participants to pay close attention to the items and
informing them that tests that measure their carefulness or attentiveness are being used may help.
Care should also be taken to ensure that such tests do not interfere with respondents' capacity to
answer other items in the survey. There is also a potential danger that selecting only those
respondents who obey all the instructions excludes particular groups, for example those with a
particular personality trait. Nonetheless, tests of this sort are a useful addition to any survey to ensure
that participants have attended to the instructions and are sufficiently motivated.
Sentential Miscomprehension. When looking at the structure of the item, it is important to eschew
vague words such as many, most, often, or sometimes. These have no formal quantity and so
represent an open invitation to miscomprehension. Try to use a quantity so that instead of an item that
says "Most days I am enthusiastic about my work" use "I am enthusiastic about my work at least
75% of the time." When asking for a comparison, ensure that the comparator is clear. The UK
Office of National Statistics Wellbeing Survey (Office of National Statistics, 2013), for example,
contains an item "Overall, how satisfied are you with your life nowadays?" Nowadays is a vague
term. A better item would be "Overall, how happy have you been with your life over the last three
months?"
Comprehension may also be improved by the use of bold or italicized elements of items (see
Christian & Dillman, 2004). As those respondents not following instructions have already been
eliminated, it seems likely that those remaining in the sample are more motivated and attentive. The
use of bold or italics may help emphasize elements of the item, for example, "How fair or unfair are
the procedures used to determine pay rates?" (with the word procedures emphasized), thus avoiding
some of the problems witnessed with the Sweeney and McFarlin (1993) scale.
Lexical Miscomprehension. There is a great deal of extant advice on item construction (e.g., DeVellis,
2003; Groves et al., 2004), and our recommendations are intended to supplement this, not supplant it.
We first consider the actual words used in the item.
The tendency for multiple interpretations of the same word (polysemy) that we have observed
with the word satisfaction suggests that care should be used to avoid words with multiple meanings.
The number of different meanings of a particular word can be examined using a dictionary. As an
example, the word happy might be preferred to the word satisfied when asking about the relationship
between an employee and their organization if the measure is intended as an affective one. If another
word is not available, then linguistic theory suggests that context aids comprehension. This means
that the scale authors should provide careful guidance, through contextual information, as to which
meaning the respondent should select. As this contextual information is typically provided in the
instructions, the problem of instructional miscomprehension becomes even more of a concern and
the use of effective mechanisms to combat it even more vital.
Ill-defined words, such as "meaningful," should also be avoided. While the researcher may have a
clear idea of what a concept means, the respondents may not. Plain, short, commonly used words are
most likely to be understood and reduce miscomprehension. Care should also be taken to avoid jargon
and culturally specific terms. The word "quite," as in "quite good," for example, expresses strong
approval in the US but represents borderline mediocrity in the UK. In some cases researchers may
refrain from using words altogether and use pictograms, such as the smiling/frowning faces used by
Kunin (1998). Those using existing scales should be similarly critical of the words used, even in
published scales, as the authors may not have attended to issues of miscomprehension.
Survey Construction. Moving from the individual item to survey composition allows us to use linguistic
theory to aid comprehension. Given that comprehension is a function of syntax and context, a
sensible approach is to provide plenty of context to ensure a more uniform and predictable
interpretation process. While there have been conflicting views of the necessity or indeed the
desirability of intermixing items (Schriesheim & Denisi, 1980; Schriesheim, Kopelman, & Solomon, 1989;
Sparfeldt, Schilling, Rost, & Thiel, 2006), we suggest that intermixing may affect miscomprehension:
surrounding items may well provide contextual information that helps the respondent
understand the item, and so grouping may help reduce miscomprehension (Tourangeau & Rasinski,
1988). We therefore recommend grouping items together.
Similarly, more thoughtful instructions, again with an attentiveness check, could help
improve item comprehension. If the heading for the Sweeney and McFarlin (1993) scale contained
the instruction "We now want you to think about the processes by which a number of different
things that affect you at work are decided," then this may help emphasize the procedural element
of the scale.
While it is clear that linguistic errors are a considerable concern in survey research, few researchers
are looking for them. When a new scale is being developed, researchers might ask expert judges if they
understand the item, or ask whether the item appears to them to measure the construct, but they do not
typically ask respondents what they think the item means. If an existing scale is used, then the
provenance of publication may well ensure that even less attention is paid to the properties of the scale.
Testing and Evaluation. After developing a scale that appears to minimize linguistic miscomprehension
while sampling the content domain appropriately, we suggest that researchers assess the degree
of linguistic ambiguity that the items produce within the target population. This field testing should
be used both during scale development, where it should be thought of as a distinct and necessary step
in the scale development process, and also when using preexisting scales. As linguistic theory suggests
that comprehension is a function of both syntax and context, any change in context necessitates
a check to ensure that the item is still uniformly understood by respondents and, crucially, in the
same way as it is understood by the researcher.
We have found, during the course of our own research, that the approach we adopted in Study 1
was very effective at identifying ambiguity. This approach simply asks respondents to describe what
they think the question means in their own words. This technique can be used in both the development
of new items and in appraising preexisting ones. The advantage of this approach is that it can be
administered remotely, and it elicits useful information. For item development purposes it is not
necessary to go through the coding processes we did in Study 1, as inspection of the responses is usually
sufficient to establish whether the item is being homogeneously interpreted. This approach is probably
only suitable, however, for about 15 to 20 items as it is time-consuming for the respondent. For
longer surveys one might use a piecemeal approach where the whole survey is broken down into
blocks of 15 to 20 items and administered to separate samples.
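The piecemeal approach can be illustrated with a minimal sketch. It assumes that items should stay in their original order (so that any grouping by scale is preserved) and that blocks should be as even in size as possible. The function name `split_into_blocks`, the 20-item ceiling as a default, and the 47-item survey are illustrative assumptions, not part of the procedure used in our studies.

```python
import math


def split_into_blocks(items, max_block_size=20):
    """Split a list of survey items into contiguous, near-equal blocks.

    Items keep their original order, so items from the same scale stay
    together. No block exceeds max_block_size.
    """
    if max_block_size < 1:
        raise ValueError("max_block_size must be at least 1")
    if not items:
        return []
    n_blocks = math.ceil(len(items) / max_block_size)
    base, extra = divmod(len(items), n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        # The first `extra` blocks receive one additional item.
        size = base + (1 if i < extra else 0)
        blocks.append(items[start:start + size])
        start += size
    return blocks


# Hypothetical 47-item survey; each block would go to a separate sample.
items = [f"item_{i:02d}" for i in range(1, 48)]
blocks = split_into_blocks(items)
block_sizes = [len(block) for block in blocks]
```

With very short remainders the block sizes can fall below 15, in which case the researcher may prefer to adjust the ceiling by hand.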
The scale development literature includes a number of approaches to further ensure both content
adequacy and homogeneity of comprehension of survey items. Schriesheim et al. (1993) suggest
using Q-methodology to help measure the differences between individual judges. Anderson
and Gerbing (1991) also offer methods for pretesting with small samples, although their technique
focuses more on predicting confirmatory factor analysis (CFA) performance. Hinkin (1998) has
suggested both these approaches to test content validity. Hinkin and Tracey (1999) built on this
work and provided an analysis of variance technique that allows for evaluation of item distinctiveness
as part of the content validation process. In public opinion surveys cognitive interviewing is
commonly used to pretest items (Beatty & Willis, 2007; Schwarz & Sudman, 1996). This may
be done either by asking respondents to think aloud as they answer the survey question
(Ericsson & Simon, 1980) or by probing specific areas of understanding to help draw out elements
of the respondents' thinking (Willis, DeMaio, & Harris-Kojetin, 1999).
In all survey development there is a tension between specificity and applicability. Words and items
that are too specific will not be applicable in other contexts; similarly, general words and items may not
be sufficiently precise to reflect subtle differences. The process of thoughtful development and field
testing should help ensure that the researcher successfully navigates between these two poles.
A Note for Reviewers
Those reviewing papers using survey research should also attend to issues of linguistic ambiguity, as
quality measurement is the responsibility of both researcher and reviewer (Hinkin, 1995). The first
step should be to require anyone who submits a paper based on survey research to provide the
measure for examination (see Note 4). Reviewers should then, having ruled out normal threats to
validity (e.g., double-barreled items), look at the individual words in the item to see if the item
contains any modifiers, such as "most" or "often," or words with multiple meanings, such as
"satisfaction." The next step should be to
inspect the syntax of the item to make sure that referents are clear, for example, "yesterday" rather than
"recently." Reviewers should then assure themselves that the authors have ascertained whether the
item is uniformly comprehensible to the target audience. Finally, reviewers should inquire as to the
steps taken by researchers to ensure that their respondents have read the instructions and are
motivated throughout the survey. These steps, taken together, should help reduce poorly worded items
and weed out unmotivated participants, thus improving the quality of research based on soliciting
opinions.
It is unlikely, however, that linguistic ambiguity will ever be eliminated. As linguistic philosophers
have pointed out, and as we discussed in the linguistic theory section, there is always an
indeterminacy of language. Nonetheless, these approaches to item development and testing will
help identify and eliminate the more obvious forms of instructional, sentential, and lexical
miscomprehension.
Limitations and Future Directions
This article used two different approaches to explore the impact of linguistic pragmatics on survey
interpretation through the examination of four carefully chosen scales. By using a combination of
open-ended and fixed response format questions we have aimed to use "methods that have
non-overlapping weaknesses in addition to their complementary strengths" (Brewer & Hunter, 2006,
p. 4). The four scales used here, however, can hardly be seen as representative of all available scales.
Future studies should extend this analysis to other scales.
Future research could explore the properties of items and words, and their context, which led to
them being either sententially or lexically miscomprehended. This would require considerable
effort, but given careful design and sufficient respondents, it may be that general principles to reduce
the impact of linguistic factors on survey research could be produced, given that the extant
linguistics literature has examined some of these issues already.
Another interesting area for future research would involve comparing whether native English
speakers were more likely to sententially or lexically miscomprehend items than non-native
speakers. Theoretically, native English speakers should have a more nuanced vocabulary and so
be more likely to make linkages to other English words than non-native speakers. This may mean
that non-native English speakers are actually better survey respondents as they are more likely to
interpret items appropriately. Research with children, who similarly have a more restricted
vocabulary, suggests that they produce a more restricted set of interpretations (Noveck, 2001). In
addition, future research might examine the possibility that there may be individual differences that
drive linguistic miscomprehension.
Finally, the current study has been within the paradigm of classical test theory. Item response
theory (IRT) might offer an alternative approach to identifying linguistic ambiguity. IRT's ability to
examine bias by comparing the performance of individual items has been used to identify
characteristics of respondent populations, for example, those faking personality tests (Zickar, Gibby, &
Robie, 2004). IRT has also been used to explore characteristics of surveys, for example context
effects (Rivers, Meade, & Lou Fuller, 2009), the effects of extreme wording (Nye, Newman, &
Joseph, 2010), and equivalence of translation (Ellis, 1989). Evaluation of item response curves may
help identify differences in interpretation that are not readily appreciable using classical techniques.
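As an illustration of what comparing item response curves might involve, the sketch below uses the standard two-parameter logistic (2PL) model, in which the probability of endorsing an item is 1 / (1 + exp(-a(theta - b))), with discrimination a and location b. It treats the item as dichotomous (for example, agree versus not agree), which is a simplification for the rating scales discussed here. The parameter values for the two groups are invented for illustration; in practice they would be estimated from response data with dedicated IRT software.

```python
import math


def two_pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def max_curve_gap(params_1, params_2, grid):
    """Largest absolute difference between two item response curves."""
    return max(
        abs(two_pl(theta, *params_1) - two_pl(theta, *params_2))
        for theta in grid
    )


# Latent trait values from -3 to 3 in steps of 0.1.
grid = [x / 10 for x in range(-30, 31)]

# Hypothetical (a, b) estimates for the same item in two respondent
# groups, e.g., groups that reported different interpretations of it.
group_1 = (1.2, 0.0)
group_2 = (1.2, 0.8)

gap = max_curve_gap(group_1, group_2, grid)
```

A gap near zero would suggest that the item behaves similarly in both groups; a large gap would flag the item for closer inspection of how it is being interpreted. What counts as large is a judgment the sketch does not settle.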
Summary
Survey research is a critical weapon in the social scientist's methodological armory. It enables the
opinions and feelings of large numbers of respondents to be rapidly ascertained and collated.
Developments in statistical techniques have enabled more sophisticated analyses to be performed in order
to enhance our understanding of social phenomena and processes. Surveys have tended to rest on the
assumption of an unbroken chain of comprehension between the mind of the researcher through the
survey instrument and to the mind of the recipient, and back again. This assumption does not seem,
on the basis of the results of this study, to be particularly robust. Respondents often either fail to
follow instructions or miscomprehend the items presented. This is not readily detectable when the
output from a survey is numerical, a problem that may be further exacerbated by changes in the
context in which the items are presented.
This article is not intended to denigrate surveys as an information source or research tool, but
rather it seeks to draw the reader's attention to some of the linguistic problems that underlie surveys
and to demonstrate the magnitude of effect of these problems. The problem is, perhaps, most neatly
summarized by the sociologist R. H. Tawney's (1971) comment that "Sociology . . . is a department
of knowledge which requires that facts should be counted and weighed, but which, if it omits to
make allowance for the imponderables, is unlikely to weigh or even count them right" (p. 147), a
comment that seems as relevant and applicable to organizational survey research as it does to
sociology.
Overall, research into the linguistics of survey items is a rich soil for future research. Given the
misinterpretation described in this article, there is clearly much work to be done. Attention to the
potential methodological issues outlined in this article should help produce better, more valid results
that will in turn provide the basis for an improved understanding of social and organizational
phenomena.
Authors' Note
All data are available from either author (Ben Hardy, [email protected], or Lucy Ford, [email protected]). We
would like to thank Dr. Alyson Pitts, University of Cambridge, for her expert linguistics assistance, and Dr.
Raina Brands, University of Cambridge, for her tireless assistance with data coding. We would also like to
thank the three anonymous reviewers who provided invaluable feedback that greatly improved this manuscript.
A previous version of this manuscript appears in the Academy of Management (2012) proceedings and was
the recipient of the Sage Publications/Research Methods Division Best Paper Award.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or
publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Notes
1. In this article, any distinction between statements (declaratives) and questions (interrogatives) is disregarded
for current purposes as both statements and questions serve as the stimulus to which the survey recipient is
asked to respond. Accordingly, the terms will be used interchangeably.
2. The examples quoted in this section are actual responses from Study 1 in the empirical portion of this article.
3. As the Agho, Price, and Mueller (1992) satisfaction scale used declaratives it was impossible to tell whether
respondents were paraphrasing the question or answering. Accordingly this scale was not analyzed.
4. We understand that some authors may not wish their scales to be placed in the public domain. This is
reasonable. It is not reasonable, however, given the problems of construct validity that numerous authors have
identified, for the scales not to be shared with reviewers.
References
Agho, A. O., Price, J. L., & Mueller, C. W. (1992). Discriminant validity of measures of job satisfaction,
positive affectivity and negative affectivity. Journal of Occupational and Organizational Psychology, 65(3),
185-196.
Anderson, J. C., & Gerbing, D. W. (1991). Predicting the performance of measures in a confirmatory factor
analysis with a pretest assessment of their substantive validities. Journal of Applied Psychology, 76(5),
732-740.
Beatty, P. C., & Willis, G. B. (2007). Research synthesis: The practice of cognitive interviewing. Public
Opinion Quarterly, 71(2), 287-311.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review,
111(4), 1061-1071. doi:10.1037/0033-295x.111.4.1061
Brayfield, A. H., & Rothe, H. F. (1951). An index of job satisfaction. Journal of Applied Psychology, 35(5),
307-311.
Brewer, J., & Hunter, A. (2006). Foundations of multimethod research: Synthesizing styles. Thousand Oaks,
CA: Sage.
Camman, C., Fischman, M., Jenkins, G. D., & Klesh, J. (1983). The Michigan organizational assessment
survey: Conceptualization and instrumentation. In S. E. Seashore, E. E. Lawler III, P. H. Mirvis, & C. Camman
(Eds.), Assessing organizational change: A guide to methods, measures and practices. New York, NY:
Wiley-Interscience.
Christian, L. M., & Dillman, D. A. (2004). The influence of graphical and symbolic language manipulations on
responses to self-administered questions. Public Opinion Quarterly, 68(1), 57-80. doi:10.1093/poq/nfh004
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement,
20(1), 37-46. doi:10.1177/001316446002000104
Cohen, J. (1977). Statistical power analysis for the behavioral sciences (Rev. ed.). New York, NY: Academic
Press.
Colquitt, J. A. (2001). On the dimensionality of organizational justice: A construct validation of a measure.
Journal of Applied Psychology, 86(3), 386-400. doi:10.1037/0021-9010.86.3.386
Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in
organizational research. Organizational Research Methods, 6(2), 147-168. doi:10.1177/1094428103251541
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334.
DeVellis, R. F. (2003). Scale development: Theory and applications (2nd ed.). Thousand Oaks, CA: Sage.
Ellis, B. B. (1989). Differential item functioning: Implications for test translations. Journal of Applied
Psychology, 74(6), 912.
Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87(3), 215-251.
doi:10.1037/0033-295x.87.3.215
Fellbaum, C. (1990). English verbs as a semantic net. International Journal of Lexicography, 3(4), 278-301.
doi:10.1093/ijl/3.4.278
Fields, D. L. (2002). Taking the measure of work: A guide to validated scales for organizational research and
diagnosis. Thousand Oaks, CA: Sage.
Ford, L. R., & Scandura, T. A. (2007, November). Item generation: A review of commonly used measures and
recommendations for future practice. Paper presented at the Southern Management Association, Nashville, TN.
Garson, G. D. (2009). Statnotes: Topics in multivariate analysis. Retrieved from
http://faculty.chass.ncsu.edu/garson/pa765/statnote.htm
Grice, H. P. (1975). Logic and conversation. In J. L. Morgan & P. Cole (Eds.), Syntax and semantics (Vol. 3).
New York, NY: Academic Press.
Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2004). Survey
methodology. Hoboken, NJ: John Wiley.