2014 It_s Not Me, It is You-Miscomprehention Linguistic in Survey

    Article

    It's Not Me, It's You:

    Miscomprehension in Surveys

    Ben Hardy¹ and Lucy R. Ford²

    Abstract

    The ubiquity of surveys in organizational research means that their quality is of paramount

    importance. Commonly this has been addressed through the use of sophisticated statistical

    approaches with scant attention paid to item comprehension. Linguistic theory suggests that while

    everyone may understand an item, they may comprehend it in different ways. We explore this in two

    studies in which we administered three published scales and asked respondents to indicate what

    they believed the items meant, and a third study that replicated the results with an additional scale.

    These demonstrate three forms of miscomprehension: instructional (where instructions are not

    followed), sentential (where the syntax of a sentence is enriched or depleted as it is interpreted), and

    lexical (where different meanings of words are deployed). These differences in comprehension are

    not appreciable using conventional statistical analyses yet can produce significantly different results

    and cause respondents to tap into different concepts. These results suggest that item interpretation

    is a significant source of error, which has been hitherto neglected in the organizational literature.

    We suggest remedies and directions for future research.

    Keywords

    survey research, quantitative research, construct validation procedures

    "How satisfied are you with the pay you receive for your job?" (Tsui, Egan, & O'Reilly, 1992). This

    seems a straightforward question that anyone who is employed should be able to understand and

    answer. But what does it actually mean? Is it asking whether you are happy with the pay you receive

    for your job, or whether you think the amount you earn is fair for the work you do? Or something

    else? No doubt you will understand both the question and its meaning. The crucial issue, however, is

    whether others understand it in exactly the same way as you.

    ¹The Open University Business School, Milton Keynes, United Kingdom

    ²Saint Joseph's University, Philadelphia, PA, USA

    Corresponding Author:

    Lucy R. Ford, Saint Joseph's University, 5600 City Avenue, Philadelphia, PA 19131, USA.

    Email: [email protected]

    Organizational Research Methods

    1-25

    The Author(s) 2014

    Reprints and permission:

    sagepub.com/journalsPermissions.nav

    DOI: 10.1177/1094428113520185

    orm.sagepub.com

    Downloaded from orm.sagepub.com at University of Huddersfield on June 5, 2014

    Organizational researchers often solicit the opinion of others through surveys. This frequently

    involves administering a stimulus, in the form of a question or statement,¹ and allowing the

    participant to choose from a limited menu of responses. Closed questions of this nature allow a verbal

    (analog) signal to be converted to a numerical (digital) output, through the allocation of ordinal

    numbers to Likert scale responses, and the consequent output to be subjected to statistical examination.

    The advantage of this process of transduction is also its weakness, as the simple numerical output

    masks infelicities in comprehension of the instruction, question, or response. Individuals agreeing

    with a statement may not necessarily be agreeing with the same thing as other respondents, and

    even sophisticated statistics may not detect these differing interpretations. The whole enterprise of

    survey research rests on the assumption that there is an unbroken chain of comprehension from the

    mind of the researcher, through the survey instrument, to the mind of the respondent, and back again.

    Miscomprehension at any stage in this process introduces error.

    A survey that is used as the basis for strategy or policy and that is poorly constructed and ignores

    different interpretations of questions could have profoundly negative effects. As a consequence, a

    great deal of effort has been put into improving standards of measurement. This research has mainly

    focused on using statistics to assay scale quality, with little attention paid to the stimulus questions

    themselves and the way in which individuals comprehend them.

    This article examines sources and types of linguistic miscomprehension in survey research, using

    published, multi-item scales. We begin with a brief review of scale development and some of the

    principles of linguistics. We then present three studies that explore miscomprehension in survey

    research. The first study shows that while participants understand survey questions, they understand

    them in different ways. Using existing linguistic theory we code the results into three forms of mis-

    comprehension. The second study tests this taxonomy by presenting respondents with a stimulus

    question and asking them to select, from a list of possible interpretations, the interpretation of the

    question that most closely matches their own. We find that participants commonly depart from the

    strict syntax of the item in their interpretations. This threatens construct validity, can have

    implications for the item score on the scale itself, and can affect other scales (in this case, turnover

    intention). In the third study we replicate the findings of the first two studies using a different measure to

    establish that our findings are not particular to our scale selection. These three studies demonstrate

    that respondents interpret items differently, that this threatens construct validity, and yet is not

    apparent when standard statistical tests to assess factor structure and validity are used. We then

    examine the import of these findings for organizational research, suggest remedies, and outline

    directions for future research.

    A Brief Review of Scale Development

    The process of scale development has been discussed in a number of texts (e.g., DeVellis, 2003;

    Hinkin, 1998). These generally aim to fulfill the American Psychological Association guidelines,

    which center around content validity, criterion-related validity, construct validity, and internal con-

    sistency reliability (Hinkin, 1998).

    The first step is to define the concept of interest and its domain. Poorly specified concepts and

    inadequate domain sampling will guarantee an inadequate scale. The next steps are elegantly sum-

    marized by Hinkin (1998). They begin with developing items that either inductively or deductively

    sample the conceptual domain (Hinkin, 1995). If the resulting items are poorly developed, then it is

    unlikely that the subsequent stages of the developmental processes will remedy this. Unfortunately,

    this critical step of item development is seldom accorded appropriate emphasis (Schriesheim,

    Powers, Scandura, Gardiner, & Lankau, 1993), with DeVellis (2003) suggesting that researchers

    often "throw together" or "dredge up" items and assume they constitute a suitable scale (p. 11).

    Hinkin (1998) advocates a more rigorous process, where parsimonious, readily comprehensible

    questions are written and construct validity is examined using multiple samples and techniques such

    as exploratory and confirmatory factor analysis. Despite the importance of the initial stages of item

    development (Hinkin, 1995), greater emphasis is often placed on the statistical assessment of the

    psychometric properties of the scale (Rossiter, 2002) and its relationship to other variables in the

    nomological net (Borsboom, Mellenbergh, & van Heerden, 2004).

    Researchers may choose instead to use published measures. Passage through the peer review pro-

    cess has typically been perceived as evidence of scale quality. However, Ford and Scandura (2007),

    in an examination of a compilation of organizational measures (Fields, 2002), found that the major-

    ity of scales contained one or more threats to construct validity, suggesting that published measures

    are not without flaws. Using a published measure also involves disembedding it from its original

    context, potentially increasing the risk of error. Comprehension (u) is a function of syntax (s) and

    context (c); u = f(s, c), and so the same syntax may be comprehended differently in different

    contexts. For example, saying "break a leg" means something different in a theater dressing room and

    an operating theater. The importance of linguistics in item interpretation has not been widely

    discussed, although there are some existing sources that do address the issue (Schwarz, 1999), and it

    is to this topic we now turn.

    A Brief Review of Linguistic Theory Surrounding Item Interpretation

    Surveys hinge on comprehension. If the respondent does not understand the survey question in

    exactly the same way as the researcher then the instrument is not measuring what the researcher

    intended. This interface between researcher and respondent is of critical importance and is where

    linguistic problems of interpretation manifest themselves. Communication depends on one person's

    statements being understood by another. This, however, is not enough, as understanding what

    another person is saying is one thing, while understanding exactly what they mean is another.

    Researchers in the pragmatic tradition of semantics make a clear distinction between what is said

    and the context (both social and linguistic) in which it is said, the one influencing the other. The

    interplay between the contextual and literal was articulated by Grice (1975) and subsequently

    modified and extended by other authors (e.g., Jaszczolt, 2005; Sperber & Wilson, 1986). The principle

    underpinning this field is that when we interpret a sentence, we flesh out the bare syntax of a sen-

    tence, drawing on our experience, context, and environment. This process, however, is neither uni-

    form nor predictable, varying across individuals and situations. These variations mean that survey

    questions may be fleshed out by individuals to give meanings other than the one intended by the

    question's author. Consequently, this article is concerned with cataloging these differences and

    examining their impact on survey research.

    Types of Error

    Threats to comprehension, and hence validity, fall into two basic categories: instructional and

    interpretive. Interpretive errors can then be broken down into two further categories commonly used

    within the psycholinguistic literature (e.g., Hernandez, 2001). One concerns the comprehension

    of the full sentence, or sentential comprehension, and the other the comprehension of individual

    words, or lexical comprehension.

    Instructional Miscomprehension. This is the most easily understood source of error, where the

    respondent either does not read or follow the instructions for completing the survey, or misunderstands the

    instructions (Tourangeau, Rips, & Rasinski, 2000). This failure to follow or understand instructions

    may not be evident in surveys with a numerical output, yet instructions are of pivotal importance.

    For some surveys they provide direction as to what is actually being measured. In others, the

    instructions might contain the experimental manipulation through a change in wording. In either case, the

    results of the research are affected if the instructions are ignored. In short, instructions are important.

    Interpretive Miscomprehension: Sentential. When hearing a question we attempt to understand what

    it means. This involves making decisions, both conscious and unconscious, about what the

    questioner is actually trying to ask. For example, the question "Have you had lunch?" at a 2 pm meeting

    is likely to be interpreted as "Have you had lunch today?" rather than "Have you ever had lunch?"

    Understanding a question, therefore, is not a banausic process of literally answering what has been

    asked, but rather one of applying contextual information in order to answer appropriately. Crucially,

    we must reach the same interpretation as the author of the question, otherwise we are likely to

    answer a different question from the one the author intended. This is sentential miscomprehension.

    Respondents might enrich or deplete the meaning of questions, or both. When an item is enriched,

    the respondent adds additional information to the stimulus. For example, if the question

    "Considering everything, how satisfied are you with your current job situation?" (Tsui et al., 1992) is

    interpreted by the respondent as "Would you stay in your job if someone offered you something

    else?",² they have enriched the sentence to include elements of turnover intention that were not

    intended. This may mean that the respondent is actually answering a question about turnover

    intention as opposed to just job satisfaction.

    Just as sentences can be enriched they can also be depleted. The question "How fair or unfair are

    the procedures used to determine pay rates?" (Sweeney & McFarlin, 1993) shows depletion if

    interpreted as "How fair is your pay?" The respondent has clearly understood the fairness element of the

    question but not the procedural part and has, effectively, turned a procedural justice item into a

    distributive justice one.

    Interpretive Miscomprehension: Lexical. This form of miscomprehension concerns the meaning of

    the words themselves. One person's definition of a word does not necessarily accurately map onto

    that of another because they are drawing on a variety of educational, cultural, social, contextual, or

    gender-specific definitions.

    The word "satisfaction" has two historical meanings (Simpson & Weiner, 1989). One is "with

    reference to desires or feelings" (Simpson & Weiner, 1989, p. 502) and is described as "The action of

    gratifying (an appetite or desire) to the full ... a sense of pleasurable gratification" (p. 502); the other, "with

    reference to obligations" (p. 502), is a more transactional sensation of obligation having been fulfilled.

    Depending on exposure and knowledge, individuals interpreting the word "satisfaction" may draw on

    one or the other interpretation, or a blend of both. The issue for survey researchers is that it is very

    difficult to know which definition the respondent is drawing on. For example, two different

    respondents may both agree with the statement "Are you satisfied with this company" but one might

    be agreeing that they like the company while the other feels that the company has met its obligations.

    The possibility of lexical miscomprehension resulting from this polysemy, or multiple meanings,

    is not restricted to the word "satisfied." The Collins English Dictionary lists 43,636 different nouns and

    14,190 different verbs. The average noun has 1.74 meanings and the average verb 2.11 (Fellbaum,

    1990), suggesting that there is plenty of opportunity for respondents to draw on more than one

    meaning.

    Lexical miscomprehension can introduce primary error, where the item's miscomprehension

    affects its score, and also secondary error, where miscomprehension causes collinearity between

    scales. This could occur if, for example, a question about satisfaction were included in a model along

    with an instrument for turnover intention. The responses of those using a transactional interpretation

    of satisfaction should correlate with turnover intention but the responses of those taking a gratifica-

    tional view may not.

    These three forms of miscomprehension, instructional, sentential, and lexical, have the potential

    to introduce considerable error into the measurement process. We shall now turn to two studies that

    provide evidence for the existence of these forms of miscomprehension and a third, which confirms

    our preliminary findings using a different scale.

    Study 1: Respondent Interpretation of Survey Question

    This study aims to ascertain whether the forms of miscomprehension outlined previously occur with

    questions used in organizational research.

    Method

    The scales used in this study were selected on the basis of three criteria. First, Ford and Scan-

    dura (2007) did not identify any threats to construct validity in these scales in their analyses of

    all the scales contained in Fields's (2002) book of organizational measures. Second, they

    contained a mix of both questions and statements so that comparisons could be made between a

    satisfaction scale using questions (Tsui et al., 1992) and one containing statements (Agho,

    Price, & Mueller, 1992; drawn from a longer measure in Brayfield & Rothe, 1951). Finally

    they were brief. This last point was of particular importance in order to minimize the potential

    for survey fatigue. Two multiple item scales for job satisfaction (Agho et al., 1992, 6 items;

    Tsui et al., 1992, 6 items) and one for procedural justice (McFarlin & Sweeney, 1992, 4 items)

    were used. The job satisfaction measures were also different in that one (Tsui et al., 1992) is a

    general measure of job satisfaction that measures specific facets such as satisfaction with the

    work itself, supervision, and coworkers, while the other (Agho et al., 1992) was intended as an

    affective measure of job satisfaction. The papers in which these scales appeared have been

    widely cited in organizational research and the scales themselves used frequently in subse-

    quent research.

    The survey was administered using Qualtrics (2007). The respondents were first asked to explain

    what they thought the survey question meant, in a free-text, open-ended response format, imagining

    that they were explaining the item to a non-native English speaker and attempting to convey the true

    meaning of the item. We then administered the same three scales in their usual format with

    Likert-type responses. Finally, we captured standard demographic data such as gender, educational level,

    and native language.

    Sample. For this initial exploratory study, we used three convenience samples of participants. First,

    the authors sent an invitation to participate to their personal contacts, with a request that partici-

    pants forward the survey to others. We used this method as variance in pragmatic inference is uni-

    versal (Sperber & Wilson, 1986), and so a nonrandom sampling method was appropriate. We also

    wanted respondents who had the intellectual capacity to think through the meaning of items care-

    fully. Therefore, sampling our own contacts made sense, as our contacts are typically well

    educated.

    Our final sample comprised a total of 115 respondents. Forty-one of these were native speakers

    of British English (BrE) (average age 34.3, SD = 11.9; 42% female, 58% had a master's degree

    or higher), and 40 were native speakers of American English (AmE) (average age 37.1, SD = 11.6;

    52% female, 60% had a master's degree or higher). We selected speakers of British English and

    American English as we were concerned that there might be differences in interpretation between

    these two forms that are strongly represented in organizational research and represent the two

    forms of English that are taught internationally. We also asked members of the RMNet listserv

    (a listserv restricted to members of the Research Methods Division of the Academy of

    Management) to complete the survey (n = 34, average age 44.5, SD = 12.2; 48% female, 91% had a

    master's degree or higher) as those with an interest in research methodology may assist colleagues

    in developing surveys. The BrE and AmE samples were broadly similar in terms of age, gender

    profile, and educational attainment, while the RMNet sample was slightly older and more highly

    educated, as would be expected.

    Classification of Open-Ended Responses. Responses were classified manually using NVivo 8 (2008) by

    two coders and interrater reliability statistics (Cohen's kappa, κ; Cohen, 1960, as implemented in

    SPSS) were calculated.
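
To make the interrater statistic concrete, the following is a minimal sketch of Cohen's kappa for two coders, written in plain Python rather than SPSS. The function name, the category labels, and the ten example codes are hypothetical illustrations and are not data or code from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per response."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of responses")
    n = len(rater_a)
    # Observed agreement: share of responses given the same code by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for ten open-ended responses (not data from the study).
coder_1 = ["enriched", "enriched", "depleted", "neither", "enriched",
           "neither", "depleted", "neither", "enriched", "neither"]
coder_2 = ["enriched", "enriched", "depleted", "neither", "neither",
           "neither", "depleted", "neither", "enriched", "neither"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa discounts raw percentage agreement by the agreement expected from the coders' marginal frequencies alone, which is why it is preferred to simple percentage agreement when some codes are much more common than others.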

    Instructional miscomprehension was identified and coded when a respondent failed to follow the

    instructions. The instructions were in a bordered box at the top of the first page and clearly stated in

    bold, italicized, block capitals: "DO NOT ATTEMPT TO ANSWER THE QUESTION." Respondents

    who answered the question anyway demonstrated instructional miscomprehension, as did

    those who did not follow the ensuing instructions to explain the item.³

    In order to identify sentential miscomprehension we examined the responses for deviation from

    the syntax of the question, in the form of enrichment or depletion. Enrichment was defined as the

    respondent venturing beyond a strict syntactic interpretation of the question by including other con-

    ceptual elements. Depletion, on the other hand, was defined as the absence of an element of the

    question in the answer provided. Depending, of course, on the nature of the item, it is possible to

    simultaneously enrich and deplete an item. For instance, a respondent who interprets the question

    "How fair or unfair are the procedures used to determine pay rates?" (Sweeney & McFarlin,

    1993) as "How transparent is compensation?" simultaneously depletes the question by not asking

    about the (un)fairness of pay rates (i.e., removing a conceptual element) and also enriches it by

    expanding from pay to compensation.

    Lexical miscomprehension is difficult to apprehend, as it is impossible to know what mental

    schema the respondent is drawing on, so interpretation of the response can only be made by

    inference from the rest of the sentence. The definition of satisfaction in the question "How satisfied are

    you with the nature of the work you perform" (Tsui et al., 1992) was coded as "pleasurable" if

    words indicating pleasure (e.g., happy) were included in the interpretation and "transactional" if the

    interpretation included phrases indicating that it matched or met their expectations. Responses where

    no classification could be made were coded as "neutral."

    Other words that proved tractable to classification for lexical miscomprehension appeared in

    questions where a vague term such as "often" or "most" was used, and there were responses that quantified

    what these words meant, for example "often" being interpreted as "3/5 days." Lexical ambiguity could

    also be seen in interpretations of such items as "I like my job better than the average worker" (Agho

    et al., 1992), which raised the question of how one defines "the average worker." The differing

    referents for "the average worker" were readily classifiable in the responses.

    Finally, analyses were conducted using SPSS to establish the reliability and dimensionality of the

    measures used in the survey.

    Results

    Statistical Tests for Dimensionality. Coefficient alpha (Cronbach, 1951) for the measures ranged

    between .79 and .91, which was consistent with or better than alphas previously reported for these

    measures (Fields, 2002). In addition, we used exploratory factor analysis to establish unidimension-

    ality of each measure (Conway & Huffcutt, 2003). We used maximum likelihood extraction and

    direct oblimin rotation. All items had factor loadings exceeding .40, with almost all exceeding

    .50, indicating unidimensionality of each measure.
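
For readers less familiar with coefficient alpha, the sketch below computes it from a respondents-by-items matrix using only the Python standard library. The function name and the small response matrix are hypothetical and chosen for illustration; they are not the study's data, and the study's own analyses were run in SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    # Transpose so that each tuple holds all respondents' scores on one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    # Variance of each respondent's total (summed) scale score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert responses: five respondents, three items.
likert = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(likert)
print(round(alpha, 3))
```

Alpha depends only on the item scores, so two respondents who give the same number for different reasons are indistinguishable to it. This is one way to see why a high alpha cannot by itself show that items were comprehended uniformly.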

    Results of Coding Open-Ended Responses

    Instructional Miscomprehension. This was readily detected when respondents answered the

    question rather than describing what the question meant. Eight respondents consistently answered the

    questions instead of describing them (BrE 3/41; AmE 2/40; RMNet 3/34). RMNet members in

    particular demonstrated another form of instructional miscomprehension. Nine of 34 wrote responses

    such as "job satisfaction/facet is pay," "procedural justice," or "Need a Likert scale response,"

    which are clearly incompatible with the instructions. Overall, 17 of 115 respondents (15%) pro-

    vided answers suggesting that they had not read the instructions properly.

    Sentential Miscomprehension. Examination of the sample revealed evidence of both enrichment and

    depletion, with little difference in degree of miscomprehension between the three groups of

    respondents. Accordingly they were combined. The coding was completed by two raters, with excellent

    interrater reliability (κ = 0.85-0.90). Depletion was particularly evident with the items from the

    procedural justice scale (Sweeney & McFarlin, 1993), where respondents ignored the procedural

    element of the question, such that "How fair or unfair are the procedures used to determine pay

    rates?" was interpreted as "How fair is your pay?" Effectively this turned a procedural justice item

    into a distributive justice one. Overall 27% of respondents ignored the procedural element of the

    question (BrE 30%, AmE 32%, RMNet 16%).

    Five respondents said that they did not understand the item "How fair or unfair are the procedures

    used to communicate performance feedback." In spite of this they were still able to provide a

    numerical response in the multiple-choice section, despite the option not to do so. Therefore their lack of

    understanding is undetectable in the statistical data. Up to 44% of respondents depleted any given

    item, as can be seen in Table 1.

    Table 1. Number of Respondents Coded as Depleting or Enriching Items.

    Scale and item                                                        Depletion     Enrichment
                                                                          (κ = 0.87)    (κ = 0.85)
                                                                          N (%)         N (%)

    Tsui, Egan, and O'Reilly (1992)
      1. How satisfied are you with the opportunities which exist in
         this organization for advancement or promotion                   51 (44)       46 (40)
      2. How satisfied are you with the nature of the work you perform    20 (17)       40 (35)
      3. How satisfied are you with the person who supervises you,
         your organizational superior                                     5 (4)         39 (34)
      4. How satisfied are you with your relations with others in the
         organization with whom you work, your coworkers and peers        11 (10)       34 (30)
      5. How satisfied are you with the pay you receive for your job      35 (30)       35 (30)
      6. Considering everything, how satisfied are you with your
         current job situation                                            20 (17)       24 (21)

    Sweeney and McFarlin (1993)
      1. How fair or unfair are the procedures used to communicate
         performance feedback                                             28 (24)       38 (33)
      2. How fair or unfair are the procedures used to determine
         pay rates                                                        10 (9)        30 (26)
      3. How fair or unfair are the procedures used to evaluate
         performance                                                      19 (17)       28 (24)
      4. How fair or unfair are the procedures used to determine
         promotions                                                       26 (23)       38 (33)

    Agho, Price, and Mueller (1992)
      1. I am often bored with my job                                     26 (23)       63 (55)
      2. I feel fairly well satisfied with my present job                 5 (4)         26 (23)
      3. I am satisfied with my job for the time being                    7 (6)         60 (52)
      4. Most days I am enthusiastic about my work                        12 (10)       38 (33)
      5. I like my job better than the average worker does                14 (12)       24 (21)
      6. I find real enjoyment in my work                                 0 (0)         0 (0)

    Mean                                                                  19.3 (17)     37.5 (33)

    Enrichment was far more common, which is understandable as linguistic theory suggests that

    individuals are more likely to augment basic sentence syntax to recover meaning. Table 1 demonstrates

    that items were enriched by 21% to 55% of the respondents. There was no clear pattern to

    the enrichment, with the exception of the statement "I am satisfied with my job for the time

    being" (Agho et al., 1992), which seemed to trigger an association with turnover intention. This

    enrichment was not uniform, however, as one respondent interpreted it as "I am currently happy

    with my job, but I may look for a new job in the future" and another as "I'll be out of here at the

    first opportunity." Despite these very different interpretations, both these respondents agreed

    with this item in their Likert response. It is clear then that the numerical output from multiple item

    scales can mask considerable linguistic variance, as two opposing interpretations had the same

    score (4/5).

    No statistical difference was detectable between the depleted or enriched responses and the rest of

    the sample on the basis of a Mann-Whitney U test (a nonparametric comparison test appropriate for

    non-normal small samples) applied to the Likert responses for each category. So individuals are

    answering different questions, sometimes radically so, and yet this is undetectable statistically.
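
As an illustration of the comparison described above, the following sketch implements a Mann-Whitney U test in plain Python. It is an assumption-laden example rather than the authors' procedure: it uses the normal approximation with a tie correction and no continuity correction, and the two groups of Likert scores are hypothetical. The article does not state which implementation or options were used.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) from the tie-corrected normal approximation."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must contain at least one score")
    n = n_a + n_b
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    # Ties shrink the variance of U; Likert data are heavily tied.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u, 1.0  # every score identical, so no evidence of a difference
    z = (u - n_a * n_b / 2) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical Likert scores: respondents coded as enriching an item vs. the rest.
enriched_scores = [4, 4, 5, 3, 4]
other_scores = [4, 3, 4, 5, 4, 2]
u_stat, p_value = mann_whitney_u(enriched_scores, other_scores)
print(u_stat, round(p_value, 3))
```

With samples this small an exact test would normally be preferable to the normal approximation. The sketch is intended only to show what the test compares: the rank ordering of the numerical responses, which carries no information about how each respondent interpreted the item.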

    Lexical Miscomprehension. Despite the intractability of detecting lexical miscomprehension, as

    previously noted, there was clear evidence of differing interpretations of the word "satisfaction," as can be

    seen in Table 2.

    Other examples of lexical miscomprehension can be seen with the use of vague terms (Tourangeau

    et al., 2000) such as the word "often" in the item "I am often bored with my job" (Agho et al., 1992). For

    example, "often" was interpreted as varying from "More than 33% of the time" to "99% of my work."

    The word "most" in the statement "Most days I am enthusiastic about my work" was interpreted with

    responses as varied as "4/5 days," "at least 3 days/week," and "more often than not."

    The final element of lexical miscomprehension examined was the choice of comparator for the item

    "I like my job better than the average worker does." Thirty-four percent compared themselves to

    the general population, 18% to their peers, and 4% to people in similar jobs. Social comparison

    theory suggests that the choice of referent is critical to attitude formation (Riordan & Shore,

    1997). As with sentential miscomprehension, there was no statistical difference between categories

    on the basis of the Mann-Whitney test applied to the Likert responses for each category.

    Table 2. Classification of "Satisfaction" by Item.

                                                            Pleasurable   Transactional   Neutral
                                                            (k = 0.88)    (k = 0.79)
    Scale / Item                                            N (%)         N (%)           N (%)

    Tsui, Egan, and O'Reilly (1992)
      1. How satisfied are you with the opportunities
         which exist in this organization for
         advancement or promotion                           22 (19)       24 (21)         69 (60)
      2. How satisfied are you with the nature of the
         work you perform                                   63 (55)       5 (4)           47 (41)
      3. How satisfied are you with the person who
         supervises you - your organizational superior      39 (34)       30 (26)         46 (40)
      4. How satisfied are you with your relations with
         others in the organization with whom you work
         - your co-workers and peers                        47 (41)       21 (19)         47 (41)
      5. How satisfied are you with the pay you receive
         for your job                                       28 (24)       41 (36)         46 (40)
      6. Considering everything, how satisfied are you
         with your current job situation                    53 (46)       5 (4)           57 (50)

    Agho, Price, and Mueller (1992)
      2. I feel fairly well satisfied with my present job   43 (43)       10 (9)          62 (54)
      3. I am satisfied with my job for the time being      44 (38)       6 (5)           65 (48)

    Mean                                                    42.4 (38)     37.5 (18)       17.8 (16)

    Study 1 Results Summary. Analysis of the qualitative results provides evidence for all three types of

    miscomprehension and suggests that this miscomprehension may not be readily detectable statisti-

    cally. The linguistic ambiguity within these scales is therefore a potentially significant but typically

    undetectable source of error.

    Study 2, Phase 1: Respondent Self-Classification Into Types of

    Miscomprehension

    In order to ensure that the miscomprehension observed in the first study was not an artifact of the

    coding and classification process, we ran a second study to verify our results.

    Method

    Study 2 was survey based and had three parts. The first was the normal presentation of the scales

    (i.e., with a modified Likert response scale), and the last section asked for demographic information.

    The second section, however, was very different.

    Participants were presented with the items and then invited to choose the interpretation that most

    closely matched their understanding of the original item. We derived these alternative

    interpretations from the open-ended responses gathered in Study 1. In order to assess sentential

    miscomprehension, each stimulus item was presented with several response options, including a neutral

    paraphrase of the original item (N), an enriched response (E), a depleted response (D), a response

    that was both depleted and enriched (D&E), and an option for respondents who did not agree that any

    of the choices were consistent with their understanding (I). A sample item is presented in Table 3.

    Table 3. Sample Items for Study 2, Phase 1.

    Sentential Miscomprehension
      Actual item  How satisfied are you with the nature of the work you perform?
      N            How satisfied are you by the type of work you do?
      E            How satisfied are you with your job and its responsibilities?
      D            How satisfied are you with your job?
      D&E          How satisfied are you that you are doing a good job?
      I            None of the interpretations offered match my interpretation of the question/statement

    Lexical Miscomprehension
      Actual item  I feel fairly well satisfied with my present job
      N            My present job leaves me fairly well satisfied
      P            My present job makes me fairly happy
      T            My present job meets my expectations
      I            None of the interpretations offered match my interpretation of the question/statement

    Note: Sentential miscomprehension: N = neutral paraphrase of the original item; E = an enriched
    response; D = a depleted response; D&E = a response that was both depleted and enriched; I = an
    option for respondents who did not agree any of the choices were consistent with their
    understanding. Lexical miscomprehension: N = neutral interpretation; P = pleasurable
    interpretation; T = transactional interpretation; I = a response for not agreeing with any of the
    choices.


    Each stimulus was presented twice with differing sets of possible interpretations (i.e., neutral,

    enriched, etc.) on each occasion, in order to verify that the respondents' selections were not simply

    an artifact of the specific choices presented. If a respondent is interpreting the item accurately, they

    should select the neutral option both times.
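
    The logic of the repeated presentation can be sketched as follows. This is an illustration only:
    the function names and the sample selections are hypothetical, and the category codes (N, E, D,
    D&E, I) are those defined in Table 3.

```python
from collections import Counter


def tally(selections):
    """Return {category: (count, rounded percentage)} for one presentation of an item."""
    counts = Counter(selections)
    total = len(selections)
    return {cat: (n, round(100 * n / total)) for cat, n in counts.items()}


def neutral_both_times(first, second):
    """Share of respondents choosing the neutral paraphrase on both presentations."""
    both = sum(a == "N" and b == "N" for a, b in zip(first, second))
    return both / len(first)


# Hypothetical selections by four respondents for one item, presented twice.
first_presentation = ["N", "E", "N", "D"]
second_presentation = ["N", "N", "E", "D"]

print(tally(first_presentation))
print(neutral_both_times(first_presentation, second_presentation))
```

    A respondent who reads the item strictly should contribute to the second figure; anyone who
    switches categories between presentations, or never chooses N, does not.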

    Similarly, when assessing lexical miscomprehension we focused on the different meanings of the

    word satisfaction. Respondents were offered neutral (N), pleasurable (P), and transactional (T)

    interpretations and a response for not agreeing with any of the choices (I). A sample item is presented in

    Table 3. As with sentential miscomprehension each stimulus was presented twice with differing

    response options. If linguistic ambiguity has no effect on survey research, then respondents should

    either pick the neutral item, which was a paraphrasing of the initial question or, at least, all choose

    the same non-neutral option.

    The second test for lexical miscomprehension included the use of three items with ambiguous

    modifying terms: "I am often bored with my job," "Most days I am enthusiastic about my work,"

    and "I like my job better than the average worker does" (Agho et al., 1992). Respondents were

    presented with different options for each term indicating different frequencies, time periods, or

    comparative groups, respectively. For example, for the item "I am often bored with my job,"

    responses ranged from "More than 33% of the time I am bored with my job" to "More than 75%

    of the time I am bored with my job."

    The validity of the response choices was checked by sending them to a university linguistics

    professor. She classified the responses into the sentential and lexical miscomprehension cate-

    gories. She accurately classified 98% of the response choices into the same category as the

    authors, suggesting that the choices accurately mirrored sentential and lexical miscomprehension

    as outlined previously.
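
    Agreement between two coders can be summarized either as raw percent agreement (the 98% figure
    above) or as a chance-corrected statistic such as Cohen's kappa, which is presumably what the k
    values in Table 2 report. The sketch below shows both. The function names and the example codings
    are hypothetical, and the source does not state whether kappa was computed for this particular
    check.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items that both raters placed in the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_expected == 1:  # both raters used a single identical category throughout
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codings of four response options (N, E, D as in Table 3).
authors = ["N", "E", "D", "N"]
linguist = ["N", "E", "D", "E"]
print(percent_agreement(authors, linguist))
print(round(cohens_kappa(authors, linguist), 3))
```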

    The sample for Study 2 came from two sources. First, we again sent the survey to some of our

    contacts and asked them to forward it to other working adults and collected 165 valid responses from

    this group. We then collected an additional 100 responses using a paid Qualtrics panel of working

    adults. This allowed us to check whether the phenomena observed were an artifact of our sampling

    method or whether they also occurred in a broader sample that is likely to be more representative of

    the population of working adults.

    Results

    Dimensions of the Sample. Two hundred sixty-five valid responses were received. Most of the

    respondents were natives of either the UK (37%) or US (52%), with the remaining 11% from a mix of other

    countries. The average age was 42 (SD = 11.5), 57.7% were female, and 87.5% had a college degree,

    with 52% having a master's degree or higher.

    Statistical Tests for Scale Validity. As in Study 1, numerical responses to the first portion of the

    survey, where the items were administered in their conventional format, were subjected to statistical

    analysis. Coefficient alpha values were somewhat higher than in Study 1, varying between .87

    and .91.
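
    For readers unfamiliar with the statistic, coefficient alpha can be computed directly from the
    item variances and the variance of the summed scale score. The sketch below is illustrative only:
    the function name and the response matrix are hypothetical, not data from the study.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Coefficient (Cronbach's) alpha.

    responses: one list of item scores per respondent, all of equal length.
    """
    k = len(responses[0])  # number of items in the scale
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to a three-item Likert scale.
data = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 4, 3],
    [1, 2, 2],
]
print(round(coefficient_alpha(data), 2))
```

    Alpha rises when items covary strongly. It says nothing about whether respondents who gave the
    same numbers understood the items in the same way, which is the point at issue here.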

    We established scale dimensionality in this sample by conducting confirmatory factor analysis

    using LISREL (Jöreskog & Sörbom, 2006) to examine factor structure. We examined each of the

    job satisfaction scales (Agho et al., 1992; Tsui et al., 1992) separately, in combination with the

    measure of procedural justice (Sweeney & McFarlin, 1993), as we did not expect that a

    three-factor model including both measures of job satisfaction would provide a satisfactory fit

    to the data. In both cases, all items loaded significantly on the latent variable, and acceptable fit

    statistics were obtained (Comparative Fit Index [CFI] of .95 and .98; standardized root mean

    square residual [SRMR] of .043 and .059), although in both cases the chi-square test was


    significant. Chi-square can be problematic (Jöreskog, 1969) as it is very sensitive to both sample

    size and violations of distributional assumptions. Garson (2009) advises that chi-square test sig-

    nificance can be overlooked if other fit measures indicate good fit. Given that other fit statistics

    were consistently within acceptable range (Hair, Black, Babin, Anderson, & Tatham, 2006), and

    the reliability coefficients were strong, we concluded that the scales demonstrated sufficiently

    good fit that if we were conducting a substantive analysis using these data, we would consider that

    we had confirmed the unidimensionality of each scale.

    Sentential Miscomprehension. Over 90% of the respondents found that one of the four interpretations

    of the stimulus item offered (i.e., neutral, enriched, depleted, or enriched and depleted) coincided

    with their interpretation of the stimulus item. If respondents had followed the strict syntax of the

    stimulus item then they should have selected the neutral response. While this was the modal category

    and was, on average, selected 49% of the time (range 24%-68%), other categories, principally

    enrichment, were also commonly chosen. Table 4 shows the number and percentage of respondents

    demonstrating sentential miscomprehension.

    These results confirm that respondents routinely go beyond the strict syntactic meaning of items.

    Moreover this interpretive process does not appear to be uniform and some items are enriched or

    depleted to differing degrees. These data accord with the results of Study 1.

    The first section of the survey contained the scales in their conventional format (i.e., with Likert

    responses). This allowed us to see whether different interpretations produced significantly different

    scores. For 9 of 30 items the different interpretations produced significantly different scores on the

    Kruskal-Wallis test, with these demonstrating mild to moderate effect sizes of between r = 0.15 and

    0.29 (Cohen, 1977). A Bonferroni correction was applied to post hoc Mann-Whitney results, such

    that differences are reported at a .01 level of significance. The categories that differed from the

    neutral interpretation and the direction in which they differed are shown in Table 4.

    We conducted character, syllable, and word counts for each item and calculated a number of

    readability measures (after Jensen, 2009) to explore the impact of item length and complexity on com-

    prehension. These were then compared to the number of neutral interpretations of each item as fewer

    neutral interpretations suggests greater miscomprehension. No significant relationship was observed,

    suggesting that item length and complexity did not correlate with miscomprehension.
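
    As an illustration of the kind of item-level metrics involved, the sketch below counts words,
    characters, and syllables and computes the Flesch Reading Ease score. Several assumptions apply:
    the source does not say which readability measures were used, Flesch Reading Ease is chosen here
    only because it is widely known, the syllable counter is a crude vowel-group heuristic rather than
    a dictionary lookup, and the function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: count runs of vowels, discounting most silent final 'e's."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def item_metrics(item):
    """Length counts and Flesch Reading Ease for a single survey item."""
    words = re.findall(r"[A-Za-z']+", item)
    sentences = max(len(re.findall(r"[.!?]+", item)), 1)
    syllables = sum(count_syllables(w) for w in words)
    characters = sum(len(w) for w in words)
    flesch = (
        206.835
        - 1.015 * (len(words) / sentences)
        - 84.6 * (syllables / len(words))
    )
    return {
        "words": len(words),
        "characters": characters,
        "syllables": syllables,
        "flesch_reading_ease": flesch,
    }


print(item_metrics("I am often bored with my job."))
```

    Each metric could then be correlated with the number of neutral interpretations an item received,
    which is the comparison reported above.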

    Lexical Miscomprehension. Over 90% of respondents found that the options that they were

    presented with matched their own interpretation of the item. Analysis of respondents'

    understanding of the word satisfied, presented in Table 5, suggests that there was significant deviation

    from the neutral response. Fifty-six percent of respondents selected the neutral option,

    which contained the word satisfied rather than any interpretation of it. Twenty-four percent of

    respondents selected a pleasurable interpretation of the word satisfied, and 14% chose the

    transactional interpretation.

    The number of respondents choosing pleasurable or transactional interpretations was lower than

    in Study 1. Statistical analysis of the responses to the different forms of lexical miscomprehension

    using the Kruskal-Wallis test demonstrated a significant difference for only 1 of 16 items, again,

    with a mild to moderate effect size (r = .18, see Table 5).

    Differing lexical interpretations were also observed. The word often in the item "I am often bored

    with my job" was interpreted as being 33% to 75% of the time. Most in "Most days I am enthusiastic

    about my work" was either 3 of 5 or 4 of 5 days, with a significant number understanding most as

    "more often than not." Finally, for the item "I like my job better than the average worker does," a

    number of different options were given to define the average worker. Most respondents selected

    either "the typical worker in this country" or "the average person," but 14% selected either

    "coworkers" or "peers" as their comparator.


    Although the difference across categories was significant when analyzed using the Kruskal-Wallis

    test as an omnibus test, none of the categories reached significance in post hoc analysis using

    Mann-Whitney with a Bonferroni correction, indicating that there is no significant difference in

    scale score based on respondent interpretation of the item.
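
    The two-stage procedure (an omnibus Kruskal-Wallis test followed by Bonferroni-corrected pairwise
    comparisons) can be sketched as follows. This is a minimal illustration under stated assumptions:
    the function names and scores are hypothetical, the p value uses the usual chi-square
    approximation, and the number of post hoc comparisons shown is an example rather than the figure
    used in the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, (df - 1) // 2 + 1)
    )
    return tail


def kruskal_wallis(groups):
    """Tie-corrected Kruskal-Wallis H and its chi-square p value."""
    combined = [v for g in groups for v in g]
    n = len(combined)
    ranks = average_ranks(combined)
    total, start = 0.0, 0
    for g in groups:
        total += sum(ranks[start:start + len(g)]) ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    if correction == 0:  # every score identical
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


def bonferroni_alpha(alpha, comparisons):
    """Per-comparison threshold that keeps the familywise error rate at alpha."""
    return alpha / comparisons


# Hypothetical Likert scores grouped by the interpretation each respondent chose.
neutral, pleasurable, transactional = [4, 4, 5, 3, 4], [5, 4, 5, 4], [3, 2, 4, 3]
h, p = kruskal_wallis([neutral, pleasurable, transactional])
print(f"H = {h:.2f}, p = {p:.3f}, post hoc alpha = {bonferroni_alpha(0.05, 3):.4f}")
```

    A significant omnibus result with no significant pairwise comparison, as reported above, can
    occur because the corrected per-comparison threshold is stricter than the omnibus one.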

    These data present strong evidence that individuals interpret words differently. Some regard

    being satisfied as a pleasurable experience whereas others regard it as a transactional one, even

    when the original intent of the author was one or the other. Modifying words such as often are

    similarly open to interpretation as individuals draw on different mental schemata.

    Table 4. Number and Percentage (in Parentheses) of Respondents Demonstrating Various Forms of
    Sentential Miscomprehension.

    Item  Pr.  N             E             D             D&E           I

    Tsui, Egan, and O'Reilly (1992)
    1.    1    164 (61)      58 (21)       22 (8)        15 (5)        6 (2)
          2    145 (55)      45 (17)       46 (17)       19 (7)        8 (3)
    2.    1    180 (68)      47 (17)       13 (4)        20 (7)        4 (1)
          2    168 (63)      28 (10)       37 (13)       23 (8)        9 (3)
    3.    1    163 (61)      36 (13)*a     30 (11)       23 (8)        12 (4)*a
          2    130 (49)      66 (25)*a     16 (6)        37 (14)*a     14 (5)
    4.    1    175 (66)      31 (11)       35 (13)       16 (6)        6 (2)
          2    102 (38)      49 (18)       46 (17)       58 (22)       8 (3)
    5.    1    131 (49)      39 (14)       75 (28)       15 (5)        5 (1)
          2    89 (33)       94 (35)*a     40 (15)       20 (7)        20 (7)
    6.    1    152 (57)      8 (3)         79 (29)       24 (9)        2 (0)
          2    119 (45)      22 (8)        69 (26)       43 (16)       9 (3)

    Sweeney and McFarlin (1993)
    1.    1    96 (36)       60 (22)       72 (27)       20 (7)        16 (6)
          2    97 (37)       89 (33)       30 (11)       17 (6)        29 (11)
    2.    1    123 (46)      63 (23)       29 (11)       41 (15)       7 (2)
          2    155 (58)      67 (25)       19 (7)        12 (4)        10 (3)
    3.    1    88 (33)       56 (21)       38 (14)       75 (28)       5 (1)
          2    152 (58)      41 (15)       41 (15)       14 (5)        14 (5)
    4.    1    64 (24)       78 (29)       41 (15)       52 (19)       27 (10)
          2    74 (28)       87 (33)       54 (20)       27 (10)       18 (6)

    Agho, Price, and Mueller (1992)
    1.    1    74 (28)       81 (30)*a     71 (26)       18 (6)*a      20 (7)
          2    155 (58)      34 (12)*a     38 (14)       7 (2)         30 (11)
    2.    1    147 (56)      58 (22)       31 (11)*b     16 (6)        10 (3)
          2    150 (56)      27 (10)       66 (25)*a     12 (4)        9 (3)
    3.    1    158 (60)      69 (26)       19 (7)        5 (1)         12 (4)
          2    147 (55)      57 (21)       23 (8)        22 (8)        14 (5)
    4.    1    127 (48)      74 (28)       23 (8)        30 (11)       8 (3)
          2    100 (38)      86 (32)       26 (9)        28 (10)*b     23 (8)*b
    5.    1    147 (55)      42 (15)       17 (6)        43 (16)*b     14 (5)
          2    91 (34)       82 (31)       15 (5)        40 (15)       35 (13)

    Grand mean 128.7 (48.9)  55.8 (21.2)   38.7 (14.7)   26.4 (10)     13.4 (5)

    Note: Pr. = presentation (each item was presented twice with a set of interpretations for each
    presentation); N = neutral; E = enriched; D = depleted; D&E = depleted and enriched;
    I = inappropriate interpretation.
    * Denotes significant differences in score compared to neutral category.
    a Denotes mean below the neutral response mean.
    b Denotes mean above the neutral response mean.


    Study 2, Phase 2

    Phase 1 of the study demonstrated primary effects, where differences in interpretation resulted in

    significant differences in score. We have already raised the possibility of secondary effects, where

    differences in interpretation affect other constructs. So to explore this possibility we repeated Phase

    1 and incorporated a measure of turnover intention.

    Method

    The survey used was exactly the same as that used in Phase 1, but with the addition of two items from

    the Camman, Fischman, Jenkins, and Klesh (1983) scale measuring turnover intention. The survey

    was administered using Amazon's Mechanical Turk (mTurk) to obtain a sample of respondents who

    were currently employed. mTurk has been used effectively in various fields, including both linguis-

    tic and psychology studies (see Mason & Suri, 2012; Sprouse, 2011).

    Results

    Of the 250 respondents who completed the survey, 39 failed checks built in to test for careless

    responding, leaving a final sample size of 211 valid responses. The checks included 3 items scattered

    through the survey such as "If you are reading this, please select disagree." Participants were

    discarded from the sample if they failed two or more of these checks and they completed the survey

    very quickly. This is consistent with the methods described by Meade and Craig (2012) to detect

    and eliminate cases in which the respondent is not attending to the content at all.
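
    The exclusion rule can be expressed as a simple filter. The sketch below is illustrative: the
    record layout, the function names, and in particular the 120-second cutoff are hypothetical, as
    the text does not state what counted as completing the survey "very quickly."

```python
from dataclasses import dataclass


@dataclass
class Response:
    respondent_id: str
    failed_checks: int  # how many of the 3 attention-check items were answered wrongly
    duration_seconds: float  # time taken to complete the survey


def is_careless(response, min_failed=2, min_seconds=120.0):
    """Flag a respondent who failed two or more checks AND finished very quickly.

    min_seconds is an assumed cutoff for illustration only.
    """
    return (
        response.failed_checks >= min_failed
        and response.duration_seconds < min_seconds
    )


def screen(responses, min_failed=2, min_seconds=120.0):
    """Return only the responses that pass the careless-responding screen."""
    return [r for r in responses if not is_careless(r, min_failed, min_seconds)]


# Hypothetical records.
sample = [
    Response("r1", failed_checks=2, duration_seconds=60.0),   # excluded
    Response("r2", failed_checks=2, duration_seconds=600.0),  # kept: not fast
    Response("r3", failed_checks=0, duration_seconds=45.0),   # kept: passed checks
]
print([r.respondent_id for r in screen(sample)])
```

    Because both conditions must hold, a fast but attentive respondent and a slow respondent who
    slipped on the checks are both retained.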

    Of respondents, 51.2% were female, the average age was 35.2 (SD = 10.9), 95.3% of respondents

    were American, and 78.7% had a college degree or higher, with 15.4% having a master's degree or higher.

    Table 5. Number and Percentage of Respondents Demonstrating Lexical Miscomprehension of the Word
    Satisfied.

    Item  Pr.  N             P             T             I

    Tsui, Egan, and O'Reilly (1992)
    1.    1    145 (55)      63 (23)       48 (18)       7 (2)
          2    139 (53)      78 (29)       36 (13)       9 (3)
    2.    1    141 (53)      87 (33)       26 (9)        9 (3)
          2    176 (67)      41 (15)       35 (13)       10 (3)
    3.    1    151 (57)      38 (14)       59 (22)       14 (5)
          2    156 (59)      54 (20)       45 (17)*a     8 (3)
    4.    1    118 (44)      103 (39)      24 (9)        18 (6)
          2    169 (65)      53 (20)       32 (12)       6 (2)
    5.    1    127 (48)      55 (20)       76 (28)       6 (2)
          2    79 (30)       95 (36)       78 (29)       10 (3)
    6.    1    151 (57)      89 (33)       18 (6)        5 (1)
          2    139 (52)      78 (29)       34 (12)       12 (4)

    Agho, Price, and Mueller (1992)
    2.    1    163 (63)      44 (17)       37 (14)       14 (5)
          2    176 (67)      54 (20)       23 (8)        8 (3)
    3.    1    168 (63)      59 (22)       14 (5)        23 (8)
          2    183 (69)      40 (15)       23 (8)        16 (6)

    Grand mean 148.8 (56.7)  64.4 (24.5)   38 (14.4)     10.9 (4.1)

    Note: Pr. = presentation (each item was presented twice with a set of interpretations for each
    presentation); N = neutral; P = pleasurable; T = transactional; I = inappropriate interpretation.
    * Denotes significant differences in score compared to neutral interpretation.
    a Denotes mean below the neutral response mean.


    Sentential Effects on Turnover Intention. We noted in Study 1 that the item "I am satisfied with my

    job for the time being" (Agho et al., 1992) seemed to trigger an association with turnover

    intention in some respondents. In Study 2, one of the enriched interpretations reflects this.

    Respondents therefore could choose between a neutral interpretation of the item and an

    enriched version, "At the moment I am satisfied with my job and I am not looking for a new

    one." There was no significant difference detectable in item score across the interpretations,

    suggesting that the impact of the miscomprehension is not directly detectable. Accordingly

    we selected this item for analysis.

    We compared the turnover intention scores of those who selected the enriched interpretation of

    the item with those who selected the neutral interpretation. Using a Mann-Whitney test to compare

    the scores for these two groups we found a difference that approached significance, U = 1,867,

    z = 1.88, p = .060, r = .13, whereby those enriching the item scored higher on turnover intention.

    This suggests that sentential miscomprehension can have indirect effects, as such a result is unlikely

    to have happened by chance (Nickerson, 2000).

    Lexical Effects on Turnover Intention. We explored the possibility that lexical miscomprehension

    might also have indirect effects by examining the relationship between differing interpretations

    of the word satisfaction and turnover intention. We chose the satisfaction item that best reflected

    the global concept of job satisfaction: "Considering everything, how satisfied are you with your

    current job situation?" In Phase 1 there was no significant difference in item score among the

    types of interpretation offered.

    We found a significant difference across groups when comparing the transactional interpretation

    to the pleasurable interpretation, U = 569, z = 2.95, p = .013, r = 0.20, and also when comparing

    the transactional interpretation to the neutral one, U = 1,208, z = 2.38, p = .017, r = .16.

    These results show that differences that do manifest themselves on item score may, as linguistic

    theory suggests (Schwarz, 1999), reflect differing cognition and have indirect effects.

    Study 2 Results Summary. The findings of Phase 1 of Study 2 confirm the findings in Study 1,

    suggesting that they are not the product of researcher confirmation bias (see Nickerson, 1998). Furthermore,

    Phase 2 of Study 2 demonstrates that miscomprehension may also have indirect effects.

    Study 3: Replication of Results With Spreitzer's Empowerment Scale

    To allay concerns that the findings demonstrated were an artifact of the scales selected we replicated

    Studies 1 and 2 with Spreitzer's (1995) 12-item empowerment scale. This scale was developed using

    best practices for scale development and the author reported strong evidence of construct validity.

    The scale includes four subdimensions of psychological empowerment: meaning, competence, self-

    determination, and impact.

    Method

    The replication was carried out in two phases. Phase 1 involved administering the 12 items of the

    scale to a pool of 29 participants, as in Study 1. The results of this first phase were then used to create

    the item interpretations for the second phase, which replicated Study 2. In this phase we constructed

    neutral, enriched, depleted, and depleted and enriched interpretations for each item in Spreitzer's

    (1995) scale and asked respondents to select the interpretation that most closely matched their own

    interpretation. We again collected demographic data and asked the respondents to answer the scale

    in its original format. Data were collected from 100 employed workers through Amazon's

    Mechanical Turk.


    The reasons why individuals did not read the instructions are unclear, but it seems probable that

    familiarity with the task and the seeming obviousness of what needs to be done played a part. The

    highly educated nature of the sample may mean that they are more regularly surveyed. They may

    also take mental shortcuts as they believe they know what is required. This would seem partic-

    ularly true for the RMNet group for whom surveys are likely to be common currency.

    The fact that a large number of RMNet members (12/34) did not read the instructions is under-

    standable but nonetheless a cause for concern. When developing scales, researchers routinely ask

    their colleagues' opinions on various matters and use them to help generate items. This result

    suggests that Schriesheim and colleagues (1993) were correct when they asserted that academic

    colleagues might not be the ideal assistants.

    Sentential Miscomprehension

    The results suggest significant sentential miscomprehension. This is most dramatically observed in

    the procedural justice scale (Sweeney & McFarlin, 1993) where (depending on the item) 21% to

    31% of respondents appeared to miss the process element and hence answer a question about

    distributive justice. The figure was rather lower for the RMNet group, which we believe may be

    because they are more likely to be sensitized to the procedural element of the question. The fact that

    the group that ignored the procedural component was indistinguishable statistically from the respon-

    dents interpreting the question correctly means that any results from this scale would contain sub-

    stantial and hidden error. However, as the literature is replete with studies in which procedural and

    distributive justice are highly correlated (Colquitt, 2001), this study perhaps sheds some new light on

    the source of some of this collinearity.

    Study 2 showed that enrichment, depletion, or a combination of the two were demonstrated by

    about half of respondents. This suggests that around half of respondents deviate from the strict

    syntax of the item and alter it according to their own understanding.

    The impact of sentential miscomprehension is difficult to ascertain. For 9 of the 30 items, differ-

    ing interpretations produced significantly different results on the traditional presentation of the item.

    This effect, however, was inconsistent. It seems likely that the process of enrichment or depletion

    meant that respondents tapped into other concepts when responding. For example, the item "I am

    satisfied with my job for the time being" was interpreted 22% of the time as "At the moment I

    am satisfied with my job and I am not looking for a new one," an interpretation that, as Phase 2

    of Study 2 shows, also taps into turnover intention. The implications of this are both direct and indi-

    rect. Directly, some respondents tap into a construct other than job satisfaction. Indirect effects

    might potentially occur when job satisfaction and turnover intention are included in the same study.

    The linguistic interaction for some (but not all) of the participants would potentially interfere with

    the validity of the results obtained.

    Sentential miscomprehension therefore has the potential to introduce substantial error. This may

    not be detectable using conventional scale appraisal techniques, as the scales in this study had high

    coefficient alphas and were unidimensional in factor analyses. Despite this, this error has consider-

    able impact on construct validity and sporadic impact on scale score.

    Lexical Miscomprehension

    Although lexical miscomprehension is difficult to ascertain, the different understandings of the word

    satisfaction and the variation in meaning of the word often in this study demonstrate that there is

    linguistic variance among respondents.

    What is the impact of this? It seems likely that respondents are answering somewhat different

    questions as a result of lexical miscomprehension of the word satisfied. Again this raises the


    possibility that some respondents may be tapping into different constructs when they interpret items.

    This will have both primary effects on the scale itself and secondary effects on any model incorpor-

    ating constructs similar to those being unconsciously tapped into by the respondent.

    Lexical miscomprehension is most clearly observed in the different interpretations of often.

    When responding to the statement "I am often bored with my job," if the threshold for often is

    one-third of the time then you are likely to respond differently to this question than would someone

    for whom often means 99% of the time.

    The results demonstrate that individuals have different interpretations of individual words, yet

    this is not appreciable using conventional statistical techniques. The impact of this is that respon-

    dents are drawing on quite different conceptual schema and referents, with attendant diminution

    of construct validity.

    Consequences and Remedies

    The findings of this study make uncomfortable reading for those of us involved in survey research.

    The degree of variance observed linguistically would be serious cause for concern if it were observed

    numerically. All four measures used in this series of studies have been published in high-quality

    journals and frequently cited. They have furthermore been subjected to significant previous con-

    struct validation analyses. And yet they all contain linguistic threats to validity.

    The reliability and validity of measurement instruments and surveys is of pivotal importance. If

    the measures do not measure what they purport to, or do not do so accurately, then recommendations

    based upon research that employs these measures are likely to be flawed. So what is to be done? We

    propose strategies to minimize each form of miscomprehension for those developing new scales and

    those using existing scales. We begin by providing principles to undergird item construction.

    Instructional Miscomprehension. Putting borders around the instructions and using bold type or capitals

    do not seem to completely eliminate instructional miscomprehension, as these were all used in Study

    1 to little effect. The literature on manipulation checks however offers a possible direction. Oppen-

    heimer, Meyvis, and Davidenko (2009) used an effective system to detect failure to follow instruc-

    tions. Participants were presented with a survey about sports participation. The instructions indicated

    that participants should ignore the first question in the survey and instead click on the page title.

    Those who had not read the instructions were able to proceed with the survey normally, allowing

    researchers to compare the responses of those who read the instructions with those who did not. This

    is similar to the captcha or reverse Turing test advocated by Mason and Suri (2012) where a

    particular response is requested by the survey item to prove that the participant is paying attention

    and motivated. Use of these approaches improves engagement, reducing the proportion of invalid

    responses from 48.6% to 2.5% (Kittur, Chi, & Suh, 2008).

    There is a danger that these seemingly counterintuitive instructions (e.g., to ignore a particular

    item) might confuse respondents. Instructing participants to pay close attention to the items and

    informing them that tests that measure their carefulness or attentiveness are being used may help.

    Care should also be taken to ensure that such tests do not interfere with respondents' capacity to

    answer other items in the survey. There is also a potential danger that selecting only those respon-

    dents who obey all the instructions excludes particular groups, for example those with a particular

    personality trait. Nonetheless, tests of this sort are a useful addition to any survey to ensure that par-

    ticipants have attended to the instructions and are sufficiently motivated.
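
    A minimal sketch of how such a check might be applied at the analysis stage is given below. It assumes a hypothetical record layout in which each response carries a flag (here called passed_check) showing whether the respondent followed the instruction. The field and function names are illustrative only and are not taken from Oppenheimer et al. (2009) or from any survey platform.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Response:
    respondent_id: str
    passed_check: bool       # hypothetical flag: True if the instruction was followed
    ratings: Dict[str, int]  # item identifier -> numerical rating


def split_by_check(responses: List[Response]) -> Tuple[List[Response], List[Response]]:
    """Separate respondents who followed the instruction from those who did not."""
    passed = [r for r in responses if r.passed_check]
    failed = [r for r in responses if not r.passed_check]
    return passed, failed


def mean_rating(responses: List[Response], item: str) -> Optional[float]:
    """Mean rating for one item, or None if nobody in the group answered it."""
    values = [r.ratings[item] for r in responses if item in r.ratings]
    return sum(values) / len(values) if values else None
```

    Retaining the group that failed the check, rather than discarding it, allows the two groups to be compared and gives some indication of whether exclusion removes a particular kind of respondent.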

    Sentential Miscomprehension. When looking at the structure of the item, it is important to eschew

    vague words such as "many," "most," "often," or "sometimes." These have no formal quantity and so represent

    an open invitation to miscomprehension. Try to use a quantity so that instead of an item that

    says "Most days I am enthusiastic about my work," use "I am enthusiastic about my work at least

    75% of the time." When asking for a comparison, ensure that the comparator is clear. The UK

    Office of National Statistics Wellbeing Survey (Office of National Statistics, 2013), for example,

    contains an item "Overall, how satisfied are you with your life nowadays?" "Nowadays" is a vague

    term. A better item would be "Overall, how happy have you been with your life over the last three

    months?"

    Comprehension may also be improved by the use of bold or italicized elements of items (see

    Christian & Dillman, 2004). As those respondents not following instructions have already been

    eliminated it seems likely that those remaining in the sample are more motivated and attentive. The

    use of bold or italics may help emphasize elements of the item, for example, "How fair or unfair are

    the procedures used to determine pay rates?" (with "procedures" emphasized), thus avoiding some of the problems witnessed with

    the Sweeney and McFarlin (1993) scale.
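
    The advice in this section on vague quantifiers and unclear time references could be supported by a simple automated screen of draft items. The sketch below is illustrative only: the term list is an assumption assembled from words discussed in this article ("many," "most," "often," "sometimes," "nowadays," "recently") and would need extending for any real scale, and the function name is hypothetical. Such a screen flags candidates for rewording and does not replace asking respondents what they think an item means.

```python
import re
from typing import List

# Illustrative list only, drawn from the examples discussed in the text.
VAGUE_TERMS = {"many", "most", "often", "sometimes", "nowadays", "recently"}


def flag_vague_terms(item: str) -> List[str]:
    """Return the vague terms that appear as whole words in a survey item."""
    words = re.findall(r"[a-z']+", item.lower())
    return sorted({w for w in words if w in VAGUE_TERMS})


if __name__ == "__main__":
    drafts = [
        "Most days I am enthusiastic about my work",
        "I am enthusiastic about my work at least 75% of the time",
    ]
    for draft in drafts:
        print(draft, "->", flag_vague_terms(draft))
```

    Matching on whole words avoids false positives such as flagging "most" inside "almost."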

    Lexical Miscomprehension. There is a great deal of extant advice on item construction (e.g., DeVellis,

    2003; Groves et al., 2004) and our recommendations are intended to supplement this, not supplant it.

    We first consider the actual words used in the item.

    The tendency for multiple interpretations of the same word (polysemy) that we have observed

    with the word "satisfaction" suggests that care should be used to avoid words with multiple meanings.

    The number of different meanings of a particular word can be examined using a dictionary. As an

    example, the word "happy" might be preferred to the word "satisfied" when asking about the relationship

    between an employee and their organization if the measure is intended as an affective one. If another

    word is not available, then linguistic theory suggests that context aids comprehension. This means

    that the scale authors should provide careful guidance, through contextual information, as to which

    meaning the respondent should select. As this contextual information is typically provided in the

    instructions, the problem of instructional miscomprehension becomes even more of a concern and

    the use of effective mechanisms to combat it even more vital.

    Ill-defined words, such as "meaningful," should also be avoided. While the researcher may have a

    clear idea of what a concept means, the respondents may not. Plain, short, commonly used words are

    most likely to be understood and reduce miscomprehension. Care should also be taken to avoid jargon

    and culturally specific terms. The word "quite," as in "quite good," for example, is a superlative in

    the US but represents borderline mediocrity in the UK. In some cases researchers may abjure

    words altogether and use pictograms, such as the smiling/frowning faces used by Kunin (1998).

    Those using existing scales should be similarly critical of words used, even in published scales, as

    the authors may not have attended to issues of miscomprehension.

    Survey Construction. Moving from the individual item to survey composition allows us to use linguistic

    theory to aid comprehension. Given that comprehension is a function of syntax and context, a

    sensible approach is to provide plenty of context to ensure a more uniform and predictable interpre-

    tation process. While there have been conflicting views of the necessity or indeed the desirability of

    intermixing items (Schriesheim & Denisi, 1980; Schriesheim, Kopelman, & Solomon, 1989; Spar-

    feldt, Schilling, Rost, & Thiel, 2006), we suggest that it might potentially have an effect on miscom-

    prehension as surrounding items may well provide contextual information that helps the respondent

    understand the item, and so grouping may help reduce miscomprehension (Tourangeau & Rasinski,

    1988). We therefore recommend grouping items together.

    Similarly, more thoughtful instructions (again with an attentiveness check) could help

    improve item comprehension. If the heading for the Sweeney and McFarlin (1993) scale contained

    the instruction "We now want you to think about the processes by which a number of different

    things that affect you at work are decided," then this may help emphasize the procedural element

    of the scale.

    While it is clear that linguistic errors are a considerable concern in survey research, few researchers

    are looking for them. When a new scale is being developed, researchers might ask expert judges if they

    understand the item, or ask whether the item appears to them to measure the construct, but they do not

    typically ask respondents what they think the item means. If an existing scale is used, then the prove-

    nance of publication may well ensure that even less attention is paid to the properties of the scale.

    Testing and Evaluation. After developing a scale that appears to minimize linguistic miscomprehen-

    sion while sampling the content domain appropriately, we suggest that researchers assess the degree

    of linguistic ambiguity that the items produce within the target population. This field testing should

    be used both during scale development, where it should be thought of as a distinct and necessary step

    in the scale development process and also when using preexisting scales. As linguistic theory sug-

    gests that comprehension is a function of both syntax and context, any change in context necessitates

    a check to ensure that the item is still uniformly understood by respondents and, crucially, in the

    same way as the researcher.

    We have found, during the course of our own research, that the approach we adopted in Study 1

    was very effective at identifying ambiguity. This simply asks respondents to describe what they

    think the question means in their own words. This technique can be used in both the development

    of new items and in appraising preexisting ones. The advantage of this approach is that it can be

    administered remotely, and it elicits useful information. For item development purposes it is not nec-

    essary to go through the coding processes we did in Study 1, as inspection of the responses is usually

    sufficient to establish whether the item is being homogeneously interpreted. This approach is probably

    only suitable, however, for about 15 to 20 items as it is time-consuming for the respondent. For

    longer surveys one might use a piecemeal approach where the whole survey is broken down into

    blocks of 15 to 20 items and administered to separate samples.
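
    The piecemeal approach could be organized along the following lines. This is a sketch under stated assumptions rather than a procedure we used: the function name and the 20-item ceiling are illustrative, and the function simply divides an item list into the smallest number of roughly equal blocks that respect that ceiling, so that each block can be sent to a separate sample. For short surveys the resulting blocks may contain fewer than 15 items.

```python
import math
from typing import List, Sequence


def split_into_blocks(items: Sequence[str], max_block_size: int = 20) -> List[List[str]]:
    """Divide items into the fewest balanced blocks of at most max_block_size items."""
    if max_block_size < 1:
        raise ValueError("max_block_size must be at least 1")
    if not items:
        return []
    n_blocks = math.ceil(len(items) / max_block_size)
    base, extra = divmod(len(items), n_blocks)
    blocks: List[List[str]] = []
    start = 0
    for i in range(n_blocks):
        # The first `extra` blocks take one additional item to balance the sizes.
        size = base + (1 if i < extra else 0)
        blocks.append(list(items[start:start + size]))
        start += size
    return blocks
```

    Item order is preserved within and across blocks, so that neighboring items, which may supply context for one another, stay together where possible.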

    The scale development literature includes a number of approaches to further ensure both content

    adequacy and homogeneity of comprehension of survey items. Schriesheim et al. (1993) suggest

    using Q-methodology to help measure the differences between individual judges. Anderson

    and Gerbing (1991) also offer methods for pretesting with small samples, although their technique

    focuses more on predicting confirmatory factor analysis (CFA) performance. Hinkin (1998) has

    suggested both these approaches to test content validity. Hinkin and Tracey (1999) built on this

    work and provided an analysis of variance technique that allows for evaluation of item distinctive-

    ness as part of the content validation process. In public opinion surveys cognitive interviewing is

    commonly used to pretest items (Beatty & Willis, 2007; Schwarz & Sudman, 1996). This may

    either take the form of asking respondents to think aloud as they answer the survey question

    (Ericsson & Simon, 1980) or of probing specific areas of understanding to help draw out elements

    of the respondents' thinking (Willis, DeMaio, & Harris-Kojetin, 1999).

    In all survey development there is a tension between specificity and applicability. Words and items

    that are too specific will not be applicable in other contexts; similarly, general words and items may not

    be sufficiently precise to reflect subtle differences. The process of thoughtful development and field

    testing should help ensure that the researcher successfully navigates between these two poles.

    A Note for Reviewers

    Those reviewing papers using survey research should also attend to issues of linguistic ambiguity, as

    quality measurement is the responsibility of both researcher and reviewer (Hinkin, 1995). The first

    step should be to require anyone who submits a paper based on survey research to provide the

    measure for examination (see Note 4). They should then, having ruled out normal threats to validity

    (e.g., double-barreled items), look at the individual words in the item to see if the item contains

    any modifiers such as "most" or "often" or words with multiple meanings, such as "satisfaction."

    The next step should be to

    inspect the syntax of the item to make sure that referents are clear, for example, "yesterday" rather than

    "recently." Reviewers should then assure themselves that the authors have ascertained whether the

    item is uniformly comprehensible to the target audience. Finally, reviewers should inquire as to the

    steps taken by researchers to ensure that their respondents have read the instructions and are moti-

    vated throughout the survey. These steps, taken together, should help reduce poorly worded items

    and weed out unmotivated participants, thus improving the quality of research based on soliciting

    opinions.

    It is unlikely, however, that linguistic ambiguity will ever be eliminated. As linguistic philoso-

    phers have pointed out, and as we discussed in the linguistic theory section, there is always an

    indeterminacy of language. Nonetheless these approaches to item development and testing will

    help identify and eliminate the more obvious forms of instructional, sentential, and lexical

    miscomprehension.

    Limitations and Future Directions

    This article used two different approaches to explore the impact of linguistic pragmatics on survey

    interpretation through the examination of four carefully chosen scales. By using a combination of

    open-ended and fixed response format questions we have aimed to use methods that have non-

    overlapping weaknesses in addition to their complementary strengths (Brewer & Hunter, 2006,

    p. 4). The four scales used here, however, can hardly be seen as representative of all available scales.

    Future studies should extend this analysis to other scales.

    Future research could explore the properties of items and words, and their context, which led to

    them being either sententially or lexically miscomprehended. This would require considerable

    effort, but given careful design and sufficient respondents, it may be that general principles to reduce

    the impact of linguistic factors on survey research could be produced, given that the extant linguistics

    literature has examined some of these issues already.

    Another interesting area for future research would involve comparing whether native English

    speakers were more likely to sententially or lexically miscomprehend items than non-native

    speakers. Theoretically native English speakers should have a more nuanced vocabulary and so

    be more likely to make linkages to other English words than non-native speakers. This may mean

    that non-native English speakers are actually better survey respondents as they are more likely to

    interpret items appropriately. Research in children, who similarly have a more restricted vocabu-

    lary, suggests that they produce a more restricted set of interpretations (Noveck, 2001). In addi-

    tion, future research might examine the possibility that there may be individual differences that

    drive linguistic miscomprehension.

    Finally, the current study has been within the paradigm of classical test theory. Item response theory

    (IRT) might offer an alternative approach to identifying linguistic ambiguity. IRT's ability to

    examine bias by comparing the performance of individual items has been used to identify characteristics

    of respondent populations, for example, those faking personality tests (Zickar, Gibby, &

    Robie, 2004). IRT has also been used to explore characteristics of surveys, for example context

    effects (Rivers, Meade, & Lou Fuller, 2009), the effects of extreme wording (Nye, Newman, &

    Joseph, 2010), and equivalence of translation (Ellis, 1989). Evaluation of item response curves may

    help identify differences in interpretation that are not readily appreciable using classical techniques.

    Summary

    Survey research is a critical weapon in the social scientist's methodological armory. It enables the

    opinions and feelings of large numbers of respondents to be rapidly ascertained and collated. Devel-

    opments in statistical techniques have enabled more sophisticated analyses to be performed in order

    to enhance our understanding of social phenomena and processes. Surveys have tended to rest on the

    assumption of an unbroken chain of comprehension between the mind of the researcher through the

    survey instrument and to the mind of the recipient, and back again. This assumption does not seem,

    on the basis of the results of this study, to be particularly robust. Respondents often either fail to

    follow instructions or miscomprehend the items presented. This is not readily detectable when the

    output from a survey is numerical, a problem that may be further exacerbated by changes in the context

    in which the items are presented.

    This article is not intended to denigrate surveys as an information source or research tool, but

    rather it seeks to draw the reader's attention to some of the linguistic problems that underlie surveys

    and to demonstrate the magnitude of effect of these problems. The problem is, perhaps, most neatly

    summarized by the sociologist R. H. Tawney's (1971) comment that "Sociology . . . is a department

    of knowledge which requires that facts should be counted and weighed, but which, if it omits to

    make allowance for the imponderables, is unlikely to weigh or even count them right" (p. 147), a

    comment that seems as relevant and applicable to organizational survey research as it does to

    sociology.

    Overall, research into the linguistics of survey items is a rich soil for future research. Given the

    misinterpretation described in this article, there is clearly much work to be done. Attention to the

    potential methodological issues outlined in this article should help produce better, more valid results

    that will in turn provide the basis for an improved understanding of social and organizational

    phenomena.

    Authors' Note

    All data are available from either author (Ben Hardy, [email protected], or Lucy Ford, [email protected]). We

    would like to thank Dr. Alyson Pitts, University of Cambridge, for her expert linguistics assistance, and Dr.

    Raina Brands, University of Cambridge, for her tireless assistance with data coding. We would also like to

    thank the three anonymous reviewers who provided invaluable feedback that greatly improved this manuscript.

    A previous version of this manuscript appears in the Academy of Management (2012) proceedings and was

    recipient of the Sage Publications/Research Methods Division Best Paper Award.

    Declaration of Conflicting Interests

    The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publi-

    cation of this article.

    Funding

    The author(s) received no financial support for the research, authorship, and/or publication of this article.

    Notes

    1. In this article, any distinction between statements (declaratives) and questions (interrogatives) is disregarded

    for current purposes as both statements and questions serve as the stimulus to which the survey recipient is

    asked to respond. Accordingly, the terms will be used interchangeably.

    2. The examples quoted in this section are actual responses from Study 1 in the empirical portion of this article.

    3. As the Agho, Price, and Mueller (1992) satisfaction scale used declaratives it was impossible to tell whether

    respondents were paraphrasing the question or answering. Accordingly this scale was not analyzed.

    4. We understand that some authors may not wish their scales to be placed in the public domain. This is

    reasonable. It is not reasonable, however, given the problems of construct validity that numerous authors have

    identified, for the scales not to be shared with reviewers.

    References

    Agho, A. O., Price, J. L., & Mueller, C. W. (1992). Discriminant validity of measures of job satisfaction, pos-

    itive affectivity and negative affectivity. Journal of Occupational and Organizational Psychology, 65(3),

    185-196.

    Anderson, J. C., & Gerbing, D. W. (1991). Predicting the performance of measures in a confirmatory factor

    analysis with a pretest assessment of their substantive validities. Journal of Applied Psychology, 76(5),

    732-740.

    Beatty, P. C., & Willis, G. B. (2007). Research synthesis: The practice of cognitive interviewing. Public

    Opinion Quarterly, 71(2), 287-311.

    Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review,

    111(4), 1061-1071. doi:10.1037/0033-295x.111.4.1061

    Brayfield, A. H., & Rothe, H. F. (1951). An index of job satisfaction. Journal of Applied Psychology, 35(5),

    307-311.

    Brewer, J., & Hunter, A. (2006). Foundations of multimethod research: Synthesizing styles. Thousand Oaks,

    CA: Sage.

    Camman, C., Fischman, M., Jenkins, G. D., & Klesh, J. (1983). The Michigan organizational assessment sur-

    vey: Conceptualization and instrumentation. In S. E. Seashore, E. E. Lawler III, P. H. Mirvis, & C. Camman

    (Eds.), Assessing organizational change: A guide to methods, measures and practices. New York, NY:

    Wiley-Interscience.

    Christian, L. M., & Dillman, D. A. (2004). The influence of graphical and symbolic language manipulations on

    responses to self-administered questions. Public Opinion Quarterly, 68(1), 57-80. doi:10.1093/poq/nfh004

    Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement,

    20(1), 37-46. doi:10.1177/001316446002000104

    Cohen, J. (1977). Statistical power analysis for the behavioral sciences (Rev. ed.). New York, NY: Academic

    Press.

    Colquitt, J. A. (2001). On the dimensionality of organizational justice: A construct validation of a measure.

    Journal of Applied Psychology, 86(3), 386-400. doi:10.1037/0021-9010.86.3.386

    Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in

    organizational research. Organizational Research Methods, 6(2), 147-168. doi:10.1177/1094428103251541

    Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334.

    DeVellis, R. F. (2003). Scale development: Theory and applications (2nd ed.). Thousand Oaks, CA: Sage.

    Ellis, B. B. (1989). Differential item functioning: Implications for test translations. Journal of Applied

    Psychology, 74(6), 912.

    Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87(3), 215-251.

    doi:10.1037/0033-295x.87.3.215

    Fellbaum, C. (1990). English verbs as a semantic net. International Journal of Lexicography, 3(4), 278-301.

    doi:10.1093/ijl/3.4.278

    Fields, D. L. (2002). Taking the measure of work: A guide to validated scales for organizational research and

    diagnosis. Thousand Oaks, CA: Sage.

    Ford, L. R., & Scandura, T. A. (2007, November). Item generation: A review of commonly used measures and

    recommendations for future practice. Paper presented at the Southern Management Association, Nashville, TN.

    Garson, G. D. (2009). Statnotes: Topics in multivariate analysis. Retrieved from

    http://faculty.chass.ncsu.edu/garson/pa765/statnote.htm

    Grice, H. P. (1975). Logic and conversation. In J. L. Morgan & P. Cole (Eds.), Syntax and semantics (Vol. 3).

    New York, NY: Academic Press.

    Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2004). Survey

    methodology. Hoboken, NJ: John Wiley.
