Mathieu, Taylor, 2006


    Clarifying conditions and decision points for mediational type inferences in Organizational Behavior†

    JOHN E. MATHIEU* AND SCOTT R. TAYLOR

    University of Connecticut, Storrs, Connecticut, U.S.A.

    Summary

    Although mediational designs and analyses are quite popular in Organizational Behavior research, there is much confusion surrounding the basis of causal inferences. We review theoretical, research design, and construct validity issues that are important for drawing inferences from mediational analyses. We then distinguish between indirect effects, and partial and full mediational hypotheses, and outline decision points for drawing inferences of each type. An empirical illustration is provided using structural equation modeling (SEM) techniques, and we discuss extensions and directions for future research. Copyright © 2006 John Wiley & Sons, Ltd.

    Introduction

    Over 20 years ago, Baron and Kenny (1986) and James and Brett (1984) published papers that have had

    a profound influence on Organizational Behavior research and theory. Those authors advanced a theoretical foundation and analytic guidelines for drawing mediational inferences: theories, methods,

    and analyses that elucidate the underlying mechanisms linking antecedents and their consequences. At

    issue in this approach are research questions that seek to better understand how some antecedent (X)

    variable influences some criterion (Y) variable, as transmitted through some mediating (M) variable. In

    this sense, mediators are explanatory variables that provide substantive interpretations of the

    underlying nature of an X → Y relationship.

    Mediational designs have become ubiquitous in the organizational literature. Wood, Goodman,

    Cook, and Beckman (in press) reviewed five leading management journals over the years 1981–2005

    and identified 381 studies that tested mediational relationships. Of these, over 60 per cent of the studies

    followed prescriptions offered by Baron and Kenny (1986) or James and Brett (1984). However, the

    state of the art in mediational analysis is far from consistent. The fact that mediational designs have

    developed in different disciplines has only exacerbated the situation (Alwin & Hauser, 1975; Baron &

    Kenny, 1986; Frazier, Tix, & Barron, 2004; Holmbeck, 1997; James & Brett, 1984). Indeed,

    Journal of Organizational Behavior

    J. Organiz. Behav. 27, 1031–1056 (2006)

    Published online 7 September 2006 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/job.406

    * Correspondence to: John E. Mathieu, University of Connecticut, 2100 Hillside road, Unit 1041, Storrs, CT 06269-1041, U.S.A. E-mail: [email protected]

    † This article was published online on 7 September 2006. An error was subsequently identified and corrected by an Erratum notice that was published online only on 13 October 2006; DOI: 10.1002/job.426. This printed version incorporates the amendments identified by the Erratum notice.

    Copyright © 2006 John Wiley & Sons, Ltd. Received 29 April 2006

    Accepted 5 May 2006


    MacKinnon, Lockwood, Hoffman, West, and Sheets (2002) noted that "Reflecting their diverse

    disciplinary origins, the procedures [for testing mediating variables] vary in their conceptual basis, the

    null hypothesis being tested, their assumptions, and statistical methods of estimation" (p. 84).

    Mediational designs come in a variety of forms that differ in terms of the nature of the variables

    involved in this X → M → Y chain. For example, Judd and Kenny (1981) discussed the influence of

    psychological treatments on individuals' behavior as mediated by their knowledge. Saks (1995) studied the influence of training on newcomer adjustments, as mediated by their post-training self-efficacy.

    Mayer and Gavin (2005) considered how employees' trust in their management influenced their in-role

    performance and organizational citizenship behaviors, as mediated by their ability to focus their

    attention on work activities. Chen, Gully, Whiteman, and Kilcullen (2000) tested the influence of

    individuals' psychological traits on their behavior, as mediated by their psychological states.

    Claessens, Eerde, Rutte, and Roe (2004) considered the influence of individuals' planning behavior and

    work characteristics on their work outcomes such as strain, job satisfaction, and performance, as

    mediated by their perceived control of time. In all instances, mediational models advance an

    X → M → Y causal sequence, and seek to illustrate the mechanisms through which X and Y are related.

    However, there are important nuances in such designs that are often not appreciated, such as whether a

    mediator variable partially or fully accounts for an X → Y relationship, or whether it merely serves as a

    linking mechanism between variables. This important distinction, along with other aspects of mediational designs, constitutes our focus.

    Given the diversity of approaches and statistical techniques that currently exist for testing mediation,

    our aims for this paper are to: (1) revisit the research design and measurement preconditions that must

    be met in order for tests of mediational relations to be meaningful; (2) review definitions of mediators

    and related concepts, and in so doing distinguish between indirect effects, and partial and full

    mediational models; (3) distinguish the different statistical tests and decision points that apply,

    depending on what type of relationship is hypothesized; and (4) provide an empirical example that

    illustrates such differences. We conclude with a discussion of directions for future research and theory

    incorporating these distinctions. We note at the outset that many of the points we make below have been

    voiced previously. However, a quick perusal of the literature will reveal that some conventional bits of

    wisdom have been routinely ignored by many authors, and there remains a lack of consensus regarding

    how mediational hypotheses should be framed and tested (Wood et al., in press). Our hope is that this

    presentation can serve as a guide for those wishing to advance and to test mediational type relations.

    Preconditions for Mediation Tests

    The basic mediational design tests whether some antecedent condition X has a relationship with some

    criterion Y through some intervening mechanism M. In other words, mediational designs advance an

    X → M → Y style causal chain. Later we will distinguish different types of such designs and argue that

    they advance different a priori hypotheses. Nevertheless, the important point to emphasize here is that mediational designs implicitly depict a causal X → M → Y chain. Whereas substantial development

    has occurred surrounding statistical tests of mediated relationships, far less attention has been devoted

    to conditions for strong causal inference in such designs. We submit that inferences of mediation are

    founded first and foremost in terms of theory, research design, and the construct validity of measures

    employed, and second in terms of statistical evidence of relationships. The greatest challenges for

    deriving mediational inferences relate to the specification of causal order among variables, and the

    construct validity of the measures employed to operationalize X, M, and Y. In this sense, the


    preconditions for mediation tests are quite similar to those involved in specifying and testing causal

    (i.e., structural) models (see James, Mulaik, & Brett, 1982; Kenny, 1979).¹

    Causal sequence

    Perhaps most fundamentally, inferences concerning mediational X → M → Y relationships hinge on

    the validity of the assertion that the relationships depicted unfold in that sequence (Stone-Romero &

    Rosopa, 2004). In other words, as with structural modeling techniques, multiple qualitatively different

    models can be fit equally well to the same covariance matrix. Using the exact same data, one could as

    easily confirm a Y → M → X mediational chain as one can an X → M → Y sequence (MacCallum,

    Wegener, Uchino, & Fabrigar, 1993; Stelzl, 1986; Stone-Romero & Rosopa, 2004). Despite passionate

    pleas to the contrary by Mitchell (1985) and others, a clear trend away from the use of experimental

    designs and a parallel increase in correlational designs has been evident in organizational research in

    the past two decades (Scandura & Williams, 2000). Indeed, commenting upon that current state of the

    literature, Spencer, Zanna, and Fong (2005, p. 845) lamented that "this [correlational mediational]

    analysis strategy is overused and has perhaps been elevated as the gold standard of tests of

    psychological processes and may even be seen in some quarters as the only legitimate way to examine them." In short, no statistical analysis can unequivocally differentiate one causal sequence from

    another. Theorists and researchers must then rely on other means to justify the sequence of effects. The

    most valuable bases to advance such inference come from: (1) experimental design features; (2)

    temporal precedence; and (3) theoretical rationales.
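
The point above, that the same data can support a Y → M → X chain as readily as an X → M → Y chain, can be illustrated with a small sketch. It assumes standardized variables, a fully mediated three-variable chain with uncorrelated residuals, and hypothetical path values (0.5 and 0.4) that do not come from the article. Under those assumptions both orderings imply exactly the same correlations, so fit alone cannot choose between them.

```python
import math


def implied_correlations(first_path, second_path):
    """Correlations implied by a standardized chain A -> B -> C with no direct A -> C path.

    Returns (r_AB, r_BC, r_AC); with no direct path, r_AC is the product of the two paths.
    """
    return first_path, second_path, first_path * second_path


# Hypothetical standardized paths: X -> M = 0.5, M -> Y = 0.4.
r_xm, r_my, r_xy = implied_correlations(0.5, 0.4)
forward = {"XM": r_xm, "MY": r_my, "XY": r_xy}

# Reverse the arrows: Y -> M = 0.4, M -> X = 0.5.
r_ym, r_mx, r_yx = implied_correlations(0.4, 0.5)
reverse = {"XM": r_mx, "MY": r_ym, "XY": r_yx}

# Both causal orderings reproduce the same correlation matrix.
same_fit = all(math.isclose(forward[key], reverse[key]) for key in forward)
print(forward, reverse, same_fit)
```

Because the two models are indistinguishable on these grounds, the choice between them has to rest on design features, temporal precedence, or theory.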

    Experimental designs

    Naturally, experimental designs afford the strongest foundation for making causal inferences.

    Hallmarks of randomized experimental designs include random assignment of participants to

    conditions, control of extraneous variables, and experimenter control of the independent variable.

    Indeed, the philosophy of experimental designs is to isolate and test, as best as possible, X → Y

    relationships from competing sources of influence. In mediational designs, however, this focus is

    extended to a three-phase X → M → Y causal sequence. The benefits of conducting randomized

    experiments for testing such sequences have long been recognized. For example, Baron and Kenny

    (1986) described a design (based on Smith, 1982) whereby one introduces two experimental

    manipulations: (1) one presumed to influence the mediator and not the criterion; and (2) one presumed

    to influence the criterion yet not the mediator. Analyses of such designs would permit one to distinguish

    factors that exert influence directly on a criterion versus those that are carried through an intervening

    mechanism.

    More recently, Stone-Romero and Rosopa (2004, p. 283) argued that "the only way that one can

    make credible inferences about mediation is to perform two or more experiments. In the first, the cause

    [i.e., X] is manipulated to determine its effect on the mediator [i.e., M]. In the second, the mediator [i.e.,

    M] is manipulated to determine its effect on the dependent variable [i.e., Y]." Certainly such an

    approach affords a solid foundation for making causal inferences, but may not be feasible or even desirable in many applied circumstances. There may well be ethical, logistical, financial, and other

    considerations that limit the extent to which researchers can employ randomized experimental designs.

    ¹ We should further note that mediation inferences from such designs are predicated not only on the assumptions that X, M, and Y are causally ordered in that fashion, and their relationships are not attributable to other variables or processes, but also that they are related linearly (see Bollen, 1989; Kenny et al., 1998; Pearl, 2000 for further details). Whereas non-linear relationships, such as moderation, can be incorporated in mediational frameworks, that takes us beyond the current discussion (see Baron & Kenny, 1986; James & Brett, 1984; Muller et al., 2005).


    Spencer et al. (2005) outlined circumstances when experiments offer desirable features for drawing

    mediational inferences, as well as instances when they may be less applicable. In short, however,

    randomized experimental designs offer the strongest basis for drawing causal inferences and should not

    be abandoned so prematurely by applied researchers (Mitchell, 1985; Scandura & Williams, 2000).

    Quasi-experimental designs also afford means by which causal order can be established (Cook &

    Campbell, 1979; Shadish, Cook, & Campbell, 2002). Because researchers may not be able to randomly assign participants to conditions, the causal sequence of X → M → Y is vulnerable to any

    selection-related threats to internal validity (Cook & Campbell, 1979; Shadish et al., 2002). To the extent that

    individuals' status on a mediator or criterion variable may alter their likelihood of experiencing a

    treatment, the implied causal sequence may also be compromised. For example, consider a typical

    training → self-efficacy → performance mediational chain. If participation in training is voluntary, and

    more efficacious people are more likely to seek training, then the true sequence of events may well be

    self-efficacy → training → performance. If higher performing employees develop greater self-efficacy

    (Bandura, 1986), then the sequence could actually be performance → efficacy → training. If efficacy

    and performance levels remain fairly stable over time, one could easily misconstrue and find substantial

    support for the training → efficacy → performance sequence when the very reverse is actually

    occurring.

    Researchers also have less control over contaminating variables in quasi-experiments as compared to randomized experiments. Whereas concerns about contaminating influences and other

    threats to internal validity are extensive and well discussed elsewhere (see Cook & Campbell, 1979;

    Shadish et al., 2002; West, Biesanz, & Pitts, 2000), our primary focus here concerns threats to the

    implied causal sequence of effects in a mediational design. Revisiting our training example, a

    misspecification of causal sequence can emanate from the influence of an omitted (sometimes

    referred to as unmeasured, third, contaminating, hidden, or confounding) variable. For example, if

    employees with greater seniority are first eligible to receive training, and if they also tend to have

    higher self-efficacy, then there would be an illusionary training → efficacy relationship unless

    seniority is also controlled. The issue here is that one must carefully consider what other variables

    might confound the relationships under consideration and account for their influence when

    evaluating the specified causal sequence and variable relationships as outlined above. Otherwise,

    such influences might mask real effects (see MacKinnon, Krull, & Lockwood, 2000) or generate

    artifactual relationships.
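
The seniority example above can be made concrete with a minimal sketch. It assumes standardized variables, hypothetical effects of seniority on training (0.5) and on self-efficacy (0.6), and no training → efficacy effect at all; none of these values come from the article. The zero-order training-efficacy correlation is then entirely spurious, and it vanishes once seniority is partialled out.

```python
import math


def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation between A and B, holding C constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical standardized effects of seniority on each variable.
r_seniority_training = 0.5
r_seniority_efficacy = 0.6

# With no training -> efficacy effect, the two variables still correlate
# because both depend on seniority.
r_training_efficacy = r_seniority_training * r_seniority_efficacy

# Controlling for seniority removes the spurious relationship.
r_controlled = partial_corr(
    r_training_efficacy, r_seniority_training, r_seniority_efficacy
)
print(round(r_training_efficacy, 3), round(r_controlled, 3))
```

This only works when the confounder has been measured, which is why the omitted-variable question has to be settled at the design stage.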

    In summary, our discussion about the advantages of experimental and quasi-experimental designs

    converges on the larger issues of justifying the presumed causal order of variables and minimizing the

    influence of unmeasured variables. Randomized experiments certainly provide the strongest case for

    minimizing the influence of such potential effects but are difficult to implement. Quasi-experiments

    offer much as well, but are susceptible to a variety of threats to causal inferences. James (1980) and

    James et al. (1982) have well chronicled this issue. They submitted that an omitted variable poses a

    threat to causal inferences if it: (a) has a significant unique influence on an effect (i.e., mediator or

    criterion); (b) is stable; and (c) is related to at least one other predictor included in the model. In other

    words, in mediation analyses, omitted variables represent a significant threat to validity of the X → M

    relationship if they are related both to the antecedent and to the mediator, and have a unique influence on the mediator. Moreover, omitted variables represent a significant threat to inferences involving the

    prediction of the criterion if they have a unique influence on the outcome variable and are related to either the

    antecedent or the mediator.

    Temporal precedence

    A mediational framework proposes that the antecedent precedes the mediator, which in turn precedes

    the criterion. Implicitly, therefore, mediational designs advance a time-based model of events whereby


    X occurs before M which in turn occurs before Y. To the extent that the measures and operations

    employed to operationalize variables in a study are aligned with that sequence, one can have more

    confidence that the chain of relationships is not compromised. Let us emphasize one point here: it is the

    temporal relationships of the underlying phenomena that are at issue, not necessarily the timing of

    measurements. Certainly, the extent to which the two sequences are aligned is of concern.

    However, the literature is replete with designs whereby researchers collect a set of observations and then correlate variables with some record of last year's performance, whether that was derived from

    performance appraisals, performance outputs, sales, or some financial index. In effect, this reverses the

    presumed sequence of events and more likely models Y → X → M than it does X → M → Y.

    Simply assessing a presumed antecedent before measuring a presumed mediator and criterion in no

    way assures that the true underlying causal order is consistent with the order of measurement (James

    et al., 1982; Kenny, 1979). Indeed the synchronization of measurement timing and the development of

    phenomena over time are critical to the basis of causal inferences (Mitchell & James, 2001). Given that

    X → M → Y relationships are presumed to unfold over time, this raises the question: how long does it

    take each variable to develop and to change? Consider a work redesign effort intended to empower

    employees and thereby to enhance their work motivation with the aim of increasing customer

    satisfaction. How long does it take to establish the new work design? Over what duration should we

    track employees' subsequent motivation? If employees are indeed more motivated to perform, how long will it take for customers to notice and for them to become more satisfied? These questions are not

    easy to answer, and in few instances would phenomena readily align with the 3- or 6-month intervals that

    organizations are willing to tolerate, even if they are open to multiple data collections. Even worse,

    consider the fact that employee motivations (M) in this instance are likely to begin changing before the

    work redesign (X) intervention is fully entrenched. And, the appropriate window for sampling

    customer reactions may vary widely depending on their frequency of encounters with employees and

    other factors. In sum, the guiding point here is that the passage of time between the assessment of X, M,

    and Y helps to further strengthen inferences about the causal sequence. Aligning such

    assessments with the underlying developmental phenomena being studied will strengthen

    causal inferences.

    Theoretical guidance

    Theoretical frameworks usually prescribe a distinct ordering of variables. In fact, it is a hallmark of

    good theories that they articulate how and why variables are ordered in a particular way (e.g., Sutton

    & Staw, 1995; Whetten, 1989). This is perhaps the only basis for advancing a particular causal order in

    non-experimental studies with simultaneous measurement of the antecedent, mediator, and criterion

    variables (i.e., classic cross-sectional designs). For example, Fishbein and Ajzen's (1975) Theory of

    Reasoned Action has long posited that individuals' attitudes give rise to intentions, which in turn

    influence their actual behaviors. This theoretical foundation has been applied extensively to the study

    of employees' absence and turnover behaviors (Hom & Kinicki, 2001; Tett & Meyer, 1993). The job

    characteristics model argues that work design features give rise to psychological states which in turn

    influence individuals' reactions (e.g., satisfaction) and behaviors (Hackman & Oldham, 1980). Mathieu

    (1991) used Lewinian Field Theory (Lewin, 1943) to submit that variables more psychologically removed from oneself (i.e., distal effects such as perceptions of work characteristics) would influence

    more psychologically proximal variables (e.g., role states) and thereby affect work attitudes (e.g.,

    satisfaction, organizational commitment). Absent an experimental or longitudinal design, one might

    test a mediational model on the basis of the theoretical ordering of variables. Naturally the case would

    be stronger if one could also leverage features from a design perspective, but clearly the theory must

    articulate a certain causal sequence. And, in cross-sectional studies one often has little else to justify

    any particular order.


    To summarize, the specification of the causal order of variables is absolutely critical to inferences

    about mediational relationships. This is first and foremost a theoretical exercise. Research design

    features in terms of experimental control and temporal precedence provide additional justification for

    particular sequences. Notably, there is no panacea for justifying causal sequence. Learned scholars

    differ on what they believe is sufficient grounds upon which to claim causal order. On one extreme,

    Stone-Romero and Rosopa (2004) submitted that anything short of a randomized experiment is insufficient to claim justified causal order. Their position is that "tests of mediation models that are based

    upon data from non-experimental studies have little or no capacity to serve as a basis for valid

    inferences about mediation" (Stone-Romero & Rosopa, 2004; p. 250). On the other extreme, a perusal

    of journal articles will quickly reveal numerous instances of authors claiming causal connections

    from mediational analyses of cross-sectional data collected in a single survey as related to last year's

    performance indices. Reasonable people can disagree, and we personally believe that both of the above

    extreme positions are probably overstated. In any case, to the extent that one's work: (a) is grounded in

    strong theory; (b) employs true or quasi-experimental designs; and (c) assesses variables over time in

    the proper sequence and intervals, confidence in the causal sequence of variables in a particular model

    is enhanced.

    Measurement related issues

    As with any research investigation, the construct validity of measures employed is of concern in tests

    of mediation. Schwab (1980) submitted that "Construct validity is defined as representing the

    correspondence between a construct (conceptual definition of a variable) and the operational

    procedure to measure or manipulate that construct" (pp. 5–6). Of note in particular for mediational

    analyses, attention should be directed at the convergent and discriminant validity of measures.

    Convergent validity

    Convergent validity essentially concerns the extent to which different measures of the same construct

    hold together or converge on the intended construct. Usually convergent validity is assessed using

    techniques such as factor analyses and other approaches that evaluate how well different observations

    relate to a latent variable. Naturally, this concept is related to reliability concepts such as internal

    consistency estimates, alternative forms/methods, interrater, or test-retest. Depending on the nature of

    the constructs involved in the X → M → Y relationship, any combination of reliability estimates may

    be applicable (see Nunnally, 1978). Of note for the present discussion is the fact that measurement

    unreliability, particularly that of the mediator, can bias mediational analyses. As Hoyle and Kenny

    (1999) have demonstrated, assuming all positive paths, to the extent that a mediator is measured with less

    than perfect reliability, the M → Y relationship would likely be underestimated, whereas the X → Y would

    likely be overestimated when the antecedent and mediator are considered simultaneously. Whereas latent

    variable modeling can help to compensate for measurement shortcomings, the technique is certainly not a

    panacea. Consequently, the message here is clear: it is critical to use reliable measures when testing

    mediation, particularly when it comes to the mediator variable.
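
The direction of the bias described above can be reproduced with a short sketch. It assumes a standardized, fully mediated model with hypothetical paths (X → M = 0.5, M → Y = 0.5, no direct path), perfectly measured X and Y, and classical measurement error in M that attenuates its correlations by the square root of its reliability. The reliability value of 0.7 is illustrative and is not taken from the article or from Hoyle and Kenny (1999).

```python
import math


def betas_y_on_x_and_m(r_xy, r_my, r_xm):
    """Standardized coefficients from regressing Y on X and M together.

    Returns (direct, b): the X -> Y path and the M -> Y path.
    """
    denom = 1 - r_xm ** 2
    direct = (r_xy - r_xm * r_my) / denom
    b = (r_my - r_xm * r_xy) / denom
    return direct, b


def observed_betas(a, b, reliability):
    """Estimates obtained when only the mediator is measured with error.

    Assumed true model: full mediation with X -> M = a, M -> Y = b, no direct path.
    Error in M shrinks every correlation involving M by sqrt(reliability).
    """
    shrink = math.sqrt(reliability)
    return betas_y_on_x_and_m(r_xy=a * b, r_my=b * shrink, r_xm=a * shrink)


direct_perfect, b_perfect = observed_betas(0.5, 0.5, reliability=1.0)
direct_noisy, b_noisy = observed_betas(0.5, 0.5, reliability=0.7)
print(round(direct_perfect, 3), round(b_perfect, 3))
print(round(direct_noisy, 3), round(b_noisy, 3))
```

With a perfectly reliable mediator the sketch recovers the assumed paths; with the less reliable mediator the M → Y estimate shrinks and a nonzero X → Y estimate appears, so a truly full mediation could be mistaken for a partial one.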

    Discriminant validity

    Discriminant validity of measures is another concern for all research investigations, yet particularly for

    tests of mediation. Discriminant validity refers to the extent to which measures of different constructs

    are empirically and theoretically distinguishable. Note that discriminant validity must be gauged in the

    context of the larger nomological network within which the relationships being considered are believed

    to reside. Discriminant validity does not imply that measures of different constructs are uncorrelated;


    indeed if that were the case there would be no mediational covariance to be modeled. The issue is

    whether measures of different variables are so highly correlated as to raise questions about whether

    they are assessing different constructs.

    If measures of an antecedent variable and a mediator are not sufficiently distinguishable, then they

    are in effect tapping the same underlying domain. Consequently, any attempt to parse their independent

    contributions to a criterion variable will be futile. For example, if X and M fail to evidence discriminant validity, any sequential analysis of their substantive relationship will conclude that the mediator carries

    the influence of X on Y. The same conclusion would follow in situations where the mediator and

    criterion are not distinguishable. This problem is akin to the notion of multi-collinearity between either

    the X and M variables, or between the M and Y variables. Consequently, it is incumbent on researchers

    to demonstrate that their measures of X, M, and Y evidence acceptable discriminant validity before any

    mediational tests are justified. This may be done in a variety of fashions ranging from exploratory factor

    analyses to more powerful multi-trait, multi-method approaches, and confirmatory factor analyses. In

    sum, a lack of discriminant validity between either X and M, or between M and Y, will lead to an

    illusionary mediational relationship that amounts to nothing more than correlating some measure of a

    construct with another measure of the same construct.

    Distinguishing indirect and mediating relationships

    Up to this point we have employed the term mediating variable in a very general sense. Unfortunately,

    different authors define mediation in many different ways and often use terms such as indirect effects,

    intervening variables, intermediate endpoint, and so forth interchangeably with mediators (see

    MacKinnon et al., 2002 for a review). Further, although mediational models are pervasive in applied

    research and elsewhere, there is some debate concerning the requisite statistical evidence for drawing

    inferences about mediation (cf., Baron & Kenny, 1986; Collins, Graham, & Flaherty, 1998; Frazier

    et al., 2004; Holmbeck, 1997; James & Brett, 1984; James, Mulaik, & Brett, 2006; Kenny, Kashy, &

    Bolger, 1998; MacKinnon et al., 2000, 2002; Preacher & Hayes, 2004; Shrout & Bolger, 2002). We

    believe that root causes of such controversies lie in differences of opinion regarding: (1) definitions of mediators and related concepts; (2) the necessity of first demonstrating a significant total X → Y

    relationship; and (3) the appropriate base model for tests of different forms of mediation.

    The first two points of contention are closely intertwined. Some have submitted that a precondition

    for tests of mediation is that the antecedent must exhibit a significant total relationship with a criterion

    when considered alone (i.e., X → Y; see Baron & Kenny, 1986; Judd & Kenny, 1981; Preacher &

    Hayes, 2004). Others have relaxed this precondition, and argued that mediation inferences are justified

    if the indirect effect carried by the X → M and M → Y paths is significant (e.g., Kenny et al., 1998;

    MacKinnon et al., 2002). Advocates of this latter view often equate mediator variables with indirect

    effects (e.g., Alwin & Hauser, 1975; Bollen, 1987; MacKinnon et al., 2002). However, there is an

    important distinction between indirect effects and mediator variables.

    For example, MacKinnon et al. (2002, p. 83) suggested that "An intervening variable (Mediator)

    transmits the effect of an independent variable to a dependent variable" [emphasis added]. In contrast, Baron and Kenny (1986, p. 176) submitted that "a given variable may be said to function as a mediator

    to the extent that it accounts for the relation between the predictor and the criterion" [emphasis added].

    Preacher and Hayes (2004, p. 719) explicitly drew a distinction between the two concepts and argued

    that mediation is a special, more restrictive, type of intervening relationship.

    "A conclusion that a mediation effect is present implies that the total effect X → Y was present

    initially. There is no such assumption in the assessment of indirect effects. It is quite possible to find

    that an indirect effect is significant even when there is no evidence for a significant total effect.

    Whether or not the effect also represents mediation should be judged through examination of the

    total effect."

    In other words, mediator variables are explanatory mechanisms that shed light on the nature of the

    relationship that exists between two variables. If no such relationship exists, then there is nothing to be mediated. While a chain of events whereby X → M and M → Y may well be of interest, along with

    the extent to which variance in Y can be attributed to the indirect effect of X, we submit that sequence

    represents a qualitatively different phenomenon than mediation. We prefer to label such relationships

    as indirect effects.

    We readily acknowledge that there is another reason why some researchers (e.g., MacKinnon et al.,

    2000) advocate dropping the X → Y precondition for mediational inferences. This second position

    centers around the fact that confounding, suppression, and interactive effects could attenuate overall

    X → Y relationships (see MacKinnon et al., 2000).² Similarly, others have argued that competing

    effects might mitigate total X → Y relationships when opposite-signed direct and indirect effects are

    present (e.g., when X and M are both positively related to Y, yet X and M are negatively related).

    Notably, the common thread through all of these positions is that some other variable, including

    perhaps the mediator, serves to contaminate the total X→Y relationship when viewed in isolation. In other words, the "opposite signs" and "mediator as a suppressor" arguments both suggest that the true

    underlying model is a partially mediated one whereby the direct effect of X→Y can only be interpreted

    in the context of a model that also includes the M→Y path. This raises a central point about the

    importance of the base model that one hypothesizes.
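The cancellation argument can be made concrete with a small worked equation. For ordinary least squares estimates with a single mediator, the total effect equals the direct effect plus the product of the two mediating paths. The numerical values below are hypothetical and chosen only to show how a positive direct effect and a negative indirect effect can offset one another.

```latex
% Decomposition of the total X -> Y effect (single mediator, OLS estimates)
\begin{align}
  \beta_{yx} &= \beta_{yx.m} + \beta_{mx}\,\beta_{ym.x} \\
  % Hypothetical values: X and M both relate positively to Y,
  % while X relates negatively to M
  \beta_{yx} &= 0.30 + (-0.50)(0.60) = 0.30 - 0.30 = 0
\end{align}
```

Under these assumed values the total X→Y relationship is zero even though both the direct path and the indirect path are non-zero, which is why the direct effect is interpretable only in a model that also includes the M→Y path.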

    James et al. (2006) underscored the importance of the base model that one adopts for tests of

    mediation. In the case of full mediation, M is hypothesized to fully account for the significant total

    effect of X→Y. In other words, the direct effect of X→Y is no longer significant once M→Y has been

    included. In contrast, in the case of partial mediation, M is believed to account for a significant portion

    of the total X→Y effect, but a significant direct effect also remains. In other words, both M→Y and X→Y

    are significant when considered simultaneously. Of course, both partial and full mediation models are

    predicated on a significant X→M relationship. James et al. (2006) and Shrout and Bolger (2002) noted

    that full and partial mediational inferences rely on different types of statistical tests. Consequently,

    which model one hypothesizes may lead to different conclusions in many instances.

    James et al. (2006) noted that the Baron and Kenny (1986) approach implicitly advocates partial

    mediation as the base model for tests of mediation. Alternatively, James and Brett (1984), and James

    et al. (2006) prefer the axiom of parsimony and advocate the full mediation base model. All agreed that

    substantive reasoning should guide which is adopted in any given circumstance; but the important point

    is that the a priori model that one advances has important implications for confirmatory and

    disconfirming statistical evidence. We extend this logic and submit that the specification of one's

    hypothesized base model has implications for indirect effects along with partial and full mediation.

    Moreover, such relations can be examined in the context of larger structural models where the

    influences of other variables of interest are also considered. Nevertheless, the evidential basis for

    drawing inferences of each type remains consistent and is outlined below.

    In summary, we believe that there are different types of relationships that fall under the general

    heading of intervening effects. Accordingly, we use the term intervening effects to describe any type of

    ² Concerns about the influence of interactive or confounding variables imply the presence of non-linear relationships, which violates an assumption of testing indirect or mediated relations, unless one is also hypothesizing moderation (see Footnote 1). Further, extraneous, omitted, or 3rd variables represent specification errors that always must be accounted for, through theoretical, methodological, or empirical means, whenever a causal sequence of effects is advanced (James et al., 1982; Stone-Romero & Rosopa, 2004). We elaborate more fully on this and related points below.


    linking mechanism M that ties an antecedent with a criterion. Indirect effects are a special form of

    intervening effect whereby X and Y are not related directly (i.e., are uncorrelated), but they are

    indirectly related through significant relationships with a linking mechanism. In contrast, mediation

    refers to instances where the significant total relationship that exists between an antecedent and a

    criterion is accounted for in part (partial mediation) or completely (full mediation) by a mediator

    variable.

    Estimation Guidelines and Decision Rules

    We submit that different statistical rules of evidence apply depending on whether one anticipates an

    indirect effect versus partial or full mediation. We argue that researchers are obliged to specify, a priori,

    which type of intervening process they anticipate. Importantly, the nature of the hypothesized

    relationship leads to different sources of confirmatory and disconfirming evidence. In this sense, what

    we are advocating is an approach that is similar to that of structural equation modeling (SEM). Accordingly, a failure to reject a hypothesized model hinges on two types of tests: (1) confirmation of

    hypothesized relations (i.e., relationships that were hypothesized to exist are indeed significant and in

    the hypothesized directions); and (2) disconfirmation of non-hypothesized paths (i.e., sufficient model

    fit indices, which indicate that the paths that were hypothesized to be absent are indeed not significant).

    Moreover, because different competing models can be fit to the same data, we advocate contrasting

    one's hypothesized model against viable alternative models (see Anderson & Gerbing, 1988, for a good

    background on this general approach).

    Figure 1 presents three alternative models containing intervening effects and their respective

    parameters. In this sense, the indirect effects model is the most constrained or parsimonious, as it

    implies that the only significant relationship observed is the combined effect (β_mx × β_ym). This

    implies that both the X→M (β_mx) and M→Y (β_ym) paths are significant, although the combined effect

    is best tested using approaches such as the Sobel test (see MacKinnon et al., 2002; Shrout & Bolger,

    [Figure 1 diagram: three path models linking X, M, and Y, labeled Indirect Effect, Full Mediation, and Partial Mediation, with path coefficients β_mx, β_ym, β_ym.x, β_yx, and β_yx.m]

    Figure 1. Alternative intervening models


    2002 for details). Importantly, an indirect effect hypothesis also implicitly suggests that the total X→Y

    relationship (β_yx) is absent. The full mediation model is the next most parsimonious. This model also

    includes significant X→M (β_mx) and M→Y (β_ym) paths. However, the dashed line from X→Y in this

    model is meant to imply a significant total X→Y (β_yx) relationship that becomes non-significant when

    M→Y (β_ym) is included. In other words, a hypothesis of full mediation also requires a non-significant

    β_yx.m effect. Last, the partial mediation model is the least parsimonious and implies that X→M (β_mx), as well as both M→Y and X→Y, will be significant when considered simultaneously (β_ym.x and β_yx.m,

    respectively).
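The parameters named in this paragraph can be read as coefficients of a small set of regression equations. The block below is a sketch in that spirit, assuming centered (or standardized) variables so that intercepts can be omitted; the residual symbols e are introduced here for illustration and do not appear in the original.

```latex
% Regression equations behind the parameters of the three models
\begin{align}
  M &= \beta_{mx}\,X + e_{m}
      && \text{(X to M path, common to all three models)} \\
  Y &= \beta_{yx}\,X + e_{y}
      && \text{(total effect of X on Y)} \\
  Y &= \beta_{ym}\,M + e_{y}'
      && \text{(M to Y path with no direct X to Y path)} \\
  Y &= \beta_{yx.m}\,X + \beta_{ym.x}\,M + e_{y}''
      && \text{(X and M considered simultaneously)}
\end{align}
```

Read this way, the indirect effect and full mediation hypotheses rest on the third equation for the M→Y path, whereas the partial mediation hypothesis rests on the fourth, in which the X→Y and M→Y paths are estimated together.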

    The three panels of Figure 2 specify sequences of effects to be considered in order for each of the

    hypothesized models to be supported. Later we will describe analytic techniques which provide

    information regarding these effects. The columns of rectangles in Figure 2 depict the various

    conditions that must hold for each model to be accepted, whereas the branches containing circles depict

    guided alternative hypotheses that might be considered if a hypothesized condition is disconfirmed.

    Notably, by "accepted" models we mean that the data fail to reject the hypothesized model. As is always

    the case, this does not mean that a hypothesized model has been proven; simply that it is not

    inconsistent with the data. Similarly, once any facet of a hypothesized model is rejected, one enters an

    exploratory mode as alternative models are considered. Consequently, any conclusions that are derived

    from such searches are tentative at best and need to be validated on a new sample.

    Indirect effects

    As shown in Panel 1 of Figure 2, the pivotal test of the indirect model is simply (β_mx × β_ym) using

    methods such as the Sobel (1982) test or more sophisticated approaches employing bootstrapping

    techniques (see MacKinnon et al., 2002; Preacher & Hayes, 2004; Shrout & Bolger, 2002). If such a

    test is not significant, then one should reject the indirect effect hypothesis and consider viable

    alternatives. A more parsimonious alternative in such instances would be to consider simply a direct

    Figure 2. Decision tree for evidence supporting different intervening effects


    X→Y (β_yx) relationship. Moreover, even if the indirect effect was significant, researchers should

    consider whether alternative models such as a partially or fully mediated model are suggested by the

    data. For example, if the overall X→Y relationship was significant (β_yx), and X does not contribute to

    the prediction of Y once M has been considered (β_yx.m), then the hypothesis of an indirect effect would

    be rejected in favor of an alternative model of full mediation.

    This approach echoes our earlier comments about researchers' casual dismissal of the precondition of a total X→Y effect for tests of mediation. Our reading of the literature suggests that there are very

    few instances where researchers actually hypothesized a priori that the total X→Y relationship would

    be non-significant. Rather, it appears as though many invoke a waiver of the X→Y precondition when it

    fails to materialize in their data, and then simply proceed to test the significance of the indirect effect.

    This has occurred even when there was no evidence to suggest suppression or counter-acting signs of

    direct and indirect effects. Such tactics, in our opinion, move one away from confirmatory hypothesis

    testing and into the exploratory realm. In summary, we submit that the presence of a significant total

    X→Y relationship leads to the rejection of a hypothesis of an indirect effect and should trigger a

    consideration of an alternative partial mediation hypothesis (if suppression or counter-acting effects are

    suspected), and thereby perhaps, to a full mediation explanation.
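As an illustration of the pivotal test in this section, the following is a minimal sketch of the first-order Sobel (1982) test of the product β_mx × β_ym. The function name, argument names, and the numerical inputs are hypothetical; the p-value relies on a normal approximation, which is one reason bootstrapping approaches are often preferred.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for an indirect effect a * b.

    a, se_a: estimate and standard error of the X->M path (beta_mx)
    b, se_b: estimate and standard error of the M->Y path (beta_ym)
    Returns (indirect effect, standard error, z statistic, two-tailed p).
    """
    indirect = a * b
    # First-order delta-method standard error of the product a * b
    se = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se
    # Two-tailed p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return indirect, se, z, p


# Hypothetical path estimates, used only to show the call
indirect, se, z, p = sobel_test(a=0.40, se_a=0.10, b=0.50, se_b=0.10)
print(f"indirect = {indirect:.2f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

A significant result from such a test speaks only to the indirect effect; whether the pattern also qualifies as mediation still depends on the total X→Y relationship, as argued above.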

    Full mediation

    As depicted in the second panel of Figure 2, a hypothesis of full mediation is predicated on a significant

    total X→Y (β_yx) relationship. Failing that, one might consider an alternative hypothesis of an indirect

    effect. If suppression is evident, then one might consider the alternative hypothesis of a partially

    mediated relationship. Assuming the total effect was present, one proceeds to test the X→M (β_mx) and

    M→Y (β_ym) relationships. If either fails to exist, then the evidence is consistent with the alternative

    hypothesis of a direct effect. Moreover, full mediation depends on the non-significance of the direct effect

    of X→Y when the M→Y path is included (i.e., a non-significant β_yx.m). If the direct X→Y path is

    significant in this context, then the hypothesis of full mediation should be rejected and the researcher

    should consider the alternative hypothesis of partial mediation. We should note that there may be cases where adding the β_yx.m parameter attenuates the M→Y relationship (β_ym.x) to a non-significant level.

    If the X→Y relationship (β_yx.m) is significant in such an instance, then the full mediational hypothesis

    should be rejected in favor of an alternative hypothesis of a direct effect. Alternatively, if neither β_yx.m nor

    β_ym.x is significant, and the previous conditions were satisfied, then the data are consistent with the

    hypothesis of full mediation. This follows from the fact that the relevant M→Y parameter for the full

    mediation hypothesis is β_ym, not β_ym.x (see James et al., 2006).

    Partial mediation

    A partial mediation hypothesis, as shown in Panel 3 of Figure 1, is the least constrained and rests on the significance of all three paths: X→M (β_mx) and both X→Y (β_yx.m) and M→Y (β_ym.x) when

    considered simultaneously. Given the presumed causal order of variables, if the X→Y (β_yx.m) path is

    not significant in this model, then the hypothesis of partial mediation should be rejected and one should,

    perhaps, consider an alternative hypothesis of full mediation. Alternatively, if the X→M (β_mx) or the

    M→Y (β_ym.x) paths are not significant, then the partial mediation hypothesis should be rejected in favor

    of the alternative hypothesis of simply a direct effect. The partial mediation hypothesis would only be

    supported if all three hypothesized paths are significant.
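The decision rules for the three hypotheses can be summarized as a small function. This is a sketch of our reading of the text, not a reproduction of Figure 2: the class, field, and label names are hypothetical, each input is simply whether a given estimate was statistically significant, and the suppression branch (which requires substantive judgment) is noted in a comment rather than encoded.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PathTests:
    """Whether each estimate was statistically significant (hypothetical names)."""
    total: bool           # beta_yx: total X->Y effect
    x_to_m: bool          # beta_mx: X->M path
    m_to_y: bool          # beta_ym: M->Y path, no direct X->Y path in the model
    direct: bool          # beta_yx.m: X->Y path with M->Y in the model
    m_to_y_given_x: bool  # beta_ym.x: M->Y path with X->Y in the model
    indirect: bool        # beta_mx * beta_ym, e.g., a Sobel or bootstrap test


def evaluate(hypothesis: str, t: PathTests) -> Tuple[bool, Optional[str]]:
    """Return (supported, alternative to explore if the hypothesis is rejected)."""
    if hypothesis == "indirect":
        if not t.indirect:
            return False, "direct effect"
        if t.total:
            # A significant total effect is inconsistent with an indirect effect
            return False, ("partial mediation" if t.direct else "full mediation")
        return True, None
    if hypothesis == "full":
        if not t.total:
            # If suppression is evident, partial mediation may be considered instead
            return False, "indirect effect"
        if not (t.x_to_m and t.m_to_y):
            return False, "direct effect"
        if t.direct:
            return False, ("partial mediation" if t.m_to_y_given_x else "direct effect")
        return True, None
    if hypothesis == "partial":
        if not (t.x_to_m and t.m_to_y_given_x):
            return False, "direct effect"
        if not t.direct:
            return False, "full mediation"
        return True, None
    raise ValueError(f"unknown hypothesis: {hypothesis}")


# Hypothetical pattern: all paths significant when considered simultaneously
all_significant = PathTests(True, True, True, True, True, True)
print(evaluate("full", all_significant))     # rejected, explore partial mediation
print(evaluate("partial", all_significant))  # supported
```

Any alternative returned by such a function is an exploratory lead rather than a confirmed result, in keeping with the point that conclusions reached after a hypothesized model is rejected need to be validated on a new sample.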


    Caveats

    We should highlight two other related concerns for the tests outlined above. First, our various decision

    points pivot on tests of statistical significance, just as any SEM nested model comparisons do (see

    Anderson & Gerbing, 1988). Nevertheless, tests of statistical significance must be considered in light of

    related issues such as sample size and measurement reliability (Hoyle & Kenny, 1999). In other words, significance tests must be tempered by considerations of power and effect size estimates. Enormous

    sample sizes can yield statistically significant results that are virtually meaningless in practice and also

    easily lead to the rejection of full mediation hypotheses. Alternatively, small sample sizes can easily

    lead to inferences of full mediation in instances where there is not sufficient power to adequately test for

    partial mediation. The summary point is that sufficient power must exist to adequately test various

    relationships, and researchers should balance conclusions about statistical significance with those

    about practical significance.
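The dependence of significance on sample size can be illustrated with a short sketch. It uses the Fisher z approximation for testing a correlation against zero, which is a swapped-in convenience rather than a test used in this article, and the correlations and sample sizes are hypothetical.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-tailed p-value for H0: rho = 0 via the Fisher z transform.

    r: observed correlation; n: sample size (must exceed 3).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# A trivially small correlation in an enormous sample (hypothetical values)
p_large_sample = correlation_p_value(r=0.05, n=10000)
# A moderate correlation in a small sample (hypothetical values)
p_small_sample = correlation_p_value(r=0.30, n=30)

print(f"r = 0.05, n = 10000: p = {p_large_sample:.2g}")
print(f"r = 0.30, n = 30:    p = {p_small_sample:.2g}")
```

Under these assumed values the negligible correlation is statistically significant while the moderate one is not, which mirrors the concern that large samples can overturn full mediation hypotheses and small samples can appear to support them.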

    A second, and related, caveat concerns the relative power of different tests. MacKinnon et al. (2002)

    have argued that a direct test of intervening effects (β_mx × β_ym) has greater power as compared to causal

    steps approaches such as those outlined by Baron and Kenny (1986). However, recall that MacKinnon

    et al. (2002) equated indirect effects with mediation, and argued that "overall, the step requiring a

    significant total effect of X on Y led to the most Type II errors" (p. 96). In other words, the distinguishing feature between indirect and mediator relations is what accounts for the fact that tests of

    the latter are more conservative than the former. The fact that mediators rely on the presence of a total

    direct effect represents a greater statistical burden, however, so we believe that the corresponding lower

    power is totally appropriate. In other words, the combined tests of indirect effects appear to have greater

    statistical power simply as a consequence of comparing them with qualitatively different types of

    relationships, namely, mediation.

    Summary

    Tests of intervening effects are predicated on the assumption that the causal sequence of variables is

    sufficiently justified and the measures employed to represent the constructs possess sufficient construct

    validity. While not particularly controversial, we believe these preconditions are often overlooked and

    should be afforded more attention by scholars. Less clarity surrounds the rules of evidence for

    mediational type inferences and associated statistical tests. A key to most of this confusion is the fact

    that X→M→Y models may represent full mediation, partial mediation, or indirect effects, all of

    which are confirmed or disconfirmed in slightly different ways. We submit that researchers are obliged

    to a priori specify the nature of the relationship(s) that they anticipate, and then to conduct the

    corresponding tests to demonstrate both confirmatory and disconfirming evidence. To better illustrate

    how this works in practice, we offer the following example.

    Empirical Illustration

    The purpose of this illustration is to demonstrate the steps and evidential basis involved in testing

    indirect effects, and partial and fully mediated relationships. Our example focuses on the concept of


    self-efficacy as a mediator of the influences of individual differences and situational cues on

    individuals' performance and is illustrated in Figure 3. Self-efficacy is defined as "people's judgments

    of their capabilities to organize and execute courses of action required to attain designated types of

    performances" (Bandura, 1986, p. 391). The positive relationship between self-efficacy and

    performance has been demonstrated time and again and summarized in narrative reviews (e.g.,

    Bandura & Locke, 2003) and in meta-analyses (e.g., Stajkovic & Luthans, 1998). Hence, there is abundant support for viewing it as an influence on performance.

    Bandura (1977) has long theorized that self-efficacy is determined mostly by the cognitive appraisal

    and integration of information cues. Two of the most influential of such cues are enactive mastery and

    vicarious experience. Enactive mastery develops through repeated performance accomplishments in

    the same or similar situations. In other words, to the extent that individuals have performed well in a

    particular situation in the past, it is reasonable to expect that their efficacy expectations will be higher.

    Further, naturally one would expect that individuals' past performance would correlate positively with

    their future performance for reasons other than simply their efficacy expectations (e.g., because ability

    influences both). Consequently, as depicted in Figure 3, we would hypothesize that self-efficacy would

    partially mediate the relationship between previous (i.e., baseline) performance and subsequent

    performance.

    Vicarious experience is gained through direct observation or information about how well others

    have performed in a situation. However, there is no reason to expect that vicarious experience

    would influence individuals' performance unless they internalized such information in terms of

    their efficacy expectations. Consequently, we hypothesized that self-efficacy would fully mediate

    the relationship between normative information and individuals' performance. Indeed, previous

    research has been consistent with this expectation (e.g., Mathieu & Button, 1992; Weiss, Suckow, &

    Rakestraw, 1999).

    Finally, recent theorizing and research have argued that relatively stable individual differences may

    influence efficacy expectations. For example, Phillips and Gully (1997) found support for a positive

    correlation between individuals' learning goal orientation and their self-efficacy in an academic setting.

    Individuals who are high on learning goal orientation strive to understand something new or to increase

    their level of competence in a given activity (Button, Mathieu, & Zajac, 1996). Whereas a learning goal orientation may contribute directly to performance, it is more likely to help shape specific task-related

    perceptions such as self-efficacy. Although some previous researchers have found results that are

    consistent with a hypothesis of full mediation (e.g., Phillips & Gully, 1997), others have found

    relationships more consistent with an indirect effect inference (e.g., Chen et al., 2000; Diefendorff,

    2004; Potosky & Ramakrishna, 2002). Which interpretation is most appropriate is debatable. However,

    for present illustration purposes, we hypothesized that learning goal orientation would exhibit a

    positive indirect effect with performance via self-efficacy.

    Figure 3. Hypothesized intervening effects


    Method

    Participants

    Two hundred and one undergraduates were recruited from introductory psychology courses at a large northeastern university and received extra credit toward their course grade for participation. The

    sample was 61 per cent female, and their average age was 18.65 (SD = 1.97). Participants were invited

    to attend experimental sessions where they were randomly assigned to one of the three normative

    information conditions (n = 67 per condition) as described below.

    Task and procedure

    The task was identical to the one used by Mathieu and Button (1992). It involved the creation of words

    containing three or more letters drawn from a list of 10 letters: four vowels (worth 1 point each), two

    consonants (worth 1 point each), and four additional consonants worth 2, 3, 4, and 10 points,

    respectively. The point values associated with each letter correspond to those used in the game Scrabble. The object of the task was to score as many points as possible during a 10-minute session

    by forming words containing three or more letters from the list provided, excluding proper nouns and

    slang. Points were awarded according to the point values associated with letters used in each word

    generated.
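The scoring rule can be sketched as follows. The article does not list the 10 letters that were used, so the letter set below is hypothetical (it merely matches the stated point structure and Scrabble values), and the sketch assumes that letters may be reused within a word, that repeated words count once, and that exclusion of proper nouns and slang is handled separately by a human scorer.

```python
# Hypothetical letter list matching the described point structure:
# four vowels and two consonants worth 1 point, plus consonants worth 2, 3, 4, and 10
LETTER_VALUES = {
    "A": 1, "E": 1, "I": 1, "O": 1,  # vowels
    "T": 1, "R": 1,                  # 1-point consonants
    "D": 2, "M": 3, "H": 4, "Z": 10, # higher-value consonants
}


def score_word(word):
    """Return the points for one word, or 0 if the word is not admissible."""
    letters = word.upper()
    # Words must contain at least three letters, all drawn from the list
    if len(letters) < 3 or any(ch not in LETTER_VALUES for ch in letters):
        return 0
    return sum(LETTER_VALUES[ch] for ch in letters)


def score_session(words):
    """Total points for a session, counting each distinct word once (an assumption)."""
    distinct = {w.upper() for w in words}
    return sum(score_word(w) for w in distinct)


print(score_session(["math", "maze", "at", "dog", "math"]))
```

With this hypothetical list, "at" earns nothing because it has fewer than three letters, and "dog" earns nothing because G is not among the listed letters.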

    Upon arrival at the experimental session, participants completed an informed consent form and a

    survey that contained demographic items and a measure of learning goal orientation from Button et al.

    (1996). Once the survey was completed, the experimental task was explained to participants and they

    completed a 5-minute practice exercise. They then calculated their own score on the practice exercise

    and were told that they would perform a 10-minute experimental trial after answering some survey

    questions. The first page of the survey instrument presented the normative information manipulation

    using the following statement: "In previous testing we have found that students like you score about 115

    [175, 235] points on the task you are about to complete" [emphasis in instructions]. The middle value

    corresponded to pilot subjects' average performance, whereas the low- and high-point values were set

    one standard deviation from the mean, based on pilot data. Participants then completed several survey items

    that included their self-efficacy and a manipulation check, and then completed the 10-minute task,

    were debriefed and given their extra credit slips, and dismissed.

    Measures

    Manipulation check

    A manipulation check identical to that used by Mathieu and Button (1992) was administered after the

    survey items. It asked participants how many points they thought most people would score on the task they were about to complete. As anticipated, their responses differed significantly across the normative

    information conditions (F(2, 194) = 189.16, p < 0.001) and all means differed significantly from each

    other in the anticipated fashion.

    Performance

    Participants' practice and task performances were simply the total number of points they earned during

    each of the timed periods. Individuals scored their own practice trial in order to provide clear and


    immediate feedback. Their calculations were later checked by a coder, blind to manipulated normative

    information, and found to be highly accurate.

    Self-efficacy

    Self-efficacy was assessed using a nine-item graduated scale identical to that employed by Mathieu and

    Button (1992). The 9-point scale spanned two standard deviations above and below the mean point value obtained in the pilot sessions. Participants rated the extent to which they believed that they could score

    at least each of the point values using a Likert-type 7-point scale, ranging from 1 (virtually no

    possibility) through 4 (about a 50/50 chance) to 7 (complete certainty). Thus, higher scores represent

    greater levels of self-efficacy (α = 0.95). We created three parcels (i.e., subscales) for use in the SEM

    analysis by averaging the highest, lowest, and midpoint rating to form one indicator, the next highest,

    lowest, and middle value to form a second, and the remaining three ratings to form a third.³

    Learning goal orientation

    Learning goal orientation was assessed using six items from Button et al. (1996). An example item is "I

    prefer to work on tasks that force me to learn new things." Participants responded to each item using a

    1–7 agreement scale with higher values representing greater learning orientation (α = 0.77). We also

    created three parcels for this measure, by first fitting the items to a single factor model and then averaging the highest and lowest loading items to form one composite, and so forth as described above.
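The loading-based parceling described for the learning goal orientation items can be sketched as below. The item names, factor loadings, and responses are hypothetical; the sketch simply pairs the highest-loading remaining item with the lowest-loading remaining item until all items are assigned.

```python
def build_parcels(loadings):
    """Group items into parcels by pairing highest- with lowest-loading items.

    loadings: dict mapping item name -> loading from a single-factor model.
    Returns a list of tuples of item names, one tuple per parcel.
    """
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    parcels = []
    while len(ranked) >= 2:
        highest = ranked.pop(0)
        lowest = ranked.pop(-1)
        parcels.append((highest, lowest))
    if ranked:
        # With an odd number of items, the middle item forms its own parcel
        parcels.append((ranked[0],))
    return parcels


def parcel_scores(responses, parcels):
    """Average one respondent's item ratings within each parcel."""
    return [sum(responses[item] for item in parcel) / len(parcel) for parcel in parcels]


# Hypothetical loadings for six items and one respondent's 1-7 ratings
loadings = {"lgo1": 0.72, "lgo2": 0.55, "lgo3": 0.64,
            "lgo4": 0.48, "lgo5": 0.69, "lgo6": 0.60}
responses = {"lgo1": 6, "lgo2": 5, "lgo3": 7, "lgo4": 4, "lgo5": 6, "lgo6": 5}

parcels = build_parcels(loadings)
print(parcels)
print(parcel_scores(responses, parcels))
```

Pairing strong and weak items in this way tends to give each parcel a similar average loading, so the three indicators represent the latent variable about equally well.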

    Analytic overview

    We employed Anderson and Gerbing's (1988) two-step SEM strategy to test the model depicted in

    Figure 1 using LISREL 8.54 (Jöreskog, Sörbom, du Toit, & du Toit, 2000). SEM techniques have long

    been advocated as preferable to regression techniques for testing mediational relationships because

    they permit one to model both measurement and structural relationships and yield overall fit indices

    (cf., Baron & Kenny, 1986; James & Brett, 1984; James et al., 2006; Kenny et al., 1998). Accordingly,

    we first fit a confirmatory factor analytic (CFA) measurement model followed by a series of structural

    models testing our hypothesized relationships.

    In order to gauge model fit, we report the Standardized Root Mean Square Residual (SRMSR),

    Goodness of Fit index (GFI; Jöreskog et al., 2000), and the Comparative Fit Index (CFI; Bentler, 1990).

    We also report χ² values, which provide a statistical basis for comparing the relative fit of nested models.

    SRMSR is a measure of the standardized difference between the observed covariance and predicted

    covariance. Usually, SRMSR values ≤ 0.08 are considered a relatively good fit for the model, and

    values ≤ 0.10 considered fair (Browne & Cudeck, 1989). The CFI is an incremental fit index that

    contrasts the fit of a hypothesized SEM model against a baseline (uncorrelated indicators) model.

    Historically, SEM model incremental fit indices such as GFI and CFI < 0.90 have been considered

    wanting and likely to be improved substantially. More recently, however, Hu and Bentler (1999)

    proposed that use of combined cutoffs such as CFI ≥ 0.95 and SRMSR ≤ 0.08 results in better

    balance of rejection rates for misspecified models under different conditions. In contrast, Marsh, Kit-Tai, and Wen (2004), Beauducel and Wittmann (2005), and Fan and Sivo (2005) illustrated that

    deciding on the most appropriate index and cutoff is a complex function of the nature of model

    ³ This parceling approach helps to reduce the ratio of estimated parameters to sample size in SEM analyses (Hagtvet & Nasser, 2004; Hall, Snell, & Foust, 1999; Landis, Beal, & Tesluk, 2000). It is also true that graduated self-efficacy items such as these yield notoriously positively skewed distributions for the highest levels rated and negatively skewed distributions for the lowest levels rated. Combining the ratings in this fashion yields parcels that better fulfill the normal distribution assumptions of SEM indicators.


    misspecifications, sample sizes, balances between Type I and Type II tradeoffs, and a host of other

    factors. All authors also emphasized that the acceptability of models rests heavily on the extent to

    which hypothesized parameters are significant and in the anticipated directions, as well as issues such

    as parsimony. Given such controversy and the complexity of issues surrounding cutoff values for model

    fit indices, along with our study characteristics, we will consider models with CFI values < 0.90 and SRMSR values > 0.10 as deficient, those with CFI ≥ 0.90 to < 0.95 and SRMSR > 0.08 to ≤ 0.10 ranges as acceptable, and ones with CFI ≥ 0.95 and SRMSR ≤ 0.08 as excellent.


    Collectively, these results indicate that the measurement properties fit quite well and there is sufficient

    covariance among the latent variables to warrant examining the different intervening effects. We should

    also highlight that the fit of the five-factor CFA is equivalent to a saturated structural model, or one that includes direct paths from all antecedents to both the mediator (i.e., self-efficacy) and to the

    criterion (i.e., performance). This saturated model provides a useful comparison against which to gauge

    the fit of other models.
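Comparisons against a saturated or other nested model rest on the χ² difference test. The following is a minimal sketch, assuming integer degrees of freedom and using closed-form expressions for the χ² upper-tail probability so that only the standard library is needed; the function names and the fit values in the usage lines are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return tail + total


def chi2_difference_test(chi2_restricted, df_restricted, chi2_full, df_full):
    """Compare a restricted model with a less restricted model it is nested in."""
    delta = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    if delta_df <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    return delta, delta_df, chi2_sf(delta, delta_df)


# Hypothetical fit statistics for two nested models
delta, delta_df, p = chi2_difference_test(60.0, 25, 50.0, 23)
print(f"delta chi2({delta_df}) = {delta:.2f}, p = {p:.4f}")
```

A non-significant difference favors the more parsimonious (restricted) model, whereas a significant difference indicates that the constraints it imposes worsen fit.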

    Structural models

    Below we fit different structural models to test the three different types of intervening effects that were

    hypothesized. In effect, we isolate the direct and indirect effects for each of the three antecedents.

    However, we first fit "only directs" and "no directs" models to serve as additional bases of comparison.

    The "only directs" model estimates direct relationships from all antecedents to performance, with no

    paths leading to or stemming from the self-efficacy mediator (although self-efficacy remains as a latent

    variable in the model). This model exhibited deficient fit indices [χ²(24) = 96.36, p < 0.001;

    GFI = 0.92; CFI = 0.94; SRMSR = 0.170] and differed significantly from the CFA model

    [Δχ²(4) = 77.32, p


    addition, the indirect effects of all the antecedents to performance via self-efficacy (β_mx × β_ym) were

    significant in this model: learning goal orientation: β = 0.09, Sobel = 4.70, SE = 1.69, p < 0.05;

    normative information: β = 0.05, Sobel = 2.14, SE = 0.99, p < 0.05; and baseline performance:

    β = 0.20, Sobel = 0.21, SE = 0.04, p < 0.01.

    In summary, these two base models provide us with valuable information about the significance of

    the parameters associated with the different intervening effects. From the "only directs" model we ascertained that the mediator variable plays an important role in the context of our model. From the "no

    directs" model we learned that the indirect effect of each antecedent with performance was significant,

    as transmitted through the self-efficacy mediator. We now turn to additional models that complete the

    picture for the three different relationships that were hypothesized.

    Indirect effect

    Recall that learning goal orientation was hypothesized to have only an indirect effect with performance

    via self-efficacy. As shown in the upper triangle of Table 1, the correlation (i.e., total effect, β_yx)

    between the latent learning goal orientation variable and performance was not significant (r = 0.03,

    ns), as anticipated. Using the "no directs" model as a base, we next fit a "learning goal direct" model by

    adding a path from learning goal orientation to performance. Although this model exhibited excellent

    fit indices, [χ²(22) = 48.79, p < 0.01; GFI = 0.95; CFI = 0.98; SRMSR = 0.047], it was not a significant improvement over the "no directs" model [Δχ²(1) = 2.53, n.s.] and it differed significantly from

    the saturated model [Δχ²(2) = 29.75, p


    GFI = 0.98; CFI = 1.00; SRMSR = 0.024], was a significant improvement over the "no directs" model

    [Δχ²(1) = 29.42, p < 0.01], and did not differ significantly from the saturated model [Δχ²(2) = 2.86,

    n.s.]. This implies that the direct effect of baseline performance to performance was significant, and in

    fact it was (β_yx.m = 0.40, p < 0.01), as was the indirect effect via self-efficacy (β_mx × β_ym.x = 0.10, Sobel = 0.10, SE = 0.04, p

    Research design factors are paramount for reasonable mediational inferences to be drawn. If the

    causal order of variables is compromised, then it matters little how well the measures perform or the

    covariances are partitioned. Because no analytic technique can discern the true causal order of

    variables, establishing the internal validity of a study is critical. Adequately ruling out the influence of

    alternative explanations is also vital for drawing mediational inferences. Randomized field experiments

    afford the greatest control over such concerns, yet they may not be feasible for a number of reasons.

    Nevertheless, they remain the gold standard and should be pursued whenever possible.

    Quasi-experimental designs offer reasonable fallback options but, as Campbell and Stanley (1966) long ago

    warned, are fraught with threats to internal validity. When no type of experiment can be performed,

    temporal precedence and strong theory offer some bases for specifying causal order, but

    they are certainly not the strongest positions to defend. In the end, journal editors, reviewers, and

    consumers of research will no doubt have greater confidence in studies that leverage strong theory and

    experimental design features, reasonable exclusion of alternative explanations for effects, measures

    that have good construct validity and were gathered in the proper temporal precedence, and results that

    were consistent with the hypothesized relationships.

    In our empirical illustration, we justified the causal order of variables using a combination of

    techniques. First, an individual difference variable (learning goal orientation) was collected before the

    experiment was even introduced. Second, participants completed a practice exercise to familiarize

    themselves with the task and to establish a baseline. Third, we then randomly assigned participants to

    normative information experimental conditions, after which we assessed their self-efficacy before they

    completed the performance trial. We reported confirmatory factor analysis results that supported the

    measurement properties of the scales we employed, and then described a series of competing structural

    models that homed in on the parameters of interest for different intervening relationships. Given the

    strong theoretical foundation concerning self-efficacy, the combination of experimental design

    features, temporal precedence, measurement quality, and focused analyses represents a fairly strong

    position from which to draw mediational inferences.

    We also sought to differentiate indirect effects, partial mediation, and full mediation. Clearly they are

    similar in the sense that they all describe an intervening process linking antecedents with an outcome.

    However, we submitted that there are important, albeit subtle, differences in the nature of the

    relationships that they each advance. Moreover, we argued that different types of confirmatory and

    disconfirming evidence are warranted for each type of relationship. Most importantly, we argued that

    researchers should articulate a priori hypotheses concerning the nature of the relationship(s) that they

    anticipate. This underscores the importance of adopting a confirmatory approach toward tests of

    intervening effects. As illustrated in the panels of Figure 2, the base model(s) that one chooses presents

    important guidelines for the evidential basis of different types of inferences. Moreover, when effects are

    considered collectively in a larger structural model, the choice of which parameters to include has implications

    for tests of indirect and mediated relations. For example, a close examination of the results we reported

    will reveal that the magnitude of any given direct and indirect effect varied as a function of what other

    parameters were being modeled. In practice, it could well be that the significance of a given parameter

    will change depending on the nature of the entire network of model relations. Therefore, we encourage

    researchers to articulate an a priori model (including any potential covariates of interest), and to report

    the parameter estimates for that model. Naturally, a revised model may be suggested by the data; in

    which case it is informative to report the parameter estimates from that context as well. Of course,

    revised models need to be validated on a new sample.
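
The distinction among the three intervening relationships can be summarized as a small decision rule. The following is only a rough sketch of the evidential patterns described in this paper, not a substitute for a priori hypotheses or for the nested model comparisons; the function name, argument names, and the "unclassified" label are illustrative assumptions.

```python
def classify_intervening_effect(x_to_m_sig, m_to_y_sig, total_sig, direct_sig):
    """Return the intervening-effect pattern a set of test results is consistent with.

    x_to_m_sig: the X -> M path is significant
    m_to_y_sig: the M -> Y path (controlling for X) is significant
    total_sig:  the total X -> Y relationship is significant
    direct_sig: the X -> Y path remains significant once M is modeled
    """
    # Every intervening effect requires both links of the X -> M -> Y chain
    if not (x_to_m_sig and m_to_y_sig):
        return "no intervening effect"
    if direct_sig:
        # A direct effect alongside the chain is partial mediation when a total
        # effect exists; other patterns are left unclassified in this sketch
        return "partial mediation" if total_sig else "unclassified"
    # No direct effect: full mediation if there was a total effect to explain,
    # otherwise only an indirect effect
    return "full mediation" if total_sig else "indirect effect"

# Pattern hypothesized for learning goal orientation: no total or direct effect
print(classify_intervening_effect(True, True, total_sig=False, direct_sig=False))
```

In practice, the hypothesized pattern should be stated before the data are examined, and a rule like this serves only to check whether the results are consistent with it.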

    We provided an empirical illustration of the three types of intervening relationships. In so doing, we

    outlined how a series of structural equation models could be employed to test the relevant parameters

    for each relationship. This could have just as easily been done using standard multiple regression

    techniques. However, SEM techniques offer three critical advantages over multiple regression

    approaches. First, using a two-stage SEM approach, researchers explicitly address the measurement

    properties of the variables that they have collected before the consideration of the more substantive

    relations. Whereas SEM does not absolve researchers from the importance of using valid and reliable

    measures, it does explicitly account for how measurement properties influence substantive conclusions.

    Second, SEM techniques explicitly consider the potential influence of constrained parameters. In other

    words, SEM model fit indices hinge on the veracity of the constraints placed on relationships thought not to exist.

    For example, a hypothesis of full mediation rests not only on the significance of X → M and M → Y parameters, but

    also on whether X fails to relate significantly to Y once M has been considered. The third strength of

    SEM analyses derives from the nested model comparisons. Whereas sample size and related factors

    determine the power of tests of whether a parameter of interest differs significantly from zero, these are

    held relatively constant in the context of nested model tests. In other words, the SEM nested model

    comparisons allow one to home in on the specific parameters of interest and to contrast a given pattern

    of effects against viable alternatives. Clearly substantive considerations should guide the selection of

    alternative models (Anderson & Gerbing, 1988), yet we believe the three types of intervening effects

    we described will likely represent fairly viable alternatives for any hypothesis of interest.
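
A nested model comparison reduces to a chi-square difference test. The sketch below is one way to compute its p-value using only the Python standard library, assuming integer degrees of freedom; the function names are hypothetical, and the example statistic is the Δχ²(1) = 2.53 value reported above for adding the learning goal orientation direct path.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail term plus a finite series
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return total

def chi2_difference_p(delta_chi2, delta_df):
    """p-value for a nested-model chi-square difference test."""
    return chi2_sf(delta_chi2, delta_df)

# Adding the direct path did not significantly improve fit
p = chi2_difference_p(2.53, 1)
print(f"p = {p:.3f}", "(significant)" if p < 0.05 else "(n.s.)")
```

The test is only meaningful when one model is a restricted version of the other, so that the difference in χ² and in degrees of freedom is well defined.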

    While we are clearly echoing previous calls for greater use of SEM techniques in mediational

    analyses (Baron & Kenny, 1986; James & Brett, 1984), they are not panaceas. Researchers must still

    attend to the preconditions for tests of mediation that we reviewed. Furthermore, the various

    comparison models that we advanced are not all directly comparable. Model contrasts are only valuable

    if competing models are nested; that is, if one model represents a more restrictive

    version of the other. Whereas both the saturated model and null latent models provide valuable

    universal benchmarks, the only directs and no directs models are only useful for limited comparisons.

    Nevertheless, the series of model comparisons enable researchers to test all the relevant parameters

    related to intervening effects. We should add that simpler approaches such as regression may well be

    applied in circumstances where the assumptions of SEM techniques have not been met (e.g., reasonable

    sample sizes).

    Extensions

    Moderated relationships

    Thus far we have been concerned with strictly main effect or linear relationships associated with

    various intervening effects. However, interactions or moderator relationships can also be incorporated

    into this framework. Both James and Brett (1984) and Baron and Kenny (1986) discussed procedures

    for testing both mediators and moderators simultaneously. James and Brett (1984), and more recently,

    Muller, Judd, and Yzerbyt (2005), further distinguished the forms that combinations of mediators

    and moderators may take. First, they described mediated moderation as the situation where an

    interaction between two antecedents, as related to a criterion variable, passes through a mediator. In

    effect, this implies that the moderator influences the X → M link of a mediated relationship. For

    example, the influence of normative information on individuals' self-efficacy might be contingent on

    the extent to which participants identify with the normative group. In other words, normative

    performance information about "people just like me" is likely to influence a person's self-efficacy far

    more than is information about "people much different than me." Tests of this moderation, whether they

    be conducted using moderated regression or more sophisticated SEM techniques (Cortina, Chen, &

    Dunlap, 2001), would follow the analytic approach that we outlined earlier, while also considering

    interactions involving the antecedent variable(s) and the moderator.

    The other combination of variables is referred to as moderated mediation (James & Brett, 1984). In

    this case, the moderator exerts its influence on the M → Y path in the X → M → Y sequence. For

    example, individuals' self-efficacy is likely to have a stronger relationship with their performance in

    weak or unconstrained environments, whereas strong environments or ones with abundant situational

    constraints would likely attenuate self-efficacy → performance relations. Here again, the standard

    evidential basis for establishing mediated relations is followed, except that the moderator's influence on tests

    involving the M → Y link is also considered. In short, the difference between the two types of

    combinations follows from whether the moderator exerts its influence on the X → M link (mediated

    moderation) or on the M → Y link (moderated mediation) in the X → M → Y sequence.
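
The two combinations can be written out as regression-style equations. This is a simplified sketch under assumed notation: W denotes the moderator, the coefficient labels are illustrative, and other specifications (for example, also including the XW term in the outcome equation) are possible.

```latex
% Mediated moderation: W moderates the X -> M link
\begin{align}
M &= a_0 + a_1 X + a_2 W + a_3 XW + e_M \\
Y &= b_0 + c' X + b_1 M + e_Y \\
\text{indirect effect of } X \text{ at a given } W &= (a_1 + a_3 W)\, b_1
\end{align}

% Moderated mediation: W moderates the M -> Y link
\begin{align}
M &= a_0 + a_1 X + e_M \\
Y &= b_0 + c' X + b_1 M + b_2 W + b_3 MW + e_Y \\
\text{indirect effect of } X \text{ at a given } W &= a_1\,(b_1 + b_3 W)
\end{align}
```

In both cases the indirect effect is no longer a single number but a function of W, which is why the interaction term must be tested alongside the usual mediational parameters.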

    Multiple mediators

    To this point we have been discussing relationships between antecedents and an outcome via a single

    mediator. However, there are many circumstances where multiple mediators may be in operation.

    There are two different varieties of multiple mediation. The first instance of multiple mediation simply

    involves a longer causal chain such as X → M1 → M2 → Y. For example, self-set goals have long been

    considered as a mediating mechanism linking self-efficacy and performance (Bandura & Locke, 2003).

    Consequently, the fully mediated relationship between normative information and performance that we

    illustrated would be transmitted through a self-efficacy → self-set goals → performance chain.

    Whether the relationship between self-efficacy and performance is partially or fully mediated by goals

    must be hypothesized and analyzed accordingly, as must the normative information → self-efficacy →

    self-set goals sequence. Analytic techniques to address the relative contribution of some

    X variable to some distal Y variable as transmitted by two (or more) sequential mediators are still

    evolving (see Shrout & Bolger, 2002). Nevertheless, the preconditions for testing mediational type

    inferences that we outlined would apply.

    The second form of multiple mediation concerns two or more stacked mediators. For example,

    Kohler and Mathieu (1993) advanced a model whereby individual resource variables and work-related

    perceptions were associated with different forms of absenteeism as mediated by three work attitudes

    (e.g., job satisfaction) and three forms of work stress (e.g., somatic tensions). These authors considered

    the work attitudes and stresses as co-occurring in the sense that they advanced no causal sequence

    among them. Kohler and Mathieu (1993) tested mediational relations using a block of mediators

    considered together as a set. More recently, Preacher and Hayes (2005) have advanced techniques that

    not only assess the extent to which blocks of such mediators convey indirect effects, but also enable

    researchers to differentiate the extent to which the collective indirect effects are attributable to each of

    the mediators considered.
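
Point estimates for both forms of multiple mediation follow from products of path coefficients. The sketch below assumes the standard product-of-coefficients definition; the function names and coefficient values are hypothetical, and no standard errors or significance tests are computed.

```python
from math import prod

def serial_indirect(paths):
    """Indirect effect along a causal chain such as X -> M1 -> M2 -> Y.

    paths: coefficients for each successive link in the chain.
    """
    return prod(paths)

def stacked_indirect(mediator_paths):
    """Specific and total indirect effects for co-occurring (stacked) mediators.

    mediator_paths: one (X -> M_k, M_k -> Y) coefficient pair per mediator.
    """
    specific = [a * b for a, b in mediator_paths]
    return specific, sum(specific)

# Hypothetical coefficients, for illustration only
chain = serial_indirect([0.5, 0.4, 0.3])            # X -> M1 -> M2 -> Y
specific, total = stacked_indirect([(0.5, 0.2), (0.3, 0.4)])
print(f"serial: {chain:.3f}; specific: {specific}; total: {total:.3f}")
```

Separating the specific indirect effects from their sum is what allows the collective indirect effect to be attributed to individual mediators.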

    Multi-level approaches

    Throughout this paper we have assumed that all variables of interest were indexed at the same level of

    analysis. However, mediational inferences can also be considered in the context of multi-level designs.

    Generally, multi-level designs come in two varieties: (1) nested entities; and (2) longitudinal

    approaches. In nested entity designs, some focal level-1 of analysis (e.g., individuals) is considered in

    the context of higher level-2 units (e.g., teams). In these designs, antecedents and mediators may

    emanate from different levels of analysis and combine to influence a lower-level criterion (see Mathieu

    & Taylor, in press). For example, team characteristics (X) may influence members' individual

    performances as mediated by level-2 team processes (M) or perhaps by their level-1 identification with

    the team.
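
One way to write the nested-entity case with a level-2 antecedent, a level-1 mediator, and a level-1 criterion is sketched below. The notation is assumed for illustration (individual i in team j, random intercepts only) and is not taken from the analyses in this paper.

```latex
\begin{align}
M_{ij} &= \gamma^{M}_{00} + a\, X_{j} + u^{M}_{0j} + r^{M}_{ij} \\
Y_{ij} &= \gamma^{Y}_{00} + c'\, X_{j} + b\, M_{ij} + u^{Y}_{0j} + r^{Y}_{ij} \\
\text{cross-level indirect effect} &= a\, b
\end{align}
```

Here X_j varies only between teams, whereas M_ij and Y_ij vary both within and between teams, so the interpretation of b depends on how the mediator is centered.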

    A second type of multi-level design is commonly referred to as within-subject (Judd, Kenny, &

    McClelland, 2001), growth-curve modeling (e.g., Bliese & Ployhart, 2002), or repeated measures (e.g.,

    Moskowitz & Hershberger, 2002) designs. In these designs, the lower level-1 variables are represented

    as repeated observations of the same unit of analysis (e.g., individual) over time. For example, one

    might consider the influence of level-2 individuals' personality traits (X) on their individual level-1

    performance over time (Y), as mediated by their work attitudes. Mediators (M) in this context could be

    relatively stable level-2 work attitudes (e.g., organizational commitment) assessed at a single point in

    time, or temporally changing level-1 variables (e.g., moods) assessed using time sampling techniques.

    Plenty of good work is currently being advanced along these lines (e.g., Judd et al., 2001; Kenny,

    Korchmaros, & Bolger, 2003). In summary, multi-level designs expand the scope of mediational

    inferences to incorporate relationships that reside within levels of analysis, traverse levels of analysis,

    and unfold over time.

    Conclusion

    Our goal for this paper was to revisit issues related to the validity of mediational inferences in

    organizational behavior. We sought to emphasize the inextricable ties between theory, design,

    measurement, and analysis related to such inferences. We also argued that indirect effects, partial

    mediation, and full mediation represent slightly different forms of intervening effects. We submitted

    that researchers should specify which they anticipate a priori, as each relies on slightly different types

    of statistical evidence. Our hope is that this paper provides a framework for future investigations. We

    also believe that this approach should provide a foundation upon which to expand and incorporate

    moderated relationships, more complex multiple mediation applications, and multi-level designs.

    Acknowledgements

    We thank Gilad Chen, Jodi Goodman, Kris Preacher, Jack Veiga, and Zeki Simsek for their helpful

    comments on an earlier version of this paper.

    Author biographies

    John E. Mathieu is a Professor and Cizik Chair of Management

    at the University of Connecticut. He received his PhD in Industrial/Organizational Psychology from

    Old Dominion University. He is a member of the Academy of Management and a Fellow of the Society

    of Industrial Organizational Psychology, and the American Psychological Association. His current

    research interests include models of team and multi-team processes, and cross-level models of

    organizational behavior.

    Scott R. Taylor is a PhD candidate in organizational behavior at

    the University of Connecticut. He received his MBA from the University of Virginia. He is a student

    member of the Academy of Management and Society of Industrial Organizational Psychology. His

    current research interests include team leadership and influence, multi-level models of organizational

    behavior, and multi-team systems.

    References

    Alwin, D. F., & Hauser, R. M. (1975). The decomposition of effects in path analysis. American Sociological Review, 40, 37–47.

    Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two step approach. Psychological Bulletin, 103, 411–423.

    Bandura, A. (1977). Social learning theory. New Jersey: Prentice-Hall.

    Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. Englewood Cliffs, NJ: Prentice Hall.

    Bandura, A., & Locke, E. A. (2003). Negative Self-efficacy and goal effects revisited. Journal of Applied Psychology, 88, 87–99.

    Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.

    Beauducel, A., & Wittmann, W. (2005). Simulation study on fit indices in confirmatory factor analysis based on data with slightly distorted simple structure. Structural Equation Modeling, 12, 41–75.

    Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.

    Bliese, P. D., & Ployhart, R. E. (2002). Growth modeling using random coefficient models: Model building, testing, and illustrations. Organizational Research Methods, 5, 362–387.

    Bollen, K. A. (1987). Total direct and indirect effects in structural equation models. In C. C. Clogg (Ed.), Sociological methodology (pp. 37–69). Washington DC: American Sociological Association.

    Bollen, K. A. (1989). Structural equations with latent variables. Oxford, England: Wiley.

    Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445–455.

    Button, S. B., Mathieu, J. E., & Zajac, D. M. (1996). Goal orientation in organizational research: A conceptual and empirical foundation. Organizational Behavior and Human Decision Processes, 67, 26–48.

    Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.

    Chen, G., Gully, S., Whiteman, J.-A., & Kilcullen, R. (2000). Examination of relationships among trait-like individual differences, state-like individual differences, and learning performance. Journal of Applied Psychology, 85, 835–847.

    Claessens, B. J. C., Eerde, W. V., Rutte, C. G., & Roe, R. A. (2004). Planning behavior and perceived control of time at work. Journal of Organizational Behavior, 25, 937–950.

    Collins, L. M., Graham, J. W., & Flaherty, B. P. (1998). An alternative framework for defining mediation. Multivariate Behavioral Research, 33, 295–312.

    Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton Mifflin.

    Cortina, J. M., Chen, G., & Dunlap, W. P. (2001). Testing interaction effects in LISREL: Examination and illustration of available procedures. Organizational Research Methods, 4, 324–360.

    Diefendorff, J. M. (2004). Examination of the roles of action-state orientation and goal orientation in the goal-setting and performance process. Human Performance, 17, 375–395.

    Fan, X., & Sivo, S. A. (2005). Sensitivity of fit indices to misspecified structural or measurement model components: Rationale of two-index strategy revisited. Structural Equation Modeling, 12, 343–367.

    Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention, and behavior: An introduction to theory and research. Reading, MA: Addison-Wesley.

    Frazier, P. A., Tix, A. P., & Barron, K. E. (2004). Testing moderator and mediator effects in counseling psychology research. Journal of Counseling Psychology, 51, 115–134.

    Hackman, J. R., & Oldham, G. R. (1980). Work redesign. Reading, MA: Addison-Wesley.

    Hagtvet, K. A., & Nasser, F. M. (2004). How well do item parcels represent conceptually defined latent constructs? A two-facet approach. Structural Equation Modeling, 11, 168–193.

    Hall, R. J., Snell, A. F., & Foust, M. S. (1999). Item parceling strategies in SEM: Investigating the subtle effects of unmodeled secondary constructs. Organizational Research Methods, 2, 233–256.

    Holmbeck, G. N. (1997). Toward terminological, conceptual, and statistical clarity in the study of mediators and moderators: Examples from the child-clinical and pediatric psychology literatures. Journal of Consulting and Clinical Psychology, 65, 599–610.

    Hom, P. W., & Kinicki, A. J. (2001). Toward a greater understanding of how dissatisfaction drives employee turnover. Academy of Management Journal, 44, 975–987.

    Hoyle, R. H., & Kenny, D. A. (1999). Statistical power and tests of mediation. In R. H. Hoyle (Ed.), Statistical strategies for small sample research. Newbury Park: Sage.

    Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.

    James, L. R. (1980). The unmeasured variable problem in path analysis. Journal of Applied Psychology, 65, 415–421.

    James, L. R., & Brett, J. M. (1984). Mediators, moderators, and tests for mediation. Journal of Applied Psychology, 69, 307–321.

    James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis: Assumptions, models and data. Newbury Park, CA: Sage.

    James, L. R., Mulaik, S. A., & Brett, J. M. (2006). A tale of two methods. Organizational Research Methods, 9(2), 233–244.

    Jöreskog, K., Sörbom, D., du Toit, S., & du Toit, M. (2000). LISREL 8: New statistical features. Lincolnwood, IL: Scientific Software International.

    Judd, C. M., & Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review, 5, 602–619.

    Judd, C. M., Kenny, D. A., & McClelland, G. H. (2001). Estimating and testing mediation and moderation in within-subjects designs. Psychological Methods, 6, 115–134.

    Kenny, D. A. (1979). Correlation and causality. New York: John Wiley & Sons.

    Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In S. F. D. Gilbert, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 1). Boston, MA: McGraw-Hill.

    Kenny, D. A., Korchmaros, J. D., & Bolger, N. (2003). Lower-level mediation in multilevel models. Psychological Methods, 8, 115–128.

    Kohler, S. S., & Mathieu, J. E. (1993). Individual characteristics, work perceptions, and affective reactions influences on differentiated absence criteria. Journal of Organizational Behavior, 14, 515–530.

    Landis, R. S., Beal, D. J., & Tesluk, P. E. (2000). A comparison of approac