
Apples and Oranges (and Pears, Oh My!): The Search for Moderators in Meta-Analysis

JOSE M. CORTINA
George Mason University

The purpose of this article is to review current practices with respect to detection and estimation of moderators in meta-analysis and to develop recommendations that are driven by the results of this review and previous research. The first purpose was accomplished through a review of the meta-analyses published in Journal of Applied Psychology from 1978 to 1997. Results show, first, that practices with respect to both the execution of and the reporting of results from searches for moderators are highly variable and, second, that findings relevant for detection of moderators (e.g., percentage variance attributable to artifacts, SDρ, etc.) are often highly inconsistent with what has been suggested in the past. These practices held regardless of time of publication, specificity of the question addressed in the paper, and content area. Detailed suggestions for modifications of current practices are offered.

Keywords: meta-analysis; moderators; review; second-order meta-analysis

Since its advent, meta-analysis has become the predominant form of literature review in areas such as psychology, education, and medicine. The detection and estimation of moderators is central to the interpretation of meta-analytic results in many cases. Moderators provide boundary conditions for the effects that are hypothesized, thus informing researchers of the situations in which the effects in question do and do not hold (Cortina & Folger, 1998). The identification of such boundary conditions, if they exist, is critical if meta-analytic results are to be generalized.

Given the amount of importance that our field tends to attach to meta-analytic findings, it is essential that we have agreed-upon mechanisms for dealing with moderators in meta-analytic studies. Nevertheless, a variety of authors have questioned the appropriateness of many of the methods that are typically used to identify moderators (e.g., James, Demaree, & Mulaik, 1986; James, Demaree, Mulaik, & Ladd, 1992; Sackett, Harris, & Orr, 1987). The purposes of this article are to review current practices with respect to detection of and testing for moderators in meta-analysis and to develop recommendations that are driven by the results of this review and previous research.

Author's Note: Thanks to Adam Winsler and Kim Eby for their helpful comments. Correspondence regarding this article should be sent to Jose M. Cortina, Department of Psychology, George Mason University, Fairfax, VA 22030.

Organizational Research Methods, Vol. 6 No. 4, October 2003 415-439. DOI: 10.1177/1094428103257358. © 2003 Sage Publications



The article is organized as follows: First, a brief description of previous research on the identification and estimation of moderators in meta-analysis is offered. Second, a review of meta-analyses published in the Journal of Applied Psychology is used to address the first purpose mentioned above. Third, the results of this review are evaluated in light of recommendations made in the past. Finally, a set of modified recommendations is offered as per the second stated purpose of the article.

    A Brief History of Moderator Detection and Estimation

Let us begin by considering the alternatives for detecting and estimating moderators. The term detection is used here to mean the recognition, often post hoc, that moderators of a given relationship appear to exist through consideration of variability in observed correlations, sample size, and other artifacts. A variety of detection options has been offered. Among them are the 75% rule (moderators are present if less than 75% of observed variance is attributable to artifacts), the Callender and Osburn (1981) bootstrap significance test for residual variability, the Hunter and Schmidt (1990) chi-square approximation (Q), the examination of lower credibility interval values, the comparison of effect size variability between and within categories (Schmidt, Hunter, & Caplan, 1981), the practical consideration of residual variability, and Marascuilo's (1971) U.
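To make the most commonly used of these options concrete, here is a minimal sketch of the 75% rule, assuming sampling error is the only artifact considered; the function name and example data are invented for illustration, and the sampling-error variance formula is the standard Hunter-Schmidt approximation.

```python
import numpy as np

def seventy_five_percent_rule(rs, ns, threshold=0.75):
    """Illustrative 75% rule check (a sketch, not code from any cited study).

    rs: observed correlations from k studies; ns: their sample sizes.
    Moderators are suspected if less than `threshold` of the observed
    variance in rs is attributable to sampling error alone.
    """
    rs, ns = np.asarray(rs, float), np.asarray(ns, float)
    r_bar = np.average(rs, weights=ns)                   # sample-size-weighted mean r
    var_obs = np.average((rs - r_bar) ** 2, weights=ns)  # weighted observed variance
    var_e = (1 - r_bar ** 2) ** 2 / (ns.mean() - 1)      # expected sampling-error variance
    pct = var_e / var_obs
    return pct, pct < threshold  # (proportion attributable, moderators suspected?)

# Example: correlations that plausibly come from two different populations
pct, suspect = seventy_five_percent_rule([.10, .15, .45, .50], [120, 80, 100, 90])
print(f"{pct:.0%} of observed variance attributable to sampling error; "
      f"moderators suspected: {suspect}")
```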

Various studies have evaluated one or more of these alternatives (e.g., Aguinis & Whitehead, 1997; Huffcutt & Arthur, 1995; Hunter & Schmidt, 1994; James et al., 1986; Johnson, Mullen, & Salas, 1995; Kemery, Mossholder, & Roth, 1987; Osburn, Callender, Greener, & Ashworth, 1983; Raju, Pappas, & Williams, 1989; Sackett et al., 1987; Spector & Levine, 1987; Switzer, Paese, & Drasgow, 1992; Whitener, 1990). Although the specific set of alternatives varies across these studies, as a whole they suggest the following.

First, the 75% rule is the most powerful of the commonly used moderator detection techniques, although its power is not high if the difference in population correlations is less than .2, or if k and mean N are small (e.g., Osburn et al., 1983; Sackett et al., 1986; Spector & Levine, 1987). Second, power for other procedures is often very low, especially for the 90% credibility value (CV) test. Third, the 90% CV test addresses a different question from other procedures insofar as its primary focus is on whether a credibility interval includes zero (Kemery et al., 1987). Fourth, Type I error is inflated in the 75% rule and in some other procedures but controlled for in the Callender and Osburn (1981), chi-square, and U procedures (Sackett et al., 1986; Spector & Levine, 1987). Fifth, estimation of residual standard deviation and the standard deviation of ρ depends on artifacts corrected for, appropriateness/accuracy of those corrections, estimation procedure (e.g., Hunter & Schmidt, 1994; Raju et al., 1989; Switzer et al., 1992), and consideration of outliers (e.g., Beal, Corey, & Dunlap, 2002; Huffcutt & Arthur, 1995; James et al., 1992; Raju et al., 1989).

In addition to these moderator detection techniques, there exist various alternatives for estimating specific moderator effects that have been coded for in the meta-analytic data set (see Steel & Kammeyer-Mueller, 2002, for a recent review). The most common method in the organizational sciences is the subgroup meta-analysis, in which studies are categorized according to some substantive or methodological attribute.


Meta-analyses are then conducted within categories. These subgroup meta-analyses can be followed up by t tests, informal consideration of percentage variance attributable to artifacts (e.g., Huffcutt, Roth, & McDaniel, 1996), Hedges and Olkin (1985) Q tests (e.g., Gerstner & Day, 1997), or more sophisticated techniques such as hierarchical linear modeling (van Eerde & Thierry, 1996). The primary limitation of subgroup analysis is the need to categorize continuous moderators. Steel and Kammeyer-Mueller (2002) reviewed techniques used to estimate moderators, whether continuous or categorical, in meta-analysis and found that weighted least squares (WLS) regression is the most accurate estimator of moderator effects under a variety of conditions. Unfortunately, this technique is seldom used in the organizational sciences.
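For readers unfamiliar with the WLS approach, the following sketch shows the basic idea: regress study-level effect sizes on a coded moderator, weighting each study by its sample size. It is a minimal illustration of the general technique Steel and Kammeyer-Mueller (2002) evaluated, not their implementation; the data and variable names are invented.

```python
import numpy as np

def wls_moderator_slope(rs, ns, moderator):
    """Weighted least squares regression of effect sizes on a moderator.

    rs: study correlations; ns: study sample sizes (used as weights);
    moderator: coded moderator value for each study.
    Returns [intercept, moderator slope].
    """
    rs, ns, m = (np.asarray(a, float) for a in (rs, ns, moderator))
    X = np.column_stack([np.ones_like(m), m])   # intercept + moderator column
    W = np.diag(ns)                             # weight matrix: study sample sizes
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ rs)
    return beta

# Hypothetical example: does interview structure (0 = unstructured,
# 1 = structured) moderate validity?
b0, b1 = wls_moderator_slope([.18, .22, .35, .41], [150, 90, 200, 120], [0, 0, 1, 1])
print(f"estimated validity = {b0:.2f} + {b1:.2f} * structure")
```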

Clearly, considerable guidance has been offered as to how to identify the presence and nature of moderators in meta-analyses. The next section of this article describes a review of meta-analyses published in the Journal of Applied Psychology (JAP). The purpose of this review is to address three critical questions. First, how have authors anticipated and planned for detection of moderators in their meta-analyses? Second, how have authors presented information relevant for the detection of moderators in their meta-analyses? Third, how has this information been interpreted?

    What Has Been Done? A Second-Order Meta-Analysis

To address the three issues raised above, meta-analyses published in JAP were coded for 18 variables. The variables coded for are listed in Table 1. Before moving to the description of the coding procedure, two points should be made. First, it is in no way my intention to cast work published in JAP in an unfavorable light. Much of the most important work on meta-analysis has been published there (e.g., Callender & Osburn, 1981; James et al., 1986; Johnson et al., 1995; Law, Schmidt, & Hunter, 1994a, 1994b; Raju & Burke, 1983; Sackett et al., 1987; Schmidt & Hunter, 1977; Spector & Levine, 1987; Switzer et al., 1992; Wanous, Sullivan, & Malinak, 1989; Whitener, 1990). Indeed, JAP is the ideal choice of target if for no other reason than because those who publish there are at least as likely to be well informed with respect to appropriate meta-analytic procedures as are those who publish in any other journal. In other words, if JAP authors have difficulty in choosing analysis/reporting strategies, it is likely that others will have similar difficulties. I revisit this issue in the Discussion section.

Second, it is recognized that not all relationships are moderated to a substantial degree by other variables. However, it is also the case that moderators unsought are likely to be moderators undetected. As was mentioned above, existing techniques for detecting the presence of moderators using variability of effect size values have considerable limitations. If the importance and generality that we typically attribute to meta-analytic results is to be justified, efforts to consider at least the possibility of moderators should be made. It is for this reason that both consideration of potential moderators a priori and attention to information suggesting the presence of moderators post hoc are important.

Coding. A total of 59 quantitative reviews (see the appendix) were included in the present analysis. These 59 reviews contained a total of 1,647 meta-analyses, some of which were subgroup meta-analyses (M = 27.45, range = 1 to 435). If the review containing what was by far the largest number of meta-analyses (Podsakoff, MacKenzie, & Bommer, 1996) were excluded, the mean number of meta-analyses per study would drop to 20.54 and the range to 1 to 120.


Analyses with and without the Podsakoff et al. (1996) values were very similar. Therefore, the analyses reported here include values from the Podsakoff et al. (1996) study. Although most of the analyses were conducted at the individual meta-analysis level (n = 1,647), some questions were also addressed at the paper level (n = 59).

Each of the 1,647 meta-analyses was coded for each of the 18 variables. My goal was to code for any attributes that reflect the choices made by the author with regard to moderators (e.g., were moderators specified a priori?), that influence the way that the reader might make sense of the results (e.g., were moderator-relevant values discussed?), or that speak to widely held assumptions about meta-analysis (e.g., how much variance was explained by artifacts?).

Although all of the items coded for were defined in an unambiguous manner, Items 7, 11, 12, 14, and 15 required some judgment. Thus, a second rater with experience in meta-analysis independently coded these items for 5 of the studies (188 meta-analyses) reviewed. Of the 940 judgments, there were 10 disagreements, all of which were resolved through additional discussion of the definition of Item 7.

Table 1
Variables Coded for in Second-Order Meta-Analysis

1. Was the percentage variance attributable to artifacts (or the information necessary to compute it) presented? It should be mentioned that many studies failed to present this value but did present observed variance, mean r, k, N, and artifact information that allowed the computation of the percentage values. These were also coded yes.
2. If so, what was the percentage value?
3. Was the residual standard deviation [i.e., sqrt(s²r − s²artifacts)] or the information necessary to compute it presented?
4. If so, what was the residual standard deviation? (This variable and the next were coded from meta-analyses based on correlations only.)
5. What was the standard deviation of ρ (i.e., residual standard deviation divided by the compound attenuation factor)?
6. What artifacts were corrected for?
7. Was the analysis zero-order, first-order, second-order, and so forth? For example, McDaniel, Whetzel, Schmidt, and Maurer (1994) examined the relationship between interviews and job performance, training performance, and tenure. These overall analyses were coded as zero-order analyses. Interviews were then broken down by amount of structure, and the analyses that resulted were coded first-order analyses. The structured interviews were broken down further by whether or not ability test information was available to the interviewer, and the analyses that resulted were coded second-order analyses.
8. Was the analysis the lowest order analysis included in the study, the second lowest, the third lowest, and so forth?
9. Was a criterion for concluding that a moderator exists specified?
10. If so, what was the criterion?
11. Were the data relevant for detection of moderators discussed?
12. Was a watered-down criterion used? That is, was a less stringent version of an existing procedure for detecting moderators used (e.g., a 60% rule)?
13. Were a priori moderators specified?
14. How many substantive moderators were specified?
15. How many methodological moderators were specified? Substantive moderators were psychological constructs of the sort that might be seen in a causal model. Methodological moderators were things like civilian versus military or administrative versus research purpose.
16. What procedures were used to test these moderators?
17. Were outliers removed?
18. What was the mean compound attenuation factor, computed either directly as the product of the individual attenuation factors or indirectly as the ratio of uncorrected to corrected r?


It should be noted that even Items 7, 11, 12, 14, and 15 required relatively little judgment because of the way that they were defined. For example, Item 11 asks, "Were the data relevant for detection of moderators discussed?" This may seem like it requires considerable judgment, but because a study was coded as having discussed these data if any interpretive text at all was offered, there was little basis for disagreement. The same can be said of the other items: Because the definitions were so specific, there was little room for disagreement.

    Results

Results are presented in three sections: Anticipation of and Planning for Moderators in Meta-Analyses, Presentation of Information Relevant for Moderator Detection, and Estimation and Interpretation of Moderator Information. Because of the amount of information contained in each section, they begin with a summary of the results that they contain. Each summary is followed by the specifics of the analyses that are relevant to them.

Anticipation of and planning for moderators in meta-analyses. Results of these analyses are presented in Table 2. Although the details of these results are described below, they can be summarized as follows. Approximately two thirds of meta-analyses included specific criteria for detection of moderators. Of these, the vast majority used either the percentage of variance attributable to artifacts or a chi-square test of the null hypothesis (σ²res = 0). A priori moderators were infrequent and were slightly more likely to be methodological. Post hoc moderators were almost always methodological.

With regard to specifics, the first variable relevant for the issue of anticipation and planning had to do with whether a particular criterion was specified for determining existence of moderators. Of the 1,647 meta-analyses, 1,054 (64%) specified such a criterion. Although the percentage was higher for analyses of zero-order relationships (88%), it dropped off sharply for more specific relationships (43%, 57%, and 9% for first-order, second-order, and third-order relationships, respectively).

Table 2
How Were Moderators Anticipated and Planned For?

Specified a criterion: 1,054/1,647 (64)
Of those that specified a criterion:
  75% rule: 598/1,054 (57)
  χ²: 201/1,054 (19)
  90% credibility value: 18/1,054 (2)
  Conjunctive combination: 105/1,054 (10)
  Disjunctive combination: 70/1,054 (7)
Of those that tested moderators:
  A priori moderators overall: 314/1,647 (19)
  A priori moderators for zero-order relationships: 136/675 (20)
  Number of substantive moderators tested: 443/1,072 (41)
  Number of methodological moderators tested: 629/1,072 (59)
Note: Percentages in parentheses.

Of the 1,054 analyses in which a criterion was specified, 57% specified the 75% rule, 19% specified a χ² of some kind (e.g., Q, U), 2% relied on a "credibility interval does not contain zero" rule, and 6% specified some other rule (e.g., Salgado, 1997: sampling error accounts for more than half of the observed variance). In addition, some of the studies that specified a criterion used two or more of the above in combination: 10% used what might be referred to as a "conjunctive combination" such that the existence of moderators was ruled out only if multiple criteria were met (e.g., Kozlowsky, Sagie, Krausz, & Singer, 1997: 75% rule, nonsignificant Q, 90% CV > 0). The final 7% used what might be referred to as a "disjunctive combination" such that the existence of moderators was ruled out if any of multiple criteria were met (e.g., Schmidt, Hunter, & Caplan, 1981: 75% rule or 90% CV > 0).

Given the research reviewed previously, we can conclude that the 57% that used the 75% rule, along with those that used a conjunctive combination, would have the most power to detect moderators but would also have the highest Type I error rates by a considerable margin. Those studies using the "credibility interval does not contain zero" rule or a disjunctive combination would have the least power and the lowest Type I error rates, and those studies using one of the χ² tests would be somewhere in between on both counts.

Studies were also coded as to whether they made reference to a watered-down version of a conventional criterion. Of those that reported a criterion, 7.4% (78 meta-analyses from five studies) made use of a scaled-down version of a conventional criterion. Watered-down criteria included a 60% rule, a 50% rule, and an informal comparison of within-group and between-group variance.

Also relevant for the issue of moderator anticipation is the extent to which a priori moderators were proposed. Certainly there is no rule stating that moderators must be present in a meta-analysis, but given the importance that is often attached to meta-analytic results and the lack of variance that is typically attributable to artifacts (see below), it is critical that opportunities to consider potential moderators not be missed. As can be seen at the bottom of Table 2, of the 1,647 meta-analyses coded, only 314 (19%) offered a priori moderator variables to be considered. Moreover, of the 675 zero-order analyses, only 136 (20%) offered a priori moderators. Thus, it would appear that the possibility of moderators is considered prior to examination of the data only infrequently, and such consideration is no more likely in meta-analyses of general topics than in meta-analyses of specific topics.

Finally, the nature of moderators that were tested was examined. Both a priori and post hoc moderators were included in this analysis; 1,072 moderators were examined in 419 meta-analyses. Of the 1,072 moderators, 443 were coded substantive and 629 were coded methodological. The ratio of substantive to methodological moderators differed depending on whether moderators were suggested a priori. When moderators were suggested a priori, 45% were substantive in nature, whereas the other 55% were methodological. When moderators were post hoc, only 15% were substantive. This finding highlights the importance of a priori moderator consideration. It may be that without such consideration, substantive moderators are likely to be missed.

Presentation of information relevant for moderator detection. One of the keys to understanding how authors go about investigating moderators is the presentation of moderator-relevant information. The focus in this article is on the percentage of variance attributable to artifacts, the residual standard deviation, and the discussion of values relevant for moderator detection.


Once again, many studies did not present the percentage of variance attributable to artifacts or the residual standard deviation but did provide mean correlations, observed variance in correlations, k, mean sample size, and relevant artifact information. This information could then be used to compute artifact variance, residual standard deviation, and the standard deviation of ρ (SDρ) using standard formulas presented in Hunter and Schmidt (1990). The residual standard deviation value was the square root of the difference between variance in observed values and variance expected from variability in artifacts. Because SDρ is estimated by the ratio of residual standard deviation to the mean compound attenuation factor and because the mean compound attenuation factor was almost always available from the ratio of uncorrected to corrected mean correlations if nowhere else, SDρ could almost always be computed when the residual standard deviation was available.
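The chain of computations just described is short enough to spell out. The sketch below recovers the three moderator-relevant values from the summaries most studies do report; it follows the Hunter and Schmidt (1990) logic paraphrased above, treats sampling error as the only artifact for brevity, and uses invented inputs.

```python
import math

def moderator_detection_values(mean_r_obs, mean_r_corr, var_obs, mean_n):
    """Recover percentage variance, residual SD, and SD_rho from the
    summaries typically reported (a sketch under simplifying assumptions).
    """
    var_e = (1 - mean_r_obs ** 2) ** 2 / (mean_n - 1)  # expected sampling-error variance
    pct_artifact = var_e / var_obs                     # share of observed variance from artifacts
    sd_res = math.sqrt(max(var_obs - var_e, 0.0))      # residual SD of observed correlations
    attenuation = mean_r_obs / mean_r_corr             # compound attenuation factor
    sd_rho = sd_res / attenuation                      # SD of rho
    return pct_artifact, sd_res, sd_rho

pct, sd_res, sd_rho = moderator_detection_values(.25, .35, .020, 150)
print(f"{pct:.1%} attributable to artifacts; SDres = {sd_res:.3f}; SDrho = {sd_rho:.3f}")
```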

Percentage information is presented in Table 3. This information can be summarized as follows: Percentage variance values and residual standard deviation values were omitted in approximately 22% of the 1,647 meta-analyses. Discussion of values relevant for the detection of moderators was presented in 37% of the 1,647 meta-analyses. This percentage is higher for the studies that specified a criterion for moderator detection but lower for analyses involving zero-order relationships. Finally, the vast majority of the 59 studies reviewed here omitted one of these pieces of information from one or more of the meta-analyses on which they report, and these omissions were not driven by content area.

With regard to specifics, percentage variance attributable to artifacts and the residual standard deviation (or the information necessary to compute them) were presented in 1,294 and 1,288 of the 1,647 meta-analyses, respectively (78.6% and 78.2%). Thus, although most studies reported this information, approximately 22% did not.

Studies were also coded on whether values relevant for detection of moderation were discussed, in which "discuss" simply means something more than a mere mentioning of the values. Of the 1,647 meta-analyses that were coded, only 604 (37%) provided discussion of values relevant for the detection of moderators. This percentage is higher for the studies that specified a criterion for moderator detection (526 out of 1,054, or 50%). Oddly, this percentage is lower for analyses involving zero-order relationships (187 out of 675, or 28%).

Table 3
How Have We Presented Information?

Was percentage variance information presented? 1,294/1,647 (78)
Was SDres information presented? 1,288/1,647 (78)
Were relevant values discussed? 604/1,647 (37)
Information presentation by study:
  1. Did not present percentage of variance attributable to sampling error: 16/59 (27)
  2. Did not present residual SD: 15/59 (25)
  3. Did not present specific moderator criterion: 34/59 (58)
  4. Did not present a priori moderators for zero-order analyses: 17/49 (35)
  5. Did not discuss relevant information: 38/59 (64)
  Did not discuss relevant information in zero-order analyses: 15/49 (31)
  Failed on 1, 2, 3, or 5 above: 48/59 (81)
Note: Percentages in parentheses.

Finally, it is possible that many of the between-analysis differences that have been identified so far may be better described as between-study differences. Each of the 59 meta-analysis studies reported on here was examined for each of five anticipation and reporting problems identified thus far: failure to report percentage variance attributable to artifacts, failure to report residual standard deviation, failure to specify a moderator criterion, failure to present a priori moderators for zero-order analyses, and failure to discuss relevant moderator information. Results of these analyses are presented at the bottom of Table 3.

As can be seen, about one quarter of the 59 studies failed to provide information on percentage variance attributable to artifacts in one or more of the meta-analyses that they contained. The same can be said for residual standard deviation. In addition, more than half of the 59 studies failed to specify a criterion for deciding whether moderators were present in one or more of the meta-analyses that they contained. Table 3 also shows that about one third of the 49 studies that contained zero-order meta-analyses failed to generate a priori moderators for one or more of those zero-order analyses. Because the zero-order relationships are the most general, they are the most likely to contain moderators. Thus, the fact that so many studies containing such relationships failed to generate moderators is disconcerting.

Almost two thirds of the 59 studies failed to discuss the values relevant for moderator detection. The situation is better for zero-order analyses, in which 31% of the studies that contained zero-order analyses failed to discuss relevant values for one or more of them.

The final value contained in Table 3 provides a more global picture of information presentation in the 59 studies; 81% of the 59 studies failed to do one or more of the following: present percentage of variance attributable to artifacts, present residual standard deviation, present a specific moderator criterion, or discuss relevant values. Thus, only rarely did a meta-analytic study present and discuss all of these sources of information critical to the detection of moderators.

Before concluding this section, it should be mentioned that the 59 studies were also coded for topic using the coding scheme adopted for the 16th Annual Conference of the Society for Industrial and Organizational Psychology (Society for Industrial and Organizational Psychology, 2000). Examination of study by topic suggested that presentation omissions and habits were not tied to particular content areas.

Estimation and interpretation of moderator information. Results with regard to estimation and interpretation, contained in Tables 4 and 5, can be summarized as follows. Relatively little variance is typically attributable to artifacts, leaving a sizable residual standard deviation and SDρ. When moderators are tested for, they are usually tested through subgroup analysis. Outlier removal is rare but simplifies the ruling out of moderators. Finally, examination of changes over time suggests that the questions addressed with meta-analysis have become more general with time but that the number of effect size values included has become smaller.

With regard to specifics, to address this issue we must first know something about the information being tested and interpreted. Thus, let us begin by examining levels of percentage variance attributable to artifacts, levels of residual standard deviation, and levels of SDρ. This is followed by examinations of the methods used to test for moderators, the treatment of outliers, and changes over time.


Table 4
Amount of Variance Attributable to Artifacts, Residual Standard Deviation Values, and SDρ Values

By analysis order:
                            Overall Mean(a)  Zero-Order Mean(b)  First-Order Mean  Second-Order Mean  Higher Order Mean
Percentage variance
  attributed to artifacts   21.7 (544)       19.27 (171)         24.80 (215)       21.0 (154)         25.51 (4)
Residual SD                 .122 (512)       .116 (149)          .126 (195)        .123 (164)         .078 (4)
SDρ                         .160 (502)       .186 (143)          .137 (191)        .179 (164)         .078 (4)

By artifacts corrected for in the meta-analysis:
                            SE Only     SE + ryy    SE + rxx ryy  SE + ryy + RR  SE + rxx ryy + RR  SE + rxx ryy + RR + z(c)  SE + Other Combination
Percentage variance
  attributed to artifacts   14.9 (197)  18.80 (21)  24.21 (49)    38.17 (38)     63.57 (146)        Insufficient data         27.4 (42)
Residual SD                 .128 (197)  .134 (21)   .102 (44)     .110 (38)      .159 (146)         Insufficient data         .093 (59)
SDρ                         .128 (197)  .175 (21)   .127 (44)     .182 (38)      .308 (143)         Insufficient data         .130 (59)

Note: k values are in parentheses; SE = sampling error.
a. As suggested by a reviewer, these are harmonic means as opposed to arithmetic means. Also, because some studies rounded percentage values to 100% and residual SD values to 0, precise values could not be computed for these studies. Thus, data from these studies were omitted from the harmonic mean computations.
b. Values in parentheses are the number of meta-analyses included in the analysis.
c. The additional artifact represented by z was typically correction for artificial dichotomization or unequal category sample sizes.

Table 5
Trends in Reporting and Interpretation Practices

Correlations(a) with the study number variable:
  Percentage variance attributable to sampling error: .184
  SDρ: .049
  k: .236
  Analysis order(b): .289
  Criterion specification: .126
  Discussion of relevant values: .355

Frequency by year block (in percentages):

1978-1984: variance information presented, 77.5; residual SD information presented, 75.0
  Artifacts corrected for: SE only, 25.4; ryy, 6.1; rxx ryy, 1.4; ryy RR, 22.1; rxx ryy RR, 37.9; rxx ryy RR + z, 0
  Criterion used: 75% rule, 45.5; χ², 33.8; 90% CV, 0; conjunctive, 0; disjunctive, 13.6

1985-1989: variance information presented, 82.8; residual SD information presented, 84.2
  Artifacts corrected for: SE only, 14.9; ryy, 29.0; rxx ryy, 31.7; ryy RR, 18.6; rxx ryy RR, 0; rxx ryy RR + z, 0
  Criterion used: 75% rule, 22; χ², 30.8; 90% CV, 0; conjunctive, 2.2; disjunctive, 45.1

1990-1994: variance information presented, 54; residual SD information presented, 50.1
  Artifacts corrected for: SE only, 20.7; ryy, 2.6; rxx ryy, 41.5; ryy RR, 1.8; rxx ryy RR, 13.8; rxx ryy RR + z, 13.3
  Criterion used: 75% rule, 26.1; χ², 42; 90% CV, 0; conjunctive, 21.7; disjunctive, 0

1995-1997: variance information presented, 89.5; residual SD information presented, 92.8
  Artifacts corrected for: SE only, 9.6; ryy, 4.2; rxx ryy, 76.4; ryy RR, 0; rxx ryy RR, 6.1; rxx ryy RR + z, 0
  Criterion used: 75% rule, 68; χ², 10.6; 90% CV, 2.6; conjunctive, 12.9; disjunctive, 0

Note: SE = sampling error; CV = credibility value.
a. Correlations involving percentage of variance and residual standard deviations are equal to −1 times the correlation between the study number variable and the reciprocal of the variable in question.
b. Analysis order refers to the specificity of the analysis. A lower order analysis is more general than a higher order analysis.

As is shown in Table 4, for those studies from which this information could be taken, the mean percentage variance value was 21.7.¹ The mean residual standard deviation value was .122, and the mean SDρ value was .160. Thus, contrary to claims by Hunter and Schmidt (1990) and others, a relatively small percentage variance is typically attributable to artifacts, and considerable variability remains after correction of variance for artifacts.

To examine this issue further, the 1,647 meta-analyses were broken down according to whether they were zero order, first order, second order, or higher order as defined in Table 1, with higher order analyses addressing more specific questions than lower order analyses. Mean percentage variance, residual standard deviation values, and SDρ values were then computed for each category. This information is also presented in Table 4. Although there is a slight tendency for the percentage variance attributable to artifacts to increase as the studies get more specific, there is no trend with respect to the standard deviations. Thus, it would appear that breaking studies down by potential moderators does little in the way of increasing our confidence in the notion that meta-analyses typically isolate those studies producing values sampled from a single population.

Studies were also broken down according to the artifacts for which they corrected. This information is also presented in Table 4. As can be seen, there was a general tendency for the percentage variance attributable to artifacts to increase as a function of the number of artifacts included. However, no combination of artifacts resulted in an average percentage higher than 64%. No clear pattern emerged for the standard deviations.

We can now ask about the tests that were actually conducted. Of the 419 meta-analyses in which moderators were suggested (either a priori or post hoc), 399 actually tested for them. In the other 20 meta-analyses, moderators were not tested because of a lack of relevant information. Of the 399 meta-analyses that tested for moderators, the vast majority (85%) used subgroup meta-analysis.

An additional issue relevant for interpretation and estimation is the treatment of outliers. Various authors have suggested that outliers be routinely removed from data sets so that the distributions conform better to expectations (e.g., Huber, 1981; Wilcox, 1997). One might feel, however, that this practice is suspect when it is known beforehand to increase the chances of finding results that correspond with hypotheses (Cortina & Gully, 1999). Nevertheless, some authors have suggested that this practice be extended to meta-analysis. To this end, Huffcutt and Arthur (1995) developed a procedure for determining the number of effect sizes in a meta-analysis that might be labeled outliers. If these outliers are removed, then the observed variance of effect sizes, and therefore the residual variance and variance of ρ, are certain to decrease. This in turn makes conclusions of cross-situational consistency easier to reach. Of course, another way of phrasing this is that the test for existence of moderators, which has been shown to be quite low in power for many situations, will be even lower in power if outliers are routinely removed.

The prevalence of outlier identification and removal in the 1,647 meta-analyses under consideration was examined. Outliers were removed from 91 of the 1,647 meta-analyses (5.5%). Although this is not a large percentage, the percentage variance attributable to artifacts and the residual standard deviation associated with the meta-analyses in which these values were reported and from which outliers had been removed were also examined. In the 44 relevant meta-analyses that reported information on variance attributable to artifacts and removed outliers, the mean percentage variance attributable to artifacts was 30.03%.


In the 33 relevant studies that reported residual standard deviation, the mean residual standard deviation value was .087, and the mean SDρ was .095. Thus, the mean percentage variance attributable to artifacts value is larger than the value for studies in general (21.7%), and the residual standard deviation and SDρ values are considerably smaller than their corresponding overall values (.122 and .160, respectively). Clearly, removal of outliers makes conclusions of cross-situational consistency easier to draw. Whether such ease is warranted (or desired) is another matter.

Finally, we might consider trends in estimation and interpretation over the 20-year period covered by this review. Trends were examined in two related ways. First, certain study characteristics were correlated with the chronological order of the study. Second, chronology was broken down into four groups: 1978-1984, 1985-1989, 1990-1994, and 1995-1997. These groups are of different sizes so that they might include similar numbers of meta-analyses. These results are presented in Table 5.

The percentage variance attributable to artifacts has actually decreased with time. This might be surprising given the expectation that meta-analyses would have gotten more specific over time. However, there also exist negative correlations between study number and both k and analysis order. This means that higher order analyses were more common in the past than they are now and that authors of more recent meta-analytic studies include fewer studies in their meta-analyses than did authors of less recent meta-analytic studies. It appears, therefore, that more recent authors are asking more general questions with meta-analysis and are answering them with fewer effect size values. Finally, later studies are somewhat more likely to specify a criterion for detecting moderators. However, later studies are also less likely to discuss information relevant for that detection.

Table 5 also contains descriptive data for the four time blocks mentioned earlier. The only trend that seems to emerge is that correction for range restriction is less common than it used to be. This may be due in part to the fact that whereas early meta-analyses often dealt with personnel selection situations in which range restriction was a critical factor, later meta-analyses focused on a variety of areas, some of which had little concern for range restriction. As for the Criterion column, the only trend appears to be that the low-power "disjunctive" combinations of criteria for detecting moderators have fallen out of favor.

    Summary

The preceding section of the article included an examination of the ways that researchers have dealt with moderators in their meta-analyses. The following conclusions can be drawn. First, authors usually fail to provide all information relevant for the detection of moderators. Second, a smaller percentage of variance in effect size values is attributable to artifacts than has been suggested in the past. Third, residual variability and variability in ρ are often considerable. Fourth, the 75% rule is still the most commonly used criterion for detecting moderators. Fifth, although post hoc moderators are usually methodological, about half of a priori moderators are substantive in nature. Sixth, subgroup meta-analysis is still the most common vehicle for estimating specific moderators. Finally, authors have come, over time, to address more general questions with fewer effect sizes, and authors have grown less inclined to discuss the values relevant for detection of moderators.



    Discussion

The purpose of the review described above was to examine how authors have gone about conducting meta-analyses. Although many of the practices uncovered by the review appear sound, others are misguided. Before moving on to recommendations, I should recognize that this article might be criticized for reviewing only studies published in JAP. In defense of this choice, I would point out that JAP is considered to be the flagship journal in industrial and organizational psychology. In addition, much of the meta-analysis wisdom in the organizational sciences has come from articles published there. In an attempt to further support the generalizability of these results, I examined the list of first authors of the 59 meta-analyses on which I report. Of the 48 first authors (some had multiple meta-analytic studies published in JAP), the overwhelming majority have published in other journals in the field (e.g., Personnel Psychology, Academy of Management Journal, Academy of Management Review, and Organizational Research Methods), and 17 have served as editorial board members and/or editors for journals such as these. Several of the authors of the meta-analyses included in the present review have published meta-analyses in these other journals, and there is no reason why the standards of these authors would be watered down for their JAP submissions. Thus, problems associated with meta-analyses published in JAP are unlikely to be unique to JAP.

In the remainder of this article, extensive recommendations with respect to the search for moderators in meta-analysis are offered. Of course, it is neither possible nor advisable to suggest a single method for dealing with moderators in meta-analysis. The purposes here are to examine the options, highlight the criteria by which the options might be judged, and consider the options in light of those criteria.

Recommendations. Although previous authors have made recommendations regarding the conduct of meta-analyses (e.g., Hedges & Olkin, 1985; Hunter & Schmidt, 1990; Orwin & Cordray, 1985; Slavin, 1984; Wanous et al., 1989), few have been specific to moderator identification and estimation. Certainly, there exists no single compendium of best practices for the treatment of moderators in meta-analysis. The results of the present study suggest that such recommendations are needed. The remainder of the article is devoted to such recommendations.

The recommendations that might be offered depend on the purpose of the meta-analysis. Is one's goal parameter estimation or hypothesis testing? In meta-analytic moderator terms, is one's goal cross-situational consistency or transportability? Kemery et al. (1987) made the distinction between validity generalization (also known as transportability) and cross-situational consistency. Cross-situational consistency is said to exist if true validity variance equals 0. Even if cross-situational consistency is not supported, however, transportability may still be possible. It may be the case that a group of sample correlations comes from one of two populations, the first with ρ = .3 and the second with ρ = .5. True validity variance is clearly greater than zero, but validity generalization is still possible as long as validity is defined in general terms.

If one's only interest lies in transportability, then a credibility interval that does not contain zero provides reasonable evidence. However, it is important to impose two additional conditions.


First, the meta-analysis must contain sufficient between-study variability on relevant variables to warrant conclusions of transportability. Second, the 95% credibility interval should be used. There is nothing sacred about 95%, but the same can be said for 80%, which is the interval referenced by the 90% credibility value often reported in meta-analyses. Given that 95% is traditional and that both are arbitrary, the only reason to make the low-power test for moderators less powerful by scaling it back to 80% is to allow us to generate values that we would prefer to see. This hardly seems a justification with scientific merit.
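A minimal sketch of this transportability check, assuming a normal distribution of ρ; the function name and example numbers are invented, not taken from the article.

```python
def credibility_interval(mean_rho, sd_rho, z=1.96):
    """Credibility interval around the mean corrected correlation.

    z = 1.96 gives the 95% interval recommended above; the 90% credibility
    value criticized in the text corresponds to an 80% interval (z = 1.28).
    """
    lo, hi = mean_rho - z * sd_rho, mean_rho + z * sd_rho
    return lo, hi, lo > 0  # lower bound above zero suggests transportability

lo, hi, transportable = credibility_interval(mean_rho=.30, sd_rho=.12)
print(f"95% credibility interval: [{lo:.3f}, {hi:.3f}]; excludes zero: {transportable}")
```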

If, instead, our goal is parameter estimation, then the search for moderators must be organized around the notion of cross-situational consistency. Cross-situational consistency suggests a definition of moderation that is more consistent with traditional definitions (e.g., a relationship between two variables that varies with a third variable as opposed to a relationship equaling zero in certain cases). As such, it is best approached through the development of a priori moderator hypotheses. These hypotheses should be tested through whichever means is most appropriate given the nature of the moderator and error constraints, regardless of the percentage of variance attributable to artifacts, the standard deviation of ρ, and so forth. There are two reasons for this suggestion. First, conclusions about moderators, as with all scientific conclusions, should be based on theory as well as data. Second, it is possible for distributions of observed effect size values to both mimic chance distributions and yield sizable subgroup differences. This issue is revisited at the end of the article.

Even if a priori moderator hypotheses are not available, however, the importance that is typically attached to meta-analytic findings makes the identification of moderators of paramount importance. As was mentioned in the introductory paragraphs, moderators unsought are likely to be moderators undetected. The proper beginning for such an examination is the specification of criteria for concluding that moderators are present. All available criteria have advantages and disadvantages. Consider first the most commonly used of these, the 75% rule. The procedure based on this ratio has been shown to have higher power values than do alternative procedures, along with a corresponding Type I error rate tradeoff (Sackett et al., 1986). However, as L. James and his colleagues have pointed out (James, Demaree, Mulaik, & Mumford, 1988; James et al., 1986), the procedure based on the 75% rule (or any percentage cutoff) contains a fallacy known as the "affirming the consequent" fallacy. The logic of the procedure is thus: If there were no situational specificity, then the variance ratio would be expected to exceed .75. The variance ratio exceeds .75. Therefore, there is no situational specificity. In more general terms: if A then B; B, therefore A. This is a bastardization of either the Modus Ponens or Modus Tollens rules of syllogistic reasoning. The form of Modus Ponens is the following: If A then B; A, therefore B. The form of Modus Tollens is as follows: If A then B; not B, therefore not A. The logic implied by the 75% rule contains incompatible components of both of these rules and is therefore fallacious. The antecedent in a statement of implication is only one of many possible avenues to the consequent, so the latter does not imply the former. With regard to the 75% rule, there are many possible avenues to a variance ratio exceeding .75. One is a lack of moderators, but there are others, such as small average sample sizes (James et al., 1988), lack of between-study variability on a given moderator, and relatively small differences between population effect sizes. Thus, it is entirely possible to suffer the consequences of this fallacious reasoning by concluding that there was no moderator when in fact there was.


The χ² tests do not suffer from the same logical problems. Their logic can be summarized as follows: If SDres (or SDρ) were equal to 0, then the amount of variability in observed effect sizes would probably not exceed a certain value. The amount of variability in observed effect sizes does exceed that value. Therefore, SDres probably does not equal zero. This is a probabilistic version of the Modus Tollens rule. The probabilistic nature of these statements can cause problems, but this is not especially likely (Cortina & Dunlap, 1997). Nevertheless, the χ² tests do have more serious drawbacks. First, they tend to be low in power. Sackett et al. (1986) showed that although the procedure based on a 90% rule (which is sometimes adopted when many artifacts are corrected for) has low power for many situations, the χ² test has adequate power only if the difference between population effect sizes is large or if the difference is moderate and the number of effect sizes is large.

The other, related problem with the χ² test is that it does, in some sense, penalize the researcher with the good design. Assuming that one wishes to generalize the mean effect size value obtained in a given meta-analysis, evidence of moderation is not desired. However, the power of the χ² test to detect moderators increases with N and k. Thus, the best way to avoid detection of moderators is to meta-analyze a small number of small sample studies. This hardly seems a justifiable practice.
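The dependence of power on N and k is easy to demonstrate by simulation. The sketch below is my own illustration, not an analysis from the article: it draws study correlations from two populations, applies a common Q-style chi-square statistic, and reports rejection rates. The exact statistic, population values, and sample sizes are all assumptions made for the demonstration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def q_test_power(k, n, rhos=(.25, .45), reps=2000):
    """Monte Carlo estimate of the chi-square (Q) test's power to flag
    heterogeneity, using the asymptotic sd of r, (1 - rho^2)/sqrt(n - 1)."""
    rejections = 0
    for _ in range(reps):
        pop = rng.choice(rhos, size=k)  # each study draws a population rho
        rs = rng.normal(pop, (1 - pop ** 2) / np.sqrt(n - 1))
        r_bar = rs.mean()
        q = (n - 1) * np.sum((rs - r_bar) ** 2) / (1 - r_bar ** 2) ** 2
        rejections += stats.chi2.sf(q, df=k - 1) < .05
    return rejections / reps

for k, n in [(5, 50), (5, 200), (30, 200)]:
    print(f"k={k:2d}, N={n:3d}: power approx. {q_test_power(k, n):.2f}")
```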

So, what should meta-analysts do? The 75% rule, the χ² test, and all other criteria have advantages and disadvantages. The χ² test and, to a lesser degree, the 75% rule often have very low power. The 75% rule also has the logical problems described above. Given the importance of identifying boundary conditions for the meta-analytic values to which we tend to attach a great deal of weight, the failure to detect moderators when they do in fact exist (i.e., a Type II error) seems especially egregious. One way to improve the power to detect moderators would be to adopt a more conjunctive approach. For example, logical problems notwithstanding, one might base conclusions regarding moderators on multiple criteria, such as the percentage of variance attributable to artifacts, the χ² test, and SDρ, and do so in the following way. If the predetermined percentage variance cutoff is met and the χ² test is nonsignificant (i.e., a conjunctive rule), then one may conclude that moderators are not present. This will result in a more powerful search for moderators. If such an approach were adopted, it would be wise to then examine the SDρ value. The reason for this is that there are cases in which a small percentage variance can be attributed to artifacts, but SDρ is, in fact, quite small. For example, a meta-analysis reported in Ones, Viswesvaran, and Reiss (1996) resulted in a percentage variance value of 73%, but the standard deviation of ρ was only .034, resulting in a very small credibility interval. Any selection of a cutoff for SDρ is somewhat arbitrary, but .05 would satisfy common practical criteria. The 95% credibility interval that results has a width of about .2 correlation units, and many of the conclusions that are drawn are only slightly affected by correlation differences of .2.
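One way to picture the conjunctive approach is as a small decision function. The sketch below is my rendering of the logic just described, not Cortina's own procedure; the Q-style chi-square form, the parameter names, and the example inputs are assumptions.

```python
import math
from scipy import stats

def conjunctive_moderator_rule(var_obs, var_artifact, k, total_n, mean_r,
                               attenuation, pct_cut=.75, sd_rho_cut=.05):
    """Rule moderators out only if BOTH the percentage-variance cutoff is met
    AND a chi-square homogeneity test is nonsignificant, then sanity-check
    SD_rho against the practical .05 cutoff discussed above."""
    pct = var_artifact / var_obs
    q = total_n * var_obs / (1 - mean_r ** 2) ** 2  # approx. chi-square, df = k - 1
    p = stats.chi2.sf(q, df=k - 1)
    sd_rho = math.sqrt(max(var_obs - var_artifact, 0.0)) / attenuation
    no_moderators = pct >= pct_cut and p > .05      # conjunctive: both must pass
    return no_moderators, sd_rho < sd_rho_cut, sd_rho

# Hypothetical inputs: 24 studies, 3,600 total N, 60% of variance from artifacts
print(conjunctive_moderator_rule(var_obs=.02, var_artifact=.012, k=24,
                                 total_n=3600, mean_r=.25, attenuation=.71))
```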

This is not to say that an approach involving the percentage variance attributable to artifacts, the χ² test, and SDρ is the single best approach. Indeed, subsequent research similar to that conducted by Sackett et al. (1987) and Spector and Levine (1987) should be conducted in an attempt to determine the approaches that minimize error to the greatest degree. If, however, one believes that the level of power associated with commonly used techniques is unacceptable, then some alternative must be employed.

The SDρ might be put to other uses as well. The SDρ represents the standard deviation of population correlations. Unfortunately, it is difficult to point to any given SDρ value and conclude that moderators need not be pursued.


A similar problem arises in factor analysis with respect to the determination of the number of factors present and in structural equation modeling with respect to model fit. One of the common conventions for deciding which factors or components to retain in a factor or component analysis is the eigenvalue > 1 convention. This convention stems from work by Kaiser (1960) and Guttman (1954) in which it was shown that a factor with an eigenvalue = 1 has a reliability of zero. Thus, this can be thought of as a "point of no return" criterion such that we may not think much of factors with eigenvalues only slightly larger than 1, but we can rest assured that any factor whose eigenvalue is not greater than 1 should not be retained. Although it can certainly be argued that Kaiser's criterion is overextended (Cattell & Vogelmann, 1977; Fabrigar, Wegener, MacCallum, & Strahan, 1999; Zwick & Velicer, 1986), it is useful as a baseline value for ruling out factors/components.

In structural equation modeling, indices of model fit, such as the normed fit index (Bentler & Bonett, 1980), involve a comparison of the residual matrix of the hypothesized model to the residual matrix of a null model.² Although the definition of "null" can be debated (Mulaik, James, Van Alstine, Bennett, Lind, & Stillwell, 1989), the most common null model is one in which all linkages between observed and latent variables, and all linkages among latent variables, are set to zero. The null model, therefore, represents a worst case scenario. Model fit is then a function of the decrease in residuals as one goes from this worst case scenario to a hypothesized model. Any hypothesized model that fails to improve substantially on the null model has little to recommend it.

It would be useful to have an "apples and oranges" moderator detection statistic analogous to the eigenvalue = 1 or the null model residual matrix. There appears to be no way of generating values from an assumed distribution of correlations because there is no basis for any particular assumption. A uniform distribution is inappropriate because extreme effect sizes are rare, and there is no compelling argument for any other distribution either. As a result, a set of "point of no return" values was empirically generated. The intent was to randomly sample effect size values and meta-analyze them, thereby generating percentage variance attributable to artifacts values, residual standard deviation values, and SDρ values that represent the quintessential apples-and-oranges meta-analysis.

To this end, one correlation value from each of the first two empirical, primary studies reported in the first issue of each volume of JAP from 1978 to 1997 was selected, resulting in 40 correlations. For each study, a correlation table was randomly selected (if there were multiple tables), and a correlation was randomly selected from the table. The only substantive exclusion criterion was that the value could not represent the correlation between conceptually identical variables (e.g., test-retest reliabilities). It should also be noted that variables were rarely repeated across correlations.

The resulting values were as follows. The observed variance in correlations was .0378, whereas the variance attributable to sampling error was .0022. Hunter and Schmidt (1990) claimed that the vast majority of variance attributable to artifacts is attributable to sampling error. Given that the formula for variance attributable to artifacts other than sampling error is the product of the square of the mean ρ value, the square of the compound attenuation factor, and the sum of the coefficients of variation for the individual attenuation factors (Hunter & Schmidt, 1990, p. 176), the variance attributable to artifacts other than sampling error will seldom exceed .001.


For example, the average amount of variance attributable to artifacts other than sampling error in the Ones et al. (1996) study was .0002. In any case, it is possible to estimate roughly the variance in the 40 observed correlations attributable to artifacts other than sampling error by using the mean sample-size-weighted correlation, information from the review of JAP meta-analyses, and assumed artifact distributions to fill in the values in the formula just mentioned. The average compound attenuation factor from the review of meta-analyses was .71, and this value can be used to correct the sample-size-weighted mean of the 40 correlations for attenuation (.094/.71 = .132). The square of this value provides the first component of the formula. The square of the average compound attenuation factor provides the second component. Although the relevant data for the third component were not coded for, a conservative estimate of .0368, which is the mean of the V values for the assumed distributions involving ryy = .6, rxx = .8, and ratios of restricted to unrestricted standard deviations of either 1.0 or .59, can be taken from Pearlman, Schmidt, and Hunter (1980). This last value is, nevertheless, always small and of little consequence.

The three values needed to estimate variance attributable to artifacts other than sampling error are, therefore, .132², .71², and .0368. Their product, .0003, provides the required value and is consistent with the notion that at least 90% of the variance attributable to artifacts is attributable to sampling error.

Thus, residual variance is estimated as .0378 − .0022 − .0003 = .0353, the residual standard deviation is estimated as .19, and 6.6% of the variance in correlations is attributable to artifacts. If we divide .19 by the compound attenuation factor from the review of meta-analyses, we have an SDρ value of .265 and, therefore, a 95% credibility interval value of 1.96 × .265 = .519. These values are far from being out of reach. For example, 21 of the 59 studies reviewed earlier contained a total of 148 meta-analyses that yielded residual standard deviation values that equaled or exceeded the baseline value of .19. Other, similar analyses would likely yield different baseline values, but there is no reason to expect large discrepancies (a study is currently underway to investigate this issue). Thus, these values might be used as baseline values such that if our residual standard deviation is no smaller than .19 (or our SDρ value is no smaller than .265), then the mean correlation must be regarded as uninterpretable because it is a mean of sample values that are no less discrepant than would be values taken from k populations.
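Because several numbers feed into this baseline, it may help to see the arithmetic laid out end to end. The script below simply reproduces the computations described in the preceding paragraphs, rounding the small artifact term to .0003 as the text does; it introduces no values beyond those reported above.

```python
import math

var_obs = .0378                 # observed variance of the 40 sampled correlations
var_se = .0022                  # variance attributable to sampling error
mean_r, a = .094, .71           # weighted mean r; mean compound attenuation factor
rho_bar = mean_r / a            # corrected mean: .094 / .71 = .132
v = .0368                       # assumed artifact-distribution term (Pearlman et al., 1980)
var_other = round(rho_bar**2 * a**2 * v, 4)  # artifacts other than sampling error = .0003
var_res = var_obs - var_se - var_other       # .0353
sd_res = math.sqrt(var_res)                  # about .19
pct = (var_se + var_other) / var_obs         # 6.6% attributable to artifacts
sd_rho = sd_res / a                          # about .265
half_width = 1.96 * sd_rho                   # about .519
print(f"SDres = {sd_res:.2f}; SDrho = {sd_rho:.3f}; "
      f"pct = {pct:.1%}; 1.96*SDrho = {half_width:.3f}")
```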

Note that the purpose served by this baseline value is different from that served by the .05 recommendation made earlier. The baseline value is an absolute maximum in the same way that the Kaiser criterion or the null model residual matrix is an absolute minimum. The effect size estimate from a meta-analysis that yields a residual standard deviation value that is slightly smaller than .19 or an SDρ value that is slightly smaller than .265 may well be uninterpretable, but these values can act as hard cut points. The .05 suggestion is meant to lie at the other end of this continuum such that if SDρ is less than .05, then there is little empirical reason to be concerned about interpretation of effect size estimates.

The previous several paragraphs have dealt with issues of detecting moderators. Next, there is the issue of presentation. It should go without saying that all values relevant for decisions relating to moderators should be reported. Specifically, every meta-analysis should include some combination of the following: observed variance of effect size values, residual variance of effect size values, percentage variance attributable to sampling error, percentage variance attributable to each other artifact considered in the study, corrected and uncorrected effect size values, compound attenuation factors, and confidence and credibility intervals. If space permits, then all this information should be presented. At the very least, a subset should be presented that allows the computation of the remaining values in the list (e.g., presentation of mean corrected and uncorrected effect size values allows computation of the compound attenuation factor).
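As an illustration of how one reported subset allows computation of the rest, a hypothetical helper (the function name is mine, not the article's) recovers the compound attenuation factor from the two mean effect size values:

    # Recover the compound attenuation factor when only mean corrected and
    # uncorrected effect sizes are reported.
    def compound_attenuation(mean_r_uncorrected, mean_r_corrected):
        return mean_r_uncorrected / mean_r_corrected

    compound_attenuation(0.094, 0.132)   # ~ .71, as in the example above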

The number of studies (k), N, and the criterion used to determine the existence of moderators should also be included. Also included should be evidence of a priori consideration of moderators. It may be that no plausible moderators present themselves. Nevertheless, the importance that is attached to meta-analytic findings makes paramount the concern over the identification of boundary conditions, and these cannot be identified if they are never considered. This is very important for zero-order analyses and less so for higher order analyses. Of course, this implies that studies to be included in meta-analyses must be coded for information relevant to potential moderator variables.

If there is empirical evidence of moderators, then above all else, this evidence should be discussed. It should be dismissed as unimportant only if there are overwhelming theoretical and empirical reasons to do so. Theoretical reasons may be rare because they would involve a hypothesis of no effect and would contradict the data. Empirical reasons are possible in the form of outliers. As was mentioned earlier, there are a variety of opinions on the subject of outliers and what to do when they have been identified. There should, nevertheless, be some substantive (as opposed to purely empirical) grounds for deletion of an outlier, even if those grounds are discovered after the fact. The grounds for deletion should be presented, and results should be generated and discussed with and without the outliers in question.
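A hedged illustration of this practice, with made-up correlations, simply computes summaries with and without flagged values; a real analysis would weight by sample size and use a formal outlier statistic (e.g., Huffcutt & Arthur, 1995).

    import statistics

    # Hypothetical study-level correlations; .55 is flagged on substantive grounds.
    effect_sizes = [0.12, 0.15, 0.10, 0.14, 0.55]
    flagged = {0.55}

    with_outliers = statistics.mean(effect_sizes)                                     # 0.212
    without_outliers = statistics.mean(r for r in effect_sizes if r not in flagged)   # 0.1275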

As a last note on presentation, it is wise to use the phrase "percentage variance attributable to artifacts" rather than the phrase "percentage variance accounted for by artifacts." The latter suggests that artifacts do account for observed variance when in fact we do not know if this is true. We choose to attribute variance to artifacts or not.

Finally, there are the issues of estimation and interpretation. The vast majority of previous meta-analyses published in JAP have used one of two procedures for testing moderator hypotheses: subgroup meta-analysis and correlation/regression. Although more research needs to be conducted in which additional procedures are developed and in which these and other procedures are compared, it is possible to extrapolate from the multiple groups analysis versus product term dichotomy in the structural equation literature. Multiple groups analysis, such as subgroup meta-analysis, is particularly useful when the moderating variable is categorical. Inclusion of product terms, not unlike the correlation of moderators with effect size values, is particularly useful when the moderating variable is continuous. This is not to say that each method cannot be modified to accommodate different sorts of variables (e.g., median splits of continuous variables, dummy coding of categorical variables). It is enough to say that each approach is particularly suited to one situation and that careful consideration of the characteristics of the variables involved should dictate the choice of method.
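To make the dichotomy concrete, a rough sketch with hypothetical data follows; it illustrates the two strategies named above and is not a procedure prescribed here.

    import numpy as np

    # Hypothetical study-level data: correlations, sample sizes, one categorical
    # and one continuous moderator.
    r = np.array([0.10, 0.12, 0.30, 0.33])
    n = np.array([100, 150, 120, 90])
    setting = np.array(["lab", "lab", "field", "field"])   # categorical moderator
    year = np.array([1985, 1990, 1995, 2000])              # continuous moderator

    # Subgroup meta-analysis: N-weighted mean correlation within each category.
    for g in np.unique(setting):
        m = setting == g
        print(g, np.average(r[m], weights=n[m]))

    # Correlation approach: N-weighted correlation of the moderator with effect sizes.
    def weighted_corr(x, y, w):
        mx, my = np.average(x, weights=w), np.average(y, weights=w)
        cov = np.average((x - mx) * (y - my), weights=w)
        return cov / np.sqrt(np.average((x - mx) ** 2, weights=w)
                             * np.average((y - my) ** 2, weights=w))

    print(weighted_corr(year, r, n))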

There remains the issue of interpretation when evidence of moderators exists, but the limitations of the data set preclude the estimation of further moderator effects. It is assumed here that the theoretical evidence suggests that a nonzero effect should exist. Just as caution must be exercised when interpreting main effects in primary studies in the presence of an interaction, so must caution be exercised when interpreting estimated ρ and δ values in the presence of evidence that more than one such true value exists. Indeed, in such a case, it may be wise to avoid specific conclusions and fall back on more general, transportability-based conclusions, leaving parameter estimation to future research. This is a reasonable strategy if the estimated effect size value is large. If, however, the estimated effect size value is small such that the 95% credibility interval contains values that are trivially different from zero, then the results should probably be labeled inconclusive.

To summarize, the following steps in anticipating moderators, presenting relevant information, and testing and interpreting such information in meta-analysis are suggested:

1. Identify the goal of the meta-analysis (parameter estimation vs. hypothesis testing).
2. Identify possible moderators a priori for as many of the relationships to be examined as possible and code for them.
3. Choose a decision rule, taking into account error rates, practical significance, and so forth. Conjunctive strategies may work well in this regard (although more research is needed; see the sketch following this list). Conjunctive strategies may not control for Type I error in a precise way. Nevertheless, they are also less likely to let important moderators slip through the cracks.
4. Report all relevant values, including observed variances, residual standard deviation/variance values, percentage observed variances attributable to artifacts, confidence intervals, and credibility intervals. Also include corrected and uncorrected effect size values, N, k, and any additional attenuation information that might be useful.
5. Compare relevant values to decision criteria and, if necessary, to baseline values.
6. Discuss the comparisons and the values.
7. Examine outliers and discard only if there are overwhelming empirical and substantive reasons for doing so. There is little to be lost by presenting results with and without outliers.
8. Choose a strategy for estimating moderator effects based on the nature of the moderators.
9. Interpret any population effect size values for which substantial variance remains unexplained with caution, pointing out that further research may be required to uncover the variables causing observed variability in effect sizes.
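As one possible reading of step 3, the following sketch combines the classic Hunter-Schmidt 75% rule with the .05 SDρ criterion discussed earlier; the particular conjunction is an assumption of this illustration, not a rule given in this article.

    # One possible conjunctive decision rule for flagging a moderator search.
    def moderator_search_warranted(pct_var_artifacts, sd_rho):
        fails_75_rule = pct_var_artifacts < 0.75   # artifacts explain < 75% of variance
        large_sd_rho = sd_rho >= 0.05              # exceeds the .05 criterion
        return fails_75_rule and large_sd_rho      # both must point toward moderators

    moderator_search_warranted(0.066, 0.265)   # -> True for the baseline example above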

It is hoped that such prescriptions will lead to better practice of meta-analysis. Given the importance typically attributed to meta-analytic findings, such practice is essential.

Appendix
Quantitative Reviews Included in the Analysis

Bothwell, R. K., Deffenbacher, K. A., & Brigham, J. C. (1987). Correlation of eyewitness accuracy and confidence: Optimality hypothesis revisited. Journal of Applied Psychology, 72, 691-695.
Brown, S. F. (1981). Validity generalization and situational moderation in the life insurance industry. Journal of Applied Psychology, 66, 664-670.
Burke, M. J., & Day, R. (1986). A cumulative study of the effectiveness of managerial training. Journal of Applied Psychology, 71, 232-245.
Carsten, J. M., & Spector, P. E. (1987). Unemployment, job satisfaction, and employee turnover: A meta-analytic test of the Muchinsky model. Journal of Applied Psychology, 72, 374-381.
Conway, J. M., Jako, R. A., & Goodman, D. F. (1995). A meta-analysis of interrater and internal consistency reliability of selection interviews. Journal of Applied Psychology, 80, 565-579.
Driskell, J. E., Copper, C., & Moran, A. (1994). Does mental practice enhance performance? Journal of Applied Psychology, 79, 481-492.
Driskell, J. E., Willis, R. P., & Copper, C. (1992). Effect of overlearning on retention. Journal of Applied Psychology, 77, 615-622.
Finkelstein, L. M., Burke, M. J., & Raju, N. S. (1995). Age discrimination in simulated employment contexts: An integrative analysis. Journal of Applied Psychology, 80, 652-663.
Fisher, C. D., & Gitelson, G. (1983). A meta-analysis of the correlates of role conflict and ambiguity. Journal of Applied Psychology, 68, 320-333.
Fried, Y. (1991). Meta-analytic comparison of the job diagnostic survey and job characteristics inventory as correlates of work satisfaction and performance. Journal of Applied Psychology, 76, 690-697.
Gaugler, B. B., Rosenthal, D. B., Thornton, G. C., & Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72, 493-511.
Gerstner, C. R., & Day, D. V. (1997). Meta-analytic review of leader-member exchange theory: Correlates and construct issues. Journal of Applied Psychology, 82, 827-844.
Hattrup, K., Rock, J., & Scalia, C. (1997). The effects of varying conceptualizations of job performance on adverse impact, minority hiring, and predicted performance. Journal of Applied Psychology, 82, 656-664.
Hom, P. W., Caranikas-Walker, F., Prussia, G. E., & Griffeth, R. W. (1992). A meta-analytical structural equations analysis of a model of employee turnover. Journal of Applied Psychology, 77, 890-909.
Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581-595.
Huffcutt, A. I., & Arthur, W. (1994). Hunter and Hunter (1984) revisited: Interview validity for entry-level jobs. Journal of Applied Psychology, 79, 184-190.
Huffcutt, A. I., Roth, P. L., & McDaniel, M. A. (1996). A meta-analytic investigation of cognitive ability in employment interview evaluations: Moderating characteristics and implications for incremental validity. Journal of Applied Psychology, 81, 459-473.
Hunter, J. E., Schmidt, F. L., & Judiesch, M. K. (1990). Individual differences in output variability as a function of job complexity. Journal of Applied Psychology, 75, 28-42.
Kozlowsky, M., Sagie, A., Krausz, M., & Singer, A. D. (1997). Correlates of employee lateness: Some theoretical considerations. Journal of Applied Psychology, 82, 79-88.
Kraiger, K., & Ford, J. K. (1985). A meta-analysis of ratee race effects in performance ratings. Journal of Applied Psychology, 70, 56-65.
Lee, R. T., & Ashforth, B. E. (1996). A meta-analytic examination of the correlates of the three dimensions of job burnout. Journal of Applied Psychology, 81, 123-133.
Loher, B. T., Noe, R. A., Moeller, N. L., & Fitzgerald, M. P. (1985). A meta-analysis of the relation of job characteristics to job satisfaction. Journal of Applied Psychology, 70, 280-289.
Lord, R. G., DeVader, C. L., & Alliger, G. M. (1986). A meta-analysis of the relation between personality traits and leadership perceptions: An application of validity generalization procedures. Journal of Applied Psychology, 71, 402-410.
Mabe, P. A., & West, S. G. (1982). Validity of self-evaluation of ability: A review and meta-analysis. Journal of Applied Psychology, 67, 280-296.
Martocchio, J. J., & O'Leary, A. M. (1989). Sex differences in occupational stress: A meta-analytic review. Journal of Applied Psychology, 74, 495-501.
McDaniel, M. A., Schmidt, F. L., & Hunter, J. E. (1988). Job experience correlates of job performance. Journal of Applied Psychology, 73, 327-330.
McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79, 599-616.
McEvoy, G. M., & Cascio, W. F. (1985). Strategies for reducing employee turnover: A meta-analysis. Journal of Applied Psychology, 70, 342-353.
McEvoy, G. M., & Cascio, W. F. (1989). Cumulative evidence of the relationship between employee age and job performance. Journal of Applied Psychology, 74, 11-17.
Mitra, A., Jenkins, G. D., & Gupta, N. (1992). A meta-analytic review of the relationship between absence and turnover. Journal of Applied Psychology, 77, 879-889.
Murphy, K. R., & Balzer, W. K. (1989). Rater errors and rating accuracy. Journal of Applied Psychology, 74, 619-624.
Narby, D. B., Cutler, B. L., & Moran, G. (1993). A meta-analysis of the association between authoritarianism and jurors' perceptions of defendant culpability. Journal of Applied Psychology, 78, 34-42.
Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81, 660-679.
Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of Applied Psychology, 78, 679-703.
Pearlman, K., Schmidt, F. L., & Hunter, J. E. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373-406.
Podsakoff, P. M., MacKenzie, S. B., & Bommer, W. H. (1996). Meta-analysis of the relationships between Kerr and Jermier's substitutes for leadership and employee job attitudes, role perceptions, and performance. Journal of Applied Psychology, 81, 380-399.
Premack, S. L., & Wanous, J. P. (1985). A meta-analysis of realistic job preview experiments. Journal of Applied Psychology, 70, 706-719.
Reilly, R. R., & Israelski, E. W. (1988). Development and validation of minicourses in the telecommunication industry. Journal of Applied Psychology, 73, 721-726.
Robertson, I. T., & Downs, S. (1989). Work-sample tests of trainability: A meta-analysis. Journal of Applied Psychology, 74, 402-410.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage.
Roth, P. L., BeVier, C. A., Schippmann, J. S., & Switzer, F. S. (1996). Meta-analyzing the relationship between grades and job performance. Journal of Applied Psychology, 81, 548-556.
Rothstein, H. R., Schmidt, F. L., Erwin, F. W., Owens, W. A., & Sparks, C. P. (1990). Biographical data in employment selection: Can validities be made generalizable? Journal of Applied Psychology, 75, 175-184.
Russell, C. J., Settoon, R. P., McGrath, R. N., Blanton, A. E., Kidwell, R. E., Lohrke, F. T., et al. (1994). Investigator characteristics as moderators of personnel selection research: A meta-analysis. Journal of Applied Psychology, 79, 163-170.
Salgado, J. F. (1997). The five factor model of personality and job performance in the European community. Journal of Applied Psychology, 82, 30-43.
Schmidt, F. L., Gast-Rosenberg, I., & Hunter, J. E. (1980). Validity generalization results for computer programmers. Journal of Applied Psychology, 65, 643-661.
Schmidt, F. L., Hunter, J. E., & Caplan, J. R. (1981). Validity generalization results for two job groups in the petroleum industry. Journal of Applied Psychology, 66, 261-273.
Schmidt, F. L., Hunter, J. E., Outerbridge, A. N., & Goff, S. (1988). Joint relation of experience and ability with job performance: Test of three hypotheses. Journal of Applied Psychology, 73, 46-57.
Schmidt, F. L., Hunter, J. E., & Pearlman, K. (1981). Task differences as moderators of aptitude test validity in selection: A red herring. Journal of Applied Psychology, 66, 166-185.
Steel, R. P., & Ovalle, N. K. (1984). A review and meta-analysis of research on the relationship between behavioral intentions and employee turnover. Journal of Applied Psychology, 69, 673-686.
Steel, R. P., & Griffeth, R. W. (1989). The elusive relationship between perceived employment opportunity and turnover behavior: A methodological or conceptual artifact? Journal of Applied Psychology, 74, 846-854.
Tubbs, M. E. (1986). Goal-setting: A meta-analytic examination of the empirical evidence. Journal of Applied Psychology, 71, 474-483.
van Eerde, W., & Thierry, H. (1996). Vroom's expectancy model and work-related criteria: A meta-analysis. Journal of Applied Psychology, 81, 575-586.
Viswesvaran, C., & Barrick, M. R. (1992). Decision-making effects on compensation surveys: Implications for market wages. Journal of Applied Psychology, 77, 588-597.
Viswesvaran, C., Ones, D. S., & Schmidt, F. L. (1996). Comparative analysis of the reliability of job performance ratings. Journal of Applied Psychology, 81, 557-574.
Viswesvaran, C., & Schmidt, F. L. (1992). A meta-analytic comparison of the effectiveness of smoking cessation methods. Journal of Applied Psychology, 77, 554-561.
Waldman, D. A., & Avolio, B. J. (1986). A meta-analysis of age differences in job performance. Journal of Applied Psychology, 71, 33-38.
Wanous, J. P., Poland, T. D., Premack, S. L., & Davis, K. S. (1992). The effects of met expectations on newcomer attitudes and behaviors: A review and meta-analysis. Journal of Applied Psychology, 77, 288-297.
Wanous, J. P., Reichers, A. E., & Hudy, M. J. (1997). Overall job satisfaction: How good are single-item measures? Journal of Applied Psychology, 82, 247-252.
Wood, R. E., Mento, A. J., & Locke, E. A. (1987). Task complexity as a moderator of goal effects. Journal of Applied Psychology, 72, 416-425.
Wright, P. M. (1990). Operationalization of goal difficulty as a moderator of the goal difficulty-performance relationship. Journal of Applied Psychology, 75, 227-234.

    Notes

1. All average percentages were calculated as the reciprocal of the average of the reciprocals, as suggested by Hunter and Schmidt (1990).
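(For the unfamiliar reader, the quantity described in Note 1 is the harmonic mean; a minimal sketch:)

    # Harmonic mean: reciprocal of the average of the reciprocals.
    def harmonic_mean(values):
        return len(values) / sum(1.0 / v for v in values)

    harmonic_mean([0.40, 0.60])   # 0.48, vs. an arithmetic mean of 0.50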

2. Indices such as the Normed Fit Index (NFI) are specifically composed of the fit function minima for hypothesized and null models. Because all estimators include the residual matrix in their fit functions, comparison of residual matrices is implicit in such indices.

    References

Aguinis, H., & Whitehead, R. (1997). Sampling variance of the correlation coefficient under indirect range restriction: Implications for validity generalization. Journal of Applied Psychology, 82, 528-538.
Beal, D. J., Corey, D. M., & Dunlap, W. P. (2002). On the bias of Huffcutt and Arthur's (1995) procedure for identifying outliers in the meta-analysis of correlations. Journal of Applied Psychology, 87, 583-589.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Callender, J. C., & Osburn, H. C. (1981). Testing the constancy of validity with computer generated sampling distributions of the multiplicative model variance estimate: Results for petroleum industry validation research. Journal of Applied Psychology, 66, 274-281.
Cattell, R. B., & Vogelmann, S. (1977). A comprehensive trial of the scree and HG criteria for determining the number of factors. Multivariate Behavioral Research, 12, 289-325.
Cortina, J. M., & Dunlap, W. P. (1997). On the logic and purpose of significance testing. Psychological Methods, 2, 161-172.
Cortina, J. M., & Folger, R. G. (1998). When is it acceptable to accept the null hypothesis: No way, Jose? Organizational Research Methods, 1, 334-350.
Cortina, J. M., & Gully, S. M. (1999). So the great dragon was cast out . . . who deceives the whole world. Newsletter of the Research Methods Division of the Academy of Management, 14(1).
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272-299.
Gerstner, C. R., & Day, D. V. (1997). Meta-analytic review of leader-member exchange theory: Correlates and construct issues. Journal of Applied Psychology, 82, 827-844.
Guttman, L. (1954). Some necessary conditions for common factor analysis. Psychometrika, 19, 149-162.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. New York: Academic Press.
Huber, P. (1981). Robust statistics. New York: John Wiley.
Huffcutt, A. I., & Arthur, W. (1995). Development of a new outlier statistic for meta-analytic data. Journal of Applied Psychology, 80, 327-333.
Huffcutt, A. I., Roth, P. L., & McDaniel, M. A. (1996). A meta-analytic investigation of cognitive ability in employment interview evaluations: Moderating characteristics and implications for incremental validity. Journal of Applied Psychology, 81, 459-473.
Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage.
Hunter, J. E., & Schmidt, F. L. (1994). Estimation of sampling error variance in the meta-analysis of correlations: Use of average correlation in the homogeneous case. Journal of Applied Psychology, 79, 171-177.
James, L. R., Demaree, R. J., & Mulaik, S. A. (1986). A note on validity generalization procedures. Journal of Applied Psychology, 71, 440-450.
James, L. R., Demaree, R. J., Mulaik, S. A., & Ladd, R. T. (1992). Validity generalization in the context of situational models. Journal of Applied Psychology, 77, 3-14.
James, L. R., Demaree, R. J., Mulaik, S. A., & Mumford, M. D. (1988). Validity generalization: A rejoinder to Schmidt, Hunter, & Raju, 1988. Journal of Applied Psychology, 73, 673-678.
Johnson, B. T., Mullen, M., & Salas, E. (1995). Comparison of three major meta-analytic approaches. Journal of Applied Psychology, 80, 94-106.
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141-151.
Kemery, E. R., Mossholder, K. W., & Roth, L. (1987). The power of the Schmidt and Hunter additive model of validity generalization. Journal of Applied Psychology, 72, 30-37.
Kozlowsky, M., Sagie, A., Krausz, M., & Singer, A. D. (1997). Correlates of employee lateness: Some theoretical considerations. Journal of Applied Psychology, 82, 79-88.
Law, K. S., Schmidt, F. L., & Hunter, J. E. (1994a). Nonlinearity of range corrections in meta-analysis: Test of an improved procedure. Journal of Applied Psychology, 79, 425-438.
Law, K. S., Schmidt, F. L., & Hunter, J. E. (1994b). A test of two refinements for procedures in meta-analysis. Journal of Applied Psychology, 79, 978-986.
Marascuilo, L. A. (1971). Statistical methods for behavioral science research. New York: McGraw-Hill.
Mulaik, S. A., James, L. R., Van Alstine, J., Bennett, N., Lind, S., & Stillwell, C. D. (1989). Evaluation of goodness of fit indices for structural equation models. Psychological Bulletin, 105, 430-445.
Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81, 660-679.
Orwin, R. G., & Cordray, D. S. (1985). Effects of deficient reporting on meta-analysis: A conceptual framework and reanalysis. Psychological Bulletin, 97, 134-147.
Osburn, H. C., Callender, J. C., Greener, J. M., & Ashworth, S. (1983). Statistical power of tests of the situational specificity hypothesis in validity generalization studies: A cautionary note. Journal of Applied Psychology, 68, 115-122.
Pearlman, K., Schmidt, F. L., & Hunter, J. E. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373-406.
Podsakoff, P. M., MacKenzie, S. B., & Bommer, W. H. (1996). Meta-analysis of the relationships between Kerr and Jermier's substitutes for leadership and employee job attitudes, role perceptions, and performance. Journal of Applied Psychology, 81, 380-399.
Raju, N. S., & Burke, M. J. (1983). Two new procedures for studying validity generalization. Journal of Applied Psychology, 68, 382-395.
Raju, N. S., Pappas, S., & Williams, C. P. (1989). An empirical Monte Carlo test of the accuracy of the correlation, covariance, and regression slope models for assessing validity generalization. Journal of Applied Psychology, 74, 901-911.
Sackett, P. R., Harris, M. M., & Orr, J. M. (1987). On seeking moderator variables in the meta-analysis of correlation data: A Monte Carlo investigation of statistical power and resistance to Type I error. Journal of Applied Psychology, 71, 302-310.
Salgado, J. F. (1997). The five factor model of personality and job performance in the European community. Journal of Applied Psychology, 82, 30-43.
Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529-540.
Schmidt, F. L., Hunter, J. E., & Caplan, J. R. (1981). Validity generalization results for two job groups in the petroleum industry. Journal of Applied Psychology, 66, 261-273.
Slavin, R. E. (1984). Meta-analysis in education: How has it been used? Educational Researcher, 13, 6-15.
Society for Industrial and Organizational Psychology. (2000). 16th Annual Conference call for proposals. Bowling Green, OH: Author.
Spector, P. E., & Levine, E. L. (1987). Meta-analysis for integrating study outcomes: A Monte Carlo study of its susceptibility to Type I and Type II errors. Journal of Applied Psychology, 72, 3-9.
Steel, P. D., & Kammeyer-Mueller, J. D. (2002). Comparing meta-analytic moderator estimation techniques under realistic conditions. Journal of Applied Psychology, 87, 96-111.
Switzer, F. S., Paese, P. W., & Drasgow, F. (1992). Bootstrap estimates of standard errors in validity generalization research. Journal of Applied Psychology, 77, 123-129.
van Eerde, W., & Thierry, H. (1996). Vroom's expectancy model and work-related criteria: A meta-analysis. Journal of Applied Psychology, 81, 575-586.
Wanous, J. P., Sullivan, S. E., & Malinak, J. (1989). The role of judgment calls in meta-analysis. Journal of Applied Psychology, 74, 259-264.
Whitener, E. M. (1990). Confusion of confidence intervals and credibility intervals in meta-analysis. Journal of Applied Psychology, 75, 315-321.
Wilcox, R. R. (1997). How many discoveries have been lost by ignoring modern statistical methods? American Psychologist, 53, 300-314.
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of components to retain. Psychological Bulletin, 99, 432-442.
