Marketing Practices

Embed Size (px)

Citation preview

  • 8/13/2019 Marketing Practices

    1/13

    Reliability: A Review of Psychometric Basics and Recent Marketing PracticesAuthor(s): J. Paul PeterSource: Journal of Marketing Research, Vol. 16, No. 1 (Feb., 1979), pp. 6-17Published by: American Marketing AssociationStable URL: http://www.jstor.org/stable/3150868.

    Accessed: 19/12/2013 09:12

    Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at.http://www.jstor.org/page/info/about/policies/terms.jsp

    .JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of

    content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms

    of scholarship. For more information about JSTOR, please contact [email protected].

    .

    American Marketing Associationis collaborating with JSTOR to digitize, preserve and extend access to

    Journal of Marketing Research.

    http://www.jstor.org

    http://www.jstor.org/action/showPublisher?publisherCode=amahttp://www.jstor.org/stable/3150868?origin=JSTOR-pdfhttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/stable/3150868?origin=JSTOR-pdfhttp://www.jstor.org/action/showPublisher?publisherCode=ama
  • 8/13/2019 Marketing Practices

    2/13

    ReliabilityStudiesJ. PAULPETER*

    The basic theories and measurement procedures for reliability and the closelyrelated concept of generalizability are reviewed, illustrated, and evaluatedfor use in marketing research. A critique is given of a subset of previousmarketing research studies in which reliability estimates were used and

    recommendations are made for future research.

    Reliability e v i e w o Psychometric a s i c sn d e c e n t arketing Practices

    Validmeasurement s the sine qua non of science.In a general sense, validity refers to the degree towhichinstruments rulymeasurethe constructswhichthey are intended to measure. If the measures usedin a discipline have not been demonstrated o haveahighdegreeof validity, hatdiscipline snota science.Anecessary(butnotsufficient)condition orvalidityof measures is that they are reliable. Reliabilitycanbe defined broadlyas the degree to which measuresare free from error and therefore yield consistentresults. For example, ordinal evel measuresare reli-able if they consistently rank order subjects in thesame manner;interval level measures are reliable ifthey consistentlyrankorderandmaintain he distancebetween subjects (up to a lineartransformation).Ofcourse,behavioralmeasuresareseldom if ever totallyreliableand valid, but the degreeof theirvalidityandreliabilitymust be assessed if research is to be trulyscientific.Marketing esearchersseldom assess the reliability(much ess the validity)of theirmeasures(Heeler andRay, 1972,p. 369).Forexample, nconsumerbehavior,which is traditionallyviewed as a marketingarea,Jacoby (1976,p. 6) reportsthatin the entire 300-itembrand oyalty literatureonly one study has presenteda measureof test-retestreliability or that construct.Rogers (1976, p. 299) states that there is a lack ofevidence of the accuracy and stability over time of

    *J. Paul Peter is Associate Professor of Marketing, WashingtonUniversity. The author gratefully acknowledges the many usefulsuggestions and contributions provided by the reviewers.

    measures in the adoption-diffusion iterature. Ryanand Bonfield (1975, p. 22) criticize the common useof single-item cales and ackof concernwithreliabilityin the attitude-behavioral ntention literature, andKassarjian(1971, p. 415) points out that too oftenresearchersare disinterested nreliabilityandvalidity)criteriain the study of personality.The problemisnot uniqueto these areas;the situation s similarwithrespectto measuresof othercore constructs(Jacoby,1976,p. 6). In fact, of the more than 400 consumerbehaviorstudies surveyedfor this research,less than5%assessed the reliabilityof the measuresemployed.If the state of the art in marketing s to developbeyond its currentcondition, a useful startingpointwould be the regular assessment of reliability inmarketingresearch studies and the developmentofhighlyreliable cales. Not onlyis reliability necessaryconditionfor validity,but unreliablemeasuresattenu-ate (lessen) the correlationbetween measures.Thus,if reliabilitys not assessed andthecorrelationbetweenmeasures of two constructs is low, marketingre-searchers have no way of knowingwhether there issimplylittle relationshipbetween the two constructsor whetherthe measuresare unreliable.

    The purposeof this articleis to providea resourcefor marketing esearchers nterested n understandingreliability heory and assessing the reliabilityof theirmeasures.The firstsection is adiscussionof traditionalreliability heory and measurement.Thoughmuch ofthe psychometric iteraturehas been concerned withanalyzingappropriateassumptionstructures,formu-las, and methods for assessing reliability,the focushere is on discussingbasic concepts and evaluating6

    Journal of Marketing ResearchVol. XVI (February 1979), 6-17

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    3/13

    RELIABILITY:SYCHOMETRICASICSAND MARKETINGRACTICESreliability assessment procedures for use in marketingresearch.'The second section is concerned with the reformula-tion of reliability as generalizability theory. Thoughgeneralizability theory has not replaced traditionalreliability theory, it does provide a unified conceptualand operational approach for addressing reliabilityissues.The final section is a review of applications ofreliability assessment in one area of marketing, con-sumer behavior. Consumer behavior was selected forthis review because it is the most heavily researchedarea in marketing and has borrowed most heavily frompsychology. Thus, reliability was expected to be as-sessed more often in consumer behavior than in otherareas of marketing research. However, this is notmeant to imply that reliability assessment is appropri-ate only for consumer behavior constructs; reliabilityneeds to be assessed regularly for constructs in allareas of marketing research.

    TRA DITIONA L APPROACHES TORELIABILITYThe voluminous literature on reliability began withthe work of Spearman in 1904 and 1910.Though severalother formulations and assumptions structures forderiving reliability theory have been advanced, Spear-man's notion of true and error components remainsthe most influential model in psychological research(Campbell, 1976, p. 18). In this section the Spearmanapproach is discussed and basic methods for assessingthe reliability of a measurement scale are evaluated.The term scale is used here to mean a multi-itemscale and not simply a single item. One of the advan-tages of multi-item scales is that they allow measure-ment errors to cancel out against each other and thusthe reliability of the scale is increased. In addition,multi-item scales may be necessary to approach validmeasurement of factorially complex constructs.

    Reliability TheoryThis basic approach starts with the notion that themean and variance of any observed scale score caneach be divided into two parts.? In terms of the mean,the two parts are the true score and the error scoreor

    (1) Xobserved Xtrue + Xerror.'Excellent reviews and comparisons of alternative formulationsare provided by Bohrnstedt (1970), Campbell (1976), and Tryon(1957). The Campbell review is the most recent and includes someformulations not found in the earlier works. More comprehensiveand technical treatments can be found in Lord and Novick (1968)and Cronbach et al. (1972).2Partsof this section are based on Guilford (1954) and Kerlinger(1973). This is by no means the only rationale for deriving reliabilitytheory and it has been subject to criticism (e.g., Tryon, 1957).However, in spite of its limitations, it provides a useful frameworkfor discussing reliability concepts.

    Conceptually, the true score is a perfect measureof the property being measured. However, in practice,the true score can never really be known and generallyis assumed to be the mean score of a large numberof administrations of the same scale to the samesubject. The error score is an increase or decreasefrom the true score resulting from measurement error.Measurement error is the source of unreliability andits primary cause is that items in the scale are notmeasuring the same phenomenon.The variance of an observed scale score also isassumed to have a true component and an errorcomponent or(2) Vobserved - true + 'errorThe true variance component includes all systematicvariance. In one sense, it is a misnomer because itincludes both variance from the phenomenon underinvestigation and all other sources of systematicvariance. (Determination of the difference betweentypes of systematic variance is a validity question.)The error variance component includes all randomor nonsystematic variance. In terms of the previousdefinition of reliability, systematic variance does notaffect either the rank order or distance betweensubjects but random or error variance does and thuserror variance lowers the reliability of measures. Areliability coefficient (r,,), therefore, is nothing morethan the ratio of true variance to observed varianceor the percentage of total variance which is of thesystematic type. Symbolically,(3) true

    Vobserved

    Because Vtr,e annot be estimated directly, equation3 cannot be used to compute a reliability coefficient.However, because V, = I - V,rror and V,rror anbe estimated, equation 3 can be rewritten into acomputational formula as(4) erro rr. = I -

    Vlobservedor by further multiplying through by Vobserv,ed(5) Vobse rved Verrort observed

    VobservedThese two equations are both theoretical and practi-cal. As a theoretical matter, they exemplify the notionthat measurement error (error or random variance)reduces the reliability of measures. As a practicalmatter, an analysis of variance approach has beensuggested (e.g., Alexander, 1947; Burt, 1955; Hoyt,1941) for estimating these sources of variance. Basi-cally, the ANOVA model employs the mean squareof the residual as an estimate of Verror, the meansquare between individuals as an estimate of Vobserved

    7

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    4/13

    JOURNAL FMARKETINGESEARCH,EBRUARY979and substitutes each into equation 4 or 5. However,this is a method of reliability assessment.Reliability Measurement

    There are three basic methods for assessing thereliability of a measurement scale: test-retest, internalconsistency, and alternative forms.4 All three methodsattempt to determine the proportion of variance ina measurement scale that is systematic. Basically,these methods correlate scores obtained from a scalewith scores from some form of replication of the scale.If the correlation is high, most of the variance is ofthe systematic type and, with some degree of consis-tency, the measures can be depended upon to yieldthe same results.The basic difference among the three methods isin what the scale is to be correlated with to computethe reliability coefficient. In test-retest, the identicalset of measures is applied to the same subjects attwo different times. The two sets of obtained scoresare then correlated. In internal consistency, a mea-surement scale is applied to subjects at one pointin time; subsets of items within the scale are thencorrelated. In alternative forms, two similar sets ofitems are applied to the same subjects at two differenttimes. Scale items on one form are designed to besimilar (but not identical) to scale items on the otherform. The resulting scores from the two administra-tions of the alternative forms are then correlated.Test-retest reliability. In this method of reliabilityassessment the same scale is applied a second timeto the same subjects under conditions as similar asthe investigator can make them. The scores from thetwo administrations then are correlated and the result-ing index is interpreted in terms of the stability ofperformance of the measures over time. A two-weekinterval is the generally recommended retest period.The retest method involves at least three basicproblems. First, different results may be obtaineddepending on the length of time between measurementand remeasurement. In general, the longer the timeinterval the lower the reliability (Bohrnstedt, 1970,p. 85). Second, if a change in the phenomenon occursbetween the first and second administration, thereis no way to distinguish between change and unreli-ability.5 Third, the retest correlation is only partlydependent on the correlation between different itemsin the scale, because a portion of the correlation ofsums includes the correlation of each item with itself.

    3For computational examples, see Kerlinger (1973, p. 447-51).A more comprehensive ANOVA approach is presented in thegeneralizability section of this article.4Strictly speaking, reliability is never really measured but onlyestimated. For convenience, however, the terms measure, esti-mate, and assess are used interchangeably in this article.5See Hiese (1969) and Wiley and Wiley (1971) for approachesto overcoming this problem.

    Such correlations would be expected to be much higherthan those found between different items and couldproduce a substantial correlation between test andretest (Nunnally, 1967, p. 215).Although the retest method provides useful in-formation about the stability of measures, the problemssuggest that it should not be used as the sole methodof reliability assessment. Rather, if the retest methodis employed, it should be supplemented with internalconsistency estimates for each administration.Internal consistency reliability. The basic form ofthis method is split-halves in which item scores ob-tained from the administration of a scale are splitin half and the resulting half scores are correlated.The scale is usually split in terms of odd and evennumbered items or on a random basis. Internal consis-tency measures assess the homogeneity of a set ofitems.Though split-halves is the basic form of internalconsistency estimate, there is one basic problem withusing it: different results may be obtained dependingon how the items are split in half. Thus, the researcheris faced with the bothersome question of what is thereal reliability coefficient. One approach to over-coming this problemis to determine the mean reliabilitycoefficient for all possible ways of splitting a set ofitems in half. A formula which accomplishes this stepis Cronbach's (1951) coefficient alpha which is themost commonly accepted formula for assessing thereliability of a measurement scale with multi-pointitems.6 Alpha is formulated as

    (6)where:

    k 2(Jr2a,t

    a= k-( Ik -lIk \

    i-i 2r, /

    numberof parts(usually tems) in the scale,varianceof item i, andtotal varianceof the scale.

    6If items are scored dichotomously, e.g., yes-no, true-false, Avs. B, a special case of a, Kuder-Richardson Formula 20 (Kuderand Richardson, 1937) is the appropriate formula. KR-20 is formu-lated as

    k- )R-20 = --k -l I I - ar,where:

    k = numberof items on the scale,p = proportion f responsesof the first type,q = proportionof responsesof the second type (1 - p),anda%,2= total varianceof the scale.

    8

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    5/13

    RELIABILITY:SYCHOMETRICASICSAND MARKETINGRACTICESTable 1

    LOWER ALF OVARIANCEATRIXORSIXPROBABILITYFLOSS TEMS3.4951.07 2.462.04 .83 3.371.45 1.62 1.97 3.621.10 1.00 1.80 1.61 3.621.91 .58 2.30 1.35 2.03 3.52

    'Underlinedvalues are the itemvariances.Because of the facts that (1) alpha is one of themost important deductions from the theory of mea-surement error (Nunnally, 1967, p. 96) and (2) themajority of marketing research studies employ scalesor items of the type alpha was designed to evaluate,a numerical example is provided hereafter. Thougha computer program is available in the marketingliterature for calculating alpha (Vigderhous, 1974), thecalculations can be made easily from the covariancematrix of a set of items.In a portion of a previously reported study (Peterand Ryan, 1976), the reliability of several six-itemperceived risk scales was assessed by using coefficienta. One scale was intended to measure the probabilityof six types of loss from the purchase of a FordPinto. The six types of loss examined were financial,social, performance, psychological, physical, and con-venience. The items were scored from 1 (improbable)to 7 (probable) by 108 subjects. Table 1 is the co-variance matrix for the resulting scores.Because the total variance can be restructured asthe sum of the item variances plus two times thesum of the item covariances, alpha can be restructuredinto a computational formula as7

    k i k(7) ?-\ i +2- ao/=l i > j

    7Not only is equation7 more convenient computationally,butalso it illustrateswhy increasing he numberof items on a scalealmost always increases the scale's reliability-the number ofcovarianceterms in the denominatorncreasesgeometricallywiththenumberof items whereas henumber f variance erms ncreasesonly arithmetically.An increase of m itemsincreasesthe varianceby m elements but the covarianceby m(m- 1) elements. Thus,althoughk/k- 1 in the formuladecreases with an increase in thenumberof items and the additionalvariancewouldhave a negativeeffecton the value of alpha, hegeometricncrease n thecovarianceelementsmore than offsets these effects. Thus, unless all of thecovariance of addeditems with the original tems is almost zeroand the varianceof the added items is not, an increase in thenumberof itemson a scale willincrease the reliabilityof the scale.

    The first step is to compute x6 Ci2 which is thesum of the item variances, the diagonal elements ofthe covariance matrix or6

    a,2 = (3.49 + 2.46 + 3.37 + 3.62 + 3.62 + 3.52)= 20.08.

    The next step is to compute two times the sumof the covariance elements, the off-diagonal elementsof the covariance matrix or6 6

    2 Cij = 2(1.07 + 2.04 + 1.45 + 1.10+ 1.91+ .83 + 1.62 + 1.00 + .58 + 1.97 + 1.80+ 2.30 + 1.61 + 1.35 + 2.03) = 45.32.

    Alpha then can be determined as6a-- 1

    6 - I20.08 = .83.

    20.08 + 45.32Thus, the scale demonstrates a high degree ofreliability, and correlations between the sum scoreof the scale items and other variables would be affectedvery little by attenuation.8In addition to alpha, there are numerous otherformulas for computing internal consistency estimatesof reliability. However, alpha is a general formulaand many other approaches have been shown to bedifferent computational forms which will yield similarresults (Tryon, 1957). Although some aspects of deriv-ing alpha have been criticized (Bentler, 1972; Tryon,1957), it is a most useful formula for assessing thereliability of measures in marketing research.Alternative form reliability. In this method of as-sessing reliability, the same subjects are measuredwith two scales at two different times, usually twoweeks apart. Each scale is designed to be similar incontent to the other but different enough that thefirst measurement will not substantially affect remea-surement. The resulting scores from the two admin-istrations of the alternative forms then are correlatedto obtain a reliability coefficient. Alternative formsassess the equivalency of content of sets of items.

    8Givena reliabilityestimate,the standard rror of measurementcan be computedby the formulam=cas / - rt,,where:

    'meas = the standard rrorof measurement,a, = the standarddeviationof total scores, andr, = the reliability stimate.In the example, crm...= 8.08 V/1 - .83 = 3.33. Although confidenceintervalsaround ruescorescanbe computedby usingthis estimate(see Bohrnstedt,1970,p. 83-4;Nunnally,1967,p. 220)suchanalysishas littlepracticalvalue for most types of marketing esearch.

    9

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    6/13

    JOURNAL FMARKETINGESEARCH,EBRUARY979The primaryproblemwithuse of alternative ormsis in the developmentof substantiallyequivalental-ternativemeasures.For example, strictdefinitions ofalternativeforms state that the mean, variance, andintercorrelationf items oneachform must be equiva-lent (Gulliksen,1950). Thoughthis problemhas beenovercome to some extent in educationaltesting, it

    remainsa seriousconsideration or the measurementof otherbehavioralconstructs.An even more perplexingproblemwith alternativeformsis proving that the two measuresare equiva-lent incontent.Forexample, f the correlationbetweenscores on the two forms is low, it is difficult todeterminewhether he measureshaveintrinsicallyowreliabilityor whetherone of the forms is simply notequivalent in content to the other. Although thisproblemcan be investigated Nunnally,1967,p. 211-13),the end result most likelywill be the developmentof yet anotheralternative orm.The importanceof assessing reliabilitywith alterna-tive forms depends on the phenomenonunder inves-tigation.If the phenomenon s expected to vary overrelativelyshortperiodsof time, then alternative ormmeasures may be necessary for examiningchanges.Thus, though the alternative form method may benecessary for the investigation of some marketingconstructs, coefficient alpha usually will provide aclose estimateof alternative ormsreliability Nunnal-ly, 1967,p. 211).

    REFORMULA TING RELIABILITY ASGENERALIZABILITY THEORYOneproblemnot explicitlyaddressedby traditionalapproachesto reliabilityis that measurement errorcan come from many sources and each definitionoferrorchanges the meaningof the reliabilitycoeffi-cient (Gleseret al., 1965).For example,the compo-nents of variance which are true and error incomputinga test-retestcorrelationare differentfromthose for an internalconsistency estimate. One ap-proachto simultaneouslyanalyzingmultiplesourcesof variance in a measurementprocedure has beenthe reformulationof reliability as generalizabilitytheory by Cronbach and his associates (1963, 1972;Gleseret al., 1965).

    GeneralizabilityTheoryGeneralizabilitytheory is based in part on the

    concept of sampling.However, the primary ocus isnot on the samplingof people from populationsofpeople, but ratheron sampling conditionsof mea-surement from universes of possible measurementconditions.A conditionof measurement or simply, condition)is a specific aspect of a measurementprocedure,e.g.,the specific times measures are taken or the specificitems used inthe scale. Thegeneraltermfor anaspectof a measurementprocedure s a facet of measure-

    ment, i.e., time, instrument,and observers are sev-eral commonfacets.The question in generalizability heory is whetherscores obtained n the sampledconditionsof measure-ment are representativeof the universe scores forthose conditions.The universe score is analogoustothe true score in traditionalreliability theory and isconceptualizedas the mean score a subject wouldprovideif measuredover allconditions n a universe.Measures cannot be taken over all conditions ina universe. However, it is still possible to determinethe correlationbetween observedscores and universescores, becauseit is assumed ngeneralizabilityheorythat conditionsof measurement rerandomly ampledfrom the universe of conditions.Given this assump-tion, the correlationbetweenanytwo sets of observedscores for a condition can be shown, on the average,to be equalto the correlationbetweenobserved scoresand universe scores (Campbell, 1976,p. 144). Thus,a coefficient of generalizability an be determinedfrom observed scores and is definmeds the ratio ofuniverse-scorevariance to expected observed-scorevariance(Cronbachet al., 1972,p. 17).Thoughgeneralizability oefficients can be comput-ed for a measurementprocedure,this is not the mainthrustof generalizabilityresearch.The maingoal ofgeneralizabilityresearch is to simultaneouslyassessthe componentsof variance n a measurementproce-dure. Variance can come from many sources in ameasurement rocedureand,depending n theinterestof the researcher, at least one of these sources isunwantedor error. By simultaneously nvestigatingmultiple ourcesof variance,theresearcher andevel-op more efficient measurementprocedures.The main benefit of generalizability heory is thatit explicitlyrecognizes that there are many universesto which a researchermay wish to generalize. Forexample, the researchermay wish to generalizefroma sample of items to the universe of items, from asample of times of measurement o the universe oftimes of measurement,from a sample of places ofmeasurementto the universe of places of measure-ment, from a sample of observers to the universeof observers, etc. Measurementprocedures are de-signed in generalizabilitystudies to investigate theuniverses of interestby samplingconditions of mea-surement rom each of them. In otherwords, for eachuniverse of interest,a facet is included n the general-izability study. Thus, traditionalreliabilitymethodscanbe viewed as single-facetgeneralizability tudies,e.g., a test-retestcorrelations concernedwithwhetherscores obtained from a measurement nstrumentaregeneralizable o the universe scores across all timesof possible measurement.Clearly,even if the test-re-test correlationwerehigh,no statementcouldbe madeabout the generalizabilityof the measures to otheruniverses. To generalize to other universes, othermeasurementprocedureswouldhave to be employed.

    10

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    7/13

    RELIABILITY:SYCHOMETRICASICSANDMARKETINGRACTICESTable 2

    ANALYSIS FVARIANCEORHYPOTHETICALRANDLOYALTYSTUDYSource of Sum of Meanvariance squares d.f. square

    Subjects (p) 5999.400 99 60.600Items(i) 1087.200 9 120.800Occasions j) 6540.000 2 3270.000pi 1335.609 891 1.499pj 712.800 198 3.600ij 67.500 18 3.750Residual 2227.500 1782 1.250

    Generalizability MeasurementAs previously noted, analysis of variance could beused to assess the reliability of a measurement scale.The logic of the ANOVA model can be expandedto include multiple sources of variance and this isprecisely the method used to assess the various com-

    ponents of variance in generalizability research.For example, suppose a researcher is interested ininvestigating a brand loyalty measurement procedure.Eventually, the measurement procedure is to be usedto help make a decision about whether or not aparticular brand should be dropped from a productline.The first step would be to perform a generalizability(G) study. A G study is used to estimate variancecomponents in the measurement procedure. If the Gstudy indicates that the components of variance whichthe researcher considers error are minimal, thenthe measurement procedure would be considered ac-ceptable for use in a decision (D) study. The D studywould be the actual study of brand loyalty on whichthe decision to drop the brand from the product linewould be based.The design of the G study depends on the facetsof interest to the researcher. Suppose the researcherwere interested in two facets, time and instrument,and therefore administered a 10-item brand loyaltyscale to 100 subjects on three occasions. The firststep in the analysis of the G study would be to performa three-way analysis of variance, subjects x itemsx occasions. Table 2 is an analysis of variance tablefor this hypothetical problem.Because subjects, items, and occasions are assumedto be randomly sampled, the observed mean squaresare unbiased estimates of the expected mean square(EMS). These expected mean squares also have beenshown by Cornfield and Tukey (1956) to be the sumof certain variance components, andit is these variancecomponents which are of primaryinterest in generaliz-ability research. The formulas for obtaining the sevensources of variance of interest in the sample problemare:(8) EMS = a2 + n 2 +na,p,2 + n nrp2

    (9) EMSi = a 2 + npora2 +jp + npnj/2(10) EMSj = a2 + n,ap2 + npni,a2(11) EMSpi = C2 + nap,2(12) EMSpJ= aor2+ ncrp2(13) EMSy = a.+2+npCra2(14) EMS,s = ae2.

    To determine the unknown variance components,the analysis starts with ore2which is 1.25.9 With thisestimate, and because every term in each of theequations is known except one of the variance compo-nents, the researcher can solve for the variance com-ponents. Because each of the observed mean squaresis an unbiased estimate of its respective EMS, theEMS terms are taken from Table 2; the number ofpersons, np = 100; the number of items, n, = 10;the number of occasions, nj = 3. For example, tosolve for the variance component of the ij interactionterm, the substitution would be:

    EMS,) = a,2 + npa,23.75 = 1.25 + 10OOr,2

    ri2 = .025.Working backward through each of the equations, onecan estimate each of the components of variance.In addition, by summing the variance components andthen dividing the sum into each component, one candetermine the percentage of total variance which isattributable to each source. The results of thesecomputations for the sample problem are reportedin Table 3.

    For the purpose of a G study, analysis is madedirectly from these components of variance. The largeeffects are those from subjects, occasions, and theresidual. The large subject variance component wouldsuggest that the sample is rather heterogeneous andthe measurement procedure is capable of discrimi-nating between the various subjects.The major source of variance is from occasionswhich would suggest that subjects are not respondingin the same way at different times of measurement.However, another possibility might be that differentcoders were employed at different times of measure-ment and this effect is masked because coders wasnot included as a facet in the study. Subsequentanalysis including coders as a separate facet couldbe performed to determine the extent of this effect.The high residual could stem from other facets affect-ing the scores which are not explicitly accounted forin the design.

    91t would be more proper to label a,2as ar; to account forthe within-cell error term as there is only one observationpercell in thisdesign.However,o,2 s used here tosimplify he notation.

    11

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    8/13

    JOURNAL F MARKETINGESEARCH,EBRUARY979Table 3ESTIMATEDOMPONENTSOF VARIANCEAND PERCENTAGESOF TOTALVARIANCEFORHYPOTHETICALRANDLOYALTYTUDY

    Source of Estimate of Percentage ofvariance Mean square d.f. variance component total varianceSubjects (p) 60.60 99 1.892 26.51Items (i) 120.80 9 .389 5.45Occasions (j) 3270.00 2 3.264 45.73pi 4.20 891 .083 1.16Pj 3.60 198 .235 3.29ij 3.75 18 .025 .35Residual 1.25 1782 1.250 17.51Ic2 = 7.138 100.00%

    The coefficient of generalizability can be determinedby dividing an estimate of universe-score varianceby an estimate of expected observed-score variance.In this example, the estimate of the universe-scorevariance is the variance component for subjects, 1.892.The estimate of the expected observed-score varianceis the sum of the variance components for p, pi, pj,and e or 1.892 + .083 + .235 + 1.250 = 3.46. The co-efficient of generalizability thus equals 1.892/3.46 or.547.Though the illustrative study used here is a com-pletely crossed, two-facet design, the possibilities forstudy designs for generalizability are unlimited. Notonly can any number of facets be examined simulta-neously, but nested designs ordesigns with any numberof fixed and random facets can be accommodated.By formulating reliability problems in terms ofgeneralizability, researchers are forced to recognizethat unwanted or error variances could come frommany sources in a measurement procedure. Even ifreliability is assessed by use of a traditional method,only one facet of measurement and type of errorvariance is being explicitly considered. Thus, a highreliability coefficient of one type cannot be interpretedto mean that the measurement procedure will yieldconsistent results across all potential facets.Although the generalizability formulation providesa framework for extensive investigation of measure-ment procedures, two practical considerations maylimit its use in marketing research. First, the designand interpretation of generalizability studies can be-come very complex. For example, interpretinga higherorder three- or four-way interaction can be a mostchallenging task. Second, whethergeneralizability stud-ies are profitable (reward-cost) depends on howimportant the various sources of error variance areexpected to be in a measurement procedure. In mostcases, the primary source of measurement error isfrom the items in a measurement scale. In other words,the scale items do not systematically measure the samephenomenon. Thus, even though coefficient alphadoes not consider many sources of measurement error,

    Nunnally (1967, p. 210-11) suggests that it is surprisingwhat little difference these other sources usually make.This is particularly true for situations in which instruc-tions are easily understood and there is little subjectiv-ity in scoring. Thus, the necessity of performing amultifaceted generalizability study depends on howimportant sources of variance other than the itemsare expected to be.RELIA BILITY ASSESSMENT PRACTICES INMARKETING RESEA RCHTo investigate reliability assessment practices, asample of 400 empirical research studies was surveyedin one area of marketing, consumer behavior. Thesestudies were surveyed primarily from the JMR (1972-1976), JCR (1974-1976), JM (1974-1976), Advancesin Consumer Research (1975-1977), and the AMAProceedings (1974-1976). ' Studies which were found

    to include some form of reliability assessment werereviewed closely to investigate problems in the area.There is no intent here to be critical of past researchefforts and in fact these studies are laudable for atleast attempting to address the reliability issue.Tables 4 and 5 summarize the results of the survey.No studies were found which included either analternative form or a multifaceted generalizabilityapproach. This outcome could be expected becausealternative forms are often difficult to develop andgeneralizability is a relatively new approach.Nineteen studies were found which included reli-ability estimates. Two studies (Best et al., 1977; Lund-strom and Lamont, 1976) employed both test-retest

    '?In addition, a computer scan of the Psychological Abstractswas performed by the Bibliographical Services for Research system.In this scan, the keywords Consumer Behavior, Consumer Research,and Consumer Attitudes were cross-referenced with both testreliability and reliability. The scan produced a total of nine referencesfor the years 1974-1976, of which only two were empirical researchemploying a reliability measure. Thus, this method was not effectivein locating appropriate articles because reliability estimates are notalways reported in the abstracts.

    12

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    9/13

    RELIABILITY:SYCHOMETRICASICS AND MARKETING RACTICESTable 4

    SUMMARYOF STUDIESINCLUDINGTEST-RETESTELIABILITYSTIMATESNature of Type of No. of items No. of points Retest Reliability SampleAuthors Study area scales scale' in scale per item period coefficient size

    Best, Attitudes Beliefs (about5 S 10perstore 6 10days range 41-.61 70Hawkins, department& Albaum stores)(1977)bBettman, Attitudes Beliefs/evalua- S 36 11 Immediate .672c 72Capon, tions (for fourLutz (1975) brandsoftoothpaste)Brooker Consumer Consumer 4 weeks .57 13(1975) self-actu- self-actualiza- A vs. B 20 2 5 weeks .67 24alization tionGreen & Consumerutili- Personalprefer- Category 15 9 1 hour range 8-.95 27Devita ty for item ence for meal ratings(1974) collection and dessertcombinations4dLandon Self-concept Self-con-(1974) cept/productcongruity S 1 9 1 week > .7e 352Ideal self-con-

    cept/productcongruity S 1 9 > .7Purchase nten-tions S 1 9 >.7Lundstrom& Consumerdis- Attitudes oward L 84 6 6 weeks .79 154Lamont content businessprac-(1976) ticesVillani & Personality Personality raits:Wind Sociable L 6 5 2 years .72f 504(1975) Relaxed L 4 5 .68Internal ontrol L 5 5 .48Waung 1975) Attitudes Perceived nstru- S 5 perdrug 7 Immediate range 38-1.0 55mentality for (intrasubj.)3 prescriptiondrugs)Wright 1975) Cognitive Generalizedresistance self-confidence S 10 9 2 weeks .75 160to advertising Informationprocessingconfidence S 10 9 .69

    'L = Likerttype.S = Semanticdifferential ype.See also Best et al. (1976).cThewithin-individualorrelationsbetweentwo replicatesof 36 responseswereaveragedacrosssubjects.dReplicates of 15 cards istingvariousmealanddessertcombinationswere sorted nto nine preferencegroups.'Test-retest correlations(_ .7) were used to determine which productswould be included in the study. Twelve were selected for179males;seven were selectedfor 173females.rTest-retestcorrelationsmeasuredby simple sum of factor analysis items were .70, .72, .39 and measuredby factor scores were.65, .55, .37 for the three scales respectively.

    and internalconsistency estimates and are includedin bothtables. Seven otherstudiesemployed est-retestand 10others used internalconsistencyestimates.In terms of content, 12 of the studies were in theattitude/behavioral intentions and personality/life-style areas. This finding can be explained because(1) these areas were the most heavily researched inthe review period and (2) multi-item scales wereavailable oradaptable) rom otherdisciplines n theseareas. The majority of the studies used semanticdifferential or Likert-type items and the magical

    number7 plus orminus2 accounted for the majorityof points per item.A difficultproblem s the numberof itemsnecessaryona scale. Conceptually, he answer s simplyenoughitems to measure the construct under investigation,and only that construct. Yet there are at least twopractical problems in determiningthe appropriatenumberof items for marketing onstructs.One is theproblemof boredom and fatigue if too many itemsare included on a questionnaireto measure eachconstruct.Althoughthe greaterthe numberof items,

    13

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    10/13

    JOURNALOF MARKETING ESEARCH,FEBRUARY979Table 5

    SUMMARYOF STUDIESINCLUDING INTERNAL ONSISTENCYRELIABILITYSTIMATESNo. of No. of InternalNature Type of items in points consistency Reliability SampleAuthors Study area of scales scale scales per item estimate coefficient size

    Best, Haw- Attitudes Beliefs (about 5 S 10 per 6 ANOVA range .62-.71 70kins, & Al- department store Ca range .56-.64stores)

    12 lifestyle co-variatesb15 lifestylescalesc

    Innovation 13 AlO'ssdApparel innova-tivenessGrooming innova-tivenessHome care inno-vativeness

    Personality Innovative-ness

    Consumerdis-contentInformal groupinfluences

    Attitudes towardbusiness prac-ticesGroup influenceReflected ap-praisalComparative ap-praisalAttitudes toward:Alcohol

    CigarettesCoffeeProbability of lossImportance of

    loss (for sixauto brands)Personality traits:CompliantAggressiveDetachedAttitudinal influ-enceSocial influencePurchase inten-tions (for 2brands oftoothpaste)Components ofbehavioral in-tentions modelfor six brandsof toothpaste

    Perry 1973) Attitudesper-sonalityPerceived risk

    Personality

    Behavioral (pur-chase) inten-tions

    Behavioral (pur-chase)intentions

    L range 2-5

    L range 2-4

    L range 4-10L 9L 6L 4NRe 40

    L 84

    L 4L 4L 3

    LLLSS

    2020206 per brand6 per brand

    S 10S 15S 10S 4 per brandL 14 per brandS 3 per brand

    S NR

    6 split-half range 68-.88 359

    NR split-half range .52-.83 278

    6 split-halfNR split-halfNR split-halfNR split-half5 Spearman-

    BrownKR-206 split-halfSpearman-Brown5 split-half55

    split-halfsplit-half

    77777

    aa

    att

    66677

    aa

    7 a

    7 KR-2Lambda-3a

    range .55-.89 154.67.59.61.90.88.96.94.76

    299226

    206.65.65

    .68.70.65range .72-.83range .55-.75

    .724.680.514.93, .94.86, .91.97, .98

    range .665-.786NRNR

    164

    210o

    175

    97

    162

    'L = Likert type items.S = Semantic differential type items.NR for any entry = not reported.bScales were determined by employing factor analysis and labeled generalized self-confidence, opinion leadership, plan-ahead traveler,information seeker, camp traveler, relaxing traveler, first-class traveler, national traveler, jetsetter-vagabond traveler, historical traveler,sports-spectator, functional gregarious.CScales were determined by employing factor analysis and labeled price conscious, fashion conscious, dislikes housekeeping, community-minded, self-confidence, opinion leadership, information seeker, new brand trier, canned food user, dieter, financial optimist, wide horizons,arts enthusiast, patronage innovator, patronage opinion leader.

    LifestyleLifestyle

    baum(1977)Darden&Perreault(1975)Darden&Perreault(1976)Darden&Reynolds(1974)

    Leavitt &Walton(1975)Lundstrom &Lamont(1976)Moschis(1976)

    Peter & Ryan(1976)

    Ryan & Be-cherer(1976)

    Ryan & Peter(1976)

    Wilson,Mathews,&Harvey(1975)

    14

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    11/13

    RELIABILITY:SYCHOMETRICASICSANDMARKETINGRACTICESthe higher the scale's reliability(see footnote 7), itmay be necessary to pruneextremely long scales toshorter forms. This step can be accomplished byselecting a subset of items which have high co-variances.A secondproblem s factoriallycomplexconstructs,i.e., constructs which contain multiple dimensions.For example, there are many types of perceived risk(e.g. financial, social, etc.) and perhapsa multi-itemscale is needed for each type. The problemis evenmore complex in measurementof the attributes ofvarious products, services, stores or brands. Forexample, in investigationof attitudes towardvariousbrands,common formulationsview the construct ofattitudes as some additive combinationof productorbrand attributes.These attribute measures could beviewed in two ways: (1) as single-itemcomponentsof the construct of attitudesor (2) as separatedimen-sions. If they are viewed as single-item components,then assessing the reliabilityof a scale composed ofa set of attributemeasuresmakes sense. However, ifeachattribute s viewed as a separatedimension,suchas the taste versus price of a brand of toothpaste,then a different procedureis required. In this casea separate, multi-item, internally consistent scalewould be required for each attribute in order toapproachvalidmeasurementof the attitudeconstruct.Though he latterapproachs more consistentconcep-tually and in terms of the measurementliterature,developingmulti-item ubscales foreachattributemaybe tedious, as would filling out questionnairescom-posed of sets of such highlyredundanttems.In terms of the retest periods used in the studiesin Table 4, at least five of the retest periods wereof such short durationthat the initial measurementcould have substantiallyaffected remeasurement.Anadditionalproblemwith test-retest was illustrated nmany of the studies-the sample size (of subjects)had to be reduced because some subjects were notavailable for the retest.In terms of the internalconsistency estimates inTable 5, five studies used the split-half procedurewhich leaves open the questionof whether differentresults would be obtainedif the items had been splitin half in anothermanner.Two studies employedtheSpearman-Brownormulawhich is used primarily odetermine what the reliabilityof the scale would be

    if the number of items were doubled. ' One studyused the analysis of variance approachand the re-mainderemployedcoefficient alpha.Reviewing he reliability oefficients reported n thetwo tables raises the question of whether the scalesdemonstrated atisfactory evels of reliability.Thoughno hardandfast rules havebeen offeredfor evaluatingthe magnitude of reliability coefficients, Nunnally(1967, p. 226) suggests the following guidelines. Inearlystagesof research,modestreliabilityn the rangeof .5 to .6 will suffice. Forbasicresearch,it is arguedthat increasing reliability beyond .8 is unnecessarybecause at that level correlationsare attenuatedverylittle by measurementerror. Nunnally says that inapplied settings, in contrast to basic research,a reli-abilityof .9 is the minimum hat should be toleratedand a reliability of .95 should be considered thedesirable standard. Thus, as none of the studiesreportedwere of an appliednature, these guidelinessuggestthat nalmostall cases the scales demonstratedat least satisfactoryreliability.12In marketing,guide-lines are yet to be developed.A finmaluestionis relatedto the samplesizes usedin these studies because sampling error can makeresultsappearbetter than they will be in subsequentstudies. In other words, samplingerrorsprovide theopportunityto take advantageof chance and suchopportunitiesare relatedpositively to the number of

    'Although testof double ength s themostcommonapplication,the generalizedSpearman-Brownormulacan be used for any ratioof altered est length o the original ength(Guilford,1954,p. 354).The generalized ormula snr,,tt

    r + (n-l)r,-where:

    rnn= the reliabilityof the altered-lengthcale,n = any proportionate hange n test length,andr, = the reliabilityof the original engthscale.'2This snotto saythatNunnally's uidelines houldbeinterpretedas absolutestandards or marketing esearch.They are primarilyconcerned with the development of finely tuned measures ofindividual raits to be used for decisions aboutindividualpersons(e.g., GMAT tests). As most marketingresearch is not of thisnature,lower levels of reliabilitymay be acceptable n marketingresearchstudies.

    dScales were determinedby employingfactor analysis and labeled interest-made ashions, interest-personal rooming, interest-homecare, information eeking-male fashions, information eeking-personal grooming, nformation eeking-home care, opinion eadership-male fashions, opinion leadership-home care products, self-esteem, generalizedself-confidence, attitude toward change, lifestyleventuresomeness.'Not reported;a listingof items is available from the authors.The 40 items were also split into two groupsof 20. Spearman-Brownand KR-20 for the forms were .84, .80 and .77, .77, respectively.Items were scored on five points and collapsed to two points forreliabilityassessment.f108subjectsfor threebrandsof compactcarsand 102subjectsfor three brandsof intermediate ars.'In that the itemsaremultipoint, he reportedreliability oefficients are perhapsmislabeledas KR-20rather hanalpha.

    15

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    12/13

    JOURNAL FMARKETINGESEARCH,EBRUARY979items and related negatively to the number of subjects.A useful guideline suggests that for any type of itemanalysis (or multivariate analysis) there should be atleast 10 times as many subjects as items or, in casesinvolving a large number of items, at least five subjectsper item (Nunnally, 1967, p. 280). Of the studiesreported in Tables 4 and 5, only five 'have at leasta 10 to 1 ratio of subjects to items for each scaleemployed. Thus, many of the reported coefficientsmight be considerably smaller in subsequent researchwith adequate sample sizes.

    CONCLUSIONS AND RECOMMENDATIONSThis research was concerned with reliability andgeneralizability theory, measurement, and assessmentpractices in marketing. Given the state of the art ofmeasurement in marketing, four recommendations areoffered.First, marketing researchers need to develop multi-item scales to measure constructs in the area. Mostconstructs by definmition re too complex to be mea-

    sured effectively with a single item, and multi-itemscales are necessary for appropriate reliability andvalidity assessment. Despite encouraging signs ofmulti-item scale development found in this review,more and better multi-item scales need to be devel-oped.Second, the development of reliable scales presentsa useful starting point for improving the quality ofmarketing research. If multi-item scales are developedwhich initially demonstrate low reliability, reliabilityoften can be increased to acceptable levels by improv-ing the clarity of the instructions, reducing ambiguityin the items, or by simply adding similar items tothe scale.Third, in reporting reliability coefficients, re-searchers should fully explain (1) relevant scale char-acteristics, (2) the procedure used to assess reliabilityand the source(s) of error which is treated, (3) appro-priate references and previous reliability estimates (ifany) for the scale, and (4) the interpreted meaningof the reliability coefficient. This information mayhelp to overcome the problems of ambiguity in thearea. For example, on the basis of the use of aninternal consistency estimate and the researcher'sbackground, internal consistency estimates have beenreferred to as measures of reliability, validity, homo-geneity, and generalizability.Last, coefficient alpha offers a useful and usableapproach to assessing the reliability of measurementscales in marketing research. Though the developmentof alternative forms and multifacet generalizabilitystudies will be needed for situations in which timeand other facets of measurement need investigation,alpha can be fruitfully employed for scales containinga minimum of three items. Clearly, the developmentof reliable scales is a necessary condition for improvingthe quality of marketing research and theory.

    REFERENCESAlexander, H. W. The Estimation of ReliabilityWhenSeveral Trials are Available, Psychometrika,12 (June1947),79-99.Bentler,P. M. A Lower-BoundMethod or the Dimension-Free Measurement of Internal Consistency, Social

    ScienceResearch,I (December1972),343-57.Best, Roger, Del I. Hawkins, and GeraldAlbaum. TheRoleof RandomWeightsandReliabilityntheAssessmentof MultiattributeAttitudeModels, in B. B. Anderson,ed., Advances n ConsumerResearch,Volume3. Chicago:Associationfor ConsumerResearch,1976,88-91., , and . Reliability f MeasuredBeliefsinConsumerResearch, n W.D. Perreault, d., Advancesin ConsumerResearch, Volume4. Atlanta:Associationfor ConsumerResearch, 1977,19-23.Bettman, James R., Noel Capon, and Richard J. Lutz.MultiattributeMeasurementModels and MultiattributeAttitudeTheory:A Test of ConstructValidity, Journalof ConsumerResearch,1 (March1975),1-15.Bohrnstedt,GeorgeW. Reliability ndValidityAssessmentin AttitudeMeasurement, n G. F. Summers,ed., Atti-tude Measurement.Chicago:RandMcNally, 1970,81-99.Brooker, George. An Instrument o Measure ConsumerSelf-Actualization, in M. J. Schlinger,ed., Advancesin ConsumerResearch,Volume2. Chicago:Associationfor ConsumerResearch,1975,563-75.Burt, C. Test Reliability Estimated by Analysis ofVariance, British Journal of Statistical Psychology, 8(November1955),103-18.Campbell,John P. PsychometricTheory, in MarvinD.Dunnette,ed., HandbookofIndustrial ndOrganizationalPsychology. Chicago:Rand McNally College PublishingCompany,1976,185-222.Cornfield,J. and J. W. Tukey. AverageValues of MeanSquaresnFactorials, AnnalsofMathematicalStatistics,27 (December1956),907-49.Cronbach,L. J. CoefficientAlphaand the InternalStruc-ture of Tests, Psychometrika,16 (September 1951),297-334., N. Rajaratnam,and G. C. Gleser. Theory ofGeneralizability:A Liberalization f ReliabilityTheory,BritishJournalof StatisticalPsychology, 16 (November1963),137-63., G. C. Gleser, H. Nanda, and N. Rajaratnam.TheDependabilityof Behavioral Measurements:Theory ofGeneralizabilityor Scores and Profiles.New York: JohnWiley & Sons, Inc., 1972.Darden,WilliamR. andW. D. Perreault, r. A MultivariateAnalysisof MediaExposureandVacationBehaviorwithLife Style Covariates, Journalof ConsumerResearch,2 (September1975),93-103.and . Identifying nterurban hoppers:Mul-tiproductPurchasePatternsand SegmentationProfiles,JournalofMarketingResearch,13 February1976),51-60.and FredD. Reynolds. BackwardProfilingof MaleInnovators, ournalofMarketingResearch,11(February1974),79-85.Gleser,G. C., L. J. Cronbach,and N. Rajaratnam. Gener-alizabilityof Scores Influencedby MultipleSources ofVariance, Psychometrika, 0 (December1965)395-418.Green,Paul E. andMichaelT. Devita. A Complementarity

    16

    This content downloaded from 111.93.6.230 on Thu, 19 Dec 2013 09:12:39 AMAll use subject to JSTOR Terms and Conditions

    http://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsp
  • 8/13/2019 Marketing Practices

    13/13

    RELIABILITY:SYCHOMETRICASICSANDMARKETINGRACTICESModel of ConsumerUtilityfor Item Collection, Journalof ConsumerResearch,1 (December 1974),56-67.Guilford,J. P. PsychometricMethods.New York:McGraw-Hill Book Company,1954.Gulliksen, H. Theory of Mental Tests. New York: JohnWiley & Sons, Inc., 1950.Heeler,RogerM. andMichaelL. Ray. MeasureValidationin Marketing, Journal of Marketing Research, 9 (No-vember1972),361-70.Hiese, D. R. SeparatingReliabilityand Stabilityin Test-Retest Correlations, American Sociological Review, 34(February1969),93-101.Hoyt, C. Test Reliability Estimated by Analysis ofVariance, Psychometrika, (June1941),153-60.Jacoby,Jacob. ConsumerResearch:TellingIt Like It Is,in B. B. Anderson, ed., Advances in Consumer Research,Volume 3. Chicago:Association for ConsumerResearch,1976, 1-11.Kassarjian,HaroldH. Personality ndConsumerBehavior:A Review, Journal of Marketing Research, 8 (November1971),409-18.

    Kerlinger, F. N. Foundations of Behavioral Research, 3rded. New York:Holt, RinehartandWinston, 1973.Kuder,G. F. and M. W. Richardson. The Theoryof theEstimationof Test Reliability, Psychometrika,2 (Sep-tember1937),151-60.Landon,E. Laird,Jr. Self Concept, Ideal Self Concept,andConsumerPurchase ntentions, Journalof ConsumerResearch,1 (September 1974),44-51.Lawlis,G. F. and E. Lu. Judgment f CounselingProcess:Reliability, Agreement,and Error, PsychologicalBulle-tin, 78 (July 1972),17-20.Leavitt,Clarkand JohnWalton. Developmentof a Scalefor Innovativeness, in M. J. Schlinger,ed., Advancesin Consumer Research, Volume 2. Chicago: Associationfor ConsumerResearch, 1975, 545-54.Lord, F. M. and M. R. Novick. Statistical Theories of MentalTest Scores. Reading, Massachusetts:Addison-Wesley,

    1968.Lundstrom,WilliamJ. and Lawrence M. Lamont. TheDevelopmentof a Scale to MeasureConsumer Discon-tent, Journal of Marketing Research, 13 (November1976),373-81.Moschis,GeorgeP. SocialComparison ndInformalGroupInfluences, Journal of Marketing Research, 13 (August1976),237-44.Nunnally, J. Psychometric Methods. New York: McGraw-Hill Book Co., 1967.Perry, Arnon. HeredityPersonalityTraits, Product Atti-tude,andProductConsumption-An ExploratoryStudy,

    Journal of Marketing Research, 10 (November 1973),376-9.Peter, J. Paul and Michael J. Ryan. An InvestigationofPerceivedRisk at the BrandLevel, Journalof MarketingResearch,13(May 1976),184-8.Rogers, Everett. New ProductAdoptionand Diffusion,Journal of Consumer Research, 2 (March 1976), 290-301.Ryan,MichaelJ. andRichardC. Becherer. A MultivariateTest of CAD InstrumentConstructValidity, in B. B.Anderson, ed., Advances in Consumer Research, Volume3. Chicago:Association for ConsumerResearch, 1976,149-54.and E. H. Bonfield. The FishbeinExtended Modeland Consumer Behavior, Journal of Consumer Research,2 (September1975),118-36.and J. Paul Peter. Two OperationalModificationsfor Improving he Delineationof Attitudinaland SocialInfluenceson Purchase ntentions, n K. Bernhardt, d.,Marketing: 1776-1976 and Beyond. Chicago: AmericanMarketingAssociation, 1976,147-50.Spearman,C. TheProof andMeasurement f the Associa-tion Between Two Things, American Journal of Psychol-ogy, 15(January1904),72-101.. CorrelationCalculated romFaultyData, BritishJournalof Psychology,3 (October1910),271-95.Tryon, R. C. Reliabilityand BehaviorDomain Validity:Reformation and Historical Critique, PsychologicalBulletin,54 (May 1957),229-49.Vigderhous,G. Coefficientof ReliabilityAlpha, Journalof Marketing Research, 11 (May 1974), 194.Villani,KathrynE. A. and Yoram Wind. On the Usageof 'Modified' PersonalityTrait Measures in ConsumerResearch, Journal of Consumer Research, 2 (December1975),223-8.Waung, Sherren. Explaining Behavior with Weighted,Unweightedand StandardizedAttitude Scores, in M.J. Schlinger, ed., Advances in Consumer Research, Vol-ume 2. Chicago: Association for ConsumerResearch,

    1975,345-55.Wiley, D. E. andJ. A. Wiley. TheEstimationof Measure-ment Error in Panel Data, in H. M. Blalock, Jr., ed.,Causal Models in the Social Sciences. Chicago: Aldine-Atherton,1971.Wilson,DavidT., H. Lee Mathews,and James W. Harvey.AnEmpiricalTest of the FishbeinBehavioral ntentionsModel, Journal of Consumer Research, 1 (March 1975),39-48.Wright,Peter. FactorsAffectingCognitiveResistance toAdvertising, Journal of Consumer Research, 2 (June1975),1-9.

    17