
Psychological Bulletin, 1971, Vol. 76, No. 2, 105-110

BELIEF IN THE LAW OF SMALL NUMBERS

AMOS TVERSKY AND DANIEL KAHNEMAN 1

Hebrew University of Jerusalem

People have erroneous intuitions about the laws of chance. In particular, they regard a sample randomly drawn from a population as highly representative, that is, similar to the population in all essential characteristics. The prevalence of the belief and its unfortunate consequences for psychological research are illustrated by the responses of professional psychologists to a questionnaire concerning research decisions.

"Suppose you have run an experiment on 20 subjects, and have obtained a significant result which confirms your theory (z = 2.23, p < .05, two-tailed). You now have cause to run an additional group of 10 subjects. What do you think the probability is that the results will be significant, by a one-tailed test, separately for this group?"

If you feel that the probability is somewhere around .85, you may be pleased to know that you belong to a majority group. Indeed, that was the median answer of two small groups who were kind enough to respond to a questionnaire distributed at meetings of the Mathematical Psychology Group and of the American Psychological Association.

On the other hand, if you feel that the probability is around .48, you belong to a minority. Only 9 of our 84 respondents gave answers between .40 and .60. However, .48 happens to be a much more reasonable estimate than .85.2

1 The ordering of authors is random. We wish to thank the many friends and colleagues who commented on an earlier version, and in particular we are indebted to Maya Bar-Hillel, Jack Block, Jacob Cohen, Louis L. Guttman, John W. Tukey, Ester Samuel, and Gideon Shwarz.

Requests for reprints should be sent to Amos Tversky, Center for Advanced Study in the Behavioral Sciences, 202 Junipero Serra Boulevard, Stanford, California 94305.

2 The required estimate can be interpreted in several ways. One possible approach is to follow common research practice, where a value obtained in one study is taken to define a plausible alternative to the null hypothesis. The probability requested in the question can then be interpreted as the power of the second test (i.e., the probability of obtaining a significant result in the second sample) against the alternative hypothesis defined by the result of the first sample. In the special case of a test of a mean

Apparently, most psychologists have an exaggerated belief in the likelihood of successfully replicating an obtained finding. The sources of such beliefs, and their consequences for the conduct of scientific inquiry, are what this paper is about. Our thesis is that people have strong intuitions about random sampling; that these intuitions are wrong in fundamental respects; that these intuitions are shared by naive subjects and by trained scientists; and that they are applied with unfortunate consequences in the course of scientific inquiry.

We submit that people view a sample randomly drawn from a population as highly representative, that is, similar to the population in all essential characteristics. Consequently, they expect any two samples drawn from a particular population to be more similar to one another and to the population than sampling theory predicts, at least for small samples.

The tendency to regard a sample as a representation is manifest in a wide variety of situations. When subjects are instructed to generate a random sequence of hypothetical tosses of a fair coin, for example, they produce sequences where the proportion of

with known variance, one would compute the power of the test against the hypothesis that the population mean equals the mean of the first sample. Since the size of the second sample is half that of the first, the computed probability of obtaining z ≥ 1.645 is only .473. A theoretically more justifiable approach is to interpret the requested probability within a Bayesian framework and compute it relative to some appropriately selected prior distribution. Assuming a uniform prior, the desired posterior probability is .478. Clearly, if the prior distribution favors the null hypothesis, as is often the case, the posterior probability will be even smaller.
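Footnote 2's power computation can be reproduced in a few lines. The sketch below (modern Python, obviously not part of the original article) treats the first sample's standardized mean as the true effect and rescales it for a second sample half the size:

```python
from math import sqrt
from statistics import NormalDist

z1 = 2.23          # z obtained in the first study (n1 = 20)
n1, n2 = 20, 10
# Treat the first sample's standardized mean as the true effect; the second
# sample's z statistic is then centered at z1 * sqrt(n2 / n1).
noncentrality = z1 * sqrt(n2 / n1)
z_crit = 1.645     # one-tailed .05 criterion
power = 1 - NormalDist().cdf(z_crit - noncentrality)
print(round(power, 3))  # 0.473
```

This reproduces the .473 cited in the footnote; the Bayesian figure of .478 requires integrating over a prior and is not shown here.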


heads in any short segment stays far closer to .50 than the laws of chance would predict (Tune, 1964). Thus, each segment of the response sequence is highly representative of the "fairness" of the coin. Similar effects are observed when subjects successively predict events in a randomly generated series, as in probability learning experiments (Estes, 1964) or in other sequential games of chance. Subjects act as if every segment of the random sequence must reflect the true proportion: if the sequence has strayed from the population proportion, a corrective bias in the other direction is expected. This has been called the gambler's fallacy.

The heart of the gambler's fallacy is a misconception of the fairness of the laws of chance. The gambler feels that the fairness of the coin entitles him to expect that any deviation in one direction will soon be cancelled by a corresponding deviation in the other. Even the fairest of coins, however, given the limitations of its memory and moral sense, cannot be as fair as the gambler expects it to be. This fallacy is not unique to gamblers. Consider the following example:

The mean IQ of the population of eighth graders in a city is known to be 100. You have selected a random sample of 50 children for a study of educational achievements. The first child tested has an IQ of 150. What do you expect the mean IQ to be for the whole sample?

The correct answer is 101. A surprisingly large number of people believe that the expected IQ for the sample is still 100. This expectation can be justified only by the belief that a random process is self-correcting. Idioms such as "errors cancel each other out" reflect the image of an active self-correcting process. Some familiar processes in nature obey such laws: a deviation from a stable equilibrium produces a force that restores the equilibrium. The laws of chance, in contrast, do not work that way: deviations are not canceled as sampling proceeds, they are merely diluted.
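The arithmetic behind the answer of 101 is just a weighted average: one score is fixed at 150, and each of the remaining 49 scores is expected at the population mean. A minimal sketch:

```python
n, pop_mean, first = 50, 100, 150
# The first score is known; the other n - 1 scores are expected at the
# population mean, so the expected sample mean is a weighted average.
expected_mean = (first + (n - 1) * pop_mean) / n
print(expected_mean)  # 101.0
```

The 150 is not "canceled" by later draws; it is merely diluted across the 50 scores.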

Thus far, we have attempted to describe two related intuitions about chance. We proposed a representation hypothesis according to which people believe samples to be very similar to one another and to the population from which they are drawn. We also suggested that people believe sampling to be a self-correcting

process. The two beliefs lead to the same consequences. Both generate expectations about characteristics of samples, and the variability of these expectations is less than the true variability, at least for small samples.

The law of large numbers guarantees that very large samples will indeed be highly representative of the population from which they are drawn. If, in addition, a self-corrective tendency is at work, then small samples should also be highly representative and similar to one another. People's intuitions about random sampling appear to satisfy the law of small numbers, which asserts that the law of large numbers applies to small numbers as well.
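A quick simulation shows why the law of large numbers does not extend to small samples: the spread of sample means shrinks only as 1/√n, so small samples stray far more from the population value. (This sketch is ours, not the authors'; the fair-coin population and the sample sizes are arbitrary choices.)

```python
import random
from statistics import mean, stdev

random.seed(0)

def sd_of_sample_means(n, reps=2000):
    # Empirical spread of the sample mean for samples of size n
    # drawn from a fair-coin (0/1) population.
    return stdev(mean(random.choice([0, 1]) for _ in range(n))
                 for _ in range(reps))

# Theory: sd of the sample mean is 0.5 / sqrt(n).
for n in (5, 20, 80):
    print(n, round(sd_of_sample_means(n), 3))
```

Quadrupling the sample size only halves the spread; a sample of 5 is nowhere near as representative as a sample of 80.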

Consider a hypothetical scientist who lives by the law of small numbers. How would his belief affect his scientific work? Assume our scientist studies phenomena whose magnitude is small relative to uncontrolled variability, that is, the signal-to-noise ratio in the messages he receives from nature is low. Our scientist could be a meteorologist, a pharmacologist, or perhaps a psychologist.

If he believes in the law of small numbers, the scientist will have exaggerated confidence in the validity of conclusions based on small samples. To illustrate, suppose he is engaged in studying which of two toys infants will prefer to play with. Of the first five infants studied, four have shown a preference for the same toy. Many a psychologist will feel some confidence at this point, that the null hypothesis of no preference is false. Fortunately, such a conviction is not a sufficient condition for journal publication, although it may do for a book. By a quick computation, our psychologist will discover that the probability of a result as extreme as the one obtained is as high as 3/8 under the null hypothesis.
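The "quick computation" is a two-tailed binomial tail under the null hypothesis of no preference (p = .5): the splits at least as extreme as 4-vs-1 are 0, 1, 4, or 5 preferences out of five. As a sketch:

```python
from math import comb

n, k = 5, 4
# Sum binomial probabilities of all outcomes at least as far from n/2
# as the observed count, under H0: p = .5.
p = sum(comb(n, i) for i in range(n + 1)
        if abs(i - n / 2) >= abs(k - n / 2)) / 2 ** n
print(p)  # 0.375
```

The result is 12/32 = 3/8, nowhere near conventional significance.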

To be sure, the application of statistical hypothesis testing to scientific inference is beset with serious difficulties. Nevertheless, the computation of significance levels (or likelihood ratios, as a Bayesian might prefer) forces the scientist to evaluate the obtained effect in terms of a valid estimate of sampling variance rather than in terms of his subjective biased estimate. Statistical tests, therefore, protect the scientific community against overly hasty rejections of the null hypothesis (i.e., Type I error) by policing its many members


who would rather live by the law of small numbers. On the other hand, there are no comparable safeguards against the risk of failing to confirm a valid research hypothesis (i.e., Type II error).

Imagine a psychologist who studies the correlation between need for Achievement and grades. When deciding on sample size, he may reason as follows: "What correlation do I expect? r = .35. What N do I need to make the result significant? (Looks at table.) N = 33. Fine, that's my sample." The only flaw in this reasoning is that our psychologist has forgotten about sampling variation, possibly because he believes that any sample must be highly representative of its population. However, if his guess about the correlation in the population is correct, the correlation in the sample is about as likely to lie below or above .35. Hence, the likelihood of obtaining a significant result (i.e., the power of the test) for N = 33 is about .50.
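The .50 power claim is easy to check by simulation. In the sketch below (our illustration; the critical value .344 is the approximate two-tailed .05 threshold for a correlation with N = 33, taken as given), roughly half of the samples drawn from a population with ρ = .35 reach significance:

```python
import random, math

random.seed(1)
RHO, N, R_CRIT = 0.35, 33, 0.344  # R_CRIT: approx. .05 threshold (assumed)

def sample_r():
    # Draw N bivariate-normal pairs with population correlation RHO
    # and return the sample Pearson correlation.
    xs = [random.gauss(0, 1) for _ in range(N)]
    ys = [RHO * x + math.sqrt(1 - RHO ** 2) * random.gauss(0, 1) for x in xs]
    mx, my = sum(xs) / N, sum(ys) / N
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

power = sum(sample_r() > R_CRIT for _ in range(2000)) / 2000
print(round(power, 2))
```

The simulated power hovers around one-half: choosing N so that the expected correlation sits exactly at the significance threshold is a coin flip.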

In a detailed investigation of statistical power, Cohen (1962, 1969) has provided plausible definitions of large, medium, and small effects and an extensive set of computational aids to the estimation of power for a variety of statistical tests. In the normal test for a difference between two means, for example, a difference of .25σ is small, a difference of .50σ is medium, and a difference of 1σ is large, according to the proposed definitions. The mean IQ difference between clerical and semiskilled workers is a medium effect. In an ingenious study of research practice, Cohen (1962) reviewed all the statistical analyses published in one volume of the Journal of Abnormal and Social Psychology, and computed the likelihood of detecting each of the three sizes of effect. The average power was .18 for the detection of small effects, .48 for medium effects, and .83 for large effects. If psychologists typically expect medium effects and select sample size as in the above example, the power of their studies should indeed be about .50.

Cohen's analysis shows that the statistical power of many psychological studies is ridiculously low. This is a self-defeating practice: it makes for frustrated scientists and inefficient research. The investigator who tests a valid hypothesis but fails to obtain significant

results cannot help but regard nature as untrustworthy or even hostile. Furthermore, as Overall (1969) has shown, the prevalence of studies deficient in statistical power is not only wasteful but actually pernicious: it results in a large proportion of invalid rejections of the null hypothesis among published results.

Because considerations of statistical power are of particular importance in the design of replication studies, we probed attitudes concerning replication in our questionnaire.

Suppose one of your doctoral students has completed a difficult and time-consuming experiment on 40 animals. He has scored and analyzed a large number of variables. His results are generally inconclusive, but one before-after comparison yields a highly significant t = 2.70, which is surprising and could be of major theoretical significance.

Considering the importance of the result, its surprisal value, and the number of analyses that your student has performed—

Would you recommend that he replicate the study before publishing? If you recommend replication, how many animals would you urge him to run?

Among the psychologists to whom we put these questions there was overwhelming sentiment favoring replication: it was recommended by 66 out of 75 respondents, probably because they suspected that the single significant result was due to chance. The median recommendation was for the doctoral student to run 20 subjects in a replication study. It is instructive to consider the likely consequences of this advice. If the mean and the variance in the second sample are actually identical to those in the first sample, then the resulting value of t will be 1.88. Following the reasoning of Footnote 2, the student's chance of obtaining a significant result in the replication is only slightly above one-half (for p = .05, one-tail test). Since we had anticipated that a replication sample of 20 would appear reasonable to our respondents, we added the following question:

Assume that your unhappy student has in fact repeated the initial study with 20 additional animals, and has obtained an insignificant result in the same direction, t = 1.24. What would you recommend now? Check one: [the numbers in parentheses refer to the number of respondents who checked each answer]

(a) He should pool the results and publish his conclusion as fact. (0)


(b) He should report the results as a tentative finding. (26)

(c) He should run another group of [median = 20] animals. (21)

(d) He should try to find an explanation for the difference between the two groups. (30)

Note that regardless of one's confidence in the original finding, its credibility is surely enhanced by the replication. Not only is the experimental effect in the same direction in the two samples but the magnitude of the effect in the replication is fully two-thirds of that in the original study. In view of the sample size (20), which our respondents recommended, the replication was about as successful as one is entitled to expect. The distribution of responses, however, reflects continued skepticism concerning the student's finding following the recommended replication. This unhappy state of affairs is a typical consequence of insufficient statistical power.

In contrast to Responses b and c, which can be justified on some grounds, the most popular response, Response d, is indefensible. We doubt that the same answer would have been obtained if the respondents had realized that the difference between the two studies does not even approach significance. (If the variances of the two samples are equal, t for the difference is .53.) In the absence of a statistical test, our respondents followed the representation hypothesis: as the difference between the two samples was larger than they expected, they viewed it as worthy of explanation. However, the attempt to "find an explanation for the difference between the two groups" is in all probability an exercise in explaining noise.

Altogether our respondents evaluated the replication rather harshly. This follows from the representation hypothesis: if we expect all samples to be very similar to one another, then almost all replications of a valid hypothesis should be statistically significant. The harshness of the criterion for successful replication is manifest in the responses to the following question:

An investigator has reported a result that you consider implausible. He ran 15 subjects, and reported a significant value, t = 2.46. Another investigator has attempted to duplicate his procedure, and he obtained a nonsignificant value of t with the same

number of subjects. The direction was the same in both sets of data.

You are reviewing the literature. What is the highest value of t in the second set of data that you would describe as a failure to replicate?

The majority of our respondents regarded t = 1.70 as a failure to replicate. If the data of two such studies (t = 2.46 and t = 1.70) are pooled, the value of t for the combined data is about 3.00 (assuming equal variances). Thus, we are faced with a paradoxical state of affairs, in which the same data that would increase our confidence in the finding when viewed as part of the original study, shake our confidence when viewed as an independent study. This double standard is particularly disturbing since, for many reasons, replications are usually considered as independent studies, and hypotheses are often evaluated by listing confirming and disconfirming reports.
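The pooled value can be sketched with a back-of-the-envelope approximation: with equal sample sizes and equal variances, pooling averages the two means while halving the variance of the estimate, so the combined t is roughly (t1 + t2)/√2 (this ignores the change in degrees of freedom):

```python
from math import sqrt

t1, t2 = 2.46, 1.70  # two studies, 15 subjects each
# Equal n and equal variances: pooling averages the means and halves
# the variance of the estimate, so t_pooled ≈ (t1 + t2) / sqrt(2).
t_pooled = (t1 + t2) / sqrt(2)
print(round(t_pooled, 2))  # 2.94
```

The approximation gives about 2.94, in line with the "about 3.00" cited in the text.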

Contrary to a widespread belief, a case can be made that a replication sample should often be larger than the original. The decision to replicate a once obtained finding often expresses a great fondness for that finding and a desire to see it accepted by a skeptical community. Since that community unreasonably demands that the replication be independently significant, or at least that it approach significance, one must run a large sample. To illustrate, if the unfortunate doctoral student whose thesis was discussed earlier assumes the validity of his initial result (t = 2.70, N = 40), and if he is willing to accept a risk of only .10 of obtaining a t lower than 1.70, he should run approximately 50 animals in his replication study. With a somewhat weaker initial result (t = 2.20, N = 40), the size of the replication sample required for the same power rises to about 75.
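These sample sizes can be reproduced with a normal approximation. The sketch below (ours, not the authors') takes d = t/√N as the assumed true standardized effect and solves for the replication size at which the chance of t falling below 1.70 is .10:

```python
from math import sqrt, ceil
from statistics import NormalDist

def replication_n(t_initial, n_initial, t_target=1.70, risk=0.10):
    # Assume the true standardized effect is d = t / sqrt(n), and
    # approximate the replication t as normal with mean d * sqrt(n).
    d = t_initial / sqrt(n_initial)
    z = NormalDist().inv_cdf(1 - risk)   # about 1.28 for risk = .10
    return ceil(((t_target + z) / d) ** 2)

print(replication_n(2.70, 40))  # 49, i.e., approximately 50 animals
print(replication_n(2.20, 40))  # 74, i.e., about 75
```

Both figures match the text's "approximately 50" and "about 75."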

That the effects discussed thus far are not limited to hypotheses about means and variances is demonstrated by the responses to the following question:

You have run a correlational study, scoring 20 variables on 100 subjects. Twenty-seven of the 190 correlation coefficients are significant at the .05 level; and 9 of these are significant beyond the .01 level. The mean absolute level of the significant correlations is .31, and the pattern of results is very reasonable on theoretical grounds. How many of the 27


significant correlations would you expect to be significant again, in an exact replication of the study, with N = 40?

With N = 40, a correlation of about .31 is required for significance at the .05 level. This is the mean of the significant correlations in the original study. Thus, only about half of the originally significant correlations (i.e., 13 or 14) would remain significant with N = 40. In addition, of course, the correlations in the replication are bound to differ from those in the original study. Hence, by regression effects, the initially significant coefficients are most likely to be reduced. Thus, 8 to 10 repeated significant correlations from the original 27 is probably a generous estimate of what one is entitled to expect. The median estimate of our respondents is 18. This is more than the number of repeated significant correlations that will be found if the correlations are recomputed for 40 subjects randomly selected from the original 100! Apparently, people expect more than a mere duplication of the original statistics in the replication sample; they expect a duplication of the significance of results, with little regard for sample size. This expectation requires a ludicrous extension of the representation hypothesis; even the law of small numbers is incapable of generating such a result.
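The .31 threshold follows from the relation between r and t, namely t = r√(df)/√(1 − r²) with df = N − 2. Inverting this (the two-tailed .05 critical t of about 2.02 for df = 38 is taken from a t table, not computed here) gives:

```python
from math import sqrt

# Invert t = r * sqrt(df) / sqrt(1 - r^2) to get the critical r.
t_crit, df = 2.02, 38   # approx. two-tailed .05 critical t for df = 38
r_crit = t_crit / sqrt(t_crit ** 2 + df)
print(round(r_crit, 2))  # 0.31
```

Any correlation below this value, including about half of those that averaged .31 in the original study, fails to reach significance at N = 40.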

The expectation that patterns of results are replicable almost in their entirety provides the rationale for a common, though much deplored practice. The investigator who computes all correlations between three indexes of anxiety and three indexes of dependency will often report and interpret with great confidence the single significant correlation obtained.3 His confidence in the shaky finding stems from his belief that the obtained correlation matrix is highly representative and readily replicable.

In review, we have seen that the believer in the law of small numbers practices science as follows:

• He gambles his research hypotheses on small samples without realizing that the odds against him are unreasonably high. He overestimates power.

3 Examples can be supplied on demand.

• He has undue confidence in early trends (e.g., the data of the first few subjects) and in the stability of observed patterns (e.g., the number and identity of significant results). He overestimates significance.

• In evaluating replications, his or others', he has unreasonably high expectations about the replicability of significant results. He underestimates the breadth of confidence intervals.

• He rarely attributes a deviation of results from expectations to sampling variability, because he finds a causal "explanation" for any discrepancy. Thus, he has little opportunity to recognize sampling variation in action. His belief in the law of small numbers, therefore, will forever remain intact.

Our questionnaire elicited considerable evidence for the prevalence of the belief in the law of small numbers.4 Our typical respondent is a believer, regardless of the group to which he belongs. There were practically no differences between the median responses of audiences at a mathematical psychology meeting and at a general session of the American Psychological Association convention, although we make no claims for the representativeness of either sample. Apparently, acquaintance with formal logic and with probability theory does not extinguish erroneous intuitions. What, then, can be done? Can the belief in the law of small numbers be abolished or at least controlled?

Research experience is unlikely to help much, because sampling variation is all too easily "explained." Corrective experiences are those that provide neither motive nor opportunity for spurious explanation. Thus, a student in a statistics course may draw repeated samples of given size from a population, and learn the effect of sample size on sampling

4 Edwards (1968) has argued that people fail to extract sufficient information or certainty from probabilistic data; he called this failure conservatism. Our respondents can hardly be described as conservative. Rather, in accord with the representation hypothesis, they tend to extract more certainty from the data than the data, in fact, contain. The conditions under which people may appear conservative are discussed in Kahneman, D., and Tversky, A. Subjective probability: A judgment of representativeness. (Tech. Rep.) Oregon Research Institute, 1971, 2(2).


variability from personal observation. We are far from certain, however, that expectations can be corrected in this manner, since related biases, such as the gambler's fallacy, survive considerable contradictory evidence.

Even if the bias cannot be unlearned, students can learn to recognize its existence and take the necessary precautions. Since the teaching of statistics is not short on admonitions, a warning about biased statistical intuitions may not be out of place. The obvious precaution is computation. The believer in the law of small numbers has incorrect intuitions about significance level, power, and confidence intervals. Significance levels are usually computed and reported, but power and confidence limits are not. Perhaps they should be.

Explicit computation of power, relative to some reasonable hypothesis, for instance, Cohen's (1962, 1969) small, large, and medium effects, should surely be carried out before any study is done. Such computations will often lead to the realization that there is simply no point in running the study unless, for example, sample size is multiplied by four. We refuse to believe that a serious investigator will knowingly accept a .50 risk of failing to confirm a valid research hypothesis. In addition, computations of power are essential to the interpretation of negative results, that is, failures to reject the null hypothesis. Because readers' intuitive estimates of power are likely to be wrong, the publication of computed values does not appear to be a waste of either readers' time or journal space.

In the early psychological literature, the convention prevailed of reporting, for example, a sample mean as M ± PE, where PE is the probable error (i.e., the 50% confidence interval around the mean). This convention was later abandoned in favor of the hypothesis-testing formulation. A confidence interval, however, provides a useful index of sampling variability, and it is precisely this variability that we tend to underestimate. The emphasis on significance levels tends to obscure a fundamental distinction between the size of an effect and its statistical significance. Regardless of

sample size, the size of an effect in one study is a reasonable estimate of the size of the effect in replication. In contrast, the estimated significance level in a replication depends critically on sample size. Unrealistic expectations concerning the replicability of significance levels may be corrected if the distinction between size and significance is clarified, and if the computed size of observed effects is routinely reported. From this point of view, at least, the acceptance of the hypothesis-testing model has not been an unmixed blessing for psychology.
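For a concrete sense of the probable-error convention mentioned above: PE is the half-width of the 50% confidence interval, about 0.6745 standard errors of the mean. A minimal sketch with hypothetical numbers (sd = 15, n = 25 are our arbitrary choices):

```python
from math import sqrt
from statistics import NormalDist

sd, n = 15.0, 25                 # hypothetical sample
se = sd / sqrt(n)                # standard error of the mean
# PE = z_.75 * SE, since the central 50% of a normal lies within
# about 0.6745 standard deviations of the mean.
pe = NormalDist().inv_cdf(0.75) * se
print(round(pe, 2))  # 2.02
```

Reporting M ± PE thus displays sampling variability directly, which a bare p value does not.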

The true believer in the law of small num-bers commits his multitude of sins against thelogic of statistical inference in good faith. Therepresentation hypothesis describes a cogni-tive or perceptual bias, which operates regard-less of motivational factors. Thus, while thehasty rejection of the null hypothesis is grati-fying, the rejection of a cherished hypothesisis aggravating, yet the true believer is subjectto both. His intuitive expectations are gov-erned by a consistent misperception of theworld rather than by opportunistic wishfulthinking. Given some editorial prodding, hemay be willing to regard his statistical intui-tions with proper suspicion and replace im-pression formation by computation wheneverpossible.

REFERENCES

COHEN, J. The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology, 1962, 65, 145-153.

COHEN, J. Statistical power analysis for the behavioral sciences. New York: Academic Press, 1969.

EDWARDS, W. Conservatism in human information processing. In B. Kleinmuntz (Ed.), Formal representation of human judgment. New York: Wiley, 1968.

ESTES, W. K. Probability learning. In A. W. Melton(Ed.), Categories of human learning. New York:Academic Press, 1964.

OVERALL, J. E. Classical statistical hypothesis testingwithin the context of Bayesian theory. Psycho-logical Bulletin, 1969, 71, 285-292.

TUNE, G. S. Response preferences: A review of somerelevant literature. Psychological Bulletin, 1964, 61,286-302.

(Received February 26, 1970)