27
PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE RECALL JOHN R. ANDERSON 2 AND GORDON H. BOWER Stanford University A model of free recall is described which identifies two processes in free recall: a retrieval process by which the subject accesses the words, and a recognition process by which the subject decides whether an implicitly retrieved word is a to-be-recalled word. Submodels for the recognition process and the retrieval process are described. The recognition model assumes that during the study phase, the subject associates "list markers" to the to-be-recalled words. The establishment of such associates is postulated to be an all-or-none stochastic process. In the test phase, the subject recognizes to-be-recalled words by de- ciding which words have relevant list markers as associates. A signal detect- ability model is developed for this decision process. The retrieval model is introduced as a computer program that tags associative paths between list words. In several experiments, subjects studied and were tested on a sequence of overlapping sublists sampled from a master set of common nouns. The two- process model predicts that the subject's ability to retrieve the words should in- crease as more overlapping sublists are studied, but his ability to differentiate the words on the most recent list should deteriorate. Experiments confirmed this predicted dissociation of recognition and retrieval. Further predictions derived from the free recall model were also supported. This paper has several aims: First, we offer a critique of two popular "strength" theories, one which relates recall and recog- nition to the strength of one and the same memory trace, and another which relates only recognition memory to a similar strength measure; second, we develop a particular conceptualization about recog- nition memory which we believe satisfies the criticisms of the traditional strength theory; third, we illustrate how that recog- nition mechanism could be interfaced with a retrieval mechanism so as to yield a viable theory about multilist free recall. Along with reviewing published data relevant to these points, we also shall present new 1 This research was supported by Grant MH- 13950-04 from the National Institute of Mental Health to Gordon Bower. 2 Requests for reprints should be sent to John Anderson, Department of Psychology, Stanford University, Stanford, California 94305. experimental evidence bearing on the sec- ond and third points. Two THEORIES OF RECOGNITION AND RECALL Kintsch (1970) provides a careful review of the strength theory of recall and recog- nition which supposes that recall and recog- nition involve basically the same process, except that recognition of an item requires a lower "threshold of strength" than does recall. It is because of this lower threshold that recognition of an item is easier than recall. This strength theory has been an important learning theory for years (cf. Bahrick, 1965; McDougall, 1904; Postman, 1963). Support for this theory comes from the fact that several experimental variables affect recall and recognition in the same way. For instance, temporal variables such as study time, the retention interval, 97 1972 by the American Psychological Association, Inc.

PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

PSYCHOLOGICALREVIEW

VOL. 79, No. 2 MARCH 1972

RECOGNITION AND RETRIEVAL PROCESSES IN FREE RECALL

JOHN R. ANDERSON 2 AND GORDON H. BOWER

Stanford University

A model of free recall is described which identifies two processes in free recall:a retrieval process by which the subject accesses the words, and a recognitionprocess by which the subject decides whether an implicitly retrieved word is ato-be-recalled word. Submodels for the recognition process and the retrievalprocess are described. The recognition model assumes that during the studyphase, the subject associates "list markers" to the to-be-recalled words. Theestablishment of such associates is postulated to be an all-or-none stochasticprocess. In the test phase, the subject recognizes to-be-recalled words by de-ciding which words have relevant list markers as associates. A signal detect-ability model is developed for this decision process. The retrieval model isintroduced as a computer program that tags associative paths between listwords. In several experiments, subjects studied and were tested on a sequenceof overlapping sublists sampled from a master set of common nouns. The two-process model predicts that the subject's ability to retrieve the words should in-crease as more overlapping sublists are studied, but his ability to differentiatethe words on the most recent list should deteriorate. Experiments confirmedthis predicted dissociation of recognition and retrieval. Further predictionsderived from the free recall model were also supported.

This paper has several aims: First, weoffer a critique of two popular "strength"theories, one which relates recall and recog-nition to the strength of one and the samememory trace, and another which relatesonly recognition memory to a similarstrength measure; second, we developa particular conceptualization about recog-nition memory which we believe satisfiesthe criticisms of the traditional strengththeory; third, we illustrate how that recog-nition mechanism could be interfaced witha retrieval mechanism so as to yield a viabletheory about multilist free recall. Alongwith reviewing published data relevant tothese points, we also shall present new

1 This research was supported by Grant MH-13950-04 from the National Institute of MentalHealth to Gordon Bower.

2 Requests for reprints should be sent to JohnAnderson, Department of Psychology, StanfordUniversity, Stanford, California 94305.

experimental evidence bearing on the sec-ond and third points.

Two THEORIES OF RECOGNITIONAND RECALL

Kintsch (1970) provides a careful reviewof the strength theory of recall and recog-nition which supposes that recall and recog-nition involve basically the same process,except that recognition of an item requiresa lower "threshold of strength" than doesrecall. It is because of this lower thresholdthat recognition of an item is easier thanrecall. This strength theory has been animportant learning theory for years (cf.Bahrick, 1965; McDougall, 1904; Postman,1963). Support for this theory comes fromthe fact that several experimental variablesaffect recall and recognition in the sameway. For instance, temporal variablessuch as study time, the retention interval,

971972 by the American Psychological Association, Inc.

Page 2: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

98 JOHN R. ANDERSON AND GORDON H. BOWER

and massing and spacing of presentationshave similar effects on recognition andrecall (Kintsch, 1966; Olson, 1969). Also,the two tasks display similar serial positioncurves with small primacy and large re-cency effects (Shiffrin, 1970b).

However, in contrast to these com-monalities, there are a number of variablesthat affect recall and recognition in differ-ent ways, suggesting that different pro-cesses underlie the two performances. Asa first example, words of low frequency inthe Thorndike-Lorge count of text aremore easily recognized (Schwartz & Rouse,1961; Shepard, 1967), but are recalledmore poorly than words of high frequency(Hall, 1954). As a second example,subjects learning under intentional in-structions recall better than subjects learn-ing under incidental instructions, but thisordering of the groups is actually reversedfor recognition (Dornbush & Winnich,1964; Eagle & Leiter, 1964; Postman,Adams, & Phillips, 1955). Finally, as-sociative and categorical relationshipsamong items play an important role inrecall, but appear to have little if anyeffect on recognition (Cofer, 1967; Kintsch,1966).

An alternate theory about the relationbetween recognition and recall which hasas prestigious a history as the strengththeory is the two-process theory of recall(James, 1890; Kintsch, 1970; Muller,1913). This asserts that while free recallinvolves processes similar to those occurringin recognition, recall differs qualitativelyin that additional search processes are in-volved. Specifically, it is assumed thatduring recall, an item is first retrieved frommemory by the search process (whateverthat is), and then it is tested by the recog-nition process to determine if it is from theto-be-recalled list. Thus, in order for aword to be recalled it must both be suc-cessfully retrieved and recognized. Withinthis two-process model of recall, Kintschwas able to give a natural interpretation ofthe experimental variables which differ-entially affect recall and recognition. How-ever, these interpretations were clearlypost hoc, and therefore, as Kintsch (1970)

himself admits, "little direct evidence forthe two-stage theory is available at thistime [p. 341]." One of the purposes ofthe experiments to be reported in thispaper is to provide some evidence directlybearing on this issue.

STRENGTH THEORIES OF RECOGNITION

Current literature on memory appearsto present a solid concensus regardingthe dominant theory of recognition memory—a theory which we will criticize here andto which we will offer an alternative.The dominant theory assumes that uponpresentation of a stimulus on a recogni-tion-memory test, the subject can accessdirectly a representation or trace of thatitem in memory. Stored with this repre-sentation is some single continuous (ormany-valued discrete) variable that pro-vides a measure of the subject's degree orstrength of familiarity with the test item.We will call this measure the "strength"of the item, using the term formally in-troduced by Wickelgren and Norman(1966). However, this measure has beengiven many verbal guises—"familiarity"by Kintsch (1970) and Parks (1966);"amount of information" by Freund,Loftus, and Atkinson (1969); and "numberof feature matches" by Bower (1967).The important feature to note about thestrength theory of recognition is that it is"ahistorical" ; that is, it assumes that a sub-ject makes recognition decisions about anitem not on the basis of detailed memory ofthe past history of occurrences of the item,but rather on the basis of a single measurewhich reflects to some extent its past fre-quency, recency, and duration of exposure.It is this ahistorical character of strengththeory which is the source of all itsweaknesses.

The strength conception of recognitionmemory is practically as old as experi-mental psychology (see e.g., the use of"excitatory potential" by Hull, Hovland,Ross, Hall, Perkins, & Fitch, 1940). How-ever, a significant technical advance instrength theory came with Egan's (1958)application of statistical decision theory torecognition memory. That approach as-

Page 3: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 99

sumes that there is one normal distributionof strengths for those items which havebeen studied recently, and a different nor-mal distribution for those items whichhave not. The first distribution is calledthe "old" and the second the "new."The average distance between the two dis-tributions reflects the amount of strengththat was added to the studied items bytheir recent presentation. It is assumedthat the subject chooses some criterionpoint, C, along the dimension of strength.(This criterion corresponds to the former"threshold of recognition.") If the strengthof a particular item exceeds C, the subjectwill decide it is an old item and thus "recog-nize" it. If the strength is less than C,the subject will decide it is a new item andreject it.

This strength theory, elaborated with thetechnical machinery of statistical decisiontheory, can handle some of the salient factsgleaned from simple recognition experi-ments. For example, it can handle thedata taken from experiments in which thesubject is asked to rate the confidence ofhis recognition judgment (e.g., Murdock,1965). In particular, the theory predictsthe bowed-shape memory-operating char-acteristic relating changes in confidencerating to changes in the proportions of oldand new items assigned to the confidence-rating categories. The theory predictshow this memory-operating characteristicchanges as a function of retention variablessuch as recency and frequency (see Kintsch& Carlson, 1967). The theory can alsopredict the relationship between multiple-choice and single-item recognition experi-ments (e.g., Green & Moses, 1966; Kintsch,1968a).

However, two detracting points need tobe emphasized regarding these successes.First, these several phenomena are notstrong evidence for the theory because theresults have been shown to also be quiteconsistent with several alternate theoriesof decision making. For example, Kintsch(1968a) showed that the relationship be-tween yes-no and multiple-choice recogni-tion could be well predicted by at leastthree theories other than statistical de-

cision theory. Similarly, Lockhart andMurdock (1970) have shown that bowedmemory-operating-characteristic curves arederivable from a diverse variety of under-lying old and new distributions other thanthe presumed normal distributions. Sec-ond, and of greater consequence, these suc-cesses are really triumphs (such as theyare) for the technical machinery of statis-tical decision theory, not for the basicstrength assumptions. In fact, they pro-vide no discriminating evidence in favor ofthe strength assumptions. For instance,Bernbach (1967), using a different concep-tion of the underlying memory representa-tion but similar decision-making machinery,produced a recognition model equally asadequate as that generated from strengththeory. Similarly, we too will produce analternate memory model that is equallycapable of handling these elementary factsabout recognition. The thrust of thisargument is that results from simplerecognition experiments do not suffice todiscriminate among alternative conceptionsof the memory representation.

The strength approach can explain ourability to differentiate between a positiveand negative set of items on a retention testif these differ in strength; that is, if theitems differ in recency, frequency, and dura-tion of study. The approach fails, however,to distinguish theoretically among itemswhich are equated on frequency and re-cency, but are nonetheless differentiableempirically on other grounds. Yet thereare multiple dimensions along which todifferentiate sets of items besides their fre-quency, recency, duration, or some com-posite strength measure. Example differ-entia in verbal learning experiments wouldbe where in space the item was presented,who said it, how it was said, and otherspecial characteristics of its physical andpsychological context of presentation. Infairness, the original strength theoristsnever actually said that subjects could notperform discriminations on the basis ofthese dimensions. They were rather con-cerned with other matters. Strength offamiliarity was adopted as a convenientconcept for integrating with their technical

Page 4: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

100 JOHN R. ANDERSON AND GORDON H. BOWER

machinery. However, the evidence is nowavailable that an undifferentiated strengthof familiarity concept is not sufficiently richto account for the subject's ability todifferentiate sets of items.

Of relevance to this issue is evidenceshowing item differentiation in the subject'sediting of his free recall. An experiment byBower and Clark (1969) illustrates differ-entiation in recall of a sort that cannot behandled in simple strength terms. Theirsubjects learned lists of unrelated nouns bygenerating meaningful sentences, woveninto stories around the critical words. Thisheuristic greatly enhanced their later recallof the critical nouns. But, for presentpurposes, the significant fact is that sub-jects almost never intruded their elabora-tive additions while overtly recalling (inwriting) the critical nouns. In recalling,they would recite their story to themselvesbut write down only the critical nouns.That outcome is not an isolated incident.When asked, most subjects in free recallexperiments will report thinking aboutmany nonlist words (e.g., a category name)to help mediate recall of list words, yet themediators are rarely intruded in recall.An experiment by G. H. Bower and P.Winchester (unpublished) showed thatfollowing a lengthy word association task,subjects had over 90% accuracy in lateridentifying whether a given item was oneof the experimenter's stimulus words orone of their own associate responses.

The argument above claims that an ele-mental strength measure provides nogrounds for discriminating list items fromimplicit associative responses during listlearning. A related difficulty with a simplestrength theory is that it cannot accom-modate facts about list differentiation.This issue is engaged, for example, in single-trial free recall studies using multiple listsstudied and recalled in succession. In thetypical task, the subject is required torecall only the "most recent" list. Now,recall of the most recent list could be im-plemented in strength models by assuminga retrieval mechanism that outputs onlythose items whose strength exceeds a cri-terion amount (earlier items having de-

cayed below that amount). However, asShiffrin (1970a) has demonstrated, subjectsare also capable of recalling only those itemsfrom the list which just preceded the mostrecent list; following presentation of List n,his subjects recalled List n — 1, and theycould do so with reasonable accuracy.

One might imagine that strength theorycould also explain Shiffrin's results byinvoking the idea of the decay of memorystrength (see Wickelgren & Norman, 1966).Since an item's strength decays continu-ously between its presentation and itsretention test, one might suppose thataccess to List n — 1 is provided by an"internal editor" which recalls only thoseitems whose strengths fall in an appropri-ate bandwidth (see Hinrichs, 1970), abovea lower bound (distinguishing List n — 1from List n — 2) and below an upper bound(distinguishing List n — 1 from List «).This "bandwidth" explanation receivessome support in research by Hintzman andWaters (1969, 1970) showing that identi-fication of which of two lists an item ap-peared in improves with increasing tem-poral interval between the two lists butdeterioriates with increasing interval be-tween study of the lists and the retentiontest. So, temporal discriminations (im-plemented by whatever mechanism) areclearly important in list differentiation.

In counterargument, however, it can beshown that such temporal judgments can-not be based solely on any simple strengthvariable. This can be concluded from ex-periments which simultaneously vary fre-quency and recency. An item in List 1may be presented 10 times as frequently asan item in List 2; so, according to any"strength" theory, the List 1 item will bethe stronger. Yet in List 2 recall, the sub-ject is much more likely to recall the List2 item than the frequent List 1 item. Instudies of list differentiation, Winograd(1968) found in fact that such unbalancedfrequencies actually improved list dis-crimination rather than the converse asexpected by strength theory. Hintzmanand Block (1971) showed that correctassignment of an item to List 1 rather thanList 2 increased directly with its frequency

Page 5: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 101

of occurrence in List 1; a simple strengththeory of list differentiation would havepredicted the opposite, namely, higherprobability of assigning frequent items tothe more recent List 2,

Although list differentiation experimentscommonly use each item in just one ofseveral lists presented (see, however, Hintz-man & Block, 1971), interesting results arealso yielded by experiments in which agiven item appears in several lists. Wehave conducted such an experiment (G.H. Bower & J. R. Anderson, in preparation)in which subjects studied a series of fouroverlapping lists. There are 16 (24) pos-sible combinations of appearances andnonappearances for a particular word acrossthe four lists. Several words were used toinstantiate all 16 possible combinations,and subjects were later asked on a reten-tion test to indicate in which lists each wordhad occurred. For present purposes, themost important result of that experiment isthat subjects were much more accurate inlist identifications than one could ever pre-dict from a simple strength theory. Forinstance, subjects can correctly rememberthat one item occurred in Lists 1 and 4 andthat another occurred in Lists 2 and 3,etc.; yet it is simply unimaginable how asingle strength measure attached to eachitem could provide for such patterned dis-criminations. Such results show that"memory strength" alone is insufficient toaccount for the salient facts about listdiscrimination.

To pursue matters just one more counter-move, a strength theorist might try to ex-plain such embarrassing results by sup-plementing the strength information witha further independent measure of the tem-poral recency of the item. Such a "two-measure" theory might integrate the datain our "24" list discrimination experimentalong with Winograd's results and those ofHintzman and Waters. However, wewould then argue that two measures arestill not enough. For instance, the pro-posed two measures (strength plus a re-cency tag) cannot explain the ability ofsubjects to discriminate implicit associatesfrom test items, or their ability to identify

the location or the modality (auditory orvisual) in which an item was presented(D. L. Hintzman, unpublished). On thesegrounds it would seem reasonable to sup-pose that subjects can perform list dis-crimination on any reasonably salientdimension.

RECOGNITION VIA ASSOCIATIONTO CONTEXT

For such reasons, we reject the viewthat recognition is mediated by a simplestrength measure or any simple combina-tion of measures. In place of this we pro-pose that items are recognized by retrievalof certain kinds of context informationoriginally stored along with the item inquestion. This contextual informationincludes physical characteristics of anitem's presentation, implicit associationsto the item, and some cognitive elementsrepresenting the list in question. We haveno startling insights regarding the nature ofthese context elements which are presumedto vary across lists. They might includethe subject's general mood or attitude, hisphysical posture, and his physiologicalstate, as well as any conspicuous externalcues prevailing during presentation of Listn. One prominent set of cues may be pro-vided by implicit verbalizations, especiallyan implicit count as when the subject saysto himself "first list," "second list."Another prominent set of contextual cuesfor the recognition of an item could beprovided by other words in the list. Aninteresting curiosity is how the subjectidentifies where List n — 1 leaves off andList n begins; but clearly, temporal group-ing and distinctive signals, instructions,or activities (like recall) interpolated be-tween lists all combine to promote identi-fication of beginning and end segments ofthat serial unit we call List n. A furthercuriosity, not explored here, is the issue ofhow the subject decides whether temporallydistributed clusters of presentations con-stitute different presentations of the "same"list or are "different" lists.

A significant consequence of this con-text-retrieval theory is that the task ofitem recognition is not really different in

Page 6: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

102 JOHN R. ANDERSON AND GORDON H. BOWER

kind from that of list discrimination. Inboth tasks, the subject must make hisdecisions on the basis of contextual infor-mation retrievable from the test word.The only difference is that the recognitiontask involves a yes-no decision ; that is, thesubject must indicate whether he saw theword in a particular list context. In con-trast, list discrimination involves a forced-choice task; that is, the subject must decidein which list context the word occurred.

At this point, however, the proponent ofthe strength of familiarity hypothesismight want to object and insist on a theo-retical separation of the simple recognitionparadigm from the list discriminationparadigm. His argument might go asfollows: "I concede that list discrimina-tion tasks may require something beyonda strength measure, perhaps something likeyour 'contextual associations' business.But that does not disprove my strengththeory of pure recognition. In that task,surely, subjects use strength of familiaritybecause it is convenient and because itworks. In other words, I claim that simplerecognition and list discrimination involvefundamentally different mnemonic anddecision processes."

As long as both models handle equallywell the simple recognition task, there isno way to disprove this dualistic claim.However, on grounds of parsimony onewould say that there is no need for a specialtheory for a special circumstance whena more general theory subsumes the specialcircumstance and many others. Besidesits lack of parsimony, this attempt to givea special status to the pure recognitionparadigm ignores the fact that all recogni-tion experiments are implicitly list or setdiscrimination experiments. This is obvi-ous upon closer analysis of the "purerecognition" experiments. Every subjecthas seen, heard, and thought thousands ofwords before he sees the experimental list,and he frequently sees, hears, and thinks ofa great many nonlist words between studyand test. We have already noted thatsubjects think implicitly of many wordswhile the study list is being presented.Undoubtedly, included among the dis-tractors in a "pure recognition" test will

be words that the subject encounteredbefore, during, and after the study list inquestion. Therefore, the subject's actualtask is to discriminate a particular subsetof words from all others on the basis ofdifferentia such as where, when, and howthe word was encountered.

For example, in a recent experiment ofours (G. H. Bower and J. R. Anderson, inpreparation), the implicit discriminationinvolved in an alleged "pure recognition"experiment was explicitly given experi-mental analysis. After hearing a list ofstudy words which they were told to re-member, the subjects heard lengthy in-structions describing the multiple-choicerecognition test they were about to receive.The multiple choice included one wordfrom the study list along with two dis-tractors. Unknown to the subject, how-ever, one of the distractors was a semantic-ally similar word that had been mentionedtwo times in the instructions interpolatedbetween the study list and the recognitionretention test. In comparison to the study-list item, this "instruction distractor"should have been greatly favored in termsof a strength measure, because it enjoyedboth higher frequency and shorter recencythan the study-list word. However, asintuition and our model suggest, the sub-jects had not the slightest difficulty inselecting the study-list word of this mul-tiple-choice set. Their ability to discrimi-nate was also unrelated to whether thesubject indicated any awareness of theinstructional distractors. The instructionaldistractor was chosen just slightly morethan the totally new distractor, a resultconsonant with some generalization amongthe "list" and "instructional" contextsassociated to the words. This smalldemonstration just makes transparent thepoint we have been urging: What is, toappearances, a simple recognition memoryexperiment is, in actuality, a list differen-tiation experiment in disguise.

DETAILS OF THE MNEMONICREPRESENTATIONS

We could represent the entire constel-lation of contextual stimuli prevailing

Page 7: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 103

during List n presentation as a single pat-tern which is treated in the learning systemas a single unit, subject to all-or-none as-sociation to items appearing on List n.However, for technical and theoreticalreasons, it is desirable to postulate a poolof hypothetical elements which singly,separately, and independently serve toidentify List n. As in the stimulus-sam-pling theory of Estes (1959), List n willthen be represented as a population ofa number of variable stimulus components.There will be one such stimulus populationcorresponding to each ordinal list—one forList 1, one for List 2, and so on. It wouldbe realistic to assume that the successivestimulus populations form an ordered arrayof uniformly overlapping sets so thatstimulus generalization, confusion, or fail-ure of list differentiation would be mostlikely among temporally adjacent lists.However, the basis for that proximitymetric will not be presented here, but ithas been developed elsewhere (Bower,1972b).

Our basic conception of human memoryis that of a huge network of tens of thou-sands of nodes interconnected by associa-tions. The nodes correspond to individualconcepts the subject has, and the associa-tions encode relations that the subject haslearned between the concepts. Upon pre-sentation of a word in study, we assumethat the sensory features of that wordactivate the node in the network thatcorresponds to the word. Simultaneously,there are active in the network nodes cor-responding to the various contextual stimulithat the subject is attending to. We willmake two specific assumptions with respectto how the node corresponding to the wordbecomes connected to the contextual nodes:

1. At the time of occurrence of any wordin the list, we suppose that there is simul-taneously active a unique element or nodein memory which will be called the "listmarker" or "list tag." The purpose of thislist marker is to record the context prevail-ing during that presentation of the word.It does this by interconnecting the set ofcontextual nodes active at that moment.The marker thus acts like a label for a col-lection or bundle of contextual elements.

We assume that there is a probability 6that any particular element in the List npopulation of context elements is activeand is associated to the list marker.

2. There is a probability a that the sub-ject will form an association between thememory node corresponding to the wordand the memory node corresponding to thelist marker. This association serves torecord a particular fact about the word,namely, that it occurred in a particular ex-perimental context. Thus, we assume, incommon with Mandler (1967), Kintsch(1970), Norman and Rumelhart (1970),and many others, that what is learned ina verbal learning task is not the word itselfbut rather information about the word(e.g., that it was presented in surprisingred letters in the first part of the list whilethe subject was thinking about what hewould buy with his wages from service inthe experiment).

Figure 1 provides a schematic diagramof what we have been discussing. Threewords in memory, A, B, and C are beingassociated to a set of contextual elementsfrom List i and a set from List j. Thefigure illustrates some overlap between thesets of elements for Lists i and j because,in general, there is reason to believe thatthe contextual elements for a particular listwill not all be unique. These overlappingelements do not serve to discriminate List

WORDS CONTEXTUALELEMENTS

FIG. 1. A hypothetical state of marking of threewords A, B, and C, with respect to two lists iand j.

Page 8: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

104 JOHN R. ANDERSON AND GORDON H. BOWER

i from List j, but they could help dis-criminate both lists from a third list. InFigure 1, Word A has been successfullyassociated to a List i marker, Word B toboth a List i and a List j marker, and WordC to just a List j marker. The list markersserve to keep track of the lists in which aparticular word occurred.

One may ask why we have inserted, inFigure 1, list markers between the con-textual elements and the words. Why notsimply assume that the subject associatesthe contextual elements directly to thewords and eliminate list markers entirely?The reader should keep in mind that thepoint of introducing associations to contextis to provide a means for keeping track ofthe occasions and the lists in which particu-lar words appeared. This would be difficultto implement on the basis of direct associa-tions between the word and the contextualelements. How would the subject sortout which contextual elements belonged towhich list? But even greater difficultiesarise for this alternate hypothesis when weconsider the problem of how a subjectwould keep track of what happens withina single list. For instance, if a word ap-peared twice in a single list, we expect thata subject could often report that fact anddescribe the contexts prevailing at each ofthe presentations. According to the modelin Figure 1, the subject would have twodistinct list markers associated to the word.Each of these list markers would have as-sociated to it a different subset of the con-textual elements. Thus, when tested, thesubject could retrieve two distinct listmarkers and therefore judge that he hadseen the word in two contexts. Moreover,by means of the contextual elements as-sociated to each of the list markers, hemight be able to describe something aboutthe two contexts. For instance, Hintzmanand Block (1971) showed that the subjectshad considerable accuracy in reportingboth serial positions of a word that oc-curred twice in one list. On the other hand,consider the predicament of the subject ifall he had was direct word to context-element associations. If the word waspresented twice, more contextual elements

would be associated to the word, so thesubject might be able to judge that theword had been presented twice. However,he would have no way of sorting out whichcontextual elements belonged to whichpresentation. To reiterate, the point ofintroducing list markers in an associationtheory is to give the subject a means ofkeeping different events distinct in hismemory.

The notion of list markers developedabove is very similar to a model suggestedby Hintzman and Block (1971). Theyalso briefly considered an alternate model inwhich item repetition is reflected by mul-tiple copies or replicas of each word (seealso Bernbach, 1969; Bower, 1967). Withinsuch a model, the contextual elementswould be directly associated to the wordwithout an intervening marker label, butto a different replica of the word each timeit was presented. This approach may ap-pear to eliminate the need for the listmarkers and the association between thewords and the list markers. However, thatappearance is deceiving. In such a mul-tiple-copy model, it is necessary to postu-late some means of retrieving the manycopies from a single phonemic representa-tion of the word (e.g., in order to estimateits frequency of presentation in a list).When such a model is elaborated with thenecessary retrieval mechanism, it turns outto be isomorphic to ours: The prototypicalphonemic representation of the word cor-responds to the word in Figure 1; thedifferent copies correspond to the differentlist markers; and the probability of retriev-ing a copy from a phonemic representationcorresponds to the probability of formingan association between the word and a listmarker.

FORMAL MODEL FOR RECOGNITIONJUDGMENTS

The foregoing assumptions provide uswith a formal model for recognition judg-ments that is indistinguishable from Bern-bach's (1967). In deciding whether a testword was in List n, the subject evaluatesthe item with respect to how much evi-dence there is for the word's membership

Page 9: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 105

in List n. This is done by assessing thenumber of List n elements associated tothe list marker. During study of anyword in List n, there is probability a thata List n marker is associated to the word(a would be assumed to vary with study-time and similar "strengthening" vari-ables). If a List n marker is associated tothe word, the number of List n elementsassociated to the node of that marker willbe binomially distributed with parametersk, the number of elements in the List npopulation, and 8, the probability thata List n element is associated to the listmarker (see our earlier assumptions). Forlarge k, it is well known that a binomialdistribution approximates to a normal dis-tribution ; so such normal approximationswill be used throughout this paper. Wewill let fm(x) denote this probability dis-tribution of x, the amount of evidencetoward List n, for a test item that has beensuccessfully tagged with a list marker. Inthe following we will refer for convenienceto x as the scale of "List n evidence."

We let fu(x) denote the comparable dis-tribution of evidence for List n for itemsnot presented on List n as well as for itemspresented on List n but not successfullyassociated to a List n marker, which eventoccurs with probability 1 — a. One basisfor fu(x) having nonzero values could beoverlap or generalization between succes-sive list contexts in which the test itemoccurred, and hence confusions. For suchreasons, a test item may be judged falselyas occurring on List n when in fact it oc-curred on earlier lists.

The distributions fm(x) and /«(#), andthe parameter a, are precisely those re-quired in Bernbach's model. The proba-bility distribution of list evidence for anitem presented in the iti\ list is the proba-bility mixture

/.•(*) = <«/„(*) + (1 - «)/.(*).

The distribution /,•(*) is definitely notnormal, depending as it does on a and themeans and variances of fm(x) and fu(x).Figure 2 illustrates some possible distri-butions for f{(x) when /„ and /„ have equalvariances.

The ingredients for a "statistical deci-sion theory" analysis of recognition memoryare now at hand. The distributions /«(*)and /,•(*) serve, respectively, like the noiseand signal-plus-noise distributions in thetheory of signal detectability. So the wholeset of facts outlined previously regardingyes-no and multiple-choice recognitionprocedures and memory-operating charac-teristics are within explanatory reach of themodel. The model also extends in thecustomary way to handle confidence ratingsregarding an item's membership in Listn. It is supposed that the subject assignsa rating dependent on how much evidence(number of associated List n elements)relevant to List n membership is retrievedby the test item.

The foregoing remarks develop our modelfor recognition and list differentiation, andit is directly relevant to such data. How-ever, we are here concerned with the recog-nition process as a monitor or editor forfree recall. Following input of a word list,the cue for recall initiates some search andretrieval mechanisms which begin implicitlyproducing plausible list candidates, itemswhich might be on List n. These implicitcandidates are then input to a recognitionprocess to assess evidence for their List nmembership. If that evidence exceedscriterion, the candidate is designated aList n item and it is given in overt recall.

RETRIEVAL PROCESS

Having completed our description of therecognition model, we will now outlineour model of the retrieval process whichgenerates implicit candidate words forpossible overt recall. It is simply hopelessto imagine a random mechanism whichsearches haphazardly and serially (or inparallel, for that matter) through all pos-sible memory locations, hoping to stumbleacross a few of the list words. Our memo-ries are very large and not well organizedfor such random searches. One obviouslyneeds a directed search process, and themodel to be described accomplishes this bysearching only those associative pathwayswhich have been recently tagged as usefulfor retrieving the word set under considera-

Page 10: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

106 JOHN R. ANDERSON AND GORDON H. BOWER

(a) (|au-nm)/o =a =75

fu(x)>

Mm

fm(x)

FIG. 2. Possible forms of ft(x), the distribution of List i evidence for wordsfrom List i, as a function of the probability of association to the list marker a,and the distance between the distribution for marked words, fm(x), and thedistribution for unmarked words, fu(x).

tion. This model is realized in a computersimulation program described by Anderson(1972). Since its particular task environ-ment and testing ground was free recall ofword lists, it has been dubbed FRAN, anacronym for Free Recall in an AssociativeNetwork.

FRAN starts with a preexperimentalnetwork of associations among a large setof concepts (words). The experimenterselects a subset of these words to composea free recall list; this list is then presentedto FRAN one word at a time. FRAN com-

mences to develop a way to recall the wordson this presented list (call it List n) andonly these. FRAN learns which words arepresented on List n by tagging the memorynodes (corresponding to the words beingpresented) with a List n marker. Thiscorresponds to the recognition process wehave just described. In addition to thistagging of the memory node (call it A),FRAN follows one or more associativepathways radiating out from Node A,searching for other words (memory nodes)that are also on the list. This latter in-

Page 11: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 107

formation is provided by whether or notthe node being examined has a previousassociation to a List n marker. If anotherlist word is found in this associative search,then FRAN attempts to tag the associativepathway from Node A to the found targetnode. This tagging of a prior associativepathway proceeds just like the tagging ofa memory node, namely, by establishing anassociation to the List n marker. These listtags on associative pathways are what willlater guide the search process; a taggedpathway essentially delivers a later direc-tive to the search process to follow out thispathway to get from recall of Word A torecall of other list words.

There is one further process of signifi-cance occurring during the study phase ofa free recall experiment and that is con-struction of a sublist called ENTRYSET.The list of words serve as special "starters"from which FRAN can begin her chains ofrecall during output. Technically, a wordon ENTRYSET is simply one which re-ceives an association from a special memorynode called LIST N, which we may con-sider as a prototypical representation ofList n. The number of "recall starters" onENTRYSET is assumed to be very limited(in the current program, only three items).The program has a set of crude heuristicsfor eventually converging upon those itemswhich are most "central" in the associativenetwork in that they lead to recall of thelargest number of list words.

In the usual free recall experiment, dur-ing output FRAN first dumps out the threeto five recent list words in her short-termmemory. She then uses these plus theitems on ENTRYSET to start her searchof long-term memory for further words torecall. This search involves following thetagged associative pathways radiating outfrom each of the starter nodes in turn, witheach search proceeding in a "depth first"manner. During this directed associativesearch, FRAN considers many words, test-ing each for recognition (i.e., whether it hasan association to a List n marker). Ifa retrieved word is recognized, it is overtlyrecalled.

The reader will note that context infor-

mation is being used in three different waysin FRAN. First, via the list marker (col-lections of context elements), retrieval ofcontext from a test node underlies correctidentification of List n words, thus formingthe basis for list discrimination. Second,via the LIST N node (i.e., the prototype ofthe List n context), the context also has thestatus of a "stimulus" initiating recallchains from the ENTRYSET words.These two uses of context are not originalwith us. Similar notions were suggested byNorman and Rumelhart (1970); a stan-dard assumption of interference theory isthat context cues serve as stimuli govern-ing emission of a response (Keppel, 1968;McGovern, 1964). We have expandedthese notions and provided an explicitrepresentation and model for them. Thethird use of context, to tag associativepaths in the memory network, is unique toour learning and retrieval model. Suchsecond-order associations, that is, an as-sociation between an associative link anda memory node (a List n marker), have notbeen used in the more traditional associa-tive theories. However, the second-orderassociation is really only a labeling of alower order association, and most currenttheories of memory (e.g., Quillian, 1969;Rumelhart, Lindsay, & Norman, 1972) doinvolve labeling of associations with suchsemantic relations as category membership,superordination, opposition, and the like.In FRAN, labeling preexisting associativepathways with a List n tag provides themechanism for guiding FRAN's memorysearch during recall. Since the two kindsof tagging, of memory nodes versus as-sociative pathways, are independent, it isthe case that word recognition in FRAN isindependent of the retrieval processes.This independence of the two processes willbe important in the experiments thatfollow.

The above description of FRAN isnecessarily brief and incomplete, focused ononly those aspects pertinent to the two-process recall hypothesis. Papers byAnderson (1972) and Bower (1972a) shouldbe consulted for further details of thetheory, the simulation program, and a sam-

Page 12: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

108 JOHN R. ANDERSON AND GORDON H. BOWER

pling of results which the theory explainsand those with which it has difficulties.In our opinion, it is the most viable theoryof free recall currently available in termsof its range of applicability.

EXPERIMENT I

Experiment I is one of those researchcuriosities begun for irrelevant reasons butwhich turned out to provide the stimulusfor this program of research and theorizing.We presented subjects 15 successive listsof 16 words, each list selected in a pseudo-random manner from a master set of 32words. As a consequence, lists after thefirst had partial overlap in membershipwith earlier lists. After the presentationof each list, the subjects were required torecall only those words that were in themost recent list. According to our re-trieval model, the ability to retrieve thefull set of 32 words should improve withrepeated study trials because there is moretime to discover and tag associative path-ways between the words.

However, the model makes a markedlydifferent prediction about the recognitionsubprocess of free recall. The implicitcandidates for recall generated by the re-trieval component will include many ormost of the master set of 32 plus someassociatively related items. The task ofthe recognition component is to select whichof these words occurred in the most recentlist. This it does by checking to see whichof the candidates are tagged with listmarkers referring to the most recent list.Since a different set of 16 words must berecalled on each trial, the subject can onlysucceed if he associates a different listmarker with the to-be-recalled words oneach trial (list). Tagging with list markersis conceived essentially as a paired-associatetask in which the word functions as thestimulus term and the marker functions asthe response term. Thus, the tagging ofa single word appearing in successive listsexemplifies the classical A-B, A-C para-digm for negative transfer. One mightassume that the amount of negative trans-fer accumulates as the subject learns A-B,

then A-C, then A-D, etc. Similarly, weassume negative transfer grows as the sub-ject learns associations to list markers as inA-List 1, A-List 3, A-List 5, and so forth.We have not found a cumulative negativetransfer study in the literature that exactlyparallels our situation, but Underwood(1945) has demonstrated that proactiveinhibition increases as the same stimulusterm is paired with progressively moreresponses. It is reasonable to assume thatnegative transfer would also increase asthe same stimulus is paired with more re-sponses. For instance, such a result ispredicted by the assumptions about re-sponse competition proposed by Kjelder-gaard (1968).

In this experiment, each word will bepaired with a new list marker on every trialthat it appears for study. Therefore, theprobability of successfully tagging a wordshould decrease across lists, with a conse-quent decrease in the subject's ability torecognize to-be-recalled words. Thus, a dis-sociation of the two subprocesses isexpected, with retrieval improving andrecognition of implicitly retrieved wordsdeteriorating across successive lists. Sucha dissociation would emphasize the utilityof conceiving of free recall in terms of thetwo-process model. It would be impossiblefor the threshold model, which relates bothrecognition and retrieval to the same factorof "strength," to account for such adissociation.

Method

From a master set of 32 common concrete nouns,15 lists of 16 words were generated. Words wereassigned randomly to lists with the constraint thatthe overlap of each list with each other list wouldbe eight words. Such independent, multiple over-lapping of lists was done so that the subjects couldnot use membership of a word in one list as anyindication of its membership in any other list.Therefore, decisions about membership of a word inthe current list would have to be based solely on theword's rating on the scale of evidence for that list.By the sixth list, every word in the master set hadoccurred in at least one list. By the fifteenth list,a particular word had occurred in anywhere from4 to 11 of the lists. The order of words withina particular list was randomly determined. Thesame sequence of IS lists was presented to all of thesubjects. The words were slide projected one at

Page 13: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 109

a time for two seconds. Immediately after presen-tation of the list, the subjects wrote their recall ofthe most recent list of words they had just studied.They had 135 seconds for this recall. They wereinstructed to recall a word only if they thought it"probable" that the word came from the last liststudied. The experimenter collected the recallsheets and the procedure was repeated for the nextlist. Including the instructions, the experimentalsession lasted about SO minutes. Seventeen sub-jects (6 males, 11 females; 16-22 years old) wererecruited through an advertisement in a local news-paper and were paid $1.75 for their services. Theywere tested in two groups of size 7 and 10.

Results

Figure 3 shows the average number ofwords recalled from the most recent list(abbreviated R words) and separately, thenumber of words which were intruded fromearlier lists and were not on the most recentlist (N words). The number of R wordsrecalled minus the number of N words, alsoshown in Figure 3, provides a conservativecorrection for guessing in recall. In thecurve for R-word recall and in the cor-rected-recall curve, there appears to be aninitial improvement in recall followed bydeterioration. Using orthogonal poly-nomials, an equation involving the first sixpowers was fitted to the corrected recall

over the 15 lists. The improvements in thefit due to the linear and quadratic compo-nents of the curve were significant(F = 8.35, df = 1/8, p < .05, for thelinear component; F= 12.36, df = 1/8,p < .01, for the quadratic component).However, the addition of the four higherorder polynomials did not significantly im-prove the fit of the theoretical curve(F = 2.03, df = 4/8, p > .10). Thesmooth curve in Figure 3 describes thebest fitting quadratic equation for the cor-rected recall. There is considerable varia-bility of the observed points about thequadratic curve. This variability may beattributed to differences among the indi-vidual word lists. The theoretical inter-pretation of this overall rise then fall inrecall is that the improvement in itemretrievability predominated initially, butthat after retrievability had reached anupper limit, the degradation of list differ-entiation continued and recall deterioratedas a consequence. This is because recall ismonitored or edited by list differentiationdecisions. This interpretation is furthertested in the following experiments.

iz.o

ca

i11.0

10.0-

002

9.0

R WORDS

-CORRECTED RECALL

I 2 3 4 5 6 7 8 0 10 II 12 13 14 IS

T R I A L

FlG. 3. Mean number of words recalled in Experiment I as a function of trials.

Page 14: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

110 JOHN R. ANDERSON AND GORDON H. BOWER

EXPERIMENT II

A simple challenge to this interpretationof the results is to question whether theresults have anything to do with the par-ticular experimental manipulation em-ployed. Quite possibly the initial improve-ment may simply reflect "learning tolearn" and the later decrement may onlyreflect progressive fatigue or loss of motiva-tion of the subjects. The obvious control,then, is to repeat the last experiment butwith a completely different list (i.e., nowords repeated) on each trial. Thereshould be negligible across-trial change inretrievability because the subject is learn-ing new items on every list. Also becausenew words are being studied on each trial,both the stimuli and the responses for thehypothetical paired-associate task under-lying the list marking are different on eachtrial. Therefore, there should be little, ifany, negative transfer in identifying themost recent list. Hence, if the results ofExperiment I were really due to the pseudo-random repetition of words across lists,then neither the initial increase nor thelater decrease in recall should be observedin this second experiment.

MethodTwenty lists of 16 words were created by sampling

randomly without replacement from a master setof 320 common concrete nouns. The same 20 listswere used in the same order for all subjects. Theprocedure for testing was identical to that in Ex-periment I. This experiment, including instruction,lasted about 70 minutes. Twenty-one subjects (10males and 11 females; 18-22 years old) served inthis experiment as partial fulfillment of a require-ment for the introductory psychology course atStanford. They were tested in two groups of size10 and 11.

ResultsFigure 4 gives the mean number of

words recalled in Experiment II as a func-tion of the trial number. Strictly speaking,only the first 15 trials are relevant to acomparison with Experiment I. Over thesetrials there is neither a significant linearnor a significant quadratic trend. How-ever, if all 20 trials are considered, thelinear trend becomes significant (F = 8.04,df = 1/17, p < .05) although the quadratic

II.0

10.0

cc.UJm

Z

z

9.0

8.0

a 4 6 8 10 12 14 16 18 20

TRIALFIG. 4. Mean number of words recalled in

Experiment II as a function of trials.

trend is still not significant (F = 1.87,df = 1/17). The smooth curve in Figure4 describes the best fitting quadratic equa-tion to the data over the 20 trials. Thenonsignificant quadratic trend in Figure 4is exactly opposite to that of Figure 3—thatis, recall is worse in the middle trials of theexperiment. Flowever, the only significanteffect is the linear trend which indicatessomething of a warm-up effect with recallimproving slightly toward the end of theexperiment. While the improvement acrosstrials is not very substantial, it contrasts ina minor way with the results of Murdock(1960) who found no improvement acrossunrelated lists in single-trial free recall.

EXPERIMENT III

Although the preceding experiments con-firmed expectations, the hypothesizedmechanisms would be more credible if wecould observe separately the decay inrecognition and the increase in retrieva-bility rather than viewing only their com-bined effect on recall. One would then beable to determine whether improvementin retrievability first dominated but waslater overcome by deterioration in listrecognition. The third experiment pro-vides separate measures of retrieval andlist recognition.

Method

The subjects in this experiment studied the iden-tical sequence of 15 lists as used in Experiment I.However, they were required to try to recall the

Page 15: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 111

entire master set of 32 words on each trial, not justthe 16 presented in the most recent list. The sub-jects were further required to rate on a 6-point con-fidence scale whether or not they thought eachword they recalled came from the most recent listthey had studied. The rating scale ranged from"1 = confident the recalled word was on the mostrecent list" up to "6 = confident the recalled wordwas not on the most recent list." Hence, to usethe terminology of Experiment I, a low confidencerating indicated that the subject thought the wordwas an R word, whereas a high confidence ratingindicated that he thought it was an N word. Byexamining recall of R words without regard to theirrating, changes in retrievability may be monitored;by examining the confidence ratings, changes in thesubject's ability to recognize words can bemonitored.

The subjects were given 165 seconds to writetheir recall and to record confidence ratings besideeach of the words recalled. Detailed instructionsand illustrations on use of the confidence scale weregiven since a pilot study had indicated that manysubjects would misunderstand the instructions anduse low numbers just for those words which ap-peared for the first time in the most recent list.Including the instructions, the experimental sessionlasted about 75 minutes. Twenty-four subjects(10 males, 14 females; 16-23 years old) were re-cruited through an advertisement in a local news-paper. In a two-hour experimental session they par-ticipated in this and another unrelated experimentwhich followed, and they were paid $3.50 for thetotal session. They were run in three groups of

sizes 6, 8, and 10. Six subjects were excluded fromthe analysis because of failure to use the recognitionscale properly on some of the trials.

Results

An initial question is whether theearlier results of Figure 3 have been dupli-cated under these altered conditions. Inthis experiment, a rating of 1 indicatedthat the subject was certain that he hadseen the word in the list just studied, anda rating of 2 indicated he thought it prob-able that the word came from that list.Since the subjects were instructed in Ex-periment I to recall only the words theythought "probably had occurred" in thelast list, the words recalled and rated 1 or 2in Experiment III should be comparableto those recalled in Experiment I. Figure5 shows the frequency with which R words(in the most recent list) were recalled andrated 1 or 2, and the frequency with whichN words (not in most recent list) were re-called and rated 1 or 2. Figure 5 also showsrecall corrected for guessing by subtractingthe mean for the N words from the meanfor the R words. The data represented in

10.0

atO

a: 8'°CO2

I 7.0

<UJ

1.0

0.0-

R WORDS

\/\ ^ *

CORRECTED RECALU- -

N WORDS

I 2 3 4 5 6 7 8 9 10 II 12 13 14 15

TRIALFIG. 5. Mean number of words recalled and rated "1" or "2"

in Experiment III as a function of trials.

Page 16: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

112 JOHN R. ANDERSON AND GORDON H. BOWER

Figure 3 and the data represented in Figure5 are quite similar. As in Experiment I,an equation involving the first six powerswas fitted by means of orthogonal poly-nomials to the corrected recall over trials.The improvement in the fit due to the linearcomponent was marginally significant(F = 5.06, df = 1/8, p < .10); the im-provement in fit due to the quadraticcomponent as quite significant (F = 9.54,df = 1/8, p < .025); the four higher orderpolynomials did not significantly improvethe fit (F = .90, df = 4/8). The smoothcurve in Figure 5 describes the best fittingquadratic equation to the corrected recall.That quadratic equation confirms an initialrise and subsequent fall in the correctedrecall. Thus, we may conclude that theprocedural changes in this experiment havenot altered the basic processes that wereoccurring in the first experiment.

We can now determine whether, ashypothesized, retrievability improves whilelist recognition deteriorates across trials.Figure 6 presents the mean number ofwords recalled, while Figure 7 shows themean confidence rating of the words re-called. In these figures, the data areclassified into four groups according to

whether or not the word was in the mostrecent list and according to the number oftimes the word had appeared before thatlist. The eight R words that had beenpresented fewer times than the mean num-ber for those in the most recent list wereclassified as LR (less frequent, recent), andthe other eight presented more than themean, MR (more frequent, recent). Whenfrequencies were tied, the words assignedto the LR group were those for which thelag between the current presentation andthe next most recent was the longest. Ona similar basis, the N words were sub-classified as LN and MN.

In Figure 6 it is clear that recall of Rwords increases with negative accelera-tion to an asymptote, confirming the hy-pothesis about the improvement in re-trievability. On the other hand, Figure 7shows that the mean difference in recog-nition ratings between R words and Nwords steadily decreases as a function oftrials, confirming the hypothesis about thedegradation in list recognition. Thus, theinterpretation of the results of ExperimentI as arising through the dissociationof retrieval and recognition has beenconfirmed.

14.0

12.0

iuiCD

10.0

6.0

2.0

TOTAL R

TOTAL N

I 2 3 4 5 6 7 8 9 10 II 12 13 14 IS

TRIAL

FIG. 6. Mean number of words recalled in Experiment IIIas a function of trials

Page 17: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 113

6.0

Oz< 5.0

OZ 4.0

eU-

8

UJ 2.0

1.0 I 2 3 4 5 6 7 8 9 10 II 12 13 14 IS

T R I A L

FIG. 7. Mean rating of the words recalled in Experiment IIIas a function of trials.

Several auxiliary hypotheses arising fromour theoretical position may also be ex-amined. First, our model relates improve-ments in retrieval over trials to the factthat the subject has more opportunities tolocate and learn associative paths to accessthe words. Therefore, those words bestrecalled should be those that have appearedin the most prior lists—a simple frequencyeffect. Inspection of Figure 6 confirms thatthe more frequently occurring words (rep-resented in the curves labeled MR andMN) were initially better recalled than theless frequently occurring words (curves LRand LN). As retrievability approached itsasymptotic level, the differences betweenmore and less frequently occurring wordsdiminished, as is to be expected. Theseconclusions from a visual examination ofFigure 6 are fully substantiated by statis-tical analyses.

Second, on the hypothesis that negativetransfer increases as the same word ispaired with more list markers, one wouldpredict that the mean confidence ratingshould be lower (i.e., more correct) forthose R words that occurred less frequently.The fewer prior list markers associated toa word, the more "novel" it is, the less thenegative transfer in establishing the new"word —* List i" association. Figure 7

shows 14 trials over which a comparison ofconfidence ratings can be made betweenMR and LR words. Of these 14 compari-sons, 11 trials exhibit the predicted in-equality. To confirm the reliability of thisdifference, the mean ratings across these14 trials for the MR and LR words werecomputed for each of the 18 subjects. Thedifference in average ratings between thetwo types of words is highly significant(t = 4.41, df = 17, p < .001). This find-ing is all the more impressive when onerealizes that just the opposite ordering ofthe MR and LR curves would be predictedon the basis of a strength model whichrelates recognition to frequency, recency,and duration of exposure.

Considering the data at the top of Figure7, on 12 of the 13 relevant trials, the lessfrequent N words were better differentiated(have higher scores) than the more fre-quent N words. In other words, morefrequently presented N words were judgedas more likely to have been on the mostrecent list than were less frequently pre-sented N words. A t test confirms thesignificance of this difference (t = 3.31,df = 17, p < .005). It is not clear how tointerpret this result. Perhaps it could beexplained by the mechanism of generaliza-tion of contextual elements between lists.

Page 18: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

114 JOHN R. ANDERSON AND GORDON H. BOWER

The more prior lists in which an item hasappeared, the greater the probability wouldbe that one of the list markers associatedwould be mistaken as a list marker for thecurrent list. The problem with the gen-eralization mechanism is that it wouldpredict the opposite ordering of the MRand LR curves. To explain why the LRcurve is below the MR we would haveto make the added and unmotivated as-sumption that the negative transfer in listtagging outweighed the list generalizationfactor. In any event, the conclusion to bedrawn from Figures 6 and 7 is that, in thisparadigm, retrieval is directly and recog-nition is inversely related to frequency ofexposure.

Model Testing

In the introduction, we stated that thedistribution of values for nonpresented Nwords on the scale of List i evidence is thenormal fu(x), but the distribution for theR words is the nonnormal ft(x) = afm(x)+ (1 — a ) f u ( x ) . A relatively simple testof this mathematical model is possible. Theprobability that a particular N word israted with a confidence exceeding j pro-vides an estimate of the standard normaldeviate corresponding to the j'th criterionpoint. In this way, the data from the Nwords provide estimates of the intervalsbetween the five criterion points. A gridsearch can then find the probability oftagging, a, and of the mean of the distribu-tion of tagged words, /um, that will yieldthe best fitting distribution f i ( x ) for the Rwords. The best fitting parameters arethose that yield predicted frequencies ofthe confidence ratings that deviate mini-mally from the observed frequencies asmeasured by a chi-square test.

One could test the statistical significanceof the deviations between observed andpredicted frequencies as a goodness-of-fitstatistic. However, a few large chi-squaresmay be expected even if the model wereessentially correct because the estimates ofthe confidence intervals are only approxi-mate and are subject to random error. Afairer test of the model would find theminimum chi-square estimates of seven

free parameters, namely, the five criterionpoints (for the confidence ratings), a, andMm- Unfortunately, the computing cost tofind seven parameters is many orders ofmagnitude greater than the cost to esti-mate two parameters. So, instead of thebest absolute test of the model, we willpresent a comparative test of the model.We will compare our model with a plausiblealternative theory, the traditional recog-nition model that presumes that one normaldistribution underlies words from the listand another underlies words not from thelist (e.g., Parks, 1966; Wickelgren &Norman, 1966). Since that model assumesthat the two distributions have the samevariance, the grid search needs only to esti-mate the mean of the likelihood distributionfor R words, d'. The value of d' wouldcharacterize a memory-operating charac-teristic in the traditional signal-detecta-bility analysis.

The test proposed is impossible for Trial1 because there are no N words recalled;also not enough N words were recalled onTrial 2 to provide reliable estimates of theintervals between the criterion points.Therefore, comparisons of our model withthe traditional one will be confined toTrials 3 through 15. Table 1 presents forthese trials the estimated values of thecriterion points (cL,c-i,Cz,ci,Cb), a, fj,m, and d'as well as the chi-square measures for devia-tions of our model and of the traditionalsignal-detectability model. For every trial,the predictions of the traditional signal-detectability model lead to large chi-squares, while the assumptions of ourmodel results in smaller chi-squares. Only4 of the 13 chi-square totals for our modelare significant at the .05 level. Noticethat as predicted by the principle of nega-tive transfer, there is a decrease acrosstrials in a, the probability of successfullytagging (associating) a word with a listmarker. Using the weights suggested byAbelson and Tukey (1963) to test formonotonic trend, the decrease in a acrosstrials is found to be highly significant(F = 24.32, df = 1/11, p < .001). Al-though less obvious, there is also a mar-ginally significant decrease in the value of

Page 19: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 115

TABLE 1

PARAMETER ESTIMATIONS AND CHI-SQUARE DEVIATIONS FOR THREE MATHEMATICAL MODELS

Trial

34S6789

101112131415

Criterian points

Cl

1.882.331.601.601.881.881.701.881.642.171.601.441.64

Cl

1.341.701.251.481.551.441.441.481.441.601.341.231.34

C3

1.181.371.131.371.511.231.101.101.181.201.04.90.97

a

1.141.08.90

1.231.231.08.93.86.82.92.64.63.66

Ct

.82

.61

.51

.76

.69

.47

.37

.40

.47

.45

.11

.35

.40

Traditional model

d

2.602.852.202.162.341.862.051.951.672.091.621.451.51

X1

(df = 4)

25.3954.1332.8020.9929.0336.7546.6862.4436.44

103.6435.6629.3452.76

Proposed model

a

.91

.93

.85

.95

.89

.79

.83

.77

.78

.79

.77

.70

.65

Mm

3.423.633.082.382.982.542.812.912.372.992.342.272.69

(<*/*= 3)

.502.545.70

16.984.794.54

15.496.168.50

22.501.765.887.55

Tradition model withvariable a\

d'

4.444.753.582.683.142.343.092.772.052.932.021.791.97

<ri

2.382.402.341.481.831.822.142.171.642.311.721.692.15

y!

W = 3)

.804.593.72

15.264.261.15

10.432.309.76

12.57.66

4.155.60

Mm (F = 6.14, df = 1/11, p < .05). Thisdecrease in the distance between the dis-tributions fm(x) and fu(x) may reflectincreasing confusion between the list markerfrom the most recent trial and list markersfrom earlier trials because of the general-ization of contextual elements.

Perhaps* a more appropriate alternativemodel would be the traditional signal-detectability theory with the variance ofthe distribution for R words as a free pa-rameter to be estimated from the data.This model has the advantage of equalingours in the number of estimated param-eters. The fit of this expanded model isshown in the last columns of Table 1,showing the estimates of d' and in, themean and standard deviation of the /<(#)distributions. In terms of goodness of fit,there is little basis to choose between thismodel and ours. This outcome was notentirely unexpected. Inspection of Figure2 shows that the likelihood distributionft(x) is sometimes sufficiently normal tomake difficult the discrimination between itand the true normal distribution predictedby the traditional model for the R words.Furthermore, we are dealing with an aver-age of likelihood distributions. There isa different distribution fi(x) for everyword because each word has appeared ina different set of lists and therefore has

a different amount of negative transferassociated with it as a stimulus. Theaverage of these many individual distribu-tions will be considerably more normal thanany of its constituents.

It is pertinent to examine the changes inthe parameters <T, and d' across trials forthis version of the traditional signal-detectability recognition model. The esti-mates of d', the distance between the meansof the distributions, show a fairly consistentdecrease across trials. One difficulty withthis traditional signal-detectability modelis that it provides no theoretical base fromwhich one might predict this decrease ind'. Also, the parameter <7» is fluctuatingerratically from trial to trial. These fluc-tuations are probably a consequence of theminimum chi-square estimation procedure.Occasionally, events with low theoreticalprobabilities will have frequencies that areconsiderably deviant. The estimation pro-cedure gives considerable weight to suchdeviant low-probability events. In an at-tempt to minimize deviations of predictedfrom observed frequency for such an event,the predicted distribution can be markedlyaltered from one trial to the next.

Discussion of Experiment IIIThe results of Experiment III showing

an improvement in recall alongside a de-

Page 20: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

116 JOHN R. ANDERSON AND GORDON H. BOWER

cline in recognition, bring into sharp focusthe failing of the "threshold" theory whichrelates recognition and free recall to a singletheoretical construct, such as strength orfamiliarity. We have dissociated twobehavioral measures which that theoryclaims are coupled by the nature of thesystem. Such data, along with other argu-ments advanced in the introduction, renderthe threshold theory untenable as an ac-count of multilist free recall.

In a similar vein, our results increase theimplausibility of a simple strength of famil-iarity theory of list recognition. The basicproblem with such theories is that the sub-ject's judgment of the list membership ofan item is not wholly predictable from thatitem's "composite strength." Both in Ex-periment I and Experiment III, all 32 wordsof the master set rapidly became quite"familiar" to the subject. And yet, asFigures 3 and 5 demonstrate, the subjectsstill showed considerable ability to recog-nize (discriminate) R words from N words.This discrimination varied only moderatelywith the frequency of an item's presentationprior to the current list being discriminated.

One might suppose that this R versus Ndiscrimination could be effected by arapidly decaying "short-term" strengthattached to items, such that only items inthe most recent list would have strengthsexceeding a specific criterion. But thisaccount then finds inexplicable the resultsof Experiment III in which subjects notonly recalled but correctly labeled the Nwords that were not presented on the mostrecent list. How are such N items beingdiscriminated from the vast pool of knowncommon nouns in the subject's long-termmemory which are not being used at all inthe experiment? Such questions, and theseveral others posed in the introduction,overstrain and discredit the simple strengthor familiarity assumptions about list recog-nition. In place of simple strength, wepropose that subjects use something like"list markers" for indexing the lists inwhich an item has appeared.

One piece of evidence that appears toupset the picture established by the firstthree experiments comes from a free recallstudy reported by Ehrlich (1970). He

gave his subjects 10 trials of free recalllearning on a 20-word list. By the tenthlist, recall had reached a near maximum of19 out of 20. Then the subjects wereswitched to a series of 10 single-trial freerecall trials on semirandom subsets of theoriginal set of 20, much in the manner ofExperiment I. Since retrievability for theset of 20 had apparently asyrnptoted, onemight expect to see a continuous decay inthe number of words recalled as a conse-quence of the build-up in negative transferfor list identification. However, the levelof recall remained constant over the 10 sub-lists in Ehrlich's experiment. It is possiblethat over the initial 10 trials on the wholelist in his experiment, negative transfer mayhave reached its asymptotic level, and hencethere would be no more deterioration in thelist-tagging process over the last 10 trialson the part lists. By the fifteenth trial inour Experiment I, subjects had seen eachword in a mean of 7 lists; by the first part-list trial of his experiment, subjects hadseen each word in 10 lists.

The point is that we cannot expect thedecay in recognition to continue forever.If our experiment had been extended formore trials, recognition performance wouldsurely decay eventually to some asymptoticlevel. The two facts that both retrieva-bility increases to an asymptotic level andthat recognition also decreases to an asymp-tote implies that the phenomenon of a riseand fall in recall found in Experiments Iand III is in reality a rather delicate mat-ter. It depends on retrievability increasingfaster than recognition deteriorates, butasymptoting sooner. In a pilot attempt toreplicate Experiment III with alteredprocedures (more time for recall, differentrating scale, different subject population),we replicated the increase in retrieval dis-played by Figure 6 and the deterioration ofrecognition displayed by Figure 7. How-ever, the quadratic trend in Figure 5 (therise and fall in recall) was relatively slightand in fact nonsignificant statistically.

EXPERIMENT IV

Our model assumes that the processesunderlying recognition in free recall are

Page 21: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 117

identical to those processes which underlieperformance in a pure recognition task.However, evidence has not yet been pre-sented to support this assumption. Thenext experiment does this by examiningpure recognition performance with thesame 15 lists that were used in ExperimentsI and III. If our negative-transfer as-sumption is correct, we should find thesame deterioration in recognition with fre-quency of presentation as was obtained inExperiment III.

Method

The identical sequence of 15 lists were presentedas in Experiments I and III, but with two differenttesting methods. For the first test method, im-mediately following the presentation of the list onany trial, the subject had to add mentally 15 single-digit numbers presented at a two-second rate. Thismanipulation was intended to prevent the subjectfrom recognizing any test word from his short-termmemory (see Anderson, 1972). The list markertheory of recognition only applies to items not inshort-term memory at the time of testing. InFRAN, any item still in short-term memory at thetime of testing is, of course, assured of perfect recog-nition. After this summation task, the subjectswere given a sheet listing the master set of 32 wordsand they were to rate on a 1 to 8 scale the subjec-tive likelihood or confidence that each word hadbeen on the most recent list. In this experiment,a high rating indicated that the subject thought theword very probably was on the most recent list, anda low number indicated that it very probably wasnot on the most recent list.

For the second test 'method, the subject wasshown the test items one at a time slide-projectedat a five-second rate. The subject rated the items onthe same 8-point scale as in the first test method.These two different methods of test were used toinsure the generality of the results. The secondmethod corresponds more to how we postulate thatrecognition happens in free recall—that is, the sub-ject must judge one word at a time. However, thereis no compelling reason to expect differences betweenthe two methods. Thirty-one subjects (15 males and16 females; 18-32 years old) participated in thisexperiment as partial fulfillment of a requirementfor the introductory psychology course at Stanford.They were run in groups ranging in size from 5 to10. Twenty subjects were tested by the first testmethod, and 11 by the second test method. Twosubjects tested under the first test method wereeliminated from the analysis because of incorrect useof the recognition scale.

Results

No systematic differences appeared inthe confidence ratings as a function of themethod of test, so only the pooled resultswill be presented. Figure 8 presents themean confidence ratings for the differenttypes of words across trials. Note that theconfidence scale in this experiment wasreversed from Experiment III in that in thisexperiment high numbers mean the subjectthought the word was on the most recentlist. Also an 8- rather than a 6-point con-fidence scale was used. It does appearthat the mean difference in rating between

2.0

= 3.0

<ruiO 4.0

§ 5.0O

6.0

7<° I 2 3 4 5 6 7 8 9 10 II 12 13 14 15

TRIAL

FIG. 8. Mean rating of the words in Experiment IV as a function of trials.

Page 22: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

118 JOHN R. ANDERSON AND GORDON H. BOWER

the R words and the N words decreasesacross trials. On Trials 1-5, the meandifference is 3.31; on Trials 6-10, it is 2.92;on Trials 11-15, it is 2.65. A test for amonotonic decreasing trend in the differ-ence between the R and N words is highlysignificant (t = 6.10, df = 13, p < .001).However, the deterioration in recognitiondisplayed in Figure 8 is not as dramatic asthat in Figure 7 from Experiment III. Asin Experiment III, those words presentedless frequently (represented in the curvesLR and LM) are the more accurately iden-tified. Thus we have replicated the find-ings of Experiment III with respect torecognition.

From the data of this experiment, wecan discover what free recall would be likein an experiment in which retrieval wasperfect from the start but recognition wasstill subject to deterioration through nega-tive transfer in the list-tagging process.

Retrieval is perfect because the experi-menter provides the subject with all thewords; the subject needs only to decidewhich came from the most recent list.Figure 9 indicates the sort of free recalldata that would obtain under the circum-stance of perfect retrieval. Here we havethe number of R words rated 7 or 8 andthe number of N words rated similarly.These may be taken to represent thenumber recalled and the number intrudedin the hypothetical circumstance. We havealso plotted in Figure 9 the corrected recallobtained by deducting intrusions fromrecalls. Figure 9 is to be compared withFigure 3 from Experiment I and Figure 5from Experiment III. As we would predict,since "retrieval" is asymptotic from thestart, there is a rather dramatic decline in"recall" as negative transfer builds up andimpairs the recognition component.

Thus, it would appear that we may

10.0

9.0

oc0 8.0

a:UJCO

7.0

e.o

1.0

0.0 -

WORDS

CORRECTED RECALL

I 2 3 4 5 6 7 8 9 10 II \Z 13 14 15

TRIALFIG. 9. Mean number of words rated "7" or "8" in Experiment IV as a function of trials.

Page 23: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 119

profitably conceive of recognition and listdiscrimination experiments in paired-associ-ate terms in which each word serves as thenominal "stimulus" and the List n markeras the nominal response. This functionalidentification leads to implications regard-ing negative transfer and retroactive andproactive inhibitions. This experiment hasclearly shown negative transfer may beobtained in a list discrimination task. InG. H. Bower and J. R. Anderson (in prepa-ration), further data will be reported toindicate that retroactive and proactiveinhibition can also be obtained in a similardesign.

GENERAL DISCUSSION

We will mention several conceptual puz-zles regarding recognition memory whichmay have a viable solution within theframework of our theory. It should bestressed, however, that the theory was notconstructed to account for these pheno-mena ; they are rather in the form of after-thoughts. They will be given a morecomplete analysis in a forthcoming paperof G. H. Bower and J. R. Anderson.

Part-Whole Negative Transfer

The notion we have developed of nega-tive transfer in the tagging of items inmultiple lists may help to explain thepuzzling phenomenon of negative transferin part-to-whole or whole-to-part studies offree recall (see Tulving, 1966; Tulving &Osier, 1967). A subject pretrained withpart of a free recall list will subsequentlylearn the whole list more slowly than a con-trol subject pretrained on an irrelevant listbefore receiving the whole list. The diffi-culty is largely localized in the very poorimprovement in recall of old (part-list)items (see Bower & Lesgold, 1969). Thisoutcome would be predicted if there werenegative transfer in associating a List 2marker to an item previously associated toa number of List 1 markers, and if whole-listrecall were monitored and edited for a List2 tag associated to the candidate items be-fore they were overtly recalled. Thus,part-list items previously associated to aList 1 tag would acquire List 2 tags more

slowly and would thus be frequently editedout from free recall. This outcome hingescritically on the experimental subject notbeing aware that all part-list items are con-tained in the whole list. If he were to beinformed of this fact, then there would beno list discrimination problem, and themonitor would recall any candidate itemretrieved having either a List 1 or a List 2tag associated to it. Thus, informed sub-jects should give only positive part-to-whole transfer. This is indeed the case, ashas been found by E. Tulving (personalcommunication, 1971).

According to this analysis, the negativetransfer in part-whole experiments is oc-curring in the recognition phase of free re-call and not the retrieval phase. In fact,the theory expects negative transfer if thesubject went from study of a whole list tomore study on the selfsame whole list—provided he was led to believe that thefirst and second lists contained some differ-ent items. This result is precisely what hasbeen found by R. M. Schwartz and M. S.Humphreys.3

Associatively Related False Alarms

The theoretical location of fu(x) relativeto fm(x) in Figure 2 may provide a meansfor rationalizing the effects on recognitionmemory of similar or associatively relateddistractors. It is well known that suchrelated distractors elicit more false-positiverecognition judgments than do unrelateddistractors. We may conceive of this asmediated recognition. Although the testitem may not be directly marked, it maycall to mind an associated list word whichwas marked, and on the basis of that evi-dence the subject may infer that the testitem was on the list.

This indirect evidence, from mediatedrecognition, is available not only for relateddistractors but also for unmarked list items.Recall that during list study, it is presumedthat the subject is searching out and mark-ing associative pathways linking list items.

3 R. M. Schwartz and M. S. Humphreys. Listdifferentiation in part-whole free recall. Unpublishedmanuscript, 1971.

Page 24: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

120 JOHN R. ANDERSON AND GORDON H. BOWER

Hence, a list item may have several as-sociative paths marked to other list itemswithout itself being marked. Becausemediated retrieval of a list marker can oc-cur for either unmarked list' words or as-sociatively related distractors, it is of onlypartial reliability as evidence for list mem-bership of the test item.

What the subject does with such indirectevidence will probably depend on the test-ing situation. For example, in multiple-choice tests, the subject would doubtlesschoose a directly marked test word inpreference to a mediately marked testword; but he would also choose the latterover distractors having neither direct normediate list markings. For single stimulusor yes-no tests, allowing for mediatedretrieval of list markers is equivalentmathematically to adding a constant Cto the "List i evidence" scores for all itemsthat have marked associations to itemsthat can be independently identified aslist members. This constant would beadded to marked list items, unmarked listitems, and associatively related distractors.

The effect of this allowance on perfor-mance depends jointly on the nature of theold study items and the new distractoritems. Let us consider the base controlcondition to be recognition performance ona list of unrelated words tested against es-sentially unrelated distractors. Comparedto that control, recognition would be higherfor a list of highly interassociated items(e.g., members of one taxonomic category)when tested against unrelated distractors.This corresponds to what Kintsch (1968b)called "category recognition" and it is im-plemented in the model by increasing m,the mean of the fi(x) distribution, by aconstant C without at the same time shift-ing the distribution /«(») for the unrelateddistractors.

On the other hand, consider the casewhen all the distractors are in the samecategory (or categories) as the study items;then fi(x) and fu(x) would both be shiftedup the scale by C. This is because ineither case an unmarked test word of thecategory is likely to elicit highly associ-ated words of that category which are

marked. The result, then, would be no netchange in d' or the overall recognition per-formance, as Kintsch (1968b) reported.In this sense, associative or categorical"organization" of the study list will affectrecall but not recognition against seman-tically related distractors.

If the recognition test series containsa mixture of semantically related and un-related items, then the subject is essentiallydealing with three distributions havingrespective means of m + C for old items,Hu + C for related distractors, and /xu forunrelated distractors. Since the decisionmodel supposes that the subject selectsa single, item-independent criterion formaking his yes-no recognition judgments,the expected outcome is a higher false-alarm probability for related than for un-related distractors.

Returning to the earlier discussion ofdifferentiation from unrelated distractors,the importance of list-mediated recognitionwill depend on the average probability ofretrieving indirect list evidence for an un-marked list item during testing. Thatprobability will depend in turn on theaverage degree of interitem associationsestablished by the list-studying conditions.For example, if free classification of a setof items into more tightly packed categoriesimproves free recall (Mandler, 1967), thenthe number of categories should also havea significant (though smaller) effect on listrecognition against unrelated distractors.Mandler, Pearlstone, and Koopmans (1969)have reported such results; following freeclassification, later recognition performance(mainly hit rate) improved linearly withthe number of categories into which thesubjects had sorted the items.

Mandler et al. proposed a "retrievalcheck" hypothesis to explain their correla-tion between recognition and the numberof categories. That hypothesis assumesthat an item not recognized immediatelymay be recognized indirectly because it isrecallable from one or another retrieval cuefor the list. Our account is similar: Atest item not associated directly to a listmarker may be recognized because itelicits associates which do have associated

Page 25: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 121

list markers. Handler et al. would seemto emphasize the associations to the testitem from other list items or retrieval cues,whereas our "mediated recognition" hy-pothesis emphasizes the reverse associa-tions. The two views would appear diffi-cult to distinguish in practice.

Distinguishing Implicit Responses

An earlier criticism of familiarity theoriesof recognition is that they provide no basisfor editing out implicit responses thatsubjects make to the list items, since at thetime of test the two sets of items would beequated in terms of their frequency andrecency. The present approach couldhandle such matters by supposing that thesubject has control over whether or not hewill try to associate an item to a list marker.Items presented by the experimenter dur-ing that temporal block denoted as "Listi" activate processes which associate theitems to List i markers. But implicitresponses are discriminated at the time anddo not activate any processes designed toassociate them to List i markers. Analternative and more satisfying mecha-nism which achieves the same end assumesthat the subject associates with the item,as one of the prevailing contextual elements,its source (the experimenter or the subject).On this latter view, the subject might beable to recall an item that helps mediaterecall of other list items and have avail-able the further information that (a) thisword occurred in the context of List i, and(b) it was an implicit associate of the sub-ject, not provided by the experimenter.

Frequency Estimates

Although the marker model was de-veloped for handling list differentiation,it would appear also to give a usable ac-count of frequency estimates, that is,judgments from memory of how manytimes a test item appeared in an earliercontext. A prototypical experiment is oneby Underwood, Zimmerman, and Freund(1971) in which the subject studied a longlist composed of words that individually ap-peared either one, two, four, or six times

scattered throughout the input list. Dur-ing a later test series, they judged frommemory how frequently a given word hadappeared. This is obviously an extendedrecognition memory experiment, since a"zero frequency" judgment is equivalent tononrecognition of the item, whereas a non-zero frequency judgment corresponds torecognition of the item.

Our model can keep track of item fre-quency by counting the different list mark-ers associated to the item. Each time anitem is presented in the context of List i,there is probability a that it becomes as-sociated with a List i marker. Let M,-,ndenote the number of List i markers as-sociated to an item presented n times inList i. Then Mi,n has the binomial dis-tribution given by

Pr[Mi.n = x\ = «*(!-«)-* [1]

The mean of Af,-,K is na and the variance isna(l — a), both increasing linearly withthe number of presentations of an item.

A plausible hypothesis is that the subjectjudges the frequency of occurrence of a testitem in List i by some transformation ofMi,n, the number of test markers associ-ated to that item. The simplest mappingfrom Mi,n to a frequency judgment Fi,n isa linear transformation, namely

= a + b Mi.n. [2]

This relation would predict a linear func-tion relating the mean frequency estimateto the number of presentations, a resultreported by Underwood et al. (1971).Furthermore, the variance of the frequencyestimates should increase linearly with thenumber of presentations; this prediction isapproximately borne out in the Underwoodet al. data.

This model may be elaborated to predictan effect of forgetting on frequency esti-mates. By whatever mechanism oneadopts, forgetting surely produces a lossof discrimination between items presentedvarying numbers of times. A simplerealization of this in the mathematicalmodel is to assume that each association,established during study between an item

Page 26: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

122 JOHN R. ANDERSON AND GORDON H. BOWER

and a list marker, has a probability 1 — f(t)of being retained over an interval of dura-tion t. Therefore, as time increases, thedistance between M,,o and Mi,n wouldshrink. This could appear in the data asa decrease in the slope of the line relatingmean frequency judgments to actual fre-quency. This prediction is upheld by theUnderwood et al. data plotting the averagejudged versus the actual frequencies atretention intervals of a few seconds, oneday, or seven days after study of the list.

REFERENCES

ABELSON, R. P., & TUKEY, J. W. Efficient utiliza-tion of non-numerical information in quantitativeanalysis: General theory and the case of simpleorder. Annals of Mathematical Statistics, 1963,34, 1347-1369.

ANDERSON, J. R. FRAN: A simulation model offree recall. In G. H. Bower (Ed.), The psychologyof learning and motivation. Vol. 5. New York:Academic Press, 1972, in press.

BAHRICK, H. P. The ebb of retention. PsychologicalReview, 1965, 72, 60-73.

BERNBACH, H. A. Decision processes in memory.Psychological Review, 1967, 74, 462-480.

BERNBACH, H. A. Replication processes in humanmemory and learning. In G. H. Bower & J. T.Spence (Eds.), The psychology of learning andmotivation. Vol. 3. New York: Academic Press,1969.

BOWER, G. H. A multicomponcnt theory of thememory trace. In K. W. Spence & J. T. Spence(Eds.), Psychology of learning and motivation.Vol. 1. New York: Academic Press, 1967.

BOWER, G. H. A selective review of organizationalfactors in memory. In E. Tulving & W. Donald-son (Eds.), Organization and memory. New York:Academic Press, 1972, in press, (a)

BOWER, G. H. Stimulus-sampling theory of encod-ing variability. In E. Martin & A. Melton(Eds.), Coding theory in learning and memory.New York: Academic Press, 1972, in press, (b)

BOWER, G. H., & CLARK, M. C. Narrative storiesas mediators for serial learning. PsychonomicScience, 1969, 14, 181-182.

BOWER, G. H., & LESGOLD, A. M. Organization asa determinant of part-to-wholc transfer in freerecall. Journal of Verbal Learning and VerbalBehavior, 1969, 8, 501-506.

COFER, C. N. Does conceptual organization in-fluence the amount retained in immediate freerecall? In B. Kleinmuntz (Ed.), Concepts and thestructure of memory. New York: Wiley, 1967.

DORNBUSH, R. L., & WINNICH, W. A. Short-termintentional and incidental learning. Journal ofExperimental Psychology, 1964, 68, 58-63.

EAGLE, M., & LEITER, E. Recall and recognition

in intentional and incidental learning. Journal ofExperimental Psychology, 1964, 68, 105-111.

EGAN, J. P. Recognition memory and the operatingcharacteristic. (Tech. Note AFCRC-TN-58-51)Bloomington: Indiana University, Hearing andCommunication Laboratory, 1958.

EHRLICH, S. Structuration and destructuration ofresponses in free-recall learning. Journal ofVerbal Learning and Verbal Behavior, 1970, 9,282-286.

ESTES, W. K. The statistical approach to learningtheory. In S. Koch (Ed.) Psychology: A study ofa science. Vol. 2. New York: McGraw-Hill,1959.

FREUND, R. D., LOFTUS, G. R., & ATKINSON, R. C.Applications of multiprocess models for memoryto continuous recognition tasks. Journal ofMathematical Psychology, 1969, 6, 576-594.

GREEN, D. M., & MOSES, F. L. On the equivalenceof two recognition measures of short-termmemory. Psychological Bulletin, 1966, 66, 228-234.

HALL, J. F. Learning as a function of word fre-quency. American Journal of Psychology, 1954,67, 138-140.

HINRICHS, J. V. A two-process memory-strengththeory for judgment of recency. PsychologicalReview, 1970, 77, 223-233.

HINTZMAN, D. L., & BLOCK, R. A. Repetition andmemory: Evidence for a multiple-trace hypothe-sis. Journal of Experimental Psychology, 1971,88,297-306.

HINTZMAN, D. L., & WATERS, R. M. Interlist andretention intervals in list discrimination. Psycho-nomic Science, 1969, 17, 357-358.

HINTZMAN, D. L., & WATERS, R. M. Recency andfrequency as factors in list discrimination.Journal of Verbal Learning and Verbal Behavior,1970, 9, 218-221.

HULL, C. L., HOVLAND, C. J., Ross, R. T., HALL, M.,PERKINS, D. T., & FITCH, F. B. Mathematica-deductive theory of rate learning. New Haven:Yale University Press, 1940.

JAMES, W. Principles of psychology. Vol. 1. NewYork: Holt, 1890.

KEPPEL, G. Retroactive and proactive inhibition.In T. R. Dixon & D. L. Horton (Eds.), Verbalbehavior and general behavior theory. EnglewoodCliffs, N. J.: Prentice-Hall, 1968.

KINTSCH, W. Recognition learning as a function ofthe retention interval and changes in the retentioninterval. Journal of Mathematical Psychology,1966, 3, 413-433.

KINTSCH, W. An experimental analysis of singlestimulus tests and multiple-choice tests of recog-nition memory. Journal of Experimental Psy-chology, 1968, 76, 1-6. (a)

KINTSCH, W. Recognition and free recall of or-ganized lists. Journal of Experimental Psychology,1968, 78, 481-487. (b)

KINTSCH, W. Models for free recall and recognition.In D. A. Norman (Ed.), Models of human memory.New York: Academic Press, 1970.

Page 27: PSYCHOLOGICAL REVIEW - Stanford Universitygbower/1972/recognition_retrieval_process.pdf · PSYCHOLOGICAL REVIEW VOL. 79, No. 2 MARCH 1972 RECOGNITION AND RETRIEVAL PROCESSES IN FREE

RECOGNITION AND RETRIEVAL PROCESSES 123

KINTSCH, W., & CARLSON, W. J. Changes in thememory operating characteristic during recogni-tion learning. Journal of Verbal Learning andVerbal Behavior, 1967, 6, 891-896.

KJELDERGAARD, P. M. Transfer and mediation inverbal learning. In T. R. Dixon & D. L. Horton(Eds.), Verbal behavior and general behavior theory.Englewood Cliffs, N. J.: Prentice-Hall, 1968.

LOCKHART, R. S., & MURDOCK, B. B., JR. Memoryand the theory of signal detection. PsychologicalBulletin, 1970, 74, 100-109.

MANDLER, G. Organization and memory. In K.W. Spence & J. T. Spence (Eds.), The psychologyof learning and motivation. Vol. 1. New York:Academic Press, 1967.

MANDLER, G., PEARLSTONE, A., & KOOPMANS, H. S.Effects of organization and semantic similarity onrecall and recognition. Journal of Verbal Learningand Verbal Behavior, 1969, 8, 410-423.

McDouoALL, R. Recognition and recall. Journalof Philosophical and Scientific Methods, 1904, 1,229-233.

McGoVERN, J. B. Extinction of associations in fourtransfer paradigms, Psychological Monographs,1964, 78(16, Whole No. 593).

MULLER, G. E. Zur Analyse der Geda'chtnista'tig-keit und des Vorstellungsverlaufes. ZeitschriftfurPsychologie, Erganzungsband 8, 1913.

MURDOCK, B. B., JR. The immediate retention ofunrelated words. Journal of Experimental Psy-chology, 1960, 60, 222-234.

MURDOCK, B. B., JR. Signal detection and short-term memory. Journal of Experimental Psy-chology, 1965, 70, 443-447.

NORMAN, D. A., & RUMELHART, D. E. A system forperception and memory. In D. A. Norman (Ed.),Models of human memory. New York: AcademicPress, 1970.

OLSON, G. M. Learning and retention in a con-tinuous recognition task. Journal of ExperimentalPsychology, 1969, 81, 381-384.

PARKS, T. E. Signal-detectability theory of recog-nition-memory performance. Psychological Re-view, 1966, 73, 44-58.

POSTMAN, L. One-trial learning. In C. N. Cofer& B. S. Musgrave (Eds.), Verbal behavior andlearning. New York: McGraw-Hill, 1963.

POSTMAN, L., ADAMS, P. A., & PHILLIP'S, L. W.Studies in incidental learning: II. The effects ofassociation value and method of testing. Journalof Experimental Psychology, 1955, 49, 1-10.

QUILLIAN, M. R. The teachable language com-prehender. Communications of the Association forComputing Machinery, 1969, 12, 459-476.

RUMELHART, D. E., LINDSAY, P. H., & NORMAN, D.A. A process model for long-term memory. InE. Tulving & W. Donaldson (Eds.), Organizationand memory. New York: Academic Press, 1972,in press.

SCHWARTZ, F., & ROUSE, R. D. The activation andrecovery of associations. Psychological Issues,1961,3 (Whole No. 9).

SHEPARD, R. N. Recognition memory for words,sentences, and pictures. Journal of Verbal Learn-ing and Verbal Behavior, 1967, 6, 156-163.

SHIFFRIN, R. M. Forgetting: Trace erosion or re-trieval failure? Science, 1970, 168, 1601-1603. (a)

SHIFFRIN, R. M. Memory search. In D. A.Norman (Ed.), Models of human memory. NewYork: Academic Press, 1970. (b)

TULVING, E. Subjective organization and effectsof repetition in multi-trial free-recall learning.Journal of Verbal Learning and Verbal Behavior,1966, S, 193-197.

TULVING, E., & OSLER, S. Transfer effects in whole-part transfer. Canadian Journal of Psychology,1967, 21, 253-262.

UNDERWOOD, B. J. The effect of successive inter-polations on retroactive and proactive inhibition.Psychological Monographs, 1945, 59(3, WholeNo. 273).

UNDERWOOD, B. J., ZIMMERMAN, J., & FREUND, J.S. Retention of frequency information withobservations on recognition and recall. Journalof Experimental Psychology, 1971, 87, 149-162.

WICKELGREN, W. A., & NORMAN, D. A. Strengthmodels and serial position in short-term recogni-tion memory. Journal of Mathematical Psychol-ogy, 1966, 3, 316-347.

WINOGRAD, E. List differentiation as a function offrequency and retention interval. Journal ofExperimental Psychology, 1968, 76 (2, Pt. 2).

(Received June 14, 1971)