The influence of encoding intention on electrophysiological indices of recognition memory

International Journal of Psychophysiology 56 (2005) 25–36

Johanna Catharina van Hooff *,1

Department of Psychology, University of Portsmouth, King Henry I Street, Portsmouth, PO1 2DY, United Kingdom

Received 13 October 2003; received in revised form 24 May 2004; accepted 28 September 2004

Available online 10 November 2004

Abstract

The main aim of this study was to further specify the encoding and retrieval conditions that determine the success of an

ERP-based memory assessment procedure, originally derived from lie detection studies. We examined whether event-related

brain potentials (ERPs) recorded during successful and unsuccessful retrieval would vary according to intentional (study) and

incidental (repetition) encoding conditions. Participants (N=20) were asked to indicate recognition of previously studied words

(learned targets, p=0.2) and words that were used as distractors in a preceding recognition task (repeated targets, p=0.2). Words

that were recognised elicited a P3 component, which was largely absent for new words and words that failed to be recognised.

Encoding intention was found to increase the P3 amplitude slightly but had no influence on P3 scalp distribution, suggesting

that the differently encoded targets were similarly processed during retrieval but to a different extent. The amplitude difference

was explained in terms of variance in memory trace strength and decision confidence. With respect to negative findings for

repeated items in our earlier study (Van Hooff, J.C., Golden, S. 2002. Validation of an event-related potential memory

assessment procedure: Intentional learning as opposed to simple repetition. J. Psychophysiol., 16, 12–22.), it was suggested that

the instruction to actively retrieve the repeated words was essential for obtaining reliable indications of the presence or absence

of weak memory traces.

© 2004 Elsevier B.V. All rights reserved.

Keywords: P3; Recognition memory; Encoding intention; Memory trace strength; Retrieval instruction

doi:10.1016/j.ijpsycho.2004.09.010

* Present address: Department of Psychology, University of Kent, Keynes College, Canterbury, Kent CT2 7NP, UK. Tel.: +44 1227 823097; fax: +44 1227 827030.

E-mail address: [email protected].

1. Introduction

This experiment is part of a series of studies examining the sensitivity of a memory assessment procedure, which, in addition to recognition judgements, involves the recording of event-related brain

potentials (ERPs). The rationale of this procedure is

derived from lie detection studies (e.g., Allen et al.,

1992; Farwell and Donchin, 1991) and is based on the

assumption that a P3 ERP component can be reliably

elicited by items that are infrequently presented and

that possess special significance for the participants. In

previous studies, we have demonstrated that, in


the absence of a behavioural indication of recognition,

items could gain special significance, and hence the

ability to elicit a P3, by virtue of previous learning (van

Hooff et al., 1996a) but not by virtue of mere repetition

(van Hooff and Golden, 2002). These results have

implications for the applicability of the ERP-based

memory assessment procedure since they suggest that

it may not be sensitive enough to detect possible weak

memory traces, for example, as a result of suboptimal

encoding conditions (e.g., during sleep or anaesthesia,

or in dual-task situations) or in patients with amnesic

syndrome (cf., Allen, 2002). The main aim of this study

was to further specify the encoding and retrieval

conditions that determine the success of the ERP-based

memory assessment procedure. More specifically, it

focused on effects of intentional vs. unintentional

encoding and successful and unsuccessful retrieval.

The P3 component is identified as a positive

deflection in the ERP waveform, reaching a maximum

over the central–parietal areas in a 250–800 ms

poststimulus time window. The P3 is typically elicited

during an oddball paradigm and is believed to reflect

processes that are essential for event categorisation.

According to Kok (1997), these processes are con-

trolled by working memory and attention, which refers

to both automatic capturing of attention and active

focussing of attention. To this extent, researchers

sometimes distinguish different P3 subcomponents

where the ‘novelty P3’ mainly reflects involuntary attention shifts to changes in the environment (Spencer et al., 1999) while the centro-parietal ‘P3b’ (or ‘classical P3’) mainly reflects processes associated

with the evaluation of task relevant stimuli (Kok,

2001). Furthermore, the novelty P3 has a more fronto-

central distribution than P3b and is believed to be

functionally related to the P3a subcomponent, which

indexes the automatic detection of deviant stimuli that

are not task relevant (Squires et al., 1975). In the rest of

this report, the P3 is referred to as the total collection of

these subcomponents, which frequently overlap in time

and which are not always easily distinguishable.

Targets or task relevant items are the subject of focused

attention and can thus be expected to elicit large P3s

when correctly classified as targets. In contrast, non-

targets or task irrelevant items may capture the

participant’s attention and, consequently, may elicit a

P3, depending on the items’ specific attributes and their

relationship to the participant. The P3-evoking ability

of personally relevant items is believed to be an

automatic and involuntary process (Kok, 2001) and

has been used in the past to detect guilty knowledge

(Farwell and Donchin, 1991), deception (Rosenfeld et

al., 1991), concealed learning (Allen et al., 1992), and

feigned amnesia (Rosenfeld et al., 1995). In a typical

experiment of this kind, participants are first con-

fronted with a small set of items (i.e., the critical items),

which subsequently are embedded as nontargets in a

two-choice recognition task. The main finding in these

studies was that despite their nontarget status, the

critical items elicited a P3 component, presumably due

to automatic attention processes.

The above studies have in common that the critical

items were infrequently presented (cf., oddball para-

digm) and were made extra important by means of a

deceive or conceal instruction. Furthermore, the items

were typically well encoded through enactment (Far-

well and Donchin, 1991) or elaborate study (Allen et

al., 1992), or alternatively, were of an autobiograph-

ical nature (Rosenfeld et al., 1995). Following on

from these studies, we demonstrated that neither a

deceive instruction nor an association with a crime-

related or autobiographical event was needed to obtain

comparable results (van Hooff et al., 1996a). Aurally

presented, neutral words from one semantic category

that was previously studied were found to elicit a

reliable P3 component, even when these words did

not require a behavioural recognition response

(studied nontargets). This was replicated in a sub-

sequent study for visually presented words that were

not semantically related, but a similar effect could not

be observed for items that were repeated from an

earlier phase of the experiment (repeated nontargets;

van Hooff and Golden, 2002). This seemed to imply

that the type or depth of encoding crucially affected

the items’ ability to elicit a P3. It was therefore

concluded that the ERP-based memory assessment

procedure might be less suited to demonstrate the

presence of weak memory traces (here modelled by

repetition) in the absence of overt behavioural

recognition responses. Accordingly, this would then

also provide an explanation as to why we did not find

ERP evidence for memories for intraoperatively

presented words, using a highly similar memory

assessment procedure (van Hooff et al., 1996b). There

is at least one reason to doubt this conclusion,

however, because participants were not asked to


actively try to retrieve the repeated items or to

cognitively distinguish between the different types

of nontargets. Consequently, it could have been the

absence of any retrieval effort that had caused the

nonsignificant findings in the van Hooff and Golden

(2002) study. Furthermore, the experimental setup did

not allow us to separate repeated items that may have

been partially recognised from those that were not

recognised to any extent. The current study was

designed to clarify these issues, using a comparable

study-test recognition paradigm as in our previous

studies but now including a condition in which

participants were asked to provide an affirmative

recognition response for the repeated items. More

specifically, participants first studied a short list of

words and then took part in two consecutive

recognition tests, of which the second is of primary interest (cf., van Hooff and Golden, 2002).

Target items in this second test were the words from

the study list (learned targets) as well as words that

served as distractors in the first test (repeated targets).

This setup would enable us to compare ERPs,

obtained in one and the same test condition, for

successfully recognised targets that had received

different types/levels of encoding (studied hits vs.

repeated hits). Furthermore, anticipating that a con-

siderable number of repeated targets would not be

recognised, it would also enable us to compare ERPs

for items classified as ‘new’ that were or were not

presented in an earlier task condition (repeated misses

vs. correctly rejected new items).

Based on earlier ERP studies manipulating the depth

of encoding, it was expected that repeated items that are

recognised as previously presented would elicit a P3,

but this P3 would probably be smaller than the one

elicited by successfully recognised studied items, due

to either a weaker representation in memory (cf., Bentin

et al., 1992), a lower decision confidence (cf., Finnigan

et al., 2002), or a combination of both. Expectations

with regard to ERPs for repeated items that are not

recognised as previously presented were less straight-

forward. Based on our previous findings (van Hooff

and Golden, 2002), it could be hypothesised that they

would not elicit a differential ERP response; however,

this would be in contrast to findings by Rugg et al.

(1998) and Walla et al. (1999). Rugg et al. (1998) reported that perceptually encoded words that were not recognised as ‘old’ produced more positive-going ERPs (300–500 ms post stimulus) at the parietal electrode positions, which, it was claimed, may be “a correlate of memory in the absence of conscious recognition” (Rugg et al., 1998, p. 596). Although this

finding could not be replicated in a more recent study

(Rugg et al., 2000), Walla et al. (1999) found a similar

but slightly earlier effect for semantically encoded

words that failed to be recognised. In addition, Walla

and colleagues also found a later frontal and right

parietal/occipital effect for the missed items, which

they believed to be associated with “an enhanced effort to retrieve item representations of the prior word exposures” (Walla et al., 1999, p. 132). An important

difference between these studies and our previous

study (van Hooff and Golden, 2002) was that they

required their participants to behaviourally discrim-

inate between repeated and new items. In the present

study, we incorporated this task requirement, and so,

this would enable us to examine whether an active

retrieval effort would indeed be crucial for obtaining

significant ERP differences between repeated and new

items in the absence of a behavioural recognition

response.

2. Method

2.1. Participants

Twenty-two participants took part in the experi-

ment, but due to technical difficulties during record-

ing, only data from 20 participants were used for

analysis. All participants were undergraduate psy-

chology students (9 males, 11 females) and took part

in the experiment to gain course credits. They were

aged between 19 and 26 years (mean=21.1 years,

S.D.=2.21). All participants had English as their first

language, reported to be right handed, and had

normal or corrected-to-normal vision.

2.2. Stimuli and tasks

Stimuli were chosen from the same set of words as

in our previous study (van Hooff and Golden, 2002).

They consisted of 11 lists of six words each, and a set of

nine filler words. All 75 words were four- to seven-

letter nouns with a medium occurrence frequency

(mean 48.4 and range of 27–88 per million) according


to the norms of Johansson and Hofland (1989). To

ensure comparability between the 11 word lists, each

list consisted of three one-syllable words, two two-

syllable words, and one three-syllable word. The mean

word occurrence frequency for each list ranged

between 42 and 57. The words in each list were not

semantically related. Three or four words in each list

were concrete words. The stimuli were presented in

lowercase white letters (12 point ROM2 font) on a

black background (STIM software). They were cen-

trally presented on a PC monitor, which was placed at a

comfortable distance (approximately 70 cm) in front of

the participants. This resulted in horizontal visual

angles between 8° and 12°.

The experiment consisted of one study period and

two subsequent recognition tests. During the study

period, six words from one list were presented for

1520 ms with stimulus onset asynchronies (SOAs) of

2000 ms. The list was presented as many times as

needed with the same word order. Following this

study period, participants were required to conduct a

two-choice recognition test. This test consisted of

random presentation of the words from five lists,

including the list previously studied (e.g., lists A, B,

C, D, and E, of which list A was studied). The second

test consisted of random presentation of words from

eight lists. One of these eight lists was studied prior to

the first test (e.g., ‘studied’ list A), and one was used as a distractor list in the first test (e.g., ‘repeated’ list B); the other six lists were not presented previously (e.g., ‘new’ lists F, G, H, I, J, and K). The lists were

rotated in a counterbalanced design, so that different lists served as the studied, repeated, and new lists for different participants. In this second test, words from

the studied list and the repeated list served as targets.

Because the targets (p=1/8 for each of the two target lists) appeared less frequently than the nontargets (p=6/8),

this was essentially an oddball task.

In both tests, words were presented for 306 ms

with SOAs of 2000 ms. Each test consisted of eight

blocks with different word orders. In all test blocks,

each word was presented only once. Because the

words were identical for each test block, all words

(targets and nontargets) were repeated eight times.

To limit possible influences of an orienting response at the beginning of each test block, every block began with two of the nine filler words, which were discarded from further analysis. Each block also

finished with one of the filler words. YES–NO

responses had to be given for each word presented

by pressing either the left or right mouse button with

the index and middle finger, respectively. Left and

right were reversed for half the participants.
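As an illustrative tally (not part of the original report), the composition of the second test can be derived from the figures given above: eight lists of six words, three filler words per block, eight blocks, and an SOA of 2000 ms. The Python sketch below is only arithmetic on those stated values. The constant and function names are hypothetical, and the block duration is an approximation that simply multiplies the number of trials by the SOA.

```python
# Tally of the second recognition test, using only figures stated in the text.
WORDS_PER_LIST = 6
STUDIED_LISTS, REPEATED_LISTS, NEW_LISTS = 1, 1, 6
FILLERS_PER_BLOCK = 3   # two fillers at the start of a block, one at the end
BLOCKS = 8
SOA_MS = 2000


def second_test_summary():
    """Return stimulus probabilities and trial counts for the second test."""
    n_lists = STUDIED_LISTS + REPEATED_LISTS + NEW_LISTS
    trials_per_block = n_lists * WORDS_PER_LIST + FILLERS_PER_BLOCK
    return {
        # Probabilities are computed over list words only (fillers excluded).
        "p_studied": STUDIED_LISTS / n_lists,
        "p_repeated": REPEATED_LISTS / n_lists,
        "p_new": NEW_LISTS / n_lists,
        "trials_per_block": trials_per_block,
        # Each word appears once per block, so totals scale with BLOCKS.
        "studied_trials_total": STUDIED_LISTS * WORDS_PER_LIST * BLOCKS,
        "repeated_trials_total": REPEATED_LISTS * WORDS_PER_LIST * BLOCKS,
        "new_trials_total": NEW_LISTS * WORDS_PER_LIST * BLOCKS,
        # Approximate duration: number of trials times the SOA.
        "block_duration_s": trials_per_block * SOA_MS / 1000,
    }


if __name__ == "__main__":
    for key, value in second_test_summary().items():
        print(key, value)
```

Under these assumptions, each participant contributes at most 48 studied and 48 repeated target trials before artifact rejection, which is why only one of the six new lists is later used for the nontarget averages.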

2.3. Procedure

Written consent was obtained from all participants

before the start of the experiment. Participants were

informed that the experiment would involve two word

recognition tests to obtain information about memory

processes in the brain. The recording procedure was

explained in detail using a standardised instruction

form and it was emphasised that participants could

withdraw from the experiment at any time if they

wished to do so. The specific aim and the exact nature

of the manipulations were not disclosed until after the

experiment.

Participants were first instructed to memorise a list

of six words which appeared on the monitor in front

of them. It was emphasised that they could view the

words as many times as required and that no record-

ings were taken at this stage of the study. After three

presentations of the study list, participants were asked

to recall the words in the order presented and in the

reversed order. If participants were unable to do this,

the list was presented another three times. If they had

no difficulties recalling the words in both orders, they

were asked to do so again after 1 min to ensure

storage in long-term memory. Once this was com-

pleted successfully, the instructions for the first test

were displayed on the monitor. Participants were

instructed to provide a YES response to the words

they recently studied and a NO response to all other

words. They were informed that words requiring a

YES response would appear less frequently than

words requiring a NO response, and therefore, they

should pay close attention. Participants were

requested to give their responses as accurately and

quickly as possible, using the specified mouse

buttons. To obtain a sufficient number of trials for a

proper signal-to-noise ratio, the test was repeated

eight times, with the same words in each test block but

different word orders. Participants could start each test

block at their own pace by pressing a mouse button.

Participants were furthermore asked to relax as much

as possible and to make no excessive movements.


After completion of the first test and a short break

(approximately 2 min), new instructions were dis-

played on the monitor. Participants were this time

asked to provide a YES response to the words they

had previously studied (studied targets) and to the

words that had been presented as distractors in the

first test (repeated targets). A NO response was

required for all other words (nontargets). Participants

were informed that this test would take longer to

complete than the first test, and that again, they could

expect more words requiring a NO response than

words requiring a YES response. This test was also

repeated eight times and participants could start each

test block at their own pace. This second test was the

main focus of the current study. Because of the

described manipulation and because participants were

kept uninformed about the requirements of the second

test at the start of the experiment, it was assumed that

the studied targets and the repeated targets had

received different levels of encoding, producing

strong and weak memory traces, respectively.

2.4. EEG recording and ERP analysis

The scalp EEG was recorded from 19 tin electrodes

embedded in a stretch-lycra cap (Electrocap International). Recording locations were based on the

International 10–20 system and included Fp1, Fp2,

F3, F4, F7, F8, Fz, T3, T4, C3, C4, Cz, T5, T6, P3,

P4, Pz, O1, and O2. Linked ear lobes were used as

reference, and Fpz served as ground. Vertical eye

movements (EOG) were recorded bipolarly between

electrodes placed above and below the right eye.

Impedances were kept below 5 kΩ. EEG and EOG

signals were recorded and amplified using a band pass

filter of 0.1 (24 dB) and 30 Hz (48 dB; Neuroscan 4.0

software). The amplified signals were digitised on-

line at a sampling rate of 200 Hz.

EEG data were corrected for eye movements,

using Neuroscan software. Recordings were then

visually inspected, and recording epochs containing

artifacts, such as excessive drift or muscle tone

interference, were removed before further analyses.

EEG epochs were subsequently created starting 100

ms prior to stimulus onset to 1500 ms following

stimulus onset. These epochs were then baseline-corrected to a prestimulus baseline of −100 to 0 ms and automatically checked for possible remaining artifacts (amplitudes < −60 or > 60 µV). Accepted

epochs were averaged for each channel time-locked

to the onset of the stimuli. Averaged ERP waveforms

were obtained for each word list and each response

category (‘hits’ and ‘misses’) separately.
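The epoching, baseline correction, amplitude-based rejection, and averaging steps described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the Neuroscan procedure used in the study: the function names, the array layout (channels by samples, in microvolts), and the handling of epochs that run off the recording are assumptions, and the eye-movement correction and visual inspection steps are not reproduced.

```python
import numpy as np

FS_HZ = 200                    # sampling rate stated in the text
PRE_MS, POST_MS = 100, 1500    # epoch limits stated in the text
REJECT_UV = 60.0               # rejection threshold stated in the text


def ms_to_samples(ms, fs=FS_HZ):
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * fs / 1000))


def average_erp(eeg_uv, onsets):
    """Average baseline-corrected, artifact-free epochs.

    eeg_uv : array of shape (n_channels, n_samples), in microvolts
             (assumed layout).
    onsets : iterable of stimulus-onset sample indices.
    Returns (erp, n_accepted); erp is None if no epoch survives.
    """
    pre, post = ms_to_samples(PRE_MS), ms_to_samples(POST_MS)
    kept = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > eeg_uv.shape[1]:
            continue  # assumption: skip epochs that run off the recording
        epoch = eeg_uv[:, onset - pre:onset + post].astype(float)
        # Subtract the mean of the -100 to 0 ms interval per channel.
        epoch = epoch - epoch[:, :pre].mean(axis=1, keepdims=True)
        # Reject the epoch if any channel exceeds +/-60 microvolts.
        if np.abs(epoch).max() > REJECT_UV:
            continue
        kept.append(epoch)
    if not kept:
        return None, 0
    return np.mean(kept, axis=0), len(kept)
```

At 200 Hz, the 100 ms baseline corresponds to 20 samples and the full epoch to 320 samples per channel in this sketch.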

3. Results

Repeated measures ANOVAs were used to analyse

the results, supplemented with Bonferroni pairwise

comparisons where appropriate. For ease of notation, the standard uncorrected degrees of freedom are reported with the F-values in the following sections; however, the degrees of freedom used to evaluate significance were adjusted using the Greenhouse–Geisser correction method. Because

the second recognition task was the main focus of our

study, behavioural and ERP data are discussed for this

test only.
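For readers unfamiliar with the Greenhouse–Geisser correction, the sketch below shows one common way to estimate the correction factor epsilon for a single within-subjects factor and to apply it to the degrees of freedom. It is an illustration in NumPy with hypothetical function names and is not claimed to reproduce the statistical software used in the study.

```python
import numpy as np


def greenhouse_geisser_epsilon(data):
    """Estimate the Greenhouse-Geisser epsilon.

    data : array of shape (n_subjects, k_conditions) holding one value per
           participant and level of a single within-subjects factor.
    """
    data = np.asarray(data, dtype=float)
    k = data.shape[1]
    cov = np.cov(data, rowvar=False)             # k x k covariance matrix
    centre = np.eye(k) - np.ones((k, k)) / k     # double-centring matrix
    s = centre @ cov @ centre
    # epsilon = trace(S)^2 / ((k - 1) * trace(S S)); lies in [1/(k-1), 1].
    return np.trace(s) ** 2 / ((k - 1) * np.trace(s @ s))


def corrected_df(df_effect, df_error, epsilon):
    """Multiply both degrees of freedom by epsilon."""
    return epsilon * df_effect, epsilon * df_error
```

For example, an effect reported as F(2,28) would be evaluated against an F distribution with 2·epsilon and 28·epsilon degrees of freedom, which leaves the F-value itself unchanged but makes the test more conservative when epsilon is below 1.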

3.1. Behavioural data

The mean percentage of correct responses for the

new words (i.e., correct rejections) was 95.5%

(S.D.=4.8), whereas this was 85.6% (S.D.=17.4) for

the studied words and 54.3% (S.D.=22.0) for the

repeated words. As expected, recognition accuracy

(pHit–pFalse Alarm; Snodgrass and Corwin, 1988)

was higher for the studied targets than for the repeated

targets (t=5.61, df=19, p<0.001). Because of the oddball character of the task, participants could mistakenly respond NO to studied or repeated items for two reasons: (1) because they did not recognise them as being previously presented (‘true’ miss), or (2) because they were affected by the negative response bias (‘unintentional’ miss). To distinguish these two types of hypothetical misses, a variable was created that indicated the number of words that were missed six times or more (out of a possible eight, i.e., at least 75% of the time). A word meeting this criterion was considered a ‘true’ miss. It was found that the number of ‘true’ misses was substantially larger for the repeated words (M=1.85, S.D.=1.84) as compared to the studied words (M=0.20, S.D.=0.62; t=3.94, df=19, p<0.01).
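The two behavioural measures used in this section, the discrimination index (pHit − pFalse Alarm) and the count of words missed on at least six of eight presentations, can be illustrated with a short Python sketch. The function names and the example miss counts are hypothetical and serve only to show the arithmetic.

```python
def discrimination_index(p_hit, p_false_alarm):
    """Recognition accuracy as hit rate minus false alarm rate."""
    return p_hit - p_false_alarm


def count_true_misses(miss_counts, criterion=6):
    """Count words missed at least `criterion` times.

    miss_counts : per-word number of NO responses across the eight
                  presentations of that word (hypothetical input format).
    """
    return sum(1 for n in miss_counts if n >= criterion)


if __name__ == "__main__":
    # Group means from the text; the false alarm rate is taken as
    # 1 minus the correct rejection rate for new words.
    p_fa = 1 - 0.955
    print(round(discrimination_index(0.856, p_fa), 3))  # studied targets
    print(round(discrimination_index(0.543, p_fa), 3))  # repeated targets
    # Hypothetical miss counts for one participant's six repeated words.
    print(count_true_misses([0, 6, 8, 5, 7, 2]))
```

Applied to the group means reported above, this gives roughly 0.81 for the studied targets and 0.50 for the repeated targets. These values are shown for illustration only, since the reported t-test would require per-participant values, which are not available here.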

Mean reaction times (RTs) were calculated for each

stimulus and response category. For the new words,

only one list was selected to account for the larger


number of stimuli in this category. This list corre-

sponded to the one also selected for the ERP averages

(see further) and was counterbalanced across participants. Mean RTs for correct responses were quickest for the new words (M=544 ms, S.D.=89.7), slightly slower for the studied targets (M=625 ms, S.D.=105.6), and slowest for the repeated targets [M=721 ms, S.D.=107.2; F(2,38)=74.11, p<0.001]. All Bonferroni corrected pairwise comparisons were significant at p<0.001. For the studied targets (N=10), mean RTs for correct responses (mean RT hits=618 ms) were significantly slower than for incorrect responses (mean RT misses=434 ms; t=4.82, df=9, p<0.01).² For the repeated targets (N=20), a similar effect was found (mean RT hits=721 ms, mean RT misses=580 ms, t=4.91, df=19, p<0.001). Because of these RT differences, it seems likely that, for both types of targets, at

least some of the misses were due to premature

responding (previously classified as unintentional

misses).

3.2. ERP data

Individual ERPs were averaged separately for each

stimulus and response category. For the new words,

only one list was selected for averaging (counterbalanced across participants) so that the number of trials was comparable to that for the studied and repeated word lists. In agreement with previous studies (e.g., Rugg et al., 1998), a minimum of 16 artifact-free trials was considered sufficient to obtain a reliable ERP. Because five participants did not recognise many of the repeated targets, this criterion meant that ERP comparisons for correct responses were based on 15 participants only. Mean number of trials contributing to the ERPs was 38.1

(studied hits), 28.3 (repeated hits), and 40.7 (correct

rejections), respectively. Similarly, 10 participants

performed very well and did not miss many of the

repeated targets; thus ERP comparisons for misses

were based on the other 10 participants who produced

a sufficient number of missed repeated targets (these

10 included the five left out in the first comparison). Mean number of trials was 24.5 (repeated misses) and 27.2 (correct rejections).

² Mean RTs were considered reliable when calculated on the basis of at least five trials. Half of the participants did not meet this criterion for the learned misses, and hence, this analysis was based on data from 10 participants only.

3.2.1. Correct responses (studied and repeated hits)

Grand average ERP waveforms (N=15) are shown

in Fig. 1 for all electrode positions. As can be

observed in this figure, the three stimulus categories

elicited comparable early, visual evoked responses

(80–250 ms). In addition, a large positive wave,

starting around 400 ms poststimulus and reaching a

maximum at 520 ms, was present for all stimuli but

most clearly so for the recognised studied words and

the recognised repeated words. These positive waves

were present at all electrode positions but were largest

over the central and parietal areas. Because of these

characteristics and because they were elicited by

infrequent target stimuli, these waves were considered

to correspond to the P3 component. Visual inspection

of Fig. 1 reveals that the upward slope of the P3 seems

to occur somewhat earlier for the studied words as

compared to the repeated words. In contrast, the

downward slope seems to be prolonged for the

repeated words, particularly at the frontal electrode

positions. P3 amplitude appears largest for the studied

targets, slightly smaller for the repeated targets, and

considerably smaller for the nontargets.

The P3 peak was defined as the most positive peak

in the 450–650 ms post stimulus period. For statistical

analyses, P3 peak amplitude and latency were

obtained from three coronal chains of electrodes F3–

Fz–F4 (frontal), C3–Cz–C4 (central), P3–Pz–P4

(parietal). The three coronal chains were chosen to

enable analysis of scalp distribution differences in the

anterior–posterior and the lateral–medial direction. All

anticipated effects were most visible at these electrode

locations, and no additional effects could be observed

at the other electrodes (see Fig. 1). Repeated measures

ANOVAs were performed with stimulus category

(studied, repeated, new), anterior–posterior electrode

position (frontal, central, and parietal), and laterality

(left, central, and right), as within-subjects factors.
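The peak measure defined above can be sketched in a few lines of NumPy. This is a simplified illustration, not the procedure used in the study: it takes the maximum value within the 450–650 ms window as the peak, and it assumes a 200 Hz averaged waveform whose first sample lies at −100 ms. The function name and array layout are hypothetical.

```python
import numpy as np

FS_HZ = 200            # sampling rate stated in the text
EPOCH_START_MS = -100  # first sample of the epoch, as stated in the text


def p3_peak(erp_uv, t_min_ms=450, t_max_ms=650):
    """Return (amplitude, latency_ms) of the largest value in the window.

    erp_uv : 1-D averaged waveform for one electrode, in microvolts.
    """
    erp_uv = np.asarray(erp_uv, dtype=float)
    times_ms = EPOCH_START_MS + np.arange(erp_uv.size) * 1000.0 / FS_HZ
    window = np.flatnonzero((times_ms >= t_min_ms) & (times_ms <= t_max_ms))
    best = window[np.argmax(erp_uv[window])]
    return float(erp_uv[best]), float(times_ms[best])
```

Applying this function to each of the nine electrodes (F3, Fz, F4, C3, Cz, C4, P3, Pz, P4) and each stimulus category would yield the amplitude and latency values that enter the ANOVA. Note that a window maximum located at the window edge may not be a true local peak, so real analyses usually add a visual check.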

Fig. 1. Grand average ERP waveforms (N=15) for correctly classified words from the different encoding conditions (studied, repeated, and new).

A main effect of stimulus category [F(2,28)=39.6, p<0.001] confirmed that the P3 peak was largest for the studied hits (mean=8.05 µV), somewhat smaller for the repeated hits (mean=7.38 µV), and smallest for the correctly rejected new words (mean=4.44 µV). Bonferroni corrected pairwise comparisons showed that both types of recognised targets (studied and repeated) had larger amplitudes than the new words (both ps<0.001) but were not significantly different from each other (p=0.24). The P3 had a centro-parietal distribution [F(2,28)=12.73, p<0.001], which seemed somewhat more pronounced for both recognised targets than for the new words [category × anterior–posterior interaction F(4,56)=3.67, p<0.05]. Alternatively, the differences in scalp distribution could also

have been caused by the previously described ampli-

tude differences. Therefore, the same analyses were

performed on scaled amplitudes, using the vector

length method (McCarthy and Wood, 1985). For each

stimulus category, the individual P3 peak amplitude

values were divided by the square root of the sum of the

squared frontal, central, and parietal values. Outcomes

of this analysis revealed that the stimulus category × anterior–posterior interaction was no longer significant [F(4,56)=3.67, p=0.33], making the second

explanation more plausible. There were no laterality

effects.
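The vector length scaling described above can be illustrated as follows. The sketch divides each set of site amplitudes by the square root of the sum of their squares, so that overall amplitude differences between conditions are removed and only the shape of the scalp distribution remains. The function name and the example values are hypothetical.

```python
import numpy as np


def vector_scale(amplitudes):
    """Scale amplitudes to unit vector length across sites.

    amplitudes : array of shape (..., n_sites), e.g. the frontal, central,
                 and parietal values for one participant and category.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    norm = np.sqrt((amplitudes ** 2).sum(axis=-1, keepdims=True))
    return amplitudes / norm


if __name__ == "__main__":
    # Two hypothetical conditions with the same distribution but
    # different overall size become identical after scaling.
    small = vector_scale([3.0, 4.0, 12.0])
    large = vector_scale([6.0, 8.0, 24.0])
    print(small, large, np.allclose(small, large))
```

If a category × site interaction disappears after this scaling, as reported above, the original interaction is more plausibly attributed to an amplitude difference than to a genuine difference in scalp distribution.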

P3 peak latency was found to be different for the

three stimulus categories [F(2,28)=4.56, p<0.05]. P3 latency for the studied targets (mean=519 ms, S.D.=11.7) was significantly shorter than P3 latency for the new words (mean=544 ms, S.D.=11.2, p<0.05) but not significantly shorter than that for the repeated targets (mean=538 ms, S.D.=9.1, p=0.14). There were no other significant

main or interaction effects for P3 latency.

To allow assessment of differences over time, mean

ERP amplitudes for six consecutive 100 ms time


windows (250–350, 350–450, 450–550, 550–650,

650–750, and 750–850 ms) were calculated for the

same three coronal chains of electrodes. The same

ANOVAs, as described above, were performed, for

which a summary of the main effects is shown in

Table 1.
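The mean amplitude measure for the six consecutive time windows can be sketched as below. This is an illustration only: it assumes a 200 Hz waveform starting at −100 ms and treats each window as half-open (start included, end excluded) so that no sample is counted twice, which is an assumption not stated in the text. The names are hypothetical.

```python
import numpy as np

FS_HZ = 200            # sampling rate stated in the text
EPOCH_START_MS = -100  # first sample of the epoch, as stated in the text
WINDOWS_MS = [(250, 350), (350, 450), (450, 550),
              (550, 650), (650, 750), (750, 850)]


def window_means(erp_uv):
    """Mean amplitude per time window.

    erp_uv : array of shape (..., n_samples); the last axis is time.
    Returns an array of shape (..., 6), one mean per window.
    """
    erp_uv = np.asarray(erp_uv, dtype=float)
    times_ms = EPOCH_START_MS + np.arange(erp_uv.shape[-1]) * 1000.0 / FS_HZ
    means = []
    for start, stop in WINDOWS_MS:
        mask = (times_ms >= start) & (times_ms < stop)  # half-open window
        means.append(erp_uv[..., mask].mean(axis=-1))
    return np.stack(means, axis=-1)
```

With these settings each window covers 20 samples, and the resulting values per electrode, category, and participant are the inputs to the ANOVAs summarised in Table 1.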

The main effects of stimulus category and sub-

sequent pairwise comparisons (Bonferroni corrected)

confirmed our earlier observation that the P3 was

larger for the correctly identified targets (both studied

and repeated) than for the new words. In addition, the

P3 reached higher amplitudes for the studied targets

than the repeated targets (p<0.05) in the 450–550 ms

time window. In the 650–750 ms time window

(downward slope of the P3), the repeated targets,

but not the studied targets, had significantly larger

amplitudes than the new words (p<0.001). The main

effect of anterior–posterior position, followed by

pairwise comparisons, showed that the P3 had

primarily a centro-parietal distribution.

A stimulus category × anterior–posterior electrode position interaction was present for the last four time windows (all ps<0.05), suggesting possible differences in anterior–posterior scalp distribution for studied targets, repeated targets, and new nontargets. A reanalysis on scaled amplitudes (procedure described above) revealed that there were still significant stimulus category × anterior–posterior interactions for the 650–750 ms [F(4,56)=5.32, p<0.05] and 750–850 ms [F(4,56)=4.30, p<0.05] time windows (i.e.,

downward slope of the P3). Inspection of mean

amplitudes revealed that this interaction referred to a

different anterior–posterior scalp distribution for the

targets as compared to the nontargets. More specifically, during these time windows, ERPs elicited by the targets (both studied and repeated) were characterised by a centro-parietal maximum, whereas ERPs elicited by the nontargets were characterised by a central maximum and a parietal minimum.

Table 1

Repeated measures ANOVA main effects of category (learned targets, repeated targets, and new items), anterior–posterior position (frontal, central, and parietal), and laterality (left, middle, and right) for mean ERP amplitude measures for the different time windows

ERP time window   Category                        Anterior–posterior              Laterality
(ms)              F(2,28)  p     Description      F(2,28)  p     Description      F(2,28)  p      Description
250–350           0.69     ns                     19.14    ***   F<C<P            1.74     ns
350–450           8.92     **    ST/RT>N          5.10     *                      3.32     0.083  Ri>Le
450–550           46.15    ***   ST>RT>N          12.54    **    F<C/P            3.72     *
550–650           13.26    ***   ST/RT>N          13.95    ***   F<C/P            0.93     ns
650–750           7.47     **    RT>N             9.82     **    F<C              1.51     ns
750–850           1.64     ns                     5.28     *     F/P<C            2.81     0.092

ns: nonsignificant; *: p<0.05; **: p<0.01; ***: p<0.001. Descriptions only include significant Bonferroni pairwise comparisons. ST: studied targets; RT: repeated targets; N: new words; F: frontal; C: central; P: parietal; Ri: right; Le: left.

3.2.2. Incorrect responses (repeated misses)

Grand average ERPs (N=10) for correctly rejected new words and missed repeated words are displayed in Fig. 2 for electrode positions F3, F4, P3, and P4. Please note that ‘true’ misses and ‘unintentional’ misses were both included in these averages because of insufficient trials in each separate category. Based on behavioural results, it could be calculated that proportionally, about 43% of the trials were ‘true’ misses. Visual inspection of this figure reveals that ERPs elicited by the repeated targets that were missed seemed to be more positive going in the 550–750 ms period than those elicited by new words, especially over the frontal areas. In the period thereafter (750–900 ms), the repeated misses were still more positive going but for the parietal electrode positions only. Because of its relatively late occurrence and its scalp distribution, it is unlikely that this wave corresponds to the classical P3 component.

Fig. 2. Grand average ERP waveforms (N=10) for the missed repeated words and correctly rejected new words.

Similar repeated measures ANOVAs were carried out on the same time windows, as described earlier. Significant main effects of stimulus category were found for the 550–650 ms period [F(1,9)=8.46, p<0.05], the 650–750 ms time period [F(1,9)=16.87, p<0.01], and the 750–850 ms time period [F(1,9)=11.92, p<0.01]. This confirmed the observation that the nonrecognised repeated words elicited a more positive going ERP waveform than the new words in the later stages of the recording epoch. There were no significant main effects of anterior–posterior position or

laterality. There was a significant interaction between

stimulus category and anterior–posterior position for

the 750–850 time period [F(2,18)=7.94, pb0.01];

however, this interaction only approached significance

when recalculated on scaled amplitudes [F(2,18)=

3.84, p=0.071].
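The role of the scaled-amplitude reanalysis can be illustrated with a small sketch. The exact scaling procedure is described earlier in the paper and is not reproduced here; the code below assumes vector-length normalisation, one of the rescaling approaches associated with McCarthy and Wood (1985), and all amplitude values and names are hypothetical. It shows that two conditions whose topographies differ only in overall size become identical after scaling, so that any category × position interaction surviving the rescaling points to a genuine difference in distribution.

```python
import math

def vector_scale(amplitudes):
    """Divide each site's amplitude by the vector length across all sites."""
    length = math.sqrt(sum(a * a for a in amplitudes))
    if length == 0:
        raise ValueError("cannot scale an all-zero topography")
    return [a / length for a in amplitudes]

# Hypothetical mean amplitudes (microvolts) at frontal, central, parietal sites.
studied = [2.0, 6.0, 8.0]
repeated = [1.5, 4.5, 6.0]  # same shape as 'studied', but 75% of its size

scaled_studied = vector_scale(studied)
scaled_repeated = vector_scale(repeated)

# After scaling, a pure amplitude difference no longer looks like a
# difference in scalp distribution.
for site, s, r in zip(("F", "C", "P"), scaled_studied, scaled_repeated):
    print(site, round(s, 3), round(r, 3))
```

In this illustration, the scaled values for the two conditions coincide at every site, which is the pattern expected when conditions differ quantitatively rather than in scalp distribution.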

4. Discussion

On average, participants recognised most of the

studied targets (intentional learning) but only about

half of the repeated targets (incidental learning). This

meant that our manipulation was successful and that

comparisons could be made between (a) recognised

targets (hits) that were either intentionally or incidentally

learned, and (b) nonrecognised repeated targets

(misses) and correctly classified new words. Compared

to our previous study (van Hooff and Golden, 2002),

the percentage of recognised repeated items was

surprisingly high because in that study, posttest recall

for the repeated words was largely at chance level while

there were also no indirect indications (error rate, RT,

and ERP) of incidental learning. At the time, it was

therefore suggested that incidental learning had not

taken place, but results from the current study now offer

an alternative explanation. Indeed, incidental learning

could have occurred in our earlier study but may have

remained undetected by the ERP-based memory

assessment procedure. An important distinction

between the current study and our earlier study was

that this time, participants were asked to actively

distinguish between repeated and new items. Tenta-

tively, it may be that this retrieval instruction (and

presumably the resulting retrieval effort) is essential for


the success of the ERP-based memory assessment

procedure. Moreover, Rugg et al. (2000) described

retrieval effort as “the mobilization of processing

resources in service of attempts to retrieve memory” (p. 673), and it might be these extra resources that are

essential not only for a behavioural recognition

response but also for the generation of a differential

ERP response. In future applications, the ERP-based

memory assessment procedure should therefore

include an explicit recognition instruction in order to

judge more reliably the presence or absence of weak

memory traces as a result of repetition or incidental

learning. This would also be more in line with studies

investigating spared memory functions in patients with

amnesic syndrome (cf., Lalouschek et al., 1997), in

patients undergoing general anaesthesia (van Hooff et

al., 1995, 1996b), or in patients with dissociative

identity disorder (Allen and Movius, 2000).

Both types of recognised targets elicited a reliable

P3 component with a centro-parietal maximum and a

peak latency of about 520–540 ms post stimulus. This

is in agreement with our previous studies (van Hooff et

al., 1996a; van Hooff and Golden, 2002) and other

studies using comparable oddball recognition para-

digms (e.g., Allen et al., 1992; Farwell and Donchin,

1991). Although P3 peak amplitude did not differ

between the studied and repeated targets, mean

amplitudes in the 450–550 ms time window were

somewhat larger for the studied targets as compared to

the repeated targets. As mentioned in the Introduction,

this might have been the result of greater memory

trace strength (cf., Bentin and Moscovitch, 1990;

Bentin et al., 1992) or a higher decision confidence

(Finnigan et al., 2002). An interpretation in terms of

memory trace strength would fit in best with the

model of the P3 amplitude presented by Kok (1997,

2001). In Kok’s model, the P3 is considered to reflect

a target identification mechanism that can be conceptualised

as “a set of neural elements or ‘recognition units’

that form a neural network”, the primary function of which is “to compare stimulus

attributes with an internal representation of the target” (Kok, 2001, p. 571). Accordingly, a P3 will only be

elicited when some kind of connection has been

established between the perceptual and memory

systems. Thus, when a target is presented and a

matching process has been triggered, the recognition

units will be activated and more units will be

activated as a result of stronger memory traces.

Hence, in our experiment, the studied items may have

elicited somewhat larger P3s than the repeated items

because they had formed stronger memory traces.

This amplitude difference and the absence of scalp

distribution differences furthermore suggest that

intentional study and repetition have generated

memory traces that give rise to retrieval mechanisms

that differ quantitatively rather than qualitatively. This

is an important finding that can be contrasted with

(nonoddball) studies using perceptual- and semantic-

encoding conditions to create weak and strong

memory traces, respectively (e.g., Rugg et al., 1998,

2000). Specifically, results from these studies suggested

that while recognition of perceptually (or shallowly)

encoded words relied primarily on familiarity pro-

cesses (as reflected by an early decreased negativity

over the frontal areas), recognition of semantically (or

deeply) encoded words relied on familiarity and

recollection processes (as reflected by an additional

late increased positivity over the left central–parietal

areas).

An alternative explanation for the P3 amplitude

difference could be that participants were more

confident in categorising the studied targets as

compared to the repeated targets. This interpretation

is supported by the behavioural results showing a

higher recognition accuracy, a lower number of ‘true’ misses, and a faster mean RT for the studied targets as

compared to the repeated targets. It is also in

agreement with Finnigan et al.’s (2002) suggestion

that the amplitude of a late positive component (LPC)

is modulated by decisional factors. However, unlike

Finnigan et al., we did not observe a preceding or

partially overlapping effect (which they claimed to be

an N400 and not a P3 effect) that they associated with

memory trace strength. It may therefore be speculated

that variance in memory trace strength and decision

confidence both have contributed to the observed P3

amplitude difference between learned and repeated

targets (if indeed they can be presumed to be

manifestations of two different processes). Task

characteristics that might have promoted a combina-

tion or overlap of effects were the oddball format and

the requirement for a quick rather than a delayed

response. In a future study, the inclusion of confidence

ratings should be able to shed more light on this issue,

since even Finnigan et al. seemed less certain about

associating the LPC amplitude effect with decision

confidence than with decision accuracy.

Sometimes, latency jitter has been suggested to

artificially create amplitude differences between mem-

ory conditions (e.g., Spencer et al., 2000). Latency jitter

refers to “the variability in the latency of the ERP

component across the individual trials in an

experiment” and is of particular concern when averaged ERP deflections seem to be more ‘peaked’ in one

condition as compared to the other (Spencer et al.,

2000, p. 495). In our experiment, this was not the case

since the width of the P3 for the studied targets seems

highly comparable with that of the repeated targets

(Fig. 1).
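The reasoning behind using peak width as a check for latency jitter can be made concrete with a small simulation. This is only an illustrative sketch and not the analysis used in this study: the Gaussian single-trial shape, the 530 ms latency, the 60 ms width, the 10 microvolt amplitude, and the particular latency shifts are all assumptions chosen for demonstration.

```python
import math

def single_trial(t_ms, latency_ms, amplitude=10.0, sigma_ms=60.0):
    """Hypothetical single-trial P3 modelled as a Gaussian bump (microvolts)."""
    return amplitude * math.exp(-0.5 * ((t_ms - latency_ms) / sigma_ms) ** 2)

def average_erp(latencies_ms, times_ms):
    """Point-by-point average across trials, one latency per trial."""
    n = len(latencies_ms)
    return [sum(single_trial(t, lat) for lat in latencies_ms) / n
            for t in times_ms]

def half_max_width(wave, step_ms):
    """Total time (ms) during which the wave is at or above half its maximum."""
    half = max(wave) / 2.0
    return sum(1 for v in wave if v >= half) * step_ms

STEP_MS = 2
times = list(range(0, 1001, STEP_MS))

# Same single-trial amplitude in both cases; only latency variability differs.
aligned = average_erp([530] * 5, times)
jittered = average_erp([430, 480, 530, 580, 630], times)

print("aligned peak:", round(max(aligned), 2),
      "width:", half_max_width(aligned, STEP_MS))
print("jittered peak:", round(max(jittered), 2),
      "width:", half_max_width(jittered, STEP_MS))
```

With identical single-trial amplitudes, the jittered average has a lower peak and a broader deflection than the aligned average. Comparable peak widths across conditions, as observed here for studied and repeated targets, therefore argue against jitter as the source of an amplitude difference.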

The repeated items that were not recognised

elicited a late positive deflection whose characteristics

did not correspond to those of the classic P3

component or to any of its subcomponents. It had

a frontal rather than centro-parietal maximum, and its

maximum occurred relatively late, around 650 ms (as

compared to 520–540 ms for recognised items) and

after the response had been given (mean RT 580 ms).

It therefore appeared too late to index an implicit

memory or a partial recognition process, as described

by Rugg et al. (1998) and Walla et al. (1999). Instead,

it is more likely that the increased frontal positivity is

a reflection of some kind of postretrieval or decision-

related process. The presence of such a process may

have been facilitated by the strong negative response

bias, which consequently also led to a large number of

unintentional misses. These misses presumably origi-

nated from participants pressing the YES button

prematurely while at the same time (or shortly

thereafter) realising that it was the incorrect response.

Nevertheless, even if about half of the trials that

contributed to the ERPs for repeated misses were

elicited by such premature responses (and the other

half by ‘true’ misses), it was striking that, for these

items, no earlier positivity was observed that

resembled a P3-related deflection and that could have

been indicative of at least a partial activation of the

relevant memory trace. It can therefore be speculated

that the increased frontal positivity for the repeated

misses indexes some kind of postretrieval evaluation

process, albeit one that must have had access to memory

traces that were apparently not capable of attracting

sufficient attention resources to generate the correct

overt recognition response and to elicit a P3 compo-

nent. The frontal maximum of this component

supports this interpretation, since the prefrontal cortex

has been associated with postretrieval monitoring and

verification processes (Rugg et al., 2000).

In conclusion, behavioural and ERP results dem-

onstrated that the ERP-based memory assessment

procedure is sensitive to both study intention and

retrieval performance. Intentional encoding (study) as

compared to incidental encoding (repetition) led to

fewer errors, quicker responses, and larger P3 amplitudes.

Essentially, these were all quantitative differences,

suggesting that activation of the created

memory traces depended on similar neural systems.

Repeated items that did not receive a recognition

response could be differentiated from new items based

on their ERPs. This ERP difference is believed to be

associated with a postretrieval evaluation process, but

this needs further investigation.

Acknowledgement

I would like to thank Graham Wade for his help

with data collection.

References

Allen, J.J.B., 2002. The role of psychophysiology in clinical

assessment: ERPs in the evaluation of memory. Psychophysio-

logy 39, 261–280.

Allen, J.J.B., Movius, H.L., 2000. The objective assessment of

amnesia in dissociative identity disorder using event-related

potentials. Int. J. Psychophysiol. 38, 21–41.

Allen, J.J., Iacono, W.G., Danielson, K.D., 1992. The identification

of concealed memories using event related potential and implicit

behavioural measures: a methodology for prediction in the face

of individual differences. Psychophysiology 29, 504–522.

Bentin, S., Moscovitch, M., 1990. Psychophysiological indices of

implicit memory performance. Bull. Psychon. Soc. 28, 346–352.

Bentin, S., Moscovitch, M., Heth, I., 1992. Memory with and

without awareness: performance and electrophysiological evi-

dence of savings. J. Exper. Psychol., Learn., Mem., Cogn. 18,

359–366.

Farwell, L.A., Donchin, E., 1991. The truth will out: interrogative

polygraphy (Lie-detection) with event-related brain potentials.

Psychophysiology 28, 531–547.

Finnigan, S., Humphreys, M.S., Dennis, S., Geffen, G., 2002. ERP

‘old/new’ effects: memory strength and decisional factor(s).

Neuropsychologia 40, 2288–2304.

Johansson, S., Hofland, K., 1989. Frequency Analysis of English

Vocabulary and Grammar. Clarendon Press, Oxford.


Kok, A., 1997. Event-related potential (ERP) reflections of mental

resources: a review and synthesis. Biol. Psychol. 45, 19–56.

Kok, A., 2001. On the utility of P3 amplitude as a measure of

processing capacity. Psychophysiology 38, 557–577.

Lalouschek, W., Goldenberg, G., Merterer, A., Beisteiner, R.,

Lindinger, G., Lang, W., 1997. Brain/behaviour dissociation

on old/new distinction in a patient with amnesic syndrome.

Electroencephalogr. Clin. Neurophysiol. 104, 222–227.

McCarthy, G., Wood, C.C., 1985. Scalp distributions of event-related

potentials: an ambiguity associated with analysis of variance

models. Electroencephalogr. Clin. Neurophysiol. 62, 203–208.

Rosenfeld, J.P., Angell, A., Johnson, M., Qian, J., 1991. An ERP-

based, control-question lie detector analog: algorithms for

discriminating effects within individuals’ average waveforms.


Rosenfeld, J.P., Ellwanger, J., Sweet, J., 1995. Detecting simulated

amnesia with event-related brain potentials. Int. J. Psychophysiol.

19, 1–11.

Rugg, M.D., Mark, R.E., Walla, P., Schloerscheidt, A.M., Birch,

C.S., Allan, K., 1998. Dissociation of the neural correlates of

implicit and explicit memory. Nature 392, 595–598.

Rugg, M.D., Allan, K., Birch, C.S., 2000. Electrophysiological

evidence for the modulation of retrieval orientation by depth of

study processing. J. Cogn. Neurosci. 12, 664–678.

Snodgrass, J.G., Corwin, J., 1988. Pragmatics of measuring

recognition memory: applications to dementia and amnesia.

J. Exp. Psychol. Gen. 117, 34–50.

Spencer, K.M., Dien, J., Donchin, E., 1999. A componential

analysis of the ERP elicited by novel events using a dense

electrode array. Psychophysiology 36, 409–414.

Spencer, K.M., Vila Abad, E., Donchin, E., 2000. On the search for

the neurophysiological manifestation of recollective experience.


Squires, N.C., Squires, K.C., Hillyard, S.A., 1975. Two varieties of

long-latency positive waves evoked by unpredictable auditory

stimuli in man. Electroencephalogr. Clin. Neurophysiol. 38,

387–401.

van Hooff, J.C., Golden, S., 2002. Validation of an event-related

potential memory assessment procedure: intentional learning as

opposed to simple repetition. J. Psychophysiol. 16, 12–22.

van Hooff, J.C., De Beer, N.A.M., Brunia, C.H.M., Cluitmans,

P.J.M., Korsten, H.H.M., Tavilla, G., Grouls, R., 1995.

Information during cardiac surgery: an event-related potential

study. Electroencephalogr. Clin. Neurophysiol. 96, 433–452.

van Hooff, J.C., Brunia, C.H.M., Allen, J.J.B., 1996a. Event-related

potentials as indirect measures of recognition memory. Int. J.

Psychophysiol. 21, 15–31.

van Hooff, J.C., de Beer, N.A.M., Brunia, C.H.M., Cluitmans,

P.J.M., Korsten, H.H.M., 1996b. Detection of information

processing during general anaesthesia by means of event related

potentials. In: Bonke, B., Bovill, J.G., Moerman, N. (Eds.).

Memory and Awareness in Anaesthesia, vol. III. Van Gorcum,

Assen, NL, pp. 207–218.

Walla, P., Endl, W., Lindinger, G., Deecke, L., Lang, W., 1999.

Implicit memory within a word recognition task: an event-

related potential study in human subjects. Neurosci. Lett. 269,

129–132.
