9
Memory & Cognition 1995,23 (2), 192-200 Organizational factors in the effect of irrelevant speech: The role of spatial location and timing DYLAN M. JONES and WILLIAM J. MACKEN University of Wales CoUege of Cardiff, Cardiff, Wales Typically, hearing a repeated syllable produces minimal disruption of serial recall of visual lists, but a sequence of different syllables impairs performance markedly. Two conditions for presenting an identical sequence of three syllables are compared: one, in which, by means of stereophony, each syllable is assigned to the left, center, or right auditory locus (three streams not changing in state), and another, in which the same syllable sequence occurs in one location only (one stream with changing state). Disruption was significantly less in the stereophonic than in the monophonic condition. There was a joint effect of changing state and location, not an effect of the number of locations alone. In Experiment 2, temporal predictability was used to manipulate changing state. The disruptive effect of regular presentation of a repeated syllable was markedly increased when it was presented irreg- ularly. The results are discussed in the context of organizational factors in short-term memory. Even though subjects are explicitly instructed to ignore it, irrelevant speech present during a test of short-term memory for visually presented items results in a marked reduction in efficiency in serial recall (see, e.g., Colle, 1980; Colle & Welsh, 1976; Jones, Miles, & Page, 1990; Morris & Jones, 1990a, 1990b; Salame & Baddeley, 1982, 1986a, 1986b, 1987, 1989; see Jones & Morris, 1992, for a review). Several features of this effect are now under- stood. The meaning of the speech is not an important de- terminant of the effect, because disruption is roughly the same for English narrative speech as it is for speech in a foreign language (Colle, 1980; Colle & Welsh, 1976; Salame & Baddeley, 1982) or indeed for reversed speech (Jones et al., 1990). Moreover, the effect does not occur at encoding-that is, it is not an effect of interference by speech as the visual items are being presented-but rather occurs in memory as the items are rehearsed (Jones, 1994; Jones & Macken, 1993; Miles, Jones, & Madden, 1991; Salame & Baddeley, 1986b). Despite these ad- vances, we still do not have a satisfactory account of the attributes of speech that determine the disruption. Change in the composition of the irrelevant sound, or changing state, has emerged as an important factor. A re- peated sound, such as a single syllable, shows little dis- ruption of serial recall, but if there is some degree of variability between the discrete events in the sequence, disruption becomes marked (Jones, 1994; Jones & Macken, 1993, in press; Jones, Madden, & Miles, 1992). For ex- These experiments are part of a program of work funded by the United Kingdom's Economic and Social Research Council. Thanks are due Alison Murray, Andrew Bridges, and Philip Tucker for their com- ments on work reported here. Edward Buck helped by running subjects for Experiment I C. Requests for reprints should be sent to D. Jones, School of Psychology, University of Wales College of Cardiff, P.O Box 901, Cathays, CardiffCFI 3YG, U.K. ample, a sequence of syllables formed of the regularly repeated letter-name "c"-a steady-state sequence- produces little or no disruption, but a sequence made up of the syllables based on the letter names "c, h,j, u"-a changing state sequence-produces a more substantial decrement in performance (Jones, 1994; Jones & Macken, 1993; Jones et al., 1992). In the present series of experiments, we were concerned with refining our understanding of the process of interference from irrel- evant speech by focusing on the effect of organizational factors in the auditory stream. The present experimental series capitalizes on the rules of organization used for auditoryperception, known generically as streaming, to manipulate the dimension of changing state in the irrelevant speech (see Bregman, 1990). Auditory streaming refers to the result of percep- tual processes involved in the organization of sound, particularly by temporal integration of auditory stimuli into coherent wholes. So, for example, a fast sequence of sounds coming from the same spatial locus tends to be perceived as a single, temporally extended stream. Sim- ilarly, sounds in temporal proximity may be grouped to- gether into related wholes. These two techniques of streaming were used to test the idea that, if the organi- zation of speech into streams modifies the degree of dis- ruption by irrelevant speech, then the interference can- not be due merely to the similarity of phonology between the speech and the to-be-rememberedmaterial, as Salame and Baddeley (1982) have suggested. The present ex- periments were designed to supplement those in which the relation of the phonology of the heard stream to the phonology of the to-be-remembered items has been ma- nipulated systematically (Jones & Macken, in press). All these studies are at variance with the Salame and Badde- ley (1982, 1989) hypothesis by showing that the simi- larity of the heard and remembered material is relatively unimportant. Rather, the similarity in phonology within Copyright 1995 Psychonomic Society, Inc. 192

Organizational factors in the effect of irrelevant speech

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Organizational factors in the effect of irrelevant speech

Memory & Cognition1995,23 (2), 192-200

Organizational factors in the effect of irrelevantspeech: The role of spatial location and timing

DYLAN M. JONES and WILLIAM J. MACKENUniversity ofWales CoUege ofCardiff, Cardiff, Wales

Typically, hearing a repeated syllable produces minimal disruption of serial recall of visual lists,but a sequence of different syllables impairs performance markedly. Two conditions for presentingan identical sequence of three syllables are compared: one, in which, by means of stereophony, eachsyllable is assigned to the left, center, or right auditory locus (three streams not changing in state), andanother, in which the same syllable sequence occurs in one location only (one stream with changingstate). Disruption was significantly less in the stereophonic than in the monophonic condition. Therewas a joint effect of changing state and location, not an effect of the number of locations alone. InExperiment 2, temporal predictability was used to manipulate changing state. The disruptive effectof regular presentation of a repeated syllable was markedly increased when it was presented irreg­ularly. The results are discussed in the context of organizational factors in short-term memory.

Even though subjects are explicitly instructed to ignoreit, irrelevant speech present during a test of short-termmemory for visually presented items results in a markedreduction in efficiency in serial recall (see, e.g., Colle,1980; Colle & Welsh, 1976; Jones, Miles, & Page, 1990;Morris & Jones, 1990a, 1990b; Salame & Baddeley, 1982,1986a, 1986b, 1987, 1989; see Jones & Morris, 1992, fora review). Several features of this effect are now under­stood. The meaning of the speech is not an important de­terminant of the effect, because disruption is roughly thesame for English narrative speech as it is for speech in aforeign language (Colle, 1980; Colle & Welsh, 1976;Salame & Baddeley, 1982) or indeed for reversed speech(Jones et al., 1990). Moreover, the effect does not occurat encoding-that is, it is not an effect of interference byspeech as the visual items are being presented-but ratheroccurs in memory as the items are rehearsed (Jones,1994; Jones & Macken, 1993; Miles, Jones, & Madden,1991; Salame & Baddeley, 1986b). Despite these ad­vances, we still do not have a satisfactory account of theattributes of speech that determine the disruption.

Change in the composition of the irrelevant sound, orchanging state, has emerged as an important factor. A re­peated sound, such as a single syllable, shows little dis­ruption of serial recall, but if there is some degree ofvariability between the discrete events in the sequence,disruption becomes marked (Jones, 1994; Jones & Macken,1993, in press; Jones, Madden, & Miles, 1992). For ex-

These experiments are part of a program of work funded by theUnited Kingdom's Economic and Social Research Council. Thanks aredue Alison Murray, Andrew Bridges, and Philip Tucker for their com­ments on work reported here. Edward Buck helped by running subjectsfor Experiment IC. Requests for reprints should be sent to D. Jones,School of Psychology, University of Wales College of Cardiff,P.O Box 901, Cathays, CardiffCFI 3YG, U.K.

ample, a sequence of syllables formed of the regularlyrepeated letter-name "c"-a steady-state sequence­produces little or no disruption, but a sequence made upof the syllables based on the letter names "c, h,j, u"-achanging state sequence-produces a more substantialdecrement in performance (Jones, 1994; Jones &Macken, 1993; Jones et al., 1992). In the present seriesof experiments, we were concerned with refining ourunderstanding of the process of interference from irrel­evant speech by focusing on the effect of organizationalfactors in the auditory stream.

The present experimental series capitalizes on therules oforganization used for auditory perception, knowngenerically as streaming, to manipulate the dimension ofchanging state in the irrelevant speech (see Bregman,1990). Auditory streaming refers to the result ofpercep­tual processes involved in the organization of sound,particularly by temporal integration of auditory stimuliinto coherent wholes. So, for example, a fast sequence ofsounds coming from the same spatial locus tends to beperceived as a single, temporally extended stream. Sim­ilarly, sounds in temporal proximity may be grouped to­gether into related wholes. These two techniques ofstreaming were used to test the idea that, if the organi­zation of speech into streams modifies the degree ofdis­ruption by irrelevant speech, then the interference can­not be due merely to the similarity ofphonology betweenthe speech and the to-be-remembered material, as Salameand Baddeley (1982) have suggested. The present ex­periments were designed to supplement those in whichthe relation of the phonology of the heard stream to thephonology of the to-be-remembered items has been ma­nipulated systematically (Jones & Macken, in press). Allthese studies are at variance with the Salame and Badde­ley (1982, 1989) hypothesis by showing that the simi­larity of the heard and remembered material is relativelyunimportant. Rather, the similarity in phonology within

Copyright 1995 Psychonomic Society, Inc. 192

Page 2: Organizational factors in the effect of irrelevant speech

PERCEPTUAL ORGANIZATION OF IRRELEVANT SPEECH 193

the irrelevant stream is the important factor, such that in­creasing levels of phonological dissimilarity betweenthe discrete units within the auditory material tend to in­crease disruption (Jones & Macken, in press).

The approach adopted in the present experiments wasto hold the phonological contents of the irrelevantspeech constant and to use different methods of organi­zation to manipulate the degrees ofchanging state. Twoconvergent techniques were employed; in one, a se­quence of three different syllables was used. By allocat­ing each syllable to one of three spatial locations, audi­tory streaming is induced so as to result in the perceptionof a number of steady-state streams and thereby de­crease the disruption relative to that found with a singlechanging-state stream comprising the same three sylla­bles. In the other technique, the interstimulus interval(lSI) was used to alter the organization of a sequencethat consisted of a single repeated (steady-state) sylla­ble. Variable lSI was used to produce a changing-statesequence, which should in turn increase the degree ofdisruption relative to fixed lSI. Although grouping bylocation and by temporal proximity is well documented(see Bregman, 1990; Handel, 1989), nearly all observa­tions of it are based on techniques in which the sound isone on which the subject is instructed to focus. In theexperiments reported here, the subjects were instructedspecifically to ignore the sound and to attend only to thememory task. Arguably, therefore, any effects of atten­tion observed in these studies should arise when thematerial is out of the main attentional focus and maytherefore be said to arise preattentively (see also Jones,Macken, & Murray, 1993).

EXPERIMENTIA

In this experiment, streaming by spatial location wasused as a means of modifying the changing-state effect.Three conditions were compared: quiet, stereophonicpresentation, and monophonic presentation. Ifa syllableis heard in one ear, after which a different syllable is heardin the other ear, and if a third syllable is then played toboth ears, the stimuli will appear to come from threeseparate locations. If this cycle of events is repeated,particularly if the rate of presentation is high, then phe­nomenally, three separate streams corresponding to thethree spatial locations are heard. Moreover, from thepoint of view of the changing-state hypothesis, if a par­ticular syllable is associated with each of the locations,the content of each of the streams (one associated witheach ear and one located in the center of the head) willnot be changing in state. However, only one stream isformed if the same sequence of syllables is presented toboth ears; in this case the stream will appear to comefrom the same point in auditory space (subjectively lo­cated in the middle of the head). Since the successivesyllables are different within this single stream, condi­tions of changing state will now be met. In Experi­ment lA, we used a fixed sequence of three syllables;Jones et al. (1992) had shown that a fixed repeated se-

quence of a small set of syllables was as disruptive asrandom sequences of syllables from the same set. If or­ganizational factors were at play, the changing sequenceat the single locus should be significantly more disrup­tive than three unchanging streams.

MethodSubjects. Twenty undergraduate and graduate students (14 fe­

males and 6 males) were paid for participating in the experiment.Apparatus and Materials. Lists of to-be-remembered items

were constructed from the consonants F, K, L, M, Q, R, and Y.They were presented one at a time (on for 0.8 sec, off for 0.2 sec)in a random order on the screen of a Macintosh Ilcx computerusing HyperCard.

Two types of auditory material were prepared. In the first ofthese, the monaural presentation, the three utterances "C," "U,"and "0" (i.e., the letter-names "see," "you:' and "oh") were recordedin that order in a female voice digitized to 8-bit resolution at a sam­pling rate of22 kHz, using a MacRecorder analogue-to-digital con­verter. These utterances were then edited with SoundEdit digitalediting software in order to produce successive utterances of200­msec duration. A monaural recording ofapproximately 20-sec du­ration was created by repeatedly looping this sequence of three ut­terances (using digital editing techniques), and this was thenstored for use in a HyperCard environment.

A second type ofauditory material was created for stereophonicpresentation. The same three utterances were used in the same se­quence as before, but in this case each utterance was assigned dif­ferently in relation to the two audio channels: the syllable "U" wasplayed on the left channel, the syllable "0" was played on theright channel, and the syllable "C" was played on both channels.Again, this sequence was recorded as an apparently seamless loopto provide a recording of approximately 20-sec duration. In thisway, a recording was created, which, when played over head­phones, led to the perception of three individual unvarying streams­a stream of repeating "U"s, a stream of repeating "O"s, and astream of repeating "C"s---each stream corresponding to a differ­ent location, and each syllable within each stream repeating at a400-msec offset-to-onset interval. Note that the syllables were pre­sented sequentially (as in the monaural condition) but in three dif­ferent locations. This recording was also stored in digital form tobe used in the HyperCard program. The sequence of syllables andthe timing of events was therefore Identical for both conditions,only their assignment to channels was manipulated. Timing of thestimuli meant that there was no period of silence between succes­sive utterances, but this should make no material difference in theoutcome since prior work has shown that continuously voiced speech(produced by a synthesizer) produces effects almost identical tospeech with syllables separated by silence (Jones et aI., 1993).

Sound was played back at approximately 65 dB(A) as measuredby a sound level meter from an artificial ear. Only air-conditioningnoise was present during the quiet conditions, and after taking intoaccount the attenuation by the ear-cup ofthe headphones, we estimatethat this noise was at a level approaching the threshold of hearing.

Procedure. Subjects used a mouse to initiate each trial by click­ing on a HyperCard "button" on the screen of the computer. Afterthe seven consonants had been presented, the word "wait" flashedon the screen for 10 sec, after which the subjects were promptedto recall the list by writing it in Its initial order ofpresentation. Au­ditory conditions (mono-one varying stream; stereo-three un­varying streams; and quiet) were randomly allocated to each trial.Sound was "on" during presentation and rehearsal and "off" dur­ing recall. There were 60 trials in all-20 for each auditory con­dition. The subjects were instructed to ignore any sound theyheard, and they were reassured that they would not be tested on itscontents and that the sound would not contain any useful infor­mation. Before the expenment proper, the subjects were given five

Page 3: Organizational factors in the effect of irrelevant speech

194 JONES AND MACKEN

50

60-,-------------------,

practice trials in quiet. The experimental session lasted 35-40 nunfor each subject.

EXPERIMENTIB

number of sound sources and their contents. These sub­jects undertook 10 trials with the stereo and the monomaterial. Without exception, they judged the stimuli ap­propriately.

The results show that a disruptive effect arising fromthe presence of phonological change can be attenuatedby splitting the sound into (perceptually) separate un­changing streams. The relation ofphonological contentsof the speech to the material being rehearsed is alone in­sufficient to account for all the disruption of serial re­call; some account needs to be given ofthe way in whichsound is organized.

Before proceeding to Experiment 2, a number ofcheckswere undertaken on the results of Experiment I A.

Experiment I A suffered from an unfortunate choiceofsyllables, in that it was possible that the subjects heardthe meaningful sequence as "oh-see-you" in the monau­ral condition. Arguably, stereophonic presentation wouldbreak up this meaningful sequence so that the differ­ences between the two methods ofpresentation could beattributable to changes in meaning rather than to orga­nization in memory. Given what is known about the roleof meaning in the irrelevant speech effect, this seemedhighly unlikely. From its first demonstration by Colleand Welsh (1976), several studies have shown that mean­ing is not important (Colle, 1980; Salame & Baddeley,1982; Jones et aI., 1990). Nevertheless, on the remotepossibility of an effect of meaning arising in this case,another experiment was conducted in which the sylla­bles chosen were unlikely to make up a meaningful se­quence. In addition, the supplementary study was a checkon the possibility that streaming by spatial locationmight have an effect not in memory but at encoding. InExperiment l A, speech was played during both the pre­sentation and the rehearsal phases of the task, whichmeans that any disruption could have arisen from thedisruption ofencoding, not memory. Again, the fact thatdisruption does not occur at encoding but after the itemsare in memory and being rehearsed is well established(Jones & Macken, 1993; Miles et aI., 1991; Salame &Baddeley, 1986b), but on the remote possibility that theconditions of Experiment I were an exception to thisrule, we restricted exposure to irrelevant speech to theperiod following the presentation ofthe visual lists. Ourexpectation was that this procedure would produce ef­fects qualitatively similar to those in Experiment I A.

MethodSubjects. Twenty undergraduate and graduate students (13 fe­

males and 7 males) were paid for participating. All reported nor­mal hearing and normal (or corrected-to-normal) vision.

Apparatus and Materials. These were the same for Expert­ment 1A, but with three exceptions. First, the three utterances "Y,""J," and "X" (that is, the letter sounds "vee," "jay," and "ecks")were used, again recorded in a female voice. Second, the exposureto sound was restricted to the period following presentation of theto-be-remembered lists, instead ofthe whole ofthe trial pnor to re-

__ Quiet

__ Stereo (3 steady state streams)

___ Mono (1 changing state stream)

10

20

40

Results and DiscussionA 3 (auditory condition) X 7 (serial position) analy­

sis of variance (ANOVA) was carried out on the data.There were significant main effects of both auditorycondition [F(2,38) = 9.63, MS. = 9.40,p < .01] and se­rial position [F(6,1 14) = 9.24, MS. = 17.36, P < .01].The interaction of the two factors was also significant[F(l2, 228) = 1.873, MS. = 2.24, p < .05]. The serialposition curves for each condition are presented in Fig­ure I, which shows that the interaction was due to thefact that disruption increased with serial position.Planned comparisons indicated that performance in themono condition was significantly worse than that in thequiet condition [F(I,38) = 4.48, MS. = 9.40, p < .05],and the difference between the mono and stereo condi­tion was also significant [F(I,38) = 5.06, MS. = 9.40,p < .05]. The very considerable difference between thestereophonic and quiet conditions was highly signifi­cant [F(l,38) = 19.04, MS. = 9.40,p < .01].

As a check that streaming did occur in the appropri­ate manner with these stimuli, subsequent to the experi­ment an additional 5 subjects were asked to judge the

345Serial Position

Figure 1. Experiment lA: Serial position curves show the contrastbetween three auditory streams (one to the left, right, and center ofthe head), one stream (to the center of the head), and quiet. Error barsindicate standard error.

Page 4: Organizational factors in the effect of irrelevant speech

PERCEPTUAL ORGANIZATION OF IRRELEVANT SPEECH 195

50

O-'-----,-----,---,-----,---,---r--,---'

60.--------------------.

MethodSubjects. Twenty undergraduate and graduate students (12

males and 8 females) were paid for participating. All reported nor­mal hearing and normal (or corrected-to-normal) vision.

Apparatus and Materials. This experiment was identical toExpenrnent IB, except for the addition of the following condi­tions: stereo-steady state, in which a single syllable ("J") was re­peated every 200 msec and successive items were assigned to dif­ferent spatial positions in the order left-center-right (theoffset-to-onset interval being 400 msec per stream) as in thestereophonic conditions ofExperiments lA and IB; mono-steadystate, in which the same sequence was presented monaurally.These two conditions were contrasted with the stereo changing­state, mono changing-state, and control conditions of Experi-

EXPERIMENT ic

In Experiment 1C, we entertained yet a further possi­bility about the results of Experiment 1A-this time,that the effect of monaural presentation arises not be­cause of its changing-state content, but from the factthat one location of irrelevant speech simply is more po­tent than three, all else being equal. This view, an alter­native to the effect of streaming, predicts that even asingle repeated syllable should be less distracting ifit isdivided into three locations rather than presented at one.To test this possibility, in Experiment 1C two sets offac­tors were manipulated. As before, stereophonic presen­tation of a changing-state sequence ("V;" "J," "X") wascontrasted with its monaural presentation. This time,however, in addition to the contrast between a changing­state sequence presented monaurally and the same se­quence presented stereophonically in order to give riseto three steady-state streams, a further two conditionswere included, in which a steady state stream ("J") waspresented either monaurally or stereophonically. If theresults ofExperiments IA and IB were due to the num­ber of sound sources, with a single source giving rise tomore disruption than multiple sources, then the monau­rally presented steady-state sequence should be moredisruptive than the stereophonically presented steady­state sequence. If, however, the effect is one of modula­tion of changing state by streaming, one would expectthat the two methods of presenting steady-state se­quences should give broadly comparable results.

In summary, if the effect is related to the number ofsources of sound, only the method of presentation, notthe sequences of syllables, will be important. Specifi­cally, by this account both the monaural conditions (witheither steady-state or changing-state sequences) shouldproduce appreciably more disruption than both condi­tions of stereophonic presentation (again with steady­state or changing-state conditions). The streaming ex­planation makes different predictions-namely, that thedegree ofdisruption is jointly determined by the methodofpresentation and by the content ofthe syllable sequence.Specifically, we predicted that the monaural changing­state condition would be significantly more disruptivethan all others, since the other three conditions all con­tained steady-state streams.

762

10

20

call. Third, the retention interval was Increased from 10 to 18 secso that the subject's exposure to sound on each trial would beroughly equated In the two experiments.

Procedure. As for Experiment IA.

40

___ Quiet

-+-- Stereo (3 steady state streams)

--.- Mono (1 changing state stream)

ResultsAs before, a 3 (auditory condition) X 7 (serial position)

ANOVA was carried out on the error data. Once againthere were main effects of auditory condition [F(2,38) =16.48, MSe = &.14,p < .01]and serialposition [F(6,II4) =27.77, MSe = 5.83,p < .01]. These two factors interactedsignificantly [F(l2,228) = 3.37, MSe = 2.03, p < .01],and the pattern of interaction (see Figure 2) is strikinglysimilar to that found in Experiment 1A. Planned com­parisons revealed that each of the conditions was signif­icantlydifferent from the others [quietvs. stereo, F(I,38) =8.73,MSe = 8.14; quietvs. mono,F(l,38) = 32.95,MSe =8.14; stereo vs. mono, F(l,38) = 7.76, MSe = 8.14; forall comparisons,p < .01].

The results are in line with those of Experiment IA.They show that the effect was not due to the particularsyllables used, as well as that the effect occurred afterthe material had been encoded and was being rehearsedin memory. We undertook a further check on the resultsin Experiment 1C.

3 4 5Serial Position

Figure 2. Experiment IB: Replication of the conditions of Experi­ment lA using diffferent syllables. Serial position curves show thecontrast between three auditory streams (stereo), one stream (mono),and a quiet control condition. Error bars indicate standard error.

Page 5: Organizational factors in the effect of irrelevant speech

196 JONES AND MACKEN

ment IB. There were therefore five conditions in all, with condi­tions changing for each trial.

Procedure. This was the same as in Experiment IA, except thatthere were 16 trials for each condition.

ResultsAs in other experiments of this series, the error scores

were subject to an ANOVA with two factors, serial posi­tion (with seven levels) and auditory conditions (thistime with five levels). As before, there were main effectsof auditory condition [F(4,76) = 3.64, MS. = 7.22,p <.01] and serial position [F(6,114) = 13.82, MS. = 10.81,P < .01]. However, unlike in Experiments lA and lB,these two factors failed to interact significantly (F < 1). Inother studies we have found that the interaction of irrel­evant speech with serial position is not always consistent(see Jones & Macken, in press; Jones et aI., 1993), butany inconsistency of the present data in this respect isnot material to the hypothesis under test and may safelybe disregarded. The mean data for the five conditions arepresented in Figure 3. Planned comparisons revealedthat only one contrast was significant-that betweenquiet and mono [changing state [F(l,76) = 7.49, MS. =7.22, P < .01). All the other contrasts with quiet w~renonsignificant (all Fs < 1). Thus, these results also dif-

fer slightly from those ofExperiments IA and IB in thatthe changing-state stereo condition did not increase dis­ruption vis-a-vis quiet. However, the previous findingthat changing-state mono presentation is more disrup­tive than changing-state stereo was replicated [F(l, 76) =4.75, MSe = 7.22,p < .05].

This pattern of results is the one predicted by thestreaming hypothesis. It seems unlikely, therefore, thatthe results of Experiments 1A and IB arose because ofan effect of increasing the number of sources of soundalone; rather, the spatial distribution of sound must actin concert with the effect of changing state. As a series,the component parts ofExperiment 1 reveal a consistentand replicable effect of auditory streaming. In Experi­ment 2, we moved on to explore other ways in whichother types of organization of material can be used tomodulate the changing-state effect. In general terms, thepurpose ofExperiment 2 was the same as that ofExper­iment I, but this time the sequence consisted of a singlerepeated syllable and an auditory streaming phenome­non was used to induce a changing-state effect on visualserial recall.

EXPERIMENT 2

Experiment 2 tested the possibility that an auditorystream composed ofa single repeated syllable might stillgive rise to disruption of serial recall ifperceptual orga­nization based on temporal grouping should come intoplay. Typically, when a single sound, b~ it a tone or ~ syl­lable, is repeated at a regular rate, It forms a singlesteady-state stream. If, however, the same stimulus is re­peated at an irregular rate, what is perceived in some cir­cumstances is not a sequence of individual identicalstimuli but rather stimuli clustered together to formhigher order stimuli. The effect is based on such facto~s

as the temporal proximity of groups of sounds. ThISprocess is referred to as unitization (see,. e.g., Royer ~Robin, 1986). In this way, a temporally Irregular audi­tory input, although made up of identical phonologicalunits at one level, may be perceived as being composedof units of different length and composition at a higherlevel of organization. Hence, from the point of view ofstreaming, we hypothesized that even though the audi­tory input might contain a repeated syllable (and hen~e

show no irrelevant speech effect based on change inphonological terms alone), by changing the regularity ofthe syllable sequence, different sizes of phonologicalunit might be formed that would then qualify for condi­tions of changing state.

MethodSubjects. Twenty volunteers-undergraduates and postgradu­

ates-were paid an honorarium for participating in the study. Thegroup was made up of II females and 9 males. All. reported nor­mal hearing and normal (or corrected-to-normal) vision,

Apparatus and Materials. The lists were constructed and pre­sented in the same way as for Experiment 1.

Auditory materials were prepared by using digital signal pro­cessing techniques to sample and edit a single example of the ut-

JMono

Quiet

r+-

r+ + + ri-

II II II IIo

30

10

20

25

5

"VJX" "VJX" JStereo Mono Stereo

AUditory Condition

Figure 3. Experiment 1C: Recall proportions in each of five con­ditions, showing the joint action of changing-state auditory sequen~es

(steady-state and changing-state) and the method of presentation(stereophonic vs, monophonic). Error bars indicate standard error.

Page 6: Organizational factors in the effect of irrelevant speech

PERCEPTUAL ORGANIZATION OF IRRELEVANT SPEECH 197

40

35

45

1.92,p> .05]. Figure 4 gives the values of the serial po­sition X treatments interaction.

Planned comparisons revealed that the irregular rateand quiet conditions were significantly different [F(1 ,38)= 6.84, MSe = 5.32, p < .01], as was the contrast be­tween regular and irregular conditions [F(1,38) = 4.51,MSe = 5.32, p < .05]. The difference between quiet andregular presentation was not significant [F( 1,38) = 0.24,MSe = 5.32,p > .05].

As for Experiment 1, a test of the perception of the au­ditory material was undertaken with a fresh group of 5subjects. They were asked to make a forced-choice clas­sification of either "regular" or "irregular" for each ofthe two sounds used in the experiment (half were regu­lar and half were irregular). There was no case of mis­classification in 20 trials.

The results complement those of Experiment 1 byshowing the effect of streaming (here on the basis oftemporal regularity rather than localization), this time byaugmenting the small disruptive effect usually foundwith repeated syllables. They also suggest an importantqualification to an earlier assertion that processes asso­ciated with the detection of syllable boundaries are inpart responsible for the irrelevant speech effect. Joneset al., (1993) argued that only when the auditory streamis broken up into separate segmented entities is there amarked disruption ofserial recall, and they further spec­ulated that the unit of analysis for such segmentationmay correspond to the syllable. The results of Experi­ment 2 point to the fact that suprasyllabic units createdby temporal conjunction may also serve as the basis fora changing-state effect. However, organization based onhigher order analysis such as the meaning of the ele­ments in a sound stream is an unlikely basis for the dis­ruption. The fact that speech played backwards pro­duced effects as potent as forward narrative speech (cf.Jones et al., 1990) suggests that higher order organiza­tion imposed by the meaning of the irrelevant soundplays no major role in the disruption. Again, that the pre­dictability of the phonological content of the syllablestream does not influence the degree of disruption(Jones et al., 1992) further suggests that changing stateis being signified by a very low level of analysis, beforethe phonemic content of the sound is fully analyzed. Thefact that nonspeech sounds produce disruption similar inform and extent to speech again supports this point(Jones & Macken, 1993). However, the results of Ex­periment 2 suggest that suprasyllabic organization ispossible, but as a result of temporal organization. Takenwith the other evidence just reviewed, these results tendto confirm the generalization that organization of the ir­relevant stream may be based solely on representationsat the precategoricalleve1; that is, the organization seemsto be independent of language. But this is not the onlypossibility. Another is that the disruptive effect occurs atthe point ofgreatest change in the irrelevant stream, thisbeing when a relatively long silence is broken or at theend ofa set ofmultiple stimuli.' This explanation wouldnot involve the idea of higher order objects, but instead

76

___ Quiet

__ Regular lSI

__ Irregular lSI

345Serial Position

2

5

10

O-L--r---...--------.---,----.---,..-----.---'

50,------------------,

Figure4.Experiment 2: Serial positioncurves showingthe contrastbetween a sequence of repeated syllables presented regularly(500 msec offset to onset) and irregularly (with a mean of 500 msecoffset to onset), and quiet. Error bars indicate standard error.

20

30

15

terance "ah" in a male voice recorded at a pitch of approximatelyC3 (131 Hz). The editing software was then used to edit digitizedcopies of the utterance into a sequence. Twotypes ofmaterial werepresented: unvarying, with the "ah" (300 msec long) repeated witha fixed 500-msec interval (offset to onset); and varying, with themterutterance interval randomly selected from the range 0, 100,200, 800,900, and 1,000 msec (thus giving a sequence with an av­erage offset-to-onset interval of 500 msec). Pilot trials had indi­cated that a bimodal distribution of intervals was superior to a rec­tangular one in conveying a subjective impression of a brokenIrregular stream.

Procedure. The general procedure was identical to that of Ex­penment I. As before, conditions were allocated randomly to tri­als and the average level of sound was 65 dB(A). Again subjectswere asked to ignore any sound that they heard and they were toldthat they would not be tested on the contents of the auditory ma­terial. A typical experimental session lasted 35-40 min for eachsubject.

Results and DiscussionA two-factor repeated measures ANOVA was under­

taken with auditory conditions (varying rate, unvaryingrate, and quiet) and serial position (1-7) as factors. Asusual, the effect of serial position was highly significant[F(6,114) = 12.76, MSe = 9.45,p < .01]. The effect ofauditory treatments was also significant [F(2,38) = 3.87,MSe = 5.32, p < .05], but serial position and treatmentdid not interact significantly [F(12,228) = 1.27, MSe =

Page 7: Organizational factors in the effect of irrelevant speech

198 JONES AND MACKEN

the perception of the sharpest transience, and it has theattendant advantage of parsimony. Moreover, the resultwould harmonize well with findings ofJones et al. (1993)that cues to segmentation based on sharp transitions inpitch or loudness serve as the basis for segmenting theunattended stream.

In one respect, the results of Experiments 1 and 2 ap­pear to be discrepant, since the degree ofdisruption pro­duced by the steady-state conditions is less in Experi­ment 2 than it is in Experiment 1 (see Figures 1 and 4).At this juncture, it is worth recollecting that the maindifference between Experiments I and 2 is that in Ex­periment I there were up to three streams, whereas therewas always only one stream in Experiment 2. The dis­crepancy between conditions can be explained by sup­posing that there is a small residual degree ofdisruption,even in a single steady-state stream. By further suppos­ing that this is additive across streams, we would thenpredict that as the number of streams increases, so willthe magnitude of this residual disruption. So, for thestereo (steady-state) conditions of Experiment 1 inwhich three streams were formed we would expect thiseffect to be larger than in the regular condition of Ex­periment 2 in which only one steady state stream wasformed. The degree of disruption brought about by thechanging-state conditions of Experiments 1 and 2 werealso appreciably different. This is not too surprising inview of the qualitative differences in the way in whichchanging state was induced in each case; we have to as­sume that such qualitative differences will lead to quan­titative changes.

GENERAL DISCUSSION

The results, in summary, are the following: (I) Theusual disruption by changing-state stimuli of serial re­call may be attenuated by assigning each syllable in thesequence to a different spatial location, so that they be­come steady-state streams (Experiment IA). (2) This ef­fect is replicated with auditory stimuli that cannot beformed into a meaningful sequence and when the speechis presented only during a retention interval (Experi­ment IB). (3) The effects found in Experiments IA and1B are not due to a confounding of the number of loca­tions. Ifwe co-vary the number of sources with the de­gree ofchanging state, we then find that disruption is de­pendent on both changing state and spatial location andthat maximal disruption occurs with changing-statestimuli from one spatial location, but there is no differ­ence between mono and stereo steady-state conditions.(4) With sequences that consist ofa single repeated syl­lable, presentation of the syllable at regular intervalsproduces minimal disruption of serial recall, but thesame syllable, when presented irregularly, produces sig­nificant degrees of disruption. Amendments to Salameand Baddeley's (1982) account of the irrelevant speecheffect may be necessary in light of these findings. In ad­dition, the results inform our own theoretical account ofthe irrelevant speech effect. Finally, the results carry

with them certain implications for the study of auditorystreaming, particularly in relation to whether streamingcan occur preattentively.

Implications for Models ofthe IrrelevantSpeech Effect

Without modification, the model of the irrelevantspeech effect offered by Salame and Baddeley (1982,1989) cannot account for the effects of streaming demon­strated here. In essence their model assumes that inter­ference between what is heard and what is rehearsed isbased largely on the phonological similarity of the twostrands of material. In each of the present experiments,the relation between the phonology oflists' contents andthe irrelevant speech was fixed, yet it was possible toaugment or diminish the degree ofdisruption by stream­ing. Of course, the Salame and Baddeley (1982, 1989)position could be modified to incorporate the effects de­scribed here. One modification would be that non­changing-state stimuli would not enter memory (thephonological store) and that organizational factors gov­ern the streaming of sound to create either steady-stateor changing-state streams. However, even this modifiedapproach fails to explain why the processing undertakenwithin memory also partly determines the degree ofdis­ruption-in particular, that disruption by irrelevantspeech is minimal in tasks not requiring serial recall(Jones & Macken, 1993; Morris & Jones, 1990b; Salame& Baddeley, 1990).

Ofcourse, the present results are not sufficient inthem­selves to overturn the Salame and Baddeley (1982, 1989)account. However, the weight of evidence, to which thepresent experiments contribute, is accumulating againsttheir position. There is only one clear demonstration ofthe phonological similarity effect with the irrelevantspeech paradigm (Salame & Baddeley, 1982, Experi­ment 5). In that study, irrelevant digits or words thatsounded like the to-be-remembered digits (e.g., tun forone, woo for two) disrupted serial recall more than didirrelevant dissimilar words (e.g., tennis, wicket, jelly).However, subsequent attempts to reproduce this effecthave been unsuccessful. In three experiments, Jones andMacken (in press) failed to find effects of phonologicalsimilarity similar to those of Salame and Baddeley (1982).Rather, this recent work suggests than within-streamphonological similarity is the important factor, so thatsequences ofirrelevant words that rhymed (e.g., sea,flea,key) interfered less than did those that did not rhyme(e.g., hat, cow, nest). Arguably, rhyming sequences con­tain less changing-state information than do those thatdo not rhyme.

The object-oriented episodic record model (O-OER;see Jones, 1993) attempts to explain the changing-stateeffect by supposing that auditory and visual event se­quences are represented within memory as object!pointer combinations. An object is the representation ofa discrete event in the auditory environment. A pointerassociated with that object indicates, stochastically, thelocation of the next discrete object in a stimulus se-

Page 8: Organizational factors in the effect of irrelevant speech

PERCEPTUAL ORGANIZATION OF IRRELEVANT SPEECH 199

quence. If an event is repeated, as in a steady-state se­quence of sounds, then the pointer is self-referential;that is, only a single representation of the event is heldin memory, coupled to a single pointer that refers to theobject itself. If the sequence is made up of stimuli thatare different, as in a changing-state sequence, then theobjects and pointers form a related series. For example,in the case ofauditory stimuli, if a sequence arises froma single source such as the same speaker in a single spa­tiallocation, then the objects are joined by pointers thatencode the sequential organization of successive stimuli.

In the auditory case, it is assumed that objects are formedand linked according to the same principles as those ofauditory streaming. Therefore it is possible for severalstreams to be represented within memory, each made upofobject/pointer combinations. Interference ofserial re­call by irrelevant speech arises because the process ofre­hearsal also involves such objects and their associatedpointers. During rehearsal, the presence ofother object/pointer combinations such as those from changing-stateirrelevant speech makes the retracing of the sequence ofrepresentations of candidates for rehearsal more diffi­cult. If the irrelevant sound consists ofa repeated sylla­ble, only a single object with a single self-referentialpointer is set up. Here the retracing ofthe object/pointersequence for serial recall is less difficult, simply becausethere are fewer competing objects and pointers.

Implications for Auditory AttentionAt first blush, the results seem to fit with the idea that

factors related to auditory attention can provide a full ac­count of the pattern of disruption found in the presentexperiments. Although we think that on balance thisseems to be a reasonable conclusion, a number ofcaution­ary notes need to be sounded. We can be reasonably con­fident that the results of Experiment I arise from thejoint action ofstreaming and changing state. The resultsof Experiment lC are particularly comforting in this re­gard by showing that a possible confounding effect ofthe mere number of locations is not responsible for theeffect. But we can be less sure about the results of Ex­periment 2, Here there is a possibility that the dominantfactor is not the suprasyllabic organization of sounds,but some factor related to the irregularity of the point ofgreatest change in the irrelevant stream, such as when arelatively long silence is broken, or at the end of a clus­ter of syllables. Here, the dominant factor is energy tran­sition, rather than any higher order organizational factor.

Notwithstanding these reservations, the present re­sults have a number ofgeneral implications for the studyof auditory streaming as well as for the study of mem­ory, because we have been able to demonstrate that theeffects of streaming may be made manifest even whenthe subject's attention is directed away from the sound.Unlike previous work on auditory streaming in whichsubjective report was the only dependent variable, the ir­relevant speech paradigm allows us to infer that stream­ing occurs when the sound is not in the attentional focus(but see Bregman & Rudnicky, 1975). Speculatively, we

may regard this organization as occurring at the preat­tentive level.

The advantage of irrelevant speech studies is that theindex ofstreaming is taken when the subject is instructedto focus attention away from the sound. It is thereforepossible to assert with more confidence than has hith­erto been possible that auditory streaming is governedby preattentive processes (see also Jones et al., 1993, fora discussion). However, even in the irrelevant speechparadigm we cannot rule out the possibility that subjectsswitch attention momentarily to the irrelevant material,and thus it may only be safe to argue that the main­tenance of the percept can occur without attention di­rected toward the auditory modality.

One way ofconstruing the different effects ofchanging­state and steady-state sequences on serial recall is to re­gard them as an effect of habituation. The notion of ha­bituation seems to be inappropriate to the effects foundby us. First, it does not explain why the effect of speechdepends very much on the nature of the task undertakenby the subject; for example, only tasks requiring memoryfor serial order are disrupted by irrelevant speech (seeMorris & Jones, 1990b; Salame and Baddeley, 1990). Inaddition, it fails to predict or explain why the effect of ir­relevant speech is greater for phonologically dissimilarlists than it is for phonologically similar lists (Colle &Welsh, 1976; Jones & Macken, in press; Salame & Bad­deley, 1986a). Although habituation can explain the re­duction in response to a repeated stimulus, it cannot,without considerable elaboration, be modified to accountfor the interaction of the effects of the repeated stimuluswith activities at a postcategoricallevel within memoryas the items from the to-be-remembered list are being re­hearsed. Second, at an empirical level, there is a range offindings that do not fit neatly within an explanationbased on habituation. For example, in an examination ofthe possible effects of habituation over successive trialswith irrelevant speech conditions randomized over trials,there was no diminution in the irrelevant speech effect(Jones et al., 1992). Third, the typical exposure to soundin the present experiments is about 18 sec per trial, whichis a relatively short interval for habituation effects to de­velop. Habituation-like effects have been found (Morris& Jones, 1990a) for prior exposure to speech for the ir­relevant speech effects, but for continuous exposures of20 min. Unpublished studies show little habituation over10 min (Bridges & Jones, 1992).

In reviewing the association between attention andmemory, Cowan (1988) notes:

A lingering doubt about all of the effects of selective at­tention is that they might occur in memory rather than per­ception; that is, subjects might initially encode attended andunattended stimuli equally well, but they might translatethe attended target into a response sooner while allowinginformation from an unattended or poorly attended chan­nel to decay. (p. 175)

Although it may be premature to convert this doubtinto certainty on the basis of present research alone, we

Page 9: Organizational factors in the effect of irrelevant speech

200 JONES AND MACKEN

argue that it is gradually being established that p.ercep­tual organization is an inherent property of what IS usu­ally regarded as short-term storage, not a precursor to it,and we claim that phenomena of auditory attention maybe explained with reference to these organizational fac­tors (see also Penney, 1989, for a discussion of factorsrelating to streaming in memory).

REFERENCES

BREGMAN, A. S. (1990). Auditory scene analysis. Boston, MA: MITPress.

BREGMAN, A. S., & RUDNICKY, A. (1975). Auditory segregation:Stream or streams? Journal of Experimental Psychology: HumanPerception & Performance, 1,263-267.

BRIDGES, A., & JONES, D. M. (1992). Habituation and the irrelevantspeech effect: The role ofduration. Unpublished manuscript.

COLLE, H. A. (1980). Auditory encoding in visual short-term recall:Effects of noise intensity and spatial location. Journal of VerbalLearning & VerbalBehavior, 19,722-735.

COLLE, H. A., & WELSH, A. (1976). Acoustic masking in primarymemory. Journal ofVerbalLearning & VerbalBehavior, 15, 17-32.

COWAN, N. (1988). Evolving conceptions ofmemory storage, selectIveattention, and their mutual constraints within the human informa­tion-processing system. Psychological Bulletin, 104,163-191.

HANDEL, S. (1989). Listening: An introduction to the perception ofau-ditory events. Cambridge, MA: MIT Press. .

JONES, D. M. (1993). Objects, streams and threads of auditory atten­tion. In A. D. Baddeley & L. Weiskrantz (Eds.), Attention, awarenessand control (pp. 87-104). Oxford: Oxford University Press.

JO~ES,D. M. (1994). Disruption of memory for lipread lists by irrele­vant speech: Further support for the changing state hypothesis.Quarterly Journal ofExperimental Psychology, 47A, 143-160.

JONES, D. M., & MACKEN, W. J. (1993). Irrelevant tones also producean irrelevant speech effect: Implications for phonological coding inmemory. Journal ofExperimental Psychology: Learning, Memory,& Cognition, 19, 369-381.

JONES, D. M., & MACKEN, W.J. (in press). Phonological similarity in theirrelevant speech effect: Within- or between-stream similarity? Jour­nal ofExperimental Psychology: Learning, Memory, & Cognition.

JONES, D. M., MACKEN, W.J., & MURRAY, A. C. (1993). Disruption ofvisual short-term memory by changing-state auditory stimuli: Therole of segmentation. Memory & Cognition, 21, 3 I 8-328.

JONES, D. [M.], MADDEN, C., & MILES,C. (1992). Privileged access byIrrelevant speech to short-term memory: The role ofchanging state.Quarterly Journal ofExperimental Psychology, 44A, 645-669.

JONES, D. M., MILES, C, & PAGE, J. (1990). DisruptIon ofproof-readingby irrelevant speech: Effects of attention, arousal or memory? Ap­plied Cognitive Psychology, 4, 89-108.

JONES, D. M., & MORRIS, N. (1992). Irrelevant speech and senal recall:Implications for theories ofattention and working memory. Scandi­navian Journal ofPsychology, 33, 212-229.

MILES, C, JONES, D. M., & MADDEN, C. (1991). Locus of the irrelevantspeech effect in short-term memory. Journal ofExperimental Psy­chology: Learning, Memory, & Cognition, 17,578-584.

MORRIS, N., & JONES, D. M. (1990a). Habituation to irrelevant speech:Effects on a visual short-term memory task. Perception & Psycho-physics, 47, 291-297. .. .

MORRIS, N., & JONES, D. M. (I 990b). Memory updating In workingmemory: The role of the central executive. British Journal ofPsy­chology, 81, I II-121.

PENNEY, C. G. (1989). Modality effects and the structure ofshort-termverbal memory. Memory & Cognition, 17,398-422.

ROYER, F. L., & ROBIN, D. A. (1986). On the perceived unitizationof repetitive auditory patterns. Perception & Psychophysics, 39,9-18.

SALAME, P., & BADDELEY, A. D. (1982). Disruption of short-termmemory by unattended speech: Implications for the structure ofworking memory. Journal of Verbal Learning & Verbal Behavior,21, 150-164.

SALAME, P., & BADDELEY, A. D. (1986a). Phonological factors in STM:Similarity and the unattended speech effect. Bulletin ofthe Psycho­nomic Society, 24, 263-265.

SALAME, P., & BADDELEY, A. D. (l986b). The unattended speech ef­fect: Perception or memory? Journal ofExperimental Psychology:Learning, Memory, & Cognition, 12, 525-529.

SALAME, P., & BADDELEY, A. D. (1987). NOise, unattended speech andshort-term memory. Ergonomics, 30,1185-1194.

SALAME, P., & BADDELEY, A. D. (1989). Effects of background musicon phonological short-term memory. Quarterly Journal ofExperi­mental Psychology, 41A, 107-122.

SALAME, P., & BADDELEY, A. [D.] (1990). The effects of irrelevantspeech on immediate free recall. Bulletin ofthe Psychonomic Soci­ety, 28, 540-542.

NarE

I. Thanks are due Nelson Cowan for making this suggestIon duringthe review process.

(Manuscript received August 26, 1993;revision accepted for publication March 14, 1994.)