The Neurophysiology of Backward Visual Masking: Information Analysis

Edmund T. Rolls, Martin J. Tovée, and Stefano Panzeri
University of Oxford

© 1999 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 11:3, pp. 300–311

Abstract

Backward masking can potentially provide evidence of the time needed for visual processing, a fundamental constraint that must be incorporated into computational models of vision. Although backward masking has been extensively used psychophysically, there is little direct evidence for the effects of visual masking on neuronal responses. To investigate the effects of a backward masking paradigm on the responses of neurons in the temporal visual cortex, we have shown that the response of the neurons is interrupted by the mask. Under conditions when humans can just identify the stimulus, with stimulus onset asynchronies (SOA) of 20 msec, neurons in macaques respond to their best stimulus for approximately 30 msec. We now quantify the information that is available from the responses of single neurons under backward masking conditions when two to six faces were shown. We show that the information available is greatly decreased as the mask is brought closer to the stimulus. The decrease is more marked than the decrease in firing rate because it is the selective part of the firing that is especially attenuated by the mask, not the spontaneous firing, and also because the neuronal response is more variable at short SOAs. However, even at the shortest SOA of 20 msec, the information available is on average 0.1 bits. This compares to 0.3 bits with only the 16-msec target stimulus shown and a typical value for such neurons of 0.4 to 0.5 bits with a 500-msec stimulus. The results thus show that considerable information is available from neuronal responses even under backward masking conditions that allow the neurons to have their main response in 30 msec. This provides evidence for how rapid the processing of visual information is in a cortical area and provides a fundamental constraint for understanding how cortical information processing operates.

INTRODUCTION

Our visual environment is constantly changing. To interact with it in real time we need to rapidly process and interpret visual stimuli. How fast our visual system can do this, and the amount of time needed at each synapse for the computations it performs, is of fundamental importance for understanding cortical function (Golledge, Hilgetag, & Tovée, 1996; Wallis & Rolls, 1997). We have previously investigated this vital question using a visual backward masking paradigm. In this paradigm there is a brief presentation of a test stimulus, which after a brief interval (the stimulus onset asynchrony, or SOA) is rapidly followed by the presentation of a second stimulus (the mask), which impairs or masks the perception of the test stimulus. Although backward masking has been extensively used psychophysically (Humphreys & Bruce, 1989), there is little direct evidence of the effects of visual masking on neuronal responses in cortical areas (cf. Schiller, 1968 for lateral geniculate nucleus). In two of the studies that have been performed (Rolls & Tovée, 1994; Rolls, Tovée, Purcell, Stewart, & Azzopardi, 1994) it was found that under short SOA conditions, visual neurons in the primate temporal cortex fire for a very short time. In particular, with SOAs of 20 msec it was found that the face selective neurons responded for only 20 to 30 msec. This suggests that presentation of the mask interrupts the processing of the test stimulus. The neurophysiological data can be compared directly with the effects of backward masking in human observers studied in the same apparatus with the same stimuli. For the human observers, identification of which face from a set of six had been seen was 50% correct with an SOA of 20 msec, and 97% correct with an SOA of 40 msec (corrected for guessing) (Rolls et al., 1994). Comparing the human performance and the macaque neuronal responses under the same stimulus conditions leads to the conclusion that, when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only 20 to 30 msec. A subsequent study by Kovacs, Vogels, and Orban (1995) using a similar backward masking paradigm combined with primate electrophysiology confirmed our results.

The results of the masking experiments were consistent with previous work that suggests that very little time is required at each processing area for object recognition (see Tovée, 1994). The techniques of information theory have been used to analyze the responses of visual neurons in the temporal visual cortex of awake, behaving macaques (Tovée, Rolls, Treves, & Bellis, 1993;
Tovée & Rolls, 1995). Up to 64.9% of the information available in a 400-msec period is available in a 20-msec sample near the start of the spike train, and up to 87% is available in a 50-msec sample. The response latencies in different cortical areas also suggest a processing duration of 10 to 20 msec at each area, both in primates (Raiguel, Lagae, Gulyas, & Orban, 1989; Perrett, Hietanen, Oram, & Benson, 1992; Vogel & Orban, 1994) and in cats (Dinse & Kruger, 1994), a figure consistent with a recent visual evoked potential study in humans (Thorpe, Fize, & Marlot, 1996). Moreover, we have also recently shown that visual learning can occur very rapidly. The simple prepresentation of a stimulus can significantly alter the response of a cell to the stimulus when it is shown in an ambiguous situation just seconds later, suggesting the cell has "learned" to recognize the stimulus under the ambiguous conditions (Tovée, Rolls, & Ramachandran, 1996).

The masking experiment we described (Rolls & Tovée, 1994; Rolls et al., 1994) measured the firing rate to an effective stimulus, or the average firing rate to all the stimuli, as a function of the SOA. However, that analysis did not take into account how the SOA affected the responses of the cell not just to the best stimulus or the average, but also to the less effective stimuli for a cell. It is possible that at short SOAs, neurons do not actually discriminate between effective and less effective stimuli as well as they do with long and unmasked stimulus presentation. One way to address this is to measure how SOA influences the difference in firing rate to the best and the least effective stimulus in the set of stimuli. Kovacs et al. (1995) showed that this difference measure was greatly decreased when short SOAs are used. What is still to be answered is if at short SOAs the neuronal responses can actually quantitatively discriminate between different stimuli, and if the differences between neuronal responses to different stimuli are still significant with respect to the variability of the responses and the spontaneous activity. The proper way to analyze this is to use an information theoretic measure, which will reflect at the same time how different the firing rates are to different stimuli and how significant the differences in the responses are, taking into account the noise (see Optican & Richmond, 1987; Rieke, Warland, de Ruyter van Steveninck, & Bialek, 1996; Rolls & Treves, 1998; Tovée et al., 1993). In this paper we use an information theoretic approach to measure how much information is actually provided by the neurons at different SOAs. The results show that at short SOAs rather less information is available than might be predicted from the average firing to all stimuli, because at short SOAs the responses to the different stimuli become much smaller, but some spontaneous firing remains.
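
To make concrete what an information theoretic measure of this kind captures, the following is a minimal sketch of a plug-in estimate of the mutual information between the stimulus shown and the spike count on a trial. It is an illustration only and is not the procedure used in this paper, which is described in Methods; in particular it applies no correction for the upward bias that limited numbers of trials introduce. The function name and the spike counts are hypothetical.

```python
import math
from collections import Counter


def mutual_information_bits(responses_by_stimulus):
    """Plug-in estimate of the stimulus-response mutual information in bits.

    responses_by_stimulus maps a stimulus label to a list of spike counts,
    one per trial. Stimulus probabilities are taken from the trial counts.
    No sampling-bias correction is applied (illustrative assumption).
    """
    n_total = sum(len(trials) for trials in responses_by_stimulus.values())
    # Marginal distribution of responses, pooled over all stimuli.
    pooled = Counter(
        r for trials in responses_by_stimulus.values() for r in trials
    )
    info = 0.0
    for trials in responses_by_stimulus.values():
        p_s = len(trials) / n_total
        for r, count in Counter(trials).items():
            p_r_given_s = count / len(trials)
            p_r = pooled[r] / n_total
            info += p_s * p_r_given_s * math.log2(p_r_given_s / p_r)
    return info


# Hypothetical spike counts: responses that never overlap between two stimuli.
separated = {"best": [5, 5, 5, 5], "worst": [0, 0, 0, 0]}
# Hypothetical spike counts: smaller, overlapping responses, as might occur
# when the selective part of the response is attenuated.
overlapping = {"best": [3, 2, 3, 1], "worst": [1, 1, 2, 1]}

print(mutual_information_bits(separated))
print(mutual_information_bits(overlapping))
```

With two equiprobable stimuli whose responses never overlap, the estimate is 1 bit, the maximum possible; when the responses to the two stimuli overlap, the estimate falls, even though spikes are still being emitted.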

RESULTS

Using the masking protocol, we recorded from a total of 34 cells in the inferior temporal cortex and the cortex in the banks and floor of the superior temporal sulcus. It was possible to complete the information analyses on the subset of 15 of these cells for which sufficient trials were available for this type of information-theoretic analysis (see Methods for details). None of the cells responded to the mask stimulus itself.

Figure 1 provides examples of the responses of one cell to the most effective and to the least effective stimulus, without a mask and with a mask at an SOA of 20 msec. It is notable that in the unmasked condition (indicated by an "SOA" of 1000 msec) the cell fired for approximately 120 msec to the effective stimulus, even though the stimulus lasted only for 20 msec. (A more typical value is 200 to 300 msec, as will be shown in Figure 3.) It is also shown that the mask effectively interrupts the firing of the neuron. It is data of this type that were the subject of the information analysis described here. The latency of the neuronal response for this and the other neurons described here was in the region of 75 to 85 msec.

As a preliminary to the analysis, we show in Figure 2 the effect of the SOA on the neuronal responses averaged across the population of 15 neurons. The responses for the most (max) and the least (min) effective stimuli are shown for the period 0 to 200 msec with respect to stimulus onset. Although the main effect of the mask on the number of spikes emitted was evident at shorter SOAs and was smaller than for the larger set of neurons analyzed previously (Rolls & Tovée, 1994), there was a statistically significant reduction of the most effective firing rate due to the mask (as tested by a one-way analysis of variance, or ANOVA) at p < 0.05. There was little effect (not significant) of the mask on the responses to the least effective stimulus in the set, for which the number of spikes was close to the spontaneous activity. (For Figure 2 and the ANOVA, the responses of different cells were combined by scaling them so that the average response across SOAs of each cell to the most effective stimulus was that of the average response to the most effective stimulus of the population of cells across the different SOAs. We used this normalization because it removes the variability that is just due to the inhomogeneity of the cell sample without changing the dependence of the responses on the SOAs.)
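
The normalization described in parentheses above can be sketched as follows. This is one reading of the verbal description, not the authors' code: each cell's responses to its most effective stimulus are multiplied by a single factor so that the cell's mean across SOAs equals the population mean. The function name and the numbers are hypothetical, and every cell is assumed to have a nonzero mean response and the same set of SOAs.

```python
def normalize_cells(responses):
    """Scale each cell so that its mean across SOAs equals the population mean.

    responses[i][j] is the response (number of spikes) of cell i to its most
    effective stimulus at the j-th SOA. Each cell is multiplied by one
    constant, so the shape of its dependence on SOA is unchanged.
    """
    cell_means = [sum(cell) / len(cell) for cell in responses]
    population_mean = sum(cell_means) / len(cell_means)
    return [
        [value * population_mean / mean for value in cell]
        for cell, mean in zip(responses, cell_means)
    ]


# Two hypothetical cells with the same SOA dependence but different
# overall response magnitudes (columns: short SOA, longer SOA, no mask).
cells = [[2, 4, 6], [10, 20, 30]]
print(normalize_cells(cells))  # [[6.0, 12.0, 18.0], [6.0, 12.0, 18.0]]
```

After scaling, the two hypothetical cells coincide: the variability due only to their different overall response magnitudes has been removed, while the dependence on SOA is preserved.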

Also as a preliminary to the analysis, we show in Figure 3 the mean firing rate computed in 50-msec epochs for the 15 cells to the most effective stimulus and to the least effective stimulus as a function of the SOA. We also show the average response to all stimuli. It can be seen that the average firing rate across stimuli does reduce to some extent as the SOA becomes shorter, but that there is a much larger decrease in the difference in the responses to the best and the least effective stimuli. It is also clear that the responses, which are all to a 16-msec stimulus, last for longer in the unmasked condition ("SOA" = 1000 msec) than at an SOA of 20 msec.

As a further preliminary to the information analysis, for which the information shown will be that available by different poststimulus times, we show in Figure 4 the average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli. It can be seen that the number of spikes continues to increase a little more for the unmasked than for the masked conditions, but that at short SOAs (e.g., 20 msec) the cumulated number of spikes that separates the most effective from the least effective stimulus becomes quite small. This is an indication that at short SOAs the information available about which stimulus was shown might be quite small relative to that with longer SOAs or no masking stimulus.
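
The two ways of counting spikes used in these preliminary analyses, in discrete 50-msec epochs and cumulated from stimulus onset up to a given poststimulus time, can be sketched as below. The function names and the spike times are hypothetical, and the choice of half-open intervals starting at 0 msec is an assumption made for the illustration.

```python
def cumulated_counts(spike_times_ms, time_points_ms):
    """Number of spikes from stimulus onset (0 msec) up to each time point."""
    return [
        sum(1 for t in spike_times_ms if 0 <= t < end)
        for end in time_points_ms
    ]


def epoch_counts(spike_times_ms, epoch_ms, n_epochs):
    """Number of spikes in consecutive non-overlapping epochs from 0 msec."""
    counts = [0] * n_epochs
    for t in spike_times_ms:
        index = int(t // epoch_ms)
        # Ignore spikes before stimulus onset or after the last epoch.
        if t >= 0 and index < n_epochs:
            counts[index] += 1
    return counts


# Hypothetical spike times (msec after stimulus onset) on a single trial.
spikes = [80, 95, 110, 130, 160, 240]

print(cumulated_counts(spikes, [50, 100, 150, 200, 250]))  # [0, 2, 4, 5, 6]
print(epoch_counts(spikes, 50, 5))                         # [0, 2, 2, 1, 1]
```

The cumulated count can only grow with poststimulus time, whereas the epoch count shows when the spikes actually occurred; the same distinction applies to the cumulated and the epoch-based information measures.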

Figure 1. Examples of the responses of one cell to the most effective (a) and to the least effective stimulus (b) without a mask, and with a mask at an SOA of 20 msec (c vs. d). The responses are shown in rastergram and peristimulus time histogram form. The stimulus appeared at time 0.


Figure 5 shows the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition. It is clear that there is considerable information in the unmasked condition, with an average value reached typically by 150 msec poststimulus of 0.30 bits. In the masked condition, the information is progressively reduced as the SOA is decreased to 20 msec. At an SOA of 40 msec the average information was 0.16 bits (53%), and at an SOA of 20 msec the average information was 0.10 bits (33%). Thus the mask produces a very considerable reduction in the information available about which stimulus was shown. The information tends to decrease after approximately 250 msec, indicating that after this time the responses tend to introduce noise and no net further information about which stimulus was shown. This issue is addressed specifically in Figure 6, which shows the information available in 50-msec epochs taken at different poststimulus times with different SOAs. It is clear that most information is available in 50-msec epochs taken up to 200 to 250 msec. At longer poststimulus times there is still some information in the unmasked condition, but little in the masked conditions. The reason for the small decrease in information shown in Figure 5 at poststimulus times longer than approximately 250 msec can overall be accounted for by the fact that the large difference in the responses to the different stimuli evident by 250 msec tends to become proportionally smaller as more spikes in the longer poststimulus times are included. Any firing after 250 msec poststimulus is effectively spontaneous activity of the neuron, and therefore contributes only noise about the stimulus.
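
The percentages quoted above for the masked conditions are the masked information expressed as a fraction of the 0.30 bits available without a mask. A short check of that arithmetic, using only the values given in the text (the function name is ours):

```python
def percent_of_unmasked(masked_bits, unmasked_bits):
    """Masked information as a percentage of the unmasked information."""
    return 100 * masked_bits / unmasked_bits


UNMASKED_BITS = 0.30                 # average information, no mask
MASKED_BITS = {40: 0.16, 20: 0.10}   # SOA in msec -> average information

for soa, bits in MASKED_BITS.items():
    percent = percent_of_unmasked(bits, UNMASKED_BITS)
    print(f"SOA {soa} msec: {bits:.2f} bits = {percent:.0f}% of unmasked")
```

This reproduces the 53% and 33% figures for SOAs of 40 and 20 msec respectively.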

To show that the results in Figures 5 and 6 are not due to averaging across cells, we show in Figure 7 the information available from the responses of one cell. Information is encoded in the cell's response for several hundred milliseconds after the cell starts responding to the stimulus, rising to a peak of 0.85 bits at 100 to 150 msec after the onset of stimulus presentation. The cell continues to encode a significant amount of information about the stimulus in 50-msec epochs taken up to 400 msec.

To summarize the results, we show in Figure 8a the average across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. This emphasizes how, as the SOA is reduced toward 20 msec, the information does reduce rapidly, but that nevertheless at an SOA of 20 msec there is still considerable information about which stimulus was shown. The reduction of the information at different SOAs was highly significant (one-way ANOVA) at p < 0.001. (The ANOVA was performed after rescaling the data so that each cell had the same average information, and so that variance due to differences in the magnitude of the information carried by each cell would not interfere with the test of whether the SOA affected the information.) For comparison, we show in Figure 8b the difference in the number of spikes to the most effective and the least effective stimulus as a function of the SOA (normalized as described for Figure 2). This response difference is closely related to the information available to the different stimuli. Indeed, the difference in the number of spikes is much more closely related to the information than is the mean firing rate of the neurons (shown in Figure 4), emphasizing that it is differences of neuronal responses to different stimuli that convey information. The reduction of the difference in the number of spikes at different SOAs was again significant (p < 0.05), although less significant than the reduction of the information.

DISCUSSION

The advance described in this paper is that by combining neurophysiology and information theory, we have been able to quantitatively measure how backward visual masking affects the information available from the responses of neurons in the visual system. The effect of the mask is to reduce the total information available from the cell, the size of the information peak, and the length of time it is signaling information (Figures 5 to 8). The results clearly show that there is a systematic and statistically significant reduction in the amount of information with decreasing SOA (see, e.g., Figure 8).

The results emphasize that very considerable information about which stimulus was shown is available in a short epoch (e.g., 50 msec; see Figure 6) of the firing rate. This information is available even when the epoch is taken near the start of the neuronal response. This confirms the findings of Heller, Hertz, Kjaer, and Richmond (1995), Tovée et al. (1993), and Tovée and Rolls (1995), also made with recordings from inferior temporal cortex neurons. In those studies, in which the stimulus lasted for several hundred milliseconds, the epoch could be taken at a wide range of poststimulus times (in the range 100 to 500 msec). In the present study, the stimulus itself lasted for only 20 msec, and correspondingly in the no-mask condition the information available did drop over the next few hundred milliseconds. However, it was notable that the information in the no-mask condition did outlast the end of the stimulus by as much as 200 to 300 msec, indicating some short-term memory trace property of the neuronal circuitry (cf. Wallis & Rolls, 1997).

Figure 2. The mean (± sem) across cells of the number of spikes produced by the most effective stimulus (max) and the least effective stimulus (min), as a function of SOA.

Figure 3. The mean firing rate computed in 50-msec epochs for the 15 cells to the most effective stimulus and to the least effective stimulus as a function of the SOA. The average response to all stimuli is shown as the middle line of the three.

The results show also that an effect of the masking is that the decrease in information is more marked than the decrease in firing rate, because it is the selective part of the firing that is especially attenuated by the mask, not the spontaneous firing (see Figures 2 and 8). The results also show that even at the shortest SOA of 20 msec, the information available was on average 0.1 bits. This compares to 0.3 bits with the 16-msec stimulus shown without the mask. It also compares to a typical value for such neurons of 0.35 to 0.5 bits with a 500-msec stimulus presentation (Rolls, Treves, Tovée, & Panzeri, 1997; Tovée & Rolls, 1995). The results thus show that considerable information (33% of that available without a mask, and approximately 22% of that with a 500-msec stimulus presentation) is available from neuronal responses even under backward masking conditions that allow the neurons to have their main response in 30 msec. Also, we note that the information available from a 16-msec unmasked stimulus (0.3 bits) is a large proportion (approximately 65 to 75%) of that available from a 500-msec stimulus. These results provide evidence of how rapid the processing of visual information is in a cortical area and provide a fundamental constraint for understanding how cortical information processing operates (see Rolls & Treves, 1998).

Figure 4. The average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli.

The analysis using information theory draws out an interesting point about the relation between the firing rate measure used previously (Rolls & Tovée, 1994; Rolls et al., 1994) and the information measure (see Figure 8). It is shown in Figure 8 that the difference in the mean responses to different stimuli (not the mean response to all stimuli) is related to the information. This reflects the fact that the information is related to the differences in neuronal responses to different stimuli. However, the information also reflects the variance in the neuronal responses, and it is presumably the larger variance at short SOAs that makes the information decrease more than the difference in neuronal responses. In fact, the standard deviation of the neuronal responses at an SOA of 20 msec is approximately 0.45 of the mean firing rate, whereas it is 0.4 of the mean firing rate without masking.
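
The variability measure quoted here, the standard deviation of the responses expressed as a fraction of their mean, is the coefficient of variation. A minimal sketch of how it could be computed from a set of responses follows; the function name and the spike counts are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not say which was used.

```python
import statistics


def coefficient_of_variation(spike_counts):
    """Standard deviation of the responses divided by their mean."""
    return statistics.pstdev(spike_counts) / statistics.mean(spike_counts)


# Hypothetical spike counts over repeated trials (mean 5, standard deviation 2).
trials = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(trials))
```

A larger value means that the responses on individual trials are more scattered relative to their mean, which lowers the information carried even if the mean difference between stimuli is unchanged.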

The neurophysiological and information data described here can be compared directly with the effects of backward masking in human observers studied in the same apparatus with the same stimuli (Rolls et al., 1994). For the human observers, identification of which face from a set of six had been seen was 50% correct with an SOA of 20 msec, and 97% correct with an SOA of 40 msec (corrected for guessing) (Rolls et al., 1994). Comparing the human performance purely with the changes in firing rate under the same stimulus conditions suggested that when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only approximately 30 msec (Rolls & Tovée, 1994; Rolls et al., 1994). The implication is that 30 msec is enough time for a neuron to perform sufficient computation to enable its output to be used for identification. The new results, based on an analysis of the information encoded in the spike trains at different SOAs, support this hypothesis by showing that a significant proportion of information is available in these few spikes (see Figures 5 to 7). Thus there is very rapid processing of visual stimuli in the visual system. We also note that the analysis using information theory adds to previous analyses by providing quantitative information measures that can in principle be compared to the performance of human observers using the same information metric.

Figure 5. The average across the cells of the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition.

Figure 6. The average across the cells of the information available in discrete 50-msec epochs taken at different poststimulus times with different SOAs and for the unmasked condition.
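
The human identification scores quoted above are stated to be corrected for guessing, but the correction formula is not given in this text. The sketch below assumes the standard correction for an n-alternative choice, in which chance performance (1/n) maps to 0 and perfect performance maps to 1; both function names are ours, and the raw percentages it produces are implied by that assumption rather than reported values.

```python
def correct_for_guessing(observed, n_alternatives):
    """Rescale a proportion correct so chance maps to 0 and perfect to 1."""
    chance = 1 / n_alternatives
    return (observed - chance) / (1 - chance)


def raw_from_corrected(corrected, n_alternatives):
    """Inverse: the raw proportion correct implied by a corrected score."""
    chance = 1 / n_alternatives
    return chance + corrected * (1 - chance)


N_FACES = 6  # identification of one face from a set of six

for soa, corrected in [(20, 0.50), (40, 0.97)]:
    raw = raw_from_corrected(corrected, N_FACES)
    print(f"SOA {soa} msec: corrected {corrected:.2f}, implied raw {raw:.3f}")
```

Under this assumption, a corrected score of 50% with six alternatives corresponds to a raw score of about 58% correct, against a chance level of about 17%.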

A comparison of the latencies of the activation of neurons in the different visual cortical areas, V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex, suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1, and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, section A2.3.2). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals different feed-back mechanisms will play a modulating role. Initially this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± sem) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The scale in (b) of the "number of spikes" axis is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way, the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).
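The trace-learning idea can be made concrete with a small sketch. The text gives no equations, so the update below is an assumption: a running trace of postsynaptic activity (decay parameter `eta`, hypothetical) gates a Hebbian weight change (learning rate `alpha`, hypothetical). It is meant only to illustrate how lingering activity lets a later view of the same object strengthen its synapses, not to reproduce any published model.

```python
def trace_rule_update(weights, inputs, activity, prev_trace, eta=0.8, alpha=0.1):
    """One step of an illustrative trace-Hebbian update.

    trace   = (1 - eta) * activity + eta * prev_trace
    weights = weights + alpha * trace * inputs
    All parameter names and values are assumptions for illustration.
    """
    trace = (1.0 - eta) * activity + eta * prev_trace
    new_weights = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    return new_weights, trace


# View A drives the neuron; view B of the same object arrives shortly after,
# when the neuron's instantaneous drive from B alone is still zero.
w, tr = trace_rule_update([0.0, 0.0], [1.0, 0.0], activity=1.0, prev_trace=0.0)
w, tr = trace_rule_update(w, [0.0, 1.0], activity=0.0, prev_trace=tr)
print(w)  # the second weight is nonzero only because the trace persisted
```

With `eta=0` the trace carries no memory, and the second view would leave its weight unchanged in this toy setting.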

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate, even when there are a very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a "rate code"). This is indeed the case in, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
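The definition of information per spike used above is a simple ratio, sketched here in Python. The function name and the example numbers are hypothetical; they are not values read from Figures 3 or 6.

```python
def information_per_spike(info_bits, firing_rate_hz, window_ms):
    """Information in a time window divided by the mean spike count in it."""
    mean_spikes = firing_rate_hz * window_ms / 1000.0
    if mean_spikes <= 0:
        raise ValueError("mean spike count must be positive")
    return info_bits / mean_spikes


# Hypothetical example: 0.15 bits in a 50-msec epoch at 40 spikes/sec
# corresponds to 2 spikes on average in the window.
print(information_per_spike(0.15, 40.0, 50.0))
```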

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair, using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia, by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the UK Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at −500 msec with respect to the onset of the test image, with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At −100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later.

The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.
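The trial timing described above can be laid out as an event list. The sketch below is only a reading aid: the function name, the event labels, and the choice to mark the test image as "off" at 16 msec (the drawn portion of the frame) are assumptions, and the random dimming delay is passed in rather than drawn.

```python
SOAS_MS = (20, 40, 60, 100, 1000)  # 1000 msec serves as the unmasked condition
TEST_DURATION_MS = 16
MASK_DURATION_MS = 300
DIMMING_PERIOD_MS = 1000


def trial_events(soa_ms, dim_delay_ms):
    """Return (time_ms, event) pairs relative to test-image onset."""
    if soa_ms not in SOAS_MS:
        raise ValueError("SOA must be one of %s" % (SOAS_MS,))
    if not 150 <= dim_delay_ms <= 3350:
        raise ValueError("dimming delay must lie in 150..3350 msec")
    mask_off = soa_ms + MASK_DURATION_MS
    dim_on = mask_off + dim_delay_ms
    dim_off = dim_on + DIMMING_PERIOD_MS
    return [
        (-500, "warning tone on; fixation spot on"),
        (-100, "fixation spot blinked off"),
        (0, "tone off; test image on"),
        (TEST_DURATION_MS, "test image off"),
        (soa_ms, "mask on"),
        (mask_off, "mask off; fixation spot reappears"),
        (dim_on, "fixation spot dims; reward available"),
        (dim_off, "fixation spot off"),
        (dim_off + 500, "reward availability ends"),
    ]


for t, event in trial_events(soa_ms=20, dim_delay_ms=150):
    print(t, event)
```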

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window, according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s,r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
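The calculation just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: it computes the plug-in Shannon information from per-trial spike counts, then subtracts a first-order bias term of the kind derived by Panzeri and Treves (1996), here approximated by simply counting occupied response bins (the published procedure estimates the number of relevant bins more carefully). The function names are hypothetical, and the quantization of rates into d bins is omitted.

```python
import math
from collections import Counter


def plugin_information(counts_by_stimulus):
    """Plug-in estimate of I(S, R) in bits.

    counts_by_stimulus maps each stimulus label to a list of spike
    counts, one per trial, measured in a fixed time window.
    """
    n_total = sum(len(trials) for trials in counts_by_stimulus.values())
    joint = Counter()       # trials observed for each (stimulus, response)
    marginal_r = Counter()  # trials observed for each response
    for s, trials in counts_by_stimulus.items():
        for r in trials:
            joint[(s, r)] += 1
            marginal_r[r] += 1
    info = 0.0
    for (s, r), n_sr in joint.items():
        p_sr = n_sr / n_total
        p_s = len(counts_by_stimulus[s]) / n_total
        p_r = marginal_r[r] / n_total
        info += p_sr * math.log2(p_sr / (p_s * p_r))
    return info


def first_order_bias(counts_by_stimulus):
    """Approximate bias (bits) of the plug-in estimate.

    Assumption: occupied bins stand in for the "relevant" bins.
    """
    n_total = sum(len(trials) for trials in counts_by_stimulus.values())
    bins_per_stimulus = sum(len(set(t)) for t in counts_by_stimulus.values())
    bins_overall = len({r for t in counts_by_stimulus.values() for r in t})
    n_stimuli = len(counts_by_stimulus)
    numerator = bins_per_stimulus - bins_overall - (n_stimuli - 1)
    return numerator / (2.0 * n_total * math.log(2))


def corrected_information(counts_by_stimulus):
    """Plug-in information minus the approximate first-order bias."""
    return plugin_information(counts_by_stimulus) - first_order_bias(counts_by_stimulus)
```

Two stimuli whose spike counts never overlap yield a plug-in value of 1 bit, while two stimuli with identical count distributions yield 0 bits before correction.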

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the base-2 logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
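The ceiling values quoted above follow directly from the set sizes. A small Python check, assuming the stimuli in each set were equiprobable (the function and variable names are hypothetical):

```python
import math


def stimulus_entropy_bits(n_stimuli):
    """Entropy of n equiprobable stimuli: the ceiling on I(S, R) in bits."""
    return math.log2(n_stimuli)


# Number of cells tested with each stimulus-set size, as listed in the text.
cells_by_set_size = {6: 3, 4: 1, 3: 5, 2: 6}

for n_stimuli, n_cells in sorted(cells_by_set_size.items()):
    ceiling = stimulus_entropy_bits(n_stimuli)
    print(n_stimuli, "stimuli:", n_cells, "cells, ceiling", round(ceiling, 2), "bits")
```

The smallest and largest sets give ceilings of 1 bit and about 2.6 bits, so an average of 0.3 bits sits well below either limit.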

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2, and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates, and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogels, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.


Tovée & Rolls, 1995). Up to 64.9% of the information available in a 400-msec period is available in a 20-msec sample near the start of the spike train, and up to 87% is available in a 50-msec sample. The response latencies in different cortical areas also suggest a processing duration of 10 to 20 msec at each area, both in primates (Raiguel, Lagae, Gulyas, & Orban, 1989; Perrett, Hietanen, Oram, & Benson, 1992; Vogels & Orban, 1994) and in cats (Dinse & Kruger, 1994), a figure consistent with a recent visual evoked potential study in humans (Thorpe, Fize, & Marlot, 1996). Moreover, we have also recently shown that visual learning can occur very rapidly. The simple prepresentation of a stimulus can significantly alter the response of a cell to the stimulus when it is shown in an ambiguous situation just seconds later, suggesting the cell has "learned" to recognize the stimulus under the ambiguous conditions (Tovée, Rolls, & Ramachandran, 1996).

The masking experiment we described (Rolls & Tovée, 1994; Rolls et al., 1994) measured the firing rate to an effective stimulus, or the average firing rate to all the stimuli, as a function of the SOA. However, that analysis did not take into account how the SOA affected the responses of the cell not just to the best stimulus or the average but also to the less effective stimuli for a cell. It is possible that at short SOAs neurons do not actually discriminate between effective and less effective stimuli as well as they do with long and unmasked stimulus presentation. One way to address this is to measure how SOA influences the difference in firing rate to the best and the least effective stimulus in the set of stimuli. Kovacs et al. (1995) showed that this difference measure was greatly decreased when short SOAs are used. What is still to be answered is whether, at short SOAs, the neuronal responses can actually quantitatively discriminate between different stimuli, and whether the differences between neuronal responses to different stimuli are still significant with respect to the variability of the responses and the spontaneous activity. The proper way to analyze this is to use an information theoretic measure, which will reflect at the same time how different the firing rates are to different stimuli and how significant the differences in the responses are, taking into account the noise (see Optican & Richmond, 1987; Rieke, Warland, de Ruyter van Steveninck, & Bialek, 1996; Rolls & Treves, 1998; Tovée et al., 1993). In this paper we use an information theoretic approach to measure how much information is actually provided by the neurons at different SOAs. The results show that at short SOAs rather less information is available than might be predicted from the average firing to all stimuli, because at short SOAs the responses to the different stimuli become much smaller, but some spontaneous firing remains.

RESULTS

Using the masking protocol, we recorded from a total of 34 cells in the inferior temporal cortex and the cortex in the banks and floor of the superior temporal sulcus. It was possible to complete the information analyses on the subset of 15 of these cells for which sufficient trials were available for this type of information-theoretic analysis (see Methods for details). None of the cells responded to the mask stimulus itself.

Figure 1 provides examples of the responses of one cell to the most effective and to the least effective stimulus without a mask and with a mask at an SOA of 20 msec. It is notable that in the unmasked condition (indicated by an "SOA" of 1000 msec), the cell fired for approximately 120 msec to the effective stimulus, even though the stimulus lasted only for 20 msec. (A more typical value is 200 to 300 msec, as will be shown in Figure 3.) It is also shown that the mask effectively interrupts the firing of the neuron. It is data of this type that were the subject of the information analysis described here. The latency of the neuronal response for this and the other neurons described here was in the region of 75 to 85 msec.

As a preliminary to the analysis, we show in Figure 2 the effect of the SOA on the neuronal responses averaged across the population of 15 neurons. The responses for the most (max) and the least (min) effective stimuli are shown for the period 0 to 200 msec with respect to stimulus onset. Although the main effect of the mask on the number of spikes emitted was evident at shorter SOAs and was smaller than for the larger set of neurons analyzed previously (Rolls & Tovée, 1994), there was a statistically significant reduction of the firing rate to the most effective stimulus due to the mask (as tested by a one-way analysis of variance, or ANOVA) at p < 0.05. There was little effect (not significant) of the mask on the responses to the least effective stimulus in the set, for which the number of spikes was close to the spontaneous activity. (For Figure 2 and the ANOVA, the responses of different cells were combined by scaling them so that the average response across SOAs of each cell to the most effective stimulus was that of the average response to the most effective stimulus of the population of cells across the different SOAs. We used this normalization because it removes the variability that is just due to the inhomogeneity of the cell sample without changing the dependence of the responses on the SOAs.)
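The normalization described in the parenthesis above can be sketched as a per-cell rescaling. This is one plausible reading, offered as an assumption rather than the authors' exact procedure: each cell's responses across SOAs are multiplied by a single factor so that its mean matches the population mean. The function name is hypothetical.

```python
def normalize_across_cells(responses):
    """Rescale each cell so its mean across SOAs equals the population mean.

    responses: list with one entry per cell, each a list of responses
    (e.g., spike counts to the most effective stimulus), one per SOA.
    A cell whose mean response is zero cannot be rescaled and will
    raise ZeroDivisionError.
    """
    cell_means = [sum(cell) / len(cell) for cell in responses]
    population_mean = sum(cell_means) / len(cell_means)
    return [
        [value * population_mean / mean for value in cell]
        for cell, mean in zip(responses, cell_means)
    ]


# Two hypothetical cells with very different overall rates.
scaled = normalize_across_cells([[2.0, 4.0], [10.0, 30.0]])
print(scaled)
```

Because each cell is multiplied by one constant, the ratio between its responses at different SOAs is unchanged, which is the property the text relies on.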

Also as a preliminary to the analysis, we show in Figure 3 the mean firing rate computed in 50-msec epochs for the 15 cells to the most effective stimulus and to the least effective stimulus as a function of the SOA. We also show the average response to all stimuli. It can be seen that the average firing rate across stimuli does reduce to some extent as the SOA becomes shorter, but that there is a much larger decrease in the difference in the responses to the best and the least effective stimuli. It is also clear that the responses, which are all to a 16-msec stimulus, last for longer in the unmasked condition ("SOA" = 1000 msec) than at an SOA of 20 msec.

As a further preliminary to the information analysis, for which the information shown will be that available by different poststimulus times, we show in Figure 4 the average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli. It can be seen that the number of spikes continues to increase a little more for the unmasked than for the masked conditions but that at short SOAs (e.g., 20 msec), the cumulated number of spikes that separates the most effective from the least effective stimulus becomes quite small. This is an indication that at short SOAs the information available about which stimulus was shown might be quite small relative to that with longer SOAs or no masking stimulus.

Figure 1. Examples of the responses of one cell to the most effective (a) and to the least effective stimulus (b) without a mask, and with a mask at an SOA of 20 msec (c vs. d). The responses are shown in rastergram and peristimulus time histogram form. The stimulus appeared at time 0.


Figure 5 shows the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition. It is clear that there is considerable information in the unmasked condition, with an average value reached typically by 150 msec poststimulus of 0.30 bits. In the masked condition, the information is progressively reduced as the SOA is decreased to 20 msec. At an SOA of 40 msec the average information was 0.16 bits (53%), and at an SOA of 20 msec the average information was 0.10 bits (33%). Thus, the mask produces a very considerable reduction in the information available about which stimulus was shown. The information tends to decrease after approximately 250 msec, indicating that after this time the responses tend to introduce noise and no net further information about which stimulus was shown. This issue is addressed specifically in Figure 6, which shows the information available in 50-msec epochs taken at different poststimulus times with different SOAs. It is clear that most information is available in 50-msec epochs taken up to 200 to 250 msec. At longer poststimulus times there is still some information in the unmasked condition but little in the masked conditions. The small decrease in information shown in Figure 5 at poststimulus times longer than approximately 250 msec can overall be accounted for by the fact that the large differences in the responses to the different stimuli evident by 250 msec tend to become proportionally smaller as more spikes in the longer poststimulus times are included. Any firing after 250 msec poststimulus is effectively spontaneous activity of the neuron and therefore contributes only noise about the stimulus.
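The percentages quoted for the masked conditions are the masked information expressed relative to the 0.30-bit unmasked value. A short Python check of that arithmetic (the names are hypothetical; the numbers are those given in the text):

```python
UNMASKED_BITS = 0.30
MASKED_BITS_BY_SOA_MS = {40: 0.16, 20: 0.10}


def percent_of_unmasked(bits, reference=UNMASKED_BITS):
    """Express an information value as a percentage of the unmasked value."""
    return 100.0 * bits / reference


for soa_ms, bits in sorted(MASKED_BITS_BY_SOA_MS.items(), reverse=True):
    print(soa_ms, "msec SOA:", round(percent_of_unmasked(bits)), "% of unmasked")
```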

To show that the results in Figures 5 and 6 are not due to averaging across cells, we show in Figure 7 the information available from the responses of one cell. Information is encoded in the cell's response for several hundred milliseconds after the cell starts responding to the stimulus, rising to a peak of 0.85 bits at 100 to 150 msec after the onset of stimulus presentation. The cell continues to encode a significant amount of information about the stimulus in 50-msec epochs taken up to 400 msec.

To summarize the results, we show in Figure 8a the average across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. This emphasizes how, as the SOA is reduced toward 20 msec, the information does reduce rapidly, but that nevertheless at an SOA of 20 msec there is still considerable information about which stimulus was shown. The reduction of the information at different SOAs was highly significant (one-way ANOVA) at p < 0.001. (The ANOVA was performed after rescaling the data so that each cell had the same average information and so that variance due to differences in the magnitude of the information carried by each cell would not interfere with the test of whether the SOA affected the information.) For comparison, we show in Figure 8b the difference in the number of spikes to the most effective and the least effective stimulus as a function of the SOA (normalized as described for Figure 2). This response difference is closely related to the information available about the different stimuli. Indeed, the difference in the number of spikes is much more closely related to the information than is the mean firing rate of the neurons (shown in Figure 4), emphasizing that it is differences of neuronal responses to different stimuli that convey information. The reduction of the difference in the number of spikes at different SOAs was again significant (p < 0.05), although less significant than the reduction of the information.

DISCUSSION

The advance described in this paper is that by combining neurophysiology and information theory, we have been able to quantitatively measure how backward visual masking affects the information available from the responses of neurons in the visual system. The effect of the mask is to reduce the total information available from the cell, the size of the information peak, and the length of time it is signaling information (Figures 5 to 8). The results clearly show that there is a systematic and statistically significant reduction in the amount of information with decreasing SOA (see, e.g., Figure 8).

The results emphasize that very considerable information about which stimulus was shown is available in a short epoch (e.g., 50 msec; see Figure 6) of the firing rate. This information is available even when the epoch is taken near the start of the neuronal response. This confirms the findings of Heller, Hertz, Kjaer, and Richmond (1995), Tovée et al. (1993), and Tovée and Rolls (1995), also made with recordings from inferior temporal cortex neurons. In those studies, in which the stimulus lasted for several hundred milliseconds, the epoch could be taken at a wide range of poststimulus times (in the range 100 to 500 msec). In the present study, the stimulus itself lasted for only 20 msec, and correspondingly in the no-mask condition the information available did drop over the next few hundred milliseconds. However, it was notable that the information in the no-mask condition did outlast the end of the stimulus by as much as 200 to 300 msec, indicating some short-term memory trace property of the neuronal circuitry (cf. Wallis & Rolls, 1997).

Figure 2. The mean (± sem) across cells of the number of spikes produced by the most effective stimulus (max) and the least effective stimulus (min) as a function of SOA.

Figure 3. The mean firing rate computed in 50-msec epochs for the 15 cells to the most effective stimulus and to the least effective stimulus as a function of the SOA. The average response to all stimuli is shown as the middle line of the three.

The results show also that an effect of the masking is that the decrease in information is more marked than the decrease in firing rate, because it is the selective part of the firing that is especially attenuated by the mask, not the spontaneous firing (see Figures 2 and 8). The results also show that even at the shortest SOA of 20 msec, the information available was on average 0.1 bits. This compares to 0.3 bits with the 16-msec stimulus shown without the mask. It also compares to a typical value for such neurons of 0.35 to 0.5 bits with a 500-msec stimulus presentation (Rolls, Treves, Tovée, & Panzeri, 1997; Tovée & Rolls, 1995). The results thus show that considerable information (33% of that available without a mask, and approximately 22% of that with a 500-msec stimulus presentation) is available from neuronal responses even under backward masking conditions that allow the neurons to have their main response in 30 msec. Also, we note that the information available from a 16-msec unmasked stimulus (0.3 bits) is a large proportion (approximately 65 to 75%) of that available from a 500-msec stimulus. These results provide evidence of how rapid the processing of visual information is in a cortical area and provide a fundamental constraint for understanding how cortical information processing operates (see Rolls & Treves, 1998).

Figure 4. The average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli.

The analysis using information theory draws out an interesting point about the relation between the firing rate measure used previously (Rolls & Tovée, 1994; Rolls et al., 1994) and the information measure (see Figure 8). It is shown in Figure 8 that the difference in the mean responses to different stimuli (not the mean response to all stimuli) is related to the information. This reflects the fact that the information is related to the differences in neuronal responses to different stimuli. However, the information also reflects the variance in the neuronal responses, and it is presumably the larger variance at short SOAs that makes the information decrease more than the difference in neuronal responses. In fact, the standard deviation of the neuronal responses at an SOA of 20 msec is approximately 0.45 of the mean firing rate, whereas it is 0.4 of the mean firing rate without masking.

The neurophysiological and information data described here can be compared directly with the effects of backward masking in human observers studied in the same apparatus with the same stimuli (Rolls et al., 1994). For the human observers, identification of which face from a set of six had been seen was 50% correct with an SOA of 20 msec and 97% correct with an SOA of 40 msec (corrected for guessing) (Rolls et al., 1994). Comparing the human performance purely with the changes in firing rate under the same stimulus conditions suggested that when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only approximately 30 msec (Rolls & Tovée, 1994; Rolls et al., 1994). The implication is that 30 msec is enough time for a neuron to perform sufficient computation to enable its output to be used for identification. The new results, based on an analysis of the information encoded in the spike trains at different SOAs, support this hypothesis by showing that a significant proportion of information is available in these few spikes (see Figures 5 to 7). Thus there is very rapid processing of visual stimuli in the visual system. We also note that the analysis using information theory adds to previous analyses by providing quantitative information measures that can in principle be compared to the performance of human observers using the same information metric.

Figure 5. The average across the cells of the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition.

Figure 6. The average across the cells of the information available in discrete 50-msec epochs taken at different poststimulus times with different SOAs and for the unmasked condition.

A comparison of the latencies of the activation of neurons in the different visual cortical areas V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1, and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, section A232). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially, the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals different feed-back mechanisms will play a modulating role. Initially this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± sem) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The scale in (b) of the “number of spikes” axis is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way, the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate, even when there are a very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a “rate code”). This is indeed the case for, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
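The definition of information per spike given above can be illustrated with a minimal Python sketch. The function name and the example numbers (0.1 bits in a 50-msec epoch at 40 spikes/sec) are hypothetical and chosen only to make the arithmetic easy to follow; they are not values read from Figures 3 or 6.

```python
def information_per_spike(info_bits, firing_rate_hz, window_ms):
    """Information in a time window divided by the mean number of spikes in it.

    info_bits      : information (bits) measured in the window
    firing_rate_hz : mean firing rate (spikes/sec) in the same window
    window_ms      : window length in msec
    """
    # Mean number of spikes emitted in the window.
    spikes = firing_rate_hz * window_ms / 1000.0
    if spikes <= 0:
        raise ValueError("no spikes in the window; information per spike is undefined")
    return info_bits / spikes


# Hypothetical example: 0.1 bits in a 50-msec epoch at 40 spikes/sec
# corresponds to 2 spikes on average in the epoch.
print(information_per_spike(0.1, 40.0, 50.0))
```

With these illustrative inputs the result falls inside the 0.03 to 0.2 bits per spike range quoted in the text.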

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the UK Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at -500 msec with respect to the onset of the test image, with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At -100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later. The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.
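The trial timing described above can be summarized as a short Python sketch that lists the events of one trial for a given SOA, with times in msec relative to test image onset. The function name and event labels are illustrative assumptions, not part of the original apparatus software, and the dimming and reward periods are omitted because their onset was randomized.

```python
# SOAs used in the experiment (msec); 1000 msec served as the unmasked condition.
SOAS_MS = (20, 40, 60, 100, 1000)

TEST_DURATION_MS = 16    # one frame of the video framestore
MASK_DURATION_MS = 300


def trial_events(soa_ms):
    """Return (time_ms, event) pairs for one trial, sorted by time."""
    events = [
        (-500, "warning tone on, fixation spot on"),
        (-100, "fixation spot blinked off"),
        (0, "tone off, test image on"),
        (TEST_DURATION_MS, "test image off"),
        (soa_ms, "mask on"),
        (soa_ms + MASK_DURATION_MS, "mask off, fixation spot reappears"),
    ]
    return sorted(events)


for time_ms, event in trial_events(20):
    print(f"{time_ms:6d} msec  {event}")
```

At the shortest SOA the mask therefore starts only 4 msec after the test image has been switched off.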

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s,r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
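As an illustration of the procedure just described, the following Python sketch quantizes spike counts into d bins and computes the uncorrected ("plug-in") Shannon information from the empirical joint table P(s, r). It is a minimal sketch under stated assumptions: the equal-width binning in `quantize` is an assumption (the binning scheme is not specified here), the function names are hypothetical, and the finite sampling correction of Panzeri and Treves (1996) is deliberately not reproduced, so the value returned is biased upward for small numbers of trials.

```python
from collections import Counter
from math import log2


def quantize(counts, d):
    """Map raw spike counts onto at most d equal-width bins (assumed scheme)."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0 for _ in counts]
    width = (hi - lo + 1) / d
    return [min(int((c - lo) / width), d - 1) for c in counts]


def mutual_information(trials):
    """Plug-in estimate of I(S, R) in bits.

    trials: list of (stimulus, response_bin) pairs, one per trial.
    No finite sampling correction is applied.
    """
    n = len(trials)
    n_sr = Counter(trials)                  # joint counts
    n_s = Counter(s for s, _ in trials)     # stimulus counts
    n_r = Counter(r for _, r in trials)     # response counts
    info = 0.0
    for (s, r), c in n_sr.items():
        p_sr = c / n
        # P(s, r) / (P(s) P(r)) expressed with raw counts
        info += p_sr * log2(c * n / (n_s[s] * n_r[r]))
    return info


# Two stimuli, five trials each, with non-overlapping responses.
trials = [("face A", 0)] * 5 + [("face B", 3)] * 5
print(mutual_information(trials))
```

For two equiprobable stimuli with completely separable responses the estimate reaches the 1-bit maximum, and it is zero when the responses to the two stimuli are identical.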

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition, the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogenous.
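The ceiling referred to above is the entropy of the stimulus set. A minimal Python sketch, assuming that the stimuli in each set were presented with equal probability (an assumption; the function name is hypothetical), shows how the range of 1 bit to about 2.6 bits follows from sets of two to six stimuli.

```python
from math import log2


def entropy_ceiling_bits(n_stimuli):
    """Maximum information (bits) about which of n equiprobable stimuli was shown."""
    return log2(n_stimuli)


# Set sizes used in the experiment: 2, 3, 4, and 6 stimuli.
for n in (2, 3, 4, 6):
    print(n, "stimuli:", round(entropy_ceiling_bits(n), 2), "bits")
```

The average unmasked information of 0.3 bits is well below even the smallest of these ceilings, which is the check described in the text.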

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2, and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.


in the responses to the best and the least effective stimuli. It is also clear that the responses, which are all to a 16-msec stimulus, last for longer in the unmasked condition (“SOA” = 1000 msec) than at an SOA of 20 msec.

As a further preliminary to the information analysis, for which the information shown will be that available by different poststimulus times, we show in Figure 4 the average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli. It can be seen that the number of spikes continues to increase a little more for the unmasked than for the masked conditions, but that at short SOAs (e.g., 20 msec) the cumulated number of spikes that separates the most effective from the least effective stimulus becomes quite small. This is an indication that at short SOAs, the information available about which stimulus was shown might be quite small relative to that with longer SOAs or no masking stimulus.

Figure 1. Examples of the responses of one cell to the most effective (a) and to the least effective stimulus (b) without a mask, and with a mask at an SOA of 20 msec (c vs. d). The responses are shown in rastergram and peristimulus time histogram form. The stimulus appeared at time 0.


Figure 5 shows the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition. It is clear that there is considerable information in the unmasked condition, with an average value, reached typically by 150 msec poststimulus, of 0.30 bits. In the masked condition, the information is progressively reduced as the SOA is decreased to 20 msec. At an SOA of 40 msec, the average information was 0.16 bits (53%), and at an SOA of 20 msec, the average information was 0.10 bits (33%). Thus the mask produces a very considerable reduction in the information available about which stimulus was shown. The information tends to decrease after approximately 250 msec, indicating that after this time the responses tend to introduce noise and no net further information about which stimulus was shown. This issue is addressed specifically in Figure 6, which shows the information available in 50-msec epochs taken at different poststimulus times with different SOAs. It is clear that most information is available in 50-msec epochs taken up to 200 to 250 msec. At longer poststimulus times there is still some information in the unmasked condition but little in the masked conditions. The reason for the small decrease in information shown in Figure 5 at poststimulus times longer than approximately 250 msec can overall be accounted for by the fact that the large difference in the responses to the different stimuli evident by 250 msec tends to become proportionally smaller as more spikes in the longer poststimulus times are included. Any firing after 250 msec poststimulus is effectively spontaneous activity of the neuron and therefore contributes only noise about the stimulus.
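The percentages quoted for the masked conditions follow directly from the average information values relative to the unmasked value of 0.30 bits. A short Python check, using only the averages stated in the text (the constant and function names are illustrative), is:

```python
UNMASKED_BITS = 0.30  # average cumulated information without a mask


def percent_of_unmasked(info_bits):
    """Information at a given SOA as a percentage of the unmasked value."""
    return 100.0 * info_bits / UNMASKED_BITS


for soa_ms, bits in ((40, 0.16), (20, 0.10)):
    print(f"SOA {soa_ms} msec: {bits:.2f} bits = "
          f"{percent_of_unmasked(bits):.0f}% of unmasked")
```

Rounded to the nearest percent, these ratios reproduce the 53% and 33% figures given above.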

To show that the results in Figures 5 and 6 are not due to averaging across cells, we show in Figure 7 the information available from the responses of one cell. Information is encoded in the cell's response for several hundred milliseconds after the cell starts responding to the stimulus, rising to a peak of 0.85 bits at 100 to 150 msec after the onset of stimulus presentation. The cell continues to encode a significant amount of information about the stimulus in 50-msec epochs taken up to 400 msec.

To summarize the results, we show in Figure 8a the average across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. This emphasizes how, as the SOA is reduced toward 20 msec, the information does reduce rapidly, but that nevertheless at an SOA of 20 msec there is still considerable information about which stimulus was shown. The reduction of the information at different SOAs was highly significant (one-way ANOVA) at p < 0.001. (The ANOVA was performed after rescaling the data so that each cell had the same average information, and so that variance due to differences in the magnitude of the information carried by each cell would not interfere with the test of whether the SOA affected the information.) For comparison, we show in Figure 8b the difference in the number of spikes to the most effective and the least effective stimulus as a function of the SOA (normalized as described for Figure 2). This response difference is closely related to the information available to the different stimuli. Indeed, the difference in the number of spikes is much more closely related to the information than is the mean firing rate of the neurons (shown in Figure 4), emphasizing that it is differences of neuronal responses to different stimuli that convey information. The reduction of the difference in the number of spikes at different SOAs was again significant (p < 0.05), although less significant than the reduction of the information.

DISCUSSION

The advance described in this paper is that by combining neurophysiology and information theory, we have been able to quantitatively measure how backward visual masking affects the information available from the responses of neurons in the visual system. The effect of the mask is to reduce the total information available from the cell, the size of the information peak, and the length of time it is signaling information (Figures 5 to 8). The results clearly show that there is a systematic and statistically significant reduction in the amount of information with decreasing SOA (see, e.g., Figure 8).

Figure 2. The mean (± sem) across cells of the number of spikes produced by the most effective stimulus (max) and the least effective stimulus (min) as a function of SOA.

Figure 3. The mean firing rate computed in 50-msec epochs for the 15 cells to the most effective stimulus and to the least effective stimulus as a function of the SOA. The average response to all stimuli is shown as the middle line of the three.

The results emphasize that very considerable information about which stimulus was shown is available in a short epoch (e.g., 50 msec, see Figure 6) of the firing rate. This information is available even when the epoch is taken near the start of the neuronal response. This confirms the findings of Heller, Hertz, Kjaer, and Richmond (1995), Tovée et al. (1993), and Tovée and Rolls (1995), also made with recordings from inferior temporal cortex neurons. In those studies, in which the stimulus lasted for several hundred milliseconds, the epoch could be taken at a wide range of poststimulus times (in the range 100 to 500 msec). In the present study the stimulus itself lasted for only 20 msec, and correspondingly in the no-mask condition the information available did drop over the next few hundred milliseconds. However, it was notable that the information in the no-mask condition did outlast the end of the stimulus by as much as 200 to 300 msec, indicating some short-term memory trace property of the neuronal circuitry (cf. Wallis & Rolls, 1997).

Figure 4. The average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli.

The results show also that an effect of the masking is that the decrease in information is more marked than the decrease in firing rate, because it is the selective part of the firing that is especially attenuated by the mask, not the spontaneous firing (see Figures 2 and 8). The results also show that even at the shortest SOA of 20 msec, the information available was on average 0.1 bits. This compares to 0.3 bits with the 16-msec stimulus shown without the mask. It also compares to a typical value for such neurons of 0.35 to 0.5 bits with a 500-msec stimulus presentation (Rolls, Treves, Tovée, & Panzeri, 1997; Tovée & Rolls, 1995). The results thus show that considerable information (33% of that available without a mask, and approximately 22% of that with a 500-msec stimulus presentation) is available from neuronal responses even under backward masking conditions that allow the neurons to have their main response in 30 msec. Also, we note that the information available from a 16-msec unmasked stimulus (0.3 bits) is a large proportion (approximately 65 to 75%) of that available from a 500-msec stimulus. These results provide evidence of how rapid the processing of visual information is in a cortical area and provide a fundamental constraint for understanding how cortical information processing operates (see Rolls & Treves, 1998).

The analysis using information theory draws out an interesting point about the relation between the firing rate measure used previously (Rolls & Tovée, 1994; Rolls et al., 1994) and the information measure (see Figure 8). It is shown in Figure 8 that the difference in the mean responses to different stimuli (not the mean response to all stimuli) is related to the information. This reflects the fact that the information is related to the differences in neuronal responses to different stimuli. However, the information also reflects the variance in the neuronal responses, and it is presumably the larger variance at short SOAs that makes the information decrease more than the difference in neuronal responses. In fact, the standard deviation of the neuronal responses at an SOA of 20 msec is approximately 0.45 of the mean firing rate, whereas it is 0.4 of the mean firing rate without masking.

Figure 5. The average across the cells of the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition.

Figure 6. The average across the cells of the information available in discrete 50-msec epochs taken at different poststimulus times, with different SOAs and for the unmasked condition.

The neurophysiological and information data described here can be compared directly with the effects of backward masking in human observers, studied in the same apparatus with the same stimuli (Rolls et al., 1994). For the human observers, identification of which face from a set of six had been seen was 50% correct with an SOA of 20 msec and 97% correct with an SOA of 40 msec (corrected for guessing) (Rolls et al., 1994). Comparing the human performance purely with the changes in firing rate under the same stimulus conditions suggested that when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only approximately 30 msec (Rolls & Tovée, 1994; Rolls et al., 1994). The implication is that 30 msec is enough time for a neuron to perform sufficient computation to enable its output to be used for identification. The new results, based on an analysis of the information encoded in the spike trains at different SOAs, support this hypothesis by showing that a significant proportion of information is available in these few spikes (see Figures 5 to 7). Thus there is very rapid processing of visual stimuli in the visual system. We also note that the analysis using information theory adds to previous analyses by providing quantitative information measures that can in principle be compared to the performance of human observers using the same information metric.

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± sem) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The scale in (b) of the “number of spikes” axis is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

A comparison of the latencies of the activation of neurons in the different visual cortical areas V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1, and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, section A2.3.2). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals different feed-back mechanisms will play a modulating role. Initially this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here, and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later, when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).
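The idea of a trace learning rule can be sketched in a few lines of Python. The form used here is an assumption for illustration and is not given in this paper: a Hebbian update in which the postsynaptic term is an exponentially decaying trace of recent activity. The function name `trace_update` and the parameters `eta` (trace persistence) and `alpha` (learning rate) are hypothetical.

```python
def trace_update(weights, inputs, y, y_trace_prev, eta=0.8, alpha=0.1):
    """One step of a Hebbian rule with a postsynaptic activity trace.

    y_trace = (1 - eta) * y + eta * y_trace_prev
    w_j    += alpha * y_trace * x_j
    """
    y_trace = (1.0 - eta) * y + eta * y_trace_prev
    new_weights = [w + alpha * y_trace * x for w, x in zip(weights, inputs)]
    return new_weights, y_trace

# View 1 of an object drives the neuron (y = 1) through input line 0.
weights, trace = trace_update([0.0, 0.0], inputs=[1.0, 0.0], y=1.0, y_trace_prev=0.0)

# View 2 (a transformed image on input line 1) does not yet drive the neuron (y = 0),
# but the lingering trace still allows the synapse from line 1 to strengthen.
weights, trace = trace_update(weights, inputs=[0.0, 1.0], y=0.0, y_trace_prev=trace)
print(weights, trace)
```

In this sketch the weight from the second view grows only because activity from the first view persists in the trace, which is the role suggested above for the continued firing after a brief stimulus.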

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate even when there are very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a “rate code”). This is indeed the case in, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems, and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
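The definition of information per spike given above amounts to a single division, sketched here in Python. The example values (0.1 bits in a 50-msec epoch at a mean rate of 40 spikes/sec) are invented for illustration and are not read from Figures 3 or 6. The function name `bits_per_spike` is hypothetical.

```python
def bits_per_spike(info_bits, rate_hz, window_ms=50.0):
    """Information in a time window divided by the mean number of spikes in that window."""
    spikes = rate_hz * window_ms / 1000.0  # mean spike count in the window
    if spikes <= 0:
        raise ValueError("the window must contain a positive mean spike count")
    return info_bits / spikes

# Hypothetical epoch: 0.1 bits carried by a mean of 40 spikes/sec * 0.05 sec = 2 spikes.
print(bits_per_spike(0.1, 40.0))
```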

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia, by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the U.K. Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at -500 msec with respect to the onset of the test image with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At -100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec, the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later.

The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.
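The trial timing described above can be laid out as an event list. The following Python sketch is only a restatement of the stated intervals, with all times in msec relative to test image onset. The function name `trial_events` and the event labels are hypothetical, and the random dimming delay is passed in as an argument rather than drawn at random.

```python
ALLOWED_SOAS_MS = (20, 40, 60, 100, 1000)  # 1000 msec serves as the no-mask condition

def trial_events(soa_ms, dim_delay_ms):
    """Return (time_ms, event) pairs for one trial, relative to test image onset."""
    if soa_ms not in ALLOWED_SOAS_MS:
        raise ValueError(f"SOA must be one of {ALLOWED_SOAS_MS}")
    if not 150 <= dim_delay_ms <= 3350:
        raise ValueError("dimming delay must lie in the range 150 to 3350 msec")
    mask_off = soa_ms + 300          # the mask lasts 300 msec
    dim_on = mask_off + dim_delay_ms  # random delay after the mask ends
    dim_off = dim_on + 1000           # the dimming period lasts 1000 msec
    return [
        (-500, "warning tone on, fixation spot on"),
        (-100, "fixation spot blinked off"),
        (0, "tone off, test image on"),
        (soa_ms, "mask on"),
        (mask_off, "mask off, fixation spot reappears"),
        (dim_on, "fixation spot dims, licking rewarded"),
        (dim_off, "fixation spot off"),
        (dim_off + 500, "reward availability ends"),
    ]

for time_ms, event in trial_events(soa_ms=20, dim_delay_ms=150):
    print(f"{time_ms:6d} msec  {event}")
```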

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus, and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial, and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus, to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses, while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s, r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
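The calculation can be illustrated with a minimal Python sketch. It is not the analysis code used for the paper. The bias term uses a simplified first-order form in which the number of relevant response bins is taken to be the number of bins actually occupied, which is an assumption (Panzeri and Treves, 1996, describe a more careful estimate). The function names `choose_num_bins`, `plugin_information`, `first_order_bias`, and `corrected_information` are hypothetical, and the trial lists are invented.

```python
from collections import Counter
from math import log, log2

def choose_num_bins(spike_counts, trials_per_stimulus):
    """d = trials per stimulus, or the number of distinct rates observed if that is lower."""
    return min(trials_per_stimulus, len(set(spike_counts)))

def plugin_information(trials):
    """Shannon information I(S, R) in bits from a list of (stimulus, response_bin) pairs."""
    n = len(trials)
    joint = Counter(trials)
    n_s = Counter(s for s, _ in trials)
    n_r = Counter(r for _, r in trials)
    # P(s, r) * log2( P(s, r) / (P(s) P(r)) ), written in terms of counts
    return sum((c / n) * log2(c * n / (n_s[s] * n_r[r])) for (s, r), c in joint.items())

def first_order_bias(trials):
    """Simplified first-order limited-sampling bias (assumption: occupied bins only)."""
    n = len(trials)
    bins_overall = len({r for _, r in trials})
    bins_per_stimulus = Counter(s for s, _ in set(trials))  # occupied bins for each stimulus
    conditional_term = sum(count - 1 for count in bins_per_stimulus.values())
    return (conditional_term - (bins_overall - 1)) / (2 * n * log(2))

def corrected_information(trials):
    """Plug-in estimate minus the first-order bias term."""
    return plugin_information(trials) - first_order_bias(trials)

# Hypothetical data: two stimuli, two response bins, responses unrelated to the stimulus.
independent = [("a", 0), ("a", 1), ("b", 0), ("b", 1)] * 5
print(plugin_information(independent), corrected_information(independent))
```

Because the correction removes an average upward bias, the corrected estimate for an individual data set can fall slightly below zero, as it does for the uninformative example above.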

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
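The ceiling values quoted above follow from the base-2 logarithm of the number of stimuli, as this short Python sketch shows. It assumes that the stimuli in each set were presented with equal probability, which is not stated explicitly. The function name `stimulus_entropy_bits` is hypothetical.

```python
from math import log2

def stimulus_entropy_bits(n_stimuli):
    """Entropy of a set of equiprobable stimuli: the ceiling on the information per cell."""
    return log2(n_stimuli)

# Number of cells tested with each stimulus set size, as listed in the text.
cells_per_set_size = {6: 3, 4: 1, 3: 5, 2: 6}

for n_stimuli, n_cells in sorted(cells_per_set_size.items()):
    ceiling = stimulus_entropy_bits(n_stimuli)
    print(f"{n_stimuli} stimuli ({n_cells} cells): ceiling {ceiling:.2f} bits")
```

The average unmasked information of 0.3 bits lies well below even the lowest of these ceilings (1 bit for two stimuli).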

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, U.K., or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, U.S.A., 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2 and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates, and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.


Figure 5 shows the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition. It is clear that there is considerable information in the unmasked condition, with an average value, reached typically by 150 msec poststimulus, of 0.30 bits. In the masked condition, the information is progressively reduced as the SOA is decreased to 20 msec. At an SOA of 40 msec the average information was 0.16 bits (53%), and at an SOA of 20 msec the average information was 0.10 bits (33%). Thus the mask produces a very considerable reduction in the information available about which stimulus was shown. The information tends to decrease after approximately 250 msec, indicating that after this time the responses tend to introduce noise and no net further information about which stimulus was shown. This issue is addressed specifically in Figure 6, which shows the information available in 50-msec epochs taken at different poststimulus times with different SOAs. It is clear that most information is available in 50-msec epochs taken up to 200 to 250 msec. At longer poststimulus times there is still some information in the unmasked condition but little in the masked conditions. The small decrease in information shown in Figure 5 at poststimulus times longer than approximately 250 msec can overall be accounted for by the fact that the large difference in the responses to the different stimuli evident by 250 msec tends to become proportionally smaller as more spikes in the longer poststimulus times are included. Any firing after 250 msec poststimulus is effectively spontaneous activity of the neuron and therefore contributes only noise about the stimulus.

To show that the results in Figures 5 and 6 are not due to averaging across cells, we show in Figure 7 the information available from the responses of one cell. Information is encoded in the cell's response for several hundred milliseconds after the cell starts responding to the stimulus, rising to a peak of 0.85 bits at 100 to 150 msec after the onset of stimulus presentation. The cell continues to encode a significant amount of information about the stimulus in 50-msec epochs taken up to 400 msec.

To summarize the results we show in Figure 8a theaverage across the cells of the cumulated informationavailable in a 200-msec period from stimulus onset fromthe responses of the cells as a function of the SOA Thisemphasizes how as the SOA is reduced toward 20 msecthe information does reduce rapidly but that neverthe-less at an SOA of 20 msec there is still considerableinformation about which stimulus was shown The re-duction of the information at different SOAs was highlysignicant (one-way ANOVA) at p lt 0001 (The ANOVAwas performed after rescaling the data so that each cellhad the same average information and so that variancedue to differences in the magnitude of the informationcarried by each cell would not interfere with the test ofwhether the SOA affected the information) For compari-son we show in Figure 8b the difference in the numberof spikes to the most effective and the least effectivestimulus as a function of the SOA (normalized as de-scribed for Figure 2) This response difference is closelyrelated to the information available to the different stim-uli Indeed the difference in the number of spikes ismuch more closely related to the information than is themean ring rate of the neurons (shown in Figure 4)emphasizing that it is differences of neuronal responsesto different stimuli that convey information The reduc-tion of the difference in the number of spikes at differ-ent SOAs was again signicant (p lt 005) although lesssignicant than the reduction of the information

DISCUSSION

The advance described in this paper is that, by combining neurophysiology and information theory, we have been able to quantitatively measure how backward visual masking affects the information available from the responses of neurons in the visual system. The effect of the mask is to reduce the total information available from the cell, the size of the information peak, and the length of time it is signaling information (Figures 5 to 8). The results clearly show that there is a systematic and statistically significant reduction in the amount of information with decreasing SOA (see, e.g., Figure 8).

The results emphasize that very considerable information about which stimulus was shown is available in a short epoch (e.g., 50 msec; see Figure 6) of the firing rate. This information is available even when the epoch is taken near the start of the neuronal response. This confirms the findings of Heller, Hertz, Kjaer, and Richmond (1995), Tovée et al. (1993), and Tovée and Rolls (1995), also made with recordings from inferior temporal cortex neurons. In those studies, in which the stimulus lasted for several hundred milliseconds, the epoch could be taken at a wide range of poststimulus times (in the range 100 to 500 msec). In the present study, the stimulus itself lasted for only 20 msec, and correspondingly, in the no-mask condition, the information available did drop over the next few hundred milliseconds. However, it was notable that the information in the no-mask condition did outlast the end of the stimulus by as much as 200 to 300 msec, indicating some short-term memory trace property of the neuronal circuitry (cf. Wallis & Rolls, 1997).

Figure 2. The mean (± sem) across cells of the number of spikes produced by the most effective stimulus (max) and the least effective stimulus (min) as a function of SOA.

Figure 3. The mean firing rate computed in 50-msec epochs for the 15 cells to the most effective stimulus and to the least effective stimulus as a function of the SOA. The average response to all stimuli is shown as the middle line of the three.

The results show also that an effect of the masking is that the decrease in information is more marked than the decrease in firing rate, because it is the selective part of the firing that is especially attenuated by the mask, not the spontaneous firing (see Figures 2 and 8). The results also show that even at the shortest SOA of 20 msec, the information available was on average 0.1 bits. This compares to 0.3 bits with the 16-msec stimulus shown without the mask. It also compares to a typical value for such neurons of 0.35 to 0.5 bits with a 500-msec stimulus presentation (Rolls, Treves, Tovée, & Panzeri, 1997; Tovée & Rolls, 1995). The results thus show that considerable information (33% of that available without a mask, and approximately 22% of that with a 500-msec stimulus presentation) is available from neuronal responses even under backward masking conditions that allow the neurons to have their main response in 30 msec. Also, we note that the information available from a 16-msec unmasked stimulus (0.3 bits) is a large proportion (approximately 65 to 75%) of that available from a 500-msec stimulus. These results provide evidence of how rapid the processing of visual information is in a cortical area, and provide a fundamental constraint for understanding how cortical information processing operates (see Rolls & Treves, 1998).

Figure 4. The average across the cells of the number of spikes cumulated by different poststimulus times for the most effective and the least effective stimuli for each cell, and also the average across stimuli.
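The percentages quoted above follow from simple ratios of the information values. The short sketch below reproduces the arithmetic; the 0.1-bit and 0.3-bit values are taken from the text, whereas the 0.45-bit reference for a 500-msec stimulus is an assumed single value chosen from within the quoted 0.35 to 0.5 range, and the function name is hypothetical.

```python
def percent_of(value_bits, reference_bits):
    """Express one information value as a percentage of a reference value."""
    return 100.0 * value_bits / reference_bits


masked_20ms = 0.1     # bits, SOA of 20 msec (from the text)
unmasked_16ms = 0.3   # bits, 16-msec stimulus without mask (from the text)
long_stimulus = 0.45  # bits; ASSUMED value within the quoted 0.35 to 0.5 range

vs_unmasked = percent_of(masked_20ms, unmasked_16ms)  # roughly one third
vs_long = percent_of(masked_20ms, long_stimulus)      # roughly one fifth
```

With this assumed reference value the two ratios round to 33% and 22%, in line with the proportions stated in the text.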

The analysis using information theory draws out an interesting point about the relation between the firing rate measure used previously (Rolls & Tovée, 1994; Rolls et al., 1994) and the information measure (see Figure 8). It is shown in Figure 8 that the difference in the mean responses to different stimuli (not the mean response to all stimuli) is related to the information. This reflects the fact that the information is related to the differences in neuronal responses to different stimuli. However, the information also reflects the variance in the neuronal responses, and it is presumably the larger variance at short SOAs that makes the information decrease more than the difference in neuronal responses. In fact, the standard deviation of the neuronal responses at an SOA of 20 msec is approximately 0.45 of the mean firing rate, whereas it is 0.4 of the mean firing rate without masking.

The neurophysiological and information data described here can be compared directly with the effects of backward masking in human observers, studied in the same apparatus with the same stimuli (Rolls et al., 1994). For the human observers, identification of which face from a set of six had been seen was 50% correct with an SOA of 20 msec, and 97% correct with an SOA of 40 msec (corrected for guessing) (Rolls et al., 1994). Comparing the human performance purely with the changes in firing rate under the same stimulus conditions suggested that when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only approximately 30 msec (Rolls & Tovée, 1994; Rolls et al., 1994). The implication is that 30 msec is enough time for a neuron to perform sufficient computation to enable its output to be used for identification. The new results, based on an analysis of the information encoded in the spike trains at different SOAs, support this hypothesis by showing that a significant proportion of information is available in these few spikes (see Figures 5 to 7). Thus there is very rapid processing of visual stimuli in the visual system. We also note that the analysis using information theory adds to previous analyses by providing quantitative information measures that can in principle be compared to the performance of human observers using the same information metric.

Figure 5. The average across the cells of the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition.

Figure 6. The average across the cells of the information available in discrete 50-msec epochs taken at different poststimulus times with different SOAs and for the unmasked condition.

A comparison of the latencies of the activation of neurons in the different visual cortical areas V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1, and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, section A232). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially, the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals different feed-back mechanisms will play a modulating role. Initially, this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here, and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± sem) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The "number of spikes" axis in (b) is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus, but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later, when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way, the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).
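The idea of a trace learning rule can be illustrated with a minimal sketch. The form used here, in which a decaying trace of postsynaptic activity gates Hebbian weight changes, is a common textbook formulation and is offered as an assumption, not as the exact rule of the cited studies; the function name, the parameters `eta` and `alpha`, and the toy inputs are all hypothetical.

```python
def trace_rule_update(weights, inputs, outputs, eta=0.8, alpha=0.1):
    """Apply a trace-modified Hebbian rule over a sequence of time steps.

    inputs:  list of input vectors x_t, one per time step
    outputs: list of postsynaptic activations y_t, one per time step
    eta:     trace persistence (0 gives a plain Hebbian rule); ASSUMED value
    alpha:   learning rate; ASSUMED value
    """
    w = list(weights)
    trace = 0.0
    for x, y in zip(inputs, outputs):
        # Exponentially decaying memory of recent postsynaptic activity.
        trace = (1.0 - eta) * y + eta * trace
        # Hebbian change proportional to trace times presynaptic input.
        w = [wi + alpha * trace * xi for wi, xi in zip(w, x)]
    return w


# Two successive views of the same (hypothetical) object drive different inputs.
views = [[1.0, 0.0], [0.0, 1.0]]
activity = [1.0, 0.0]  # the neuron is driven directly by the first view only

w_trace = trace_rule_update([0.0, 0.0], views, activity, eta=0.8)
w_hebb = trace_rule_update([0.0, 0.0], views, activity, eta=0.0)
```

With a nonzero trace, the input active during the second view is strengthened onto the neuron that was activated by the first view, whereas with `eta=0.0` only the first input is strengthened. This is the sense in which continued activity after a stimulus could support the association of successive images of one object.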

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate, even when there are very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a "rate code"). This is indeed the case for, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems, and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
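The information-per-spike measure defined above is a simple ratio, sketched below. The function name and the example values (0.2 bits in a 50-msec epoch at a mean rate of 40 spikes/sec) are hypothetical and chosen only to make the arithmetic easy to follow; they are not values read from the figures.

```python
def information_per_spike(info_bits, mean_rate_hz, window_ms):
    """Information in a time window divided by the mean spike count in it."""
    mean_spikes = mean_rate_hz * window_ms / 1000.0
    if mean_spikes <= 0:
        raise ValueError("mean spike count in the window must be positive")
    return info_bits / mean_spikes


# Hypothetical example: 0.2 bits in a 50-msec epoch at 40 spikes/sec,
# i.e. 2 spikes on average in the window.
example = information_per_spike(0.2, 40.0, 50.0)
```

In this hypothetical case the result is 0.1 bits per spike, which would fall inside the 0.05 to 0.2 range quoted for the longer SOAs.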

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair, using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia, by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the UK Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at −500 msec with respect to the onset of the test image, with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At −100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec, the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed, to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later. The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.
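The fixed part of the trial timeline described above can be laid out as a short table of events. The sketch below is only an illustration: the timing values are taken from the text, but the function and constant names are hypothetical, and the random fixation-to-dimming interval and the reward period are omitted for brevity.

```python
FRAME_MS = 20      # PAL frame period: 16 msec drawn plus 4 msec vertical blank
TEST_ON_MS = 16    # the test image is drawn in the first 16 msec of the frame
MASK_MS = 300      # duration of the masking stimulus
SOAS_MS = (20, 40, 60, 100, 1000)  # 1000 msec serves as the no-mask condition


def trial_events(soa_ms):
    """Return (time in msec, event) pairs relative to test image onset."""
    if soa_ms not in SOAS_MS:
        raise ValueError("SOA not used in the experiment: %r" % (soa_ms,))
    events = [
        (-500, "warning tone on, fixation spot on"),
        (-100, "fixation spot off, blank screen at mean luminance"),
        (0, "tone off, test image on"),
        (TEST_ON_MS, "test image off, vertical blank begins"),
        (soa_ms, "mask on"),
        (soa_ms + MASK_MS, "mask off, fixation spot reappears"),
    ]
    return sorted(events)


timeline_20 = trial_events(20)
```

Each SOA is a whole number of 20-msec frame periods, so at the shortest SOA the mask begins on the frame immediately following the test image.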

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus, and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial, and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus, to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose d in a way specific for every cell and every time window, according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred, if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses, while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons, averaged across the stimuli, was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948):

$$I(S, R) = \sum_{s,r} P(s, r)\,\log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train, because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
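The steps of this procedure (quantize the spike counts, build the joint probability table, apply the Shannon formula, subtract a sampling correction) can be sketched as follows. Several points are assumptions rather than the authors' implementation: the equal-width binning in `quantize`, the use of a simple count of occupied response bins in the first-order bias term (the published correction estimates the number of relevant bins more carefully), and all function names and example data.

```python
from collections import Counter
from math import log, log2


def quantize(spike_counts, d):
    """Map spike counts onto d equal-width bins (ASSUMED binning scheme)."""
    lo, hi = min(spike_counts), max(spike_counts)
    if hi == lo:
        return [0] * len(spike_counts)
    width = (hi - lo) / d
    return [min(int((c - lo) / width), d - 1) for c in spike_counts]


def mutual_information(trials):
    """Estimate I(S, R) in bits from (stimulus, response_bin) pairs.

    Returns (raw, corrected), where corrected subtracts a first-order
    finite-sampling bias term based on counts of occupied bins
    (a simplification of the Panzeri and Treves, 1996, correction).
    """
    n = len(trials)
    joint = Counter(trials)
    n_s = Counter(s for s, _ in trials)
    n_r = Counter(r for _, r in trials)
    raw = 0.0
    for (s, r), c in joint.items():
        # P(s,r) / (P(s) P(r)) simplifies to c * n / (n_s[s] * n_r[r]).
        raw += (c / n) * log2(c * n / (n_s[s] * n_r[r]))
    # len(joint) is the number of occupied bins summed over stimuli.
    bias = (len(joint) - len(n_r) - (len(n_s) - 1)) / (2.0 * n * log(2))
    return raw, raw - bias


# Hypothetical data: two stimuli, 10 trials each, responses already binned.
trials = ([("A", 0)] * 8 + [("A", 1)] * 2 +
          [("B", 0)] * 2 + [("B", 1)] * 8)
raw_bits, corrected_bits = mutual_information(trials)
```

In this hypothetical example the raw estimate is about 0.28 bits and the correction removes a further 1/(40 ln 2), about 0.036 bits. With so few trials the correction is a noticeable fraction of the estimate, which illustrates why limiting the number of bins d matters.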

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells, and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition, the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
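The ceiling values quoted above can be checked with a short calculation. The sketch assumes that the stimuli in each set were presented equally often, so that the entropy of a set of n stimuli is log2(n); the function and variable names are hypothetical, while the set sizes and cell counts are those given in the text.

```python
from math import log2


def stimulus_entropy_bits(n_stimuli):
    """Entropy of n equiprobable stimuli: the ceiling on the information."""
    return log2(n_stimuli)


# Stimulus-set size -> number of cells tested with that set (from the text).
cells_per_set_size = {6: 3, 4: 1, 3: 5, 2: 6}

ceilings = {n: stimulus_entropy_bits(n) for n in cells_per_set_size}
n_cells = sum(cells_per_set_size.values())
```

The ceiling ranges from 1 bit for two stimuli to about 2.6 bits for six stimuli, and the cell counts sum to the 15 cells analyzed, so an average of 0.3 bits in the unmasked condition lies well below the ceiling for every set size.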

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2 and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates, and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.

Rolls et al 311

tion about which stimulus was shown is available in ashort epoch (eg 50 msec see Figure 6) of the ringrate This information is available even when the epochis taken near the start of the neuronal response Thisconrms the ndings of Heller Hertz Kjaer and Rich-mond (1995) Toveacutee et al (1993) and Toveacutee and Rolls(1995) also made with recordings from inferior tempo-

ral cortex neurons In those studies in which the stimu-lus lasted for several hundred milliseconds the epochcould be taken at a wide range of poststimulus times (inthe range 100 to 500 msec) In the present study thestimulus itself lasted for only 20 msec and correspond-ingly in the no mask condition the information availabledid drop over the next few hundred milliseconds How-

Figure 3 The mean ringrate computed in 50-msec ep-ochs for the 15 cells to themost effective stimulus and tothe least effective stimulus asa function of the SOA The av-erage response to all stimuli isshown as the middle line ofthe three

304 Journal of Cognitive Neuroscience Volume 11 Number 3

ever it was notable that the information in the no-maskcondition did outlast the end of the stimulus by as muchas 200 to 300 msec indicating some short-term memorytrace property of the neuronal circuitry (cf Wallis ampRolls 1987)

The results show also that an effect of the masking isthat the decrease in information is more marked than

the decrease in ring rate because it is the selective partof the ring that is especially attenuated by the masknot the spontaneous ring (see Figures 2 and 8) Theresults also show that even at the shortest SOA of 20msec the information available was on average 01 bitsThis compares to 03 bits with the 16-msec stimulusshown without the mask It also compares to a typical

Figure 4 The average acrossthe cells of the number ofspikes cumulated by differentpoststimulus times for themost effective and the least ef-fective stimuli for each celland also the average acrossstimuli

Rolls et al 305

value for such neurons of 035 to 05 bits with a 500-msec stimulus presentation (Rolls Treves Toveacutee ampPanzeri 1997 Toveacutee amp Rolls 1995) The results thus showthat considerable information (33 of that availablewithout a mask and approximately 22 of that with a500-msec stimulus presentation) is available from neuro-nal responses even under backward masking conditionsthat allow the neurons to have their main response in30 msec Also we note that the information availablefrom a 16-msec unmasked stimulus (03 bits) is a largeproportion (approximately 65 to 75) of that availablefrom a 500-msec stimulus These results provide evi-dence of how rapid the processing of visual informationis in a cortical area and provide a fundamental constraintfor understanding how cortical information processingoperates (see Rolls amp Treves 1998)

The analysis using information theory draws out an interesting point about the relation between the firing rate measure used previously (Rolls & Tovée, 1994; Rolls et al., 1994) and the information measure (see Figure 8). It is shown in Figure 8 that the difference in the mean responses to different stimuli (not the mean response to all stimuli) is related to the information. This reflects the fact that the information is related to the differences in neuronal responses to different stimuli. However, the information also reflects the variance in the neuronal responses, and it is presumably the larger variance at short SOAs that makes the information decrease more than the difference in neuronal responses. In fact, the standard deviation of the neuronal responses at an SOA of 20 msec is approximately 0.45 of the mean firing rate, whereas it is 0.4 of the mean firing rate without masking.
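The dependence of information on response variability, and not only on the difference in mean responses, can be illustrated with a toy calculation. The sketch below is not the analysis used in the study: the function name and both response distributions are hypothetical, chosen so that the difference in mean spike count between the two stimuli is the same (3 versus 1) while the trial-to-trial spread differs.

```python
import math

def information_bits(p_r_given_s, p_s):
    """Shannon information I(S;R) in bits from P(r|s) tables and P(s)."""
    # Marginal response distribution P(r) = sum_s P(s) P(r|s).
    p_r = {}
    for s, dist in p_r_given_s.items():
        for r, p in dist.items():
            p_r[r] = p_r.get(r, 0.0) + p_s[s] * p
    # I = sum_{s,r} P(s) P(r|s) log2( P(r|s) / P(r) ), skipping zero terms.
    info = 0.0
    for s, dist in p_r_given_s.items():
        for r, p in dist.items():
            if p > 0:
                info += p_s[s] * p * math.log2(p / p_r[r])
    return info

p_s = {"best": 0.5, "worst": 0.5}  # hypothetical, equiprobable stimuli

# Hypothetical spike-count distributions with the same mean difference.
reliable = {"best": {3: 1.0}, "worst": {1: 1.0}}
variable = {
    "best": {2: 0.25, 3: 0.5, 4: 0.25},
    "worst": {0: 0.25, 1: 0.5, 2: 0.25},
}

info_reliable = information_bits(reliable, p_s)
info_variable = information_bits(variable, p_s)
print(info_reliable, info_variable)
```

In this toy case the information falls from 1 bit to 0.75 bits purely because the two response distributions begin to overlap, even though the difference between the mean responses is unchanged.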

The neurophysiological and information data described here can be compared directly with the effects of backward masking in human observers, studied in the same apparatus with the same stimuli (Rolls et al., 1994). For the human observers, identification of which face from a set of six had been seen was 50% correct with an SOA of 20 msec and 97% correct with an SOA of 40 msec (corrected for guessing) (Rolls et al., 1994). Comparing the human performance purely with the changes in firing rate under the same stimulus conditions suggested that when it is just possible to identify which face has been seen, neurons in a given cortical area may be responding for only approximately 30 msec (Rolls & Tovée, 1994; Rolls et al., 1994). The implication is that 30 msec is enough time for a neuron to perform sufficient computation to enable its output to be used for identification. The new results, based on an analysis of the information encoded in the spike trains at different SOAs, support this hypothesis by showing that a significant proportion of information is available in these few spikes (see Figures 5 to 7). Thus there is very rapid processing of visual stimuli in the visual system. We also note that the analysis using information theory adds to previous analyses by providing quantitative information measures that can in principle be compared to the performance of human observers using the same information metric.

Figure 5. The average across the cells of the cumulated information (averaged across stimuli) at different poststimulus times for the different SOAs and for the unmasked condition.

Figure 6. The average across the cells of the information available in discrete 50-msec epochs taken at different poststimulus times with different SOAs and for the unmasked condition.

A comparison of the latencies of the activation of neurons in the different visual cortical areas (V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex) suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1, and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, section A2.3.2). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially, the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals different feed-back mechanisms will play a modulating role. Initially this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± SEM) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The scale in (b) of the "number of spikes" axis is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later, when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way, the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate even when there are very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a "rate code"). This is indeed the case for, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
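The definition of information per spike used in this paragraph can be written compactly. The symbols below are introduced here only for illustration and do not appear in the original text: I(Δt) is the information in a window of length Δt, and r̄ is the mean firing rate in that window. The numerical line uses hypothetical values (0.1 bits in a 50-msec epoch at 40 spikes/sec), not measured data.

```latex
\[
i_{\mathrm{spike}} = \frac{I(\Delta t)}{\bar{r}\,\Delta t},
\qquad
\text{e.g.}\quad
\frac{0.1\ \mathrm{bits}}{40\ \mathrm{spikes/sec} \times 0.05\ \mathrm{sec}}
= \frac{0.1\ \mathrm{bits}}{2\ \mathrm{spikes}}
= 0.05\ \mathrm{bits\ per\ spike}.
\]
```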

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair, using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the U.K. Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at -500 msec with respect to the onset of the test image, with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At -100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later. The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.
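The trial timing described above can be summarized as a small event table. The sketch below only restates the timings given in the text; the constant and function names are hypothetical, and treating the test image as ending at 16 msec is a simplification that ignores the monitor persistence of less than 3 msec.

```python
FRAME_MS = 20   # PAL frame period: 16 msec drawn + 4 msec vertical blank
TEST_MS = 16    # duration for which the test image is drawn
MASK_MS = 300   # duration of the masking stimulus
SOAS_MS = (20, 40, 60, 100, 1000)  # 1000 msec serves as the no-mask condition

def trial_events(soa_ms):
    """Return (time_ms, event) pairs relative to test-image onset."""
    return [
        (-500, "warning tone and fixation spot on"),
        (-100, "fixation spot blinked off"),
        (0, "tone off, test image on"),
        (TEST_MS, "test image off"),
        (soa_ms, "mask on"),
        (soa_ms + MASK_MS, "mask off, fixation spot reappears"),
    ]

def blank_gap_ms(soa_ms):
    """Blank interval between test-image offset and mask onset."""
    return soa_ms - TEST_MS

for soa in SOAS_MS:
    print(soa, blank_gap_ms(soa), trial_events(soa)[-1][0])
```

One thing this makes visible is that every SOA is a whole number of 20-msec frame periods, so at the shortest SOA the mask occupies the very next video frame after the test image.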

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment, the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window, according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s,r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
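The procedure described above (count spikes in a window, quantize into d bins, build the joint table, apply the Shannon formula) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, equal-width binning is an assumption (the text specifies only the number of bins d), and the finite sampling correction of Panzeri and Treves (1996) is deliberately not implemented, so the value returned is the uncorrected plug-in estimate.

```python
import math
from collections import Counter

def quantize(count, max_count, n_bins):
    """Map a spike count in [0, max_count] to one of n_bins equal-width bins.

    Equal-width binning is an assumption made for this sketch.
    """
    if max_count == 0:
        return 0
    return min(n_bins - 1, count * n_bins // (max_count + 1))

def plugin_information_bits(trials, n_bins):
    """Uncorrected (plug-in) estimate of I(S;R) in bits.

    trials: list of (stimulus_label, spike_count) pairs, one per trial.
    """
    max_count = max(c for _, c in trials)
    binned = [(s, quantize(c, max_count, n_bins)) for s, c in trials]
    n = len(binned)
    joint = Counter(binned)                 # counts for each (s, r) pair
    n_s = Counter(s for s, _ in binned)     # counts per stimulus
    n_r = Counter(r for _, r in binned)     # counts per response bin
    # sum over (s, r) of P(s,r) * log2[ P(s,r) / (P(s) P(r)) ],
    # with probabilities taken as relative frequencies.
    return sum(
        (k / n) * math.log2(k * n / (n_s[s] * n_r[r]))
        for (s, r), k in joint.items()
    )

# Hypothetical spike counts: six trials for each of two faces.
trials = [("face_A", c) for c in (5, 6, 7, 6, 5, 7)]
trials += [("face_B", c) for c in (0, 1, 0, 2, 1, 0)]
info = plugin_information_bits(trials, n_bins=6)  # d = trials per stimulus
print(info)
```

With so few trials the plug-in estimate is biased upward, which is the reason the text limits the number of bins and subtracts a sampling correction.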

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
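The ceiling argument can be made concrete with a short calculation. The sketch below assumes equiprobable stimuli within each set (an assumption for illustration, under which the entropy is the base-2 logarithm of the set size); the variable and function names are hypothetical, and the cell counts are those listed at the start of the paragraph.

```python
import math

# Number of cells tested with each stimulus-set size, as listed in the text.
cells_per_set_size = {6: 3, 4: 1, 3: 5, 2: 6}

def ceiling_bits(n_stimuli):
    """Maximum information about which of n equiprobable stimuli was shown."""
    return math.log2(n_stimuli)

total_cells = sum(cells_per_set_size.values())
ceilings = {n: ceiling_bits(n) for n in cells_per_set_size}

for n, c in sorted(ceilings.items()):
    print(f"{n} stimuli: ceiling {c:.2f} bits")
```

The set sizes account for all 15 cells, and the ceilings span 1 bit (two stimuli) to about 2.6 bits (six stimuli), so the average unmasked value of 0.3 bits sits well below the ceiling for every set size.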

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, U.K., or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, U.S.A., 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2, and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.

Rolls et al 311

ever it was notable that the information in the no-maskcondition did outlast the end of the stimulus by as muchas 200 to 300 msec indicating some short-term memorytrace property of the neuronal circuitry (cf Wallis ampRolls 1987)

The results show also that an effect of the masking isthat the decrease in information is more marked than

the decrease in ring rate because it is the selective partof the ring that is especially attenuated by the masknot the spontaneous ring (see Figures 2 and 8) Theresults also show that even at the shortest SOA of 20msec the information available was on average 01 bitsThis compares to 03 bits with the 16-msec stimulusshown without the mask It also compares to a typical

Figure 4 The average acrossthe cells of the number ofspikes cumulated by differentpoststimulus times for themost effective and the least ef-fective stimuli for each celland also the average acrossstimuli

Rolls et al 305

value for such neurons of 035 to 05 bits with a 500-msec stimulus presentation (Rolls Treves Toveacutee ampPanzeri 1997 Toveacutee amp Rolls 1995) The results thus showthat considerable information (33 of that availablewithout a mask and approximately 22 of that with a500-msec stimulus presentation) is available from neuro-nal responses even under backward masking conditionsthat allow the neurons to have their main response in30 msec Also we note that the information availablefrom a 16-msec unmasked stimulus (03 bits) is a largeproportion (approximately 65 to 75) of that availablefrom a 500-msec stimulus These results provide evi-dence of how rapid the processing of visual informationis in a cortical area and provide a fundamental constraintfor understanding how cortical information processingoperates (see Rolls amp Treves 1998)

The analysis using information theory draws out aninteresting point about the relation between the ringrate measure used previously (Rolls amp Toveacutee 1994 Rollset al 1994) and the information measure (see Figure 8)It is shown in Figure 8 that the difference in the meanresponses to different stimuli (not the mean response toall stimuli) is related to the information This reects thefact that the information is related to the differences inneuronal responses to different stimuli However theinformation also reects the variance in the neuronalresponses and it is presumably the larger variance atshort SOAs that makes the information decrease more

than the difference in neuronal responses In fact thestandard deviation of the neuronal responses at an SOAof 20 msec is approximately 045 of the mean ring ratewhereas it is 04 of the mean ring rate without mask-ing

The neurophysiological and information data de-scribed here can be compared directly with the effectsof backward masking in human observers studied in thesame apparatus with the same stimuli (Rolls et al 1994)For the human observers identication of which facefrom a set of six had been seen was 50 correct withan SOA of 20 msec and 97 correct with an SOA of 40msec (corrected for guessing) (Rolls et al 1994) Com-paring the human performance purely with the changesin ring rate under the same stimulus conditions sug-gested that when it is just possible to identify which facehas been seen neurons in a given cortical area may beresponding for only approximately 30 msec (Rolls ampToveacutee 1994 Rolls et al 1994) The implication is that 30msec is enough time for a neuron to perform sufcientcomputation to enable its output to be used for iden-tication The new results based on an analysis of theinformation encoded in the spike trains at different SOAssupport this hypothesis by showing that a signicantproportion of information is available in these few spikes(see Figures 5 to 7) Thus there is very rapid processingof visual stimuli in the visual system We also note thatthe analysis using information theory adds to previous

Figure 5 The average across the cells of the cumulated informa-tion (averaged across stimuli) at different poststimulus times for thedifferent SOAs and for the unmasked condition

Figure 6 The average across the cells of the information availablein discrete 50-msec epochs taken at different poststimulus timeswith different SOAs and for the unmasked condition

306 Journal of Cognitive Neuroscience Volume 11 Number 3

analyses by providing quantitative information measuresthat can in principle be compared to the performanceof human observers using the same information metric

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± sem) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The scale in (b) of the “number of spikes” axis is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

A comparison of the latencies of the activation of neurons in the different visual cortical areas (V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex) suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1 and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, Section A2.3.2). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially, the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals different feedback mechanisms will play a modulating role. Initially, this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.
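As a rough illustration of the per-stage latency figures quoted in this paragraph, the following sketch adds up the lag accumulated between the first and last of the five named areas. It assumes, purely for illustration, that the stages are strictly serial and that each hop adds an independent 10 to 15 msec; the function and constant names are hypothetical.

```python
# Areas named in the text, in their assumed serial order.
STAGES = ["V1", "V2", "V4", "posterior IT", "anterior IT"]
# Per-stage lag range quoted in the text (msec).
LAG_MS = (10, 15)


def cumulative_lag_ms(stages=STAGES, lag_ms=LAG_MS):
    """Return (min, max) latency in msec added between the first and last stage."""
    hops = len(stages) - 1  # number of stage-to-stage transitions
    return hops * lag_ms[0], hops * lag_ms[1]


print(cumulative_lag_ms())  # (40, 60)
```

Under these simplifying assumptions, the four transitions from V1 to anterior inferior temporal cortex add roughly 40 to 60 msec in total.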

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).
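The idea of a trace learning rule can be sketched in a few lines. This is a minimal illustration, not the model used in the cited work: it assumes a single linear neuron, an exponentially decaying activity trace with a hypothetical decay parameter `eta`, a hypothetical learning rate `alpha`, and weight normalization after each step.

```python
import numpy as np


def trace_learning(inputs, weights, alpha=0.1, eta=0.8):
    """Hebbian learning gated by a decaying trace of recent postsynaptic activity.

    inputs  : sequence of input vectors (successive views, assumed same object)
    weights : initial synaptic weight vector
    alpha   : learning rate (hypothetical value)
    eta     : trace persistence; eta = 0 reduces to a plain Hebbian rule
    """
    w = np.asarray(weights, dtype=float).copy()
    trace = 0.0
    for x in inputs:
        x = np.asarray(x, dtype=float)
        y = float(w @ x)                          # response to the current view
        trace = (1.0 - eta) * y + eta * trace     # activity carried over in time
        w += alpha * trace * x                    # active inputs associate with the trace
        norm = np.linalg.norm(w)
        if norm > 0:
            w /= norm                             # keep weights bounded (assumption)
    return w


# Two non-overlapping "views"; the neuron initially responds only to the first.
view_a, view_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
w = trace_learning([view_a, view_b], weights=[1.0, 0.0])
print(w[1] > 0)  # True: the second view gained weight through the lingering trace
```

With `eta=0` the second view gains no weight in this example, because the neuron is silent when that view arrives; the lingering trace is what links the two successive inputs.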

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate, even when there are a very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a “rate code”). This is indeed the case for, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
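The definition of information per spike given in this paragraph amounts to a single division. The sketch below shows it; the function name and the numerical inputs are hypothetical illustrations, not values read from the figures.

```python
def information_per_spike(info_bits, mean_rate_hz, window_ms):
    """Information in a time window divided by the mean spike count in that window."""
    mean_spikes = mean_rate_hz * window_ms / 1000.0
    if mean_spikes <= 0:
        raise ValueError("mean spike count must be positive")
    return info_bits / mean_spikes


# Hypothetical example: 0.1 bits in a 50-msec window at a mean rate of 40 spikes/sec,
# i.e. 2 spikes on average in the window.
print(information_per_spike(0.1, 40.0, 50.0))  # 0.05
```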

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the UK Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly with the whole screen visible to both eyes.

Each trial started at −500 msec with respect to the onset of the test image, with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At −100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec, the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec, it dimmed to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later.
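The trial timeline described above can be summarized as a list of timed events. The sketch below is only a restatement of that paragraph in code form; the function name, event labels, and the use of a caller-supplied dimming delay are illustrative assumptions, not part of the original apparatus software.

```python
# SOAs used in the experiment (msec); 1000 msec served as the no-mask condition.
SOAS_MS = (20, 40, 60, 100, 1000)


def trial_events(soa_ms, dim_delay_ms):
    """Return (time in msec, event) pairs, with time 0 at test image onset.

    soa_ms       : stimulus onset asynchrony, one of SOAS_MS
    dim_delay_ms : random delay before the fixation spot dims (150 to 3350 msec)
    """
    if soa_ms not in SOAS_MS:
        raise ValueError("unexpected SOA")
    if not 150 <= dim_delay_ms <= 3350:
        raise ValueError("dimming delay outside the stated range")
    mask_off = soa_ms + 300            # mask lasts 300 msec
    dim_on = mask_off + dim_delay_ms   # fixation spot dims; licks are rewarded
    dim_off = dim_on + 1000            # dimming period lasts 1000 msec
    return [
        (-500, "warning tone on; fixation spot on"),
        (-100, "fixation spot blinked off"),
        (0, "tone off; test image on"),
        (16, "test image off"),
        (soa_ms, "mask on"),
        (mask_off, "mask off; fixation spot reappears"),
        (dim_on, "fixation spot dims"),
        (dim_off, "fixation spot off"),
        (dim_off + 500, "reward availability ends"),
    ]


for t, event in trial_events(soa_ms=20, dim_delay_ms=150):
    print(t, event)
```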

The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred, if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s, r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
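The procedure outlined above (quantize the spike counts, build the joint table, apply the Shannon formula, subtract a sampling correction) can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the equal-width binning in `quantize` is an assumed scheme, and `first_order_bias_bits` uses a first-order term of the form [Σ_s (R_s − 1) − (R − 1)] / (2N ln 2) with the numbers of response bins R_s and R counted naively as occupied cells, whereas Panzeri and Treves (1996) describe a more careful estimate of those numbers. All function names and data values are hypothetical.

```python
import numpy as np


def quantize(counts, d):
    """Map spike counts onto at most d equal-width bins (binning scheme is an assumption)."""
    counts = np.asarray(counts, dtype=float)
    lo, hi = counts.min(), counts.max()
    if hi == lo:                                  # every trial gave the same count
        return np.zeros(counts.shape, dtype=int)
    bins = np.floor((counts - lo) / (hi - lo) * d).astype(int)
    return np.minimum(bins, d - 1)                # the maximum count falls in the top bin


def joint_table(stimuli, responses):
    """Experimental joint probability table P(s, r) from one (s, r) pair per trial."""
    _, s_idx = np.unique(stimuli, return_inverse=True)
    _, r_idx = np.unique(responses, return_inverse=True)
    table = np.zeros((s_idx.max() + 1, r_idx.max() + 1))
    np.add.at(table, (s_idx, r_idx), 1.0)         # count trials in each (s, r) cell
    return table / table.sum()


def shannon_information_bits(p_sr):
    """I(S, R) = sum over s, r of P(s,r) log2[P(s,r) / (P(s) P(r))], skipping empty cells."""
    p_s = p_sr.sum(axis=1, keepdims=True)
    p_r = p_sr.sum(axis=0, keepdims=True)
    nz = p_sr > 0
    return float(np.sum(p_sr[nz] * np.log2(p_sr[nz] / (p_s * p_r)[nz])))


def first_order_bias_bits(p_sr, n_trials):
    """Naive-count first-order sampling bias (a simplification, see lead-in)."""
    occupied = p_sr > 0
    r_s = occupied.sum(axis=1)                    # occupied response bins per stimulus
    r_all = occupied.any(axis=0).sum()            # occupied response bins overall
    return float((np.sum(r_s - 1) - (r_all - 1)) / (2.0 * n_trials * np.log(2)))


# Hypothetical data: 2 stimuli x 6 trials, spike counts in one time window.
stimuli = np.repeat([0, 1], 6)
counts = np.array([0, 1, 0, 2, 1, 3, 2, 4, 3, 5, 4, 3])
p = joint_table(stimuli, quantize(counts, d=6))
raw = shannon_information_bits(p)
corrected = raw - first_order_bias_bits(p, n_trials=len(counts))
print(raw, corrected)
```

With so few trials the correction term is not small relative to the raw estimate, which is the reason the text limits the number of response bins to the number of trials per stimulus.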

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the base-2 logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling so that the information measure was not distorted. In fact, for the unmasked condition, the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
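The ceiling values quoted here follow directly from the stimulus set sizes. A minimal check, assuming the stimuli in each set were presented equally often (the function name is illustrative):

```python
import math


def entropy_ceiling_bits(n_stimuli):
    """Maximum information about which of n equiprobable stimuli was shown."""
    return math.log2(n_stimuli)


# Set sizes used in the experiment: two, three, four, and six stimuli.
for n in (2, 3, 4, 6):
    print(n, round(entropy_ceiling_bits(n), 2))
# 2 1.0
# 3 1.58
# 4 2.0
# 6 2.58
```

The smallest and largest sets give the 1-bit and roughly 2.6-bit limits stated in the text, both well above the 0.3-bit average measured in the unmasked condition.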

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2, and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.


306 Journal of Cognitive Neuroscience Volume 11 Number 3

analyses by providing quantitative information measuresthat can in principle be compared to the performanceof human observers using the same information metric

A comparison of the latencies of the activation ofneurons in the different visual cortical areas V1 V2 V4posterior inferior temporal cortex and anterior inferiortemporal cortex suggests that approximately 10 to 15msec is added by each stage (Dinse amp Kruger 1994Oram amp Perrett 1992 Raiguel et al 1989 Rolls 1992Vogel amp Orban 1994) This lag also seems to be commonto the passage of information between different subdivi-sions of a given area For example there seems to be alag of 15 msec between neurons in layer 4C and in themore supercial layers of V1 and a lag of 11 msecbetween viewer-centered cells and object-centered cellsin the temporal visual cortex (Maunsell amp Gibson 1992Perrett et al 1992) These studies show that neurons inthe next stage of processing start ring soon after (15msec) the neurons in any stage of processing havestarted to re The fact that considerable information isavailable in short epochs of for example 20 msec of thering of neurons provides part of the underlying basisfor this rapid sequential activation of connected visualcortical areas (Toveacutee amp Rolls 1995) The high ring ratesof neurons in the visual cortical areas to their mosteffective stimulus may be an important aspect of thisrapid transmission of visual information (cf Rolls ampTreves 1998 section A232) Nevertheless the ring of

neurons within a cortical area normally continues to avisual stimulus for several hundred milliseconds (seeRolls amp Treves 1998) Over this long period of timedifferent factors will inuence the ring of neurons in agiven cortical area Initially the ring will be basedmainly on incoming information from the precedingcortical area (feed-forward information) but at varyingtemporal intervals different feed-back mechanisms willplay a modulating role Initially this is likely to include

Figure 7 The information available from the responses of one ofthe cells as a function of the poststimulus time for different SOAs

Figure 8 (a) The average (plusmn sem) across the cells of the cumu-lated information available in a 200-msec period from stimulus onsetfrom the responses of the cells as a function of the SOA (b) The dif-ference in the number of spikes to the most effective and to theleast effective stimulus is also shown as a function of the SOA Thescale in (b) of the ldquonumber of spikesrdquo axis is scaled so that the infor-mation value (a) and the difference in the number of spikes (b) co-incide in the no-mask condition

Rolls et al 307

lateral inhibition followed by intracortical feedbackfrom local excitatory recurrent collateral axons of otherpyramidal cells and then feedback from higher corticalareas However although all these separate inputs seemto be very different conceptually in time it is of interestthat in dynamical systems using integrate-and-re neu-rons to model the dynamics of the real brain the ex-change of information required to achieve rapid settlingof the network into a nal state can be very much fasterthan might be expected by taking the contribution ofeach stage as a separate time step (see Rolls amp Treves1998 Treves 1993) This is exactly consistent with therapid speed of processing demonstrated here and em-phasized by the fact that considerable information isavailable from inferior temporal cortex neurons whentheir processing is interrupted by a mask starting only20 msec after the onset of a stimulus

Our previous studies suggested that there is a popula-tion of neurons in the temporal cortical areas that re-spond to a single frame (16 msec) presentation of thetarget stimulus a face with an increased ring rate thatcan last for 200 to 300 msec (Rolls amp Toveacutee 1994 Rollset al 1994) This is in the unmasked condition Theinformation analysis conrms and extends this ndingThe neuronal responses can encode signicant amountsof information for up to 300 msec after the presentationof a 16-msec stimulus in the unmasked condition (seeFigure 5) These results suggest that there may be ashort-term visual memory implemented by the con-tinued ring of these neurons after a stimulus hasdisappeared This short-term visual memory may be im-plemented by the recurrent collateral connections madebetween nearby pyramidal cells in the cerebral cortexThese recurrent connections may function in part as anautoassociative memory which not only enables corticalnetworks to show continued ring for a few hundredmilliseconds after a briey presented stimulus but alsoconfer some of the response specicity inherent in theresponses of cortical neurons (Rolls amp Treves 1998) Oneeffect that may be facilitated by this short-term visualmemory is implementation of a trace learning rule in thevisual cortex that may be used to learn invariant repre-sentations (Rolls 1992 1994 1995 Rolls amp Toveacutee 1994Wallis amp Rolls 1997) The continuing neural activity afterone stimulus would enable it to be associated with thesucceeding images which given the nature of the statis-tics of our visual world would probably be images of thesame object These associations between images pro-duced within a short time by the same object could formthe basis for the construction of an invariant repre-sentation of that stimulus For example the neuronsactivated by a stimulus are still in a state of activation(and postsynaptic depolarization) perhaps 300 mseclater when the same stimulus is seen translated acrossthe retina 
viewed from a different angle or at a differentsize so that the active axons carrying the transformed

representation can undergo Hebbian associative synapticmodication onto just those neurons that remain in anactivated state from the previous input produced by thesame object In this way the invariant properties evidentacross short time epochs of the inputs produced byobjects may be learned by the visual system (Rolls 1992Rolls amp Treves 1998 Wallis amp Rolls 1997)

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, decreasing slightly to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate, even when there are very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a "rate code"). This is indeed the case for, for example, the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
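The definition of information per spike given above (information in a short window divided by the number of spikes emitted in that window) can be illustrated with a minimal sketch. The function name and all numerical values below are hypothetical and chosen only for illustration; they are not data from this study.

```python
def information_per_spike(info_bits, mean_spike_count):
    """Information in a time window divided by the mean spike count in it."""
    if mean_spike_count <= 0:
        raise ValueError("mean_spike_count must be positive")
    return info_bits / mean_spike_count


# Hypothetical example values (assumptions, not measurements from the paper):
window_s = 0.050       # 50-msec analysis window
mean_rate_hz = 40.0    # mean firing rate in that window
info_bits = 0.2        # information available in that window

mean_spikes = mean_rate_hz * window_s          # about 2 spikes in the window
bits_per_spike = information_per_spike(info_bits, mean_spikes)
print(f"{bits_per_spike:.2f} bits per spike")
```

With these illustrative inputs the result falls inside the 0.05 to 0.2 bits per spike range quoted for the longer SOAs.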

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair, using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia, using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the UK Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at −500 msec with respect to the onset of the test image, with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At −100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed, to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off, and reward availability terminated 500 msec later.
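The trial sequence described above can be laid out as a simple event list. The following is a minimal sketch only: the function and label names are hypothetical, and it assumes that reward availability ends 500 msec after the fixation spot is switched off at the end of the dimming period, which is one reading of the description.

```python
SOAS_MS = (20, 40, 60, 100, 1000)   # SOA conditions stated in the text
TEST_DURATION_MS = 16               # one frame of the test image
MASK_DURATION_MS = 300
DIM_DURATION_MS = 1000
DIM_DELAY_RANGE_MS = (150, 3350)    # random interval before dimming


def trial_events(soa_ms, dim_delay_ms):
    """Return (time_ms, event) pairs relative to test-image onset at 0 msec."""
    if soa_ms not in SOAS_MS:
        raise ValueError(f"SOA must be one of {SOAS_MS}")
    lo, hi = DIM_DELAY_RANGE_MS
    if not lo <= dim_delay_ms <= hi:
        raise ValueError(f"dim delay must lie in [{lo}, {hi}] msec")
    mask_off = soa_ms + MASK_DURATION_MS
    dim_on = mask_off + dim_delay_ms
    dim_off = dim_on + DIM_DURATION_MS
    return [
        (-500, "warning tone on, fixation spot on"),
        (-100, "fixation spot blinked off"),
        (0, "tone off, test image on"),
        (TEST_DURATION_MS, "test image off"),
        (soa_ms, "mask on"),
        (mask_off, "mask off, fixation spot reappears"),
        (dim_on, "fixation spot dims (reward available)"),
        (dim_off, "fixation spot off"),
        (dim_off + 500, "reward availability ends"),  # assumed reading
    ]


events = trial_events(soa_ms=20, dim_delay_ms=150)
for time_ms, label in events:
    print(f"{time_ms:6d} msec  {label}")
```

For the 20-msec SOA the blank interval between test-image offset and mask onset is only 4 msec, which corresponds to the vertical blank interval of the frame.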

The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus, and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial, and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus, to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose d in a way specific for every cell and every time window, according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred, if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s,r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
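The Shannon formula above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the analysis code used in the study: it computes only the raw ("plug-in") estimate from the empirical probability table and does not reproduce the finite sampling correction of Panzeri and Treves (1996), which would still have to be subtracted. The function names, the rule of taking the smallest trial count when stimuli have unequal numbers of trials, and the example spike counts are all hypothetical.

```python
import math
from collections import Counter


def choose_num_bins(trials_by_stimulus):
    """Number of bins d: trials per stimulus, or the number of distinct
    responses that actually occurred if that is lower.
    Assumption: with unequal trial counts, the smallest count is used."""
    n_trials = min(len(r) for r in trials_by_stimulus.values())
    n_distinct = len({x for r in trials_by_stimulus.values() for x in r})
    return min(n_trials, n_distinct)


def plugin_information(trials_by_stimulus):
    """Raw estimate of I(S, R) in bits from (already quantized) responses.
    No finite sampling correction is applied here."""
    pairs = [(s, r) for s, rs in trials_by_stimulus.items() for r in rs]
    n = len(pairs)
    n_sr = Counter(pairs)                     # joint counts
    n_s = Counter(s for s, _ in pairs)        # stimulus counts
    n_r = Counter(r for _, r in pairs)        # response counts
    info = 0.0
    for (s, r), c in n_sr.items():
        # P(s,r) / (P(s) P(r)) = c * n / (n_s * n_r)
        info += (c / n) * math.log2(c * n / (n_s[s] * n_r[r]))
    return info


# Hypothetical spike counts per trial for two face stimuli.
separable = {"face_A": [5, 6, 5, 6, 5, 6], "face_B": [0, 1, 0, 1, 0, 1]}
identical = {"face_A": [2, 3, 2, 3], "face_B": [2, 3, 2, 3]}

info_separable = plugin_information(separable)
info_identical = plugin_information(identical)
d = choose_num_bins(separable)
```

In the first example the responses to the two stimuli never overlap, so the estimate reaches the 1-bit ceiling for two equiprobable stimuli; in the second the response distributions are identical and the estimate is zero.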

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells, and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the base-2 logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition, the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
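The information ceiling discussed above follows directly from the number of stimuli in each set. The short sketch below computes it for the set sizes reported in this section, assuming equiprobable stimuli; the variable and function names are hypothetical.

```python
import math

# Number of cells tested with each stimulus set size, as reported in the text.
CELLS_BY_SET_SIZE = {6: 3, 4: 1, 3: 5, 2: 6}


def entropy_ceiling_bits(n_stimuli):
    """Maximum information (bits) about which of n equiprobable stimuli
    was shown."""
    return math.log2(n_stimuli)


ceilings = {n: entropy_ceiling_bits(n) for n in CELLS_BY_SET_SIZE}
total_cells = sum(CELLS_BY_SET_SIZE.values())

for n, bits in sorted(ceilings.items()):
    print(f"{n} stimuli: ceiling {bits:.2f} bits "
          f"({CELLS_BY_SET_SIZE[n]} cells)")
```

The ceilings run from 1 bit (two stimuli) to about 2.6 bits (six stimuli), so an average of 0.3 bits in the unmasked condition is well below the limit in every case.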

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2 and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.


analyses by providing quantitative information measures that can, in principle, be compared to the performance of human observers using the same information metric.

Figure 7. The information available from the responses of one of the cells as a function of the poststimulus time for different SOAs.

Figure 8. (a) The average (± sem) across the cells of the cumulated information available in a 200-msec period from stimulus onset from the responses of the cells, as a function of the SOA. (b) The difference in the number of spikes to the most effective and to the least effective stimulus is also shown as a function of the SOA. The scale in (b) of the "number of spikes" axis is scaled so that the information value (a) and the difference in the number of spikes (b) coincide in the no-mask condition.

A comparison of the latencies of the activation of neurons in the different visual cortical areas V1, V2, V4, posterior inferior temporal cortex, and anterior inferior temporal cortex suggests that approximately 10 to 15 msec is added by each stage (Dinse & Kruger, 1994; Oram & Perrett, 1992; Raiguel et al., 1989; Rolls, 1992; Vogel & Orban, 1994). This lag also seems to be common to the passage of information between different subdivisions of a given area. For example, there seems to be a lag of 15 msec between neurons in layer 4C and in the more superficial layers of V1, and a lag of 11 msec between viewer-centered cells and object-centered cells in the temporal visual cortex (Maunsell & Gibson, 1992; Perrett et al., 1992). These studies show that neurons in the next stage of processing start firing soon after (15 msec) the neurons in any stage of processing have started to fire. The fact that considerable information is available in short epochs of, for example, 20 msec of the firing of neurons provides part of the underlying basis for this rapid sequential activation of connected visual cortical areas (Tovée & Rolls, 1995). The high firing rates of neurons in the visual cortical areas to their most effective stimulus may be an important aspect of this rapid transmission of visual information (cf. Rolls & Treves, 1998, section A232). Nevertheless, the firing of neurons within a cortical area normally continues to a visual stimulus for several hundred milliseconds (see Rolls & Treves, 1998). Over this long period of time, different factors will influence the firing of neurons in a given cortical area. Initially, the firing will be based mainly on incoming information from the preceding cortical area (feed-forward information), but at varying temporal intervals, different feed-back mechanisms will play a modulating role. Initially this is likely to include lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is exactly consistent with the rapid speed of processing demonstrated here, and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.
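The per-stage lag of approximately 10 to 15 msec quoted above can be turned into a rough estimate of the latency accumulated along the listed areas. This is a back-of-the-envelope sketch only: it assumes a strictly serial chain in which the same lag applies to every transition between successive areas, and the names used are hypothetical.

```python
# Areas listed in the text, in order of processing.
AREAS = ["V1", "V2", "V4", "posterior IT", "anterior IT"]
PER_STAGE_LAG_MS = (10, 15)   # approximate lag added by each stage


def added_latency_ms(n_areas, per_stage_lag_ms=PER_STAGE_LAG_MS):
    """Range of latency added between the first and the last area,
    assuming one lag per transition between successive areas."""
    n_transitions = n_areas - 1
    low, high = per_stage_lag_ms
    return n_transitions * low, n_transitions * high


low_ms, high_ms = added_latency_ms(len(AREAS))
print(f"V1 to anterior IT: about {low_ms} to {high_ms} msec added")
```

Under these assumptions the four transitions from V1 to anterior inferior temporal cortex add roughly 40 to 60 msec, which illustrates how little time is available at each stage.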

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later, when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way, the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).

We nally note that the information per spike denedas the information calculated in short windows dividedby the number of spikes emitted by the cell in that timewindow is in the range 005 to 02 bits per spike forlonger SOAs slightly decreasing down to 003 to 015bits per spike at very short SOAs (The values of theinformation per spike can be easily extracted by com-paring Figure 6 which reports the values of the informa-tion available in short time epochs and Figure 3 whichplots the mean ring rates of the neurons in the sametime epochs) These values are similar to the values ofthe information per spike found in IT neurons respond-ing to long-lasting (eg 500 msec) stimuli (Heller et al1995 Rolls Treves Toveacutee amp Panzeri 1997) Thus for theexperiments described the information per spike is onlymoderate even when there are a very few spikes inresponse to a stimulus as in the mask condition withvery short SOAs This result has some implications forthe nature of the neuronal code It has been suggestedby Rieke et al (1996) that the fact that at short SOAsonly a few spikes are emitted in response to a stimulusand at the same time a psychophysical discrimination ofthe visual stimuli is still possible may imply that even inthe mammalian central nervous system when a singlestimulus is rapidly varying or is presented just for a veryshort time one or two spikes may be enough to carryvery high values of information (thus challenging theidea of a ldquorate coderdquo) This is indeed the case for forexample the visual system of the y (Rieke et al 1996)where a single spike can carry several bits of informa-tion at least when studied with dynamically varyingstimuli However the evidence presented here seems toindicate that several spikes from (perhaps different) neu-rons in the IT cortex of the monkey are needed toprovide much information about a briey presented vis-ual pattern This is consistent with the distributed repre-sentation of information used in many mammalianneural 
systems and with the advantages that computa-tion with this type of distributed representation of infor-mation confers (Rolls amp Treves 1998 Rolls Treves ampToveacutee 1997)

EXPERIMENTAL PROCEDURES AND DATAANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior partof the superior temporal sulcus (STS) in two alert ma-caque monkeys (Macaca mulatta mass 30 kg) seated in

308 Journal of Cognitive Neuroscience Volume 11 Number 3

a primate chair using techniques that have been de-scribed elsewhere (Toveacutee et al 1993 Toveacutee amp Rolls1992) The preparative procedures were performed asep-tically under sodium thiopentone anaesthesia by usingpretreatment with ketamine and posttreatment with theanalgesic buprenorphine (Temgesic and the antibioticamoxycillin Cynulox) and all procedures were in ac-cordance with the Policy Regarding the Care and Use ofAnimals approved by the Society for Neuroscience andwere licensed under the UK Animals Scientic Proce-dures Act 1986 Eye position was measured to an accu-racy of 05 8 with the search coil technique A visualxation task ensured that the monkey looked steadily atthe screen throughout the presentation of each stimulusThe task was a blink version of a visual xation task inwhich the xation spot was blinked off 100 msec beforethe target (otherwise called test) stimulus appeared Thestimuli were static visual stimuli subtending 8 in thevisual eld presented on a video monitor at a distanceof 10 m The xation spot position was at the center ofthe screen The monitor was viewed binocularly with thewhole screen visible to both eyes

Each trial started at - 500 msec with respect to theonset of the test image with a 500-msec warning toneto allow xation of the xation point which appearedat the same time At - 100 msec the xation spot wasblinked off so there was no stimulus on the screen inthe 100-msec period immediately preceding the testimage The screen in this period and at all other timesincluding the interstimulus interval and the interval be-tween the test image and the mask was set at the meanluminance of the test images and the mask At 0 msecthe tone was switched off and the test image wasswitched on for 16 msec This period was the frameduration of the video framestore with which the imageswere presented The image was drawn on the monitorfrom the top to the bottom in the rst 16 msec of theframe period by the framestore with the remaining 4msec of the frame period being the vertical blank inter-val (The PAL video system was in use) The monitor hada persistence of less than 3 msec so no part of the testimage was present at the start of the next frame Thetime between the start of the test stimulus and the startof the mask stimulus (the SOA) was either 20 40 60 100or 1000 msec (chosen in a pseudorandom sequence bythe computer) The 1000-msec condition was used tomeasure the response to the test stimulus alone (whichwas possible because the mask was delayed for so long)The duration of the masking stimulus was 300 msec Atthe termination of the masking stimulus the xation spotreappeared and then after a random interval in the range150 to 3350 msec it dimmed to indicate that lickingresponses to a tube in front of the mouth would resultin the delivery of a reward The dimming period was1000 msec and after this the xation spot was switchedoff and reward availability terminated 500 msec later

The monkey was required to xate the xation spot andif it licked at any time other than when the spot wasdimmed saline instead of fruit juice was delivered fromthe tube If the eyes moved by more than 05deg from time0 until the start of the dimming period the trial wasaborted and the data for the trial were rejected When atrial aborted a high-frequency tone sounded for 05 secno reinforcement was available for that trial and theintertrial interval was lengthened from 8 to 11 sec

The criterion for the face-selective neurons analyzedin this study was that the response to the optimal stimu-lus should be at least twice that to the optimal nonfacestimulus and that this difference should be signicant(Rolls 1984 Rolls amp Toveacutee 1995 Rolls Treves Toveacutee ampPanzeri 1997) If the neuron satised the criterion it wastested with two to six of the effective face stimuli forthat neuron We checked that none of the selected cellshad any response to the mask presented alone

The transmitted information carried by neuronal ringrates about the stimuli was computed with the use oftechniques that have been described fully previously(eg Rolls Treves Toveacutee amp Panzeri 1997 Rolls amp Treves1998) and have been used previously to analyze theresponses of inferior temporal cortex neurons (Gawneamp Richmond 1993 Optican amp Richmond 1987 RollsTreves amp Toveacutee 1997 Toveacutee amp Rolls 1995 Toveacutee et al1993) In brief the general procedure was as follows Theresponse r of a neuron to the presentation of a particularstimulus s was computed by measuring the ring rate ofthe neuron in a xed time window after the stimuluspresentation (In this experiment the information in anumber of different time windows was calculated)Measured in this way the ring rate responses takediscrete rather than continuous values that consist of thenumber of spikes in the time window on a particulartrial and span a discrete set of responses R across allstimuli and trials In the experiment the number of trialsavailable for each stimulus is limited (in the range of 6to 30 in the present experiment) When calculating theinformation the number of ring rate bins that can beused must be smaller than (or equal to) the number oftrials available for each stimulus to prevent undersam-pling and incorrectly high values of calculated informa-tion (Rolls amp Treves 1998) We therefore quantized themeasured ring rates into a smaller number of bins dWe chose here d in a way specic for every cell andevery time window according to the following d wasthe number of trials per stimulus (or the number ofdifferent rates that actually occurred if this was lower)This procedure is very effective in minimizing informa-tion loss due to overregularization of responses whileeffectively controlling for nite sampling biases seeGolomb Hertz Panzeri Treves amp Richmond (1997) andPanzeri amp Treves (1996) After this response quantizationthe experimental joint stimulus-response probability ta-ble P(s 
r) was computed from the data (where P(r) and

Rolls et al 309

P(s) are the experimental probability of occurrence ofresponses and of stimuli respectively) and the informa-tion I(S R) transmitted by the neurons averaged acrossthe stimuli was calculated by using the Shannon formula(Cover amp Thomas 1991 Shannon 1948)

I(S R) = aring sr

P(s r) log2 P(s r)

P(s)P(r)

and then subtracting the nite sampling correction ofPanzeri and Treves (1996) to obtain estimates unbiasedfor the limited sampling This leads to the informationavailable in the ring rates about the stimulus We didnot calculate the additional information available fromtemporal ring patterns within the spike train becausethe additional information is low often only 10 to 20and reects mainly the onset latency of the neuronalresponse (Toveacutee et al 1993 Toveacutee amp Rolls 1995)

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition, the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
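The ceiling referred to above can be checked directly. The short sketch below assumes that the stimuli in a set are presented equally often, so that the stimulus entropy is the base-2 logarithm of the set size; the function name is hypothetical.

```python
import math


def stimulus_entropy_bits(n_stimuli):
    """Upper bound on I(S, R) in bits for n equiprobable stimuli."""
    if n_stimuli < 1:
        raise ValueError("need at least one stimulus")
    return math.log2(n_stimuli)


# Set sizes used for the cells in this study.
for n in (2, 3, 4, 6):
    print(n, "stimuli:", round(stimulus_entropy_bits(n), 2), "bits")
```

For two stimuli the bound is 1 bit and for six stimuli it is about 2.58 bits, in line with the range of 1 to 2.6 bits quoted in the text.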

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2, and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogels, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.


lateral inhibition, followed by intracortical feedback from local excitatory recurrent collateral axons of other pyramidal cells, and then feedback from higher cortical areas. However, although all these separate inputs seem to be very different conceptually in time, it is of interest that in dynamical systems using integrate-and-fire neurons to model the dynamics of the real brain, the exchange of information required to achieve rapid settling of the network into a final state can be very much faster than might be expected by taking the contribution of each stage as a separate time step (see Rolls & Treves, 1998; Treves, 1993). This is consistent with the rapid speed of processing demonstrated here and emphasized by the fact that considerable information is available from inferior temporal cortex neurons when their processing is interrupted by a mask starting only 20 msec after the onset of a stimulus.

Our previous studies suggested that there is a population of neurons in the temporal cortical areas that respond to a single frame (16 msec) presentation of the target stimulus, a face, with an increased firing rate that can last for 200 to 300 msec (Rolls & Tovée, 1994; Rolls et al., 1994). This is in the unmasked condition. The information analysis confirms and extends this finding. The neuronal responses can encode significant amounts of information for up to 300 msec after the presentation of a 16-msec stimulus in the unmasked condition (see Figure 5). These results suggest that there may be a short-term visual memory implemented by the continued firing of these neurons after a stimulus has disappeared. This short-term visual memory may be implemented by the recurrent collateral connections made between nearby pyramidal cells in the cerebral cortex. These recurrent connections may function in part as an autoassociative memory, which not only enables cortical networks to show continued firing for a few hundred milliseconds after a briefly presented stimulus but also confers some of the response specificity inherent in the responses of cortical neurons (Rolls & Treves, 1998). One effect that may be facilitated by this short-term visual memory is implementation of a trace learning rule in the visual cortex that may be used to learn invariant representations (Rolls, 1992, 1994, 1995; Rolls & Tovée, 1994; Wallis & Rolls, 1997). The continuing neural activity after one stimulus would enable it to be associated with the succeeding images, which, given the nature of the statistics of our visual world, would probably be images of the same object. These associations between images produced within a short time by the same object could form the basis for the construction of an invariant representation of that stimulus. For example, the neurons activated by a stimulus are still in a state of activation (and postsynaptic depolarization) perhaps 300 msec later, when the same stimulus is seen translated across the retina, viewed from a different angle, or at a different size, so that the active axons carrying the transformed representation can undergo Hebbian associative synaptic modification onto just those neurons that remain in an activated state from the previous input produced by the same object. In this way, the invariant properties evident across short time epochs of the inputs produced by objects may be learned by the visual system (Rolls, 1992; Rolls & Treves, 1998; Wallis & Rolls, 1997).
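To make the idea of a trace learning rule concrete, the sketch below shows one common form of such a rule, in which the postsynaptic term of a Hebbian update is replaced by a decaying average of recent activity. This is an illustration only: the particular update equations, the function name, and the parameters `eta` (trace persistence) and `alpha` (learning rate) are assumptions of this sketch and are not specified in the present paper.

```python
def trace_hebbian_update(weights, inputs_seq, outputs_seq, eta=0.8, alpha=0.1):
    """Apply a trace-modified Hebbian rule over successive time steps.

    At each step t:
        trace_t = (1 - eta) * y_t + eta * trace_{t-1}
        w_j    += alpha * trace_t * x_j
    `inputs_seq` is a sequence of input vectors x, and `outputs_seq`
    the corresponding postsynaptic activations y.
    """
    trace = 0.0
    w = list(weights)
    for x, y in zip(inputs_seq, outputs_seq):
        # The trace keeps a memory of recent postsynaptic activity.
        trace = (1.0 - eta) * y + eta * trace
        w = [wj + alpha * trace * xj for wj, xj in zip(w, x)]
    return w


# Two successive views of one object drive different input lines.
# The neuron fires to the first view only, but the lingering trace
# still strengthens the synapse used by the second view.
views = [[1.0, 0.0], [0.0, 1.0]]
activity = [1.0, 0.0]
print(trace_hebbian_update([0.0, 0.0], views, activity))
```

In this toy case the second weight grows even though the neuron is not driven by the second view, which is the sense in which continued activity after a stimulus could link successive images of the same object.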

We finally note that the information per spike, defined as the information calculated in short windows divided by the number of spikes emitted by the cell in that time window, is in the range 0.05 to 0.2 bits per spike for longer SOAs, slightly decreasing down to 0.03 to 0.15 bits per spike at very short SOAs. (The values of the information per spike can be easily extracted by comparing Figure 6, which reports the values of the information available in short time epochs, and Figure 3, which plots the mean firing rates of the neurons in the same time epochs.) These values are similar to the values of the information per spike found in IT neurons responding to long-lasting (e.g., 500 msec) stimuli (Heller et al., 1995; Rolls, Treves, Tovée, & Panzeri, 1997). Thus, for the experiments described, the information per spike is only moderate, even when there are very few spikes in response to a stimulus, as in the mask condition with very short SOAs. This result has some implications for the nature of the neuronal code. It has been suggested by Rieke et al. (1996) that the fact that at short SOAs only a few spikes are emitted in response to a stimulus, and at the same time a psychophysical discrimination of the visual stimuli is still possible, may imply that even in the mammalian central nervous system, when a single stimulus is rapidly varying or is presented just for a very short time, one or two spikes may be enough to carry very high values of information (thus challenging the idea of a "rate code"). This is indeed the case, for example, in the visual system of the fly (Rieke et al., 1996), where a single spike can carry several bits of information, at least when studied with dynamically varying stimuli. However, the evidence presented here seems to indicate that several spikes from (perhaps different) neurons in the IT cortex of the monkey are needed to provide much information about a briefly presented visual pattern. This is consistent with the distributed representation of information used in many mammalian neural systems and with the advantages that computation with this type of distributed representation of information confers (Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997).
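The definition of information per spike given above amounts to a simple division, sketched below. The function name is hypothetical and the numbers in the example are made up for illustration; they are not values read from Figures 3 or 6.

```python
def information_per_spike(info_bits, firing_rate_hz, window_ms):
    """Information in a time window divided by the mean spike count in it."""
    mean_spikes = firing_rate_hz * window_ms / 1000.0
    if mean_spikes <= 0:
        raise ValueError("mean spike count must be positive")
    return info_bits / mean_spikes


# Illustrative values only: 0.1 bits in a 50-msec window at 40 spikes/sec
# corresponds to a mean of 2 spikes in the window.
print(information_per_spike(0.1, 40.0, 50.0))
```

With these illustrative inputs the result is 0.05 bits per spike, at the lower end of the range quoted for the longer SOAs.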

EXPERIMENTAL PROCEDURES AND DATA ANALYSIS

The activity of single neurons was recorded with glass-insulated tungsten microelectrodes in the anterior part of the superior temporal sulcus (STS) in two alert macaque monkeys (Macaca mulatta, mass 3.0 kg) seated in a primate chair, using techniques that have been described elsewhere (Tovée et al., 1993; Tovée & Rolls, 1992). The preparative procedures were performed aseptically under sodium thiopentone anaesthesia by using pretreatment with ketamine and posttreatment with the analgesic buprenorphine (Temgesic) and the antibiotic amoxycillin (Cynulox), and all procedures were in accordance with the Policy Regarding the Care and Use of Animals approved by the Society for Neuroscience and were licensed under the UK Animals (Scientific Procedures) Act 1986. Eye position was measured to an accuracy of 0.5° with the search coil technique. A visual fixation task ensured that the monkey looked steadily at the screen throughout the presentation of each stimulus. The task was a blink version of a visual fixation task in which the fixation spot was blinked off 100 msec before the target (otherwise called test) stimulus appeared. The stimuli were static visual stimuli subtending 8° in the visual field, presented on a video monitor at a distance of 1.0 m. The fixation spot position was at the center of the screen. The monitor was viewed binocularly, with the whole screen visible to both eyes.

Each trial started at −500 msec with respect to the onset of the test image with a 500-msec warning tone to allow fixation of the fixation point, which appeared at the same time. At −100 msec the fixation spot was blinked off, so there was no stimulus on the screen in the 100-msec period immediately preceding the test image. The screen in this period, and at all other times including the interstimulus interval and the interval between the test image and the mask, was set at the mean luminance of the test images and the mask. At 0 msec the tone was switched off and the test image was switched on for 16 msec. This period was the frame duration of the video framestore with which the images were presented. The image was drawn on the monitor from the top to the bottom in the first 16 msec of the frame period by the framestore, with the remaining 4 msec of the frame period being the vertical blank interval. (The PAL video system was in use.) The monitor had a persistence of less than 3 msec, so no part of the test image was present at the start of the next frame. The time between the start of the test stimulus and the start of the mask stimulus (the SOA) was either 20, 40, 60, 100, or 1000 msec (chosen in a pseudorandom sequence by the computer). The 1000-msec condition was used to measure the response to the test stimulus alone (which was possible because the mask was delayed for so long). The duration of the masking stimulus was 300 msec. At the termination of the masking stimulus the fixation spot reappeared, and then after a random interval in the range 150 to 3350 msec it dimmed to indicate that licking responses to a tube in front of the mouth would result in the delivery of a reward. The dimming period was 1000 msec, and after this the fixation spot was switched off and reward availability terminated 500 msec later.

The monkey was required to fixate the fixation spot, and if it licked at any time other than when the spot was dimmed, saline instead of fruit juice was delivered from the tube. If the eyes moved by more than 0.5° from time 0 until the start of the dimming period, the trial was aborted and the data for the trial were rejected. When a trial aborted, a high-frequency tone sounded for 0.5 sec, no reinforcement was available for that trial, and the intertrial interval was lengthened from 8 to 11 sec.
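The trial timing described above can be laid out as a list of events relative to test image onset. The sketch below is only a reading aid built from the times given in the text; the function name, the event labels, and the treatment of the test image as ending at 16 msec are assumptions of this sketch rather than a description of the original control software.

```python
ALLOWED_SOAS_MS = (20, 40, 60, 100, 1000)


def trial_events(soa_ms, dim_delay_ms):
    """Return (time_ms, event) pairs for one trial, relative to test onset.

    `soa_ms` is the stimulus onset asynchrony; `dim_delay_ms` is the
    random interval (150 to 3350 msec) between mask offset and dimming.
    """
    if soa_ms not in ALLOWED_SOAS_MS:
        raise ValueError("SOA must be one of %s" % (ALLOWED_SOAS_MS,))
    if not 150 <= dim_delay_ms <= 3350:
        raise ValueError("dimming delay must lie between 150 and 3350 msec")
    mask_off = soa_ms + 300            # mask lasts 300 msec
    dim_on = mask_off + dim_delay_ms   # fixation spot dims, licking rewarded
    dim_off = dim_on + 1000            # dimming period lasts 1000 msec
    return [
        (-500, "warning tone on, fixation spot on"),
        (-100, "fixation spot off"),
        (0, "tone off, test image on"),
        (16, "test image off"),
        (soa_ms, "mask on"),
        (mask_off, "mask off, fixation spot on"),
        (dim_on, "fixation spot dims"),
        (dim_off, "fixation spot off"),
        (dim_off + 500, "reward availability ends"),
    ]


for time_ms, event in trial_events(20, 150):
    print(time_ms, event)
```

At the shortest SOA of 20 msec the mask therefore begins 4 msec after the 16-msec drawing period of the test image ends.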

The criterion for the face-selective neurons analyzed in this study was that the response to the optimal stimulus should be at least twice that to the optimal nonface stimulus, and that this difference should be significant (Rolls, 1984; Rolls & Tovée, 1995; Rolls, Treves, Tovée, & Panzeri, 1997). If the neuron satisfied the criterion, it was tested with two to six of the effective face stimuli for that neuron. We checked that none of the selected cells had any response to the mask presented alone.

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial, and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus, to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window, according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred, if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probability of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948)

$$I(S, R) = \sum_{s, r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$


a primate chair using techniques that have been de-scribed elsewhere (Toveacutee et al 1993 Toveacutee amp Rolls1992) The preparative procedures were performed asep-tically under sodium thiopentone anaesthesia by usingpretreatment with ketamine and posttreatment with theanalgesic buprenorphine (Temgesic and the antibioticamoxycillin Cynulox) and all procedures were in ac-cordance with the Policy Regarding the Care and Use ofAnimals approved by the Society for Neuroscience andwere licensed under the UK Animals Scientic Proce-dures Act 1986 Eye position was measured to an accu-racy of 05 8 with the search coil technique A visualxation task ensured that the monkey looked steadily atthe screen throughout the presentation of each stimulusThe task was a blink version of a visual xation task inwhich the xation spot was blinked off 100 msec beforethe target (otherwise called test) stimulus appeared Thestimuli were static visual stimuli subtending 8 in thevisual eld presented on a video monitor at a distanceof 10 m The xation spot position was at the center ofthe screen The monitor was viewed binocularly with thewhole screen visible to both eyes

Each trial started at - 500 msec with respect to theonset of the test image with a 500-msec warning toneto allow xation of the xation point which appearedat the same time At - 100 msec the xation spot wasblinked off so there was no stimulus on the screen inthe 100-msec period immediately preceding the testimage The screen in this period and at all other timesincluding the interstimulus interval and the interval be-tween the test image and the mask was set at the meanluminance of the test images and the mask At 0 msecthe tone was switched off and the test image wasswitched on for 16 msec This period was the frameduration of the video framestore with which the imageswere presented The image was drawn on the monitorfrom the top to the bottom in the rst 16 msec of theframe period by the framestore with the remaining 4msec of the frame period being the vertical blank inter-val (The PAL video system was in use) The monitor hada persistence of less than 3 msec so no part of the testimage was present at the start of the next frame Thetime between the start of the test stimulus and the startof the mask stimulus (the SOA) was either 20 40 60 100or 1000 msec (chosen in a pseudorandom sequence bythe computer) The 1000-msec condition was used tomeasure the response to the test stimulus alone (whichwas possible because the mask was delayed for so long)The duration of the masking stimulus was 300 msec Atthe termination of the masking stimulus the xation spotreappeared and then after a random interval in the range150 to 3350 msec it dimmed to indicate that lickingresponses to a tube in front of the mouth would resultin the delivery of a reward The dimming period was1000 msec and after this the xation spot was switchedoff and reward availability terminated 500 msec later

The monkey was required to xate the xation spot andif it licked at any time other than when the spot wasdimmed saline instead of fruit juice was delivered fromthe tube If the eyes moved by more than 05deg from time0 until the start of the dimming period the trial wasaborted and the data for the trial were rejected When atrial aborted a high-frequency tone sounded for 05 secno reinforcement was available for that trial and theintertrial interval was lengthened from 8 to 11 sec

The criterion for the face-selective neurons analyzedin this study was that the response to the optimal stimu-lus should be at least twice that to the optimal nonfacestimulus and that this difference should be signicant(Rolls 1984 Rolls amp Toveacutee 1995 Rolls Treves Toveacutee ampPanzeri 1997) If the neuron satised the criterion it wastested with two to six of the effective face stimuli forthat neuron We checked that none of the selected cellshad any response to the mask presented alone

The transmitted information carried by neuronal firing rates about the stimuli was computed with the use of techniques that have been described fully previously (e.g., Rolls, Treves, Tovée, & Panzeri, 1997; Rolls & Treves, 1998) and have been used previously to analyze the responses of inferior temporal cortex neurons (Gawne & Richmond, 1993; Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée & Rolls, 1995; Tovée et al., 1993). In brief, the general procedure was as follows. The response r of a neuron to the presentation of a particular stimulus s was computed by measuring the firing rate of the neuron in a fixed time window after the stimulus presentation. (In this experiment the information in a number of different time windows was calculated.) Measured in this way, the firing rate responses take discrete rather than continuous values that consist of the number of spikes in the time window on a particular trial, and span a discrete set of responses R across all stimuli and trials. In the experiment, the number of trials available for each stimulus is limited (in the range of 6 to 30 in the present experiment). When calculating the information, the number of firing rate bins that can be used must be smaller than (or equal to) the number of trials available for each stimulus to prevent undersampling and incorrectly high values of calculated information (Rolls & Treves, 1998). We therefore quantized the measured firing rates into a smaller number of bins d. We chose here d in a way specific for every cell and every time window, according to the following: d was the number of trials per stimulus (or the number of different rates that actually occurred, if this was lower). This procedure is very effective in minimizing information loss due to overregularization of responses, while effectively controlling for finite sampling biases; see Golomb, Hertz, Panzeri, Treves, & Richmond (1997) and Panzeri & Treves (1996). After this response quantization, the experimental joint stimulus-response probability table P(s, r) was computed from the data (where P(r) and P(s) are the experimental probabilities of occurrence of responses and of stimuli, respectively), and the information I(S, R) transmitted by the neurons averaged across the stimuli was calculated by using the Shannon formula (Cover & Thomas, 1991; Shannon, 1948):

$$I(S, R) = \sum_{s, r} P(s, r) \log_2 \frac{P(s, r)}{P(s)\,P(r)}$$

and then subtracting the finite sampling correction of Panzeri and Treves (1996) to obtain estimates unbiased for the limited sampling. This leads to the information available in the firing rates about the stimulus. We did not calculate the additional information available from temporal firing patterns within the spike train because the additional information is low, often only 10 to 20%, and reflects mainly the onset latency of the neuronal response (Tovée et al., 1993; Tovée & Rolls, 1995).
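The procedure above (quantize the spike counts into d bins, build the joint table P(s, r), apply the Shannon formula, subtract a finite sampling correction) can be illustrated with a minimal Python sketch. Several details here are assumptions rather than the method of the paper: the function names are hypothetical, the bins are taken to be equal-width (the binning rule is not specified in the text), the smallest trial count is used when stimuli differ in their number of trials, and the bias term is the simple first-order form that counts occupied response bins. The correction of Panzeri and Treves (1996) estimates the number of relevant bins more carefully, so this sketch should not be read as a reimplementation of it.

```python
import math
from collections import Counter


def quantize(counts_by_stimulus):
    """Map raw spike counts to at most d bins.

    d = min(trials per stimulus, number of distinct counts observed).
    Equal-width binning is an assumption made for this sketch.
    """
    all_counts = [c for trials in counts_by_stimulus for c in trials]
    trials_per_stim = min(len(trials) for trials in counts_by_stimulus)
    d = min(trials_per_stim, len(set(all_counts)))
    lo, hi = min(all_counts), max(all_counts)
    if hi == lo:  # every trial gave the same count: a single bin
        return [[0] * len(trials) for trials in counts_by_stimulus], 1
    width = (hi - lo + 1) / d
    binned = [
        [min(int((c - lo) / width), d - 1) for c in trials]
        for trials in counts_by_stimulus
    ]
    return binned, d


def plugin_information(binned):
    """Shannon information I(S, R) in bits from the empirical joint table."""
    n_total = sum(len(trials) for trials in binned)
    joint, n_s, n_r = Counter(), Counter(), Counter()
    for s, trials in enumerate(binned):
        for r in trials:
            joint[(s, r)] += 1
            n_s[s] += 1
            n_r[r] += 1
    # P(s,r) / (P(s) P(r)) simplifies to n * N / (n_s * n_r).
    return sum(
        (n / n_total) * math.log2(n * n_total / (n_s[s] * n_r[r]))
        for (s, r), n in joint.items()
    )


def first_order_bias(binned):
    """Simplified first-order upward bias of the plug-in estimate, in bits.

    Uses occupied bins as a stand-in for 'relevant' bins (an assumption).
    """
    n_total = sum(len(trials) for trials in binned)
    bins_per_stim = sum(len(set(trials)) - 1 for trials in binned)
    bins_overall = len({r for trials in binned for r in trials}) - 1
    return (bins_per_stim - bins_overall) / (2 * n_total * math.log(2))


def corrected_information(counts_by_stimulus):
    """Plug-in information minus the simplified bias term."""
    binned, _ = quantize(counts_by_stimulus)
    return plugin_information(binned) - first_order_bias(binned)
```

With two stimuli whose spike counts never overlap, `plugin_information` returns 1 bit, the ceiling for a two-stimulus set; with identical response distributions it returns 0, and the bias term then makes the corrected estimate slightly negative, which is expected for an estimator that is unbiased on average.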

In the experiments described here, three cells were tested with six stimuli, one with four stimuli, five cells with three stimuli, and six cells with two stimuli. We note that the number of stimuli used here is smaller than that used in other experiments that applied information theoretic measures to the responses of inferior temporal cortex cells (Optican & Richmond, 1987; Rolls, Treves, & Tovée, 1997; Tovée et al., 1993; Tovée & Rolls, 1995). The reason for using fewer images in the experiments described here is that each image needed to be tested with five different masking conditions. There are two potential problems arising from a calculation of information from a limited set of stimuli. The first one is that with few stimuli, the full response space of the neuron may not be adequately sampled. We guarded against this by choosing images for the tests that elicited quite different responses from the cells and ensuring that the information measured with these images was high. The second possible problem is that the information measures may be distorted by the fact that there is a ceiling to the amount of information that can be provided about which of a set of stimuli has been seen, and this entropy is the logarithm of the number of stimuli in the set. For two stimuli, just 1 bit of information is all that could be provided by a cell. This ceiling could limit the information measurement obtained from neuronal responses if too few stimuli are used (Gawne, Kjaer, Hertz, & Richmond, 1996; Gawne & Richmond, 1993; Rolls & Treves, 1998; Rolls, Treves, & Tovée, 1997). For the analyses described here, we checked that the information available from the neuronal responses was always well below the ceiling, so that the information measure was not distorted. In fact, for the unmasked condition the average information from the neuronal responses was 0.3 bits, and this is much lower than the entropy of the sets of images, which varied for different cells between 1 bit and 2.6 bits. Finally, we note that each neuron was tested with the same stimuli across the different masking conditions, and therefore the comparison of the information values obtained under the different masking conditions is homogeneous.
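The ceiling discussed above can be checked with a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, and it assumes the stimuli in a set are presented equally often, so that the entropy of the set is the base-2 logarithm of its size.

```python
import math


def stimulus_set_entropy(n_stimuli):
    """Entropy in bits of a set of equiprobable stimuli (assumed equiprobable)."""
    return math.log2(n_stimuli)


# Set sizes used for the cells in this study: two, three, four, and six stimuli.
for n in (2, 3, 4, 6):
    print(n, round(stimulus_set_entropy(n), 2))
# prints:
# 2 1.0
# 3 1.58
# 4 2.0
# 6 2.58
```

The smallest and largest values correspond to the range of 1 bit to 2.6 bits quoted for the image sets, and the 0.3-bit average for the unmasked condition lies well below even the two-stimulus ceiling.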

Acknowledgments

This research was supported by Medical Research Council Programme Grant PG8513790 to E. T. Rolls. S. Panzeri is supported by an EC Marie Curie Research Training Grant ERBFMBICT972749.

Reprint requests should be sent to E. T. Rolls, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK, or via e-mail: Edmund.Rolls@psy.ox.ac.uk.

REFERENCES

Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley.

Dinse, H. R., & Kruger, K. (1994). The timing of processing along the visual pathway in the cat. NeuroReport, 5, 893–897.

Gawne, T. J., Kjaer, T. W., Hertz, J. A., & Richmond, B. J. (1996). Adjacent cortical complex cells share about 20% of their stimulus-related information. Cerebral Cortex, 6, 482–489.

Gawne, T. J., & Richmond, B. J. (1993). How independent are the messages carried by adjacent inferior temporal cortical neurons? Journal of Neuroscience, 13, 2758–2771.

Golledge, H. D. R., Hilgetag, C., & Tovée, M. J. (1996). Information processing: A solution to the binding problem? Current Biology, 6, 1092–1095.

Golomb, D., Hertz, J. A., Panzeri, S., Treves, A., & Richmond, B. J. (1997). How well can we estimate the information carried in neuronal responses from limited samples? Neural Computation, 9, 649–665.

Heller, J., Hertz, J. A., Kjaer, T. W., & Richmond, B. J. (1995). Information flow and temporal coding in primate pattern vision. Journal of Computational Neuroscience, 2, 175–193.

Humphreys, G. W., & Bruce, V. (1989). Visual cognition. Hove, Eng.: Erlbaum.

Kovacs, G., Vogels, R., & Orban, G. A. (1995). Cortical correlates of pattern backward-masking. Proceedings of the National Academy of Sciences, USA, 92, 5587–5591.

Maunsell, J. H. R., & Gibson, J. R. (1992). Visual response latencies in striate cortex of the macaque monkey. Journal of Neurophysiology, 68, 1332–1344.

Optican, L., & Richmond, B. J. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex: III. Information theoretical analysis. Journal of Neurophysiology, 57, 132–146.

Oram, M. W., & Perrett, D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70–84.

Panzeri, S., & Treves, A. (1996). Analytical estimates of limited sampling biases in different information measures. Network, 7, 87–107.

Perrett, D. I., Hietanen, J. K., Oram, M. W., & Benson, P. J. (1992). Organization and functions of cells in the macaque temporal cortex. Philosophical Transactions of the Royal Society, London B, 335, 23–50.

Raiguel, S. E., Lagae, L., Gulyas, B., & Orban, G. A. (1989). Response latencies of visual cells in macaque areas V1, V2 and V5. Brain Research, 493, 155–159.

Rieke, F., Warland, D., de Ruyter van Steveninck, R. R., & Bialek, W. (1996). Spikes: Exploring the neural code. Cambridge, MA: MIT Press.

Rolls, E. T. (1984). Neurons in the cortex of the temporal lobe and in the amygdala of the monkey with responses selective for faces. Human Neurobiology, 3, 209–222.

Rolls, E. T. (1992). Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Philosophical Transactions of the Royal Society, London B, 335, 11–21.

Rolls, E. T. (1994). Brain mechanisms for invariant visual recognition and learning. Behavioral Processes, 33, 113–138.

Rolls, E. T. (1995). Learning mechanisms in the temporal lobe visual cortex. Behavioral Brain Research, 66, 177–185.

Rolls, E. T., & Tovée, M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of backward masking. Proceedings of the Royal Society, London B, 257, 9–15.

Rolls, E. T., & Tovée, M. J. (1995). The sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73, 713–726.

Rolls, E. T., Tovée, M. J., Purcell, D. G., Stewart, A. L., & Azzopardi, P. (1994). The responses of neurons in the temporal cortex of primates and face identification and detection. Experimental Brain Research, 101, 473–484.

Rolls, E. T., & Treves, A. (1998). Neural networks and brain function. Oxford: Oxford University Press.

Rolls, E. T., Treves, A., & Tovée, M. J. (1997). The representational capacity of the distributed encoding of information provided by populations of neurons in the primate temporal visual cortex. Experimental Brain Research, 114, 149–162.

Rolls, E. T., Treves, A., Tovée, M., & Panzeri, S. (1997). Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. Journal of Computational Neuroscience, 4, 309–333.

Schiller, P. H. (1968). Single unit analysis of backward visual masking and metacontrast in the cat lateral geniculate nucleus. Vision Research, 8, 855–866.

Shannon, C. E. (1948). A mathematical theory of communication. AT&T Bell Laboratories Technical Journal, 27, 379–423.

Thorpe, S., Fize, D., & Mariot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Tovée, M. J. (1994). How fast is the speed of thought? Current Biology, 4, 1125–1127.

Tovée, M. J., & Rolls, E. T. (1992). Oscillatory activity is not evident in the primate temporal visual cortex with static visual stimuli. NeuroReport, 3, 369–372.

Tovée, M. J., & Rolls, E. T. (1995). Information encoding in short firing rate epochs by single neurons in the primate temporal visual cortex. Visual Cognition, 2, 35–58.

Tovée, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in the neurons of the primate temporal visual cortex. NeuroReport, 7, 2757–2760.

Tovée, M. J., Rolls, E. T., Treves, A., & Bellis, R. P. (1993). Information encoding and the responses of single neurons in the primate temporal visual cortex. Journal of Neurophysiology, 70, 640–654.

Treves, A. (1993). Mean-field analysis of neuronal spike dynamics. Network, 4, 259–284.

Vogel, R., & Orban, G. A. (1994). Activity of inferior temporal neurons during orientation discrimination with successively presented gratings. Journal of Neurophysiology, 71, 1428–1451.

Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.
