Visual Representation in the Wild: How Rhesus Monkeys Parse Objects

Yuko Munakata¹, Laurie R. Santos², Elizabeth S. Spelke², Marc D. Hauser², and Randall C. O’Reilly³

Abstract

Visual object representation was studied in free-ranging rhesus monkeys. To facilitate comparison with humans, and to provide a new tool for neurophysiologists, we used a looking time procedure originally developed for studies of human infants. Monkeys’ looking times were measured to displays with one or two distinct objects, separated or together, stationary or moving. Results indicate that rhesus monkeys used featural information to parse the displays into distinct objects, and they found events in which distinct objects moved together more novel or unnatural than events in which distinct objects moved separately. These findings show both commonalities and contrasts with those obtained from human infants. We discuss their implications for the development and neural mechanisms of higher-level vision.

INTRODUCTION

Physiological and anatomical studies in nonhuman primates have advanced the understanding of the primate visual system and revealed detailed homologies between human and nonhuman primate vision (Tootell, Dale, Sereno, & Malach, 1996; Sereno, Dale, & Tootell, 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone, Albright, Gross, & Bruce, 1984). Fundamental questions nevertheless remain concerning the similarities and differences between the visual representations that humans and monkeys form. For example, to what degree do nonhuman primates share our human propensity to develop taxonomies of objects, treating each living object as a member of a given species and each artifact as a tool with a specific function? Answering such questions is critical to understanding the relation between visual cognition in human and nonhuman primates and to interpreting data on the physiology and anatomy of primate vision.

Here we report four experiments investigating how semi–free-ranging rhesus monkeys form representations of and inferences about visible objects presented under natural conditions. Our experiments use methods that require no training, that allow direct comparisons with studies of humans both before and after the acquisition of language, and that could be adapted to permit simultaneous behavioral and neuronal recordings. We examine rhesus monkeys’ parsing of visual displays into objects and their sensitivity to the natural motions of those objects, using a looking time procedure that was developed for studies of human infants (e.g., Fantz, 1961), adapted for studies of object perception in human infancy (e.g., Spelke, Breinlinger, Jacobson, & Phillips, 1993), and applied to both free-ranging rhesus monkeys and captive cotton-top tamarins with considerable success (Hauser, MacNeilage, & Ware, 1996; Hauser, 1998).

The Development of Object Parsing in Humans

One motivation for the current work comes from developmental studies of object perception in human infancy. Infants perceive objects by using information about the three-dimensional arrangements and motions of visible surfaces (hereafter, spatiotemporal information) before they use information about the colors, textures, and curvature of surfaces (hereafter, featural information) (Xu, Carey, & Welch, 1999; Needham & Baillargeon, 1998; Spelke et al., 1993; Kestenbaum, Termine, & Spelke, 1987; von Hofsten & Spelke, 1985). Further, the ability to use featural information to represent distinct objects emerges at the same time as the first names for objects. This correlation raises questions about the relation between visual representation and language. Does the acquisition of a natural language, or the emergence of related symbolic capacities, lead humans to represent objects in a way that is unique to our species? Alternatively, do human and nonhuman primates form homologous object representations that emerge in humans at about 1 year of life? The current studies attempt to distinguish these possibilities by testing adult nonhuman primates’ abilities to perceive objects using featural information.

¹ University of Denver, ² Harvard University, ³ University of Colorado

© 2001 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 13:1, pp. 44–58

Much of the evidence concerning the development of object representations in humans comes from preferential looking experiments, in which infants are presented repeatedly with a visual display until their attention to the display declines, and then they are presented with new displays. Looking times to the new displays are measured, on the assumption that infants will look longer at displays they perceive to be novel or unnatural (Baillargeon, 1995; Fantz, 1964). These looking times therefore provide evidence concerning infants’ representations of all the displays and their (perhaps tacit) expectations about how the displays may change (Spelke, 1985).

One set of preferential looking studies provides the background for the present experiments. Xu et al. (1999) presented 10- and 12-month-old infants with an array that adults perceive as one meaningful object on top of another (e.g., a duck resting on a car) on a flat supporting surface (Figure 1). In one condition, the objects were stationary; in the other condition, the top object was moved relative to the bottom object during the initial familiarization. After infants had viewed the stationary or moving display repeatedly, they were shown two test events in which a hand grasped the top object and lifted it. In one event, the top object rose into the air while the bottom object remained on the supporting surface (left-hand side of Figure 1); in the other event, both objects moved upward together (right-hand side of Figure 1). Looking times to the outcomes of these events were compared to each other and to the looking times of infants in a baseline condition, who viewed the same outcome displays but received no initial familiarization with the objects and viewed no object motion. Relative to this baseline measure of the displays’ intrinsic attractiveness to infants, the 12-month-old infants in the main experiment looked longer at the test outcome in which the two objects had moved together, both after familiarization with moving objects and after familiarization with stationary objects. This looking pattern provides evidence that infants had parsed both of the initial displays into two bounded objects and that they represented each object as separately movable and manipulable. The 10-month-old infants showed the same looking patterns in the condition in which the objects initially were presented in motion, but not in the condition in which the objects initially were stationary. These findings provide evidence that 10-month-old infants used spatiotemporal information, but not featural information specifying object kind, to represent object boundaries.

Xu et al.’s (1999) findings accord with research using variants of this method with younger infants and simpler object displays. Infants 3 to 5 months old have been found to parse two adjacent objects, such as blocks and cones, into two separately movable, manipulable units if the objects are separated in depth or undergo separate movement (Spelke, Hofsten, & Kestenbaum, 1989; Kestenbaum et al., 1987; von Hofsten & Spelke, 1985), but not if the objects are stationary, adjacent, and distinguishable only by their different surface texture, coloring, and curvature (Needham & Baillargeon, 1998; Spelke et al., 1993). Because infants are known to be sensitive to the latter featural information (see Kellman & Arterberry, 1998, for review), one interpretation of these findings is that infants’ representations of objects depend on a modular system, which operates on spatiotemporal but not featural information (e.g., Scholl & Leslie, in press; Bertenthal, 1996; Spelke & Van de Walle, 1993). Alternatively, infants may use both spatiotemporal and featural information to parse objects, but they may be less sensitive to the latter (Needham, 1997; Johnson & Aslin, 1996). In any case, a change appears to occur at the end of the first year, when infants first reliably demonstrate the use of featural information to specify the boundaries of adjacent objects.¹ This change coincides with the emergence of the first names for objects (Xu & Carey, 1996; Xu et al., 1999).

Figure 1. Displays testing infants’ sensitivity to spatiotemporal or featural information. (Reprinted from Cognition, 70, Xu et al., Infants’ ability to use object kind information for object individuation, 137–166, 1999, with permission from Elsevier Science.)

One exception to the general rule that young infants ignore featural information in parsing objects concerns their responses to humans and human body parts, especially hands. By 6 months of age, infants treat hands as distinct from inanimate objects and have different expectations about how hands and inanimate objects should behave. In particular, infants view the motions of hands as goal-directed (Woodward, 1998); they anticipate that a hand can pick up an object but that one object cannot pick up another (Leslie, 1982); and they appreciate, on some level, that an object held by a hand requires no further support (Needham & Baillargeon, 1993). Infants’ sensitivity to relationships between hands and inanimate objects may form an important precursor to the development of tool use, which is largely learned by observation in young children (Nagell, Olguin, & Tomasello, 1993; Tomasello, Kruger, & Ratner, 1993; Meltzoff, 1988). It is not known, however, whether this early-developing sensitivity is unique to humans or causally involved in later-developing object representations.

Object Representations in Monkeys

Hauser et al. (1996), Hauser (1998), and Hauser and Carey (1998) have adapted the preferential looking method for studies of object representations in both semi–free-ranging and captive monkeys. In Hauser’s studies, adult monkeys are given abbreviated versions of the preferential looking experiments used with infants, with brief familiarization periods and brief, fixed-duration test trials. Monkeys typically are familiarized and tested with events involving food items, because these objects elicit high levels of spontaneous attention (Hauser et al., 1996; Hauser & Carey, 1998; though see Hauser, 1998; Hauser & Williams, submitted, for cases where monkeys have been tested successfully with nonfood items). Like human infants, adult monkeys have been found to show higher levels of spontaneous looking time when they view certain events that human adults find novel or unnatural (e.g., events in which an object appears to vanish after moving behind a screen), even when care is taken to match the natural and unnatural events on a variety of other dimensions (see Hauser & Carey, 1998, for discussion). Hauser’s findings suggest that the preferential looking methods developed for studies of human infants can serve to assess object representations in monkeys as well, allowing systematic comparisons of high-level visual abilities in monkeys and humans.

Additional reasons for using preferential looking methods to test for monkeys’ object representations relate to the extensive anatomical, physiological, and behavioral literature on the visual representations of nonhuman primates and, in particular, rhesus monkeys. A wealth of studies provide evidence for homologous mechanisms subserving visual recognition in rhesus monkeys and humans (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984). Nevertheless, progress has been slow in understanding how this recognition system works and in gaining insights into both the commonalities and differences between human and nonhuman object representation. A standard physiological study on rhesus monkeys might demonstrate that neurons in a given brain area become active when the monkeys see a given display after extensive training with the display. Such data, however, do not reveal the cognitive and behavioral functions of the neurons activated by the visual display, nor the ways in which monkeys use these patterns of neural activation to interpret the world. Moreover, such data make only limited contact with the rich literature in cognitive psychology and cognitive neuroscience on object representations in humans. Do monkeys parse visual displays into objects that can be separately recognized and manipulated? What expectations do monkeys form about events involving objects?

Three features of existing studies of the neural mechanisms of visual representations may have hindered progress in answering these questions. First, with the notable exception of studies of face recognition (e.g., Perrett, Mistlin, & Chitty, 1987), neurophysiological studies of visual processing in monkeys have tended to use two-dimensional, arbitrary stimuli of no ecological significance. Because monkeys’ abilities to recognize objects surely have evolved to solve problems such as finding food, avoiding obstacles, and using landmarks to recognize significant places, the mechanisms subserving these abilities may not be best revealed by studies of the responses of neurons to two-dimensional geometric figures or alphanumeric characters. Second, the monkeys in neurophysiological studies typically are given extensive training with an arbitrary set of objects before physiological recording begins. It is not clear whether training regimes lasting a year or more reveal monkeys’ ordinary capacities for detecting and remembering natural objects under conditions of incidental viewing, or their ad hoc strategies adopted to solve the problem at hand (see discussion in Rao, Rainer, & Miller, 1997). Third, the methods used to study monkeys often are quite different from those used with humans. These differences complicate comparisons across species and hinder attempts to trace the evolutionary origins of human capacities in our common primate heritage (Hauser & Carey, 1998).

The present studies investigate monkeys’ representations of natural, ecologically significant objects (food items) undergoing ecologically significant events (grasping and lifting). Moreover, they use a method that has been used extensively with humans of all ages, requires no training, and can be administered to free-ranging as well as captive animals. For these reasons, the studies promise to shed light on the representations of objects that monkeys form spontaneously, paving the way for simultaneous behavioral and neurophysiological studies of the mechanisms of object representations in untrained animals, under conditions allowing systematic comparison with humans.

In the present studies, we used a variant of the method of Xu et al. (1999) to investigate whether untrained, free-ranging adult rhesus monkeys, with a mature visual system but no spontaneous tool use and at best a limited capacity for language, perceive object boundaries as older human infants do. Monkeys were presented with displays containing two novel food objects that were stationary and adjacent. They then viewed the lifting of the top object or of both objects together on two separate trials, and their looking times to the outcomes of these events were recorded. In one condition, the lifting of the objects was accomplished by a single human hand that grasped only the top object. In a second condition, the lifting was accomplished by two hands, each grasping one object and moving the objects together. If monkeys perceived the object boundaries as older infants do, they should look longer at the outcome of the event in which the two objects moved together in the one-hand events. This tendency might be attenuated in the two-hand events, because each object was lifted by a supporting hand. Such findings would suggest that human and nonhuman primates have homologous representations of objects as movable and manipulable units, and that both species distinguish hands from other objects and are sensitive to the functions of hands in supporting and moving objects.

EXPERIMENT 1

In our first experiment, monkeys were presented with two novel food items (either a pumpkin and a piece of ginger root, or a pepper and a sweet potato) on the floor of a stage (Figure 2). After a monkey had observed one item sitting on top of the other item for at least 1 sec, a hand grasped the top item and lifted it. On different trials, either the top item moved alone while the bottom item remained on the stage floor (“separate” trial) or the two items moved together (“together” trial). At the end of the movement, the hand and objects remained stationary at their final positions for 10 sec, and the monkey’s looking time was recorded. Looking times to the two test displays were compared to investigate whether monkeys looked longer at the event outcome on the “together” trial, a preference that would suggest that they perceived two separately movable objects in the original display.

To investigate monkeys’ sensitivity to the role of human hands in supporting and manipulating objects, the test events were presented in two different ways to different groups of subjects. For half the subjects, both events were produced by a single hand that grasped only the top food object (hold-top). In this condition, the bottom food object appeared to human observers to rest naturally on the display floor on the “separate” trial and to move unnaturally with the top object on the “together” trial. For the remaining subjects, each food object was grasped by a different hand, such that both objects appeared to be adequately supported on both trials (hold-both). Comparisons of monkeys’ looking preferences across these conditions should reveal whether monkeys take account of the support function of human hands in representing object motions.

Figure 2. Experimental set-up. The experimenter positioned the apparatus 2–5 m away from the test monkey. The apparatus consisted of a stage and a screen that could block the view of the stage and store the food stimulus items for the study. The screen was placed behind the stage during test trials, as shown in the figure.

Results

Figure 3a and b present the findings from this experiment. Monkeys looked longer at “together” events (5.1 sec, SE = .3 sec) than at “separate” events (4.1 sec, SE = .3 sec), F(1, 57) = 9.2, p < .005. This effect was equally strong in the hold-top and hold-both conditions, yielding no interaction of Condition by Display (F = 0). Of the 59 monkeys tested, 42 looked longer at “together” events and 17 looked longer at “separate” events, χ² = 11, p < .005. The main effect of Condition was not statistically significant (F < 2).
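As an illustration of the count-based test reported above, the following sketch recomputes a chi-square goodness-of-fit statistic for the 42-to-17 split, assuming the null hypothesis is an even split between the two trial types and that the test has one degree of freedom. The paper does not state these details, so both are assumptions, and the function name `chi_square_equal_split` is hypothetical. Under these assumptions the statistic comes out near 10.6, which is consistent with the reported value of 11 after rounding.

```python
import math


def chi_square_equal_split(n_first, n_second):
    """Chi-square goodness-of-fit test against an even 50/50 split.

    Assumption (not stated in the paper): the null hypothesis is that
    monkeys are equally likely to look longer at either event, so the
    expected count in each cell is half the total.
    """
    total = n_first + n_second
    if total == 0:
        raise ValueError("at least one observation is required")
    expected = total / 2
    chi2 = ((n_first - expected) ** 2 + (n_second - expected) ** 2) / expected
    # With 1 degree of freedom, the upper-tail probability of a chi-square
    # variable equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Experiment 1: 42 monkeys looked longer at "together", 17 at "separate".
chi2, p_value = chi_square_equal_split(42, 17)
print(f"chi-square(1) = {chi2:.1f}, p = {p_value:.4f}")
```

The same function could be applied to the preference counts from the later experiments; a monkey that looks equally long at both events would have to be excluded first, since a tie fits neither cell.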

Discussion

These results show that rhesus monkeys look longer when two bounded objects move together than when one of the two objects moves separately from the other. Like human infants, monkeys appear to parse arrays into bounded objects, and they represent these objects as independently movable and manipulable. Moreover, monkeys and infants alike appear to look longer at events in which two perceptually bounded objects move and behave as a single unit, suggesting that they find such events to be novel or unnatural. Monkey and human object representations therefore appear to be similar and to be testable by similar methods, in accord with previous findings (Hauser et al., 1996).

The experiment also revealed two differences between the object representations of adult monkeys and young human infants. First, infants as young as 6 months take account of the supporting role of hands in lifting and moving objects, but the monkeys in Experiment 1 showed no sensitivity to hands. Informal observations suggested that the monkeys were highly attentive to the food objects but oblivious to the human hands that held and manipulated them. Second, infants below 12 months do not consistently perceive the boundary between two adjacent objects that are stationary, even when the objects belong to different familiar kinds. Because the initial displays in the present studies contained objects that were adjacent and underwent no relative motion, the present findings suggest that the rhesus monkeys used featural information to parse the visual display into distinct objects.

Figure 3. Results from Experiments 1 and 2. Rhesus monkeys look longer when two distinct objects move together than when one of the two objects moves separately, both (a) when one hand holds and lifts the top object (Experiment 1, hold-top condition) and (b) when two hands hold both objects (Experiment 1, hold-both condition). In contrast, (c) rhesus do not distinguish “together” from “separate” displays in their looking times when the distinct objects are stationary (Experiment 2). [Bar chart; y-axis: looking time (sec), 0 to 6; panels a, b, and c.]

According to this object-parsing interpretation, two factors are critical to the rhesus’ responses: the distinct features of the objects, and their distinct or common motions. Because these two factors were not manipulated separately in the test displays in Experiment 1, however, there are two alternative accounts of these findings, each of which discredits one of the factors critical to the object-parsing account. According to one alternative account, monkeys simply find displays with distinct features in spatial proximity more interesting than displays with distinct features in more distant locations. Monkeys may look longer at the outcome of the “together” trial because there are more features clustered together than in the outcome of the “separate” trial; the motion of the objects may be completely irrelevant to monkeys’ looking behavior. Experiment 2 tests this alternative interpretation of the findings by comparing monkeys’ looking times to the same two outcome displays with no preceding motion. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 2, as in Experiment 1. If the object-parsing interpretation is correct, in contrast, monkeys should not show the same preference for the “together” outcome in Experiment 2.

The second alternative interpretation of the data from Experiment 1 is that monkeys’ looking times to different event outcomes depend on how much motion preceded those outcomes. According to this account, monkeys looked longer at the outcome of the “together” trial because a greater volume of food moved during the event that preceded the recording of their looking time. On this interpretation, the distinct features of the objects and the representation of object boundaries are irrelevant to monkeys’ looking behavior. Experiment 3 tests this alternative interpretation by presenting monkeys with “together” and “separate” trials in which a single object moves as a whole or splits apart. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 3, as in Experiment 1, because in both experiments the “together” outcome was preceded by motion of a greater volume of food. In contrast, the object-parsing interpretation predicts that monkeys will not show the same preference for the “together” outcome in Experiment 3, without the distinct featural information.

EXPERIMENT 2

A new group of monkeys was presented with the two outcome displays from Experiment 1, without any prior exposure to the food objects or to their motions. Because hands were found not to influence monkeys’ looking patterns in Experiment 1, all the outcome displays presented two food items held by one hand. On one trial (together), a hand held both food items in the air by grasping both objects at once, one atop the other. On the other trial (separate), a hand held one food item in the air while the other food item rested on the display floor. Looking times to the two test outcomes were compared to each other and to monkeys’ looking times to the same outcome displays in Experiment 1. If the looking preference for the “together” outcome in Experiment 1 reflected monkeys’ parsing of the initial arrays into two objects and their expectation that the two objects would move independently, then that preference should be absent or attenuated in Experiment 2.

Results

Figure 3c presents the principal findings of Experiment 2. With stationary objects, monkeys looked equally long at “together” events (4.3 sec, SE = .4 sec) and “separate” events (4.7 sec, SE = .4 sec), F(1, 27) = 1.5, p > .22. Of the 28 monkeys tested, nine looked longer at the “together” event, 18 looked longer at the “separate” event, and one individual looked equally at both events (χ² = 3.0, nonsignificant). The “tie” data point from the single individual was dropped for the χ² analysis.

An analysis comparing looking times in Experiments 1 and 2 revealed a significant interaction between trial type (together vs. separate) and experiment (1 vs. 2), F(1, 85) = 7.4, p = .008. Monkeys showed a greater looking preference for the “together” outcome display in Experiment 1.

Discussion

In Experiment 2, rhesus monkeys looked no longer at a display in which two different objects were held in the air together than at a display in which one object was held in the air while the other object rested on the display floor. These findings contrast with the results from Experiment 1, which focused on monkeys’ looking times to these same displays after prior exposure to two adjacent objects and to their common or separate motions. These findings challenge one alternative explanation for the looking preferences in Experiment 1 and support our object-parsing interpretation of those looking preferences, whereby monkeys parse visual displays into distinct objects based on featural information and find events in which distinct objects move together more interesting than events in which distinct objects move separately.


However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor, a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys’ preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys’ parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the “separate” outcome display (4.5 sec, SE = .4 sec) than at the “together” outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the “separate” display and 11 looked longer at the “together” display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Panels a and b; vertical axis: Looking time (sec), 0 to 6.]

50 Journal of Cognitive Neuroscience Volume 13 Number 1

looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display, χ² = 4.9, p < .05.
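The counts of monkeys preferring each display can be checked with a short calculation. The sketch below assumes that the reported χ² values are one-degree-of-freedom goodness-of-fit statistics against an even split, with no continuity correction; the text does not state which variant was used, and the function name is ours. Under that assumption the statistic rounds to the reported 2.1, 2.8, and 4.9 for Experiment 3, Experiment 4, and the combined sample.

```python
import math


def chi_square_equal_split(n_first, n_second):
    """Chi-square goodness-of-fit statistic (1 df) against a 50/50 split.

    Assumption: no continuity correction is applied.
    """
    total = n_first + n_second
    expected = total / 2
    stat = ((n_first - expected) ** 2 + (n_second - expected) ** 2) / expected
    # With 1 df, the upper-tail probability has a closed form via erfc.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts of monkeys looking longer at "separate" vs. "together" (from the text).
samples = [
    ("Experiment 3", 19, 11),
    ("Experiment 4", 27, 16),
    ("Experiments 3 and 4", 46, 27),
]

for label, n_separate, n_together in samples:
    stat, p = chi_square_equal_split(n_separate, n_together)
    print(f"{label}: chi-square = {stat:.1f}, p = {p:.3f}")
```

Only the combined sample falls below the .05 criterion, which matches the pattern of significance reported above.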

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. It instead supports the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: Artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1: Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive with the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and ultimately to humans’ superior tool use (see Note 2).

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features, such as edge alignment and texture similarity, to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features, such as surface coloring and shape, as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
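The proposed comparison of a two-object response with the sum or average of the single-object responses can be made concrete with a small sketch. Everything here is hypothetical: the firing rates are invented for illustration, the function names are ours, and the Euclidean distance is just one simple descriptive measure. A real analysis would also need trial-to-trial variability and a statistical test, which this sketch leaves out.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length response vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def compare_pair_response(pair, obj_a, obj_b):
    """Distance of a two-object population response from two simple predictions.

    Each argument is a list of mean firing rates, one entry per neuron.
    """
    summed = [a + b for a, b in zip(obj_a, obj_b)]
    averaged = [(a + b) / 2 for a, b in zip(obj_a, obj_b)]
    return {
        "distance_to_sum": euclidean(pair, summed),
        "distance_to_average": euclidean(pair, averaged),
    }


# Hypothetical mean firing rates (spikes/sec) for four neurons.
pepper_alone = [12.0, 3.0, 8.0, 20.0]
potato_alone = [4.0, 15.0, 6.0, 2.0]
pepper_on_potato = [9.0, 10.0, 7.5, 12.0]

print(compare_pair_response(pepper_on_potato, pepper_alone, potato_alone))
```

Small distances to either prediction would be in line with encoding the display in terms of its two separate objects; large distances to both would point toward a different encoding of the combined display.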

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age >4 years) and half adult females (age >3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide × 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
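One way to picture this counterbalancing is to enumerate the cells that a full crossing would produce. The sketch below is our reading of the design, not a record of the actual assignment: it assumes that within each holding condition, the pair shown in the “together” trial and the trial type shown first are crossed, giving four cells per condition. The names are ours, and the number of monkeys per cell is not given in the text.

```python
from itertools import product

CONDITIONS = ("hold-top", "hold-both")
PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def other(options, choice):
    """Return the element of a two-item tuple that is not `choice`."""
    return options[1] if choice == options[0] else options[0]


def counterbalancing_cells():
    """Enumerate (condition, first trial, second trial) for a full crossing.

    Each trial is a (trial type, food pair) tuple; the two trials of a cell
    always differ in both trial type and food pair.
    """
    cells = []
    for condition, together_pair, first_type in product(CONDITIONS, PAIRS, TRIAL_TYPES):
        pair_for = {
            "together": together_pair,
            "separate": other(PAIRS, together_pair),
        }
        second_type = other(TRIAL_TYPES, first_type)
        cells.append((
            condition,
            (first_type, pair_for[first_type]),
            (second_type, pair_for[second_type]),
        ))
    return cells


for cell in counterbalancing_cells():
    print(cell)
```

Under these assumptions the crossing yields eight cells in total, four for each holding condition.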

Procedure

All testing was conducted by one experimenter and one camcorder operator; a test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
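The agreement figure above is a correlation between two coders’ looking-time judgments. As an illustration, the sketch below computes a Pearson correlation on hypothetical values; we assume Pearson’s r because the text says only “correlation,” and the numbers are invented, not the coders’ actual judgments.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical total looking times (sec) on eight trials, one list per coder.
coder_a = [4.2, 3.1, 5.0, 2.6, 6.3, 4.8, 3.9, 5.5]
coder_b = [4.0, 3.4, 5.2, 2.9, 6.0, 4.5, 4.1, 5.8]

print(f"inter-coder correlation: r = {pearson_r(coder_a, coder_b):.2f}")
```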

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
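With a single within-subjects factor that has only two levels, a repeated-measures ANOVA reduces to a paired comparison: the F statistic equals the square of the paired t statistic computed on each monkey’s difference score, with 1 and n − 1 degrees of freedom. The sketch below illustrates that identity on invented looking times; the function name and data are ours, and it does not reproduce any value reported in the paper.

```python
import math


def within_subject_f(together, separate):
    """F statistic for a two-level within-subjects factor.

    Computed as the squared paired t on per-subject difference scores.
    Returns (F, (df_effect, df_error)).
    """
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_diff / math.sqrt(var_diff / n)
    return t_stat ** 2, (1, n - 1)


# Hypothetical looking times (sec) for six monkeys, one value per display.
together = [4.1, 3.6, 5.2, 2.9, 4.4, 3.8]
separate = [3.9, 4.0, 4.6, 3.1, 4.5, 3.2]

f_value, df = within_subject_f(together, separate)
print(f"F{df} = {f_value:.2f}")
```

The error degrees of freedom follow the sample size in the same way as in the reported tests, for example F(1, 29) with 30 monkeys and F(1, 42) with 43 monkeys.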

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1, except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

56 Journal of Cognitive Neuroscience Volume 13, Number 1

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple, adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.

Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.

guish these possibilities by testing adult nonhuman primates’ abilities to perceive objects using featural information.

Much of the evidence concerning the development of object representations in humans comes from preferential looking experiments, in which infants are presented repeatedly with a visual display until their attention to the display declines, and then they are presented with new displays. Looking times to the new displays are measured on the assumption that infants will look longer at displays they perceive to be novel or unnatural (Baillargeon, 1995; Fantz, 1964). These looking times therefore provide evidence concerning infants’ representations of all the displays and their (perhaps tacit) expectations about how the displays may change (Spelke, 1985).

One set of preferential looking studies provides the background for the present experiments. Xu et al. (1999) presented 10- and 12-month-old infants with an array that adults perceive as one meaningful object on top of another (e.g., a duck resting on a car) on a flat supporting surface (Figure 1). In one condition, the objects were stationary; in the other condition, the top object was moved relative to the bottom object during the initial familiarization. After infants had viewed the stationary or moving display repeatedly, they were shown two test events in which a hand grasped the top object and lifted it. In one event, the top object rose into the air while the bottom object remained on the supporting surface (left-hand side of Figure 1); in the other event, both objects moved upward together (right-hand side of Figure 1). Looking times to the outcomes of these events were compared to each other and to the looking times of infants in a baseline condition, who viewed the same outcome displays but received no initial familiarization with the objects and viewed no object motion. Relative to this baseline measure of the displays’ intrinsic attractiveness to infants, the 12-month-old infants in the main experiment looked longer at the test outcome in which the two objects had moved together, both after familiarization with moving objects and after familiarization with stationary objects. This looking pattern provides evidence that infants had parsed both of the initial displays into two bounded objects, and that they represented each object as separately movable and manipulable. The 10-month-old infants showed the same looking patterns in the condition in which the objects initially were presented in motion, but not in the condition in which the objects initially were stationary. These findings provide evidence that 10-month-old infants used spatiotemporal information, but not featural information specifying object kind, to represent object boundaries.

Xu et al.’s (1999) findings accord with research using variants of this method with younger infants and simpler object displays. Infants 3 to 5 months old have been found to parse two adjacent objects, such as blocks and cones, into two separately movable, manipulable units if the objects are separated in depth or undergo separate movement (Spelke, Hofsten, & Kestenbaum, 1989; Kestenbaum et al., 1987; von Hofsten & Spelke, 1985), but not if the objects are stationary, adjacent, and distinguishable only by their different surface texture, coloring, and curvature (Needham & Baillargeon, 1998; Spelke et al., 1993). Because infants are known to be sensitive to the latter featural information (see Kellman & Arterberry, 1998, for review), one interpretation of these findings is that infants’ representations of objects depend on a modular system, which operates on spatiotemporal but not featural information (e.g., Scholl & Leslie, in press; Bertenthal, 1996; Spelke & Van de Walle, 1993). Alternatively, infants may use both spatiotemporal and featural information to parse objects, but they may be less sensitive to the latter (Needham, 1997; Johnson & Aslin, 1996). In any case, a change appears to occur at the end of the first year, when infants first reliably demonstrate the use of featural information to specify the boundaries of adjacent objects.¹ This change coincides with the emergence of the first names for objects (Xu & Carey, 1996; Xu et al., 1999).

Figure 1. Displays testing infants’ sensitivity to spatiotemporal or featural information. (Reprinted from Cognition, 70, Xu et al., Infants’ ability to use object kind information for object individuation, 137–166, 1999, with permission from Elsevier Science.)

One exception to the general rule that young infants ignore featural information in parsing objects concerns their responses to humans and human body parts, especially hands. By 6 months of age, infants treat hands as distinct from inanimate objects and have different expectations about how hands and inanimate objects should behave. In particular, infants view the motions of hands as goal-directed (Woodward, 1998), they anticipate that a hand can pick up an object but that one object cannot pick up another (Leslie, 1982), and they appreciate on some level that an object held by a hand requires no further support (Needham & Baillargeon, 1993). Infants’ sensitivity to relationships between hands and inanimate objects may form an important precursor to the development of tool use, which is largely learned by observation in young children (Nagell, Olguin, & Tomasello, 1993; Tomasello, Kruger, & Ratner, 1993; Meltzoff, 1988). It is not known, however, whether this early-developing sensitivity is unique to humans or causally involved in later-developing object representations.

Object Representations in Monkeys

Hauser et al. (1996), Hauser (1998), and Hauser and Carey (1998) have adapted the preferential looking method for studies of object representations in both semi–free-ranging and captive monkeys. In Hauser’s studies, adult monkeys are given abbreviated versions of the preferential looking experiments used with infants, with brief familiarization periods and brief, fixed-duration test trials. Monkeys typically are familiarized and tested with events involving food items, because these objects elicit high levels of spontaneous attention (Hauser et al., 1996; Hauser & Carey, 1998; though see Hauser, 1998; Hauser & Williams, submitted, for cases where monkeys have been tested successfully with nonfood items). Like human infants, adult monkeys have been found to show higher levels of spontaneous looking time when they view certain events that human adults find novel or unnatural (e.g., events in which an object appears to vanish after moving behind a screen), even when care is taken to match the natural and unnatural events on a variety of other dimensions (see Hauser & Carey, 1998, for discussion). Hauser’s findings suggest that the preferential looking methods developed for studies of human infants can serve to assess object representations in monkeys as well, allowing systematic comparisons of high-level visual abilities in monkeys and humans.

Additional reasons for using preferential looking methods to test for monkeys’ object representations relate to the extensive anatomical, physiological, and behavioral literature on the visual representations of nonhuman primates and, in particular, rhesus monkeys. A wealth of studies provide evidence for homologous mechanisms subserving visual recognition in rhesus monkeys and humans (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984). Nevertheless, progress has been slow in understanding how this recognition system works and in gaining insights into both the commonalities and differences between human and nonhuman object representation. A standard physiological study on rhesus monkeys might demonstrate that neurons in a given brain area become active when the monkeys see a given display after extensive training with the display. Such data, however, do not reveal the cognitive and behavioral functions of the neurons activated by the visual display, nor the ways in which monkeys use these patterns of neural activation to interpret the world. Moreover, such data make only limited contact with the rich literature in cognitive psychology and cognitive neuroscience on object representations in humans. Do monkeys parse visual displays into objects that can be separately recognized and manipulated? What expectations do monkeys form about events involving objects?

Three features of existing studies of the neural mechanisms of visual representations may have hindered progress in answering these questions. First, with the notable exception of studies of face recognition (e.g., Perrett, Mistlin, & Chitty, 1987), neurophysiological studies of visual processing in monkeys have tended to use two-dimensional, arbitrary stimuli of no ecological significance. Because monkeys’ abilities to recognize objects surely have evolved to solve problems such as finding food, avoiding obstacles, and using landmarks to recognize significant places, the mechanisms subserving these abilities may not be best revealed by studies of the responses of neurons to two-dimensional geometric figures or alphanumeric characters. Second, the monkeys in neurophysiological studies typically are given extensive training with an arbitrary set of objects before physiological recording begins. It is not clear whether training regimes lasting a year or more reveal monkeys’ ordinary capacities for detecting and remembering natural objects under conditions of incidental viewing, or their ad hoc strategies adopted to solve the problem at hand (see discussion in Rao, Rainer, & Miller, 1997). Third, the methods used to study monkeys often are quite different from those used with humans. These differences complicate comparisons across species and hinder attempts to trace the evolutionary origins of human capacities in our common primate heritage (Hauser & Carey, 1998).

The present studies investigate monkeys’ representations of natural, ecologically significant objects (food items) undergoing ecologically significant events (grasping and lifting). Moreover, they use a method that has been used extensively with humans of all ages, requires no training, and can be administered to free-ranging as well as captive animals. For these reasons, the studies promise to shed light on the representations of objects that monkeys form spontaneously, paving the way for simultaneous behavioral and neurophysiological studies of the mechanisms of object representations in untrained animals, under conditions allowing systematic comparison with humans.

In the present studies, we used a variant of the method of Xu et al. (1999) to investigate whether untrained, free-ranging adult rhesus monkeys, with a mature visual system but no spontaneous tool use and at best a limited capacity for language, perceive object boundaries as older human infants do. Monkeys were presented with displays containing two novel food objects that were stationary and adjacent. They then viewed the lifting of the top object or of both objects together, on two separate trials, and their looking times to the outcomes of these events were recorded. In one condition, the lifting of the objects was accomplished by a single human hand that grasped only the top object. In a second condition, the lifting was accomplished by two hands, each grasping one object and moving the objects together. If monkeys perceived the object boundaries as older infants do, they should look longer at the outcome of the event in which the two objects moved together in the one-hand events. This tendency might be attenuated in the two-hand events, because each object was lifted by a supporting hand. Such findings would suggest that human and nonhuman primates have homologous representations of objects as movable and manipulable units, and that both species distinguish hands from other objects and are sensitive to the functions of hands in supporting and moving objects.

EXPERIMENT 1

In our first experiment, monkeys were presented with two novel food items (either a pumpkin and a piece of ginger root, or a pepper and a sweet potato) on the floor of a stage (Figure 2). After a monkey had observed one item sitting on top of the other item for at least 1 sec, a hand grasped the top item and lifted it. On different trials, either the top item moved alone while the bottom item remained on the stage floor (“separate trial”) or the two items moved together (“together trial”). At the end of the movement, the hand and objects remained stationary at their final positions for 10 sec, and the monkey’s looking time was recorded. Looking times to the two test displays were compared to investigate whether monkeys looked longer at the event outcome on the “together” trial, a preference that would suggest that they perceived two separately movable objects in the original display.

To investigate monkeys’ sensitivity to the role of human hands in supporting and manipulating objects, the test events were presented in two different ways to different groups of subjects. For half the subjects, both events were produced by a single hand that grasped only the top food object (hold-top). In this condition, the bottom food object appeared to human observers to rest naturally on the display floor on the “separate” trial and to move unnaturally with the top object on the “together” trial. For the remaining subjects, each food object was grasped by a different hand, such that both objects appeared to be adequately supported on both trials (hold-both). Comparisons of monkeys’ looking preferences across these conditions should reveal whether monkeys take account of the support function of human hands in representing object motions.

Figure 2. Experimental set-up. The experimenter positioned the apparatus 2–5 m away from the test monkey. The apparatus consisted of a stage and a screen that could block the view of the stage and store the food stimulus items for the study. The screen was placed behind the stage during test trials, as shown in the figure.

Results

Figure 3a and b present the findings from this experiment. Monkeys looked longer at “together” events (5.1 sec, SE = .3 sec) than at “separate” events (4.1 sec, SE = .3 sec), F(1, 57) = 9.2, p < .005. This effect was equally strong in the hold-top and hold-both conditions, yielding no interaction of Condition by Display (F = 0). Of the 59 monkeys tested, 42 looked longer at “together” events and 17 looked longer at “separate,” χ² = 11, p < .005. The main effect of Condition was not statistically significant (F < 2).
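The count comparison above (42 vs. 17 monkeys) can be checked with a short sketch. The source does not spell out the test, so the code assumes a chi-square goodness-of-fit test against an even split with 1 degree of freedom; the function name is illustrative.

```python
from math import erfc, sqrt

def chi_square_equal_split(count_a, count_b):
    """Chi-square goodness-of-fit of two counts against a 50/50 split.

    Returns the statistic and its p value for 1 degree of freedom.
    """
    expected = (count_a + count_b) / 2
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

stat, p = chi_square_equal_split(42, 17)
print(f"chi-square = {stat:.1f}, p = {p:.4f}")
```

Under these assumptions the statistic is about 10.6, which rounds to the reported value of 11, and the p value falls below .005.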

Discussion

These results show that rhesus monkeys look longer when two bounded objects move together than when one of the two objects moves separately from the other. Like human infants, monkeys appear to parse arrays into bounded objects, and they represent these objects as independently movable and manipulable. Moreover, monkeys and infants alike appear to look longer at events in which two perceptually bounded objects move and behave as a single unit, suggesting that they find such events to be novel or unnatural. Monkey and human object representations therefore appear to be similar and to be testable by similar methods, in accord with previous findings (Hauser et al., 1996).

The experiment also revealed two differences between the object representations of adult monkeys and young human infants. First, infants as young as 6 months take account of the supporting role of hands in lifting and moving objects, but the monkeys in Experiment 1 showed no sensitivity to hands. Informal observations suggested that the monkeys were highly attentive to the food objects but oblivious to the human hands that held and manipulated them. Second, infants below 12 months do not consistently perceive the boundary between two adjacent objects that are stationary, even when the objects belong to different familiar kinds. Because the initial displays in the present studies contained objects that were adjacent and underwent no relative motion, the present findings suggest that the rhesus monkeys used featural information to parse the visual display into distinct objects.

Figure 3. Results from Experiments 1 and 2. Rhesus monkeys look longer when two distinct objects move together than when one of the two objects moves separately, both (a) when one hand holds and lifts the top object (Experiment 1, hold-top condition) and (b) when two hands hold both objects (Experiment 1, hold-both condition). In contrast, (c) rhesus do not distinguish “together” from “separate” displays in their looking times when the distinct objects are stationary (Experiment 2). [Plot omitted: looking time (sec), axis 0 to 6, panels a, b, c.]

According to this object-parsing interpretation, two factors are critical to the rhesus’ responses: the distinct features of the objects and their distinct or common motions. Because these two factors were not manipulated separately in the test displays in Experiment 1, however, there are two alternative accounts of these findings, each of which discredits one of the factors critical to the object-parsing account. According to one alternative account, monkeys simply find displays with distinct features in spatial proximity more interesting than displays with distinct features in more distant locations. Monkeys may look longer at the outcome of the “together” trial because there are more features clustered together than in the outcome of the “separate” trial; the motion of the objects may be completely irrelevant to monkeys’ looking behavior. Experiment 2 tests this alternative interpretation of the findings by comparing monkeys’ looking times to the same two outcome displays with no preceding motion. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 2, as in Experiment 1. If the object-parsing interpretation is correct, in contrast, monkeys should not show the same preference for the “together” outcome in Experiment 2.

The second alternative interpretation of the data from Experiment 1 is that monkeys’ looking times to different event outcomes depend on how much motion preceded those outcomes. According to this account, monkeys looked longer at the outcome of the “together” trial because a greater volume of food moved during the event that preceded the recording of their looking time. On this interpretation, the distinct features of the objects and the representation of object boundaries are irrelevant to monkeys’ looking behavior. Experiment 3 tests this alternative interpretation by presenting monkeys with “together” and “separate” trials in which a single object moves as a whole or splits apart. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 3, as in Experiment 1, because in both experiments the “together” outcome was preceded by motion of a greater volume of food. In contrast, the object-parsing interpretation predicts that monkeys will not show the same preference for the “together” outcome in Experiment 3, without the distinct featural information.

EXPERIMENT 2

A new group of monkeys was presented with the two outcome displays from Experiment 1, without any prior exposure to the food objects or to their motions. Because hands were found not to influence monkeys’ looking patterns in Experiment 1, all the outcome displays presented two food items held by one hand. On one trial (together), a hand held both food items in the air by grasping both objects at once, one atop the other. On the other trial (separate), a hand held one food item in the air while the other food item rested on the display floor. Looking times to the two test outcomes were compared to each other and to monkeys’ looking times to the same outcome displays in Experiment 1. If the looking preference for the “together” outcome in Experiment 1 reflected monkeys’ parsing of the initial arrays into two objects and their expectation that the two objects would move independently, then that preference should be absent or attenuated in Experiment 2.

Results

Figure 3c presents the principal findings of Experiment 2. With stationary objects, monkeys looked equally at “together” events (4.3 sec, SE = .4 sec) and “separate” events (4.7 sec, SE = .4 sec), F(1, 27) = 1.5, p > .22. Of the 28 monkeys tested, nine looked longer at the “together” event, 18 looked longer at the “separate” event, and one individual looked equally at both events (χ² = 3.0, nonsignificant). The “tie” data point from the single individual was dropped for the χ² analysis.

An analysis comparing looking times in Experiments 1 and 2 revealed a significant interaction between trial type (together vs. separate) and experiment (1 vs. 2), F(1, 85) = 7.4, p = .008. Monkeys showed a greater looking preference for the “together” outcome display in Experiment 1.

Discussion

In Experiment 2, rhesus monkeys looked no longer at a display in which two different objects were held in the air together than at a display in which one object was held in the air while the other object rested on the display floor. These findings contrast with the results from Experiment 1, which focused on monkeys’ looking times to these same displays after prior exposure to two adjacent objects and to their common or separate motions. These findings challenge one alternative explanation for the looking preferences in Experiment 1 and support our object-parsing interpretation of those looking preferences, whereby monkeys parse visual displays into distinct objects based on featural information and find events in which distinct objects move together more interesting than events in which distinct objects move separately.


However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor; a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys’ preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys’ parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the “separate” outcome display (4.5 sec, SE = .4 sec) than at the “together” outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the “separate” display and 11 looked longer at the “together” display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Panels a and b; vertical axis: looking time (sec), 0 to 6.]

Journal of Cognitive Neuroscience, Volume 13, Number 1

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display, χ² = 4.9, p < .05.
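The χ² values reported for Experiments 2 through 4 can be checked from the counts alone. The sketch below assumes that each test is a one-degree-of-freedom goodness-of-fit test against an even split between the two outcomes, with ties dropped; the function name and the 3.84 critical value (χ², df = 1, α = .05) are supplied here for illustration and are not taken from the original analysis.

```python
def chi_square_even_split(n_together, n_separate):
    """Goodness-of-fit chi-square against a 50/50 split (df = 1)."""
    expected = (n_together + n_separate) / 2
    return sum((obs - expected) ** 2 / expected
               for obs in (n_together, n_separate))

# Assumed threshold: chi-square critical value for df = 1, alpha = .05.
CRITICAL_05 = 3.84

# (monkeys looking longer at "together", monkeys looking longer at "separate")
counts = {
    "Experiment 2": (9, 18),          # one tie dropped
    "Experiment 3": (11, 19),
    "Experiment 4": (16, 27),
    "Experiments 3 and 4": (27, 46),
}

results = {label: chi_square_even_split(*pair) for label, pair in counts.items()}
for label, value in results.items():
    verdict = "p < .05" if value > CRITICAL_05 else "nonsignificant"
    print(f"{label}: chi-square = {value:.1f} ({verdict})")
```

Under these assumptions the computed values round to the reported 3.0, 2.1, 2.8, and 4.9, and only the pooled comparison exceeds the .05 threshold.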

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. It instead supports the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1. Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive from the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes to both infants’ abilities to learn rapidly about tools and, ultimately, to humans’ superior tool use.²

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features such as edge alignment and texture similarity to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features such as surface coloring and shape as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans, using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract-kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age > 4 years) and half adult females (age > 3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide × 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
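One way to picture the orthogonal counterbalancing within a holding condition is to cross the two factors named above: which food pair appears in the “together” trial, and which trial type comes first. The sketch below enumerates the resulting four cells and assigns monkeys to them in rotation. The rotation rule and all function names are illustrative assumptions; the paper does not describe how individual monkeys were assigned to cells.

```python
from collections import Counter
from itertools import cycle, product

PAIRS = ("pepper/potato", "pumpkin/ginger")

def counterbalancing_cells():
    """Cross which pair moves together with which trial type comes first."""
    cells = []
    for together_pair, first_type in product(PAIRS, ("together", "separate")):
        separate_pair = PAIRS[1] if together_pair == PAIRS[0] else PAIRS[0]
        trials = [("together", together_pair), ("separate", separate_pair)]
        if first_type == "separate":
            trials.reverse()
        cells.append(tuple(trials))
    return cells

def assign(n_monkeys):
    """Hypothetical assignment: rotate through the cells in a fixed order."""
    rotation = cycle(counterbalancing_cells())
    return [next(rotation) for _ in range(n_monkeys)]

# Example: the 28 monkeys of the hold-top condition.
schedule = assign(28)
cell_counts = Counter(schedule)
for cell, count in cell_counts.items():
    print(count, cell)
```

With 28 monkeys the rotation fills each cell equally often; with 31 monkeys, as in the hold-both condition, the cells can differ by at most one monkey.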

Procedure

All testing was conducted by one experimenter and one camcorder operator; a test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks, invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
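The inter-coder agreement reported here is a correlation between two coders’ looking-time judgments on the same trials. The sketch below assumes it was a Pearson product-moment correlation, which the text does not state explicitly, and uses made-up looking times (four monkeys, two trials each) purely for illustration; the variable and function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical total looking times (sec) per trial, NOT data from the study.
coder_a = [4.1, 5.3, 2.0, 6.7, 3.4, 4.9, 1.8, 5.6]
coder_b = [4.3, 5.0, 2.2, 6.9, 3.1, 5.2, 1.5, 5.8]

r = pearson_r(coder_a, coder_b)
print(f"inter-coder r = {r:.2f}")
```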

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
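With only two levels of a within-subjects factor, a one-way repeated-measures ANOVA is equivalent to a paired t test, with F(1, n − 1) = t². The sketch below illustrates that equivalence under stated assumptions: the function name `within_subject_f` and the looking times are hypothetical, and this is not a reproduction of the authors' analysis software or data.

```python
from math import sqrt


def within_subject_f(together, separate):
    """F statistic for a two-level within-subjects factor.

    Computed as the square of the paired t statistic on the per-subject
    differences; returns (F, (df_numerator, df_denominator)).
    """
    if len(together) != len(separate) or len(together) < 2:
        raise ValueError("need paired observations from at least 2 subjects")
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("F is undefined when all differences are identical")
    t_stat = mean_d / sqrt(var_d / n)
    return t_stat ** 2, (1, n - 1)


# Hypothetical looking times (sec) for six monkeys, NOT data from the study.
together_times = [4.0, 5.5, 3.2, 4.8, 4.1, 5.0]
separate_times = [4.4, 5.1, 3.9, 4.6, 4.7, 5.3]

f_value, (df1, df2) = within_subject_f(together_times, separate_times)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The denominator degrees of freedom equal the number of subjects minus one, which matches the F(1, 27) reported for the 28 monkeys of Experiment 2.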

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1, except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

56 Journal of Cognitive Neuroscience Volume 13 Number 1

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, USA, 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.


Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.


infants may use both spatiotemporal and featural information to parse objects, but they may be less sensitive to the latter (Needham, 1997; Johnson & Aslin, 1996). In any case, a change appears to occur at the end of the first year, when infants first reliably demonstrate the use of featural information to specify the boundaries of adjacent objects (see Note 1). This change coincides with the emergence of the first names for objects (Xu & Carey, 1996; Xu et al., 1999).

One exception to the general rule that young infants ignore featural information in parsing objects concerns their responses to humans and human body parts, especially hands. By 6 months of age, infants treat hands as distinct from inanimate objects and have different expectations about how hands and inanimate objects should behave. In particular, infants view the motions of hands as goal-directed (Woodward, 1998), they anticipate that a hand can pick up an object but that one object cannot pick up another (Leslie, 1982), and they appreciate on some level that an object held by a hand requires no further support (Needham & Baillargeon, 1993). Infants’ sensitivity to relationships between hands and inanimate objects may form an important precursor to the development of tool use, which is largely learned by observation in young children (Nagell, Olguin, & Tomasello, 1993; Tomasello, Kruger, & Ratner, 1993; Meltzoff, 1988). It is not known, however, whether this early-developing sensitivity is unique to humans or causally involved in later-developing object representations.

Object Representations in Monkeys

Hauser et al. (1996), Hauser (1998), and Hauser and Carey (1998) have adapted the preferential looking method for studies of object representations in both semi–free-ranging and captive monkeys. In Hauser’s studies, adult monkeys are given abbreviated versions of the preferential looking experiments used with infants, with brief familiarization periods and brief, fixed-duration test trials. Monkeys typically are familiarized and tested with events involving food items, because these objects elicit high levels of spontaneous attention (Hauser et al., 1996; Hauser & Carey, 1998; though see Hauser, 1998; Hauser & Williams, submitted, for cases where monkeys have been tested successfully with nonfood items). Like human infants, adult monkeys have been found to show higher levels of spontaneous looking time when they view certain events that human adults find novel or unnatural (e.g., events in which an object appears to vanish after moving behind a screen), even when care is taken to match the natural and unnatural events on a variety of other dimensions (see Hauser & Carey, 1998, for discussion). Hauser’s findings suggest that the preferential looking methods developed for studies of human infants can serve to assess object representations in monkeys as well, allowing systematic comparisons of high-level visual abilities in monkeys and humans.

Additional reasons for using preferential looking methods to test for monkeys’ object representations relate to the extensive anatomical, physiological, and behavioral literature on the visual representations of nonhuman primates and, in particular, rhesus monkeys. A wealth of studies provide evidence for homologous mechanisms subserving visual recognition in rhesus monkeys and humans (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984). Nevertheless, progress has been slow in understanding how this recognition system works and in gaining insights into both the commonalities and differences between human and nonhuman object representation. A standard physiological study on rhesus monkeys might demonstrate that neurons in a given brain area become active when the monkeys see a given display, after extensive training with the display. Such data, however, do not reveal the cognitive and behavioral functions of the neurons activated by the visual display, nor the ways in which monkeys use these patterns of neural activation to interpret the world. Moreover, such data make only limited contact with the rich literature in cognitive psychology and cognitive neuroscience on object representations in humans. Do monkeys parse visual displays into objects that can be separately recognized and manipulated? What expectations do monkeys form about events involving objects?

Three features of existing studies of the neural mechanisms of visual representations may have hindered progress in answering these questions. First, with the notable exception of studies of face recognition (e.g., Perrett, Mistlin, & Chitty, 1987), neurophysiological studies of visual processing in monkeys have tended to use two-dimensional, arbitrary stimuli of no ecological significance. Because monkeys’ abilities to recognize objects surely have evolved to solve problems such as finding food, avoiding obstacles, and using landmarks to recognize significant places, the mechanisms subserving these abilities may not be best revealed by studies of the responses of neurons to two-dimensional geometric figures or alphanumeric characters. Second, the monkeys in neurophysiological studies typically are given extensive training with an arbitrary set of objects before physiological recording begins. It is not clear whether training regimes lasting a year or more reveal monkeys’ ordinary capacities for detecting and remembering natural objects under conditions of incidental viewing, or their ad hoc strategies adopted to solve the problem at hand (see discussion in Rao, Rainer, & Miller, 1997). Third, the methods used to study monkeys often are quite different from those used with humans. These differences complicate comparisons across species and hinder attempts to trace the evolutionary origins of human capacities in our common primate heritage (Hauser & Carey, 1998).

The present studies investigate monkeys’ representations of natural, ecologically significant objects (food items) undergoing ecologically significant events (grasping and lifting). Moreover, they use a method that has been used extensively with humans of all ages, requires no training, and can be administered to free-ranging as well as captive animals. For these reasons, the studies promise to shed light on the representations of objects that monkeys form spontaneously, paving the way for simultaneous behavioral and neurophysiological studies of the mechanisms of object representations in untrained animals, under conditions allowing systematic comparison with humans.

In the present studies, we used a variant of the method of Xu et al. (1999) to investigate whether untrained, free-ranging adult rhesus monkeys, with a mature visual system but no spontaneous tool use and at best a limited capacity for language, perceive object boundaries as older human infants do. Monkeys were presented with displays containing two novel food objects that were stationary and adjacent. They then viewed the lifting of the top object or of both objects together on two separate trials, and their looking times to the outcomes of these events were recorded. In one condition, the lifting of the objects was accomplished by a single human hand that grasped only the top object. In a second condition, the lifting was accomplished by two hands, each grasping one object and moving the objects together. If monkeys perceived the object boundaries as older infants do, they should look longer at the outcome of the event in which the two objects moved together in the one-hand events. This tendency might be attenuated in the two-hand events, because each object was lifted by a supporting hand. Such findings would suggest that human and nonhuman primates have homologous representations of objects as movable and manipulable units, and that both species distinguish hands from other objects and are sensitive to the functions of hands in supporting and moving objects.

EXPERIMENT 1

In our first experiment, monkeys were presented with two novel food items (either a pumpkin and a piece of ginger root, or a pepper and a sweet potato) on the floor of a stage (Figure 2). After a monkey had observed one item sitting on top of the other item for at least 1 sec, a hand grasped the top item and lifted it. On different trials, either the top item moved alone while the bottom item remained on the stage floor (“separate trial”) or the two items moved together (“together trial”). At the end of the movement, the hand and objects remained stationary at their final positions for 10 sec, and the monkey’s looking time was recorded. Looking times to the two test displays were compared to investigate whether monkeys looked longer at the event outcome on the “together” trial, a preference that would suggest that they perceived two separately movable objects in the original display.

To investigate monkeys’ sensitivity to the role of human hands in supporting and manipulating objects, the test events were presented in two different ways to different groups of subjects. For half the subjects, both events were produced by a single hand that grasped only the top food object (hold-top). In this condition, the bottom food object appeared to human observers to rest naturally on the display floor on the “separate” trial and to move unnaturally with the top object on the “together” trial. For the remaining subjects, each food object was grasped by a different hand, such that both objects appeared to be adequately supported on both trials (hold-both). Comparisons of monkeys’ looking preferences across these conditions should reveal whether monkeys take account of the support function of human hands in representing object motions.

Figure 2. Experimental set-up. The experimenter positioned the apparatus 2–5 m away from the test monkey. The apparatus consisted of a stage and a screen that could block the view of the stage and store the food stimulus items for the study. The screen was placed behind the stage during test trials, as shown in the figure.

Results

Figure 3a and b present the findings from this experiment. Monkeys looked longer at “together” events (5.1 sec, SE = .3 sec) than at “separate” events (4.1 sec, SE = .3 sec), F(1, 57) = 9.2, p < .005. This effect was equally strong in the hold-top and hold-both conditions, yielding no interaction of Condition by Display (F = 0). Of the 59 monkeys tested, 42 looked longer at “together” events and 17 looked longer at “separate,” χ² = 11, p < .005. The main effect of Condition was not statistically significant (F < 2).
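The count-based comparison reported above (42 monkeys looking longer at “together” versus 17 at “separate”) can be checked with a one-degree-of-freedom chi-square goodness-of-fit test against an equal split. This is a sketch under that assumption; the text does not state the exact form of the test the authors ran, and the function name `chi_square_equal_split` is hypothetical.

```python
from math import erfc, sqrt


def chi_square_equal_split(count_a, count_b):
    """Chi-square goodness-of-fit test of two counts against a 50/50 split.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("need at least one observation")
    expected = total / 2
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 df, the chi-square upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Counts reported for Experiment 1: 42 "together" vs. 17 "separate".
stat, p = chi_square_equal_split(42, 17)
print(f"chi-square(1) = {stat:.2f}, p = {p:.4f}")
```

Under this assumption the statistic comes out at about 10.6, which rounds to the reported value of 11 and falls below the reported p < .005 threshold.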

Discussion

These results show that rhesus monkeys look longer when two bounded objects move together than when one of the two objects moves separately from the other. Like human infants, monkeys appear to parse arrays into bounded objects, and they represent these objects as independently movable and manipulable. Moreover, monkeys and infants alike appear to look longer at events in which two perceptually bounded objects move and behave as a single unit, suggesting that they find such events to be novel or unnatural. Monkey and human object representations therefore appear to be similar and to be testable by similar methods, in accord with previous findings (Hauser et al., 1996).

The experiment also revealed two differences between the object representations of adult monkeys and young human infants. First, infants as young as 6 months take account of the supporting role of hands in lifting and moving objects, but the monkeys in Experiment 1 showed no sensitivity to hands. Informal observations suggested that the monkeys were highly attentive to the food objects but oblivious to the human hands that held and manipulated them. Second, infants below 12 months do not consistently perceive the boundary between two adjacent objects that are stationary, even when the objects belong to different familiar kinds. Because the initial displays in the present studies contained objects that were adjacent and underwent no relative motion, the present findings suggest that the rhesus monkeys used featural information to parse the visual display into distinct objects.

Figure 3. Results from Experiments 1 and 2. Rhesus monkeys look longer when two distinct objects move together than when one of the two objects moves separately, both (a) when one hand holds and lifts the top object (Experiment 1, hold-top condition) and (b) when two hands hold both objects (Experiment 1, hold-both condition). In contrast, (c) rhesus do not distinguish “together” from “separate” displays in their looking times when the distinct objects are stationary (Experiment 2).

[Figure 3 plot omitted: y-axis, looking time (sec), 0–6; panels a, b, c.]

According to this object-parsing interpretation, two factors are critical to the rhesus’ responses: the distinct features of the objects and their distinct or common motions. Because these two factors were not manipulated separately in the test displays in Experiment 1, however, there are two alternative accounts of these findings, each of which discredits one of the factors critical to the object-parsing account. According to one alternative account, monkeys simply find displays with distinct features in spatial proximity more interesting than displays with distinct features in more distant locations. Monkeys may look longer at the outcome of the “together” trial because there are more features clustered together than in the outcome of the “separate” trial; the motion of the objects may be completely irrelevant to monkeys’ looking behavior. Experiment 2 tests this alternative interpretation of the findings by comparing monkeys’ looking times to the same two outcome displays with no preceding motion. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 2, as in Experiment 1. If the object-parsing interpretation is correct, in contrast, monkeys should not show the same preference for the “together” outcome in Experiment 2.

The second alternative interpretation of the data from Experiment 1 is that monkeys’ looking times to different event outcomes depend on how much motion preceded those outcomes. According to this account, monkeys looked longer at the outcome of the “together” trial because a greater volume of food moved during the event that preceded the recording of their looking time. On this interpretation, the distinct features of the objects and the representation of object boundaries are irrelevant to monkeys’ looking behavior. Experiment 3 tests this alternative interpretation by presenting monkeys with “together” and “separate” trials in which a single object moves as a whole or splits apart. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 3, as in Experiment 1, because in both experiments the “together” outcome was preceded by motion of a greater volume of food. In contrast, the object-parsing interpretation predicts that monkeys will not show the same preference for the “together” outcome in Experiment 3, without the distinct featural information.

EXPERIMENT 2

A new group of monkeys was presented with the two outcome displays from Experiment 1, without any prior exposure to the food objects or to their motions. Because hands were found not to influence monkeys’ looking patterns in Experiment 1, all the outcome displays presented two food items held by one hand. On one trial (together), a hand held both food items in the air by grasping both objects at once, one atop the other. On the other trial (separate), a hand held one food item in the air while the other food item rested on the display floor. Looking times to the two test outcomes were compared to each other and to monkeys’ looking times to the same outcome displays in Experiment 1. If the looking preference for the “together” outcome in Experiment 1 reflected monkeys’ parsing of the initial arrays into two objects and their expectation that the two objects would move independently, then that preference should be absent or attenuated in Experiment 2.

Results

Figure 3c presents the principal findings of Experiment 2. With stationary objects, monkeys looked equally at “together” events (4.3 sec, SE = .4 sec) and “separate” events (4.7 sec, SE = .4 sec), F(1, 27) = 1.5, p > .22. Of the 28 monkeys tested, nine looked longer at the “together” event, 18 looked longer at the “separate” event, and one individual looked equally at both events (χ² = 3.0, nonsignificant). The “tie” data point from the single individual was dropped for the χ² analysis.

An analysis comparing looking times in Experiments 1 and 2 revealed a significant interaction between trial type (together vs. separate) and experiment (1 vs. 2), F(1, 85) = 7.4, p = .008. Monkeys showed a greater looking preference for the “together” outcome display in Experiment 1.

Discussion

In Experiment 2, rhesus monkeys looked no longer at a display in which two different objects were held in the air together than at a display in which one object was held in the air while the other object rested on the display floor. These findings contrast with the results from Experiment 1, which focused on monkeys’ looking times to these same displays after prior exposure to two adjacent objects and to their common or separate motions. These findings challenge one alternative explanation for the looking preferences in Experiment 1 and support our object-parsing interpretation of those looking preferences, whereby monkeys parse visual displays into distinct objects based on featural information and find events in which distinct objects move together more interesting than events in which distinct objects move separately.


However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor; a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys' preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys' parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the "separate" outcome display (4.5 sec, SE = .4 sec) than at the "together" outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the "separate" display and 11 looked longer at the "together" display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys looked longer at the outcome of the "together" event in Experiment 1 than in Experiment 3.

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Panels a and b; vertical axis: looking time (sec), 0 to 6.]

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the "together" and "separate" trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the "separate" trial. Because the inside of the lemon or pepper was visible following the "separate" event of Experiment 3, but not following either the "together" event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the "separate" outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the "together" and "separate" events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the "separate" event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the "together" event. According to the original object-parsing account, the preference for the "separate" event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the "separate" event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at "together" events (3.7 sec, SE = .3 sec) and "separate" events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the "together" event and 27 looked longer at the "separate" event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the "separate" outcome display (4.3 sec, SE = .3 sec) than at the "together" outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the "separate" outcome display and 27 looked longer at the "together" outcome display, χ² = 4.9, p < .05.
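The χ² values reported for the counts of monkeys preferring each display can be checked with a short calculation. The sketch below assumes the test is a one-degree-of-freedom goodness-of-fit test against an even split, which the text does not state explicitly; the function name and the dictionary of counts are introduced here for illustration only.

```python
def chi_square_equal_split(n_first: int, n_second: int) -> float:
    """Chi-square goodness-of-fit statistic (1 df) against a 50/50 split.

    Assumption: under the null hypothesis, each monkey is equally likely
    to look longer at either display, so each expected count is total / 2.
    """
    total = n_first + n_second
    expected = total / 2
    return ((n_first - expected) ** 2 + (n_second - expected) ** 2) / expected


# (together, separate) counts as reported in the text; the single tie in
# Experiment 2 is excluded, as the authors describe.
reported_counts = {
    "Experiment 2": (9, 18),
    "Experiment 3": (11, 19),
    "Experiment 4": (16, 27),
    "Experiments 3 and 4": (27, 46),
}

for label, (together, separate) in reported_counts.items():
    statistic = chi_square_equal_split(together, separate)
    print(f"{label}: chi-square = {statistic:.1f}")
```

Under this assumption the statistics round to 3.0, 2.1, 2.8, and 4.9, in line with the reported values; only the pooled count from Experiments 3 and 4 exceeds the conventional .05 criterion for one degree of freedom (3.84).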

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the "separate" display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys' looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. It instead supports the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys' expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the "separate" display than those in Experiment 4, because the "separate" display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: Artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates and between the ways in which those representations are used to support inferences about objects' movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1. Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants' greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive from the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants' abilities to learn rapidly about tools and, ultimately, to humans' superior tool use.²

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants' emerging abilities to use image features, such as edge alignment and texture similarity, to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features, such as surface coloring and shape, as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys' performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have "food kind" representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys' propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys' successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans, using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
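The comparison proposed here, between a population response to a two-object display and the sum or average of the responses to each object alone, could be sketched as follows. This is an illustration only: the Euclidean distance measure, the function names, and the firing-rate values are all hypothetical and are not drawn from any recording or analysis described in this paper.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length response vectors."""
    if len(u) != len(v):
        raise ValueError("response vectors must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def compare_to_combinations(pair_response, first_alone, second_alone):
    """Distance of the two-object response from the sum and from the
    average of the single-object responses (one entry per neuron)."""
    summed = [a + b for a, b in zip(first_alone, second_alone)]
    averaged = [(a + b) / 2 for a, b in zip(first_alone, second_alone)]
    return {
        "sum": euclidean(pair_response, summed),
        "average": euclidean(pair_response, averaged),
    }


# Hypothetical firing rates (spikes/sec) for three neurons.
pepper_alone = [12.0, 3.0, 8.0]
potato_alone = [4.0, 9.0, 6.0]
pepper_on_potato = [8.5, 6.0, 7.5]

print(compare_to_combinations(pepper_on_potato, pepper_alone, potato_alone))
```

Small distances to either combination would be compatible with encoding in terms of the two separate objects, whereas large distances to both would point toward a different encoding of the two-object display. What counts as "small" would have to be judged against trial-to-trial variability, which this sketch does not model.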

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract-kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces, rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation and, in turn, of humans' unique tool and symbol use may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age > 4 years) and half adult females (age > 3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide by 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items, therefore, were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one "together" trial and one "separate" trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
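The counterbalancing described in this section can be made concrete by enumerating the crossing of the two factors. The sketch below assumes that orthogonal counterbalancing yields four schedules per condition (two assignments of food pair to trial type, crossed with two trial orders); the round-robin assignment of monkeys to schedules is hypothetical, since the text does not say how individuals were allocated.

```python
from itertools import product

PAIRINGS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def counterbalanced_schedules():
    """Enumerate the 2 x 2 crossing of pair-to-trial-type assignment
    and trial order. Each schedule is a tuple of two
    (trial_type, food_pair) entries in presentation order."""
    schedules = []
    for together_pair, first_trial in product(PAIRINGS, TRIAL_TYPES):
        # The pair not used on the "together" trial goes to "separate".
        separate_pair = PAIRINGS[1 - PAIRINGS.index(together_pair)]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_trial = TRIAL_TYPES[1 - TRIAL_TYPES.index(first_trial)]
        schedules.append(
            ((first_trial, pair_for[first_trial]),
             (second_trial, pair_for[second_trial]))
        )
    return schedules


def assign(monkey_index: int):
    """Hypothetical round-robin assignment of the i-th monkey tested
    within a condition to one of the four schedules."""
    schedules = counterbalanced_schedules()
    return schedules[monkey_index % len(schedules)]


for schedule in counterbalanced_schedules():
    print(schedule)
```

Because 28 and 31 are not both multiples of four, the cells could only have been approximately balanced within each condition.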

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey's view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In "together" events, the bottom object moved with the top object; in "separate" events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks, invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called "Count" and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called "Done" to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one "together" and one "separate" trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter's voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
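The two quantities in this section, total looking time per trial and the agreement between coders, can be illustrated with a brief sketch. It assumes a frame rate of 30 frames per second and a simple looking/not-looking code for each frame; neither detail is given in the text, and the function names are introduced here for illustration. The agreement measure is taken to be a Pearson correlation, which the text also does not specify.

```python
from math import sqrt

FRAME_RATE = 30.0     # assumption: video frames per second
TRIAL_SECONDS = 10.0  # trial length stated in the Procedure


def looking_time(frame_codes, frame_rate=FRAME_RATE,
                 trial_seconds=TRIAL_SECONDS):
    """Total looking time (sec) from per-frame codes, where True means
    the monkey was looking at the display. Frames beyond the coding
    window are ignored."""
    n_frames = int(round(frame_rate * trial_seconds))
    window = list(frame_codes)[:n_frames]
    return sum(1 for looking in window if looking) / frame_rate


def pearson_r(xs, ys):
    """Pearson correlation between two coders' looking-time judgments."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-trial looking times (sec) from two coders.
coder_a = [4.1, 2.8, 6.0, 3.3, 5.2, 1.9, 4.7, 3.0]
coder_b = [4.3, 2.5, 5.8, 3.6, 5.0, 2.2, 4.4, 3.1]
print(f"inter-coder r = {pearson_r(coder_a, coder_b):.2f}")
```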

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the "separate" trial, the position of the experimenter's hand was the same as in the "separate" trial of the hold-top condition of Experiment 1. On the "together" trial, the experimenter's right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the "together" trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called "Count" and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA, with the additional factor of Experiment, compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
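With only two levels of Display, both analyses reduce to familiar t tests on difference scores: the within-subjects F equals the squared paired t, and the Display × Experiment interaction F equals the squared pooled-variance t comparing the two groups' difference scores. The sketch below illustrates this equivalence; it is not the authors' analysis code, and the looking times in it are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def within_subjects_f(together, separate):
    """F and (df1, df2) for a two-level within-subjects factor,
    computed as the squared paired t on difference scores."""
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat ** 2, (1, n - 1)


def interaction_f(diffs_a, diffs_b):
    """F and (df1, df2) for the Display x Experiment interaction,
    computed as the squared pooled-variance t comparing the
    together-minus-separate difference scores of two groups."""
    n_a, n_b = len(diffs_a), len(diffs_b)
    pooled_var = ((n_a - 1) * stdev(diffs_a) ** 2
                  + (n_b - 1) * stdev(diffs_b) ** 2) / (n_a + n_b - 2)
    t_stat = (mean(diffs_a) - mean(diffs_b)) / sqrt(
        pooled_var * (1 / n_a + 1 / n_b))
    return t_stat ** 2, (1, n_a + n_b - 2)


# Hypothetical looking times (sec) for four monkeys in one experiment.
together = [5.0, 4.0, 6.0, 3.0]
separate = [4.0, 4.0, 5.0, 1.0]
print(within_subjects_f(together, separate))
```

The degrees of freedom follow directly from the sample sizes: 28 monkeys give F(1, 27) for the Experiment 2 analysis, and 59 + 28 monkeys give F(1, 85) for the comparison with Experiment 1, as reported in the Results.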

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1 except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Journal of Cognitive Neuroscience, Volume 13, Number 1

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primate’s expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perception of object unity in young infants: The roles of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.

Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.


different from those used with humans. These differences complicate comparisons across species and hinder attempts to trace the evolutionary origins of human capacities in our common primate heritage (Hauser & Carey, 1998).

The present studies investigate monkeys’ representations of natural, ecologically significant objects (food items) undergoing ecologically significant events (grasping and lifting). Moreover, they use a method that has been used extensively with humans of all ages, requires no training, and can be administered to free-ranging as well as captive animals. For these reasons, the studies promise to shed light on the representations of objects that monkeys form spontaneously, paving the way for simultaneous behavioral and neurophysiological studies of the mechanisms of object representations in untrained animals, under conditions allowing systematic comparison with humans.

In the present studies, we used a variant of the method of Xu et al. (1999) to investigate whether untrained, free-ranging adult rhesus monkeys, with a mature visual system but no spontaneous tool use and at best a limited capacity for language, perceive object boundaries as older human infants do. Monkeys were presented with displays containing two novel food objects that were stationary and adjacent. They then viewed the lifting of the top object, or of both objects together, on two separate trials, and their looking times to the outcomes of these events were recorded. In one condition, the lifting of the objects was accomplished by a single human hand that grasped only the top object. In a second condition, the lifting was accomplished by two hands, each grasping one object and moving the objects together. If monkeys perceived the object boundaries as older infants do, they should look longer at the outcome of the event in which the two objects moved together in the one-hand events. This tendency might be attenuated in the two-hand events, because each object was lifted by a supporting hand. Such findings would suggest that human and nonhuman primates have homologous representations of objects as movable and manipulable units, and that both species distinguish hands from other objects and are sensitive to the functions of hands in supporting and moving objects.

EXPERIMENT 1

In our first experiment, monkeys were presented with two novel food items (either a pumpkin and a piece of ginger root, or a pepper and a sweet potato) on the floor of a stage (Figure 2). After a monkey had observed one item sitting on top of the other item for at least 1 sec, a hand grasped the top item and lifted it. On different trials, either the top item moved alone while the bottom item remained on the stage floor (“separate trial”), or the two items moved together (“together trial”). At the end of the movement, the hand and objects remained stationary at their final positions for 10 sec, and the monkey’s looking time was recorded. Looking times to the two test displays were compared to investigate whether monkeys looked longer at the event outcome on the “together” trial, a preference that would suggest that they perceived two separately movable objects in the original display.

To investigate monkeys’ sensitivity to the role of human hands in supporting and manipulating objects, the test events were presented in two different ways to different groups of subjects. For half the subjects, both events were produced by a single hand that grasped only the top food object (hold-top). In this condition, the bottom food object appeared to human observers to rest naturally on the display floor on the “separate” trial and to move unnaturally with the top object on the “together” trial. For the remaining subjects, each food object was grasped by a different hand, such that both objects appeared to be adequately supported on both trials (hold-both). Comparisons of monkeys’ looking preferences across these conditions should reveal whether monkeys take account of the support function of human hands in representing object motions.

Figure 2. Experimental set-up. The experimenter positioned the apparatus 2–5 m away from the test monkey. The apparatus consisted of a stage and a screen that could block the view of the stage and store the food stimulus items for the study. The screen was placed behind the stage during test trials, as shown in the figure.

Results

Figure 3a and b present the findings from this experiment. Monkeys looked longer at “together” events (5.1 sec, SE = .3 sec) than at “separate” events (4.1 sec, SE = .3 sec), F(1, 57) = 9.2, p < .005. This effect was equally strong in the hold-top and hold-both conditions, yielding no interaction of Condition by Display (F = 0). Of the 59 monkeys tested, 42 looked longer at “together” events and 17 looked longer at “separate” events, χ² = 11, p < .005. The main effect of Condition was not statistically significant (F < 2).
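The count-based statistic reported above can be checked with a short calculation. The sketch below assumes that the reported value is a one-degree-of-freedom chi-square goodness-of-fit test of the 42 versus 17 split against an even split; the paper does not state the test explicitly, and the function name is hypothetical.

```python
import math


def even_split_chi_square(n_together, n_separate):
    """Chi-square goodness-of-fit (1 df) of two counts against a 50/50 split."""
    total = n_together + n_separate
    expected = total / 2
    chi2 = ((n_together - expected) ** 2 + (n_separate - expected) ** 2) / expected
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reported for Experiment 1: 42 monkeys looked longer at "together"
# events and 17 looked longer at "separate" events.
chi2, p = even_split_chi_square(42, 17)
print(f"chi2 = {chi2:.1f}, p = {p:.4f}")
```

Under this assumption the statistic comes out at about 10.6, which rounds to the reported value of 11 and is consistent with p < .005. The same function can be applied to the counts reported for the later experiments.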

Discussion

These results show that rhesus monkeys look longer when two bounded objects move together than when one of the two objects moves separately from the other. Like human infants, monkeys appear to parse arrays into bounded objects, and they represent these objects as independently movable and manipulable. Moreover, monkeys and infants alike appear to look longer at events in which two perceptually bounded objects move and behave as a single unit, suggesting that they find such events to be novel or unnatural. Monkey and human object representations therefore appear to be similar and to be testable by similar methods, in accord with previous findings (Hauser et al., 1996).

The experiment also revealed two differences between the object representations of adult monkeys and young human infants. First, infants as young as 6 months take account of the supporting role of hands in lifting and moving objects, but the monkeys in Experiment 1 showed no sensitivity to hands. Informal observations suggested that the monkeys were highly attentive to the food objects but oblivious to the human hands that held and manipulated them. Second, infants below 12 months do not consistently perceive the boundary between two adjacent objects that are stationary, even when the objects belong to different familiar kinds. Because the initial displays in the present studies contained objects that were adjacent and underwent no relative motion, the present findings suggest that the rhesus monkeys used featural information to parse the visual display into distinct objects.

Figure 3. Results from Experiments 1 and 2. Rhesus monkeys look longer when two distinct objects move together than when one of the two objects moves separately, both (a) when one hand holds and lifts the top object (Experiment 1, hold-top condition) and (b) when two hands hold both objects (Experiment 1, hold-both condition). In contrast, (c) rhesus do not distinguish “together” from “separate” displays in their looking times when the distinct objects are stationary (Experiment 2). [Bar graph, panels a–c; vertical axis: looking time (sec), 0–6.]

According to this object-parsing interpretation, two factors are critical to the rhesus’ responses: the distinct features of the objects, and their distinct or common motions. Because these two factors were not manipulated separately in the test displays in Experiment 1, however, there are two alternative accounts of these findings, each of which discredits one of the factors critical to the object-parsing account. According to one alternative account, monkeys simply find displays with distinct features in spatial proximity more interesting than displays with distinct features in more distant locations. Monkeys may look longer at the outcome of the “together” trial because there are more features clustered together than in the outcome of the “separate” trial; the motion of the objects may be completely irrelevant to monkeys’ looking behavior. Experiment 2 tests this alternative interpretation of the findings by comparing monkeys’ looking times to the same two outcome displays with no preceding motion. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 2, as in Experiment 1. If the object-parsing interpretation is correct, in contrast, monkeys should not show the same preference for the “together” outcome in Experiment 2.

The second alternative interpretation of the data from Experiment 1 is that monkeys’ looking times to different event outcomes depend on how much motion preceded those outcomes. According to this account, monkeys looked longer at the outcome of the “together” trial because a greater volume of food moved during the event that preceded the recording of their looking time. On this interpretation, the distinct features of the objects and the representation of object boundaries are irrelevant to monkeys’ looking behavior. Experiment 3 tests this alternative interpretation by presenting monkeys with “together” and “separate” trials in which a single object moves as a whole or splits apart. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 3, as in Experiment 1, because in both experiments the “together” outcome was preceded by motion of a greater volume of food. In contrast, the object-parsing interpretation predicts that monkeys will not show the same preference for the “together” outcome in Experiment 3, without the distinct featural information.

EXPERIMENT 2

A new group of monkeys was presented with the two outcome displays from Experiment 1, without any prior exposure to the food objects or to their motions. Because hands were found not to influence monkeys’ looking patterns in Experiment 1, all the outcome displays presented two food items held by one hand. On one trial (together), a hand held both food items in the air by grasping both objects at once, one atop the other. On the other trial (separate), a hand held one food item in the air while the other food item rested on the display floor. Looking times to the two test outcomes were compared to each other and to monkeys’ looking times to the same outcome displays in Experiment 1. If the looking preference for the “together” outcome in Experiment 1 reflected monkeys’ parsing of the initial arrays into two objects and their expectation that the two objects would move independently, then that preference should be absent or attenuated in Experiment 2.

Results

Figure 3c presents the principal findings of Experiment 2. With stationary objects, monkeys looked equally at “together” events (4.3 sec, SE = .4 sec) and “separate” events (4.7 sec, SE = .4 sec), F(1, 27) = 1.5, p > .22. Of the 28 monkeys tested, nine looked longer at the “together” event, 18 looked longer at the “separate” event, and one individual looked equally at both events (χ² = 3.0, nonsignificant). The “tie” data point from the single individual was dropped for the χ² analysis.

An analysis comparing looking times in Experiments 1 and 2 revealed a significant interaction between trial type (together vs. separate) and experiment (1 vs. 2), F(1, 85) = 7.4, p = .008. Monkeys showed a greater looking preference for the “together” outcome display in Experiment 1.

Discussion

In Experiment 2, rhesus monkeys looked no longer at a display in which two different objects were held in the air together than at a display in which one object was held in the air while the other object rested on the display floor. These findings contrast with the results from Experiment 1, which focused on monkeys’ looking times to these same displays after prior exposure to two adjacent objects and to their common or separate motions. These findings challenge one alternative explanation for the looking preferences in Experiment 1 and support our object-parsing interpretation of those looking preferences, whereby monkeys parse visual displays into distinct objects based on featural information and find events in which distinct objects move together more interesting than events in which distinct objects move separately.

However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor, a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys’ preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys’ parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the “separate” outcome display (4.5 sec, SE = .4 sec) than at the “together” outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the “separate” display and 11 looked longer at the “together” display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Bar graph, panels a and b; vertical axis: looking time (sec), 0–6.]

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display, χ² = 4.9, p < .05.

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. It instead supports the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: Artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects, and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome, or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates, and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1. Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive with the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and, ultimately, to humans’ superior tool use.²

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some proposals focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features, such as edge alignment and texture similarity, to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other proposals focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features, such as surface coloring and shape, as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).

52 Journal of Cognitive Neuroscience Volume 13 Number 1

Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
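The proposed comparison of a two-object population response with the sum or average of the single-object responses can be illustrated with simple arithmetic. The sketch below is only that: the firing rates are invented for illustration, the function names are hypothetical, and a real analysis would need a statistical test over repeated trials rather than a single mean gap.

```python
def mean_abs_difference(observed, predicted):
    """Average absolute gap between two equal-length response vectors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def compare_to_predictions(pair, object_a, object_b):
    """Compare two-object responses with the sum and average predictions."""
    summed = [a + b for a, b in zip(object_a, object_b)]
    averaged = [(a + b) / 2 for a, b in zip(object_a, object_b)]
    return {
        "sum": mean_abs_difference(pair, summed),
        "average": mean_abs_difference(pair, averaged),
    }


# Hypothetical firing rates (spikes/sec) for four neurons; not real data.
pepper = [10.0, 4.0, 8.0, 2.0]
potato = [6.0, 12.0, 2.0, 4.0]
pepper_on_potato = [8.0, 8.0, 5.0, 3.0]

gaps = compare_to_predictions(pepper_on_potato, pepper, potato)
print(gaps)
```

In this made-up case the two-object response matches the average prediction exactly and departs from the sum prediction; a large gap from both predictions would be the pattern pointing toward a distinct encoding of the two-object display.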

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces, rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper, used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals, could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age >4 years) and half adult females (age >3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight), and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide × 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items, therefore, were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate), and the order of test trials, were orthogonally counterbalanced across monkeys.
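The orthogonal counterbalancing described above can be made concrete by enumerating the schedules available within one holding condition. This is a sketch of one reading of the design (two object pairings crossed with two trial orders, giving four cells); the names below are illustrative and the actual assignment of monkeys to cells is not given in the text.

```python
from itertools import product

PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def counterbalancing_cells():
    """List every (first trial, second trial) schedule for one condition."""
    cells = []
    # Cross: which pair appears in the "together" trial x which trial is first.
    for together_pair, first_type in product(PAIRS, TRIAL_TYPES):
        separate_pair = PAIRS[1 - PAIRS.index(together_pair)]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_type = TRIAL_TYPES[1 - TRIAL_TYPES.index(first_type)]
        cells.append(((first_type, pair_for[first_type]),
                      (second_type, pair_for[second_type])))
    return cells


for cell in counterbalancing_cells():
    print(cell)
```

Each schedule shows both trial types and both food pairs exactly once, so neither a particular pair nor a particular order is confounded with trial type.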

Procedure

All testing was conducted by one experimenter and one camcorder operator; a test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
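Intercoder agreement of this kind is commonly summarized with a Pearson correlation over the trials that both coders scored. The sketch below assumes that is the statistic intended (the text says only “correlation”); the looking times are invented for illustration and do not come from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical total looking times (sec) per trial, one list per coder.
coder_a = [3.2, 4.1, 5.0, 2.7, 6.3, 4.4, 3.9, 5.5]
coder_b = [3.0, 4.3, 4.8, 2.9, 6.0, 4.6, 3.5, 5.9]

print(round(pearson_r(coder_a, coder_b), 2))
```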

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm; floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm; face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA, with the additional factor of Experiment, compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
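With only two levels of a within-subjects factor, a one-way repeated-measures ANOVA is equivalent to a paired t test, with F(1, n − 1) equal to t squared. The sketch below uses that equivalence; the function name and the looking times are hypothetical and are not the study’s data or the authors’ software.

```python
import math


def within_subjects_f(together, separate):
    """F and its df for a two-level within-subjects factor (paired t squared)."""
    diffs = [s - t for t, s in zip(together, separate)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the per-monkey difference scores.
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(variance / n)
    return t ** 2, (1, n - 1)


# Hypothetical looking times (sec) for five monkeys; not real data.
together_times = [4.0, 5.5, 3.0, 6.0, 4.5]
separate_times = [3.5, 5.0, 3.5, 4.0, 4.0]

f_value, df = within_subjects_f(together_times, separate_times)
print(round(f_value, 2), df)
```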

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1, except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bülthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, USA, 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie A M (1982) The perception of causality in infantsPerception 11 173ndash186

Leslie A M (1984) Infant perception of a manual pick-up eventBritish Journal of Developmental Psychology 2 19ndash32

Logothetis N K amp Sheinberg D L (1996) Visual object re-cognition Annual Review of Neuroscience 19 577ndash621

Maunsell J H amp Newsome W T (1987) Visual processing inmonkey extrastriate cortex Annual Review of Neu-roscience 10 363ndash401

Meltzoff A S (1988) Infant imitation and memory Nine-month-olds in immediate and deferred tests Child Devel-opment 59 217ndash225

Nagell K Olguin R S amp Tomasello M (1993) Processes ofsocial learning in the tool use of chimpanzees (Pan troglo-dytes) and human children (Homo sapiens) Journal ofComparative Psychology 107 174ndash186

Needham A (1997) Factors affecting infantsrsquo use of featuralinformation in object segregation Current Directions inPsychological Science 6 26ndash33

Needham A (1998) Infantsrsquo use of featural information in thesegregation of stationary objects Infant Behavior and De-velopment 21 47ndash76

Munakata et al 57

Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.

58 Journal of Cognitive Neuroscience Volume 13 Number 1


the test events were presented in two different ways to different groups of subjects. For half the subjects, both events were produced by a single hand that grasped only the top food object (hold-top). In this condition, the bottom food object appeared to human observers to rest naturally on the display floor on the “separate” trial and to move unnaturally with the top object on the “together” trial. For the remaining subjects, each food object was grasped by a different hand, such that both objects appeared to be adequately supported on both trials (hold-both). Comparisons of monkeys’ looking preferences across these conditions should reveal whether monkeys take account of the support function of human hands in representing object motions.

Results

Figure 3a and b present the findings from this experiment. Monkeys looked longer at “together” events (5.1 sec, SE = .3 sec) than at “separate” events (4.1 sec, SE = .3 sec), F(1, 57) = 9.2, p < .005. This effect was equally strong in the hold-top and hold-both conditions, yielding no interaction of Condition by Display (F = 0). Of the 59 monkeys tested, 42 looked longer at “together” events and 17 looked longer at “separate” events (χ² = 11, p < .005). The main effect of Condition was not statistically significant (F < 2).
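The count-based comparison reported above (42 monkeys versus 17) can be checked by hand. The following is a minimal sketch, assuming the statistic is an uncorrected one-degree-of-freedom chi-square goodness-of-fit test against an even split; the text does not state the exact procedure, and the function name `chi_square_even_split` is introduced here only for illustration.

```python
import math


def chi_square_even_split(count_a, count_b):
    """Chi-square goodness-of-fit test of two counts against a 50/50 split.

    Assumption: no continuity correction is applied. Returns the statistic
    and its p-value for one degree of freedom.
    """
    total = count_a + count_b
    expected = total / 2  # expected count per category under an even split
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For df = 1, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts of monkeys looking longer at "together" versus "separate" events.
stat, p = chi_square_even_split(42, 17)
print(f"chi2(1) = {stat:.1f}, p = {p:.4f}")
```

Under these assumptions the statistic comes out near 10.6, which is consistent with the rounded value reported in the text and with p < .005.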

Discussion

These results show that rhesus monkeys look longer when two bounded objects move together than when one of the two objects moves separately from the other. Like human infants, monkeys appear to parse arrays into bounded objects, and they represent these objects as independently movable and manipulable. Moreover, monkeys and infants alike appear to look longer at events in which two perceptually bounded objects move and behave as a single unit, suggesting that they find such events to be novel or unnatural. Monkey and human object representations therefore appear to be similar and to be testable by similar methods, in accord with previous findings (Hauser et al., 1996).

The experiment also revealed two differences between the object representations of adult monkeys and young human infants. First, infants as young as 6 months take account of the supporting role of hands in lifting and moving objects, but the monkeys in Experiment 1 showed no sensitivity to hands. Informal observations suggested that the monkeys were highly attentive to the food objects but oblivious to the human

Figure 3. Results from Experiments 1 and 2. Rhesus monkeys look longer when two distinct objects move together than when one of the two objects moves separately, both (a) when one hand holds and lifts the top object (Experiment 1, hold-top condition) and (b) when two hands hold both objects (Experiment 1, hold-both condition). In contrast, (c) rhesus do not distinguish “together” from “separate” displays in their looking times when the distinct objects are stationary (Experiment 2). [Bar graph, panels a–c; y-axis: looking time (sec), 0–6.]


hands that held and manipulated them. Second, infants below 12 months do not consistently perceive the boundary between two adjacent objects that are stationary, even when the objects belong to different familiar kinds. Because the initial displays in the present studies contained objects that were adjacent and underwent no relative motion, the present findings suggest that the rhesus monkeys used featural information to parse the visual display into distinct objects.

According to this object-parsing interpretation, two factors are critical to the rhesus’ responses: the distinct features of the objects and their distinct or common motions. Because these two factors were not manipulated separately in the test displays in Experiment 1, however, there are two alternative accounts of these findings, each of which discredits one of the factors critical to the object-parsing account. According to one alternative account, monkeys simply find displays with distinct features in spatial proximity more interesting than displays with distinct features in more distant locations. Monkeys may look longer at the outcome of the “together” trial because there are more features clustered together than in the outcome of the “separate” trial; the motion of the objects may be completely irrelevant to monkeys’ looking behavior. Experiment 2 tests this alternative interpretation of the findings by comparing monkeys’ looking times to the same two outcome displays with no preceding motion. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 2, as in Experiment 1. If the object-parsing interpretation is correct, in contrast, monkeys should not show the same preference for the “together” outcome in Experiment 2.

The second alternative interpretation of the data from Experiment 1 is that monkeys’ looking times to different event outcomes depend on how much motion preceded those outcomes. According to this account, monkeys looked longer at the outcome of the “together” trial because a greater volume of food moved during the event that preceded the recording of their looking time. On this interpretation, the distinct features of the objects and the representation of object boundaries are irrelevant to monkeys’ looking behavior. Experiment 3 tests this alternative interpretation by presenting monkeys with “together” and “separate” trials in which a single object moves as a whole or splits apart. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 3, as in Experiment 1, because in both experiments the “together” outcome was preceded by motion of a greater volume of food. In contrast, the object-parsing interpretation predicts that monkeys will not show the same preference for the “together” outcome in Experiment 3, without the distinct featural information.

EXPERIMENT 2

A new group of monkeys was presented with the two outcome displays from Experiment 1, without any prior exposure to the food objects or to their motions. Because hands were found not to influence monkeys’ looking patterns in Experiment 1, all the outcome displays presented two food items held by one hand. On one trial (together), a hand held both food items in the air by grasping both objects at once, one atop the other. On the other trial (separate), a hand held one food item in the air while the other food item rested on the display floor. Looking times to the two test outcomes were compared to each other and to monkeys’ looking times to the same outcome displays in Experiment 1. If the looking preference for the “together” outcome in Experiment 1 reflected monkeys’ parsing of the initial arrays into two objects and their expectation that the two objects would move independently, then that preference should be absent or attenuated in Experiment 2.

Results

Figure 3c presents the principal findings of Experiment 2. With stationary objects, monkeys looked equally at “together” events (4.3 sec, SE = .4 sec) and “separate” events (4.7 sec, SE = .4 sec), F(1, 27) = 1.5, p > .22. Of the 28 monkeys tested, nine looked longer at the “together” event, 18 looked longer at the “separate” event, and one individual looked equally at both events (χ² = 3.0, nonsignificant). The “tie” data point from the single individual was dropped for the χ² analysis.

An analysis comparing looking times in Experiments 1 and 2 revealed a significant interaction between trial type (together vs. separate) and experiment (1 vs. 2), F(1, 85) = 7.4, p = .008. Monkeys showed a greater looking preference for the “together” outcome display in Experiment 1.

Discussion

In Experiment 2, rhesus monkeys looked no longer at a display in which two different objects were held in the air together than at a display in which one object was held in the air while the other object rested on the display floor. These findings contrast with the results from Experiment 1, which focused on monkeys’ looking times to these same displays after prior exposure to two adjacent objects and to their common or separate motions. These findings challenge one alternative explanation for the looking preferences in Experiment 1 and support our object-parsing interpretation of those looking preferences, whereby monkeys parse visual displays into distinct objects based on featural information and find events in which distinct objects move together more interesting than events in which distinct objects move separately.


However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor, a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys’ preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys’ parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the “separate” outcome display (4.5 sec, SE = .4 sec) than at the “together” outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the “separate” display and 11 looked longer at the “together” display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Bar graph, panels a and b; y-axis: looking time (sec), 0–6.]


looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display (χ² = 4.9, p < .05).

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. It instead supports the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object


broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: artifacts are more apt to move cohesively than is food, which breaks apart through decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1. Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive from the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and ultimately to humans’ superior tool use.²

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features such as edge alignment and texture similarity to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features such as surface coloring and shape as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans, using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
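The comparison proposed above can be sketched in a few lines. The firing rates below are invented for illustration, and the helper names (`mean_abs_deviation`, `compare_to_predictions`) are hypothetical; the text does not specify an analysis, so this is only one way such a comparison might be set up.

```python
from statistics import mean


def mean_abs_deviation(observed, predicted):
    """Average absolute difference between two equal-length response vectors."""
    if len(observed) != len(predicted):
        raise ValueError("response vectors must have the same length")
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def compare_to_predictions(resp_a, resp_b, resp_pair):
    """Compare a two-object population response with the sum and the
    average of the responses to each object presented alone."""
    summed = [a + b for a, b in zip(resp_a, resp_b)]
    averaged = [(a + b) / 2 for a, b in zip(resp_a, resp_b)]
    return {
        "deviation_from_sum": mean_abs_deviation(resp_pair, summed),
        "deviation_from_average": mean_abs_deviation(resp_pair, averaged),
    }


# Hypothetical firing rates (spikes/sec) for four neurons; not real data.
pepper_alone = [20.0, 5.0, 12.0, 8.0]
potato_alone = [6.0, 18.0, 10.0, 4.0]
pepper_on_potato = [13.0, 11.5, 11.0, 6.0]  # constructed to equal the average

result = compare_to_predictions(pepper_alone, potato_alone, pepper_on_potato)
print(result)
```

In this made-up case the two-object response matches the average of the single-object responses and departs from their sum. A real analysis would also need a noise estimate to decide how large a deviation counts as a difference.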

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bulthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract-kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age >4 years) and half adult females (age >3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

54 Journal of Cognitive Neuroscience Volume 13, Number 1

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide by 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items, therefore, were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate), and the order of test trials, were orthogonally counterbalanced across monkeys.
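The counterbalancing described above can be enumerated explicitly. The sketch below assumes two crossed factors within each holding condition (which food pair is shown in the “together” trial, and which trial type comes first); the function name and data layout are hypothetical and are not taken from the study materials.

```python
from itertools import product

PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def counterbalancing_cells():
    """List every combination of pair-to-trial-type assignment and trial order.

    Each cell is a tuple of two (trial_type, food_pair) entries in
    presentation order.
    """
    cells = []
    for together_pair, first_trial in product(PAIRS, TRIAL_TYPES):
        # The pair not used for the "together" trial is used for "separate".
        separate_pair = PAIRS[1] if together_pair == PAIRS[0] else PAIRS[0]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_trial = "separate" if first_trial == "together" else "together"
        cells.append(
            ((first_trial, pair_for[first_trial]),
             (second_trial, pair_for[second_trial]))
        )
    return cells


cells = counterbalancing_cells()
for cell in cells:
    print(cell)
```

Under these assumptions there are four cells per holding condition, and each monkey sees both food pairs and both trial types exactly once.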

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks, invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm; floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm; face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called “Count” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA, with the additional factor of Experiment, compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
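With only two levels of a within-subjects factor, a one-way repeated-measures ANOVA is equivalent to a paired t test, with F(1, n − 1) equal to the squared t. The sketch below illustrates this relation on invented looking times; the function name is hypothetical and the numbers are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def within_subjects_f(together, separate):
    """F statistic for a two-level within-subjects factor.

    Computed as the squared paired t statistic on the per-subject
    differences. Returns (F, (df_effect, df_error)).
    """
    if len(together) != len(separate):
        raise ValueError("each subject needs one score per display")
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat ** 2, (1, n - 1)


# Hypothetical looking times in seconds for four subjects; not real data.
together = [4.0, 5.0, 3.0, 6.0]
separate = [5.0, 5.0, 5.0, 7.0]

f_value, df = within_subjects_f(together, separate)
print(f_value, df)
```

With 28 subjects, as in Experiment 2, the same calculation would have 1 and 27 degrees of freedom.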

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1 except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.


Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.



hands that held and manipulated them. Second, infants below 12 months do not consistently perceive the boundary between two adjacent objects that are stationary, even when the objects belong to different familiar kinds. Because the initial displays in the present studies contained objects that were adjacent and underwent no relative motion, the present findings suggest that the rhesus monkeys used featural information to parse the visual display into distinct objects.

According to this object-parsing interpretation, two factors are critical to the rhesus’ responses: the distinct features of the objects, and their distinct or common motions. Because these two factors were not manipulated separately in the test displays in Experiment 1, however, there are two alternative accounts of these findings, each of which discredits one of the factors critical to the object-parsing account. According to one alternative account, monkeys simply find displays with distinct features in spatial proximity more interesting than displays with distinct features in more distant locations. Monkeys may look longer at the outcome of the “together” trial because there are more features clustered together than in the outcome of the “separate” trial; the motion of the objects may be completely irrelevant to monkeys’ looking behavior. Experiment 2 tests this alternative interpretation of the findings by comparing monkeys’ looking times to the same two outcome displays with no preceding motion. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 2, as in Experiment 1. If the object-parsing interpretation is correct, in contrast, monkeys should not show the same preference for the “together” outcome in Experiment 2.

The second alternative interpretation of the data from Experiment 1 is that monkeys’ looking times to different event outcomes depend on how much motion preceded those outcomes. According to this account, monkeys looked longer at the outcome of the “together” trial because a greater volume of food moved during the event that preceded the recording of their looking time. On this interpretation, the distinct features of the objects and the representation of object boundaries are irrelevant to monkeys’ looking behavior. Experiment 3 tests this alternative interpretation by presenting monkeys with “together” and “separate” trials in which a single object moves as a whole or splits apart. If the alternative interpretation is correct, then monkeys should show a preference for the “together” outcome over the “separate” outcome in Experiment 3, as in Experiment 1, because in both experiments the “together” outcome was preceded by motion of a greater volume of food. In contrast, the object-parsing interpretation predicts that monkeys will not show the same preference for the “together” outcome in Experiment 3, without the distinct featural information.

EXPERIMENT 2

A new group of monkeys was presented with the two outcome displays from Experiment 1, without any prior exposure to the food objects or to their motions. Because hands were found not to influence monkeys’ looking patterns in Experiment 1, all the outcome displays presented two food items held by one hand. On one trial (together), a hand held both food items in the air by grasping both objects at once, one atop the other. On the other trial (separate), a hand held one food item in the air while the other food item rested on the display floor. Looking times to the two test outcomes were compared to each other and to monkeys’ looking times to the same outcome displays in Experiment 1. If the looking preference for the “together” outcome in Experiment 1 reflected monkeys’ parsing of the initial arrays into two objects and their expectation that the two objects would move independently, then that preference should be absent or attenuated in Experiment 2.

Results

Figure 3c presents the principal findings of Experiment 2. With stationary objects, monkeys looked equally at “together” events (4.3 sec, SE = .4 sec) and “separate” events (4.7 sec, SE = .4 sec), F(1, 27) = 1.5, p > .22. Of the 28 monkeys tested, nine looked longer at the “together” event, 18 looked longer at the “separate” event, and one individual looked equally at both events (χ² = 3.0, nonsignificant). The “tie” data point from the single individual was dropped for the χ² analysis.
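The count-based statistic reported above can be reproduced with a short calculation, assuming it is a one-degree-of-freedom chi-square goodness-of-fit test against an even split; the text does not name the exact test, so that is an assumption, and the function name is hypothetical.

```python
def chi_square_equal_split(count_a, count_b):
    """Chi-square goodness-of-fit statistic (1 df) against an even split."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("at least one observation is required")
    expected = total / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Counts of monkeys looking longer at each display, with the one tie dropped.
chi2 = chi_square_equal_split(9, 18)
print(round(chi2, 1))
```

For 9 versus 18 monkeys the statistic is 3.0, below the critical value of 3.84 for one degree of freedom at the .05 level, which fits the report of a nonsignificant difference.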

An analysis comparing looking times in Experiments 1 and 2 revealed a significant interaction between trial type (together vs. separate) and experiment (1 vs. 2), F(1, 85) = 7.4, p = .008. Monkeys showed a greater looking preference for the “together” outcome display in Experiment 1.

Discussion

In Experiment 2, rhesus monkeys looked no longer at a display in which two different objects were held in the air together than at a display in which one object was held in the air while the other object rested on the display floor. These findings contrast with the results from Experiment 1, which focused on monkeys’ looking times to these same displays after prior exposure to two adjacent objects and to their common or separate motions. These findings challenge one alternative explanation for the looking preferences in Experiment 1 and support our object-parsing interpretation of those looking preferences, whereby monkeys parse visual displays into distinct objects based on featural information and find events in which distinct objects move together more interesting than events in which distinct objects move separately.


However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor, a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys’ preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys’ parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the “separate” outcome display (4.5 sec, SE = .4 sec) than at the “together” outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the “separate” display and 11 looked longer at the “together” display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Bar graph, panels a and b; vertical axis: looking time (sec), 0 to 6.]

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display, χ² = 4.9, p < .05.

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. The results instead support the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: Artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates, and between the ways in which those representations are used to support inferences about objects’ movability. The well-known detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1. Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive from the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and, ultimately, to humans’ superior tool use.²

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features such as surface coloring and shape as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features such as edge alignment and texture similarity to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds, and an emerging propensity to use object features such as surface coloring and shape as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on, or gives rise to, any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans, using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
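The comparison proposed here, between a two-object response and the sum or average of the single-object responses, can be sketched as follows. This is only an illustration under stated assumptions: each list holds one hypothetical mean firing rate per neuron, the function names are ours, and mean absolute deviation stands in for whatever statistical test a real study would use to account for trial-to-trial variability.

```python
def mean_abs_deviation(observed, predicted):
    """Average absolute difference between observed and predicted rates."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def compare_to_single_objects(resp_a, resp_b, resp_pair):
    """Deviation of the two-object response from the sum and average predictions."""
    summed = [a + b for a, b in zip(resp_a, resp_b)]
    averaged = [(a + b) / 2 for a, b in zip(resp_a, resp_b)]
    return {
        "sum": mean_abs_deviation(resp_pair, summed),
        "average": mean_abs_deviation(resp_pair, averaged),
    }


# Hypothetical firing rates (spikes/sec) for four neurons; not real data.
pepper_alone = [12.0, 30.0, 5.0, 18.0]
potato_alone = [20.0, 8.0, 15.0, 18.0]
pepper_on_potato = [17.0, 18.0, 11.0, 19.0]
print(compare_to_single_objects(pepper_alone, potato_alone, pepper_on_potato))
```

Small deviations from either prediction would be compatible with encoding the display in terms of its two separate objects; large deviations from both would point to a different encoding of the two-object display.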

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations, and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract-kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age >4 years) and half adult females (age >3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide × 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
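The orthogonal counterbalancing described above can be illustrated with a short sketch. It crosses the two factors named in the design (which food pair appears on the “together” trial, and which trial comes first) to give four cells. The labels, the function names, and the round-robin assignment are assumptions made for illustration; the paper does not say how individual monkeys were assigned to cells, and testing was opportunistic.

```python
from itertools import product

PAIRS = ("pepper/potato", "pumpkin/ginger")
ORDERS = ("together first", "separate first")


def counterbalance_cells():
    """Enumerate every crossing of pair-to-trial assignment and trial order."""
    cells = []
    for together_pair, order in product(PAIRS, ORDERS):
        # The other food pair is necessarily the one shown on the "separate" trial.
        separate_pair = PAIRS[1] if together_pair == PAIRS[0] else PAIRS[0]
        cells.append(
            {"together": together_pair, "separate": separate_pair, "order": order}
        )
    return cells


def assign(n_monkeys):
    """Cycle through the cells so that each is used about equally often."""
    cells = counterbalance_cells()
    return [cells[i % len(cells)] for i in range(n_monkeys)]


schedule = assign(28)  # e.g., the 28 monkeys of the hold-top condition
```

With 28 monkeys, each of the four cells is used seven times; with 31 monkeys, the cells cannot be filled exactly equally.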

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
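The two quantities in this paragraph, total looking time per trial and the agreement between coders, can be sketched as follows. The frame rate, the frame-by-frame True/False coding format, and the example numbers are assumptions for illustration; the paper reports only the 10-sec window and the resulting correlation, and we assume that correlation is a Pearson coefficient.

```python
from math import sqrt

FPS = 30  # assumed video frame rate; the paper does not state one
TRIAL_SECONDS = 10


def looking_time(frame_codes):
    """Seconds spent looking, given one True/False code per video frame."""
    window = frame_codes[: FPS * TRIAL_SECONDS]  # ignore frames past the 10-sec trial
    return sum(window) / FPS


def pearson_r(xs, ys):
    """Pearson correlation between two coders' looking times."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical looking times (sec) from two coders for the same eight trials.
coder_1 = [4.3, 2.1, 6.0, 3.7, 5.2, 1.8, 4.9, 3.0]
coder_2 = [4.1, 2.4, 5.8, 3.9, 5.0, 2.0, 5.1, 2.7]
print(round(pearson_r(coder_1, coder_2), 2))
```

The correlation is computed over trials rather than over monkeys, matching the description of agreement “on each trial.”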

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
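For readers less familiar with this design, the following sketch shows how a one-way ANOVA with a single two-level within-subjects factor can be computed, and that its F statistic equals the square of the paired t statistic. The function names and the looking times are hypothetical illustrations, not the authors' analysis code or data.

```python
import math

def paired_t(a, b):
    """Paired t statistic for the per-subject differences b - a."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)

def within_subject_f(a, b):
    """F and degrees of freedom for a two-level repeated-measures ANOVA."""
    n, k = len(a), 2
    grand = (sum(a) + sum(b)) / (n * k)
    cond_means = [sum(a) / n, sum(b) / n]
    subj_means = [(x + y) / k for x, y in zip(a, b)]
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-subject residual
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_error / df_error), df_cond, df_error

# Hypothetical looking times (sec) for six subjects; NOT data from the study.
together = [3.9, 4.2, 3.1, 5.0, 3.6, 4.4]
separate = [4.6, 4.1, 3.8, 5.5, 4.0, 4.9]

f_value, df1, df2 = within_subject_f(together, separate)
t_value = paired_t(together, separate)
print(f"F({df1}, {df2}) = {f_value:.2f}; t squared = {t_value ** 2:.2f}")
```

The error degrees of freedom are one less than the number of subjects, which matches the form of the F ratios reported for the single-experiment analyses (for example, 30 subjects and F(1, 29)).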

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1, except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

56 Journal of Cognitive Neuroscience Volume 13 Number 1

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, USA, 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perception of object unity in young infants: The roles of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.


Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.



However, the second alternative interpretation, whereby monkeys look longer at displays presenting the outcomes of events in which a greater volume of food has moved, could account for the data from both experiments. Experiment 3 tests this alternative with food displays that are parsed by human infants and adults into a single object that either moves as a whole or breaks apart. If monkeys’ looking times to event outcomes depend on the volume of food in motion that preceded each outcome, then they should look longer at an outcome in which a whole food object has moved than at an outcome in which half of the object has moved. In contrast, if monkeys’ looking times depend on their parsing of visual displays into bounded objects, then they should show different looking preferences at the outcomes of events that involve one versus two objects.

EXPERIMENT 3

As in Experiment 1, monkeys were presented with a display of food sitting on a stage floor; a hand grasped the top of the food display and lifted either just the top half of the food or all of the food into the air, and then the display remained stationary while looking times to the event outcomes were recorded. In contrast to Experiment 1, however, each display contained a single food item (a lemon or an orange pepper) that either broke into two pieces (separate) or moved as a whole (together). Looking times at the event outcomes were compared to one another and to the looking times of the monkeys in Experiment 1, to investigate whether monkeys’ preferences between the event outcomes depend on the volume of food that is lifted or on the monkeys’ parsing of the food into distinct objects.

Results

Figure 4a presents the principal findings of Experiment 3. Monkeys showed a nonsignificant trend toward looking longer at the “separate” outcome display (4.5 sec, SE = .4 sec) than at the “together” outcome display (3.7 sec, SE = .4 sec), F(1, 29) = 3.2, p = .08. Of the 30 monkeys tested, 19 looked longer at the “separate” display and 11 looked longer at the “together” display (χ² = 2.1, nonsignificant).

In contrast, an analysis comparing looking times in Experiments 1 and 3 revealed a significant interaction between test display (together vs. separate) and experiment (1 vs. 3), F(1, 100) = 8.5, p = .004. Monkeys looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Figure 4. Results from Experiments 3 and 4. Rhesus look nonsignificantly longer when one object appears in two pieces than when it appears together as a whole, both (a) when the object is moved (Experiment 3) and (b) when it is stationary (Experiment 4). [Figure not reproduced: panels a and b; vertical axis, looking time (sec), 0 to 6.]

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display, χ² = 4.9, p < .05.
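The counts of monkeys preferring each display can be checked against an even split with a chi-square goodness-of-fit test on one degree of freedom. The sketch below is an illustration rather than the authors' analysis code, and the function name `chi_square_even_split` is hypothetical; the counts are those reported for Experiment 3 (19 vs. 11), Experiment 4 (27 vs. 16), and the two experiments combined (46 vs. 27). With one degree of freedom, the p value is obtained from the complementary error function.

```python
import math

def chi_square_even_split(count_a, count_b):
    """Chi-square (df = 1) testing two counts against a 50/50 split."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For df = 1, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# (separate, together) preference counts as reported in the text.
counts = {
    "Experiment 3": (19, 11),
    "Experiment 4": (27, 16),
    "Experiments 3 and 4": (46, 27),
}

results = {name: chi_square_even_split(*pair) for name, pair in counts.items()}
for name, (chi2, p_value) in results.items():
    verdict = "p < .05" if p_value < 0.05 else "nonsignificant"
    print(f"{name}: chi-square = {chi2:.1f} ({verdict})")
```

The three statistics round to 2.1, 2.8, and 4.9, and only the combined sample exceeds the .05 criterion, which illustrates how pooling the two experiments yields enough power to detect the preference for the “separate” display.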

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. It instead supports the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: Artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects, and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates, and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1: Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive with the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and ultimately to humans’ superior tool use (see Note 2).

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features such as edge alignment and texture similarity to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features, such as surface coloring and shape, as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are

Munakata et al 53

responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
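The proposed comparison of population responses can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `deviation_from_models`, the choice of mean absolute deviation as the comparison measure, and all firing rates are hypothetical and are not drawn from any recording.

```python
from statistics import mean

def deviation_from_models(resp_a, resp_b, resp_pair):
    """Mean absolute deviation of two-object responses from two simple models.

    resp_a, resp_b: per-neuron responses to each object shown alone.
    resp_pair: per-neuron responses to the two-object display.
    Returns (deviation from the sum model, deviation from the average model).
    """
    if not (len(resp_a) == len(resp_b) == len(resp_pair)):
        raise ValueError("all response lists must cover the same neurons")
    sum_pred = [a + b for a, b in zip(resp_a, resp_b)]
    avg_pred = [(a + b) / 2 for a, b in zip(resp_a, resp_b)]
    dev_sum = mean(abs(p - s) for p, s in zip(resp_pair, sum_pred))
    dev_avg = mean(abs(p - v) for p, v in zip(resp_pair, avg_pred))
    return dev_sum, dev_avg

# Hypothetical firing rates (spikes/sec) for four neurons; NOT recorded data.
object_a = [20, 5, 12, 30]
object_b = [8, 25, 10, 4]
two_object = [14, 15, 11, 17]

dev_sum, dev_avg = deviation_from_models(object_a, object_b, two_object)
print(f"deviation from sum model: {dev_sum}, from average model: {dev_avg}")
```

In this invented example the two-object responses match the average model exactly. A real analysis would need a statistical test of the deviations rather than a raw comparison.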

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bulthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific, visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order, featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation and, in turn, of humans’ unique tool and symbol use may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper, which have been used extensively with humans of all ages, require no training, and are applicable to free-ranging as well as captive animals, could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age > 4 years) and half adult females (age > 3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An

54 Journal of Cognitive Neuroscience Volume 13 Number 1

additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide by 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
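The counterbalancing described above can be made concrete with a brief sketch. The object-pair and trial-type labels come from the design; the enumeration into four cells and the round-robin assignment in `assign` are assumptions made for illustration, not a description of how monkeys were actually assigned.

```python
from itertools import product

# Labels from the Design paragraph; the cell layout itself is an assumption.
PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")

def counterbalance_cells():
    """Enumerate the 4 cells: which pair is shown 'together' x which trial type comes first."""
    cells = []
    for together_pair, first_type in product(PAIRS, TRIAL_TYPES):
        separate_pair = PAIRS[1] if together_pair == PAIRS[0] else PAIRS[0]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_type = "separate" if first_type == "together" else "together"
        cells.append([(first_type, pair_for[first_type]),
                      (second_type, pair_for[second_type])])
    return cells

def assign(n_monkeys):
    """Cycle through the cells so each is used about equally often (hypothetical scheme)."""
    cells = counterbalance_cells()
    return [cells[i % len(cells)] for i in range(n_monkeys)]

if __name__ == "__main__":
    # Show the four distinct schedules that the 28 hold-top monkeys would cycle through.
    for schedule in assign(28)[:4]:
        print(schedule)
```

Each schedule lists the two test trials in order, and every monkey sees both trial types and both food pairs exactly once.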

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object, from above, with the right hand (the two objects were attached with toothpicks, invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects already held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA, with the additional factor of Experiment, compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
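For a within-subjects factor with only two levels, the one-way repeated-measures F statistic equals the square of the paired t statistic, with degrees of freedom (1, n − 1). The sketch below illustrates that equivalence. It is not the authors' analysis code: the function name `within_subjects_f` is hypothetical, and the looking times are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def within_subjects_f(together, separate):
    """Return (F, df_effect, df_error) for a two-level within-subjects factor.

    With only two levels, the repeated-measures F equals the squared paired t.
    """
    if len(together) != len(separate):
        raise ValueError("each monkey must contribute both trials")
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat ** 2, 1, n - 1

# Hypothetical looking times in seconds (NOT data from the study).
together = [4.1, 3.2, 5.0, 2.8, 4.4, 3.9]
separate = [3.0, 3.5, 3.8, 2.1, 3.6, 3.1]
f_value, df1, df2 = within_subjects_f(together, separate)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

Under this scheme, a sample of 28 monkeys would yield degrees of freedom of (1, 27).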

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1, except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.


Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.



looked longer at the outcome of the “together” event in Experiment 1 than in Experiment 3.

Discussion

When monkeys were presented with events in which either a single food item moved as a whole or half the object moved independently of the rest, they did not look longer at the event outcome that followed motion of a greater food volume. Indeed, monkeys showed a marginally significant tendency in the opposite direction, looking longer at the outcome of the event in which the object broke apart. Looking preferences between the “together” and “separate” trials differed significantly from the preferences shown in Experiment 1, in which the events involved two distinct objects. These findings accord with the thesis that monkeys use featural information to parse visual scenes into objects, represent each object as separately movable and manipulable, and look longer at events in which two distinct objects move together.

Nevertheless, one of the alternative accounts could be revised to account for this collection of data. Perhaps monkeys have a preference both for event outcomes that follow the motion of more food and for event outcomes that reveal the inside of a food object. According to this revised account, monkeys in Experiment 1 looked longer following an event in which two distinct objects moved together because of their preference for more stuff moving. This preference was not evident in Experiment 3 because it competed with an intrinsic preference for the outcome display from the “separate” trial. Because the inside of the lemon or pepper was visible following the “separate” event of Experiment 3, but not following either the “together” event in Experiment 3 or either event in Experiment 1, a preference for viewing the inside of a food object would produce a greater preference for the “separate” outcome display in Experiment 3 than in Experiment 1.

Experiment 4 tests this revised account by presenting a new group of monkeys with the outcome displays of the “together” and “separate” events from Experiment 3, with no prior presentation of any objects or motion. According to the revised account, monkeys should show a stronger preference for the “separate” event in Experiment 4 than in Experiment 3, because only Experiment 3 would invoke the competing preference for more stuff moving in the “together” event. According to the original object-parsing account, the preference for the “separate” event in Experiment 4 will not exceed that in Experiment 3. If the monkeys in Experiment 3 expect single objects to move as cohesive units, then preference for the outcome of the “separate” event might be greater in Experiment 3 than in Experiment 4. If monkeys have no expectations about the cohesive or noncohesive motion of food objects, then preferences should be the same in the two experiments.

EXPERIMENT 4

Experiment 4 used the outcome displays of Experiment 3 and the method of Experiment 2. Monkeys were presented with one stationary display in which a hand held a whole food object in the air (together) and one stationary display in which a hand held the top half of the food object in the air while the bottom half of the food object rested on the display floor (separate). Looking times to the two displays were compared to each other and to the looking times of the monkeys in Experiment 3, who viewed the same displays following presentation of the whole object and two different patterns of motion.

Results

Figure 4b presents the principal findings of Experiment 4. Monkeys looked equally at “together” events (3.7 sec, SE = .3 sec) and “separate” events (4.2 sec, SE = .3 sec), F(1, 42) = 1.5, p = .2. Of the 43 monkeys tested, 16 looked longer at the “together” event and 27 looked longer at the “separate” event (χ² = 2.8, nonsignificant).

The analysis comparing looking times in Experiments 3 and 4 revealed a significant main effect of trial type: monkeys looked longer at the “separate” outcome display (4.3 sec, SE = .3 sec) than at the “together” outcome display (3.7 sec, SE = .2 sec), F(1, 71) = 4.5, p < .05. Of the 73 monkeys tested in Experiments 3 and 4, 46 looked longer at the “separate” outcome display and 27 looked longer at the “together” outcome display, χ² = 4.9, p < .05.
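The two count-based tests can be checked by hand. The sketch below assumes an uncorrected chi-square goodness-of-fit test against an even split with one degree of freedom; the text does not state which variant was used, so this is an assumption, and the function name `chi_square_equal_split` is hypothetical. The counts are those reported for the monkeys who looked longer at each display.

```python
from math import erfc, sqrt

def chi_square_equal_split(count_a, count_b):
    """Chi-square goodness-of-fit (1 df) against an even split, with its p-value."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Counts of monkeys looking longer at "together" vs. "separate" displays.
for label, a, b in [("Experiment 4", 16, 27), ("Experiments 3 and 4", 27, 46)]:
    chi2, p = chi_square_equal_split(a, b)
    print(f"{label}: chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under this assumption, the statistics round to the reported values of 2.8 and 4.9, and only the second falls below the .05 threshold.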

Discussion

In Experiment 4, rhesus monkeys showed a nonsignificantly smaller preference for the “separate” display, in which a single food item appeared in two pieces, than their counterparts in Experiment 3. This finding provides evidence against the thesis that monkeys’ looking times depend on a preference for the outcomes of events involving the motion of more food stuff, combined with an intrinsic preference for the separated outcome display with one object. The findings instead support the object-parsing interpretation of the results from Experiment 1. Monkeys appear to use featural information to parse visual displays into distinct objects, and they find events in which distinct objects move together more novel or less natural than events in which distinct objects move separately.

The findings of Experiments 3 and 4 provide no clear evidence concerning monkeys’ expectation that single food items will move cohesively. If monkeys had such an expectation, then the subjects in Experiment 3 should have looked longer at the “separate” display than those in Experiment 4, because the “separate” display in Experiment 3 followed an event in which a single object broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults, tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates, and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1: Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive with the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and, ultimately, to humans’ superior tool use (see Note 2).

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features such as edge alignment and texture similarity to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features such as surface coloring and shape as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).

52 Journal of Cognitive Neuroscience Volume 13 Number 1

Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training, and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
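The comparison proposed above can be illustrated with a minimal sketch. It assumes that each display yields one mean firing rate per recorded neuron, and it uses a root-mean-square difference as the measure of departure from the sum and average predictions. The function names, the choice of measure, and the firing rates are hypothetical and serve only as illustration; they are not data or procedures from the present studies.

```python
from math import sqrt


def rms_difference(observed, predicted):
    """Root-mean-square difference between two equal-length response vectors."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("response vectors must be non-empty and equal in length")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def compare_to_predictions(resp_a, resp_b, resp_pair):
    """Compare the two-object response with the sum and the average predictions.

    Each argument holds one mean firing rate per neuron, in the same neuron order.
    """
    if len(resp_a) != len(resp_b):
        raise ValueError("single-object response vectors must be equal in length")
    summed = [a + b for a, b in zip(resp_a, resp_b)]
    averaged = [(a + b) / 2 for a, b in zip(resp_a, resp_b)]
    return {
        "rms_from_sum": rms_difference(resp_pair, summed),
        "rms_from_average": rms_difference(resp_pair, averaged),
    }


# Hypothetical firing rates (spikes/sec) for three neurons; invented for illustration.
pepper_alone = [10.0, 4.0, 6.0]
potato_alone = [2.0, 8.0, 6.0]
pepper_on_potato = [6.0, 6.0, 6.0]  # matches the average prediction in this made-up case

result = compare_to_predictions(pepper_alone, potato_alone, pepper_on_potato)
print(result)
```

In an actual study, the departure from each prediction would have to be evaluated against trial-to-trial variability in the recorded responses; the sketch only shows how the two predictions are formed.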

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age >4 years) and half adult females (age >3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide × 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
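The counterbalancing described above can be sketched as a crossing of two factors within each hand condition: which object pair appears on the “together” trial, and which trial type comes first. The enumeration below is a minimal illustration. The label strings and the rotation used to assign monkeys to cells are assumptions made for the sketch, since the assignment scheme actually used is not specified here.

```python
from itertools import cycle, product

# Factor levels taken from the design; the label strings themselves are our own.
PAIRS = ("pepper/potato", "pumpkin/ginger")
ORDERS = ("together first", "separate first")


def counterbalancing_cells():
    """Cross the pair shown on the "together" trial with the order of the test trials."""
    cells = []
    for together_pair, order in product(PAIRS, ORDERS):
        # The pair not used on the "together" trial appears on the "separate" trial.
        separate_pair = PAIRS[1] if together_pair == PAIRS[0] else PAIRS[0]
        cells.append({"together": together_pair, "separate": separate_pair, "order": order})
    return cells


def assign(n_monkeys):
    """Assign monkeys to cells in rotation (an assumed scheme, not a documented one)."""
    rotation = cycle(counterbalancing_cells())
    return [next(rotation) for _ in range(n_monkeys)]


hold_top = assign(28)   # 28 monkeys in the hold-top condition
hold_both = assign(31)  # 31 monkeys in the hold-both condition

for cell in counterbalancing_cells():
    n = sum(1 for assigned in hold_top if assigned == cell)
    print(cell, "hold-top monkeys:", n)
```

With 28 monkeys the four cells fill evenly; with 31 they cannot, so one cell in the hold-both condition necessarily holds one monkey fewer under any assignment scheme.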

Procedure

All testing was conducted by one experimenter and one camcorder operator; a test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks, invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
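The intercoder agreement reported above is a correlation between two coders’ looking-time judgments for the same trials. A minimal sketch of such a computation follows, assuming a Pearson product-moment correlation (the type of correlation is not stated in the text). The looking times are invented for illustration and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical total looking times (sec) judged by two coders for the same four trials.
coder_a = [4.2, 7.9, 2.5, 9.1]
coder_b = [4.0, 8.3, 2.9, 8.8]

print(round(pearson_r(coder_a, coder_b), 2))
```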

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between the judgments of the two observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
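For readers less familiar with this analysis, the following minimal sketch computes the F ratio for a one-way within-subjects ANOVA with two levels by partitioning the total sum of squares into display, subject, and error components. With only two levels, this F equals the square of the paired t statistic. The function name and the looking times are hypothetical and are not data from the study.

```python
def within_subjects_anova_two_levels(level_a, level_b):
    """Return (F, df_effect, df_error) for a two-level within-subjects factor."""
    if len(level_a) != len(level_b) or len(level_a) < 2:
        raise ValueError("need paired scores from at least 2 subjects")
    n = len(level_a)   # number of subjects
    k = 2              # number of levels of the within-subjects factor
    all_scores = list(level_a) + list(level_b)
    grand_mean = sum(all_scores) / (n * k)

    # Variation between the two display means.
    level_means = (sum(level_a) / n, sum(level_b) / n)
    ss_effect = n * sum((m - grand_mean) ** 2 for m in level_means)

    # Variation between subjects, removed from the error term.
    subject_means = [(a + b) / k for a, b in zip(level_a, level_b)]
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)

    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_error = ss_total - ss_effect - ss_subjects

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    if ss_error <= 0:
        raise ValueError("F is undefined when there is no residual variation")
    f_ratio = (ss_effect / df_effect) / (ss_error / df_error)
    return f_ratio, df_effect, df_error


# Hypothetical looking times (sec) for four monkeys; invented for illustration.
together = [5.0, 6.0, 7.0, 8.0]
separate = [3.0, 5.0, 4.0, 6.0]

f_ratio, df_effect, df_error = within_subjects_anova_two_levels(together, separate)
print(f"F({df_effect}, {df_error}) = {f_ratio:.2f}")
```

The comparison across experiments would add Experiment as a between-subjects factor, which requires a mixed-design partition not shown in this sketch.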

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1 except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object, with a horizontal cut through the middle, appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected: Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely: Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bülthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.


Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.

58 Journal of Cognitive Neuroscience Volume 13 Number 1


broke apart and moved noncohesively. Although the data from Experiments 3 and 4 tend in this direction, no reliable differences were obtained between the looking preferences in the two experiments. Reliable preferences for the outcomes of noncohesive motions have been observed both with human infants and with human adults tested with similar methods and with displays of simple artifacts (Spelke et al., 1989, 1993; Kestenbaum et al., 1987). The absence of a clear effect of cohesiveness in Experiments 3 and 4 may reflect either a species difference or a difference in object domain: Artifacts are more apt to move cohesively than is food, which breaks apart due to decay, cutting, or eating. Such conclusions cannot be drawn from the present experiments, however, because of the equivocal findings.

GENERAL DISCUSSION

Four experiments provide evidence that rhesus monkeys spontaneously parse arrays of adjacent food items into distinct objects and that they represent these objects as separately movable and manipulable. Monkeys looked longer at the outcomes of events in which two previously stationary, adjacent objects moved as one unit than at the outcomes of events in which one of the objects moved separately from the other. This preference was not attributable to any intrinsic preference for the former event outcome or to any preference for an outcome that followed a greater amount of motion. Instead, it provides evidence that the monkeys represented the common motion of the two distinct objects as more novel or surprising than the independent motion of those objects.

The present findings suggest broad similarities between the object representations formed by human and nonhuman primates and between the ways in which those representations are used to support inferences about objects’ movability. The well-known, detailed homologies between the lower-level visual mechanisms of human and nonhuman primates (Tootell et al., 1996; Sereno et al., 1995; DeYoe & Van Essen, 1988; Maunsell & Newsome, 1987; Desimone et al., 1984) therefore appear to extend to higher-level mechanisms for parsing objects and interpreting object motions. In addition, our findings provide evidence that adult monkeys and human infants show similar behavioral responses to object motions, with heightened visual exploration of motions that are novel or surprising. These findings complement previous results showing that rhesus monkeys, cotton-top tamarins, and human infants show similar looking preferences for events in which objects are occluded or behave in anomalous ways (e.g., Hauser et al., 1996; Hauser, 1998).

Differences in Sensitivity to Hands

Our studies also reveal two differences between the object representations formed by adult rhesus monkeys and young human infants. First, human infants take account of the actions of human hands in analyzing the motions and support relations among objects. When human infants see an inanimate object rise into the air in a display that includes a human hand, they show a novelty reaction if the hand and object are spatially separated but not if the hand is grasping the object (Needham & Baillargeon, 1993; Leslie, 1984). Monkeys, in contrast, showed no sensitivity to the supporting role of hands in Experiment 1. Their novelty reaction to the common rising motion of two objects was equally strong when no hand contacted the bottom object (an event that implies that the two objects were connected) and when hands contacted each of the objects (an event that implies no connection between the objects).

We see two plausible accounts of the observed differences in sensitivity to hands. First, human infants’ greater sensitivity to the supporting role of hands may reflect a species difference in the use of hands, specifically in the manipulation of inanimate objects. Because human infants and human adults manipulate objects more than other primates do, human infants may have more opportunities to learn about hand–object support relations than do other species. A second possibility, not mutually exclusive with the first, is that humans are innately predisposed to attend to the ways in which inanimate objects are manipulated by other humans, which in turn contributes both to infants’ abilities to learn rapidly about tools and, ultimately, to humans’ superior tool use (see Note 2).

Differences in the Use of Object Features for Boundaries

The second difference between the object representations of adult monkeys and young human infants concerns the use of object features, such as surface coloring and shape, as information for object boundaries. Adult monkeys and human infants above 11 months of age use featural information to perceive object boundaries; in contrast, infants below 11 months of age do not reliably exhibit this ability. Various factors have been proposed to underlie the developmental change observed in humans. Some factors focus on perceptual development, with behavioral changes attributed to infants’ emerging abilities to use image features such as edge alignment and texture similarity to group portions of the visual field into units directly (e.g., Kellman & Arterberry, 1998; Needham, 1998). In contrast, other factors focus on the development of higher-level processes, with behavioral changes attributed to an emerging ability to represent objects as members of kinds and an emerging propensity to use object features, such as surface coloring and shape, as information for the kinds to which specific objects belong (Needham & Modi, 2000; Xu & Carey, 1996). Further, this change may be driven by the acquisition of verbal labels for the objects (Xu & Carey, 1996).


Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans, using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object-coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
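The comparison proposed above can be illustrated with a minimal sketch. It assumes each display is summarized as a vector of mean firing rates across the same recorded neurons, and it uses Euclidean distance as the measure of difference. The function names, the distance measure, and the firing rates are all hypothetical choices made for illustration; none of them comes from the experiments reported here.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length response vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def compare_pair_to_singles(resp_a, resp_b, resp_pair):
    """Distance from the two-object response to the sum and to the average
    of the two single-object responses (one entry per neuron)."""
    summed = [a + b for a, b in zip(resp_a, resp_b)]
    averaged = [(a + b) / 2 for a, b in zip(resp_a, resp_b)]
    return {
        "to_sum": euclidean(resp_pair, summed),
        "to_average": euclidean(resp_pair, averaged),
    }


# Hypothetical firing rates (spikes/s) for four neurons; not real data.
pepper_alone = [20.0, 5.0, 12.0, 0.0]
potato_alone = [4.0, 18.0, 10.0, 2.0]
pepper_on_potato = [12.0, 11.5, 11.0, 1.0]

result = compare_pair_to_singles(pepper_alone, potato_alone, pepper_on_potato)
```

In this made-up case the two-object response coincides with the average of the single-object responses. By the logic above, that pattern would be consistent with encoding the display in terms of its two separate objects; a large distance from both the sum and the average would point the other way. With real recordings, deciding how large a distance counts as different would require a statistical criterion that this sketch does not supply.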

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bulthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.
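The two predictions above can be written as a simple decision rule. The sketch below assumes population responses are vectors of firing rates and uses cosine similarity to compare them. The similarity measure, the function names, and every number are illustrative assumptions rather than part of the proposed study.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two nonzero response vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def favored_account(same_kind_pair, similar_look_pair):
    """Return the account favored by the pattern of response similarity.

    same_kind_pair: responses to two visually different forms of one food.
    similar_look_pair: responses to two visually similar foods of different kinds.
    """
    same_kind = cosine(*same_kind_pair)
    similar_look = cosine(*similar_look_pair)
    if same_kind > similar_look:
        return "kinds"      # same food, different look, similar responses
    if similar_look > same_kind:
        return "features"   # similar look, different food, similar responses
    return "undetermined"


# Hypothetical firing rates (spikes/s) for three neurons; not real data.
sliced_banana = [10.0, 2.0, 8.0]
unpeeled_banana = [9.0, 3.0, 7.0]
green_banana = [8.0, 3.0, 9.0]
cucumber = [1.0, 9.0, 2.0]

verdict = favored_account((sliced_banana, unpeeled_banana),
                          (green_banana, cucumber))
```

With these invented numbers the banana forms evoke similar responses and the green banana and cucumber do not, so the rule returns "kinds". A real analysis would need many stimuli per condition and a test of whether the similarity difference is reliable.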

In such ways, our understanding of object representation and, in turn, of humans’ unique tool and symbol use may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age > 4 years) and half adult females (age > 3 years). Subjects were tested opportunistically whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide by 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
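One way to picture the orthogonal counterbalancing described above is to cross the two binary choices: which object pair appears on the “together” trial, and which trial type comes first. The sketch below enumerates the four resulting cells and cycles monkeys through them. The function names and the round-robin assignment are assumptions for illustration; the text does not say how individual monkeys were assigned to cells.

```python
from itertools import product

PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def counterbalance_cells():
    """Cross pair-to-trial-type assignment with trial order (2 x 2 = 4 cells).

    Each cell is an ordered list of two (trial_type, object_pair) tuples.
    """
    cells = []
    for together_pair, first_trial in product(PAIRS, TRIAL_TYPES):
        separate_pair = PAIRS[1] if together_pair == PAIRS[0] else PAIRS[0]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_trial = "separate" if first_trial == "together" else "together"
        cells.append([(first_trial, pair_for[first_trial]),
                      (second_trial, pair_for[second_trial])])
    return cells


def assign(n_monkeys):
    """Hypothetical round-robin assignment of monkeys to the four cells."""
    cells = counterbalance_cells()
    return [cells[i % len(cells)] for i in range(n_monkeys)]


# The 28 monkeys of the hold-top condition divide evenly: 7 per cell.
schedule = assign(28)
```

With 31 monkeys, as in the hold-both condition, the cells cannot be filled equally, so some imbalance is unavoidable under any assignment scheme.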

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
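For readers who want to see how the within-subjects effects in such a 2 × 2 mixed design can be computed, the sketch below uses the fact that, with two within-subject levels, both the Display effect and the Display × Condition interaction reduce to tests on each monkey’s difference score (together minus separate). It assumes an unweighted-means treatment of unequal group sizes, which the text does not specify. The function name and the looking times are hypothetical, and this is not presented as the authors’ analysis code.

```python
from statistics import mean, variance


def within_effects(group_a, group_b):
    """F ratios for Display and Display x Condition in a 2 x 2 mixed design.

    group_a, group_b: one (together, separate) looking-time tuple per monkey,
    for the two between-subjects conditions. Both F ratios have
    (1, n_a + n_b - 2) degrees of freedom.
    """
    d_a = [t - s for t, s in group_a]   # difference scores, condition A
    d_b = [t - s for t, s in group_b]   # difference scores, condition B
    n_a, n_b = len(d_a), len(d_b)
    df_error = n_a + n_b - 2
    # Pooled variance of the difference scores across the two conditions.
    pooled_var = ((n_a - 1) * variance(d_a) + (n_b - 1) * variance(d_b)) / df_error
    scale = pooled_var * (1 / n_a + 1 / n_b)
    # Display: is the unweighted mean of the two group mean differences zero?
    f_display = ((mean(d_a) + mean(d_b)) / 2) ** 2 / (scale / 4)
    # Interaction: do the two group mean differences differ from each other?
    f_interaction = (mean(d_a) - mean(d_b)) ** 2 / scale
    return {"display": f_display, "interaction": f_interaction, "df": (1, df_error)}


# Hypothetical looking times in seconds; not data from the experiments.
hold_top = [(6.0, 3.0), (5.0, 4.0), (7.0, 3.0), (4.0, 4.0)]
hold_both = [(4.0, 3.0), (3.0, 4.0), (6.0, 4.0), (2.0, 4.0)]

effects = within_effects(hold_top, hold_both)
```

The between-subjects effect of Condition is not computed here; it would be tested on each monkey’s mean looking time across the two trials.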

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm; floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm; face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects currently held in the air. As the screen was raised, the experimenter called “Count” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.
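Inter-coder agreement of the kind reported above is a correlation between two observers’ looking-time judgments on the same trials. The sketch below assumes a Pearson product-moment correlation, which the text does not name explicitly, and uses invented looking times purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of looking times."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical looking times (sec) scored by two coders on five trials.
coder_1 = [2.0, 4.5, 7.0, 3.5, 9.0]
coder_2 = [2.5, 4.0, 7.5, 3.0, 8.5]

agreement = pearson_r(coder_1, coder_2)
```

A value close to 1 indicates that the two coders ranked and scaled the trials almost identically, which is the sense in which the reported correlations support the reliability of the coding.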

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1, except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object, with a horizontal cut through the middle, appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bülthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.

Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bülthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.

58 Journal of Cognitive Neuroscience Volume 13 Number 1

Corresponding to these two interpretations of the developmental change in humans are two different interpretations of monkeys’ performance in the present studies. Monkeys may have perceived the object boundaries by categorizing each object as a different kind of food, or they may have perceived the boundaries by grouping together elements in the visual scene in accord with their colors, textures, and alignment relationships.

There are compelling data suggesting that monkeys represent the category of food, such that they are likely to have “food kind” representations. First, monkeys in the present studies were strongly attentive to food items and occasionally attempted to approach and take them, behaviors often observed with familiar foods and rarely observed with familiar nonfood objects. This was true even though they had no prior experience with these particular food items. Second, experiments by Santos, Hauser, and Spelke (in preparation) suggest that monkeys given evidence that a novel object is food (by observing a person eating part of it) subsequently approach that object, an odorless replica of that object, and other objects of the same color and texture as the original object but of a different shape. In contrast, monkeys do not approach these objects when they are given evidence that the initial object is not food (by observing a person putting the object in her ear rather than her mouth). This finding suggests that monkeys categorize novel objects as kinds of food in terms of properties such as their colors and textures. If perceptible properties of the present stimulus objects allowed monkeys to perceive correctly that these objects were food, then monkeys’ propensity to categorize objects as the same foods only when they share a common color and texture would lead them to perceive each display of two (differently colored and textured) foods as containing two distinct objects.

Whatever the reason for monkeys’ successful use of featural information to perceive object boundaries, the existence of this capacity in rhesus monkeys casts doubt on the thesis that this ability either depends on or gives rise to any uniquely human ability to represent objects. Humans do represent objects in unique ways, for we have unparalleled abilities to build and use complex tools and to communicate about objects with unique symbols for thousands of object kinds. The sources of our uniqueness, however, do not clearly appear in the contexts that have been used thus far to assess object representations in human infants.

Steps Toward a Cognitive Neuroscience of Natural Object Representation

Although our experiments focus strictly on behavioral measures and functional analyses, we believe their greatest potential lies in the contributions they can make to understanding the neural basis of object representation. Rhesus monkeys are one of the most intensively studied species in the neuroanatomy and neurophysiology of vision, and such studies have provided evidence for extensive homologies between their visual systems and those of humans. Our experiments contribute to this literature in three ways. First, they suggest that rhesus monkeys and humans have similar higher visual mechanisms for representing objects and interpreting object motions. The origin of these similarities remains an open question, with likely contributions from both genetically encoded homologies in the underlying neural architectures and similar experiential histories interacting with similar neural learning mechanisms.

Second, our experiments provide evidence that the object representations of monkeys and humans can be assessed by nearly identical tasks. Moreover, these tasks require no training, and so allow assessment of the representations that humans and monkeys develop and use spontaneously, rather than less naturalistic representations that may have been developed specifically for solving experimental tasks over months of training on those tasks (see discussion in Rao et al., 1997). Finally, these tasks can be applied not only to adult animals but to infants. Indeed, the preferential looking method was developed for use with infant humans and monkeys (Fantz, 1961), and it has been used to study lower-level visual functions in both species (see Kellman & Banks, 1998, for review). The method therefore should be ideal for investigating the neural architecture subserving visual cognition in both species.

Third, our experiments offer a behavioral task that can readily be adapted for simultaneous behavioral and neural recordings in monkeys. Preferential looking methods have been used successfully both with semi–free-ranging rhesus monkeys and with captive cotton-top tamarins (e.g., Hauser et al., 1996; Hauser, 1998). In preliminary research, they have yielded similar findings with rhesus monkeys tested with stabilized heads and implanted electrodes (Munakata, Miller, & Spelke, unpublished). In the future, therefore, cognitive neuroscientists should be able to use these methods to probe the neural mechanisms of object representations in untrained monkeys whose experience with objects can be precisely controlled, and to compare the functional properties of those mechanisms directly to those of human infants with varying degrees of experience. Such studies should prove a valuable complement to studies of the neural mechanisms of object representations in adult humans, using the combined approaches of cognitive psychology and functional brain imaging.

More specifically, the studies reported in this paper could serve as the starting point for physiological studies probing the cognitive and behavioral functions of neurons activated by visual displays. As a first step, one could ask whether the extensively studied object coding neurons in the inferotemporal cortex (Tanaka, 1996; Perrett et al., 1987; Baylis, Rolls, & Leonard, 1985) are responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
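The proposed comparison between two-object responses and the sum or average of the single-object responses can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the firing rates are invented, and the function name `model_errors` and the choice of mean absolute deviation as the mismatch measure are hypothetical rather than part of the reported studies.

```python
def model_errors(pair, obj_a, obj_b):
    """Compare two-object responses with two simple combination models.

    pair, obj_a, obj_b: per-neuron responses (same neuron order) to the
    two-object display and to each object shown alone.
    Returns the mean absolute deviation of the pair response from
    (a) the sum and (b) the average of the single-object responses.
    """
    if not (len(pair) == len(obj_a) == len(obj_b)) or not pair:
        raise ValueError("need equal-length, non-empty response lists")
    n = len(pair)
    sum_err = sum(abs(p - (a + b)) for p, a, b in zip(pair, obj_a, obj_b)) / n
    avg_err = sum(abs(p - (a + b) / 2) for p, a, b in zip(pair, obj_a, obj_b)) / n
    return {"sum": sum_err, "average": avg_err}


# Invented firing rates (spikes/sec) for three hypothetical neurons.
pair_display = [10.0, 20.0, 14.0]
object_a = [8.0, 18.0, 12.0]
object_b = [12.0, 22.0, 16.0]

errors = model_errors(pair_display, object_a, object_b)
print(errors)
```

In this invented example the pair responses match the average model exactly and miss the sum model by a wide margin. Large deviations from both models would correspond to the case described above, in which the two-object display is encoded differently from its separate objects.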

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age > 4 years) and half adult females (age > 3 years). Subjects were tested opportunistically, whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide × 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items therefore were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
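The orthogonal counterbalancing described above yields four cells within each condition: two ways of pairing the object sets with the trial types, crossed with two trial orders. The following Python sketch enumerates those cells. The round-robin assignment in `assign` is an assumption made for illustration; the source does not state how individual monkeys were assigned to cells, and the function names are hypothetical.

```python
from itertools import product

PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def counterbalance_cells():
    """Return the 2 x 2 counterbalancing cells.

    Each cell is an ordered pair of (trial_type, object_pair) tuples:
    the first trial shown, then the second trial shown.
    """
    cells = []
    for together_pair, first_trial in product(PAIRS, TRIAL_TYPES):
        # The object pair not used for the "together" trial is used for "separate".
        separate_pair = PAIRS[1 - PAIRS.index(together_pair)]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_trial = TRIAL_TYPES[1 - TRIAL_TYPES.index(first_trial)]
        cells.append(
            ((first_trial, pair_for[first_trial]),
             (second_trial, pair_for[second_trial]))
        )
    return cells


def assign(n_monkeys):
    """Assign monkeys to cells in rotation (an illustrative assumption)."""
    cells = counterbalance_cells()
    return [cells[i % len(cells)] for i in range(n_monkeys)]


for cell in counterbalance_cells():
    print(cell)
```

With 28 monkeys, as in the hold-top condition, rotation fills each cell equally; with 31 monkeys the cells cannot be exactly equal in size.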

Procedure

All testing was conducted by one experimenter and one camcorder operator; a test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials, unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
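For readers who want to see the structure of such a 2 × 2 mixed analysis, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors’ analysis code: the looking times are invented, the function names are hypothetical, and with unequal group sizes the sketch uses the weighted-means formulation, which the source does not specify. Because the within-subjects factor has only two levels, each F ratio can be computed from per-subject means and per-subject difference scores.

```python
def mean(xs):
    return sum(xs) / len(xs)


def one_way_f(groups):
    """Between-groups F for a one-way ANOVA: returns (F, df_between, df_within)."""
    all_vals = [x for g in groups for x in g]
    grand = mean(all_vals)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def mixed_2x2(data):
    """data: {condition: [(together_sec, separate_sec), ...]}, one tuple per monkey."""
    subject_means = [[(t + s) / 2 for t, s in subj] for subj in data.values()]
    diffs = [[t - s for t, s in subj] for subj in data.values()]
    # Condition: do the groups differ in overall looking time?
    f_condition, df1, df2 = one_way_f(subject_means)
    # Condition x Display: do the groups differ in their together-minus-separate difference?
    f_interaction, _, _ = one_way_f(diffs)
    # Display: is the mean difference across all monkeys nonzero?
    all_d = [d for g in diffs for d in g]
    ss_error = sum((d - mean(g)) ** 2 for g in diffs for d in g)
    f_display = len(all_d) * mean(all_d) ** 2 / (ss_error / df2)
    return {"Condition": f_condition, "Display": f_display,
            "Condition x Display": f_interaction, "df": (df1, df2)}


# Invented looking times in seconds: (together, separate) per monkey.
example = {
    "hold-top": [(6.0, 3.0), (5.0, 3.0), (7.0, 4.0)],
    "hold-both": [(4.0, 4.0), (5.0, 3.0), (3.0, 4.0)],
}
print(mixed_2x2(example))
```

Converting each F ratio to a p value requires the F distribution with the reported degrees of freedom, which a statistics package would normally supply.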

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm; floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm; face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects, which were already held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between the judgments of the two observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.
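Inter-coder agreement of this kind can be computed as a correlation between the two observers’ looking-time judgments. The Python sketch below assumes a Pearson product-moment correlation, which the source does not name explicitly; the function name `pearson_r` and the looking times are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of looking times."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Invented looking times (sec) for the same trials scored by two coders.
coder_1 = [2.1, 4.8, 3.3, 6.0, 1.2]
coder_2 = [2.4, 4.5, 3.0, 6.2, 1.5]
print(round(pearson_r(coder_1, coder_2), 3))
```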

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
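With only two levels of the within-subjects factor, a one-way repeated-measures ANOVA reduces to a test on each monkey’s difference score, and its F ratio equals the square of the paired t statistic. The Python sketch below shows that computation; the function name and the looking times are invented for illustration and are not the study’s data.

```python
def repeated_measures_f(together, separate):
    """F ratio for a two-level within-subjects factor.

    together, separate: looking times (sec) for the same monkeys, in the same order.
    Returns (F, (df_effect, df_error)); F equals the squared paired t statistic.
    """
    if len(together) != len(separate) or len(together) < 2:
        raise ValueError("need paired lists with at least 2 subjects")
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("F is undefined when all difference scores are equal")
    return n * mean_diff ** 2 / var_diff, (1, n - 1)


# Invented looking times for four monkeys.
f_ratio, df = repeated_measures_f([5.0, 6.0, 7.0, 8.0], [3.0, 5.0, 4.0, 6.0])
print(f_ratio, df)
```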

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1 except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses, due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.


Kellman P J amp Spelke E (1983) Perception of partially oc-cluded objects in infancy Cognitive Psychology 15 483ndash524

Kestenbaum R Termine N amp Spelke E S (1987) Perceptionof objects and object boundaries by 3-month-old infantsBritish Journal of Developmental Psychology 5 367ndash383

Leslie A M (1982) The perception of causality in infantsPerception 11 173ndash186

Leslie A M (1984) Infant perception of a manual pick-up eventBritish Journal of Developmental Psychology 2 19ndash32

Logothetis N K amp Sheinberg D L (1996) Visual object re-cognition Annual Review of Neuroscience 19 577ndash621

Maunsell J H amp Newsome W T (1987) Visual processing inmonkey extrastriate cortex Annual Review of Neu-roscience 10 363ndash401

Meltzoff A S (1988) Infant imitation and memory Nine-month-olds in immediate and deferred tests Child Devel-opment 59 217ndash225

Nagell K Olguin R S amp Tomasello M (1993) Processes ofsocial learning in the tool use of chimpanzees (Pan troglo-dytes) and human children (Homo sapiens) Journal ofComparative Psychology 107 174ndash186

Needham A (1997) Factors affecting infantsrsquo use of featuralinformation in object segregation Current Directions inPsychological Science 6 26ndash33

Needham A (1998) Infantsrsquo use of featural information in thesegregation of stationary objects Infant Behavior and De-velopment 21 47ndash76

Munakata et al 57

Needham A amp Baillargeon R (1993) Intuitions about sup-port in 45-month-old infants Cognition 47 121ndash148

Needham A amp Baillargeon R (1997) Object segregation in 8-month-old infants Cognition 62 121ndash149

Needham A amp Baillargeon R (1998) Effects of prior experi-ence on 45-month-old infantsrsquo object segregation InfantBehavior and Development 21 1ndash23

Needham A amp Modi M (2000) Infantsrsquo use of prior experi-ences with objects in object segregation Implications forobject recognition in infancy In H Reese (Ed) Advances inchild development and behavior vol 27 (pp 99ndash133)

Perrett D I Mistlin A J amp Chitty A J (1987) Visual neu-rones responsive to faces Trends in Neurosciences 10 358ndash364

Rao S C Rainer G amp Miller E K (1997) Integration of whatand where in the primate prefrontal cortex Science 276821

Rizzolatti G Fadiga L Fogassi L amp Gallese V (1999) Re-sonance behaviors and mirror neurons Archives Italiennesde Biologie 137 85

Scholl B amp Leslie A (in press) Explaining the infantrsquos objectconcept Beyond the perceptioncognition dichotomy In ELepore amp Z Pylyshyn (Eds) What is cognitive science Ox-ford Blackwell

Sereno M I Dale A M amp Tootell R B H (1995) Borders ofmultiple visual areas in humans revealed by functionalmagnetic resonance imaging Science 268 889

Spelke E (1985) Preferential looking methods as tools for thestudy of cognition in infancy In G Gottlieb amp N Krasnegor(Eds) Measurement of audition and vision in the first yearof postnatal life (pp 323ndash363) Norwood NJ Ablex

Spelke E Breinlinger K Jacobson K amp Phillips A (1993)Gestalt relations and object perception A developmentalstudy Perception 22 1483ndash1501

Spelke E amp Van de Walle G A (1993) Perceiving and rea-soning about objects Insights from infants In N Eilan amp R AMcCarthy (Eds) Spatial representation Problems in philo-sophy and psychology (pp 132ndash161) Oxford Blackwell

Spelke E S Hofsten C V amp Kestenbaum R (1989) Objectperception in infancy Interaction of spatial and kinetic in-formation for object boundaries Developmental Psychol-ogy 25 185ndash196

Sugihara T Edelman S amp Tanaka K (1998) Representationof objective similarity among three-dimensional shapes inthe monkey Biological Cybernetics 78 1

Tanaka K (1996) Inferotemporal cortex and object visionAnnual Review of Neuroscience 19 109ndash139

Tarr M J amp Bulthoff H H (1995) Is human object recogni-tion better described by geon structural descriptions or bymultiple views Comment on Biederman and Gerhardstein(1993) Journal of Experimental Psychology Human Per-ception and Performance 21 1494

Tomasello M Kruger A C amp Ratner H H (1993) Culturallearning Behavioral and Brain Sciences 16 495ndash552

Tootell R B H Dale A M Sereno M I amp Malach R (1996)New images from human visual cortex Trends in Neuros-ciences 19 481ndash489

von Hofsten C amp Spelke E (1985) Object perception andobject-directed reaching in infancy Journal of ExperimentalPsychology General 11 198ndash212

Woodward A L (1998) Infants selectively encode the goalobject of an actorrsquos reach Cogniton 69 1ndash34

Xu F amp Carey S (1996) Infantsrsquo metaphysics The case ofnumerical identity Cognitive Psychology 30 111ndash153

Xu F Carey S amp Welch J (1999) Infantsrsquo ability to useobject kind information for object individuation Cognition70 137ndash166

58 Journal of Cognitive Neuroscience Volume 13 Number 1

Page 11: Visual Representation in the Wild: How Rhesus Monkeys ...psych.colorado.edu/~oreilly/papers/MunakataEtAl01_monkey.pdfVisual Representation in the Wild: How Rhesus Monkeys Parse Objects

responsible for the behavioral results found in our experiments. The finding that monkeys encode two-object displays as two separate objects leads to the prediction that inferotemporal neurons will respond similarly to each object in a one-object display and in a two-object display, with a possible reduction in response to the latter display due to competition from the different object representations. It is possible, however, that monkeys distinguish the two objects in earlier stages of processing, parsing the display based on contiguous regions of the same general color and texture, without this parsing being clearly reflected in object-level representations. These alternatives could be distinguished by recording a population of inferotemporal responses to one of our two-object displays and to each of the two objects separately. If the two-object responses were different from the sum or average of the responses to the two separate objects, this would suggest that monkeys encode the two-object displays in a different manner than the separate objects at the level of the inferotemporal cortex.
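As a purely illustrative sketch of the comparison proposed above, the following Python fragment measures how far a population response to a two-object display lies from the sum and from the average of the responses to the two objects shown separately. The function names, the use of Euclidean distance, and the firing rates are all assumptions made for illustration; the text does not specify a distance measure, and no recorded data are shown.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length response vectors."""
    if len(u) != len(v):
        raise ValueError("response vectors must cover the same neurons")
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def deviation_from_combinations(pair, top, bottom):
    """Distance of the two-object response from the sum and from the
    average of the responses to the two objects presented separately."""
    if len(top) != len(bottom):
        raise ValueError("response vectors must cover the same neurons")
    summed = [a + b for a, b in zip(top, bottom)]
    averaged = [(a + b) / 2 for a, b in zip(top, bottom)]
    return {
        "sum": euclidean(pair, summed),
        "average": euclidean(pair, averaged),
    }


# Hypothetical firing rates (spikes/sec) for five neurons; not recorded data.
pepper = [12.0, 30.0, 5.0, 18.0, 9.0]
potato = [20.0, 6.0, 14.0, 10.0, 25.0]
# Constructed to equal the element-wise average of the two vectors above.
pepper_on_potato = [16.0, 18.0, 9.5, 14.0, 17.0]

result = deviation_from_combinations(pepper_on_potato, pepper, potato)
print(result)
```

In this constructed case the two-object response matches the average exactly, which would be read as encoding in terms of the separate objects. Deciding whether a nonzero deviation in real recordings is reliable would require a statistical test that the text leaves open.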

If this first experiment showed that inferotemporal neurons encode two-object displays in terms of the two separate objects, one could next manipulate factors that influence object perception and measure the neural correlates. For example, spatiotemporal cues such as common motion may make monkeys more likely to perceive a two-object display as a single object (e.g., Kellman & Spelke, 1983), and elimination of color differences may make them less likely to do so (Santos, Hauser, & Spelke, in preparation). One could thus measure both the behavioral (looking time) and electrophysiological consequences of such manipulations and compare the results to those from experiments without this preexposure.

Another ideal candidate for converging exploration focuses on the representations underlying abilities to perceive combinations of objects in terms of the separable components. Do such abilities stem from representations of distinct perceptual features or from representations of distinct object kinds? The nature of object representations is a matter of considerable debate in the electrophysiological and related literature (e.g., Sugihara, Edelman, & Tanaka, 1998; Logothetis & Sheinberg, 1996; Tanaka, 1996; Tarr & Bülthoff, 1995; Biederman & Cooper, 1992; Biederman & Gerhardstein, 1995), and issues related to the kinds/features distinction have been discussed in a somewhat different terminology. For example, Logothetis and Sheinberg (1996) posit that different levels of categorization could be used to organize object representations, from more specific visual feature-based representations to more abstract kind representations. Electrophysiological recordings have demonstrated that a given visual object is represented in different ways along a rough hierarchy of processing pathways, from more specific, low-order featural representations to more abstract, invariant, categorical representations (e.g., Desimone & Ungerleider, 1989). Objects such as those used in our displays could be recognized as distinct based on features at lower levels or categories at higher levels. Alternatively, even the lowest level of object representations may be organized into different kind categories, as suggested by the existence of face-specific representations in both rhesus monkey and human visual areas (Kanwisher, McDermott, & Chun, 1997; Perrett et al., 1987). Though most of the explanations for face-specific representations focus on the unique perceptual properties of faces rather than a more general categorical organization of the object recognition system, a categorical organization is still possible (Caramazza, 1998). Thus, objects could be categorized as different kinds at the earliest levels of featural processing.

One could further explore these issues in physiological studies by presenting monkeys with different visual forms of a single food category (e.g., bananas that are sliced, mashed, peeled, unpeeled, green, brown, and yellow) and visually similar forms from different food categories (e.g., a green banana and a cucumber). If inferotemporal representations encode information at the level of kinds, the first condition should elicit similar responses in the inferotemporal neurons, whereas the second should elicit different responses. In contrast, if inferotemporal representations encode information at the level of features, the first condition should elicit different responses, whereas the second should elicit similar responses.

In such ways, our understanding of object representation, and in turn of humans’ unique tool and symbol use, may be enhanced by converging efforts at the behavioral and physiological levels of analysis. The methods reported in this paper (used extensively with humans of all ages, requiring no training, and applicable to free-ranging as well as captive animals) could play an instrumental role in this process.

METHODS

Experiment 1

Participants

Subjects were 59 semi–free-ranging rhesus monkeys living on the island of Cayo Santiago, Puerto Rico. Approximately half the subjects were adult males (age > 4 years) and half adult females (age > 3 years). Subjects were tested opportunistically whenever they were encountered in a setting with few other monkeys or distractions (e.g., not involved in or near to a fight) and when they remained in a seated position long enough for us to present our stimuli. Monkeys occasionally changed positions between trials. In these cases, testing resumed if and when monkeys relocated to another seated position within a couple of minutes. An additional 21 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide by 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items, therefore, were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
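The orthogonal counterbalancing described above can be illustrated with a short Python sketch. It crosses the two factors (which food pair is shown in the “together” event, and which trial type comes first) to give four cells, then cycles monkeys through the cells. The cyclic assignment rule and all names are assumptions for illustration; the text does not say how individual monkeys were assigned to cells.

```python
from collections import Counter
from itertools import cycle, product

PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIALS = ("together", "separate")


def counterbalance_cells():
    """Cross pair-to-trial assignment with trial order: 2 x 2 = 4 cells.

    Each cell is a tuple of two (trial_type, food_pair) test trials,
    listed in presentation order.
    """
    cells = []
    for together_pair, first_trial in product(PAIRS, TRIALS):
        separate_pair = PAIRS[1 - PAIRS.index(together_pair)]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_trial = TRIALS[1 - TRIALS.index(first_trial)]
        cells.append((
            (first_trial, pair_for[first_trial]),
            (second_trial, pair_for[second_trial]),
        ))
    return cells


def assign(n_monkeys):
    """Cycle through the cells so they are filled as evenly as possible
    (a hypothetical assignment rule)."""
    cells = cycle(counterbalance_cells())
    return [next(cells) for _ in range(n_monkeys)]


# For example, the 28 monkeys of the hold-top condition.
schedule = assign(28)
cell_counts = Counter(schedule)
for cell, count in cell_counts.items():
    print(count, cell)
```

With 28 monkeys the four cells fill evenly; with 31, as in the hold-both condition, the cells necessarily differ in size by one.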

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
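As a minimal sketch of the intercoder reliability check described above, the following Python fragment computes a Pearson correlation between two coders’ total looking times. The text reports a correlation but does not name the coefficient, so Pearson’s r is an assumption here, and the looking times are invented for illustration (eight values, standing in for four monkeys with two trials each).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical looking times in seconds (maximum 10 per trial); not the study's data.
coder_a = [4.2, 7.9, 1.5, 10.0, 3.3, 6.1, 0.8, 9.4]
coder_b = [4.0, 8.1, 1.9, 9.7, 3.0, 6.6, 1.1, 9.0]

r = pearson_r(coder_a, coder_b)
print(f"intercoder r = {r:.2f}")
```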

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
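For a within-subjects factor with only two levels, a one-way repeated-measures ANOVA reduces to a test on each subject’s difference score, with F equal to the square of the paired t statistic. The Python sketch below illustrates that computation for the first analysis only; the function name and the looking times are hypothetical, and this is not the authors’ analysis code.

```python
from statistics import mean, variance


def within_subject_f(together, separate):
    """One-way repeated-measures ANOVA for a two-level factor.

    Returns (F, df_effect, df_error). F equals the squared paired t,
    computed from per-subject difference scores. Raises ZeroDivisionError
    if every subject shows exactly the same difference.
    """
    if len(together) != len(separate) or len(together) < 2:
        raise ValueError("need paired scores from at least 2 subjects")
    diffs = [t - s for t, s in zip(together, separate)]
    n = len(diffs)
    # variance() is the sample variance (n - 1 in the denominator).
    f_value = n * mean(diffs) ** 2 / variance(diffs)
    return f_value, 1, n - 1


# Hypothetical looking times in seconds for four monkeys; not the study's data.
together_times = [5.0, 6.0, 7.0, 9.0]
separate_times = [3.0, 5.0, 4.0, 5.0]

f_value, df_effect, df_error = within_subject_f(together_times, separate_times)
print(f"F({df_effect}, {df_error}) = {f_value:.2f}")
```

The second analysis, which adds Experiment as a between-subjects factor, is a mixed design and would need a fuller ANOVA routine than this sketch provides.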

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1 except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station-point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bülthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences, U.S.A., 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.

Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bülthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.

Journal of Cognitive Neuroscience, Volume 13, Number 1

Page 12: Visual Representation in the Wild: How Rhesus Monkeys ...psych.colorado.edu/~oreilly/papers/MunakataEtAl01_monkey.pdfVisual Representation in the Wild: How Rhesus Monkeys Parse Objects

additional 21 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (20 monkeys) or experimenter error (1 monkey).

Apparatus and Displays

The experimental apparatus consisted of a stage and a screen constructed from white foam core (Figure 2). The 60 × 30-cm floor and 60 × 40-cm back of the stage were attached at a right angle by triangular supports (12-cm height × 7-cm base) attached to the sides of the stage. The 60 × 40-cm screen had a 60 × 15-cm base supporting small aluminum pans containing the food stimuli for the study. The base and pans were attached to the screen at a right angle by large triangular supports (40-cm height × 15-cm base) that occluded both the base and the food.

The objects were four foods of contrasting shapes, colors, and textures, with sizes that made them easily graspable by a single human hand: a green pepper (7-cm tall × 8-cm diameter), a brown sweet potato (7-cm tall × 7.5-cm wide by 17-cm long), a miniature orange pumpkin (7-cm tall × 8-cm diameter), and a segment of tan ginger root (4.5-cm tall × 12-cm wide × 15-cm long). None of these items grew on the island or were brought there, either as provisions for the monkeys or as food for the research team; all items, therefore, were unfamiliar to the subjects. In one display, the green pepper rested on top of the sweet potato. In the other display, the pumpkin rested on top of the ginger root.

Design

Each monkey was presented with one “together” trial and one “separate” trial, each involving a different pair of food items. Twenty-eight monkeys were tested in a hold-top condition, in which the experimenter held only the top object with one hand during each event, and 31 monkeys were tested in a hold-both condition, in which the experimenter held both objects with both hands. Within each of these conditions, the pairing of objects (pepper/potato vs. pumpkin/ginger) and trial types (together vs. separate) and the order of test trials were orthogonally counterbalanced across monkeys.
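The orthogonal counterbalancing described above can be sketched in a few lines of Python. This is only an illustration: the function names (`counterbalance_cells`, `assign`) and the round-robin assignment of monkeys to cells are assumptions made for the example; the text does not say how individual monkeys were assigned.

```python
from itertools import cycle, product

PAIRS = ("pepper/potato", "pumpkin/ginger")
TRIAL_TYPES = ("together", "separate")


def counterbalance_cells():
    """Cross 'which pair is shown together' with 'which trial type comes first'."""
    cells = []
    for together_pair, first_type in product(PAIRS, TRIAL_TYPES):
        # The other pair is necessarily used on the "separate" trial.
        separate_pair = PAIRS[1 - PAIRS.index(together_pair)]
        pair_for = {"together": together_pair, "separate": separate_pair}
        second_type = TRIAL_TYPES[1 - TRIAL_TYPES.index(first_type)]
        cells.append(((first_type, pair_for[first_type]),
                      (second_type, pair_for[second_type])))
    return cells


def assign(n_monkeys):
    """Assumed scheme: cycle through the cells so each is used about equally often."""
    cells = cycle(counterbalance_cells())
    return [next(cells) for _ in range(n_monkeys)]


# 28 is the size of the hold-top group; the assignment itself is hypothetical.
schedule = assign(28)
for trial_order in schedule[:4]:
    print(trial_order)
```

Crossing the two binary factors yields four cells, so a group of 28 fills each cell seven times, whereas a group of 31 cannot be perfectly balanced.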

Procedure

All testing was conducted by one experimenter and one camcorder operator. A test began when the investigators located a monkey who was seated in a quiet spot. The experimenter positioned the apparatus 2–5 m away from the test monkey, with the screen in front of and blocking the monkey’s view of the stage, and the camcorder operator began to videotape the monkey from behind the display (Figure 2). The experimenter then raised the screen to reveal an empty stage and immediately lowered it. Each test trial then proceeded as follows. The experimenter raised the screen to reveal one food item sitting atop a second food item. The experimenter checked that the monkey had fixated the objects, and then she lifted the top object approximately 30 cm in 1 sec. In “together” events, the bottom object moved with the top object; in “separate” events, the bottom object remained on the floor of the display. In the hold-top condition, the experimenter held only the top object, from above with the right hand (the two objects were attached with toothpicks invisible to the monkeys). In the hold-both condition, the experimenter held the top object from above with the right hand and the bottom object from the side and bottom with the left hand. After lifting the object(s), the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. The experimenter held the object(s) stationary until the camcorder operator called “Done” to signal the end of the 10-sec trial. The experimenter then lowered the screen. This procedure has been successfully used in previous looking time experiments on this population (e.g., Hauser et al., 1996).

Each monkey received one “together” and one “separate” trial. These two trials were separated by two additional trials unrelated to the present studies and involving the stationary presentation of other food items (carrots and squash). For most monkeys, trials were separated by an intertrial interval of 3–5 sec, and the entire experiment lasted a couple of minutes. For monkeys who repositioned themselves between trials, the intertrial intervals were longer but never exceeded a couple of minutes.

Coding and Analysis

Two coders, blind to the hypotheses and conditions of the experiment, viewed the videotaped trials frame-by-frame to determine how long monkeys observed each of the event outcomes. On each trial, coding began just after the objects came to rest, as signaled by the experimenter’s voice on the videotape, and ended 10 sec later. Four of the monkeys were coded by both coders; the correlation between their judgments of total looking time on each trial was .93. Looking times were analyzed by a 2 × 2 ANOVA with Condition (hold-top vs. hold-both) as a between-subjects factor and Display (together vs. separate) as the within-subjects factor.
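As an illustration of the inter-coder reliability check, the sketch below computes a correlation between two coders’ looking-time judgments. It assumes the reported correlation is a Pearson product-moment coefficient, which the text does not state. The function name `pearson_r` and all looking times are hypothetical, invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical looking times in seconds (0-10) for 4 monkeys x 2 trials,
# as judged independently by two coders. These are NOT the study's data.
coder_a = [4.2, 7.9, 1.5, 6.3, 9.0, 3.1, 5.6, 2.4]
coder_b = [4.0, 8.3, 1.9, 6.0, 8.6, 3.5, 5.1, 2.9]
print(round(pearson_r(coder_a, coder_b), 2))
```

Note that a correlation measures whether the coders rank and scale trials consistently; it would not reveal a constant offset between them.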

Experiment 2

Participants

Subjects were 28 monkeys from the same population as in Experiment 1. An additional 10 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (9 monkeys) or experimental error (1 monkey).

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1, except that the stage was somewhat smaller (back = 45 × 30 cm, floor = 45 × 30 cm) and the screen was slightly taller (base = 45 × 15 cm, face = 45 × 45 cm). The food objects and object positions were the same as in the outcome displays for Experiment 1. On the “separate” trial, the position of the experimenter’s hand was the same as in the “separate” trial of the hold-top condition of Experiment 1. On the “together” trial, the experimenter’s right hand grasped the two objects simultaneously from the side and supported them in the same positions as on the “together” trials for both conditions of Experiment 1.

Design

The design was the same as in Experiment 1, except that all subjects were run in a single-hand condition.

Procedure

Each trial began when the experimenter lifted the screen to reveal the objects currently held in the air. As the screen was raised, the experimenter called “Count,” and the camcorder operator began counting 10 sec on the camcorder display. In all other respects, the procedure was the same as in Experiment 1.

Coding and Analysis

A single coder, blind to the conditions of the experiment, scored the videotapes. Trials for 10 subjects were coded by a second observer, and the correlation between judgments of both observers was .98. As in previous studies (Hauser et al., 1996; Hauser, 1998), videos were acquired onto a computer using Adobe Premiere software and a Radius Videovision board. Coding began and ended as for Experiment 1.

Looking times in Experiment 2 were analyzed by a one-way ANOVA with Display (together vs. separate) as the within-subjects factor. A further ANOVA with the additional factor of Experiment compared the looking patterns of the monkeys in Experiment 2 to those in Experiment 1.
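For readers unfamiliar with the analysis, a one-way within-subjects ANOVA can be sketched as follows. This is a minimal textbook implementation under stated assumptions, not the authors’ analysis code. The function name `rm_anova_oneway` and the looking times are hypothetical. With only two levels of Display, the resulting F equals the square of the paired t statistic.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: list of per-subject rows, one score per condition.
    Returns (F, df_effect, df_error).
    """
    n = len(data)
    k = len(data[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >= 2 subjects, each with the same >= 2 conditions")
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    # Partition total variability into condition, subject, and residual parts.
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj
    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    if ss_error <= 0:
        raise ValueError("no residual variability; F is undefined")
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Hypothetical looking times in seconds, [together, separate] per monkey.
# These numbers are invented for illustration only.
looking_times = [[6.1, 4.0], [7.4, 5.2], [3.9, 4.3], [8.0, 5.5], [5.2, 3.1]]
f_value, df_effect, df_error = rm_anova_oneway(looking_times)
print(f"F({df_effect}, {df_error}) = {f_value:.2f}")
```

Removing between-subject variability from the error term is what makes this design sensitive when individual monkeys differ widely in overall looking time.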

Experiment 3

Participants

Subjects were 30 monkeys from the same population as in Experiments 1 and 2. An additional 13 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (12 monkeys) or experimental error (1 monkey).

Apparatus and Displays

The apparatus was identical to that of Experiment 2. The displays were the same as in the hold-top condition of Experiment 1 except for the objects: a yellow lemon and an orange pepper, oriented vertically. On “together” trials, a whole object appeared on the display floor, oriented vertically, and a hand grasped its top half and lifted the object into the air. On “separate” trials, two halves of an object with a horizontal cut through the middle appeared on the display floor in the same orientation, and a hand grasped the top half and lifted it into the air while the bottom half remained on the display floor. At the start of the “separate” trial, the cut in the object was detectable by adults but inconspicuous. At the end of the trial, small portions of the inside of the object were visible from the monkey’s station point.

Design, Procedure, Coding, and Analyses

The design and procedure were the same as in Experiment 1, except that only one condition (hold-top) was administered and only one object (together or separate) was displayed. The coding and analyses were the same as in Experiment 2.

Experiment 4

Participants

Subjects were 43 monkeys from the same population as in Experiments 1–3. An additional 27 monkeys were tested but did not provide data for the analyses due to either position changes that did not allow testing to resume (26 monkeys) or experimental error (1 monkey).

Apparatus and Displays

These were the same as in Experiment 3, except that the food object never appeared on the display floor and was not grasped and lifted.

Design, Procedure, Coding, and Analyses

These were the same as in Experiment 2.

Acknowledgments

The research was supported by McDonnell-Pew Postdoctoral Fellowships to Yuko Munakata and Randall O’Reilly, an NSF predoctoral fellowship and Harvard University McMaster’s funds to Laurie Santos, NIH grant R37-HD23103 to Elizabeth Spelke, and an NSF Young Investigator Award to Marc Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants’ greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants’ greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys’ observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bulthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate’s expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences USA, 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates’ expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perceptions of object unity in young infants: The rules of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants’ use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants’ use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.


Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants’ object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants’ use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant’s object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants’ ability to use object kind information for object individuation. Cognition, 70, 137–166.

58 Journal of Cognitive Neuroscience Volume 13 Number 1

Page 13: Visual Representation in the Wild: How Rhesus Monkeys ...psych.colorado.edu/~oreilly/papers/MunakataEtAl01_monkey.pdfVisual Representation in the Wild: How Rhesus Monkeys Parse Objects

but did not provide data for the analyses due to eitherposition changes that did not allow testing to resume (9monkeys) or experimental error (1 monkey)

Apparatus and Stimuli

The apparatus was similar to that in Experiment 1except that the stage was somewhat smaller (back =45 pound 30 cm floor = 45 pound 30 cm) and the screen wasslightly taller (base = 45 pound 15 cm face = 45 pound 45 cm)The food objects and object positions were the same asin the outcome displays for Experiment 1 On thelsquolsquoseparatersquorsquo trial the position of the experimenterrsquos handwas the same as in the lsquolsquoseparatersquorsquo trial of the hold-topcondition of Experiment 1 On the lsquolsquotogetherrsquorsquo trial theexperimenterrsquos right hand grasped the two objectssimultaneously from the side and supported them inthe same positions as on the lsquolsquotogetherrsquorsquo trials for bothconditions of Experiment 1

Design

The design was the same as in Experiment 1 except thatall subjects were run in a single-hand condition

Procedure

Each trial began when the experimenter lifted the screento reveal the objects currently held in the air As thescreen was raised the experimenter called lsquolsquoCountrsquorsquo andthe camcorder operator began counting 10 sec on thecamcorder display In all other respects the procedurewas the same as in Experiment 1

Coding and Analysis

A single coder blind to the conditions of the experimentscored the videotapes Trials for 10 subjects were codedby a second observer and the correlation betweenjudgments of both observers was 98 As in previousstudies (Hauser et al 1996 Hauser 1998) videos wereacquired onto a computer using Adobe Premiere soft-ware and a Radius Videovision board Coding began andended as for Experiment 1

Looking times in Experiment 2 were analyzed by aone-way ANOVA with Display (together vs separate) asthe within-subjects factor A further ANOVA with theadditional factor of Experiment compared the lookingpatterns of the monkeys in Experiment 2 to those inExperiment 1

Experiment 3

Participants

Subjects were 30 monkeys from the same population asin Experiments 1 and 2 An additional 13 monkeys were

tested but did not provide data for the analyses due toeither position changes that did not allow testing toresume (12 monkeys) or experimental error (1 monkey)

Apparatus and Displays

The apparatus was identical to that of Experiment 2 Thedisplays were the same as in the hold-top condition ofExperiment 1 except for the objects a yellow lemon andan orange pepper oriented vertically On lsquolsquotogetherrsquorsquotrials a whole object appeared on the display floororiented vertically and a hand grasped its top half andlifted the object into the air On lsquolsquoseparatersquorsquo trials twohalves of an object with a horizontal cut through themiddle appeared on the display floor in the sameorientation and a hand grasped the top half and liftedit into the air while the bottom half remained on thedisplay floor At the start of the lsquolsquoseparatersquorsquo trial the cutin the object was detectable by adults but inconspicu-ous At the end of the trial small portions of the insideof the object were visible from the monkeyrsquos station-point

Design Procedure Coding and Analyses

The design and procedure were the same as in Experi-ment 1 except that only one condition (hold-top) wasadministered and only one object (together or separate)was displayed The coding and analyses were the sameas in Experiment 2

Experiment 4

Participants

Subjects were 43 monkeys from the same population asin Experiments 1ndash3 An additional 27 monkeys weretested but did not provide data for the analyses due toeither position changes that did not allow testing toresume (26 monkeys) or experimental error (1 monkey)

Apparatus and Displays

These were the same as in Experiment 3 except that thefood object never appeared on the display floor and wasnot grasped and lifted

Design Procedure Coding and Analyses

These were the same as in Experiment 2

Acknowledgments

The research was supported by McDonnell-Pew PostdoctoralFellowships to Yuko Munakata and Randall OrsquoReilly an NSFpredoctoral fellowship and Harvard University McMasterrsquosfunds to Laurie Santos NIH grant R37-HD23103 to ElizabethSpelke and an NSF Young Investigator Award to Marc

56 Journal of Cognitive Neuroscience Volume 13 Number 1

Hauser We thank members of the Cognitive DevelopmentCenter at the University of Denver for feedback on drafts ofthis article We thank Elliott Blass Amy Jackendoff Katie LiuCory Miller Marianne Moon Bridget Spelke and Fei Xu forassistance with conducting and coding the studies We thankthe CPRC (PHS grant P51RR00168-38) and Drs M KesslerF Bercovitch and J Berard for helping secure the CayoSantiago facilities

Reprint requests should be sent to Yuko Munakata at theDepartment of Psychology University of Denver 2155 SRace Street Denver CO 80208 USA or via e-mail tomunakataduedu

Notes

1 In some cases infants younger than 1 year have demon-strated abilities to use featural information to parse simpleadjacent objects into separable units (Needham amp Baillargeon1997 Needham 1998) Further work is needed to determinewhy such abilities are not reliably observed in infants duringthe first year of life2 A third account for the differential sensitivity to hands maybe quickly rejected Human infantsrsquo greater response to handsis not due to any greater sensitivity of the preferential lookingtask with human infants relative to adult rhesus monkeysbecause we found monkeys to be more sensitive than humaninfants to featural information for objects in the presentexperiments A fourth interpretation is possible though notvery likely Human infantsrsquo greater sensitivity to hands in ourexperiments may depend on the use of human handsmdashperhaps monkeys would show similar sensitivity to hands iftested with monkey hands However rhesus monkeys showsimilar physiological responses to human hands and monkeyhands (Rizzolatti Fadiga Fogassi amp Gallese 1999 diPellegrino Fadiga Fogassi Gallese amp Rizzolatti 1992)suggesting that rhesus monkeysrsquo observed insensitivity to thesupporting role of hands in our studies is unlikely to be anartifact of the use of human hands

REFERENCES

Baillargeon R (1995) Physical reasoning in infancy In MGazzaniga (Ed) The cognitive neurosciences CambridgeMIT Press

Baylis G C Rolls E T amp Leonard C M (1985) Selectivitybetween faces in the responses of a population of neurons inthe cortex of the superior temporal sulcus of the macaquemonkey Brain Research 342 91ndash102

Bertenthal B I (1996) Origins and early development ofperception action and representation Annual Review ofPsychology 47 431ndash459

Biederman I amp Cooper E E (1992) Size invariance in visualobject priming Journal of Experimental Psychology Hu-man Perception and Performance 18 121ndash133

Biederman I amp Gerhardstein P C (1995) Viewpoint-de-pendent mechanisms in visual object recognition Reply toTarr and Bulthoff (1995) Journal of Experimental Psychol-ogy Human Perception and Performance 21 1506ndash1514

Caramazza A (1998) Domain-specific knowledge systems inthe brain The animatendashinanimate distinction Journal ofCognitive Neuroscience 10 1ndash34

Desimone R Albright T D Gross C G amp Bruce C (1984)Stimulus selective properties of inferior temporal neurons inthe macaque Journal of Neuroscience 4 2051ndash2062

Desimone R Ungerleider L G (1989) Neural mechanisms ofvisual processing in monkeys In F Boller amp J Grafman

(Eds) Handbook of neuropsychology vol 2 (pp 267ndash299)New York Elsevier

DeYoe E A amp Van Essen D C (1988) Concurrent processingstreams in monkey visual cortex Trends in Neurosciences11 219ndash226

di Pellegrino G Fadiga L Fogassi L Gallese V amp RizzolattiG (1992) Understanding motor events A neurophysiologi-cal study Experimental Brain Research 91 176ndash180

Fantz R (1961) The origin of form perception ScientificAmerican 204 66ndash72

Fantz R (1964) Visual experience in infants Decreased at-tention to familiar patterns relative to novel ones Science146 668ndash670

Hauser M D (1998) A nonhuman primatersquos expectationsabout object motion and destination The importance ofself-propelled movement and animacy DevelopmentalScience 1 31ndash37

Hauser M D amp Carey S (1998) Building a cognitive creaturefrom a set of primitives Evolutionary and developmentalinsights In D Cummins amp C Allen (Eds) The evolution ofmind Oxford Oxford University Press

Hauser M D MacNeilage P amp Ware M (1996) Numericalrepresentations in primates Proceedings of the NationalAcademy of Sciences USA 93 1514

Hauser M D amp Williams T (submitted) A nonhuman pri-matesrsquo expectations about invisible displacement Two pro-cedures two different systems of knowledge

Johnson S P amp Aslin R N (1996) Perceptions of object unityin young infants The rules of motion depth and orienta-tion Cognitive Development 11 161ndash180

Kanwisher N McDermott J amp Chun M M (1997) The fu-siform face area A module in human extrastriate cortexspecialized for face perception Journal of Neuroscience 174302

Kellman P J amp Arterberry M E (1998) The cradle ofknowledge Development of perception in infancy Cam-bridge MIT Press

Kellman P J amp Banks M S (1998) Infant visual perceptionIn D Kuhn amp R S Siegler (Eds) Handbook of childpsychology Cognition perception and language 5th ed(pp 103ndash146) New York Wiley

Kellman P J amp Spelke E (1983) Perception of partially oc-cluded objects in infancy Cognitive Psychology 15 483ndash524

Kestenbaum R Termine N amp Spelke E S (1987) Perceptionof objects and object boundaries by 3-month-old infantsBritish Journal of Developmental Psychology 5 367ndash383

Leslie A M (1982) The perception of causality in infantsPerception 11 173ndash186

Leslie A M (1984) Infant perception of a manual pick-up eventBritish Journal of Developmental Psychology 2 19ndash32

Logothetis N K amp Sheinberg D L (1996) Visual object re-cognition Annual Review of Neuroscience 19 577ndash621

Maunsell J H amp Newsome W T (1987) Visual processing inmonkey extrastriate cortex Annual Review of Neu-roscience 10 363ndash401

Meltzoff A S (1988) Infant imitation and memory Nine-month-olds in immediate and deferred tests Child Devel-opment 59 217ndash225

Nagell K Olguin R S amp Tomasello M (1993) Processes ofsocial learning in the tool use of chimpanzees (Pan troglo-dytes) and human children (Homo sapiens) Journal ofComparative Psychology 107 174ndash186

Hauser. We thank members of the Cognitive Development Center at the University of Denver for feedback on drafts of this article. We thank Elliott Blass, Amy Jackendoff, Katie Liu, Cory Miller, Marianne Moon, Bridget Spelke, and Fei Xu for assistance with conducting and coding the studies. We thank the CPRC (PHS grant P51RR00168-38) and Drs. M. Kessler, F. Bercovitch, and J. Berard for helping secure the Cayo Santiago facilities.

Reprint requests should be sent to Yuko Munakata at the Department of Psychology, University of Denver, 2155 S. Race Street, Denver, CO 80208, USA, or via e-mail to munakata@du.edu.

Notes

1. In some cases, infants younger than 1 year have demonstrated abilities to use featural information to parse simple adjacent objects into separable units (Needham & Baillargeon, 1997; Needham, 1998). Further work is needed to determine why such abilities are not reliably observed in infants during the first year of life.

2. A third account for the differential sensitivity to hands may be quickly rejected. Human infants' greater response to hands is not due to any greater sensitivity of the preferential looking task with human infants relative to adult rhesus monkeys, because we found monkeys to be more sensitive than human infants to featural information for objects in the present experiments. A fourth interpretation is possible, though not very likely. Human infants' greater sensitivity to hands in our experiments may depend on the use of human hands; perhaps monkeys would show similar sensitivity to hands if tested with monkey hands. However, rhesus monkeys show similar physiological responses to human hands and monkey hands (Rizzolatti, Fadiga, Fogassi, & Gallese, 1999; di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992), suggesting that rhesus monkeys' observed insensitivity to the supporting role of hands in our studies is unlikely to be an artifact of the use of human hands.

REFERENCES

Baillargeon, R. (1995). Physical reasoning in infancy. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge: MIT Press.

Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1985). Selectivity between faces in the responses of a population of neurons in the cortex of the superior temporal sulcus of the macaque monkey. Brain Research, 342, 91–102.

Bertenthal, B. I. (1996). Origins and early development of perception, action, and representation. Annual Review of Psychology, 47, 431–459.

Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.

Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in visual object recognition: Reply to Tarr and Bülthoff (1995). Journal of Experimental Psychology: Human Perception and Performance, 21, 1506–1514.

Caramazza, A. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology, vol. 2 (pp. 267–299). New York: Elsevier.

DeYoe, E. A., & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neurosciences, 11, 219–226.

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.

Fantz, R. (1961). The origin of form perception. Scientific American, 204, 66–72.

Fantz, R. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones. Science, 146, 668–670.

Hauser, M. D. (1998). A nonhuman primate's expectations about object motion and destination: The importance of self-propelled movement and animacy. Developmental Science, 1, 31–37.

Hauser, M. D., & Carey, S. (1998). Building a cognitive creature from a set of primitives: Evolutionary and developmental insights. In D. Cummins & C. Allen (Eds.), The evolution of mind. Oxford: Oxford University Press.

Hauser, M. D., MacNeilage, P., & Ware, M. (1996). Numerical representations in primates. Proceedings of the National Academy of Sciences USA, 93, 1514.

Hauser, M. D., & Williams, T. (submitted). A nonhuman primates' expectations about invisible displacement: Two procedures, two different systems of knowledge.

Johnson, S. P., & Aslin, R. N. (1996). Perception of object unity in young infants: The roles of motion, depth, and orientation. Cognitive Development, 11, 161–180.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302.

Kellman, P. J., & Arterberry, M. E. (1998). The cradle of knowledge: Development of perception in infancy. Cambridge: MIT Press.

Kellman, P. J., & Banks, M. S. (1998). Infant visual perception. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language, 5th ed. (pp. 103–146). New York: Wiley.

Kellman, P. J., & Spelke, E. (1983). Perception of partially occluded objects in infancy. Cognitive Psychology, 15, 483–524.

Kestenbaum, R., Termine, N., & Spelke, E. S. (1987). Perception of objects and object boundaries by 3-month-old infants. British Journal of Developmental Psychology, 5, 367–383.

Leslie, A. M. (1982). The perception of causality in infants. Perception, 11, 173–186.

Leslie, A. M. (1984). Infant perception of a manual pick-up event. British Journal of Developmental Psychology, 2, 19–32.

Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.

Maunsell, J. H., & Newsome, W. T. (1987). Visual processing in monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.

Meltzoff, A. S. (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217–225.

Nagell, K., Olguin, R. S., & Tomasello, M. (1993). Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.

Needham, A. (1997). Factors affecting infants' use of featural information in object segregation. Current Directions in Psychological Science, 6, 26–33.

Needham, A. (1998). Infants' use of featural information in the segregation of stationary objects. Infant Behavior and Development, 21, 47–76.

Needham, A., & Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121–148.

Needham, A., & Baillargeon, R. (1997). Object segregation in 8-month-old infants. Cognition, 62, 121–149.

Needham, A., & Baillargeon, R. (1998). Effects of prior experience on 4.5-month-old infants' object segregation. Infant Behavior and Development, 21, 1–23.

Needham, A., & Modi, M. (2000). Infants' use of prior experiences with objects in object segregation: Implications for object recognition in infancy. In H. Reese (Ed.), Advances in child development and behavior, vol. 27 (pp. 99–133).

Perrett, D. I., Mistlin, A. J., & Chitty, A. J. (1987). Visual neurones responsive to faces. Trends in Neurosciences, 10, 358–364.

Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821.

Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1999). Resonance behaviors and mirror neurons. Archives Italiennes de Biologie, 137, 85.

Scholl, B., & Leslie, A. (in press). Explaining the infant's object concept: Beyond the perception/cognition dichotomy. In E. Lepore & Z. Pylyshyn (Eds.), What is cognitive science? Oxford: Blackwell.

Sereno, M. I., Dale, A. M., & Tootell, R. B. H. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889.

Spelke, E. (1985). Preferential looking methods as tools for the study of cognition in infancy. In G. Gottlieb & N. Krasnegor (Eds.), Measurement of audition and vision in the first year of postnatal life (pp. 323–363). Norwood, NJ: Ablex.

Spelke, E., Breinlinger, K., Jacobson, K., & Phillips, A. (1993). Gestalt relations and object perception: A developmental study. Perception, 22, 1483–1501.

Spelke, E., & Van de Walle, G. A. (1993). Perceiving and reasoning about objects: Insights from infants. In N. Eilan & R. A. McCarthy (Eds.), Spatial representation: Problems in philosophy and psychology (pp. 132–161). Oxford: Blackwell.

Spelke, E. S., Hofsten, C. V., & Kestenbaum, R. (1989). Object perception in infancy: Interaction of spatial and kinetic information for object boundaries. Developmental Psychology, 25, 185–196.

Sugihara, T., Edelman, S., & Tanaka, K. (1998). Representation of objective similarity among three-dimensional shapes in the monkey. Biological Cybernetics, 78, 1.

Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.

Tarr, M. J., & Bülthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494.

Tomasello, M., Kruger, A. C., & Ratner, H. H. (1993). Cultural learning. Behavioral and Brain Sciences, 16, 495–552.

Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489.

von Hofsten, C., & Spelke, E. (1985). Object perception and object-directed reaching in infancy. Journal of Experimental Psychology: General, 11, 198–212.

Woodward, A. L. (1998). Infants selectively encode the goal object of an actor's reach. Cognition, 69, 1–34.

Xu, F., & Carey, S. (1996). Infants' metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153.

Xu, F., Carey, S., & Welch, J. (1999). Infants' ability to use object kind information for object individuation. Cognition, 70, 137–166.

58 Journal of Cognitive Neuroscience Volume 13 Number 1
