

The maintenance and updating of representations of no longer visible objects and their parts

J. Daniel McCarthy*,1, Gennady Erlikhman†,1, Gideon Paul Caplovitz†,1

*Brown University, Providence, RI, United States
†University of Nevada, Reno, NV, United States

1Corresponding authors: Tel.: 401-863-5171/310-825-4202/775-682-8673;

Fax: 401-863-2255/310-206-5895/775-784-1126, e-mail address: dan_mccarthy@brown.edu; gerlikhman@unr.edu; gcaplovitz@unr.edu

Abstract

When an object partially or completely disappears behind an occluding surface, a representation of that object persists. For example, fragments of no longer visible objects can serve as an input into mid-level constructive visual processes, interacting and integrating with currently visible portions to form perceptual units and global motion signals. Remarkably, these persistent representations need not be static and can have their positions and orientations updated postdictively as new information becomes visible. In this chapter, we highlight historical considerations, behavioral evidence, and neural correlates of this type of representational updating of no longer visible information at three distinct levels of visual processing. At the lowest level, we discuss spatiotemporal boundary formation, in which visual transients can be integrated over space and time to construct local illusory edges, global form, and global motion percepts. At an intermediate level, we review how the visual system updates form information seen at one moment in time and integrates it with subsequently available information to generate global shape and motion representations (e.g., spatiotemporal form integration and anorthoscopic perception). At a higher level, when an entire object completely disappears behind an occluder, the object’s identity and predicted position can be maintained in the absence of visual information.

Keywords

Spatiotemporal form integration, Shape perception, Motion perception, Form–motion interactions, Dynamic occlusion

Progress in Brain Research, ISSN 0079-6123, https://doi.org/10.1016/bs.pbr.2017.07.010
© 2017 Elsevier B.V. All rights reserved.

ARTICLE IN PRESS


1 INTRODUCTION

“… whilst part of what we perceive comes through our senses from the object before us, another part (and it may be the larger part) always comes … out of our own head”

James (1890)

A primary function of the human visual system is to parse incoming retinal input into meaningful surface and object representations. The brain does this despite the fact that at any moment objects are often only partially visible. For instance, a bike parked behind a tree is not perceived as two separate wheels and frames, but as a unified whole—even when the unified whole creates an unexpected or previously unseen object (Kanizsa, 1979). Perceiving coherent objects in these circumstances relies on constructive neural processes that integrate spatially sparse retinal input to arrive at accurate representations of visual form. In addition, we experience a unified, fluid world despite the fact that observers and objects are often moving and the limited information available at one moment in time can vary considerably from the next (e.g., a cat darting through the bushes is perceived as a single object, not disconnected patches of fur). As the world moves and we move with it, piecemeal surface representations must be continually updated to maintain correspondence between dynamic changes in visual information across successive time points. In this chapter, we discuss examples of percepts that rely on this representational updating, and describe behavioral and neural correlates at several levels of perceptual processing. Importantly, this requires a view of visual cortex that goes beyond classic considerations that posit form and motion are processed independently.

Traditionally, visual processing has been conceptualized as two parallel and somewhat separate streams or pathways (Goodale and Milner, 1992; Goodale et al., 1994; Milner and Goodale, 2008; Ungerleider and Mishkin, 1982): a dorsal “where” pathway extending from primary visual cortex (V1) up through the middle temporal area (hMT+), the intraparietal sulcus (IPS), and posterior parietal cortex, associated with motion perception, attention, and visually guided action (Goodale et al., 1994; Ikkai et al., 2011; Sereno et al., 2001; Silver et al., 2005), and a ventral “what” pathway extending from V1, V2, and V4 through the posterior and ventral temporal lobes, involved with the representation of objects and their categories (DiCarlo et al., 2012; Grill-Spector and Weiner, 2014).

Challenging this classic view, a growing body of behavioral and neuroimaging evidence has shown that putative motion areas also process form information and vice versa. For example, neural correlates of form–motion percepts (e.g., transformational apparent motion; Tse, 2006) have been found within ventral visual regions such as the lateral occipital complex (LOC), which is typically thought to be specifically involved in form and object perception (Grill-Spector et al., 2001; Haxby et al., 2001; Kanwisher et al., 1996; Malach et al., 1995), and the motion-sensitive hMT+ in the dorsal stream (Huk et al., 2002; Rodman and Albright, 1989; Tootell et al., 1995). Functional and anatomical overlap between LOC and hMT+ (Ferber et al., 2003; Kourtzi et al., 2003; Liu and Cooper, 2003; Liu et al., 2004; Murray et al., 2003; Stone, 1999; Zhuo et al., 2003) suggests that these areas may not be exclusively form- and motion-specific (for a review of shape processing in the dorsal stream, see Freud et al., 2016).

A critical interplay between dorsal and ventral streams may underlie form–motion interactions and the updating of persistent representations of no longer visible information (for reviews see Blair et al., 2015; Caplovitz, 2011). Recent work has shown that mid-level visual areas V3A and V3B as well as regions along the IPS in the dorsal stream are activated in response to stimuli in which form and motion information interact. These cortical areas respond to global motion patterns (Koyama et al., 2005; Wall et al., 2008), structure from motion (Klaver et al., 2008), contour curvature during rotational motion (Caplovitz and Tse, 2007), biological motion (Vaina et al., 2001), both stationary and dynamic Glass patterns (Lestou et al., 2014; Ostwald et al., 2008; Pavan et al., 2017), and motion edges (Vinberg and Grill-Spector, 2008). The IPS has been further implicated in shape perception and integration (Konen and Kastner, 2008; Lehky and Sereno, 2007; Perry and Fallah, 2014; Zaretskaya et al., 2013). Additionally, a largely ignored white matter bundle connecting dorsal and ventral regions of occipital visual cortex, called the vertical occipital fasciculus (VOF), has recently regained the attention of neuroscientists (Keser et al., 2016; Weiner et al., 2016; Yeatman et al., 2014).

Collectively, these findings indicate that while the dorsal–ventral distinction is useful for understanding visual processing architecture at a large scale, the two streams are not fully independent and interact across multiple levels of analysis. We hypothesize that one of the functions of this dorsal–ventral communication is to support the spatiotemporal processes that underlie the perception of objects reliant on the maintenance, updating, and integration of previously visible information. Representations of such objects must be maintained as they move, disappear, and reappear to arrive at integrated global forms—processes that are related to both pathways. In the following sections, we discuss behavioral and neuroimaging evidence of different phenomena that demonstrate how critical interactions between regions that process visual form and areas computing motion support the perception of moving objects under conditions of fragmented input and dynamic occlusion. At the lowest level of analysis, a phenomenon called spatiotemporal boundary formation (SBF) demonstrates that illusory edges of moving global objects can be recovered from the spatiotemporal integration of perceptual transients (Shipley and Kellman, 1993, 1994, 1997). At an intermediate level, anorthoscopic perception (AP; Fendrich et al., 2005; Hochberg, 1968; Parks, 1965; Plateau, 1836; von Helmholtz, 1867/1925; Zöllner, 1862) and spatiotemporal form integration (STFI; McCarthy, 2014; McCarthy et al., 2015a,b) show that local form information revealed at different locations in space and time can be maintained and updated to generate global representations of translating and rotating objects. At a relatively high level, we consider processes that support dynamically occluded object perception (Erlikhman and Caplovitz, 2017; Palmer and Kellman, 2014; Palmer et al., 2006), in which representations of the identity and speed of moving objects persist when disappearing behind occluders even when no partial visual information is available. Taken together, these phenomena highlight various ways in which the visual system supports coherently moving object percepts when faced with the pervasive real-world problems of motion and occlusion.

2 LOW LEVEL: INTEGRATING INFORMATION TO RECOVER THE EDGES OF MOVING OBJECTS

When an object is camouflaged, its surface texture is similar to that of the background and its boundaries cannot be seen. Camouflage can be broken when the object moves relative to the background. When this happens, its edges suddenly become visible. This occurs because the object gradually occludes and reveals (accretes and deletes) portions of the background’s texture as it moves, creating sporadic changes in edge luminance (Gibson et al., 1969; Kaplan, 1969). A similar pattern of transformations occurs when an object is seen through many small apertures, like an animal seen through foliage. As in the case of camouflage, information about the object’s shape becomes available only intermittently when it passes behind an aperture. These situations present an important but difficult problem for the visual system to solve: at any single moment in time, only a small fraction of the object is visible and there is not enough information to accurately recover its shape. As the object moves, more information about its shape is revealed as it occludes new background elements or as a new section is seen through apertures. Hence, by the time a new, previously unseen part of an object is revealed, the portions of the object that used to be visible may have disappeared. In order to recover the form of the object, the visual system must do more than simply integrate the spatially disparate visible fragments of the object: it must also maintain a representation of the invisible fragments, update their positions relative to each other as the object moves, interpolate missing regions that may never be visible, and fuse all of this information spatiotemporally.

The formation of global shape percepts over time as fragments of the object are

gradually revealed and occluded is a canonical example of dynamic object perception. That the visual system is capable of doing this has been demonstrated in a number of behavioral experiments (Andersen and Cortese, 1989; Gibson et al., 1969; Kaplan, 1969; Palmer et al., 2006; Shipley and Kellman, 1993, 1994; Stappers, 1989; Yonas et al., 1987); however, accretion and deletion are but one of a wide variety of abrupt changes to texture elements that can lead to the perception of edges from dynamic changes in a display. Shipley and Kellman (1993, 1994, 1997) referred to the more general process by which edges, global form, and global motion are seen from discrete events or spatiotemporal discontinuities as SBF. They demonstrated that global forms can be recovered from various sequences of instantaneous (as opposed to gradual) events including texture element changes in orientation, color, and shape (see also Cicerone et al., 1995; Hine, 1987; Miyahara and Cicerone, 1997). Video 1 (https://doi.org/10.1016/bs.pbr.2017.07.010) shows an example of this effect. In the video, a virtual square moves on a background of white circular elements on a black background. The object is virtual in the sense that its borders are not luminance-defined; rather, its extent is used only to help determine when the circular elements transform: whenever an element falls within the boundary of the virtual object, it changes its color all at once to black; whenever an element exits the boundary, it returns to white. The resulting percept is of an illusory, black square with crisp contours. The instantaneous nature of the change is an important feature because it demonstrates that gradual occlusion and therefore partial presentation of edges is not necessary.
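The element-gating rule that defines such a display is simple to state in code. The sketch below is our reconstruction from the description above, not the authors’ actual stimulus software; the element count, square size, and drift speed are arbitrary illustrative values.

```python
import random

# A minimal sketch of a Video-1-style SBF display (our reconstruction, not
# the authors' stimulus code). The square is "virtual": it has no
# luminance-defined border and only gates element color. Any element inside
# its current boundary is rendered background-colored (invisible); elements
# it has moved past become visible again.

random.seed(1)
ELEMENTS = [(random.random(), random.random()) for _ in range(200)]  # (x, y)

def invisible_elements(t, size=0.2, speed=0.02, y0=0.4):
    """Indices of elements inside the virtual square's boundary at frame t."""
    x0 = (t * speed) % (1.0 - size)          # square drifts rightward
    return {i for i, (ex, ey) in enumerate(ELEMENTS)
            if x0 <= ex <= x0 + size and y0 <= ey <= y0 + size}

def transformation_events(t):
    """The discrete events that drive SBF: elements that just changed state."""
    prev, cur = invisible_elements(t - 1), invisible_elements(t)
    return cur - prev, prev - cur            # (just vanished, just reappeared)
```

On any single frame, `invisible_elements` defines only an element-free region with no crisp boundary; the illusory contour exists only across frames, in the stream of `transformation_events`.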

A sequence of frames from the video (with colors inverted) is shown in Fig. 1. The dashed lines indicate the boundaries of the virtual object and are not actually shown in the video. On any given frame, there is no clearly defined square; all that one sees is a region of the display that has fewer circular elements than the rest of the display, but that region has no clearly defined shape. Furthermore, as one element disappears or reappears, information is only provided about that single, local position along the virtual object’s boundary. As the virtual object moves about the display and such events accumulate from different parts along its boundary, each must be integrated over time in order to recover the global shape. That is, an element disappearing in one position at one time cannot be directly integrated with the disappearance or reappearance of a second element in a different position at a different time because the object has moved between the two instances. Because the object is in motion, not only must the positions of the transformations relative to the virtual object’s contours be maintained over time but so must the position of the entire object, which is constantly changing. The visual system must take into account the fact that an event indicates the presence of a moving boundary whose position must be updated before it can be integrated with new events.

FIG. 1
Three frames similar to those in Video 1. The square with thick dashed lines demarcates the virtual object and is not displayed. The circles defined by dotted contours indicate elements that fall within the boundary of the virtual object and are invisible (white). In each frame, one would therefore see a collection of black dots on a white background with an empty region. The region itself does not have clearly defined boundaries. When a movie is played, however, illusory contours are seen that correspond to the outline of the virtual object.
Figure from Erlikhman, G. and Kellman, P. J. 2016. Modeling spatiotemporal boundary formation. Vision Res., 126, 131–142.

Perhaps surprisingly, a global form can still be seen even if only a single element transforms per frame (Shipley and Kellman, 1994). It is not necessary to have many simultaneous events to specify a contour between them; in fact, simultaneous events sometimes should not be integrated if they do not appear along the same contour (e.g., on opposite sides of the virtual object). Global forms are also seen if the virtual object rotates, changes in size, or nonrigidly deforms (Erlikhman et al., 2014). In such cases, the transformation of an element on one frame may correspond to a contour that not only changes position on the next frame but also orientation, length, and curvature. This phenomenon is a form of postdiction because the perception of a surface arises only after several transformation events, but it appears to have existed continuously (Choi and Scholl, 2006; Kawabe, 2011). This is most apparent when the elements are arranged in a sawtooth pattern and disappear and reappear one at a time, as if being occluded one by one by a passing object that traverses the display (Erlikhman and Kellman, 2016b). The percept is indeed of an illusory vertical bar. The width of this illusory bar is perfectly determined by the distance between elements and the time between their disappearances and reappearances. For example, a wide bar moving at a constant velocity would occlude an element for longer than a narrow bar. This is exactly what is seen even though the stimulus itself is created simply by specifying the rate at which elements should appear and disappear, with no reference to a bar. This example is postdictive because the width of the bar and therefore the position of its trailing edge can only be determined after an element has reappeared; yet, the entire illusory bar is seen continuously. A similar process takes place in standard SBF displays made of large arrays of elements. The spatiotemporal integration process therefore revises the information from the immediate past with currently available information in order to construct the perceptual present.
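The geometry that fixes the bar’s width can be made explicit. Assuming a bar moving at constant velocity (the numbers below are illustrative placeholders, not stimulus parameters from the chapter), the bar’s speed is set by the element spacing and the lag between successive disappearances, and its width is the speed multiplied by how long each element stays occluded:

```python
# Postdictive bar geometry, assuming constant velocity. All values are
# illustrative placeholders, not stimulus parameters from the chapter.

element_spacing = 1.0   # distance between adjacent elements (deg)
onset_lag = 0.1         # time between successive disappearances (s)
occluded_for = 0.25     # how long each element stays invisible (s)

speed = element_spacing / onset_lag   # leading edge sweeps 10 deg/s
bar_width = speed * occluded_for      # trailing edge implies a 2.5-deg bar

# The width is knowable only after the first element reappears (i.e., only
# once occluded_for has been observed), yet the full bar is seen throughout.
print(speed, bar_width)  # prints 10.0 2.5
```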

The recovery of edges and ultimately global form and global motion in SBF displays may involve several representational and processing stages. First, edges in SBF may be constructed from motion-like signals between transformation events (i.e., spatiotemporal discontinuities). The orientation of an edge can be unambiguously recovered from the time and distance between any three such noncollinear events (Shipley and Kellman, 1994, 1997). Error in orientation estimation of the constructed edges is well predicted by human spatial and temporal uncertainty in event perception, perhaps suggesting that the task can be performed by common motion-detection mechanisms. It has been proposed that the mechanism by which edges are recovered in SBF can be explained by combinations of motion energy filters at different scales, which serve a dual purpose as not only motion detectors but also edge detectors (Erlikhman and Kellman, 2016b). Once edges are recovered, a secondary process integrates them and interpolates missing regions to form the completed boundary (Palmer et al., 2006). Whether edges enter this process and are ultimately seen is constrained by the distance and time between element transformation events (Cunningham et al., 1998; Erlikhman and Kellman, 2016a,b; Shipley and Kellman, 1993). For instance, if Video 1 is slowed down, the element appearances and disappearances are no longer integrated and are perceived as isolated flickering with no coherent global form or motion signal.
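The claim that three noncollinear events suffice can be checked directly. If a straight edge with unit normal n moves at constant normal speed s, each event (p_i, t_i) satisfies n·p_i = c + s*t_i; differencing the three equations eliminates c and s and leaves the edge parallel to w = dt2*d1 - dt1*d2, where d_i and dt_i are the spatial and temporal offsets from the first event. The sketch below is our worked reconstruction of that geometry, not code from the cited papers:

```python
import math

# Recovering edge orientation from three noncollinear transformation events.
# A straight edge with unit normal n moving at normal speed s satisfies
# n·p_i = c + s*t_i at each event (p_i, t_i); differencing pairs of events
# eliminates c and s, leaving n perpendicular to w = dt2*d1 - dt1*d2, so the
# edge itself is parallel to w. (Our derivation of the geometry the text
# describes, not the authors' implementation.)

def edge_orientation(e0, e1, e2):
    """Edge orientation in degrees from three noncollinear events (x, y, t)."""
    (x0, y0, t0), (x1, y1, t1), (x2, y2, t2) = e0, e1, e2
    d1, dt1 = (x1 - x0, y1 - y0), t1 - t0
    d2, dt2 = (x2 - x0, y2 - y0), t2 - t0
    wx = dt2 * d1[0] - dt1 * d2[0]
    wy = dt2 * d1[1] - dt1 * d2[1]
    return math.degrees(math.atan2(wy, wx)) % 180.0

# A vertical edge moving rightward at 1 unit/s reaches x = t, so it triggers
# an event at (x, y) exactly at time t = x, whatever y is:
print(edge_orientation((0.0, 0.0, 0.0), (1.0, 2.0, 1.0), (2.0, -1.0, 2.0)))  # → 90.0
```

Note that only the timing and spacing of the events enter the computation, consistent with the finding that orientation errors track spatial and temporal uncertainty in event perception.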

In addition to color changes leading to SBF as in Video 1, it is also possible to create illusory figures by rotating oriented elements as they enter or exit the boundary of the virtual object, or by displacing them in a random direction by a small amount (Shipley and Kellman, 1994). Videos 2 and 3 (https://doi.org/10.1016/bs.pbr.2017.07.010) demonstrate both of these transformation types. In the videos, the elements are Gabor patches which change in orientation (Video 2) or position (Video 3) whenever they enter or exit the boundary of an expanding and contracting circle centered in the middle of the screen. In static frames from the videos, not only is no shape visible, but neither is the general region which may contain the virtual object, unlike Video 1. Such SBF stimuli are an excellent testbed for examining the neural mechanisms of representational updating and form–motion interactions. In order to perceive the global form, individual events must be perceived and integrated, edges must be extracted, those edges must be integrated, and missing regions must be interpolated. This requires a combination of local processing, memory mechanisms, motion detection, and global form perception. These processes are not readily captured by any single processing pathway. The fact that the same global percept (a circle) is the result of different transformation types allows one to disentangle the effects of local changes from the representation of the global form.

Erlikhman et al. (2016) showed observers such displays containing either expanding and contracting circles, squares, or nonsense shapes while measuring BOLD activity in an fMRI experiment. The nonsense shapes were created by alternating between the virtual circle and square objects between frames. This maintained the same average number of element transformations per frame occurring in the same general regions of the display as the circle and square stimuli, without producing illusory contours or a global form percept. A classifier was trained to discriminate patterns of activity between the two transformation types (rotation vs displacement), between shape and nonshape stimuli, and between circles and squares. Element transformation types could not be discriminated in any visual areas; however, shapes were distinguishable from nonshapes and circles were distinguishable from squares in early visual cortex (V1–V3), intermediate visual areas V3A, V3B, V4v, and the lateral occipital area (LO), as well as the ventral visual area (VO) and parahippocampal cortex (PHC) of the ventral stream. This was not surprising because these areas (e.g., V4v) have been previously associated with representing shape information (Mannion et al., 2010; Wilkinson et al., 2000). Importantly, shapes were also discriminable in the temporal occipital area (TO; overlapping with hMT+) and IPS along the dorsal pathway, which putatively primarily process motion information. This suggests that the dorsal stream may have a vital role in the accumulation and integration of shape information over time (see also Caplovitz and Tse, 2007; Harvey et al., 2010; Newell et al., 2005) and is in agreement with recent work that also implicates areas V3A, V3B/KO, and IPS in perceptual organization and structure-from-motion perception (Braddick et al., 2000; Konen and Kastner, 2008; Lestou et al., 2014; Orban, 2011; Orban et al., 1999; Paradis et al., 2000; Vanduffel et al., 2002; Zaretskaya et al., 2013).
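The pattern-classification logic behind such analyses can be illustrated with a toy example. The study’s actual classifier and preprocessing are not specified here; the nearest-centroid scheme below, run on synthetic “voxel” patterns, is our own stand-in for the general idea of discriminating conditions from multivoxel activity:

```python
import random

# Toy multivoxel pattern classification (our illustration; not the classifier
# or data from the study). Each trial is a vector of simulated voxel
# responses; training averages trials per condition into centroids, and a
# held-out trial is labeled by the closer centroid.

random.seed(0)
N_VOX = 50

circle_mean = [random.gauss(0, 1) for _ in range(N_VOX)]  # condition templates
square_mean = [random.gauss(0, 1) for _ in range(N_VOX)]

def trial(mean):
    """One noisy trial: the condition's template plus voxel-wise noise."""
    return [random.gauss(m, 1.0) for m in mean]

train = {"circle": [trial(circle_mean) for _ in range(20)],
         "square": [trial(square_mean) for _ in range(20)]}

def centroid(trials):
    return [sum(v) / len(v) for v in zip(*trials)]

def classify(pattern, centroids):
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(pattern, centroids[label]))

centroids = {label: centroid(trials) for label, trials in train.items()}
held_out = trial(circle_mean)            # a new "circle" trial
print(classify(held_out, centroids))     # recovers the condition label
```

Above-chance classification of shape vs nonshape in an area implies only that its activity patterns carry shape information, not that the area represents shape explicitly, which is why the converging evidence across methods matters.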

To investigate the constructive time course of spatiotemporal representations in SBF, a similar experiment using a modified version of the SBF stimulus was conducted while measuring EEG (Caplovitz et al., 2013). In this study, the shapes (squares and circles) were defined by color changes of elements (as in Video 1), and instead of expanding and contracting, they moved on a circular path around the middle of the screen. Evoked potentials time-locked to the onset of these SBF stimuli were contrasted with those evoked by a control stimulus that shared the same basic spatiotemporal structure as the SBF stimulus yet did not produce the percept of a coherent moving object. Using source localization (Pascual-Marqui, 2002), this research identified regions that preferentially responded to SBF-defined shape vs nonshape stimuli and examined how activity in those regions changed over time. In agreement with earlier studies on illusory contour formation, differential activity was observed first in later visual areas, prior to differences in early visual cortex (e.g., Murray et al., 2002a). The activation time course across different cortical regions is shown in Fig. 2 for one sample subject. Source localization was used to map EEG activity to high-resolution anatomical scans of the same participants taken from an fMRI experiment. By activation, we therefore do not mean BOLD fMRI activity but, rather, the amplitude of the difference wave between shape and nonshape SBF stimuli. Unlike studies using static stimuli, this early activity first appeared 180 ms after stimulus onset in intermediate (V3A) and dorsal (IPS) areas, and later (215 ms) in ventral temporal cortex and the posterior fusiform (pFs). After these initial activations, the difference was again observed in V3A (256 ms) and pFs (350 ms), indicating that there was continuous communication between ventral and dorsal areas during the construction of SBF shapes. Only at 450 ms was there differential activity in early visual areas (V1 and V2), suggesting that SBF-related fMRI activity in these regions (Erlikhman et al., 2016) may have arisen due to feedback from higher-order visual areas. Taken together, these findings demonstrate that SBF relies on multiple phases of processing in intermediate as well as dorsal and ventral visual areas, with differences in early visual cortex only becoming apparent following the spatiotemporal construction of SBF-defined shapes.

Gradually revealed fragments of an object’s boundary must also be integrated in AP (to be discussed in detail in the next section), in which a moving object is seen through a narrow slit (Hochberg, 1968; Parks, 1965; Plateau, 1836; von Helmholtz, 1867/1925; Zöllner, 1862). Kuai et al. (2017) constructed contour integration displays (Field et al., 1993; Li, 1998) of randomly oriented Gabor elements which contained an embedded sequence of collinear elements, shown anorthoscopically (i.e., seen through a moving, narrow slit so that only a subset of the Gabors was visible on any frame). Observers performed a two-interval forced-choice task in which they discriminated which of two movies contained a collinear set of Gabor elements. However, because the displays were only seen through a narrow slit, no set of collinear elements was ever visible on any single frame. As in SBF, observers had to integrate the individual elements over time by maintaining a representation of their positions and orientations in order to do the task. A classifier was able to distinguish displays which contained a collinear set of elements from those that did not, based on patterns of brain activity, only in higher-level ROIs (LO, V3A, V3B/KO, V7, MT, and IPS), but not in early visual cortex (V1, V2, or V4). The collinear elements were also sometimes aligned along a diagonal path from the bottom-left to the top-right and sometimes from the top-right to the bottom-left. Again, a classifier was only able to distinguish these two kinds of arrangements based on activity in IPS and in no other cortical areas. As with SBF, on any frame only a single element from the path was shown, suggesting that these higher-level visual areas along the dorsal pathway are where dynamic shape information may be integrated. If the displays were not viewed anorthoscopically, so that the entire field of Gabors was shown all at once, then both classifications (collinear vs random and path orientation) could be performed with high accuracy (>70%) in all early visual areas (V1, V2, V3v, V4).

FIG. 2
Source localization of EEG activity for one sample subject depicting the difference between SBF-defined shape and nonshape stimuli, with panels at 180, 215, 256, 350, and 450 ms after stimulus onset (V3A/IPS, ventral/pFs, V3A/IPS, ventral/pFs, and V1/V2, respectively). Bright colors indicate a stronger difference. Data were projected to anatomical scans of the same subject.


Taken together, the contour integration and SBF results suggest that it is dorsal

and posterior parietal regions like the IPS where dynamic shape information is inte-

grated. These areas are involved with motion processing, working memory (Todd

and Marois, 2004; Xu and Chun, 2006), attention (Corbetta et al., 1998), and global

form integration (Caclin et al., 2012; Konen and Kastner, 2008; Zaretskaya et al.,

2013). These processes may subserve the construction of shape representations

over time.

Lastly, while the simple nature of the SBF stimulus demonstrated in Video 1 and
Fig. 1 is well suited for empirical study, it may seem somewhat contrived and
lacking in external validity. SBF, however, is not an esoteric illusion observed only in the

laboratory: the visual system constantly has to recover edges, global form, and global

motion whenever partial information about an object is gradually revealed over time,

such as when we see an object through leaves or the slats of a fence. Indeed, a visit to

a busy train station or airport will quickly reveal that we use this sort of dynamic

piecemeal constructive perceptual process quite frequently in everyday naturalistic

settings.

3 INTERMEDIATE LEVEL: INTEGRATING LOCAL EDGE AND SURFACE INFORMATION TO RECOVER THE SHAPE OF MOVING OBJECTS

SBF elegantly demonstrates that spatiotemporal interpolation processes can operate

on sparse visual input to construct dynamic object boundaries; however, surface in-

formation is also critical for figural parsing and determining object identity. In this

section, we focus on two phenomena—AP and STFI—that highlight mechanisms

involved in the maintenance, integration, and position updating of dynamically

revealed surface and edge information to generate persistent representations of par-

tially occluded moving objects.

3.1 ANORTHOSCOPIC PERCEPTION

AP has been a topic of research interest for over a century (Fendrich et al., 2005;

Hochberg, 1968; Parks, 1965; Plateau, 1836; von Helmholtz, 1867/1925; Zöllner, 1862). AP occurs when an object made partially visible through a narrow slit in

an occluding surface is perceived in its entirety, due to translation of either the object

or the occluding surface. Despite the fact that only a restricted area of the retina cor-

responding to the area projected from the slit is stimulated—as is the case when see-

ing a dog pass behind a door that is ajar—we experience a complete moving figure

that is perceptually expanded beyond the slit itself (Fig. 3; McCloskey and Watkins,

1978). AP indicates that surface and edge information revealed at the same locations

in space can be integrated over time to generate global representations of object

shape and motion. Because AP represents a case of the aperture problem

(Adelson and Movshon, 1982; Nakayama and Silverman, 1988a,b) in which the


speed and direction of motion viewed through a restricted region is ambiguous, its

occurrence is similarly constrained. For one, visible salient features such as line ends

or regions of high curvature (Attneave, 1954; Biederman, 1987) increase the likeli-

hood of coherent object percepts because they aid in the disambiguation of local mo-

tion signals (Rock, 1981). In addition, the slit has to be wide enough so that these

important features can be extracted to construct object identity. Such “trackable

features” are known to lead to more accurate moving object representations under

a variety of viewing conditions because they provide cues that allow the visual sys-

tem to maintain persistent representations of object features that can be positionally

updated and matched over space and time (Blair et al., 2014; Caplovitz and Tse,

2007; Caplovitz et al., 2006).

von Helmholtz (1867/1925) originally proposed that AP occurs due to “retinal

painting” such that the moving figure is progressively spread across the retina with

each successive view in time, resulting in the imprinting of the whole object on the

retina; however, later investigations indicate that it is not a sufficient general expla-

nation. For example, AP still occurs when the eyes remain stationary (Anstis et al.,

1976; Rock, 1981) or when the moving slit is stabilized on the retina (Fendrich, 1983;

Fendrich and Mack, 1980; Morgan et al., 1982). This suggests that while retinal

painting may contribute to AP under some circumstances, it is not a requirement

(Rock et al., 1987).

FIG. 3

Anorthoscopic perception. Form features seen through the white slit are integrated across

successive time points to generate the percept of a coherent translating dog. In these

displays, only the form information visible through the white slit is available at any given moment

in time. This demonstrates a case in which the object is translating behind an occluding

surface; however, AP can also occur if the object remains stationary and the slit moves such

that it “scans” the object to generate the completed percept.


It is now commonly believed that the spatiotemporal integration processes sup-

porting AP occur downstream at the level of cortex. Parks (1965) proposed that suc-

cessive views are accumulated in a postretinal storage buffer and spatiotemporally

integrated based on their time of arrival to yield a complete object representation.

Soon after, in a series of experiments providing evidence against the retinal painting

hypothesis, Rock and Halper (1969) suggested that anorthoscopic percepts are con-

structed through inference processes to arrive at a parsimonious “solution” given ac-

cumulated visual information: a shift in identity from simple features viewed through

the slit to the percept of a coherent object relies on the observer constructing a rep-

resentation of the object that can then be maintained and integrated across successive

views (Rock and Gilchrist, 1975; Rock and Halper, 1969; Rock and Sigman, 1973).

In addition to cognitive factors, pursuit eye movements have been shown to facilitate

AP when the visible moving target is continuously tracked (Fendrich and Mack,

1981; Morgan, 1981; Morgan et al., 1982) by serving as a cue to the occluded ob-

ject’s motion (Rock et al., 1987); however, under free-viewing conditions, sponta-

neous pursuit eye movements did not influence AP, suggesting that the

spatiotemporal integration process occurred postretinally (Fendrich et al., 2005;

Rieger et al., 2007).

Where in the visual system might this postretinal integration occur? Recent fMRI

research indicates important roles for regions of extrastriate cortex including the ven-

tral LOC (Caclin et al., 2012; de-Wit et al., 2012; Fang et al., 2008; Murray et al.,

2002a; Reichert et al., 2014; Yin et al., 2002), the fusiform gyrus (Caclin et al.,
2012), and dorsal hMT+ (Yin et al., 2002), as well as potential contributions from

IPS (Reichert et al., 2014). Interestingly, greater activity in these higher tier regions

is associated with concurrent deactivation in early visual cortex (Caclin et al., 2012;

de-Wit et al., 2012; Fang et al., 2008; Murray et al., 2002a; Reichert et al., 2014; Yin

et al., 2002). This early suppression has been interpreted through the lens of predic-

tive coding models that propose neuronal activity in lower visual regions decreases

when feedback from higher stages of visual processing can “explain” a visual stim-

ulus (Mumford, 1992; Rao and Ballard, 1999). During AP and perhaps other circum-

stances in which partial form features of an object are spatiotemporally revealed,

these higher regions responsible for the integration and maintenance may discount

early visual input when the incoming information is consistent with the current

higher-order representation.

Of particular interest is the involvement of hMT+ in AP. While hMT+ responds

vigorously to visual motion (Tootell et al., 1995) and would be expected to respond to

the motion of the display, it was found to show a preference for moving global

objects (Yin et al., 2002). Specifically, fully visible objects were compared to objects

moving behind a slit that matched the fully visible image or whose visible portions

were distorted as they passed behind the aperture. Interestingly, hMT+ showed the

strongest response to the fully visible and undistorted slit-view conditions, indicating

it plays an important role in the spatiotemporal integration process of sequentially

revealed object parts (Yin et al., 2002). Indeed, hMT+ has been implicated in the com-

putation of structure from motion (Bradley et al., 1998; Murray et al., 2002a, 2003),


is sensitive to the form of objects (Kourtzi et al., 2002), and plays an important role in

generating form–motion percepts (Tse, 2006). Taken together, this suggests that LOC

and hMT+ may not strictly subserve form and motion processing, respectively, but

rather work in concert as a form–motion processing circuit and aid in the perception

of dynamically occluded objects.

3.2 SPATIOTEMPORAL FORM INTEGRATION AND POSITION UPDATING

While AP highlights several critical interactions between form and motion processing that underlie our perception of partially occluded moving objects, it is commonly the case that the location of form features detected at a given moment will

not match the position(s) at which they are subsequently visible. The available sur-

face information of an object at any given point in time is often sparsely dispersed

across different locations in space (e.g., a child running behind a lattice fence). Such

circumstances require the visual system to maintain and spatiotemporally integrate persistent form representations of information at a given location (STFI) and also

postdictively update the position of STFI-completed representations in light of

new visual information to match the speed and direction of the moving object

(Choi and Scholl, 2006; Kawabe, 2011; McCarthy, 2014; McCarthy et al., 2015a,

b; Palmer et al., 2006).

Studies investigating spatial integration have used illusory contours (Kanizsa,

1955, 1979; Kellman and Shipley, 1991; Schuman, 1904) extensively to explore

mechanisms that support unified surface perception under conditions of occlusion.

In these displays, local edge and corner information provided by spatially separated

Pac-man-shaped inducers leads to a robust percept of an occluding shape that matches
the color of the background through the process of visual interpolation (Kellman and

Shipley, 1991). Notably, Kojo et al. (1993) found that presenting inducers sequen-

tially can lead to similar surface representations of static illusory shapes through

STFI. In addition to stationary surface percepts, these STFI processes can lead to

representations of translating (Ghose et al., 2014; Kellman and Shipley, 1991;

Seghier et al., 2000; Stanley and Rubin, 2005) and rotating (Kellman and Cohen,

1984; Kellman and Loukides, 1987; McCarthy, 2014; McCarthy et al., 2015a,b) il-

lusory surfaces. In these moving cases, STFI alone is not sufficient to support coher-

ent object representations: the position of the object representation must also be

updated and matched with current visual input (Kellman and Cohen, 1984;

Kellman and Loukides, 1987; Palmer et al., 2006). A failure to do so leads to shape

distortions and even breakdown of coherent object representations if displacements

are so extreme that they violate the geometric constraints of spatiotemporal relatabil-

ity (Ghose et al., 2014; Keane et al., 2007; Palmer et al., 2006) and correspondence

between successive views cannot be established.

To investigate the spatial and temporal limits under which STFI and position

updating can occur, McCarthy et al. (2015b) showed participants sequentially pre-

sented inducers that generated percepts of (1) stationary illusory squares similar

to Kojo et al. (1993) and (2) rigidly rotating illusory squares (Videos 4 and 5


(https://doi.org/10.1016/bs.pbr.2017.07.010), respectively; Fig. 4). In contrast to

previous work (Kellman and Cohen, 1984; Kellman and Loukides, 1987), rotation

was perceived in the complete absence of motion energy. Specifically, the inducers

themselves and the surface that partially occluded them were static while visible;

hence, motion could only be perceived if the visual system was able to position up-

date the illusory surface between each successive inducer presentation. Critically, if

four successive inducer presentations were instead shown simultaneously, they

would be consistent with an irregular, misaligned polygon (Fig. 4, bottom-right).

A similar percept is generated when the rotation between successive inducers is

FIG. 4

Sequentially presented inducers lead to representations of stationary and rigidly rotating

objects. Top: sequentially presented inducers are spatiotemporally integrated to generate the

percept of an illusory square (STFI; Video 4). Bottom: inducers are presented sequentially,

and the occluding square is rotated between each successive inducer (position updating). If

the positions of the features defined by the inducers are updated and spatiotemporally

integrated, they will lead to the percept of a rigidly rotating rectangle (left; Video 5). If the

rotation between successive inducers is too great such that features are no longer

spatiotemporally relatable, position updating breaks down, leading to local feature

misalignment and global figure distortions (right; Video 6). Dotted lines are shown for

illustration purposes only and were not shown in the actual display.


too great, leading to distortions and a breakdown of the rigid square percept (Video 6

(https://doi.org/10.1016/bs.pbr.2017.07.010)); if position updating occurs and the

conditions of spatiotemporal relatability are not exceeded, however, a rigidly rotat-

ing square can be perceived (Fig. 4, bottom-left). Four primary conclusions were

drawn from the results: (1) STFI can occur over relatively long intervals when the

illusory surface is stationary (i.e., >500 ms between successive inducers), (2) the

temporal window over which the synergy between STFI and position updating yields rigidly
rotating percepts is more restricted (i.e., <300 ms), and this synergy breaks down if the spatial
misalignment between successive inducers is too great (i.e., beyond approximately 10 degrees), (3) static percepts were

just as robust when the ratio between the physically specified edge of the inducer and

the total length of the illusory figure (support ratio; Shipley and Kellman, 1992) was

reduced, suggesting that STFI relies on feature integration rather than contour inter-

polation (i.e., the individual inducers do not appear to generate illusory contours be-

yond their explicitly defined borders), and (4) top-down knowledge facilitates STFI

and position updating in that the spatial and temporal limits supporting representa-

tions of rigidly rotating complex shapes (i.e., silhouettes of animals) were less re-

stricted when they were presented upright compared to when they were inverted,

consistent with inversion effects observed in face (Schwaninger et al., 2003) and ob-

ject perception (Diamond and Carey, 1986; Tanaka and Curran, 2001).
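The support ratio invoked in conclusion (3) is simply the fraction of the figure's boundary that is physically specified by the inducing elements (Shipley and Kellman, 1992). A minimal worked example, with hypothetical dimensions:

```python
def support_ratio(specified_length, total_length):
    """Fraction of an illusory figure's contour that is physically
    specified by inducing elements (Shipley & Kellman, 1992)."""
    return specified_length / total_length

# Hypothetical illusory square: 8-deg sides, with each of the four
# corner inducers specifying 1 deg of contour along both flanking sides.
side = 8.0
specified_per_side = 2 * 1.0  # two inducer-defined segments per side
print(support_ratio(4 * specified_per_side, 4 * side))  # 0.25
```

Shrinking the inducers lowers this ratio; the finding above is that static STFI percepts remained robust as the ratio was reduced.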

To determine the neural substrates of STFI and position updating, McCarthy et al.

(2015a) conducted an fMRI study comparing BOLD activity for static and rigidly

rotating surface representations generated through the sequential presentation of inducers to conditions in which the inducers were rotated 180 degrees outward so that

no illusory surface was perceived. They found that STFI was associated with greater

activity in several extrastriate visual areas, including V3, V3A/V3B, V4v, LOC, and

the kinetic occipital cortex (KO; Tyler et al., 2006; Van Oostende et al., 1997). Each

of these regions showed a greater response to the presence of an illusory figure, independent of whether or not it was moving. In contrast, position updating-related activity was isolated to

KO (with hMT+ bordering on significance): the interaction in these areas revealed

they responded more vigorously when there was a figure present and it was moving.

This suggests that KO and possibly hMT+ play a specialized role in the interplay

between STFI and position updating. KO’s involvement is not surprising given that

it is highly sensitive to kinetically defined borders (Dupont et al., 1997; Grossman

et al., 2000; Van Oostende et al., 1997), coherent dot rotation (Könönen et al., 2003), and kinetic depth structure (Tyler et al., 2006). Moreover, KO and hMT+ have been

implicated in the perception of implied motion from dynamic Glass patterns

(Krekelberg et al., 2005).

In contrast to work on spatial integration, no modulation of V1 or V2 was observed for

STFI. Several single-unit studies have found evidence of illusory contour sensitivity

in early visual cortex (Hubel and Wiesel, 1968; Peterhans and von der Heydt, 1989;

Peterhans et al., 1986; von der Heydt and Peterhans, 1989; von der Heydt et al.,

1984), and have argued that such percepts arise from feed-forward, low level mech-

anisms. Importantly, however, these studies only recorded in early visual cortex, ig-

noring the potential role of feedback from higher visual areas. A seminal study


combining fMRI, EEG, and source localization identified LOC as the first site of

illusory contour sensitivity (Murray et al., 2002b; see also Brodeur et al., 2006),

and subsequent work has supported the notion that V1/V2 activity reflects feedback

from higher regions such as the LOC (for review see Murray and Herrmann, 2013).

Specifically, the N170 component observed bilaterally over lateral occipital cortex

FIG. 5

Top row: stimulus configurations used in the EEG experiment. Dotted lines in the STFI

conditions are for illustrative purposes only and were not shown in the experiment. Middle

row: average current source density (Perrin et al., 1989) waveforms over lateral parieto-occipital electrode sites (PO7/PO8) previously implicated in illusory contour perception (e.g.,

Murray and Herrmann, 2013; Murray et al., 2002b). Black solid lines display the real

contour condition and black dotted lines depict the STFI condition for the stationary (left) and

rotating (right) stimuli. A significantly greater negative deflection was observed from 157

to 226 ms in the stationary condition and 169 to 203 ms in the rotating condition. Bottom row:

associated scalp topography of the STFI–real difference wave for the stationary (left) and

rotating (right) conditions for the significant time windows above. Significant electrode

clusters are shown with white circles around sites PO7/PO8 corresponding to the waveforms

above.


between approximately 150 and 200 ms has been implicated in the spatial integration process

(Knebel and Murray, 2012; Murray et al., 2002b, 2004, 2006).

EEG results indicate that similar mechanisms may also support the spatial inte-

gration of form features over time (McCarthy, 2014). Sequentially presented in-

ducers giving rise to STFI-completed static and rigidly rotating figures were

compared to similar displays in which a real contour was superimposed on each in-

ducer and either remained stationary or rotated at the point of each successive in-

ducer presentation (Fig. 5, top row). Consistent with previous work on spatial

integration (Herrmann et al., 1999; Knebel and Murray, 2012; Murray et al.,

2002b, 2004, 2006; Shpaner et al., 2009), permutation tests (Oostenveld et al.,

2011) revealed significantly greater negative deflections over bilateral parieto-occipital electrode sites (PO7/PO8) for STFI figures from 157 to

226 ms in the stationary condition and 169 to 203 ms in the rotating condition

(Fig. 5, middle row). These differences in the N170 window suggest that STFI relies

in part on the mechanisms that underlie spatial integration (Knebel and Murray,

2012; Murray et al., 2002b, 2004, 2006). Cluster-based permutation analyses

(Oostenveld et al., 2011) of the scalp topography for the associated time windows

revealed that this difference was most pronounced bilaterally over parieto-occipital

and fronto-central sites, with STFI activity being more negative at posterior locations

and positive at frontal locations (Fig. 5, bottom row left). Intriguingly, the observed

amplitude difference between the STFI and real contour displays was reduced in the

rigidly rotating condition. Topographical differences were most pronounced over bi-

lateral parieto-occipital and frontal-temporal sites, as well as centroparietal elec-

trodes with activity again being posteriorly more negative and anteriorly more

positive (Fig. 5, bottom row right). These data may suggest that the addition of position updating resulted in a less robust percept (e.g., reduced N170) and

the recruitment of additional parietal and frontal mechanisms. We note, however,

that while the increased frontal activations may implicate the involvement of top-

down factors such as attention (Corbetta and Shulman, 2002; Corbetta et al.,

1998; Stojanoski and Niemeier, 2011), this is only speculative and the observation

of significance over these regions only indicates that there was a difference between

the STFI and real contour conditions, not where or when this difference occurred

(Maris, 2012; Maris and Oostenveld, 2007). Future research is necessary to draw

strong conclusions regarding the temporal evolution of STFI and delineate the neural

mechanisms that selectively contribute to the position updating of STFI

representations.
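The cluster-based permutation approach cited above (Maris and Oostenveld, 2007; Oostenveld et al., 2011) can be sketched in a simplified one-dimensional form. The data below are simulated, the implementation is deliberately minimal (one-tailed, positive clusters only), and a real analysis would use FieldTrip or MNE-Python.

```python
import numpy as np

rng = np.random.default_rng(1)

def max_cluster_mass(t_vals, threshold):
    """Mass (sum of t-values) of the largest suprathreshold cluster."""
    best = run = 0.0
    for t in t_vals:
        run = run + t if t > threshold else 0.0
        best = max(best, run)
    return best

def cluster_perm_test(diff, threshold=2.0, n_perm=1000):
    """Sign-flip permutation test on per-subject difference waveforms
    (subjects x timepoints). The max-cluster-mass statistic controls
    for multiple comparisons across timepoints."""
    n = diff.shape[0]

    def tstat(d):
        return d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(n))

    observed = max_cluster_mass(tstat(diff), threshold)
    null = np.empty(n_perm)
    for i in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=(n, 1))
        null[i] = max_cluster_mass(tstat(diff * flips), threshold)
    return (null >= observed).mean()  # permutation p-value

# Simulated STFI-minus-real difference waves: 16 subjects x 100 samples,
# with a genuine effect spanning samples 40-60 (cf. the ~157-226 ms window).
diff = rng.normal(size=(16, 100))
diff[:, 40:60] += 1.0
print(cluster_perm_test(diff))  # small p-value for the simulated effect
```

Note that, as discussed above, such a test licenses only the conclusion that the conditions differ somewhere; the cluster extent itself does not localize the effect in time or space.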

Collectively, these findings indicate that STFI is supported by regions beyond

early visual cortex and depends largely on similar mechanisms implicated in spatial

integration. Position updating, in contrast, reflects a highly specialized type of processing that is distinct from other form-based motion percepts (e.g., Krekelberg

et al., 2005; Tse, 2006), recruiting only a few specific regions of visual cortex.

The critical interplay between these two processes represents one of the essential

mechanisms that allow us to perceive coherent object representations across spatial

and temporal gaps in visual input.


4 HIGH LEVEL: MAINTAINING AND UPDATING REPRESENTATIONS OF MOVING OBJECTS CURRENTLY NOT IN VIEW

In the previous sections, we discussed how the representation of an object whose

fragments are only gradually revealed may be integrated and represented neurally.

A more difficult but equally common problem arises for the visual system when

an object completely disappears from view by passing behind a nearer object. In this

case, the observer must also maintain a representation of the object’s position and

velocity over time, but receives no supporting information until the moment the ob-

ject reappears. Like SBF, the neural basis for the representation of occluded objects

has also been localized to both dorsal and ventral higher-level cortical areas such as

the IPS, MT, and LOC (Assad and Maunsell, 1995; Baker et al., 2001; Hulme and

Zeki, 2007; Makin et al., 2009; Olson et al., 2004; Shuwairi et al., 2007). However,

there is one critical difference between SBF and fully occluded forms: in SBF, a

global shape with clearly defined contours is seen, whereas no shape is seen when

an object is occluded. It may therefore not be entirely surprising that shape identity

information in SBF is also represented in early visual cortex (Erlikhman et al., 2016).

For completely occluded objects, by contrast, the mere presence of a BOLD response
in higher visual areas does not reveal which features of the objects are
being represented: the shape, the position, the motion path, etc.

Consider, for example, the tunnel effect: a moving green circle passes behind an

occluder and a red square emerges on the other side traveling at the same speed. The

perception is that of a single, continuously moving object that transformed behind the

occluder (Burke, 1952; Flombaum and Scholl, 2006; Flombaum et al., 2009;

Michotte et al., 1964). Perhaps all that is represented while the object is occluded

is its identity—a token—the representation that there is “something” behind the

occluder, but little other precise information. In this way, the tunnel effect is similar

to multiple object tracking, in which observers can readily maintain which of

several moving objects were the targets, but may not remember which object had

which arbitrarily assigned name (Pylyshyn, 1989). Pylyshyn described this type

of representation as a visual index, or “FINST,” arguing that we maintain abstract representations of objects that are described only by their positions in space, while other

properties such as color and shape are more incidental. This description accords with

some theories of long-range apparent motion in which motion is seen between suc-

cessive flashes of two stationary but different shapes (e.g., a red circle and a green

square)—a sort of two-frame version of the tunnel effect. The representation of the

object in the space between the two flashing shapes was described as “blob-like” and

formless (Attneave, 1974; Marr, 1982; Ullman, 1980). Perhaps when an object disappears, all we represent and maintain are its position and the fact that there is a
“thing” there, but not its surface features.

In order to investigate what information is represented during dynamic occlusion,

Erlikhman and Caplovitz (2017) measured BOLD fMRI activity across both early


(V1–V3) and higher-level cortical areas while observers viewed various shapes

passing behind occluders. In their displays, the visual field was divided into four

quadrants. A circle or star traveled along a circular path from the upper-right to the

lower-left quadrant. Once it reached the horizontal meridian separating the lower-left

and upper-left quadrants, the object reversed direction, traveling only along three quar-

ters of the circular path. The lower-right quadrant through which the object passed was

occluded by a surface the same color as the background. In one condition, the object

disappeared instantly the moment any part of it came into contact with that quadrant

and reappeared as soon as it was fully out of that quadrant. In another condition, the

object became gradually occluded. Fig. 6 shows the sequence of frames corresponding to these two conditions. In this manner, both the lower-right (the “occlusion
quadrant”) and the upper-left (the “empty quadrant”) contained no shape information

during the experiment; however, during certain moments in the animation sequence,

the object was nevertheless passing through the lower-right (but never the upper-left)

quadrant, although it was occluded and invisible.

Because the display was divided into four quadrants, it was possible to leverage

the retinotopy of the early visual cortex to identify regions that corresponded to each

FIG. 6

Sequence of frames depicting what participants saw in Erlikhman and Caplovitz (2017). In the

disappearance and occlusion conditions, objects traveled from the upper-right quadrant

to the lower-left quadrant and then reversed direction. In the disappearance condition (top),

the shape disappeared (represented as an outline in this figure) as soon as any part

came into contact with the occlusion (lower-right) quadrant and reappeared as soon as it fully

exited that quadrant. In the occlusion condition (bottom), the shape gradually

disappeared behind the occlusion quadrant. The thin horizontal and vertical lines indicating

the quadrants are for illustration purposes only and were not shown in the experiment.

Figure adapted from Figure 1 in Erlikhman, G., Caplovitz, G.P., 2017. Decoding information about dynamically

occluded objects in visual cortex. Neuroimage 146, 778–788.


of the quadrants. For example, because the occluded quadrant was always in the

lower-right portion of the display, the dorsal regions of V1, V2, and V3 in the left

hemisphere encoded corresponding information from that part of space. Likewise,

the ventral regions of V1–V3 in the right hemisphere represented the upper-left,

empty quadrant. By comparing the BOLD activation in these two regions, one could

get a measure of occluded-object-specific activity. Note again that no object was ever

seen in either of the regions. In all three early visual areas, a difference in activation
was found for gradually occluded, but not instantly disappearing, objects. Activity in these areas has also been found for invisible objects along the path of

apparent motion (Chong et al., 2016; Larsen et al., 2006; Muckli et al., 2005;

Sterzer et al., 2006); yet, activity in these early visual areas on its own does not ex-

plain what features of the object may be represented.
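The quadrant-based contrast described above amounts to a paired comparison of responses in two retinotopic ROIs, neither of which ever contained a visible object. A schematic version with simulated per-subject data (the effect size and subject count are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Simulated mean BOLD response per subject (n = 16) in the ROI mapping
# the occluded quadrant vs the ROI mapping the empty quadrant.
occluded_roi = rng.normal(loc=1.0, scale=0.5, size=16)  # simulated occlusion-related activity
empty_roi = rng.normal(loc=0.0, scale=0.5, size=16)

# Paired test: any reliable difference is occluded-object-specific,
# since neither region was ever directly stimulated by the object.
t, p = stats.ttest_rel(occluded_roi, empty_roi)
print(f"t = {t:.2f}, p = {p:.4f}")
```

As the text notes, a reliable difference here establishes that something about the occluded object is represented, but not which of its features.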

To determine whether information about an occluded object’s shape is repre-

sented, a classifier was trained to discriminate between circle and star stimuli. However, this classifier was not trained on moving circles and stars. Instead, it was trained

on data from a separate condition in which the shapes flashed in the centers of the

empty and occluded quadrants. The training data were therefore purely derived from

shape information, with no motion signals. These data were then used to attempt to

classify (circle vs star) shapes in the motion condition. If the cross-classification was

successful, it would indicate that the ROIs for which it worked contained some of the

same shape information during dynamic occlusion as they did for when the objects

were visible and stationary. It was first shown that stars and circles could be classi-

fied in V1–V3with a high accuracy (>80%) when trained and tested on flashing data

and also when trained and tested on the motion data for quadrants in which the shapes

were visible (i.e., stars were discriminable from circles in the upper-right and lower-

left quadrants through which the shapes were seen to move). Despite there being

activity in V1–V3 during dynamic occlusion, that activity could not be used to dis-

criminate stars from circles (all classification accuracies were not significantly dif-

ferent from chance). This suggests that the information represented in early visual

cortex during dynamic occlusion is not shape-specific. Rather, it may correspond

to the object’s position, its motion path, or the path of attention (Akselrod et al.,

2014; Culham et al., 1998; Schwarzkopf et al., 2011; Sterzer et al., 2006). Further

analysis found that shape identity could be decoded in higher visual areas such as

VO, LO, TO, LOC, PHC, parahippocampal place area, and hMT+, although the re-

sults should be interpreted with caution because voxels in those regions may capture

information from quadrants where the shape was physically visible, and not only

from the occluded quadrant.
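The cross-classification logic can be caricatured with a toy simulation. The sketch below uses a simple nearest-centroid classifier and synthetic "voxel" patterns (the values and the classifier choice are illustrative assumptions, not the study's actual MVPA pipeline): training patterns come from a "flashed" condition that carries a shape code, and test patterns either share that code (visible quadrant) or contain activity without it (occluded quadrant).

```python
import random

random.seed(0)

def make_patterns(n_trials, centroid, noise):
    """Simulate voxel response patterns scattered around a mean pattern."""
    return [[c + random.gauss(0, noise) for c in centroid]
            for _ in range(n_trials)]

def fit_centroids(patterns_by_class):
    """'Train' by storing the mean pattern for each class."""
    centroids = {}
    for label, pats in patterns_by_class.items():
        n = len(pats)
        centroids[label] = [sum(vals) / n for vals in zip(*pats)]
    return centroids

def classify(centroids, pattern):
    """Assign a pattern to the class with the nearest centroid."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: sqdist(centroids[lab], pattern))

n_vox = 50
circle_code = [1.0] * 25 + [0.0] * 25  # synthetic shape-specific pattern
star_code = [0.0] * 25 + [1.0] * 25

# Train on the 'flashed' condition: pure shape information, no motion.
train = {"circle": make_patterns(40, circle_code, 0.5),
         "star": make_patterns(40, star_code, 0.5)}
centroids = fit_centroids(train)

# Test on the 'motion' condition. Visible quadrant: the shape code is present.
visible = make_patterns(40, circle_code, 0.5)
# Occluded quadrant: activity is present but carries no shape-specific code.
occluded = make_patterns(40, [0.5] * n_vox, 0.5)

acc_visible = sum(classify(centroids, p) == "circle" for p in visible) / 40
acc_occluded = sum(classify(centroids, p) == "circle" for p in occluded) / 40
# Cross-classification succeeds where the shape code transfers across
# conditions (visible quadrant) and falls to chance where it is absent.
```

Only where the training-condition shape code is present does the transferred classifier beat chance, which is the sense in which successful cross-classification implies shared shape information.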

These results suggest that the representation of occluded objects may be token-like: when an object disappears, a memory is maintained of what the object is (e.g., a circle or a star), along with a general representation of its position and velocity, but not specific representations of the positions of its boundaries or other surface properties, at least at the level of early visual cortex where such features of visible objects would be represented (Kahneman et al., 1992; Pylyshyn, 1989).


It should be noted that the disappearance of an object is not strictly necessary for an unlinking of objects and their features. Feature differences between objects, and between an object and its surround, can affect the interpretation of motion in a scene (Caplovitz et al., 2011; Shapiro et al., 2014). For example, if two black rectangles move toward each other from opposite ends of a screen, they may appear either to pass through each other or to bounce off one another at their point of overlap (Kanizsa, 1969; Metzger, 2006; Michotte et al., 1964). If the two objects are given additional cues about their identities, such as unique surface textures (for example, if the one moving left to right is red and the one moving right to left is green), then the display should unambiguously signal that the two objects stream past one another. Nevertheless, under certain circumstances, observers will still experience the two objects as bouncing off each other, even though in order to do so they must swap or exchange features. That is, the left-to-right moving red object, upon collision, is perceived to become green and reverse direction.

5 CONCLUSION

The various phenomena discussed in this chapter highlight the importance of constructive visual processes that support coherent object perception when visual input is limited or absent across space and time. Central to the construction of these percepts is the visual system's ability to maintain representations of previously seen but no longer visible information about an object. Moreover, once new information becomes visible, these persistent representations are operated on, with their positions, orientations, and potentially other feature dimensions updated to form spatiotemporal correspondences that allow a coherent representation of the object to be formed. This constructive process necessarily involves the integration of form and motion information over time, and again, regions of visual cortex in both the ventral and dorsal streams are identified as neural correlates of spatiotemporal integration processes and percepts. These demonstrations call into question the classic form and motion processing dichotomy by demonstrating that critical interactions between these systems occur at multiple levels of analysis to solve the real-world problems of motion and occlusion. SBF demonstrates that the edges of moving objects can be recovered through the integration and updating of fragmented visual transients across space and time. AP and STFI similarly demonstrate that form and surface information can be spatiotemporally integrated to generate representations of global object motion and shape despite fractured visual input. Lastly, an object's identity and position can be maintained and updated when it completely disappears, supporting a cohesive and fluid visual experience.

Although the physical properties that give rise to these phenomena differ, they challenge the view that the form and motion of objects are processed independently; instead, there is consistent evidence for important contributions of mid-level visual areas such as V3A, V3B, LO, and KO (Caplovitz et al., 2013; Erlikhman et al., 2016; Kuai et al., 2017; McCarthy et al., 2015a), as well as regions of the dorsal (e.g., hMT+, TO, and IPS; Erlikhman and Caplovitz, 2017; Erlikhman et al., 2016; McCarthy et al., 2015a; Reichert et al., 2014; Yin et al., 2002) and ventral streams (VO, PHC, pFs, and LOC; Caplovitz et al., 2013; Erlikhman and Caplovitz, 2017; Erlikhman et al., 2016; McCarthy et al., 2015a) in constructing representations of form–motion percepts. Activity representing critical interactions between form and motion processing manifests within these regions, and communication between them, perhaps via the VOF (Keser et al., 2016; Weiner et al., 2016; Yeatman et al., 2014), may jointly comprise a circuit well suited for carrying out such computations. We note that these specific examples represent only a subset of what has currently been reported. Nonetheless, they represent an important step toward understanding how the visual system resolves fundamental ambiguities in the constructive nature of perception to arrive at accurate representations of the current state of our world.

REFERENCES

Adelson, E.H., Movshon, J.A., 1982. Phenomenal coherence of moving visual patterns. Nature 300, 523–525.
Akselrod, M., Herzog, M.H., Ogmen, H., 2014. Tracing path-guided apparent motion in human primary visual cortex V1. Sci. Rep. 4, 6063.
Andersen, G.J., Cortese, J.M., 1989. 2-D contour perception resulting from kinetic occlusion. Percept. Psychophys. 46, 49–55.
Anstis, S., Rogers, G., Steinbach, M., 1976. Research on Anorthoscopic Perception. York University.
Assad, J.A., Maunsell, J.H., 1995. Neuronal correlates of inferred motion in primate posterior parietal cortex. Nature 373, 518–521.
Attneave, F., 1954. Some informational aspects of visual perception. Psychol. Rev. 61, 183–193.
Attneave, F., 1974. Apparent movement and the what-where connection. Psychologia 17, 108–120.
Baker, C.I., Keysers, C., Jellema, T., Wicker, B., Perrett, D.I., 2001. Neuronal representation of disappearing and hidden objects in temporal cortex of the macaque. Exp. Brain Res. 140, 375–381.
Biederman, I., 1987. Recognition-by-components: a theory of human image understanding. Psychol. Rev. 94, 115–147.
Blair, C.D., Goold, J., Killebrew, K., Caplovitz, G.P., 2014. Form features provide a cue to the angular velocity of rotating objects. J. Exp. Psychol. Hum. Percept. Perform. 40, 116–128.
Blair, C.D., Tse, P.U., Caplovitz, G.P., 2015. Interactions of form and motion in the perception of moving objects. In: Wagemans, J. (Ed.), The Oxford Handbook of Perceptual Organization. Oxford University Press, Oxford, UK, pp. 541–559.
Braddick, O.J., O'Brien, J.M., Wattam-Bell, J., Atkinson, J., Turner, R., 2000. Form and motion coherence activate independent, but not dorsal/ventral segregated, networks in the human brain. Curr. Biol. 10, 731–734.
Bradley, D.C., Chang, G.C., Andersen, R.A., 1998. Encoding of three-dimensional structure-from-motion by primate area MT neurons. Nature 392, 714–717.
Brodeur, M., Lepore, F., Debruille, J.B., 2006. The effect of interpolation and perceptual difficulty on the visual potentials evoked by illusory figures. Brain Res. 1068, 143–150.


Burke, L., 1952. On the tunnel effect. Q. J. Exp. Psychol. 4, 121–138.
Caclin, A., Paradis, A.L., Lamirel, C., Thirion, B., Artiges, E., Poline, J.B., Lorenceau, J., 2012. Perceptual alternations between unbound moving contours and bound shape motion engage a ventral/dorsal interplay. J. Vis. 12 (7), 11.
Caplovitz, G.P., 2011. Visual form motion interactions. In: Columbus, A.M. (Ed.), Advances in Psychology Research, vol. 82. Nova Science, New York, pp. 133–152.
Caplovitz, G.P., Tse, P.U., 2007. V3A processes contour curvature as a trackable feature for the perception of rotational motion. Cereb. Cortex 17, 1179–1189.
Caplovitz, G.P., Hsieh, P.J., Tse, P.U., 2006. Mechanisms underlying the perceived angular velocity of a rigidly rotating object. Vision Res. 46, 2877–2893.
Caplovitz, G.P., Shapiro, A.G., Stroud, S., 2011. The maintenance and disambiguation of object representations depend upon feature contrast within and between objects. J. Vis. 11 (14), 1.
Caplovitz, G.P., Erlikhman, G., Lago, J., Kellman, P.J., 2013. Neural correlates of spatiotemporal boundary formation (SBF). J. Vis. 13, 58.
Choi, H., Scholl, B.J., 2006. Perceiving causality after the fact: postdiction in the temporal dynamics of causal perception. Perception 35, 385–399.
Chong, E., Familiar, A.M., Shim, W.M., 2016. Reconstructing representations of dynamic visual objects in early visual cortex. Proc. Natl. Acad. Sci. U. S. A. 113, 1453–1458.
Cicerone, C.M., Hoffman, D.D., Gowdy, P.D., Kim, J.S., 1995. The perception of color from motion. Percept. Psychophys. 57, 761–777.
Corbetta, M., Shulman, G.L., 2002. Control of goal-directed and stimulus-driven attention in the brain. Nat. Rev. Neurosci. 3, 201–215.
Corbetta, M., Akbudak, E., Conturo, T.E., Snyder, A.Z., Ollinger, J.M., Drury, H.A., Linenweber, M.R., Petersen, S.E., Raichle, M.E., Van Essen, D.C., Shulman, G.L., 1998. A common network of functional areas for attention and eye movements. Neuron 21, 761–773.
Culham, J.C., Brandt, S.A., Cavanagh, P., Kanwisher, N.G., Dale, A.M., Tootell, R.B., 1998. Cortical fMRI activation produced by attentive tracking of moving targets. J. Neurophysiol. 80, 2657–2670.
Cunningham, D.W., Shipley, T.F., Kellman, P.J., 1998. The dynamic specification of surfaces and boundaries. Perception 27, 403–415.
de-Wit, L.H., Kubilius, J., Wagemans, J., Op de Beeck, H.P., 2012. Bistable Gestalts reduce activity in the whole of V1, not just the retinotopically predicted parts. J. Vis. 12 (11), 12.
Diamond, R., Carey, S., 1986. Why faces are and are not special: an effect of expertise. J. Exp. Psychol. Gen. 115, 107–117.
DiCarlo, J.J., Zoccolan, D., Rust, N.C., 2012. How does the brain solve visual object recognition? Neuron 73, 415–434.
Dupont, P., De Bruyn, B., Vandenberghe, R., Rosier, A.M., Michiels, J., Marchal, G., Mortelmans, L., Orban, G.A., 1997. The kinetic occipital region in human visual cortex. Cereb. Cortex 7, 283–292.
Erlikhman, G., Caplovitz, G.P., 2017. Decoding information about dynamically occluded objects in visual cortex. Neuroimage 146, 778–788.
Erlikhman, G., Kellman, P.J., 2016a. From flashes to edges to objects: recovery of local edge fragments initiates spatiotemporal boundary formation. Front. Psych. 7, 910.
Erlikhman, G., Kellman, P.J., 2016b. Modeling spatiotemporal boundary formation. Vision Res. 126, 131–142.
Erlikhman, G., Xing, Y.Z., Kellman, P.J., 2014. Non-rigid illusory contours and global shape transformations defined by spatiotemporal boundary formation. Front. Hum. Neurosci. 8, 978.


Erlikhman, G., Gurariy, G., Mruczek, R.E., Caplovitz, G.P., 2016. The neural representation of objects formed through the spatiotemporal integration of visual transients. Neuroimage 142, 67–78.
Fang, F., Kersten, D., Murray, S.O., 2008. Perceptual grouping and inverse fMRI activity patterns in human visual cortex. J. Vis. 8 (2), 1–9.
Fendrich, R., 1983. Anorthoscopic figure perception: the role of retinal painting produced by observer eye motion. Unpublished doctoral dissertation, New School for Social Research.
Fendrich, R., Mack, A., 1980. Anorthoscopic perception occurs with a retinally stabilized stimulus. Invest. Ophthalmol. Vis. Sci. 19 (Suppl), 166.
Fendrich, R., Mack, A., 1981. Retinal and post-retinal factors in anorthoscopic figure perception. Invest. Ophthalmol. Vis. Sci. 20, 166.
Fendrich, R., Rieger, J.W., Heinze, H.J., 2005. The effect of retinal stabilization on anorthoscopic percepts under free-viewing conditions. Vision Res. 45, 567–582.
Ferber, S., Humphrey, G.K., Vilis, T., 2003. The lateral occipital complex subserves the perceptual persistence of motion-defined groupings. Cereb. Cortex 13, 716–721.
Field, D.J., Hayes, A., Hess, R.F., 1993. Contour integration by the human visual system: evidence for a local "association field". Vision Res. 33, 173–193.
Flombaum, J.I., Scholl, B.J., 2006. A temporal same-object advantage in the tunnel effect: facilitated change detection for persisting objects. J. Exp. Psychol. Hum. Percept. Perform. 32, 840–853.
Flombaum, J.I., Scholl, B.J., Santos, L.R., 2009. Spatiotemporal priority as a fundamental principle of object persistence. In: Hood, B., Santos, L. (Eds.), The Origins of Object Knowledge. Oxford University Press, Oxford, UK, pp. 135–164.
Freud, E., Plaut, D.C., Behrmann, M., 2016. "What" is happening in the dorsal visual pathway. Trends Cogn. Sci. 20, 773–784.
Ghose, T., Liu, J., Kellman, P.J., 2014. Recovering metric properties of objects through spatiotemporal interpolation. Vision Res. 102, 80–88.
Gibson, J.J., Kaplan, G.A., Reynolds, H.N., Wheeler, K., 1969. The change from visible to invisible: a study of optical transitions. Percept. Psychophys. 5, 113–116.
Goodale, M.A., Milner, A.D., 1992. Separate visual pathways for perception and action. Trends Neurosci. 15, 20–25.
Goodale, M.A., Meenan, J.P., Bülthoff, H.H., Nicolle, D.A., Murphy, K.J., Racicot, C.I., 1994. Separate neural pathways for the visual analysis of object shape in perception and prehension. Curr. Biol. 4, 604–610.
Grill-Spector, K., Weiner, K.S., 2014. The functional architecture of the ventral temporal cortex and its role in categorization. Nat. Rev. Neurosci. 15, 536–548.
Grill-Spector, K., Kourtzi, Z., Kanwisher, N., 2001. The lateral occipital complex and its role in object recognition. Vision Res. 41, 1409–1422.
Grossman, E., Donnelly, M., Price, R., Pickens, D., Morgan, V., Neighbor, G., Blake, R., 2000. Brain areas involved in perception of biological motion. J. Cogn. Neurosci. 12, 711–720.
Harvey, B.M., Braddick, O.J., Cowey, A., 2010. Similar effects of repetitive transcranial magnetic stimulation of MT+ and a dorsomedial extrastriate site including V3A on pattern detection and position discrimination of rotating and radial motion patterns. J. Vis. 10, 21.
Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J.L., Pietrini, P., 2001. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430.
Herrmann, C.S., Mecklinger, A., Pfeifer, E., 1999. Gamma responses and ERPs in a visual classification task. Clin. Neurophysiol. 110, 636–642.


Hine, T., 1987. Subjective contours produced purely by dynamic occlusion of sparse-points array. Bull. Psychon. Soc. 25, 182–184.
Hochberg, J., 1968. In the mind's eye. In: Haber, R.N. (Ed.), Contemporary Theory and Research in Visual Perception. Holt, Rinehart & Winston, New York, pp. 309–331.
Hubel, D.H., Wiesel, T.N., 1968. Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195, 215–243.
Huk, A.C., Dougherty, R.F., Heeger, D.J., 2002. Retinotopy and functional subdivision of human areas MT and MST. J. Neurosci. 22, 7195–7205.
Hulme, O.J., Zeki, S., 2007. The sightless view: neural correlates of occluded objects. Cereb. Cortex 17, 1197–1205.
Ikkai, A., Jerde, T.A., Curtis, C.E., 2011. Perception and action selection dissociate human ventral and dorsal cortex. J. Cogn. Neurosci. 23, 1494–1506.
James, W., 1890. The Principles of Psychology (vols. 1 and 2). Holt, New York.
Kahneman, D., Treisman, A., Gibbs, B.J., 1992. The reviewing of object files: object-specific integration of information. Cogn. Psychol. 24, 175–219.
Kanizsa, G., 1955. Margini quasi-percettivi in campi con stimolazione omogenea. Riv. Psicol. 49, 7–30.
Kanizsa, G., 1969. Perception, past experience and the "impossible experiment". Acta Psychol. (Amst.) 31, 66–96.
Kanizsa, G., 1979. Organization in Vision: Essays on Gestalt Perception. Praeger, New York.
Kanwisher, N., Chun, M.M., McDermott, J., Ledden, P.J., 1996. Functional imaging of human visual recognition. Brain Res. Cogn. Brain Res. 5, 55–67.
Kaplan, G.A., 1969. Kinetic disruption of optical texture: perception of depth at an edge. Percept. Psychophys. 6, 193–198.
Kawabe, T., 2011. Nonretinotopic processing is related to postdictive size modulation in apparent motion. Atten. Percept. Psychophys. 73, 1522–1531.
Keane, B.P., Lu, H., Kellman, P.J., 2007. Classification images reveal spatiotemporal contour interpolation. Vision Res. 47, 3460–3475.
Kellman, P.J., Cohen, M.H., 1984. Kinetic subjective contours. Percept. Psychophys. 35, 237–244.
Kellman, P.J., Loukides, M.G., 1987. An object perception approach to static and kinetic subjective contours. In: Petry, S., Meyer, G.E. (Eds.), The Perception of Illusory Contours. Springer, New York, pp. 151–164.
Kellman, P.J., Shipley, T.F., 1991. A theory of visual interpolation in object perception. Cogn. Psychol. 23, 141–221.
Keser, Z., Ucisik-Keser, F.E., Hasan, K.M., 2016. Quantitative mapping of human brain vertical-occipital fasciculus. J. Neuroimaging 26, 188–193.
Klaver, P., Lichtensteiger, J., Bucher, K., Dietrich, T., Loenneker, T., Martin, E., 2008. Dorsal stream development in motion and structure-from-motion perception. Neuroimage 39, 1815–1823.
Knebel, J.F., Murray, M.M., 2012. Towards a resolution of conflicting models of illusory contour processing in humans. Neuroimage 59, 2808–2817.
Kojo, I., Liinasuo, M., Rovamo, J., 1993. Spatial and temporal properties of illusory figures. Vision Res. 33, 897–901.
Konen, C.S., Kastner, S., 2008. Two hierarchically organized neural systems for object information in human visual cortex. Nat. Neurosci. 11, 224–231.
Könönen, M., Pääkkönen, A., Pihlajamäki, M., Partanen, K., Karjalainen, P.A., Soimakallio, S., Aronen, H.J., 2003. Visual processing of coherent rotation in the central visual field: an fMRI study. Perception 32, 1247–1257.


Kourtzi, Z., Bülthoff, H.H., Erb, M., Grodd, W., 2002. Object-selective responses in the human motion area MT/MST. Nat. Neurosci. 5, 17–18.
Kourtzi, Z., Erb, M., Grodd, W., Bülthoff, H.H., 2003. Representation of the perceived 3-D object shape in the human lateral occipital complex. Cereb. Cortex 13, 911–920.
Koyama, S., Sasaki, Y., Andersen, G.J., Tootell, R.B., Matsuura, M., Watanabe, T., 2005. Separate processing of different global-motion structures in visual cortex is revealed by fMRI. Curr. Biol. 15, 2027–2032.
Krekelberg, B., Vatakis, A., Kourtzi, Z., 2005. Implied motion from form in the human visual cortex. J. Neurophysiol. 94, 4373–4386.
Kuai, S.G., Li, W., Yu, C., Kourtzi, Z., 2017. Contour integration over time: psychophysical and fMRI evidence. Cereb. Cortex 27, 3042–3051.
Larsen, A., Madsen, K.H., Lund, T.E., Bundesen, C., 2006. Images of illusory motion in primary visual cortex. J. Cogn. Neurosci. 18, 1174–1180.
Lehky, S.R., Sereno, A.B., 2007. Comparison of shape encoding in primate dorsal and ventral visual pathways. J. Neurophysiol. 97, 307–319.
Lestou, V., Lam, J.M., Humphreys, K., Kourtzi, Z., Humphreys, G.W., 2014. A dorsal visual route necessary for global form perception: evidence from neuropsychological fMRI. J. Cogn. Neurosci. 26, 621–634.
Li, Z., 1998. A neural model of contour integration in the primary visual cortex. Neural Comput. 10, 903–940.
Liu, T., Cooper, L.A., 2003. Explicit and implicit memory for rotating objects. J. Exp. Psychol. Learn. Mem. Cogn. 29, 554–562.
Liu, T., Slotnick, S.D., Yantis, S., 2004. Human MT+ mediates perceptual filling-in during apparent motion. Neuroimage 21, 1772–1780.
Makin, A.D., Poliakoff, E., El-Deredy, W., 2009. Tracking visible and occluded targets: changes in event related potentials during motion extrapolation. Neuropsychologia 47, 1128–1137.
Malach, R., Reppas, J.B., Benson, R.R., Kwong, K.K., Jiang, H., Kennedy, W.A., Ledden, P.J., Brady, T.J., Rosen, B.R., Tootell, R.B., 1995. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proc. Natl. Acad. Sci. U. S. A. 92, 8135–8139.
Mannion, D.J., McDonald, J.S., Clifford, C.W., 2010. The influence of global form on local orientation anisotropies in human visual cortex. Neuroimage 52, 600–605.
Maris, E., 2012. Statistical testing in electrophysiological studies. Psychophysiology 49, 549–565.
Maris, E., Oostenveld, R., 2007. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods 164, 177–190.
Marr, D., 1982. Vision: A Computational Investigation Into the Human Representation and Processing of Visual Information. WH Freeman, San Francisco.
McCarthy, J.D., 2014. Spatiotemporal form integration: unified surface perception in a world fragmented in space and time. Doctoral dissertation, University of Nevada, Reno.
McCarthy, J.D., Kohler, P.J., Tse, P.U., Caplovitz, G.P., 2015a. Extrastriate visual areas integrate form features over space and time to construct representations of stationary and rigidly rotating objects. J. Cogn. Neurosci. 27, 2158–2173.
McCarthy, J.D., Strother, L., Caplovitz, G.P., 2015b. Spatiotemporal form integration: sequentially presented inducers can lead to representations of stationary and rigidly rotating objects. Atten. Percept. Psychophys. 77, 2740–2754.


McCloskey, M., Watkins, M.J., 1978. The seeing-more-than-is-there phenomenon: implications for the locus of iconic storage. J. Exp. Psychol. Hum. Percept. Perform. 4, 553–564.
Metzger, W., 2006. Laws of Seeing (L. Spillmann, Trans.). MIT Press, Cambridge (Original work published 1953).
Michotte, A., Thinès, G., Crabbé, G., 1964. Les Compléments Amodaux des Structures Perceptives. Studia Psychologica. Institut de psychologie de l'Université de Louvain, Louvain, BE.
Milner, A.D., Goodale, M.A., 2008. Two visual systems re-viewed. Neuropsychologia 46, 774–785.
Miyahara, E., Cicerone, C.M., 1997. Color from motion: separate contributions of chromaticity and luminance. Perception 26, 1381–1396.
Morgan, M., 1981. How pursuit eye movements can convert temporal into spatial information. In: Fisher, D.F., Monty, R.A., Senders, J.W. (Eds.), Eye Movements: Cognition and Visual Perception. Erlbaum, Hillsdale, NJ, pp. 95–111.
Morgan, M.J., Findlay, J.M., Watt, R.J., 1982. Aperture viewing: a review and a synthesis. Q. J. Exp. Psychol. 34A, 211–233.
Muckli, L., Kohler, A., Kriegeskorte, N., Singer, W., 2005. Primary visual cortex activity along the apparent-motion trace reflects illusory perception. PLoS Biol. 3, e265.
Mumford, D., 1992. On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biol. Cybern. 66, 241–251.
Murray, M.M., Herrmann, C.S., 2013. Illusory contours: a window onto the neurophysiology of constructing perception. Trends Cogn. Sci. 17, 471–481.
Murray, S.O., Kersten, D., Olshausen, B.A., Schrater, P., Woods, D.L., 2002a. Shape perception reduces activity in human primary visual cortex. Proc. Natl. Acad. Sci. U. S. A. 99, 15164–15169.
Murray, M.M., Wylie, G.R., Higgins, B.A., Javitt, D.C., Schroeder, C.E., Foxe, J.J., 2002b. The spatiotemporal dynamics of illusory contour processing: combined high-density electrical mapping, source analysis, and functional magnetic resonance imaging. J. Neurosci. 22, 5055–5073.
Murray, S.O., Olshausen, B.A., Woods, D.L., 2003. Processing shape, motion and three-dimensional shape-from-motion in the human cortex. Cereb. Cortex 13, 508–516.
Murray, M.M., Foxe, D.M., Javitt, D.C., Foxe, J.J., 2004. Setting boundaries: brain dynamics of modal and amodal illusory shape completion in humans. J. Neurosci. 24, 6898–6903.
Murray, M.M., Imber, M.L., Javitt, D.C., Foxe, J.J., 2006. Boundary completion is automatic and dissociable from shape discrimination. J. Neurosci. 26, 12043–12054.
Nakayama, K., Silverman, G.H., 1988a. The aperture problem—I. Perception of nonrigidity and motion direction in translating sinusoidal lines. Vision Res. 28, 739–746.
Nakayama, K., Silverman, G.H., 1988b. The aperture problem—II. Spatial integration of velocity information along contours. Vision Res. 28, 747–753.
Newell, F.N., Sheppard, D.M., Edelman, S., Shapiro, K.L., 2005. The interaction of shape- and location-based priming in object categorisation: evidence for a hybrid "what + where" representation stage. Vision Res. 45, 2065–2080.
Olson, I.R., Gatenby, J.C., Leung, H.-C., Skudlarski, P., Gore, J.C., 2004. Neuronal representation of occluded objects in the human brain. Neuropsychologia 42, 95–104.
Oostenveld, R., Fries, P., Maris, E., Schoffelen, J.M., 2011. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2011, 156869.


Orban, G.A., 2011. The extraction of 3D shape in the visual system of human and nonhuman primates. Annu. Rev. Neurosci. 34, 361–388.
Orban, G.A., Sunaert, S., Todd, J.T., Van Hecke, P., Marchal, G., 1999. Human cortical regions involved in extracting depth from motion. Neuron 24, 929–940.
Ostwald, D., Lam, J.M., Li, S., Kourtzi, Z., 2008. Neural coding of global form in the human visual cortex. J. Neurophysiol. 99, 2456–2469.
Palmer, E.M., Kellman, P.J., 2014. The aperture capture illusion: misperceived forms in dynamic occlusion displays. J. Exp. Psychol. Hum. Percept. Perform. 40, 502–524.
Palmer, E.M., Kellman, P.J., Shipley, T.F., 2006. A theory of dynamic occluded and illusory object perception. J. Exp. Psychol. Gen. 135, 513–541.
Paradis, A.L., Cornilleau-Pérès, V., Droulez, J., Van De Moortele, P.F., Lobel, E., Berthoz, A., Le Bihan, D., Poline, J.B., 2000. Visual perception of motion and 3-D structure from motion: an fMRI study. Cereb. Cortex 10, 772–783.
Parks, T.E., 1965. Post-retinal visual storage. Am. J. Psychol. 78, 145–147.
Pascual-Marqui, R.D., 2002. Standardized low-resolution brain electromagnetic tomography (sLORETA): technical details. Methods Find. Exp. Clin. Pharmacol. 24 (Suppl. D), 5–12.
Pavan, A., Ghin, F., Donato, R., Campana, G., Mather, G., 2017. The neural basis of form and form-motion integration from static and dynamic translational Glass patterns: a rTMS investigation. Neuroimage 157, 555–560.
Perrin, F., Pernier, J., Bertrand, O., Echallier, J.F., 1989. Spherical splines for scalp potential and current density mapping. Electroencephalogr. Clin. Neurophysiol. 72, 184–187.
Perry, C.J., Fallah, M., 2014. Feature integration and object representations along the dorsal stream visual hierarchy. Front. Comput. Neurosci. 8, 84.
Peterhans, E., von der Heydt, R., 1989. Mechanisms of contour perception in monkey visual cortex. II. Contours bridging gaps. J. Neurosci. 9, 1749–1763.
Peterhans, E., von der Heydt, R., Baumgartner, G., 1986. Neuronal responses to illusory contour stimuli reveal stages of visual cortical processing. Vis. Neurosci. 343–351.
Plateau, J., 1836. Notice sur l'Anorthoscope. Bull. Acad. Roy. Sci. Bruxelles 3, 7–10.
Pylyshyn, Z., 1989. The role of location indexes in spatial perception: a sketch of the FINST spatial-index model. Cognition 32, 65–97.
Rao, R.P., Ballard, D.H., 1999. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2, 79–87.
Reichert, C., Fendrich, R., Bernarding, J., Tempelmann, C., Hinrichs, H., Rieger, J.W., 2014. Online tracking of the contents of conscious perception using real-time fMRI. Front. Neurosci. 8, 116.
Rieger, J.W., Gruschow, M., Heinze, H.J., Fendrich, R., 2007. The appearance of figures seen through a narrow aperture under free viewing conditions: effects of spontaneous eye motions. J. Vis. 7, 10.
Rock, I., 1981. Anorthoscopic perception. Sci. Am. 244, 145–153.
Rock, I., Gilchrist, A., 1975. Induced form. Am. J. Psychol. 88, 475–482.
Rock, I., Halper, F., 1969. Form perception without a retinal image. Am. J. Psychol. 82, 425–440.
Rock, I., Sigman, E., 1973. Intelligence factors in the perception of form through a moving slit. Perception 2, 357–369.
Rock, I., Halper, F., DiVita, J., Wheeler, D., 1987. Eye movement as a cue to figure motion in anorthoscopic perception. J. Exp. Psychol. Hum. Percept. Perform. 13, 344–352.
Rodman, H.R., Albright, T.D., 1989. Single-unit analysis of pattern-motion selective properties in the middle temporal visual area (MT). Exp. Brain Res. 75, 53–64.


Schumann, F., 1904. Einige Beobachtungen über die Zusammenfassung von Gesichtseindrücken zu Einheiten. Psychol. Studien 1, 1–32.
Schwaninger, A., Carbon, C.-C., Leder, H., 2003. Expert face processing: specialization and constraints. In: Schwarzer, G., Leder, H. (Eds.), Development of Face Processing. Hogrefe, Göttingen, DE, pp. 81–97.
Schwarzkopf, D.S., Sterzer, P., Rees, G., 2011. Decoding of coherent but not incoherent motion signals in early dorsal visual cortex. Neuroimage 56, 688–698.
Seghier, M., Dojat, M., Delon-Martin, C., Rubin, C., Warnking, J., Segebarth, C., Bullier, J., 2000. Moving illusory contours activate primary visual cortex: an fMRI study. Cereb. Cortex 10, 663–670.
Sereno, M.I., Pitzalis, S., Martinez, A., 2001. Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science 294, 1350–1354.
Shapiro, A.G., Caplovitz, G.P., Dixon, E.L., 2014. Feature- and face-exchange illusions: new insights and applications for the study of the binding problem. Front. Hum. Neurosci. 8, 804.
Shipley, T.F., Kellman, P.J., 1992. Strength of visual interpolation depends on the ratio of physically specified to total edge length. Percept. Psychophys. 52, 97–106.
Shipley, T.F., Kellman, P.J., 1993. Optical tearing in spatiotemporal boundary formation: when do local element motions produce boundaries, form, and global motion? Spat. Vis. 7, 323–339.
Shipley, T.F., Kellman, P.J., 1994. Spatiotemporal boundary formation: boundary, form, and motion perception from transformations of surface elements. J. Exp. Psychol. Gen. 123, 3–20.
Shipley, T.F., Kellman, P.J., 1997. Spatio-temporal boundary formation: the role of local motion signals in boundary perception. Vision Res. 37, 1281–1293.
Shpaner, M., Murray, M.M., Foxe, J.J., 2009. Early processing in the human lateral occipital complex is highly responsive to illusory contours but not to salient regions. Eur. J. Neurosci. 30, 2018–2028.
Shuwairi, S.M., Curtis, C.E., Johnson, S.P., 2007. Neural substrates of dynamic object occlusion. J. Cogn. Neurosci. 19, 1275–1285.
Silver, M.A., Ress, D., Heeger, D.J., 2005. Topographic maps of visual spatial attention in human parietal cortex. J. Neurophysiol. 94, 1358–1371.
Stappers, P.J., 1989. Forms can be recognized from dynamic occlusion alone. Percept. Mot. Skills 68, 243–251.
Stanley, D.A., Rubin, N., 2005. Rapid detection of salient regions: evidence from apparent motion. J. Vis. 5, 690–701.
Sterzer, P., Haynes, J.D., Rees, G., 2006. Primary visual cortex activation on the path of apparent motion is mediated by feedback from hMT+/V5. Neuroimage 32, 1308–1316.
Stojanoski, B., Niemeier, M., 2011. The timing of feature-based attentional effects during object perception. Neuropsychologia 49, 3406–3418.
Stone, J.V., 1999. Object recognition: view-specificity and motion-specificity. Vision Res. 39, 4032–4044.
Tanaka, J.W., Curran, T., 2001. A neural basis for expert object recognition. Psychol. Sci. 12, 43–47.
Todd, J.J., Marois, R., 2004. Capacity limit of visual short-term memory in human posterior parietal cortex. Nature 428, 751–754.
Tootell, R.B., Reppas, J.B., Kwong, K.K., Malach, R., Born, R.T., Brady, T.J., Rosen, B.R., Belliveau, J.W., 1995. Functional analysis of human MT and related visual cortical areas using magnetic resonance imaging. J. Neurosci. 15, 3215–3230.


Tse, P.U., 2006. Neural correlates of transformational apparent motion. Neuroimage 31, 766–773.
Tyler, C.W., Likova, L.T., Kontsevich, L.L., Wade, A.R., 2006. The specificity of cortical region KO to depth structure. Neuroimage 30, 228–238.
Ullman, S., 1980. The effect of similarity between line segments on the correspondence strength in apparent motion. Perception 9, 617–626.
Ungerleider, L., Mishkin, M., 1982. Two cortical visual systems. In: Ingle, D., Goodale, M.A., Mansfield, R. (Eds.), Analysis of Visual Behavior. MIT Press, Cambridge, MA, pp. 549–586.
Vaina, L.M., Solomon, J., Chowdhury, S., Sinha, P., Belliveau, J.W., 2001. Functional neuroanatomy of biological motion perception in humans. Proc. Natl. Acad. Sci. U. S. A. 98, 11656–11661.
Van Oostende, S., Sunaert, S., Van Hecke, P., Marchal, G., Orban, G.A., 1997. The kinetic occipital (KO) region in man: an fMRI study. Cereb. Cortex 7, 690–701.
Vanduffel, W., Fize, D., Peuskens, H., Denys, K., Sunaert, S., Todd, J.T., Orban, G.A., 2002. Extracting 3D from motion: differences in human and monkey intraparietal cortex. Science 298, 413–415.
Vinberg, J., Grill-Spector, K., 2008. Representation of shapes, edges, and surfaces across multiple cues in the human visual cortex. J. Neurophysiol. 99, 1380–1393.
von der Heydt, R., Peterhans, E., 1989. Mechanisms of contour perception in monkey visual cortex. I. Lines of pattern discontinuity. J. Neurosci. 9, 1731–1748.
von der Heydt, R., Peterhans, E., Baumgartner, G., 1984. Illusory contours and cortical neuron responses. Science 224, 1260–1262.
von Helmholtz, H., 1867/1925. Treatise on Physiological Optics. Dover Press, New York.
Wall, M.B., Lingnau, A., Ashida, H., Smith, A.T., 2008. Selective visual responses to expansion and rotation in the human MT complex revealed by functional magnetic resonance imaging adaptation. Eur. J. Neurosci. 27, 2747–2757.
Weiner, K.S., Yeatman, J.D., Wandell, B.A., 2016. The posterior arcuate fasciculus and the vertical occipital fasciculus. Cortex. pii: S0010-9452(16)30050-8, Epub ahead of print.
Wilkinson, F., James, T.W., Wilson, H.R., Gati, J.S., Menon, R.S., Goodale, M.A., 2000. An fMRI study of the selective activation of human extrastriate form vision areas by radial and concentric gratings. Curr. Biol. 10, 1455–1458.
Xu, Y., Chun, M.M., 2006. Dissociable neural mechanisms supporting visual short-term memory for objects. Nature 440, 91–95.
Yeatman, J.D., Weiner, K.S., Pestilli, F., Rokem, A., Mezer, A., Wandell, B.A., 2014. The vertical occipital fasciculus: a century of controversy resolved by in vivo measurements. Proc. Natl. Acad. Sci. U. S. A. 111, E5214–E5223.
Yin, C., Shimojo, S., Moore, C., Engel, S.A., 2002. Dynamic shape integration in extrastriate cortex. Curr. Biol. 12, 1379–1385.
Yonas, A., Craton, L.G., Thompson, W.B., 1987. Relative motion: kinetic information for the order of depth at an edge. Percept. Psychophys. 41, 53–59.
Zaretskaya, N., Anstis, S., Bartels, A., 2013. Parietal cortex mediates conscious perception of illusory gestalt. J. Neurosci. 33, 523–531.
Zhuo, Y., Zhou, T.G., Rao, H.Y., Wang, J.J., Meng, M., Chen, M., Zhou, C., Chen, L., 2003. Contributions of the visual ventral pathway to long-range apparent motion. Science 299, 417–420.
Zöllner, F., 1862. Über eine neue Art anorthoskopischer Zerrbilder. Ann. Phys. 193, 477–484.
