What is focal attention for? The What and Why of perceptual selection

The central function of focal attention is to select.

We must select because our capacity to process information is limited.

We must select because we need to be able to mark certain aspects of a display and to refer to the marked tokens individually.

That’s what this talk is principally about, but first some background.

The functions of focal attention

A central notion is that of “picking out” or selecting. The usual mechanism that is appealed to in explaining perceptual selection is attention (sometimes called focal attention or selective attention).

Why must we select anyway?

We must select because we can’t process all the information available. This is the resource-limitation reason.

○ But in what way (along what dimensions) is it limited? What happens to what is not selected? The “filter theory” has many problems.

We need to select because certain patterns cannot be computed without first marking certain special elements (e.g., in counting).

We need to select in order to track the identity of individual things (e.g., to solve the correspondence problem).

We need to select because of the way relevant information in the world is packaged. This leads to the Binding Problem (later).

What is selected?

Whatever the reason for selection, the selection must occur early in vision (in the visual module) and prior to conceptualization.

For resource-limitation reasons, selection must occur before the need for major resources.

In the case of “marking” or individuating, the empirical facts require that vision pick out and individuate without regard for the conceptual category or properties of the individuals.

In the case of property-binding, there are good reasons why selection should be based on individual things (objects).

All these reasons converge on the claim that what is selected is individuals or proto-objects.

Attention and Selection

Early research concentrated on selective attention as a filter. It assumed that we select what can be described in very low-level terms – i.e., in terms of physical “channels” or based on transducer outputs. But the filter idea was shown to be only approximate – because filters always leaked

It is important that the question of selection be placed in the context of a pre-attentive (modular, nonconceptual, cognitively-impenetrable) stage of vision – otherwise in some sense anything can be “selected” (e.g., being edible, being a genuine Rembrandt painting)

Broadbent’s Filter Theory (illustrating the resource-limited account of selection)

Broadbent, D. E. (1958). Perception and Communication. London: Pergamon Press.

[Figure: Broadbent’s filter model. Box labels: Senses; Very Short Term Store; Filter; Limited Capacity Channel; Store of conditional probabilities of past events (in LTM); Motor planner; Effectors; Rehearsal loop.]

Attention and Selection

The question of the basis for selection has been at the bottom of a lot of controversy in vision science. Some options that have been proposed include:

We select what can be described physically (i.e., by “channels”); that is, we select based on transducer outputs

○ e.g., we select by frequency, color, shape, or location

We select according to what is important to us (e.g., affordances), or according to phenomenological salience

We select what we need to treat as special (selection = “marking”) or what we need to refer to

We select aspects (properties) to which we subsequently attach concepts (this idea will be important later)

It is important that the question of selection be placed in the context of a pre-attentive (modular, nonconceptual, cognitively-impenetrable) stage of vision – otherwise in some sense anything can be “selected” (e.g., being edible, being a genuine Rembrandt painting)

What does visual attention select? (What is the basis for selection?)

The most obvious answer to what we select is places. For example, we can select places by moving our eyes so that our gaze lands on different places.

When places are selected, are they selected automatically? Must we always move our eyes to change what we attend to?

○ Studies of covert attention movement: Posner (1980).

○ How does attention switch from one place to another?

○ How does the visual system specify where to move attention to?

If we select places, are there restrictions on those places? E.g.,

○ Must those places be filled, or can they be empty places?

○ Must they be specifiable in relation to landmark objects?

Covert movement of attention

Example of an experiment using a cue-validity paradigm for showing that the locus of attention moves without eye movements and for estimating its speed. Posner, M. I. (1980). Orienting of Attention. Quarterly Journal of Experimental Psychology, 32, 3-25.

Extension of Posner’s demonstration of attention switch

Does the improved detection in intermediate locations entail that the “spotlight of attention” moves continuously through empty space?

[Figure: trial sequence showing the fixation frame, the cue, the target-cue interval, and the detection target. Target positions (*) are marked as cued, uncued, and along the path.]

But there are empirical reasons why objects are a better basis for attentional selection than location

There is experimental evidence that attention attaches to things rather than places

When attention is exogenously summoned, the appearance of analog movement of focal attention can be explained by a punctate object-based theory of attention-allocation – Sperling & Weichselgartner (1995)

Sperling & Weichselgartner (1995) “Episodic” or Quantal Theory of Attention switching

Assumes a quantal “shift” in attention in which the spotlight pointed at location -2 is extinguished and, simultaneously, the spotlight at location +2 is turned on. Because extinction and onset take a measurable amount of time, there is a brief period when the spotlights partially illuminate both locations simultaneously.
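The quantal shift described above can be sketched numerically. The following is only an illustration under stated assumptions: the function name `spotlight_weights`, the linear fade, and the 100 ms switch duration are all hypothetical choices made for the example and are not taken from Sperling & Weichselgartner. The point it shows is that two stationary spotlights, one fading out while the other fades in, are both partly "on" during the switch even though nothing moves through the space between them.

```python
def spotlight_weights(t_ms, switch_start_ms=0.0, switch_duration_ms=100.0):
    """Return (old_weight, new_weight) for two fixed spotlights at time t_ms.

    The spotlight at the old location fades out while the spotlight at the
    new location fades in; neither spotlight changes position.
    The linear ramp and the 100 ms duration are illustrative assumptions.
    """
    if switch_duration_ms <= 0:
        raise ValueError("switch_duration_ms must be positive")
    progress = (t_ms - switch_start_ms) / switch_duration_ms
    progress = min(1.0, max(0.0, progress))  # clamp to the range [0, 1]
    return 1.0 - progress, progress


if __name__ == "__main__":
    # Print the two weights at a few time points during the switch.
    for t in (0, 25, 50, 75, 100):
        old, new = spotlight_weights(t)
        print(f"t={t:3d} ms  location -2: {old:.2f}  location +2: {new:.2f}")
```

Midway through the switch both locations are partially illuminated, which is the brief overlap period the slide refers to.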

This object-based view of attentional selection is at the heart of FINST theory

I propose that there are good reasons on both experimental and conceptual grounds for supposing that attention attaches itself to objects rather than locations

In what other ways might our visual information capacity be limited?

There are obviously limitations on the input side of vision that depend on the acuity of the sensors and the range of physical properties to which they respond.

But there is a limitation beyond that of acuity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time. The capacity to individuate is different from the capacity to discriminate.

Some reasons for thinking that individuating is a distinct process

The increasingly important role played by ‘Objects’ in studies of visual attention

There is a limitation in visual information processing that is beyond the limitation of acuity and of channel capacity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time.

The capacity to individuate is different from memory capacity and discrimination capacity.

This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision which we will explore in the next lecture

First some reasons why individuating is a distinct process

Visual Indexes (aka FINSTs)

There is evidence that individuating is a special aspect of vision and the capacity to individuate is different from memory capacity and discrimination capacity.

○ This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision

In vision there appears to be a limit to how many objects (individuals) can be selected and bound to the arguments of cognitive functions at one time.

There is evidence that we can hold on to 4 objects in visual short term memory (Luck & Vogel, 1997).

There is evidence that Objects (i.e., individual things) may be the basic units of visual attention

FINST Theory (to be described later) claims that there is a mechanism for picking out and referring to (pointing to) primitive visual elements independent of any of their properties and that this mechanism is the essential bridge between nonconceptual and conceptual representation.

Pick out 3 dots and keep track of them

In a field of identical elements you can select a number of them and move your attention among them (e.g., “move one up” or “move 2 right”, etc.) so long as at no time do you have to hold on to more than 4 dots

Individuals and patterns

Vision does not recognize patterns by applying templates, since the size, shape, retinal location, orientation, and other properties must be abstracted away.

A pattern is encoded over time (and often over saccades), therefore the visual system must keep track of the individual parts and merge descriptions of the same part at different times and stages of encoding.

Individuating is a prerequisite for recognition of patterns and configural properties defined among a number of individual parts.

○ An example of how we can easily detect patterns if they are defined over a small enough number of parts is subitizing.

In order to recognize a pattern, the visual system must pick out individual parts and bind them to the representation being constructed.

○ Examples include what Ullman called “visual routines”.

Another area where the concept of an individual has become important is in cognitive development, where it is clear that babies are sensitive to the numerosity of individual things in a way that is distinct from their perceptual abilities but is limited in its capacity.

Are there collinear items (n>3)?

Several objects must be picked out at once in making relational judgments

The same is true for other relational judgments like inside or on-the-same-contour… etc. We must pick out the relevant individual objects first. Respond: Inside-same contour? On-same contour?

Signature subitizing phenomena only appear when objects are automatically individuated and indexed

Trick, L. M., & Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited capacity preattentive stage in vision. Psychological Review, 101(1), 80-102.

Encoding conjunctions of properties

Experiments showing the special difficulty that vision has in detecting conjunctions of several properties have provided a basis for understanding an important problem in visual analysis

How are conjunctions of features detected?

Read the vertical line of digits in the following display

Under these conditions Conjunction Errors are very frequent

Rapid visual search (Treisman)

Find the following simple figure in the next slide:

This case is easy – and the time is independent of how many nontargets there are – because there is only one red item. This is called a ‘popout’ search

This case is also easy – and the time is independent of how many nontargets there are – because there is only one right-leaning item. This is also a ‘popout’ search.

Rapid visual search (conjunction)

Find the following simple figure in the next slide:

Serial vs parallel search?

Finding an element that differs from all others in a scene by a single feature – called a single-feature search – is fast, error-free and almost independent of how many nontargets there are;

Finding an object that differs from all others by a conjunction of two or more features (and that shares at least one feature with each object in the scene) – called a conjunction search – is usually slow, error-prone, and is worse the more nontargets there are in the scene*.

These results suggest that in order to find a conjunction, which requires solving the binding problem, attention has to be scanned serially to all objects.

* This way of putting it simplifies things. Under certain conditions the serial-parallel distinction breaks down.
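The contrast between the two kinds of search can be made concrete with the textbook serial self-terminating model. This is a minimal sketch, not a fit to any data: the function names and the timing parameters `base_ms` and `per_item_ms` are hypothetical values chosen only for illustration, and, as the footnote above says, the clean serial-parallel split is itself a simplification.

```python
def expected_items_inspected(set_size, target_present):
    """Expected number of items checked in a serial self-terminating search."""
    if set_size < 1:
        raise ValueError("set_size must be at least 1")
    # Target present: on average the target is reached halfway through.
    # Target absent: every item must be checked before answering "no".
    return (set_size + 1) / 2 if target_present else float(set_size)


def predicted_rt_ms(set_size, target_present, search="conjunction",
                    base_ms=400.0, per_item_ms=50.0):
    """Predicted response time in ms; base_ms and per_item_ms are made up."""
    if search == "feature":
        # Popout: response time does not depend on the number of items.
        return base_ms
    if search == "conjunction":
        inspected = expected_items_inspected(set_size, target_present)
        return base_ms + per_item_ms * inspected
    raise ValueError("search must be 'feature' or 'conjunction'")


if __name__ == "__main__":
    for n in (5, 15, 25):
        print(n,
              predicted_rt_ms(n, True, "feature"),
              predicted_rt_ms(n, True, "conjunction"),
              predicted_rt_ms(n, False, "conjunction"))
```

Under these assumptions the feature-search prediction is flat in set size, while the conjunction-search prediction grows linearly, with target-absent trials growing about twice as fast as target-present trials.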

Single-Feature vs Conjunction-feature search

What is attention for? Treisman’s Attention as Glue Hypothesis

The purpose of visual attention is to bind properties together in order to recognize objects.

This is called the “binding problem” or the “many properties problem” and it is of considerable interest to philosophers as well as vision scientists.

We can recognize not only the presence of “squareness” and “redness” in our field of view, but we can also distinguish between different ways they may be conjoined.

The role of attention to location in Treisman’s Feature Integration Theory

[Figure: Feature Integration Theory diagram. Labels: original input; color maps (R, Y, G), shape maps, orientation maps; master location map; attention “beam”; conjunction detected.]

The ‘attention-as-glue’ hypothesis has a corollary: In computing conjunctions of properties, attention must be directed primarily at objects since it is objects that have the conjoined properties

Instead of being like a spotlight beam that can be scanned around a scene, and can be zoomed to cover a larger or smaller area, maybe attention can only be directed towards occupied places – i.e., to visual objects

An alternative view of how we solve the binding problem

If we assume that only properties of indexed objects are encoded and stored in Object Files, then properties that belong to the same object are stored in the same Object File, so the binding problem does not arise.

○ This is the Object-Based Attention view exemplified by FINST Theory

The assumption that only properties of indexed objects are encoded raises the problem of what happens to properties of the other (unindexed) objects or unencoded properties in a display

I will return to this conundrum later.

FINST Theory postulates a limited number of pointers in early vision that are elicited by causal events in the visual field and that enable vision to refer to things without doing so under a concept or a description
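As a rough data-structure analogy for the Object File idea, the sketch below models a small pool of indexes, each tied to one Object File. Everything in it is an assumption made for illustration: the class name `IndexPool`, the capacity of 4, and the use of an opaque token to stand in for the causal link to an object are hypothetical, and the theory itself says nothing about implementation. It shows only why binding comes for free when properties are always filed under an indexed individual.

```python
class IndexPool:
    """A limited pool of indexes, each tied to one object file (a dict).

    The token is an opaque handle standing in for a causal link to an
    object; it is not a description of the object's properties.
    """

    def __init__(self, capacity=4):
        self.capacity = capacity  # assumed limit, for illustration only
        self.object_files = {}    # token -> dict of encoded properties

    def assign(self, token):
        """Try to index an object; return False if no index is free."""
        if token in self.object_files:
            return True
        if len(self.object_files) >= self.capacity:
            return False
        self.object_files[token] = {}
        return True

    def release(self, token):
        """Free the index (and discard the object file) for this token."""
        self.object_files.pop(token, None)

    def encode(self, token, prop, value):
        """Record a property only if the object is currently indexed."""
        if token not in self.object_files:
            return False
        self.object_files[token][prop] = value
        return True


if __name__ == "__main__":
    pool = IndexPool()
    for token in ("a", "b", "c", "d"):
        pool.assign(token)
    pool.encode("a", "color", "red")
    pool.encode("a", "shape", "square")
    print(pool.object_files["a"])
    print(pool.assign("e"), pool.encode("e", "color", "green"))
```

In this sketch "red" and "square" end up in the same file because they were encoded through the same index, and nothing at all is stored for the unindexed object, which is the conundrum the slides return to later.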

Evidence for attentional selection based on Objects

Single Object Advantage: pairs of judgments are faster when both apply to the same perceived object

Entire objects acquire enhanced sensitivity from focal attention to a part of the object

Single-Object advantage occurs even with generalized “objects” defined in feature space

Simultanagnosia and hemispatial neglect show object-based effects

Attention moves with moving objects

○ IOR

○ Object Files

○ MOT

Single-object superiority even when the shapes are controlled

Attention spreads over perceived objects

Using a priming method, Egly, Driver & Rafal (1994) showed that the effect of a prime spreads to other parts of the same visual object more than to equally distant parts of different objects.

[Figure: displays with labeled locations A, B, C, and D. Panel captions: “Spreads to B and not C”; “Spreads to C and not B”.]

Objecthood endures over time

Several studies have shown that what counts as an object (as the same object) endures over time and over changes in location.

○ Certain forms of disappearances in time and changes in location preserve objecthood.

This gives what we have been calling a “visual object” a real physical-object character and partly justifies our calling it an “object”.

Inhibition of return appears to be object-based (as well as to some extent location-based)

Inhibition-of-return is thought to help in visual search since it prevents previously visited objects from being revisited

The original study used static objects. Then Tipper, Driver & Weaver (1991) showed that IOR moves with the inhibited object.

IOR appears to be object-based (it travels with the object that was attended)

Most studies showed that IOR is object-based (it travels with the object that was attended)

Some studies (Tipper, Weaver, Jerreat, & Burak, 1994) showed that attention can also be location-based, but in those cases the “location” was well marked by visible context cues – so it may be that locations such as “halfway between object X and Object Y” can be attended

Clinical studies with patients who have attentional deficits show that their deficit is object based (illustrated later)

Tracking objects not defined by distinct spatial locations and spatial trajectories

Blaser, E., Pylyshyn, Z. W., & Holcombe, A. O. (2000). Tracking an object through feature-space. Nature, 408(Nov 9), 196-199.

There is also evidence from neuropsychology that is consistent with the object-based view

Neglect

Balint and simultanagnosic patients

Visual neglect syndrome is object-based

When a right neglect patient is shown a dumbbell that rotates, the patient continues to neglect the object that had been on the right, even though it is now on the left (Behrmann & Tipper, 1999).

Simultanagnosic (Balint Syndrome) patients only attend to one object at a time

Simultanagnosic patients cannot judge the relative length of two lines, but they can tell that a figure made by connecting the ends of the lines is not a rectangle but a trapezoid (Holmes & Horax, 1919).

Balint patients can only attend to one object at a time even if they are overlapping

Luria, 1959

The End (for now)

Multiple Object Tracking

One of the clearest cases illustrating object-based attention is Multiple Object Tracking

Keeping track of individual objects in a scene requires a mechanism for individuating, selecting, accessing and tracking the identity of individuals over time.

○ These are the functions we have proposed are carried out by the mechanism of visual indexes (FINSTs)

We have been using a variety of methods for studying visual indexing, including subitizing, subset selection for search, and Multiple Object Tracking (MOT).

Multiple Object Tracking

In a typical experiment, 8 simple identical objects are presented on a screen and 4 of them are briefly distinguished in some visual manner – usually by flashing them on and off.

After these 4 “targets” have been briefly identified, all objects resume their identical appearance and move randomly. The subjects’ task is to keep track of which ones had earlier been designated as targets.

After a period of 5-10 seconds the motion stops and subjects must indicate, using a mouse, which objects were the targets.

People are very good at this task (80%-98% correct). The question is: How do they do it?

Keep track of the objects that flash

How do we do it? What properties of individual objects do we use?


Basic finding: People (even 5-year-old children) can track 4 to 5 individual objects that have no unique visual properties

How is it done?

○ Can it be done by keeping track of the only distinctive property of objects – their location?
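One way the location-updating idea in this question might be made concrete is a nearest-neighbour rule: store each target's last known position and, on every frame, adopt whichever object is now closest to it. The sketch below is an assumed toy model for illustration only; the function name `update_tracked_locations` and the coordinates are hypothetical, and it is not offered as the account the slides defend.

```python
import math


def update_tracked_locations(tracked, frame):
    """Update stored target locations from the objects in a new frame.

    tracked: list of (x, y) positions last recorded for each target.
    frame:   list of (x, y) positions of all objects in the new frame.
    Each stored location is replaced by the nearest object position.
    """
    if not frame:
        raise ValueError("frame must contain at least one object")
    updated = []
    for tx, ty in tracked:
        nearest = min(frame, key=lambda p: math.hypot(p[0] - tx, p[1] - ty))
        updated.append(nearest)
    return updated


if __name__ == "__main__":
    # Two targets each move slightly; a third object is far from both.
    print(update_tracked_locations([(0, 0), (10, 10)],
                                   [(1, 0), (9, 10), (5, 5)]))
    # A nontarget at (1, 0) is now closer than the true target at (3, 0).
    print(update_tracked_locations([(0, 0)], [(3, 0), (1, 0)]))
```

The second call shows the weakness of a purely location-based rule: when a nontarget passes closer to the stored position than the target itself, the rule switches to the wrong object.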

Explaining Multiple Object Tracking

If we are not using and updating objects’ locations, then how are we tracking them?

Our hypothesis, which is independently motivated, is that there are a small number of primitive indexes or pointers, each of which can pick out a particular individual object.

○ The index keeps providing access to the object as the object changes its properties and its location.

The object is not selected by using an encoding of any of its properties. It is picked out nonconceptually, just as the demonstrative “that” does in language.

○ Nonconceptual selection is selection without classification (without encoding the selected thing as having certain properties or as being a member of a certain category)

Nonconceptual contact with the world is essential in order to ground concepts in causal connections

A FINST is a mechanism that:

1. Picks out, and

2. Keeps track of individual distal elements, and

3. Does so directly (i.e., without mediation of concepts and without appealing to or using any encoded properties of the individuals). Therefore,

4. FINSTs pick out and track individuals as individuals rather than as bearers of certain properties

5. FINSTs do not pick out and track individuals as members of any category: The connection to the world is purely causal and nonconceptual, so there is no “seeing as” relation.

○ So the visual system (and the person) literally does not know what is being selected and tracked, even though this indexed selection allows further properties of the object in question to be encoded subsequently!

Where does this leave the binding problem?

Binding by location – advantages

○ It’s easy to see how locations might be picked out since they are physically specifiable and are a logical extension of direction of gaze

○ Location can be specified across modalities

Binding by location – disadvantages

○ Empty locations do not have causal powers

○ Empty locations do not have properties

○ Point locations do not help with the binding problem

▫ They have to be at least regions

▫ The boundaries of regions are defined by objects, so objects first have to be selected in any case

Objects as the basis for binding

Binding by individual – advantages

○ Individuals are the focus of properties – in the end we need to bind together properties of a single individual

Binding by individual – disadvantages

○ It is hard to see how a mechanism can pick out individuals without focusing on their location

○ How can individuals be tracked without detecting properties unique to that individual?

○ Philosophers from Strawson to Clark have argued that individuation requires the apparatus of concepts to provide conditions of individuation, so how can individuals be recognized and tracked by early (nonconceptual) vision?

Summary of some properties of indexing revealed by recent experiments

1. Targets can be tracked even when they disappear behind an occluder and, under certain conditions, even when all objects disappear from view (Scholl & Pylyshyn, 1999; Keane & Pylyshyn, VSS2003). Demo: MOT with occlusion

2. Properties of targets are not encoded during MOT nor are they used in tracking. Changes in target properties are not even noticed (Scholl, Pylyshyn & Franconeri, 1999; Bahrami, 2003).

3. Not all well-defined clusters of features can be tracked: Only ones that correspond to objects (Scholl, Pylyshyn & Feldman, 2001). Demo: "Rubber band" displays


4. Indexes are assigned primarily in an exogenous, automatic, involuntary and data-driven manner. They can also be assigned endogenously (voluntarily) but we believe this happens only by moving focal attention to each target serially (Annon & Pylyshyn, VSS2003).

5. Index maintenance in tracking appears to be non-predictive and non-attentive (Keane & Pylyshyn, VSS2003; Leonard & Pylyshyn, VSS2003).

6. Target-target confusions are much more numerous than target-nontarget confusions. The reason appears to be that nontargets are inhibited, which may prevent them from being swapped with targets (Pylyshyn & Leonard, VSS2003).


7. Keeping track of objects as targets is easier than keeping track of their identity (when the latter is provided at the start of the trial by a name or special location).

○ The poorer recall of object identities is surprising, given that in order to judge an object as a target one needs to trace its identity back to an object that had been visibly distinct at the start of a trial! So why is ID lost?

8. One reason is that target-target confusions are much more numerous than target-nontarget confusions. But why should this be so?

9. One reason may be that nontargets are inhibited, which may prevent them from being swapped with targets. We have shown this is so experimentally. But that leaves a serious puzzle: How can inhibition travel with objects when no indexes are available for tracking?

The beginnings of the puzzle of clustering prior to indexing, and what that might mean!

If moving objects are inhibited then inhibition moves along with the objects. How can this be unless they are being tracked? And if they are being tracked there must be at least 8 FINSTs!

This puzzle may signal the need for a kind of individuation that is weaker than the individuation we have discussed so far – a mere clustering, circumscribing, figure-ground distinction without a pointer or access mechanism – i.e. without reference!

It turns out that such a circumscribing-clustering process is needed to fulfill many different functions in early vision. It is needed whenever the correspondence problem arises – whenever visual elements need to be placed in correspondence or paired with other elements. This occurs in computing stereo, apparent motion, and other grouping situations in which the number of elements does not affect ease of pairing (or even results in faster pairing when there are more elements). Correspondence is not computed over continuous visual manifolds but only over some pre-clustered elements.

An alternative view of how to solve the Binding Problem

According to the current version of FINST theory, only properties of indexed objects are encoded (conceptualized).

○ The binding problem never arises because properties are always encoded as properties of an indexed object, and no other properties are encoded at all.

This is in conflict with strong intuitions – namely that we see much more than we conceptualize. So what do we do about the things we “see” but do not conceptualize? Some philosophers say they are represented nonconceptually.

But what is such a representation like? And what makes it a representation, as opposed to just a biological reaction?

My provisional answer is that such biological reactions (e.g., retinal activity) are not representations at all – they have no truth values and so they cannot misrepresent.

○ This is another hard issue to be deferred to later

Puzzles raised by FINST theory and MOT results

If only information about indexed objects is encoded and made available to the cognitive mind, what happens to information about other parts of the visual scene?

○ There are, after all, only about 4 or 5 indexes and surely we see a lot more of the world than 4 or 5 objects!

This raises the question about whether non-indexed objects are ‘processed’ in any sense at all, and whether they are even represented in some (presumably nonconceptual) way.

Do objects that are not indexed have any effect on the visual system at all?

○ The mystery of unattended objects

○ Functional blindness in normal vision

Austen Clark (& P. Strawson) and feature placing languages

What kind of representation does sensation allow?

Ans: Just those in feature-placing languages

“The hypothesis that this book offers is that sensation is feature-placing: a pre-linguistic system of mental representation. Mechanisms of spatio-temporal discrimination … serve to pick out or identify the subject-matter of sensory representation. That subject-matter turns out invariably to be some place-time in or around the body of the sentient organism. …the various reasons cited for thinking that sensation is intentional can also be explained on this hypothesis. The ‘aboutness’ of sensation reduces to its spatial character.” (p 165)

“…there is a sensory level of identification of place-times that is more primitive than the identification of three-dimensional material objects. Below our conceptual scheme – underneath the streets, so to speak – we find evidence of this more primitive system. The sensory identification of place-times is independent of the identification of objects; one can place features even though one lacks the latter conceptual scheme.”

Because our perceptual system can distinguish objects that differ by conjunctions of properties, early vision must not fuse together or lose the object-specificity of properties it detects. In reporting properties early vision must bind them together according to the objects that have those properties

Some philosophical morals we can draw from FINST theory

Distinguishing causes and codes

○ What causes Object Files to be created vs what is entered into them

Conceptual and nonconceptual contents

Representing and carrying information

○ The case of clusters, figure-ground, and correspondence

Can information-carrying properties (e.g., location on the proximal pattern) create clusters without representing locations of features that are clustered?

The problem is what to do about the items that were not attended but in some sense had been ‘seen’

Some considerations:

We should not equate ‘attended’ with indexed or selected or with any other information-processing function? To be attended is typically defined in terms of either the task goals (where unattended means unreported) or the perceptual experience

Forms of inattentional blindness

Non-indexed items may continue to be indexable for a short time after they physically disappear (e.g., occlusions in MOT)

The question is whether this persistence is a form of nonconceptual representation or a mere latency or inertia in the visual mechanism, and that question eventually comes back to whether we must advert to semantical notions in stating the generalizations (Lloyd Morgan’s Canon or Occam’s Razor).

Another puzzle: Punctate inhibition of moving objects?

We have recently obtained evidence that nontargets are inhibited (as measured by the rate of detection of small faint probe dots).

There appears to be no inhibition of the empty region through which the nontargets move

The inhibition is spatially local

How can a punctate moving object be inhibited unless the object is being tracked? And how can it be tracked if there are many (n > 5) of them?

But there is some sense in which moving objects must be tracked: e.g., dynamic random-dot stereograms, kinetic depth effect

Maybe Indexing is a two-stage process?

1. Individuate
2. Reference (for accessing)

Exp 1: Probe-dot detection (statistically adjusted using regression)

Recent experimental results on inhibition of nontargets
Experiment 1: 3 locations

[Figure: Probe Detection while Tracking and Nontracking. Detection % (y-axis, 40% to 100%) by Probe Location (OpenSpace, Target, NonTarget), for two conditions: While Tracking and Non-Tracking Control.]

Recent experimental results on inhibition of nontargets
Expt 2: 5 locations

[Figure: Probe Detection during tracking and nontracking. Probes Detected (%) (y-axis, 65% to 100%) by Probe Location (Space, Target, NonTarget, NearTarget, NearNonTarg), for two conditions: Nontracking (Control) and Tracking.]

Exp 2: Showing results when statistically adjusted using regression

The effect of doubling the number of nontargets

The beginnings of the puzzle of individuating prior to indexing, and what that might mean!

If moving objects are inhibited then inhibition moves along with the objects. How can this be unless they are being tracked? And if they are being tracked there must be at least 8 FINSTs!

This puzzle may signal the need for a kind of individuation that is weaker than the individuation we have discussed so far – a mere clustering, circumscribing, figure-ground distinction without a pointer or access mechanism – i.e. without reference!

It turns out that such a circumscribing-clustering process is needed to fulfill many different functions in early vision. It is needed whenever the correspondence problem arises – whenever visual elements need to be placed in correspondence or paired with other elements. This occurs in stereo, apparent motion, and other situations in which increasing the number of elements does not increase the difficulty of computing correspondences. Correspondence is not computed over continuous visual manifolds but only over some pre-clustered elements.

Example of the correspondence problem for apparent motion

The grey disks correspond to the first flash and the black ones to the second flash. Which of the 24 possible matches will the visual system select as the solution to this correspondence problem? What principle does it use?

[Figure panels: Curved matches; Linear matches]

Here is how it actually looks

Why does the apparent motion take the form it does?

The principle appears to be one of minimizing the vector difference between each possible correspondence pair and that of its nearest neighbors (Dawson & Pylyshyn, 1988)

This principle arises from (is justified by) the natural constraints of rigidity and opacity:

In our kind of world most image features arise from distal elements on the surface of opaque rigid objects, i.e., the vast majority of perceived distal elements are on the visible surface of opaque rigid objects

Therefore each distal element is likely to move the same amount and in the same direction as elements near to it (since they are likely to be on the same surface)
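
The minimization principle stated above can be illustrated with a toy brute-force sketch. This is not the Dawson & Pylyshyn (1988) model itself. It assumes that every first-flash element is matched to exactly one second-flash element, that "nearest neighbor" means nearest in the first flash, and that the cost of a matching is the summed length of the difference between each element's motion vector and its neighbor's. The function names and the sample coordinates are hypothetical.

```python
from itertools import permutations
from math import dist, hypot


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    return min(others, key=lambda j: dist(points[i], points[j]))


def matching_cost(first, second, match):
    """Sum over elements of |v_i - v_n|, where v_i is element i's motion
    vector under this matching and n is i's nearest neighbor in the
    first flash (an assumed reading of the principle)."""
    vectors = [(second[m][0] - p[0], second[m][1] - p[1])
               for p, m in zip(first, match)]
    cost = 0.0
    for i, v in enumerate(vectors):
        n = vectors[nearest_neighbor(i, first)]
        cost += hypot(v[0] - n[0], v[1] - n[1])
    return cost


def best_matching(first, second):
    """Try every one-to-one matching and return the cheapest one,
    together with the number of candidates examined."""
    candidates = list(permutations(range(len(second))))
    best = min(candidates, key=lambda m: matching_cost(first, second, m))
    return best, len(candidates)


# Hypothetical display: four elements that all shift by the same amount.
first = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
second = [(x + 0.5, y + 0.25) for x, y in first]
match, n_candidates = best_matching(first, second)
```

With four elements per flash there are 4! = 24 one-to-one matchings, which is the count mentioned for the disk display. In this rigid-translation example the matching in which all elements move together has zero cost, so it is the one selected.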

Views of a dome

Structure from Motion Demo

Cylinder Kinetic Depth Effect

The correspondence problem for biological motion

Reprise … what are FINSTs?

They are a primitive reference mechanism that refers to individual objects in the world (FINGs?)

Objects are picked out and referred to without using any encoding of their properties, including their location. Picking out objects is prior to encoding their locations!

Indexing is nonconceptual because it does not represent an individual as a member of some conceptual category – not even as being in the category individual or object!

FINSTs serve as visual demonstratives, much like the terms this or that do in language, by picking out and referring to individuals without using their properties.

The central function of FINST indexes is to bind arguments of visual predicates or of motor commands to things in the world to which they must refer. Only predicates with bound arguments can be evaluated.
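
The binding idea can be pictured with a loose programming analogy, offered only as a sketch. Everything here is hypothetical: the `IndexStore` class, its methods, and the `collinear` predicate are illustrative names, and the object ids are an artifact of simulating a scene in code rather than a claim about how an index refers. The point shown is that an index holds a bare reference, that a predicate with an unbound argument cannot be evaluated, and that properties are read through the reference only when the predicate is evaluated.

```python
class IndexStore:
    """Hypothetical store of FINST-like indexes (illustrative only)."""

    def __init__(self, world):
        self.world = world   # object id -> current (x, y); stands in for the scene
        self.bound = {}      # index name -> object id (a reference, no properties)

    def bind(self, index, obj_id):
        self.bound[index] = obj_id

    def evaluate(self, predicate, *indexes):
        # A predicate with any unbound argument cannot be evaluated.
        if any(i not in self.bound for i in indexes):
            return None
        # Properties are looked up through the reference at evaluation time.
        return predicate(*(self.world[self.bound[i]] for i in indexes))


def collinear(a, b, c):
    """True if three (x, y) points lie on one line (zero cross product)."""
    return (b[0] - a[0]) * (c[1] - a[1]) == (b[1] - a[1]) * (c[0] - a[0])


world = {"obj1": (0, 0), "obj2": (1, 1), "obj3": (2, 2)}
store = IndexStore(world)
store.bind("x", "obj1")
store.bind("y", "obj2")
unbound_result = store.evaluate(collinear, "x", "y", "z")  # z is not bound yet
store.bind("z", "obj3")
bound_result = store.evaluate(collinear, "x", "y", "z")
world["obj3"] = (2, 5)                                     # the object moves
moved_result = store.evaluate(collinear, "x", "y", "z")
```

Because the index stores no location, the same binding keeps picking out the same individual after it moves, and the predicate is simply re-evaluated against its current properties.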

Schema for how FINSTs function in visual-motor control
