What is focal attention for?
The What and Why of perceptual selection
The central function of focal attention is to select.
○ We must select because our capacity to process information is limited.
○ We must select because we need to be able to mark certain aspects of a display and to refer to the marked tokens individually.
That’s what this talk is principally about, but first some background.
The functions of focal attention
A central notion is that of “picking out” or selecting. The usual mechanism appealed to in explaining perceptual selection is attention (sometimes called focal attention or selective attention).
Why must we select anyway?
We must select because we can’t process all the information available. This is the resource-limitation reason.
○ But in what way (along what dimensions) is it limited? What happens to what is not selected? The “filter theory” has many problems.
We need to select because certain patterns cannot be computed without first marking certain special elements (e.g., in counting).
We need to select in order to track the identity of individual things (e.g., to solve the correspondence problem).
We need to select because of the way relevant information in the world is packaged. This leads to the Binding Problem (later).
What is selected?
Whatever the reason for selection, the selection must occur early in vision (in the visual module) and prior to conceptualization.
For resource-limitation reasons, selection must occur before the need for major resources.
In the case of “marking” or individuating, the empirical facts require that vision pick out and individuate without regard for the conceptual category or properties of the individuals.
In the case of property-binding, there are good reasons why selection should be based on individual things (objects).
All these reasons converge on the claim that what is selected is individuals or proto-objects.
Attention and Selection
Early research concentrated on selective attention as a filter. It assumed that we select what can be described in very low-level terms – i.e., in terms of physical “channels” or based on transducer outputs. But the filter idea was shown to be only approximate, because filters always leaked.
It is important that the question of selection be placed in the context of a pre-attentive (modular, nonconceptual, cognitively-impenetrable) stage of vision – otherwise in some sense anything can be “selected” (e.g., being edible, being a genuine Rembrandt painting)
Broadbent’s Filter Theory (illustrating the resource-limited account of selection)
Broadbent, D. E. (1958). Perception and Communication. London: Pergamon Press.
[Figure: Broadbent’s filter model. Box labels: Senses; Very Short Term Store; Filter; Limited Capacity Channel; Rehearsal loop; Store of conditional probabilities of past events (in LTM); Motor planner; Effectors]
Attention and Selection
The question of the basis for selection has been at the bottom of a lot of controversy in vision science. Some options that have been proposed include:
We select what can be described physically (i.e., by “channels”) – i.e., we select based on transducer outputs: e.g., we select by frequency, color, shape, or location.
We select according to what is important to us (e.g., affordances), or according to phenomenological salience.
We select what we need to treat as special (selection = “marking”) or what we need to refer to.
We select aspects (properties) to which we subsequently attach concepts (this idea will be important later).
What does visual attention select? (What is the basis for selection?)
The most obvious answer to what we select is places. For example, we can select places by moving our eyes so our gaze lands on different places.
Must we always move our eyes to change what we attend to?
○ Studies of Covert Attention-Movement: Posner (1980).
○ How does attention switch from one place to another?
▫ When places are selected, are they selected automatically?
○ How does the visual system specify where to move attention to?
If we select places, are there restrictions on those places? E.g.,
○ Must those places be filled or can they be empty places?
○ Must they be specifiable in relation to landmark objects?
Covert movement of attention
Example of an experiment using a cue-validity paradigm for showing that the locus of attention moves without eye movements and for estimating its speed. Posner, M. I. (1980). Orienting of Attention. Quarterly Journal of Experimental Psychology, 32, 3-25.
Extension of Posner’s demonstration of attention switch
Does the improved detection in intermediate locations entail that the “spotlight of attention” moves continuously through empty space?
[Figure: cue-validity display sequence. Labels: Fixation frame; Cue; Target-cue interval; Detection target; conditions Cued, Uncued, and Along the path]
But there are empirical reasons why objects are a better basis for attentional selection than location
There is experimental evidence that attention attaches to things rather than places.
When attention is exogenously summoned, the appearance of analog movement of focal attention can be explained by a punctate object-based theory of attention-allocation – Sperling & Weichselgartner (1995).
Sperling & Weichselgartner (1995) “Episodic” or Quantal Theory of Attention switching
Assumes a quantal “shift” in attention in which the spotlight pointed at location -2 is extinguished and, simultaneously, the spotlight at location +2 is turned on. Because extinction and onset take a measurable amount of time, there is a brief period when the spotlights partially illuminate both locations simultaneously.
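The overlap between the two attention episodes can be made concrete with a small sketch. The linear ramp and the 100 ms transition time are illustrative assumptions, not parameters taken from Sperling & Weichselgartner. The only point is that during the switch attention is divided between exactly two locations, and nothing is ever assigned to the space between them.

```python
def episode_weights(t_ms, switch_start_ms=0.0, transition_ms=100.0):
    """Return (old_weight, new_weight) at time t_ms.

    Assumption: the old episode fades out and the new one fades in
    along a linear ramp lasting transition_ms (a hypothetical value).
    """
    if transition_ms <= 0:
        raise ValueError("transition_ms must be positive")
    progress = (t_ms - switch_start_ms) / transition_ms
    progress = min(1.0, max(0.0, progress))  # clamp to [0, 1]
    return (1.0 - progress, progress)


def attention_at(location, t_ms, old_location=-2, new_location=2):
    """Attention weight at a location; intermediate locations get none."""
    old_weight, new_weight = episode_weights(t_ms)
    if location == old_location:
        return old_weight
    if location == new_location:
        return new_weight
    return 0.0
```

Halfway through the transition both end locations are partially attended while location 0 receives nothing, which is how a quantal switch can mimic the appearance of a moving spotlight.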
This object-based view of attentional selection is at the heart of FINST theory
I propose that there are good reasons on both experimental and conceptual grounds for supposing that attention attaches itself to objects rather than locations.
In what other ways might our visual information capacity be limited?
There are obviously limitations on the input side of vision that depend on the acuity of the sensors and the range of physical properties to which they respond.
But there is a limitation beyond that of acuity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time. The capacity to individuate is different from the capacity to discriminate.
There are some reasons for thinking that individuating is a distinct process.
The increasingly important role played by ‘Objects’ in studies of visual attention
There is a limitation in visual information processing that is beyond the limitation of acuity and of channel capacity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time.
The capacity to individuate is different from memory capacity and discrimination capacity.
This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision which we will explore in the next lecture
First some reasons why individuating is a distinct process
Visual Indexes (aka FINSTs)
There is evidence that individuating is a special aspect of vision and the capacity to individuate is different from memory capacity and discrimination capacity.
This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision.
In vision there appears to be a limit to how many objects (individuals) can be selected and bound to the arguments of cognitive functions at one time.
There is evidence that we can hold on to 4 objects in visual short term memory (Luck & Vogel, 1997).
There is evidence that objects (i.e., individual things) may be the basic units of visual attention.
FINST Theory (to be described later) claims that there is a mechanism for picking out and referring to (pointing to) primitive visual elements independent of any of their properties and that this mechanism is the essential bridge between nonconceptual and conceptual representation.
Pick out 3 dots and keep track of them
In a field of identical elements you can select a number of them and move your attention among them (e.g., “move one up” or “move 2 right”, etc.) so long as at no time do you have to hold on to more than 4 dots.
Individuals and patterns
Vision does not recognize patterns by applying templates, since the size, shape, retinal location, orientation, and other properties must be abstracted away.
A pattern is encoded over time (and often over saccades), therefore the visual system must keep track of the individual parts and merge descriptions of the same part at different times and stages of encoding.
Individuating is a prerequisite for recognition of patterns and configural properties defined among a number of individual parts.
An example of how we can easily detect patterns if they are defined over a small enough number of parts is subitizing.
In order to recognize a pattern, the visual system must pick out individual parts and bind them to the representation being constructed.
○ Examples include what Ullman called “visual routines”.
Another area where the concept of an individual has become important is in cognitive development, where it is clear that babies are sensitive to the numerosity of individual things in a way that is distinct from their perceptual abilities but is limited in its capacity.
Are there collinear items (n>3)?
Several objects must be picked out at once in making relational judgments
The same is true for other relational judgments like inside or on-the-same-contour, etc. We must pick out the relevant individual objects first. Respond: Inside-same contour? On-same contour?
Signature subitizing phenomena only appear when objects are automatically individuated and indexed
Trick, L. M., & Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited capacity preattentive stage in vision. Psychological Review, 101(1), 80-102.
Encoding conjunctions of properties
Experiments showing the special difficulty that vision has in detecting conjunctions of several properties have provided a basis for understanding an important problem in visual analysis.
How are conjunctions of features detected?
Read the vertical line of digits in the following display
Under these conditions Conjunction Errors are very frequent
Rapid visual search (Treisman)
Find the following simple figure in the next slide:
This case is easy – and the time is independent of how many nontargets there are – because there is only one red item. This is called a ‘popout’ search
This case is also easy – and the time is independent of how many nontargets there are – because there is only one right-leaning item. This is also a ‘popout’ search.
Rapid visual search (conjunction)
Find the following simple figure in the next slide:
Serial vs parallel search?
Finding an element that differs from all others in a scene by a single feature – called a single-feature search – is fast, error-free and almost independent of how many nontargets there are;
Finding an object that differs from all others by a conjunction of two or more features (and that shares at least one feature with each object in the scene) – called a conjunction search – is usually slow, error-prone, and is worse the more nontargets there are in the scene*.
These results suggest that in order to find a conjunction, which requires solving the binding problem, attention has to be scanned serially to all objects.
* This way of putting it simplifies things. Under certain conditions the serial-parallel distinction breaks down.
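The contrast between the two search types can be sketched as a toy response-time model. It assumes an idealized self-terminating serial scan for conjunction search and a flat function for feature search. The base time and per-item cost are hypothetical values chosen for illustration, and, as the footnote says, the strict serial/parallel dichotomy is itself a simplification.

```python
def expected_inspections(set_size, target_present=True):
    """Expected number of items inspected by a self-terminating serial scan.

    With the target equally likely at any scan position, the mean is
    (n + 1) / 2; with no target, all n items must be checked.
    """
    if set_size < 1:
        raise ValueError("set_size must be at least 1")
    return (set_size + 1) / 2 if target_present else float(set_size)


def predicted_rt_ms(set_size, search_type, target_present=True,
                    base_ms=400.0, per_item_ms=50.0):
    """Toy RT prediction; base_ms and per_item_ms are assumed values."""
    if search_type == "feature":
        # Popout: time does not depend on the number of nontargets.
        return base_ms
    if search_type == "conjunction":
        return base_ms + per_item_ms * expected_inspections(set_size, target_present)
    raise ValueError("search_type must be 'feature' or 'conjunction'")
```

Under these assumptions the target-absent slope is twice the target-present slope, which is the classic signature taken to indicate serial scanning.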
Single-Feature vs Conjunction-feature search
What is attention for? Treisman’s Attention as Glue Hypothesis
The purpose of visual attention is to bind properties together in order to recognize objects.
This is called the “binding problem” or the “many properties problem” and it is of considerable interest to philosophers as well as vision scientists.
We can recognize not only the presence of “squareness” and “redness” in our field of view, but we can also distinguish between different ways they may be conjoined.
The role of attention to location in Treisman’s Feature Integration Theory
[Figure: Feature Integration Theory. Labels: Original Input; Color maps (R, Y, G); Shape maps; Orientation maps; Master location map; Attention “beam”; Conjunction detected]
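The architecture in the figure can be sketched as separate feature maps that are linked only through locations in a master map. This is a minimal illustration, not Treisman's implementation: the dictionary representation, the function names, and the example display are all assumptions. Attending to a location reads out whatever each map registers there, and a conjunction is found only by visiting locations one at a time.

```python
def attend(location, feature_maps):
    """Read out every feature registered at one master-map location."""
    return {name: fmap[location]
            for name, fmap in feature_maps.items()
            if location in fmap}


def find_conjunction(feature_maps, master_locations, wanted):
    """Scan locations serially; return (location, number_inspected).

    Returns (None, number_inspected) if no location has all wanted features.
    """
    for inspected, location in enumerate(master_locations, start=1):
        bound = attend(location, feature_maps)
        if all(bound.get(name) == value for name, value in wanted.items()):
            return location, inspected
    return None, len(master_locations)


# Hypothetical three-item display.
feature_maps = {
    "color": {(0, 0): "red", (1, 0): "green", (2, 0): "red"},
    "orientation": {(0, 0): "vertical", (1, 0): "tilted", (2, 0): "tilted"},
}
master_locations = [(0, 0), (1, 0), (2, 0)]
```

Here "red" and "tilted" are each present in their own maps at several places, but only the attention beam at (2, 0) establishes that they belong together.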
The ‘attention-as-glue’ hypothesis has a corollary: In computing conjunctions of properties, attention must be directed primarily at objects since it is objects that have the conjoined properties
Instead of being like a spotlight beam that can be scanned around a scene, and can be zoomed to cover a larger or smaller area, maybe attention can only be directed towards occupied places – i.e., to visual objects
An alternative view of how we solve the binding problem
If we assume that only properties of indexed objects are encoded and stored in Object Files, then properties that belong to the same object are stored in the same Object File, so the binding problem does not arise.
This is the Object-Based Attention view exemplified by FINST Theory.
The assumption that only properties of indexed objects are encoded raises the problem of what happens to properties of the other (unindexed) objects or unencoded properties in a display
I will return to this conundrum later.
FINST Theory postulates a limited number of pointers in early vision that are elicited by causal events in the visual field and that enable vision to refer to things without doing so under a concept or a description
Evidence for attentional selection based on Objects
Single Object Advantage: pairs of judgments are faster when both apply to the same perceived object
Entire objects acquire enhanced sensitivity from focal attention to a part of the object
Single-Object advantage occurs even with generalized “objects” defined in feature space
Simultanagnosia and hemispatial neglect show object-based effects
Attention moves with moving objects: IOR, Object Files, MOT
Single-object superiority even when the shapes are controlled
Attention spreads over perceived objects
Using a priming method, Egly, Driver & Rafal (1994) showed that the effect of a prime spreads to other parts of the same visual object more than to equally distant parts of different objects.
[Figure: displays with locations A, B, C, D. Panel labels: Spreads to B and not C (two panels); Spreads to C and not B (two panels)]
Objecthood endures over time
Several studies have shown that what counts as an object (as the same object) endures over time and over changes in location.
Certain forms of disappearances in time and changes in location preserve objecthood.
This gives what we have been calling a “visual object” a real physical-object character and partly justifies our calling it an “object”.
Inhibition of return appears to be object-based (as well as to some extent location-based)
Inhibition-of-return is thought to help in visual search since it prevents previously visited objects from being revisited.
The original study used static objects. Then Tipper, Driver & Weaver (1991) showed that IOR moves with the inhibited object.
IOR appears to be object-based (it travels with the object that was attended)
Most studies showed that IOR is object-based (it travels with the object that was attended).
Some studies (Tipper, Weaver, Jerreat, & Burak, 1994) showed that attention can also be location-based, but in those cases the “location” was well marked by visible context cues, so it may be that locations such as “halfway between object X and object Y” can be attended.
Clinical studies with patients who have attentional deficits show that their deficit is object-based (illustrated later).
Tracking objects not defined by distinct spatial locations and spatial trajectories
Blaser, E., Pylyshyn, Z. W., & Holcombe, A. O. (2000). Tracking an object through feature-space. Nature, 408(Nov 9), 196-199.
There is also evidence from neuropsychology that is consistent with the object-based view
○ Neglect
○ Balint and simultanagnosic patients
Visual neglect syndrome is object-based
When a right neglect patient is shown a dumbbell that rotates, the patient continues to neglect the object that had been on the right, even though it is now on the left (Behrmann & Tipper, 1999).
Simultanagnosic (Balint Syndrome) patients only attend to one object at a time
Simultanagnosic patients cannot judge the relative length of two lines, but they can tell that a figure made by connecting the ends of the lines is not a rectangle but a trapezoid (Holmes & Horrax, 1919).
Balint patients can only attend to one object at a time even if they are overlapping
Luria, 1959
The End (for now)
Multiple Object Tracking
One of the clearest cases illustrating object-based attention is Multiple Object Tracking.
Keeping track of individual objects in a scene requires a mechanism for individuating, selecting, accessing and tracking the identity of individuals over time.
○ These are the functions we have proposed are carried out by the mechanism of visual indexes (FINSTs).
We have been using a variety of methods for studying visual indexing, including subitizing, subset selection for search, and Multiple Object Tracking (MOT).
Multiple Object Tracking
In a typical experiment, 8 simple identical objects are presented on a screen and 4 of them are briefly distinguished in some visual manner – usually by flashing them on and off.
After these 4 “targets” have been briefly identified, all objects resume their identical appearance and move randomly. The subjects’ task is to keep track of which ones had earlier been designated as targets.
After a period of 5-10 seconds the motion stops and subjects must indicate, using a mouse, which objects were the targets.
People are very good at this task (80%-98% correct). The question is: How do they do it?
Keep track of the objects that flash
How do we do it? What properties of individual objects do we use?
Basic finding: People (even 5 year old children) can track 4 to 5 individual objects that have no unique visual properties.
How is it done? Can it be done by keeping track of the only distinctive property of objects – their location?
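To see what a purely location-based account would involve, here is a minimal sketch of a tracker that stores only each target's last known position and, on every frame, re-identifies the target as whichever object is now nearest to that position. It illustrates the hypothesis being questioned, not the mechanism the talk proposes. The function names and the two example displays are assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math


def nearest_object(point, positions):
    """Index of the object whose position is closest to point."""
    return min(range(len(positions)),
               key=lambda i: math.dist(point, positions[i]))


def track_by_location(frames, target_ids):
    """Follow targets using stored locations only.

    frames: list of frames, each a list of (x, y) positions, one per object.
    target_ids: object indices designated as targets in the first frame.
    Returns the object indices the tracker ends up holding.
    """
    remembered = [frames[0][i] for i in target_ids]
    tracked = list(target_ids)
    for positions in frames[1:]:
        tracked = [nearest_object(p, positions) for p in remembered]
        remembered = [positions[i] for i in tracked]
    return tracked


# Hypothetical two-object displays, tracking object 0.
slow = [[(0.0, 0.0), (10.0, 0.0)], [(1.0, 0.0), (9.0, 0.0)]]
crossing = [[(0.0, 0.0), (12.0, 0.0)], [(10.0, 0.0), (1.0, 0.0)]]
```

With slow motion the tracker holds on to object 0, but in the crossing display it ends up on object 1, because a nontarget has moved closer to the stored location than the target itself. A location-updating scheme is only as good as its sampling rate relative to object speed and spacing.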
Explaining Multiple Object Tracking
If we are not using and updating objects’ locations, then how are we tracking them?
Our hypothesis, which is independently motivated, is that there are a small number of primitive indexes or pointers, each of which can pick out a particular individual object.
The index keeps providing access to the object as the object changes its properties and its location.
The object is not selected by using an encoding of any of its properties. It is picked out nonconceptually, just as the demonstrative “that” does in language.
Nonconceptual selection is selection without classification (without encoding the selected thing as having certain properties or as being a member of a certain category).
Nonconceptual contact with the world is essential in order to ground concepts in causal connections.
A FINST is a mechanism that:
1. Picks out, and
2. Keeps track of individual distal elements, and
3. Does so directly (i.e., without mediation of concepts and without appealing to or using any encoded properties of the individuals). Therefore,
4. FINSTs pick out and track individuals as individuals rather than as bearers of certain properties.
5. FINSTs do not pick out and track individuals as members of any category: The connection to the world is purely causal and nonconceptual, so there is no “seeing as” relation.
So the visual system (and the person) literally does not know what is being selected and tracked, even though this indexed selection allows further properties of the object in question to be encoded subsequently!
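A loose programming analogy for the list above is a small pool of pointers. Binding a pointer to an object consults none of the object's properties, the pointer keeps referring to the same object while its properties change, and properties are read into an object file only after the object has been indexed. The class names, the capacity of 4, and the use of Python object references to stand in for a causal connection are all assumptions of this sketch, not part of FINST theory itself.

```python
class Dot:
    """A hypothetical display element with changeable properties."""

    def __init__(self, x, y, color):
        self.x, self.y, self.color = x, y, color


class IndexPool:
    """A limited pool of pointers (assumed capacity: 4)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self._bound = {}        # index number -> object reference
        self.object_files = {}  # index number -> encoded properties

    def grab(self, obj):
        """Bind a free index to obj without inspecting obj; None if none is free."""
        for index in range(self.capacity):
            if index not in self._bound:
                self._bound[index] = obj
                self.object_files[index] = {}
                return index
        return None

    def deref(self, index):
        """Access the indexed object itself, whatever its current properties."""
        return self._bound[index]

    def encode(self, index, property_name):
        """Read a property off the indexed object into its object file."""
        value = getattr(self.deref(index), property_name)
        self.object_files[index][property_name] = value
        return value

    def release(self, index):
        """Free an index and discard its object file."""
        self._bound.pop(index, None)
        self.object_files.pop(index, None)
```

Because each object file hangs off a single index, every property encoded through that index is automatically a property of one and the same individual.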
Where does this leave the binding problem?
Binding by location – advantages
○ It’s easy to see how locations might be picked out since they are physically specifiable and are a logical extension of direction of gaze
○ Location can be specified across modalities
Binding by location – disadvantages
○ Empty locations do not have causal powers
○ Empty locations do not have properties
○ Point locations do not help with the binding problem
▫ They have to be at least regions
▫ The boundaries of regions are defined by objects, so objects first have to be selected in any case
Objects as the basis for binding
Binding by individual – advantages
○ Individuals are the focus of properties – in the end we need to bind together properties of a single individual
Binding by individual – disadvantages
○ It is hard to see how a mechanism can pick out individuals without focusing on their location
○ How can individuals be tracked without detecting properties unique to that individual?
○ Philosophers from Strawson to Clark have argued that individuation requires the apparatus of concepts to provide conditions of individuation, so how can individuals be recognized and tracked by early (nonconceptual) vision?
Summary of some properties of indexing revealed by recent experiments
1. Targets can be tracked even when they disappear behind an occluder and, under certain conditions, even when all objects disappear from view (Scholl & Pylyshyn, 1999; Keane & Pylyshyn, VSS2003). Demo: MOT with occlusion
2. Properties of targets are not encoded during MOT nor are they used in tracking. Changes in target properties are not even noticed (Scholl, Pylyshyn & Franconeri, 1999; Bahrami, 2003).
3. Not all well-defined clusters of features can be tracked: Only ones that correspond to objects (Scholl, Pylyshyn & Feldman, 2001). Demo: "Rubber band" displays
4. Indexes are assigned primarily in an exogenous, automatic, involuntary and data-driven manner. They can also be assigned endogenously (voluntarily) but we believe this happens only by moving focal attention to each target serially (Annan & Pylyshyn, VSS2003).
5. Index maintenance in tracking appears to be non-predictive and non-attentive (Keane & Pylyshyn, VSS2003; Leonard & Pylyshyn, VSS2003).
6. Target-target confusions are much more numerous than target-nontarget confusions. The reason appears to be that nontargets are inhibited, which may prevent them from being swapped with targets (Pylyshyn & Leonard, VSS2003).
7. Keeping track of objects as targets is easier than keeping track of their identity (when the latter is provided at the start of the trial by a name or special location).
○ The poorer recall of object identities is surprising, given that in order to judge an object as a target one needs to trace its identity back to an object that had been visibly distinct at the start of a trial! So why is ID lost?
8. One reason is that target-target confusions are much more numerous than target-nontarget confusions. But why should this be so?
9. One reason may be that nontargets are inhibited, which may prevent them from being swapped with targets. We have shown this is so experimentally. But that leaves a serious puzzle: How can inhibition travel with objects when no indexes are available for tracking?
The beginnings of the puzzle of clustering prior to indexing, and what that might mean!
If moving objects are inhibited then inhibition moves along with the objects. How can this be unless they are being tracked? And if they are being tracked there must be at least 8 FINSTs!
This puzzle may signal the need for a kind of individuation that is weaker than the individuation we have discussed so far – a mere clustering, circumscribing, figure-ground distinction without a pointer or access mechanism – i.e. without reference!
It turns out that such a circumscribing-clustering process is needed to fulfill many different functions in early vision. It is needed whenever the correspondence problem arises – whenever visual elements need to be placed in correspondence or paired with other elements. This occurs in computing stereo, apparent motion, and other grouping situations in which the number of elements does not affect ease of pairing (or even results in faster pairing when there are more elements). Correspondence is not computed over continuous visual manifolds but only over some pre-clustered elements.
An alternative view of how to solve the Binding Problem
According to the current version of FINST theory, only properties of indexed objects are encoded (conceptualized).
○ The binding problem never arises because properties are always encoded as properties of an indexed object, and no other properties are encoded at all.
This is in conflict with strong intuitions – namely that we see much more than we conceptualize. So what do we do about the things we “see” but do not conceptualize? Some philosophers say they are represented nonconceptually.
But what is such a representation like? And what makes it a representation, as opposed to just a biological reaction?
My provisional answer is that such biological reactions (e.g., retinal activity) are not representations at all – they have no truth values and so they cannot misrepresent.
○ This is another hard issue to be deferred to later.
Puzzles raised by FINST theory and MOT results
If only information about indexed objects is encoded and made available to the cognitive mind, what happens to information about other parts of the visual scene?
○ There are, after all, only about 4 or 5 indexes and surely we see a lot more of the world than 4 or 5 objects!
This raises the question about whether non-indexed objects are ‘processed’ in any sense at all, and whether they are even represented in some (presumably nonconceptual) way.
Do objects that are not indexed have any effect on the visual system at all?
○ The mystery of unattended objects
○ Functional blindness in normal vision
Austen Clark (& P. Strawson) and feature placing languages
What kind of representation does sensation allow?
Ans: Just those in feature-placing languages
“The hypothesis that this book offers is that sensation is feature-placing: a pre-linguistic system of mental representation. Mechanisms of spatio-temporal discrimination … serve to pick out or identify the subject-matter of sensory representation. That subject-matter turns out invariably to be some place-time in or around the body of the sentient organism. …the various reasons cited for thinking that sensation is intentional can also be explained on this hypothesis. The ‘aboutness’ of sensation reduces to its spatial character.” (p 165)
“…there is a sensory level of identification of place-times that is more primitive than the identification of three-dimensional material objects. Below our conceptual scheme – underneath the streets, so to speak – we find evidence of this more primitive system. The sensory identification of place-times is independent of the identification of objects; one can place features even though one lacks the latter conceptual scheme.”
Because our perceptual system can distinguish objects that differ by conjunctions of properties, early vision must not fuse together or lose the object-specificity of properties it detects. In reporting properties early vision must bind them together according to the objects that have those properties.
Some philosophical morals we can draw from FINST theory
Distinguishing causes and codes
○ What causes Object Files to be created vs what is entered into them
Conceptual and nonconceptual contents
Representing and carrying information
○ The case of clusters, figure-ground, and correspondence
Can information-carrying properties (e.g., location on the proximal pattern) create clusters without representing locations of features that are clustered?
The problem is what to do about the items that were not attended but in some sense had been ‘seen’
Some considerations:
We should not equate ‘attended’ with indexed or selected or with any other information-processing function. To be attended is typically defined in terms of either the task goals (where unattended means unreported) or the perceptual experience
Forms of inattentional blindness
Non-indexed items may continue to be indexable for a short time after they physically disappear (e.g., occlusions in MOT)
The question is whether this persistence is a form of nonconceptual representation or a mere latency or inertia in the visual mechanism, and that question eventually comes back to whether we must advert to semantical notions in stating the generalizations (Lloyd Morgan’s Canon or Occam’s Razor).
Another puzzle: Punctate inhibition of moving objects?
We have recently obtained evidence that nontargets are inhibited (as measured by the rate of detection of small faint probe dots).
There appears to be no inhibition of the empty region through which the nontargets move
The inhibition is spatially local
How can a punctate moving object be inhibited unless the object is being tracked? And how can it be tracked if there are many (n > 5) of them?
But there is some sense in which moving objects must be tracked: e.g., dynamic random-dot stereograms, kinetic depth effect
Maybe Indexing is a two-stage process?
1. Individuate
2. Reference (for accessing)
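The two-stage idea can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model from the talk: the function names `individuate` and `assign_indexes`, the proximity-based clustering rule, and the capacity of 4 are all hypothetical. The point it shows is that stage 1 can circumscribe every item while stage 2 gives access to only a few of them.

```python
import math

def individuate(points, radius):
    """Stage 1 (assumed rule): group feature points by proximity.
    Any point within `radius` of a cluster member joins that cluster.
    No limit is placed on how many clusters can exist."""
    clusters = []
    for p in points:
        touching = [c for c in clusters
                    if any(math.dist(p, q) <= radius for q in c)]
        merged = [p]
        for c in touching:
            merged.extend(c)
        # Drop the clusters that were absorbed into `merged`.
        clusters = [c for c in clusters
                    if all(c is not t for t in touching)]
        clusters.append(merged)
    return clusters

def assign_indexes(clusters, wanted, capacity=4):
    """Stage 2: attach a reference to at most `capacity` clusters.
    `capacity=4` is an assumption made for this sketch."""
    return {f"index{n}": clusters[i]
            for n, i in enumerate(wanted[:capacity])}

# Hypothetical display: eight well-separated single-point items.
items = [(10 * k, 0) for k in range(8)]
clusters = individuate(items, radius=2)
indexes = assign_indexes(clusters, wanted=list(range(8)))
```

In this toy display all eight items are individuated, but only four receive an index, which is the distinction the two-stage proposal draws.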
Exp 1: Probe-dot detection (statistically adjusted using regression)
Recent experimental results on Inhibition of nontargets
Experiment 1: 3 locations
[Figure: Probe Detection while Tracking and Nontracking. x-axis: Probe Location (OpenSpace, Target, NonTarget); y-axis: Detection % (40% to 100%); series: While Tracking, Non-Tracking Control]
Recent experimental results on Inhibition of nontargets
Expt 2: 5 locations
[Figure: Probe Detection during tracking and nontracking. x-axis: Probe Location (Space, Target, NonTarget, NearTarget, NearNonTarg); y-axis: Probes Detected (%) (65% to 100%); series: Nontracking (Control), Tracking]
Exp 2: Showing results when statistically adjusted using regression
The effect of doubling the number of nontargets
The beginnings of the puzzle of individuating prior to indexing, and what that might mean!
If moving objects are inhibited then inhibition moves along with the objects. How can this be unless they are being tracked? And if they are being tracked there must be at least 8 FINSTs!
This puzzle may signal the need for a kind of individuation that is weaker than the individuation we have discussed so far – a mere clustering, circumscribing, figure-ground distinction without a pointer or access mechanism – i.e. without reference!
It turns out that such a circumscribing-clustering process is needed to fulfill many different functions in early vision. It is needed whenever the correspondence problem arises – whenever visual elements need to be placed in correspondence or paired with other elements. This occurs in stereo, apparent motion, and other situations in which increasing the number of elements does not increase the difficulty of computing correspondences. Correspondence is not computed over continuous visual manifolds but only over some pre-clustered elements.
Example of the correspondence problem for apparent motion
The grey disks correspond to the first flash and the black ones to the second flash. Which of the 24 possible matches will the visual system select as the solution to this correspondence problem? What principle does it use?
[Figure panels: Curved matches; Linear matches]
Here is how it actually looks
Why does the apparent motion take the form it does?
The principle appears to be one of minimizing the vector difference between each possible correspondence pair and that of its nearest neighbors (Dawson & Pylyshyn, 1988)
This principle arises from (is justified by) the natural constraints of rigidity and opacity:
In our kind of world most image features arise from distal elements on the surface of opaque rigid objects, i.e., the vast majority of perceived distal elements are on the visible surface of opaque rigid objects
Therefore each distal element is likely to move the same amount and in the same direction as elements near to it (since they are likely to be on the same surface)
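The minimizing principle can be illustrated with a brute-force sketch. This is not the Dawson & Pylyshyn (1988) model itself. It assumes a simplified cost (each element's motion vector is compared with that of its single nearest neighbor in the first flash) and simply enumerates every one-to-one match. The function names and the example coordinates are hypothetical.

```python
from itertools import permutations
import math

def motion_vectors(first, second, match):
    """Displacement of each first-flash element under a candidate match.
    match[i] is the index in `second` paired with first[i]."""
    return [(second[j][0] - first[i][0], second[j][1] - first[i][1])
            for i, j in enumerate(match)]

def nearest_neighbor(points, i):
    """Index of the element closest to points[i], excluding i itself."""
    others = [k for k in range(len(points)) if k != i]
    return min(others, key=lambda k: math.dist(points[i], points[k]))

def relative_velocity_cost(first, second, match):
    """Sum, over elements, of the vector difference between an element's
    motion and the motion of its nearest neighbor (simplifying assumption)."""
    vecs = motion_vectors(first, second, match)
    return sum(math.dist(vecs[i], vecs[nearest_neighbor(first, i)])
               for i in range(len(first)))

def best_match(first, second):
    """Enumerate all one-to-one matches (4! = 24 for four disks) and
    return the cheapest one together with the number of candidates."""
    candidates = list(permutations(range(len(second))))
    best = min(candidates,
               key=lambda m: relative_velocity_cost(first, second, m))
    return best, len(candidates)

# Hypothetical display: four disks that all shift by the same small step.
first_flash = [(0, 0), (10, 0), (0, 10), (10, 10)]
second_flash = [(4, 1), (14, 1), (4, 11), (14, 11)]
match, n_candidates = best_match(first_flash, second_flash)
```

For this display the cheapest of the 24 candidates pairs each disk with its own shifted copy, since that is the only match in which neighboring elements move by the same amount in the same direction.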
Views of a dome
Structure from Motion Demo
Cylinder Kinetic Depth Effect
The correspondence problem for biological motion
Reprise … what are FINSTs?
They are a primitive reference mechanism that refers to individual objects in the world (FINGs?)
Objects are picked out and referred to without using any encoding of their properties, including their location. Picking out objects is prior to encoding their locations!
Indexing is nonconceptual because it does not represent an individual as a member of some conceptual category – not even as being in the category individual or object!
FINSTs serve as visual demonstratives, much like the terms this or that do in language, by picking out and referring to individuals without using their properties.
The central function of FINST indexes is to bind arguments of visual predicates or of motor commands to things in the world to which they must refer. Only predicates with bound arguments can be evaluated.
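The binding role of indexes can be sketched in code, loosely by analogy with variable binding in a programming language. Everything here is an assumption made for illustration: the `IndexPool` class, its `grab`, `release`, and `evaluate` methods, the capacity of 4, and the `left_of` predicate are hypothetical names, not part of FINST theory as stated in the talk. The sketch shows two things: an index holds a reference to the thing itself rather than a description of it, and a predicate can be evaluated only once all of its arguments are bound.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

class UnboundIndexError(Exception):
    """Raised when a predicate is evaluated with an unbound argument."""

@dataclass
class IndexPool:
    capacity: int = 4  # assumed limit, chosen only for this sketch
    bindings: Dict[str, object] = field(default_factory=dict)

    def grab(self, name: str, thing: object) -> bool:
        """Bind index `name` to `thing`; fail if the pool is exhausted."""
        if name not in self.bindings and len(self.bindings) >= self.capacity:
            return False
        self.bindings[name] = thing  # a reference, not a property list
        return True

    def release(self, name: str) -> None:
        """Free an index so it can be bound to something else."""
        self.bindings.pop(name, None)

    def evaluate(self, predicate: Callable[..., bool], *names: str) -> bool:
        """Evaluate a predicate over the things its arguments refer to."""
        missing = [n for n in names if n not in self.bindings]
        if missing:
            raise UnboundIndexError(f"unbound arguments: {missing}")
        return predicate(*(self.bindings[n] for n in names))

def left_of(a, b) -> bool:
    """Properties are looked up only when the predicate is evaluated."""
    return a["x"] < b["x"]

# Hypothetical scene objects.
disk = {"x": 1, "y": 3}
square = {"x": 5, "y": 3}
pool = IndexPool()
pool.grab("a", disk)
pool.grab("b", square)
result = pool.evaluate(left_of, "a", "b")
```

Because the binding stores the object and not its coordinates, changing `disk["x"]` later changes what `left_of` returns without rebinding anything, which mirrors the claim that picking out an object is prior to encoding its location.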
Schema for how FINSTs function in visual-motor control