17
Pergamon PII: S0042-6989(97)00124-7 Vision Res., Vol. 37, No. 23, pp. 3255-3271, 1997 © 1997 Elsevier Science Ltd. All rights reserved Printed in Great Britain 0042-6989/97 $17.00 + 0.00 Perception of Stationary Plaids: The Role of Spatial Filters in Edge Analysis MARK A. GEORGESON,*$ TIM S. MEESEt Received 16 July 1996; in final form 13 March 1997 Orientation-tuned spatial filters in visual cortex are widely held to act as "orientation detectors", but our experiments on the perception of stationary two-dimensional (2-D) plaids require a new view. When two sinusoidal gratings at different orientations (say 1 c/deg, +45 deg from vertical) are superimposed to form a standard plaid they do not, in general, look like two sets of oblique contours (diamonds) but more like a blurred checkerboard (squares) with vertical and horizontal edges, although the Fourier components are oblique. The pattern of edges seen in this plaid and others corresponds to the zero-crossings (ZCs) in the output of a circular filter, but adaptation and masking experiments suggest that oriented filters are being summed to emulate circular filtering, before ZC analysis. At low contrasts or after adaptation to an intermediate orientation, the combining of filters can fail or be "broken", and the diamond structure of the components is seen instead. Adding a low contrast third harmonic to one component in square-wave phase also changed the plaid's appearance from squares to diamonds, but adapting to the third harmonic enhanced the square appearance. Filters can evidently switch from combining across orientation to combining across spatial frequency. The combination stage of edge detection may involve variably weighted summing of oriented filters in monocular pathways, followed by a process that makes explicit the locations and orientations of features. © 1997 Elsevier Science Ltd Human vision Spatial filtering Filter combination Adaptation Edge detection Zero-crossings Orientation coding INTRODUCTION Oriented lines and edges are key features in the visual structure of images, and it is important to understand the mechanisms by which they are located and represented (Marr, 1982; Watt & Morgan, 1985; Morrone & Burr, 1988; Georgeson, 1994). The evidence for orientation- tuned spatial filters in mammalian visual cortex is abundant (Hubel & Wiesel, 1968; De Valois et al., 1982; De Valois & De Valois, 1988), and it has been natural to assume that these filters serve in some way as "orientation detectors" in early vision. Experiments on the detection and perception of one-dimensional (l-D) gratings and lines (e.g. Wilson, 1983; Graham, 1989; Wilson et al., 1990; Snowden, 1992) have tended to reinforce this view. Thus, orientation discrimination might depend on just-detectable changes in the distribu- tion of activity across an array of such "detectors" tuned to different orientations (Regan & Beverley, 1985); edge *School of Psychology,University of Birmingham, Birmingham B 15 2TT, U.K. tDepartment of Vision Sciences, Aston University, Birmingham B4 7ET, U.K. STo whom all correspondence should be addressed [Fax: 121-414 4897; Email: [email protected]]. orientation might be determined by the peak of activity (Malik & Perona, 1992); and illusory shifts in orientation may depend on shifts in the distribution of activity, induced by adaptation or by lateral inhibition between detectors (Carpenter & Blakemore, 1973). What do we see if two sinusoidal gratings are superimposed at different orientations (say +45 deg from vertical) to form a two-dimensional (2-D) plaid (Fig. 1)? On the "orientation detector" view, we should surely expect to see two sets of criss-crossing oblique contours ("diamonds") at the component orientations but, in general, this does not happen. Instead, such a plaid tends to look like a blurred checkerboard ("squares") with vertical and horizontal edges, even though the Fourier components are oblique. If the plaid angle is reduced to q-30 deg, the vertical edges are elongated [Fig. 2(C)]. These visible edges cannot be understood by considering which are the most active oriented filters, since they would presumably be tuned to the plaid's oblique components. When we applied a standard edge- finding algorithm (Canny, 1986) to a variety of plaids we found that it did not, in general, predict the perceived pattem of edges convincingly. However, the perceived structure of these plaids (Fig. 2) is well predicted by the zero-crossings (ZCs) in the output of a circular filter (Georgeson, 1992, 1997; Meese & Freeman, 1995). To 3255

Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

Pergamon PII: S0042-6989(97)00124-7

Vision Res., Vol. 37, No. 23, pp. 3255-3271, 1997 © 1997 Elsevier Science Ltd. All rights reserved

Printed in Great Britain 0042-6989/97 $17.00 + 0.00

Perception of Stationary Plaids: The Role of Spatial Filters in Edge Analysis MARK A. GEORGESON,*$ TIM S. MEESEt

Received 16 July 1996; in final form 13 March 1997

Orientation-tuned spatial filters in visual cortex are widely held to act as "orientation detectors", but our experiments on the perception of stationary two-dimensional (2-D) plaids require a new view. When two sinusoidal gratings at different orientations (say 1 c/deg, +45 deg f rom vertical) are superimposed to form a standard plaid they do not, in general, look like two sets of oblique contours (diamonds) but more like a blurred checkerboard (squares) with vertical and horizontal edges, although the Fourier components are oblique. The pattern of edges seen in this plaid and others corresponds to the zero-crossings (ZCs) in the output of a circular filter, but adaptation and masking experiments suggest that oriented filters are being summed to emulate circular filtering, before ZC analysis. At low contrasts or after adaptation to an intermediate orientation, the combining of filters can fail or be "broken", and the diamond structure of the components is seen instead. Adding a low contrast third harmonic to one component in square-wave phase also changed the plaid's appearance from squares to diamonds, but adapting to the third harmonic enhanced the square appearance. Filters can evidently switch from combining across orientation to combining across spatial frequency. The combination stage of edge detection may involve variably weighted summing of oriented filters in monocular pathways, followed by a process that makes explicit the locations and orientations of features. © 1997 Elsevier Science Ltd

Human vision Spatial filtering Filter combination Adaptation

Edge detection Zero-crossings Orientation coding

INTRODUCTION

Oriented lines and edges are key features in the visual structure of images, and it is important to understand the mechanisms by which they are located and represented (Marr, 1982; Watt & Morgan, 1985; Morrone & Burr, 1988; Georgeson, 1994). The evidence for orientation- tuned spatial filters in mammalian visual cortex is abundant (Hubel & Wiesel, 1968; De Valois et al., 1982; De Valois & De Valois, 1988), and it has been natural to assume that these filters serve in some way as "orientation detectors" in early vision. Experiments on the detection and perception of one-dimensional ( l-D) gratings and lines (e.g. Wilson, 1983; Graham, 1989; Wilson et al., 1990; Snowden, 1992) have tended to reinforce this view. Thus, orientation discrimination might depend on just-detectable changes in the distribu- tion of activity across an array of such "detectors" tuned to different orientations (Regan & Beverley, 1985); edge

*School of Psychology, University of Birmingham, Birmingham B 15 2TT, U.K.

tDepartment of Vision Sciences, Aston University, Birmingham B4 7ET, U.K.

STo whom all correspondence should be addressed [Fax: 121-414 4897; Email: [email protected]].

orientation might be determined by the peak of activity (Malik & Perona, 1992); and illusory shifts in orientation may depend on shifts in the distribution of activity, induced by adaptation or by lateral inhibition between detectors (Carpenter & Blakemore, 1973).

What do we see if two sinusoidal gratings are superimposed at different orientations (say +45 deg from vertical) to form a two-dimensional (2-D) plaid (Fig. 1)? On the "orientation detector" view, we should surely expect to see two sets of criss-crossing oblique contours ("diamonds") at the component orientations but, in general, this does not happen. Instead, such a plaid tends to look like a blurred checkerboard ("squares") with vertical and horizontal edges, even though the Fourier components are oblique. If the plaid angle is reduced to q-30 deg, the vertical edges are elongated [Fig. 2(C)]. These visible edges cannot be understood by considering which are the most active oriented filters, since they would presumably be tuned to the plaid's oblique components. When we applied a standard edge- finding algorithm (Canny, 1986) to a variety of plaids we found that it did not, in general, predict the perceived pattem of edges convincingly. However, the perceived structure of these plaids (Fig. 2) is well predicted by the zero-crossings (ZCs) in the output of a circular filter (Georgeson, 1992, 1997; Meese & Freeman, 1995). To

3255

Page 2: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3256 M. A. GEORGESON and T. S. MEESE

Sinuso ida l C o m p o n e n t s Plaid

/ / + // // Diamonds Squares

C o m p o n e n t Z C s C o m p o u n d Z C s

FIGURE 1. A standard plaid is formed by adding two sinusoidal gratings of the same spatial frequency and contrast, ar different orientations. Zero-crossings (ZCs) of the components are oblique, but ZCs of the plaid are vertical and horizontal. Our experiments evaluated the perceived structure of plaids by having observers decide whether component or compound ZCs were

a better representation of the plaid's perceived spatial structure.

that extent the appearance of plaids supports a model like that proposed by Marr & Hildreth (1980), in which edges are found at ZCs in the output of a smoothed, isotropic (circular) second derivative filter, V 2 G , the Laplacian of a gauss±an.

Nevertheless, Georgeson (1992) has argued from adaptation and masking experiments that circular filters are not used directly in edge-finding. Adaptation or masking of a standard plaid [Fig. 2(B)] by a left oblique grating caused the plaid's structure to appear distorted, as in Fig. 2(A). The distortion is equivalent to a reduction in contrast of the masked or adapted component of the plaid. It is difficult to see how this orientation-selective masking could occur in a system of circular filters, but the effects can be readily understood if oriented filters are combined by linear summation, to emulate circular filtering, before ZC analysis. This is the central idea that

we explore in this paper . W e emphas ize , however , that the evidence for var iable combina t ion of or iented filters does not depend cruc ia l ly on our work ing hypothes is that ZCs in the combined- f i l t e r output are used to locate edges. The status and value o f the ZC rule is d i scussed further in Append ix I.

It would be curious i f v is ion invar iab ly emula ted c i rcular filtering in this way. W h y construct or iented filters, only to throw them away by hard-wired pool ing across all or ienta t ions? In this paper we review recent results and repor t new exper iments that point ins tead to the dynamic or var iable manner in which or iented filters are combined before edge-f inding. The exper iments on pe rce ived structure o f p la ids fall into four groups, examin ing the effects of (i) vary ing bas ic parameters such as contrast and relat ive or ientat ion o f the compo- nents; (ii) pr ior adapta t ion to grat ings at the componen t

A. CR:C L = 2:1 B. +450 C. +300 D. +300, 900

FIGURE 2. Upper row: four plaids with components at a common spatial frequency. (A) Components ±45 deg from vertical, with contrast ratio 2:1. (B) Standard ±45 deg plaid, as in Fig. 1. (C) Standard plaid with components at ±30 deg. (D) Three- component plaid, with components 30, 30 and 90 deg from vertical, in cosine phase. Experiments (e.g. Georgeson, 1992) show that compound ZCs (lower row) usually describe perceived structure fairly well. This implies oriented Fourier

components of a plaid are not analysed separately.

Page 3: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS 3257

frequency; (iii) adding a higher harmonic to one of the fundamental components; and (iv) adapting to higher harmonic frequencies. The common theme that emerges is that perceptual combination of two gratings in a plaid can be radically altered by these various manipulations. It can be broken, or reinforced, or switched to another combination of components. We suggest that these changes arise from variable combination of the outputs of oriented spatial filters, and in a final experiment we ask about the neural site at which this filter combination Occurs .

GENERAL METHODS

Terminology and definitions A plaid is the pattern formed as the linear sum of two or

more superimposed sinusoidal gratings (Fig. 1). These are the components of the plaid. The plaid angle is the difference between the component orientations. The 2-D luminance profile of a sinusoidal grating of spatial frequency (f), orientation (0), contrast (C), phase (qS) and mean luminance I0 can be expressed as:

I(x,y) = I0.[1 + C.sin(u.x + v.y + ~b)]

where u =f.cos(0) and v =f.sin(0) are the Cartesian (u,v) co-ordinates of that component in Fourier space, and f = x/-(u2+ v2). We express the contrast (C) of the components by the usual Michelson contrast, in percent (%):

C = 100.(Imax - lmin)/(Imax q'- Imin).

The Michelson contrast of a two-component plaid is always the sum of its component contrasts. But for three or more components, the Michelson contrast of a plaid varies with the phase of its components and it is generally easier to refer to the component contrasts directly. It is sometimes also useful to refer to contrast in logarithmic units; thus contrast in decibels (dB) is defined as 20.1og10(C).

A standard plaid is of the type shown in Fig. 1, where there are just two components whose spatial frequency (f) and contrast (C) are equal. As a useful shorthand summary of a plaid's composition we shall refer to Left oblique, Right oblique, Vertical and Horizontal compo- nents as L, R, V and H respectively, subscripted by their harmonic number. Thus the set < L1,R1 > is a standard plaid with components at the fundamental frequency q), and < L1,R1,R3 > is a standard plaid with added third harmonic (330.

The zero-crossings (ZCs) of a function are those places at which the function's value passes from positive to negative. In referring to the ZCs of an image, researchers usually have in mind the ZCs of the image after it has passed through some specified linear filter (Marr & Hildreth, 1980). In general, the pattern of ZCs depends on the filter applied, but gratings and plaids allow some simplification. For a sine grating, the ZC pattern is the same for all filters, provided the zero frequency (d.c.) component is suppressed; the ZCs always lie along the orientation of the grating. For a plaid the ZC pattern

varies with the orientation preference and tuning of the filter. However, if the filter is circular and the plaid contains components at a single spatial frequency (as in Fig. 2) then the ZC pattern is the same for all spatial frequency tunings and spatial scales, provided the zero frequency (d.c.) component is suppressed. In such cases, the circular filter alters the amplitudes of all components by the same factor, so that the input and output are isomorphic and the ZC pattern is unaffected by filtering. As shown in Fig. 1, we refer to ZC patterns formed by taking the components individually as Component ZCs, and the ZC patterns formed after combining the components within an isotropic (circular) filter as Compound ZCs.

Apparatus and stimuli Gratings and plaids were generated by an Innisfree

"Picasso" image generator, and displayed at a frame rate of 220 Hz on a Tektronix 608 monitor with green phosphor (P31). The "Picasso" produces plaids by frame- interleaving the two grating components. This has the benefit that the time-averaged luminance pattern is truly the sum of the two gratings, preventing any inter- modulation distortion between orientations. The software took two frames to update the Picasso, so that grating parameters could be changed at a rate of 110 Hz. A plaid containing two orientations thus had a refresh rate for each component of 55 Hz, which is high enough to ensure no visible flicker. Mean luminance was constant at about 20 cd/m 2. Regular computer-controlled calibrations with a digital photometer (Photodyne XLA) confirmed that the luminance output was linear with applied voltage in the contrast range up to 40%. In the experiments, the contrasts, spatial phases and orientations of the plaid's components were controlled by a computer which also scheduled the trials, displayed response alternatives on an adjacent graphics screen and recorded the subject's responses. The graphics screen was approximately matched to the plaid display in colour and luminance. The plaid display was viewed binocularly in a dimly lit room from a distance of 114 or 228 cm, at which the circular display subtended 5 or 2.5 deg in diameter, respectively. Fundamental frequency f = 1 c/deg for both field sizes. The head was supported by chin and forehead rests. A small dark fixation point was present at the centre of the display.

General procedure

On each trial a plaid was shown for 500 msec, in a single-interval, binary choice procedure. Presentation was cued by tones and the task was to classify the appearance of the plaid into one of two categories defined by outline drawings displayed on the graphics screen. The observer chose between two types of icon or between two ZC sketches. With the icon method, perceived structure of plaids had to be classed as "diamonds" or "squares" (see Fig. 1), and a single outline diamond and square shape served as the response cues. With the sketch selection method, the sketches were outline versions of

Page 4: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3258 M.A. GEORGESON and T. S. MEESE

the type of ZC image shown in Figs 1 and 2, computed appropriately for each test plaid, and displayed at 1:1 scale on the screen adjacent to the plaid display. The line segments marking ZCs were light increments on a uniform background (described above), and the observer chose between sketches representing ZCs at the compo- nent orientations ("diamonds") or ZCs of the pattern as a whole ("squares").

The use of outline sketches has the advantage that the Fourier spectra of the sketches are very different from those of the plaid images. In particular, the low frequency components of the plaid are absent from the sketches, and the high frequencies of the sketches are absent from the plaid. Thus, subjects cannot sensibly compare the Fourier spectra and must instead base their choice on spatial structure. The line sketches contained size, shape and orientation information, but said nothing about the possible perceived contrast, polarity or blur of features in the test plaid. Instructions emphasized that the "diamond" category should be used if edges seemed predominantly oblique, while "squares" should be used if edge orientations were mainly horizontal and vertical. Observers MAG, TSM (the authors) and JMG were well practiced in this general task and, where necessary, further practice sessions were given, as well as practice runs at the start of experimental sessions. JMG did not know the specific aims of the experiments.

Detectability of single gratings was tested under the same display conditions, using the same single-interval procedure. In the experiments of Fig. 7, Figs 8 and 10 the grating detection trials were randomly interleaved with plaid trials within a session, and the type of trial was pre- cued by a tone. In one experiment (Effect of contrast on perceived structure of plaids), grating detection was tested in a separate two-interval forced-choice procedure, and the percent correct data were transformed into an estimate of the proportion seen in a single-interval procedure by a standard correction for guessing: p (seen) = [2.p (correct response) - 1]. Any other deviation from these general conditions will be noted in the appropriate Results section.

PERCEPTUAL COMBINATION OF FOURIER COMPONENTS IN PLAIDS

In previous experiments we have found that naive observers can reliably associate plaid images (Fig. 2) with their corresponding compound ZC patterns (Geor- geson, 1992). In parametric experiments (Georgeson, 1997), the relative orientation and spatial frequency of the two components were varied, and the observer was required to choose between compound ZCs or component ZCs as a better representation of the plaid's structure. At all plaid angles, up to the maximum of 90 deg, compound ZCs were the dominant response up to a spatial frequency ratio of 1-1.5 octaves. For greater spatial frequency ratios, component ZCs were the dominant response. Since compound ZCs emerge from circular filters, the simplest interpretation of these results is given by circular filtering followed by ZC analysis (Marr & Hildreth,

1980). The effect of spatial frequency ratio is consistent with filtering at several spatial scales. When the two components are sensed mainly by filters of different sizes, then component ZCs would emerge at seperate spatial scales, as observed. However, the weight of other evidence is not consistent with such a simple isotropic model, as we shall see. We begin with perception at low contrasts, and then look at the interaction between contrast and plaid angle.

Effect of contrast on perceived structure of plaids

One observer (TSM) was tested extensively on two tasks--plaid perception and grating detection. For both tasks, the stimulus components had a spatial frequency of 1 c/deg, and orientations of -4-45 °, while contrast varied from trial to trial. Display diameter was 5 deg. For plaid perception, the observer made sketch selections (see: General procedure) to indicate whether the stimulus appeared as a compound pattern (like a checkerboard) or as two overlapping gratings (see Fig. 1). For grating detection, only one of the components was presented, and the observer chose which of two temporal intervals contained the grating.

Results (Meese & Georgeson, 1996b) are re-plotted in Fig. 3(A), where open triangles and left ordinate show the probability of grating detection after correction for guessing, and filled circles (right ordinate) show the percentage of compound responses for the plaid. Solid curves are probit fits, yielding a grating detection threshold (50% "seen"; 75% correct) of - 1 0 . 2 d B (0.31%). At low contrasts, component ZCs were almost always chosen for the plaid, but at higher component contrasts, above 4%, compound ZCs were the dominant response. The transition (50%) point between these two responses occurred at a contrast (P50) of 11 dB (3.55%), about 10 times higher than the detection threshold for a single component.

One may wonder whether the appearance of compo- nent ZCs at low contrast arose from seeing only one of the two components on a given trial, but this is very unlikely. The dashed curve [Fig. 3(A)] shows the proportion of trials on which both components of a plaid would be seen, assuming that each component is detected independently of the other (Georgeson & Shackleton, 1994). The estimated contrast level (P50) at which both components in the plaid were detected was -6 .7 dB (0.46%), about 8-times lower than the P50 for compound ZCs in plaid perception. Thus, at component contrasts from 1 to 4% both components were reliably detected, but the oblique structure of the components was perceived. At low contrast, then, a plaid can appear more like two sets of overlapping, oblique contours, and not like a checkerboard. Analogous results have been obtained for observer JMG (Georgeson, 1997) and for naive observers (Meese, 1993; Meese & Freeman, 1995). No model based directly on ZCs of circular filters can account for this effect of contrast, since for a standard plaid the circular filter always delivers compound ZCs.

Page 5: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS

~, 1.0

u~ 0.8 ¢ -

0

~ 0.6 Q. £ ~ 0.4

t -

O

~ O.Z 0

o 0.0

-18 -17' -6 0 6 12 18 24 dB i i i i i i i i

G~telngon/." ' / P l a i d . ..,' • perception

_. _ , . . . . "

I I I I I I I I

0.130.25 0.5 1 2 4 8 16 Component Contrast (%)

~ 8 100 .~ ~- O "O .~

80 o .~ 4

60 E 2

zo ~ ~2 o.s

O.Z5 0 (D

B

deg

T L __- I I I I I

30 45 60 75 90 Plaid Angle (Deg)

FIGURE 3. (A) Effect of contrast on plaid perception. Filled symbols (right ordinate) show percentage of trials on which a standard plaid (Fig. 1) was judged to have "compound" (checkerboard) structure. At low component contrasts, below 4%, "component" structure was seen. Open symbols show the detectability of a single grating component, after correction for guessing. Dashed curve shows estimated probability that both components of a plaid would be detected on a given trial. (B) Effect of plaid angle. Contrast required to produce 50% "compound" responses is plotted against the angle between plaid

components. More contrast was needed to evoke the checkerboard percept at larger angles.

3259

18

12

6

dB 0

-6

-12

Effect of angle between the two components

The experiment on plaid perception was repeated for observer TSM at five plaid angles from 30 to 90 deg, with components oriented +15, 22.5, 30, 37.5, and 45 deg from vertical. Probit curves were fitted to the data as in Fig. 3(A), to derive the contrast threshold (P50) for reporting compound ZCs at each plaid angle. Each point was based on a total N = 320 observations taken over eight contrast levels. Figure 3(B) shows that less and less contrast was required to produce compound responses at smaller angles, implying that perceptual combination of the components of a plaid occurred more readily at smaller plaid angles. While more than 4% contrast was required for components at larger angles (75, 90 deg), this fell to 0.5% at 30 deg. This latter value is about equal to the contrast (P50) required for detection of both components [0.46%; Fig. 3(A), dashed curve].

Other factors: individual differences and absolute orientation. There are individual differences, however, since observer MAG has consistently shown no effect of contrast for a standard plaid < L1,RI > [e.g., open circles, Fig. 5(A)], but does show a contrast effect when the plaid is rotated through 45deg, to become < H],V] >. This difference may partly be due to much greater practice on the < Lj,R1 > plaid, and also to a real effect of plaid orientation. Compound ZCs were seen more frequently when they were vertical and horizontal than when they were oblique (Meese & Freeman, 1995). This was true even when the possible influence of component orientation was eliminated. A plaid whose components are +22.5 deg from vertical (or i 2 2 . 5 deg from horizontal) has vertical and horizontal compound ZCs, whereas a plaid whose two components are 22.5 deg from horizontal and from vertical respectively has oblique ZCs. The latter plaid is less often seen as having compound ZCs.

Combining oriented filters: the bridge hypothesis

In summary, we have seen that the appearance of two- component plaids does not always conform to the compound ZC structure predicted from circular filters. Although compound ZCs are indeed seen over a wide range of conditions, component ZCs are increasingly reported at lower contrasts and larger plaid angles (Fig. 3). This trend is even more pronounced when the plaid is oblique (i.e. has compound ZCs that are oblique; Meese & Freeman, 1995). These results are not consistent with the direct use of circular filters, but are consistent with the idea of filter combination, in which the outputs of oriented filters activated by the plaid's components are summed, and then ZCs are located in this combined output. We must suppose that there is some process that controls the manner in which the filters are, or are not, combined. When the filters activated by an < La,R1 > plaid are combined, the result is compound ZCs since the combination emulates circular filtering by being equally sensitive to both components. This entails combining filters tuned to widely different orientations, up to 90 deg apart. When the filters are not combined we suppose that ZCs are extracted separately from the filters activated by L] and by R1, leading to component ZCs. To put these two outcomes into a consistent framework, we suppose that active filters are aggregated into one or more groups, that filters within each group are summed, and that ZCs are found in the output of each group [Fig. 11 (A)]. Thus, for a standard plaid, compound ZCs emerge from a single group containing all active filters, while component ZCs emerge from two distinct groups activated by L and R components. Why should this segregation into two groups happen more readily at low contrast and at larger angles?

We offer the bridge hypothesis as an account of filter grouping control. A standard plaid will always activate

Page 6: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3260 M.A. GEORGESON and T. S. MEESE

f

/ / ~ A

Grating Filter response pattern

±150 ±300 +450

B

Pla ids

C

Medium contrast

D

Low contrast

E

Very low contrast

FIGURE 4. The bridge hypothesis accounts for the variation in grouping of spatial filters in response to standard plaids. (A) Each grating component of the input evokes one pair of "blobs" of activity in the set of spatial filters tuned to different orientations and spatial frequencies. Plaids (B) contain two gratings, and so evoke two pairs of blobs in the response pattern (C). Considering just the right half of each response diagram (covering the full range of 0-180 deg orientations) it can be seen that the two component "blobs" of activity can merge into one group at higher contrasts [low threshold, (C)] but can be split into two groups at low contrasts [modelled here by setting higher thresholds; (D, E)]. The split into two groups occurs when the responses of intermediate "bridge" channels fall below a threshold

value (marked by white outline). See text for discussion.

two sets of filters tuned to the component orientations, but whether these two sets form two groups or one may depend on the level of activity in the filter(s) tuned to intermediate orientations. We shall refer to these intermediate filters as bridge channels. If the bridge channel activity exceeds a critical level then we suppose that this serves to link the adjacent, more highly activated, filters into a single group. When contrast is decreased, so the bridging activity also decreases and if it falls below the critical level then the link is no longer made and two groups are formed; component ZCs are the

result. Similarly, when the plaid angle is increased, so the bridge channel receives increasingly weak input from the components that now fall on the tails of its orientation sensitivity curve. The link will again fail, especially if the contrast is also low.

A specific version of the bridging idea is illustrated in Fig. 4. Figure 4(A) shows the basic assumption, common to most models of spatial vision, that each sinusoidal component of the input produces a graded pattern of responses across a set of tuned filters. The intensity of each point in this neural Fourier space represents the magnitude of response of a filter tuned to a particular spatial frequency and orientation. Because orientation ranges only from 0 to 180 deg, each filter is represented twice in the range 0-360 deg, and there are two "blobs" of activity evoked by each component. This redundancy means that we can pay attention mainly to one half of each response diagram, e.g. the right half. Each image in Fig. 4(C, D, E) represents the pattern of activity across filters evoked by a plaid [Fig. 4(B)]. We suppose that adjacent filters are grouped and combined if their responses exceed some threshold level. It can be seen that when the threshold is low [representing medium or high contrast input; Fig. 4(C)] the "blobs" of activity from the two components merge to form one set of supra- threshold responses; the filters form one group, leading to the checkerboard percept. But when contrast is decreased [threshold is raised, Fig. 4(D, E)], so the bridging activity may fall below the critical level and then two groups are formed, leading to perception of the separate compo- nents. This happens more readily when the plaid angle is large or the contrast is very low, as found experimentally. When the angle is small (±15deg) the overlap of responses is so great that separate grouping never occurs. This corresponds to the experimental finding that for this small angle compound responses predominated, even at the lowest visible contrasts [Fig. 3(B)].

The bridge hypothesis thus accounts, qualitatively at least, for the effects of contrast and angle on plaid perception. The effect of absolute orientation suggests further that bridge channels tuned to horizontal and vertical can form stronger, more effective links than those tuned to oblique orientations (Meese & Freeman, 1995). We have no direct evidence on the nature of the linking process, nor on how different filters might be "tagged" as belonging to the same group, but it is tempting to draw the analogy between this linking process and the linking of filters across visual space that seems to underlie the extraction of global contours from a field of local elements (Field et al., 1993). A communicative, dynamic system (Gilbert, 1995) is implicated in both cases, and linking by synchronized neural activity remains an attractive possibility (Gray et al., 1989; Eckhorn, 1994).

B R E A K I N G THE COMBINATION

The bridge hypothesis leads to a simple and direct experimental prediction. If intermediate filters are critical to the perceptual combination of components in a plaid, then adapting to intermediate orientations should reduce

Page 7: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

100

0

( /1

8O ¢ -

O

o ~

~ 60 t -"

o 4 0 E 0

~ 20

Baseline Bisecting Aligned

Adaptation Condition

A i !

1 c/deg

i i i i I

-@-Adapt MAG ..O.No adapt

-B -Adapt TSM .£}.No adapt

" I

• . • % A

, . . . . o ' '

5 10 20 Test component contrast (%)

I00 v

B

I' T

( / I

80 t- O Q. in

60

~ 40 i-- J. e -

~ 20 ¢ -

0

E 0 0

SPATIAL FILTERS IN EDGE ANALYSIS 3261

FIGURE 5. (A) Effect of adapting to a grating on perceived structure of a plaid. Adapting grating was vertical, bisecting the angle between plaid components. Test plaid looked like overlapping oblique components more often after contrast adaptation (filled symbols) than without adaptation (open symbols). Circles: observer MAG; squares: TSM. (B) Effect of adapting to a plaid on perceived structure of a plaid. Spatial frequency of adapting and test components, 1 c/deg. Means of three observers (MAG, TSM, TCAF). Adapting to intermediate orientations ("Bisecting" condition) markedly increased perception of component orientations in the test plaid, confirming (A), while adapting and testing at the same ("Aligned") orientations did so only slightly. This points to a crucial role for mechanisms at intermediate orientations in perceptual linking and combining of

widely separated orientation components.

the bridging activity, and break the link. Perception of compound ZCs should give way to component ZCs, with no physical change in the test stimulus. We report two experimental tests of this prediction.

Adaptation to a vertical or horizontal grating

In the first experiment, two observers (MAG, TSM) adapted in different sessions to a 1 c/deg vertical or horizontal grating of 0% (control) or 40% contrast. The spatial phase of the adapting grating jumped randomly by 90 to 270 deg every 200 msec to prevent the formation of afterimages. After 4 sec adaptation there was a 100 msec blank delay, then a < L1,RI > test plaid for 100 msec. After each test presentation the observer made a sketch selection, choosing component or compound ZCs as described earlier, and then re-adapted for 4 sec before the next test. The experiment was actually a study of the tilt aftereffect (Meese & Georgeson, 1996a) and so from trial to trial the test component orientations varied in small steps, symmetrically from about ± 4 0 to 4-50 deg off- vertical, but only the sketch selection data need concern us here.

Data points in Fig. 5(A) show the percentage of component responses as a function of test component contrast, which ranged from 5 to 20%. Open symbols are control data (0% contrast adaptation) showing that without contrast adaptation the number of component responses was low, especially for MAG where virtually all responses were of the compound (checkerboard) type. For TSM there was an effect of contrast, confirming the results of Fig. 3(A). The important result is that after adaptation (filled symbols) the number of component responses was greatly increased for both observers,

especially at lower contrasts. In short, adapting to a grating that bisected the plaid angle increased the number of component responses, implying that perceptual combination of the components was weaker after this adaptation, as predicted by the bridge hypothesis.

Adaptation to a plaid

The second experiment was basically similar to the first, except that observers adapted to a 1 c/deg plaid instead of a grating. We compared the effect of adapting to intermediate orientations with the effect of adapting at the test component orientations, because the bridge hypothesis proposes a special role for the intermediate orientations. Initial adaptation was for 5 min with 10 sec top-ups between trials. The spatial phase of the adapting components jumped randomly by 90 to 270 deg every 200 or 400 msec to prevent the formation of afterimages. The test stimulus was a two-component 1 c/deg plaid, with component orientations that were 90 deg apart. Test duration was 500 msec, field size 5 deg. Test component contrast varied between trials from 1.2 to 9.4%, and subjects made responses by choosing between a com- pound or component sketch as before. In the baseline condition, adaptation was to a blank screen of mean luminance, 0% contrast. In the bisecting condition, the adapting plaid had component contrasts of 5% and its component orientations bisected the angles between the test components. In the aligned condition, the adapting plaid again had component contrasts of 5%, but its component orientations were the same as the test plaid. For TCAF, the test component orientations were the standard +45 deg from vertical, but for MAG and TSM they were 0 and 90 deg from vertical, chosen on the basis

Page 8: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3262 M. A. GEORGESON and T. S. MEESE

No 3f 3f phase 1 8 0 0 3fphase 900 3f phase 00

FIGURE 6. (A) Standard plaid with and without added 3f component. In these illustrations harmonic contrast was one-tenth of the fundamental component contrast. (B) 2-D luminance distributions as maps of iso-luminant contours, linearly spaced above and below mean luminance. The basic "egg-box" structure of the intensity surface is altered only subtly by addition of a weak 3f component, but perceived structure changes markedly (cf. Figure 7, Fig. 9). In (A), perceived structure should change from "squares" (left) to "diamonds" (right). [Viewing from 1-2 m may help to reduce the inevitable distortion introduced by

reproduction of these images.]

of previous results so as to increase the baseline of component responding. This would allow both increases and decreases of component responding to be observed in the data after adaptation.

Results (Meese & Georgeson, 1996b) are shown in Fig. 5(B) as the mean percentage of component responses, collapsed across test contrasts, for the three conditions of adaptation. Adapting to orientations that lay halfway between the test components approximately doubled the number of component responses, from approx. 40 to 80%, while adapting at the test component orientations led to a much smaller increase in component responding. These results strongly support the bridge hypothesis. The bisecting condition adapts specifically the bridge chan- nels that are critical for filter combination, and so the link is broken and component ZCs are seen much more frequently than in the baseline condition. Conversely, adapting at the component orientations would have a much greater impact on the filters responding most strongly to the test components, but would have only a weak effect on the bridge channels. Thus, the increase in component responding was much smaller in this condi- tion.

In the next section we return to the perception of plaids without adaptation, exploring the effects produced by adding a higher harmonic to one of the fundamental components. We shall see that barely visible harmonics can have a dramatic perceptual effect that again points to variable combination of the underlying oriented filters.

SWITCHING FROM ONE COMBINATION TO ANOTHER

The experiments were motivated by a simple observa- tion that when one of the sinusoidal components of a standard < LI,RI > plaid was replaced by a square-wave grating then the perceived checkerboard structure was replaced by "diamonds" formed from sharp square-wave edges at (say) left-oblique superimposed on blurred, sine- wave edges at fight oblique (Georgeson, 1990). It is perhaps not surprising that the sharp oblique edges of the square-wave should be seen as such, but it is surprising that the opposite sine-wave edges should be seen, since it suggests that the perceptual combination of < LbR1 > is no longer occurring. Our experiments examined and quantified this effect by making small changes to the waveform of the sine-wave R1 by adding a third or fifth harmonic and plotting the extent to which these changes caused a switch in perceived structure from "squares" to "diamonds".

Adding a weak third or fifth harmonic to a s tandard plaid: the switching effect

In the first experiment of this series, a third or fifth harmonic component was added in square-wave (sine) phase to one component (R1) of a standard < L~,RL > plaid whose components had 10% contrast, oriented ±45 deg from vertical. Image diameter was 2.5 deg. An illustration of the standard plaid is shown in Fig. 6(A)

Page 9: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS 3263

100

( / }

80 ( D t -

O

~ 60 "(3 t- O E 40 a

~ 20

1 c/deg Plaid with added 3rd or 5th harmonic i i ~ 1 i i , , I ' i , ,

3f grating 'yes'

[ ] Plaid + 3f

. . . . . 5 f grating 'yes'

0 Plaid + 5f

100

80

60

40

Mean 2 Ss, 1 c /deg

2.5 0 field

10%, +450 f

O~ ' 0 ' 0 - 0 - ~ ) ' ' b ' ' ' O l . . . . ' . . . . 0

0 0.5 1 1.5 2 2.5 Contrast of 3f or 5f (%)

20

1 c/deg Plaid with added 3rd harmonic

2

I I I I I t I I I

[ ] +/_45 °

0 + / -30 °

0 + / -15 o

. ~ _ 3f '_yes--

Mean 2 Ss, 1 c/deg

2.5 ° field

10% f contrast

, , , , I , I I I

0.5 1 1.5 2 2.5 Contrast of 3f (%)

FIGURE 7. (A) Effect of adding 3f or 5f in sine-phase to a standard plaid (see Fig. 6 and text). Fundamental (1 c/deg) component contrast was 10%, orientations +45 deg from vertical. With 3f added (square symbols) the switch from the usual perception of "squares" to "diamonds" followed the visibility of the 3f component presented alone (solid curve). With 5f added (circles) this was not so, even though 3f and 5f (dashed curve) were about equally visible. Mean of 2 Ss (MAG, JMG). (B) As

(A) but for three plaid angles. Angle had no significant influence on the switching effect.

(left), and with an added third harmonic (3f) in Fig. 6(A) (right). The harmonic contrast varied randomly from trial to trial, and the subject's task was to classify the plaid's appearance as "squares" or "diamonds" by choosing the appropriate graphical icon, as described earlier. Trials testing the detection of the harmonic alone were randomly interleaved with plaid trials. Two observers (MAG, JMG) were tested on 20 trials, each at seven harmonic contrast levels from 0 to 1.4%. Third and fifth harmonics were tested in separate sessions.

Figure 7(A) shows the results pooled over the two observers, whose individual data were similar. It is clear that with no 3f (at 0% on the abscissa) both observers reported squares (almost 0% "diamonds"), as expected from earlier results. But adding a weak third harmonic (open square symbols) caused a rapid switch from squares to diamonds. The proportion of trials on which "diamonds" were reported matched closely the propor- tion of trials on which 3f was seen when presented alone (solid curve); the P50 values for "diamonds" and for 3f detection were both about 0.5% contrast. This shows that when a weak 3f component was added to the plaid then, even if it was only just detectable, it caused the switching effect from squares to diamonds.

Results for 5f were quite different. Although detect- ability of 5f alone [dashed curve in Fig. 7(A)] was almost the same as for 3f, the addition of a 5f component to the plaid did not cause the same dramatic switch in perceptual categorization. Up to 1.4% contrast (the highest tested here) the added 5f component caused no change in responses at all (open circles). This rules out the possibility that in the 3f case the observers were responding merely to the presence of the added harmonic, since they should then have clone the same in the 5f case and they clearly did not, even though 3f and 5f were

about equally detectable on their own. Subjectively, the 5f component seemed to be a set of fine oblique stripes superimposed on the blurred, checkerboard structure of the plaid, while in the 3f case the checkerboard structure was no longer seen.

No effect of plaid angle on the switching effect

We repeated the 3f experiment at two other plaid angles, with components oriented ± 15 and ±30 deg from vertical. The results are compared with the ±45 deg data in Fig. 7(B), and show that there was no discernible effect of plaid angle on the switching effect. To confirm this lack of effect one observer (JMG) repeated the experi- ment, with the f components of the plaid at either 10 or 20% contrast, and at two viewing distances giving 2.5 and 5 deg field sizes. Spatial frequency q3 was main- tained at 1 c/deg for both field sizes. A We±bull function was fitted to each psychometric function to extract a threshold value for 3f detection and for the switch from squares to diamonds. Figure 8 shows that there was no significant effect of plaid angle on the threshold for perception of diamonds. It tended to be just a little higher than the corresponding 3f detection threshold.

Effect of harmonic phase on the switching effect The effect of adding a single 3f component to a

standard plaid was thus invariant with plaid angle, but turned out to vary dramatically with 3f phase. In the first of several experiments, we held constant the f component contrast at 10 or 20%, and the 3f contrast at 1%, chosen on the basis of the data shown in Fig. 7. Component orientations were 5:30 deg from vert ical ,f= 1 c/deg, field size = 2.5 deg. Spatial phase of 3f varied randomly from trial to trial, and trials with no 3f component were

Page 10: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3264 M.A. GEORGESON and T. S. MEESE

10

v

O9 O

t -

O

o

Contrast (3f) threshold for Diamonds 1 i

O 10%, 2.5 deg [ ] 20%, 2.5 deg • 10%, 5 deg • 20%, 5 deg

. . . . . 3f thd 2.5 deg 3f thd 5 deg

[ ] [ ]

. . . . . . . . . . . - -m- . . . . . . . . . . . . . . . . . . . . . a s . . . . . . . . . . .

I.,,,,,J

JMG, f = 1 c/deg 1 0.1 1 a_

30 60 90 Plaid angle (deg)

100

tn

80 o ID_ (n

~ 60 "o C o

4 o C3

20

1 c/deg Plaid, 3f contrast = 1%

0 --lo% " % o [] 20%

E ~ O Mean

2.50 field, _+300 ~ I d e a n 2 Ss

k [ ]

O

0 45 90 135 180 Phase of 3f (deg) No 3f

FIGURE 8. Contrast (3f) threshold for diamonds. Plaid angle does not influence the switching effect. Experiment of Fig. 7(B) was repeated for JMG at two fundamental component contrast levels (10, 20%) and two display sizes (2.5, 5 deg). Psychometric functions (cf. Figure 7) were fitted with a Weibull function, and summarized here by the threshold parameter. Threshold 3f contrasts required to see diamonds in the < LI,RI,R3 > plaid (symbols) were a little higher than 3f detection threshold (horizontal lines), but did not vary much with plaid

angle.

randomly interspersed. The observers (MAG, JMG) had no external cue as to which trials were which.

Figure 9 shows the percentage (N = 40, pooled over the 2 Ss) of diamond responses as a function of 3f spatial phase. Baseline responding (20-30% diamonds) was somewhat higher than usual for a standard plaid. The main effect, however, was that as 3f phase shifted from square wave (0 deg) through to triangle-wave (180 deg) the switching effect was progressively eliminated. Although the presence of 3f was quite visible at 180 deg, it did not "switch off" the appearance of a blurred checkerboard in the way observed at 0 deg. Figure 6(A) illustrates this transition for the reader (though distortion in reproduction does not render these subtle patterns of luminance very well). This effect of 3f phase was confirmed by testing JMG across a range of 3f contrasts at each of four or six phases. Procedure was as for Fig. 7, except that 3f phase rather than plaid angle was varied between sessions. Results are shown in Fig. 10 for two combinations of fundamental contrast level and field size. For 3f phases of 0 and 45 deg, diamond responses followed fairly closely the detectability of the 3f component, as in Fig. 7, while at 3f, phases greater than 90 deg "square" checkerboard responses dominated, with 0% "diamonds" at both 135 and 180 deg phases.

Summary and discussion. These results can be summarized as reflecting different visual combinations of the harmonic components in a plaid. We have seen that with a standard < L t , R l > plaid there is a strong tendency for the two orientation components to be combined visually to form a checkerboard, so let us denote that perceptual combination as [L1 + R1]. When the contrast is low and the plaid angle is large the

FIGURE 9. Effect of adding a low-contrast 3f component to a standard plaid depends crucially on 3f phase. Fundamental (1 c/deg) component contrast was 10% (circles) or 20% (squares); orientations ~30 deg from vertical; 3f contrast was 1%. With no 3f (left), perception was mainly "squares" as usual. With 3f added, the switch to "diamonds" was complete at 0 deg (square-wave) phase, but absent at 180 deg

(triangle-wave) phase. Mean of 2 Ss (MAG, JMG).

components tend not to combine, which we can denote as [Ll, R1]. When 3f is added the plaid has components < L1,R1,R3 >, and in or near square-wave phase (0 deg) this seems to lead very readily to the new combination [Lt, R1 + R3], whose L/R oblique structure is categorized as "diamonds". The results suggest that in square-wave phase the combination [L1 + R1] occurs only when R3 is undetectable. When the harmonic component is detect- able, the combination immediately switches to [L1, RI + R3]. "Squares" give way to "diamonds". On the other hand, when R3 is in or near triangle-wave phase (180 deg) the original grouping tends to persist in the form [L1 + RI, R3] and to be categorized as "squares", albeit with superimposed stripes (R3).

There is clearly something special about the (f,3f) square-wave phase relationship that leads immediately to the combination of frequency components at a common orientation, and to the breaking of links across different orientations. In other experiments (Meese & Georgeson, 1992) we have used matching methods to examine the < L1,R1,R3 > combination in greater detail, and the results strongly suggest that at 180 deg phase, where the switching effect is less abrupt, the combination can be better represented as a sharing of one component between two groupings, expressed as [LI + ~.Rl, (1 -- 7).R1 + R3]. This means that some proportion (~) of R1 's contrast is combined with Ll while the remainder is combined with R3. At 180deg phase :( decreases gradually with increasing harmonic contrast; at 0 deg phase it switches abruptly from 1 to 0 as soon as the harmonic is detectable.

To the extent that different components are processed by different spatial filters, this evidence for switchable combination and sharing of components can be taken as evidence for switching and sharing of filter outputs, in a dynamic, combinatorial process that precedes the finding

Page 11: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS 3265

, ~ . , , 100 _ , , , ';h:e'o;;,'

= I / : . ; ....... -" 6o . l ~ ','" "~-112 / 6o p

,/,t ~ 3f yes j 40

123 ! JMG, 1 c/deg r [~ JMG, 1 c/deg " o~ 20 # 2.50 field 2 0 r / . . . , . , .-X 50 field . -

o l i ' e ~ ' - ' ' ' = ' ' ' " ' ' ' t o~, ,+45 ° o

0 0.5 1 1.5 2 2.5 0 0.5 1 1.5 2 2.5 Contrast of 3f (%) Contrast of 3f (%)

FIGURE 10. Confirming Fig. 9, effect of adding a low-contrast 3f component to a standard plaid depends crucially on 3f phase. Fundamental (1 c/deg) component contrast was (A) 10%; or (B) 20%; orientations 4-45 deg. Switch from squares to diamonds followed 3f detectability (solid curve) at 0 deg phase (open squares), but declined as phase was shifted to 180 deg (filled

triangles).

A

F r e q u e n / ~ L ~ f

Spatially Tuned Linear Filters

Orientation

Cosine Phase I I Basis Filters I J

Switching a n d ~ sharing of ~ $ ~ . ~ outputs from ~ basis filters k~l~ ~ ~llk

Group 1 1 - ~ Filter Synthesis 1

[~ G roup 2

Superposition of ZC maps

Global Feature Map

(i) Input

Plaid image

3 f phase 00

; Spatial filtering

(ii)

/ / ZCs

Output Feature

map

+

FIGURE 11. (A) Schematic outline of the filter combination model. (B) Two worked examples of responses to a standard plaid with 3f added in square-wave phase (cf Figs 6-10). In (i) a small circular filter is used, or equivalently all oriented filters are combined to emulate one. The "squiggly" pattern of ZCs does not correspond to perceived structure. In (ii) oriented filters have been routed into two groups, centred on L and R oblique. Row 2 shows the analogue filter outputs, and row 3 the corresponding ZC patterns. Superposition of the two ZC maps creates a primitive "feature map". The oblique, diamond structure of this map corresponds well to perceived structure. Note that the orientation-specific grouping of filters (ii) is not usually evoked by single- frequency plaids (Fig. 3). Different plaids evoke different patterns of filter grouping, and the grouping can be broken, or

reinforced, by different adapting conditions.

B

Page 12: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3266 M.A. GEORGESON and T. S. MEESE

100

8 0 ¢ -

O

~ 6 0

.o 4 0 t - "

o 2 0

E o 0

i

A i i i i i i i i

O Baseline Adapt

?

I I I I I I I I I

Baseline 1 2 3 4 5 6 7 8 Adapting SF (c/deg)

100

80

60

40

20

0

i i i i i

g O Baseline ~ - Adapt

O

I I I I I

Baseline I 2 3 4 Adapting SF (c/deg)

FIGURE 12. Perceptual combination of plaid components can be enhanced by adaptation to higher frequencies. Test plaid components: 1 c/deg; adapting and test component orientations: 0 and 90 deg. (A) Subject MAG; (B) subject TSM.

of ZC locations. Although the combinatorial rules of this process are not yet fully understood, at least the outlines of it are becoming clear. Figure I I(A) summarizes graphically our proposals, and Fig. 1 I(B) gives a worked example for extraction of ZCs from the < L1,R1,R3 > plaid. In Fig. 1 l[B(ii)] we assume that the (unknown) control process has grouped the filters to represent the preferred combination [L1, Rl + R3]. Active left oblique filters are linearly combined to form a high-pass filter suitable for edge detection. Even though the response at 3f is amplified relative to the response at f (second row of the diagram), this does not introduce ZCs corresponding to the 3f frequency, unless the output amplitude at 3f exceeds that at f. ZCs are recovered separately from the two filter groups (left and right oblique). We assume that these two ZC patterns are then written into a "feature map" that corresponds to the perceived spatial structure. No doubt there are additional processes of interest in this step, such as the linking of local ZCs into elongated contours (Field et al., 1993; Moulden, 1994) or the representation of corners, regions and shape, but they are beyond the scope of this paper.

Figure l l [B(i)] shows what happens instead if a circular filter with the same high-pass frequency response is applied to the plaid. This represents the output expected from the Marr & Hildreth (1980) model. Given this "squiggly" output pattern of ZCs, it seems unlikely that observers would consistently choose diamonds. In particular, the left oblique structure is completely lost in this output, in a way that does not occur in human vision. Thus the Marr-Hildreth model can account for the square structure of a standard plaid, but not the diamond structure created by adding a 3f component.

The cross-channel inhibition hypothesis

We noted above that perception of the < LI,RI,R3 > plaid at 0 deg phase involves the combining of compo- nents at a common orientation [R1 +R3], and the breaking of links [Ll + RI] across different orientations.

We have seen experimentally how adapting to < V~ > or < V~ ,H~ > can break this link. How might the addition of < R3 > to the standard plaid serve to break the link? One general possibility is that these "links" are related to excitatory and inhibitory connections between spatial filters, for which there is considerable evidence from physiology (e.g. Morrone et al., 1982; Bonds & De Bruyn, 1985; Morrone & Burr, 1986; Gilbert, 1995) and psychophysics (e.g. De Valois, 1977; Tolhurst & Barfield, 1978; Georgeson, 1985; Greenlee & Magnus- sen, 1988). In the present example we should need the activation of filters tuned to < R3 > to suppress the bridge channels tuned to < V1 > and < HI >, in order to break the [L1 + Rl] combination. Although we have no direct evidence for this inhibitory link, we report next a very striking aftereffect of adaptation that is consistent with it (Meese & Georgeson, 1996b).

REINFORCING THE COMBINATION: SUPER-SQUARES

Two observers adapted to plaids whose components were at the same orientation as those of the test plaid. The general adapting and test conditions were as described previously. The test plaid was < VI,Hj >, while the adapting plaid was < Vn,Hn > (n = 1-5 or 8). Contrasts and durations were as in the section: Adaptat ion to a plaid. On each trial the observer chose between sketches representing the component ZCs and compound ZCs of the test plaid. Again, the test components were chosen to he horizontal and vertical so that the unadapted, baseline level of component responses would be quite high (open circles in Fig. 12).

Results (Meese & Georgeson, 1996b) are shown in Fig. 12(A, B) for the two observers. The percentage of component responses, pooled over test contrast levels, is shown as a function of the adapting spatial frequency (filled symbols). Adapting at the test spatial frequency (1 c/deg) slightly increased the number of component

Page 13: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS 3267

Plaid Component Orientat ions

+ 15 ° + 30 ° + 45 °

A.

flilttltlttttll[ttlttlt n- i5

1.9 3.8 7.5 15 2.7 5.3 10.6 1.9 3.8 7.5 "15

Component Contrast (%)

Monocular

Compound

Rivalry

[ ] Components

B. Dichoptic

FIGURE 13. (A, B) Proportions of three response types made to monocular and dichoptic presentation of plaids. Perceptual combination of plaid components ("compound" responses) occurred monocularly as usual (A; dark bars) but not dichoptically (B). Test plaid components: 1 c/deg; orientations: :t:15, 30 or 45 deg from horizontal; duration 100 msec. Mean of 3 Ss.

Combination of oriented filter outputs and edge analysis may be early processes in monocular cortical pathways.

responses; this is the aligned condition of Fig. 5(B). However, adapting to a higher spatial frequency (3 c/deg for TSM; 3-5 c/deg for MAG) drastically reduced component responses, or equivalently it increased compound responses that represent the perceptual combination of the test components. The checkerboard perceived after adaptation was so sharp and clear, and apparently sharper than the usual appearance of a standard plaid, that we came to call this the "super- squares" effect. It did not occur when the same set of adapters was rotated by 45 deg, to bisect the angle between the test components (Meese & Georgeson, 1996b). Striking confirmation of the super-squares effect came from testing three na'fve subjects, selected for their low spontaneous level of combination. Without adapta- tion they reported 100% "components" (0% "squares") for a 1 c/deg, +45 deg plaid over the tested range of contrasts, but this changed to around 70% "squares" after adapting to a 3 c/deg, -t-45 deg plaid (Meese & Georgeson, 1996b).

Following the bridge hypothesis, we suggest that this effect reflects an enhancement of filter combination following the removal of an inhibitory influence on the "bridge channels". The inhibitory link would have to run between filters tuned to different orientations and frequencies, separated by about 45 deg in orientation and 1.5-2 octaves in spatial frequency. Thus, for example, R3 would inhibit responses to V], and in the configuration of the present experiment filters responsive to V3 would inhibit the critical bridge channels tuned to R1. After adapting to < V3,H3 > the source of this inhibitory influence would be attenuated, and the bridge channels' activity would increase, thus strengthening the IV1 + HI] link.

This account is, of course, speculative, though broadly consistent with other physiological and psychophysical evidence for disinhibition effects following adaptation (Vantin & Berkley, 1977; De Valois, 1977; Georgeson, 1980; Georgeson, 1985; Greenlee & Magnussen, 1988). At the very least, the "super-squares" effect after adaptation is another pointer to the dynamic and variable nature of filter combination in vision. In the variety of experiments reported here we have seen evidence that the links can be weakened, or reinforced, or switched to another pattern of linking. In order to get some clues as to where in the visual pathway the combining of filters takes place, we have compared the appearance of plaids with monocular and dichoptic viewing of the components.

THE NEURAL SITE OF COMBINATION

The logic of this experiment is simple. The two components of a standard < L1,R] > plaid are presented separately to the left and right eyes. If filter combination takes place at an early, monocular cortical site then the compound "square" appearance should be prevented by this dichoptic presentation. Alternatively, if combination takes place at a binocular site then the "square" appearance should still be possible. There are, however, certain difficulties to be overcome. Presenting orthogonal gratings to the two eyes is a classic recipe for demonstrating binocular rivalry, and if rivalry were the only outcome the experiment would hold little interest. However, Liu et al. (1992) have shown that a stable, non- rivalrous "plaid" perception can be obtained at low contrasts and with brief presentations. This condition of anomalous fusion offers an opportunity to test the logic just described. Liu et al. (1992) did not describe the

Page 14: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3268 M.A. GEORGESON and T. S. MEESE

perceived structure of the dichoptic plaid, and so the question remains open.

The perceived structure o f dichoptic plaids

We used the same display as in the other experiments, but one eye viewed it through a Dove prism. The prism produces a mirror reversal of the image, so that a 1 c/deg grating < Ll > presented to one eye was seen as < R1 > by the other eye, as required for a dichoptic < LI ,RI > plaid. Field size was 5 deg. Duration was reduced to 100 msec to enhance the fusion effect, and component contrast ranged from 1.9 to 15%. Component orientations were ±15, 30 or 45 deg from horizontal. A horizontal axis was chosen so that, at the smallest angle, true binocular fusion would not lead to the appearance of a tilted surface in depth. In the monocular control condition plaids were shown on the display screen, and one eye was covered. Observers used three response categories: "compound", "component" and "rivalry". Practice trials were given to stabilize the use of these categories.

Three observers were tested (MAG, TSM and RK, a naive subject) with 10 trials at each combination of test conditions. Results are shown in Fig. 13, pooled over the three subjects. In Fig. 13(A, B), the stacked bar chart shows the proportion of responses made in each of the three categories, as a function of test component contrast. Firstly, the control data [Fig. 13(A)] confirm the effects of contrast and angle described earlier. Responses were mainly "compound" (dark bars), except for a rise in component responses (cross-hatched bars) when contrast was low at the larger angles. Evidently monocular viewing per se did not alter plaid perception.

In the dichoptic conditions, however, compound responses were very rare [Fig. 13(B)]. Rivalry responses (stippled bars) increased with contrast, but were inde- pendent of angle. The remaining responses were of the component type. These data confirm the finding that a non-rivalrous plaid percept can be obtained dichoptically, especially at low contrast (Liu et al., 1992). Most importantly, they show that when it was not rivalrous the appearance of the dichoptic plaid was almost always of the component-type, not the compound-type. These results imply that perceptual combination of a plaid's components is prevented by dichoptic presentation, and this, in turn, suggests that filter combination is principally an early, monocular process. In terms of the scheme outlined in Fig. 11(A), the experiment suggests that the processes leading to ZC extraction occur mainly in monocular pathways, while the superposition of ZC maps, and other processes operating on the "feature map", reflect later stages capable of binocular integration.

Discussion. The monocularity of ZC extraction may be a functional requirement for stereoscopic vision, since if ZCs serve as matching primitives in stereopsis (Mart & Poggio, 1979; Mayhew & Frisby, 1981) they would need to be extracted by monocular processes, before binocular combination. The monocularity of early stages in the representation of oriented features is consistent with evidence from the tilt aftereffect (TAE) and motion

aftereffect (MAE). As with the MAE (Anstis & Duncan, 1983), TAEs of opposite sign can be induced by adapting the left and right eyes to opposite tilts. In a recent preliminary study, we adapted dichoptically to gratings of 1.2 c/deg, 50% contrast, counterphasing at 1.7 Hz, tilted ±22.5 deg from vertical (or horizontal) in opposite directions for the two eyes, and tested on a vertical (or horizontal) grating of 20% contrast presented for 250 msec. The observer closed the left or fight eye before the test grating appeared, and made a binary choice about the tilt of the test grating. Dichoptic adaptation was "topped-up" for 11.5 sec after each test presentation. For the five observers tested, the perceptual aftereffect was robust, and 96% of responses were in the direction of an eye-specific TAE (where 50% represents chance). The percentages for individual subjects were 100, 100, 100, 91, 88% (n = 32). This result echoes the earlier, analogous finding of an eye-specific spatial frequency shift (Georgeson, 1975). Thus, the orienta- tion-specific (and direction-specific) processing under- lying the TAE (and MAE) can be supported by monocular, eye-specific mechanisms. The fact that interocular transfer of the TAE and MAE after adaptation to first-order, luminance patterns is only partial is also consistent with this conclusion (Moulden, 1980; Paradiso et al., 1989). On the other hand, interocular transfer of the TAE and MAE from second-order, "illusory" contours is complete (Paradiso et al., 1989; Nishida et al., 1994), suggesting that higher stages of coding are fully binocular, and carry no eye-of-origin information. This idea is reinforced by the finding that the second-order TAE was suppressed by binocular rivalry during adapta- tion, but the first-order TAE was not (van der Zwan & Wenderoth, 1994). The latter must occur at an earlier neural site, before the level at which rivalry suppression occurs. These and our results point to oriented filtering and ZC analysis at an early level in area V1.

We found that monocular and dichoptic percepts of a plaid were very different. Our conclusion that filter combination and ZC extraction are largely monocular seems to be at odds with earlier work reporting that presentation of a compound grating (f+ 2f, or f + 3f) dichoptically was seen as a rectangular wave or square- wave structure, just as it was when the two components were presented to the same eye (Maffei & Fiorentini, 1972). The simplest interpretation of that work is that linear binocular summation precedes the analysis of spatial structure--just the opposite of our conclusion for plaids. A possible resolution is that summation across different orientations occurs at an earlier, more mono- cularly dominated site than summation across different spatial frequencies. This is an interesting possibility, but it should be regarded cautiously; our efforts to replicate Maffei and Fiorentini's phenomenon were largely unsuccessful (Georgeson & Meese, 1996).

CONCLUSIONS

In brief, we have seen that the Fourier components of stationary plaids are combined perceptually in a variety

Page 15: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS 3269

of ways, and we propose that the neural basis for these effects is the selective grouping and summation of the outputs of oriented spatial filters, across orientation or across spatial frequency, or both. Our experiments required fairly complex, criterion-dependent judgements, and so it is encouraging to find that using signal-detection methods and d' analysis, other researchers have been led to strikingly similar conclusions about the stage of filter combination (Olzak & Thomas, 1991, 1992; Thomas & Olzak, 1996). To make orientation and other attributes of features explicit the filtering stages must be followed by edge-finding operations (Georgeson, 1994; Akutsu & Legge, 1995; Georgeson & Freeman, 1997). Local Fourier analysis and feature analysis thus appear to be successive stages of early spatial vision, rather than competitive hypotheses about it.

REFERENCES

Akutsu, H. & Legge, G. E. (1995). Discrimination of compound gratings: spatial frequency channels or local features? Vision Research, 35, 2685-2695.

Anstis, S. M. & Duncan, K. (1983). Separate motion aftereffects from each eye and from both eyes. Vision Research, 23, 161-169.

Bonds A. B. & De Bruyn E. J. (1985). Inhibition and spatial selectivity in the visual cortex: the co-operative neuronal network revisited. In Rose, D. & Dobson, V. (Eds), Models of the visual cortex. New York: John Wiley.

Canny, J. (1986). A computational approach to edge detection. IEEE Transactions PAMI, 8, 679-698.

Carpenter, R. H. S. & Blakemore, C. (1973). Interactions between orientations in human vision. Experimental Brain Research, 18, 287-303.

Daugman, J. G. (1988). Pattern and motion vision without Laplacian zero crossings. Journal of the Optical Society of America, A 5, 1142- 1148.

De Valois, K. K. (1977). Spatial frequency adaptation can enhance sensitivity. Vision Research, 17, 1057-1065.

De Valois, R. L., Albrecht, D. G. & Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vision Research, 22, 545-559.

De Valois, R. L. & De Valois, K. K. (1988). Spatial vision. Oxford: Oxford University Press.

Eckhorn, R. (1994). Oscillatory and non-oscillatory synchronizations in the visual cortex and their possible roles in associations of visual features. Progress in Brain Research, 102, 405-426.

Elder, J. H. & Zucker, S. W. (1996). Local scale control for edge detection and blur estimation. In Lecture Notes in Computer Science (Proc. 4th ECCV) (Vol. 1064, pp. 57-69). New York: Springer.

Field, D. J., Hayes, A. & Hess, R. F. (1993). Contour integration by the human visual system: evidence for a local "association field". Vision Research, 33, 173-193.

Foley, J. M. (1994). Human luminance pattern-vision mechanisms: masking experiments require a new model. Journal of the Optical Society of America A, 11, 1710-1719.

Georgeson, M. A. (1975). Mechanisms of visual image processing. Ph.D thesis, University of Sussex, U.K.

Georgeson, M. A. (1980). The perceived spatial frequency, contrast and orientation of illusory gratings. Perception, 9, 695-712.

Georgeson, M. A. (1985). Inferring cortical organization from subjective visual patterns. In Rose, D. & Dobson, V. (Eds), Models of the visual cortex. New York: Wiley.

Georgeson, M. A. (1990). Human vision combines oriented filters to compute edges. Perception, 19, 354.

Georgeson, M. A. (1992). Human vision combines oriented filters to compute edges. Proceedings of the Royal Society of London B, 249, 235-245.

Georgeson, M. A. (1994). From filters to features: location, orientation,

contrast and blur. In Higher order processing in the visual system (Ciba Foundation Symposium 184) (pp. 147-165). Chichester: Wiley.

Georgeson, M. A. (1997). Edge finding in human vision: a model based on perception of plaids. Image and Vision Computing (in press).

Georgeson, M. A. & Freeman, T. C. A. (1997). Perceived location of bars and edges in 1-D images: Computational models and human vision. Vision Research, 37, 127-142.

Georgeson, M. A. & Meese, T. S. (1996). Perception of plaids: oriented filters combine monocularly but not dichoptically. Perception, 25 Suppl, 121.

Georgeson, M. A. & Shackleton, T. M. (1994). Perceived contrast of gratings and plaids: non-linear summation across oriented filters. Vision Research, 34, 1061-1075.

Gilbert, C. D. (1995). Dynamic properties of adult visual cortex. In Gazzaniga, M. S. (Ed.), The cognitive neurosciences. Cambridge, MA: MIT Press.

Graham, N. V. S. (1989). Visual pattern analyzers. Oxford: Oxford University Press.

Gray, C. M., Koenig, P., Engel, A. K. & Singer, W. (1989). Oscillatory responses in cat visual cortex exhibit inter-columnar synchroniza- tion which reflects global sitmulus properties. Nature, 338, 334-337.

Greenlee, M. W. & Magnussen, S. (1988). Interactions among spatial frequency and orientation channels adapted concurrently. Vision Research, 28, 1303-1310.

Hubel, D. H. & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215-243.

Legge, G. E. & Foley, J. M. (1980). Contrast masking in human vision. Journal of the Optical Society of America, 70, 1458-1471.

Liu, L., Tyler, C. W. & Schor, C. M. (1992). Failure of rivalry at low contrast: evidence of a suprathreshold binocular summation process. Vision Research, 32, 1471-1479.

Maffei, L. & Fiorentini, A. (1972). Processes of synthesis in visual perception. Nature, 240, 479-481.

Malik, J. & Perona, P. (1992). Finding boundaries in images. In Wechsler, H. (Ed.), Neural networks for perception Vol 1: Human and machine perception. London: Academic Press.

Marr, D. (1982). Vision. New York: W. H. Freeman. Marr, D. & Hildreth, E. (1980). Theory of edge detection. Proceedings

of the Royal Society of London B, 207, 187-217. Marr, D. & Poggio, T. (1979). A theory of human stereopsis.

Proceedings of the Royal Society of London B, 204, 301-328. Mather, G. & Morgan, M. J. (1986). Irradiation: implications for

theories of edge localization. Vision Research, 26, 1007-1015. Mayhew, J. E. W. & Frisby, J. P. (1981). Psychophysical and

computational studies towards a theory of human stereopsis. Artificial Intelligence, 17, 349-385.

Meese, T. S. (1993) Feature coding in human pattern vision. PhD Thesis, University of Bristol, UK.

Meese, T. S. & Freeman, T. C. A. (1995). Edge computation in human vision: anisotropy in combining of oriented filters. Perception, 24, 603-622.

Meese, T. S. & Georgeson, M. A. (1992). Perceived structure of 3- component plaids: combining and sharing of oriented filter outputs. Perception, 21Suppl. 2, 91-92.

Morrone, M. C. & Burr, D. C. (1986). Evidence for the existence and development of visual inhibition in humans. Nature, 321,235-237.

Morrone, M. C. & Burr, D. C. (1988). Feature detection in human vision: a phase-dependent energy model. Proceedings of the Royal Society of London B, 235, 221-245.

Morrone, M. C., Burr, D. C. & Maffei, L. (1982). Functional significance of cross-orientational inhibition. Part I. Neurophysiol- ogy. Proceedings of the Royal Society of London B, 216, 335-354.

Morrone, M. C., Navangione, A. & Burr, D. C. (1995). An adaptive approach to scale selection for line and edge detection. Pattern Recognition Letters, 16, 667~577.

Moulden, B. (1980). After-effects and the integration of patterns of neural activity within a channel. Philosophical Transactions of the Royal Society B, 290, 39-55.

Page 16: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

3270 M . A . GEORGESON and T. S. MEESE

Moulden, B. (1994) Collator units: second-stage orientational filters. In Higher order processing in the visual system (Ciba Foundation Symposium 184) (pp. 170-184). Chichester: John Wiley.

Nishida, S., Ashida, H. & Sato, T. (1994). Complete interocular transfer of motion aftereffect with flickering test. Vision Research, 34, 2707-2716.

Olzak, L. A. & Thomas, J. P. (1991). When orthogonal orientations are not processed independently. Vision Research, 31, 51-57.

Olzak, L. A. & Thomas, J. P. (1992). Configural effects constrain Fourier models of pattern discrimination. Vision Research, 32, 1885-1898.

Paradiso, M. A., Shimojo, S. & Nakayama, K. (1989). Subjective contours tilt aftereffects and visual cortical organization. Vision Research, 29, 1205-1213.

Regan, D. & Beverley, K. I. (1985). Post-adaptation orientation discrimination. Journal of the Optical Society of America A, 2, 147- 155.

Snowden, R. J. (1992). Orientation bandwidth: the effect of spatial and temporal frequency. Vision Research, 32, 1965-1974.

Thomas, J. P. & Olzak, L. A. (1996). Uncertainty experiments support the roles of second-order mechanisms in spatial frequency and orientation discriminations. Journal of the Optical Society of America A, 13, 689-696.

Tolhurst, D. J. & Barfield, L. P. (1978). Interactions between spatial frequency channels. Vision Research, 18, 951-958.

van der Zwan, R. & Wenderoth, P. (1994). Psychophysical evidence for area V2 involvement in the reduction of subjective contour tilt aftereffects by binocular rivalry. Visual Neuroscience, 11,823-830.

Vautin, R. G. & Berkley, M. A. (1977). Responses of single cells in cat visual cortex to prolonged stimulus movement: neural correlates of visual aftereffects. Journal of Neurophysiology, 40, 1051-1065.

Watt, R. J. (1988). Visual processing: computational, psychophysical and cognitive research. Hove and London: Lawrence Erlbaum.

Watt, R. J. & Morgan, M. J. (1984). Spatial filters and the localization of luminance changes in human vision. Vision Research, 24, 1387- 1397.

Watt, R. J. & Morgan, M. J. (1985). A theory of the primitive spatial code in human vision. Vision Research, 25, 1661-1674.

Wilson, H. R. (1983). Psychophysical evidence for spatial channels. In Braddick, O. J. & Sleigh, A. C. (Eds), Physical and biological processing of images. New York: Springer.

Wilson, H. R., Levi, D., Maffei, L., Rovamo, J. & De Valois, R. (1990) The perception of form: retina to striate cortex. In Spillman, L. & Werner, J. S. (Eds), Visual perception: the neurophysiological foundations. San Diego: Academic Press.

Acknowledgements--The work was supported by SERC and BBSRC (U.K.) project grants GR/D/81022, GR/G/63582 to MAG and by a University of Bristol scholarship to TSM. Thanks to Jan Georgeson and Tom Freeman for lengthy participation in the experiments.

APPENDIX

The zero-crossing rule for edge-finding in machine vision and psychophysics

Does human vision locate edges at zero-crossings? Though often hotly debated, the question is too imprecise to be answered unless it is embedded in a more fully specified theory of edge detection. In order to test a model, one must specify the nature of the preceding filtering, in either I-D or 2-D, the process by which ZCs are located, and the nature of the edge description that is then produced. If the model is to be linked to human performance, then the linking assumptions and other system parameters such as noise levels may have to be made explicit. There have been several widely cited attempts to show that ZCs are not the basis for human edge-finding, giving ZCs a status that (in the words of one reviewer) "many regard as history". We think such a view is misplaced, partly because the psychophysical evidence does not warrant it, and partly because

models of edge detection in machine vision continue to use ZCs as a marker for edge location, while offering new and principled solutions to the problems of noise and spatial scale that have proved difficult in the past (Elder & Zucker, 1996).

Daugman (1988) offered two demonstrations against the use of Laplacian ZCs in human vision. The first involved a vertical, sinusoidal carrier grating whose contrast was amplitude-modulated by a horizontal sinusoid. Daugman showed that ZCs in the output of a Laplacian-of-gaussian (LOG) filter were located only at the ZCs of the carrier grating, irrespective of the presence or absence of modulation. He argued from this that Laplacian ZCs could not account for the fact that the horizontal modulation was very evident to the human observer. This difficulty disappears, however, if we suppose that edges are tagged with a magnitude or contrast value. This edge attribute would co-vary with contrast modulation, and so provide a basis for human perception and discrimination of the different patterns of modulation. Only a very restricted form of binary ZC-based model is rejected by Daugman's first demonstration. His second demonstration concerns images that can be shown mathematically to yield no LoG ZCs at all, but which contain visible shape and form. This is more difficult to evaluate without further mathematical analysis, but we note (a) that early nonlinearity (in the photoreceptors and ganglion cells) modifies the input waveform and so might re-introduce ZCs that were otherwise absent; and (b) that Daugman's argument was not extended to non- Laplacian filters, for example, oriented filters or directional deriva- tives.

Watt (1988) compared data on the precision of edge localization (assessed in a Vernier task) with predictions based on different parameters (peaks, ZCs, centroids) of the I-D LoG response distribution, as a function of contrast. [The data came from the study of Watt & Morgan (1984), but apparently not from the published paper.] To make the comparison, Watt assumed that signal:noise ratio (SNR; in the model) was proportional to contrast (in the experiment). That is, the filters were linear, and the noise had constant variance. On this basis, the variation of ZC location increased far too rapidly with decreasing contrast to be consistent with the data, while the precision of centroid location did a better job. However, it seems unlikely that SNR is in fact proportional to contrast (C). Contrast discrimination experiments suggest instead that SNR may be approximately proportional to C °4 (Legge & Foley, 1980; Foley, 1994) and this idea is also quantitatively consistent with contrast matching experi- ments and with the behaviour of single cortical cells (evidence summarized by Georgeson & Shackleton, 1994). The discrepancy between ZC predictions and Watt 's data would be greatly reduced by this alternative assumption about SNR.

Moreover, even if we accepted that Laplacian ZCs were too noisy to account for discrimination of edge locations, we might seek instead a more robust way of locating the ZC. This has been done computationally by Elder & Zucker (1996), who used space-variant filtering, based on the idea of a minimum reliable scale selected locally within the image. Zero-crossings from second directional derivative filtering, with local scale control, were found to be reliable indicators of edge location in complex 2-D images. A similar, locally adaptive approach has been used by Morrone et al. (1995), but applied to two other edge-finding algorithms.

Also searching for robustness, Watt & Morgan (1985) showed that the centroids of zero-bounded response distributions (ZBRs) are less susceptible to noise than ZCs are, and hence form a more reliable "primitive" for spatial coding. In response to a 1-D edge, LoG filters give rise to two ZBRs (positive and negative) flanking the edge location. Since the edge lies midway between the two centroids, it has been suggested that edge location might be coded by the centroid-of- centroids, and indeed several phenomena of perceived edge location support this view (Mather & Morgan, 1986; Watt, 1988). But since the centroid-of-centroids tracks a point lying between positive and negative parts of the filter response, it is a zero-crossing by another name (Watt, 1988, p.69). Hence rather than rejecting ZCs, the approach via centroids of ZBRs can be seen as a robust way of finding ZCs.

Morrone & Burr (1988) reported one experiment apparently showing that perceived location of edges was well predicted by peaks

Page 17: Perception of Stationary Plaids: The Role of Spatial ... · Oriented lines and edges are key features in the visual structure of images, and it is important to understand the

SPATIAL FILTERS IN EDGE ANALYSIS 3271

of local contrast energy, but not by LoG ZCs. However, Georgeson & Freeman (1997) have shown that this result no longer holds when presentation time is a few hundred msec, rather than prolonged. They discussed this in some detail, and showed that Morrone and Burr's results could be understood through the distorting effect of negative afterimages, coupled with small eye movements, during sustained viewing. Moreover, Georgeson & Freeman (1997) found that the perceived location of edges in compound (f + 3f) gratings was actually very well predicted by the ZCs of a LoG filter of suitable spatial scale (s = 9 min arc; f = 0.4 c/deg). Peaks of local energy, on the other hand,

do not produce enough features to serve as a model for human perception of sine-wave gratings or compound gratings (Georgeson & Freeman, 1997), nor broad-band edges with components at different phases (Georgeson and Barbieri, in prep.).

In sum, ZCs continue to be a useful marker for edge locations, and the focus in this paper is on the filtering operations that lead up to the ZC stage (or its equivalent). The choice is also pragmatic: we find, in general, that with suitable filtering, the pattern of ZCs tends to capture well the perceived structure of the plaids and gratings we have looked at.