Spatial and temporal determinants of context
modulation in human contrast perception
Markku Kilpeläinen
Institute of Behavioural Sciences, University of Helsinki,
Finland
Academic dissertation to be publicly discussed, by due permission of the Faculty of Behavioural Sciences
at the University of Helsinki in Auditorium XII at the Main building, Fabianinkatu 33, on the 19th of April, 2012, at 12 o’clock
University of Helsinki Institute of Behavioural Sciences Studies in Psychology 82: 2012
2
Supervisors: Professor Kristian Donner, PhD Department of Biosciences University of Helsinki Helsinki, Finland Professor Jussi Saarinen, PhD Institute of Behavioural Sciences University of Helsinki Helsinki, Finland Docent Simo Vanni, MD, PhD Brain Research Unit O.V. Lounasmaa Laboratory School of Science Aalto University Espoo, Finland Reviewers: Professor Lynn Olzak, PhD Department of Psychology Miami University Oxford, OH, USA Professor Matti Weckström, MD, PhD Department of Physics University of Oulu Oulu, Finland Opponent: Professor Mark Georgeson School of Life and Health Sciences Aston University Birmingham, UK
ISSN 1798-842X ISSN-L 1798-842X
ISBN 978-952-10-7915-3 (pbk.) ISBN 978-952-10-7916-0 (PDF)
http://www.ethesis.helsinki.fi Unigrafia
Helsinki, 2012
3
Contents
Abstract ....................................................................................................................... 4 Tiivistelmä ................................................................................................................... 5 Acknowledgements .................................................................................................... 6 List of original publications ....................................................................................... 7 Abbreviations .............................................................................................................. 8 1. Introduction ............................................................................................................. 9
1.1 The primate visual system ........................................................................... 11 1.2 The extra-classical receptive field ................................................................ 14
2. Aims of the study .................................................................................................. 17 3. General experimental methods ............................................................................ 18
3.1 Psychophysics ............................................................................................. 18 3.1.1 Subjects ........................................................................................... 18 3.1.2 Stimuli .............................................................................................. 18 3.1.3 Procedures ...................................................................................... 18
3.2 Functional magnetic resonance imaging ...................................................... 19 3.3 Modelling ..................................................................................................... 19
4. The individual studies: methods and results ...................................................... 20 4.1 Study I: A change in mean luminance attenuates perceived contrast in a subtractive, local and transient manner .............................................................. 20
4.1.1 Methods ........................................................................................... 20 4.1.2 Results ............................................................................................. 21
4.2 Study II: Modelling the effects of mean luminance changes and the pedestal effect with the same early retinal mechanisms ................................................... 22
4.2.1 Methods ........................................................................................... 23 4.2.2 Results of the luminance step simulation ......................................... 25 4.2.3 Results of the pedestal effect simulation .......................................... 26
4.3 Study III: The time-course of surround suppression in human visual system 28 4.3.1 Methods ........................................................................................... 28 4.3.2 Results ............................................................................................. 28
4.4 Study IV: The spatial extent of the ECRF mechanisms in human visual system ............................................................................................................... 30
4.4.1 Methods ........................................................................................... 31 4.4.2 Results ............................................................................................. 32
5. Discussion ............................................................................................................. 35 5.1 Modulation caused by changes in the underlying context ............................. 35 5.2 Modulation caused by changes in the adjacent context................................ 38 5.3 The temporal nature of the neural signal ...................................................... 42 5.4 Conclusions ................................................................................................. 43
6. References ............................................................................................................ 44
4
Abstract
The amount of information carried by light from our environment is practically unlimited. The amount of information that can be handled by the biological visual system, in contrast, is strictly limited by the amount of resources (neurons, energy and time) allocated to the system. Within these constraints, the visual system must perform its task of extracting the features most relevant for object recognition and movement planning. To perform the task successfully, the visual system needs to sacrifice the faithful coding of each point’s light intensity. In early visual processing, this is done in (at least) two ways. Firstly, retinal neurons adapt to the local time-averaged luminance in their receptive field, and use most of their dynamic range to signal small contrasts, i.e., deviations from the mean level. Secondly, the system emphasizes spatial and temporal changes. This entails suppressing responses not only to uniform fields of uniform luminance, but also to “repetitive” contrast. This thesis presents measurements and modelling of the spatial and temporal determinants of these forms of information compression in the human visual system. The first key finding of the psychophysical experiments carried out is that the perceived contrast of a target grating is attenuated if the target is presented simultaneously with a change in mean luminance, indicating a lack of adaptation. The attenuation is highly transient and its relative strength decreases with increasing target contrast. A model is presented and supported by simulations, showing that the contrast dependence of attenuation can be explained by two realistic retinal mechanisms: the linear-nonlinear dynamics of cone photoreceptors and the subsequent thresholding and leaky integration of the cone signals by retinal ganglion cells. The model parameters are strictly fixed by known physiology. The same mechanisms also predict the classic pedestal effect, where instead of the mean luminance, the local contrast in an area coextensive with the target changes simultaneously with target presentation. The second key finding of the psychophysical experiments is that the time course of suppression of the perceived contrast of a target grating by a non-overlapping surround grating depends on both the contrast and area of the surround. The contrast effect can be explained by the dynamics of the early retinal response, but the area effect cannot be. Thus, the spatial properties of surround suppression in the human visual system were further explored with neuroimaging, psychophysics and modelling. The structure was found to be similar in humans and non-human primates, and moreover, in agreement with the psychophysically observed effect of surround area on the timing of suppression. On a more general level, the results support the idea that the very early parts of the signals of photoreceptors and subsequent visual neurons are critical in the transmission of visual information.
5
Tiivistelmä
Ympäristössämme on tarjolla käytännössä rajaton määrä näköinformaatiota. Näköjärjestelmäämme sen sijaan rajoittaa ainakin se, kuinka paljon hermosoluja, energiaa ja aikaa sillä on käytössään. Näiden rajoitusten alaisena näköjärjestelmän täytyy kyetä poimimaan näköinformaatiosta ne piirteet, jotka eniten auttavat meitä esineiden tunnistamisessa ja toiminnan ohjauksessa. Ilmeisesti tästä tehtävästä suoriutuminen on edellyttänyt luopumista ympäristön jokaisen pisteen kirkkauden koodaamisesta. Sen sijaan näköjärjestelmä sopeutuu näkökentän eri osien paikalliseen keskiluminanssiin ja käyttää hermosolujen vastealueen pääasiassa ilmaisemaan pieniä, paikallisia poikkeamia tuosta keskiluminanssista. Lisäksi, järjestelmä käyttää hermosolujen vasteita pääasiassa ajassa ja tilassa tapahtuvien muutosten ilmaisemiseen. Solut eivät juurikaan reagoi tasaisiin pintoihin ja toistuvaan tekstuuriinkin vähemmän kuin yksittäisiin reunoihin. Tässä väitöskirjassa selvitetään yllä kuvattujen, ihmisen näköjärjestelmän suorittamien informaatiomuunnosten ajallisia ja spatiaalisia reunaehtoja. Ensimmäinen kokeellinen päätulos on se, että kontrastiärsykkeen havaittu kontrasti vaimentuu, kun se esitetään samanaikaisesti keskiluminanssin muutoksen kanssa. Vaimennusvaikutus poistuu melko nopeasti ja sen suhteellinen voimakkuus heikkenee ärsykkeen kontrastia kasvatettaessa. Vaimennus ja sen ajalliset piirteet selittyvät adaptaatiomekanismeilla. Väitöskirjassa esitetään mallisimulaatioita, joiden mukaan vaimennuksen kontrastiriippuvuus voidaan selittää kahdella realistisella verkkokalvon mekanismilla: reseptorisoluvasteen lineaarisen alkunousun muuttuminen kompressiiviseksi huippuvasteeksi ja ganglion-solujen suorittama reseptorivasteiden kynnystys ja integrointi. Samoilla mekanismeilla pystytään myös tuottamaan klassinen pedestaali-efekti, jossa keskiluminanssin sijaan muuttuu kohdeärsykkeen alla oleva kontrastirakenne. Toinen kokeellinen päätulos on, että ajoitus, jolla ei-päällekkäinen taustakontrasti maksimaalisesti vaimentaa keskustan havaittua kontrastia, riippuu taustan kontrastista ja pinta-alasta. Kontrastin vaikutus selittyy samalla retinan dynamiikalla kuin luminanssimuutoksen vaikutuksen kontrastiriippuvuus, mutta pinta-alan vaikutus ei. Tästä syystä väitöskirjassa tutkitaan vielä ympäristön aiheuttaman vaimennuksen spatiaalisia rajoitteita aivokuvantamisen, psykofysiikan ja mallinnuksen avulla. Vaimennuksen spatiaalinen rakenne oli samankaltainen kuin apinoilla, ja myös sopusoinnussa sen kanssa, mikä oli taustan pinta-alan vaikutus vaimennuksen ajoitukseen. Väitöskirjan tutkimukset tukevat ajatusta, jonka mukaan näköjärjestelmän hermosolujen viestinnässä signaalin tärkein vaihe on sen lyhyt alkuosa.
6
Acknowledgements
This work was conducted mostly in the stimulating environment of the Department of Behavioural Sciences in the University of Helsinki. The skills of my supervising professor Kimmo Alho were needed many times in solving various issues of academic administration. Study IV was conducted in the Advanced Magnetic Imaging Centre at the Aalto University. I have been privileged to receive a 4-year salary from the National Doctoral Programme of Psychology, which has enabled me to concentrate on the thesis without the need to constantly worry about funding. This thesis has many supervisors, and it is safe to say that I have received the best and most diversified education any doctoral student could wish for. The clearly observable enthusiasm and insightfulness of Pentti Laurinen (1945-2009) originally drew me to the study of the visual system. A clear manifestation of his enthusiasm is that although he retired on the day I started my full time doctoral studies, he carried on instructing me and has participated in three of this thesis’ studies. At the time when Pena could no longer participate as actively, I was very lucky to be hit by a second intellectual lightning, Professor Kristian Donner. I don’t know whether it is a result of conscious self education or simply the result of talent and vast experience, but Kristian has truly developed the mentoring process to an art. He always emphasizes the student’s own responsibility and competence and always puts quality above quantity and haste. Thank you, Kristian, for the opportunity to enjoy your art. During the rather confusing process of departing from Pena’s care, Professor Jussi Saarinen and Docent Simo Vanni kept me instructed, employed, funded and motivated. Without them, the thesis might never have been completed. Discussions with my colleagues have been instrumental in making the work as interesting as it is. They have often known the solution to something I’ve found puzzling and when they haven’t, they have been eager to think about it with me. Thank you Ilmari, Jenni, Jussi, Kaisa, Lari, Tarja, Viljami and, especially Lauri, for thinking with me. Talking about the visual system and various world visions with other colleagues within the Helsinki region has been a source of academic and personal satisfaction. Thanks are due to Dr. Linda Henriksson, Dr. Hanna Heikkinen, Dr. Frans Vinberg, Dr. Juha Silvanto, Dr. Petri Ala-Laurila, Prof. Ari Koskelainen, and many others. I also thank the two expert reviewers, Prof. Lynn Olzak and Prof. Matti Weckström. Besides the vision science group, my other fellow doctoral students have also been an essential source of normality in the often quite solitary job of a doctoral student. Thank you, Annika, Janne and Sointu, for the normalization lunches. In my childhood home, both education and thinking for oneself were held in great value. As a result, my sister was excellent in school and a suitable role model in many ways and I stayed in school for about 27 years. Thank you mum, dad and Kirsi for the spark to learn. Nothing about my sweet wife Anna could be said here in a way that would truly express her significance. What is clear, though, is that while being a warm, loving and fun companion, her pride and faith in my work has always remained strong even at the times when mine have hit a low. Thank you, Anna. It has also been a privilege to share with you the wisdoms of our dearest and wittiest teacher, Lotta.
7
List of original publications
I Kilpeläinen, M., Nurminen, L., & Donner, K. (2011). Effects of mean luminance
changes on human contrast perception: Contrast dependence, time-course and
spatial specificity. PLoS ONE, 6, e17200.
II Kilpeläinen, M., Nurminen, L., & Donner, K. (in press). The effect of mean
luminance change and grating pedestals on contrast perception: Model
simulations suggest a common, retinal, origin. Vision research.
III Kilpeläinen, M., Donner, K., & Laurinen, P. (2007). Time course of suppression
by surround gratings: Highly contrast-dependent, but consistently fast. Vision
Research, 47, 3298-3306.
IV Nurminen, L., Kilpeläinen, M., Laurinen, P., & Vanni, S. (2009). Area summation
in human visual system: Psychophysics, fMRI, and modeling. Journal of
Neurophysiology, 102, 2900-2909.
The articles are printed with the permission of the copyright holders.
8
Abbreviations
BOLD blood-oxygenation-level-dependent (contrast) cpd cycles per degree (of visual angle) fMRI functional magnetic resonance imaging ECRF extra-classical receptive field LGN lateral geniculate nucleus RF receptive field VOI voxel of interest V1 primary visual cortex
9
1. Introduction
The visual system continuously provides us with an abundance of information that
promotes survival and reproduction. Although in modern industrialized societies
blindness does not necessarily lead to elevated mortality (Krumpaszky, Dietz, Mickler
& Selbmann, 1999), quality of life is nevertheless compromised. For example, cataract
surgery increases the patients’ quality of life considerably, even if the operation leaves
vision clearly below normal levels (Desai, Reidy, Minassian, Vafidis & Bolger, 1996).
People who do not struggle with significant vision related problems are tempted to
take for granted the visual information and the apparent ease of acquiring it. It is
possibly this ease that creates the illusion that the visual system somehow transfers an
accurate copy of our surroundings to our consciousness. In reality, the visual system
actively reconstructs the visual scene as a stream of neuro-chemical signals. The signals
are transferred through the visual system and modulated to varying degrees before they
are combined with stored information to produce conscious percepts and behavioural
responses. It is easy to accept that only representations of elephants can exist inside
one’s head, but why bother with modulating the signals representing the elephant? Why
not make the representation precisely like the elephant in front of the beholder?
The visual system faces the monumental task of processing a practically unlimited
amount of visual information with a biological system constrained at least by the
amount of neurons and energy allocated to the system as well as the range of intensities
the neurons are able to cover (Niven & Laughlin, 2008). Obviously, a considerable
amount of modulation of the neural signals is necessary to transfer as much as possible
of the most relevant information with the limited biological system.
The most compelling need for modulation comes from the fact that the mean light
levels in our surroundings vary by at least 109 -fold from a moonless night to a sunny
day. The largest changes, such as sunrise/sunset or moving from a dense forest to a
sunny, open plain, occur globally, affecting our entire visual environment similarly.
Such changes are covered by relatively slow processes, such as pupil size adjustment,
sensitivity control of the retinal neurons and a switch between different photoreceptors
systems (reviewed by Walraven, Enroth-Cugell, Hood, MacLeod & Schnapf, 1990).
However, the efficient extraction of relevant features from the visual scene requires
modulation that operates on the same spatial and temporal scales as the features
10
themselves. This thesis concentrates on such fast and local interactions. The different
types of phenomena considered may be described as examples of contextual
modulation. In each case the perceptual responses to a certain target stimulus (e.g., a
wrinkle on the elephant’s trunk) are affected by other stimuli (e.g., other wrinkles).
One of the most studied forms of contextual modulation consists of effects brought
about by fast changes in local light level within which the target stimulus is observed.
When looking at a common day-light natural scene, the light levels falling on different
parts of the retina easily change by a factor of 10-100 from one fixation to another
(Frazor & Geisler, 2006). The photoreceptors could, in principle, simply adjust their
response range so that they would respond maximally to the highest light level they
typically meet and only just to a very faint light level. However, that would result in an
extremely poor ability to signal moderate contrasts (such as some trunk wrinkles) and
significant amounts of energy would be used just to signal the average light level. The
biological visual system has, instead, evolved to adjust the sensitivity of photoreceptors
and other neurons to the local light levels very rapidly. By shifting their intensity
response function so that the steep part of the function is aligned with mean intensity,
the neurons retain maximal sensitivity to the contrasts typically encountered (Rieke &
Rudd, 2009; Shapley & Enroth-Cugell, 1984). However, such adjustment is not
instantaneous. The extent of success in normal visual behaviour, where light levels
change rapidly with each eye movement is one question without an unequivocal answer
in the literature (Geisler, 1978; Mante, Frazor, Bonin, Geisler & Carandini, 2005).
Visual perception is not merely affected by context elements that are coextensive
with the target. Strong modulating effects can originate from completely non-
overlapping, even distant areas of the visual field (Nurminen, Peromaa & Laurinen,
2010; Westheimer, 1967). Since the post-retinal visual system operates with borders
signalled by contrast, rather than uniformly lit areas (Kurki, Peromaa, Hyvärinen &
Saarinen, 2009; Whittle, 1994), interactions between adjacent contrast elements are
critical to understand. One such effect that has been observed on virtually all levels of
the visual system is the so-called surround modulation, where the perceptual or neural
responses to a contrast stimulus are modulated by surrounding contrast stimulation
(Ejima & Takahashi, 1983, 1985; Maffei & Fiorentini, 1976). The fundamental function
of the phenomenon has not been resolved (reviewed by Carandini & Heeger, 2012), but
11
one quite certain consequence of it is an attenuation of less informative texture (e.g.,
trunk wrinkles) in relation to the more important solitary edges of objects (the trunk,
Grigorescu, Petkov & Westenberg, 2003; Nothdurft, Gallant & Van Essen, 1999).
1.1 The primate visual system
Everything we see is light. Light consists of elementary particles called photons. From a
light source or a reflecting surface, light travels through the pupil to the back of the eye,
where the retina is located (Figure 1). The retina consists of multiple layers of
morphologically different neurons (inset in Figure 1). At the very back of the retina are
the photoreceptors containing visual pigment, i.e., molecules that can absorb photons.
Through the electrochemical process of phototransduction, the photoreceptors signal the
absorption of photons with electric responses that are synaptically transmitted to
second-order neurons. The rod photoreceptors (dark blue in Figure 1) are able to signal
the reception of single photons reliably (Rieke & Baylor, 1998) and thus provide vision
at very low light levels. The cone photoreceptors need much more light for useful
responses, but provide the fast, high resolution vision available to us at higher light
levels. The three different cone types (red, green, and light blue in Figure 1), tuned to
different photon energies of light, make colour vision possible.
Figure 1. A schematic picture of the lower levels of the human visual system. The inset represents a
cross-section of the retina. Credits: University of Michigan Kellogg eye center, Helga Kolb, Webvision.
Reproduced under the Creative Commons License.
12
The output of the retina consists of the spike trains produced by the retinal ganglion
cells (purple in Figure 1). Approximately 90 % of the ganglion cells in the primate
retina represent three subtypes (Dacey, 2007). The midget ganglion cells are the most
abundant, possess the highest spatial resolution and process red-green colour
information. The parasol ganglion cells have the highest contrast sensitivity and
temporal resolution. The small bistratified ganglion cells process blue-yellow colour
information. All of these ganglion cell types have an approximately circular centre-
surround receptive field (RF) structure (Figure 2A, top). They respond maximally when
the light in the RF centre, but not in the surround, increases (ON-centre) or decreases
(OFF-centre). Such a receptive field also enhances responses to borders between light
and dark surfaces (see, Figure 2B, for an example). Instead, a uniformly lit surface
leaves excitation and inhibition in balance and results in little, if any, firing.
Figure 2. A) Examples of receptive fields of retinal ganglion cells and V1 neurons. B) The receptive field
of a neuron characterizes its response to a given stimulus.
In concert with the above described retinal neurons, the rest of the retinal circuitry,
with altogether 50 different cell types identified so far, performs extremely sophisticated
computations on the visual signals (see, Werblin, 2011, for a review). It is currently
unclear, to what extent all the cell types and computational properties described in non-
primates apply to the primate retina. However, the idea that all the advanced functions
found in the retinas of “lower” vertebrates are implemented only in the cortex in
13
primates, is a harsh and increasingly outdated oversimplification (Field & Chichilnisky,
2007; Gollisch & Meister, 2010).
The lateral geniculate nucleus (LGN, Figure 1) of the thalamus is the main target of
the information stream from the retina. The LGN has been considered rather
uninteresting in terms of feedforward visual processing, mainly because the number of
neurons in the LGN, and the RF properties thereof, appear to be determined by the
projections from the retina (Ridder, 2006). Also, information from the parasol, midget
and small bistratified ganglion cells, as well as from the two eyes, is kept segregated in
the LGN. Instead, the LGN is considered a gate-keeper of the visual stream. At its most
basic, LGN neurons filter solitary spikes from the retina (Carandini, Horton & Sincich,
2007), which stops most of the maintained activity of single retinal ganglion cells from
entering the cortex. More elaborately, the LGN receives an abundance of connections
from the cortex and other thalamic structures and is hence able to modulate retinal
inputs according to rather high-level parameters such as direction of voluntary attention
(McAlonan, Cavanaugh & Wurtz, 2008; O'Connor, Fukui, Pinsk & Kastner, 2002).
The primary visual cortex (V1, Figure 1), in contrast to the LGN, provides an
obvious progressive step in the analysis of visual information. First and foremost, the
majority of V1 neurons have elongated receptive fields (Figure 2A, bottom) and are
thus selective to the orientation of an edge or stripe in their receptive field (Hubel &
Wiesel, 1962, 1968). A neuron of the primary visual cortex may not respond at all to an
otherwise appropriate stimulus, if its orientation does not match the neuron’s receptive
field (Figure 2B). The visual cortex is organized in vertical columns. Each column
contains, in a circular arrangement, cells sequentially representing all orientations. As a
result, each column informs the rest of the visual system about the orientation of a
segment at a particular location in the visual field (Hubel & Wiesel, 1968). The V1 is
also the first level of the visual system, where information from the two eyes is
combined. In a dimension orthogonal to that defining orientation, there is a columnar
sequence of cells progressing from “pure” left-eye to “pure” right-eye dominance
(Bartfeld & Grinvald, 1992). Together the orthogonal sets of orientation and eye-
dominance slabs make up a so-called hypercolumn. Some V1 neurons are selective to
binocular disparity, which is an important depth cue (Cumming & Parker, 1999), but
depth analysis appears to occur predominantly on higher levels of the visual cortex
14
(Parker, 2007). Interestingly, the large number of neurons in V1 may be entirely
determined by the need to preserve the spatial resolution provided by the number of
ganglion cells while adding orientation selectivity with comparable precision (Stevens,
2001).
In addition to V1, there are some 15-20 distinct cortical areas that predominantly
process visual information (Van Essen & Maunsell, 1983). However, V1 mainly
projects to three areas of the visual cortex, namely V2, V3 and MT/V5 (Casagrande &
Kaas, 1994). These connections are reciprocal, i.e., there are also feedback connections
from those areas back to V1.
1.2 The extra-classical receptive field
The receptive field of a neuron expresses the location and shape of the area where light
stimulation can evoke a response in the neuron. As shown in Figure 2, the classical RF
generally includes areas where light increases (white) or decreases (black) the response.
However, the response can be modulated by additional stimulation outside the classical
RF. The combination of these modulation-mediating areas with the classical RF is often
called the extra-classical receptive field (ECRF).
The extra-classical receptive field of retinal ganglion cells takes two forms. The
response to the stimulation of the RF can be modulated by adding either uniform
luminance or contrast patterns outside the RF (Shapley & Victor, 1979; Solomon, Lee
& Sun, 2006). These ECRF properties appear to be inherited to some extent by neurons
in V1 (Webb, Dhruv, Solomon, Tailby & Lennie, 2005).
The V1 adds something new to the ECRF, however. In contrast to subcortical
neurons (Bonin, Mante & Carandini, 2005; Solomon et al., 2006; Solomon, White &
Martin, 2002), the ECRFs of V1 neurons are clearly orientation selective (Cavanaugh,
Bair & Movshon, 2002b; Knierim & van Essen, 1992), although not quite as strongly as
their classical RFs are.
Figure 3A presents a typical measurement of a V1 neuron’s ECRF. A grating
matched to the location and properties of the neuron’s conventional receptive field is
increased in diameter (see Figure 3A). Typically, an increase in the stimulus diameter
first increases the neuron’s firing rate. The stimulus diameter at which the response
peaks is often considered the neuron’s summation field. After the peak in firing rate, an
additional increase in stimulus size starts to suppress the firing. At some point the firing
15
rate plateaus at a level lower than the maximum response but higher than spontaneous
activity. The stimulus diameter at which the firing rate plateaus is considered the
neuron’s surround field.
Figure 3. A) In comparison to a stimulus that approximately matches the receptive field, the response of
neurons in V1 increases with increasing stimulus size. However, after a point, additional increase in
stimulus size suppresses the response. B) When a centre grating is surrounded by a collinear grating, the
perceived contrast of the centre clearly decreases. The effect is significant but considerably weaker if the
surround grating is orthogonal. The physical contrast of all the centre patches is the same.
Measuring a correlate of the ECRF in the human visual system is complicated.
Imaging methods have been used successfully to measure the strength of ECRF effects
(Williams, Singh & Smith, 2003; Zenger-Landolt & Heeger, 2003), but only
psychophysical methods currently have sufficient spatial and temporal resolution to
enable measurement of the spatial structure and temporal dynamics of ECRF effects.
However, psychophysical methods of course measure the final output after multiple
stages of processing, and the relative contributions of the different stages can be only
partially disentangled with experimental designs. In addition, there is no straightforward
perceptual correlate to neural firing rates, especially for supra-threshold stimulation.
Nevertheless, many psychophysically observed figure-ground interactions have been
hypothesized to be functional correlates of ECRF type mechanisms (reviewed by Seriès,
Lorenceau & Frégnac, 2003). For example, when a grating patch is surrounded by
similar grating, the perceived contrast of the patch is clearly affected. With some
stimulus parameters, the effect is incremental, but more often it is suppressive (Cannon
16
& Fullenkamp, 1991; Ejima & Takahashi, 1985; Olzak & Laurinen, 1999). The effect
can be observed by comparing the panels 1 and 2 in Figure 3B, while fixating the black
dot in between. The effect is strongly orientation selective, a surround of matching
orientation causing much stronger suppression than one of orthogonal orientation
(compare panels 2 and 3 in Figure 3B). In this respect, perception seems to correspond
to V1 neurons rather than sub-cortical neurons.
The mechanisms behind the ECRF of V1 neurons themselves have currently not
been fully resolved. Originally, horizontal connections from other V1 neurons with
adjacent visual field coverage and similar orientation preferences were, quite logically,
assumed to be critical (Gilbert & Wiesel, 1989; Grinvald, Lieke, Frostig & Hildesheim,
1994). However, such connections may be too short and slow to account for the effects
observed experimentally (Angelucci et al., 2002; Bair, Cavanaugh & Movshon, 2003).
Instead, feedback connections from higher visual areas appear to be better suited.
Firstly, receptive fields in those areas, especially in MT/V5 are much larger than in V1.
Secondly, the feedforward and feedback axons conduct much faster than axons of
horizontal connections (Girard, Hupe & Bullier, 2001). Thus, a feedforward-feedback
loop between V1 and a higher visual area would allow the suppressive signals to
propagate from a larger distance and with higher speed. Out of the three areas
predominantly targeted by V1 projections, area V5/MT is likely to have a dominant role
in mediating the suppression. Firstly, surround modulation is unaffected by inactivation
of V2 (Hupé, James, Girard & Bullier, 2001). Secondly, at least in some V1 neurons the
visual area from which suppression is integrated (where the descent from maximum
plateaus in Figure 3A) is so large, that only MT/V5 neurons seem to have large enough
receptive fields (Angelucci et al., 2002). Importantly, though, both the summation field
and the surround field of the V1 ECRF are likely to be formed by a combination of
feedforward, horizontal and feedback connections, with relative roles determined by
stimulus properties, such as contrast and spatial arrangement (Schwabe, Obermayer,
Angelucci & Bressloff, 2006).
In this thesis, the term surround suppression refers to the electrophysiological effect
described in Figure 3A and the psychophysical effect demonstrated in Figure 3B.
17
2. Aims of the study
Study I
To characterize the effects of an abrupt change in mean luminance on human contrast
perception.
Study II
To analyse, by means of physiologically realistic modelling, to what extent the results
of Study I as well as the classical “pedestal effect” can be explained by known
properties of early retinal processing.
Study III
To psychophysically determine the time-course of surround suppression as a function of
surround contrast and surround size.
Study IV
To analyze the spatial structure of surround modulation mechanisms in the human
visual system with neuroimaging, psychophysics and modelling.
18
3. General experimental methods
3.1 Psychophysics
3.1.1 Subjects
All subjects who participated in the psychophysical experiments were young adults with
normal or corrected to normal vision. In all psychophysical experiments, at least one
subject was naïve to the purposes of the study.
3.1.2 Stimuli
The stimuli were always presented on high-quality linearized CRT monitors with an
800x600 resolution. Presentation of stimuli was controlled by the Vision Works 3
(Vision research graphics, Durham, NH) or the ViSaGe (Cambridge research systems,
Kent, UK) system. The display was always the only light source in the room.
3.1.3 Procedures
In all the main experiments, the 2-interval forced choice procedure was used. When
perceived contrast was measured (Studies I and III), the subject was first presented with
a target stimulus with the same contrast throughout the measurement and then a
comparison stimulus, the contrast of which was varied from trial to trial. The subject’s
task was to tell, which of the two stimuli had the higher contrast. The value of perceived
contrast was determined with a staircase method. The contrast of the comparison
stimulus was decreased if the subject judged it to be higher and decreased if the contrast
of the target had been perceived as higher. A point where the progress of the
comparison contrast changed direction was considered a reversal point. The value of
perceived contrast is given by the mean of the reversal points. When discrimination
threshold was measured (Study IV), the subject was presented with an identical mask
stimulus in both intervals, with a target added on the mask randomly in one of the
intervals. The subject’s task was to tell, in which of the intervals the target was present.
The threshold contrast was determined with a method of constant stimuli. The
percentage of correct responses was determined for a range of target contrasts and the
target contrast that led to 75 % correct responses was considered the threshold contrast.
19
3.2 Functional magnetic resonance imaging
In Study IV, the blood oxygenation level dependent (BOLD) signal was measured with
3T General Electric Signa EXCITE (General electric medical systems, Milwaukee, WI)
scanner with an eight-channel receiver coil (repetition time 1.8 s, voxel size (2.5 mm)3.
Borders of retinotopic visual areas were identified with the phase-encoded retinotopic
mapping (Sereno et al., 1995) or multifocal mapping (Vanni, Henriksson & James,
2005). The voxels (volume pixels), from which BOLD signal change was analyzed,
were determined with a functional localizer. The voxels used in the main analyses
within each retinotopic area were those with the highest t-value when the BOLD signal
caused by a high contrast grating (diameter 2 deg) was contrasted with activity during
rest. Both gradient echo and spin echo fMRI sequences were used, but spin echo was
determined more spatially specific and all the conclusions are based on data from spin
echo fMRI. The stimuli were presented with projected with a linearized Christie X3TM
(Christie Digital Systems Ltd) data projector with 1024 x 768 pixel spatial resolution
and 75 Hz refresh rate to a semitransparent screen, subtending 40 x 31 degrees of visual
angle at 34 cm viewing distance. The functional data were preprocessed and analyzed
with SPM2 (Wellcome Department of Imaging Neuroscience, London, UK)
(Frackowiak et al., 2003) Matlab toolbox. The BOLD signal change in each stimulus
condition was estimated by fitting a general linear model to the time-series data.
3.3 Modelling
All model computations were performed with Matlab 7 (MathWorks Inc, Natick, MA,
USA). All integration operations were conducted with numerical methods (ODE45 and
Euler’s in Study II, dblquad in Study IV).
20
4. The individual studies: methods and results
4.1 Study I: A change in mean luminance attenuates perceived
contrast in a subtractive, local and transient manner
During natural viewing, every eye movement changes the contrast and mean luminance
falling on each part of the retina very abruptly and often substantially. The visual
system quickly adapts to the new image statistics at each retinal location. However,
such adaptation can never be completely instantaneous (Rieke & Rudd, 2009). It is thus
not a surprise that detecting small luminance-defined targets against a background
becomes harder if the luminance of the background is changed simultaneously (Geisler,
1978; Poot, Snippe & van Hateren, 1997). It is more surprising that the contrast
responses of single neurons in the cat LGN and V1 have been reported to be
independent of co-occurring changes in mean luminance (Geisler, Albrecht & Crane,
2007; Mante et al., 2005). Study I sought to clarify this quite profound discrepancy.
4.1.1 Methods
The effect of mean luminance change on perceived supra-threshold contrast was
measured psychophysically in three subjects (1 naïve). The subjects viewed the screen
monocularly through an artificial pupil (diameter 4.1 mm, natural pupil dilated) from a
distance of 103 cm. The stimuli were circular patches of sine wave grating (spatial
frequency 4 cpd, orientation horizontal) or bars of increment or decrement luminance.
The Michelson contrast of the target stimulus was varied (0.1–0.45) between
measurements. The contrast of the comparison was varied during the measurement,
according to the staircase method. The target grating was presented together with an
upward or downward step of mean luminance (between 185 and 1295 Td,
corresponding to 14 and 98 cd/m2), either simultaneously or with various delays (50–
800 ms). The time-course of a single trial is presented in Figure 4. One measurement
proceeded as follows. First, the subject adapted to the baseline luminance of the
measurement for a minimum of 40 seconds. Then, a trial was presented: a fixation
stimulus was followed by an abrupt change in mean luminance. The target stimulus
(duration 100 ms) was presented with either a simultaneous or a delayed onset relative
to the change in mean luminance. A comparison stimulus was presented 1800 ms after
the luminance change. The baseline luminance was shown for 8 seconds before the next
21
trial. The subject's task was to indicate, whether the target or the comparison stimulus
appeared to have a higher contrast. Trials were continued until both staircases reached 8
reversal points.
Figure 4. After a longer period of adaptation to the baseline luminance, the mean luminance changes
abruptly. The target grating is presented either simultaneously or with a delay of 50-800 ms. The
comparison grating is presented approximately 1800 ms later, after which the mean luminance returns to
the adaptation value.
4.1.2 Results
The perceived contrast of the supra-threshold grating was consistently attenuated by the
co-occurring change in mean luminance (Figure 5A). For contrasts stronger than 0.1-
0.2, absolute attenuation was independent of target contrast, i.e. subtractive, as indicated
by the linear fits (dashed lines in Figure 5A). The attenuation subsided in some 200-400
ms (Figure 5B). Upward steps caused somewhat stronger (mean suppression strength 36
% vs. 17%) and longer lasting (mean time constant 157 ms vs. 78 ms) attenuation than
downward steps. The pattern of results was the same when increment or decrement bars
were used as stimuli instead of gratings and when the mean luminance was changed
only in the immediate vicinity of the target. Finally, attenuation was also clearly present
when detection thresholds were measured instead of perceived contrast.
22
Figure 5. A change in mean luminance simultaneous with, or closely preceding, the presentation of a
contrast target attenuates perceived contrast consistently, but transiently. A) Perceived contrast of the
target grating as a function of physical contrast of the grating for three subjects. Dashed lines are linear
fits with slopes 0.995 (S1), 0.997 (S2) and 1.040 (S3).B) Perceived contrast of the target grating as a
function of target onset relative to the luminance change onset for two subjects. Smooth curves are best
fitting exponential functions. Data points (and accompanying error bars) are the means (and standard
deviations) of at least 4 staircases. Data of subject S3 was very similar to that of S1.
4.2 Study II: Modelling the effects of mean luminance changes
and the pedestal effect with the same early retinal mechanisms
The attenuation of perceived contrast caused by a simultaneous step in mean luminance
(Study I) can be explained by response compression due to a saturating non-linearity of
cone responses when there is insufficient time for adaptation (Schnapf, Nunn, Meister
& Baylor, 1990). The observed gradual release from attenuation with time would then
mainly reflect the adaptation process. However, the fact that absolute attenuation is
largely independent of contrast (implying that the relative attenuation strength decreases
with increasing stimulus contrast) is not consistent with such an explanation. The simple
prediction based on a saturating non-linearity would be that all contrasts are scaled
according to the compressed dynamic range of the cones, leading to contrast dependent
(divisive) attenuation. In Study II, we developed a computational model to study to
what extent the contrast dependence of the luminance step effect could be explained by
retinal mechanisms that are definitely involved in mediating the contrast signals to
23
higher levels of the visual system. The basic signal considered in the model is the early
rise of cone photoreceptor responses. Since discriminating a contrast target presented on
top of a spatially coextensive “pedestal” is basically a similar problem from the
viewpoint of that cone signals, we also studied whether the same model could explain
the well-known and intensely investigated pedestal effect (Nachmias & Sansbury,
1974).
4.2.1 Methods
All vision depends on the responses of photoreceptors and on the encoding of the
graded photoreceptor signals into trains of action potentials transmitted by retinal
ganglion cells. In the current implementation, the response of each cone to a light
stimulus caused by its stimulating pixel was calculated according to the cone response
model of Baylor, Hodgkin and Lamb (1974, their equations 8, 33, and 46). The values
of all parameters were taken from reports of relevant electrophysiological experiments.
No free parameters were used. The second, equally pertinent mechanism is the
transformation of the graded cone signals to impulse frequencies, carried out by the
retinal ganglion cells. In the model, the “ganglion cell” was modeled as a leaky
integrate-and-fire operator. The neuron’s membrane voltage V is given by equation (1).
))((1 InhURVdtdV
m
(1),
where τm is the membrane time constant (2.7 ms, Weber & Harman, 2005), U is the
input current response in pA, R is input resistance (here normalized to 1 GΩ, to simplify
the numerical implementation) and Inh is a static inhibition applied to all cone
responses. The model neuron creates a spike when the membrane voltage crosses the
spike threshold (0.13 mV in the luminance step simulation and 0.195 in the pedestal
effect simulation). The strength of inhibition (3.95 pA) was set to match the
physiological spontaneous activity of macaque ganglion cells (Brivanlou, Warland &
Meister, 1998; Croner, Purpura & Kaplan, 1993). The spike frequencies (SF) were then
calculated as (tx-t1) / (x-1), where t is time of the spike and x (here 3) is the number of
spikes considered to carry the signal. Data were qualitatively the same for x = 4.
24
The simplest signal on which an observer could base contrast perception is the
difference between peak and mean intensities. Thus, in the luminance step simulation,
the difference of the cone responses to the peak and the mean luminances was used as
the input to the leaky integrator. In contrast, in a contrast discrimination task,
determining stimulus contrast is neither necessary nor efficient. Instead, the observer
needs to find a large enough difference in any pixel of the stimulus. Thus, in the
pedestal effect simulation, cone responses from 32 units, spanning luminance levels
from peak to mean, were used as input.
To compare the model results with psychophysical data, a matching procedure was
applied to the model responses. For an array of target and comparison contrasts, we
calculated the probabilities (over noise samples) that the SF produced by the
target+noise is greater than the SF produced by the comparison+noise. The
probabilities were calculated according to equation 2.
21 1
)(
m
ncntm
k
m
jkj
(2),
where t is the target, c is the comparison stimulus, n is noise and m is the number of
noise samples (we used m = 200). The expression (t + nj > c + nk) equals 1 if true and 0
if false. In the pedestal effect simulation, the corresponding expression was (p+t + nj >
p + nk), where p is pedestal contrast.
Figure 6A illustrates the definition of perceived contrast in the luminance step
simulation. The probability of the noisy comparison response being larger than the
noisy target response is plotted as a function of comparison contrast. It can be seen that
with a 0.15 target contrast, the comparison contrast needs to be about 0.1 for the model
to produce larger responses for the comparison in 50 % of comparisons (corresponding
to random answering in a 2-alternative task). Thus, the model predicts roughly 33 %
suppression with 0.15 target contrast. Figure 6B illustrates the definition of
discrimination thresholds in the pedestal effect simulation. The probability for
target+pedestal being larger than pedestal alone is plotted against target contrast, for
four different pedestal contrasts (0, 0.009, 0.017 and 0.025). The graph shows that with
25
the 0.009 target contrast the probability crosses the 75 % threshold criterion with a
lower target contrast than with no pedestal (black line), thus producing the well known
pedestal effect.
Figure 6. Comparison of simulated noisy responses. A) Judging supra-threshold contrast of a target
(simultaneous with the luminance step) relative to the contrast of a comparison stimulus (steady-state
luminance). The probability that the response for comparison+noise is larger than the response for
target+noise, as a function of comparison contrast. B) Discriminating a contrast target on pedestals. The
probability that the response for pedestal+target+noise is larger than the response for pedestal+noise, as a
function of target contrast. The black, blue, red and green lines correspond to pedestal contrasts 0, 0.009,
0.017 and 0.025, respectively.
4.2.2 Results of the luminance step simulation
Figure 7A presents the spike rates produced by the model for the steady-state and step-
up situations. In qualitative terms, comparing the two graphs reveals firstly, that the
spike rates are consistently suppressed in the step-up situation and secondly, that the
suppression becomes relatively weaker with increasing grating contrast. However, a
quantitative comparison with the psychophysical data requires the application of the
matching procedure of equation 1. Figure 7B compares the simulation results with the
psychophysical data from Study I (Figure 5). Both have been plotted as relative
suppression due to the luminance step, calculated in the conventional way: (Cphysical –
Capparent) / Cphysical, where Capparent corresponds to the comparison contrast with which the
observer or the model judges the comparison stimulus to have the higher contrast on 50
% of the trials (dashed lines in Figure 6A).
26
Figure 7. Results of simulations of the effect of a luminance step on spike frequencies and perceived
contrast and comparison to psychophysical data. A) Spike frequencies produced by model simulations as
functions of grating contrast for the step-up situation (blue line) and the steady-state situation (red line).
B) Relative suppression caused by the luminance step. The continuous curve is the result of the model
simulations. The data points are derived from Figure5A of Study I (Figure 1A in the original study of
Kilpeläinen et al., 2011), re-plotted as relative suppression of perceived contrast. The symbols for the
three subjects are the same as in the original figure.
4.2.3 Results of the pedestal effect simulation
In this simulation, the model produces discrimination data for 32 different model units,
5 examples of which are presented in Figure 8A. The different curves illustrate how the
curves for units seeing different parts of the grating dip at different pedestal contrasts.
For example, the unit that has the highest threshold at zero pedestal is the one that has
the lowest threshold at pedestal contrasts higher than 0.07 (the curve reaching its
minimum around pedestal contrast 0.13). To simplify matters as far as possible, it was
assumed that the discrimination threshold of the whole population of model units is
equal to the lowest threshold of any model unit. Schematically this corresponds to the
lower envelope of the curve pattern in Figure 8A.
Figure 8B compares simulated 75 % discrimination threshold with the
psychophysical data of Henning and Wichmann (2007). It is evident that the model
produces the familiar dipper function, the shape of which is roughly in agreement with
the psychophysically measured functions (which themselves vary quite considerably
between subjects). Since the main point of the simulation was to qualitatively reproduce
the overall pattern of data, Figure 8C compares the simulations and data, for three
different discrimination threshold criteria, on normalized axes. In addition to the correct
27
general shape of the functions, the model correctly predicts two overall patterns of the
data. First, lowering the criterion deepens the dip trough and moves it to higher pedestal
contrast. Second, the slopes of the subsequent rising parts of the functions are generally
well reproduced. The main shortcoming is that the model produces a somewhat too
shallow dip in the 60% criterion function.
Figure 8. Results of the pedestal effect simulations. A) Discrimination thresholds from 5 different model
units for the 75 % threshold criterion. B) Comparison of model simulations and data for three subjects at
the 75 % correct threshold criterion (data from Henning and Wichmann, 2007). C) Comparison of
simulations and data at three different % correct criteria (as indicated in the leftmost panel) on normalized
axes. Panels S1 and S2 show data for two subjects and the third panel (bottom right) gives the model
curves for the same levels of % correct. The data markers are as in Figure 4 of Henning & Wichmann
(2007, first two subjects).
28
4.3 Study III: The time-course of surround suppression in
human visual system
Surround suppression is a form of contextual modulation where the perceptual or neural
response to a centre stimulus is decreased by a completely non-overlapping surround
stimulus (Figure 3B). It is trivial that very large differences in the onset times of the two
stimuli will render the surround ineffective. However, since the critical part of neural
signals is very short and transient (Ludwig, Gilchrist, McSorley & Baddeley, 2005;
Müller, Metha, Krauskopf & Lennie, 2001), even modest differences in the times when
the centre and surround signals arrive at the level of interaction can have a considerable
effect. Such differences are likely to arise since the latencies of the retinal outputs
depend, at least, on the contrasts and sizes of different stimulus parts (Donner, 1981,
1989; Donner & Fagerholm, 2003). Study III measured how the timing of the surround
relative to the centre affects the surround’s suppressive effect, using stimuli designed to
probe effects of the “extra-classical receptive field”.
4.3.1 Methods
The effect of a surround grating on the perceived contrast of a centre grating was
measured psychophysically in three subjects (2 naïve). The subjects viewed the screen
binocularly from a distance of 114 cm. The centre stimulus was a circular patch of sine
wave grating (4 cpd). The surround stimuli were similar gratings in an annulus window.
The contrast and the size of the surround stimulus were separately varied in the two
experiments of this study. The mean luminance of the screen was 33 cd/m2. The most
effective timing of the surround was determined with a procedure very similar to that in
Study I. However, in this study, the centre and the surround stimuli were presented
either with simultaneous onset or with various amounts of asynchrony and the mean
luminance of the screen remained constant. The comparison stimulus was presented
1100 ms after the target. Trials were presented until both staircases reached 10 reversal
points.
4.3.2 Results
The timing of the surround stimulus had a clear effect on its ability to suppress the
perceived contrast of the centre stimulus. If the centre and surround were presented with
an onset difference greater than approximately 75 ms, the surround had very little effect
29
(Figure 9A). However, simultaneous onset did not lead to the strongest effect, either.
The higher the surround contrast, the later it had to be presented in order to have the
maximal effect (Figure 9B). This indicates that the contrast dependence of neural
latencies, which is known to exist on the level of the retinal output, is transferred up to
the level where the centre and surround signals interact. However, the relative latencies
of the surround were consistently too short, leaving no time for the transfer of
suppression at the site of interaction (compare broken curves and data markers in Figure
9B).
Figure 9. A) Perceived contrast of a centre grating as a function of the onset difference between the
centre and surround, separately for two subjects. Negative values on the x-axis mean that the surround
was presented before the centre. Smooth curves are best fitting normal or lognormal functions. B)
Relative latency, i.e., the opposite of the most effective surround onset time (see broken lines from A), as
a function of surround contrast. Centre contrast was always 0.2. Smooth curves are best fitting contrast-
latency functions. In the broken lines, a calculated median delay for horizontal and feedback connections
has been added (see labels next to curves). The delay calculations are based on the average distance
30
between the centre and the surround on V1 and cortical propagation velocities. The thick curve indicates
no mediation delay, and the thin solid curve the best fit to the data.
In the first experiment the area of the surround annulus was always 3 times as large
as the centre patch. In the second experiment we studied the effect of surround area on
the optimal timing of the surround stimulus. Decreasing the surround area from 3-fold
increased the relative latency of the surround, and when centre and surround areas were
equal (leftmost data points in Figure 10), the data actually matched the predictions
based on the mediation via feedback connections. In contrast, increasing the surround
area from 3-fold did not decrease the relative latency of the surround. On the contrary,
the latency increased somewhat (rightmost data points in Figure 10).
Figure 10. Relative latency as a function of surround area (area of the annulus / area of the central patch)
for three subjects. Relative latencies and predictions for horizontal and feedback connections (see labels)
calculated as in Figure 9. Predictions were made both for a case where average propagation distance stays
the same (straight lines) and for a case where average propagation distance increases with increasing
surround size (curved lines).
4.4 Study IV: The spatial extent of the ECRF mechanisms in
human visual system
The result of Study III where the relative latency of the surround signal first decreased
with increasing surround area, but then started increasing again, suggests a complex
interplay between the different ECRF components and their latencies. To study the
31
spatial extents of summation and suppression in the human visual system, Study IV
combined psychophysics, fMRI and computational modelling.
4.4.1 Methods
The basic paradigm was the one that has been routinely used to describe the structure of
the ECRF in single cells in the primate visual system (Cavanaugh, Bair & Movshon,
2002a; Sceniak, Hawken & Shapley, 2001; Solomon et al., 2006), the measurement of
responses to gratings of various sizes (Figure 3A).
There is no straightforward way to psychophysically measure the response to a
certain sub-area (the central area in this case) of a uniform, suprathreshold grating.
Instead, we assumed that discriminating the Gabor target from the underlying grating is
an index of, at least, the slope of the underlying neural contrast-response function or
neural noise, or both (Geisler & Albrecht, 1997; Georgeson & Meese, 2006; Klein,
2006). Both of these factors change monotonically with neural responses around the 15
% contrast used here. We thus interpret an increase in the discrimination threshold as a
correlate of an increase in the responses of the neurons that process the area covered by
the target.
The discrimination threshold with each mask size was measured with the method of
constant stimuli in 4 subjects (1 naïve). All stimuli were drifting gratings in oblique
orientation (1 cpd, 4 Hz, presentation time 180 ms) presented at 13 deg eccentricity. The
target to be detected was a Gabor grating (SD 0.125 deg, see Figure 11A) superimposed
on the centre of the mask. Mask diameter varied between measurements (0.5-24 deg),
while target size was held constant. The target contrast corresponding to 75 % correct
answers was taken as the threshold value. Viewing distance was 49.5 cm, mean
luminance 40 cd/m2.
In fMRI, the stimuli were the same as in the psychophysical experiment, but without
the target Gabor and with a 10.8 s presentation time (drift direction reversed every 450
ms. Attention was controlled with a fixation task. Six subjects (3 naïve) participated in
the fMRI experiment. See 3.2 for further details on fMRI methods.
Both in psychophysics and fMRI, functions of the form difference-of-integrals-of-
Gaussians were fitted to the data for interpolation purposes. The size of the summation
field was measured by the diameter where the fitted function peaked, the size of the
surround field by the diameter where the function had decreased to a value 5 % of that
32
at the largest diameter. The suppression index was calculated as the (Rpeak - Rsup) / Rpeak,
where Rpeak is the peak value of the function and Rsup is the value at the largest diameter.
4.4.2 Results
The overall shape of the psychophysically measured functions greatly resembles
functions measured in macaque V1 (Cavanaugh et al., 2002a; 2002b). The functions
first increase to a peak and then decrease at a decelerating rate and reach a plateau
(Figure 11B). Quantitative comparison between the two types of data is inevitably quite
speculative, but it is nevertheless interesting that the values observed here (summation
field 2.1 ± 0.3 deg, surround field 6.2 ± 2.5 deg, suppression index 0.34 ± 0.08) are
quite close to average values in the Cavanaugh et al. (2002a; 2002b) macaque data (2.7
± 0.14 deg, 4.5 ± 0.22, 0.32 ± 0.02).
Figure 11. A) The contrast profile of the stimulus used in the psychophysical experiment of Study IV. B)
The threshold contrast increment (∆C) for detecting the gabor target from the mask grating as a function
of mask diameter. The broken lines indicate the summation and surround fields, defined from the fitted
DOiG functions (smooth curves).
The data of the fMRI experiment replicate the familiar pattern, although the function
relating signal size to stimulus diameter reached a plateau only in the data from some
subjects (Figure 12). The average summation field diameter in V1 indicated by the
BOLD signal was 3.2 ± 1.3° (95% CI), the surround field diameter 15 ± 2.3°, and the
suppression index 0.87 ± 0.23. V2/V3 values were 5.6 ± 6.0°, 15 ± 6.4°, and 0.82 ±
0.68. In line with single-cell studies (Shushruth, Ichida, Levitt & Angelucci, 2009;
Zhang et al., 2005), the data showed a trend towards larger summation field size in V2
than in V1.
33
Figure 12. BOLD signal change as a function of grating diameter in human visual areas V1 and V2. The
individual data of six subjects (gray lines) and mean over subjects (black line). The label V2/V3 refers to
the fact there was no full certainty in all subjects that the VOI was exclusively within V2, as the
activation was close to the V2/V3 border.
The fMRI measurement used a typical (2.5 mm)3 voxel size. Such a cortical volume
includes hundreds of thousands of neurons with significantly varying receptive field
locations. To link the measured voxel responses to the underlying neural responses, a
model was developed. The model included a population of stereotypical model neurons
differing only in their receptive field locations (Figure 13A). The retinotopic locations
of the receptive fields were determined by projecting a single voxel in V1 at 13°
eccentricity to visual space (Figure 13A, top). The projection was based on human V1
magnification factors (Duncan & Boynton, 2003).
After determining the RF positions, the response of a model neuron at each visual
field location to a stimulus size was given by a two-dimensional difference-of-
integrals-of-Gaussians function, integrated over the stimulus area. The function is
essentially the Sceniak et al. (2001) RF model applied in two dimensions. The
parameters for the model neuron responses were extracted separately from the
psychophysical experiment of this study and from macaque single neuron data from
Cavanaugh, Bair and Movshon (2002a; 2002b), but only simulations based on the
psychophysical parameters are shown here. The black and red solid curves in Figure
13A (bottom) illustrate how receptive field positions have a significant influence on the
stimulus diameter at which the responses peak. After computing the response of each
model neuron, the voxel response is achieved by summing over all model neurons i.e.,
over all RF locations. We used 3600 model neurons, which was deemed sufficient to
34
represent realistically the output of the hundred-fold larger, but much more
computationally demanding actual population.
Figure 13B compares the simulation results with the fMRI data from V1. The model
captures the summation part and some of the suppressing part well, but whereas the
model function plateaus, the measured functions continue decreasing. Figure 13C shows
that the model generalizes to a completely novel stimulus paradigm, where the
measured neural population is displaced from the centre of stimulation.
Figure 13. A) The logic of the voxel response model. The model neurons are evenly distributed in the
voxel in cortex (dashed line square). The positions of the ECRFs of the model neurons are determined by
projecting the voxels to visual space according to human magnification factors (dashed-line quadrangle).
Different positions within the voxel, and relative to the centre of stimulation, lead to differing response
functions for the different model neurons (red vs. black curve). The voxel response is the sum of all
individual responses (dashed curve). B) Comparison of fMRI data (one thin line for each subject) and
model simulations (thick black curve). Normalized responses are plotted as functions of stimulus
diameter. C) fMRI data and model simulations when the voxel of interest is in the centre of stimulation
(black lines, same settings as in A, but separate data) or 3.6 deg toward the fovea from the centre of
stimulation (red lines).
35
5. Discussion
The visual system is an active system that receives an immensely dynamic spatio-
temporal pattern of light and transforms that into features that facilitate object
recognition and action planning. Performing this task within the constraints of the
biological visual system requires the visual information to be modulated and processed
between light absorption and perception. This thesis focuses on two types of modulation
effects, where the neural signal elicited by a certain stimulus is affected by additional
stimulation either at the same or at immediately adjacent location in space and time. I
will first discuss these two types of effects separately and then consider the temporal
nature of neural signals, a concept that links many of the studies of the thesis.
5.1 Modulation caused by changes in the underlying context
Arguably the simplest change in the context of a contrast stimulus is that its mean
luminance changes at the moment of presentation. Such changes are very naturalistic,
since they occur (with varying magnitude) with every saccade during viewing of natural
scenes (Frazor & Geisler, 2006). Effects of such a change on the perception of supra-
threshold contrast were investigated in Study I. In line with earlier results regarding
detection thresholds (Geisler, 1978; Poot et al., 1997), the change in mean luminance
caused significant attenuation of the perceived contrast of the target grating. Attenuation
as such is an expected result. At the moment of the change, the cones are adapted to the
lower mean luminance and the 0.7 log step upward in mean luminance transiently
consumes about 60 % of the dynamic range of the cones (Schnapf et al., 1990; Valeton
& Van Norren, 1983), leaving a heavily compressed range available for transmitting a
signal about contrast, i.e., each “pixel’s” deviation from the mean luminance.
Why, then, was attenuation not observed in the two single neuron recordings (see,
Geisler et al., 2007; Mante et al., 2005) in the cat visual system? Since we obtained very
similar results with stimuli near and much above threshold, as well as with gratings and
increment/decrement bars, stimulus differences are probably not the cause of the
discrepancy between the single neuron recordings and the psychophysical data. The
most probable reason is that the single cell studies were interested in responses after a
step to a certain absolute luminance level, rather than responses after a luminance step
36
of a certain size. In those studies, responses were averaged over the size and direction of
luminance steps, potentially obscuring the effects of step size.
However, if the compression of cone peak amplitudes were the only effect at play,
attenuation should increase with increasing target contrast. Instead, attenuation was
roughly constant in absolute terms, meaning that it became relatively weaker at higher
contrasts. While the compression of cone responses would undoubtedly affect
increments at the peak of the responses, it may lose its significance in responses
signalling strong contrasts with sharp onset. This is due to a basic, but rather
underappreciated fact of retinal signalling: Unlike the peak amplitude, the amplitude
during the earliest rise of photoreceptor responses is a completely linear function of
stimulus intensity over a very wide range of stimulus intensities (Baylor et al., 1974;
Lamb & Pugh, 1992) and adaptation levels (Friedburg, Thomas & Lamb, 2001; Hood &
Birch, 1993). Only the earliest rise of the response is strictly linear, however, and the
effect of the compressive non-linearity gradually increases towards the peak of the
response. It follows that with a weak stimulus contrast, a mechanism which integrates
and thresholds the cone signals, such as a retinal ganglion cell, integrates the cone
contrast all the way to the non-linear parts to produce reliable spike responses (see solid
curves in Figure 14 for a schematic illustration). In that situation, the difference integral
for the responses with the luminance step (blue) reaches spike thresholds (dashed
arrows) much more slowly than the integral for the responses with a steady mean
luminance (red) and, consequently, produces a lower spike frequency. With an
increasing contrast (dashed lines), however, the ganglion cell reaches its thresholds
increasingly at the same pace in the two situations, approaching (but never quite
reaching) the point where all information is based on the strictly linear part of the
photoreceptor response. As a result, the spike rates also converge with increasing
contrast (Figure 7A).
37
Figure 14. A schematic presentation of spike generation by an integrating and thresholding operator from
integrals of cone contrast responses (Rmax-Rmean). Blue lines correspond to the step-up situation, red lines
to the steady-state situation. Solid lines show the situation for a low contrast stimulus, dashed lines for a
high contrast stimulus.
Study II tested whether the above presented principle, implemented within a realistic
computational model, could reproduce the psychophysical data of Study I. The
modelling study provides a proof of principle. The model correctly captures the
decrease of attenuation, and roughly the strength of attenuation, with parameters strictly
fixed by primate physiology.
The credibility of the model is further supported by the fact that the same
mechanisms also reproduced the main features of the classical pedestal effect, where the
contrast context underlying the target changes abruptly and simultaneously with target
onset. The pedestal effect has traditionally been explained with a hypothetical sigmoid-
shaped contrast-response function (Legge & Foley, 1980), where the initial dip in
thresholds would reflect the initial acceleration of the contrast-response function and the
subsequent rise would be caused by deceleration of the function at higher contrasts.
Later it has been indicated that the need for signals to exceed internal noise levels,
which are possibly signal-dependent, can also lead to the initial dip (Geisler & Albrecht,
1997; Gur, Beylin & Snodderly, 1997; Georgeson & Meese, 2006). Common to the
theories above is the assumption that discrimination is always determined by the
neurons most sensitive to the target stimulus. More current theories suggest that to
correctly grasp the discrimination performance of a psychophysical observer, it is
necessary to consider neurons with varying stimulus preferences (Chirimuuta &
Tolhurst, 2005; Goris, Wichmann & Henning, 2009).
The model presented here is not categorically at odds with any of the earlier theories.
For example, the idea of multiple units with varying preferences is inherent in the idea
38
of units that read the cone responses at different parts of the grating. However, Study II
demonstrates that basic retinal mechanisms, which undoubtedly are at work in any case,
can generate the functions observed in psychophysical experiments. Thus, it seems that
the relevance of those mechanisms can only be dismissed by an explanation of where
and how the effect is lost and a similar function recreated by mechanisms upstream.
Since there are, at least on the surface, many differences between the two
psychophysical data sets of Study II, it may appear that the two simulations also rely on
quite different mechanisms. It is thus worthwhile to summarize the common aspects of
the two simulations. Firstly, cone responses were computed with identical functions and
parameters. Secondly, cone signals were transformed into spike rates with the same
functions and only one modest parameter change, a higher spike threshold in the
pedestal effect simulation. Even this adjustment is understandable since, in reality, there
are likely post-retinal thresholding mechanisms, such as the spike rate dependent spike
transfer probability of thalamic neurons (Carandini et al., 2007), which should not
significantly affect the perception of supra-threshold contrast. Thirdly, we used the cone
contrast as input to the thresholding integrator in the luminance step simulation but the
responses to each individual pixel of the grating in the pedestal effect. That, however, is
physiologically completely realistic, since parasol/Y ganglion cells of the primate retina
are capable of responding to both aspects of a grating stimulus (Crook et al., 2008;
Demb, Zaghloul, Haarsma & Sterling, 2001). Of course, it is not currently known how
the two response types are combined in different situations at higher levels of the visual
system. Finally, although the two main mechanisms are necessary in both simulations,
the linear-nonlinear cone responses are more critical in creating decelerating,
converging contrast response functions in the luminance step simulation. In the pedestal
effect simulation, in contrast, the threshold mechanism as such creates the dip and the
nature of cone signals becomes important on the later part of the dipper function.
5.2 Modulation caused by changes in the adjacent context
In surround suppression, the effect investigated in Studies III and IV, the target stimulus
itself remains completely unchanged and the modulation effect is instead caused by a
change in the immediately adjacent visual space. This brings about the need for an
additional mechanism, one which transmits the signal from neurons processing the
39
surround area to neurons processing the centre. The identification of the mechanism(s)
can greatly benefit from characterization of the spatial and temporal principles involved.
The temporal principles involved in surround suppression (called contrast-contrast
suppression in the original publication) were determined in Study III. The determinants
of the optimal timing of the surround stimulus (∆L) were divided into three linearly
summed factors, according to equation 3.
)()()( lDALCLL (3),
where ∆L(C) is the latency difference due to different contrasts of the centre and
surround stimuli, ∆L(A) is the difference due to different stimulus areas and D(l) is the
delay caused by the mediation of the suppressive signal from the neurons processing the
surround to the neurons processing the centre, at the level where the two signals
interact.
The contrast-related latency difference ∆L(C) followed the contrast-latency
relationship of retinal ganglion cells (Donner, 1989), which is indicated by the similar
shape of the model predictions and the data patterns in Figure 9B. The relationship is in
line with the effects of contrast on reaction times (Donner & Fagerholm, 2003), and
arises from a thresholding mechanism acting on the early rise of photoreceptor
responses, similarly as outlined in Studies I and II. However, when delays due to the
transmission of suppression D(l) derived from the literature (Girard et al., 2001) were
included, it appeared that the observed suppression was consistently too fast. This is
indicated by the vertical distance between the data and the model predictions based
either on horizontal connections or on feedback connections (see labels horiz. and
feedb.) in Figure 9B.
Perhaps the larger area of the surround gave it an advantage in the latency
comparison by affecting ∆L(A). We therefore proceeded to measure the effect of
surround area. When the areas and contrasts of the centre and surround stimuli were
equal, the optimal presentation time of the surround stimulus was approximately 5 ms
before the centre (leftmost data points in Figure 10). This is in fair agreement with the
model predictions based on feedback connections, but irreconcilable with the
predictions based on horizontal connections. When surround size was increased to 2-
40
fold and 3-fold, the relative latency of the surround decreased, although the average
propagation distance in cortex almost certainly increased. This is most easily
understood as reflecting acceleration in the surround signals due to increased surround
area. A similar acceleration effect of increased grating area has been observed in human
reaction times (Harwerth & Levi, 1978) and it can plausibly be explained by many
complementary mechanisms. On one hand, the neurons that mediate the suppressive
surround field get a stronger input to their RF and thus respond with a shorter latency
(see eg., Liu, Hashemi-Nezhad & Lyon, 2011). In addition, more neurons processing
the surround get an input and, as a result, more neurons processing the centre get their
surround fields stimulated, leading to probability summation type of acceleration.
In contrast, the subsequent slowing down of the surround signal with an increase
from 3-fold to 10-fold area is a much more puzzling result. It is true that the average
distance from the surround to the centre increases, but not very much. Why does the
area-related acceleration not overcome the increasing propagation distance after the 3-
fold surround area? It is, at this point, important to keep in mind that we measured the
time-course of the strongest suppression, not the fastest. In that light, it is interesting
that a recent study found three types of ECRFs in primate V1. All neurons show
surround suppression right after response onset, but after that, the suppression subsides
in about 25 % of neurons and turns to facilitation in a few percent of neurons (Liu et al.,
2011). As a result, the surround timing leading to the strongest suppression is likely the
result of a complex combination of faster surround components from longer distances
and the slower components from shorter distances. After all, Bair et al. (2003) did not
show that the connections underlying surround suppression were exclusively feedback,
only that they were too fast to be exclusively horizontal.
Regardless of the complexity of the underlying circuitry, understanding the spatial
structure of the ECRF mechanism could shed light on the complex relationship between
stimulus area and latency. In Study IV, three complementary methodologies were
combined to this end. Functional MRI was used to get an estimate from a specific level
of the visual system (e.g., V1). The drawback of fMRI is that it is known to be an
extremely indirect correlate of spiking activity (Attwell et al., 2010). However, in most
cases, the correlation is very strong (Lee et al., 2010; Nir, Dinstein, Malach & Heeger,
2008). Psychophysics was used to probe what the ECRF of a spatially localized (unlike
41
dispersed as in fMRI) neural population could approximately look like. Also,
psychophysical data are likely to correlate closely with spiking activity, as signals can
only be transmitted to the level of decision making by spiking. Finally, computational
modelling was used to analyse the relation between fMRI, psychophysics and primate
single neurons.
It is quite encouraging that the data from the different methods mostly produce a
rather coherent picture, although no attempts were made to force the different data sets
into agreement via parameter adjustments or otherwise. The comparison of primate
single cell data and our psychophysical data on one hand, and the fMRI data and
modelling results on the other hand, suggests that the ECRF profile in human V1 is
quite similar to that in non-human primates. The notable discrepancy is that in the fMRI
data, suppression is much stronger and more far-reaching than in single cells,
psychophysics or in the model simulations. Some justifiable adjustments of the
parameters of the model neurons could improve the fit. However, we suspected that the
discrepancy is mainly caused by a dissociation of spiking and the BOLD signal. An
analogous dissociation in suppression extent has been observed between spiking and
synaptic activity in single neurons in cat V1 (Anderson, Lampl, Gillespie & Ferster,
2001). Synaptic activity, in turn, correlates with the BOLD signal even in certain
situations where spiking does not (Logothetis, Pauls, Augath, Trinath & Oeltermann,
2001).
In most of the model simulations, we used parameters from our psychophysical study
rather than the macaque single cell data. Although it might appear that our rationale of
using contrast discrimination thresholds to probe cortical responses is in contradiction
with our claims about the retinal origin of the pedestal effect in Study II, this is not the
case. While the effects considered in Study II probably originate in the retina, the
signals of course thereafter travel through the cortex, where they can be modulated, for
example, by stimulation within the surround fields of V1 neurons.
What about the relationship between the time-course and spatial extent of surround
suppression? If we now take our estimate of the surround field diameter, 6.2 degrees at
13 degree eccentricity, and take into consideration the change of surround size as a
function of eccentricity in primates (Cavanaugh et al., 2002a; Teichert, Wachtler,
Michler, Gail & Eckhorn, 2007), a rough approximation of the surround diameter in
42
human fovea would be about 3 degrees. Although such a comparison is infested with
uncertainty, it is nevertheless interesting that the accelerating effect of surround area
ceased somewhere between diameters 2.1 and 3.4 degrees in Study III.
5.3 The temporal nature of the neural signal
Studies I, II, and III all support the idea that the critical part of the neural response to
stimulus onset in cone vision is extremely short and transient. Although modulation by
underlying (Study II) and adjacent contrast context (Study III) probably involves at least
partly different mechanisms, the time-courses are remarkably similar. Both in Study III
and an overlay masking study by Georgeson and Georgeson (1987), the modulating
stimulus is effective only if presented ± 100 ms from the target stimulus. This suggests
that in feed forward visual processing, the modulating stimulus must be able to affect
the target response during its short initial part in order to be effective. The critical role
of the early part has been shown in a variety of paradigms (Hegdé & Van Essen, 2004;
Ludwig et al., 2005; Müller et al., 2001; Osborne, Bialek & Lisberger, 2004; Watson &
Nachmias, 1977; however, see also Tolhurst, 1975). Some forms of higher-level
contextual modulation, where the perceptual significance of a stimulus is altered after
the primary coding of the stimulus, such as perceptual pop-out (Smith, Kelly & Lee,
2007; Zipser, Lamme & Schiller, 1996) and attentional modulation (Roelfsema, Lamme
& Spekreijse, 1998; Wannig, Stanisor & Roelfsema, 2011) may work on longer time
scales. The effect of the luminance change in Study I also lasts a bit longer, 200-400 ms,
probably because the system must not only cope with the transient neural signal caused
by the modulating stimulus, but must also adapt to the new mean luminance in order to
restore steady-state responsivity. Indeed, the time course observed in Study I is roughly
in agreement with fast adaptation observed in macaque cones (Schnapf et al., 1990).
Due to the dynamic nature of normal visual processing observed here and in many other
studies, it cannot be emphasized enough that changing certain properties of a
modulating stimulus (e.g., size or contrast) does not merely change its strength, but also
the timing of its neural representations. As a result, the changes observed in the strength
of modulation cannot be considered as direct indicators of the strength of the
modulating signal, but indicate the combined effect of strength and timing.
43
5.4 Conclusions
The current study provides information about the spatial and temporal constraints that
affect the effectiveness of modulating visual signals. The thesis adds to the evidence
that the relevant part of neural signals is very short and transient (Studies I, II and III).
The thesis shows that basic retinal signalling is a plausible and powerful determinant of
the time window within which modulating signals can be effective, whether the
modulation arises from stimulation at the same (Study II) or adjacent (Study III)
location in the visual field. However, in the latter case, timing of effective modulation is
also affected by spatial properties of the surround in a way that suggests the necessity of
cortical mechanisms, probably feedforward-feedback-loops between primary visual
cortex and higher visual cortices (Studies III and IV).
44
6. References
Anderson, J. S., Lampl, I., Gillespie, D. C., & Ferster, D. (2001). Membrane potential and conductance changes underlying length tuning of cells in cat primary visual cortex. Journal of Neuroscience, 21, 2104-2112.
Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J. M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. Journal of Neuroscience, 22, 8633-8646.
Attwell, D., Buchan, A. M., Charpak, S., Lauritzen, M., MacVicar, B. A., & Newman, E. A. (2010). Glial and neuronal control of brain blood flow. Nature, 468, 232-243.
Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. Journal of Neuroscience, 23, 7690-7701.
Bartfeld, E., & Grinvald, A. (1992). Relationships between orientation-preference pinwheels, cytochrome oxidase blobs, and ocular-dominance columns in primate striate cortex. Proceedings of the National Academy of Sciences, 89, 11905-11909.
Baylor, D. A., Hodgkin, A. L., & Lamb, T. D. (1974). Electrical response of turtle cones to flashes and steps of light. Journal of Physiology-London, 242, 685-727.
Bonin, V., Mante, V., & Carandini, M. (2005). The suppressive field of neurons in lateral geniculate nucleus. Journal of Neuroscience, 25, 10844-10856.
Brivanlou, I. H., Warland, D. K., & Meister, M. (1998). Mechanisms of concerted firing among retinal ganglion cells. Neuron, 20, 527-539.
Cannon, M. W., & Fullenkamp, S. C. (1991). Spatial interactions in apparent contrast - inhibitory effects among grating patterns of different spatial-frequencies, spatial positions and orientations. Vision Research, 31, 1985-1998.
Carandini, M., & Heeger, D. J. (2012). Normalization as a canonical neural computation. Nature Reviews Neuroscience, 13, 51-62.
Carandini, M., Horton, J. C., & Sincich, L. C. (2007). Thalamic filtering of retinal spike trains by postsynaptic summation. Journal of Vision, 7.
Casagrande, V. A., & Kaas, J. H. (1994). The afferent, intrinsic, and efferent connections of primary visual cortex in primates. In A. Peters & K. Rockland (Eds.), Cerebral cortex (Vol. 10, pp. 201-259). New York: Plenum Press.
Cavanaugh, J. R., Bair, W., & Movshon, J. A. (2002a). Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology, 88, 2530-2546.
Cavanaugh, J. R., Bair, W., & Movshon, J. A. (2002b). Selectivity and spatial distribution of signals from the receptive field surround in macaque V1 neurons. Journal of Neurophysiology, 88, 2547 - 2556.
Chirimuuta, M., & Tolhurst, D. J. (2005). Does a bayesian model of V1 contrast coding offer a neurophysiological account of human contrast discrimination? Vision Research, 45, 2943-2959.
Croner, L. J., Purpura, K., & Kaplan, E. (1993). Response variability in retinal ganglion-cells of primates. Proceedings of the National Academy of Sciences of the United States of America, 90, 8128-8130.
Crook, J. D., Peterson, B. B., Packer, O. S., Robinson, F. R., Troy, J. B., & Dacey, D. M. (2008). Y-cell receptive field and collicular projection of parasol ganglion cells in macaque monkey retina. Journal of Neuroscience, 28, 11277-11291.
Cumming, B. G., & Parker, A. J. (1999). Binocular neurons in V1 of awake monkeys are selective for absolute, not relative, disparity. Journal of Neuroscience, 19, 5602-5618.
Dacey, D. M. (2007). Physiology, morphology and spatial densities of identified ganglion cell types in primate retina. In G. Bock & J. Goode (Eds.), Ciba foundation symposium 184 - Higher-order processing in the visual system (pp. 12-34). Chichester: John Wiley & Sons, Ltd.
Demb, J. B., Zaghloul, K., Haarsma, L., & Sterling, P. (2001). Bipolar cells contribute to nonlinear spatial summation in the brisk-transient (Y) ganglion cell in mammalian retina. Journal of Neuroscience, 21, 7447-7454.
Desai, P., Reidy, A., Minassian, D. C., Vafidis, G., & Bolger, J. (1996). Gains from cataract surgery: Visual function and quality of life. British Journal of Ophthalmology, 80, 868-873.
Donner, K. (1981). Receptive-fields of frog retinal ganglion-cells - response formation and light-dark-adaptation. Journal of Physiology-London, 319, 131-142.
Donner, K. (1989). Visual latency and brightness: An interpretation based on the responses of rods and ganglion-cells in the frog retina. Visual Neuroscience, 3, 39-51.
45
Donner, K., & Fagerholm, P. (2003). Visual reaction time: Neural conditions for the equivalence of stimulus area and contrast. Vision Research, 43, 2937-2940.
Duncan, R. O., & Boynton, G. M. (2003). Cortical magnification within human primary visual cortex correlates with acuity thresholds. Neuron, 38, 659-671.
Ejima, Y., & Takahashi, S. (1983). Effects of high-contrast peripheral patterns on the detection threshold of sinusoidal targets. Journal of the Optical Society of America, 73, 1695-1700.
Ejima, Y., & Takahashi, S. (1985). Apparent contrast of a sinusoidal grating in the simultaneous presence of peripheral gratings. Vision Research, 25, 1223-1232.
Field, G. D., & Chichilnisky, E. J. (2007). Information processing in the primate retina: Circuitry and coding. Annual Review of Neuroscience, 30, 1-30.
Frackowiak, R. S., Friston, K. J., Frith, C. D., Dolan, R., Ashburner, J., Price, C., et al. (2003). Human brain function (2nd ed.). New York, NY: Academic Press.
Frazor, R. A., & Geisler, W. S. (2006). Local luminance and contrast in natural images. Vision Research, 46, 1585-1598.
Friedburg, C., Thomas, M. M., & Lamb, T. D. (2001). Time course of the flash response of dark- and light-adapted human rod photoreceptors derived from the electroretinogram. Journal of Physiology-London, 534, 217-242.
Geisler, W. S. (1978). Adaptation, afterimages and cone saturation. Vision Research, 18, 279-289. Geisler, W. S., & Albrecht, D. G. (1997). Visual cortex neurons in monkeys and cats: Detection,
discrimination, and identification. Visual Neuroscience, 14, 897-919. Geisler, W. S., Albrecht, D. G., & Crane, A. M. (2007). Responses of neurons in primary visual cortex to
transient changes in local contrast and luminance. Journal of Neuroscience, 27, 5063-5067. Georgeson, M. A., & Georgeson, J. M. (1987). Facilitation and masking of briefly presented gratings -
time-course and contrast dependence. Vision Research, 27, 369-379. Georgeson, M. A., & Meese, T. S. (2006). Fixed or variable noise in contrast discrimination? The jury's
still out. Vision Research, 46, 4294-4303. Gilbert, C. D., & Wiesel, T. N. (1989). Columnar specificity of intrinsic horizontal and corticocortical
connections in cat visual cortex. Journal of Neuroscience, 9, 2432-2442. Girard, P., Hupe, J. M., & Bullier, J. (2001). Feedforward and feedback connections between areas V1
and V2 of the monkey have similar rapid conduction velocities. Journal of Neurophysiology, 85, 1328-1331.
Gollisch, T., & Meister, M. (2010). Eye smarter than scientists believed: Neural computations in circuits of the retina. Neuron, 65, 150-164.
Goris, R. L. T., Wichmann, F. A., & Henning, G. B. (2009). A neurophysiologically plausible population code model for human contrast discrimination. Journal of Vision, 9.
Grigorescu, C., Petkov, N., & Westenberg, M. A. (2003). Contour detection based on nonclassical receptive field inhibition. IEEE Transactions on Image Processing, 12, 729-739.
Grinvald, A., Lieke, E. E., Frostig, R. D., & Hildesheim, R. (1994). Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual-cortex. Journal of Neuroscience, 14, 2545-2568.
Gur, M., Beylin, A., & Snodderly, D. M. (1997). Response variability of neurons in primary visual cortex (V1) of alert monkeys. The Journal of Neuroscience, 17, 2914-2920.
Harwerth, R. S., & Levi, D. M. (1978). Reaction-time as a measure of suprathreshold grating detection. Vision Research, 18, 1579-1586.
Hegdé, J., & Van Essen, D. C. (2004). Temporal dynamics of shape analysis in macaque visual area V2. Journal of Neurophysiology, 92, 3030-3042.
Henning, G. B., & Wichmann, F. A. (2007). Some observations on the pedestal effect. Journal of Vision, 7.
Hood, D. C., & Birch, D. G. (1993). Human cone receptor activity - the leading-edge of the alpha-wave and models of receptor activity. Visual Neuroscience, 10, 857-871.
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in cats visual cortex. Journal of Physiology-London, 160, 106-154.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology-London, 195, 215-243.
Hupé, J.-M., James, A. C., Girard, P., & Bullier, J. (2001). Response modulations by static texture surround in area V1 of the macaque monkey do not depend on feedback connections from V2. Journal of Neurophysiology, 85, 146-163.
46
Kilpeläinen, M., Nurminen, L., & Donner, K. (2011). Effects of mean luminance changes on human contrast perception: Contrast dependence, time-course and spatial specificity. PLoS ONE, 6, e17200.
Klein, S. A. (2006). Separating transducer non-linearities and multiplicative noise in contrast discrimination. Vision Research, 46, 4279-4293.
Knierim, J. J., & van Essen, D. C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. Journal of Neurophysiology, 67, 961-980.
Krumpaszky, H. G., Dietz, K., Mickler, A., & Selbmann, H. K. (1999). Mortality in blind subjects. Ophthalmologica, 213, 48-53.
Kurki, I., Peromaa, T., Hyvärinen, A., & Saarinen, J. (2009). Visual features underlying perceived brightness as revealed by classification images. PLoS ONE, 4, e7432.
Lamb, T. D., & Pugh, E. N. (1992). A quantitative account of the activation steps involved in phototransduction in amphibian photoreceptors. Journal of Physiology-London, 449, 719-758.
Lee, J. H., Durand, R., Gradinaru, V., Zhang, F., Goshen, I., Kim, D.-S., et al. (2010). Global and local fMRI signals driven by neurons defined optogenetically by type and wiring. Nature, 465, 788-792.
Legge, G. E., & Foley, J. M. (1980). Contrast masking in human vision. Journal of the Optical Society of America, 70, 1458-1471.
Liu, Y. J., Hashemi-Nezhad, M., & Lyon, D. C. (2011). Dynamics of extraclassical surround modulation in three types of V1 neurons. Journal of Neurophysiology, 105, 1306-1317.
Logothetis, N. K., Pauls, J., Augath, M., Trinath, T., & Oeltermann, A. (2001). Neurophysiological investigation of the basis of the fMRI signal. Nature, 412, 150-157.
Ludwig, C. J. H., Gilchrist, I. D., McSorley, E., & Baddeley, R. J. (2005). The temporal impulse response underlying saccadic decisions. Journal of Neuroscience, 25, 9907-9912.
Maffei, L., & Fiorentini, A. (1976). The unresponsive regions of visual cortical receptive-fields. Vision Research, 16, 1131-1139.
Mante, V., Frazor, R. A., Bonin, V., Geisler, W. S., & Carandini, M. (2005). Independence of luminance and contrast in natural scenes and in the early visual system. Nature Neuroscience, 8, 1690-1697.
McAlonan, K., Cavanaugh, J., & Wurtz, R. H. (2008). Guarding the gateway to cortex with attention in visual thalamus. Nature, 456, 391-394.
Müller, J. R., Metha, A. B., Krauskopf, J., & Lennie, P. (2001). Information conveyed by onset transients in responses of striate cortical neurons. Journal of Neuroscience, 21, 6978-6990.
Nachmias, J., & Sansbury, R. V. (1974). Grating contrast - discrimination may be better than detection. Vision Research, 14, 1039-1042.
Nir, Y., Dinstein, I., Malach, R., & Heeger, D. J. (2008). BOLD and spiking activity. Nature Neuroscience, 11, 523-524.
Niven, J. E., & Laughlin, S. B. (2008). Energy limitation as a selective pressure on the evolution of sensory systems. Journal of Experimental Biology, 211, 1792-1804.
Nothdurft, H.-C., Gallant, J. L., & Van Essen, D. C. (1999). Response modulation by texture surround in primate area V1: Correlates of "Popout" Under anesthesia. Visual Neuroscience, 16, 15-34.
Nurminen, L., Peromaa, T., & Laurinen, P. (2010). Surround suppression and facilitation in the fovea: Very long-range spatial interactions in contrast perception. Journal of Vision, 10.
O'Connor, D. H., Fukui, M. M., Pinsk, M. A., & Kastner, S. (2002). Attention modulates responses in the human lateral geniculate nucleus. Nature Neuroscience, 5, 1203-1209.
Olzak, L. A., & Laurinen, P. I. (1999). Multiple gain control processes in contrast-contrast phenomena. Vision Research, 39, 3983-3987.
Osborne, L. C., Bialek, W., & Lisberger, S. G. (2004). Time course of information about motion direction in visual area MT of macaque monkeys. Journal of Neuroscience, 24, 3210-3222.
Parker, A. J. (2007). Binocular depth perception and the cerebral cortex. Nature Reviews Neuroscience, 8, 379-391.
Poot, L., Snippe, H. P., & van Hateren, J. H. (1997). Dynamics of adaptation at high luminances: Adaptation is faster after luminance decrements than after luminance increments. Journal of the Optical Society of America A-Optics Image Science and Vision, 14, 2499-2508.
Ridder, W. (2006). Visual evoked potentials in animals. In J. R. Heckenlively & G. B. Arden (Eds.), Principles and practice of clinical electrophysiology of vision (2 ed.). Cambridge, MA: MIT Press.
Rieke, F., & Baylor, D. A. (1998). Single-photon detection by rod cells of the retina. Reviews of Modern Physics, 70, 1027.
Rieke, F., & Rudd, M. E. (2009). The challenges natural images pose for visual adaptation. Neuron, 64, 605-616.
47
Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376-381.
Sceniak, M. P., Hawken, M. J., & Shapley, R. (2001). Visual spatial characterization of macaque V1 neurons. Journal of Neurophysiology, 85, 1873-1887.
Schnapf, J. L., Nunn, B. J., Meister, M., & Baylor, D. A. (1990). Visual transduction in cones of the monkey macaca fascicularis. Journal of Physiology-London, 427, 681-713.
Schwabe, L., Obermayer, K., Angelucci, A., & Bressloff, P. C. (2006). The role of feedback in shaping the extra-classical receptive field of cortical neurons: A recurrent network model. Journal of Neuroscience, 26, 9117-9129.
Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., Brady, T. J., et al. (1995). Borders of multiple visual areas in humans revealed by functional magnetic-resonance-imaging. Science, 268, 889-893.
Seriès, P., Lorenceau, J., & Frégnac, Y. (2003). The "Silent" Surround of V1 receptive fields: Theory and experiments. Journal of Physiology-Paris, 97, 453-474.
Shapley, R., & Enroth-Cugell, C. (1984). Visual adaptation and retinal gain controls. In N. Osborne & G. Chader (Eds.), Progress in retinal research (Vol. 3, pp. 263-346). Oxford: Pergamon Press.
Shapley, R. M., & Victor, J. D. (1979). Nonlinear spatial summation and the contrast gain control of cat retinal ganglion cells. Journal of Physiology,290, 141-161.
Shushruth, S., Ichida, J. M., Levitt, J. B., & Angelucci, A. (2009). Comparison of spatial summation properties of neurons in macaque V1 and V2. Journal of Neurophysiology, 102, 2069-2083.
Smith, M. A., Kelly, R. C., & Lee, T. S. (2007). Dynamics of response to perceptual pop-out stimuli in macaque V1. Journal of Neurophysiology, 98, 3436-3449.
Solomon, S. G., Lee, B. B., & Sun, H. (2006). Suppressive surrounds and contrast gain in magnocellular-pathway retinal ganglion cells of macaque. Journal Of Neuroscience, 26, 8715-8726.
Solomon, S. G., White, A. J. R., & Martin, P. R. (2002). Extraclassical receptive field properties of parvocellular, magnocellular, and koniocellular cells in the primate lateral geniculate nucleus. Journal of Neuroscience, 22, 338-349.
Stevens, C. F. (2001). An evolutionary scaling law for the primate visual system and its basis in cortical function. Nature, 411, 193-195.
Teichert, T., Wachtler, T., Michler, F., Gail, A., & Eckhorn, R. (2007). Scale-invariance of receptive field properties in primary visual cortex. BMC Neuroscience, 8, 38.
Tolhurst, D. J. (1975). Reaction-times in detection of gratings by human observers - probabilistic mechanism. Vision Research, 15, 1143-1149.
Valeton, J. M., & Van Norren, D. (1983). Light adaptation of primate cones: An analysis based on extracellular data. Vision Research, 23, 1539-1547.
Walraven, J., Enroth-Cugell, C., Hood, D. C., MacLeod, D. I. A., & Schnapf, J. L. (1990). The control of visual sensitivity: Receptoral and postreceptoral processes. In L. Spillmann & J. S. Werner (Eds.), Visual perception: The neurophysiological foundations (pp. 53-101). San Diego, CA: Academic Press, Inc.
Van Essen, D. C., & Maunsell, J. H. R. (1983). Hierarchical organization and functional streams in the visual cortex. Trends in Neurosciences, 6, 370-375.
Vanni, S., Henriksson, L., & James, A. C. (2005). Multifocal fMRI mapping of visual cortical areas. Neuroimage, 27, 95-105.
Wannig, A., Stanisor, L., & Roelfsema, P. R. (2011). Automatic spread of attentional response modulation along gestalt criteria in primary visual cortex. Nature Neuroscience, 14, 1243-1244.
Watson, A. B., & Nachmias, J. (1977). Patterns of temporal interaction in the detection of gratings. Vision Research, 17, 893-902.
Webb, B. S., Dhruv, N. T., Solomon, S. G., Tailby, C., & Lennie, P. (2005). Early and late mechanisms of surround suppression in striate cortex of macaque. Journal of Neuroscience, 25, 11666-11675.
Weber, A. J., & Harman, C. D. (2005). Structure-function relations of parasol cells in the normal and glaucomatous primate retina. Investigative Ophthalmology and Visual Science, 46, 3197-3207.
Werblin, F. S. (2011). The retinal hypercircuit: A repeating synaptic interactive motif underlying visual function. The Journal of Physiology, 589, 3691-3702.
Westheimer, G. (1967). Spatial interaction in human cone vision. Journal of Physiology-London, 190, 139-154.
Whittle, P. (1994). The psychophysics of contrast-brightness. In Gilchrist (Ed.), Lightness, brightness and transparency (pp. 35–110). Hillsdale, NJ: Lawrence Erlbaum Associates.
48
Williams, A. L., Singh, K. D., & Smith, A. T. (2003). Surround modulation measured with functional MRI in the human visual cortex. Journal of Neurophysiology, 89, 525-533.
Zenger-Landolt, B., & Heeger, D. J. (2003). Response suppression in V1 agrees with psychophysics of surround masking. Journal of Neuroscience, 23, 6884-6893.
Zhang, B., Zheng, J. H., Watanabe, I., Maruko, I., Bi, H., Smith, E. L., et al. (2005). Delayed maturation of receptive field center/surround mechanisms in V2. Proceedings of the National Academy of Sciences of the United States of America, 102, 5862-5867.
Zipser, K., Lamme, V. A. F., & Schiller, P. H. (1996). Contextual modulation in primary visual cortex. Journal of Neuroscience, 16, 7376-7389.