24
1 doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO Interpreting Bistability Using Probabilistic Inference Rashmi Sundareswara Department of Computer Science and Eng., University of Minnesota, MN, USA Paul R. Schrater Departments of Computer Science and Eng and Psychology, University of Minnesota, MN, USA Perceptual multistability refers to the phenomenon of spontaneous perceptual switching between two or more likely interpretations of an image. Although frequently explained by processes of adaptation or hysteresis, we show that perceptual switching can arise as a natural by-product of performing probabilistic (Bayesian) inference, which interprets images by combining probabilistic models of image formation with knowledge of scene regularities. Empirically, we investigated the effect of introducing scene regularities on Necker cube bistability by flanking the Necker cube with fields of unambiguous cubes that are oriented to coincide with one of the Necker cube percepts. We show that background cubes increase the time spent in percepts most similar to the background. To characterize changes in the temporal dynamics of the perceptual alternations beyond percept durations, we introduce Markov Renewal Processes (MRPs). MRPs provide a general mathematical framework for describing probabilistic switching behavior in finite state processes. Finally, we introduce a simple theoretical model consistent with Bayesian models of vision that involves searching for good interpretations of an image by sampling a posterior distribution coupled with a decay process that favors recent to old interpretations. The model has the same quantitative characteristics as our human data and variation in model parameters can capture between-subject variation. Because the model produces the same kind of stochastic process found in human perceptual behavior, we conclude that multistability may represent an unavoidable by-product normal perceptual inference with ambiguous images. Keywords: Multistability, Bayesian inference, perceptual inference, Markov Renewal Process, context. Introduction The human visual system is remarkably good at producing crisp and consistent perceptions of the world. Underlying these crisp and consistent percepts are sophisticated methods for resolving sensory ambiguity and conflicts, many of which have been successfully described using Bayesian inference (Kersten, Mamassian & Yuille, 2004). However, we experience a breakdown of this consistency when viewing stimuli that produce phenomena like perceptual multistability and binocular rivalry. The question motivating this study is to what extent perceptual multistability can be viewed as resulting from the normal process of perceptual inference, without recourse to special mechanisms or processing. Many theories have emphasized the role that non- inferential explanations like neural adaptation may play in generating multistability (Attneave, 1971; Blake 1989; Lehky 1988; Long, 1992 and Taylor 1974). However, recent evidence points to both bottom-up and top-down influences on multistability (Leopold and Logothetis, 1999). Top-down influences include attention (e.g. van Ee, Noest and Brascamp and van den Berg, 2006;Rock, Hall and Davis, 1994; Pelton and Solley, 1968; Peterson and Gibson, 1994), intention (Toppino, 2003;Peterson, Harvey and Weidenbacher, 1991), recognition (Peterson and Gibson, 1994), semantic content (Davis, Schiffman and Greist-Bousquet, 1990;Walker 1978) and voluntary control (Hol, K., Koene, A. and van Ee, R.,2003, van Ee, van Dam and Brower , 2005; Toppino, 2003; Meng and Tong, 2004, van Ee, 2006; van Ee, 2005). These many and diverse influences suggest multistable perception may be better viewed as resulting from a general process of perceptual inference than a specific low-level mechanism. In particular, it has been suggested that multistability may be an extreme case of ambiguity in normal perceptual processing (van Ee et al, 2003). If multistability results from an extreme case of ambiguity in normal perceptual processing, then manipulations that reduce ambiguity in normal vision should increase perceptual stability when viewing multistable stimuli. One way to reduce ambiguity involves placing an ambiguous image in a global context that promotes a perceptual interpretation. Previous studies have shown that moving multistable stimuli can be stabilized by the motion of background elements (Dawson 1987;Kramer and Yantis 1997). Similar manipulations for stationary rivalrous stimuli increase dominance times (Blake and Logothetis, 2002; Alais and Blake, 1999; Sobel and Blake, 2002 and Yu and Blake, 1992). In this paper we investigate the effects of background (context) on the dynamics of perceptual bistability while viewing the Necker cube. From an ecological or Bayesian perspective, background context can provide information about the frequency of occurrence of attributes in an image. Because the ambiguity in the Necker cube can be thought of as viewpoint ambiguity—the dominant percepts are consistent with two viewpoints of a single cube (Mamassian, et al 1998), we introduced background

Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

  • Upload
    lamdieu

  • View
    221

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

1

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

Interpreting Bistability Using Probabilistic Inference

Rashmi SundareswaraDepartment of Computer Science and Eng., University ofMinnesota, MN, USA

Paul R. SchraterDepartments of Computer Science and Eng andPsychology, University of Minnesota, MN, USA

Perceptual multistability refers to the phenomenon of spontaneous perceptual switching between two or more likelyinterpretations of an image. Although frequently explained by processes of adaptation or hysteresis, we show thatperceptual switching can arise as a natural by-product of performing probabilistic (Bayesian) inference, which interpretsimages by combining probabilistic models of image formation with knowledge of scene regularities. Empirically, weinvestigated the effect of introducing scene regularities on Necker cube bistability by flanking the Necker cube with fieldsof unambiguous cubes that are oriented to coincide with one of the Necker cube percepts. We show that backgroundcubes increase the time spent in percepts most similar to the background. To characterize changes in the temporaldynamics of the perceptual alternations beyond percept durations, we introduce Markov Renewal Processes (MRPs).MRPs provide a general mathematical framework for describing probabilistic switching behavior in finite state processes.Finally, we introduce a simple theoretical model consistent with Bayesian models of vision that involves searching forgood interpretations of an image by sampling a posterior distribution coupled with a decay process that favors recent toold interpretations. The model has the same quantitative characteristics as our human data and variation in modelparameters can capture between-subject variation. Because the model produces the same kind of stochastic processfound in human perceptual behavior, we conclude that multistability may represent an unavoidable by-product normalperceptual inference with ambiguous images.

Keywords: Multistability, Bayesian inference, perceptual inference, Markov Renewal Process, context.

IntroductionThe human visual system is remarkably good at

producing crisp and consistent perceptions of the world.Underlying these crisp and consistent percepts aresophisticated methods for resolving sensory ambiguityand conflicts, many of which have been successfullydescribed using Bayesian inference (Kersten, Mamassian& Yuille, 2004). However, we experience a breakdownof this consistency when viewing stimuli that producephenomena like perceptual multistability and binocularrivalry. The question motivating this study is to whatextent perceptual multistability can be viewed as resultingfrom the normal process of perceptual inference, withoutrecourse to special mechanisms or processing.

Many theories have emphasized the role that non-inferential explanations like neural adaptation may playin generating multistability (Attneave, 1971; Blake 1989;Lehky 1988; Long, 1992 and Taylor 1974). However,recent evidence points to both bottom-up and top-downinfluences on multistability (Leopold and Logothetis,1999). Top-down influences include attention (e.g. vanEe, Noest and Brascamp and van den Berg, 2006;Rock,Hall and Davis, 1994; Pelton and Solley, 1968; Petersonand Gibson, 1994), intention (Toppino, 2003;Peterson,Harvey and Weidenbacher, 1991), recognition (Petersonand Gibson, 1994), semantic content (Davis, Schiffmanand Greist-Bousquet, 1990;Walker 1978) and voluntarycontrol (Hol, K., Koene, A. and van Ee, R.,2003, van Ee,van Dam and Brower , 2005; Toppino, 2003; Meng and

Tong, 2004, van Ee, 2006; van Ee, 2005). These manyand diverse influences suggest multistable perception maybe better viewed as resulting from a general process ofperceptual inference than a specific low-level mechanism.In particular, it has been suggested that multistability maybe an extreme case of ambiguity in normal perceptualprocessing (van Ee et al, 2003).

If multistability results from an extreme case ofambiguity in normal perceptual processing, thenmanipulations that reduce ambiguity in normal visionshould increase perceptual stability when viewingmultistable stimuli. One way to reduce ambiguityinvolves placing an ambiguous image in a global contextthat promotes a perceptual interpretation. Previousstudies have shown that moving multistable stimuli canbe stabilized by the motion of background elements(Dawson 1987;Kramer and Yantis 1997). Similarmanipulations for stationary rivalrous stimuli increasedominance times (Blake and Logothetis, 2002; Alais andBlake, 1999; Sobel and Blake, 2002 and Yu and Blake,1992).

In this paper we investigate the effects of background(context) on the dynamics of perceptual bistability whileviewing the Necker cube. From an ecological or Bayesianperspective, background context can provide informationabout the frequency of occurrence of attributes in animage. Because the ambiguity in the Necker cube can bethought of as viewpoint ambiguity—the dominantpercepts are consistent with two viewpoints of a singlecube (Mamassian, et al 1998), we introduced background

Page 2: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 2

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

context that carried viewpoint information. In particular,our context stimuli consist of fields of unambiguouslyoriented cubes whose orientations are consistent with oneof the dominant Necker cube interpretations. Thesebackground objects flank a central Necker cube (withoutoverlap). Similar to previous effects of context onrivalrous stimuli (Sobel and Blake, 2002) we find contextlengthens the durations of the supported percepts.

To better quantify the effects of context on thedynamics of multistability, we developed a new approachto the analysis of multistability data. We developed ageneral non-parametric descriptive framework foranalyzing multistability data based on Markov RenewalProcesses. The approach describes perceptual switching interms of a discrete state variable which undergoesprobabilistic transitions after random times in a state.The approach overcomes several limitations of previousanalysis methods. It can capture the temporal dynamicsof the switching process, contingencies between states anddurations and provides tools to relate different measuresof perceptual behavior. We provide a brief exposition ofthe key ideas and mathematical properties of theapproach in Appendix A.

We use our analysis to reveal the dynamics ofmultistability in the presence of context. We found non-trivial dynamics that were modulated by context, withstrong biases toward one perceptual interpretation oninitial viewing that partially dissipate in subsequentviewing, similar to those reported by Mamassian et al(2005) for rivalrous stimuli. In addition, we show how theanalysis approach can be used express relationshipsbetween measures, predict co-variaton between the meansand variabilities of percept durations, and to teststationarity (the property of stationarity measures whetheror not the probability distributions governing perceptualstate transitions vary across time).

Lastly, we show how multistability could result fromthe normal dynamics of perceptual inference. Similar tovan Ee et. al, 2003 we assume that multistability ariseswhen perceptual inference produces ambiguity that canbe described by a multi-modal posterior distribution onthe parameters of interest (e.g. surface slant, viewpoint,etc). The novelty of our approach is to provide a simplebut specific method of implementing perceptual decisionthat can account for the origin of the temporal dynamicsin multistability. In particular we show that searching forgood interpretations by sampling a multimodal posterior,together with memory decay which values recent samplesin times more than older samples, results in multistable-like behavior similar to that exhibited by participants.We believe that treating multistablity within the contextof a general approach to perception (Bayesian inference)has the potential to reconcile many of the diverse resultsobserved for multistable perception.

Methods

Task

Participants continuously viewed a Necker cube for10sec intervals while fixating at a point in the center ofthe cube (Figure 1). Perceptual states were elicited viabutton press at semi-random intervals by the sound of abeep. A trial is defined as a single event of an arrowpress at the sound of a beep to indicate the orientation ofthe current perception of the Necker cube (see Figure 1).We refer to the two dominant interpretations in terms ofuser’s viewpoint with respect to the Necker cube – ‘VA’indicates the percept is consistent with the viewpoint ofthe subject is above the cube in 3D space (resulting in thetop surface perceived as in front) and ‘VB’ indicates thatthe percept is consistent with a viewpoint below the cubeand hence viewing the bottom surface. At the sound of abeep, if the user is currently experiencing the VA percept,they were instructed to press the ‘up’ arrow key and ifthey were experiencing the VB percept, they were to pressthe ‘down’ arrow key.

Figure 1: Necker Stimulus used in the experiment. Participants

fixated the central green dot and reported their perceptual state

via key-press when cued by the sound of a beep.

Stimuli and Apparatus

The Necker cube was rendered in OpenGL and orientedat 45 degrees away from the x-axis and 45 degrees awayfrom the y-axis such that vertically and horizontally itsubtended a visual angle of 3.16 degrees. Participantswere directed to maintain fixation on a tiny green dotrendered in the center of the cube. The user’s chin restedon a chin rest fixed in the center of the 24.2 cm highmonitor at a distance of 71cm.

Training

All participants ran in the following training sessionsbefore the main experimental data was collected. Atraining session consisted of the following (in the ordergiven):

1) 40 repetitions of VA/VB key association practicewhere a single unambiguous cube would be presented inthe middle of the screen and the subject is asked toindicate with a key press what percept they wereexperiencing.

Page 3: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 3

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

2) 40 repetitions of orientation discriminationpractice: two columns of 100 unambiguously cubesappeared as shown Figure 2 and Figure 3. The twocolumns differed in orientation from the Necker Cube inthe x-axis by a mean of 7 degrees and a standard deviationof 1. One of the columns (chosen at random) differed bya positive mean of 7 and the other by a negative mean of7 and standard deviation of 1. The user was asked toindicate which column appeared more fronto-parallel.

3) The Necker cube was presented for 1 minute. Theuser was asked to passively focus on the green dot in themiddle of the screen. No beeps were sounded and no keypresses were required.

4) The Necker cube was presented in the middle ofthe screen for 20 seconds. Beeps were generated using arandom distribution with a mean of .5 seconds and astandard deviation of a 100 milliseconds after registeringthe previous response. The subject was asked to quicklyrespond with a key press at the sound of the beep, whichresulted in an average inter-response interval of 1.2sec.Five 20sec blocks were collected from each subject, whichconstituted the base-rate data for the no-context condition.

No time limits were enforced in the training session.The session was usually completed in 6-7 minutes.

Figure 2: An example of an orientation discrimination task for

unambiguously oriented cubes in VA orientation. Subjects are

asked to indicate which column is more fronto-parallel.

Figure 3: An example of an orientation discrimination task for

unambiguously oriented cubes in VB orientation. Subjects are

asked to indicate which column is more fronto-parallel.

Context experiment

The experiment consists of 100 blocks where each blockconsists of 10 trials. During the duration of a block, theNecker cube appears in the middle of the screen and isflanked by two columns of 100 unambiguously cubes oneither side of the cube as shown in Figure 4and Figure 5,

in the exact conditions of the orientation discriminationtask described in the training session. The orientations ofthe cubes in the two columns are either both VA or bothVB. The orientation of the cubes for a block is randomlypermuted such that at the end of 100 blocks, 50 of themwould be VA oriented and 50 of them would be VB

oriented. In a block, the beeps are generated using aNormal distribution with a mean of .5 seconds and astandard deviation of a 100 milliseconds (similar to thetraining session). It takes an average of .5 seconds for theuser to respond to a beep (by pressing an arrow key).Therefore, the duration of a block lasts for 10-11 secondsas 10 events of arrow presses are recorded for every block.If the subject takes longer than 1.2 seconds to respond tothe beep, a double beep is sounded and the block restarts.A 2-3 minute break is provided at the end of 20 blocks.

Figure 4: An instance of the VA context stimulus.

Figure 5: An instance of VB context stimulus.

Participants

21 participants ran in three 1-hour experimentalsessions scheduled over three days. All had normal orcorrected-to-normal vision. Participants were eitherfinancially compensated or given extra-credits points in anintroductory Psychology class.

AnalysisData from multi-stability experiments are composed

of sequences of durations spent in alternating states. Akey characteristic of this data are switching betweendiscrete states with random phase durations betweenswitch events. An essential part of analyzing bistabilitydata is estimating the distribution of phase durations.However, for a complete model of the data it is notsufficient to simply histogram the phase durations. To do

Page 4: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 4

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

so would ignore the possibilities that phase durationdistributions may depend on previous perceptual statesand may change across time. Our goal was to developanalysis methods that could more completely capture thetemporal dynamics of perceptual switching behavior.

As pointed out by Mamassian and Goutcher (2005),to capture the temporal dynamics of perceptual switchingit is not sufficient to measure only response probabilitiesor phase durations, because these measures are insensitiveto temporal changes in the distribution of phasedurations. To rectify this, Mamassian and Goutcherintroduced three additional measures – transientprobabilities, reversal probabilities and survivalprobabilities to capture the time-varying dynamics ofperceptual switching. We also measure survivalprobabilities, but use a non-parametric estimationmethod (they fit cumulative Gaussian functions). Inaddition, we measure cumulative transition distributionfunctions (these are related to the inverse of phaseduration) separately for each state. Both these measuresare conditioned on particular times (rather than averagedacross all occurrences in the sequence) and areconditioned on perceptual states as described below.Ultimately, these measures were chosen because of theirsimple relationships to a general mathematical model ofprobabilistic switching processes called Markov RenewalProcesses, as described below in this section and inAppendix A. First we describe the measures, andsubsequently explain how they were estimated from thedata.

Preliminaries

Our data consists of sequences of response/timepairs. For the kth block of responses, letD

k={(r1,t1),(r2,t2),…,(rN,tN)}, where ri denotes

responses, encoded as 0 for VA and 1 for VB, and tirepresents a recorded response time measured from thestart of the response period. We also use superscripts toindicate the block number of a measurement, (e.g. ti

k

represents the ith measurement from the kth block. All ofour measures were conditioned on starting times, definedfor each sequence as:

0k min ti

k |rik 0

1k min ti

k |rik 1

In words, a starting time is the first time a state isobserved within a response block.

In general, all measures were computed using thefollowing steps. Responses in each block were classifiedaccording to the event type of interest (e.g. whether atransition has occurred). Interpolation is used toconstruct a time-continuous indicator function from theclassified events separately for each block. Estimates ofthe desired probability functions are generated fromaverages of indicator functions across blocks.

Estimating Survival and Response probabilities

Survival probabilities are defined as the probability ofobserving the same state i after an elapsed time t:

P(s( t)i |s( )i)

This function would be simple to compute if we couldmeasure perceptual states continuously as a function oftime. Instead, the response/time sequences can beinterpreted as discrete samples from this continuousfunction.To compute survival probabilities at each time t, weclassify response events for each block based on whetherthe initial state was VA or VB. Then we linearlyinterpolate between response times and whether therewas a match with the initial state. Let Sk(t) represent theinterpolated indicator function for the kthblock. For eacht falling between two response times, Sk(t) is given by (letthe subscripts 0 and 1 in Sk(t) represent states VA and VB

respectively) :

S0k (t) 1ri

k t( tik 0

k )

t i1k t i

k 1ri1k (t i1

k 0k )t

t i1k t i

k for (ti

k 0k ) t (ti1

k 0k )

(1)

if the initial response is of type VA. The interpolatedindicator function for initial response of type VB issimilar:

S1k (t) ri

k t( tik 0

k )

t i1k ti

k ri1k (t i1

k 0k )t

t i1k t i

k (2)

No extrapolation is performed for times t greater than anyof the response times. Instead, times exceeding the lastresponse are treated as missing data for that block. Tokeep track of which times points are valid in each block,we compute a second indicator variable that marks validtime points:

J k (t)1 0 tmax(ti

k )0 otherwise

The survival probability estimate (PS) for state i iscomputed as:

P̂S s( t)i |s( )i Si

k (t)k1

#blocks

Jik (t)k1

#blocks

(3)

Response probabilities (PR) are similar, but simpler toestimate. The recorded response sequences for eachblock were interpolated and averaged incorporating onlyvalid times as above.

Estimating Cumulative Transition Probabilities

We also measured the cumulative transition probabilityof each state, conditioned on initial states 0

kand

1k(where subscripts 0 and 1 represent states VA and VB)

Note that the time spent in a state is the same as the timeobserved before a transition. Let jmin represent the indexof the first response of the appropriate type (VA or VB).

Page 5: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 5

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

To construct an indicator function for a transition from astate, VA or VB, we first classify the responses subsequentto jmin in each block such that responses that are thesame as the initial state get classified as zero and responsesat and after the response that is different from the initialstate get classified as 1.

The formula for )(tZ k returns 0 for all responsesafter jmin that match the conditioning state (initial stateVA or VB) before transition, but returns 1 for all responsessubsequent to the first mismatch (after transition). Let

kZ0 denote the transition sequence described above forstate VA. Linear interpolation is used to constructindicator functions for cumulative probability untiltransition. For the cumulative transition probability ofstate VA, the indicator function T0

k(t) is given by:

)()(for

)(

010

)(

0

)(

001

01

1

0

kki

kki

tt

ttk

tt

ttkk

ttt

ZZtT ki

ki

kki

ki

ki

kki

(4)

We computed indicator functions for valid times asbefore, and generated estimates of cumulative transitionprobabilities (PT) by averaging indicator functions acrossblocks:

blocksk

ki

blocksk

ki

iTtJ

tTisNP

#1

#1

)(

)()(|0ˆ (5a)

where Ni is the number of transitions from state i. Notethat another way of expressing Eqn. 5a is:

blocksk

ki

blocksk

ki

iTtJ

tTistDP

#1

#1

)(

)()(|ˆ (5b)

In Eqn. 5b, we are expressing the idea that PT computesthe cumulative probability of the duration of a percept isless than t.

Phase duration means and variances were computeddirectly from numerical estimates of the moments of thecumulative transition probability curves.

Errors in estimates

Errors in probability function estimates were assessedby resampling data across blocks using standard bootstrapprocedures (Efron and Tibshirani, 1993). In particular,blocks were re-sampled with replacement and 50bootstrap estimates of the survival, response andcumulative transition probability functions weregenerated using the .632+ procedure (Efron andTibshirani, 1994). Standard errors of the probabilityestimates were computed via standard deviations of theset of bootstrap probability function estimates.

We tested several assumptions of our estimationmethod by assessing whether modified estimates would

fall outside of the 95% bootstrap confidence intervals. Inparticular, we found that changing the interpolationmethod to cubic interpolation had no significant effecton estimates, nor did assigning transition times to themidpoint between response times. We also assessedwhether selecting only those blocks where 0

kand 1

k

occurred in the first 1-3 seconds would change theestimates, but also found no significant effect.

Note that the estimates are noisier for the elapsedtimes approaching 10 seconds after the first recordedpercept time; this is because there were fewer blocks inwhich the appropriate initial state occurred at the start ofthe block.

General Framework for Analyzing Multi-stability data

Markov Renewal Processes (MRPs) (Ross, 1992) aregeneral, well-characterized stochastic processes that modeltransitions among discrete states. In particular, MRPsmake transitions between discrete states via a Markovchain and sample the amount of time spent in each statefrom a set of state-contingent temporal distributionfunctions.

MRPs provide greater clarity on the analysis ofmultistability behavior by expressing relationshipsbetween different measures and address challenging issuesin data interpretation such as the temporal dynamics andstate stationarity.

Some of the key properties useful for quantifyingbistable data include: temporal contingencies can bemeasured by conditioning on past events (we focus onfirst order, or just the previous event); non-stationaritycan be diagnosed; MRPs allow for arbitrary temporaldistribution functions for perceptual durations; many ofthe key predictions and relationships involve momentsrather than full distribution functions (so thatdistributional assumptions can be avoided). Finally, inprevious work (Schrater and Sundareswara, 2006) weshowed how MRPs can be generated by our theoreticalmodel described in the section “Theoretical frameworkfor interpretation”. We describe the properties of MRPsmore fully in Appendix A.

Results

Average Effects of Context

The average effects of context on Necker cubeperception reported in Figure 6, Figure 7 and Figure 8.We analyze response data compiled across 20 participants(one participant was excluded from the analysis as a clearoutlier). If context partially disambiguates perception ofthe Necker cube, we would expect background contextsthat suggest view from above to increase VA perceptdurations, but may also decrease VB percept durations.We would also expect to see an initial bias toward VA firstin the first few responses.

Page 6: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 6

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

Figure 6 shows response probability (PR) as a functionof time for view from below responses VB (i.e. thepercentage of time participants will respond VB at thesound of the beep). The base-rate/no-context data (bluecurve), shows that participants are biased towards VA,percepts (viewpoint from above) quite strongly for initialresponses made early in a response block and asymptoteat approximately 40% after 5-6 sec. The fact that theasymptotic value is not 50% shows that percepts werebiased toward viewpoint from above on average withoutcontext. The red curve (VA context) shows increased VA

response to VA context. The green curve shows increasedVB responses to and VB context. However, while VB

context strongly biased initial responses toward view frombelow, VA context stimuli do not significantly changeinitial bias over baseline. Overall, VB context produces alarger change in response behavior than VA context. VA

context may produce a smaller shift because of the biastoward VA responses in the no-context data.

While response probabilities showed overall effects ofcontext, they do not capture state-contingent effects – forexample, it does not answer whether context hasdifferential effects on transition probabilities of similarand dissimilar percepts. We use two related measures tocapture state-contingent effects, cumulative transitionprobability (PT) and survival probability curves (PS), whereboth are conditioned on the initial perceptual state.Based on response probability results we would expectcontext to increase time until transition and increasesurvival probabilities for context-consistent perceptualstates.

Cumulative transition probability (PT) and survivalprobability (PS) data are shown in and Figure 7 and Figure8. In both figures (a) shows results for initial VA percepts(b) shows results for initial VB percepts. Clear effects ofcontext for both types of percept are observed. Whencontext is consistent with the initial percept, this initialpercept takes longer to transition (i.e. consistent perceptsare less likely to transition with consistent context.) andsurvival probabilities increase. In addition, inconsistentcontext has a suppressive effect, increasing the likelihoodof transition. However, suppressive effects are only largefor view-from-below initial perceptual states VA. Thismight be related to biases toward VA perceptual statesfound in the baseline conditions. For example, becauseVA percepts are more frequent in baseline viewing (interms of durations, and initial frequencies) there may be aceiling effect for suppression. The relationship betweenthe two measures in shown in Appendix A. Hoqw

Next, we examined how the mean phase duration ofboth percepts was affected by context. Literature inbinocular rivalry (Blake and Logothetis, 2002; Sobel andBlake, 2002) suggests that in the presence of a supportingcontext, dominance times of the matched percept arelengthened, but suppression times of the incongruentpercept are not affected, suggesting that perhaps there are

different neural events associated with dominance andsuppression.

However, in Figure 9 we show that not only weredominance times lengthened but suppression times wereshortened as well – although the amount of suppressionis much smaller. Thus we see no need to invoke differentneural events to explain our bistability data.

Figure 6: ‘PR’ measure: Response of VB percept as a function

of time. The blue curve represents baseline data (no context),

the red curve data is PR when the context shown was VA, and

green represents PR when the context was VB context. The

figure shows large initial response biases both with and without

context. View from above context can be described as

producing the same initial bias as baseline, but preserves more

of the initial bias to asymptote around 0.3. VB context produces

a strong bias towards VB responses that decreases to

asymptote around 0.6 after 5 sec Dashed lines represented

+/-1 bootstrap s.e. of the estimate.

Page 7: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 7

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

Figure 7: Average effect of context on cumulative transition

probability, conditioned on the initial perceptual state. Shows

PT curves given the initial percept is: (a) view from above, VA

and (b) view from below, VB Context has both positive

(positive) and negative (suppressive) effects on cumulative

transition probability (red=VA and green=VB, dashed lines

represent +/- 1 bootstrap standard error) compared to no

context (blue curves). Durations (time before transition) spent

in the VA initial state (figure a) are prolonged by VA context and

shortened by VB context. However, while initial VB durations

(figure b) are prolonged by consistent context, there is little

suppression of VB percept from view-from-above context when

compared to the no-context (base-rate) condition. Note that

context effects on distribution functions involve simultaneous

shifts and slope changes, corresponding to correlated changes

in the phase duration means and variances.

Figure 8: Effects of context on survival probabilities (PS). (a)

survival of VA initial percept. (b) survival of VB initial percepts.

Context has both positive (dominance) and negative

(suppressive) effects on survival probability functions (red=VA

and green=VB) compared to no context (blue curves). Dashed

lines represent +/- 1 bootstrap standard error. Note that all

survival curves have the same asymptote as the response

probability functions (which they should if MRPs provide a

sufficient data model).

Individual Effects

Although the previous results hold for participants onaverage, we also found individual differences in contexteffects that included effects on initial responses and themagnitude of the context effects. We found thatindividual response probabilities during the first twoseconds of a block could be divided into two main

Page 8: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 8

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

categories depending on the strength of the context effect.We call the difference between initial responseprobabilities across contexts ‘RD’, illustrated in Figure 10.Some participants have a large RD (Figure 10a) that ishighly influenced by the context while others are hardlyinfluenced (Figure 10b)

Large values of RD might indicate participants aremore ‘malleable’ to context, which predicts they wouldshow stronger context effects throughout the blockduration (10 seconds). We tested whether there was arelationship between (RD) and the magnitude of thecontext effect on transition probabilities, as measured bythe difference between the cumulative transitionprobabilities (PD) evaluated at 4 secs.

Figure 9: Effect of consistent an inconsistent context on

average percept durations. Average percept durations with

context are plotted against average durations in the baseline

condition, for each combination of percept and context type.

Error bars represent 99% bootstrap confidence intervals. The

blue line represents the values predicted if context had no

effect. Points above the line correspond to percept durations

increased by context (dominance), while points below the line

indicate context-induced suppression.

Figure 11(a) shows the relationship between RD and themagnitude of the context effect is roughly linear- - largerinitial response effects predict larger context effects onpercept durations. A similar relationship was foundbetween RD and the size of the context effect on survivalprobabilities averaged between the 2nd and 5th second,shown in Figure 11(b). Thus context effects areimmediately present and can be used to predict contexteffect sizes.

Figure 10: Illustrates the initial response measure RD, for two

extreme-case participants. RD is defined as the difference in

PR curves (response probability) across context, averaged over

the first two seconds. The extreme cases of RD are shown,

where the PR curves in each panel are computed from data

pooled across the two participants with the largest (left) and

smallest (right) initial response effects.

Figure 11: Relationship between initial response effects PR

and the size of the context effect on cumulative transition

probabilities, PD (left) and survival probabilities (right). Red

points correspond to the difference in PD of the VA percept

between contexts, evaluated at 4 secs. The green points

correspond to the difference in PD of the VB percept between

contexts, averaged across 2-5th secs. A roughly linear

relationship holds in both bases, with a correlation coefficient

of -0.67 (p= .0014) for red points, and 0.6 (p=0.0054) for

green points.

Page 9: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 9

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

Figure 12: Effect of context on means and standard deviations

of percept durations for individual participants. (a) Percept

duration standard deviations of time spent in consistent

percept/condition is larger than inconsistent percept/context

condition. The red points indicate the consistent

percept/context condition i.e. the mean and standard deviation

of VA percept when the context is VA. The green points

indicate the inconsistent percept/context condition, which is

condition of VB percept when the context is VA. The arrow

indicates the average direction of change (mean = 1.85, std =

0.70) for 20 Participants from inconsistent percept/context

condition to the consistent percept/context condition. The

average direction of change indicates that the mean and

standard deviation increases as percept matches the context.

. The figure also shows the existing correlation (.722, p-value

< 10-5

) between mean and standard deviation.

(b) Mean and standard deviation of time spent in consistent

percept/context condition is larger than inconsistent

percept/context condition. The red points indicate the

consistent percept/context condition i.e. the mean and

standard deviation of VB percept when the context is VB. The

green points indicate the inconsistent percept/context

condition, which is condition of VA percept when the context is

VB. The average direction of change (mean = 1.41, std

=0.688)) indicates that the mean and standard deviation

increases as percept matches the context. The figure also

shows the existing correlation (.8048, p-value < 10-7

) between

mean and standard deviation.

We were also interested in determining how contextaffected the means and variances of percept durationsacross individual subjects. Past research (Borsellino et al,1972) have revealed a correlation between means andstandard deviations of percept durations in bistable data.As explained in the Appendix A, this correlation is apredictable feature of the family of Markov renewalprocesses. Our question was whether means and standarddeviations co-varied in our dataset, and if so, wouldchanges introduced by context also co-vary?

As Figure 12(a) and (b) indicate, the data showsstrong co-variation between means and standarddeviations across participants that is roughly linear.Moreover, the changes in means and standard deviationsacross contexts were also correlated, and tended to be inthe same direction as the scatter across individualsubjects. To visualize that trend, we averaged the changevectors across subjects and superimposed the direction ofaverage change as a blue vector. The effect is ratherstriking—the direction of change in means and standarddeviations we find across participants is roughly the sameas the direction of change induced by context for aparticular participant.

Another outstanding question was whether a fewparticipants with a large effect size drove the averagecontext effect. Although the amount of data collectedwas not sufficient to make strong within-subjectscomparisons, we found all but one of the participant’scontext effects were in the same direction. In addition,more than half of the 20 participants showed significantchange in means and standard deviations across contextfor both percepts at the 0.1 level. We tested thesignificance by evaluating the consistency of the directionof the context effect suggests that the effects of contextwere similar for the majority of subjects.

Theoretical Framework forinterpretation

Bayesian inference via sampling

The function of the brain’s visual system is to makeinformed guesses or inferences about the world given

Page 10: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 10

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

varying levels of uncertain retinal input. Inferencesresulting from Bayesian statistics have been widelyemployed to describe visual processing from the level ofneuronal behavior (Pouget, Dayan and Zemel, 2003) tohigh-level visual processing (Yuille and Kersten, 2006).The purpose of this section is to show how bistableswitching behavior might also arise from probabilisticmodels of perceptual inference. By providing a simplemethod for generating spontaneous switching in aBayesian framework we offer an alternative to the ideathat explanations of perceptual bistability requirespecialized neural mechanisms whose job is to implementdominance, suppression or switching.

At minimum, theories of bistability, should provideexplanations of: 1) the existence of multipleinterprettions; 2) the awareness of only one interpretationat a time; and 3) spontaneous switching between percepts.The first two criteria are natural byproducts of usingBayesian decision theory to construct computationalmodels of perception (Kersten and Schrater, 2002).Models of perception formulated using Bayesian decisiontheory treat percepts as decisions about scene (world)properties based on a combination of prior visualinformation (priors) and knowledge about imageformulation and scene/image regularities (likelihood).Decisions have two parts—an inferential process thatcomputes probabilities across interpretations and aselection process that chooses high-probabilityinterpretations (and may incorporate costs, Kersten andSchrater, 2002).

Interpretation ambiguity is characteristic ofambiguous objects; in Bayesian decision theory, thehallmark of ambiguity is the presence of a multimodalposterior distribution function. For example, part ofperceptual inference for the Necker cube figure is todecide the viewpoint direction from which the image Iof the cube was taken (we do not address the other part,which involves an inference about the shape of thewireframe object). Given that the object’s shape iscuboidal, the posterior probability P( | I) will bebimodal (see Figure 13), with equal peaks given a uniformprior on viewpoint. Bayesian inference can alsoincorporate viewpoint information (gathered fromexperience or background context) to bias perceptiontoward more probable views of the cube. Previous workby Mamassian and Landy (1998) show that humanobservers are biased towards a viewpoint from above (our“no context” condition data suggests this as well). InBayesian models, percepts result from selecting a goodinterpretation, typically by choosing the viewpoint withmaximal posterior probability. When the maximum isnot unique, Bayesian decision theory provides nomechanism for choosing between equally goodinterpretations. Thus, while criteria 1) and 2) are naturalbyproducts of using Bayesian decision theory, criterion 3)(the spontaneous switching of percepts) is not

However, we found that spontaneous switching canarise when Bayesian decisions are implemented by aprocess that generates a set of samples from the posteriordistribution across time, and selects a percept from thelargest probability sample in the set. We describe thisprocess in detail below. The key idea of the approach is toimplement probabilistic inference by a samplingprocedure, in accord with some recent theories of howBayesian decision theory may be neurally implemented. Inparticular, Lee and Mumford (Lee and Mumford, 2003),and Pouget (Pouget, Dayan and Zemel, 2003), point outthat probabilistic population coding models of cortex canbe thought of as representing probability distributions interms of samples from the distribution. In this approach,receptive fields of neurons in the population encoderegions of the interpretation parameter space, and firingrates encode the probability of a sample. Whether or notthis interpretation of neural coding is an accuratereflection of biology, sampling constitutes a fundamentalmethod for implementing Bayesian inference, and thebrain’s computations may perform something equivalent.In the next section we explain how switching canspontaneously arise in Bayesian decisions based onsampling.

Figure 13: An illustration of the probability densities used in the

model. The interpretation ambiguity is represented by a bi-

modal density function on cube orientation (blue curve). The

prior P() encodes biases toward one of the interpretations,

including the effects of context (green curve). The product of

the likelihood and the prior produces the (unnormalized)

posterior probability (red curve), from which we sample in our

simulations.

Page 11: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 http://journalofvision.org/5/1/1/ 1

doi:10.1167/5.1.1 Received January 1, 2005; published February 18, 2005 ISSN 1534-7362 © 2005 ARVO

Figure 14: An illustration of the sampling procedure. (a) A bimodal posterior is shown with independent sampling. The two modes

correspond to the two interpretations: VA percept and VB percept (b) The probability values for samples chosen earlier in (a) shown: red

= VA percept, green = VB percept (c) 5 samples in window are shown. Each sample is generated at the point in time as indicated by

time (x-axis) axis. The red corresponds to the VA percept and the green corresponds to the VB percept. The exponential decay factor

is applied to the samples (i.e. each sample’s probability is weighted by the exponential decay factor’s value corresponding to that point

in time). (d) The sample with the maximum value after the weighting is chosen as the current percept. (e) A new sample at the 6th

time

instant is now in the window – the exponential decay factor is applied. (f) The new max. probability sample is now the sample that

arrived at the 6th

second; this causes a switch in percepts. The last percept in memory was VA, now it has been changed to VB.

Page 12: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones2

Bistability from inference

We assume the brain performs Bayesian inference on theNecker cube figure and generates a bimodal posteriordistribution, representing the two dominant viewpointinterpretations. A decision process operates on thisposterior, and brings the best interpretation into awarenessand memory. More precisely we make the followingassumptions:1) Posterior probabilities are represented by a set of

samples, updated across time. In particular, at asequence of times ti (on a neural scale), the brainexplores the posterior distribution by collecting a newset of samples and discarding the oldest samples.Samples, consist of both the parameter values ((t1))and the associated posterior probabilities, (w(ti)):, i.e.{((t1),w(t1)),…,((tN),w(tN))}, where w(t1)=P((t1)|I) areweights that represent the posterior probabilities whenfirst sampled and ti are consecutive time points.Weights represent the quality of the interpretationassociated with a sample and they are discounted by amemory decay process.

2) Memory Decay expresses the idea that the quality ofan old interpretation decreases with time. We assumesimple exponential discounting of a memory sample’sweight by the age of the sample.

3) Perceptual Decisions result from choosing the samplewith the highest discounted weight. This sample’sparameter value is chosen as the current interpretationand brought into memory. The memory sample’sdiscounted posterior probability w(ti) represents thequality of the interpretation; this is also stored inmemory.

The interpretation process is illustrated in Figure 14.These basic assumptions produce spontaneous perceptualswitching similar to human behavior.

In (Schrater and Sundareswara, 2006), we analyze theprobabilistic properties of this kind of sampling model in amore general setting and show that they generate MarkovRenewal processes. Note that memory update events(where samples in memory are exchanged) can be thoughtof as perceptual state transitions. These transitions caneither be same-state or between state transitions, but weassume that same-state transitions are not perceptuallysalient. Additionally, we showed that MRP models can begood descriptions of our human bistability data. Here weillustrate the explanatory power of the theory describedabove by accounting for the main elements of our data witha simple implementation of a sampling-based decisionmodel

Simulation Methods:

Posterior: A 2D image I of a Necker cube is theoretically iscompatible with the projection of an infinite number ofpolyhedra, depending on the depth of its 8 vertices.

However, human perception is dominated by twointerpretations, both cuboidal in shape, that involve twodifferent poses or viewpoints on the object. Therefore wesimplified the 8-dimensional Necker Cube vertex-depthspace into a 1D viewpoint parameter space of theobserver. The viewpoint likelihood defined by P(I|) ismodeled as the sum of two Gaussians centered on each ofthe two Necker cube interpretation modes,VA and VB.Assuming a Gaussian prior on viewpoint centered on M,

the posterior, P() is proportional to P(I|)P().

Memory Decay: Memory decay is implemented byexponential function e

-t/that discounts the weights

attached to samples as a function of their age t. The rate ofdecay is determined by the parameter The sampling ratewas arbitrarily set to 1000 samples/sec, however, thischoice did not matter because changes in sampling rate canbe compensated by changes in

Sampling and Perceptual Decisions: The sampling iscarried out as follows. At the start of the simulations, ‘n’independent samples are drawn i.i.d. from the posteriordistribution. The sample with largest weight (afterdiscounting) is selected as the current percept–theorientation of this sample represents the perceivedorientation of the Necker cube. Subsequently, newsamples are drawn in the same way and added to thecurrent set of samples, and samples older than 1 secondare discarded.

Simulations: To illustrate model properties we varied twoparameters in the simulations, the ratio of posterior heightsat the peaks of the two likelihoods and the memory decayrate. These two parameters were varied because theycorresponded to hypothesized differences betweenparticipants (this is confirmed by Figure 16).

We varied the ratio of posterior heights to account forpotential differences in response probabilities andcumulative transition probabilities between subjects. Thereare many ways to vary the posterior heights, but we wereparticularly interested in assessing the effect of varying thewidth of the prior distribution because the width of theprior distribution encodes how strongly viewpoint is biasedby the context manipulation. Leaving the prior mean M

fixed to reflect either context VA or VB, the relative heightsof the two posterior peaks can be controlled by varying theprior variance 2

, which adjusts the ratio of the priorprobability values at the two peaks of the likelihood:

P(VA )

P(VB )expVB M

2 VA M

2

2 2 (6)

where VA is the mean parameter value that represents the“view from above” percept and VB is the mean thatrepresents the “view from below” percept.

We varied between 1+.5-8 and 1.03 for each contextcondition. For simplicity, we rescaled the viewpoint axisso that the two Necker cube percepts corresponded to --90

Page 13: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones3

degrees (VA percept at mode VA ) and +90 degrees (VB

percept at mode VB) and the likelihood standarddeviations were set to 0.3. The mean of the prior, M, wasfixed at 0.1 or at -0.1, depending on which contextcondition we were simulating.To model potential differences in memory discountingbetween observers, we varied a second simulationparameter, the exponential memory decay constant which determines how quickly sample weights decay.Assuming a sample is generated every millisecond, wevaried the memory decay constant () such that the decay ofa sample’s age deteriorated from 1/20th to 3 seconds of itsage, i.e varied from 20 seconds to .33 seconds

Results

In this section, we compare the results of the simulationswith our experimental data. To better compare humanand model behavior, we simulated the experimental datacollection procedure to collect model responses andanalyzed the simulated response data exactly the same wayas human data.

First we show that the simulated data exhibitsswitching distributions similar to humans. Figure 15illustrates simulated survival probability curves for theand parameter value that best matched human data. Weonly fit , because we found that changes in did notmatter as far as a good-fit of the pooled subject data.Therefore, we fixed = 2 seconds and computed fits basedon just The human and simulated survival probabilitycurves matched within measurement error with theexception of context- consistent VB data. We believe thatthe mismatch for this condition is likely due to non-stationarity in the human data for the VB context data.The best-fit model parameters for both context conditionsare simple to interpret. The effect of context stimuli can bemodeled as higher confidence (low variance in the priordistribution) in percept interpretation when thebackground orientation matches the percept.

Next we addressed whether variations in simulationparameter can capture much of the between-subjectsvariation. In Figure 16 below, we plot phase durationmeans against variances for each of the 20 participant’sdata in the context condition. Plotted on the same graphare simulated means and variances, where each pointcorresponds to a different pair of prior ratio (at the twopeaks of the likelihood) and memory decay values. Despitethe simplicity of the theoretical assumptions andsimulation choices, the model produced bistable behaviorsimilar to human data and the variation in simulationparameters captures most of the spread of human data.Note that the simulated data follows the same “arc-like”trend of the human observers for the context consistentcondition. Additionally, variation in simulationparameters generates correlations between means andvariances for all four conditions that are similar to humandata.

Figure 15: Comparison of pooled human (red) and model (blue)

survival probability curves, for the best-matched simulation

parameters. Survival probability curves for: (a) Percept VA in

context VA. (b) VB in context VA. Simulations for context VA run

with with best-matched parameters: (c) VA in context

VB. (d) VB in context VB. Best matched simulation parameters

e. Dashed lines represent +/- 1 bootstrap SE. Base-

line (no context) best matched simulation parameters were:

As previously mentioned, our model is an instance of aMarkov Renewal Process. Here we show the model’s MRPgenerator functions for simulation parameters that bestmatched the combined VA-context human data. Asexplained in Appendix A, MRPs are discrete-state stochasticprocesses generated by functions Qij(t)=ijFij(t) whichrepresent the probability that the process makes atransition to state j in time less than or equal to t, given theprocess just entered state i. In an MRP model, perceptualstate updates evolve according to a Markov chain withtransition matrix:

ij = P(sN= j|sN-1 = i)

Page 14: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones4

Figure 16: Comparisons of phase duration means vs variance in

context-consistent and context-inconsistent phases. In all

figures a-d, red points indicate subject data and blue points

indicate simulated data for one pair of () parameter values. (a)

Context-consistent view-from-above data (VA context/VA

percept). There is an arc-like pattern in the human data that is

mirrored by the simulated data. (b) Context-inconsistent VB

percept data (VB percept durations in VA context). (c) Context-

inconsistent VA percept data. (d) Context-consistent VB percept

data (VB percept survival in VB context).

Figure 17: Simulations capture observed correlation between

mean phase durations found across participants. Data points

labeled as in Fig. 21. (a) Context consistent mean phase

durations plotted against each other. (b) Context-inconsistent

means The analysis reveals the ability of the model to capture

key aspects of individual differences in human data.

Finally, we show how the model generates a simpleMarkov Renewal Process (MRP), and also how thegenerator functions of an MRP provide a quantitativedescription of both the simulated and human data.

In the Bayesian theoretical model outlined, the timesbetween updates (defined as the selection of the highestweighted sample) can be described by temporal distributionfunctions Fij(t) that depend on the current state and thenext state entered and ij represents average transitionprobability of state i to state j. Survival probabilityfunctions, cumulative transition probability distributions,and response probability functions can be derived from thegenerators for an MRP as shown in Appendix A.

To better characterize the statistical properties of thesimulation model, we computed the MRP generators thatcorresponded to the best-match simulation parameters.From the best-matching simulations, was obtained bycomputing the percentage of time each type of transitionoccurred in memory update events in the simulation. Fij(t)were estimated using empirical cumulative distributionfunctions of the amount of time occurring betweenmemory events.

Because transitions in the simulation are events inwhich the sample in memory is replaced by a higher-weighted sample, simulation updates can be interpreted asperceptual updates but with an important caveat: same-statetransitions can occur that are not perceptually salient.Most of the important properties of the model result as adirect consequence of same-state transitions, includingapproximately Gamma-distributed phase durations, and theobserved covariation between the mean and variance ofphase durations.

We can use the MRP characterization to provide moreinsight into the effects of context on switching behavior inthe model. From our previous results, VA percepts havehigher chances of survival and their phases are longer whenshown VA context. We would expect the best-matchedsimulation data have generators with the followingcharacteristics: > 01 and similarly, 10 > 11, andbecause the transition mechanism in the simulations areindependent of the past, the two rows of the ij will beequal. We simulated a million samples, which generatedabout 70,000 update events. We counted the relativefrequency of these update transitions, conditioned on theprevious model state. The resulting transition matrix was :

VA 0.6660.334

0.6660.334 The first row of the matrix shows that 00 transitions ()occurred at twice the rate of 01 (01), and similarly thesecond row shows 10 (10) occurred at twice the rate of11 transitions (11).

The MRP generator temporal distribution functionsFij(t) are shown in Figure 18. These curves were obtainedby computing histograms of the amount of time betweenmemory updates. Notice that they are similar toexponential functions; an implication of this is that meansand variances of percept durations will be correlated asshown in Appendix A; a fact that is empirically confirmedby other sources (Borsellino, et al 1972; De Marco et al,

Page 15: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones5

1977) and also seen earlier in Figure 16. Additionally, theyare all similar to each other, implying that they have sharethe same temporal distribution mean.

Figure 18: Temporal distribution functions for all transitions

recorded while simulating bistable data (Best matched human

parameters for VA percept were used for the simulation) (a) Blue

curve indicates VAVA transitions, green curve indicates VAVB

transitions. (b) Blue indicates VBVB transitions and green

curve represents VBVA transitions. The simulation gives rise to

temporal distribution functions that are approximately

exponential and are similar for all transitions.

Thus a prior bias in the model affects the probability oftransitioning into VA percepts, as seen in the stationarytransition matrix, However, the temporal distributionfunctions are nearly identical and exponential memorydecay produces exponential temporal distributions.

Discussion

In our simulation, we have modeled the effect ofcontext on bistability. Using exponentially weightedsamples from the posterior distribution, we were able toproduce switching behavior similar to human observers.From the standpoint of MRP generators, the simulationsgenerated a very simple model, and that model capturesmuch of human switching dynamics. However, thesampling idea presented above can generate much morecomplicated behavior. In particular, in Schrater andSundareswara (2006) we show that the independentsampling assumption can be relaxed, and that relaxing thatassumption generates a richer class of MRP models. Thesimulations also rely on a fixed posterior distribution,which is unreasonable. A better model of the perceptualinference should incorporate the effects of time, eyemovements and memory.

As an example of the simplifications in the probabilitymodel, we used a likelihood model on viewpoint (pose)P(I|). A more realistic representation would incorporatelikelihood functions that implement perspective projectionconstraints in the inference of 3D objects from 2D, theeffects of eye movements on the image data, and priorsembodying the typical biases found in interpreting linedrawings, like preference for equal angles between 3Dedges, or compactness. Additionally, the current systemcould be embedded in a hierarchical recurrent inferencesystem similar to the one proposed by Lee and Mumford(Lee and Mumford, 2003). However, the simplesimulations were sufficient to illustrate the idea thatimplementing Bayesian inference via sampling can createhuman-like switching behavior.

In the basic implementation presented here, we assumethat switching behavior is generated by a stationary process.However, a non-stationary switching process is easy tosimulate, although harder to characterize mathematically.For example, by allowing posterior probabilities to varywith time (e.g. due to eye fixations or previous stimuli), webelieve many reported bistability effects could be accountedfor by models that perform inference via sampling (seeDiscussion).

Our approach is related to a model proposed byGepshtein and Kubovy (Gepshtein and Kubovy, 2005) toaccount for data from experiments with bistable dot-lattices. The authors posit that bistability in successivepresentation of dot lattices results from an intrinsic biasthat stochastically drifts in orientation space, causing achange in perception. Their “intrinsic bias” and “stimulussupport” are similar to prior and likelihood respectively.Percepts are probabilistically generated at transition times,where the probability of a percept depends on the ratio ofits intrinsic bias and stimulus support. Their model can beviewed as an instance of a sampling model, but with nomemory and a non-stationary prior distribution (thedrifting intrinsic bias). In fact the key to generatingswitching behavior in their model is the drift in the“intrinsic bias”. In contrast, the most important aspects ofthe model used in our simulation are memory and self-statetransitions. Both models show the potential explanatorypower of sampling within an inference framework toaccount for bistability data.

DiscussionWhen a bistable image was embedded in a context that

favors one of the interpretations, perceptual ambiguity wasreduced; the interpretation supported by the contextpersists longer and occurs more frequently. Additionally,the percept that is not supported by the context issuppressed relative to baseline (without any context).These empirical results suggest context is modulating theperceptual ambiguity of the Necker cube in a mannerconsonant with Bayesian models of perceptual inference.

Page 16: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones6

Our model’s successful prediction of both the time courseof perceptual switching and the impact of context suggestthat perceptual bistability may occur as a by-product ofnormal perceptual decision-making: postulating specialmechanisms for switching may be unnecessary. Inparticular, the same processes may mediate dominance andsuppression of bistable percepts (a view also suggested byGepshtein and Kubovy, 2005). This view is not supportedby recent binocular rivalry results, where supportingcontext might increase dominance times of the consistentpercept, but does not affect suppression times (Sobel andBlake, 2002).

In this paper we introduced a perceptual inferenceapproach to modeling bistability. The model is based onsearching for the high probability interpretation of theNecker cube by sampling a posterior probabilitydistribution on interpretation parameters over time,together with exponential discounting of past samples. Weshowed that this model is capable of producing bistable-likebehavior when the posterior distribution is bimodal. Muchof the explanatory power of this model stems from itsreliance on Bayesian models of perceptual inference.Bayesian models use likelihood terms to encode knowledgeabout image rendering by expressing the scene parametersconsistent with the image. For the Necker cube, alikelihood would express that 3D vertex depths must beperspectively consistent with the image lines and vertices.When combined with a prior that promotes 3Dinterpretations with equal angles between edges, theposterior distribution is bimodal on pose or viewpoint.Within this framework, the effects of context can be simplydescribed as influencing prior probabilities of sceneinterpretations.

Can other effects of bistability be explained within ourtheoretical framework? We believe the theory provides anatural interface to interpret both high-level and low-leveleffects on bistability, because any event that has an impacton the relative probability of perceptual interpretations caninfluence bistabilty in the model. In particular,manipulations of variables which produce changes in aBayesian interpretation of the ambiguous image, shouldimpact perceptual bistability in ways predicted by ourmodel. To help illustrate the potential of this idea, wediscuss three main manipulations below: eye-position,attention and global context.

Eye-position: changes in eye-position are correlated withpercept changes (Einhauser, Martin and Konig, 2004; Ellisand Stark, 1978; Kawabata, 1978). Moreover, fixationregions within an image are predictive of the perceptualstate--for example, fixations are more probable towardregions of the Necker cube perceived as closer. Currently itis unknown whether the change in eye-position causespercept switches or vice versa. Einhauser et al (Einhauser,Martin and Konig, 2004) have suggested a theory thatintegrates both views. They hypothesize that perceptchanges induce eye-position shifts toward image locationsconsistent with the percept (e.g. near points on the Necker

cube), but also that maintaining eye-position inducespercept changes via a bottom-up fatigue process.

In our model, eye position would influence bistabilityby affecting the process of inference. Influences of eyeposition on inference seem quite plausible. The eyes areconverged at the depth of the image plane at fixation, andthe image data at that point is at the highest resolution(least blurred). Points away from fixation are measuredwith less accuracy, both in terms of their depth andlocation, an effect that will change the likelihood function—the fixated region is more tightly constrained. As eye-position changes the likelihood should change as afunction of time. Additionally, fixated regions may beperceived as closer (because of blur gradients and theireffect on perception, (e.g. Mather 1997). For both thesereasons we would predict that eye position should have animpact on the perceptual inference, and thus that eyeposition would influence bistability.

Attention/Voluntary control: Another key variable shownto affect mean durations of percepts in bistable andrivalrous displays is attention (or voluntary control) orattentive tracking (in the case of bi-stable apparent motion).In particular many studies (van Ee, Noest, Brascamp andvan den Berg, 2006; Hol, Koene and van Ee, 2003; van Ee,van Dam and Brower , 2005; Toppino, 2003; Meng andTong, 2004) have shown that voluntary maintenance orforced reversals have been shown to either prolong orshorten mean dominance times respectively. Moreover,combination of both the eye-position and attention canhave a multiplicative effect (Suzuki and Peterson, 2000).

Because attention is known to influence perceptualprocessing we would expect to find influences onbistability. In particular, Lu and Dosher (Lu and Dosher,1998) show that attention increases signal to noise ratio ina discrimination task. Moreover, attention can reducetarget ambiguity in search tasks (Eckstein et al, 2004;2006).These effects can be modeled in terms of their impact onlikelihoods and priors. While most attentional effects havenot been modeled using Bayesian inference models, one ofthe virtues of our proposal is that we can make strongpredictions about the impact of any influence on bistabilitywhen its impact on perceptual inference is known. Forexample, consider a discrimination task that involvesdetermining the depth of a probe object relative to the faceof a cube. An attentional manipulation could be added toquantify the change in discrimination performance thatresults from attending to the face closest to the probe or toa different, task irrelevant face. Given our model, weshould be able to translate the magnitude of the attentioneffect on discrimination into predictions for the effects ofattention to cube face on Necker cube bistabilty.

Global context: Levelt (Levelt, 1965) laid out aconvincing argument that binocular rivalry arose primarilyfrom inter-ocular reciprocal competition resulting fromcoherent but dissimilar images presented to the two eyeswhich implied that the process of alternation washappening early in the visual system – a view that was later

Page 17: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones7

promoted by other research (Blake, 1989; Lehky 1988).However, Diaz-Caneja’s early work (Diaz-Caneja, 1928;translated in Alais, O’Shea, Mesana-Alais and Wilson,2000) and more recent work by Kovács et al (1996) andAlais and Blake (1999) showed that when the imagepresented to each eye consists of a “patchwork” of twodifferent targets (for example, as a complementarypatchwork of two different images or just halves of eachimage), the brain perceives and alternates between the twodifferent and coherent images, implying that a higher-ordersemantic process is operating. For binocular rivalry, thishigher-level processing has been shown to include context—when a target is embedded in a coherent context, it tendsto predominate (lengthening of duration times of thetarget) when compared to placing it in an incoherentcontext (Alais and Blake, 1999). Blake and Logothetis(2002) note that a similar pattern of results is observed bymaking the face/vase bistable figure more face-like. In thecase of pattern rivalry, Maier et al (2005) showed thatdominance times are largely determined by the “global andholistic” interpretation of variables that include the size ofthe surrounding region and stereoscopic depth ordering,rather than by local conflict resolution. All of these resultssuggest the importance of contextual processes for rivalry.For bistability, Gepshtein and Kubovy (2005), and theresults presented here show that context influencesbistability in ways qualitatively similar to rivalry. In thispaper, we showed that context could be modeled asinfluencing a prior distribution on interpretations forbistability. The strong similarities between rivalry andbistability effects raises the possibility that rivalry may resultfrom a similar kind of perceptual decision process asbistability, except that image interpretations choices inrivalry are coupled with choices about data sources.

Quantifying perceptual bistability

The measurement framework introduced in this papercan be used to address several complex issues that arise inthe interpretation of bistability data, including stationarity,covariation between phase duration measures such as meanand variance, and relationships between state-transitionmodels and percept duration distributions.

Intuitively, stationarity measures whether theprobability distributions governing perceptual statetransitions vary across time. A stochastic process withstationary dynamics will converge to a state distributionafter a burn-in period, and maintain that distribution forall subsequent time. Mamassian and Goutcher (2005) usedsurvival probability functions to test a key property ofstationary distributions which is whether the probability ofobserving a percept is independent of the past aftersufficient elapsed time T. We tested whether stationarityheld for our experimental data.

For percept VB, stationarity implies there is a time suchthat the probability of the present state is independent ofthe past:

P VB (tT) |VB (t) P VB (tT) |VA (t) (7)

This relationship can be used to demonstrate that survivalprobability asymptotes much agree. By laws of conditionalprobability,

P VB (tT)|VA (t) 1P VA (tT)|VA (t)

Plugging in equation 7,

P VB (tT)|VB (t) 1P VA (tT)|VA (t)

The equation expresses that survival probabilitiesshould converge to the percept frequency, if the process isstationary—i.e. the switching rates and durations areprobabilistically the same across time (see Ross, 1992).However, non-stationary processes (processes that allow forchanges across time) can result in survival probability curvesthat do not match at asymptote. Thus we can test thestationarity of our bistability data by assessing theagreement between P(VA(t+T) |VA(t)) and 1 - P(VB(t+T)|VB(t)) at asymptote.We present the survival probability curves from ourbaseline and context experiment re-plotted in Figure 19. Inboth figures, the dark curves shows P(VA|VA) and the lightcurves shows 1 – P(VB|VB). Within measurement error,both context and baseline data are stationary over the datacollection period. We do not find strong evidence for non-stationarity in Necker cube, suggesting that effects of initialbias are quite transient, dissipating their influence beforeasymptote. In contrast, Mamassian and Goutcher (2005)found non-stationary rivalry behavior in their experimentfor binocular rivalry data.

The covariation between means and variances weobserved across participants has been found in severalother studies (Mamassian and Goutcher, 2005; Borsellino,1972). In Appendix A, we present an analysis ofdependence between means and variances of the generalclass of MRP models, and show that such variation canarise from a wide class of models. We show that ourtheoretical model generates an MRP with simple propertiesand covariation that strongly resembles our participant’sdata. Thus, our approach provides a compellingexplanation for one of the recurring patterns in bistabilitydata.

Traditionally, bistablity data has been characterized bymeasuring phase duration distributions, frequently usinggamma distributions. In Appendix A, we show thatMarkov Renewal Processes employing same-state transitionsand exponentially distributed times between statetransition

Page 18: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones8

Figure 19: Assessing stationarity using survival probability

curves. (a) No- context condition: Because the dark-blue (VA

survival) has the same asymptote as the light-blue curve (1- VB

survival), the survival of a percept at asymptote is independent of

initial state—indicating a stationary process (b) Similar behavior

is observed in the case of context. The two red survival curves

correspond to VA context and the two green curves to VB context.

events produce phase durations that can be well-approximated by Gamma distributions, even though notexactly Gamma. This result accords well with the recentfindings of Brascamp et al who found that the best-fittingdistributions to phase duration data are not gamma, butare similar in shape (Brascamp et al, 2005). Rather thanproviding a better fitting family of distributions, theintroduction of Markov Renewal Processes provides amathematical framework for expressing properties andrelationships between model properties and measures ofthe dynamics of bistable behavior of the visual system.

ConclusionsIn this paper, we showed how bistability could beinterpreted as arising from a general process of probabilisticinference with ambiguous images, without needing specialmechanisms to implement switching via adaptation orhysteresis. In the model, Bayesian inference isimplemented by sampling from a probability distributionacross possible interpretations, and percepts are selected assamples with the largest time-discounted probability. Weshow that the model can reproduce the detailed dynamicsof bistability, including covariation between means andvariances of percept durations and the inverse covariationbetween the durations of the two percepts. We find thatintroducing regularity in the orientations of backgroundobjects influences the dynamics of bistability in waysquantitatively predicted by the model. To characterize thetemporal dynamics effectively, we used Markov RenewalProcesses; this quantification, in addition to allowing theprocess to be described mathematically with its measures,allowed us to introduce the notion of same-state transition

in perceptual states which are inaccessible to measurementbut is relevant in technical descriptions of bistability.

Appendix A

Markov Renewal Processes

Stationary Markov Renewal Processes (MRPs) arestochastic processes that transition between states accordingto a Markov Chain, but the amount of time spent in eachstate before transition is random. We assume states inmodels of multi-stability can be described by a finite set ofpositive integers. The simplest correspondence betweenMRP states and perceptual states is to use a binary statespace; we chose to represent the ‘VA’ state by 0 and the VB

state by ‘1’. Ignoring time, the sequence of states{sN ,N 0,1,2...} generated by an MRP is a discreteMarkov chain, wheresN 0,,M is the MRP state (fortwo states M={0,1}, and N is a counter variable that keepstrack of the total number of transitions. The Markov chainis generated by the transition matrix ij = P(sN= j|sN-1 = i).The time between state transitions are described bytemporal distribution functions Fij(t) that depend on thecurrent state and the next state entered. MRPs can besuccinctly described using generator distributionsQij(t)=ijFij(t) which represent the probability that theprocess makes a transition to state j in time less than orequal to t, given the process just entered state i. Samplingfrom this generating process involves picking an initialstate, using the transition matrix to draw subsequent states,then sampling a duration from the appropriate temporaldistribution.

MRPs constitute a significant generalization of thedescriptive framework typically used to characterizebistability. In addition to relaxing distributionalassumptions, MRPs explicitly model temporalcontingencies and allow for same state transitions. Same-state transitions are transition events for the generatingprocess that change the event count but not the perceptualstate. While same-state transitions are not perceptuallyaccessible (same-state transitions are not phenomenallyobservable), we believe same-state transitions are animportant possibility that may arise in the linkage betweenneural events and conscious perceptual transitions. In thisregard, the Bayesian theoretical account presented abovedraws much of its explanatory power from allowing same-state transitions (e.g covariation between mean andvariance falls out of MRPs with same-state transitions). Themain reason we introduce MRPs is to provide clarity on therelationships (and differences) between the assumptionsand properties of probabilistic generative models ofperceptual state sequences and the properties of measurablequantities (e.g. phase durations). Because MRPs include alarge family of stochastic processes, the relationshipsdescribed below are general.

Page 19: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones9

Measures of perceptual switching behavior

Measures of perceptual switching behavior all involverecording either phase durations or percept frequencies.Temporal contingencies between measures are moredifficult to adequately represent, and authors vary widely intheir handling of the time series/sequential aspect ofbistability data. Because high-order contingencies can bemeasured only with massive amounts of data, we restrictour focus to first-order Markov contingencies betweenperceptual states at a particular time and future states anddurations. These contingencies are exactly thosecompletely described by two-state MRPs (higher-ordercontingencies can be handled by MRPs via stateaugmentation).

Phase durations

In terms of the notation introduced above, phasedurations contingent on the state of the process s() at time can be expressed in terms of transition probabilities(Ross, 1992):

})(|0)({)( istNPtG jij (8)

where Nj(+t) is the number of transitions into state j thatoccur after an elapsed time t. Note that 1-Gjj(+t)represents the cumulative probability of observing notransitions of type j in time t, which is equivalent to theprobability that the remaining phase duration Di of state iis at least t:

1-Gij(+t)=P(Di>t|s()=i)

When corresponds to the start of a perceptual phase, theprobability density function of Gjj is the same as the phaseduration density function (which is typically approximatedby histograms of recorded phase durations). We describe anon-parametric estimator for Gjj in the Methods section,where corresponds to the first observed percept of type i.

Survival and Switching Probabilities

Mamassian and Goutcher (2005) introduced switching andsurvival probabilities to the analysis of bistability data.Here we formulate it in terms of an MRP. Survivalprobabilities are defined as:

Sii( t) P{s( t) i | s() i} (9)

which is the probability of observing the same state after anelapsed time t. For binary states, the switching probabilitiesare one minus the survival probability functions:P{s(+t)=j|s()=i}=1-Sii(+t).

Note that the definition places no restrictions on thenumber of transitions in the elapsed time t. Thus, it issomewhat surprising that there is a simple relationshipbetween transition and survival probability functions forstationary MRPs. For binary state MRPs the relation hasthe form (Ross,1992):

t

dxxtPxGtP0

110100 )()(1)( (10)

t

dxxtPxGtP0

001011 )()(1)( (9)

Relations to MRP generators

Within a stationary MRP process, transition andswitching probability functions are related to the generatorfunctions Q via a set of renewal equations (Ross, 1992).For notational simplicity (and without loss of generality) weset = 0 for the rest of the Appendix. The renewalequation for Gij have a simple relation to MRP generators:

tt

dxxqdxxqxtGtG0

010

000101 )()()()( (11)

tt

dxxqdxxqxtGtG0

100

111010 )()()()( (12)

where qij(t)=dQij(t)/dt are the density functions of theMRP generators. In words Equation. 11 can be interpretedas saying that the observation of at least one transition0 between time 0 and time t occurs in two ways: theprocess stays in state 0 at each time up to x (withinstantaneous probability q00(x)) and undergoes transitionin the remaining time (with probability G01(t)), or theprocess transitions in the first x seconds (with probabilitygiven by integrating q01(x)). Note that these equations mustbe solved, because Gij appears on both sides of theequation. Because the integral equation is a convolutionand the functions are zero for negative argument, solutionsfor Gij in terms of the generating densities can be writtendown using the Laplace transform. Using L[] to representthe Laplace transform, and L-1[] to represent the inverseLaplace transform, the Laplace transform of equation 11 isgiven by:

L G01(t) L G01(t x)q00(x)dx0t q01(x)dx0

t G 01(s) G 01(s)L q00(t) 1

s L q01(t)

Solving and taking the inverse Laplace transform:

G01(t)L-1 G 01(s) L-1 L q01(t) s 1L q00(t)

(13)

results in a formal solution for the cumulative phasedurations in state 0 (solution for the other phase durationis similar). For many generator distribution functions (suchas exponential functions) the transforms can be computedin closed form, producing explicit formula for phaseduration distributions (see examples below). Perhaps moreimportant is that equation 13 above provides closed formsolutions for phase duration moments (i.e. mean, variance,etc) in terms of generator moments (exponential shape andscale) because of the connections between the Laplacetransform and the moment generating function of adistribution. In fact, all generator distributions with the

Page 20: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones10

same means and variances produce the same phaseduration mean and variance (true for all higher moments aswell).

(The moment generating function M of a distributionis the same as the two-sided Laplace transform of thedensity function with negated frequency argument. Recallthat all moments can be computed from the derivatives ofM at zero. For distributions with domains in the positivereals the two-sided and standard Laplace transforms areequivalent. To compute the mean phase duration in state0, we first need to write the renewal equation in terms ofthe density gij =dGij /dt. Differentiating both sides of therenewal equation with respect to t is equivalent tomultiplying by s in the Laplace transform domain. Letgij(s) denote the Laplace transform of gij =dGij /dt. Usingproperties of the moment generating function, mean phaseduration is the first derivative of gij(s) evaluated at zero:

dds g 01(s) d

ds g 01(s) 00 f 00(s) 01 f 01(s)

dg 01(s)

ds

00

dg 01(s)

dsf 00(s)

df 00(s)

dsg 01(s)

(1 00 )df 01(s)

ds

(14)

where we have separated the Markov and temporal parts ofthe generator density qij(t) = ijfij(t), and fij(s) is theLaplace transform of fij(t). Because fij(0)=gij(0)=1,evaluating the equation above at zero results in thefollowing relation between means:

01 00 0100 1 00 01

01 00

1 00 0001

(15)

where 01 is the mean phase duration and 00 and 01are the means of the generator densities. The expressionfor the phase duration of state 1 is similar:

10 11

111 1110 (16)

An expression for the phase duration variance can bederived in a similar manner, with the result:

012

00 002 1 00 00

2 1 00

2 012 (17)

Note the relationship between phase duration varianceand mean phase duration. This relationship holds for allMRPs, independent of distribution, as long as theprobability of self-state transitions 00 is not zero. Forgenerator distributions in which the means and variancesare coupled (like the exponential, where variance is thesquare of the mean), correlations between the mean andvariance will be observed even for 00 = 0.

Survival probabilities are also related to the generatorfunctions by a renewal-type equations (Ross, 1992):

dxxSxqdxxqxtStS

dxxSxqdxxqxtStS

tt

tt

0 00000 011100

0 11110 100011

)(1)()()()(1

)(1)()()()(1 (18)

Note that the renewal equations for survival probabilitiesare coupled. The Laplace transform can also be used tosolve for the survival probability of state 0 in terms of thegenerators:

S00(t)L1 1

s

q00 (s)1 q10 (s)q 11 (s)1 q 00 (s)q11 (s)q 01 (s)q 10 (s)q00 (s)q11 (s)1

(19)A similar formula can be derived for the other survivalprobability function. Explicitly solving this equation resultsin complicated or intractable expressions even for simplegenerator functions, but the asymptotic value of thesurvival probability function has simple properties wediscuss in the next subsection.

Similar relationships between generators andmeasurable quantities can be described for non-stationaryMRPs—i.e. those with time-varying generators—but theanalysis is considerably more difficult (Monteiro, et. al,2006).

Stationarity of perceptual switching

When the generator probabilities are constant acrosstime, MRPs converge in steady-state (assuming all states arerecurrent) to a stationary distribution. Convergence tosteady-state can be diagnosed through the approach ofsurvival probability curves to asymptotic values. One ofthe key results for renewal processes is that survivalprobabilities asymptote at the ratio between the expectedtime spent in state j during one cycle to the expectedamount of time for a whole cycle (both states). Let jdenote the asymptotic value of the survival curve for state jand jj denote the expected amount of time in a wholecycle from state j to the next occurrence of j. Theasymptotic value is given by:

j limt P{s(t) j |s(0) j}

limt S jj (t)

ikikk0

M

jj

(20)

Specializing the result to the binary case produces:

0 0000(1 00)01

0000 01 0110

11111(111)10

111110 1001

(21)

The numerators are the expected amount of time spent insame-state events plus the expected amount of time until

Page 21: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones11

transition. The denominator of the top expression is theexpected amount of time spent in state 0 plus the expectedphase duration in state 1 plus the time required totransition back to state 0 and complete the cycle. Byplugging in the results previously derived for the phasedurations, it is simple algebra to show that the asymptoticprobabilities sum to one: 0 +1 = 1. Failure of theasymptotic values to sum to one implies the processgenerating the percepts is non-stationarity.

Note that independence of previous state values inMRP generators has no relevance to whether theasymptotic survival probabilities sum to one. StationaryMRPs with dependence between states will generateasymptotic values that sum to one, as described above. Tobe clear what independence means in this context,generator independence implies q10(t) = q00(t) and q01(t) =q11(t), or the durations depend only on the state. This alsoimplies that 10 = 00 and 01 = 11 (because theprobability of transitioning into a 1 or 0 does not dependon the previous state), and hence 11 = 1-00. Underindependence the formulas for the asymptotic valuesreduce to:

0 00 and 111 (22)

On the other hand, independent but time-varyingtransition mechanisms can generate failures of asymptotictransition values to sum to one. In general, diagnosinggenerator independence in MRPs appears morecomplicated than diagnosing stationarity. Note

Example of Two-state MRP with ExponentialGenerators

In this section we derive formulas for phase durations andsurvival probabilities for the special case that the generatordistributions are exponential. Exponential generatorsrepresent the simplest type of process, representingconstant instantaneous probabilities of ending the durationof the current event (and generating a new event).

A two-state stationary MRP with exponential generatorfunction is completely characterized by 6 parameters in themost general case (2 transition probabilities and 4 rateparameters). The generator densities are given by:

q00(t) 00100

et /00

q01(t) 1 00 101

et /01

q10(t) 111 110

e t /10

q11(t)11111

e t /11

(23)

Expressions for phase duration distributions and survivalprobability functions can be written down using propertiesof the Laplace transform. The Laplace transform of theexponential densities yields:

q 00(s)L 00100

e t /00 00

1/00

1/00s

q 01(s)L 1 00 101

e t /01 1 00 1/01

1/01 s

q10(s)L 111 110

e t /10 111 1/10

1/10 s

q11(s)L 11111

e t /11 11

1/11

1/11 s

(24)

The phase duration distribution in state 0 is given by:

g01(t)L-1 q 01(s)

1q 00(s)

(1 00)

(1 00)0100

et /01 1 00et 001

00 101 00

01

(25)

and the phase duration distribution for state 1 can bewritten down from the above result with the appropriatesubstitutions. How does this result fit with the frequentfinding that phase duration distributions are close togamma distributed ? (although not exactly—see Brascamp,2005). Although the phase duration distribution above isnot a gamma distribution, gamma distributions form goodapproximations. The appropriate gamma distribution canbe determined by moment matching using the expressionsfor means and variances in MRPs given above and therelationship between means and variances and gammaparameters. Recall that gamma density functions have twoparameters, a and b and take the form: p(t) = b

-ae

-t/bt

a-

1/[a]. The mean and variance of a gamma can be

expressed in terms of the parameters as = a b and = ab

2. Using expressions for MRP means and variances from

equations 15, 16 and 17 (and using the fact that generatormeans and standard deviations both equal their rateparametersij), we can solve for the gamma parameters:

a2

2

0000 1 00 01 2

00002 2 00 1 00 2

012

b 2

00

1 00 2 00

2 2 00 012

00

1 00 0001

(26)

An example of the match is shown in Figure 20, wherethe green is the MRP curve with 00=4, 01=0.5 and00=0.7, and the red curve is the gamma with matchingmoments :

Survival probability functions generated by the MRPwith exponential generators can look similar to the humansurvival probability functions for a range of parametervalues. Formulas for survival probabilities for the generalcase are quite complicated (taking over a page), and weomit them. However, for the special case in which theexponential generators are independent, the survival

Page 22: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones12

probability functions are simply related to the generatorfunctions. In particular S00(t) = 1- 00(1- e

-t/00), which issimply 1 minus the cumulative probability of durationsspent in state zero, scaled by the probability the processgenerates a state of type 0.

The main point of this section is to show thatexponential generators in an MRP can lead to a range ofmeasured behavior that effectively mask the simplicity ofthe underlying generative process.

.

Figure 20: Figure illustrates phase durations produces by MRP

with exponential generators and same-state transitions. Many

values of the generator parameters produce phase duration

densities well-approximated by a gamma distribution. An

example (green curve) is shown superimposed with a gamma

curve with matching mean and variance (red curve)

ReferencesAlais, D., and Blake, R. (1999) Grouping visual features

during binocular rivalry. Vision Research. 34 4341-4353.

Alais, D., O’Shea, R.P., Mesana-Alais, C. and Wilson, G.(2000) On binocular alternation. Perception 29, 1437-1445.

Attneave, F. (1971) Multistability in Perception. Scientific.American. 225, 63–71.

Barlow, HB (1990) A theory about the function role andsynaptic mechanism of visual after-effects. InBlakemore C (editor) Vision: cosing and efficiency.Academic Press, New York, pp 363-375.

Bialek, W., DeWeese, M. (1995) Random Switching andOptimal Processing in the Perception of AmbiguousSignals. Physics Review Letters. 74(15) 3077-80.

Blake, R.R. (1989) A neural theory of binocular rivalryPsychol. Rev. 96, 145–167.

Blake, R. and Logothetis, N.(2002) Visual Competition.Nature Reviews Neuroscience, 3, 1-11.

Brascamp, J. W., van Ee, R., Pestman W. R., and Van denBerg, A.V. (2005). Distributions of alternation rates invarious forms of bistable perception. J. of Vision 5(4),287-298.

Brewster, D. (1847) On the conversion of relief by invertedvision. Edinburgh Philosoph. Trans., 30, 432–437.

Borsellino, A., De Marco, A., Allazetta, A., Rinesi, S., &Bartolini, B. (1972). Reversal time distribution in theperception of visual ambiguous stimuli. Kybernetik, 10,139-144

Brower, G.J. and van Ee, R.(2006) Endogenous influenceson perceptual bistability depend on exogenousstimulus characteristics. Vision Research, 46, 3393-3402.

Davis, J., Schiffman, H.R., and Greist-Bousquet, S. (1990)Semantic context and figure-ground organization.Psychological Research 52, 306-309.

De Marco, A., Penengo, P., Trabucco, A., Borsellino, A.,Carlini, F., Riani, M., et al. (1977). Stochastic modelsand fluctuation in reversal time of ambiguous figures.Perception, 6, 645-656.

Diaz-Caneja, E. (1928) Sur l’alternance binoculaire. Ann.Ocul.(Paris). October 721-731. (Translation providedin Alais, D., O’Shea, R.P., Mesana-Alais, C. andWilson, G. (2000)).

Eckstein, M.P., Pham B.T., Shimozaki, S.S. (2004). Thefootprints of visual attention during search with 100%valid and 100% invalid cues, Vision Research, 40,1193-207

Eckstein, M.P., Drescher B., Shimozaki, S.S. (2006)Attentional cues in real scenes, saccadic targeting andBayesian priors, Psychological Science, 17, 973-80.

Einhauser, W., Martin, K. A., and Konig, P. (2004). Areswitches in perception of the Necker cube related toeye position? Eur J Neuroscience. 20(10), 2811-2818.

Ellis, S.R. & Stark, L. (1978) Eye movements during theviewing of Necker cubes. Perception, 7, 575–581.

Freeman, W.T (1994) The generic viewpoint assumption ina framework for visual perception, Nature, vol. 368,April 1994.

Fox R, (1991) “Binocular rivalry'', in BinocularVision Ed.D MRegan, volume 9 of Vision and Visual Dysfunction Ed. J RCronly-Dillon (London: Macmillan), 93-110.

Gepshtein, S. and Kubovy, M. (2005) Stability and changein perception: spatial organization in temporalcontext. Exp. Brain Res 160: 487-495.

von Grunau, M. W., Wiggin, S. and Reed, M. (1984). Thelocal character of perspective organization. Perceptionand Psychophysics. 35(4), 319-324.

Moran, J. and Desimone, R. (1985) Selective attentiongates visual processing in the extrastriate cortex.Science, 229, 782-784.

Page 23: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones13

Hol, K., Koene, A. and van Ee, R. (2003). Attention-biasedmulti-stable surface perception in three-dimensionalstructure-from-motion. Journal of Vision, 3, 486–498.

Kawabata, N., Yamagami, K. & Noaki, M. (1978) Visualfixation points and depth perception. Vision Res., 18,853–854.

Kersten, D., Mamassian, P. and Yuille, A. (2004) ObjectPerception as Bayesian Inference. Annual Review ofPsychology. Vol. 55, 271-304.

Kersten, D. & Schrater, P. R., (2002). Pattern InferenceTheory: A Probabilistic Approach to Vision. In R.Mausfeld, & D. Heyer (Ed.), Perception and thePhysical World. Chichester: John Wiley & Sons, Ltd

Knill, D. C. and Pouget, A. (2004). The Bayesian brain:The role of uncertainty in neural coding andcomputation. Trends in Neuroscience. 27(12), 712 -719.

Kovács, I., Papathomas, V., Yang, M. and Ákos Fehér(1996) When the brain changes its mind: Interoculargrouping during binocular rivalry. Proc. Natl. Acad.Sci. 9, 15508-15511.

Lee, T.S. and Mumford, D. (2003) Hierarchical BayesianInference in the Visual Cortex. Journal of the OpticalSociety of America. Vol. 20, No. 7.

Lehky, S.R (1988) An astable multivibrator model ofbinocular rivalry. Perception. 17, 215-229.

Leopold, D. and Logothetis, N.(1999) Multistablephenomena: Changing views in Perception. Trends inCognitive Sciences. Vol.3, No.7, 254-264.

Leopold, D., Wilke, M., Maier, A. and Logothetis, N.(2002) Stable perception of visually ambiguouspatterns. Nature Neuroscience. Vol. 5, No. 6.

Levelt, W. (1965) On Binocular rivalry. Institute forPerception RVO-TNO. Soesterberg, TheNetherlands.

Long, G., Toppino, T. and Mondin, G. (1992) Prime Time:Fatigue and set effects in the perception of reversiblefigures. Perception and Psychophysics. Vol.52, No.6, 609-616.

Lu, Z.L and Dosher, B.A (1998) External Noisedistinguishes Attention Mechanisms. VisionResearch. Vol. 38, No. 9. 1183-1198.

Pouget, A., Dayan, P. and Zemel, A. (2003) Inference andComputation with Population Codes. Ann. Rev.Neuroscience. 26: 381-410.

Maier, A., Logothetis, N. and Leopold, D. (2005) Globalcompetition dictates local suppression in patternrivalry. Journal of Vision. No. 5, 668-677.

Mamassian, P. and Goutcher, R. (2005) Temporaldynamincs in Bistable Perception. Journal of Vision.No. 5, 361-375.

Mamassian, P. and Landy, M. (1998) Observer Biases in3D Interpretation of Line Drawings. Vision Research(38) 2817-2832.

Mather G (1997) The use of image blur as a depth cue.Perception 26, 1147-1158.

Meng, M. and Tong, F. (2004) Can attention selectivelybias bistable perception? Differences betweenbinocular rivalry and ambiguous figures. Journal ofVision. 4, 539-551

Monteiro, A., Smirnov, G., Lucas, A. (2006). Non-parametric Estimation for Non-homogeneous Semi-Markov Processes: An Application to Credit Risk.Tinbergen Institute Discussion Paper (TI 2006-024/2).

Moran, J. and Desimone, R. Selective attention gates visualprocessing in the extrastriate cortex. Science,229(4715) 782-784.

Necker, L.A. (1832) Observations on some remarkableoptical phaenomena seen in Switzerland; and on anoptical phaenomenon which occurs on viewing afigure of a crystal or geometrical solid. London andEdinbugh Philosophical Magazine and Journal of Science 1.329-337.

Pelton, L.H. and Solley, C.M. (1968) Acceleration of thereversals of the Necker Cube. Am. J. Psychology 81,585-588.

Peterson, M.A and Gibson, B.S. (1994) Object recognitioncontributions to figure-ground organizations:operations on outlines and subjective contours.Percept Psychophysics 56, 551-564.

Peterson, M.A., Harvey, E.M. and Weidenbacher, H.J(1991) Shape recognition contributions to figure-ground reversals: which route counts? J. Exp. PsycholHum. Percept Perform. 17, 1075-1089.

Rock , I. (1975) Introduction to Perception. Macmillian,New York.

Rock, I., Hall, S. and Davis J. (1994) Why do ambiguousfigures reverse? Acta Psychologica, 87, 33-59.

Rock, I. and Mitchener, K.(1992) Further evidence of thefailure of reversal of ambiguous figures by uninformedParticipants. Perception. 21, 39-45.

Ross, Sheldon (1992) Applied Probability Models withOptimization Applications, Mineola, N.Y., DoverPress.

Schrater, P.R. and Sundareswara, R. (2006) Theory andDynamics of Perceptual Bistability. Neural InformationProcessing Systems, 2006 (in press).

Sobel, K. and Blake, R.(2002) How context affectsinfluences predominance during binocular rivalry.Perception. 31, 824-831.

Stocker, A. and Simoncelli, E. (2006) Noise characteristicsand prior expectations in human visual speedperception. Nature Neuroscience. vol.9, no.4, 578-585.

Page 24: Interpreting Bistability Using Probabilistic Inferencevision.psych.umn.edu/users/schrater/Papers/JOV17submitted.pdf · Interpreting Bistability Using Probabilistic Inference ... investigated

Journal of Vision (2005) 5, 1-3 Smith & Jones14

Suzuki, S., & Peterson, M. A. (2000). Multiplicative effectsof intention on the perception of bistable apparent motion.Psychological Science, 11, 202–209.

Taylor, M.M. and Aldridge, K.D. (1974) Stochasticprocesses in reversing figure perception. Perception andPsychopysics, 16, 9–27

Toppino, T. C. and Long, G. M. (1987). Selectiveadaptation with reversible figures: don't change thatchannel. Perception and Psychophysics. 42(1), 37-48

Toppino, T. C. (2003). Reversible-figure perception:mechanisms of intentional control. Perception andPsychophysics. 65(8), 1285-1295.

van Ee, R., Adams, W. and Mamassian, P. (2003) Bayesianmodeling of cue interaction: bistability in stereoscopicslant perception. J. Opt. Soc. Am. A/ Vol. 20, No. 7

van Ee, R., Noest, A.J., Brascamp, van den Berg, A.V.(2006) Attentional control over either of the twocompeting percepts of ambiguous stimuli revealed by atwo-parameter analysis: Means do not make thedifference. Vision Research, 46, 3129-3141.

van Ee, R., van Dam, L. C. J., & Brouwer, G. J. (2005).Voluntary control and the dynamics of perceptual bi-stability. Vision Research, 41, 41–55.

Walker, P. (1978). Binocular rivalry: central or peripheralselective processes? Psychol. Bull. 85, 376–389.

Yu, K. and Blake, R. (1992) Do recognizable figures enjoyan advantage in binocular rivalry? J. Exp. Psychology andHuman Perception Performance. 18, 1158-1173.

Yuille, A. and Kersten, D. (2006). Vision as Bayesianinference: analysis by synthesis? Trends Cogn Sci, 10(7),301-308.