A computational perspective of the role of Thalamus in …...Thalamus has traditionally been considered as only a relay source of cortical inputs, with hierarchically organized cortical

A computational perspective of the role of Thalamus in cognitionNima Dehghani1, 2, a) and Ralf D. Wimmer31)Department of Physics, Massachusetts Institute of Technology2)Center for Brains, Minds and Machines (CBMM), Massachusetts Institute of Technology3)Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology

(Dated: February 25, 2018)

Thalamus has traditionally been considered as only a relay source of cortical inputs, with hierarchicallyorganized cortical circuits serially transforming thalamic signals to cognitively-relevant representations. Giventhe absence of local excitatory connections within the thalamus, the notion of thalamic ‘relay’ seemed like areasonable description over the last several decades. Recent advances in experimental approaches and theoryprovide a broader perspective on the role of the thalamus in cognitively-relevant cortical computations, andsuggest that only a subset of thalamic circuit motifs fit the relay description. Here, we discuss this perspectiveand highlight the potential role for the thalamus in dynamic selection of cortical representations through acombination of intrinsic thalamic computations and output signals that change cortical network functionalparameters. We suggest that through the contextual modulation of cortical computation, thalamus andcortex jointly optimize the information/cost tradeoff in an emergent fashion. We emphasize that coordinatedexperimental and theoretical efforts will provide a path to understanding the role of the thalamus in cognition,along with an understanding to augment cognitive capacity in health and disease.

Keywords: Thalamo-cortical system, Recurrent Neural Network, Reservoir Computing, Multi-objective Op-timization, Cognitive Computing, Artificial Intelligence

Cortico-centric view of perceptual and cognitive processing

Until recently, cognition [in mammalian, bird and rep-tilian nervous system] has been viewed as a cortico-centric process, with thalamus considered to only playthe mere role of a relay system. This classic view, muchdriven by the visual hierarchical model of the mammaliancortex24, puts thalamus at the beginning of a feedforwardhierarchy. The transmission of information from thala-mus to early sensory cortex (V1 in the visual system forexample), and the gradual increasing complex represen-tations from V2 to MT/IT and eventually prefrontal cor-tex (PFC), constitute the core of the perceptual repre-sentation under the hierarchical model. A recent compar-ative study of biologically-plausible Convolutional Neu-ral Networks (CNN) and the visual ventral stream, em-phasizes on the feature enrichment through the networkof hierarchically interconnected computational modules(layers in the neural network and areas in the ventralstream)110.

The strictly static feedforward model has since mor-phed to a dynamic hierarchical model due to the discov-eries of the role of feedback from higher cortical areasto lower cortical areas35. These dynamic hierarchies areeven considered to be favorable for recursive optimiza-tions, where the overall optimization can be achieved bybreaking the problem into smaller ones in order to findthe optimum for each of these smaller problems. How-ever, it is possible for the recursive optimization not tobe confined to just one area of the cortex. It may also bethat these smaller optimizations may be solved differently

a)Electronic mail: [email protected]

at different cortical areas58. This view of cortical com-putation is also paralleled with the growing use of recur-rent neural networks (RNN) that can capture the dynam-ics of single neurons or neural population in a variety oftasks. As such, RNNs can mimic (a) context-dependentprefrontal response57 or (b) can reproduce the tempo-ral scaling of neural responses in medial frontal cortexand caudate nucleus103. Nonetheless, the main attributeof perception/cognition remains cortico-centric under theumbrella of dynamic hierarchical models or RNN embod-iment of cortical cognitive functions. Since the computa-tion that is carried by the system should match the com-puting elements at the appropriate scale17 a mismatchbetween these presumed computational systems and theunderlying circuitry becomes vividly apparent. Specifi-cally, let’s note that a) associative cortex (and not justsensory cortex) receives thalamic input, b) certain thala-mic territories receive their primary (driving) input fromthe cortex, rather than sensory periphery, some of whichare likely to be highly convergent on a single thalamiccell level and c) cortex is demarcated by local excitatoryconnectivity (while the thalamus is devoid of this fea-ture). These points should prompt us to reconsider thiscortico-centric view of computation and indicates that itneed to be extended to a thalamocortical one. As such,the thalamic contribution to cognitive computations canbe twofold: First, they can modulate the ongoing corticalactivity and second, they might be pure thalamic com-putations. The nature of such intrinsic thalamic compu-tations would be determined by both the types of inputsa thalamic neuron receives as well as how these inputsinteract (both linearly and non-linearly). As we will dis-cuss in the next several sections, the combination of thesefeatures allows the thalamus to compute a variety of sig-nals associated with ongoing cortical states, which can be

mailto:[email protected]

2

more easily used to dynamically adjust cortical computa-tions through real-time modulation of functional corticalarchitecture. Further, we propose that thalamus mayprovide a contextual signal that reorganizes functionalconnectivity in frontal cortices in response to contextualchanges. We suggest that the unique cognitive capabil-ity of the thalamo-cortical system is tightly bound to

parallel processing and contextual modulation that areenabled by the diversity of computing nodes (includingthalamic and cortical structures) and complexity of thecomputing architecture (Fig. 1). We will start with abrief overview of the thalamic architecture, followed byexperimental evidence and a computational perspectiveof the thalamic role in contextual cognitive computation.

Parallel processingCo

gniti

ve c

apab

ility

Compute node

(Diversity & Complexity)

Com

Logic Gates

cortical system

ReservoirComputing

CNN

FPGA

mpute node& Com

mGenetic

Logic Gate

sing

DNAComputing

Cogn

itiv

PhysariumMachine

CCCCCNCNCNNNNNNNNNNNNNNNNNNN

FFFFFPFPPPP

CCCCC

hhhPhPPPPhyhyyyysysyssasariumsariumariumariumariumriumriumMacMaMaMaMMMMacacachachinechinechinechinechinehinehinei

cococcccooorororortrtrtirtictical tical icalcallstemstemystemystemystyssyssys stemstemtemtemem

ReReRRRReeeesssseseeerervoirervoirrvoirrvoirvoimputingmputingomputingomputingomputingomputingComCoComputingmputingmputingmputingputingputingputingputingutingtii

PPGPGPGAPGAGAGAGAAAPPPPGG

DDDDDDNDNDNNNANANAACCompumpumpompomomomComCoC mpumpumpupupupuu

AAAgggtingngngggtingtingtingtingtingtintintii

GGGGGGeGeeeeenennneneneticeticeticticgic Gategic Gateogic Gateogic Gateogic Gateogic GateLogic GateLoLogic Gateic Gateic Gatec Gatec Gatec GateGateGateG t

ggogogogoLoLoLoL gigicgic Gatesic Gatesic Gatesc Gatesc Gatesc Gatesc GatesGatesGateGateGategiggicic Gatesc Gatesc Gatesc GatesGatesGateGateGateat

PP

PPPPPPPPP

P

PPPPPPPPPPPPPPPPPPPPPPPPPGPPPPG

PPPPPP

PPPPPPPPPPPPPPPPPPPPPP

P

PPPPGPPGPPPPPG

PP

Figure 1. Cognitive computing morphospace: Morphospace of a few example biological and synthetic computing enginesin a multidimensional layout. The thalamo-cortical system standout as a unique system with high cognitive capacity, massiveparallel processing and extreme diversity of the computing nodes. Other computing systems occupy less desirable domainsof this morphospace. Logic Gates: NAND, NOR; Genetic Logic Gate: Synthetic biology adaptation of logic gate; FPGA:field-programmable gate array (configurable integrated circuit); CNN : Convolutional Neural Network; Physarium Machine:programmable amorphous biological computer experimentally implemented in slime mould. Reservoir Computing : A reservoirof recurrent neurons that dynamically change their activity to nonlinearly map the input to a new space; DNA computing : Acomputing paradigm where many different DNA molecules are used to perform large number of logical computations in parallel;Connection Machine: the first commercial supercomputer designed for problems of Artificial Intelligence (AI) using hardwareenabled parallel processing.

Thalamic architecture: anatomical and functional features

Among all brain structures, the forebrain is likely to bemost related to cognitive capacity, given that forebrain

expansion positively correlates (and perhaps defines) cog-nitive expansion throughout evolution46,74,75,87. Fore-brain is composed of diencephalon (hypothalamus, epi-thalamus and subthalamus) along with the telencephalon

3

(cortex and basal ganglia)10. Analogous structures ex-ist throughout the vertebrate lineage10,15. Both the cor-tex and basal ganglia are demarcated by local recurrentconnections, with the main difference being that corticalexcitatory recurrence gives rise to long-lasting attractordynamics thought to underlie many cognitive processessuch as memory and decision making104,107. The basalganglia are inhibitory structures, with local recurrencethought to implement selection through lateral inhibitionrather than long lasting attractor state63,105,106. Withinthis framework, what does the thalamus do? The an-swer depends on ‘the thalamus’ in question. Tradition-ally, thalamic nuclei (see Fig. 2) are defined as collec-tions of neurons that are segregated by gross featuressuch as white matter tracts and various types of tissuestaining procedures42. This gross anatomical classifica-tion has been equated with a functional one, where in-dividual thalamic nuclei giving rise to a set of definedfunctions42,43. More recent fine anatomical studies chal-lenge this notion, showing that within individual nuclei,single cell input/output connectivity patterns are quitevariable (see below).

This thalamic input diversity sets the bases for non-unitary computational role of thalamus in integrating in-formation. Thalamus lacks lateral excitatory connectionsand rather receives inputs from other subcortical struc-tures and/or cortex. In fact, a major feature of forebrainexpansion across evolution is the invasion of the thala-mus by cortical inputs30,85. Most (90-95%) afferents tothe relay nuclei are not from the sensory organs43,93. Themajority of these inputs originate from intrinsic (localGABAergic neurons, projections from reticular nucleus,thalamic interneurons–mostly absent in non-LGN relaynuclei) and extrinsic sources (feedback from layer 6 ofthe cortex as well as from brainstem reticular formation)(see Fig. 3 for an example view of cell and network archi-tecture diversity). Recent anatomical studies have showna great diversity of cortical input type, strength and in-ferred degree of convergence, even within individual tha-lamic nuclei11.

In addition to the diversity of excitatory inputs, thala-mic circuits receive a diversity of inhibitory inputs. Thetwo major systems of inhibitory control are the thala-mic reticular nucleus (TRN), a shell of inhibitory nucleussurrounding thalamic excitatory nuclei, and the extra-thalamic inhibitory system; a group of inhibitory projectsacross the fore-, mid- and hindbrain (see32 for a reviewon thalamic inhibition). Perhaps a major differentiatingfeature of these two systems (TRN and ETI) is tempo-ral precision. One of the key characteristics of thalamusis lack of direct local loops. Only a very small groupof inhibitory neurons with local connections exists inthalamus98. A mechanistic consequence of this architec-ture is the differential control of thalamic response gainand selectivity (Fig. 3 F and G), with the TRN control-

ling the first as observed in sensory systems80, and ETIcontrolling the latter as observed in motor systems102.For example, basal ganglia control of thalamic responses,

P

F

T

O

1 2 3

1

AM

AVTRN

ILVA

AD

3

TRN

LGN

PU

LD

MGN

(D)(V)

H

2

TRN

VPL

CNVL

MD

IL

CMVPM

LD

VPI

Figure 2. Schematic layout of thalamic nuclei: Threecross sections of monkey thalamus. AD : anterodorsal nu-cleus; AM : anteromedial nucleus; AV : anteroventral nucleus;CM : centromedian nucleus; CN : caudate nucleus; H : habe-nular nucleus; IL: intralaminar nuclei;LD : lateral dorsal nu-cleus; LGN : lateral geniculate nucleus;MD : mediodorsal nu-cleus; MGN(D): medial geniculate nucleus (dorsal); MGN(V):medial geniculate nucleus (ventral); PU : pulvinar; TRN : tha-lamic reticular nucleus (not a relay nucleus); VA: ventralanterior nucleus; VL: ventral lateral nucleus; VPI : ventralposterior nucleus (inferior); VPL: ventral posterior nucleus(lateral); VPM : ventral posterior nucleus medial. Redrawnfrom93.

a form of ETI control, would be implemented throughthalamic disinhibition, which is not only dependent onETI input, but also a special type of thalamic conduc-tance that enables high frequency ‘bursting’ upon releasefrom inhibition19,29.

Overall, the variety of thalamic inputs (both excita-tory and inhibitory), combined with intrinsic thalamicfeatures such as excitability and morphology will deter-mine the type of intrinsic computations the thalamusperforms. For example, if a thalamic neuron within aparticular circuit motif received temporally offset corti-cal inputs, one direct and another through basal gangliastations, then that neuron may be perfect for computinga prediction error based on a raw cortical signal (or state)at t0 and a processed one at t0 +∆t (basal ganglia opera-tion). This view appears to be consistent with recent ob-servations of confidence encoding in both sensory47 andmotor systems41,54.

4

100 µm100 µm

IIIb

IIIc

IVa

IVb

V

A B

C D

F G

100 µm

TRN

VA

VL

RE

RE

RE

Th-cx

Th-cxTh-cx

Th-cx

Th-cx

L-circ

L-circ

100 µm100 µm

E

Figure 3. Diversity and complexity of thalamo-cortical architectures: (A) Comparative size of a single RGC (retinalganglion cell) terminal (red) and an LGN neuron (black); Redrawn from25. (B) Complete terminal arbor of a single LGNneuron projecting to V1; Redrawn from82. (C) Projection of a TC (thalamo-cortical) neuron to cat motor cortex (with only23 terminals in TRN versus 1632 terminals in VA/VL); Redrawn from45,76 (D,E) Axonal arborization of a single MD neuron(E) and a single POm neuron (E). Principal target layers are: layer I (green), layer II-IV (green-yellow), layer V (blue),layer VI (purple); Redrawn from50. Note the comparative size of panels A-E (scale bar at 100 μm). (F,G) Synaptic networkof thalamo-cortical (Th-cx), Reticular thalamic cell (RE) and local circuit (L-circ) thalamic interneuron in rodents (F) andfeline/primates (G). Note that rodents do not have L-circ. Afferent axon excites Th-cx, which in return sends the signal tocortex. RE inhibitory effect on Th-cx cells varies depending on the excitatory drive to each Th-cx cell (F : compare the twoneurons on the right versus the one on the left). Axonal collaterals of an RE cell could inhibit another RE cell (G: top REinhibits the bottom RE), which releases the activity of L-circ leading to inhibition of weakly excited Th-cx (bottom) adjacentto the active Th-cx (top). Panels F and G are redrawn from99 based on experiments from97,100.

Thalamic output diversity sets thalamus to be poisedas the relay as well as the modulator of cortical activ-ity. Thalamic relay nuclei mostly project to the corticalmiddle layers in a topographic fashion. However, themajority of thalamic structures project more diffusely to

the cortical superficial layers, such as mediodorsal (MD),posteriodmedial complex (POm) and pulvinar for exam-ple (see Fig. 3 for an example of thalamic cell and circuitdiversity). These diffuse projections seem poorly suitedto relay information in a piecewise manner. Rather,

5

they might have a modulatory role of cortical function.Further, a great degree of diversity can be observed atthe level of thalamic axonal terminals within the cortex.While the idea of a thalamic relay was consolidated byobserving that the main LGN neurons thought to be as-sociated with form vision (M and P pathways) exhibitspatially compact cortical terminals, recent anatomicalstudies of individual neurons across the thalamus showa variety of terminal sizes and degree of spatial spreadand intricate computational architecture (Fig. 3). Thiscomplexity of the architecture and diversity of the com-puting nodes are among the key factors that set apartthe thalamo-cortical system from other conventional andunconventional computing engines (Fig. 1). Part ofthe complication in understanding how these anatomicaltypes give rise to different functions is their potential forcontacting different sets of excitatory and inhibitory cor-tical neurons. For example, activating the mediodorsalthalamus (MD) does not generate spikes across a popula-tion of prefrontal cortical neurons they project to, whileactivating LGN generates spikes in striate cortex88. In-stead, MD activation results in overall enhancement ofinhibitory tone, coupled with enhanced local recurrentconnectivity within the PFC. This finding also arguesagainst a relay function, because in this case the MD ischanging the mode by which PFC neurons interact withone another, initiating and updating different attractordynamics underlying distinct cognitive states. This ideaof the thalamus controlling cortical state parameters ishighlighted in Figs. 4, 5 and the next section.

Many facets of thalamic computation

It is commonly thought that processes like attention,decision making and working memory are implementedthrough distributed computations across multiple corti-cal regions12,62,89. However, it is unclear how these com-putations are coordinated to drive relevant behavioraloutputs. From an anatomical standpoint, the thalamus isstrategically positioned to perform this function, but rel-atively little is known about its broad functional engage-ment in cognition. The thalamic cellular compositionand network structure constrain how cortex receives andprocesses information. The thalamus is the major inputto the cortex and interactions between the two structuresare critical for sensation, action and cognition44,69,92. De-spite recent studies showing that the mammalian thala-mus contains several circuit motifs, each with specific in-put/output characteristics, the thalamus is traditionallyviewed as a relay to or between cortical regions93.

It is worth mentioning that this view of bona fidethalamic computations is quite distinct from the one inwhich thalamic responses reflect their inputs, with onlylinear changes in response size. This property of reflect-ing an input (with only slight modification of amplitude)was initially observed in the lateral geniculate nucleus(LGN), which receives inputs from the retina. LGN re-

sponses to specific sensory inputs (their receptive fields,RF) are very similar to those in the retina itself, arguingthat there is little intrinsic computation happening in theLGN itself outside of gain control. Success in early visionstudies36,37 might have inadvertently given rise to theLGN relay function being generalized across the thala-mus. The strictly feedforward thalamic role in cognitionrequires reconsideration33; only a few thalamic territoriesreceive peripheral sensory inputs and project cortically ina localized manner, as the LGN does25,42,45,82,92. Belowwe will discuss how the distinctive anatomical architec-ture and computational role of pulvinar and MD differfrom the one just described for LGN.

The largest thalamic structures in mammals, the MDand pulvinar contain many neurons that receive conver-gent cortical inputs and project diffusely across multiplecortical layers and regions11,86. For example, the primatepulvinar has both territories that receive topographical,non-convergent inputs from the striate cortex86 and oth-ers that receive convergent inputs from non-striate visualcortical (and frontal) areas60. This same thalamic nu-cleus also receives inputs from the superior colliculus79, asubcortical region receiving retinal inputs. This suggeststhat the pulvinar contains multiple input ‘motifs’ solelybased on the diversity of excitatory input. This input di-versity is not limited to the pulvinar, but is seen withinmany thalamic nuclei across the mammalian forebrain8.Local inactivation of pulvinar neurons results in reducedneural activity in primary visual cortex81 suggesting afeedforward role. In contrast, recent findings show thatduring perceptual decision making pulvinar neurons en-code choice confidence, rather than stimulus category47,strongly arguing for more pulvinar functions beyond re-laying information.

In the case of MD, direct sensory input is limited64 andthe diffuse, distributed projections to cortex50 are poorlysuited for information relay; this input/output connectiv-ity suggests different functions. Recent studies9,88 havebegun to shed light on the type of computation that MDperforms. Studying attentional control in the mouse88

has revealed that MD coordinates task-relevant activ-ity in the prefrontal cortex (PFC) in a manner analo-gous to a contextual signal regulating distinct attractorstates within a computing reservoir. Specifically, in atask where animals had to keep a rule in mind over a briefdelay period (Fig. 4A), PFC neurons show population-level persistent activity following the rule presentation, asensory cue that instructs the animal to direct its atten-tion to either vision or audition (Fig. 4B,C). MD neuronsshow responses that are devoid of categorical selectivity(Fig. 4D), yet are critical for selective PFC activity; opto-genetic MD inhibition diminishes this activity, while MDactivation augments it. The conclusion is that MD inputsenhance synaptic connectivity among PFC neurons ormay adjust the activity of PFC neurons through selectivefiltering of the thalamic inputs. In other words, delay-period MD activity maintains rule-selective PFC repre-sentations by augmenting local excitatory recurrence88.

6

In a related study, a delayed nonmatching-to-sampleT-maze working memory task9, it was shown that MDamplification and maintenance of higher PFC activity in-dicated correct performance during the subsequent choicephase of the task. Interestingly, MD-dependent increasedPFC activity was much more pronounced during the later(in delay) rather than earlier part of the task. These find-

ings indicate that PFC might have to recursively pull inMD to sustain cortical representations as working mem-ory weakens with time. Together these studies indicatethat PFC cognitive computation can not be dissociatedfrom MD activity. Further evidence for the critical roleof the MD-PFC interaction for cognition is the disruptedfronto-thalamic anatomical and functional connectivityseen in neurodevelopmental disorders59,65,68,78,108.

A

B

ATTEND TO VISION

ATTEND TO AUDITION

Visual selection

Auditory selection

noitatneserPgnieuC100 ms

ResponseMedian: 1521 ms

Cue-free delay400 ms

Ign

ore

so

un

dIg

no

re li

gh

t

Low-pass

High-pass

Broadband

Trial available100 ms

Start initiation

Delay period

E

-2024

0 0.5Time (s)

Firin

g ra

te (z

-sco

re)

attend to vision

attend to audition

0 0.5-2024

Time (s)

attend to audition0

5

10

15

Time (s)0 0.6

attend to vision

Firin

g ra

te (H

z)

0

5

10

15

-2

0

2

Firin

g ra

te (z

-sco

re)

attend to vision

0 0.5-2

0

2

Time (s)

attend to audition

C D

MD

outside

.6−.2 0 .2 .4

02468

1012

0

5

10

15

inside

PFC

outside inside

Firin

g ra

te (H

z)

.6−.2 0 .2 .4

02468

1012

0

5

10

15

.6−.2 0 .2 .4

02468

1012

0

5

10

15

.6−.2 0 .2 .4

02468

1012

0

5

10

15

Figure 4. MD-PFC interactions during sustained rule representations. (A) attentional control task design. (B)Example peri-stimulus time histogram (PSTH) for a neuron tuned to attend to vision rule signaled through low pass noisecue (C) Examples showing that rule-specificity is maintained across distinct PFC rule-tuned populations. (D) PSTHs of fourMD neurons showing consistent lack of rule specificity. (E) Example rasters and PSTHs of an MD and PFC neuron whenthe animal is engaged in the task and outside of the behavioral arena. In contrast to PFC, MD neurons show the contextualdifference in a change in firing rate. Figure is redrawn from88.

7

Can MD select cortical subnetworks based on contextualmodulation?

Why would a recurrent network (PFC) computationdepend on its interaction with a non-recurrent (MD) non-relay network? What computational advantage such sys-tem would have? Using a chemogenetic approach, a re-cent study suggested that information flow in the MD-PFC network can be unidirectional. While both inacti-vating PFC-to-thalamus and MD-to-cortex pathways im-paired recognition of a change in reward value in ratsperforming a decision making task, only the inactivationof MD-to-cortex pathway had an impact on the behav-ioral response to a change in action-reward relationship3.Given that a sensory stimulus may require a different ac-tion depending on the context in which it occurs, theability to flexibly re-route the active PFC subnetworkto a different output may be crucial. In an architec-ture like the PFC-MD network, where MD can modulatePFC functional connectivity, MD might well be suitedto re-route the ongoing activity in a context dependentmanner. In fact, in the mouse cognitive task describedabove (Fig. 4A), a subset of MD neurons showed sub-stantial spike rate modulation during task engagementcompared to when the animals is in its home in cage (seeFig. 4E)88. In contrast, PFC neurons show very littledifference in spike rates when the animal gets engagedin the task. This suggests that perhaps different subsetsof MD neurons are capable of encoding task ‘contexts’.Subsequently, each given subset could unlock a distinctcortical association. These MD subsets have to be ableto shift the cortical states dynamically while maintainingthe selectivity based on the subset of cortical connectionsthey target. This idea would also fit with the paradigmshift indicating that thalamic neurons exert dynamicalcontrol over information relay to cortex6,77.

Overall, the anatomical and neurophysiological datashow that the thalamic structure and cortico-thalamicnetwork circuitry, and the interplay between thalamusand cortex, shape the frame within which thalamus playsthe dual role of relay and modulator. Under this frame-work, different thalamic nuclei carry out multitude ofcomputation including but not limited to information re-lay. A suggestion of this comparative computational roleof LGN, pulvinar and MD is depicted in Fig 8.

Is thalamus a read-write medium for cortical parallelprocessing?

If the brain were to function as a simple pattern match-ing system without wiring and metabolic constrains, evo-lution would just expand the size and depth of the net-work to the point that it could potentially memorize alarge number of possible patterns. Possibly, evolutionwould have achieved this approximation of arbitrary pat-terns by evolving a deep network. This would be a desir-able solution since any system can be defined as a poly-

nomial Hamiltonians of low order, which can be accu-rately approximated by neural networks53. But cogni-tion is much more than template matching and classifi-cation achieved by a neural network. The limits of tem-plate matching methods in dealing with (rotation, trans-lation and scale) invariance in object recognition quicklybecame known to neuroscientists and in early works oncomputer vision. One of the early pioneers of AI, OliverSelfridge, proposed Pandemonium architecture to over-come this issue90. Selfridge envisioned serially connecteddistinct demons (an image demon, followed by a set ofparallel feature demons, followed by a set of parallel cog-nitive demons and eventually a decision demon), that in-dependently perceive parts of the input before reachinga consensus together through a mixture of serial culmi-nation of evidence from parallel processing. This simplefeedforward computational pattern recognition model is(in some ways) a predecessor to modern day connection-ist feedforward neural networks, much like what we dis-cussed earlier in the text. However, despite its simplicity,Pandemonium was a leap forward in understanding thatthe intensity of (independent parallel) activity along witha need to a summation inference are the keys to movefrom simple template matching to a system that has aconcept about the processed input. A later extension ofthis idea was proposed by Allen Newell as the Blackboardmodel: “Metaphorically, we can think of a set of work-ers, all looking at the same blackboard: each is able toread everything that is on it and to judge when he hassomething worthwhile to add to it. This conception isjust that of Selfridge’s Pandemonium: a set of demonsindependently looking at the total situation and shriekingin proportion to what they see that fits their natures”71.Blackboard AI systems, adapted based on this model,have a common knowledge base (blackboard) that is it-eratively updated (written to and read from) by a groupof knowledge sources (specialist modules), and a controlshell (organizing the updates by knowledge sources)72.

Interestingly, this computational metaphor can also beextended to the interaction between thalamus and cor-tex, though thalamic blackboard is not a passive one asin the blackboard systems34,66,67. Although, initially theactive blackboard was used as an analogy for LGN com-putation, the nature of MD connectivity and its com-munication with cortex seem much more suitable to thetype of computations that is enabled by an active black-board. Starting with an input, thalamus as the commonblackboard visible to processing (cortical) modules, ini-tially presents the problem (input) for parallel process-ing by modules. Here by module, we refer to a group ofcortical neurons that form a functional assembly whichmay or may not be clustered together (in a column forexample). By iteratively reading (via thalamo-corticalprojections) from and writing (via cortico-thalamic pro-jections) to this active blackboard, expert pattern recog-nition modules, gradually refine their initial guess basedon their internal processing and the updates of the com-mon knowledge. This process continues until the problem

8

is solved (Fig. 5). However, since we are dealing witha biological system with finite resources, this back andforth communication needs to have some characteristicsto provide a viable computational solution. First andforemost, the control of interaction and its schedulinghas to have a plausible biological component and shouldbind solutions as time evolves. Second, to avoid turn-ing into an NP-hard (non-deterministic polynomial-timehardness) problem, there must exist a mechanism thatstops this iterative computation once an approximation

has been reached (Fig. 5). In this paper, we propose spe-cific solution to the first problem and a plausible one forthe later issue. We suggest that phase-dependent contex-tual modulation serves to deal with the first issue and amulti-objective optimization of efficiency (computationalinformation gain) and economy (computational cost, i.e.metabolic needs and the required time for computation)handles the second issue (Fig. 6). In both cases, we sug-gest that thalamus plays an integral role in conjunctionwith cortex.

Core attributes of cognitive processing, attention andbinding in time

To expand the idea of read-write further, let’s revisitsome core attributes of cognitive processing. First, wewish to point to two key features of cognitive process-ing, namely “searchlight of attention” and “binding intime”. Then we examine how these attributes matchthe computational constrains that are met by thalamiccircuitry and functions. The “searchlight of attention”,first proposed by Crick as a function of thalamic retic-ular formation14, proposes that through change in fir-ing pattern (bursting vs tonic firing), thalamic nucleidynamically switch between detection and perception.The bursting nonlinear response signals a change in theenvironment, while the tonic mode underlies perceptualprocessing. One biophysical mechanism responsible forthis change between bursting and tonic modes is im-plemented by cortico-thalamic activation of glutamatemetabotropic receptors61. This mechanism puts thala-mus as the mediator between peripheral and cortical in-put, creating a closed-loop computation1. A necessaryproperty of thalamic-driven change of perceptual pro-cessing, and hence cognition, is the existence of corticalre-entry. From both anatomical studies and electrophys-iological investigations31, we know that thalamus is at aprime position to modify the relay signal based on thecognitive processing that is happening in the cortex9,88.This thalamic-driven attention entails “binding in time”since how thalamus modifies its relay at a given time isitself influenced by what is perceived by the cortex intime prior.

But how can the “binding in time” avoid locking-inthe thalamic function to a set of inputs at a given time?How can thalamus constantly be both ahead of cortexand yet keep track of the past information? The secretmay be embedded in the non-recurrent intrinsic struc-ture of thalamus and the recurrent structure of the highercortical areas. As mentioned earlier, we know that hier-archical convolutional neural networks (HCNN), whichcan recapitulate certain properties of static hierarchicalforward models, can not capture any processes that needto store prior states109. As a result, context-dependentprocessing can be extremely hard to implement in neural

network models84. The most widely used ANNs (Feed-forward nets , i.e. multilayer perceptrons/Deep Learningalgorithms) face fundamental deficiencies: the ubiquitoustraining algorithms (such as back-propagation), i) haveno biological plausibility, ii) have high computational cost(number of operations and speed), and iii) require mil-lions of examples for proper adjustment of NN weights.These features renders feedforward NNs not suitable fortemporal information processing. In contrast, recur-rent neural networks (RNNs) can universally approxi-mate the state of dynamical systems26, and because oftheir dynamical memory are well suited for contextualcomputation. It has been suggested that RNNs are ableto perform real-time computation at the edge of chaos,where the network harbors optimal readout and memorycapacity7. The functional similarity of gated-recurrentneural networks and cortical microcircuitry continues tobe at the forefront of AI and neuroscience interface13. Ifhigher cortical areas were to show some features of RNN-like networks, as manifested by the dynamical response ofsingle neurons57, then we anticipate that the local com-putation (interaction between neighboring neurons) to bemostly driven by external biases. The thalamic projec-tions could then play the role of bias where they seedthe state of the network. While this may be a feasibleportrait, the picture misses the existence of large-scalefeedback loops which are neither feedforward nor locallyrecurrent2,31. To reconcile, we need our system of in-terest to show phase-sensitive detection that binds thelocally recurrent activity in cortex, with large-scale feed-back loops.

Computational constrains and the role of thalamus inphase-dependent contextual modulation

Based on the observations of behavior, higher cogni-tion requires “efficient computation”, “time delay feed-back”, the capacity to “retain information” and “contex-tual” computational properties. Such computational cog-nitive process surpass the computational capacity of sim-ple RNN-like networks. The essential required propertiesof a complex cognitive system of such kind are: 1) inputshould be nonlinearly mapped onto the high-dimensional

9

1 01 1 1 0 1 0 1

. . . . . . .

. . . . . . .

1 10 0 1 1 0 1 1

0 11 1 1 0 1 1 1

0 11 1 1 0 1 0 0

1 10 0 1 1 0 1 11 01 1 1 0 1 0 1

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

Static Data

.1 .5.3 .4 .6 .7 0 .2 .11 10 0 1 1 0 0 0Function

Weight

b ca d e ....7 .5.3 .6 .2 ...Weight

Pointer

.3 .6.70 11Function

Weight .5 .5.2 .2 .60 11 1 0

(a)

.3 .5 .3 .41 1 0 1

.3 .20 1

.8 .6 0 .30 1 1 0

(b) (c) (d) (e)

b ca d e ....3 .9.2 .6 .1 ...Weight

Pointer

.1 .5.31 10Function

Weight .1 .5.3 .4 .61 10 0 1

(a)

.7 0 .2 .11 0 0 0

.2 .81 1

.4 .6 .7 00 1 1 0

(b) (c) (d) (e)

b ca d e ....9 .6.5 0 0 ...Weight

Pointer

.4 .2.71 01Function

Weight .6 .5.8 0 .11 11 0 0

(a)

.5 .1 .6 01 0 1 1

(b) (c)

.8 .1.91 00

tn

tn+dt

tn+2dt

1 01 1 1 0 1 0 1

. . . . . . .

. . . . . . .

1 10 0 1 1 0 1 1

0 11 1 1 0 1 1 1

0 11 1 1 0 1 0 0

1 10 0 1 1 0 1 11 01 1 1 0 1 0 1

Dynamic DataTime

tn+3dt

Classification

. . . . . . .

B

A

C

Readout

Reservoir

Non-recurrent

Figure 5. Schematic representation of thalamic cognitive contextual computation. (A): In the case of static data,a set of function/weight modules can yield good classification. Function represents a polynomial (since any system that isknown to be a polynomial Hamiltonians of low order can be accurately approximated by neural networks53) and the weightsexemplify the connection matrix of an artificial neural network encapsulating this polynomial. Stacking multiple of such modulecan increase the accuracy of polynomial approximation (such as in the case of CNN). (B): Thalamo-cortical computation forcontextual processing of dynamic data. Each dataframe is processed by a weight/pointer module (thalamus MD-like structure)which like a blackboard is writable by different sets of neuronal assemblies in cortex. Thalamic pointers assign the assemblies;modules’ weights adjust the influence of each assembly in further computational step [inset C shows a non-recurrent thalamicnuclei (MD-like) modulating the weights in the PFC (reservoir and readout). Here, depending on the context (blue or red),the interactions between MD and Reservoir, between Reservoir and Readout, and between Readout and MD could pursueone of the two possible outcomes. Specifically, MD changes the weights in the Reservoir to differentially set assemblies thatproduce two different attractor states, each leading to one of the two possible network outputs]. In (B, C), each operationof the thalamic module is itself influenced, not only by the current frame (t), but also by the computation carried by cortexmodule on the prior frame (t − 1). Cortical module is composed of multiple assemblies where each operate similar to thefunction/weight module of the static case. These assemblies are locally recurrent and each cell may be recruited to a differentassembly during each operation. This mechanism could explain why prefrontal cells show mixed selectivity in their responsesto stimuli (as reported in27,83). Through this recursive interaction between thalamus and cortex, cognition emerges not as justa pattern matching computation, but through contextual computation of dynamic data (bottom right schematic drawing).

state, while different inputs map onto different states, 2)slightly different states should map onto identical targets,

3) only recent past should influence the state and networkis essentially unaware of remote past, 4) a phase-locked

10

loop should decode information that is already encodedin time and 5) the combination of 1-4, should optimizesensory processing based on the context. The first threeattributes of such system have close relevance to con-strains and computational properties of higher corticalareas (prefrontal). The same three are also the main fea-tures of reservoir computing, namely “separation prop-erty”, “approximation property” and “fading memory”.Reservoir Computing (RC), with Echo State Networks(ESN)39,40 and Liquid State Machines (LSM)55,56 as twoearly variants of what is now unified as RC, have emergedas powerful RNNs. Although a large enough randomnetwork could essentially represent all combinations ofcortical network84, the training of such system would behard and changing the network from task to task will notbe easily achievable. An advantage of a RC systems isto “non-linearly” map a lower dimensional system to ahigh-dimensional space facilitating classification of the el-ements of the low-dimensional space. The last two prop-erties match the structure and computational constraintsof non-relay thalamic system as a contextual modulatorthat is phasically changing the input to the RC system.In fact, in an RC model of prefrontal cortex, additionof a phase neuron significantly improved the networksperformance in complex cognitive tasks. The phase neu-ron improves the performance by generating input drivenattractor dynamics that best matched the input23. Ina recent study, electronic implementation and numeri-cal studies of a limited RC system of a single nonlin-ear node with delayed feedback showed efficient informa-tion processing4. Such reservoir’s transient dynamical re-sponse follows delay-dynamical systems, and only a lim-ited set of parameters set the rich dynamical propertiesof delay systems38. This system was able to effectivelyprocess time-dependent signals. The phase neuron23 anddelayed dynamical RC4 both show properties that resem-ble the thalamic functions as discussed here. The col-lective system (thalamus and cortex together) is neitherfeedforward, nor locally recurrent, but it has a mixture ofnon-recurrent phase encoder that keeps copies of the pastprocessing and modulates the sensory input relay and itsnext step processing (Fig. 5). These features emphasizethat the perceptual and cognitive processing can not besolely cortico-centric operations.

Biological constrains and the role of thalamus incomputational optimization

Computation and optimization are two sides of thesame coin. But how does the brain optimize the com-putations that would match its required objective, i.e.cognitive processing? There is a current trend of thinkingthat brain optimizes some arbitrary functions, with thehopes that the future discovery of these unknown func-tions may guide us to establish a link between brain’s op-erations and deep learning58. This line of approach to op-timizational (and computational) operations of the brain

Information CostA1 A2

B

C

ζ1

η1

ζ2 ζ3

ζ4

ζ1ζ2ζ3ζ4

ζ1

ζ2

ζ3ζ4

η2

η3 η4

η1

η2

η3 η4

η1 η2 η3 η4Information

Cost

Figure 6. Dynamic role of thalamo-cortical system inthe information/cost optimization. (A) Iso-maps of in-formation (A1) and cost (A2) in the domain specified by ωand λ (functions of cortical and thalamic activity). Informa-tion across each Iso-quant curve (η1 for example) is constantand is achieved at a certain mixture of ω and λ. Optimal in-formation can be obtained by moving outward (arrows, A1 ).Cost optimization can be achieved by moving inward (A2 ).(B) since information and cost are both defined in the domainof ω and λ, thalamus and cortex jointly contribute to informa-tion and cost optimization. The points where the iso-quantcurves’ tangents are equal (black dashed line), provide the op-timal combination of information/cost (green curve). In anycycle of cognitive operation, depending on the prior state ofthe system (ω and λ), the nearest points on the green curveare the optimal solutions for ending that cycle. (C) map-ping of the optima curve to information/cost domain showsall pareto efficient allocations (cyan curve). The slope of theparto frontier shows how the system trades cost versus in-formation: along the pareto curve, efficiency is constant butthe exchange between information and cost is not. All allo-cations inside of this curve could be improved as thalamusand cortex interact. The grey zone shows the biophysicallynot-permissible allocation of computation and resource.

has few flaws. First, it avoids specifying what functionthe brain is supposed to optimize (and as a result it re-mains vague). Second, it refrains from addressing certain

11

limitations that brain has to cope with due to biologicalconstrains. First of these limitations is the importance ofusing just enough resources to solve the current percep-tual problem. Second is the necessity to come up witha solution just in (the needed) time. The importance of“just-enough” and “just-in-time” computation in corti-cal computation should not be overlooked20. If the firstcondition is not met, the organism can not sustain con-tinued activity since the metabolic demand surpasses thededicated energetic expenditure and the animal can notsurvive. In fact, the communication in neural networksare highly constrained by number of factors, specificallythe energetic demands of the network operations51. Fromestimates of the cost of cortical computation52, we knowthat the high cost of spiking forces the brain to rely onsparse communication and using only a small fraction ofthe available neurons5,94. While, theoretically, cortex candedicate a large number of neurons (and very high dy-namical space) to solve any cognitive task, metabolic de-mand of such high-energetic neural activity renders suchmechanism highly inefficient. As a result, the “law of di-minishing returns” dictates that increased energetic costcausing excessive pooling of active neurons to an assem-bly would be penalized73. The penalization for unnec-essary high-energetic neural activity, in itself, should bedriven by the nature of computation rather than beingformulated as a fixed arbitrary threshold imposed by anexternal observer. On the other hand, a system can re-sort to low-cost computation at any given time but ded-icate long enough time to solve the task on hand. Nat-urally, such system would not be very relevant to thebiological systems since time is of essence. If an animaldedicates a long instance of its computational capacityto solve a problem, the environment has changed beforeit reaches a solution and the solution becomes obsolete.A deer would never have an advantage for its brain tohave fully analyzed the visual scene instead of spottingthe approaching wolf and shifting resources to the most-needed task, i.e. escape. As a result, many of the opti-mization techniques and concepts that may be relevantto artificial neural networks are irrelevant to embodiedcomputational cognition of the brain. The optimizationthat the brain requires is not aiming for the best possibleperformance, but rather needs to reach a good mixtureof economy and efficiency.

Not surprisingly, these constrains, i.e. efficiency andeconomy, are cornerstones of homeostasis and are ob-served across many scales in living systems101. Thesimple “Integral feedback” acts as the mainstay of con-trol feedback in such homeostatic systems (such as EColi heat-shock or DNA repair after exposure to gammaradiation)18,21,22,48. Change in input leads to change inthe output and the proportional change in the controlleraiming to reset the output to the desired regime. Theconstrains that we discussed above, remind us of a sim-ilar scenario. Instead of just trying to deal with one fit-ness function at a time (where the minima of the land-scape would be deemed as “the” optima), the brain has to

perform a multi-objective optimization, finding solutionsto both metabolic cost (economy) and just-in-time (effi-ciency) computation. Thus we can infer that a uniquesolution does not exist for such a problem. Rather,any optimization for computational efficiency will costus economy and any optimization for economy will costus efficiency. In such case, a multi-objective optimizationpareto frontier is desirable. Pareto frontier of informa-tion/cost will be the set of solutions where any otherpoint in the space is objectively worse for both of theobjectives28,49,101. As a result, the optimization mecha-nism should push the system to this frontier. The itera-tive dynamical interaction between thalamus and cortexseems to provide an elegant solution for this problem.We discuss this in more details below.

Consider a set of functions, ω and λ of fTh (firing rateof thalamic cell) and fCx (firing rate of cortical cells).Uncertainty (or its opposite, information) and computa-tional cost (a mixture of time and metabolic expense) canboth be mapped to this functional space of ω (fTh, fCx)and λ (fTh, fCx) (Fig. 6A1,2). Let’s define computationalcost and information as product and linear sum of cor-

tical and thalamic activity ( αfnTh.βfnCx, θ

dnfTh

dt+ψ

dnfCx

dt;

with α, β, θ, ψ as coefficients) to reflect the logarithmicnature of information (entropy) and the fact that biolog-ical cost is an accelerating function of the cost-inducingvariables18. The hypothetical space of cost/informationis depicted in Fig. 6, where top panels show indifferencemaps of information (A1) and cost (A2). The examplesimulations and parametric plots of the cost and infor-mation functions defined as above are shown in Fig. 7.In each indifference map, along each iso-quant curve,the total functional attribute is the same. For example,anywhere on the η1 curve, the uncertainty (or informa-tion) in our computational engine is the same. How-ever, different iso-quant curves represent different lev-els of the functional attribute. For example, movingoutward increases information (reduces uncertainty) asη1 < η2 < η3 < η4 and thus if computational cost wasnot a constrain, the optimal solution would have existedon η4 or further away (Fig.6 A1). In contrast, movinginward would preserve the cost (ζ1 < ζ2 < ζ3 < ζ4)if the computational engine did not have the objectiveof reducing uncertainty (Fig. 6A2). Since informationand cost are interdependent and both depend on the in-teraction between thalamus and cortex, we suggest thatinformation/cost optimization happens through an it-erative interaction between thalamus and cortex (notethe blackboard analogy and contextual modulation dis-cussed above). Since we defined both information andcost as a set of iso-quant curves in the functional space ofω (fTh, fCx) and λ (fTh, fCx), they can be co-representedin the same space (Fig. 6B). Optimal solutions for in-formation/cost optimizations are simply the solutions towhere the tangents of the iso-quant curves are equal (seethe tangents [black dashed lines] and points A, B, C andD, in Fig. 6B). These points create a set of optimal solu-tions for the tradeoff between information and cost (green

12

curve). Mapping of the optimal solutions to the compu-tational efficiency space E, gives us the pareto efficientcurve (cyan curve, Fig. 6C). Anywhere inside the curveis not pareto efficient (i.e. information gain and com-putational cost can change in such a way that, collec-tively, the system can be in a better state (on the paretocurve). Points outside of the pareto efficient curve arenot available to the current state of the system due tothe coefficients of ω and λ. A change in these coefficientscan potentially shape a different co-representation of in-formation and cost (see Fig 7, top row for 3 differentinstances of ω and λ based on different α, β, θ, ψ val-ues), and thus a different pareto efficient curve (see Fig 7,bottom row). These different possible pareto frontierscan be set based on the prior state of the system andthe complexity of the computational problem on hand.Nonetheless, the computational efficiency of the systemcan not be infinitely pushed outward because of the sys-tem’s intrinsic biophysical constrains (neurons and theirwiring). The shaded region in Fig 7, bottom row, showsthis non-permissible zone.

In the defined computational efficiency space E, com-posed of the two variables information and cost (as theobjective functions, shown in bottom panels of Fig. 6 andFig. 7), solving a computational problem is representedby a decrease in uncertainty. However, any change inuncertainty has an associated cost. First derivative ofthe pareto frontier shows “marginal rate of substitution”

as∆info

∆cost. This ratio varies among different points on

the pareto efficient curve. If we take two points on thepareto curve in the computational efficiency space, suchas A and C for example, computational efficiency of thesetwo points are equal EA(η1,λ4) = EC(η3,λ2). The changein efficiency of point A with respect to information andcost, are the partial derivatives ∂EA

∂info and ∂EA

∂ cos t , respec-

tively. As a result, ∂EA

∂ inf odinf o + ∂EA

∂ cos tdcos t = 0, meaningthat there is constant efficiency along the pareto curve,the tradeoff between information and cost is not con-stant. The optimization in this space is not based onsome fixed built-in algorithm or arbitrary thresholds byan external observer. Rather, information/cost optimiza-tion is the result of back and forth interaction betweenthalamus and cortex. Based on the computational per-spective that we have portrayed, thalamus seems to bepoised to operate as an optimizer. Thalamus receives acopy of (sensory) input while relaying it, and receives anefferent copy from the processor (cortex), while trying toefficiently bind the information from past and presentand sending it back to cortex. The outcome of suchemergent optimization, is a pareto front in the economy-efficiency landscape (Fig. 6,7). If the cortex were to bethe sole conductor of cognitive processing, the dynamicsof the relay and cortical processing would meander in theparameter space and not yielding any optimization thatcan provide a feasible solution to economic and just-in-time computation. Such system is doomed to fail, eitherdue to metabolic costs or due to computational freezeover time ; thus more or less be a useless cognitive en-

gine. In contrast, with the help of an optimizer thatacts as a contextual modulator, the acceptable parame-ters will be confined to a manifold within the parameterspace. Such regime would be a sustainable and favorabledomain for cognitive computing. This property showsanother important facet of a thalamo-cortical computa-tional cognitive system and the need to move passed thecortico-centric view of cognition.

Figure 7. Dynamic parameter space of thalamo-cortical joint optimization of information/cost. Threedifferent realization of information/cost interaction as a func-tion of thalamic and cortical activity (ω, λ) and the corre-sponding pareto curves (see Fig. 6 for details of this opti-mization construct). Pareto curve shows the optimal set ofboth cost and information that can be obtained given the bio-physical constrains of neurons and networks connecting them.Every point on the pareto frontier shows technically efficientlevels for a given parameter set of ω, λ (see text for more de-tails). All the points inside the curve are feasible but are notmaximally efficient. The slope (marginal rate of transforma-tion between cost and information) shows how in order toincrease information, cost has to change. The dynamic na-ture of interaction between thalamus and cortex enables anemergent optimization of information/cost depending on thecomputational problem on hand and the prior state of thesystem.

Concluding remarks: Reframing Thalamic function aboveand beyond information relay

Lately, new evidence about the possible role of tha-lamus has started to challenged the cortico-centric viewof perception/cognition. Anatomical studies and physio-logical measurements have begun to unravel the impor-tance of the Cortico-Thalamo-Cortical loops in cognitiveprocesses6,77. Under this emerging paradigm, thalamusplays two distinctive roles: a) information relay, b) mod-ulation of cortical function93, where the neocortex doesnot work in isolation but is largely dependent on tha-

13

lamus. In contrast to cortical networks which operateas specialized memory devices via their local recurrentexcitatory connections, the thalamus is devoid of localconnections, and is instead optimized for capturing stateinformation that is distributed across multiple corticalnodes while animals are engaged in context-dependent

task switching88. This allows the thalamus to explic-itly represent task context (corresponding to differentcombinations of cortical states), and through its uniqueprojection patterns to the cortex, different thalamic in-puts modify the effective connections between corticalneurons9,88.

Figure 8. The emergent view of thalamic role in cognition. (Top) In the traditional view, serial processing of informationconfines the role of thalamus to only a relay station. (Bottom) the view that is discussed in this manuscript considers thalamusas a key player in cognition, above and beyond relay to sensory cortices. Through combining the efferent readout from cortexwith sensory afferent, MD-like thalamic nuclei modulate further activity of the higher cortex. The contextual modulationenabled by MD is composed of distinctively parallel operations (individual circles represent the non-recurrent nature of theseprocesses due to lack of local excitatory connections). Under this view, and the computational operatives discussed here, thethalamo-cortical system (and not just cortex) is in charge of contextual cognitive computing. The computation enabled byPulvinar/PO like nuclei is different from LGN and also from MD-like nuclei.

Here, we started with a brief overview of the architec-ture of thalamus, the back and forth communication be-tween thalamus and cortex, then we provided the electro-physiological evidence of thalamic modulatory function,and concluded with a computational frame that encap-sulates the architectural and functional attributes of thethalamic role in cognition. In such frame, the computa-tional efficiency of the cognitive computing machinery isachieved through iterative interactions between thalamusand cortex embedded in the hierarchical organization(Fig. 4, 5). Under this emergent view, thalamus servesnot only as relay, but also as a read/write medium forcortical processing , playing a crucial role in contextual

modulation of cognition (Fig. 8). Such multiscale organi-zation of computational processes is a necessary require-ment for design of the intelligent systems16,95,96. Dis-tributed computing in biological systems in most casesoperates without central control70. This is well reflectedin the computational perspective that we discussed here.We suggest that through the continuous contextual mod-ulation of cortical activity, thalamus (along with cortex)plays a significant role in emergent optimization of com-putational efficiency and computational cost. This phe-nomenon has a deep relation with phase transitions incomplex networks. Different states (phases) of the net-work are associated with the connectivity of the com-

14

puting elements (see thalamic weight/pointer and cor-tical function/weight modules in Fig. 5). Interestingly,intrinsic properties of the complex networks do not de-fine the phase transitions in system. Rather, the inter-play of the system with its external environment shapesthe landscape where phase transitions occur91. This par-allel in well-studied physical systems and neuronal net-works of thalamo-cortical system show the importance ofthe interplay between thalamus and cortex in cognitivecomputation and optimization. The proposed frame forcontextual cognitive computation and the emergent in-formation/cost optimization in thalamo-cortical systemcan guide us in designing novel AI architecture.

ACKNOWLEDGMENTS

We wish to thank Michael Halassa for helpful discus-sions.

REFERENCES

1Ahissar, E., and Kleinfeld, D. Closed-loop Neuronal Com-putations: Focus on Vibrissa Somatosensation in Rat. CerebralCortex 13, 1 (jan 2003), 53–62.

2Ahissar, E., and Oram, T. Thalamic Relay or Cortico-Thalamic Processing? Old Question New Answers. CerebralCortex 25, 4 (nov 2015), 845–848.

3Alcaraz, F., Fresno, V., Marchand, A. R., Kremer, E. J.,Coutureau, E., and Wolff, M. Thalamocortical and corti-cothalamic pathways differentially contribute to goal-directedbehaviors in the rat. eLife 7 (feb 2018).

4Appeltant, L., Soriano, M. C., Van der Sande, G., Danck-aert, J., Massar, S., Dambre, j., Schrauwen, B., Mirasso,C. R., and Fischer, I. Information processing using a singledynamical node as complex system. Nature Communications 2(sep 2011), 468.

5Baddeley, R., Abbott, L. F., Booth, M. C. A., Sengpiel,F., Freeman, T., Wakeman, E. A., and Rolls, E. T. Re-sponses of neurons in primary and inferior temporal visual cor-tices to natural scenes. Proceedings of the Royal Society B:Biological Sciences 264, 1389 (dec 1997), 1775–1783.

6Basso, M. A., Uhlrich, D., and Bickford, M. E. CorticalFunction: A View from the Thalamus. Neuron 45, 4 (feb 2005),485–488.

7Bertschinger, N., and Natschlager, T. Real-time computa-tion at the edge of chaos in recurrent neural networks. NeuralComput 16 (Jul 2004), 1413–36.

8Bickford, M. E. Thalamic Circuit Diversity: Modulation ofthe Driver/Modulator Framework. Frontiers in Neural Circuits9 (jan 2016).

9Bolkan, S. S., Stujenske, J. M., Parnaudeau, S., Spellman,T. J., Rauffenbart, C., Abbas, A. I., Harris, A. Z., Gor-don, J. A., and Kellendonk, C. Thalamic projections sustainprefrontal activity during working memory maintenance. NatureNeuroscience 20, 7 (may 2017), 987–996.

10Butler, A. B., and Hodos, W. Comparative Vertebrate Neu-roanatomy. John Wiley & Sons Inc., aug 2005.

11Clasca, F., Rubio-Garrido, P., and Jabaudon, D. Unveilingthe diversity of thalamocortical neuron subtypes. Eur J Neu-rosci 35 (May 2012), 1524–32.

12Corbetta, M. Frontoparietal cortical networks for directingattention and the eye to visual locations: identical, indepen-

dent, or overlapping neural systems? Proceedings of NationalAcademy of Science 95 (Feb 1998), 831–8.

13Costa, R. P., Assael, Y. M., Shillingford, B., de Freitas,N., and Vogels, T. P. Cortical microcircuits as gated-recurrentneural networks. arXiv (2017).

14Crick, F. Function of the thalamic reticular complex: thesearchlight hypothesis. Proceedings of the National Academyof Sciences 81, 14 (jul 1984), 4586–4590.

15Deacon, T. W. Rethinking Mammalian Brain Evolution.American Zoologist 30, 3 (aug 1990), 629–705.

16Dehghani, N. Design of the Artificial: lessons from the biolog-ical roots of general intelligence. arXiv (2017).

17Dehghani, N. Theoretical principles of multiscale spatiotempo-ral control of neuronal networks: a complex systems perspective.BiorXiv (jan 2017).

18Dekel, E., and Alon, U. Optimality and evolutionary tuningof the expression level of a protein. Nature 436, 7050 (jul 2005),588–592.

19Deniau, J. M., and Chevalier, G. Disinhibition as a basic pro-cess in the expression of striatal functions. II. The striato-nigralinfluence on thalamocortical cells of the ventromedial thalamicnucleus. Brain Research 334, 2 (may 1985), 227–233.

20Douglas, R. J., and Martin, K. A. Mapping the Matrix: TheWays of Neocortex. Neuron 56, 2 (oct 2007), 226–238.

21El-Samad, H. J., Goff, J. P., and Khamash, M. H. CalciumHomeostasis and Parturient Hypocalcemia: An Integral Feed-back Perspective. Journal of Theoretical Biology 214, 1 (jan2002), 17–29.

22El-Samad, H. J., Kurata, H., Doyle, J. C., Gross, C. A.,and Khamash, M. H. Surviving heat shock: Control strategiesfor robustness and performance. Proceedings of the NationalAcademy of Sciences 102, 8 (jan 2005), 2736–2741.

23Enel, P., Procyk, E., Quilodran, R., and Dominey, P. F.Reservoir Computing Properties of Neural Dynamics in Pre-frontal Cortex. PLOS Computational Biology 12, 6 (jun 2016),e1004967.

24Felleman, D. J., and Essen, D. C. V. Distributed HierarchicalProcessing in the Primate Cerebral Cortex. Cerebral Cortex 1,1 (jan 1991), 1–47.

25FitzGibbon, T., Erikoz, B., Grunert, U., and Martin, P. R.Analysis of the lateral geniculate nucleus in dichromatic andtrichromatic marmosets. Journal of Comparative Neurology523, 13 (jul 2015), 1948–1966.

26Funahashi, K., and Nakamura, Y. Approximation of dynami-cal systems by continuous time recurrent neural networks. Neu-ral Networks 6, 6 (jan 1993), 801–806.

27Fusi, S., Miller, E. K., and Rigotti, M. Why neurons mix:high dimensionality for higher cognition. Current Opinion inNeurobiology 37 (apr 2016), 66–74.

28Godfrey, P., Shipley, R., and Gryz, J. Algorithms and anal-yses for maximal vector computation. The VLDB Journal 16,1 (sep 2006), 5–28.

29Goldberg, J. H., Farries, M. A., and Fee, M. S. Basalganglia output to the thalamus: still a paradox. Trends in Neu-rosciences 36, 12 (dec 2013), 695–705.

30Grant, E., Hoerder-Suabedissen, A., and Molnar, Z. De-velopment of the Corticothalamic Projections. Frontiers in Neu-roscience 6 (2012).

31Groh, A., Bokor, H., Mease, R. A., Plattner, V. M.,Hangya, B., Stroh, A., Deschenes, M., and Acsady, L.Convergence of Cortical and Sensory Driver Inputs on SingleThalamocortical Cells. Cerebral Cortex 24, 12 (jul 2014), 3167–3179.

32Halassa, M. M., and Acsady, L. Thalamic Inhibition: DiverseSources Diverse Scales. Trends in Neurosciences 39, 10 (oct2016), 680–693.

33Halassa, M. M., and Kastner, S. Thalamic functions in dis-tributed cognitive control. Nature Neuroscience 20, 12 (nov2017), 1669–1679.

34Harth, E. M., Unnikrishnan, K. P., and Pandya, A. S. The

15

inversion of sensory processing by feedback pathways: a modelof visual cognitive functions. Science 237, 4811 (jul 1987), 184–187.

35Heeger, D. J. Theory of cortical function. Proceedings of theNational Academy of Sciences 114, 8 (feb 2017), 1773–1782.

36Hubel, D. H., and Wiesel, T. N. Receptive fields of singleneurones in the cat's striate cortex. The Journal of Physiology148, 3 (oct 1959), 574–591.

37Hubel, D. H., and Wiesel, T. N. Receptive fields binocularinteraction and functional architecture in the cat's visual cortex.The Journal of Physiology 160, 1 (jan 1962), 106–154.

38Ikeda, K., and Matsumoto, K. High-dimensional chaotic be-havior in systems with time-delayed feedback. Physica D: Non-linear Phenomena 29, 1-2 (nov 1987), 223–235.

39Jaeger, H. The echo state approach to analysing and trainingrecurrent neural networks. Tech. rep., 2001.

40Jaeger, H. Echo state network. Scholarpedia 2, 9 (2007), 2330.41Jazayeri, M., and Shadlen, M. N. A Neural Mechanism for

Sensing and Reproducing a Time Interval. Current Biology 25,20 (oct 2015), 2599–2609.

42Jones, E. G. Functional subdivision and synaptic organizationof the mammalian thalamus. Int Rev Physiol 25 (1981), 173–245.

43Jones, E. G. Principles of Thalamic Organization. In TheThalamus. Springer US, 1985, pp. 85–149.

44Jones, E. G. Viewpoint: the core and matrix of thalamic orga-nization. Neuroscience 85, 2 (apr 1998), 331–345.

45Kakei, S., Na, J., and Shinoda, Y. Thalamic terminal mor-phology and distribution of single corticothalamic axons origi-nating from layers 5 and 6 of the cat motor cortex. The Journalof Comparative Neurology 437, 2 (Aug 2001), 170–185.

46Karten, H. J. Evolutionary developmental biology meets thebrain: The origins of mammalian cortex. Proceedings of theNational Academy of Sciences 94, 7 (apr 1997), 2800–2804.

47Komura, Y., Nikkuni, A., Hirashima, N., Uetake, T., andMiyamoto, A. Responses of pulvinar neurons reflect a subject'sconfidence in visual categorization. Nature Neuroscience 16, 6(may 2013), 749–755.

48Krishna, S., Maslov, S., and Sneppen, K. UV-Induced Mu-tagenesis in Escherichia coli SOS Response: A QuantitativeModel. PLoS Computational Biology 3, 3 (2007), e41.

49Kung, H.-T., Luccio, F., and Preparata, F. P. On Findingthe Maxima of a Set of Vectors. Journal of the ACM 22, 4 (oct1975), 469–476.

50Kuramoto, E., Pan, S., Furuta, T., Tanaka, Y. R., Iwai, H.,Yamanaka, A., Ohno, S., Kaneko, T., Goto, T., and Hioki,H. Individual mediodorsal thalamic neurons project to multipleareas of the rat prefrontal cortex: A single neuron-tracing studyusing virus vectors. Journal of Comparative Neurology 525, 1(jul 2017), 166–185.

51Laughlin, S. B., and Sejnowski, T. J. Communication inneuronal networks. Science 301 (Sep 2003), 1870–4.

52Lennie, P. The Cost of Cortical Computation. Current Biology13, 6 (mar 2003), 493–497.

53Lin, H. W., Tegmark, M., and Rolnick, D. Why Does Deepand Cheap Learning Work So Well? Journal of StatisticalPhysics 168, 6 (jul 2017), 1223–1247.

54Ma, W. J., and Jazayeri, M. Neural Coding of Uncertaintyand Probability. Annual Review of Neuroscience 37, 1 (jul2014), 205–220.

55Maass, W., Natschlaeger, T., and Markram, H. Computa-tional Models for Generic Cortical Microcircuits. In Computa-tional Neuroscience. Chapman and Hall/CRC, oct 2003.

56Maass, W., Natschlager, T., and Markram, H. Real-TimeComputing Without Stable States: A New Framework for Neu-ral Computation Based on Perturbations. Neural Computation14, 11 (nov 2002), 2531–2560.

57Mante, V., Sussillo, D., Shenoy, K. V., and Newsome,W. T. Context-dependent computation by recurrent dynam-ics in prefrontal cortex. Nature 503, 7474 (nov 2013), 78–84.

58Marblestone, A. H., Wayne, G., and Kording, K. P. Towardan Integration of Deep Learning and Neuroscience. Frontiers inComputational Neuroscience 10 (sep 2016).

59Marenco, S., Stein, J. L., Savostyanova, A. A., Sambataro,F., Tan, H.-Y., Goldman, A. L., Verchinski, B. A., Bar-nett, A. S., Dickinson, D., Apud, J. A., Callicott, J. H.,Meyer-Lindenberg, A., and Weinberger, D. R. Investiga-tion of Anatomical Thalamo-Cortical Connectivity and fMRIActivation in Schizophrenia. Neuropsychopharmacology 37, 2(sep 2012), 499–507.

60Mathers, L. H. The synaptic organization of the cortical pro-jection to the pulvinar of the squirrel monkey. The Journal ofComparative Neurology 146, 1 (sep 1972), 43–59.

61McCormick, D. A., and von Krosigk, M. Corticothala-mic activation modulates thalamic firing through glutamatemetabotropic receptors. Proceedings of the National Academyof Sciences 89, 7 (apr 1992), 2774–2778.

62Mesulam, M. M. Large-scale neurocognitive networks and dis-tributed processing for attention, language, and memory. AnnNeurol 28 (Nov 1990), 597–613.

63Middleton, F., and Strick, P. L. Anatomical evidence forcerebellar and basal ganglia involvement in higher cognitivefunction. Science 266, 5184 (oct 1994), 458–461.

64Mitchell, A. The mediodorsal thalamus as a higher order tha-lamic relay nucleus important for learning and decision-making.Neurosci Biobehav Rev 54 (Jul 2015), 76–88.

65Mitelman, S. A., Byne, W., Kemether, E. M., Hazlett,E. A., and Buchsbaum, M. S. Metabolic Disconnection Be-tween the Mediodorsal Nucleus of the Thalamus and CorticalBrodmann’s Areas of the Left Hemisphere in Schizophrenia.American Journal of Psychiatry 162, 9 (sep 2005), 1733–1735.

66Mumford, D. On the computational architecture of the neo-cortex. I: The role of the thalamo-cortical loop. Biological Cy-bernetics 65, 2 (jun 1991), 135–145.

67Mumford, D. On the computational architecture of the neocor-tex. II The role of cortico-cortical loops. Biological Cybernetics66, 3 (jan 1992), 241–251.

68Nair, A., Treiber, J. M., Shukla, D. K., Shih, P., andMuller, R.-A. Impaired thalamocortical connectivity in autismspectrum disorder: a study of functional and anatomical con-nectivity. Brain 136, 6 (jun 2013), 1942–1955.

69Nakajima, M., and Halassa, M. M. Thalamic control of func-tional cortical connectivity. Current Opinion in Neurobiology44 (jun 2017), 127–131.

70Navlakha, S., and Bar-Joseph, Z. Distributed informationprocessing in biological and computational systems. Communi-cations of the ACM 58, 1 (dec 2014), 94–102.

71Newell, A. Some problems of basic organization in problemsolving programs, 1962.

72Nii, P. The Blackboard Model of Problem Solving and theEvolution of Blackboard Architectures. The AI Magazine 7, 2(1986), 38–53.

73Niven, J. E., and Laughlin, S. B. Energy limitation as aselective pressure on the evolution of sensory systems. Journalof Experimental Biology 211, 11 (jun 2008), 1792–1804.

74Northcutt, R. G. Evolution of the Telencephalon in Nonmam-mals. Annual Review of Neuroscience 4, 1 (mar 1981), 301–350.

75Northcutt, R. G., and Kaas, J. H. The emergence and evo-lution of mammalian neocortex. Trends in Neurosciences 18, 9(sep 1995), 373–379.

76Ohno, S., Kuramoto, E., Furuta, T., Hioki, H., Tanaka,Y., Fujiyama, F., Sonomura, T., Uemura, M., Sugiyama, K.,and Kaneko, T. A morphological analysis of thalamocorticalaxon fibers of rat posterior thalamic nuclei: a single neurontracing study with viral vectors. Cereb Cortex 22 (Dec 2012),2840–57.

77Parnaudeau, S., Bolkan, S. S., and Kellendonk, C. TheMediodorsal Thalamus: An Essential Partner of the PrefrontalCortex for Cognition. Biological Psychiatry (nov 2017).

78Parnaudeau, S., Taylor, K., Bolkan, S. S., Ward, R. D.,

16

Balsam, P. D., and Kellendonk, C. Mediodorsal ThalamusHypofunction Impairs Flexible Goal-Directed Behavior. Biolog-ical Psychiatry 77, 5 (mar 2015), 445–453.

79Partlow, G. D., Colonnier, M., and Szabo, J. Thalamic pro-jections of the superior colliculus in the rhesus monkey,Macacamulatta. A light and electron microscopic study. The Journalof Comparative Neurology 171, 3 (feb 1977), 285–317.

80Pinault, D. The thalamic reticular nucleus: structure functionand concept. Brain Research Reviews 46, 1 (aug 2004), 1–31.

81Purushothaman, G., Marion, R., Li, K., and Casagrande,V. Gating and control of primary visual cortex by pulvinar. NatNeurosci 15 (Jun 2012), 905–12.

82Raczkowski, D., and Fitzpatrick, D. Terminal arbors of in-dividual physiologically identified geniculocortical axons in thetree shrew's striate cortex. The Journal of Comparative Neu-rology 302, 3 (dec 1990), 500–514.

83Rigotti, M., Barak, O., Warden, M. R., Wang, X.-J., Daw,N. D., Miller, E. K., and Fusi, S. The importance of mixedselectivity in complex cognitive tasks. Nature 497, 7451 (may2013), 585–590.

84Rigotti, M., Rubin, D. B. D., Wang, X.-J., and Fusi, S.Internal representation of task rules by recurrent dynamics: theimportance of the diversity of neural responses. Frontiers inComputational Neuroscience 4 (2010).

85Rouiller, E. M., and Welker, E. A comparative analysisof the morphology of corticothalamic projections in mammals.Brain Research Bulletin 53, 6 (dec 2000), 727–741.

86Rovo, Z., Ulbert, I., and Acsady, L. Drivers of the primatethalamus. J Neurosci 32 (Dec 2012), 17894–908.

87Salas, C., Broglio, C., and Rodrıguez, F. Evolution of Fore-brain and Spatial Cognition in Vertebrates: Conservation acrossDiversity. Brain Behavior and Evolution 62, 2 (2003), 72–82.

88Schmitt, L. I., Wimmer, R. D., Nakajima, M., Happ, M.,Mofakham, S., and Halassa, M. M. Thalamic amplificationof cortical connectivity sustains attentional control. Nature 545,7653 (may 2017), 219–223.

89Scott, B. B., Constantinople, C., Akrami, A., Hanks,T. D., Brody, C. D., and Tank, D. W. Fronto-parietal Corti-cal Circuits Encode Accumulated Evidence with a Diversity ofTimescales. Neuron 95 (Jul 2017), 385–398.e5.

90Selfridge, O. G. Pandemonium: a paradigm for learning in .In Proceedings of the Symposium on Mechanisation of ThoughtProcesses (London, nov 1959), D. V. Blake and A. M. Uttley,Eds., National Physical Laboratory, pp. 513–526.

91Seoane, L. F., and Sole, R. Phase transitions in Pareto opti-mal complex networks. Physical Review E 92, 3 (sep 2015).

92Sherman, S. M. Thalamus plays a central role in ongoing corti-cal functioning. Nature Neuroscience 16, 4 (mar 2016), 533–541.

93Sherman, S. M., and Guillery, R. W. Functional Connec-tions of Cortical Areas. The MIT Press, aug 2013.

94Shoham, S., O’Connor, D. H., and Segev, R. How silent isthe brain: is there a “dark matter” problem in neuroscience?Journal of Comparative Physiology A 192, 8 (mar 2006), 777–

784.95Simon, H. A. The Architecture of Complexity. Proceedings of

the American Philosophical Society 106, 6 (1962), 467–482.96Simon, H. A. The Sciences of the Artificial. MIT Press, 1969,

ch. The Architecture of Complexity.97Steriade, M., Domich, L., and Oakson, G. Reticularis tha-

lami neurons revisited: activity changes during shifts in statesof vigilance. J Neurosci 6 (Jan 1986), 68–81.

98Steriade, M., and Llinas, R. R. The functional states of thethalamus and the associated neuronal interplay. PhysiologicalReviews 68, 3 (jul 1988), 649–742.

99Steriade, M., and Pare, D. Morphology and electroresponsiveproperties of thalamic neurons. In Gating in Cerebral Networks.Cambridge University Press, 2007, pp. 1–26.

100Steriade, M., Parent, A., and Hada, J. Thalamic projectionsof nucleus reticularis thalami of cat: A study using retrogradetransport of horseradish peroxidase and fluorescent tracers. The

Journal of Comparative Neurology 229, 4 (nov 1984), 531–547.101Szekely, P., Sheftel, H., Mayo, A., and Alon, U. Evolution-

ary Tradeoffs between Economy and Effectiveness in BiologicalHomeostasis Systems. PLoS Computational Biology 9, 8 (aug2013), e1003163.

102Urbain, N., and Deschenes, M. Motor Cortex Gates VibrissalResponses in a Thalamocortical Projection Pathway. Neuron56, 4 (nov 2007), 714–725.

103Wang, J., Narain, D., Hosseini, E. A., and Jazayeri, M.Flexible timing by temporal scaling of cortical responses. NatureNeuroscience (dec 2018).

104Wang, X.-J. Neural dynamics and circuit mechanisms ofdecision-making. Current Opinion in Neurobiology 22, 6 (dec2012), 1039–1046.

105Wei, W., and Wang, X.-J. Inhibitory Control in the Cortico-Basal Ganglia-Thalamocortical Loop: Complex Regulation andInterplay with Memory and Decision Processes. Neuron 92, 5(dec 2016), 1093–1105.

106Wiecki, T. V., and Frank, M. J. A computational model ofinhibitory control in frontal cortex and basal ganglia. Psycho-logical Review 120, 2 (2013), 329–355.

107Wong, K.-F., and Wang, X.-J. A Recurrent Network Mech-anism of Time Integration in Perceptual Decisions. Journal ofNeuroscience 26, 4 (jan 2006), 1314–1328.

108Woodward, N. D., Giraldo-Chica, M., Rogers, B., andCascio, C. J. Thalamocortical Dysconnectivity in Autism Spec-trum Disorder: An Analysis of the Autism Brain Imaging DataExchange. Biological Psychiatry: Cognitive Neuroscience andNeuroimaging 2, 1 (jan 2017), 76–84.

109Yamins, D. L. K., and DiCarlo, J. J. Using goal-driven deeplearning models to understand sensory cortex. Nature Neuro-science 19, 3 (feb 2016), 356–365.

110Yamins, D. L. K., Hong, H., Cadieu, C. F., Solomon, E. A.,Seibert, D., and DiCarlo, J. J. Performance-optimized hier-archical models predict neural responses in higher visual cortex.Proceedings of the National Academy of Sciences 111, 23 (may2014), 8619–8624.

Documents

A computational perspective of the role of Thalamus in …...Thalamus has traditionally been considered as only a relay source of cortical inputs, with hierarchically organized cortical