
Computational Explorations in Cognitive Neuroscience

Chapter 2


2.4 The Electrophysiology of the Neuron

Some basic principles of electricity are useful for understanding the function of neurons, because a good deal of that function involves charged atomic particles called ions. A key concept in understanding the electrophysiology of the neuron is the transmembrane potential. Three major ion types contribute to changes in the transmembrane potential: Na+, K+, and Cl-. The neuron membrane is permeable to each of these ions by virtue of membrane channels or pores. The membranes of neurons and other cell types contain a molecular apparatus called the Na-K pump, which normally acts to extrude Na+ from the cell and collect K+. As a result, Na+ ions are typically in greater concentration outside the cell than inside, and K+ ions are in greater concentration inside the cell. Cl- ions are extruded from the cell due to the transmembrane potential established by the pump. The equilibrium potential for each ion type is the potential at which all the forces acting on that ion across the membrane are balanced.


The difference between the transmembrane potential (V_m) and the equilibrium potential (E) creates an electromotive force on the ion species, thereby producing a flow of ions, i.e. a current, across the membrane:

$$ I = G\,(V_m - E) \qquad (2.4) $$

where I is the transmembrane current and G is the ion's conductance. The total conductance for ion c may be expressed as the product of the maximum conductance that would occur if all the ion channels were open ($\bar{g}_c$) and the fraction of the total number of channels that are open at any given time t ($g_c(t)$). Thus:

$$ I_c = g_c(t)\,\bar{g}_c\,\big(V_m(t) - E_c\big) \qquad (2.5) $$

We now consider the three fundamental channels and their currents that affect the transmembrane potential:

1) excitatory synaptic input channel (activated by glutamate and passing Na+)

2) inhibitory synaptic input channel (activated by GABA and passing Cl-)

3) leak channel (always open and passing K+) The net sum of all three currents is:

$$ I_{net} = g_e(t)\,\bar{g}_e\,\big(V_m(t) - E_e\big) + g_i(t)\,\bar{g}_i\,\big(V_m(t) - E_i\big) + g_l(t)\,\bar{g}_l\,\big(V_m(t) - E_l\big) \qquad (2.6) $$


This net current (I_net in the simulator) is used to update the transmembrane potential V_m (v_m in the simulator) as:

$$ V_m(t+1) = V_m(t) + dt_{vm}\, I_{net} = V_m(t) + dt_{vm}\Big[ g_e(t)\,\bar{g}_e\,\big(E_e - V_m(t)\big) + g_i(t)\,\bar{g}_i\,\big(E_i - V_m(t)\big) + g_l(t)\,\bar{g}_l\,\big(E_l - V_m(t)\big) \Big] \qquad (2.8) $$

(Note that the sign of each conductance term is flipped relative to (2.6), so that each current drives V_m toward the corresponding equilibrium potential.)

where dt_vm is a time constant reflecting the slowing of changes in potential due to the capacitance of the cell membrane.

Neural function can be modeled at different levels. One approach is to construct detailed models of the single neuron, realistically modeling the entire neuronal shape in great detail with realistic compartments for each small patch of membrane, and containing all known membrane components. This is a valuable approach, but precludes the possibility of also modeling networks of neurons. A simplifying approach to network modeling is to treat each individual cell as a point neuron, in which the spatial features of the dendritic tree, soma, and axon are ignored, and the entire neuron is modeled by a single equation. Leabra uses this simplification to model neurons as point processes.
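As a minimal illustration, the point neuron update in Eq. (2.8) can be sketched in Python as a single Euler step (the parameter values below are placeholders in the normalized 0-1 range, not Leabra defaults):

```python
def update_vm(v_m, g_e, g_i, g_l,
              gbar_e=1.0, gbar_i=1.0, gbar_l=0.1,
              E_e=1.0, E_i=0.15, E_l=0.15, dt_vm=0.2):
    """One update step of the point neuron membrane potential, Eq. (2.8).

    Each channel drives v_m toward its equilibrium potential E_c,
    in proportion to its total conductance g_c(t) * gbar_c.
    """
    i_net = (g_e * gbar_e * (E_e - v_m) +
             g_i * gbar_i * (E_i - v_m) +
             g_l * gbar_l * (E_l - v_m))
    return v_m + dt_vm * i_net
```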


Note from (2.8) that if I_net is zero, then V_m does not change, i.e. it is at rest. The resting transmembrane potential may be written as:

$$ V_m = \frac{g_e\,\bar{g}_e\,E_e + g_i\,\bar{g}_i\,E_i + g_l\,\bar{g}_l\,E_l}{g_e\,\bar{g}_e + g_i\,\bar{g}_i + g_l\,\bar{g}_l} \qquad (2.9) $$

We see that the resting potential is determined by the three equilibrium potentials, each weighted by the magnitude of their conductances. The greater the excitatory conductance, the greater is the effect on Vm of the Na+ equilibrium potential; the greater the inhibitory conductance, the greater is the effect of the Cl- equilibrium potential; and the greater the leak conductance, the greater is the effect of the K+ equilibrium potential. Note that Ee is in the depolarizing direction, whereas Ei and El are in the hyperpolarizing direction. Therefore, inhibition and leak counteract excitation.
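A small worked example of Eq. (2.9) (a sketch in Python; the conductance and equilibrium values are illustrative assumptions in the normalized range):

```python
def resting_vm(g_e, g_i, g_l,
               gbar_e=1.0, gbar_i=1.0, gbar_l=0.1,
               E_e=1.0, E_i=0.15, E_l=0.15):
    """Equilibrium membrane potential, Eq. (2.9): a conductance-weighted
    average of the three equilibrium potentials."""
    num = g_e * gbar_e * E_e + g_i * gbar_i * E_i + g_l * gbar_l * E_l
    den = g_e * gbar_e + g_i * gbar_i + g_l * gbar_l
    return num / den

print(resting_vm(g_e=0.0, g_i=0.0, g_l=1.0))  # leak only -> rests at E_l = 0.15
print(resting_vm(g_e=0.4, g_i=0.0, g_l=1.0))  # excitation pulls V_m toward E_e (0.83)
```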


2.5 Computational Implementation of the Neural Activation Function

The point neuron activation function used in Leabra attempts to strike a balance between computational efficiency and neurobiological reality. The point neuron model has important differences from the more abstract artificial neural network (ANN) model. In ANNs, the net input to a unit is defined as:

$$ \eta_j = \sum_i x_i w_{ij} \qquad (2.11) $$

and the transformation of the net input into an activation value that is sent to other units is defined as:

$$ y_j = \frac{1}{1 + e^{-\eta_j}} \qquad (2.12) $$

This latter equation has the form of a sigmoid function, which provides the saturating nonlinearity needed to reflect the natural behavior of neurons: an approximately linear input-output relation in the middle range, with saturation at low and high input values.
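For comparison with the Leabra equations that follow, here is a minimal sketch of Eqs. (2.11) and (2.12) in Python:

```python
import math

def ann_unit(x, w):
    """Standard ANN unit: net input (Eq. 2.11) through a sigmoid (Eq. 2.12)."""
    eta = sum(xi * wi for xi, wi in zip(x, w))  # eta_j = sum_i x_i * w_ij
    return 1.0 / (1.0 + math.exp(-eta))         # y_j = 1 / (1 + e^(-eta_j))
```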


In Leabra, there is an explicit separation between excitatory and inhibitory inputs. Only excitatory neurons are explicitly modeled. (In the next chapter, we will see a computational approximation that is used to represent inhibitory influences.) The excitatory input conductance is computed as an average over all weighted inputs to the neuron, normalized by the number of inputs:

$$ g_e(t) = \langle x_i w_{ij} \rangle = \frac{1}{n} \sum_i x_i w_{ij} \qquad (2.13) $$


In ANNs, the neuronal production of an action potential (spike) can be represented by a threshold with a binary activation value (i.e. 1 if Vm is over threshold, 0 otherwise). However, this is not useful when we consider the output of neurons in terms of a rate code, as most Leabra models do. For a rate code, we can consider the neuronal output to be a continuous, real-valued number reflecting the instantaneous rate of spike firing. The spike rate can represent the output of either a single spiking neuron or a population of spiking neurons. A thresholded sigmoid nonlinearity (as used in ANNs) is a good continuous-valued approximation to the output of spiking neurons.


2.5.1 Computing Input Conductances

Details of Input Conductance Computation

The excitatory synaptic input to a neuron depends on the fraction of excitatory input channels that are open. This is computed in Leabra as the product of the sending activation (x_i) times the weight (w_ij) for that input projection (wt in the simulator). A projection is a group of inputs on a dendritic tree from a single source (neuron or neuron group). Computation of the excitatory input allows projections from different sources (indexed by k), which may be grouped on different parts of the dendritic tree of the postsynaptic neuron, to exert different levels of impact. The average of all (n) individual synaptic inputs from the same input projection k is given by:

$$ \langle x_i w_{ij} \rangle_k = \frac{1}{n} \sum_i x_i w_{ij} \qquad (2.14) $$

[In practice, a given network layer may have partial connectivity. In that case, n represents the number of units in the sending layer rather than the number of actual connections (which may be smaller). Missing connections are represented by setting their weights to zero.]
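A sketch of this projection-level average in Python, using the zero-weight convention for missing connections described above:

```python
def projection_average(x, w):
    """Average weighted input <x_i w_ij>_k over one projection, Eq. (2.14).

    x -- activations of all n units in the sending layer
    w -- weights into the receiving unit, with missing connections set to 0.0,
         so n is the sending-layer size even under partial connectivity
    """
    n = len(x)
    return sum(xi * wi for xi, wi in zip(x, w)) / n
```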


The average in (2.14) is then multiplied by a normalizing factor to reflect differences in the expected level of activity of different projections:

$$ g_e^k = \frac{1}{\alpha_k} \langle x_i w_{ij} \rangle_k \qquad (2.15) $$

The normalizing factor αk is based on the expected activity level of the sending projection. This normalization is useful in simulations because it provides a balance across different inputs, which may have different baseline levels of activity, giving them all roughly the same level of influence. The overall excitatory conductance to a neuron, also called the net input (net in the simulator), is an average of projection-level conductances combined with a bias weight (β, or bias.wt in the simulator). Update of the net input is computed as:

$$ g_e(t) = \big(1 - dt_{net}\big)\, g_e(t-1) + dt_{net} \left( \frac{1}{n_p} \sum_k g_e^k + \frac{\beta}{N} \right) = \big(1 - dt_{net}\big)\, g_e(t-1) + dt_{net} \left( \frac{1}{n_p} \sum_k \frac{1}{\alpha_k} \langle x_i w_{ij} \rangle_k + \frac{\beta}{N} \right) \qquad (2.16) $$

where np is the number of projections; αk is the normalizing factor for projection k; β is the bias weight (bias.wt in the simulator); dtnet (dt_net in the simulator) is the time-averaging time constant with value between 0 and 1; and N is the total number of input connections. Division by N has the effect of scaling the bias weight so that it is in the same range as the synaptic inputs.


The bias input represents different sensitivities to input (i.e. excitabilities) that exist among different neurons, reflecting a host of possible cellular differences such as their level of leak current. In Leabra, the bias input is represented by a bias weight (bias.wt in the simulator). An important feature of the bias weight is that it can be modified with learning, like other weights. The time delays that occur as activity is propagated along axons, across synapses, and through dendritic trees are reflected in Leabra by time averaging. This is important for smoothing out rapid transitions or fluctuations that could cause instability in network models. The term $dt_{net}$ (dt_net in the simulator) represents a time-averaging time constant, reflecting this "sluggishness". To summarize (2.16): to compute the current excitatory conductance, g_e(t), the previous value g_e(t-1) is multiplied by (1 - dt_net) and the current inputs by dt_net. Since the default value of dt_net is set to 0.7, this has the effect of giving greater weight to the current input, while also allowing the previous level to "fade out".
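A minimal sketch of this time-averaged update (illustrative Python; the argument names are assumptions, not simulator identifiers):

```python
def update_net_input(g_e_prev, proj_conductances, beta, N, dt_net=0.7):
    """Time-averaged net input update, Eq. (2.16).

    g_e_prev          -- g_e(t-1), the previous net input
    proj_conductances -- projection-level conductances g_e^k (Eq. 2.15)
    beta              -- bias weight (bias.wt), scaled below by N
    N                 -- total number of input connections
    dt_net            -- time-averaging constant (default 0.7, as in the notes)
    """
    n_p = len(proj_conductances)
    current = sum(proj_conductances) / n_p + beta / N
    # The previous value fades by (1 - dt_net); the current input
    # enters with weight dt_net, smoothing rapid fluctuations.
    return (1.0 - dt_net) * g_e_prev + dt_net * current
```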


Differential Projection-Level Scaling

As stated above, by virtue of their position on different parts of the dendritic tree, different sources may exert different levels of impact on the postsynaptic neuron:

$$ g_e^k = s_k \frac{r_k}{\sum_p r_p} \frac{1}{\alpha_k} \langle x_i w_{ij} \rangle_k \qquad (2.17) $$

Arbitrary absolute (s_k, wt_scale.abs in the simulator) and relative (r_k, wt_scale.rel in the simulator) scaling constants are introduced to increase or decrease the effectiveness of particular projections, either absolutely or as a fraction of the total across projections. This can be useful in some cases to represent factors such as the proximity of different projections to the soma on the dendritic tree. However, they are often set to 1, in which case they have no effect.
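A sketch of the scaled projection conductance of Eq. (2.17) in Python (the names are illustrative):

```python
def projection_conductance(avg_xw_k, alpha_k, s_k=1.0, r_k=1.0, r_all=None):
    """Scaled projection-level conductance g_e^k, Eq. (2.17).

    avg_xw_k -- <x_i w_ij>_k, the average weighted input for projection k
    alpha_k  -- expected-activity normalizer for projection k
    s_k      -- absolute scale (wt_scale.abs); 1 leaves the input unchanged
    r_k      -- relative scale (wt_scale.rel), normalized over all projections
    r_all    -- relative scales of every projection (defaults to [r_k] alone)
    """
    r_sum = sum(r_all) if r_all is not None else r_k
    return s_k * (r_k / r_sum) * avg_xw_k / alpha_k
```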


2.5.2 Point Neuron Parameter Values

In Leabra, Eq. 2.16 is used to compute the net input, and then Eq. 2.8 is used to update the membrane potential. In running Leabra simulations, the parameter values are typically normalized to a range between 0 and 1, as in Table 2.1.



2.5.3 The Discrete Spiking Output Function

After the membrane potential is updated, an output value is computed as a function of this potential. There are two options for this output: discrete or continuous. To simulate the discrete spiking output of the neuron, Leabra uses a simple threshold mechanism to output an activation value (act in the simulator) of 1 if the membrane potential exceeds a threshold value (thr in the simulator), and zero otherwise. The refractory period is implemented by resetting the membrane potential to a sub-resting level (v_m_r in the simulator) on the time step following the "spike". Extended synaptic effects due to a spike are implemented by extending the spike activation for multiple cycles using the dur parameter. The firing rate is time-averaged (act_eq in the simulator) as:

$$ y_j^{eq} = \gamma_{eq} \frac{N_{spikes}}{N_{cycles}} \qquad (2.18) $$

where Nspikes is the number of spikes fired during a time period, Ncycles is the total number of cycles in the period, and γeq (eq_gain in the simulator) is a scaling factor.
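A minimal sketch of this discrete spiking output (illustrative Python; the threshold and reset values are assumptions, not simulator defaults):

```python
def spike_step(v_m, thr=0.25, v_m_r=0.12):
    """Thresholded spike output with post-spike reset, as described above.

    Returns (act, new_v_m): act is 1 if v_m exceeds the threshold, else 0.
    After a spike, v_m is reset to the sub-resting level v_m_r,
    implementing a simple refractory period.
    """
    if v_m > thr:
        return 1, v_m_r
    return 0, v_m

def act_eq(n_spikes, n_cycles, gamma_eq=1.0):
    """Time-averaged firing rate, Eq. (2.18)."""
    return gamma_eq * n_spikes / n_cycles
```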


2.5.4 The Rate Code Output Function

A rate code output function is a reasonable approximation to the output of an ensemble of neurons, rather than a single neuron. It also has the advantage of providing smoother activation dynamics. The output in this case is continuous rather than discrete. The output activation depends on the difference between the membrane potential and the threshold level in the positive direction. The activation (act in the simulator) is computed as:

$$ y_j = \frac{\gamma\,[V_m - \Theta]_+}{\gamma\,[V_m - \Theta]_+ + 1} \qquad (2.19) $$

where $\gamma$ is a gain parameter (act_gain in the simulator), $\Theta$ is the threshold, and $[x]_+$ means the value of x if positive and zero otherwise. Eq. (2.19) can be re-written as:

$$ y_j = \frac{1}{1 + \big(\gamma\,[V_m - \Theta]_+\big)^{-1}} \qquad (2.20) $$

In this form, it resembles Eq. (2.12), showing that the continuous output activation has a sigmoidal form.
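A sketch of this X-over-X-plus-1 activation in Python (the gain and threshold values are illustrative):

```python
def xx1(v_m, theta=0.25, gain=100.0):
    """Rate-code activation, Eqs. (2.19)/(2.20): x / (x + 1) above threshold."""
    x = gain * max(v_m - theta, 0.0)  # gamma * [V_m - Theta]_+
    return x / (x + 1.0)
```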


One problem is that (2.20) has a sharp transition at the threshold, which does not reflect the randomness in firing that characterizes all real neuronal output. To reflect the normally stochastic behavior of real neurons, the simulator modifies the rate-code activation function by convolving it with a Gaussian-distributed noise function (Fig. 2.13). The effect of this convolution is to smooth the sharp transition in the activation function (Fig. 2.14). The smoothed activation function is sometimes referred to as the noisy-X-over-X-plus-1 or noisy XX1 function. The noisy XX1 function gives a good approximation to the rate produced by discrete spiking in a unit that has noise added to its membrane potential (Fig. 2.15).
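This smoothing can be sketched by numerically convolving the XX1 function with a Gaussian kernel (illustrative Python with NumPy; the noise width sigma and the grid are assumptions):

```python
import numpy as np

def noisy_xx1(theta=0.25, gain=100.0, sigma=0.005, n=2001):
    """Approximate the noisy XX1 function by convolving XX1 with a Gaussian."""
    v = np.linspace(theta - 0.05, theta + 0.05, n)  # grid of membrane potentials
    x = np.maximum(gain * (v - theta), 0.0)         # gamma * [V_m - Theta]_+
    y = x / (x + 1.0)                               # sharp XX1 activation
    dv = v[1] - v[0]
    k = np.exp(-0.5 * (np.arange(-5 * sigma, 5 * sigma + dv, dv) / sigma) ** 2)
    k /= k.sum()                                    # unit-area Gaussian kernel
    return v, np.convolve(y, k, mode="same")        # smoothed around threshold
```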


2.5.5 Summary (Box 2.2)

1) Activation flows from the sending units (xi) through the weights (wij), resulting in excitatory net input ge (net) over all inputs (including the bias weight).

2) Excitatory input is combined with inhibition and leak to compute the membrane potential Vm (v_m).

3) Activation yj (act) is a thresholded sigmoidal function of the membrane potential.


2.7 Hypothesis Testing Analysis of a Neural Detector

This section gives a statistical view of the point neuron activation function. Hypothesis testing is the comparison of hypotheses with data: detection is translated into the determination of how well a hypothesis is supported by the data. We contrast the hypothesis that some entity is present in the environment with the (null) hypothesis that it is not. To determine which hypothesis is most warranted, we want to compare the relative probabilities that each hypothesis is true. The joint probability of the hypothesis (h) and the data (d) is given by P(h,d); it is equivalent to the intersection of the hypothesis and data events. The conditional probability is:

$$ P(h|d) = \frac{P(h,d)}{P(d)} \qquad (2.23) $$

Interpretation: when we receive some particular input data, this equation gives the probability that the hypothesis is true.


Likelihood:

$$ P(d|h) = \frac{P(h,d)}{P(h)} \qquad (2.25) $$

Interpretation: assuming a hypothesis to be true, this equation tells us the likelihood of receiving some particular input data, i.e. how likely the data are as predicted by the hypothesis. Notice that:

$$ P(h,d) = P(h|d)\,P(d) = P(d|h)\,P(h) \qquad (2.27) $$

Rearranging, we get Bayes' formula:

$$ P(h|d) = \frac{P(d|h)\,P(h)}{P(d)} $$

This formula is the basis for the field of Bayesian statistics. The term on the left is called the posterior, and P(h) is called the prior. Thus, the posterior is equal to the likelihood times the prior, normalized by the probability of the data. Bayes' formula gives us an expression for [the probability of a hypothesis being true conditional on receiving a particular data item (the posterior)] in terms of the product of [the probability of receiving a particular data item given that the hypothesis is true (the likelihood)] and [the probability of the hypothesis being true without having seen any data (the prior)], normalized by [the probability of the data item occurring].


We would like a way to replace P(d) so that we can deal with only likelihood and prior terms. Since the hypothesis and the null hypothesis ($\bar{h}$) are mutually exclusive and exhaustive, we can express P(d) as:

$$ P(d) = P(h,d) + P(\bar{h},d) = P(d|h)\,P(h) + P(d|\bar{h})\,P(\bar{h}) $$

Bayes' formula then becomes:

$$ P(h|d) = \frac{P(d|h)\,P(h)}{P(d|h)\,P(h) + P(d|\bar{h})\,P(\bar{h})} $$

It is now in a form in which the posterior is the likelihood-times-prior product for the hypothesis being true, expressed as a fraction of the total of that product over the hypothesis being true and being false. To apply Bayes' formula to the operation of a detector neuron, we assume that the likelihood function is directly proportional to the number of inputs that match the "template" pattern that the detector is set to detect. In other words, assuming that the hypothesis is true that an entity is in the environment, the likelihood of receiving a particular input pattern is proportional to the number of its inputs that match the template for that entity. We can likewise assume that the null likelihood is directly proportional to the number of inputs that fail to match the template pattern.


We can then express Bayes' formula as:

$$ P(h|d) = \frac{\sum_i x_i w_i \, P(h)}{\sum_i x_i w_i \, P(h) + \sum_i (1 - x_i)(1 - w_i)\, P(\bar{h})} $$

Conclusion: given an input pattern, we can determine the probability that the hypothesis is true that the input pattern is caused by a particular entity in the environment. We add up the number of "hits" and "non-hits" and weight the sums by the priors. To use the prior, we need an estimate of the prevalence of our entity in the environment. That is, determination of the posterior depends on the value of the prior. For the simple example given in the chapter, we have the following relationship:

P(h)/P(h̄)   P(h|d)
.1/.9        .2
.25/.75      .4
.5/.5        .67
.9/.1        .95

So we see that the greater the prevalence of our entity in the environment, the greater will be our confidence that it caused any given data item.
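As a check, the table can be reproduced in Python, assuming a likelihood ratio of 2 for hypothesis versus null (e.g. P(d|h) = 0.4 and P(d|h̄) = 0.2; these particular values are an assumption chosen to match the chapter's example):

```python
def posterior(prior_h, lik_h=0.4, lik_null=0.2):
    """Posterior P(h|d) from the expanded form of Bayes' formula above."""
    prior_null = 1.0 - prior_h
    return lik_h * prior_h / (lik_h * prior_h + lik_null * prior_null)

for p in (0.1, 0.25, 0.5, 0.9):
    print(f"P(h) = {p}: P(h|d) = {posterior(p):.2f}")
# -> 0.18, 0.40, 0.67, 0.95, matching the table up to rounding.
```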