Hebbian Coincidence Learning When one neuron contributes to the firing of another neuron the pathway...

Preview:

Citation preview

Hebbian Coincidence Learning

When one neuron contributes to the firing of

another neuron the pathway between them is

strengthened.That is, if the output of i is the input to j, then the

weight is adjusted by a quantity proportional to

c * (oi * o

j).

Unsupervised Hebbian Learning

The rule is ΔW = c * f(X,W) * X

An example of unsupervised Hebbian learning is to

simulate transfer of a response from a primary or

unconditioned stimulus to a conditioned stimulus.

Example

Example (cont'd)

In this example, the first three inputs represented the

unconditioned stimuli and the second three inputs

represent the new stimuli.

Supervised Hebbian Learning

In supervised Hebbian learning, instead of using the

output of a neuron, we use the desired output as

supplied by the instructor. The rule becomes ΔW = c * D * X

Example

Recognizing associations between sets of patterns:

{<X1, Y

1>, <X

2, Y

2>, ... <X

t, Y

t>}. The input to the

network would be pattern Xi and the output should

be the associated pattern Yi. The network consists

on an input layer with n neurons (where n is the

number of different input patterns, and with an

output layer of size m, where m is the number of

output pattens. The network is fully connected.

Example (cont'd)

In this example, the learning rule becomes: ΔW = c * Y * X,

where Y * X is the outer vector product. We cycle

through the pairs in the training set, adjusting the

weights each time

This kind of network (one the maps input vectors to

output vectors using this rule) is called a linear

associator.

Associative Memory

Used for memory retrieval, returning one pattern

given another. There are three types of associative

memory

1 Heteroassociative: Mapping from X to Y s.t. if an

arbitrary vector is closer to Xi than to any other Xj,

the vector Yi associated with Xi is returned.

2 Autoassociative: Same as above except that Xi = Yi

for all exemplar pairs. Useful in retrieving a full

pattern from a degraded one.

Associative Memory (cont'd)

Interpolative: If X differs from the exemplar Xi by

an amount Δ, then the retrieved vector Y differs

from Yi by some function of Δ. A linear associative

network (one input layer, one output layer, fully

connected) can be used to implement interpolative

memory.

Representation of Vectors

Hamming vectors are vectors composed of just the

numbers +1 and -1. Assume all vectors are size n.The Hamming distance between two vectors is just

the number of components which differ.An orthonormal set of vectors is a set of vectors

where are all unit length and each pair of distinct

vectors is orthogonal (the cross-product of the

vectors is 0).

Properties of a LAN

If the input set of vectors is orthonormal, then a

linear associative network implements interpolative

memory. The output is the weighted sum of the

input vectors (we assume a trained network). If the

input pattern matches one of the exemplars, Xi, then

the output will be Yi. If the input pattern is Xi + Δi,

then the output will be Yi + Φ(Δi) where Φ is the

mapping function of the network.

Problems with LANs

If the exemplars do not form an orthonormal set,

then there may be interference between the stored

patterns. This is know as crosstalk.The number of patterns which may be stored is

limited by the dimensionality of the vector space. The mapping from real-life situations to

orthonormal sets may not be clear.

Attractor Network

Instead of return an interpolation, we may wish to

return the vector associated with closest exemplar.

We can create such a network (an attactor network)

by using feedback instead of a strictly feed-foward

network.

Feedback Network

Feedback networks have the following properties:There are feedback connections between the nodesThis is a time delay in signal, i.e., signal

propagation is no instantaneousThe output of the network depends on the network

state upon convergence of the signals.Usefulness depends on convergence

Feedback Network (cont'd)

A feedback network is initialized with an input

pattern. The network then processes the input,

passing signal between nodes, going through

various states until it (hopefully) reaches

equilibrium. The equilibrium state of the network

supplies the output.

Feedback networks can be used for heteroassociative

and autoassociative memories.

Attractors

An attractor is a state toward which other states in

the region evolve in time. The region associated

with an attractor is called a basin.

Bi-Directional Associative Memory

A bi-directional associative memory (BAM) network

is one with two fully connected layers, in which the

links are all bi-directional. There can also be a

feedback link connecting a node to itself. A BAM

network may be trained, or its weights may be

worked out in advance. It is used to map a set of

vectors Xi (input layer) to a set of vectors Yi

(output layer).

BAM for autoassociative memory

If a BAM network is used to implement an

autoassocative memory then the input layer is the

same as the output layer, i.e., there is just one layer

with feedback links connecting nodes to themselves

in addition to the links between nodes. This network

can be used to retrieve a pattern given a noisy or

incomplete pattern.

BAM Processing

Apply an initial vector pair (X,Y) to the processing

elements. X is the pattern we wish to retrieve and Y

is random.Propagate the information from the X layer to the Y

layer and update the values at the Y layer.Send the information back to the X layer, updating

those nodes.Continue until equilibrium is reached.

Hopfield Networks

Two goals:Guarantee that the network converges to a stable

state, no matter what input is given.The stable state should be the closest one to the

input state according to some distance metric

Hopfield Network (cont'd)

A Hopfield Network is identical in structure to an

autoassociative BAM network – one layer of fully

connected neurons. The activation function is

+1, if net > Ti,

xnew = xold, if net = Ti,

-1, if net < Ti,

where net = Σj w

j * x

j.

More on Hopfield Nets

The are restrictions on the weights: wii = 0 for all i,

and wij = w

ji for i.j.

Usually the weights are calculated in advance,

rather than having the net trained.The behavior of the net is characterized as an

energy function, H(X) = - Σi Σj wij wi wj + 2 Σi Ti

xi, decreases from every network transition.

Hopfield Nets

Thus, the network must converge, and converge to a

local energy minimum, but there is no guarantee

that in converges to a state near the input state.Can be used for optimization problems such a TSP

(map the cost function of the optimization problem

to the energy function of the Hopfield net).

Recommended