Brains…


Ted Brookings

10/18/05

Outline

Review of Neurons: Typical Structure, Interaction, Information

Understanding Cognition: Primitive Neural Nets, Hawkins’ Ideas

What the @$&# I’ve been doing (Hopefully Less Primitive Neural Nets)

A Typical Neuron

Dendrites: numerous small, branching processes that conduct signals

Axon: a single, relatively large process for conducting signals

Synapses: a signal interface between cells

Myelin Sheath: a separate cell that wraps around the axon, speeding up signal transmission.

Stereotypical Interaction

Stereotypically, a neuron receives signals through its dendrites. It sends signals down its axon to the synapses, where they are transmitted to the dendrites of a connected neuron (or some innervated tissue).

In the cortex, a typical neuron will send signals to ~10,000 postsynaptic neurons.

Synapses don’t transmit the spike with perfect efficiency. It will typically take tens of spikes in a short time to excite the postsynaptic neuron. The efficiency varies from connection to connection; it can even be negative (inhibiting spikes rather than exciting them).

Neuronal Code: Where is the Information?

PULSE SHAPE:

Neurons act as integrators

Spikes are short (1-2 ms)

Ion channels have active control

=> No (or very little) information in pulse shape

Neurons send information via spikes, but what aspects of the spikes encode the information?

It takes many incoming spikes (e.g. 10-50) to initiate a spike in a neuron. Incoming spikes are temporally very short, and the neuron essentially integrates until a threshold is reached. There are nonlinearities in this process, but to a large extent, pulse formation is independent of incoming pulse shape. Also, if an action potential IS triggered, the neuron’s ion channels actively work to produce a uniform shape, regardless of incoming shape.

Neuronal Code: Where is the Information?

TIMING:

Spikes are stochastic

Coding via average firing rate?

Short response times to visual stimuli: fly (30-40 ms, 1-2 spikes), human (few hundred ms)

30 neurons over 4 sec.

Comments on next page…

By saying spikes are stochastic, I mean that their arrival times are not regular or (easily) predictable. Exact timing may depend on thermal fluctuations, or it may be due to the complexity of many neurons being wired together, or both.

An early hypothesis was that information was encoded in the average firing rate of a neuron. Presumably the brain averages the number of received spikes over some time window. However, there are problems with this theory. Visual response times in the fly, humans, and other animals are far too quick to permit much averaging.
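The problem with averaging can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the receiver estimates a rate by simply counting spikes in a fixed window; the function names and the example spike times are hypothetical, chosen only for illustration.

```python
def rate_estimate(spike_times_ms, window_start_ms, window_ms):
    """Estimate firing rate (Hz) by counting spikes inside a time window."""
    window_end_ms = window_start_ms + window_ms
    count = sum(1 for t in spike_times_ms if window_start_ms <= t < window_end_ms)
    return 1000.0 * count / window_ms


def rate_resolution(window_ms):
    """Change in the estimated rate (Hz) caused by one extra spike in the window."""
    return 1000.0 / window_ms


# Hypothetical spike train: two of the three spikes fall in the first 40 ms.
spikes = [5.0, 27.0, 61.0]
print(rate_estimate(spikes, 0.0, 40.0))
print(rate_resolution(40.0))
```

With a 40 ms window, a count-based estimate can only take values that are multiples of 25 Hz, so one or two spikes carry very little rate information. That is the sense in which fly-scale response times leave no room for averaging.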

A more likely view is that the information is encoded in exact spike times (and also the strength of synaptic connections). Thus neurons are communicating by sending numbers (times) to each other, and interpreting that information via synaptic strengths.

Understanding Cognition

Traditional Neural Nets:

Typically Three layers

Feed-forward

Spikes last infinitely long (i.e. neurons are excited permanently or not at all)

Training is an algorithm to update weights in order to achieve the correct “answer”


In traditional neural networks, the first layer (the “input layer”) is stimulated directly. For instance, if a bitmap is being analyzed, a neuron may be sensitive to a particular pixel of the image. The input layer in turn stimulates the second layer (“hidden layer”). The hidden layer stimulates the “output layer” which corresponds to results. For example, if the net is supposed to analyze images, one particular neuron in the output layer may have its activity correspond to seeing a circle.

There are no connections between neurons of the same layer, and no connections leading backwards. If a neuron is stimulated, it is stimulated for all time. Thus a pattern of input leads to a constant pattern of neuron stimulation ---there are no brief spikes, and no dynamics.
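The static, feed-forward character described above can be sketched in a few lines. This is only an illustration, assuming simple on/off threshold units; the weights are hand-picked (hypothetical) so that the net computes XOR, and no training is shown.

```python
def step(x):
    """Static threshold unit: on (1) or off (0), with no time dependence."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """Each unit sums weighted inputs from the previous layer only."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden units -> 1 output.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [-0.5, -1.5]   # unit 0: "at least one input on"; unit 1: "both on"
OUTPUT_W = [[1.0, -1.0]]
OUTPUT_B = [-0.5]


def feed_forward(inputs):
    """Input layer -> hidden layer -> output layer, with no feedback."""
    hidden = layer(inputs, HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)


for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pattern, feed_forward(pattern))
```

A given input always produces the same fixed output pattern: there is no state carried between calls, which is exactly the lack of dynamics noted above.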

In order to learn, the net is subjected to a training set (a set of inputs) with known correct answers. An input from the training set will cause the net to display some pattern of activity in the output layer. The correct answer is known, and an algorithm is used to update the weights in the connections between the three layers in order to force the net to obtain the correct answer. By exposing the net to many training examples it is hoped that it has somehow abstracted commonalities between them, and can accurately respond to future inputs.

Obviously these properties have no basis in biology. Feedback and dynamics are essential properties of cognition, so biologically realistic neural nets must work differently. It should be noted that while many neural nets have the form described above, others have attempted more realistic designs.

On Intelligence (Jeff Hawkins)

Intelligence is internal, not behavioral (Chinese room)

Intelligence is (mostly) due to the cortex

Understanding the structure/behavior of the cortex is the key to understanding intelligence

Traditional neural networks have little in common with the cortex


My Chinese room rant:

The Chinese room is a room containing a person who does not speak Chinese. Chinese writing is shoved into the room, and the person in the room follows instructions written in a book. The book tells this person how to manipulate the symbols, and eventually to shove the results out of the room. The presumption is that if the instructions in the book are sufficiently complicated, then the room could convince a human that it is intelligent. However, since neither the hum

I believe that Hawkins (and some others) overstated their case with the Chinese room. Consider instead a series of Chinese rooms. The first is as above. The second has many operators with many books, all working together and passing each other messages. The final room has done away with operators and books: they are replaced by cells, which send each other messages and follow rules according to “rule books” that are encoded into their construction. This third room is the same as a brain! I think that even the first room would be capable of genuine intelligence, simply by allowing the operator to write on and erase sections of the book. In MY opinion, there are two real problems with the setup of the original Chinese room:

1) There is no (explicit) memory. If the operator can’t alter the book, the room can’t learn. Suppose you ask the Chinese room a question and get an answer. You find the answer is wrong and tell it so. Then if you ask it the same question, you will still get the same answer ---clearly not intelligent behavior.

2) The setup imagines careless questioning falling prey to some trick. Anyone testing the Chinese room to see if it exhibits genuine learning will quickly realize that something is wrong, even if its conversational Chinese is very good.

I believe that behavior IS a valid way of testing for the presence of intelligence; however, the tests must be chosen carefully. I will return to this later. Hawkins believes that intelligence has an internal reality, apart from any behavior, and emphasizes this. I agree with that.

Cortical layers

The cortex is a sheet, approximately 2 mm thick and “the size of a dinner napkin”

It is scrunched up in order to fit inside the skull

Cortical Layers

Notice that the layers are visible (when dyed appropriately) ---they are real structures. However, they are also fuzzy. Also remember that the physical location of neurons is not terribly important: it’s the wiring that really matters.

Cortical columns

The cortex also possesses distinct columns running perpendicular to the layers.

Cortical Columns

Mountcastle’s Theory

Layers and columns are small and ubiquitous

There are few large-scale features in the cortex

However, there are many functions

Therefore, every region of the cortex is (essentially) the same, and carrying out the same cortical algorithm. Only the data being processed varies.

Hawkins’ Scheme

The cortex has a few main functions

Hierarchically summarizes/abstracts data

Supplies predictions/feedback

Recognizes missed predictions and passes back error

Creates and recognizes “invariant representations”

These are (my abstraction of) the basic principles of Hawkins’ theory of how the cortex works. Each of these points will be touched on in detail.

Hierarchy

The cortex is divided into regions by function and hierarchy. Example: the visual system.

Visual information flows from the retina, into a region of the cortex known as V1, which feeds into a different region known as V2, which feeds into V4, followed by IT.

Lower-level regions (such as V1) process simple information (e.g. responding to vertical lines)

Higher-level regions process complex information based on the abstracted results in lower-level regions (e.g. responding to a face)

Feedback

There are at least as many connections running from high regions to low as from low to high.

So if a higher region expects to see a face, it passes back the expected lines, etc. to lower regions.

In order to pass back expectations, the higher-level regions must have an internal model of the world. This model sends the sense experience that it expects the lower region to receive.

Errors and Feed-Forward

Data expected by higher-level cortical regions is compared with information flowing up from lower regions

Data agreements are passed downward

Data disagreements (e.g. an object moved) are passed upward to refine predictions

Data agreements mean that the internal model (predictions from higher regions) continues to be accurate. Disagreements occur when the model is incorrect (something unexpected happens). In that case, the novel sense information is passed upwards to adjust the model. Thus we “see” what we expect to see, and only notice things that clash with our expectations ---the basic principle of magicians’ work.
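The compare-and-correct loop above can be caricatured in code. This is a very rough sketch, assuming (purely for illustration) that a region's model and its incoming data are both dictionaries mapping a feature name to a number; `compare`, `update_model`, and the feature names are all hypothetical.

```python
def compare(predicted, observed, tolerance=0.0):
    """Split observed features into those matching the prediction and those that clash."""
    agreements, errors = {}, {}
    for feature, value in observed.items():
        expected = predicted.get(feature)
        if expected is not None and abs(expected - value) <= tolerance:
            agreements[feature] = value
        else:
            errors[feature] = value   # novel data, to be passed up the hierarchy
    return agreements, errors


def update_model(model, errors):
    """The higher region revises its model using only the mismatched features."""
    revised = dict(model)
    revised.update(errors)
    return revised


# Hypothetical scene: the model expects a cup at x=2, but the cup has moved.
model = {"cup_x": 2.0, "lamp_x": 5.0}
observed = {"cup_x": 3.0, "lamp_x": 5.0}
agreements, errors = compare(model, observed)
model = update_model(model, errors)
print(agreements, errors, model)
```

Only the moved cup generates upward traffic; the lamp, which matched the prediction, never needs to be reported.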

Invariant Representations

Patterns of activity represent ideas/objects/sense-data

Activity must be invariant to position/timing, etc.

(a face is a face, even if it is upside-down)

Activity must be robust against incomplete information

Patterns must be learned (we are not born with a finite supply of conceivable ideas/objects)

These invariant representations form the basis of communication between cortex regions

It should be pointed out that Hawkins believes these representations are localized to specific neurons (i.e. they don’t “float” from one cortical region to another).

Hebbian Learning

“Neurons that fire together, wire together.”

Experiment: stimulate a presynaptic neuron and a postsynaptic neuron, observe changes in the synaptic strength

Promotes recognition of coincidences (i.e. patterns)

Cells that fire together with a certain stimulus form an invariant representation


The diagram shows a Hebbian learning experiment. Neuron j sends signals to neuron i. If j fires first and i fires afterwards, the connection between them is strengthened. If i fires first, the connection is weakened. The magnitude of this effect decreases exponentially as the time between events increases.

This type of learning will promote the learning of coincidence: if two invariant representations are activated simultaneously on a regular basis, they will eventually stimulate each other. The presence of one will cause the brain to infer the other.
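The timing rule from the experiment can be written as a small function. This is a minimal sketch, assuming a symmetric exponential window; the amplitudes `a_plus` and `a_minus` and the time constant `tau_ms` are illustrative placeholders, not measured values.

```python
import math


def weight_change(t_pre_ms, t_post_ms, a_plus=0.01, a_minus=0.01, tau_ms=20.0):
    """Change in synaptic strength for one pre/post spike pair.

    t_pre_ms  -- firing time of the sending neuron (j)
    t_post_ms -- firing time of the receiving neuron (i)
    """
    dt = t_post_ms - t_pre_ms
    if dt > 0:    # j fired first: strengthen
        return a_plus * math.exp(-dt / tau_ms)
    if dt < 0:    # i fired first: weaken
        return -a_minus * math.exp(dt / tau_ms)
    return 0.0    # simultaneous spikes: no change (an assumption)


print(weight_change(0.0, 10.0))   # j then i, 10 ms apart
print(weight_change(10.0, 0.0))   # i then j, 10 ms apart
print(weight_change(0.0, 40.0))   # j then i, 40 ms apart: smaller effect
```

The sign depends only on the firing order, and the size of the change shrinks exponentially with the gap between the two spikes.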

Hawkins Summary

Hebbian learning creates “invariant representations”

Mental model in higher cortical regions predicts input from senses and lower regions

Input from lower cortical regions updates the model (when conflicts are observed)

We “see” the model, not direct data from our senses

Of course, other senses work the same way, and are integrated in even higher regions of the cortex.

Some Issues

Birds (which have no cortex) have displayed some intelligent behavior

Some areas of the visual cortex are suspiciously both highly-specialized and common to all people (e.g. recognition of faces, objects, and motion)

The layers and columns in the cortex may be due to ontogeny


In addition to tool-making (which is an intelligent, learned behavior, not an instinctive one), birds have displayed many other intelligent behaviors. Some parrots have learned to use (very simple) language; that is to say, they use words in context and in simple phrases (not mere mimicry). Birds lack a cortex, although they have other brain structures that serve a similar function. Thus the exact structure of the cortex is not necessary for intelligence. Investigating the cortex structure is still a valid tactic, however, as the cortex is one way of producing intelligence.

There are parts of the brain (discovered via brain-damaged people) that have very specialized tasks.

1) Seeing faces. When a part of the brain is damaged, the person can no longer recognize faces, not even their own.

2) Seeing objects. People with a specific area damaged will no longer be able to recognize objects. They may be able to figure out what they are (“It’s big and green…. Is it a tree?”) but can’t recognize them the way we can

3) Seeing motion. People rely on a part of the brain to detect motion. When it is damaged, moving objects are confusing and disturbing, their behavior difficult to predict.

One could argue that such regions develop automatically early in development and can’t be relearned later in life. However, I once saw a documentary where a sighted person was blindfolded for a week and taught Braille. When she read, the visual cortex was stimulated (even though she was using her sense of touch). This suggests to me that there is some flexibility in the cortex, but also some regions that are specific to a certain task. I don’t believe that the cortex is completely uniform, but rather that the wiring structure has adapted to solve certain categories of problem, while the physical layout is (mostly) uniform.

Ontogeny refers to the development of an embryo.

Structure Formation

Early embryo is shaped like a disk

CNS originates in ectoderm

Embryo and disk fold into tubes (in opposite directions)

Neural tube sinks into body tube

The CNS is topologically a hollow tube

The formation of the brain is simply a thickening of one end of the tube

Many structures in the brain can be traced back to original locations on the neural plate ---this is called a fate-map

However, there is migration of neurons between locations, and extensive wiring between regions

I argue that all this means that the mere presence of layers and columns doesn’t necessarily imply that they are critical for cognition. They could be entirely due to ontogeny, or be due to a mix of ontogeny and function…

Structure vs. Systems

Structures in the brain have purpose, just as structures in the body have purpose. However, structures in the brain have many neural systems within them, just as structures in the body have many systems.

Example: the neck.

Attaches head to body, allows pivoting

Contains part of circulatory system, respiratory system, and various nervous systems

…and even in the case where a structure IS the result of function, it doesn’t stand to reason that there is only one function.

Toward a More Realistic Neural Network

Simplify properties not essential for the basic functioning of the network:

Spike shape is (largely) irrelevant, so model spikes as Dirac-deltas

Simplify response of neurons ---model as perfect integrators

Keep essential properties:

Feedback

Hebbian learning

MANY connections per neuron

Add up incoming spike amplitudes (with some temporal decay). If the amplitude exceeds a threshold, a spike is produced in the neuron.

Actually just store a list of spike times and amplitudes.
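The simplified neuron described above might look something like the following. This is a minimal sketch and not the actual program: it assumes exponential decay of each stored spike, a reset after firing, and hypothetical names (`Neuron`, `drive`) and parameter values.

```python
import math


class Neuron:
    """Integrator driven by delta-function spikes, stored as (time, amplitude) pairs."""

    def __init__(self, threshold=1.0, tau_ms=10.0):
        self.threshold = threshold
        self.tau_ms = tau_ms
        self.incoming = []   # list of (arrival_time_ms, amplitude)

    def receive(self, time_ms, amplitude):
        """Record an incoming spike; negative amplitudes model inhibition."""
        self.incoming.append((time_ms, amplitude))

    def potential(self, now_ms):
        """Sum of stored spike amplitudes, each decayed by its age."""
        return sum(a * math.exp(-(now_ms - t) / self.tau_ms)
                   for t, a in self.incoming if t <= now_ms)

    def step(self, now_ms):
        """Return True (and reset) if the summed amplitude reaches threshold."""
        if self.potential(now_ms) >= self.threshold:
            self.incoming.clear()   # reset after firing (an assumption)
            return True
        return False


def drive(neuron, arrival_times_ms, amplitude):
    """Feed equal-amplitude spikes to a neuron; return the times at which it fired."""
    fired = []
    for t in arrival_times_ms:
        neuron.receive(t, amplitude)
        if neuron.step(t):
            fired.append(t)
    return fired


print(drive(Neuron(), [0.0, 1.0, 2.0, 3.0, 4.0], 0.3))      # closely spaced spikes
print(drive(Neuron(), [0.0, 50.0, 100.0, 150.0], 0.3))      # widely spaced spikes
```

Because of the decay, the same number of spikes can trigger the neuron when they arrive close together but not when they are spread out, so spike timing matters and not just spike count.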

What About Structure?

Cognition depends on network structure (wiring, not location)

Cortical structure is complicated and unnecessary, and its wiring is largely unknown

Return to the Chinese room, but choose tasks carefully

Use an evolutionary algorithm to incrementally improve performance

That is to say, it seems fruitless to attempt to recreate the human brain if something simpler can work as well (or better) for my purposes.

Choose tasks that require learning. One can monitor this learning by testing the networks with novel tasks, and monitoring progress. If actual learning is taking place, one would expect performance to increase over time.

More on next page…

The Chinese room can be used to test if intelligence exists, but in order to actually create the structure necessary for intelligence, an evolutionary algorithm will be used. A carefully chosen problem will be presented to each generation, and a population of neural nets will be tested against several examples of this problem. Those that score well will reproduce (with mutation) and go on to the next generation. Each succeeding generation must have the tests changed sufficiently that the previous tests would not prepare them to succeed with the new test (i.e. they are not genetically programmed to pass the tests, they are genetically programmed to learn to pass).
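The generational loop described above can be sketched as follows. This is only an outline under stated assumptions: the "networks" here are plain lists of numbers, and `toy_make_tasks`, `toy_score`, and `toy_mutate` are hypothetical stand-ins for real task generation, real testing of a network's learning, and real mutation of its wiring.

```python
import random


def evolve(population, make_tasks, score, mutate, generations, survivors, rng):
    """Test each generation on fresh tasks; the best scorers reproduce with mutation."""
    for generation in range(generations):
        tasks = make_tasks(generation, rng)   # new tests every generation
        ranked = sorted(population,
                        key=lambda net: sum(score(net, task) for task in tasks),
                        reverse=True)
        parents = ranked[:survivors]
        population = [mutate(rng.choice(parents), rng)
                      for _ in range(len(population))]
    return population


# Toy stand-ins (hypothetical): a genome is a list of floats, a task is a target sum.
def toy_make_tasks(generation, rng):
    return [rng.uniform(-1.0, 1.0) for _ in range(4)]


def toy_score(genome, target):
    return -abs(sum(genome) - target)


def toy_mutate(genome, rng):
    return [g + rng.gauss(0.0, 0.1) for g in genome]


rng = random.Random(0)
start = [[0.0, 0.0, 0.0] for _ in range(10)]
final = evolve(start, toy_make_tasks, toy_score, toy_mutate,
               generations=5, survivors=3, rng=rng)
print(len(final), len(final[0]))
```

The important design point is that `make_tasks` is called anew for each generation, so a genome cannot succeed by memorizing the previous generation's tests. The toy scoring function does not test learning at all; it only shows where such a test would plug in.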

Practical Problems

Speed:

The presence of feedback loops necessitates updating neuron and synapse states frequently

The kind of computation that must be done is more taxing than is frequently estimated

Memory usage:

Many neurons and realistic numbers of synapses per neuron result in a huge number of synapses

Each of those synapses must record the spikes that pass through it for one time-step

e.g. Hebbian learning requires computation, and is usually not included in estimates of brain power
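The memory concern can be sized with a back-of-the-envelope calculation. In this sketch, the 10,000 synapses per neuron is the cortical figure quoted earlier in the talk; the network size and the bytes stored per synapse are hypothetical numbers chosen only to show the scaling.

```python
def synapse_memory_bytes(n_neurons, synapses_per_neuron, bytes_per_synapse):
    """Total storage if every synapse keeps its own small record."""
    return n_neurons * synapses_per_neuron * bytes_per_synapse


n_neurons = 1_000_000          # hypothetical network size
synapses_per_neuron = 10_000   # typical cortical fan-out quoted earlier
bytes_per_synapse = 16         # hypothetical: a weight plus one buffered spike

total = synapse_memory_bytes(n_neurons, synapses_per_neuron, bytes_per_synapse)
print(total / 1e9, "GB")
```

Even with these modest assumptions the synapses alone need on the order of a hundred gigabytes, which is why the per-synapse record has to be kept as small as possible.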

“Progress” So Far…

Code is completed!

Program Results:

Segmentation Faults

Bus Errors

Infinite Loops
