
Neural Networks, Vol. 6, pp. 869-881, 1993. Printed in the USA. All rights reserved. 0893-6080/93 $6.00 + .00. Copyright © 1993 Pergamon Press Ltd.

ORIGINAL CONTRIBUTION

An Analog Neural Network to Solve the Hamiltonian Cycle Problem

SHASHANK MEHTA¹ AND LASZLO FULOP²

¹University of Poona, India, and ²Rensselaer Polytechnic Institute, New York, USA

(Received 10 December 1992; revised and accepted 10 February 1993)

Abstract--Hopfield's (analog) neural net algorithm shows very different characteristics when the net is allowed to evolve while the gain factor of the neural response is gradually increased. The study of the new approach (called quasi stationary flow) yields that (1) the net converges if the weights are symmetric and the strength of the inhibitory self connection is less than the slope of the transfer function (i.e., $-w_{a,a} < g'_{\min}$); and (2) the energy of the net decreases, and the distance of the state from the center of the cube, in the unit cube representation of the state space, increases with the increase in the gain factor (i.e., $(1/\lambda)\,dR^2/d\lambda = -2\,dE/d\lambda > 0$). This approach has been successful in solving very large optimization problems. The quasi stationary approach is applied to Hopfield and Tank's algorithm to solve the Traveling Salesman Problem (TSP). Besides the method of approaching the stable state, modifications are also suggested in (i) the energy function; (ii) the method of determining the energy coefficients; and (iii) the connectivity of the network. Results are reported of experiments with the Hamiltonian Cycle Problem (HCP or discrete TSP) on graphs of up to 500 nodes. The result of experiments with the well known 318-city TSP is also reported and compared with its best known solution computed by linear programming.

Keywords--Traveling Salesman Problem, Hopfield neural network, Optimization, Analog neurons, Quasi stationary evolution.

1. INTRODUCTION

Significant attention has been drawn to the application of neural networks to optimization problems since Hopfield and Tank's (1985) paper on the Traveling Salesman Problem (TSP) was published. The neurons in the original model had only two states (0 and 1). A couple of years after his original paper (1982), Hopfield extended his ideas from binary neurons to those with analog output (see Hopfield, 1984). The motivation behind this work was to implement the model using real-world hardware components such as op-amps and resistors.

Nearly 3 years after Hopfield and Tank's paper describing the neural net solution for the TSP, Wilson and Pawley (1988) reported their inability to produce any acceptable results for problems of practical interest and shed a very unfavorable light on the usefulness of the optimizing neural nets. Wilson and Pawley's paper

Acknowledgement: We are thankful to the referees for going through the first and the second drafts with a keen interest and for pointing out mistakes. We are also grateful for their several useful suggestions.

Requests for reprints should be sent to Shashank Mehta, Department of Computer Science, University of Poona, Pune, 411007, India.

seemed to act as a catalyst to ignite the interest of other researchers in this area. A number of research groups also tried to copy Hopfield's experiments and generally found the same result as that of Wilson and Pawley. Simple modifications were proposed to Hopfield's energy function by some researchers (see, e.g., Brandt et al., 1988; Szu, 1988), while others experimented to find better ways to determine the very elusive coefficients that make up the connection weights (see Gunn & Weidlich, 1989; Hegde et al., 1988; Protzel et al., 1989). All the related works, including those cited above, have three major points in common.

First, they typically report results for TSPs with the number of cities in the 10-30 range, with the exception of Xu and Tsai (1991), and the results generally degraded with increasing problem sizes. Secondly, in all the results reported, there were problems in consistently arriving at even a valid tour (Hamiltonian cycle), be it optimum or not. Thirdly, all these papers have depended on trial and error experimentation to determine the network coefficients, as was the case in the Hopfield paper.

A somewhat more novel approach has been presented by Van den Bout and Miller (1988). A solution to the TSP is described that combines features of neural networks and simulated annealing. The resulting "neural net" (loosely termed) finds valid tours consistently (for the 30 city cases reported), but gives a somewhat longer (costlier) path than those given by pure simulated annealing (see Kirkpatrick et al., 1983). It is also believed that simulated annealing solutions are much slower than pure neural net solutions.

Xu and Tsai (1991) propose an interesting variation of the Hopfield solution. In their first algorithm they attempt to find an optimum 2-regular subgraph. This could either be a Hamiltonian cycle or a set of cycles covering each vertex exactly once. In the next step (without using a neural network) the subtours are combined pairwise by deleting some edges (a, a') and (b, b') from the respective cycles and replacing them by edges (a, b) and (a', b'). In an improved version (Algorithm 2) they propose using a generalized neural network (allowing higher order interactions) for the first step. Xu and Tsai report a good solution (within 2.5% of the best known solution) of a 100 node TSP. The inherent problem with this approach is that an optimum combination of optimum subtours need not be an optimum Hamiltonian tour.

Connectionist approaches other than those based on the Hopfield network have also been used to solve the TSP. Kohonen's self-organizing feature map algorithm and the elastic net algorithm, in particular, have been shown to solve the TSP successfully on a large scale for planar graphs. See Angeniol et al. (1988) and Durbin and Willshaw (1987) for details and recent results.

In this paper we re-examine Hopfield's original approach and modify it by (1) converging the net in a quasi-stationary manner; (2) using inhibitory self-connections; (3) giving a method to determine the values of the energy coefficients; and (4) redefining the object function to be minimized to solve the problem. Solutions of the Hamiltonian Cycle Problem (HCP) on up to 500 node graphs and the well known 318-city TSP (see Lawler et al., 1985) are reported.

The primary reason for focusing on the HCP (rather than the TSP) is that it is easy to evaluate the performance of the algorithm. The result can be tested for being a Hamiltonian cycle of the given graph in linear time. Besides, the HCP is an important problem in its own right. It is an NP-complete problem and several other NP-complete problems (SAT, 3-SAT, Vertex Covering, etc.) can be transformed to the HCP in linear or quadratic time. For more details on SAT and the HCP, refer to Hopcroft and Ullman (1979). For a complete account of NP-complete problems, see Garey and Johnson (1978).

Traditionally, the HCP is solved by a variety of methods that suit only sparsely connected graphs: (1) explicit enumeration of all the Hamiltonian paths by computing the (n − 1)th power of the symbolic adjacency matrix; (2) branch and bound search methods; and (3) dynamic programming. The performance of these methods quickly deteriorates as the number of nodes grows.

2. HOPFIELD'S ALGORITHM AND ITS DRAWBACKS

In this section a complete discussion of Hopfield and Tank's (1985) solution of the TSP is presented and its shortcomings are studied in order to understand how to improve its performance.

The TSP involves finding a Hamiltonian cycle (a cycle passing through each node once) in a completely connected graph with a cost/distance (a non-negative real number) associated with each edge, such that the cycle cost (the sum of the costs of the edges in the cycle) is minimum or as close to minimum as possible. It is important to remember that any Hamiltonian cycle is a valid solution of the TSP, but its acceptability depends on its cost.

2.1. Basic Concepts

DEFINITION. An n × n matrix will be called legal if it has one 1 in each row and column and the rest of the elements are 0's.

In an n-city (node) problem any Hamiltonian cycle can be represented by a legal matrix M. The cycle associated with such a matrix is given by $C_0 C_1 C_2 \cdots C_{n-1}$, where $C_k$, in the range from 0 to $n - 1$, denotes the city in the kth halt and $M_{C_k,k} = 1\ \forall k$. In other words, the columns of M represent halts and the rows represent cities (nodes). All nonlegal matrices will be termed illegal. Every legal matrix represents a Hamiltonian cycle and every Hamiltonian cycle has n legal-matrix representations.
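A legality test is straightforward to express in code; here is a minimal sketch in Python/NumPy (the function name is illustrative, not from the paper):

```python
import numpy as np

def is_legal(M):
    """A 0/1 matrix is legal iff each row and each column contains exactly one 1."""
    M = np.asarray(M)
    return (((M == 0) | (M == 1)).all()
            and (M.sum(axis=0) == 1).all()
            and (M.sum(axis=1) == 1).all())
```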

The Hopfield network is a direct consequence of this matrix representation. We begin the discussion of the network with a short description of the simple computing elements, namely, neurons.

A neuron is a primitive computing device. It has one output and, in principle, any number of inputs. The inputs and the output are real numbers in a closed interval. Without loss of generality, the interval is assumed to be [0, 1]. A special family of these neurons is called discrete; they can only output 0 or 1. In the Hopfield network each input can be either the output of another neuron or the constant value 1 from an external source (ES). Each connection between neurons has a weight associated with it. A weight is a (positive or negative) real number, which scales the signal traveling through the connection. Positive weight connections and negative weight connections are also called excitatory connections and inhibitory connections, respectively. In general, the weight of the connection from neuron a to neuron b need not be the same as that of the connection from neuron b to neuron a.

The input-activity (or simply, activity) of a neuron is the sum of the weighted inputs to the neuron. If the weight of the connection from x to a is $w_{x,a}$ and the output of x is $V_x$, then the input activity of neuron a is $A_a = \sum_x w_{x,a} V_x + w_{ES,a}$, because the output of the external source (ES) is $V_{ES} = 1$. The domain of the variable x is the set of the neurons.

A neuron computes its output solely as a function of its activity. The output computed by neuron a is $g(A_a)$, where g is known as its transfer function. In the literature it is also called the activation function and the input-output characteristic function. For most analyses and experiments involving such neurons it is sufficient for g to be a nondecreasing function with $g(-\infty) = 0$ and $g(+\infty) = 1$. Three functions are prominently discussed in the literature. These are (1) Step: $g(A) = 0$ if $A < 0$, 1 if $A > 0$; (2) Sigmoidal: $g(A) = e^A/(e^A + e^{-A})$; and (3) Bounded-linear: $g(A) = \frac{1}{2} + A$ if $-\frac{1}{2} < A < \frac{1}{2}$, 0 if $A \leq -\frac{1}{2}$, and 1 otherwise. Obviously, the neurons with the Step function as their transfer function are discrete neurons.
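For concreteness, the three transfer functions can be rendered directly from the definitions above (a minimal Python/NumPy sketch; the vectorized form is our choice):

```python
import numpy as np

def g_step(A):
    """Step: 0 for A < 0 and 1 for A > 0 (discrete neurons)."""
    return np.where(A > 0.0, 1.0, 0.0)

def g_sigmoid(A):
    """Sigmoidal: e^A / (e^A + e^-A), a smooth map onto (0, 1)."""
    return np.exp(A) / (np.exp(A) + np.exp(-A))

def g_bounded_linear(A):
    """Bounded-linear: 0.5 + A on (-1/2, 1/2), clipped to [0, 1] outside."""
    return np.clip(0.5 + A, 0.0, 1.0)
```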

Biological, electronic, or computer simulated, a neuron will take a finite amount of time to generate the output dictated by its characteristic function and the input activity. Therefore, it is not possible for the neuron output to be compatible with its input at all times in an evolving environment. A neuron is said to be in a stable state at time t if its output $V_a(t)$ is equal to $g(A_a(t))$. Similarly, a network of interconnected neurons is said to be in a stable state if every neuron in it is in a stable state. A neuron not in a stable state is said to have fired when it updates its output.

The state of N interconnected neurons and the constant external source can be fully described by the values of N variable parameters $V_0, \ldots, V_{N-1}$, the outputs of the neurons. Because each $V_a$ is a number in the closed interval [0, 1], the state of the network can be uniquely represented by a point in the N-dimensional unit cube. While only the corners of the cube represent the feasible states of a network of discrete neurons, every point in the volume of the cube is a possible state for the analog-neuron networks.

As the basic concepts and terminology are covered, we shall now see how to solve problems using this network. Hopfield interpreted the evolution of the state of the network as a computing process. The process terminates when the network arrives at a stable state, and the outputs of the neurons in the stable state are the result of the computation. Before we see how to solve the TSP using this instrument, let us see how to ensure that the process will always terminate.

2.2. Convergence Condition

Every useful algorithm should terminate. Therefore, termination of the evolution of the network should be the foremost criterion in determining the configuration of the net. In this section we present some constraints on the connection-weights that are sufficient to ensure the termination of this process.

After assigning weights $w_{a,b}\ \forall a, b$ and setting the outputs to a chosen (often random) initial state, the net is allowed to evolve. Neurons are randomly selected and fired, one at a time. Hopfield (1982) shows that a discrete net always arrives at a stable state if its weights are symmetrical and there are no self-connections, that is

$$w_{a,b} = w_{b,a} \quad \forall a, b \tag{1a}$$

and

$$w_{a,a} = 0 \quad \forall a. \tag{1b}$$

Simple analysis shows that the energy function E, given by

$$E = -\frac{1}{2}\sum_{a,b} w_{a,b} V_a V_b - \sum_b w_{ES,b} V_b, \tag{2}$$

decreases every time a neuron is fired. Because E is bounded below, the process must terminate.
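A minimal sketch of this discrete dynamics, assuming a dense weight matrix with zero diagonal [eqn (1b)] and an external-source weight vector (the names are ours):

```python
import numpy as np

def energy(w, w_es, V):
    """Energy of eqn (2): E = -1/2 sum_ab w_ab Va Vb - sum_b w_es_b Vb."""
    return -0.5 * V @ w @ V - w_es @ V

def fire(w, w_es, V, a):
    """Fire neuron a: replace its output by the step function of its activity.
    With symmetric weights and zero self-connections this never increases E."""
    A = w[:, a] @ V + w_es[a]      # input activity of neuron a
    V[a] = 1.0 if A > 0.0 else 0.0
```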

2.3. Energy of the Stable States

There is an elegant characterization of the stable states in terms of the energy function [eqn (2)]. In a discrete-neuron network satisfying eqn (1), a state (corner of the unit cube) is stable if and only if it is a (local) minimum on the energy surface. In other words, a state is stable if and only if its energy is less than that of its N neighbors (each node in a cube has N adjacent nodes). This observation is the key to understanding how to use the network to solve optimization problems such as the TSP.

In light of the fact that all the stable states (final states of the computation process) are local minimum points, the computation process can be interpreted as a minimization algorithm in the energy space. Suppose we want to minimize a function $E_{problem}(V_1, \ldots, V_N)$, where the $V_a$ are binary variables, and suppose there exists a set of weights $w_{a,b}$ satisfying eqn (1) such that $E = E_{problem}$; then the computation on a discrete network designed with these weights will give a (local) minimum value of $E_{problem}$.

2.4. Connection Weights for the TSP

Hopfield and Tank suggested that the TSP can be solved by minimizing

$$E_{problem} = \frac{H}{2}\sum_{x,\,i \neq j} V_{x,i} V_{x,j} + \frac{V}{2}\sum_{x \neq y,\,i} V_{x,i} V_{y,i} + \frac{G}{2}\left(n - \sum_{x,i} V_{x,i}\right)^2 + \frac{D}{2}\sum_{x,y,i} d_{x,y} V_{x,i}(V_{y,i+1} + V_{y,i-1}), \tag{3}$$

where $d_{x,y}$ is the cost (distance) associated with the edge $e_{xy}$ of the object graph, and the addition and the subtraction in the last term are modulo n, i.e., $(n - 1) + 1 = 0$ and $0 - 1 = n - 1$.


The coefficients are positive, therefore the function is non-negative. The first term is zero if and only if there is no more than one 1 in each row. The second term is zero if and only if there is no more than one 1 in each column. The third term is zero if and only if there are exactly n 1's in all, and the last term, in the case of legal states, is equal to the cost of the cycle. Therefore, a legal state with minimum $E_{problem}$ corresponds to a Hamiltonian cycle with minimum cost. With this justification it is assumed that the TSP can be solved by minimizing $E_{problem}$. Solving $E = E_{problem}$ yields
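A direct transcription of eqn (3) makes the four terms easy to inspect; the sketch below assumes an n × n NumPy state matrix V (rows are cities, columns are halts) and uses Vc for the coefficient V to avoid clashing with the matrix name:

```python
import numpy as np

def e_problem(V, d, H, Vc, G, D):
    """Hopfield-Tank objective of eqn (3) for an n-by-n state matrix V."""
    n = V.shape[0]
    rows = (H / 2) * sum(V[x, i] * V[x, j]
                         for x in range(n)
                         for i in range(n) for j in range(n) if i != j)
    cols = (Vc / 2) * sum(V[x, i] * V[y, i]
                          for i in range(n)
                          for x in range(n) for y in range(n) if x != y)
    count = (G / 2) * (n - V.sum()) ** 2
    tour = (D / 2) * sum(d[x, y] * V[x, i] * (V[y, (i + 1) % n] + V[y, (i - 1) % n])
                         for x in range(n) for y in range(n) for i in range(n))
    return rows + cols + count + tour
```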

$$w_{xi,yj} = -H\delta_{xy}(1 - \delta_{ij}) - V\delta_{ij}(1 - \delta_{xy}) - G - D d_{xy}(\delta_{i,j+1} + \delta_{i,j-1})$$

$$w_{ES,xi} = nG,$$

where H, V, D, and G are positive constants and $\delta$ is the Kronecker delta function.

Indeed, the weights satisfy the symmetry condition, but the self-connection weights are not zero. To conform to condition eqn (1b) Hopfield used the following approximation for the connection weights

$$w'_{xi,yj} = -H\delta_{xy}(1 - \delta_{ij}) - V\delta_{ij}(1 - \delta_{xy}) - G(1 - \delta_{xy}\delta_{ij}) - D d_{xy}(\delta_{i,j+1} + \delta_{i,j-1})$$

$$w'_{ES,xi} = nG.$$

Extensive experiments were performed with this network. Various values of the coefficients V, H, D, and G have been tried. Experiments were also performed with various initial states. Unfortunately, all this experimentation bore poor results. Results for graphs with n > 20 have been unreliable. Often one good result is found after hundreds of runs, showing uncomfortable dependence on the order in which the neurons are fired. Experiments with n > 40 have not even resulted in legal states. The disappointment with Hopfield's approach has lured researchers to try cross breeding between neural nets, simulated annealing, elastic nets, etc.

In our opinion the Hopfield approach deserves further investigation.

2.5. Problems with the Hopfield Approach

There are five serious problems with this algorithm. The first flaw is that while this algorithm only searches for minimal points of the energy surface, the desired state (optimum cycle) may not be a minimal point of the surface. In the legal states, all the terms in the expression [eqn (3)] reduce to zero except the D-term, which becomes the cycle cost. Therefore, the optimum cycle state has the lowest energy among all the legal states. This, however, does not imply that the optimal cycle state is a minimal point of the surface, because it is possible that there may be an illegal state in the neighborhood with lower energy.

The second objection to the approach is the random approximation of the connection weights. Any approximation of the weights cannot be justified because $E_{problem} = E$ no longer remains true.

The third problem is the random selection of the initial state. Because the system energy monotonically decreases, the final (solution) state inherently depends on the initial state. Therefore, there should be a better way to select the initial state.

There is also a problem with the random selection of the energy coefficients. Again, there should be a systematic way to determine the values of V, H, D, and G.

Lastly, even if some optimal or near optimal solution states are among the minima of the energy surface, there is no control over what fraction of the minimum points of the surface are illegal or nonoptimal legal states. From the random nature of the algorithm it is obvious that the larger the fraction, the smaller the chances of arriving at an acceptable legal state.

Improvements are suggested in the remainder of this paper to eliminate or subdue the effects of these drawbacks.

3. WHY HAMILTONIAN CYCLE PROBLEM?

The very first improvement comes not from changing the approach to the solution but from changing the problem itself. In this section we shall justify that the HCP, a special case of the TSP (as described below), is a more suitable problem for the neural network approach than the general TSP. The new algorithm will still be applicable to the TSP, but better performance should be expected on the HCP than on the general TSP. The experimental results will be provided primarily for the HCP. Of course, the new algorithm will also be applied to a general TSP (the 318-city problem, see Lawler et al., 1985) and its result will be compared with the best known solution for that problem.

The HCP is the problem of finding a Hamiltonian cycle, if it exists, in a given graph that, in general, is not fully connected. This problem can be viewed as a special case of the TSP. It can be converted to the TSP by augmenting the graph to a fully connected one with original edges assigned the value zero and new edges assigned the value one. A Hamiltonian cycle in the augmented graph is of cost zero (optimal) if and only if it is a Hamiltonian cycle of the original graph.
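The augmentation is a one-line transformation; a sketch assuming a 0/1 adjacency matrix (the function name is ours):

```python
import numpy as np

def hcp_to_tsp(adj):
    """Augment an HCP instance into a TSP distance matrix:
    original edges cost 0, added edges cost 1."""
    d = np.where(np.asarray(adj) > 0, 0.0, 1.0)
    np.fill_diagonal(d, 0.0)   # the diagonal is never used by a cycle
    return d
```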

The HCP is also an NP-complete problem and many interesting and important problems can be reduced to the HCP with linear or quadratic time complexity transformations.

Although the HCP is introduced as a special case of the TSP, most TSP algorithms may fail to solve the HCP. The reason is that the TSP is an optimization problem while the HCP is not. By definition, all the Hamiltonian cycles are acceptable solutions of the TSP if their costs are close enough to the optimal-cycle cost. If this description of an acceptable solution is carried over to the HCP, then a cycle which costs 1 or 2 in a 200 node graph can be considered an acceptable solution. This is not the case, however. A Hamiltonian cycle of the augmented graph is either a Hamiltonian cycle of the original graph or it is not. If not, the result is as bad as it could possibly be. In other words, in the HCP we only want the optimum cost solutions. Any near-optimum cycle is of no value no matter how close its cost is to that of the optimal cycle (i.e., zero).

DEFINITION. Legal matrices which correspond to the Hamiltonian cycles of the original graph will be called valid.

To explain why we chose to focus on the HCP rather than on the general TSP, let us recall the energy expression. One of the objections to it, in the context of the TSP, was that there was no guarantee that the optimal solutions were (global or local) minima on the energy surface. In the case of the HCP, on the other hand, every valid state is a minimum point on the energy surface because its energy is zero. Therefore all the solution states are stable.

It should be observed that the problem of the optimal cycle of the TSP not being a minimal state can be avoided by choosing D relatively smaller than H, V, and G. Therefore, the algorithm developed here will certainly be applicable to the TSP.

Another, more intuitive, justification for choosing the HCP is that it allows a greater degree of decision making locally by a connection. In the TSP a graph edge can neither be accepted nor rejected based on its weight alone, but in the HCP the first level of decision can be made locally by rejecting edges with weight 1 without any global feedback.

The other reason for focusing on the HCP is that it is easier to judge the quality of the results. In this case there is no need to assign a degree of goodness to the cycle. After every run the outcome can be simply tested by checking whether the result is a valid matrix or not.
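A sketch of this test, assuming the is_legal helper sketched in Section 2.1 and a 0/1 adjacency matrix of the original graph:

```python
import numpy as np

def is_valid(M, adj):
    """A legal matrix is valid iff every pair of consecutive halts is joined
    by an edge of the original graph."""
    n = M.shape[0]
    if not is_legal(M):
        return False
    cities = [int(np.argmax(M[:, k])) for k in range(n)]   # city at halt k
    return all(adj[cities[k], cities[(k + 1) % n]] == 1 for k in range(n))
```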

In the following section we introduce several changes in the Hopfield method to significantly improve the odds of arriving at a valid matrix if one exists.

4. ANALOG NEURONS

A neuron is analog if its transfer function is continuous, nondecreasing, and bounded. As described in Section 2, we are particularly interested in the sigmoidal and the bounded-linear functions. A family of functions can be generated from each of these functions by scaling the curve along the x-axis. Therefore, the family of analog transfer functions is defined as

$$g_\lambda(A) = g(\lambda A), \qquad \lambda > 0,$$

where g is the sigmoidal or the bounded-linear function.

The parameter λ is called the gain factor because it scales the rate of the change of the output. The range of λ is 0 to ∞. For λ = 0, $g_\lambda$ is constant at ½, and for λ = ∞, $g_\lambda$ is the Step function.
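In code, the whole family can be generated from one base function; a minimal sketch (reusing g_bounded_linear from the earlier sketch):

```python
def g_lambda(g, lam):
    """Family member g_lambda(A) = g(lam * A) for a gain factor lam >= 0."""
    return lambda A: g(lam * A)

# Example: a high-gain bounded-linear neuron approaches the step function.
g_sharp = g_lambda(g_bounded_linear, 100.0)
```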

In his paper, Hopfield (1984) studied analog-neuron networks. His motivation for considering analog neurons over discrete neurons was that the latter can only be simulated on a digital computer, while the former can be emulated by much simpler electronic or electrochemical devices. Therefore, much cheaper hardware can be built with hundreds of neurons on a chip, with the advantage of processing at all neuron sites in parallel. Hopfield showed that discrete neurons (λ = ∞) can be approximated by analog neurons with high gain (λ ≫ 0) without affecting the convergence characteristics of the network.

We shall also consider analog-neuron networks, but for entirely different reasons.

4.1. Why Analog Neurons?

There are three reasons to justify the choice of analog neurons over discrete neurons.

Firstly, the feasible state space with the analog neurons is no longer restricted to a few scattered points (N-dimensional-cube corners) on the surface of the unit cube. As the entire volume of the cube constitutes the state space for analog-neuron networks, the minima on the surface can be approached from the inside of the volume. This is important because the discrete-neuron system monotonically descends on the energy surface and gets trapped at the first minimum encountered, even if it is not deep enough compared to the other minima. Opening new approaches to optimum minima increases the probability of reaching them.

Secondly, analog-neuron networks can exploit complete information about the energy surface. The discrete-neuron network performs gradient descent based on the energy values at the nodes of a coarse grid. It, therefore, misses the sharper features of the surface.

Lastly, the steps of the system state can be restricted to an arbitrarily small size. In the case of discrete networks, the system can only move from one corner of the cube to one of its N neighbors. Each firing transports the system to a distant environment, giving no chance to select an optimal direction. Discrete nets are, therefore, extremely vulnerable to the order in which the neurons are fired. For example, the system in state A, Figure 1, can end up in state B or state C solely depending on which neuron is fired first. The direction of maximum gradient, on the other hand, can be selected irrespective of the firing order if steps can be controlled to an arbitrarily small size.

4.2. The Problems with Analog Neurons

Analog neurons also impose difficulties of their own. One of the problems is that because the new state space is uncountably infinite, the search for an optimal state may become unacceptably inefficient. The second problem is to find a way to control the step size. The outcome of a firing only depends on the transfer function of the neuron. Therefore, although any value in the range [0, 1] is a feasible output, apparently there is no way to control the change in the output (step size) of a firing neuron.

FIGURE 1. Gradient descent on a grid. Starting from state A the system may converge to state B or C depending on the neuron firing sequence. Numbers represent energy.

In the following section a search algorithm is developed that solves both these problems.

5. ANALYSIS OF THE ANALOG NETWORK

In this section we shall study the analog network in order to design a search algorithm in accordance with the criteria discussed in the previous section. The transfer functions for all the neurons will be assumed to be $g_\lambda$. Therefore, the parameters $w_{a,b}$, $w_{ES,b}$, λ, and g completely characterize the net.

5.1. Characterization of the Stable States

The stable states of the analog networks, like those of the discrete networks, can be characterized by the minima of an energy function $E_\lambda$. The new function, which is a generalization of E, originally suggested by Hopfield (1984),

$$E_\lambda = -\frac{1}{2}\sum_{a,b} w_{a,b} V_a V_b - \sum_b w_{ES,b} V_b + \sum_a \int_0^{V_a} g_\lambda^{-1}(u)\,du, \tag{4}$$

is also a potential energy expression. While the first two terms are the energy due to interactions between charged bodies, the third term is the energy of the individual charged bodies, which are clusters of infinitesimally small particles. In the discrete case the charged body was an indivisible unit. Unlike E, the function $E_\lambda$ also depends on λ and g. In the limit, $E_\lambda \to E$ as $\lambda \to \infty$.

A point $v = (V_1, V_2, \ldots, V_N)$ is a minimum of $E_\lambda$

iff $\partial E_\lambda/\partial V_a = 0\ \forall a$ and $\partial^2 E_\lambda/\partial V_a^2 > 0\ \forall a$;

iff $g_\lambda^{-1}(V_a) = \sum_b w_{b,a} V_b + w_{ES,a}$ and $w_{a,a} < g_\lambda^{-1\prime}(V_a)\ \forall a$;

iff v is a stable state and $w_{a,a} < g_\lambda^{-1\prime}(V_a)\ \forall a$. For inhibitory self connections (i.e., $w_{a,a} < 0$), because g is a monotonically nondecreasing function, a state is stable iff it is a minimum of $E_\lambda$.

5.2. The Convergence

The new energy function plays the same role in the proof of convergence of the analog net as E did in the discrete case. The convergence condition for the analog networks is similar to that of the discrete case, except that the condition on the self connection weights is relaxed. We have the following convergence result, which is a generalization of Hopfield's result for discrete systems.

THEOREM 1. The state of an analog network, which is close to a steady state (i.e., $g_\lambda(A_a) \simeq V_a\ \forall a$), converges to that steady state if (1) $w_{a,b} = w_{b,a}\ \forall a, b$; and (2) $-w_{a,a} < g_{\lambda\,\min}^{-1\prime}\ \forall a$, where $g_{\lambda\,\min}^{-1\prime}$ is the minimum value of the derivative of $g_\lambda^{-1}(V_a)$.

Proof. If the neuron a is fired and its output changes from $V_a$ to $V_a + \delta V_a$, then $g_\lambda^{-1}(V_a + \delta V_a) = A_a$ and the change in the activity which caused the firing is $\delta g_\lambda^{-1}(V_a) = A_a - g_\lambda^{-1}(V_a)$. The change in the energy is given by

$$\delta E_\lambda = -\sum_{b \neq a} w_{a,b} V_b\,\delta V_a - \frac{w_{a,a}}{2}\left((V_a + \delta V_a)^2 - V_a^2\right) - w_{ES,a}\,\delta V_a + \int_{V_a}^{V_a + \delta V_a} g_\lambda^{-1}(u)\,du,$$

because

$$w_{a,b} = w_{b,a} \quad \forall a, b. \tag{5a}$$

Keeping the terms up to the second order, the expression simplifies to

$$\delta E_\lambda = -\sum_{b \neq a} w_{a,b} V_b\,\delta V_a - w_{a,a} V_a\,\delta V_a - \frac{1}{2} w_{a,a}(\delta V_a)^2 - w_{ES,a}\,\delta V_a + g_\lambda^{-1}(V_a)\,\delta V_a + \frac{1}{2}\,\delta g_\lambda^{-1}(V_a)\,\delta V_a$$

$$= -A_a\,\delta V_a - \frac{1}{2} w_{a,a}(\delta V_a)^2 + g_\lambda^{-1}(V_a)\,\delta V_a + \frac{1}{2}\,\delta g_\lambda^{-1}(V_a)\,\delta V_a$$

$$= -\delta g_\lambda^{-1}(V_a)\,\delta V_a - \frac{1}{2} w_{a,a}(\delta V_a)^2 + \frac{1}{2}\,\delta g_\lambda^{-1}(V_a)\,\delta V_a$$

$$= -\frac{1}{2}\,\delta g_\lambda^{-1}(V_a)\,\delta V_a - \frac{1}{2} w_{a,a}(\delta V_a)^2.$$

The above approximation is valid because $\delta V_a$ is small, so higher order terms can be neglected. The equation is exact for the bounded-linear transfer function. The first term is always negative since $g_\lambda^{-1}$ is a nondecreasing function. The second term will also be negative if $w_{a,a} > 0$, but it will be positive if the self connection is inhibitory (i.e., $w_{a,a} < 0$). To ensure that the change in the energy be negative we need

$$-w_{a,a} < \left.\frac{dg_\lambda^{-1}(V_a)}{dV_a}\right|_{\min}. \tag{5b}$$

If this condition is satisfied then the process will eventually stop because $E_\lambda$ is a bounded function. ∎

5.3. Quasi Stationary Paths

The energy $E_\lambda$ is a differentiable function of λ. Suppose S(λ) denotes a minimum of the function. If the gain factor is incremented by a small value to λ + δλ, then the corresponding minimum S(λ + δλ) should move out of S(λ) on a differentiably continuous curve. The continuous and differentiable curve formed by the locus of S(λ), as λ spans its range from 0 to ∞, will be called a quasi stationary path (QSP).

A few important observations about QSPs can be made immediately. From the definition it is clear that every minimum point of $E_\lambda$ is on some QSP. Furthermore, because the only feasible (therefore, the only minimum) state at λ = 0 is S(0) = (½, ½, …, ½) (the center of the cube), all QSPs must start from (½, ½, …, ½). Figure 2 shows the QSPs schematically.

The significance of the QSPs comes from the convergence result of the previous subsection. Suppose the system is at a stable state (minimum point of $E_\lambda$) S(λ). Now the gain factor is incremented by a small value to λ + δλ and the network is left to evolve. Because the perturbation is small and the stable state S(λ + δλ) (of $E_{\lambda+\delta\lambda}$) is very close to S(λ), the convergence theorem implies that the system will converge to S(λ + δλ) if conditions [eqn (5)] are satisfied. Therefore, the system state will trace a QSP if λ is iteratively incremented by a small value and the system is allowed to stabilize.

5.4. Asymptotic Behavior of QSPs

In this section it will be established that QSPs diverge to corner states monotonically.

THEOREM 2. $dR^2/d\lambda|_{QSP} > 0$ and $dE/d\lambda|_{QSP} < 0$, where |R| is the radial distance of the steady state S(λ) from the center of the cube.

Proof. For simplicity we shall prove the theorem for bounded-linear neurons only. Let $v = (V_1, V_2, \ldots, V_N)$, let W be the symmetric N × N connection weight matrix (i.e., $W_{a,b} = w_{a,b}$), $w_{ES} = (w_{ES,1}, w_{ES,2}, \ldots, w_{ES,N})$, and $u = (1, 1, \ldots, 1)$ with N components. In these notations

$$E = -\tfrac{1}{2} v^T W v - w_{ES}^T v. \tag{6}$$

The square of the radial distance of S(λ) from the center is $R^2 = (v - u/2)^T(v - u/2)$. So

$$\left.\frac{dR^2}{d\lambda}\right|_{QSP} = 2(v - u/2)^T \left.\frac{dv}{d\lambda}\right|_{steady\ state}. \tag{7}$$

For simplicity the subscript steady state will be dropped. All derivatives will be assumed to be subject to steady state constraints.

Until the state S(λ) approaches the surface of the cube, i.e., the outputs of the neurons approach the bounds, the function $g_\lambda(x)$ is equal to $\frac{1}{2} + \lambda x$. Thus at the steady state S(λ)

$$v = \tfrac{1}{2}u + \lambda(Wv + w_{ES}). \tag{8}$$

FIGURE 2. An example of a quasi stationary path tree.

This gives $(I - \lambda W)v = \frac{1}{2}u + \lambda w_{ES}$, or

$$v = M^{-1}\left(\tfrac{1}{2}u + \lambda w_{ES}\right), \tag{9}$$

where the symmetric matrix M is given by $I - \lambda W$. The rate of change of v is $dv/d\lambda = M^{-1}WM^{-1}(\frac{1}{2}u + \lambda w_{ES}) + M^{-1}w_{ES}$, because $dM^{-1}/d\lambda = M^{-1}WM^{-1}$. Substituting for $M^{-1}(\frac{1}{2}u + \lambda w_{ES})$ from eqn (9), we get

$$dv/d\lambda = M^{-1}Wv + M^{-1}w_{ES} = M^{-1}(Wv + w_{ES}). \tag{10}$$

From eqn (6), $dE/d\lambda = -(v^T W + w_{ES}^T)\,dv/d\lambda$. Substituting for $v - u/2$ from eqn (8) into eqn (7) we get

$$-2\,dE/d\lambda = (1/\lambda)\,dR^2/d\lambda = 2(Wv + w_{ES})^T\,dv/d\lambda.$$

Therefore, from eqn (10) we have

$$-2\,dE/d\lambda = (1/\lambda)\,dR^2/d\lambda = 2(Wv + w_{ES})^T M^{-1}(Wv + w_{ES}). \tag{11}$$

Now we shall show that the right hand side of eqn (11) is a positive quantity at a steady state. Because W is symmetric, there exists a unitary matrix U which diagonalizes W. Let $UWU^T = D$, which is diagonal. Also, because $M = I - \lambda W$, $UMU^T = I - \lambda D$ is also diagonal. If $U(Wv + w_{ES}) = \xi$ then

$$(1/\lambda)\,dR^2/d\lambda = -2\,dE/d\lambda = 2\sum_{i=1}^{N} \xi_i^2/(1 - \lambda D_{ii}). \tag{12}$$

The right hand side of eqn (12) remains non-negative as long as $\lambda D_{ii} < 1\ \forall i$. The only case of concern is a positive $D_{ii}$ for some i as λ approaches $1/D_{ii}$. Because this is a necessary and sufficient condition for singularity of M, it can be concluded that $dR^2/d\lambda$ remains positive until M becomes singular. From eqn (9) this implies that $dR^2/d\lambda$ remains positive if there exists a stationary state for the given M and λ. Put simply, the rate of change of the radial distance with respect to λ is positive in a stationary state. Because a QSP is the locus of a stationary state, the right hand side of eqn (12) remains positive on it. ∎

Let us take this opportunity to understand the impact of the approximation of $g_\lambda(x)$ by $\frac{1}{2} + \lambda x$ on the analysis. The analysis is valid as long as the system state is in the interior of the cube. If at some point the matrix M approaches singularity, then it implies that the system will diverge to infinity in search of a stationary state. Because in reality the outputs of neurons are bounded, the system state will freeze at a corner (on the surface) for all the larger values of λ. With this observation we conclude that the QSP will terminate at some corner monotonically.

5.5. QSPs and the Minima of E

Although the minima of $E_\lambda$ are spanned by the QSPs, the generalized energy function cannot be exploited to solve the optimization problems. $E_\lambda$ cannot be used as the object function because of its dependence on λ and g. With this motivation we investigate the relationship between the QSPs and the minima of E.

We expect that the QSPs will converge to the minima of E as λ → ∞ because $E_\lambda \to E$ as λ → ∞. Although this gives a way to approach a minimum of E, it is unrealistic to realize a very large value of λ (→ ∞!).

At the corners of the unit cube, $E_\lambda$ is equal to E for any value of λ (not just λ = ∞) because the last term of eqn (4) vanishes. So if S(λ) is a minimum point of $E_\lambda$ at a corner, then it is also a minimum of E. Therefore, it is not necessary to prolong the process until λ reaches a very large value. It is sufficient to arrive at a minimum of any $E_\lambda$ as long as it is at a corner of the cube.

Fortunately, Theorem 2 establishes that all the QSPs monotonically approach a corner state. Because a QSP is the locus of a minimum of $E_\lambda$, it approaches a minimum of E (because $E_\lambda = E$ at the corners). Conversely, by definition every minimum of E ($= E_\lambda$) must be the convergence point of a QSP. Thus every minimum of E is approachable through a QSP.

The rate at which a QSP approaches a corner (so a minimum of E) depends on the other parameters of the network. In the case of the TSP/HCP net, λ need not increase beyond $\lambda_{critical} = 1/(2w_{ES,a} + 2w_{a,a})$; see Section 7. For bounded-linear neurons all QSPs reach a corner at $\lambda = \lambda_{critical}$. In the case of sigmoidal neurons QSPs never reach a corner, but at $\lambda_{critical}$ they are within ε distance from a corner (i.e., $|R| > \sqrt{N}/2 - \epsilon$).

5.6. Searching Along QSPs

A simple procedure for searching for a minimum of E is described in this section. Initializing λ at 0 forces the system to the center of the cube. The procedure then involves incrementing λ by a small amount δλ and allowing the network to stabilize. The process is repeated until the system state approaches a corner of the cube.

This procedure restricts the search to a small part of the space, namely, the QSPs. Notwithstanding the restriction, there is a finite chance to approach any minimum of E because QSPs do not pass from one minimum to another (note that $dE/d\lambda|_{QSP} < 0$) and because each minimum has a QSP leading to it. The procedure also gives a handle on the step size. By selecting a small enough δλ the displacement between S(λ) and S(λ + δλ) can be reduced arbitrarily. This results in robustness (greater independence from the order in which the neurons are fired). It also increases the chances of following that QSP which has the maximum gradient.
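Putting the pieces together, a minimal sketch of the search (the flattened-vector representation and the defaults are our own; δλ = 0.01 and T = 0.02 anticipate the experimental settings of Section 8):

```python
import numpy as np

def quasi_stationary_search(w, w_es, g_lam, lam_max, d_lam=0.01, tol=0.02):
    """Trace a QSP: start at the cube center (lam = 0), raise the gain in
    small steps, and let the net restabilize after each increment.
    g_lam(lam, A) is the analog transfer function with gain factor lam."""
    N = len(w_es)
    V = np.full(N, 0.5)            # S(0) = (1/2, 1/2, ..., 1/2)
    lam = 0.0
    while lam < lam_max:
        lam += d_lam
        while True:                # relax to the nearby minimum of E_lambda
            max_change = 0.0
            for a in range(N):     # one cycle: fire each neuron once
                A = w[:, a] @ V + w_es[a]
                out = g_lam(lam, A)
                max_change = max(max_change, abs(out - V[a]))
                V[a] = out
            if max_change < tol:
                break
    return V
```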

6. THE OBJECT FUNCTION FOR THE TSP/HCP

The $E_{problem}$ in eqn (3) was unsuitable because the connection weights resulting from the solution of $E = E_{problem}$ did not satisfy the convergence conditions. In this section we shall formulate a new object function for the problem which yields connection weights that satisfy eqn (5).

Recall the expression [eqn (3)] used by Hopfield. The first two terms were penalties for having more than one 1 in any row or column. There is no reason to consider the case of more than n 1's in the third term, because the first two terms implicitly exclude that case. The third term should be used to penalize fewer than n 1's only. This modification results in the following expression

$$E_{problem} = \frac{H}{2}\sum_{x,\,i \neq j} V_{x,i} V_{x,j} + \frac{V}{2}\sum_{x \neq y,\,i} V_{x,i} V_{y,i} + G\left(n - \sum_{x,i} V_{x,i}\right) + \frac{D}{2}\sum_{x,y,i} d_{x,y} V_{x,i}(V_{y,i+1} + V_{y,i-1}).$$

The constant term G·n does not play any useful role, so it can be removed, giving

$$E_{problem} = \frac{H}{2}\sum_{x,\,i \neq j} V_{x,i} V_{x,j} + \frac{V}{2}\sum_{x \neq y,\,i} V_{x,i} V_{y,i} - G\sum_{x,i} V_{x,i} + \frac{D}{2}\sum_{x,y,i} d_{x,y} V_{x,i}(V_{y,i+1} + V_{y,i-1}).$$

Because the G-term is negative and the other terms are positive, every additional 1 will be favored if G is large compared to the other coefficients. Similarly, the system will tend to have fewer 1's when G is relatively small compared to H, V, and D. To achieve a balance between the two situations we introduce a self-connection term

$$E_{problem} = \frac{H}{2}\sum_{x,\,i \neq j} V_{x,i} V_{x,j} + \frac{V}{2}\sum_{x \neq y,\,i} V_{x,i} V_{y,i} - G\sum_{x,i} V_{x,i} + \frac{D}{2}\sum_{x,y,i} d_{x,y} V_{x,i}(V_{y,i+1} + V_{y,i-1}) + \frac{S}{2}\sum_{x,i} V_{x,i} V_{x,i}. \tag{13}$$

Solving $E = E_{problem}$ gives the following connection weights

$$w_{xi,yj} = -H\delta_{xy}(1 - \delta_{ij}) - V\delta_{ij}(1 - \delta_{xy}) - S\delta_{xy}\delta_{ij} - D d_{xy}(\delta_{i,j+1} + \delta_{i,j-1}),$$

$$w_{ES,xi} = G.$$
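A direct transcription of these weights (a sketch only: the four-index array is materialized purely for illustration, and Vc again stands for the coefficient V):

```python
import numpy as np

def build_weights(d, H, Vc, S, D, G):
    """Connection weights solved from E = E_problem [eqn (13)]; d is the
    distance matrix of the augmented graph. w[x, i, y, j] is the weight
    from neuron (y, j) to neuron (x, i)."""
    n = d.shape[0]
    delta = lambda i, j: 1.0 if i == j else 0.0
    w = np.zeros((n, n, n, n))
    for x in range(n):
        for i in range(n):
            for y in range(n):
                for j in range(n):
                    w[x, i, y, j] = (-H * delta(x, y) * (1 - delta(i, j))
                                     - Vc * delta(i, j) * (1 - delta(x, y))
                                     - S * delta(x, y) * delta(i, j)
                                     - D * d[x, y] * (delta(i, (j + 1) % n)
                                                      + delta(i, (j - 1) % n)))
    w_es = np.full((n, n), G)      # uniform external-source weight
    return w, w_es
```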

The connection weights are symmetric. Compliance with eqn (5b) requires

$$S = -w_{a,a} < \left.\frac{dg_\lambda^{-1}(x)}{dx}\right|_{\min} = \frac{1}{\lambda}\left.\frac{dg^{-1}(x)}{dx}\right|_{\min}. \tag{14}$$

Comparison with the connection weights in the Hopfield net shows that now there is no global inhibition. In the design of a VLSI chip it is advantageous to have less and more localized connectivity. In a computer simulation of the network, computation of the activity of a neuron $n_{xi}$ requires scanning 4n − 3 inputs, namely, those in columns i − 1, i, i + 1 and in row x. Therefore, the time complexity of one cycle of firing is $O(n^3)$. The total time of the process is $O(n^3 \times \text{number of cycles})$.
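The localized connectivity means the activity of a neuron can be computed without materializing the weight array at all; a sketch consistent with the weights above:

```python
def activity(V, d, x, i, H, Vc, S, D, G):
    """Input activity of neuron (x, i): only row x, column i, and the
    D-term over columns i-1 and i+1 contribute (4n - 3 inputs)."""
    n = V.shape[0]
    A = G                                    # external source, weight G
    A -= H * (V[x, :].sum() - V[x, i])       # other neurons in row x
    A -= Vc * (V[:, i].sum() - V[x, i])      # other neurons in column i
    A -= S * V[x, i]                         # inhibitory self-connection
    A -= D * (d[x, :] @ (V[:, (i + 1) % n] + V[:, (i - 1) % n]))
    return A
```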

Self-connection plays an important role in controlling the flow of the system at all stages. In the process of approaching a desirable low energy state, neurons should constantly adjust their values in order to resolve incompatibility with their neighbors. However, a grown (large output) neuron tends to dominate over weaker neighbors, depriving them of a fair chance. This is where inhibitory self connection comes in. While having an insignificant effect on a small output neuron, it becomes a strong force against a grown neuron. As a result, weak surrounding neurons can put up an effective force against the grown neurons to get a fair settlement.

7. COEFFICIENT VALUES

One major difficulty with Hopfield's algorithm was that there were no guidelines to determine suitable values for the coefficients. In this section we shall derive relations between the coefficients so that (1) the convergence condition [eqn (14)] on the self-connections is satisfied; (2) the valid states become minimum points on the surface; and (3) there are significantly fewer unwanted minimum points. The last criterion will help increase the probability of reaching a valid state.

The output of an analog neuron is given by $g_\lambda(A_{xi}) = g(\lambda A_{xi}) = g(\lambda \sum_{yj}(w_{yj,xi} V_{yj}) + \lambda w_{ES,xi})$; therefore, the absolute values of the coefficients are not important, as λ plays the role of a scaling factor. Thus we will only need to determine the ratios between the coefficients.

The role of H and V is to penalize multiple 1's in a row and in a column, respectively. A matrix is legal if and only if its transpose is legal. As the two coefficients play similar roles, it is reasonable to require

$$H = V. \tag{15}$$

To make a valid matrix stable, we must require that any decrease in the output of a neuron at 1 or any increase in the output of a neuron at 0 must increase the energy. That is

$$\left.\frac{\partial E}{\partial V_{xi}}\right|_{V_{xi}=1,\ \text{valid state}} < 0 \quad \text{and} \quad \left.\frac{\partial E}{\partial V_{xi}}\right|_{V_{xi}=0,\ \text{valid state}} > 0.$$

Because

$$\frac{\partial E}{\partial V_{xi}} = \frac{\partial E_{problem}}{\partial V_{xi}} = H\sum_{j \neq i} V_{xj} + V\sum_{y \neq x} V_{yi} - G + D\sum_y d_{xy}(V_{y,i+1} + V_{y,i-1}) + S\,V_{xi},$$


the above conditions, for the HCP, transform into the following inequalities on the coefficients

$$-G + S < 0, \tag{16}$$

and

$$-G + H + V + D\sum_y d_{xy}(V_{y,i+1} + V_{y,i-1}) \geq -G + H + V > 0,$$

or,

$$-G + 2H > 0. \tag{17}$$

The D-term in eqn (17), which depends on the graph, has been replaced by its minimum value, namely, zero.

Conditions [eqns (16) and (17)] ensure that every valid state is a minimum point of the energy surface. It is impossible to satisfy both conditions if S > 2H, but so long as S < 2H any value of G in the range from S to 2H satisfies them. In the previous section, we observed that the choice of the value of G is critical in determining where the final state will end up between the two extremes of too few 1's and too many 1's. To determine a suitable value for G, let us assume that the outputs of all the neurons in the final state (zero or one) should be equally stable.

The stability of a neuron output can be measured by how little it changes for a given perturbation in the activity. From the observation of the transfer function, sigmoidal or bounded-linear, it can be concluded that the farther the activity is from zero, the more stable the output. The equal stability assumption, therefore, translates into equal magnitude of the activities of all the neurons. Because the activities of neurons at 0 and 1 in a valid state are G - 2H and G - S respectively, the equal stability assumption results in the following equation

$$|G - S| = |G - 2H|,$$

or

$$G - S = 2H - G,$$

or

$$G = H + S/2, \tag{18}$$

which is also the midpoint of the allowed range for G. The value of G given by eqn (18) satisfies eqns (16) and (17) if S ≤ 2H. Furthermore, it also eliminates a large class of undesirable states from being minimal points of the energy surface, as shown below.

In the case of the TSP, inequalities (16) and (17) should be $-G + S + t_0 < 0$ and $-G + 2H + t_0 \geq 0$, where $t_0 = D\sum_y d_{xy}(V_{y,i+1} + V_{y,i-1})$. Still, G should be in the range [S, 2H]. Thus the most suitable value of G will be $t'_0 + H + S/2$. The best approximation $t'_0$ of $t_0$ for the neuron $n_{xi}$ is D times the sum of the two smallest members of $\{d_{xy} \mid y \neq x\}$.

States containing two or more 1's in a row or a column constitute a majority among the illegal corner states. The activity of each of these 1's in the final state (a corner state) will be $A \leq G - H - S$. To make sure that no such state is stable, it is sufficient to have

$$G - H - S < 0. \tag{19}$$

Fortunately, G = H + S / 2 satisfies (19) when S > 0, thus eliminating this large class of illegal states from being stable.

The condition H > S/2 is subsumed by another condition imposed by the convergence requirement on S. From (14), λ cannot be increased beyond

$$\lambda_{max} = \frac{1}{S}\left.\frac{dg^{-1}(u)}{du}\right|_{\min}. \tag{20}$$

This equation shows that the choice of S must be made prudently, otherwise $\lambda_{max}$ may not be large enough for the process to converge. Suppose for a constant $A_0$, $g(A_0)$ is close enough to 1. Then the magnitude of the activity of every neuron in the final (valid) state must, at least, be equal to $A_0/\lambda_{max}$. So $|G - S| \geq A_0/\lambda_{max}$, or $\lambda_{max} \geq \lambda_{critical} = A_0/(G - S)$. Using eqns (18) and (20) we get the condition $A_0/g^{-1\prime}_{\min} \leq (H - S/2)/S$.

For the bounded-linear function $A_0/g^{-1\prime}_{\min}$ is equal to ½, because $A_0 = \frac{1}{2}$ and $g^{-1\prime}_{\min} = 1$. So $(H - S/2)/S$ should be greater than or equal to ½, or

$$S \leq H, \tag{21}$$

which subsumes the condition S < 2H. For a sigmoidal function it is not possible to choose $A_0$ such that $g(A_0) = 1$, so let us assume that $g(A_0 = 1) = .88$ (or $g(-1) = .12$) is close enough to 1 (or 0). Then $A_0/g^{-1\prime}_{\min}$ is equal to ½, resulting once again in condition (21). If the outputs of the neurons are expected to be closer to 1 (or 0), then a higher value of $A_0$ is required, giving $A_0/g^{-1\prime}_{\min} \gg \frac{1}{2}$. This in turn requires that $S \ll H$.

DEFINITION. States with no more than one 1 in each row and column and fewer than n 1's in all will be called Σ-states. Σ-states are illegal because they have at least one row/column with all 0's.

Let us now consider D. If D is set to zero, then valid and invalid legal states become indistinguishable because they have the same energy. Thus far D was ignored in the discussion. Conditions (15), (18), and (21) do not impose any constraints on D. Therefore, at this point we can only claim that a system subject to these conditions can either converge to a Σ-state or to a legal state. Our interest, however, is not in just any legal state but only in the valid ones. For that we turn to D.

The role of D is to disallow augmented edges in the resulting Hamiltonian cycle. The feature of an invalid legal configuration which makes it invalid is having 1's at (x, i) and (y, i + 1) where $d_{xy} = 1$. Repeating the argument applied to the case of two 1's in a row or a column, it can be concluded that $G - D - S < 0$ guarantees that the feature described above will never occur in the final state. The condition $D > G - S$ further restricts the types of states that can be stable. Now, with the four conditions, only valid states and Σ-states can be minimal points.

Experiments with the coefficients satisfying eqns (15), (18), and (21) and the condition $D > G - S$ were successful with graphs of medium and high levels of connectivity. However, a new factor began to influence the outcome in very low connectivity graphs (most edges with weight 1). It was found that $D > G - S$ was too restrictive. The system frequently began to converge to Σ-states rather than to valid states. A closer look revealed that early in the process (i.e., small λ) the system jumped to a state in which alternate columns were completely zero (two consecutive columns were zero at one place in the case of an odd n). These states were the only minima of $E_\lambda$ because the D-term in eqn (13), which is zero for these striped states, was too large for the others. These (striped) states are on the QSPs of Σ-states. Therefore, the restriction of $D > G - S$ was relaxed. As the results of the experiments will show in the following section, low D was required for graphs with low connectivity.

8. EXPERIMENTS WITH THE NETWORK

In this section the details of the experiments and the results are presented. Finally, we try to draw conclusions regarding the limitations and the reliability of the approach. Experiments were performed on graphs of up to 500 nodes. Larger graphs were never tried because of the limitations of the computing facility and time. The objectives of the experiments were to determine the rate of success and the suitable values of the parameter D. Besides the HCP, a 318-city TSP (see Lawler et al., 1985) was also solved.

8.1. Experimental Details

8.1.1. Computing Facility. All experiments were performed with a simulation on a time-shared SUN-4 system (Sun Corp., Ichinomiya, Japan).

8.1.2. Graph Generation. Graphs were generated based on two parameters, n (number of nodes) and c (connectivity). The parameter c is a number between 0 and 1, denoting the probability of an arbitrary pair of nodes being connected. For any given values of n and c, the family of graphs G(n, c) was generated as follows. Every graph had one Hamiltonian cycle to begin with. The remaining vertex pairs were connected randomly with probability c.
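A sketch of such a generator (our reading: the initial cycle is a planted Hamiltonian cycle, which guarantees that a solution exists):

```python
import numpy as np

def generate_graph(n, c, rng=None):
    """Sample a graph from G(n, c): plant one Hamiltonian cycle, then join
    every remaining vertex pair independently with probability c."""
    rng = rng or np.random.default_rng()
    adj = np.zeros((n, n), dtype=int)
    perm = rng.permutation(n)
    for k in range(n):                       # the planted cycle
        a, b = perm[k], perm[(k + 1) % n]
        adj[a, b] = adj[b, a] = 1
    for x in range(n):                       # random extra edges
        for y in range(x + 1, n):
            if adj[x, y] == 0 and rng.random() < c:
                adj[x, y] = adj[y, x] = 1
    return adj
```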

8.1.3. Transfer Function. A bounded-linear-neuron network was chosen for the experiments. The problem with the sigmoidal function, in the case of large graphs, is that the small remaining outputs of near-zero neurons add up to a large value. This value, which should be negligible in the ideal case, can no longer be ignored and constitutes a major factor in the input activity of every neuron. As a result, a higher level of activity needs to be achieved to push neurons closer to zero, which in turn requires higher $\lambda_{max}$. Equation (20), therefore, requires lower S. After a limit, S can no longer be lowered without making the self-connection ineffective. Thus only the bounded-linear neurons were found suitable for the entire range of n.

8.1.4. Coefficients G, H, V, S. These coefficients were chosen subject to eqns (15), (18), and (21). Since absolute values are of no significance, G was fixed at 1. From eqns (18) and (21) we have .66 < H < 1 and S = 2(1 − H). As pointed out in Section 6, the inhibitory self-connection is a deterrent to the growth of a neuron against an incompatible environment. The highest value of S was chosen subject to the constraints. Thus the values of the coefficients for the entire experiment were chosen to be

G = 1.0, V = H = 0.7, S = 0.6,

with only one exception. Due to the poor results for the graphs of the family G(n = 20, c = .2), the value of S was chosen to be 0.55 instead of 0.6.
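These choices can be verified against the constraints mechanically; the λ ceiling of 1/S for bounded-linear neurons follows from eqn (20) with $(g^{-1})'_{\min} = 1$ (a small check of ours, not from the paper):

```python
G, H, Vc, S = 1.0, 0.7, 0.7, 0.6

assert H == Vc                        # eqn (15): H = V
assert abs(G - (H + S / 2)) < 1e-9    # eqn (18): G = H + S/2
assert S <= H                         # eqn (21)
lam_max = 1.0 / S                     # eqn (20) for bounded-linear neurons
```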

8.1.5. Parameter λ. All runs were started with λ = 0. λ was incremented by a fixed amount δλ after every intermediate stable state until it reached $\lambda_{max}$, given by eqn (20). δλ was kept at a conservative 0.01.

8.1.6. Firing Sequence. Neurons were fired in a cyclic fashion (i.e., each neuron once per cycle). In order to save time, a random number generator was not used. Instead, the kth neuron in the cycle was chosen to be (x, i), where $x = \text{Rem}(\text{Rem}(k \cdot p, n^2), n)$ and $i = \lfloor \text{Rem}(k \cdot p, n^2)/n \rfloor$, where Rem stands for remainder. The integer p is relatively prime to $n^2$.
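A sketch of this fixed pseudo-random order (our interpretation of the Rem/floor formulas; p is any integer relatively prime to n²):

```python
def firing_order(n, p):
    """Yield the neurons (x, i) of one cycle: for k = 0..n^2 - 1,
    r = Rem(k * p, n^2), x = Rem(r, n), i = floor(r / n).
    When gcd(p, n^2) = 1 each neuron is visited exactly once."""
    for k in range(n * n):
        r = (k * p) % (n * n)
        yield r % n, r // n
```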

8.1.7. Termination Condition. The net was considered to have reached a stable state if the change in the output of each neuron was less than a tolerance factor T for the cycle. T was set to 0.02.

8.1.8. Coefficient D. Given a family G(n, c) of graphs, various values of D were tried. Because it is preferable to choose D greater than G − S = H − S/2 = .4, the search was started with D = H = .7. Various values were tried down to .05, in steps of .05.

8.1.9. Sampling. Ten graphs of each family, G(n, c), were solved for each value of D. Each graph was run only once because the firing sequence was fixed.


8.2. Results of HCP

The following table (Table 1) shows the range of D for which the success rate (computing a Hamiltonian cycle) was 100% for every G(n, c).

TABLE 1
Algorithm Performance on Graphs of Various Sizes and Connectivities

 n     c     D           % Success   Av. Cycles
 20   .10    —               20%         —
 20   .15    —               20%         —
 20   .20    .35 → .15      100%        311
 20   .50    .7  → .05      100%        281
 20   .90    .7  → .05      100%        378
 50   .10    —               20%         —
 50   .15    .1  → .05      100%        426
 50   .20    .5  → .05      100%        419
 50   .50    .7  → .05      100%        360
 50   .90    .7  → .05      100%        314
 100  .10    —               30%         —
 100  .15    .1  → .05      100%        553
 100  .20    .55 → .05      100%        503
 100  .50    .7  → .05      100%        478
 100  .90    .7  → .05      100%        358
 200  .10    —               50%         —
 200  .15    .35 → .05      100%        541
 200  .20    .55 → .05      100%        666
 200  .50    .7  → .05      100%        458
 200  .90    .7  → .05      100%        388
 300  .10    .7  → .2       100%        665
 300  .15    .7  → .05      100%        616
 300  .20    .7  → .05      100%        691
 300  .50    .7  → .05      100%        489
 300  .90    .7  → .05      100%        379
 400  .10    .5  → .2       100%        912
 400  .15    .7  → .05      100%        584
 400  .20    .7  → .05      100%        756
 400  .50    .7  → .05      100%        511
 400  .90    .7  → .05      100%        407
 500  .10    .5  → .1       100%        996
 500  .15    .7  → .05      100%        871
 500  .20    .7  → .05      100%        858
 500  .50    .7  → .05      100%        550
 500  .90    .7  → .05      100%        409

8.3. Results of the 318-City TSP

One experiment with a general TSP was performed to compare the performance of this net with earlier nets and other optimization algorithms. The well known 318-city problem was selected because it is the largest problem attempted by any method for which the results are published. The cycle cost varied, from 64552 to 61337, with the value of D. The cost of 61337 was achieved with H = 0.7, S = 0.6, and D = 0.35. It took a total of 446 cycles of firing. The result was significantly away from the best known cost of 41269, computed using linear programming techniques. Yet, to the best of our knowledge, this is the only reasonably successful attempt with a neural net algorithm for any graph of more than 30 nodes. The significance of the result can further be emphasized by observing that the average length of an edge in the 318-city graph is 1849.0, so the cost of an average Hamiltonian cycle in the graph is 587991, almost 10 times the neural net solution. Furthermore, the linear programming solutions are often tailor-made for special classes of problems. In this case it assumes that the graph is planar. The neural-net solution assumes nothing about the characteristics of the graph and is applicable to the most general case.

9. CONCLUSION

The results of the experiments are stable enough to draw conclusions with confidence. Contrary to intuition, the performance improves with the size of the graphs. As the results show, the problems on 20 node graphs with 15% connectivity could not be solved, while those on 400 and 500 node graphs were solved with even lower connectivity.

The cost of an iterative procedure such as this one is difficult to estimate with precision, but the results clearly show that the number of cycles grew much more slowly than $\sqrt{n}$. Therefore, it is fair to estimate the time complexity by $O(n^{3.5})$. Actually, the number of cycles was more sensitive to c.

It is also interesting to note that the procedure did find a reasonably good solution of the 318-city TSP. It is true that the result was nowhere near the best known cycle of 41269, but still it is the only reported result computed using a neural network. In fact, no neural net except that of Xu and Tsai succeeded in even arriving at any tour in a consistent manner, much less a near-optimal one, for n above 30 (see Protzel et al., 1989). It is also worth noticing that the other algorithms, which found the best known cycle, have used the fact that the graph is planar. The neural net method, on the other hand, does not exploit such information and is perfectly general in its applicability.

The experimental results show that the proposed modifications to the original Hopfield-Tank approach have resulted in a robust algorithm which can be used for large graphs. We believe that graphs with n ≫ 500 would also show a 100% success rate for most connectivity levels. The main limitation in handling large graphs comes from the time complexity of $O(n^{3.5})$, which is too large for n = 1000.

REFERENCES

Angeniol, B., Vaubois, G., & Texier, J. (1988). Self-organizing feature maps and the Traveling Salesman Problem. Neural Networks, 1, 289-293.

Brandt, R. D., Wang, Y., Laub, A. J., & Mitra, S. K. (1988). Alternative networks for solving the Traveling Salesman Problem and the List-Matching Problem. Proceedings of the Second IEEE Neural Network Conference, II, 333-340.

Durbin, R., & Willshaw, D. (1987). An analogue approach to the Traveling Salesman Problem using an elastic net method. Nature, 326, 689-691.

Garey, M. R., & Johnson, D. S. (1978). Computers and intractability: A guide to the theory of NP-completeness. San Francisco: W. H. Freeman.

Gunn, J. P., & Weidlich, R. B. (1989). A derivative of the Hopfield-Tank neural network model that reliably solves the Traveling Salesman Problem. Proceedings of the First International Joint Conference on Neural Networks, 588-593.

Hegde, S. U., Sweet, J. L., & Levy, W. B. (1988). Determination of parameters in a Hopfield/Tank computational network. Proceedings of the Second IEEE Neural Network Conference, II, 291-298.

Hopcroft, J. E., & Ullman, J. D. (1979). Introduction to automata theory, languages, and computation. Reading, MA: Addison-Wesley.

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79, 2554-2558.

Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences, 81, 3088-3092.

Hopfield, J. J., & Tank, D. W. (1985). Neural computation of decisions in optimization problems. Biological Cybernetics, 52, 141-152.

Kirkpatrick, S., Gelatt, C. D., Jr., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220, 671-680.

Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., & Shmoys, D. B. (1985). The Traveling Salesman Problem: A guided tour of combinatorial optimization. New York: John Wiley.

Protzel, P., Palumbo, D., & Arras, M. (1989). Fault-tolerance of a neural network solving the Traveling Salesman Problem (NASA Contractor Report 181798). Hampton, VA: NASA Langley Research Center.

Szu, H. (1988). Fast TSP algorithm based on binary neuron output and analog neuron input using the zero-diagonal interconnect matrix and necessary and sufficient constraints of the permutation matrix. Proceedings of the Second IEEE Neural Network Conference, II, 259-266.

Van den Bout, D., & Miller, T. K. (1988). A traveling salesman objective function that works. Proceedings of the Second IEEE Neural Network Conference, II, 299-303.

Wilson, G. V., & Pawley, G. S. (1988). On the stability of the Travelling Salesman Problem algorithm of Hopfield and Tank. Biological Cybernetics, 58, 63-70.

Xu, X., & Tsai, W. T. (1991). Effective neural algorithms for the Traveling Salesman Problem. Neural Networks, 4, 193-205.