Sutanto Soehodho Department of Civil Engineering, Faculty ... · Sutanto Soehodho Department of Civil Engineering, Faculty of Engineering, University of Indonesia, Depok 16424, Indonesia

Development of genetic-taxonomy evaluator

for finding shortest path in transportation

planning systems

Sutanto SoehodhoDepartment of Civil Engineering, Faculty of Engineering,University of Indonesia, Depok 16424, Indonesia

62-27-727 0028

Abstract

This research is aimed at elaborating a new methodology of shortest pathfinding by using the capability of taxonomy systems and geneticalgorithms. Combination of the two developed in this research is calledGenetic Taxonomy Evaluator (GTE) which is expected to be an effectiveand efficient tool compared to conventional method that enumerates anypossible combination of links within the network. While keeping thecharacteristics of transportation networks TR transforms the networksinto taxonomy structure which is hierarchically shaped based on problemto be solved. In the process TR also creates classification of nodes in thenetwork. This classification provides facilities to isolate the problem tothe core, and criteria that can be inserted in the genetic algorithm.Furthermore, the methodology is to be analyzed to comprehend theperformance and benefits compared to conventional method. So apackage program for GTE is developed in C Language. Results ofanalysis show that several optimal parameters should be determined priorto searching the shortest path, and the findings are quite promising as athreshold for further researches.

1 Introduction

One of the biggest computational efforts devoted to transportationplanning is undertaking the step of network assignment in which flow oneach link of the network is to be determined. The big computationaleffort in network assignment is caused by iteration in searching for

Transactions on the Built Environment vol 30, © 1997 WIT Press, www.witpress.com, ISSN 1743-3509

178 Urban Transport and the Environment for the 21st Century

shortest path that connect each of origin-destination pair in the network,which is done by enumerating possible combination of links such asDjakstra, Moore and others. This problem has stimulated the research inwhich a method is to be sought for solving the shortest path finding withmore efficient way.

The original idea of this research in developing efficient method isconducted by learning the characters of transportation network as theobject and characters of human in doing their optimal decision for routechoice as the subject. Learning the two has evolved with two stepsolution which consists of two steps. The first step is to simplify thestructure of network shape by maintaining the original characters such asdistance. This simplification is necessary since it eases humanrecognition for route choice. The second step is to adopt an optimizationmethod with closed characters to the simplified network structure.

The ensuing chapters will explain the methodology step by step,development of the algorithm, and some computational experience withprior determination of several optimal parameters.

2 General Methodology

As already mentioned above that there are two main tasks in themethodology proposed in this research. The first task is to restructureshape of transportation network into new network where its nodesarranged hierarchically to connect their closest nodes which are located atdifferent level. As what shown in the original network, restructurednetwork should still reflects the relative position of each node to theothers including their connecting links. This sort of network characters isowned by one of the paradigms of computer learning called taxonomyformation.

Furthermore, the closeness of any origin node to destination node of atrip in transportation system is the most important perception for routechoice. So it is emphasized in the hierarchical taxonomy formation of thenetwork.

As for the second task, it has to deal with the solution of non-linearoptimization problems since the impedance or link performance functionis indicated by non-linear function such flow-dependent travel time ofroad performance. Unlike conventional method, the simulated annealingmethod is known to be able to provide global solution to such problems,however its original physics nature imposes lengthy solution for mappingproblem of transportation systems. It is then chosen in this approach of


Urban Transport and the Environment for the 21st Century 179

using the genetic algorithm which has proven to give good solution inmany disciplines. Its properties with parallel solution are very suitable fortransportation problems and further development of parallel processor incomputing systems.

It is the general methodology adopted by this research in which thehybrid of both taxonomy re-constructor and genetic algorithm isdeveloped as a method, latter called as Genetic-Taxonomy Evaluator(GTE), in which a set of source code is developed in C-Language forfurther investigation of its performance regarding optimal parameters.

3 Taxonomy Re-constructor (TR) and Genetic Algorithm (GA)

The task of taxonomy re-constructor (TR) is to arrange the original roadnetwork into representative hierarchical network in which a set of certainnodes will be located in a layer showing their closeness and directness tothe origin node being considered. The following explanation is contrived.

3.1 Formation of Taxonomy Structure

To provide clear understanding in the taxonomy re-constructor, simplenetwork illustrated in Figure 1 is used.

Figure 1: Simple Transportation Network

Supposed node 17 is chosen as origin node, then taxonomy re-constructorwill put node 17 as the paramount in hierarchical network, while thereseveral vertical layers consist of certain nodes for each layer. The firstlayer under the paramount comprises of a set of nodes that has degree ofcloseness one in which they are separated from the origin node only byone direct link. In the same manner the ensuing layers denote their



closeness by their layer, so nodes in the second layer are separated fromthe origin node by two links and so forth. By doing so, simple network inFigure 1 is restructured as shown in Figure 2.

Figure 2: Taxonomy Structure of Simple Transportation Network

Localized Problem

Route choice behavior of individual appear to be various, one maychooses the route that directly connects his/her destination, by other mychoose indirect route. If all trip maker will take only direct route, theproblem of route choice may reduce a lot in computation effort, however,the practice always shows many possibilities since performance of roadchanges by time and flow condition (e.g., congestion). In the proposedGTE problem can be generalized for all possibilities, so that several routetypes are defined as follows;

• Direct Route is defined as a route that connects origin node anddestination with sequential nodes in which only one node in each rowis connected e.g., route 8-9-11-15-17 shown by Figure 2.

• Indirect Route is defined as a route that connects origin node todestination node with possibility of having horizontal connection ofminimal two nodes at the same row, e.g., route 8-9-10-13-18-17shown by Figure 2.

• Round Route is defined as a route that connects origin node anddestination node with possibility of having vertical connection ofminimal two nodes of different rows, e.g., route 8-7-6-12-16-15-17shown in Figure 2.



By having route types, the process of route choice in GTE could becontrolled to a certain level of complexity such of finding route withdirect route with least size of route set, and forth to the biggest set withround routes. For example, to find the shortest route connecting node 8 to1 7 with direct route can be focused or localized by inventory of directroutes available between rows 5 and 1. While for indirect and roundroutes, the search may move within the same row and/or to lower levelrow and back to upper row and so forth depend on the degree of searchchosen.

3.2 Genetic Algorithm (GA)

The original Genetic Algorithm (GA) (Holland. 1992) is utilized tofinding the shortest path by cross-matching and mutation processes. GAuses a population of strings or character harnesses to represent data. Thecharacter harnesses used in the presentation denote the performance ofroad segments such as travel time, and they can be assumed as geneticentity that can be cross-matched and mutated to produce new entities.Similar to biologically genetic process, the best entity or the bestroute/path is sought by iteration. The worse entity is eliminated, and thebetter entity is processed further until no better entity is found anditeration is stopped.

4 Development of Genetic Taxonomy Evaluator (GTE)

The development of GTE comprises of two tasks. First task is totransform original transport network into taxonomy network, and secondtask is to genetic process to produce best route/path which incorporate thefollowing rules :

1. Inventory of complete nodes and group of nodes at each taxonomyrow

2. Choose string integers of nodes and/or group of nodes as symbols3. Consider representation of all types of routes within the population4. Enable probability concept for various route choice5. Strings produced by genetic process that do not represent real network

should be eliminated6. Intensity of links or road segments should represent the road

performance such as travel time for evaluation

By having the rules and various search scheme can be developed, andused for single or multi-path assignment in transportation problems.



String and Integer as Data Representation

String is used to represent taxonomy node. As for simple example givenabove Table 1 illustrates some integer strings that compose the inventoryof taxonomy row 5 to 1 with degree of search 1.

Taxonomy RowString Location

ActualSymbolic

12~>j45678

51

88-5

42

7910

9-5-107-8-99-10

7-8-5-109-8-5-10

3->j

1 113

1 1-9-1311-9-10-13

2

4

1518

15-14-18

15

17

Table 1: Inventory of Taxonomy Row for Degree of Search 1

1) String locationTaxonomy rowInteger SymbolActual node

2) String locationTaxonomy rowInteger symbolActual node

3) String locationTaxonomy rowInteger symbolActual node

1518152

8-51518

24172469402417

3 43 21 111 153 4^ iJ Z,2 213 18

3

11-9-13

51115

111742218

51117

First example shows string 11111 that represent direct route 8-7-11-15-17, string 26221 of second example represent indirect route 8-5-9-10-13-18-17, while string 11321 of third example represent round route 8-7-11-9-13-18-17. So it is clearly seen that length of string reflects number oftaxonomy rows.



Cross-Matching Process

Cross-matching process of GTE is adopted from GA. To give anillustration the strings given above can be used. For example string 11111is crossed with string 26221 with a certain value probability, and randomprocess of choosing location 2 as crossing location.

By this process some new strings will be produced. Cross-matching ofthe string 1 1 1 1 1 (route 8-9-10-13-18-17) and string 21111 (route 8-5-7-11-15-17) produces new route 8-9-10-13-18-17 which is valid, and newroute 8-5-7-11-15-17 which is not valid since no connection betweennodes 5 and 7 in the real network.

Mutation Process

In taxonomy the mutation process may increase possibility of improvingthe route. It is then necessary to do the process before discarding theinvalid route. Again probability concept is helpful to adjust thepossibility of discarding route. The valid route may have high probabilityof mutation, while invalid route may have low probability. Further, themutation process can be done at the location in which inconsistencyappears such as route 8-5-11-15-17 that may experience mutation processat location 2 with two possibilities ;

a) 8-5-7-11-15-17 mutates into 8-5-10-11-15-17orb) 8-5-7-11-15-17 mutates into 8-5-9-11-15-17

The possibility a) may repeat several times the process since theproduced route is still invalid. The repetition could be done until validroute is gained, or stopped by giving some penalty to the string. Whilepossibility b) has certainly created a new route.

6 Development of Algorithm & Source Code

In order to comprehend GTE mechanism in searching shortest path analgorithm is developed. The algorithm is structured based on C-Languagewhich is widely used, and the illustrated as followings :* * ;]s * »M< * * * * * * * * * * *******************************************

^include <stdlib.h>



# include <stdio.h># include <conio.h>//include <string.h>#include <alloc.h>//include <dir.h># include <dos.h># include <time.h>

//include "global. h"//include "gte.h"//include "fileinit.h"# include "taxonomy. h"//include "inventrs.h"//include "genetic. h"

NETWORK struct _STC ̂ network;int gteTimerld;int processLevel = 2;int main (int argc. char *argv [])((FILE ^nflle;char outname [MAXPATH];char *cptr;int argCnt;

struct F1LENODE_STC fileNodes;struct TAXONOMY_ROW_STCint start? o int,

endPoint;

cprintf ("%s [/qj[/g[d|sn]][/p[t|i|g]] <route-filename>\r\n", argv [0]);exit(1);

»̂argCnt = 1;argCnt = processOptions (argc, argv, argCnt);

if (argCnt >= argc) [cprintf ("%s [/q][/d] <route-filename>\r\n", argv [0]);exit (1);

(iif ((infile = fopen (argv [argCnt], "rt")) == NULL) {cprintf ("Cannot open %s\r\n", argv [1]);exit(l):



strcpy (outname. argv [argCnt]):if ((cptr = strrchr (outname.'.')) == NULL)cptr = outname + strlen (outname):

strcpy (cptr, (strcmp (cptr. ".out") ? ".out" : ".oul")):

gteTimerld = initTimer ("GTE"):stopTimer (gteTimerld):

if (IreadDataFile (infile. &fileNodes))exit(l):

if(!getStartEndData (&fileNodes. &startPoint. &endPoint))exit(l):

if (!taxonomy (&fileNodes. &endPoint)) [cxit(l);

f (processLevel >=if (linventarisLajur (&lajurTax. startPoint. endPoint)) [exit(l):

if (processLevel >= 2)uenetic (TaxRow):

gteTimerld = endTimer ("GTE". gteTimerld):

GTE Sub-routines

Furthermore, the algorithm has several sub-routines with the followingexplanations :

• GTE.C is main program• TAXONOMY.C contains routines for taxonomy process• INVENTRS.C contains routines for inventory• GENETIC 1 .C contains routines for genetic process• FILEINIT.C contains routines for reading input file and sending to

memory• GLOBAL.C contains global routines and supports the main program• ERRORS.C contains routines for printing when errors occur



7 Computational Experiences

To see the performance of GTE algorithm in solving the shortest pathproblems, a medium size of transport network given by Sioux-Falls city(LeBlanc, et.al., 1975) is used. The network has 24 nodes and 76 links intotal which is shown in Figure 4, and the link performance function isgiven in Table 2.

Figure 4: Sioux-Falls City Network

There are six (6) parameters involved in the GTE algorithm. Theseparameters are necessary to be set up prior to the search since theydetermine the optimal search, and have the following definitions :

• Population Size (PS) - denotes the size of string population used inthe genetic process

• Best Paths to Keep (BP) - denotes number of the best strings that willgenerate next strings without cross-over and/or mutation

• Number of Generations (NG) - denotes number of generations to beprocessed

• Cross-Over Probability (COP) - denotes probability of cross-overprocess at each generation

• Mutation Probability (MP) - denotes probability of mutation processat each generation for valid strings or routes

• Invalid Mutation Rate (IMR) - denotes the number of trials to bemutated to invalid string to obtain valid strings

LINK TRAVEL TIME(HOUR)

(7,18) 6k (18,7) 0.02



LIN

(10,1 5) 6k(3,4) &

(1,3)&(1,2) &

(3, 12) 6k(12,13)&(18,20) 6k(10,1 1)&(16,1 8) 6t(15,19)&(10,1 7) 6k(8,9) &

(4,11) 4c(5,9) &(9,10)&(15,22)&(11, 12) 6k(4,5) &

(13,24) 6k(24,21) 6k(20,21) 6k(2,6) &(6,8) &(7,8) &(5,6) &

(23,24) &(21,22)&(14,23) &(22,23) &(14,1 5) 6k(16,17)&(17,19) 6k(19,20)&(11, 14) 6k(20,22) &(8J6)&(10,16)&

K

(15,10)

(4,3)(3,1)(2,1)(12,3)(13,12)(20,18)(11,10)(18,16)(19,15)(17,10)(9,8)(11,4)(9,5)(10,9)(22,15)(12,1 1)(5,4)(24,13)(21,24)(21,20)(6,2)(8,6)(8,7)(6,5)(24,23)(22,21)(23,14)(23,22)(15,14)(17,16)(19,17)(20,19)(14,1 1)(22,20)(16,8)(16,10)

TRAVEL TIME(HOUR)

0.060.040.040.060.040.030.040.050.030.030.080.10.060.050.030.030.060.020.040.030.060.050.020.030.040.020.020.040.040.050.020.020.040.040.050.050.04

Table 2: Sioux-Falls Link Performance Function



Prior to conducting the search for shortest path the six parametersmentioned above are to be determined first, and the popular HillClimbing method is used. Parameter determination is done by searchingone parameter as variable and holding the others as constant within acertain range with decreasing/increasing move step to converge to realshortest path (obtained by conventional Moore Algorithm] and reachminimal computing time. Similar process is applied for each otherparameter intermittently until optimal value is obtained, and so on.

To have conclusive optimal parameters, the Sioux-Falls network isutilized by testing two Origin-Destination pairs, namely (11-16) and (1-18) with feasible paths of 286 and 14,256 consecutively. The results ofparameters determination is summarized in Table 3.

O-D Pair

11-161-18

PS(paths)

50400

BP(paths)

44

NG(paths)

1010

COP

(%)

6060

MP(o/oo)

1515

IMR(times)

1010

Table 3: Optimal Parameters for Different O-D Pairs

It can be concluded from Table 3 that almost all parameters for differentO-D pairs have the same values except for PS in which the numbersdiffer reciprocally with their feasible paths. O-D pair 11-16 with 286feasible paths has PS value of 50 which is about 17%, while O-D pair 1-18 with 14,256 feasible paths has PS value of only 400 which is 2.8%. Itseems that the ratio is reciprocal, hence the greater the number of feasiblepaths the less the value of PS.

It is recorded that once the optimal parameters are determined the pathfinding process could rapidly converge to the shortest one. In the case ofO-D pair 11-16 path finding process is completed in 0.17 second, whileO-D pair 1-18 is 6.09 seconds. Furthermore, computation experience hasshown that the convergence of shortest path finding is achieved withinless number of generation iterations. Table 4 shows the convergence ascompared with number of iterations for O-D pair 11-16, and theillustration is given in Figure 5 in which convergence is achieved at the7"' iteration. The similar situation is also shown by Table 5 and Figure 6in which convenience is achieved at the 7''' iteration.

PSBP

(path)(path)

504 # of

Shortest 1Generation

>athIterations



NO (path)

COP (%)

MP(o/oo)

IMR (times)

10

60

1510

1

10

n

18

nJ

18

4

17

5

9

6

9

-7/

9

8

9

9

9

10

9

Table 4: Convenience of O-D 11-16

Generation Iteration

-B- Length of Shortest Path

Figure 5: Convenience of O-D 11-16

PS (path)BP (path)

NG(path)

COP(%)

MP(o/oo)IMR (times)

4004

10

601510

Shortest Path

# of Generations Iterations1

50

2

30

i

27

4

21

:>

21

6

21

7

18

8

18

9

18

10

18

Table 5: Convenience of O-D 1-18

60^ 50t 40m 30o 20W 10

c

Generation Iteration

:V___

^^m m m_- , "= = =Hnp-m-«p2 4 6 8 10 1

Generation

_g_ Length of Shortest Path

2

Fimire 6: Convenience of O-D 1-18



7 Conclusions

A new approach of way-finding in transportation systems, namelyshortest path, is proposed in this paper. The approach is a hybrid of bothtaxonomy re-constructor and genetic algorithm which is called Genetic-Taxonomy Evaluator (GTE).

The GTE is coded in C-Language as a tool to evaluate the performance ofthe proposed method. The core of GTE comprises of six parameters thatshould be determined prior to the searching process. These parametersare essentially embedded from the process of cross-matching andmutation as the nature of genetic algorithm, and experience ofdetermining the parameters shows that they have specific characteristics.Having optimal value of these parameters lead the computation to rapidconvergence to real shortest path sought. Limited computation effort isevaluated by utilizing real data of Sioux-Falls network which consists of24 nodes and 76 links and is considered as medium size network.

Findings of new approach (GTE) and its performance is very promisingand could be a threshold for further research and refinement.

Acknowledgment

This research is funded by Indonesia-Japan Toray Foundation, and to thefoundation the author is grateful.

References

1. Carbonell J., Paradigms for Machine Learnings, Machine LearningParadigms & Methods, Ed. Carbonell J., The MIT Press, page 1-9,1990.

2. Davis L., & Steenstrup M., Genetic Algorithms & SimulatedAnnealing : An Overview, Genetic Algorithms & SimulatedAnnealing, Ed. Davis L., Morgan Kaufman, page 1-11, 1987.

3. Dial, R.B., Probabilistic Multipath Traffic Assignment AlgorithmWhich Obviates Path Enumeration, Transportation Research 5(2),page 83-111, 1971.

4. Fisher Jr., D.H., Pazzani M.J. and Langley P., Concept Formation :Knowledge & Experience in Supervised Learning, Morgan Kaufman,1991.

5. Goldberg D.E., Genetic Algorithms in Search, Optimization andMachine Learning, Addison-Wesley, 1989.

6. Grefenstette J.J., Incorporating Problem Specific Knowledge intoGenetic Algorithms & Simulated Annealing, Ed. Davis L., MorganKaufman, page 42-60, 1987.



7. Grefenstette J.J.. & Baker J.E., How Genetic Algorithms Work : ACritical Look at Implicit Parallelism, Proceedings of The 3"'International Conference on Genetic Algorithms, Morgan Kaufman,page 20-27, 1989.

8. Holland J., Adaptation in Natural & Artificial Systems, Second Ed.,The MIT Press, 1992.

9. LeBlanc L.J., Morlok E.K, and Pierskalla W.P., An EfficientApproach to Solving The Road Network Equilibrium TrafficAssignment Problems, Transportation Research, Vol. 15, No. 2/4,page 115-124. 1975.

IQ.Michaelwics Z., Genetic Algorithms + Data structures = EvolutionPrograms, Springer-Verlag, 1992.

ll.Sheffi Y., Urban Transportation Networks : Equilibrium Analysiswith Mathematical Programming Methods, Prentice-Hall, Inc., 1985.

I. Sudarbo D.H. & Gorostiza Z.C., Use of Genetic Algorithms toOptimize Vehicle Wire Harness Costs, Internal Report, CondumesAutomotive Co., Livonia MI, USA, 1992.


Documents

Sutanto Soehodho Department of Civil Engineering, Faculty ... · Sutanto Soehodho Department of Civil Engineering, Faculty of Engineering, University of Indonesia, Depok 16424, Indonesia