
Pattern Recognition System Using Evolvable Hardware

Masaya Iwata

Electrotechnical Laboratory, Ibaraki, Japan 305-8568

Isamu Kajitani

Doctoral Program in Engineering, University of Tsukuba, Ibaraki, Japan 305-8573

Masahiro Murakawa

Graduate School of Engineering, The University of Tokyo, Tokyo, Japan 113-8656

Yuji Hirao

Graduate School of Engineering, University of Tokushima, Tokushima, Japan 770-8506

Hitoshi Iba

Graduate School of Engineering, The University of Tokyo, Tokyo, Japan 113-8656

Tetsuya Higuchi

Electrotechnical Laboratory, Ibaraki, Japan 305-8568

SUMMARY

We have developed a high-speed pattern recognition

system using evolvable hardware (EHW). EHW is hard-

ware that can change its own structure by genetic learning

for maximum adaptation to the environment. The purpose

of the system is to show that recognition devices based on

EHW are possible and that they have the same robustness

to noise as devices based on an artificial neural network

(ANN). The advantage of EHW compared with ANN is the

high processing speed and the readability of the learned

result. In this paper, we describe the learning algorithm, the

architecture, and the experiment involving a pattern recog-

nition system that uses EHW. We also compare the process-

ing speed of the pattern recognition system with two types

of ANN dedicated hardware and discuss the performance

of the system. © 2000 Scripta Technica, Syst Comp Jpn, 31(4): 1–11, 2000.

Systems and Computers in Japan, Vol. 31, No. 4, 2000. Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J81-D-II, No. 10, October 1998, pp. 2411–2420.


Key words: Genetic algorithms; evolvable hard-

ware; adaptive hardware; pattern recognition; neural net-

works.

1. Introduction

The genetic algorithm (GA) is a method of simulating

the genetic evolution process in biology. It has attracted

considerable attention as an algorithm for searching or

optimization [1]. As an application of GA, the idea of

evolvable hardware (EHW) was proposed independently in

Japan and in Switzerland around 1992 [4, 12]. EHW is

hardware that can adapt to a new environment not antici-

pated by the designer. Since this proposal, interest in EHW

has grown rapidly and an international conference on evolv-

able systems has been held [7, 16]. Conventional hardware

has no provisions for adaptation to an unanticipated environment because it cannot change its specification during operation. EHW, in contrast, can adapt to such environments, which is a very important feature. EHW uses

PLDs (programmable logic devices) or FPGAs (field pro-

grammable gate arrays) in order to change its own circuits.

The circuits are reconfigured through the use of genetic

algorithms to adapt to a new environment.

EHW is best suited for applications where hardware

specifications cannot be given in advance, a situation that is difficult for conventional hardware to handle. Applications solved by an

artificial neural network (ANN) are good examples of such

applications, in which pattern classifier functions can be

obtained only after learning has been completed.

The aim of the proposed system is to show that EHW

may have the potential to take the place of ANN when used

as a pattern recognition system. We expect EHW to work

as an ANN-like robust pattern recognizer that can accom-

plish noise-insensitive recognition. There are two advan-

tages of EHW over ANN. First, the processing speed is

much faster than that of ANN systems, whose execution is

mostly software-based. Second, the learned results of EHW

are readable. That means that the learned result is easily

expressed in terms of readable Boolean functions. In ANN,

in contrast, it is difficult to read the learned result because

it is represented solely by the enumeration of real values for

thresholds and weights. Readability is a particularly impor-

tant feature, which solves the problem of understanding the

learned result in an ANN.

This paper is structured as follows. Section 2 de-

scribes the EHW concept. Section 3 describes pattern rec-

ognition using EHW. It introduces the MDL (minimum

description length) [15] for increasing the capability of

noise-insensitive recognition, and VGA (variable-length

chromosome genetic algorithm) [10] for fast learning of a

large circuit. Section 4 describes the architecture of the

pattern recognition system using EHW and experiments on

the recognition of numerical characters. Section 5 com-

pares the processing speed of the EHW pattern recognition

system with that of dedicated ANN hardware. Section 6

discusses the recognition system and Section 7 concludes

the paper.

2. Evolvable Hardware

2.1. Basic idea

Evolvable hardware (EHW) [4] modifies its own

hardware structure in accordance with environmental

changes. EHW is implemented on a PLD (programmable

logic device) or an FPGA (field programmable gate array),

whose architecture can be altered by downloading a binary

bit string.

The basic idea of EHW is to regard the architecture

bits of a PLD as a chromosome for a GA (see Fig. 1). A

PLD can configure logic circuits in it by downloading a

binary bit string of "architecture bits." Architecture bits are

acquired adaptively by learning with genetic algorithms

(GAs). These architecture bits, that is, the GA chromosome,

are downloaded onto a PLD during genetic learning.

Therefore, EHW can be considered as online adaptive

hardware.

In this type of EHW, the hardware evolution is based

on primitive logic gates. We call it gate-level EHW. This

type has mainly been used since EHW was proposed [4–6,

18]. Another type is called function-level EHW [13, 14]. It

aims for a more complex process than gate-level EHW by

using functions, such as addition, as primitives of evolution.

An EHW using a hardware description language as the

primitive of evolution has also been proposed [3]. In this

paper we use gate-level EHW, which is the most fundamen-

tal and has the highest implementation ability. We explain

in this section the implementation method using a program-

mable logic device and learning with GA.

Fig. 1. Conceptual diagram of evolvable hardware.


2.2. Programmable logic device

Here we will explain the implementation method

using a PLD, with the simplified model shown in Fig. 2.

A PLD consists of logic cells and a fuse array. In

addition, architecture bits determine the architecture of the

PLD. Each dot of the fuse array corresponds to a bit in the architecture bit register (ABR).

The fuse array determines the interconnection be-

tween the device inputs and the logic cell. It also specifies

the logic cell's AND term inputs. If a link on a particular

row of the fuse array is switched on (indicated in Fig. 2 by

a black dot), then the corresponding input signal is con-

nected to the row. In the architecture bits, these black and

white dots are represented, respectively, by 1 and 0.

Consider the example PLD shown in Fig. 2. The first row indicates that I0 and Ī2 (the negation of I2) are connected by an AND term, which generates I0Ī2. Similarly, the second row generates I1. These AND terms are connected by an OR gate. Thus, the resultant output is O0 = I0Ī2 + I1.
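The Fig. 2 evaluation can be sketched in code. This is a minimal, illustrative model of an FPLA-style AND/OR array, assuming an encoding of each product-term row as (input index, polarity) pairs; it is not the paper's actual architecture-bit layout.

```python
# Minimal sketch of evaluating a PLD (FPLA-style AND/OR array).
# The row encoding below is an illustrative assumption, not the
# paper's actual architecture-bit format.

def eval_pld(and_array, or_mask, inputs):
    """and_array: one list per product term of (input index, positive?) pairs;
    or_mask: which product terms are connected to the OR gate;
    inputs: list of 0/1 input values."""
    terms = []
    for row in and_array:
        value = 1
        for idx, positive in row:
            bit = inputs[idx] if positive else 1 - inputs[idx]
            value &= bit  # AND together all connected (possibly negated) inputs
        terms.append(value)
    out = 0
    for term, connected in zip(terms, or_mask):
        if connected:
            out |= term  # OR together the selected product terms
    return out

# The example of Fig. 2: row 1 generates I0*~I2, row 2 generates I1,
# so O0 = I0*~I2 + I1.
and_array = [[(0, True), (2, False)], [(1, True)]]
or_mask = [1, 1]
print(eval_pld(and_array, or_mask, [1, 0, 0]))  # 1: the I0*~I2 term fires
print(eval_pld(and_array, or_mask, [0, 0, 1]))  # 0: neither term fires
```

Regarding the concatenation of all such connection flags as one binary string is exactly the "architecture bits as chromosome" idea of Fig. 1.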

As mentioned above, both the fuse array and the

function of the logic cell are represented in a binary string.

For the sake of GA-based adaptive search, the key idea of

EHW is to regard this binary bit string as a chromosome.

The hardware structure we actually use is an FPLA

device [17], which is a commercial PLD (Fig. 3)*. This

architecture consists mainly of an AND and an OR array. A

vertical line in the OR array corresponds to a logic cell in

Fig. 2. This device can configure any binary logic circuits.

2.3. Genetic learning

We will now describe the genotype representation of

EHW and the genetic learning method.

In our earlier works on gate-level EHW, we regarded

the architecture bits as the GA chromosome, with a fixed

length. Despite this simple representation, the hardware

evolution was successful for combinatorial logic circuits

(e.g., six-multiplexer [4]) and sequential logic circuits (e.g.,

finite-state automaton, 3-bit counter [5]).

However, this straightforward representation im-

posed a serious limitation on hardware evolution. It re-

quired inclusion of all of the fuse array bits in the genotype,

even when only a few bits in the fuse array were effective.

This made the chromosome too long to be effectively

searched by the evolutionary process.

Consequently, we introduced a new GA based on a

variable-length chromosome called VGA [10]. VGA is

expected to evolve a large circuit more quickly. The chro-

mosome length of the VGA is smaller than that of the earlier GA,

especially when a circuit with a large number of inputs is

being evolved. In this paper, we confirmed the effectiveness

of VGA by experiments. VGA is described in more detail

in Section 3.3.

The fitness evaluation of the GA is based on the

correctness of the EHW's output for the training data set.

In the pattern recognition system we introduce the MDL

(minimum description length) [15] for the fitness evalu-

ation. Using the MDL, the robustness in recognizing noisy

patterns is expected to increase (for further details, see

Section 3.2).

3. Pattern Recognition Using EHW

3.1. Motivation

The aim of our system is to implement high-speed

pattern recognition for the purpose of establishing a robust

system in noisy environments using EHW. This robustness

seems to be the main feature of ANNs. ANNs are run mostly

in a software-based way, that is, executed by a workstation.

Fig. 3. An FPLA architecture.

*We cannot use commercial PLDs for EHW because the rewriting time of

the fuse array is slow. We implement EHW by making PLDs that rewrite

fuse arrays quickly on FPGAs.

Fig. 2. A simple PLD structure.


Thus, current ANNs may have difficulty with real-time

processing because of the speed limit of the software-based

execution. On the other hand, EHW can execute the recog-

nition process faster than ANNs can because the learned

result of EHW is represented by hardware.

Another desirable feature of EHW is its readability.

The learned result using EHW is expressed as a Boolean

function, whereas ANN represents it as thresholds and

weights. Thus, the acquired result using EHW is more

easily understood than that of an ANN. We believe that this

understandability feature allows wider usage of EHW in

industrial applications because the maintenance becomes

easier.

For the sake of achieving flexible recognition capa-

bility, it is necessary to cope with a pattern that is classifi-

able not by a linear function, but by a nonlinear function.

To check the above capability, we have conducted an ex-

periment in learning the exclusive-OR problem. From the

results of the simulation, we confirmed that EHW could

learn nonlinear functions successfully [6]. In other words,

we consider that EHW will fulfill the minimum require-

ment toward robust pattern recognition. In this paper we

show the potential of EHW to take the place of ANNs by

using typical experiments on a developed pattern recogni-

tion system using EHW.

The pattern recognition procedure consists of two

phases. The first phase is the learning of the training pat-

terns. The training patterns are genetically learned by EHW.

We use the VGA and MDL-based fitness described in

Sections 3.2 and 3.3. The second phase is the recognition

of test patterns. Our aim is pattern recognition that is

insensitive to noise.

3.2. Fitness evaluation by MDL

MDL is an information criterion in machine learning

used to predict the rest of a data set from the given data [15]. Using the MDL for pattern classification is an

effective way to obtain a noise-insensitive classifier function.

In this section, we will describe the principle of MDL and

show how to apply it to pattern recognition using EHW.

MDL is based on a "simplicity criterion," which prescribes that simpler is better. Statisticians have studied simplicity criteria for many years. MDL estimates a model of the data source in order to predict further data. Here, "model" means a statistical model, that is, a probability distribution subject to statistical constraints. The complexity of the source is defined by the MDL of the model. Thus, MDL selects the model that minimizes the following sum [8]:

MDL = desc len (model) + desc len (error) → min      (1)

desc len (model) is the description length of the

model. desc len (error) is the code length of the error when

encoded using the model as a predictor for the data. The

sum MDL represents the trade-off between model complex-

ity (the first term) and residual error (the second term),

including a structure estimation term for the final model.

The final model with the minimal MDL is optimum in the

sense of being a consistent estimate of the number of

parameters while achieving the minimum error.

The reason for using the MDL is to let the EHW learn

a classifier function that is noise-insensitive, since a noise-

sensitive function is prone to overfitting. When we use the

MDL as the decision of classifier function, it treats the more

noise-insensitive classifier function, that is, the simpler

function, as a better classifier function because it will

predict the rest of the data more correctly [9]. Thus, the

MDL is defined so as to choose simpler and more general

classifier functions.

The MDL is introduced into the GA fitness evalu-

ation. For example, the MDL is used as the fitness function

to avoid overfitting of GA learning [11].

We have introduced the above MDL criterion into the

GA fitness evaluation for pattern recognition. The goal is

to establish a robust learning method for EHW. In general,

the greater the number of indifferent (i.e., don't care)*

inputs, the more robust (i.e., noise-insensitive) the evolved

hardware. The reason is that even if noise is present in a

"don't care" input, the noise does not affect the output.

Thus, we regard the number of indifferent inputs as an index

of MDL.

More formally, the fitness value F for our EHW using

MDL is derived from Eq. (1) as follows:

F = 1 - FM      (2)

FM = Ac log(C + 1) + (1 - Ac) log(E + 1)      (3)

FM denotes the MDL of the EHW. The smaller the MDL value, the better; Eq. (2) inverts it so that a smaller MDL yields a larger fitness. Ac is the coefficient of the first term of FM (0 < Ac < 1); C denotes the complexity of the EHW; E is the error rate of the EHW's output on the training data. FM is normalized so that it lies in the range 0 ≤ FM ≤ 1.

The goal of this expression is to minimize the MDL

value and to find the best trade-off between the complexity

and the error rate of the EHW. The C value (i.e., the

complexity of the EHW) determines the performance of the

MDL. In this paper we introduce three definitions of C as

shown in the appendix.
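As a concrete sketch, Eqs. (2) and (3) might be computed as follows. The normalization constants max_C and max_E are assumptions introduced here; the paper states only that FM is normalized into the range 0 to 1, and the three actual definitions of C are given in the appendix.

```python
import math

def mdl_fitness(C, E, Ac=0.2, max_C=64.0, max_E=1.0):
    """Fitness F = 1 - FM with FM = Ac*log(C+1) + (1-Ac)*log(E+1).
    C: circuit complexity; E: error rate on the training set (0..1).
    Each log term is scaled by an assumed maximum so that 0 <= FM <= 1."""
    model_term = math.log(C + 1) / math.log(max_C + 1)  # description length of the model
    error_term = math.log(E + 1) / math.log(max_E + 1)  # description length of the error
    FM = Ac * model_term + (1 - Ac) * error_term
    return 1 - FM

# A small, error-free circuit gets a higher fitness than a complex,
# slightly erroneous one, which is the trade-off the MDL encodes.
print(mdl_fitness(C=3, E=0.0) > mdl_fitness(C=40, E=0.1))  # True
```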

*We call an input "don't care" if it is not included in the output expression. For instance, if O = I1 + I2 for the PLD shown in Fig. 2, then I0 is a "don't care" input.


3.3. Variable-length chromosome GA

We here introduce a new GA based on the variable-

length chromosome, called the VGA, to increase the per-

formance of the GA [10]. In conventional EHW, the totality

of the architecture bits of the PLDs was regarded as one

chromosome of the GA. We call this method the simple GA

(SGA). But in the pattern recognition problem involving a

two-dimensional image, a large number of inputs are

needed. This causes an increase in chromosome length,

leading to an increase in GA learning time and restrictions

on the size of the evolved circuit.

Compared with the SGA, the chromosome length of

the VGA is smaller, especially in the evolution of a circuit

with a large number of inputs. This is because the VGA can

deal with that part of the architecture bits which effectively

determines the hardware structure [10]. Because of this

short chromosome, the VGA can increase the maximum

circuit size and establish an efficient adaptive search.

The coding method of the VGA is described in Fig.

4. An example of a chromosome and a representation of an

allele are shown in Fig. 4(a). An allele in a chromosome

consists of a location and a connection type. The location

is the position of the allele in the fuse array. There are two

kinds of connection. The AND connection type defines the

input of the AND array as either positive or negative. The

OR connection type defines the output of the AND array as

either connected or not connected to the input of the OR

array. For example, denoting an allele as (0,1) means that

the connection type at location 0 is 1. One chromosome is

represented by a string of alleles. Chromosome length is the

number of alleles in a chromosome. By converting each

allele into the connection pattern of the PLD, the chromo-

some is converted into the architecture bits defining the

PLD, as shown in Fig. 4(b).
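The allele-to-architecture-bits conversion can be sketched as follows, under the simplifying assumption of a single flat fuse array indexed by allele location; the actual layout of Fig. 4 additionally distinguishes AND-array and OR-array connection types.

```python
# Hypothetical sketch: decode a variable-length chromosome (a list of
# (location, connection_type) alleles) into fixed-length architecture bits.

def decode(chromosome, fuse_array_size):
    bits = [0] * fuse_array_size          # unmentioned locations stay 0
    for location, connection_type in chromosome:
        bits[location] = connection_type  # each allele sets one fuse
    return bits

# Three alleles suffice to describe this 8-bit fuse array.
print(decode([(0, 1), (3, 1), (5, 0)], 8))  # [1, 0, 0, 1, 0, 0, 0, 0]
```

This illustrates why the chromosome stays short: only the fuses that actually appear are encoded, whereas the SGA chromosome carries every bit of the fuse array.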

We use the following method for genetic operation.

We adopt the roulette wheel selection strategy. The recom-

bination operators are cut and splice (Fig. 5), which are used

in the messy GA [2]. The cut probability is given by pc = (λ - 1)pk, where λ is the chromosome length (the number of alleles in a chromosome) and pk is a real number between 0 and 1; if pc exceeds 1, it is capped at pc = 1. The mutation operation selects

numbers in alleles randomly and changes the values.

For further details about the VGA, refer to Ref. 10.
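The cut-and-splice operators of Fig. 5 can be sketched as below. The cut probability pc = (λ - 1)pk capped at 1 follows the text; the cut-point choice and which fragments are spliced are illustrative assumptions.

```python
import random

def cut(chromosome, pk):
    """Cut into two fragments with probability pc = (len-1)*pk, capped at 1."""
    pc = min(1.0, (len(chromosome) - 1) * pk)
    if len(chromosome) > 1 and random.random() < pc:
        point = random.randrange(1, len(chromosome))
        return chromosome[:point], chromosome[point:]
    return chromosome, []  # no cut: one whole fragment, one empty

def splice(fragment_a, fragment_b):
    """Concatenate two fragments into one (possibly longer) chromosome."""
    return fragment_a + fragment_b

# Offspring lengths vary, which is what makes the chromosome variable-length.
random.seed(0)
parent1 = [(0, 1), (5, 0), (9, 1)]
parent2 = [(2, 1), (7, 1)]
head, _ = cut(parent1, pk=0.5)
child = splice(head, parent2)
print(child)  # a variable-length offspring beginning with parent1's alleles
```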

4. Experiment with Pattern Recognition

System

4.1. System characteristics

We have developed a pattern recognition system us-

ing EHW. The organization of the system is shown in Fig.

6. It consists of an EHW board that incorporates four FPGA

chips (Xilinx 4025), a DOS/V machine, and an input tablet

for drawing patterns. The DOS/V machine handles GA

operations, the control of the EHW board, and the display

of patterns. The PLD on the FPGA is reconfigurable, which

means that the system can be used as a universal EHW system.

Fig. 4. Representation of the chromosome of the variable-length chromosome GA (VGA).

Fig. 5. Cut & splice operator of variable-length

chromosome GA (VGA).


An overview of the EHW board is shown in Fig. 7,

and a block diagram is shown in Fig. 8. In the EHW board,

there are four FPGAs (hatched areas in the figure), board

control registers, and SRAM that stores the configuration

data of the FPGA. In the EHW, a circuit represented by a

chromosome is obtained by an ABR (architecture bit regis-

ter) and a PLD. The ABR stores architecture bits of the

PLD. The PLD has the architecture of an FPLA device (Fig.

3). In this figure, there are K individuals, that is, K pairs of

an ABR and a PLD in an FPGA chip. In the first version of

this system, we designed a genetically reconfigurable hard-

ware device having four FPGAs. The processing time of the

EHW board was 720 ns.

4.2. Experimental results

We conducted an experiment on recognizing the bi-

nary patterns of 8 × 8 pixels. There are 30 input patterns of

64 bits in the training set, as shown in Fig. 9. Three patterns

represent numerical characters (i.e., 0, 1, and 2) unambigu-

ously. The other 27 patterns represent the same numerical

characters with noise (5 bits have been flipped at random).

The outputs of the EHW consist of 3 bits; each bit corre-

sponds to one of three characters. The initial length of a

chromosome is 100. The probability of the cut operator pc

is 0.1 when the chromosome has its initial length, the

probability of the splice operator is 0.1, and the mutation

probability is 0.01. The number of lines in the AND array of the PLD is 16. The test data set consists of 30 patterns,

which are generated with random noise (from 1 to 5 bits

have been flipped at random).

Fig. 7. The EHW board.

Fig. 8. Block diagram of the EHW board.

Fig. 9. Training patterns.

Fig. 6. Block diagram of pattern recognition system.


Four different learning methods were examined:

three kinds of MDL-based EHW with three MDL defini-

tions [MDL1, MDL2, and MDL3 which correspond, re-

spectively, to Eqs. (A.1) to (A.3) in the appendix], and a

non-MDL EHW. For the MDL-based EHWs, Ac in Eq. (3) is 0.2 for MDL1 and MDL2, and 0.1 for MDL3. The fitness of the non-MDL EHW is simply the error rate of the EHW on the training data. We repeated the learning and the test for the

same pattern 10 times for all MDLs. The recognition result

of the test set (the average of 10 trials) is plotted in Fig. 10.

From the figure, it is clear that MDL-based EHWs give

better performance for noisy patterns than EHW without a

MDL.

An important feature of the EHW is that the resultant

expression can be represented by a simple Boolean func-

tion. For example, in one run, the learning results in the case

of MDL3 are O0 = I22 I46 Ī58, O1 = Ī8 I20 Ī37, and O2 = I18 Ī50 Ī54 + I11 Ī38 Ī63, where Ii (0 ≤ i ≤ 63) indicates the value of the pixel in the pattern and Oi is the recognition output for the pattern of character i. In the ANN, in contrast, it is represented

solely by the enumeration of real values for the thresholds

and weights. Clearly, the results obtained by the EHW are

easier to understand than those obtained by ANN.
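Because the learned result is a small Boolean function, the whole recognizer can be written out directly. The sketch below hard-codes the MDL3 example from the run above (those pixel indices are specific to that one run):

```python
def recognize(p):
    """p: 64-element list of 0/1 pixel values (the 8x8 pattern).
    Returns the outputs (O0, O1, O2), one per numeral."""
    NOT = lambda b: 1 - b  # "bar" (negated) inputs
    O0 = p[22] & p[46] & NOT(p[58])
    O1 = NOT(p[8]) & p[20] & NOT(p[37])
    O2 = (p[18] & NOT(p[50]) & NOT(p[54])) | (p[11] & NOT(p[38]) & NOT(p[63]))
    return O0, O1, O2

# Only 3 pixels decide O0; the other 61 are "don't care", so flipping
# them (noise) cannot change that output - the source of the robustness.
pattern = [0] * 64
pattern[22] = pattern[46] = 1
print(recognize(pattern))  # (1, 0, 0)
```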

5. Comparison of Processing Speed

between EHW and Dedicated ANN

Hardware

We now compare the processing speed of the EHW

pattern recognition system with that of two kinds of dedi-

cated ANN hardware. We consider first the recognition

processing speed after learning. In the second step we

consider the learning time, but that will not be the main

focus, because the learning time of the EHW depends on

the execution time of the software.

EHW is compared with two types of hardware for

ANN: CNAPS Server II (Adaptive Solutions, Inc.) and

MY-NEUPOWER (Hitachi Microcomputer Systems, Inc.).

Both have 512 processors simulating neurons in SIMD

(single instruction multiple data stream) type parallel proc-

essing. In CNAPS and MY-NEUPOWER, we used a three-layer network (64 input, 3 hidden, and 3 output units) trained by backpropagation. The programs are

written in CNAPS-C and MY-PARAL, the special lan-

guages for CNAPS and MY-NEUPOWER, respectively.

5.1. Recognition time

Measurement of the recognition time after learning

is described as follows. In EHW, the time is measured with

a logic analyzer from the beginning of transmission of the

input pattern from the PC (DOS/V, 90 MHz) to the EHW

board until the recognition result is sent back to the PC. For

CNAPS we measured the processing speed by clock num-

ber, and for MY-NEUPOWER by using the timer on the

host computer. When measuring the speed of MY-NEU-

POWER, we also measured the overhead for interaction

with the host computer and then subtracted the overhead

time from the measured time.

In Table 1 we show the result of measuring the time

for the recognition process after learning. The recognition

speed of the EHW is fastest. It is 1.7 times as fast as that of

CNAPS and 9.2 times as fast as that of MY-NEUPOWER.

The reason is that the recognition process of the EHW uses

only a learned logic circuit. In contrast, the ANN hardware

needs to perform some multiplication and addition opera-

tions for each neuron.

The pattern recognition we tested here was executed

with small numbers of neurons. Accordingly, we could not

utilize enough parallelism in the hardware for the ANN. In

our experiment, three neurons in hidden and output layers

were operated in parallel. In comparison, the hardware for

the ANN can operate up to 512 neurons in parallel. These types of hardware will show better relative performance on tasks that require more neurons in the hidden and output layers.

5.2. Correctness

Another interesting comparison criterion is the cor-

rectness on a test data set. We tested the same test set as

Fig. 10. Result of recognition by test set.

Table 1. Result of measurement of recognition time


described in Section 4.2. The correctness for CNAPS and

MY-NEUPOWER was perfect for all test sets that had

noise. The correctness for EHW depends on the number of

bits of noise in the test set (Fig. 10). For 1 bit of noise, it is

about 97% on average and 100% maximum, showing that

EHW is as robust as ANN is when the amount of noise is

small. When there is more noise in the test set, ANN is better

able to recognize noisy patterns than EHW. The reason is

that EHW is good at learning particular Boolean functions

consisting of small terms, such as functions for pattern

recognition (as shown in section 4.2), multiplexers, and so

on [4, 5]. If the learning algorithm is improved to obtain

more complicated functions, the probability of correctness

for the EHW will improve over the current result.

5.3. Learning time

Next we note the learning time as reference data. The

conditions for measuring the learning time were as follows.

We measured the time from the beginning of learning until

the training set was learned. In EHW, the learning of the

training set continued until the MDL (described in Section

3.2) became stable, that is, after 2000 generations, and we

selected the generation in which the best MDL value was

obtained. When the best MDL was obtained, the output

error was 0%. In ANN, the training set is learned until the

average of the error of each output drops to less than 1%.

We measured the processing time of the EHW using the

timer on the PC, and measured the processing time of

CNAPS and MY-NEUPOWER in the same way as for the

recognition time.

The results of measuring the learning time are shown

in Table 2. It is seen that the learning speed of EHW is about

four orders of magnitude slower than that of neural hard-

ware. There are two explanations. One is that the learning

speed of the EHW depends on the execution time of the GA

program on the PC. In contrast, the hardware for the ANN

can perform high-speed learning using hardware. If we

design the hardware for learning and put it on the EHW

board, then the learning will be accomplished much faster

[5]. The other reason is the number of iterations in learning.

The learning was completed in about 10 iterations in the

ANN, while in the EHW it took about 1200 generations to

finish. Improvement of the learning algorithm in EHW is a

subject for future study.

6. Discussion

In this section we discuss (1) Boolean functions with

high ability for recognizing noisy patterns, (2) the advan-

tages of the VGA over the SGA, and (3) how to improve

the processing speed of the pattern recognition system.

First we discuss which Boolean function has high

ability for recognizing noisy patterns. Roughly speaking,

the Boolean function with better recognition ability is the

function that has fewer inputs, that is, a large number of

indifferent inputs. We confirmed that we could obtain such

a function using the MDL. However, we can obtain func-

tions that are more robust by adding more terms to the

equation. A method of obtaining such functions is also a

subject for future research.

Next we discuss the advantages of the VGA over the

SGA. In the pattern recognition system, we used the VGA

instead of the SGA. The main advantage of the VGA in

pattern recognition is that we can handle larger inputs than

can be done using the SGA. For example, the EHW was

able to learn three patterns of 16 inputs each by the SGA

with a chromosome length of 840. In comparison, when

using the VGA, the EHW can learn three patterns of 64

inputs with an average chromosome length of 187.6. In

addition, learning by the VGA is much faster than by the

SGA: 416.7 generations using the VGA versus 4053 using

the SGA. The reason that the VGA can handle larger inputs

than the SGA is that the VGA encodes into the chromosome

only those inputs which actually generate AND terms, so

that the chromosome length can be kept small. If the SGA

is used for problems of this nature, we incur an increase of

chromosome length because of the many inputs, leading to

an increase of GA execution time. In addition, the VGA has

very good matching with the MDL because the MDL

directs the GA search to find smaller circuits, that is, smaller

chromosomes as shown in Section 3.2. Thus, we can say

that the VGA is suitable for pattern recognition problems

because it handles many inputs and learns small circuits.

Next we discuss how to improve the processing speed

of the pattern recognition system. The pattern recognition

system offers some possibilities for improving processing

speed. We can make it at least 20 times faster if we improve

the timing of the control signals on the ISA board used as

the interface with the PC, and if we implement one PLD per

FPGA.

Table 2. Result of measurement of learning time


7. Conclusions

We have developed a pattern recognition system using EHW. The aim of the system is to recognize noisy or incomplete patterns, as neural networks do. The advantages of EHW over ANNs are its high processing speed and the readability of the learned results. We described a learning algorithm that combines the MDL criterion with the VGA. A noise-insensitive function was obtained effectively by using the MDL as the fitness function of the GA, and with the VGA the EHW handled more inputs and learned faster than with the simple GA. To demonstrate the feasibility of EHW for noise-insensitive recognition, we built the pattern recognition system and conducted experiments on recognizing noisy patterns. The experiments confirmed that EHW with the MDL-based fitness function can recognize noisy patterns, and that the learned results are readable. Finally, we compared the processing speed of the EHW with that of dedicated ANN hardware and confirmed that the EHW recognizes patterns faster.

Acknowledgments. This research was supported by

the MITI Real World Computing (RWC) Project. The

authors would like to express their thanks to Dr. Otsu,

director of the Machine Understanding Division at ETL,

and Dr. Ohmaki, director of the Computer Science Division

at ETL, for their support and encouragement. The authors

thank OA Laboratory Co., Ltd. for their cooperation in the

fabrication of the system. They also thank Mr. Umeki at

Toshiba Corporation, Research and Development Center,

and Mr. Mukai at RWC Tsukuba Research Center, for

assistance in using CNAPS and MY-NEUPOWER, respec-

tively.

REFERENCES

1. Goldberg D. Genetic algorithms in search, optimization, and machine learning. Addison-Wesley; 1989.

2. Goldberg D, Deb K, Kargupta H, Harik G. Rapid, accurate optimization of difficult problems using fast messy genetic algorithms. Proc 5th Int Conf on Genetic Algorithms, Illinois, p 56-64, 1993.

3. Hemmi H, Mizoguchi J, Shimohara K. Development and evolution of hardware behavior. In: Sanchez E, Tomassini M, editors. Towards evolvable hardware: The evolutionary engineering approach. LNCS 1062. Springer-Verlag; 1996. p 250-265.

4. Higuchi T, Niwa T, Tanaka T, Iba H, de Garis H, Furuya T. Evolvable hardware with genetic learning. Proc Simulation of Adaptive Behavior, Honolulu, p 417-424, MIT Press, 1992.

5. Higuchi T, Iba H, Manderick B. Evolvable hardware. In: Kitano H, Hendler JA, editors. Massively parallel artificial intelligence. MIT Press; 1994. p 398-421.

6. Higuchi T, Iwata M, Kajitani I, Iba H, Hirao Y, Furuya T, Manderick B. Evolvable hardware and its applications to pattern recognition and fault-tolerant systems. In: Sanchez E, Tomassini M, editors. Towards evolvable hardware: The evolutionary engineering approach. LNCS 1062. Springer-Verlag; 1996. p 118-135.

7. Higuchi T, Iwata M, Liu W, editors. Evolvable systems: From biology to hardware. LNCS 1259. Springer-Verlag; 1997.

8. Iba H, de Garis H, Sato T. Genetic programming using a minimum description length principle. In: Advances in genetic programming. MIT Press; 1994. p 265-284.

9. Itoh S. Application of MDL principle to pattern classification problems. J Jpn Soc Artif Intell 1992;7:608-614. (in Japanese)

10. Kajitani I, Hoshino T, Iwata M, Higuchi T. Variable length chromosome GA for evolvable hardware. Proc 3rd Int Conf on Evolutionary Computation, Nagoya, p 443-447, 1996.

11. Konagaya A, Kondo Y. Stochastic motif extraction using a genetic algorithm with the MDL principle. Hawaii Int Conf on System Sciences, 1993.

12. Marchal P, Piguet C, Mange D, Stauffer A, Durand S. Embryological development on silicon. In: Artificial life IV. MIT Press; 1994. p 365-370.

13. Murakawa M, Yoshizawa S, Kajitani I, Furuya T, Iwata M, Higuchi T. Hardware evolution at function level. In: Parallel problem solving from nature IV. Springer-Verlag; 1996. p 62-71.

14. Murakawa M, Yoshizawa S, Kajitani I, Higuchi T. Evolvable hardware for generalized neural networks. Proc 15th Int Joint Conf on Artificial Intelligence. Morgan Kaufmann; 1997. p 1146-1151.

15. Rissanen J. Stochastic complexity in statistical inquiry. World Scientific Series in Computer Science Vol. 15. World Scientific; 1989.

16. Sanchez E, Tomassini M, editors. Towards evolvable hardware: The evolutionary engineering approach. LNCS 1062. Springer-Verlag; 1996.

17. Sasao T. How to make and how to use PLA. Nikkan Kogyo Shinbun; 1986. p 2. (in Japanese)

18. Thompson A. Evolving electronic robot controllers that exploit hardware resources. Proc 3rd European Conf on Artificial Life, Granada, p 640-656, 1996.


APPENDIX

Definition of Complexity Value for MDL

The C value, the complexity value used in the MDL (i.e., the complexity of the EHW), determines the performance of the MDL criterion. We introduce three definitions, as follows:

C1 = Σ_i |AND_Oi|                  (A.1)
C2 = |AND| × |OR|                  (A.2)
C3 = Σ_i |AND_Oi| × |OR_Oi|        (A.3)

where |AND_Oi| and |OR_Oi| are the numbers of AND and OR gates connected to the output Oi, and |AND| (|OR|) is the number of AND (OR) gates on the AND (OR) array. Consider Fig. 4(b) as an instance, in which the AND and OR gates are marked by symbols. Since |AND_O0|, |OR_O0|, |AND_O1|, |OR_O1|, |AND|, and |OR| are 1, 1, 2, 2, 3, and 3, the values of C1, C2, and C3 are 3 (= 1 + 2), 9 (= 3 × 3), and 5 (= 1 × 1 + 2 × 2), respectively.

The definition of C1 is not very precise because it ignores the OR gates; C2 and C3 are expected to give more exact MDL values. We tested several other definitions of complexity, and in this paper we presented the best three.
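
The arithmetic of the Fig. 4(b) example can be checked with a short sketch. The per-output gate counts are taken from the text above; the function names and list layout are illustrative assumptions, not part of the original system.

```python
# The three complexity definitions (A.1)-(A.3), applied to the per-output
# gate counts of the Fig. 4(b) example; the data layout is illustrative.

def c1(and_per_output):
    """C1 = sum over outputs of |AND_Oi| (ignores the OR gates)."""
    return sum(and_per_output)

def c2(total_and, total_or):
    """C2 = |AND| * |OR|, the sizes of the whole AND and OR arrays."""
    return total_and * total_or

def c3(and_per_output, or_per_output):
    """C3 = sum over outputs of |AND_Oi| * |OR_Oi|."""
    return sum(a * o for a, o in zip(and_per_output, or_per_output))

# Fig. 4(b): |AND_O0| = 1, |OR_O0| = 1, |AND_O1| = 2, |OR_O1| = 2,
#            |AND| = 3, |OR| = 3
ands = [1, 2]
ors = [1, 2]
print(c1(ands), c2(3, 3), c3(ands, ors))  # -> 3 9 5
```

The output matches the worked example: C1 = 3, C2 = 9, C3 = 5, and one can see directly why C3 is finer-grained than C2: it multiplies gate counts per output rather than over the whole arrays.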

AUTHORS (from left to right)

Masaya Iwata (member) received his B.E., M.E., and Ph.D. degrees in applied physics from Osaka University in 1988,

1990, and 1993. He was a postdoctoral fellow at ONERA-CERT, Toulouse, France, in 1993. He is currently a senior researcher

at the Electrotechnical Laboratory. His research interests are in genetic algorithms. He is a member of the IEICE and the

Information Processing Society of Japan.

Isamu Kajitani (nonmember) received his B.E., M.E., and Ph.D. degrees in engineering from the University of Tsukuba in 1994, 1996, and 1999. He is currently working at the Electrotechnical Laboratory as a proposal-based researcher of the New Energy and Industrial Technology Development Organization (NEDO). His research interests are engineering applications of genetic algorithms. He is a member of the Japanese Society for Artificial Intelligence.

Masahiro Murakawa (nonmember) received his B.E., M.E., and Ph.D. degrees in mechano-informatics engineering from the University of Tokyo in 1994, 1996, and 1999. He is currently a researcher at the Electrotechnical Laboratory. His research interests include evolutionary algorithms, reconfigurable computing, neural networks, and reinforcement learning. He received the best paper award at the Second International Conference on Evolvable Systems. He is a member of the Information Processing Society of Japan and the Japanese Neural Network Society.

Yuji Hirao (nonmember) received his B.E. degree in electrical and electronic engineering from the University of Tokushima in 1986. He joined Kawasaki Heavy Industries, Ltd. in 1986, and the Tokushima Prefectural Industrial Technology Center in 1990. He is currently in the Graduate School of Engineering, University of Tokushima. His research interests are in robot systems and genetic algorithms. He is a member of the Information Processing Society of Japan and the Institute of System, Communication and Information Engineers.


AUTHORS (continued) (from left to right)

Hitoshi Iba (nonmember) received his B.E., M.E., and Ph.D. degrees in information science from the University of Tokyo

in 1985, 1987, and 1990. He is currently an assistant professor there. His research interests are in artificial intelligence,

evolutionary algorithms, and robotics. He is a member of the Information Processing Society of Japan and the Japanese Society

for Artificial Intelligence.

Tetsuya Higuchi (member) received his B.E., M.E., and Ph.D. degrees in electrical engineering from Keio University in

1978, 1980, and 1984. He was a visiting researcher at Carnegie Mellon University in 1990. He heads the Evolvable Systems

Laboratory at the Electrotechnical Laboratory. His research interests include genetic algorithms and parallel associative

processing architecture. He is a member of IEICE and the Information Processing Society of Japan.
