Pattern Recognition System Using Evolvable Hardware
Masaya Iwata
Electrotechnical Laboratory, Ibaraki, Japan 305-8568
Isamu Kajitani
Doctoral Program in Engineering, University of Tsukuba, Ibaraki, Japan 305-8573
Masahiro Murakawa
Graduate School of Engineering, The University of Tokyo, Tokyo, Japan 113-8656
Yuji Hirao
Graduate School of Engineering, University of Tokushima, Tokushima, Japan 770-8506
Hitoshi Iba
Graduate School of Engineering, The University of Tokyo, Tokyo, Japan 113-8656
Tetsuya Higuchi
Electrotechnical Laboratory, Ibaraki, Japan 305-8568
SUMMARY
We have developed a high-speed pattern recognition
system using evolvable hardware (EHW). EHW is hard-
ware that can change its own structure by genetic learning
for maximum adaptation to the environment. The purpose
of the system is to show that recognition devices based on
EHW are possible and that they have the same robustness
to noise as devices based on an artificial neural network
(ANN). The advantages of EHW over ANN are its high
processing speed and the readability of the learned
result. In this paper, we describe the learning algorithm, the
architecture, and the experiment involving a pattern recog-
nition system that uses EHW. We also compare the process-
ing speed of the pattern recognition system with two types
of ANN dedicated hardware and discuss the performance
of the system. © 2000 Scripta Technica, Syst Comp Jpn,
31(4): 1–11, 2000
Systems and Computers in Japan, Vol. 31, No. 4, 2000. Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J81-D-II, No. 10, October 1998, pp. 2411–2420.
Key words: Genetic algorithms; evolvable hard-
ware; adaptive hardware; pattern recognition; neural net-
works.
1. Introduction
The genetic algorithm (GA) is a method of simulating
the genetic evolution process in biology. It has attracted
considerable attention as an algorithm for searching or
optimization [1]. As an application of GA, the idea of
evolvable hardware (EHW) was proposed independently in
Japan and in Switzerland around 1992 [4, 12]. EHW is
hardware that can adapt to a new environment not antici-
pated by the designer. Since this proposal, interest in EHW
has grown rapidly and an international conference on evolv-
able systems has been held [7, 16]. Conventional hardware
cannot adapt to an environment the designer did not antici-
pate, because its specification cannot be changed during
operation. EHW, by contrast, can make such adaptations;
this is a very important feature. EHW uses
PLDs (programmable logic devices) or FPGAs (field pro-
grammable gate arrays) in order to change its own circuits.
The circuits are reconfigured through the use of genetic
algorithms to adapt to a new environment.
EHW is best suited for applications in which the hardware
specification cannot be given in advance and which are
therefore difficult for conventional hardware. Applications
solved by an artificial neural network (ANN), in which a
pattern classifier function is obtained only after learning has
been completed, are good examples.
The aim of the proposed system is to show that EHW
may have the potential to take the place of ANN when used
as a pattern recognition system. We expect EHW to work
as an ANN-like robust pattern recognizer that can accom-
plish noise-insensitive recognition. There are two advan-
tages of EHW over ANN. First, the processing speed is
much faster than that of ANN systems, whose execution is
mostly software-based. Second, the learned results of EHW
are readable. That means that the learned result is easily
expressed in terms of readable Boolean functions. In ANN,
in contrast, it is difficult to read the learned result because
it is represented solely by the enumeration of real values for
thresholds and weights. Readability is a particularly impor-
tant feature, as it overcomes the difficulty of understanding
the learned result that ANNs present.
This paper is structured as follows. Section 2 de-
scribes the EHW concept. Section 3 describes pattern rec-
ognition using EHW. It introduces the MDL (minimum
description length) [15] for increasing the capability of
noise-insensitive recognition, and VGA (variable-length
chromosome genetic algorithm) [10] for fast learning of a
large circuit. Section 4 describes the architecture of the
pattern recognition system using EHW and experiments on
the recognition of numerical characters. Section 5 com-
pares the processing speed of the EHW pattern recognition
system with that of dedicated ANN hardware. Section 6
discusses the recognition system and Section 7 concludes
the paper.
2. Evolvable Hardware
2.1. Basic idea
Evolvable hardware (EHW) [4] modifies its own
hardware structure in accordance with environmental
changes. EHW is implemented on a PLD (programmable
logic device) or an FPGA (field programmable gate array),
whose architecture can be altered by downloading a binary
bit string.
The basic idea of EHW is to regard the architecture
bits of a PLD as a chromosome for a GA (see Fig. 1). A
PLD configures its internal logic circuit according to a
downloaded binary bit string of "architecture bits." The
architecture bits are acquired adaptively by learning with
genetic algorithms (GAs). These architecture bits, that is,
the GA chromosome, are downloaded onto the PLD during
genetic learning. EHW can therefore be regarded as online
adaptive hardware.
In this type of EHW, the hardware evolution is based
on primitive logic gates. We call it gate-level EHW. This
type has mainly been used since EHW was proposed [4–6,
18]. Another type is called function-level EHW [13, 14]. It
aims for a more complex process than gate-level EHW by
using functions, such as addition, as primitives of evolution.
An EHW using a hardware description language as the
primitive of evolution has also been proposed [3]. In this
paper we use gate-level EHW, which is the most fundamental
type and the easiest to implement. In this section we explain
the implementation method using a programmable logic
device and the genetic learning procedure.
Fig. 1. Conceptual diagram of evolvable hardware.
2.2. Programmable logic device
Here we will explain the implementation method
using a PLD, with the simplified model shown in Fig. 2.
A PLD consists of logic cells and a fuse array, and its
architecture is determined by the architecture bits. Each dot
of the fuse array corresponds to one bit of the architecture
bit register (ABR).
The fuse array determines the interconnection be-
tween the device inputs and the logic cell. It also specifies
the logic cell�s AND term inputs. If a link on a particular
row of the fuse array is switched on (indicated in Fig. 2 by
a black dot), then the corresponding input signal is con-
nected to the row. In the architecture bits, these black and
white dots are represented, respectively, by 1 and 0.
Consider the example PLD shown in Fig. 2. The first
row indicates that I0 and the complement of I2 are connected
as an AND term, which generates I0·Ī2. Similarly, the second
row generates I1. These AND terms are connected by an OR
gate. Thus, the resultant output is O0 = I0·Ī2 + I1.
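The row-by-row evaluation above can be sketched in code. This is an illustrative model of the simplified PLD, not the actual device; the names `eval_pld`, `fuse_rows`, and `or_mask`, and the tuple encoding of fuse links, are our own assumptions for exposition.

```python
def eval_pld(fuse_rows, or_mask, inputs):
    """Evaluate one output of a simplified PLD (cf. Fig. 2).

    fuse_rows: one entry per row of the fuse array; each entry is a
    list of (input_index, polarity) links, where polarity True means
    the input itself and False means its complement.
    or_mask: which AND rows feed the OR gate of this output.
    """
    terms = []
    for row in fuse_rows:
        # An AND term is 1 only if every connected literal is 1.
        terms.append(all(inputs[i] if pol else not inputs[i]
                         for i, pol in row))
    # The selected AND terms are combined by an OR gate.
    return any(t for t, m in zip(terms, or_mask) if m)

# The Fig. 2 example: row 1 links I0 and the complement of I2,
# row 2 links I1, and both rows feed the OR gate, so the output
# realizes O0 = I0*(not I2) + I1.
rows = [[(0, True), (2, False)], [(1, True)]]
```

Flipping the fuse links (and thus the architecture bits) changes which Boolean function the same structure computes, which is exactly what genetic learning exploits.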
As mentioned above, both the fuse array and the
function of the logic cell are represented in a binary string.
For the sake of GA-based adaptive search, the key idea of
EHW is to regard this binary bit string as a chromosome.
The hardware structure we actually use is an FPLA
device [17], which is a commercial PLD (Fig. 3)*. This
architecture consists mainly of an AND and an OR array. A
vertical line in the OR array corresponds to a logic cell in
Fig. 2. This device can configure any binary logic circuits.
2.3. Genetic learning
We will now describe the genotype representation of
EHW and the genetic learning method.
In our earlier work on gate-level EHW, we regarded
the architecture bits as the GA chromosome, with a fixed
length. Despite this simple representation, the hardware
evolution was successful for combinatorial logic circuits
(e.g., six-multiplexer [4]) and sequential logic circuits (e.g.,
finite-state automaton, 3-bit counter [5]).
However, this straightforward representation im-
posed a serious limitation on hardware evolution. It re-
quired inclusion of all of the fuse array bits in the genotype,
even when only a few bits in the fuse array were effective.
This made the chromosome too long to be effectively
searched by the evolutionary process.
Consequently, we introduced a new GA based on a
variable-length chromosome called VGA [10]. VGA is
expected to evolve a large circuit more quickly. The chro-
mosome length of the VGA is smaller than that of the earlier
GA, especially when a circuit with a large number of inputs
is being evolved. In this paper, we confirmed the effectiveness
of the VGA by experiments. The VGA is described in more
detail in Section 3.3.
The fitness evaluation of the GA is based on the
correctness of the EHW�s output for the training data set.
In the pattern recognition system we introduce the MDL
(minimum description length) [15] for the fitness evalu-
ation. Using the MDL, the robustness in recognizing noisy
patterns is expected to increase (for further details, see
Section 3.2).
3. Pattern Recognition Using EHW
3.1. Motivation
The aim of our system is to implement high-speed
pattern recognition for the purpose of establishing a robust
system in noisy environments using EHW. This robustness
seems to be the main feature of ANNs. ANNs are run mostly
in a software-based way, that is, executed by a workstation.
Fig. 3. An FPLA architecture.
*We cannot use commercial PLDs for EHW because rewriting their
fuse arrays is too slow. We therefore implement EHW as PLDs with
fast fuse-array rewriting, built on FPGAs.
Fig. 2. A simple PLD structure.
Thus, current ANNs may have difficulty with real-time
processing because of the speed limit of the software-based
execution. On the other hand, EHW can execute the recog-
nition process faster than ANNs can because the learned
result of EHW is represented by hardware.
Another desirable feature of EHW is its readability.
The learned result using EHW is expressed as a Boolean
function, whereas ANN represents it as thresholds and
weights. Thus, the acquired result using EHW is more
easily understood than that of an ANN. We believe that this
understandability feature allows wider usage of EHW in
industrial applications because the maintenance becomes
easier.
For the sake of achieving flexible recognition capa-
bility, it is necessary to cope with a pattern that is classifi-
able not by a linear function, but by a nonlinear function.
To check the above capability, we have conducted an ex-
periment in learning the exclusive-OR problem. From the
results of the simulation, we confirmed that EHW could
learn nonlinear functions successfully [6]. In other words,
we consider that EHW will fulfill the minimum require-
ment toward robust pattern recognition. In this paper we
show the potential of EHW to take the place of ANNs by
using typical experiments on a developed pattern recogni-
tion system using EHW.
The pattern recognition procedure consists of two
phases. The first phase is the learning of the training pat-
terns. The training patterns are genetically learned by EHW.
We use the MDL-based fitness and the VGA described in
Sections 3.2 and 3.3, respectively. The second phase is the recognition
of test patterns. Our aim is pattern recognition that is
insensitive to noise.
3.2. Fitness evaluation by MDL
MDL is an information criterion used in machine learning
to select a model of the given data that best predicts the
remainder of the data set [15]. Using the MDL for pattern
classification is an effective way to obtain a noise-insensitive
classifier function. In this section, we describe the principle
of MDL and show how to apply it to pattern recognition
using EHW.
MDL is based on a "simplicity criterion," which
prescribes that simpler is better. Statisticians have studied
simplicity criteria for many years. The MDL selects a model
of the data source in order to predict the data; here, model
means a statistical model, that is, a probability distribution
under statistical restrictions. The complexity of the source
is defined by the MDL of the model. Thus, the MDL selects
a model which minimizes the following sum [8]:

MDL = desc len (model) + desc len (error) → min   (1)
desc len (model) is the description length of the
model. desc len (error) is the code length of the error when
encoded using the model as a predictor for the data. The
sum MDL represents the trade-off between model complex-
ity (the first term) and residual error (the second term),
including a structure estimation term for the final model.
The final model with the minimal MDL is optimum in the
sense of being a consistent estimate of the number of
parameters while achieving the minimum error.
The reason for using the MDL is to let the EHW learn
a noise-insensitive classifier function, since a noise-sensi-
tive function is prone to overfitting. When the MDL is used
to select the classifier function, it favors the more noise-in-
sensitive, that is, the simpler, function, because a simpler
function predicts the rest of the data more correctly [9].
Thus, the MDL is defined so as to choose simpler and more
general classifier functions.
The MDL is introduced into the GA fitness evalu-
ation. For example, the MDL is used as the fitness function
to avoid overfitting of GA learning [11].
We have introduced the above MDL criterion into the
GA fitness evaluation for pattern recognition. The goal is
to establish a robust learning method for EHW. In general,
the greater the number of indifferent (i.e., "don't care")*
inputs, the more robust (i.e., noise-insensitive) the evolved
hardware: even if noise is present on a "don't care" input,
the noise does not affect the output. Thus, we regard the
number of indifferent inputs as an index of MDL.
More formally, the fitness value F for our EHW using
MDL is derived from Eq. (1) as follows:

F = 1 − FM   (2)

FM = Ac·log(C + 1) + (1 − Ac)·log(E + 1)   (3)

FM denotes the MDL of the EHW; the smaller its value, the
better. Eq. (2) is therefore needed so that a smaller MDL
yields a larger fitness value. Ac is the coefficient of the first
term of FM (0 < Ac < 1); C denotes the complexity of the
EHW; and E is the error rate of the EHW's output on the
training data. FM is normalized so that 0 ≤ FM ≤ 1.
The goal of this expression is to minimize the MDL
value and to find the best trade-off between the complexity
and the error rate of the EHW. The C value (i.e., the
complexity of the EHW) determines the performance of the
MDL. In this paper we introduce three definitions of C as
shown in the appendix.
*We call an input "don't care" if it is not included in the output expression.
For instance, if O = I1 + I2 in the PLD of Fig. 2, then I0 is a "don't care"
input.
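Equations (2) and (3) can be sketched as a fitness function. The normalization constants `C_max` and `E_max` below are our own assumptions; the paper states only that FM is normalized to the range [0, 1], not how.

```python
import math

def mdl_fitness(C, E, Ac=0.2, C_max=100, E_max=1.0):
    """F = 1 - FM with FM = Ac*log(C+1) + (1-Ac)*log(E+1) (Eqs. 2, 3).

    C is the circuit complexity and E the training error rate. Each
    log term is divided by its maximum so that 0 <= FM <= 1 (this
    concrete normalization scheme is an assumption on our part).
    """
    fm = (Ac * math.log(C + 1) / math.log(C_max + 1)
          + (1 - Ac) * math.log(E + 1) / math.log(E_max + 1))
    return 1.0 - fm
```

Under this sketch, a simpler circuit with the same error rate always scores higher, which is the trade-off the MDL is meant to enforce.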
3.3. Variable-length chromosome GA
Here we introduce a new GA based on the variable-
length chromosome, called the VGA, to improve the per-
formance of the GA [10]. In conventional EHW, the totality
of the architecture bits of the PLDs was regarded as one
chromosome of the GA. We call this method the simple GA
(SGA). But in the pattern recognition problem involving a
two-dimensional image, a large number of inputs are
needed. This causes an increase in chromosome length,
leading to an increase in GA learning time and restrictions
on the size of the evolved circuit.
Compared with the SGA, the chromosome length of
the VGA is smaller, especially in the evolution of a circuit
with a large number of inputs. This is because the VGA can
deal with that part of the architecture bits which effectively
determines the hardware structure [10]. Because of this
short chromosome, the VGA can increase the maximum
circuit size and establish an efficient adaptive search.
The coding method of the VGA is described in Fig.
4. An example of a chromosome and a representation of an
allele are shown in Fig. 4(a). An allele in a chromosome
consists of a location and a connection type. The location
is the position of the allele in the fuse array. There are two
kinds of connection. The AND connection type defines the
input of the AND array as either positive or negative. The
OR connection type defines the output of the AND array as
either connected or not connected to the input of the OR
array. For example, denoting an allele as (0,1) means that
the connection type at location 0 is 1. One chromosome is
represented by a string of alleles. Chromosome length is the
number of alleles in a chromosome. By converting each
allele into the connection pattern of the PLD, the chromo-
some is converted into the architecture bits defining the
PLD, as shown in Fig. 4(b).
We use the following method for genetic operation.
We adopt the roulette wheel selection strategy. The recom-
bination operators are cut and splice (Fig. 5), which are used
in the messy GA [2]. The cut probability is given by
pc = (λ − 1)·pk, where λ is the chromosome length (the
number of alleles in a chromosome) and pk is a real number
between 0 and 1; if pc > 1, it is set to 1. The mutation
operation selects numbers in alleles at random and changes
their values.
For further details about the VGA, refer to Ref. 10.
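The cut-and-splice recombination can be sketched as follows. The function names, the `(location, connection_type)` tuples, and the `rng` parameter are illustrative choices of ours, under the stated rule pc = (λ − 1)·pk capped at 1.

```python
import random

# An allele is a (location, connection_type) pair, as in Fig. 4(a);
# a chromosome is a variable-length list of alleles.

def cut(chrom, pk, rng=random):
    """Cut a chromosome with probability pc = (len - 1) * pk, capped at 1."""
    pc = min(1.0, (len(chrom) - 1) * pk)
    if len(chrom) > 1 and rng.random() < pc:
        point = rng.randrange(1, len(chrom))  # cut between two alleles
        return chrom[:point], chrom[point:]
    return chrom, []                          # no cut performed

def splice(a, b):
    """Splice simply concatenates two chromosome fragments."""
    return a + b

parent = [(0, 1), (3, 0), (7, 1), (12, 1)]
left, right = cut(parent, pk=1.0)  # pc is capped at 1, so a cut occurs
child = splice(left, right)        # re-splicing recovers the parent
```

In an actual GA run, the fragments produced by `cut` would be spliced with fragments from other individuals, so chromosome lengths grow and shrink over generations.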
4. Experiment with Pattern Recognition
System
4.1. System characteristics
We have developed a pattern recognition system us-
ing EHW. The organization of the system is shown in Fig.
6. It consists of an EHW board that incorporates four FPGA
chips (Xilinx 4025), a DOS/V machine, and an input tablet
for drawing patterns. The DOS/V machine handles GA
operations, the control of the EHW board, and the display
of patterns. The PLD on the FPGA is reconfigurable, which
means that the system can be used as a universal EHW system.
Fig. 4. Representation of the chromosome of the variable-length chromosome GA (VGA).
Fig. 5. Cut & splice operator of variable-length
chromosome GA (VGA).
An overview of the EHW board is shown in Fig. 7,
and a block diagram is shown in Fig. 8. In the EHW board,
there are four FPGAs (hatched areas in the figure), board
control registers, and SRAM that stores the configuration
data of the FPGA. In the EHW, a circuit represented by a
chromosome is obtained by an ABR (architecture bit regis-
ter) and a PLD. The ABR stores architecture bits of the
PLD. The PLD has the architecture of an FPLA device (Fig.
3). In this figure, there are K individuals, that is, K pairs of
an ABR and a PLD in an FPGA chip. In the first version of
this system, we designed a genetically reconfigurable hard-
ware device having four FPGAs. The processing time of the
EHW board was 720 ns.
4.2. Experimental results
We conducted an experiment on recognizing the bi-
nary patterns of 8 × 8 pixels. There are 30 input patterns of
64 bits in the training set, as shown in Fig. 9. Three patterns
represent numerical characters (i.e., 0, 1, and 2) unambigu-
ously. The other 27 patterns represent the same numerical
characters with noise (5 bits have been flipped at random).
The outputs of the EHW consist of 3 bits; each bit corre-
sponds to one of three characters. The initial length of a
chromosome is 100. The probability of the cut operator pc
is 0.1 when the chromosome has its initial length, the
probability of the splice operator is 0.1, and the mutation
probability is 0.01. The number of lines in the AND array
of the PLD is 16. The test data set consists of 30 patterns,
which are generated with random noise (from 1 to 5 bits
have been flipped at random).
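The noisy pattern generation described above can be sketched in a few lines; the helper name `add_noise` and the list encoding of a pattern are our own illustrative choices.

```python
import random

def add_noise(pattern, n_bits, rng=random):
    """Return a copy of a binary pattern with n_bits distinct bits
    flipped at random, as used to generate the noisy training and
    test patterns of Section 4.2."""
    noisy = list(pattern)
    for i in rng.sample(range(len(noisy)), n_bits):
        noisy[i] ^= 1  # flip the chosen pixel
    return noisy

# Each test pattern flips from 1 to 5 random bits of a clean character,
# e.g. add_noise(clean, random.randint(1, 5)).
```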
Fig. 7. The EHW board.
Fig. 8. Block diagram of the EHW board.
Fig. 9. Training patterns.
Fig. 6. Block diagram of pattern recognition system.
Four different learning methods were examined:
three kinds of MDL-based EHW with three MDL defini-
tions [MDL1, MDL2, and MDL3, which correspond, re-
spectively, to Eqs. (A.1) to (A.3) in the appendix], and a
non-MDL EHW. Ac in Eq. (3) is 0.2 for MDL1 and MDL2,
and 0.1 for MDL3. The fitness of the non-MDL EHW is
simply the error of the EHW on the training data. We
repeated the learning and the test on the same patterns 10
times for each MDL. The recognition results on the test set
(averaged over the 10 trials) are plotted in Fig. 10. From the
figure, it is clear that MDL-based EHWs perform better on
noisy patterns than EHW without an MDL.
An important feature of the EHW is that the resultant
expression can be represented by a simple Boolean func-
tion. For example, in one run with MDL3, the learned results
were O0 = I22·I46·Ī58, O1 = Ī8·I20·Ī37, and
O2 = I18·Ī50·Ī54 + I11·Ī38·Ī63, where Ii (0 ≤ i ≤ 63) is the
value of the i-th pixel and Oi is the recognition output for the
pattern of letter i. In the ANN, in contrast, it is represented
solely by the enumeration of real values for the thresholds
and weights. Clearly, the results obtained by the EHW are
easier to understand than those obtained by ANN.
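The learned circuit quoted above can be transcribed directly into code, which illustrates the readability claim; the helper name `recognize` and the list encoding of the pattern are our own.

```python
def recognize(p):
    """Evaluate the circuit learned in one MDL3 run (Section 4.2) on
    a 64-element binary pattern p, where p[i] is pixel Ii and the
    complemented input Ii-bar is modeled with `not p[i]`.
    Returns the outputs (O0, O1, O2)."""
    o0 = p[22] and p[46] and not p[58]
    o1 = (not p[8]) and p[20] and (not p[37])
    o2 = ((p[18] and not p[50] and not p[54])
          or (p[11] and not p[38] and not p[63]))
    return bool(o0), bool(o1), bool(o2)

# Any pattern with pixels 22 and 46 set and pixel 58 clear fires O0;
# flipping any of the 61 "don't care" pixels cannot change that output.
```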
5. Comparison of Processing Speed
between EHW and Dedicated ANN
Hardware
We now compare the processing speed of the EHW
pattern recognition system with that of two kinds of dedi-
cated ANN hardware. We consider first the recognition
processing speed after learning. In the second step we
consider the learning time, but that will not be the main
focus, because the learning time of the EHW depends on
the execution time of the software.
EHW is compared with two types of hardware for
ANN: CNAPS Server II (Adaptive Solutions, Inc.) and
MY-NEUPOWER (Hitachi Microcomputer Systems, Inc.).
Both have 512 processors simulating neurons in SIMD
(single instruction multiple data stream) type parallel proc-
essing. In CNAPS and MY-NEUPOWER, we used the
three-layer (input layer 64, hidden layer 3, and output layer
3) back propagation learning method. The programs are
written in CNAPS-C and MY-PARAL, the special lan-
guages for CNAPS and MY-NEUPOWER, respectively.
5.1. Recognition time
Measurement of the recognition time after learning
is described as follows. In EHW, the time is measured with
a logic analyzer from the beginning of transmission of the
input pattern from the PC (DOS/V, 90 MHz) to the EHW
board until the recognition result is sent back to the PC. For
CNAPS we measured the processing speed by clock num-
ber, and for MY-NEUPOWER by using the timer on the
host computer. When measuring the speed of MY-NEU-
POWER, we also measured the overhead of interaction
with the host computer and subtracted this overhead from
the measured time.
In Table 1 we show the result of measuring the time
for the recognition process after learning. The recognition
speed of the EHW is the fastest: 1.7 times that of CNAPS
and 9.2 times that of MY-NEUPOWER.
The reason is that the recognition process of the EHW uses
only a learned logic circuit. In contrast, the ANN hardware
needs to perform some multiplication and addition opera-
tions for each neuron.
The pattern recognition we tested here was executed
with a small number of neurons, so we could not exploit
the full parallelism of the ANN hardware: in our experi-
ment, only three neurons in the hidden and output layers
operated in parallel, whereas the ANN hardware can operate
up to 512 neurons in parallel. This hardware would show
better relative performance on tasks requiring more neurons
in the hidden and output layers.
5.2. Correctness
Another interesting comparison criterion is the cor-
rectness on a test data set. We tested the same test set as
Fig. 10. Result of recognition by test set.
Table 1. Result of measurement of recognition time
described in Section 4.2. The correctness for CNAPS and
MY-NEUPOWER was perfect for all test sets that had
noise. The correctness for EHW depends on the number of
bits of noise in the test set (Fig. 10). For 1 bit of noise, it is
about 97% on average and 100% maximum, showing that
EHW is as robust as ANN is when the amount of noise is
small. When there is more noise in the test set, ANN is better
able to recognize noisy patterns than EHW. The reason is
that EHW is good at learning particular Boolean functions
consisting of a small number of terms, such as the pattern
recognition functions shown in Section 4.2, multiplexers,
and so on [4, 5]. If the learning algorithm is improved to
obtain more complicated functions, the correctness of the
EHW should improve over the current result.
5.3. Learning time
Next we note the learning time as reference data. The
conditions for measuring the learning time were as follows.
We measured the time from the beginning of learning until
the training set was learned. In EHW, the learning of the
training set continued until the MDL (described in Section
3.2) became stable, that is, after 2000 generations, and we
selected the generation in which the best MDL value was
obtained. When the best MDL was obtained, the output
error was 0%. In ANN, the training set is learned until the
average of the error of each output drops to less than 1%.
We measured the processing time of the EHW using the
timer on the PC, and measured the processing time of
CNAPS and MY-NEUPOWER in the same way as for the
recognition time.
The results of measuring the learning time are shown
in Table 2. It is seen that the learning speed of EHW is about
four orders of magnitude slower than that of neural hard-
ware. There are two explanations. One is that the learning
speed of the EHW depends on the execution time of the GA
program on the PC. In contrast, the hardware for the ANN
can perform high-speed learning using hardware. If we
design the hardware for learning and put it on the EHW
board, then the learning will be accomplished much faster
[5]. The other reason is the number of iterations in learning.
The learning was completed in about 10 iterations in the
ANN, while in the EHW it took about 1200 generations to
finish. Improvement of the learning algorithm in EHW is a
subject for future study.
6. Discussion
In this section we discuss (1) Boolean functions with
high ability for recognizing noisy patterns, (2) the advan-
tages of the VGA over the SGA, and (3) how to improve
the processing speed of the pattern recognition system.
First we discuss which Boolean function has high
ability for recognizing noisy patterns. Roughly speaking,
the Boolean function with better recognition ability is the
function that has fewer inputs, that is, a large number of
indifferent inputs. We confirmed that we could obtain such
a function using the MDL. However, we can obtain func-
tions that are more robust by adding more terms to the
equation. A method of obtaining such functions is also a
subject for future research.
Next we discuss the advantages of the VGA over the
SGA. In the pattern recognition system, we used the VGA
instead of the SGA. The main advantage of the VGA in
pattern recognition is that we can handle larger inputs than
can be done using the SGA. For example, the EHW was
able to learn three patterns of 16 inputs each by the SGA
with a chromosome length of 840. In comparison, when
using the VGA, the EHW can learn three patterns of 64
inputs with an average chromosome length of 187.6. In
addition, learning by the VGA is much faster than by the
SGA: 416.7 generations using the VGA versus 4053 using
the SGA. The reason that the VGA can handle larger inputs
than the SGA is that the VGA encodes into the chromosome
only those inputs which actually generate AND terms, so
that the chromosome length can be kept small. If the SGA
is used for problems of this nature, we incur an increase of
chromosome length because of the many inputs, leading to
an increase of GA execution time. In addition, the VGA
matches the MDL very well, because the MDL directs the
GA search toward smaller circuits, that is, smaller chromo-
somes, as shown in Section 3.2. Thus, we can say
that the VGA is suitable for pattern recognition problems
because it handles many inputs and learns small circuits.
Next we discuss how to improve the processing speed
of the pattern recognition system. The pattern recognition
system offers some possibilities for improving processing
speed. We can make it at least 20 times faster if we improve
the timing of the control signals on the ISA board used as
the interface with the PC, and if we implement one PLD per
FPGA.
Table 2. Result of measurement of learning time
7. Conclusions
We have developed a pattern recognition system us-
ing EHW. The mission of the system is to recognize noisy
or incomplete patterns as neural networks do. The advan-
tages of EHW over ANNs are high processing speed and
the readability of the learned results of the EHW. We
describe the learning algorithm using the MDL and the
VGA. A noise-insensitive function was obtained effectively
by using the MDL as a fitness function of the GA. By using
the VGA, the EHW was able to handle larger inputs at a
faster learning speed than when using the simple GA. We
developed a pattern recognition system to show the feasi-
bility of using EHW for noise-insensitive recognition. We
conducted experiments in recognizing noisy patterns. We
confirmed that EHW could recognize noisy patterns by
introducing the MDL into the fitness function. The read-
ability of the learned result was confirmed by experiments.
We compared the processing speed of the EHW and dedi-
cated ANN hardware, and confirmed that the recognition
speed of the EHW was faster.
Acknowledgments. This research was supported by
the MITI Real World Computing (RWC) Project. The
authors would like to express their thanks to Dr. Otsu,
director of the Machine Understanding Division at ETL,
and Dr. Ohmaki, director of the Computer Science Division
at ETL, for their support and encouragement. The authors
thank OA Laboratory Co., Ltd. for their cooperation in the
fabrication of the system. They also thank Mr. Umeki at
Toshiba Corporation, Research and Development Center,
and Mr. Mukai at RWC Tsukuba Research Center, for
assistance in using CNAPS and MY-NEUPOWER, respec-
tively.
REFERENCES
1. Goldberg D. Genetic algorithms in search, optimiza-
tion, and machine learning. Addison-Wesley; 1989.
2. Goldberg D, Deb K, Kargupta H, Harik G. Rapid
accurate optimization of difficult problems using fast
messy genetic algorithms. Proc 5th Int Joint Conf on
Genetic Algorithms, Illinois, p 56–64, 1993.
3. Hemmi H, Mizoguchi J, Shimohara K. Development
and evolution of hardware behavior. In: Sanchez E,
Tomassini M, editors. Towards evolvable hardware:
The evolutionary engineering approach. LNCS 1062.
Springer-Verlag; 1996. p 250–265.
4. Higuchi T, Niwa T, Tanaka T, Iba H, de Garis H,
Furuya T. Evolvable hardware with genetic learning.
Proc Simulation of Adaptive Behavior, Honolulu, p
417–424, MIT Press, 1992.
5. Higuchi T, Iba H, Manderick B. Evolvable hardware.
In: Kitano H, Hendler JA, editors. Massively parallel
artificial intelligence. MIT Press; 1994. p 398–421.
6. Higuchi T, Iwata M, Kajitani I, Iba H, Hirao Y, Furuya
T, Manderick B. Evolvable hardware and its applica-
tions to pattern recognition and fault-tolerant sys-
tems. In: Sanchez E, Tomassini M, editors. Towards
evolvable hardware: The evolutionary engineering
approach. LNCS 1062. Springer-Verlag; 1996. p
118–135.
7. Higuchi T, Iwata M, Liu W, editors. Evolvable sys-
tems: from biology to hardware. LNCS 1259. Sprin-
ger-Verlag; 1997.
8. Iba H, de Garis H, Sato T. Genetic programming
using a minimum description length principle. In:
Advances in genetic programming. MIT Press; 1994.
p 265–284.
9. Itoh S. Application of MDL principle to pattern clas-
sification problems. J Jpn Soc Artif Intell
1992;7:608–614. (in Japanese)
10. Kajitani I, Hoshino T, Iwata M, Higuchi T. Variable
length chromosome GA for evolvable hardware. Proc
3rd Int Conf on Evolutionary Computation, Nagoya,
p 443–447, 1996.
11. Konagaya A, Kondo Y. Stochastic motif extraction
using a genetic algorithm with the MDL principle.
Hawaii Int Conf on System Sciences, 1993.
12. Marchal P, Piguet C, Mange D, Stauffer A, Durand S.
Embryological development on silicon. In: Artificial
life IV. MIT Press; 1994. p 365�370.
13. Murakawa M, Yoshizawa S, Kajitani I, Furuya T,
Iwata M, Higuchi T. Hardware evolution at function
level. In: Parallel problem solving from nature IV.
Springer-Verlag; 1996. p 62�71.
14. Murakawa M, Yoshizawa S, Kajitani I, Higuchi T.
Evolvable hardware for generalized neural networks.
15th Int Joint Conf on Artificial Intelligence. Morgan
Kaufmann; 1997. p 1146�1151.
15. Rissanen J. Stochastic complexity in statistical in-
quiry. World Scientific Series in Computer Science
Vol. 15. World Scientific; 1989.
16. Sanchez E, Tomassini M, editors. Towards evolvable
hardware: The evolutionary engineering approach.
LNCS 1062. Springer-Verlag; 1996.
17. Sasao T. How to make and how to use PLA. Nikkan
Kogyo Shinbun; 1986. p 2. (in Japanese)
18. Thompson A. Evolving electronic robot controllers
that exploit hardware resources. Proc 3rd European
Conf on Artificial Life, Granada, p 640�656, 1996.
APPENDIX
Definition of Complexity Value for MDL
The C value, which is the complexity term of the MDL (i.e., the complexity of the EHW), determines the performance of the MDL criterion. We introduce three definitions, as follows:

C1 = Σi |ANDOi|    (A.1)
C2 = |AND| × |OR|    (A.2)
C3 = Σi |ANDOi| × |OROi|    (A.3)

where |ANDOi| and |OROi| are the numbers of ANDs and ORs connected to the output Oi, and |AND| (|OR|) is the number of ANDs (ORs) on the AND (OR) array. Consider Fig. 4(b) as an instance; the AND and OR gates are marked with distinct symbols in the figure. The values of C1, C2, and C3 are 3 (= 1 + 2), 9 (= 3 × 3), and 5 (= 1 × 1 + 2 × 2), respectively, because |ANDO0|, |ORO0|, |ANDO1|, |ORO1|, |AND|, and |OR| are 1, 1, 2, 2, 3, and 3.
The definition of C1 is the least precise of the three because it ignores the OR gates; C2 and C3, which account for both gate arrays, are expected to give more exact MDL values. We also tested several other definitions of complexity; in this paper we have presented the best three.
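To make the worked example concrete, the three complexity measures can be sketched in a few lines of Python. The representation below is an assumption for illustration only: each output Oi is described by the count of AND and OR gates connected to it, and the array totals |AND| and |OR| are passed in separately; the input values reproduce the Fig. 4(b) example.

```python
# Sketch of the three MDL complexity measures (A.1)-(A.3).
# Assumed representation: ands_per_output[i] = |AND_Oi|,
# ors_per_output[i] = |OR_Oi|; total_ands = |AND|, total_ors = |OR|.

def c1(ands_per_output):
    # C1 = sum_i |AND_Oi|  -- counts only ANDs feeding each output
    return sum(ands_per_output)

def c2(total_ands, total_ors):
    # C2 = |AND| x |OR|  -- product of the array totals
    return total_ands * total_ors

def c3(ands_per_output, ors_per_output):
    # C3 = sum_i |AND_Oi| x |OR_Oi|  -- per-output products, summed
    return sum(a * o for a, o in zip(ands_per_output, ors_per_output))

# Fig. 4(b): |AND_O0| = 1, |OR_O0| = 1, |AND_O1| = 2, |OR_O1| = 2,
# and |AND| = |OR| = 3.
ands = [1, 2]
ors = [1, 2]
print(c1(ands), c2(3, 3), c3(ands, ors))  # 3 9 5
```

Running the sketch reproduces the values 3, 9, and 5 derived in the text above.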
AUTHORS
Masaya Iwata (member) received his B.E., M.E., and Ph.D. degrees in applied physics from Osaka University in 1988,
1990, and 1993. He was a postdoctoral fellow at ONERA-CERT, Toulouse, France, in 1993. He is currently a senior researcher
at the Electrotechnical Laboratory. His research interests are in genetic algorithms. He is a member of the IEICE and the
Information Processing Society of Japan.
Isamu Kajitani (nonmember) received his B.E., M.E., and Ph.D. degrees in engineering from the University of Tsukuba in 1994, 1996, and 1999. He is currently working at the Electrotechnical Laboratory as a proposal-based researcher of the New Energy and Industrial Technology Development Organization. His research interests are engineering applications of genetic algorithms. He is a member of the Japanese Society for Artificial Intelligence.
Masahiro Murakawa (nonmember) received his B.E., M.E., and Ph.D. degrees in mechano-informatics engineering from the University of Tokyo in 1994, 1996, and 1999. He is currently a researcher at the Electrotechnical Laboratory. His research interests include evolutionary algorithms, reconfigurable computing, neural networks, and reinforcement learning. He received the best paper award at the Second International Conference on Evolvable Systems. He is a member of the Information Processing Society of Japan and the Japanese Neural Network Society.
Yuji Hirao (nonmember) received his B.E. degree in electrical and electronic engineering from the University of Tokushima in 1986. He joined Kawasaki Heavy Industries, Ltd. in 1986 and moved to the Tokushima Prefectural Industrial Technology Center in 1990. He is currently in the Graduate School of Engineering, University of Tokushima. His research interests are in robot systems and genetic algorithms. He is a member of the Information Processing Society of Japan and the Institute of System, Communication and Information Engineers.
Hitoshi Iba (nonmember) received his B.E., M.E., and Ph.D. degrees in information science from the University of Tokyo
in 1985, 1987, and 1990. He is currently an assistant professor there. His research interests are in artificial intelligence,
evolutionary algorithms, and robotics. He is a member of the Information Processing Society of Japan and the Japanese Society
for Artificial Intelligence.
Tetsuya Higuchi (member) received his B.E., M.E., and Ph.D. degrees in electrical engineering from Keio University in
1978, 1980, and 1984. He was a visiting researcher at Carnegie Mellon University in 1990. He heads the Evolvable Systems
Laboratory at the Electrotechnical Laboratory. His research interests include genetic algorithms and parallel associative
processing architecture. He is a member of IEICE and the Information Processing Society of Japan.