[IEEE NAFIPS 2005 - 2005 Annual Meeting of the North American Fuzzy Information Processing Society - Detroit, MI, USA (26-28 June 2005)]

NAFIPS 2005 - 2005 Annual Meeting of the North American Fuzzy Information Processing Society

Fuzzy Modelling Through Logic Optimization

Adam F. Gobi, Witold Pedrycz

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
Email: {gobia,pedrycz}@ece.ualberta.ca

Abstract - This study concerns a new approach to fuzzy model identification. Primarily focusing on the core of the model, we propose a two-phase design process realizing adaptive logic processing in the form of structural and parametric optimization. In recognizing the fundamental link between binary and fuzzy logic, effective structural learning is achieved through established methods in logic minimization. This underlying structure is then augmented with fuzzy neural networks in order to learn the finer details of the target system's behaviour. The combination of a logic-driven architecture with this novel hybrid-learning scheme helps to develop transparent and accurate models while maintaining excellent computational efficiency.

I. INTRODUCTION

What becomes quite apparent in fuzzy modelling are the growing difficulties when dealing with multi-variable systems. The transparency of fuzzy models, usually regarded as some type of rule-based system, amplifies these difficulties even further. The "curse" of dimensionality becomes apparent even for small rule-based systems. There have been numerous hybrid approaches [1]-[3] to alleviate these problems, including neurofuzzy systems (endowed with their learning abilities) and evolutionary optimization (with its mechanisms of global optimization). The successes of such hybrid development environments are limited; we have not yet reached a state where large models can be built efficiently while retaining the semantics of the resulting constructs.

In this study, we propose a significant departure from the main design direction by going back to the development of fuzzy models based on the principles of two-valued logic. The wealth of design tools for two-valued logic is immense [4]-[6], with techniques capable of handling large problems and dealing with hundreds of binary variables. The underlying objective there is to minimize Boolean functions so that the ensuing realization can be made as compact as possible. One of the tools, Espresso [4], helps satisfy this important goal; it has been previously investigated for application to machine learning problems [7], though unrelated to the approach taken in this study.

Given the state of the art of handling and simplifying Boolean functions, our objective is to capitalize on this framework and treat it as an important phase in the design of fuzzy models. The proposed conceptual setting can be succinctly represented in the form seen in Fig. 1.

Let us briefly elaborate on the main phases of this scheme:

1) Granulation: each system variable is granulated through a collection of semantically meaningful fuzzy sets.

Fig. 1. General design flow of the fuzzy model development: data -> fuzzy granulation -> binarization of information granules -> minimization of binary expressions -> refinement of the fuzzy model -> knowledge

2) Binarization: these fuzzy information granules are converted into a binary format.

3) Minimization: the binary data is minimized using well-known techniques of logic synthesis, leading to the underlying structure of the fuzzy model.

4) Refinement: the structure is refined, moving back to the fuzzy domain and augmenting the model with fuzzy neurons [2] while optimizing their connections.

The paper is organized into seven sections. In Section II, we present a general fuzzy modelling overview. Section III gives a brief background in logic minimization, with Section IV discussing the two phases in building an adaptive logic processing core. In Section V, we identify critical design issues and propose suitable solutions. Section VI demonstrates the details of the design process as we conduct comprehensive experiments to validate the proposed design methodology. Finally, Section VII concludes the study.

II. FUZZY MODELLING

A. Architecture of the Fuzzy Model

As advocated in [2], fuzzy modelling is realized at the conceptual level formed by a collection of semantically meaningful information granules (fuzzy sets) defined in each variable. These are also regarded as linguistic landmarks whose choice implies a certain point of view of the system under discussion. When dealing with many variables, the fuzzy sets are aggregated with AND and OR operations carried out by some set of triangular norms (t- and s-norms). Fig. 2 emphasizes the structural nature of this construct, where each entity represents a unique fuzzy set defined in its corresponding space (A, B, etc.). This network represents a logic-based description of a single information granule (fuzzy set) in the output space. In the case of a number of fuzzy sets there, the architecture is augmented with several of these networks forming the underlying structure of the fuzzy model; the result is a heterogeneous knowledge base describing the behaviour of the entire system.

0-7803-9187-X/05/$20.00 ©2005 IEEE.


Fig. 2. Structure of the model represented as a two-level aggregation of information granules

B. Modelling Scenario

A general fuzzy model has two fundamental functional components: (a) input and output interfaces and (b) a processing core. The interfaces allow interaction between the conceptual, logic-driven structure of the model and the physical world of measured variables. More specifically, the input interface realizes perception, where input variables are transformed into an internal format of information granules understood by the logic-processing core. The output interface communicates the results of processing in a form understood by the external world (modelling environment). The processing core consists of the architecture detailed above.

The experimental data measured from the target system are used as training examples, taking the form of N input-output pairs, i.e. {x(k), target(k)}, k = 1, 2, ..., N. We require that y(k), the output of the fuzzy model for the input x(k), be equal to target(k), i.e. y(k) = target(k). In this study, we are primarily concerned with the model's internal optimization. Here the original training data are converted through the model interfaces to an internal format, allowing us to focus on the mapping of the information granules rather than the experimental data.

III. TWO-VALUED LOGIC MINIMIZATION

Boolean logic minimization is best known as the main part of logic synthesis, which converts a logic function to a circuit. Logic minimization consists of the manipulation of a logic representation, without modifying its functionality, in order to achieve a minimal representation. Through binarization of the information granules formed in the interfaces of a fuzzy model, we are able to realize a novel application of logic minimization in the form of structural learning; these efficient tools can help us sort through large amounts of data, eliminating redundancy and producing a simplified, compact and equivalent result in the form of a logic-based structure.

A. Background

In providing some background on logic synthesis, the following definitions are found in [6].

The set of binary values is defined as B = {0, 1}. B^n can be modelled as a binary n-dimensional hypercube, where each element e = (e1, ..., en) ∈ B^n is called a minterm.

A Boolean function f of n variables, x1, ..., xn, is a mapping f: B^n -> {0, 1, *}, with * being a don't-care condition for when the value of the function is irrelevant. All minterms for which f has value 1 form the ON-set of the function, with the OFF-set and DC-set defined as the sets of minterms where f is 0 and *, respectively.

Each variable xi has two literals associated with it: xi and its complement xi′. A product term is a Boolean product (AND) of literals, said to contain a minterm e if each literal included in the product evaluates to 1 on e. Since a product corresponds to a set of adjacent minterms in the binary n-cube, a product may also be referred to as a cube. A sum-of-products is a Boolean sum (OR) of products.

An implicant of a Boolean function is a cube that contains no minterm in the OFF-set; it is prime if it is contained in no other implicant of the function, and is an essential prime if it contains at least one ON-set minterm that is not contained in any other prime implicant. A cover of a Boolean function is a set of implicants, interpreted as a sum-of-products, which evaluates to 1 for all minterms in the ON-set and for none of the OFF-set.

B. Minimization Algorithms

The problem of two-level logic minimization is to find a cover for f that minimizes a given cost function. Such a cover can be implemented as a minimum-cost sum-of-products equation, where the cost often considers parameters such as the number of products (cubes) or literals in the cover.

The Quine-McCluskey method [8] was one of the first exact methods for two-level logic minimization, based on the observation that the implicants in a minimum-cost cover can be restricted to prime implicants. Although these exact algorithms are useful, exact two-level minimization involves computationally intractable problems. In many cases, getting satisfactory results in far less time is often more important, leading to the development of heuristic logic minimization tools. ESPRESSO-II [4] is the state-of-the-art algorithm for heuristic logic minimization, forming the main component of the Espresso software distribution [9], developed in the 1980s as a tool for programmable logic array (PLA) design.

The output of ESPRESSO-II is a sum-of-products cover, which in practice is almost always near minimum in cardinality. The eight-step algorithm is described in detail in [4]; its basic goal is to take a verbose representation of a logic function and produce a condensed representation, essentially learning its underlying structure. While there may be newer algorithms claiming superiority in one or more specific problem areas [5], [6], their software and source code are not as easily obtained. Regardless, Espresso is still regarded as the standard two-level logic minimization tool in the (VLSI) design automation community.
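As a flavour of how such minimizers collapse minterms into cubes, here is a toy rendition of the Quine-McCluskey merging step iterated to a fixed point (our own sketch, far simpler than ESPRESSO-II, with no cover-selection step):

```python
from itertools import combinations

def combine(a, b):
    """Merge two cubes differing in exactly one fixed position, else None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and '-' not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + '-' + a[i + 1:]
    return None

def prime_implicants(minterms, nbits):
    """Iteratively merge adjacent cubes; cubes that never merge are prime."""
    cubes = {format(m, f'0{nbits}b') for m in minterms}
    primes = set()
    while cubes:
        merged, used = set(), set()
        for a, b in combinations(sorted(cubes), 2):
            c = combine(a, b)
            if c:
                merged.add(c)
                used |= {a, b}
        primes |= cubes - used   # unmerged cubes survive as prime implicants
        cubes = merged
    return primes

# f(x1, x2, x3) with ON-set {1, 3, 5, 7} (i.e. x3 = 1) reduces to one cube.
print(prime_implicants({1, 3, 5, 7}, 3))  # {'--1'}
```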

IV. INTELLIGENT INFORMATION PROCESSING

The logic-processing core is created in two phases, taking advantage of a bottom-up approach. Starting from a completely unstructured state, the entity can learn from example,


adapting itself into a structure closely resembling the target system's underlying mechanisms. Following structural optimization, the core continues to learn, now focusing on the finer details of the system in order to improve upon behavioural approximation.

A. Phase 1: Structure Discovery

Using fuzzy rather than binary membership in granular computing is advantageous in overcoming the real-world problems of imprecision and uncertainty. However, traditional Boolean logic processing has previously unseen potential as an important component of a comprehensive fuzzy modelling environment. Given the fundamental link between two-valued and fuzzy logic processing, there exists an excellent opportunity to take advantage of existing tools of logic minimization.

For building and optimizing structure, this first phase of the core's "birth" employs these methods of logic optimization. By temporarily converting the fuzzy granules into ones with strict membership boundaries, the information becomes inherently binary in nature. This gives us a rough approximation of the target system, while retaining the most important information explaining its fundamental behaviour. In this form we can then access established methodologies of logic minimization, providing an efficient means of discovering a concise, logic-based structure within the system data.

Upon returning to the domain of continuous logic, this newly discovered structure may be directly utilized in processing the original fuzzy information granules, forming the underlying structure of the fuzzy model.

B. Phase 2: Parametric Refinement

Although the information granules convey detailed numeric information in the form of their membership functions, the resulting structure in Fig. 2 does not include any other numeric quantification. A calibration of the structure is possible by equipping it with some parametric flexibility; this is achieved with the introduction of fuzzy AND and OR neurons [2] in place of the existing nodes, with which there is a direct linguistic correspondence. These are adaptive logic-processing elements connected into a fuzzy neural network, whose learning is equivalent to the adjustment of its connections.

Recall that an n-input, single-output OR neuron is described in the form y = OR(z; w), where z is a vector of inputs in [0, 1] and w are the corresponding weights (connections). Rewriting the previous expression in a coordinate-wise manner, we obtain

y = S_{i=1..n} [z_i t w_i]

meaning that the neuron realizes an s-t composition of the corresponding finite sets z and w. The AND neuron, y = AND(z; w), realizes a t-s composition of z and w, governed by the expression

y = T_{i=1..n} [z_i s w_i]

Fig. 3. Information granules (sets) can translate smoothly from binary to fuzzy: (a) binary, (b) trapezoidal, (c) triangular

The role of the connections in both neurons is to weight the inputs and in this way furnish them with the required parametric flexibility. In the case of OR neurons, the higher the connection, the more essential the associated input. For AND neurons, the opposite situation holds. In general, a certain thresholding operation can be sought: for any OR neuron, we may consider an input irrelevant if the associated connection assumes values lower than a certain threshold; an input of an AND neuron can be eliminated if its connection value exceeds a specified limit.
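A minimal sketch of the two neurons, assuming (as in the experimental part of this paper) the product t-norm and probabilistic-sum s-norm; function and variable names are our own:

```python
from functools import reduce

t_norm = lambda a, b: a * b           # product t-norm
s_norm = lambda a, b: a + b - a * b   # probabilistic sum s-norm

def or_neuron(z, w):
    """y = S_i (z_i t w_i): s-t composition of inputs and connections."""
    return reduce(s_norm, (t_norm(zi, wi) for zi, wi in zip(z, w)), 0.0)

def and_neuron(z, w):
    """y = T_i (z_i s w_i): t-s composition of inputs and connections."""
    return reduce(t_norm, (s_norm(zi, wi) for zi, wi in zip(z, w)), 1.0)

z = [0.8, 0.3]
# A low OR connection makes its input irrelevant; a high AND connection
# eliminates its input, matching the thresholding discussion above.
print(or_neuron(z, [1.0, 0.0]))   # 0.8
print(and_neuron(z, [0.0, 1.0]))  # 0.8
```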

V. DESIGN ISSUES

A. Granular Interfacing

Although the optimization of the fuzzy sets existing in the interfaces could be beneficial to the model, in this study we focus on the logic-processing core and its optimization. However, due to the nature of the core, there are still important interfacing issues to be dealt with. During the two-phase development scheme, we deal with differing types of information granules; while we are always processing information at the level of set membership across each variable's own space, we need to consider both binary and fuzzy abstractions.

For this task, triangular membership functions appear to be quite suitable, offering both flexibility and simplicity. Fig. 3 shows how they are able to translate smoothly into traditional binary sets, with trapezoidal membership functions as an intermediate possibility, their upper width seen as a potential "fuzziness" factor when converting from binary to fuzzy sets. Note that we keep all fuzzy membership functions at one-half overlap, as is the norm in the literature [2]. As well, by default we choose membership functions spread evenly across variable spaces (equal-width distribution).
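Such an equal-width, one-half-overlap family of triangular membership functions might be realized as follows (an illustrative sketch; the shouldering of the two end functions at the range bounds is our own assumption):

```python
def triangular_family(lo, hi, n):
    """Build n triangular membership functions spread evenly over [lo, hi]
    with one-half overlap; the end functions are shouldered at the bounds."""
    step = (hi - lo) / (n - 1)
    centres = [lo + i * step for i in range(n)]

    def make(c):
        def mu(x):
            # Shoulder: keep full membership outside the range at the ends.
            if (c == centres[0] and x < c) or (c == centres[-1] and x > c):
                return 1.0
            return max(0.0, 1.0 - abs(x - c) / step)
        return mu
    return [make(c) for c in centres]

low, med, high = triangular_family(0.0, 10.0, 3)
# Half overlap: between neighbouring centres the two memberships sum to one.
print(low(2.5), med(2.5), high(2.5))  # 0.5 0.5 0.0
```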

B. Binarization

In order to take advantage of logic minimization tools for structural learning, we must build a binary truth table out of the abstracted system data. Here measurements for each variable (input and output) are represented as simple binary membership to a few semantically meaningful sets distributed over their respective universes of discourse; note that these sets are mutually exclusive, so a measurement has full membership to just one set in the space and zero membership to the rest. See Fig. 4 for an illustration, where we have four sets defined for each input variable, two for the output variable, and a single input-output pair for some continuous system. Table I shows the membership values obtained.
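A possible sketch of this binarization step, assuming equal-width crisp intervals over each variable's range (the ranges and values below are illustrative, not the paper's data):

```python
def binarize(x, lo, hi, n_sets):
    """Crisp, mutually exclusive membership: full membership (1) to the one
    equal-width interval containing x, zero membership to the rest."""
    width = (hi - lo) / n_sets
    idx = min(int((x - lo) / width), n_sets - 1)  # clamp the upper bound
    return [1 if i == idx else 0 for i in range(n_sets)]

# Two inputs with four sets each and one output with two sets, as in Fig. 4.
row = binarize(6.2, 0, 10, 4) + binarize(1.1, 0, 10, 4) + binarize(8.0, 0, 10, 2)
print(row)  # [0, 0, 1, 0, 1, 0, 0, 0, 0, 1]
```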


Fig. 4. An example of binarization of information granules, obtaining binary membership from data for a two-input, one-output continuous system

TABLE I
SET MEMBERSHIPS OBTAINED DURING BINARIZATION IN FIG. 4

Set:   A1  A2  A3  A4  B1  B2  B3  B4  Y1  Y2
Mem.:   0   0   1   0   1   0   0   0   0   1

1) Binary Encoding: Consider a binary coding approach. With each set in a variable's space numbered, it may be converted to its binary equivalent; note that we are confined to powers of two for the number of sets per variable.

To encode the results seen in Table I we would require four Boolean inputs (x1, ..., x4) and one output (y). Note that each of the four sets for an input space can be represented with two bits, where '00' denotes the first set (e.g., A1), '01' the second, and so forth. This encoding results in the Boolean function (representing the target system) having a value of 1 for the minterm (1, 0, 0, 0).

After subsequent minimization, we would need to decode the binary variables in order to obtain actual set-based logic descriptions. When running into don't cares, it would be necessary to add a third level of logic to the sum-of-products representation. Consider a resulting product term such as y = x1 · x2′ · x3, where x4 is a don't care; x1 x2′ ('10') would decode to the third set in A (A3), while x3 with x4 unspecified ('1*') could decode into either B3 or B4; this essentially forms a union of sets in the same universe. Thus, the above expression decodes into a two-level product-of-sums equation, Y1 = A3 · (B3 + B4). The third level of logic is constructed when summing the remaining product terms to form the cover.
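The decoding of two-bit codes, including don't cares, can be sketched as follows (illustrative code; the set numbering follows the '00'-to-'11' convention above, and all names are ours):

```python
def decode_pair(bits):
    """Decode a two-bit (possibly don't-care) code into a union of set indices:
    '00' -> {1}, '01' -> {2}, '10' -> {3}, '11' -> {4}; '-' expands to both."""
    codes = ['']
    for b in bits:
        codes = [c + x for c in codes for x in (('0', '1') if b == '-' else (b,))]
    return {int(c, 2) + 1 for c in codes}

# Product term y = x1 x2' x3 with x4 as a don't care: '10' for A, '1-' for B.
print(decode_pair('10'))  # {3}     -> A3
print(decode_pair('1-'))  # {3, 4}  -> B3 + B4, giving Y1 = A3 * (B3 + B4)
```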

2) Set-Based Encoding: Not only are we limited to powers of two for the number of sets, but the binary encoding scheme restricts the optimization to dealing with certain unions of sets in a variable's domain (expressed with don't cares as illustrated above). A more suitable approach is a straight translation of set memberships. For the results in Table I we would directly use those membership values, resulting in eight inputs (x1, ..., x8) and two outputs (y1, y2). While this results in more Boolean variables in comparison to binary coding, the logic optimization methods are quite capable of efficiently handling large Boolean systems.

Note that with each set membership coded as its own Boolean variable, we encounter the issue of complements. While complements are not often considered in fuzzy models, they nevertheless retain meaning, e.g., "variable A is not low". As well, similar to the binary decoding scheme mentioned above, we can derive a three-level logic description and avoid complements altogether if desired; notice how one or more complements of set membership in a universe can be equivalently represented as a union of the remaining sets: from Fig. 4, the negation of A1 (i.e. A1′) is equivalent to A2 ∪ A3 ∪ A4. Here there is no limitation in expressing set unions to form new, broader sets when the nature of the data shows these relationships to be pertinent.

Fig. 5. Three-layered logic processing topology, forming the core of the fuzzy model. The small circles denote negations.

In the interest of conciseness and knowledge interpretability, it is advantageous to consider both complements and set unions for decoding. From the Fig. 4 example, suppose we encountered the product term y1 = x1′ · x2′ · x8′. Using this hybrid-decoding scheme, we end up with Y1 = (A3 + A4) · B4′. Note that it is more suitable to state "A3 or A4" rather than "not A1 and not A2", and that "not B4" is a more concise statement than "B1 or B2 or B3".

C. Deriving Knowledge-Based Neural Networks

With fuzzy neurons we are able to derive a complete knowledge-based neural architecture from the structurally optimized core. The hybrid idea of a three-level logic description with complements (negations) can directly translate into a three-layer network, as seen in Fig. 5 for a single fuzzy output (recall from Section II-A how we can form separate logic descriptions for each output label).

Note that we fix the connections of the first hidden layer of OR neurons. The inputs to these elements are not weighted because they deal only with fuzzy sets existing in the same universe of discourse, and hence can be viewed as a union creating a new, more general membership function for a particular variable. In this way, the two-level topology shown in Fig. 2 is preserved. The degree of membership to this new fuzzy set is weighted like any other input when fed into the hidden AND layer, which processes the fuzzy relations formed from separate variable spaces.
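A forward pass through this three-layer core might be sketched as follows, again assuming the product/probabilistic-sum pair of triangular norms (topology, grouping and names are our own illustration):

```python
from functools import reduce

t = lambda a, b: a * b
s = lambda a, b: a + b - a * b

def forward(memberships, groups, and_w, or_w):
    """Three-layer core: an unweighted OR per variable group (set unions),
    weighted AND neurons (rule conditions), one weighted OR output."""
    # Layer 1: unions of sets within the same universe (fixed connections).
    unions = [reduce(s, (memberships[i] for i in g), 0.0) for g in groups]
    # Layer 2: AND neurons, y = T_i (z_i s w_i).
    hidden = [reduce(t, (s(z, w) for z, w in zip(unions, ws)), 1.0)
              for ws in and_w]
    # Layer 3: OR output, y = S_i (h_i t w_i).
    return reduce(s, (t(h, w) for h, w in zip(hidden, or_w)), 0.0)

# Two inputs with two sets each; one rule over the union A1 u A2 and B2 alone.
memberships = [0.2, 0.7, 0.0, 0.9]   # A1, A2, B1, B2
groups = [[0, 1], [3]]
y = forward(memberships, groups, and_w=[[0.0, 0.0]], or_w=[1.0])
print(round(y, 3))  # 0.684
```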

D. Training the Network

To provide improved behavioural approximation, the network's connections must be properly adjusted during an appropriate learning process. In terms of speed and efficiency, an on-line, gradient-based method of learning is desirable. With the fundamental structure already pre-defined, the size of the learning problem is greatly reduced from traditional fully


connected networks, allowing us to concentrate on only the most important connections rather than training the network from scratch.

The general scheme of learning can be qualitatively described as

Δconnections = -α · ∂Q/∂connections

where α denotes a learning rate, and Q is defined as [y(k) - target(k)]², a squared-error performance measure that must be minimized. Subsequently, the connections of the network are adjusted with these increments. In addition to the numeric calibration of the network,

the connections of the fuzzy neural network help prune the original structure, done by applying the thresholding operation discussed in Section IV-B. The network after pruning can be represented in an equivalent rule-based format:

if condition_i and condition_j and ... then conclusion_k

In this setting, the connections of the fuzzy neural network can be interpreted as calibration factors of the conditions and rules. The connections of the AND neuron modify the membership functions of the input fuzzy sets; a higher connection value means a less specific fuzzy set. In the limit, when the connection is equal to 1, the corresponding fuzzy input is eliminated from the rule. The connections of the OR neuron determine the confidence of the rule, meaning that the overall condition of the rule is quantified in terms of its relevance.
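The update rule and subsequent thresholding can be caricatured with a numerical-gradient sketch on a single AND neuron (our own construction under the product/probabilistic-sum norms; the paper trains the full network, and its gradients need not be numeric):

```python
from functools import reduce

t = lambda a, b: a * b
s = lambda a, b: a + b - a * b

def and_neuron(z, w):
    return reduce(t, (s(zi, wi) for zi, wi in zip(z, w)), 1.0)

def train(data, w, alpha=0.01, epochs=1000, eps=1e-6):
    """On-line descent on Q = [y(k) - target(k)]^2 with numeric derivatives:
    delta_w = -alpha * dQ/dw, connections kept in [0, 1]."""
    for _ in range(epochs):
        for z, target in data:
            for i in range(len(w)):
                def q(wi):
                    probe = w[:i] + [wi] + w[i + 1:]
                    return (and_neuron(z, probe) - target) ** 2
                grad = (q(w[i] + eps) - q(w[i] - eps)) / (2 * eps)
                w[i] = min(1.0, max(0.0, w[i] - alpha * grad))
    return w

# The target depends on input 0 only, so learning should drive connection 1
# toward 1, eliminating that input from the rule.
data = [([0.9, 0.2], 0.9), ([0.1, 0.8], 0.1), ([0.5, 0.5], 0.5)]
w = train(data, [0.5, 0.5])
pruned = [i for i, wi in enumerate(w) if wi > 0.7]  # AND-neuron threshold
print(pruned)  # indices of inputs whose connections exceed the threshold
```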

VI. EXPERIMENTAL STUDIES

For the experimental part of this study, we provide a comprehensive case study concerning a real-world system detailing housing prices in the Boston area. Note that when computing in the fuzzy domain, the triangular norms are realized as the product for the norm and the probabilistic sum for the co-norm. We use the centre of gravity (COG) formula [2] as a decoding (defuzzification) scheme in order to measure the real-world performance of the models with a root-mean-squared error (RMSE) formula, given in a standard format:

RMSE = sqrt( (1/N) Σ_{k=1..N} [y(k) - target(k)]² )

When discretizing a continuous system into a small number of binary sets, it is difficult to avoid conflicts in the data, i.e. having two or more identical input patterns showing inconsistent output patterns. This is discussed quantitatively in the ensuing experiments. As well, note that this problem generally diminishes with increasing dimensionality.
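Detecting such conflicts in a binarized truth table is straightforward; a sketch (names and rows are illustrative):

```python
from collections import defaultdict

def find_conflicts(truth_rows):
    """Group binarized rows by input pattern; a conflict is one input pattern
    mapped to two or more distinct output patterns."""
    seen = defaultdict(set)
    for inputs, outputs in truth_rows:
        seen[tuple(inputs)].add(tuple(outputs))
    return {k: v for k, v in seen.items() if len(v) > 1}

rows = [
    ([0, 0, 1, 0], [0, 1]),
    ([0, 0, 1, 0], [1, 0]),   # same inputs, different output: a conflict
    ([1, 0, 0, 0], [0, 1]),
]
print(find_conflicts(rows))  # {(0, 0, 1, 0): {(0, 1), (1, 0)}}
```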

The Boston housing data set was taken from the UCI repository of machine learning databases [10]. It concerns a description of real estate in the Boston area, where housing is characterized by a number of features (see Table II). The dataset consists of 506 14-dimensional points; the construction of the fuzzy model is completed for 304 data points (60%) treated as a training set. This is done ten times to complete a ten-fold cross-validation.

The goal is to create a model with two outputs, each representing an information granule defined on the output space, i.e. the median housing price. These outputs take the labels "low" and "high". We use three granules for each input space, with the labels "low", "medium" and "high". Note that the choice of these numbers was arbitrary, seeming like a reasonable number of information granules for generating rules.

In the first phase of model development we attempt to find a structure in the data through information granulation, binarization (using set-based encoding) and logic minimization. During binarization, the amount of conflicting data was minimal, amounting to 14 ± 1.9 data points, less than 5% of the training set. These conflicting data points were removed for structural learning in order to provide a valid truth table to Espresso, but were re-introduced during parametric optimization.

The resultant structures showed consistency between cross-validation iterations, with the number of rules (product terms) equalling 19.1 ± 2.23 and the number of literals 68.5 ± 11.6. Note that the data is fairly complex, requiring a large number of non-parameterized rules to describe its structure.

With binary processing, the performance results amounted to 15.4 ± 0.16 and 16.1 ± 0.33 for training and testing, respectively. These numbers improved to 10.4 ± 0.49 and 10.5 ± 0.69 when using fuzzy processing. Here we see excellent gains; as well, notice the improved testing performance, showing the high level of robustness of the fuzzy interface.

Proceeding with neurofuzzy parametric optimization, the learning rate was set to 0.01, running for 1000 epochs. The results were 3.51 ± 0.22 for training and 3.98 ± 0.61 for testing. Note that perhaps the output granulation was too coarse for achieving high accuracy during structural learning, although it appears that this did not hinder the parametric learning, which took advantage of the underlying structure

TABLE II
ATTRIBUTES OF THE BOSTON HOUSING DATA

CRIM     per capita crime rate by town
ZN       proportion of residential land zoned for lots over 25,000 ft²
INDUS    proportion of non-retail business acres per town
CHAS     Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX      nitric oxides concentration (parts per 10 million)
RM       average number of rooms per dwelling
AGE      proportion of owner-occupied units built prior to 1940
DIS      weighted distances to five Boston employment centres
RAD      index of accessibility to radial highways
TAX      full-value property-tax rate per $10,000
PTRATIO  pupil-teacher ratio by town
B        1000(Bk - 0.63)², where Bk is the proportion of African-Americans by town
LSTAT    % lower status of the population
MEDV     median value of owner-occupied homes in $1000s

Fig. 6. Scatter plots showing system output vs. model output: (a) fuzzy core, (b) neurofuzzy core

in order to achieve large performance increases over the non-parameterized fuzzy model. To view these improvements visually, refer to the scatter plots presented in Fig. 6.

In addition to the performance gain, the neural augmentation of the model was able to drastically reduce the structural complexity. The best performing network out of the ten training instances was pruned using the process detailed in Section IV-B, with thresholds of 0.7 for AND neurons and 0.4 for OR neurons. As a result, the size of the rule-base was simplified from 16 rules with 51 literals to 5 rules with 10 literals.

The interpretation of this learnt knowledge is found in Table III, showing a concise logic description. Note how the rules are quite intuitive; for instance, for some high-price houses, it makes sense that the average number of rooms is high and the age is not high (not old). Although the resultant structures of the ten networks were not exactly the same due to the different training data, there were many similarities and common rules among them, exhibiting non-contradictory knowledge.

One of the advantages of such a transparent model is its ability to be customized. For instance, note that one of the rules in Table III states that NOX must be "not medium" (low or high) to cause a low housing price. This does not appear to be very meaningful, and could merely reflect some error or lack of information in the data. Upon removing this portion of the rule, the overall performance remained unaffected, actually improving slightly. However, if we attempted to remove a seemingly more truthful rule, such as RM being "high" for a high housing cost, the performance degraded significantly.

Finally, the computational efficiency of the model development should be emphasized; for a single instance of learning the Boston housing dataset, both structural and parametric optimization were completed together in under ten seconds on a 1.33 GHz PowerPC G4 processor, from data to knowledge.

VII. CONCLUSION

In this study, we have proposed and validated an effective and novel design methodology for fuzzy system modelling. The two key technologies used here for model development, logic minimization and fuzzy neural networks, are instrumental in achieving overall accuracy with inherent abilities to provide a completely interpretable architecture. Not only is the methodology capable of handling large problems and overcoming the "curse" of dimensionality, but it is also extremely computationally efficient.

There are still a number of issues worth pursuing in further improving upon the methods presented here.

- Interpretation mechanisms may be improved through a more sophisticated pruning process considering various criteria such as performance (accuracy) and structural complexity (interpretability).

- The granular interface may be better constructed by capturing the nature of the data.

REFERENCES

[1] W. Pedrycz and M. Reformat, "Evolutionary fuzzy modeling," IEEE Trans. Fuzzy Syst., vol. 11, pp. 652-665, Oct. 2003.

[2] W. Pedrycz and F. Gomide, An Introduction to Fuzzy Sets: Analysis and Design. Cambridge, MA: MIT Press, 1998.

[3] M. Delgado, A. F. Gomez-Skarmeta, and F. Martin, "A fuzzy clustering-based rapid prototyping for fuzzy rule-based modeling," IEEE Trans. Fuzzy Syst., vol. 5, pp. 223-233, May 1997.

[4] R. K. Brayton, A. L. Sangiovanni-Vincentelli, C. T. McMullen, and G. D. Hachtel, Logic Minimization Algorithms for VLSI Synthesis. Boston, MA: Kluwer Academic Publishers, 1984.

[5] J. Hlavicka and P. Fiser, "BOOM - a heuristic Boolean minimizer," in Proc. IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2001), 2001, pp. 439-442.

[6] S. Sapra, M. Theobald, and E. Clarke, "SAT-based algorithms for logic minimization," in Proc. 21st International Conference on Computer Design (ICCD'03), 2003, pp. 510-517.

[7] J. A. Goldman and M. L. Axtell, "On using logic synthesis for supervised classification learning," in Proc. Seventh International Conference on Tools with Artificial Intelligence, 1995, pp. 198-205.

[8] E. J. McCluskey, "Minimization of Boolean functions," Bell Syst. Tech. J., vol. 35, pp. 1417-1444, Apr. 1956.

[9] (1994) Espresso Boolean minimization software. [Online]. Available: http://www-cad.eecs.berkeley.edu/Software/software.html

[10] C. J. Merz and P. M. Murphy, UCI repository of machine learning databases. Dept. Inform. Comput. Sci., Univ. California, Irvine. [Online]. Available: http://www.ics.uci.edu/~mlearn/MLRepository.html


TABLE III
COLLECTION OF QUANTIFIED RULES DERIVED FROM THE BOSTON HOUSING MODEL

[Table only partially recoverable from the transcript: five quantified rules with conditions over NOX, RM, AGE, RAD, TAX, PTRATIO and LSTAT (e.g., "NOX not medium", "RM high", "AGE not high", "LSTAT low"), each condition carrying a connection value (e.g., 0.001, 0.181, 0.254); then-conclusions MEDV = low, low, low, high, high, with rule confidences 0.441, 0.493, 0.606, 0.984 and 0.890, respectively.]
