19
* 7,5 26,25 8 *

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijaithome.agh.edu.pl/~gjn/wiki/_media/xttform-ijait.pdf · June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait 2 Grzegorz J. Nalepa,

  • Upload
    leliem

  • View
    218

  • Download
    0

Embed Size (px)

Citation preview

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

International Journal on Arti�cial Intelligence Toolsc© World Scienti�c Publishing Company

FORMALIZATION AND MODELING OF RULES

USING THE XTT2 METHOD∗

Grzegorz J. Nalepa

Antoni Lig¦za

Krzysztof Kaczor

AGH University of Science and Technology

al. A. Mickiewicza 30, 30-059 Krakow, Poland

[email protected], [email protected], [email protected]

Received (Day Month Year)Revised (Day Month Year)Accepted (Day Month Year)

The paper discusses a new knowledge representation for rule-based systems called XTT2.It combines decision trees and decision tables forming a transparent and hierarchicalvisual representation of the decision units linked into a work�ow-like structure. Thereare two levels of abstraction in the XTT2 model: the lower level, where a single knowledgecomponent de�ned by a set of rules working in the same context is represented by a singledecision table, and the higher level, where the structure of the whole knowledge base isconsidered. This model has a concise formalization which opens up possibility of well-de�ned, structured design and veri�cation of formal characteristics. Based on the visualXTT2 model, a textual representation of the rule base is generated. A dedicated engineprovides a uni�ed run-time environment for the XTT2 rule bases. The focus of the paperis on the formalization of the presented approach. It is based on the so-called ALSV(FD)logic that provides an expressive calculus for rules.

Keywords: rules; rule-based systems; knowledge-representation; inference engine.

1. Introduction

Rule-based systems still constitute one of the most powerful and most popular

knowledge representation formalisms 7,5 for intelligent systems 26,25. Formalization

of knowledge within a rule-based system can be based on logic 8 or performed

on the basis of engineering intuition, constituting a kind of superstructure with a

programming language such as Lisp or Java in the background. Modern rule-based

shells, including CLIPS, Jess, Drools, Aion, Ilog Rules, or G2 of Gensym, follow the

latter, classical paradigm, where the rule language is a programming solution, with

no formal de�nition. For example, CLIPS and Jess follow the Lisp syntax, while

Drools, build upon Java o�ers more user-friendly knowledge encoding.

∗The paper is supported by the PARNAS Project funded from NCN (National Science Center)resources for science.

1

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

2 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

On the other hand, development of a strict, logical rule representation language

from scratch seems to o�er indisputable chance with respect to formal design and

evaluation. The main objectives for introducing a logic-based formalization of the

rule language are as follows:

• providing a clear framework enabling uniform knowledge modeling with

well-de�ned expressive power,

• speeding up the design process � logic-based rule language opens possibility

to partially formalize the design process which can in turn lead to better

detection of design errors, possibly at early development stages,

• allowing for a superior knowledge base quality control � formal methods

can be used to identify logical errors in rule formulation,

• simplifying knowledge translation � partially formalized translation to other

knowledge representation formats are possible, and

• proposing custom inference modes � structured rule bases require inference

strategies alternative to the classical inference algorithms.

In fact, well-formalized knowledge bases are easier to design, develop and verify;

the ultimate goal is to enable e�cient Knowledge Management, in a way analogous

to data management in Database Management Systems.

The rule representation discussed in this paper is called XTT2 (eXtended Tab-

ular Trees) 17,20. This hybrid knowledge representation combines decision trees and

decision tables. It forms a transparent and hierarchical visual representation of

the decision tables linked into a decision network structure. The name XTT2 was

kept to provide compatibility with previous works.

There are two levels of abstraction in the XTT2 model: the lower level � where

a single knowledge component de�ned by a set of rules working in the same context

is represented as a single XTT2 table, and the higher level � where the structure

of the whole XTT2 knowledge base (consisting of XTT2 tables) is considered. Such

knowledge representation provides not only high density of knowledge visualization,

but assures transparency and readability.

Contrary to majority of other systems, where a basic knowledge item is a single

rule, in the XTT2 formalism the basic component displayed, edited and managed

at a time is an extended decision table. Such table is logically equivalent to a set of

rules. Due to the fact that the XTT2 model represents rules using the ALSV(FD)

(Attributive Logic with Set of Values over Finite Domains) logic 20, it is much more

expressive than the classic (mostly propositional) rule languages, e.g. it allows for

formal speci�cation of non-atomic values in rule conditions.

The approach discussed in this paper is based on certain concepts related to

dynamic system modeling. The primary assumption is that the rule-based model is

a model of a dynamic system having certain internal state. The state is described

using attributes, that refer to certain crucial properties of the system. The state is

represented by the set of current attribute values. A statement that an attribute

has a given value can be interpreted as a fact in terms of classic expert systems.

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 3

The dynamics of the system � transitions between states � is modeled using rules.

Rules are described using the ALSV(FD) logic. The conditional part of the rule is

a conjunction of atomic formulae in the ALSV(FD) logic which are the attribute-

relation-value triples. The decision part of the rule includes statements that modify

system state in case the rule is �red (the proper decision) and actions that do not

change attribute values, thus the state.

The paper is organized as follows: In the following section an introduction to

the ALSV(FD) is given. Then, in Sect. 3 a proposal of the formalization of the

approach is introduced. General concepts of the inference control are then presented

in Sect. 4. A description of a prototype implementation follows in Sect. 5. Then a

short discussion of the related works and evaluation is presented in Sect. 6. The

paper ends with concluding remarks.

The results presented in this paper follow the research line described in 20,9.

This paper is partially based on the recent developments presented in detail in 14.

2. Overview of the ALSV(FD) Logic

In 8 a thorough discussion of attributive logics and their application in rule-based

systems was given. It includes a formal framework of SAL (Set Attributive Logic)

that provides syntax, semantics and some notes on inference rules for a logical

calculus in which attributes can take set values.

Here, an improved and extended version of SAL, namely ALSV(FD) (Attribu-

tive Logic with Set Values over Finite Domains) 18,19,20 is considered. It is ori-

ented towards Finite Domains (FD) and its expressive power is increased through

the introduction of new relational symbols enabling de�nitions of atomic formulae.

Moreover, ALSV(FD) introduces a formal speci�cation of the partitioning of the

attribute set needed for practical implementation, and a more coherent notation.

Simple and Generalized Attributes Let A denote the set of all attributes

used to describe the system. Each attribute has a set of admissible values that

it takes (a domain). Let D denote the set of all possible attribute values; D =

D1 ∪ D2 ∪ · · · ∪ Dn where Di is the domain of attribute Ai ∈ A, i = 1 . . . n. Any

domain Di is assumed to be a �nite. In the general case, the domain can be ordered,

partially ordered or unordered (this depends on the speci�cation of an attribute,

see Sect. 5).

In ALSV(FD) two types of attributes are identi�ed: simple ones taking only one

value at a time, and generalized ones taking multiple values. Therefore, we introduce

the following partitioning of the set of all attributes: A = As ∪ Ag, As ∩ Ag = ∅where: As is the set of simple attributes, and Ag is the set of generalized attributes.

A simple attribute Ai is a function (or a partial function) of the form:

Ai : O→ Di (1)

where: O is a set of objects, Di is the domain of attribute Ai.

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

4 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

The de�nition of generalized attribute is as follows:

Ai : O→ 2Di (2)

where: O is a set of objects, 2Di is the set of all possible subsets of the domain Di.

Attribute Ai denotes a property of an object. The formula Ai(o), where o ∈ O,denotes the value of property Ai of object o.

For simplicity, in the rest of the paper no objects are speci�ed in an explicit way.

It is assumed that only one object (in this case it is the system being described) with

a speci�c property name exists. This is why, the following notational convention is

used: the formula Ai = V simply denotes a value (V ) of the attribute Ai.

Consider the following example of a system recommending books to di�erent

groups of people depending on their age and reading preferences. The age of a

reader and his/her preference could be represented by the following attributes: A =

{fav_genres, age, age_filter, rec_book}. In this case we assume that the second

attribute is a simple one, whereas the others are generalized ones. The fourth one

contains book titles that can be recommended to a reader. The attributes have the

following domains: D = Dfav_genres ∪ Dage ∪ Dage_filter ∪ Drec_book, where:

• Dfav_genres = {horror, handbook, fantasy, science, historical, poetry},• Dage = {1 . . . 99},• Dage_filter = {youngHorrors, youngPoetry, adultHorrors, adultPoetry},• Drec_book = { 'It', 'Logical Foundations for RBS', 'The Call of Cthulhu'}.

The system that is described using the attributes is in a certain state. The state

of the system is described by the values of attributes.

State Representation The current values of all attributes are speci�ed within

the contents of the knowledge base. From logical point of view the state of the

system is represented as a logical formula of the form:

s : (A1 = S1) ∧ (A2 = S2) ∧ . . . ∧ (An = Sn) (3)

where Ai are the attributes and Si are their current values. Note that Si ∈ Di for

simple attributes and Si ⊆ Di for generalized ones.

An explicit notation for covering unspeci�ed, unknown values is proposed: Ai =

null means that the value of Ai is unspeci�ed.

Following the example, a state can be de�ned as: (age = 16) ∧ (fav_genres =

{horror, fantasy}). This means that a given person is 16 years old and she or he

likes reading horror and fantasy books. In fact, it is a partial state where only the

values of the input attributes are de�ned. In this example, it will be su�cient to

start the inference process. To specify the full state, the values of the remaining

attributes should be de�ned as null .

Attribute Classes Considering the practical implementation of the commu-

nication architecture, where several attribute classes are identi�ed, the following

partitioning of the set of attributes is introduced:

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 5

As = Asin ∪ As

int ∪ Asout ∪ As

io (4)

Ag = Agin ∪ Ag

int ∪ Agout ∪ Ag

io (5)

where all these sets are pairwise disjoint:

• Asin,A

gin are the sets of input attributes,

• Asint,A

gint are the sets of internal attributes,

• Asout,A

gout are the sets of output attributes, and

• Asio,A

gio are the sets of attributes that can be simultaneously input and

output (communication attributes).

These attribute classes (i.e. input, internal, output, and communication) are used in

the rule speci�cation to support the interaction of the system with its environment.

They are handled by dedicated callbacks. Callbacks are procedures providing means

for reading and writing attribute values (see Sect. 5).

In the example, both fav_genres and age attributes are input ones, age_filter

is an internal one, and rec_book is an output one.

Atomic Formulae Syntax Let Ai be an attribute from A, and Di the domain

related to it. Let Vi denote an arbitrary subset of Di and let di ∈ Di be a single

element of the domain. The legal atomic formulae of ALSV(FD) along with their

semantics are presented, for simple and general attributes respectively.

If Vi is an empty set (the attribute takes no value), we shall write Ai = ∅. Inthe case when the value of Ai is unspeci�ed, we shall write Ai = null . If the current

attribute value is of no importance, we shall write A = any.

More complex formulae can be constructed with conjunction (∧) and disjunction(∨); both of these have classical meaning and interpretation. For enabling e�cient

veri�cation, there is no explicit use of negation in the formulae. The proposed set

of relations is selected for convenience and they are not completely independent.

The meaning of these formulae is as follows:

• Ai = di � the value of Ai is precisely de�ned as di,

• Ai ∈ Vi � the current value of Ai belongs to Vi,

• Ai 6= di � shorthand for Ai ∈ Di \ {di},• Ai 6∈ Vi � shorthand for Ai ∈ Di \ Vi,• Ai = Vi � Ai equals to Vi (and nothing more),

• Ai 6= Vi � Ai is di�erent from Vi (at least at one element),

• Ai ⊆ Vi � Ai is a subset of Vi,

• Ai ⊇ Vi � Ai is a superset of Vi,

• Ai ∼ Vi � Ai has a non-empty intersection with Vi.

Formulae Semantics The semantics of the atomic formulae is as follows:

• If Vi = {d1, d2, . . . , dk}, then Ai = Vi means that the attribute takes as its

value the set of all the values speci�ed with Vi (and nothing more).

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

6 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

• (Ai ⊆ Vi) ≡ (Ai = Ui) for some Ui such that Ui ⊆ Vi, i.e. Ai takes some of

the values from Vi (and nothing out of Vi),

• (Ai ⊇ Vi) ≡ (Ai = W ), for some W such that Vi ⊆ W , i.e. Ai takes all of

the values from Vi (and perhaps some out of Vi), and

• (Ai ∼ Vi) ≡ (Ai = Xi), for some Xi such that Vi ∩ Xi 6= ∅, i.e. Ai takes

some of the values from Vi (and perhaps some out of Vi).

In the example, the following atomic formulae could be present: age ∈{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17}, which could also be denoted as

age < 18 and fav_genres ⊆ {science, fantasy, horror}. The interpretation of the

second one is: the person likes a subset of science, fantasy, horror books.

ALSV(FD) was introduced with practical applications for rule languages in

mind. In fact, the primary aim of the presented language is to formalize and extend

the notational possibilities and expressive power of rule languages for modularized

Rule-Based Systems (RBS). The atomic formulae of ALSV(FD) correspond to

simple statements (facts) about attribute values. These formulae are then used to

express certain conditions or constraints. Using this formalism, a complete solution

that allows for building decision rules is discussed in the following section.

3. Formalization of Modularized Rule Bases

The rule formalism considered here is called XTT2. It provides a formalized rule

speci�cation, based on some preliminary results presented in 10,17. This section

starts from single rule formulation using the ALSV(FD) concepts. Then provides

de�nitions for grouping similar rules into decision components (tables) linked into

an inference network. A number of structural de�nitions is given; their goal is to

formalize the structure of the knowledge base. Moreover, they are used to organize

the process of the design and possible translation of the knowledge base.

Conclusion and Decision Let us consider the following convention where two

identi�ers are used to denote attributes as well as operators in rule parts: cond

corresponds to the conditional part of a rule, and dec corresponds to the decision

part. Using it, two subsets of the attribute set can be identi�ed. Acond is a subset

of attribute set A that contains attributes present in the conditional part of a rule.

Adec is a subset of attribute set A containing attributes in the decision part.

Relational Operators in Rules Considering the syntax of the legal

ALSV(FD) formulae, the legal use of the relational operators in rules is speci-

�ed. With respect to the previously identi�ed attribute classes, not every operator

can be used with any attribute or in any rule part. Hence, the set of all operators

is divided into smaller subsets that contain all the operators, which can be used at

the same time.

The set of all relational operators that can be used in rules is de�ned as follows:

F = Fcond ∪ Fdec where:

• Fcond is a set of all operators that can be used in the conditional part of a

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 7

rule Fcond = Fconda ∪ Fcond

s ∪ Fcondg where:

� Fconda contains operators, that can be used in the rule conditional part

with all attributes. The set is de�ned as: Fconda = {=, 6=}.

� Fconds is the set that contains the operators, which can be used in a rule

conditional part with simple attributes. The set is de�ned as: Fconds =

{∈, /∈}. ALSV(FD) also allows for using the following operators <,>,≤,≥ which provide only a variation for ∈ operator. These operators

can be used only with attributes whose domains are ordered sets.

• Fcondg contains operators that can also be used in the rule conditional part

with generalized attributes. The set is de�ned as: Fcondg = {⊆,⊇,∼, 6∼}.

• Fdec is a set of all operators that can be used in a rule decision part Fdec =

{:=}. The operator := allows for assigning a new value to an attribute.

Moreover, to specify in the rule condition that the value of the attribute is to be

null (unknown) or any (unimportant) the operator = is used. To specify in the

same rule part that the value of the attribute is not null the operator 6= is used.

ALSV(FD) Triples Let us consider the E set that contains all the triples that

are legal atomic formulae in ALSV(FD). The triples are built using the previously

de�ned relational operators:

E ={(Ai,∝, di), Ai ∈ As,∝∈ F \ Fcondg , di ∈ Di} ∪ (6)

{(Ai,∝, Vi), Ai ∈ Ag,∝∈ F \ Fconds , Vi ∈ 2Di}

The ALSV(FD) triples are the basic components of rules.

XTT2 Rule Let us consider the set of all rules de�ned in the knowledge base

denoted as R. A single XTT2 rule is a triple: r = (COND,DEC,ACT), where:COND ⊆ E, DEC ⊆ E, and ACT is a set of actions to be executed when a rule is

�red. A rule can be written using LHS (Left Hand Side) and RHS (Right Hand

Side): LHS(r) → RHS(r),DO(ACT) where LHS(r) and RHS(r) correspond re-

spectively to the condition and decision parts of the rule r, and DO(ACT) involvesexecuting actions from a prede�ned set. Actions are not included in the RHS of the

rule, because it is assumed that they are independent from the considered system,

and the execution of actions does not change the state of the system.

A rule can also be presented in the following form: r : [φ1 ∧ φ2 ∧ · · · ∧ φn] →[θ1 ∧ θ2 ∧ · · · ∧ θm],DO(ACT) where: {1, . . . , n} and {1, . . . ,m} are the sets of

identi�ers, n ∈ N, m ∈ N, φ1, . . . , φn ∈ COND, θ1, . . . , θm ∈ DEC.From a logical point of view, the order of the atomic formulae in both the

precondition and conclusion parts is unimportant. In general, rules with empty

decisions can also be considered. They are helpful to take better control of the

inference process. A rule with no conditions can be used to set the attribute value.

In fact, such rules are used in real-life XTT2 systems to �ll the knowledge base

with facts or import sets of values.

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

8 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

Consider example of the following rules:

r1 : [age < 18 ∧ fav_genres ⊇ {horror}] −→ [age_filter := youngHorrors]

r2 : [age = _ ∧ fav_genres ⊆ {science}] −→ [age_filter := allScience]

r3 : [age_filter ∈ {youngHorrors, adultHorrors}] −→ [rec_book := 'It']

Now, having the previously de�ned state: (age = 16) ∧ (fav_genres =

{horror, fantasy}), it can be observed, that rules r1 and r3 could be �red. The

notion of rule �ring is explained next.

Having the structure of a single rule de�ned, the structure of the complete

knowledge base is introduced. The knowledge base is composed of tables grouping

rules having the same attributes lists (rule schemas).

Rule Schema Now let us introduce a concept of a rule schema. It can be de�ned

as follows. Consider a rule r = (COND,DEC,ACT). Then the rule schema h(r) is

de�ned as h(r) = (trunc(COND), trunc(DEC)) where: the function trunc extracts

from a set of atomic formulae a set of attributes that are used in these triples.

Therefore, each rule has a schema that is a pair of attributes sets: h = (Hcond, Hdec)

where Hcond and Hdec sets de�ne the attributes occurring in the conditional and

decision part of the rule. ThereforeHcond = trunc(COND) andHdec = trunc(DEC).A schema is used to identify rules working in the same operational context.

Such a set of rules can form a decision component in the form of a decision table.

A schema can also be considered a table header.

Decision Component (Table) Let us consider a decision component (or ta-

ble). It is an ordered set (sequence) of rules having the same rule schema, de�ned as

follows: t = (r1, r2, . . . , rn) such that ∀i,j : ri, rj ∈ t → h(ri) = h(rj) where h(ri) is

the schema of the rule ri. In XTT2 the rule schema h can also be called the schema

of the component (or table). Note that all the rules incorporated in the same table

must have the same schema. Considering the rule schema notation, a table schema

h(t) can be de�ned as h(t) = h(r), which holds for any rule r in table t .

Consider the following illustration given in Figure 1. On the left table t1 is

represented. It is an example of a table having three rules: r1, r2, r3. These rules

have the same schema h1 = ({A1, A2, A3}, {A4, A5}). This means that the respectiveALSV(FD) triples contain given attribute e.g. a triple e2,3 is a part of rule r2 and

it contains the attribute A3. To simplify the visual representation a convention is

introduced, where the schema of a table is depicted on the top of the table.

Inference Link An inference link l is an ordered pair: l = (r, t), l ∈ R×T, whereR is the set of rules in the knowledge base, and T is the set of tables. Components

(tables) are connected (linked) in order to provide inference control. A single link

connects a single rule (a row in a table) with another table. A structure composed

of linked decision components is called a XTT2 knowledge base.

XTT2 Knowledge Base The XTT2 knowledge base is the set of components

connected with links. It can be de�ned as an ordered pair: X = (T,L), where T is

a set of components (tables), L is a set of links, and all the links from L connect

rules from R with tables from T. Links are introduced during the design process

according to the speci�cation provided by the designer. The knowledge base can be

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 9

perceived as an inference network.

An example of a simple XTT2 knowledge base is presented in Figure 1. It is

composed of three tables X1 = ({t1, t2, t3} and two links {l1, l2}). In the �rst table

(t1) the table schema is indicated in an explicit way. In this table also the conditional

and decision parts are presented explicitly (action-related part is optional).

There are nine rules, three in each of the tables. Rules in a table are �red starting

from the �rst one. The order in which tables are �red depends on a speci�c inference

mode (see Sect. 4). In the simple forward chaining (data-driven) mode, the inference

process would be as follows: If rule r1 from the table t1 is �red the inference control

would be passed to table t2 through link l1, otherwise �re rule r2. If rule r2 is �red

then proceed to rule r3. If rule r3 is �red the inference control is passed to table t3through link l2. If it is not �red then the inference process stops.

DEC

Schema

Links

1 2 3 4 5

4

5

6 7 8 9

8 91 110t

t

t

r

r

r

r

r

r

r

r

r

1

2

2

3

1

3

4

5

6

7

8

9

l

l2

1

A AAAA

A AAAA

A AAAA

e e e e e

eeeee

e e e e e

1 1 1 1 1

2 2 2 2 2

3 3 3 3 3

2

2

2

1

1

1

3

3

4

4

4 5

5

53

h 1

COND

Fig. 1. An example of an XTT2 knowledge base

Let us observe that a number of speci�c structures of knowledge bases could

be considered including decision trees. In such a decision tree-like structure nodes

would consist of single decision components. So, the XTT2 knowledge base can be

seen as a generalization of classic decision trees and tables.

This section provides a logic-based formalism for decision rules. The formalism

allows for building a modularized knowledge base. Such a solution is superior to

the one found in the classic rule-based shells. The XTT2 knowledge base is highly

modularized and hence XTT2 tables are not only a mechanism for managing large

knowledge bases, but they also allow for context-aware reasoning. It is due to the

fact that each XTT2 table groups rules that belong to the same operational context

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

10 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

(have similar LHS and RHS). In practice, building such a modularized knowledge

base is not a straightforward task.

Practical inference in such a system cannot use the classic inference algorithms 1.

This is why custom inference modes are introduced.

4. Inference Control in Structured Rule Bases

Because of the non-trivial structure of the XTT2 knowledge base, custom inference

modes need to be formulated. They have been presented in this section. They should

be perceived as prototype solutions with di�erent optimized implementations pos-

sible. For a more formalized description see 15. Moreover, the development of other

modes is considered as the future work.

Any XTT2 table can have input links (inputs), as well as output links (outputs).

Links are related to the possible inference order. Tables to which no connections

point are referred to as input tables. Tables with no connections pointing to other

tables are called output tables. All the other tables (ones having both input and

output links) are considered middle tables.

The �rst most basic algorithm consists of a hard-coded order of inference. Every

table is assigned a unique integer number. The tables are �red in order from the

lowest number to the highest one. After starting the inference process, the prede-

�ned order of inference is followed. The inference stops after �ring the last table.

In case a table contains a complete set of rules (w.r.t. possible outputs generated

by preceding tables) the inference process should end with all the attribute values

de�ned by all the output tables being produced. This approach is suitable for rel-

atively small knowledge bases, where the manual analysis is possible but also for

quite large systems, provided that a speci�cation of inference process is available;

the latter is the case well-de�ned service procedures or formalized knowledge pro-

cessing in organizations such as insurance companies, telecom companies, banks,

etc. Therefore, more complex modes are considered, including DDI (Data-Driven

Inference), TDI (Token-Driven Inference), and GDI (Goal-Driven Inference). The

preliminary formalization was introduced in 15.

The Data-Driven Inference algorithm identi�es start tables, and puts all the

tables that are linked to the initial ones in the XTT2 network into a FIFO queue.

When there are no more tables to be added to the queue, the algorithm �res selected

tables in the order they are popped from the queue. The forward-chaining strategy

is suitable for simple tree-like inference structures. However, it has limitations in a

general case, because it cannot determine tables having multiple dependants.

Token-Driven Inference approach is based on monitoring the partial inference

order de�ned by the network structure with tokens assigned to tables. A table can

be �red only when there is a token at each input. Intuitively, a token is a �ag,

signalling that the necessary data generated by the preceding table is ready for use.

Note that this model of inference execution covers the case of possible loops in the

network. For example, if there is a loop and a table should be �red several times,

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 11

the token is passed from its output to its input, and it is analyzed if can be �red; if

so, it is placed in the queue.

The goal-driven approach works backwards with respect to selecting the tables

necessary for a speci�c task, and then �res the tables forward so as to achieve the

goal. The previous models of inference control can be considered blind procedures

because they do not take into consideration the goal of inference. Hence, it may

happen that numerous tables are �red without purpose � the results they produce

are of no interest. This, in fact, is a de�ciency of most of the forward-chaining rule-

based inference control strategies. In the goal-driven approach, one or more output

tables are identi�ed as the ones that can generate the desired goal values and are

put on a stack. As a consequence, only the tables that lead to the desired solution

are �red, and no rules are �red without purpose. The Goal-Driven Inference may be

particularly suitable for situations where the context of the operation can be clearly

de�ned and it is possible to clearly identify the knowledge component that needs

to be evaluated.

It can be observed that having the same rule base de�ned, a number of inference

scenarios can be applied. Their use depends on the given application and expected

results. The XTT2 representation does not enforce any single interpretation of the

rule sets (the only assumption is, that rules in a table are �red sequentially). Such

an approach allows for the reuse of the same logical model in several applications.

In Figure 2 selected aspects of the operation of the inference modes are visualized.

The concept of the XTT2 formalism has been implemented in the HeKatE

(see http://hekate.ia.agh.edu.pl) (Hybrid Knowledge Engineering) project, as

described in the following section.

5. XTT2 Implementation

For practical design and implementation of rules the XTT2 method includes two

representations: textual suitable for processing by a rule engine, and visual aimed

at a design tool. The textual representation of XTT2 is called HMR (HeKatE Meta

Representation). HMR allows for textual de�nition of all the XTT2 concepts (at-

tributes, tables, rules, links, etc). The important advantage of this representation

is the fact that it can be easily read and automatically processed by an inference

engine. The visual representation is supported by the HQEd (HeKatE Qt Editor) 22

graphical editor. In the Figure 3 the example of an editing session is depicted.

Both representations can be used to design the system. However, the visual rep-

resentation is more convenient. In fact, the textual form of HMR can be written

directly or automatically generated by the HQEd tool for a given visual representa-

tion. The HMR representation is processed by HeaRT 13 tool which is a dedicated

inference engine implemented in Prolog for reasoning with XTT2 rule bases.

The syntax of HMR provides appropriate predicates for the XTT2 concepts:

xattr allows for attributes de�nition,

xschm de�nes the table and rule schema,

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

12 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

Table 6

Table 7

Table 2Table 5

A1

Table 1

B1

A2 B2

C1 D1

C2 D2

B1

Table 3

F1

B2 F2

B1

Table 4

H1

B2 H2

D1 J1

D2 J2

H2 Z2

J1 Z5

J2 Z6

F1

H1

H1

Input tables Middle tables Output tables

H1 Z3

H2 Z4

F2

F2

J1 Z7

J2 Z8

H2

H2

T

T

T

T

DDI with assumption that fact A1 is in knowledge base and Table 1 is a start table

TDI with assumption that facts A2 and C1 are in knowledge base and Table 6 and Table 7 are goal tables

GDI with assumption that facts A1 and C2 are in knowledge base, and Table 7 is a goal table

T Token sent from one table to another

H1 Z1F1

Fig. 2. Main Inference Modes for XTT2

xrule represents rules,

xstat allows for state de�nition,

xcall allows for callback de�nition.

HMR allows for de�ning the attribute domains using types. A given type directly

de�nes a set of allowed attribute values. There are two built-in primitive types:

numeric and symbolic. The semantics of the numeric type is straightforward. The

symbolic type allows for de�ning the symbolic domains, which can contain numbers,

characters, words, etc. A named type is derived from a primitive type and can be

considered as a set of constraints imposed on a primitive type, including speci�c

domain de�nition. Every attribute is of a named type (e.g. temperature). This

makes the design easier and faster because only one type must be de�ned for all the

attributes, which have the same domain.

Here, several HMR examples from the Cashpoint case study 2 are given. The

following example shows the part of HMR �le that de�nes two named types:

xtype [name: tPin,

base: numeric,

length: 4,

scale: 0,

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 13

Fig. 3. Visual XTT2 representation in HQEd

desc: 'Represents the PIN numbers',

domain: [0 to 9999]

].

xtype [name: tUserActions,

base: symbolic,

desc: 'The set of user's actions',

domain: [withdraw,balance]

].

The �rst type is named tPin and is based on the numeric primitive type. It

contains the integer values from 0 to 9999. The two parameters length and scale

correspond to maximal length and decimal places of an allowed value. The second

type is named tUserActions and bases on the symbolic primitive type. This type

allows for only two values withdraw and balance. The next example presents a

de�nition of an attribute:

xattr [name: correctPin,

abbrev: corre,

class: simple,

type: tPin,

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

14 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

comm: in,

callback: [xpce_ask_numeric,[correctPin]]

].

The attribute has a de�ned name (correctPin) and a name abbreviation

(corre). The third line de�nes a class of the attribute (see attribute partition-

ing). In this case the attribute can take only one value at one time. In the next line

a type (domain) of the attribute is de�ned. According to the previous example, the

set of allowed values of the attribute contains the integer values from 0 to 9999.

The �fth line de�nes the relation between an attribute and an external system. In

this case the value of the attribute is provided by a system. The sixth line de�nes

a callback action that allows for getting a value of the attribute from an external

system. The callback action can receive the value from any source: sensor, Internet

or simply from a user. The following Prolog code de�nes the callback action that

uses GUI to receive the value from the user.

xcall xpce_ask_numeric: [AttName] >>>

(alsv_domain(AttName,[Min to Max],numeric),

dynamic(end), new(@dialog,dialog('Select a value')),

send_list(@dialog,append,

[new(I,int_item(AttName,low := Min, high := Max)),

new(_,button('Select',and(message(@prolog,assert,end),

and(message(@prolog,alsv_new_val,AttName,I?selection),

message(@dialog,destroy)))))]),

send(@dialog,open),

repeat,send(@display,dispatch),end,!,retractall(end)).

The next example shows the de�nition of the table schema:

xschm 'isPinCorrect': [enteredPin,correctPin] ==> [pinDifference].

The decision table de�ned by this schema is named isPinCorrect. The ta-

ble can contain the rules that have three attributes: enteredPin, correctPin and

pinDifference. The �rst two must be placed in a conditional part of the rules and

the last one in a decision part. In the following example, a single rule is de�ned.

The rule belongs to the table de�ned by the former schema:

xrule 'isPinCorrect'/1:

[enteredPin eq any, correctPin eq any]

==>

[pinDifference set (correctPin-enteredPin)]

:'authorization'.

Note that the �rst line contains information about rule schema. This information

can be considered as redundant but from the implementation point of view it allows

for convenient processing of the code. Further, the rule stays human readable. After

the / character the rule identi�er is de�ned. The second and third lines constitute a

conditional part of the rule while the �fth line a decision part. The last line de�nes

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 15

a link to the authorization table. The last example presents the HMR syntax that

allows for state de�nition:

xstat init0: [enteredPin, 1111].

xstat init0: [correctPin, 1234].

Both lines de�ne values of attributes for the state init0. HMR allows for the

de�nitions of any number of states. This can be especially useful during the system

testing. The test cases can be generated in the form of states.

The HMR representation provides a callback mechanism. This mechanism is re-

lated with the attributes and allows for receiving a value from the environment. The

HeaRT tool supports this mechanism in a two ways: all the callbacks are executed

before the inference starts, and a single callback is executed during the inference

when the value of an attribute is unknown.

A very important feature of XTT2 that has been implemented is a formal

veri�cation. HeaRT supports the veri�cation of the rules on the tables level The

rules can be checked against a number of of anomalies, see 16 for more details.

6. Related Work

This paper describes the formalization of XTT2 method that provides a complete

rule modeling framework, which includes a visual design method supported by tools,

formal veri�cation methods as well as execution environment.

Currently, a number of modern rule-based shells exist, e.g. Clips, Jess or Drools.

They provide new high-level features in the area of current software technologies,

such as Java-integration, network services, etc., However, the rule representation

and inference methods do not evolve. The rule languages found in these tools tend

to be logically trivial, and conceptually simple, with no formalization at all. They

mostly reuse very basic logic solutions, and combine them with new programming

language features, building on top of classic inference approaches, such as the blind,

forward-checking inference engines employing the Rete-style algorithm 4.

Historically, an important approach that aimed at full formalization of the

knowledge representation including basic inference tasks was KADS. In fact, in this

area an important e�ort was done, see3,27. However, KADS had a very broad per-

spective on the knowledge representation, with rules being one of several methods.

Due to this complexity, KADS-oriented research did not result in practical tools for

rule-based systems. Compared to it, the approach presented here is a focused one,

where a concrete, well-de�ned syntax and semantics of the rule language is given.

Moreover, the approach is supported by practical tools.

Considering the use of decision tables our approach is similar to Vanthienen's re-

search on decision tables. In the paper 30 it is presented how the rule-based system

can be created with the help of PROLOGA (Procedural Logic Analyzer) system,

which is an interactive rule-based tool for computer-supported construction and

manipulation of decision tables. The problems of maintenance, e�ciency and ver-

i�cation of a large knowledge bases are also discussed. A proposal of knowledge

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

16 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

clustering is described in 29, where some formalisms and algorithms are introduced.

The paper 28 describes an approach to a knowledge modularization in order to

increase an e�ciency of veri�cation.

However, in XTT2 the perspective is on state-based representation. Moreover,

the formalization is more complete, and practical inference issues in structured rule

bases are considered. Vanthienen's works do not describe an in�uence of the knowl-

edge base modularization on the inference process. This is why, XTT2 provides

dedicated inference engine, which allows for using the advanced inference strate-

gies. In comparison to Vanthienen's works, XTT2 provides a stronger formalism

and more expressive rule language based on the ALSV(FD) logic.

In terms of the formalization of decision tables other approaches can also be

pointed out. In 24 Petri net representation of decision tables is presented. However,

it is aimed mainly at distributed concurrent systems. Moreover, the rule language

used in the tables is a basic propositional one.

To summarize, it can be stated that the introduced formalization allows for

a knowledge base quality control, where formal methods can be used to identify

logical errors in rule formulation on an early stage. Ultimately, this can speed up

the design process which becomes more transparent 20. This rule language opens

possibility for simplifying knowledge translation. In fact a proposal of the XTT2 to

Drools translation has been formulated 6 Moreover, the method introduces custom

inference modes, required for a structured rule base.

7. Concluding Remarks and Future Work

In this paper a rigorous formalized approach, called XTT2, for developing rule-

based systems has been presented in detail. The approach is based on the idea of

using a formal, attributive logic based approach for rule description. Moreover, it

allows for identi�cation of a structure of the rule base, by using extended decision

tables grouping rules working in the same context.

The main original contribution of the proposed approach consists in:

• introducing a network-like structure of the rule-base, consisting of com-

plex decision tables connected with inference links; in this way the purely

declarative knowledge is enriched with inference-control component,

• building the knowledge representation scheme upon a powerful, well-

formalized attribute logic, namely the ALSV(FD),

• providing the formalized de�nitions of system components, i.e. system ta-

bles and links,

• building a set of tools providing the proof-of-a-concept.

This approach is superior to wide-spread rule-based solutions thanks to the trans-

parent and scalable visual representation 21, possibility of formal veri�cation on the

logical level 16, as well as the introduction of custom inference algorithms 15. In the

paper a prototype implementation as well as supporting tools have been presented.

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 17

See also 22 and http://ai.ia.agh.edu.pl/wiki/hekate:hades for the complete

design toolset. The approach has been evaluated using a number of illustrative

system cases that have been selected, analyzed, designed and implemented. Impor-

tant cases were made available online: http://ai.ia.agh.edu.pl/wiki/hekate:

cases:start. Cases were selected in order to investigate and possibly boost the

selected language features of XTT2.

The XTT2 representation has been invented to support knowledge engineering

and management in classic rule systems. However, one of the considered future ap-

plications includes distributed agent systems. In such an approach several attribute

pools are considered. Some attributes are used by all agents, where as some other

correspond to features of single agents. A knowledge base of a single agent is de-

scribed by a single rule set represented by a decision table. Therefore inference is

simpli�ed. A simple communication protocol allowing for synchronizing input and

output attribute values is considered. Moreover, another application of the approach

is related to Business Rules 23 modelling and implementation 11,12. A formalized

translation between the XTT2 language and other rule languages used in Business

Rules systems could be formulated. In order to do so, a model-theoretic speci�ca-

tion of the language semantics is also planned. A related approach using the R2ML

intermediate language was proposed in 31.

References

1. Szymon Bobek, Krzysztof Kaczor, and Grzegorz J. Nalepa. Overview of rule inferencealgorithms for structured rule bases. Gdansk University of Technology Faculty of ETIAnnals, 18(8):57�62, 2010.

2. Tim Denvir, Jose Oliveira, and Nico Plat. The Cash-Point (ATM) 'Problem'. FormalAspects of Computing, 12(4):211�215, December 2000.

3. Dieter Fensel, Jürgen Angele, and Dieter Landes. KARL: a knowledge acquisition andrepresentation language. In J. C. Rault, editor, Proceedings of the 11th InternationalConference Expert Systems and their Applications, volume 1 (Tools, Techniques &Methods), pages 513�528, Avignon, 1991. EC2, Nanterre.

4. Charles Forgy. Rete: A fast algorithm for the many patterns/many objects matchproblem. Artif. Intell., 19(1):17�37, 1982.

5. Joseph Giarratano and Gary Riley. Expert Systems. Principles and Programming.Thomson Course Technology, Boston, MA, United States, 4th edition, 2005.

6. Krzysztof Kluza, Grzegorz J. Nalepa, and �ukasz �ysik. Visual inference speci�cationmethods for modularized rulebases. Overview and integration proposal. In Grzegorz J.Nalepa and Joachim Baumeister, editors, 6th Workshop on Knowledge Engineeringand Software Engineering (KESE2009) at the 32nd German conference on Arti�cialIntelligence: September 21, 2010, Karlsruhe, Germany, pages 6�17, Karlsruhe, Ger-many, 2010.

7. Jay Liebowitz, editor. The Handbook of Applied Expert Systems. CRC Press, BocaRaton, 1998.

8. Antoni Lig¦za. Logical Foundations for Rule-Based Systems. Springer-Verlag, Berlin,Heidelberg, 2006.

9. Antoni Lig¦za and Grzegorz J. Nalepa. A study of methodological issues in designand development of rule-based systems: proposal of a new approach. Wiley Interdis-

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

18 Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor

ciplinary Reviews: Data Mining and Knowledge Discovery, 1(2):117�137, 2011.10. Grzegorz J. Nalepa. A new approach to the rule-based systems design and implemen-

tation process. Computer Science, 6:65�79, 2004.11. Grzegorz J. Nalepa. Business Rules design and re�nement using the XTT approach.

In David C. Wilson, Geo�rey C. J. Sutcli�e, and FLAIRS, editors, FLAIRS-20: Pro-ceedings of the 20th International Florida Arti�cial Intelligence Research Society Con-ference: Key West, Florida, May 7-9, 2007, pages 536�541, Menlo Park, California,2007. Florida Arti�cial Intelligence Research Society, AAAI Press.

12. Grzegorz J. Nalepa. Proposal of business process and rules modeling with the XTTmethod. In Viorel Negru and et al., editors, Symbolic and numeric algorithms for sci-enti�c computing: SYNASC'07: 9th international symposium: RuleApps'2007 � work-shop on Rule-based applications: Timisoara, Romania, September 26�29, 2007, pages500�506, Los Alamitos, California ; Washington ; Tokyo, 2007. IEEE, CPS ConferencePublishing Service.

13. Grzegorz J. Nalepa. Architecture of the HeaRT hybrid rule engine. In LeszekRutkowski and [et al.], editors, Arti�cial Intelligence and Soft Computing: 10th In-ternational Conference, ICAISC 2010: Zakopane, Poland, June 13�17, 2010, Pt. II,volume 6114 of Lecture Notes in Arti�cial Intelligence, pages 598�605. Springer, 2010.

14. Grzegorz J. Nalepa. Semantic Knowledge Engineering. A Rule-Based Approach.Wydawnictwa AGH, Kraków, 2011.

15. Grzegorz J. Nalepa, Szymon Bobek, Antoni Lig¦za, and Krzysztof Kaczor. Algorithmsfor rule inference in modularized rule bases. In N. Bassiliades, G. Governatori, andA. Pasckhe, editors, RuleML2011 - International Symposium on Rules, Lecture Notesin Computer Science. Springer-Verlag, 2011. accepted for publication.

16. Grzegorz J. Nalepa, Szymon Bobek, Antoni Lig¦za, and Krzysztof Kaczor. HalVA �rule analysis framework for XTT2 rules. In N. Bassiliades, G. Governatori, andA. Pasckhe, editors, RuleML2011 - International Symposium on Rules, Lecture Notesin Computer Science. Springer-Verlag, 2011. accepted for publication.

17. Grzegorz J. Nalepa and Antoni Lig¦za. A graphical tabular model for rule-based logicprogramming and veri�cation. Systems Science, 31(2):89�95, 2005.

18. Grzegorz J. Nalepa and Antoni Lig¦za. XTT+ rule design using the ALSV(FD). InAdrian Giurca, Anastasia Analyti, and Gerd Wagner, editors, ECAI 2008: 18th Euro-pean Conference on Arti�cial Intelligence: 2nd East European Workshop on Rule-basedapplications, RuleApps2008: Patras, 22 July 2008, pages 11�15, Patras, 2008. Univer-sity of Patras.

19. Grzegorz J. Nalepa and Antoni Lig¦za. On ALSV rules formulation and inference. InH. Chad Lane and Hans W. Guesgen, editors, FLAIRS-22: Proceedings of the twenty-second international Florida Arti�cial Intelligence Research Society conference: 19�21May 2009, Sanibel Island, Florida, USA, pages 396�401, Menlo Park, California, 2009.FLAIRS, AAAI Press.

20. Grzegorz J. Nalepa and Antoni Lig¦za. HeKatE methodology, hybrid engineering ofintelligent systems. International Journal of Applied Mathematics and Computer Sci-ence, 20(1):35�53, 2010.

21. Grzegorz J. Nalepa, Antoni Lig¦za, and Krzysztof Kaczor. Overview of knowledgeformalization with XTT2 rules. In N. Bassiliades, G. Governatori, and A. Pasckhe,editors, RuleML2011 - International Symposium on Rules, Lecture Notes in ComputerScience. Springer-Verlag, 2011. accepted for publication.

22. Grzegorz J. Nalepa, Antoni Lig¦za, Krzysztof Kaczor, and Weronika T. Furma«ska.HeKatE rule runtime and design framework. In Adrian Giurca, Grzegorz J. Nalepa,and Gerd Wagner, editors, Proceedings of the 3rd East European Workshop on Rule-

June 16, 2011 0:33 WSPC/INSTRUCTION FILE xttform-ijait

Formalization and Modeling of Rules Using the XTT2 Method 19

Based Applications (RuleApps 2009) Cottbus, Germany, September 21, 2009, pages21�30, Cottbus, Germany, 2009.

23. Ronald G. Ross. Principles of the Business Rule Approach. Addison-Wesley Profes-sional, 2003.

24. Marcin Szpyrka and Tomasz Szmuc. Decision tables in petri net models. In MarzenaKryszkiewicz, James F. Peters, Henryk Rybinski, and Andrzej Skowron, editors, RoughSets and Intelligent Systems Paradigms, International Conference, RSEISP 2007,Warsaw, Poland, June 28-30, 2007, Proceedings, volume 4585 of Lecture Notes inComputer Science, pages 648�657. Springer, 2007.

25. Ryszard Tadeusiewicz. Introduction to intelligent systems. In Bogdan M. Wilamowskiand J. David Irwin, editors, Intelligent systems, The Electrical Engineering HandbookSeries. The Industrial Electronics Handbook, pages 1�1�1�12. Boca Raton; London;New York: CRC Press Taylor & Francis Group, second edition edition, 2011.

26. I. S. Torsun. Foundations of Intelligent Knowledge-Based Systems. Academic Press,London, San Diego, New York, Boston, Sydney, Tokyo, Toronto, 1995.

27. Frank van Harmelen and John Balder. (ML)2: A formal language for KADS models.In ECAI, pages 582�586, 1992.

28. J. Vanthienen, C. Mues, A. Aerts, and G. Wets. A modularization approach to theveri�cation of knowledge based systems. In 14th International Joint Conference on Ar-ti�cial Intelligence (IJCAI'95) - Workshop on Validation & Veri�cation of KnowledgeBased Systems, Montreal, Canada 20 - 25 Aug 1995., aug 1995.

29. Jan Vanthienen, E. Dries, and J. Keppens. Clustering knowledge in tabular knowledgebases. In ICTAI, pages 88�95, 1996.

30. Jan Vanthienen and F. Robben. Developing legal knowledge based systems using de-cision tables. In ICAIL, pages 282�291, 1993.

31. G. Wagner, A.Giurca, and S. Lukichev. R2ml: A general approach for marking uprules. In F. Bry, F. Fages, M. Marchiori, and H. Ohlbach, editors, Principles andPractices of Semantic Web Reasoning, Dagstuhl Seminar Proceedings 05371, 2005.