Fuzzy Sets and Systems 136 (2003) 133–149

www.elsevier.com/locate/fss

Fuzzy logic methods in recommender systems

Ronald R. Yager

Machine Intelligence Institute, Iona College, 715 North Avenue, New Rochelle, NY 10801, USA

Received 5 July 2001; received in revised form 4 March 2002; accepted 25 April 2002

Abstract

Here we consider methodologies for constructing recommender systems. The approaches studied here differ from collaborative filtering: they are based solely on the preferences of the single individual for whom we are providing the recommendation and make no use of the preferences of other collaborators. We have called these reclusive methods. Another important feature distinguishing these reclusive methods from collaborative methods is that they require a representation of the objects. Considerable use is made of fuzzy set methods for the representation and subsequent construction of justifications and recommendation rules. It is pointed out that these reclusive methods, rather than being competitive with collaborative methods, are complementary.

© 2002 Elsevier Science B.V. All rights reserved.

Keywords: Customization; Recommender systems; Fuzzy methods; Collaborative filtering

1. Introduction

Recommender systems [8,5] are a rapidly emerging class of software, especially within the domain of E-Commerce [9]. Their importance is directly related to the ability of the internet to collect, store and process vast quantities of information about individuals' actions and preferences. They are an important component toward the goal of providing specific customized information to each user. Most of the current generation of recommender systems are based on collaborative filtering technologies [4,6,7,10]. An important component of collaborative filtering type systems is the calculation of similarity of interest based on correlations between individuals. In order to predict a user's potential interest in some object they have not experienced, collaborative filtering uses these measures of similarity of interest in conjunction with ratings of the object by other individuals who have experienced the object. An important feature of these pure collaborative filtering systems is that they do not require any representation of the objects being considered.

Tel.: +1-212-249-2047; fax: +1-212-249-1689.

E-mail address: [email protected] (R.R. Yager).

0165-0114/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0165-0114(02)00223-3



In this work we shall focus on a different class of recommender systems which are not collaborative. These types of systems, which we call reclusive, use only preference information about the user of interest. These types of systems require some representation of the object.

An essential difference between the collaborative filtering approach and the reclusive approach is that collaborative filtering is based upon finding a similarity between people whereas the reclusive approach is based upon finding a similarity between the objects.

What is clear, although we shall not study it here, is that future recommender systems will incorporate both these perspectives.

2. A general view of recommender systems

A recommender system is associated with a collection of objects $D = \{d_1, \ldots, d_n\}$. The purpose of this system is to recommend to the user objects of $D$ that may be of interest to him. As a tangible example of a recommender system we shall often find it convenient to use one in which the objects are movies.

Here we shall consider some approaches to this problem of recommendation and shall describe

some methods for performing this task. The implementation of technologies for developing these

systems is strongly dependent upon the type of information that is being used. In the following, we

shall discuss the types of information that may be available to a recommender system.

A prime source of information for use in a recommender system is the knowledge about

the objects in D. The usefulness of this information is dependent upon the representation used

for the objects in D. The least information-rich situation is the one in which all we have is just

some unique identification of an object and no other information. For example, all we know about a movie is just its title. A more information-rich environment is one in which we describe an object with some attributes. For example, we indicate the year the movie was made, the type of movie, the stars. These attributes and their associated values provide a representation of the objects. We can have degrees of representation; more sophisticated representations will depend upon the features used

to characterize the objects. Many techniques that can be used in recommender systems are based

upon using some representation of the objects. Generally, the more sophisticated the representation,

the better these techniques perform.

In order to make a recommendation to a user we must have some information about the user's preferences. Information about user preferences can essentially be obtained in two different ways, although these need not be mutually exclusive. We shall refer to these two modes, respectively, as extensionally and intentionally expressed preference information. By extensionally expressed preference information we mean information based upon the actions or past experiences of the user with respect to specific objects of the type found in $D$. Examples of this are movies a user has previously seen and possibly some rating of these movies. In another domain we could mean the objects which the user has purchased. By intentionally expressed information we mean some specifications by the user of what they desire in objects of the type under consideration. Generally, to be of use these specifications must be of such a nature that they can be related to the attributes and features used in the representation of the objects in $D$.

We would like to make some comment on the distinction between targeted marketing [16] and

recommender systems. We say that recommender systems are participatory in that the user is



participating in the process by providing information about their preferences. In a targeted marketing system, while information may be available about a user's preferences, this is generally based on extensional information obtained from past actions of the user. Here the target is a passive supplier of information rather than an active supplier as in the recommender system. For example, the system used by Amazon.com, while called a recommender system, is according to our definition more appropriately a targeted marketing system: it is based solely upon past information about the user and does not involve any cooperation by the user. On the other hand the system used by NETFLIX¹ is a true recommender system as it uses ratings supplied by the user.

Another characterizing aspect of these recommender systems is whether the system is collaborative or not. We shall say a system is collaborative if information about the preferences of other people is used in determining the recommendation to the current user. Furthermore, in these collaborative recommender systems the available technologies depend on the nature of the preference information used with respect to participating agents. Generally, in the collaborative approach, when extensional information is used one tries to obtain, based on mutually experienced items, some measure of correlation between the participants and uses this as a basis for providing recommendations.

In Fig. 1 we summarize the situation with respect to the information available in a recommender

system.

In order to develop a recommender system we need to use information about the user's preferences. Collaborative type recommenders can be constructed using only extensional preference information. Non-collaborative, reclusive, type systems require the availability of object representations. Table 1 shows a simplified typology of different types of recommender systems. The first column indicates a purely collaborative type of system. Columns 2–4 indicate those based purely on representation, the differences between these columns being the types of information used to specify the user's preferences. The final column indicates systems which use both collaborative and representational information.

Here we shall focus on reclusive, non-collaborative, recommender systems in which there exists a representation of the objects. At a meta level, we see a kind of symmetry between reclusive methods and collaborative methods. In both cases we make use of a vector of ratings of the objects the current user has experienced. We shall denote this as $A$. In the collaborative filtering approach we have for each collaborator a vector $A_j$ indicating their ratings of the corresponding objects. For any object $d$ unexperienced by our current user we have a vector $R$ whose components, $r_j$, are the ratings of this object by the collaborators. The procedure for obtaining the degree of recommendation can be seen to essentially involve two steps:

1. We combine $A$ and $A_j$ to obtain $S_j$, a degree of similarity of our user with each collaborator.

2. Rating of $d$ = Aggregation of weighted tuples $(S_j, r_j)$.

[Figure 1: three information sources. USER: preference information type (extensional, intentional, or both). OBJECTS of INTEREST: representation available (yes or no). COLLABORATORS (peer group): available (yes or no); preference information type (extensional, intentional, or both).]

Fig. 1. Recommender systems information structure.

Table 1
Recommender systems typology (rows: extensional preferences, intentional preferences, representation, collaborators)

¹ NETFLIX is a website that rents DVD videos. It asks users to rate the videos they have rented and uses this to recommend other videos.

In the reclusive method, for each object the user has experienced we have a representation $R_i$ as well as the current user's rating $a_i$ as contained in $A$. In addition, for an unexperienced object $d$ which we are trying to evaluate we only have a representation $R$. The procedure for obtaining the degree of recommendation also essentially involves a two step procedure:

1. We combine $R$ and $R_i$ to obtain $S_i$, a degree of similarity of the object $d$ with the experienced objects.

2. Rating of $d$ = Aggregation of weighted tuples $(S_i, a_i)$.

Our goal in the following is to develop modules that can be used to help evaluate objects with

regard to their degree of recommendation in the case of reclusive approaches.
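The two-step reclusive procedure can be illustrated with a minimal Python sketch. The text at this point fixes neither the similarity measure nor the aggregation, so both choices below are assumptions made only for illustration: a fuzzy Jaccard similarity (sum of minima over sum of maxima) and a similarity-weighted mean of the user's ratings. The function names `similarity` and `reclusive_rating` are hypothetical.

```python
def similarity(r, r_i):
    """Step 1: similarity S_i between two object representations.

    Assumed measure (not prescribed by the text): fuzzy Jaccard,
    sum of component-wise minima divided by sum of component-wise maxima.
    """
    num = sum(min(x, y) for x, y in zip(r, r_i))
    den = sum(max(x, y) for x, y in zip(r, r_i))
    return num / den if den > 0 else 0.0


def reclusive_rating(r, experienced):
    """Step 2: aggregate the weighted tuples (S_i, a_i).

    `experienced` is a list of (R_i, a_i) pairs for objects the user has rated.
    Assumed aggregation: similarity-weighted mean of the ratings a_i.
    """
    pairs = [(similarity(r, r_i), a_i) for r_i, a_i in experienced]
    total = sum(s for s, _ in pairs)
    return sum(s * a for s, a in pairs) / total if total > 0 else 0.0


# Two experienced movies with ratings 0.9 and 0.2, and a new movie whose
# representation coincides with the first one.
experienced = [([1.0, 0.0, 1.0], 0.9), ([0.0, 1.0, 0.0], 0.2)]
rating = reclusive_rating([1.0, 0.0, 1.0], experienced)
```

In this example the new object is identical to the first experienced object and shares nothing with the second, so the rating is driven entirely by the first rating.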




3. Object representation

We now turn to the issue of object representation. For our purposes the representation of an object shall be based upon a set of primitive assertions or statements. Each assertion can essentially be viewed as some declarative statement. Associated with each object and each assertion is a value contained in the unit interval indicating the degree to which the assertion is valid for that object. For example, in the movie domain a primitive assertion may be "this movie is a comedy". In this case, the value associated with this assertion for a movie indicates the degree to which it is true that this movie is a comedy. Another assertion may be that "Robert DeNiro is a star in this movie". If the movie has Robert DeNiro as one of its stars then this assertion has validity one; otherwise it is zero. Another assertion may be that "this movie was made in 1993". If the movie was made in 1995 this assertion would have a validity of zero; if it was made in 1993, it would have truth value one.

Essentially then the basis of our representation scheme is a collection of assertions or statements whose validity is determinable for any object in $D$. We shall denote this set of primitive or atomic assertions as $A = \{A_1, \ldots, A_n\}$.

The representation of an object consists of a valuation of these assertions for the object. For object $d$, $A_j(d)$ indicates the degree to which assertion $A_j$ is satisfied by $d$. When we are just focusing on one object we can denote this $a_j$. For some purposes we can view any object $d$ as a fuzzy subset over the space $A$. Using this perspective the membership grade of $A_j$ in $d$ is $d(A_j) = A_j(d) = a_j$. As an alternative perspective an object can be viewed as an $n$-dimensional vector whose $j$th component is $A_j(d)$. As we shall subsequently see, these different perspectives are useful in inspiring different information processing operations.
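The two perspectives on an object, fuzzy subset and vector, can be sketched as follows. This is only an illustration under assumed names: the assertion strings, the dictionary layout, and the helpers `membership` and `as_vector` are hypothetical and not part of the framework itself.

```python
# The set A of primitive assertions (a small illustrative example).
ASSERTIONS = ["is a comedy", "Robert DeNiro stars", "made in 1993"]

# An object d viewed as a fuzzy subset of A: assertion -> degree in [0, 1].
movie = {"is a comedy": 0.7, "Robert DeNiro stars": 1.0, "made in 1993": 0.0}


def membership(obj, assertion):
    """Return A_j(d), treating assertions that are not listed as having validity 0."""
    return obj.get(assertion, 0.0)


def as_vector(obj, assertions=ASSERTIONS):
    """Return the n-dimensional vector whose j-th component is A_j(d)."""
    return [membership(obj, a) for a in assertions]


vector = as_vector(movie)
```

Treating unlisted assertions as having validity zero is a design choice of this sketch; a system could equally store every assertion explicitly.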

We shall call a subset of related assertions an attribute or feature. For example, the subset $V$ may consist of all the assertions of the form "this movie was made in the year xyz". We can denote this attribute as the year the movie was made. Another notable subset of related assertions from $A$ may consist of all the assertions of the form "x stars in this movie". This feature corresponds to the attribute of who the stars of the movie are.

For a given recommender system, in addition to the set $A$ of primitive assertions, we shall assume the existence of a collection of features or attributes associated with the objects in $D$. We denote this collection of attributes as $F = \{V_1, V_2, \ldots, V_q\}$. Each attribute $V_j$ corresponds to a subset of assertions which can be seen as constituting the possible values for the attribute. In some special cases a feature may consist of a single assertion. The performance of a recommender system is clearly related to the sophistication of the primitive assertions and associated features used to represent the objects of interest.

We note that while we have started with assertions and constructed features by forming subsets of related assertions, it is equally valid, and perhaps more intuitive, to start with attributes and generate the primitive assertions as being the possible values for these attributes.

Attributes can be classified by various characteristics associated with their solution space [23,24]. Attributes can be distinguished with respect to the number of solutions they allow: is the attribute restricted to having only one solution, does it allow multiple solutions, must it have a solution. For example, the attribute corresponding to release year of a movie must have only one solution. On the other hand the attribute corresponding to the star of a movie can take on multiple values. The primitive assertions can also be classified with respect to the allowable truth values they can assume. For example, binary type assertions are those whose truth value must be either one or zero, while other assertions can have truth values lying in the unit interval.

We shall look a little more carefully at the relationship between atomic assertions and attributes. As we have indicated, an assertion is a declarative statement that is assigned a value for a given object depending on its degree of validity for that object; this value generally lies in the unit interval. On the other hand an attribute can be viewed as a variable that takes its value(s) from a universe associated with the variable. In our framework the universe associated with an attribute corresponds to the subset of primitive assertions that is used to define it. The value of an attribute for a given object depends upon the truth values of the associated primitives. Let us look at this.

If $V_j$ is an attribute we can find its value for a particular object $d$ in the following way. Let $A(V_j)$ indicate the subset of primitives associated with $V_j$. Let $d$ represent the fuzzy subset of $A$ corresponding to object $d$; then the value of the feature $V_j$ for object $d$ is

$$V_j(d) = A(V_j) \cap d;$$

it is the intersection of the attribute definition, the crisp subset $A(V_j)$, and the object representation, the fuzzy subset $d$. The collection of elements in the subset $V_j(d)$ determines the value of the attribute $V_j$ for the object $d$.
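The intersection of a crisp attribute definition with a fuzzy object representation can be sketched in a few lines of Python. The function name `attribute_value` and the data are hypothetical; as a simplifying assumption, the sketch returns only the support of the result, dropping assertions whose membership is zero.

```python
def attribute_value(attribute_assertions, obj):
    """Compute V_j(d) as the intersection of the crisp set A(V_j) with the fuzzy set d.

    Because A(V_j) is crisp, the membership of an assertion in the intersection
    is simply its membership in d if it belongs to A(V_j), and zero otherwise.
    Only assertions with positive membership are kept.
    """
    return {a: obj[a] for a in attribute_assertions if obj.get(a, 0.0) > 0.0}


movie = {"made in 1993": 1.0, "made in 1995": 0.0, "Robert DeNiro stars": 1.0}
year_of_release = {"made in 1993", "made in 1995"}  # A(V_j) for the year attribute

value = attribute_value(year_of_release, movie)
```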

Often the information about an object will be specified directly in terms of attribute values. We shall assume the ability to extract information about assertion validity from information expressed about attribute values. To illustrate this we consider the following. Let $A(V_j) = \{A_{j1}, A_{j2}, \ldots, A_{jn}\}$ be the subset of assertions related to the attribute $V_j$. If we are informed that the value of attribute $V_j$ for object $d$ is $q$, this means that $V_j(d) = \{A_{jq}\}$, where $A_{jq}$ is the assertion that $V_j$ is $q$. Since $V_j(d) = A(V_j) \cap d$, we can conclude that $A_{ji}(d) = 0$ for all $i \neq q$ and $A_{jq}(d) = 1$.

In relating the knowledge about assertion validation and feature values it is necessary to carefully distinguish between features that can only assume one unique value, such as date of release of a movie, and features that can assume multiple values, such as people starring in the movie. In the first case multiple assertions in $V_j(d)$ are an indication of uncertainty regarding our knowledge of the value of $V_j$. In the second case multiple assertions in $V_j(d)$ are an indication of multiple solutions for $V_j$. Here we shall not further pursue this important issue regarding different types of variables but only point to [17] for those interested. Here, we shall assume the ability to interchange between these two representations.

4. Modeling user expressed preferences

The basic function of a recommender system is to use what we shall call justifications to generate recommendations to a user. By a justification we shall mean a rationale for believing a user may like an object. These justifications can be obtained either from preferences directly expressed by users or induced using data about the user's experiences. In the following we shall look at techniques for obtaining recommendations which make use of a representation of the objects. As we noted, not all recommender technologies require representations, collaborative filtering being an example of a technology that does not need representations of the objects.




In this section we shall consider the situation in which we have a representation of the objects and the user has specified their preferences intentionally in some manner compatible with this representation. This situation is closely related to the problem of information retrieval [1]. The availability of technologies for this environment is quite rich. The quality of performance of a recommender system in this environment is strongly dependent upon the ability of the system to allow the user to effectively express their preferences. This capability is itself dependent both upon the assertions and features used to represent the object and the sophistication of the language available to the user to express their preferences in terms of these assertions and features.

In the following we briefly describe a language, called Hi-Ret, which we introduced in [18] and which is very expressive. This language makes considerable use of the ordered weighted averaging (OWA) operator [13,21]. We shall first briefly describe this operator.

The OWA operator of dimension $n$ is a mapping $\mathrm{OWA}: \mathbb{R}^n \to \mathbb{R}$ characterized by an $n$-dimensional vector $W$, called the weighting vector, such that its components $w_j$, $j = 1$ to $n$, lie in the unit interval and sum to one. The OWA aggregation is defined as

$$\mathrm{OWA}(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j b_j,$$

where $b_j$ is the $j$th largest of the $a_i$.
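The definition translates directly into code. The following is a minimal Python sketch; the function name `owa` and the validation tolerance are choices made for this illustration.

```python
def owa(weights, args):
    """Ordered weighted average: sum of w_j * b_j, with b_j the j-th largest argument."""
    if len(weights) != len(args):
        raise ValueError("weights and arguments must have the same length")
    if any(w < 0.0 or w > 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to one")
    # The reordering step: the weights attach to positions in the ordering,
    # not to particular arguments.
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


# Arguments 0.2, 0.9, 0.6 are reordered to 0.9, 0.6, 0.2 before weighting.
result = owa([0.5, 0.3, 0.2], [0.2, 0.9, 0.6])
```

Here the result is $0.5 \cdot 0.9 + 0.3 \cdot 0.6 + 0.2 \cdot 0.2 = 0.67$.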

The unique feature of this operator is the ordering of the arguments by value, a process that introduces a nonlinearity into the operation. We can represent this aggregation operator in vector notation as $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = W^T B$, where $W$ is the weighting vector and $B$ is a vector, called the ordered argument vector, whose components are the $b_j$. The generality of the operator lies in the fact that by selecting $W$ we can implement many different aggregation operators. Specifically, by appropriately selecting the weights in $W$, we can emphasize different arguments based upon their position in the ordering. From an application point of view an important feature of this operator is that the characterizing vector $W$ can be readily related to natural language expressions of aggregation rules.

A number of special cases of this operator are illustrated in the following. If the components in $W$ are such that $w_1 = 1$ and $w_j = 0$ for all $j \neq 1$ we get $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = \max_j [a_j]$. We denote this weighting vector as $W^*$. If the weights are $w_n = 1$ and $w_j = 0$ for $j \neq n$ we get $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = \min_j [a_j]$. We denote this weighting vector as $W_*$. If the weights are such that $w_j = 1/n$ for all $j$, denoted $W_{\mathrm{ave}}$, then $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = (1/n) \sum_{j=1}^{n} a_j$. Thus we see that the simple average is a special case of these operators.
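The three special cases can be checked numerically with a short Python sketch. The variable names `w_max`, `w_min` and `w_ave` are hypothetical labels for the three weighting vectors just described.

```python
def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


args = [0.2, 0.9, 0.6]
n = len(args)

w_max = [1.0] + [0.0] * (n - 1)   # all weight on the largest argument
w_min = [0.0] * (n - 1) + [1.0]   # all weight on the smallest argument
w_ave = [1.0 / n] * n             # equal weights

largest = owa(w_max, args)
smallest = owa(w_min, args)
average = owa(w_ave, args)
```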

A number of different methods have been suggested for obtaining the weighting vector to be used in the aggregation. For our purpose we shall use an approach based upon the idea of linguistic quantifiers. Classical logic provides two quantifiers for aggregating truth values, for all and there exists; these correspond to anding and oring. The concept of linguistic quantifiers was originally introduced by Zadeh [22] to help formalize the many expressions of quantification available in natural language. According to Zadeh a linguistic quantifier is a natural language expression corresponding to a proportional quantity. Examples of this are at least one, all, at least $\alpha$%, most, more than a few, and some. Zadeh [22] suggested a method for formally representing these linguistic quantifiers. Let $Q$ be a linguistic expression corresponding to a quantifier such as most; then Zadeh suggested representing this as a fuzzy subset $Q$ over $I = [0, 1]$ in which for any proportion $r \in I$, $Q(r)$ indicates the degree to which $r$ satisfies the concept identified by the quantifier $Q$.


Fig. 2. Linguistic quantifier at least $\alpha$.

In [15], Yager considered the use of linguistic quantifiers to generalize the logical quantification operation. He considered the valuation of the statement $Q(a_1, \ldots, a_n)$ where $Q$ is a linguistic quantifier and the $a_j$ are truth values. It was suggested that the truth value of this type of statement could be obtained with the aid of the OWA operator. This process involved first representing the quantifier $Q$ as a fuzzy subset $Q$ and then using $Q$ to obtain an OWA weighting vector $W$ which was used to perform an OWA aggregation of the $a_i$. Formally we denote this as

$$Q(a_1, \ldots, a_n) = \mathrm{OWA}_Q(a_1, \ldots, a_n).$$

In the following we shall describe the process of obtaining the weighting vector from the associated fuzzy subset $Q$. Here we shall restrict ourselves to the class of linguistic quantifiers called RIM quantifiers. A RIM quantifier is represented by a fuzzy subset $Q: I \to I$ in which $Q(0) = 0$, $Q(1) = 1$ and if $r_1 > r_2$ then $Q(r_1) \geq Q(r_2)$ (monotonic). These RIM quantifiers model the class in which an increase in proportion results in an increase in compatibility with the linguistic expression being modeled. Examples of these types of quantifiers are at least one, all, at least $\alpha$%, most, more than a few, some. These are the type of quantifiers that are generally used by people in expressing their preferences. If $Q$ is a RIM quantifier we associate with it an OWA weighting vector $W$ such that $w_j = Q(j/n) - Q((j-1)/n)$ for $j = 1$ to $n$.
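The construction of the weighting vector from a RIM quantifier can be sketched as follows. The function name `rim_weights` is hypothetical, and the quantifier $Q(r) = r^2$ used in the example is only an illustrative choice of a function satisfying $Q(0) = 0$, $Q(1) = 1$ and monotonicity.

```python
def rim_weights(Q, n):
    """Return the OWA weights w_j = Q(j/n) - Q((j-1)/n) for j = 1..n."""
    return [Q(j / n) - Q((j - 1) / n) for j in range(1, n + 1)]


def squared(r):
    """An illustrative RIM quantifier: Q(0) = 0, Q(1) = 1, nondecreasing."""
    return r * r


weights = rim_weights(squared, 4)
```

Because $Q$ is nondecreasing with $Q(0) = 0$ and $Q(1) = 1$, the differences are nonnegative and telescope to one, so the result is always a valid weighting vector. For this example the weights are $1/16, 3/16, 5/16, 7/16$, which places more emphasis on the smaller arguments.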

Fig. 2 is seen as corresponding to the quantifier at least $\alpha$%. For this quantifier $w_j = 1$ for $j$ such that $(j-1)/n < \alpha \leq j/n$ and $w_j = 0$ for all other $j$.

Another quantifier is one in which $Q(r) = r$ for $r \in [0, 1]$. For this quantifier we get $w_j = 1/n$ for all $j$. This gives us the simple average. We shall denote this quantifier as some.

One can consider parameterized families of quantifiers [14]. For example consider the parameterized family $Q(r) = r^{\alpha}$ where $\alpha \in [0, \infty]$. Here if $\alpha = 0$, we get the existential quantifier; when $\alpha \to \infty$, we get the quantifier for all; and when $\alpha = 1$, we get the quantifier some. In addition, for the case in which $\alpha = 2$, $Q(r) = r^2$, we get one possible interpretation of the quantifier most.
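The behavior of the power family at its extremes can be explored with a small Python sketch. The names `power_quantifier`, `rim_weights` and `owa` are hypothetical; the explicit handling of $r = 0$ is an assumption of this sketch, needed so that $Q(0) = 0$ also holds when the exponent is zero.

```python
def power_quantifier(alpha):
    """Return Q(r) = r**alpha, with Q(0) = 0 enforced even when alpha = 0."""
    def Q(r):
        if r == 0:
            return 0.0
        return r ** alpha
    return Q


def rim_weights(Q, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n) for j = 1..n."""
    return [Q(j / n) - Q((j - 1) / n) for j in range(1, n + 1)]


def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


args = [0.2, 0.9, 0.6]
exists = owa(rim_weights(power_quantifier(0), 3), args)        # exponent 0: maximum
some = owa(rim_weights(power_quantifier(1), 3), args)          # exponent 1: simple average
nearly_all = owa(rim_weights(power_quantifier(200), 3), args)  # large exponent: close to minimum
```

A large finite exponent only approximates the quantifier for all; the exact minimum is obtained in the limit.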

We are now in a position to describe the use of the OWA operator in the construction of a recommender system. We shall assume available to the user a vocabulary of linguistic quantifiers $Q = \{Q_1, Q_2, \ldots, Q_q\}$ in which they can express themselves. Furthermore, we assume that, transparent to the user, each of these quantifiers $Q_k$ is represented by a fuzzy subset $Q_k$ of the unit interval.

We now turn to the representation of user preference information. We first introduce the idea of a primal preference module (PPM). As we shall see, this will serve as the basic unit which can be used to evaluate the appropriateness of an object for recommendation based on the user's preferences. A PPM is of the form $\langle A_1, \ldots, A_q : Q \rangle$. The components of a PPM, the $A_i$, are assertions associated with the objects in $D$ and $Q$ is a linguistic quantifier. With a PPM a user can express preference information by describing what properties they are interested in with respect to the class of objects in $D$ and then using $Q$ to capture the desired relationship between these properties. For example, do they desire all or most or some or at least one of these requirements satisfied? If $h$ is a PPM we can evaluate any object in $D$ with respect to this. In particular, for object $d$ we obtain the values $A_j(d)$ from our representation of $d$ and then use the OWA aggregation to evaluate it,

$$h(d) = \mathrm{OWA}_Q[A_1(d), A_2(d), \ldots, A_q(d)].$$

Here the weighting vector is determined from $Q$.
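As a minimal sketch of how a PPM might be evaluated, the following Python code looks up the assertion values for an object and aggregates them with the OWA weights derived from the quantifier. The names `evaluate_ppm`, `rim_weights` and `owa`, the dictionary representation of the object, and the sample movie are all assumptions of this illustration.

```python
def rim_weights(Q, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n) for j = 1..n."""
    return [Q(j / n) - Q((j - 1) / n) for j in range(1, n + 1)]


def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


def evaluate_ppm(assertions, Q, obj):
    """Evaluate h(d) for the PPM <A_1, ..., A_q : Q> on the object `obj`."""
    values = [obj.get(a, 0.0) for a in assertions]  # the A_j(d)
    return owa(rim_weights(Q, len(values)), values)


def some(r):
    """The quantifier Q(r) = r, which yields the simple average."""
    return r


movie = {"is a comedy": 0.8, "Robert DeNiro stars": 1.0, "made in 1993": 0.0}
ppm_assertions = ["is a comedy", "Robert DeNiro stars", "made in 1993"]
score = evaluate_ppm(ppm_assertions, some, movie)
```

With the quantifier some, the score is the average of the three validities, here 0.6.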

While the PPM can be directly evaluated for any object, the great benefit of our system is that we can let users express their preferences in much more sophisticated ways. We now shall introduce the idea of a basic preference module (BPM). A BPM is a module of the form

$$m = \langle C_1, C_2, \ldots, C_p : Q \rangle$$

in which the $C_i$ are called the components of the BPM. The only required property of these components is that they can be evaluated for each object. That is, for any $C_i$ we need to be able to obtain $C_i(d)$. Once having this we can obtain, using the OWA aggregation,

$$m(d) = \mathrm{OWA}_Q[C_1(d), \ldots, C_p(d)].$$

Let us see what kinds of elements can constitute the $C_i$. Clearly the $C_i$ can be any of the assertions in the set $A$. More generally the $C_i$ can be any PPM, as we know how to evaluate these. Even more generally the $C_i$ can itself be a BPM if we can evaluate it.² Additionally the $C_i$ can be the negation of any of the preceding types. For example, if $C$ is an object which we can evaluate and we include as one of our components not $C$, denoted $\bar{C}$, then $\bar{C}(d) = 1 - C(d)$.

We note that preferences specified in terms of attribute values can be easily represented in this framework. Let us illustrate this. Consider an attribute $V_j$ and let $A(V_j) = \{A_{j1}, A_{j2}, \ldots, A_{jn}\}$ be the subset of assertions related to the attribute $V_j$. Without loss of generality we shall let $A_{ji}$ indicate the assertion that $V_j$ is $a_i$. First let us consider the case where $V_j$ is a variable, such as star in a movie, which can take multiple solutions. The requirement that $V_j$ has $a_q$ as one of its values can be easily expressed simply using the assertion $A_{jq}$ as one of the components in our preference modules. Consider now the situation where $V_j$ is an attribute, such as year of release of a movie, that can assume one and only one value. Consider now the representation of the desire that $V_j$ is $a_1$. We represent this as the BPM $m = \langle C_1, C_2 : \text{all} \rangle$ where $C_1$ is simply the assertion $A_{j1}$. The component $C_2$ is obtained as not $C_3$, where $C_3$ is the BPM defined by $\langle A_{j2}, A_{j3}, \ldots, A_{jn} : Q \rangle$ where $Q$ is the quantifier any. The preceding illustrates the ability to formalize preferences expressed in terms of attributes within this framework. This allows users to express their preferences in terms of attribute requirements.

² In this case we must be careful to avoid self-reference.

[Figure 3: a tree whose root is a BPM with components $C_1, \ldots, C_n$; each component is in turn a BPM over further components (such as $C_{11}$ or $C_{n1}$), and the branches terminate in primitive assertions such as $A_i$ and $A_j$.]

Fig. 3. Hierarchical structure of BPM.

Using this framework based on BPMs we can express very sophisticated user preferences. Using a BPM we can express any type of user preference information as long as it can be evaluated by decomposing it into primitive assertions. Of particular value is the fact that a user can express their preferences even using concepts and language not within the given set of primitive assertions and associated attributes, as long as they can eventually formulate their concepts using the primitive assertions. The general structure resulting from the use of BPMs is a hierarchical tree structure whose leaves are primitive assertions (see Fig. 3).

Let us see the process. A user expresses a predilection, $C$, for some types of objects. This predilection is formalized in terms of some BPM, a collection of components (criteria) and some quantifier relating these components. These components get further expressed (decomposed) by BPMs, which are then further decomposed until we reach a component that is a primitive assertion, which terminates a branch. This process can be considered as a type of grounding. We start at the top with the most highly abstract cognitive concepts; we then express these using less abstract terms and continue downward in the tree until we reach a grounded concept, a primitive assertion. Once having terminated each of the branches with a primitive assertion, our tree provides an operational definition of the predilection expressed by the user. For any object $d$ in $D$ we can evaluate the degree to which it satisfies the predilection expressed. Starting at the bottom of the tree with the primitive assertions, whose validities can be obtained from our database, we then back up the tree using the OWA aggregation method. We stop when we reach the top of the tree; the value obtained there is the degree to which the object $d$ satisfies the expressed preference.
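The bottom-up evaluation of such a tree can be sketched recursively in Python. The classes `BPM` and `Not`, the function `evaluate`, and the step-function forms chosen for the quantifiers all and any are assumptions of this sketch rather than definitions taken from the text. The example encodes a preference that the year of release is 1993, with 1994 and 1995 as the only other possible years.

```python
class Not:
    """Negation of a component: evaluates to 1 - C(d)."""
    def __init__(self, component):
        self.component = component


class BPM:
    """A module <C_1, ..., C_p : Q>; components are strings, Not or BPM instances."""
    def __init__(self, components, quantifier):
        self.components = components
        self.quantifier = quantifier


def evaluate(component, obj):
    """Back up the tree from primitive assertions using OWA aggregation."""
    if isinstance(component, str):           # leaf: a primitive assertion
        return obj.get(component, 0.0)
    if isinstance(component, Not):
        return 1.0 - evaluate(component.component, obj)
    values = [evaluate(c, obj) for c in component.components]
    n = len(values)
    Q = component.quantifier
    weights = [Q(j / n) - Q((j - 1) / n) for j in range(1, n + 1)]
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


def q_all(r):
    """Assumed form of 'all': full weight on the smallest value (minimum)."""
    return 1.0 if r >= 1.0 else 0.0


def q_any(r):
    """Assumed form of 'any': full weight on the largest value (maximum)."""
    return 1.0 if r > 0.0 else 0.0


other_years = BPM(["made in 1994", "made in 1995"], q_any)
made_in_1993 = BPM(["made in 1993", Not(other_years)], q_all)

movie_a = {"made in 1993": 1.0, "made in 1994": 0.0, "made in 1995": 0.0}
movie_b = {"made in 1993": 0.0, "made in 1994": 0.0, "made in 1995": 1.0}
score_a = evaluate(made_in_1993, movie_a)
score_b = evaluate(made_in_1993, movie_b)
```

The recursion terminates only if the tree contains no self-reference, which is the caution raised in the footnote on BPM components.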




5. User profiles

Using these basic preference modules we can now define what we shall call a user profile. One part of the user profile is the user preference profile, which consists of a collection of basic preference modules, $m_j$ for $j = 1$ to $K$, of the type described in the preceding section. Each $m_j$ provides a description of a class of objects from $D$ that the user likes. These BPMs can be simple or sophisticated, ranging from statements such as "I like Robert DeNiro movies" to complicated descriptions of movies. For any object $d$, $m_j(d)$ indicates the degree to which it satisfies the BPM $m_j$. As any of these preference modules provides a justification for recommendation, an object satisfying any one of these $m_j$ is recommendable to the user. If $M = \{m_1, m_2, \ldots, m_K\}$ is the preference profile for a given user, then for any object $d$ in $D$ we calculate

$$M(d) = \max_j [m_j(d)]$$

as the degree of positive recommendation of this object to the user.

At a formal level one can view a user preference profile $M$ as a BPM in which its components are the $m_j$. One reason we choose not to do this is that we prefer to reserve the idea of BPM for user preferences that are in some sense conceptually distinct. Thus, while we can formally combine a user preference for Robert DeNiro movies with a user preference for 1940s musicals in a single BPM by just oring these, it is more in keeping with the way the user sees these to keep them distinct. Thus in introducing the user profile we are emphasizing the individuality of each of the preferences in $M$.

In the preceding we have assumed that each of the BPMs in $M$ had an equal value with regard to their worth to the user. We can consider the situation in which the user associates with each BPM $m_j$ in his profile a value $\alpha_j \in [0, 1]$ indicating the weight or strength of this preference module. Using this we can calculate

$$M(d) = \max_j [m_j(d) \wedge \alpha_j],$$

where $\wedge$ denotes the minimum.

We can also allow a user to supply negative or rejection information. We define a basic rejection module (BRM) $n_i$ to be a description of objects from $D$ which the user prefers not to have recommended to him. A BRM is of the same form as a BPM except it describes features which the user specifies as constituting objects he does not want. Thus a second component of the user profile is a collection $N = \{n_1, n_2, \ldots\}$ of basic rejection modules. Using this we can calculate the degree of negative recommendation (rejection) of any object to a user, $N(d) = \max_i [n_i(d)]$. It is not necessary that a user have any negative modules. Additionally, we can associate with each rejection module $n_i$ a value $\beta_i \in [0, 1]$ indicating the weight associated with the rejection module $n_i$. Using this we get

$$N(d) = \max_i [n_i(d) \wedge \beta_i].$$

We must now combine these two types of scores, recommendation and rejection. Here we just

describe some types of operations available. The builder of the system must implement the one that

best represents the situation they are trying to model. We note that closely related issues have been

discussed by Dubois et al. [3] in their use of examples and counterexamples for querying databases.




Let R(d) indicate the overall degree of recommendations of the object d with respect to a user.

One possibility is some kind of bounded subtraction of the two types of recommendations

$$R(d) = (M(d) - N(d)) \vee 0.$$

Another possibility is to assume that rejection has priority over preference. In this case we have

$$R(d) = (1 - N(d)) \wedge M(d).$$

Here then we are saying we recommend things that are preferred and not rejected by the user.
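The two combination schemes can be sketched as follows, reading the first as a difference bounded below by zero and the second as the minimum of the preference score and the negated rejection score. The function names are hypothetical, and the choice of min and of `1 - n` as the negation is an assumption in line with standard fuzzy practice.

```python
def bounded_difference(m: float, n: float) -> float:
    """R(d) = max(M(d) - N(d), 0): rejection subtracts from preference."""
    return max(m - n, 0.0)


def rejection_priority(m: float, n: float) -> float:
    """R(d) = min(1 - N(d), M(d)): preferred AND not rejected."""
    return min(1.0 - n, m)


# An object that is fully rejected is never recommended under either scheme,
# while an object with no rejection keeps its preference score unchanged.
fully_rejected = (bounded_difference(0.8, 1.0), rejection_priority(0.8, 1.0))
not_rejected = (bounded_difference(0.8, 0.0), rejection_priority(0.8, 0.0))
```

The two schemes differ for intermediate rejection: with `m = 0.8` and `n = 0.3`, the bounded difference lowers the score to about 0.5, whereas the priority form only caps it at about 0.7.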

6. Extensionally expressed preference

Now we consider the environment in which each object in $D$ is represented but the user preference information is expressed extensionally. We assume each user has an associated subset $E$ of $D$ corresponding to the objects which it is known they have experienced. Since we shall focus on just one user we can equivalently view this as a situation in which each object has an attribute indicating whether the person has experienced this object or not. Here we shall initially adopt a closed world assumption: any object which we have not been informed that the user has experienced is assumed not to have been experienced. In addition, in some systems of this type we shall assume that for any object which the user has experienced they have provided a value $a \in [0, 1]$ indicating their scoring of that object. Our goal here is to suggest ways in which we can use this type of information to recommend new objects to our user.

In this environment, one basic paradigm that we can use for justifying recommending objects becomes very obvious. We look for objects which the user has experienced and liked and try to find objects which the user has not experienced that are similar to these. In trying to implement this we are faced with the issue of determining the similarity between objects. The problem of determining similarity is clearly context dependent and often very complex [2]. However, the assumed availability of a representation for each of the objects allows us to develop some kind of tool for calculating the similarity (proximity) between objects. In the following we shall assume the existence of a similarity (proximity) relationship $S$ over the set $D$ of objects. That is, for any two objects $d_i$ and $d_j$ in $D$ we assume $S(d_i, d_j) \in [0, 1]$ is available. The larger $S(d_i, d_j)$, the more related or similar the objects. As we already indicated, we also assume the existence of some subset $E \subseteq D$ of objects the user has experienced. Furthermore, we assume for each object $d_i$ in $E$ the availability of a rating, $a_i$, indicating the score the user has attributed to this object. We note these ratings can be viewed as a fuzzy subset $A$ of $E$ in which $A(d_i) = a_i$. Semantically, $A$ corresponds to the subset of objects the user liked.

Our goal here is to obtain a fuzzy subset $R$ over the space $M = D - E$ corresponding to the objects to be recommended. One approach to obtaining this fuzzy subset is in the spirit of fuzzy modeling. We shall try to provide a collection of justifications, rules or circumstances, which indicate that an object in $M$ is suitable for recommendation. If the $R_j$ are a collection of circumstances for recommending objects, where $R_j(d_i)$ indicates the degree to which $d_i \in M$ meets this condition, then

$$R(d_i) = \max_j \, [R_j(d_i)].$$




Let us begin by considering some guidelines that can be used to support recommendation of an object $d_i$. Our focus here is not so much on providing a definitive listing of rules but more on seeing how fuzzy logic can be used to enable the evaluation of some commonsense guidelines which are expressed in a natural type of language. That is, we are interested in the process of translating some rules that appear to provide reasonable justifications for recommending objects.

A most natural source of recommendation can be captured by the following guideline.

Rule 1: Recommend an object if there exists a similar object that the user liked. Under this rule the strength of recommendation of an unexperienced object $d_i$ in $D - E$ can be obtained as

$$R_1(d_i) = \max_{d_j \in E} \, [S(d_i, d_j) \wedge A(d_j)].$$
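Rule 1 can be illustrated with a short sketch that takes, over all experienced objects, the maximum of the minimum of similarity and rating. The similarity table, the object identifiers and the use of min for the conjunction are assumptions made for illustration only.

```python
from typing import Callable, Dict


def rule1(candidate: str,
          ratings: Dict[str, float],
          sim: Callable[[str, str], float]) -> float:
    """R1(d_i) = max over experienced d_j of min(S(d_i, d_j), A(d_j))."""
    return max((min(sim(candidate, dj), a) for dj, a in ratings.items()),
               default=0.0)


# Hypothetical symmetric similarity table for one unexperienced object "x".
SIM = {("x", "a"): 0.9, ("x", "b"): 0.4}


def sim(u: str, v: str) -> float:
    return SIM.get((u, v), SIM.get((v, u), 0.0))


ratings = {"a": 0.5, "b": 1.0}   # A(d_j) for the experienced objects
r1 = rule1("x", ratings, sim)
```

Here object `a` is very similar to `x` but only moderately liked, while `b` is well liked but not very similar, so the recommendation strength is limited by the weaker of the two properties in each case.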

We express a second guideline for recommending objects which can be seen as a softening of the first rule.

Rule 2: Recommend an object for which we have at least several comparable objects which the user somewhat liked.

Here we are softening the requirements of rule 1 by allowing a weaker indication of satisfaction, "somewhat liked", and allowing a weaker connection, as denoted by the use of the word "comparable" instead of "similar". We compensate for this reduction by requiring at least several such objects instead of just a single object. Our goal now is to suggest a way to formalize this type of rule so that we can evaluate it. In anticipation of expressing this rule we introduce some fuzzy subsets. First we note that the term "at least several" is an example of what Zadeh [22] called a linguistic quantity, words denoting precise or imprecise quantities. In [22] Zadeh suggested that any linguistic quantity can be represented as a fuzzy subset $Q$ of the set of integers. It is clear that "at least several" is monotonic in that $Q(k_1) \ge Q(k_2)$ if $k_1 > k_2$. We must now introduce a fuzzy subset to capture the idea of "somewhat liked". This concept can be modeled in a number of different ways. With $A$ being the fuzzy subset of $E$ indicating the user's satisfaction, $A(d_j) = a_j$, we let $\tilde{A}$ be a softening of this corresponding to the concept "somewhat liked". One way of defining $\tilde{A}$ is as

$$\tilde{A}(d_i) = (A(d_i))^{\alpha}$$

for $0 < \alpha < 1$. The smaller $\alpha$, the more the softening. Another method to define $\tilde{A}$ is

$$\tilde{A}(d_i) = \begin{cases} 1 & \text{if } A(d_i) \ge \alpha, \\ A(d_i) & \text{if } A(d_i) < \alpha. \end{cases}$$

More generally, we can express $\tilde{A}$ using a transformation function $T : [0, 1] \to [0, 1]$ such that $T(a) \ge a$ and then defining $\tilde{A}(x_j) = T(A(x_j))$. The function $T$ can be expressed using a fuzzy systems model [12], for example

if a is low then T(a) is medium

if a is moderate then T(a) is high

if a is large then T(a) is very large
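The two closed-form softening transformations described above, a power with exponent below one and a threshold that promotes sufficiently high degrees to full membership, might be sketched as below. The function names and the parameter name `alpha` are assumptions, and the rule-based fuzzy systems model is not implemented here.

```python
def soften_power(a: float, alpha: float) -> float:
    """T(a) = a ** alpha for 0 < alpha <= 1; a smaller alpha softens more."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return a ** alpha


def soften_threshold(a: float, alpha: float) -> float:
    """T(a) = 1 if a >= alpha, otherwise a is left unchanged."""
    return 1.0 if a >= alpha else a


# Both transformations satisfy the softening requirement T(a) >= a on [0, 1].
grid = [k / 10 for k in range(11)]
power_ok = all(soften_power(a, 0.5) >= a for a in grid)
threshold_ok = all(soften_threshold(a, 0.6) >= a for a in grid)
```

With `alpha = 1` the power form reduces to the identity, which corresponds to applying no softening at all.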

Finally, we must define the concept "comparable". As used, the term comparable is meant to indicate a softening of the concept of similar. Again, if $T$ is defined as in the preceding, as some softening function with $T(a) \ge a$, we can use it to provide a definition for comparable. Thus, if $S(x, y)$ indicates the degree of similarity between two objects then we can use $\mathrm{Comp}(x, y) = T(S(x, y))$ to indicate




the degree to which they are comparable. One possible definition for $T$ in this case is $T(a) = 1$ if $a \ge \alpha$ and $T(a) = \alpha$ if $a < \alpha$.

Having satisfactorily obtained representations of these softening concepts, we can use them to provide an operational formulation of this second rule. For any object $d_i \in D - E$ we have

$$R_2(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \big( \tilde{A}(d_j) \wedge \mathrm{Comp}(d_j, d_i) \big) \right].$$
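A direct, brute-force reading of this second rule enumerates every non-empty subset of experienced objects. The sketch below does exactly that and is only practical for a small set $E$. The quantifier `at_least_several`, the similarity values and the two softening functions are hypothetical choices, and min is assumed for every conjunction.

```python
from itertools import combinations
from typing import Callable, Dict


def rule2(candidate: str,
          ratings: Dict[str, float],
          sim: Callable[[str, str], float],
          Q: Callable[[int], float],
          T1: Callable[[float], float],
          T2: Callable[[float], float]) -> float:
    """Max over subsets F of min(Q(|F|), min over F of softened evidence)."""
    # Evidence from each experienced object: "somewhat liked" AND "comparable".
    evidence = [min(T1(a), T2(sim(candidate, dj))) for dj, a in ratings.items()]
    best = 0.0
    for k in range(1, len(evidence) + 1):
        for subset in combinations(evidence, k):
            best = max(best, min(Q(k), min(subset)))
    return best


def at_least_several(k: int) -> float:
    # Hypothetical monotone quantifier: 0 for fewer than 2, 0.5 for 2, 1 from 3 on.
    return {0: 0.0, 1: 0.0, 2: 0.5}.get(k, 1.0)


def at_least_one(k: int) -> float:
    return 1.0 if k >= 1 else 0.0


def identity(a: float) -> float:
    return a


SIM = {("x", "a"): 0.7, ("x", "b"): 0.8, ("x", "c"): 0.6}
sim = lambda u, v: SIM[(u, v)]
ratings = {"a": 0.9, "b": 0.7, "c": 0.6}

softened = rule2("x", ratings, sim, at_least_several,
                 T1=lambda a: a ** 0.5,
                 T2=lambda s: 1.0 if s >= 0.7 else s)
as_rule1 = rule2("x", ratings, sim, at_least_one, identity, identity)
```

With identity transformations and the quantifier "at least one", the function returns the same value as the first rule would, which gives a quick sanity check on any implementation.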

In the preceding we can express $\tilde{A}(d_j) = T_1(A(d_j))$ and $\mathrm{Comp}(d_j, d_i) = T_2(S(d_j, d_i))$ where $T_1$ and $T_2$ are two transformations. It is interesting to see that our first rule is a special case of this. If we let $T_1$ and $T_2$ be such that $T_1(a) = T_2(a) = a$, identity transforms, then

$$R_2(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \big( A(d_j) \wedge S(d_j, d_i) \big) \right].$$

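Furthermore, if $Q$ is defined to be "at least one" then $Q(|F|) = 1$ if $F \neq \emptyset$ and hence

$$R_2(d_i) = \max_{F \subseteq E,\, F \neq \emptyset} \left[ \min_{d_j \in F} \big( A(d_j) \wedge S(d_j, d_i) \big) \right].$$

Since adding elements to $F$ can only lower the inner minimum, the maximum is attained at a single-element subset, and this can be seen to be equal to $\max_{d_j \in E} \, [A(d_j) \wedge S(d_j, d_i)]$, which is $R_1(d_i)$.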

It is interesting to consider a collection of rules of this type, $R_k = \langle Q_k, \tilde{A}_k, \mathrm{Comp}_k \rangle$, where each is a softening, each one requiring more objects but relaxing either or both of the requirements regarding satisfaction to the user and proximity to the object being evaluated. Here then, in this softening process, we are essentially increasing the radius about the object, decreasing the required strength but increasing the number of objects that need be found.

Another method for justifying possible objects to recommend is to look for unexperienced objects

that have a lot of neighbors which the user has experienced regardless of the valuation which

they have been given. This captures the idea that the user likes objects of this type regardless of

their evaluation. For example a person may see horror movies even if they think these movies are

bad. We can see that this type of situation can be expressed as an extreme case of the preceding

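recommendation rules. Again consider a rule $\langle Q, \tilde{A}, \mathrm{Comp} \rangle$ where we evaluate it for an item as

$$R(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \big( \tilde{A}(d_j) \wedge \mathrm{Comp}(d_j, d_i) \big) \right].$$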

To capture the above imperative we let $\tilde{A}(d_j) = 1$ if $d_j \in E$ and hence we get

$$R(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \mathrm{Comp}(d_j, d_i) \right].$$

Letting $\mathrm{Comp}(d_j, d_i) = S(d_j, d_i)$ we have $R(d_i) = \max_{F \subseteq E} \, [Q(|F|) \wedge \min_{d_j \in F} S(d_j, d_i)]$. This can be seen as a type of fuzzy integral [11]. Let $\mathrm{Sindex}_i(k)$ be the similarity of the $k$th most similar object in $E$ to the object $d_i$. Furthermore let $q_k = Q(k)$; then

$$R(d_i) = \max_k \, [q_k \wedge \mathrm{Sindex}_i(k)].$$
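The ordered form avoids enumerating subsets: for each size k, the best subset consists of the k most similar experienced objects, so it suffices to sort the similarities once. A minimal sketch, with a hypothetical quantifier and similarity values and with min assumed for the conjunction:

```python
from typing import Callable, Iterable


def quantifier_guided_score(similarities: Iterable[float],
                            Q: Callable[[int], float]) -> float:
    """R(d_i) = max_k min(Q(k), k-th largest similarity to d_i)."""
    ordered = sorted(similarities, reverse=True)  # ordered[k-1] plays the role of Sindex_i(k)
    return max((min(Q(k), s) for k, s in enumerate(ordered, start=1)),
               default=0.0)


def at_least_several(k: int) -> float:
    # Hypothetical monotone quantifier: 0 for fewer than 2, 0.5 for 2, 1 from 3 on.
    return {0: 0.0, 1: 0.0, 2: 0.5}.get(k, 1.0)


# Similarities of four experienced objects to one unexperienced candidate.
score = quantifier_guided_score([0.2, 0.9, 0.6, 0.7], at_least_several)
```

In this example the third most similar object has similarity 0.6 and the quantifier is fully satisfied at three objects, so the score is 0.6. The cost is dominated by the sort, in contrast to the exponential number of subsets in the defining expression.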




The essential idea of the preceding methods for justifying items to recommend was based on the process of discovering unexperienced items located in areas of the object space that are rich in objects that the user liked or experienced. We can capture this imperative in an alternative manner, one that is in the spirit of the mountain method [19,20]. For each unexperienced item $d_i \in D - E$, we calculate

$$M_1(d_i) = \sum_{d_j \in E} a_j \, S(d_i, d_j).$$

Here $M_1(d_i)$ can be seen as a kind of support for $d_i$ based on a weighted sum. We let $d^*$ be such that $M_1(d^*) = \max_i \, [M_1(d_i)]$. Using this we can obtain an evaluation or score for each $d_i$ as $R(d_i) = M_1(d_i) / M_1(d^*)$. Alternatively, for each $d_i \in D - E$ we can calculate

$$M_2(d_i) = \sum_{d_j \in E} S(d_i, d_j),$$

the density of nearby experienced objects regardless of the user's rating. We then calculate $d^+$ such that $M_2(d^+) = \max_i \, [M_2(d_i)]$. From this we obtain a normalized score function $R(d_i) = M_2(d_i) / M_2(d^+)$.
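Both normalized support scores can be computed by one helper that either weights each similarity by the user's rating or ignores the ratings. The sketch below assumes a similarity function over object identifiers; the table of values and the `use_ratings` flag are illustrative, not part of the method as described.

```python
from typing import Callable, Dict, Iterable


def mountain_scores(candidates: Iterable[str],
                    ratings: Dict[str, float],
                    sim: Callable[[str, str], float],
                    use_ratings: bool = True) -> Dict[str, float]:
    """Normalized support: rating-weighted (M1) or plain density (M2)."""
    raw = {
        c: sum((a if use_ratings else 1.0) * sim(c, dj)
               for dj, a in ratings.items())
        for c in candidates
    }
    peak = max(raw.values(), default=0.0)
    if peak == 0.0:               # no support anywhere: avoid dividing by zero
        return {c: 0.0 for c in raw}
    return {c: v / peak for c, v in raw.items()}


# Hypothetical similarities between two candidates and two experienced objects.
SIM = {("x", "a"): 0.8, ("x", "b"): 0.4, ("y", "a"): 0.2, ("y", "b"): 0.6}
sim = lambda u, v: SIM[(u, v)]
ratings = {"a": 1.0, "b": 0.5}

weighted = mountain_scores(["x", "y"], ratings, sim)                      # M1 based
density = mountain_scores(["x", "y"], ratings, sim, use_ratings=False)    # M2 based
```

The best-supported candidate always receives a score of 1 after normalization, so these scores rank candidates relative to each other rather than measuring absolute strength.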

7. Using domain expert prototypes

We shall now consider another approach for obtaining a basis for recommendation of objects. In this approach, we rely upon the use of expert-defined prototype objects. That is, using the features available in the representation of the objects, we allow experts in the domain to define prototype objects. In some sense these prototypes can be seen as a reflection of the language and categories in which the community discusses the domain. For example, terms like film noir, happy movie, blockbuster and epic can be seen as prototypes used in the movie domain. The actual creation and selection of these domain prototypes is clearly a creative activity and we shall not venture into the issue of obtaining methods for generating prototypes. From a formal point of view these prototypes can be expressed in terms of the primitive attributes of the objects in a manner analogous to that used to express user preference models.

Here we shall assume the availability of a collection of prototypes. For our subsequent purposes we shall consider a prototype object $T_i$ to be some function of the domain of attributes such that for each $d_j \in D$ we can obtain $T_i(d_j)$ as a value in the unit interval.

As a justification for recommendation we can say that if a user likes prototype class $T_i$ and if a given unexperienced object $d$ is in this prototype class then we recommend the object. To calculate the degree to which an object $d$ is of type $T_i$ we use our definition of this prototype class, in which $T_i(d)$ indicates the degree to which $d$ belongs to the prototype. If we let $\alpha_i$ indicate the degree to which the user likes type $T_i$ objects then the degree of recommendation of object $d$ is

$$R_i(d) = \alpha_i \wedge T_i(d).$$

In order to implement this we need to obtain $\alpha_i$, the degree to which the user likes type $T_i$ objects. One method to determine whether the user likes objects in class $T_i$ is as follows. For any experienced object $d_j \in E$ let $a_j$ be the user's rating. Then if we calculate

$$L(T_i) = \frac{\sum_{d_j \in E} a_j \, T_i(d_j)}{\sum_{d_j \in E} T_i(d_j)}$$

this gives us the user's average rating for objects in class $T_i$. We can use this for $\alpha_i$.
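The membership-weighted average rating and the resulting prototype-based recommendation can be sketched as follows. Representing a prototype as a dictionary from object identifiers to membership degrees is an assumption, as are the example values; min is again assumed for combining the liking degree with the candidate's membership.

```python
from typing import Dict

Membership = Dict[str, float]  # assumed: object id -> degree of belonging to the prototype


def prototype_liking(ratings: Dict[str, float], proto: Membership) -> float:
    """L(T) = sum_j a_j T(d_j) / sum_j T(d_j), over experienced objects only."""
    total = sum(proto.get(dj, 0.0) for dj in ratings)
    if total == 0.0:              # user has experienced nothing of this type
        return 0.0
    return sum(a * proto.get(dj, 0.0) for dj, a in ratings.items()) / total


def prototype_recommendation(d: str, alpha: float, proto: Membership) -> float:
    """R(d) = min(alpha, T(d)): user likes the type AND d is of the type."""
    return min(alpha, proto.get(d, 0.0))


# Hypothetical prototype and ratings; "new" is an unexperienced object.
film_noir = {"a": 1.0, "b": 0.5, "c": 0.0, "new": 0.9}
ratings = {"a": 0.8, "b": 0.4, "c": 1.0}

alpha = prototype_liking(ratings, film_noir)
rec = prototype_recommendation("new", alpha, film_noir)
```

Object `c` is rated highly but does not belong to the prototype at all, so it has no influence on the liking degree for this type.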

Let us consider another method for determining a user's inclination toward objects in class $T_i$. If a user has experienced a lot of objects in the class $T_i$ it is reasonable to assume that they are interested in this class. This interest may be independent of whether they report liking the objects or not. People experience things for various, sometimes neurotic, reasons, not necessarily only because they think they are good. The term "camp", used to describe objects that are so bad they become amusing, reflects this situation. Thus, it appears useful to be able to provide some indication that a user has a significant degree of interest in movies of type $T_i$ based solely on the quantity of items of this type experienced by the user. We can calculate the number of objects of type $T_i$ experienced by this user as

$$N(T_i) = \sum_{d_j \in E} T_i(d_j).$$

Using this, we can calculate a number of indices. The first,

$$P_1(T_i) = \frac{\sum_{d_j \in E} T_i(d_j)}{\sum_{d_j \in D} T_i(d_j)},$$

indicates the proportion of available type $T_i$ objects experienced by this user. The second index is

$$P_2(T_i) = \frac{\sum_{d_j \in E} T_i(d_j)}{\sum_{k} \left( \sum_{d_j \in E} T_k(d_j) \right)}.$$

We now define a fuzzy subset of the unit interval corresponding to the concept "significant proportion" and calculate the degree of membership of both $P_1(T_i)$ and $P_2(T_i)$ in this set. Let us denote these values as $SP_1(T_i)$ and $SP_2(T_i)$. Using these and the value $L(T_i)$ we can calculate $\alpha_i = \max[L(T_i), SP_1(T_i), SP_2(T_i)]$ and then use as our degree of recommendation $R_i(d) = \alpha_i \wedge T_i(d)$.
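Putting the three indicators together, a liking degree for a prototype might be computed as in the sketch below. Several parts are assumptions rather than the text's definitions: the piecewise-linear membership function for "significant proportion" and its breakpoints, the dictionary representation of prototypes (with each dictionary listing every object in the catalog), and the reading of the second index as the share of this type among the user's experienced objects summed over all types.

```python
from typing import Dict

Membership = Dict[str, float]  # assumed: object id -> membership degree, for every object in D


def significant(p: float, lo: float = 0.1, hi: float = 0.5) -> float:
    """Hypothetical 'significant proportion' set: linear ramp from lo to hi."""
    if p <= lo:
        return 0.0
    if p >= hi:
        return 1.0
    return (p - lo) / (hi - lo)


def liking_degree(name: str,
                  prototypes: Dict[str, Membership],
                  ratings: Dict[str, float]) -> float:
    """alpha = max(L, SP1, SP2) for the named prototype."""
    proto = prototypes[name]
    # N(T_k): fuzzy count of experienced objects of each type.
    counts = {k: sum(m.get(d, 0.0) for d in ratings) for k, m in prototypes.items()}
    n = counts[name]
    if n == 0.0:
        return 0.0
    avg = sum(a * proto.get(d, 0.0) for d, a in ratings.items()) / n  # L(T_i)
    p1 = n / sum(proto.values())      # share of all type-T_i objects experienced
    p2 = n / sum(counts.values())     # assumed reading of the second index
    return max(avg, significant(p1), significant(p2))


prototypes = {
    "horror": {"a": 1.0, "b": 1.0, "c": 1.0, "d": 0.0, "e": 0.0},
    "epic": {"a": 0.0, "b": 0.5, "c": 0.0, "d": 1.0, "e": 1.0},
}
ratings = {"a": 0.2, "b": 0.3}   # the user rated both horror movies poorly

alpha_horror = liking_degree("horror", prototypes, ratings)
alpha_epic = liking_degree("epic", prototypes, ratings)
```

In this example the user's horror ratings are low, yet the liking degree for horror is 1 because most of what they have experienced is horror, which mirrors the observation that people may seek out a type of object without rating it highly.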

8. Conclusion

Here we have considered methodologies for constructing recommender systems. The approaches studied here differ from collaborative filtering in that they are based solely on the preferences of the individual for whom we are providing the recommendation and make no use of the preferences of other individuals. We have called these reclusive methods. Another important feature distinguishing these reclusive methods from collaborative methods is that they require a representation of the objects, which is not necessarily required by collaborative filtering methods. While our focus has been on reclusive rather than collaborative methods, optimal recommender systems should of course use all the information available and hence should be based on a combination of these two classes of systems. In future research we shall look at methods integrating collaborative and reclusive approaches.

References

[1] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley, Reading, MA, 1999.

[2] J.C. Bezdek, J. Keller, R. Krishnapuram, N.R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, Kluwer, Boston, 1999.

[3] D. Dubois, H. Prade, F. Sedes, Fuzzy logic techniques in multimedia database querying: a preliminary investigation of the potentials, IEEE Trans. Knowledge Data Eng. 13 (2001) 383-392.

[4] D. Goldberg, D. Nichols, B.M. Oki, D. Terry, Using collaborative filtering to weave an information tapestry, Comm. ACM 35 (12) (1992) 61-70.

[5] H. Kautz, Recommender Systems, AAAI Press, Menlo Park, CA, 1998.

[6] J.A. Konstan, B.N. Miller, D. Maltz, J.L. Herlocker, L.R. Gordon, J. Riedl, Grouplens: applying collaborative filtering to Usenet news, Comm. ACM 40 (3) (1997) 77-87.

[7] P. Perny, J.-D. Zucker, Collaborative filtering methods based on fuzzy preference relations, EUROFUSE-SIC99, Budapest, 1999.

[8] P. Resnick, H.R. Varian, Recommender systems, Comm. ACM 40 (3) (1997) 56-58.

[9] J.B. Schafer, J.A. Konstan, J. Riedl, E-Commerce recommendation applications, Data Mining Knowledge Discovery 5 (2001) 115-153.

[10] U. Shardanand, P. Maes, Social information filtering: algorithms for automating word of mouth, Proc. Computer Human Interaction-95 Conference, Denver, 1995, pp. 210-217.

[11] M. Sugeno, Fuzzy measures and fuzzy integrals: a survey, in: M.M. Gupta, G.N. Saridis, B.R. Gaines (Eds.), Fuzzy Automata and Decision Process, North-Holland, Amsterdam, 1977, pp. 89-102.

[12] T. Terano, K. Asai, M. Sugeno, Applied Fuzzy Systems, Academic Press, Orlando, FL, 1994.

[13] R.R. Yager, On ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans. Systems Man Cybernet. 18 (1988) 183-190.

[14] R.R. Yager, Families of OWA operators, Fuzzy Sets and Systems 59 (1993) 125-148.

[15] R.R. Yager, Quantifier guided aggregation using OWA operators, Internat. J. Intell. Systems 11 (1996) 49-73.

[16] R.R. Yager, Targeted e-commerce marketing using fuzzy intelligent agents, IEEE Intell. Systems (2000) 42-45.

[17] R.R. Yager, Veristic variables, IEEE Trans. Systems Man Cybernet. Part B: Cybernetics 30 (2000) 71-84.

[18] R.R. Yager, A hierarchical document retrieval language, Inform. Retrieval 3 (2000) 357-377.

[19] R.R. Yager, D.P. Filev, Approximate clustering via the mountain method, IEEE Trans. Systems Man Cybernet. 24 (1994) 1279-1284.

[20] R.R. Yager, D.P. Filev, Generation of fuzzy rules by mountain clustering, J. Intell. Fuzzy Systems 2 (1994) 209-219.

[21] R.R. Yager, J. Kacprzyk, The Ordered Weighted Averaging Operators: Theory and Applications, Kluwer, Norwell, MA, 1997.

[22] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. Math. Appl. 9 (1983) 149-184.

[23] L.A. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems 90 (1997) 111-127.

[24] L.A. Zadeh, A new direction in AI: toward a computational theory of perceptions, Artif. Intell. Mag. 22 (1) (2001) 73-84.
