Fuzzy Sets and Systems 136 (2003) 133–149
www.elsevier.com/locate/fss
Fuzzy logic methods in recommender systems
Ronald R. Yager
Machine Intelligence Institute, Iona College, 715 North Avenue, New Rochelle, NY 10801, USA
Received 5 July 2001; received in revised form 4 March 2002; accepted 25 April 2002
Abstract
Here we consider methodologies for constructing recommender systems. The approaches studied here differ
from collaborative filtering in that they are based solely on the preferences of the single individual for whom we
are providing the recommendation and make no use of the preferences of other collaborators. We have called
these reclusive methods. Another important feature distinguishing these reclusive methods from collaborative
methods is that they require a representation of the objects. Considerable use is made of fuzzy set methods
for the representation and subsequent construction of justifications and recommendation rules. It is pointed
out that these reclusive methods, rather than being competitive with collaborative methods, are complementary.
© 2002 Elsevier Science B.V. All rights reserved.
Keywords: Customization; Recommender systems; Fuzzy methods; Collaborative filtering
1. Introduction
Recommender systems [8,5] are a rapidly emerging class of software, especially within the domain
of E-Commerce [9]. Their importance is directly related to the ability of the internet to collect,
store and process vast quantities of information about individuals' actions and preferences. They
are an important component toward the goal of providing specific customized information to each user.
Most of the current generation of recommender systems are based on collaborative filtering technologies [4,6,7,10].
An important component of collaborative filtering type systems is the calculation of
similarity of interest based on correlations between individuals. In order to predict a user's potential
interest in some object they have not experienced, collaborative filtering uses these measures of
similarity of interest in conjunction with ratings of the object by other individuals who have experienced
the object. An important feature of these pure collaborative filtering systems is that they do
not require any representation of the objects being considered.
Tel.: +1-212-249-2047; fax: +1-212-249-1689.
E-mail address: [email protected] (R.R. Yager).
0165-0114/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0165-0114(02)00223-3
In this work we shall focus on a different class of recommender systems which are not collaborative.
These types of systems, which we call reclusive, use only preference information about the
user of interest. These types of systems require some representation of the objects.
An essential difference between the collaborative filtering approach and the reclusive approach is
that collaborative filtering is based upon finding a similarity between people whereas the reclusive
approach is based upon finding a similarity between the objects.
What is clear, although we shall not study it here, is that future recommender systems will
incorporate both these perspectives.
2. A general view of recommender systems
A recommender system is associated with a collection of objects $D = \{d_1, \ldots, d_n\}$. The purpose
of this system is to recommend to the user objects of $D$ that may be of interest to him. As a tangible
example of a recommender system we shall often find it convenient to use one in which the objects
are movies.
Here we shall consider some approaches to this problem of recommendation and shall describe
some methods for performing this task. The implementation of technologies for developing these
systems is strongly dependent upon the type of information that is being used. In the following, we
shall discuss the types of information that may be available to a recommender system.
A prime source of information for use in a recommender system is the knowledge about
the objects in $D$. The usefulness of this information is dependent upon the representation used
for the objects in $D$. The least information-rich situation is the one in which all we have is just
some unique identification of an object and no other information. For example, all we know about
a movie is just its title. A more information-rich environment is one in which we describe an object
with some attributes. For example, we indicate the year the movie was made, the type of movie, the
stars. These attributes and their associated values provide a representation of the objects. We can
have degrees of representation; more sophisticated representations will depend upon the features used
to characterize the objects. Many techniques that can be used in recommender systems are based
upon using some representation of the objects. Generally, the more sophisticated the representation,
the better these techniques perform.
In order to make a recommendation to a user we must have some information about the user's
preferences. Information about user preferences can essentially be obtained in two different ways,
although these need not be mutually exclusive. We shall refer to these two modes, respectively, as
extensionally and intentionally expressed preference information. By extensionally expressed preference
information we mean information based upon the actions or past experiences of the user with
respect to specific objects of the type found in $D$. Examples of this are movies a user has previously
seen and possibly some rating of these movies. In another domain we could mean the objects which
the user has purchased. By intentionally expressed information we mean some specifications by the
user of what they desire in objects of the type under consideration. Generally, to be of use these
specifications must be of such a nature that they can be related to the attributes and features used
in the representation of the objects in $D$.
We would like to make some comment on the distinction between targeted marketing [16] and
recommender systems. We say that recommender systems are participatory in that the user is
participating in the process by providing information about their preferences. In a targeted marketing
system, while information may be available about a user's preferences, this is generally based on
extensional information obtained from past actions of the user. Here the target is a passive supplier
of information rather than an active supplier as in the recommender system. For example, the system
used by Amazon.com, while called a recommender system, according to our definition is more
appropriately a targeted marketing system that is based solely upon past information of the user and
does not involve any cooperation by the user. On the other hand the system used by NETFLIX (see footnote 1) is
a true recommender system as it uses ratings supplied by the user.
Another characterizing aspect of these recommender systems is whether the system is collaborative
or not. We shall say a system is collaborative if information about the preferences of other
people is used in determining the recommendation to the current user. Furthermore, in these collaborative
recommender systems the available technologies depend on the nature of the preference information
used with respect to participating agents. Generally, in the collaborative approach when
extensional information is used one tries to obtain, based on mutually experienced items, some measure
of correlation between the participants and uses this as a basis for providing recommendations.
In Fig. 1 we summarize the situation with respect to the information available in a recommender
system.
In order to develop a recommender system we need to use information about the user's preferences.
Collaborative type recommenders can be constructed using only extensional preference information.
Non-collaborative, reclusive, type systems require the availability of object representations. Table 1
shows a simplified typology of different types of recommender systems. The first column indicates
a purely collaborative type of system. Columns 2–4 indicate those based purely on representation,
the differences between these columns being the types of information used to specify the user's
preferences. The final column indicates systems which use both collaborative and representational
information.
Here we shall focus on reclusive, non-collaborative, recommender systems in which there exists
a representation of the objects. At a meta level, we see a kind of symmetry between reclusive
methods and collaborative methods. In both cases we make use of a vector of ratings of the objects
the current user has experienced. We shall denote this as $A$. In the collaborative filtering approach
we have for each collaborator a vector $A_j$ indicating their ratings of the corresponding objects. For
any object $d$ unexperienced by our current user we have a vector $R$ whose components, $r_j$, are the ratings
of this object by the collaborators. The procedure for obtaining the degree of recommendation can
be seen to essentially involve two steps:
1. We combine $A$ and $A_j$ to obtain $S_j$, a degree of similarity of our user with each collaborator.
2. Rating of $d$ = Aggregation of weighted tuples $(S_j, r_j)$.
In the reclusive method for each object the user has experienced we have a representation $R_i$ as
well as the current user's rating $a_i$ as contained in $A$. In addition, for an unexperienced object $d$
which we are trying to evaluate we only have a representation $R$. The procedure for obtaining the
1 NETFLIX is a website that rents DVD videos. It asks users to rate the videos they have rented and uses this to
recommend other videos.
[Fig. 1. Recommender systems information structure. Elements: USER (preference information type: extensional, intentional, or both); OBJECTS OF INTEREST (representation available: yes or no); COLLABORATORS (peer group available: yes or no; preference information type: extensional, intentional, or both).]
Table 1. Recommender systems typology (rows: extensional preferences, intentional preferences, representation, collaborators)
degree of recommendation also essentially involves a two-step procedure:
1. We combine $R$ and $R_i$ to obtain $S_i$, a degree of similarity of the object $d$ with the experienced
objects.
2. Rating of $d$ = Aggregation of weighted tuples $(S_i, a_i)$.
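The two-step reclusive procedure can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix a similarity measure or an aggregation at this point, so the similarity used here (sum of minima over sum of maxima of the membership grades) and the similarity-weighted mean are assumed choices, and the names `similarity` and `recommend` are hypothetical.

```python
def similarity(r, r_i):
    """Assumed similarity of two fuzzy representations (sum of min / sum of max)."""
    keys = set(r) | set(r_i)
    num = sum(min(r.get(k, 0.0), r_i.get(k, 0.0)) for k in keys)
    den = sum(max(r.get(k, 0.0), r_i.get(k, 0.0)) for k in keys)
    return num / den if den > 0 else 0.0


def recommend(r, experienced):
    """Step 1: compute each S_i; step 2: aggregate the tuples (S_i, a_i).

    `experienced` is a list of (representation R_i, rating a_i) pairs.
    The aggregation is an assumed similarity-weighted mean of the ratings.
    """
    pairs = [(similarity(r, r_i), a_i) for r_i, a_i in experienced]
    total = sum(s for s, _ in pairs)
    return sum(s * a for s, a in pairs) / total if total > 0 else 0.0


# Hypothetical data: two movies the user has rated, one unseen movie.
seen = [
    ({"comedy": 1.0, "DeNiro": 1.0}, 0.9),
    ({"drama": 1.0}, 0.2),
]
new_movie = {"comedy": 0.8, "DeNiro": 1.0}
score = recommend(new_movie, seen)
```

Because the unseen movie shares nothing with the drama, only the first rated movie contributes to the score in this example.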
Our goal in the following is to develop modules that can be used to help evaluate objects with
regard to their degree of recommendation in the case of reclusive approaches.
3. Object representation
We now turn to the issue of object representation. For our purposes the representation of an object
shall be based upon a set of primitive assertions or statements. Each assertion can essentially be
viewed as some declarative statement. Associated with each object and each assertion is a value
contained in the unit interval indicating the degree to which the assertion is valid for that object.
For example, in the movie domain a primitive assertion may be "this movie is a comedy". In this
case, the value associated with this assertion for a movie indicates the degree to which it is true
that this movie is a comedy. Another assertion may be that "Robert DeNiro is a star in this movie".
If the movie has Robert DeNiro as one of its stars then this assertion has validity one; otherwise it
is zero. Another assertion may be that "this movie was made in 1993". If the movie was made in
1995 this assertion would have a validity of zero. If it was made in 1993, this assertion would have truth
value one.
Essentially then the basis of our representation scheme is a collection of assertions or statements
whose validity is determinable for any object in $D$. We shall denote this set of primitive or atomic
assertions as $A = \{A_1, \ldots, A_n\}$.
The representation of an object consists of a valuation of these assertions for the object. For
object $d$, $A_j(d)$ indicates the degree to which assertion $A_j$ is satisfied by $d$. When we are just focusing
on one object we can denote this $a_j$. For some purposes we can view any object $d$ as a fuzzy subset
over the space $A$. Using this perspective the membership grade of $A_j$ in $d$ is $d(A_j) = A_j(d) = a_j$. As
an alternative perspective an object can be viewed as an $n$-dimensional vector whose $j$th component
is $A_j(d)$. As we shall subsequently see, these different perspectives are useful in inspiring different
information processing operations.
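The two views of an object, as a fuzzy subset over the assertions and as a vector, might look as follows in code. The assertion names and membership grades are hypothetical and chosen only for illustration.

```python
# Hypothetical set of primitive assertions A = {A_1, ..., A_n}.
assertions = ["is_comedy", "stars_DeNiro", "made_in_1993"]

# Fuzzy-subset view: membership grade d(A_j) of each assertion in the object d.
movie = {"is_comedy": 0.7, "stars_DeNiro": 1.0, "made_in_1993": 0.0}


def as_vector(obj, assertions):
    """Vector view: the j-th component is A_j(d); missing grades count as 0."""
    return [obj.get(a, 0.0) for a in assertions]


vector = as_vector(movie, assertions)
```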
We shall call a subset of related assertions an attribute or feature. For example, the subset $V$ may
consist of all the assertions of the form "this movie was made in the year xyz". We can denote
this attribute as the year the movie was made. Another notable subset of related assertions from
$A$ may consist of all the assertions of the form "x stars in this movie". This feature corresponds to
the attribute of who are the stars of the movie.
For a given recommender system, in addition to the set $A$ of primitive assertions, we shall assume
the existence of a collection of features or attributes associated with the objects in $D$. We denote this
collection of attributes as $F = \{V_1, V_2, \ldots, V_q\}$. Each attribute $V_j$ corresponds to a subset of assertions
which can be seen as constituting the possible values for the attribute. In some special cases a feature
may consist of a single assertion. The performance of a recommender system is clearly related to
the sophistication of the primitive assertions and associated features used to represent the objects of
interest.
We note that while we have started with assertions and constructed features by forming subsets of
related assertions, it is equally valid, and perhaps more intuitive, to start with attributes and generate
the primitive assertions as being the possible values for these attributes.
Attributes can be classified by various characteristics associated with their solution space [23,24].
Attributes can be distinguished with respect to the number of solutions they allow: whether an attribute is restricted to
having only one solution, whether it allows multiple solutions, and whether it must have a solution. For example,
the attribute corresponding to the release year of a movie must have only one solution. On the
other hand the attribute corresponding to the star of a movie can take on multiple values. The
primitive assertions can also be classified with respect to the allowable truth values they can
assume. For example, binary type assertions are those whose truth value must be
either one or zero, while other assertions can have truth values lying anywhere in the unit
interval.
We shall look a little more carefully at the relationship between atomic assertions and attributes. As
we have indicated, an assertion is a declarative statement that is assigned a value for a given object
depending on its degree of validity for that object; this value generally lies in the unit interval. On the other
hand an attribute can be viewed as a variable that takes its value(s) from a universe associated with
the variable. In our framework the universe associated with an attribute corresponds to the subset
of primitive assertions that is used to define it. The value of an attribute for a given object depends
upon the truth values of the associated primitives. Let us look at this.
If $V_j$ is an attribute we can find its value for a particular object $d$ in the following way. Let
$A(V_j)$ indicate the subset of primitives associated with $V_j$. Let $d$ represent the fuzzy subset of $A$
corresponding to object $d$; then the value of the feature $V_j$ for object $d$ is
$V_j(d) = A(V_j) \cap d$;
it is the intersection of the attribute definition, the crisp subset $A(V_j)$, and the object representation,
the fuzzy subset $d$. The collection of elements in the subset $V_j(d)$ determines the value of the attribute
$V_j$ for the object $d$.
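A minimal sketch of this intersection, assuming the object is stored as a mapping from assertion names to validities and the attribute as a plain set of assertion names (both hypothetical encodings). Intersecting a crisp subset with a fuzzy subset keeps the object's grade for assertions inside the attribute and gives zero elsewhere; zero grades are simply omitted from the result.

```python
def attribute_value(attribute_assertions, obj):
    """Return V_j(d), the intersection of the crisp set A(V_j) with the fuzzy set d.

    Only assertions with a nonzero grade are kept in the returned mapping.
    """
    return {a: obj[a] for a in attribute_assertions if obj.get(a, 0.0) > 0.0}


# Hypothetical object and attribute.
movie = {"year_1993": 1.0, "year_1995": 0.0, "stars_DeNiro": 1.0, "is_comedy": 0.7}
year = {"year_1993", "year_1995"}
value = attribute_value(year, movie)
```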
Often the information about an object will be specified directly in terms of attribute values. We
shall assume the ability to extract information about assertion validity from information expressed
about attribute values. To illustrate this we consider the following. Let $A(V_j) = \{A_{j1}, A_{j2}, \ldots, A_{jn}\}$ be
the subset of assertions related to the attribute $V_j$. If we are informed that the value of attribute
$V_j$ for object $d$ is $q$ this means that $V_j(d) = \{A_{jq}\}$, where $A_{jq}$ is the assertion that $V_j$ is $q$. Since
$V_j(d) = A(V_j) \cap d$, we can conclude that $A_{ji}(d) = 0$ for all $i \neq q$ and $A_{jq}(d) = 1$.
In relating the knowledge about assertion validation and feature values it is necessary to carefully
distinguish between features that can only assume one unique value, such as date of release of
a movie, and features that can assume multiple values, such as people starring in the movie. In the
first case multiple assertions in $V_j(d)$ are an indication of uncertainty regarding our knowledge of the
value of $V_j$. In the second case multiple assertions in $V_j(d)$ are an indication of multiple solutions
for $V_j$. Here we shall not further pursue this important issue regarding different types of variables
but only point to [17] for those interested. Here, we shall assume the ability to interchange between
these two representations.
4. Modeling user expressed preferences
The basic function of a recommender system is to use what we shall call justifications to generate
recommendations to a user. By a justification we shall mean a rationale for believing a user may like
an object. These justifications can be obtained either from preferences directly expressed by users
or induced using data about the user's experiences. In the following we shall look at techniques
for obtaining recommendations which make use of a representation of the objects. As we noted,
not all recommender technologies require representations, collaborative filtering being an example of
a technology that does not need representations of the objects.
In this section we shall consider the situation in which we have a representation of the objects and
the user has specified their preferences intentionally in some manner compatible with this representation.
This situation is closely related to the problem of information retrieval [1]. The availability
of technologies for this environment is quite rich. The quality of performance of a recommender
system in this environment is strongly dependent upon the ability of the system to allow the user to
effectively express their preferences. This capability is itself dependent both upon the assertions and
features used to represent the object and the sophistication of the language available to the user to
express their preferences in terms of these assertions and features.
In the following we briefly describe a language which we introduced in [18]. This language, called
Hi-Ret, is very expressive and makes considerable use of the ordered
weighted averaging (OWA) operator [13,21]. We shall first briefly describe this operator.
The OWA operator $F$ of dimension $n$ is a mapping $\mathrm{OWA}: \mathbb{R}^n \to \mathbb{R}$ characterized by an $n$-dimensional
vector $W$, called the weighting vector, such that its components $w_j$, $j = 1$ to $n$, lie in the unit interval
and sum to one. The OWA aggregation is defined as
$\mathrm{OWA}(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j b_j$,
where $b_j$ is the $j$th largest of the $a_i$.
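The definition translates almost directly into code. The following is a small sketch; the function name `owa` and the argument checks are illustrative choices, not part of the paper.

```python
def owa(weights, args):
    """Ordered weighted average: sum of w_j * b_j, b_j being the j-th largest argument."""
    if len(weights) != len(args):
        raise ValueError("weights and args must have the same length")
    if any(w < 0 or w > 1 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to one")
    ordered = sorted(args, reverse=True)  # the ordered argument vector B
    return sum(w * b for w, b in zip(weights, ordered))


# The weights attach to positions in the ordering, not to particular arguments.
result = owa([0.5, 0.3, 0.2], [0.2, 0.9, 0.6])
```

Here the arguments are reordered to 0.9, 0.6, 0.2 before weighting, so the largest argument receives the weight 0.5 regardless of where it appeared in the input.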
The unique feature of this operator is the ordering of the arguments by value, a process that
introduces a nonlinearity into the operation. We can represent this aggregation operator in vector
notation as $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = W^T B$, where $W$ is the weighting vector and $B$ is a vector, called
the ordered argument vector, whose components are the $b_j$. The generality of the operator lies in
the fact that by selecting $W$ we can implement many different aggregation operators. Specifically,
by appropriately selecting the weights in $W$, we can emphasize different arguments based upon their
position in the ordering. From an application point of view an important feature of this operator is that the
characterizing vector $W$ can be readily related to natural language expressions of aggregation rules.
A number of special cases of this operator are illustrated in the following. If the components in $W$
are such that $w_1 = 1$ and $w_j = 0$ for all $j \neq 1$ we get $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = \max_j[a_j]$. We denote this
weighting vector as $W^*$. If the weights are $w_n = 1$ and $w_j = 0$ for $j \neq n$ we get
$\mathrm{OWA}(a_1, a_2, \ldots, a_n) = \min_j[a_j]$. We denote this weighting vector as $W_*$. If the weights are such that $w_j = 1/n$ for all $j$,
denoted $W_{\mathrm{ave}}$, then $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = (1/n)\sum_{j=1}^{n} a_j$. Thus we see that the simple average is
a special case of these operators.
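These three special cases can be checked numerically. The sketch below redefines a bare `owa` helper so that it is self-contained; the variable names `w_max`, `w_min` and `w_ave` are illustrative.

```python
def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


args = [0.2, 0.9, 0.6]
n = len(args)

w_max = [1.0] + [0.0] * (n - 1)   # all weight on the largest argument
w_min = [0.0] * (n - 1) + [1.0]   # all weight on the smallest argument
w_ave = [1.0 / n] * n             # equal weights

largest = owa(w_max, args)
smallest = owa(w_min, args)
average = owa(w_ave, args)
```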
A number of different methods have been suggested for obtaining the weighting vector to be used
in the aggregation. For our purpose we shall use an approach based upon the idea of linguistic
quantifiers. Classical logic provides two quantifiers for aggregating truth values, "for all" and "there
exists"; these correspond to "anding" and "oring". The concept of linguistic quantifiers was originally
introduced by Zadeh [22] to help formalize the many expressions of quantification available in natural
language. According to Zadeh a linguistic quantifier is a natural language expression corresponding
to a proportional quantity. Examples of this are at least one, all, at least $\alpha$%, most, more than a few,
and some. Zadeh [22] suggested a method for formally representing these linguistic quantifiers.
Let $Q$ be a linguistic expression corresponding to a quantifier such as most; then Zadeh suggested
representing this as a fuzzy subset $Q$ over $I = [0, 1]$ in which for any proportion $r \in I$, $Q(r)$ indicates
the degree to which $r$ satisfies the concept identified by the quantifier $Q$.
[Fig. 2. Linguistic quantifier "at least $\alpha$".]
In [15], Yager considered the use of linguistic quantifiers to generalize the logical quantification
operation. He considered the valuation of the statement $Q(a_1, \ldots, a_n)$ where $Q$ is a linguistic quantifier
and the $a_j$ are truth values. It was suggested that the truth value of this type of statement could be
obtained with the aid of the OWA operator. This process involved first representing the quantifier
$Q$ as a fuzzy subset $Q$ and then using $Q$ to obtain an OWA weighting vector $W$ which was used
to perform an OWA aggregation of the $a_i$. Formally we denote this as
$Q(a_1, \ldots, a_n) = \mathrm{OWA}_Q(a_1, \ldots, a_n)$.
In the following we shall describe the process of obtaining the weighting vector from the associated
fuzzy subset $Q$. Here we shall restrict ourselves to the class of linguistic quantifiers called RIM
quantifiers. A RIM quantifier is represented by a fuzzy subset $Q: I \to I$ in which $Q(0) = 0$, $Q(1) = 1$,
and if $r_1 \geq r_2$ then $Q(r_1) \geq Q(r_2)$ (monotonic). These RIM quantifiers model the class in which
an increase in proportion results in an increase in compatibility with the linguistic expression being
modeled. Examples of these types of quantifiers are at least one, all, at least $\alpha$%, most, more than
a few, some. These are the types of quantifiers that are generally used by people in expressing their
preferences. If $Q$ is a RIM quantifier we associate with it an OWA weighting vector $W$ such that
$w_j = Q(j/n) - Q((j-1)/n)$ for $j = 1$ to $n$.
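The weight formula can be sketched as a short function. The quantifier used in the example, $Q(r) = r^2$, is just one convenient RIM quantifier picked for illustration, and the name `rim_weights` is hypothetical.

```python
def rim_weights(q, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n), j = 1..n, for a RIM quantifier Q."""
    return [q(j / n) - q((j - 1) / n) for j in range(1, n + 1)]


def square(r):
    """Example RIM quantifier: Q(0) = 0, Q(1) = 1, and Q is non-decreasing."""
    return r * r


weights = rim_weights(square, 4)
```

The weights telescope to $Q(1) - Q(0) = 1$ and are non-negative because $Q$ is monotonic, so they form a valid OWA weighting vector.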
Fig. 2 is seen as corresponding to the quantifier at least $\alpha$%. For this quantifier $w_j = 1$ for $j$ such
that $(j-1)/n < \alpha \leq j/n$ and $w_j = 0$ for all others.
Another quantifier is one in which $Q(r) = r$ for $r \in [0, 1]$. For this quantifier we get $w_j = 1/n$ for
all $j$. This gives us the simple average. We shall denote this quantifier as some.
One can consider parameterized families of quantifiers [14]. For example, consider the parameterized
family $Q(r) = r^\alpha$ where $\alpha \in [0, \infty]$. Here if $\alpha = 0$, we get the existential quantifier; when $\alpha \to \infty$,
we get the quantifier for all; and when $\alpha = 1$, we get the quantifier some. In addition, for the case in
which $\alpha = 2$, $Q(r) = r^2$, we get one possible interpretation of the quantifier most.
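The behaviour of this family at its extremes can be illustrated numerically. In the sketch below the parameter is called `alpha` (an assumed name), $Q(0)$ is forced to 0 so that the exponent 0 still yields a RIM quantifier, and a large finite exponent stands in for the limiting case.

```python
def power_quantifier(alpha):
    """Q(r) = r**alpha, with Q(0) = 0 enforced so that alpha = 0 is still RIM."""
    def q(r):
        return 0.0 if r == 0 else r ** alpha
    return q


def rim_weights(q, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n) for j = 1..n."""
    return [q(j / n) - q((j - 1) / n) for j in range(1, n + 1)]


def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


args = [0.2, 0.9, 0.6]
n = len(args)
exists = owa(rim_weights(power_quantifier(0), n), args)       # behaves like max
some = owa(rim_weights(power_quantifier(1), n), args)         # simple average
nearly_all = owa(rim_weights(power_quantifier(50), n), args)  # approaches min
```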
We are now in a position to describe the use of the OWA operator in the construction of a recommender
system. We shall assume available to the user a vocabulary of linguistic quantifiers
$Q = \{Q_1, Q_2, \ldots, Q_q\}$ in which they can express themselves. Furthermore, we assume that, transparent to
the user, each of these quantifiers $Q_k$ is represented by a fuzzy subset of the unit
interval, also denoted $Q_k$.
We now turn to the representation of user preference information. We first introduce the idea of a primal
preference module (PPM). As we shall see this will serve as the basic unit which can be used to
evaluate the appropriateness of an object for recommendation based on the user's preferences. A
PPM is of the form $\langle A_1, \ldots, A_q : Q \rangle$. The components of a PPM, the $A_i$, are assertions associated
with the objects in $D$ and $Q$ is a linguistic quantifier. With a PPM a user can express preference
information by describing what properties they are interested in with respect to the class of objects in
$D$ and then using $Q$ to capture the desired relationship between these properties. For example, do
they desire all or most or some or at least one of these requirements satisfied? If $h$ is a PPM we
can evaluate any object in $D$ with respect to this. In particular for object $d$ we obtain the values
$A_j(d)$ from our representation of $d$ and then use the OWA aggregation to evaluate it,
$h(d) = \mathrm{OWA}_Q[A_1(d), A_2(d), \ldots, A_q(d)]$.
Here the weighting vector is determined from $Q$.
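Evaluating a PPM for an object combines the pieces introduced so far. The sketch below assumes the object is a mapping from assertion names to validities and uses $Q(r) = r^2$ as an example reading of "most"; the function name `evaluate_ppm` and the data are hypothetical.

```python
def rim_weights(q, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n) for j = 1..n."""
    return [q(j / n) - q((j - 1) / n) for j in range(1, n + 1)]


def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


def evaluate_ppm(assertion_names, q, obj):
    """h(d): OWA of the assertion validities, with weights derived from Q."""
    values = [obj.get(a, 0.0) for a in assertion_names]
    return owa(rim_weights(q, len(values)), values)


def most(r):
    """One possible interpretation of 'most': Q(r) = r squared."""
    return r * r


movie = {"is_comedy": 0.7, "stars_DeNiro": 1.0, "made_in_1993": 0.0}
h = evaluate_ppm(["is_comedy", "stars_DeNiro"], most, movie)
```

With two assertions the weights are 0.25 and 0.75, so the less satisfied assertion dominates the score, which is the intended effect of a quantifier close to "all".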
While the PPM can be directly evaluated for any object, the great benefit of our system is that
we can let users express their preferences in much more sophisticated ways. We now shall introduce
the idea of a basic preference module (BPM). A BPM is a module of the form
$m = \langle C_1, C_2, \ldots, C_p : Q \rangle$
in which the $C_i$ are called the components of the BPM. The only required property of these components
is that they can be evaluated for each object. That is, for any $C_i$ we need to be able to
obtain $C_i(d)$. Once having this we can obtain using the OWA aggregation
$m(d) = \mathrm{OWA}_Q[C_1(d), \ldots, C_p(d)]$.
Let us see what kinds of elements can constitute the $C_i$. Clearly the $C_i$ can be any of the assertions in
the set $A$. More generally the $C_i$ can be any PPM as we know how to evaluate these. Even more
generally the $C_i$ can itself be a BPM if we can evaluate it (see footnote 2). Additionally the $C_i$ can be the negation
of any of the preceding types. For example, if $C$ is a component which we can evaluate and we include as
one of our components "not $C$", denoted $\bar{C}$, then $\bar{C}(d) = 1 - C(d)$.
We note that preferences specified in terms of attribute values can be easily represented in this
framework. Let us illustrate this. Consider an attribute $V_j$ and let $A(V_j) = \{A_{j1}, A_{j2}, \ldots, A_{jn}\}$ be the
subset of assertions related to the attribute $V_j$. Without loss of generality we shall let $A_{ji}$ indicate
the assertion that $V_j$ is $a_i$. First let us consider the case where $V_j$ is a variable, such as star in
a movie, which can take multiple solutions. The requirement that $V_j$ has $a_q$ as one of its values
can be easily expressed simply using the assertion $A_{jq}$ as one of the components in our preference
modules. Consider now the situation where $V_j$ is an attribute, such as year of release of a movie, that
can assume one and only one value. Consider now the representation of the desire that $V_j$ is $a_1$. We
2 In this case we must be careful to avoid self-reference.
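[Fig. 3. Hierarchical structure of BPM: a tree in which the BPM for $C$ has components $C_1, \ldots, C_n$, each defined by a further BPM, down to primitive assertions such as $A_i$ and $A_j$ at the leaves.]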
represent this as the BPM $m = \langle C_1, C_2 : \text{all} \rangle$ where $C_1$ is simply the assertion $A_{j1}$. The component
$C_2$ is obtained as "not $C_3$" where $C_3$ is the BPM defined by $\langle A_{j2}, A_{j3}, \ldots, A_{jn} : Q \rangle$ where $Q$ is the
quantifier any. The preceding illustrates the ability to formalize preferences expressed in terms
of attributes within this framework. This allows users to express their preferences in terms of attribute
requirements.
Using this framework based on BPMs we can express very sophisticated user preferences. Using
a BPM we can express any type of user preference information as long as it can be evaluated by
decomposing it into primitive assertions. Of particular value is the fact that a user can express
their preferences even using concepts and language not within the given set of primitive assertions
and associated attributes as long as they can eventually formulate their concepts using the primitive
assertions. The general structure resulting from the use of BPMs is a hierarchical tree structure
whose leaves are primitive assertions (see Fig. 3).
Let us see the process. A user expresses a predilection, $C$, for some types of objects. This
predilection is formalized in terms of some BPM, a collection of components (criteria) and some
quantifier relating these components. These components get further expressed (decomposed) by BPMs
which are then further decomposed until we reach a component that is a primitive assertion, which
terminates a branch. This process can be considered as a type of grounding. We start at the top
with the most highly abstract cognitive concepts; we then express these using less abstract terms
and continue downward in the tree until we reach a grounded concept, a primitive assertion. Once
having terminated each of the branches with a primitive assertion our tree provides an operational
definition of the predilection expressed by the user. For any object $d$ in $D$ we can evaluate the
degree to which it satisfies the predilection expressed. Starting at the bottom of the tree with the
primitive assertions, whose validities can be obtained from our database, we then back up the tree
using the OWA aggregation method. We stop when we reach the top of the tree; this is the degree
to which the object $d$ satisfies the expressed preference.
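The bottom-up evaluation of such a tree can be sketched recursively. The class names `BPM` and `Not`, the string encoding of primitive assertions, and the step-function definitions of the quantifiers all and any are assumptions made for this illustration; the example encodes the single-valued requirement "the year is 1993" as the BPM with components "year is 1993" and "not any other year" under the quantifier all.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Not:
    component: "Component"


@dataclass
class BPM:
    components: List["Component"]
    quantifier: Callable[[float], float]


Component = Union[str, Not, BPM]  # a str names a primitive assertion (a leaf)


def evaluate(c, obj):
    """Evaluate a component for an object, backing up the tree with OWA."""
    if isinstance(c, str):
        return obj.get(c, 0.0)                      # leaf: validity from the data
    if isinstance(c, Not):
        return 1.0 - evaluate(c.component, obj)     # negation
    values = sorted((evaluate(x, obj) for x in c.components), reverse=True)
    n, q = len(values), c.quantifier
    return sum((q(j / n) - q((j - 1) / n)) * b
               for j, b in enumerate(values, start=1))


def q_all(r):
    return 1.0 if r >= 1.0 else 0.0   # puts all weight on the smallest value


def q_any(r):
    return 1.0 if r > 0.0 else 0.0    # puts all weight on the largest value


year_is_1993 = BPM(
    ["year_1993", Not(BPM(["year_1994", "year_1995"], q_any))],
    q_all,
)
certain = evaluate(year_is_1993, {"year_1993": 1.0})
uncertain = evaluate(year_is_1993, {"year_1993": 1.0, "year_1995": 0.4})
```

Self-reference between modules is not guarded against here, so a tree that contains itself would recurse without terminating.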
5. User profiles
Using these basic preference modules we can now define what we shall call a user profile. One part
of the user profile is the user preference profile, which consists of a collection of basic preference
modules, $m_j$ for $j = 1$ to $K$, of the type described in the preceding section. Each $m_j$ provides
a description of a class of objects from $D$ that the user likes. These BPMs can be simple or
sophisticated, ranging from statements such as "I like Robert DeNiro movies" to complicated descriptions
of movies. For any object $d$, $m_j(d)$ indicates the degree to which it satisfies the BPM $m_j$. As any of
these preference modules provides a justification for recommendation, an object satisfying any one of
these $m_j$ is recommendable to the user. If $M = \{m_1, m_2, \ldots, m_K\}$ is the preference profile for a given
user, then for any object $d$ in $D$ we calculate
$M(d) = \max_j[m_j(d)]$
as the degree of positive recommendation of this object to the user.
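A minimal sketch of this calculation, treating each preference module as an already-evaluable function from an object to a degree in the unit interval. The two example modules and the object are hypothetical, and the second module uses a plain minimum as a stand-in for a BPM with the quantifier all.

```python
def positive_recommendation(modules, obj):
    """M(d): the largest degree to which any preference module is satisfied."""
    return max((m(obj) for m in modules), default=0.0)


def likes_deniro(d):
    return d.get("stars_DeNiro", 0.0)


def likes_90s_comedies(d):
    # Stand-in for a BPM requiring both assertions (quantifier "all").
    return min(d.get("is_comedy", 0.0), d.get("made_in_1990s", 0.0))


movie = {"is_comedy": 0.7, "made_in_1990s": 1.0}
m_d = positive_recommendation([likes_deniro, likes_90s_comedies], movie)
```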
At a formal level one can view a user preference profile $M$ as a BPM in which its components
are the $m_j$. One reason we choose not to do this is that we prefer to reserve the idea of BPM for
user preferences that are in some sense conceptually distinct. Thus, while we can formally combine
a user preference for Robert DeNiro movies with a user preference for 1940s musicals in a single
BPM by just "oring" these, it is more in keeping with the way the user sees these to keep them
distinct. Thus in introducing the user profile we are emphasizing the individuality of each of the
preferences in $M$.
In the preceding we have assumed that each of the BPMs in $M$ had an equal value with regard to
their worth to the user. We can consider the situation in which the user associates with each BPM
$m_j$ in his profile a value $\alpha_j \in [0, 1]$ indicating the weight or strength of this preference module. Using
this we can calculate
$M(d) = \max_j[m_j(d) \wedge \alpha_j]$.
We can also allow a user to supply negative or rejection information. We define a basic rejection
module (BRM) $n_i$ to be a description of objects from $D$ which the user prefers not to have recommended
to him. A BRM is of the same form as a BPM except it describes features which the user
specifies as constituting objects he does not want. Thus a second component of the user profile is
a collection $N = \{n_1, n_2, \ldots\}$ of basic rejection modules. Using this we can calculate the degree of
negative recommendation (rejection) of any object to a user, $N(d) = \max_i[n_i(d)]$. It is not necessary
that a user have any negative modules. Additionally, we can associate with each rejection module
$n_i$ a value $\beta_i \in [0, 1]$ indicating the weight associated with the rejection module $n_i$. Using this we get
$N(d) = \max_i[n_i(d) \wedge \beta_i]$.
We must now combine these two types of scores, recommendation and rejection. Here we just
describe some of the operations available. The builder of the system must implement the one that
best represents the situation they are trying to model. We note that closely related issues have been
discussed by Dubois et al. [3] in their use of examples and counterexamples for querying databases.
Let $R(d)$ indicate the overall degree of recommendation of the object $d$ with respect to a user.
One possibility is some kind of bounded subtraction of the two types of recommendations,
$$R(d) = (M(d) - N(d)) \vee 0.$$
Another possibility is to assume that rejection has priority over preference. In this case we have
$$R(d) = (1 - N(d)) \wedge M(d).$$
Here then we are saying we recommend things that are preferred and not rejected by the user.
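The two combination operators can be compared side by side in a short sketch. It assumes the bounded subtraction is a difference floored at 0 and that "preferred and not rejected" is a minimum of $M(d)$ and $1 - N(d)$; the function names and the sample score pairs are hypothetical.

```python
def bounded_difference(m: float, n: float) -> float:
    """R(d) = max(M(d) - N(d), 0)."""
    return max(m - n, 0.0)


def rejection_priority(m: float, n: float) -> float:
    """R(d) = min(1 - N(d), M(d))."""
    return min(1.0 - n, m)


# (M(d), N(d)) pairs for three hypothetical objects:
# not rejected at all, partially rejected, fully rejected.
cases = [(0.8, 0.0), (0.8, 0.5), (0.8, 1.0)]
combined = [(bounded_difference(m, n), rejection_priority(m, n)) for m, n in cases]
```

Both operators agree at the extremes (no rejection leaves $M(d)$ unchanged, full rejection gives 0). They differ for partial rejection: the bounded difference subtracts the rejection degree, while rejection priority only caps the recommendation at $1 - N(d)$.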
6. Extensionally expressed preference
Now we consider the environment in which each object in $D$ is represented but the user preference
information is expressed extensionally. We assume each user has an associated subset $E$ of $D$
corresponding to the subset of objects which it is known they have experienced. Since we shall
focus on just one user we can equivalently view this as a situation in which each object has an
attribute indicating whether the person has experienced this object or not. Here we shall initially
make a closed world assumption. Under this assumption any object which we have not been
informed of the user experiencing we shall assume they have not experienced. In addition, in some
systems of this type we shall assume that for any object which the user has experienced they have
provided a value $a \in [0, 1]$ indicating their scoring of that object. Our goal here is to suggest
ways in which we can use this type of information to recommend new objects to our user.
In this environment, one basic paradigm that we can use for justifying recommending objects
becomes very obvious. We look for objects which the user has experienced and liked and try
to find objects which the user has not experienced that are similar to these. In trying to implement
this we are faced with the issue of determining the similarity between objects. The problem of
determining similarity is clearly context dependent and often very complex [2]. However, the assumed
availability of a representation for each of the objects allows us to develop some kind of tool for
calculating the similarity (proximity) between objects. In the following we shall assume the
existence of a similarity (proximity) relationship $S$ over the set $D$ of objects. That is, for any two
objects $d_i$ and $d_j$ in $D$ we assume $S(d_i, d_j) \in [0, 1]$ is available. The larger $S(d_i, d_j)$,
the more related or similar the objects. As we already indicated, we also assume the existence of
some subset $E \subseteq D$ of objects the user has experienced. Furthermore, we assume for each
object $d_i$ in $E$ the availability of a rating, $a_i$, indicating the score the user has attributed
to this object. We note these ratings can be viewed as a fuzzy subset $A$ of $E$ in which
$A(d_i) = a_i$. Semantically, $A$ corresponds to the subset of objects the user liked.
Our goal here is to try to obtain a fuzzy subset $R$ over the space $M = D - E$ corresponding
to the objects to be recommended. One approach to obtaining this fuzzy subset is in the spirit of
fuzzy modeling. We shall try to provide a collection of justifications, rules or circumstances, which
indicate that an object in $M$ is suitable for recommendation. If $R_j$ are a collection of circumstances
for recommending objects, where $R_j(d_i)$ indicates the degree to which $d_i \in M$ meets this
condition, then
$$R(d_i) = \max_j [R_j(d_i)].$$
Let us begin by considering some guidelines that can be used to support recommendation of an object
$d_i$. Our focus here is not so much on providing a definitive listing of rules as on seeing how fuzzy
logic can be used to enable the evaluation of some commonsense guidelines which are expressed
in a natural type of language. That is, we are interested in the process of translating some rules
that appear to provide reasonable justifications for recommending objects.
A most natural source of recommendation can be captured by the following guideline.
Rule 1: Recommend an object if there exists a similar object that the user liked. Under this rule
the strength of recommendation of an unexperienced object $d_i$ in $D - E$ can be obtained as
$$R_1(d_i) = \max_{d_j \in E} [S(d_i, d_j) \wedge A(d_j)].$$
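A minimal sketch of Rule 1 follows, assuming that "similar and liked" is evaluated as the minimum of the similarity and the rating. The function name `rule1`, the lookup-table representation of $S$ and the sample values are hypothetical.

```python
from typing import Dict, Tuple

Similarity = Dict[Tuple[str, str], float]  # (candidate, experienced) -> S value


def rule1(candidate: str, ratings: Dict[str, float], sim: Similarity) -> float:
    """R1(d_i) = max over experienced d_j of min(S(d_i, d_j), A(d_j))."""
    return max((min(sim[(candidate, dj)], a) for dj, a in ratings.items()),
               default=0.0)


ratings = {"a": 0.5, "b": 1.0}            # A(d_j) for the experienced objects
sim = {("x", "a"): 0.9, ("x", "b"): 0.4}  # S(x, d_j) for an unexperienced x
r1 = rule1("x", ratings, sim)
```

Here object `a` is close to `x` but only moderately liked, and `b` is well liked but distant, so the strength is limited by the weaker link in the best pair and comes out as 0.5.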
We express a second guideline for recommending objects which can be seen as a softening of the
first rule.
Rule 2: Recommend an object for which we have at least several comparable objects which the
user somewhat liked.
Here we are softening the requirements of Rule 1 by allowing a weaker indication of satisfaction,
somewhat liked, and allowing a weaker connection, as denoted by the use of the word
comparable instead of similar. We are compensating for this reduction by requiring at least several
such objects instead of just a single object. Our goal now is to suggest a way to formalize this
type of rule so that we can evaluate it. In anticipation of expressing this rule we introduce
some fuzzy subsets. First we note that the term at least several is an example of what Zadeh [22]
called a linguistic quantity, words denoting precise or imprecise quantities. In [22] Zadeh suggested
that any linguistic quantity can be represented as a fuzzy subset $Q$ of the set of integers. It is clear
that at least several is monotonic in that $Q(k_1) \geq Q(k_2)$ if $k_1 > k_2$. We must now introduce
a fuzzy subset to capture the idea of somewhat liked. This concept can be modeled in a number of
different ways. With $A$ being the fuzzy subset of $E$ indicating the user's satisfaction,
$A(d_j) = a_j$, we let $\tilde{A}$ be a softening of this corresponding to the concept somewhat liked.
One way of defining $\tilde{A}$ is as
$$\tilde{A}(d_i) = (A(d_i))^{\alpha}$$
for $0 < \alpha < 1$. The smaller $\alpha$, the more the softening. Another method to define
$\tilde{A}$ is
$$\tilde{A}(d_i) = \begin{cases} 1 & \text{if } A(d_i) \geq \beta, \\ A(d_i) & \text{if } A(d_i) < \beta. \end{cases}$$
More generally, we can express $\tilde{A}$ using a transformation function $T : [0, 1] \to [0, 1]$
such that $T(a) \geq a$ and then defining $\tilde{A}(x_j) = T(A(x_j))$. The function $T$ can be
expressed using a fuzzy systems model [12], for example
if a is low then T(a) is medium
if a is moderate then T(a) is high
if a is large then T(a) is very large
Finally, we must define the concept comparable. As used, the term comparable is meant to indicate
a softening of the concept of similar. Again if $T$ is defined as in the preceding, as some softening
function with $T(a) \geq a$, we can use this to provide a definition for comparable. Thus, if $S(x, y)$
indicates the degree of similarity between two objects then we can use
$\mathrm{Comp}(x, y) = T(S(x, y))$ to indicate the degree to which they are comparable. One possible
definition for $T$ in this case is $T(a) = 1$ if $a \geq \beta$ and $T(a) = \beta$ if $a < \beta$.
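The softening transformations can be sketched as ordinary functions on [0, 1]. The fragment below is illustrative only: the parameter names `alpha` and `beta`, their default values, and the function names are assumptions, and `soften_floor` is one possible reading of the last threshold form given for comparable.

```python
def soften_power(a: float, alpha: float = 0.5) -> float:
    """T(a) = a ** alpha with 0 < alpha < 1; smaller alpha softens more."""
    return a ** alpha


def soften_threshold(a: float, beta: float = 0.6) -> float:
    """T(a) = 1 once a reaches beta, otherwise a is left unchanged."""
    return 1.0 if a >= beta else a


def soften_floor(a: float, beta: float = 0.6) -> float:
    """T(a) = 1 once a reaches beta, otherwise beta (assumed reading)."""
    return 1.0 if a >= beta else beta


def comparable(similarity: float, transform=soften_power) -> float:
    """Comp(x, y) = T(S(x, y)) for a chosen softening transform T."""
    return transform(similarity)


# Check the defining property T(a) >= a on a coarse grid of the unit interval.
grid = [k / 10 for k in range(11)]
never_lowers = all(t(a) >= a
                   for t in (soften_power, soften_threshold, soften_floor)
                   for a in grid)
```

Any function that never lowers its argument on the unit interval can be passed as `transform`, so the same `comparable` helper covers all three variants.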
Once having satisfactorily obtained representations of these softening concepts we can use them
to provide an operational formulation of this second rule. For any object $d_i \in D - E$ we have
$$R_2(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \left( \tilde{A}(d_j) \wedge \mathrm{Comp}(d_j, d_i) \right) \right].$$
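In the preceding we can express $\tilde{A}(d_j) = T_1(A(d_j))$ and
$\mathrm{Comp}(d_j, d_i) = T_2(S(d_j, d_i))$ where $T_1$ and $T_2$ are two transformations. It is
interesting to see that our first rule is a special case of this. If we let $T_1$ and $T_2$ be such
that $T_1(a) = T_2(a) = a$, identity transforms, then
$$R_2(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \left( A(d_j) \wedge S(d_j, d_i) \right) \right].$$
Furthermore, if $Q$ is defined to be at least one then $Q(|F|) = 1$ if $F \neq \emptyset$ and hence
$$R_2(d_i) = \max_{F \subseteq E} \left[ \min_{d_j \in F} \left( A(d_j) \wedge S(d_j, d_i) \right) \right].$$
Since enlarging $F$ can only lower the inner minimum, the maximum is attained by a single-element
$F$, so this can be seen to be equal to $\max_{d_j \in E} [A(d_j) \wedge S(d_j, d_i)]$, which is
$R_1(d_i)$.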
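The reduction of the second rule to the first can be checked numerically with a direct, brute-force evaluation over all non-empty subsets $F$ of the experienced objects. This is a sketch under stated assumptions: min is taken as the conjunction between the quantifier, the softened rating and the comparability, the membership function `at_least_several` is hypothetical, and the exhaustive enumeration is only practical for a handful of objects.

```python
from itertools import combinations
from typing import Callable, Dict


def identity(a: float) -> float:
    return a


def rule2(ratings: Dict[str, float], sims: Dict[str, float],
          quantifier: Callable[[int], float],
          t1: Callable[[float], float] = identity,
          t2: Callable[[float], float] = identity) -> float:
    """Max over non-empty F of min(Q(|F|), min over F of min(T1(A), T2(S)))."""
    pair = [min(t1(ratings[dj]), t2(sims[dj])) for dj in ratings]
    best = 0.0
    for size in range(1, len(pair) + 1):
        q = quantifier(size)
        for subset in combinations(pair, size):
            best = max(best, min(q, min(subset)))
    return best


def at_least_one(k: int) -> float:
    return 1.0 if k >= 1 else 0.0


def at_least_several(k: int) -> float:
    # Hypothetical membership: fully satisfied from three objects upward.
    return min(1.0, k / 3)


ratings = {"a": 0.5, "b": 1.0, "c": 0.8}  # A(d_j)
sims = {"a": 0.9, "b": 0.4, "c": 0.7}     # S(d_j, d_i) for one candidate d_i

as_rule1 = rule2(ratings, sims, at_least_one)
rule1_direct = max(min(ratings[dj], sims[dj]) for dj in ratings)
softened = rule2(ratings, sims, at_least_several,
                 t1=lambda a: a ** 0.5, t2=lambda s: s ** 0.5)
```

With identity transforms and the quantifier at least one, `as_rule1` coincides with the direct Rule 1 value. In the softened run the best subset has two members, so the result is limited by the quantifier value for two objects rather than by any single rating or similarity.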
It is interesting to consider a collection of rules of this type,
$R_k = \langle Q_k, \tilde{A}_k, \mathrm{Comp}_k \rangle$, where each is a softening: each one requires
more objects but softens either or both of the requirements regarding satisfaction to the user and
proximity to the object being evaluated. Here then, in this softening process, we are essentially
increasing the radius about the object, decreasing the required strength but increasing the number
of objects that need be found.
Another method for justifying possible objects to recommend is to look for unexperienced objects
that have a lot of neighbors which the user has experienced regardless of the valuation which
they have been given. This captures the idea that the user likes objects of this type regardless of
their evaluation. For example a person may see horror movies even if they think these movies are
bad. We can see that this type of situation can be expressed as an extreme case of the preceding
recommendation rules. Again consider a rule $\langle Q, \tilde{A}, \mathrm{Comp} \rangle$ where we
evaluate it for an item as
$$R(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \left( \tilde{A}(d_j) \wedge \mathrm{Comp}(d_j, d_i) \right) \right].$$
To capture the above imperative we let $\tilde{A}(d_j) = 1$ if $d_j \in E$ and hence we get
$$R(d_i) = \max_{F \subseteq E} \left[ Q(|F|) \wedge \min_{d_j \in F} \mathrm{Comp}(d_j, d_i) \right].$$
Letting $\mathrm{Comp}(d_j, d_i) = S(d_j, d_i)$ we have
$R(d_i) = \max_{F \subseteq E} [Q(|F|) \wedge \min_{d_j \in F} S(d_j, d_i)]$. This can be
seen as a type of fuzzy integral [11]. Let $\mathrm{Sindex}_i(k)$ be the similarity of the $k$th most
similar object in $E$ to the object $d_i$. Furthermore let $q_k = Q(k)$; then
$$R(d_i) = \max_k [q_k \wedge \mathrm{Sindex}_i(k)].$$
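The ranked form avoids enumerating subsets: for a fixed size $k$ the best subset consists of the $k$ most similar experienced objects, and its minimum is the $k$th largest similarity. A minimal sketch, again assuming min as the conjunction and using a hypothetical quantifier `at_least_several`:

```python
from typing import Callable, Sequence


def density_score(sims: Sequence[float],
                  quantifier: Callable[[int], float]) -> float:
    """R(d_i) = max_k min(Q(k), k-th largest similarity to d_i)."""
    ranked = sorted(sims, reverse=True)  # ranked[k - 1] plays the role of Sindex_i(k)
    return max((min(quantifier(k), s) for k, s in enumerate(ranked, start=1)),
               default=0.0)


def at_least_several(k: int) -> float:
    # Hypothetical membership: fully satisfied from three objects upward.
    return min(1.0, k / 3)


# Similarities of three experienced objects to one candidate; ratings are ignored.
score = density_score([0.9, 0.4, 0.7], at_least_several)
```

Sorting costs $O(n \log n)$ in the number of experienced objects, compared with the exponential number of subsets in the defining expression.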
The essential idea of the preceding methods for justifying items to recommend was based on
the process of discovering unexperienced items located in areas of the object space that are rich in
objects that the user liked or experienced. We can capture this imperative in an alternative manner,
one that is in the spirit of the mountain method [19,20]. For each unexperienced item,
$d_i \in D - E$, we calculate
$$M_1(d_i) = \sum_{d_j \in E} a_j S(d_i, d_j).$$
Here $M_1(d_i)$ can be seen as a kind of support for $d_i$ based on a weighted sum. We let $d^*$ be
such that $M_1(d^*) = \max_i [M_1(d_i)]$. Using this we can obtain an evaluation or score for each
$d_i$ as $R(d_i) = M_1(d_i) / M_1(d^*)$. Alternatively, for each $d_i \in D - E$ we can calculate
$$M_2(d_i) = \sum_{d_j \in E} S(d_i, d_j),$$
the density of nearby experienced objects regardless of the user's rating. We then calculate $d^+$
such that $M_2(d^+) = \max_i [M_2(d_i)]$. From this we obtain a normalized score function
$R(d_i) = M_2(d_i) / M_2(d^+)$.
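Both normalized support scores can be computed by one helper that either weights each similarity by the user's rating or ignores the ratings. The sketch below is illustrative: the function name `mountain_scores`, the lookup-table form of $S$, the sample data, and the choice to return all zeros when no candidate has any support are assumptions.

```python
from typing import Dict, Optional, Sequence, Tuple


def mountain_scores(candidates: Sequence[str], experienced: Sequence[str],
                    sim: Dict[Tuple[str, str], float],
                    ratings: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Support of each candidate, divided by the largest support found.

    With ratings the support is sum_j a_j * S(d_i, d_j); without ratings it is
    sum_j S(d_i, d_j).
    """
    support = {}
    for di in candidates:
        support[di] = sum(
            (ratings[dj] if ratings is not None else 1.0) * sim[(di, dj)]
            for dj in experienced)
    peak = max(support.values(), default=0.0)
    if peak == 0.0:
        # Assumption: with no support anywhere, nothing is recommended.
        return {di: 0.0 for di in support}
    return {di: s / peak for di, s in support.items()}


sim = {("x", "a"): 0.9, ("x", "b"): 0.4, ("y", "a"): 0.2, ("y", "b"): 0.5}
ratings = {"a": 0.5, "b": 1.0}

rated = mountain_scores(["x", "y"], ["a", "b"], sim, ratings)
unrated = mountain_scores(["x", "y"], ["a", "b"], sim)
```

Because of the normalization, the best-supported candidate always scores 1 and the remaining scores express support relative to it.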
7. Using domain expert prototypes
We shall now consider another approach for obtaining a basis for the recommendation of objects. In
this approach, we rely upon the use of expert-defined prototype objects. That is, using the features
available in the representation of the objects we allow experts in the domain to define prototype
objects. In some sense these prototypes can be seen as a reflection of the language and categories in
which the community discusses the domain. For example, terms like film noir, happy movie,
blockbuster, epic can be seen as prototypes used in the movie domain. The actual creation and selection
of these domain prototypes is clearly a creative activity and we shall not venture into the issue of
obtaining methods for generating prototypes. From a formal point of view these prototypes can be
expressed in terms of the primitive attributes of the objects in a manner analogous to that used
to express user preference models.
Here we shall assume the availability of a collection of prototypes. For our subsequent purposes
we shall consider a prototype object $T_i$ to be some function of the domain of attributes such that
for each $d_j \in D$, we can obtain $T_i(d_j)$ as a value in the unit interval.
As a justification for recommendation we can say that if a user likes prototype class $T_i$ and
if a given unexperienced object $d$ is in this prototype class then we recommend the object. To
calculate the degree to which an object $d$ is of type $T_i$ we use our definition of this prototype
class $T_i$ in which $T_i(d)$ indicates the degree to which $d$ belongs to the prototype. If we let
$\lambda_i$ indicate the degree to which the user likes type $T_i$ objects then the degree of
recommendation of object $d$ is
$$R_i(d) = \lambda_i \wedge T_i(d).$$
In order to implement this we need to obtain $\lambda_i$, the degree to which the user likes type
$T_i$ objects. One method to determine whether the user likes objects in class $T_i$ is as follows.
For any experienced object $d_j \in E$ let $a_j$ be the user's rating. Then if we calculate
$$L(T_i) = \frac{\sum_{d_j \in E} a_j T_i(d_j)}{\sum_{d_j \in E} T_i(d_j)}$$
this gives us the user's average rating for objects in class $T_i$. We can use this for $\lambda_i$.
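The membership-weighted average rating and the resulting prototype-based recommendation can be sketched as follows. The function names, the sample ratings and memberships, the use of min to combine liking with membership, and the fallback value of 0 when the user has experienced nothing in the class are all assumptions made for illustration.

```python
from typing import Dict


def average_rating(ratings: Dict[str, float], membership: Dict[str, float]) -> float:
    """L(T_i): ratings of experienced objects averaged with weights T_i(d_j)."""
    weight = sum(membership[dj] for dj in ratings)
    if weight == 0.0:
        return 0.0  # assumption: no experience in the class gives no evidence of liking
    return sum(ratings[dj] * membership[dj] for dj in ratings) / weight


def prototype_recommendation(liking: float, candidate_membership: float) -> float:
    """R_i(d) = min(liking for T_i, T_i(d))."""
    return min(liking, candidate_membership)


ratings = {"a": 0.8, "b": 0.2, "c": 0.6}    # a_j for the experienced objects
film_noir = {"a": 1.0, "b": 0.5, "c": 0.0}  # T_i(d_j) for the same objects

liking = average_rating(ratings, film_noir)
score = prototype_recommendation(liking, 0.9)  # candidate with T_i(d) = 0.9
```

Object `c` does not belong to the class at all, so its rating has no influence on the average, while `b` counts with half the weight of `a`.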
Let us consider another method for determining a user's inclination toward objects in class $T_i$.
If a user has experienced a lot of objects in the class $T_i$ it is reasonable to assume that they are
interested in this class. This interest may be independent of whether they report liking the objects
or not. People experience things for various, sometimes neurotic, reasons, not necessarily only
because they think they are good. The term camp, used to describe objects that are so bad they become
amusing, reflects this situation. Thus, it appears useful to be able to provide some indication that
a user has a significant degree of interest in movies of type $T_i$ based solely on the quantity of
items of this type experienced by the user. We can calculate the number of objects of type $T_i$
experienced by this user as
$$N(T_i) = \sum_{d_j \in E} T_i(d_j).$$
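Using this, we can calculate a number of indices. The first,
$$P_1(T_i) = \frac{\sum_{d_j \in E} T_i(d_j)}{\sum_{d_j \in D} T_i(d_j)},$$
indicates the proportion of available type $T_i$ objects experienced by this user. The second index is
$$P_2(T_i) = \frac{\sum_{d_j \in E} T_i(d_j)}{\sum_{k} \left( \sum_{d_j \in E} T_k(d_j) \right)}.$$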
We now define a fuzzy subset of the unit interval corresponding to the concept significant
proportion and calculate the degree of membership of both $P_1(T_i)$ and $P_2(T_i)$ in this set. Let us
denote these values as $SP_1(T_i)$ and $SP_2(T_i)$. Using these and the value $L(T_i)$ we can calculate
$\lambda_i = \max[L(T_i), SP_1(T_i), SP_2(T_i)]$ and then use as our degree of recommendation
$R_i(d) = \lambda_i \wedge T_i(d)$.
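The quantity-based indices can be sketched in a few lines. Several choices here are assumptions rather than part of the method as stated: the second index is read as the share of the user's experienced objects, summed over all prototypes, that falls in the given prototype; the piecewise-linear shape and breakpoints of `significant_proportion` are hypothetical; and each membership table is assumed to list every object in $D$.

```python
from typing import Dict, Sequence

Membership = Dict[str, float]  # object id -> degree of membership in a prototype


def experienced_mass(t: Membership, experienced: Sequence[str]) -> float:
    """N(T_i): fuzzy count of experienced objects belonging to the prototype."""
    return sum(t.get(dj, 0.0) for dj in experienced)


def p1(t: Membership, experienced: Sequence[str]) -> float:
    """Share of all available objects of this type that the user has experienced."""
    total = sum(t.values())
    return experienced_mass(t, experienced) / total if total else 0.0


def p2(prototypes: Dict[str, Membership], name: str,
       experienced: Sequence[str]) -> float:
    """Share of the user's experience that falls in this type (assumed reading)."""
    total = sum(experienced_mass(t, experienced) for t in prototypes.values())
    return experienced_mass(prototypes[name], experienced) / total if total else 0.0


def significant_proportion(p: float, low: float = 0.2, high: float = 0.6) -> float:
    """Hypothetical membership: 0 up to low, 1 from high, linear in between."""
    if p <= low:
        return 0.0
    if p >= high:
        return 1.0
    return (p - low) / (high - low)


prototypes = {
    "horror": {"a": 1.0, "b": 1.0, "c": 0.0, "x": 1.0},
    "epic": {"a": 0.0, "b": 0.0, "c": 1.0, "x": 0.0},
}
experienced = ["a", "b", "c"]  # "x" has not been experienced
average_rating = 0.3           # the user rates horror movies poorly

sp1 = significant_proportion(p1(prototypes["horror"], experienced))
sp2 = significant_proportion(p2(prototypes, "horror", experienced))
liking = max(average_rating, sp1, sp2)
score_x = min(liking, prototypes["horror"]["x"])
```

In this example the user gives horror movies low ratings but has seen most of them, so the quantity-based indices dominate and the unseen horror movie `x` is still recommended.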
8. Conclusion
Here we have considered methodologies for constructing recommender systems. The reclusive
approaches studied here differ from collaborative filtering in that they are based solely on the
preferences of the individual for whom we are providing the recommendation and make no use of the
preferences of other individuals. We have called these reclusive methods. Another important feature
distinguishing these reclusive methods from collaborative methods is that they require a representation
of the objects, which is not necessarily required by collaborative filtering methods. While our focus
has been on reclusive rather than collaborative methods, optimal recommender systems should
of course use all information available and hence should be based on a combination of these two
classes of systems. In future research we shall look at methods integrating collaborative and reclusive
approaches.
References
[1] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley, Reading, MA, 1999.
[2] J.C. Bezdek, J. Keller, R. Krishnapuram, N.R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image
Processing, Kluwer, Boston, 1999.
[3] D. Dubois, H. Prade, F. Sedes, Fuzzy logic techniques in multimedia database querying: a preliminary investigation
of the potentials, IEEE Trans. Knowledge Data Eng. 13 (2001) 383-392.
[4] D. Goldberg, D. Nichols, B.M. Oki, D. Terry, Using collaborative filtering to weave an information tapestry, Comm.
ACM 35 (12) (1992) 61-70.
[5] H. Kautz, Recommender Systems, AAAI Press, Menlo Park, CA, 1998.
[6] J.A. Konstan, B.N. Miller, D. Maltz, J.L. Herlocker, L.R. Gordon, J. Riedl, Grouplens: applying collaborative filtering
to Usenet news, Comm. ACM 40 (3) (1997) 77-87.
[7] P. Perny, J.-D. Zucker, Collaborative filtering methods based on fuzzy preference relations, EUROFUSE-SIC99,
Budapest, 1999.
[8] P. Resnick, H.R. Varian, Recommender systems, Comm. ACM 40 (3) (1997) 56-58.
[9] J.B. Schafer, J.A. Konstan, J. Riedl, E-Commerce recommendation applications, Data Mining Knowledge Discovery
5 (2001) 115-153.
[10] U. Shardanand, P. Maes, Social information filtering: algorithms for automating word of mouth, Proc. Computer
Human Interaction-95 Conference, Denver, 1995, pp. 210-217.
[11] M. Sugeno, Fuzzy measures and fuzzy integrals: a survey, in: M.M. Gupta, G.N. Saridis, B.R. Gaines (Eds.), Fuzzy
Automata and Decision Process, North-Holland, Amsterdam, 1977, pp. 89-102.
[12] T. Terano, K. Asai, M. Sugeno, Applied Fuzzy Systems, Academic Press, Orlando, FL, 1994.
[13] R.R. Yager, On ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans.
Systems Man Cybernet. 18 (1988) 183-190.
[14] R.R. Yager, Families of OWA operators, Fuzzy Sets and Systems 59 (1993) 125-148.
[15] R.R. Yager, Quantifier guided aggregation using OWA operators, Internat. J. Intell. Systems 11 (1996) 49-73.
[16] R.R. Yager, Targeted e-commerce marketing using fuzzy intelligent agents, IEEE Intell. Systems (2000) 42-45.
[17] R.R. Yager, Veristic variables, IEEE Trans. Systems Man Cybernet. Part B: Cybernetics 30 (2000) 71-84.
[18] R.R. Yager, A hierarchical document retrieval language, Inform. Retrieval 3 (2000) 357-377.
[19] R.R. Yager, D.P. Filev, Approximate clustering via the mountain method, IEEE Trans. Systems Man Cybernet. 24
(1994) 1279-1284.
[20] R.R. Yager, D.P. Filev, Generation of fuzzy rules by mountain clustering, J. Intell. Fuzzy Systems 2 (1994) 209-219.
[21] R.R. Yager, J. Kacprzyk, The Ordered Weighted Averaging Operators: Theory and Applications, Kluwer, Norwell,
MA, 1997.
[22] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. Math. Appl. 9 (1983)
149-184.
[23] L.A. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic,
Fuzzy Sets and Systems 90 (1997) 111-127.
[24] L.A. Zadeh, A new direction in AI: toward a computational theory of perceptions, Artif. Intell. Mag. 22 (1) (2001)
73-84.