    Fuzzy Sets and Systems 136 (2003) 133–149

    www.elsevier.com/locate/fss

    Fuzzy logic methods in recommender systems

    Ronald R. Yager

    Machine Intelligence Institute, Iona College, 715 North Avenue, New Rochelle, NY 10801, USA

    Received 5 July 2001; received in revised form 4 March 2002; accepted 25 April 2002

    Abstract

    Here we consider methodologies for constructing recommender systems. The approaches studied here differ
    from collaborative filtering: they are based solely on the preferences of the single individual for whom we
    are providing the recommendation and make no use of the preferences of other collaborators. We have called
    these reclusive methods. Another important feature distinguishing these reclusive methods from collaborative
    methods is that they require a representation of the objects. Considerable use is made of fuzzy set methods
    for the representation and subsequent construction of justifications and recommendation rules. It is pointed
    out that these reclusive methods, rather than being competitive with collaborative methods, are complementary.
    © 2002 Elsevier Science B.V. All rights reserved.

    Keywords: Customization; Recommender systems; Fuzzy methods; Collaborative filtering

    1. Introduction

    Recommender systems [8,5] are a rapidly emerging class of software, especially within the domain
    of E-Commerce [9]. Their importance is directly related to the ability of the internet to collect,
    store and process vast quantities of information about individuals' actions and preferences. They
    are an important component toward the goal of providing specific customized information to each user.
    Most of the current generation of recommender systems are based on collaborative filtering
    technologies [4,6,7,10]. An important component of collaborative filtering systems is the calculation of
    similarity of interest based on correlations between individuals. In order to predict a user's potential
    interest in some object they have not experienced, collaborative filtering uses these measures of
    similarity of interest in conjunction with ratings of the object by other individuals who have
    experienced the object. An important feature of these pure collaborative filtering systems is that they do
    not require any representation of the objects being considered.

    Tel.: +1-212-249-2047; fax: +1-212-249-1689.

    E-mail address: [email protected] (R.R. Yager).

    0165-0114/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
    PII: S0165-0114(02)00223-3

    In this work we shall focus on a different class of recommender systems which are not
    collaborative. These types of systems, which we call reclusive, use only preference information about the
    user of interest. These types of systems require some representation of the objects.

    An essential difference between the collaborative filtering approach and the reclusive approach is
    that collaborative filtering is based upon finding a similarity between people whereas the reclusive
    approach is based upon finding a similarity between the objects.

    What is clear, although we shall not study it here, is that future recommender systems will
    incorporate both these perspectives.

    2. A general view of recommender systems

    A recommender system is associated with a collection of objects $D = \{d_1, \ldots, d_n\}$. The purpose
    of this system is to recommend to the user objects of $D$ that may be of interest to him. As a tangible
    example of a recommender system we shall often find it convenient to use one in which the objects
    are movies.

    Here we shall consider some approaches to this problem of recommendation and shall describe

    some methods for performing this task. The implementation of technologies for developing these

    systems is strongly dependent upon the type of information that is being used. In the following, we

    shall discuss the types of information that may be available to a recommender system.

    A prime source of information for use in a recommender system is the knowledge about
    the objects in $D$. The usefulness of this information is dependent upon the representation used
    for the objects in $D$. The least information-rich situation is the one in which all we have is
    some unique identification of an object and no other information. For example, all we know about
    a movie is its title. A more information-rich environment is one in which we describe an object
    with some attributes. For example, we indicate the year the movie was made, the type of movie, the
    stars. These attributes and their associated values provide a representation of the objects. We can
    have degrees of representation; more sophisticated representations will depend upon the features used
    to characterize the objects. Many techniques that can be used in recommender systems are based
    upon using some representation of the objects. Generally, the more sophisticated the representation,
    the better these techniques perform.

    In order to make a recommendation to a user we must have some information about the user's
    preferences. Information about user preferences can essentially be obtained in two different ways,
    although these need not be mutually exclusive. We shall refer to these two modes, respectively, as
    extensionally and intentionally expressed preference information. By extensionally expressed
    preference information we mean information based upon the actions or past experiences of the user with
    respect to specific objects of the type found in $D$. Examples of this are movies a user has previously
    seen and possibly some rating of these movies. In another domain we could mean the objects which
    the user has purchased. By intentionally expressed information we mean some specification by the
    user of what they desire in objects of the type under consideration. Generally, to be of use these
    specifications must be of such a nature that they can be related to the attributes and features used
    in the representation of the objects in $D$.

    We would like to make some comment on the distinction between targeted marketing [16] and

    recommender systems. We say that recommender systems are participatory in that the user is

    participating in the process by providing information about their preferences. In a targeted marketing
    system, while information may be available about a user's preferences, this is generally based on
    extensional information obtained from past actions of the user. Here the target is a passive supplier
    of information rather than an active supplier as in the recommender system. For example, the system
    used by Amazon.com, while called a recommender system, is according to our definition more
    appropriately a targeted marketing system: it is based solely upon past information about the user and
    does not involve any cooperation by the user. On the other hand, the system used by NETFLIX¹ is
    a true recommender system as it uses ratings supplied by the user.

    Another characterizing aspect of these recommender systems is whether the system is
    collaborative or not. We shall say a system is collaborative if information about the preferences of other
    people is used in determining the recommendation to the current user. Furthermore, in these
    collaborative recommender systems the available technologies depend on the nature of the preference
    information used with respect to participating agents. Generally, in the collaborative approach when
    extensional information is used one tries to obtain, based on mutually experienced items, some
    measure of correlation between the participants and uses this as a basis for providing recommendations.
    In Fig. 1 we summarize the situation with respect to the information available in a recommender
    system.

    In order to develop a recommender system we need to use information about the user's preferences.
    Collaborative-type recommenders can be constructed using only extensional preference information.
    Non-collaborative, reclusive, systems require the availability of object representations. Table 1
    shows a simplified typology of different types of recommender systems. The first column indicates
    a purely collaborative type of system. Columns 2–4 indicate those based purely on representation,
    the differences between these columns being the types of information used to specify the user's
    preferences. The final column indicates systems which use both collaborative and representational
    information.

    Here we shall focus on reclusive, non-collaborative, recommender systems in which there exists
    a representation of the objects. At a meta level, we see a kind of symmetry between reclusive
    methods and collaborative methods. In both cases we make use of a vector of ratings of the objects
    the current user has experienced. We shall denote this as $A$. In the collaborative filtering approach
    we have for each collaborator a vector $A_j$ indicating their ratings of the corresponding objects. For
    any object $d$ unexperienced by our current user we have a vector $R$ whose components, $r_j$, are the ratings
    of this object by the collaborators. The procedure for obtaining the degree of recommendation can
    be seen to essentially involve two steps:

    1. We combine $A$ and $A_j$ to obtain $S_j$, a degree of similarity of our user with each collaborator.

    2. Rating of $d$ = Aggregation of weighted tuples $(S_j, r_j)$.

    In the reclusive method, for each object the user has experienced we have a representation $R_i$ as
    well as the current user's rating $a_i$ as contained in $A$. In addition, for an unexperienced object $d$
    which we are trying to evaluate we only have a representation $R$. The procedure for obtaining the

    1 NETFLIX is a website that rents DVD videos. It asks users to rate the videos they have rented and uses this to

    recommend other videos.

    Fig. 1. Recommender systems information structure. [Diagram elements: USER (preference information
    type: extensional, intentional, both); OBJECTS OF INTEREST (representation available: yes, no);
    COLLABORATORS (peer group available: yes, no; preference information type: extensional, intentional, both).]

    Table 1. Recommender systems typology. [Rows: extensional preferences; intentional preferences;
    representation; collaborators.]

    degree of recommendation also essentially involves a two-step procedure:

    1. We combine $R$ and $R_i$ to obtain $S_i$, a degree of similarity of the object $d$ with the experienced
    objects.

    2. Rating of $d$ = Aggregation of weighted tuples $(S_i, a_i)$.
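
The two-step reclusive procedure above can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix the similarity measure or the aggregation, so the fuzzy Jaccard similarity and the similarity-weighted average used here, as well as the names `similarity` and `reclusive_rating`, are hypothetical choices.

```python
def similarity(r, r_i):
    """Step 1: degree of similarity S_i between two object representations.

    Representations are dicts mapping assertion names to validities in [0, 1].
    The fuzzy Jaccard ratio (sum of mins over sum of maxes) is an assumed choice.
    """
    keys = set(r) | set(r_i)
    inter = sum(min(r.get(k, 0.0), r_i.get(k, 0.0)) for k in keys)
    union = sum(max(r.get(k, 0.0), r_i.get(k, 0.0)) for k in keys)
    return inter / union if union > 0 else 0.0


def reclusive_rating(r, experienced):
    """Step 2: aggregate the weighted tuples (S_i, a_i).

    `experienced` is a list of (R_i, a_i) pairs; the aggregation here is a
    similarity-weighted average of the user's own ratings (an assumption).
    """
    pairs = [(similarity(r, r_i), a_i) for r_i, a_i in experienced]
    total = sum(s for s, _ in pairs)
    if total == 0:
        return 0.0
    return sum(s * a for s, a in pairs) / total


experienced = [
    ({"comedy": 1.0, "star=DeNiro": 1.0}, 0.9),  # liked by the user
    ({"drama": 1.0}, 0.2),                       # disliked by the user
]
new_movie = {"comedy": 1.0}
rating = reclusive_rating(new_movie, experienced)
```

Only the user's own ratings enter the computation; no other person's preferences are consulted, which is what distinguishes the reclusive case from the collaborative one.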

    Our goal in the following is to develop modules that can be used to help evaluate objects with

    regard to their degree of recommendation in the case of reclusive approaches.

    3. Object representation

    We now turn to the issue of object representation. For our purposes the representation of an object
    shall be based upon a set of primitive assertions or statements. Each assertion can essentially be
    viewed as some declarative statement. Associated with each object and each assertion is a value
    contained in the unit interval indicating the degree to which the assertion is valid for that object.
    For example, in the movie domain a primitive assertion may be "this movie is a comedy". In this
    case, the value associated with this assertion for a movie indicates the degree to which it is true
    that this movie is a comedy. Another assertion may be that "Robert DeNiro is a star in this movie".
    If the movie has Robert DeNiro as one of its stars then this assertion has validity one; otherwise it
    is zero. Another assertion may be that "this movie was made in 1993". If the movie was made in
    1995 this would have a validity of zero. If it was made in 1993, this assertion would have truth
    value one.

    Essentially then the basis of our representation scheme is a collection of assertions or statements
    whose validity is determinable for any object in $D$. We shall denote this set of primitive or atomic
    assertions as $A = \{A_1, \ldots, A_n\}$.

    The representation of an object consists of a valuation of these assertions for the object. For
    object $d$, $A_j(d)$ indicates the degree to which assertion $A_j$ is satisfied by $d$. When we are just focusing
    on one object we can denote this $a_j$. For some purposes we can view any object $d$ as a fuzzy subset
    over the space $A$. Using this perspective, the membership grade of $A_j$ in $d$ is $d(A_j) = A_j(d) = a_j$. As
    an alternative perspective, an object can be viewed as an $n$-dimensional vector whose $j$th component
    is $A_j(d)$. As we shall subsequently see, these different perspectives are useful in inspiring different
    information processing operations.

    We shall call a subset of related assertions an attribute or feature. For example, the subset $V$ may
    consist of all the assertions of the form "this movie was made in the year xyz". We can denote
    this attribute as the year the movie was made. Another notable subset of related assertions from
    $A$ may consist of all the assertions of the form "x stars in this movie". This feature corresponds to
    the attribute of who are the stars of the movie.

    For a given recommender system, in addition to the set $A$ of primitive assertions, we shall assume
    the existence of a collection of features or attributes associated with the objects in $D$. We denote this
    collection of attributes as $F = \{V_1, V_2, \ldots, V_q\}$. Each attribute $V_j$ corresponds to a subset of assertions
    which can be seen as constituting the possible values for the attribute. In some special cases a feature
    may consist of a single assertion. The performance of a recommender system is clearly related to
    the sophistication of the primitive assertions and associated features used to represent the objects of
    interest.

    We note that while we have started with assertions and constructed features by forming subsets of
    related assertions, it is equally valid, and perhaps more intuitive, to start with attributes and generate
    the primitive assertions as being the possible values for these attributes.

    Attributes can be classified by various characteristics associated with their solution space [23,24].
    Attributes can be distinguished with respect to the number of solutions they allow: is the attribute
    restricted to having only one solution, does it allow multiple solutions, must it have a solution? For
    example, the attribute corresponding to the release year of a movie must have only one solution. On the
    other hand, the attribute corresponding to the star of a movie can take on multiple values. The
    primitive assertions can also be classified with respect to the allowable truth values they can

    assume. For example, binary-type assertions are those whose truth value must be
    either one or zero, while other assertions can have truth values lying anywhere in the unit
    interval.

    We shall look a little more carefully at the relationship between atomic assertions and attributes. As
    we have indicated, an assertion is a declarative statement that is assigned a value for a given object
    depending on its degree of validity for that object; this value generally lies in the unit interval. On the other
    hand, an attribute can be viewed as a variable that takes its value(s) from a universe associated with
    the variable. In our framework the universe associated with an attribute corresponds to the subset
    of primitive assertions that is used to define it. The value of an attribute for a given object depends
    upon the truth values of the associated primitives. Let us look at this.

    If $V_j$ is an attribute we can find its value for a particular object $d$ in the following way. Let
    $A(V_j)$ indicate the subset of primitives associated with $V_j$. Let $d$ represent the fuzzy subset of $A$
    corresponding to object $d$; then the value of the feature $V_j$ for object $d$ is

    $$V_j(d) = A(V_j) \cap d;$$

    it is the intersection of the attribute definition, the crisp subset $A(V_j)$, and the object representation,
    the fuzzy subset $d$. The collection of elements in the subset $V_j(d)$ determines the value of the attribute
    $V_j$ for the object $d$.
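
As a minimal sketch of this intersection, assume an object is stored as a dictionary from assertion names to validities and an attribute as a plain set of assertion names; the assertion strings and the function name `attribute_value` are hypothetical.

```python
def attribute_value(attribute_assertions, obj):
    """Return V_j(d) = A(V_j) ∩ d as a fuzzy subset (dict of nonzero grades).

    `attribute_assertions` is the crisp set A(V_j); `obj` is the fuzzy subset d,
    mapping assertion names to validities in [0, 1]. Intersecting a crisp set
    with a fuzzy set keeps the fuzzy grades of the shared elements.
    """
    return {a: obj[a] for a in attribute_assertions if obj.get(a, 0.0) > 0.0}


movie = {"year=1993": 1.0, "year=1995": 0.0, "star=DeNiro": 1.0, "comedy": 0.7}
year_attribute = {"year=1993", "year=1994", "year=1995"}
star_attribute = {"star=DeNiro", "star=Pacino"}

year_value = attribute_value(year_attribute, movie)   # {"year=1993": 1.0}
star_value = attribute_value(star_attribute, movie)   # {"star=DeNiro": 1.0}
```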

    Often the information about an object will be specified directly in terms of attribute values. We
    shall assume the ability to extract information about assertion validity from information expressed
    about attribute values. To illustrate this we consider the following. Let $A(V_j) = \{A_{j1}, A_{j2}, \ldots, A_{jn}\}$ be
    the subset of assertions related to the attribute $V_j$. If we are informed that the value of attribute
    $V_j$ for object $d$ is $q$, this means that $V_j(d) = \{A_{jq}\}$, where $A_{jq}$ is the assertion that $V_j$ is $q$. Since
    $V_j(d) = A(V_j) \cap d$, we can conclude that $A_{ji}(d) = 0$ for all $i \ne q$ and $A_{jq}(d) = 1$.

    In relating the knowledge about assertion validation and feature values it is necessary to carefully
    distinguish between features that can only assume one unique value, such as the date of release of
    a movie, and features that can assume multiple values, such as the people starring in the movie. In the
    first case, multiple assertions in $V_j(d)$ are an indication of uncertainty regarding our knowledge of the
    value of $V_j$. In the second case, multiple assertions in $V_j(d)$ are an indication of multiple solutions
    for $V_j$. Here we shall not further pursue this important issue regarding different types of variables
    but only point to [17] for those interested. Here, we shall assume the ability to interchange between
    these two representations.

    4. Modeling user expressed preferences

    The basic function of a recommender system is to use what we shall call justifications to generate
    recommendations to a user. By a justification we shall mean a rationale for believing a user may like
    an object. These justifications can be obtained either from preferences directly expressed by users
    or induced using data about the user's experiences. In the following we shall look at techniques
    for obtaining recommendations which make use of a representation of the objects. As we noted,
    not all recommender technologies require representations, collaborative filtering being an example of
    a technology that does not need representations of the objects.

    In this section we shall consider the situation in which we have a representation of the objects and
    the user has specified their preferences intentionally in some manner compatible with this
    representation. This situation is closely related to the problem of information retrieval [1]. The availability
    of technologies for this environment is quite rich. The quality of performance of a recommender
    system in this environment is strongly dependent upon the ability of the system to allow the user to
    effectively express their preferences. This capability is itself dependent both upon the assertions and
    features used to represent the object and the sophistication of the language available to the user to
    express their preferences in terms of these assertions and features.

    In the following we briefly describe a language which we introduced in [18]. This language, called
    Hi-Ret, is very expressive. It makes considerable use of the ordered
    weighted averaging (OWA) operator [13,21]. We shall first briefly describe this operator.

    The OWA operator of dimension $n$ is a mapping $\mathrm{OWA}: R^n \to R$ characterized by an $n$-dimensional
    vector $W$, called the weighting vector, such that its components $w_j$, $j = 1$ to $n$, lie in the unit interval
    and sum to one. The OWA aggregation is defined as

    $$\mathrm{OWA}(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j b_j,$$

    where $b_j$ is the $j$th largest of the $a_i$.
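
A direct transcription of this definition, offered as a small sketch (the function name `owa` and the input checks are our own additions):

```python
def owa(weights, args):
    """Ordered weighted average: sum_j w_j * b_j, with b_j the j-th largest argument."""
    if len(weights) != len(args):
        raise ValueError("weights and arguments must have the same length")
    if any(w < 0.0 or w > 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to one")
    ordered = sorted(args, reverse=True)  # the ordered argument vector B
    return sum(w * b for w, b in zip(weights, ordered))


# The weights attach to positions in the ordering, not to particular arguments:
# 0.5 multiplies the largest value (0.9), 0.3 the next (0.6), 0.2 the smallest (0.2).
value = owa([0.5, 0.3, 0.2], [0.2, 0.9, 0.6])
```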

    The unique feature of this operator is the ordering of the arguments by value, a process that
    introduces a nonlinearity into the operation. We can represent this aggregation operator in vector
    notation as $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = W^T B$, where $W$ is the weighting vector and $B$ is a vector, called
    the ordered argument vector, whose components are the $b_j$. The generality of the operator lies in
    the fact that by selecting $W$ we can implement many different aggregation operators. Specifically,
    by appropriately selecting the weights in $W$, we can emphasize different arguments based upon their
    position in the ordering. From an application point of view, an important feature of this operator is that the
    characterizing vector $W$ can be readily related to natural language expressions of aggregation rules.

    A number of special cases of this operator are illustrated in the following. If the components in $W$
    are such that $w_1 = 1$ and $w_j = 0$ for all $j \ne 1$ we get $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = \mathrm{Max}_j[a_j]$. We denote this
    weighting vector as $W^*$. If the weights are $w_n = 1$ and $w_j = 0$ for $j \ne n$ we get
    $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = \mathrm{Min}_j[a_j]$. We denote this weighting vector as $W_*$. If the weights are such that
    $w_j = 1/n$ for all $j$, denoted $W_{ave}$, then $\mathrm{OWA}(a_1, a_2, \ldots, a_n) = (1/n)\sum_{j=1}^{n} a_j$. Thus we see that the
    simple average is a special case of these operators.
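
These three special cases are easy to check numerically. The sketch below redefines a bare `owa` helper so that it runs on its own; the variable names for the three weighting vectors are ours.

```python
def owa(weights, args):
    """Ordered weighted average of args under the given weighting vector."""
    ordered = sorted(args, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


args = [0.2, 0.9, 0.6]
n = len(args)

w_max = [1.0] + [0.0] * (n - 1)   # w_1 = 1: all weight on the largest argument
w_min = [0.0] * (n - 1) + [1.0]   # w_n = 1: all weight on the smallest argument
w_ave = [1.0 / n] * n             # w_j = 1/n: the simple average

as_max = owa(w_max, args)
as_min = owa(w_min, args)
as_ave = owa(w_ave, args)
```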

    A number of different methods have been suggested for obtaining the weighting vector to be used
    in the aggregation. For our purpose we shall use an approach based upon the idea of linguistic
    quantifiers. Classical logic provides two quantifiers for aggregating truth values, "for all" and "there
    exists"; these correspond to anding and oring. The concept of linguistic quantifiers was originally
    introduced by Zadeh [22] to help formalize the many expressions of quantification available in natural
    language. According to Zadeh, a linguistic quantifier is a natural language expression corresponding
    to a proportional quantity. Examples of this are at least one, all, at least α%, most, more than a few,
    and some. Zadeh [22] suggested a method for formally representing these linguistic quantifiers.
    Let $Q$ be a linguistic expression corresponding to a quantifier such as most; then Zadeh suggested
    representing this as a fuzzy subset $Q$ over $I = [0, 1]$ in which for any proportion $r \in I$, $Q(r)$ indicates
    the degree to which $r$ satisfies the concept identified by the quantifier $Q$.

    Fig. 2. Linguistic quantifier at least α.

    In [15], Yager considered the use of linguistic quantifiers to generalize the logical quantification
    operation. He considered the valuation of the statement $Q(a_1, \ldots, a_n)$, where $Q$ is a linguistic quantifier
    and the $a_j$ are truth values. It was suggested that the truth value of this type of statement could be
    obtained with the aid of the OWA operator. This process involved first representing the quantifier
    $Q$ as a fuzzy subset $Q$ and then using $Q$ to obtain an OWA weighting vector $W$, which was used
    to perform an OWA aggregation of the $a_j$. Formally we denote this as

    $$Q(a_1, \ldots, a_n) = \mathrm{OWA}_Q(a_1, \ldots, a_n).$$

    In the following we shall describe the process of obtaining the weighting vector from the associated
    fuzzy subset $Q$. Here we shall restrict ourselves to the class of linguistic quantifiers called RIM
    quantifiers. A RIM quantifier is represented by a fuzzy subset $Q: I \to I$ in which $Q(0) = 0$, $Q(1) = 1$,
    and if $r_1 > r_2$ then $Q(r_1) \ge Q(r_2)$ (monotonic). These RIM quantifiers model the class in which
    an increase in proportion results in an increase in compatibility with the linguistic expression being
    modeled. Examples of these types of quantifiers are at least one, all, at least α%, most, more than
    a few, some. These are the type of quantifiers that are generally used by people in expressing their
    preferences. If $Q$ is a RIM quantifier we associate with it an OWA weighting vector $W$ such that

    $$w_j = Q(j/n) - Q((j-1)/n) \quad \text{for } j = 1 \text{ to } n.$$
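
The weight formula above translates directly into code. In this sketch a quantifier is any Python function on [0, 1]; the helper name `rim_weights` and the example quantifiers are illustrative assumptions.

```python
def rim_weights(quantifier, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n), j = 1..n, for a RIM quantifier Q."""
    return [quantifier(j / n) - quantifier((j - 1) / n) for j in range(1, n + 1)]


def identity(r):
    """Q(r) = r, which yields the simple average."""
    return r


def at_least_half(r):
    """A step quantifier: satisfied once the proportion reaches 0.5."""
    return 1.0 if r >= 0.5 else 0.0


uniform = rim_weights(identity, 4)     # every weight equals 1/4
step = rim_weights(at_least_half, 4)   # [0.0, 1.0, 0.0, 0.0]
```

Because $Q(0) = 0$, $Q(1) = 1$ and $Q$ is monotonic, the resulting weights are non-negative and telescope to a sum of one, so they always form a valid OWA weighting vector.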

    Fig. 2 is seen as corresponding to the quantifier at least α%. For this quantifier, with α expressed as
    a proportion, $w_j = 1$ for the $j$ such that $(j-1)/n < \alpha \le j/n$ and $w_j = 0$ for all others.

    Another quantifier is one in which $Q(r) = r$ for $r \in [0, 1]$. For this quantifier we get $w_j = 1/n$ for
    all $j$. This gives us the simple average. We shall denote this quantifier as some.

    One can consider parameterized families of quantifiers [14]. For example, consider the parameterized
    family $Q(r) = r^\alpha$ where $\alpha \in [0, \infty]$. Here if $\alpha = 0$, we get the existential quantifier; when $\alpha \to \infty$,
    we get the quantifier for all; and when $\alpha = 1$, we get the quantifier some. In addition, for the case in
    which $\alpha = 2$, $Q(r) = r^2$, we get one possible interpretation of the quantifier most.
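
The behaviour of this family at its extremes can be verified with a short sketch. The names `power_quantifier` and `rim_weights` are ours, and fixing $Q(0) = 0$ explicitly is an assumption needed so that the case $\alpha = 0$ remains a RIM quantifier.

```python
import math


def power_quantifier(alpha):
    """Return Q(r) = r ** alpha, with Q(0) = 0 enforced even when alpha = 0."""
    def q(r):
        if r <= 0.0:
            return 0.0
        return r ** alpha
    return q


def rim_weights(quantifier, n):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n) for j = 1..n."""
    return [quantifier(j / n) - quantifier((j - 1) / n) for j in range(1, n + 1)]


exists_w = rim_weights(power_quantifier(0), 3)          # [1, 0, 0]: Max
for_all_w = rim_weights(power_quantifier(math.inf), 3)  # [0, 0, 1]: Min
some_w = rim_weights(power_quantifier(1), 3)            # [1/3, 1/3, 1/3]: average
most_w = rim_weights(power_quantifier(2), 2)            # [0.25, 0.75]
```

Larger values of α shift weight toward the smaller arguments, so the aggregation moves from an "or-like" to an "and-like" behaviour.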

    We are now in a position to describe the use of the OWA operator in the construction of a
    recommender system. We shall assume available to the user a vocabulary of linguistic quantifiers

    $Q = \{Q_1, Q_2, \ldots, Q_q\}$ in which they can express themselves. Furthermore, we assume that the
    representation of each of these quantifiers $Q_k$ as a fuzzy subset of the unit interval is transparent
    to the user.

    We now turn to the representation of user preference information. We first introduce the idea of a primal
    preference module (PPM). As we shall see, this will serve as the basic unit which can be used to
    evaluate the appropriateness of an object for recommendation based on the user's preferences. A
    PPM is of the form $\langle A_1, \ldots, A_q : Q \rangle$. The components of a PPM, the $A_i$, are assertions associated
    with the objects in $D$, and $Q$ is a linguistic quantifier. With a PPM a user can express preference
    information by describing what properties they are interested in with respect to the class of objects in
    $D$ and then using $Q$ to capture the desired relationship between these properties: for example, do
    they desire all, most, some, or at least one of these requirements satisfied? If $h$ is a PPM we
    can evaluate any object in $D$ with respect to it. In particular, for object $d$ we obtain the values
    $A_j(d)$ from our representation of $d$ and then use the OWA aggregation to evaluate it,

    $$h(d) = \mathrm{OWA}_Q[A_1(d), A_2(d), \ldots, A_q(d)].$$

    Here the weighting vector is determined from $Q$.
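
A possible sketch of PPM evaluation follows, assuming an object is a dictionary of assertion validities and a quantifier is a function on [0, 1]. The helper names, the assertion strings and the use of $Q(r) = r^2$ for most are illustrative assumptions.

```python
def owa_q(quantifier, values):
    """OWA aggregation whose weights are derived from a RIM quantifier Q."""
    n = len(values)
    ordered = sorted(values, reverse=True)
    return sum(
        (quantifier(j / n) - quantifier((j - 1) / n)) * b
        for j, b in enumerate(ordered, start=1)
    )


def evaluate_ppm(assertions, quantifier, obj):
    """h(d) = OWA_Q[A_1(d), ..., A_q(d)] for the PPM <A_1, ..., A_q : Q>."""
    return owa_q(quantifier, [obj.get(a, 0.0) for a in assertions])


def most(r):
    """One possible interpretation of 'most': Q(r) = r^2."""
    return r * r


def for_all(r):
    """The quantifier 'all': a step at r = 1."""
    return 1.0 if r >= 1.0 else 0.0


movie = {"comedy": 0.8, "star=DeNiro": 1.0, "year=1993": 0.0}
wanted = ["comedy", "star=DeNiro", "year=1993"]

h_most = evaluate_ppm(wanted, most, movie)    # partial credit: two of three hold
h_all = evaluate_ppm(wanted, for_all, movie)  # 0.0, since one requirement fails
```

The same list of assertions gives different degrees of satisfaction under different quantifiers, which is the flexibility the PPM is meant to provide.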

    While the PPM can be directly evaluated for any object, the great benefit of our system is that
    we can let users express their preferences in much more sophisticated ways. We now introduce
    the idea of a basic preference module (BPM). A BPM is a module of the form

    $$m = \langle C_1, C_2, \ldots, C_p : Q \rangle$$

    in which the $C_i$ are called the components of the BPM. The only required property of these
    components is that they can be evaluated for each object. That is, for any $C_i$ we need to be able to
    obtain $C_i(d)$. Once having this we can obtain, using the OWA aggregation,

    $$m(d) = \mathrm{OWA}_Q[C_1(d), \ldots, C_p(d)].$$

    Let us see what kinds of elements can constitute the $C_i$. Clearly the $C_i$ can be any of the assertions in
    the set $A$. More generally the $C_i$ can be any PPM, as we know how to evaluate these. Even more
    generally the $C_i$ can itself be a BPM if we can evaluate it.² Additionally the $C_i$ can be the negation
    of any of the preceding types. For example, if $C$ is a component which we can evaluate and we include as
    one of our components not $C$, denoted $\bar{C}$, then $\bar{C}(d) = 1 - C(d)$.

    We note that preferences specified in terms of attribute values can be easily represented in this
    framework. Let us illustrate this. Consider an attribute $V_j$ and let $A(V_j) = \{A_{j1}, A_{j2}, \ldots, A_{jn}\}$ be the
    subset of assertions related to the attribute $V_j$. Without loss of generality we shall let $A_{ji}$ indicate
    the assertion that $V_j$ is $a_i$. First let us consider the case where $V_j$ is a variable, such as star in
    a movie, which can take multiple solutions. The requirement that $V_j$ has $a_q$ as one of its values
    can be expressed simply by using the assertion $A_{jq}$ as one of the components in our preference
    modules. Consider now the situation where $V_j$ is an attribute, such as year of release of a movie, that
    can assume one and only one value. Consider now the representation of the desire that $V_j$ is $a_1$. We

    2 In this case we must be careful to avoid self-reference.

    Fig. 3. Hierarchical structure of BPM.

    represent this as the BPM $m = \langle C_1, C_2 : \text{all} \rangle$, where $C_1$ is simply the assertion $A_{j1}$. The component
    $C_2$ is obtained as not $C_3$, where $C_3$ is the BPM defined by $\langle A_{j2}, A_{j3}, \ldots, A_{jn} : Q \rangle$ in which $Q$ is the
    quantifier any. The preceding illustrates the ability to formalize preferences expressed in terms
    of attributes within this framework. This allows users to express their preferences in terms of attribute
    requirements.
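
For the single-valued case this construction reduces to a short computation, because all aggregates by Min and any aggregates by Max. The following sketch assumes an object is a dictionary of assertion validities; the function name `requires_value` and the assertion strings are hypothetical.

```python
def requires_value(attribute_assertions, wanted, obj):
    """Degree to which a single-valued attribute V_j takes the wanted value.

    Implements m = <C1, C2 : all> with C1 the wanted assertion and
    C2 = not <all other assertions of V_j : any>.
    """
    c1 = obj.get(wanted, 0.0)
    others = [obj.get(a, 0.0) for a in attribute_assertions if a != wanted]
    c3 = max(others, default=0.0)  # "any" is the existential quantifier: Max
    c2 = 1.0 - c3                  # negation
    return min(c1, c2)             # "all": Min


years = ["year=1993", "year=1994", "year=1995"]

certain = requires_value(years, "year=1993", {"year=1993": 1.0})
uncertain = requires_value(years, "year=1993", {"year=1993": 0.6, "year=1995": 0.4})
wrong = requires_value(years, "year=1993", {"year=1995": 1.0})
```

When the release year is uncertain, support for a competing value lowers the degree of satisfaction through the negated component.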

    Using this framework based on BPMs we can express very sophisticated user preferences. Using
    a BPM we can express any type of user preference information as long as it can be evaluated by
    decomposing it into primitive assertions. Of particular value is the fact that a user can express
    their preferences even using concepts and language not within the given set of primitive assertions
    and associated attributes, as long as they can eventually formulate their concepts using the primitive
    assertions. The general structure resulting from the use of BPMs is a hierarchical tree structure
    whose leaves are primitive assertions (see Fig. 3).

    Let us see the process. A user expresses a predilection, $C$, for some types of objects. This
    predilection is formalized in terms of some BPM, a collection of components (criteria) and some
    quantifier relating these components. These components get further expressed (decomposed) by BPMs
    which are then further decomposed until we reach a component that is a primitive assertion, which
    terminates a branch. This process can be considered as a type of grounding. We start at the top
    with the most highly abstract cognitive concepts; we then express these using less abstract terms
    and continue downward in the tree until we reach a grounded concept, a primitive assertion. Once
    each of the branches has terminated with a primitive assertion, our tree provides an operational
    definition of the predilection expressed by the user. For any object $d$ in $D$ we can evaluate the
    degree to which it satisfies the predilection expressed. Starting at the bottom of the tree with the
    primitive assertions, whose validities can be obtained from our database, we back up the tree
    using the OWA aggregation method. We stop when we reach the top of the tree; the value obtained there
    is the degree to which the object $d$ satisfies the expressed preference.
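
The bottom-up evaluation of such a tree is naturally recursive. The sketch below is one way to organise it; the classes `BPM` and `Not`, the function `evaluate`, and the example preference are hypothetical names introduced for illustration, with primitive assertions represented as strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Not:
    """Negation of a component: evaluates to 1 - C(d)."""
    component: "Component"


@dataclass
class BPM:
    """A basic preference module <C_1, ..., C_p : Q>."""
    components: List["Component"]
    quantifier: Callable[[float], float]


# A leaf is a primitive assertion, identified here by its name.
Component = Union[str, Not, BPM]


def evaluate(component, obj):
    """Back up the tree: leaves read validities, inner nodes apply OWA_Q."""
    if isinstance(component, str):
        return obj.get(component, 0.0)
    if isinstance(component, Not):
        return 1.0 - evaluate(component.component, obj)
    values = sorted((evaluate(c, obj) for c in component.components), reverse=True)
    n = len(values)
    q = component.quantifier
    return sum((q(j / n) - q((j - 1) / n)) * b for j, b in enumerate(values, start=1))


def for_all(r):
    return 1.0 if r >= 1.0 else 0.0


def any_of(r):
    return 1.0 if r > 0.0 else 0.0


# "A comedy starring DeNiro or Pacino that is not a horror movie."
preference = BPM(
    ["comedy", BPM(["star=DeNiro", "star=Pacino"], any_of), Not("horror")],
    for_all,
)
movie = {"comedy": 0.8, "star=DeNiro": 1.0, "horror": 0.1}
degree = evaluate(preference, movie)
```

The recursion terminates because every branch ends in a primitive assertion; a module that referred to itself would not, which is why self-reference must be avoided.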

    5. User profiles

    Using these basic preference modules we can now define what we shall call a user profile. One part
    of the user profile is the user preference profile, which consists of a collection of basic preference
    modules, $m_j$ for $j = 1$ to $K$, of the type described in the preceding section. Each $m_j$ provides
    a description of a class of objects from $D$ that the user likes. These BPMs can be simple or
    sophisticated, ranging from statements such as "I like Robert DeNiro movies" to complicated descriptions
    of movies. For any object $d$, $m_j(d)$ indicates the degree to which it satisfies the BPM $m_j$. As any of
    these preference modules provides a justification for recommendation, an object satisfying any one of
    these $m_j$ is recommendable to the user. If $M = \{m_1, m_2, \ldots, m_K\}$ is the preference profile for a given
    user, then for any object $d$ in $D$ we calculate

    $$M(d) = \mathrm{Max}_j[m_j(d)]$$

    as the degree of positive recommendation of this object to the user.

    At a formal level one can view a user preference profile $M$ as a BPM whose components are the
    $m_j$. One reason we choose not to do this is that we prefer to reserve the idea of a BPM for
    user preferences that are in some sense conceptually distinct. Thus, while we can formally combine
    a user preference for Robert De Niro movies with a user preference for 1940s musicals in a single
    BPM by just or-ing them, keeping them distinct is more in keeping with the way the user sees
    them. In introducing the user profile we are therefore emphasizing the individuality of each of
    the preferences in $M$.

    In the preceding we have assumed that each BPM in $M$ has equal worth to the user. We can also
    consider the situation in which the user associates with each BPM $m_j$ in his profile a value
    $\alpha_j \in [0, 1]$ indicating the weight or strength of this preference module. Using this we
    can calculate

    $$M(d) = \max_j [m_j(d) \wedge \alpha_j].$$
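
    A minimal sketch of the weighted profile score follows. It assumes that each module is available
    as a function returning a degree in [0, 1] and that the weight is combined with the module score
    by the minimum; the attribute names and the two example modules are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, float]            # hypothetical representation: attribute -> degree in [0, 1]
Module = Callable[[Item], float]   # a BPM viewed as a function d -> m_j(d)


def profile_score(item: Item, profile: List[Tuple[Module, float]]) -> float:
    """M(d) = max_j [m_j(d) ∧ alpha_j], with ∧ taken as min (an assumption)."""
    return max((min(m(item), alpha) for m, alpha in profile), default=0.0)


def likes_de_niro(d: Item) -> float:
    return d.get("stars_de_niro", 0.0)


def likes_1940s_musicals(d: Item) -> float:
    return min(d.get("musical", 0.0), d.get("from_1940s", 0.0))


movie = {"stars_de_niro": 0.9, "musical": 0.3, "from_1940s": 1.0}
unweighted = profile_score(movie, [(likes_de_niro, 1.0), (likes_1940s_musicals, 1.0)])
weighted = profile_score(movie, [(likes_de_niro, 0.6), (likes_1940s_musicals, 1.0)])
```

    With all weights equal to one the score reduces to the unweighted maximum; lowering a weight caps
    the contribution that module can make.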

    We can also allow a user to supply negative or rejection information. We define a basic rejection
    module (BRM) $n_i$ to be a description of objects from $D$ which the user prefers not to have
    recommended to him. A BRM has the same form as a BPM except that it describes features which the
    user specifies as constituting objects he does not want. Thus a second component of the user
    profile is a collection $N = \{n_1, n_2, \ldots\}$ of basic rejection modules. Using this we can
    calculate the degree of negative recommendation (rejection) of any object to a user,
    $N(d) = \max_i [n_i(d)]$. It is not necessary that a user have any negative modules. Additionally,
    we can associate with each rejection module $n_i$ a value $\beta_i \in [0, 1]$ indicating the
    weight associated with it. Using this we get

    $$N(d) = \max_i [n_i(d) \wedge \beta_i].$$

    We must now combine these two types of scores, recommendation and rejection. Here we just
    describe some of the operations available. The builder of the system must implement the one that
    best represents the situation they are trying to model. We note that closely related issues have
    been discussed by Dubois et al. [3] in their use of examples and counterexamples for querying
    databases.


    Let $R(d)$ indicate the overall degree of recommendation of the object $d$ with respect to a user.
    One possibility is some kind of bounded subtraction of the two types of recommendations,

    $$R(d) = (M(d) - N(d)) \vee 0.$$

    Another possibility is to assume that rejection has priority over preference. In this case we have

    $$R(d) = (1 - N(d)) \wedge M(d).$$

    Here we are saying that we recommend things that are preferred and not rejected by the user.
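
    The two ways of combining the recommendation and rejection scores can be compared side by side
    in a short sketch. The function names and the example scores are assumptions made for
    illustration, and the "and" in the second operator is taken to be the minimum.

```python
def bounded_difference(m: float, n: float) -> float:
    """R(d) = (M(d) - N(d)) ∨ 0: rejection is subtracted, floored at zero."""
    return max(m - n, 0.0)


def rejection_priority(m: float, n: float) -> float:
    """R(d) = (1 - N(d)) ∧ M(d): preferred and not rejected, with ∧ as min."""
    return min(1.0 - n, m)


# Hypothetical (M(d), N(d)) pairs for three objects.
examples = {"liked": (0.8, 0.0), "mixed": (0.8, 0.3), "rejected": (0.8, 1.0)}
for name, (m, n) in examples.items():
    print(name, bounded_difference(m, n), rejection_priority(m, n))
```

    Both operators agree when there is no rejection or full rejection; they differ on the mixed case,
    where bounded subtraction penalizes the object more heavily than the priority form does.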

    6. Extensionally expressed preference

    Now we consider the environment in which each object in $D$ is represented but the user preference
    information is expressed extensionally. We assume each user has an associated subset $E$ of $D$
    corresponding to the objects which it is known they have experienced. Since we shall focus on just
    one user, we can equivalently view this as a situation in which each object has an attribute
    indicating whether the person has experienced this object or not. Here we shall initially make a
    closed world assumption: any object which we have not been informed of the user experiencing we
    shall assume they have not experienced. In addition, in some systems of this type we shall assume
    that for any object which the user has experienced they have provided a value $a \in [0, 1]$
    indicating their scoring of that object. Our goal here is to suggest ways in which we can use this
    type of information to recommend new objects to our user.

    In this environment, one basic paradigm for justifying the recommendation of objects becomes
    obvious. We look for objects which the user has experienced and liked and try to find objects
    similar to these which the user has not experienced. In trying to implement this we are faced
    with the issue of determining the similarity between objects. The problem of determining
    similarity is clearly context dependent and often very complex [2]. However, the assumed
    availability of a representation for each of the objects allows us to develop some kind of tool for
    calculating the similarity (proximity) between objects. In the following we shall assume the
    existence of a similarity (proximity) relationship $S$ over the set $D$ of objects. That is, for
    any two objects $d_i$ and $d_j$ in $D$ we assume $S(d_i, d_j) \in [0, 1]$ is available. The larger
    $S(d_i, d_j)$, the more related or similar the objects. As already indicated, we also assume the
    existence of some subset $E \subseteq D$ of objects the user has experienced. Furthermore, we
    assume for each object $d_i$ in $E$ the availability of a rating, $a_i$, indicating the score the
    user has attributed to this object. We note these ratings can be viewed as a fuzzy subset $A$ of
    $E$ in which $A(d_i) = a_i$. Semantically, $A$ corresponds to the subset of objects the user liked.

    Our goal here is to obtain a fuzzy subset $R$ over the space $M = D - E$ corresponding to the
    objects to be recommended. One approach to obtaining this fuzzy subset is in the spirit of fuzzy
    modeling. We shall try to provide a collection of justifications, rules or circumstances, which
    indicate that an object in $M$ is suitable for recommendation. If $R_j$ are a collection of
    circumstances for recommending objects, where $R_j(d_i)$ indicates the degree to which
    $d_i \in M$ meets this condition, then

    $$R(d_i) = \max_j [R_j(d_i)].$$


    Let us begin by considering some guidelines that can be used to support recommendation of an
    object $d_i$. Our focus here is not so much on providing a definitive listing of rules as on
    seeing how fuzzy logic can be used to enable the evaluation of some commonsense guidelines which
    are expressed in a natural type of language. That is, we are interested in the process of
    translating some rules that appear to provide reasonable justifications for recommending objects.

    A most natural source of recommendation can be captured by the following guideline.

    Rule 1: Recommend an object if there exists a similar object that the user liked. Under this rule
    the strength of recommendation of an unexperienced object $d_i$ in $D - E$ can be obtained as

    $$R_1(d_i) = \max_{d_j \in E} [S(d_i, d_j) \wedge A(d_j)].$$
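
    Rule 1 can be sketched as a maximum, over the experienced objects, of the minimum of similarity
    and rating. The similarity function is not fixed by the text, so the tag-based Jaccard measure
    below is purely an illustrative assumption, as are the movie names and ratings.

```python
from typing import Callable, Dict


def rule1(candidate: str, ratings: Dict[str, float],
          sim: Callable[[str, str], float]) -> float:
    """R1(d_i) = max over experienced d_j of [S(d_i, d_j) ∧ A(d_j)], with ∧ as min."""
    return max((min(sim(candidate, d), a) for d, a in ratings.items()), default=0.0)


# Hypothetical object representation: a set of genre tags per movie.
tags = {
    "heat": {"crime", "thriller"},
    "casino": {"crime", "drama"},
    "grease": {"musical"},
    "ronin": {"crime", "thriller", "action"},
}


def jaccard(x: str, y: str) -> float:
    """One possible similarity S: shared tags over all tags (an illustrative choice)."""
    return len(tags[x] & tags[y]) / len(tags[x] | tags[y])


ratings = {"heat": 0.9, "casino": 0.6, "grease": 0.2}   # A(d_j) for d_j in E
strength = rule1("ronin", ratings, jaccard)
```

    Here the recommendation strength for the unexperienced movie is driven by its most similar
    well-liked neighbor; neither a similar but disliked object nor a liked but dissimilar one can
    raise it.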

    We express a second guideline for recommending objects, which can be seen as a softening of the
    first rule.

    Rule 2: Recommend an object for which we have at least several comparable objects which the user
    somewhat liked.

    Here we are softening the requirements of Rule 1 by allowing a weaker indication of satisfaction,
    somewhat liked, and a weaker connection, as denoted by the use of the word comparable instead of
    similar. We compensate for this reduction by requiring at least several such objects instead of
    just a single object. Our goal now is to suggest a way to formalize this type of rule so that we
    can evaluate it. In anticipation of expressing this rule we introduce some fuzzy subsets. First we
    note that the term at least several is an example of what Zadeh [22] called a linguistic quantity,
    words denoting precise or imprecise quantities. In [22] Zadeh suggested that any linguistic
    quantity can be represented as a fuzzy subset $Q$ of the set of integers. It is clear that at
    least several is monotonic in that $Q(k_1) \ge Q(k_2)$ if $k_1 \ge k_2$. We must now introduce a
    fuzzy subset to capture the idea of somewhat liked. This concept can be modeled in a number of
    different ways. With $A$ being the fuzzy subset of $E$ indicating the user's satisfaction,
    $A(d_j) = a_j$, we let $\tilde{A}$ be a softening of this corresponding to the concept somewhat
    liked. One way of defining $\tilde{A}$ is as

    $$\tilde{A}(d_i) = (A(d_i))^{\alpha}$$

    for $0 < \alpha < 1$. The smaller $\alpha$, the more the softening. Another method to define
    $\tilde{A}$ is

    $$\tilde{A}(d_i) = 1 \ \text{if } A(d_i) > \beta, \qquad \tilde{A}(d_i) = A(d_i) \ \text{if } A(d_i) \le \beta.$$

    More generally, we can express $\tilde{A}$ using a transformation function
    $T : [0, 1] \to [0, 1]$ such that $T(a) \ge a$, and then defining $\tilde{A}(x_j) = T(A(x_j))$. The
    function $T$ can be expressed using a fuzzy systems model [12], for example

    if a is low then T(a) is medium

    if a is moderate then T(a) is high

    if a is large then T(a) is very large
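
    Two simple softening transformations of the kind discussed here can be sketched as follows: one
    raises a grade to a power between 0 and 1, the other promotes grades above a threshold to full
    membership. The parameter names `alpha` and `beta`, the strictness of the threshold comparison
    and the sample grades are assumptions made for this sketch.

```python
def soften_power(a: float, alpha: float) -> float:
    """a ** alpha with 0 < alpha < 1; a smaller alpha softens more."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return a ** alpha


def soften_threshold(a: float, beta: float) -> float:
    """Full membership above the threshold beta, unchanged otherwise."""
    return 1.0 if a > beta else a


grades = [0.0, 0.25, 0.5, 0.81, 1.0]
by_power = [soften_power(a, 0.5) for a in grades]
by_threshold = [soften_threshold(a, 0.7) for a in grades]
```

    Both functions never lower a grade, which is the defining property of a softening transformation
    on the unit interval.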

    Finally, we must define the concept comparable. As used, the term comparable is meant to indicate
    a softening of the concept of similar. Again, if $T$ is defined as in the preceding, as some
    softening function with $T(a) \ge a$, we can use it to provide a definition for comparable. Thus,
    if $S(x, y)$ indicates the degree of similarity between two objects then we can use
    $\mathrm{Comp}(x, y) = T(S(x, y))$ to indicate


    the degree to which they are comparable. One possible definition for T in this case is T(a) = 1 if

    a and T(a) = if a.

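    Having obtained satisfactory representations of these softening concepts, we can use them to
    provide an operational formulation of this second rule. For any object $d_i \in D - E$ we have

    $$R_2(d_i) = \max_{F \subseteq E} \Big[ Q(|F|) \wedge \min_{d_j \in F} \big( \tilde{A}(d_j) \wedge \mathrm{Comp}(d_j, d_i) \big) \Big].$$

    In the preceding we can express $\tilde{A}(d_j) = T_1(A(d_j))$ and
    $\mathrm{Comp}(d_j, d_i) = T_2(S(d_j, d_i))$, where $T_1$ and $T_2$ are two transformations. It is
    interesting to see that our first rule is a special case of this. If we let $T_1$ and $T_2$ be
    identity transforms, $T_1(a) = T_2(a) = a$, then

    $$R_2(d_i) = \max_{F \subseteq E} \Big[ Q(|F|) \wedge \min_{d_j \in F} \big( A(d_j) \wedge S(d_j, d_i) \big) \Big].$$

    Furthermore, if $Q$ is defined to be at least one, then $Q(|F|) = 1$ if $F \neq \emptyset$ and hence

    $$R_2(d_i) = \max_{\emptyset \neq F \subseteq E} \Big[ \min_{d_j \in F} \big( A(d_j) \wedge S(d_j, d_i) \big) \Big].$$

    Since the minimum over $F$ is largest when $F$ contains a single element, this can be seen to be
    equal to $\max_{d_j \in E} [A(d_j) \wedge S(d_j, d_i)]$, which is $R_1(d_i)$.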
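
    The maximum over all subsets in the second rule looks expensive, but for a fixed subset size the
    best subset is always the one holding the largest values, so a single sort suffices. The sketch
    below checks this against a brute-force enumeration. It assumes the conjunctions are minima; the
    membership function chosen for "at least several" and the sample values are hypothetical.

```python
from itertools import combinations
from typing import Callable, List


def rule2_bruteforce(values: List[float], Q: Callable[[int], float]) -> float:
    """Max over non-empty subsets F of [Q(|F|) ∧ min of the values in F]."""
    best = 0.0
    for k in range(1, len(values) + 1):
        for subset in combinations(values, k):
            best = max(best, min(Q(k), min(subset)))
    return best


def rule2_sorted(values: List[float], Q: Callable[[int], float]) -> float:
    """Same result: for each size k the best subset is the k largest values."""
    ordered = sorted(values, reverse=True)
    return max((min(Q(k), v) for k, v in enumerate(ordered, start=1)), default=0.0)


def at_least_one(k: int) -> float:
    return 1.0 if k >= 1 else 0.0


def at_least_several(k: int) -> float:
    """Hypothetical membership function: full satisfaction from three objects on."""
    return min(1.0, k / 3)


# values[j] = softened liking of experienced object j ∧ its comparability to the candidate
values = [0.9, 0.6, 0.5, 0.2]
several = rule2_sorted(values, at_least_several)
single = rule2_sorted(values, at_least_one)
```

    With the quantifier "at least one" the result is simply the largest value, which is the first
    rule; with "at least several" a single strong neighbor is no longer enough.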

    It is interesting to consider a collection of rules of this type,
    $R_k = \langle Q_k, \tilde{A}_k, \mathrm{Comp}_k \rangle$, where each is a softening: each
    requires more objects but softens either or both of the requirements regarding satisfaction to
    the user and proximity to the object being evaluated. In this softening process we are
    essentially increasing the radius about the object and decreasing the required strength, but
    increasing the number of objects that need to be found.

    Another method for justifying possible objects to recommend is to look for unexperienced objects
    that have a lot of neighbors which the user has experienced, regardless of the valuation which
    they have been given. This captures the idea that the user likes objects of this type regardless
    of their evaluation. For example, a person may see horror movies even if they think these movies
    are bad. This type of situation can be expressed as an extreme case of the preceding
    recommendation rules. Again consider a rule $\langle Q, \tilde{A}, \mathrm{Comp} \rangle$, which
    we evaluate for an item as

    $$R(d_i) = \max_{F \subseteq E} \Big[ Q(|F|) \wedge \min_{d_j \in F} \big( \tilde{A}(d_j) \wedge \mathrm{Comp}(d_j, d_i) \big) \Big].$$

    To capture the above imperative we let $\tilde{A}(d_j) = 1$ if $d_j \in E$, and hence we get

    $$R(d_i) = \max_{F \subseteq E} \Big[ Q(|F|) \wedge \min_{d_j \in F} \mathrm{Comp}(d_j, d_i) \Big].$$

    Letting $\mathrm{Comp}(d_j, d_i) = S(d_j, d_i)$ we have
    $R(d_i) = \max_{F \subseteq E} [Q(|F|) \wedge \min_{d_j \in F} S(d_j, d_i)]$. This can be seen as
    a type of fuzzy integral [11]. Let $\mathrm{Sindex}_i(k)$ be the similarity of the $k$th most
    similar object in $E$ to the object $d_i$. Furthermore, let $q_k = Q(k)$; then

    $$R(d_i) = \max_k [q_k \wedge \mathrm{Sindex}_i(k)].$$


    The essential idea of the preceding methods for justifying items to recommend was the discovery
    of unexperienced items located in areas of the object space that are rich in objects that the
    user liked or experienced. We can capture this imperative in an alternative manner, one that is
    in the spirit of the mountain method [19,20]. For each unexperienced item $d_i \in D - E$, we
    calculate

    $$M_1(d_i) = \sum_{d_j \in E} a_j S(d_i, d_j).$$

    Here $M_1(d_i)$ can be seen as a kind of support for $d_i$ based on a weighted sum. We let $d^*$
    be such that $M_1(d^*) = \max_i [M_1(d_i)]$. Using this we can obtain an evaluation or score for
    each $d_i$ as $R(d_i) = M_1(d_i) / M_1(d^*)$. Alternatively, for each $d_i \in D - E$ we can
    calculate

    $$M_2(d_i) = \sum_{d_j \in E} S(d_i, d_j),$$

    the density of nearby experienced objects regardless of the user's rating. We then calculate
    $d^+$ such that $M_2(d^+) = \max_i [M_2(d_i)]$. From this we obtain a normalized score function
    $R(d_i) = M_2(d_i) / M_2(d^+)$.
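
    Both mountain-style scores can be computed by the same routine, differing only in whether the
    ratings weight the similarities. The following is a sketch under assumed inputs: the similarity
    table, object names and ratings are hypothetical, and returning zero for every candidate when
    the peak support is zero is a choice made here rather than something stated in the text.

```python
from typing import Callable, Dict, Iterable


def support_scores(candidates: Iterable[str], ratings: Dict[str, float],
                   sim: Callable[[str, str], float],
                   use_ratings: bool = True) -> Dict[str, float]:
    """Normalized M1 (rating-weighted) or M2 (pure density) support per candidate."""
    support = {
        c: sum((a if use_ratings else 1.0) * sim(c, d) for d, a in ratings.items())
        for c in candidates
    }
    peak = max(support.values(), default=0.0)
    if peak == 0.0:
        return {c: 0.0 for c in support}
    return {c: s / peak for c, s in support.items()}


# Hypothetical similarities between candidates (c1, c2) and experienced objects (e1, e2).
table = {("c1", "e1"): 0.8, ("c1", "e2"): 0.2, ("c2", "e1"): 0.1, ("c2", "e2"): 0.6}
ratings = {"e1": 1.0, "e2": 0.5}


def sim(x: str, y: str) -> float:
    return table[(x, y)]


weighted = support_scores(["c1", "c2"], ratings, sim)                    # based on M1
density = support_scores(["c1", "c2"], ratings, sim, use_ratings=False)  # based on M2
```

    The candidate with the greatest support always receives a score of one, and every other
    candidate is scored relative to it.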

    7. Using domain expert prototypes

    We shall now consider another approach for obtaining a basis for the recommendation of objects.
    In this approach, we rely upon the use of expert-defined prototype objects. That is, using the
    features available in the representation of the objects, we allow experts in the domain to define
    prototype objects. In some sense these prototypes can be seen as a reflection of the language and
    categories in which the community discusses the domain. For example, terms like film noir, happy
    movie, blockbuster and epic can be seen as prototypes used in the movie domain. The actual
    creation and selection of these domain prototypes is clearly a creative activity, and we shall
    not venture into the issue of obtaining methods for generating prototypes. From a formal point of
    view these prototypes can be expressed in terms of the primitive attributes of the objects in a
    manner analogous to that used to express user preference models.

    Here we shall assume the availability of a collection of prototypes. For our subsequent purposes
    we shall consider a prototype object $T_i$ to be some function of the domain of attributes such
    that for each $d \in D$ we can obtain $T_i(d)$ as a value in the unit interval.

    As a justification for recommendation we can say that if a user likes prototype class $T_i$ and
    if a given unexperienced object $d$ is in this prototype class then we recommend the object. To
    calculate the degree to which an object $d$ is of type $T_i$ we use our definition of this
    prototype class, in which $T_i(d)$ indicates the degree to which $d$ belongs to the prototype. If
    we let $\alpha_i$ indicate the degree to which the user likes type $T_i$ objects, then the degree
    of recommendation of object $d$ is

    $$R_i(d) = \alpha_i \wedge T_i(d).$$

    In order to implement this we need to obtain $\alpha_i$, the degree to which the user likes type
    $T_i$ objects. One method to determine whether the user likes objects in class $T_i$ is as
    follows. For any experienced object $d_j \in E$ let $a_j$ be the user's rating. Then if we
    calculate

    $$L(T_i) = \frac{\sum_{d_j \in E} a_j T_i(d_j)}{\sum_{d_j \in E} T_i(d_j)},$$

    this gives us the user's average rating for objects in class $T_i$. We can use this for
    $\alpha_i$.
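
    The average rating for a prototype class is a membership-weighted mean, which can be sketched as
    below. The movie identifiers, ratings and memberships are hypothetical; returning zero when no
    experienced object belongs to the class, and combining the liking degree with the candidate's
    membership by the minimum, are assumptions of this sketch.

```python
from typing import Dict


def average_liking(ratings: Dict[str, float], membership: Dict[str, float]) -> float:
    """L(T_i): membership-weighted mean of the ratings of experienced objects."""
    total = sum(membership.get(d, 0.0) for d in ratings)
    if total == 0.0:
        return 0.0   # no experienced object belongs to the class; 0 is an assumed default
    return sum(a * membership.get(d, 0.0) for d, a in ratings.items()) / total


def prototype_recommendation(liking: float, candidate_membership: float) -> float:
    """R_i(d) = alpha_i ∧ T_i(d), with ∧ taken as min (an assumption)."""
    return min(liking, candidate_membership)


ratings = {"m1": 0.9, "m2": 0.4, "m3": 0.7}        # a_j for experienced objects
film_noir = {"m1": 1.0, "m2": 0.0, "m3": 0.5}      # hypothetical T_i(d_j)
alpha = average_liking(ratings, film_noir)          # (0.9 + 0.35) / 1.5
degree = prototype_recommendation(alpha, 0.6)
```

    Objects with zero membership in the class, such as the second movie here, have no influence on
    the average.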

    Let us consider another method for determining a user's inclination toward objects in class
    $T_i$. If a user has experienced a lot of objects in the class $T_i$ it is reasonable to assume
    that they are interested in this class. This interest may be independent of whether they report
    liking the objects or not. People experience things for various, sometimes neurotic, reasons, not
    necessarily only because they think they are good. The term camp, used to describe objects that
    are so bad they become amusing, reflects this situation. Thus, it appears useful to be able to
    provide some indication that a user has


    a significant degree of interest in movies of type $T_i$ based solely on the quantity of items of
    this type experienced by the user. We can calculate the number of objects of type $T_i$
    experienced by this user as

    $$N(T_i) = \sum_{d_j \in E} T_i(d_j).$$

    Using this, we can calculate a number of indices. The first,

    $$P_1(T_i) = \frac{\sum_{d_j \in E} T_i(d_j)}{\sum_{d_j \in D} T_i(d_j)},$$

    indicates the proportion of available type $T_i$ objects experienced by this user. The second
    index is

    $$P_2(T_i) = \frac{\sum_{d_j \in E} T_i(d_j)}{\sum_{k} \big( \sum_{d_j \in E} T_k(d_j) \big)}.$$

    We now define a fuzzy subset of the unit interval corresponding to the concept significant
    proportion and calculate the degree of membership of both $P_1(T_i)$ and $P_2(T_i)$ in this set.
    Let us denote these values as $SP_1(T_i)$ and $SP_2(T_i)$. Using these and the value $L(T_i)$ we
    can calculate $\alpha_i = \max[L(T_i), SP_1(T_i), SP_2(T_i)]$ and then use as our degree of
    recommendation $R_i(d) = \alpha_i \wedge T_i(d)$.
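
    The exposure-based indices and their combination with the average rating can be sketched as
    follows. Several things here are assumptions: the second index is read as the share of the
    user's experienced objects that fall in the class, the ramp used for "significant proportion"
    and its breakpoints are hypothetical, and the prototype memberships are invented for the example.

```python
from typing import Dict, Iterable, Tuple

Prototypes = Dict[str, Dict[str, float]]   # prototype name -> {object: membership}


def exposure_indices(name: str, prototypes: Prototypes,
                     experienced: Iterable[str], catalog: Iterable[str]) -> Tuple[float, float]:
    """Return (P1, P2) for one prototype class."""
    experienced, catalog = list(experienced), list(catalog)
    seen = sum(prototypes[name].get(d, 0.0) for d in experienced)        # N(T_i)
    available = sum(prototypes[name].get(d, 0.0) for d in catalog)
    all_seen = sum(sum(t.get(d, 0.0) for d in experienced) for t in prototypes.values())
    p1 = seen / available if available else 0.0
    p2 = seen / all_seen if all_seen else 0.0
    return p1, p2


def significant(p: float, low: float = 0.2, high: float = 0.6) -> float:
    """Hypothetical membership in 'significant proportion': a ramp from low to high."""
    return min(1.0, max(0.0, (p - low) / (high - low)))


def liking_degree(average_rating: float, p1: float, p2: float) -> float:
    """alpha_i = max[L(T_i), SP1(T_i), SP2(T_i)]."""
    return max(average_rating, significant(p1), significant(p2))


prototypes = {
    "film_noir": {"m1": 1.0, "m2": 0.5, "m3": 1.0, "m4": 0.0},
    "epic": {"m1": 0.0, "m2": 0.5, "m3": 0.0, "m4": 1.0},
}
p1, p2 = exposure_indices("film_noir", prototypes, ["m1", "m2"], ["m1", "m2", "m3", "m4"])
alpha = liking_degree(0.3, p1, p2)   # a low average rating is overridden by heavy exposure
```

    Because the three indicators are combined by a maximum, heavy exposure to a class is enough to
    signal interest even when the reported ratings are low.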

    8. Conclusion

    Here we have considered methodologies for constructing recommender systems. The approaches
    studied here differ from collaborative filtering in that they are based solely on the preferences
    of the individual for whom we are providing the recommendation and make no use of the preferences
    of other individuals. We have called these reclusive methods. Another important feature
    distinguishing reclusive methods from collaborative methods is that they require a representation
    of the objects, which collaborative filtering methods do not necessarily require. While our focus
    has been on reclusive rather than collaborative methods, optimal recommender systems should of
    course use all available information and hence should be based on a combination of these two
    classes of systems. In future research we shall look at methods integrating collaborative and
    reclusive approaches.

    References

    [1] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley, Reading, MA, 1999.
    [2] J.C. Bezdek, J. Keller, R. Krishnapuram, N.R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, Kluwer, Boston, 1999.
    [3] D. Dubois, H. Prade, F. Sedes, Fuzzy logic techniques in multimedia database querying: a preliminary investigation of the potentials, IEEE Trans. Knowledge Data Eng. 13 (2001) 383-392.
    [4] D. Goldberg, D. Nichols, B.M. Oki, D. Terry, Using collaborative filtering to weave an information tapestry, Comm. ACM 35 (12) (1992) 61-70.
    [5] H. Kautz, Recommender Systems, AAAI Press, Menlo Park, CA, 1998.


    [6] J.A. Konstan, B.N. Miller, D. Maltz, J.L. Herlocker, L.R. Gordon, J. Riedl, GroupLens: applying collaborative filtering to Usenet news, Comm. ACM 40 (3) (1997) 77-87.
    [7] P. Perny, J.-D. Zucker, Collaborative filtering methods based on fuzzy preference relations, EUROFUSE-SIC'99, Budapest, 1999.
    [8] P. Resnick, H.R. Varian, Recommender systems, Comm. ACM 40 (3) (1997) 56-58.
    [9] J.B. Schafer, J.A. Konstan, J. Riedl, E-Commerce recommendation applications, Data Mining Knowledge Discovery 5 (2001) 115-153.
    [10] U. Shardanand, P. Maes, Social information filtering: algorithms for automating word of mouth, Proc. Computer Human Interaction-95 Conference, Denver, 1995, pp. 210-217.
    [11] M. Sugeno, Fuzzy measures and fuzzy integrals: a survey, in: M.M. Gupta, G.N. Saridis, B.R. Gaines (Eds.), Fuzzy Automata and Decision Process, North-Holland, Amsterdam, 1977, pp. 89-102.
    [12] T. Terano, K. Asai, M. Sugeno, Applied Fuzzy Systems, Academic Press, Orlando, FL, 1994.
    [13] R.R. Yager, On ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans. Systems Man Cybernet. 18 (1988) 183-190.
    [14] R.R. Yager, Families of OWA operators, Fuzzy Sets and Systems 59 (1993) 125-148.
    [15] R.R. Yager, Quantifier guided aggregation using OWA operators, Internat. J. Intell. Systems 11 (1996) 49-73.
    [16] R.R. Yager, Targeted e-commerce marketing using fuzzy intelligent agents, IEEE Intell. Systems (2000) 42-45.
    [17] R.R. Yager, Veristic variables, IEEE Trans. Systems Man Cybernet. Part B: Cybernetics 30 (2000) 71-84.
    [18] R.R. Yager, A hierarchical document retrieval language, Inform. Retrieval 3 (2000) 357-377.
    [19] R.R. Yager, D.P. Filev, Approximate clustering via the mountain method, IEEE Trans. Systems Man Cybernet. 24 (1994) 1279-1284.
    [20] R.R. Yager, D.P. Filev, Generation of fuzzy rules by mountain clustering, J. Intell. Fuzzy Systems 2 (1994) 209-219.
    [21] R.R. Yager, J. Kacprzyk, The Ordered Weighted Averaging Operators: Theory and Applications, Kluwer, Norwell, MA, 1997.
    [22] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. Math. Appl. 9 (1983) 149-184.
    [23] L.A. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems 90 (1997) 111-127.
    [24] L.A. Zadeh, A new direction in AI: toward a computational theory of perceptions, Artif. Intell. Mag. 22 (1) (2001) 73-84.