
P2P Reputation Systems Credibility Analysis: Tradeoffs and Design Decisions

Eleni Koutrouli* and Aphrodite Tsalgatidou
Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
{ekou, atsalga}@di.uoa.gr

* Eleni Koutrouli is also with the Bank of Greece.

Abstract

Peer-to-Peer (P2P) systems are distributed, self-organized systems without a controlling entity, in which peers need to make trust decisions regarding whom they will transact with. P2P reputation systems exploit past transactional information to provide members of P2P networks with measures that help them make such trust decisions. However, these systems are themselves vulnerable to various types of attacks which can distort their functionality and credibility. A number of defense mechanisms have been proposed to counteract specific attacks, but research lacks an exploration of how these mechanisms can be used together so as to provide the highest degree of credibility. In this paper, after presenting the basic attacks and defense mechanisms concerning P2P reputation systems, we analyze the tradeoffs between different credibility enhancing mechanisms and ways to achieve the right balances. Our aim is to help reputation system designers incorporate the right mechanisms, or the right combinations of mechanisms, against the threats their specific systems face.

1. Introduction

P2P systems are distributed systems without a controlling entity. Due to their nature, P2P systems are vulnerable to attacks by malicious peers; thus, they need trust mechanisms to help peers decide whom to transact with. Such mechanisms are offered by P2P reputation systems, which utilize information about peers' past experiences with each other and attach to peers reputation measures that estimate their performance in future transactions. However, due to their decentralized and social nature, P2P reputation systems are exposed to attacks themselves. A number of defense mechanisms have been proposed as countermeasures to specific attacks, but the literature lacks a complete threat analysis that would reveal all possible attacks on a reputation system and the need for suitable countermeasures. Furthermore, it is not clear how the existing mechanisms can be used in combination, as mechanisms which counteract a specific attack may favor other attacks. In other words, there exist tradeoffs between credibility enhancing mechanisms which have not been systematically studied. These tradeoffs have to be considered when designing a reputation system in order to balance the desired credibility features. Work related to reputation system credibility can be found in [1] and [2]; however, the issues of tradeoffs and related design decisions are not discussed in the literature.

In this paper we explore the tradeoffs between credibility enhancing mechanisms in reputation systems, and we present some ways to achieve the necessary balances. Our aim is to support the design process of credible reputation systems by: a) proposing suitable countermeasures for the various reputation system attacks; b) presenting the various tradeoffs between specific credibility enhancing mechanisms and other desired characteristics; and c) presenting ways to handle these tradeoffs.

In the following, we briefly describe P2P reputation systems and analyze the concept of their credibility; then we present the attacks that distort the credibility of a reputation system, as well as the various defense mechanisms found in the literature. In section 4 we describe the tradeoffs between desirable credibility characteristics, together with specific mechanisms for achieving the right balances. Our conclusions follow in section 5.


2. P2P reputation systems credibility

In a decentralized reputation system the participating entities play interchangeably the roles of trustor, trustee and recommender. The trustor is an entity which wants to make a trust decision regarding whether or not to participate in a transaction (e.g. selling goods, accessing a resource, etc.) with another entity, the trustee. In order to make a trust decision, the trustor tries to predict the future behavior of the trustee by forming a view of the latter based on experience of its earlier actions. The trustor gathers experience information either by referring to its own earlier experience with the trustee, or by acquiring it from other entities (recommenders) in the form of recommendations. A recommendation is either a rating describing a single transaction (transaction-based recommendation), or an opinion formed by the outcome of several transactions (opinion-based recommendation). Based on those recommendations and on its personal experience, the trustor estimates an indicator of the quality of the trustee regarding its services, which constitutes the trustee's reputation. Furthermore, the reputation of a peer depends on the specific context and time, as peers' behavior is dynamic; also, recent transactions are often considered more important than older ones.
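As a rough illustration of this model, the following sketch combines a trustor's own time-decayed ratings with recommendations from others. The [0, 1] rating scale, the exponential half-life decay and the weighting factor alpha are assumptions made for the sketch, not features of any particular system discussed here.

```python
import time

def reputation(direct_ratings, recommendations, now=None,
               half_life=86400.0, alpha=0.7):
    """Estimate a trustee's reputation in [0, 1].

    direct_ratings / recommendations: lists of (rating, timestamp) pairs,
    with ratings in [0, 1]. Recent evidence counts more via exponential
    time decay (half_life in seconds); alpha balances personal experience
    against recommendations. Returns None if there is no evidence at all.
    """
    now = time.time() if now is None else now

    def decayed_mean(pairs):
        if not pairs:
            return None
        weights = [0.5 ** ((now - t) / half_life) for _, t in pairs]
        total = sum(weights)
        return sum(w * r for w, (r, _) in zip(weights, pairs)) / total

    own = decayed_mean(direct_ratings)      # trustor's own experience
    others = decayed_mean(recommendations)  # recommenders' evidence
    if own is None:
        return others   # no personal experience: rely on recommenders
    if others is None:
        return own
    return alpha * own + (1 - alpha) * others
```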

Credibility is an essential property of a reputation system and refers to the confidence that can be placed in its effectiveness. As analyzed in [1], credibility concerns various components of a reputation system, namely the recommendation information itself, the recommendation sources, the collection and aggregation methods, and the management of reputation information.

Recommendation information should be carefully created so as to reflect the quality of real transactions with the trustee. In the case of opinion-based recommendations, both the number of aggregated recommendations and the aggregation method influence their credibility. The type of experiences reflected in recommendations (positive, negative or both) affects recommendation accuracy. The ability to trace the recommendation source and to verify the originality of the related transactions is also important.

Recommendations should come from the most relevant and credible recommenders, so attention should be paid to the recommender selection method, to possible bias in this method, and to keeping track of each recommender's credibility. When recommendation information is communicated through mediating entities, the mediators' credibility is also relevant.
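One simple way to keep track of recommenders' credibility, as described above, is to maintain a separate recommendation reputation per peer and use it as a weight when aggregating recommendations. The sketch below assumes such a score already exists in [0, 1]; the function name and the min_weight default for unknown recommenders are illustrative choices, not part of any cited system.

```python
def weighted_opinion(recommendations, rec_reputation, min_weight=0.0):
    """Aggregate recommendations, weighting each by its sender's credibility.

    recommendations: {recommender_id: rating in [0, 1]}.
    rec_reputation:  {recommender_id: score in [0, 1]} reflecting how
    reliable that peer's past recommendations proved to be.
    Returns a weighted average, or None if no recommender carries weight.
    """
    num = den = 0.0
    for rec_id, rating in recommendations.items():
        w = rec_reputation.get(rec_id, min_weight)  # unknown peers get min_weight
        num += w * rating
        den += w
    return num / den if den > 0 else None
```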

Quality of reasoning also affects reputation credibility; this depends on: a) the recommendation aggregation method; b) the translation of the calculated reputation measure; c) the kind and amount of history information used; and d) the possible evaluation of the calculated reputation value. Furthermore, storing and retrieving both recommendation information and reputation values should be done in a secure manner, so that information is not lost, altered or intercepted.

3. P2P reputation systems attacks and defenses

3.1. P2P reputation systems attacks

Attacks targeting reputation systems can be conducted deliberately or not, in isolation or collusively; they take one of the following forms:

• Spreading unfair ratings: Entities can spread unfair ratings about other entities, either intentionally or unintentionally. Intentional unfair ratings aim either at defaming others (badmouthing) or at praising each other (unfair praising). When malicious peers claim fake transactions and spread recommendations for peers they have never transacted with, the attack is known as ballot stuffing. Malicious peers can also collude to conduct collusive badmouthing and related attacks in which some of the colluders behave badly while the others spread high recommendations for them (collusive deceiving). When peers can obtain multiple identities, Sybil attacks [3] may take place, where a single peer uses a number of identities to conduct collusive attacks.

• Inconsistent behavior: Peers can strategically behave inconsistently in ways that lead to an incorrect estimation of their reputation. They can, for example, misbehave towards a subset of peers (discrimination), or change their behavior suddenly or periodically (oscillatory behavior). They can also behave honestly in many small transactions and misbehave in a few large ones, so as to keep a high reputation (traitors).

• Exploiting weak identification and weak authorization: These attacks depend heavily on the identity policies used, i.e. on the number and kind of identities a peer can obtain and on the way a peer's identity is linked to the recommendations it gives. For example, misbehaving peers can escape their own bad reputation by re-entering the system as new peers (whitewashing). Peers can also manipulate ratings that are transmitted through or stored by them (man-in-the-middle attacks); for example, they can omit a recommendation or change its content. They can also monitor recommendation information to infer information about the recommender or the trustee (privacy breaching), or impersonate others by stealing their pseudonyms, misbehaving without harming their own reputation. Furthermore, if the system permits it, peers can deny having sent a rating (repudiation), which allows them to send unfair ratings without being accountable for them.

3.2. Defense mechanisms

Defense mechanisms against unfair recommendations include the following:

• Similarity-based filtering techniques, which are frequently used to compare the provided recommendations and to filter out, as unreliable, recommenders with low similarity to the trustor on commonly evaluated entities [4] (a minimal sketch follows this list).

• Estimating the reputation of the recommender (recommendation reputation) and using it as a weight for its recommendations, in order to decide whether to use a recommendation or not. This is done either based on the peer's reputation regarding honesty in its transactions [5][6][7][8], or by keeping track of its trustworthiness regarding each recommendation it gives [9][10].

• Methods for unbiased selection of recommenders/recommendations, e.g.:
  - Using social relationships. In social-network-based reputation systems (such as [10]), the cooperation relationship between a recommender and the trustee, or between recommenders, is taken into account so as to prevent collusion attacks.
  - Using a number of pre-trusted peers (as in EigenTrust [8]) to isolate unfair recommenders.

• Tying a recommendation to a transaction, so as to prevent ballot stuffing attacks. A simple way to do this is by incorporating a timestamp in the recommendation [11], which can then be used to verify the originality of a transaction. The use of electronic payment schemes is proposed in [12] to allow the creation of a transaction originality statement. In [13], a recommendation is bound to a transaction through transaction proofs, built using a Public Key Infrastructure (PKI) [14] based scheme.

• Considering uncertainty and lack of evidence: In the case of opinion-based recommendations, confidence measures can be estimated for each recommendation and used to weight it in the reputation metric. Such a confidence measure is estimated in [10] based on the amount of available information and on the deviation from the average of the aggregated ratings. In [15], each opinion-based recommendation is weighted in the reputation metric using the size of the transaction history between the recommender and the trustee, as well as the number of recommenders whose recommendations contributed to its calculation. Furthermore, uncertainty factors can be incorporated in the reputation metric, as in [5], where fuzzy logic rules are used for estimating and weighting ratings. A different way to incorporate uncertainty is found in [16], where a peer can ask its partner in a transaction to provide a recommendation regarding that transaction. Such recommendations can then be presented by their receiver to prospective partners, helping them estimate a reputation metric for it even when they can find little or no other information about it.

• Reward/punishment based defenses: In this kind of defense mechanism, honest recommenders are rewarded, while peers which provide unfair recommendations or random opinions are punished (e.g. [17][18]). In [17], a participant is credited with a reward for each opinion and debited for each recommendation query; honest peers are rewarded with credit discounts, while dishonest ones are punished with a probationary period during which they are not rewarded for their recommendations. In [18], monetary rewards for submitted feedback are used, based on the correlation between the reports of different entities.
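The similarity-based filtering in the first bullet above can be made concrete with a small sketch. This is not the mechanism of [4]; the mean-absolute-difference measure, the threshold value and the decision to drop recommenders with no overlap are all assumptions.

```python
def filter_recommenders(trustor_ratings, recommender_ratings, threshold=0.25):
    """Keep only recommenders whose ratings agree with the trustor's own.

    trustor_ratings: {peer_id: rating} for peers the trustor rated itself.
    recommender_ratings: {recommender_id: {peer_id: rating}}.
    A recommender is kept if the mean absolute difference between its
    ratings and the trustor's, over commonly evaluated peers, is below
    the threshold. Recommenders with no common peers are dropped here
    as unverifiable; a real system might treat them differently.
    """
    kept = []
    for rec_id, ratings in recommender_ratings.items():
        common = set(ratings) & set(trustor_ratings)
        if not common:
            continue
        divergence = sum(abs(ratings[p] - trustor_ratings[p])
                         for p in common) / len(common)
        if divergence < threshold:
            kept.append(rec_id)
    return kept
```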

Defenses against inconsistent behavior incorporate sudden changes in behavior and oscillatory behavior into the reputation metric. For example, a dynamic reputation metric which penalizes sudden changes in behavior and oscillatory behavior is proposed in [19]. In this metric, negative ratings are weighted more heavily than positive ratings (negative ratings sensitivity), so that cheating behavior is quickly reflected in a peer's reputation. Furthermore, a reputation variation monitoring mechanism, which detects variations in peers' transactional behavior and incorporates them into their reputation, is proposed in [20].
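To make negative ratings sensitivity concrete, here is a minimal sketch of an asymmetric update rule. It is not the metric of [19]; the weights and the neutral point are illustrative assumptions. With these values, a single bad transaction (rating 0.0) drops a reputation of 0.9 to 0.54, while climbing back to 0.9 takes many consecutive good transactions.

```python
def update_reputation(current, rating, w_pos=0.1, w_neg=0.4, neutral=0.5):
    """Move a reputation value in [0, 1] toward a new rating.

    Ratings below the neutral point pull the reputation down with a
    larger weight than positive ratings pull it up, so a peer that
    starts cheating loses reputation quickly while rebuilding it is
    slow. This also dampens the payoff of oscillatory behavior.
    """
    weight = w_neg if rating < neutral else w_pos
    return (1 - weight) * current + weight * rating
```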

Finally, defense mechanisms against attacks based on weak identification and weak authorization involve strong identification of peers and authentication of reputation information, aiming at enhancing accountability. They include:

• Using unique digital identities, in order to prevent impersonation, e.g. by using a PKI [14].

• Using digital signatures and other cryptographic mechanisms, as in [21], to preserve the integrity and secrecy of reputation information, so as to prevent collusion and man-in-the-middle attacks.

• Restricting the generation of new identities, in order to discourage Sybil attacks, whitewashing and ballot stuffing. Popular methods include limiting the rate at which a peer can generate new identities [8], charging entry fees for newcomers joining the system [22], and allowing only a few once-in-a-lifetime identities issued by a few trusted companies [23]. The creation of new identities can also be discouraged by assigning a low default reputation to newcomers and not allowing any peer's reputation to fall below this value; a minimal sketch of this floor follows.
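The sketch below illustrates the newcomer floor just described, assuming reputations in [0, 1]; the default value of 0.3 is an arbitrary design parameter, not one proposed in the cited works.

```python
NEWCOMER_REPUTATION = 0.3  # assumed default assigned to every new identity

def effective_reputation(history_value):
    """Clamp a peer's reputation so it never drops below the newcomer default.

    With this floor in place, re-entering the system under a fresh
    identity gains a whitewasher nothing: its old reputation could not
    have been worse than a newcomer's anyway.
    """
    return max(history_value, NEWCOMER_REPUTATION)
```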

4. Tradeoffs

The defense mechanisms against reputation system attacks presented in the previous section bring to light several tradeoffs between desirable characteristics of P2P reputation systems:

1. 'Privacy' versus 'trust': The privacy of a peer has to do with its ability to remain anonymous and to prevent other peers from monitoring its transactions and recommendations. However, for a reputation system to work, information about a peer's identity, its transactions and the recommendations it provides needs to be monitored. The more information is linked to a peer's identity, the lower its anonymity and privacy, and the higher the level of accountability that can be achieved. A specific case of this tradeoff is the use of references proposed in [16], which a peer may give to other peers regarding the transactions it had with them, and which the receivers can then use as proof of their honesty. Although this mechanism results in more accurate reputation estimation, peers which give such references surrender their privacy with respect to how they value their partner's performance. Reputation-based trust is thus traded against the level of privacy achieved. To handle this tradeoff, cryptographic mechanisms can be used for hiding the real identity of a peer and for encrypting its communication [24].

2. 'Negative feedback sensitivity' versus 'robustness against collusive badmouthing': If negative recommendations are quickly and significantly reflected in the reputation of a peer, then deceptive transactional behavior is discouraged. However, the reputation system becomes defenseless against badmouthing, and even more so against collusive badmouthing.

3. 'Encouraging newcomers' versus 'preventing whitewashing' via the default reputation value assigned to newcomers: The initial trustworthiness value assigned to a peer should be high enough to encourage new peers to enter a P2P system, yet not so high that it encourages whitewashing. If the value of a peer's reputation cannot fall below the default value assigned to a newcomer, peers have no incentive to execute whitewashing attacks.

4. 'Resilience to oscillatory behavior' versus 'helping reputation restoration of previously misbehaving peers' via monitoring changes in behavior: Taking previous transaction history, and especially changes in behavior, into consideration and incorporating these changes in the reputation metric, as in [19][20], mitigates oscillatory behavior, but as a result a misbehaving peer cannot easily restore its reputation. This may be unfair for honest peers which misbehave unintentionally.

5. 'Performance' versus 'accuracy' via history size: The size of the transaction history used for reputation estimation strongly affects both the performance and the accuracy of the reputation system. The larger the history, the better the accuracy of the reputation metric, but the lower the performance, as both storage space and computation time requirements increase. To keep history sizes manageable, aggregating recommendation data over intervals of exponentially increasing length is proposed in [13] (see the sketch after this list).

6. 'Performance' versus 'resilience to man-in-the-middle attacks' via reputation information redundancy: Reputation information redundancy is employed by a number of reputation systems (e.g. [8]) to resist manipulation by malicious nodes. However, it downgrades the performance of the system, as it increases storage and communication requirements.

7. 'Considering only positive experiences, and thus counteracting badmouthing attacks' versus 'resilience to collusive deceiving attacks': Taking only positive experiences into account in reputation estimation counteracts badmouthing attacks but favors collusive praising.

8. 'Considering only negative experiences, and thus counteracting collusive deceiving' versus 'resilience to badmouthing attacks': Taking only negative experiences into account in reputation estimation counteracts the effects of collusive dishonest praising but favors badmouthing attacks.

9. 'Resilience to unfair recommendations' via similarity measures versus 'considering honest recommendations which do not comply with the majority': When similarity measures are used, recommendations which deviate from the average recommendation value tend to be considered dishonest. They may nevertheless be honest, reflecting a sudden change in the trustee's behavior or discriminatory behavior of the trustee towards the recommender. In both cases the trustee deviates from its average behavior, so an honest recommendation describing this new behavior will also deviate from the average.

10. 'Encouraging recommendation provision' via rewards for recommendations versus 'preventing random recommendations': Some reputation systems (e.g. [18]) reward peers for giving recommendations. However, peers may then be tempted to give random opinions. As shown in [18], additional mechanisms should be deployed to ensure that rewards are given only for honest recommendations.

11. 'Incentives for honest recommendations' via credit-based reward/punishment mechanisms versus 'ease of development': Credit-based mechanisms incorporated in reputation systems, such as [18], incur increased communication and implementation costs, as the credit balance of each peer must be kept and updated, based on complex economic models, in a decentralized manner.
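The history compression mentioned in tradeoff 5 can be sketched as follows. This is only loosely inspired by the interval scheme of [13]; the power-of-two interval sizes and the plain means are assumptions made for the sketch.

```python
def compress_history(ratings, levels=5):
    """Summarize a rating history over intervals of exponentially growing length.

    ratings: list of numeric ratings, most recent first. The newest
    interval keeps 1 rating, the next summarizes 2, then 4, 8, ...,
    so storage grows logarithmically with history length, at the cost
    of coarser information about the distant past.
    """
    summaries = []
    start = 0
    for level in range(levels):
        size = 2 ** level
        chunk = ratings[start:start + size]
        if not chunk:
            break
        summaries.append(sum(chunk) / len(chunk))  # mean rating of this interval
        start += size
    return summaries  # one summary per interval, newest first
```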

5. Concluding remarks

P2P reputation systems are vulnerable to various attacks, which we have examined in this paper along with the related defense mechanisms. However, one cannot simply deploy a specific defense mechanism against a specific attack, as it may favor another type of attack or affect another desirable characteristic of the reputation system. We have therefore also identified and analyzed the basic tradeoffs between desirable credibility characteristics of reputation systems. This analysis can help reputation system designers choose the right mechanisms to balance the various tradeoffs, according to system requirements and priorities. Hence, our work can serve both as a reference for the design process of credible reputation systems and as a framework for evaluating the resilience of reputation systems to attacks, by examining the countermeasures used and the ways tradeoffs are handled.

Acknowledgements

This work was partly supported by the University of Athens (ELKE) under contract 70/3/5829 and by the European Commission under contract 511680 for the SeCSE project.

References

[1] Ruohomaa, S. et al., "Reputation Management Survey", ARES 2007, 103-111.
[2] Hoffman, K. et al., "A Survey of Attack and Defense Techniques for Reputation Systems", Purdue University, CSD TR #07-013, 2007.
[3] Douceur, J. R., "The Sybil Attack", IPTPS 2002, 251-260.
[4] Dellarocas, C., "Mechanisms for Coping with Unfair Ratings and Discriminatory Behavior in Online Reputation Reporting Systems", ICIS 2000, 520-525.
[5] Song, S. et al., "Trusted P2P Transactions with Fuzzy Reputation Aggregation", IEEE Internet Computing, Nov/Dec 2005, 24-34.
[6] Xiong, L. et al., "PeerTrust: Supporting Reputation-Based Trust for Peer-to-Peer Electronic Communities", IEEE Transactions on Knowledge and Data Engineering, 2004, 16(7):843-857.
[7] Lee, S. et al., "Cooperative Peer Groups in NICE", IEEE INFOCOM 2003, vol. 2, 1272-1282.
[8] Kamvar, S. et al., "The EigenTrust Algorithm for Reputation Management in P2P Networks", WWW 2003, 640-651.
[9] Dillon, T. et al., "Managing the Dynamic Nature of Trust", IEEE Intelligent Systems, 2004, 19(5), 79-82.
[10] Sabater, J. et al., "Reputation and Social Network Analysis in Multi-Agent Systems", AAMAS 2002, 475-482.
[11] Despotovic, Z. et al., "Maximum Likelihood Estimation of Peers' Performance in P2P Networks", P2PECON 2004.
[12] Kinateder, M. et al., "Architecture and Algorithms for a Distributed Reputation System", iTrust 2003, 1-16.
[13] Srivatsa, M. et al., "TrustGuard: Countering Vulnerabilities in Reputation Management for Decentralized Networks", WWW 2005.
[14] The PKI Page, http://www.pki-page.org/
[15] Can, A. et al., "SORT: A Self-ORganizing Trust Model for Peer-to-Peer Systems", TR-016-0016, Purdue University, 2006.
[16] Huynh, T. et al., "Certified Reputation: How an Agent Can Trust a Stranger", AAMAS 2006, 1217-1224.
[17] Papaioannou, T. et al., "Enforcing Truthful-Rating Equilibria in Electronic Marketplaces", IEEE ICDCSW 2006, 40.
[18] Fernandes, A. et al., "Pinocchio: Incentives for Honest Participation in Distributed Trust Management", iTrust 2004, 63-77.
[19] Duma, C. et al., "Dynamic Trust Metrics for Peer-to-Peer Systems", PDMST 2005, 776-781.
[20] Dariotaki, T. et al., "Detecting Reputation Variations in P2P Networks", WDAS 2004.
[21] Jurca, R. et al., "An Incentive Compatible Reputation Mechanism", IEEE CEC 2003, 285-292.
[22] Friedman, E. et al., "The Social Cost of Cheap Pseudonyms", Journal of Economics and Management Strategy, 2001, 10(2), 173-199.
[23] Ingram, D., "An Evidence-Based Architecture for Efficient, Attack-Resistant Computational Trust Dissemination in Peer-to-Peer Networks", iTrust 2005, 273-288.
[24] Seigneur, J.-M. et al., "Trading Privacy for Trust", iTrust 2004, 93-107.
