Betweenness Centrality Approximations for an Internet Deployed P2P Reputation System

Dimitra Gkorou, Johan Pouwelse, and Dick Epema
Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
HOTP2P 2011, May 20, 2011




Overview

• Tribler
• The BarterCast Reputation Mechanism
• Betweenness Centrality
• Approximations for Betweenness Centrality
• Integration of these methods in BarterCast
• Conclusion


Tribler: main features

• based on the BitTorrent P2P file-sharing system

• an epidemic protocol for peer and content discovery

• social phenomena to implement distributed control:
  • content discovery
  • content recommendation
  • reputation system

• first released on 17 March 2006

• more than 1,000,000 downloads

• BarterCast: the reputation system of Tribler against free-riders

J.A. Pouwelse, P. Garbacki, J. Wang, A. Bakker, J. Yang, A. Iosup, D.H.J. Epema, M. Reinders, M.R. van Steen, H.J. Sips, "Tribler: A social-based peer-to-peer system," Concurrency and Computation: Practice and Experience Vol. 20, 127-138, 2008.


BarterCast 1: Basic Concepts

• information exchange: using an epidemic protocol

• peers keep the history of their own interactions + the interactions among other peers

• each peer i creates a directed, weighted local graph:

• vertices: the peers whose activity is known to peer i

• weighted edges: the amount of the transferred data between two peers

• each peer computes locally the subjective reputations of other peers in the system

[Figure: local subjective graph of peer i, with nodes i, j, k, b, c, m and edge weights w_ij, w_jc, w_ki, w_kb, w_im]
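The local subjective graph described on this slide can be sketched as a small data structure. This is a minimal illustration, not Tribler's code: the class name `SubjectiveGraph`, its methods, and the rule of keeping the largest reported total per edge are all assumptions made for the example.

```python
from collections import defaultdict


class SubjectiveGraph:
    """Directed, weighted graph of the transfers known to one peer (hypothetical sketch)."""

    def __init__(self, owner):
        self.owner = owner
        # weights[src][dst] = amount of data src is known to have sent to dst
        self.weights = defaultdict(dict)

    def record(self, src, dst, amount):
        """Store a transfer report, whether observed directly or received from another peer."""
        # Assumption: reports carry cumulative totals, so the largest one wins.
        current = self.weights[src].get(dst, 0)
        self.weights[src][dst] = max(current, amount)
        self.weights.setdefault(dst, {})  # make sure dst is also a vertex

    def vertices(self):
        return set(self.weights)

    def weight(self, src, dst):
        return self.weights.get(src, {}).get(dst, 0)


graph = SubjectiveGraph("i")
graph.record("i", "j", 100)  # peer i's own interaction
graph.record("j", "c", 40)   # interaction between other peers, learned by gossip
graph.record("j", "c", 30)   # an older report of the same edge is ignored
```

Every peer holds its own instance, so two peers can disagree about the same edge; this is what makes the resulting reputations subjective.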


BarterCast 2: Information Exchange

[Figure: data transactions among the peers in BarterCast, alongside the local subjective graph of peer 9 and the local subjective graph of peer 10; edge labels include w9,8, w6,9, w9,10, w10,4, w10,12, w2,12]

M. Meulpolder, J.A. Pouwelse, D.H.J. Epema, and H.J. Sips, "BarterCast: A Practical Approach to Prevent Lazy Freeriding in P2P Networks," (HoT-P2P), in conjunction with IPDPS, May 2009.


BarterCast 3: Computing Reputation

• a peer i willing to interact with a peer g:
  • considers the amounts of transferred data in its local subjective graph as flows
  • uses the max-flow algorithm to compute f_gi and f_ig

• reputation of peer g: the difference of f_gi and f_ig

• the computation is restricted to paths of length 2 due to its computational cost

[Figure: local subjective graph of peer i, with nodes i, a, b, c, e, f, g, j, k and edge weights w_ia, w_bi, w_bj, w_jg, w_fg, w_ge, w_gk, w_gc, w_ca and w_ac, w_ba and w_ab]
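The reputation computation on this slide can be sketched as follows. The sketch reads "paths of length 2" as paths of at most two edges, which is an assumption. Under that reading the direct edge s→t and each detour s→v→t share no edges, so the maximum flow is simply the direct capacity plus the bottleneck of every detour. The function names, the nested-dict graph layout, and the sign convention f_gi − f_ig are also assumptions for illustration.

```python
def two_hop_flow(weights, s, t):
    """Max flow from s to t using only paths of at most two edges.

    weights[u][v] is the amount of data u sent to v (hypothetical layout).
    """
    if s == t:
        return 0
    out_edges = weights.get(s, {})
    flow = out_edges.get(t, 0)  # direct edge s -> t
    for v, capacity in out_edges.items():
        if v in (s, t):
            continue
        # detour s -> v -> t is limited by its smaller edge
        flow += min(capacity, weights.get(v, {}).get(t, 0))
    return flow


def reputation(weights, i, g):
    """Reputation of g as seen by i: flow from g towards i minus flow from i towards g."""
    return two_hop_flow(weights, g, i) - two_hop_flow(weights, i, g)


weights = {
    "g": {"i": 10, "j": 50},
    "j": {"i": 20},
    "i": {"g": 5, "a": 30},
    "a": {"g": 8},
}
f_gi = two_hop_flow(weights, "g", "i")  # 10 direct + min(50, 20) via j
f_ig = two_hop_flow(weights, "i", "g")  # 5 direct + min(30, 8) via a
```

The closed form avoids running a general max-flow algorithm, which is only possible because of the path-length restriction.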


Bartercast 4: Problem Description

• starting the computation from the owner of the subjective graph itself results in bad coverage

• starting from the most central node results in better coverage

• the most central node is the node with the highest betweenness centrality (BC)

[Figure: local subjective graph of peer i, with nodes i, a, b, c, e, f, g, j, k and edge weights w_ia, w_bi, w_bj, w_jg, w_fg, w_ge, w_gk, w_ga, w_ca and w_ac, w_ba and w_ab]


Betweenness Centrality

• The BC of a node v is the sum, over all pairs of nodes s and t, of the ratio of shortest paths between s and t that pass through v:

$$BC(v) = \sum_{s \neq v \neq t} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}}$$

where σ_{s,t}(v) is the number of shortest paths between nodes s,t passing through node v, and σ_{s,t} is the number of shortest paths between nodes s,t

• computation of BC: the all-pair shortest path problem

• the fastest algorithm for BC:
  • explores and counts the shortest paths using Breadth-First Search starting from every node in the network
  • efficiently aggregates the path counts
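The two steps of the fastest algorithm (one Breadth-First Search per node, then aggregation of path counts) match the well-known algorithm by Brandes; the slide does not name it, so that attribution is a reading of the description. A minimal sketch for unweighted graphs follows, assuming a hypothetical adjacency-dict layout in which every node appears as a key.

```python
from collections import deque


def betweenness(adj):
    """Exact BC for an unweighted graph given as {node: iterable of neighbours}.

    Each ordered pair (s, t) is counted, so an undirected graph stored with
    both edge directions yields twice the unordered-pair value.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Step 1: BFS from s, counting shortest paths (sigma) and predecessors.
        dist = {s: 0}
        sigma = {s: 1}
        preds = {s: []}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Step 2: aggregate path counts from the farthest nodes back towards s.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
```

On the path a–b–c only b lies between other nodes; on the star every leaf-to-leaf shortest path passes through the centre c.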


Experimental Setup 1: Dataset

• growing synthetic and Bartercast graphs

• the synthetic graphs grow from 1,000 up to 20,000 nodes

• 20 instances, each one containing the previous one + 1,000 new nodes

• for the BarterCast graph:
  • we crawled BarterCast from 24 July to 9 September 2009
  • it starts with 1,592 nodes and reaches up to 2,408 nodes


Experimental Setup 2: Graph Types

• random graph
  • each new node is connected to every existing node with a constant probability P

• power-law graph
  • each new node is preferentially attached to existing nodes with a probability proportional to their degree
  • its degree distribution is expressed as $P(k) \sim c\,k^{-\gamma}$, where γ is the power-law exponent
  • only a few nodes are highly connected

• graph derived from BarterCast
  • power-law exponent: 2.2
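The two synthetic growth rules can be sketched as below. This is an illustration under stated assumptions, not the generator used for the experiments: the function names, the undirected set-based graph layout, and the parameter `m` (number of edges per new node in the preferential rule) are hypothetical. The preferential rule also assumes every node in the seed graph already has at least one edge and that `m` is at least 1.

```python
import random


def grow_random(adj, new_nodes, p, rng):
    """Connect each new node to every existing node with constant probability p."""
    for u in new_nodes:
        existing = list(adj)
        adj[u] = set()
        for v in existing:
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def grow_preferential(adj, new_nodes, m, rng):
    """Attach each new node to m distinct existing nodes, chosen proportionally to degree."""
    for u in new_nodes:
        existing = list(adj)
        degrees = [len(adj[v]) for v in existing]  # must not all be zero
        chosen = set()
        while len(chosen) < min(m, len(existing)):
            chosen.add(rng.choices(existing, weights=degrees, k=1)[0])
        adj[u] = set(chosen)
        for v in chosen:
            adj[v].add(u)
    return adj


# Grow a triangle by 50 nodes, two edges each (seed value chosen arbitrarily).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
power_law = grow_preferential(triangle, range(3, 53), 2, random.Random(42))
```

Calling either function repeatedly with 1,000 fresh node IDs would produce a sequence of instances in which each one contains the previous one, as in the experimental setup.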


Approximation 1: Growing Graphs

• because of the structural properties of real graphs, their most central node does not change often, so BC values do not have to be updated often

• focus on the stability of the top-n most central nodes

• consider the sequences of IDs of the top-n most central nodes in consecutive graph instances

• we use two metrics:
  • the number of common nodes in two consecutive sequences
  • the minimal number of transpositions needed to get all the common nodes of the latter sequence into the order of the previous one
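The two stability metrics can be sketched as follows. The slide does not say whether a transposition swaps any two positions or only adjacent ones; this sketch assumes arbitrary swaps, for which the minimum equals the number of common nodes minus the number of cycles of the permutation. With adjacent swaps the answer would instead be the number of inversions. The function names are hypothetical, and node IDs are assumed to be unique within a sequence.

```python
def common_nodes(previous, latter):
    """Nodes of the latter top-n sequence that also appear in the previous one."""
    previous_set = set(previous)
    return [v for v in latter if v in previous_set]


def min_transpositions(previous, latter):
    """Fewest swaps that put the common nodes of `latter` into their order in `previous`."""
    common = set(previous) & set(latter)
    target = [v for v in previous if v in common]
    current = [v for v in latter if v in common]
    position = {v: i for i, v in enumerate(target)}
    perm = [position[v] for v in current]
    # A cycle of length c needs c - 1 swaps, so the total is n - (number of cycles).
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return len(perm) - cycles


top_prev = [1, 2, 3, 4, 5]
top_next = [2, 1, 3, 6, 5]  # node 6 replaced node 4; nodes 1 and 2 swapped ranks
```

A high count of common nodes together with few transpositions indicates that the ranking of central nodes is stable between instances.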


Approximation 1: Growing Graphs

[Figure: number of common nodes and number of transpositions versus instance number (t), for the random graph and the power-law graph]

In power-law graphs, the most central nodes remain almost invariant over time, so BC does not have to be recomputed often


Approximation 2: Large Graphs

1. Pivot BC (P-BC): random selection of a small subset of nodes (the pivots) from which to start Breadth-First Search
   • overestimates the BC of nodes close to the pivots

2. Scale BC (S-BC): like P-BC, but normalized over the distance of a node from the pivots

3. k-BC: explores only the paths of length at most k

[Figure: example with k = 2]
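P-BC and k-BC can both be sketched as restrictions of the exact BFS-based computation: P-BC limits the set of start nodes, and k-BC limits the search depth. The code below is a minimal illustration under assumptions, not the implementation evaluated in the talk: the function names, the adjacency-dict layout, and the n/(number of pivots) extrapolation factor in `pivot_bc` are hypothetical choices. S-BC is left out because the slide does not give its exact distance normalization.

```python
from collections import deque
import random


def bc_from_sources(adj, sources, max_len=None):
    """Accumulate BC contributions of BFS runs from `sources`, optionally depth-limited."""
    bc = {v: 0.0 for v in adj}
    for s in sources:
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            if max_len is not None and dist[v] == max_len:
                continue  # do not explore paths longer than max_len
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


def pivot_bc(adj, num_pivots, rng):
    """P-BC: run BFS only from randomly chosen pivots and extrapolate (assumed scaling)."""
    pivots = rng.sample(list(adj), num_pivots)
    scale = len(adj) / num_pivots
    return {v: scale * x for v, x in bc_from_sources(adj, pivots).items()}


def k_bc(adj, k):
    """k-BC: count only shortest paths of length at most k."""
    return bc_from_sources(adj, adj, max_len=k)


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
```

On the five-node chain, exact BC ranks c highest, while k-BC with k = 2 gives b, c and d equal scores because each lies on exactly one path of length 2 in each direction. This illustrates how a small k trades accuracy for cost.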


Approximation 2: Large Graphs

[Figure: number of correctly identified nodes and number of transpositions versus cost of computation, for the random graph and the power-law graph]

• In power-law graphs, the approximations of BC are highly accurate (S-BC achieves the best accuracy)

• In random graphs, all the approximations have a lower accuracy (k-BC achieves the best accuracy)


Approximation 2: Large Graphs

[Figure: number of correctly identified nodes and number of transpositions versus cost of computation, for the BarterCast graph]

In BarterCast graphs, the approximations are accurate enough, with S-BC achieving the best results


Integration in BarterCast 1: Setup

• we integrate P-BC, S-BC and k-BC in BarterCast and evaluate their effect

• each peer identifies the most central node in its subjective graph using one of these approximations and then applies max-flow with that node as a start point

• two metrics:
  • coverage: the fraction of peers in a subjective graph for which the local reputations turn out to be non-zero
  • relative average error: the absolute difference of the locally computed reputations of the peers and their actual reputations
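The two evaluation metrics can be sketched as below. The slide does not state how the error is normalized, so this sketch simply averages the absolute differences over the peers of the subjective graph; that choice, the function names, and the dict-based inputs are assumptions for illustration, and the sample values are made up.

```python
def coverage(local_reputation):
    """Fraction of peers whose locally computed reputation is non-zero."""
    non_zero = sum(1 for r in local_reputation.values() if r != 0)
    return non_zero / len(local_reputation)


def average_error(local_reputation, actual_reputation):
    """Mean absolute difference between local and actual reputations (assumed normalization)."""
    total = sum(abs(local_reputation[p] - actual_reputation[p]) for p in local_reputation)
    return total / len(local_reputation)


# Illustrative values only, not measurements.
local = {"a": 0.5, "b": 0.0, "c": -0.25, "d": 0.0}
actual = {"a": 1.0, "b": 0.5, "c": -0.25, "d": 0.0}
```

In this example half of the peers receive a non-zero local reputation, and the local values are off by 0.25 on average.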


Integration in BarterCast 2: Results

• BC=0: a node with BC equal to 0

• 1/2maxBC: the node with BC equal to 50% of the maximum BC

• maxBC: the node with the maximum BC

[Figure: coverage and relative average error]

• Using the most central node in the computation of reputation results in better coverage and smaller average error

• S-BC and k-BC identify the most central node correctly


Conclusions & Future Work

• power-law graphs: the approximations of BC are efficient and highly accurate

• random graphs: it is harder to identify the most central nodes

• using the node with the highest BC increases the accuracy and the coverage in BarterCast

• k-BC and S-BC correctly identify the most central node in BarterCast

• future work: not keeping the complete history of transferred data for the computation of reputation
  • limited size of memory
  • computational cost
  • accuracy


Questions?

www.pds.ewi.tudelft.nl

www.tribler.org

contact: [email protected]