
2012 IEEE 26th International Conference on Advanced Information Networking and Applications (AINA), Fukuoka, Japan (2012.03.26-2012.03.29)


On Randomness in ISP-friendly P2P Applications

S M Saif Shams
Simula Research Lab
University of Oslo
Oslo, Norway
Email: [email protected]

Paal E. Engelstad
Simula Research Lab
University of Oslo
Oslo, Norway
Email: [email protected]

Amund Kvalbein
Simula Research Lab
University of Oslo
Oslo, Norway
Email: [email protected]

Abstract—P2P networking is a popular technology for sharing large files efficiently without using powerful servers. The goal for a P2P file sharing application is to minimize the download time experienced by the peers. However, the efficiency of a P2P network comes at the expense of excessive network utilization. ISPs want to control the network utilization of P2P applications, and in particular it is desirable to minimize the inter-ISP traffic. The often conflicting interests of applications and network owners have led to the search for an ISP-friendly P2P application that reduces inter-ISP traffic while maintaining high file-sharing efficiency. Existing solutions propose to localize most of the P2P traffic within the ISP, while randomly selecting a few external peers to ensure the global spread of content. However, these proposals disregard the inherent randomness that naturally exists in P2P systems. In this paper, we analyze the different sources of inherent randomness in ISP-friendly P2P systems, and show that they can have a significant impact on performance and therefore must be taken into account when designing new solutions.

Keywords-BitTorrent; P2P; ISP-Friendly P2P;

I. INTRODUCTION

P2P applications have gained popularity due to their efficiency in distributing large files among many users with a minimal central infrastructure. The amount of P2P traffic over the Internet is enormous [1]. However, ISPs often regard P2P traffic as challenging, since it is a large amount of traffic that does not fall into a classical server-client model, and hence is difficult to control and manage. The random peer selection method used in BitTorrent-like P2P applications often results in inefficient use of network resources, when content that is available locally is instead downloaded from remote peers. One particular problem for ISPs is that P2P applications are generating an increasing amount of inter-domain traffic, and thus imposing an increasing transit cost. This conflict of interests between P2P applications and ISPs has led to several proposals for ISP-friendly P2P applications that are less burdensome for ISPs but still maintain the efficiency of the existing BitTorrent (BT) protocol.

The basis for ISP-friendly P2P applications [2], [3], [4], [5], [6] is to confine most of the P2P traffic inside the local ISP when possible. By relying on information about which ISP a peer belongs to, these proposals modify the peer selection strategy so that connections to local peers (peers in the same ISP) are preferred over remote peers (RPs, peers in other ISPs). Connections with some randomly selected RPs are still necessary to spread new content quickly. Purposefully selecting RPs can be termed intentional randomness. In the literature, there are different opinions about how many RPs an ISP-friendly P2P needs to choose from outside the local ISP [2], [3].

In addition to the intentional randomness that is designed into P2P applications, there is also a significant inherent randomness that stems from the nature of the Internet and user behavior, i.e., factors that are not part of the neighbor selection mechanism. For instance, in a real-life system, the P2P application's view of which ISP each peer belongs to will not be completely accurate. Furthermore, peers are randomly distributed across ISPs, and it has also been shown [5] that peers might be very unevenly distributed. Moreover, during the lifetime of a swarm, there will be some degree of churn, meaning that some peers may join or leave at random points in time. These are all examples of sources of inherent randomness, and their effect will be explored in this paper.

Previous related works, on the other hand, fail to take the effects of inherent randomness into account [2], [3], [9], [10], [11]. They typically assume true knowledge of a peer's location. Furthermore, they often simulate a small scenario with similar sets of configuration parameters, based on a flash crowd scenario where a number of peers join at the beginning of the simulation and stay until they finish downloading the whole file. There are some exceptions, for example [4] and [5], which implement ISP-friendly P2Ps by modifying only the client side. However, Piatek et al. [7] show that implementing ISP-friendly clients without involvement of an application tracker has a very limited influence on the reduction of inter-ISP traffic.

In this paper, we analyze how different sources of inherent randomness influence two important performance metrics for ISP-friendly P2P systems, namely the download time and the number of times a piece of content is transferred into an ISP network. We show that in scenarios where ISP-friendly P2P seems extremely advantageous for ISPs using the "classical" simulation setup, this advantage is significantly reduced when taking inherent randomness into account. Our work has significant implications for the design of future ISP-friendly P2P applications.

Section II provides an overview of traditional BitTorrent and existing ISP-friendly P2Ps. Section III illustrates the importance of randomness in P2P systems. Section IV describes major sources of inherent randomness. Section V presents our analysis of inherent randomness. Section VI contains recommendations for future ISP-friendly P2Ps, and Section VII concludes the paper.

2012 26th IEEE International Conference on Advanced Information Networking and Applications

1550-445X/12 $26.00 © 2012 IEEE

DOI 10.1109/AINA.2012.101

II. BACKGROUND

A. Bittorrent

We follow the description of BT in [8]. In that nomenclature, there are two types of peers in a BT network: seeders and leechers. A seeder holds the complete file, whereas a leecher is still downloading it. An application tracker is a centralized component that stores information about all peers. The tracker provides a short list of participating peers to any peer that requests it. When the total number of participating peers grows large, the tracker randomly selects a few of them to make the short list. Using that short list, the requesting peer then tries to connect with as many as 35 peers to share the content.

In the BT protocol, two important algorithms run in every peer: the choking/unchoking algorithm and the piece selection algorithm. The unchoking algorithm determines to which neighbors a peer uploads content. The algorithm is also referred to as a tit-for-tat mechanism, because it unchokes the neighbors from which the peer downloaded the highest amount of content in the last 20 seconds. To do that, a peer always keeps track of the contribution from each of its neighbors in the last 20 seconds. A peer reevaluates the tit-for-tat mechanism every 10 seconds. A peer simultaneously uploads to the four neighbors selected by the tit-for-tat mechanism.

The piece selection algorithm is executed in a peer when it is unchoked by a neighboring peer. The algorithm is used to determine which chunk a peer will download from the neighbor. The algorithm will first prioritize incomplete (partially downloaded) chunks. In the absence of such chunks, it will select the rarest chunk in its neighborhood.
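The two-step rule above (partially downloaded chunks first, then the locally rarest chunk) can be sketched as follows. This is only an illustration, not the authors' simulator code: the function name, the set-based chunk bookkeeping, and the lowest-index tie-break are assumptions made for the example.

```python
from collections import Counter

def select_chunk(have, partial, neighbor_has, neighborhood):
    """Pick the next chunk to request from a neighbor that unchoked us.

    have         -- set of chunk ids already completed (hypothetical layout)
    partial      -- set of chunk ids that are partially downloaded
    neighbor_has -- set of chunk ids the unchoking neighbor can offer
    neighborhood -- list of sets, one per neighbor, of chunks each one holds
    """
    wanted = neighbor_has - have
    if not wanted:
        return None  # the neighbor has nothing new for us

    # Step 1: finish partially downloaded chunks first.
    resumable = sorted(wanted & partial)
    if resumable:
        return resumable[0]

    # Step 2: otherwise take the chunk held by the fewest neighbors.
    # Ties are broken by lowest chunk id here (an assumption of this sketch).
    copies = Counter()
    for chunks in neighborhood:
        copies.update(chunks)
    return min(sorted(wanted), key=lambda c: copies[c])
```

A real client would typically break ties at random so that neighbors do not all request the same rare chunk at once.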

To explore potential new contributors in the neighborhood, the BT protocol includes another unchoking mechanism called 'optimistic unchoke'. This unchoking mechanism randomly selects a neighbor without considering its contribution to the peer. The BT protocol executes the optimistic unchoke every 30 seconds.
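A minimal sketch of the two unchoking decisions described in this subsection is given below. The function names and the dictionary of per-neighbor byte counts are hypothetical; only the constants (four tit-for-tat slots, a 20-second contribution window, re-evaluation every 10 seconds, optimistic unchoke every 30 seconds) come from the description above.

```python
import random

TFT_SLOTS = 4             # simultaneous tit-for-tat uploads
TFT_WINDOW_S = 20         # contribution window used to rank neighbors
TFT_INTERVAL_S = 10       # how often the ranking is re-evaluated
OPTIMISTIC_INTERVAL_S = 30

def tit_for_tat(downloaded_in_window, slots=TFT_SLOTS):
    """Return the neighbors we downloaded the most from in the last window.

    downloaded_in_window maps neighbor id -> bytes received in the window.
    """
    ranked = sorted(downloaded_in_window,
                    key=lambda n: (-downloaded_in_window[n], n))
    return ranked[:slots]

def optimistic_unchoke(neighbors, already_unchoked, rng=random):
    """Pick one extra neighbor at random, ignoring its past contribution."""
    candidates = sorted(set(neighbors) - set(already_unchoked))
    return rng.choice(candidates) if candidates else None
```

In a discrete-event simulator, `tit_for_tat` would be scheduled every `TFT_INTERVAL_S` seconds and `optimistic_unchoke` every `OPTIMISTIC_INTERVAL_S` seconds; the scheduling itself is left out of the sketch.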

B. ISP-friendly P2Ps

Several methods have been proposed for more ISP-friendly P2Ps. Some of those works use a cross-layer communication mechanism between the ISP and the P2P application [3], [9], [10], [11], while others exploit application-layer techniques to establish the ISP-friendly P2P [6], [4], [5].

Bindal et al. [2] were the first to propose a peer selection algorithm that reduces inter-ISP traffic by prioritizing local peers over remote peers. The authors propose an algorithm named biased neighbor selection that selects 34 out of 35 neighbors from inside the local ISP, and selects the last one randomly from any other ISP. Simulating a small scenario with 14 ISPs and 700 uniformly distributed peers, the authors demonstrate that biased neighbor selection reduces inter-ISP traffic dramatically. They also show that a pure localized method where all 35 neighbors are selected from the local ISP may decrease the inter-ISP traffic to a minimum, but it may significantly increase the average downloading time for all peers. Showing a trade-off between the amount of inter-ISP traffic and the downloading time, the authors recommend having only one neighbor from outside the local ISP. Although Bindal et al. introduced the concept of localizing neighbors in the ISP-friendly P2P application, they do not convincingly explain how the P2P application knows the locations of all peers.
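The 34-local, 1-remote split can be illustrated with a short sketch. It assumes the tracker knows every peer's ISP exactly (the `isp_of` mapping is hypothetical), and it adds one behavior that is an assumption of this example rather than a statement about [2]: when the local ISP cannot supply enough neighbors, the remaining slots are filled with randomly chosen remote peers.

```python
import random

def biased_neighbors(peer, isp_of, total=35, remote=1, rng=random):
    """Return (local_neighbors, remote_neighbors) for one peer.

    isp_of maps peer id -> ISP id. With total=35 and remote=1 this gives
    the 34 local + 1 remote split; remote=0 gives the pure localized variant.
    """
    home = isp_of[peer]
    local = [p for p in isp_of if p != peer and isp_of[p] == home]
    others = [p for p in isp_of if isp_of[p] != home]

    # Intentional randomness: a fixed number of randomly chosen remote peers.
    chosen_remote = rng.sample(others, min(remote, len(others)))
    chosen_local = rng.sample(local, min(total - len(chosen_remote), len(local)))

    # Assumed fallback: too few local peers, so fill up with remote ones.
    shortfall = total - len(chosen_local) - len(chosen_remote)
    rest = [p for p in others if p not in chosen_remote]
    chosen_remote += rng.sample(rest, min(shortfall, len(rest)))
    return chosen_local, chosen_remote
```

With 700 peers spread evenly over 14 ISPs every peer finds 34 local neighbors, so the fallback never triggers; it only matters for sparsely populated ISPs.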

One way to get the location information of peers is to establish explicit communication between the ISP and the P2P application. Instead of focusing on a neighbor selection process, some research works focus on different possibilities for establishing a cross-layer communication mechanism between an ISP and the P2P application [3], [11], [9], [10]. In a cross-layer communication mechanism, an ISP provides network information to a P2P application, and the P2P application uses this information to reduce the inter-ISP traffic. To establish the cross-layer communication, the ISP and the P2P application developer need a business agreement, because both of them need to invest in infrastructure. On the one hand, an ISP needs to deploy extra devices to provide some services for its customers, and on the other hand, P2P application developers need to change the application tracker, the clients, or both.

To avoid establishing new infrastructure, some other research works [6], [4], [5], [12] propose to exploit different application-layer techniques to determine the location of peers. For example, [4] suggests using the redirection pattern of a content distribution network (CDN), [6] the ping function, [5] ping and traceroute, and [12] publicly available BGP update information. In this type of ISP-friendly P2P, a peer gets a short list of random peers from an application tracker, and the peer sorts that list based on the location, distance or bandwidth associated with those peers. The peer then shares the content by prioritizing nearby peers. B. Liu et al. [13] identify this type of locality as choker locality. Based on large-scale trace data, Piatek et al. [7] claim that choker locality has a very limited impact on the application performance as well as on the inter-ISP traffic. The reason is that the number of participating peers and the number of ISPs over the Internet are both large, so when the application tracker randomly picks a small subset of all participating peers, the probability of having multiple peers from the same ISP in that subset becomes very low.

As far as the amount of randomness in the neighborhood is concerned, biased neighbor selection [2] and P4P [3] each suggest a specific value: 2.85% (1 of 35 neighbors) and 20% randomly selected RPs, respectively. This paper shows that the optimal amount of intentional randomness in neighbor selection depends on the size of the network and the impact of inherent sources of randomness.


Fig. 1: The network size does not change the average number of incoming chunks to an ISP.

Fig. 2: Localized P2P increases the downloading time.

III. IMPORTANCE OF RANDOMNESS

The primary goal for a P2P file sharing application is to minimize the download time experienced by the peers. For an ISP, on the other hand, an important goal is to minimize the amount of inter-domain traffic generated by the P2P application. Intuitively, this last goal is best achieved by using a purely localized peer selection strategy, where a participating node selects all its peers from the local ISP when possible. However, as we show here, such a strategy will increase download times.

To illustrate the trade-off between download times and inter-domain bandwidth conservation, we set up a classical scenario, which is frequently used in the literature [2], [3], [6]. This scenario involves a flash crowd situation where all peers join in a very short time and stay in the system until they have downloaded the whole file. It is also assumed that an ISP-friendly P2P possesses true knowledge of the locations of all peers. In our experiment, the scenario includes a network with 14 ISPs and 700 peers. There is one initial seeder that stays until the simulation ends. A detailed description of the setup is given in Section V. Figures 1-3 summarize the results.

In our classical scenario, localized P2P is the most favorable to the ISPs because it has the minimum randomness in a peer's neighborhood. Figure 1 shows that an ISP-friendly P2P without any RP generates the minimum amount of inter-ISP traffic. However, as shown in Figure 2, the downloading time increases significantly if localized P2P is used instead of traditional BT.

An increasing number of RPs in a neighborhood contributes more inter-ISP traffic. An RP helps a peer to get new content quickly. The red curve in Figure 3 supports this argument by showing that an ISP-friendly P2P with one RP in the neighborhood experiences a shorter downloading time than the P2P with no RP. When every peer has a few RPs in its neighborhood, most of the peers possess some new content, and they become busy uploading the content at their full capacity. Further increasing the number of RPs does not decrease the downloading time any more.

Fig. 3: For a larger network, more remote peers are needed for the expected avg. DL time. A DL time of 1 is equal to the avg. DL time for BitTorrent.

The effect of adding RPs depends on network size. In a small network of 14 ISPs, if every peer chooses one RP randomly from any of the other 13 ISPs, the 50 peers in an ISP create 50 connections from that ISP to other ISPs. This implies that, on average, there are more than three such connections (50/13 ≈ 3.8) from that ISP to each of the other ISPs. This is also true for the ISP that holds the initial seeder. In a small network, when a seeder starts spreading the content, every ISP quickly gets some of the content to share inside it. However, when the network is large, for example a network with 100 ISPs, this strong connectivity among ISPs no longer holds. In a large network of 100 ISPs, 50 random connections, each from a peer in an ISP, can reach at most about half of the whole network. That is, the ISP that contains the initial seeder spreads the content directly to at most half of the network. This is why more than one RP is required to minimize download time in a larger network. Figure 3 indicates that the number of remote peers required to achieve a file-downloading performance that is competitive with BT depends on the size of the ISP-level topology.
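The connectivity argument above can be checked with a small calculation. The sketch below assumes that each remote connection picks its destination ISP uniformly at random and independently of the others, which is a simplification introduced for this example; the function names are hypothetical.

```python
def links_per_destination(num_isps, connections):
    """Average number of remote connections from one ISP to each other ISP."""
    return connections / (num_isps - 1)

def expected_isps_reached(num_isps, connections):
    """Expected number of distinct other ISPs hit by the random connections.

    Each of the (num_isps - 1) other ISPs is missed by a single connection
    with probability 1 - 1/(num_isps - 1), and by all of them with that
    probability raised to the number of connections.
    """
    others = num_isps - 1
    return others * (1 - (1 - 1 / others) ** connections)

small = expected_isps_reached(14, 50)   # 13 other ISPs, 50 connections
large = expected_isps_reached(100, 50)  # 99 other ISPs, 50 connections
```

Under this assumption, the 50 connections of one ISP reach nearly all of the 13 other ISPs in the small network (about 12.8 in expectation), but only about 39 of the 99 other ISPs in the large one, which is below the upper bound of 50 stated in the text.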

This section shows that a limited amount of randomness is necessary for ISP-friendly P2P applications. The following sections describe different types of inherent randomness and their impact on application performance.

IV. INHERENT RANDOMNESS

In this section we describe different sources of inherent randomness.

A. Wrong Location Information

A peer that is wrongly assumed to belong to a given ISP will form many connections across ISP borders, and hence add significant randomness. This is likely to happen in practice, since it is a challenging task to always correctly identify the host ISP of a peer.

There are several ways to find the host ISP of a peer. One way is to match the IP prefix with the AS/network address given in the CIDR report (cidr.org). Route aggregation in routers, and the multihoming and mobility services used by many users, make it difficult to identify the host ISP of some peers accurately. Another way is to measure the proximity between peers in terms of RTT or hop count. However, measurements show that at least 5% of hosts do not respond to the ping command, and for traceroute the share increases to 55% [5]. An academic solution could be a cross-layer communication mechanism. Even with that mechanism, it is possible that an ISP intentionally provides wrong information for financial benefit.

B. Distribution of Peers over the Network

In a real P2P system, the number of participating peers depends on the type of content and user behavior. Also, the number of participating peers is far from equal in every ISP. This irregular distribution of peers over the network brings randomness to the whole P2P system. Consider an ISP-friendly P2P where the P2P application wants to localize the peers (or traffic) as much as possible (localized P2P). In that system, if an ISP hosts more than 35 peers, then most of the peers will probably have 35 neighbors from the same ISP. On the other hand, if an ISP hosts only one peer, then that peer will randomly choose 35 peers from other ISPs. As a result, the sparsely populated ISPs will contribute randomness not only to themselves but also to other ISPs.

C. Peer Joining and Leaving Process

The peer arrival process can also contribute randomness to the system. For example, the first peer in any ISP will definitely choose all of its neighbors randomly from other ISPs. However, peers joining later will usually get at least some peers from the local ISP. Since the P2P application has almost no control over the peer arrival process, a peer cannot wait for a local peer that might join later.

V. THE IMPACT OF INHERENT RANDOMNESS

The goal of this research is to show how the performance of a P2P application is affected by the inherent randomness that exists in realistic network environments. Simulating large scenarios, we analyze the impact of different sources of inherent randomness. In this section, we present our simulation framework, and evaluate the influence of inherent randomness on several important performance metrics.

A. Simulator Details

To simulate large scenarios with more than 5000 active peers, we developed a flow-based simulator. The simulator has a network model that includes configurable upload bandwidth, download bandwidth, and the maximum number of concurrent outgoing flows for each peer. We assume that the upload or download bandwidth of a peer is the bottleneck bandwidth for a flow.

Our discrete-event simulator also models the major parts of a plain BitTorrent application, including neighbor selection, tit-for-tat, optimistic unchoking, the local rarest first policy, etc. In the case of an ISP-friendly P2P, the simulator uses the localized neighbor selection process instead of random neighbor selection.

TABLE I: Configuration parameters used in the simulation

Parameter                            Value
Number of peers                      5000
Number of ISPs                       100
Number of initial seeders            1
File size                            25.6 MB
Number of chunks                     100
Upload bandwidth                     1 Mbps
Download bandwidth                   2 Mbps
Number of neighbors of each peer     35
Number of tit-for-tat flows          4
Number of optimistic unchokes        1

We follow the TCP implementation of R. Bindal [2] and H. Xie [3] in our model. In the simulator, we consider the long-term average TCP throughput as the throughput on a path. Multiple flows on a link share the bandwidth of the link equally. On the arrival or departure of a flow, the bandwidths of the affected flows are recalculated. Since BitTorrent uses a request queue to keep the TCP flow continuous, the assumption of steady TCP performance will not significantly influence the results [15].
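One simple way to read the flow model (access links as the only bottlenecks, equal sharing among the flows on a link, recomputation on every flow arrival or departure) is sketched below. It is an interpretation for illustration, not the authors' simulator: the function name and data layout are hypothetical, and capacity left unused by a flow that is limited at its other end is not redistributed.

```python
def flow_rates(flows, upload_bw, download_bw):
    """Return the rate of every active flow, in the unit of the bandwidths.

    flows       -- list of (sender, receiver) pairs currently active
    upload_bw   -- dict: peer id -> upload capacity
    download_bw -- dict: peer id -> download capacity
    """
    out_flows, in_flows = {}, {}
    for sender, receiver in flows:
        out_flows[sender] = out_flows.get(sender, 0) + 1
        in_flows[receiver] = in_flows.get(receiver, 0) + 1

    # Each flow gets an equal share of the sender's uplink and of the
    # receiver's downlink; the smaller of the two shares is its rate.
    return {
        (s, r): min(upload_bw[s] / out_flows[s], download_bw[r] / in_flows[r])
        for s, r in flows
    }

# Called again whenever a flow starts or ends, as in the description above.
up = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}     # 1 Mbps upload
down = {"a": 2.0, "b": 2.0, "c": 2.0, "d": 2.0}   # 2 Mbps download
rates = flow_rates([("a", "c"), ("a", "d"), ("b", "c")], up, down)
```

With the bandwidths from Table I, the uplink is usually the tighter constraint: peer "a" splits its 1 Mbps between two flows, while peer "b" can send its single flow at the full 1 Mbps.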

B. Performance Metrics

An ISP-friendly P2P application has two different objectives: first, it tries to improve the performance of the application, and second, it tries to minimize the inter-ISP traffic. In this paper, we use three different metrics to measure these two criteria.

1) Download Time: By download time, we refer to the time a peer needs to download the whole file. The time starts when a peer joins the P2P system. The download time is the primary metric to measure the performance of a P2P application. To measure the download time in a flash crowd situation, we consider the time when all peers finish downloading the whole file.

2) Content Distribution: Considering the download time as a metric to measure the efficiency of a P2P system is not sufficiently accurate when the situation is dynamic. In a dynamic situation many peers randomly join and leave the system. Because of this random user behavior, some peers may leave the system before they finish downloading the whole file, leaving no credit to the system. So, to measure the efficiency of a P2P system, it is logical to consider the amount of content the system spreads over the network. By a more efficient P2P system, we mean a P2P application that spreads the content faster than other P2P applications. To quantify this metric, we use the number of chunks distributed within a particular time by a P2P system.

3) Number of Incoming Copies to an ISP: We use this metric to evaluate how friendly a P2P application is towards the ISP. The value is calculated by counting the number of copies of each chunk that peers in an ISP receive from RPs. To get a representative value for the whole network, we average those values across all chunks and all ISPs.
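The third metric can be computed from a log of chunk transfers, as in the sketch below. The log format (sender, receiver, chunk id) and the function name are assumptions made for this example; transit traffic does not appear because only the endpoints of each transfer are considered.

```python
def avg_incoming_copies(transfers, isp_of, num_chunks):
    """Average number of copies of a chunk that an ISP receives from RPs.

    transfers  -- iterable of (sender, receiver, chunk_id) tuples
    isp_of     -- dict: peer id -> ISP id
    num_chunks -- number of chunks in the file
    """
    num_isps = len(set(isp_of.values()))
    incoming = sum(
        1 for sender, receiver, _chunk in transfers
        if isp_of[sender] != isp_of[receiver]  # chunk crossed an ISP border
    )
    # Averaging over all (ISP, chunk) pairs, including pairs with no copies.
    return incoming / (num_chunks * num_isps)

isp_of = {"a": 0, "b": 0, "c": 1}
log = [("c", "a", 0), ("a", "b", 0), ("c", "b", 1), ("a", "c", 1)]
value = avg_incoming_copies(log, isp_of, num_chunks=2)
```

In the toy log, three of the four transfers cross an ISP border, so the two ISPs receive on average 0.75 remote copies per chunk.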

In this calculation, transit traffic is not considered. By transit traffic we mean traffic that both originates from and is destined to other ISPs. In our previous work [14], we show how transit ISPs make a profit from that transit traffic. ISPs are instead concerned about the traffic that originates from or is destined to peers inside the ISP.

C. Simulation Setup

Unless otherwise stated, we use the configuration parameters listed in Table I for our experiments.


Fig. 4: Wrong location information increases the inter-ISP traffic.

1) Network Topology: For an ISP-level network topology, we adopt the topology model presented in [16], which captures several key properties of the Internet topology, including the hierarchical structure given by policy-based routing, the power-law degree distribution for inter-ISP relationships, and the specific ratio among the number of ISPs in different tiers.

2) Churn Model: The churn model defines how peers join and leave a P2P system. It is possible to develop a sophisticated churn model considering the time of day, the type of file, user behavior in different parts of the world, and network conditions [5], [17]. However, what we want to show is that if not all peers join at the same time, peers joining at later times will bring randomness to the P2P application. To reflect this effect, we use a simple churn model with only one parameter, the percentage of total peers that join later. Here, 10% churn indicates that 90% of the peers join at the beginning of the simulation and the rest appear randomly within the simulation time. It also indicates that 10% of the whole population leaves the system at a random time and 90% of the peers leave the system after finishing the download.
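The one-parameter churn model can be sketched as a join/leave schedule. Several details below are assumptions made for this example and are not specified in the text: join and leave times are drawn uniformly, the late joiners and the early leavers are drawn independently of each other, and the function and parameter names are hypothetical.

```python
import random

def churn_schedule(num_peers, churn, sim_time, rng):
    """Return {peer: (join_time, leave_time)} for a given churn fraction.

    leave_time is None for peers that stay until their download finishes.
    """
    k = round(churn * num_peers)
    late_joiners = set(rng.sample(range(num_peers), k))
    early_leavers = set(rng.sample(range(num_peers), k))

    schedule = {}
    for peer in range(num_peers):
        # The churn fraction joins at a random time; everyone else at t = 0.
        join = rng.uniform(0, sim_time) if peer in late_joiners else 0.0
        # The same fraction of the population leaves at a random later time.
        leave = rng.uniform(join, sim_time) if peer in early_leavers else None
        schedule[peer] = (join, leave)
    return schedule

plan = churn_schedule(5000, 0.10, 600.0, random.Random(42))
```

With 5000 peers and 10% churn, 500 peers are scheduled to join late and 500 to leave before completing; the 600-second horizon is an arbitrary value chosen for the example.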

D. Simulation Results

1) Wrong Location Information: In a localized P2P, misplaced peers generate a lot of inter-ISP traffic. As shown in Figure 4, the number of copies of each chunk received by an ISP increases with an increasing fraction of misplaced peers. An interesting observation in this figure is that, with perfect location information, a strategy where 20% of peers are intentionally selected from remote ISPs produces a similar amount of inter-ISP traffic to a pure localized P2P with 20% misplaced peers. In other words, intentional randomness and inherent randomness in the form of misplaced peers have the same effect on the amount of inter-ISP traffic.

Wrong location information, however, does not influence the downloading performance. The reason is that network delay has a limited impact on long-term TCP performance. We refer to the empirical data presented by S. Ren [5].

2) Distribution of Peers over the Network: In reality, the number of peers that are part of a swarm will vary widely across ISPs. Experimental results suggest that many ISPs may have only a few peers [7], [5]. Ren et al. [5] show that the number of peers in different ISPs follows a Zipf distribution with alpha equal to 0.99.

Fig. 5: With a Zipf distribution of peers, localization decreases the inter-ISP traffic.

Fig. 6: A Zipf distribution of peers improves download times for localized P2P.

In a Zipf distribution with alpha close to 1, a few ISPs contain a large number of peers while most of the ISPs contain only a small number of peers. Peers in a sparsely populated ISP will get many remote neighbors even though they use a localized peer selection strategy, because they may not find a sufficient number of local peers. Compared to a classical scenario with a uniform distribution of peers, a scenario with a Zipf distribution of peers creates a high number of inter-ISP neighborhood relationships, which in turn increases the amount of inter-ISP traffic. Figure 5 shows that localized P2P still reduces inter-ISP traffic, but far less so than in the classical scenario. With localized P2P, the most populated ISPs reduce the inter-ISP traffic by around 80%, while sparsely populated ISPs may reduce the traffic by less than 40%. This inter-ISP traffic reduction is far below the estimate for the classical scenario shown in Figure 1, which is around 96%.
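To make the effect of the skewed population concrete, the sketch below spreads a swarm over ISPs with Zipf weights and computes how many remote neighbors a peer is forced to take under localized selection. The rounding scheme and both function names are assumptions made for this example; only alpha = 0.99, the 35-neighbor limit, and the 5000-peer, 100-ISP setup come from the text.

```python
def zipf_peer_counts(num_peers, num_isps, alpha=0.99):
    """Split num_peers over ISPs so that the ISP of rank r gets ~ 1 / r**alpha."""
    weights = [1 / rank ** alpha for rank in range(1, num_isps + 1)]
    total = sum(weights)
    counts = [int(num_peers * w / total) for w in weights]
    # Hand out the peers lost to rounding, starting with the largest ISPs.
    for i in range(num_peers - sum(counts)):
        counts[i % num_isps] += 1
    return counts

def forced_remote_neighbors(local_peers, neighbors=35):
    """Remote neighbors a peer must accept when its ISP hosts local_peers peers."""
    return max(0, neighbors - (local_peers - 1))

counts = zipf_peer_counts(5000, 100)
forced = [forced_remote_neighbors(c) for c in counts]
```

Under these assumptions the largest ISPs can fill all 35 neighbor slots locally, while the ISPs in the tail host fewer than 35 peers and must take a substantial share of their neighbors from other ISPs, even with a purely localized strategy.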

Next, we look at how download time is affected by the peer distribution. Peer distribution has little influence on BT's performance, while it has a dramatic influence on the performance of ISP-friendly P2Ps. With a Zipf distribution, the average download time for BT is around 424 seconds, which is similar to the average download time for BT with a uniform peer distribution shown in Fig. 2. However, the average download time for localized P2P decreases considerably, from 536 seconds to 417 seconds, if we use the Zipf distribution instead of a uniform peer distribution. Taking into account the confidence intervals of these results, we conclude that BT and localized P2P perform about equally well in terms of the downloading time, while localized P2P performs significantly better in terms of inter-ISP traffic.

3) Effect of Churn: If peers join at random points in time instead of all joining together, chances are that a peer will not find a sufficient number of local peers and hence must select remote peers as neighbors instead. Those remote peers generate inter-ISP traffic and reduce the performance gap between BT and localized P2P. In accordance with this argument, Fig. 7 shows that with localized P2P and an increasing amount of churn, an ISP receives more copies of each chunk. Without any churn, localized P2P reduces the number of incoming copies by more than 95%, from 49.5 copies to 2.2 copies. As the amount of churn increases, this reduction diminishes. With 25% churn the reduction is only 50%, while at 60% churn the improvement is only 20%.

Fig. 7: Traffic reduction with localized P2P decreases with increasing churn

Fig. 8: Influence of churn on content distribution
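The reduction figures quoted for churn can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the implied copy counts under churn rest on the assumption that the BT baseline stays at 49.5 copies per chunk, which the text does not state.

```python
def reduction_pct(baseline: float, localized: float) -> float:
    """Relative reduction in incoming copies per chunk, in percent."""
    return 100.0 * (baseline - localized) / baseline

def copies_after(baseline: float, reduction: float) -> float:
    """Copies per chunk that remain after a given percentage reduction."""
    return baseline * (1.0 - reduction / 100.0)

if __name__ == "__main__":
    # Without churn: 49.5 copies with BT versus 2.2 with localized P2P.
    print(round(reduction_pct(49.5, 2.2), 1))  # about 95.6, i.e. "more than 95%"

    # ASSUMPTION: the BT baseline of 49.5 copies is unchanged under churn.
    for churn, reduction in [(25, 50.0), (60, 20.0)]:
        print(churn, "% churn ->", round(copies_after(49.5, reduction), 2), "copies")
```

Under that assumption, a 50% reduction corresponds to roughly half of the baseline copies still entering the ISP, which shows how quickly the localization benefit erodes as churn grows.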

Churn has the additional effect of reducing the uploading capacity in the BT system, due to the departure of peers holding many chunks. Figure 8 shows how churn reduces content distribution (i.e., the number of chunks uploaded) in BT with random peer selection. When churn increases from 0% to 30%, the number of chunks uploaded in 200 seconds decreases by about 20%. For localized P2P, on the other hand, churn increases the number of chunks uploaded. This is because churn increases connectivity among the small cliques created by localized P2P, and hence improves content distribution.

VI. DISCUSSION

A general challenge for ISP-friendly P2P methods is the lack of incentives for the end user. Native BitTorrent is very efficient in utilizing the available uploading capacity [15]. Hence, ISP-friendly P2P methods that aim at localizing traffic will normally not be able to improve on BitTorrent’s downloading times, as long as the local upload capacity is the limiting factor for the throughput of a connection between two peers. This is a reasonable assumption given today’s generally well-provisioned core networks, and is used in our model. It may be argued [3] that reducing the network delay between peers will lead to a higher throughput, since a shorter round trip time might allow TCP to better utilize the available bandwidth. Measurement studies, however, indicate that this effect is very limited [5]. As mentioned by Choffnes and Bustamante [4], there exist ISPs that discriminate against inter-ISP flows and give them a lower bandwidth than local flows. In such an environment, ISP-friendly P2P will also improve performance for end users. However, given the current focus on network neutrality, this service model has not seen widespread deployment.

This discussion highlights the importance of keeping ISP-friendly P2P methods simple: given the limited achievable performance gain for the application, any additional implementation complexity may discourage its use. This work highlights the inherent randomness in locality-aware systems, which might lead to simpler designs for ISP-friendly P2P applications.

Due to inherent randomness (discussed in Section IV), it is difficult to maintain a neighbor list with a strict, specific ratio between local and remote neighbors. In a sparsely populated ISP, a peer always ends up with an undesirable ratio of local and remote neighbors. Initially, the peer connects with the prescribed number of remote peers and other available local peers by sending neighbor-requests. However, the peer gets additional remote neighbors when it receives neighbor-requests from other peers. With existing ISP-friendly P2P applications, a peer generally accepts a neighbor-request if it has available space in its neighborhood. Thus, the peer gets a higher number of remote neighbors than expected. Maintaining the specific ratio in a neighborhood is also difficult in a densely populated ISP. This is because randomness in the peer arrival process may temporarily cause a lack of local peers in an ISP. In that situation, a peer in the ISP may get additional remote neighbors before other local peers arrive. With a BitTorrent-like P2P application, once a peer connects to a remote peer, it cannot break the relationship unless the remote peer leaves the system. As a result, the peer keeps the undesirable ratio in its neighborhood even when new peer arrivals make it possible to reach the target ratio. Since the existing local peers are connected with excess remote peers, some newly joined peers will be deprived of local connections, and ultimately they will be connected with remote peers. Thus, the problem of an undesirable ratio in the neighborhood propagates from the formerly arrived peers to the newly arrived peers.

There are several ways to stop this propagation. One way is to restrict both sending and accepting of neighbor-requests so that the number of remote neighbors never exceeds the prescribed value. However, under this restriction, peers in a sparsely populated ISP will always have a limited number of neighbors. As a result, those peers will suffer from slow downloading performance. Given that the number of participating peers in an ISP follows a heavy-tailed distribution [5], a significant number of peers in a P2P system will suffer from this restriction. The most promising approach is to accept excess remote peers provisionally, if needed, and disconnect the remote neighbors when local peers are available. One example of such a peer selection algorithm is PreeN [18].
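The provisional-acceptance idea described above can be sketched as a small neighbor-list policy. This is a minimal illustration under stated assumptions, not the PreeN algorithm of [18]: the class name, the neighborhood size, the prescribed number of remote neighbors and the eviction rule (drop the most recently added remote neighbor) are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Peer:
    peer_id: str
    isp: str
    max_neighbors: int = 4   # hypothetical neighborhood size
    target_remote: int = 1   # hypothetical prescribed number of remote neighbors
    neighbors: dict[str, str] = field(default_factory=dict)  # peer_id -> ISP

    def remote_ids(self) -> list[str]:
        """Current remote neighbors, in the order they were added."""
        return [pid for pid, isp in self.neighbors.items() if isp != self.isp]

    def add_neighbor(self, other_id: str, other_isp: str) -> bool:
        """Accept a neighbor; remote peers beyond the target are only provisional."""
        if other_id == self.peer_id or other_id in self.neighbors:
            return False
        if len(self.neighbors) >= self.max_neighbors:
            remote = self.remote_ids()
            is_local = other_isp == self.isp
            # A full list only makes room for a local newcomer, and only while
            # the peer holds more remote neighbors than prescribed.
            if not (is_local and len(remote) > self.target_remote):
                return False
            del self.neighbors[remote[-1]]  # evict the most recently added remote
        self.neighbors[other_id] = other_isp
        return True

if __name__ == "__main__":
    p = Peer("a", "isp1")
    for i in range(4):                 # no local peers yet: fill up with remote peers
        p.add_neighbor(f"r{i}", "isp2")
    for i in range(4):                 # local peers arrive later
        p.add_neighbor(f"l{i}", "isp1")
    print(p.neighbors)
```

In contrast to the restrictive policy, a peer in a sparsely populated ISP keeps a full neighbor list, and the excess remote neighbors are replaced one by one as local peers arrive until only the prescribed number of remote neighbors remains.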

VII. CONCLUSION

Existing ISP-friendly P2P methods are based on preferring local peers over remote peers. To increase robustness and decrease download times, they add some randomness in the neighbor selection process. In this paper, we argue that in addition to this intentional randomness, there will always be an element of inherent randomness in a P2P system. We show that this inherent randomness will significantly influence the performance in terms of download times and inter-ISP traffic. In general, inherent randomness will counteract the intentions of ISP-friendly P2P methods by limiting the ability to localize traffic. There are, however, also positive effects of inherent randomness, in that it may improve content distribution and thus sometimes reduce download times. Our findings show that inherent randomness must be taken into account in the design of ISP-friendly P2P applications, and we question the common belief that an amount of intentional randomness must be added to avoid excessive download times.

REFERENCES

[1] Thomas Karagiannis, Pablo Rodriguez and Dina Papagiannaki, Should Internet Service Providers Fear Peer-Assisted Content Distribution?, IMC05, Berkeley, 2005.

[2] Bindal et al., Improving Traffic Locality in BitTorrent via Biased Neighbor Selection, IEEE ICDCS, 2006.

[3] H. Xie, Y. Yang, A. Krishnamurthy, Y. Liu, A. Silberschatz, P4P: Provider Portal for Applications, SIGCOMM, Washington, USA, 2008.

[4] D. R. Choffnes and F. E. Bustamante, Taming the torrent: a practical approach to reducing cross-ISP traffic in peer-to-peer systems, SIGCOMM, 2008.

[5] S. Ren, T. Luo, S. Chen, L. Guo, X. Zhang, TopBT: A topology-aware and infrastructure-independent BitTorrent client, INFOCOM, 2010.

[6] L. Sheng, H. Wen, Reducing cross-network traffic in P2P systems via localized neighbor selection, ChinaCom, 2009.

[7] M. Piatek, H. V. Madhyastha, J. P. John, A. Krishnamurthy, and T. Anderson, Pitfalls for ISP-friendly P2P design, The Eighth ACM Workshop on Hot Topics in Networks (HotNets-VIII), New York City, NY, USA, October 2009.

[8] B. Cohen, Incentives build robustness in BitTorrent, Proc. 1st Workshop on Economics of Peer-to-Peer Systems, Berkeley, June 2003.

[9] V. Aggarwal, A. Feldmann, and C. Scheideler, Can ISPs and P2P systems cooperate for improved performance?, ACM CCR, July 2007.

[10] D. P. Pezaros, L. Mathy, Explicit Application-Network Cross-layer Optimisation, The 4th International Telecommunication NEtworking WorkShop (IT-NEWS), Italy, 2008.

[11] B. Ruan, W. Xiong, H. Chen, D. Ye, Improving Locality of BitTorrent with ISP Cooperation, International Conference on Electronic Computer Technology, 2009.

[12] C. Hsu, M. Hefeeda, ISP-friendly peer matching without ISP collaboration, CoNEXT, 2008.

[13] Bo Liu, Yi Cui, Yansheng Lu, and Yuan Xue, Locality-Awareness in BitTorrent-Like P2P Applications, IEEE Transactions on Multimedia, vol. 11, no. 3, April 2009.

[14] S. M. Saif Shams, Paal E. Engelstad, Amund Kvalbein, Analysis of peer selection algorithms in cross-layer P2P architectures, Proceedings of the 3rd IEEE international conference on Internet multimedia services architecture and applications, IMSAA-IWAP2PT09, 2009.

[15] Ashwin R. Bharambe, Cormac Herley, Venkata N. Padmanabhan, Analyzing and Improving BitTorrent Performance, IEEE Infocom, 2006.

[16] A. Elmokashfi, A. Kvalbein, and C. Dovrolis, On the Scalability of BGP: the roles of topology growth and update rate-limiting, CoNext 2008, ed. by ACM, 2008.

[17] D. Stutzbach, R. Rejaie, Understanding Churn in Peer-to-Peer Networks, IMC, 2006.

[18] S. M. Saif Shams, Paal E. Engelstad, Amund Kvalbein, PreeN: Improving steady-state performance of ISP-friendly P2P application, ICDCN 2012 (accepted).

[19] M. Izal, G. U. Keller, E. W. Biersack, P. A. Felber, A. A. Hamra, and L. G. Erice, Dissecting BitTorrent: five months in a torrent’s lifetime, Lecture Notes in Computer Science, 2004.
