BitTorrent

BitTorrent. BitTorrent network On the itinerary: Introduction to BitTorrent Basics & properties 3 Interesting analysis results


BitTorrent


BitTorrent network

On the itinerary:
- Introduction to BitTorrent
- Basics & properties
- 3 interesting analysis results


Publishing

How to publish a (usually large) file?
- Dedicated server:
  - Easy to manage, easy to find, persistent service
  - Nevertheless…
- BitTorrent:
  - Organizes multiple clients that share the same file
  - Leverages the upload bandwidth of the participants
  - Self-scaling, resilient, operates well during “flash crowd” periods
  - Takes 50%-60% of all p2p traffic


The Basics of BitTorrent

Content provider (everyday people) wants to publish a file (the initial seed):
- Creates a meta file (*.torrent)
- Publishes this (light) *.torrent file on a web server
- The file is broken into small blocks (32-256 KB)
- Uploads blocks to other peers
- The goal: to publish the file to many nodes with the help of other peers, with minimum load on the seeder
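A minimal sketch of the "broken into small blocks" step, assuming a 256 KB block size from the range above and a SHA-1 digest per block so that a downloader can verify each block independently. The function names are hypothetical and this is not a complete *.torrent writer.

```python
import hashlib

def split_into_blocks(data, block_size=256 * 1024):
    """Cut the file content into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def block_hashes(blocks):
    """One SHA-1 hex digest per block (assumed verification scheme)."""
    return [hashlib.sha1(block).hexdigest() for block in blocks]

if __name__ == "__main__":
    content = bytes(1_000_000)  # hypothetical 1 MB file of zero bytes
    blocks = split_into_blocks(content)
    print(len(blocks), "blocks;", "last block has", len(blocks[-1]), "bytes")
```

Only the small list of digests needs to travel with the meta file; the blocks themselves are exchanged between peers.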


The Basics of BitTorrent

Third party (the tracker):
- A tracker site keeps track of the active participants, plus extra statistics
- Upon requests from nodes, supplies a random subset of active nodes
- Receives updates from the active nodes
- Keeps track of new nodes joining the ‘torrent’ (or ‘swarm’) and nodes that left it
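The tracker's bookkeeping can be sketched as an in-memory table of swarms. This is a toy model under assumptions: `ToyTracker`, `announce` and `leave` are hypothetical names, peers are plain strings, and the default subset size of 40 is only an illustrative value; a real tracker speaks a network protocol and keeps statistics that are omitted here.

```python
import random

class ToyTracker:
    """In-memory stand-in for a tracker: one set of active peers per torrent."""

    def __init__(self, subset_size=40):
        self.subset_size = subset_size
        self.swarms = {}  # torrent id -> set of active peer ids

    def announce(self, torrent_id, peer):
        """Register (or refresh) a peer and return a random subset of the others."""
        swarm = self.swarms.setdefault(torrent_id, set())
        others = sorted(swarm - {peer})
        swarm.add(peer)
        return random.sample(others, min(self.subset_size, len(others)))

    def leave(self, torrent_id, peer):
        """Forget a peer that left the swarm."""
        self.swarms.get(torrent_id, set()).discard(peer)
```

Returning a random subset rather than the full list keeps responses small and spreads connections across the swarm.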


The Basics of BitTorrent

Peer (leecher) who is interested:
- Obtains the public *.torrent file
- Is directed to the tracker
- Obtains a list of random neighbors (~40)
- Downloads and uploads blocks to its ‘best’ neighbors (choking and unchoking)
- Upon download completion, becomes a seed


BitTorrent demo (Wikipedia)


BitTorrent basic schemes

(Immediate) problems arise:
- Last block problem: assuming nodes depart upon completion, how would leechers obtain the last block?
- Free riding problem: nodes are willing to download, but unwilling to serve others

Simple and effective solutions:
- Last block problem: nodes employ a Local Rarest First policy
- Free riding problem: nodes employ a “Tit For Tat” policy, i.e. give more to those from whom you receive more
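Both policies can be sketched in a few lines. This is an illustration under assumptions, not the logic of any particular client: the function names, the tie-breaking rules and the default of 4 upload slots are hypothetical, and refinements used by real clients are left out.

```python
from collections import Counter

def rarest_first(my_blocks, neighbor_blocks):
    """Pick the missing block held by the fewest neighbors (ties: lowest index).

    neighbor_blocks maps a neighbor id to the set of block indices it holds.
    Returns None when no neighbor offers a block we still need.
    """
    counts = Counter(b for blocks in neighbor_blocks.values() for b in blocks)
    candidates = [b for b in counts if b not in my_blocks]
    if not candidates:
        return None
    return min(candidates, key=lambda b: (counts[b], b))

def tit_for_tat(received_bytes, slots=4):
    """Unchoke the neighbors that recently gave us the most data."""
    ranked = sorted(received_bytes, key=lambda p: (-received_bytes[p], p))
    return ranked[:slots]
```

Requesting the locally rarest block first keeps every block replicated before its few holders depart; ranking neighbors by what they gave makes uploading the way to earn download bandwidth.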


On the itinerary

- One interesting limitation of BitTorrent networks
  - BitTorrent provides poor service availability, shown via analysis of tracker logs over a long period
- Proposition of a peer selection approach
  - Enables ISPs to lower their costs and resources
  - Does not require cooperation between ISPs and peers
  - Implementation of the proposition as part of the Azureus client, codename ‘Ono’
- Tradeoffs
  - Performance (avg. download time) vs. fairness (avg. share ratio)


BitTorrent limitations

Taken from the paper “Measurements, Analysis, and Modeling of BitTorrent-like Systems” [1]
- Inspects overall performance over the lifetime of a torrent
- Analysis is based on traces
- A model is derived and used to draw conclusions
- Verifies that the derived conclusions match the observed behavior
- Limitations were found:
  - Poor service availability (coming next)
  - Fluctuating download performance
  - Unfair service to peers


Service availability - Analysis

Analysis is based on traces
- Tracker logs (~1500 torrents, sampled every 30 sec)
- Traces from servers that publish *.torrent files

Extracted data
- Identify peers
- Birth time of the torrent, size
- For each peer in the same torrent: arrival time, download & upload bandwidth, cumulative downloaded & uploaded bytes


Service availability - Analysis

Y-axis at time t: the total number of requests for all torrents in the trace minus the cumulative number of requests for all torrents at time t after they are born.

Similar observations are seen in the *.torrent metadata traces


Service availability - Analysis

- This suggests that the request rate decreases exponentially after a torrent is born
- Notice that this is a cumulative measure
- Does a specific torrent behave the same?
  - Use the least squares method to measure how much a specific torrent deviates from this logarithmic fitting
  - Analyze the distribution of the average relative deviation (which is mostly small, 6% on average)
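One way such a fit could be carried out is ordinary least squares on the logarithm of the remaining requests, followed by the mean relative deviation from the fitted curve. The exact procedure used in [1] is not spelled out here, so the function names and the deviation definition below are assumptions.

```python
import math

def fit_log_linear(times, remaining):
    """Least squares fit of ln(remaining) against time; returns (intercept, slope)."""
    ys = [math.log(r) for r in remaining]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope

def mean_relative_deviation(times, remaining, intercept, slope):
    """Average of |observed - fitted| / fitted over all samples."""
    fitted = [math.exp(intercept + slope * t) for t in times]
    return sum(abs(r - f) / f for r, f in zip(remaining, fitted)) / len(times)
```

A slope of -1/τ on the log scale corresponds to an exponential decay with time constant τ, so a small deviation for a single torrent means that torrent follows the aggregate trend.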


Service availability – A model

Based on the observations, define the torrent popularity at time t as the peer arrival rate, which is the derivative of the peer arrival time distribution for that torrent.

Arrival rate: $\lambda(t) = \lambda_0 e^{-t/\tau}$
- $\lambda_0$ is the initial arrival rate when the torrent starts
- $\tau$ is the attenuation parameter
- Both are evaluated from the observations


Service availability – A model

- Define the torrent lifespan: the duration from the birth of the torrent to the time when no complete copy remains (thus new leechers would not be able to complete the download)
- The inter-arrival time between two successive arriving peers can be approximated as $1/\lambda(t)$, where $\lambda(t)$ is the peer arrival rate
- Assume seeds leave the system at rate $\gamma$; then the average service time for a seed is approximately $1/\gamma$


Service availability – A model

Look at consecutive peers that join the torrent

Peer n and (n+1) join the torrent in time t(n) and t(n+1) respectively

The inter arrival time between them is approximately

Peer n downloads the file with speed u(n) and stays in the torrent for time duration

When peer arrival rate is small enough (n is large), peer n+1, with speed u(n+1) <= u(n) could only be served by peer n


Service availability – A model

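- Thus, when the inter-arrival time exceeds the average seed service time, peer n+1 can’t complete the download and the torrent is dead
- Using the definition of the arrival rate, we get the torrent lifespan: $T_{life} = \tau \ln(\lambda_0 / \gamma)$
- The initial arrival rate $\lambda_0$ and the attenuation parameter $\tau$ are both extracted from the trace (using linear regression), as well as the seed departure rate $\gamma$
- Compare results from the model to what the trace holds
- Results match very well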
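The lifespan argument can be checked numerically. The sketch assumes an exponentially decaying arrival rate λ(t) = λ0·e^(-t/τ) and a seed departure rate γ, and declares the torrent dead once the inter-arrival time 1/λ(t) reaches the seed service time 1/γ. The symbol names and the parameter values in the usage example are assumptions chosen for illustration, not values from the trace.

```python
import math

def arrival_rate(t, lam0, tau):
    """Peer arrival rate at time t after the torrent's birth."""
    return lam0 * math.exp(-t / tau)

def torrent_lifespan(lam0, tau, gamma):
    """Time at which 1 / arrival_rate(t) equals the seed service time 1 / gamma."""
    return tau * math.log(lam0 / gamma)

if __name__ == "__main__":
    lam0, tau, gamma = 100.0, 2.0, 1.0  # hypothetical parameters
    life = torrent_lifespan(lam0, tau, gamma)
    print("lifespan:", life, "arrival rate at that time:", arrival_rate(life, lam0, tau))
```

At the computed lifespan the arrival rate has dropped to exactly γ, which is the crossover point the argument relies on.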


Model vs. Observations

Comparison of torrent lifespans: the average lifespan is 8.89 according to the trace and 8.34 based on the model


Service availability - Summary

Conclusions were obtained relying on extensive trace analysis and modeling

Existing BitTorrent systems provide poor service availability

This is due to the exponentially decreasing peer arrival rate

This provides strong motivation for inter-torrent collaboration


Next on the itinerary

- One interesting limitation of BitTorrent networks
  - BitTorrent provides poor service availability, shown via analysis of tracker logs over a long period
- Proposition of a peer selection approach
  - Enables ISPs to lower their costs and resources
  - Does not require cooperation between ISPs and peers
  - Implementation of the proposition as part of the Azureus client, codename ‘Ono’
- Tradeoffs
  - Performance (avg. download time) vs. fairness (avg. share ratio)


Reducing cross-ISP traffic

Taken from the paper “Taming the Torrent” [2]
- Motivation: the overwhelming popularity of p2p (70% of internet traffic worldwide) yielded significant revenues for ISPs
- However, p2p traffic has significantly increased ISPs’ costs, particularly in terms of cross-ISP traffic
- This has driven ISPs to try to forcefully reduce p2p traffic:
  - Blocking specific ports, tricking clients into closing connections
  - Deep packet inspection
  - Caching


Reducing cross-ISP traffic

- One approach to alleviate this pain is to use an oracle that provides knowledge about which peers are in the same ISP
  - This would benefit both ISPs and the p2p community
  - But this requires p2p users and ISPs to collaborate and to trust each other
  - Not likely to be adopted


Reducing cross-ISP traffic

- Another approach is to recycle data that is already being collected by Content Distribution Networks (CDNs)
- CDNs attempt to improve web performance by redirecting requests to replica servers
- The goal is to help content providers (e.g. CNN) distribute content by redirecting requests to replica servers that are:
  - Topologically proximate
  - Lower-latency


CDNs as oracles

- Hypothesis: when peers exhibit similar redirection behavior, they are likely to be close to the same replica server, and thus to each other
- Represent redirection behavior using ratio-maps
  - Each ratio represents the frequency of redirection to a specific replica
  - The number of replicas is usually small (max 31)
  - Keep a time window (~ a day)


CDNs redirections as ratio-maps

- The ratio-map of a peer a is a set of (replica server, ratio) pairs
- Specifically, if peer a is redirected toward replica server r1 75% of the time window, and toward replica server r2 25% of the time window, then the corresponding ratio-map is $\nu_a = \langle (r_1, 0.75), (r_2, 0.25) \rangle$
- The sum of all ratios in a given ratio-map equals one
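A ratio-map is straightforward to build from a window of observed redirections. In this sketch each observation is simply the name of the replica server that a lookup returned; counting lookups rather than measuring time, and the name `ratio_map`, are assumptions.

```python
from collections import Counter

def ratio_map(redirections):
    """Turn a window of observed replica-server names into {replica: ratio}."""
    counts = Counter(redirections)
    total = sum(counts.values())
    return {replica: n / total for replica, n in counts.items()}

if __name__ == "__main__":
    # Three redirections to r1 and one to r2 reproduce the 75% / 25% example.
    print(ratio_map(["r1", "r1", "r1", "r2"]))
```

Because every ratio is a count divided by the same total, the ratios always sum to one.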


Similarity via ratio-maps

- Define a metric that, given two peers, produces a value describing the similarity between the peers’ redirection behavior
- We are looking for overlap in the redirection frequency maps of each pair of peers
- Use cosine-similarity between two peers a and b:


Cosine-similarity

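Similarity of peers a and b: $\cos\_sim(a, b) = \dfrac{\sum_{i \in I_a} \nu_{a,i}\,\nu_{b,i}}{\sqrt{\sum_{i \in I_a} \nu_{a,i}^2 \cdot \sum_{i \in I_b} \nu_{b,i}^2}}$
- $I_a$ ($I_b$) is the set of replica servers to which peer a (b) has been redirected over the time window
- $\nu_{a,i}$ is the ratio of time that peer a has been redirected to replica server i
- Cosine-similarity is analogous to a dot product
  - When the maps are identical, it equals 1
  - When the maps are orthogonal (no common replica), it equals 0
  - Values lie in [0,1]
- Determine a threshold (0.15)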
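The metric and the threshold can be sketched as follows. Ratio-maps are plain dictionaries from replica name to ratio; the helper `recommended_peers` and the choice of a strict `>` comparison against the 0.15 threshold are assumptions made for illustration.

```python
import math

def cosine_similarity(map_a, map_b):
    """Cosine similarity of two ratio-maps; replicas missing from a map count as 0."""
    dot = sum(ratio * map_b.get(replica, 0.0) for replica, ratio in map_a.items())
    norm_a = math.sqrt(sum(r * r for r in map_a.values()))
    norm_b = math.sqrt(sum(r * r for r in map_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommended_peers(own_map, peer_maps, threshold=0.15):
    """Peers whose redirection behavior is similar enough to ours."""
    return [peer for peer, m in peer_maps.items()
            if cosine_similarity(own_map, m) > threshold]
```

Since all ratios are non-negative, the result stays in [0, 1], and two peers that never share a replica server score exactly 0.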


CDNs implementation

- Ono, an extension to the Azureus client
- Upon the handshake of two peers, they exchange ratio-maps
  - This enables Ono to perform biased peer selection
- Performs a DNS lookup for each CDN name to determine redirection behavior and encodes it in ratio-maps
- Periodically updates the ratio-maps
- Overhead is extremely small
  - 18KB upstream, 36KB downstream per day
  - Computation of cosine-similarity is easy


Ono-recommended empirical results

- Over 120,000 peers use Ono
- Ono collects extra network data
  - Ping and traceroute to replica servers and peers
- Obtain feedback on the biased peer selection
  - Cross-ISP hops are not easy to determine
  - IP hops are easy to count and give some measure
- Compare Ono-recommended peer selection to random peer selection


Ono-recommended empirical results

Cumulative distribution function of the number of IP hops taken along paths between an Ono client and its peers.

Each value represents the average number of hops for all peers seen by a particular Ono client during a 6-hour interval

- Ono finds shorter paths
- The median is less than half
- More than 20% are only one hop away, vs. less than 2% in the random case
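Statements such as "the median is less than half" or "more than 20% are only one hop away" are read off an empirical CDF. The sketch below shows how those two quantities are computed from a list of per-client average hop counts; the sample values in the usage example are made up for illustration and are not measurements from [2].

```python
import statistics

def empirical_cdf(values):
    """One (value, fraction of samples <= value) pair per sample, in sorted order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

def fraction_at_most(values, limit):
    """Share of samples that do not exceed the limit, e.g. paths of one hop."""
    return sum(v <= limit for v in values) / len(values)

if __name__ == "__main__":
    hops = [1, 1, 3, 5, 8]  # hypothetical hop counts
    print("median:", statistics.median(hops))
    print("one hop away:", fraction_at_most(hops, 1))
```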


Ono-recommended empirical results

- Each IP address was mapped to the corresponding Autonomous System (AS) id
- Similar to the previous graph
- Over 33% of paths found by Ono do not leave the origin AS, vs. less than 10% in the random case
- The median number of AS hops is one


CDNs as oracles - Summary

- Recycling network views collected by CDNs
- Good internet citizenship in terms of reducing cross-ISP traffic
- Performance of peers is not affected
- Scalable (the more clients adopt it, the more accurate the bias gets)
- Available easily and freely


Last on the itinerary

- One interesting limitation of BitTorrent networks
  - BitTorrent provides poor service availability, shown via analysis of tracker logs over a long period
- Proposition of a peer selection approach
  - Enables ISPs to lower their costs and resources
  - Does not require cooperation between ISPs and peers
  - Implementation of the proposition as part of the Azureus client, codename ‘Ono’
- Tradeoffs
  - Performance (avg. download time) vs. fairness (avg. share ratio)


BitTorrent Tradeoffs

Taken from the paper “The Delicate Tradeoffs in BitTorrent-like File Sharing Protocol Design” [3]
- Peers that participate in BT are heterogeneous with regard to download and upload capacities
- Taking a system approach:
  - The system throughput depends critically on the “fat” peers
  - However, this might result in unfairness towards those who contribute more
  - This in turn would encourage peers to supply a low upload rate to others


BitTorrent Tradeoffs

- A user would like the download rate it gets to be proportional to the upload rate it supplies
- Assuming peers take the system overview: long-lasting, steady-state, rational
- Two parameters and their interrelations are explored:
  - Performance: minimum average download time
  - Fairness: ratio between give and take


Model

- W.L.O.G. the file is of size 1
- Assume an average peer arrival rate of $\lambda$
- Assume peers do not abort
- Upon completion, a peer leaves the torrent (BT provides no incentive for seeding)
- Assume n classes (types) of peers
  - For each new peer arrival, with probability $\alpha_i$ it belongs to type i
  - Thus, the average arrival rate for class i is $\lambda_i = \alpha_i \lambda$
  - Class i has $U_i$ and $D_i$ as upload and download capacities
  - Assume $U_1 > U_2 > \dots > U_n$ (type 1 are the “fat” ones…)


Model

Visualization of the model for n=2


Model – measuring performance

Assume (quite natural) that the bottleneck is the upload capacity i.e. no network bottleneck such as server

saturation The file uploading capacity of the entire

system is Consider the steady state:

Define as the average number of type i peers Approximation: Substitute the later in the former, in s.s.:


Model – measuring performance

In steady state, the capacity should be equal to the arrival rate (as the file size is 1) Obtain Define “share-ratio” and rewrite as

In a steady and balanced system, share-ratio should be 1

Average system download time Those two equations define the solution

space, as well as the resulting performance Feasible solutions are the set
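As a sketch of how the two equations interact, assume that α_i is the fraction of arriving peers of type i, u_i the upload rate and d_i the download rate of type i, that the steady-state balance condition takes the form Σ α_i·u_i/d_i = 1 (each share-ratio being u_i/d_i), and that the average download time for a file of size 1 is T = Σ α_i/d_i. These symbols and both formulas are assumptions made for illustration; the function names and the numbers are hypothetical.

```python
def balance(alpha, upload, download):
    """Weighted sum of share-ratios u_i / d_i; the assumed steady state requires 1."""
    return sum(a * u / d for a, u, d in zip(alpha, upload, download))

def average_download_time(alpha, download):
    """Assumed T = sum_i alpha_i / d_i for a file of size 1."""
    return sum(a / d for a, d in zip(alpha, download))

if __name__ == "__main__":
    alpha = [0.5, 0.5]    # half "fat", half "thin" peers (hypothetical)
    upload = [2.0, 1.0]   # upload rates of the two types (hypothetical)
    for label, download in [("equal download", [1.5, 1.5]),
                            ("download = upload", [2.0, 1.0])]:
        print(label, balance(alpha, upload, download),
              average_download_time(alpha, download))
```

Both assignments satisfy the balance condition, yet they yield different average download times, which is the kind of freedom within the feasible set that the rest of the analysis explores.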


Model – measuring fairness

Consider share-ratio as a good and natural measure of fairness

Define fairness index: Measures how equal the ratios are (if all are

the same, it equals 1) (after some work) Obtain: Also expressed in terms of upload and

download of each class


Rate strategies

- General assumption: all peers maximize their upload capacity
  - Based on experiments
- We want the optimal average download time T
  - Solve the constrained problem
  - Use the Lagrange multiplier method


Rate strategies

Optimal average download time T that is obtained is:

We get an assignment for The system gives the “thin” peers (other

than type-1) maximum upload capacity “Thin” peers get more than they contribute Calculate the fairness index under this

solution (shown to be quite low)


Rate strategies

Now, apply the same strategy to achieve optimal fairness Then check what is the resulting

performance measure Optimal fairness is achieved when We get different assignments for Compare the two:

In terms of system performance, we have: In terms of system fairness, we have:


Entire design space

Actually those are only two of infinite solutions for assigning

The space lies on a curve Example: a system of two types with

specific capacities


Simulation results

Experiments with a two-type system (with the same characteristics as the last system)

Average downloading time:

Fairness:


Simulation results

- Fundamental tradeoffs: taken for the extreme values of the two strategies:
- Summary: cannot enjoy the best of both worlds
- Current BitTorrent implementations lie somewhere on the curve


Think about …

With regard to the first paper (service availability), how was the ~8.3 average torrent lifespan deduced?

Thank you (those who are still awake…)


Papers

[1] Measurements, Analysis, and Modeling of BitTorrent-like Systems. Lei Guo, Songqing Chen, Zhen Xiao, Enhua Tan, Xiaoning Ding, and Xiaodong Zhang

[2] Taming the Torrent - A Practical Approach to Reducing Cross-ISP Traffic in Peer-to-Peer Systems. David R. Choffnes and Fabián E. Bustamante

[3] The Delicate Tradeoffs in BitTorrent-like File Sharing Protocol Design. Bin Fan, Dah-Ming Chiu, and John C.S. Lui