Peer-to-peer systems have become increasingly popular
◦ Millions of simultaneous users
◦ Significant percentage of Internet traffic
BitTorrent is one of the most popular p2p applications
◦ Responsible for 35% of all Internet traffic [Parker05]
BitTorrent is important because of
◦ Its popularity
◦ Its impact on the network
Scalable one-to-many peer-to-peer file distribution
Overlay: unstructured, random, high degree
Swarming
◦ File is divided into segments
◦ Segments are randomly distributed among peers; each peer gets the rarest segment first
Contribution
◦ Peers exchange segments and contribute their outgoing bandwidth
◦ Incentive: tit-for-tat
Tracker
◦ Torrent coordinator
◦ Periodic peer status updates
Performance: intuitively depends on
◦ Peer properties (bandwidth, contribution, etc.)
◦ Group properties (population, content availability, churn)
Introduction
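The "rarest segment first" rule in the swarming bullets can be sketched as follows. This is a minimal illustration, not BitTorrent client code: the function name `rarest_first`, the dictionary of peer holdings, and the deterministic tie-break by segment index are all assumptions made for the example (real clients typically break ties randomly).

```python
from collections import Counter

def rarest_first(needed, peer_segments):
    """Return the needed segment held by the fewest peers, or None.

    needed: set of segment indices this peer still lacks.
    peer_segments: hypothetical mapping peer id -> set of segments it holds.
    """
    counts = Counter()
    for segments in peer_segments.values():
        # Only count segments that we still need and that someone can serve.
        counts.update(segments & needed)
    if not counts:
        return None
    # Assumption: break ties by segment index to keep the example deterministic.
    return min(counts, key=lambda seg: (counts[seg], seg))

peers = {"a": {0, 1, 2}, "b": {1, 2}, "c": {2, 3}}
print(rarest_first({0, 1, 2, 3}, peers))  # segments 0 and 3 are each held once; 0 wins the tie
```

Fetching rare segments early spreads them across the swarm, which is one reason content availability can stay high even when individual peers leave.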
1. Modeling and analytical studies
2. Simulation studies
3. Empirical studies
◦ Capture BitTorrent system properties in operation through measurement (instrumented clients) [Legout06]
◦ Group properties [Izal04]: population, average content availability, ...
◦ No explicit notion of performance
◦ No study of the factors underlying peer performance
Related work
Characterization:
◦ Understanding group-level and peer-level properties in a torrent
Analysis:
◦ What are the main factors that affect the performance observed by individual peers?
Common approach: instrumented clients
◦ Detailed and flexible
◦ Representative?
Our approach: tracker logs
◦ Coarse granularity (30 min)
◦ Global view
Data Sets
Methodology/Approach
[Figure: tracker and tracker log file]
Tracker log sets
Source   #Torrents  Start date  End date  #Reports  #Sessions
RedHat   1          3/03        8/03      2M        170k
Debian   1599       2/05        3/05      32M       1268k
Games    2585       8/03        12/04     38M       4416k

Selected torrents
Torrent  File size  #Sessions, rank  Duration
RedHat   1.8GB      170k, 3rd        146d
Debian   677MB      139k, 6th        51d
Games    363MB      195k, 2nd        66d
Session:
◦ Set of all updates from a particular peer, from its arrival until its departure
Peer-level properties:
◦ Represent the peer's status during a session:
◦ Average download rate
◦ Average upload rate
Methodology
[Figure: one session from session start to download complete; the studied zone is the leeching phase; slopes between updates give the upload and download rates, from which the avg. download rate is derived]
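A minimal sketch of how the per-session averages could be derived from tracker updates, restricted to the leeching phase. It assumes each update carries cumulative `uploaded`, `downloaded`, and `left` byte counters (the fields a BitTorrent client reports in its announce); the tuple layout and the function name `session_rates` are hypothetical, and the actual log format used in the study is not given here.

```python
def session_rates(updates):
    """Average download and upload rates (bytes/s) over the leeching phase.

    updates: list of (time_s, uploaded_bytes, downloaded_bytes, left_bytes)
    tuples for one session, sorted by time (assumed layout).
    Returns (avg_download_rate, avg_upload_rate), or None if not computable.
    """
    if len(updates) < 2:
        return None
    start = updates[0]
    # Leeching ends at the first update that reports nothing left to download;
    # if the download never completes, use the last update of the session.
    end = next((u for u in updates if u[3] == 0), updates[-1])
    elapsed = end[0] - start[0]
    if elapsed <= 0:
        return None  # e.g. the peer arrived as a seed
    return (end[2] - start[2]) / elapsed, (end[1] - start[1]) / elapsed

session = [
    (0, 0, 0, 1000),
    (1800, 200, 600, 400),
    (3600, 500, 1000, 0),   # download complete
    (5400, 900, 1000, 0),   # seeding, outside the studied zone
]
print(session_rates(session))
```

With 30-minute updates, the rates are averages over coarse intervals, so short bursts within an interval are not visible.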
Group-level properties: population, average content availability, churn
Sampling approach:
◦ Once every τ minutes
◦ Use the last update before and the first update after each sample
◦ Interpolation
◦ Averaging across peers
τ determines the sampling resolution
τ > average update interval
Peer view:
◦ Average of the samples taken during the peer's download time
Measurement methodology
[Figure: peer update times on a timeline sampled every τ]
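The sampling approach can be sketched as below: at each sample time, every peer's value is linearly interpolated between its last update before and first update after the sample, and the values are averaged across peers. The data layout, the function names, and the choice to count a peer as present only between its first and last update are assumptions for illustration.

```python
from bisect import bisect_right

def interpolate(updates, t):
    """Linearly interpolate one peer's value at time t.

    updates: list of (time, value) pairs sorted by time, e.g. the fraction
    of the file the peer holds. Returns None outside the peer's session.
    """
    if not updates or t < updates[0][0] or t > updates[-1][0]:
        return None
    times = [u[0] for u in updates]
    i = bisect_right(times, t)
    if i == len(updates):  # t coincides with the last update
        return updates[-1][1]
    (t0, v0), (t1, v1) = updates[i - 1], updates[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def sample_group(peers, tau, start, end):
    """Sample every tau time units; return (time, population, average) tuples."""
    samples = []
    t = start
    while t <= end:
        values = [v for v in (interpolate(u, t) for u in peers.values())
                  if v is not None]
        avg = sum(values) / len(values) if values else None
        samples.append((t, len(values), avg))
        t += tau
    return samples

def peer_view(samples, t_start, t_end):
    """Average of the group samples taken during one peer's download time."""
    vals = [avg for t, _, avg in samples
            if t_start <= t <= t_end and avg is not None]
    return sum(vals) / len(vals) if vals else None

peers = {"p1": [(0, 0.0), (60, 1.0)], "p2": [(30, 0.0), (90, 0.5)]}
samples = sample_group(peers, tau=30, start=0, end=90)
print(samples)
print(peer_view(samples, 30, 90))
```

Choosing τ larger than the average update interval, as the slide requires, means each sample is bracketed by real updates for most peers rather than by long interpolated gaps.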
Is download rate a good performance metric?
◦ A reference is needed to evaluate a peer's download rate
◦ Ideally, peer performance is:
$$\text{Utilization} = \frac{\text{Avg\_Download\_Rate}}{\text{In\_Bandwidth}}$$
◦ Accurate measurement of utilization is difficult
We use the maximum observed download rate as a (lower-bound) estimate of incoming bandwidth.
The standard deviation of the download rate captures its stability
◦ Rates close to the average -> higher performance
◦ Normalization -> comparability
Two performance metrics:
$$\text{Util} = \frac{\text{Avg\_Download\_Rate}}{\text{Max\_Observed\_Download\_Rate}}$$
$$\text{Norm\_Sdev\_Download\_Rate} = \frac{\text{Sdev}}{\text{Avg\_Download\_Rate}}$$
Methodology
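A short sketch of the two performance metrics, computed from the per-interval download rates of one session. The function name and the use of the population standard deviation are assumptions; the slides do not say which standard deviation estimator was used.

```python
from statistics import mean, pstdev

def performance_metrics(download_rates):
    """Return (util, norm_sdev) for one session, or None if undefined.

    download_rates: download rates observed between consecutive updates
    during the leeching phase (any consistent unit).
    """
    avg = mean(download_rates)
    if avg <= 0:
        return None
    # The maximum observed rate stands in for incoming bandwidth (a lower
    # bound), so util is at most 1 by construction.
    util = avg / max(download_rates)
    # Assumption: population standard deviation, normalized by the average
    # so that peers with different bandwidths are comparable.
    norm_sdev = pstdev(download_rates) / avg
    return util, norm_sdev

print(performance_metrics([100, 200, 300]))
print(performance_metrics([50, 50, 50]))  # perfectly stable: util 1.0, norm_sdev 0.0
```

Because the maximum observed rate can only underestimate the true incoming bandwidth, the computed utilization can only overestimate the ideal one.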
Similar distribution across 3 different torrents
Utilization has an almost uniform distribution
◦ Nearly fixed probability density
90% show a nearly uniform distribution
Diverse performance
No dominant modes
Characterization Results/Peer-Properties
Content availability
◦ 75% of peers in RH observe an average content availability of 50%
◦ No content shortage
Avg. population
◦ Very different across torrents
◦ Flash crowd in RH
Characterization Results/Peer-view of group properties
[Figure annotation: initial flash crowd]
Underlying factors
Recall the second question:
◦ What are the peer- or group-level properties that primarily determine the performance observed by individual peers in a torrent?
Performance metrics:
◦ Utilization and stability
Possible underlying factors:
◦ Group-level properties: population, churn, content availability
◦ Peer-level properties: upload rate, etc.
Approach to identify underlying factors:
◦ Scatter plots
◦ Linear regression (using S-Plus)
◦ Spearman's rank correlation (S-Plus)
Utilization vs. average group content availability
◦ No obvious correlation
Utilization vs. average group population
◦ Vertical patterns
◦ No obvious correlation
Statistical Analysis/Scatter-plots
Model      R-Square  outbw.50p  avg.grp.pop  avg.grp.cont.avail  avg.grp.churn
util       0.0651    0.0091     -0.1206      0.3493              0.0015
util-log   0.0603    0.0965     -0.0311      0.4367              0
util-step  0.0603    0.0965     -0.0309      0.4358              removed
sdev       0.0709    -0.0142    0.2245       -0.3344             -0.0029
sdev-log   0.0741    -0.1585    0.0778       -0.6486             -0.0005
sdev-step  0.0741    -0.1585    0.0778       -0.6486             -0.0005
Suggested techniques yield only marginal improvement in R-squared
No single parameter has a dominant effect
Seed percentage was removed by step(), which suggests the number of seeds is sufficient
Statistical Analysis/Linear Regression
Several values to consider:
R-squared measures goodness of fit, in the range [0, 1]
P-value: the "probability of obtaining a result as impressive" just by chance
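To make the R-squared figure concrete, the sketch below fits a least-squares line with a single predictor and reports the fraction of variance it explains. This is only an illustration: the study ran multi-variable regressions in S-Plus, and the function name and sample data here are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot

# Made-up data, not taken from the tracker logs.
a, b, r2 = fit_line([1, 2, 3, 4], [1, 3, 2, 4])
print(a, b, r2)
```

Read this way, the R-square values of roughly 0.06 to 0.07 in the table mean the fitted models explain only about 6 to 7 percent of the variance in the performance metrics.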
Torrent  Perf.      up.dev  Avg.Pop  Avg.Cont  Avg.Churn
RH       inbw.util  -0.46   -0.13    0.05      -0.12
RH       inbw.sdev  0.49    0.2      -0.03     0.19
DE       inbw.util  -0.42   -0.02    0.1       -0.02
DE       inbw.sdev  0.47    0.03     -0.1      0
GA       inbw.util  -0.36   -0.05    0.04      -0.05
GA       inbw.sdev  0.47    0.14     -0.11     0.14
Highest correlation with deviation of upload rate for all torrents -> Tit-for-tat effect
Two perf. metrics are similarly affected with opposite signs
GA: Little correlation with util. -> unreliable metric
DE: Slightly larger effect from content avail.
Statistical Analysis/Spearman’s Rank correlation
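Spearman's rank correlation is the Pearson correlation of the ranks of the two variables, so it detects any monotonic relationship rather than only a linear one. The sketch below is a plain-Python illustration with average ranks for ties; the study itself used S-Plus, and the function names here are hypothetical.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

print(ranks([10, 20, 20, 30]))
print(spearman([1, 2, 3, 4], [1, 8, 27, 64]))  # monotonic but nonlinear
```

A rank-based measure suits this analysis because rates and populations are heavily skewed, and a few extreme peers would otherwise dominate a linear correlation.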
Conclusions
◦ No single factor determines the performance observed by peers
◦ Outgoing bandwidth seems to have the largest effect -> tit-for-tat is working
◦ There often appears to be a sufficient number of seeds available (non-factor in performance)
◦ Capturing comparable performance is hard
◦ Performance of the peers in a torrent is rather diverse -> instrumented clients cannot reflect a representative picture
Future work
◦ Active monitoring of BitTorrent
◦ BitTorrent overlay topology using the peer exchange feature
◦ Characterizing new features: DHT, super-seeding, peer exchange