Characterizing the File Hosting Service Ecosystem (ACM CoNEXT 2010)



Aniket Mahanti (1), Niklas Carlsson (2), Carey Williamson (1)

(1) University of Calgary (Canada), (2) Linköping University (Sweden)

Objective

File Hosting Services (FHS) such as Rapidshare and Megaupload have recently become popular. The decline of P2P file sharing has prompted various services, including FHS, to replace it. We propose a comprehensive multi-level characterization of the FHS ecosystem. We study four popular FHS: Rapidshare, Megaupload, Hotfile, and Mediafire. We devise a measurement framework to collect datasets from multiple vantage points. The work will highlight the content, usage, performance, infrastructure, and quality-of-service characteristics of FHS.

Methodology

1. Passive monitoring of campus traffic.
2. Active measurements via crawling.
3. Collect Web analytics data.
4. Supplementary data collection.
5. Data aggregation.
6. Secure storage of data.
7. Analyze data.
8. Apply the results.

Figure: FHS measurement and analysis methodology.
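As an illustration of how the passive monitoring and data aggregation steps might fit together, the following minimal Python sketch maps observed hostnames to the four services and sums bytes per service. The base domains, the function names `classify_host` and `aggregate_bytes`, and the `(host, byte_count)` record layout are assumptions made for this example; the poster does not describe the framework at this level of detail.

```python
from collections import defaultdict

# Assumed base domains for the four services named in the poster.
FHS_DOMAINS = {
    "rapidshare.com": "Rapidshare",
    "megaupload.com": "Megaupload",
    "hotfile.com": "Hotfile",
    "mediafire.com": "Mediafire",
}


def classify_host(host):
    """Return the service name for a hostname, or None if it is not an FHS."""
    host = host.lower().rstrip(".")
    for domain, service in FHS_DOMAINS.items():
        # Match the base domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return service
    return None


def aggregate_bytes(records):
    """Sum transferred bytes per service from (host, byte_count) records."""
    totals = defaultdict(int)
    for host, byte_count in records:
        service = classify_host(host)
        if service is not None:
            totals[service] += byte_count
    return dict(totals)
```

Matching on the domain suffix with a leading dot avoids counting unrelated hosts whose names merely end in the same characters.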

File Hosting Services

FHS offer several advantages over P2P technologies, such as higher availability of active files, improved privacy for users, hosting of diverse content, and economic incentive mechanisms for frequent uploaders.

Figure: FHS file upload and download process.

FHS versus P2P

We analyze how quickly content becomes available after it has been broadcast, and how many copies of the content are available on FHS and P2P. Users with high-speed connections can promptly upload content to the FHS and distribute the URLs on the Web. The FHS content is ready for consumption sooner than on BitTorrent.

FHS are an easy medium for users to make content available quickly, and there are many content replicas on FHS compared with BitTorrent.
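The two quantities compared above, time to availability and number of copies, can be computed as in the following minimal Python sketch. The function names and the input layout (a broadcast timestamp plus, for each platform, the times at which copies were first seen) are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def availability_delay_hours(broadcast_time, link_times):
    """Hours from broadcast until the earliest copy appears, or None if no copies."""
    if not link_times:
        return None
    first = min(link_times)
    return (first - broadcast_time).total_seconds() / 3600.0


def compare_platforms(broadcast_time, observations):
    """Summarize each platform as (delay_hours, replica_count).

    observations maps a platform name to a list of datetimes at which
    individual copies of the content were first observed.
    """
    return {
        platform: (availability_delay_hours(broadcast_time, times), len(times))
        for platform, times in observations.items()
    }
```

A platform with no observed copies yields a delay of `None` and a replica count of 0, so missing content is not mistaken for instant availability.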

FHS Content Properties

We study what type of content is being hosted on FHS. The majority of files were archive files, owing to the upload size limits imposed by FHS: users split large content into smaller parts and upload them part by part. Megaupload and Rapidshare have the oldest files. Hotfile, a newer service, had files that were less than a year old.

FHS are generally being used to host very large content, and the active files are being hosted for a long period of time.
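Because large content is uploaded as several archive parts, measuring content size means regrouping the parts. The following minimal Python sketch does this under an assumed naming convention of `<name>.part<N>.rar`; the pattern, the function name `group_parts`, and the `(filename, size_bytes)` input layout are illustrative assumptions, not details from the poster.

```python
import re
from collections import defaultdict

# Assumed naming convention for split archives: <name>.part<N>.rar
PART_PATTERN = re.compile(r"^(?P<base>.+)\.part(?P<index>\d+)\.rar$", re.IGNORECASE)


def group_parts(files):
    """Group (filename, size_bytes) pairs by the content they belong to.

    Returns {base_name: (part_count, total_bytes)}. Files that do not match
    the assumed pattern are treated as single-part content.
    """
    groups = defaultdict(lambda: [0, 0])
    for name, size in files:
        match = PART_PATTERN.match(name)
        base = match.group("base") if match else name
        groups[base][0] += 1
        groups[base][1] += size
    return {base: (count, total) for base, (count, total) in groups.items()}
```

Real listings use other split-archive conventions as well, so the pattern would need to be extended before being applied to crawled data.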

FHS Performance

We study the download rates for free and premium Rapidshare users. Premium users receive download rates an order of magnitude higher than free users. Rapidshare used a fixed throughput throttling rate for free users. We also study the relationship between file size and the wait time that Rapidshare imposes before free users can start their download. We observe three distinct file-size regions, within each of which the wait time increases linearly with file size.

FHS offer an order of magnitude faster downloads for premium users, and wait times increase linearly with file size.
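A wait time that grows linearly within each of three file-size regions corresponds to a piecewise-linear model. The following minimal Python sketch fits one least-squares line per region, given region boundaries supplied by the caller. The poster does not report the boundaries or slopes, so the breakpoints, units, and function names here are assumptions for illustration only.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_regions(samples, breakpoints):
    """Fit one line per file-size region.

    samples: (file_size, wait_time) pairs.
    breakpoints: ascending region boundaries; two values give three regions.
    Each region must contain at least two samples with distinct file sizes.
    Returns a list of (slope, intercept) tuples, one per region.
    """
    edges = [float("-inf")] + list(breakpoints) + [float("inf")]
    fits = []
    for low, high in zip(edges, edges[1:]):
        region = [(x, y) for x, y in samples if low <= x < high]
        fits.append(fit_line(region))
    return fits
```

With measured data, the breakpoints would first have to be estimated, for example by inspecting a scatter plot of wait time against file size.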