The BitTorrent Protocol

Bit torrent protocol seminar by Sanjay R



Common Scenario

• Millions want to download the same popular huge files (for free)
  – Software
  – Media (the real example!)

• Client-server model fails
  – Single server fails
  – Can’t afford to deploy enough servers


Client-Server

[Figure: a single source serves many “interested” end-hosts through routers; the source is overloaded.]


Peer-to-Peer

• A model of communication where every node in the network acts alike.

• As opposed to the Client-Server model, where one node provides services and other nodes use the services.


Advantages of P2P Computing

• No central point of failure
  – E.g., the Internet and the Web do not have a central point of failure.
  – Most internet and web services use the client-server model (e.g. HTTP), so a specific service does have a central point of failure.

• Scalability
  – Since every peer is alike, it is possible to add more peers to the system and scale to larger networks.

Disadvantages of P2P Computing

• Decentralized coordination
• Not all nodes are created equal


BitTorrent

• Written by Bram Cohen in 2001
• Designed to transfer large files
• 160 million clients, 100 million active users
• Used by many different people and organisations
• The more popular a large video, audio or software file, the faster and cheaper it can be transferred with BitTorrent


• “Pull-based” “swarming” approach
  – Each file split into smaller pieces
  – Nodes request desired pieces from neighbors
    • As opposed to parents pushing data that they receive
  – Pieces not downloaded in sequential order

• Encourages contribution by all nodes
• Peer-to-peer in nature
• Even if clients join simultaneously (“flash crowd”)
• The BitTorrent protocol is implemented in applications called BitTorrent clients, such as uTorrent and BitComet.


BitTorrent Terminology

• Peer - A node or computer that does not have the complete file

• Seed or seeder - A computer with a complete copy of a BitTorrent file

• Swarm - A group of computers simultaneously sending (uploading) or receiving (downloading) the same file

• .torrent - A pointer file that directs your computer to the file you want to download

• Tracker - A server that manages the BitTorrent file-transfer process


BitTorrent Swarm

• Swarm
  – Set of peers all downloading the same file
  – Organized as a random mesh

• Each node knows the list of pieces downloaded by its neighbors

• Node requests pieces it does not own from neighbors


A *.torrent guides users to owners of a file

1. User obtains *.torrent file. File contains meta info about a target file.

2. User loads *.torrent file into BitTorrent client, which then looks up the named tracker.

3. Tracker coordinates peers.

4. Armed with a list of peers holding pieces of the file, user downloads from many peers.
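
When the client looks up the tracker, the original HTTP tracker protocol uses a plain GET request on the announce URL taken from the *.torrent file. The following is a minimal sketch of how such a request URL could be built. The query parameter names follow the published protocol, but the function name `build_announce_url`, the tracker address, the peer ID and the port are illustrative assumptions, not values from these slides.

```python
import hashlib
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, port, left):
    """Return the GET URL a client would send to an HTTP tracker (sketch)."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    params = {
        "info_hash": info_hash,  # SHA1 of the bencoded info dictionary
        "peer_id": peer_id,      # 20-byte ID chosen by the client
        "port": port,            # port this client listens on
        "uploaded": 0,           # bytes uploaded so far
        "downloaded": 0,         # bytes downloaded so far
        "left": left,            # bytes still missing
        "event": "started",      # first announce for this download
    }
    # urlencode percent-encodes the raw bytes of info_hash and peer_id.
    return announce + "?" + urlencode(params)


# Placeholder values, for illustration only.
example_hash = hashlib.sha1(b"placeholder info dictionary").digest()
example_peer_id = b"-XX0001-" + b"0" * 12
url = build_announce_url(
    "http://tracker.example.com/announce",
    example_hash, example_peer_id, port=6881, left=2560,
)
```

The tracker's reply (not shown) carries the list of peers that the client then contacts directly.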


All peers act as a source

Peers exchange different pieces of the file with one another until they assemble the whole file.

As soon as a user has a piece of the file on their machine, they can become a source of that piece to other peers, helping to speed up the download.

A machine with a complete copy (the seed) can distribute pieces of the file to multiple peers.


The key ingredients of the *.torrent file are the tracker’s address and the unique SHA1 hash

• All data in a metainfo file is bencoded.
• info: a dictionary that describes the file(s) of the torrent. It contains, among others:
  – piece length: number of bytes in each piece (integer)
  – pieces: string consisting of the concatenation of all 20-byte SHA1 hash values, one per piece
• announce: contains the URL of the “tracker”
• creation date
• comment: comments from the author (optional)
• created by: (optional)
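
To make the metainfo layout concrete, here is a small sketch that bencodes a single-file torrent description, builds the `pieces` string, and checks a downloaded piece against it. It is an illustration under stated assumptions rather than a full implementation: the helper names (`bencode`, `piece_hashes`, `build_metainfo`, `info_hash`, `verify_piece`), the file name, the tracker URL and the tiny piece length are all made up for the example. The `name` and `length` keys are the ones the protocol uses for a single-file torrent.

```python
import hashlib


def bencode(value):
    """Encode ints, byte/text strings, lists and dicts in bencode format."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def piece_hashes(data, piece_length):
    """Concatenate the 20-byte SHA1 digest of every piece."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def build_metainfo(name, data, piece_length, announce):
    """Build a single-file metainfo dictionary (hypothetical helper)."""
    info = {
        "name": name,
        "length": len(data),
        "piece length": piece_length,
        "pieces": piece_hashes(data, piece_length),
    }
    return {"announce": announce, "info": info}


def info_hash(metainfo):
    """SHA1 of the bencoded info dictionary; identifies the torrent."""
    return hashlib.sha1(bencode(metainfo["info"])).digest()


def verify_piece(metainfo, index, piece):
    """Check a downloaded piece against its hash in the metainfo."""
    expected = metainfo["info"]["pieces"][20 * index:20 * index + 20]
    return hashlib.sha1(piece).digest() == expected


# Toy data: 2560 bytes split into 1024-byte pieces gives three pieces.
data = bytes(range(256)) * 10
meta = build_metainfo("example.bin", data, 1024, "http://tracker.example.com/announce")
torrent_bytes = bencode(meta)  # what would be written to the .torrent file
```

Real torrents use much larger pieces than the 1024 bytes chosen here; the small value only keeps the example easy to follow.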


BitTorrent Download

• Download and install the BitTorrent client software

• Check and configure firewall and/or router for BitTorrent (if applicable)

• Find files to download

• Download and open the .torrent pointer file

• Let BitTorrent give and receive pieces of the file

• Stay connected after the download completes to share the downloaded files with others


Upload and Publish File

• Download and install the BitTorrent client software

• Create a new .torrent file

• Publish the .torrent file on torrent index sites such as The Pirate Bay


Peer-peer transactions: Choosing pieces to request

• Rarest-first: Look at all pieces at all peers, and request the piece that’s owned by the fewest peers
  – Increases diversity in the pieces downloaded
    • Avoids the case where a node and each of its peers have exactly the same pieces; increases throughput
  – Increases the likelihood that all pieces are still available even if the original seed leaves before any one node has downloaded the entire file
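
Rarest-first selection can be sketched in a few lines. This is an illustrative version, not the code of any particular client: it assumes each neighbor's pieces are known as a set of piece indices, and the function name `rarest_first` and the random tie-break are choices made for the example.

```python
import random
from collections import Counter


def rarest_first(have, peer_pieces, rng=random):
    """Return the index of a needed piece held by the fewest neighbors.

    have        -- set of piece indices this node already owns
    peer_pieces -- dict mapping a neighbor to the set of pieces it owns
    Returns None when no neighbor offers a piece we still need.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces - have)  # only count pieces we still need
    if not counts:
        return None
    fewest = min(counts.values())
    candidates = sorted(p for p, n in counts.items() if n == fewest)
    return rng.choice(candidates)  # break ties at random


neighbors = {"A": {0, 1, 2}, "B": {0, 1}, "C": {0, 1}}
next_piece = rarest_first(set(), neighbors)  # piece 2 is held by only one neighbor
```

Breaking ties at random keeps neighbors from all requesting the same piece at once, which supports the diversity goal described above.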


Choosing pieces to request

• Random First Piece:
  – When a peer starts to download, request a random piece.
    • So as to assemble the first complete piece quickly
    • Then participate in uploads
  – When the first complete piece is assembled, switch to rarest-first
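
The switch between the two policies can be expressed as a single selection function. The sketch below assumes the same set-based view of neighbor pieces as a typical client would keep; the name `choose_piece` and the data layout are hypothetical.

```python
import random
from collections import Counter


def choose_piece(have, peer_pieces, rng=random):
    """Pick the next piece: random while we own nothing, rarest-first after."""
    available = Counter()
    for pieces in peer_pieces.values():
        available.update(pieces - have)
    if not available:
        return None
    if not have:
        # Random first piece: any available piece will do, so that we
        # quickly own something we can upload to others.
        return rng.choice(sorted(available))
    # At least one complete piece is assembled: switch to rarest-first.
    fewest = min(available.values())
    return rng.choice(sorted(p for p, n in available.items() if n == fewest))


neighbors = {"A": {0, 1, 2}, "B": {0, 1}}
first = choose_piece(set(), neighbors, random.Random(0))  # any of 0, 1, 2
later = choose_piece({0}, neighbors)                      # 2, held by A only
```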


Why BitTorrent took off

• Better performance through “pull-based” transfer
  – Slow nodes don’t bog down other nodes

• Allows uploading from hosts that have downloaded parts of a file


Why BitTorrent took off

• Practical Reasons (perhaps more important!)
  – Working implementation (Bram Cohen) with simple well-defined interfaces for plugging in new content
  – Many recent competitors got sued / shut down
    • Napster, Kazaa
  – Users use well-known, trusted sources to locate content
    • Avoids the pollution problem, where garbage is passed off as authentic content


Pros and cons of BitTorrent

• Pros
  – Proficient in utilizing partially downloaded files
  – Discourages “freeloading”
    • By rewarding fastest uploaders
  – Encourages diversity through “rarest-first”
    • Extends lifetime of swarm

• Works well for “hot content”
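
Rewarding the fastest uploaders is usually realized by a "choking" rule: a node uploads only to the neighbors that have recently sent it the most data. The sketch below is one hedged way to express that idea. The slot count, the extra randomly chosen "optimistic" slot (which gives newcomers a chance to prove themselves) and the name `choose_unchoked` are assumptions for illustration and are not taken from these slides.

```python
import random


def choose_unchoked(upload_rates, slots=3, rng=random):
    """Return the set of neighbors this node will upload to in this round.

    upload_rates -- dict mapping a neighbor to the rate (bytes/s) at which
                    that neighbor has recently uploaded to us
    slots        -- number of regular slots given to the fastest uploaders
    """
    # Fastest uploaders first; ties are broken by name for determinism.
    ranked = sorted(upload_rates, key=lambda peer: (-upload_rates[peer], peer))
    unchoked = set(ranked[:slots])
    rest = ranked[slots:]
    if rest:
        # One extra slot goes to a random remaining neighbor, so that a
        # node with nothing to offer yet can still get started.
        unchoked.add(rng.choice(rest))
    return unchoked


rates = {"a": 50, "b": 10, "c": 40, "d": 0, "e": 30}
chosen = choose_unchoked(rates)  # always includes a, c and e
```

A neighbor that never uploads ("d" here) is served only when it happens to win the random slot, which is what makes freeloading unattractive.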


Pros and cons of BitTorrent

• Cons
  – Assumes all interested peers are active at the same time; performance deteriorates if the swarm “cools off”
  – Even worse: no trackers for obscure content


Pros and cons of BitTorrent

• Dependence on centralized tracker: pro/con?
  – Single point of failure: New nodes can’t enter the swarm if the tracker goes down
  – Lack of a search feature
    • Prevents pollution attacks
    • Users need to resort to out-of-band search: well-known torrent-hosting sites / plain old web search


“Trackerless” BitTorrent

• To be more precise, “BitTorrent without a centralized tracker”

• E.g.: Azureus
• Uses a Distributed Hash Table (Kademlia DHT)
• Tracker run by a normal end-host (not a web server anymore)
  – The original seeder could itself be the tracker
  – Or have a node in the DHT randomly picked to act as the tracker
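
In a Kademlia DHT, the nodes that take on the tracker role for a torrent are those whose IDs are "closest" to the torrent's hash, where closeness is the XOR of the two IDs read as an integer. Because IDs and hashes are effectively random, the choice of node looks random as well. The sketch below shows only this distance metric; the function names and the one-byte IDs in the example are illustrative assumptions (real IDs are 160 bits long).

```python
def xor_distance(a, b):
    """Kademlia distance: XOR of two equal-length IDs, read as an integer."""
    if len(a) != len(b):
        raise ValueError("IDs must have the same length")
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(target, node_ids, k=2):
    """Return the k node IDs closest to target under the XOR metric."""
    return sorted(node_ids, key=lambda node: xor_distance(node, target))[:k]


# One-byte IDs keep the example readable.
nodes = [b"\x01", b"\x04", b"\x07", b"\x0f"]
responsible = closest_nodes(b"\x05", nodes)  # distances: 4, 1, 2, 10
```

A peer looking for other members of the swarm would ask these closest nodes for the peer list instead of a central tracker.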


Why is (studying) BitTorrent important?

• BitTorrent accounts for a significant share of internet traffic today
  – In 2004, BitTorrent accounted for 35 to 60% of all internet traffic (according to CacheLogic)
  – BitTorrent has always been used for legal software distribution (e.g. Linux ISOs) too


Companies using BitTorrent Technology

• With help from BitTorrent, Facebook can now push hundreds of megabytes of new code to all servers worldwide in just a minute.

• Twitter is calling in the help of BitTorrent to deploy files across its many servers in a more efficient way. The project, dubbed ‘Murder’, is based on the open-source BitTornado BitTorrent client.


Conclusion

• BitTorrent is a well thought-out protocol that embraces aspects of cooperation and self-optimizing mechanisms.

• BitTorrent proposes solutions for current optimization and scalability problems.


Thank you for your attention.