Upload
hector-barber
View
217
Download
0
Tags:
Embed Size (px)
Citation preview
P2P Networking
2/51
What is peer-to-peer (P2P)?
“Peer-to-peer is a way of structuring distributed applications such that the individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server.”-- Charter of Peer-to-peer Research Group, IETF/IRTF, June 24, 2004(http://www.irtf.org/charters/p2prg.html)
3/51
Client/Server Architecture
GET /index.html HTTP/1.0
HTTP/1.1 200 OK ...Client Server
4/51
Disadvantages of C/S Architecture
Single point of failure Strong expensive server Dedicated maintenance (a sysadmin) Not scalable - more users, more
servers
5/51
The Client Side
Today’s clients can perform more roles than just forwarding users requests
Today’s clients have:more computing powermore storage space
Thin client Fat client
6/51
Evolution at the Client Side
IBM PC @ 4.77MHz
360k diskettes
PC @ 4-core 4GHz
300GB HD
DEC’S VT100No storage
‘70 ‘80 2008
7/51
What Else Has Changed?
The number of home PCs is increasing rapidly Most of the PCs are “fat clients” As the Internet usage grow, more and more PCs
are connecting to the global net Most of the time PCs are idle
How can we use all this?
8/51
Resources Sharing
What can we share? Computer resources
Shareable computer resources: CPU cycles - seti@home, GIMPS Bandwidth - PPLive, PPStream Storage Space - OceanStore, Murex Data - Napster, Gnutella
9/51
SETI@Home
SETI – Search for Extra-Terrestrial Intelligence
@Home – On your own computer A radio telescope in Puerto Rico scans the
sky for radio signals Fills a DAT tape of 35GB in 15 hours That data have to be analyzed
10/51
SETI@Home (cont.)
The problem – analyzing the data requires a huge amount of computation
Even a supercomputer cannot finish the task on its own
Accessing a supercomputer is expensive What can be done?
11/51
SETI@Home (cont.)
Can we use distributed computing?YEAH
Fortunately, the problem can be solved in parallel - examples:Analyzing different parts of the skyAnalyzing different frequenciesAnalyzing different time slices
12/51
SETI@Home (cont.)
The data can be divided into small segments
A PC is capable of analyzing a segment in a reasonable amount of time
An enthusiastic UFO searcher will lend his spare CPU cycles for the computationWhen? Screensavers
13/51
SETI@Home - Example
14/51
SETI@Home - Summary
SETI reverses the C/S modelClients can also provide servicesServers can be weaker, used mainly for storage
Distributed peers serving the centerNot yet P2P but we’re close
Outcome - great results:Thousands of unused CPU hours tamed for the
mission3+ millions of users
15/51
16/51
17/51
MUREX: A Mutable Replica Control Scheme for Peer-to-Peer Storage Systems
Murex: Basic Concept
HotOSAttendee
Peer-to-Peer Video Streaming
… …Video stream
Peer-to-Peer Video Streaming
22/51
Napster -- Shawn Fanning
23/51
24/51
History of Napster (1/2)
5/99: Shawn Fanning (freshman, Northeastern University) founds Napster Online (supported by Groove)
12/99: First lawsuit 3/00: 25% Univ. of Wisconsin traffic on
Napster
25/51
History of Napster (2/2)
2000: estimated 23M users
7/01: simultaneous online users 160K 6/02: file bankrupt … 10/03: Napster 2 (Supported by Roxio)
(users should pay $9.99/month)
1984~2000, 23M domain names are countedvs.
16 months, 23M Napster-style names are registered at Napster
26/51
Napster Sharing Style: hybrid center+edge
“slashdot”•song5.mp3•song6.mp3•song7.mp3
“kingrook”•song4.mp3•song5.mp3•song6.mp3
•song5.mp3
1. Users launch Napster and connect to Napster server
3. beastieboy enters search criteria
4. Napster displays matches to beastieboy
2. Napster creates dynamic directory from users’ personal .mp3 libraries
Title User Speed song1.mp3 beasiteboy DSLsong2.mp3 beasiteboy DSLsong3.mp3 beasiteboy DSLsong4.mp3 kingrook T1song5.mp3 kingrook T1song5.mp3 slashdot 28.8song6.mp3 kingrook T1song6.mp3 slashdot 28.8song7.mp3 slashdot 28.8
5. beastieboy makes direct connection to kingrook for file transfer
s o n g 5
“beastieboy”•song1.mp3•song2.mp3•song3.mp3
27/51
Gnutella History Gnutella was written by Justin Frankel, the
21-year-old founder of Nullsoft. (Nullsoft acquired by AOL, June 1999) Nullsoft (maker of WinAmp) posted Gnutella
on the Web, March 14, 2000. A day later AOL yanked Gnutella, at the
behest of Time Warner. Too late: 23k users on Gnutella People had already downloaded and shared
the program. Gnutella continues today, run by independent
programmers.
28/30
Gnutella -- Justin Frankel and Tom Pepper
29/51
The ‘Animal’ GNU: Either of two large African antelopes (Connochaetes gnou or C. taurinus) having a drooping mane and beard, a long tufted tail, and curved horns in both sexes. Also called wildebeest.
Gnutella =
GNU: Recursive AcronymGNU’s Not Unix ….
+
Nutella: a hazelnut chocolate spread produced by the Italian
confectioner Ferrero ….
GNU
Nutella
30/51
GNUGNU's Not Unix 1983 Richard Stallman (MIT) established Free Software Foundation and Proposed GNU ProjectFree software is not freewareFree software is open source softwareGPL: GNU General Public License
31/51
About Gnutella No centralized directory servers Pings the net to locate Gnutella friends File requests are broadcasted to friends
Flooding, breadth-first search When provider located, file transferred via
HTTP History:
3/14/00: release by AOL, almost immediately withdrawn
32/51
Peer-to-Peer Overlay Network
Focus at the application layer
33/51
Peer-to-Peer Overlay Network
Internet
End systems
one hop(end-to-end comm.)
a TCP thru the Internet
Gnutella ProtocolScenario: Joining Gnutella Network
A
Gnutella Network The new node connects to a
well known ‘Anchor’ node or ‘Bootstrap’ node.
Then sends a PING message to discover other nodes.
PONG messages are sent in reply from hosts offering new connections with the new node.
Direct connections are then made to the newly discovered nodes.
NewPING
PINGPING
PINGPING
PINGPING
PINGPING
PING
PONG
PONG
35/51
Topology of a Gnutella Network
36/51
xyz.mp3 ?
Gnutella: Issue a Request
37/51
Gnutella: Flood the Request
38/51
xyz.mp3
Gnutella: Reply with the FileFully distributed storage and directory!
39/51
So Far
Centralized :
- Directory size – O(n)
- Number of hops – O(1) Flooded queries:
- Directory size – O(1)
- Number of hops – O(n)
n: number of participating nodes
40/51
We Want
Efficiency : O(log(n)) messages per lookup
Scalability : O(log(n)) state per node Robustness : surviving massive
failures
41/51
How Can It Be Done?
How do you search in O(log(n)) time?Binary search
You need an ordered array How can you order nodes in a network and data
objects? Hash function!
42/51
Example of Hasing
Shark SHA-1
Object ID (key):DE11AC
SHA-1
Object ID (key):AABBCC
194.90.1.5:8080
43/51
Basic Idea
Hash key
Object “y”
Objects have hash keys
Peer “x”Peer nodes also have hash keys in the same hash space
P2P Network
y xH(y) H(x)
Join (H(x))
Publish (H(y))
Place object to the peer with closest hash keys
44/51
Mapping an object to the closest node with a larger key
0 M
- an data object
- a node
45/51
Viewed as a Distributed Hash Table
Hashtable
0 2128-1
Peernode
Internet
46/51
DHT Distributed Hash Table Input: key (file name)
Output: value (file location) Each node is responsible for a range of the hash
table, according to the node’s hash key. Objects’ directories are placed in (managed by) the node with the closest key
It must be adaptive to dynamic node joining and leaving
47/51
How to Find an Object?
Hashtable
0 2128-1
Peernode
48/51
Simple Idea Track peers which allow us to move quickly
across the hash space a peer p tracks those peers responsible for hash keys
(p+2i-1), i=1,..,m
Hashtable
0 2128-1
Peernode
i+22 i+24 i+28i
49/46
Chord Lookup – with finger tableStart Int. node
2+1 [3,4) 3
2+2 [4,6) 7
2+4 [6,10) 7
2+8 [10,2) 10
12
14 2
10
15 1
3
7
I’m node 2.Please find key 14!
Start Int. node
10+1 [11,12) 12
10+2 [12,14) 12
10+4 [14,2) 14
10+8 [2,10) 2
O(log n) hops (messages)for each lookup!!
14 [10,2)∈
14 [14,2)∈
Circular 4-bitID space
O(log n) states per node
50/51
Hybrid P2P – Preserves some of the traditional C/S architecture. A central server links between clients, stores indices tables, etc Napster
Unstructured P2P – no control over topology and file placement Gnutella, Morpheus, Kazaa, etc
Structured P2P – topology is tightly controlled and placement of files are not random Chord, CAN, Pastry, Tornado, etc
Classification of P2P systems
51/51
Q&A