P2P Systems and Distributed Hash Tables
Section 9.4.2
COS 461: Computer Networks
Spring 2011
Mike Freedman
http://www.cs.princeton.edu/courses/archive/spring11/cos461/


P2P as Overlay Networking

• P2P applications need to:
  – Track identities & IP addresses of peers
    • May be many and may have significant churn
  – Route messages among peers
    • If you don't keep track of all peers, this is "multi-hop"

• Overlay network
  – Peers doing both naming and routing
  – IP becomes "just" the low-level transport

Early P2P

Early P2P I: Client-Server

• Napster
  – Client-server search
  – "P2P" file transfer

[Figure: Napster. 1. insert: a peer registers xyz.mp3 with the central server; 2. search: another peer asks the server "xyz.mp3?"; 3. transfer: the file moves directly between the two peers.]

Early P2P II: Flooding on Overlays

[Figure: a search for "xyz.mp3?" is flooded from peer to peer across the overlay until it reaches a peer holding xyz.mp3; the file is then transferred.]

Early P2P II: "Ultra/super peers"

• Ultra-peers can be installed (KaZaA) or self-promoted (Gnutella)
  – Also useful for NAT circumvention, e.g., in Skype

Lessons and Limitations

• Client-server performs well
  – But not always feasible: performance is often not the key issue!

• Things that flood-based systems do well
  – Organic scaling
  – Decentralization of visibility and liability
  – Finding popular stuff
  – Fancy local queries

• Things that flood-based systems do poorly
  – Finding unpopular stuff
  – Fancy distributed queries
  – Vulnerabilities: data poisoning, tracking, etc.
  – Guarantees about anything (answer quality, privacy, etc.)

Structured Overlays: Distributed Hash Tables

Basic Hashing for Partitioning?

• Consider problem of data partition:
  – Given document X, choose one of k servers to use

• Suppose we use modulo hashing
  – Number servers 0..k-1
  – Place X on server i = (X mod k)
    • Problem? Data may not be uniformly distributed
  – Place X on server i = hash(X) mod k
    • Problem?
      – What happens if a server fails or joins (k → k±1)?
      – What if different clients have different estimates of k?
      – Answer: Nearly all entries get remapped to new nodes!
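
The remapping problem is easy to measure. The sketch below is only an illustration: SHA-1 stands in for hash(X), and the key names, key count, and the change from 10 to 11 servers are assumptions chosen for the demo. A key keeps its server only when hash(X) mod k happens to agree for both values of k, so roughly a 1/(k+1) fraction should stay put.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic hash: SHA-1 digest as an integer (built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")

def server_for(key: str, k: int) -> int:
    """Modulo placement: server index in 0..k-1."""
    return stable_hash(key) % k

def fraction_remapped(keys, k_old: int, k_new: int) -> float:
    """Fraction of keys whose server changes when k_old servers become k_new."""
    moved = sum(1 for key in keys if server_for(key, k_old) != server_for(key, k_new))
    return moved / len(keys)

# Hypothetical workload: 10,000 document names, one server joins.
keys = [f"doc-{i}" for i in range(10_000)]
print(f"{fraction_remapped(keys, 10, 11):.1%} of keys move when k goes 10 -> 11")
```

Consistent hashing, introduced next, aims to shrink this fraction to about 1/k.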

Consistent Hashing

• Consistent hashing partitions key-space among nodes

• Contact appropriate node to lookup/store key
  – Blue node determines red node is responsible for key1
  – Blue node sends lookup or insert to red node

[Figure: nodes holding key1, key2, key3; the blue node sends lookup(key1) and insert(key1, value) to the red node, which stores key1=value.]

Consistent Hashing

• Partitioning key-space among nodes
  – Nodes choose random identifiers: e.g., hash(IP)
  – Keys randomly distributed in ID-space: e.g., hash(URL)
  – Keys assigned to node "nearest" in ID-space
  – Spreads ownership of keys evenly across nodes

[Figure: ID space with nodes at 0000, 0010, 0110, 1010, 1100, 1110, 1111 and keys URL1, URL2, URL3 at 0001, 0100, 1011.]

Consistent Hashing

• Construction
  – Assign n hash buckets to random points on mod 2^k circle; hash key size = k
  – Map object to random position on circle
  – Hash of object = closest clockwise bucket
  – successor(key) → bucket

• Desired features
  – Balanced: No bucket has disproportionate number of objects
  – Smoothness: Addition/removal of bucket does not cause movement among existing buckets (only immediate buckets)
  – Spread and load: Small set of buckets that lie near object

[Figure: circle with positions 0, 4, 8, 12, 14 marked and a "Bucket" label.]
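
The construction and the smoothness property can be sketched in a few lines. This is a minimal illustration, not a production design: the Ring class, the 32-bit circle, truncated SHA-1 as the hash, and the IP-like node names are all assumptions made for the example.

```python
import bisect
import hashlib

def ring_id(name: str, bits: int = 32) -> int:
    """Map a name to a point on the mod 2^bits circle (SHA-1, truncated)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

class Ring:
    def __init__(self, nodes=()):
        self._points = []      # sorted bucket positions on the circle
        self._owner = {}       # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        point = ring_id(node)
        if point in self._owner:
            raise ValueError("ring position collision")
        bisect.insort(self._points, point)
        self._owner[point] = node

    def remove(self, node: str) -> None:
        point = ring_id(node)
        self._points.remove(point)
        del self._owner[point]

    def successor(self, key: str) -> str:
        """Closest bucket clockwise from the key's position, wrapping past 0."""
        idx = bisect.bisect_left(self._points, ring_id(key))
        return self._owner[self._points[idx % len(self._points)]]

# Hypothetical demo: 10 nodes, 5,000 keys, then an 11th node joins.
keys = [f"url-{i}" for i in range(5_000)]
ring = Ring(f"10.0.0.{i}" for i in range(1, 11))
before = {k: ring.successor(k) for k in keys}
ring.add("10.0.0.11")
after = {k: ring.successor(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved;",
      "all went to the new node:", all(after[k] == "10.0.0.11" for k in moved))
```

Every key that moves lands on the newly added bucket, and removing that bucket restores the earlier assignment, which is the smoothness property in miniature.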

Consistent hashing and failures

• Consider network of n nodes

• If each node has 1 bucket
  – Owns 1/n-th of keyspace in expectation
  – Says nothing of request load per bucket

• If a node fails:
  – Its successor takes over bucket
  – Achieves smoothness goal: Only localized shift, not O(n)
  – But now successor owns 2 buckets: keyspace of size 2/n

• Instead, if each node maintains v random node IDs, not 1
  – "Virtual" nodes spread over ID space, each of size 1/(vn)
  – Upon failure, v successors take over, each now stores (v+1)/(vn)

[Figure: circle with positions 0, 4, 8, 12, 14 marked and a "Bucket" label.]
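
A small experiment makes the effect of virtual nodes visible. The sketch below is illustrative only: the node names, the "name#i" scheme for deriving virtual IDs, the 32-bit ID space, and v = 50 are assumptions. It computes each node's exact share of the ID space from arc lengths, then removes one node and reports who absorbs its share.

```python
import hashlib
from collections import Counter

BITS = 32
SPACE = 1 << BITS

def ring_id(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % SPACE

def shares(nodes, v):
    """Fraction of the ID space owned by each node, with v virtual IDs per node."""
    points = sorted((ring_id(f"{node}#{i}"), node) for node in nodes for i in range(v))
    owned = Counter()
    for idx, (point, node) in enumerate(points):
        prev_point = points[idx - 1][0]            # idx 0 wraps to the last point
        owned[node] += (point - prev_point) % SPACE  # arc (prev_point, point]
    return {node: owned[node] / SPACE for node in nodes}

def takeover(nodes, v, failed):
    """Nodes whose share grows when `failed` leaves, mapped to the extra share."""
    before = shares(nodes, v)
    after = shares([n for n in nodes if n != failed], v)
    return {n: after[n] - before[n] for n in after if after[n] > before[n]}

nodes = [f"node-{i}" for i in range(10)]
for v in (1, 50):
    gained = takeover(nodes, v, failed="node-0")
    print(f"v={v}: {len(gained)} node(s) absorb the failed node's keyspace")
```

With v = 1 a single successor inherits the whole share; with many virtual IDs the same share is split across several survivors.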

Consistent hashing vs. DHTs

                              Consistent Hashing   Distributed Hash Tables
Routing table size            O(n)                 O(log n)
Lookup / Routing              O(1)                 O(log n)
Join/leave: Routing updates   O(n)                 O(log n)
Join/leave: Key movement      O(1)                 O(1)

Distributed Hash Table

• Nodes' neighbors selected from particular distribution
  – Visualize keyspace as a tree in distance from a node
  – At least one neighbor known per subtree of increasing size/distance from node

• Route greedily towards desired key via overlay hops

[Figure: ID space with nodes at 0000, 0010, 0110, 1010, 1100, 1110, 1111 and keys at 0001, 0100, 1011.]

The Chord DHT

• Chord ring: ID space mod 2^160
  – nodeid = SHA1(IP address, i) for i = 1..v virtual IDs
  – keyid = SHA1(name)

• Routing correctness:
  – Each node knows successor and predecessor on ring

• Routing efficiency:
  – Each node knows O(log n) well-distributed neighbors

Basic lookup in Chord

lookup(id):
  // the interval (pred.id, my.id] is taken on the ring, so it may wrap past 0
  if (id > pred.id && id <= my.id)
    return my.id;
  else
    return succ.lookup(id);

• Route hop by hop via successors
  – O(n) hops to find destination id
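
The successor-only lookup can be run on a toy ring to see the hop count. This is a sketch under assumptions: the Node class, the helper in_half_open, and build_ring are hypothetical names, the ring is static (no joins or failures), and the 4-bit IDs are borrowed from the example ID space on the earlier slides.

```python
def in_half_open(x: int, lo: int, hi: int) -> bool:
    """True if x lies in the ring interval (lo, hi], allowing wraparound past 0."""
    if lo < hi:
        return lo < x <= hi
    return x > lo or x <= hi      # interval wraps (lo == hi means the whole ring)

class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.pred = self
        self.succ = self

    def lookup(self, key_id: int, hops: int = 0):
        """Return (owner id, hop count), forwarding along successors only."""
        if in_half_open(key_id, self.pred.id, self.id):
            return self.id, hops
        return self.succ.lookup(key_id, hops + 1)

def build_ring(ids):
    """Link nodes into a ring in ID order."""
    nodes = [Node(i) for i in sorted(ids)]
    for idx, node in enumerate(nodes):
        node.pred = nodes[idx - 1]
        node.succ = nodes[(idx + 1) % len(nodes)]
    return nodes

nodes = build_ring([0b0000, 0b0010, 0b0110, 0b1010, 0b1100, 0b1110, 0b1111])
owner, hops = nodes[2].lookup(0b0001)   # start at node 0110, look up key 0001
print(f"key 0001 is owned by node {owner:04b} after {hops} hops")
```

Starting at node 0110, the query for key 0001 has to walk almost the whole ring (6 hops on this 7-node ring), which is the O(n) behavior the slide refers to.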

Efficient lookup in Chord

lookup(id):
  if (id > pred.id && id <= my.id)
    return my.id;
  else
    // fingers() by decreasing distance
    for finger in fingers():
      // take the farthest finger that does not overshoot id
      if id >= finger.id
        return finger.lookup(id);
    return succ.lookup(id);

• Route greedily via distant "finger" nodes
  – O(log n) hops to find destination id

Building routing tables

For i in 1...log n:
  finger[i] = successor((my.id + 2^i) mod 2^160)
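
Finger tables and the greedy lookup fit together as in the sketch below. It is a toy model under stated assumptions: an 8-bit ID space instead of 2^160, fingers indexed i = 0..M-1 (the slide indexes from 1), a static ring built with global knowledge, and hypothetical names (Node, build_ring, in_half_open). Real Chord nodes discover fingers through lookups rather than a shared sorted list.

```python
import bisect
import random

M = 8                      # ID bits for this toy ring (Chord itself uses 160)
SPACE = 1 << M

def in_half_open(x, lo, hi):
    """True if x is in the ring interval (lo, hi], allowing wraparound."""
    if lo < hi:
        return lo < x <= hi
    return x > lo or x <= hi

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.pred = self.succ = self
        self.fingers = []          # ordered by increasing distance from this node

    def lookup(self, key_id, hops=0):
        if in_half_open(key_id, self.pred.id, self.id):
            return self.id, hops
        # Farthest finger first; forward to the first one that does not pass the key.
        for finger in reversed(self.fingers):
            if in_half_open(finger.id, self.id, key_id):
                return finger.lookup(key_id, hops + 1)
        return self.succ.lookup(key_id, hops + 1)

def build_ring(ids):
    ids = sorted(set(ids))
    nodes = [Node(i) for i in ids]

    def successor(point):
        """First node at or clockwise after `point`."""
        return nodes[bisect.bisect_left(ids, point) % len(nodes)]

    for idx, node in enumerate(nodes):
        node.pred = nodes[idx - 1]
        node.succ = nodes[(idx + 1) % len(nodes)]
        node.fingers = [successor((node.id + (1 << i)) % SPACE) for i in range(M)]
    return nodes

# Hypothetical ring: 32 node IDs drawn with a fixed seed for repeatability.
nodes = build_ring(random.Random(461).sample(range(SPACE), 32))
worst = max(start.lookup(key)[1] for start in nodes for key in range(SPACE))
print(f"{len(nodes)} nodes, {M}-bit IDs: worst-case lookup took {worst} hops")
```

Each finger hop at least halves the remaining ring distance to the key's predecessor, so in this model a lookup needs at most M + 1 hops, compared with up to n - 1 when only successors are used.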

Joining and managing routing

• Join:
  – Choose nodeid
  – Lookup(my.id) to find place on ring
  – During lookup, discover future successor
  – Learn predecessor from successor
  – Update succ and pred that you joined
  – Find fingers by lookup((my.id + 2^i) mod 2^160)

• Monitor:
  – If a neighbor doesn't respond for some time, find a new one

• Leave: Just go, already!
  – (Warn your neighbors if you feel like it)
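
The join steps above can be sketched as pointer updates on a ring. This is a simplified, single-threaded illustration: the Node class and its find_owner, join, and leave methods are hypothetical names, fingers and failure monitoring are omitted, and duplicate IDs or concurrent joins are not handled.

```python
def in_half_open(x, lo, hi):
    """True if x is in the ring interval (lo, hi], allowing wraparound."""
    if lo < hi:
        return lo < x <= hi
    return x > lo or x <= hi

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.pred = self.succ = self      # a lone node is its own ring

    def find_owner(self, key_id):
        """Walk successors until reaching the node responsible for key_id."""
        node = self
        while not in_half_open(key_id, node.pred.id, node.id):
            node = node.succ
        return node

    def join(self, bootstrap):
        """Splice this node into the ring that `bootstrap` belongs to."""
        succ = bootstrap.find_owner(self.id)   # lookup(my.id) finds the future successor
        pred = succ.pred                       # learn predecessor from successor
        self.succ, self.pred = succ, pred
        pred.succ = self                       # tell both neighbors that we joined
        succ.pred = self

    def leave(self):
        """Polite leave: warn neighbors so they link to each other."""
        self.pred.succ, self.succ.pred = self.succ, self.pred

def ring_ids(start):
    """IDs seen while following successor pointers once around the ring."""
    ids, node = [start.id], start.succ
    while node is not start:
        ids.append(node.id)
        node = node.succ
    return ids

first = Node(0b0110)
members = {0b0110: first}
for node_id in (0b1110, 0b0010, 0b1010, 0b0000):
    members[node_id] = Node(node_id)
    members[node_id].join(first)
print([f"{i:04b}" for i in ring_ids(first)])
# prints ['0110', '1010', '1110', '0000', '0010']
```

Whatever order the nodes join in, following successor pointers visits them in ring order.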

DHT Design Goals

• An "overlay" network with:
  – Flexible mapping of keys to physical nodes
  – Small network diameter
  – Small degree (fanout)
  – Local routing decisions
  – Robustness to churn
  – Routing flexibility
  – Decent locality (low "stretch")

• Different "storage" mechanisms considered:
  – Persistence w/ additional mechanisms for fault recovery
  – Best effort caching and maintenance via soft state

Storage models

• Store only on key's immediate successor
  – Churn, routing issues, packet loss make lookup failure more likely

• Store on k successors
  – When nodes detect succ/pred fail, re-replicate

• Cache along reverse lookup path
  – Provided data is immutable
  – …and performing recursive responses
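
The "store on k successors" model can be illustrated by computing the replica set for a key. A minimal sketch, assuming a globally known sorted list of node IDs (a real DHT node would learn these through its successor list); replica_nodes is a hypothetical helper, and the 4-bit IDs come from the example ID space on the earlier slides.

```python
import bisect

def replica_nodes(node_ids, key_id, k):
    """The k nodes clockwise from key_id: the key's successor plus k-1 more."""
    ring = sorted(node_ids)
    start = bisect.bisect_left(ring, key_id)
    count = min(k, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]

ring = [0b0000, 0b0010, 0b0110, 0b1010, 0b1100, 0b1110, 0b1111]
print(replica_nodes(ring, 0b1011, 3))          # -> [12, 14, 15]

survivors = [n for n in ring if n != 0b1100]   # node 1100 fails
print(replica_nodes(survivors, 0b1011, 3))     # -> [14, 15, 0]
```

After node 1100 fails, two of the three replicas for key 1011 are still in place, and node 0000 is the one that must receive a fresh copy during re-replication.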

Summary

• Peer-to-peer systems
  – Unstructured systems
    • Finding hay, performing keyword search
  – Structured systems (DHTs)
    • Finding needles, exact match

• Distributed hash tables
  – Based around consistent hashing with views of O(log n)
  – Chord, Pastry, CAN, Koorde, Kademlia, Tapestry, Viceroy, …

• Lots of systems issues
  – Heterogeneity, storage models, locality, churn management, underlay issues, …
  – DHTs deployed in wild: Vuze (Kademlia) has 1M+ active users