TECH TALKS BitTorrent DHT | Arvid Norberg BitTorrent, Inc / 2013 Tech Talks 01

Bit torrent techtalks_dht

DESCRIPTION

As part of BitTorrent's Tech Talks series, Arvid Norberg explains how BitTorrent Distributed Hash Tables work.

Text of Bit torrent techtalks_dht

1. TECH TALKS: BitTorrent DHT | Arvid Norberg. BitTorrent, Inc / 2013 Tech Talks
2. INTRODUCTION: DHT = DISTRIBUTED HASH TABLE (It's like a hash table, where the payload is divided up across many nodes.)
3. INTRODUCTION: The DHT is primarily used to introduce peers to each other. [diagram: a peer asks the DHT "Who else is on swarm X?" and gets back a list of peer addresses]
4. INTRODUCTION: protocol, topology, routing, routing table, traversal algorithm
5. PROTOCOL
6. PROTOCOL: ping, announce_peer, get_peers, find_node [diagram: messages between a peer and the DHT]
7. PROTOCOL: ping and announce_peer are ONE-OFF MESSAGES.
8. PROTOCOL: ping and announce_peer are ONE-OFF MESSAGES; get_peers and find_node are RECURSIVE.
9. PROTOCOL: [diagram: the peer sends get_peers(x) to the DHT and receives "nodes:" in response]
10. PROTOCOL: [diagram: the peer sends get_peers(x) to the returned nodes and receives further "nodes:" responses] REPEATS
11. PROTOCOL: TERMINATES WHEN WE RECEIVE VALUES (I.E. PEERS) AND NO BETTER NODES
12. PROTOCOL: EACH MESSAGE INCLUDES ONE'S OWN NODE ID. THIS ENABLES NODES TO LEARN ABOUT NEW NODES AS PART OF THE PROTOCOL CHATTER.
13. PROTOCOL: bootstrap, refresh buckets, announce
14. PROTOCOL: Spoof protection. get_peers responds with a write-token. The write-token is typically a MAC of:
    - source (IP, port)
    - target info-hash
    - local secret (which may expire in tens of minutes)
    announce_peer requires a valid write-token to insert the node in the peer list.
15. TOPOLOGY: This section describes how nodes can respond to recursive lookup queries like this, with nodes closer and closer to the target.* (*Spoiler alert: it has to do with the topology.)
16. TOPOLOGY: The DHT is made up of all BitTorrent peers, across all swarms. [diagram: many peers]
17. TOPOLOGY: Each node has a self-assigned address, or node ID. [diagram: peers labeled with node IDs such as 71690...29736, 24e52...a22a6, 138b8...cea6f]
18. TOPOLOGY: Consider all nodes lined up in the node ID space. [diagram: node-ID space from 0 to 2^160]
19. TOPOLOGY: Consider all nodes lined up in the node ID space. All nodes appear evenly distributed in this space. [diagram: node-ID space from 0 to 2^160]
20. TOPOLOGY: IDs are 160 bits (20 bytes) long. Keys in the hash table (info-hashes) are also 160 bits. [diagram: node-ID space from 0 to 2^160]
21-22. TOPOLOGY: IDs are 160 bits (20 bytes) long. Keys in the hash table (info-hashes) are also 160 bits. Nodes whose ID is close to an info-hash are responsible for storing information about it. [diagram: an info-hash marked in the node-ID space from 0 to 2^160]
23. ROUTING
24. ROUTING: It is impractical for every node to know about every other node. There are millions of nodes. Nodes come and go constantly. Every node specializes in knowing about all nodes close to itself. The farther from itself, the sparser its knowledge of nodes becomes.
25. ROUTING: The routing table orders nodes based on their distance from oneself. [diagram: distances measured from self in the node-ID space, 0 to 2^160]
26. ROUTING: The routing table orders nodes based on their distance from oneself. This is simplified. The Euclidean distance is not actually used. [diagram: node-ID space redrawn as distance from self]
27. ROUTING: The Euclidean distance has the problem that it folds the space, so nodes are no longer uniformly distributed (in the distance-space). [diagram: node-ID space folded around self]
28. ROUTING: The XOR distance metric is: d(a, b) = a ⊕ b. With the XOR distance metric the distances are still uniformly distributed. It doesn't fold the space the way Euclidean distance does. [diagram: distance from self, 0 to 2^160]
29. ROUTING: The distance space is divided up into buckets. Each bucket holds no more than 8 nodes. [diagram: distance from self, 0 to 2^160]
30. ROUTING: The distance space is divided up into buckets. Each bucket holds no more than 8 nodes. The space covered by a bucket is half as big as the previous one. You know about more nodes close to you. [diagram: distance from self, 0 to 2^160, divided into ..., bucket 2, bucket 1, bucket 0]
31. ROUTING: For every hop in a recursive lookup, the node's distance is cut in half. Lookup complexity: O(log n). This illustration is also simplified; the XOR distance metric will make you jump back and forth a bit. Cutting your distance in half every hop still holds. [diagram: hops toward the target in the node-ID space, 0 to 2^160]
32. ROUTING TABLE
33. ROUTING TABLE: The XOR distance metric applied to the routing table just counts the length of the common bit-prefix.
    OUR NODE ID:   10101110100010101001110...
    OTHER NODE ID: 10101011010010110100101...
34. ROUTING TABLE: The XOR distance metric applied to the routing table just counts the length of the common bit-prefix.
    OUR NODE ID:   10101110100010101001110...
    OTHER NODE ID: 10101011010010110100101...
    Shared bit prefix: 5 bits. The node belongs in bucket 5.
35. ROUTING TABLE: A 160-bit space can be cut in half 160 times. There is a max of 160 buckets. [diagram: the space from 0 to 2^160 split into bucket 0 (no prefix), bucket 1 (prefix = 0), bucket 2 (prefix = 00), ...]
36. ROUTING TABLE: View of the routing table in node ID space (instead of distance space). [diagram: buckets 0 through 4 laid out around self between 0 and 2^160, split at bit 0, bit 1, bit 2, bit 3]
37. ROUTING TABLE: A naïve routing table implementation would be an array of 160 buckets. [diagram: bucket 0, bucket 1, bucket 2, bucket 3, bucket 4, bucket 5, ...]
38. ROUTING TABLE: A naïve routing table implementation would be an array of 160 buckets. Not very efficient, since the majority of buckets will be empty. [diagram: bucket 0, bucket 1, bucket 2, bucket 3, bucket 4, bucket 5, ...]
39. ROUTING TABLE: A typical routing table starts with only bucket 0. When the 9th node is added, the bucket is split into bucket 0 and bucket 1, with the nodes moved to their respective bucket. Only the highest numbered bucket is ever split.
40-42. ROUTING TABLE: [diagram: bucket 0]
43. ROUTING TABLE: Arrange nodes into the correct bucket. [diagram: bucket 1, bucket 0]
44-47. ROUTING TABLE: [diagram: bucket 1, bucket 0]
48. TRAVERSAL ALGORITHM: A deeper look at the get_peers and announce_peer queries.
49. TRAVERSAL ALGORITHM: Pick known nodes out of the routing table, close to the target we're looking up. Sort by distance to target. [diagram: list of (IP, Node ID) entries, ordered closer to target]
50. TRAVERSAL ALGORITHM: Send requests to 3 (or so) at a time. [diagram: get_peers(ih) sent to three of the listed (IP, Node ID) entries, each marked x]
51. TRAVERSAL ALGORITHM: Send requests to 3 (or so) at a time. [diagram: the three marked entries reply with nodes(....)]
52. TRAVERSAL ALGORITHM: Nodes are inserted in sorted order. Nodes we already have are ignored. Nodes that don't respond are marked as stale. [diagram: longer list of (IP, Node ID) entries, three marked x]
53. TRAVERSAL ALGORITHM: Keep 3 requests outstanding at all times. [diagram: get_peers(ih) sent to further (IP, Node ID) entries in the list]
54. TRAVERSAL ALGORITHM: Terminating condition: the top 8 nodes have all been queried (and responded). [diagram: the 8 entries closest to the target, all marked x]
55. PROTOCOL: Send announce_peer to the top 8 nodes. [diagram: announce_peer(ih) sent to each of the 8 entries marked x]
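The write-token scheme from slide 14 could be sketched in Python roughly as follows. This is a minimal illustration, not the talk's or any client's actual implementation: the function names, the HMAC-SHA1 choice, the message layout, and the idea of keeping the previous secret around after rotation are all assumptions made for the example.

```python
import hashlib
import hmac
import os
from typing import Iterable


def make_write_token(secret: bytes, ip: str, port: int, info_hash: bytes) -> bytes:
    """MAC over the requester's (IP, port) and the target info-hash.

    Hypothetical layout: "ip:port" followed by the raw info-hash bytes.
    """
    message = f"{ip}:{port}".encode() + info_hash
    return hmac.new(secret, message, hashlib.sha1).digest()


def is_valid_write_token(token: bytes, secrets: Iterable[bytes],
                         ip: str, port: int, info_hash: bytes) -> bool:
    """Accept the token if it matches any secret that has not expired yet."""
    return any(
        hmac.compare_digest(token, make_write_token(s, ip, port, info_hash))
        for s in secrets
    )


# A node answering get_peers hands out a token bound to the asker's address.
current_secret = os.urandom(20)
info_hash = bytes(range(20))  # placeholder 160-bit key
token = make_write_token(current_secret, "203.0.113.5", 6881, info_hash)

# A later announce_peer from the same address is accepted ...
print(is_valid_write_token(token, [current_secret], "203.0.113.5", 6881, info_hash))
# ... while the same token replayed from another address is not.
print(is_valid_write_token(token, [current_secret], "203.0.113.6", 6881, info_hash))
```

Because the token is derived from the source address, a node cannot obtain a token and then announce on behalf of a spoofed address. Rotating the secret every few minutes bounds how long a token stays usable.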
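The XOR metric from slide 28 and the common-prefix rule from slides 33 and 34 can be illustrated with a few lines of Python. This is only a sketch: IDs are modeled as plain integers, and `id_from_bit_prefix` is a hypothetical helper that pads the truncated bit strings shown on the slide with zeros, since the full IDs are not given.

```python
ID_BITS = 160  # node IDs and info-hashes are 160 bits long (slide 20)


def xor_distance(a: int, b: int) -> int:
    """XOR distance metric: d(a, b) = a XOR b."""
    return a ^ b


def bucket_index(own_id: int, other_id: int, id_bits: int = ID_BITS) -> int:
    """Length of the common bit-prefix of the two IDs.

    The highest set bit of the XOR distance is the first bit where the IDs
    differ, so the shared prefix is id_bits minus the distance's bit length.
    """
    return id_bits - xor_distance(own_id, other_id).bit_length()


def id_from_bit_prefix(prefix: str, id_bits: int = ID_BITS) -> int:
    """Hypothetical helper: pad a leading bit string with zeros to id_bits."""
    return int(prefix.ljust(id_bits, "0"), 2)


ours = id_from_bit_prefix("10101110100010101001110")
other = id_from_bit_prefix("10101011010010110100101")
print(bucket_index(ours, other))  # 5, matching slide 34
```

In a table that splits buckets lazily, as slide 39 describes, a node whose prefix length exceeds the highest existing bucket number would go into that highest bucket instead; the sketch leaves that clamping out.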
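The traversal described on slides 49 to 54 can be approximated by the following Python sketch. It is a simplification under stated assumptions: requests are issued in sequential rounds of up to 3 rather than kept 3 in flight, only nodes currently among the 8 closest are queried, and the `query` callback and the toy 6-bit network (each node knows the six IDs that differ from it in one bit) are hypothetical stand-ins for real get_peers traffic.

```python
from typing import Callable, FrozenSet, List, Optional, Set

K = 8      # stop once the K closest live nodes have all responded (slide 54)
ALPHA = 3  # requests per round (slide 50: "3 (or so)")

# Hypothetical callback: node IDs returned by the queried node, or None
# if the node does not respond.
Query = Callable[[int, int], Optional[List[int]]]


def traverse(target: int, start_nodes: List[int], query: Query) -> List[int]:
    candidates: Set[int] = set(start_nodes)
    responded: Set[int] = set()
    stale: Set[int] = set()
    while True:
        # Keep the list sorted by XOR distance to the target, skipping stale nodes.
        top = sorted(candidates - stale, key=lambda n: n ^ target)[:K]
        pending = [n for n in top if n not in responded][:ALPHA]
        if not pending:
            return top  # every node in the top K has been queried and responded
        for node in pending:
            reply = query(node, target)
            if reply is None:
                stale.add(node)           # no response: mark as stale
            else:
                responded.add(node)
                candidates.update(reply)  # nodes we already have are ignored


def make_query(dead: FrozenSet[int] = frozenset()) -> Query:
    """Toy 6-bit network: every node knows the IDs one bit-flip away."""
    def query(node: int, target: int) -> Optional[List[int]]:
        if node in dead:
            return None
        return [node ^ (1 << bit) for bit in range(6)]
    return query


closest = traverse(45, [1, 2, 4, 8, 16, 32], make_query())
print(closest)
```

The returned list is what slide 55 would then use as the recipients of announce_peer. Passing `make_query(dead=frozenset({44}))` shows the stale handling: node 44 is dropped and the next-closest live node takes its place in the result.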