P2P Lookup Protocols

Zubin Bhuyan, MTech (IT), Tezpur University,

Assam, INDIA

Distributed System

Peer-to-Peer Lookup Protocols

Outline

P2P Basics
Architecture
Lookup in P2P
Related work in P2P Lookup Protocols
    Chord Protocol
    Cluster based and Routing Balanced P2P Lookup Protocol
    PathFinder
    LiChord

Proposed P2P Lookup Model based on RCC8 and Scalable Bloom Filter

Future work for proposed P2P lookup model

P2P Basics

A model of decentralized communication where every node in the network acts alike, without any centralized control or hierarchical organization
Nodes in such a P2P network are both suppliers and consumers of resources
Nodes are autonomous
Popularized by file sharing systems like Napster

P2P Architecture

Architecture determines structure of overlay network

Structured P2P network
    Peers are organized following specific criteria and algorithms
Unstructured P2P network
    Network does not provide any algorithm for organization
    Pure P2P systems
    Centralized peer-to-peer systems


P2P Lookup

Providing object location service in P2P system

P2P system may involve thousands or millions of live peers (nodes)

Deliver high quality service with low response latency

Structured networks -> DHT
Unstructured networks -> Exhaustive searching like BubbleStorm, etc.

P2P Lookup challenges

Population dynamism
Content dynamism
Heterogeneity

I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, H. Balakrishnan, “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications”, SIGCOMM, 2001

CHORD: A Scalable Peer-to-Peer Lookup Protocol

CHORD

Efficient lookup of a node which stores data items for a particular search key.

Provides only one operation: given a key, it maps the key onto a node.

Distributed Hash Table approach balances the load

When the Nth node joins or leaves, only an O(1/N) fraction of the keys is moved.

Each node maintains fingers (links) to specific other peers

Lookup Using Finger Table

[Figure: Chord identifier ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, N51 and N56, showing the path of lookup(54)]
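The finger-table lookup can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it uses the ten node identifiers from the ring above, assumes m = 6 identifier bits (the smallest size that fits them), recomputes finger tables on demand instead of storing them, and starts the query at node 8, which is a choice made here for illustration only.

```python
from bisect import bisect_left

M = 6                      # assumed number of identifier bits
RING = 2 ** M              # identifiers live on a circle modulo 2**M
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])


def successor(ident):
    """First node whose identifier is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]


def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def finger_table(n):
    """finger[i] = successor(n + 2**i) for i in 0..M-1."""
    return [successor(n + 2 ** i) for i in range(M)]


def closest_preceding_finger(n, key):
    """Farthest finger of n that still precedes key on the ring."""
    for f in reversed(finger_table(n)):
        if in_interval(f, n, key):
            return f
    return n


def lookup(start, key):
    """Return (node responsible for key, list of nodes that forwarded the query)."""
    n, path = start, [start]
    while not in_interval(key, n, successor(n + 1), inclusive_right=True):
        nxt = closest_preceding_finger(n, key)
        if nxt == n:
            break
        n = nxt
        path.append(n)
    return successor(n + 1), path


owner, path = lookup(8, 54)
print(owner, path)
```

Each hop jumps to the farthest known finger that does not overshoot the key, which is why the number of hops grows only logarithmically with the number of nodes.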

Y. Liu, M. Chen, “Cluster-Based and Routing Balanced P2P Lookup Protocol” Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing

Cluster-Based and Routing Balanced P2P Lookup Protocol


All peers are grouped into clusters according to cost in latency
A hash function assigns each node and key an m-bit identifier
    A node's identifier is its hashed IP address
    A key's identifier is produced by hashing the key
Identifiers are ordered on an identifier circle modulo 2^m


Construction:
Requires M well-known landmark machines
CRP constructs M clusters based on these landmarks and names them by the landmarks' numbers, called CIDs
All nodes measure their round-trip time (RTT) to each of these landmarks and attach themselves to the landmark with the minimum RTT value
Nodes that have the same CID are grouped into one cluster
Clustered nodes are ordered in a sub-circle according to their identifiers
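The construction steps can be illustrated with a short sketch. It is not taken from the CRP paper: the use of SHA-1, the 16-bit identifier length, the function names and the sample RTT values are all assumptions made for the example.

```python
import hashlib

M_BITS = 16  # assumed identifier length m


def identifier(text, m=M_BITS):
    """Hash a string (an IP address or a key) to an m-bit identifier."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def assign_cluster(rtts):
    """Return the CID (landmark index) with the smallest measured RTT."""
    return min(range(len(rtts)), key=lambda cid: rtts[cid])


def build_clusters(measurements):
    """Group nodes by CID and order each cluster by node identifier.

    measurements maps a node's IP address to its list of RTTs in ms,
    one entry per landmark.
    """
    clusters = {}
    for ip, rtts in measurements.items():
        clusters.setdefault(assign_cluster(rtts), []).append(ip)
    for members in clusters.values():
        members.sort(key=identifier)  # position on the cluster's sub-circle
    return clusters


# Hypothetical measurements against M = 3 landmarks.
measurements = {
    "10.0.0.1": [12.0, 80.0, 45.0],
    "10.0.0.2": [70.0, 9.5, 60.0],
    "10.0.0.3": [15.0, 90.0, 40.0],
}
print(build_clusters(measurements))
```

A landmark that no node is closest to simply ends up with an empty cluster, so the dictionary may hold fewer than M entries.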



Intra-Cluster Lookup Tree:
Inside the cluster, CRP constructs a complete and unique balanced lookup tree for each node
In the lookup tree of node K, the remaining nodes in the cluster are laid out counterclockwise from node K

[Figure: lookup tree of node 0 from the previous diagram]


Lookup Table:
The lookup table is composed of an inner-cluster section and an outer-cluster section
The inner-cluster section is a list of (target, forwarder) pairs
The outer-cluster section has at most M-1 rows
Each row shows the corresponding node in one of the other clusters
These nodes are the closest nodes to node

Dirk Bradler, Lachezar Krumov, Max Mühlhäuser, Jussi Kangasharju, “PathFinder: Efficient Lookups and Efficient Search in Peer-to-Peer Networks”, 12th International Conference on Distributed Computing and Networking, 2011

PathFinder: Efficient Lookups and Efficient Search in Peer-to-Peer Networks

PathFinder

Attempts to combine an unstructured and a structured network in a single overlay

Builds a robust network of virtual nodes on top of the physical peers

Actual data transfer still takes place directly among the physical peers

Based on random graph theory

PathFinder

Construction
Two pseudo-random number generators (PRNGs) are used
Given a number c, the first generator returns Poisson-distributed numbers with mean value c
Given a node ID, the second generator produces a deterministic sequence of numbers, which are used as the IDs of that node's neighbors

[It is known that the degree sequence in a random graph is Poisson distributed.]

PathFinder

Construction
PathFinder starts construction of the network by choosing a number c according to the size of the network required
Then, for each virtual node, the number of neighbors is determined with the first number generator
The actual IDs of the nodes to which the current virtual node should be connected are chosen with the second number generator
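A rough sketch of this two-generator construction follows. Several details are assumptions made here and are not stated on the slides: the first generator is seeded with a shared value so that every peer derives the same degree sequence, the second generator is seeded directly with the node ID, duplicate and self links are skipped, and c is passed in rather than derived from the network size.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson-distributed integer using the multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def degree_sequence(n_nodes, c, seed=0):
    """First generator: one Poisson(c) degree per virtual node."""
    rng = random.Random(seed)  # shared seed, so all peers get the same list
    return [poisson(rng, c) for _ in range(n_nodes)]


def neighbor_ids(node_id, degree, n_nodes):
    """Second generator: deterministic neighbor IDs derived from the node ID."""
    rng = random.Random(node_id)
    ids = []
    while len(ids) < min(degree, n_nodes - 1):
        candidate = rng.randrange(n_nodes)
        if candidate != node_id and candidate not in ids:
            ids.append(candidate)
    return ids


def build_overlay(n_nodes, c, seed=0):
    """Map every virtual node ID to the list of its outgoing neighbors."""
    degrees = degree_sequence(n_nodes, c, seed)
    return {v: neighbor_ids(v, degrees[v], n_nodes) for v in range(n_nodes)}


overlay = build_overlay(200, 5)
mean_degree = sum(len(nbrs) for nbrs in overlay.values()) / len(overlay)
print(round(mean_degree, 2))
```

Because both generators are deterministic, any peer that knows c, the seed and a node ID can recompute that node's neighbor list without contacting it.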

PathFinder

Routing Table
Each peer keeps track of its own outgoing links and of incoming links from other virtual nodes
A peer learns its incoming links when other peers attempt to connect to it
Keeping track of incoming links makes key lookups much more efficient

PathFinder

Storing Objects:
An object is stored on the virtual node which matches the object's identifier

Key Lookup
Suppose that peer A wants to retrieve an object O. Peer A determines that the virtual node w is responsible for object O by using the hash function described above
For each of its neighboring virtual nodes, A calculates the neighbors of those nodes using the PRNG.
This process is repeated recursively until the required virtual node for O is found

The average path length of PathFinder is approximately ln(N)/ln(c), where N is the number of virtual nodes and c is the average number of neighbors
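The key lookup amounts to a breadth-first expansion over neighbor lists that the querying peer computes locally. The sketch below simplifies in ways that are assumptions of this example: every virtual node has a fixed out-degree instead of a Poisson-distributed one, only outgoing links are followed, and SHA-1 stands in for the unspecified hash function.

```python
import hashlib
import random
from collections import deque

N_NODES = 500  # assumed number of virtual nodes
DEGREE = 6     # fixed out-degree for brevity (an assumption)


def responsible_node(object_name):
    """Hash an object name to the ID of the virtual node that stores it."""
    digest = hashlib.sha1(object_name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N_NODES


def neighbor_ids(node_id):
    """Recompute a node's neighbors locally from a PRNG seeded with its ID."""
    rng = random.Random(node_id)
    others = [v for v in range(N_NODES) if v != node_id]
    return rng.sample(others, DEGREE)


def find_path(start, target):
    """Expand neighbors of neighbors until target is found; None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbor_ids(node):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


target = responsible_node("song.mp3")  # hypothetical object name
print(find_path(0, target))
```

No messages are exchanged during the search itself; the path is computed entirely at peer A and only then used to route the request.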

Shuling Wang, Shoubao Yang, Liangmin Guo, “LiChord: A Linear Code Based Structured P2P for Approximate Match”, Third International Conference on Communications and Mobile Computing (CMC), April 2011

LiChord: A Linear Code based structured P2P for Approximate Match

LiChord

General DHT-based structured P2P applications can work only for exact match cases

LiChord has been proposed to overcome this problem by employing a mapping process that can give approximate matches for lookup operations
Hamming distance
    Node identifiers are mapped onto k-bit codes
    Objects are mapped onto n-bit codes

LiChord

Index Distribution
A hash function is applied in LiChord to assign node i a k-bit node identifier IDi by hashing node i's IP address
All of these nodes, denoted by u, are organized in the same way as in Chord, based on the partial order of u
Object identifiers are produced by consecutive insertions into an initially empty Bloom filter

LiChord

Query Identification
Queries in LiChord are identified in the same way as objects
For example, for a query requesting keys x, y and z, the identifier of the query is 010111000001010010
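Generating an identifier by inserting keys into an empty Bloom filter, and comparing identifiers by Hamming distance, might look like the sketch below. The filter length of 18 bits matches the length of the example identifier above; the number of hash functions and the use of salted SHA-256 are assumptions, so the bit patterns produced here will not match the example.

```python
import hashlib

N_BITS = 18   # length of the example identifier above
N_HASHES = 3  # assumed number of hash functions


def bit_positions(key):
    """Bit positions set by one key, derived from salted SHA-256 digests."""
    return {
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest(), "big") % N_BITS
        for i in range(N_HASHES)
    }


def identifier(keys):
    """Insert keys one by one into an empty Bloom filter; return its bit string."""
    bits = [0] * N_BITS
    for key in keys:
        for pos in bit_positions(key):
            bits[pos] = 1
    return "".join(map(str, bits))


def hamming(a, b):
    """Number of bit positions in which two equal-length identifiers differ."""
    return sum(x != y for x, y in zip(a, b))


obj = identifier(["x", "y", "z"])
query = identifier(["x", "y"])
print(obj, query, hamming(obj, query))
```

A query that names only some of an object's keys sets a subset of the object's bits, so its identifier stays within a small Hamming distance of the object's identifier. That property is what makes approximate matching possible.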



The region connection calculus (RCC) serves for qualitative spatial representation and reasoning

RCC8 consists of 8 basic relations that are possible between two regions:
    disconnected (DC)
    externally connected (EC)
    equal (EQ)
    partially overlapping (PO)
    tangential proper part (TPP)
    tangential proper part inverse (TPPi)
    non-tangential proper part (NTPP)
    non-tangential proper part inverse (NTPPi)
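To make the eight relations concrete, the sketch below classifies a pair of closed one-dimensional intervals. Intervals are only a stand-in chosen for this example; how regions are actually represented in the proposed model is not specified here.

```python
def rcc8(a, b):
    """RCC8 relation of closed interval a to closed interval b.

    Each interval is a (low, high) pair with low < high.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1 or b2 < a1:
        return "DC"    # no point in common
    if a2 == b1 or b2 == a1:
        return "EC"    # touch at a boundary point only
    if (a1, a2) == (b1, b2):
        return "EQ"
    if b1 <= a1 and a2 <= b2:
        # a lies inside b; tangential if they share a boundary point
        return "TPP" if a1 == b1 or a2 == b2 else "NTPP"
    if a1 <= b1 and b2 <= a2:
        # b lies inside a
        return "TPPi" if a1 == b1 or a2 == b2 else "NTPPi"
    return "PO"        # interiors overlap, neither contains the other


for pair in [((0, 1), (2, 3)), ((0, 1), (1, 2)), ((0, 2), (1, 3)), ((1, 2), (0, 3))]:
    print(pair, rcc8(*pair))
```

Exactly one of the eight relations holds for any pair of regions, which is what allows a single relation to drive a forward-or-stop decision.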


Assumption: in the overlay of virtual nodes, data keys are arranged on the basis of similarity, using Hamming distance calculated from a Bloom filter
We use a PathFinder-like lookup mechanism
To restrict the unnecessary bubbling of data and lookup queries, we propose a Bloom filter that restricts query propagation to a specific direction
This is achieved by using a Bloom filter to generate keys for objects
A Scalable Bloom Filter may be used


For very large topologies, the number of hash functions for the Bloom filter might not suffice
The entire topology might be divided into different regions based on the Hamming distances of the object keys stored in the nodes
Use RCC8 to decide on propagation or restriction of queries between regions
Since the Bloom filter that generates object keys is available to everyone, every node in the overlay needs to know the Region Separation Parameter (RSP)
RSP is a Hamming distance between object keys that is used to determine whether an object belongs to the same region or not
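One possible reading of the RSP rule is sketched below. Both the threshold direction (keys within RSP of a region's representative key belong to that region) and the greedy grouping are assumptions made for illustration, since the formal rules are still open.

```python
def hamming(a, b):
    """Number of differing positions between two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("keys must have the same length")
    return sum(x != y for x, y in zip(a, b))


def same_region(key_a, key_b, rsp):
    """Assumed rule: two keys share a region if their distance is at most rsp."""
    return hamming(key_a, key_b) <= rsp


def assign_regions(keys, rsp):
    """Greedily group keys; the first key of each region is its representative."""
    regions = []
    for key in keys:
        for region in regions:
            if same_region(region[0], key, rsp):
                region.append(key)
                break
        else:
            regions.append([key])  # no existing region is close enough
    return regions


keys = ["000000", "000001", "111111", "111110"]
print(assign_regions(keys, rsp=1))
```

With a small RSP the overlay splits into many tight regions; a larger RSP merges them, which trades fewer region boundaries for less selective query propagation.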


The proposed idea should, in theory, show improved performance over existing P2P models because it uses selective, directional propagation of queries
No unnecessary query propagation is done
Dividing the entire system into regions based on similarity helps in deciding when the query broadcast has to be stopped

Future work for proposed P2P Lookup model

A more extensive study of the idea has to be done.

Formal rules for the algorithm have to be defined.

The exact arrangement and selection of hash functions for the Scalable Bloom Filter have to be determined.

A generic heuristic for determining RSP needs to be developed.

Simulation and comparison with other P2P lookup protocols are required.

References

Shuling Wang, Shoubao Yang, Liangmin Guo, “LiChord: A Linear Code Based Structured P2P for Approximate Match”, Third International Conference on Communications and Mobile Computing (CMC), April 2011

R. Ahmed, R. Boutaba, “A Survey of Distributed Search Techniques in Large Scale Distributed Systems”, IEEE Communications Surveys & Tutorials, Second Quarter 2011

Dirk Bradler, Lachezar Krumov, Max Mühlhäuser, Jussi Kangasharju, “PathFinder: Efficient Lookups and Efficient Search in Peer-to-Peer Networks”, 12th International Conference on Distributed Computing and Networking, 2011

I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, H. Balakrishnan, “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications”, SIGCOMM, 2001

W. Terpstra, J. Kangasharju, C. Leng, A. Buchmann, “Bubblestorm: resilient, probabilistic, and exhaustive peer-to-peer search”, Proc. SIGCOMM, pp. 49–60, 2007

Donald Knuth, “The Art of Computer Programming”, Errata for Volume 3 (2nd ed.)

D. A. Randell, Z. Cui, A. G. Cohn, “A spatial logic based on regions and connection”, Proc. 3rd Int. Conf. on Knowledge Representation and Reasoning, Morgan Kaufmann, San Mateo, pp. 165–176, 1992.

P. Almeida, C. Baquero, N. Preguica, D. Hutchison, "Scalable Bloom Filters", Information Processing Letters 101 (6): 255–261, 2007.

Y. Liu, M. Chen, “Cluster-Based and Routing Balanced P2P Lookup Protocol” Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing

Thank You!!
