38
LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment W. LITWIN CERIA Laboratory H.YAKOUBEN Paris Dauphine University [email protected] Paris Dauphine University Hanafi[email protected] T. SCHWARZ Santa Clara University (USA) [email protected]

LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment

  • Upload
    rangle

  • View
    55

  • Download
    1

Embed Size (px)

DESCRIPTION

CERIA Laboratory. LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment. W. LITWIN. T. SCHWARZ. H.YAKOUBEN. Paris Dauphine University [email protected]. Santa Clara University (USA) [email protected]. Paris Dauphine University [email protected]. - PowerPoint PPT Presentation

Citation preview

Page 1: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

LH*RSP2P: A Scalable Distributed

Data Structure for P2P Environment

W. LITWIN

CERIA Laboratory

H.YAKOUBENParis Dauphine University

[email protected] Dauphine University

[email protected]

T. SCHWARZSanta Clara University (USA)

[email protected]

Page 2: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 2

PlanObjective

Overview: SDDS & P2P

LH*RSP2P

Architecture Addressing Properties Churn Management

Conclusion

Page 3: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 3

Objective

High availabilityto deal with churn

At most one forwarding message for key search or insert or scan

(fastest known performance)

Very Large Scalable Files

Page 4: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

A File of records identified by keys SDDS client nodes face the applications

and send queries to SDDS server nodes No centralized addressing Servers contain application or parity data

In buckets Overflowing servers split on new servers Servers do not notify clients about splits

SDDS (1993)

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 4

Page 5: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

Clients use images of the file state for addressing Key based Range queries Scans …

Images get adjusted towards the file state during queries by Image Adjustment Messages Triggered by incorrect addressing by the client

IAMs reflect the file evolution by splits or, rarely, merges.

IAMs reflect also the location changes because of failures and recovery

SDDS (1993)

5LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Page 6: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

6

SDDS Typology

LH*sa

SDDS(1993)

Data Structures

Classics

Tree

m-d Tree1-d Tree

RP*, k-RP*, SD-Rtree, DRT*, LH*, DDH, EH*,

Hash

High Availability

1-dimensional d-dimensional

IH*…

LH*rs LH*sAlg. Sign…

k-Availability Security

LH*m LH*g

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Structured P2P Schemes

CHORD... BATON VBI-Tree

Page 7: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

New Peer

7LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Clients

Growth through splits under inserts

Peer

SDDS Expansion

Page 8: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

8LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Clients

SDDS Client Image Evolution

Image Adjustm

ent Messagin

g

Page 9: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

Available at CERIA site Announced at DbWorld Managing LH* RS and RP* files

LH* implementaton based on LH*LH n scheme J. Karlsson’s Thesis & EDBT paper

In distributed RAM Under Windows Over 1 gbs Ethernet

Various functions Response time reaching 30 microsec Up to 300 times faster than disk files

SDDS 2007 Prototypes

9

Page 10: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

Autonomous nodes store and search data By flooding in early systems

Freenet, Napster, Gnutella… Structured P2P reduce the flooding

Using decentralized data structures Distributed Hash Table (DHT) especially

Few folks know the concept is due to B. Devine FODO 93

Chord, P-tree, VBI, Baton… Structured P2P schemes are specific SDDS schemes

P2P (1995 ?)

10

Page 11: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

Architecture based on LH*RS

ACM –TODS, Oct. 05)

11

LH*RSP2P

j

i’n’

Client Part

Server Part

LH*P2P Peer

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

LH*RSP2P Peer

LH*RS

ClientLH*RS

DB

Candidate Peer

Candidate Peer

Client &Spare

Storage

LH*RSP2P Peer

LH*RS

ClientLH*RS

PB

Pupils

Pupil’s IP = Its key C

Page 12: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

12

Client Address Calculus

a’ hi’(C ) ; /* a’ is the address of peer destination of the key C*/

if a’ < n’ then a hi’+1(C ) ;

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Global Addressing Rulea hi(C ) ; /* a is the address of peer destination of the key C*/

if a < n then a hi+1(C ) ; /* (i, n) state of an SDDS file, they are only known to the

file coordinator node

LH*RSP2P Addressing

hi (C ) = C mod 2i

Page 13: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

File starts with i = 0 and n = 0 and a single data bucket 0

Every bucket keeps the bucket level j of hash function hi last used to split, j = 0 initially.

Overflowing bucket m alerts the coordinator Coordinator notifies bucket n to split

Usually n ≠ m It sends the address of the pupil ready for the new

bucket n + 2i

It also sends the addresses of the buckets that have been created since bucket n last split

LH*RSP2P File Expansion

13

Page 14: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

Bucket n applies hi + 1 About half of keys migrates to new bucket n + 2i

About half of pupils migrate as well Bucket n and the new one set j = j + 1 Coordinator performs

n = n + 1 if n = 2i then i = i + 1 and n = 0 Resulting address space growth, starting from i = 0

and n = 0: 0 1 0 2, 1 3 ; 0 4, 1 5, 2 6, 3 7 0 8, 1 9, 2 10, 3 11, 412, 513…7 15

LH*RSP2P File Expansion

14

Page 15: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

15

i’ = j - 1 ; /* j value after the splitn‘ = a + 1  /* a is the splitting bucket ; n = a + 1 if n’ = 2i’ then i’ = j + 1 ;  n’ = 0 ;

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer & Pupil Image Adjustment After Peer Split

The adjustment concerns

Splitting peer a New peer a’ = a + 2i Every pupil of a and of a’

Page 16: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

16

Before splittingCoordinator Peer (CP)

P0

j=2

i’=1n’=1

P2P1i=1n=1

j=1

i’=1n’=0

j=2

i’=1n’=1

After splitting

j=2

i’=2n’=0

CP

P0

j=2

i’=1n’=1

P2P1i=2n=0

j=2

i’=2n’=0

j=2

i’=1n’=1

P3

i’= j =1;n’= m+1= 1+1;

If n’=21 then n’=0; i’= i’+1and

(i’, n’)= (2,0)

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Example

Page 17: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

17LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Server Address Calculus

a’ hj (C ) ;

if a’= a then exit /* Bucket a is the correct one

else send C to bucket a /* Forwarding to bucket a’exit;

Simpler and faster than for LH*

As only one forwarding is possible

Page 18: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

18

i’ j - 1, n’ a + 1 ;if n’ >2i’ then n’ 0 ; i’ i’ + 1 ;

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer Image Adjustment by IAM

IAM comes from the correct bucket

Bucket a is the forwarding one

Bucket level j is that of the correct bucket

0f the forwarding one as well

• Same algorithm as for the adjustment of the local client and of pupils after a split

Page 19: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

19LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

PC

i =3n=2

P0

j=4

i’=3n’=1

Pairs

P4

j=3

i’=2n’=1

P9

j=4

i’=3n’=2

P1

j=4

i’=3n’=2

9

9 Checking and

forward the key using A2

Peer Image Adjustment by IAM

IAMa = 1j = 4

Page 20: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

20LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

PC

i =3n=2

P0

j=4

i’=3n’=1

Pairs

P4

j=3

i’= 3n’= 2

P9

j=4

i’=3n’=2

P1

j=4

i’=3n’=2

9

9

Peer Image Adjustment by IAM

IAMa = 1j = 4

Page 21: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

21

Example of the File Expansion

PC

i=2n=2

P0

j=3

i’=2n’=1

Peers

P2

j=2

i’=1n’=1

P5

j=3

i’=2n’=2

Candidate Peer

i’=0n’=0

P6

j=3

i’=2n’=3

i=2

n=3

i’=2n’=3

j=3

Pupil

i’=2n’=1

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Assign a Tutor for Candidate Peer: LH-hash of its IP

Address

TUTOR, Update Pupil

LH*RSP2P

Page 22: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

22

1. The maximal number of forwarding messages for the key search is one.

2. The maximal number of rounds for the scan search can be two.

3. The worst case addressing performance of LH*RS

P2P as defined by Property 1 is the fastest possible for any SDDS or a practical structured P2P addressing scheme.

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Properties of LH*RSP2P :

Page 23: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

23LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Proof Property 1

n n+2i’2i’a’0 a

Case 1 : i’ = i and n’ < n Peer a addresses peer a’, using its image (i’,n’) from last split

No IAM came since.

j = i’+1 j = i’

No forwardinga+2i’

Page 24: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

24LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Forwarding possible for any address a’ between (a, n)

Case 1 : i’ = i and n’ < n Peer a addresses peer a’, using its image (i’,n’) from last split

No IAM came since.

n n+2i’2i’

a’0 a

j = i’+1 j = i’

a+2i’

Proof Property 1

Page 25: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

25LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Forwarding possible for any address a’ beyond [n, a]

Case 2 : i = i’ + 1 and n < n’ Peer a addresses peer a’, using its image (i’,n’) from last split

No IAM came since.

n n+2i’+1 2i’

a’0 a

j = i’+2j = i’+1j = i’

2i’+1

Proof Property 1

Page 26: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

26LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer a sends the scan to all buckets in its image• Including its image (i’, n’)

• Receiving peer a’ can have bucket level j as in the image

• j (a) = j’ (a)• No forwarding of the scan

• Or, bucket a’ split• Once and only once

• j (a) = j’ (a) + 1• See the figs for the key address calculus

Proof Property 2

Page 27: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

27LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer a’ forwards the scan to its (only) child• No child can have a child• Peer a would first need to split again as

well

•Every peer gets thus the scan and only once

•There at worst two rounds

Proof Property 2

Page 28: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

28LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

•The only faster worst case performance is zero forwarding messages

•Every split has to be notified then to every peer

•It would be against the scalability goal of every SDDS & structured P2P scheme

Proof Property 2

Page 29: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

29LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

LH*RSP2P Churn Management

Bucket reliability group with k parity buckets protect against up to k bucket failures per group

••••

•••••

••••• •

•••••

54321

0

•••••

•••••

Parity PeerData PeerRank

Parity RecordData Record

Tutoring records

Page 30: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

30LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer leaves with notice

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’

P0 PmPl Candidate Peer

Notification

… … Say

that’s OK

LH*RSP2P Churn Management

Page 31: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

31LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer leaves without notice or fails

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’Pl-1 PmPl Parity Peer

Query

Forward

LH*RS Bucket Recovery

LH*RSP2P Churn Management

Page 32: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

32LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Peer leaves without notice or fails

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’Pl-1 Pm

Pl

Parity Peer

Answer

LH*RS Bucket Recovery

LH*RSP2P Churn Management

Page 33: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

33LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Sure Search : Protects against outdated server read (transient communication or peer failure)

Coordinator Peer

j

i’,n’

j

i’,n’

j

i’,n’

i’,n’Pl-1 PmPl Parity Peer

Query

Answer

j

i’,n’

Pl

LH*RSP2P Churn Management

Page 34: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

34

Conclusion LH*RS

P2P require at most one forward message when addressing error occur

Is the fastest known SDDS and P2P key based addressing algorithm

Protects efficiently against churn

Allows to manage very large scalable files Should have numerous applications Google ??

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Page 35: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

35

Current & Future Work

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Implementation of peer node architecture and of tutoring functionsUsing existing LH*RS prototypeCreated by Rim Moussa & shown at VLDB

2004 Performance Analysis Variants

Page 36: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

36

END

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Work partly funded by the IST eGov-Bus project

Thank you for Your Attention

Page 37: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

37

[1] Adina Crainiceanu, Prakash Linga, Johannes Gehrke, and Jayavel Shanmugasundaram. Querying Peer-to-Peer Networks Using P-Trees. In Proceedings of the Seventh International Workshop on the Web and Databases (WebDB 2004). , June 2004.

[2] Bolosky W. J, Douceur J. R, Howell J. The Farsite Project: A Retrospective. Operating System Review, April 2007, p.17-26

[3] Devine R. Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm, Proc. Of the 4th Intl. Foundation of Data Organisation and Algorithms –FODO, 1993.

[4] Litwin, W. Neimat, M-A., Schneider, D. LH*: Linear Hashing for Distributed Files. ACM-SIGMOD Int. Conf. On Management of Data, 93.

[5] Litwin, W., Neimat, M-A., Schneider, D. LH*: A Scalable Distributed Data Structure. ACM-TODS, (Dec., 1996).

[6] Litwin, W., Neimat, M-A. High Availability LH* Schemes with Mirroring, Intl. Conf on Cooperating systems, , IEEE Press 1996.

[7] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Distributed Data Storage. Proc of 30th VLDB Conference, , 2004.

[8] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Scalable Distributed Data Structure. ACM-TODS, Sept 2005.

[9] Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David Culler. Scalable, Distributed Data Structures for Internet Service Construction, Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000)

[10] Stoica, Morris, Karger, Kaashoek, Balakrishma. CHORD : A scalable Peer to Peer Lookup Service for Internet Application. SIGCOMM’O, August 27-31, 2001,

References

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Page 38: LH* RS P2P :  A Scalable Distributed Data Structure  for P2P Environment

38LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment