Data Currency in Replicated DHTs. Reza Akbarinia, Esther Pacitti and Patrick Valduriez. University of Nantes, France, INRIA. ACM SIGMOD 2007. Presenter: Jerry Wu

Page 1: Data Currency in Replicated DHTs

Data Currency in Replicated DHTs

Reza Akbarinia, Esther Pacitti and Patrick Valduriez

University of Nantes, France, INRIA

ACM SIGMOD 2007

Presenter Jerry Wu

Page 2: Data Currency in Replicated DHTs

Motivation

• P2P data sharing systems
– Enable a large number of users to share a massive number of files
– Query, reply, send request, download
• Message forwarding on these systems
– Flooding: KaZaA, Gnutella
– DHT: CAN, Chord, Pastry, etc.

Page 3: Data Currency in Replicated DHTs

Distributed Hash Table (DHT)

• Use hash functions to locate files
– h(meta data) = k (for identification)
– g(k) = k1 (for routing)

[Figure: peers A, B, C, D, E, F and a user U. The metadata of FreeLoop.mp3 is routed with g(k) = k1, which is held by peer A.]

Page 4: Data Currency in Replicated DHTs

Data Replication

• What if node A fails?
• Duplicate several copies

[Figure: peers A, B, C, D, E, F and a user U. The metadata of FreeLoop.mp3 is stored under three keys: g(h(FreeLoop.mp3)) = k1 at peer A, g2(h(FreeLoop.mp3)) = k2 at peer D, and g3(h(FreeLoop.mp3)) = k3 at peer E.]

Page 5: Data Currency in Replicated DHTs

Basic Operations

• put_H(meta key k, File D)
– Insert a file into the DHT
• get_H(meta key k)
– Retrieve the file from the DHT

H: the set of hash functions g used to place replicas
|H|: the replication level of the system

Each file is stored at |H| peers
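
The put/get operations above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class name `ReplicatedDHT`, the salted SHA-256 stand-in for the hash functions g in H, and the list of dictionaries standing in for peers are all hypothetical and not taken from the paper.

```python
import hashlib


class ReplicatedDHT:
    """Toy in-memory stand-in for a DHT with |H| replication hash functions."""

    def __init__(self, num_peers, replication_level):
        # Each peer is modelled as a plain dict (hypothetical simplification).
        self.peers = [dict() for _ in range(num_peers)]
        self.replication_level = replication_level  # |H|

    def _responsible_peer(self, key, i):
        # Hypothetical i-th hash function: salt the key with i, map to a peer index.
        digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
        return int(digest, 16) % len(self.peers)

    def put(self, key, data):
        # Store one copy per hash function. Two functions may collide on the
        # same peer in this toy model, which a real DHT would try to avoid.
        for i in range(self.replication_level):
            self.peers[self._responsible_peer(key, i)][key] = data

    def get(self, key, i=0):
        # Read the copy placed by the i-th hash function, or None if absent.
        return self.peers[self._responsible_peer(key, i)].get(key)
```

If the peer reached through one hash function is down, a reader can retry with another index `i`, which is the point of keeping |H| copies.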

Page 6: Data Currency in Replicated DHTs

Additional Problems

• If the owner can modify the data …

• The nature of P2P systems
– Peers can join and leave dynamically

• What if an update happens while some peers are away and they rejoin later?

• Concurrent updates?

Page 7: Data Currency in Replicated DHTs

Solution

• What if we have a timestamp for each update/insert transaction?
– The currency of the file is judged by its timestamp
– FileX = File + timestamp
– Put (k, FileX) instead of (k, File) into the DHT
• Then we know the freshness of the file
• Only the latest update can succeed

Page 8: Data Currency in Replicated DHTs

How Can We Get A Timestamp?

• KTS (Key-based Timestamp Service)
– Issues a timestamp for each transaction
– gen_ts(key k)
  • Generates a timestamp w.r.t. key k
– last_ts(key k)
  • Returns the most recently issued timestamp for key k
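
A minimal sketch of the two KTS calls, assuming timestamps are per-key integer counters and that 0 means "nothing issued yet". The class name and the single-process dictionary are illustrative assumptions; the slides only name `gen_ts` and `last_ts`.

```python
from collections import defaultdict


class KeyBasedTimestampService:
    """Single-process stand-in for KTS: one monotonic counter per key."""

    def __init__(self):
        # Assumption: a counter value of 0 means no timestamp was issued yet.
        self._counters = defaultdict(int)

    def gen_ts(self, key):
        # Issue a new timestamp that is larger than any earlier one for this key.
        self._counters[key] += 1
        return self._counters[key]

    def last_ts(self, key):
        # Report the most recently issued timestamp without issuing a new one.
        return self._counters.get(key, 0)
```

Because each key has its own counter, timestamps are only comparable between updates of the same key, which is all the currency check needs.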

Page 9: Data Currency in Replicated DHTs

The New DHT Functions

• Based on the KTS service
• Insert(key k, FileX D, Hash function set Hr)
– Insert or update a file with identity key k into the DHT
• Retrieve(k, Hr)
– Retrieve the latest copy of the file with identity key k
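
The two functions can be sketched as follows. This is a simplified model, not the authors' algorithm: `replicas` is a hypothetical list of per-peer dictionaries (with `None` marking an unreachable peer), `counters` stands in for the KTS, and returning the newest available copy when no current one is found is an assumption of this sketch.

```python
def insert(key, data, replicas, counters):
    """Stamp data with a fresh timestamp and write it to every reachable replica."""
    counters[key] = counters.get(key, 0) + 1  # stands in for gen_ts(key)
    ts = counters[key]
    for store in replicas:
        if store is not None:  # None models a peer that is currently unreachable
            store[key] = (ts, data)
    return ts


def retrieve(key, replicas, counters):
    """Return the first replica whose timestamp equals last_ts(key)."""
    latest = counters.get(key, 0)  # stands in for last_ts(key)
    newest = None
    for store in replicas:
        if store is None or key not in store:
            continue
        ts, data = store[key]
        if ts == latest:
            return ts, data  # current copy found, stop probing replicas
        if newest is None or ts > newest[0]:
            newest = (ts, data)
    # Assumption: fall back to the newest copy seen when none is current.
    return newest
```

Retrieval stops at the first current copy, so it usually touches fewer than all replicas.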

Page 10: Data Currency in Replicated DHTs

Insert A File

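[Figure: a ring of peers and a user U inserting P.avi. The key is computed as h(P.avi) = k, the KTS timestamp service returns gen_ts(k) = tA, and the file is written with put_g(k, (tA, P.avi)) and put_g2(k, (tA, P.avi)). g(k) = k1 is held by peer A and g2(k) = k2 by peer C.]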

Page 11: Data Currency in Replicated DHTs

Retrieve A File

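[Figure: a ring of peers and a user U retrieving P.avi. The key is computed as h(P.avi) = k, the KTS timestamp service returns last_ts(k) = tA, and the replicas are read with get_g(k) and get_g2(k), where g(k) = k1 is held by peer A and g2(k) = k2 by peer C. The replies are (t0, P.avi) and (tA, P.avi); the copy stamped tA matches last_ts(k).]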

Page 12: Data Currency in Replicated DHTs

Update A File

• put_g(k, (tsx, File D))
• If (tsx > ts0) then
– Update File D

Key   TS    File
k     ts0   File D (P.avi)
k1    ts1   File D1 (X.mp3)
k2    ts2   File D2 (Y.m4v)
k3    ts3   File D3 (Z.tar)
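
The update rule on the receiving peer can be sketched as below. The function name `put_if_newer` and the dictionary used as the peer's local table are hypothetical; the comparison itself is the `tsx > ts0` check from the slide.

```python
def put_if_newer(table, key, ts, data):
    """Apply an update only if its timestamp is newer than the stored one."""
    current = table.get(key)
    if current is None or ts > current[0]:
        table[key] = (ts, data)
        return True
    # A stale or duplicate update is ignored, so only the latest one succeeds.
    return False
```

With this check, two concurrent updates can arrive in either order and the replica still ends up holding the one with the larger timestamp.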

Page 13: Data Currency in Replicated DHTs

Retrieval Cost Analysis

• $C = C_{kts} + N \cdot C_{ret}$
• $C_{kts} = C_{ret} = O(\log n)$, where n = # of peers
• N: number of retries to get the latest copy
• Let X be the random variable of N
• $p_t$: the probability of finding a fresh copy
• $\mathrm{Prob}(X = i) = p_t (1 - p_t)^{i-1}$
• $|H_r|$ = number of replicas of the system
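
To get a feel for the cost formula, the expected number of replica retrievals can be computed numerically. This sketch assumes that probing stops after at most |Hr| replicas, so the geometric distribution above is truncated at |Hr|; that truncation and the function name are assumptions of this example, not statements from the slides.

```python
def expected_retrievals(p_t, replicas):
    """Expected number of replicas probed when each is fresh with probability p_t.

    Assumes probing stops at the first fresh copy or after `replicas` probes.
    """
    expected = 0.0
    # A fresh copy is first found at probe i (for i < replicas).
    for i in range(1, replicas):
        expected += i * p_t * (1 - p_t) ** (i - 1)
    # Otherwise all `replicas` probes are used (the first replicas-1 were stale).
    expected += replicas * (1 - p_t) ** (replicas - 1)
    return expected
```

Under this assumption the expectation equals (1 - (1 - p_t)^|Hr|) / p_t, which stays below both 1 / p_t and |Hr|, so the total cost C remains O(log n) for a fixed p_t.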

Page 14: Data Currency in Replicated DHTs

Retrieval Cost Analysis

• Then, how can we get a timestamp?
– Key-based Timestamp Service (KTS)

Page 15: Data Currency in Replicated DHTs

The KTS Service

• Use the same DHT but with a different hash function hts

[Figure: steps 1 to 4. TimeStampRequest(k) is routed through the DHT hash tables to peer p, where Req(k, hts) = p.]

Page 16: Data Currency in Replicated DHTs

The KTS Service

• How can node p generate timestamps w.r.t. key k?
– Receive the counters from a leaving peer
  • The DHT system distributes the load of the leaving peer to its neighbors
  • Direct initialization
– Send a file request w.r.t. key k to obtain the latest timestamp
  • Takes place if the leaving peer fails
  • Indirect initialization

Page 17: Data Currency in Replicated DHTs

The KTS Service

• Indirect initialization
– The probability of failure is $p_f$
– $p_f = (1 - p_t)^{|H|}$
– If $p_t = 30\%$ and $|H| = 13$, then $p_f < 1\%$

• After initialization, increase the timestamp on every timestamp request
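
The failure probability of indirect initialization can be checked with a few lines. The function name is hypothetical; the formula is the one on the slide, which treats each of the |H| replicas as independently current with probability p_t.

```python
def indirect_init_failure_probability(p_t, replication_level):
    """Probability that none of the |H| replicas holds the latest timestamp."""
    return (1 - p_t) ** replication_level


# The slide's example: p_t = 30% and |H| = 13.
p_f = indirect_init_failure_probability(0.3, 13)
print(f"p_f = {p_f:.4f}")
```

For the slide's values this gives roughly 0.0097, which is consistent with the stated bound of less than 1%.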

Page 18: Data Currency in Replicated DHTs

Experiments And Simulations

• Environments
– 64-node cluster
– 10,000 nodes on the SimJava platform

• Metrics
– Response time: time to return a current replica in response to a query
– Communication cost: # of messages sent to answer a query

Page 19: Data Currency in Replicated DHTs

The Competitor - BRICKS

• Uses a function to map key k to multiple keys (k1, k2, k3, k4, …)

• Each replica has a version number
– Concurrent update problems
– Must retrieve all replicas to find the newest one

Page 20: Data Currency in Replicated DHTs

Response Time VS DHT Size

Page 21: Data Currency in Replicated DHTs

Communication Cost VS DHT Size

Page 22: Data Currency in Replicated DHTs

Response Time VS # of Replica

Page 23: Data Currency in Replicated DHTs

Failure Rate VS Response Time

Page 24: Data Currency in Replicated DHTs

Conclusion

• Pros
– Using the DHT itself to provide the timestamp service is smart!
– Considers the concurrent update problem
– Easy to apply to existing DHTs

• Cons
– The KTS service can add communication overhead

Page 25: Data Currency in Replicated DHTs

Thank You