Reconciling Differences: towards a theory of cloud complexity
George Varghese, UCSD, visiting at Yahoo! Labs



Page 1: Reconciling Differences: towards a theory of cloud complexity

Reconciling Differences: towards a theory of cloud complexity

George Varghese, UCSD, visiting at Yahoo! Labs

1

Page 2: Reconciling Differences: towards a theory of cloud complexity

2

Part 1: Reconciling Sets across a link

Joint with D. Eppstein, M. Goodrich, F. Uyeda

Appeared in SIGCOMM 2011

Page 3: Reconciling Differences: towards a theory of cloud complexity

3

Motivation 1: OSPF Routing (1990)

• After the partition forms and heals, R1 needs the updates that arrived at R2 during the partition.

[Figure: routers R1 and R2 reconnect after the partition heals.]

Must solve the Set-Difference Problem!

Page 4: Reconciling Differences: towards a theory of cloud complexity

4

Motivation 2: Amazon S3 storage (2007)

• Synchronizing replicas.

[Figure: storage replicas S1 and S2 run a periodic anti-entropy protocol between them.]

Set-Difference across the cloud again!

Page 5: Reconciling Differences: towards a theory of cloud complexity

5

What is the Set-Difference problem?

• What objects are unique to Host 1?
• What objects are unique to Host 2?

[Figure: Host 1 and Host 2 each hold a set of objects; most of the objects are common to both.]
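The problem in miniature, as a tiny Python sketch (the two sets below are illustrative, not the exact sets in the figure):

    # Each host wants exactly the elements the other one is missing.
    # Shipping both sets wholesale answers the question trivially; the point of
    # this talk is to answer it with communication proportional to the difference.
    host1 = {"A", "B", "C", "E", "F"}
    host2 = {"A", "C", "D", "F"}
    print(sorted(host1 - host2))   # ['B', 'E']  unique to Host 1
    print(sorted(host2 - host1))   # ['D']       unique to Host 2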

Page 6: Reconciling Differences: towards a theory of cloud complexity

6

Use case 1: Data Synchronization

• Identify the missing data blocks.
• Transfer blocks to synchronize the sets.

[Figure: after reconciliation, each host fetches the blocks it is missing from the other, so both hold the same set.]

Page 7: Reconciling Differences: towards a theory of cloud complexity

7

Use case 2: Data De-duplication

• Identify all unique blocks.
• Replace duplicate data with pointers.

[Figure: Host 1 and Host 2 keep a single copy of each shared block and point to it instead of storing duplicates.]

Page 8: Reconciling Differences: towards a theory of cloud complexity

8

Prior work versus ours

• Trade a sorted list of keys.
  – Let n be the size of the sets and U the size of the key space.
  – O(n log U) communication, O(n log n) computation.
  – Bloom filters can improve this to O(n) communication.

• Polynomial encodings (Minsky, Trachtenberg)
  – Let d be the size of the difference.
  – O(d log U) communication, O(dn + d^3) computation.

• Invertible Bloom Filter (our result)
  – O(d log U) communication, O(n + d) computation.

Page 9: Reconciling Differences: towards a theory of cloud complexity

9

Difference Digests

• Efficiently solves the set-difference problem.
• Consists of two data structures:
  – Invertible Bloom Filter (IBF)
    • Efficiently computes the set difference.
    • Needs the size of the difference.
  – Strata Estimator
    • Approximates the size of the set difference.
    • Uses IBFs as a building block.

Page 10: Reconciling Differences: towards a theory of cloud complexity

10

IBFs: main idea

• Sum over random subsets: summarize a set by “checksums” over O(d) random subsets.
• Subtract: exchange and subtract the checksums.
• Eliminate: hashing chooses the subsets, so common elements disappear after subtraction.
• Invert fast: O(d) equations in d unknowns; randomness allows expected O(d) inversion.

Page 11: Reconciling Differences: towards a theory of cloud complexity

11

“Checksum” details

• Array of IBF cells whose fields form the “checksum” words.
  – For a set difference of size d, use αd cells (α > 1).
• Each element ID is assigned to many IBF cells.
• Each cell contains:
  – idSum: XOR of all IDs assigned to the cell
  – hashSum: XOR of hash(ID) for every ID assigned to the cell
  – count: number of IDs assigned to the cell

Page 12: Reconciling Differences: towards a theory of cloud complexity

12

IBF Encode

[Figure: element A is hashed by Hash1, Hash2, Hash3 to three of the αd IBF cells; in each of those cells, idSum ⊕= A, hashSum ⊕= H(A), and count++. Elements B and C are added the same way.]

• Assign each ID to many cells and “add” it to each of them.
• Only αd cells, not O(n) as in a Bloom filter!
• All hosts use the same hash functions.
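A minimal encode sketch in Python, for concreteness (the cell count, hash count, and hash construction below are illustrative assumptions, not the paper's exact parameters):

    import hashlib

    K = 3            # number of hash functions (assumed)
    NUM_CELLS = 40   # roughly alpha * d cells (assumed)

    def h(x, salt):
        # Hash an ID with a salt; used both for cell choice and for hashSum.
        return int.from_bytes(hashlib.sha256(f"{salt}:{x}".encode()).digest()[:8], "big")

    def cell_indices(ident):
        # The K cells an ID is assigned to (the same on every host).
        return [h(ident, i) % NUM_CELLS for i in range(K)]

    def new_ibf():
        return [{"idSum": 0, "hashSum": 0, "count": 0} for _ in range(NUM_CELLS)]

    def ibf_add(ibf, ident):
        # "Add" an integer ID to each of its cells.
        for j in cell_indices(ident):
            ibf[j]["idSum"] ^= ident
            ibf[j]["hashSum"] ^= h(ident, "check")
            ibf[j]["count"] += 1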

Page 13: Reconciling Differences: towards a theory of cloud complexity

13

Invertible Bloom Filters (IBF)

• Trade IBFs with the remote host.

[Figure: Host 1 summarizes its set into IBF 1, Host 2 into IBF 2, and the two hosts exchange them.]

Page 14: Reconciling Differences: towards a theory of cloud complexity

14

Invertible Bloom Filters (IBF)

• “Subtract” the IBF structures.
  – Produces a new IBF containing only the unique objects.

[Figure: subtracting IBF 1 from IBF 2 yields IBF (2 − 1).]

Page 15: Reconciling Differences: towards a theory of cloud complexity

15

IBF Subtract

Page 16: Reconciling Differences: towards a theory of cloud complexity

Disappearing act

• After subtraction, elements common to both sets disappear because:
  – Any common element (e.g., W) is assigned to the same cells on both hosts (same hash functions on both sides).
  – On subtraction, W XOR W = 0, so W vanishes.
• The elements in the set difference remain, but they may be randomly mixed together, so we need a decode procedure.
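A subtraction sketch in Python, reusing the IBF helpers from the encode sketch above (same assumed cell layout):

    def ibf_subtract(ibf_a, ibf_b):
        # Cell-wise XOR of idSum and hashSum, and difference of counts.
        # Common elements hit the same cells on both sides, so X ^ X = 0 and they vanish.
        out = new_ibf()
        for j in range(NUM_CELLS):
            out[j]["idSum"]   = ibf_a[j]["idSum"]   ^ ibf_b[j]["idSum"]
            out[j]["hashSum"] = ibf_a[j]["hashSum"] ^ ibf_b[j]["hashSum"]
            out[j]["count"]   = ibf_a[j]["count"]   - ibf_b[j]["count"]
        return out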

16

Page 17: Reconciling Differences: towards a theory of cloud complexity

17

IBF Decode

Test for purity: a cell is pure if H(idSum) = hashSum.
• A pure cell holds exactly one remaining element V: idSum = V, hashSum = H(V), and the test H(V) = H(V) passes.
• A mixed cell holding, say, V, X, and Z has idSum = V ⊕ X ⊕ Z and hashSum = H(V) ⊕ H(X) ⊕ H(Z); since H(V ⊕ X ⊕ Z) ≠ H(V) ⊕ H(X) ⊕ H(Z) with high probability, the test fails.
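A peeling decode sketch in Python, again reusing the helpers from the encode and subtract sketches (the purity test is H(idSum) = hashSum with count = ±1):

    def ibf_decode(ibf):
        # Repeatedly find a pure cell, recover its ID, and peel it out of its other cells.
        a_minus_b, b_minus_a = [], []
        pure = [j for j in range(NUM_CELLS)
                if ibf[j]["count"] in (1, -1) and h(ibf[j]["idSum"], "check") == ibf[j]["hashSum"]]
        while pure:
            j = pure.pop()
            c = ibf[j]
            if c["count"] not in (1, -1) or h(c["idSum"], "check") != c["hashSum"]:
                continue                      # no longer pure; skip
            ident, side = c["idSum"], c["count"]
            (a_minus_b if side == 1 else b_minus_a).append(ident)
            for k in cell_indices(ident):     # remove the recovered ID everywhere
                ibf[k]["idSum"] ^= ident
                ibf[k]["hashSum"] ^= h(ident, "check")
                ibf[k]["count"] -= side
                if ibf[k]["count"] in (1, -1) and h(ibf[k]["idSum"], "check") == ibf[k]["hashSum"]:
                    pure.append(k)
        ok = all(c["count"] == 0 and c["idSum"] == 0 for c in ibf)
        return a_minus_b, b_minus_a, ok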

Page 18: Reconciling Differences: towards a theory of cloud complexity

18

IBF Decode

Page 19: Reconciling Differences: towards a theory of cloud complexity

19

IBF Decode

Page 20: Reconciling Differences: towards a theory of cloud complexity

20

IBF Decode

Page 21: Reconciling Differences: towards a theory of cloud complexity

21

How many IBF cells?

[Plot: space overhead α (y-axis) versus set difference (x-axis), for hash counts 3 and 4; overhead needed to decode at >99%. Small differences need 1.4x – 2.3x overhead; large differences need only 1.25x – 1.4x.]

Page 22: Reconciling Differences: towards a theory of cloud complexity

How many hash functions?

• 1 hash function produces many pure cells initially, but there is nothing to undo when an element is removed.

[Figure: with one hash function, A, B, and C each land in a single cell; a collision can never be resolved.]

Page 23: Reconciling Differences: towards a theory of cloud complexity

How many hash functions?

• 1 hash function produces many pure cells initially, but there is nothing to undo when an element is removed.
• Many (say 10) hash functions: too many collisions.

[Figure: with many hash functions, copies of A, B, and C pile up in most cells.]

Page 24: Reconciling Differences: towards a theory of cloud complexity

How many hash functions?

• 1 hash function produces many pure cells initially, but there is nothing to undo when an element is removed.
• Many (say 10) hash functions: too many collisions.
• We find by experiment that 3 or 4 hash functions work well. Is there some theoretical reason?

[Figure: with 3 hash functions, peeling a pure cell frees up further cells to decode.]

Page 25: Reconciling Differences: towards a theory of cloud complexity

Theory

• Let d = difference size, k = # hash functions.
• Theorem 1: With (k + 1)·d cells, the failure probability falls exponentially with k.
  – For k = 3, this implies a 4x tax on storage; a bit weak.
• [Goodrich, Mitzenmacher]: Failure is equivalent to finding a 2-core (loop) in a random hypergraph.
• Theorem 2: With c_k·d cells, the failure probability falls exponentially with k.
  – c_4 gives a 1.3x tax, which agrees with the experiments.

25

Page 26: Reconciling Differences: towards a theory of cloud complexity

26

Recall experiments

[Plot repeated from before: space overhead versus set difference for hash counts 3 and 4, overhead to decode at >99%; large differences need only 1.25x – 1.4x.]

Page 27: Reconciling Differences: towards a theory of cloud complexity

Connection to Coding

• Mystery: IBF decode is similar to the peeling procedure used to decode Tornado codes. Why?
• Explanation: set difference is equivalent to coding over insertion/deletion channels.
• Intuition: given a code for set A, send only the check words to B. Think of B as a corrupted form of A.
• Reduction: if the code can correct d insertions/deletions, then B can recover A, and hence the set difference.

Reed-Solomon <---> Polynomial methods
LDPC (Tornado) <---> Difference Digests

Page 28: Reconciling Differences: towards a theory of cloud complexity

28

Random Subsets → Fast Elimination

[Figure: the αd cell equations form a sparse system, e.g. X + Y + Z = .., X = .., Y = .. ; the single-variable equations are the pure cells.]

Roughly upper triangular and sparse, so it can be solved by peeling.

Page 29: Reconciling Differences: towards a theory of cloud complexity

29

Difference Digests

• Consists of two data structures:
  – Invertible Bloom Filter (IBF)
    • Efficiently computes the set difference.
    • Needs the size of the difference.
  – Strata Estimator
    • Approximates the size of the set difference.
    • Uses IBFs as a building block.

Page 30: Reconciling Differences: towards a theory of cloud complexity

30

Strata Estimator

[Figure: consistent partitioning of the keys (A, B, C, ...) into strata containing roughly 1/2, 1/4, 1/8, 1/16, ... of the keys; each stratum is encoded into its own small IBF (IBF 1 … IBF 4), and together they form the estimator.]

• Divide the keys into sampled subsets containing ~1/2^k of the keys.
• Encode each subset into an IBF of small, fixed size.
  – log(n) IBFs of ~20 cells each.
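A construction sketch in Python (the stratum rule, trailing zero bits of a salted hash, and the stratum count are assumptions; it reuses the IBF helpers sketched earlier):

    NUM_STRATA = 32   # assumed

    def stratum(ident):
        # Stratum i holds the IDs whose hash has exactly i trailing zero bits,
        # i.e. roughly a 1/2^(i+1) fraction of all IDs.
        v = h(ident, "strata")
        i = 0
        while v & 1 == 0 and i < NUM_STRATA - 1:
            v >>= 1
            i += 1
        return i

    def build_estimator(ids):
        est = [new_ibf() for _ in range(NUM_STRATA)]
        for ident in ids:
            ibf_add(est[stratum(ident)], ident)
        return est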

Page 31: Reconciling Differences: towards a theory of cloud complexity

31

Strata Estimator

[Figure: Host 1's estimator (IBF 1 … IBF 4) and Host 2's estimator (IBF 1 … IBF 4); matching strata are subtracted and decoded pairwise.]

• Attempt to subtract and decode the IBFs at each level.
• If level k decodes, then return: 2^k × (the number of IDs recovered).
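An estimation sketch in Python following the rule above (indexing and scaling match the stratum() helper, so stratum i is scaled by 2^(i+1); this is a simplification of the paper's estimator):

    def estimate_difference(est_local, est_remote):
        for i in range(NUM_STRATA):
            diff = ibf_subtract(est_local[i], est_remote[i])
            a, b, ok = ibf_decode(diff)
            if ok:
                # Stratum i samples ~1/2^(i+1) of the keys, so scale up the count.
                return (2 ** (i + 1)) * (len(a) + len(b))
        return None   # nothing decoded; a real implementation would fall back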

Page 32: Reconciling Differences: towards a theory of cloud complexity

32

KeyDiff Service

• Promising applications:
  – File synchronization
  – P2P file sharing
  – Failure recovery

[Figure: an Application on each host talks to its local Key Service; the Key Services synchronize with each other.]

API: Add(key), Remove(key), Diff(host1, host2)
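A hypothetical KeyDiff-style interface in Python mirroring the three calls on the slide (class and method names are illustrative, not an actual library API):

    class KeyDiffService:
        def __init__(self):
            self.keys = set()

        def add(self, key):        # Add(key)
            self.keys.add(key)

        def remove(self, key):     # Remove(key)
            self.keys.discard(key)

        def diff(self, other):     # Diff(host1, host2)
            # The real service would exchange a strata estimator and an IBF over
            # the network; here we just compute the answer locally to show intent.
            return self.keys - other.keys, other.keys - self.keys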

Page 33: Reconciling Differences: towards a theory of cloud complexity

33

Difference Digest Summary

• Strata Estimator
  – Estimates the size of the set difference.
  – For 100K-element sets, a 15KB estimator gives <15% error.
  – O(log n) communication, O(n) computation.
• Invertible Bloom Filter
  – Identifies all IDs in the set difference.
  – 16 to 28 bytes per ID in the set difference.
  – O(d) communication, O(n + d) computation.
  – Worth it if the set difference is < 20% of the set sizes.

Page 34: Reconciling Differences: towards a theory of cloud complexity

34

Connection to Sparse Recovery?

• If we forget about subtraction, in the end we are recovering a d-sparse vector.
• Note that the hash check is the key to figuring out which cells are pure after differencing.
• Is there a connection to compressed sensing? Could sensors do the random summing? The hash summing?
• Connection the other way: could compressed sensing be used for differences?

Page 35: Reconciling Differences: towards a theory of cloud complexity

35

Comparison with Information Theory and Coding

• Worst-case complexity versus average case.
• Information theory emphasizes communication complexity, not computation complexity; we focus on both.
• Existence versus construction: some similar settings (Slepian-Wolf) are existential.
• Estimators: we want bounds based on the difference, and so we start by efficiently estimating the difference.

Page 36: Reconciling Differences: towards a theory of cloud complexity

36

Aside: IBFs in Digital Hardware

[Figure: a stream of set elements (a, b, x, y) feeds read/hash/write logic; Hash 1, Hash 2, Hash 3 (plus a strata hash) index into separate memory banks Bank 1, Bank 2, Bank 3.]

Hash to separate banks for parallelism, at a slight cost in space. Decode in software.

Page 37: Reconciling Differences: towards a theory of cloud complexity

37

Part 2: Towards a theory of Cloud Complexity

[Figure: objects O1, O2, O3 held at different points in the cloud.]

Complexity of reconciling “similar” objects?

Page 38: Reconciling Differences: towards a theory of cloud complexity

38

Example: Synching Files

[Figure: three hosts hold versions X.ppt.v1, X.ppt.v2, X.ppt.v3 of the same file and want to synchronize.]

Measures: communication bits, computation.

Page 39: Reconciling Differences: towards a theory of cloud complexity

39

So far: Two sets, one link, set difference

{a,b,c} {d,a,c}

Page 40: Reconciling Differences: towards a theory of cloud complexity

40

Mild Sensitivity Analysis: one set much larger than the other

[Figure: Set A and Set B across a link, with a small difference d.]

Ω(|A|) bits needed, not O(d): Patrascu 2008. Simpler proof: DKS 2011.

Page 41: Reconciling Differences: towards a theory of cloud complexity

41

Asymmetric set difference in the LBFS File System (Mazieres)

[Figure: File A is split into chunks C1, C2, C3, ..., C97, C98, C99; the server holds chunk set B for File B, which differs in a single chunk (C5 in place of C2).]

LBFS sends all the chunk hashes in File A: O(|A|), even for a 1-chunk difference.

Page 42: Reconciling Differences: towards a theory of cloud complexity

42

More Sensitivity Analysis: small intersection (database joins)

[Figure: Set A and Set B with a small intersection d.]

Ω(|A|) bits needed, not O(d): follows from results on the hardness of set disjointness.

Page 43: Reconciling Differences: towards a theory of cloud complexity

43

Sequences under Edit Distance (files, for example)

[Figure: File A = A B C D E F and File B = A C D E F G, at edit distance 2.]

Insert/delete can renumber all the file blocks . . .

Page 44: Reconciling Differences: towards a theory of cloud complexity

44

Sequence reconciliation (with J. Ullman)

[Figure: File A = A B C D E F and File B = A C D E F, at edit distance 1; piece hashes H1, H2, H3 are exchanged, and only the unmatched piece differs.]

Send 2d + 1 piece hashes. Clump the unmatched pieces and recurse: O(d log N).
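A rough Python sketch of the recurse-on-unmatched idea (piece sizes, the hash, and the stopping rule are assumptions; this is not the paper's algorithm verbatim):

    import hashlib

    def pieces(data, k):
        # Split data into roughly k pieces.
        step = max(1, len(data) // k)
        return [data[i:i + step] for i in range(0, len(data), step)]

    def unmatched(a, b, d, min_len=16):
        # Return the parts of byte strings a and b that still differ after
        # repeatedly exchanging 2d+1 piece hashes and discarding matched pieces.
        if a == b:
            return b"", b""
        if len(a) <= min_len or len(b) <= min_len:
            return a, b
        pa, pb = pieces(a, 2 * d + 1), pieces(b, 2 * d + 1)
        ha = {hashlib.sha256(p).digest() for p in pa}
        hb = {hashlib.sha256(p).digest() for p in pb}
        ra = b"".join(p for p in pa if hashlib.sha256(p).digest() not in hb)
        rb = b"".join(p for p in pb if hashlib.sha256(p).digest() not in ha)
        if len(ra) == len(a) and len(rb) == len(b):
            return ra, rb                  # nothing matched; stop recursing
        return unmatched(ra, rb, d, min_len)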

Page 45: Reconciling Differences: towards a theory of cloud complexity

45

21 years of Sequence Reconciliation!

• Schwartz, Bowdidge, Burkhard (1990): recurse on the unmatched pieces, not on the aggregate.
• Rsync: a widely used tool that breaks the file into roughly √N piece hashes, where N is the file length.

Page 46: Reconciling Differences: towards a theory of cloud complexity

46

Sets on graphs?

[Figure: a graph whose four nodes hold the sets {a,b,c}, {d,c,e}, {b,c,d}, and {a,f,g}.]

Page 47: Reconciling Differences: towards a theory of cloud complexity

47

Generalizes rumor spreading, which has disjoint singleton sets

[Figure: nodes holding the singleton sets {a}, {d}, {b}, {g}.]

CLP10, G11: O(E n log n / conductance)

Page 48: Reconciling Differences: towards a theory of cloud complexity

48

Generalized Push-Pull (with N. Goyal and R. Kannan)

[Figure: a graph whose nodes hold sets such as {a,b,c}, {d,c,e}, {b,c,d}. In each round, pick a random edge and do two-party set reconciliation across it; a sketch of one round follows below.]

Complexity: C + D, with C as before and D = Σ_i |U − S_i| (U is the union of all the sets, S_i the set at node i).
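A one-round sketch in Python (the graph representation and the direct union are simplifications; the real protocol would reconcile the two endpoints with difference digests rather than merging locally):

    import random

    def push_pull_round(edges, node_sets):
        # Pick a random edge and reconcile the two endpoint sets so that both
        # ends hold their union afterwards.
        u, v = random.choice(edges)
        union = node_sets[u] | node_sets[v]
        node_sets[u] = set(union)
        node_sets[v] = set(union)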

Page 49: Reconciling Differences: towards a theory of cloud complexity

49

Sets on Steiner graphs?

[Figure: two terminals holding {a} ∪ S and {b} ∪ S, connected through a relay router R1 that holds no set.]

Only the terminals need the sets. Push-pull is wasteful!

Page 50: Reconciling Differences: towards a theory of cloud complexity

50

Butterfly example for Sets

[Figure: a butterfly network whose two sources hold S1 and S2 and whose two sinks X and Y each need both; the shared middle link carries D = Diff(S1, S2) instead of an XOR, and each sink combines D with the set it receives directly.]

Set difference instead of XOR within the network.

Page 51: Reconciling Differences: towards a theory of cloud complexity

51

How does reconciliation on Steiner graphs relate to network coding?

• Objects in general, not just bits.
• Routers do not need the objects themselves, but can transform/code them.
• What transformations within the network allow communication close to the lower bound?

Page 52: Reconciling Differences: towards a theory of cloud complexity

52

Sequences with d mutations: VM code pages (with Ramjee et al.)

[Figure: VM A's pages are A B C D E and VM B's are A X C D Y, i.e. 2 “errors”.]

Reconcile Set A = {(A,1),(B,2),(C,3),(D,4),(E,5)} and Set B = {(A,1),(X,2),(C,3),(D,4),(Y,5)}.
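The reduction on this slide in a few lines of Python (purely illustrative):

    def sequence_to_set(blocks):
        # A sequence with substitutions only becomes a set of (block, position)
        # pairs, so the Part 1 set-difference machinery applies directly.
        return {(blk, i) for i, blk in enumerate(blocks, start=1)}

    vm_a = sequence_to_set(["A", "B", "C", "D", "E"])
    vm_b = sequence_to_set(["A", "X", "C", "D", "Y"])
    print(sorted(vm_a ^ vm_b))   # the four differing (block, position) pairs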

Page 53: Reconciling Differences: towards a theory of cloud complexity

53

Twist: IBFs for error correction? (with M. Mitzenmacher)

• Write a message M[1..n] of n words as the set S = {(M[1],1), (M[2],2), ..., (M[n],n)}.
• Calculate IBF(S) and transmit M together with IBF(S).
• The receiver uses the received message M' to compute IBF(S'), and subtracts it from IBF(S) to locate the errors.
• Protect the IBF itself using Reed-Solomon or simple redundancy.
• Why: potentially O(e) decoding for e errors; Raptor codes achieve this for erasure channels.
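A sketch of the error-location step in Python, reusing the IBF helpers from Part 1 (the packing of (word, position) into one integer ID assumes 32-bit words and is purely illustrative):

    def message_ibf(words):
        ibf = new_ibf()
        for i, w in enumerate(words, start=1):
            ibf_add(ibf, (w << 32) | i)    # pack (word, position) into a single ID
        return ibf

    def locate_errors(received_words, sender_ibf):
        # Subtract the receiver's IBF from the sender's and decode; the IDs unique
        # to the sender reveal the positions (and original values) of corrupted words.
        diff = ibf_subtract(sender_ibf, message_ibf(received_words))
        sent_only, recv_only, ok = ibf_decode(diff)
        return sorted(ident & 0xFFFFFFFF for ident in sent_only) if ok else None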

Page 54: Reconciling Differences: towards a theory of cloud complexity

54

The Cloud Complexity Milieu

                                      2 Node    Graph    Steiner Nodes
  Sets (key, values)                  EGUV11    GKV11    ?
  Sequence, edit distance (files)     SBB90     ?        ?
  Sequence, errors only (VMs)         MV11      ?        ?
  Sets of sets (database tables)      ?         ?        ?
  Streams (movies)                    ?         ?        ?

Other dimensions: approximate, secure, . . .

Page 55: Reconciling Differences: towards a theory of cloud complexity

Conclusions: Got Diffs?

• Resiliency and fast recoding of random sums: set reconciliation, and perhaps error correction?
• Sets on graphs:
  – All terminals: generalizes rumor spreading.
  – Routers and terminals: resemblance to network coding.
• Cloud complexity: some points covered, many remain.
• Practical: may be useful for synching devices across the cloud.

55

Page 56: Reconciling Differences: towards a theory of cloud complexity

56

Comparison to Logs/Incremental Updates

• IBFs work with no prior context.
• Logs work with prior context, BUT:
  – Redundant information when syncing with multiple parties.
  – Logging must be built into the system for each write.
  – Logging adds overhead at runtime.
  – Logging requires non-volatile storage.
    • Often not present in network devices.

IBFs may out-perform logs when:
• Synchronizing multiple parties.
• Synchronizations happen infrequently.