41
Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky David G. Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS

Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

  • Upload
    kesler

  • View
    40

  • Download
    0

Embed Size (px)

DESCRIPTION

Don’t Settle for Eventual : Scalable Causal Consistency for Wide -Area Storage with COPS. Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡ * Princeton, † Intel Labs, ‡ CMU. Wide-Area Storage. Stores: Status Updates Likes Comments - PowerPoint PPT Presentation

Citation preview

Page 1: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Wyatt Lloyd*

Michael J. Freedman*

Michael Kaminsky†

David G. Andersen‡

*Princeton, †Intel Labs, ‡CMU

Don’t Settle for Eventual: Scalable Causal

Consistency for Wide-

Area Storage with COPS

Page 2: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Wide-Area Storage

Stores:Status UpdatesLikesCommentsPhotosFriends List

Stores:TweetsFavoritesFollowing List

Stores:Posts+1sCommentsPhotosCircles

Page 3: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Wide-Area StorageServes Requests Quickly

Page 4: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Inside the Datacenter

Web Tier Storage Tier

A-F

G-L

M-R

S-Z

Web Tier Storage Tier

A-F

G-L

M-R

S-Z

Remote DC

Replication

Page 5: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Desired Properties: ALPS

• Availability

• Low Latency

• Partition Tolerance

• Scalability

“Always On”

Page 6: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

ScalabilityIncrease capacity and throughput in each datacenter

A-Z A-ZA-L

M-Z

A-L

M-Z

A-F

G-L

M-R

S-Z

A-F

G-L

M-R

S-Z

A-C

D-F

G-J

K-L

M-O

P-S

T-V

W-Z

A-C

D-F

G-J

K-L

M-O

P-S

T-V

W-Z

Page 7: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Desired Property: Consistency

• Restricts order/timing of operations

• Stronger consistency:– Makes programming easier– Makes user experience better

Page 8: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Consistency with ALPS

Strong

Sequential

Causal

Eventual

Impossible [Brewer00, GilbertLynch02]

Impossible [LiptonSandberg88, AttiyaWelch94]

COPS

Amazon LinkedIn Facebook/ApacheDynamo Voldemort Cassandra

Page 9: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

System A L P S Consistency

Scatter ✖ ✖ ✖ ✔ ✔ StrongWalter ✖ ✖ ✖ ? PSI + Txn

COPS ✔ ✔ ✔ ✔ Causal+

Bayou ✔ ✔ ✔ ✖ Causal+

PNUTS ✔ ✔ ? ✔ Per-Key Seq.

Dynamo ✔ ✔ ✔ ✔ ✖ Eventual

Page 10: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Causality By Example

Remove boss from friends group

Post to friends: “Time for a new

job!”

Friend reads post

Causality ( )Thread-of-Execution

Gets-From

TransitivityNew Job!

FriendsBoss

Page 11: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Causality Is Useful

For Programmers:For Users:

Photo Upload

Add to album

Employment Integrity Referential Integrity

New Job!

FriendsBoss

Page 12: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Conflicts in Causal

K=2K=1K=1

K=2K=1

K=2

Page 13: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Conflicts in Causal

K=2K=3

K=2K=3

K=2K=3

Causal + Conflict Handling = Causal+

Page 14: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Previous Causal+ Systems

• Bayou ‘94, TACT ‘00, PRACTI ‘06– Log-exchange based

• Log is single serialization point– Implicitly captures and enforces causal order– Limits scalability OR– No cross-server causality

Page 15: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Scalability Key Idea

• Dependency metadata explicitly captures causality

• Distributed verifications replace single serialization– Delay exposing replicated puts until all

dependencies are satisfied in the datacenter

Page 16: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

COPS

Key-Value Store

Causal+Replication

AllData

AllData

AllData

Client Library

Local Datacenter

Page 17: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Get

getget

Client Library

Key-Value Store

Local Datacenter

Page 18: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Put

Client Library

put put_after

?

?

Key-Value Store

Replication Qput

after

K:V

put+

orderingmetadata

putafter =

Local Datacenter

Page 19: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Dependencies

• Dependencies are explicit metadata on values• Library tracks and attaches them to put_afters

Page 20: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Dependencies

• Dependencies are explicit metadata on values• Library tracks and attaches them to put_afters

put(Key, Val)put_after(Key,Val,deps)

versiondeps

. . . Kversion

(Thread-Of-Execution Rule)

Client 1

Page 21: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Dependencies

• Dependencies are explicit metadata on values• Library tracks and attaches them to put_afters

deps. . . Kversion

L337

M195

(Gets-From Rule)

get(K)

get(K)

value,version,deps'value

(Transitivity Rule)

deps'L337

M195

Client 2

Page 22: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Causal+ Replication

Key-Value Store

Replication Qput

after

put_after(K,V,deps)K:V,deps

Page 23: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Causal+ Replication

put_after(K,V,deps) dep_check(L337)K:V,deps

deps

L337

M195 dep_check(M

195 )Exposing values after dep_checks return

ensures causal+

Page 24: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Basic COPS Summary

• Serve operations locally, replicate in background– “Always On”

• Partition keyspace onto many nodes– Scalability

• Control replication with dependencies– Causal+ Consistency

Page 25: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

RemoteDatacenter

Boss

Portugal!

Gets Aren’t EnoughRemoteProgress

RemoteProgress

RemoteProgress

MyOperations

New Job!

BossBoss

Portugal!

Boss

Boss New Job!New Job!

You’re Fired!!

Page 26: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

RemoteDatacenter

Boss

Portugal!

Gets Aren’t Enough

Boss

Portugal!

Boss

BossNew Job!New Job!

You’re Fired!!

Portugal!

RemoteProgress

RemoteProgress

RemoteProgress

MyOperations

New Job!

Boss

Boss

New Job!Portugal!

BossBoss

Page 27: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Get Transactions

• Provide consistent view of multiple keys– Snapshot of visible values

• Keys can be spread across many servers

• Takes at most 2 parallel rounds of gets

• No locks, no blocking

Low Latency

Page 28: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

RemoteDatacenter

Boss

Portugal!

Get TransactionsRemoteProgress

RemoteProgress

RemoteProgress

MyOperations

New Job!

BossBoss

Portugal!Boss

Portugal!

Boss

New Job!

Portugal! RemoteProgress

RemoteProgress

Boss

New Job!Portugal!

BossBoss

Boss Portugal!

Portugal!Boss

Portugal!Boss

New Job!Boss

Could Get

NeverBoss New Job!

Page 29: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

System So Far

• ALPS and Causal+, but …

• Proliferation of dependencies reduces efficiency– Results in lots of metadata– Requires lots of verification

• We need to reduce metadata and dep_checks– Nearest dependencies– Dependency garbage collection

Page 30: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Many Dependencies

• Dependencies grow with client lifetime

Put

Put

Put

Put

GetGet

Page 31: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Nearest Dependencies

• Transitively capture all ordering constraints

Page 32: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

The Nearest Are Few

• Transitively capture all ordering constraints

Page 33: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

The Nearest Are Few

• Only check nearest when replicating

• COPS only tracks nearest

• COPS-GT tracks non-nearest for transactions

• Dependency garbage collection tames metadata in COPS-GT

Page 34: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Extended COPS Summary

• Get transactions– Provide consistent view of multiple keys

• Nearest Dependencies– Reduce number of dep_checks– Reduce metadata in COPS

Page 35: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Evaluation Questions

• Overhead of get transactions?

• Compare to previous causal+ systems?

• Scale?

Page 36: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Experimental Setup

COPS

Remote DC

COPS ServersClients

Local Datacenter

N N

N

Replication

Page 37: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

COPS & COPS-GTCompetitive for Expected Workloads

High per-client write rates result in 1000s of dependencies

Low per-clientwrite rates expected

People tweeting 1000 times/sec

People tweeting 1 time/sec

All Put Workload – 4 Servers / Datacenter

Page 38: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

COPS & COPS-GTCompetitive for Expected Workloads

Varied Workloads – 4 Servers / Datacenter

Pathological Expected

Workload

Page 39: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

COPS Low Overhead vs. LOG

• COPS – dependencies ≈ LOG• 1 server per datacenter only

• COPS and LOG achieve very similar throughput– Nearest dependencies mean very little metadata– In this case dep_checks are function calls

Page 40: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

COPS Scales Out

Page 41: Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky † David G. Andersen ‡

Conclusion

• Novel Properties– First ALPS and causal+ consistent system in COPS– Lock free, low latency get transactions in COPS-GT

• Novel techniques– Explicit dependency tracking and verification with

decentralized replication– Optimizations to reduce metadata and checks

• COPS achieves high throughput and scales out