60
Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity – London, 19. October 2017

Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Real-world consistency explained or the challenges of modern persistence

Uwe Friedrichsen (codecentric AG) – Velocity – London, 19. October 2017

Page 2: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

@ufried Uwe Friedrichsen | [email protected] | https://www.slideshare.net/ufried | https://medium.com/@ufried

Page 3: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Some kudos first …

Page 4: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

A lot of this talk was inspired by the great posts of Adrian Colyer

especially by his blog series "Out of the fire swamp”

see [Col], [Col2015a-c]

Page 5: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Past

Page 6: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

RDBMS ACID

Page 7: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

RDBMS

•  “One database to rule them all”

•  Good all-rounder •  Rich schema •  Rich access patterns

•  Designed for scarce resources •  Storage, CPU, Backup are expensive •  Network is slow

•  Shared database •  Replication was expensive •  Licenses were expensive •  Operations were expensive

•  Easy integration model

•  “Strange attractor” •  Accidental central integration hub •  Data spaghetti

Page 8: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

ACID

•  Atomicity •  Consistency •  Isolation •  Durability

•  Great programming model •  No temporal inconsistencies •  No anomalies •  Easy to reason about

•  But reality often is different! •  ACID does not necessarily mean

“serializability” •  Databases often run at lower

consistency levels •  Anomalies happen •  Most developers are not aware of it

Page 9: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

ANSI SQL Anomalies

•  Dirty write (P0): w1[x]...w2[x]...(c1 or a1) •  Dirty read (P1): w1[x]...r2[x]...(c1 or a1) •  Fuzzy read (P2): r1[x]...w2[x]...(c1 or a1) •  Phantom read (P3): r1[P]...w2[y in P]...(c1 or a1) Isolation levels

See [Ber+1995]

Dirty write Dirty read Fuzzy read Phantom read Read uncommitted Not possible Possible Possible Possible

Read committed Not possible Not possible Possible Possible

Repeatable read Not possible Not possible Not possible Possible

Serializable Not possible Not possible Not possible Not possible

Page 10: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Extended anomaly model •  Dirty write (P0): w1[x]...w2[x]...(c1 or a1) •  Dirty read (P1): w1[x]...r2[x]...(c1 or a1) •  Lost update (P4): r1[x]...w2[x]...w1[x]...c1 •  Lost cursor u. (P4C): rc1[x]...w2[x]...wc1[x]...c1. •  Fuzzy read (P2): r1[x]...w2[x]...(c1 or a1) •  Phantom read (P3): r1[P]...w2[y in P]...(c1 or a1) •  Read skew (A5A): r1[x]...w2[x]...w2[y]...c2...r1[y]...(c1 or a1) •  Write skew (A5B): r1[x]...r2[y]...w1[y]...w2[x]...(c1 and c2 occur)

see [Ber+1995]

Page 11: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Extended isolation level model

See [Ber+1995]

Isolation level Dirty write

Dirty read

Cursor lost

update

Lost update

Fuzzy read

Phantom read

Read skew

Write skew

Read uncommitted

Not possible Possible Possible Possible Possible Possible Possible Possible

Read committed

Not possible

Not possible Possible Possible Possible Possible Possible Possible

Cursor stability

Not possible

Not possible

Not possible

Sometimes possible

Sometimes possible Possible Possible Sometimes

possible

Repeatable read

Not possible

Not possible

Not possible

Not possible

Not possible Possible Not

possible Not

possible

Snapshot

Not possible

Not possible

Not possible

Not possible

Not possible

Sometimes possible

Not possible Possible

Serializable

Not possible

Not possible

Not possible

Not possible

Not possible

Not possible

Not possible

Not possible

Page 12: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Default & maximum isolation levels

See [Bai+2013a]

Database Default MaximumActian Ingres 10.0/10S [1] S SAerospike [2] RC RCAkiban Persistit [3] SI SIClustrix CLX 4100 [4] RR RRGreenplum 4.1 [8] RC SIBM DB2 10 for z/OS [5] CS SIBM Informix 11.50 [9] Depends SMySQL 5.6 [12] RR SMemSQL 1b [10] RC RCMS SQL Server 2012 [11] RC SNuoDB [13] CR CROracle 11g [14] RC SIOracle Berkeley DB [7] S SOracle Berkeley DB JE [6] RR SPostgres 9.2.2 [15] RC SSAP HANA [16] RC SIScaleDB 1.02 [17] RC RCVoltDB [18] S SRC: read committed, RR: repeatable read, SI: snapshot isola-tion, S: serializability, CS: cursor stability, CR: consistent read

Table 1: Default and maximum isolation levels for ACIDand NewSQL databases as of January 2013.

Read Committed by default, while three “NewSQL” datastores only offered Read Committed isolation.

In our investigation, we found that many databasesclaiming strong guarantees often offered weaker seman-tics. One store with an effective maximum of Read Com-mitted isolation claimed to provide “strong consistency(ACID)” [2], while another claiming “100% ACID” and“fully support[ed] ACID transactions” uses consistentread isolation [13]. Moreover, snapshot isolation is oftenlabeled as “serializability” [14]. We have accompaniedour bibliographic references with additional detail, but itis clear that these “ACID” guarantees rarely meet serial-izability’s goal of automatically protecting data integrityas set out by the database literature. This is especiallysurprising given that these databases’ “stronger” seman-tics are often thought to substantially differentiate themfrom their “NoSQL” peers [30, 56, 58].

These results—and several discussions with databasedevelopers and architects—indicate that weak isolationmodels are viable alternatives for many applications.There are applications that either work correctly withthese models or else work well enough to accept theresulting anomalies in exchange for their performancebenefits [45]. A key challenge is that, while the litera-ture provides reasonable taxonomy of the models, it con-siders them in either a single-node context [43] or ab-stractly [20, 26]—it is unclear which models are achiev-able with high availability and which are not. Indeed,most weak isolation levels today are implemented in anunavailable manner.

3 Highly Available TransactionsThe large number and prevalence of “weak ACID” guar-antees suggests that, although we cannot provide serial-izability with high availability, providing weaker guar-antees still provides users with a useful programming in-terface. In this section, we show that two major mod-els: Read Committed and ANSI SQL Repeatable Readare achievable in a highly available environment. Thispaves the way for broader theoretical and design stud-ies of Highly Available Transactions: multi-operation,multi-object guarantees achievable with high availability.We will sketch algorithms solely as a proof-of-conceptfor high availability; further engineering is required toimprove and evaluate their performance.Read Committed We first consider Read Committedisolation—a particularly widely used isolation model inour survey. Read Committed is often the lowest level ofisolation provided in a database beyond “No Isolation.”It requires that transactions do not read uncommitted dataitems, which would result in “Dirty Reads phenomena(i.e., ANSI P1 [22] and Adya G1{a,b,c} [20]). In the ex-ample below, T3 should never see a = 1, and, if T2 aborts,T3 will never see a = 3:

T1 : wx(1) wx(2)T2 : wx(3)T3 : rx(a)

Read Committed is a useful property because it ensuresthat transactions will not read intermediate versions of agiven data item or read data from transactions that willeventually be rolled back (and thus will never have “ex-isted” in the database).

Read Committed also disallows “Dirty Write” phe-nomena (Adya’s G0 [20]), so the database will “consis-tently” order writes from concurrent transactions . Effec-tively, the database induces a total order on transactions,and the replicas of the database should apply writes inthis order. For example, if T1, T2 commit, T3 can eventu-ally only read a = b = 1 or a = b = 2:

T1 : wx(1) wy(1)T2 : wx(2) wy(2)T3 : rx(a) ry(b)

This is useful because it effectively guarantees cross-item convergence, or eventual consistency. “Dirty Write”occurs when a database chooses different “winning”transactions across simultaneously written keys.

We can implement Read Committed isolation withhigh availability. If servers never reveal dirty data toclients, then clients will never experience “Dirty Read”phenomena. To ensure this, servers should only servedata that they are sure has been committed. Serverscan explicitly buffer incoming writes until they receive acommit message from clients. Alternatively, clients can

3

Page 13: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Wrap-up – Past •  The relational model is a good tradeoff •  ACID makes a developer's life easy •  Yet, we often live (unknowingly) with less than serializability

Page 14: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Present

Page 15: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Cloud µService

NoSQL BASE

Page 16: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Cloud

•  Self Service •  Elasticity •  Pay per use

•  Great resource provisioning model

•  Improves •  Autonomy •  Response time (lead time) •  Elasticity •  Cost efficiency (if done right)

•  Trade-offs •  Scale out (“distributed hell”) •  Reduced availability of individual

resources

Page 17: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

µService

•  “Microservices are the mapping of organizational autonomy to software architecture” •  Limited in scope •  Self-dependent •  Loosely coupled

•  Improves •  Autonomy •  Response time (if done right) •  Elasticity

•  Trade-offs •  Higher design effort •  Harder to operate •  Distributed by default

•  Shared nothing •  No shared data •  No cross-service coordination

Page 18: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

NoSQL

•  Extension of the storage solution space •  Before NoSQL RDBMS and file

system were predominant solutions •  NoSQL tries to fill the gaps

•  New options •  Scalability (Volume & Velocity) •  Relaxed schema •  Availability in cloud environments

•  Trade-offs •  CAP Theorem •  Capabilities and limitations often

poorly understood

Page 19: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

BASE

•  Response to CAP theorem •  “Relax temporal constraints in

exchange for better availability”

•  Improves •  Scalability •  Availability

•  Trade-offs •  Temporal inconsistencies and all

kinds of anomalies become visible •  Very hard programming model

•  Key readings •  [Bre2000] introduced CAP and BASE •  [Hel2007] “defined” boundaries of

eventual consistency for years •  [Sha+2011] introduced CRDTs,

improving convergence guarantees in eventually consistent systems

Page 20: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

About polyglot persistence and

choosing the right database

Page 21: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

The 8 dimensions of storage •  Data Scalability (amount of data) •  Transaction Scalability (access rate)

•  Latency (response time considering scalability) •  Read/Write Ratio (variability of r/w mix considering scalability)

•  Schema Richness (variability of data model) •  Access Richness (variability of access patterns)

•  Consistency (data consistency guarantees)

•  Fault Tolerance (ability to handle failures gracefully)

Page 22: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Data scalability

Transaction scalability

Schema richness

Access richness R/W ratio

Consistency

Fault tolerance

Latency

Rela

tiona

l dat

abas

e

Page 23: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Data scalability

Transaction scalability

Schema richness

Access richness R/W ratio

Consistency

Fault tolerance

Latency

File

sys

tem

Page 24: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Data scalability

Transaction scalability

Schema richness

Access richness R/W ratio

Consistency

Fault tolerance

Latency

RDBM

S &

file

sys

tem

Page 25: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Data scalability

Transaction scalability

Schema richness

Access richness R/W ratio

Consistency

Fault tolerance

Latency

RDBM

S &

file

sys

tem

NoSQL

Filling the blind spots of RDBMS and file systems

Page 26: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Data scalability

Transaction scalability

Schema richness

Access richness R/W ratio

Consistency

Fault tolerance

Latency

Cass

andr

a

Page 27: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Data scalability

Transaction scalability

Schema richness

Access richness R/W ratio

Consistency

Fault tolerance

Latency

RDBM

S-FS

-Cas

sand

ra

Page 28: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

There is no “one size fits all”

and no drop-in replacement for your GORDB *

* good ol’ relational database

Page 29: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

About eventual consistency

Page 30: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

NoSQL databases ... •  ... often do not guarantee any consistency out of the box

•  ... need to be configured properly for eventual consistency

•  ... sometimes can provide higher consistency guarantees

•  ... sometimes even ditch eventual consistency for better availability •  Minimum default consistency level to expect: single write/single entity/single node

Page 31: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

You will experience temporary inconsistencies

and anomalies at the application level

Page 32: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Inconsistencies across replica sets Typical measures

•  Read your writes

•  Read from master

•  Quorum based reads and writes •  Warning: This will not give you strict consistency!

Page 33: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Siblings Typical measures

•  Return all siblings, leave decision to client

•  Resolver (“Read repair”)

•  Conflict-free Replicated Data Types (CRDT) •  Great stuff, but brain twister and does not work for all data types

Page 34: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Performance hit for random access

Usually no reliable and efficient secondary indices possible •  Efficient and reliable access only via primary key access

Typical measures •  Coarse-grained entities plus denormalized associations

•  Multiple entity representations (one per access path)

•  Results in new type of inconsistencies (across entities)

•  Requires additional type of resolver (cross-entity resolver)

Page 35: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Thus, really understand your consistency options ...

Page 36: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

... and be wary of vendor promises

Page 37: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Wrap-up – Present •  IT goes distributed (“scale out”) •  NoSQL fills the empty spots in the storage solution space •  BASE transactions imply a very hard programming model

Page 38: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Future

Page 39: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

So, can I only choose between “ACID” and “BASE”?

Page 40: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Remember, that “ACID” does not necessarily mean serializability?

Page 41: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Repeatable read

One-copy serializability

Strong one-copy serializability

Read committed

regular

Read uncommitted

Cursor stability

linearizable

safe

recency Item cut isolation

Snapshot isolation

Monotonic atomic view

Predicate cut isolation PRAM

Monotonic reads

Writes follow reads

Monotonic writes

Causal consistency

Read your writes

see [Bai+2014a] BASE

ACID

Highly available transaction model

Sticky available transaction model

Non-HA transaction model

Page 42: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

There are a lot of consistency options to choose from between “ACID” and “BASE”

Page 43: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

But …

Page 44: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

The old model assumed the work would be processed in exactly one order of execution. There was a default “single system of record” form of isolation provided by the classic database system running at the primary. This single history allows for a low-level READ and WRITE semantic that depends on “replaying history”.

In this new [distributed] world, history cannot be exactly replayed and we must count on the ability to reorder the work. This means that we cannot completely know the accurate state of the system. It also means we must move the correctness and reordering semantics up from being based on system properties (i.e. READ and WRITE) to application based business operations.

see [Hel+2009]

Page 45: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

It is no longer sufficient to understand and reason about distributed application consistency at the level of data store access.

Instead you need to understand and reason about distributed application consistency at the level of application operations.

Page 46: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

1.  You need to analyze the consistency requirements of the application carefully based on the business requirements.

2.  The application (usually) needs to contribute explicitly to the implementation of the required consistency model.

Page 47: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Current research explores the “frontiers”

•  "HAT, not CAP: Towards Highly Available Transactions” [Bai+2013b] HA causal consistency in distributed systems

•  "Scalable Atomic Visibility with RAMP Transactions" [Bai+2014b] New isolation level “atomic read” (stronger than MAV), implemented in HA fashion

•  "Building Consistent Transactions with Inconsistent Replication" [Zha+2015] Low-latency distributed transactions by building a transactional application protocol on top of inconsistent replication

•  "Putting Consistency Back into Eventual Consistency" [Bal+2015] Explicit application-level consistency using application-defined invariants on top of eventual consistent data stores

•  "Implementing Linearizability at Large Scale and Low Latency" [Lee+2015] Implementing fast and scalable exactly-once semantics, enabling linearizable operations on top of it

•  "Spanner: Google’s Globally-Distributed Database" [Cor+2012] “Case study” for a very different approach to scalable serializability by using hardware to solve the “time problem”

•  "High-Performance ACID via Modular Concurrency Control" [Xie+2015] Speedup of traditional ACID transactions by grouping transactions into independent sets that can be handled concurrently

Page 48: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

And there is more to come ...

Page 49: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Current memory and storage trends

•  Terabyte memory computers Full in-memory computing of large data sets

•  Storage Class Memory (SCM) / Non-Volatile RAM (NVRAM) Many technologies under development filling the gap between RAM and SSD

•  Remote Direct Memory Access (RDMA) Accessing a remote machine’s RAM bypassing the CPU, allowing very low-latency remote memory access (around 2 order of magnitude faster than SSD access)

Keeping the CPU busy is no longer the core challenge

New system architectures and programming models will emerge

New consistency options not available today may also emerge

see [Coc2016], [Col2016]

Page 50: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Wrap-up – Future •  Lots of options to balance consistency constraints and intricacy of

the programming model •  Higher consistency guarantees than “plain BASE” in distributed HA

databases may become available in the near future •  Most of them will require effort on the application level

Page 51: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Recommendations

Page 52: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Across service/storage boundaries

•  Don't coordinate writes across service/storage boundaries •  Usually your design is wrong (entity-driven instead of behavior-driven) •  Remember: DDD is about domains, not entities! Domains include behavior •  The activation path of a use case should be as short as possible

•  Use a relaxed temporal constraint model plus reconciliation •  Consider applying the concepts of promise theory [Bur2005] and memory, guesses

and apologies [Hel+2009]

Page 53: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Within service/storage boundaries

•  Thoroughly analyze your storage requirements (8 dimensions) •  Thoroughly analyze your consistency requirements

•  Be wary of serializability (database reality is different) •  Don't think consistency is solely a data store issue

•  Don't distribute data or relax consistency without an explicit need •  Choose wisely – and have your developers in mind

Page 54: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

Wrap-up •  Past: RDBMS and ACID

•  Great programming model

•  ACID does not necessarily mean serializability

•  Present: Cloud, µServices, NoSQL & BASE •  More options, more challenges

•  Very hard programming model

•  Future: Exploring the boundaries •  Many options between ACID and BASE

•  Often requires awareness on the application level

•  Know your options!

Page 55: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

There is no “one-size-fits-all” solution

Page 56: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

References [Bai+2013a] Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica, "HAT, not CAP: Towards Highly Available Transactions", HotOS 2013

[Bai+2013b] Peter Bailis, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica, "Bolt-on Causal Consistency", SIGMOD 2013

[Bai+2014a] Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica, "Highly Available Transactions: Virtues and Limitations", VLDB 2014

[Bai+2014b] Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica, "Scalable Atomic Visibility with RAMP Transactions", SIGMOD 2014

[Bai+2015] Peter Bailis, Alan Fekete, Michael J. Franklin, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica, "Coordination Avoidance in Database Systems", VLDB 2015

[Bal+2015] Valter Balegas, Sérgio Duarte, Carla Ferreira, Rodrigo Rodrigues, Nuno Preguica, Mahsa Najafzadeh, Marc Shapiro, "Putting Consistency Back into Eventual Consistency", EuroSys 2015

[Ber+1995] Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, Patrick O'Neil, "A Critique of ANSI SQL Isolation Levels", Microsoft Research, Technical Report MSR-TR-95-51, June 1995

Page 57: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

References [Bre2000] Eric A. Brewer, "Towards Robust Distributed Systems", PODC 2000

[Bur2005] Mark Burgess, "An Approach to Understanding Policy Based on Autonomy and Voluntary Cooperation", DSOM 2005

[Coc2015] Adrian Cockcroft, "Innovation and Architecture", http://de.slideshare.net/adriancockcroft/innovation-and-architecture, S. 122-125

[Col] Adrian Colyer, "the morning paper", http://blog.acolyer.org

[Col2015a-c] Adrian Colyer, "Out of the Fire Swamp”, Parts I-III, http://blog.acolyer.org/2015/09/08/out-of-the-fire-swamp-part-i-the-data-crisis/ http://blog.acolyer.org/2015/09/09/out-of-the-fire-swamp-part-ii-peering-into-the-mist/ http://blog.acolyer.org/2015/09/10/out-of-the-fire-swamp-part-iii-go-with-the-flow/

[Col2016] Adrian Colyer, "All change please", http://blog.acolyer.org/2016/01/22/all-change-please/

[Cor+2012] James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, et al., "Spanner: Google’s Globally-Distributed Database", OSDI 2012

Page 58: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

References [Hel2007] Pat Helland, "Life beyond Distributed Transactions: an Apostate’s Opinion", CIDR 2007

[Hel+2009] Pat Helland, Dave Campell, "Building on Quicksand", CIDR 2009

[Lee+2015] Collin Lee, Seo Jin Park, Ankita Kejriwal, Satoshi Matsushita, John Ousterhout, "Implementing Linearizability at Large Scale and Low Latency", SOSP 2015

[Sha+2011] Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski, "A comprehensive study of Convergent and Commutative Replicated Data Types", Inria Research report, 2011

[Xie+2015] Chao Xie, Chunzhi Su, Cody Littley, Lorenzo Alvisi, Manos Kapritsos, Yang Wang, "High-Performance ACID via Modular Concurrency Control", SOSP 2015

[Zha+2015], Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports, "Building Consistent Transactions with Inconsistent Replication", SOSP 2015

Page 59: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity

@ufried Uwe Friedrichsen | [email protected] | https://www.slideshare.net/ufried | https://medium.com/@ufried

Page 60: Real-world consistency explained - mcorbin · 2020-05-04 · Real-world consistency explained or the challenges of modern persistence Uwe Friedrichsen (codecentric AG) – Velocity