30
Groupware Systems for fun and profit CRDT, OT, Quorum-commit etc. Max Klymyshyn July 2, 2017 https://github.com/joymax/odessajs2017 Max Klymyshyn Groupware Systems for fun and profit 1 / 30

OdessaJS 2017: Groupware Systems for fun and profit

Embed Size (px)

Citation preview

Groupware Systems for fun and profitCRDT, OT, Quorum-commit etc.

Max Klymyshyn

July 2, 2017

https://github.com/joymax/odessajs2017

Max Klymyshyn Groupware Systems for fun and profit 1 / 30

Groupware or Collaborative Systemsynchronization in asynchronous environments

Real-time groupware systems are multi-user systems where the actionsof one user must quickly be propagated to the other users.

Examples of Groupware Systems:

Any kind of chats with clients on multiple devicesReal-time group editing functionality (google docs etc.)Real-time gamesConferences softwareNotes, contacts, reminders, calendarsShopping carts, group ordering

Max Klymyshyn Groupware Systems for fun and profit 2 / 30

Why?

It matters for JavaScript and especially for client-side developers:

backend-free systems – It’s is possible to create and maintaincompletely back-end less p2p architecture using WebRTC,δ-mutation and different types of sharding.consistent behavior for devices with same service (gmail, notes etc.)network-partition tolerance

Max Klymyshyn Groupware Systems for fun and profit 3 / 30

Synchronization in asynchronous environmentand different types of distributed replication approaches

Here is list of fundamental problems within asynchronous setting:divergence Final document contents are different on all sitescausality-violations Execution order is different eq o1 → o3

intention-violoation the actual effect is different from intended effect eqo1||o2

Max Klymyshyn Groupware Systems for fun and profit 4 / 30

Replication and Consistencyand different types of distributed replication approaches

Basically there’s only two hight-level approaches for replications:Pessimistic and Optimistic replications.

Pessimistic ReplicationOr single-copy serializability: The idea is to prohibit access to a replicaunless it is provably up to date. Techniques: two-phase commit,quorum-consensus, same-order delivery.

Optimistic Replication or Eventual ConsistencyUpdates sent asynchronously to other replicas; every replica eventuallyapplies all updates, possibly in different orders

Max Klymyshyn Groupware Systems for fun and profit 5 / 30

Naive approachesto organize synchronization within groupware systems

LockingTransactionsSingle Active ParticipantDependency DetectionReversible Execution

Max Klymyshyn Groupware Systems for fun and profit 6 / 30

Consistency Models

Strong Consistency – after the update completes, anysubsequent access will return the updated valueWeak Consistency – the system does not guarantee thatsubsequent accesses will return the updated valueEventual Consistency – eventually all accesses will return thelast updated value.Strong Eventual Consistency – special case of EC which addsthe safety guarantee that any two nodes that have received thesame (unordered) set of updates will be in the same state

Max Klymyshyn Groupware Systems for fun and profit 7 / 30

Operation Transformation (OT) System Modeland definitions

OTis a technology for supporting a range of collaboration functionalities inadvanced collaborative software systems.

Definitions▶ Processes or sites are participants of the collaborative system▶ Commands or operations are operations provided by system (for

example insert text, delete text etc.)▶ Required properties and Transformations Matrices

Max Klymyshyn Groupware Systems for fun and profit 8 / 30

Transformation Property

Operations must bee commutative:

op1 ⊙ op2 = op2 ⊙ op1

Figure: order of operations for different sites

Max Klymyshyn Groupware Systems for fun and profit 9 / 30

Transformation Matrixformal definition

o′j = T (oj , oi, pj , pi)

o′i = T (oi, oj , pi, pj)

then T is such that following is satisfied:

o′j ⊙ o

′i = o

′i ⊙ o

′j

Max Klymyshyn Groupware Systems for fun and profit 10 / 30

Transformation Function picture

Figure: Transformation Function

Max Klymyshyn Groupware Systems for fun and profit 11 / 30

OT Summary: Intention-violation

Famous algos: GROVE, dOPT, Jupiter, adOPTed, REDUCE,GOT, GOTOProving correctness of transformation functions is a complex taskImpossible to establish proofs without computer within reasonabletime for complex TFFollowing Oster et al. ”Proving correctness of transformationfunctions in collaborative editing systems” paper only OTimplementation of Tombstone Transformation Functions is correct

Max Klymyshyn Groupware Systems for fun and profit 12 / 30

Who is using OT?

Google Docsex-Google WaveA lot of other experiments

Max Klymyshyn Groupware Systems for fun and profit 13 / 30

OT Ressel et al. model

Figure: L-Transformation:r1 ⊙ r

2 ≡ r2 ⊙ r′

1

Max Klymyshyn Groupware Systems for fun and profit 14 / 30

Conflict-free replicated data types (CRDT)

CRDT isdata type for the system of processes interconnected by anasynchronous network where replicas can be updated independentlyand concurrently without coordination between the replicas.

This kind of replication is very handy for client-side developers because:

Does not require active back-end system participationQuality leap in UX for users with weak or blinking networkconnectionsBetter experience for SPAs with long single-load sessions likeSoundCloud or GMailOffline work

Max Klymyshyn Groupware Systems for fun and profit 15 / 30

CRDTs: CvRDT, CmRDTstate-based and op-based replication

Figure: State-basedConvergent Replicated Data

Type or CvRDT

Figure: Op-basedCommutative Replicated Data

Type or CmRDT

Max Klymyshyn Groupware Systems for fun and profit 16 / 30

CvRDT and CmRDT PropertiesRequired propertiesIn Abstract Algebra a CRDT is formally known as a Semilattice. AllSemilattices must have the following properties:

commutativity – ∀x, y : x⊙ y = y ⊙ xassociativity – nodes must come to the exactly same stateregardless of the order they receive information:x⊙ (y ⊙ z) = (x⊙ y)⊙ zidempotency – x⊙ x = x

Figure: Least Upper Bound

Max Klymyshyn Groupware Systems for fun and profit 17 / 30

CRDT: TypesRegisters, Counters, Sets

Register: LWW or Multi-Value (Dynamo or Couchdb-like)Counter (growth-only) and Counter w/decrementingG-Set – growth-only set2P-Set – remove only once set (G-Set + Tombstones set)LWW-Element-Set – vector clocksOR-Set – unique-tagged elements and list of tags withinTombstones setTreedoc, Logoot etc.

Max Klymyshyn Groupware Systems for fun and profit 18 / 30

CRDT: GCounterIncrement only counter

linenos class GCounter {

linenos constructor(n) { this.n = n; }

linenos

linenos increment() { return new GCounter(this.n + 1); }

linenos

linenos query() { return this.n; }

linenos

linenos merge(counter) {

linenos return new GCounter(

linenos Math.max(counter.query(), this.n));

linenos }

linenos }

Max Klymyshyn Groupware Systems for fun and profit 19 / 30

CRDT: PNCounterIncrement and decrement counter

linenos class PNCounter {

linenos constructor(p, n) { this.n = n; this.p = p}

linenos

linenos increment() { return new PNCounter(

linenos this.p.increment(), this.n); }

linenos

linenos decrement() { return new PNCounter(

linenos this.p, this.n.increment()); }

linenos

linenos query() { return this.p.query() - this.n.query(); }

linenos

linenos merge(pncounter) {

linenos return new PNCounter(this.p.merge(pncounter.p),

linenos this.n.merge(pncounter.n));

linenos }

linenos }

Max Klymyshyn Groupware Systems for fun and profit 20 / 30

CRDT: PNCounter TestIncrement and decrement counter

linenosfunction examples_PNCounter() {

linenos let log = (op, c1, c2) => console.log(

linenos "[OP] " + op + ": c1=" + c1.query() + ", c2=" + c2.query());

linenos let pn_counter1 = new PNCounter(

linenos new GCounter(0), new GCounter(0));

linenos let pn_counter2 = new PNCounter(

linenos new GCounter(0), new GCounter(0));

linenos

linenos pn_counter1 = pn_counter1.increment();

linenos pn_counter1 = pn_counter1.increment();

linenos log("[BEFORE MERGE] pncounter1++", pn_counter1, pn_counter2);

linenos pn_counter2 = pn_counter2.merge(pn_counter1);

linenos log("[AFTER MERGE] pncounter1++", pn_counter1, pn_counter2);

linenos pn_counter2 = pn_counter2.decrement();

linenos log("[BEFORE MERGE] pncounter2--", pn_counter1, pn_counter2);

linenos pn_counter1 = pn_counter1.increment();

linenos log("[BEFORE MERGE] pncounter1++", pn_counter1, pn_counter2);

linenos pn_counter1 = pn_counter1.merge(pn_counter2);

linenos pn_counter2 = pn_counter2.merge(pn_counter1);

linenos log("[AFTER MERGE]", pn_counter1, pn_counter2);

linenos}

Max Klymyshyn Groupware Systems for fun and profit 21 / 30

CRDT: PNCounter example output

linenos [OP] [BEFORE MERGE] pncounter1++: c1=2, c2=0

linenos [OP] [AFTER MERGE] pncounter1++: c1=2, c2=2

linenos [OP] [BEFORE MERGE] pncounter2--: c1=2, c2=1

linenos [OP] [BEFORE MERGE] pncounter1++: c1=3, c2=1

linenos [OP] [AFTER MERGE]: c1=2, c2=2

Max Klymyshyn Groupware Systems for fun and profit 22 / 30

CRDT: OR-SetObserved-Remove Set

Max Klymyshyn Groupware Systems for fun and profit 23 / 30

Existing Tools

ShareDB (and fall of Google Wave)Swarm (and forever-in-pre-alpha tool)OT.js (Operational Transformation for JS)CRDT – github.com/dominictarr/crdtreplikativ.io – p2p distributed system frameworkother CRDT implementations

Max Klymyshyn Groupware Systems for fun and profit 24 / 30

Drawbacks of CRDTyou can't pour two buckets of shit into one bucket

However, a major drawback in current state-based CRDTs is thecommunication overhead of shipping the entire state, which canget very large in size. For instance, the size of a counter CRDT (avector of integer counters, one per replica) increases with the number ofreplicas; whereas in a grow-only Set, the state size depends on the setsize, that grows as more operations are invoked.

Max Klymyshyn Groupware Systems for fun and profit 26 / 30

δ-mutationEverything at a cost

“state fragments” merged using desired semanticsδ-mutators, defined in such a way to return a delta-state, typically,with a much smaller size than the full state.

delta-enabled-crdts

Max Klymyshyn Groupware Systems for fun and profit 27 / 30

References

[1] Marc Shapiro, Nuno Pregui �ca, Carlos Baquero, Marek Zawirski. Acomprehensive study of Convergent and Commutative Replicated Data Types.

[2] Yasushi Saito, Marc Shapiro Replication: Optimistic Approaches

[3] Mihai Let, Nuno Pregui, Marc Shapiro CRDTs: Consistency withoutconcurrency control∗

[4] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, GunavardhanKakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian,Peter Vosshall and Werner Vogels Dynamo: Amazon’s Highly AvailableKey-value Store

[5] G ́erald Oster, Pascal Urso, Pascal Molli, Abdessamad Imine. Real time groupeditors without Operational transformation.

[6] Gérald Oster, Pascal Urso†, Pascal Molli, Abdessamad Imine Provingcorrectness of transformation functions in collaborative editing systems

Max Klymyshyn Groupware Systems for fun and profit 28 / 30

[1] Paulo S ́ergio Almeida, Ali Shoker, and Carlos Baquero Efficient State-basedCRDTs by Delta-Mutation

[2] A. Bieniusa, M. Zawirski, N. Preguiça, M. Shapiro, Carlos Baquero, ValterBalegas, Sérgio Duarte. An Optimized Conflict-free Replicated Set.

[3] C.A. Ellis, S.J. Gibbs Concurrency Control in Groupware Systems

[4] Pascal Molli, Ge ́rald Oster Using the Transformatinal Approach to Build a Safeand Generic Data Synchronizer

[5] Pascal Molli, Ge ́rald Oster Using the Transformatinal Approach to Build a Safeand Generic Data Synchronizer

[6] Paulo S ́ergio Almeida, Ali Shoker, and Carlos Baquero Delta State ReplicatedData Types, 2016

Max Klymyshyn Groupware Systems for fun and profit 29 / 30

Thanks

@maxmaxmaxmax

https://github.com/joymax/odessajs2017

Max Klymyshyn Groupware Systems for fun and profit 30 / 30