57
No stress with state Hands-on tips for scalable applications Uwe Friedrichsen, codecentric AG, 2014

No stress with state

Embed Size (px)

DESCRIPTION

This slide deck explains a bit how to deal best with state in scalable systems, i.e. pushing it to the system boundaries (client, data store) and trying to avoid state in-between. Then it picks arbitrarily two scenarios - one in the frontend part and one in the backend part of a system and shows concrete techniques to deal with them. In the frontend part is examined how to deal with session state of servlet containers in scalable scenarios and introduces the concept of a shared session cache layer. Also an example implementation using Redis is shown. In the backend part it is examined how to deal with potential data inconsistencies that can occur if maximum availability of the data store is required and eventual consistency is used. The normal way is to resolve inconsistencies manually implementing business specific logic or - even worse - asking the user to resolve it. A pure technical solution called CRDTs (Conflict-free Replicated Data Types) is then shown. CRDTs, based on sound mathematical concepts, are self-stabilizing data structures that offer a generic way to resolve inconsistencies in an eventual consistent data store. Besides some theory also some examples are shown to provide a feeling how CRDTs feel in practice.

Citation preview

Page 1: No stress with state

No stress with state Hands-on tips for scalable applications

Uwe Friedrichsen, codecentric AG, 2014

Page 2: No stress with state

@ufried Uwe Friedrichsen | [email protected] | http://slideshare.net/ufried | http://ufried.tumblr.com

Page 3: No stress with state

A word about scalability ...

Page 4: No stress with state

Your system is only scalable if you can easily scale up

and down

Page 5: No stress with state

State in scalable systems

Page 6: No stress with state

State is the enemy of scalable applications

Page 7: No stress with state

State … •  … avoids load distribution

•  … requires replication

•  … makes systems brittle

•  … makes scaling down hard

Page 8: No stress with state

General design rule •  Move state either to the frontend and/or to the backend

•  Make everything else stateless

•  Use caches if that’s too slow (but be aware of their tradeoffs)

Frontend Client

Backend Store

Application

Stateful Stateful Stateless

Page 9: No stress with state

Frontend client state examples •  Cookies (fondly handcrafted)

•  Play framework

•  JavaScript client

•  HTML5 storage

Page 10: No stress with state

And what if we are bound to a web framework that

heavily uses session state?

Page 11: No stress with state

Standard session state handling •  Sticky Sessions

•  Session Replication

Page 12: No stress with state

Sticky Sessions

Load Balancer

Session Node

4711 2

0815 4

… …

Session: 4711

Node: 2

Page 13: No stress with state

Sticky Sessions

Load Balancer

Session Node

4711 2

0815 4

… …

Session: 4711

Problem: Node is down

✖ New Node: 4

Node 4 does not know session 4711

Page 14: No stress with state

Sticky Sessions

Load Balancer

Session Node

4711 2

0815 4

… …

Problem: Load is unequally distributed

Page 15: No stress with state

Sticky Sessions

Load Balancer

Session Node

4711 2

0815 4

… …

Problem: Scale down

#Sessions: 1 #Sessions: 1 #Sessions: 1 #Sessions: 1 #Sessions: 1

Cannot shut down node without losing session

Page 16: No stress with state

Session Replication

Load Balancer

Request

Arbitrarynode

Session replication

Page 17: No stress with state

But what if we really need to scale?

Page 18: No stress with state

Lazy session replication leaves you with inconsistencies

Load Balancer

Eager session replication does not scale well

Page 19: No stress with state

What we really want

•  Scaling (up) behavior of sticky sessions

•  Node independence of session replication

à Isolated nodes with shared session cache

Page 20: No stress with state

Isolated nodes with shared session cache

Load Balancer

Request

Arbitrarynode

Application nodes

Cache nodes

Page 21: No stress with state

Isolated nodes with shared session cache •  Easy load distribution

•  Easy up- and down-scaling

•  Tolerant against node failures

•  Independent scaling of application and cache nodes

Page 22: No stress with state

But isn’t that way too slow?

Page 23: No stress with state

Scalability vs. Latency You need to give up some latency for an individual user

to provide acceptable latency for many users

Late

ncyd

istrib

utio

n

Not scalable

Acceptable latency

Single user

Many users

Scalable

Acceptable latency

Late

ncyd

istrib

utio

n

Single user

Many users

Page 24: No stress with state

And how do I implement it?

Page 25: No stress with state

Interception points

•  StandardSession (Tomcat)

•  HttpSessionAttributeListener (JEE)

•  Valve (Tomcat)

•  Filter (JEE)

•  SessionManager (ManagerBase, Tomcat)

Page 26: No stress with state

Available libraries

•  For Redis

•  For MemCached

•  For MongoDB

•  For Cassandra

•  …

Page 27: No stress with state

Example library

•  Redis Session Manager for Apache Tomcathttps://github.com/jcoleman/tomcat-redis-session-manager

•  A bit outdated, but easy to use

•  Usage

•  Install and start Redis

•  Copy tomcat-redis-session-manager.jar and

jedis-<version>.jar to Tomcat lib directory

•  Update Tomcat configuration

•  Restart Tomcat

Page 28: No stress with state

context.xml

<Context> ... <Valve className="com.radiadesign.catalina.session.RedisSessionHandlerValve" /> <Manager className="com.radiadesign.catalina.session.RedisSessionManager" host="localhost” port="6379" database="0” maxInactiveInterval="60" /> </Context>

Page 29: No stress with state

Tradeoffs

•  Easy to set up and use

•  No code required, only configuration

•  Works smoothly

•  Not best for web frameworks

that store lots of session state

•  Does not work well with AJAX

•  Does not work with single page applications

à Another reason to prefer ROCA style?

Page 30: No stress with state

General design rule •  Move state either to the frontend and/or to the backend

•  Make everything else stateless

•  Use caches if that’s too slow (but be aware of their tradeoffs)

Frontend Client

Backend Store

Application

Stateful Stateful Stateless

Page 31: No stress with state

Challenges of scalable datastores •  Requires relaxed consistency guarantees

•  Leads to (temporal) anomalies and inconsistencies short-term due to replication or long-term due to partitioning

It might happen. Thus, it will happen!

Page 32: No stress with state

C Consistency

A Availability

P Partition Tolerance

Strict Consistency

ACID / 2PC

Strong Consistency

Quorum R&W / Paxos

Eventual Consistency

CRDT / Gossip / Hinted Handoff

Page 33: No stress with state

Strict Consistency (CA) •  Great programming model

no anomalies or inconsistencies need to be considered

•  Does not scale well best for single node databases

„We know ACID – It works!“

„We know 2PC – It sucks!“

Use for moderate data amounts

Page 34: No stress with state

And what if I need more data?

•  Distributed datastore

•  Partition tolerance is a must

•  Need to give up strict consistency (CP or AP)

Page 35: No stress with state

Strong Consistency (CP) •  Majority based consistency model

can tolerate up to N nodes failing out of 2N+1 nodes

•  Good programming model Single-copy consistency

•  Trades consistency for availability in case of partitioning

Paxos (for sequential consistency)

Quorum-based reads & writes

Page 36: No stress with state

And what if I need more availability?

•  Need to give up strong consistency (CP)

•  Relax required consistency properties even more

•  Leads to eventual consistency (AP)

Page 37: No stress with state

Eventual Consistency (AP) •  Gives up some consistency guarantees

no sequential consistency, anomalies become visible

•  Maximum availability possiblecan tolerate up to N-1 nodes failing out of N nodes

•  Challenging programming model anomalies usually need to be resolved explicitly

Gossip / Hinted Handoffs

CRDT

Page 38: No stress with state

Conflict-free Replicated Data Types •  Eventually consistent, self-stabilizing data structures

•  Designed for maximum availability

•  Tolerates up to N-1 out of N nodes failing

State-based CRDT: Convergent Replicated Data Type (CvRDT)

Operation-based CRDT: Commutative Replicated Data Type (CmRDT)

Page 39: No stress with state

A bit of theory first ...

Page 40: No stress with state

Convergent Replicated Data Type State-based CRDT – CvRDT

•  All replicas (usually) connected

•  Exchange state between replicas, calculate new state on target replica

•  State transfer at least once over eventually-reliable channels

•  Set of possible states form a Semilattice •  Partially ordered set of elements where all subsets have a Least Upper Bound (LUB)

•  All state changes advance upwards with respect to the partial order

Page 41: No stress with state

Commutative Replicated Data Type Operation-based CRDT - CmRDT

•  All replicas (usually) connected

•  Exchange update operations between replicas, apply on target replica

•  Reliable broadcast with ordering guarantee for non-concurrent updates

•  Concurrent updates must be commutative

Page 42: No stress with state

That‘s been enough theory ...

Page 43: No stress with state

Counter

Page 44: No stress with state

Op-based Counter Data

Integer i

Init i ≔ 0

Query return i

Operations increment(): i ≔ i + 1 decrement(): i ≔ i - 1

Page 45: No stress with state

State-based G-Counter (grow only) (Naïve approach)

Data

Integer i

Init i ≔ 0

Query return i

Update increment(): i ≔ i + 1

Merge( j) i ≔ max(i, j)

Page 46: No stress with state

State-based G-Counter (grow only) (Naïve approach)

R1

R3

R2

i = 1

U

i = 1

U

i = 0

i = 0

i = 0

I

I

I

M

i = 1 i = 1

M

Page 47: No stress with state

State-based G-Counter (grow only) (Vector-based approach)

Data

Integer V[] / one element per replica set

Init V ≔ [0, 0, ... , 0]

Query return ∑i V[i]

Update increment(): V[i] ≔ V[i] + 1 / i is replica set number

Merge(V‘) ∀i ∈ [0, n-1] : V[i] ≔ max(V[i], V‘[i])

Page 48: No stress with state

State-based G-Counter (grow only) (Vector-based approach)

R1

R3

R2

V = [1, 0, 0]

U

V = [0, 0, 0]

I

I

I

V = [0, 0, 0]

V = [0, 0, 0]

U

V = [0, 0, 1]

M

V = [1, 0, 0]

M

V = [1, 0, 1]

Page 49: No stress with state

State-based PN-Counter (pos./neg.) •  Simple vector approach as with G-Counter does not work

•  Violates monotonicity requirement of semilattice

•  Need to use two vectors •  Vector P to track incements

•  Vector N to track decrements

•  Query result is ∑i P[i] – N[i]

Page 50: No stress with state

State-based PN-Counter (pos./neg.) Data

Integer P[], N[] / one element per replica set

Init P ≔ [0, 0, ... , 0], N ≔ [0, 0, ... , 0]

Query Return ∑i P[i] – N[i]

Update increment(): P[i] ≔ P[i] + 1 / i is replica set number decrement(): N[i] ≔ N[i] + 1 / i is replica set number

Merge(P‘, N‘) ∀i ∈ [0, n-1] : P[i] ≔ max(P[i], P‘[i]) ∀i ∈ [0, n-1] : N[i] ≔ max(N[i], N‘[i])

Page 51: No stress with state

Non-negative Counter Problem: How to check a global invariant with local information only? •  Approach 1: Only dec if local state is > 0

•  Concurrent decs could still lead to negative value •  Approach 2: Externalize negative values as 0

•  inc(negative value) == noop(), violates counter semantics •  Approach 3: Local invariant – only allow dec if P[i] - N[i] > 0

•  Works, but may be too strong limitation •  Approach 4: Synchronize

•  Works, but violates assumptions and prerequisites of CRDTs

Page 52: No stress with state

More datatypes

•  Register

•  Set

•  Dictionary (Map)

•  Tree

•  Graph

•  Array

•  List

plus several representations for each datatype

Page 53: No stress with state

Limitations of CRDTs •  Very weak consistency guarantees

Strives for „quiescent consistency“

•  Eventually consistentNot suitable for high-volume ID generator or alike

•  Not easy to understand and model

•  Not all data structures representable

Use if availability is extremely important

Page 54: No stress with state

Further reading on CRDTs 1.  Shapiro et al., Conflict-free Replicated Data

Types, Inria Research report, 2011

2.  Shapiro et al., A comprehensive study of Convergent and Commutative Replicated Data Types,Inria Research report, 2011

3.  Basho Tag Archives: CRDT, https://basho.com/tag/crdt/

4.  Leslie Lamport, Time, clocks, and the ordering of events in a distributed system, Communications of the ACM (1978)

Page 55: No stress with state

Wrap-up

•  Scalability means scaling up and down

•  State is the enemy of scalability

•  Move state to the frontend and backend

•  Shared session state layer

•  CRDTs

The best way to deal with state is to avoid it

Page 56: No stress with state

@ufried Uwe Friedrichsen | [email protected] | http://slideshare.net/ufried | http://ufried.tumblr.com

Page 57: No stress with state