Akka - Distributed by Design
Björn Antonsson
@bantonsson
Wednesday, 30 October 13
Overview
• Akka
• Actors
• Distributed by Design
• Diving Into The Cluster
• What The Future Brings
Akka
• Toolkit and runtime for reactive applications
• Write applications that are
  – Concurrent
  – Distributed
  – Fault tolerant
  – Event-driven
Akka
• Has multiple tools
  – Actors
  – Futures
  – Dataflow
  – Remoting
  – Clustering
Actors
• Isolated lightweight event-based processes
• Share nothing
• Communicate through async messages
• Each actor has a mailbox (message queue)
• Each actor has a parent handling its failures
• Location transparent (distributable)
Actors
• An island of sanity in a sea of concurrency
• Everything inside the actor is sequential
  – Processes one message at a time
• Very lightweight
  – Create millions
  – Create short lived
• Inherently concurrent
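The mailbox idea can be sketched in plain Java without Akka. This is only an illustration of "one message at a time", not how Akka implements actors; the class `MiniActor` and its methods are hypothetical names made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an actor; NOT Akka's implementation.
final class MiniActor {
    // A single-threaded executor plays the role of the mailbox: its queue
    // holds pending messages and its one thread processes them in order.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // Private state, only ever touched by the mailbox thread while running.
    private final List<String> seen = new ArrayList<>();

    // Asynchronous send: enqueue the message and return immediately.
    void tell(String message) {
        mailbox.execute(() -> seen.add(message));
    }

    // Stop accepting messages, wait for the mailbox to drain,
    // then return everything that was processed.
    List<String> stopAndDrain() {
        mailbox.shutdown();
        try {
            if (!mailbox.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("mailbox did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return seen;
    }
}
```

Because only the mailbox thread mutates `seen`, the state needs no locks even when many threads call `tell` concurrently.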
Actor code sample
    import java.io.Serializable;
    import akka.actor.UntypedActor;
    import akka.event.Logging;
    import akka.event.LoggingAdapter;

    // Define the message(s) the Actor should be able to respond to
    public class Greeting implements Serializable {
        public final String who;
        public Greeting(String who) { this.who = who; }
    }

    // Define the Actor class
    public class GreetingActor extends UntypedActor {
        LoggingAdapter log = Logging.getLogger(getContext().system(), this);
        int counter = 0;

        // Define the Actor's behavior
        public void onReceive(Object message) {
            if (message instanceof Greeting) {
                counter++;
                log.info("Hello #" + counter + " " + ((Greeting) message).who);
            } else {
                unhandled(message);
            }
        }
    }
Creating and using Actors
    ActorSystem system = ActorSystem.create("MySystem");
    ActorRef greeter = system.actorOf(
        Props.create(GreetingActor.class), "greeter");
    greeter.tell(new Greeting("Charlie Parker"), null);
Actors compared to Objects
• Think of an Actor as an Object
• You can't peek inside it
• You don't call methods
  – You send messages (asynchronously)
• You don't get return values
  – You receive messages (asynchronously)
• The internal state is thread safe
Why should I care?
The world is multicore!
Amdahl’s Law
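Amdahl's Law bounds the speedup from adding cores: if a fraction p of the work can run in parallel, the speedup on n cores is 1 / ((1 - p) + p / n). A small sketch of the formula follows; the class and method names are made up for illustration.

```java
// Illustrative helper for Amdahl's Law; names are hypothetical.
final class Amdahl {
    // Speedup on n cores when a fraction p (0 <= p < 1) of the work is parallelizable.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    // Upper bound on the speedup as the number of cores grows without limit.
    static double maxSpeedup(double p) {
        return 1.0 / (1.0 - p);
    }
}
```

Even with 95% of the work parallelizable, the serial 5% caps the speedup at 20x no matter how many cores are added, which is why reducing shared, sequential sections matters on multicore hardware.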
So what's the catch?
• Really no catch
• A different programming paradigm
• All about tradeoffs
  – Some things are easier, some harder
• Think different
Distributed by Design
Remote Actors
• Sending messages decouples actors
• Local or remote doesn't matter
[Diagram: NODE 1 and NODE 2]
Remote Actors
• Zero code change deployment decision
• Add configuration to the Actor System

    akka {
      actor {
        # Configure a Remote Provider
        provider = "akka.remote.RemoteActorRefProvider"
        deployment {
          # The "greeter" actor
          /greeter {
            # Define Remote Path: protocol://actor-system@hostname:port
            remote = "akka.tcp://MySystem@machine1:2552"
          }
        }
      }
    }
Looking up Actors
    ActorSelection greeter = system.actorSelection(
        "akka.tcp://MySystem@machine1:2552/user/greeter");
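The lookup path packs protocol, actor system name, hostname, port and actor path into one URI-shaped string. As a rough illustration of that anatomy, the sketch below splits it with `java.net.URI`; the class `ActorAddressParts` is a hypothetical helper, not an Akka API.

```java
import java.net.URI;

// Hypothetical helper that splits an actor path into its parts using java.net.URI.
final class ActorAddressParts {
    final String protocol;
    final String system;
    final String host;
    final int port;
    final String path;

    private ActorAddressParts(String protocol, String system, String host, int port, String path) {
        this.protocol = protocol;
        this.system = system;
        this.host = host;
        this.port = port;
        this.path = path;
    }

    static ActorAddressParts parse(String actorPath) {
        URI uri = URI.create(actorPath);
        // The actor system name sits where a URI normally carries user info.
        return new ActorAddressParts(
            uri.getScheme(), uri.getUserInfo(), uri.getHost(), uri.getPort(), uri.getPath());
    }
}
```

Parsing the path from the slide yields protocol `akka.tcp`, system `MySystem`, host `machine1`, port `2552` and path `/user/greeter`.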
Can you see the problem?
Fixed addresses

    akka {
      actor {
        provider = "akka.remote.RemoteActorRefProvider"
        deployment {
          /greeter {
            remote = "akka.tcp://MySystem@machine1:2552"
          }
        }
      }
    }

    ActorSelection greeter = system.actorSelection(
        "akka.tcp://MySystem@machine1:2552/user/greeter");

Both the deployment configuration and the lookup hard-code machine1:2552.
Diving Into The Cluster
Akka Cluster 2.2
• Gossip-Based Cluster Membership
• Failure Detector
• Cluster DeathWatch
• Cluster-Aware Routers
Cluster Membership
• Node ring à la Riak / Dynamo
• Gossip-protocol for state dissemination
• Vector Clocks to resolve conflicts
• Peer based failure detector
Node ring with gossiping Members
[Diagram: ten member nodes arranged in a ring, exchanging gossip]
Gossip Protocol
• Cluster Membership
• Leader Determination
• Targets Random Node
  – Partly biased towards nodes with older state
• Push-Pull based
  – Sender only sends its version number
  – Receiver asks for newer information
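A push-pull round can be sketched as follows. This is a deliberately simplified model, not Akka's gossip code: it assumes a single `long` version counter where the real cluster compares vector clocks, and it assumes that a receiver with newer state pushes it back to the sender. `GossipNode` and its methods are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

// Simplified push-pull gossip between two nodes; NOT Akka's implementation.
final class GossipNode {
    long version = 0;                                  // assumption: one counter instead of a vector clock
    final Set<String> members = new TreeSet<>();

    // A local membership change bumps the version.
    void addMember(String member) {
        members.add(member);
        version++;
    }

    // One gossip round: the sender first offers only its version number.
    static void gossip(GossipNode sender, GossipNode receiver) {
        long offered = sender.version;
        if (receiver.version < offered) {
            receiver.pullFrom(sender);                 // receiver asks for the newer state
        } else if (receiver.version > offered) {
            sender.pullFrom(receiver);                 // assumption: newer receiver pushes state back
        }
        // Equal versions: nothing more needs to cross the wire.
    }

    private void pullFrom(GossipNode newer) {
        members.clear();
        members.addAll(newer.members);
        version = newer.version;
    }
}
```

Sending the version first keeps the common case cheap: when both sides already agree, the full membership state is never transferred.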
Vector Clocks
• Partial ordering in a distributed system
• Detects causality violations
• Used to reconcile and merge cluster state
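A minimal vector clock sketch shows where the partial ordering comes from: two clocks are ordered only if one is less than or equal to the other in every entry, otherwise they are concurrent and the state has to be merged. The class below is an illustration with made-up names, not Akka's `VectorClock`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative vector clock; names and API are hypothetical.
final class VectorClock {
    enum Order { SAME, BEFORE, AFTER, CONCURRENT }

    private final Map<String, Long> counters = new HashMap<>();

    // Record one more event on the given node (mutates and returns this clock).
    VectorClock tick(String node) {
        counters.merge(node, 1L, Long::sum);
        return this;
    }

    // Compare entry by entry; missing entries count as zero.
    Order compareWith(VectorClock other) {
        boolean less = false;
        boolean greater = false;
        Set<String> nodes = new HashSet<>(counters.keySet());
        nodes.addAll(other.counters.keySet());
        for (String node : nodes) {
            long mine = counters.getOrDefault(node, 0L);
            long theirs = other.counters.getOrDefault(node, 0L);
            if (mine < theirs) less = true;
            if (mine > theirs) greater = true;
        }
        if (less && greater) return Order.CONCURRENT;   // conflicting histories
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.SAME;
    }

    // Merge takes the pointwise maximum and returns a new clock.
    VectorClock merge(VectorClock other) {
        VectorClock merged = new VectorClock();
        merged.counters.putAll(counters);
        other.counters.forEach((node, count) ->
            merged.counters.merge(node, count, (x, y) -> Math.max(x, y)));
        return merged;
    }
}
```

A `CONCURRENT` result is the signal that neither side can simply overwrite the other; the merged clock then dominates both inputs.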
Failure Detection
• Uses The Phi Accrual Failure Detector
• Peer Based with limited targets
  – B monitors A
  – A sends heartbeats to B
  – B samples inter-arrival time to expect next beat
  – B measures continuum of deadness of A
  – B marks A as unreachable if A is dead enough
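The "continuum of deadness" is the phi value: the longer a heartbeat is overdue relative to the sampled inter-arrival times, the higher phi climbs. The sketch below is a simplification and not Akka's detector: it assumes exponentially distributed inter-arrival times, which gives phi = elapsed / (mean * ln 10). The class name, sample window and threshold are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified phi accrual detector; assumes exponential inter-arrival times.
final class PhiAccrualSketch {
    private final Deque<Long> intervals = new ArrayDeque<>();
    private final int maxSamples;
    private long lastHeartbeatMillis = -1;

    PhiAccrualSketch(int maxSamples) {
        this.maxSamples = maxSamples;
    }

    // B records a heartbeat from A and samples the inter-arrival time.
    void heartbeat(long nowMillis) {
        if (lastHeartbeatMillis >= 0) {
            intervals.addLast(nowMillis - lastHeartbeatMillis);
            if (intervals.size() > maxSamples) intervals.removeFirst();
        }
        lastHeartbeatMillis = nowMillis;
    }

    // phi = -log10(P(next heartbeat arrives later than the time elapsed so far)).
    double phi(long nowMillis) {
        if (intervals.isEmpty()) return 0.0;
        double mean = intervals.stream().mapToLong(Long::longValue).average().getAsDouble();
        double elapsed = nowMillis - lastHeartbeatMillis;
        return elapsed / (Math.max(mean, 1.0) * Math.log(10.0));
    }

    // A is considered reachable while phi stays below the chosen threshold.
    boolean isAvailable(long nowMillis, double threshold) {
        return phi(nowMillis) < threshold;
    }
}
```

With heartbeats arriving every second and a threshold of 8, this model tolerates roughly 18 seconds of silence before marking the peer unreachable; a lower threshold reacts faster at the cost of more false positives.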
[Diagram: ten member nodes, with heartbeats sent between them]
Member States
• Joining
• Up
• Leaving
• Exiting
• Down
• Removed
• Unreachable*
[Diagram: member state transitions between joining, up, leaving, exiting, down, removed and unreachable*; edges are labelled join, leave, (leader action) and (fd*)]
Leader
• No SPOF
• Can be any node
• No handover involved
• Deterministically recognized by all nodes
  – always the first member in the sorted membership ring
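Because every node sorts the same membership the same way, each one can work out the leader locally without an election. A minimal sketch, assuming members are ordered by host and then port (the exact ordering Akka uses is not stated on the slide); `LeaderSketch` and `Member` are hypothetical names.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative leader determination; the host-then-port ordering is an assumption.
final class LeaderSketch {
    static final class Member {
        final String host;
        final int port;

        Member(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Every node applies the same total order to the membership ring.
    static final Comparator<Member> RING_ORDER =
        Comparator.comparing((Member m) -> m.host).thenComparingInt(m -> m.port);

    // The leader is simply the first member in that order.
    static Member leader(List<Member> members) {
        return members.stream()
            .min(RING_ORDER)
            .orElseThrow(() -> new IllegalArgumentException("empty membership"));
    }
}
```

No handover is needed: when the current leader is removed, the next member in the order becomes the leader as soon as nodes see the new membership.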
Leader Duties
• Shift members from
  – Joining to Up
  – Exiting to Removed
  – Up to Down (auto-downing) to Removed
• Can only be performed on convergence
Cluster Convergence
• A Node sees that all other Nodes have seen this version of the gossip
• Is always local to that Node
• Unreachable Nodes block this
• Mark Unreachable Nodes as Down to proceed
  – Manual Ops intervention
  – Automatic action
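The convergence rule can be expressed as a small predicate over a node's local view. This is a sketch under simplifying assumptions (plain string node names, Down nodes already excluded from the member set), not Akka's implementation; `ConvergenceSketch` is a hypothetical name.

```java
import java.util.Set;

// Illustrative convergence check over one node's local view of the gossip.
final class ConvergenceSketch {
    // members:     nodes that must agree (assumption: Down nodes are already excluded)
    // seenBy:      nodes known to have seen the current gossip version
    // unreachable: nodes currently flagged by the failure detector
    static boolean converged(Set<String> members, Set<String> seenBy, Set<String> unreachable) {
        return unreachable.isEmpty() && seenBy.containsAll(members);
    }
}
```

With members {a, b, c}, seen-by {a, b} and c unreachable, the check fails; once c is marked Down and dropped from the member set, the same seen-by set is enough for convergence and the leader can act again.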
Cluster Metrics
• Gossip based
• Metrics about the Nodes in the Cluster
  – Load
  – CPU Usage
  – Processors
  – Heap Memory
    • Used, Committed, Max
Cluster Roles
• Assign roles to Nodes (named tags)
• Nodes can have multiple roles
• Restrict work to certain roles
• Deterministically recognized role leader
Cluster Events
• Subscribe to be notified
• Membership Changes
  – Up, Exited, Removed
• Leader Changed
• Metrics Changed
• Role Leader Changed
• Member Unreachable
Cluster DeathWatch
• Triggered by marking node «A» Down
  – Tell parents of their lost children on «A»
  – Kill all children of actors on «A»
  – Send Terminated for actors on «A»
Building on The Cluster
Load Balancing
• Cluster aware routers
  – Round Robin Router
  – Consistent Hashing Router
  – Adaptive Load Balancing Router
    • Use Cluster Metrics to select target
    • CPU/Memory/Load
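To illustrate why consistent hashing suits a cluster whose membership changes, here is a small hash ring in plain Java. It is a generic sketch of the technique, not Akka's router: the class `HashRing`, the virtual node scheme and the mixing function (the MurmurHash3 32-bit finalizer applied to `String.hashCode`) are choices made for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Generic consistent hash ring; NOT Akka's ConsistentHashingRouter.
final class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // Each node is placed on the ring several times to even out the load.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(mix((node + "#" + i).hashCode()), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // A key is routed to the first node at or after its position, wrapping around.
    String route(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no routees");
        Map.Entry<Integer, String> entry = ring.ceilingEntry(mix(key.hashCode()));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // MurmurHash3 32-bit finalizer, used here only to spread String.hashCode values.
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

When a node joins, only the keys that now fall on the new node move; every other key keeps its routee, which is what makes this router attractive when members come and go.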
Cluster Contributions/Patterns
• Distributed Pub/Sub Mediator
  – Publish and Subscribe to message flows
• Cluster Singleton
  – HA singleton actor instance within the cluster
• Cluster Client
  – Let other systems connect to the cluster
DistributedPubSubMediator
[Diagram sequence: a Frontend and a Master, each with its own Mediator. Successive frames add the labels Put, then Send, then Send.]
ClusterSingleton
[Diagram sequence: two ClusterSingletonManagers, one with the Master and one with a Master (Standby). In the last frame both are labelled Master.]
ClusterClient & ClusterSingleton
[Diagram: two Workers, each with a ClusterClient, and two Receptionists; the cluster side also shows two Mediators, the Master and a Master (Standby).]
Typesafe Activator
Distributed Workers Cluster Template
• http://typesafe.com/platform/getstarted
What The Future Brings
Gossip Optimizations
• Several times faster Vector Clock comparison
• Fewer Vector Clock comparisons
• Gossip message size cut in half
• Gossip message scrubbing
• Lazy deserialization of Gossip messages
Return from Unreachable
• Unreachable to Reachable
• Cluster is more resilient to fluctuations
[Diagram: the member state transitions (joining, up, leaving, exiting, down, removed, unreachable*), with edges labelled join, leave, (leader action) and (fd*)]
Rebuilt Routers
• Rebuilt from the ground up
• Routing logic usable in Actors
• Actor Selection as routees
• Improved Cluster behavior
Persistence
• New module akka-persistence
• Command sourcing & event sourcing
• Based on the proven Eventsourced library
• Migrate actors by persisting their state
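The core idea of event sourcing fits in a few lines: state changes are appended to a journal, and the state is rebuilt by replaying that journal. The sketch below is plain Java and does not use the akka-persistence API; `EventSourcedCounter` and the in-memory list standing in for a journal are hypothetical.

```java
import java.util.List;

// Illustrative event-sourced state; NOT the akka-persistence API.
final class EventSourcedCounter {
    private final List<Integer> journal;   // stand-in for a durable event journal
    private int total = 0;

    // Recovery: replay every journaled event to rebuild the state.
    EventSourcedCounter(List<Integer> journal) {
        this.journal = journal;
        for (int delta : journal) {
            apply(delta);
        }
    }

    // Persist the event first, then apply it to the in-memory state.
    void add(int delta) {
        journal.add(delta);
        apply(delta);
    }

    int total() {
        return total;
    }

    private void apply(int delta) {
        total += delta;
    }
}
```

Constructing a second instance from a copy of the journal yields the same total, which is the mechanism behind migrating an actor: start it elsewhere and let it replay its events.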
Resources
Survey and Resources
• Help Akka get better. Fill out the survey!
  – http://tinyurl.com/akka-survey
• Akka Cluster Documentation
  – http://tinyurl.com/akka-cluster
• Akka Cluster in Production Blog Post – Ryan Tanner
  – http://tinyurl.com/akka-at-conspire
Coursera Course
• Principles of Reactive Programming by Martin Odersky, Erik Meijer and Roland Kuhn
  – Starts 4th of November 2013
  – 7 weeks
  – Workload: 5-7 hours a week
  – Free as in free beer
• https://www.coursera.org/course/reactive