Distributed System Coordination by Zookeeper and Introduction to Kazoo Python Library
Jimmy Lai r97922028 [at] ntu.edu.tw
Dec. 22nd, 2014
Outline
1. Overview
2. Basics
3. Deployment
4. Recipes
5. References
Overview of Zookeeper
A Distributed System - Master-Worker
• Coordination tasks:
  1. Elect a new master when the current master crashes.
  2. The master assigns tasks to workers.
  3. When a worker crashes, reassign its tasks to other workers.
  4. When a worker finishes its tasks, the master assigns it new ones.
[Diagram: one master node coordinating six worker nodes]
Distributed System
• An application consists of programs running on a group of computers.
• Coordination is more difficult than writing a standalone program.
• Developers may spend too much time handling coordination, or end up with a fragile distributed system (e.g. race conditions, single points of failure).
Easy Distributed Systems with Zookeeper
• Common coordination tasks:
  • Naming service
  • Configuration management
  • Synchronization
  • Leader election
  • Message queue
  • Notification system
• Zookeeper provides a highly reliable API for these common coordination tasks.
http://en.wikipedia.org/wiki/Apache_ZooKeeper#Typical_use_cases
Powered by Zookeeper
• Zookeeper was built by Yahoo! Research.
• Users include:
  • Hadoop, HBase
  • Solr
  • Neo4j
  • Flume
  • Facebook Messages
Benefits of Zookeeper
• With Zookeeper:
  • Development of distributed systems is simpler, more agile, and more robust.
  • Zookeeper itself is simple, fast, and replicated.
• Without Zookeeper:
  • All of this coordination is much more difficult.
Benefits of Zookeeper
• Servers replicate data.
• Clients connect to one of the servers.
• Throughput test
  • Hardware: dual 2 GHz Xeon and two SATA 15K RPM drives
Zookeeper Basics
Znode (1/2)
• Based on a shared storage model: each client stores and retrieves data through the Zookeeper service.
• File system-like API.
• Znode: a node in a hierarchical tree; it may hold data and child znodes.
• A persistent znode disappears only after an explicit delete operation.
• An ephemeral znode disappears when the client that created it crashes or closes its connection, or when any client deletes it.
Znode (2/2)
• A sequential znode is assigned a monotonically increasing integer appended to its path, e.g. /path-1, /path-2 (see the sketch below).
• Versions: each znode has a version number that increases whenever its data changes.
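A minimal sketch of these znode types through kazoo (assumes a started KazooClient named zk; the paths are illustrative):

# Persistent: stays until explicitly deleted.
zk.create('/config', b'v1')

# Ephemeral: removed when this client's session ends.
zk.create('/alive/host1', b'', ephemeral=True, makepath=True)

# Sequential: ZooKeeper appends a monotonically increasing counter.
path = zk.create('/path-', b'', sequence=True)
print(path)  # e.g. /path-0000000001

# Versions: each set() bumps the data version.
zk.set('/config', b'v2')
data, stat = zk.get('/config')
print(stat.version)  # 1 after one update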
Operations
• Primitive operations (kazoo equivalents sketched below):
  • create /path data
  • delete /path
  • exists /path
  • setData /path data
  • getData /path
  • getChildren /path
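A minimal sketch of how these primitives map onto kazoo client methods (assumes a server on localhost; the path is illustrative):

from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

zk.create('/path', b'data')      # create /path data
print(zk.exists('/path'))        # exists /path -> ZnodeStat or None
data, stat = zk.get('/path')     # getData /path
zk.set('/path', b'new data')     # setData /path data
print(zk.get_children('/'))      # getChildren /
zk.delete('/path')               # delete /path

zk.stop()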
Notification
• Set a watch on a znode read operation (getData, getChildren, exists) to get a notification when the target changes (see the sketch below).
• A watch is:
  • a one-time trigger
  • ordered: events arrive at the client in the order in which they occurred
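A sketch of the one-time trigger behavior using kazoo's low-level watch parameter (assumes a started client zk and an existing /path; the callback name is illustrative):

def on_change(event):
    # fires once; re-register inside the callback to keep watching
    print('event %s on %s' % (event.type, event.path))

data, stat = zk.get('/path', watch=on_change)
zk.set('/path', b'changed')  # on_change fires
zk.set('/path', b'again')    # no callback: the watch already fired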
Session
• Session: a client creates a session with one of the servers and then issues operations over it.
• Session states:
  • connecting
  • connected
  • closed
  • not_connected
Example - Implement a Lock
• Spec: n clients try to acquire the lock at the same time, but only one of them may hold it.
• Solution: each client tries to create an ephemeral znode, e.g. /lock. The first to succeed holds the lock; the rest, whose create fails, set a watch to learn when the lock is released and then try to acquire it again (see the sketch below).
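A minimal sketch of this recipe built from raw kazoo operations (acquire_lock/release_lock are illustrative names; kazoo also ships a ready-made Lock recipe, shown later):

import threading

from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

def acquire_lock(path='/lock'):
    while True:
        try:
            # Ephemeral: the lock is released automatically if we crash.
            zk.create(path, b'', ephemeral=True)
            return  # we hold the lock
        except NodeExistsError:
            released = threading.Event()
            # One-time watch fires when /lock changes (e.g. is deleted).
            if zk.exists(path, watch=lambda event: released.set()):
                released.wait()
            # Either way, loop and try to create again.

def release_lock(path='/lock'):
    zk.delete(path)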
Example - Implement Master-Worker
• Spec:
  • Clients submit tasks.
  • The master watches for new workers and tasks, and assigns tasks to available workers.
  • A backup master takes over when the master fails.
  • Workers register themselves and then watch for new tasks.
Example - Implement Master-Worker
• Solution (a worker-side sketch follows):
  • Ephemeral znode /master for master election.
    • Backup masters set a watch on /master.
  • Persistent znode /workers.
    • The master sets a watch on /workers.
    • Each worker creates a znode under /workers, e.g. /workers/host1.
  • Persistent sequential znodes under /tasks.
    • Clients submit tasks by creating znodes under /tasks.
  • Persistent znode /assign.
    • Each worker sets a watch on its corresponding znode under /assign, e.g. /assign/host1.
    • The master assigns a task to a worker by creating a znode under /assign, e.g. /assign/host1/task1.
    • A worker marks a task as done by updating the task's data to "done".
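A sketch of the worker side of this layout in kazoo (the hostname and paths are illustrative; assumes a started client zk):

worker = 'host1'

zk.ensure_path('/workers')
zk.ensure_path('/assign/%s' % worker)

# Register: ephemeral, so the master notices if this worker crashes.
zk.create('/workers/%s' % worker, b'', ephemeral=True)

# Watch for tasks the master assigns to this worker.
@zk.ChildrenWatch('/assign/%s' % worker)
def on_assignment(task_names):
    for name in task_names:
        print('got task: %s' % name)
        # ... do the work, then mark it done:
        # zk.set('/assign/%s/%s' % (worker, name), b'done')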
Zookeeper Deployment
Zookeeper Server Run Modes
• Standalone: a single server.
• Quorum: multiple servers replicate the data (a sample config follows).
  • The cluster uses majority voting to stay consistent, so it can tolerate fewer than half of its nodes crashing (e.g. a 5-node ensemble tolerates 2 failures).
  • Default ports: client (2181), quorum (2182), election (2183).
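As a sketch, a three-node quorum could be configured with a zoo.cfg like the one below (the hostnames are placeholders, and the quorum/election ports follow the convention above; each server also needs a myid file in dataDir):

# zoo.cfg (one copy per server)
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
# server.N=hostname:quorum_port:election_port
server.1=zk1:2182:2183
server.2=zk2:2182:2183
server.3=zk3:2182:2183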
Clients
• Native primitive operations:
  • C library
  • Java library
• Recipes (third-party high-level APIs):
  • Java: Curator (by Netflix)
  • Python: kazoo (by Mozilla and Zope)
Java Client Console
• bin/zkCli.sh -server 127.0.0.1:2181
• Commands:
  • get path [watch]
  • ls path [watch]
  • set path data [version]
  • create path data acl
  • delete path [version]
  • setquota -n|-b val path
Python Client - kazoo

from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()
# ... operations go here ...
zk.stop()
https://kazoo.readthedocs.org/en/latest/
import os
import socket

from kazoo.client import KazooClient
from kazoo.client import KazooState

def my_listener(state):
    if state == KazooState.LOST:
        print('lost session')
    elif state == KazooState.SUSPENDED:
        print('disconnected from Zookeeper')
    elif state == KazooState.CONNECTED:
        # try to become the master
        print('connected')

zk = KazooClient(hosts='127.0.0.1:2181')
zk.add_listener(my_listener)
zk.start()
lock = zk.Lock('/master', '%s-%d' % (socket.gethostname(), os.getpid()))
zk.ensure_path("/path")
zk.set("/path", "data_string".encode('utf8'))
data, stat = zk.get("/path")
Zookeeper Recipes
Common Recipes
• Lock
• Election
• Counter
• Barrier
• Partitioner
• Party
• Queue
• Watch
Lock

zk = KazooClient()
zk.start()
lock = zk.Lock("/lockpath", "my-identifier")
with lock:  # blocks waiting for lock acquisition
    pass    # do something with the lock
# the lock is released automatically when the block exits
# (or call lock.acquire() / lock.release() explicitly)
Election

def my_leader_function():
    pass  # runs while this client is the leader

zk = KazooClient()
zk.start()
election = zk.Election("/electionpath", "my-identifier")
# blocks until the election is won, then calls my_leader_function()
election.run(my_leader_function)
Counter

zk = KazooClient()
zk.start()
counter = zk.Counter("/int")
counter += 2
counter -= 1
counter.value == 1

counter = zk.Counter("/float", default=1.0)
counter += 2.0
counter.value == 3.0
Barrier

barrier = zk.Barrier("/barrier")
barrier.create()
barrier.wait()  # blocks until the barrier is removed
# the master releases the barrier with:
barrier.remove()
Partitioner

from kazoo.client import KazooClient

client = KazooClient()
client.start()
qp = client.SetPartitioner(
    path='/work_queues',
    set=('queue-1', 'queue-2', 'queue-3'))
while 1:
    if qp.failed:
        raise Exception("Lost or unable to acquire partition")
    elif qp.release:
        qp.release_set()
    elif qp.acquired:
        for partition in qp:
            pass  # do something with each partition
    elif qp.allocating:
        qp.wait_for_acquire()
Party

party1 = zk.Party("/party1", "my-identifier")
party2 = zk.Party("/party2", "my-identifier")
party1.join()
"my-identifier" in party1      # True
"my-identifier" not in party2  # True
Queue

queue = zk.LockingQueue("/queue")
tasks = ['task-1', 'task-2']  # example payloads
for task in tasks:
    queue.put(task.encode('utf8'))
task = queue.get()  # locks and returns the next item
# after processing, call queue.consume() to remove it from the queue
Watch: watch a znode continuously

@zk.DataWatch('/last_scanned_card_key')
def my_func(data, stat, event):
    print("Data is %s" % data)
    print("Version is %s" % stat.version)
    print("Event is %s" % event)
References
• Flavio Junqueira and Benjamin Reed, ZooKeeper: Distributed Process Coordination, O'Reilly Media, Inc., November 2013.
• Zookeeper website: http://zookeeper.apache.org/