Distributed System Coordination by Zookeeper and Introduction to Kazoo Python Library

Jimmy Lai r97922028 [at] ntu.edu.tw

Dec. 22nd, 2014

Outline

1. Overview
2. Basics
3. Deployment
4. Recipes
5. References

Overview of Zookeeper

A Distributed System - Master-Worker

• Coordination tasks:
  1. Elect a new master when the master crashes.
  2. The master assigns tasks to workers.
  3. When a worker crashes, its task is re-assigned to another worker.
  4. When a worker finishes its task, the master assigns a new task to it.

[Figure: one Master and six Workers]

Distributed System

• An application consists of programs running on a group of computers.
• Coordination is more difficult than writing a standalone program.
• Developers may spend too much time handling the coordination, or end up with a fragile distributed system (e.g. race conditions, a single point of failure).

Easy Distributed System by Zookeeper

• Common coordination tasks:
  • Naming service
  • Configuration management
  • Synchronization
  • Leader election
  • Message queue
  • Notification system
• Zookeeper provides a highly reliable API for those common coordination tasks

http://en.wikipedia.org/wiki/Apache_ZooKeeper#Typical_use_cases

Powered By Zookeeper

• Zookeeper was built by Yahoo Research
• Users:
  • Hadoop, HBase
  • Solr
  • Neo4j
  • Flume
  • Facebook Messages

Benefits of Zookeeper

• With Zookeeper:
  • developing a distributed system is simpler, and the result is more agile and robust
  • Zookeeper is simple, fast and replicated
• Without Zookeeper:
  • more difficult

Benefits of Zookeeper

• Servers replicate data
• Clients connect to one of the servers
• Throughput test
  • Hardware: dual 2 GHz Xeon and two SATA 15K RPM drives

Zookeeper Basics

Znode (1/2)

• Based on a shared storage model: each client stores data in and reads data from the Zookeeper service
• File system-like API
• Znode: a node in a hierarchical tree that can hold optional data and optional child znodes
• A persistent znode disappears only when it is deleted
• An ephemeral znode disappears when the client that created it crashes or closes its connection, or when any client deletes it

Znode (2/2)

• A sequential znode is assigned a monotonically increasing integer at the end of its path, e.g. /path-1, /path-2
• Versions: each znode has a version, which is incremented whenever its data changes
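
The znode properties above (persistent vs. ephemeral, sequential suffixes, versions) can be illustrated with a small in-memory model. This is only a sketch and not the Zookeeper API: the class `ZnodeTree` and its method names are made up for illustration, and the sequence suffix is simplified to the /path-1 style used above (a real server pads the number and keeps the counter per parent znode).

```python
class ZnodeTree:
    """Toy in-memory model of znode semantics (hypothetical, not the real API)."""

    def __init__(self):
        self.nodes = {}      # path -> {"data", "version", "owner"}
        self.counters = {}   # path prefix -> last sequence number handed out

    def create(self, path, data=b"", owner=None, sequential=False):
        # owner=None models a persistent znode; otherwise the znode is
        # ephemeral and tied to that client's session.
        if sequential:
            number = self.counters.get(path, 0) + 1
            self.counters[path] = number
            path = "%s%d" % (path, number)
        if path in self.nodes:
            raise KeyError("node exists: " + path)
        self.nodes[path] = {"data": data, "version": 0, "owner": owner}
        return path

    def set(self, path, data):
        node = self.nodes[path]
        node["data"] = data
        node["version"] += 1   # every data change bumps the version
        return node["version"]

    def delete(self, path):
        del self.nodes[path]

    def close_session(self, owner):
        # Ephemeral znodes vanish when their creator's session ends.
        for path in [p for p, n in self.nodes.items() if n["owner"] == owner]:
            del self.nodes[path]


tree = ZnodeTree()
tree.create("/config", b"v1")                       # persistent
tree.create("/master", b"host1", owner="client-a")  # ephemeral
first = tree.create("/path-", sequential=True)
second = tree.create("/path-", sequential=True)
new_version = tree.set("/config", b"v2")
tree.close_session("client-a")                      # removes /master only
remaining = sorted(tree.nodes)
```

After the session of client-a is closed, only the persistent znodes remain in `remaining`.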

Operations

• Primitive operations:
  • create /path data
  • delete /path
  • exists /path
  • setData /path data
  • getData /path
  • getChildren /path
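
For orientation, the primitive operations map onto kazoo client methods roughly as follows. The function is a sketch that expects an already started `KazooClient`; the function name `demo_primitives` and the /demo path are assumptions made for illustration.

```python
def demo_primitives(zk, path="/demo"):
    """Run the six primitive operations once on a started kazoo client."""
    zk.create(path, b"hello")            # create /path data
    found = zk.exists(path) is not None  # exists /path (None when absent)
    zk.set(path, b"world")               # setData /path data
    data, stat = zk.get(path)            # getData /path -> (bytes, stat)
    children = zk.get_children(path)     # getChildren /path
    zk.delete(path)                      # delete /path
    return found, data, children
```

With a server listening on 127.0.0.1:2181, it could be called as `demo_primitives(zk)` between `zk.start()` and `zk.stop()`.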

Notification

• Set a watch with a read operation on a znode (getData, getChildren, exists) to get a notification when the target changes
• A watch is:
  • a one-time trigger
  • delivered with an ordering guarantee: the client receives events in the order in which the changes happened
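
The one-time trigger behaviour can be sketched with a toy in-memory node. The class `WatchedNode` is hypothetical and simplified: here the callback receives the new data directly, whereas a real watch event only reports what kind of change happened and on which path.

```python
class WatchedNode:
    """Toy model of a znode whose watches fire only once (hypothetical)."""

    def __init__(self, data):
        self.data = data
        self.watchers = []

    def get(self, watch=None):
        # Reading with a watch registers it for the next change only.
        if watch is not None:
            self.watchers.append(watch)
        return self.data

    def set(self, data):
        self.data = data
        # Fire every pending watch, then forget them: one-time trigger.
        pending, self.watchers = self.watchers, []
        for callback in pending:
            callback(data)


events = []
node = WatchedNode("a")
node.get(watch=events.append)
node.set("b")   # the watch fires
node.set("c")   # no watch is registered any more, so nothing fires


def rewatch(new_data):
    # To keep watching, a client registers a new watch inside its callback.
    events.append(new_data)
    node.get(watch=rewatch)


node.get(watch=rewatch)
node.set("d")
node.set("e")
```

The change to "c" goes unnoticed because the first watch was already consumed, while the re-registering callback sees both "d" and "e".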

Session

• Session: a client creates a session connection to one of the servers and then starts issuing operations
• Session states:
  • connecting
  • connected
  • closed
  • not_connected

Example - implement a lock

• Spec: n clients try to get the lock at the same time, but only one of them can get it.
• Solution: each client tries to create an ephemeral znode, e.g. /lock. The first one succeeds and holds the lock. The rest fail to create the znode, so they set up a watch to learn when the lock is released and then try to acquire it again.
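
The lock protocol described above can be traced with an in-memory stand-in for the service. Everything here (`LockService`, `acquire`, the client names) is hypothetical and only mirrors the steps: try to create /lock, and on failure watch for its removal and retry.

```python
class LockService:
    """Toy stand-in for the Zookeeper service (hypothetical, in-memory)."""

    def __init__(self):
        self.holder = None   # owner of the ephemeral /lock znode, if any
        self.waiting = []    # one-time watches waiting for /lock to vanish

    def try_create_lock(self, client):
        # Creating /lock succeeds only if it does not exist yet.
        if self.holder is None:
            self.holder = client
            return True
        return False

    def watch_lock(self, callback):
        self.waiting.append(callback)

    def delete_lock(self, client):
        # Also models the ephemeral znode vanishing when the holder crashes.
        if self.holder == client:
            self.holder = None
            pending, self.waiting = self.waiting, []
            for callback in pending:
                callback()


def acquire(service, client, acquired):
    if service.try_create_lock(client):
        acquired.append(client)
    else:
        # Lost the race: wait for the release, then try again.
        service.watch_lock(lambda: acquire(service, client, acquired))


service = LockService()
order = []
for name in ("c1", "c2", "c3"):
    acquire(service, name, order)
service.delete_lock("c1")   # c1 releases the lock (or crashes)
```

After the release, c2 holds the lock and c3 has registered a fresh watch, since its retry failed again.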

Example - implement master-worker

• Spec:
  • clients submit tasks
  • the master watches for new workers and tasks, and assigns tasks to available workers
  • a backup master takes over when the master fails
  • workers register themselves and then watch for new tasks

Example - implement master-worker

• Solution:
  • ephemeral znode /master for master election
    • backup masters set up a watch on /master
  • persistent znode /workers
    • the master sets up a watch on /workers
    • each worker creates a znode in /workers, e.g. /workers/host1
  • persistent sequential znode /tasks
    • clients submit tasks by creating a znode under /tasks
  • persistent znode /assign
    • workers set up a watch on their corresponding znode under /assign, e.g. /assign/host1
    • the master assigns a task to a worker by creating a znode under /assign, e.g. /assign/host1/task1
    • the worker marks a task as done by updating the task's data to "done"
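
As a small illustration of the assignment step, the helper below computes which znode paths a master could create under /assign, given the children of /tasks and /workers. The function name and the round-robin policy are assumptions chosen for this sketch; the slides do not prescribe an assignment policy.

```python
from itertools import cycle


def plan_assignments(tasks, workers):
    """Return the znode paths a master could create under /assign.

    Round-robin over the sorted workers is an assumed policy.
    """
    if not workers:
        return []   # nobody to assign to yet
    targets = cycle(sorted(workers))
    return ["/assign/%s/%s" % (next(targets), task) for task in sorted(tasks)]


paths = plan_assignments(
    tasks=["task1", "task2", "task3"],
    workers=["host1", "host2"],
)
```

Each worker would then see new children appear under its own /assign/<host> znode through its watch.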

Zookeeper Deployment

Zookeeper Server Run Modes

• Standalone: single server
• Quorum: multiple servers replicate the data
  • the cluster uses majority voting to stay consistent, so it tolerates the crash of fewer than half of its nodes
• ports: client (2181 by default); the quorum and election ports are set per server in the configuration (2888 and 3888 in the official examples)
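
The majority rule translates into simple arithmetic. The helper names below are made up for this sketch.

```python
def quorum_size(servers):
    """Smallest number of servers that forms a majority."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers may crash while a majority is still alive."""
    return servers - quorum_size(servers)


# (cluster size, quorum size, tolerated crashes)
table = [(n, quorum_size(n), tolerated_failures(n)) for n in (1, 3, 4, 5)]
```

A cluster of 4 tolerates no more crashes than a cluster of 3, which is why odd cluster sizes are the usual choice.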

Clients

• Native primitive operations
  • C library
  • Java library
• Recipes (3rd party high level API)
  • Java: Curator (by Netflix)
  • Python: kazoo (by Mozilla and Zope)

Java Client Console

• bin/zkCli.sh -server 127.0.0.1:2181
• Commands
  • get path [watch]
  • ls path [watch]
  • set path data [version]
  • create path data acl
  • delete path [version]
  • setquota -n|-b val path

Python client - kazoo

• from kazoo.client import KazooClient
• zk = KazooClient(hosts='127.0.0.1:2181')
• zk.start()
• zk.stop()

https://kazoo.readthedocs.org/en/latest/

import os
import socket

from kazoo.client import KazooClient
from kazoo.client import KazooState

def my_listener(state):
    # called by kazoo whenever the connection state changes
    if state == KazooState.LOST:
        print('lost session')
    elif state == KazooState.SUSPENDED:
        print('disconnected from Zookeeper')
    elif state == KazooState.CONNECTED:
        # try to become the master
        print('connected')

zk = KazooClient(hosts='127.0.0.1:2181')
zk.add_listener(my_listener)
zk.start()

# the identifier records which host and process ask for the lock
lock = zk.Lock('/master', '%s-%d' % (socket.gethostname(), os.getpid()))

zk.ensure_path("/path")
zk.set("/path", "data_string".encode('utf8'))
start_key, stat = zk.get("/path")

Zookeeper Recipes

Common Recipes

• lock
• election
• counter
• barrier
• partitioner
• party
• queue
• watch

Lock

zk = KazooClient()
zk.start()
lock = zk.Lock("/lockpath", "my-identifier")
with lock:  # blocks waiting for lock acquisition
    # do something with the lock
    pass
# leaving the with block releases the lock, so no explicit lock.release() is needed

Election

zk = KazooClient()
zk.start()
election = zk.Election("/electionpath", "my-identifier")
# blocks until the election is won, then calls
# my_leader_function()
election.run(my_leader_function)
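
A common way to build leader election on top of znodes is to let every candidate create an ephemeral sequential znode and treat the lowest sequence number as the leader. The sketch below only shows that selection rule on a list of child names; the names and helper functions are hypothetical, and it is not taken from kazoo's implementation.

```python
def sequence_of(child):
    # "host2-0000000003" -> 3
    return int(child.rsplit("-", 1)[1])


def pick_leader(children):
    """The candidate with the lowest sequence number leads."""
    return min(children, key=sequence_of)


candidates = ["host2-0000000003", "host1-0000000005", "host3-0000000004"]
leader = pick_leader(candidates)

# If the leader's session ends, its ephemeral znode disappears
# and the candidate with the next-lowest number takes over.
candidates.remove(leader)
successor = pick_leader(candidates)
```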

Counter

zk = KazooClient()
zk.start()
counter = zk.Counter("/int")
counter += 2
counter -= 1
counter.value == 1
counter = zk.Counter("/float", default=1.0)
counter += 2.0
counter.value == 3.0
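
One plausible way to build such a counter on znode versions is optimistic concurrency: read the value together with its version, write back only if the version is unchanged, and retry otherwise. The in-memory sketch below illustrates that idea; `VersionedCell` and `add` are hypothetical, and this is an assumption about the technique rather than a description of kazoo's internals.

```python
class VersionedCell:
    """Toy znode holding a number plus a version (hypothetical)."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def get(self):
        return self.value, self.version

    def set(self, value, expected_version):
        # Reject the write if someone else changed the data in between.
        if expected_version != self.version:
            return False
        self.value = value
        self.version += 1
        return True


def add(cell, delta, interfere=None):
    """Retry read-modify-write until the versioned write succeeds."""
    attempts = 0
    while True:
        attempts += 1
        value, version = cell.get()
        if interfere is not None:
            interfere()        # simulate a concurrent writer
            interfere = None   # only interfere once
        if cell.set(value + delta, version):
            return attempts


cell = VersionedCell()
add(cell, 2)
# A competing writer adds 10 between our read and our write,
# so the first attempt is rejected and the update is retried.
attempts = add(cell, -1, interfere=lambda: add(cell, 10))
```

No update is lost: both the competing +10 and the retried -1 end up in the final value.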

Barrier

barrier = zk.Barrier("/barrier")
barrier.create()
barrier.wait()
# the master releases the barrier with
barrier.remove()

Partitioner

from kazoo.client import KazooClient

client = KazooClient()
client.start()

qp = client.SetPartitioner(
    path='/work_queues', set=('queue-1', 'queue-2', 'queue-3'))

while 1:
    if qp.failed:
        raise Exception("Lost or unable to acquire partition")
    elif qp.release:
        qp.release_set()
    elif qp.acquired:
        for partition in qp:
            # Do something with each partition
            pass
    elif qp.allocating:
        qp.wait_for_acquire()

Party

party1 = zk.Party("/party1", "my-identifier")
party2 = zk.Party("/party2", "my-identifier")
party1.join()
"my-identifier" in party1
"my-identifier" not in party2

Queue

queue = zk.LockingQueue("/queue")
for task in tasks:
    queue.put(task.encode('utf8'))
task = queue.get()
# call queue.consume() once the task has been processed

Watch: watch znode continuously

@zk.DataWatch('/last_scanned_card_key')
def my_func(data, stat, event):
    print("Data is %s" % data)
    print("Version is %s" % stat.version)
    print("Event is %s" % event)

References

• Flavio Junqueira, Benjamin Reed, ZooKeeper: Distributed Process Coordination, O'Reilly Media, Inc., November 25, 2013

• Zookeeper website, http://zookeeper.apache.org/