Interacting with Mesos, 2 choices
Python API:
- Not compatible with Python 3
- Easy to implement
- Bindings over C API
HTTP API:
- HTTP calls with persistent connection and streaming
- Recent
- Language independent
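To illustrate the "persistent connection and streaming" point: the Mesos scheduler HTTP API documentation describes the event stream as RecordIO framed, meaning each event is sent as its length in bytes (ASCII digits), a newline, then the payload. The sketch below decodes such a stream from any binary file-like object. The names `iter_records` and `frame` and the two sample events are illustrative only, not part of Mesos, and a real client would read from the HTTP response instead of a `BytesIO`.

```python
import io
import json


def iter_records(stream):
    """Yield each RecordIO payload (bytes) read from a binary stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream: the connection was closed
        length = int(header.strip())
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("stream ended in the middle of a record")
        yield payload


def frame(event):
    """Encode one event the way the stream is assumed to frame it."""
    body = json.dumps(event).encode("utf-8")
    return str(len(body)).encode("ascii") + b"\n" + body


# Two fake events, concatenated as they would arrive on the connection
raw = frame({"type": "HEARTBEAT"}) + frame({"type": "OFFERS"})
events = [json.loads(record) for record in iter_records(io.BytesIO(raw))]
event_types = [event["type"] for event in events]
```

The length prefix is what lets a client split a never-ending response body into separate messages without waiting for the connection to close.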
Workflow
Register => Listen for offer => accept/decline offer => listen for job status
Messages use Protobuf [0], HTTP interface also supports JSON.
See Mesos protobuf definition [1] to read or create messages.
[0] https://developers.google.com/protocol-buffers/
[1] https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto
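Since the HTTP interface also accepts JSON, a registration message can be written as a plain dictionary. The sketch below builds a SUBSCRIBE call following the structure described in the Mesos scheduler HTTP API documentation, using `user`, `name` and `failover_timeout` fields from FrameworkInfo in mesos.proto. The helper name `build_subscribe_call` and the sample values are assumptions for illustration, and nothing is sent over the network here.

```python
import json


def build_subscribe_call(name, user="", failover_timeout=3600 * 24 * 7,
                         framework_id=None):
    """Return the JSON body of a SUBSCRIBE call (illustrative helper)."""
    framework_info = {
        "user": user,
        "name": name,
        "failover_timeout": failover_timeout,
    }
    call = {"type": "SUBSCRIBE", "subscribe": {"framework_info": framework_info}}
    if framework_id is not None:
        # Re-subscribing with a known ID: assumed to be repeated in both places
        framework_info["id"] = {"value": framework_id}
        call["framework_id"] = {"value": framework_id}
    return json.dumps(call)


body = build_subscribe_call("Example Mesos framework")
decoded = json.loads(body)
```

Such a body would typically be POSTed to the master's scheduler endpoint with a JSON content type; check the endpoint and headers against the documentation of the Mesos version in use.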
Register
import mesos.interface
import mesos.native
from mesos.interface import mesos_pb2

framework = mesos_pb2.FrameworkInfo()
# mesos_pb2.XXX() read/use/write protobuf Mesos objects
framework.user = ""  # Have Mesos fill in the current user.
framework.name = "Example Mesos framework"
framework.failover_timeout = 3600 * 24 * 7  # 1 week
# Optionally, restart from a previous run
mesos_framework_id = mesos_pb2.FrameworkID()
mesos_framework_id.value = XYZ  # placeholder: ID received during a previous run
framework.id.MergeFrom(mesos_framework_id)
framework.principal = "godocker-mesos-framework"

# The executor must be defined before it is passed to the scheduler
executor = mesos_pb2.ExecutorInfo()
executor.executor_id.value = "sample"
executor.name = "Example executor"

# We will create our scheduler class MesosScheduler in next slide
mesosScheduler = MesosScheduler(1, executor)
# Let's declare a framework, with a scheduler to manage offers
driver = mesos.native.MesosSchedulerDriver(
    mesosScheduler,
    framework,
    'zk://127.0.0.1:2881')
driver.start()
When scheduler ends...
When the scheduler stops, Mesos kills any remaining tasks once the "failover_timeout" delay has elapsed.
Setting the FrameworkID lets the framework restart and keep the same context: Mesos keeps the tasks and sends their status messages to the restarted framework.
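Restarting with the same FrameworkID implies that the ID survives the scheduler process. One possible approach, sketched here, is to write the ID to a small file when it is first received and read it back at startup. The helper names, the file name and the sample ID are hypothetical; this is plain standard-library Python, independent of the Mesos bindings.

```python
import os
import tempfile


def load_framework_id(path):
    """Return the framework ID saved by a previous run, or None."""
    try:
        with open(path) as handle:
            return handle.read().strip() or None
    except FileNotFoundError:
        return None


def save_framework_id(path, framework_id):
    """Write to a temporary file, then rename, to avoid a half-written ID."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(framework_id)
    os.replace(tmp_path, path)


state_dir = tempfile.mkdtemp()
id_file = os.path.join(state_dir, "framework_id")

first_run = load_framework_id(id_file)  # nothing saved yet
save_framework_id(id_file, "example-framework-id")  # e.g. from registered()
second_run = load_framework_id(id_file)
```

On restart, a non-empty result would be assigned to the FrameworkID before the driver is created. The saved ID is only useful while the failover timeout has not expired.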
Scheduler skeleton
class MesosScheduler(mesos.interface.Scheduler):

    def registered(self, driver, frameworkId, masterInfo):
        logging.info("Registered with framework ID %s" % frameworkId.value)
        self.frameworkId = frameworkId.value

    def resourceOffers(self, driver, offers):
        '''
        Receive offers, an offer defines a node
        with available resources (cpu, mem, etc.)
        '''
        for offer in offers:
            logging.debug('Mesos:Offer:Decline')
            driver.declineOffer(offer.id)

    def statusUpdate(self, driver, update):
        '''
        Receive status info from submitted tasks
        (switch to running, failure of node, etc.)
        '''
        logging.debug("Task %s is in state %s" %
                      (update.task_id.value,
                       mesos_pb2.TaskState.Name(update.state)))

    def frameworkMessage(self, driver, executorId, slaveId, message):
        logging.debug("Received framework message")
        # usually, nothing to do here
Messages are asynchronous
Status updates and offers are asynchronous callbacks. The scheduler runs in a separate thread.
You are never the initiator of the requests (except registration), but you will receive callback messages when something changes on the Mesos side (job switches to running, node failure, ...).
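Because callbacks arrive on another thread, one common pattern is to keep them short and hand the received data to the rest of the application through a thread-safe queue. The sketch below simulates this with a stand-in class and a plain thread playing the role of the driver. `StatusRecorder` and the string "updates" are illustrative assumptions, not Mesos objects.

```python
import queue
import threading


class StatusRecorder:
    """Stand-in for a scheduler: the callback only enqueues what it gets."""

    def __init__(self):
        self.events = queue.Queue()

    def statusUpdate(self, driver, update):
        # Called from the driver thread: return quickly, process elsewhere
        self.events.put(update)


def deliver(recorder, updates):
    """Simulate the driver thread invoking the callback."""
    for update in updates:
        recorder.statusUpdate(None, update)


recorder = StatusRecorder()
driver_thread = threading.Thread(
    target=deliver, args=(recorder, ["RUNNING", "FINISHED"]))
driver_thread.start()
driver_thread.join()

# The main thread consumes the events at its own pace
received = [recorder.events.get(timeout=1) for _ in range(2)]
```

The queue preserves delivery order and removes the need for explicit locks around shared state.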
Submit a task
for offer in offers:
    # Get available cpu and mem for this offer
    offerCpus = 0
    offerMem = 0
    for resource in offer.resources:
        if resource.name == "cpus":
            offerCpus += resource.scalar.value
        elif resource.name == "mem":
            offerMem += resource.scalar.value
        # We could check for other resources here
    logging.debug("Mesos:Received offer %s with cpus: %s and mem: %s"
                  % (offer.id.value, offerCpus, offerMem))
    # We should check that offer has enough resources
    sample_task = create_a_sample_task(offer)
    array_of_task = [sample_task]
    driver.launchTasks(offer.id, array_of_task)
Mesos supports any custom resource definition on nodes (gpu, slots, disk, ...), using scalar or range values.
When a task is launched, the requested resources are removed from the available resources of the selected node.
Subsequent offers will not propose those resources again until the task is over (or killed).
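The resource check that the slide code leaves as a comment could look like the sketch below. It sums the scalar resources of an offer by name and compares them with what a task needs. The helper names are hypothetical, and `SimpleNamespace` objects stand in for the protobuf offer so the example runs without the Mesos bindings; only the `name` and `scalar.value` attributes used in the slide code are assumed.

```python
from types import SimpleNamespace


def scalar_total(resources, name):
    """Sum the scalar values of all resources with the given name."""
    return sum(r.scalar.value for r in resources if r.name == name)


def offer_fits(offer, cpus, mem):
    """Return True when the offer has at least the requested cpus and mem."""
    return (scalar_total(offer.resources, "cpus") >= cpus
            and scalar_total(offer.resources, "mem") >= mem)


def fake_scalar(name, value):
    """Build a stand-in for a scalar protobuf resource (example only)."""
    return SimpleNamespace(name=name, scalar=SimpleNamespace(value=value))


offer = SimpleNamespace(
    resources=[fake_scalar("cpus", 4), fake_scalar("mem", 2048)])

fits_small = offer_fits(offer, cpus=2, mem=1024)
fits_large = offer_fits(offer, cpus=2, mem=3000)
```

An offer that does not fit would be declined, so that Mesos can propose its resources to other frameworks.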
Define a task
def create_a_sample_task(offer):
    task = mesos_pb2.TaskInfo()
    # The container part (native or docker)
    container = mesos_pb2.ContainerInfo()
    container.type = 1  # mesos_pb2.ContainerInfo.Type.DOCKER
    # Let's add a volume
    volume = container.volumes.add()
    volume.container_path = "/tmp/test"
    volume.host_path = "/tmp/incontainer"
    volume.mode = 1  # mesos_pb2.Volume.Mode.RW
    # The command to execute, if not using entrypoint
    command = mesos_pb2.CommandInfo()
    command.value = "echo hello world"
    task.command.MergeFrom(command)
    # Unique identifier, chosen by the framework (required field)
    task.task_id.value = XYZ_UNIQUE_IDENTIFIER
    # The slave where the task is executed
    task.slave_id.value = offer.slave_id.value
    task.name = "my_sample_task"
    # The resources/requirements
    # Resources have names: cpus, mem and ports are available
    # by default; one can define custom ones per slave node
    # and get them by their name here
    cpus = task.resources.add()
    cpus.name = "cpus"
    cpus.type = mesos_pb2.Value.SCALAR
    cpus.scalar.value = 2
    mem = task.resources.add()
    mem.name = "mem"
    mem.type = mesos_pb2.Value.SCALAR
    mem.scalar.value = 3000  # in MB, about 3 GB
Define a task (next)
    # Now the Docker part
    docker = mesos_pb2.ContainerInfo.DockerInfo()
    docker.image = "debian:latest"
    docker.network = 2  # mesos_pb2.ContainerInfo.DockerInfo.Network.BRIDGE
    docker.force_pull_image = True
    # Let's map some ports, ports are resources like cpu and mem
    # We will map container port 80 to an available host port
    # Let's pick the first available port for this offer; for simplicity
    # we skip the checks here and suppose there is at least one port
    offer_port = None
    for resource in offer.resources:
        if resource.name == "ports":
            for mesos_range in resource.ranges.range:
                offer_port = mesos_range.begin
                break
    # We map container port 80 to offer_port on the host
    docker_port = docker.port_mappings.add()
    docker_port.host_port = offer_port
    docker_port.container_port = 80
    # Copy the Docker settings into the container only now:
    # MergeFrom copies the current values, so the port mapping
    # must be added before this call
    container.docker.MergeFrom(docker)
    # We tell mesos that we reserve this port
    # Mesos will remove it from next offers until task completion
    mesos_ports = task.resources.add()
    mesos_ports.name = "ports"
    mesos_ports.type = mesos_pb2.Value.RANGES
    port_range = mesos_ports.ranges.range.add()
    port_range.begin = offer_port
    port_range.end = offer_port
    task.container.MergeFrom(container)
    return task
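The slide code assumes that the offer contains at least one port. A slightly more defensive port selection might look like the sketch below, which returns None when no usable port is offered so the caller can decline the offer. The helper names are hypothetical and `SimpleNamespace` objects stand in for the protobuf offer; only the `name`, `ranges.range`, `begin` and `end` attributes used above are assumed.

```python
from types import SimpleNamespace


def first_offered_port(offer):
    """Return the first port of the first valid "ports" range, or None."""
    for resource in offer.resources:
        if resource.name != "ports":
            continue
        for port_range in resource.ranges.range:
            if port_range.begin <= port_range.end:
                return port_range.begin
    return None


def fake_offer(*ranges):
    """Build a stand-in offer with cpus and one "ports" resource."""
    port_ranges = [SimpleNamespace(begin=b, end=e) for b, e in ranges]
    ports = SimpleNamespace(
        name="ports", ranges=SimpleNamespace(range=port_ranges))
    cpus = SimpleNamespace(name="cpus", scalar=SimpleNamespace(value=4))
    return SimpleNamespace(resources=[cpus, ports])


port = first_offered_port(fake_offer((31000, 31005), (31100, 31100)))
no_port = first_offered_port(SimpleNamespace(resources=[]))
```

Returning from the function on the first match also avoids relying on a `break` that only leaves the inner loop.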
Task status
def statusUpdate(self, driver, update):
    '''
    Receive status info from submitted tasks
    (switch to running, failure of node, etc.)
    '''
    logging.debug("Task %s is in state %s" %
                  (update.task_id.value, mesos_pb2.TaskState.Name(update.state)))
    if int(update.state) == 1:
        # Switched to RUNNING
        container_info = json.loads(update.data)
    if int(update.state) in [2, 3, 4, 5, 7]:
        # Over or failure
        logging.error("Task is over or failed")
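The numeric states tested above correspond to the TaskState enum in mesos.proto. The lookup table below makes the mapping explicit for states 0 to 7; it is a sketch for readability, and newer Mesos versions define additional states not listed here. With the Python bindings, the generated constants (for example `mesos_pb2.TASK_RUNNING`) would normally be preferable to literal numbers. The names `TASK_STATES`, `TERMINAL_STATES` and `describe` are illustrative.

```python
# TaskState values as listed in mesos.proto (states 0 to 7 only)
TASK_STATES = {
    0: "TASK_STARTING",
    1: "TASK_RUNNING",
    2: "TASK_FINISHED",
    3: "TASK_FAILED",
    4: "TASK_KILLED",
    5: "TASK_LOST",
    6: "TASK_STAGING",
    7: "TASK_ERROR",
}

# States after which the task will not run any more
TERMINAL_STATES = {2, 3, 4, 5, 7}


def describe(state):
    """Return (state name, is_terminal) for a numeric task state."""
    number = int(state)
    name = TASK_STATES.get(number, "UNKNOWN(%d)" % number)
    return name, number in TERMINAL_STATES


running = describe(1)
lost = describe(5)
```

Note that FINISHED is grouped with the failure states only because the task is over in both cases; a real scheduler would usually log it at a lower severity.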
Want to kill a task?
def resourceOffers(self, driver, offers):
    ...
    task_id = mesos_pb2.TaskID()
    task_id.value = my_unique_task_id
    driver.killTask(task_id)