39
The simplicity of implementing a job scheduler with Circuit DEVIEW 2015 Seoul, South Korea Petar Maymounkov

[231] the simplicity of cluster apps with circuit

Embed Size (px)

Citation preview

The simplicity of implementing a job scheduler with Circuit

DEVIEW 2015Seoul, South Korea

Petar Maymounkov

Circuit: Light-weight cluster OS

● Real-time API to see and control:○ Hosts, processes, containers

● System never fails○ API endpoint on every host

○ Robust master-less, peer-to-peer membership

protocol

API overview

API: Model of cluster

Proc 1

Proc 2 Proc 3

Proc 4

Proc 5

Proc 7

Host 1 Host 2 Host 3 Host 4

Proc 8

Proc 6

API: Abstraction

Proc 1

Proc 2 Proc 3

Proc 4

Proc 5

Proc 7

Host 1 Host 2 Host 3 Host 4

Proc 8

Proc 6

Host 1Host 2 etc.

Cluster DOM operations:● Traversal● Manipulation● Notifications

API: Entrypoints

● Command-line tool● Go client package (more later)

API: Command-line example Host A

mysql

hdfs

Host B

apachephp

memcache

fs

db

cache

http

X85ec139ad

X68a30645c7

$ circuit ls -l /...server /X85ec139adserver /X85ec139ad/fsproc /X85ec139ad/dbserver /X68a30645cproc /X68a30645c/cachedocker /X68a30645c/http

API: Command-line example Host A

mysql

hdfs

Host B

apachephp

memcache

fs

db

cache

http

X85ec139ad

X68a30645c7

$ circuit mkproc /X85ec/test <<EOF{“Path”:”/sbin/lscpu”}EOF

$ circuit stdout /X85ec/testArchitecture: x86_64etc.

$ circuit wait /X85ec/db

$ circuit stderr /X68a3etc.

System architecture

Host C

System architecture: Boot individual hosts

Host A Host B

circuitserver

circuitserver

circuitserver

$ circuit start

Host C

System architecture: Discovery

Host A Host B

circuitserver

circuitserver

circuitserver

$ circuit start --discover=228.8.8.8:8822

Host C

System architecture: API endpoints

Host A Host B

circuitserver

circuitserver

circuitserver

API API API

$ circuit startcircuit://127.0.0.1:7822/17880/Q413a079318a275ca

Eng Notebook

System architecture: Client connections

Host A Host B

circuitserver

circuitserver

circuitclient

circuitclient

$ circuit startcircuit://127.0.0.1:7822/17880/Q413a079318a275ca

App-to-circuit connection never fails, because of

colocation.

Go API overview

Go: Entrypoint

import . “github.com/gocircuit/circuit/client”

func Dial(circuitAddr string,crypto []byte,

) *Client

func DialDiscover(udpMulticast string, crypto []byte,

) *Client

Go: Error handling

● Physical errors are panics● Application errors are returned values

func Dial(circuitAddr string,crypto []byte,

) *Client

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

/X114cd3eba3423596

/Xfea8b5b798f2fc09

/X45af9c2b9c9ab78b

/X114cd3eba3423596/db

/X114cd3eba3423596/jobs

/X45af9c2b9c9ab78b/nginx

/X45af9c2b9c9ab78b/nodejs

/X45af9c2b9c9ab78b/dns

/

Container

Process

Container

DNS

Server

Server

Server

Go: Anchor hierarchyAn anchor is a node in a unified namespace

An anchor is like a “directory”. It can have children anchors

An anchor is like a “variable”. It can be empty or hold an object (server, process, container, etc.)

Client connection is the root anchor

Anchor

Anchor

Anchor

Anchor

/

Server

Server

Server

Go: Anchor API

type Anchor interface{Walk(path []string) AnchorView() map[string]AnchorMakeProc(Spec) (Proc, error)MakeDocker(Spec) (Container, error)Get() interface{}Scrub()

}

type Proc interface{Peek() (State, error)Wait() error…

}

/Xfea8b5b798f2fc09

/X45af9c2b9c9ab78b

/X114cd3eba3423596

Go: Anchor residence

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

Anchor

Residing on host:/X114cd3eba3423596

/X114cd3eba3423596

/X45af9c2b9c9ab78b

/X114cd3eba3423596/db

/X114cd3eba3423596/jobs

/X45af9c2b9c9ab78b/nginx

/X45af9c2b9c9ab78b/nodejs

/X45af9c2b9c9ab78b/dns

/

Container

Process

Container

DNS

Server

Server

Residing on host:/X45af9c2b9c9ab78b

Implementing a job scheduler

Scheduler: Design spec

● User specifies:○ Maximum number of jobs per host○ Address of circuit server to connect to

● HTTP API server:○ Add job by name and command spec○ Show status

Scheduler: Service architecture

Host CHost B

circuitserver

circuitserver

circuitserver

sched

Host A

HTTP API

Job1Job2 Job3

Scheduler: State and logic

State

ControllerAdd job

Show status

Job exited

Host joined/left

USER

CIRCUIT

scheduleStart job

H1

J1

J2

J3

H3

J4

J7

H5

J8

Pend

J5

J6

Live hosts andthe jobs they are

running

Jobs waiting to

be run

Scheduler: Mainimport “github.com/gocircuit/circuit/client”

var flagAddr = flag.String(“addr”, “”, “Circuit to connect to”)var flagMaxJobs = flag.Int(“maxjob”, 2, “Max jobs per host”)

func main() {flag.Parse()defer func() {

if r := recover(); r != nil {log.Fatalf("Could not connect to circuit: %v", r)

}}()conn := client.Dial(*flagAddr, nil)controller := NewController(conn, *flagMaxJobs)… // Link controller methods to HTTP requests handlers.log.Fatal(http.ListenAndServe(":8080", nil))

}

Scheduler: Controller statetype Controller struct {

client *client.ClientmaxJobsPerHost intsync.Mutex // Protects state fields.jobName map[string]struct{}worker map[string]*workerpending []*job

}

type worker struct {name stringjob []*job

}

type job struct {name stringcmd client.Cmd

}

H1

J1

J2

J3

H3

J4

J7

H5

J8

Pend

J5

J6

Live hosts andthe jobs they are

running

Jobs waiting to

be run

func NewController(conn *Client, maxjob int) *Controller {c := &Controller{…} // Initialize fields.// Subscribe to notifications of hosts joining and leaving.c.subscribeToJoin()c.subscribeToLeave()return c

}

Scheduler: Start controller

func (c *Controller) subscribeToJoin() {a := c.client.Walk([]string{c.client.ServerID(), "controller", "join"})a.Scrub()onJoin, err := a.MakeOnJoin()if err != nil {

log.Fatalf("Another controller running on this circuit server.")}go func() {

for {x, ok := onJoin.Consume()if !ok {

log.Fatal("Circuit disappeared.")}c.workerJoined(x.(string)) // Update state.

}}()

}

Scheduler: Subscribe to host join/leave events

Place subscription on server the scheduler is connected to.

Pick a namespace for scheduler service.

func (c *Controller) workerJoined(name string) {c.Lock()defer c.Unlock()… // Update state structure. Add worker to map with no jobs.go c.schedule()

}

func (c *Controller) workerLeft(name string) {c.Lock()defer c.Unlock()… // Update state structure. Remove worker from map.go c.schedule()

}

Scheduler: Handle host join/leave

Scheduler: So far ...

State

ControllerAdd job

Show status

Job exited

Host joined/left

USER

CIRCUIT

scheduleStart job

func (c *Controller) AddJob(name string, cmd Cmd) {c.Lock()defer c.Unlock()… // Update state structure. Add job to pending queue.go c.schedule()

}

func (c *Controller) Status() string {c.Lock()defer c.Unlock()… // Print out state to string.

}

Scheduler: User requests

Scheduler: So far ...

State

ControllerAdd job

Show status

Job exited

Host joined/left

USER

CIRCUIT

scheduleStart job

Scheduler: Controller statefunc (c *Controller) schedule() {

c.Lock()defer c.Unlock()// Compute job-to-worker matchingvar match []*match = …for _, m := range match {

… // Mark job as running in workergo c.runJob(m.job, m.worker)

}}

type match struct {*job // Job from pending*worker // Worker below capacity

}

H1

J1

J2

H3

J4

H5

J8

Pend

Live hosts andthe jobs they are

running

Jobs waiting to

be run

Ja

Jb

J9 J6J3 J7

Scheduler: Run a jobfunc (c *Controller) runJob(job *job, worker *worker) {

defer func() {if r := recover(); r != nil { // Worker died before job completed.

… // Put job back on pending queue.}

}()jobAnchor := c.client.Walk([]string{worker.name, "job", job.name})proc, err := jobAnchor.MakeProc(job.cmd)… // Handle error, another process already running.proc.Stdin().Close()go func() { // Drain Stdout. Do the same for Stderr.

defer func() { recover() }() // In case of worker failure.io.Copy(drain{}, proc.Stdout())

}()_, err = proc.Wait()… // Mark complete or put back on pending queue.

}

Place process on desired host. Pick namespace for jobs.

Scheduler: Demo.

State

ControllerAdd job

Show status

Job exited

Host joined/left

USER

CIRCUIT

scheduleStart job

Universal cluster binaries

Recursive processes: Execution tree

Host CHost B

master

Host A

Job1

sched1 sched2

Job2 Job3 Job4 Job5

useruser

master

sched1 sched2

Job1 Job2 Job4 Job3 Job5

Universal distribution

master

sched

job1

job2

One Go binarythat takes on

different roles

master

sched

job1

job2

circuitserver

Package binary & circuit in an executable container image

Host A Host B

Host C Host D

Customer runs container alongside other

infrastructure with isolation

mysql

The vision forward

● Easy (only?) way to share any cloud system○ “Ship with Circuit included” vs “Hadoop required”?○ Circuit binary + Your binary = Arbitrary complex cloud○ Like Erlang/OTP but language agnostic

Thank you.

Circuit website:http://gocircuit.org

Source for Job Scheduler demo:http://github.com/gocircuit/talks/deview2015

Twitter:@gocircuit