Pacemaker: OpenStack's Pid 1

Preview:

Citation preview

David Vossel <dvossel@redhat.com>

PACEMAKER

OpenStack SummitMay 21, 2015David Vossel <dvossel@redhat.com>

OpenStack's PID 1

David Vossel <dvossel@redhat.com>

Story Time

David Vossel <dvossel@redhat.com>

And how Pacemaker Saved the DayThe Future of HA

David Vossel <dvossel@redhat.com>

There once was a database

David Vossel <dvossel@redhat.com>

There once was a database

DB

David Vossel <dvossel@redhat.com>

Not like the other databases

There once was a database

DB

David Vossel <dvossel@redhat.com>

DB

Not like the other databases

There once was a database

A distributed self replicating database

David Vossel <dvossel@redhat.com>

Not like the other databases

There once was a database

A distributed self replicating database

Active/Active Replicated Database

DB DB DB DB

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Everyone: HOORAY!

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Everyone: Load Balance?

Clients

????????????????????

David Vossel <dvossel@redhat.com>

HAProxy enters the scene.

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

proxy

HA Proxy: No Problem

Clients

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

proxy

HA Proxy: No Problem

Clients

Everyone:Hooray!

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

proxy

Clients

* Everyone: ummmmm

David Vossel <dvossel@redhat.com>

Everyone: What if a node dies?

Active/Active Replicated Database

DB DB DB DB

proxy

Clients

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

proxy

HA Proxy: No Problem

Everyone: What if a node dies?

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

proxy

* Everyone: ummmmm

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

proxy

Everyone: But... What if proxy dies?

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

proxy

HA Proxy:...

Everyone: But... What if proxy dies?

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

proxy

Everyone:What?????

HA Proxy:...

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

proxy

Everyone:Uhhh???!!!!??

HA Proxy:...

????????????????????

David Vossel <dvossel@redhat.com>

Active/Active Replicated Database

DB DB DB DB

Clients

Everyone:Hello????

HA Proxy:...

????????????????????

David Vossel <dvossel@redhat.com>

Clients

Everyone:Anyone?

????????????????????

David Vossel <dvossel@redhat.com>

Everyone:I knew this cloud thing wouldn't work...

David Vossel <dvossel@redhat.com>

KeepaliveD: Wait! Guys... I've got an idea!!!

Everyone:I knew this cloud thing wouldn't work...

David Vossel <dvossel@redhat.com>

KeepaliveD enters the scene.

David Vossel <dvossel@redhat.com>

keepaliveD

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Clients

Everyone:Whoa, Proxy Failover!

David Vossel <dvossel@redhat.com>

keepaliveDkeepaliveD

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Clients

Everyone:Awesome!!

David Vossel <dvossel@redhat.com>

keepaliveDkeepaliveD

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Clients

Everyone:Awesome!!

Keepalived:: I know right?!!

David Vossel <dvossel@redhat.com>

keepaliveD

Clients

Everyone: No Way!!!

Keepalived: I Rock!

keepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

David Vossel <dvossel@redhat.com>

keepaliveD

Clients

keepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Everyone:Wait... Hold up

David Vossel <dvossel@redhat.com>

keepaliveD

Clients

keepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Everyone: But.... What actually happens to the “other” nodes?

David Vossel <dvossel@redhat.com>

Peripeteia - per·i·pe·tei·a

a sudden reversal of fortune or change in circumstances, especially in reference to fictional narrative.

David Vossel <dvossel@redhat.com>

keepaliveDkeepaliveD keepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

ClientsKeepalived:What Other Nodes? Keepalived:What Other Nodes?

David Vossel <dvossel@redhat.com>

keepaliveDkeepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Clients

Clients: HEY?! How come the thing I just wrote isn't in the database?

David Vossel <dvossel@redhat.com>

keepaliveDkeepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Clients

Clients: HEY?! How come the thing I just wrote isn't in the database?

Clients:Yeah, what's going on?!

David Vossel <dvossel@redhat.com>

keepaliveDkeepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Clients

* Everyone: I've made a huge mistake....

David Vossel <dvossel@redhat.com>

The Missing Piece?

David Vossel <dvossel@redhat.com>

David Vossel <dvossel@redhat.com>

Pacemaker: System Level HA

● System level HA is holistic.

Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

David Vossel <dvossel@redhat.com>

Pacemaker

Pacemaker: System Level HA

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

● System level HA is holistic.

● Defines the policy of how to recover a set of applications

David Vossel <dvossel@redhat.com>

Pacemaker

Pacemaker: System Level HA

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

● System level HA is holistic.

● Defines the policy of how to recover a set of applications

● Enforces the policy to achieve system wide deterministic behavior.

David Vossel <dvossel@redhat.com>

● We don't question what happened to the other nodes... We know.

Pacemaker Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Back to the Story... How does Pacemaker Help?

David Vossel <dvossel@redhat.com>

● We don't question what happened to the other nodes... We know.

● And how do we know this?

Pacemaker Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Back to the Story... How does Pacemaker Help?

David Vossel <dvossel@redhat.com>

Introducing STONITH

David Vossel <dvossel@redhat.com>

Introducing STONITH

Shoot the Other Node in the Head

David Vossel <dvossel@redhat.com>

STONITH = Pacemaker's Fencing Daemon.

● Pacemaker knows the state of lost/misbehaving nodes

Pacemaker Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

David Vossel <dvossel@redhat.com>

STONITH = Pacemaker's Fencing Daemon.

● Pacemaker knows the state of lost/misbehaving nodes

● Because with fencing via STONITH... that state is dead.

Pacemaker Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

David Vossel <dvossel@redhat.com>

Quick Recap...

David Vossel <dvossel@redhat.com>

Without Pacemaker+STONITH

keepaliveDkeepaliveD keepaliveD

Active/Active Replicated DatabaseActive/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

ClientsKeepalived:I dunno. Who cares? Keepalived:I dunno. Who cares?

David Vossel <dvossel@redhat.com>

With Pacemaker+STONITH...

Pacemaker Pacemaker

Active/Active Replicated Database

DB DB DB DB

proxy proxy proxy proxy

VIP VIP VIP VIP

Pacemaker:They are dead because I killed them..

Clients

David Vossel <dvossel@redhat.com>

The Takeaway.

● Pacemaker and load balancing are NOT mutually exclusive.

David Vossel <dvossel@redhat.com>

The Takeaway.

● Pacemaker and load balancing are NOT mutually exclusive.

● Pacemaker and HAProxy are meant for one another.

David Vossel <dvossel@redhat.com>

The Takeaway.

● Pacemaker and load balancing are NOT mutually exclusive.

● Pacemaker and HAProxy are meant for one another.

HAProxy

Pacemaker

David Vossel <dvossel@redhat.com>

PacemakerThe Distributed PID 1

David Vossel <dvossel@redhat.com>

Modern PID 1's role.

● SystemD:

● Launch services parallel● yet observe strict ordering between dependent services.● Monitor/recover failed resources.

systemd

galerarabbitmqStart Order Unrelated dependencies

start in parallel.

Ordering is enforced. These services can start in parallel only after their dependencies start.Nova

redis

ceilometer

David Vossel <dvossel@redhat.com>

The Problem

● OpenStack services are not isolated to a local machine.

systemd

Galera ClusterForm Galera cluster

Then Start Nova.

Nova

NODE1 NODE2 NODE3 NODE4

systemd systemd systemd

David Vossel <dvossel@redhat.com>

The Problem

● OpenStack services are not isolated to a local machine.

● SystemD Can't coordinate this.

systemd

NODE1 NODE2 NODE3 NODE4

systemd systemd systemd

Galera ClusterForm Galera cluster

Then Start Nova.

Nova

David Vossel <dvossel@redhat.com>

The Fix.

● But Pacemaker can

● because Pacemaker is distributed.

Pacemaker

NODE1 NODE2 NODE3 NODE4

Galera ClusterForm Galera cluster

Then Start Nova.

Nova

David Vossel <dvossel@redhat.com>

The Fix.

● Pacemaker, just like systemd, can...

● Launch services parallel● yet observe strict ordering between dependent services.● Monitor/recover failed resources.

Pacemaker

galerarabbitmqStart Order Unrelated dependencies

start in parallel.

Ordering is enforced. These services can start in parallel only after their dependencies start.Nova

redis

ceilometer

David Vossel <dvossel@redhat.com>

The Fix.● Except pacemaker can coordinate this across any number of nodes.

Pacemaker

NODE1 NODE2 NODE3 NODE4

RabbitMQ Cluster

NODE5 NODE6 NODE7 NODE8 NODE9

Galera Cluster Redis Cluster

Ceilometer ClusterNova Cluster

David Vossel <dvossel@redhat.com>

The Fix.● Except pacemaker can coordinate this across any number of nodes.

● With any number of resources

Pacemaker

NODE1 NODE2 NODE3 NODE4 NODE5 NODE6 NODE7 NODE8 NODE9

HAProxy HAProxy HAProxy

VIP-Galera VIP-RedisVIP-Rabbit VIP-Nova

HAProxy

VIP-keys

HAProxy

VIP-cinder VIP-celio

Keystone ClusterCeilometer Cluster

Nova Cluster

Cinder Cluster

VIP-glance

HAProxy

VIP-neutron

HAProxy

Glance Cluster

Horizon Cluster

HAProxy HAProxy

RabbitMQ ClusterGalera Cluster Redis Cluster

Neutron Cluster

Swift Cluster

Heat Cluster

David Vossel <dvossel@redhat.com>

Resource Constraints

● Pacemaker has unique capabilities for managing resources and modeling complex resource dependencies.

● Examples:

● Start resource X then start resource Y● Colocate resource X with resource Y● Resource X prefers node A over node B● Resource X prefers node A between 8am-5pm

David Vossel <dvossel@redhat.com>

ArchitectureHA OpenStack Controller Nodes

David Vossel <dvossel@redhat.com>

How it works... in one sentence

Pacemaker managing Virtual IPs + Load Balancers + Controller services to maximize the availability of OpenStack APIs

David Vossel <dvossel@redhat.com>

Pacemaker

HA-proxy

VIP

Service Service Service

NODE1 NODE2 NODE3

Each distributed service has a front end Virtual IP tied to a Load Balancer.

How it works... in one sentence

Pacemaker managing Virtual IPs + Load Balancers + Controller services to maximize the availability of OpenStack APIs

HA-proxyHA-proxy

David Vossel <dvossel@redhat.com>

Each distributed service has a front end Virtual IP tied to a Load Balancer.

Each Load Balancer routes VIP traffic to its respective Active/Active service instances

How it works... in one sentence

Pacemaker managing Virtual IPs + Load Balancers + Controller services to maximize the availability of OpenStack APIs

Pacemaker

HA-proxy

VIP

Service Service Service

NODE1 NODE2 NODE3

HA-proxyHA-proxy

David Vossel <dvossel@redhat.com>

Pacemaker

HA-proxy

VIP

Service Service Service

NODE1 NODE2 NODE3

Each distributed service has a front end Virtual IP tied to a Load Balancer.

Each Load Balancer routes VIP traffic to its respective Active/Active service instances

How it works... in one sentence

HA-proxy

Pacemaker managing Virtual IPs + Load Balancers + Controller services to maximize the availability of OpenStack APIs

HA-proxy

David Vossel <dvossel@redhat.com>

Each distributed service has a front end Virtual IP tied to a Load Balancer.

Each Load Balancer routes VIP traffic to its respective Active/Active service instances

Active/Active OpenStack Controller service... like Nova, Glance, Keystone, Galera, Redis, Rabbitmq, ect...

How it works... in one sentence

Pacemaker managing Virtual IPs + Load Balancers + Controller services to maximize the availability of OpenStack APIs

Pacemaker

HA-proxy

VIP

Service Service Service

NODE1 NODE2 NODE3

HA-proxyHA-proxy

David Vossel <dvossel@redhat.com>

PROXY CLONE

SERVICE CLONE

Pacemaker

HA-proxy

VIP

Service Service Service

NODE1 NODE2 NODE3

Scaling with Resource Clones

● Pacemaker's ability to clone services makes scaling trivial.

● Want more instance?

HA-proxy HA-proxy

David Vossel <dvossel@redhat.com>

PROXY CLONE

SERVICE CLONE

Pacemaker

HA-proxy

VIP

Service Service Service

NODE1 NODE2 NODE3

Scaling with Resource Clones

NODE4 NODE5 NODE6

Service Service Service

● Increment the number of clone instances pacemaker is allowed to run for a service to scale service instances.

HA-proxy HA-proxy HA-proxy HA-proxy HA-proxy

David Vossel <dvossel@redhat.com>

Pacemaker

HA-proxy

Galera VIP

Galera Galera Galera

NODE1 NODE2 NODE3

How it works, continued... ● Services interact with one another using each service's Virtual IP● Example: Both Glance and Nova need access to Galera... Galera is

accessed via the front end Virtual IP and those requests are distributed to the backend galera cluster.

Glance NovaDB Request DB Request

HA-proxyHA-proxy

David Vossel <dvossel@redhat.com>

Deployment StrategiesHA OpenStack Controller Nodes

David Vossel <dvossel@redhat.com>

Collapsed Architecture

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

● All controller nodes run the same services.

● VIP+load balancers distribute access to APIs across cloned nodes.

David Vossel <dvossel@redhat.com>

Collapsed Architecture

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

NOVA VIP

Clients

David Vossel <dvossel@redhat.com>

Collapsed Architecture

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Glance VIP

Clients

David Vossel <dvossel@redhat.com>

Collapsed Architecture Scaling

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

NODE4

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE5

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE6

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

David Vossel <dvossel@redhat.com>

Startup Ordering Revisited.

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Bootstrap and start Start Galera Cluster across multiple nodes.

David Vossel <dvossel@redhat.com>

Startup Ordering Revisited.

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Bootstrap and start Start Galera Cluster across multiple nodes.

Then Start Nova instances, which depend on an active Galera cluster.

David Vossel <dvossel@redhat.com>

Stop ordering Ordering Revisited.

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

If we're shutting down galera cluster

David Vossel <dvossel@redhat.com>

Stop ordering Ordering Revisited.

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Stop everything that depends on Galera. Like Nova...

David Vossel <dvossel@redhat.com>

Stop ordering Ordering Revisited.

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Then Shutdown Galera cluster.

David Vossel <dvossel@redhat.com>

Complex Startup Ordering

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis Slave Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis Slave Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis SlaveNeutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Start Redis Clone.

David Vossel <dvossel@redhat.com>

Complex Startup Ordering

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis Slave Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis Slave Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis MasterNeutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Promote one instance of Redisclone to be Master Instance.

David Vossel <dvossel@redhat.com>

Complex Startup Ordering

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis Slave Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis Slave Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera Redis MasterNeutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

Then start Ceilometer cluster whichdepends on Redis

David Vossel <dvossel@redhat.com>

Segregated Architecture

Pacemaker

● Each service runs on its own dedicated hardware.

● Scales much further

● Add capacity where capacity makes sense.

● Requires lots and lots of nodes.

David Vossel <dvossel@redhat.com>

Segregated Architecture

Pacemaker

● Take a closer look.

David Vossel <dvossel@redhat.com>

Segregated Architecture

PacemakerNODE1

Galera

NODE2

Galera

NODE3

Galera

● Possible to have have an entire set of nodes just for Load balancing

● Dedicated cluster for galera

NODE4

Proxy

NODE5

Proxy

NODE6

Proxy

NODE7

Nova

NODE8

Nova

NODE9

Nova VIP VIP VIP

More services that way.

Glance

NODE9

VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP

David Vossel <dvossel@redhat.com>

Mixed Architecture

● Mixture of collapsed and segregated.

● Break some components into separate hardware

● Most of cluster is collapsed

PacemakerNODE1

Galera

NODE2

Galera

NODE3

Galera

NODE4

Proxy

NODE5

Proxy

NODE6

Proxy

NODE7 NODE8 NODE9

VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP VIP

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Redis

Neutron N

Swift

Heat

Neutron S

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Redis

Neutron N

Swift

Heat

Neutron S

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Redis

Neutron N

Swift

Heat

Neutron S

David Vossel <dvossel@redhat.com>

Pacemaker Advantages● Automates bootstrap of services that previously required hand holding.

● Example: Automate Galera bootstrap

1. Find out which galera instance is most up-to-date

2. Bootstrap most current galera instance first.

3. Then sync other galera instances

David Vossel <dvossel@redhat.com>

Pacemaker Advantages. Continued... ● start/stop distributed services in a graceful ordered manner.

● Gracefully and controller node into standby for maintenance.

● Dynamically grow capacity by adding more pacemaker nodes

● Centralized view of distributed service state.

David Vossel <dvossel@redhat.com>

ArchitectureHA OpenStack Compute Nodes

David Vossel <dvossel@redhat.com>

HA for Cattle

● Both Pets and Cattle need High Availability.● Recognize the techniques used for each are different.

David Vossel <dvossel@redhat.com>

Pacemaker for Pets and small Herds.

● No limits in the number of resources.

● Pacemaker supports “n-node” clusters.

● Cluster are limited by the Corosync messaging layer to 16 nodes.

Pacemaker

David Vossel <dvossel@redhat.com>

Pacemaker Remote for the Cattle.

● Pacemaker Remote allows clusters to scale beyond corosync membership layer limitations.

● Pacemaker Remote can scale clusters to 100s possibly 1000s of nodes.

Pacemaker + Pacemaker Remote

David Vossel <dvossel@redhat.com>

The Solution: Pacemaker Remote

● Pacemaker Remote is a single daemon, pacemaker_remoted

Pacemaker Remote

Remote Node

David Vossel <dvossel@redhat.com>

The Solution: Pacemaker Remote

Pacemaker

Node 2 Node 3Node 1

Pacemaker Remote

Node 5 Node 6Node 4Node 8 Node 9Node 7

Node 11 Node 12Node 10Node 14 Node 15Node 13

Node 16

Remote Node

● Pacemaker Remote is a single daemon, pacemaker_remoted● This daemon is a lightweight way of integrating nodes into the cluster.

David Vossel <dvossel@redhat.com>

The Solution: Pacemaker Remote

Pacemaker

Node 2 Node 3Node 1

Pacemaker Remote

Node 5 Node 6Node 4Node 8 Node 9Node 7

Node 11 Node 12Node 10Node 14 Node 15Node 13

Node 16

Remote Node

● Pacemaker Remote is a single daemon, pacemaker_remoted● This daemon is a lightweight way of integrating nodes into the cluster.

● Cluster services spread out across pacemaker and pacemaker_remote nodes as a single cluster partition.

Cluster Services

David Vossel <dvossel@redhat.com>

Pacemaker Remote use-case

Pacemaker

Node 2 Node 3Node 1

Pacemaker Remote

Node 5 Node 6Node 4Node 8 Node 9Node 7

Node 11 Node 12Node 10Node 14 Node 15Node 13

Node 16

Remote Node

● Management of services on Pacemaker Remote can work just like Pacemaker

● But thrives in the cattle use case where every remote instance is identical.

Cluster Services

Cloned service

Pacemaker RemoteRemote Node

Cloned service

Pacemaker RemoteRemote Node

Cloned service

David Vossel <dvossel@redhat.com>

Pacemaker Remote use case

Pacemaker

Node 2 Node 3Node 1

● Cloned services scale quite well on pacemaker remote

Cluster Services

Cloned services

Remote Node

Pacemaker Remote

David Vossel <dvossel@redhat.com>

Compute Node HA Strategy.

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per service

David Vossel <dvossel@redhat.com>

Compute Node HA Strategy.

Compute Service Group

Remote Node

Pacemaker Remote

PacemakerNODE1

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE2

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

NODE3

Keystone

Ceilometer

Nova Cinder

Glance

Horizon

RabbitMQ

Galera

Redis

Neutron N

Swift

Heat

Neutron S

HA Proxy HA Proxy HA Proxy

Separate VIP per serviceLibvirtdD

Neutron Open vSwitch

Ceilometer Compute Nova Compute

David Vossel <dvossel@redhat.com>

Why HA Compute Nodes?

● Maximize the Availability of Compute Instances.● detection of dead Cattle instances● Automate recovery of Cattle instances

David Vossel <dvossel@redhat.com>

Why HA Compute Nodes?

● Maximize the Availability of Compute Instances.● detection of dead Cattle instances● Automate recovery of Cattle instances

● Pacemaker Remote also has a secret weapon.

David Vossel <dvossel@redhat.com>

STONITH + Pacemaker Remote.

David Vossel <dvossel@redhat.com>

The Future

David Vossel <dvossel@redhat.com>

Limits?

Pacemaker + Pacemaker Remote

David Vossel <dvossel@redhat.com>

How manyservices?

Pacemaker + Pacemaker Remote

David Vossel <dvossel@redhat.com>

Pacemaker + Pacemaker Remote

How manyservices?

How manynodes?

David Vossel <dvossel@redhat.com>

Questions?

Visit us at

clusterlabs.org