Galera on Kubernetes
Running a Galera Cluster on a Kubernetes Cluster
Patrick Galbraith, Advanced Technology Group / April 14th, 2015

© Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.



HP ATG

HP's Advanced Technology Group for Open Source and Cloud embraces a vision that is two steps ahead of today's solutions. We use this vision to drive product adoption and incubate technologies to advance HP. Through Open Source initiatives we foster collaboration across HP and beyond.


About the speaker
● Patrick Galbraith
● HP Advanced Technology Group
● Has worked at Blue Gecko, MySQL AB, Classmates, Slashdot, Cobalt Group, US Navy, K-mart
● MySQL projects: memcached UDFs, DBD::mysql, federated storage engine
● Family
● Outdoors


Purpose of this talk – why are you here?

Docker
• Containers vs. virtualization
• Simple Docker usage
• Clustered Docker

CoreOS
• Container-optimized, stripped-down Linux distribution
• Overview of the core components of CoreOS – fleet, etcd, systemd

Kubernetes
• A more advanced scheduler and how it works
• Using Kubernetes to do work, and in this case what it means to MySQL users

Galera
• Synchronous replication – an excellent solution for clustering MySQL

Vitess
• Introduction and demo


What do you need?

Pre-reqs
• VMware – Fusion or Workstation. ESXi should work too.
• https://github.com/CaptTofu/mysql_replication_kubernetes.git
• https://github.com/CaptTofu/kubernetes_cluster_vmware.git
• Go -- brew install go (or https://golang.org/doc/install)

Clients:
• etcdctl: go get github.com/coreos/etcd/etcdctl
• fleetctl: go get github.com/coreos/fleet/fleetctl
• kubectl:
  – git clone https://github.com/GoogleCloudPlatform/kubernetes
  – make
  – sudo cp cmd/kubectl /usr/local/bin
• vtctlclient -- go get github.com/youtube/vitess/go/cmd/vtctlclient


What are containers?
• Operating-system-level virtualization
• Encapsulated, hermetically sealed applications
• Relatively isolated
• Small footprint
• Fast to launch!
• Portable. And did I mention, portable?!
• Use the host OS and kernel
• Launch time is just the time it takes to start the application in question
• LXC, Docker, Solaris Zones, BSD Jails, Parallels Virtuozzo, OpenVZ, …


Containers vs. VMs


Docker

What is Docker?
• Set of tools for managing containers
• Command line tool that doubles as a daemon
• Uses Linux kernel features:
  – Kernel namespaces – the core ingredient to containers working; they determine what a group of processes can see: PID, IPC, UTS, mount, network, and user
  – Cgroups (control groups) -- limit, account for, and isolate resource usage (CPU, memory, disk I/O, etc.) of process groups
• Originally used LXC; now defaults to libcontainer, but is meant to work with any containerization mechanism
• Much more lightweight than VMs
• Encapsulates application containers in a relatively isolated but lightweight operating environment
• Written in Go


Docker – common terms and usage
• Dockerfile
• EXPOSE ports
• Entrypoints and CMD
• docker build
• docker push
• docker run
• docker inspect
• docker exec


Dockerfile
• Show example of Dockerfile and explain
• Show entrypoint scripts
• Environment variables passed to container


Simple Docker usage - forked

Video will be provided on YouTube


Clustered Docker
• CoreOS (https://coreos.com/)
• Kubernetes (http://kubernetes.io)
• Mesos + Marathon (http://mesos.apache.org/) – Apache project, Zookeeper, etc.
• Project Atomic (http://www.projectatomic.io/) – RH/Fedora/CentOS, designed for running Docker
• Docker OpenStack (https://wiki.openstack.org/wiki/Docker) – hypervisor driver for OpenStack Compute
• Swarm/Compose/Machine
• RancherOS (http://rancher.com/rancher-os) – minimalist Linux; the Docker daemon runs as PID 1, the first process the kernel starts, known as “System Docker”
• Flocker (https://clusterhq.com)
• Spotify Helios (https://github.com/spotify/helios) – Zookeeper
• Flynn (https://flynn.io/)
• Deis (http://deis.io)
• Maestro (https://github.com/toscanini/maestro)
• Shipyard (http://shipyard-project.com)
• … others to come!


What is CoreOS?

CoreOS
• Minimalist Linux
• Optimized for containers
• Easy to run containers
• Service discovery, container management
• Docker -- container runtime and management, though Rocket is the long-term plan
• etcd – distributed global key-value store for config data, available on each node
• Fleet – rudimentary scheduler that interacts with systemd and etcd
• systemd – system and service manager used by newer Linux distributions
• Flannel – networking across nodes


CoreOS Cluster

[Diagram: several CoreOS nodes, each running systemd, dockerd, fleetd, etcd, and Docker containers; an etcd discovery server (http://discovery.etcd.io); a Docker registry holding images; and a control node (jump box) with etcdctl, fleetctl, and ssh. The numbered arrows are:]

1. Cluster start (etcd discovery)
2. Container start (fleetctl)
3. Docker download


What is Kubernetes?

Kubernetes
• “An open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.”
• Pre-production beta
• Lean
• Portable – will run on cloud, bare metal, hybrid, etc.
• Extensible – modular design allowing for pluggability and hooks
• Self-healing – auto-placement, auto-restart, auto-replication
• Google engineering brings good work to the open-source world


Master Components

Kubernetes
• kube-apiserver – API Server (RESTful)
  – Primary management point for the cluster
  – Reconciles etcd entries with deployed containers
• kube-controller-manager -- Controller Manager Server
  – Handles replication processes defined by replication tasks
  – Writes details to etcd
  – Monitors changes and implements the procedure to reflect each change
• kube-scheduler -- Scheduler Server
  – Assigns workloads to specific minions in the cluster, taking into account the service’s operating requirements and the infrastructure environment
• kube-register -- Register Server


Minion Components

Kubernetes
• kubelet – Kubelet
  – Communicates with the master, relaying information to and from it
  – Reads and updates etcd
  – Receives work in a manifest that defines the workload and operating parameters
  – Assumes responsibility for the state of work on the minion
• kube-proxy – Kube proxy
  – Ensures the network environment is accessible but isolated
  – Makes services available externally by forwarding requests to containers
  – Can perform rudimentary load balancing
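The "rudimentary load balancing" mentioned for kube-proxy can be pictured with a toy round-robin picker. This is only an illustration: the slides do not say which strategy kube-proxy uses, so round-robin is an assumption here, and the class name and backend addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinProxy:
    """Toy stand-in for a service proxy: one front end, several backends."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list forever, in order.
        self._backends = cycle(list(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._backends)


# Hypothetical pod addresses behind one service.
proxy = RoundRobinProxy(["10.244.1.5:3306", "10.244.2.7:3306"])
picks = [proxy.pick() for _ in range(4)]
```

Each call to `pick()` hands back the next backend in turn, so requests alternate evenly between the two containers.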


Kubernetes Terminology
• Pod
  • Group of closely-related containers on the same host
• Service
  • Virtual abstraction
  • Basic load-balancer
  • Single consistent access point to a pod
• Replication controller
  • Defines pods to be horizontally scaled
  • Uses a label query to identify which containers to run
  • Maintains a specified number of replicas of a particular thing to run
  • Dynamic resizing
• Label
  • Key/value tag that marks work units as part of a group
  • Used for management and action targeting
• Definition file – YAML/JSON describing a pod, service, or replication controller
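To make the label and replication controller ideas concrete, here is a small toy model, not Kubernetes code. It assumes a label query is a set of key/value pairs that must all match, and that the controller's job is to compare the number of matching pods with the desired replica count. The function names and pod data are hypothetical.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def reconcile(selector, desired, pods):
    """Return how many pods to start (positive) or stop (negative)."""
    running = [pod for pod in pods if matches(selector, pod["labels"])]
    return desired - len(running)


# Hypothetical cluster state: two database pods and one web pod.
pods = [
    {"name": "pxc-a", "labels": {"name": "pxc", "tier": "db"}},
    {"name": "pxc-b", "labels": {"name": "pxc", "tier": "db"}},
    {"name": "web-a", "labels": {"name": "web"}},
]

# Three replicas wanted, two running: one more pod is needed.
delta = reconcile({"name": "pxc"}, 3, pods)
```

Dynamic resizing then amounts to changing `desired` and letting the same comparison decide what to start or stop.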


Kubernetes


Kubernetes usage pattern
• Pod configuration file – YAML or JSON
• Service configuration file
• Replication controller configuration file
• export KUBERNETES_API=http://172.16.230.132:8080
• kubectl create -f mysql_master.json
• kubectl create -f mysql_master_service.json
• …
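As a rough idea of what a pod configuration file might contain, the sketch below builds one as JSON. It is not the file from the talk: the field layout follows the later `v1` API (the API versions current at the time of the talk may differ in detail), and the pod name, image name, and password are hypothetical placeholders.

```python
import json


def pod_definition(name, image, port, env):
    """Build a minimal pod definition as a plain dictionary (v1-style layout)."""
    return {
        "apiVersion": "v1",  # assumption: adjust to the API version your cluster serves
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"name": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                    # Environment variables handed to the container's entrypoint.
                    "env": [{"name": k, "value": v} for k, v in sorted(env.items())],
                }
            ]
        },
    }


definition = pod_definition(
    "mysql-master",
    "example/mysql-master",  # hypothetical image name
    3306,
    {"MYSQL_ROOT_PASSWORD": "changeme"},
)
text = json.dumps(definition, indent=2)
```

Writing `text` to a file such as mysql_master.json gives something that could be passed to `kubectl create -f`.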


MySQL async replication on Kubernetes
• Simple proof of concept
• Master pod and service
• Slave pod and service
• Secret sauce?
  • The master pod configuration file passes environment variables to set the root password and the replication password
  • The entrypoint script runs an SQL file to grant permissions to the replication user
  • Once the master service configuration is loaded, the slave pod container(s) have the environment variable MYSQL_MASTER_SERVICE_HOST set
  • When launched, the slave pod is passed environment variables for the replication user and password; together with MYSQL_MASTER_SERVICE_HOST, it uses them to change the master host and user so that it points to that master
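The slave-side step can be sketched as a function that turns those environment variables into the SQL the entrypoint would run. This is a minimal sketch, not the talk's entrypoint script: MYSQL_MASTER_SERVICE_HOST comes from the slide, while MYSQL_REPLICATION_USER and MYSQL_REPLICATION_PASSWORD are hypothetical names chosen for illustration.

```python
def quote(value):
    """Quote a string literal for MySQL, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def change_master_sql(env):
    """Build the statements that point a slave at the master service."""
    host = env["MYSQL_MASTER_SERVICE_HOST"]      # set by Kubernetes for the master service
    user = env["MYSQL_REPLICATION_USER"]         # hypothetical variable name
    password = env["MYSQL_REPLICATION_PASSWORD"]  # hypothetical variable name
    return (
        "CHANGE MASTER TO MASTER_HOST=%s, MASTER_USER=%s, MASTER_PASSWORD=%s; "
        "START SLAVE;" % (quote(host), quote(user), quote(password))
    )


# Example environment; a real entrypoint would pass os.environ instead.
example_env = {
    "MYSQL_MASTER_SERVICE_HOST": "10.0.0.12",
    "MYSQL_REPLICATION_USER": "repl",
    "MYSQL_REPLICATION_PASSWORD": "s3cret",
}
sql = change_master_sql(example_env)
```

The point of the pattern is that the slave never needs a hard-coded master address; the service supplies it through the environment.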


Galera replication on Kubernetes
• A bit more tricky
• Currently unable to rely on services because they allow only one port/IP (v1beta3 API required in 0.14.x)
• A pod per node (pxc_01, pxc_02, and pxc_03)
• Single container per pod
• Secret sauce?
  • The pod configuration file passes environment variables to set the root password and the SST user and password
  • kubectl is built into the container and uses the read-only Kubernetes API to get the pod IP addresses of the nodes
  • Entrypoint script:
    • wsrep_cluster_address on pxc_01 is set to gcomm://
    • wsrep_cluster_address on pxc_02 and pxc_03 is set to the pod IP addresses: gcomm://<pxc_01>,<pxc_02>, …
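The address logic in the entrypoint script can be sketched as follows. This is an illustration of the rule on the slide rather than the actual script: the function name is hypothetical, and the pod IPs are passed in as a list instead of being fetched from the Kubernetes API with kubectl.

```python
def wsrep_cluster_address(node_name, pod_ips, bootstrap_node="pxc_01"):
    """Return the wsrep_cluster_address value for one Galera node.

    The bootstrap node starts a new cluster with an empty gcomm:// address;
    every other node joins by listing the pod IPs it should contact.
    """
    if node_name == bootstrap_node:
        return "gcomm://"
    return "gcomm://" + ",".join(pod_ips)


# Hypothetical pod IPs, in the order pxc_01, pxc_02.
known_ips = ["10.244.1.5", "10.244.2.7"]

first = wsrep_cluster_address("pxc_01", known_ips)
third = wsrep_cluster_address("pxc_03", known_ips)
```

Here `first` is the bare bootstrap address and `third` lists the peers that pxc_03 joins through.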


Galera on Kubernetes


Galera on Kubernetes

Video will be provided on YouTube


Vitess
• YouTube – since 2011
• The name Vitess comes from the French “vitesse”, meaning speed. Came about because of cat movies (remember Gearman?)
• Backed by a consistent data store (etcd, Chubby, Zookeeper)
• Clients with a simple interface that provides the view of a single instance
• Lightweight connections (around 32 KB) using a BSON-based protocol
• Row cache
• SQL parser that uses a configurable set of rules to rewrite queries
• Sharding and shard management built in:
  • Range-based; supports horizontal and vertical sharding
  • Can accommodate your existing sharding scheme
  • Supports a split replication stream (keyspace ID in the statement-based replication stream)
• Handles failover and backups
• Includes a proxy to route queries to the most appropriate MySQL instance
• Supports transactions within a shard; plans to support cross-shard transactions using two-phase commit
• Web and CLI
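Range-based sharding can be illustrated with a small lookup that maps a keyspace ID to the shard whose range contains it. This is a toy sketch, not Vitess code: the four-way split, the shard names, and the choice to look only at the first byte of the keyspace ID are all assumptions made for the example.

```python
from bisect import bisect_right

# Lower bound (first byte of the keyspace ID) of each shard's range.
SHARD_STARTS = [0x00, 0x40, 0x80, 0xC0]
# Illustrative shard names describing each half-open range.
SHARD_NAMES = ["-40", "40-80", "80-c0", "c0-"]


def shard_for(keyspace_id):
    """Return the name of the shard whose range contains the keyspace ID."""
    if not keyspace_id:
        raise ValueError("keyspace ID must not be empty")
    first_byte = keyspace_id[0]
    # bisect_right finds how many range starts are <= first_byte;
    # the shard we want is the last one of those.
    return SHARD_NAMES[bisect_right(SHARD_STARTS, first_byte) - 1]


low = shard_for(bytes([0x12, 0x34]))
high = shard_for(bytes([0xF0, 0x00]))
```

Splitting a shard then means inserting a new range boundary, which is why resharding can be expressed as a change to the range table rather than to the application.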


Vitess components
• Vtgate
  • Lightweight proxy
  • Routes traffic to the correct vttablet based on sharding scheme, required latency, and availability
  • Allows for a simple client – the client only needs to find the vtgate instance
• Vttablet
  • Succeeded vtocc
  • Fronts the actual MySQL database
  • Provides connection pooling, query rewriting, query de-duping
  • Performs management tasks initiated by vtctl
• Topology server
  • Contains information about running servers, the sharding scheme, and the replication graph
• Vtctld
  • Web UI
• Vtctl
  • Command line tool for administering the cluster


Vitess components, topology server (cont)


Vitess on Kubernetes

Video will be provided on YouTube


Questions

http://patg.net


Thank you!

Thanks to:

Google
Kelsey Hightower
Tim Hockin
Daniel Smith
Anthony Yeh

#google-containers .* and #coreos .*


Please attend:

SCALING MYSQL IN THE CLOUD WITH VITESS AND KUBERNETES

Anthony Yeh

14 April 5:15PM - 5:40PM @ Ballroom A


Vitess components
• Topology server – used to store information about keyspaces, shards, tablets, replication graph, and serving graph. Also supports a watch interface for changes on a node
• Global instance
  • Keyspace object
    • Sharding of keyspace
    • Name of sharding column
    • How to split incoming queries
    • For use during resharding to change what shard is serving what inside of keyspace
  • Shard
    • Subset of data for a keyspace
    • The shard record contains the master tablet alias for the shard
    • Sharding key range
    • Tablet types
  • VSchema data
    • Sharding and routing information
• Local instance (per cell)


Vitess components, topology server (cont)
• Topology server – used to store information about keyspaces, shards, tablets, replication graph, and serving graph. Also supports a watch interface for changes on a node
• Local instance (per cell) -- the tablet record contains information about both the vttablet process and the MySQL process. It contains the tablet alias (cell + unique ID), hostname, IP, port, tablet type, which keyspace/shard the tablet is part of, health map, and sharding key range. Note that a tablet record is created before the tablet is run
• Serving graph – what clients use to find which endpoints to send queries to. The objects are:
  • SrvKeyspace – local representation of a Keyspace, containing information about which shard to use to access data
  • SrvShard – local representation of a Shard, with details internal only to this shard
  • EndPoints – one for each possible serving type
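The tablet record described above can be pictured as a simple data structure. This is a sketch for orientation only, not the Vitess schema: the field names paraphrase the slide, and the alias format (cell, a dash, and a zero-padded ID) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TabletRecord:
    """Illustrative tablet record, stored per cell in the topology server."""

    cell: str
    uid: int
    hostname: str
    ip: str
    port: int
    tablet_type: str   # for example master or replica
    keyspace: str
    shard: str         # sharding key range this tablet serves
    health: dict = field(default_factory=dict)

    @property
    def alias(self):
        """Tablet alias: cell plus unique ID (formatting is an assumption)."""
        return "%s-%010d" % (self.cell, self.uid)


# Hypothetical values; the record can exist before the tablet process starts.
record = TabletRecord(
    cell="test",
    uid=100,
    hostname="vttablet-100",
    ip="10.244.3.9",
    port=15002,
    tablet_type="master",
    keyspace="test_keyspace",
    shard="-80",
)
```

Because the alias is derived from the cell and the ID alone, other records (such as a shard's master tablet alias) can refer to a tablet without copying its address.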