Intro to Kubernetes

Joonathan Mägi, Teleport

https://twitter.com/joonathan

WHAT IS KUBERNETES?

Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications. It lets you take advantage of on-premises, hybrid, or public cloud infrastructure and move workloads to wherever it matters to you.

It groups containers that make up an application into logical units for easy management and discovery.

It builds on Docker containers, and version 1.3 added support for rkt as well as the OCI and CNI standards.

https://github.com/kubernetes/kubernetes

https://github.com/coreos/rkt

https://www.opencontainers.org

https://github.com/containernetworking/cni

WHAT CAN KUBERNETES DO FOR ME?

Scheduling

Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability. Mix critical and best-effort workloads in order to drive up utilization and save even more resources.

Kubernetes ships with a default scheduler, but you can build your own and run it instead of, or side by side with, the default one.
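The placement idea described above can be sketched as a filter-then-score loop. This is a toy model under stated assumptions, not the real scheduler: the Node fields, the node names, and the "most free CPU wins" scoring rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_millis: int   # allocatable CPU in millicores (hypothetical field)
    mem_mib: int      # allocatable memory in MiB (hypothetical field)
    used_cpu: int = 0
    used_mem: int = 0


def schedule(nodes, cpu_request, mem_request):
    """Pick a node that fits the request, preferring the least-loaded one."""
    # Filter: keep only nodes with enough free CPU and memory.
    fits = [
        n for n in nodes
        if n.cpu_millis - n.used_cpu >= cpu_request
        and n.mem_mib - n.used_mem >= mem_request
    ]
    if not fits:
        return None  # nothing fits; the pod would stay pending
    # Score: prefer the node with the most free CPU left after placement.
    best = max(fits, key=lambda n: n.cpu_millis - n.used_cpu - cpu_request)
    best.used_cpu += cpu_request
    best.used_mem += mem_request
    return best.name


demo_nodes = [Node("node-a", 2000, 4096), Node("node-b", 4000, 8192)]
print(schedule(demo_nodes, 500, 256))
```

A custom scheduler would replace the filter and score steps with its own rules while keeping the same overall shape.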


Lifecycle and health

A replication controller ensures that a specified number of pod “replicas” are running at any one time.

A Deployment provides declarative updates for Pods and Replica Sets and replaces Pods in a rolling-update fashion (maxUnavailable and maxSurge can be set to control the process).
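To make the effect of maxUnavailable and maxSurge concrete, here is a small sketch of the arithmetic. It assumes the documented rounding behavior (percentages of maxUnavailable round down, percentages of maxSurge round up); the function names are made up for illustration.

```python
import math


def resolve(value, replicas, round_up):
    """Turn an absolute count or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        exact = replicas * int(value[:-1]) / 100
        return math.ceil(exact) if round_up else math.floor(exact)
    return int(value)


def rolling_update_bounds(replicas, max_unavailable, max_surge):
    """Return (minimum pods kept available, maximum pods allowed to exist)."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge


print(rolling_update_bounds(10, "25%", "25%"))  # (8, 13)
```

With 10 replicas and both settings at 25%, at least 8 pods stay available and at most 13 pods exist at any point during the rollout.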

The kubelet constantly monitors the Docker daemon to confirm the container process is still running; if it is not, the container process is restarted.

Health Check probes can be defined (livenessProbe & readinessProbe) to run HTTP Health Checks, Container Exec checks or TCP socket checks.
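As a rough illustration, the probe fields can be written as the container section of a pod spec, shown here as a Python dict that serializes to JSON. The container name, image, port, and /healthz path are hypothetical; the probe field names (httpGet, tcpSocket, initialDelaySeconds, periodSeconds) follow the Kubernetes API.

```python
import json

# Container section of a pod spec; name, image and paths are placeholders.
container = {
    "name": "web",
    "image": "example/web:1.0",
    "ports": [{"containerPort": 8080}],
    # Liveness: restart the container if this HTTP check keeps failing.
    "livenessProbe": {
        "httpGet": {"path": "/healthz", "port": 8080},
        "initialDelaySeconds": 15,
        "periodSeconds": 10,
    },
    # Readiness: only send Service traffic once the TCP port accepts connections.
    "readinessProbe": {
        "tcpSocket": {"port": 8080},
        "periodSeconds": 5,
    },
}

print(json.dumps(container, indent=2))
```

An exec check would use an "exec" key with a "command" list in place of httpGet or tcpSocket.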

Container Lifecycle Hooks are available — PostStart and PreStop.

With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization or user-defined metrics.
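The scaling decision can be approximated with the ratio formula from the autoscaler documentation: desired = ceil(current replicas × current metric / target metric). The sketch below assumes that formula and adds hypothetical min/max bounds; the real controller also applies a tolerance and other safeguards that are left out here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count by the ratio of observed to target utilization."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Clamp to the configured bounds (the defaults here are illustrative only).
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90, 60))   # 6
print(desired_replicas(3, 50, 100))  # 2
```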

http://kubernetes.io/docs/user-guide/liveness/

http://kubernetes.io/docs/user-guide/horizontal-pod-autoscaling/


Discovery

A Service is an abstraction which defines a logical set of Pods running somewhere in your cluster that all provide the same functionality. Each Service is assigned a unique IP address (clusterIP). This address is tied to the lifespan of the Service and will not change while the Service is alive. Pods can be configured to talk to the Service, knowing that communication to the Service will be automatically load-balanced to some pod that is a member of the Service.

DNS is a built-in service, launched automatically as a cluster add-on, that lets you reach a Service by its name. SRV queries can be used to discover ports if necessary.
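The names that cluster DNS serves follow a fixed pattern, which the sketch below reproduces. It assumes the common default cluster domain cluster.local; the service, namespace, and port names are examples only.

```python
def service_dns_name(service, namespace="default", cluster_domain="cluster.local"):
    """Fully qualified name that resolves to the Service's clusterIP."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def srv_record_name(port_name, protocol, service,
                    namespace="default", cluster_domain="cluster.local"):
    """Name to query for an SRV record describing a named Service port."""
    base = service_dns_name(service, namespace, cluster_domain)
    return f"_{port_name}._{protocol}.{base}"


print(service_dns_name("redis", "prod"))
print(srv_record_name("http", "tcp", "web"))
```

Pods in the same namespace can usually use just the short Service name, because the namespace and cluster domain are added through the DNS search path.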

Namespaces can provide scoping of ‘environments’ on the same cluster.


Configuration

Objects of type Secret are intended to hold sensitive information, such as passwords, OAuth tokens, and SSH keys. Putting this information in a Secret is safer and more flexible than putting it verbatim in a pod definition or in a Docker image. Secrets can be mounted as data volumes or exposed as environment variables to be used by a container in a pod.
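A minimal sketch of what a Secret object looks like, built as a Python dict that can be serialized to JSON. Values under the data field are base64-encoded, which is an encoding and not encryption. The secret name and keys are hypothetical.

```python
import base64
import json


def make_secret(name, values):
    """Build a Secret manifest; plain-text values are base64-encoded."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


secret = make_secret("db-credentials", {"username": "app", "password": "s3cret"})
print(json.dumps(secret, indent=2))
```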

A ConfigMap holds key-value pairs of configuration data that can be consumed in pods or used to store configuration data for system components such as controllers. A ConfigMap is similar to a Secret, but is designed to more conveniently support working with strings that do not contain sensitive information.

As with service discovery, Namespaces can be used to scope configuration to ‘environments’ within a cluster.

WHO IS USING KUBERNETES?

HOW TO GET STARTED?

Minikube

A tool that makes it easy to run Kubernetes locally. Minikube runs a single-node Kubernetes cluster inside a VM on your machine for users looking to try out Kubernetes or develop with it day-to-day.

Google Container Engine (GKE)

Google provides a hosted master for Kubernetes clusters on top of the Google Compute Engine platform.

https://github.com/kubernetes/minikube

https://cloud.google.com/container-engine/

TERMINOLOGY & ARCHITECTURE

POD

A pod is a group of one or more containers - it is the basic scheduling unit in Kubernetes.

Pods are always co-located and co-scheduled, and run in a shared context. A pod models an application-specific “logical host” - it contains one or more application containers which are relatively tightly coupled — in a pre-container world, they would have executed on the same physical or virtual machine.

Containers within a pod share an IP address and port space, and can find each other via localhost.
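For illustration, a two-container pod can be described as follows, written as a Python dict that serializes to JSON. The pod name, labels, and the log-agent image are hypothetical. Because both containers share the pod's network namespace, the second container could reach the first at localhost:80.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-with-sidecar", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            # Main application container, listening on port 80.
            {"name": "web", "image": "nginx", "ports": [{"containerPort": 80}]},
            # Helper container (placeholder image) sharing the same IP and ports.
            {"name": "log-agent", "image": "example/log-agent:1.0"},
        ]
    },
}

print(json.dumps(pod, indent=2))
```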

Users shouldn’t need to create pods directly, but rather use controllers (e.g., deployments, replication controller), even for singletons. Controllers provide self-healing with a cluster scope, as well as replication and rollout management.

REPLICATION CONTROLLER & REPLICA SET

A replication controller ensures that a specified number of pod “replicas” are running at any one time. Unlike manually created pods, the pods maintained by a replication controller are automatically replaced if they fail, get deleted, or are terminated. You can think of a replication controller as something similar to a process supervisor, but rather than individual processes on a single node, the replication controller supervises multiple pods across multiple nodes.
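The supervising behavior described above boils down to comparing the desired count with what is actually running. The sketch below shows one pass of such a loop as a toy model; the action tuples and the choice of which pods to delete are assumptions made for illustration.

```python
def reconcile(desired, running_pods):
    """Return the actions one pass of the control loop would take."""
    diff = desired - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing replicas.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods: delete the surplus (picked alphabetically here).
        return [("delete", name) for name in sorted(running_pods)[:-diff]]
    return []  # actual state already matches the desired state


print(reconcile(3, ["web-1"]))
print(reconcile(1, ["web-2", "web-1", "web-3"]))
```

Running the same function again after a pod fails or is deleted produces a new create action, which is the self-healing behavior.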

Replica Set is the next-generation Replication Controller. The only difference between a Replica Set and a Replication Controller right now is the selector support. Replica Set supports the new set-based selector requirements as described in the labels user guide whereas a Replication Controller only supports equality-based selector requirements.
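The difference between the two selector styles can be sketched with a small matcher. It follows the matchLabels and matchExpressions structure and the In, NotIn, Exists, and DoesNotExist operators from the labels user guide; the function itself and the example labels are only an illustration.

```python
def matches(labels, selector):
    """Check whether a pod's labels satisfy a label selector."""
    # Equality-based part: every listed key must have exactly this value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    # Set-based part: each expression must hold.
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In" and labels.get(key) not in values:
            return False
        if op == "NotIn" and labels.get(key) in values:
            return False
        if op == "Exists" and key not in labels:
            return False
        if op == "DoesNotExist" and key in labels:
            return False
    return True


pod_labels = {"app": "web", "tier": "frontend"}
print(matches(pod_labels, {"matchLabels": {"app": "web"}}))
print(matches(pod_labels, {"matchExpressions": [
    {"key": "tier", "operator": "In", "values": ["frontend", "backend"]},
]}))
```

A Replication Controller can only express the first loop; a Replica Set can use both.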

https://asciinema.org/a/84911

DEPLOYMENTS

A Deployment provides declarative updates for Pods and Replica Sets (the next-generation Replication Controller). You only need to describe the desired state in a Deployment object, and the Deployment controller will change the actual state to the desired state at a controlled rate for you. You can define Deployments to create new resources, or to replace existing ones with new ones.
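A desired state of this kind might look like the manifest below, built as a Python dict and serialized to JSON. The apiVersion shown is the current apps/v1 group, which is an assumption relative to this talk; older clusters may expect a different value. The deployment name and images are hypothetical.

```python
import json


def make_deployment(name, image, replicas):
    """Describe the desired state: N replicas of one container image."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",  # assumption: current API group
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": dict(labels)},
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 1, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


v1 = make_deployment("web", "example/web:1.0", 3)
v2 = make_deployment("web", "example/web:1.1", 3)
print(json.dumps(v2, indent=2))
```

Submitting v2 after v1 changes only the image in the pod template, and the Deployment controller would then roll the pods over to the new image.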

https://asciinema.org/a/84922

SERVICES

A Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them. The set of Pods targeted by a Service is (usually) determined by a Label Selector. Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.

ClusterIP — use a cluster-internal IP only - this is the default and means that you want this service to be reachable only from inside of the cluster.

NodePort — on top of having a cluster-internal IP, expose the service on a port on each node of the cluster (the same port on each node). You’ll be able to contact the service on any <NodeIP>:NodePort address.

LoadBalancer — in addition to having a cluster-internal IP and exposing the service on a NodePort, ask the cloud provider for a load balancer which forwards to the Service exposed as <NodeIP>:NodePort on each Node.
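The three service types above differ mainly in the type field of the Service spec. A hedged sketch of a manifest builder follows; the helper name, service name, and ports are hypothetical, and 30080 is just an example value from the default NodePort range (30000-32767).

```python
import json


def make_service(name, selector, port, target_port,
                 service_type="ClusterIP", node_port=None):
    """Build a Service manifest for one of the three service types."""
    port_spec = {"port": port, "targetPort": target_port}
    if node_port is not None:
        if service_type == "ClusterIP":
            raise ValueError("nodePort needs type NodePort or LoadBalancer")
        port_spec["nodePort"] = node_port
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": selector,  # label selector picking the backend Pods
            "ports": [port_spec],
        },
    }


internal = make_service("web", {"app": "web"}, 80, 8080)
external = make_service("web", {"app": "web"}, 80, 8080,
                        service_type="NodePort", node_port=30080)
print(json.dumps(external, indent=2))
```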

INGRESS

Typically, services and pods have IPs only routable by the cluster network. All traffic that ends up at an edge router is either dropped or forwarded elsewhere. An Ingress is a collection of rules that allow inbound connections to reach the cluster services.

Ingress can be configured to give services externally reachable URLs, load balance traffic, terminate SSL, offer name-based virtual hosting, etc. An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic in an HA manner.
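Name-based virtual hosting with path rules can be pictured as a lookup from (host, path) to a backend service. The toy router below is only a model of that idea: the hosts and service names are made up, and it uses a plain string-prefix match, which is simpler than what a real Ingress controller does.

```python
# Each rule: (host, path prefix, backend service name); all values are examples.
RULES = [
    ("shop.example.com", "/", "shop-frontend"),
    ("shop.example.com", "/api", "shop-api"),
    ("blog.example.com", "/", "blog"),
]


def route(host, path, rules=RULES):
    """Return the backend for a request, or None if no rule matches."""
    candidates = [
        (prefix, service)
        for rule_host, prefix, service in rules
        if rule_host == host and path.startswith(prefix)
    ]
    if not candidates:
        return None  # unmatched traffic would be dropped or sent to a default
    # The longest matching prefix wins.
    return max(candidates, key=lambda c: len(c[0]))[1]


print(route("shop.example.com", "/api/items"))
print(route("blog.example.com", "/2016/intro"))
```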

Ingress controllers available:

Nginx Ingress Controller

GLBC — Google Compute Engine L7 load balancer controller

https://github.com/kubernetes/contrib/tree/master/ingress/controllers/nginx

https://github.com/kubernetes/contrib/tree/master/ingress/controllers/gce

DAEMON SETS, JOBS, NODES…

Node — worker machine

Daemon Set — ensures that all (or some) nodes run a copy of a pod.

Job — creates one or more pods and ensures that a specified number of them successfully terminate. Tracks the successful completions. When a specified number of successful completions is reached, the job itself is complete.

PersistentVolume (PV) — a piece of networked storage in the cluster that has been pre-provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource.

PersistentVolumeClaim (PVC) — a request for storage.

Pet Set (alpha) — a group of stateful pods that require a stronger notion of identity. Example workloads would be databases, clustered software etc.

…
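The relationship between a PersistentVolumeClaim and the PersistentVolumes listed above can be modeled as a matching step: a claim asks for a size and an access mode, and a suitable unclaimed volume is bound to it. The sketch below is a toy model; the dict fields, the volume names, and the smallest-fit rule are assumptions for illustration.

```python
def bind_claim(volumes, request_gib, access_mode):
    """Bind a claim to the smallest unclaimed volume that satisfies it."""
    fits = [
        v for v in volumes
        if not v.get("claimed")
        and v["capacity_gib"] >= request_gib
        and access_mode in v["access_modes"]
    ]
    if not fits:
        return None  # the claim would stay unbound until a volume appears
    best = min(fits, key=lambda v: v["capacity_gib"])
    best["claimed"] = True
    return best["name"]


pvs = [
    {"name": "pv-small", "capacity_gib": 5, "access_modes": ["ReadWriteOnce"]},
    {"name": "pv-large", "capacity_gib": 50, "access_modes": ["ReadWriteOnce", "ReadOnlyMany"]},
]
print(bind_claim(pvs, 3, "ReadWriteOnce"))
print(bind_claim(pvs, 3, "ReadWriteOnce"))
```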

http://kubernetes.io/docs/admin/node/

http://kubernetes.io/docs/admin/daemons/

http://kubernetes.io/docs/user-guide/jobs/

http://kubernetes.io/docs/user-guide/persistent-volumes/

http://kubernetes.io/docs/user-guide/persistent-volumes/#persistentvolumeclaims

http://kubernetes.io/docs/user-guide/petset/

CLUSTER FEDERATION

Kubernetes 1.3 makes it possible to discover services running in multiple clusters, that may span regions and/or cloud providers, to be used by containers or external clients. This federation can be used for increased HA, geographic distribution and hybrid/multi-cloud.

Once created, the Federated Service automatically:

• creates matching Services in every cluster underlying your cluster federation,

• monitors the health of those service "shards" (and the clusters in which they reside),

• manages a set of DNS records in a public DNS provider (like Google Cloud DNS, or AWS Route 53), thus ensuring that clients of your federated service can seamlessly locate an appropriate healthy service endpoint at all times, even in the event of cluster, availability zone or regional outages.

• http://kubernetes.io

• https://twitter.com/kubernetesio

• https://github.com/kubernetes/kubernetes

https://teleport.org

@TeleportInc

https://twitter.com/TeleportInc
