Containerizing MongoDB with Kubernetes

Dan Worth (fuboTV)

Brian McNamara (CloudyOps)

Who are we?

Dan Worth

● Software Engineer

● Currently at fuboTV

● @djworth on the Internet

Who are we?

Brian McNamara

● Ops Engineer

● @mcnamarabrian / @cloudyops

● Enjoys learning new things and quoting corny 80s comedies whenever possible

What are we going to talk about?

● In the beginning of fuboTV

● Fundamentals of Kubernetes (Dan)

● Fundamentals of MongoDB high availability (Brian)

● Challenges of running stateful services on Kubernetes (Brian)

Let’s Start with a Story...

fubo.tv Business Overview

● Sports-first virtual multichannel video programming distributor (MVPD)

● Partnership with 21st Century Fox and Sky

● Create the best live events platform

● Subscription based

● Build communities around sports (teams, matches, players)

fubo.tv Started Taking Over Streaming World

● Needed scalable offering

● Bursty demand patterns around soccer / futbol matches

● Lots of in-house development experience

● Not a lot of in-house operational experience

● Didn’t want to maintain hardware

fubo.tv v1

● Node.js hosted with well-known PaaS provider

● MongoDB hosted with a separate provider

● Focus was on introducing application features, iterating quickly

Things Changed in Philly

Enter Kubernetes

● Presentation by Kelsey Hightower

● Sysadmin who can code

● 2015 PhillyETE: Managing Containers at Scale with CoreOS and Kubernetes

● Container cluster manager

● Don’t sweat the scheduling of containers in your cluster

● Live demo of rolling application updates

Minds = Blown

How Can We Do That?

● Google’s infrastructure for everyone else

● Loved demo

● Saw the possibilities of Kubernetes

● But... still didn’t want to maintain hardware

Enter Google Container Engine

● No need to run a Kubernetes cluster in-house

● Google provided service

● Fully managed

Fundamentals of Kubernetes

● Open Source container cluster manager by Google

● Run Anywhere (GKE, GCE, AWS, Bare metal)

● Self-healing when using the right primitives

● Service discovery and load balancing

● Secret and configuration management

● Key high level domain objects

○ Pods

○ Replication Controllers

○ Services

Fundamentals of Kubernetes (Pods)

● Pods

○ Unit of Scheduling

○ One or more containers

○ Define environment

○ Pods get their own IP addresses

spec:
  containers:
  - name: mongo
    image: mongo:3.2
    ports:
    - containerPort: 27017
    resources:
      limits:
        cpu: 4

Fundamentals of Kubernetes (RC)

● Replication Controller

● All the goodness of the Pod

● Additional benefit of defining count of pods

Replication Controller Example

apiVersion: v1
kind: ReplicationController
metadata:
  name: mongo1
spec:
  replicas: 1
  selector:
    name: mongo1
  template:
    metadata:
      labels:
        name: mongo1
    spec:
      containers:
      - name: mongo1
        image: mongo:3.2
        volumeMounts:
        - name: mongo1-data
          mountPath: /data/db
        resources:
          limits:
            cpu: 4
            memory: 4Gi
        ports:
        - name: "mongo"
          containerPort: 27017
          protocol: TCP
        command:
        - ...
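The "count of pods" idea can be illustrated with a small conceptual sketch in Python. This is not how the Replication Controller is actually implemented; it only shows the comparison between the desired replica count and what is currently running. The function name `reconcile` and the pod names are hypothetical.

```python
from typing import List, Tuple


def reconcile(desired_replicas: int, running_pods: List[str]) -> Tuple[int, List[str]]:
    """Return (number_of_pods_to_create, pods_to_delete) to reach the desired count."""
    surplus = len(running_pods) - desired_replicas
    if surplus > 0:
        # Too many pods: pick the surplus ones for deletion.
        return 0, running_pods[:surplus]
    # Too few (or exactly enough) pods: create the shortfall, delete nothing.
    return -surplus, []


# A pod died, so nothing is running but one replica is desired.
print(reconcile(1, []))
# Two pods are running but only one replica is desired.
print(reconcile(1, ["mongo1-abc", "mongo1-def"]))
```

With `replicas: 1`, as in the example above, the effect is simply that a lost pod gets replaced by a new one.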

Fundamentals of Kubernetes (Services)

● Provides stable endpoint to pods / replication controllers

● Uses metadata like ports and selectors to identify how to map endpoint to pod

Service Example

apiVersion: v1
kind: Service
metadata:
  name: mongo1-service
  labels:
    name: mongo1-service
spec:
  ports:
  - port: 27017
    targetPort: 27017
    protocol: TCP
  selector:
    name: mongo1
  type: LoadBalancer
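To make the selector idea concrete, here is a minimal Python sketch of equality-based label matching: a pod is picked as an endpoint when every key/value pair in the selector also appears in the pod's labels. The pod IPs and the extra labels are made up for illustration, and this is a model of the concept rather than Kubernetes code.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True when every selector key/value pair is present in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector: Dict[str, str], pods: List[dict]) -> List[str]:
    """Return the IPs of the pods whose labels satisfy the selector."""
    return [pod["ip"] for pod in pods if selector_matches(selector, pod["labels"])]


# Hypothetical pods; only the "name: mongo1" label comes from the slides.
pods = [
    {"ip": "10.0.0.5", "labels": {"name": "mongo1"}},
    {"ip": "10.0.0.6", "labels": {"name": "mongo2"}},
    {"ip": "10.0.0.7", "labels": {"name": "mongo1", "tier": "db"}},
]

print(endpoints_for({"name": "mongo1"}, pods))
```

Because the match is on labels and not on pod identity, a rescheduled pod that carries the same label is picked up by the same service.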

Fundamentals of Kubernetes

[Diagram: a Service in front of a Replication Controller that manages three Pods]

Fundamentals of MongoDB High Availability

● Possible to scale reads and writes

○ Scaling reads: use replica sets

○ Scaling writes: use shards

● Clients can do things to take advantage of availability primitives

● We’ll focus on scaling reads using replica sets

Fundamentals of MongoDB High Availability (cont)

MongoDB Replica Set

Fundamentals of MongoDB High Availability (cont)

Heartbeat among replica set members

Fundamentals of MongoDB High Availability (cont)

Automated election of Primary in the event of failure
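A short Python sketch of the arithmetic behind elections: a new primary can only be elected when a majority of the voting members can reach each other. The function names are illustrative and are not part of any MongoDB API.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """How many voting members can be lost while an election is still possible."""
    return voting_members - majority(voting_members)


def can_elect_primary(voting_members: int, reachable: int) -> bool:
    """True when the reachable members form a majority of the set."""
    return reachable >= majority(voting_members)


for n in (3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
```

Note that going from three to four voting members does not increase the number of failures the set can tolerate, which is one reason odd member counts are common.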

Fundamentals of MongoDB High Availability (cont)

> rs.config()
{
  "_id" : "replica_set_name",
  "version" : 105978,
  "protocolVersion" : NumberLong(1),
  "members" : [
    {
      "_id" : 0,
      "host" : "ip_or_hostname:port_number",
      "arbiterOnly" : false,
      "buildIndexes" : true,
      "hidden" : false,
      "priority" : 1,
      "tags" : {},
      "slaveDelay" : NumberLong(0),
      "votes" : 1
    },
    { … }
  ]
}

Challenges of Running Stateful Services on Kubernetes

● Kubernetes is amazing at running and rescheduling containers with Pods, Replication Controllers, and Services

● Stateless services are easiest to manage but...

● Sometimes we need things to maintain state

Challenges of Running MongoDB on Kubernetes

● Kubernetes Replication Controller

○ Ensures the requisite number of Pods are scheduled

○ Doesn’t guarantee a consistent hostname or IP address

● MongoDB replica set configuration uses well defined endpoints

○ Remember that rs.conf() output?

○ Updating replica set configuration by hand feels dirty and you’re a bad person if you want to do that.

● MongoDB data should persist

○ If not, when a new replica set member comes up there will be a full sync

○ Kubernetes manages the scheduling but who needs the full sync?

Built-in Primitives to the Rescue

● Kubernetes


○ Replication Controller + Service

○ Label selector allows for consistent association between pod and service hostname / IP

○ Persistent volume can be defined

■ Allows GCE volumes to move with Pod

■ Result: No need to do expensive resync of data

● MongoDB

○ Take advantage of service endpoint when defining replica set

■ Service IP or DNS
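As a rough illustration of using service endpoints in the replica set definition, the Python sketch below builds a configuration document whose member hosts are service DNS names instead of pod IPs. Only `mongo1-service` and port 27017 come from the slides; the set name `rs0` and the other two service names are assumptions, and the helper `replica_set_config` is hypothetical.

```python
import json
from typing import Dict, List


def replica_set_config(set_name: str, service_names: List[str], port: int = 27017) -> Dict:
    """Build a replica set config document with one member per service name."""
    members = [
        {"_id": index, "host": f"{name}:{port}"}
        for index, name in enumerate(service_names)
    ]
    return {"_id": set_name, "members": members}


# "rs0", "mongo2-service" and "mongo3-service" are assumed names for illustration.
config = replica_set_config("rs0", ["mongo1-service", "mongo2-service", "mongo3-service"])
print(json.dumps(config, indent=2))
```

A document of this shape is what one would pass to `rs.initiate()` in the mongo shell. Because each host is a stable service name, the configuration does not need to change when a pod is rescheduled.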

Parting Thoughts

● MongoDB has good resilience in the face of failure, but be sure to test different failure scenarios.

● Docker is great to work with, but make sure your development workflows, tools, and harnesses are adapted to build and run apps with it.

● Kubernetes is still relatively young but maturing quickly. You need to carefully evaluate whether you want to roll your own platform with it or instead rely on a hosted service like Google Container Engine.

Questions?

… and thanks!

Dan Worth (@djworth)

Brian McNamara (@mcnamarabrian / @cloudyops)
