KUBERNETES AND OPENSTACK AT SCALE
Will it blend?
Stephen Gordon (@xsgordon)
Principal Product Manager, Red Hat
May 8th, 2017
KUBERNETES AND OPENSTACK AT SCALE #OPENSTACKSUMMIT #REDHAT
ONCE UPON A TIME...
Part 1
● 1000 OpenShift Container Platform 3.3 / Kubernetes 1.3 nodes on OpenStack infrastructure
● Presented methodology and results in Barcelona:
○ https://www.cncf.io/blog/2016/08/23/deploying-1000-nodes-of-openshift-on-the-cncf-cluster-part-1/
● Goals were:
○ Push limits
○ Identify best practices
○ Document best practices
○ Fix issues
FOR OUR NEXT TRICK!
Part 2
● Goals:
○ 2048 OpenShift Container Platform 3.5 / Kubernetes 1.5 nodes on OpenStack infrastructure
○ Network ingress tier saturation test
○ Overlay2 graph driver w/ SELinux test
○ Persistent volume scalability and performance test of Container Native Storage (glusterfs)
KUBERNETES SCALABILITY SIG
Scalability SIG SLAs:
● API responsiveness
○ 99% of calls return in < 1 s
● Pod startup time
○ 99% of pods start within 5s*
The SIG also defines a number of other primary and derived metrics.
* With pre-pulled images
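The two SLAs above can be checked against a set of measurements. The following is a minimal Python sketch, not the Scalability SIG's own tooling. The function name `meets_sla`, the strict `<` comparison at the threshold, and the sample data are all assumptions made for illustration.

```python
from typing import Sequence


def meets_sla(samples_s: Sequence[float], threshold_s: float,
              required_percent: int = 99) -> bool:
    """Return True if at least `required_percent` % of samples are under the threshold.

    Assumption: a sample exactly at the threshold counts as a miss.
    """
    if not samples_s:
        raise ValueError("need at least one sample")
    within = sum(1 for s in samples_s if s < threshold_s)
    # Integer arithmetic avoids floating-point rounding at the 99% boundary.
    return within * 100 >= required_percent * len(samples_s)


if __name__ == "__main__":
    # Hypothetical latencies in seconds, not measurements from this test run.
    api_calls = [0.2] * 99 + [1.5]
    pod_starts = [3.0] * 98 + [7.0, 9.0]
    print("API responsiveness SLA met:", meets_sla(api_calls, 1.0))
    print("Pod startup SLA met:", meets_sla(pod_starts, 5.0))
```

With the hypothetical data, 99 of 100 API calls fall under 1 s, so the first check passes, while only 98 of 100 pod starts fall under 5 s, so the second fails.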
A CONTAINER STACK FOR OPENSTACK
A wild solution appears...
OPENSTACK + KUBERNETES
Consumption of resources
Able to easily access new environments to
quickly build new apps and move on
Exposition of resources
Provide necessary environments to developers
in minutes, not weeks or months
A CONTAINER STACK FOR OPENSTACK
A wild solution appears...
OPENSTACK + OPENSHIFT
Consumption of resources
Integrated platform to run, orchestrate,
monitor, and scale containers. Built around
Kubernetes and Docker.
Exposition of resources
Provide necessary environments to developers
in minutes, not weeks or months
CONCEPTUAL ARCHITECTURE
Architectural tenets:
● Technical independence
● Contextual awareness
● Avoiding redundancy
● Simplified management
Reference architecture:
red.ht/2ibNmvX
PREPARATION
WHERE TO TEST?
HOW TO TEST?
System Verification Test suite (SVT)
● Red Hat OpenShift Performance and Scalability team’s
upstream test suites:
○ Application Performance
○ Application Scalability
○ OpenShift Performance
○ OpenShift Scalability (incl. cluster-loader)
○ Networking Performance
○ Reliability/Longevity
● Also includes some additional tools e.g. image provisioner
● https://github.com/openshift/svt
ARCHITECTURE
Baremetal Cluster (100 nodes)
OpenShift-on-OpenStack Cluster (2048 nodes)
ARCHITECTURE (cont.)
● Software:
○ Red Hat OpenStack Platform 10, based on “Newton”
○ OpenShift Container Platform 3.5 (built around K8S 1.5)
○ Red Hat Enterprise Linux 7.3 (mostly…)
● Deployment:
○ Deployed OpenStack + Ceph using TripleO
○ Deployed OpenShift Container Platform using openshift-ansible
● Applying previous learnings:
○ Storage architecture
○ Image formatting
○ Pre-baked images (see image_provisioner tool)
NETWORK INGRESS/ROUTING
NETWORK INGRESS/ROUTING TIER
Testing HAProxy Performance
● Load generator itself runs in a pod.
● Added SNI and TLS variants to the test suite.
● Configured by passing in ConfigMaps.
● Focused on HTTP with keepalive and TLS terminated at the edge.
projects:
  - num: 1
    basename: centos-stress
    ifexists: delete
    tuning: default
    templates:
      - num: 1
        file: ./content/quickstarts/stress/stress-pod.json
        parameters:
          - RUN: "wrk" # which app to execute inside WLG pod
          - RUN_TIME: "120" # benchmark run-time in seconds
          - PLACEMENT: "test" # Placement of the WLG pods based on node label
          - WRK_DELAY: "100" # maximum delay between client requests in ms
          - WRK_TARGETS: "^cakephp-" # extended RE (egrep) to filter target routes
          - WRK_CONNS_PER_THREAD: "1" # how many connections per worker thread/route
          - WRK_KEEPALIVE: "y" # use HTTP keepalive [yn]
          - WRK_TLS_SESSION_REUSE: "y" # use TLS session reuse [yn]
          - URL_PATH: "/" # target path for HTTP(S) requests
NETWORK INGRESS/ROUTING TIER
Testing HAProxy Performance (cont.)
● 1p-mix-cpu*: nbproc=1, run on any CPU
● 1p-mix-cpu0: nbproc=1, run on core 0
● 1p-mix-cpu1: nbproc=1, run on core 1
● 1p-mix-cpu2: nbproc=1, run on core 2
● 1p-mix-cpu3: nbproc=1, run on core 3
● 1p-mix-mc10x: nbproc=1, run on any core, sched_migration_cost=5000000
● 2p-mix-cpu*: nbproc=2, run on any core
● 4p-mix-cpu02: nbproc=4, run on core 2
NETWORK
NETWORK PERFORMANCE
Testing OpenShift-sdn (OVS+VXLAN) Performance
● OpenShift includes and uses OpenShift-sdn (Open vSwitch + VXLAN) by default:
○ Provides full multi-tenancy
○ Is fully pluggable (as is ingress/routing tier)
○ Supports all four footprints (physical/virtual/private/public)
● Web-based workloads are mostly transactional
● Focused microbenchmark on a ping-pong test of varying payload sizes
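A ping-pong test sends a fixed-size request, waits for a same-size response, and repeats. The following is a minimal Python sketch of that pattern for several payload sizes. It is an illustration only, not the tool used in these tests. It runs both ends in one process over `socket.socketpair()`, so it exercises no real network path. The names `ping_pong`, `echo_peer` and `recv_exact`, and the payload sizes, are hypothetical.

```python
import socket
import threading
import time


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since recv() may return fewer."""
    buf = bytearray()
    while len(buf) < size:
        chunk = conn.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def echo_peer(conn: socket.socket, size: int, rounds: int) -> None:
    """Answer each fixed-size request with a same-size response."""
    for _ in range(rounds):
        conn.sendall(recv_exact(conn, size))


def ping_pong(size: int, rounds: int = 200) -> float:
    """Return request/response transactions per second for one stream."""
    client, server = socket.socketpair()
    peer = threading.Thread(target=echo_peer, args=(server, size, rounds),
                            daemon=True)
    peer.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        client.sendall(payload)    # ping
        recv_exact(client, size)   # pong
    elapsed = time.perf_counter() - start
    peer.join()
    client.close()
    server.close()
    return rounds / elapsed


if __name__ == "__main__":
    for size in (64, 1024, 16384):  # hypothetical payload sizes in bytes
        print(f"{size:>6} B: {ping_pong(size):.0f} transactions/s")
```

Reporting transactions per second rather than bandwidth matches the transactional nature of web workloads noted above: each round trip counts once, whatever the payload size.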
NETWORK PERFORMANCE
Testing OpenShift-sdn (OVS+VXLAN) Performance (cont.)
● Tested mix of payload sizes and stream counts.
● tcp_rr-XXB-Yi
○ XX = # of bytes
○ Y = # of instances (streams)
● Slimmed-down version of RFC 2544
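The tcp_rr-XXB-Yi naming scheme can be decoded mechanically when post-processing results. The following is a small Python sketch, assuming names follow the pattern exactly as written above. The helper `parse_test_name` and the sample name `tcp_rr-64B-4i` are hypothetical and not taken from the test suite.

```python
import re
from typing import NamedTuple


class TcpRrTest(NamedTuple):
    payload_bytes: int  # XX in tcp_rr-XXB-Yi
    instances: int      # Y in tcp_rr-XXB-Yi (number of streams)


# Assumed pattern: literal "tcp_rr-", digits, "B-", digits, "i".
_NAME = re.compile(r"^tcp_rr-(\d+)B-(\d+)i$")


def parse_test_name(name: str) -> TcpRrTest:
    """Split a test name into payload size and stream count."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a tcp_rr-XXB-Yi test name: {name!r}")
    return TcpRrTest(int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    print(parse_test_name("tcp_rr-64B-4i"))
```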
STORAGE
OVERLAY2 w/ SELINUX
Next on storage wars...
● Until recently, RHEL used Device Mapper as docker’s storage graph driver
○ Overlay support added in RHEL 7.2
○ Overlay2 support added in RHEL 7.3
○ Overlay2 support w/ SELinux added upstream and expected in RHEL 7.4
■ https://lkml.org/lkml/2016/7/5/409
○ Device Mapper remains the default in RHEL for now; Overlay2 is the default in Fedora 26
■ https://fedoraproject.org/wiki/Changes/DockerOverlay2
● Let’s try it out!
OVERLAY2 w/ SELINUX
Results
● Single base image for all pods
● 240 pods on the node (rate limited creation)
● Reasonable memory savings
CONTAINER NATIVE STORAGE
Approach
● OpenShift Container Platform supports a wide variety of volume providers
via the standard Kubernetes volume interface
● Red Hat Container Native Storage is a Gluster-based persistent volume
provider deployed on OpenShift
● Used the NVMe disks as “bricks” for Gluster, exposed 1G persistent
volumes
● Container Native Storage nodes marked unschedulable for other OpenShift
pods
● Ran throughput numbers for create/delete operations, as well as API
parallelism
CONTAINER NATIVE STORAGE
Results
● CNS allocated volumes in constant time
● Consistent with results for other persistent volume providers
NEXT STEPS
NEXT STEPS
To infinity, and beyond!
● Filed 40+ bugs across a variety of projects and components
● Scaling and Performance Guide, new with OpenShift Container Platform 3.5
● Getting Involved
○ “Kubernetes Ops on OpenStack” forum session
■ Wednesday, May 10, 1:50pm-2:30pm
■ Hynes Convention Center MR102
○ K8S SIG Scalability
○ K8S SIG OpenStack
REFERENCES
● Part 1: https://www.cncf.io/blog/2016/08/23/deploying-1000-nodes-of-openshift-on-the-cncf-cluster-part-1/
● Part 2: https://www.cncf.io/blog/2017/03/28/deploying-2048-openshift-nodes-cncf-cluster-part-2/
● Overlay2 and Device Mapper: https://developers.redhat.com/blog/2016/10/25/docker-project-can-you-have-overlay2-speed-and-density-with-devicemapper-yep/
● Red Hat Performance and Scale Trello: https://trello.com/b/M1bpo55E/scalability
THANK YOU