KubeVirt: Kubernetes, Virtualization and Your Future Data Center
Itamar Heim, Sr. Director & Fabian Deutsch, Associate Manager, Red Hat
OpenShift Commons Briefing, August 24, 2017
KubeVirt
● Upstream research project
● Converged Kubernetes infrastructure
● Containers and virtual machines
● Still early days, but interesting concepts
http://kubevirt.io/
Background
● KVM - a VM is just a user process
● oVirt - open source enterprise virtualization
● OpenStack - Infrastructure-as-a-Service (IaaS) cloud
● Kubernetes - deployment, scaling, and management of containers
So, If...
● VMs are just user processes
  ○ VMs and containers already share some isolation technologies - SELinux, cgroups
● Kubernetes manages clustered containers, which are user processes
● Can we get to a converged infrastructure?
Why Converged Infrastructure?
● Environments will co-exist over time
  ○ While many new workloads will move to containers, virtualization will remain for the foreseeable future. The same goes for on-premises vs. public cloud.
● Unified infrastructure will (should) be easier to maintain and operate, and will reduce costs
● Migrating workloads from VMs to containers happens on the same infrastructure
  ○ Can also benefit from local affinity between VM and container workloads
● VMs can benefit from advanced Kubernetes concepts (load balancing, rolling deployment, etc.)
● Enhances Kubernetes on-premise and bare metal use cases
Use Cases
● Run a container workload in a VM
  ○ Better isolation
● Virtualization as in oVirt, OpenStack, etc.
  ○ Leverage Kubernetes
  ○ Run a full-fledged/featured VM
Not So Fast...
Goals
● Feature-complete virtualization API (without contradictions or container-workload-related limits)
  ○ … for consumption by higher layers (i.e. UI, automation, SDK)
● Well behaving citizen on Kubernetes (technically and community wise)
● Production stable on all levels (runtime up to public API)
● With a native look and feel
Research focus
● What virtualization API is needed?
● Where should the runtime (libvirt/KVM) reside and how should it work?
● How should the integration into Kubernetes look?
● Kubernetes Gaps
Prior art
● virtlet
  ○ Pod API to run VMs
● runv
  ○ Pod API to run pods inside VMs, for isolation
● Clear Containers / oci-cc-runtime
  ○ Pod API to run pods inside VMs, for isolation
API
● Virtlet, runv, and Clear Containers derive the VM from the pod spec
  ○ Allows creating VMs for isolation
  ○ Gets cumbersome when the goal is creating ABI-stable or specific VMs
● Dedicated API for virtualization
  ○ CRD now; working on a User API Server for custom (sub-)resource types
  ○ Allows defining VM resources and actions
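For illustration, registering such a VM type as a CRD might look roughly like the sketch below, using the 2017-era apiextensions v1beta1 API. The group and kind mirror the example on the next slide, but this manifest is an assumption written for illustration, not taken from the KubeVirt sources:

```yaml
# Illustrative sketch only - registering a "VM" custom resource type.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: vms.kubevirt.io
spec:
  group: kubevirt.io
  version: v1alpha1
  scope: Namespaced
  names:
    plural: vms
    singular: vm
    kind: VM
```

Once such a type is registered, stock tooling (`kubectl get vms`, watches, RBAC) works against it like any built-in resource.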
Current API (Example)
kind: VM
apiVersion: kubevirt.io/v1alpha1
metadata:
  name: testvm
spec:
  nodeSelector:
    kubernetes.io/hostname: master
  domain:
    name: testvm
    type: qemu
    memory:
      unit: MB
      value: 64
    vcpus:
      value: 4
    devices:
      disks:
      - type: PersistentVolumeClaim
        source:
          name: disk-01
Slide callouts: the domain section is the libvirt portion; nodeSelector maps to Kubernetes scheduling; the VM runs inside a Kubernetes pod; disks map to Kubernetes volumes.
Runtime
● Virtlet, runv, and Clear Containers use a CRI/OCI runtime at the node level
● Containerized runtime
  ○ Libvirt as the underlying runtime - proven, stable, and feature-rich
  ○ Libvirt and qemu in a container
  ○ VMs are moved into the resource group of pods for proper accounting
  ○ Independent runtime life-cycle, no node OS dependencies (except kmods)
  ○ VMs become a (nearly) container workload
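As a sketch of what the containerized-runtime deployment side could look like - the names, images, and fields here are illustrative assumptions, not the actual KubeVirt manifests:

```yaml
# Illustrative only: run virt-handler plus libvirtd/qemu on every node.
apiVersion: extensions/v1beta1   # DaemonSet API group as of 2017
kind: DaemonSet
metadata:
  name: virt-handler
spec:
  template:
    metadata:
      labels:
        app: virt-handler
    spec:
      hostPID: true              # VM processes must be visible so they can be
                                 # moved into the VM pod's cgroup for accounting
      containers:
      - name: virt-handler
        image: kubevirt/virt-handler   # hypothetical image name
      - name: libvirtd
        image: kubevirt/libvirtd       # hypothetical image name
        securityContext:
          privileged: true       # libvirt/qemu need device and cgroup access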
Integration
● Virtlet, runv, and Clear Containers integrate (mainly) at the node level
● Kubernetes add-on; now a CRD → a custom API server is a WIP
  ○ VM-specific user API server, for API server aggregation
  ○ CRD/UAS is the recommended way to extend the Kubernetes API
  ○ Permits reuse of stock Kubernetes resources at the API level
    ■ I.e. volumes backed by a qemu-supported protocol can be consumed directly by qemu, without going through the kubelet
Architecture
Example: Launching a VM
$ kubectl create -f pod.yaml
$ kubectl create -f vm.yaml
Awaiting …
[Diagram: a cluster with the VM CRD/UAS in the API, virt-controller running as a Deployment, and virt-handler plus libvirtd running as a DaemonSet on each node]

Create VM
[Diagram: a VM object is created via the VM CRD/UAS]

Watch VM
1: Controller watches VM life-cycle

Schedule pod
1: Controller watches VM life-cycle
2: Controller creates a pod for the VM; kube-scheduler schedules the pod

Launch domain
1: Controller watches VM life-cycle
2: Controller creates a pod for the VM; kube-scheduler schedules the pod
3: Handler monitors and reacts to VM object state changes
[Diagram: virt-handler asks libvirtd to launch the qemu domain; the qemu process lives only in the cgroup of the VM pod]

Repeat
1: Controller watches VM life-cycle
2: Controller creates a pod for the VM; kube-scheduler schedules the pod
3: Handler monitors and reacts to VM object state changes
[Diagram: the same flow repeats for additional VM pods]
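The controller side of steps 1-3 follows the standard Kubernetes reconcile pattern: observe the desired state (VM objects) and drive the actual state (pods) toward it. Below is a minimal, self-contained Python simulation of that pattern; KubeVirt itself is written in Go, and every name and object shape here is invented for the example:

```python
# Minimal simulation of the virt-controller reconcile pattern:
# watch VM objects and ensure each running VM has a backing pod.

def reconcile(vms, pods):
    """Return the pods that should exist for the given VM objects.

    vms:  dict of VM name -> desired state ("Running" or "Stopped")
    pods: dict of VM name -> pod object (the current cluster state)
    """
    desired = dict(pods)
    for name, state in vms.items():
        if state == "Running" and name not in desired:
            # Step 2: create a VM pod; kube-scheduler places it on a node,
            # where virt-handler asks libvirtd to launch the qemu domain.
            desired[name] = {"name": f"virt-launcher-{name}", "phase": "Pending"}
        elif state == "Stopped" and name in desired:
            # Tearing the VM down removes its pod (and the qemu process with it).
            del desired[name]
    return desired

vms = {"testvm": "Running", "oldvm": "Stopped"}
pods = {"oldvm": {"name": "virt-launcher-oldvm", "phase": "Running"}}
print(reconcile(vms, pods))
# -> {'testvm': {'name': 'virt-launcher-testvm', 'phase': 'Pending'}}
```

The loop is level-triggered: it compares desired and actual state on every pass rather than reacting to individual events, which is what makes the "Repeat" slide above cheap to reason about.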
Technical gaps
● Depending on the area, a lot of functionality is already there.
● Gaps need to be solved upstream (if possible)
○ Technically
○ Conceptually
Detailed Gaps (examples)
● Clustering
  ○ Resource protections (exclusiveness, fencing)
● Host
  ○ Host life-cycle management
  ○ Device management
● Compute
  ○ Process-aware CPU/NUMA pinning
  ○ Dynamic SLA (hotplug CPU/RAM)
● Network
  ○ Layer 2 network vs. pod IP
● Storage
  ○ Multipath
  ○ Advanced operations (cloning, snapshots)
● Scheduling
  ○ Resource driven
  ○ Custom metrics
  ○ Rescheduling
  ○ Modularity of scheduling units
● Infrastructure
  ○ kubectl plugins
  ○ Add-on formalization
  ○ UAS-native object storage in Kubernetes
Thoughts so far
● API granularity and focus
  ○ How to: simple-as-a-pod (few data points), workload-types-for-free-for-VMs
● Still need to think through the kubelet/libvirt relationship for VM process ownership (cgroups, sVirt, NUMA pinning, PCI passthrough)
  ○ Collides with Kube's process and resource ownership model
● Kube still has gaps to deliver all required functionality
● The operator pattern works out nicely
Summary: Continue with research
● Different use cases - we are focusing on the full virt/cloud one
● While early, the potential for convergence is promising
● Looks like a win-win - benefits for both Kubernetes and virt
● But we need to balance virt expectations against the Kubernetes way of doing things
● A POC is easy - an enterprise-class solution is hard
https://github.com/kubevirt/demo (with minikube)
$ minikube start --vm-driver kvm --network-plugin cni
$ git clone -b openshift-commons-briefing-201708 \
https://github.com/fabiand/kubevirt-demo.git
$ cd kubevirt-demo
$ bash run-demo.sh
Questions?