sneha-inguva
digitalocean.com
about me
software engineer @DigitalOcean
delivery team
kubernetes, prometheus, terraform
what is a container?
“a lightweight OS-level virtualization method”
“stand-alone piece of executable software”
“NOT a virtual machine”
build your own container
1. run input commands with arguments
2. add hostname limitations
3. add process ID limitations
4. add mount point/filesystem limitations
let’s start with a basic “container”
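package main

import (
	"fmt"
	"os"
	"os/exec"
)

// usage: <binary> run <command> [args...]
func main() {
	switch os.Args[1] {
	case "run":
		run()
	default:
		panic("what?")
	}
}

// run executes the requested command, wired to our stdin, stdout, and stderr.
func run() {
	fmt.Printf("running %v\n", os.Args[2:])

	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	must(cmd.Run())
}

// must panics on any non-nil error.
func must(err error) {
	if err != nil {
		panic(err)
	}
}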
UTS namespace
func run() {
	fmt.Printf("running %v\n", os.Args[2:])

	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS,
	}

	must(cmd.Run())
}
UTS + PID namespace: attempt 1
func run() {
	fmt.Printf("running %v\n", os.Args[2:])

	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID,
	}

	must(cmd.Run())
}
UTS + PID namespace: attempt 2
// main() also needs a `case "child": child()` branch for this to work.
func run() {
	// Re-execute this same binary inside the new namespaces, as "child".
	cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID,
	}

	must(cmd.Run())
}

// child runs inside the new namespaces and launches the requested command.
func child() {
	fmt.Printf("running %v as pid %v\n", os.Args[2:], os.Getpid())

	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	must(cmd.Run())
}
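The re-exec step in attempt 2 only works if the program also recognizes a "child" subcommand. The following is a minimal sketch of that routing, not the original demo code: the helper names `route` and `childArgs` and the example argument vector are hypothetical, chosen only to show how "run ..." is rewritten into "child ..." before the binary calls itself through /proc/self/exe.

```go
package main

import (
	"errors"
	"fmt"
)

// childArgs rewrites the parent's arguments for the re-executed binary:
// [prog run cmd args...] becomes [child cmd args...].
// (Hypothetical helper; callers must ensure len(args) >= 2.)
func childArgs(args []string) []string {
	return append([]string{"child"}, args[2:]...)
}

// route reports which handler an argument vector should reach.
// (Hypothetical helper; the slides switch on os.Args[1] directly.)
func route(args []string) (string, error) {
	if len(args) < 3 {
		return "", errors.New("usage: <run|child> <command> [args...]")
	}
	switch args[1] {
	case "run", "child":
		return args[1], nil
	default:
		return "", fmt.Errorf("unknown subcommand %q", args[1])
	}
}

func main() {
	// Example argument vector standing in for os.Args.
	argv := []string{"container", "run", "/bin/sh", "-c", "hostname"}

	handler, err := route(argv)
	if err != nil {
		panic(err)
	}
	fmt.Println("handler:", handler)
	fmt.Println("re-exec args:", childArgs(argv))
}
```

In a real program, the "run" handler would pass the result of childArgs to exec.Command("/proc/self/exe", ...) with the clone flags set, and the "child" handler would run the actual command.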
UTS + PID + MNT namespace: attempt 1
func run() {
	cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...) // link to currently running process
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}

	must(cmd.Run())
}
UTS + PID + MNT namespace: attempt 1
Initial mounts in MNT namespace inherited from creating namespace → filesystem same as host
next step: UTS + PID + MNT namespace + new root filesystem
example
func child() {
	fmt.Printf("running %v as pid %v\n", os.Args[2:], os.Getpid())

	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin = os.Stdin
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	// Switch to the new root filesystem, then mount /proc inside it.
	must(syscall.Chroot("/home/rootfs"))
	must(os.Chdir("/"))
	must(syscall.Mount("proc", "proc", "proc", 0, ""))

	must(cmd.Run())
}
TODO
what is a container?
process with isolation, shared resources, and layered filesystems
what is a container?
namespace: Linux kernel feature that isolates and virtualizes system resources for a collection of processes and their children
● PID: gives a process its own view of a subset of system processes ✔
● MNT: gives a process its own mount table and allows it to have its own filesystem ✔
● NET: gives a process its own network stack (a container can have virtual ethernet pairs to link to the host or other containers)
● UTS: gives a process its own view of the system hostname and domain name ✔
● IPC: isolates inter-process communication (e.g. message queues)
● USER: more recent namespace that maps process UIDs to a different set of UIDs on the host (can map a container's root UID to an unprivileged UID on the host)
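Each namespace in the list above corresponds to one CLONE_NEW* flag, and the flags are OR-ed together into SysProcAttr.Cloneflags. The sketch below illustrates that mapping under stated assumptions: the helper `cloneFlags` and the short names ("uts", "pid", ...) are hypothetical, and the constant values are copied from the Linux headers so the example compiles on any platform. Real code on Linux would use syscall.CLONE_NEWUTS and friends directly.

```go
package main

import (
	"fmt"
	"strings"
)

// Values of the Linux CLONE_NEW* flags, mirrored here so the sketch is
// portable; on Linux prefer the constants from the syscall package.
const (
	cloneNewNS   uintptr = 0x00020000 // MNT
	cloneNewUTS  uintptr = 0x04000000 // UTS
	cloneNewIPC  uintptr = 0x08000000 // IPC
	cloneNewUser uintptr = 0x10000000 // USER
	cloneNewPID  uintptr = 0x20000000 // PID
	cloneNewNet  uintptr = 0x40000000 // NET
)

// namespaceFlags maps the (hypothetical) short names to their flags.
var namespaceFlags = map[string]uintptr{
	"mnt":  cloneNewNS,
	"uts":  cloneNewUTS,
	"ipc":  cloneNewIPC,
	"user": cloneNewUser,
	"pid":  cloneNewPID,
	"net":  cloneNewNet,
}

// cloneFlags ORs together the flags for the named namespaces.
func cloneFlags(names ...string) (uintptr, error) {
	var flags uintptr
	for _, name := range names {
		flag, ok := namespaceFlags[strings.ToLower(name)]
		if !ok {
			return 0, fmt.Errorf("unknown namespace %q", name)
		}
		flags |= flag
	}
	return flags, nil
}

func main() {
	// The three namespaces marked ✔ in the list above.
	flags, err := cloneFlags("UTS", "PID", "MNT")
	if err != nil {
		panic(err)
	}
	fmt.Printf("Cloneflags: %#x\n", flags)
}
```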
what is a container?
cgroups: control groups collect a set of process/task IDs together and apply limits to them, such as on resource utilization
● Enforce fair/unfair resource sharing between processes
● Exposed by the kernel as a special filesystem to mount
● Add a process or thread by adding its ID to the tasks file, and read/configure values by editing the files in the subdirectory
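Since cgroups are driven through plain files, limiting a process comes down to a few writes. The sketch below assumes the cgroup v1 memory controller layout (a `memory.limit_in_bytes` file and a `tasks` file in the group's directory); the helper names and the 64 MiB limit are hypothetical, and a temporary directory stands in for a real cgroup directory so the example runs without root.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// cgroupWrite is one "write this value to this file" step.
type cgroupWrite struct {
	path  string
	value string
}

// cgroupWrites lists, in order, the writes that would cap memory for pid
// in the cgroup directory dir (cgroup v1 file names are assumed).
func cgroupWrites(dir string, pid int, memBytes int64) []cgroupWrite {
	return []cgroupWrite{
		{filepath.Join(dir, "memory.limit_in_bytes"), strconv.FormatInt(memBytes, 10)},
		{filepath.Join(dir, "tasks"), strconv.Itoa(pid)},
	}
}

// applyWrites performs the writes, stopping at the first error.
func applyWrites(writes []cgroupWrite) error {
	for _, w := range writes {
		if err := os.WriteFile(w.path, []byte(w.value), 0o644); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Stand-in for a real cgroup directory created under the mounted
	// cgroup filesystem.
	dir, err := os.MkdirTemp("", "demo-cgroup")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	writes := cgroupWrites(dir, os.Getpid(), 64<<20) // 64 MiB limit
	if err := applyWrites(writes); err != nil {
		panic(err)
	}
	for _, w := range writes {
		fmt.Printf("%s <- %s\n", filepath.Base(w.path), w.value)
	}
}
```

On a real host the directory would be created inside the mounted cgroup hierarchy, and the kernel, not the files themselves, would enforce the limit.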
what is a container?
layered filesystems: efficient way to make a copy of the root filesystem for each container
● one of the reasons why it is easy to move containers around
● can “copy on write” (Btrfs)
● can use “union mounts” (aufs, OverlayFS) - way of combining multiple directories
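As an illustration of a union mount, OverlayFS combines one or more read-only lower directories with a writable upper directory, configured through a mount option string. The sketch below only builds that string; the helper `overlayOptions` and all paths are hypothetical, and the actual mount call is left as a comment because it needs Linux and root privileges.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayOptions builds the option string for an OverlayFS mount.
// Lower directories are read-only layers, listed topmost first and
// separated by colons; writes land in upperDir, and workDir is scratch
// space that OverlayFS requires on the same filesystem as upperDir.
func overlayOptions(lowerDirs []string, upperDir, workDir string) string {
	return fmt.Sprintf("lowerdir=%s,upperdir=%s,workdir=%s",
		strings.Join(lowerDirs, ":"), upperDir, workDir)
}

func main() {
	// Hypothetical layer layout: an app layer stacked on a base image.
	opts := overlayOptions(
		[]string{"/layers/app", "/layers/base"},
		"/containers/c1/upper",
		"/containers/c1/work",
	)
	fmt.Println(opts)

	// On Linux, as root, the mount itself would look like:
	//   syscall.Mount("overlay", "/containers/c1/merged", "overlay", 0, opts)
}
```

Each container gets its own upper and work directories while sharing the lower layers, which is what makes per-container copies of the root filesystem cheap.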
containers vs. VMs
Source: http://electronicdesign.com/dev-tools/what-s-difference-between-containers-and-virtual-machines
VMs
● Hypervisors run software on physical servers to emulate a particular hardware system (aka a virtual machine)
● VM runs a full copy of the operating system (OS)
● Hardware is also virtualized
● Can run multiple applications

containers
● Run isolated processes on a single server or host operating system (OS)
● Can migrate only to servers with compatible OS kernels
● Best for a single application
containers
Source: https://docs.docker.com/engine/understanding-docker/
Source: https://coreos.com/rkt/docs/latest/rkt-vs-other-projects.html#rkt-vs-docker
container orchestration
Source: https://github.com/nkhare/container-orchestration/blob/master/kubernetes/README.md
___ as-a-service
container service, managed clusters, etc.
Source: https://coreos.com/tectonic/
sources
● What is a Container, Really?, Liz Rice
● Building a Container in Less than a 100 Lines of Go, Julian Friedman
● My demo code