Docker Storage Internals - Amit Shukla

Docker Storage Internals

Amit Shukla, Docker Inc

1

Introduction

• Engineer

• Manage Storage and Orchestration teams @ Docker

• Background in Data storage systems and Distributed systems

• Twitter, Microsoft, Teradata

• Univ of Wisconsin, IIT Madras

• Love building software to solve real problems

• @amits: eclectic stream of geeky things

2

Agenda

• Docker Introduction

• Docker Image storage

• Storage drivers

• Volumes

3

Docker

• Open source projects

• Engine, Swarm, Compose, Machine, Registry, …

• Company

• Shepherd of open source projects

• Commercial: Hub, DTR

4

What are containers?

[Figure: two stacks side by side. VM (host-level virtualization): Server, Host OS, Hypervisor, then a separate Guest OS and Bins/Libs for each of App A, App A’ and App B. Container (OS-level virtualization): Server, Host OS, Docker Engine, then shared Bins/Libs with App A, App A’, App B and several App B’ containers on top.]

5

Docker: portability

6

Demo

• Pull a container image from the registry (Docker Hub)

• Start container

• Dockerfile for container

7

Diving deeper: Image storage

8

Local Image Storage

• Two enabling technologies

• Copy on Write

• Union File Systems

9

Copy on Write

• Common paradigm in Unix

• fork(): start processes by making a copy of an existing process

• Single shared copy of data, until it is modified

• Without copy-on-write, containers

• would take forever to start

• would take a lot of space
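
A toy model of the "single shared copy until modified" idea, in Python. This is only a sketch: `CowBuffer` and its methods are hypothetical names, and neither the kernel's fork() nor Docker's storage drivers are implemented this way.

```python
class _Block:
    """Backing storage that several handles may share."""

    def __init__(self, data):
        self.data = bytearray(data)  # bytearray(...) always makes a fresh copy
        self.refs = 1                # number of handles pointing at this block


class CowBuffer:
    """Handle whose clones share one block until somebody writes."""

    def __init__(self, data=b""):
        self._block = _Block(data)

    def clone(self):
        # Cheap: no bytes are copied, only a reference count is bumped.
        other = CowBuffer.__new__(CowBuffer)
        other._block = self._block
        self._block.refs += 1
        return other

    def read(self):
        return bytes(self._block.data)

    def write(self, offset, payload):
        # First write to a shared block: take a private copy, then modify it.
        if self._block.refs > 1:
            self._block.refs -= 1
            self._block = _Block(self._block.data)
        self._block.data[offset:offset + len(payload)] = payload

    def shares_storage_with(self, other):
        return self._block is other._block


image = CowBuffer(b"hello world")
container = image.clone()        # instant, no extra space used
container.write(0, b"J")         # the copy happens here, on first write
```

After the write, `image.read()` still returns the original bytes, while `container` holds its own modified copy.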

10

Union File Systems

• Represent a logical file system by grouping different directories and/or files (branches)

• Each branch represents a layer in a Docker image

• Allows images to be constructed / deconstructed as needed, instead of shipping a huge monolithic image (à la traditional virtual machines)

• When a container is started a writeable layer is added to the “top” of the file system

11

Layers

12

Example: AUFS

• Layers as “plates of glass”

• Only the topmost plate of glass is writable

• RO => look down from the top and read the file from the first plate of glass that contains it

• RW => copy the file to the topmost plate of glass (copy up), and then open it for write

• Delete => whiteout the file
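
The read / copy-up / whiteout rules above can be sketched with plain dictionaries standing in for layers. This is a toy model under stated assumptions, not AUFS itself: `UnionFS`, `WHITEOUT` and the file paths are hypothetical, and read-only layers here never contain whiteouts of their own.

```python
WHITEOUT = object()  # marker stored in the top layer to hide a lower file


class UnionFS:
    """Read-only layers (dicts of path -> bytes) plus one writable top layer."""

    def __init__(self, image_layers):
        self.lower = list(image_layers)  # bottom layer first, never modified
        self.upper = {}                  # the container's writable layer

    def _lookup_lower(self, path):
        # Walk the read-only plates from the top down; first hit wins.
        for layer in reversed(self.lower):
            if path in layer:
                return layer[path]
        return None

    def read(self, path):
        entry = self.upper[path] if path in self.upper else self._lookup_lower(path)
        if entry is None or entry is WHITEOUT:
            raise FileNotFoundError(path)
        return entry

    def append(self, path, data):
        current = self.upper.get(path)
        if current is None:
            # Copy up: bring the lower file (if any) into the top layer first.
            current = self._lookup_lower(path) or b""
        elif current is WHITEOUT:
            current = b""  # the file was deleted, so start a new one
        self.upper[path] = current + data

    def delete(self, path):
        self.read(path)              # raises if the file is not visible
        self.upper[path] = WHITEOUT  # lower layers stay untouched


base = {"/etc/motd": b"hello\n", "/bin/sh": b"<binary>"}
app = {"/app/run.sh": b"echo hi\n", "/etc/motd": b"welcome\n"}
fs = UnionFS([base, app])
fs.append("/etc/motd", b"bye\n")   # copied up, then modified in the top layer
fs.delete("/bin/sh")               # hidden by a whiteout, still present in base
```

The lookup loop also shows why many layers mean longer traversal: a file that lives only in the bottom layer is found last.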

13

Union File Systems: Advantages

• Memory usage:

• files have same inode => kernel shares copy of file for multiple containers in page cache

• Faster create/destroy times:

• Scanning for KSM (Kernel same-page merging) not needed

• Key to quick startup times for Docker containers

14

Demo

• Explore where images live

15

Storage drivers

• Implement Union file system for image composition and CoW

• “Batteries included but pluggable”

• Flexibility to choose a solution

• AUFS, OverlayFS, Device Mapper, BTRFS, VFS, ZFS

16

AUFS

• Oldest driver, and default for Ubuntu

• Works at the file level

• Scales really well due to memory sharing

• Not in mainline kernel

• Hence the need for linux-image-extra-virtual as part of the install process

• Not well suited for working with large files (logs, databases, etc.)

• Use volumes instead

• Containers with a lot of layers will result in many branches and long traversal times

17

Device Mapper

• Contributed by Red Hat

• Has some nice high end features (RAID, disk encryption, snapshots)

• Works at the block level with thin provisioning

• Can snapshot at any time

• More complex than AUFS

• Working at the block level reduces visibility into the diffs between images and containers

• Does not share pages in memory, so density is lower than AUFS

• Need to be deliberate about tuning to get good perf

• Docker uses loopback devices by default, which perform poorly

• Put data and metadata on real devices

18

Recommended Storage driver?

• No best storage driver :-(

• High-density PaaS => aufs (if available), overlayfs (otherwise)

• Don't put big files in the container file system

• Otherwise use the file system you have the most experience with operating in production

19

Volumes

20

What are Volumes?

• A directory mounted inside my container that exists outside the union file system

• Directory's lifetime is independent of the container lifetime

• Create via Dockerfile or CLI

• Can map to existing directory on host or remote NFS device

• Share data between containers

21

When should I use a volume?

• Recommended almost always

• Faster to write outside the union file system: databases, log files

• Perf study: http://jam.sg/blog/mongodb-docker-part-2/

• Significant increase in IOPS and throughput

• Sharing data between containers

• Sharing data between host and container (e.g., source code)

22

Demo

• Instantiating a volume (example)

• Command line

docker run -d -v /www -p 80:80 --name mynginx nginx

• Dockerfile

FROM nginx

VOLUME /www

23

Demo

• Find Volume located within a container

• Add a file to it

24

• Mounting host directory into container

• Not portable (scenario: source code)

• Data-only containers: subsumed by "docker volume create"

25

Named Volumes

• Named volumes: volumes are managed resources with their own commands

• docker volume create

• docker volume inspect

• docker volume ls

• docker volume rm
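
The three ways of attaching a volume mentioned in these slides (a bare container path, a host directory, a named volume) differ only in what comes before the colon in the `-v` argument. A small sketch that builds the argument list; the helper names, `webdata` and `/srv/site` are hypothetical, and nothing here actually invokes Docker.

```python
def volume_flag(container_path, source=None):
    """Return the `-v` arguments for `docker run` as a list of strings.

    source=None       -> anonymous volume, e.g. -v /www
    source="/srv/x"   -> host directory,   e.g. -v /srv/x:/www
    source="webdata"  -> named volume,     e.g. -v webdata:/www
    """
    if not container_path.startswith("/"):
        raise ValueError("container path must be absolute")
    if source is None:
        return ["-v", container_path]
    return ["-v", f"{source}:{container_path}"]


def volume_kind(source=None):
    """Classify the source part the same way the three slide scenarios do."""
    if source is None:
        return "anonymous volume"
    if source.startswith("/"):
        return "host directory"
    return "named volume"


# Rebuilds the command line from the earlier demo slide.
cmd = ["docker", "run", "-d", *volume_flag("/www"),
       "-p", "80:80", "--name", "mynginx", "nginx"]
print(" ".join(cmd))
```

Only the named form ties the container to a volume managed by the `docker volume` commands; the host-directory form depends on a path existing on that particular host, which is why it is not portable.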

26

Demo (Docker 1.9+ only)

• Create a volume via command line

• Inspect container to find mount point

• Run a script in a container

• Look inside volume directory

• Start a new container that reuses volume

27

Volume Plugins

• Allows Docker to work with other storage backends

• Docker tools work as usual

• Backends: Ceph, EMC, Portworx, Blockbridge

You can write your own:

http://docs.docker.com/engine/extend/plugins_volume/
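
To give a feel for what "write your own" involves: a volume plugin answers JSON requests from the Docker daemon on a handful of endpoints. The sketch below models only the request/response dispatch, in memory, with no HTTP server and no real storage. The endpoint and field names are written as recalled from the plugin documentation linked above and should be checked against it; `ToyVolumePlugin` and `/mnt/toy` are hypothetical.

```python
import json


class ToyVolumePlugin:
    """In-memory stand-in for a volume plugin; it never touches real storage."""

    def __init__(self, root="/mnt/toy"):  # hypothetical mount root
        self.root = root
        self.volumes = {}  # volume name -> mountpoint path

    def handle(self, endpoint, body=""):
        """Take an endpoint path and a JSON request body, return a JSON reply."""
        request = json.loads(body) if body else {}
        name = request.get("Name", "")
        if endpoint == "/Plugin.Activate":
            # Handshake: announce which plugin interface is implemented.
            reply = {"Implements": ["VolumeDriver"]}
        elif endpoint == "/VolumeDriver.Create":
            self.volumes.setdefault(name, f"{self.root}/{name}")
            reply = {"Err": ""}
        elif endpoint == "/VolumeDriver.Remove":
            self.volumes.pop(name, None)
            reply = {"Err": ""}
        elif endpoint in ("/VolumeDriver.Mount", "/VolumeDriver.Path"):
            if name in self.volumes:
                reply = {"Mountpoint": self.volumes[name], "Err": ""}
            else:
                reply = {"Mountpoint": "", "Err": f"no such volume: {name}"}
        elif endpoint == "/VolumeDriver.Unmount":
            reply = {"Err": ""}
        else:
            reply = {"Err": f"unsupported endpoint: {endpoint}"}
        return json.dumps(reply)


plugin = ToyVolumePlugin()
plugin.handle("/VolumeDriver.Create", '{"Name": "data"}')
print(plugin.handle("/VolumeDriver.Mount", '{"Name": "data"}'))
```

A real backend would provision and attach storage in the Create and Mount handlers; because the daemon only sees the returned mountpoint, the usual Docker tools keep working unchanged.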

28

Summary

• Docker overview

• Image storage

• Volumes

29

Questions?

30
