Denser containers with PFCache
Pavel Emelyanov
ContainerCon, Seattle, 2015
Agenda
• How to store container files
• Why shared template matters
• What can be deduplicated and what should be
• PFCache
• Q&A
How to store container files
[Diagram: storage stack with container processes, filesystem, block device, network, host filesystem, host block device, hardware]
How to store container files (1)
[Diagram: storage stack with container processes, filesystem, block device, network, host filesystem, host block device, hardware]
Options shown: chroot(), Union FS
How to store container files (2)
[Diagram: storage stack with container processes, filesystem, block device, network, host filesystem, host block device, hardware]
Options shown: loop device, ZFS ZVol, Btrfs subvolume, PLoop
What's PLoop
• Loop device plus
– AIO for better performance
– Snapshots
– QCOW2-like format for thin provisioning
– Thin provisioning itself
• Upstreaming work in progress
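As a rough illustration of what thin provisioning means for an image-backed block device, here is a minimal sketch that allocates backing space only when a cluster is first written. It is not the PLoop on-disk format; the `ThinImage` class, its methods and the tiny cluster size are hypothetical names and values chosen for the example.

```python
CLUSTER_SIZE = 4  # bytes per cluster; tiny on purpose, real formats use far larger clusters


class ThinImage:
    """Toy thin-provisioned image: virtual clusters get backing space lazily."""

    def __init__(self, virtual_clusters):
        self.virtual_clusters = virtual_clusters
        self.mapping = {}   # virtual cluster index -> index into self.backing
        self.backing = []   # allocated clusters only (stands in for the image file)

    def _check(self, index):
        if not 0 <= index < self.virtual_clusters:
            raise IndexError("cluster out of range")

    def write_cluster(self, index, data):
        self._check(index)
        if len(data) != CLUSTER_SIZE:
            raise ValueError("data must be exactly one cluster")
        if index not in self.mapping:
            # First write to this cluster: allocate backing space now.
            self.mapping[index] = len(self.backing)
            self.backing.append(bytes(data))
        else:
            # Later writes reuse the cluster that was already allocated.
            self.backing[self.mapping[index]] = bytes(data)

    def read_cluster(self, index):
        self._check(index)
        if index not in self.mapping:
            return bytes(CLUSTER_SIZE)  # unallocated clusters read as zeros
        return self.backing[self.mapping[index]]

    def allocated_bytes(self):
        return len(self.backing) * CLUSTER_SIZE


image = ThinImage(virtual_clusters=1000)
image.write_cluster(7, b"abcd")
image.write_cluster(7, b"wxyz")  # overwrite, no new allocation
```

The image advertises 1000 clusters but consumes backing space for only the one that was written, which is the property that lets many container images share a host without reserving their full size up front.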
How to store container files (3)
[Diagram: storage stack with container processes, filesystem, block device, network, host filesystem, host block device, hardware]
Options shown: LVM, DM-thin
How to store container files (4)
[Diagram: storage stack with container processes, filesystem, block device, network, host filesystem, host block device, hardware]
Options shown: NBD, Ceph RBD, iSCSI
How to store container files (5)
[Diagram: storage stack with container processes, filesystem, block device, network, host filesystem, host block device, hardware]
Options shown: NFS, GFS2, OCFS, Ceph
Containers vs Templates
• Containers ...
– are massively cloned from pre-created “templates”
– do not have direct access to the underlying (block) storage
• Identical data can be effectively deduplicated
– Higher density
– Lower IO and/or memory consumption
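To make the deduplication argument concrete, the sketch below hashes file contents across several containers cloned from one template and compares total bytes with unique bytes. The in-memory dicts standing in for container filesystems, the container names and the function `dedup_savings` are illustrative assumptions, not part of any OpenVZ tool.

```python
import hashlib


def dedup_savings(containers):
    """Return (total_bytes, unique_bytes) over all files in all containers.

    `containers` maps a container name to a dict of {path: file contents}.
    Files with identical contents count once towards unique_bytes.
    """
    total = 0
    unique = {}  # content digest -> size in bytes
    for files in containers.values():
        for data in files.values():
            total += len(data)
            unique[hashlib.sha1(data).hexdigest()] = len(data)
    return total, sum(unique.values())


# A hypothetical template and three containers cloned from it.
template = {"/bin/sh": b"shell-binary", "/lib/libc.so": b"libc-binary-data"}
containers = {
    "ct101": {**template, "/etc/hostname": b"ct101"},
    "ct102": {**template, "/etc/hostname": b"ct102"},
    "ct103": {**template, "/etc/hostname": b"ct103"},
}
total, unique = dedup_savings(containers)
```

Only the per-container hostname files differ here, so most of the bytes are duplicates of template data; the more containers share a template, the larger that fraction becomes.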
Who can do shared templates
Storage OpenVZ Docker LXC
Union FSs + + +
Btrfs +
DM-thin +
PLoop +
Ceph
ZFS +
What can be de-duplicated
[Diagram, built up over three slides: container processes, filesystem, block device and network; then the page cache with its cached pages; then the IO flow]
What is deduplicated
Storage Memory IO
Union FSs + +
Btrfs +/-
DM-thin
PLoop + +
Ceph
ZFS
Additional OpenVZ constraints
• Container disks are independent image files
– Can be easily copied across nodes
– No single (shared) point of failure
• Deduplicated data is volatile
– “Templates” can be lost (e.g. while migrating)
– A shared-data pool that grows too big can be easily shrunk
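The volatility constraint can be sketched as follows, assuming the shared pool is purely an optimization: every container image stays complete on its own, so pool entries can be evicted (or the whole pool lost) without affecting correctness. `SharedPool`, `shrink`, `read_file` and the oldest-first eviction order are hypothetical choices made for this example.

```python
import hashlib


class SharedPool:
    """Toy pool of deduplicated file contents, keyed by SHA-1 digest."""

    def __init__(self):
        self.entries = {}  # digest -> contents; dict order doubles as age

    def add(self, data):
        self.entries[hashlib.sha1(data).hexdigest()] = data

    def size(self):
        return sum(len(data) for data in self.entries.values())

    def shrink(self, limit):
        # Evict oldest entries until the pool fits; any eviction policy would do.
        while self.entries and self.size() > limit:
            oldest = next(iter(self.entries))
            del self.entries[oldest]


def read_file(pool, digest, private_copy):
    """Serve the shared copy if it is still pooled, else the container's own data."""
    return pool.entries.get(digest, private_copy)


libc = b"libc-binary-data"
shell = b"shell-binary"
pool = SharedPool()
pool.add(libc)
pool.add(shell)
pool.shrink(limit=12)  # drops the 16-byte entry, keeps the 12-byte one

libc_after_shrink = read_file(pool, hashlib.sha1(libc).hexdigest(), libc)
```

After the shrink the evicted file is still readable from the container's private copy; only the sharing benefit is lost, which is what allows the pool to be resized or dropped during migration.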
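PF-Cache
[Diagram: container processes use an ext4 filesystem on a PLoop device backed by an image file; a cache link (xattr) connects it to the cache area, itself an ext4 filesystem on a PLoop device backed by an image file]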
Cache and cache link behavior
• Cache area
– target file name is the SHA-1 sum of the file contents
– files are created by user-space daemon
– cache size is limited by ploop
• Cache link
– created automatically upon file creation
– dropped when file is opened for writing
– kept across metadata updates (chown/chmod)
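The rules above can be modelled in a few lines. This is a user-space toy under stated assumptions, not the kernel implementation: `CacheArea` and `ContainerFile` are hypothetical classes, the cache link is a plain attribute rather than an xattr, and the link is computed from the contents the file is created with.

```python
import hashlib


class CacheArea:
    """Toy cache area: each file is stored under the SHA-1 sum of its contents."""

    def __init__(self):
        self.files = {}  # SHA-1 hex digest -> contents

    def populate(self, data):
        # Stands in for the user-space daemon that creates cache files.
        digest = hashlib.sha1(data).hexdigest()
        self.files[digest] = data
        return digest


class ContainerFile:
    """Toy container file with an optional cache link."""

    def __init__(self, data, mode=0o644):
        self.data = data
        self.mode = mode
        # Link created automatically upon file creation.
        self.cache_link = hashlib.sha1(data).hexdigest()

    def open(self, mode="r"):
        if any(flag in mode for flag in "wa+"):
            self.cache_link = None  # opening for writing drops the link
        return self

    def chmod(self, mode):
        self.mode = mode  # metadata update only: the link is kept

    def read(self, cache):
        # Use the shared copy when the link resolves, else the file's own data.
        if self.cache_link is not None and self.cache_link in cache.files:
            return cache.files[self.cache_link]
        return self.data


cache = CacheArea()
libc = ContainerFile(b"libc-binary-data")
cache.populate(b"libc-binary-data")

libc.chmod(0o755)
link_after_chmod = libc.cache_link        # still set

libc.open("w")
link_after_write_open = libc.cache_link   # now None
```

Dropping the link on a write-open is the conservative choice: once a file may change, its contents can no longer be assumed to match the cached copy, so reads go back to the container's own data.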
Future work
• PLoop is available in OpenVZ & Virtuozzo
– Upstream WIP
• IO deduplication in the upstream
– Issue raised at LSFMM 2013
– DM-thin/btrfs IO dedup for containers
– KSM++ for VMs
Thank you