Docking postgres

Docker

• Based on containerization – in Linux kernel since 2008

• Platform to deploy and run lightweight virtualized servers

• Initial release in 2013

• Explosive growth in 2014

• Becoming a de facto standard for Linux virtualization

• Evolution of purpose – core idea was a regularized one-size-fits-all approach to managing virtualized environments. Became an ecosystem: platform, delivery service, tools.

From this

To This

Container Analogy

• Shipping containers (intermodal freight containers)

  • Standardized dimensions (20’ X 8’)

  • Standardized hooks for hoisting and moving

  • One-size-fits-all, BUT… carefully chosen to handle 98% of transportation needs

  • And if it doesn’t fit, it can be made to fit (ship in pieces and reassemble – still saves $$$)

• Docker containers

  • Standardized footprint (10G filesystem by default)

  • Standardized methods to deploy – doesn’t matter what’s inside (start/stop/snapshot/export/import/destroy)

  • Make it fit: many complex systems can be decomposed into orchestrated groups of containers

Virtualization approaches

[Diagram: four approaches running on a host OS]

• Virtual Machine (VMware, VirtualBox, Xen)

  • Pros: complete isolation, full machine mimicry, run any OS

  • Cons: performance hit, heavyweight deployment

• Jailed System (BSD Jails, Solaris Zones)

  • Pros: native performance, easy deployment, full system init

  • Cons: IT’S NOT LINUX (& some nitpicks about IPC)

• Containers (Docker, CoreOS, LXC)

  • Pros: native performance, stripped down, MANY options

  • Cons: limited interaction by design

• Unikernel (MirageOS???)

  • Pros: stripped down, better than native performance for some tasks

  • Cons: ?? Need more info

Case Study: Client X

• Needs

  • Database-as-a-service, SaaS model

  • High-throughput, update-intensive, lots of JSON data

  • Replication, failover, PITR, etc…

  • Flexible roll-out and deployment of many instances (some multi-tenant, some dedicated)

  • Redundancy across physical machines

• Infrastructure

  • Essentially the largest x86-based servers available

  • Essentially the fastest hard drive storage available

  • Essentially the fastest network throughput available

  • 2 availability zones, 4 machines

OK, You want Details

• Cores: 60 (120 with hyperthreading)

• RAM: 3TB (with parity)

• Onboard storage

  • 200GB SATA array (OS and applications)

  • 3TB FusionIO IODrive2 RAID ($PGDATA, indexes, WAL)

• Remote storage

  • 55TB Invicta SSD SAN array (other tablespaces, logs, diff. backup)

  • Dual 55TB NFS-mounted backup arrays (backup archives)

• Network

  • Multiple 40GbE NIC (database replication, SSD storage)

  • Multiple 10GbE NIC (backup and remote replication)

  • Dual 1GbE NIC (admin network)

What does that look like?

Judgment Call:

• Treat your containers

  • like a full VM?

  • like a single service box?

• The “Docker way” is single service box

  • You do not perform “server maintenance”

  • No sysinit, no syslogd, no cron

  • All important data (including logs) mapped to external volumes

  • Processes can be started, stopped, restarted from outside the container

  • Applications don’t interact inside a container

  • Limited shell access (only by root from host, via docker exec, docker attach)

• Reasons to emulate full VM

  • Software architecture expectations (EDB Postgres Plus)

  • SSH allows administrators to connect to containers rather than host

  • Paradigm comfort

  • A little rebellion is a good thing now and then

Considerations for Postgres

• Docker internal filesystem is UnionFS

  • Great for versioning, snapshotting… slow

  • Limited by default to 10GB, defined in docker daemon (one size fits all)

  • Ergo – use mapped volumes for any actual work

• Doing things the Docker Way

  • No SSH means no modifying postgresql.conf or pg_hba.conf

  • Can modify many settings via queries, but not pg_hba.conf

  • No restart/reload (just spin up another container) – kind of a pain for simple modifications

• Doing things the Full VM way

  • Still not perfect – init is not the same

  • Either use a custom init like runit, or script your start/stop from the outside via SSH or nsenter (only applies when starting/stopping the whole container)
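
A minimal sketch of scripting start/stop from the outside via SSH, as mentioned above. The host name pg-dev, the postgres login, and the data directory are hypothetical placeholders, not values from the deployment described here. The script is a dry run by default: it prints the ssh commands rather than executing them.

```shell
#!/bin/sh
# Dry run by default: each call prints the ssh command instead of running it.
# Set SSH=ssh in the environment to execute for real.
SSH="${SSH:-echo ssh}"

# Hypothetical data directory inside the container (an assumption).
PGDATA_IN_CONTAINER=/var/lib/pgsql/data

# Stop Postgres inside the container reachable at host $1.
pg_stop() {
    $SSH "postgres@$1" pg_ctl -D "$PGDATA_IN_CONTAINER" stop -m fast
}

# Start Postgres inside the container reachable at host $1.
# -l sends server output to a log file so the ssh session can return.
pg_start() {
    $SSH "postgres@$1" pg_ctl -D "$PGDATA_IN_CONTAINER" start \
        -l "$PGDATA_IN_CONTAINER/startup.log"
}

pg_stop pg-dev
pg_start pg-dev
```

Run pg_stop before stopping the whole container and pg_start after starting it, so Postgres is not left to whatever the container init does on shutdown.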

Working with Docker

• Containers are based on images (filesystem snapshots)

• Images are containerized versions of a Linux OS

  • Can be just a base distro

  • Can be a distro + specialized application installed

  • Can be any of the above, + any set of files you want on the Union FS

• Images can be fetched from Docker Registry, or built

• Containers are instantiated images

• BUT

• Containers can be saved as images, via docker commit

Docker as a VM

• Found several examples of Docker images with full system init on Docker Registry (https://registry.hub.docker.com)

• Not perfect

  • Could not run a real SysV init (for reasons intrinsic to Docker)

  • Settled on runit as the init manager: good for standard services like syslogd, cron, sshd; not good for Postgres

• But, a starting point

• In the end, built custom image from scratch using the joliva/centos-baseimage as an example

• Wanted to base it on Oracle Enterprise Linux instead of CentOS

• Copied Dockerfile, made changes, applied to bare OEL image

Reasons for custom image

• Images pulled from Docker Registry are not secure.

• Even now, with “signed images” the situation is not resolved

• Wanted to be sure we understood all components

• Yes, even so, we had to trust the bare OEL image (security via locked-down network)

Docker Image Workflow

Iterative development to tweak an image

1. Pull a base image to start with, or build your own via Dockerfile

2. Launch a container based on that image

3. Modify that container however you want

4. Commit that container as a new image

5. Repeat
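
The numbered steps above can be sketched as one shell function. This is an illustration under assumptions: the image names, the scratch container name, and the package installed in step 3 are hypothetical, and the function is a dry run by default, printing each docker command rather than executing it.

```shell
#!/bin/sh
# Dry run by default: prints each docker command instead of running it.
# Set DOCKER=docker in the environment to execute for real.
DOCKER="${DOCKER:-echo docker}"

# $1 = base image, $2 = scratch container name, $3 = new image tag
tweak_image() {
    $DOCKER pull "$1"                                # 1. pull a base image
    $DOCKER run -d --name "$2" "$1" tail -f /dev/null # 2. launch a container that stays up
    $DOCKER exec "$2" yum -y install openssh-server  # 3. modify it (example change only)
    $DOCKER commit "$2" "$3"                         # 4. commit it as a new image
    $DOCKER rm -f "$2"                               # discard the scratch container
}

# 5. Repeat: the new tag becomes the base for the next round.
tweak_image local/oel-base tweak-1 local/oel-base:v2
```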

[Diagram labels: Container, Image; Dev, Pre, Prod]

Docker annoyances

• All containers depend on the docker daemon

  • More than just an annoyance: a stability and availability issue

• Many files in /etc cannot be modified

  • Can be hacked by finding container FS on host and modifying

  • SSH hostname lookup had to be turned off this way

  • BUT, do it once and then commit image and all is good.

• In order to present services on a dedicated IP address and port, container must be run in --privileged mode (security and stability implications)

  • Docker 1.2+ allows for finer-grained capabilities

• Also, port forwarding must be enabled in host kernel

  • net.ipv4.conf.all.forwarding = 1
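
A rough sketch of the two networking steps above: enabling forwarding on the host, and starting a container with a single added capability instead of --privileged. Which capability a given setup actually needs is an assumption here (NET_ADMIN is used only as an example), and the IP address, container name, and image name are hypothetical. Dry run by default.

```shell
#!/bin/sh
# Dry run by default: prints commands instead of running them.
# Set SYSCTL=sysctl and DOCKER=docker to execute for real (as root on the host).
SYSCTL="${SYSCTL:-echo sysctl}"
DOCKER="${DOCKER:-echo docker}"

# Enable forwarding in the host kernel (same key as in the slide).
enable_forwarding() {
    $SYSCTL -w net.ipv4.conf.all.forwarding=1
}

# $1 = host IP, $2 = port, $3 = container name, $4 = image
# NET_ADMIN is an assumed example capability, not a verified requirement.
run_with_cap() {
    $DOCKER run -d --cap-add=NET_ADMIN -p "$1:$2:$2" --name "$3" "$4"
}

enable_forwarding
run_with_cap 10.0.0.11 5432 pg-prod local/oel-postgres
```

A setting made with sysctl -w is lost on reboot; put the same key in the host's /etc/sysctl.conf to make it persistent.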

Docker benefits

• Mapped volumes make life easy• Default paths inside, custom paths outside

• Port mapping makes life easy• Default port inside, custom port outside

• Container snapshotting makes life easy

• 1-second startup times make life easy

docker run \
  -v [external filesystem path1]:[internal filesystem path] \
  -p [external ip address]:[external port]:[internal port] \
  -h [hostname] \
  --name [container name] \
  --privileged \
  [Docker image] \
  [initialization command] &
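
For illustration, here is the template above filled in with concrete values. Every value (paths, IP address, ports, names, image, init command) is a hypothetical placeholder rather than the configuration used for the client. The sketch uses -d to detach instead of a trailing &, and is a dry run by default.

```shell
#!/bin/sh
# Dry run by default: prints the command instead of running it.
# Set DOCKER=docker in the environment to execute for real.
DOCKER="${DOCKER:-echo docker}"

# All values below are hypothetical placeholders.
PGDATA_HOST=/fio/pgdata/prod       # custom path outside (on the host)
PGDATA_CONT=/var/lib/pgsql/data    # default path inside the container
HOST_IP=10.0.0.11                  # dedicated external IP address
HOST_PORT=6432                     # custom port outside
IMAGE=local/oel-postgres           # locally built image
INIT_CMD=/sbin/my_init             # init command baked into the image

run_pg_container() {
    $DOCKER run -d \
        -v "$PGDATA_HOST:$PGDATA_CONT" \
        -p "$HOST_IP:$HOST_PORT:5432" \
        -h pg-prod \
        --name pg-prod \
        --privileged \
        "$IMAGE" "$INIT_CMD"
}

run_pg_container
```

Postgres inside the container keeps its default port (5432) and default data path; only the outside values change per instance.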

Why containers over instances?

• Yes, we could have just run many parallel instances of Postgres in the host.

• How many people here have done that?

• Was it fun?

  • Let’s count the ways

With Docker:

• Outer host system is “clean”, only concerned with data files.

• The Postgres installations didn’t have to “know” anything about outer environment

• Default paths, ports, etc… did not need to be changed. ALL DEFAULTS = easy.

• If a container has a problem, spin up another one using the same mapped volumes.
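
The last point, replacing a troubled container while keeping its data, might look like the sketch below. The container name, image name, and paths are hypothetical. Dry run by default.

```shell
#!/bin/sh
# Dry run by default: prints each docker command instead of running it.
# Set DOCKER=docker in the environment to execute for real.
DOCKER="${DOCKER:-echo docker}"

# $1 = container name, $2 = image, $3 = host data directory
replace_container() {
    $DOCKER stop -t 60 "$1"    # give the old container up to 60s to stop
    $DOCKER rm "$1"            # remove it; data on the mapped volume stays
    $DOCKER run -d --name "$1" -v "$3:/var/lib/pgsql/data" "$2"
}

replace_container pg-dev local/oel-postgres /fio/pgdata/dev
```

Because the data lives on the host path, the new container picks up exactly where the old one left off.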

Final system

[Diagram: container layout across two NOCs]

• NOC 1

  • Server 1 – R/W Primary

  • Server 2 – R/O Standby

• NOC 2

  • Server 3 – R/O Standby

• Container sets shown per server

  • PgPool Dev, PgPool Pre, PgPool Prod

  • PG Dev, PG Pre, PG Prod

• Ports shown: SSH:22, PgPool:9000, Pg:5432

Things to remember

• If you want full VM style, it will cost you (time, frustration)

• If you want external networking, it will take elevated privileges in host and containers

  • Port forwarding turned on in host

  • --privileged, or --cap-add in container

• Mapped volumes need same uid/gid inside and out.

• Clock is the same inside and out, but time zone can differ.

• User in privileged container can set system clock.

• Set your /etc/security/limits.conf and /etc/sysctl.conf in host

• ALSO Set your /etc/security/limits.conf and /etc/sysctl.conf in container

• Run sysctl -p /etc/sysctl.conf EVERY TIME you start/restart a container.
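
Two of the reminders above lend themselves to small helper functions: checking that a mapped volume is owned by the uid the container expects, and re-applying sysctl settings on every start. This is a sketch under assumptions; the container name is hypothetical, and the docker commands are a dry run by default.

```shell
#!/bin/sh
# Dry run by default for docker commands: prints them instead of running them.
# Set DOCKER=docker in the environment to execute for real.
DOCKER="${DOCKER:-echo docker}"

# Print the numeric uid that owns path $1 (third field of `ls -nd`).
owner_uid() {
    ls -nd "$1" | awk '{print $3}'
}

# Succeed only if host path $2 is owned by uid $1
# (the uid of the postgres user inside the container).
check_volume_owner() {
    [ "$(owner_uid "$2")" = "$1" ]
}

# Start container $1, then reload kernel settings inside it.
start_container() {
    $DOCKER start "$1"
    $DOCKER exec "$1" sysctl -p /etc/sysctl.conf
}

start_container pg-dev
```

A typical use is to call check_volume_owner with the container's postgres uid and the host data directory before start_container, and refuse to start on a mismatch.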

The future of Docker for PostgreSQL

• Docker isn’t going away anytime soon

• Postgres community involvement

• Docker PostgreSQL builds – many in registry hub.