PostgreSQL and backup & recovery
ATL PostgreSQL Meetup, March 2015
scott boss
founder of nocturnal coding monkeys
@the_sboss (personal)
@nocturnalCM (company) http://ncm.link/ncm
overview
• backup & recovery
• PostgreSQL backups
• combining PostgreSQL backups with company backups
• q/a
backup & recovery
–Curtis “Mr Backup” Preston
“backups are weird.”
a few quick definitions
backup
a copy of the important files, used to restore a system and get it up and running. this is part of an operational restore process.
think point-in-time copy of the data.
archive
long-term retention of files, usually required for compliance (PCI, HIPAA, etc), by contract, or for historical purposes.
restore
taking the file(s) out of a backup and restoring them to the computer system to bring it back to an operational state.
why backup & restore, and not archive & restore
• restores are for getting you operational again.
• you don't want to be operational at a point from a year+ ago; you want to be operational from immediately before the issue/crash.
rto
recovery time objective. this is the amount of time you are allowed to take to recover from an outage.
rpo
recovery point objective. this is the amount of data (measured in time) you are allowed to lose.
rtr
recovery time reality. this is the real amount of time it takes to recover. generally 1.5x-2x of the rto.
rpr
recovery point reality. this is the real amount of data lost when recovering from an outage. generally 2x-3x of the rpo.
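to make the rule of thumb above concrete, here is a small worked example. the 4 hour rto and 1 hour rpo are made-up numbers for illustration, and the multipliers are only the "generally" ranges quoted above, not measurements.

```python
def estimate_reality(objective_hours, low_factor, high_factor):
    """Return the (low, high) range that a rule-of-thumb multiplier predicts."""
    return (objective_hours * low_factor, objective_hours * high_factor)


# Hypothetical objectives, chosen only for illustration.
rto_hours = 4.0
rpo_hours = 1.0

# rtr is generally 1.5x-2x of the rto; rpr is generally 2x-3x of the rpo.
rtr_range = estimate_reality(rto_hours, 1.5, 2.0)
rpr_range = estimate_reality(rpo_hours, 2.0, 3.0)

print("expected rtr (hours):", rtr_range)
print("expected rpr (hours):", rpr_range)
```

if the ranges this produces are not acceptable to the business, that is the signal to tighten the objectives or work on the mitigations.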
rtr mitigations
• to reduce the rtr (making it closer to the rto), reduce the variables out of your control.
• e.g. don't use a 3rd party vendor to off-site your tapes.
• e.g. be staffed 24/7.
rpr mitigations
• to reduce the rpr (making it closer to the rpo), reduce the things that can make you go to an earlier backup.
• use fewer incrementals and more full backups.
• monitor the backups to make sure they are successful.
–Curtis “Mr Backup” Preston (paraphrased)
“delete your backups as soon as you can, after they expire. don't keep them forever or you will get sucked into eDiscovery hell.”
eDiscovery
in a legal case, the “discovery” of materials from the computer* systems.
eDiscovery hell
the hell the IT staff (mostly the B&R team) go through in producing all the data in the “discovery”. the hell increases with the amount of data you have to divulge.
advice about eDiscovery hell
• write down your backup/archive policies. this includes retention periods, when you do fulls vs incrementals vs differentials, etc.
• in your policy, state when you will delete “expired” backups/archives. it should be to the effect of “immediately after the backup/archive expires.”
• delete/mangle/destroy/eat/etc the backups/archives based on your written policies.
• just because a backup/archive is expired doesn't exclude it from eDiscovery. expired and “gone” does.
PostgreSQL backups
pg_dump / pg_dumpall
• in previous talks at the meetup, we have looked at pg_dump and pg_dumpall.
• recap:
• pg_dump for a single database.
• pg_dumpall does all databases in a single command.
• pg_dump has a lot more functionality.
• pg_dumpall uses pg_dump under the hood.
pg_basebackup & WALs
• this is PostgreSQL’s “hot backup”.
• this works better for larger databases (collections).
• more complicated setup than pg_dump*
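a minimal sketch of what taking a base backup can look like, wrapped in Python so it can sit in the same script as the rest of the backup job. the host, the `replicator` role and the target directory are hypothetical. it assumes the server already allows replication connections for that role (pg_hba.conf and the WAL-related settings), which is the "more complicated setup" part and is not shown here. `-X stream` includes the WAL needed to make the copy consistent; restoring to a later point in time additionally needs WAL archiving, which this sketch does not cover.

```python
import subprocess


def pg_basebackup_command(target_dir, host="localhost", user="replicator"):
    """Build a pg_basebackup command line (host and user are hypothetical)."""
    return [
        "pg_basebackup",
        "-h", host,        # server to copy from
        "-U", user,        # role with replication privileges (assumed to exist)
        "-D", target_dir,  # directory that receives the copy of the data directory
        "-X", "stream",    # stream the WAL generated while the backup runs
        "-P",              # show progress
    ]


def run_base_backup(target_dir, runner=subprocess.run):
    """Run the base backup; raises CalledProcessError if pg_basebackup fails."""
    runner(pg_basebackup_command(target_dir), check=True)
```

calling `run_base_backup("/var/backups/postgresql/base")` (a made-up path) would run the command against localhost.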
best practices
• write a script that uses pg_dump to dump the database(s). don't do a single inline command.
• have a date/timestamp in the output filename.
• either use logrotate or write it into your script to clean up old pg_dumps so you don't fill up the filesystem.
• copy the pg_dumps to another computer or system*.
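the practices above can be sketched as a small script. this is one possible shape, not a definitive implementation: the dump directory, the 7 day retention and the filename pattern are assumptions for illustration, and the copy to another system is left out because it depends on your environment.

```python
import datetime as dt
import subprocess
from pathlib import Path

DUMP_DIR = Path("/var/backups/postgresql")  # hypothetical location
RETENTION_DAYS = 7                          # hypothetical retention period


def dump_path(dbname, now, dump_dir=DUMP_DIR):
    """Build a timestamped filename, e.g. mydb_20150312T020000.dump."""
    return dump_dir / f"{dbname}_{now:%Y%m%dT%H%M%S}.dump"


def pg_dump_command(dbname, target):
    """pg_dump in custom format (-Fc), written to a file (-f)."""
    return ["pg_dump", "-Fc", "-f", str(target), dbname]


def expired_dumps(dump_dir, now, retention_days=RETENTION_DAYS):
    """Return dump files last modified before the retention cutoff."""
    cutoff = now.timestamp() - retention_days * 86400
    return sorted(p for p in dump_dir.glob("*.dump") if p.stat().st_mtime < cutoff)


def run_backup(dbname, dump_dir=DUMP_DIR):
    """Dump one database, then delete dumps that are past retention."""
    now = dt.datetime.now()
    target = dump_path(dbname, now, dump_dir)
    # check=True makes a failed dump raise, so old dumps are not pruned after it.
    subprocess.run(pg_dump_command(dbname, target), check=True)
    for old in expired_dumps(dump_dir, now):
        old.unlink()
    return target
```

pruning only after a successful dump is a deliberate choice here: if pg_dump fails, the older dumps are still there to restore from.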
compression
• do we compress the backups? maybe. depends on a few factors.
• where are the backups going to “live” (reside)?
• is the data already compressed? do we need to recompress or compress compressed data?
• compressed backups could slow down restores.
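one way to answer the "is it already compressed?" question is to compress a sample and look at the ratio. the sketch below uses made-up sample data: repeated SQL text stands in for a plain dump, and random bytes stand in for data that is already compressed or encrypted.

```python
import gzip
import os


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; near 1.0 means little to gain."""
    if not data:
        return 1.0
    return len(gzip.compress(data)) / len(data)


# Hypothetical samples, for illustration only.
text_like = b"INSERT INTO orders VALUES (1, 'widget');\n" * 1000
random_like = os.urandom(100_000)  # stand-in for already compressed/encrypted data

print("text-like sample ratio:", round(compression_ratio(text_like), 4))
print("random-like sample ratio:", round(compression_ratio(random_like), 4))
```

a ratio close to 1.0 suggests compressing again mostly costs cpu time during backup and restore without saving much space.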
encryption
• do we encrypt the backups? maybe. depends on a few factors.
• do you need to? (PCI, HIPAA, etc requirement)
• is the data already encrypted? if so, do we really need to encrypt it again?
• are they leaving your premises?
• encrypted backups could slow down restores.
combining it all together
use the common backup solution.
if you have a solution that is backing up the servers now, that is (generally) your best bet for backing up your databases.
backup options
• have the backup script dump the file to a centralized NAS share.
• the centralized NAS could be a backup appliance (like Data Domain, Quantum DXi, etc), or it can be a centralized storage array like Isilon, NetApp, FreeNAS or NAS4Free.
• have a backup client (CommVault, NetBackup, NetWorker, etc) “sweep” the database dump folder/directory.
• there is some limited backup client* functionality in some commercial software to back up directly from PostgreSQL.
not backup options
• doing backups and NEVER testing restores.
• emailing the dumps. do not rely on email as part of your backup system.
• sending the backups unencrypted outside of your company. this includes writing them to SaaS (S3, Glacier, Azure, Dropbox, OneDrive, etc).
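on the first point, a restore test does not have to be elaborate. the sketch below restores a dump into a scratch database and then drops it. the database name `restore_test` is hypothetical, and it assumes the dump was made in a format pg_restore can read (custom, directory or tar, not plain SQL) and that the user running it may create and drop databases.

```python
import subprocess


def restore_test_commands(dump_file, scratch_db="restore_test"):
    """Commands to restore a dump into a throwaway database and clean up."""
    return [
        ["createdb", scratch_db],
        ["pg_restore", "--exit-on-error", "-d", scratch_db, dump_file],
        ["dropdb", scratch_db],
    ]


def run_restore_test(dump_file, scratch_db="restore_test", runner=subprocess.run):
    """Raise CalledProcessError if the dump cannot be restored."""
    create, restore, drop = restore_test_commands(dump_file, scratch_db)
    runner(create, check=True)
    try:
        runner(restore, check=True)
    finally:
        # Always remove the scratch database, even when the restore fails.
        runner(drop, check=True)
```

a restore that completes only proves the dump is readable; checking row counts or running application queries against the scratch database before dropping it would be a stronger test.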
q & a