High Availability and PostgreSQL
Fujitsu Coffee Morning — July 28th, 2006
Gavin Sherry
High Availability and PostgreSQL – p. 1
What is high availability?
A system to reduce application down time
Usually consists of:
- Master server(s)
- Slave server(s)
- Software to detect the failure of a master
- Software to promote a slave to master status
- Software or hardware to ensure data consistency between the master(s) and slave(s)
Sometimes technologies provide HA incidentally (more later)
What high availability is not
... a mechanism to increase performance
- Usually HA comes at the price of performance
... a way to simplify your network
... cheap
... easy to implement
Choosing a HA solution
How much down time can you afford?
How much data (how many seconds, minutes) can you afford to lose?
Do you need a geographically dispersed solution?
Are slaves online or offline?
Are online upgrades possible?
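To make "how much down time can you afford?" concrete, an availability target translates directly into a yearly downtime budget. A minimal sketch (the percentages are illustrative, not figures from this talk):

```python
def annual_downtime_minutes(availability_pct):
    """Minutes of down time per year permitted by an availability target."""
    return (1 - availability_pct / 100.0) * 365 * 24 * 60

# "Three nines" allows roughly 526 minutes (about 8.8 hours) per year;
# "four nines" allows roughly 53 minutes.
print(round(annual_downtime_minutes(99.9)))
print(round(annual_downtime_minutes(99.99)))
```

Whether your budget is hours or minutes largely decides between a managed switch over and a fully automatic one.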
Failure detection
‘heartbeat’ / ‘ping’
Can be performed via: serial, fibre, Ethernet...
- Red Hat Cluster Suite, Linux-HA
More complex OS level interconnect
- HACMP (AIX), Lifekeeper (Linux), Sun Cluster (Solaris)
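The heartbeat idea can be sketched as a plain TCP connect check. Real cluster suites use dedicated serial or network links, but the principle is the same; host and port here are placeholders:

```python
import socket

def heartbeat(host, port, timeout=1.0):
    """Return True if the peer accepts a TCP connection within `timeout`.

    A minimal stand-in for the serial/fibre/Ethernet heartbeats that
    cluster software uses to detect a failed master.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In practice you would probe the PostgreSQL port (or a dedicated heartbeat channel) and require several consecutive misses before declaring failure.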
Switch over
IP take over
- Managed or automatic?
Application reconfiguration
Middleware reconfiguration
Shoot the master in the head (STONITH)
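An automatic switch over should not fire on a single missed heartbeat, or a transient blip triggers a needless (and risky) takeover. A sketch of that debouncing decision, with the threshold as a tunable assumption:

```python
def should_fail_over(heartbeats, threshold=3):
    """Decide on fail over from a sequence of heartbeat results.

    Returns True only after `threshold` consecutive missed beats, so a
    single network/software/hardware blip does not trigger a takeover.
    """
    misses = 0
    for ok in heartbeats:
        misses = 0 if ok else misses + 1
        if misses >= threshold:
            return True
    return False
```

Once this returns True, the remaining steps (IP take over, killing the old master, reconfiguration) run; a higher threshold trades fewer false takeovers for longer down time.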
Data availability mechanisms
Historically seen to be the most complex part of database HA
Replicating data is easy; doing it with transactional semantics is hard
- Minimising performance impact is also hard
The slave server(s) need access to the data
How far out of date can the slave(s) be?
Shared storage
Master and slave(s) connect to shared disk/SAN
Pros:
- Little to no performance impact
- Simple design
- May utilise existing infrastructure
- Theoretically no data loss in failure scenario
Cons:
- SANs can fail, especially cheap ones
- No off site capability
- Upgrades can be painful
- Slave(s) are offline
- Homogeneous cluster
Data replication: Slony-I
Trigger-based row level replication
Data is pulled by slave server(s)
Pros:
- Implemented by PostgreSQL developers
- Online upgrades are a feature
- Data is replicated with transaction semantics
- Online slaves
- Off site capable
- Heterogeneous cluster
Cons:
- Performance impact: 1% to 5%
- Complex administration
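Part of the "complex administration" is that Slony-I clusters are driven through its slonik scripting tool. Very roughly, a minimal two-node setup looks like the sketch below; the cluster name, connection strings and table set are invented placeholders, and the exact commands should be checked against the Slony-I documentation for your version:

```
cluster name = ha_cluster;
node 1 admin conninfo = 'dbname=app host=master user=slony';
node 2 admin conninfo = 'dbname=app host=slave user=slony';

init cluster (id = 1, comment = 'Master node');
create set (id = 1, origin = 1, comment = 'Replicated tables');
set add table (set id = 1, origin = 1, id = 1,
               fully qualified name = 'public.sales');

store node (id = 2, comment = 'Slave node');
store path (server = 1, client = 2, conninfo = 'dbname=app host=master user=slony');
store path (server = 2, client = 1, conninfo = 'dbname=app host=slave user=slony');
subscribe set (id = 1, provider = 1, receiver = 2, forward = no);
```

Every replicated table must be listed explicitly, which is one reason administration does not stay simple as schemas grow.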
Data replication: file system
Block based OS file system copy on write
DRBD on Linux
Pros:
- Theoretically, no data loss
- Simplified implementation
Cons:
- Significant performance impact: more than 10%
- Off site will increase performance impact
- No online upgrade
- Offline slaves
- Homogeneous cluster
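As a rough illustration, a DRBD resource pairing one partition on each machine might be configured as below. Hostnames, devices and addresses are invented; protocol C is DRBD's fully synchronous mode, which is what gives the "theoretically, no data loss" property (and much of the performance cost):

```
resource r0 {
  protocol C;            # synchronous: a write completes only after
                         # it has also reached the peer's disk
  on master.example.com {
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on slave.example.com {
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

PostgreSQL then runs on the master's /dev/drbd0; the slave's copy of the device cannot be mounted while replication runs, which is why the slaves are offline.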
Data replication: log shipping
Point in time recovery (PITR)
Write ahead logs (WAL) are shipped to the slave and replayed
Pros:
- <1% performance impact
- Simplified design
- Off site capable
Cons:
- Offline slaves
- Minimising data loss is tricky and error prone
- Lacks tools
- No online upgrade
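On PostgreSQL 8.x, WAL shipping is driven by archive_command on the master and restore_command on the recovering slave. A minimal sketch; the paths and the transport (a plain copy to a shared mount) are assumptions, and a production setup needs scripting around them, which is what "lacks tools" refers to:

```
# postgresql.conf on the master
archive_command = 'cp %p /mnt/standby_wal/%f'

# recovery.conf in the slave's data directory
restore_command = 'cp /mnt/standby_wal/%f %p'
```

A WAL segment is only shipped once it is complete, so the data loss window is bounded by how long a segment takes to fill under your write load; that is the tricky part of minimising loss.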
SQL proxy: PG Cluster
Write statements are sent to all nodes via a proxy
Pros:
- Online slaves
- Simplified design
Cons:
- Data inconsistencies across servers
- Not two-phase commit: a query may fail on one server
- Does not handle non-deterministic queries, e.g.
  INSERT INTO SALES VALUES (1001, current_timestamp);
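The non-determinism problem can be seen in a toy model of statement-based replication: if each node evaluated current_timestamp itself, the stored rows would diverge. Binding the value once before shipping the statement keeps the nodes identical. Everything below is an illustrative simulation, not PG Cluster's actual implementation:

```python
def replicate(nodes, statement, params):
    """Toy statement-based replication: run the identical parameterised
    statement on every node (nodes are modelled as plain lists)."""
    for node in nodes:
        node.append(statement % params)

# Deterministic rewrite: the proxy evaluates the timestamp once and ships
# the literal value, instead of letting each node call current_timestamp
# at a slightly different moment.
timestamp = "2006-07-28 09:00:00"
node_a, node_b = [], []
replicate([node_a, node_b],
          "INSERT INTO sales VALUES (%d, '%s');", (1001, timestamp))
assert node_a == node_b  # every node stored the identical statement
```

The same argument applies to random(), sequences and any other value computed per server.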
SQL proxy: PgPool
Much like PG Cluster
Other cons:
- One slave
SQL proxy: Sequoia, p/cluster
J2EE/JDBC two phase commit statement based replication
Derived from c-JDBC by Continuent
Pros:
- Data consistency with 2PC
- p/cluster does complete HA, pretty tools
- Simplified design
- Online slaves
Cons:
- Only supports JDBC (other interfaces planned)
- Off site may have a huge performance impact
- Performance impact would be non-trivial, especially with more slaves
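The two-phase commit that distinguishes this approach from a plain SQL proxy can be sketched as: prepare on every node, commit only on a unanimous yes, otherwise roll back everywhere. The Node class below is a stand-in for illustration, not any real Sequoia API:

```python
class Node:
    """Stand-in database node that can prepare/commit/rollback."""
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.committed = []

    def prepare(self, stmt):
        # Vote yes only if this node could apply the statement.
        return self.healthy

    def commit(self, stmt):
        self.committed.append(stmt)

    def rollback(self, stmt):
        pass

def two_phase_commit(nodes, stmt):
    # Phase 1: every node must vote yes on the prepared statement.
    if all(node.prepare(stmt) for node in nodes):
        # Phase 2: commit everywhere; all nodes stay consistent.
        for node in nodes:
            node.commit(stmt)
        return True
    # Any "no" vote aborts the statement on every node.
    for node in nodes:
        node.rollback(stmt)
    return False
```

With one unhealthy node the statement is rolled back everywhere, so no server ends up holding data the others lack; the extra round trip per statement is where the performance cost comes from.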
Testing
Test your HA solution(s)... but testing thoroughly is hard
The future
Lots of next generation HA for PostgreSQL on the horizon
PgPool 2.0
PgCluster 2.0: shared disk, shared memory
Postgres-R/Slony-2
Cautionary notes
HA solutions can still fail
SANs can blow up
Split brain
Replication software can fail: it can corrupt data or stop replicating
Most common issue:
1. Network/software/hardware blip
2. Heartbeat fails
3. Switch over takes place
4. Something goes wrong:
   - Network reconfiguration doesn't work
   - Slave server is misconfigured
   - Slave server doesn't kill master properly
References
Lifekeeper, by SteelEye - http://www.steeleye.com
Linux-HA - http://www.linux-ha.org/
Red Hat Cluster Suite - http://www.redhat.com/solutions/clustersuite/
Slony - http://www.slony.info
DRBD - http://www.drbd.org
PG Cluster - http://pgcluster.projects.postgresql.org
PgPool - http://pgpool.projects.postgresql.org/
Continuent - http://www.continuent.com
Postgres-R(8) - http://www.postgres-r.org