High Availability and PostgreSQL
Fujitsu Coffee Morning — July 28th, 2006
Gavin Sherry
High Availability and PostgreSQL – p. 1
What is high availability?
A system to reduce application down time
Usually consists of:
- Master server(s)
- Slave server(s)
- Software to detect the failure of a master
- Software to promote a slave to master status
- Software or hardware to ensure data consistency between the master(s) and slave(s)
Sometimes technologies provide HA incidentally (more later)
What high availability is not
... a mechanism to increase performance
- Usually HA comes at the price of performance
... a way to simplify your network
... cheap
... easy to implement
Choosing a HA solution
How much down time can you afford?
How much data (how many seconds, minutes) can you afford to lose?
Do you need a geographically dispersed solution?
Are slaves online or offline?
Are online upgrades possible?
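To make "how much down time can you afford?" concrete, an availability target translates directly into a yearly downtime budget. A minimal sketch (the percentages are illustrative, not figures from this talk):

```python
def annual_downtime_minutes(availability_pct):
    """Minutes of down time per year permitted by an availability target."""
    return (1 - availability_pct / 100.0) * 365 * 24 * 60

# "Three nines" allows roughly 526 minutes (about 8.8 hours) per year;
# "four nines" allows roughly 53 minutes.
print(round(annual_downtime_minutes(99.9)))
print(round(annual_downtime_minutes(99.99)))
```

Whether your budget is hours or minutes largely decides between a managed switch over and a fully automatic one.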
Failure detection
‘heartbeat’ / ‘ping’
Can be performed via: serial, fibre, Ethernet...
- Red Hat Cluster Suite, Linux-HA
More complex OS level interconnect
- HACMP (AIX), Lifekeeper (Linux), Sun Cluster (Solaris)
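The heartbeat idea can be sketched as a plain TCP connect check. Real cluster suites use dedicated serial or network links, but the principle is the same; host and port here are placeholders:

```python
import socket

def heartbeat(host, port, timeout=1.0):
    """Return True if the peer accepts a TCP connection within `timeout`.

    A minimal stand-in for the serial/fibre/Ethernet heartbeats that
    cluster software uses to detect a failed master.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In practice you would probe the PostgreSQL port (or a dedicated heartbeat channel) and require several consecutive misses before declaring failure.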
Switch over
IP take over
- Managed or automatic?
Application reconfiguration
Middleware reconfiguration
Shoot the master in the head (STONITH)
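An automatic switch over should not fire on a single missed heartbeat, or a transient blip triggers a needless (and risky) takeover. A sketch of that debouncing decision, with the threshold as a tunable assumption:

```python
def should_fail_over(heartbeats, threshold=3):
    """Decide on fail over from a sequence of heartbeat results.

    Returns True only after `threshold` consecutive missed beats, so a
    single network/software/hardware blip does not trigger a takeover.
    """
    misses = 0
    for ok in heartbeats:
        misses = 0 if ok else misses + 1
        if misses >= threshold:
            return True
    return False
```

Once this returns True, the remaining steps (IP take over, killing the old master, reconfiguration) run; a higher threshold trades fewer false takeovers for longer down time.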
Data availability mechanisms
Historically seen to be the most complex part of database HA
Replicating data is easy; doing it with transactional semantics is hard
- Minimising performance impact is also hard
The slave server(s) need access to the data
How far out of date can the slave(s) be?
Shared storage
Master and slave(s) connect to shared disk/SAN
Pros:
- Little to no performance impact
- Simple design
- May utilise existing infrastructure
- Theoretically no data loss in failure scenario
Cons:
- SANs can fail, especially cheap ones
- No off site capability
- Upgrades can be painful
- Slave(s) are offline
- Homogeneous cluster
Data replication: Slony-I
Trigger-based row level replication
Data is pulled by slave server(s)
Pros:
- Implemented by PostgreSQL developers
- Online upgrades are a feature
- Data is replicated with transaction semantics
- Online slaves
- Off site capable
- Heterogeneous cluster
Cons:
- Performance impact: 1% to 5%
- Complex administration
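Part of the "complex administration" is that Slony-I clusters are driven through its slonik scripting tool. Very roughly, a minimal two-node setup looks like the sketch below; the cluster name, connection strings and table set are invented placeholders, and the exact commands should be checked against the Slony-I documentation for your version:

```
cluster name = ha_cluster;
node 1 admin conninfo = 'dbname=app host=master user=slony';
node 2 admin conninfo = 'dbname=app host=slave user=slony';

init cluster (id = 1, comment = 'Master node');
create set (id = 1, origin = 1, comment = 'Replicated tables');
set add table (set id = 1, origin = 1, id = 1,
               fully qualified name = 'public.sales');

store node (id = 2, comment = 'Slave node');
store path (server = 1, client = 2, conninfo = 'dbname=app host=master user=slony');
store path (server = 2, client = 1, conninfo = 'dbname=app host=slave user=slony');
subscribe set (id = 1, provider = 1, receiver = 2, forward = no);
```

Every replicated table must be listed explicitly, which is one reason administration does not stay simple as schemas grow.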
Data replication: file system
Block based OS file system copy on write
DRBD on Linux
Pros:
- Theoretically, no data loss
- Simplified implementation
Cons:
- Significant performance impact: more than 10%
- Off site will increase performance impact
- No online upgrade
- Offline slaves
- Homogeneous cluster
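As a rough illustration, a DRBD resource pairing one partition on each machine might be configured as below. Hostnames, devices and addresses are invented; protocol C is DRBD's fully synchronous mode, which is what gives the "theoretically, no data loss" property (and much of the performance cost):

```
resource r0 {
  protocol C;            # synchronous: a write completes only after
                         # it has also reached the peer's disk
  on master.example.com {
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on slave.example.com {
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

PostgreSQL then runs on the master's /dev/drbd0; the slave's copy of the device cannot be mounted while replication runs, which is why the slaves are offline.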
Data replication: log shipping
Point in time recovery (PITR)
Write ahead logs (WAL) are shipped to the slave and replayed
Pros:
- <1% performance impact
- Simplified design
- Off site capable
Cons:
- Offline slaves
- Minimising data loss is tricky and error prone
- Lacks tools
- No online upgrade
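On PostgreSQL 8.x, WAL shipping is driven by archive_command on the master and restore_command on the recovering slave. A minimal sketch; the paths and the transport (a plain copy to a shared mount) are assumptions, and a production setup needs scripting around them, which is what "lacks tools" refers to:

```
# postgresql.conf on the master
archive_command = 'cp %p /mnt/standby_wal/%f'

# recovery.conf in the slave's data directory
restore_command = 'cp /mnt/standby_wal/%f %p'
```

A WAL segment is only shipped once it is complete, so the data loss window is bounded by how long a segment takes to fill under your write load; that is the tricky part of minimising loss.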
SQL proxy: PG Cluster
Write statements are sent to all nodes via a proxy
Pros:
- Online slaves
- Simplified design
Cons:
- Data inconsistencies across servers
- Not two-phase commit: a query may fail on one server
- Does not handle non-deterministic queries, e.g.
  INSERT INTO SALES VALUES (1001, current_timestamp);
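The non-determinism problem can be seen in a toy model of statement-based replication: if each node evaluated current_timestamp itself, the stored rows would diverge. Binding the value once before shipping the statement keeps the nodes identical. Everything below is an illustrative simulation, not PG Cluster's actual implementation:

```python
def replicate(nodes, statement, params):
    """Toy statement-based replication: run the identical parameterised
    statement on every node (nodes are modelled as plain lists)."""
    for node in nodes:
        node.append(statement % params)

# Deterministic rewrite: the proxy evaluates the timestamp once and ships
# the literal value, instead of letting each node call current_timestamp
# at a slightly different moment.
timestamp = "2006-07-28 09:00:00"
node_a, node_b = [], []
replicate([node_a, node_b],
          "INSERT INTO sales VALUES (%d, '%s');", (1001, timestamp))
assert node_a == node_b  # every node stored the identical statement
```

The same argument applies to random(), sequences and any other value computed per server.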
SQL proxy: PgPool
Much like PG Cluster
Other cons:
- One slave
SQL proxy: Sequoia, p/cluster
J2EE/JDBC two phase commit statement based replication
Derived from c-JDBC by Continuent
Pros:
- Data consistency with 2PC
- p/cluster does complete HA, pretty tools
- Simplified design
- Online slaves
Cons:
- Only supports JDBC (other interfaces planned)
- Off site may have a huge performance impact
- Performance impact would be non-trivial, especially with more slaves
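The two-phase commit that distinguishes this approach from a plain SQL proxy can be sketched as: prepare on every node, commit only on a unanimous yes, otherwise roll back everywhere. The Node class below is a stand-in for illustration, not any real Sequoia API:

```python
class Node:
    """Stand-in database node that can prepare/commit/rollback."""
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.committed = []

    def prepare(self, stmt):
        # Vote yes only if this node could apply the statement.
        return self.healthy

    def commit(self, stmt):
        self.committed.append(stmt)

    def rollback(self, stmt):
        pass

def two_phase_commit(nodes, stmt):
    # Phase 1: every node must vote yes on the prepared statement.
    if all(node.prepare(stmt) for node in nodes):
        # Phase 2: commit everywhere; all nodes stay consistent.
        for node in nodes:
            node.commit(stmt)
        return True
    # Any "no" vote aborts the statement on every node.
    for node in nodes:
        node.rollback(stmt)
    return False
```

With one unhealthy node the statement is rolled back everywhere, so no server ends up holding data the others lack; the extra round trip per statement is where the performance cost comes from.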
Testing
Test your HA solution(s)... but testing thoroughly is hard
The future
Lots of next generation HA for PostgreSQL on the horizon
PgPool 2.0
PgCluster 2.0: shared disk, shared memory
Postgres-R/Slony-2
Cautionary notes
HA solutions can still fail
SANs can blow up
Split brain
Replication software can fail: it can corrupt data or stop replicating
Most common issue:
1. Network/software/hardware blip
2. Heartbeat fails
3. Switch over takes place
4. Something goes wrong:
   - Network reconfiguration doesn't work
   - Slave server is misconfigured
   - Slave server doesn't kill master properly
References
Lifekeeper, by SteelEye - http://www.steeleye.com
Linux-HA - http://www.linux-ha.org/
Red Hat Cluster Suite - http://www.redhat.com/solutions/clustersuite/
Slony - http://www.slony.info
DRBD - http://www.drbd.org
PG Cluster - http://pgcluster.projects.postgresql.org
PgPool - http://pgpool.projects.postgresql.org/
Continuent - http://www.continuent.com
Postgres-R(8) - http://www.postgres-r.org