
Lesson 1: Overview of PostgreSQL High Availability


Learning Objectives

After this lesson, you will:

  • Understand why High Availability (HA) is important for database systems
  • Master different HA deployment methods for PostgreSQL
  • Compare the advantages and disadvantages of Patroni, Repmgr and Pacemaker
  • Understand the overall architecture of PostgreSQL HA systems

1. Why do we need High Availability?

1.1. Problems with Single Point of Failure (SPOF)

In a traditional single-server database system, the database server itself is a Single Point of Failure (SPOF).

Consequences when database server fails:

  • Downtime: Applications cannot access data
  • Revenue loss: Each minute of downtime can result in significant revenue loss
  • Loss of reputation: Users cannot use the service
  • Data loss: If there is no timely backup

1.2. Common causes of downtime

| Cause | Percentage | Impact |
|---|---|---|
| Hardware failures (disk, RAM, CPU) | 30% | High |
| Network failures | 20% | Medium |
| Software bugs | 25% | High |
| Planned maintenance | 15% | Controllable |
| Human error | 10% | High |

1.3. What is High Availability?

High Availability (HA) is the ability of a system to maintain continuous operation even when one or more components fail.

HA measurement metrics:

| Availability | Downtime/year | Downtime/month | Level |
|---|---|---|---|
| 99% (2 nines) | 3.65 days | 7.2 hours | Low |
| 99.9% (3 nines) | 8.76 hours | 43.2 minutes | Medium |
| 99.99% (4 nines) | 52.56 minutes | 4.32 minutes | High |
| 99.999% (5 nines) | 5.26 minutes | 25.9 seconds | Very High |
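
These downtime figures follow directly from the availability percentage; for example:

TEXT
downtime_per_year = (1 − availability) × 365 × 24 × 60 minutes

99.9%  → 0.001  × 525,600 minutes = 525.6 minutes ≈ 8.76 hours
99.99% → 0.0001 × 525,600 minutes ≈ 52.6 minutes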

1.4. Benefits of HA

Business Benefits:

  • Minimize downtime and revenue loss
  • Increase system reliability
  • Improve user experience
  • Meet SLA (Service Level Agreement)

Technical Benefits:

  • Automatic failover when primary server fails
  • Zero-downtime maintenance
  • Load balancing for read queries
  • Disaster recovery
  • Data protection

2. PostgreSQL HA Methods

2.1. Log-Shipping (WAL Shipping)


How it works:

  • Primary server writes WAL (Write-Ahead Log) files
  • WAL files are copied to standby server
  • Standby server replays WAL to synchronize data

Advantages:

  • Simple, easy setup
  • Low resource consumption

Disadvantages:

  • High Recovery Time Objective (RTO): minutes to hours
  • No automatic failover
  • Data loss may occur
  • Standby cannot serve queries (warm standby)
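
For reference, the mechanism above maps to only a few settings. A minimal sketch, assuming a shared archive directory at /mnt/wal_archive and PostgreSQL 12+ (where the standby is marked by an empty standby.signal file):

TEXT
# Primary: postgresql.conf
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'    # copy each completed WAL segment to the archive

# Standby: postgresql.conf (plus an empty standby.signal file in the data directory)
restore_command = 'cp /mnt/wal_archive/%f %p'    # replay WAL segments from the archive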

2.2. Streaming Replication

How it works:

  • Primary streams WAL records in real-time to standby
  • Standby applies changes immediately
  • Standby can serve read queries (hot standby)


Advantages:

  • Low latency (< 1 second)
  • Hot standby can serve read queries
  • Synchronous mode avoids data loss on failover (at the cost of write latency)

Disadvantages:

  • Still requires manual failover
  • Needs external tools for automation
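
A minimal sketch of a manual streaming-replication setup (hostnames and the replication role are placeholders; Lesson 2 walks through this in detail):

TEXT
# On the standby: clone the primary; -R writes primary_conninfo and creates standby.signal
pg_basebackup -h primary-host -U replicator -D /var/lib/postgresql/data -X stream -R

# Resulting setting in postgresql.auto.conf on the standby
primary_conninfo = 'host=primary-host user=replicator'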

2.3. Logical Replication

How it works:

  • Replicate at logical level (tables, rows)
  • Allows selective data replication
  • Publisher → Subscriber model

Advantages:

  • Replication between different PostgreSQL versions
  • Selective replication (only certain tables)
  • Multi-master possible (with BDR)

Disadvantages:

  • Higher overhead than physical replication
  • Not the main HA solution (usually used for data distribution)
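
A minimal publisher/subscriber sketch (database, table, and role names are placeholders; the publisher needs wal_level = logical):

TEXT
-- On the publisher
CREATE PUBLICATION orders_pub FOR TABLE orders;

-- On the subscriber (the table must already exist with a matching schema)
CREATE SUBSCRIPTION orders_sub
  CONNECTION 'host=publisher-host dbname=shop user=replicator'
  PUBLICATION orders_pub;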

2.4. Shared Storage (SAN)

How it works:

  • Two or more nodes attach to the same SAN volume
  • Only one node runs PostgreSQL at a time
  • On failure, another node mounts the shared storage and starts PostgreSQL

Advantages:

  • Fast failover (just start PostgreSQL)
  • No data loss

Disadvantages:

  • Expensive (needs SAN infrastructure)
  • SAN becomes single point of failure
  • Complex to maintain

3. Comparison: Patroni vs Repmgr vs Pacemaker

3.1. Patroni

Characteristics:

  • Python-based
  • Uses DCS (etcd, Consul, ZooKeeper) to store cluster state
  • REST API for management
  • Intelligent automatic failover
  • Template-based configuration

Advantages:

  • ✅ Easy to install and configure
  • ✅ Powerful REST API
  • ✅ Good Kubernetes integration
  • ✅ Active development, large community
  • ✅ Automatic leader election
  • ✅ Rolling restart, zero-downtime updates

Disadvantages:

  • ❌ Depends on DCS (adds component)
  • ❌ Need to learn DCS (etcd/Consul)

Suitable use cases:

  • Cloud-native applications
  • Kubernetes deployments
  • Microservices architecture
  • High automation needs
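
A quick sketch of the management interfaces (the node address is a placeholder; Patroni's REST API listens on port 8008 by default):

TEXT
# Cluster overview via patronictl
patronictl -c /etc/patroni.yml list

# The same information via the REST API
curl -s http://10.0.0.11:8008/cluster

# Health endpoints used by load balancers:
# /primary returns 200 only on the leader, /replica only on healthy replicas
curl -s -o /dev/null -w "%{http_code}\n" http://10.0.0.11:8008/primary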

3.2. Repmgr

Characteristics:

  • Open-source tool from 2ndQuadrant (now part of EnterpriseDB)
  • Standalone tool, no DCS required
  • Witness node for quorum voting
  • Command-line based management

Advantages:

  • ✅ No external DCS needed
  • ✅ Simpler than Patroni
  • ✅ Good documentation
  • ✅ Mature and stable

Disadvantages:

  • ❌ Fewer automation features than Patroni
  • ❌ No REST API
  • ❌ Smaller community
  • ❌ Automatic failover (via the repmgrd daemon) requires more manual configuration

Suitable use cases:

  • Traditional infrastructure
  • Simple, few nodes
  • Don't want to add DCS
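
A sketch of typical repmgr operations (the config path, hostnames, and replication role are placeholders):

TEXT
# Register the primary
repmgr -f /etc/repmgr.conf primary register

# Clone a standby from the primary, then register it
repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
repmgr -f /etc/repmgr.conf standby register

# Show cluster status; automatic failover requires the repmgrd daemon on each node
repmgr -f /etc/repmgr.conf cluster show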

3.3. Pacemaker + Corosync

Characteristics:

  • High Availability cluster framework (Linux-HA)
  • Manages multiple types of resources, not just PostgreSQL
  • Voting quorum mechanism
  • Fencing/STONITH to avoid split-brain

Advantages:

  • ✅ Mature, production-proven (20+ years)
  • ✅ Manages multiple services (PostgreSQL, web server, etc.)
  • ✅ Strong fencing mechanism
  • ✅ Supports shared storage

Disadvantages:

  • ❌ Very complex to setup and maintain
  • ❌ High learning curve
  • ❌ XML configuration difficult to read
  • ❌ Difficult debugging

Suitable use cases:

  • Enterprise environment
  • Need to manage multiple services
  • Has shared storage (SAN)
  • Team has Pacemaker experience

3.4. Overall comparison table

| Criteria | Patroni | Repmgr | Pacemaker |
|---|---|---|---|
| Complexity | Medium | Low | High |
| Learning curve | Medium | Low | Very High |
| Setup time | Fast | Fast | Slow |
| Automatic failover | ✅ Excellent | ✅ Good | ✅ Excellent |
| REST API | ✅ Yes | ❌ No | ❌ No |
| Kubernetes support | ✅ Excellent | ⚠️ Limited | ❌ No |
| Community | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| Documentation | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Dependencies | DCS (etcd/Consul) | None | None |
| Best for | Modern/Cloud | Simple setups | Enterprise/Complex |

3.5. Recommendations

Choose Patroni if:

  • Deploying on cloud or Kubernetes
  • Need automation and REST API
  • Team has experience with modern DevOps tools
  • This is the most popular choice today

Choose Repmgr if:

  • Simple setup, few nodes (2-3)
  • Don't want DCS dependency
  • Team familiar with traditional PostgreSQL tools

Choose Pacemaker if:

  • Complex enterprise environment
  • Already has Pacemaker infrastructure
  • Need to manage multiple services at once
  • Has shared storage (SAN)

4. Overall System Architecture: Patroni + etcd

4.1. 3-node cluster architecture

The cluster consists of three nodes, each running PostgreSQL, Patroni, and an etcd member, with HAProxy in front routing application traffic: writes to the Leader, reads to the Replicas.

4.2. Main components

PostgreSQL

  • Main database engine
  • One node is Leader (read/write)
  • Other nodes are Replicas (read-only)
  • Uses Streaming Replication to synchronize

Patroni

  • Manages PostgreSQL lifecycle
  • Monitors node health
  • Performs automatic failover
  • Exposes a REST API to query cluster state
  • Reads/writes configuration to DCS

etcd (DCS - Distributed Configuration Store)

  • Stores cluster state and configuration
  • Leader election (decides which node is Leader)
  • Distributed lock mechanism
  • 3 etcd nodes form a quorum (majority voting)

HAProxy (Load Balancer)

  • Routes write traffic → Leader
  • Routes read traffic → Replicas (round-robin)
  • Health checks and automatic re-routing when a failover occurs
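
A minimal haproxy.cfg sketch for this routing, assuming Patroni's REST API on port 8008 and placeholder node addresses; the health checks use the /primary and /replica endpoints:

TEXT
listen postgres_write
    bind *:5000
    mode tcp
    option httpchk GET /primary              # only the leader answers 200
    http-check expect status 200
    default-server inter 3s fall 3 rise 2
    server node1 10.0.0.11:5432 check port 8008
    server node2 10.0.0.12:5432 check port 8008
    server node3 10.0.0.13:5432 check port 8008

listen postgres_read
    bind *:5001
    mode tcp
    balance roundrobin
    option httpchk GET /replica              # only healthy replicas answer 200
    http-check expect status 200
    server node1 10.0.0.11:5432 check port 8008
    server node2 10.0.0.12:5432 check port 8008
    server node3 10.0.0.13:5432 check port 8008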

4.3. Operation flow

1. Normal Operations (Normal state)

TEXT
1. Application sends query → HAProxy
2. HAProxy checks health check
3. Route write → Leader, read → Replicas
4. Patroni on each node:
   - Send heartbeat to etcd every 10s
   - Update health status
   - Maintain leader lease

2. Leader Failure Detection

TEXT
1. Node 1 (Leader) fails → heartbeats stop
2. etcd detects: leader lease expired (30s)
3. Patroni on Node 2 and Node 3 notice the expired leader key
4. Leader election is triggered

3. Automatic Failover Process

TEXT
Timeline: 0s  ──────────► 30s ──────► 45s ──────► 60s
          │              │           │            │
      Leader dies    etcd detects  New leader  Applications
                     lease expire   elected     reconnect
                                   (Node 2)
                                   
Node 1:   LEADER ──────► DOWN ──────────────────► STANDBY (after recovery)
Node 2:   REPLICA ─────────────────► LEADER ────► LEADER
Node 3:   REPLICA ──────────────────────────────► REPLICA

4. After Failover

TEXT
- Node 2 becomes new Leader
- Node 3 remains Replica, changes replication source to Node 2
- HAProxy automatically detects and routes traffic to Node 2
- Node 1 (when recovered) will rejoin as Replica

4.4. Important scenarios

Scenario 1: Planned Switchover

TEXT
# Admin wants maintenance on Node 1 (Leader)
$ patronictl switchover postgres-cluster

# Patroni will:
1. Pause writing to current Leader
2. Wait for Replica to sync completely (zero lag)
3. Promote Replica → Leader
4. Demote old Leader → Replica
5. Zero data loss, downtime < 5s

Scenario 2: Split-brain Prevention

TEXT
Situation: Network partition between nodes

etcd quorum (3 nodes):
- Partition A: Node 1, Node 2 (2 nodes = majority)
- Partition B: Node 3 (1 node = minority)

Result:
✅ Partition A: Continues operation, can elect leader
❌ Partition B: Cannot elect leader (insufficient quorum)

→ Prevents 2 leaders existing simultaneously!

Scenario 3: Node Recovery

TEXT
Node 1 recovers after dying:

1. Patroni starts and reads cluster state from etcd
2. Realizes Node 2 is currently Leader
3. Automatically rejoins as Replica
4. Uses pg_rewind to sync data if there's divergence
5. Begins streaming replication from Node 2
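
Patroni performs this rewind itself when use_pg_rewind is enabled in its postgresql section; done by hand, the equivalent step looks roughly like this (paths and connection details are placeholders):

TEXT
# On the recovered Node 1, with its PostgreSQL instance stopped
pg_rewind \
  --target-pgdata=/var/lib/postgresql/data \
  --source-server='host=node2 user=postgres dbname=postgres'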

4.5. Configuration timeline (important parameters)

TEXT
# patroni.yml
bootstrap:
  dcs:
    ttl: 30                    # Leader lease time (30s)
    loop_wait: 10              # Check interval (10s)
    retry_timeout: 10          # Retry time
    maximum_lag_on_failover: 1048576  # Max lag for failover candidate (1MB)

Explanation:

  • ttl: 30: Leader must renew lease every 30s, otherwise considered dead
  • loop_wait: 10: Patroni checks health every 10s
  • Failover trigger: the leader key must be renewed within ttl; with a 10s loop_wait, a failed Leader is typically detected in ~20-30s
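
For context, a trimmed patroni.yml sketch showing where these dcs settings sit relative to the other sections (addresses and paths are placeholders):

TEXT
scope: postgres-cluster
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008

etcd3:
  hosts: 10.0.0.11:2379,10.0.0.12:2379,10.0.0.13:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576

postgresql:
  data_dir: /var/lib/postgresql/data
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432
  use_pg_rewind: true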

5. Summary

Key Takeaways

  1. High Availability is mandatory for production systems to reduce downtime and data loss
  2. Streaming Replication + Automatic Failover is the most popular HA method for PostgreSQL
  3. Patroni is the best choice for most modern use cases:
    • Easy setup and maintenance
    • Intelligent automatic failover
    • Powerful REST API
    • Good cloud/K8s integration
  4. 3-node architecture with Patroni + etcd provides:
    • Automatic failover (RTO < 30s)
    • Zero data loss with sync replication
    • Split-brain prevention
    • Scalability for read workloads

Homework

  1. Calculate downtime for your system with different availability levels (99%, 99.9%, 99.99%)
  2. Draw HA architecture for your specific use case (number of nodes, data centers, RTO/RPO requirements)
  3. Compare costs between using HA and accepting downtime for your business

Preparation for next lesson

Lesson 2 will dive deep into Streaming Replication - the foundation of PostgreSQL HA:

  • Detailed mechanism of WAL operation
  • Synchronous vs Asynchronous replication
  • Replication slots
  • Lab: Manual replication setup
