Lesson 2: PostgreSQL Streaming Replication

Learning Objectives

After this lesson, you will:

  • Deeply understand the Streaming Replication mechanism in PostgreSQL
  • Master Write-Ahead Logging (WAL) and its role
  • Distinguish between Synchronous and Asynchronous Replication
  • Understand and use Replication Slots
  • Practice manual replication setup (Primary-Standby)

1. Streaming Replication Operation Mechanism

1.1. Overview

Streaming Replication is the mechanism PostgreSQL uses to replicate data from a Primary server to one or more Standby servers in real time by continuously shipping WAL records.

(Diagram: How Streaming Replication Works)

1.2. Main components

WAL Sender (on Primary)

  • Process dedicated to sending WAL records to Standby
  • One WAL sender per Standby connection
  • Monitoring: SELECT * FROM pg_stat_replication;

WAL Receiver (on Standby)

  • Process receives WAL records from Primary
  • Writes WAL to local WAL files
  • Sends feedback to Primary (LSN position, status)

Startup Process (on Standby)

  • Replays WAL records into data files
  • Similar to recovery process
  • Can serve read queries (Hot Standby)
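
These processes can be inspected directly from SQL using standard monitoring views. A minimal check, assuming PostgreSQL 13 or later (older releases name some pg_stat_wal_receiver columns differently):

sql
-- On the Primary: one row per connected WAL sender (one per standby)
SELECT pid, application_name, state, sent_lsn, replay_lsn, sync_state
FROM pg_stat_replication;

-- On the Standby: the WAL receiver's view of the incoming stream
SELECT status, flushed_lsn, sender_host, sender_port
FROM pg_stat_wal_receiver;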

1.3. Detailed data flow

(Diagram: Transaction Commit Flow)

Typical timing:

  • Asynchronous: the Standby typically lags the Primary by ~0-100ms
  • Synchronous: each commit takes roughly 1-10ms longer, depending on network latency
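
To see the actual figures for your own cluster instead of these rough estimates, the lag columns that pg_stat_replication exposes on PostgreSQL 10+ report them per standby:

sql
-- On the Primary: per-standby lag measured as time rather than bytes
SELECT application_name, write_lag, flush_lag, replay_lag, sync_state
FROM pg_stat_replication;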

2. Write-Ahead Logging (WAL)

2.1. What is WAL?

Write-Ahead Logging is a logging technique where:

"All changes must be written to the log BEFORE writing to data files"

WAL Principle:

(Diagram: Write-Ahead Logging principle)
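
A quick way to see this principle in action: every committed change advances the WAL insert position immediately, long before the modified pages reach the data files at the next checkpoint. (wal_demo is just a throwaway table; the LSN values are illustrative.)

sql
SELECT pg_current_wal_insert_lsn();   -- e.g. 0/3000060
CREATE TABLE wal_demo (id int);
INSERT INTO wal_demo VALUES (1);
SELECT pg_current_wal_insert_lsn();   -- the position has moved forward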

2.2. WAL Files Structure

Location: $PGDATA/pg_wal/

TEXT
$ ls -lh $PGDATA/pg_wal/
-rw------- 1 postgres postgres 16M Nov 24 10:00 000000010000000000000001
-rw------- 1 postgres postgres 16M Nov 24 10:15 000000010000000000000002
-rw------- 1 postgres postgres 16M Nov 24 10:30 000000010000000000000003

Characteristics:

  • Each file: 16MB (default)
  • File name: Timeline ID + Segment Number
  • Format: TTTTTTTTXXXXXXXXYYYYYYYY
    • TTTTTTTT: Timeline (8 hex digits)
    • XXXXXXXX: Log file number (8 hex)
    • YYYYYYYY: Segment number (8 hex)
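
PostgreSQL can compute the segment file name for any LSN, which is a handy way to see the naming scheme with real values (output shown is illustrative):

sql
SELECT pg_current_wal_lsn() AS lsn,
       pg_walfile_name(pg_current_wal_lsn()) AS wal_file;
-- e.g. 0/3000060 → 000000010000000000000003 (timeline 1, segment 3)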

2.3. LSN (Log Sequence Number)

LSN is a position (byte offset) in the WAL stream, written as two hexadecimal numbers in the form X/Y:

  • X: high-order 32 bits of the position
  • Y: low-order 32 bits of the position
TEXT
-- Check current LSN
SELECT pg_current_wal_lsn();  -- Primary
-- Output: 0/3000060

SELECT pg_last_wal_receive_lsn();  -- Standby (received)
SELECT pg_last_wal_replay_lsn();   -- Standby (applied)
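
Because an LSN is just a byte position, the distance between two LSNs is a byte count. pg_wal_lsn_diff() does this arithmetic and is the basis of the lag queries used later in this lesson:

sql
-- Byte distance between two WAL positions (0x60 = 96 bytes)
SELECT pg_wal_lsn_diff('0/3000060', '0/3000000');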

2.4. WAL Configuration Parameters

TEXT
# postgresql.conf

# WAL Settings
wal_level = replica              # minimal, replica, or logical
                                 # replica: for streaming replication

wal_log_hints = on              # Needed for pg_rewind

# WAL Writing
wal_buffers = 16MB              # WAL buffer size in shared memory
wal_writer_delay = 200ms        # WAL writer sleep time

# WAL Files Management
min_wal_size = 80MB            # Don't shrink recycled WAL below this size
max_wal_size = 1GB             # Soft limit; triggers a checkpoint when exceeded

# Checkpoints
checkpoint_timeout = 5min       # Maximum time between two checkpoints
checkpoint_completion_target = 0.9  # Spread checkpoint writes
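
To check what a running server actually uses, read the values back from pg_settings (or use SHOW); the selection below simply mirrors the parameters above:

sql
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_level', 'wal_buffers', 'max_wal_size',
               'checkpoint_timeout', 'checkpoint_completion_target');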

2.5. WAL and Crash Recovery

When PostgreSQL crashes:

TEXT
1. Server restart
2. PostgreSQL reads last checkpoint location
3. Replay all WAL records from checkpoint → crash point
4. Restore database to consistent state
5. Ready to accept connections

Example:

TEXT
Timeline:
10:00 ─── Checkpoint ─── 10:05 ─── 10:08 (CRASH)
          (LSN: 0/1000)          (LSN: 0/3000)
          
Recovery:
- Start from LSN 0/1000
- Replay WAL → LSN 0/3000
- Database consistent at 10:08
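
The "last checkpoint location" that recovery starts from is recorded in the control file and can be read from a running server with pg_control_checkpoint() (available since PostgreSQL 9.6); redo_lsn is where WAL replay would begin after a crash:

sql
SELECT checkpoint_lsn, redo_lsn, timeline_id
FROM pg_control_checkpoint();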

3. Synchronous vs Asynchronous Replication

3.1. Asynchronous Replication (Default)

How it works:

(Diagram: Asynchronous Replication)

Characteristics:

  • High Performance: Primary doesn't wait for Standby
  • Low Latency: Commit time doesn't depend on network
  • Data Loss Possible: If Primary crashes before Standby receives WAL
  • RPO > 0: Recovery Point Objective is not zero

Configuration:

TEXT
# postgresql.conf (Primary)
# Asynchronous replication is the default: leave synchronous_standby_names empty.
synchronous_commit = on   # 'local' or 'off' only relax the wait for the LOCAL WAL flush

Use cases:

  • Standby in different datacenter (high latency)
  • Prioritize performance over data safety
  • Acceptable data loss (a few seconds)
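
With this default setup you can confirm from the Primary that the standbys really are asynchronous; they report sync_state = 'async':

sql
SELECT application_name, client_addr, sync_state
FROM pg_stat_replication;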

3.2. Synchronous Replication

How it works:

(Diagram: Synchronous Replication)

Characteristics:

  • Zero Data Loss: Transaction only commits when Standby confirms
  • RPO = 0: Perfect for critical data
  • Performance Impact: ~2-10ms overhead per commit
  • Availability Risk: Primary blocks if Standby fails

Configuration:

TEXT
# postgresql.conf (Primary)
synchronous_commit = on              # on, remote_write, remote_apply
synchronous_standby_names = 'standby1,standby2'  # application_name values; a plain list means FIRST 1

# recovery.conf or postgresql.auto.conf (Standby)
primary_conninfo = 'host=primary port=5432 user=replicator application_name=standby1'

Synchronous Commit Levels:

Level        | Meaning                               | Data Safety | Performance
-------------|---------------------------------------|-------------|------------
off          | Don't wait for Standby                | Low         | Highest
local        | Wait only for local disk              | Medium      | High
remote_write | Wait for Standby to write to OS cache | Good        | Medium
on           | Wait for Standby to flush to disk     | Good        | Slower
remote_apply | Wait for Standby to apply changes     | Best        | Slowest
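
synchronous_commit can also be changed per session or even per transaction, so only critical writes pay the synchronous price while everything else stays fast. A sketch (the commented-out INSERT stands for any critical write; the table name is hypothetical):

sql
BEGIN;
SET LOCAL synchronous_commit = 'remote_apply';
-- INSERT INTO payments ...;   -- hypothetical critical write
COMMIT;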

3.3. Quorum-based Synchronous Replication

PostgreSQL 9.6+: Flexible synchronous replication

TEXT
# Wait for ANY 1 of 2 standbys
synchronous_standby_names = 'ANY 1 (standby1, standby2)'

# Wait for FIRST 2 of 3 standbys
synchronous_standby_names = 'FIRST 2 (standby1, standby2, standby3)'

# Plain list (pre-9.6 syntax): equivalent to FIRST 1
synchronous_standby_names = 'standby1, standby2'

Example: ANY 1

TEXT
3 Standbys: standby1 (DC1), standby2 (DC2), standby3 (DC3)

Transaction commits when:
✅ Primary committed + ANY 1 standby acknowledged

Scenario:
- standby1: ACK in 5ms
- standby2: ACK in 100ms (slow network)
- standby3: DOWN

→ Transaction commits after 5ms (waiting for standby1)
→ Good performance + Data safety
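
When a quorum rule such as ANY 1 (...) is active, the participating standbys report sync_state = 'quorum' on the Primary, which is an easy way to verify the configuration:

sql
SELECT application_name, sync_state, sync_priority
FROM pg_stat_replication;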

3.4. Sync vs Async Comparison

Criteria            | Async                     | Sync
--------------------|---------------------------|--------------------------
Commit latency      | ~1ms                      | ~5-10ms
Data loss risk      | Yes (a few seconds)       | No
RPO                 | Seconds                   | Zero
RTO                 | ~30-60s                   | ~30-60s
Primary performance | 100%                      | 95-98%
Network dependency  | Low                       | High
Use case            | Read replicas, Reporting  | Critical data, Financial

4. Replication Slots

4.1. Problem before Replication Slots

Scenario:

TEXT
1. Primary generates WAL files
2. Checkpoint happens → Old WAL cleaned up
3. Standby offline for hours
4. Standby comes back online
5. ❌ Needed WAL files have been deleted
6. ❌ Standby cannot catch up
7. ❌ Need to rebuild Standby from scratch

4.2. Replication Slots solve the problem

Replication Slot ensures Primary keeps WAL files until Standby consumes them.

(Diagram: Replication Slot)

4.3. Create and manage Replication Slots

Create slot on Primary:

TEXT
-- Physical replication slot
SELECT * FROM pg_create_physical_replication_slot('standby1_slot');

-- View slot list
SELECT slot_name, slot_type, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;

-- Output:
 slot_name     | slot_type | active | restart_lsn | confirmed_flush_lsn
---------------+-----------+--------+-------------+--------------------
 standby1_slot | physical  | t      | 0/3000000   | NULL

Use slot on Standby:

ini
# postgresql.auto.conf (Standby)
primary_slot_name = 'standby1_slot'

Delete slot:

sql
SELECT pg_drop_replication_slot('standby1_slot');

4.4. Monitor Replication Slots

sql
-- Check slot status
SELECT 
    slot_name,
    active,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained_wal
FROM pg_replication_slots;

-- Alert if retained_wal is too large (>10GB)
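
On PostgreSQL 13 and later, pg_replication_slots also exposes wal_status and safe_wal_size, which tell you directly whether a slot's retained WAL is still safe:

sql
-- wal_status: reserved / extended / unreserved / lost
SELECT slot_name, active, wal_status, safe_wal_size
FROM pg_replication_slots;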

4.5. Important notes

⚠️ Risks:

  • If Standby is offline for long with slot → Primary keeps WAL forever
  • Can fill Primary's disk
  • Need monitoring and alerts

Best practice:

sql
-- Cap the WAL that slots may retain so a dead standby cannot fill the disk
ALTER SYSTEM SET max_slot_wal_keep_size = '100GB';  -- PostgreSQL 13+

-- Or periodically drop inactive slots that retain too much WAL (e.g. > 100GB)
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE NOT active
  AND pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) > 100::bigint * 1024*1024*1024;

5. Lab: Manual Streaming Replication Setup

5.1. Lab Objectives

Create a PostgreSQL cluster with:

  • 1 Primary server
  • 1 Standby server
  • Streaming replication (asynchronous)
  • Hot standby (read queries)

5.2. Environment

TEXT
Primary:  192.168.1.101 (node1)
Standby:  192.168.1.102 (node2)
PostgreSQL: 14
OS: Ubuntu 22.04

5.3. Step 1: Install PostgreSQL (both nodes)

bash
# Install PostgreSQL 14
sudo apt update
sudo apt install -y postgresql-14 postgresql-contrib-14

# Stop service
sudo systemctl stop postgresql

5.4. Step 2: Configure Primary (node1)

Create replication user:

bash
sudo -u postgres psql

sql
-- Create user for replication
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'repl_password';

-- Exit
\q

Configure postgresql.conf:

bash
sudo nano /etc/postgresql/14/main/postgresql.conf

ini
# Connection
listen_addresses = '*'
port = 5432

# Replication
wal_level = replica
max_wal_senders = 5
max_replication_slots = 5
wal_keep_size = 1GB

# Hot Standby (not needed for primary but good to have)
hot_standby = on

# Archive (optional, recommended)
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/14/archive/%f && cp %p /var/lib/postgresql/14/archive/%f'

Create archive directory:

bash
sudo mkdir -p /var/lib/postgresql/14/archive
sudo chown postgres:postgres /var/lib/postgresql/14/archive

Configure pg_hba.conf:

bash
sudo nano /etc/postgresql/14/main/pg_hba.conf

ini
# Replication connections
host    replication     replicator      192.168.1.102/32        md5
host    replication     replicator      127.0.0.1/32            md5

Start Primary:

bash
sudo systemctl start postgresql
sudo systemctl status postgresql

Create replication slot:

bash
sudo -u postgres psql

sql
SELECT pg_create_physical_replication_slot('standby_slot');
SELECT * FROM pg_replication_slots;
\q

5.5. Step 3: Setup Standby (node2)

Stop PostgreSQL and backup old data:

bash
sudo systemctl stop postgresql
sudo mv /var/lib/postgresql/14/main /var/lib/postgresql/14/main.bak

Base backup from Primary:

bash
# Use pg_basebackup
sudo -u postgres pg_basebackup \
    -h 192.168.1.101 \
    -D /var/lib/postgresql/14/main \
    -U replicator \
    -P \
    -v \
    -R \
    -X stream \
    -C -S standby_slot

# Options explanation:
# -h: Primary host
# -D: Data directory
# -U: Replication user
# -P: Show progress
# -v: Verbose
# -R: Create standby.signal and postgresql.auto.conf
# -X stream: Stream WAL during backup
# -C: Create replication slot
# -S: Slot name

Sample output:

TEXT
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
24567/24567 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

Check that standby.signal was created:

bash
ls -l /var/lib/postgresql/14/main/standby.signal
# This file marks this as a standby server

Check postgresql.auto.conf:

bash
sudo cat /var/lib/postgresql/14/main/postgresql.auto.conf

ini
# Created automatically by pg_basebackup -R
primary_conninfo = 'user=replicator password=repl_password host=192.168.1.101 port=5432 sslmode=prefer sslcompression=0 krbsrvname=postgres target_session_attrs=any'
primary_slot_name = 'standby_slot'

Start Standby:

bash
sudo systemctl start postgresql
sudo systemctl status postgresql

5.6. Step 4: Verify Replication

On Primary (node1):

sql
sudo -u postgres psql

-- Check replication status
SELECT 
    client_addr,
    state,
    sync_state,
    replay_lsn,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) as lag
FROM pg_stat_replication;

-- Output:
 client_addr   |   state   | sync_state | replay_lsn  |  lag
---------------+-----------+------------+-------------+-------
 192.168.1.102 | streaming | async      | 0/3000060   | 0 bytes

On Standby (node2):

sql
sudo -u postgres psql

-- Check standby status
SELECT pg_is_in_recovery();  -- Should return 't' (true)

-- Check replication lag
SELECT 
    pg_last_wal_receive_lsn() AS receive,
    pg_last_wal_replay_lsn() AS replay,
    pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())) AS lag;

-- Output:
   receive   |   replay    |  lag
-------------+-------------+--------
 0/3000060   | 0/3000060   | 0 bytes
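
Lag can also be expressed as time on the Standby. Note that this value is only meaningful while the Primary is generating writes; on an idle Primary it keeps growing:

sql
-- Time since the last replayed transaction committed on the Primary
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;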

5.7. Step 5: Test Replication

On Primary - Create test data:

sql
-- Create database and table
CREATE DATABASE testdb;
\c testdb

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    created_at TIMESTAMP DEFAULT NOW()
);

INSERT INTO users (name) VALUES 
    ('Alice'),
    ('Bob'),
    ('Charlie');

SELECT * FROM users;

On Standby - Verify data:

sql
\c testdb

-- Read queries work
SELECT * FROM users;

-- Output:
 id |  name   |       created_at
----+---------+------------------------
  1 | Alice   | 2024-11-24 10:30:15
  2 | Bob     | 2024-11-24 10:30:15
  3 | Charlie | 2024-11-24 10:30:15

-- Write queries are rejected
INSERT INTO users (name) VALUES ('David');
-- ERROR: cannot execute INSERT in a read-only transaction

5.8. Step 6: Monitoring Queries

Replication delay monitoring:

sql
-- On Primary
CREATE OR REPLACE FUNCTION replication_lag_bytes()
RETURNS TABLE(client_addr INET, lag_bytes BIGINT) AS $$
BEGIN
    RETURN QUERY
    SELECT 
        c.client_addr,
        pg_wal_lsn_diff(pg_current_wal_lsn(), c.replay_lsn)::BIGINT
    FROM pg_stat_replication c;
END;
$$ LANGUAGE plpgsql;

-- Use
SELECT * FROM replication_lag_bytes();

Alert if lag > 10MB:

sql
SELECT client_addr, 
       pg_size_pretty(lag_bytes) as lag
FROM replication_lag_bytes()
WHERE lag_bytes > 10*1024*1024;

5.9. Troubleshooting Common Issues

Issue 1: Standby cannot connect to Primary

bash
# Check logs (Ubuntu/Debian package default location)
sudo tail -f /var/log/postgresql/postgresql-14-main.log

# Common errors:
# - "FATAL: password authentication failed"
#   → Check pg_hba.conf and password

# - "FATAL: no pg_hba.conf entry for replication"
#   → Add replication entry to pg_hba.conf

# - Connection refused
#   → Check firewall, listen_addresses

Issue 2: High replication lag

sql
-- Check WAL sender busy
SELECT * FROM pg_stat_activity 
WHERE backend_type = 'walsender';

-- Check I/O on Standby
SELECT * FROM pg_stat_bgwriter;

Issue 3: Slot fills up disk

sql
-- Check retained WAL
SELECT 
    slot_name,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained
FROM pg_replication_slots;

-- Drop inactive slot if needed
SELECT pg_drop_replication_slot('standby_slot');

6. Best Practices

6.1. Configuration Tuning

ini
# Primary - postgresql.conf

# WAL senders: number of standbys plus a couple of spares for pg_basebackup
max_wal_senders = 10

# WAL retention
wal_keep_size = 2GB  # Keep enough WAL for standby to catch up
max_slot_wal_keep_size = 10GB  # Limit slot retention (PG 13+)

# Archive (backup strategy)
archive_mode = on
archive_command = 'cp %p /backup/archive/%f'

# Checkpoint tuning
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
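
After editing postgresql.conf, remember that some of these parameters (for example max_wal_senders) require a restart, while others such as wal_keep_size and archive_command only need a reload; pg_settings shows which is which:

sql
SELECT pg_reload_conf();                 -- apply reload-only changes
SELECT name, setting, context, pending_restart
FROM pg_settings
WHERE name IN ('max_wal_senders', 'wal_keep_size', 'archive_command');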

6.2. Monitoring Checklist

✅ Replication lag (bytes and time)
✅ Standby connection status
✅ WAL sender processes
✅ Disk space (pg_wal/ and archive/)
✅ Replication slots (retained WAL)
✅ Checkpoint performance

6.3. Security Recommendations

ini
# Use SSL for replication
ssl = on
ssl_cert_file = '/path/to/server.crt'
ssl_key_file = '/path/to/server.key'

# Standby connection string
primary_conninfo = '... sslmode=require'   # sslcompression is deprecated and ignored on recent versions

ini
# pg_hba.conf - Use hostssl
hostssl replication replicator 192.168.1.0/24 md5

7. Summary

Key Takeaways

  1. Streaming Replication is the foundation of PostgreSQL HA:
    • Real-time WAL streaming
    • Hot Standby for read queries
    • Basis for Patroni automated failover
  2. WAL (Write-Ahead Logging):
    • Changes are logged before they are written to data files
    • Crash recovery mechanism
    • Replication transport format
  3. Synchronous vs Asynchronous:
    • Async: High performance, possible data loss
    • Sync: Zero data loss, performance impact
    • Quorum-based: Balance between both
  4. Replication Slots:
    • Ensure WAL is not deleted too early
    • Critical for standby stability
    • Need monitoring to avoid disk full
