
Automating PostgreSQL High Availability with Ansible, Patroni, and etcd

Deploying a PostgreSQL High Availability cluster shouldn't be a manual, error-prone process. In this guide, we'll leverage Ansible automation to deploy a robust HA cluster using Patroni, etcd, and PgBouncer.

This setup provides automatic failover, distributed configuration, and high-performance connection pooling, all optimized for production-grade environments.

Architecture Overview

Our cluster consists of three nodes, each running PostgreSQL, Patroni, and PgBouncer, with a distributed etcd cluster for consensus.

TEXT
┌────────────────────────────────────────────────────────────────┐
│                      Application Layer                         │
│  (Spring Boot / Python / Node.js / Java / .NET / etc.)         │
└──────────────┬──────────────┬──────────────┬───────────────────┘
               │              │              │
          Port 6432      Port 6432      Port 6432
               │              │              │
┌──────────────▼──────┐ ┌─────▼──────┐ ┌────▼───────────────────┐
│   PgBouncer (Node1) │ │ PgBouncer  │ │  PgBouncer (Node3)     │
│   Connection Pool   │ │   (Node2)  │ │  Connection Pool       │
│   Max: 1000 clients │ │            │ │  Max: 1000 clients     │
└──────────┬──────────┘ └─────┬──────┘ └────┬───────────────────┘
           │                  │              │
      Port 5432          Port 5432      Port 5432
           │                  │              │
┌──────────▼──────────┐ ┌─────▼──────┐ ┌────▼───────────────────┐
│  PostgreSQL 18.1    │ │ PostgreSQL │ │  PostgreSQL 18.1       │
│  Primary (Leader)   │ │  Replica   │ │  Replica               │
│  Read/Write         │ │  Read Only │ │  Read Only             │
└──────────┬──────────┘ └─────┬──────┘ └────┬───────────────────┘
           │                  │              │
      Port 8008          Port 8008      Port 8008
           │                  │              │
┌──────────▼──────────┐ ┌─────▼──────┐ ┌────▼───────────────────┐
│   Patroni 4.1.0     │ │  Patroni   │ │  Patroni 4.1.0         │
│   HA Manager        │ │ HA Manager │ │  HA Manager            │
└──────────┬──────────┘ └─────┬──────┘ └────┬───────────────────┘
           │                  │              │
      Port 2379          Port 2379      Port 2379
           │                  │              │
┌──────────▼──────────────────▼──────────────▼───────────────────┐
│                etcd 3.5.25 Cluster                             │
│          Distributed Configuration & Leader Election           │
│   Node1 (etcd1)  │  Node2 (etcd2)  │  Node3 (etcd3)            │
└────────────────────────────────────────────────────────────────┘

Key Features:

  • High Availability: Automatic failover (30-45s) managed by Patroni.
  • Connection Pooling: PgBouncer for efficient resource utilization.
  • Data Integrity: Optimized for SSDs with reliable streaming replication.
  • Production Ready: Full configuration externalized to a single .env file.

Step 1. Requirements & Preparation

Before running the playbooks, ensure your environment meets the following specifications.

Hardware (per node)

  • Minimum: 2 Cores, 4GB RAM, 20GB Disk.
  • Recommended: 4-8 Cores, 16-32GB RAM, 100GB+ NVMe SSD.

Network Configuration

The following ports must be accessible internally within the cluster:

Service       Port   Protocol   Purpose
-----------   ----   --------   -----------------------------------------
PgBouncer     6432   TCP        Connection pooling (primary access point)
PostgreSQL    5432   TCP        Direct DB connections (admin)
Patroni API   8008   TCP        Health checks & cluster management
etcd Client   2379   TCP        Patroni-to-etcd communication
etcd Peer     2380   TCP        etcd intra-cluster replication
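
To sanity-check connectivity before deployment, a small probe script can confirm that each required port answers from a peer node. This is a hypothetical helper, not part of the repository:

```python
import socket

# Ports required for intra-cluster communication (see table above).
REQUIRED_PORTS = {
    6432: "PgBouncer",
    5432: "PostgreSQL",
    8008: "Patroni API",
    2379: "etcd client",
    2380: "etcd peer",
}

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_node(host: str) -> dict:
    """Probe every required port on one node and report the results."""
    return {f"{name} ({port})": port_open(host, port)
            for port, name in REQUIRED_PORTS.items()}
```

Run `check_node("10.0.0.12")` from another cluster member; any `False` entry points at a firewall rule or a service that is not yet listening.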

Step 2. Project Configuration

All cluster settings are managed through a central .env file.

1. Clone the Automation Repository

BASH
git clone https://github.com/xdev-asia-labs/postgres-patroni-etcd-install.git
cd postgres-patroni-etcd-install

2. Configure Environment Variables

Copy the template and update with your node IPs and strong passwords.

BASH
cp .env.example .env
nano .env

Critical Variables to Set:

BASH
# Node IP Addresses
NODE1_IP=10.0.0.11
NODE2_IP=10.0.0.12
NODE3_IP=10.0.0.13

# PostgreSQL Passwords
POSTGRESQL_SUPERUSER_PASSWORD=your_strong_password
POSTGRESQL_REPLICATION_PASSWORD=your_strong_password
POSTGRESQL_ADMIN_PASSWORD=your_strong_password

# Performance tuning for 16GB RAM
POSTGRESQL_SHARED_BUFFERS=4GB
POSTGRESQL_EFFECTIVE_CACHE_SIZE=12GB
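
The 16GB values above follow a common heuristic: shared_buffers at roughly 25% of RAM and effective_cache_size at roughly 75%. A tiny helper makes the rule explicit (illustrative only; tune for your actual workload):

```python
def tuned_settings(ram_gb: int) -> dict:
    """Memory settings per the 25%/75% heuristic used in this guide."""
    return {
        "POSTGRESQL_SHARED_BUFFERS": f"{int(ram_gb * 0.25)}GB",
        "POSTGRESQL_EFFECTIVE_CACHE_SIZE": f"{int(ram_gb * 0.75)}GB",
    }

print(tuned_settings(16))
# {'POSTGRESQL_SHARED_BUFFERS': '4GB', 'POSTGRESQL_EFFECTIVE_CACHE_SIZE': '12GB'}
```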

3. Update Ansible Inventory

Edit inventory/hosts.yml to reflect your node infrastructure:

YAML
all:
  children:
    postgres:
      hosts:
        pg-node1:
          ansible_host: 10.0.0.11
          patroni_name: node1
          etcd_name: etcd1
        pg-node2:
          ansible_host: 10.0.0.12
          patroni_name: node2
          etcd_name: etcd2
        pg-node3:
          ansible_host: 10.0.0.13
          patroni_name: node3
          etcd_name: etcd3

Step 3. Deploying the Cluster

Load your environment variables and execute the main site playbook.

BASH
# Load environment variables
set -a && source .env && set +a

# Run the deployment
ansible-playbook playbooks/site.yml -i inventory/hosts.yml

You can also deploy specific components using tags: --tags postgresql, --tags etcd, --tags patroni, or --tags pgbouncer.

Step 4. Verification & Management

Once the deployment completes, use patronictl to verify the cluster state from any node.

BASH
ssh root@${NODE1_IP} "patronictl -c /etc/patroni/patroni.yml list"

Expected Cluster Output:

+ Cluster: postgres (7441307089994301601) -----+----+-----------+
| Member   | Host      | Role    | State   | TL | Lag in MB |
+----------+-----------+---------+---------+----+-----------+
| pg-node1 | 10.0.0.11 | Leader  | running |  2 |           |
| pg-node2 | 10.0.0.12 | Replica | running |  2 |         0 |
| pg-node3 | 10.0.0.13 | Replica | running |  2 |         0 |
+----------+-----------+---------+---------+----+-----------+

Common Management Tasks

  • Check etcd health: etcdctl --endpoints=http://10.0.0.11:2379 endpoint health
  • Manual Switchover: patronictl -c /etc/patroni/patroni.yml switchover
  • Check PgBouncer Stats: Connect to port 6432 and run SHOW POOLS;

Step 5. Application Connection

Important: Applications should always connect to PgBouncer (port 6432) rather than PostgreSQL (port 5432) for better resource management.

JDBC (Java/Spring Boot)

YAML
spring:
  datasource:
    url: jdbc:postgresql://10.0.0.11:6432,10.0.0.12:6432,10.0.0.13:6432/postgres?targetServerType=primary&loadBalanceHosts=true
    username: postgres
    password: ${POSTGRESQL_SUPERUSER_PASSWORD}

Python (psycopg2)

PYTHON
import os

import psycopg2

# libpq accepts a comma-separated host list and connects to the first
# node that satisfies target_session_attrs (here: the primary).
conn = psycopg2.connect(
    host="10.0.0.11,10.0.0.12,10.0.0.13",
    port=6432,
    dbname="postgres",
    user="postgres",
    password=os.environ["POSTGRESQL_SUPERUSER_PASSWORD"],
    target_session_attrs="read-write",
)

Transaction vs Session Pooling

Choosing the right pool mode in PgBouncer is critical for application performance:

  • Transaction Mode (Recommended for Microservices): Each transaction borrows a server connection and returns it at commit or rollback. This is the default in our setup (3,000 clients → 225 backend connections).
  • Session Mode (Recommended for Interactive Tools): A client keeps its server connection for the life of the session. Required for tools like DBeaver or pgAdmin that rely on session state such as temporary tables or prepared statements.
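
As a reference point, a minimal pgbouncer.ini for transaction pooling might look like this. The values are illustrative; the playbook templates the actual configuration:

```ini
[databases]
; Route everything to the local PostgreSQL instance.
* = host=127.0.0.1 port=5432

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
pool_mode = transaction   ; switch to "session" for DBeaver/pgAdmin
max_client_conn = 1000    ; per node, matching the architecture above
default_pool_size = 75    ; 75 backends x 3 nodes = 225 total
```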

Troubleshooting Common Issues

1. Environment Variables Not Loading

Symptom: Ansible uses default values instead of .env settings. Solution: Always ensure you load the environment before running playbooks:

BASH
set -a && source .env && set +a

2. Patroni Fails to Start

Symptom: systemctl status patroni shows a failed state. Solution: Check the logs using journalctl -u patroni -n 100. Common causes include:

  • Incorrect bin_dir in .env (ensure it matches your PostgreSQL version).
  • Conflict with a manually started PostgreSQL instance. Patroni must be the one managing the startup.

3. etcd Cluster Unhealthy

Symptom: Endpoint health check fails. Solution: Verify connectivity between nodes on ports 2379 and 2380. Check individual node status:

BASH
ssh root@${NODE1_IP} "systemctl status etcd"

Security & Performance Characteristics

This deployment includes production-grade hardening:

  • SCRAM-SHA-256 Authentication: Replaces insecure MD5.
  • Firewall Isolation: Automated UFW/firewalld rules for cluster communication.
  • Scalable Performance: Optimized for 16GB RAM with shared_buffers at 25% and effective_cache_size at 75%.
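
For reference, the SCRAM hardening translates into pg_hba.conf entries along these lines. This is a sketch only (the playbook generates the real file), and the replication user name here is an assumption:

```
# TYPE  DATABASE     USER         ADDRESS        METHOD
host    all          all          10.0.0.0/24    scram-sha-256
host    replication  replicator   10.0.0.0/24    scram-sha-256
```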

Conclusion

Automating your PostgreSQL HA deployment with Ansible removes the complexity of manual configurations. By combining Patroni's cluster management with etcd's consensus and PgBouncer's connection pooling, you create a resilient, high-performance database infrastructure that is truly production-ready.
