CloudTadaInsights

Lesson 12: Patroni REST API

Learning Objectives

After this lesson, you will be able to:

  • Understand the Patroni REST API and its endpoints
  • Use the REST API for health checks
  • Integrate with load balancers (HAProxy, Nginx)
  • Query cluster status and configuration
  • Implement custom monitoring
  • Secure REST API endpoints

1. REST API Overview

1.1. What is the REST API?

Patroni exposes an HTTP REST API on each node for:

  • 🔍 Health checks: Load balancers check node health
  • 📊 Monitoring: External systems query cluster state
  • ⚙️ Management: Read configuration, cluster topology
  • 🔄 Automation: Integration with CI/CD and orchestration tools

1.2. API Configuration

In patroni.yml:

TEXT
restapi:
  listen: 0.0.0.0:8008        # Listen address and port
  connect_address: 10.0.1.11:8008  # Advertised address
  
  # Optional: Basic authentication
  # authentication:
  #   username: admin
  #   password: secret_password
  
  # Optional: SSL/TLS
  # certfile: /etc/patroni/certs/server.crt
  # keyfile: /etc/patroni/certs/server.key
  # cafile: /etc/patroni/certs/ca.crt

Default port: 8008

1.3. API Endpoints Overview

| Endpoint | Method | Purpose | Use Case |
| --- | --- | --- | --- |
| / | GET | Same check as /primary; the JSON body carries basic node info | Quick primary check |
| /primary or /master | GET | Check if node is primary | LB primary routing |
| /replica | GET | Check if node is replica | LB read routing |
| /read-write | GET | Check if writable (primary) | LB write routing |
| /read-only | GET | Check if node can serve reads (replica or primary) | LB read routing |
| /synchronous | GET | Check if synchronous replica | Sync replica detection |
| /asynchronous | GET | Check if asynchronous replica | Async replica detection |
| /health | GET | Check that PostgreSQL is up and running | Monitoring |
| /patroni | GET | Detailed cluster and node info | Advanced monitoring |
| /config | GET | Cluster configuration from DCS | Config inspection |
| /cluster | GET | All cluster members info | Topology view |
| /history | GET | Failover history | Audit log |

2. Health Check Endpoints

2.1. Basic health check: GET /

Purpose: Shorthand for /primary. Every node answers with a JSON status document, but the status code is 200 only on the primary.

TEXT
curl -s http://10.0.1.11:8008/

# Response on PRIMARY:
# HTTP 200 OK
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "role": "master",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "location": 67108864
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres"
#   }
# }

# Response on REPLICA:
# HTTP 503 Service Unavailable (the JSON body is still returned)
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:31:15.789012+00:00",
#   "role": "replica",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "xlog": {
#     "received_location": 67108864,
#     "replayed_location": 67108864
#   },
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres"
#   }
# }

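Response codes:

  • 200 OK: Node is running as the primary and holds the leader lock
  • 503 Service Unavailable: Node is not the primary (replica, PostgreSQL down, etc.). Use /health to check only whether PostgreSQL is running.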

2.2. Primary check: GET /primary or /master

Purpose: Check if node is current primary/leader.

TEXT
curl -s http://10.0.1.11:8008/primary

# On PRIMARY:
# HTTP 200 OK
# {
#   "state": "running",
#   "role": "master",
#   "xlog": {
#     "location": 67108864
#   }
# }

# On REPLICA:
# HTTP 503 Service Unavailable
# (the JSON status document is still returned, with "role": "replica")

Use case: Load balancer health check for write traffic routing.

2.3. Replica check: GET /replica

Purpose: Check if node is a running replica (standby) that is not tagged noloadbalance.

TEXT
curl -s http://10.0.1.12:8008/replica

# On REPLICA:
# HTTP 200 OK
# {
#   "state": "running",
#   "role": "replica",
#   "xlog": {
#     "received_location": 67108864,
#     "replayed_location": 67108864
#   }
# }

# On PRIMARY:
# HTTP 503 Service Unavailable

Use case: Load balancer health check for read traffic routing.
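
The /primary and /replica checks can be combined to classify a node from status codes alone, without parsing JSON. The following is a minimal Python sketch using only the standard library. The helper names `probe` and `classify_node` and the returned labels are illustrative assumptions, not part of Patroni.

```python
import urllib.error
import urllib.request
from typing import Mapping


def probe(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, so a 503 health-check answer lands here
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def classify_node(status: Mapping[str, int]) -> str:
    """Map {"/primary": code, "/replica": code} to an illustrative label."""
    if status.get("/primary") == 200:
        return "primary"
    if status.get("/replica") == 200:
        return "replica"
    return "unavailable"
```

A caller would build the mapping with something like `{p: probe(f"http://10.0.1.11:8008{p}") for p in ("/primary", "/replica")}` and pass it to `classify_node`.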

2.4. Read-write check: GET /read-write

Purpose: Check if node accepts writes. Patroni treats this as the same check as /primary.

TEXT
curl -s http://10.0.1.11:8008/read-write

# Returns 200 only if:
# - Node is running as the primary
# - Node holds the leader lock

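2.5. Read-only check: GET /read-only

Purpose: Check if node can serve read-only queries. Unlike /replica, this check also succeeds on the primary. Patroni's documentation does not list a /standby alias, so use /read-only.

TEXT
curl -s http://10.0.1.12:8008/read-only

# Returns 200 if:
# - Node is a running replica that passes the /replica check, or
# - Node is the primary
# - Replication lag < threshold (optional, replicas only)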

Advanced: Lag tolerance:

TEXT
# Check replica with max 1MB lag tolerance
curl -s "http://10.0.1.12:8008/read-only?lag=1048576"

# Returns 503 if lag > 1MB
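
When the lag threshold is set from a script, it helps to build the URL in one place so the byte value is computed once. This is a small Python sketch; `lag_check_url` and `mib` are hypothetical helper names. The default endpoint is /replica because Patroni documents the `lag` parameter for /replica and /asynchronous, and the endpoint can be overridden.

```python
from urllib.parse import urlencode


def mib(n: int) -> int:
    """Convert mebibytes to bytes."""
    return n * 1024 * 1024


def lag_check_url(host: str, max_lag_bytes: int,
                  endpoint: str = "/replica", port: int = 8008) -> str:
    """Build a health-check URL that fails once lag exceeds max_lag_bytes."""
    if max_lag_bytes < 0:
        raise ValueError("max_lag_bytes must not be negative")
    query = urlencode({"lag": max_lag_bytes})
    return f"http://{host}:{port}{endpoint}?{query}"


# Same threshold as the curl example: 1 MiB
print(lag_check_url("10.0.1.12", mib(1), endpoint="/read-only"))
```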

2.6. Synchronous replica check: GET /synchronous

Purpose: Check if node is synchronous replica.

TEXT
curl -s http://10.0.1.12:8008/synchronous

# Returns 200 if:
# - Node is replica
# - sync_state = 'sync' (from pg_stat_replication)

2.7. Asynchronous replica check: GET /asynchronous

Purpose: Check if node is asynchronous replica.

TEXT
curl -s http://10.0.1.13:8008/asynchronous

# Returns 200 if:
# - Node is replica
# - sync_state != 'sync'

2.8. Health endpoint: GET /health

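Purpose: Check that PostgreSQL is up and running on the node, regardless of its role. The response body carries the node's detailed status.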

TEXT
curl -s http://10.0.1.11:8008/health | jq

# Response:
# {
#   "state": "running",
#   "role": "master",
#   "server_version": 180000,
#   "cluster_unlocked": false,
#   "timeline": 1,
#   "database_system_identifier": "7001234567890123456",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres",
#     "name": "node1"
#   },
#   "replication": [
#     {
#       "usename": "replicator",
#       "application_name": "node2",
#       "client_addr": "10.0.1.12",
#       "state": "streaming",
#       "sync_state": "sync",
#       "sync_priority": 1
#     },
#     {
#       "usename": "replicator",
#       "application_name": "node3",
#       "client_addr": "10.0.1.13",
#       "state": "streaming",
#       "sync_state": "async",
#       "sync_priority": 0
#     }
#   ]
# }
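
The `replication` array in a response like the one above can be grouped by `sync_state` to see which standbys are synchronous. This Python sketch assumes the field names shown in the sample response. `standbys_by_sync_state` is a hypothetical helper, and `SAMPLE` is a trimmed copy of that response.

```python
from typing import Dict, List

# Trimmed copy of the sample response shown above
SAMPLE = {
    "role": "master",
    "replication": [
        {"application_name": "node2", "state": "streaming", "sync_state": "sync"},
        {"application_name": "node3", "state": "streaming", "sync_state": "async"},
    ],
}


def standbys_by_sync_state(status: dict) -> Dict[str, List[str]]:
    """Group connected standbys by their sync_state value."""
    groups: Dict[str, List[str]] = {}
    for conn in status.get("replication", []):
        groups.setdefault(conn["sync_state"], []).append(conn["application_name"])
    return groups


print(standbys_by_sync_state(SAMPLE))
```

A replica's own response has no `replication` list, so the helper returns an empty dict for it.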

3. Cluster Information Endpoints

3.1. Detailed node info: GET /patroni

Purpose: Comprehensive node and cluster information.

TEXT
curl -s http://10.0.1.11:8008/patroni | jq

# Response (truncated):
# {
#   "state": "running",
#   "postmaster_start_time": "2024-11-25 10:30:00.123456+00:00",
#   "role": "master",
#   "server_version": 180000,
#   "xlog": {
#     "location": 67108864
#   },
#   "timeline": 1,
#   "cluster_unlocked": false,
#   "database_system_identifier": "7001234567890123456",
#   "patroni": {
#     "version": "3.2.0",
#     "scope": "postgres",
#     "name": "node1"
#   },
#   "dcs": {
#     "last_seen": 1700912345,
#     "ttl": 30
#   },
#   "tags": {
#     "nofailover": false,
#     "noloadbalance": false,
#     "clonefrom": false,
#     "nosync": false
#   },
#   "pending_restart": false,
#   "replication": [...],
#   "timeline_history": [...]
# }
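
A monitoring job usually reduces this document to a short list of things worth a human's attention. This Python sketch assumes the field names shown in the sample response. `node_warnings` and its message strings are illustrative, not Patroni output.

```python
from typing import List


def node_warnings(status: dict) -> List[str]:
    """Return human-readable warnings derived from a /patroni response."""
    warnings: List[str] = []
    if status.get("state") != "running":
        warnings.append(f"state is {status.get('state')!r}")
    if status.get("pending_restart"):
        warnings.append("pending restart")
    if status.get("cluster_unlocked"):
        warnings.append("cluster has no leader lock")
    tags = status.get("tags", {})
    for tag in ("nofailover", "noloadbalance", "nosync"):
        if tags.get(tag):
            warnings.append(f"tag {tag} is set")
    return warnings
```

For the sample response above the function returns an empty list, since the node is running, has no pending restart, and all tags are false.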

3.2. Cluster configuration: GET /config

Purpose: Get cluster-wide configuration from DCS.

TEXT
curl -s http://10.0.1.11:8008/config | jq

# Response:
# {
#   "ttl": 30,
#   "loop_wait": 10,
#   "retry_timeout": 10,
#   "maximum_lag_on_failover": 1048576,
#   "synchronous_mode": true,
#   "synchronous_mode_strict": false,
#   "postgresql": {
#     "parameters": {
#       "max_connections": 100,
#       "shared_buffers": "256MB",
#       "wal_level": "replica",
#       "max_wal_senders": 10,
#       "max_replication_slots": 10,
#       "hot_standby": "on"
#     },
#     "use_pg_rewind": true,
#     "use_slots": true
#   }
# }
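
One practical use of /config is to read `maximum_lag_on_failover`, the maximum number of bytes a replica may lag and still take part in leader election. The Python sketch below applies that single rule to a lag value. `may_join_election` is a hypothetical helper and deliberately ignores the other conditions Patroni evaluates, such as tags and timelines.

```python
# Patroni's documented default for maximum_lag_on_failover (1 MB)
DEFAULT_MAX_LAG = 1048576


def may_join_election(replica_lag_bytes: int, config: dict) -> bool:
    """Apply only the maximum_lag_on_failover rule from a /config response."""
    limit = config.get("maximum_lag_on_failover", DEFAULT_MAX_LAG)
    return replica_lag_bytes <= limit


config = {"ttl": 30, "loop_wait": 10, "maximum_lag_on_failover": 1048576}
print(may_join_election(0, config))            # replica fully caught up
print(may_join_election(5 * 1048576, config))  # replica 5 MB behind
```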

3.3. Cluster members: GET /cluster

Purpose: Get information about all cluster members.

TEXT
curl -s http://10.0.1.11:8008/cluster | jq

# Response:
# {
#   "members": [
#     {
#       "name": "node1",
#       "role": "leader",
#       "state": "running",
#       "api_url": "http://10.0.1.11:8008/patroni",
#       "host": "10.0.1.11",
#       "port": 5432,
#       "timeline": 1,
#       "lag": 0
#     },
#     {
#       "name": "node2",
#       "role": "sync_standby",
#       "state": "running",
#       "api_url": "http://10.0.1.12:8008/patroni",
#       "host": "10.0.1.12",
#       "port": 5432,
#       "timeline": 1,
#       "lag": 0
#     },
#     {
#       "name": "node3",
#       "role": "replica",
#       "state": "running",
#       "api_url": "http://10.0.1.13:8008/patroni",
#       "host": "10.0.1.13",
#       "port": 5432,
#       "timeline": 1,
#       "lag": 0
#     }
#   ],
#   "scope": "postgres"
# }
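
Because /cluster describes every member, one request to any node is enough to locate the leader. This Python sketch works on an already-decoded response and assumes the `members`, `role`, `host`, `port`, and `name` fields shown above. `find_leader` and `replica_names` are hypothetical helpers.

```python
from typing import List, Optional, Tuple


def find_leader(cluster: dict) -> Optional[Tuple[str, int]]:
    """Return (host, port) of the member whose role is "leader", if any."""
    for member in cluster.get("members", []):
        if member.get("role") == "leader":
            return member["host"], member["port"]
    return None


def replica_names(cluster: dict) -> List[str]:
    """Return the names of all members that are not the leader."""
    return [m["name"] for m in cluster.get("members", []) if m.get("role") != "leader"]


# Trimmed copy of the sample response shown above
SAMPLE_CLUSTER = {
    "members": [
        {"name": "node1", "role": "leader", "host": "10.0.1.11", "port": 5432},
        {"name": "node2", "role": "sync_standby", "host": "10.0.1.12", "port": 5432},
        {"name": "node3", "role": "replica", "host": "10.0.1.13", "port": 5432},
    ],
    "scope": "postgres",
}
```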

3.4. Failover history: GET /history

Purpose: Get cluster failover/switchover history.

TEXT
curl -s http://10.0.1.11:8008/history | jq

# Response:
# [
#   [
#     1,  // Timeline
#     67108864,  // LSN
#     "no recovery target specified",
#     "2024-11-25T10:30:00+00:00"
#   ],
#   [
#     2,
#     134217728,
#     "no recovery target specified",
#     "2024-11-25T11:45:30+00:00"
#   ]
# ]
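
Each history entry is a positional list, which is easy to misread in scripts. This Python sketch turns entries into named records, assuming the four-field layout shown above (timeline, LSN, reason, timestamp). `TimelineSwitch` and `parse_history` are hypothetical names, and any extra trailing fields an entry may carry are ignored.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class TimelineSwitch:
    timeline: int
    lsn: int
    reason: str
    timestamp: datetime


def parse_history(entries: list) -> List[TimelineSwitch]:
    """Convert /history entries into TimelineSwitch records."""
    switches = []
    for entry in entries:
        # Only the first four positions are used; extra fields are ignored
        timeline, lsn, reason, timestamp = entry[:4]
        switches.append(
            TimelineSwitch(timeline, lsn, reason, datetime.fromisoformat(timestamp))
        )
    return switches


history = parse_history([
    [1, 67108864, "no recovery target specified", "2024-11-25T10:30:00+00:00"],
    [2, 134217728, "no recovery target specified", "2024-11-25T11:45:30+00:00"],
])
print(history[-1].timeline, history[-1].timestamp.isoformat())
```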

4. Load Balancer Integration

4.1. HAProxy configuration

haproxy.cfg:

TEXT
global
    log /dev/log local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

# Stats page
listen stats
    bind *:7000
    stats enable
    stats uri /stats
    stats refresh 10s
    stats auth admin:password

# Primary/Write endpoint
listen postgres-primary
    bind *:5000
    mode tcp
    option tcplog
    
    # Health check via Patroni REST API (sent to port 8008 by "check port" below).
    # An HTTP check matches on the status code, so it works with whichever
    # HTTP version Patroni answers with.
    option httpchk GET /primary
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Replicas/Read-only endpoint
listen postgres-replicas
    bind *:5001
    mode tcp
    option tcplog
    balance roundrobin
    
    # Health check via Patroni REST API
    option httpchk GET /replica
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Read-write endpoint (primary only)
listen postgres-read-write
    bind *:5002
    mode tcp
    option tcplog
    
    option httpchk GET /read-write
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

# Read-only endpoint (replicas, plus the primary, which also passes /read-only)
listen postgres-read-only
    bind *:5003
    mode tcp
    option tcplog
    balance leastconn
    
    option httpchk GET /read-only
    http-check expect status 200
    
    default-server inter 3s fall 3 rise 2
    
    server node1 10.0.1.11:5432 check port 8008
    server node2 10.0.1.12:5432 check port 8008
    server node3 10.0.1.13:5432 check port 8008

Install and start HAProxy:

TEXT
# Install
sudo apt install -y haproxy

# Configure
sudo nano /etc/haproxy/haproxy.cfg
# (paste config above)

# Validate config
sudo haproxy -c -f /etc/haproxy/haproxy.cfg

# Start
sudo systemctl restart haproxy
sudo systemctl enable haproxy

# Check status
sudo systemctl status haproxy

Test HAProxy:

TEXT
# Connect to primary (port 5000)
psql -h haproxy_host -p 5000 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# Should return: f (false = primary)

# Connect to replica (port 5001)
psql -h haproxy_host -p 5001 -U app_user -d myapp -c "SELECT pg_is_in_recovery();"
# Should return: t (true = replica)

# View HAProxy stats
curl http://haproxy_host:7000/stats
# Or open in browser: http://haproxy_host:7000/stats

4.2. Nginx (with stream module)

nginx.conf:

TEXT
stream {
    # Upstream for primary
    upstream postgres_primary {
        least_conn;
        server 10.0.1.11:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.12:5432 max_fails=3 fail_timeout=10s backup;
        server 10.0.1.13:5432 max_fails=3 fail_timeout=10s backup;
    }
    
    # Upstream for replicas
    upstream postgres_replicas {
        least_conn;
        server 10.0.1.11:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.12:5432 max_fails=3 fail_timeout=10s;
        server 10.0.1.13:5432 max_fails=3 fail_timeout=10s;
    }
    
    # Primary endpoint
    server {
        listen 5000;
        proxy_pass postgres_primary;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }
    
    # Replicas endpoint
    server {
        listen 5001;
        proxy_pass postgres_replicas;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }
}

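Note: The Nginx stream module cannot run HTTP health checks against Patroni directly, so the upstreams above cannot tell which node is the primary. Either maintain the upstream list with an external script or use HAProxy instead.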
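
One way to build the external script is to ask Patroni which node is the primary and regenerate the upstream block from the answer. The Python sketch below covers only the rendering step. `render_primary_upstream` is a hypothetical helper, and writing the result to an included file and reloading Nginx are left to the caller.

```python
def render_primary_upstream(primary_host: str, port: int = 5432,
                            name: str = "postgres_primary") -> str:
    """Render an Nginx stream upstream block that points at one node."""
    return (
        f"upstream {name} {{\n"
        f"    server {primary_host}:{port};\n"
        f"}}\n"
    )


print(render_primary_upstream("10.0.1.11"))
```

A periodic job could compare the rendered text with the file on disk and reload Nginx only when the primary has changed.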

4.3. Health check script for external LB

Script for cloud load balancers (AWS ALB, GCP LB, etc.):

TEXT
#!/bin/bash
# /usr/local/bin/patroni_health_check.sh

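NODE_IP="$1"
PORT="${2:-8008}"
ENDPOINT="${3:-/primary}"  # or /replica

if [ -z "$NODE_IP" ]; then
    echo "Usage: $0 <node_ip> [port] [endpoint]" >&2
    exit 2
fi

# curl prints 000 when the node is unreachable; "|| true" keeps the script
# running so that case is reported as unhealthy below
RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 "http://${NODE_IP}:${PORT}${ENDPOINT}" || true)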

if [ "$RESPONSE" = "200" ]; then
    echo "Healthy"
    exit 0
else
    echo "Unhealthy (HTTP $RESPONSE)"
    exit 1
fi

Usage:

TEXT
# Check if node is primary
./patroni_health_check.sh 10.0.1.11 8008 /primary

# Check if node is replica
./patroni_health_check.sh 10.0.1.12 8008 /replica

5. Monitoring Integration

5.1. Prometheus exporter

Use postgres_exporter with custom queries:

TEXT
# Install postgres_exporter
wget https://github.com/prometheus-community/postgres_exporter/releases/download/v0.15.0/postgres_exporter-0.15.0.linux-amd64.tar.gz
tar -xzf postgres_exporter-0.15.0.linux-amd64.tar.gz
sudo mv postgres_exporter-0.15.0.linux-amd64/postgres_exporter /usr/local/bin/

# Create systemd service
sudo tee /etc/systemd/system/postgres_exporter.service > /dev/null << EOF
[Unit]
Description=PostgreSQL Exporter
After=network.target

[Service]
Type=simple
User=postgres
Environment="DATA_SOURCE_NAME=postgresql://exporter:password@localhost:5432/postgres?sslmode=disable"
ExecStart=/usr/local/bin/postgres_exporter
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl start postgres_exporter
sudo systemctl enable postgres_exporter

Custom query for Patroni metrics:

TEXT
# /etc/postgres_exporter/queries.yaml

patroni_info:
  query: |
    SELECT 
      CASE WHEN pg_is_in_recovery() THEN 'replica' ELSE 'primary' END as role,
      1 as value
  metrics:
    - role:
        usage: "LABEL"
        description: "PostgreSQL role"
    - value:
        usage: "GAUGE"
        description: "Node role indicator"

5.2. Custom monitoring script

Python script using REST API:

TEXT
#!/usr/bin/env python3
# /usr/local/bin/patroni_monitor.py

import json
import sys

import requests

NODES = [
    "http://10.0.1.11:8008",
    "http://10.0.1.12:8008",
    "http://10.0.1.13:8008"
]

# Older Patroni releases report the primary as "master", newer ones as "primary"
PRIMARY_ROLES = ("master", "primary")

def check_cluster():
    results = []
    
    for node_url in NODES:
        try:
            response = requests.get(f"{node_url}/patroni", timeout=5)
            data = response.json()
            xlog = data.get("xlog", {})
            
            results.append({
                "node": data["patroni"]["name"],
                "role": data["role"],
                "state": data["state"],
                "timeline": data.get("timeline"),
                # WAL position in bytes: "location" on the primary,
                # "replayed_location" on a replica
                "position": xlog.get("location", xlog.get("replayed_location", 0))
            })
        except Exception as e:
            print(f"Error checking {node_url}: {e}", file=sys.stderr)
            results.append({
                "node": node_url,
                "role": "unknown",
                "state": "unreachable",
                "error": str(e)
            })
    
    # Lag in bytes = primary WAL position minus the node's own position
    primary_positions = [n["position"] for n in results if n.get("role") in PRIMARY_ROLES]
    for n in results:
        if "position" in n:
            position = n.pop("position")
            n["lag"] = max(primary_positions[0] - position, 0) if primary_positions else None
    
    return results

def main():
    cluster_status = check_cluster()
    
    print(json.dumps(cluster_status, indent=2))
    
    # Check if we have exactly one leader
    leaders = [n for n in cluster_status if n.get("role") in PRIMARY_ROLES]
    
    if len(leaders) != 1:
        print(f"ERROR: Expected 1 leader, found {len(leaders)}", file=sys.stderr)
        sys.exit(1)
    
    # Check all nodes reachable
    unreachable = [n for n in cluster_status if n.get("state") == "unreachable"]
    
    if unreachable:
        print(f"WARNING: {len(unreachable)} nodes unreachable", file=sys.stderr)
        sys.exit(1)
    
    print("Cluster is healthy")
    sys.exit(0)

if __name__ == "__main__":
    main()

Run monitoring:

TEXT
python3 /usr/local/bin/patroni_monitor.py

# Output:
# [
#   {
#     "node": "node1",
#     "role": "master",
#     "state": "running",
#     "timeline": 1,
#     "lag": 0
#   },
#   {
#     "node": "node2",
#     "role": "replica",
#     "state": "running",
#     "timeline": 1,
#     "lag": 0
#   },
#   {
#     "node": "node3",
#     "role": "replica",
#     "state": "running",
#     "timeline": 1,
#     "lag": 0
#   }
# ]
# Cluster is healthy

5.3. Grafana dashboard query examples

PromQL queries:

TEXT
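# Node role (postgres_exporter names a custom-query metric <namespace>_<column>,
# so the patroni_info query with column "value" is exposed as patroni_info_value)
patroni_info_value{role="primary"}

# Number of replicas
count(patroni_info_value{role="replica"})

# The metric names below are not produced by the patroni_info query.
# They assume matching custom queries or exporter metrics exist; check the
# names your exporter actually exposes before building panels on them.

# Replication lag
pg_stat_replication_replay_lag_seconds

# Timeline
patroni_timeline

# Synchronous replica status
patroni_sync_state{sync_state="sync"}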

6. Secure REST API

6.1. Enable authentication

In patroni.yml:

TEXT
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.11:8008
  
  # Basic authentication
  authentication:
    username: admin
    password: secure_password_here

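Patroni's basic authentication protects the unsafe endpoints (POST, PUT, PATCH, DELETE). GET health checks keep working without credentials, so load balancer checks are not affected.

Access with authentication: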

TEXT
# Using curl
curl -u admin:secure_password_here http://10.0.1.11:8008/patroni

# Or with header
curl -H "Authorization: Basic $(echo -n admin:secure_password_here | base64)" \
  http://10.0.1.11:8008/patroni
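
The same header can be built in a monitoring script. This is a minimal Python sketch of HTTP Basic authentication using only the standard library. `basic_auth_header` is a hypothetical helper, and the credentials are the placeholder values from the example above.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Return the Authorization header for HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("admin", "secure_password_here")
print(headers["Authorization"])
```

The resulting dict can be passed as the `headers` argument of most HTTP clients, for example `requests.get(url, headers=headers)`.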

6.2. Enable SSL/TLS

Generate certificates:

TEXT
# Create CA
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 3650 -key ca.key -out ca.crt \
  -subj "/CN=Patroni-CA"

# Create server certificate
openssl genrsa -out server.key 4096
openssl req -new -key server.key -out server.csr \
  -subj "/CN=node1.example.com"

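# Sign with CA; the subjectAltName lets clients verify the node by hostname or IP
openssl x509 -req -days 365 -in server.csr -CA ca.crt -CAkey ca.key \
  -set_serial 01 -out server.crt \
  -extfile <(printf "subjectAltName=DNS:node1.example.com,IP:10.0.1.11")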

# Set permissions
sudo chown postgres:postgres server.key server.crt ca.crt
sudo chmod 600 server.key

Configure in patroni.yml:

TEXT
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.11:8008
  
  certfile: /etc/patroni/certs/server.crt
  keyfile: /etc/patroni/certs/server.key
  cafile: /etc/patroni/certs/ca.crt
  
  # Optional: Require client certificates
  # verify_client: required
  
  authentication:
    username: admin
    password: secure_password_here

Access with HTTPS:

TEXT
# -k skips certificate verification; use it for quick tests only
curl -k -u admin:secure_password_here https://10.0.1.11:8008/patroni

# Or with CA certificate
curl --cacert /etc/patroni/certs/ca.crt \
  -u admin:secure_password_here \
  https://10.0.1.11:8008/patroni

6.3. Firewall rules

TEXT
# Allow REST API only from specific IPs
sudo ufw allow from 10.0.1.0/24 to any port 8008
sudo ufw allow from <load_balancer_ip> to any port 8008
sudo ufw allow from <monitoring_server_ip> to any port 8008

# Deny from everywhere else
sudo ufw deny 8008

7. Advanced REST API Usage

7.1. Scripted failover check

TEXT
#!/bin/bash
# Check if failover is safe

CLUSTER_URL="http://10.0.1.11:8008/cluster"

# Get cluster info
CLUSTER_DATA=$(curl -s "$CLUSTER_URL")

# Count healthy replicas (newer Patroni releases report a healthy replica as "streaming")
HEALTHY_REPLICAS=$(echo "$CLUSTER_DATA" | jq '[.members[] | select(.role != "leader" and (.state == "running" or .state == "streaming"))] | length')

if [ "$HEALTHY_REPLICAS" -ge 1 ]; then
    echo "Safe to failover: $HEALTHY_REPLICAS healthy replicas"
    exit 0
else
    echo "NOT safe to failover: only $HEALTHY_REPLICAS healthy replicas"
    exit 1
fi

7.2. Get primary endpoint dynamically

TEXT
#!/bin/bash
# Get current primary IP:port

get_primary() {
    for NODE in 10.0.1.11 10.0.1.12 10.0.1.13; do
        RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" "http://${NODE}:8008/primary")
        if [ "$RESPONSE" = "200" ]; then
            echo "${NODE}:5432"
            return 0
        fi
    done
    echo "No primary found" >&2
    return 1
}

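# Stop here if no node reported itself as primary
if ! PRIMARY=$(get_primary); then
    exit 1
fi
echo "Current primary: $PRIMARY"

# Use in connection string (split host:port with shell parameter expansion)
psql "host=${PRIMARY%:*} port=${PRIMARY#*:} user=app_user dbname=myapp"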

7.3. Monitor replication lag

TEXT
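#!/bin/bash
# Alert if replication lag > threshold

THRESHOLD_MB=100
THRESHOLD_BYTES=$((THRESHOLD_MB * 1024 * 1024))

# /cluster reports each member's lag in bytes, so one request to any node is enough
curl -s --max-time 5 "http://10.0.1.11:8008/cluster" \
  | jq -r '.members[] | select(.role != "leader") | "\(.name) \(.lag)"' \
  | while read -r NAME LAG; do
        if ! [[ "$LAG" =~ ^[0-9]+$ ]]; then
            # A non-numeric lag means Patroni could not determine it
            echo "ALERT: Node $NAME replication lag is unknown"
        elif [ "$LAG" -gt "$THRESHOLD_BYTES" ]; then
            echo "ALERT: Node $NAME replication lag > ${THRESHOLD_MB}MB"
            # Send notification
        fi
    done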

8. Lab Exercises

Lab 1: Explore REST API endpoints

Tasks:

  1. Query all endpoints on each node
  2. Compare responses between primary and replicas
  3. Identify which endpoint returns 200 on primary vs replica

TEXT
# Test script
for ENDPOINT in / /primary /replica /read-write /read-only /health /patroni; do
    echo "=== $ENDPOINT ==="
    for NODE in 10.0.1.11 10.0.1.12 10.0.1.13; do
        HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "http://${NODE}:8008${ENDPOINT}")
        echo "  Node $NODE: $HTTP_CODE"
    done
done

Lab 2: Setup HAProxy

Tasks:

  1. Install HAProxy
  2. Configure with Patroni health checks
  3. Test write traffic goes to primary only
  4. Test read traffic distributed to replicas
  5. Trigger failover, verify HAProxy redirects automatically

Lab 3: Create monitoring dashboard

Tasks:

  1. Write Python script to query all nodes
  2. Display cluster topology
  3. Show replication lag
  4. Highlight current primary
  5. Run every 5 seconds

Lab 4: Secure REST API

Tasks:

  1. Enable basic authentication
  2. Generate SSL certificates
  3. Configure HTTPS
  4. Update curl commands to use auth + SSL
  5. Configure firewall rules

9. Troubleshooting REST API

9.1. REST API not responding

Check:

TEXT
# 1. Verify Patroni is running
sudo systemctl status patroni

# 2. Check if port is listening
sudo netstat -tlnp | grep 8008

# 3. Check firewall
sudo ufw status | grep 8008

# 4. Test locally
curl http://localhost:8008/

# 5. Check logs
sudo journalctl -u patroni -n 50 | grep -i rest

9.2. Wrong HTTP codes returned

Debug:

TEXT
# Get detailed response
curl -v http://10.0.1.11:8008/primary

# Check PostgreSQL status
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

# Check Patroni sees correct role
patronictl list

9.3. SSL/TLS errors

Check:

TEXT
# Verify certificate
openssl x509 -in /etc/patroni/certs/server.crt -text -noout

# Check certificate matches key
openssl x509 -modulus -noout -in server.crt | md5sum
openssl rsa -modulus -noout -in server.key | md5sum
# Should match

# Test SSL connection
openssl s_client -connect 10.0.1.11:8008 -CAfile ca.crt

10. Summary

Key Endpoints Summary

| Endpoint | Returns 200 When | Use Case |
| --- | --- | --- |
| /primary | Node is primary | LB write routing |
| /replica | Node is replica | LB read routing |
| /read-write | Node accepts writes | Write endpoint |
| /read-only | Node can serve reads (replica or primary) | Read endpoint |
| /health | PostgreSQL is up and running | Detailed monitoring |
| /patroni | Always (detailed info) | Advanced monitoring |
| /cluster | Always (all members) | Topology view |

Integration Checklist

  •  REST API accessible from all nodes
  •  HAProxy configured with health checks
  •  Monitoring system queries REST API
  •  Authentication enabled
  •  SSL/TLS configured (production)
  •  Firewall rules configured
  •  Health check scripts tested

Current architecture

TEXT
✅ 3 VMs prepared (Lesson 4)
✅ PostgreSQL 18 installed (Lesson 5)
✅ etcd cluster running (Lesson 6)
✅ Patroni installed (Lesson 7)
✅ Patroni configured (Lesson 8)
✅ Cluster bootstrapped (Lesson 9)
✅ Replication configured (Lesson 10)
✅ Callbacks implemented (Lesson 11)
✅ REST API integrated (Lesson 12)

Next: Failover management

Preparing for Lesson 13

Lesson 13 covers failover and switchover:

  • Automatic failover process
  • Manual switchover
  • Failover scenarios and testing
  • DCS role in leader election
  • Minimize downtime strategies
