DB Failover Installation
Overview
This installation guide shows the steps to configure database replication management and failover via a tool known as RepMgr. This configuration assumes that the DB Installation tasks have already been completed.
This example runs through the configuration of a PostgreSQL replication cluster consisting of three separate nodes:
- A primary SMS database.
- A secondary SMS database.
- An application services node.
In a production deployment it is expected that more than one application services node will be deployed. Actions relating to the application services node will need to be performed on each instance.
It is assumed that each N2ACD environment will deploy both a primary and secondary SMS node, and N+1 application service nodes in a geographically redundant setup. Only the SMS nodes will be configured to be elected as the primary in the replica set.
Overall Installation Steps
The high-level steps for installing and deploying PostgreSQL database instances are:
- Install the RepMgr packages.
- Configure PostgreSQL and RepMgr on each node. Additional configuration is made to the primary SMS node.
- Deploy each PostgreSQL replica set via RepMgr.
- Enable PostgreSQL replica management and fail over via RepMgr. If PostgreSQL is installed from the Linux distribution’s main source repositories, this will automatically be done.
Path and Service Name Usage
Due to the variety of ways that PostgreSQL and RepMgr can be installed, and support for multiple individual deployments of N-Squared software on individual nodes, naming conventions in this configuration page are kept generic.
- `repmgr13` is used as the service name for RepMgr (for PostgreSQL 13) as it is the default service name.
  - If a later version of PostgreSQL is being used, `repmgr13` will need to be replaced with `repmgrNN`, where `NN` is the version number.
  - If multiple database clusters are installed on a single machine, multiple services will be necessary and the systemd AT (`@`) feature can be used to achieve this.
- `postgresql-13` is used as the service name for PostgreSQL 13. As with RepMgr, the `@` feature of systemd can be used for multiple PostgreSQL instances.
- If PostgreSQL is installed via RHEL streams, the postgres binary directory will be `/usr/bin`. However, if installed from the official PostgreSQL yum repository this will be `/usr/pgsql-13/bin`. For clarity, `/usr/pgsql-13/bin` is used for the binary directory of PostgreSQL.
- Similarly, the PostgreSQL instance data directory (often referred to as `PGDATA`) is referred to as `/var/lib/pgsql/13/data`.
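These conventions can be collected into a few shell variables derived from the PostgreSQL major version. The sketch below is illustrative only; the rest of this guide substitutes the concrete names by hand.

```shell
# Sketch: derive the generic names used in this guide from the
# PostgreSQL major version. Adjust PGVER for your installation.
PGVER=13
PG_SERVICE="postgresql-${PGVER}"     # PostgreSQL systemd service name
REPMGR_SERVICE="repmgr${PGVER}"      # RepMgr systemd service name
PG_BINDIR="/usr/pgsql-${PGVER}/bin"  # official PostgreSQL yum repo layout
PGDATA="/var/lib/pgsql/${PGVER}/data"

echo "${PG_SERVICE} ${REPMGR_SERVICE} ${PG_BINDIR} ${PGDATA}"
```

Remember that a RHEL-streams install places binaries in `/usr/bin` instead, in which case `PG_BINDIR` would differ.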
Software Install
RepMgr Repository Configuration (All Nodes)
RepMgr can be installed via the N-Squared Yum repository, or from the official PostgreSQL yum repository. If using the official PostgreSQL repository, skip this step.
Add the N-Squared distribution repository alongside the existing N-Squared delivery repository:
sudo nano /etc/yum.repos.d/nsquared.repo
Add the following, replacing the username and password as provided by N-Squared.
[2ndquadrant-dl-default-release-pg13]
gpgcheck=1
enabled=1
name=2ndquadrant-dl-default-release-pg13 via N-Squared Software Distributions
baseurl=https://username:password@artefacts.nsquared.nz/yum/repo/distributions/el8/2ndquadrant-dl-default-release-pg13
Clean and update:
sudo dnf clean all
sudo dnf update
RepMgr Installation (All Nodes)
Install the correct RepMgr version for your postgres version. E.g. for PostgreSQL 13:
sudo dnf install repmgr13
N-Squared “PGAUX” Installation (Primary & Secondary SMS Nodes)
N-Squared provide several enhancements to the RepMgr package in a separate n2pg-aux
package. This is available from the same repository as the N2ACD packages.
Install this package on the primary and secondary SMS nodes:
sudo dnf install n2pg-aux
Configuration
Setup PostgreSQL Path (All Nodes)
If PostgreSQL is installed from the official PostgreSQL yum repository, the PostgreSQL binaries will not be automatically added to the path. To make using the PostgreSQL and RepMgr binaries easier, add the following to `/etc/profile`
and restart the current SSH session. E.g.:
PATH=$PATH:/usr/pgsql-13/bin
Log Management
Set up log rotation for the RepMgr service. Logs will be written to: /var/log/repmgr/repmgrd.log
Create a new log rotate configuration. As root:
sudo nano /etc/logrotate.d/repmgrd
Add the content:
/var/log/repmgr/repmgrd.log {
missingok
compress
rotate 52
maxsize 100M
weekly
create 0600 postgres postgres
postrotate
/usr/bin/killall -HUP repmgrd
endscript
}
Where multiple `repmgrd` instances are configured to manage multiple PostgreSQL clusters, create individual logrotate configuration files referencing the individual log files.
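A per-instance setup can be sketched as follows. The instance names `main` and `audit` are hypothetical, and the files are written to `/tmp` for illustration; in practice they would go in `/etc/logrotate.d/`. Note that `killall -HUP repmgrd` signals every `repmgrd` process, which is harmless for log reopening.

```shell
# Sketch: generate one logrotate file per hypothetical repmgrd
# instance, each referencing that instance's own log file.
for inst in main audit; do
  cat > "/tmp/logrotate-repmgrd-${inst}" <<EOF
/var/log/repmgr/repmgrd-${inst}.log {
    missingok
    compress
    rotate 52
    maxsize 100M
    weekly
    create 0600 postgres postgres
    postrotate
        /usr/bin/killall -HUP repmgrd
    endscript
}
EOF
done
```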
RepMgr Configuration (All Nodes)
Create the RepMgr configuration. As root:
sudo nano /etc/repmgr/13/repmgr.conf
Create this file on each node in the N2ACD cluster with the following configuration template, replacing the <...>
variables as appropriate for each node:
# Base configuration.
node_id = <unique node ID>
node_name = '<local nodename>'
location = '<geographic location>'
conninfo = 'host=<local hostname> user=repmgr dbname=repmgr connect_timeout=5'
data_directory = '/var/lib/pgsql/13/data'
log_file = '/var/log/repmgr/repmgrd.log'
pg_bindir = '/usr/pgsql-13/bin'
use_replication_slots = true
# Automatic failover configuration.
promote_command = '/usr/pgsql-13/bin/repmgr standby promote -f /etc/repmgr/13/repmgr.conf --log-to-file'
follow_command = '/usr/pgsql-13/bin/repmgr standby follow -f /etc/repmgr/13/repmgr.conf --log-to-file --upstream-node-id=%n'
failover = automatic
priority = <100 or 0>
standby_follow_restart = false
primary_visibility_consensus = true
standby_disconnect_on_failover = true
service_start_command = 'sudo service postgresql-13 start'
service_stop_command = 'sudo service postgresql-13 stop'
service_restart_command = 'sudo service postgresql-13 restart'
service_reload_command = 'sudo service postgresql-13 reload'
reconnect_attempts = 6
reconnect_interval = 10
Where:
- `<unique node ID>` is a unique identifier for the node. This can be any integer value as long as it is unique across the cluster.
- `<local nodename>` is the name of the node. This can be any string value as long as it is unique across the cluster.
- `<geographic location>` is the geographic location of the node. This can be any string value but should reflect the geographic location of the node. Generally a N2ACD deployment will have a service node and a SMS node at each geographic location. Grouping nodes into geographic locations allows RepMgr to avoid split-brain errors by failing over the primary to the secondary node only if the primary can be determined as failed by another node at the same location.
- `<local hostname>` is part of the connection information for the node. `<local hostname>` should be the hostname or IP address used to connect to the PostgreSQL instance on the local node.
- Update `/var/lib/pgsql/13/data` to the correct path to the data directory of the database instance being set up for replication and RepMgr cluster management.
- Change `/usr/pgsql-13/bin` to the correct path for the version and RPM source of the running postgres database instance.
- `priority` should be set to 100 for the primary or secondary SMS node, and 0 for all service nodes.
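Filling in the template can be sketched as below. The node ID, name, and location values are illustrative only, and the generated file is abridged to the per-node fields (the fixed failover settings from the template above would follow unchanged).

```shell
# Sketch: substitute per-node values into the repmgr.conf template.
# All values below are illustrative; replace with this node's details.
NODE_ID=1
NODE_NAME="pg-node-a"
LOCATION="Location A"
PRIORITY=100   # 100 for SMS nodes, 0 for service nodes

cat > /tmp/repmgr.conf <<EOF
node_id = ${NODE_ID}
node_name = '${NODE_NAME}'
location = '${LOCATION}'
conninfo = 'host=${NODE_NAME} user=repmgr dbname=repmgr connect_timeout=5'
data_directory = '/var/lib/pgsql/13/data'
priority = ${PRIORITY}
EOF
```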
RepMgr PostgreSQL Management (All Nodes)
In order for RepMgr to manage PostgreSQL instances, `sudo` permissions will need to be granted. The permissions granted will be restricted to stopping, starting, restarting, and reloading PostgreSQL DB instances.
Set up `sudoers` access for the `postgres` user so that it can manage its own PostgreSQL instance.
sudo nano /etc/sudoers
Add the following content to the bottom of the file:
Defaults:postgres !requiretty
postgres ALL=(ALL) NOPASSWD: /usr/sbin/service postgresql-13 stop, \
/usr/sbin/service postgresql-13 start, \
/usr/sbin/service postgresql-13 reload, \
/usr/sbin/service postgresql-13 restart
Note that postgresql-13
is the service name, and may need to be adjusted.
RepMgr PostgreSQL Access (Primary Node Only)
The prerequisite for this step is that the PostgreSQL instance being configured is available and running. If the relevant PostgreSQL instance is not yet configured and started, perform those steps as documented in the DB Node Installation guide.
Configure the PostgreSQL access configuration; this file will contain the addresses of all nodes once replication is enabled.
sudo nano /var/lib/pgsql/13/data/pg_hba.conf
In this file, add the following two rows for each node that will be part of the cluster:
host replication repmgr <hostname> trust
host repmgr repmgr <hostname> trust
Where:
- Each of these lines is duplicated for each node - both the primary and secondary SMS, and each service node.
- `<hostname>` is the hostname of the node.
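Emitting the rows for every node can be sketched as a short loop. The hostnames below are illustrative; the output would be appended to `pg_hba.conf` (written to `/tmp` here for demonstration).

```shell
# Sketch: generate the two pg_hba.conf rows for each cluster node.
# Hostnames are illustrative only.
HOSTS="pg-node-a pg-node-b pg-svc-01"
for h in $HOSTS; do
  printf 'host replication repmgr %s trust\n' "$h"
  printf 'host repmgr repmgr %s trust\n' "$h"
done > /tmp/pg_hba_repmgr_rows
```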
Then configure the replication user. RepMgr will create a schema called repmgr
to be used to store its metadata information.
su - postgres
createuser -s repmgr
createdb repmgr -O repmgr
Then connect to the postgres instance as the maintenance user (as the user postgres run psql
) and execute:
ALTER USER repmgr SET search_path TO repmgr, "$user", public;
PostgreSQL Replication Configuration (Primary Node Only)
Configure the primary node PostgreSQL instance for replication. Edit the instance configuration file:
sudo nano /var/lib/pgsql/13/data/postgresql.conf
Configure the following options as appropriate:
| Field | Value | Notes |
|---|---|---|
| `listen_addresses` | `localhost,<host1>,<host2>,etc` | Configure as a list of all nodes that are in the replication set. This is deployed automatically to all nodes as RepMgr/Postgres replicates this file to each node as part of the clone step. Alternatively, configure a hostname such as `postgres-local` that has a local IP on each host by using the `/etc/hosts` file to change the IP on each node. |
| `port` | `5432` | The listen port for this DB instance. |
| `password_encryption` | `scram-sha-256` | Enables a secure password mechanism using encrypted passwords rather than plain-text MD5 hashes. Note: This must be set before any users are created and must be set on duplicated instances, as the default is `md5`. |
| `max_wal_senders` | `10` | This value should be configured based on the number of connected replicas. Rule of thumb: ((Number of Nodes) * 2) + 2 |
| `wal_level` | `replica` | |
| `hot_standby` | `on` | |
| `archive_mode` | `on` | |
| `archive_command` | `'true'` | |
| `shared_preload_libraries` | `'repmgr'` | This enables the shared preload library that allows the `repmgrd` daemon process to access and manage the PostgreSQL instances. |
| `wal_log_hints` | `on` | |
| `max_replication_slots` | `4` | The number of active replication slots to allow. Rule of thumb: (Number of Nodes) + 1 |
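The two rule-of-thumb sizings can be checked with a quick calculation. For the three-node cluster described in this guide (two SMS nodes plus one service node):

```shell
# Sketch: compute the rule-of-thumb values for a given cluster size.
NODES=3
MAX_WAL_SENDERS=$(( NODES * 2 + 2 ))       # ((nodes) * 2) + 2
MAX_REPLICATION_SLOTS=$(( NODES + 1 ))     # (nodes) + 1
echo "max_wal_senders=${MAX_WAL_SENDERS} max_replication_slots=${MAX_REPLICATION_SLOTS}"
```

For three nodes this gives `max_wal_senders=8` and `max_replication_slots=4`, comfortably within the `10` and `4` configured above.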
Primary SMS Node Registration with RepMgr
With all configuration complete, the primary node can now be registered with RepMgr.
- Note: Any time the `repmgr` command is run, use the `-f /path/to/conf/file` option to have repmgr use the correct configuration file.
- Note: All `repmgr` command executions must be performed as the `postgres` user and cannot be executed as `root`.
su - postgres
Register the instance as the cluster primary:
repmgr -f /etc/repmgr/13/repmgr.conf primary register
Confirm registration using the `cluster show` and `service status` commands:
repmgr -f /etc/repmgr/13/repmgr.conf cluster show
This will generate output, for example:
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+------------+---------+-----------+-----------+------------+----------+----------+-----------------------------------------------------------------------
1 | pg-node-a | primary | * running | | Location A | 100 | 10 | host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
repmgr -f /etc/repmgr/13/repmgr.conf service status
This will generate output, for example:
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+------------+---------+-----------+-----------+---------+--------+---------+--------------------
1 | pg-node-a | primary | * running | | running | 745513 | no | n/a
Setup PostgreSQL DB Instance (Secondary SMS Node)
With the primary node configured with RepMgr, the secondary SMS node can be cloned and configured. All PostgreSQL initialization tasks will be performed by RepMgr.
Preparation
Validate that the data directories are configured for syncing:
- Check that the target data directory exists.
- Check that the target data directory is owned by the `postgres` user.
- Check that the target data directory has permissions set to `0700`.
For example:
[postgres@test-sms-02]$ ls -larth /var/lib/pgsql/13/
drwx------. 20 postgres postgres 4.0K Feb 12 00:00 data
If no data directory exists it can be created. As the user root:
mkdir /var/lib/pgsql/13/data
chown -R postgres:postgres /var/lib/pgsql/13/data
chmod 700 /var/lib/pgsql/13/data
semanage fcontext -a -t postgresql_db_t "/var/lib/pgsql/13/data(/.*)?"
restorecon -R -v /var/lib/pgsql/13/data
Note: The correct directory (as configured in the repmgr.conf
file) should be used in these actions.
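The preparation checks above can be sketched as a small validation function. In practice it would be run against the real data directory with owner `postgres`; the demonstration below uses a throwaway directory owned by the current user.

```shell
# Sketch: validate that a data directory is ready for cloning
# (exists, mode 0700, owned by the expected user).
check_pgdata() {
  dir="$1"; owner="$2"
  [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
  [ "$(stat -c '%a' "$dir")" = "700" ] || { echo "bad mode"; return 1; }
  [ "$(stat -c '%U' "$dir")" = "$owner" ] || { echo "bad owner"; return 1; }
  echo "ok"
}

# Demonstration against a throwaway directory owned by the current user:
mkdir -p /tmp/pgdata-demo && chmod 700 /tmp/pgdata-demo
check_pgdata /tmp/pgdata-demo "$(id -un)"
```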
Ensure the PostgreSQL service is enabled and correct permissions are granted. As the user root:
systemctl enable postgresql-13
sudo semanage port -a -t postgresql_port_t -p tcp 5432
firewall-cmd --zone=public --add-port=5432/tcp --permanent
firewall-cmd --reload
Use the correct service name for the version of postgres installed. Note: The service is not started at this stage. RepMgr will start it for us.
RepMgr Clone of Primary Node
Check connectivity to the primary node. As the user postgres:
repmgr -h <primary node> -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone --dry-run
Where:
<primary node>
is the hostname of the primary node to be cloned.
Perform the database sync from the primary node. As the user postgres:
repmgr -h <primary node> -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone
Start the PostgreSQL instance on the secondary node. As the user root:
service postgresql-13 start
Register the secondary node into the cluster. As the user postgres:
repmgr -f /etc/repmgr/13/repmgr.conf standby register
Checking the cluster status, both the primary and secondary nodes should be visible. As the user postgres:
repmgr -f /etc/repmgr/13/repmgr.conf cluster show
This will give output similar to:
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+------------+---------+-----------+-----------+------------+----------+----------+-----------------------------------------------------------------------
1 | pg-node-a | primary | * running | | Location A | 100 | 10 | host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
3 | pg-node-b | standby | running | pg-node-a | Location B | 100 | 10 | host=pg-node-b port=5432 user=repmgr dbname=repmgr connect_timeout=5
Setup PostgreSQL DB Instance (Service Nodes)
Repeating the steps taken for the secondary SMS node as described in the section above, each of the service nodes can be integrated into the cluster via RepMgr.
RepMgr Daemon Process (All Nodes)
RepMgr includes a daemon process that is responsible for monitoring the cluster health and initiating failover tasks. Once the cluster is initialized, configure the RepMgr daemon process to start.
Enable and start the daemon process:
systemctl enable repmgr13
systemctl start repmgr13
The RepMgr service status can be checked with:
repmgr -f /etc/repmgr/13/repmgr.conf service status
Output logs for the daemon process can be found in /var/log/repmgr
.
Enable PostgreSQL Fencing on Primary and Secondary Nodes
To avoid split-brain scenarios where two PostgreSQL nodes in the cluster are both acting as primary, a fencing process is enabled using the `n2pg-aux` package's scripts.
To enable this, on both the primary SMS and secondary SMS nodes edit the systemd startup script. As root:
systemctl edit postgresql-13 --full
Add the following line to the [Service]
section:
ExecStartPre=/usr/share/n2pg-aux/bin/pgsql_fence.pl /usr/pgsql-13/bin/postgres <nodename>
Where:
- `/usr/pgsql-13/bin` should be replaced with the path to the bin directory of the relevant postgres version.
- `<nodename>` is the node name as used in the `repmgr.conf` file of the node on which this script is being edited (so on the primary server this is the node name of the primary server as known by RepMgr).
In the same service configuration, tie RepMgr and the postgres instance together tightly. For postgres add:
After=network-online.target
Before=repmgr13.service
Wants=repmgr13.service
To the service configuration in the `[Unit]` section.
Use the correct service name for the service instance of repmgr you’re using (in this example it is repmgr13
).
Tie the RepMgr Service to the PostgreSQL Service
Configure the RepMgr service in a reciprocal manner to how PostgreSQL is tied to RepMgr. As root:
systemctl edit repmgr13 --full
Add the following to the `[Unit]` section:
After=postgresql-13.service
PartOf=postgresql-13.service
Use the correct service name for the service instance of PostgreSQL you’re using (in this example it is postgresql-13
).
In the `[Service]` section also add:
Restart = on-failure
RestartSec = 60
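Applied together, the overrides above leave the RepMgr unit containing the following directives (service names as used throughout this guide; the unit's existing `ExecStart` and related settings are omitted here):

```ini
[Unit]
After=postgresql-13.service
PartOf=postgresql-13.service

[Service]
Restart=on-failure
RestartSec=60
```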
The restart configuration for RepMgr is important, as RepMgr will exit if the primary node is temporarily unavailable. To avoid manual intervention being required to start RepMgr, it must be configured to restart automatically.
PostgreSQL Fencing
With the configuration as described in this document above, PostgreSQL SMS primary/secondary nodes may be fenced to avoid startup in scenarios where a primary node cannot be certain it is safe to do so.
A successfully fenced PostgreSQL instance can be determined by reviewing the systemd log for a PostgreSQL instance which has failed startup:
systemctl status postgresql-13
The log file will print output such as this:
-- Unit postgresql-13.service has begun starting up.
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.928] Running /usr/share/n2pg-aux/bin/pgsql_fence.pl
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.929] Postgres binary file is '/usr/pgsql-13/bin/postgres' and data dir is '/var/lib/pgsql/13/data/'. Force file is /etc/pgsqlchk/pgsql_fence.force
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.929] repmgr database name is 'repmgr' and node_name is 'pg-node-a'
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.074] Successfully executed: SELECT pg_is_in_recovery();
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.074] Local database is a primary node. Must check whether to fence it.
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Successfully executed: SELECT node_name, conninfo FROM repmgr.nodes where priority != 0
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Potential primary postgres node connection is: pg-node-b -> host=pg-node-b port=5432 user=repmgr dbname=repmgr connect_timeout=5
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Potential primary postgres node connection is: pg-node-a -> host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.192] [pg-node-b] is NOT in recovery (0) - i.e. is primary. Will not allow startup touch '/etc/pgsqlchk/pgsql_fence.force' to force startup
Apr 08 02:05:47 pg-node-a systemd[1]: postgresql-13.service: Control process exited, code=exited status=255
Apr 08 02:05:47 pg-node-a systemd[1]: postgresql-13.service: Failed with result 'exit-code'.
-- Subject: Unit failed
It is possible to force the startup of a PostgreSQL node even if it is fenced. To do so, create an empty file. As root:
touch /etc/pgsqlchk/pgsql_fence.force
Once this file is created, start PostgreSQL. As root:
systemctl start postgresql-13
The startup will show the fencing is skipped:
-- Unit postgresql-13.service has begun starting up.
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.616] Running /usr/share/n2pg-aux/bin/pgsql_fence.pl
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] Postgres binary file is '/usr/pgsql-13/bin/postgres' and data dir is '/var/lib/pgsql/13/data/'. Force file is /etc/pgsqlchk/pgsql_fence.force
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] repmgr database name is 'repmgr' and node_name is 'pg-node-a'
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] Determined force file '/etc/pgsqlchk/pgsql_fence.force' exists. Allowing Postgres to run regardless.
Apr 08 02:06:17 pg-node-a postmaster[597580]: 2025-04-08 02:06:17.667 UTC [597580] LOG: redirecting log output to logging collector process
Apr 08 02:06:17 pg-node-a postmaster[597580]: 2025-04-08 02:06:17.667 UTC [597580] HINT: Future log output will appear in directory "/var/log/postgresql".
Apr 08 02:06:17 pg-node-a systemd[1]: Started PostgreSQL 13 database server.
-- Subject: Unit postgresql-13.service has finished start-up
It is important to only force startup using this procedure if it is safe to do so. In the above example the secondary SMS node is running as the primary node, and by forcing startup of `pg-node-a` two primary nodes are now running. With a correctly configured HA setup, HAProxy would start blocking access to the backend databases until this has been resolved. However, it is important in general to ensure that the complete state of the cluster is understood before the fencing is overridden.
To re-enable fencing so that it can no longer be overridden, remove the `pgsql_fence.force` file. As root:
rm /etc/pgsqlchk/pgsql_fence.force