DB Failover Installation
Overview
This installation guide shows the steps to configure database replication management and failover via a tool known as RepMgr. This configuration assumes that the DB Installation tasks have already been completed.
This example runs through the configuration of a PostgreSQL replication cluster consisting of three separate nodes:
- A primary SMS database.
- A secondary SMS database.
- An application services node.
In a production deployment it is expected that more than one application services node will be deployed. Actions relating to the application services node will need to be performed on each instance.
It is assumed that each N2ACD environment will deploy both a primary and secondary SMS node, and N+1 application service nodes in a geographically redundant setup. Only the SMS nodes will be configured to be elected as the primary in the replica set.
Overall Installation Steps
The high-level steps for installing and deploying PostgreSQL database instances are:
- Install the RepMgr packages.
- Configure PostgreSQL and RepMgr on each node. Additional configuration is made to the primary SMS node.
- Deploy each PostgreSQL replica set via RepMgr.
- Enable PostgreSQL replica management and fail over via RepMgr. If PostgreSQL is installed from the Linux distribution’s main source repositories, this will automatically be done.
Path and Service Name Usage
Due to the variety of ways that PostgreSQL and RepMgr can be installed, and support for multiple individual deployments of N-Squared software on individual nodes, naming conventions in this configuration page are kept generic.
- `repmgr13` is used as the service name for RepMgr (for PostgreSQL 13) as it is the default service name.
  - If a later version of PostgreSQL is being used, `repmgr13` will need to be replaced with `repmgrNN`, where `NN` is the version number.
  - If multiple database clusters are installed on a single machine, multiple services will be necessary and the systemd AT (`@`) feature can be used to achieve this.
- `postgresql-13` is used as the service name for PostgreSQL 13. As with RepMgr, the `@` feature of systemd can be used for multiple PostgreSQL instances.
- If PostgreSQL is installed via RHEL streams, the postgres binary directory will be `/usr/bin`. However, if installed from the official PostgreSQL yum repository this will be `/usr/pgsql-13/bin`. For clarity, `/usr/pgsql-13/bin` is used for the binary directory of PostgreSQL.
- Similarly, the PostgreSQL instance data directory (often referred to as `PGDATA`) is referred to as `/var/lib/pgsql/13/data`.
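These conventions can be collected into a few shell variables derived from the PostgreSQL major version. The sketch below is illustrative only; the rest of this guide substitutes the concrete names by hand.

```shell
# Sketch: derive the generic names used in this guide from the
# PostgreSQL major version. Adjust PGVER for your installation.
PGVER=13
PG_SERVICE="postgresql-${PGVER}"     # PostgreSQL systemd service name
REPMGR_SERVICE="repmgr${PGVER}"      # RepMgr systemd service name
PG_BINDIR="/usr/pgsql-${PGVER}/bin"  # official PostgreSQL yum repo layout
PGDATA="/var/lib/pgsql/${PGVER}/data"

echo "${PG_SERVICE} ${REPMGR_SERVICE} ${PG_BINDIR} ${PGDATA}"
```

Remember that a RHEL-streams install places binaries in `/usr/bin` instead, in which case `PG_BINDIR` would differ.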
Software Install
RepMgr Repository Configuration (All Nodes)
RepMgr can be installed via the N-Squared Yum repository, or from the official PostgreSQL yum repository. If using the official PostgreSQL repository, skip this step.
Add the N-Squared distribution repository alongside the existing N-Squared delivery repository:
sudo nano /etc/yum.repos.d/nsquared.repo
Add the following, replacing the username and password as provided by N-Squared.
[2ndquadrant-dl-default-release-pg13]
gpgcheck=1
enabled=1
name=2ndquadrant-dl-default-release-pg13 via N-Squared Software Distributions
baseurl=https://username:password@artefacts.nsquared.nz/yum/repo/distributions/el8/2ndquadrant-dl-default-release-pg13
Clean and update:
sudo dnf clean all
sudo dnf update
RepMgr Installation (All Nodes)
Install the correct RepMgr version for your postgres version. E.g. for PostgreSQL 13:
sudo dnf install repmgr13
N-Squared “PGAUX” Installation (Primary & Secondary SMS Nodes)
N-Squared provide several enhancements to the RepMgr package in a separate n2pg-aux
package. This is available from the same repository as the N2ACD packages.
Install this package on the primary and secondary SMS nodes:
sudo dnf install n2pg-aux
Configuration
Setup PostgreSQL Path (All Nodes)
If PostgreSQL is installed from the official PostgreSQL yum repository, the PostgreSQL binaries will not be automatically added to the path. To make using the PostgreSQL and RepMgr binaries easier, add the following to `/etc/profile`
and restart the current SSH session. E.g.:
PATH=$PATH:/usr/pgsql-13/bin
Log Management
Set up log rotation for the RepMgr service. Logs will be written to: /var/log/repmgr/repmgrd.log
Create a new log rotate configuration. As root:
sudo nano /etc/logrotate.d/repmgrd
Add the content:
/var/log/repmgr/repmgrd.log {
missingok
compress
rotate 52
maxsize 100M
weekly
create 0600 postgres postgres
postrotate
/usr/bin/killall -HUP repmgrd
endscript
}
Where multiple `repmgrd` instances are configured to manage multiple PostgreSQL clusters, create individual logrotate configuration files referencing the individual log files.
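A per-instance setup can be sketched as follows. The instance names `main` and `audit` are hypothetical, and the files are written to `/tmp` for illustration; in practice they would go in `/etc/logrotate.d/`. Note that `killall -HUP repmgrd` signals every `repmgrd` process, which is harmless for log reopening.

```shell
# Sketch: generate one logrotate file per hypothetical repmgrd
# instance, each referencing that instance's own log file.
for inst in main audit; do
  cat > "/tmp/logrotate-repmgrd-${inst}" <<EOF
/var/log/repmgr/repmgrd-${inst}.log {
    missingok
    compress
    rotate 52
    maxsize 100M
    weekly
    create 0600 postgres postgres
    postrotate
        /usr/bin/killall -HUP repmgrd
    endscript
}
EOF
done
```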
RepMgr Configuration (All Nodes)
Create the RepMgr configuration. As root:
sudo nano /etc/repmgr/13/repmgr.conf
Create this file on each node in the N2ACD cluster with the following configuration template, replacing the <...>
variables as appropriate for each node:
# Base configuration.
node_id = <unique node ID>
node_name = '<local nodename>'
location = '<geographic location>'
conninfo = 'host=<local hostname> user=repmgr dbname=repmgr connect_timeout=5'
data_directory = '/var/lib/pgsql/13/data'
log_file = '/var/log/repmgr/repmgrd.log'
pg_bindir = '/usr/pgsql-13/bin'
use_replication_slots = true
# Automatic failover configuration.
promote_command = '/usr/pgsql-13/bin/repmgr standby promote -f /etc/repmgr/13/repmgr.conf --log-to-file'
follow_command = '/usr/pgsql-13/bin/repmgr standby follow -f /etc/repmgr/13/repmgr.conf --log-to-file --upstream-node-id=%n'
failover = automatic
priority = <100 or 0>
standby_follow_restart = false
primary_visibility_consensus = true
standby_disconnect_on_failover = true
service_start_command = 'sudo service postgresql-13 start'
service_stop_command = 'sudo service postgresql-13 stop'
service_restart_command = 'sudo service postgresql-13 restart'
service_reload_command = 'sudo service postgresql-13 reload'
reconnect_attempts = 6
reconnect_interval = 10
Where:
- `<unique node ID>` is a unique identifier for the node. This can be any integer value as long as it is unique across the cluster.
- `<local nodename>` is the name of the node. This can be any string value as long as it is unique across the cluster.
- `<geographic location>` is the geographic location of the node. This can be any string value but should reflect the geographic location of the node. Generally a N2ACD deployment will have a service node and a SMS node at each geographic location. Grouping nodes into geographic locations allows RepMgr to avoid split-brain errors by failing over the primary to the secondary node only if the primary can be determined as failed by another node at the same location.
- `<local hostname>` is part of the connection information for the node. `<local hostname>` should be the hostname or IP address used to connect to the PostgreSQL instance on the local node.
- Update `/var/lib/pgsql/13/data` to the correct path to the data directory of the database instance being set up for replication and RepMgr cluster management.
- Change `/usr/pgsql-13/bin` to the correct path for the version and RPM source of the running postgres database instance.
- `priority` should be set to 100 for the primary or secondary SMS node, and 0 for all service nodes.
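Filling in the template can be sketched as below. The node ID, name, and location values are illustrative only, and the generated file is abridged to the per-node fields (the fixed failover settings from the template above would follow unchanged).

```shell
# Sketch: substitute per-node values into the repmgr.conf template.
# All values below are illustrative; replace with this node's details.
NODE_ID=1
NODE_NAME="pg-node-a"
LOCATION="Location A"
PRIORITY=100   # 100 for SMS nodes, 0 for service nodes

cat > /tmp/repmgr.conf <<EOF
node_id = ${NODE_ID}
node_name = '${NODE_NAME}'
location = '${LOCATION}'
conninfo = 'host=${NODE_NAME} user=repmgr dbname=repmgr connect_timeout=5'
data_directory = '/var/lib/pgsql/13/data'
priority = ${PRIORITY}
EOF
```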
RepMgr PostgreSQL Management (All Nodes)
In order for RepMgr to manage PostgreSQL instances, `sudo` permissions will need to be granted. The permissions granted will be restricted to stopping, starting, restarting, and reloading PostgreSQL DB instances.
Set up `sudoers` access for the `postgres` user so that it can manage its own PostgreSQL instance.
sudo nano /etc/sudoers
Add the following content to the bottom of the file:
Defaults:postgres !requiretty
postgres ALL=(ALL) NOPASSWD: /usr/sbin/service postgresql-13 stop, \
/usr/sbin/service postgresql-13 start, \
/usr/sbin/service postgresql-13 reload, \
/usr/sbin/service postgresql-13 restart
Note that postgresql-13
is the service name, and may need to be adjusted.
RepMgr PostgreSQL Access (Primary Node Only)
The prerequisite for this step is that the PostgreSQL instance being configured is available and running. If the relevant PostgreSQL instance is not yet configured and started, perform those steps as documented in the DB Node Installation guide.
Configure the PostgreSQL access configuration; this file will contain the addresses of all nodes once replication is enabled.
sudo nano /var/lib/pgsql/13/data/pg_hba.conf
In this file, add the following two rows for each node that will be part of the cluster:
host replication repmgr <hostname> trust
host repmgr repmgr <hostname> trust
Where:
- Each of these lines is duplicated for each node - both the primary and secondary SMS, and each service node.
- `<hostname>` is the hostname of the node.
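Emitting the rows for every node can be sketched as a short loop. The hostnames below are illustrative; the output would be appended to `pg_hba.conf` (written to `/tmp` here for demonstration).

```shell
# Sketch: generate the two pg_hba.conf rows for each cluster node.
# Hostnames are illustrative only.
HOSTS="pg-node-a pg-node-b pg-svc-01"
for h in $HOSTS; do
  printf 'host replication repmgr %s trust\n' "$h"
  printf 'host repmgr repmgr %s trust\n' "$h"
done > /tmp/pg_hba_repmgr_rows
```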
Then configure the replication user. RepMgr will create a schema called repmgr
to be used to store its metadata information.
su - postgres
createuser -s repmgr
createdb repmgr -O repmgr
Then connect to the postgres instance as the maintenance user (as the user postgres run psql
) and execute:
ALTER USER repmgr SET search_path TO repmgr, "$user", public;
PostgreSQL Replication Configuration (Primary Node Only)
Configure the primary node PostgreSQL instance for replication. Edit the instance configuration file:
sudo nano /var/lib/pgsql/13/data/postgresql.conf
Configure the following options as appropriate:
| Field | Value | Notes |
|---|---|---|
| `listen_addresses` | `localhost,<host1>,<host2>,etc` | Configure as a list of all nodes that are in the replication set. This is deployed automatically to all nodes as RepMgr/Postgres replicates this file to each node as part of the clone step. Alternatively, configure a hostname such as `postgres-local` that has a local IP on each host by using the `/etc/hosts` file to change the IP on each node. |
| `port` | `5432` | The listen port for this DB instance. |
| `password_encryption` | `scram-sha-256` | Enables a secure password mechanism using encrypted passwords rather than plain-text MD5 hashes. Note: This must be set before any users are created and must be set on duplicated instances, as the default is `md5`. |
| `max_wal_senders` | `10` | This value should be configured based on the number of connected replicas. Rule of thumb: ((Number of Nodes) * 2) + 2 |
| `wal_level` | `replica` | |
| `hot_standby` | `on` | |
| `archive_mode` | `on` | |
| `archive_command` | `'true'` | |
| `shared_preload_libraries` | `'repmgr'` | This enables the shared preload library that allows the `repmgrd` daemon process to access and manage the PostgreSQL instances. |
| `wal_log_hints` | `on` | |
| `max_replication_slots` | `4` | The number of active replication slots to allow. Rule of thumb: (Number of Nodes) + 1 |
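The two rule-of-thumb sizings can be checked with a quick calculation. For the three-node cluster described in this guide (two SMS nodes plus one service node):

```shell
# Sketch: compute the rule-of-thumb values for a given cluster size.
NODES=3
MAX_WAL_SENDERS=$(( NODES * 2 + 2 ))       # ((nodes) * 2) + 2
MAX_REPLICATION_SLOTS=$(( NODES + 1 ))     # (nodes) + 1
echo "max_wal_senders=${MAX_WAL_SENDERS} max_replication_slots=${MAX_REPLICATION_SLOTS}"
```

For three nodes this gives `max_wal_senders=8` and `max_replication_slots=4`, comfortably within the `10` and `4` configured above.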
Primary SMS Node Registration with RepMgr
With all configuration complete, the primary node can now be registered with RepMgr.
- Note: Any time the `repmgr` command is run, use the `-f /path/to/conf/file` option to have repmgr use the correct configuration file.
- Note: All `repmgr` command executions must be performed as the `postgres` user and cannot be executed as `root`.
su - postgres
Register the instance as the cluster primary:
repmgr -f /etc/repmgr/13/repmgr.conf primary register
Confirm registration using the `cluster show` and `service status` commands:
repmgr -f /etc/repmgr/13/repmgr.conf cluster show
This will generate output, for example:
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+------------+---------+-----------+-----------+------------+----------+----------+-----------------------------------------------------------------------
1 | pg-node-a | primary | * running | | Location A | 100 | 10 | host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
repmgr -f /etc/repmgr/13/repmgr.conf service status
This will generate output, for example:
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+------------+---------+-----------+-----------+---------+--------+---------+--------------------
1 | pg-node-a | primary | * running | | running | 745513 | no | n/a
Setup PostgreSQL DB Instance (Secondary SMS Node)
With the primary node configured with RepMgr, the secondary SMS node can be cloned and configured. All PostgreSQL initialization tasks will be performed by RepMgr.
Preparation
Validate that the data directories are configured for syncing:
- Check that the target data directory exists.
- Check that the target data directory is owned by the `postgres` user.
- Check that the target data directory has permissions set to `0700`.
For example:
[postgres@test-sms-02]$ ls -larth /var/lib/pgsql/13/
drwx------. 20 postgres postgres 4.0K Feb 12 00:00 data
If no data directory exists it can be created. As the user root:
mkdir /var/lib/pgsql/13/data
chown -R postgres:postgres /var/lib/pgsql/13/data
chmod 700 /var/lib/pgsql/13/data
semanage fcontext -a -t postgresql_db_t "/var/lib/pgsql/13/data(/.*)?"
restorecon -R -v /var/lib/pgsql/13/data
Note: The correct directory (as configured in the repmgr.conf
file) should be used in these actions.
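The preparation checks above can be sketched as a small validation function. In practice it would be run against the real data directory with owner `postgres`; the demonstration below uses a throwaway directory owned by the current user.

```shell
# Sketch: validate that a data directory is ready for cloning
# (exists, mode 0700, owned by the expected user).
check_pgdata() {
  dir="$1"; owner="$2"
  [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
  [ "$(stat -c '%a' "$dir")" = "700" ] || { echo "bad mode"; return 1; }
  [ "$(stat -c '%U' "$dir")" = "$owner" ] || { echo "bad owner"; return 1; }
  echo "ok"
}

# Demonstration against a throwaway directory owned by the current user:
mkdir -p /tmp/pgdata-demo && chmod 700 /tmp/pgdata-demo
check_pgdata /tmp/pgdata-demo "$(id -un)"
```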
Ensure the PostgreSQL service is enabled and correct permissions are granted. As the user root:
systemctl enable postgresql-13
sudo semanage port -a -t postgresql_port_t -p tcp 5432
firewall-cmd --zone=public --add-port=5432/tcp --permanent
firewall-cmd --reload
Use the correct service name for the version of postgres installed. Note: The service is not started at this stage. RepMgr will start it for us.
RepMgr Clone of Primary Node
Check connectivity to the primary node. As the user postgres:
repmgr -h <primary node> -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone --dry-run
Where:
<primary node>
is the hostname of the primary node to be cloned.
Perform the database sync from the primary node. As the user postgres:
repmgr -h <primary node> -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone
Start the PostgreSQL instance on the secondary node. As the user root:
service postgresql-13 start
Register the secondary node into the cluster. As the user postgres:
repmgr -f /etc/repmgr/13/repmgr.conf standby register
Checking the cluster status, both the primary and secondary nodes should be visible. As the user postgres:
repmgr -f /etc/repmgr/13/repmgr.conf cluster show
This will give output similar to:
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+------------+---------+-----------+-----------+------------+----------+----------+-----------------------------------------------------------------------
1 | pg-node-a | primary | * running | | Location A | 100 | 10 | host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
3 | pg-node-b | standby | running | pg-node-a | Location B | 100 | 10 | host=pg-node-b port=5432 user=repmgr dbname=repmgr connect_timeout=5
Setup PostgreSQL DB Instance (Service Nodes)
Repeating the steps taken for the secondary SMS node as described in the section above, each of the service nodes can be integrated into the cluster via RepMgr.
RepMgr Daemon Process (All Nodes)
RepMgr includes a daemon process that is responsible for monitoring the cluster health and initiating failover tasks. Once the cluster is initialized, configure the RepMgr daemon process to start.
Enable and start the daemon process:
systemctl enable repmgr13
systemctl start repmgr13
The RepMgr service status can be checked with:
repmgr -f /etc/repmgr/13/repmgr.conf service status
Output logs for the daemon process can be found in /var/log/repmgr
.
Enable PostgreSQL Fencing on Primary and Secondary Nodes
To avoid split-brain scenarios where two PostgreSQL nodes in the cluster are both acting as primary, a fencing process is enabled using the `n2pg-aux` package's scripts.
To enable this, on both the primary SMS and secondary SMS nodes edit the systemd startup script. As root:
systemctl edit postgresql-13 --full
Add the following line to the [Service]
section:
ExecStartPre=/usr/share/n2pg-aux/bin/pgsql_fence.pl /usr/pgsql-13/bin/postgres <nodename>
Where:
- `/usr/pgsql-13/bin` should be replaced with the path to the bin directory of the relevant postgres version.
- `<nodename>` is the node name as used in the `repmgr.conf` file of the node on which this script is being edited (so on the primary server this is the node name of the primary server as known by RepMgr).
In the same service configuration, tie RepMgr and the postgres instance together tightly. For postgres add:
After=network-online.target
Before=repmgr13.service
Wants=repmgr13.service
To the service configuration in the `[Unit]` section.
Use the correct service name for the service instance of repmgr you’re using (in this example it is repmgr13
).
Tie the RepMgr Service to the PostgreSQL Service
Configure the RepMgr service in a reciprocal manner to how PostgreSQL is tied to RepMgr. As root:
systemctl edit repmgr13 --full
Add the following to the `[Unit]` section:
After=postgresql-13.service
PartOf=postgresql-13.service
Use the correct service name for the service instance of PostgreSQL you’re using (in this example it is postgresql-13
).
In the `[Service]` section also add:
Restart = on-failure
RestartSec = 60
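Applied together, the overrides above leave the RepMgr unit containing the following directives (service names as used throughout this guide; the unit's existing `ExecStart` and related settings are omitted here):

```ini
[Unit]
After=postgresql-13.service
PartOf=postgresql-13.service

[Service]
Restart=on-failure
RestartSec=60
```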
The restart configuration for RepMgr is important, as RepMgr will exit if the primary node is temporarily unavailable. To avoid manual intervention being required to start RepMgr, it must be configured to restart automatically.
PostgreSQL Fencing
With the configuration as described in this document above, PostgreSQL SMS primary/secondary nodes may be fenced to avoid startup in scenarios where a primary node cannot be certain it is safe to do so.
A successfully fenced PostgreSQL instance can be determined by reviewing the systemd log for a PostgreSQL instance which has failed startup:
systemctl status postgresql-13
The log file will print output such as this:
-- Unit postgresql-13.service has begun starting up.
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.928] Running /usr/share/n2pg-aux/bin/pgsql_fence.pl
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.929] Postgres binary file is '/usr/pgsql-13/bin/postgres' and data dir is '/var/lib/pgsql/13/data/'. Force file is /etc/pgsqlchk/pgsql_fence.force
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.929] repmgr database name is 'repmgr' and node_name is 'pg-node-a'
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.074] Successfully executed: SELECT pg_is_in_recovery();
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.074] Local database is a primary node. Must check whether to fence it.
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Successfully executed: SELECT node_name, conninfo FROM repmgr.nodes where priority != 0
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Potential primary postgres node connection is: pg-node-b -> host=pg-node-b port=5432 user=repmgr dbname=repmgr connect_timeout=5
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Potential primary postgres node connection is: pg-node-a -> host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.192] [pg-node-b] is NOT in recovery (0) - i.e. is primary. Will not allow startup touch '/etc/pgsqlchk/pgsql_fence.force' to force startup
Apr 08 02:05:47 pg-node-a systemd[1]: postgresql-13.service: Control process exited, code=exited status=255
Apr 08 02:05:47 pg-node-a systemd[1]: postgresql-13.service: Failed with result 'exit-code'.
-- Subject: Unit failed
It is possible to force the startup of a PostgreSQL node even if it is fenced. To do so, create an empty file. As root:
touch /etc/pgsqlchk/pgsql_fence.force
Once this file is created, start PostgreSQL. As root:
systemctl start postgresql-13
The startup will show the fencing is skipped:
-- Unit postgresql-13.service has begun starting up.
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.616] Running /usr/share/n2pg-aux/bin/pgsql_fence.pl
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] Postgres binary file is '/usr/pgsql-13/bin/postgres' and data dir is '/var/lib/pgsql/13/data/'. Force file is /etc/pgsqlchk/pgsql_fence.force
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] repmgr database name is 'repmgr' and node_name is 'pg-node-a'
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] Determined force file '/etc/pgsqlchk/pgsql_fence.force' exists. Allowing Postgres to run regardless.
Apr 08 02:06:17 pg-node-a postmaster[597580]: 2025-04-08 02:06:17.667 UTC [597580] LOG: redirecting log output to logging collector process
Apr 08 02:06:17 pg-node-a postmaster[597580]: 2025-04-08 02:06:17.667 UTC [597580] HINT: Future log output will appear in directory "/var/log/postgresql".
Apr 08 02:06:17 pg-node-a systemd[1]: Started PostgreSQL 13 database server.
-- Subject: Unit postgresql-13.service has finished start-up
It is important to only force startup using this procedure if it is safe to do so. In the above example the secondary SMS node is running as the primary node, and by forcing startup of `pg-node-a` two primary nodes are now running. With a correctly configured HA setup, HAProxy would start blocking access to the backend databases until this has been resolved. However, it is important in general to ensure that the complete state of the cluster is understood before the fencing is overridden.
To re-enable fencing so that it can no longer be overridden, remove the `pgsql_fence.force` file. As root:
rm /etc/pgsqlchk/pgsql_fence.force