DB Failover Test

Overview

RepMgr is a robust failover solution for PostgreSQL. Testing failover in a pre-live environment is recommended to confirm that all configuration and automation is functioning as required.

Procedure

Tail the RepMgr daemon log file on all nodes:

tail -200f /var/log/repmgr/repmgr.conf

At this stage the daemon should report that everything is running as expected.
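Before stopping anything, it can help to record a baseline from each node so there is something to compare against after the failover. The following is a minimal sketch, not part of the original procedure: the function name `baseline_check` is hypothetical, and the configuration path is assumed to be the same one used by the cluster show command later in this procedure.

```shell
# Assumed config path, matching the cluster show command used in this procedure.
REPMGR_CONF=/etc/repmgr/13/repmgr.conf

# Hypothetical helper: print the cluster view and the local node's health checks.
baseline_check() {
    # Cluster membership and roles as seen from this node.
    repmgr -f "$REPMGR_CONF" cluster show
    # repmgr's built-in health checks for the local node.
    repmgr -f "$REPMGR_CONF" node check
}
```

Run `baseline_check` on each node and keep the output for comparison once the test is complete.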

Shut down the PostgreSQL instance on the primary node:

service postgresql-13 stop

The RepMgr logs on both nodes will show them detecting the failure of the primary and beginning to check its health. After the configured number of attempts, all nodes will determine that the primary node is degraded, consider it failed, and elect a new primary node.
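How long the nodes keep retrying before declaring the primary failed depends on the daemon settings in repmgr.conf. As a rough guide, the retry window is about `reconnect_attempts` multiplied by `reconnect_interval` seconds. The sketch below prints those settings for reference; the function name `show_failover_settings` is hypothetical, and the default path is assumed from the cluster show command used in this procedure.

```shell
# Hypothetical helper: print the repmgr.conf settings that govern failure detection.
# $1: path to repmgr.conf (defaults to the path assumed for this environment).
show_failover_settings() {
    conf="${1:-/etc/repmgr/13/repmgr.conf}"
    # Match only active (uncommented) settings.
    grep -E '^[[:space:]]*(failover|reconnect_attempts|reconnect_interval)[[:space:]]*=' "$conf"
}
```

Running `show_failover_settings` on each node before the test makes it easier to judge whether the delay seen in the logs is expected.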

Because only the secondary node has an election priority set, the secondary SMS will be elected the new primary, and the AS node will automatically switch over and follow the secondary (now the new primary).
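The priority each node has registered can be confirmed ahead of the test by querying repmgr's metadata table. This is a sketch under assumptions: the function name `show_node_priorities` is hypothetical, and both the database and the user are assumed to be named `repmgr`, which may differ in this environment.

```shell
# Hypothetical helper: list each registered node with its role and election priority.
# Assumes the repmgr metadata lives in a database named "repmgr",
# accessed as a user named "repmgr"; adjust both to match the environment.
show_node_priorities() {
    psql -U repmgr -d repmgr -c \
        "SELECT node_id, node_name, type, priority, active FROM repmgr.nodes ORDER BY node_id;"
}
```

In repmgr, a node with a priority of 0 is never considered for promotion, so the output should show which node is eligible to become primary.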

This can be validated by watching the switchover take place in the logs.

Note: When the switchover occurs, NO downtime should be noticed by either node.

The new cluster status can then be viewed with:

repmgr -f /etc/repmgr/13/repmgr.conf cluster show
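For a scripted check, the same command can emit a simplified CSV form with `--csv`. The sketch below assumes the column layout described in the repmgr documentation (node ID, availability where 0 means available, recovery state where 0 means not in recovery); the function name `count_primaries` is hypothetical.

```shell
# Hypothetical helper: count nodes that are reachable and not in recovery,
# i.e. nodes currently acting as primary.
# Reads `repmgr cluster show --csv` output on stdin, assumed to be
# node_id,availability,recovery_state with one node per line.
count_primaries() {
    awk -F, '$2 == 0 && $3 == 0 { n++ } END { print n + 0 }'
}

# Usage (on any surviving node):
#   repmgr -f /etc/repmgr/13/repmgr.conf cluster show --csv | count_primaries
```

After a successful failover with the old primary still stopped, the count should be exactly one. A count of two or more would suggest a split-brain situation that needs attention before continuing.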

Once the failover test is complete, see DB Failover Correction for more information on correcting the now degraded replica set.