Thursday 6 June 2013

watchdog script for fence_scsi to reboot a RHEL 6 cluster node

Reboot with fence_scsi

Environment

  • Red Hat Enterprise Linux (RHEL) 6 Update 2 or later with the High Availability Add-On

Resolution

Configuration of the Watchdog Service for fence_scsi
This configuration assumes that you are using the fence_scsi agent and that it is correctly configured in the /etc/cluster/cluster.conf file.
1) Install the watchdog package.
# yum -y install watchdog
2) Copy the fence_scsi_check script to the /etc/watchdog.d/ directory.
# cp /usr/share/cluster/fence_scsi_check.pl /etc/watchdog.d/
3) Enable and start the watchdog service.
# chkconfig watchdog on
# service watchdog start
4) Restart the cman service. Once it has started, unfencing should have completed successfully, so the node is registered with the appropriate devices. The local cluster node's key is stored in the fence_scsi.key file, and the list of devices that were successfully registered is stored in the fence_scsi.dev file. Note that if either of these files is empty or does not exist, the fence_scsi_check watchdog script exits immediately and no reboot is triggered.
# service cman restart
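Because the watchdog script silently does nothing when its input files are missing, it can be worth checking the result of the steps above before testing. The following is only a sketch, not part of the packaged tooling: the function names are made up for illustration, and the default paths (/etc/watchdog.d/fence_scsi_check.pl and the /var/run/cluster directory for the key and device files) are assumptions that should be adjusted if your installation differs.

```shell
#!/bin/sh
# Hypothetical helpers; names and default paths are assumptions, not part of fence_scsi.

# Check that the watchdog test script is present and executable.
check_watchdog_script() {
    script="${1:-/etc/watchdog.d/fence_scsi_check.pl}"
    if [ ! -f "$script" ]; then
        echo "not found: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    echo "ok: $script"
}

# Check that fence_scsi.key and fence_scsi.dev exist and are non-empty.
# The directory defaults to /var/run/cluster (assumed location).
check_fence_scsi_files() {
    dir="${1:-/var/run/cluster}"
    status=0
    for name in fence_scsi.key fence_scsi.dev; do
        if [ -s "$dir/$name" ]; then
            echo "ok: $dir/$name"
        else
            echo "missing or empty: $dir/$name"
            status=1
        fi
    done
    return "$status"
}
```

After sourcing the file, run both checks as root on the node. Each function returns non-zero if something is missing:
# check_watchdog_script
# check_fence_scsi_files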
Testing of the Watchdog Service for fence_scsi
1) The fence_scsi_check watchdog script should trigger a reboot when a cluster node has been successfully fenced via the fence_scsi agent. To test this, run the fence_node utility from another cluster node, naming the node to be fenced. The fenced node should then reboot itself.
# fence_node <nodename>
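If the node does not reboot, one thing to check from a surviving node is whether the fenced node's key is still registered on a shared device. The sketch below assumes that sg_persist (from sg3_utils) lists registered keys one per line in the form 0x<hex>, and that you noted the fenced node's key from its fence_scsi.key file beforehand. The function name key_is_registered is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "sg_persist --read-keys" style output on stdin and
# succeeds if the given key appears as a registered key.
# Assumption: keys are printed one per line as 0x<hex>.
key_is_registered() {
    # Normalize the wanted key: lower case, no leading 0x.
    want=$(printf '%s' "$1" | tr 'A-F' 'a-f' | sed 's/^0x//')
    # Keep only lines that consist of a single 0x<hex> value, then compare exactly.
    tr 'A-F' 'a-f' |
        sed -n 's/^[[:space:]]*0x\([0-9a-f][0-9a-f]*\)[[:space:]]*$/\1/p' |
        grep -qxF -- "$want"
}
```

A possible use on a surviving node, with the device and key as placeholders:
# sg_persist -n -i -k -d /dev/sdX | key_is_registered <key> && echo "still registered"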