Monitor Linux with a hardware watchdog

I recently had a system crash that forced me to power-cycle one of my machines. That made me take a look at hardware watchdogs, which trigger an automatic reboot when the system stops responding (more precisely, when nothing resets the watchdog timer before it expires).

Fortunately, the system in question already had a hardware watchdog:

linux # dmesg | grep -i watchdog
[    0.330901] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[  392.504661] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
[  392.525720] sp5100-tco sp5100-tco: Using 0xfeb00000 for watchdog MMIO address
linux # wdctl 
Device:        /dev/watchdog0
Identity:      SP5100 TCO timer [version 0]
Timeout:       60 seconds
Timeleft:      60 seconds
Pre-timeout:    0 seconds
FLAG           DESCRIPTION               STATUS BOOT-STATUS
KEEPALIVEPING  Keep alive ping reply          1           0
MAGICCLOSE     Supports magic close char      0           0
SETTIMEOUT     Set timeout (in seconds)       0           0

However, no software was in place to make use of the watchdog. So I installed the watchdog package and modified a few settings:

linux # apt install watchdog
<...>
linux # vi /etc/watchdog.conf
<...>
watchdog-device = /dev/watchdog
watchdog-timeout = 60
interval = 5
<...>

So the watchdog daemon checks the system and pings the hardware watchdog every 5 seconds. If no ping arrives for 60 seconds, the hardware resets the machine.
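To make the ping mechanism a bit more tangible, here is a minimal sketch of what "pinging" a watchdog device means. This is not how the watchdog daemon is implemented. The function name pet_watchdog and its three parameters are made up for illustration. It assumes that any write to the device counts as a keepalive, and that writing the character V just before closing asks the driver to disarm the timer (the "magic close" listed by wdctl).

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not the watchdog daemon's code).
# Usage: pet_watchdog DEVICE INTERVAL COUNT
#   DEVICE    watchdog device, e.g. /dev/watchdog (needs root)
#   INTERVAL  seconds to wait between two pings
#   COUNT     number of pings to send before stopping
pet_watchdog() {
    dev=$1
    interval=$2
    count=$3
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            # Assumption: any write to the device resets the timer.
            printf '.' >&3
            i=$((i + 1))
            if [ "$i" -lt "$count" ]; then
                sleep "$interval"
            fi
        done
        # Magic close character: asks the driver to stop the timer on close.
        printf 'V' >&3
    } 3> "$dev"   # the device stays open on fd 3 for the whole loop
}
```

A call such as pet_watchdog /dev/watchdog 5 12 would ping for roughly one minute. Be careful: opening the device arms the timer, the device can usually be opened by only one process (so this will not work while the watchdog daemon is running), and if the driver was loaded with nowayout the magic close is ignored and the machine reboots once the pings stop.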

If you call wdctl now, you’ll see the “Timeout” of 60 seconds and the “Timeleft” hopefully somewhere between 60 and 55 seconds:

linux # wdctl 
Device:        /dev/watchdog0
Identity:      SP5100 TCO timer [version 0]
Timeout:       60 seconds
Timeleft:      57 seconds
FLAG           DESCRIPTION               STATUS BOOT-STATUS
KEEPALIVEPING  Keep alive ping reply          0           0
MAGICCLOSE     Supports magic close char      0           0
SETTIMEOUT     Set timeout (in seconds)       0           0
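If you would rather script this check than eyeball it, something like the following could work. It is only a sketch: the function names wdctl_timeleft and timeleft_ok are made up, and the parsing assumes that wdctl prints a "Timeleft:" line in exactly the format shown above.

```shell
#!/bin/sh
# Hypothetical helpers; they assume the wdctl output format shown above.

# Read wdctl output on stdin and print the number of seconds left.
wdctl_timeleft() {
    awk '$1 == "Timeleft:" { print $2; exit }'
}

# Read wdctl output on stdin; succeed if at least MIN seconds are left.
# Usage: wdctl | timeleft_ok MIN
timeleft_ok() {
    min=$1
    left=$(wdctl_timeleft)
    # Fail if no Timeleft line was found or the value is below MIN.
    [ -n "$left" ] && [ "$left" -ge "$min" ]
}
```

With the settings above, wdctl | timeleft_ok 50 || echo "watchdog is not being pinged" should stay silent as long as the daemon does its job. The threshold of 50 is an arbitrary choice that leaves some slack below the expected 55 seconds.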

Normally I hate having to reboot systems that got stuck. Right now, however, I’m curious to see whether this watchdog really works …
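One way to find out without waiting for the next real hang is to crash the kernel on purpose. I have not described my own test here, so treat the following as a sketch under assumptions: writing c to /proc/sysrq-trigger as root crashes the kernel, and whether the machine then sits there until the hardware watchdog fires depends on settings such as kernel.panic and kdump. The function name and the confirmation word are made up. The trigger path is a parameter only so that the function can be dry-run against a plain file.

```shell
#!/bin/sh
# Hypothetical helper: deliberately crash the kernel to test the watchdog.
# Only run this on a machine you can afford to lose for a few minutes.
# Usage: crash_for_watchdog_test yes-crash-now [TRIGGER_FILE]
crash_for_watchdog_test() {
    confirm=$1
    trigger=${2:-/proc/sysrq-trigger}
    if [ "$confirm" != "yes-crash-now" ]; then
        echo "refusing: pass yes-crash-now as first argument" >&2
        return 1
    fi
    sync                  # flush dirty data first, the crash is immediate
    echo c > "$trigger"   # SysRq 'c': crash the kernel
}
```

If the watchdog works, the machine should come back on its own within about 60 seconds of the crash, and the BOOT-STATUS column of wdctl may give a hint about the cause afterwards.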
