[pve-devel] training : watchdog not working on 1 server

Alexandre DERUMIER aderumier at odiso.com
Wed Feb 3 17:20:58 CET 2016


Hi,

We are currently testing watchdogs during our training session,

and 1 node of the 3-node cluster doesn't load the watchdog correctly.

I have tried with softdog and with iTCO_wdt:

the watchdog timer is never enabled and shows a 15s countdown.

The 3 nodes of the cluster are exactly the same model (old Dell PowerEdge 2950),
each with a clean Proxmox 4.1 install and all the latest updates.



# ipmitool mc watchdog get
Watchdog Timer Use:     Reserved (0x00)
Watchdog Timer Is:      Stopped
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval:   1 seconds
Timer Expiration Flags: 0x00
Initial Countdown:      15 sec
Present Countdown:      15 sec

# dmesg|grep softdog
[   19.098138] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)

# dmesg|grep -i watchdog
[    0.096195] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[    9.340545] systemd[1]: Cannot add dependency job for unit watchdog-mux.socket, ignoring: Unit watchdog-mux.socket failed to load: No such file or directory.
[   19.098138] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)



>>Unit watchdog-mux.socket failed to load: No such file or directory.
I don't see this warning on the other nodes.
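
To compare the nodes side by side, the ipmitool output could be reduced to one line per node. This is only a minimal sketch, assuming POSIX sh and awk and the field labels exactly as printed in the output earlier in this mail; watchdog_summary is a hypothetical helper name, not part of ipmitool or Proxmox.

```shell
#!/bin/sh
# Sketch only: condense `ipmitool mc watchdog get` output into one line.
# watchdog_summary is a hypothetical helper; it reads the command output on
# stdin, so it can also be run against output saved from another node.
watchdog_summary() {
    awk -F': *' '
        /^Watchdog Timer Is/      { state = $2 }      # e.g. "Stopped"
        /^Watchdog Timer Actions/ { action = $2 }     # e.g. "No action (0x00)"
        /^Present Countdown/      { countdown = $2 }  # e.g. "15 sec"
        END { printf "state=%s action=%s countdown=%s\n", state, action, countdown }
    '
}
```

Usage on each node would be: ipmitool mc watchdog get | watchdog_summary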


Any idea how I can debug that?


Alexandre


