[pve-devel] [PATCH ha-manager v2 0/5] watchdog: sync log to disk before and after expiring
Maximiliano Sandoval
m.sandoval at proxmox.com
Wed Jun 25 15:23:44 CEST 2025
Without a clear-cut message in the log, it is very hard to give a definitive
answer as to whether a host was fenced or not. In some cases the journal on disk
can be missing up to 2 minutes between its last logged entry and the time at
which another node detects that the corosync link is down. With such a gap, the
fenced node would not even record that it lost its connection, and it is not
possible to fully determine whether the node was fenced.
This series:
- adds a second warning 10 seconds before the watchdog expires
- syncs the journal to disk after the warning was issued
- syncs the journal to disk after the watchdog expires
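
As a rough illustration of the timing described above, the decision of when to
warn and when to treat a client as expired could look like the sketch below.
This is not the code from the patches: the names CLIENT_TIMEOUT_SEC,
WARNING_BEFORE_SEC and classify_client() are hypothetical, and only the
60 second timeout and the 10 second lead time are taken from this cover letter.

```c
#include <time.h>

/* Hypothetical names; only the values 60 and 10 come from the cover letter. */
#define CLIENT_TIMEOUT_SEC 60
#define WARNING_BEFORE_SEC 10

enum client_state { CLIENT_OK, CLIENT_WARN, CLIENT_EXPIRED };

/* Classify a client by the seconds elapsed since its last update. */
static enum client_state classify_client(time_t now, time_t last_update)
{
    time_t idle = now - last_update;

    if (idle >= CLIENT_TIMEOUT_SEC)
        return CLIENT_EXPIRED;   /* log the expiration, then sync the journal */
    if (idle >= CLIENT_TIMEOUT_SEC - WARNING_BEFORE_SEC)
        return CLIENT_WARN;      /* log the warning, then sync the journal */
    return CLIENT_OK;
}
```

Deriving the warning threshold from the timeout constant keeps the two values
consistent if the timeout is ever changed.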
Differences from v1:
- Define the warning cutoff based on the 60 second timeout
- Change log messages and constant names
- When not immediately fencing, run journal sync in double fork
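
For readers unfamiliar with the double fork pattern mentioned in the last item,
a minimal sketch follows. It assumes the sync is done by invoking
`journalctl --sync`; the patches may do this differently, and the function name
sync_journal_detached() is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start `journalctl --sync` in a detached grandchild so the caller does
 * not block on the sync itself. Returns 0 if the intermediate child was
 * created and reaped cleanly, -1 otherwise. The result of the sync is
 * deliberately not reported back to the caller.
 */
static int sync_journal_detached(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* First child: fork again and exit at once, so the grandchild
         * is reparented away from the caller and never becomes its
         * zombie. */
        pid_t grandchild = fork();
        if (grandchild == 0) {
            execlp("journalctl", "journalctl", "--sync", (char *)NULL);
            _exit(127); /* only reached if exec failed */
        }
        _exit(grandchild < 0 ? 1 : 0);
    }

    /* Parent: the first child exits right away, so this wait is short. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The point of the second fork is that the main loop only waits for the
short-lived intermediate child and can keep servicing the watchdog while the
sync runs.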
Maximiliano Sandoval (5):
watchdog-mux: Use #define for 60s timeout
watchdog-mux: split if block in two if blocks
watchdog-mux: warn when about to expire
watchdog-mux: sync journal after logging expiration message
watchdog-mux: sync journal right after fencing warning
src/watchdog-mux.c | 52 +++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 47 insertions(+), 5 deletions(-)
--
2.39.5