[pve-devel] corosync bug: cluster break after 1 node clean shutdown
Alexandre DERUMIER
aderumier at odiso.com
Sun Sep 6 08:33:56 CEST 2020
Also, I wonder whether it would be possible to not use watchdog fencing at all (as an option)
when the cluster uses only shared storage with native disk locking/reservation.
With ceph rbd and exclusive-lock, for example, you can't write from 2 clients to the same rbd,
so HA would not be able to start qemu on another node.
----- Original Message -----
From: "aderumier" <aderumier at odiso.com>
To: "dietmar" <dietmar at proxmox.com>
Cc: "Proxmox VE development discussion" <pve-devel at lists.proxmox.com>, "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Sunday, 6 September 2020 07:36:10
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
>>But the pve logs look ok, and there is no indication
>>that we stopped updating the watchdog. So why did the
>>watchdog trigger? Maybe an IPMI bug?
Do you mean an IPMI bug on all 13 servers at the same time?
(I also have 2 Supermicro servers in this cluster, but they use the same IPMI watchdog driver (ipmi_watchdog).)
I had the same kind of bug once (when stopping a server) on another cluster, 6 months ago.
That was without HA and with a different version of corosync, and that time I really did see a quorum split in the corosync logs of the servers.
I'll try to reproduce it on a virtual cluster with 14 nodes (I don't have enough hardware).
Could it be a bug in the Proxmox HA code, where the watchdog is no longer reset by the LRM?
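To make that hypothesis concrete, here is a toy model of the failure mode I have in mind. It is only a sketch and not the actual LRM or watchdog code: the Watchdog class, the simulate() helper, the 60-second timeout and the 10-second reset interval are all assumptions chosen for illustration.

```python
class Watchdog:
    """Toy model of a watchdog: it fires if not reset within `timeout` seconds."""

    def __init__(self, timeout, now=0):
        self.timeout = timeout      # hypothetical timeout, in seconds
        self.last_reset = now

    def reset(self, now):
        # In this model, this is what the updating process must keep doing.
        self.last_reset = now

    def expired(self, now):
        return now - self.last_reset >= self.timeout


def simulate(timeout, reset_times, until):
    """Return the first second at which the watchdog fires, or None if it never does."""
    wd = Watchdog(timeout)
    resets = set(reset_times)
    for t in range(until + 1):
        if t in resets:
            wd.reset(t)
        elif wd.expired(t):
            return t
    return None


# Resets keep arriving every 10 s: the watchdog never fires.
print(simulate(60, range(0, 300, 10), 300))
# Resets stop after t=90 (the hypothetical bug): the node is fenced at t=150.
print(simulate(60, range(0, 100, 10), 300))
```

If something like the second case happened on every node at about the same moment, all nodes would be reset together, even with nothing unusual in the logs up to the last successful update.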
----- Original Message -----
From: "dietmar" <dietmar at proxmox.com>
To: "aderumier" <aderumier at odiso.com>
Cc: "Proxmox VE development discussion" <pve-devel at lists.proxmox.com>, "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Sunday, 6 September 2020 06:21:55
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
> >>So you are using ipmi hardware watchdog?
>
> yes, I'm using dell idrac ipmi card watchdog
But the pve logs look ok, and there is no indication
that we stopped updating the watchdog. So why did the
watchdog trigger? Maybe an IPMI bug?