[pve-devel] corosync bug: cluster break after 1 node clean shutdown

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Sep 15 09:58:09 CEST 2020


On 9/15/20 8:27 AM, Alexandre DERUMIER wrote:
>>> This is intentional - we do not want to stop pmxcfs just because the corosync service stops.
> 
> Yes, but at shutdown, wouldn't it be better to stop pmxcfs before corosync?
> I ask because both times I had the problem, it was while shutting down a server.
> So maybe some strange behaviour occurs when corosync and pmxcfs are stopped at the same time?
> 
> 
> Looking at the pve-cluster unit file,
> why do we have "Before=corosync.service" and not "After=corosync.service"?

We may need to sync the cluster-wide corosync.conf over to the local one, and
that can only happen before corosync starts.

Also, if we shut down pmxcfs before corosync, we may still get corosync events
(file writes, locking, ...) that the node no longer sees locally while it still
looks quorate to the others; that would not be good.
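
For reference, a minimal sketch of the ordering in question, written as a
drop-in rather than copied from the shipped unit file (the drop-in path below
is only an illustration). As documented in systemd.unit(5), units ordered with
Before=/After= are stopped in the reverse of their start order, so this
directive should mean pve-cluster starts before corosync and corosync stops
before pve-cluster:

```ini
# /etc/systemd/system/pve-cluster.service.d/ordering.conf
# (hypothetical drop-in path, for illustration only)
[Unit]
# Start: pve-cluster.service first, then corosync.service.
# Stop:  corosync.service first, then pve-cluster.service (reverse order).
Before=corosync.service
```

`systemctl show -p Before -p After pve-cluster.service` prints the ordering
dependencies systemd actually resolved for the unit.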

> 
> I have tried to change this, but even with that, both are still shut down in parallel.
> 
> The only way I have found to get a clean shutdown is "Requires=corosync.service" + "After=corosync.service".
> But that means that if you restart corosync, pmxcfs gets restarted too, first.
> 
> I have looked at the systemd docs; After= should be enough (as at shutdown the order is reversed),
> but I don't know why corosync doesn't wait for pve-cluster ???
> 
> 
> (Also, I think that pmxcfs is stopped after syslog, because I never see the pmxcfs "teardown filesystem" logs at shutdown)


Is that true for (persistent) systemd-journald too? IIRC syslog.target is
deprecated and only rsyslog provides it.
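
One way to check, assuming the journal is kept on disk:
`journalctl -b -1 -u pve-cluster.service` shows what the unit logged during
the previous boot, including its shutdown. A minimal sketch of the journald
setting that forces a persistent journal (with the default Storage=auto the
journal is only persistent if /var/log/journal exists):

```ini
# /etc/systemd/journald.conf
[Journal]
# Store the journal in /var/log/journal so that messages logged during
# shutdown are still available after the reboot.
Storage=persistent
```

The setting takes effect once systemd-journald has been restarted or the node
rebooted.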

As the next Debian will enable the persistent journal by default and we already
use it for everything (IIRC) where we provide an interface to logs, we will
probably not enable rsyslog by default with PVE 7.x.

But if adding some ordering would improve this, I'm open to it.




