[PVE-User] Recurring crashes after cluster upgrade from 5 to 6

Hervé Ballans herve.ballans at ias.u-psud.fr
Wed Oct 2 18:09:13 CEST 2019


Hi Alexandre,

We encounter exactly the same problem as Laurent Caron (after upgrading 
from 5 to 6).

So I tried your patch 3 days ago, but unfortunately, the problem still 
occurs...

This is a really annoying problem, since sometimes all the PVE nodes of 
our cluster reboot almost simultaneously!
At the same time, we don't encounter this problem on our other PVE 
cluster, which is still on version 5.
(And obviously we are waiting for a solution and a stable situation 
before upgrading it!)

It seems to be a unicast or corosync 3 problem, but the logs are not very 
verbose at the time of the reboot...

Is there anything else to test?

Regards,
Hervé

On 20/09/2019 at 17:00, Alexandre DERUMIER wrote:
> Hi,
>
> a patch is available in pvetest
>
> http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb
>
> Can you test it?
>
> (You need to restart corosync after installing the deb.)
>
>
> ----- Original message -----
> From: "Laurent CARON" <lcaron at unix-scripts.info>
> To: "proxmoxve" <pve-user at pve.proxmox.com>
> Sent: Monday 16 September 2019 09:55:34
> Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6
>
> Hi,
>
>
> After upgrading our 4-node cluster from PVE 5 to 6, we experience
> constant crashes (once every 2 days).
>
> Those crashes seem related to corosync.
>
> Since numerous users are reporting such issues (broken cluster after
> upgrade, instabilities, ...) I wonder if it is possible to downgrade
> corosync to version 2.4.4 without impacting functionality?
>
> Basic steps would be:
>
> On all nodes
>
> # systemctl stop pve-ha-lrm
>
> Once done, on all nodes:
>
> # systemctl stop pve-ha-crm
>
> Once done, on all nodes:
>
> # apt-get install corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 \
>     libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 \
>     libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1
>
> Then, once corosync has been downgraded, on all nodes
>
> # systemctl start pve-ha-lrm
> # systemctl start pve-ha-crm
>
> Would that work?
>
> Thanks
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
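
For reference, the downgrade steps proposed in the quoted message could be sketched as a small dry-run helper. This is only an illustration, not Proxmox tooling: the DRY_RUN variable and the function names (run, stop_lrm, stop_crm, downgrade, start_ha) are made up for this sketch, and the package versions are copied as-is from the message. Whether the downgrade is actually safe remains the open question of this thread. By default the helper only prints the commands, so each phase can be reviewed and then run on every node before moving to the next one.

```shell
#!/bin/sh
# Dry-run sketch of the downgrade steps quoted above.
# Assumption: DRY_RUN and all function names are hypothetical helpers.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Package versions exactly as listed in the quoted message.
PKGS="corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Phase 1: on all nodes, stop the local resource manager.
stop_lrm() { run systemctl stop pve-ha-lrm; }

# Phase 2: once phase 1 is done everywhere, stop the cluster resource manager.
stop_crm() { run systemctl stop pve-ha-crm; }

# Phase 3: install the pinned corosync 2.4.4 packages.
downgrade() {
    # Intentionally unquoted: $PKGS must split into one argument per package.
    run apt-get install $PKGS
}

# Phase 4: restart the HA services in the order given in the message.
start_ha() {
    run systemctl start pve-ha-lrm
    run systemctl start pve-ha-crm
}
```

After sourcing the file, calling stop_lrm, stop_crm, downgrade and start_ha in turn prints each command prefixed with "+"; setting DRY_RUN=0 beforehand would execute them instead.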



