[PVE-User] Recurring crashes after cluster upgrade from 5 to 6
Eneko Lacunza
elacunza at binovo.es
Tue Nov 12 16:38:12 CET 2019
Hi all,
We are also seeing this with 5.4-3 clusters: a node was fenced in each of two
different clusters without any apparent reason.
Neither of the clusters had ever had a node fenced before...
Cheers
Eneko
On 7/11/19 at 15:35, Eneko Lacunza wrote:
> Hi all,
>
> We updated our office cluster to get the patch, but got a node reboot
> on 31st October. The node was fenced and rebooted, and everything
> continued working OK.
>
> Is anyone still experiencing this problem?
>
> Cheers
> Eneko
>
> On 2/10/19 at 18:09, Hervé Ballans wrote:
>> Hi Alexandre,
>>
>> We are encountering exactly the same problem as Laurent Caron (after
>> upgrading from 5 to 6).
>>
>> So I tried your patch 3 days ago, but unfortunately, the problem
>> still occurs...
>>
>> This is a really annoying problem, since sometimes all the PVE nodes
>> of our cluster reboot almost simultaneously!
>> At the same time, we don't encounter this problem with our other
>> PVE cluster, which is still on version 5.
>> (And obviously we are waiting for a fix and a stable situation
>> before upgrading it!)
>>
>> It seems to be a unicast or corosync3 problem, but the logs are not
>> very verbose around the time of the reboot...
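For anyone collecting evidence after such a reboot, here is a minimal sketch of pulling the cluster-related journal entries from the boot before the fence. It assumes the journal is persistent (for example, /var/log/journal exists), otherwise the previous boot will not be available. The unit list and the DRY_RUN variable are illustrative choices, not something prescribed in this thread.

```shell
#!/bin/sh
# Sketch: show journal entries of cluster-related units from the previous boot.
# Assumption: persistent journald storage, so that "-b -1" has data to show.
# DRY_RUN is a hypothetical helper variable; it defaults to 1 (print only).
DRY_RUN="${DRY_RUN:-1}"

# Units assumed to be relevant to fencing on a PVE node.
UNITS="corosync pve-cluster pve-ha-lrm pve-ha-crm watchdog-mux"

# Build the journalctl command line for the previous boot (-b -1).
journal_cmd() {
    cmd="journalctl --no-pager -b -1"
    for unit in $UNITS; do
        cmd="$cmd -u $unit"
    done
    echo "$cmd"
}

if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $(journal_cmd)"
else
    # Word splitting is intended here; none of the arguments contain spaces.
    $(journal_cmd)
fi
```

Run it with DRY_RUN=0 on the node that was fenced. The last lines before the boot boundary are usually the interesting ones.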
>>
>> Is there anything else to test?
>>
>> Regards,
>> Hervé
>>
>> On 20/09/2019 at 17:00, Alexandre DERUMIER wrote:
>>> Hi,
>>>
>>> A patch is available in pvetest:
>>>
>>> http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb
>>>
>>>
>>> Can you test it?
>>>
>>> (You need to restart corosync after installing the .deb.)
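A minimal sketch of the steps described above (download the test package, install it, restart corosync), to be run on each node. The download location under /tmp and the DRY_RUN variable are assumptions made for illustration. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: install the libknet1 test build from pvetest and restart corosync.
# DRY_RUN is a hypothetical helper variable; it defaults to 1 so nothing is
# changed unless you explicitly set DRY_RUN=0 and run the script as root.
DRY_RUN="${DRY_RUN:-1}"
DEB_URL="http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb"
DEB_FILE="/tmp/libknet1_1.11-pve2_amd64.deb"   # assumed download location

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

install_knet_patch() {
    run wget -O "$DEB_FILE" "$DEB_URL"   # fetch the test package
    run dpkg -i "$DEB_FILE"              # install it
    run systemctl restart corosync       # corosync must be restarted afterwards
}

install_knet_patch
```

Restarting corosync on one node at a time, rather than everywhere at once, is a cautious choice on a cluster with HA enabled.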
>>>
>>>
>>> ----- Original message -----
>>> From: "Laurent CARON" <lcaron at unix-scripts.info>
>>> To: "proxmoxve" <pve-user at pve.proxmox.com>
>>> Sent: Monday 16 September 2019 09:55:34
>>> Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6
>>>
>>> Hi,
>>>
>>>
>>> After upgrading our 4-node cluster from PVE 5 to 6, we are experiencing
>>> recurring crashes (once every 2 days).
>>>
>>> Those crashes seem related to corosync.
>>>
>>> Since numerous users are reporting such issues (broken cluster after
>>> upgrade, instabilities, ...), I wonder if it is possible to downgrade
>>> corosync to version 2.4.4 without impacting functionality?
>>>
>>> Basic steps would be:
>>>
>>> On all nodes
>>>
>>> # systemctl stop pve-ha-lrm
>>>
>>> Once done, on all nodes:
>>>
>>> # systemctl stop pve-ha-crm
>>>
>>> Once done, on all nodes:
>>>
>>> # apt-get install corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 \
>>>     libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 \
>>>     libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1
>>>
>>> Then, once corosync has been downgraded, on all nodes
>>>
>>> # systemctl start pve-ha-lrm
>>> # systemctl start pve-ha-crm
>>>
>>> Would that work?
>>>
>>> Thanks
>>>
>>> _______________________________________________
>>> pve-user mailing list
>>> pve-user at pve.proxmox.com
>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>>
>>
>>
>
>
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es