[pve-devel] corosync problems - need help

Alexandre DERUMIER aderumier at odiso.com
Sun Sep 14 15:41:26 CEST 2014


>>I am curious - you have done that on all nodes, or only on the failing 2 nodes?

Yes, I had to do it on all nodes.



I have done more investigation, and now I can reproduce the problem 100% of the time.

The problem seems to come from a specific node: kvm11

When I start cman on this node, I get the following message on all other nodes:

pmxcfs[31484]: [status] notice: cpg_send_message retry XX

It is the same hardware as the other nodes, so I need to check the network layer.


On the faulty node, I also see some pmxcfs segfaults in dmesg:

[976776.602200] pmxcfs[3130]: segfault at 7ff1dcadef08 ip 00007ff1dcadef08 sp 00007fffd89cfe68 error 15
[977517.260211] pmxcfs[4947]: segfault at 1956b00 ip 0000000001956b00 sp 00007ffff3b109e8 error 15
[980494.722550] pmxcfs[15205]: segfault at 7f712457ef08 ip 00007f712457ef08 sp 00007fff4a916668 error 15
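
As a reading aid for the segfault lines above, here is a minimal Python sketch that splits such a line into its fields and decodes the error code. It assumes the usual x86 page-fault error-code bit layout and that the kernel prints the code in hexadecimal (so "error 15" is 0x15). The function name decode_segfault is hypothetical, and this is only an interpretation aid, not a diagnosis of the pmxcfs crash.

```python
import re

# Matches the user-space segfault line format seen in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[\]]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return the decoded fields of a segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)    # assumption: error code is printed in hex
    addr = int(m.group("addr"), 16)  # faulting address
    ip = int(m.group("ip"), 16)      # instruction pointer at the time of the fault
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        # Assumed x86 page-fault error-code bits:
        "protection_violation": bool(err & 0x01),  # page present, access denied
        "write": bool(err & 0x02),                 # write access (else read)
        "user_mode": bool(err & 0x04),             # fault raised in user mode
        "reserved_bit": bool(err & 0x08),          # reserved bit set in page table
        "instruction_fetch": bool(err & 0x10),     # fault on instruction fetch
        "fault_at_ip": addr == ip,                 # CPU tried to execute the faulting address
    }


if __name__ == "__main__":
    sample = ("[976776.602200] pmxcfs[3130]: segfault at 7ff1dcadef08 "
              "ip 00007ff1dcadef08 sp 00007fffd89cfe68 error 15")
    for key, value in decode_segfault(sample).items():
        print(key, value)
```

Under these assumptions, all three lines decode to a user-mode instruction fetch with the faulting address equal to the instruction pointer, which would be consistent with pmxcfs jumping to a non-executable address (for example through a corrupted pointer).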



----- Original Message -----

From: "Dietmar Maurer" <dietmar at proxmox.com>
To: "Alexandre DERUMIER" <aderumier at odiso.com>
Cc: pve-devel at pve.proxmox.com
Sent: Sunday, 14 September 2014 12:53:45
Subject: RE: [pve-devel] corosync problems - need help

> OK, I finally solved it:
> 
> kill -9 dlm_controld 
> kill -9 corosync -f 
> 
> and service cman start 
> 
> 
> Now all is working fine again. 

I am curious - you have done that on all nodes, or only on the failing 2 nodes?


