[pve-devel] corosync problems - need help

Alexandre DERUMIER aderumier at odiso.com
Sun Sep 14 09:17:57 CEST 2014


>># service pve-cluster restart 
>>
>>Note: You also need to restart depending services afterwards: 
>>
>># service pvedaemon restart 
>># service pveproxy restart 
>># service pvestatd restart 

That doesn't help.

Another strange thing is that tcpdump shows multicast traffic on port 5054 only from the 2 flooding nodes with retransmits.

None of the other nodes seem to send anything.
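
For anyone wanting to repeat this check, here is a minimal sketch of such a capture. The interface name eth0 is an assumption (hypothetical, not taken from this thread); the port is the one mentioned above. The script only prints the command so it can be reviewed before being run as root on each node.

```shell
#!/bin/sh
# Assumption: eth0 is the cluster network interface (hypothetical name).
IFACE="eth0"
# Port as reported in the message above.
PORT="5054"

# Build the tcpdump command line:
#   -n        do not resolve addresses to names
#   -i IFACE  capture on the given interface
#   "udp port PORT" is a standard pcap filter expression
capture_cmd() {
    echo "tcpdump -n -i $IFACE udp port $PORT"
}

# Print the command; run it as root on each node and compare
# which source addresses show up in the capture.
capture_cmd
```

Comparing the source addresses seen on each node would show whether only the retransmitting nodes are sending.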




----- Original Message -----

From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Dietmar Maurer" <dietmar at proxmox.com>
Cc: pve-devel at pve.proxmox.com
Sent: Sunday 14 September 2014 09:05:45
Subject: Re: [pve-devel] corosync problems - need help

>>What kernel do you run? 2.6.32 or 3.10.0? 

One node runs 2.6.32, the other runs 3.10.


>>What is different on those nodes? kernel, network cards?

All nodes are the same model, but I have 3 nodes with kernel 3.10 and 8 nodes with kernel 2.6.32.
(I'm currently migrating all nodes to 3.10.)

I added 2 nodes (kvm11, kvm12) with the 3.10 kernel 1 week ago (without any multicast problem).



>> I would like to try to change corosync window_size, but how can I do it online ? 
> 
>On all nodes? 

Yes, if possible. As I can't edit cluster.conf (read only), I don't know how to inject it online.


>> (and /etc/init.d/cman stop is hanging) 
> 
>I guess you already tried to reboot that node? 

I can't reboot it for now: it's a production node, and I can't live migrate because pmxcfs is read only.


I'll try to restart all services on all nodes to see if it helps.

----- Original Message -----

From: "Dietmar Maurer" <dietmar at proxmox.com>
To: "Alexandre DERUMIER" <aderumier at odiso.com>, pve-devel at pve.proxmox.com
Sent: Sunday 14 September 2014 08:41:09
Subject: RE: [pve-devel] corosync problems - need help

> on this cluster, 2 nodes show "corosync [TOTEM ] Retransmit List" 

What kernel do you run? 2.6.32 or 3.10.0? 
What is different on those nodes? kernel, network cards? 

> But I can't write anything in pmxcfs on any node (read is ok) 
> 
> with a lot of errors like this
> kvm1 pmxcfs[65403]: [dcdb] notice: cpg_join retry 32310 
> kvm1 pmxcfs[65403]: [dcdb] notice: cpg_join retry 32320 
> kvm1 pmxcfs[65403]: [dcdb] notice: cpg_join retry 32330 
> 
> 
> 
> Any idea ? 

Does it help if you restart the cluster file system: 

# service pve-cluster restart 

Note: You also need to restart depending services afterwards: 

# service pvedaemon restart 
# service pveproxy restart 
# service pvestatd restart 

> I would like to try to change corosync window_size, but how can I do it online ? 

On all nodes? 

> (and /etc/init.d/cman stop is hanging) 

I guess you already tried to reboot that node? 