[pve-devel] corosync problems - need help

Alexandre DERUMIER aderumier at odiso.com
Sun Sep 14 09:05:45 CEST 2014


>>What kernel do you run? 2.6.32 or 3.10.0? 

One node runs 2.6.32, one node runs 3.10.


>>What is different on those nodes? kernel, network cards? 

All nodes are the same model, but 3 nodes run kernel 3.10 and 8 nodes run kernel 2.6.32.
(I'm currently migrating all nodes to 3.10.)

I added 2 nodes (kvm11, kvm12) with the 3.10 kernel one week ago, without any multicast problems.



>> I would like to try to change corosync window_size, but how can I do it online ? 
>
>On all nodes? 

Yes, if possible. Since I can't edit cluster.conf (it is read-only), I don't know how to inject the change online.


>> (and /etc/init.d/cman stop is hanging) 
>
>I guess you already tried to reboot that node? 

I can't reboot it for now: it's a production node, and I can't live-migrate because pmxcfs is read-only.


I'll try restarting all services on all nodes to see if it helps.

----- Original message ----- 

From: "Dietmar Maurer" <dietmar at proxmox.com> 
To: "Alexandre DERUMIER" <aderumier at odiso.com>, pve-devel at pve.proxmox.com 
Sent: Sunday 14 September 2014 08:41:09 
Subject: RE: [pve-devel] corosync problems - need help 

> on this cluster, 2 nodes show "corosync [TOTEM ] Retransmit List" 

What kernel do you run? 2.6.32 or 3.10.0? 
What is different on those nodes? kernel, network cards? 

> But I can't write anything in pmxcfs on any node (read is ok) 
> 
> with a lot of errors like this: 
> kvm1 pmxcfs[65403]: [dcdb] notice: cpg_join retry 32310 
> kvm1 pmxcfs[65403]: [dcdb] notice: cpg_join retry 32320 
> kvm1 pmxcfs[65403]: [dcdb] notice: cpg_join retry 32330 
> 
> 
> 
> Any idea ? 

Does it help if you restart the cluster file system: 

# service pve-cluster restart 

Note: You also need to restart the dependent services afterwards: 

# service pvedaemon restart 
# service pveproxy restart 
# service pvestatd restart 
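
The restart order above could be wrapped in a small helper so it can be repeated the same way on each node. This is only a sketch: the function name restart_pve_stack and the DRY_RUN variable are made up for illustration and are not part of Proxmox. The only real commands used are the four `service ... restart` calls listed above.

```shell
#!/bin/sh
# Sketch: restart the cluster file system first, then the services
# that depend on it, in the order given above.
# restart_pve_stack and DRY_RUN are hypothetical names, not Proxmox tools.
# With DRY_RUN=1 the commands are only printed, not executed.
restart_pve_stack() {
    for svc in pve-cluster pvedaemon pveproxy pvestatd; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "service $svc restart"
        else
            # Stop at the first service that fails to restart.
            service "$svc" restart || return 1
        fi
    done
}
```

Running `DRY_RUN=1 restart_pve_stack` first shows what would be executed; calling `restart_pve_stack` as root then performs the restarts and stops at the first failure.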

> I would like to try to change corosync window_size, but how can I do it online ? 

On all nodes? 

> (and /etc/init.d/cman stop is hanging) 

I guess you already tried to reboot that node? 


