[pve-devel] corosync bug: cluster break after 1 node clean shutdown

Thomas Lamprecht t.lamprecht at proxmox.com
Wed Sep 16 16:45:12 CEST 2020


On 9/16/20 3:15 PM, Alexandre DERUMIER wrote:
> I have reproduced it again, with pmxcfs in debug mode.
> 
> corosync was restarted at 15:02:10, and it was already blocked on the other nodes at 15:02:12.
> 
> pmxcfs was still logging after the lock.
> 
> 
> Here is the log from node1, where corosync was restarted:
> 
> http://odisoweb1.odiso.net/pmxcfs-corosync.log
> 


Thanks for those, I need a bit of time to sift through them. It seems like either
dfsm gets out of sync or we do not get an ACK reply from cpg_send.

A full core dump would still be nice. In gdb:
generate-core-file

PS: instead of manually switching to threads you can do:
thread apply all bt full

to get a backtrace for all threads in one command
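
If you prefer not to type this in an interactive session, both gdb commands
above can probably be run in one go with gdb's batch mode. This is only a rough
sketch: the helper name dump_threads_and_core, the output path prefix and the
GDB override variable are made up for illustration and are not part of any
Proxmox tooling. Note that the core file can get large.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the thread): attach gdb to a
# running process, save full backtraces of all threads, then write a core dump.
dump_threads_and_core() {
    pid="$1"   # PID of the process to inspect
    out="$2"   # output path prefix (assumed), e.g. /tmp/pmxcfs

    # -batch runs the -ex commands in order and then exits, which detaches
    # from the process. GDB can be overridden, e.g. for a dry run.
    ${GDB:-gdb} -p "$pid" -batch \
        -ex "thread apply all bt full" \
        -ex "generate-core-file ${out}.core" > "${out}.bt.txt" 2>&1
}

# Example call (as root, on the affected node):
#   dump_threads_and_core "$(pidof pmxcfs)" /tmp/pmxcfs
```

The backtraces would end up in <prefix>.bt.txt and the core dump in
<prefix>.core.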




