[pve-devel] corosync bug: cluster break after 1 node clean shutdown
Thomas Lamprecht
t.lamprecht at proxmox.com
Thu Sep 17 13:35:55 CEST 2020
On 9/17/20 12:02 PM, Alexandre DERUMIER wrote:
> if needed, here is my test script to reproduce it
Thanks, I'm now using this specific one. I had a similar one (but with all
nodes writing) running here for about two hours without luck so far; let's
see how this one behaves.
>
> node1 (restart corosync until node2 stops sending the timestamp)
> -----
>
> #!/bin/bash
>
> for i in `seq 10000`; do
>     now=$(date +"%T")
>     echo "restart corosync : $now"
>     systemctl restart corosync
>     for j in {1..59}; do
>         last=$(cat /tmp/timestamp)
>         curr=`date '+%s'`
>         diff=$(($curr - $last))
>         if [ $diff -gt 20 ]; then
>             echo "too old"
>             exit 0
>         fi
>         sleep 1
>     done
> done
>
>
>
> node2 (write to /etc/pve/test each second, then send the last timestamp to node1)
> -----
> #!/bin/bash
> for i in {1..10000}; do
>     now=$(date +"%T")
>     echo "Current time : $now"
>     curr=`date '+%s'`
>     ssh root@node1 "echo $curr > /tmp/timestamp"
>     echo "test" > /etc/pve/test
>     sleep 1
> done
>
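For anyone adapting the node1 loop: the staleness check could be factored into a small function, roughly as sketched below. This is only a sketch and not part of the scripts above; the name is_stale and its arguments are hypothetical. It assumes /tmp/timestamp holds a plain epoch value, and it treats a missing, empty or non-numeric file as stale, so the loop does not trip over a timestamp that node2 has not written yet.

```shell
#!/bin/bash
# Sketch only: is_stale and its arguments are hypothetical helper names,
# not part of the original test scripts.

# is_stale FILE MAX_AGE [NOW]
# Returns 0 (stale) if FILE is missing, is not a plain integer, or holds an
# epoch timestamp more than MAX_AGE seconds older than NOW (default: now).
is_stale() {
    local file=$1 max_age=$2 now=${3:-$(date '+%s')}
    local last

    # A missing or unreadable file means node2 has not reported in at all.
    [ -r "$file" ] || return 0
    last=$(cat "$file")

    # Guard against an empty or partially written value.
    case $last in
        ''|*[!0-9]*) return 0 ;;
    esac

    [ $((now - last)) -gt "$max_age" ]
}
```

In the inner loop of the node1 script this would be used as: if is_stale /tmp/timestamp 20; then echo "too old"; exit 0; fi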