[pve-devel] [Patch V2 guest-common] fix #1694: Replication risks permanently losing sync in high loads due to timeout bug
Dietmar Maurer
dietmar at proxmox.com
Thu Apr 12 11:06:32 CEST 2018
> diff --git a/PVE/Replication.pm b/PVE/Replication.pm
> index 9bc4e61..d8ccfaf 100644
> --- a/PVE/Replication.pm
> +++ b/PVE/Replication.pm
> @@ -136,8 +136,18 @@ sub prepare {
> $last_snapshots->{$volid}->{$snap} = 1;
> } elsif ($snap =~ m/^\Q$prefix\E/) {
> $logfunc->("delete stale replication snapshot '$snap' on $volid");
> - PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snap);
> - $cleaned_replicated_volumes->{$volid} = 1;
> +
> + eval {
> + PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snap);
> + $cleaned_replicated_volumes->{$volid} = 1;
> + };
> +
> + # If deleting the snapshot fails, we can not be sure if it was due to an error or a timeout.
> + # The likelihood that the delete has worked out is high at a timeout.
> + # If it really fails, it will try to remove on the next run.
> + warn $@ if $@;
> +
> + $logfunc->("delete stale replication snapshot error: $@") if $@;
Why do we need this in prepare?