[pve-devel] [PATCH v2 guest-common 1/2] partially fix 3111: snapshot rollback: improve removing replication snapshots

Fabian Ebner f.ebner at proxmox.com
Wed Jun 9 11:18:57 CEST 2021

Get the replicatable volumes from the snapshot config rather than the current
config. And filter those volumes further to those that will actually be rolled
back.

Previously, a volume that only had replication snapshots (e.g. because it was
added after the snapshot was taken, or the vmstate volume) would lose them.
Then, on the next replication run, such a volume would lead to an error, because
replication tried to do a full sync, but the target volume still exists.

Should be enough for most real-world scenarios, but not a complete fix:
It is still possible to run into the problem by removing the last
(non-replication) snapshots after a rollback before replication can run once.

prepare() does not require the list of volumes to be sorted. The list now
follows the order in which foreach_volume() iterates, so it is still
deterministic rather than random.

Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>

Changes from v1:
    * dropped already applied patch
    * rebased on top of the new filename (since a src/ prefix was added)
    * add another comment mentioning why the additional filtering is necessary
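
The additional filtering described above can be illustrated with a minimal,
standalone sketch. It is not the code from this patch: the volume IDs, the
drive list and the helper filter_rollback_volids() are hypothetical stand-ins,
and it assumes that get_replicatable_volumes() returns a hash keyed by volid
(possibly including the vmstate volume) and that the volid key is 'file'.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the result of get_replicatable_volumes() on the
# snapshot config: volid => 1, assumed here to include the vmstate volume.
my $volumes = {
    'local-zfs:vm-100-disk-0' => 1,
    'local-zfs:vm-100-disk-1' => 1,
    'local-zfs:vm-100-state-snap1' => 1,
};

# Hypothetical stand-in for the volumes foreach_volume() would visit in the
# snapshot config: no vmstate, and one drive on a non-replicatable storage.
my $snap_drives = [
    { file => 'local-zfs:vm-100-disk-0' },
    { file => 'local-zfs:vm-100-disk-1' },
    { file => 'nfs:100/vm-100-disk-2.qcow2' },
];

# Keep only volids that are both replicatable and visited during rollback,
# preserving the iteration order of the drive list.
sub filter_rollback_volids {
    my ($replicatable, $drives, $volid_key) = @_;

    my $volids = [];
    for my $volume (@$drives) {
        my $volid = $volume->{$volid_key};
        push @$volids, $volid if $replicatable->{$volid};
    }
    return $volids;
}

my $volids = filter_rollback_volids($volumes, $snap_drives, 'file');
print "$_\n" for @$volids;
```

In this sketch the vmstate volume and the drive on the non-replicatable storage
are both dropped, so only the two disks that the rollback touches would be
handed to prepare().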

 src/PVE/AbstractConfig.pm | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
index 3348d8a..6542ae4 100644
--- a/src/PVE/AbstractConfig.pm
+++ b/src/PVE/AbstractConfig.pm
@@ -974,13 +974,23 @@ sub snapshot_rollback {
 	if ($prepare) {
 	    my $repl_conf = PVE::ReplicationConfig->new();
 	    if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
-		# remove all replication snapshots
-		my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $conf, 1);
-		my $sorted_volids = [ sort keys %$volumes ];
+		# remove replication snapshots on volumes affected by rollback *only*!
+		my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);
+		# filter by what we actually iterate over below (excludes vmstate!)
+		my $volids = [];
+		$class->foreach_volume($snap, sub {
+		    my ($vs, $volume) = @_;
+		    my $volid_key = $class->volid_key();
+		    my $volid = $volume->{$volid_key};
+		    push @{$volids}, $volid if $volumes->{$volid};
+		});
 		# remove all local replication snapshots (jobid => undef)
 		my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
-		PVE::Replication::prepare($storecfg, $sorted_volids, undef, 1, undef, $logfunc);
+		PVE::Replication::prepare($storecfg, $volids, undef, 1, undef, $logfunc);
 	    $class->foreach_volume($snap, sub {

