[pve-devel] [PATCH v3 pve-storage 1/3] qcow2: add external snapshot support

DERUMIER, Alexandre alexandre.derumier at groupe-cyllene.com
Fri Jan 10 10:10:54 CET 2025


> @@ -710,11 +715,15 @@ sub filesystem_path {
>      # Note: qcow2/qed has internal snapshot, so path is always
>      # the same (with or without snapshot => same file).
>      die "can't snapshot this image format\n"
> - if defined($snapname) && $format !~ m/^(qcow2|qed)$/;
> ²

>>I am not sure if we want to allow snapshots for non-qcow2 files just
>>because snapext is enabled? I know it's technically possible to have
>>a raw base image and then a qcow2 backing chain on top, but this
>>quickly becomes confusing (how is the volume named then? which format
>>does it have in which context)..

in the V2 I was allowing it, but for this V3 series, I only manage
external snasphot with qcow2 files. (with the snapshot file renaming,
It'll be too complex to manage, confusing for user indeed... )

I think I forgot to clean this in the V3, the check should be simply

die "can't snapshot this image format\n" if defined($snapname) &&
$format !~ m/^(qcow2|qed)$/;





> 
>      die "can't snapshot this image format\n" if $volname !~
> m/\.(qcow2|qed)$/;
>  
> -    my $path = $class->filesystem_path($scfg, $volname);
> +    if($scfg->{snapext}) {
>  
> -    my $cmd = ['/usr/bin/qemu-img', 'snapshot','-c', $snap, $path];
> + my $path = $class->path($scfg, $volname, $storeid);
> + my $snappath = $class->path($scfg, $volname, $storeid, $snap);
> + my $format = ($class->parse_volname($volname))[6];
> + #rename current volume to snap volume
> + rename($path, $snappath) if -e $path && !-e $snappath;

>>I think this should die if the snappath already exists, and the one
>>(IMHO wrong) call in qemu-server should switch to
>>vdisk_alloc/alloc_image.. this is rather dangerous otherwise!

right !



> +    if ($scfg->{snapext}) {
> + #technically, we could manage multibranch, we it need lot more work
> for snapshot delete
> + #we need to implemente block-stream from deleted snapshot to all
> others child branchs

>>see my comments in qemu-server - I think we actually want block-
>>stream anyway, since it has the semantics we want..

I don't agree, we don't want always, because with block-stream, you
need to copy parent to child.

for example, you have a 1TB image,  you take a snapshot, writing 5MB in
the snapshot, delete the snapshot,  you'll need to read/copy 1TB data
from parent to the snapshot file.  
I don't read your qemu-server comment yet ;)


> + #when online, we need to do a transaction for multiple disk when
> delete the last snapshot
> + #and need to merge in current running file
> +
> + my $snappath = $class->path($scfg, $volname, $storeid, $snap);
> + my $snapshots = $class->volume_snapshot_info($scfg, $storeid,
> $volname);
> + my $parentsnap = $snapshots->{current}->{parent};
> +
> + return 1 if !-e $snappath || $snapshots->{$parentsnap}->{file} eq
> $snappath;

>>why do we return 1 here if the snapshot doesn't exist? if we only
>>allow rollback to the most recent snapshot for now, then we could
>>just query the current path and see if it is backed by our snapshot?

I think I forget to remove this this from the V2. But the idea is to
check indead if the snapshot back the current image ( with $snapshots-
>{current}->{parent}.  

> +
> + die "can't rollback, '$snap' is not most recent snapshot on
> '$volname'\n";
> +    }
> +
>      return 1;
>  }
>  
> @@ -1201,13 +1257,52 @@ sub volume_snapshot_delete {
>  
>      return 1 if $running;
>  
> +    my $cmd = "";
>      my $path = $class->filesystem_path($scfg, $volname);
>  
> -    $class->deactivate_volume($storeid, $scfg, $volname, $snap, {});
> +    if ($scfg->{snapext}) {
>  
> -    my $cmd = ['/usr/bin/qemu-img', 'snapshot','-d', $snap, $path];
> + my $snapshots = $class->volume_snapshot_info($scfg, $storeid,
> $volname);
> + my $snappath = $snapshots->{$snap}->{file};
> + return if !-e $snappath;  #already deleted ?

>>shouldn't this be an error?

This one was if we want to do retry in case of error, if we have
multiple disks. (for example, first snapshot delete api call,  the
first disk remove the snapshot, but a bug occur and second disk don't
remove the snapshot). 

User could want to unlock the vm-snaphot lock and  and fix it manually
with calling again the snapshot delete.

I'm not sure how to handle this correctly ?

> +     print"commit $childpath\n";
> +     $cmd = ['/usr/bin/qemu-img', 'commit', $childpath];
> +     run_command($cmd);
> +     print"delete $childpath\n";
> +
> +     unlink($childpath);

this unlink can be skipped?

> +     print"rename $snappath to $childpath\n";
> +     rename($snappath, $childpath);

>>since this will overwrite $childpath anyway.. this also reduces the
>>chance of something going wrong:
>>
>>- if the commit fails halfway through, nothing bad should have
>>happened, other than some data is now stored in two snapshots and
>>takes up extra space
>>- if the rename fails, then all of the data of $snap is stored twice,
>>but the backing chain is still valid
>>
>>notable, there is no longer a gap where $childpath doesn't exist,
>>which would break the backing chain!

yes you are right, better to have it atomic indeed


> + } else {
> +     print"commit $snappath\n";
> +     $cmd = ['/usr/bin/qemu-img', 'commit', $snappath];

>>leftover from previous version? not used/overwritten below ;)

no, this is really to commit the the snapshot to parent

> +     #if we delete an intermediate snapshot, we need to link upper
> snapshot to base snapshot
> +     die "missing parentsnap snapshot to rebase child $childpath\n"
> if !$parentpath;
> +     print "link $childsnap to $parentsnap\n";
> +     $cmd = ['/usr/bin/qemu-img', 'rebase', '-u', '-b', $parentpath,
> '-F', 'qcow2', '-f', 'qcow2', $childpath];

>>does this work? I would read the qemu-img manpage to say that '-u' is
>>for when you've moved/converted the backing file, and want to update
>>the reference in its overlay, and that it doesn't copy any data.. but
>>we need to copy the data from $snap to $childpath (we just want to
>>delete the snapshot, we don't want to drop all its changes from the
>>history, that would corrupt the contents of the image).
>>note the description of the "safe" variant:
>>
>>"                     This  is  the  default mode and performs a real
>>rebase operation. The new backing file may differ from the old one
>>and qemu-img rebase will take care of keeping the
>>                     guest-visible content of FILENAME unchanged."
>>
>>IMHO this is the behaviour we need here?

This is only to change the backing chain ref in the qcow2 snapshot.
(this is the only way to do it, they was a qemu-img ammend command in
past, but it has been removed in
2020 https://patchwork.kernel.org/project/qemu-devel/patch/20200403175859.863248-5-eblake@redhat.com/,
so the rebase is the good way to do it)

The merge is done by the previous qemu-img commit. (qemu-img commit
can't change  change automatically the backing chain of the upper
snapshot, because it don't have any idea than an upper snapshot could
exist).

this is for this usecase :

A<----B<----C.

you commit B to A,  then you need to change the backing file of C to A
(instead B)

A<----C

(when done it live, qemu qmp block-commit is able to change
automatically the backing chain of the upper snapshot, because qemu
known the whole chain)

This is how libvirt is doing too
https://kashyapc.fedorapeople.org/virt/lc-2012/snapshots-handout.html
see "Deleting snapshots (and 'offline commit')"
Method (1): base <- sn1 <- sn3 (by copying sn2 into sn1)
Method (2): base <- sn1 <- sn3 (by copying sn2 into sn3)
(This is commit vs stream)


I think that we should look at used space of parent vs child,
to choose the correct direction/method.


> +     run_command($cmd);
> +     #delete the snapshot
> +     unlink($snappath);
> + }
> +
> +    } else {
> + $class->deactivate_volume($storeid, $scfg, $volname, $snap, {});
>  
> -    run_command($cmd);
> + $cmd = ['/usr/bin/qemu-img', 'snapshot','-d', $snap, $path];
> + run_command($cmd);
> +    }
>  
>      return undef;
>  }
> @@ -1246,8 +1341,8 @@ sub volume_has_feature {
>       current => { qcow2 => 1, raw => 1, vmdk => 1 },
>   },
>   rename => {
> -     current => {qcow2 => 1, raw => 1, vmdk => 1},
> - },
> +     current => { qcow2 => 1, raw => 1, vmdk => 1},
> + }

>>nit: unrelated change?
yep
>      };
>  
>      if ($feature eq 'clone') {
> @@ -1481,7 +1576,37 @@ sub status {
>  sub volume_snapshot_info {
>      my ($class, $scfg, $storeid, $volname) = @_;
>  
> -    die "volume_snapshot_info is not implemented for $class";

>>should this be guarded with $snapext being enabled?

yes indeed





More information about the pve-devel mailing list