[pve-devel] [PATCH v4 pve-storage 1/5] qcow2: add external snapshot support
DERUMIER, Alexandre
alexandre.derumier at groupe-cyllene.com
Wed Apr 2 10:01:44 CEST 2025
>
> @@ -716,7 +721,11 @@ sub filesystem_path {
>
> my $dir = $class->get_subdir($scfg, $vtype);
>
> - $dir .= "/$vmid" if $vtype eq 'images';
> + if ($scfg->{snapext} && $snapname) {
> + $name = $class->get_snap_volname($volname, $snapname);
> + } else {
> + $dir .= "/$vmid" if $vtype eq 'images';
> + }
>>this is a bit weird, as it mixes volnames (with the `$vmid/` prefix)
>>and names (without), it's only called twice in this patch, and this
>>here already has $volname parsed, so could we maybe let
>>get_snap_volname take and return the $name part without the dir?
ok!
>
> my $path = "$dir/$name";
>
> @@ -873,7 +882,7 @@ sub clone_image {
> }
>
> sub alloc_image {
> - my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
> + my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size,
> $backing) = @_;
>>this extends the storage API, so it should actually do that.. and
>>probably $backing should not be an arbitrary path, but something that
>>is resolved locally?
I'll send the $snapname as param instead
> +
> + if($backing) {
> + push @$cmd, '-b', $backing, '-F', 'qcow2';
> + push @$options, 'extended_l2=on','cluster_size=128k';
> + };
> + push @$options, preallocation_cmd_option($scfg, $fmt);
> + push @$cmd, '-o', join(',', @$options) if @$options > 0;
> + push @$cmd, '-f', $fmt, $path;
> + push @$cmd, "${size}K" if !$backing;
>>is this because it will automatically take the size from the backing
>>image?
Yes, it take size from the backing. (and refuse if you specify size
param at the same time than backing file)
> + my $path = $class->path($scfg, $volname, $storeid);
> + my $snappath = $class->path($scfg, $volname, $storeid, $snap);
> + #rename current volume to snap volume
> + die "snapshot volume $snappath already exist\n" if -e $snappath;
> + rename($path, $snappath) if -e $path;
>>this is still looking weird.. I don't think it makes sense interface
>>wise to allow snapshotting a volume that doesn't even exist..
This is more by security, I'm still unsure of the behaviour if you have
multiple disks, and that snapshot is dying in the middle. (1 disk
rename, the other not renamed).
> +
> + my ($vtype, $name, $vmid, undef, undef, $isBase, $format) =
> + $class->parse_volname($volname);
> +
> + $class->alloc_image($storeid, $scfg, $vmid, 'qcow2', $name, undef,
> $snappath);
> + if ($@) {
> + eval { $class->free_image($storeid, $scfg, $volname, 0) };
> + warn $@ if $@;
>>missing cleanup - this should undo the rename from above
Do you have an idea how to do it with mutiple disk ?
(revert renaming of other disks elsewhere in the code? just keep them
like this)?
>
> @@ -1187,9 +1251,15 @@ sub volume_snapshot_rollback {
>
> my $path = $class->filesystem_path($scfg, $volname);
>
> - my $cmd = ['/usr/bin/qemu-img', 'snapshot','-a', $snap, $path];
> -
> - run_command($cmd);
> + if ($scfg->{snapext}) {
> + #simply delete the current snapshot and recreate it
> + my $path = $class->filesystem_path($scfg, $volname);
> + unlink($path);
> + $class->volume_snapshot($scfg, $storeid, $volname, $snap);
>>instead of volume_snapshot, this could simply call alloc_image with
>>the backing file? then volume_snapshot could always rename and always
>>cleanup properly..
Yes , better like this indeed
>
> + } else {
> + #we rebase the child image on the parent as new backing image
>>should we extend this to make it clear what this means? it means
>>copying any parts of $snap that are not in $parent and not yet
>>overwritten by $child into $child, right?
>>
yes,
intermediate snapshot: (rebase)
-------------------------------
snap1 (10G)-->snap2 (1G)----current(1G)
---> delete snap2
rebase current on snap1
snap1(10G)----->current(2G)
or
snap1 (10G)-->snap2 (1G)----> snap3 (1G)--->current(1G)
---> delete snap2
rebase snap3 on snap1
snap1 (10G)---> snap3 (2G)--->current(1G)
>>so how expensive this is depends on:
>>- how many changes are between $parent and $snap (increases cost)
>>- how many of those are overwritten by changes between $snap and
>>$child (decreases cost)
but yes, the final size of the child is not 100% the additional content
of the deleted snapshot, if some blocks has already been overwriten in
the child
so, we could call it: "merge diff content of the delete snap to the
childsnap"
>>+sub get_snap_volname {
>>+ my ($class, $volname, $snapname) = @_;
>>+
>>+ my ($vtype, $name, $vmid, $basename, $basevmid, $isBase,
>>$format) = $class->parse_volname($volname);
+ $name = !$snapname || $snapname eq 'current' ? $volname :
"$vmid/snap-$snapname-$name";
>>other way round would be better to group by volume first IMHO
($vmid/snap-$name-$snapname), as this is similar to how we encode
>>snapshots often on the storage level (volume at snap). we also need to
>>have some delimiter between snapshot and volume name that is not
>>allowed in either (hard for volname since basically everything but
>>'/' goes, but snapshots have a restricted character set (configid,
>>which means alphanumeric, hyphen and underscore), so we could use
>>something like '.' as delimiter? or we switch to directories and do
>>$vmid/snap/$snap/$name?)
Personnaly, I prefer a '.' delimiter than sub directory. (better to
have the info in the filename itself)
More information about the pve-devel
mailing list