[pve-devel] [PATCH v2 pve-storage 2/2] add lvmqcow2 plugin: (lvm with external qcow2 snapshot)

DERUMIER, Alexandre alexandre.derumier at groupe-cyllene.com
Wed Oct 23 15:45:07 CEST 2024


>>I am not yet convinced this is somehow a good idea, but maybe you can
>>convince me otherwise ;)
>>
>>variant A: this is just useful for very short-lived snapshots
>>variant B: these snapshots are supposed to be long-lived

Can you defined "short "/ "long"  ? and the different usecase ?

because for me, a snapshot is a snapshot. Sometime I take a snapshot
before doing some critical changes, but I can't known if I need to
rollback in next 2h, or next month.

I think that "long-lived" usecase is backup (but we don't need it),
or replication (this could apply here, if we want to add replication
for disaster recovery)


>>A is not something we want. we intentionally don't have non-thin LVM
>>snapshots for example.

AFAIK, we never had implemented it because LVM snasphot is slow as
hell.(as a lvm extent are around 4MB, if you want 4k on a snapshot, you
need to reread and rewrite the 4MB,  so around 1000x over-
amplification and slow iops)

This is really the main blocker for my customers migrating from vmware
(and to be honest I have some of them going to oracle ovlm (with
ovirt), because ovirt support it this way). 



>>B once I create a single snapshot, the "original" storage only
>>contains the data written up to that point, anything else is stored
>>on the "snapshot" storage. this means my snapshot storage must be at
>>least as fast/good/shared/.. as my original storage. in that case, I
>>can just use the snapshot storage directly and ditch the original
>>storage?

Sorry, but I don't understand why you are talking about
original/snapshot storage ? I never have thinked to use another storage
for external snapshot.

The patch is really to add external snapshot on same lvm storage,
through lvm additional volume, but with qcow2 format to have good
performance (vs slow lvm snapshot)









> Alexandre Derumier via pve-devel <pve-devel at lists.proxmox.com> hat am
> 30.09.2024 13:31 CEST geschrieben:
> Signed-off-by: Alexandre Derumier <alexandre.derumier at groupe-
> cyllene.com>
> ---
>  src/PVE/Storage.pm                |   2 +
>  src/PVE/Storage/LvmQcow2Plugin.pm | 460
> ++++++++++++++++++++++++++++++
>  src/PVE/Storage/Makefile          |   3 +-
>  3 files changed, 464 insertions(+), 1 deletion(-)
>  create mode 100644 src/PVE/Storage/LvmQcow2Plugin.pm
> 
> diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
> index 57b2038..119998f 100755
> --- a/src/PVE/Storage.pm
> +++ b/src/PVE/Storage.pm
> @@ -28,6 +28,7 @@ use PVE::Storage::Plugin;
>  use PVE::Storage::DirPlugin;
>  use PVE::Storage::LVMPlugin;
>  use PVE::Storage::LvmThinPlugin;
> +use PVE::Storage::LvmQcow2Plugin;
>  use PVE::Storage::NFSPlugin;
>  use PVE::Storage::CIFSPlugin;
>  use PVE::Storage::ISCSIPlugin;
> @@ -54,6 +55,7 @@ our $KNOWN_EXPORT_FORMATS = ['raw+size',
> 'tar+size', 'qcow2+size', 'vmdk+size',
>  PVE::Storage::DirPlugin->register();
>  PVE::Storage::LVMPlugin->register();
>  PVE::Storage::LvmThinPlugin->register();
> +PVE::Storage::LvmQcow2Plugin->register();
>  PVE::Storage::NFSPlugin->register();
>  PVE::Storage::CIFSPlugin->register();
>  PVE::Storage::ISCSIPlugin->register();
> diff --git a/src/PVE/Storage/LvmQcow2Plugin.pm
> b/src/PVE/Storage/LvmQcow2Plugin.pm
> new file mode 100644
> index 0000000..68c8686
> --- /dev/null
> +++ b/src/PVE/Storage/LvmQcow2Plugin.pm
> @@ -0,0 +1,460 @@
> +package PVE::Storage::LvmQcow2Plugin;
> +
> +use strict;
> +use warnings;
> +
> +use IO::File;
> +
> +use PVE::Tools qw(run_command trim);
> +use PVE::Storage::Plugin;
> +use PVE::Storage::LVMPlugin;
> +use PVE::JSONSchema qw(get_standard_option);
> +
> +use base qw(PVE::Storage::LVMPlugin);
> +
> +# Configuration
> +
> +sub type {
> +    return 'lvmqcow2';
> +}
> +
> +sub plugindata {
> +    return {
> + #container not yet implemented #need to implemented dm-qcow2
> + content => [ {images => 1, rootdir => 1}, { images => 1 }],
> +    };
> +}
> +
> +sub properties {
> +    return {
> +    };
> +}
> +
> +sub options {
> +    return {
> + vgname => { fixed => 1 },
> + nodes => { optional => 1 },
> + shared => { optional => 1 },
> + disable => { optional => 1 },
> + saferemove => { optional => 1 },
> + saferemove_throughput => { optional => 1 },
> + content => { optional => 1 },
> + base => { fixed => 1, optional => 1 },
> + tagged_only => { optional => 1 },
> + bwlimit => { optional => 1 },
> + snapext => { fixed => 1 },
> +    };
> +}
> +
> +# Storage implementation
> +
> +sub parse_volname {
> +    my ($class, $volname) = @_;
> +
> +    PVE::Storage::Plugin::parse_lvm_name($volname);
> +    my $format = $volname =~ m/^(.*)-snap-/ ? 'qcow2' : 'raw';
> +
> +    if ($volname =~ m/^((vm|base)-(\d+)-\S+)$/) {
> +        return ('images', $1, $3, undef, undef, $2 eq 'base',
> $format);
> +    }
> +
> +    die "unable to parse lvm volume name '$volname'\n";
> +}
> +
> +sub filesystem_path {
> +    my ($class, $scfg, $volname, $snapname, $current_snap) = @_;
> +
> +    my ($vtype, $name, $vmid) = $class->parse_volname($volname);
> +
> +    my $vg = $scfg->{vgname};
> +
> +    my $path = "/dev/$vg/$name";
> +
> +    if($snapname) {
> + $path = get_snap_volname($path, $snapname);
> +    } elsif ($current_snap) {
> + $path = $current_snap->{file};
> +    }
> +
> +    return wantarray ? ($path, $vmid, $vtype) : $path;
> +}
> +
> +sub create_base {
> +    my ($class, $storeid, $scfg, $volname) = @_;
> +
> +    my $vg = $scfg->{vgname};
> +
> +    my ($vtype, $name, $vmid, $basename, $basevmid, $isBase) =
> +        $class->parse_volname($volname);
> +
> +    die "create_base not possible with base image\n" if $isBase;
> +
> +    die "unable to create base volume - found snapshot" if $class-
> >snapshot_exist($scfg, $storeid, $volname);
> +
> +    my $newname = $name;
> +    $newname =~ s/^vm-/base-/;
> +
> +    my $cmd = ['/sbin/lvrename', $vg, $volname, $newname];
> +    run_command($cmd, errmsg => "lvrename '$vg/$volname' =>
> '$vg/$newname' error");
> +
> +    # set inactive, read-only flags
> +    $cmd = ['/sbin/lvchange', '-an', '-pr', "$vg/$newname"];
> +    eval { run_command($cmd); };
> +    warn $@ if $@;
> +
> +    my $newvolname = $newname;
> +
> +    return $newvolname;
> +}
> +
> +sub clone_image {
> +    my ($class, $scfg, $storeid, $volname, $vmid, $snap) = @_;
> +
> +    die "can't clone images in lvm storage\n";
> +}
> +
> +sub alloc_image {
> +    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
> +
> +    die "unsupported format '$fmt'" if $fmt ne 'raw';
> +
> +    die "illegal name '$name' - should be 'vm-$vmid-*'\n"
> + if  $name && $name !~ m/^vm-$vmid-/;
> +
> +    my $vgs = PVE::Storage::LVMPlugin::lvm_vgs();
> +
> +    my $vg = $scfg->{vgname};
> +
> +    die "no such volume group '$vg'\n" if !defined ($vgs->{$vg});
> +
> +    my $free = int($vgs->{$vg}->{free});
> +
> +    die "not enough free space ($free < $size)\n" if $free < $size;
> +
> +    $name = $class->find_free_diskname($storeid, $scfg, $vmid)
> + if !$name;
> +
> +    my $tags = ["pve-vm-$vmid"];
> +    if ($name =~ m/^(((vm|base)-(\d+)-disk-(\d+)))(-snap-(.*))?/) {
> +        push @$tags, "\@pve-$1";
> +    }
> +
> +    PVE::Storage::LVMPlugin::lvcreate($vg, $name, $size, $tags);
> +
> +    return $name;
> +}
> +
> +sub volume_snapshot_info {
> +    my ($class, $scfg, $storeid, $volname) = @_;
> +
> +    return $class->list_snapshots($scfg, $storeid, $volname);
> +}
> +
> +sub activate_volume {
> +    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
> +
> +    my $lvm_activate_mode = 'ey';
> +    my $tag = undef;
> +
> +    #activate volume && all volumes snapshots by tag
> +    if($volname =~ m/^(((vm|base)-(\d+)-disk-(\d+)))(-snap-(.*))?/)
> {
> + $tag = "\@pve-vm-$4-disk-$5";
> +    }
> +
> +    my $cmd = ['/sbin/lvchange', "-a$lvm_activate_mode", $tag];
> +    run_command($cmd, errmsg => "can't activate LV '$tag'");
> +
> +    $cmd = ['/sbin/lvchange', '--refresh', $tag];
> +    run_command($cmd, errmsg => "can't refresh LV '$tag' for
> activation");
> +}
> +
> +sub deactivate_volume {
> +    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
> +
> +    my $tag = undef;
> +    #deactivate volume && all volumes snasphots by tag
> +    if($volname =~ m/^(((vm|base)-(\d+)-disk-(\d+)))(-snap-(.*))?/)
> {
> + $tag = "\@pve-vm-$4-disk-$5";
> +    }
> +
> +    my $cmd = ['/sbin/lvchange', '-aln', $tag];
> +    run_command($cmd, errmsg => "can't deactivate LV '$tag'");
> +}
> +
> +sub volume_resize {
> +    my ($class, $scfg, $storeid, $volname, $size, $running) = @_;
> +
> +    #we should resize the base image and parents snapshots,
> +    #but how to manage rollback ?
> +
> +    die "can't resize if snasphots exist" if $class-
> >snapshot_exist($scfg, $storeid, $volname);
> +
> +    return 1;
> +}
> +
> +sub volume_snapshot {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    $class->activate_volume($storeid, $scfg, $volname, undef, {});
> +
> +    my $current_path = $class->path($scfg, $volname, $storeid);
> +    my $current_format =
> (PVE::Storage::Plugin::file_size_info($current_path))[1];
> +    my $snappath = get_snap_volname($current_path, $snap);
> +
> +    my $snapvolname = get_snap_volname($volname, $snap);
> +    #allocate lvm snapshot volume
> +    my ($vtype, $name, $vmid, $basename, $basevmid, $isBase) =
> + $class->parse_volname($volname);
> +    my $size = $class->volume_size_info($scfg, $storeid, $volname,
> 5);
> +    #add 100M for qcow2 headers
> +    $size = int($size/1024) + (100*1024);
> +
> +    $class->alloc_image($storeid, $scfg, $vmid, 'raw', $snapvolname,
> $size);
> +
> +    # create the qcow2 fs
> +    eval {
> +        my $cmd = ['/usr/bin/qemu-img', 'create', '-b',
> $current_path,
> +                   '-F', $current_format, '-f', 'qcow2', $snappath];
> +        my $options = "extended_l2=on,";
> +        $options .=
> PVE::Storage::Plugin::preallocation_cmd_option($scfg, 'qcow2');
> +        push @$cmd, '-o', $options;
> +        run_command($cmd);
> +    };
> +    if ($@) {
> + eval { $class->free_image($storeid, $scfg, $snapvolname, 0) };
> + warn $@ if $@;
> +    }
> +}
> +
> +# Asserts that a rollback to $snap on $volname is possible.
> +# If certain snapshots are preventing the rollback and $blockers is
> an array
> +# reference, the snapshot names can be pushed onto $blockers prior
> to dying.
> +sub volume_rollback_is_possible {
> +    my ($class, $scfg, $storeid, $volname, $snap, $blockers) = @_;
> +
> +    my $path = $class->filesystem_path($scfg, $volname);
> +    my $snappath = get_snap_volname($path, $snap);
> +    my $currentpath = $class->path($scfg, $volname, $storeid);
> +    return 1 if $currentpath eq $snappath;
> +
> +    die "can't rollback, '$snap' is not most recent snapshot on
> '$volname'\n";
> +
> +    return 1;
> +}
> +
> +sub volume_snapshot_rollback {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    $class->activate_volume($storeid, $scfg, $volname, undef, {});
> +    #simply delete the current snapshot and recreate it
> +
> +    my $snapvolname = get_snap_volname($volname, $snap);
> +
> +    $class->free_image($storeid, $scfg, $snapvolname, 0);
> +    $class->volume_snapshot($scfg, $storeid, $volname, $snap);
> +}
> +
> +sub volume_snapshot_delete {
> +    my ($class, $scfg, $storeid, $volname, $snap, $running) = @_;
> +
> +    return 1 if $running;
> +
> +    $class->activate_volume($storeid, $scfg, $volname, undef, {});
> +
> +    my $snapshots = $class->volume_snapshot_info($scfg, $storeid,
> $volname);
> +    my $snappath = $snapshots->{$snap}->{file};
> +    if(!$snappath) {
> + warn "$snap already deleted. skip\n";
> + return;
> +    }
> +
> +    my $snapvolname = $snapshots->{$snap}->{volname};
> +    my $parentsnap = $snapshots->{$snap}->{parent};
> +    my $childsnap = $snapshots->{$snap}->{child};
> +    die "error: can't find a parent for this snapshot" if
> !$parentsnap;
> +
> +    my $parentpath = $snapshots->{$parentsnap}->{file};
> +    my $parentformat = $snapshots->{$parentsnap}->{'format'} if
> $parentsnap;
> +    my $childpath = $snapshots->{$childsnap}->{file} if $childsnap;
> +    my $childformat = $snapshots->{$childsnap}->{'format'} if
> $childsnap;
> +
> +    print "merge snapshot $snap to $parentsnap\n";
> +    my $cmd = ['/usr/bin/qemu-img', 'commit', $snappath];
> +    run_command($cmd);
> +
> +    #if we delete an intermediate snapshot, we need to link upper
> snapshot to base snapshot
> +    if($childpath && -e $childpath) {
> + die "missing parentsnap snapshot to rebase child $childpath\n" if
> !$parentpath;
> + print "link $childsnap to $parentsnap\n";
> + $cmd = ['/usr/bin/qemu-img', 'rebase', '-u', '-b', $parentpath, '-
> F', $parentformat, '-f', $childformat, $childpath];
> + run_command($cmd);
> +    }
> +
> +    #delete the snapshot
> +    $class->free_image($storeid, $scfg, $snapvolname, 0);
> + 
> +    return;
> +}
> +
> +sub volume_has_feature {
> +    my ($class, $scfg, $feature, $storeid, $volname, $snapname,
> $running) = @_;
> +
> +    my $features = {
> + snapshot => { current => 1 },
> +# clone => { base => 1, snap => 1},   #don't allow to clone as we
> can't activate the base between different host ?
> + template => { current => 1},
> + copy => { base => 1, current => 1, snap => 1},
> + sparseinit => { base => 1, current => 1},
> + rename => {current => 1},
> +    };
> +
> +    my ($vtype, $name, $vmid, $basename, $basevmid, $isBase) =
> + $class->parse_volname($volname);
> +
> +    my $key = undef;
> +    if($snapname){
> + $key = 'snap';
> +    }else{
> + $key =  $isBase ? 'base' : 'current';
> +    }
> +    return 1 if $features->{$feature}->{$key};
> +
> +    return undef;
> +}
> +
> +sub get_snap_volname {
> +    my ($path, $snap) = @_;
> +
> +    my $basepath = "";
> +    my $baseformat = "";
> +    if ($path =~ m/^((.*)((vm|base)-(\d+)-disk-(\d+)))(-snap-([a-zA-
> Z0-9]+))?(\.(raw|qcow2))?/) {
> +        $basepath = $1;
> +        $baseformat = $8;
> +    }
> +    my $snapvolname = $basepath."-snap-$snap.qcow2";
> +    return $snapvolname;
> +}
> +
> +sub get_snapname_from_path {
> +    my ($path) = @_;
> +
> +    if ($path =~ m/^((.*)((vm|base)-(\d+)-disk-(\d+)))(-snap-([a-zA-
> Z0-9]+))?(\.(raw|qcow2))?/) {
> +        my $snapname = $7;
> +        return $snapname;
> +    }
> +    die "can't parse snapname from path $path";
> +}
> +
> +sub get_current_snapshot {
> +    my ($class, $scfg, $storeid, $volname) = @_;
> +
> +    #get more recent ctime volume
> +    return $class->list_snapshots($scfg, $storeid, $volname, 1);
> +}
> +my $check_tags = sub {
> +    my ($tags) = @_;
> +
> +    return defined($tags) && $tags =~ /(^|,)pve-vm-\d+(,|$)/;
> +};
> +
> +sub list_images {
> +    my ($class, $storeid, $scfg, $vmid, $vollist, $cache) = @_;
> +
> +    my $vgname = $scfg->{vgname};
> +
> +    $cache->{lvs} = PVE::Storage::LVMPlugin::lvm_list_volumes() if
> !$cache->{lvs};
> +
> +    my $res = [];
> +
> +    if (my $dat = $cache->{lvs}->{$vgname}) {
> +
> + foreach my $volname (keys %$dat) {
> +
> +     next if $volname !~ m/^(vm|base)-(\d+)-/;
> +     my $owner = $2;
> +
> +     my $info = $dat->{$volname};
> +
> +     next if $scfg->{tagged_only} && !&$check_tags($info->{tags});
> +
> +     # Allow mirrored and RAID LVs
> +     next if $info->{lv_type} !~ m/^[-mMrR]$/;
> +
> +     my $volid = "$storeid:$volname";
> +
> +     if ($vollist) {
> + my $found = grep { $_ eq $volid } @$vollist;
> + next if !$found;
> +     } else {
> + next if defined($vmid) && ($owner ne $vmid);
> +     }
> +
> +     push @$res, {
> + volid => $volid, format => 'raw', size => $info->{lv_size}, vmid =>
> $owner,
> + ctime => $info->{ctime},
> +     };
> + }
> +    }
> +
> +    return $res;
> +}
> +
> +sub list_snapshots {
> +    my ($class, $scfg, $storeid, $volname, $current_only) = @_;
> +
> +    my $vgname = $scfg->{vgname};
> +
> +    my $basevolname = $volname;
> +    my $lvs = PVE::Storage::LVMPlugin::lvm_list_volumes($vgname);
> +
> +    my $vg = $lvs->{$vgname};
> +
> +    my ($vtype, $name, $vmid, $basename, $basevmid, $isBase,
> $format) = $class->parse_volname($volname);
> +    my $snapshots = $class->list_images($storeid, $scfg, $vmid);
> +
> +    my $info = {};
> +    for my $snap (@$snapshots) {
> + my $snap_volid = $snap->{volid};
> + next if ($snap_volid !~ m/$basevolname/);
> +
> + my $snapname = get_snapname_from_path($snap_volid);
> + my (undef, $snap_volname) =
> PVE::Storage::parse_volume_id($snap_volid);
> + my $snapfile = $class->filesystem_path($scfg, $snap_volname,
> $snapname);
> + $snapname = 'base' if !$snapname;
> + $info->{$snapname}->{file} = $snapfile;
> + $info->{$snapname}->{volname} = $snap_volname;
> + $info->{$snapname}->{volid} = $snap_volid;
> + $info->{$snapname}->{ctime} = $snap->{ctime};
> +
> + if (!$current_only) {
> +     my (undef, $format, undef, $parentfile, undef) =
> PVE::Storage::Plugin::file_size_info($snapfile);
> +     next if !$parentfile && $snapname ne 'base';   #bad unlinked
> snasphot
> +
> +     my $parentname = get_snapname_from_path($parentfile) if
> $parentfile;
> +     $parentname = 'base' if !$parentname && $parentfile;
> +
> +     $info->{$snapname}->{'format'} = $format;
> +     $info->{$snapname}->{parent} = $parentname if $parentname;
> +     $info->{$parentname}->{child} = $snapname if $parentname;
> + }
> +    }
> +
> +    my @snapshots_sorted = sort { $info->{$b}{ctime} <=> $info-
> >{$a}{ctime} } keys %$info;
> +    my $current_snapname = $snapshots_sorted[0];
> +    my $current_snapshot = $info->{$current_snapname};
> +    return $current_snapshot if $current_only;
> +
> +    $info->{current} = { %$current_snapshot };
> +    return $info;
> +}
> +
> +sub snapshot_exist {
> +    my ($class, $scfg, $storeid, $volname) = @_;
> +
> +    my $basepath = $class->filesystem_path($scfg, $volname);
> +    my $currentpath = $class->path($scfg, $volname, $storeid);
> +
> +    die "can't resize if snasphots exist" if $currentpath ne
> $basepath;
> +
> +}
> +1;
> diff --git a/src/PVE/Storage/Makefile b/src/PVE/Storage/Makefile
> index d5cc942..1af8aab 100644
> --- a/src/PVE/Storage/Makefile
> +++ b/src/PVE/Storage/Makefile
> @@ -14,7 +14,8 @@ SOURCES= \
>   PBSPlugin.pm \
>   BTRFSPlugin.pm \
>   LvmThinPlugin.pm \
> - ESXiPlugin.pm
> + ESXiPlugin.pm \
> + LvmQcow2Plugin.pm
>  
>  .PHONY: install
>  install:
> -- 
> 2.39.2




More information about the pve-devel mailing list