[pve-devel] [PATCH pve-storage 0/5] v5 fix regression and indentation

Fabian Grünbichler <f.gruenbichler at proxmox.com>
Mon Jun 19 10:38:19 CEST 2017


On Fri, Jun 16, 2017 at 03:42:07PM +0200, datanom.net wrote:
> Hi Fabian,
> 
> Most of your input I will study this weekend, but I can give quick answers
> to a few of your comments below.
> 
> On 2017-06-16 14:21, Fabian Grünbichler wrote:
> > >         next unless $zvol->{name} =~ /^(base|vm)-(\d+)-disk-\d+$/;
> > >         $vmid = $2;
> > >         $parent = undef;
> > >         foreach my $snap (@$snapshots) {
> > >             next unless $snap->{name} eq "__base__$vmid";
> > 
> > this seems a very strange way to handle this.. is it not possible to get
> > the origin of a dataset/zvol using the API? creating an artificial extra
> > snapshot per VM for this seems very workaround-ish
> 
> Unfortunately not. This was the best way I could find to remedy this.



> 
> > >             $parent = $snap->{filesystem} =~ /^$scfg->{pool}\/(.+)$/
> > > ? $1 : undef;
> > >         }
> > >         $list->{$scfg->{pool}}->{$zvol->{name}} = {
> > >             name => $zvol->{name},
> > >             size => $zvol->{volsize},
> > >             parent => $parent,
> > >             vmid => $vmid,
> > >             format => 'raw',
> > >         };
> > >         if ($zvol->{name} =~ /^base-(.*)/) {
> > >             $hide->{"vm-$1"} = 1;
> > >         }
> > 
> > shouldn't you hide those vm-XXX-foo zvols for which a base-XXX-foo clone
> > exists? instead of hiding the clones?
> I am hiding the vm-XXX-foo zvols? ( $hide->{"vm-$1"} )
> and
> > >     delete @{$list->{$scfg->{pool}}}{keys %$hide};
> > 

sorry, misread - you are hiding the correct ones! I guess that's what
you get for reading too much code in one go on a Friday.

> 
> > again, no way to limit by target(-name) on the server side or get one
> > directly?
> 
> Unfortunately not.

really unfortunate. does the upcoming 11.x API support this?

> 
> > 
> > both branches do the same thing..
> > 
> For now yes, but the coming API for 11.x is vastly expanded, so this is just
> preparation.
> The same reply applies to your comments on the same pattern in other methods.
> 
> > >     foreach my $tgroup (@$targetgroups) {
> > >         if ($tgroup->{iscsi_target} == $target &&
> > >             $tgroup->{iscsi_target_portalgroup} ==
> > > $scfg->{portal_group} &&
> > >             $tgroup->{iscsi_target_initiatorgroup} ==
> > > $scfg->{initiator_group}) {
> > 
> > is there no way to limit by any of this on the server side?
> > 
> 
> No, unfortunately not.
> 
> > >     foreach my $ext (@$extents) {
> > >         if ($ext->{iscsi_target_extent_path} =~
> > > /$scfg->{pool}\/$volname$/) {
> > 
> > no way to limit for this on the server side?
> > 
> 
> Unfortunately not.
> 
> > >     foreach my $t2ext (@$t2extents) {
> > >         if ($t2ext->{iscsi_target} == $target &&
> > > $t2ext->{iscsi_extent} == $extent) {
> > 
> > server side limits?
> > 
> 
> Unfortunately not.
> 
> > 
> > this check for the max_luns limit does not account for volumes with
> > custom names (which are allowed in PVE, and also here in alloc_image).
> > 
> 
> Custom names? I thought the disk naming scheme is vm-{id}-disk-{#}

for disks generated by PVE, yes. but most (all?) storage plugins also
allow the more general vm-{id}-{whatever}:

# pvesm alloc somestorage 100 disk-foo 1G
illegal name 'disk-foo' - sould be 'vm-100-*'
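so any accounting against the max_luns limit should accept the general form, not only disk-{n}. A hypothetical helper (name and shape are mine, not from the patch) sketching such a check:

```perl
use strict;
use warnings;

# Sketch: PVE allows arbitrary suffixes after vm-{vmid}- (and
# base-{vmid}-), not only disk-{n}. A helper that counts a VM's
# volumes for the max_luns limit should match the general form.
sub is_vm_volume {
    my ($name, $vmid) = @_;
    return $name =~ /^(?:base|vm)-\Q$vmid\E-\S+$/ ? 1 : 0;
}
```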

> 
> > > 
> > >         foreach my $t2extent (@$t2extents) {
> > >             next unless $t2extent->{iscsi_target} == $target &&
> > >                         $t2extent->{iscsi_lunid} + 1 > $max_luns &&
> > >                         $t2extent->{iscsi_lunid} < $max_luns +
> > > $active_snaps;
> > 
> > given that $max_luns is 20, and $active_snaps is 4, what is this
> > supposed to achieve? this seems to always "next", so
> > 
> 
> No, you only get $max_luns <= $lun < $max_luns + $active_snaps

ah, now I get it. please correct me if I misunderstood something
(again):

- get all targettoextent mappings and loop over them
- only look at those with the desired target and with a lunid that is in
  the "special range" reserved for snapshots and state volumes
- if the volume has an extent, check if it matches the one of the
  targettoextent mapping
- else, proceed to the next potentially still free slot and repeat

I see two remaining issues:
- you rely on an implicit sorting of t2extents by lunid
- you always iterate over the whole thing (except for the "exist" case)

I'd suggest dropping the id++, and instead marking which of the
potential slots you have already seen (or how many, since LUNs should
not be duplicated in the result anyway I guess?), and moving the check
and abort into the foreach.
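
The suggestion could be sketched roughly like this (names follow the patch, but this is untested against the real plugin, and the shapes of the FreeNAS API results are assumed):

```perl
use strict;
use warnings;

# Sketch: find the first free LUN slot in the snapshot range
# [$max_luns, $max_luns + $active_snaps) without relying on the
# ordering of @$t2extents, marking seen slots and moving the
# "max snapshots" check into the loop.
sub find_free_snap_lun {
    my ($t2extents, $target, $max_luns, $active_snaps) = @_;

    my %seen; # lunids already taken in the snapshot range
    foreach my $t2e (@$t2extents) {
        next unless $t2e->{iscsi_target} == $target;
        my $lun = $t2e->{iscsi_lunid};
        next unless $lun >= $max_luns && $lun < $max_luns + $active_snaps;
        $seen{$lun} = 1;
        # abort as soon as the whole range is known to be taken
        die "Max snapshots ($active_snaps) is reached\n"
            if scalar(keys %seen) == $active_snaps;
    }

    foreach my $lun ($max_luns .. $max_luns + $active_snaps - 1) {
        return $lun if !$seen{$lun};
    }
    die "Max snapshots ($active_snaps) is reached\n";
}
```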

> 
> > >             my $eid = freenas_get_extent($scfg, $volname);
> > >             if ($eid) {
> > >                 $response = freenas_request($scfg, 'GET',
> > > "services/iscsi/extent/$eid");
> > >                 die HTTP::Status::status_message($response) if
> > > $response =~ /^\d+$/;
> > >                 my $extent = decode_json($response);
> > >                 # Request to get lunid for an existing lun
> > >                 last if $t2extent->{iscsi_extent} eq $eid;
> > >             }
> > >             $id++;
> > >         }
> > >         die "Max snapshots (4) is reached" unless ($id - $max_luns)
> > > < $active_snaps;
> > 
> > $id is still $max_luns here, so this can never trigger. also, hard coded
> > value for active_snaps.
> > 
> 
> See comment above.
> 
> > >         $lunid = $id;
> > 
> > and this just sets $lunid to $max_luns?
> > 
> 
> See comment above.
> 
> > >     } elsif ($volname =~ /^(vm|base)-\d+-disk-\d+\@vzdump$/) {
> > >         # Required to be able to exposed read-only LUNs for snapshot
> > > backup CT
> > 
> > why limit this to @vzdump? for example,  your plugin could support full
> > cloning from snapshots as well, couldn't it?
> > 
> 
> This is to handle the special case for snapshot backup of LXC. These
> snapshots are always named vzdump.

I know that ;) I wanted to know whether there is a reason not to allow
cloning from snapshots as well (which would generalize this to arbitrary
snapshots). At least the full clone case ("copy" feature in PVE::Storage
terminology) shouldn't be a problem.

> 
> > >         $lunid = $max_luns + $active_snaps;
> > >     }
> > 
> > what about custom volume names? vm-VMID-foo is allowed!
> > 
> 
> See comment before.
> 
> > > 
> > >     # abort rollback if snapshot is not the latest
> > >     my $response = freenas_request($scfg, 'GET', "storage/snapshot");
> > 
> > why no limit here, but in freenas_list_zvol there is one? ;) also,
> > filtering by $volname on the server side would be great if possible.
> > 
> 
> This is how the API is ;-( and no server side filtering what so ever.

yes, but why do you GET "storage/snapshot?limit=$limit" in
freenas_list_zvol, but GET "storage/snapshot" without a limit here? it's
the same API endpoint, but inconsistently used.

> 
> > 
> > like previously said, it would probably be better to store this in
> > /etc/pve/priv and not expose it over the API.
> > 
> How is that supposed to happen? You would also need to change the GUI code
> for configuring storages.

no, it would just require manual setup, like for other storage
types:
- LVM, LVMThin, local ZFS all require local setup before adding the
  storage
- Ceph requires at least putting the keyring into place
- probably others I missed?


> 
> > >         blocksize => { optional => 1 },
> > 
> > not used anywhere?
> 
> Next version of the API.
> 
> > > sub parse_volname {
> > 
> > it would be great if some of the REs in the helper methods could be
> > replaced with calls to parse_volname. having such things encoded in one
> > place makes them easier to check and change.
> > 
> 
> You lost me here ;-)

you use regular expressions like /^(base|vm)-(\d+)-disk-\d+$/ in
your code, where you just want to extract the VMID or similar. IMHO
those should use parse_volname, so that you only have one place where
you need to make sure that the RE is correct, and where you need to
change it if it needs changing.

I see:
- /^(base|vm)-(\d+)-disk-\d+$/ in freenas_list_zvols, only $2 is used
- /^(vm|base)-(\d+)-/ in freenas_get_target_name, only $2 is used
- /(vm|base)-$vmid-disk-(\d+)/ in freenas_find_free_diskname, only $2 is
  used (but this might be left as is, it is a special case and not
  returned by parse_volname anyway)
- /^(vm|base)-\d+-disk-(\d+)$/ in freenas_get_lun_number, only $2 is used
  (not returned by parse_volname)

at least the former two should be updated, the latter two can stay as is
I guess.
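
Concretely, with the RE kept only in parse_volname, the helpers reduce to something like this (a sketch: the return signature follows the usual PVE::Storage::Plugin convention and may differ from the plugin's actual one):

```perl
use strict;
use warnings;

# Minimal stand-in for the plugin's parse_volname, kept in ONE place
# so the volume-name RE only ever needs changing here. Accepted forms
# follow the general PVE convention vm-{vmid}-{name} / base-{vmid}-{name}.
sub parse_volname {
    my ($volname) = @_;
    if ($volname =~ /^((base|vm)-(\d+)-\S+)$/) {
        my ($name, $prefix, $vmid) = ($1, $2, $3);
        my $isbase = $prefix eq 'base' ? 1 : 0;
        return ('images', $name, $vmid, $isbase);
    }
    die "unable to parse volume name '$volname'\n";
}

# Helpers like freenas_get_target_name then reuse it instead of
# carrying their own copy of the RE:
sub vmid_of {
    my ($volname) = @_;
    my (undef, undef, $vmid) = parse_volname($volname);
    return $vmid;
}
```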

> 
> > >         if (@$children) {
> > >             $used = $children->[0]{used};
> > >             $total = $children->[0]{avail};
> > 
> > this might warrant an explanatory comment - is this a workaround for
> > some API quirk?
> 
> I will provide inline comment in the code. Reason is FreeNAS/API.
> A pool is created in a dataset under RPOOL which means:
> If the pool is empty (no zvols) the size is read from the dataset;
> if the pool contains children (zvols > 0) the size is read as the sum of the children.

so the first child is a meta-child that represents the sum of all
children? the more I know about this API the less I like it ;)

> 
> > > 
> > >             if ($vollist) {
> > >                 my $found = grep { $_ eq $info->{volid} } @$vollist;
> > >                 next if !$found;
> > >             } else {
> > >                 next if defined ($vmid) && ($owner ne $vmid);
> > >             }
> > 
> > I wonder if those two are really exclusive?
> > 
> 
> Copy/pasted from the zfspool plugin
> 
> > > 
> > >     my $data = {
> > >         force => bless( do{\(my $o = 0)}, 'JSON::XS::Boolean' ),
> > 
> > you mean JSON::false ?
> > 
> 
> Yes. I know it looks strange, but a plain force => 0 does not translate to force => false ;-)

just to make sure there is no misunderstanding here - the JSON module
provides ::true and ::false, so you don't need to bless anything
yourself:

$ perl -e '
use strict;
use warnings;
use JSON;
my $one = { force => bless( do{\(my $o = 0)}, "JSON::XS::Boolean") };
my $two = { force => JSON::false };
print JSON::encode_json($one), "\n";
print JSON::encode_json($two), "\n";
'
{"force":false}
{"force":false}

> 
> > >             $active_luns->{$lun} = "0:0:0:$lun";
> > 
> > where does this hard-coded 0:0:0: come from? wouldn't it be better to
> > call get_active_luns again, or to let freenas_create_lun return some of
> > the info it has anyway?
> 
> Just placeholders. The important part is that $active_luns->{$lun} is not
> 0|null|undef
> 
> > 
> > nope! this might be fine for debugging, but you need to use udevadm
> > to trigger and settle if you want to wait for device nodes to appear.
> 
> How do I interact with udevadm from Perl?

see alloc_image in ZFSPoolPlugin.pm
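
for reference, the pattern there is roughly the following (a sketch from memory; verify the exact flags against the actual ZFSPoolPlugin.pm source):

```perl
use strict;
use warnings;

# Sketch of the udev dance alloc_image performs before returning:
# trigger block-device events, then settle and wait for the device
# node to appear (instead of sleeping and hoping).
sub udev_wait_cmds {
    my ($devpath, $timeout) = @_;
    return (
        ['udevadm', 'trigger', '--subsystem-match', 'block'],
        ['udevadm', 'settle', '--timeout', $timeout,
         '--exit-if-exists', $devpath],
    );
}

# In a plugin this would be executed with PVE::Tools::run_command:
#   run_command($_) for udev_wait_cmds("/dev/zvol/$pool/$name", 10);
```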



