[pve-devel] [RFC storage] zfspoolplugin: check if mounted instead of imported

Stoiko Ivanov s.ivanov at proxmox.com
Thu Feb 18 14:15:56 CET 2021


Thanks for looking at the patch!

On Thu, 18 Feb 2021 13:20:55 +0100
Thomas Lamprecht <t.lamprecht at proxmox.com> wrote:

> On 18.02.21 12:33, Stoiko Ivanov wrote:
> > This patch addresses an issue we recently saw on a production machine:
> > * after booting a ZFS pool failed to get imported (due to an empty
> >   /etc/zfs/zpool.cache)
> > * pvestatd/guest-startall eventually tried to import the pool
> > * the pool was imported, yet the datasets of the pool remained
> >   not mounted
> > 
> > A bit of debugging showed that `zpool import <poolname>` is not
> > atomic, in fact it does fork+exec `mount` with appropriate parameters.
> > If an import ran longer than the hardcoded timeout of 15s, it could
> > happen that the pool got imported, but the zpool command (and its
> > forks) got terminated due to timing out.
> > 
> > reproducing this is straight-forward by setting (drastic) bw+iops
> > limits on a guests' disk (which contains a zpool) - e.g.:
> > `qm set 100 -scsi1 wd:vm-100-disk-1,iops_rd=10,iops_rd_max=20,\
> > iops_wr=15,iops_wr_max=20,mbps_rd=10,mbps_rd_max=15,mbps_wr=10,mbps_wr_max=15`
> > afterwards running `timeout 15 zpool import <poolname>` resulted in
> > that situation in the guest on my machine
> > 
> > The patch changes the check in activate_storage for the ZFSPoolPlugin,
> > to check if the configured 'pool' (which can also be a sub-dataset) is
> > mounted (via `zfs get mounted <dataset>`).
> > 
> > * In case this errors out we try to import the pool (and run `zfs
> >   mount -a` if the dataset is not mounted after importing)
> > * In case it succeeds but reports the dataset as not mounted,
> >   we run `zfs mount -a`
> > 
> > The choice of running `zfs mount -a` has potential for regression, in
> > case someone manually umounts some dataset after booting the system
> > (but does not set the canmount option)  
> 
> Why not
> # zfs mount <dataset>
> ?
That would only mount the dataset and not anything beneath it, which
is different from what `zpool import` does (without the -N option).
AFAICT there is no way to recursively mount everything below a dataset
(short of manually walking through the tree)

While we mitigated those issues with container-datasets recently (by
explicitly mounting them before starting the container) - this does not
hold true for manually created datasets (e.g. for setting different
properties for certain guests, or for storing backups....)

(in case of the original reproducer, there was a PBS-datastore defined on
a sub-dataset)
> 
> > 
> > Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>
> > ---
> > * sending as RFC since I have the feeling that I miss quite a few corner-cases.
> > * an alternative to getting the mounted property via `zfs get` might be
> >   parsing /proc/mounts (but zfs get mounted does mostly just that +stating
> >   files in /sys, and 2 (potentially blocking) ioctls
> > * could reproduce the issue with ZFS 2.0.1 and 0.8.6 - so my current guess
> >   is the issue on the production box might be related to some other
> >   performance regression in its disk-subsystem (maybe introduced with the
> >   new kernel)
> > * huge thanks to Dominik for many suggestions (including the idea of
> >   limiting the disk-speed via our stack instead of dm-delay :)
> > 
> >  PVE/Storage/ZFSPoolPlugin.pm | 25 +++++++++++++++++++++----
> >  1 file changed, 21 insertions(+), 4 deletions(-)
> > 
> > diff --git a/PVE/Storage/ZFSPoolPlugin.pm b/PVE/Storage/ZFSPoolPlugin.pm
> > index 105d802..32ad1c9 100644
> > --- a/PVE/Storage/ZFSPoolPlugin.pm
> > +++ b/PVE/Storage/ZFSPoolPlugin.pm
> > @@ -525,8 +525,17 @@ sub activate_storage {
> >      my ($class, $storeid, $scfg, $cache) = @_;
> >  
> >      # Note: $scfg->{pool} can include dataset <pool>/<dataset>
> > -    my $pool = $scfg->{pool};
> > -    $pool =~ s!/.*$!!;
> > +    my $dataset = $scfg->{pool};
> > +    my $pool = ($dataset =~ s!/.*$!!r);
> > +
> > +    my $dataset_mounted = sub {
> > +	my $mounted = eval { $class->zfs_get_properties($scfg, 'mounted', "$dataset") };  
> 
> quotes don't do anything here, just gets you a "undefined value in string ..." warning
> if undefined (and underlying run_command does not change behavior either).
> 
> I.e.,
> 
> # perl -we 'use PVE::Tools qw(run_command); my $arg; run_command(["touch", "$arg"])'
> Use of uninitialized value $arg in string at -e line 1.
> touch: cannot touch '': No such file or directory
> command 'touch ''' failed: exit code 1
> 
> vs.
> 
> # perl -we 'use PVE::Tools qw(run_command); my $arg; run_command(["touch", $arg])'
> Use of uninitialized value $_[1] in exec at /usr/share/perl/5.28/IPC/Open3.pm line 178.
> touch: cannot touch '': No such file or directory
> command 'touch ''' failed: exit code 1
> 
> I like to avoid using quotes in such cases to avoid suggesting that they have
> any use.
sorry - that was a bit shoddy copying - will fix that in a v2

> 
> > +	if ($@) {
> > +	    warn "$@\n";
> > +	    return undef;
> > +	}  
> 
> I know this is just copied, but both should just use a simple
> warn "$@\n" if ($@);
> 
> as the return is correctly using a defined check anyway
sounds good

> 
> 
> > +	return defined($mounted) && $mounted =~ m/^yes$/;
> > +    };
> >  
> >      my $pool_imported = sub {
> >  	my @param = ('-o', 'name', '-H', "$pool");
> > @@ -538,13 +547,21 @@ sub activate_storage {
> >  	return defined($res) && $res =~ m/$pool/;
> >      };
> >  
> > -    if (!$pool_imported->()) {
> > +    if (!defined($dataset_mounted->())) {  
> 
> don't complicated boolean, drop the defined.
in that case the return value of $dataset_mounted->() is ternary:
* undefined - error during `zfs get` -> we most likely need to import the
  pool
* false - pool is imported, but dataset is not mounted -> we need to mount
  only (the elsif branch)
* true - everything as it should be

coming to think about it:
this would be problematic, if the storage has a 'pool', which points to a 
dataset configured with 'canmount=off' - see the description of 'canmount'
in zfsprops(8) for when this might make sense):
with the current check (if the pool is imported) this would work fine, with
the patch in place this would result in each activate_storage() to call zfs
mount -a. 
My expectation is that this does not happen in the wild (manually setting
'canmount' to off for the root-dataset of a storage) - but it is
technically possible - but maybe I'm missing the use-case for this?

> 
> > +	# error from zfs get mounted - pool import necessary
> >  	# import can only be done if not yet imported!
> >  	my @param = ('-d', '/dev/disk/by-id/', '-o', 'cachefile=none', "$pool");
> >  	eval { $class->zfs_request($scfg, undef, 'zpool_import', @param) };
> >  	if (my $err = $@) {
> >  	    # just could've raced with another import, so recheck if it is imported
> > -	    die "could not activate storage '$storeid', $@\n" if !$pool_imported->();
> > +	    die "could not activate storage '$storeid', $err\n" if !$pool_imported->();
> > +	}
> > +	if (!$dataset_mounted->()) {
> > +	    eval { $class->zfs_request($scfg, undef, 'mount', '-a') };
> > +	    if (my $err = $@) {
> > +		# just could've raced with another import, so recheck if it is imported  
> 
> but you do not (re-)check anything?
will refactor that a bit in the v2

> 
> > +		die "could not activate storage '$storeid', $err\n";
> > +	    }
> >  	}
> >      }
> >      return 1;
> >   
> 






More information about the pve-devel mailing list