[pve-devel] [PATCH storage] ZFS: mount subvols in activate_volume

Wolfgang Bumiller w.bumiller at proxmox.com
Wed Apr 22 13:26:25 CEST 2020


On Wed, Apr 22, 2020 at 11:28:36AM +0200, Thomas Lamprecht wrote:
> On 4/22/20 10:30 AM, Fabian Ebner wrote:
> > Turns out that this alone doesn't make starting containers with an unmounted zfs subvolume working. But "pct clone" and "pct mount" can still benefit from this patch. The problem for starting a container is that when we call PVE::Storage::activate_volumes, it happens inside the lxc-pve-prestart-hook where the kernel/apparmor will block the "zfs mount" operation. And this seems to be the case even for a privileged container and/or adding "lxc.apparmor.profile: unconfined" (I guess because the relevant apparmor profile is always "lxc-start" which is not affected by those changes?)
> > 
> 
> This sounds wrong, Wolfgang can you check this out?

lxc-start and any hooks run by it are restricted to mounting only within the
paths lxc uses for setting up the container. Particularly it uses these
rules:

    mount -> /usr/lib*/*/lxc/{**,},
    mount -> /usr/lib*/lxc/{**,},
    mount -> /usr/lib/x86_64-linux-gnu/lxc/rootfs/{,**},

(There are more mount rules, but they're for special cases.)

The only thing we *can* do is check whether there's a zfs mount point
there before bind-mounting and erroring out if that's not the case.

(My personal favorite would've been using `mountpoint=legacy` for
container subvols, though, which would be mounted only to the
destination.)

With this patch we might at least get better errors than without?

> > Calling activate_volumes directly in vm_start would work around this,
> > but if we remove it from lxc-pve-prestart-hook then it can break things for somebody using "systemctl start pve-container at ID" directly (depending on the storage) and if we don't remove it, it's code/work duplication. Is modifying the lxc-start profile viable?
> > 
> > On 21.04.20 14:00, Fabian Ebner wrote:
> >> This makes containers work even if the subvolumes are not already
> >> mounted beforehand for some reason. Without this patch, container
> >> could quietly fail for e.g. start and full clone, because
> >> bind-mounting still "worked" on the empty directory.
> >>
> >> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
> >> ---
> >>
> >> This can be seen as an alternative to [0], which hasn't been applied
> >> AFAICT. And this approach even works when the subvolume is not mounted
> >> for a different reason than the one described in [0]. For example, I
> >> ran into the problem by having a non-empty '/myzpool', causing 'myzpool'
> >> to be not mountable, but 'myzpool/subvol-123-disk-0' would still be mountable.
> >>
> >> [0]: https://pve.proxmox.com/pipermail/pve-devel/2020-March/042089.html
> >>
> >>   PVE/Storage/ZFSPoolPlugin.pm | 11 ++++++++---
> >>   1 file changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/PVE/Storage/ZFSPoolPlugin.pm b/PVE/Storage/ZFSPoolPlugin.pm
> >> index 10354b3..4a85732 100644
> >> --- a/PVE/Storage/ZFSPoolPlugin.pm
> >> +++ b/PVE/Storage/ZFSPoolPlugin.pm
> >> @@ -546,9 +546,14 @@ sub activate_volume {
> >>         my (undef, undef, undef, undef, undef, undef, $format) = $class->parse_volname($volname);
> >>   -    return 1 if $format ne 'raw';
> >> -
> >> -    $class->zfs_wait_for_zvol_link($scfg, $volname);
> >> +    if ($format eq 'raw') {
> >> +    $class->zfs_wait_for_zvol_link($scfg, $volname);
> >> +    } elsif ($format eq 'subvol') {
> >> +    my $mounted = $class->zfs_get_properties($scfg, 'mounted', "$scfg->{pool}/$volname");
> >> +    if ($mounted !~ m/^yes$/) {
> >> +        $class->zfs_request($scfg, undef, 'mount', "$scfg->{pool}/$volname");
> >> +    }
> >> +    }
> >>         return 1;
> >>   }
> >>
> > 
> > _______________________________________________
> > pve-devel mailing list
> > pve-devel at pve.proxmox.com
> > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> > 
> 
> 




More information about the pve-devel mailing list