[pve-devel] [PATCH storage 0/2] fix #4997: lvm: avoid autoactivating (new) LVs after boot
Friedrich Weber
f.weber at proxmox.com
Fri Feb 7 13:44:39 CET 2025
On 11/01/2024 16:03, Friedrich Weber wrote:
> By default, LVM autoactivates LVs after boot. In a cluster with VM disks on a
> shared LVM VG (e.g. on top of iSCSI), this can indirectly cause guest creation
> or VM live-migration to fail. See bug #4997 [1] and patch #2 for details.
>
> The goal of this series is to avoid autoactivating LVs after boot. Fabian
> suggested using the "activation skip" flag for LVs. LVs with that flag can
> only be activated if the `-K` flag is passed during activation (`-K` is not
> passed for autoactivation after boot).
I revisited this issue now and, with some distance, wondered why we
should go with the quite heavy-handed approach of
- (1) activating LVs with `-K`/`--ignoreactivationskip`
- (2) starting with PVE 9, creating LVs with `-k`/`--setactivationskip y`
as sketched below:
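# rough sketch of (1)+(2); VG/LV names are placeholders
lvcreate --setactivationskip y -L 32G -n vm-100-disk-0 sharedvg
# every activation code path then has to override the skip flag:
lvchange -ay -K sharedvg/vm-100-disk-0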
If it were possible to instead just create new LVs with
`--setautoactivation n`, we wouldn't need to touch the activation code
at all, and wouldn't have to worry about the mixed-version cluster
scenario.
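A minimal sketch of that alternative (placeholder names;
`--setautoactivation` requires a reasonably recent lvm2, and the lvs
field name is from memory):

# create new LVs with autoactivation disabled
lvcreate --setautoactivation n -L 32G -n vm-100-disk-0 sharedvg
# existing LVs could be converted in place
lvchange --setautoactivation n sharedvg/vm-100-disk-0
# verify:
lvs -o name,autoactivation sharedvg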
For VGs on iSCSI volumes, setting `--setautoactivation n` on LVs seems
to work fine -- this way, after iSCSI login, these volumes are not
automatically activated. The reason is that autoactivation is done by
udev via /lib/udev/rules.d/69-lvm.rules:
IMPORT{program}="/sbin/lvm pvscan --cache --listvg --checkcomplete --vgonline --autoactivation event --udevoutput --journal=output $env{DEVNAME}"
TEST!="/run/systemd/system", GOTO="lvm_direct_vgchange"
ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="/usr/bin/systemd-run --no-block --property DefaultDependencies=no --unit lvm-activate-$env{LVM_VG_NAME_COMPLETE} /sbin/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
GOTO="lvm_end"
LABEL="lvm_direct_vgchange"
ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="/sbin/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
GOTO="lvm_end"
In both branches, vgchange is called with `-aay`, where `ay` requests
autoactivation, which according to `man vgchange` does not activate LVs
that have `--setautoactivation n` set.
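This distinction is easy to reproduce by hand on a VG with such an LV
(placeholder name):

# plain activation: ignores the autoactivation property, activates all LVs
vgchange -ay sharedvg
# autoactivation: leaves LVs with `--setautoactivation n` inactive
vgchange -aay sharedvg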
However, for VGs on FC/SAS+multipath disks (these are visible at boot
time), even an LV with `--setautoactivation n` is active after boot.
I was curious why, and I *think* (see [2] for how to debug this) it's
because they are already activated in the initramfs (since they are
visible at boot time):
- 69-lvm.rules does in fact call vgchange with `-aay` and, as expected,
does not activate LVs with `--setautoactivation n`
- but zfs-initramfs also installs a script
/usr/share/initramfs-tools/scripts/local-top/zfs [1] into the initramfs,
which calls:
/sbin/lvm vgscan
/sbin/lvm vgchange -a y --sysinit
Note that it calls `vgchange -ay` (instead of `vgchange -aay`), so LVM
doesn't consider this autoactivation and activates all LVs regardless of
their `--setautoactivation` flag. If I edit the script to use `-aay`
instead, the LV with the `--setautoactivation n` flag is inactive after
boot, as expected.
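Concretely, the edit I tested amounts to the following one-line change
to the script from [1] (followed by `update-initramfs -u` to rebuild):

-/sbin/lvm vgchange -a y --sysinit
+/sbin/lvm vgchange -aay --sysinit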
So I'm wondering:
(a) could the ZFS initramfs script use `-aay` instead of `-ay`, so the
`--setautoactivation` flag has an effect again for LVs that are visible
at boot?
(b) assuming (a) can be fixed, is there any additional reason to prefer
the --ignoreactivationskip/--setactivationskip approach over the
`--setautoactivation n` approach?
[1]
https://github.com/openzfs/zfs/blob/c2458ba921a8d59cd6b705f81b37f50531f8670b/contrib/initramfs/scripts/local-top/zfs
[2] FWIW, one way to debug this:
- attach a serial console to the VM
- remove `quiet` from and add `console=ttyS0` to the kernel command line
- set `log_level=debug` in
/usr/share/initramfs-tools/scripts/init-top/udev
- add some debug printouts (echo XYZ > /dev/ttyS0) to
/usr/share/initramfs-tools/scripts/local-top/zfs
- don't forget `update-initramfs -u`
- attach to the serial console on the host: `qm console VMID`
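For the first two items, in case someone wants to reproduce this
(assuming a GRUB-booted test VM; VMID is a placeholder):

# on the host: give the VM a serial port
qm set VMID -serial0 socket
# in the guest, e.g. via /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0"
# then update-grub and reboot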