[pve-devel] [PATCH storage 0/2] fix #4997: lvm: avoid autoactivating (new) LVs after boot
Fabian Grünbichler
f.gruenbichler at proxmox.com
Mon Feb 10 11:47:15 CET 2025
On February 7, 2025 1:44 pm, Friedrich Weber wrote:
> On 11/01/2024 16:03, Friedrich Weber wrote:
>> By default, LVM autoactivates LVs after boot. In a cluster with VM disks on a
>> shared LVM VG (e.g. on top of iSCSI), this can indirectly cause guest creation
>> or VM live-migration to fail. See bug #4997 [1] and patch #2 for details.
>>
>> The goal of this series is to avoid autoactivating LVs after boot. Fabian
>> suggested using the "activation skip" flag for LVs. LVs with that flag can
>> only be activated if the `-K` flag is passed during activation (`-K` is not
>> passed for autoactivation after boot).
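for reference, the activation-skip mechanism boils down to roughly the
following (untested sketch, VG/LV names are placeholders):

  # create the LV with the activation-skip flag set
  lvcreate --setactivationskip y -L 4G -n vm-100-disk-0 vmdata
  # a plain activation honors the skip flag and leaves the LV inactive
  lvchange -ay vmdata/vm-100-disk-0
  # only passing -K/--ignoreactivationskip actually activates it
  lvchange -ay -K vmdata/vm-100-disk-0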
>
> I revisited this issue now, and with some distance, I wondered why
> we should go with the quite heavy-handed approach of
>
> - (1) activating LVs with -K/--ignoreactivationskip
> - (2) starting with PVE 9, creating LVs with -k/`--setactivationskip y`
>
> If it would be possible to instead just create new LVs with
> `--setautoactivation n`, we wouldn't need to touch the activation code
> at all, and wouldn't need to worry about the mixed-version cluster
> scenario.
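for comparison, that alternative would look roughly like this (untested,
VG/LV names are placeholders, and the `autoactivation` report field
assumes a recent enough LVM):

  # create new LVs with autoactivation disabled
  lvcreate --setautoactivation n -L 4G -n vm-100-disk-0 vmdata
  # existing LVs could be switched over the same way
  lvchange --setautoactivation n vmdata/vm-100-disk-0
  # verify the flag
  lvs -o lv_name,autoactivation vmdata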
>
> For VGs on iSCSI volumes, setting `--setautoactivation n` on LVs seems
> to work fine -- this way, after iSCSI login, these LVs are not
> automatically activated. The reason is that autoactivation is done by
> udev via /lib/udev/rules.d/69-lvm.rules:
>
> IMPORT{program}="/sbin/lvm pvscan --cache --listvg --checkcomplete --vgonline --autoactivation event --udevoutput --journal=output $env{DEVNAME}"
> TEST!="/run/systemd/system", GOTO="lvm_direct_vgchange"
>
> ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="/usr/bin/systemd-run --no-block --property DefaultDependencies=no --unit lvm-activate-$env{LVM_VG_NAME_COMPLETE} /sbin/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
> GOTO="lvm_end"
>
> LABEL="lvm_direct_vgchange"
> ENV{LVM_VG_NAME_COMPLETE}=="?*", RUN+="/sbin/lvm vgchange -aay --autoactivation event $env{LVM_VG_NAME_COMPLETE}"
> GOTO="lvm_end"
>
> In both branches vgchange is called with `-aay` where `ay` specifies
> autoactivation, which according to `man vgchange` doesn't activate LVs
> with `--setautoactivation n`.
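i.e. the distinction boils down to (sketch, "vmdata" is a placeholder VG
name):

  # autoactivation: skips LVs carrying --setautoactivation n
  vgchange -aay vmdata
  # plain activation: brings them up regardless of that flag
  vgchange -ay vmdata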
>
> However, for VGs on FC/SAS+multipath disks (these are visible at boot
> time), even an LV with `--setautoactivation n` is active after boot.
> I was curious why, and I *think* (see [2] for info on how to debug this)
> it's because they are already activated in the initramfs (because they
> are visible at boot time):
>
> - 69-lvm.rules does in fact call vgchange with `-aay`, and does not
> activate LVs with `--setautoactivation n`, as expected
>
> - but zfs-initramfs also installs a script
> /usr/share/initramfs-tools/scripts/local-top/zfs [1] to initramfs which
> calls:
>
> /sbin/lvm vgscan
> /sbin/lvm vgchange -a y --sysinit
>
> Note that it calls `vgchange -ay` (instead of `vgchange -aay`) so LVM
> doesn't consider this autoactivation and activates all LVs regardless of
> their `--setautoactivation` flag. If I edit the script to use `-aay`
> instead, the LV with the `--setautoactivation n` flag is inactive after
> boot, as expected.
thanks for the detailed analysis/follow-up!
>
> So I'm wondering:
>
> (a) could the ZFS initramfs script use `-aay` instead of `-ay`, so the
> `--setautoactivation` flag has an effect again for LVs that are visible
> at boot?
probably ;)
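the change itself would presumably just be a one-liner in the zfs
initramfs hook (untested, and a local edit like this gets overwritten on
the next zfs-initramfs upgrade, so it would really have to go upstream):

  # switch the hook from plain activation to autoactivation
  sed -i 's/vgchange -a y --sysinit/vgchange -aay --sysinit/' \
      /usr/share/initramfs-tools/scripts/local-top/zfs
  update-initramfs -u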
>
> (b) assuming (a) can be fixed, is there any additional reason to prefer
> the --ignoreactivationskip/--setactivationskip approach over the
> `--setautoactivation n` approach?
the only other reason I could think of is that there's a metric ton of
tutorials/howtos for various things that contain "vgchange -ay" which
would still entail potential breakage.. but then again, one could argue
that people blindly following such instructions are to blame, and the
reduced complexity on our end would be worth it..
>
> [1]
> https://github.com/openzfs/zfs/blob/c2458ba921a8d59cd6b705f81b37f50531f8670b/contrib/initramfs/scripts/local-top/zfs
>
> [2] FWIW, one way to debug this:
>
> - attaching a serial console to the VM
> - removing `quiet` and adding `console=ttyS0` to the kernel command line
> - setting `log_level=debug` in
> /usr/share/initramfs-tools/scripts/init-top/udev
> - adding some debug printouts (echo XYZ > /dev/ttyS0) to
> /usr/share/initramfs-tools/scripts/local-top/zfs
> - don't forget `update-initramfs -u`
> - attach to serial console on host: `qm console VMID`
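for the first two steps, something like the following should do (VMID is
a placeholder):

  # on the host: give the VM a serial port
  qm set VMID -serial0 socket
  # in the guest: drop `quiet` and add console=ttyS0 to
  # GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then run:
  update-grub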
>
>