[pve-devel] [PATCH pve-kernel-meta 3/5] proxmox-boot: fix #3671 add pin/unpin for kernel-version
Fabian Grünbichler
f.gruenbichler at proxmox.com
Tue Feb 1 12:35:09 CET 2022
On January 31, 2022 6:59 pm, Stoiko Ivanov wrote:
> The 2 commands follow the mechanics of p-b-t kernel add/remove in
> writing the desired abi-version to a config-file in /etc/kernel and
> actually modifying the boot-loader configuration upon p-b-t refresh.
>
> A dedicated new file is used instead of writing the version (with some
> kind of annotation) to the manual kernel list to keep parsing the file
> simple (and hopefully also cause fewer problems with manually edited
> files)
one thing I noticed while playing around - the following sequence of
actions is a bit surprising:
- pin (old) version FOO
- refresh
- ... (long time, different admin, ..)
- apt remove pve-kernel-$FOO
while this prints
No linux-image /boot/vmlinuz-$FOO found - skipping
this is kind of hard to understand without knowing about p-b-t internals:
skipping here means we don't copy the kernel/initrd from /boot to the
ESP (since there is no source). now the $FOO kernel (and initrd) are on
the ESPs, but not in /boot. since the package is no longer installed,
future ABI-compatible upgrades are not installed, and the initrd is
never regenerated when triggered by other factors.
worse, if I pinned that kernel for important reasons (e.g., HW-compat),
removing the pin (via unpin, pinning another version, or next-boot to
try whether an updated kernel improves the situation!) will remove the
only copy of it..
I am not sure what we can do here (except making the message more
prominent?) - failing apt is ugly, removing the kernel on the ESP when
removing it from /boot despite it being pinned only makes it worse..
OTOH since a pinned kernel is by definition never auto-removed, hooking
into the APT hook might work since that would mean the removal is never
started, and the resulting dpkg/apt state is clean? obviously only
possible for our kernels where we know the naming scheme, anything
custom could still run into the issue..
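to make the APT hook idea a bit more concrete, here is a rough sketch of
the check such a hook could run. this is untested against the real hook:
the helper name and the way the hook gets the list of packages to remove
are assumptions, only the pve-kernel-$FOO naming is taken from the
example above, and the version strings in the usage note are just examples.

```shell
#!/bin/sh
# Hypothetical helper (not part of p-b-t): succeed (exit 0) if one of the
# packages scheduled for removal is the pve-kernel package of the pinned
# version.
# $1 = pinned kernel version (empty if nothing is pinned)
# remaining arguments = names of the packages about to be removed
removal_hits_pin() {
    pinned="$1"
    shift
    # No pin set: nothing to protect.
    [ -n "$pinned" ] || return 1
    for pkg in "$@"; do
        if [ "$pkg" = "pve-kernel-$pinned" ]; then
            return 0
        fi
    done
    return 1
}
```

a hook could then abort before dpkg starts, e.g.
`removal_hits_pin 5.13.19-3-pve pve-kernel-5.13.19-3-pve && exit 1`,
which only covers kernels following our naming scheme, as noted above.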
> For systemd-boot we write the entry into the loader.conf on the ESP(s)
> instead of relying on the `bootctl set-default` mechanics (bootctl(1))
> which write the entry in an EFI-var. This was preferred, because of a
> few reports of unwriteable EFI-vars on some systems (e.g. DELL servers
> have a setting preventing writing EFI-vars from the OS). The rationale
> in `Why not simply rely on the EFI boot menu logic?` from [0] also
> makes a few points in that direction.
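for reference, a minimal sketch of what "write the entry into the
loader.conf" could look like. the function name is made up and the entry
id passed in is a placeholder; I am only assuming the documented
`default` setting of loader.conf here, not the exact code in the patch.

```shell
#!/bin/sh
# Hypothetical helper: replace the "default" setting in a loader.conf.
# $1 = path to loader.conf on a mounted ESP
# $2 = entry id (or glob pattern) that should become the default
set_loader_default() {
    conf="$1"
    entry="$2"
    tmp="${conf}.tmp"
    if [ -f "$conf" ]; then
        # Keep every line except an existing "default" setting.
        # grep exits non-zero when nothing is left, which is fine here.
        grep -v '^default[[:space:]]' "$conf" > "$tmp" || true
    else
        : > "$tmp"
    fi
    printf 'default %s\n' "$entry" >> "$tmp"
    mv "$tmp" "$conf"
}
```

this would have to be repeated for every ESP, since each one carries
its own loader.conf.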
>
> For grub the following choices were made:
> * write the pinned version (or actually the menu-path leading to it)
> to /etc/default/grub instead of editing the grub.cfg files on the
> partition. Mostly to deviate as little as possible from the
> grub-workflow I assume people are used to.
did you test whether adding a snippet overriding GRUB_DEFAULT also
works? we already do that to set the distributor for the various
products.. creating/deleting a
/etc/default/grub.d/y_proxmox_pinned_kernel.cfg
and (if we want to keep next-boot separate from pinning, see other
mail)
/etc/default/grub.d/z_proxmox_next_boot.cfg
seems like the cleaner approach compared to modifying the admin-managed
/etc/default/grub ..
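roughly what I have in mind, as an untested sketch. the function names
are made up, and the menu path format assumes the entry ids generated by
Debian's 10_linux (gnulinux-advanced-<id> for the submenu,
gnulinux-<version>-advanced-<id> for the entry), so that part needs
checking against a real grub.cfg:

```shell
#!/bin/sh
# Sketch: write a GRUB_DEFAULT override for a pinned kernel.
# $1 = snippet directory (/etc/default/grub.d on a real system)
# $2 = root-device-id as found in grub.cfg
# $3 = pinned kernel version
# ASSUMPTION: the menu path follows the ids generated by Debian's 10_linux.
write_pin_snippet() {
    dir="$1"
    root_id="$2"
    version="$3"
    printf 'GRUB_DEFAULT="gnulinux-advanced-%s>gnulinux-%s-advanced-%s"\n' \
        "$root_id" "$version" "$root_id" \
        > "$dir/y_proxmox_pinned_kernel.cfg"
}

# Unpinning is then just removing the snippet again.
remove_pin_snippet() {
    rm -f "$1/y_proxmox_pinned_kernel.cfg"
}
```

since the snippets are sourced in glob order after /etc/default/grub,
a z_ file would win over the y_ one, and /etc/default/grub itself stays
untouched. a refresh is still needed afterwards for the change to reach
the generated configuration.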
> * the 'root-device-id' part of the menu-entries is parsed from
> /boot/grub/grub.cfg since it was stable (the same on all ESPs and in
> /boot/grub), saves us from copying the "find device behind
> /, mangle it if zfs/btrfs, call grub_probe a few times" part of
> grub-mkconfig - and seems a bit more robust
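as an illustration of that parsing step (not the code from the patch):
a sketch that pulls the id out of the first 'gnulinux-simple-<id>' menu
entry. the function name and the assumption that the id appears in that
form are mine.

```shell
#!/bin/sh
# Sketch: print the root-device-id of the first "simple" menu entry.
# $1 = path to a grub.cfg
# ASSUMPTION: the entry id has the form 'gnulinux-simple-<root-device-id>'.
get_root_device_id() {
    sed -n "s/.*'gnulinux-simple-\([^']*\)'.*/\1/p" "$1" | head -n 1
}
```

called as `get_root_device_id /boot/grub/grub.cfg`, this prints nothing
if no such entry exists, so a caller would need to handle an empty
result.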
>
> Tested with a BIOS and an UEFI VM with / on ZFS.
>
> [0] https://systemd.io/BOOT_LOADER_SPECIFICATION/
>
> Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>