[pve-devel] [PATCH pve-kernel-meta 3/5] proxmox-boot: fix #3671 add pin/unpin for kernel-version

Fabian Grünbichler f.gruenbichler at proxmox.com
Tue Feb 1 12:35:09 CET 2022


On January 31, 2022 6:59 pm, Stoiko Ivanov wrote:
> The 2 commands follow the mechanics of p-b-t kernel add/remove in
> writing the desired abi-version to a config-file in /etc/kernel and
> actually modifying the boot-loader configuration upon p-b-t refresh.
> 
> A dedicated new file is used instead of writing the version (with some
> kind of annotation) to the manual kernel list to keep parsing the file
> simple (and hopefully also cause fewer problems with manually edited
> files)

one thing I noticed while playing around - the following sequence of 
actions is a bit surprising:

- pin (old) version FOO
- refresh
- ... (long time, different admin, ..)
- apt remove pve-kernel-$FOO

while this prints

 No linux-image /boot/vmlinuz-$FOO found - skipping

this is kind of hard to understand without knowing about p-b-t internals: 
skipping here means we don't copy the kernel/initrd from /boot to the 
ESP (since there is no source). the $FOO kernel (and initrd) are now 
only on the ESPs, no longer in /boot. since the package is no longer 
installed, future ABI-compatible upgrades are not installed, and the 
initrd is never regenerated when triggered by other factors.

worse, if I pinned that kernel for important reasons (e.g., HW-compat), 
removing the pin (via unpin, pinning another version, or next-boot to 
try whether an updated kernel improves the situation!) will remove the 
only copy of it..

I am not sure what we can do here (except making the message more 
prominent?) - failing apt is ugly, removing the kernel on the ESP when 
removing it from /boot despite it being pinned only makes it worse..
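
a rough sketch of what a more prominent message could look like, assuming 
the pinned version is already known to the caller. the function name and 
the wording of the warning are made up for illustration and not part of 
the patch:

```shell
#!/bin/sh
# Sketch only: warn loudly when a pinned kernel no longer exists in /boot.
# Usage: pinned_kernel_missing BOOT_DIR PINNED_VERSION
# Returns 0 (after printing a warning to stderr) if BOOT_DIR has no
# vmlinuz for the pinned version, 1 otherwise.
pinned_kernel_missing() {
    boot_dir="$1"
    pinned="$2"

    # nothing pinned, nothing to warn about
    [ -n "$pinned" ] || return 1

    if [ ! -e "${boot_dir}/vmlinuz-${pinned}" ]; then
        echo "WARNING: pinned kernel ${pinned} is missing from ${boot_dir}!" >&2
        echo "WARNING: the copies on the ESP(s) are the only ones left," >&2
        echo "WARNING: removing the pin will remove them as well." >&2
        return 0
    fi
    return 1
}
```

something like `pinned_kernel_missing /boot "$FOO"` could then run during 
refresh, before the existing 'skipping' message is printed.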

OTOH, since a pinned kernel is by definition never auto-removed, blocking 
its removal via an APT hook might work: the removal would never be 
started, so the resulting dpkg/apt state would stay clean? obviously 
this is only possible for our kernels where we know the naming scheme, 
anything custom could still run into the issue..
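
the check such a hook would need could be as small as the following 
sketch. it assumes the hook already has the list of package names 
scheduled for removal (how APT hands that over is left out here) and 
that our kernel packages are named pve-kernel-$VERSION as in the example 
above. the function name is hypothetical:

```shell
#!/bin/sh
# Sketch only: detect whether a removal would hit the pinned kernel.
# Usage: removal_hits_pinned PINNED_VERSION PKG...
# Returns 0 (after printing a message to stderr) if one of the packages
# is the pve-kernel package of the pinned version, 1 otherwise.
removal_hits_pinned() {
    pinned="$1"
    shift

    # no pin set, so no removal needs to be blocked
    [ -n "$pinned" ] || return 1

    for pkg in "$@"; do
        if [ "$pkg" = "pve-kernel-${pinned}" ]; then
            echo "refusing to remove ${pkg}: kernel ${pinned} is pinned, unpin it first" >&2
            return 0
        fi
    done
    return 1
}
```

a hook would exit non-zero when this returns 0, so the removal never 
starts. custom kernels with other package names would not be matched.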

> For systemd-boot we write the entry into the loader.conf on the ESP(s)
> instead of relying on the `bootctl set-default` mechanics (bootctl(1))
> which write the entry in an EFI-var. This was preferred, because of a
> few reports of unwriteable EFI-vars on some systems (e.g. DELL servers
> have a setting preventing writing EFI-vars from the OS). The rationale
> in `Why not simply rely on the EFI boot menu logic?` from [0] also
> makes a few points in that direction.
> 
> For grub the following choices were made:
> * write the pinned version (or actually the menu-path leading to it)
>   to /etc/default/grub instead of editing the grub.cfg files on the
>   partition. Mostly to deviate as little as possible from the
>   grub-workflow I assume people are used to.

did you test whether adding a snippet overriding GRUB_DEFAULT also 
works? we already do that to set the distributor for the various 
products.. creating/deleting a 

/etc/default/grub.d/y_proxmox_pinned_kernel.cfg

and (if we want to keep next-boot separate from pinning, see other 
mail)

/etc/default/grub.d/z_proxmox_next_boot.cfg

seems like the cleaner approach compared to modifying the admin-managed 
/etc/default/grub ..
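
a minimal sketch of the create/delete part, assuming the snippet only 
needs to set GRUB_DEFAULT and that grub-mkconfig picks it up after 
/etc/default/grub (which is exactly the part that needs testing). the 
function names are hypothetical, and the menu path is passed in by the 
caller since its exact format comes from the generated grub.cfg:

```shell
#!/bin/sh
# Sketch only: manage a GRUB_DEFAULT override snippet for a pinned kernel.
# Usage: write_pin_snippet GRUB_D_DIR MENU_PATH
# GRUB_D_DIR would be /etc/default/grub.d on a real system.
write_pin_snippet() {
    dir="$1"
    menu_path="$2"
    # quote the value, menu paths contain '>' between submenu and entry
    printf 'GRUB_DEFAULT="%s"\n' "$menu_path" \
        > "${dir}/y_proxmox_pinned_kernel.cfg"
}

# Usage: remove_pin_snippet GRUB_D_DIR
# Unpinning just deletes the snippet, /etc/default/grub stays untouched.
remove_pin_snippet() {
    rm -f "${1}/y_proxmox_pinned_kernel.cfg"
}
```

the admin-managed file is never rewritten this way, and a leftover pin is 
easy to spot with a plain ls of the directory.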

> * the 'root-device-id' part of the menu-entries is parsed from
>   /boot/grub/grub.cfg since it was stable (the same on all ESPs and in
>   /boot/grub), saves us from copying the "find device behind
>   /, mangle it if zfs/btrfs, call grub_probe a few times" part of
>   grub-mkconfig - and seems a bit more robust
> 
> Tested with a BIOS and an UEFI VM with / on ZFS.
> 
> [0] https://systemd.io/BOOT_LOADER_SPECIFICATION/
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>




