[pve-devel] [RFC qemu-server/storage 0/3] fix #5779: introduce guest hints to pass rxbounce flag to KRBD

Friedrich Weber f.weber at proxmox.com
Tue Oct 28 17:35:06 CET 2025


Fiona and I discussed this off-list (thanks again!), to sum up:

- even though there is currently only one hint, we'd like a solution to
that still allows to add more hints later on but doesn't generate too
much churn now

- passing the OS type to enable rxbounce is only relevant at a few
callsites, e.g. when activating on (Windows) VM start, not when e.g.
activating for offline-cloning. So it would be nice to avoid touching
all activate_volumes callsites now

- it's likely that other hypothetical hints we might add in the future
will be relevant only for VM start as well, but if this assumption turns
out to be false, we should be able to adapt without too much hassle.

- rxbounce cannot be applied to already-mapped RBD volumes without
unmapping them first, but hypothetical future flags (that are set based
on guest hints) for other storage plugins might be applicable to
already-mapped volumes without unmapping, so we should be able to allow
this in the future without too much hassle.

- when starting a VM, if a volume is already active without rxbounce,
unmapping and re-mapping it (with rxbounce) should be safe [0], but it
might not be safe for all callsites

- so for now, we can go for a variant of Fiona's suggestion 1 here [1]
where:

* for the base functionality, we pass the hints only at the
{activate,map}_volumes callsites that are relevant for rxbounce
(essentially, on VM start). This avoids having to modify all callsites
now. If a hypothetical future plugin needs hints also at other
callsites, we can still pass the storage hints there

* we define a boolean hint 'plugin-may-deactivate' that, if true, tells
the plugin that it is free to unmap and a re-map volume if necessary to
make use of a hint. If 'plugin-may-deactivate' is 0, the plugin must not
unmap+map to apply a hint -- if it can apply a hint to an already-mapped
volume without unmap+map, it can still do so. On VM start, we pass 1 for
this flag. This way, passing flags at more callsites in the future is
doable (if unmap+map isn't safe, we just pass 0). And we keep the door
open for hints that plugins can apply without unmap+map.

* RBDPlugin::map_volume checks whether an already-mapped volume has
rxbounce set in sysfs, and if it hasn't but should (according to the
hints) and plugin-may-deactivate is 1, it can unmap and map the volume
with rxbounce.

* this can result in unnecessary map (without rxbounce)+unmap+map (with
rxbounce) sequences e.g. in case of live restore, which is not optimal
but also not really problematic. In case of live restore we can probably
fix this by passing the storage hints already on the first activation,
which can be done on top of the base functionality in v1 or (worst case)
as a follow-up.

* if a hypothetical future hint+plugin combination actually runs into
the problem that a hint was not passed on initial activation, is passed
later but cannot be applied anymore then, we still have the option of
adding storage hints to more/all activate_volumes callsites later on.

* Fiona pointed out a problem with volumes that already active after
creation [2]: depending on the hints and the plugin, in such a case the
plugin may be able to apply the hint directly or unmap+map to do so. If
both doesn't work, we can still think about adding hints also to
alloc_image (which would be backwards-compatible).

* the hints applying to all volumes passed to activate_volumes is not
great as Fiona also mentioned [2], and having an alternative API that
allows per-volume hint or even a helper class might be nicer, but since
this is kind of orthogonal I won't cover this in the v1. Having such an
alternative API might still be nice in the future (also because if we
ever need more hints, TPM state and EFI disk are likely candidates, and
then, per-volume hints would make sense). Implementing this wouldn't
change the plugin API.

* we won't add hints to {deactivate,unmap}_volume for now -- they're not
relevant for rxbounce, and possibly not relevant for future hints either
-- and even if they are, the plugin may be better off to track relevant
state on its own, because hypothetical hint X might even change between
volume activation and deactivation. If we still need to add this, it can
be done in a backwards-compatible way.

I'll follow up with a v1.

[0]
https://lore.proxmox.com/pve-devel/59d16447-5f20-4a38-a7b0-fa4890174440@proxmox.com/
[1]
https://lore.proxmox.com/pve-devel/d182c102-204b-43f0-9336-d6faf754fe72@proxmox.com/
[2]
https://lore.proxmox.com/pve-devel/85dab76c-d461-43d1-8f1c-45ea1b31200a@proxmox.com/




More information about the pve-devel mailing list