[pve-devel] [RFC qemu-server/storage 0/3] fix #5779: introduce guest hints to pass rxbounce flag to KRBD

Fiona Ebner f.ebner at proxmox.com
Tue Oct 28 11:50:04 CET 2025


On 28.10.25 at 10:02 AM, Friedrich Weber wrote:
> On 24/10/2025 14:27, Friedrich Weber wrote:
>> [...]
>> Obstacles I faced so far:
>>
>> - The biggest obstacle is that we need to update all callers of
>>   activate_volumes to pass guest hints where possible. This means that optimally
>>   all callers should be able to generate the guest hints. Right now, there is
>>   only the 'guest-ostype' hint which is taken from the VM config. Currently, this
>>   is not always available to the caller of activate_volumes, sometimes
>>   some extra work/refactoring would be needed to get it (e.g. see
>>   PVE::QemuServer::QemuImage::convert or PVE::QemuServer::clone_disk),
>>   so this needs quite some code changes, which I have not done in all cases in
>>   this RFC.
>>
>> - There are also some indirect callers of activate_volumes, e.g. via
>>   PVE::Storage::abs_filesystem_path or PVE::Storage::storage_migrate -- these
>>   would also need to be extended to accept hints (not done in this RFC)
>>
>> - Initially, to avoid having to modify all (direct+indirect) callers of
>>   activate_volumes, I thought I could pass the hints only at the few "relevant"
>>   call sites (i.e., when starting a VM), but then noticed that volumes may be
>>   activated by an action unrelated to a VM start (e.g. a clone), then stay
>>   active, and not be re-activated by a VM start. So if e.g. we do not pass the
>>   hints on clone, the KRBD volume would be mapped without rxbounce, stay active,
>>   and when starting the VM, a user could run into the original problem again.
>>   So we can't get away with only passing hints to the few relevant call sites,
>>   and actually need to pass them everywhere (where possible).
>>
> 
> Thomas and I discussed this point off-list:
> 
> - to clarify: if a Windows guest volume was mapped with KRBD without
> rxbounce (e.g. by a clone where the activate_volumes caller doesn't pass
> $hints) and doesn't get unmapped, and then a VM start activates the
> volumes again (passing $hints this time so we'd like to pass rxbounce),
> RBDPlugin::map_volume will early-exit because the volume is already mapped:
> 
> sub map_volume {
>     my ($class, $storeid, $scfg, $volname, $snapname) = @_;
> 
>     my ($vtype, $img_name, $vmid) = $class->parse_volname($volname);
> 
>     my $name = $img_name;
>     $name .= '@' . $snapname if $snapname;
> 
>     my $kerneldev = get_rbd_dev_path($scfg, $storeid, $name);
> 
>     return $kerneldev if -b $kerneldev; # already mapped
> 
>     [...]
> }
> 
> ... which means the VM will just use the guest volume without rxbounce
> and the user can run into the issue.
> 
> - we discussed whether, to avoid this, we could apply the rxbounce
> option "on the fly" to an already-mapped volume. I looked a bit [1] and
> didn't see any way to apply rxbounce to an already-mapped volume.
> Calling `rbd map` again apparently just maps the volume a second time
> which doesn't sound like a good idea, and an `rbd unmap` followed by an
> `rbd map` (with rxbounce) is likely not safe either?
> 
> [1] https://docs.ceph.com/en/reef/man/8/rbd/

I was also thinking along similar lines when reading that paragraph of
the cover letter.

Using unmap and then map again might be safe in certain situations like
VM start, where there should be no other active users of the volume
(off the top of my head, this would need to be checked in detail).
That information is not available to the map_volume() implementation,
so it would need to be part of the contract that either:

1. hints are only used in situations where a deactivate/unmap up front
is safe, or
2. callers using hints are required to ensure the volume is deactivated
first; otherwise there is no guarantee that the hint is used.

Approach 1. would have the advantage that the plugin could check whether
it even needs to do the unmap/deactivate, and for some plugins/hints it
might be possible to apply the hint on the fly.
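
To make approach 1. a bit more concrete, here is a rough, untested-in-PVE
sketch of the decision a plugin could make on activation. Everything in it
is hypothetical and not part of the series or of PVE::Storage: the helper
names plan_activation() and options_from_hints(), the idea of comparing
"current" against "wanted" map options, and the assumption that Windows
ostype values start with 'w' (e.g. 'win11'). How a plugin would find out
the options of an existing mapping is left open here.

```perl
use strict;
use warnings;

# Hypothetical: derive the KRBD map options we would like from the hints.
# Assumption: Windows ostype values start with 'w' (e.g. 'win11').
sub options_from_hints {
    my ($hints) = @_;

    my $opts = {};
    $opts->{rxbounce} = 1 if ($hints->{'guest-ostype'} // '') =~ m/^w/;
    return $opts;
}

# Hypothetical: decide what to do on activation under approach 1.
#   $mapped      - true if the volume is already mapped
#   $cur_opts    - hash ref of options used by the existing mapping
#   $wanted_opts - hash ref of options derived from the hints
# Returns 'map' (not mapped yet), 'keep' (existing mapping is good enough)
# or 'remap' (unmap and map again, which the contract declares safe).
sub plan_activation {
    my ($mapped, $cur_opts, $wanted_opts) = @_;

    return 'map' if !$mapped;

    for my $opt (sort keys %$wanted_opts) {
        next if !$wanted_opts->{$opt};
        return 'remap' if !$cur_opts->{$opt};
    }

    return 'keep';
}

# Example: volume was mapped earlier (e.g. by a clone) without rxbounce,
# and a VM start now passes the hint for a Windows guest.
my $wanted = options_from_hints({ 'guest-ostype' => 'win11' });
print plan_activation(1, {}, $wanted), "\n";
```

The point of the sketch is only that the early "already mapped" return
would turn into a three-way decision, and that the unmap is skipped
whenever the existing mapping already satisfies the hints.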

Question is, do we already have callers that would be unhappy with that
requirement? As you described, the approach from the series has the
requirement of knowing "early enough" about what hints we'll need, which
is not always easy, but I haven't looked at it yet, just wanted to ask
about this already :)



