[pbs-devel] [PATCH proxmox-backup v3 1/3] fix #6195: api: datastore: add endpoint for moving namespaces
Hannes Laimer
h.laimer at proxmox.com
Mon Sep 15 11:19:10 CEST 2025
On 15.09.25 10:56, Christian Ebner wrote:
> On 9/15/25 10:27 AM, Hannes Laimer wrote:
>> On 15.09.25 10:15, Christian Ebner wrote:
>>> Thanks for having a go at this issue, I did not yet have an in depth
>>> look at this but unfortunately I'm afraid the current implementation
>>> approach will not work for the S3 backend (and might also have issues
>>> for local datastores).
>>>
>>> Copying the S3 objects is not an atomic operation and will take some
>>> time, so leaves you open for race conditions. E.g. while you copy
>>> contents, a new backup snapshot might be created in one of the
>>> already copied backup groups, which will then however be deleted
>>> afterwards. Same is true for pruning, and other metadata editing
>>> operations such as
>>> adding notes, backup task logs, etc.
>>>
>>
>> Yes, but not really. We lock the `active_operations` tracking file, so
>> no new read/write operations can be started after we start the moving
>> process. There's a short comment in the API endpoint function.
>
> Ah yes, I did miss that part. But by doing that you will basically block
> any datastore operation, not just the ones to the source or target
> namespace. This is not ideal IMO. Further you cannot move a NS if any
> other operation is ongoing on the datastore, which might be completely
> unrelated to the source and target namespace, e.g. a backup to another
> namespace?
Yes. But I don't think this is something we can (easily) check for;
maybe there is a good way, but I can't think of a feasible one.
We could lock all affected groups in advance, but I'm not super sure we
can just move a locked dir, at least with the old locking.
Given that this is, I'd argue, a rather fast operation for both local
and S3 datastores, just saying 'nobody does anything while we move
stuff' seems reasonable.
What we could think about adding is maybe a checkbox for updating jobs
referencing the NS, but I'm not sure if we want that.
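(For illustration only: the 'lock all affected groups in advance' alternative
could be sketched roughly like below. The helper name and the lock-file scheme
are hypothetical stand-ins, not the actual proxmox-backup flock-based group
locking; `create_new` just gives us an atomic "busy" check.)

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::{Path, PathBuf};

/// Try to take an exclusive lock on every group before moving anything.
/// If any group is busy, roll back and abort up front instead of leaving
/// a half-done move. (Sketch only: real code would use the datastore's
/// group locks, not ad-hoc lock files.)
fn try_lock_all_groups(lock_dir: &Path, groups: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut held = Vec::new();
    for group in groups {
        let lock_path = lock_dir.join(format!("{group}.lock"));
        // `create_new` fails if the lock file already exists -> group is busy.
        match OpenOptions::new().write(true).create_new(true).open(&lock_path) {
            Ok(_) => held.push(lock_path),
            Err(e) => {
                // Release the locks we already took, then bail out.
                for p in &held {
                    let _ = std::fs::remove_file(p);
                }
                return Err(e);
            }
        }
    }
    Ok(held)
}
```

With something like this the move would only start once every group lock is
held, and releasing just means removing the lock files again.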
>
>> I'm not sure there is much value in more granular locking; I mean, is
>> half a successful move worth much? Unless we add some kind of rollback,
>> but tbh, I feel like that would not be worth the effort.
>
> Well, it could be just like we do for the sync jobs, skipping the move
> for the ones where the backup group could not be locked or fails for
> some other reason?
>
Hmm, but then we'd have it in two places, and moving again later won't
work because we can't distinguish between a same-named NS already
existing and a new attempt to complete an earlier move. And we also
can't allow that in general, because what happens if the same VMID
exists twice?
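(The VMID-collision concern could at least be checked cheaply up front; a
hedged sketch, where the group lists are placeholders for whatever a real
namespace listing would return:)

```rust
use std::collections::HashSet;

/// Return the backup group IDs (e.g. "vm/100") that exist in both the
/// source and the target namespace; a move would have to refuse to run
/// if this is non-empty. (Sketch only, not the actual API.)
fn conflicting_groups<'a>(source: &[&'a str], target: &[&str]) -> Vec<&'a str> {
    let target_set: HashSet<&str> = target.iter().copied().collect();
    source.iter().copied().filter(|g| target_set.contains(g)).collect()
}
```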
> I think having a more granular backup group unit instead of namespace
> makes this more flexible: what if I only want to move one backup group
> from one namespace to another one, as the initial request in the bug
> report?
>
That is not possible currently, and, at least with this series, not
intended. We could support that eventually, but that should be rather
orthogonal to this one, I think.
> For example, I had a VM which has been backed up to a given namespace,
> has however since been destroyed, but I want to keep the backups by
> moving the group with all the snapshots to a different namespace,
> freeing the backup type and ID for the current namespace?
>
I see the use-case for this, but I think these are two separate things:
moving a NS and moving a single group.
>>
>>> So IMO this must be tackled on a group level, making sure to get an
>>> exclusive lock for each group (on the source as well as target of the
>>> move operation) before doing any manipulation. Only then it is okay
>>> to do any non-atomic operations.
>>>
>>> The moving of the namespace must then be implemented as batch
>>> operations on the groups and sub-namespaces.
>>>
>>> This should be handled the same also for regular datastores, to avoid
>>> any races there too.
>>
>