[pbs-devel] [PATCH proxmox-backup v3 1/3] fix #6195: api: datastore: add endpoint for moving namespaces
Christian Ebner
c.ebner at proxmox.com
Mon Sep 15 12:01:48 CEST 2025
On 9/15/25 11:19 AM, Hannes Laimer wrote:
> On 15.09.25 10:56, Christian Ebner wrote:
>> On 9/15/25 10:27 AM, Hannes Laimer wrote:
>>> On 15.09.25 10:15, Christian Ebner wrote:
>>>> Thanks for having a go at this issue. I did not yet have an in-depth
>>>> look at this, but unfortunately I'm afraid the current implementation
>>>> approach will not work for the S3 backend (and might also have
>>>> issues for local datastores).
>>>>
>>>> Copying the S3 objects is not an atomic operation and will take
>>>> some time, so it leaves you open to race conditions. E.g. while you
>>>> copy contents, a new backup snapshot might be created in one of the
>>>> already copied backup groups, which will however be deleted
>>>> afterwards. The same is true for pruning and other metadata-editing
>>>> operations such as adding notes, backup task logs, etc.
>>>>
>>>
>>> Yes, but not really. We lock the `active_operations` tracking file, so
>>> no new read/write operations can be started after we start the moving
>>> process. There's a short comment in the API endpoint function.
>>
>> Ah yes, I did miss that part. But by doing that you will basically
>> block any datastore operation, not just the ones touching the source
>> or target namespace. This is not ideal IMO. Further, you cannot move
>> a NS if any other operation is ongoing on the datastore, which might
>> be completely unrelated to the source and target namespace, e.g. a
>> backup to another namespace?
>
> Yes. But I don't think this is something we can (easily) check for;
> maybe there is a good way, but I can't think of a feasible one.
> We could lock all affected groups in advance, but I'm not super sure we
> can just move a locked dir, at least with the old locking.
No, not lock all in advance, but we can lock on a per-backup-group basis
(source and target) and consider that the basic operation, so this is
mostly a local sync job on the same datastore from one namespace to
another one. That is why I suggested considering the move of a namespace
as a batch operation of moving backup groups. While not as performant,
this should eliminate possible races and make error handling/rollback
much easier.
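
Roughly what I have in mind, as a very much simplified sketch - all type
and helper names below are made up for illustration (not the actual
proxmox-backup API), and the locking just uses std's File::try_lock
(Rust 1.89+) instead of our lock helpers. Note that with the lock file
held via an open fd, renaming the group dir is not a problem:

use std::fs::{File, TryLockError};
use std::io;
use std::path::{Path, PathBuf};

/// Made-up stand-in for a backup group ("<type>/<id>") in a namespace.
struct BackupGroup {
    path: PathBuf,
    name: String,
}

/// Try to take an exclusive lock for a group; `Ok(None)` means busy, skip.
fn try_lock_group(lock_dir: &Path, group: &BackupGroup) -> io::Result<Option<File>> {
    let file = File::options()
        .create(true)
        .write(true)
        .open(lock_dir.join(format!("{}.lock", group.name.replace('/', "-"))))?;
    match file.try_lock() {
        Ok(()) => Ok(Some(file)),
        Err(TryLockError::WouldBlock) => Ok(None),
        Err(TryLockError::Error(err)) => Err(err),
    }
}

/// Move the groups one by one, each under its own lock, skipping the ones
/// that are busy or already exist at the target - like a sync job would.
fn move_namespace_groups(
    lock_dir: &Path,
    groups: Vec<BackupGroup>,
    target_ns: &Path,
) -> Vec<(String, io::Result<()>)> {
    let mut results = Vec::new();
    for group in groups {
        let target_path = target_ns.join(&group.name);
        if target_path.exists() {
            // name collision with a preexisting group: skip, never merge
            results.push((group.name, Err(io::Error::other("target group exists"))));
            continue;
        }
        match try_lock_group(lock_dir, &group) {
            Ok(Some(_guard)) => {
                // lock is held for the duration of this group's move; a
                // real implementation would lock the target side as well
                let res = std::fs::create_dir_all(target_path.parent().unwrap_or(target_ns))
                    .and_then(|()| std::fs::rename(&group.path, &target_path));
                results.push((group.name, res));
            }
            Ok(None) => {
                results.push((group.name, Err(io::Error::other("group busy, skipped"))));
            }
            Err(err) => results.push((group.name, Err(err))),
        }
    }
    results
}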
> Given that for both local and S3 datastores this is, I'd argue, a
> rather fast operation, just saying 'nobody does anything while we move
> stuff' is reasonable.
Well, for an S3 object store with several sub-namespaces, containing
hundreds of backup groups and thousands of snapshots (with notes, backup
task logs and other metadata), this might take some time. After all,
there is a copy request for each of the objects involved. Do you have
some hard numbers on this?
>
> What we could think about adding is maybe a checkbox for updating jobs
> referencing the NS, but I'm not sure we want that.
>
>>
>>> I'm not sure there is much value in more granular locking; I mean, is
>>> half a successful move worth much? Unless we add some kind of
>>> rollback, but tbh, I feel like that would not be worth the effort.
>>
>> Well, it could be just like we do for the sync jobs: skipping the move
>> for the groups that could not be locked or that fail for some other
>> reason?
>>
>
> Hmm, but then we'd have it in two places, and moving again later won't
> work because we can't distinguish between a same-named NS already
> existing and a new attempt to complete an earlier move. And we also
> can't allow that in general, because what happens if there's the same
> VMID twice?
Not if the failed/skipped group is cleaned up correctly, provided it was
not preexisting? And skipped if it is preexisting... disallowing any
group name collisions.
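
Something along these lines, as a rough sketch (the copy helper is a
made-up placeholder, not the actual API):

use std::io;
use std::path::Path;

fn copy_group_contents(_src: &Path, _dst: &Path) -> io::Result<()> {
    // placeholder: per-snapshot copy of indexes and metadata (on the same
    // datastore the chunks are shared, so only metadata needs copying)
    Ok(())
}

fn move_group(src: &Path, dst: &Path) -> io::Result<()> {
    if dst.exists() {
        // preexisting group of the same name: skip, never merge snapshots
        return Err(io::Error::other("target group exists, skipped"));
    }
    std::fs::create_dir_all(dst)?;
    if let Err(err) = copy_group_contents(src, dst) {
        // clean up the partial copy, so "target exists" stays an
        // unambiguous signal for a name collision on a later retry
        let _ = std::fs::remove_dir_all(dst);
        return Err(err);
    }
    // drop the source only once the target is complete
    std::fs::remove_dir_all(src)
}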
>
>> I think having a more granular backup group unit instead of namespace
>> makes this more flexible: what if I only want to move one backup group
>> from one namespace to another one, as the initial request in the bug
>> report?
>>
>
> That is not possible currently. And, at least with this series, not
> intended. We could support that eventually, but that should be rather
> orthogonal to this one I think.
But then this does not really fix the issue ;)
>
>> For example, I had a VM which has been backed up to a given namespace
>> but has since been destroyed; I want to keep the backups by moving the
>> group with all its snapshots to a different namespace, freeing the
>> backup type and ID in the current namespace?
>>
>
> I see the use case for this, but I think these are two things: moving
> a NS and moving a single group.
This is a design decision we should make now.
To me it seems to make more sense to see the namespace move as a batch
operation of moving groups.
Alternatively, IMO we must implement locking for namespaces analogous to
the locking for backup groups to be able to keep a consistent state,
especially for the S3 backend, where there are a lot of failure modes.
Locking all operations on the datastore and requiring that none (even
unrelated ones) be active before trying the move is not ideal.
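
For the latter alternative, a minimal sketch of what namespace-level
locking could look like - lock file locations and helper names are made
up, and this again just uses std's File::try_lock (Rust 1.89+) rather
than our actual locking helpers:

use std::fs::{File, TryLockError};
use std::io;
use std::path::Path;

/// Take an exclusive lock for a whole namespace, failing fast if busy.
fn lock_namespace(lock_dir: &Path, ns: &str) -> io::Result<File> {
    let file = File::options()
        .create(true)
        .write(true)
        .open(lock_dir.join(format!("ns-{}.lock", ns.replace('/', "-"))))?;
    match file.try_lock() {
        Ok(()) => Ok(file),
        Err(TryLockError::WouldBlock) => {
            Err(io::Error::other(format!("namespace '{ns}' is in use")))
        }
        Err(TryLockError::Error(err)) => Err(err),
    }
}

fn move_namespace(lock_dir: &Path, source: &str, target: &str) -> io::Result<()> {
    // both locks stay in scope until the move (including any rollback on
    // error) is fully done, so no operation can sneak in on either side
    let _src_lock = lock_namespace(lock_dir, source)?;
    let _dst_lock = lock_namespace(lock_dir, target)?;
    // ... move groups and sub-namespaces, or undo everything on failure ...
    Ok(())
}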
Other opinions here? CC'ing Thomas and Fabian...
>
>>>
>>>> So IMO this must be tackled on a group level, making sure to get an
>>>> exclusive lock for each group (on the source as well as the target
>>>> of the move operation) before doing any manipulation. Only then is
>>>> it okay to do any non-atomic operations.
>>>>
>>>> The moving of the namespace must then be implemented as a batch
>>>> operation on the groups and sub-namespaces.
>>>>
>>>> This should be handled the same way for regular datastores too, to
>>>> avoid any races there.
>>>
>>
>