[pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy

Fabian Grünbichler f.gruenbichler at proxmox.com
Fri Apr 18 13:51:56 CEST 2025


On April 18, 2025 1:06 pm, Thomas Lamprecht wrote:
> Am 17.04.25 um 11:29 schrieb Fabian Grünbichler:
>> On April 16, 2025 4:18 pm, Christian Ebner wrote:
>>> Instead of directly deleting the snapshot directory and its contents
>>> on a prune, move the snapshot directory into the `.trash` subfolder
>>> of the datastore.
>>>
>>> This allows marking chunks which were used by these index files if
>>> the snapshot was pruned during an ongoing garbage collection.
>>> Garbage collection will clean up these files before starting with
>>> marking phase 1 and will read all index files after completing that
>>> phase, touching these chunks as well.
>> 
>> some other variants to maybe consider:
>> 
>> marking the snapshot itself as trash (in the manifest, or by adding a
>> trash marker file inside the dir) - this would mean there is no
>> iterator race issue when undoing a prune and no double-pruning
>> collisions - but it also means we need to adapt all call sites that
>> should skip trashed snapshots (most existing ones), which is more churn.
> 
> Shouldn't we use the central iterator implementations to query indexes?
> I.e., any call site that doesn't should probably be switched over to
> those, just like Chris did for GC recently.
> Then it could be defined whether trashed indexes should be skipped – the
> default – or included when instantiating that iterator, e.g. through a
> parameter or probably a dedicated "with_trash" fn – but those are details.

yes, the parts where we iterate are fairly easy to handle, but we do
have quite a few places where we access a snapshot directly and might
not want to treat a trashed one like a non-trashed one ;)
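to illustrate the marker-file variant for the iterator path: a minimal
sketch of a group listing that skips trashed snapshots by default, with
a `with_trash` toggle as proposed above. all names here (the `.trash`
marker file, `list_snapshots`) are hypothetical, not the actual PBS
on-disk format or API:

```rust
use std::path::{Path, PathBuf};

// hypothetical marker file name; not the actual PBS on-disk format
const TRASH_MARKER: &str = ".trash";

fn is_trashed(snapshot_dir: &Path) -> bool {
    snapshot_dir.join(TRASH_MARKER).exists()
}

/// List snapshot directories of a group, skipping trashed ones unless
/// `with_trash` is set (mirroring the proposed "with_trash" constructor).
fn list_snapshots(group_dir: &Path, with_trash: bool) -> std::io::Result<Vec<PathBuf>> {
    let mut snapshots = Vec::new();
    for entry in std::fs::read_dir(group_dir)? {
        let path = entry?.path();
        if !path.is_dir() {
            continue;
        }
        if !with_trash && is_trashed(&path) {
            continue; // default: hide trashed snapshots from callers
        }
        snapshots.push(path);
    }
    Ok(snapshots)
}
```

direct snapshot access would still need an explicit `is_trashed` check
at each call site, which is exactly the churn mentioned above.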

>> having a trash dir per group instead of a global one for the whole
>> datastore (less likely to incur extra costs in case somebody has a weird
>> setup where namespaces/.. are symlinked or bindmounted or similar
>> shenanigans). this would need to postpone group removal to GC in case
>> all snapshots are pruned.
> 
> Both of those variants would make restore simpler too. IMO it depends a
> bit on whether we already read the manifest everywhere the info is
> needed; in that case I'd slightly favor the manifest as the place to
> store whether a backup is trash or not, as that would avoid the need for
> another directory or file to store (inode exhaustion) and manage.

the iterator itself doesn't read the manifest yet, and neither does the
group's `list_backups` fn, so the "manifest-only" variant would
potentially add quite a bit of overhead there.. but of course, things
like verification and snapshot listing over the API already read the
manifest later anyway...

another variant would be to have a per-group list of trashed snapshots
*and* a marker in the manifest - that way both use cases would be
supported, with the caveat that they'd have to be kept in sync (but that
only affects prune/undo_prune itself, so might not be *that* bad?)
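the combined variant can be sketched with two in-memory stand-ins for
the two on-disk locations (the per-group trash list and the manifest
flag); all type and field names are illustrative, not actual PBS types.
the point is that only prune/undo_prune touch both, so they are the only
spots that must keep them in sync:

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-ins for the two on-disk locations: a per-group
/// list of trashed snapshots (cheap for iterators) and a trash flag in
/// each snapshot's manifest (for direct snapshot access).
#[derive(Default)]
struct Group {
    trash_list: BTreeSet<String>,
    manifest_trash_flag: BTreeSet<String>,
}

impl Group {
    /// Prune: set both the manifest marker and the trash-list entry.
    fn prune(&mut self, snapshot: &str) {
        self.manifest_trash_flag.insert(snapshot.to_string());
        self.trash_list.insert(snapshot.to_string());
    }

    /// Undo a prune: clear both again, keeping the two in sync.
    fn undo_prune(&mut self, snapshot: &str) {
        self.trash_list.remove(snapshot);
        self.manifest_trash_flag.remove(snapshot);
    }

    /// Fast check for iterators: consult the per-group list only,
    /// without opening any manifest.
    fn is_trashed(&self, snapshot: &str) -> bool {
        self.trash_list.contains(snapshot)
    }
}
```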



