[pbs-devel] [RFC proxmox-backup 3/4] datastore: move snapshots to trash folder on destroy

Christian Ebner c.ebner at proxmox.com
Fri Apr 18 14:45:15 CEST 2025


On 4/18/25 14:03, Fabian Grünbichler wrote:
> On April 18, 2025 1:49 pm, Christian Ebner wrote:
>> On 4/18/25 13:06, Thomas Lamprecht wrote:
>>> Am 17.04.25 um 11:29 schrieb Fabian Grünbichler:
>>>> On April 16, 2025 4:18 pm, Christian Ebner wrote:
>>>>> Instead of directly deleting the snapshot directory and its contents
>>>>> on a prune, move the snapshot directory into the `.trash` subfolder
>>>>> of the datastore.
>>>>>
>>>>> This allows marking chunks which were used by these index files if
>>>>> the snapshot was pruned during an ongoing garbage collection.
>>>>> Garbage collection will clean up these files before starting with the
>>>>> marking phase 1 and read all index files after completing that phase,
>>>>> touching these chunks as well.
>>>>
>>>> some other variants to maybe consider:
>>>>
>>>> marking the snapshot itself as trash (in the manifest, or by adding a
>>>> trash marker file inside the dir) - this would mean that there is no
>>>> iterator race issue when undoing a prune, no double-pruning collisions,
>>>> .. - but it also means we need to adapt all call sites that should skip
>>>> trashed snapshots (most existing ones), which is more churn.
>>>
>>> Shouldn't we use the central iterators implementations to query indexes?
>>
>> Yes, correct me if I'm wrong, I have not checked all call sites yet, but
>> index files are mostly accessed by going through the manifest, either via
>> BackupManifest::files or at least verifying it via
>> BackupManifest::verify_file, as that's also where encryption and
>> verification state are stored.
>>
>> So adding a label to store a trashed state there would work out just
>> fine, filtering these snapshots for listing, sync jobs, etc. is then fine
>> as well. Also, fetching the previous backup snapshot for fast
>> incremental mode will work, although it requires additional filtering.
>>
>> Although, I'm a bit concerned about performance for the content listing
>> if we keep and iterate all of the pruned snapshots. After all they will
>> persist until next GC, which could lead to a lot of accumulated snapshots.
> 
> that's a fair point, in some environments this might be cumbersome..
> OTOH, those are exactly the environments that would/should run GC often
> I guess, so maybe it's not that bad?

Well, the content listing performance is already problematic for some 
setups, so I would like to avoid adding to that problem :)

>> One further issue I see with that approach is again sync jobs, which now
>> do not see the trashed snapshot on the target and try to re-sync it? Or
>> would we include that information for the sync jobs to skip over? Would
>> be a bit strange however if the snapshot is not trashed on the source side.
> 
> they'd only resync it if it's after the last local one, unless it's a
> "sync missing" special sync, so this is not different to the current
> state? at least, if syncing a trashed snapshot using the same snapshot
> is allowed and just undoes the trashing?
> 
>> Also, thinking about UI to recover from trash: Might it be good to still
>> show the snapshots while listing, but marked with an icon, just like for
>> e.g. encryption state? Or create a dedicated window/tab to only show
>> trashed items.
> 
> yes, the snapshot list needs to get an option to include trashed ones,
> and the UI should set and handle that appropriately ;)
> 
>> All in all, storing the trash information in the manifest might not be
>> the better option. Given the above issues, I'm leaning more towards a
>> separate folder structure for this.
> 
> most of the above issues apply to both variants anyway - the main
> difference is that with the separate folder iterating access needs to
> opt-into including trashed snapshots, so only does extra work in case
> that is desired, whereas in the manifest one the extra work is already
> done by the time we can decide to skip/filter out a snapshot because
> it's trash.
> 
> maybe a summary would be:
> 
> pro separate folder:
> - less work when only iterating over non-trash or only over trash
> - no need to parse manifest where it is currently not parsed
> con separate folder:
> - more work/complicated handling when iterating over both trash and non-trash
> - more work to put something into/out of the trash
> 
> ?

Hmm, maybe a variant where we do not rely on the manifest or a dedicated
marker file at all, but rather rename the snapshot folder so that it is
either hidden or has a dedicated prefix/suffix? Similar to your marker
file suggestion, but without the marker, and with an easy way to skip
reading such snapshot folders to begin with. This would require snapshot
creation to perform some additional checks, and the iterators would need
variants with and without the hidden folders, but it should reduce the
issues of both variants discussed above?
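
To make the rename idea a bit more concrete, here is a rough, untested
sketch. It assumes a hypothetical `.trashed` suffix on the snapshot
directory name. `TRASH_SUFFIX`, `TrashFilter`, `list_snapshots`,
`trash_snapshot` and `snapshot_name_is_free` are made-up names for
illustration only and none of this is existing datastore code. Locking
is left out entirely.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical suffix marking a trashed snapshot directory (assumption,
/// not an existing PBS naming convention).
const TRASH_SUFFIX: &str = ".trashed";

/// Which snapshot directories an iteration should yield.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TrashFilter {
    ActiveOnly,
    TrashedOnly,
    All,
}

/// The trashed state is decided from the directory name alone, so no
/// manifest has to be opened or parsed to skip a trashed snapshot.
fn is_trashed(name: &str) -> bool {
    name.ends_with(TRASH_SUFFIX)
}

/// List snapshot directory names below `group_dir`, sorted by name.
fn list_snapshots(group_dir: &Path, filter: TrashFilter) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(group_dir)? {
        let entry = entry?;
        if !entry.file_type()?.is_dir() {
            continue;
        }
        // Skip names that are not valid UTF-8 in this sketch.
        let name = match entry.file_name().into_string() {
            Ok(name) => name,
            Err(_) => continue,
        };
        let trashed = is_trashed(&name);
        let keep = match filter {
            TrashFilter::ActiveOnly => !trashed,
            TrashFilter::TrashedOnly => trashed,
            TrashFilter::All => true,
        };
        if keep {
            names.push(name);
        }
    }
    names.sort();
    Ok(names)
}

/// Move a snapshot to the trash by renaming its directory in place.
/// The existence check is racy without a lock; real code would have to
/// hold the snapshot/group lock around it.
fn trash_snapshot(group_dir: &Path, snapshot: &str) -> io::Result<PathBuf> {
    let from = group_dir.join(snapshot);
    let to = group_dir.join(format!("{}{}", snapshot, TRASH_SUFFIX));
    if to.exists() {
        return Err(io::Error::new(
            io::ErrorKind::AlreadyExists,
            "a trashed snapshot with this name already exists",
        ));
    }
    fs::rename(&from, &to)?;
    Ok(to)
}

/// The additional check on snapshot creation: the name must not be
/// taken by an active or by a trashed snapshot.
fn snapshot_name_is_free(group_dir: &Path, snapshot: &str) -> bool {
    !group_dir.join(snapshot).exists()
        && !group_dir
            .join(format!("{}{}", snapshot, TRASH_SUFFIX))
            .exists()
}
```

Content listing and sync would then call
`list_snapshots(dir, TrashFilter::ActiveOnly)` and never touch the
trashed directories, while GC and a "show trash" option in the UI would
ask for `TrashedOnly` or `All`. Restoring from trash would be the
reverse rename.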



