[pbs-devel] [PATCH proxmox-backup 1/3] GC: refactor chunk removal helper

Fabian Grünbichler f.gruenbichler at proxmox.com
Wed Oct 15 12:04:20 CEST 2025


On October 15, 2025 11:56 am, Christian Ebner wrote:
> On 10/15/25 11:46 AM, Fabian Grünbichler wrote:
>> On October 15, 2025 11:10 am, Christian Ebner wrote:
>>> one comment inline
>>>
>>> On 10/15/25 10:38 AM, Fabian Grünbichler wrote:
>>>> simplify the callback, and move the error handling to the helper.
>>>>
>>>> Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
>>>> ---
>>>>    pbs-datastore/src/chunk_store.rs | 27 ++++++++++++---------------
>>>>    pbs-datastore/src/datastore.rs   |  2 +-
>>>>    2 files changed, 13 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>>>> index 6e50327cb..1c7df9074 100644
>>>> --- a/pbs-datastore/src/chunk_store.rs
>>>> +++ b/pbs-datastore/src/chunk_store.rs
>>>> @@ -415,19 +415,13 @@ impl ChunkStore {
>>>>                        stat.st_size as u64,
>>>>                        bad,
>>>>                        status,
>>>> -                    |status| {
>>>> -                        if let Err(err) =
>>>> -                            unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir)
>>>> -                        {
>>>> -                            if bad {
>>>> -                                status.still_bad += 1;
>>>> -                            }
>>>> -                            bail!(
>>>> +                    || {
>>>> +                        unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>>>> +                            format_err!(
>>>>                                    "unlinking chunk {filename:?} failed on store '{}' - {err}",
>>>>                                    self.name,
>>>> -                            );
>>>> -                        }
>>>> -                        Ok(())
>>>> +                            )
>>>> +                        })
>>>>                        },
>>>>                    )?;
>>>>                }
>>>> @@ -441,9 +435,7 @@ impl ChunkStore {
>>>>        /// status accordingly.
>>>>        ///
>>>>        /// If the chunk should be removed, the [`remove_callback`] is executed.
>>>> -    pub(super) fn check_atime_and_update_gc_status<
>>>> -        T: FnOnce(&mut GarbageCollectionStatus) -> Result<(), Error>,
>>>> -    >(
>>>> +    pub(super) fn check_atime_and_update_gc_status<T: FnOnce() -> Result<(), Error>>(
>>>>            atime: i64,
>>>>            min_atime: i64,
>>>>            oldest_writer: i64,
>>>> @@ -453,7 +445,12 @@ impl ChunkStore {
>>>>            remove_callback: T,
>>>>        ) -> Result<(), Error> {
>>>>            if atime < min_atime {
>>>> -            remove_callback(gc_status)?;
>>>> +            if let Err(err) = remove_callback() {
>>>> +                if bad {
>>>> +                    gc_status.still_bad += 1;
>>>> +                    return Err(err);
>>>
>>> Unless I'm overlooking something, this will no longer propagate the
>>> error if the removal of a non-bad chunk fails? Previously the error
>>> was returned regardless of the `bad` state.
>> 
>> yes, you are right!
>> 
>> although I now wonder - should we make failure to remove bad chunk files
>> non-fatal? or even all chunk files? at this point we've made all the
>> decisions already, and best-effort removal might be better than no
>> removal (e.g., a single chunk with a permission issue effectively blocks
>> GC now??).
> 
> No strong opinion on this, but I would agree. Best-effort removal would
> at least avoid unintentionally filling up the chunk store.
> 
> OTOH, if the permissions are wrong on one chunk or the removal fails for
> that particular file, this most likely also affects others, for example
> in out-of-memory situations on ZFS. So one probably does not gain much?
> It might even lead to spamming of the task log, which should be
> avoided.

yeah, spamming the task log would indeed be bad.

we could introduce a new gc_status field for those maybe, and just
summarize according to error category instead of logging one error per
line, but that is yet another bigger refactor, so I'll just send v2 with
the handling as it was!
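
for reference, a minimal sketch of what "the handling as it was" means for
the helper: bump `still_bad` only for bad chunks, but return the error in
both cases. this is not the actual v2 patch; `GcStatus`,
`remove_if_expired` and the `String` error type are simplified stand-ins
for the real types in pbs-datastore.

```rust
/// Simplified stand-in for the GC status struct (hypothetical, only the
/// counter relevant to this discussion).
#[derive(Debug, Default)]
struct GcStatus {
    still_bad: usize,
}

/// Hypothetical, reduced version of the helper: the callback no longer
/// gets the status, the helper does the accounting itself.
fn remove_if_expired<F>(
    atime: i64,
    min_atime: i64,
    bad: bool,
    gc_status: &mut GcStatus,
    remove_callback: F,
) -> Result<(), String>
where
    F: FnOnce() -> Result<(), String>,
{
    if atime < min_atime {
        if let Err(err) = remove_callback() {
            // only bad chunks are counted as "still bad" ...
            if bad {
                gc_status.still_bad += 1;
            }
            // ... but the error is propagated regardless of `bad`
            return Err(err);
        }
    }
    Ok(())
}
```

the only difference to the hunk quoted above is that the `return` sits
outside of the `if bad` block.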

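the "summarize according to error category" idea could look roughly like
the following. this is purely illustrative: `RemovalFailures` is a
hypothetical type, not an existing gc_status field, and
`std::io::ErrorKind` is used as the category here only to keep the
example self-contained (the real removal path returns a different error
type).

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical per-category counter for failed chunk removals.
#[derive(Debug, Default)]
struct RemovalFailures {
    by_kind: HashMap<io::ErrorKind, usize>,
}

impl RemovalFailures {
    /// Record one failed removal instead of logging it right away.
    fn record(&mut self, err: &io::Error) {
        *self.by_kind.entry(err.kind()).or_insert(0) += 1;
    }

    /// Total number of chunks that could not be removed.
    fn total(&self) -> usize {
        self.by_kind.values().sum()
    }

    /// One summary line per error category, sorted for stable output.
    fn summary(&self) -> Vec<String> {
        let mut lines: Vec<String> = self
            .by_kind
            .iter()
            .map(|(kind, count)| format!("{count} chunk(s) not removed: {kind:?}"))
            .collect();
        lines.sort();
        lines
    }
}
```

GC would call `record()` for each failed removal and log `summary()` once
at the end, so the task log grows with the number of error categories
rather than the number of affected chunks.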


