[pbs-devel] [PATCH proxmox-backup 1/3] GC: refactor chunk removal helper
Christian Ebner
c.ebner at proxmox.com
Wed Oct 15 11:56:31 CEST 2025
On 10/15/25 11:46 AM, Fabian Grünbichler wrote:
> On October 15, 2025 11:10 am, Christian Ebner wrote:
>> one comment inline
>>
>> On 10/15/25 10:38 AM, Fabian Grünbichler wrote:
>>> simplify the callback, and move the error handling to the helper.
>>>
>>> Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
>>> ---
>>> pbs-datastore/src/chunk_store.rs | 27 ++++++++++++---------------
>>> pbs-datastore/src/datastore.rs | 2 +-
>>> 2 files changed, 13 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>>> index 6e50327cb..1c7df9074 100644
>>> --- a/pbs-datastore/src/chunk_store.rs
>>> +++ b/pbs-datastore/src/chunk_store.rs
>>> @@ -415,19 +415,13 @@ impl ChunkStore {
>>> stat.st_size as u64,
>>> bad,
>>> status,
>>> - |status| {
>>> - if let Err(err) =
>>> - unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir)
>>> - {
>>> - if bad {
>>> - status.still_bad += 1;
>>> - }
>>> - bail!(
>>> + || {
>>> + unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>>> + format_err!(
>>> "unlinking chunk {filename:?} failed on store '{}' - {err}",
>>> self.name,
>>> - );
>>> - }
>>> - Ok(())
>>> + )
>>> + })
>>> },
>>> )?;
>>> }
>>> @@ -441,9 +435,7 @@ impl ChunkStore {
>>> /// status accordingly.
>>> ///
>>> /// If the chunk should be removed, the [`remove_callback`] is executed.
>>> - pub(super) fn check_atime_and_update_gc_status<
>>> - T: FnOnce(&mut GarbageCollectionStatus) -> Result<(), Error>,
>>> - >(
>>> + pub(super) fn check_atime_and_update_gc_status<T: FnOnce() -> Result<(), Error>>(
>>> atime: i64,
>>> min_atime: i64,
>>> oldest_writer: i64,
>>> @@ -453,7 +445,12 @@ impl ChunkStore {
>>> remove_callback: T,
>>> ) -> Result<(), Error> {
>>> if atime < min_atime {
>>> - remove_callback(gc_status)?;
>>> + if let Err(err) = remove_callback() {
>>> + if bad {
>>> + gc_status.still_bad += 1;
>>> + return Err(err);
>>
>> Unless I'm overlooking something, this will no longer propagate the
>> error if the removal of a non-bad chunk fails? Previously the error
>> was returned independently of the `bad` state.
>
> yes, you are right!
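
For reference, a minimal sketch of the helper with the error propagated
independently of `bad`. It is a simplified stand-in rather than the actual
patch: `GcStatus` and `RemoveError` are hypothetical placeholders for
`GarbageCollectionStatus` and `anyhow::Error`, and the argument list is
reduced to what the error path needs.

```rust
use std::fmt;

/// Hypothetical, reduced stand-in for `GarbageCollectionStatus`.
#[derive(Debug, Default)]
pub struct GcStatus {
    pub still_bad: usize,
    pub removed_chunks: usize,
}

/// Hypothetical error type; the real code uses `anyhow::Error`.
#[derive(Debug)]
pub struct RemoveError(pub String);

impl fmt::Display for RemoveError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for RemoveError {}

/// Simplified variant of `check_atime_and_update_gc_status`: only the
/// removal path is modelled, the pending/in-use accounting is left out.
pub fn check_atime_and_update_gc_status<T>(
    atime: i64,
    min_atime: i64,
    bad: bool,
    gc_status: &mut GcStatus,
    remove_callback: T,
) -> Result<(), RemoveError>
where
    T: FnOnce() -> Result<(), RemoveError>,
{
    if atime < min_atime {
        if let Err(err) = remove_callback() {
            if bad {
                // Only bad chunks are counted as "still bad" ...
                gc_status.still_bad += 1;
            }
            // ... but the error is returned for bad and non-bad chunks alike.
            return Err(err);
        }
        gc_status.removed_chunks += 1;
    }
    Ok(())
}
```

The only difference to the hunk quoted above is that `return Err(err)` sits
outside the `if bad` block.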
>
> although I now wonder - should we make failure to remove bad chunk files
> non-fatal? or even all chunk files? at this point we've made all the
> decisions already, and best-effort removal might be better than no
> removal (e.g., a single chunk with a permission issue effectively blocks
> GC now??).
No strong opinion on this, but I would agree. Best-effort removal would
at least keep the chunk store from filling up unintentionally.
OTOH, if the permissions are wrong on one chunk, or the removal fails
for that particular file, the same problem most likely affects other
chunks as well, for example in out-of-memory situations on ZFS. So one
probably does not gain much? It might even spam the task log, which
should be avoided.
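
One way to get best-effort removal without flooding the task log would be
to cap the number of per-chunk warnings and report the remainder as a
single count. The following is only a sketch of that idea, not code from
proxmox-backup: `remove_best_effort`, `RemovalSummary` and
`MAX_LOGGED_FAILURES` are hypothetical names, and the collected messages
stand in for whatever the task log would receive.

```rust
/// Hypothetical cap on the number of individual failure messages.
const MAX_LOGGED_FAILURES: usize = 3;

/// Hypothetical result of a best-effort removal pass.
#[derive(Debug, Default)]
pub struct RemovalSummary {
    pub removed: usize,
    pub failed: usize,
    /// Messages that would be written to the task log.
    pub messages: Vec<String>,
}

/// Try to remove every chunk, continuing after failures and logging
/// at most `MAX_LOGGED_FAILURES` of them individually.
pub fn remove_best_effort<I, F>(chunks: I, mut remove: F) -> RemovalSummary
where
    I: IntoIterator<Item = String>,
    F: FnMut(&str) -> Result<(), String>,
{
    let mut summary = RemovalSummary::default();
    for chunk in chunks {
        match remove(&chunk) {
            Ok(()) => summary.removed += 1,
            Err(err) => {
                summary.failed += 1;
                if summary.failed <= MAX_LOGGED_FAILURES {
                    summary
                        .messages
                        .push(format!("unlinking chunk {chunk:?} failed - {err}"));
                }
            }
        }
    }
    if summary.failed > MAX_LOGGED_FAILURES {
        // Collapse the remaining failures into one line.
        summary.messages.push(format!(
            "{} further removal failures not shown",
            summary.failed - MAX_LOGGED_FAILURES
        ));
    }
    summary
}
```

Whether GC should still end in a warning or an error state when
`failed > 0` would be a separate decision.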