[pbs-devel] [PATCH proxmox-backup v3 18/23] GC: lock chunk marker before cleanup in phase 3 on s3 backends

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Nov 6 10:37:42 CET 2025


On November 5, 2025 1:22 pm, Christian Ebner wrote:
> To make sure there is no race between atime check and deletion with
> possible re-insertion.

could you expand on this?

sweep_unused_chunks holds the chunk store mutex for the full processing
of each chunk it iterates over..

so we have for each chunk (sketched in code right after this list):

- obtain mutex
- fstat
- call cond_sweep_chunk
- lock chunk
- fstat again
- drop mutex
- remove from cache (which re-obtains mutex)
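
to make that ordering explicit, a compilable stand-in sketch (all types
and helpers below are hypothetical stand-ins, not the actual
pbs-datastore API):

    use std::sync::Mutex;
    use std::time::Duration;

    struct FileLockGuard; // stands in for the per-chunk file lock guard

    struct ChunkStore {
        mutex: Mutex<()>, // stands in for the chunk store mutex
    }

    impl ChunkStore {
        fn fstat_atime(&self, _digest: &[u8; 32]) -> i64 {
            0 // placeholder: the real code fstat()s the chunk marker
        }

        fn try_lock_chunk(
            &self,
            _digest: &[u8; 32],
            _timeout: Duration,
        ) -> Option<FileLockGuard> {
            Some(FileLockGuard) // placeholder for the per-chunk file lock
        }

        fn cache_remove(&self, _digest: &[u8; 32]) {
            let _lock = self.mutex.lock().unwrap(); // 7. re-obtains the mutex
            // ... unlink marker and delete the S3 object ...
        }

        fn sweep_one(&self, digest: &[u8; 32], min_atime: i64) {
            let lock = self.mutex.lock().unwrap(); // 1. obtain mutex
            if self.fstat_atime(digest) >= min_atime {
                return; // 2. fstat: recently used, keep the chunk
            }
            // 3. the cond_sweep_chunk callback runs:
            if let Some(_guard) =
                self.try_lock_chunk(digest, Duration::from_secs(0))
            {
                // 4. chunk locked
                if self.fstat_atime(digest) < min_atime {
                    // 5. fstat again
                    drop(lock); // 6. drop mutex
                    self.cache_remove(digest); // 7. remove from cache
                }
            }
        }
    }

    fn main() {
        let store = ChunkStore { mutex: Mutex::new(()) };
        store.sweep_one(&[0u8; 32], 1);
    }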

shouldn't we protect this race by obtaining the mutex when conditionally
touching the chunk in `insert_chunk_cached()`, or even in
`datastore.cond_touch_chunk()`? then there is no way for a chunk
insertion to update the atime here.. AFAICT the latter would also close
a race with pull syncing, which touches chunks while they are referenced
by tmp indices, which means those touches are not protected by the mutex
or the rest of the GC logic?
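
what I have in mind is roughly the following (again just a sketch with
hypothetical stand-ins, not the real signatures):

    use std::sync::Mutex;

    struct DataStore {
        chunk_store_mutex: Mutex<()>, // stands in for the chunk store mutex
    }

    impl DataStore {
        fn utimensat_now(&self, _digest: &[u8; 32]) {
            // placeholder: the real code updates the marker's atime
        }

        // take the mutex around the conditional touch, so GC's
        // fstat/remove sequence can never interleave with a concurrent
        // touch caused by re-insertion or pull syncing
        fn cond_touch_chunk(&self, digest: &[u8; 32]) {
            let _lock = self.chunk_store_mutex.lock().unwrap();
            self.utimensat_now(digest);
        }
    }

    fn main() {
        let store = DataStore { chunk_store_mutex: Mutex::new(()) };
        store.cond_touch_chunk(&[0u8; 32]);
    }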

> 
> By only acquiring the file lock if the chunk marker would be removed
> and double stat'ing, the file locking penalty is avoided for the other
> cases.
> 
> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
> ---
> changes since version 2:
> - no changes
> 
>  pbs-datastore/src/chunk_store.rs | 28 +++++++++++++++++++++++++++-
>  pbs-datastore/src/datastore.rs   |  2 ++
>  2 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 49687b2fa..08519fe2b 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
>  use std::time::Duration;
>  
>  use anyhow::{bail, format_err, Context, Error};
> +use hex::FromHex;
>  use tracing::{info, warn};
>  
>  use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
> @@ -22,7 +23,7 @@ use crate::data_blob::DataChunkBuilder;
>  use crate::file_formats::{
>      COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
>  };
> -use crate::DataBlob;
> +use crate::{DataBlob, LocalDatastoreLruCache};
>  
>  /// File system based chunk store
>  pub struct ChunkStore {
> @@ -366,6 +367,7 @@ impl ChunkStore {
>          min_atime: i64,
>          status: &mut GarbageCollectionStatus,
>          worker: &dyn WorkerTaskContext,
> +        cache: Option<&LocalDatastoreLruCache>,
>      ) -> Result<(), Error> {
>          // unwrap: only `None` in unit tests
>          assert!(self.locker.is_some());
> @@ -419,6 +421,30 @@ impl ChunkStore {
>                          bad,
>                          status,
>                          || {

basically we have two removal paths in this callback:
- remove chunk/chunk marker (for all bad chunks and the non-S3 case)
- remove chunk via cache (for non-bad S3 chunks)

> +                            if let Some(cache) = cache {
> +                                // never lock bad chunks
> +                                if filename.to_bytes().len() == 64 {

this is already passed in as the `bad` bool ;)

> +                                    let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
> +                                    match self.lock_chunk(&digest, Duration::from_secs(0)) {

so this is basically a "skip chunk if currently being uploaded" check,
right? might warrant a comment, else this reads like a deadlock waiting
to happen..

in phase 2 we obtain the chunk lock for all chunks on the S3 side outside
of the remove callback, while here we do it inside - is there a reason
for that? chunks ending up in this state here should be rather rare
(compared to chunks existing on S3 being garbage collected), right?
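
to illustrate what I mean by doing it outside, phase-2 style (hypothetical
stand-ins once more):

    use std::time::Duration;

    struct FileLockGuard;

    fn try_lock_chunk(
        _digest: &[u8; 32],
        _timeout: Duration,
    ) -> Option<FileLockGuard> {
        Some(FileLockGuard) // zero timeout: skip instead of blocking
    }

    fn remove_via_cache(_digest: &[u8; 32]) {
        // placeholder for cache.remove(&digest)
    }

    fn sweep_s3_chunk(digest: &[u8; 32]) {
        // lock first, before entering the remove callback, like phase 2
        // does for chunks existing on the S3 side
        let Some(_guard) = try_lock_chunk(digest, Duration::from_secs(0))
        else {
            return; // concurrent upload pending, leave the chunk alone
        };
        remove_via_cache(digest);
    }

    fn main() {
        sweep_s3_chunk(&[0u8; 32]);
    }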

but even if we keep the locking inside the callback, the whole callback
could become (AFAICT):


                        || {
                            // non-bad S3 chunks need to be removed via cache
                            if let Some(cache) = cache {
                                if !bad {
                                    let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;

                                    // unless there is a concurrent upload pending
                                    if let Ok(_guard) =
                                        self.lock_chunk(&digest, Duration::from_secs(0))
                                    {
                                        drop(lock);
                                        cache.remove(&digest)?;
                                    }

                                    return Ok(());
                                }
                            }

                            // bad or local chunks
                            unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
                                |err| {
                                    format_err!(
                                        "unlinking chunk {filename:?} failed on store '{}' - {err}",
                                        self.name,
                                    )
                                },
                            )
                        },

> +                                        Ok(_guard) => {
> +                                            // don't remove if changed since locking
> +                                            match fstatat(
> +                                                Some(dirfd),
> +                                                filename,
> +                                                nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
> +                                            ) {
> +                                                Ok(stat) if stat.st_atime < min_atime => {
> +                                                    let _ = cache.remove(&digest);
> +                                                    return Ok(());
> +                                                }
> +                                                _ => return Ok(()),
> +                                            }
> +                                        }
> +                                        Err(_) => return Ok(()),
> +                                    }
> +                                }
> +                            }
> +
>                              unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
>                                  |err| {
>                                      format_err!(
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 8b7333d80..df344974a 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1761,6 +1761,7 @@ impl DataStore {
>                  min_atime,
>                  &mut tmp_gc_status,
>                  worker,
> +                self.cache(),
>              )?;
>          } else {
>              self.inner.chunk_store.sweep_unused_chunks(
> @@ -1768,6 +1769,7 @@ impl DataStore {
>                  min_atime,
>                  &mut gc_status,
>                  worker,
> +                None,
>              )?;
>          }
>  
> -- 
> 2.47.3