[pbs-devel] [PATCH proxmox-backup v2 3/3] datastore: insert chunk marker and touch bad chunks in locked context

Christian Ebner c.ebner at proxmox.com
Mon Nov 10 13:39:44 CET 2025


On 11/10/25 9:31 AM, Fabian Grünbichler wrote:
> On November 6, 2025 6:13 pm, Christian Ebner wrote:
>> Assures that both the touching of bad chunks and the insertion of
>> missing chunk marker files are done while the chunk store mutex is
>> held, so that other operations see a consistent state.
>>
>> To achieve this, introduces a helper method which allows running a
>> callback in a locked context if the chunk file is missing.
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>> changes since version 1:
>> - not present in previous version
>>
>>   pbs-datastore/src/chunk_store.rs | 17 +++++++++++++
>>   pbs-datastore/src/datastore.rs   | 42 +++++++++++++++++++-------------
>>   2 files changed, 42 insertions(+), 17 deletions(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index b88a0a096..063bc55f6 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -212,6 +212,23 @@ impl ChunkStore {
>>           Ok(())
>>       }
>>   
>> +    /// Update the chunk files atime if it exists, call the provided callback inside a chunk store
>> +    /// locked scope otherwise.
>> +    pub(super) fn cond_touch_chunk_or_locked<T>(
>> +        &self,
>> +        digest: &[u8; 32],
>> +        callback: T,
>> +    ) -> Result<(), Error>
>> +    where
>> +        T: FnOnce() -> Result<(), Error>,
>> +    {
>> +        let _lock = self.mutex.lock();
>> +        if !self.cond_touch_chunk_no_lock(digest, false)? {
>> +            callback()?;
>> +        }
>> +        Ok(())
>> +    }
>> +
>>       /// Update the chunk files atime if it exists.
>>       ///
>>       /// If the chunk file does not exist, return with error if assert_exists is true, with
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 4527b40f4..b2f414ce1 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1302,15 +1302,16 @@ impl DataStore {
>>               match s3_client {
>>                   None => {
>>                       // Filesystem backend
>> -                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
>> +                    self.inner.chunk_store.cond_touch_chunk_or_locked(digest, || {
>>                           let hex = hex::encode(digest);
>>                           warn!(
>>                               "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
>>                           );
>>   
>> -                        // touch any corresponding .bad files to keep them around, meaning if a chunk is
>> -                        // rewritten correctly they will be removed automatically, as well as if no index
>> -                        // file requires the chunk anymore (won't get to this loop then)
>> +                        // touch any corresponding .bad files to keep them around, meaning if a
>> +                        // chunk is rewritten correctly they will be removed automatically, as well
>> +                        // as if no index file requires the chunk anymore (won't get to this loop
>> +                        // then)
>>                           for i in 0..=9 {
>>                               let bad_ext = format!("{i}.bad");
>>                               let mut bad_path = PathBuf::new();
>> @@ -1318,22 +1319,29 @@ impl DataStore {
>>                               bad_path.set_extension(bad_ext);
>>                               self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
>>                           }
>> -                    }
>> +                        Ok(())
>> +                    })?;
> 
> do we need to hold the mutex for touching the bad chunks? we don't when
> creating them during verification (should we?), we do when removing them
> during GC..
> 
> if we do, then we should probably have a
> 
> pub(super) fn cond_touch_bad_chunk(s?) in the chunk store
> 
> because each touch here is independent, and there is no overarching need
> to hold the mutex across all of them, and this would allow us to make
> cond_touch_path private..

No, these touches can indeed be independent: cleanup of the .bad files in 
GC only happens if the good chunk is not present, and the rename during 
verification will then simply use the next free bad filename slot.

> should we also touch bad chunks for the S3 case?

True, the bad chunk markers need to be touched for the S3 case as well, so 
they reflect the current state. Otherwise they will get incorrectly cleaned 
up in GC phase 2.
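
With a helper along the lines of the sketch above, the S3 branch could then 
simply reuse it next to the marker handling, e.g. (sketch):

    // also keep .bad markers alive when the chunk is backed by S3
    self.inner.chunk_store.cond_touch_bad_chunks(digest)?;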
>>                   }
>>                   Some(ref _s3_client) => {
>>                       // Update atime on local cache marker files.
>> -                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
>> -                        let (chunk_path, _digest) = self.chunk_path(digest);
>> -                        // Insert empty file as marker to tell GC phase2 that this is
>> -                        // a chunk still in-use, so to keep in the S3 object store.
>> -                        std::fs::File::options()
>> -                            .write(true)
>> -                            .create_new(true)
>> -                            .open(&chunk_path)
>> -                            .with_context(|| {
>> -                                format!("failed to create marker for chunk {}", hex::encode(digest))
>> -                            })?;
>> -                    }
>> +                    self.inner
>> +                        .chunk_store
>> +                        .cond_touch_chunk_or_locked(digest, || {
>> +                            let (chunk_path, _digest) = self.chunk_path(digest);
>> +                            // Insert empty file as marker to tell GC phase2 that this is
>> +                            // a chunk still in-use, so to keep in the S3 object store.
>> +                            std::fs::File::options()
>> +                                .write(true)
>> +                                .create_new(true)
>> +                                .open(&chunk_path)
>> +                                .with_context(|| {
>> +                                    format!(
>> +                                        "failed to create marker for chunk {}",
>> +                                        hex::encode(digest)
>> +                                    )
>> +                                })?;
>> +                            Ok(())
>> +                        })?;
> 
> AFAICT, we can fix this together with the other S3-races by obtaining
> the flock on the chunk here?
> 
> i.e.,
> 
> // without flock first, since a chunk missing is the unlikely path
> // (corruption detected by verify, or manual damage to the chunk store)
> if !cond_touch_chunk {
>    // now with flock, to protect the next two calls against concurrent
>    // insert+uploads/renames/..
>    flock {
>      // somebody else could have inserted it since we checked without
>      // locking above
>      if !cond_touch_chunk {
>        // insert empty marker only if chunk is not there
>        store.clear_chunk
>      }
>    }
> }


Okay, will do by moving the S3-specific logic into the regular path, which 
then covers both cases, and send a v2, now however as follow-up patches to 
be applied on top of [0], since that series is required for the per-chunk 
file locking.

[0] 
https://lore.proxmox.com/pbs-devel/20251110115627.280318-1-c.ebner@proxmox.com/T/
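
For reference, this is roughly what the double-checked pattern sketched 
above could look like in the S3 branch, assuming a per-chunk flock helper 
from [0], here called lock_chunk (placeholder name):

    // Check without the flock first, since a missing chunk is the unlikely path.
    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
        // Take the per-chunk flock to guard against concurrent insert/upload/rename.
        let _guard = self.inner.chunk_store.lock_chunk(digest)?;
        // Re-check: somebody else could have inserted the chunk in the meantime.
        if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
            let (chunk_path, _digest) = self.chunk_path(digest);
            // Insert an empty marker so GC phase 2 keeps the chunk in the S3 object store.
            std::fs::File::options()
                .write(true)
                .create_new(true)
                .open(&chunk_path)
                .with_context(|| {
                    format!("failed to create marker for chunk {}", hex::encode(digest))
                })?;
        }
    }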



