[pbs-devel] [PATCH proxmox-backup] GC: s3: fix local marker cleanup for unreferenced, s3 only chunks
Fabian Grünbichler
f.gruenbichler at proxmox.com
Mon Nov 24 09:29:59 CET 2025
On November 24, 2025 9:22 am, Christian Ebner wrote:
> On 11/24/25 9:13 AM, Fabian Grünbichler wrote:
>> On November 24, 2025 8:35 am, Christian Ebner wrote:
>>> Thanks for taking a look at this already!
>>>
>>> On 11/22/25 3:55 PM, Thomas Lamprecht wrote:
>>>> Am 22.11.25 um 11:41 schrieb Christian Ebner:
>>>>> If a chunk object is located only on the s3 object store, is not
>>>>> referenced by any index file and has no local marker file, it is
>>>>> marked for cleanup by pretending an atime equal to the unix epoch.
>>>>>
>>>>> While this will mark the chunk for deletion from the backend and
>>>>> include it in the delete list for the next s3 delete objects call, it
>>>>> also leads to an attempt to clean up the chunk marker and LRU cache
>>>>> entry locally, which fails since there is no marker to clean up.
>>>>>
>>>>> In order to treat this edge case with the same cleanup logic, simply
>>>>> insert the marker file if not present, so it gets cleaned up
>>>>> correctly afterwards. This should not happen under normal datastore
>>>>> operation anyway; it is most likely to appear after re-creating the
>>>>> datastore from existing bucket contents containing such unreferenced
>>>>> chunks.
>>>>>
>>>>> Fixes: https://forum.proxmox.com/threads/176567/
>>>>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>>>>> ---
>>>>> pbs-datastore/src/datastore.rs | 9 +++++----
>>>>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>>>>> index 65299cca9..a24392d9f 100644
>>>>> --- a/pbs-datastore/src/datastore.rs
>>>>> +++ b/pbs-datastore/src/datastore.rs
>>>>> @@ -1711,11 +1711,12 @@ impl DataStore {
>>>>> let atime = match std::fs::metadata(&chunk_path) {
>>>>> Ok(stat) => stat.accessed()?,
>>>>> Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
>>>>> + unsafe {
>>>>> + // chunk store lock held
>>>>> + // insert marker unconditionally, cleaned up again below if required
>>>>> + self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
>>>>> + }
>>>>> if self.inner.chunk_store.clear_chunk_expected_mark(&digest)? {
>>>>> - unsafe {
>>>>> - // chunk store lock held
>>>>> - self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
>>>>> - }
>>>>> SystemTime::now()
>>>>
>>>> Why not drop that whole branch instead, it does not really make sense IIUC.
>>>
>>> No, this branch is needed. It avoids API calls to the s3 backend in
>>> case the chunk is referenced by an index file (as detected during
>>> phase 1) but the local marker file is not present. In that case we do
>>> not want to check existence on the backend directly (which we need in
>>> order to not mark a chunk that is not actually present on the
>>> backend), but defer that check to phase 2, where we list all chunks
>>> anyway. This is done by flagging the chunk via the <digest>.using
>>> file.
>>>
>>> Here, if the chunk is encountered during the s3 object store listing
>>> but the local file is missing, we check and clear the chunk expected
>>> marker, which, if present, tells us the chunk is still in use. If not,
>>> it is safe to remove it from the backend.
>>>
>>>>
>>>> And `replace_chunk_with_marker` replaces the chunk file directly (no extension) whereas
>>>> `clear_chunk_expected_mark` checks the chunk.using file, so does your reordering even
>>>> change anything, or is there a bug in `replace_chunk_with_marker`?
>>>
>>> `replace_chunk_with_marker` replaces a full chunk file with an empty
>>> marker, but also creates the empty marker if the original file is not
>>> present, so in this particular case it is actually used to create the
>>> marker, not to evict chunks from local datastore cache as under normal
>>> operation. I can send a patch to rename that method to make that clear.
>>>
>>>>
>>>> And independent of that, would it be better (more performant and less confusing) if
>>>> we ignore the "not present in LRU or no marker" in that edge case rather than creating
>>>> a file (doing more IO) just to delete that then again?
>>>
>>> I can of course also do that by checking a flag in the remove
>>> callback. I opted against it, however, since the above is a very
>>> unlikely case, as the s3 backend and local datastore cache should be
>>> in sync most of the time.
>>> That check would be performed for each chunk being removed, whereas
>>> this happens only once, when the chunk is still present on the backend
>>> but not in the local datastore cache.
>>>
>>> The additional IO is therefore justified IMO.
>>>
>>> I could of course also go the route of just setting a boolean flag and
>>> checking that in the callback?
>>
>> we basically already have such a boolean marker - we set `atime` to 0 in
>> this case (and only this case), and we could just ignore the removal
>> errors then? possibly limited to just ignoring ENOENT?
>
> That's a good idea! So I will perform the additional checks based on that.
alternatively, skipping cond_sweep_chunk entirely would also work (this
is with `-w`):
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 65299cca9..b9debd2b1 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1709,21 +1709,22 @@ impl DataStore {
// Check local markers (created or atime updated during phase1) and
// keep or delete chunk based on that.
let atime = match std::fs::metadata(&chunk_path) {
- Ok(stat) => stat.accessed()?,
+ Ok(stat) => Some(stat.accessed()?),
Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
if self.inner.chunk_store.clear_chunk_expected_mark(&digest)? {
unsafe {
// chunk store lock held
self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
}
- SystemTime::now()
+ Some(SystemTime::now())
} else {
- // File not found, delete by setting atime to unix epoch
- SystemTime::UNIX_EPOCH
+ // File not found, only delete from S3
+ None
}
}
Err(err) => return Err(err.into()),
};
+ if let Some(atime) = atime {
let atime = atime.duration_since(SystemTime::UNIX_EPOCH)?.as_secs() as i64;
unsafe {
@@ -1752,6 +1753,13 @@ impl DataStore {
},
)?;
}
+ } else {
+ // set age based on first insertion
+ if delete_list.is_empty() {
+ delete_list_age = epoch_i64();
+ }
+ delete_list.push((content.key, _chunk_guard));
+ }
chunk_count += 1;