[pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in chunk store iterator
Christian Ebner
c.ebner at proxmox.com
Wed Jan 14 09:37:59 CET 2026
On 1/13/26 11:23 AM, Fabian Grünbichler wrote:
> On December 11, 2025 4:38 pm, Christian Ebner wrote:
>> Optimizes the chunk filename check towards regular chunk files by
>> explicitley checking for the correct length.
>>
>> While the check for ascii hexdigits needs to be stated twice, this
>> avoids to check for the `.bad` extension if the chunk filename did
>> already match the expected length.
>
> I don't get this part, we could still check first and only once that the
> first 64 bytes are valid hex?
>
> if bytes.len() < 64 {
> continue;
> }
>
> if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
> continue;
> }
But with the code below I'm done after 2 checks in the regular chunk
digest case:
`bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)`
which is the one which is most likely and should be optimized for?
What I tried to tell with the commit message is that the
bytes.iter().take(64).all(u8::is_ascii_hexdigit) is now written out
twice, but only one of the 2 case will ever be checked.
>
> // now start looking at the length + potential extension
>
>>
>> This will also help to better distinguish bad chunks and chunks
>> used markers for s3 datastores in subsequent changes.
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>> pbs-datastore/src/chunk_store.rs | 17 +++++++++++------
>> 1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index a5e5f6261..7980938ad 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -315,15 +315,20 @@ impl ChunkStore {
>> Some(Ok(entry)) => {
>> // skip files if they're not a hash
>> let bytes = entry.file_name().to_bytes();
>> - if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
>> - continue;
>> +
>> + if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>> + {
>> + return Some((Ok(entry), percentage, false));
>> }
>> - if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
>> - continue;
>> +
>> + if bytes.len() == 64 + ".0.bad".len()
>> + && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>> + {
>> + let bad = bytes.ends_with(b".bad");
>> + return Some((Ok(entry), percentage, bad));
>
> while this mimics the old code, it is still broken (a chunk digest +
> .fooba or any other 6-byte suffix that is not "??.bad" is returned as
> non-bad chunk, since the length matches a bad chunk, but the extension
> does not).
That was the intention here, to keep this close to the previous
behavior. But since we do this check only in the less likely case, I
agree that adding the check for exact extension might be the better
option here.
Will adapt this accordingly, thanks!
>
>> }
>>
>> - let bad = bytes.ends_with(b".bad");
>> - return Some((Ok(entry), percentage, bad));
>> + continue;
>> }
>> Some(Err(err)) => {
>> // stop after first error
>> --
>> 2.47.3
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
>
More information about the pbs-devel
mailing list