[pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in chunk store iterator
Fabian Grünbichler
f.gruenbichler at proxmox.com
Wed Jan 14 10:41:42 CET 2026
On January 14, 2026 9:37 am, Christian Ebner wrote:
> On 1/13/26 11:23 AM, Fabian Grünbichler wrote:
>> On December 11, 2025 4:38 pm, Christian Ebner wrote:
>>> Optimizes the chunk filename check towards regular chunk files by
>>> explicitley checking for the correct length.
>>>
>>> While the check for ascii hexdigits needs to be stated twice, this
>>> avoids to check for the `.bad` extension if the chunk filename did
>>> already match the expected length.
>>
>> I don't get this part, we could still check first and only once that the
>> first 64 bytes are valid hex?
>>
>> if bytes.len() < 64 {
>> continue;
>> }
>>
>> if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
>> continue;
>> }
>
> But with the code below I'm done after 2 checks in the regular chunk
> digest case:
>
> `bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)`
>
> which is the one which is most likely and should be optimized for?
that's easy to do without writing the check twice though (with the added
benefit of stopping at the first non-hex character):
if bytes.iter().take(64).any(|c| !c.is_ascii_hexdigit) {
continue;
}
if bytes.len() == 64 { return .. };
if bytes.len() < 64 { continue }
if bytes.len() == 64 + .. && bytes.ends_with(..) { return .. }
if bytes.len() == 64 + .. && &bytes[64..XX] == ... { return .. }
I still think having the "too short" check up front makes sense - it's
super cheap, makes the code more readable *and* saves us the iteration
for such files..
>
> What I tried to tell with the commit message is that the
> bytes.iter().take(64).all(u8::is_ascii_hexdigit) is now written out
> twice, but only one of the 2 case will ever be checked.
>
>>
>> // now start looking at the length + potential extension
>>
>>>
>>> This will also help to better distinguish bad chunks and chunks
>>> used markers for s3 datastores in subsequent changes.
>>>
>>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>>> ---
>>> pbs-datastore/src/chunk_store.rs | 17 +++++++++++------
>>> 1 file changed, 11 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>>> index a5e5f6261..7980938ad 100644
>>> --- a/pbs-datastore/src/chunk_store.rs
>>> +++ b/pbs-datastore/src/chunk_store.rs
>>> @@ -315,15 +315,20 @@ impl ChunkStore {
>>> Some(Ok(entry)) => {
>>> // skip files if they're not a hash
>>> let bytes = entry.file_name().to_bytes();
>>> - if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
>>> - continue;
>>> +
>>> + if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>>> + {
>>> + return Some((Ok(entry), percentage, false));
>>> }
>>> - if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
>>> - continue;
>>> +
>>> + if bytes.len() == 64 + ".0.bad".len()
>>> + && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>>> + {
>>> + let bad = bytes.ends_with(b".bad");
>>> + return Some((Ok(entry), percentage, bad));
>>
>> while this mimics the old code, it is still broken (a chunk digest +
>> .fooba or any other 6-byte suffix that is not "??.bad" is returned as
>> non-bad chunk, since the length matches a bad chunk, but the extension
>> does not).
>
> That was the intention here, to keep this close to the previous
> behavior. But since we do this check only in the less likely case, I
> agree that adding the check for exact extension might be the better
> option here.
>
> Will adapt this accordingly, thanks!
>
>>
>>> }
>>>
>>> - let bad = bytes.ends_with(b".bad");
>>> - return Some((Ok(entry), percentage, bad));
>>> + continue;
>>> }
>>> Some(Err(err)) => {
>>> // stop after first error
>>> --
>>> 2.47.3
>>>
>>>
>>>
>>> _______________________________________________
>>> pbs-devel mailing list
>>> pbs-devel at lists.proxmox.com
>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>>
>>>
>>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>
>
More information about the pbs-devel
mailing list