[pbs-devel] [PATCH v3 proxmox-backup 43/58] client: pxar: implement store to insert chunks on caching

Christian Ebner c.ebner at proxmox.com
Tue Apr 9 11:12:14 CEST 2024


On 4/5/24 09:52, Fabian Grünbichler wrote:
> Quoting Christian Ebner (2024-03-28 13:36:52)
>> In preparation for the look-ahead caching used to temporarily store
>> entries before encoding them in the pxar archive, making it possible
>> to decide whether to re-use or re-encode regular file entries.
>>
>> Allows inserting and storing reused chunks in the archiver,
>> deduplicating chunks upon insert when possible.
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>> changes since version 2:
>> - Strongly adapted and refactored: keep track also of paddings
>>    introduced by reusing the chunks, making a suggestion whether to
>>    re-use, re-encode or check next entry based on threshold
>> - completely removed code which allowed to calculate offsets based on
>>    chunks found in the middle, they must either be a continuation of the
>>    end or be added after, otherwise offsets are not monotonically
>>    increasing, which is required for sequential restore
>>
>>   pbs-client/src/pxar/create.rs | 126 +++++++++++++++++++++++++++++++++-
>>   1 file changed, 125 insertions(+), 1 deletion(-)
>>
>> diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
>> index 335e3556f..95a91a59b 100644
>> --- a/pbs-client/src/pxar/create.rs
>> +++ b/pbs-client/src/pxar/create.rs
>> @@ -20,7 +20,7 @@ use pathpatterns::{MatchEntry, MatchFlag, MatchList, MatchType, PatternFlag};
>>   use pbs_datastore::index::IndexFile;
>>   use proxmox_sys::error::SysError;
>>   use pxar::accessor::aio::Accessor;
>> -use pxar::encoder::{LinkOffset, SeqWrite};
>> +use pxar::encoder::{LinkOffset, PayloadOffset, SeqWrite};
>>   use pxar::Metadata;
>>   
>>   use proxmox_io::vec;
>> @@ -36,6 +36,128 @@ use crate::pxar::metadata::errno_is_unsupported;
>>   use crate::pxar::tools::assert_single_path_component;
>>   use crate::pxar::Flags;
>>   
>> +const CHUNK_PADDING_THRESHOLD: f64 = 0.1;
>> +
>> +#[derive(Default)]
>> +struct ReusedChunks {
>> +    start_boundary: PayloadOffset,
>> +    total: PayloadOffset,
>> +    padding: u64,
>> +    chunks: Vec<(u64, ReusableDynamicEntry)>,
>> +    must_flush_first: bool,
>> +    suggestion: Suggested,
>> +}
>> +
>> +#[derive(Copy, Clone, Default)]
>> +enum Suggested {
>> +    #[default]
>> +    CheckNext,
>> +    Reuse,
>> +    Reencode,
>> +}
> 
> this is a bit of a misnomer - it's not a suggestion, it's what is going to
> happen ;) maybe `action` or even `result` or something similar might be a
> better fit? or something going into the direction of "decision"?
> 

Renamed this to `Action`, as that seemed the better fit of the two
suggested options, and I could not come up with a better name.
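
For reference, a minimal sketch of how the renamed enum might look in
the next version (same variants as `Suggested` above, only the name
changes):

#[derive(Copy, Clone, Default)]
enum Action {
    #[default]
    CheckNext,
    Reuse,
    Reencode,
}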

>> +
>> +impl ReusedChunks {
>> +    fn new() -> Self {
>> +        Self::default()
>> +    }
> 
> this is not needed, can just use default()

Removed

> 
>> +
>> +    fn start_boundary(&self) -> PayloadOffset {
>> +        self.start_boundary
>> +    }
> 
> is this needed? access is only local anyway, and we don't have a similar helper
> for self.chunks ;)

Yes, removed it... my intention was to maybe move this to a different
sub-module, but in the end it remained where it is now.

> 
>> +
>> +    fn is_empty(&self) -> bool {
>> +        self.chunks.is_empty()
>> +    }
>> +
>> +    fn suggested(&self) -> Suggested {
>> +        self.suggestion
> 
> same question here..

same as above

> 
>> +    }
>> +
>> +    fn insert(
>> +        &mut self,
>> +        indices: Vec<ReusableDynamicEntry>,
>> +        boundary: PayloadOffset,
>> +        start_padding: u64,
>> +        end_padding: u64,
>> +    ) -> PayloadOffset {
> 
> another fn that definitely would benefit from some comments describing the
> thought processes behind the logic :)

added some more explanation as comments

> 
>> +        if self.is_empty() {
>> +            self.start_boundary = boundary;
>> +        }
> 
> okay, so this is actually just set
> 
>> +
>> +        if let Some(offset) = self.last_digest_matched(&indices) {
>> +            if let Some((padding, last)) = self.chunks.last_mut() {
> 
> we already know a last chunk must exist (else last_digest_matched wouldn't have
> returned). that means last_digest_matched could just return the last chunk?

True, but then the mutable reference to self would block further
modifications, therefore I kept this as is.
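
To illustrate with a toy example (assumed names and types, not the
actual patch code): returning the index of the last chunk instead of a
mutable reference would side-step the borrow conflict, since `self` is
then not held by a long-lived borrow:

struct Chunks {
    padding: u64,
    list: Vec<(u64, u64)>, // (per-chunk padding, chunk size)
}

impl Chunks {
    fn last_matched_index(&self) -> Option<usize> {
        // stand-in for the digest comparison in last_digest_matched
        if self.list.is_empty() {
            None
        } else {
            Some(self.list.len() - 1)
        }
    }

    fn insert(&mut self, start_padding: u64) {
        if let Some(idx) = self.last_matched_index() {
            let remaining = self.list[idx].1 - start_padding;
            self.list[idx].0 -= remaining;
            // no mutable borrow of `self` is held here, so updating
            // the total still compiles:
            self.padding -= remaining;
        }
    }
}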

> 
>> +                // Existing chunk, update padding based on pre-existing one
>> +                // Start padding is expected to be larger than previous padding
> 
> should we validate this expectation? or is it already validated somewhere else?
> and also, isn't start_padding by definition always smaller than last.size()?

Will opt for incorporating the padding into the `ReusableDynamicEntry`
itself, so there is no need to check this independently, as it is then
inherent to the entry.
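
A rough sketch of that idea (the field layout here is an assumption for
illustration, not the actual pxar type):

struct ReusableDynamicEntry {
    size: u64,
    // start/end padding accounted directly on the entry, so it no
    // longer needs to be tracked in a separate tuple alongside it
    padding: u64,
    digest: [u8; 32],
}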

> 
>> +                *padding += start_padding - last.size();
>> +                self.padding += start_padding - last.size();
> 
> (see below) here we correct the per-chunk padding, but only for a partial chunk that is continued..
> 
> if I understand this part correctly, here we basically want to adapt from a
> potential end_padding of the last chunk, if it was:
> 
> A|A|A|P|P|P|P
> 
> and we now have
> 
> P|P|P|P|P|B|B
> 
> we want to end up with just 2 'P's in the middle? isn't start_padding - size
> the count of the payload? so what we actually want is
> 
> *padding -= (last.size() - start_padding)
> 
> ? IMHO that makes the intention much more readable, especially if you factor out

Yes, in this case it was possible to deduplicate the first chunk of the
list to append, meaning that at least the last chunk of the previous
file and the first chunk of the new file are the same (this happens if
both files are fully or partially contained within that chunk).

Therefore the chunk is not inserted again; instead, the padding of the
already present entry is updated, by first subtracting the used part
and the possible end_padding (note that this is not equal to the
payload size, which might even cover more than one chunk), with the
end_padding being re-added afterwards.

> 
> let payload_size = last.size() - start_padding;
> *padding -= payload_size;
> self.padding -= payload_size;

Agreed, that makes it more readable, but this is not the payload size,
so it will be referred to as `remaining` in the upcoming version of the
patches.

So we have:

chunk_size = start_padding + used + end_padding

and call:

remaining = used + end_padding = chunk_size - start_padding

where start_padding >= 0 and end_padding >= 0 hold if all the payload
is contained within one chunk. If the file payload covers multiple
chunks, this is handled by updating the corresponding paddings for just
the start and end chunk in the list.
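
Expressed as code, the corrected update for a continued chunk could
then look like this (a minimal sketch using the naming above; the
helper function shape is an assumption):

// Continued chunk: the previous file's `used + end_padding` part of
// this chunk is now occupied by the new file's payload, so it no
// longer counts as padding.
fn continue_last_chunk(
    chunk_padding: &mut u64,
    total_padding: &mut u64,
    chunk_size: u64,
    start_padding: u64,
) {
    let remaining = chunk_size - start_padding; // used + end_padding
    *chunk_padding -= remaining;
    *total_padding -= remaining;
}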

> 
> if we want to be extra careful, we could even add the three checks/assertions here:
> - start_padding must be smaller than the chunk size
> - both chunk and total padding must be bigger than the payload size
> 
>> +            }
>> +
>> +            for chunk in indices.into_iter().skip(1) {
>> +                self.total = self.total.add(chunk.size());
>> +                self.chunks.push((0, chunk));
> 
> here we push the second and later chunks with 0 padding
> 
>> +            }
>> +
>> +            if let Some((padding, _last)) = self.chunks.last_mut() {
>> +                *padding += end_padding;
>> +                self.padding += end_padding;
> 
> and here we account for the end_padding of the last chunk, which might actually
> be the same as the first chunk, but that works out..

Yes, this is intentionally updated here, to not have to double-account
for the end padding if there is only that one chunk, as it is
independent of whether this is also the first chunk.

> 
>> +            }
>> +
>> +            let padding_ratio = self.padding as f64 / self.total.raw() as f64;
>> +            if self.chunks.len() > 1 && padding_ratio < CHUNK_PADDING_THRESHOLD {
>> +                self.suggestion = Suggested::Reuse;
>> +            }
> 
> see below
> 
>> +
>> +            self.start_boundary.add(offset + start_padding)
>> +        } else {
>> +            let offset = self.total.raw();
>> +
>> +            if let Some(first) = indices.first() {
>> +                self.total = self.total.add(first.size());
>> +                self.chunks.push((start_padding, first.clone()));
>> +                // New chunk, all start padding counts
>> +                self.padding += start_padding;
>> +            }
>> +
>> +            for chunk in indices.into_iter().skip(1) {
>> +                self.total = self.total.add(chunk.size());
>> +                self.chunks.push((chunk.size(), chunk));
> 
> so here we count the full chunk size as padding (for the per-chunk stats), but
> don't count it for the total padding? I think this should be 0 just like above?

Yes, this is indeed incorrect and should be set to 0 here. These chunks 
don't add padding since they are fully used.

> 
> this and the handling of the first chunk could actually be combined:
> 
> for (idx, chunk) in indices.into_iter().enumerate() {
> 	self.total = self.total.add(chunk.size());
> 	let chunk_padding = if idx == 0 { self.padding += start_padding; start_padding } else { 0 };
> 	self.chunks.push((chunk_padding, chunk));
> }

Yeah, that is more compact; it does, however, require the index check
for each item in the iteration? But that should not cost much, I guess.
Will also add the end_padding accounting in here using the same logic.
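
A sketch of that combined loop including the end_padding accounting
(generic over the entry type and hypothetical in shape, not the
committed code):

fn push_chunks<E>(
    chunks: &mut Vec<(u64, E)>,
    total_padding: &mut u64,
    indices: Vec<E>,
    start_padding: u64,
    end_padding: u64,
) {
    for (idx, chunk) in indices.into_iter().enumerate() {
        // only the first chunk carries the start padding
        let chunk_padding = if idx == 0 { start_padding } else { 0 };
        *total_padding += chunk_padding;
        chunks.push((chunk_padding, chunk));
    }
    // the end padding always lands on the last chunk, which may
    // coincide with the first one
    if let Some((padding, _last)) = chunks.last_mut() {
        *padding += end_padding;
        *total_padding += end_padding;
    }
}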

> 
> or we could make start_padding mut, and do
> 
> self.padding += start_padding;
> for chunk in indices.into_iter() {
> 	self.total = self.total.add(chunk.size());
> 	self.chunks.push((start_padding, chunk));
> 	start_padding = 0; // only first chunk counts
> }
> 
>> +            }
>> +
>> +            if let Some((padding, _last)) = self.chunks.last_mut() {
>> +                *padding += end_padding;
>> +                self.padding += end_padding;
>> +            }
>> +
>> +            if self.chunks.len() > 2 {
> 
> so if we insert more than two chunks without a continuation, all of them are
> accounted for as full of padding but they are still re-used if start and end
> padding are below the threshold ;)

If a file covers more than one chunk while adding less than the given
threshold value in padding, then it should be reused, yes. Now that the
paddings are actually set correctly for this branch, that is the intention.

> 
>> +                let padding_ratio = self.padding as f64 / self.total.raw() as f64;
>> +                if padding_ratio < CHUNK_PADDING_THRESHOLD {
>> +                    self.suggestion = Suggested::Reuse;
>> +                } else {
>> +                    self.suggestion = Suggested::Reencode;
>> +                }
> 
> we could just return the suggestion instead of storing it - it's only ever needed right after `insert` anyway?

Agreed, opted for that one

> 
> this calculation above seems to not handle some corner cases though.
> 
>   if I have the following sequence
> 
> |P|A|A|A|A|A|B|P|P|P|P|P|
> |    C1     |    C2     |
> 
> where P represent padding parts, A and B are file contents of two files, and C1
> and C2 are two chunks. let's say both A and B are re-usable files. first A is
> resolved via the index and inserted, but since it's the first chunk, there is
> no "suggestion" yet (CheckNext). now B is resolved and inserted - it doesn't
> continue a partial previous chunk, so we take the else branch. but now the
> (total) padding ratio is skewed because of B, even though A on its own would
> have been perfectly fine to be re-used.. (let's say we then insert another
> chunk different to C1 and C2, depending on the exact ratios that one might lead
> to the whole sequence being re-used or not, even though C2 should not be
> re-used, C1 should, and the hypothetical C3 might or might not!)
> 
> basically, as soon as we have a break in the chunk sequence (file A followed by
> file B, with no chunk overlap) we can treat each part on its own?

True, adapted the check here so that even a single chunk meeting the
padding threshold already results in a re-use.
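
A rough sketch of such a per-chunk check (hypothetical helper; the
threshold semantics follow the patch):

// A chunk is re-usable on its own if its padding ratio stays below the
// threshold, so one heavily padded chunk cannot veto re-use of the
// others in the sequence.
fn chunk_reusable(chunk_padding: u64, chunk_size: u64) -> bool {
    (chunk_padding as f64) / (chunk_size as f64) < CHUNK_PADDING_THRESHOLD
}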
