[pbs-devel] [pve-devel] [PATCH v2 proxmox-backup-qemu 05/11] access: use bigger cache and LRU chunk reader

Thomas Lamprecht t.lamprecht at proxmox.com
Wed Mar 17 14:59:05 CET 2021


On 17.03.21 14:37, Stefan Reiter wrote:
> On 16/03/2021 21:17, Thomas Lamprecht wrote:
>> On 03.03.21 10:56, Stefan Reiter wrote:
>>> Values chosen by fair dice roll, seems to be a good sweet spot on my
>>> machine where any less causes performance degradation but any more
>>> doesn't really make it go any faster.
>>>
>>> Keep in mind that those values are per drive in an actual restore.
>>>
>>> Signed-off-by: Stefan Reiter <s.reiter at proxmox.com>
>>> ---
>>>
>>> Depends on new proxmox-backup.
>>>
>>> v2:
>>> * unchanged
>>>
>>>   src/restore.rs | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/restore.rs b/src/restore.rs
>>> index 0790d7f..a1acce4 100644
>>> --- a/src/restore.rs
>>> +++ b/src/restore.rs
>>> @@ -218,15 +218,16 @@ impl RestoreTask {
>>>             let index = client.download_fixed_index(&manifest, &archive_name).await?;
>>>           let archive_size = index.index_bytes();
>>> -        let most_used = index.find_most_used_chunks(8);
>>> +        let most_used = index.find_most_used_chunks(16); // 64 MB most used cache
>>
>>
>>
>>>             let file_info = manifest.lookup_file_info(&archive_name)?;
>>>   -        let chunk_reader = RemoteChunkReader::new(
>>> +        let chunk_reader = RemoteChunkReader::new_lru_cached(
>>>               Arc::clone(&client),
>>>               self.crypt_config.clone(),
>>>               file_info.chunk_crypt_mode(),
>>>               most_used,
>>> +            64, // 256 MB LRU cache
>>
>> how does this work with low(er) memory situations? Lots of people do not over
>> dimension their memory that much, and especially the need for mass-recovery could
>> seem to correlate with reduced resource availability (a node failed, now I need
>> to restore X backups on my <test/old/other-already-in-use> node, so multiple
>> restore jobs may run in parallel, and they may each even have multiple disks,
>> so tens of GiB of memory just for the cache are not that unlikely).
> 
> This is a separate function from the regular restore, so it currently only affects live-restore. This is not an operation you would usually do under memory constraints anyway, and regular restore is unaffected if you just want the data.

And how exactly do you figure/argue that users won't use it if easily available?
Users *will* use this in memory-constrained environments as it gets their guest
up again faster; cue mass restore on a node without many resources left.
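As a rough example: with the 256 MiB LRU plus 64 MiB most-used cache per drive,
ten parallel restores with three disks each would already pin close to 10 GiB
just for the caches.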
 
> Upcoming single-file restore too though, I suppose, where it might make more sense...
> 
>>
>> How is the behavior, hard failure if memory is not available? Also, some archives
>> may be smaller than 256 MiB (EFI disk??) so there it'd be weird to have a 256 MiB
>> cache and get 64 most-used chunks if that's all/more than it would actually need
>> to be...
> 
> Yes, if memory is unavailable it is a hard error. Memory should not be pre-allocated however, so restoring this way will only ever use as much memory as the disk size (not accounting for overhead).

So basically RSS is increased in chunk-sized blocks. But an alloc error is not a hard
error here for the total operation; couldn't we catch that and continue with the LRU
size we actually have allocated?
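
Something along these lines maybe - just a rough sketch, the chunk size constant
and the halving strategy are made up here, this is not the actual reader code:

const CHUNK_SIZE: usize = 4 * 1024 * 1024; // 4 MiB fixed chunk size

/// Probe how many chunks we can currently afford to cache instead of
/// failing the whole restore on an allocation error.
fn pick_cache_capacity(wanted_chunks: usize) -> usize {
    let mut chunks = wanted_chunks;
    while chunks > 0 {
        let mut probe: Vec<u8> = Vec::new();
        // try_reserve_exact reports allocation failure instead of aborting
        if probe.try_reserve_exact(chunks * CHUNK_SIZE).is_ok() {
            return chunks; // this much memory is available right now
        }
        chunks /= 2; // halve the cache and try again
    }
    1 // always keep at least one chunk cached
}

// e.g. pick_cache_capacity(64) for the 256 MiB target

It's only a point-in-time probe of course, but it would turn a hard failure into
a smaller cache.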

> 
>>
>> There may be the reversed situation too, beefy fast node with lots of memory
>> and restore is used as recovery or migration but network bw/latency to PBS is not
>> that good - so bigger cache could be wanted.
> 
> The reason I chose the numbers I did was that I couldn't see any real performance benefits by going higher, though I didn't specifically test with slow networking.
> 
> I don't believe more cache would improve the situation there though; this is mostly to keep random access from the guest and the linear access from the block-stream operation from interfering with each other, and to allow multiple smaller guest reads within the same chunk to be served quickly.

What are the workloads you tested to be so sure about this?

From the above statement I'd think it would help for any workload with a working set
bigger than 256 MiB? So basically any production DB load (albeit that should be
handled by the DB's memory caching, so maybe not the best example).
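
For reference, the caching pattern as I understand it - hypothetical types, not
the real RemoteChunkReader - where the trouble starts once the working set no
longer fits and chunks keep getting evicted and re-downloaded:

use std::collections::{HashMap, VecDeque};

type Digest = [u8; 32];

struct ChunkLru {
    capacity: usize,                 // number of chunks to keep, e.g. 64
    map: HashMap<Digest, Vec<u8>>,   // digest -> decoded chunk data
    order: VecDeque<Digest>,         // front = least recently used
}

impl ChunkLru {
    /// Return the chunk for `digest`, calling `fetch` only on a cache miss.
    fn get_or_fetch(
        &mut self,
        digest: Digest,
        fetch: impl FnOnce(&Digest) -> Vec<u8>,
    ) -> &Vec<u8> {
        if self.map.contains_key(&digest) {
            // hit: small guest reads into the same chunk land here
            self.order.retain(|d| d != &digest);
        } else {
            if self.map.len() >= self.capacity {
                // miss with a full cache: evict the least recently used chunk
                if let Some(old) = self.order.pop_front() {
                    self.map.remove(&old);
                }
            }
            self.map.insert(digest, fetch(&digest)); // remote read happens here
        }
        self.order.push_back(digest);
        self.map.get(&digest).unwrap()
    }
}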

I'm just thinking that exposing this as a knob could help; it doesn't have to be
prominently placed, but it would be nice to have.

> 
>>
>> Maybe we could get the available memory and use that as hint, I mean as memory
>> usage can be highly dynamic it will never be perfect, but better than just ignoring
>> it..
> 
> If anything, I'd make it user-configurable - I don't think a heuristic would be a good choice here.

Yeah, a heuristic is not a good option as we cannot know what the system memory
situation will be in the future.

> 
> This way we could also set it smaller for single-file restore for example - on the other hand, that adds another parameter to the already somewhat cluttered QEMU<->Rust interface.

cue versioned structs incoming ;)
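
Hand-wavy sketch of what I mean, all names made up - a settings struct with its
size (or a version field) up front, so new knobs like the cache size can be added
without breaking older callers of the C interface:

#[repr(C)]
pub struct RestoreSettings {
    /// size of the struct as the caller knows it; lets the Rust side detect
    /// older callers and fall back to defaults for fields they don't know yet
    pub struct_size: u64,
    /// LRU chunk cache size in bytes, 0 = built-in default
    pub lru_cache_size: u64,
    /// number of "most used" chunks to pre-fetch, 0 = built-in default
    pub most_used_chunks: u64,
}

impl Default for RestoreSettings {
    fn default() -> Self {
        Self {
            struct_size: std::mem::size_of::<Self>() as u64,
            lru_cache_size: 0,
            most_used_chunks: 0,
        }
    }
}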

> 
>>
>>>           );
>>>             let reader = AsyncIndexReader::new(index, chunk_reader);
>>>
>>
