[pbs-devel] [PATCH v3 proxmox-backup 49/58] client: backup: increase average chunk size for metadata
Fabian Grünbichler
f.gruenbichler at proxmox.com
Fri Apr 5 11:42:39 CEST 2024
Quoting Christian Ebner (2024-03-28 13:36:58)
> Use double the average chunk size for the metadata archive compared
> to the payload stream. This not only reduces the number of unique
> chunks produced by the metadata archive, which chunks poorly because
> it mostly consists of many small, localized changes, but also has the
> positive side effect of producing larger, well compressible chunks.
> The reduced number of chunks also improves access performance, since
> fewer download requests are needed and cacheability increases.
>
> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
> ---
> changes since version 2:
> - not present in previous version
>
> proxmox-backup-client/src/main.rs | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
> index 66dcaa63e..4aad0ff8c 100644
> --- a/proxmox-backup-client/src/main.rs
> +++ b/proxmox-backup-client/src/main.rs
> @@ -78,6 +78,8 @@ pub(crate) use helper::*;
>  pub mod key;
>  pub mod namespace;
>
> +const AVG_METADATA_CHUNK_SIZE: usize = 8 * 1024 * 1024;
> +
> fn record_repository(repo: &BackupRepository) {
> let base = match BaseDirectories::with_prefix("proxmox-backup") {
> Ok(v) => v,
> @@ -209,7 +211,15 @@ async fn backup_directory<P: AsRef<Path>>(
>          payload_target.is_some(),
>      )?;
>
> -    let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size, None);
> +    let avg_chunk_size = if payload_stream.is_none() {
> +        chunk_size
> +    } else {
> +        chunk_size
> +            .map(|size| 2 * size)
what if the user provided us with a very small chunk size? should we have a lower bound here?
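e.g., clamping could look something like this (rough, untested sketch;
the MIN_AVG_METADATA_CHUNK_SIZE constant and its value are made up):

    // hypothetical floor so a tiny user-provided chunk size doesn't
    // produce pathologically small metadata chunks
    const MIN_AVG_METADATA_CHUNK_SIZE: usize = 64 * 1024;

    let avg_chunk_size = if payload_stream.is_none() {
        chunk_size
    } else {
        chunk_size
            .map(|size| (2 * size).max(MIN_AVG_METADATA_CHUNK_SIZE))
            .or_else(|| Some(AVG_METADATA_CHUNK_SIZE))
    };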
I still wonder whether getting rid of the sliding window chunker wouldn't be a
net benefit for the split archive case. for the metadata stream it probably
doesn't matter much (it has a lot of churn, is small and compresses well).
for the payload stream, simply accumulating 1..N files (or rather, their
contents) in a chunk until a certain size threshold is reached might perform
better (as in, both be faster than the current chunker, and give us more/better
re-usable chunks) - see the sketch below.
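something along these lines, maybe (very rough sketch, all names made up;
the point being that cuts only ever happen at file boundaries, so an
unchanged run of files re-produces the same chunks):

    /// accumulate whole file payloads and cut a chunk once the
    /// configured size threshold is crossed
    struct AccumulatingChunker {
        threshold: usize,
        buffer: Vec<u8>,
    }

    impl AccumulatingChunker {
        fn new(threshold: usize) -> Self {
            Self { threshold, buffer: Vec::new() }
        }

        /// feed one file's contents, returns a finished chunk once
        /// the accumulated data reaches the threshold
        fn add_file(&mut self, contents: &[u8]) -> Option<Vec<u8>> {
            self.buffer.extend_from_slice(contents);
            if self.buffer.len() >= self.threshold {
                Some(std::mem::take(&mut self.buffer))
            } else {
                None
            }
        }

        /// flush whatever is left at the end of the stream
        fn finish(self) -> Option<Vec<u8>> {
            if self.buffer.is_empty() {
                None
            } else {
                Some(self.buffer)
            }
        }
    }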
> +            .or_else(|| Some(AVG_METADATA_CHUNK_SIZE))
> +    };
> +
> +    let mut chunk_stream = ChunkStream::new(pxar_stream, avg_chunk_size, None);
>      let (tx, rx) = mpsc::channel(10); // allow to buffer 10 chunks
>
>      let stream = ReceiverStream::new(rx).map_err(Error::from);
> --
> 2.39.2
>