[pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations

Christian Ebner c.ebner at proxmox.com
Mon Oct 14 11:53:02 CEST 2024


On 10/14/24 11:41, Fabian Grünbichler wrote:
> 
>> Christian Ebner <c.ebner at proxmox.com> hat am 14.10.2024 11:32 CEST geschrieben:
>> On 10/10/24 16:48, Fabian Grünbichler wrote:
>>>> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
>>>> +//
>>>> +// For fixed indexes, the size must be provided as given by the index reader.
>>>> +#[allow(clippy::too_many_arguments)]
>>>> +async fn push_index<'a>(
>>>> +    filename: &'a str,
>>>> +    index: impl IndexFile + Send + 'static,
>>>> +    chunk_reader: Arc<dyn AsyncReadChunk>,
>>>> +    backup_writer: &BackupWriter,
>>>> +    manifest: &mut BackupManifest,
>>>> +    crypt_mode: CryptMode,
>>>> +    size: Option<u64>,
>>>> +    known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
>>>> +) -> Result<SyncStats, Error> {
>>>> +    let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
>>>> +    let mut chunk_infos =
>>>> +        stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
>>>
>>> so this iterates over all the chunks in the index..
>>>
>>>> +
>>>> +    tokio::spawn(async move {
>>>> +        while let Some(chunk_info) = chunk_infos.next().await {
>>>> +            let chunk_info = chunk_reader
>>>> +                .read_raw_chunk(&chunk_info.digest)
>>>
>>> and this reads them
>>>
>>>> +                .await
>>>> +                .map(|chunk| ChunkInfo {
>>>> +                    chunk,
>>>> +                    digest: chunk_info.digest,
>>>> +                    chunk_len: chunk_info.size(),
>>>> +                    offset: chunk_info.range.start,
>>>> +                });
>>>> +            let _ = upload_channel_tx.send(chunk_info).await;
>>>
>>> and sends them further along to the upload code.. which will then (in
>>> many cases) throw away all that data we just read because it's already
>>> on the target and we know that because of the previous manifest..
>>>
>>> wouldn't it be better to deduplicate here already, and instead of
>>> reading known chunks over and over again, just tell the server to
>>> re-register them? or am I missing something here? :)
>>
>> Good catch, this is indeed a potentially huge performance bottleneck!
>>
>> Fixed this by moving the known chunks check here (as suggested) and
>> streaming a `MergedChunkInfo` instead of a `ChunkInfo`, which allows
>> sending only the chunk's digest and size over to the backup writer.
>> This way, known chunks are never read.
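The filtering in `push_index` now looks roughly like this (a sketch
only, assuming the `MergedChunkInfo::Known`/`New` variants from
pbs-client and a std `Mutex` around `known_chunks`; the actual patch
may differ in detail):

// Sketch: check against the previously known chunks before reading, so
// known chunks are only re-registered by offset/digest, never read locally.
let known_chunks = Arc::clone(known_chunks);
tokio::spawn(async move {
    while let Some(chunk_info) = chunk_infos.next().await {
        let is_known = known_chunks
            .lock()
            .unwrap()
            .contains(&chunk_info.digest);

        let merged = if is_known {
            // Already present on the target: send only offset and digest.
            Ok(MergedChunkInfo::Known(vec![(
                chunk_info.range.start,
                chunk_info.digest,
            )]))
        } else {
            // New chunk: read the raw data and pass it on for upload.
            chunk_reader
                .read_raw_chunk(&chunk_info.digest)
                .await
                .map(|chunk| {
                    MergedChunkInfo::New(ChunkInfo {
                        chunk,
                        digest: chunk_info.digest,
                        chunk_len: chunk_info.size(),
                        offset: chunk_info.range.start,
                    })
                })
        };
        let _ = upload_channel_tx.send(merged).await;
    }
});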
> 
> this would be one of the few cases btw where re-uploading a chunk that is missing on the server side (for whatever reason) would be possible - not sure how easy it would be to integrate though ;)

Well, unfortunately not directly. Here the chunk is only sent via the 
channel to the backup writer, which further processes the stream and 
transforms it into futures for the upload requests. So the actual 
failure will only surface there...
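
To illustrate the point (not PBS code, `upload` is just a hypothetical
stand-in for the actual request):

use anyhow::Error;
use tokio::sync::mpsc;

// Hypothetical stand-in for the actual upload request.
async fn upload(_chunk: Vec<u8>) -> Result<(), Error> {
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    let (tx, mut rx) = mpsc::channel::<Vec<u8>>(20);

    // Consumer side: the upload futures are created and awaited here, so
    // this is also the only place where an upload failure shows up.
    tokio::spawn(async move {
        while let Some(chunk) = rx.recv().await {
            if let Err(err) = upload(chunk).await {
                eprintln!("upload failed: {err}");
            }
        }
    });

    // Producer side: a successful send() only means the chunk entered the
    // channel, it says nothing about the outcome of the upload.
    tx.send(vec![0u8; 4096]).await?;
    Ok(())
}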

In order to re-upload corrupt/missing chunks, I think a different, 
completely decoupled approach might be better (at least for the sync).

E.g. the backup writer could get access to a chunk buffer, which keeps 
the chunks in memory until the upload has succeeded. This could also 
include lookup mechanisms for the local/remote chunk store, or 
re-generating required chunks by reading from the block device in case 
of dirty bitmap tracking...
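
A minimal sketch of what such a buffer could look like (all names
hypothetical, independent of the current `BackupWriter` API):

use std::collections::HashMap;
use std::sync::Mutex;

use anyhow::{bail, Error};

// Hypothetical in-memory buffer that keeps raw chunk data around until the
// target has acknowledged the upload, so missing/corrupt chunks could be
// re-sent without reading them from the source again.
#[derive(Default)]
struct ChunkBuffer {
    pending: Mutex<HashMap<[u8; 32], Vec<u8>>>,
}

impl ChunkBuffer {
    // Buffer the raw chunk data before handing it to the upload stream.
    fn insert(&self, digest: [u8; 32], raw: Vec<u8>) {
        self.pending.lock().unwrap().insert(digest, raw);
    }

    // Drop the buffered data once the target confirmed the chunk.
    fn acknowledge(&self, digest: &[u8; 32]) {
        self.pending.lock().unwrap().remove(digest);
    }

    // Fetch buffered data for a re-upload attempt; a real implementation
    // could fall back to the local/remote chunk store here, or re-read the
    // block device when dirty bitmap tracking is available.
    fn for_retry(&self, digest: &[u8; 32]) -> Result<Vec<u8>, Error> {
        match self.pending.lock().unwrap().get(digest) {
            Some(raw) => Ok(raw.clone()),
            None => bail!("chunk no longer buffered"),
        }
    }
}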

The re-upload in case of reused payload chunks for `ppxar` archives is 
trickier...



