[pbs-devel] [RFC pxar 4/20] fix #3174: metadata: impl fn to calc byte size

Thu Sep 28 11:00:13 CEST 2023

On Thu, Sep 28, 2023 at 10:07:40AM +0200, Christian Ebner wrote:
> I was giving this some more thought and am not really convinced that sending
> this through an encoder instance, which digests the encoded byte stream and counts
> the bytes, is the right approach here.

How about moving the `encode_metadata` logic from `Encoder` into
`Metadata`, taking an `Option<&mut SeqWrite>` parameter rather than a
full Encoder, so that the encoding and counting logic live right next to
each other and differ only in whether the writer is `Some`?
That should be as cheap as it gets?
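
To make the idea a bit more concrete, here is a rough sketch of what I
have in mind. It is not the actual pxar code: `Metadata` is reduced to
two made-up fields, a plain `std::io::Write` stands in for `SeqWrite`
(which is async in pxar), and the helpers `put`/`put_item` as well as the
tag values are hypothetical. The only point is that a single code path
both writes and counts, and with `None` it only counts.

```rust
use std::io::{self, Write};

/// Hypothetical, heavily simplified stand-in for pxar's `Metadata`.
pub struct Metadata {
    pub stat: [u8; 8],
    pub xattrs: Vec<(Vec<u8>, Vec<u8>)>,
}

/// Write `data` if a writer is present; report its length in either case.
fn put<W: Write>(out: &mut Option<&mut W>, data: &[u8]) -> io::Result<u64> {
    if let Some(w) = out {
        w.write_all(data)?;
    }
    Ok(data.len() as u64)
}

/// Emit one item: a 16-byte header (tag + total item size) followed by
/// the payload parts. The layout is an assumption for illustration.
fn put_item<W: Write>(
    out: &mut Option<&mut W>,
    tag: u64,
    parts: &[&[u8]],
) -> io::Result<u64> {
    let payload: u64 = parts.iter().map(|p| p.len() as u64).sum();
    let mut n = put(out, &tag.to_le_bytes())?;
    n += put(out, &(16 + payload).to_le_bytes())?;
    for p in parts {
        n += put(out, p)?;
    }
    Ok(n)
}

impl Metadata {
    /// Encode into `out` if it is `Some`, otherwise only count.
    /// Returns the number of encoded bytes in both cases.
    pub fn encode<W: Write>(&self, mut out: Option<&mut W>) -> io::Result<u64> {
        let mut n = put_item(&mut out, 1, &[&self.stat[..]])?;
        for (name, value) in &self.xattrs {
            // xattr name and value separated by a NUL byte
            n += put_item(&mut out, 2, &[name.as_slice(), &[0u8][..], value.as_slice()])?;
        }
        Ok(n)
    }

    /// Byte size of the encoded metadata, without touching any writer.
    pub fn encoded_size(&self) -> u64 {
        self.encode(None::<&mut io::Sink>)
            .expect("counting performs no I/O")
    }
}
```

With something like this, `meta.encode(Some(&mut writer))` and
`meta.encoded_size()` cannot drift apart, since the size is derived from
the very same calls that produce the bytes.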

> 
> The purpose of this function is to calculate the bytes, which I can easily skip over
> *without* having to call any expensive encoding/decoding functionality.
> I might get around this by simply calling the decoder on the byte stream, then I do
> not need this at all (if I'm not missing something). Might that be the better approach?

I'm not sure decoding is that much cheaper than dummy-encoding...
depending on the data I'd say it could even be more expensive in some
cases? (rare cases though, only with lots of ACLs/xattrs around I
suppose...)

> 
> Additionally, and maybe even better, I might get rid of this also by letting the
> PXAR_APPENDIX_REF offset point to the start of the file payload entry, instead of the
> file entry as is now, thereby being able to blindly skip over this already to begin with.
> Although I am not sure if that is the best approach for handling the metadata, which should
> ideally not be encoded twice, once before the PXAR_APPENDIX_REF and the PXAR_PAYLOAD.

Not sure why skipping data would encode it twice? Or did you mean that
previously we pointed to the metadata, whereas when pointing to the
payload we would have to encode the metadata in the new archive, which
we previously did not need to do?