[pbs-devel] [RFC pxar 7/20] fix #3174: encoder: add helper to incr encoder pos

Wolfgang Bumiller w.bumiller at proxmox.com
Thu Sep 28 09:04:46 CEST 2023

On Wed, Sep 27, 2023 at 02:20:18PM +0200, Christian Ebner wrote:
> > On 27.09.2023 14:07 CEST Wolfgang Bumiller <w.bumiller at proxmox.com> wrote:
> >
> > 'incr' :S
> > 
> > On Fri, Sep 22, 2023 at 09:16:08AM +0200, Christian Ebner wrote:
> > > Adds a helper to allow to increase the encoder position by a given
> > > size. This is used to increase the position when adding an appendix
> > > section to the pxar stream, as these bytes are never encoded directly
> > > but rather referenced by already existing chunks.
> > 
> > Exposing this seems like a weird choice to me. Why exactly is this
> > needed? Why don't we instead expose methods to actually write the
> > appendix section instead?
> This is needed in order to increase the byte offset of the encoder itself.
> The appendix section is a list of chunks which are injected in the chunk
> stream on upload, but never really consumed by the encoder and subsequently
> the chunker itself. So there is no direct writing of the appendix section to
> the stream.
> By adding the bytes, consistency with the rest of the pxar archive is assured,
> as these chunks/bytes are present during decoding.

Ah, so we inject the *contents* of the old pxar archive by sending the
chunks one writing "layer" above. Initially I thought the archive
would contain chunk ids, but this makes more sense. And it is unfortunate
for the API :-)

Maybe consider marking the position modification as `unsafe fn`, though?
I mean, it is a foot gun that can break the resulting archive, after all.
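
Roughly what I have in mind, as a minimal sketch only. The `Encoder` type
and the `advance_position` name below are made up for illustration and are
not the actual pxar API; `unsafe` is used here purely as a marker for the
archive format invariant, not for memory safety:

```rust
/// Hypothetical stand-in for the encoder state (NOT the real pxar type).
pub struct Encoder {
    /// Number of bytes accounted for in the archive so far.
    position: u64,
}

impl Encoder {
    pub fn new() -> Self {
        Self { position: 0 }
    }

    pub fn position(&self) -> u64 {
        self.position
    }

    /// Advance the byte offset without writing anything (hypothetical name).
    ///
    /// Returns the new position, or `None` if the offset would overflow.
    ///
    /// # Safety
    ///
    /// This is not about memory safety. The caller must guarantee that
    /// exactly `size` bytes get injected into the stream out of band,
    /// otherwise all offsets encoded afterwards are wrong and the
    /// resulting archive is broken.
    pub unsafe fn advance_position(&mut self, size: u64) -> Option<u64> {
        self.position = self.position.checked_add(size)?;
        Some(self.position)
    }
}
```

A caller then has to spell out `unsafe { encoder.advance_position(size) }`,
which at least makes such call sites easy to find in review.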

But this means we don't have a direct way of creating incremental pxars
without a PBS context, doesn't it?
Would it make sense to have a method here which returns a Writer to
the `EncoderOutput`, where we could in theory also just "dump in" the
contents of another actual pxar file (with the byte counting happening
implicitly), and which also has an extra `unsafe fn add_out_of_band_bytes()`
to do the raw byte count modification?
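
As a rough sketch of that shape, under the assumption of a much simplified
encoder: `Encoder`, `AppendixWriter` and `appendix_writer()` are
hypothetical names, and a plain `std::io::Write` stands in for the real
`EncoderOutput`. Only `add_out_of_band_bytes()` is the name proposed above:

```rust
use std::io::{self, Write};

/// Hypothetical, simplified encoder (NOT the real pxar `Encoder`).
pub struct Encoder<W: Write> {
    output: W,
    position: u64,
}

/// Hypothetical handle for writing raw appendix data.
pub struct AppendixWriter<'a, W: Write> {
    encoder: &'a mut Encoder<W>,
}

impl<W: Write> Encoder<W> {
    pub fn new(output: W) -> Self {
        Self { output, position: 0 }
    }

    pub fn position(&self) -> u64 {
        self.position
    }

    pub fn into_inner(self) -> W {
        self.output
    }

    /// Single starting point for appendix data. Flushes pending output
    /// before handing out the writer.
    pub fn appendix_writer(&mut self) -> io::Result<AppendixWriter<'_, W>> {
        self.output.flush()?;
        Ok(AppendixWriter { encoder: self })
    }
}

impl<'a, W: Write> Write for AppendixWriter<'a, W> {
    /// Bytes dumped in directly are counted implicitly.
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let written = self.encoder.output.write(buf)?;
        self.encoder.position += written as u64;
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.encoder.output.flush()
    }
}

impl<'a, W: Write> AppendixWriter<'a, W> {
    /// Account for bytes that are injected elsewhere (e.g. as chunks).
    ///
    /// # Safety
    ///
    /// Not a memory safety issue: the caller must make sure exactly
    /// `size` bytes really end up in the stream, or the archive breaks.
    pub unsafe fn add_out_of_band_bytes(&mut self, size: u64) {
        self.encoder.position += size;
    }
}
```

With something like this, copying another pxar file into the writer with
`std::io::copy` would keep the position consistent without any `unsafe`,
and only the PBS chunk injection case would need the raw count
modification.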

One advantage of having a "starting point" for this type of operation is
that we'd also force a `flush()` before out-of-band data gets written.
Otherwise, if we don't need/want this, we should probably just add a
`flush()` to the encoder, to be called before adding any chunks out of
band, given that Max already tried to sneak BufRead/Writers into
the pxar crate for optimization purposes, IIRC ;-)
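
To illustrate why the flush matters once buffering is involved, here is a
small sketch. `BufferedEncoder` and its methods are hypothetical and only
use `std::io::BufWriter` over a `Vec<u8>` as a stand-in for the real
output:

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical encoder with buffered output (NOT the real pxar type).
pub struct BufferedEncoder {
    output: BufWriter<Vec<u8>>,
    position: u64,
}

impl BufferedEncoder {
    pub fn new() -> Self {
        Self {
            output: BufWriter::with_capacity(4096, Vec::new()),
            position: 0,
        }
    }

    /// Encode some regular data; it may sit in the buffer for a while.
    pub fn encode(&mut self, data: &[u8]) -> io::Result<()> {
        self.output.write_all(data)?;
        self.position += data.len() as u64;
        Ok(())
    }

    /// Must be called before any chunks are added out of band.
    pub fn flush(&mut self) -> io::Result<()> {
        self.output.flush()
    }

    /// Logical position as seen by the encoder.
    pub fn position(&self) -> u64 {
        self.position
    }

    /// Bytes that have actually reached the underlying sink.
    pub fn bytes_in_sink(&self) -> usize {
        self.output.get_ref().len()
    }
}
```

After a small `encode()` call, `position()` has already advanced while
`bytes_in_sink()` can still be zero, so chunks injected at that point
would land in front of data the encoder considers written. Calling
`flush()` first brings the two back in sync.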
