[pbs-devel] [PATCH v3 pxar 14/58] format/encoder/decoder: add entry type cli params

Christian Ebner c.ebner at proxmox.com
Wed Apr 3 16:41:30 CEST 2024


On 4/3/24 14:01, Fabian Grünbichler wrote:
> On March 28, 2024 1:36 pm, Christian Ebner wrote:
>> Add an additional entrt type PXAR_CLI_PARAMS which is used to store
>> additional metadata passed by the cli arguments such as the pxar cli
>> exclude patterns.
>>
>> The content is encoded as an arbitrary byte slice. The entry must be
>> encoded right after the pxar format version entry, it is not possible to
>> encode this with the previous format version 1.
> 
> since (from pxar's perspective) this is just an opaque blob of data,
> isn't PXAR_CLI_PARAMS a bit of a misnomer? do we want a single blob, or
> multiple delineated ones (might be more handy for the client using it)?

This is in general still rather open for discussion and not decided 
upon. I added it as opaque blob for now, not wanting to force any format 
on this from the pxar perspective.
As is, this is only used to store the pxar exclude cli parameters during 
backup creation via proxmox-backups client, as encoding that as regular 
file truly broke the reusability logic there.

The idea was to possibly add more stuff in here e.g. further cli 
parameters and other metadata, without having pxar to know about the 
content format.

So should I just rename this to e.g. PXAR_PRELUDE?

> 
> this whole hunk just reverts a change done earlier in the same series
> (forgotten `cargo fmt` for the first patch maybe? ;))

Yeah, seems I squashed this into the wrong patch, will be fixed in the 
next version.

> 
>> diff --git a/src/decoder/mod.rs b/src/decoder/mod.rs
>> index 5b2fafb..4170b2f 100644
>> --- a/src/decoder/mod.rs
>> +++ b/src/decoder/mod.rs
>> @@ -266,7 +266,13 @@ impl<I: SeqRead> DecoderImpl<I> {
>>                       if let Ok(Some(ref entry)) = entry {
>>                           if let EntryKind::Version(version) = entry.kind() {
>>                               self.version = version.clone();
>> -                            return self.read_next_entry().await.map(Some);
>> +                            let entry = self.read_next_entry().await.map(Some);
>> +                            if let Ok(Some(ref entry)) = entry {
>> +                                if let EntryKind::CliParams(_) = entry.kind() {
>> +                                    return self.read_next_entry().await.map(Some);
>> +                                }
>> +                            }
>> +                            return entry;
> 
> so maybe we want a new State::Prelude or something that we transition to
> from Begin if we encounter a FormatVersion, then we can match this and
> future "special" entries before proceeding with the regular archive?

Ack, I added this based on the comment in the reply to the other patch 
already

> 
>>                           }
>>                       }
>>                       return entry;
>> @@ -429,6 +435,11 @@ impl<I: SeqRead> DecoderImpl<I> {
>>               self.current_header = header;
>>               self.entry.kind = EntryKind::Version(self.read_format_version().await?);
>>   
>> +            Ok(Some(self.entry.take()))
>> +        } else if header.htype == format::PXAR_CLI_PARAMS {
>> +            self.current_header = header;
>> +            self.entry.kind = EntryKind::CliParams(self.read_cli_params().await?);
>> +
> 
> and here (well, not here, at the start of read_next_entry_or_eof ;)) we
> should maybe save the previous state before setting it to Default, so
> that we can then check it for some header types like FormatVersion or
> CliParams to ensure a misconstructed input cannot confuse our state
> machine/decoder?

Hmm, yes, that will do. Will add such a check in the next version of the 
patches.

> 
>>               Ok(Some(self.entry.take()))
>>           } else if header.htype == format::PXAR_ENTRY || header.htype == format::PXAR_ENTRY_V1 {
>>               if header.htype == format::PXAR_ENTRY {
>> @@ -802,6 +813,11 @@ impl<I: SeqRead> DecoderImpl<I> {
>>               _ => io_bail!("unexpected pxar format version"),
>>           }
>>       }
>> +
>> +    async fn read_cli_params(&mut self) -> io::Result<format::CliParams> {
>> +        let data = self.read_entry_as_bytes().await?;
>> +        Ok(format::CliParams { data })
>> +    }
>>   }
>>   
>>   /// Reader for file contents inside a pxar archive.
>> diff --git a/src/encoder/aio.rs b/src/encoder/aio.rs
>> index 6da32bd..956b2a3 100644
>> --- a/src/encoder/aio.rs
>> +++ b/src/encoder/aio.rs
>> @@ -25,11 +25,13 @@ impl<'a, T: tokio::io::AsyncWrite + 'a> Encoder<'a, TokioWriter<T>> {
>>           output: T,
>>           metadata: &Metadata,
>>           payload_output: Option<T>,
>> +        cli_params: Option<&[u8]>,
>>       ) -> io::Result<Encoder<'a, TokioWriter<T>>> {
>>           Encoder::new(
>>               TokioWriter::new(output),
>>               metadata,
>>               payload_output.map(|payload_output| TokioWriter::new(payload_output)),
>> +            cli_params,
>>           )
>>           .await
>>       }
>> @@ -46,6 +48,7 @@ impl<'a> Encoder<'a, TokioWriter<tokio::fs::File>> {
>>               TokioWriter::new(tokio::fs::File::create(path.as_ref()).await?),
>>               metadata,
>>               None,
>> +            None,
>>           )
>>           .await
>>       }
>> @@ -57,9 +60,11 @@ impl<'a, T: SeqWrite + 'a> Encoder<'a, T> {
>>           output: T,
>>           metadata: &Metadata,
>>           payload_output: Option<T>,
>> +        cli_params: Option<&[u8]>,
>>       ) -> io::Result<Encoder<'a, T>> {
>>           Ok(Self {
>> -            inner: encoder::EncoderImpl::new(output.into(), metadata, payload_output).await?,
>> +            inner: encoder::EncoderImpl::new(output.into(), metadata, payload_output, cli_params)
>> +                .await?,
>>           })
>>       }
>>   
>> @@ -331,10 +336,14 @@ mod test {
>>       /// Assert that `Encoder` is `Send`
>>       fn send_test() {
>>           let test = async {
>> -            let mut encoder =
>> -                Encoder::new(DummyOutput, &Metadata::dir_builder(0o700).build(), None)
>> -                    .await
>> -                    .unwrap();
>> +            let mut encoder = Encoder::new(
>> +                DummyOutput,
>> +                &Metadata::dir_builder(0o700).build(),
>> +                None,
>> +                None,
>> +            )
>> +            .await
>> +            .unwrap();
>>               {
>>                   encoder
>>                       .create_directory("baba", &Metadata::dir_builder(0o700).build())
>> diff --git a/src/encoder/mod.rs b/src/encoder/mod.rs
>> index 9270153..b0ec877 100644
>> --- a/src/encoder/mod.rs
>> +++ b/src/encoder/mod.rs
>> @@ -316,6 +316,7 @@ impl<'a, T: SeqWrite + 'a> EncoderImpl<'a, T> {
>>           output: EncoderOutput<'a, T>,
>>           metadata: &Metadata,
>>           mut payload_output: Option<T>,
>> +        cli_params: Option<&[u8]>,
>>       ) -> io::Result<EncoderImpl<'a, T>> {
>>           if !metadata.is_dir() {
>>               io_bail!("directory metadata must contain the directory mode flag");
>> @@ -343,6 +344,9 @@ impl<'a, T: SeqWrite + 'a> EncoderImpl<'a, T> {
>>           };
>>   
>>           this.encode_format_version().await?;
>> +        if let Some(params) = cli_params {
>> +            this.encode_cli_params(params).await?;
>> +        }
>>           this.encode_metadata(metadata).await?;
>>           let state = this.state_mut()?;
>>           state.files_offset = state.position();
>> @@ -740,16 +744,38 @@ impl<'a, T: SeqWrite + 'a> EncoderImpl<'a, T> {
>>           Ok(())
>>       }
>>   
>> +    async fn encode_cli_params(&mut self, params: &[u8]) -> io::Result<()> {
>> +        if self.version == FormatVersion::Version1 {
>> +            io_bail!("encoding cli params not supported pxar format version 1");
> 
> nit: missing "in" or "for" or "with"

ack

> 
>> +        }
>> +
>> +        let (output, state) = self.output_state()?;
>> +        if state.write_position != (size_of::<u64>() + size_of::<format::Header>()) as u64 {
> 
> this seems brittle, shouldn't it explicitly use the size of a
> FormatVersion entry?

Will double check, but afaik the size_of for format::FormatVersion was 
not correct since that is an enum, the version number however encoded as 
u64.

> 
> this and the similar check for the version introduced in the previous
> patch smell a bit like "we actually have a state machine here but
> pretend not to" :) for the payload archive, we also have a very simple
> one: start_marker (1) -> payload entry (0..N) -> tail_marker (1) that is
> not enforced atm (as in, nothing stops a bug from writing other entry
> types, or additonal start/tail markers, or .. to the payload output).

I agree, for the encoder we however do not have a state machine, I am 
open for suggestions on how to improve this!

>> +            io_bail!(
>> +                "cli params must be encoded following the version header, current position {}",
>> +                state.write_position,
>> +            );
>> +        }
>> +
>> +        seq_write_pxar_entry(
>> +            output,
>> +            format::PXAR_CLI_PARAMS,
>> +            params,
>> +            &mut state.write_position,
>> +        )
>> +        .await
>> +    }
>> +
>>       async fn encode_format_version(&mut self) -> io::Result<()> {
>> -		let version_bytes = match self.version {
>> -			format::FormatVersion::Version1 => return Ok(()),
>> -			format::FormatVersion::Version2 => 2u64.to_le_bytes(),
>> -		};
>> +        let version_bytes = match self.version {
>> +            format::FormatVersion::Version1 => return Ok(()),
>> +            format::FormatVersion::Version2 => 2u64.to_le_bytes(),
>> +        };
> 
> cargo fmt?

ack, incorrectly squashed, fixed in new version

> 
>>   
>>           let (output, state) = self.output_state()?;
>> -		if state.write_position != 0 {
>> -			io_bail!("pxar format version must be encoded at the beginning of an archive");
>> -		}
>> +        if state.write_position != 0 {
>> +            io_bail!("pxar format version must be encoded at the beginning of an archive");
>> +        }
> 
> cargo fmt?

ack, incorrectly squashed, fixed in new version

> 
>>   
>>           seq_write_pxar_entry(
>>               output,
>> diff --git a/src/encoder/sync.rs b/src/encoder/sync.rs
>> index a6e16f4..3f706c1 100644
>> --- a/src/encoder/sync.rs
>> +++ b/src/encoder/sync.rs
>> @@ -28,7 +28,7 @@ impl<'a, T: io::Write + 'a> Encoder<'a, StandardWriter<T>> {
>>       /// Encode a `pxar` archive into a regular `std::io::Write` output.
>>       #[inline]
>>       pub fn from_std(output: T, metadata: &Metadata) -> io::Result<Encoder<'a, StandardWriter<T>>> {
>> -        Encoder::new(StandardWriter::new(output), metadata, None)
>> +        Encoder::new(StandardWriter::new(output), metadata, None, None)
>>       }
>>   }
>>   
>> @@ -42,6 +42,7 @@ impl<'a> Encoder<'a, StandardWriter<std::fs::File>> {
>>               StandardWriter::new(std::fs::File::create(path.as_ref())?),
>>               metadata,
>>               None,
>> +            None,
>>           )
>>       }
>>   }
>> @@ -53,12 +54,18 @@ impl<'a, T: SeqWrite + 'a> Encoder<'a, T> {
>>       /// not allowed to use the `Waker`, as this will cause a `panic!`.
>>       // Optionally attach a dedicated writer to redirect the payloads of regular files to a separate
>>       // output.
>> -    pub fn new(output: T, metadata: &Metadata, payload_output: Option<T>) -> io::Result<Self> {
>> +    pub fn new(
>> +        output: T,
>> +        metadata: &Metadata,
>> +        payload_output: Option<T>,
>> +        cli_params: Option<&[u8]>,
>> +    ) -> io::Result<Self> {
>>           Ok(Self {
>>               inner: poll_result_once(encoder::EncoderImpl::new(
>>                   output.into(),
>>                   metadata,
>>                   payload_output,
>> +                cli_params,
>>               ))?,
>>           })
>>       }
>> diff --git a/src/format/mod.rs b/src/format/mod.rs
>> index 2bf33c9..82ef196 100644
>> --- a/src/format/mod.rs
>> +++ b/src/format/mod.rs
>> @@ -87,6 +87,7 @@ pub const PXAR_FORMAT_VERSION: u64 = 0x730f6c75df16a40d;
>>   pub const PXAR_ENTRY: u64 = 0xd5956474e588acef;
>>   /// Previous version of the entry struct
>>   pub const PXAR_ENTRY_V1: u64 = 0x11da850a1c1cceff;
>> +pub const PXAR_CLI_PARAMS: u64 = 0xcf58b7dd627f604a;
>>   pub const PXAR_FILENAME: u64 = 0x16701121063917b3;
>>   pub const PXAR_SYMLINK: u64 = 0x27f971e7dbf5dc5f;
>>   pub const PXAR_DEVICE: u64 = 0x9fc9e906586d5ce9;
>> @@ -147,6 +148,7 @@ impl Header {
>>       #[inline]
>>       pub fn max_content_size(&self) -> u64 {
>>           match self.htype {
>> +            PXAR_CLI_PARAMS => u64::MAX - (size_of::<Self>() as u64),
>>               // + null-termination
>>               PXAR_FILENAME => crate::util::MAX_FILENAME_LEN + 1,
>>               // + null-termination
>> @@ -190,6 +192,7 @@ impl Display for Header {
>>       fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>>           let readable = match self.htype {
>>               PXAR_FORMAT_VERSION => "FORMAT_VERSION",
>> +            PXAR_CLI_PARAMS => "CLI_PARAMS",
>>               PXAR_FILENAME => "FILENAME",
>>               PXAR_SYMLINK => "SYMLINK",
>>               PXAR_HARDLINK => "HARDLINK",
>> @@ -694,6 +697,29 @@ impl Device {
>>       }
>>   }
>>   
>> +#[derive(Clone, Debug)]
>> +pub struct CliParams {
>> +    pub data: Vec<u8>,
>> +}
>> +
>> +impl CliParams {
>> +    pub fn as_os_str(&self) -> &OsStr {
>> +        self.as_ref()
>> +    }
>> +}
>> +
>> +impl AsRef<[u8]> for CliParams {
>> +    fn as_ref(&self) -> &[u8] {
>> +        &self.data
>> +    }
>> +}
>> +
>> +impl AsRef<OsStr> for CliParams {
>> +    fn as_ref(&self) -> &OsStr {
>> +        OsStr::from_bytes(&self.data[..self.data.len().max(1) - 1])
>> +    }
>> +}
>> +
>>   #[cfg(all(test, target_os = "linux"))]
>>   #[test]
>>   fn test_linux_devices() {
>> diff --git a/src/lib.rs b/src/lib.rs
>> index a87b5ac..cc85759 100644
>> --- a/src/lib.rs
>> +++ b/src/lib.rs
>> @@ -345,6 +345,9 @@ pub enum EntryKind {
>>       /// Pxar file format version
>>       Version(format::FormatVersion),
>>   
>> +    /// Cli parameter.
>> +    CliParams(format::CliParams),
>> +
>>       /// Symbolic links.
>>       Symlink(format::Symlink),
>>   
>> -- 
>> 2.39.2
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>>
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 





More information about the pbs-devel mailing list