[pbs-devel] [PATCH v2 stable-2 pxar 1/1] format/decoder/accessor: backport pxar entry type `Version`
Christian Ebner
c.ebner at proxmox.com
Thu Jun 6 10:49:39 CEST 2024
On 6/6/24 10:21, Fabian Grünbichler wrote:
> On June 5, 2024 5:41 pm, Christian Ebner wrote:
>> Backports the pxar format entry type `Version` and the associated
>> decoder methods. The format version entry is expected once as the
>> first entry of the pxar archive, marked with a `PXAR_FORMAT_VERSION`
>> header followed by the encoded version number for archives with
>> format version 2 or higher.
>> If not present, the default format version 1 is assumed as encoding
>> format for the archive.
>>
>> The entry allows to early detect and bail if an incompatible archive
>> version is encountered.
>>
>> The format version entry is not backwards compatible to pxar format
>> version 1.
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>> Note:
>>
>> This patch is intended to be applied on a dedicated branch to be forked
>> from previous master commit 675ecff32fbeff0973eaea016c4b8f3877015adb
>>
>> examples/mk-format-hashes.rs | 5 +++++
>> src/accessor/mod.rs | 28 ++++++++++++++++++++++++++--
>> src/decoder/mod.rs | 28 ++++++++++++++++++++++++++--
>> src/format/mod.rs | 19 +++++++++++++++++++
>> src/lib.rs | 3 +++
>> tests/simple/fs.rs | 1 +
>> 6 files changed, 80 insertions(+), 4 deletions(-)
>>
>> diff --git a/examples/mk-format-hashes.rs b/examples/mk-format-hashes.rs
>> index 6e00654..afd0924 100644
>> --- a/examples/mk-format-hashes.rs
>> +++ b/examples/mk-format-hashes.rs
>> @@ -1,6 +1,11 @@
>> use pxar::format::hash_filename;
>>
>> const CONSTANTS: &[(&str, &str, &str)] = &[
>> + (
>> + "Pxar format version entry, fallback to version 1 if not present",
>> + "PXAR_FORMAT_VERSION",
>> + "__PROXMOX_FORMAT_VERSION__",
>> + ),
>> (
>> "Beginning of an entry (current version).",
>> "PXAR_ENTRY",
>> diff --git a/src/accessor/mod.rs b/src/accessor/mod.rs
>> index 6a2de73..73d79e1 100644
>> --- a/src/accessor/mod.rs
>> +++ b/src/accessor/mod.rs
>> @@ -17,7 +17,7 @@ use endian_trait::Endian;
>>
>> use crate::binary_tree_array;
>> use crate::decoder::{self, DecoderImpl};
>> -use crate::format::{self, GoodbyeItem};
>> +use crate::format::{self, FormatVersion, GoodbyeItem};
>> use crate::util;
>> use crate::{Entry, EntryKind};
>>
>> @@ -185,11 +185,23 @@ pub(crate) struct AccessorImpl<T> {
>> }
>>
>> impl<T: ReadAt> AccessorImpl<T> {
>> - pub async fn new(input: T, size: u64) -> io::Result<Self> {
>> + pub async fn new(mut input: T, size: u64) -> io::Result<Self> {
>> if size < (size_of::<GoodbyeItem>() as u64) {
>> io_bail!("too small to contain a pxar archive");
>> }
>>
>> + let header: format::Header = read_entry_at(&mut input, 0).await?;
>> + header.check_header_size()?;
>> +
>> + if header.htype == format::PXAR_FORMAT_VERSION {
>> + let version: u64 = read_entry_at(
>> + &mut input,
>> + size_of::<format::Header>() as u64,
>> + )
>> + .await?;
>> + FormatVersion::deserialize(version)?;
>> + }
>
> is there some other way to construct the AccessorImpl? if not, wouldn't
> this check here be enough and the ones below can actually never
> trigger/happen? see below as well, I think the deserialize could just be
> an io_bail
True, I just wanted to keep the logic the same as for the current
master, but I am fine send a new version simply bailing here instead.
>
>> +
>> Ok(Self {
>> input,
>> size,
>> @@ -293,6 +305,12 @@ impl<T: Clone + ReadAt> AccessorImpl<T> {
>> .next()
>> .await
>> .ok_or_else(|| io_format_err!("unexpected EOF while decoding file entry"))??;
>> +
>> + if let EntryKind::Version(_) = entry.kind() {
>> + // client is incompatible with any format version entry (version 1 is never encoded)
>> + io_bail!("got format version not compatible with this client.");
>> + }
>
> since no encoded version can be deserialized by the stable-2 parser,
> this cannot happen since the deserializer would have bailed before?
Also true, have these in place as additional safeguard. But I can drop
this in a new version.
>
>> +
>> Ok(FileEntryImpl {
>> input: self.input.clone(),
>> entry,
>> @@ -516,6 +534,12 @@ impl<T: Clone + ReadAt> DirectoryImpl<T> {
>> .next()
>> .await
>> .ok_or_else(|| io_format_err!("unexpected EOF while decoding directory entry"))??;
>> +
>> + if let EntryKind::Version(_) = entry.kind() {
>> + // client is incompatible with any format version entry (version 1 is never encoded)
>> + io_bail!("got format version not compatible with this client.");
>> + }
>
> same here
same as above
>
>> +
>> Ok((entry, decoder))
>> }
>>
>> diff --git a/src/decoder/mod.rs b/src/decoder/mod.rs
>> index d1fb911..c6eae9f 100644
>> --- a/src/decoder/mod.rs
>> +++ b/src/decoder/mod.rs
>> @@ -17,7 +17,7 @@ use std::task::{Context, Poll};
>>
>> use endian_trait::Endian;
>>
>> -use crate::format::{self, Header};
>> +use crate::format::{self, FormatVersion, Header};
>> use crate::util::{self, io_err_other};
>> use crate::{Entry, EntryKind, Metadata};
>>
>> @@ -162,6 +162,7 @@ pub(crate) struct DecoderImpl<T> {
>> eof_after_entry: bool,
>> }
>>
>> +#[derive(Clone, PartialEq)]
>> enum State {
>> Begin,
>> Default,
>> @@ -236,7 +237,16 @@ impl<I: SeqRead> DecoderImpl<I> {
>> loop {
>> match self.state {
>> State::Eof => return Ok(None),
>> - State::Begin => return self.read_next_entry().await.map(Some),
>> + State::Begin => {
>> + let entry = self.read_next_entry().await.map(Some);
>> + if let Ok(Some(ref entry)) = entry {
>> + if let EntryKind::Version(_) = entry.kind() {
>> + // client is incompatible with any format version entry (version 1 is never encoded)
>> + io_bail!("got format version not compatible with this client.");
>
> do we want to include the version here? but see below, I think we can
> skip this altogether since we never ever will encounter a valid Version
> entry..
Yes, same as above. I keep this as safeguard, but can drop this as well
>
>> + }
>> + }
>> + return entry;
>> + }
>> State::Default => {
>> // we completely finished an entry, so now we're going "up" in the directory
>> // hierarchy and parse the next PXAR_FILENAME or the PXAR_GOODBYE:
>> @@ -354,6 +364,7 @@ impl<I: SeqRead> DecoderImpl<I> {
>> }
>>
>> async fn read_next_entry_or_eof(&mut self) -> io::Result<Option<Entry>> {
>> + let previous_state = self.state.clone();
>> self.state = State::Default;
>> self.entry.clear_data();
>>
>> @@ -373,6 +384,14 @@ impl<I: SeqRead> DecoderImpl<I> {
>> self.entry.metadata = Metadata::default();
>> self.entry.kind = EntryKind::Hardlink(self.read_hardlink().await?);
>>
>> + Ok(Some(self.entry.take()))
>> + } else if header.htype == format::PXAR_FORMAT_VERSION {
>> + if previous_state != State::Begin {
>> + io_bail!("Got format version entry at unexpected position");
>> + }
>
> technically any position is unexpected, so we could drop this check
> here..
>
>> + self.current_header = header;
>> + self.entry.kind = EntryKind::Version(self.read_format_version().await?);
>
> we can skip this, since there can never be a valid Version entry, and
> just inline read_format_version as a single call to seq_read_entry
> followed by bailing?
Okay, will do that.
>
>> +
>> Ok(Some(self.entry.take()))
>> } else if header.htype == format::PXAR_ENTRY || header.htype == format::PXAR_ENTRY_V1 {
>> if header.htype == format::PXAR_ENTRY {
>> @@ -661,6 +680,11 @@ impl<I: SeqRead> DecoderImpl<I> {
>> async fn read_quota_project_id(&mut self) -> io::Result<format::QuotaProjectId> {
>> self.read_simple_entry("quota project id").await
>> }
>> +
>> + async fn read_format_version(&mut self) -> io::Result<format::FormatVersion> {
>> + let version: u64 = seq_read_entry(&mut self.input).await?;
>> + FormatVersion::deserialize(version)
>> + }
>> }
>>
>> /// Reader for file contents inside a pxar archive.
>> diff --git a/src/format/mod.rs b/src/format/mod.rs
>> index bfea9f6..2e21635 100644
>> --- a/src/format/mod.rs
>> +++ b/src/format/mod.rs
>> @@ -6,6 +6,7 @@
>> //! item data.
>> //!
>> //! An archive contains items in the following order:
>> +//! * `FORMAT_VERSION` -- (optional for v1), version of encoding format
>> //! * `ENTRY` -- containing general stat() data and related bits
>> //! * `XATTR` -- one extended attribute
>> //! * ... -- more of these when there are multiple defined
>> @@ -79,6 +80,8 @@ pub mod mode {
>> }
>>
>> // Generated by `cargo run --example mk-format-hashes`
>> +/// Pxar format version entry, fallback to version 1 if not present
>> +pub const PXAR_FORMAT_VERSION: u64 = 0x730f6c75df16a40d;
>> /// Beginning of an entry (current version).
>> pub const PXAR_ENTRY: u64 = 0xd5956474e588acef;
>> /// Previous version of the entry struct
>> @@ -177,6 +180,7 @@ impl Header {
>> impl Display for Header {
>> fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>> let readable = match self.htype {
>> + PXAR_FORMAT_VERSION => "FORMAT_VERSION",
>> PXAR_FILENAME => "FILENAME",
>> PXAR_SYMLINK => "SYMLINK",
>> PXAR_HARDLINK => "HARDLINK",
>> @@ -540,6 +544,21 @@ impl From<&std::fs::Metadata> for Stat {
>> }
>> }
>>
>> +#[derive(Clone, Debug, Default, PartialEq)]
>> +pub enum FormatVersion {
>> + #[default]
>> + Version1,
>> +}
>> +
>> +impl FormatVersion {
>> + pub fn deserialize(version: u64) -> Result<FormatVersion, io::Error> {
>> + match version {
>> + 1u64 => Ok(FormatVersion::Version1),
>
> the 1u64 here is wrong, right? it can't ever be encoded that way.. so
> this can go straight to io_bail!, or we can even skip the deserialize
> altogether and just inline that bail above in `read_format_version`
Same as above, I tried to keep the logic similar to current master, but
can drop this as well.
>
>> + version => io_bail!("incompatible format version {version}")
>> + }
>> + }
>> +}
>> +
>> #[derive(Clone, Debug)]
>> pub struct Filename {
>> pub name: Vec<u8>,
>> diff --git a/src/lib.rs b/src/lib.rs
>> index 210c4b1..b63d43c 100644
>> --- a/src/lib.rs
>> +++ b/src/lib.rs
>> @@ -342,6 +342,9 @@ impl Acl {
>> /// Identifies whether the entry is a file, symlink, directory, etc.
>> #[derive(Clone, Debug)]
>> pub enum EntryKind {
>> + /// Pxar file format version
>> + Version(format::FormatVersion),
>> +
>
> if we never construct such an entry, since it is always considered
> invalid, we can skip this?
Will drop this as well
>
>> /// Symbolic links.
>> Symlink(format::Symlink),
>>
>> diff --git a/tests/simple/fs.rs b/tests/simple/fs.rs
>> index 9a89c4d..fd13e65 100644
>> --- a/tests/simple/fs.rs
>> +++ b/tests/simple/fs.rs
>> @@ -229,6 +229,7 @@ impl Entry {
>> })?))
>> };
>> match item.kind() {
>> + PxarEntryKind::Version(_) => continue,
>
> and as a result, this?
Same, given that I think this would not even require the patches on the
pbs side anymore, as the decoder/accessor will always fail anyway.
>
>> PxarEntryKind::GoodbyeTable => break,
>> PxarEntryKind::File { size, .. } => {
>> let mut data = Vec::new();
>> --
>> 2.30.2
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
>
More information about the pbs-devel
mailing list