[pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path

Wolfgang Bumiller w.bumiller at proxmox.com
Fri Aug 11 10:26:21 CEST 2023


On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
> Added `matches_path()` function, which only matches against the path and returns
> an error if a file_mode pattern is found/needed in the matching list. This is
> useful when we want to check if a file is excluded before running `stat()` on
> the file to get the file_mode (which could fail).
> 
> Signed-off-by: Gabriel Goller <g.goller at proxmox.com>
> ---
>  src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 158 insertions(+), 1 deletion(-)
> 
> diff --git a/src/match_list.rs b/src/match_list.rs
> index c5b14e0..acad328 100644
> --- a/src/match_list.rs
> +++ b/src/match_list.rs
> @@ -1,6 +1,6 @@
>  //! Helpers for include/exclude lists.
> -
>  use bitflags::bitflags;
> +use std::fmt;
>  
>  use crate::PatternFlag;
>  
> @@ -39,6 +39,17 @@ impl Default for MatchFlag {
>      }
>  }
>  
> +#[derive(Debug, PartialEq)]
> +pub struct FileModeRequiredForMatching;

Let's shorten this to just `FileModeRequired` ;-)

> +
> +impl fmt::Display for FileModeRequiredForMatching {
> +    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +        write!(f, "File mode is required for matching")
> +    }
> +}
> +
> +impl std::error::Error for FileModeRequiredForMatching {}
> +
>  /// A pattern entry. (Glob patterns or literal patterns.)
>  // Note:
>  // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
> @@ -304,12 +315,32 @@ impl MatchEntry {
>  
>          self.matches_path_exact(path)
>      }
> +
> +    /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
> +    pub fn matches_path<T: AsRef<[u8]>>(
> +        &self,
> +        path: T,
> +    ) -> Result<bool, FileModeRequiredForMatching> {
> +        self.matches_path_do(path.as_ref())
> +    }
> +
> +    fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
> +        if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
> +            return Err(FileModeRequiredForMatching);
> +        }
> +
> +        Ok(self.matches_path_suffix_do(path))
> +    }
>  }
>  
>  #[doc(hidden)]
>  pub trait MatchListEntry {
>      fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>      fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> +    fn entry_matches_path(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;

>  }
>  
>  impl MatchListEntry for &'_ MatchEntry {
> @@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
>              None
>          }
>      }
> +
> +    fn entry_matches_path(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> +        if let Ok(b) = self.matches_path(path) {

This can just use `?`, it's the exact same error type after all.
(Also `if let Ok` is generally best avoided since the 'else' branch
discards the error, and if not, it's often a case like this where it can
juse use '?' ;-) ).

Effectively this could be as short as

    Ok(self.matches_path(path)?.then(|| self.match_type()))

> +            if b {

As an additional hint: when you nest ifs around things you can match,
you can just include both cases in the patterns:
    match self.matches_path(path) {
        Ok(true) => Ok(Some(self.match_type())),
        Ok(false) => Ok(None),
        Err(err) => Err(err),
        // where this Err() case already tells you that you can use '?' instead
    }

> +                Ok(Some(self.match_type()))
> +            } else {
> +                Ok(None)
> +            }
> +        } else {
> +            Err(FileModeRequiredForMatching)
> +        }
> +    }
>  }
>  
>  impl MatchListEntry for &'_ &'_ MatchEntry {
> @@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
>              None
>          }
>      }
> +
> +    fn entry_matches_path(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {

same

> +        if let Ok(b) = self.matches_path(path) {
> +            if b {
> +                Ok(Some(self.match_type()))
> +            } else {
> +                Ok(None)
> +            }
> +        } else {
> +            Err(FileModeRequiredForMatching)
> +        }
> +    }
>  }
>  
>  /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
> @@ -374,6 +435,20 @@ pub trait MatchList {
>      }
>  
>      fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> +
> +    /// Check whether this list contains anything exactly matching the path, returns error if
> +    /// `file_mode` is required for exact matching.
> +    fn matches_path<T: AsRef<[u8]>>(
> +        &self,
> +        path: T,
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> +        self.matches_path_do(path.as_ref())
> +    }
> +
> +    fn matches_path_do(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>  }
>  
>  impl<'a, T> MatchList for T
> @@ -408,6 +483,24 @@ where
>  
>          None
>      }
> +
> +    fn matches_path_do(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {

Given the amount of tiny match helpers we run through with those 2
traits already I wonder if we should just make a breaking change here
instead and only have the versions with the `Result` while users (or
defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
(since a mode of 0 should match anything ;-) ).

But I don't have any strong feelings about this, so either way is fine
with me.

> +        // This is an &self method on a `T where T: 'a`.
> +        let this: &'a Self = unsafe { std::mem::transmute(self) };
> +
> +        for m in this.into_iter().rev() {
> +            if let Ok(mt) = m.entry_matches_path(path) {

IIRC the intention was actually to immediately fail if we hit a pattern
with a file mode, since we wouldn't be able to tell if it would already
exclude the file, otherwise we really *could* just skip this entirely
and have a failing stat() call just use `Some(!0)` as file mode.

Basically, if the user runs into an inaccessible file they have to
append `--exclude=/that/one/file` to the CLI invocation to fix it,
whereas otherwise they'd still get an error.

So this should just be

    if let Some(mt) = m.entry_matches_path(path)? {
        return Some(mt);
    }


> +                if mt.is_some() {
> +                    return Ok(mt);
> +                }
> +            }
> +        }
> +
> +        Err(FileModeRequiredForMatching)

Also this is wrong. If nothing matches, nothing matches ;-) (just like
in the other matching variants).
Which immediately tells you that skipping the error above doesn't make
much sense :-) )





More information about the pbs-devel mailing list