[pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path

Gabriel Goller g.goller at proxmox.com
Mon Aug 14 11:32:50 CEST 2023


On 8/11/23 10:26, Wolfgang Bumiller wrote:

> On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
>> Added `matches_path()` function, which only matches against the path and returns
>> an error if a file_mode pattern is found/needed in the matching list. This is
>> useful when we want to check if a file is excluded before running `stat()` on
>> the file to get the file_mode (which could fail).
>>
>> Signed-off-by: Gabriel Goller <g.goller at proxmox.com>
>> ---
>>   src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 158 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/match_list.rs b/src/match_list.rs
>> index c5b14e0..acad328 100644
>> --- a/src/match_list.rs
>> +++ b/src/match_list.rs
>> @@ -1,6 +1,6 @@
>>   //! Helpers for include/exclude lists.
>> -
>>   use bitflags::bitflags;
>> +use std::fmt;
>>   
>>   use crate::PatternFlag;
>>   
>> @@ -39,6 +39,17 @@ impl Default for MatchFlag {
>>       }
>>   }
>>   
>> +#[derive(Debug, PartialEq)]
>> +pub struct FileModeRequiredForMatching;
> Let's shorten this to just `FileModeRequired` ;-)
Agree
>> +
>> +impl fmt::Display for FileModeRequiredForMatching {
>> +    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>> +        write!(f, "File mode is required for matching")
>> +    }
>> +}
>> +
>> +impl std::error::Error for FileModeRequiredForMatching {}
>> +
>>   /// A pattern entry. (Glob patterns or literal patterns.)
>>   // Note:
>>   // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
>> @@ -304,12 +315,32 @@ impl MatchEntry {
>>   
>>           self.matches_path_exact(path)
>>       }
>> +
>> +    /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
>> +    pub fn matches_path<T: AsRef<[u8]>>(
>> +        &self,
>> +        path: T,
>> +    ) -> Result<bool, FileModeRequiredForMatching> {
>> +        self.matches_path_do(path.as_ref())
>> +    }
>> +
>> +    fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
>> +        if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
>> +            return Err(FileModeRequiredForMatching);
>> +        }
>> +
>> +        Ok(self.matches_path_suffix_do(path))
>> +    }
>>   }
>>   
>>   #[doc(hidden)]
>>   pub trait MatchListEntry {
>>       fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>>       fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> +    fn entry_matches_path(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>>   }
>>   
>>   impl MatchListEntry for &'_ MatchEntry {
>> @@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
>>               None
>>           }
>>       }
>> +
>> +    fn entry_matches_path(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
>> +        if let Ok(b) = self.matches_path(path) {
> This can just use `?`, it's the exact same error type after all.
> (Also `if let Ok` is generally best avoided since the 'else' branch
> discards the error, and if not, it's often a case like this where it can
> juse use '?' ;-) ).
>
> Effectively this could be as short as
>
>      Ok(self.matches_path(path)?.then(|| self.match_type()))
>
>> +            if b {
> As an additional hint: when you nest ifs around things you can match,
> you can just include both cases in the patterns:
>      match self.matches_path(path) {
>          Ok(true) => Ok(Some(self.match_type())),
>          Ok(false) => Ok(None),
>          Err(err) => Err(err),
>          // where this Err() case already tells you that you can use '?' instead
>      }
>
>> +                Ok(Some(self.match_type()))
>> +            } else {
>> +                Ok(None)
>> +            }
>> +        } else {
>> +            Err(FileModeRequiredForMatching)
>> +        }
>> +    }
>>   }
>>   
>>   impl MatchListEntry for &'_ &'_ MatchEntry {
>> @@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
>>               None
>>           }
>>       }
>> +
>> +    fn entry_matches_path(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> same
>
>> +        if let Ok(b) = self.matches_path(path) {
>> +            if b {
>> +                Ok(Some(self.match_type()))
>> +            } else {
>> +                Ok(None)
>> +            }
>> +        } else {
>> +            Err(FileModeRequiredForMatching)
>> +        }
>> +    }
>>   }
>>   
>>   /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
>> @@ -374,6 +435,20 @@ pub trait MatchList {
>>       }
>>   
>>       fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> +
>> +    /// Check whether this list contains anything exactly matching the path, returns error if
>> +    /// `file_mode` is required for exact matching.
>> +    fn matches_path<T: AsRef<[u8]>>(
>> +        &self,
>> +        path: T,
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
>> +        self.matches_path_do(path.as_ref())
>> +    }
>> +
>> +    fn matches_path_do(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>>   }
>>   
>>   impl<'a, T> MatchList for T
>> @@ -408,6 +483,24 @@ where
>>   
>>           None
>>       }
>> +
>> +    fn matches_path_do(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> Given the amount of tiny match helpers we run through with those 2
> traits already I wonder if we should just make a breaking change here
> instead and only have the versions with the `Result` while users (or
> defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
> (since a mode of 0 should match anything ;-) ).
>
> But I don't have any strong feelings about this, so either way is fine
> with me.

Hmm, I don't get what you mean by the `versions with the Result`?
Do you mean we should just modify the `matches()` and `matches_exact()` 
function to
return a `Result` if the mode is `ANY_FILE_TYPE` and we need one (a file 
mode) to match?

>> +        // This is an &self method on a `T where T: 'a`.
>> +        let this: &'a Self = unsafe { std::mem::transmute(self) };
>> +
>> +        for m in this.into_iter().rev() {
>> +            if let Ok(mt) = m.entry_matches_path(path) {
> IIRC the intention was actually to immediately fail if we hit a pattern
> with a file mode, since we wouldn't be able to tell if it would already
> exclude the file, otherwise we really *could* just skip this entirely
> and have a failing stat() call just use `Some(!0)` as file mode.
>
> Basically, if the user runs into an inaccessible file they have to
> append `--exclude=/that/one/file` to the CLI invocation to fix it,
> whereas otherwise they'd still get an error.
>
> So this should just be
>
>      if let Some(mt) = m.entry_matches_path(path)? {
>          return Some(mt);
>      }

Hmm, my solution would have been to traverse all the patterns with a 
`ANY_FILE_TYPE` mode and
if there is a match, nice, return. If there is no match, check if there 
are other patterns without
the `ANY_FILE_TYPE` mode (then we need to fail because we don't know the 
`file_mode` yet...) or
return nothing if there are no patterns left. I don't know if this is 
the correct approach though...

>> +                if mt.is_some() {
>> +                    return Ok(mt);
>> +                }
>> +            }
>> +        }
>> +
>> +        Err(FileModeRequiredForMatching)
> Also this is wrong. If nothing matches, nothing matches ;-) (just like
> in the other matching variants).
> Which immediately tells you that skipping the error above doesn't make
> much sense :-) )
Yes, you are right, corrected it.





More information about the pbs-devel mailing list