[pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path
Gabriel Goller
g.goller at proxmox.com
Mon Aug 14 11:32:50 CEST 2023
On 8/11/23 10:26, Wolfgang Bumiller wrote:
> On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
>> Added `matches_path()` function, which only matches against the path and returns
>> an error if a file_mode pattern is found/needed in the matching list. This is
>> useful when we want to check if a file is excluded before running `stat()` on
>> the file to get the file_mode (which could fail).
>>
>> Signed-off-by: Gabriel Goller <g.goller at proxmox.com>
>> ---
>> src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 158 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/match_list.rs b/src/match_list.rs
>> index c5b14e0..acad328 100644
>> --- a/src/match_list.rs
>> +++ b/src/match_list.rs
>> @@ -1,6 +1,6 @@
>> //! Helpers for include/exclude lists.
>> -
>> use bitflags::bitflags;
>> +use std::fmt;
>>
>> use crate::PatternFlag;
>>
>> @@ -39,6 +39,17 @@ impl Default for MatchFlag {
>> }
>> }
>>
>> +#[derive(Debug, PartialEq)]
>> +pub struct FileModeRequiredForMatching;
> Let's shorten this to just `FileModeRequired` ;-)
Agree
>> +
>> +impl fmt::Display for FileModeRequiredForMatching {
>> + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>> + write!(f, "File mode is required for matching")
>> + }
>> +}
>> +
>> +impl std::error::Error for FileModeRequiredForMatching {}
>> +
>> /// A pattern entry. (Glob patterns or literal patterns.)
>> // Note:
>> // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
>> @@ -304,12 +315,32 @@ impl MatchEntry {
>>
>> self.matches_path_exact(path)
>> }
>> +
>> + /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
>> + pub fn matches_path<T: AsRef<[u8]>>(
>> + &self,
>> + path: T,
>> + ) -> Result<bool, FileModeRequiredForMatching> {
>> + self.matches_path_do(path.as_ref())
>> + }
>> +
>> + fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
>> + if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
>> + return Err(FileModeRequiredForMatching);
>> + }
>> +
>> + Ok(self.matches_path_suffix_do(path))
>> + }
>> }
>>
>> #[doc(hidden)]
>> pub trait MatchListEntry {
>> fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> + fn entry_matches_path(
>> + &self,
>> + path: &[u8],
>> + ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>> }
>>
>> impl MatchListEntry for &'_ MatchEntry {
>> @@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
>> None
>> }
>> }
>> +
>> + fn entry_matches_path(
>> + &self,
>> + path: &[u8],
>> + ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
>> + if let Ok(b) = self.matches_path(path) {
> This can just use `?`, it's the exact same error type after all.
> (Also `if let Ok` is generally best avoided since the 'else' branch
> discards the error, and if not, it's often a case like this where it can
> juse use '?' ;-) ).
>
> Effectively this could be as short as
>
> Ok(self.matches_path(path)?.then(|| self.match_type()))
>
>> + if b {
> As an additional hint: when you nest ifs around things you can match,
> you can just include both cases in the patterns:
> match self.matches_path(path) {
> Ok(true) => Ok(Some(self.match_type())),
> Ok(false) => Ok(None),
> Err(err) => Err(err),
> // where this Err() case already tells you that you can use '?' instead
> }
>
>> + Ok(Some(self.match_type()))
>> + } else {
>> + Ok(None)
>> + }
>> + } else {
>> + Err(FileModeRequiredForMatching)
>> + }
>> + }
>> }
>>
>> impl MatchListEntry for &'_ &'_ MatchEntry {
>> @@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
>> None
>> }
>> }
>> +
>> + fn entry_matches_path(
>> + &self,
>> + path: &[u8],
>> + ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> same
>
>> + if let Ok(b) = self.matches_path(path) {
>> + if b {
>> + Ok(Some(self.match_type()))
>> + } else {
>> + Ok(None)
>> + }
>> + } else {
>> + Err(FileModeRequiredForMatching)
>> + }
>> + }
>> }
>>
>> /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
>> @@ -374,6 +435,20 @@ pub trait MatchList {
>> }
>>
>> fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> +
>> + /// Check whether this list contains anything exactly matching the path, returns error if
>> + /// `file_mode` is required for exact matching.
>> + fn matches_path<T: AsRef<[u8]>>(
>> + &self,
>> + path: T,
>> + ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
>> + self.matches_path_do(path.as_ref())
>> + }
>> +
>> + fn matches_path_do(
>> + &self,
>> + path: &[u8],
>> + ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>> }
>>
>> impl<'a, T> MatchList for T
>> @@ -408,6 +483,24 @@ where
>>
>> None
>> }
>> +
>> + fn matches_path_do(
>> + &self,
>> + path: &[u8],
>> + ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> Given the amount of tiny match helpers we run through with those 2
> traits already I wonder if we should just make a breaking change here
> instead and only have the versions with the `Result` while users (or
> defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
> (since a mode of 0 should match anything ;-) ).
>
> But I don't have any strong feelings about this, so either way is fine
> with me.
Hmm, I don't get what you mean by the `versions with the Result`?
Do you mean we should just modify the `matches()` and `matches_exact()`
function to
return a `Result` if the mode is `ANY_FILE_TYPE` and we need one (a file
mode) to match?
>> + // This is an &self method on a `T where T: 'a`.
>> + let this: &'a Self = unsafe { std::mem::transmute(self) };
>> +
>> + for m in this.into_iter().rev() {
>> + if let Ok(mt) = m.entry_matches_path(path) {
> IIRC the intention was actually to immediately fail if we hit a pattern
> with a file mode, since we wouldn't be able to tell if it would already
> exclude the file, otherwise we really *could* just skip this entirely
> and have a failing stat() call just use `Some(!0)` as file mode.
>
> Basically, if the user runs into an inaccessible file they have to
> append `--exclude=/that/one/file` to the CLI invocation to fix it,
> whereas otherwise they'd still get an error.
>
> So this should just be
>
> if let Some(mt) = m.entry_matches_path(path)? {
> return Some(mt);
> }
Hmm, my solution would have been to traverse all the patterns with a
`ANY_FILE_TYPE` mode and
if there is a match, nice, return. If there is no match, check if there
are other patterns without
the `ANY_FILE_TYPE` mode (then we need to fail because we don't know the
`file_mode` yet...) or
return nothing if there are no patterns left. I don't know if this is
the correct approach though...
>> + if mt.is_some() {
>> + return Ok(mt);
>> + }
>> + }
>> + }
>> +
>> + Err(FileModeRequiredForMatching)
> Also this is wrong. If nothing matches, nothing matches ;-) (just like
> in the other matching variants).
> Which immediately tells you that skipping the error above doesn't make
> much sense :-) )
Yes, you are right, corrected it.
More information about the pbs-devel
mailing list