[pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path

Wolfgang Bumiller w.bumiller at proxmox.com
Thu Jan 9 12:06:40 CET 2025


On Thu, Jan 09, 2025 at 10:56:16AM +0100, Max Carrara wrote:
> On Wed Jan 8, 2025 at 3:05 PM CET, Wolfgang Bumiller wrote:
> > On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> > > +my sub _path_file_suffixes : prototype($) {
> > > +    my ($file_name_no_prefix) = @_;
> > > +
> > > +    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> > > +
> > > +    # Suffixes are extracted "manually" because join()ing the result of split()
> > > +    # results in a different file name than the original. Let's say you have a
> > > +    # file named "foo.bar.". The correct suffixes would be ("bar", "").
> > > +    # With split, you get the following:
> > > +    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
> > > +    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> > > +    my @suffixes = ();
> > > +    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> > > +	my $suffix = $1;
> > > +	$suffix =~ s|^\.||;
> > > +	push(@suffixes, $suffix);
> > > +    }
> > > +
> > > +    return @suffixes;
> > > +}
> > > +
> > > +=head3 path_file_suffixes($path)
> > > +
> > > +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> > > +not have a valid file name, an empty list is returned instead.
> > > +
> > > +In scalar context, returns a reference to a list.
> >
> > (^ Isn't this a bit awkward?)
> 
> Well, do you think it is? I've been using that kind of "return style"
> here and there and have been liking it, but if it's weird to others, I
> can adapt it. :P

If you remove the distinction and always return the list, the scalar
context would give you the final element, which would be "the extension"
('gz', not 'tar.gz' :P), which would also be a nice shortcut for what
I'd anticipate to be one of the most common(?) uses.

> 
> >
> > > +
> > > +The suffixes of a path are essentially the file name's extensions, the parts
> > > +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> > > +
> > > +    my @suffixes = path_file_suffixes("/etc/resolv.conf");
> > > +    # ("conf")
> > > +
> > > +    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> > > +    # ["tar", "zst"]
> > > +
> > > +Throws an exception if C<$path> is C<undef>.
> > > +
> > > +=cut
> > > +
> > > +sub path_file_suffixes : prototype($) {
> > > +    my ($path) = @_;
> > > +
> > > +    croak "\$path is undef" if !defined($path);
> > > +
> > > +    my $file_name = path_file_name($path);
> > > +    if (!defined($file_name)) {
> > > +	return wantarray ? () : [];
> > > +    }
> > > +
> > > +    (undef, $file_name) = _path_file_prefix($file_name);
> > > +
> > > +    my @suffixes = _path_file_suffixes($file_name);
> > > +
> > > +    return wantarray ? @suffixes : \@suffixes;
> > > +}
> > > +
> > > +=head3 path_with_file_suffixes($path, @suffixes)
> > > +
> > > +Returns C<$path> with new C<@suffixes>. This is similar to
> > > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > > +except that the suffixes of the file name are replaced.
> > > +
> > > +If the C<$path> does not have a file name, C<undef> is returned.
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> > > +    # /tmp/archive.pxar.gz
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> > > +    # /tmp/archive.gz
> > > +
> > > +If the file name has no suffixes, the C<@suffixes> are appended instead:
> > > +
> > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> > > +    # /etc/resolv.conf
> > > +
> > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> > > +    # /etc/resolv.conf.zst
> > > +
> > > +If there are no C<@suffixes> provided, the file name's suffixes will
> > > +be removed (if there are any):
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> > > +    # /tmp/archive
> > > +
> > > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> > > +    # /tmp/archive....zst
> > > +
> > > +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> > > +if any suffix contains a path separator (C</>) or a C<.>.
> > > +
> > > +=cut
> > > +
> > > +sub path_with_file_suffixes : prototype($@) {
> > > +    my ($path, @suffixes) = @_;
> >
> > I am questioning a bit the sanity of having "suffixes" throughout this
> > module instead of simply an "extension" that covers them all and can be
> > split if needed...
> >
> > Do we have/anticipate particular use cases where this is more
> > convenient?
> 
> To be really honest, it was 50/50 on the "extension" vs "suffixes"
> decision. I then opted for suffixes instead, because they seemed to be
> less ambiguous than an "extension". Let me elaborate.
> 
> Let's say we have a file called "foo.tar.gz" -- what would its extension
> be? Some might say that ".tar.gz" is its extension, while some others
> might say ".gz"; both have different reasonings but aren't necessarily
> more or less correct than the other.
> 
> In our case, both you and I would say that ".tar.gz" is the extension,
> but...
> 
> Rust's std::path::Path::extension [1] would return "gz" for the above,
> while C++'s std::filesystem::path::extension [2] returns ".gz" (but they
> don't have such a case in their docs!). Java has java.nio.file.Path, but
> conveniently doesn't care about extensions. Kotlin fixest his by adding
> an "extension" property to that class, which returns "gz" [3].
> 
> Python's pathlib is the only one I've seen go the "suffix route" and
> provides the pathlib.PurePath.suffix and .suffixes methods [4]. This
> type of approach also came up on the (Rust) tracking issue for
> Path::file_prefix [5].
> 
> When I was looking up all that, I decided to go for the prefix +
> suffix(es) route, as that just seemed to be the most unambiguous.
> Convenience wasn't really a factor here, because there's the
> "path_with_file_* family" of functions that should handle most of the
> replacement cases.
> 
> (Besides, I found it quite nice that I could join() on the prefix +
> suffixes (e.g. join(".", "foo", "tar", "gz")) that I first extracted
> with path_file_prefix() and path_file_suffixes(); I like that there's an
> "inverse" operation.)
> 
> *But,* I wouldn't be opposed to adding a function that just returns
> "tar.gz" for the above case. Perhaps with a different name though :P
> 
> [1] https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
> [2] https://en.cppreference.com/w/cpp/filesystem/path/extension
> [3] https://github.com/JetBrains/kotlin/blob/rrr/2.1.0/core-docs/libraries/stdlib/jdk7/src/kotlin/io/path/PathUtils.kt#L46
> [4] https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
> [5] https://github.com/rust-lang/rust/issues/86319

TBH I wouldn't care if it returned "tar.gz", ".tar.gz", "gz" or ".gz"
as long as it's documented ;-).

But yeah, I guess suffixes makes sense - "forces"(🤪) you to not ignore
it in your code...




More information about the pve-devel mailing list