[pve-devel] [PATCH v9 storage 1/1] status: add download_url method

Thomas Lamprecht t.lamprecht at proxmox.com
Mon Jun 21 09:29:57 CEST 2021


On 16.06.21 11:35, Lorenz Stechauner wrote:
> uses common function PVE::Tools::download_file_from_url to download
> iso files.
> 
> Only users with permissions `Sys.Audit` and `Sys.Modify` on `/` are
> permitted to perform this action. This restriction is due to the
> fact, that the download function is able to download files from
> internal networks (which are not visible/accessible from outside).
> Users with these permissions anyway have the means to alter node
> (network) config, so this does not create any further security risk.
> 
> Signed-off-by: Lorenz Stechauner <l.stechauner at proxmox.com>
> ---
>  PVE/API2/Storage/Status.pm | 128 +++++++++++++++++++++++++++++++++++--
>  PVE/Storage.pm             |  10 +++
>  2 files changed, 133 insertions(+), 5 deletions(-)
> 
> diff --git a/PVE/API2/Storage/Status.pm b/PVE/API2/Storage/Status.pm
> index 897b4a7..584ba5d 100644
> --- a/PVE/API2/Storage/Status.pm
> +++ b/PVE/API2/Storage/Status.pm
> @@ -412,11 +412,7 @@ __PACKAGE__->register_method ({
>  	my $size = -s $tmpfilename;
>  	die "temporary file '$tmpfilename' does not exist\n" if !defined($size);
>  
> -	my $filename = $param->{filename};
> -
> -	chomp $filename;
> -	$filename =~ s/^.*[\/\\]//;
> -	$filename =~ s/[^-a-zA-Z0-9_.]/_/g;
> +	my $filename = PVE::Storage::normalize_content_filename($param->{filename});

nit/no hard feelings: factoring out normalize_content_filename could have been it's own
patch. IMO for re-using existing code that is nicer, when adding new code they can be
one patch too, if so small like here, as then it can be seen as part of the "single thing"
that patch does.

>  
>  	my $path;
>  
> @@ -497,4 +493,126 @@ __PACKAGE__->register_method ({
>  	return $upid;
>     }});
>  
> +__PACKAGE__->register_method({
> +    name => 'download_url',
> +    path => '{storage}/download-url',

you forgot to add this path to the index endpoint, making it basically a hidden
method..

We probably should assert this in the schema verifier to avoid, as the file-restore
from Stefan also forgot this.

> +    method => 'POST',
> +    description => "Download templates and ISO images by using an URL.",
> +    proxyto => 'node',
> +    permissions => {
> +	check => [ 'and',
> +	    ['perm', '/storage/{storage}', [ 'Datastore.AllocateTemplate' ]],
> +	    ['perm', '/', [ 'Sys.Audit', 'Sys.Modify' ]],
> +	],
> +    },
> +    protected => 1,
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    storage => get_standard_option('pve-storage-id'),
> +	    url => {
> +		description => "The URL to download the file from.",
> +		type => 'string',
> +		pattern => 'https?://.*',
> +	    },
> +	    content => {
> +		description => "Content type.",
> +		type => 'string', format => 'pve-storage-content',
> +	    },
> +	    filename => {
> +		description => "The name of the file to create.",

does not mention that it will be normalized, i.e., possibly altered. IMO this is important
as else the user may not find their file again if looking for the original name.

I'd also rather limit this to what we can actually return in the content API call, I mean you
already enforce a implicit limit in the code, so why not encode the union of the allowed
file endings below? ideally using factored out, shared regexes for the ending.


> +		type => 'string',
> +	    },
> +	    checksum => {
> +		description => "The expected checksum of the file.",
> +		type => 'string',
> +		requires => 'checksum-algorithm',
> +		optional => 1,
> +	    },
> +	    'checksum-algorithm' => {
> +		description => "The algorithm to calculate the checksum of the file.",
> +		type => 'string',
> +		enum => ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512'],
> +		requires => 'checksum',
> +		optional => 1,
> +	    },
> +	    'verify-certificates' => {
> +		description => "If false, no SSL/TLS certificates will be verified.",
> +		type => 'boolean',
> +		optional => 1,
> +		default => 1,
> +	    }
> +	},
> +    },
> +    returns => {
> +	type => "string"
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $cfg = PVE::Storage::config();
> +
> +	my ($node, $storage) = $param->@{'node', 'storage'};
> +	my $scfg = PVE::Storage::storage_check_enabled($cfg, $storage, $node);
> +
> +	die "can't upload to storage type '$scfg->{type}, not a file based storage!'\n" if !defined($scfg->{path});
> +
> +	my ($content, $url) = $param->@{'content', 'url'};
> +
> +	my $filename = PVE::Storage::normalize_content_filename($param->{filename});
> +	my $path;
> +
> +	# MIME type is checked in front end only
> +	# this check is omitted here intentionally and replaced by file extension check
> +	if ($content eq 'iso') {
> +	    if ($filename !~ m![^/]+$PVE::Storage::iso_extension_re$!) {
> +		raise_param_exc({ filename => "missing '.iso' or '.img' extension" });
> +	    }
> +	    $path = PVE::Storage::get_iso_dir($cfg, $storage);
> +	} elsif ($content eq 'vztmpl') {
> +	    if ($filename !~ m![^/]+\.tar\.[gx]z$!) {

we npwadays also support .zst (zstd), and .vma is missing too, please just avoid reinventing
regexes, especially if not really closly checking what it should cover, rather use an existing
one, which sometimes needs to be factored out first, `path_to_volume_id`

> +		raise_param_exc({ filename => "missing '.tar.gz' or '.tar.xz' extension" });
> +	    }
> +	    $path = PVE::Storage::get_vztmpl_dir($cfg, $storage);
> +	} else {
> +	    raise_param_exc({ content => "upload content type '$content' not allowed" });
> +	}
> +
> +	die "storage '$storage' does not support '$content' content\n" if !$scfg->{content}->{$content};

"does not support" is not necesarrily true, the storage could support it but an admin
could have disabled it, or forgot to enable it, so rather something along the lines of:

"storage '$storage' is not configured for content-type '$content'\n" if !$scfg->{content}->{$content};

> +
> +	PVE::Storage::activate_storage($cfg, $storage);
> +	File::Path::make_path($path);
> +
> +	my $dest = "$path/$filename";
> +
> +	my $opts = {
> +	    hash_required => 0,
> +	    verify_certificates => $param->{'verify-certificates'} // 1,
> +	};
> +
> +	my ($checksum, $checksum_algorithm) = $param->@{'checksum', 'checksum-algorithm'};
> +	if ($checksum) {
> +	    $opts->{"${checksum_algorithm}sum"} = $checksum;
> +	    $opts->{hash_required} = 1;
> +	}
> +
> +	my $dccfg = PVE::Cluster::cfs_read_file('datacenter.cfg');
> +	if ($dccfg->{http_proxy}) {
> +	    $opts->{http_proxy} = $dccfg->{http_proxy};
> +	}
> +
> +	my $worker = sub {
> +	    my $upid = shift;

unused variable.

> +	    PVE::Tools::download_file_from_url($dest, $url, $opts);
> +	};
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $user = $rpcenv->get_user();

nit and not too hard feelings, but basically all existing API methods have above two
lines rather at the start of the  API endpoint's code definition.

> +
> +	my $upid = $rpcenv->fork_worker('download', $filename, $user, $worker);

> +
> +	return $upid;
> +    }});
> +
>  1;
> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
> index aa36bad..f9a3d49 100755
> --- a/PVE/Storage.pm
> +++ b/PVE/Storage.pm
> @@ -1929,4 +1929,14 @@ sub assert_sid_unused {
>      return undef;
>  }
>  

the method below is small, but a small comment about what normalize means here
could still be nice as regex can change their meaning a lot by subtle changes.

> +sub normalize_content_filename {
> +    my ($filename) = @_;
> +
> +    chomp $filename;
> +    $filename =~ s/^.*[\/\\]//;
> +    $filename =~ s/[^-a-zA-Z0-9_.]/_/g;
> +
> +    return $filename;
> +}
> +
>  1;
> 






More information about the pve-devel mailing list