[pve-devel] [PATCH manager] api: implement node-independent bulk actions
Michael Köppl
m.koeppl at proxmox.com
Fri Apr 4 16:14:48 CEST 2025
On 4/3/25 16:12, Dominik Csapak wrote:
> To achieve this, start a worker task and use our generic api client
> to start the tasks on the relevant nodes. The client always points
> to 'localhost' so we let the pveproxy worry about the proxying etc.
>
> We reuse some logic from the startall/stopall/etc. calls, like getting
> the ordered guest info list. For that to work, we must convert some of
> the private subs into proper subs. We also fix handling loading configs
> from other nodes.
>
> In each worker, for each task, we check if the target is in the desired
> state (e.g. stopped when wanting to start, etc.). If that iss the case,
> start the task and put the UPID in a queue to check. This is done until
> the parallel count is at 'max_workers', at which point we wait until at
> least one task is finished before starting the next one.
>
> Failures (e.g. task errors or failure to fetch the tasks) are printed,
> ant the vmid is saved and they're collectively printed at the end for
> convenience.
>
> Special handling is required for checking the permissions for suspend:
> We have to load the config of the VM and find the target state storage.
> We can then check the privileges for that storage.
>
> Further improvements can be:
> * filters (I'd prefer starting out with front end filters)
> * failure mode resolution (I'd wait until someone requests that)
> * token handling (probably not necessary since we do check the
> permissions upfront for the correct token.)
> Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
I tested start, suspend, shutdown, migrate with a 3-node cluster with
multiple VMs and CTs. I tested with both ticket and token authentication
and tried the following scenarios:
- Start (with no vms parameter, selection of VMIDs, and with empty
string vor vms)
- Shutdown (vms parameter as above, force-stop on and off)
- Suspend (vms parameter as above, to-disk on and off)
- Migrate (vms parameter as above, online on and off, with-local-disks
on and off)
All of the above I also tried with varying settings for max-workers.
The tested cases seem to work as expected and I find the API intuitive.
I like it!
I added 3 comments inline
> +++ b/PVE/API2/Cluster/BulkAction/Guest.pm
> @@ -0,0 +1,701 @@
> +package PVE::API2::Cluster::BulkAction::Guest;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::APIClient::LWP;
> +use PVE::AccessControl;
> +use PVE::Cluster;
> +use PVE::Exception qw(raise raise_perm_exc raise_param_exc);
> +use PVE::INotify;
> +use PVE::JSONSchema qw(get_standard_option);
> +use PVE::QemuConfig;
> +use PVE::QemuServer;
> +use PVE::RESTEnvironment qw(log_warn);
> +use PVE::RESTHandler;
> +use PVE::RPCEnvironment;
> +use PVE::Storage;
> +use PVE::Tools qw();
> +
> +use PVE::API2::Nodes;
> +
> +use base qw(PVE::RESTHandler);
> +
> +
> +__PACKAGE__->register_method ({
> + name => 'index',
> + path => '',
> + method => 'GET',
> + description => "Bulk action index.",
> + permissions => { user => 'all' },
> + parameters => {
> + additionalProperties => 0,
> + properties => {},
> + },
> + returns => {
> + type => 'array',
> + items => {
> + type => "object",
> + properties => {},
> + },
> + links => [ { rel => 'child', href => "{name}" } ],
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + return [
> + { name => 'start' },
> + { name => 'shutdown' },
> + { name => 'migrate' },
> + ];
> + }});
'suspend' seems to be missing here.
> +__PACKAGE__->register_method({
> + name => 'shutdown',
> + path => 'shutdown',
> + method => 'POST',
> + description => "Bulk shutdown all guests on the cluster.",
> + permissions => {
> + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for "
> + ."each ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this comma separated list of VMIDs.",
> + type => 'string', format => 'pve-vmid-list',
> + optional => 1,
> + },
> + timeout => {
> + description => "Default shutdown timeout in seconds if none is configured for the guest.",
> + type => 'integer',
> + default => 180,
> + optional => 1,
> + },
> + 'force-stop' => {
> + description => "Makes sure the Guest stops after the timeout.",
> + type => 'boolean',
> + default => 1,
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my $vmlist = [PVE::Tools::split_list($param->{vms})];
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, [ 'VM.PowerMgmt' ]);
> +
> + my $code = sub {
> + my $startlist = PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $param->{vms});
> +
> + if (scalar($vmlist->@*)) {
> + print STDERR "Shutting down guests: " . join(', ', $vmlist->@*) . "\n";
> + } else {
> + print STDERR "Shutting down all guests\n";
> + }
nit: the above part is used (with varying permissions that are required)
in multiple subroutines. I think a helper function for the initial setup
of each subroutine and checking the required permissions would make sense.
> +
> + # reverse order for shutdown
> + for my $order (keys $startlist->%*) {
> + my $list = delete $startlist->{$order};
> + $order = $order * -1;
> + $startlist->{$order} = $list;
> + }
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> +
> + my $status = eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") };
> + if (defined($status) && $status->{status} ne 'running') {
> + print STDERR "Skipping $type_text $vmid, not running.\n";
> + return 1;
> + }
> +
> + if (defined($status) && $status->{qmpstatus} eq 'paused' && !$param->{'force-stop'}) {
> + log_warn("Skipping $type_text $vmid, resume paused VM before shutdown.\n");
> + return 1;
> + }
Since there is no check if $status->{qmpstatus} is defined, I get a "Use
of uninitialized value" error printed to the log when running shutdown
with containers.
More information about the pve-devel
mailing list