[pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start

Thomas Lamprecht t.lamprecht at proxmox.com
Wed Jan 29 19:48:22 CET 2025


Am 29.01.25 um 11:51 schrieb Dominik Csapak:
> Sending as RFC, because it's still very rough and I want to get some
> early feedback.
> 
> This series implements an api call 'bulk-start' which is running on
> the pdm itself, that mimics the bulkstart from pve, but without the
> node limitation of pve.
> 
> Does that make sense? Or would it be better to try to implement that
> on pve side? The advantage we have here is that we have an
> external view of the cluster, which means that things like node
> failures, synchronisation, etc. are much easier to handle.

I think we talked off-list about this a while ago, albeit rather casually,
and yes, IMO exposing this on the PVE side would be better: it can be done
more efficiently there, allows better control over the overall active job
count and avoids some oddities. TBH, I'd be surprised if it were easier to
do externally with the same feature set.

Having an external service handle this over a potentially flaky connection
seems much more error-prone to me than going over the LAN that clusters
require.

IMO we should actually avoid having much of this stuff, or dedicated state
(that affects the remotes or their resources), in the PDM directly, because:

1) The more things are handled by the end products, the simpler PDM stays
   (PVE needs some complexity anyway; coupling two complex projects will
   IMO amplify maintenance cost more).
2) It ensures that PVE already provides a powerful feature set on its own.
   I.e., PVE already has a good architecture and is not as limited as
   VMware ESXi, which requires vSphere for relatively simple (from user
   POV, not implementation) things even if they only affect nodes in the
   same LAN, so we should continue to mainly "empower" PVE and plug that
   into PDM.
3) PDM will become relatively complex even when trying to avoid state and
   features implemented only there; all the metrics, tasks, health and SDN
   tracking is already quite a bit to handle if it is to be done well,
   flexibly and powerfully.

> If we'd implement something like this on PVE, there has to be a node
> that has control of the api calls to make (or to schedule something via
> pmxcfs) and that is probably much harder to do there (pmxcfs sync queue)
> or brings some problems with it (node dies in the middle of an api call)

In the simplest architecture it could work the way the SDN reload is
implemented; I'm quite sure that I mentioned that, but I would not bet too
much on my (or most) brain(s), that is.

I.e. a single task on one node that connects to all involved cluster nodes
through the API and creates the respective bulk-tasks for the guests residing
on each node and then polls these. Some generic infrastructure for doing such
things might be nice and would have some reuse between different bulk tasks
and SDN, potentially others in the future.
Switching to an even more efficient channel or method could be done
transparently (from POV of the external user/program of the cluster-wide
bulk-action API), so I'd not worry too much about that now.
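
As a rough sketch of that flow (in Python rather than PVE's Perl, purely for
illustration): `start_bulk_task` and `task_status` are hypothetical stand-ins
for the per-node API calls, not actual PVE endpoints, and `FakeCluster` is an
in-memory dummy that only exists to make the example runnable.

```python
import time
from collections import defaultdict


def group_by_node(guests):
    """Group (node, vmid) pairs into a node -> [vmid, ...] mapping."""
    by_node = defaultdict(list)
    for node, vmid in guests:
        by_node[node].append(vmid)
    return dict(by_node)


def cluster_bulk_start(guests, start_bulk_task, task_status,
                       poll_interval=0.0, max_polls=100):
    """Create one bulk task per involved node, then poll them all.

    start_bulk_task(node, vmids) -> task id   (hypothetical per-node call)
    task_status(node, task_id)   -> "running" or a final status string
    """
    pending, results = {}, {}
    for node, vmids in group_by_node(guests).items():
        try:
            pending[node] = start_bulk_task(node, vmids)
        except ConnectionError as err:
            # An unreachable node is recorded; the other nodes still proceed.
            results[node] = f"unreachable: {err}"

    for _ in range(max_polls):
        if not pending:
            break
        for node, task_id in list(pending.items()):
            status = task_status(node, task_id)
            if status != "running":
                results[node] = status
                del pending[node]
        if pending:
            time.sleep(poll_interval)

    for node in pending:
        results[node] = "timeout"
    return results


class FakeCluster:
    """In-memory stand-in for the per-node API, for illustration only."""

    def __init__(self, polls_until_done, dead_nodes=()):
        self.polls_until_done = polls_until_done
        self.dead_nodes = set(dead_nodes)
        self.polls_left = {}
        self.started = {}

    def start_bulk_task(self, node, vmids):
        if node in self.dead_nodes:
            raise ConnectionError(f"{node} is offline")
        task_id = f"task-{node}"
        self.started[node] = list(vmids)
        self.polls_left[task_id] = self.polls_until_done
        return task_id

    def task_status(self, node, task_id):
        if self.polls_left[task_id] > 0:
            self.polls_left[task_id] -= 1
            return "running"
        return "OK"


cluster = FakeCluster(polls_until_done=2, dead_nodes={"node3"})
guests = [("node1", 100), ("node2", 101), ("node1", 102), ("node3", 103)]
results = cluster_bulk_start(guests, cluster.start_bulk_task,
                             cluster.task_status)
```

The caller only sees the per-node results, so the transport behind the two
callables could later be swapped out without changing this outer loop.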

Besides that, there are (most of the time) fewer points of failure between
nodes than between PDM and nodes, network-wise. If node(s) indeed die in the
middle of an API call, the PDM naturally cannot magically fix that either,
and as node failure is not expected behavior but rather an extraordinary
event, an interrupted bulk-action is not really a big problem there.

In short: let's do this in PVE directly.

> It's very early, so please don't judge the actual api call code just
> now, I'd extend it with failure resolution, polling the task, etc.
> 
> OTOH there is the question if the UI makes sense this way, or if we want
> to combine the 'select to view details' and 'select to do a bulk action'
> into one. Or if we want to do the bulk actions more like in pve with
> a popup that shows the vm list again.
> 
> Dominik Csapak (3):
>   server: pve api: add new bulkstart api call
>   pdm-client: add bulk_start method
>   ui: pve tree: add bulk start action
> 
>  lib/pdm-client/src/lib.rs |   9 ++-
>  server/src/api/pve/mod.rs |  98 +++++++++++++++++++++++++++-
>  ui/src/pve/tree.rs        | 133 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 234 insertions(+), 6 deletions(-)
> 




