[pve-devel] [PATCH ha-manager 8/9] manager: make online node usage computation granular
Fiona Ebner
f.ebner at proxmox.com
Fri Oct 17 14:32:53 CEST 2025
On 30.09.25 at 4:20 PM, Daniel Kral wrote:
> The HA Manager builds $online_node_usage in every FSM iteration in
> manage(...) and at every HA resource state change in
> change_service_state(...). This becomes quite costly with a high HA
> resource count and a lot of state changes happening at once, e.g.
> starting up multiple nodes with rebalance_on_request_start set or a
> failover of a node with many configured HA resources.
>
> To improve this situation, make the changes to the $online_node_usage
> more granular by building $online_node_usage only once per call to
> manage(...) and changing the nodes a HA resource uses individually on
> every HA resource state transition.
>
> The change in service usage "freshness" should be negligible here as the
> static service usage data is cached anyway (except if the cache fails
> for some reason).
But the cache is refreshed on every recompute_online_node_usage(), which
happened much more frequently before, so the fact that it's cached
doesn't seem like a strong argument here?
I /do/ think there is a real tradeoff being made, namely "the ability to
manage much larger fleets of guests" versus "immediately incorporating
every guest config change in decisions". Config changes that would lead
to wildly different decisions would need to be timed very badly to cause
actual issues and should be rare to begin with. With PSI-based
information, things are less "instant" as well, so I don't see an issue
with moving in the same direction.
>
> Signed-off-by: Daniel Kral <d.kral at proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner at proxmox.com>
> ---
> The add_service_usage(...) helper is added in anticipation for the next
> patch, we don't need a helper if we don't go for #9.
I think it's nice to have regardless. Inlining the function would just
bloat change_service_state(). Or what would the alternative be?
> @@ -314,7 +329,8 @@ my $change_service_state = sub {
> $sd->{$k} = $v;
> }
>
> - $self->recompute_online_node_usage();
> + $self->{online_node_usage}->remove_service_usage($sid);
> + $self->add_service_usage($sid, $sd);
Nice!
>
> $sd->{uid} = compute_new_uuid($new_state);
>
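For illustration, the remove-then-add update in the hunk above can be
modelled roughly as follows. This is only a toy sketch in Python, not the
Perl code from the patch: the OnlineNodeUsage class, nodes_used_by() and
recompute() are hypothetical stand-ins, usage is a plain per-node service
count, and the rule that a migrating service counts against both its
current node and its target is an assumption made for the example.

```python
class OnlineNodeUsage:
    """Toy model: usage is the number of services counted against a node."""

    def __init__(self, online_nodes):
        self.usage = {node: 0 for node in online_nodes}
        # sid -> set of nodes this service is currently counted against
        self.service_nodes = {}

    def add_service_usage(self, sid, nodes):
        # Only online nodes are tracked; others are ignored.
        counted = {n for n in nodes if n in self.usage}
        for n in counted:
            self.usage[n] += 1
        self.service_nodes[sid] = counted

    def remove_service_usage(self, sid):
        # Undo exactly what add_service_usage() recorded for this service.
        for n in self.service_nodes.pop(sid, set()):
            self.usage[n] -= 1


def nodes_used_by(sd):
    """Assumed rule: a service uses its node and, if set, its target."""
    nodes = {sd["node"]}
    if sd.get("target"):
        nodes.add(sd["target"])
    return nodes


def recompute(online_nodes, services):
    """Full rebuild: walks every service, i.e. O(number of services)."""
    usage = OnlineNodeUsage(online_nodes)
    for sid, sd in services.items():
        usage.add_service_usage(sid, nodes_used_by(sd))
    return usage


nodes = ["node1", "node2", "node3"]
services = {
    "vm:100": {"node": "node1", "state": "started"},
    "vm:101": {"node": "node1", "state": "started"},
    "vm:102": {"node": "node2", "state": "started"},
}

# Built once, like once per manage() call.
usage = recompute(nodes, services)

# One service changes state: touch only that service's nodes.
services["vm:100"].update(state="migrate", target="node3")
usage.remove_service_usage("vm:100")
usage.add_service_usage("vm:100", nodes_used_by(services["vm:100"]))
```

In this model the granular update ends up with the same per-node counts
as a full rebuild, while a state change only costs work proportional to
the nodes of the one affected service.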