[pve-devel] [PATCH ha-manager 9/9] manager: make service node usage computation more granular

Daniel Kral d.kral at proxmox.com
Fri Oct 17 17:59:01 CEST 2025


On Fri Oct 17, 2025 at 2:42 PM CEST, Fiona Ebner wrote:
> Am 30.09.25 um 4:21 PM schrieb Daniel Kral:
>> The $online_node_usage is built on every call to manage(...) now, but
>> can be reduced to only be built on any scheduler mode change (including
>> initialization or error path to be complete).
>> 
>> This allows recompute_online_node_usage(...) to be reduced to
>> adding/removing nodes whenever these become online or are not online
>> anymore and handle the service usage updates whenever these change.
>> Therefore, recompute_online_node_usage(...) must only be called once in
>> manage(...) after $ns was properly updated.
>> 
>> Note that this makes the ha-manager not acknowledge any hotplug changes
>> to the guest configs anymore as long as the HA resource state doesn't
>> change.
>
> I'm not comfortable with that to be honest, because it would not just be
> a very badly timed large change that can lead to unexpected decisions,
> but an accumulation of smaller changes without any bad timing.
>
>> 
>> Signed-off-by: Daniel Kral <d.kral at proxmox.com>
>> ---
>> If we go for this patch, then we would need some mechanism to update the
>> static usage for a single or all HA resources registered in
>> $online_node_usage at once (or just rebuild $online_node_usage at that
>> point..).
>
> You mean triggered from qemu-server/pve-container upon update? In
> combination with that it would be acceptable I think. Question is, do we
> want to spend even more time optimizing the static scheduler, or just
> apply a v2 without patch 9/9 and rather focus on getting a PSI-based
> scheduler going?

Right, the patch was more of a leftover from an initial approach, but I
still wanted to get feedback on whether there's any benefit to doing it
that way. In hindsight, it probably only adds unnecessary complexity and
might even turn out to be overhead in the long run, which could
introduce weird bugs.

Especially since the performance is now very acceptable, I don't see a
reason to optimize this further until a better reason for it comes up.



