[pve-devel] [PATCH manager] pvestatd: improve broadcast of node version-info
Fabian Grünbichler
f.gruenbichler at proxmox.com
Thu Feb 27 15:52:25 CET 2025
On February 27, 2025 9:59 am, Fiona Ebner wrote:
> On 26.02.25 at 17:02, Aaron Lauterer wrote:
>>
>>
>> On 2025-01-17 13:18, Fiona Ebner wrote:
>>> On 16.01.25 at 17:30, Aaron Lauterer wrote:
>>>> Until now, pvestatd broadcast the pve-manager version only once,
>>>> after startup of the service. But there are situations where the
>>>> local pmxcfs (pve-cluster) restarts and loses that information:
>>>> basically every time we restart pmxcfs without also restarting
>>>> pvestatd.
>>>>
>>>> For example, on a cluster join, or if the pmxcfs has been restarted
>>>> manually.
>>>>
>>>> By additionally checking if the local kv-store of the pmxcfs has any
>>>> version info for the node, we can decide if another broadcast is
>>>> necessary.
>>>> Therefore, after the next run of pvestatd, we should have the full
>>>> version info available again.
>>>>
>>>> Signed-off-by: Aaron Lauterer <a.lauterer at proxmox.com>
>>>> ---
>>>> This patch is preparation for getting reliable version info, as I am
>>>> picking up Folke's patch series to include more metrics in the RRD
>>>> data and summary graphs. [0]
>>>> This was a big blocker, and now, with the major version change coming
>>>> up, we can at least assume that the latest 8.x is installed as part of
>>>> the update to PVE 9.
>>>> Therefore, we should get this in with PVE 8. Additional patches for PVE
>>>> 8 will follow to make the transition smoother. But as mentioned, this
>>>> is one of the things that needs to work reliably, which is why I am
>>>> submitting the patch now already.
>>>
>>> If we start relying more on this, we likely also want:
>>> https://lore.proxmox.com/pve-devel/20221006125414.58279-1-f.ebner@proxmox.com/
>>
>> Hmm, honestly, I might prefer having the last known version info still
>> present. That would make it easier to determine if all cluster nodes are
>> on at least a required version ;).
>
> That is an edge case where it might be useful, but I'd argue that in
> general, it can be problematic to rely on stale information, especially
> if you can't detect if it's stale or not. And IMHO, it's worth doing
> properly here too, i.e. wait for the node to send its current version.
> You already need to wait for nodes that were not online before.

We could make it detectable by including a timestamp. That way, the
consumer of the information can decide whether using stale information
is okay, instead of us only allowing one variant or the other.
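
To illustrate the idea, here is a rough sketch. It is written in Python
rather than Perl and is not based on the actual kv-store code; the JSON
layout, the field names "version" and "timestamp", and both helper
functions are assumptions made up for this example.

```python
import json
import time


def make_version_entry(version, now=None):
    """Build the payload a node would broadcast (hypothetical format)."""
    ts = int(now if now is not None else time.time())
    return json.dumps({"version": version, "timestamp": ts})


def read_version(raw, max_age=None, now=None):
    """Return the stored version, or None if it is missing or too old.

    max_age is in seconds. max_age=None means this consumer is fine
    with stale information and always takes the last known version.
    """
    if raw is None:
        return None
    entry = json.loads(raw)
    if max_age is not None:
        current = now if now is not None else time.time()
        if current - entry["timestamp"] > max_age:
            return None  # too old for this consumer, treat as unknown
    return entry["version"]
```

A consumer that only needs "last known version" would call
read_version(raw), while one that must not act on stale data would pass
a max_age and wait for a fresh broadcast when it gets None back.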