[pdm-devel] RFC: Synchronizing configuration changes across remotes

Tue Feb 4 11:34:25 CET 2025

On 2/3/25 18:02, Thomas Lamprecht wrote:
> Yeah, digest is not giving you anything here, at least for anything that
> consists of more than one change; and adding a dedicated central API
> endpoint for every variant of batch update we might need seems neither
> scalable nor like good API design.

Yes. I did consider adding an endpoint for getting / setting the whole
SDN configuration at some point, but I scrapped that since it's
unnecessary for what I'm currently implementing (adding single
zones / vnets / ...).

> Does it really require sweeping changes? I'd think modifications are
> already hedging against concurrent access now, so this should not mean
> we change to a completely new edit paradigm here.

We'd at least have to touch every non-read request in SDN to check for
the global lock - but yes, the wording is a bit overly dramatic. We
already have an existing lock_sdn_config, so adding another layer of
locking there shouldn't be an issue. If I decide to go for the .new
config route described below, this will be a bit more involved though.

> My thoughts when we talked was to go roughly for:
> Add a new endpoint that 1) ensures basic healthiness and 2) registers a
> lock for the whole, or potentially only some parts, of the SDN stack.
> This should work by returning a lock-cookie random string to be used by
> subsequent calls to do various updates in one go while ensuring nothing
> else can do so or just steal our lock.  Then check this lock centrally
> on any write-config and be basically done I think?

That was basically what I envisioned as the implementation for the
lock too.
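
Roughly, and only as an illustrative sketch (Python for brevity;
`SdnLock` and its methods are hypothetical names, not existing PVE or
PDM API, and the real lock state would have to live somewhere
cluster-wide rather than in process memory):

```python
import secrets
import threading


class SdnLock:
    """Hypothetical in-memory stand-in for a global SDN config lock."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._cookie = None  # None means nobody holds the global lock

    def _matches(self, cookie):
        # Constant-time comparison so the cookie cannot be guessed via timing.
        return (
            self._cookie is not None
            and cookie is not None
            and secrets.compare_digest(self._cookie, cookie)
        )

    def acquire(self):
        """Register the lock; return its cookie, or None if already held."""
        with self._mutex:
            if self._cookie is not None:
                return None
            self._cookie = secrets.token_hex(16)
            return self._cookie

    def check(self, cookie):
        """To be called centrally before every write-config."""
        with self._mutex:
            if self._cookie is None:
                return True  # unlocked: ordinary single edits still work
            return self._matches(cookie)

    def release(self, cookie):
        """Drop the lock, but only for the caller that holds the cookie."""
        with self._mutex:
            if not self._matches(cookie):
                return False
            self._cookie = None
            return True
```

Every non-read SDN request would then call something like `check()`
with the cookie the client passed along (if any) before touching the
config.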

> A slightly more elaborate variant might be to also split the edit step,
> i.e.
> 1. check all remotes and get lock
> 2. extend the config(s) with a section (or a separate ".new" config) for
>    pending changes, write all new changes to that.
> 3. commit the pending sections or .new config file.
> 
> With that you would have the smallest possibility for failure due to
> unrelated node/connection hiccups and reduce the time gap for actually
> activating the changes. If something is off, an admin could even manually
> apply these directly on the cluster/nodes.

This sounds like an even better idea, I'll look into how I could
implement that. As a first step, I think I'll simply go for the
lock-cookie approach, since we can always implement this more elaborate
approach on top of that.
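
For the ".new" variant, a minimal sketch of the stage / commit split
could look like the following. The function names are made up for
illustration, and it assumes a plain POSIX filesystem where a rename
atomically replaces the target; I have not checked how well that maps
onto pmxcfs:

```python
import os


def stage_pending(config_path, new_content):
    """Write pending changes next to the live config without activating them."""
    pending_path = config_path + ".new"
    with open(pending_path, "w") as f:
        f.write(new_content)
        f.flush()
        os.fsync(f.fileno())  # make sure the staged data hit the disk
    return pending_path


def commit_pending(config_path):
    """Activate staged changes; returns False if nothing was staged."""
    pending_path = config_path + ".new"
    if not os.path.exists(pending_path):
        return False
    os.replace(pending_path, config_path)  # atomic swap on POSIX filesystems
    return True


def discard_pending(config_path):
    """Drop staged changes, leaving the live config untouched."""
    pending_path = config_path + ".new"
    if os.path.exists(pending_path):
        os.remove(pending_path)
        return True
    return False
```

PDM would stage on all remotes first and only commit once every remote
has reported its pending config as written, which keeps the window
where remotes can diverge as small as possible.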

>> * In case of failures on the PDM side it is harder to recover, since
>> it requires manual intervention (removing the lock manually).
> 
> Well, a partially rolled out SDN update might always be (relatively)
> hard to recover from; which approach would avoid that (and not require
> paxos, or raft level guarantees)?

One idea that came to my mind was automatic rollback after a timeout if
some health check on the PVE side fails, similar to when you change
resolution in a graphics driver.
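
As a very rough sketch of that idea (all names here are hypothetical;
what the health check would actually test on the PVE side is still
open):

```python
import time


def apply_with_rollback(apply, rollback, is_healthy, timeout=30.0,
                        interval=1.0):
    """Apply a change, then revert it unless a health check passes in time.

    Returns True if the change was kept, False if it was rolled back.
    """
    apply()
    deadline = time.monotonic() + timeout
    while True:
        if is_healthy():
            return True  # confirmed: keep the new configuration
        if time.monotonic() >= deadline:
            break  # no confirmation within the timeout
        time.sleep(interval)
    rollback()
    return False
```

Here `apply` and `rollback` would be whatever activates the new SDN
configuration and restores the previous one, respectively.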

> FWIW, we already got pmxcfs backed domain locks, which I added for the
> HA stack back in the day. These make it relatively cheap to take a lock
> that only one pmxcfs instance (i.e., one node) at a time can hold.  Pair
> that with some local lock (e.g., flock, in single-process, many threads
> Rust land it could be an even cheaper mutex) and you can quite simply
> and not too expensively lock edits – and I'd figure SDN modifications do
> not have _that_ high of a frequency to make performance here too critical
> for such locking to become a problem.

I'll look into those - thanks for the pointer.