[pve-devel] [RFC cluster 0/2] fix #4886: improve SSH handling

Fabian Grünbichler f.gruenbichler at proxmox.com
Tue Jan 9 09:57:30 CET 2024


> Esi Y via pve-devel <pve-devel at lists.proxmox.com> wrote on 09.01.2024 06:01 CET:
> On Thu, Dec 21, 2023 at 10:53:11AM +0100, Fabian Grünbichler wrote:
> > RFC since this would be a bigger change in how we approach intra-cluster
> > SSH access.
> > 
> > there are still a few parts that currently don't use SSHInfo, but
> > would need to be switched over if we want to pursue this approach:
> > 
> > - get_vnc_connection_info in PVE::API2::Nodes
> > - 'upload' API endpoint in PVE::API2::Storage::Status
> > - SSH proxy in pvesh
> > 
> > these changes would need to happen coordinated with the patches from
> > this RFC series!
> 
> Not necessarily.

if we do the unmerge as well, then yes - otherwise a node with unmerged known hosts would fail to connect to nodes that joined the cluster after the unmerge.
  
> > next steps afterwards:
> > - unmerge known hosts in `pvecm updatecerts`, instead of merging
> > -- to disentangle regular ssh from intra-cluster SSH
> 
> Both of these could be accomplished with codebase/complexity reduction in an approach to:

I am not sure what "both" means here? there is only a single thing quoted ;)

> 1.  eliminate shared ssh config and key files, i.e. completely remove any need for symlinks; and

that's the evaluate part further below.

> 2.  instead initialise each node at join (first one at cluster creation) into their respective node-local files with ssh certs; whilst

versus just copying the host key, which is far simpler?

> 3.  keeping the setup maintenance-free, since any single key addition/refresh does not need to propagate any individual keys around the cluster; meanwhile

yes it does, changing the key would entail revoking the old one (and distributing that!) and signing the new one.

> 4.  no requirement for additional -o options;

we already need -o options anyway, so there is no downside to adding additional ones

> 5.  no requirement for sshd config appends, just drop-ins;

there's no need for sshd config changes either with the patches here?

> 6.  existing X509 CA key can be reused for ssh PKI as well.

that might not actually be the best of ideas ;)

> > -- to allow `ssh-keygen -f .. -R ..` to work properly again
> 
> Will always work with local-only files. In the other approach, the -R will still not work with a file stored on pmxcfs.

it will - the -R works with a file stored on pmxcfs, just not with a symlink, no matter where it points. which is why the next step above lists unmerging the known hosts, to get rid of the symlink.
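
to illustrate the symlink point, a minimal sketch. it assumes a tool that rewrites a file by writing a temporary file next to it and renaming that over the original path; whether ssh-keygen does exactly this is an assumption here, and all file names and key strings are made up. the rename replaces the directory entry (the symlink itself), so the shared target keeps the stale entry:

```python
import os
import tempfile


def rewrite_via_rename(path, drop_host):
    """Drop all lines for `drop_host` from `path` using write-temp-then-rename.

    Hypothetical helper, only meant to model the general rewrite pattern.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as f:  # reading follows the symlink
        kept = [line for line in f if line.split(" ", 1)[0] != drop_host]
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.writelines(kept)
    os.rename(tmp, path)  # replaces the symlink itself, not its target


def read(path):
    with open(path) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        shared = os.path.join(d, "shared_known_hosts")  # stands in for the pmxcfs file
        link = os.path.join(d, "ssh_known_hosts")       # stands in for the local symlink
        with open(shared, "w") as f:
            f.write("node1 ssh-ed25519 AAAA\nnode2 ssh-ed25519 BBBB\n")
        os.symlink(shared, link)
        rewrite_via_rename(link, "node1")
        return os.path.islink(link), read(shared), read(link)
```

after `demo()` the former symlink is a regular local file without the node1 entry, while the shared file is untouched. pointing the same rewrite directly at the shared file would not have that problem.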
 
> > -- existing keys would still be preserved for not-yet-upgraded nodes, so this
> >    should be do-able without waiting for a major release..
> 
> With the ssh certs, the old (non-cert) keys could be safely left behind in the pmxcfs location, upgraded nodes would append the previously shared content into local files, old nodes would not make use of the new keys until upgraded, but will keep working with the old to the extent they used to work. Universal remedy for any legacy issues would be to upgrade the node.

the same is true for the patches here:
- updated nodes publish their own key, and use published keys if available
- non-updated nodes will still have the symlink and use the "old" merged file
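
roughly, the lookup on an updated node could look like the following sketch. this is not the code from the patches; the directory layout, the file name `ssh_known_hosts` and the function name are assumptions for illustration. only the ssh options themselves (HostKeyAlias, UserKnownHostsFile, GlobalKnownHostsFile) are real:

```python
import os
import tempfile


def known_hosts_opts(node, nodes_dir):
    """Return extra ssh options for connecting to `node`.

    `nodes_dir` and the per-node file name are hypothetical placeholders.
    """
    opts = ["-o", f"HostKeyAlias={node}"]
    published = os.path.join(nodes_dir, node, "ssh_known_hosts")
    if os.path.isfile(published):
        # target node already published its own key: pin to that file only
        opts += [
            "-o", f"UserKnownHostsFile={published}",
            "-o", "GlobalKnownHostsFile=none",
        ]
    # otherwise: no extra options, ssh falls back to the default (merged) file
    return opts


def demo():
    with tempfile.TemporaryDirectory() as nodes_dir:
        os.makedirs(os.path.join(nodes_dir, "new-node"))
        with open(os.path.join(nodes_dir, "new-node", "ssh_known_hosts"), "w") as f:
            f.write("new-node ssh-ed25519 AAAA\n")
        return (
            known_hosts_opts("new-node", nodes_dir),
            known_hosts_opts("old-node", nodes_dir),
        )
```

a non-updated node never runs this lookup and keeps using the symlinked merged file, which is why both kinds of nodes can coexist during an upgrade.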

> 
> > - evaluate whether we want to split out
> > -- the client config (we currently force a cipher order there)
> > -- the client key (could live in /etc/pve/priv instead?)
> > -- or even the sshd instance altogether (would allow not touching the
> >    regular sshd config at all)
> 
> Non-issue in the ssh certs approach.

on the contrary, all of the above are also valid for cert-based auth..

> What's the counter-argument to ssh certs, given the simplicity in comparison with both the old approach and the initially suggested one here?

the one-sentence summary - it doesn't get us closer to where we want to end up (either getting rid of SSH entirely, or fully disentangling PVE-internal SSH use from the regular, default sshd instance), but adds more complexity instead.



