[pve-devel] [PATCH cluster 2/4] fix #4886: SSH: pin node's host key if available

Fabian Grünbichler f.gruenbichler at proxmox.com
Tue Jan 16 10:00:10 CET 2024


> Esi Y via pve-devel <pve-devel at lists.proxmox.com> wrote on 15.01.2024 15:31 CET:
> On Mon, Jan 15, 2024 at 12:51:48PM +0100, Fabian Grünbichler wrote:
> > > On Thu, Jan 11, 2024 at 11:51:16AM +0100, Fabian Grünbichler wrote:
> > > > if the target node has already stored their SSH host key on pmxcfs, pin it and
> > > > ignore the global known hosts information.
> > > > 
> > > > Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
> > > > ---
> > > >  src/PVE/SSHInfo.pm | 15 ++++++++++++++-
> > > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/src/PVE/SSHInfo.pm b/src/PVE/SSHInfo.pm
> > > > index c351148..fad23bf 100644
> > > > --- a/src/PVE/SSHInfo.pm
> > > > +++ b/src/PVE/SSHInfo.pm
> > > > @@ -49,11 +49,24 @@ sub get_ssh_info {
> > > >  
> > > >  sub ssh_info_to_command_base {
> > > >      my ($info, @extra_options) = @_;
> > > > +
> > > > +    my $nodename = $info->{name};
> > > > +
> > > > +    my $known_hosts_file = "/etc/pve/nodes/$nodename/ssh_known_hosts";
> > > > +    my $known_hosts_options = undef;
> > > > +    if (-f $known_hosts_file) {
> > > > +	$known_hosts_options = [
> > > > +	    '-o', "UserKnownHostsFile=$known_hosts_file",
> > > > +	    '-o', 'GlobalKnownHostsFile=none',
> > > 
> > > why does Global need to be none, even as this only applies if the snippet exists?
> > 
> > because we want to only let SSH look at our pinned file, not the regular one, which might contain bogus information. since our pinned file contains an entry for our host key alias which must match, the global file can never improve the situation, but it can cause a verification failure.
> 
> This might not work as expected.
> 
> 1. There will not be any verification failure if there is at least some valid key present. If wrong keys are present alongside a good one, it's a pass. If _only_ wrong keys are present, with StrictHostKeyChecking default (ask) it will outright stop.
> 
> 2. The Global none does not improve anything there. If no keys are present it will try to ask (under SKHC default), but no use in BatchMode.

technically true, but doesn't really matter for our use case. we only want to use our own pinned key (or maybe, keys, at some point in the future) for internal connections.
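
for illustration, a minimal shell sketch of the same decision the Perl hunk above makes: if the node has a pinned file, pass only that file to ssh and disable the global one, otherwise add nothing. the function name and the PVE_NODES_DIR override are hypothetical and only exist to make the sketch testable outside a cluster, the patch itself hardcodes /etc/pve/nodes.

```shell
#!/bin/sh
# Sketch only: prints the extra ssh options for a node, one per line.
# PVE_NODES_DIR is a hypothetical override for testing; the actual patch
# always uses /etc/pve/nodes/<nodename>/ssh_known_hosts.
pinned_known_hosts_opts() {
    node=$1
    file="${PVE_NODES_DIR:-/etc/pve/nodes}/$node/ssh_known_hosts"

    # Only pin if the node has already stored its host key on pmxcfs.
    if [ -f "$file" ]; then
        printf '%s\n' \
            -o "UserKnownHostsFile=$file" \
            -o "GlobalKnownHostsFile=none"
    fi
    # No pinned file: print nothing, so ssh keeps its regular lookup.
}
```

with a pinned file present this prints four lines (-o, UserKnownHostsFile=..., -o, GlobalKnownHostsFile=none), without one it prints nothing and the regular known hosts handling stays in effect.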

> 3. Using -o UserKHF alongside default SKHC, e.g. if run by someone even manually after a failed script without BatchMode, will have it crash for them because the pinned file cannot be updated by ssh properly due to the same issue as mentioned before regarding ssh-keygen -R. In this case the pmxcfs will cause it to crash again on link-unlink-rename() [1].
> 
> [1] https://github.com/openssh/openssh-portable/blob/50080fa42f5f744b798ee29400c0710f1b59f50e/hostfile.c#L695

it doesn't crash, it just fails to work. and this is not the same issue as the original one at all, since previously running the suggested command would break the PVE setup by removing our symlink, whereas now it creates an empty temp file but preserves our setup.

> 4. I suppose you did not like my suggestion re KnownHostsCommand [2] instead of "pinning", but giving -o's to ssh code where the files reside on pmxcfs is just creating the same problem (that e.g. keygen -R had) elsewhere, depending on whether you plan e.g. multiline.

the only advantage of a KnownHostsCommand would be to avoid the above (tiny) issue in interactive use cases. our use case is by definition not interactive. the only situation where this should arise in practice is if you manually rotate the SSH host key of a node already in the cluster. even then, it will solve itself after a reboot (or manual invocation of pvecm updatecerts, which should definitely be noted in a yet-to-be-written "keys/secrets and rotating them" section of the docs).

the command approach has similar problems though:
- if it outputs a non-matching host key line, the connection will be aborted (so this is stricter than the file based solution! which is especially problematic if we extend this to handle all key types, since then a rotation of one of them would already trip it up)
- it internally treats the command option as if it were a file, leading to very nice output like this:

Offending RSA key in KnownHostsCommand-HOSTNAME:2
  remove with:
  ssh-keygen -f "KnownHostsCommand-HOSTNAME" -R "XXX"
Host key for XXX has changed and you have requested strict checking.
Host key verification failed.

(XXX is my hostname, the rest is output exactly like it is!)

last but not least, switching to a command is always possible as follow-up since it's entirely on the client side anyway and requires no coordination across the cluster - the command would just output the contents of the file anyhow.
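
a rough sketch of what such a command could look like, assuming it is handed the node name as its only argument. the function name and the PVE_NODES_DIR override are hypothetical, and how the node name would reach the helper from the KnownHostsCommand option is deliberately left open here.

```shell
#!/bin/sh
# Sketch only: hypothetical helper a KnownHostsCommand could call.
# It prints the pinned known_hosts lines for a node, or nothing if the
# node has not stored a host key yet.
print_pinned_known_hosts() {
    node=$1
    file="${PVE_NODES_DIR:-/etc/pve/nodes}/$node/ssh_known_hosts"

    if [ -f "$file" ]; then
        cat "$file"
    fi
    # Always succeed; empty output just means "no keys known".
    return 0
}
```

since this only reads the file, it sidesteps ssh trying to rewrite a file on pmxcfs, but as noted above a non-matching line in its output still aborts the connection.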



