[pve-devel] [PATCH pve-manager 5/8] fix #4759: ceph: configure keyring for ceph-crash.service

Mon Feb 5 12:57:17 CET 2024

On 1/31/24 14:17, Fabian Grünbichler wrote:
> On January 30, 2024 7:40 pm, Max Carrara wrote:
>> when creating the cluster's first monitor.
>>
>> Signed-off-by: Max Carrara <m.carrara at proxmox.com>
>> ---
>>  PVE/API2/Ceph/MON.pm | 28 +++++++++++++++++++++++++++-
>>  PVE/Ceph/Services.pm | 12 ++++++++++--
>>  PVE/Ceph/Tools.pm    | 38 ++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 75 insertions(+), 3 deletions(-)
>>
>> diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm
>> index 1e959ef3..8d75f5d1 100644
>> --- a/PVE/API2/Ceph/MON.pm
>> +++ b/PVE/API2/Ceph/MON.pm
>> @@ -459,11 +459,37 @@ __PACKAGE__->register_method ({
>>  	    });
>>  	    die $@ if $@;
>>  	    # automatically create manager after the first monitor is created
>> +	    # and set up keyring and config for ceph-crash.service
>>  	    if ($is_first_monitor) {
>>  		PVE::API2::Ceph::MGR->createmgr({
>>  		    node => $param->{node},
>>  		    id => $param->{node}
>> -		})
>> +		});
>> +
>> +		PVE::Cluster::cfs_lock_file('ceph.conf', undef, sub {
>> +		    my $cfg = cfs_read_file('ceph.conf');
>> +
>> +		    if ($cfg->{'client.crash'}) {
>> +			return undef;
>> +		    }
>> +
>> +		    $cfg->{'client.crash'}->{keyring} = '/etc/pve/ceph/$cluster.$name.keyring';
>> +
>> +		    cfs_write_file('ceph.conf', $cfg);
>> +		});
>> +		die $@ if $@;
>> +
>> +		eval {
>> +		    PVE::Ceph::Tools::get_or_create_crash_keyring();
>> +		};
>> +		warn "Unable to configure keyring for ceph-crash.service: $@" if $@;
> 
> the order here should maybe be switched around? first handle the
> keyring, then put it in the config?
> 
>> +
>> +		print "enabling service 'ceph-crash.service'\n";
>> +		PVE::Ceph::Services::ceph_service_cmd('enable', 'crash');
> 
> shouldn't this already be handled by default?
> 
>> +		print "starting service 'ceph-crash.service'\n";
>> +		# ceph-crash already runs by default,
>> +		# this makes sure the keyring is used
>> +		PVE::Ceph::Services::ceph_service_cmd('restart', 'crash');
> 
> this should probably be a try-restart to avoid starting it if the admin
> explicitly disabled and/or stopped it..
> 
> but - AFAICT, the ceph-crash script that is executed by the service
> boils down to (as forked process!) "ceph -n XXX ..." where XXX is (in sequence)
> client.crash.$HOST, client.crash, client.admin, so a service restart
> shouldn't even be needed, since a fresh ceph (client) process will pick
> up the config changes anyway?

Good point - I've found the respective code and will just change the order there
as well as drop the calls here to enable/restart `ceph-crash`. Thanks!

> 
>>  	    }
>>  	};
>>  
>> diff --git a/PVE/Ceph/Services.pm b/PVE/Ceph/Services.pm
>> index e0f31e8e..5f5986f9 100644
>> --- a/PVE/Ceph/Services.pm
>> +++ b/PVE/Ceph/Services.pm
> 
>> [..]
> 
>> diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm
>> index 3acef11b..cf9f2ed4 100644
>> --- a/PVE/Ceph/Tools.pm
>> +++ b/PVE/Ceph/Tools.pm
> 
>> [..]
> 
>> +    my $output = $rados->mon_command({
>> +	prefix => 'auth get-or-create',
>> +	entity => 'client.crash',
>> +	caps => [
>> +	    mon => 'profile crash',
>> +	    mgr => 'profile crash',
>> +	],
>> +	format => 'plain',
>> +    });
>> +
>> +    if (! -d $pve_ceph_cfgdir) {
>> +	mkdir $pve_ceph_cfgdir;
>> +    }
>> +
>> +    PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output);
>> +
>> +    return $pve_ceph_crash_key_path;
>> +}
>> +
> 
> we have another helper for creating a keyring (and another inline call
> to ceph-authtool when creating a monitor), should we unify them?

In this case it's better not to, in my opinion - the function for `ceph-crash`
specifically uses `ceph auth get-or-create` as that's quite a bit easier to use
in this scenario, as the key will automatically be generated if it doesn't exist.
This does require a connection to RADOS, but that will exist once the first mon is
set up anyway.

Otherwise we'd have to use `ceph-authtool` and then also import the key to cephx
if it doesn't exist already, like we do in other places.

Ultimately it ends up achieving the same, but the former just seemed more
straightforward IMO.

I did however notice that there are several `run_command` calls to `ceph-authtool`
floating around that could maaaybe benefit from a helper function, but I would
rather implement that in a different patch series, as that's not really relevant
for this one.

> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
>