[pve-devel] [PATCH v4 pve-manager 15/16] fix #4759: ceph: configure ceph-crash.service and its key
Fabian Grünbichler
f.gruenbichler at proxmox.com
Tue Mar 19 11:04:23 CET 2024
On March 5, 2024 4:07 pm, Max Carrara wrote:
> Due to Ceph dropping privileges when running the 'ceph-crash' daemon
> [0], it is necessary to allow the daemon to authenticate with its
> cluster in a safe manner.
>
> In order to avoid exposing sensitive keyrings or somehow escalating
> its privileges again, 'ceph-crash' is therefore provided with its own
> keyring in the '/etc/pve/ceph' directory. This directory, due to being
> on 'pmxcfs', may be read by members of the 'www-data' group, which
> 'ceph-crash' is made part of [1].
>
>
> Expected Configuration
> ----------------------
>
> 1. A keyring file named '/etc/pve/ceph/ceph.client.crash.keyring'
> exists
> 2. A section named 'client.crash' exists in '/etc/pve/ceph.conf'
> 3. The 'client.crash' section has a key named 'keyring' which
> references the keyring file as '/etc/pve/ceph/$cluster.$name.keyring'
> 4. The 'client.crash' section has *no* key named 'key'
>
>
> New Clusters
> ------------
>
> The keyring file is created and the conf file is updated after the first
> monitor has been created (when calling `pveceph mon create`).
>
>
> Existing Clusters
> -----------------
>
> A new helper script creates and configures the 'client.crash' keyring in
> `postinst`, if:
> * Ceph is installed
> * Ceph is initialized ('/etc/pve/ceph.conf' and '/etc/pve/ceph' exist)
> * Connection to RADOS is successful
>
> If the above conditions are met, the helper script ensures that the
> existing configuration matches the expected configuration mentioned
> above.
>
> The configuration is not changed if it is already as expected.
>
> The helper script may be called again manually if the `postinst` hook
> fails. It is installed to '/usr/share/pve-manager/helpers/pve-init-ceph-crash'.
>
>
> Existing `client.crash` Key
> ---------------------------
>
> If a key named 'client.crash' already exists within the cluster, it is
> reused and not regenerated.
>
>
> [0]: https://github.com/ceph/ceph/pull/48713
> [1]: https://git.proxmox.com/?p=ceph.git;a=commitdiff;h=f72c698a55905d93e9a0b7b95674616547deba8a
>
> Signed-off-by: Max Carrara <m.carrara at proxmox.com>
> ---
> Changes v1 --> v2:
> * fix 'keyring' key being appended to 'client.crash' section even
> if it already exists and configured correctly
> Changes v2 --> v3:
> * avoid needless second 'pmxcfs' lock on 'ceph.conf' when configuring
> 'client.crash' keyring after first monitor was created
> * change name of helper sub that creates the ceph.crash keyring
> * the helper sub now implicitly expects the /etc/pve/ceph directory to
> exist (as the previous commit enforces its existence)
> * remove BASH part of postinst hook related to `ceph-crash`
> * add Perl helper script and call that instead in postinst hook
> * reword commit message
> Changes v3 --> v4:
> * ensure key named 'key' does not exist for 'client.crash' section
> in helper script (thanks Friedrich!)
> * restart ceph-crash.service if it wasn't disabled (thanks Friedrich!)
> * add empty line before and after making changes in `update_ceph_conf`
> in 'postinst' hook
> * clean up the helper script; logic is easier to follow now
> * increase verbosity of output of helper script, showing what's
> exactly done when invoked
>
> PVE/API2/Ceph/MON.pm | 8 +++
> PVE/Ceph/Tools.pm | 35 +++++++++++
> bin/Makefile | 1 +
> bin/pve-init-ceph-crash | 129 ++++++++++++++++++++++++++++++++++++++++
> debian/postinst | 14 +++++
> 5 files changed, 187 insertions(+)
> create mode 100755 bin/pve-init-ceph-crash
>
> diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm
> index 1e959ef3..8b7ce376 100644
> --- a/PVE/API2/Ceph/MON.pm
> +++ b/PVE/API2/Ceph/MON.pm
> @@ -450,6 +450,14 @@ __PACKAGE__->register_method ({
> )
> };
> warn "$@" if $@;
> +
> + print "Configuring keyring for ceph-crash.service\n";
> + eval {
> + PVE::Ceph::Tools::create_or_update_crash_keyring_file();
> + $cfg->{'client.crash'}->{keyring} = '/etc/pve/ceph/$cluster.$name.keyring';
> + cfs_write_file('ceph.conf', $cfg);
> + };
> + warn "Unable to configure keyring for ceph-crash.service: $@" if $@;
> }
>
> eval { PVE::Ceph::Services::ceph_service_cmd('enable', $monsection) };
> diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm
> index 735bb116..9b97171e 100644
> --- a/PVE/Ceph/Tools.pm
> +++ b/PVE/Ceph/Tools.pm
> @@ -20,6 +20,7 @@ my $pve_ceph_cfgpath = "/etc/pve/$ccname.conf";
> my $ceph_cfgpath = "$ceph_cfgdir/$ccname.conf";
> my $pve_ceph_cfgdir = "/etc/pve/ceph";
>
> +my $pve_ceph_crash_key_path = "$pve_ceph_cfgdir/$ccname.client.crash.keyring";
> my $pve_mon_key_path = "/etc/pve/priv/$ccname.mon.keyring";
> my $pve_ckeyring_path = "/etc/pve/priv/$ccname.client.admin.keyring";
> my $ckeyring_path = "/etc/ceph/ceph.client.admin.keyring";
> @@ -45,6 +46,7 @@ my $config_values = {
>
> my $config_files = {
> pve_ceph_cfgpath => $pve_ceph_cfgpath,
> + pve_ceph_crash_key_path => $pve_ceph_crash_key_path,
> pve_mon_key_path => $pve_mon_key_path,
> pve_ckeyring_path => $pve_ckeyring_path,
> ceph_bootstrap_osd_keyring => $ceph_bootstrap_osd_keyring,
> @@ -420,6 +422,39 @@ sub get_or_create_admin_keyring {
> return $pve_ckeyring_path;
> }
>
> +# is also used in `pve-init-ceph-crash` helper
> +sub create_or_update_crash_keyring_file {
> + my ($rados) = @_;
> +
> + if (!defined($rados)) {
> + $rados = PVE::RADOS->new();
> + }
> +
> + my $output = $rados->mon_command({
> + prefix => 'auth get-or-create',
> + entity => 'client.crash',
> + caps => [
> + mon => 'profile crash',
> + mgr => 'profile crash',
> + ],
> + format => 'plain',
> + });
> +
> + if (-f $pve_ceph_crash_key_path) {
> + my $contents = PVE::Tools::file_get_contents($pve_ceph_crash_key_path);
> +
> + if ($contents ne $output) {
> + PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output);
> + return 1;
> + }
> + } else {
> + PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output);
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> # get ceph-volume managed osds
> sub ceph_volume_list {
> my $result = {};
> diff --git a/bin/Makefile b/bin/Makefile
> index 06d8148e..b221e4b1 100644
> --- a/bin/Makefile
> +++ b/bin/Makefile
> @@ -83,6 +83,7 @@ install: $(SCRIPTS) $(CLI_MANS) $(SERVICE_MANS) $(BASH_COMPLETIONS) $(ZSH_COMPLE
> install -m 0755 $(SCRIPTS) $(BINDIR)
> install -d $(USRSHARE)/helpers
> install -m 0755 pve-startall-delay $(USRSHARE)/helpers
> + install -m 0755 pve-init-ceph-crash $(USRSHARE)/helpers
> install -d $(MAN1DIR)
> install -m 0644 $(CLI_MANS) $(MAN1DIR)
> install -d $(MAN8DIR)
> diff --git a/bin/pve-init-ceph-crash b/bin/pve-init-ceph-crash
> new file mode 100755
> index 00000000..c8e9217d
> --- /dev/null
> +++ b/bin/pve-init-ceph-crash
> @@ -0,0 +1,129 @@
> +#!/usr/bin/perl
> +
> +use strict;
> +use warnings;
> +
> +use List::Util qw(first);
> +
> +use PVE::Ceph::Tools;
> +use PVE::Cluster;
> +use PVE::RADOS;
> +use PVE::RPCEnvironment;
> +
> +my $ceph_cfg_file = 'ceph.conf';
> +my $keyring_value = '/etc/pve/ceph/$cluster.$name.keyring';
> +
> +my $entity = 'client.crash';
nit: this could be inlined?
> +
> +
> +sub try_adapt_cfg {
> + my ($cfg) = @_;
> +
> + my $changed = 0;
> + print("Checking whether the configuration for '$entity' needs to be updated.\n");
> +
> + my $add_keyring = sub {
> + print("Setting keyring path to '$keyring_value'.\n");
> + $cfg->{$entity}->{keyring} = $keyring_value;
> + $changed = 1;
this can be removed, you always return right after calling this sub..
> + };
> +
> + if (!exists($cfg->{$entity})) {
> + print("Adding missing section for '$entity'.\n");
> + $add_keyring->();
> + return $changed;
and all of these can be changed to uncondtionally return 1
> + }
> +
> + if (exists($cfg->{$entity}->{key})) {
> + print("Removing existing usage of key.\n");
> + delete($cfg->{$entity}->{key});
> + $changed = 1;
> + }
> +
> + if (!exists($cfg->{$entity}->{keyring})) {
> + print("Keyring path is missing from configuration.\n");
> + $add_keyring->();
> + return $changed;
> + }
> +
> + my $current_keyring_value = $cfg->{$entity}->{keyring};
> + if ($current_keyring_value ne $keyring_value) {
> + print("Current keyring path differs from expected path.\n");
> + $add_keyring->();
> + return $changed;
> + }
> +
> + return $changed;
$changed only serves the purpose of returning 1 iff only the 'key'
value/entry was removed. so it could be renamed to reflect that ;)
> +}
> +
> +
> +sub main {
> + my $rpcenv = PVE::RPCEnvironment->init('cli');
> +
> + $rpcenv->init_request();
> + $rpcenv->set_language($ENV{LANG});
> + $rpcenv->set_user('root at pam');
why do we need an rpcenv here?
> +
> + die "Not running as root. Exiting.\n" if $> != 0;
> +
> + if (!PVE::Ceph::Tools::check_ceph_installed('ceph_bin', 1)) {
> + print("Ceph is not installed. No action required.\n");
> + exit 0;
> + }
> +
> + eval {
> + PVE::Ceph::Tools::check_ceph_inited();
> + };
> + if ($@) {
> + print("Ceph is not initialized. No action required.\n");
> + exit 0;
> + }
> +
> + my $rados = eval { PVE::RADOS->new() };
> + $@ = ''; # warnings are displayed regardless
what's this line for?
> + my $ceph_crash_key_path = PVE::Ceph::Tools::get_config('pve_ceph_crash_key_path');
> +
> + PVE::Cluster::cfs_lock_file($ceph_cfg_file, undef, sub {
if you let this sub return a boolean value (worked/failed)
> + my $cfg = PVE::Cluster::cfs_read_file($ceph_cfg_file);
> +
> + if (!defined($rados)) {
> + my $has_mon_config = defined(first { m/mon\..*/ } keys $cfg->%*);
> + if ($has_mon_config) {
> + die "Connection to RADOS failed even though a monitor is configured.\n" .
> + "Please verify whether your configuration is correct.\n"
> + }
> +
> + print(
> + "Connection to RADOS failed and no monitor is configured. ".
> + "Assume that things are fine. No action required.\n"
> + );
not sure if it wouldn't be better to check mon_host here? that's the
thing actually needed to contact the mon, everything else is optional..
and even that could be moved to DNS, but we don't support that for
PVE-managed clusters..
> + return;
> + }
> +
> + my $updated_keyring = PVE::Ceph::Tools::create_or_update_crash_keyring_file($rados);
> +
> + if ($updated_keyring) {
> + print("Keyring file '$ceph_crash_key_path' was updated.\n");
> + }
> +
> + my $changed = try_adapt_cfg($cfg);
> +
> + if ($changed) {
> + print("Committing updated configuration.\n");
> + PVE::Cluster::cfs_write_file($ceph_cfg_file, $cfg);
> + print("Successfully updated configuration for 'ceph-crash.service'.\n");
> + } else {
> + print("Configuration does not need to be updated.\n");
> + }
> + });
then you can just check the return value (undef means cfs_lock set an
error, true means it worked, false means it failed without dieing, or
whatever you want to return)
> + if ($@) {
> + warn("Failed to configure keyring for 'ceph-crash.service'. Error:\n");
nit: I think I'd prefer the "Error: to be on the next line with the error message.."
> + warn("$@");
> + exit 1;
> + }
> +}
> +
> +main();
> +
> +0;
why?
> diff --git a/debian/postinst b/debian/postinst
> index 61b50f97..c36afa13 100755
> --- a/debian/postinst
> +++ b/debian/postinst
> @@ -82,10 +82,24 @@ EOF
>
> update_ceph_conf() {
> CEPH_CONF_DIR='/etc/pve/ceph'
> + UNIT='ceph-crash.service'
> +
> + echo ""
>
> if test ! -d "${CEPH_CONF_DIR}"; then
> mkdir -p "${CEPH_CONF_DIR}"
> fi
> +
> + # Don't fail in case user has "exotic" configuration where RADOS
> + # isn't available on all nodes for some reason
> + /usr/share/pve-manager/helpers/pve-init-ceph-crash || true
> +
> + if systemctl -q is-enabled "$UNIT"; then
like Friedrich said, please silence this (`2>/dev/null`)
> + echo "Restarting '$UNIT'"
> + deb-systemd-invoke restart "$UNIT"
|| true please!
> + fi
> +
> + echo ""
> }
>
> migrate_apt_auth_conf() {
> --
> 2.39.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
>
More information about the pve-devel
mailing list