[pve-devel] [PATCH master ceph, quincy-stable-8 ceph, pve-storage, pve-manager 0/8] Fix #4759: Configure Permissions for ceph-crash.service

Friedrich Weber f.weber at proxmox.com
Wed Jan 31 15:22:53 CET 2024


Thanks a lot for tackling this issue!

Gave this a quick spin on a pre-existing 3-node Quincy cluster on which
I provoked a few crashes with `kill -n11 $(pidof ceph-osd)`.

ceph-base with patch 2 applied (provided by Max off-list) correctly
changed the /var/lib/ceph/crash/posted permissions to ceph:ceph for me.

Installing pve-manager (with patch 8 applied) on node 1 created a
keyring and added the section to /etc/pve/ceph.conf. However, installing
on node 2 added a second `keyring` line to the section:

[client.crash]
	 keyring = /etc/pve/ceph/$cluster.$name.keyring
	 keyring = /etc/pve/ceph/$cluster.$name.keyring

I think the same thing happens on each `dpkg-reconfigure pve-manager`.
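
To illustrate what I would expect instead, here is a minimal Python sketch
of an idempotent update that adds the `keyring` line only if the section
does not already contain one. This is not the code from the patch; the
function name `ensure_key` and the simplified line-based parsing are my
own assumptions, purely for illustration.

```python
def ensure_key(conf_text: str, section: str, key: str, value: str) -> str:
    """Return conf_text with `key = value` present in [section].

    Hypothetical helper: a simplified, line-based approach that assumes
    `key = value` lines and `[section]` headers, nothing more.
    """
    lines = conf_text.splitlines()
    header = f"[{section}]"

    # Locate the section header, if it exists.
    start = next((i for i, l in enumerate(lines) if l.strip() == header), None)

    if start is None:
        # Section is missing: append it together with the key.
        if lines and lines[-1].strip():
            lines.append("")
        lines += [header, f"\t{key} = {value}"]
        return "\n".join(lines) + "\n"

    # The section ends at the next header or at end of file.
    end = start + 1
    while end < len(lines) and not lines[end].lstrip().startswith("["):
        end += 1

    # If the key is already set in this section, leave the text untouched.
    for line in lines[start + 1:end]:
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            return conf_text

    lines.insert(start + 1, f"\t{key} = {value}")
    return "\n".join(lines) + "\n"


conf = "[global]\n\tfsid = abc\n"
once = ensure_key(conf, "client.crash", "keyring",
                  "/etc/pve/ceph/$cluster.$name.keyring")
twice = ensure_key(once, "client.crash", "keyring",
                   "/etc/pve/ceph/$cluster.$name.keyring")
```

Running it a second time (as on node 2, or on a reconfigure) leaves the
text unchanged, so `twice` is identical to `once`.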

Also, it looks like every time ceph-crash posts a report, the syslog shows:

Jan 31 15:02:30 ceph1 ceph-crash[110939]: WARNING:ceph-crash:post
/var/lib/ceph/crash/2024-01-31T13:53:16.419342Z_1b5a078a-f665-4fcd-abd5-9bf602048d1f
as client.crash.ceph1 failed: 2024-01-31T15:02:30.105+0100 7f10bf7ae6c0
-1 auth: unable to find a keyring on
/etc/pve/priv/ceph.client.crash.ceph1.keyring: (13) Permission denied
Jan 31 15:02:30 ceph1 ceph-crash[110939]: 2024-01-31T15:02:30.105+0100
7f10bf7ae6c0 -1 auth: unable to find a keyring on
/etc/pve/priv/ceph.client.crash.ceph1.keyring: (13) Permission denied
Jan 31 15:02:30 ceph1 ceph-crash[110939]: 2024-01-31T15:02:30.105+0100
7f10bf7ae6c0 -1 auth: unable to find a keyring on
/etc/pve/priv/ceph.client.crash.ceph1.keyring: (13) Permission denied
Jan 31 15:02:30 ceph1 ceph-crash[110939]: 2024-01-31T15:02:30.105+0100
7f10bf7ae6c0 -1 auth: unable to find a keyring on
/etc/pve/priv/ceph.client.crash.ceph1.keyring: (13) Permission denied
Jan 31 15:02:30 ceph1 ceph-crash[110939]: 2024-01-31T15:02:30.105+0100
7f10bf7ae6c0 -1 monclient: keyring not found
Jan 31 15:02:30 ceph1 ceph-crash[110939]: [errno 13] RADOS permission
denied (error connecting to the cluster)

I remember you mentioned this before. Do I remember correctly that there
is no easy way to prevent these messages? Having them appear only when a
crash is posted is certainly better than every 10 minutes, but they are
a bit misleading, as they very much look like an error that needs attention.



