[PVE-User] Ceph: Monitors not running but cannot be destroyed or recreated
Frank Thommen
f.thommen at dkfz-heidelberg.de
Sun Jan 26 14:14:44 CET 2020
Dear all,
I am trying to destroy "old" Ceph monitors, but they can neither be
deleted nor recreated.
I am currently configuring Ceph on our PVE cluster (3 nodes running PVE
6.1-3). There were some leftovers from a previous Ceph configuration,
which I had attempted while the nodes were not yet in a cluster (and
for which I had used the wrong network). I purged that configuration
with `pveceph purge`, redid the basic Ceph configuration through the
GUI on the first node, and deleted the still-existing managers through
the GUI (to have a fresh start).
A new monitor has been created on the first node automatically, but I am
unable to delete the monitors on nodes 2 and 3. They show up as
Status=stopped and Address=Unknown in the GUI and they cannot be started
(no error message). In the syslog window I see (after rebooting node
odcf-pve02):
------------
Jan 26 13:51:53 odcf-pve02 systemd[1]: Started Ceph cluster monitor daemon.
Jan 26 13:51:55 odcf-pve02 ceph-mon[1372]: 2020-01-26 13:51:55.450 7faa98ab9280 -1 mon.odcf-pve02@0(electing) e1 failed to get devid for : fallback method has serial ''but no model
------------
On the other hand I see the same message on the first node, and there
the monitor seems to work fine.
Trying to destroy them results in a message that there is no such
monitor, and trying to create a new monitor on these nodes results in
a message that the monitor already exists. I am stuck in this
existence loop. Destroying or creating them from the command line
doesn't work either.
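
One possible explanation (an assumption, not something confirmed here)
is that the places where a monitor is recorded disagree with each other,
so that "destroy" looks in one place and "create" in another. Below is a
minimal sketch of how one might compare two such places on a node. It
assumes the Ceph configuration is at /etc/pve/ceph.conf, that monitors
may appear there as [mon.<id>] sections, and that monitor data
directories are named /var/lib/ceph/mon/ceph-<id>. The function names
are made up for illustration, and the script only reads, it changes
nothing.

```python
import os
import re

# Matches section headers such as "[mon.somehost]" in ceph.conf-style text.
MON_SECTION_RE = re.compile(r"^[ \t]*\[mon\.([^\]]+)\][ \t]*$", re.MULTILINE)


def configured_monitors(conf_text):
    """Return the set of monitor IDs that have a [mon.<id>] section."""
    return set(MON_SECTION_RE.findall(conf_text))


def monitor_data_dirs(mon_root, cluster="ceph"):
    """Return the set of monitor IDs that have a data directory under mon_root.

    Assumes directories are named "<cluster>-<id>".
    """
    prefix = cluster + "-"
    if not os.path.isdir(mon_root):
        return set()
    return {
        name[len(prefix):]
        for name in os.listdir(mon_root)
        if name.startswith(prefix)
        and os.path.isdir(os.path.join(mon_root, name))
    }


def find_mismatches(conf_text, mon_root):
    """Report monitor IDs that appear in only one of the two places."""
    in_conf = configured_monitors(conf_text)
    on_disk = monitor_data_dirs(mon_root)
    return {
        "only_in_conf": sorted(in_conf - on_disk),
        "only_on_disk": sorted(on_disk - in_conf),
    }


if __name__ == "__main__":
    # Assumed default locations on a PVE node; adjust if they differ.
    conf_path = "/etc/pve/ceph.conf"
    mon_root = "/var/lib/ceph/mon"
    try:
        with open(conf_path) as fh:
            text = fh.read()
    except OSError as exc:
        raise SystemExit("cannot read %s: %s" % (conf_path, exc))
    print(find_mismatches(text, mon_root))
```

Run on each affected node, an entry under "only_on_disk" or
"only_in_conf" would point to a leftover that the purge did not remove.
A monitor listed only in the global mon_host line would not be detected
by this sketch.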
Any idea on how to fix this? I'd rather not completely reinstall the
nodes :-)
Cheers
frank