[PVE-User] Ceph: Monitors not running but cannot be destroyed or recreated

Frank Thommen f.thommen at dkfz-heidelberg.de
Sun Jan 26 23:51:54 CET 2020

On 26/01/2020 16:46, Frank Thommen wrote:
> On 26/01/2020 14:14, Frank Thommen wrote:
>> Dear all,
>> I am trying to destroy "old" Ceph monitors, but they can neither be 
>> deleted nor recreated:
>> I am currently configuring Ceph on our PVE cluster (3 nodes running 
>> PVE 6.1-3).  There were some leftovers of a previous Ceph setup, 
>> which I had attempted while the nodes were not yet joined in a 
>> cluster (and with the wrong network).  However, I had purged that 
>> configuration with `pveceph purge`.  I have redone the basic Ceph 
>> configuration through the GUI on the first node and deleted the 
>> still-existing managers through the GUI (to have a fresh start).
>> A new monitor has been created on the first node automatically, but I 
>> am unable to delete the monitors on nodes 2 and 3.  They show up as 
>> Status=stopped and Address=Unknown in the GUI and they cannot be 
>> started (no error message).  In the syslog window I see (after 
>> rebooting node odcf-pve02):
>> ------------
>> Jan 26 13:51:53 odcf-pve02 systemd[1]: Started Ceph cluster monitor 
>> daemon.
>> Jan 26 13:51:55 odcf-pve02 ceph-mon[1372]: 2020-01-26 13:51:55.450 
>> 7faa98ab9280 -1 mon.odcf-pve02 at 0(electing) e1 failed to get devid for 
>> : fallback method has serial ''but no model
>> ------------
>> On the other hand I see the same message on the first node, and there 
>> the monitor seems to work fine.
>> Trying to destroy them results in the message that there is no such 
>> monitor, and trying to create a new monitor on these nodes results 
>> in the message that the monitor already exists... I am stuck in this 
>> existence loop.  Destroying and creating them don't work on the 
>> command line either.
>> Any idea on how to fix this?  I'd rather not completely reinstall the 
>> nodes :-)
>> Cheers
>> frank
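For anyone hitting the same loop, the commands involved would look roughly like this (a sketch only; the monitor ID `odcf-pve02` is taken from the log above, and the paths assume a default PVE 6 Ceph installation):

```shell
# What the GUI runs under the hood (roughly):
pveceph mon destroy odcf-pve02      # fails with "no such monitor"
pveceph mon create                  # fails with "monitor already exists"

# Manual cleanup of a half-removed monitor (assumption: default paths):
systemctl stop ceph-mon@odcf-pve02.service
systemctl disable ceph-mon@odcf-pve02.service
ceph mon remove odcf-pve02          # drop it from the monmap, if still listed
rm -rf /var/lib/ceph/mon/ceph-odcf-pve02   # stale monitor data directory
# ...and remove the [mon.odcf-pve02] section from /etc/pve/ceph.conf
```

The "existence loop" typically means the GUI sees the monitor in `ceph.conf` (so create refuses) while the monmap no longer knows it (so destroy refuses), which is why cleaning up both places by hand can help.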
> In an attempt to clean up the Ceph setup again, I ran
>    pveceph stop ceph.target
>    pveceph purge
> on the first node.  Now I get
>     rados_connect failed - No such file or directory (500)
> when I select Ceph in the GUI on any of the three nodes.  Rebooting 
> all nodes didn't help.
> frank
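The GUI talks to the cluster via librados, and "No such file or directory" after a purge usually points at the Ceph config being gone. A quick diagnostic sketch (assumed default paths; the network is an example, adjust to yours):

```shell
# Check whether the purge removed the config librados needs (assumption):
ls -l /etc/ceph/ceph.conf      # normally a symlink to /etc/pve/ceph.conf
ls /etc/pve/ceph.conf          # cluster-wide config stored in pmxcfs

# Re-initialising recreates the config (example network, adjust):
pveceph init --network 192.168.1.0/24
```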

I was finally able to completely purge the old settings and reconfigure 
Ceph with the various instructions from this thread.

Maybe this information could be added to the official documentation 
(unless there is a nicer way of completely resetting Ceph in a PROXMOX 
cluster).
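In rough outline, a full reset pieced together from the steps above would look like this. This is a sketch under assumptions, not an official procedure, and it is destructive: only run it on a cluster holding no Ceph data, and verify the paths against your installation first.

```shell
# DESTRUCTIVE: wipes all local Ceph state on this node (run on every node).
systemctl stop ceph.target                       # stop all Ceph daemons
pveceph purge                                    # PVE's own cleanup
rm -rf /var/lib/ceph/mon/* /var/lib/ceph/mgr/*   # stale daemon data
rm -f  /etc/pve/ceph.conf                        # cluster config (once, any node)
rm -rf /etc/ceph/*                               # local config/keyring leftovers

# Then reinitialise from scratch:
pveceph init --network <cluster-network>
pveceph mon create
```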

