[PVE-User] Ceph Cluster with proxmox failure
Ronny Aasen
ronny+pve-user at aasen.cx
Fri Sep 28 22:52:35 CEST 2018
On 28.09.2018 21:49, Gilberto Nunes wrote:
> Hi there
> I have a 6 server Ceph Cluster made with proxmox 5.2
> Suddenly, after power failure, I have only 3 servers UP, but even with 3
> servers, Ceph Cluster doesn't work.
> pveceph status gives me a timeout
> pveceph status got timeout
>
> Any advice?
out of your 6 servers, how many were mon hosts, and how many mon hosts
are running at this time?
does ceph -s work on the command line of the servers? do you have a
mgr running?
you need a quorum (a strict majority) of the mon hosts to be alive.
so if you had 3 mon hosts, you need 2 live ones, and can lose 1
if you had 5 mon hosts, you need 3 live ones, and can lose 2
if you had 6 mon hosts, you would need 4 live ones, and can still only
lose 2.
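
the numbers above are just "more than half". a small sketch of that
arithmetic (the function names are made up for this example, they are
not ceph commands):

```shell
# smallest number of monitors that forms a strict majority of $1 monitors
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# number of monitors that may be down while quorum is still possible
mons_can_fail() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 5 6; do
    echo "$n mons: need $(quorum_needed "$n") alive, can lose $(mons_can_fail "$n")"
done
```

this is also why an even number of mons buys nothing: 6 mons tolerate
the same 2 failures as 5.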
if a mon host is not running, try to restart it, then read the logs to
find out why it does not start.
if the logs do not show a reason, increase the log verbosity and try
the restart again.
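
a rough sketch of that restart-and-read-the-logs loop for one node. the
mon id (pve1 below) is a placeholder; on a proxmox node it is normally
the short hostname, but check yours. the helper names are made up for
this example, and the log path assumes the default cluster name "ceph".

```shell
# systemd unit name for the monitor with the given id
mon_unit() {
    printf 'ceph-mon@%s.service\n' "$1"
}

# default log file of that monitor (assumes cluster name "ceph")
mon_log() {
    printf '/var/log/ceph/ceph-mon.%s.log\n' "$1"
}

# restart one monitor, then show its recent journal and log output
restart_mon() {
    systemctl restart "$(mon_unit "$1")"
    journalctl -u "$(mon_unit "$1")" -n 50 --no-pager
    tail -n 50 "$(mon_log "$1")"
    # the admin socket answers locally, even while there is no quorum
    ceph daemon "mon.$1" mon_status
}

# usage on the affected node (not run here):
#   restart_mon pve1
```

for more verbose logs, one common approach is to put "debug mon = 10"
in the [mon] section of ceph.conf before the restart, and remove it
again once you have found the problem.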
once you have a quorum of mon hosts (and a running mgr),
you can start looking at osd recovery and backfilling. if you have
the default 3x replication, pg's should come online as soon as they have
2 whole copies. pay attention to the fill level of the disks, since
you do not want to make a bad situation worse by filling up your osd's.
use things like
ceph osd tree
ceph osd df
ceph -s
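
the "2 whole copies" rule comes from the pool's min_size setting. a
minimal sketch, assuming a replicated pool with size 3 and min_size 2;
the pool name and both function names are placeholders, so check the
real values on your cluster:

```shell
# a pg can serve i/o once its available copies reach the pool's min_size
pg_can_serve_io() {
    copies="$1"
    min_size="$2"
    [ "$copies" -ge "$min_size" ]
}

# print size and min_size of a pool (needs a cluster with mon quorum)
show_pool_replication() {
    ceph osd pool get "$1" size
    ceph osd pool get "$1" min_size
}

# usage (not run here), "rbd" being a placeholder pool name:
#   show_pool_replication rbd
#   pg_can_serve_io 2 2 && echo "pg can go active"
```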
good luck
Ronny Aasen